Properties, attributes and methods for objects
All of the properties and functions of an object are public in Python, unlike in other languages where members can be marked as public, private, or protected. That is, there is nothing preventing caller objects from accessing any attribute an object has.
There is no strict enforcement, but there are some conventions. An attribute that starts with an underscore is meant to be private to that object, and we expect that no external agent calls it (but again, there is nothing preventing this).
1. Underscores in Python
Consider the following example to illustrate this:
class Connector:
    def __init__(self, source, user, password, timeout):
        self.source = source
        self.user = user
        self.__password = password
        self._timeout = timeout
>>> Connector(...).source
'postgresql://localhost'
>>> Connector(...)._timeout
60
>>> Connector(...).__password
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Connector' object has no attribute '__password'
>>> vars(Connector(...))
{'source': 'postgresql://localhost', '_timeout': 60, 'user': 'root', '_Connector__password': '1234'}
Here, a Connector object is created with four attributes: source, user, password, and timeout. source and user are public, _timeout is private by convention, and __password is written with a double underscore, which is discussed further down.
However, as the session above shows, when we create an object like this we can still access _timeout from outside it.
The interpretation of this code is that _timeout should be accessed only within Connector itself and never from a caller. This means that you should organize the code so that you can safely refactor the timeout whenever needed, relying on the fact that it is not being used from outside the object (only internally), hence preserving the same interface as before. Complying with these rules makes the code easier to maintain and more robust because we don’t have to worry about ripple effects when refactoring. The same principle applies to methods as well.
Note
Objects should only expose those attributes and methods that are relevant to an external caller object, namely, entailing its interface. Everything that is not strictly part of an object’s interface should be kept prefixed with a single underscore.
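As a rough sketch of this convention, the class below exposes a single public method and keeps everything else behind a single underscore. ReportBuilder, build, and _total are hypothetical names chosen for illustration; they are not part of the Connector example.

```python
class ReportBuilder:
    """Hypothetical class: only build() belongs to the public interface."""

    def __init__(self, rows):
        self._rows = list(rows)  # internal state, private by convention

    def _total(self):
        # Internal helper: can be renamed or restructured without
        # affecting callers, because nobody outside should use it.
        return sum(self._rows)

    def build(self):
        # Public method: the only entry point callers are expected to use.
        return f"{len(self._rows)} rows, total {self._total()}"


report = ReportBuilder([1, 2, 3]).build()
print(report)  # 3 rows, total 6
```

Nothing stops a caller from reaching for _rows or _total, but the underscore signals that doing so means relying on details that may change.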
This is the Pythonic way of clearly delimiting the interface of an object. There is, however, a common misconception that some attributes and methods can actually be made private.
password is defined with a double underscore instead. Some developers use this method to hide some attributes, thinking, like in this example, that password is now private and that no other object can modify it. Now, take a look at the exception that is raised when trying to access it. It’s AttributeError, saying that it doesn’t exist. It doesn’t say something like “this is private” or “this can’t be accessed” and so on. It says it does not exist. This should give us a clue that, in fact, something different is happening and that this behavior is instead just a side effect, but not the real effect we want.
What’s actually happening is that with the double underscores, Python creates a different name for the attribute (this is called name mangling). What it does is create the attribute with the following name instead: “_<class-name>__<attribute-name>”. In this case, an attribute named ‘_Connector__password’ will be created.
Notice the side effect that we mentioned earlier—the attribute only exists with a different name, and for that reason the AttributeError was raised on our first attempt to access it.
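The mangling can be observed directly. The sketch below uses a cut-down Connector that takes only a password (a simplification made for this illustration, with a placeholder value) and shows that the plain name is missing while the mangled name is an ordinary, readable attribute.

```python
class Connector:
    def __init__(self, password):
        # Inside the class body, this name is rewritten to
        # _Connector__password.
        self.__password = password


connector = Connector("1234")  # placeholder value for the example

try:
    connector.__password  # no mangling happens outside the class body
except AttributeError:
    found_plain_name = False
else:
    found_plain_name = True

# The mangled name can be read by any caller.
mangled_value = connector._Connector__password
print(found_plain_name, mangled_value)  # False 1234
```

In other words, the double underscore renames the attribute; it does not protect it.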
The idea of the double underscore in Python is completely different. It was created as a means to override different methods of a class that is going to be extended several times, without the risk of having collisions with the method names. Even that use case is too far-fetched to justify using this mechanism.
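For reference only, here is a minimal sketch of that collision-avoidance use case. Base, Derived, and the __setup method are hypothetical names; the example is meant to show what the mechanism does, not to recommend it.

```python
class Base:
    def __init__(self):
        self.__setup()  # rewritten to self._Base__setup()

    def __setup(self):
        self.base_ready = True


class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__setup()  # rewritten to self._Derived__setup()

    def __setup(self):
        # Same source name as in Base, but a different mangled name,
        # so it does not replace Base's method.
        self.derived_ready = True


obj = Derived()
print(obj.base_ready, obj.derived_ready)  # True True
```

Because each class gets its own mangled name, both methods run and neither overrides the other.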
Double underscores are a non-Pythonic approach. If you need to define attributes as private, use a single underscore, and respect the Pythonic convention that it is a private attribute.
Note
Do not use double underscores.
2. Properties
When the object needs to just hold values, we can use regular attributes. Sometimes, we might want to do some computations based on the state of the object and the values of other attributes. Most of the time, properties are a good choice for this.
Properties are to be used when we need to define access control to some attributes in an object, which is another point where Python has its own way of doing things. In other programming languages (like Java), you would create access methods (getters and setters), but idiomatic Python would use properties instead.
The properties provide a built-in descriptor type that knows how to link an attribute to a set of methods. property takes four optional arguments: fget, fset, fdel, and doc. The last one can be provided to define a docstring that is linked to the attribute as if it were a method. Here is an example of a Rectangle class that can be controlled either by direct access to attributes that store two corner points or by using the width and height properties:
class Rectangle:
    def __init__(self, x1, y1, x2, y2):
        self.x1, self.y1 = x1, y1
        self.x2, self.y2 = x2, y2

    def _width_get(self):
        return self.x2 - self.x1

    def _width_set(self, value):
        self.x2 = self.x1 + value

    def _height_get(self):
        return self.y2 - self.y1

    def _height_set(self, value):
        self.y2 = self.y1 + value

    width = property(
        _width_get, _width_set,
        doc="rectangle width measured from left"
    )
    height = property(
        _height_get, _height_set,
        doc="rectangle height measured from top"
    )

    def __repr__(self):
        return "{}({}, {}, {}, {})".format(
            self.__class__.__name__,
            self.x1, self.y1, self.x2, self.y2
        )
The following interactive session shows these properties in use:
>>> rectangle = Rectangle(10, 10, 25, 34)
>>> rectangle.width, rectangle.height
(15, 24)
>>> rectangle.width = 100
>>> rectangle
Rectangle(10, 10, 110, 34)
>>> rectangle.height = 100
>>> rectangle
Rectangle(10, 10, 110, 110)
>>> help(Rectangle)
Help on class Rectangle in module sample_module:
class Rectangle(builtins.object)
| Methods defined here:
|
| __init__(self, x1, y1, x2, y2)
| Initialize self. See help(type(self)) for accurate signature.
| __repr__(self)
| Return repr(self).
|
| --------------------------------------------------------
| Data descriptors defined here:
| (...)
|
| height
| rectangle height measured from top
|
| width
| rectangle width measured from left
The properties make it easier to write descriptors, but must be handled carefully when using inheritance over classes. The attribute created is made on the fly using the methods of the current class and will not use methods that are overridden in the derived classes.
For instance, the following example will fail to override the implementation of the fget method of the parent class’s (Rectangle) width property:
>>> class MetricRectangle(Rectangle):
...     def _width_get(self):
...         return "{} meters".format(self.x2 - self.x1)
...
>>> MetricRectangle(0, 0, 100, 100).width
100
In order to resolve this, the whole property simply needs to be overwritten in the derived class:
>>> class MetricRectangle(Rectangle):
...     def _width_get(self):
...         return "{} meters".format(self.x2 - self.x1)
...     width = property(_width_get, Rectangle.width.fset)
...
>>> MetricRectangle(0, 0, 100, 100).width
'100 meters'
Unfortunately, the preceding code has some maintainability issues. It can be a source of confusion if the developer decides to change the parent class, but forgets to update the property call. This is why overriding only parts of the property behavior is not advised. Instead of relying on the parent class’s implementation, it is recommended that you rewrite all the property methods in the derived classes if you need to change how they work. In most cases, this is the only option, because usually the change to the property setter behavior implies a change to the behavior of getter as well.
Because of this, the best syntax for creating properties is to use property as a decorator. This will reduce the number of method signatures inside the class and make the code more readable and maintainable:
class Rectangle:
    def __init__(self, x1, y1, x2, y2):
        self.x1, self.y1 = x1, y1
        self.x2, self.y2 = x2, y2

    @property
    def width(self):
        """rectangle width measured from left"""
        return self.x2 - self.x1

    @width.setter
    def width(self, value):
        self.x2 = self.x1 + value

    @property
    def height(self):
        """rectangle height measured from top"""
        return self.y2 - self.y1

    @height.setter
    def height(self, value):
        self.y2 = self.y1 + value
This approach is much more compact than having custom methods prefixed with get_ or set_. It’s clear what is expected because the caller simply reads or assigns width and height as if they were plain attributes.
Note
Don’t write custom get_* and set_* methods for all attributes on your objects. Most of the time, leaving them as regular attributes is just enough. If you need to modify the logic for when an attribute is retrieved or modified, then use properties.
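As an illustration of adding logic to attribute access only when it is needed, the sketch below guards an attribute with a property. The Account class and its non-negative balance rule are assumptions made up for this example.

```python
class Account:
    """Hypothetical class whose balance must never be negative."""

    def __init__(self, balance):
        self.balance = balance  # goes through the setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        # Logic that runs on assignment; callers still write
        # `account.balance = ...` as with a regular attribute.
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value


account = Account(10)
account.balance = 25

try:
    account.balance = -1
except ValueError as error:
    rejected = str(error)

print(account.balance, rejected)  # 25 balance cannot be negative
```

If balance had started out as a regular attribute, it could be turned into a property like this later without changing any calling code.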
You might find that properties are a good way to achieve command and query separation. Command and query separation states that a method of an object should either answer something or do something, but not both. If a method of an object is doing something and at the same time returns a status answering a question of how that operation went, then it’s doing more than one thing, clearly violating the principle that functions should do one thing, and one thing only.
Depending on the name of the method, this can create even more confusion, making it harder for readers to understand what the actual intention of the code is. For example, if a method is called set_email, and we use it as if self.set_email("a@j.com"): ..., what is that code doing? Is it setting the email to a@j.com? Is it checking if the email is already set to that value? Both (setting and then checking if the status is correct)?
With properties, we can avoid this kind of confusion. The @property decorator is the query that will answer to something, and the @<property_name>.setter is the command that will do something.
Another piece of good advice derived from this example is as follows: don’t do more than one thing in a method. If you want to assign something and then check the value, break that down into two or more statements.
Note
Methods should do one thing only. If you have to run an action and then check for the status, do that in separate methods that are called by different statements.
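A minimal sketch of this separation, using the email case mentioned above, could look as follows. The User class and the deliberately simplistic EMAIL_PATTERN check are assumptions made for this illustration.

```python
import re

# Deliberately simplistic pattern, for illustration only.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


class User:
    """Hypothetical class with separate command and query for email."""

    def __init__(self, username):
        self.username = username
        self._email = None

    @property
    def email(self):
        # Query: answers a question and changes nothing.
        return self._email

    @email.setter
    def email(self, new_email):
        # Command: changes state and returns nothing.
        if not EMAIL_PATTERN.fullmatch(new_email):
            raise ValueError(f"not a valid email: {new_email!r}")
        self._email = new_email


user = User("jsmith")
user.email = "a@j.com"   # command, in its own statement
current = user.email     # query, in a separate statement
print(current)  # a@j.com
```

Assignment and lookup are now two distinct statements, so there is no ambiguity about whether a call sets the value, checks it, or both.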
3. Slots
An interesting feature that is very rarely used by developers is slots. They allow you to set a static attribute list for a given class with the __slots__ attribute, and skip the creation of the __dict__ dictionary in each instance of the class. They were intended to save memory space for classes with very few attributes, since __dict__ is not created at every instance.
When a class defines the __slots__ attribute, it can contain all the attributes that the class expects and no more.
Trying to add extra attributes dynamically to a class that defines __slots__ will result in an AttributeError. By defining this attribute, the class becomes static, so it will not have a __dict__ attribute where you can add more objects dynamically.
How, then, are its attributes retrieved if not from the dictionary of the object? By using descriptors. Each name defined in a slot will have its own descriptor that will store the value for retrieval later.
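The descriptors can be inspected on the class itself. The sketch below uses a hypothetical Point class; the member_descriptor type name it prints is what CPython reports and may differ on other implementations.

```python
class Point:
    """Hypothetical slotted class used to inspect slot descriptors."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


point = Point(1, 2)

# Each slot name is stored on the class as a descriptor object.
slot_descriptor = Point.__dict__["x"]
print(type(slot_descriptor).__name__)  # member_descriptor on CPython

# Attribute access on the instance is routed through that descriptor.
value_via_descriptor = slot_descriptor.__get__(point, Point)
print(value_via_descriptor)  # 1

# The instance itself has no per-object dictionary.
print(hasattr(point, "__dict__"))  # False
```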
__slots__ can help to design classes whose signature needs to be frozen. For instance, if you need to restrict the dynamic features of the language over a class, defining slots can help:
class Coordinate2D:
    __slots__ = ("lat", "lon")

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon

    def __repr__(self):
        return f"{self.__class__.__name__}({self.lat}, {self.lon})"
While this is an interesting feature, it has to be used with caution because it is taking away the dynamic nature of Python. In general, this ought to be reserved only for objects that we know are static, and if we are absolutely sure we are not adding any attributes to them dynamically in other parts of the code. Some techniques, such as monkey patching, will not work with instances of classes that have slots defined. Fortunately, the new attributes can be added to the derived classes if they do not have their own slots defined:
>>> class Frozen:
...     __slots__ = ['ice', 'cream']
...
>>> '__dict__' in dir(Frozen)
False
>>> 'ice' in dir(Frozen)
True
>>> frozen = Frozen()
>>> frozen.ice = True
>>> frozen.cream = None
>>> frozen.icy = True
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: 'Frozen' object has no attribute 'icy'
>>> class Unfrozen(Frozen):
...     pass
...
>>> unfrozen = Unfrozen()
>>> unfrozen.icy = False
>>> unfrozen.icy
False
As an upside of this, objects defined with slots use less memory, since they only need a fixed set of fields to hold values and not an entire dictionary.
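A rough way to see the difference is to compare the shallow sizes reported by sys.getsizeof. The WithDict and WithSlots classes below are hypothetical, and the exact byte counts are a CPython implementation detail that varies between versions, so no specific numbers are claimed here.

```python
import sys


class WithDict:
    """Hypothetical regular class: each instance carries a __dict__."""

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon


class WithSlots:
    """Hypothetical slotted class: fixed fields, no __dict__."""

    __slots__ = ("lat", "lon")

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon


regular = WithDict(1.0, 2.0)
slotted = WithSlots(1.0, 2.0)

# getsizeof is shallow: it does not count the stored values themselves,
# so the instance dictionary has to be added explicitly.
regular_size = sys.getsizeof(regular) + sys.getsizeof(regular.__dict__)
slotted_size = sys.getsizeof(slotted)

print(regular_size, slotted_size)
```

On CPython the slotted instance should come out smaller. The saving per object is modest, so it tends to matter mainly when very many instances are alive at once.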