Author: Michael Fötsch
May 8, 2005
Table of Contents
- Why New-Style Classes?
- Properties
- Static Methods
- Class Methods
- Descriptors
- Attribute Slots
- The Constructor __new__
- Cooperative Super Call
- Conclusion
- References
New-style classes are part of an effort to unify built-in types and
user-defined classes in the Python programming language. New-style classes have
been around since Python 2.2 (not that new anymore), so
it's definitely time to take advantage of the new possibilities.
A new-style class is one that is derived, either directly or indirectly, from
a built-in type. (Something that wasn't possible at all before Python 2.2.)
Built-in types include types such as:
- int
- list
- tuple
- dict
- str
- and others
The base class for all new-style classes is called object.
All of the following are new-style classes:
    class NewStyleUserDefinedClass(object):
        pass

    class DerivedFromBuiltInType(list):
        pass

    class IndirectlyDerivedFromType(DerivedFromBuiltInType):
        pass
Here's what new-style classes have to offer:
- Properties: Attributes that are defined by get/set methods
- Static methods and class methods
- The new __getattribute__ hook, which, unlike __getattr__, is called for every
  attribute access, not just when the attribute can't be found in the instance
- Descriptors: A protocol to define the behavior of attribute access through objects
- Overriding the constructor __new__
- Metaclasses (not discussed)
I'll try to be brief while still giving you enough information so that you
can start using these language features. Examples are presented in place
of long descriptions. Once your interest is awakened, you can refer to the
References section for more detailed material on these topics.
A property is an attribute that is defined by get/set methods. The concept is
simple and well-known from other languages. A property is defined like this:
    class ClassWithProperty(object):
        ...
        TheProperty = property(fget=<the get method>,
                               fset=<the set method>,
                               fdel=<the del method>,
                               doc=<the docstring>)
The signature of the property descriptor
is property(fget=None, fset=None, fdel=None, doc=None).
If any of the methods is not specified,
an exception of type AttributeError is raised when the respective
operation is attempted. For example, to define a read-only property, you would
specify fget but not fset. (Write-only properties are
also possible, although a regular method achieves the same thing in a less
awkward way.)
Here's a more complete example:
    class ClassWithProperty(object):
        def __SetTheProperty(self, value):
            print "Setting the property"
            self.__m_the_property = value

        def __GetTheProperty(self):
            print "Getting the property"
            return self.__m_the_property

        def __DelTheProperty(self):
            print "Deleting the property"
            del self.__m_the_property

        TheProperty = property(fget=__GetTheProperty,
                               fset=__SetTheProperty,
                               fdel=__DelTheProperty,
                               doc="The property description.")

        def __GetReadOnlyProperty(self):
            return "This is a calculated value."

        ReadOnlyProperty = property(fget=__GetReadOnlyProperty)
The property is used like this:
>>> c = ClassWithProperty()
>>> c.TheProperty = 10
Setting the property
>>> print c.TheProperty
Getting the property
10
>>> del c.TheProperty
Deleting the property
>>> # The property itself is still there after deleting
>>> c.TheProperty = 5
Setting the property
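>>> # The docstring lives on the property object, which is reached through the class
>>> ClassWithProperty.TheProperty.__doc__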
'The property description.'
>>> print c.ReadOnlyProperty
This is a calculated value.
>>> c.ReadOnlyProperty = 100
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
AttributeError: can't set attribute
Note: Don't forget to derive your class from
object, otherwise properties won't work.
There's not much to explain about static methods; they behave just
like in C++. A static method is defined using the staticmethod
descriptor:
    class MyClass(object):
        def SomeMethod(x):
            print x
        SomeMethod = staticmethod(SomeMethod)

>>> MyClass.SomeMethod(15)
15
>>> obj = MyClass()
>>> obj.SomeMethod(15)
15
You should really consider creating a static method whenever a method
does not make substantial use of the instance (self).
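As a small sketch of that advice: the conversion helper below uses only its argument, so it is wrapped in staticmethod, while the instance method that needs self calls it through the class. The Temperature class and its method names are made up for this example.

```python
class Temperature(object):
    def __init__(self, celsius):
        self.celsius = celsius

    def ToFahrenheit(celsius):
        # Uses only its argument, never an instance, so there is no self.
        return celsius * 9.0 / 5.0 + 32.0
    ToFahrenheit = staticmethod(ToFahrenheit)

    def Fahrenheit(self):
        # An ordinary method that does need the instance.
        return Temperature.ToFahrenheit(self.celsius)


boiling = Temperature.ToFahrenheit(100)    # 212.0, no instance required
body = Temperature(37).Fahrenheit()
```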
A class method is similar to a static method in that it has no self
argument. Instead, it receives a class as its first argument. By convention,
this argument is called cls. A class method is defined using
the classmethod descriptor:
    class MyClass(object):
        def SomeMethod(cls, x):
            print cls, x
        SomeMethod = classmethod(SomeMethod)

    class DerivedClass(MyClass):
        pass

>>> MyClass.SomeMethod(15)
<class '__main__.MyClass'> 15
>>> obj = MyClass()
>>> obj.SomeMethod(15)
<class '__main__.MyClass'> 15
>>> DerivedClass.SomeMethod(150)
<class '__main__.DerivedClass'> 150
In the last call, you can see that only the class involved in making the method
call defines the value of the cls argument. This is despite the fact
that the method has been defined in a different class.
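One place where this behavior pays off is an alternative constructor. The following is only a sketch with made-up names (Point, LabeledPoint, FromString): because the method creates the object through cls, a derived class gets instances of its own type without overriding anything.

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def FromString(cls, text):
        # cls is whichever class the call was made through.
        x, y = text.split(",")
        return cls(int(x), int(y))
    FromString = classmethod(FromString)


class LabeledPoint(Point):
    label = "unnamed"


p = Point.FromString("1,2")           # an instance of Point
q = LabeledPoint.FromString("3,4")    # an instance of LabeledPoint
```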
We have already seen three different descriptors:
- property
- staticmethod
- classmethod
But what exactly is a descriptor?
In general, a descriptor is an object attribute with "binding behavior",
one whose attribute access has been overridden by methods in the descriptor protocol.
Those methods are __get__, __set__, and __delete__.
If any of those methods are defined for an object, it is said to be a descriptor.
[2]
When executing the assignment x.m = y, then m may be an
object that defines a __set__ method. If that's the case,
that method is called to perform the assignment.
The following example shows how to define each of the three methods
from the descriptor protocol:
    class MyDescriptor(object):
        def __get__(self, obj, type=None):
            print "get", self, obj, type
            return "The value"

        def __set__(self, obj, value):
            print "set", self, obj, value
            return None

        def __delete__(self, obj):
            print "delete", self, obj
            return None

    class SomeClass(object):
        m = MyDescriptor()

Note: Both classes must be derived from object.
Now we can start using the descriptor:
>>> x = SomeClass()
>>> print x.m
get <__main__.MyDescriptor object at 0x12345678>
<__main__.SomeClass object at 0x23456789> <class '__main__.SomeClass'>
The value
>>> x.m = 1000
set <__main__.MyDescriptor object at 0x12345678>
<__main__.SomeClass object at 0x23456789> 1000
>>> del x.m
delete <__main__.MyDescriptor object at 0x12345678>
<__main__.SomeClass object at 0x23456789>
Here's how the descriptor methods get called:
- When writing an attribute, the __setattr__ method invokes the descriptor's
  __set__ method.
- When reading an attribute, the __getattribute__ method invokes the
  descriptor's __get__ method.
- When deleting an attribute, the __delattr__ method invokes the descriptor's
  __delete__ method.
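The read case can be spelled out by hand. This is a simplified sketch, not the full lookup algorithm: it fetches the descriptor from the class dictionary and calls __get__ directly, which is roughly what the inherited __getattribute__ does for an attribute found on the class. The names Ten and Holder are hypothetical.

```python
class Ten(object):
    """A minimal descriptor that always returns 10 (hypothetical example)."""

    def __get__(self, obj, objtype=None):
        return 10


class Holder(object):
    m = Ten()


h = Holder()

# Ordinary attribute access ...
via_attribute = h.m

# ... and the same thing done manually through the descriptor protocol.
descriptor = Holder.__dict__["m"]
via_protocol = descriptor.__get__(h, Holder)
```

Both via_attribute and via_protocol end up as 10.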
A few caveats:
- Both the descriptor class and the class using it must be new-style
classes.
- When overriding __setattr__, __getattribute__, and __delattr__, make sure to
  invoke the inherited method. (That is, extend these methods; don't override
  them.) Otherwise, the descriptor mechanism will stop working.
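Here is a sketch of the second caveat, using invented names (Celsius, Thermometer, log). The overriding __setattr__ does its own bookkeeping and then hands the assignment to the inherited method, which is the part that finds the descriptor and calls its __set__.

```python
class Celsius(object):
    """Data descriptor that stores a float under a private name (hypothetical)."""

    def __get__(self, obj, objtype=None):
        return obj._celsius

    def __set__(self, obj, value):
        # This assignment goes through Thermometer.__setattr__ as well.
        obj._celsius = float(value)


class Thermometer(object):
    temperature = Celsius()

    def __setattr__(self, name, value):
        log.append(name)
        # Extend, don't replace: the inherited method invokes the descriptor.
        object.__setattr__(self, name, value)


log = []
t = Thermometer()
t.temperature = "21.5"    # converted to a float by Celsius.__set__
```

If the call to object.__setattr__ were left out, neither the descriptor nor the instance dictionary would ever see the value.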
As a side note: In his How-To Guide
[2], Raymond Hettinger has this to say about the
difference between data and non-data descriptors:
If an object defines both __get__ and __set__,
it is considered a data descriptor. Descriptors that only define __get__
are called non-data descriptors (they are typically used for methods but other
uses are possible).
Data and non-data descriptors differ in how overrides are calculated with
respect to entries in an instance's dictionary. If an instance's dictionary has
an entry with the same name as a data descriptor, the data descriptor takes
precedence. If an instance's dictionary has an entry with the same name as a
non-data descriptor, the dictionary entry takes precedence.
To make a read-only data descriptor, define both __get__
and __set__ with the __set__ raising an
AttributeError when called. Defining the __set__
method with an exception raising placeholder is enough to make it a data descriptor.
However, I did not manage to add a dictionary entry in such a way that a non-data
descriptor's __get__ stopped being called. Maybe I tried the
wrong things, maybe I'm misunderstanding something. In the meantime,
I'll just follow Raymond's advice about defining both __get__
and __set__ for read-only data descriptors.
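A possible explanation for the result above: MyDescriptor from the earlier example defines __set__, which makes it a data descriptor, and a plain assignment such as x.m = 1000 is routed to __set__ instead of into the instance dictionary. The following sketch, with made-up class names, uses one descriptor that defines only __get__ and one that defines both methods, and writes to __dict__ directly.

```python
class NonData(object):
    def __get__(self, obj, objtype=None):
        return "from descriptor"


class Data(object):
    def __get__(self, obj, objtype=None):
        return "from descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class Example(object):
    non_data = NonData()
    data = Data()


e = Example()
# Write straight into the instance dictionary, bypassing __setattr__.
e.__dict__["non_data"] = "from instance dict"
e.__dict__["data"] = "from instance dict"

shadowed = e.non_data    # the dictionary entry wins over a non-data descriptor
protected = e.data       # a data descriptor wins over the dictionary entry
```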
An interesting new feature that I discovered in Guido van Rossum's paper
on new-style classes [1] is that of "slots". Here's
how it works:
    class X(object):
        __slots__ = ["m", "n"]

>>> x = X()
>>> x.m = 10
>>> x.n = 10
>>> x.k = 3
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
AttributeError: 'X' object has no attribute 'k'
__slots__ reserves space for the listed variables directly in the
instance. Classes that define slots don't have an instance dictionary
(__dict__). If you try to assign to an attribute that's not in
__slots__, you receive an error. This
may be quite useful for struct-like classes, because it prevents
problems with misspelled attribute names.
The main use seems to be in classes that derive from built-in types.
For example, a derived dict can have a few slots for all additional
attributes that it needs. No second __dict__ has to be created
for these attributes, which saves space.
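Both points can be checked with a short sketch. The class names Pixel and CountingDict and the attribute hits are invented for the example.

```python
class Pixel(object):
    __slots__ = ["x", "y"]


class CountingDict(dict):
    # One slot for the extra attribute; no per-instance __dict__ is created.
    __slots__ = ["hits"]


p = Pixel()
p.x = 1
pixel_has_dict = hasattr(p, "__dict__")     # False

try:
    p.z = 3                                 # not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

d = CountingDict(a=1)
d.hits = 0
dict_has_dict = hasattr(d, "__dict__")      # False
```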
Just be warned that a slot in a derived class hides a slot of the same name
in the base class.
If you are like me, then you probably always thought of the __init__
method as the Python equivalent of what is called a constructor in C++.
This isn't the whole story.
When an instance of a class is created, Python first calls the
__new__ method of the class. __new__ is a static
method that is called with the class as its first argument. __new__
returns a new instance of the class.
The __init__ method is called afterwards to initialize
the instance. In some situations (think "unpickling"!),
no initialization is performed. Also, immutable types like int
and str are completely constructed by the __new__
method; their __init__ method does nothing. This way, it
is impossible to circumvent immutability by explicitly calling
the __init__ method after construction.
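A minimal sketch of both points, with hypothetical names (Tracked, Kelvin, order): the first class records the order in which the two methods run, and the second derives from the immutable type float, so its value has to be validated and fixed inside __new__.

```python
class Tracked(object):
    def __new__(cls, *args):
        order.append("__new__")
        # Create the bare instance; __init__ receives the arguments afterwards.
        return object.__new__(cls)

    def __init__(self, *args):
        order.append("__init__")


class Kelvin(float):
    def __new__(cls, degrees):
        # For an immutable type, the value must be chosen here.
        if degrees < 0:
            raise ValueError("temperature below absolute zero")
        return float.__new__(cls, degrees)


order = []
Tracked(1, 2)
k = Kelvin(293.15)    # behaves like a float, but is an instance of Kelvin
```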
There's a new way in which Python handles method resolution
in connection with multiple inheritance. Personally, I try to avoid multiple
inheritance whenever possible, so I won't go into detail here. However, what
I did learn is that when someone else derives multiply from my own classes,
it would be nice if my classes performed what's called "cooperative super calls".
In short, instead of the old
<base-class>.<inherited-method>(self),
one should call
super(<own-class>, self).<inherited-method>().
    class BaseClass(object):    # must be a new-style class for super() to work
        def Method(self):
            pass

    class DerivedClass(BaseClass):
        def Method(self):
            super(DerivedClass, self).Method()

Seems to be fairly easy, and if it helps... ;-)
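The benefit shows up once someone builds a diamond on top of such classes. In this sketch (the class names and the trace list are made up), every class calls super, and each Method runs exactly once, in method resolution order.

```python
class Base(object):
    def Method(self):
        trace.append("Base")


class Left(Base):
    def Method(self):
        trace.append("Left")
        super(Left, self).Method()


class Right(Base):
    def Method(self):
        trace.append("Right")
        super(Right, self).Method()


class Bottom(Left, Right):
    def Method(self):
        trace.append("Bottom")
        super(Bottom, self).Method()


trace = []
Bottom().Method()
# trace is now ["Bottom", "Left", "Right", "Base"]
```

Note that the super call in Left reaches Right, not Base. With the old Base.Method(self) style, Right.Method would have been skipped.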
We've seen how new-style classes can be used to
- Define properties
- Define static and class methods
- Define descriptors
- Assign attribute slots
- Override the constructor __new__
I haven't mentioned metaclasses at all. I haven't used metaclasses myself,
so I had better refer you to the experts. There are a number of links to
metaclass-related articles on the New-style Classes page at python.org [3].
I hope this article has been useful. For any questions, suggestions,
or comments, please feel free to e-mail me.
[1] Unifying types and classes in Python 2.2, by Guido van Rossum.
[2] How-To Guide for Descriptors, by Raymond Hettinger.
[3] New-style Classes at python.org.
[4] OOP in Python after 2.2, by Michael Hudson. [Some kind of an ASCII slide show; very concise.]
The author is a former C++-only programmer who has been cursing Python for
years for not providing anything like Delphi's property keyword.
(The author would never curse C++ for not providing properties.)