This post originated from an RSS feed registered with Python Buzz
by Dmitry Dvoinikov.
Original Post: This is Python, dot operator and the magic "self"
Feed Title: Things That Require Further Thinking
Feed URL: http://feeds.feedburner.com/ThingsThatRequireFurtherThinking
Feed Description: Once your species has evolved language, and you have learned language, [...] and you have something to say, [...] it doesn't take much time, energy and effort to say it. The hard part of course is having something interesting to say.
-- Geoffrey Miller
Although syntactically similar to "regular" imperative programming languages which support OOP and everything, Python offers extra semantic freedom that stops just short of being magic.
Suppose you have a reference to some object, perhaps in a variable, found by the lookup process described in my previous post. And so, you have
    x
some variable. Since it contains a reference to an object (and it always does), you can access that object through the variable by applying all sorts of operators to it:
    x += 1
    x["foo"] = "bar"
    x(1, 2)
    x.foo("bar")
and so on. Whether or not each of those accesses will succeed depends on the target object, but the worst thing that could happen if you mistreat an object is a runtime exception, for example:
    x = 1
    x()
results in
TypeError: 'int' object is not callable
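Since the failure is an ordinary exception, it can be caught like any other. A minimal sketch, where the helper name try_call is made up for illustration:

```python
def try_call(obj):
    """Call obj with no arguments and report what happened."""
    try:
        return ("ok", obj())
    except TypeError as exc:
        # Raised here when the object's type does not support being called.
        return ("error", str(exc))


print(try_call(1))      # an int cannot be called
print(try_call(list))   # a class can be called and returns a new instance
```

Nothing is checked ahead of time: whether the call works is decided only when the line actually runs.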
Let's keep on looking. Since Python is an OOP-capable language (whatever on Earth that means), it supports classes and methods:
    class C:
        def foo(self, x):
            print(x)
and allows overriding the reaction to some of the operators. For example, the following pieces of code, one in Python and one in C++, have similar meaning:
    class C:
        def __call__(self):
            pass

    -vs-

    class C
    {
    public:
        void operator()(void) {}
    };
and it might seem that there is no difference except for the Python way of having a fancy double-underscore method for anything advanced, but in fact Python offers more.
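To make the operator-to-method mapping concrete, here is a small sketch. The class name Recorder and its log attribute are invented for this example; the double-underscore method names are the standard ones.

```python
class Recorder:
    """Logs which special method each operator ends up calling."""

    def __init__(self):
        self.log = []

    def __iadd__(self, other):            # x += other
        self.log.append(("__iadd__", other))
        return self                       # += rebinds x to whatever is returned

    def __setitem__(self, key, value):    # x[key] = value
        self.log.append(("__setitem__", key, value))

    def __call__(self, *args):            # x(*args)
        self.log.append(("__call__", args))


x = Recorder()
x += 1
x["foo"] = "bar"
x(1, 2)
print(x.log)
```

Each operator applied to x ends up as a plain method call on its class, which is all that "overriding an operator" amounts to.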
Python allows overriding the "dot" operator. For example, the following class (despite being a little unclean) appears to support just about any method you throw at it:
    class C:
        def __getattr__(self, name):
            def any_method(*args, **kwargs):
                print(name, args, kwargs)
            return any_method
        def i_exist(self):
            print("i would not budge")

    c = C()
    c.ping()
    c.add(1, 2)
    c.lookup([1, 2], key = 1)
    c.i_exist()
prints out
    ping () {}
    add (1, 2) {}
    lookup ([1, 2],) {'key': 1}
    i would not budge
The magic method is apparently __getattr__. It is invoked when you apply the dot operator to a class instance and the normal lookup finds no attribute with that name. Note how the i_exist method stepped up despite __getattr__ being overridden.
    x.foo
     ^---- __getattr__ is invoked when the dot is crossed
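The fallback nature of __getattr__ is easy to check. In this sketch the class Fallback and its attributes present and misses are hypothetical names picked for the demonstration:

```python
class Fallback:
    def __init__(self):
        self.present = "stored on the instance"
        self.misses = []

    def __getattr__(self, name):
        # Reached only after the normal attribute lookup has failed.
        self.misses.append(name)
        return "made up for " + name


f = Fallback()
print(f.present)   # found normally, __getattr__ is not involved
print(f.absent)    # not found, so __getattr__ supplies a value
print(f.misses)    # only the failed lookup was recorded
```

The hook that runs on every attribute access, found or not, is the separate __getattribute__ method; __getattr__ only covers the misses.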
So what does it mean ? It means that you can override anything, including the dot operator, something that statically typed compiled languages generally do not offer in this form, and this feature makes it really simple to hide all sorts of advanced behavior behind a simple method access. For example, consider the XML-RPC client in Python:
    from xmlrpc.client import ServerProxy

    p = ServerProxy("http://1.2.3.4:5678")
    p.AddNumbers(1, 2, 3)
and see how straightforward the access to a network service with procedural interface is. ServerProxy class simply intercepts the method access and turns it into a network call. This is done transparently at runtime with no need to recompile any stub or anything - you can access any target service method without any preparation. Compare this to an XMLRPC client library of your choice.
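The interception idea can be sketched without any network at all. The following is a toy stand-in, not the actual ServerProxy implementation: the class RecordingProxy and its sent list are invented here, and a request is merely recorded where a real client would serialize and transmit it.

```python
class RecordingProxy:
    """Toy stand-in for a remote proxy: records calls instead of sending them."""

    def __init__(self, url):
        self._url = url
        self.sent = []

    def __getattr__(self, method_name):
        # Any unknown attribute is treated as the name of a remote method.
        def remote_call(*params):
            request = (self._url, method_name, params)
            self.sent.append(request)   # a real client would send this over the wire
            return request
        return remote_call


p = RecordingProxy("http://1.2.3.4:5678")
p.AddNumbers(1, 2, 3)
print(p.sent)
```

The proxy never declares AddNumbers; the method name arrives as a string at the moment the dot is crossed, which is why no stub has to be generated in advance.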
Now take a look at the following fictional line:
foo.bar["biz"]("baz").keep.on("going")
Can you see now that every delimiter (except for the quotes of the string literals) can be intercepted and have its behavior modified ? Given this, I can (and almost universally do) apply aesthetic thinking - how would I like my code to look ? One of the Python principles is to have code (pleasantly) readable. In each case, for each relation between program modules (whatever that means) I can have it
and so on. Depending on the situation I can pick whichever option makes the code clearer. And guess what ? Overriding the dot is sometimes useful.
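As an illustration of intercepting every delimiter in a line like foo.bar["biz"]("baz").keep.on("going"), here is a sketch of an object that simply records the path taken through it. The class name Trace is made up for this example.

```python
class Trace:
    """Every dot, subscript and call returns a new Trace with a longer path."""

    def __init__(self, path="foo"):
        self._path = path

    def __getattr__(self, name):          # the dot
        return Trace(self._path + "." + name)

    def __getitem__(self, key):           # the square brackets
        return Trace(self._path + "[" + repr(key) + "]")

    def __call__(self, *args):            # the parentheses
        return Trace(self._path + "(" + ", ".join(map(repr, args)) + ")")

    def __str__(self):
        return self._path


foo = Trace()
print(foo.bar["biz"]("baz").keep.on("going"))
```

Three special methods are enough to cover the whole expression; what each of them does with the intercepted step is entirely up to the class.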
Anyhow, this is only half of the story.
The other half is told from the other side of the dot. See, __getattr__ tells an instance that an attribute it does not have is being accessed, and lets the instance decide what to return. But Python also allows the accessed member to be notified whenever it is accessed as a member of some other instance. Sounds weird ? Take a look at this:
    class Member:
        def __get__(self, instance, owner):
            print("I'm a member of {0}".format(instance))
            return self

    class C:
        x = Member()

    c = C()
    c.x
prints out
I'm a member of <__main__.C object at ...>
See ? The Member instance, being a member of some other class, is notified whenever it is accessed through that class or its instances. Where can it be useful, you may ask ? Oh, it is the key to the magic "self" in Python.
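The two arguments of __get__ are worth a closer look. In the sketch below, the seen list is an addition made for this example; it records what __get__ receives for an access through an instance and for an access through the class.

```python
class Member:
    def __init__(self):
        self.seen = []

    def __get__(self, instance, owner):
        # instance is None when the attribute is fetched from the class itself.
        self.seen.append((instance, owner))
        return self


class C:
    x = Member()


c = C()
c.x                          # access through an instance
C.x                          # access through the class
member = C.__dict__["x"]     # reading the class dictionary bypasses __get__
print(member.seen)
```

The first record carries the instance c, the second carries None, and both carry the owning class C.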
Consider the following most simple piece of code:
    class C:
        def foo(self):
            print(self)
Have you ever wondered what "self" is ? I mean - it obviously is an argument containing a reference to the instance the method is called on, but where did it come from ? It doesn't even have to be called "self", it is just a convention, the following will work just as well:
    class C:
        def foo(magic):
            print(magic)
And so it turns out that somehow at the moment of the invocation the first argument of every method points to the containing instance. How is it done ?
What happens when you do
    c = C()
    c.foo()
anyhow ? At first sight, access to c.foo should return a reference to a method - something related to C and irrelevant to c. But it appears that the following two accesses to foo
    c1 = C()
    c1.foo
    c2 = C()
    c2.foo
fetch different things - c1.foo returns a method with its first argument set to c1 and c2.foo - to c2. How could that happen ? The key here is that you access a method (which is a member of a class) through a class instance. The class itself contains its methods in a half-cooked "unbound" state; they don't have any "self":
    class C:
        def foo(self):
            pass

    print(C.foo)
    print(C().foo)
prints out
    <function C.foo at ...>
    <bound method C.foo of <__main__.C object at ...>>
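The difference between the bound methods can be inspected directly. This sketch relies on the __self__ and __func__ attributes that bound methods carry in Python 3:

```python
class C:
    def foo(self):
        return self


c1 = C()
c2 = C()

# Each access through an instance yields a method object bound to that instance.
print(c1.foo.__self__ is c1)       # the instance the method is bound to
print(c2.foo.__self__ is c2)
print(c1.foo.__func__ is C.foo)    # the shared underlying function
print(c1.foo == c2.foo)            # same function, different instances
```

Both bound methods wrap the very same function; only the instance they carry along differs.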
See ? When fetched directly from a class, a method is nothing but a regular function, it is not "bound" to anything. You can even call it, but you will have to provide its first argument "self" yourself as you see fit:
    class C:
        def foo(self):
            print(self)

    C.foo("123")
prints out
123
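Passing a real instance by hand gives the same result as the usual call through the instance. A small sketch, with the class Greeter invented for the purpose:

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, punctuation):
        return "hello, " + self.name + punctuation


g = Greeter("world")
via_instance = g.greet("!")          # self is filled in automatically
via_class = Greeter.greet(g, "!")    # self is passed by hand
print(via_instance == via_class)
```

So the call through the instance is the call through the class with the first argument already supplied.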
But as soon as you instantiate the class and fetch the same method through the instance, the magic __get__ method comes into play and allows the returned reference to be "bound" to the actual instance. Something like this:
    class C:
        foo = Method(lambda self, *args, **kwargs: print(self, args, kwargs))

    c = C()
    print(c)
    c.foo(1, 2, foo = "bar")
prints out
    <__main__.C object at 0x00ADA0D0>
    <__main__.C object at 0x00ADA0D0> (1, 2) {'foo': 'bar'}
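The Method class used above is not defined anywhere in this text. One way it could be written is sketched below; this is an assumption about its shape rather than the original definition, and the lambda here returns its arguments instead of printing them so that the result can be inspected.

```python
class Method:
    """Hypothetical descriptor that binds a plain function to an instance."""

    def __init__(self, function):
        self._function = function

    def __get__(self, instance, owner):
        if instance is None:
            return self._function        # fetched from the class: leave it unbound

        def bound(*args, **kwargs):
            # Supply the instance as the first argument, just like "self".
            return self._function(instance, *args, **kwargs)
        return bound


class C:
    foo = Method(lambda self, *args, **kwargs: (self, args, kwargs))


c = C()
result = c.foo(1, 2, foo="bar")
print(result[0] is c, result[1:])
```

Plain functions carry a __get__ of their own, which is why methods written with an ordinary def need no such wrapper.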
And so I could demonstrate a reimplementation of a major language feature in a few lines. It may not be obviously useful most of the time, but such an experience certainly makes you understand the language better.
One more thing, have I told you Python was cool ? :)