Computing Thoughts
Self in the Argument List: Redundant is not Explicit
by Bruce Eckel
September 23, 2008

Summary
The response to arguments about self in Python is "explicit is better than implicit." In a discussion at PyCon Brazil, I realized that we do need self in the body of a method, but being forced to place it in the argument list is redundant. I'm not actually expecting it to change, but I want to try to establish that this isn't a dumb argument.

self is necessary to distinguish between global variables/functions and those of the object, for example. It provides a scoping mechanism, and I personally find it clearer than Ruby's @ and @@, though I'm sure that's just because I'm used to it, since it resembles this in C++ and Java.

Something has always bugged me about self, however, and I've blogged about it before. I had hoped that it would be rectified in Python 3, and saying so caused the usual furor, which ends with people saying "explicit is better than implicit."

In a conversation with Luciano Ramalho (president of the Brazilian Python group) while I was in Brazil, he made me realize that it isn't self everywhere that has been bugging me; it's self in the argument list, which I think could actually be called un-pythonic.

How it Works Now

Here's some simple Python code showing the use of classes:

def f(): pass
a = 1

class C1(object):
    a = 2
    def m1(self):
        print a # Prints '1'
        print self.a # Prints '2'
        f() # The global version
        self.m2() # Must scope other members

    def m2(self): pass

obj = C1()
obj.m1()

First, you see f() and the global a, so we have something to call at the global scope. The class C1 is defined by inheriting from object, which is the standard procedure for defining a new class (I think this might become implicit in Python 3).

Note that both m1() and m2() have a first argument of self. In Python, self is not a keyword, but the name "self" is conventionally used to represent the address of the current object. The address of the object is always the first argument.
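
To make the "not a keyword" point concrete, here is a minimal sketch. The class name Point and the parameter names this and me are hypothetical, chosen only for illustration; the print call is written as a function so the snippet also runs on Python 3.

```python
import keyword

# Hypothetical class for illustration: the first parameter can have any name.
class Point(object):
    def __init__(this, x):   # "this" receives the new object, just as "self" would
        this.x = x

    def double(me):          # any valid identifier is accepted here
        return me.x * 2

p = Point(21)
result = p.double()
is_kw = keyword.iskeyword("self")   # the language does not reserve "self"

print(result)   # 42
print(is_kw)    # False
```

The convention of writing self is still worth following; the sketch only shows that the interpreter cares about the position of the argument, not its name.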

The a that is defined at class scope represents one way to create object fields, but you can also just assign to self.a within a method, and the first time this happens the storage will be created for that field. However, the two versions of a must now be differentiated. If you just say a within a method, you'll get the global version, but self.a produces the object field (you can also assign to global variables from within classes, but I'll skip that for the current discussion).
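
The two ways of creating a field can be observed directly. This is a small sketch with a hypothetical class Counter: the object starts with no storage of its own for count, and the first assignment through self creates it without touching the class-scope value.

```python
# Hypothetical class for illustration.
class Counter(object):
    count = 0                          # defined at class scope

    def bump(self):
        # The read finds the class-scope value; the assignment
        # creates a separate field stored on this object.
        self.count = self.count + 1

c = Counter()
before = dict(vars(c))                 # per-object storage before any assignment
c.bump()
after = dict(vars(c))                  # per-object storage after the assignment

print(before)          # {}
print(after)           # {'count': 1}
print(Counter.count)   # 0
```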

Similarly, an unqualified call to f() produces the global function, and self.m2() calls the member function by qualifying it (and simultaneously passing the address of the current object to be used as the self argument for m2()).
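
As a rough illustration of that qualification step: looking up m2 through an object yields a bound method that already carries the object, which is how self gets filled in. The return values below are added only so the effect can be inspected; they are not part of the original example.

```python
class C1(object):
    def m1(self):
        return self.m2()    # qualifying with self passes the object along

    def m2(self):
        return self         # hand back whatever arrived as self

obj = C1()
bound = obj.m2              # a bound method: function plus object

print(bound.__self__ is obj)   # True
print(obj.m1() is obj)         # True
```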

Now let's look at a class with a method that has arguments:

class C2(object):
    def m2(self, a, b): pass

To call the method, we create an instance of the object and use the dot notation to call m2() on the object obj:

obj = C2()
obj.m2(1,2)

In the call, the address of obj is implicitly passed as self for the call to m2(), and here we see a big inconsistency: why is explicit better than implicit when you define the method, but it's OK to be implicit when you call the method?

I certainly think that the method call syntax is desirable, but it means that you define a method differently than you call it, which I don't see as either "explicit" or pythonic. This is seen when you call the method with the wrong number of arguments:

obj.m2(1)

Here's the resulting error:

Traceback (most recent call last):
  File "classes.py", line 9, in <module>
    obj.m2(1)
TypeError: m2() takes exactly 3 arguments (2 given)

Because of the implicit passing of self during a method call, the above error message is actually saying that it wants you to call the method this way:

C2.m2(obj,1,2)

Even though the above line does run, this of course isn't the way you actually do it; you use the normal method calling syntax and give it two arguments:

obj.m2(1,2)
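
The equivalence of the two call forms can be checked with a small sketch. The method body here returns its arguments, which the original m2() does not do; that change is only there to make the comparison visible.

```python
class C2(object):
    def m2(self, a, b):
        return (self, a, b)            # return what was received, for inspection

obj = C2()
via_instance = obj.m2(1, 2)            # self is supplied implicitly
via_class = C2.m2(obj, 1, 2)           # self is supplied by hand

print(via_instance == via_class)       # True
print(via_instance[0] is obj)          # True
```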

The message m2() takes exactly 3 arguments (2 given) is not only confusing for beginners, but it confuses me every time I see it, which I think suggests non-Pythonicness and points out the inconsistency between method definition and method invocation.
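
To see what your own interpreter reports, the failing call can be wrapped so the message is captured rather than ending the program. This is only a sketch: the exact wording depends on the Python version, and newer releases phrase the message differently from the one quoted above, so no particular text is assumed here.

```python
class C2(object):
    def m2(self, a, b):
        pass

obj = C2()
try:
    obj.m2(1)                 # one argument short
    message = None
except TypeError as exc:
    message = str(exc)        # wording varies between Python versions

print(message)
```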

The Hopeless Suggestion

So what am I suggesting, despite the long history of hopelessness for this idea?

Make self a keyword in Python 3.1 (what's a bit more backwards incompatibility, as long as we're at it?), or even use the name this instead, to make the transition easier for C++ and Java programmers. All the existing rules for self remain the same.

The only difference: you don't have to put self in a method argument list. That's the only place it becomes implicit; everywhere else it's explicit -- except, just as it is now, the method call.

This produces consistency between the method definition and the method call, so you define a method with the same number of arguments that you call it with. When you call a method with the wrong number of arguments, the error message tells you the actual number of arguments the method is expecting, instead of one more.
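
The mismatch that this proposal would remove can be measured on a current interpreter. The sketch below assumes Python 3.3 or later, where the standard library provides inspect.signature; it compares the parameters of the method as defined with the parameters a caller actually supplies.

```python
import inspect

class C2(object):
    def m2(self, a, b):
        pass

obj = C2()

# Parameters as written in the definition (looked up on the class).
defined = list(inspect.signature(C2.m2).parameters)

# Parameters a caller must supply (looked up on the object).
supplied = list(inspect.signature(obj.m2).parameters)

print(defined)    # ['self', 'a', 'b']
print(supplied)   # ['a', 'b']
```

Under the suggestion, both lists would have the same length, because self would no longer appear in the definition.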

Explicit vs. Redundant

Before I hear "explicit is better than implicit" one more time, there's a difference between making something clear and making it redundant. We already have a language that forces you to jump through lots of hoops for reasons that must have seemed good at the time but have since worn thin: it's called Java.

If we just want to be explicit about absolutely everything, we can use C or assembler or some language that spells out exactly what's happening inside the machine all the time and doesn't abstract away from those details.

Forcing programmers to put self in the method argument list doesn't honor explicitness; it's just redundant forced behavior. It doesn't add to the expression of programming (we already know it's a method; we don't need self in the argument list to remind us), it's just mechanical, and thus, I argue, non-pythonic.

About the Blogger

Bruce Eckel (www.BruceEckel.com) provides development assistance in Python with user interfaces in Flex. He is the author of Thinking in Java (Prentice-Hall, 1998, 2nd Edition, 2000, 3rd Edition, 2003, 4th Edition, 2005), the Hands-On Java Seminar CD ROM (available on the Web site), Thinking in C++ (PH 1995; 2nd edition 2000, Volume 2 with Chuck Allison, 2003), C++ Inside & Out (Osborne/McGraw-Hill 1993), among others. He's given hundreds of presentations throughout the world, published over 150 articles in numerous magazines, was a founding member of the ANSI/ISO C++ committee and speaks regularly at conferences.

