The Artima Developer Community
Weblogs Forum
Generic functions vs mixins: a case study

11 replies on 1 page. Most recent reply: Jan 25, 2012 7:48 AM by Don Sawatzky

Michele Simionato

Posts: 222
Nickname: micheles
Registered: Jun, 2008

Generic functions vs mixins: a case study (View in Weblogs)
Posted: Sep 2, 2008 10:23 PM
Summary
Just yesterday at work I had a good real-life use case for generic functions which deserved a blog post.

In the last few weeks my colleagues and I have been involved in a project which required a command line interface. We built it on top of the cmd module in the standard Python library, to which we added a network layer using Twisted. In the end, we had classes interacting with the standard streams stdin, stdout, stderr and classes interacting with nonstandard streams such as Twisted transports. All the I/O was line oriented and we basically needed three methods:

  • print_out(self, text, *args) to print a line on self.stdout
  • print_err(self, text, *args) to print a line on self.stderr
  • readln_in(self) to read a line from self.stdin

Depending on the type of self, self.stdout was sys.stdout, a Twisted transport, a log file or a file-like wrapper to a database. Likewise for self.stderr and self.stdin.

This is a problem that begs for generic functions. Unfortunately, nobody in the Python world uses them (with the exception of P. J. Eby) so for the moment we are using a suboptimal design involving mixins instead. I am not really happy with that. The aim of this blog post is to explain why a mixin solution is inferior to a generic functions solution.

A mixin-oriented solution

In the mixin solution, instead of generic functions one uses plain old methods, stored in a mixin class. In this specific case let me call the class StdIOMixin:

import os
import sys

class StdIOMixin(object):
    "A mixin implementing line-oriented I/O"
    stdin = sys.stdin
    stdout = sys.stdout
    stderr = sys.stderr
    linesep = os.linesep

    def print_out(self, text, *args):
        "Write a line on self.stdout, flushing afterwards"
        write(self.stdout, str(text) + self.linesep, *args)

    def print_err(self, text, *args):
        "Write a line on self.stderr, flushing afterwards"
        write(self.stderr, str(text) + self.linesep, *args)

    def readln_in(self):
        "Read a line from self.stdin (without trailing newline) or None"
        line = self.stdin.readline()
        if line:
            # strip the trailing newline only if there is one
            return line.rstrip('\n')

where write is the following helper function:

def write(stream, text, *args):
    'Write on a stream by flushing if possible'
    if args: # when no args, do not consider '%' a special char
        text = text % args
    stream.write(text)
    flush = getattr(stream, 'flush', False)
    if flush:
        flush()

StdIOMixin is there to be mixed with other classes, providing them with the ability to perform line-oriented I/O. By default, it works on the standard streams, but if the client class overrides the attributes stdout, stderr, stdin with suitable file-like objects, it can be made to work with Twisted transports, files and databases. For instance, here is an example where stdout and stderr are overridden as files:

class FileIO(StdIOMixin):
    def __init__(self):
        self.stdout = open('out.txt', 'w')
        self.stderr = open('err.txt', 'w')

>>> FileIO().print_out('hello!') # prints a line on out.txt

The design works and it looks elegant, but still I say that it is sub-optimal compared to generic functions.
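For readers on Python 3, the mixin design above can be sketched roughly as follows. This is only an illustration under a few assumptions: the client class MemoryIO is hypothetical, in-memory io.StringIO buffers stand in for files or transports, and linesep is set to "\n" on the assumption that the streams are text-mode.

```python
import io
import sys


def write(stream, text, *args):
    """Write text on a stream, flushing if the stream supports it."""
    if args:  # only treat '%' as special when arguments are given
        text = text % args
    stream.write(text)
    flush = getattr(stream, "flush", None)
    if flush is not None:
        flush()


class StdIOMixin(object):
    """A mixin implementing line-oriented I/O (Python 3 sketch)."""
    stdin = sys.stdin
    stdout = sys.stdout
    stderr = sys.stderr
    linesep = "\n"  # assumption: text-mode streams handle newline translation

    def print_out(self, text, *args):
        write(self.stdout, str(text) + self.linesep, *args)

    def print_err(self, text, *args):
        write(self.stderr, str(text) + self.linesep, *args)

    def readln_in(self):
        line = self.stdin.readline()
        if line:
            return line.rstrip("\n")  # returns None at end of input


class MemoryIO(StdIOMixin):
    """Hypothetical client class: overrides the streams with buffers."""
    def __init__(self, input_text=""):
        self.stdin = io.StringIO(input_text)
        self.stdout = io.StringIO()
        self.stderr = io.StringIO()


client = MemoryIO("first line\nsecond line\n")
client.print_out("hello %s!", "world")
client.print_err("something went wrong")
```

The client class only overrides three attributes; all the behaviour comes from the mixin, which is exactly what makes the three methods appear on every class that inherits from it.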

The basic problem of this design is that it adds methods to the client classes and therefore it adds to the learning curve. Suppose you have four client classes - one managing the standard streams, one managing files, one managing Twisted transports and one managing database connections - then you have to add the mixin four times. If you generate the documentation for your classes, the methods print_out, print_err and readln_in will be documented four times. And this is not a shortcoming of pydoc: the three methods really are cluttering your application, in direct proportion to the number of classes you have.
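The linear growth can be made concrete with a small sketch. The four client class names below are hypothetical, and the mixin is reduced to empty method bodies since only the namespaces matter here.

```python
class StdIOMixin(object):
    """Stripped-down stand-in for the mixin: bodies omitted on purpose."""
    def print_out(self, text, *args): ...
    def print_err(self, text, *args): ...
    def readln_in(self): ...


# Four hypothetical client classes, one per kind of stream.
class StdStreams(StdIOMixin): pass
class FileStreams(StdIOMixin): pass
class TwistedStreams(StdIOMixin): pass
class DatabaseStreams(StdIOMixin): pass


clients = [StdStreams, FileStreams, TwistedStreams, DatabaseStreams]
io_methods = ("print_out", "print_err", "readln_in")

# Every client class exposes all three methods in its own namespace...
exposed = {cls.__name__: [m for m in io_methods if m in dir(cls)]
           for cls in clients}
# ...so the number of visible (and documented) methods grows with the classes.
total = sum(len(names) for names in exposed.values())
print(total)  # 12
```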

Moreover, those methods add to the pollution of your class namespace, with the potential risk of name collisions. In large frameworks (e.g. Plone, where a class may have 700+ attributes) this is a serious problem: for instance, you cannot even use auto-completion, since there are just too many completions. I am very sensitive to namespace pollution, so I always favor approaches that avoid it.

Also, suppose you only need the print_out functionality; the mixin approach naturally invites you to include the entire StdIOMixin, importing into your class methods you don't need. The alternative would be to create three mixin classes StdinMixin, StdoutMixin, StderrMixin, but most of the time you would need all of them; it seems overkill to complicate your inheritance hierarchy so much for such simple functionality.

As you may know, I am always looking for solutions avoiding (multiple) inheritance and generic functions fit the bill perfectly.

A generic functions solution

I am sure most people do not know about it, but Python 2.5 ships with an implementation of generic functions in the standard library, in the pkgutil module (by P.J. Eby). Currently, the implementation is only used internally in pkgutil and it is completely undocumented; therefore I never had the courage to use it in production, but it works well. Although it is simple, it covers most practical uses of generic functions. For instance, in our case we need three generic functions:

from pkgutil import simplegeneric

@simplegeneric
def print_out(self, text, *args):
    if args:
        text = text % args
    print >> self.stdout, text

@simplegeneric
def print_err(self, text, *args):
    if args:
        text = text % args
    print >> self.stderr, text

@simplegeneric
def readln_in(self):
    "Read a line from self.stdin (without trailing newline) or None"
    line = self.stdin.readline()
    if line:
        # strip the trailing newline only if there is one
        return line.rstrip('\n')

The power of generic functions is that you don't need to use inheritance: print_out will work on any object with a .stdout attribute even if it does not derive from StdIOMixin. For instance, if you define the class

class FileOut(object):
    def __init__(self):
        self.stdout = open('out.txt', 'w')

the following will print a message on the file out.txt:

>>> print_out(FileOut(), 'writing on file') # prints a line on out.txt

Simple, isn't it?
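The same idea can be tried on Python 3 with functools.singledispatch, the documented single-dispatch decorator that the standard library gained in Python 3.4. Note the substitution: this sketch does not use pkgutil.simplegeneric itself, and the class BufferOut with its in-memory buffer is a hypothetical stand-in for FileOut.

```python
import io
from functools import singledispatch


@singledispatch
def print_out(self, text, *args):
    """Default implementation: print a line on self.stdout."""
    if args:
        text = text % args
    print(text, file=self.stdout)


class BufferOut(object):
    """Hypothetical class: no mixin in sight, just a .stdout attribute."""
    def __init__(self):
        self.stdout = io.StringIO()


out = BufferOut()
print_out(out, "writing on %s", "a buffer")
```

BufferOut inherits from nothing but object and gains no new methods; the default implementation applies to it simply because it has a .stdout attribute.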

Extending generic functions

One advantage of methods with respect to ordinary functions is that they can be overridden in subclasses; however, generic functions can be overridden too, by registering a specific implementation for a given type (generic functions that dispatch on more than one argument are also known as multimethods). For instance, you could define a class AddTimeStamp and override print_out to add a time stamp when applied to instances of AddTimeStamp. Here is how you would do it:

import datetime
import sys

class AddTimeStamp(object):
    stdout = sys.stdout

@print_out.register(AddTimeStamp) # add an implementation to print_out
def impl(self, text, *args):
    "Implementation of print_out for AddTimeStamp instances"
    if args:
        text = text % args
    print >> self.stdout, datetime.datetime.now().isoformat(), text

and here is an example of use:

>>> print_out(AddTimeStamp(), 'writing on stdout')
2008-09-02T07:28:46.863932 writing on stdout

The syntax @print_out.register(AddTimeStamp) is not the most beautiful in the world, but its purpose should be clear: we are registering the implementation of print_out to be used for instances of AddTimeStamp. When print_out is invoked on an instance of AddTimeStamp a time stamp is printed; otherwise, the default implementation is used.
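A Python 3 sketch of the same registration, again using functools.singledispatch as a stand-in for pkgutil.simplegeneric, might look like this. The class Plain and the in-memory buffers are assumptions added so that both code paths can be compared side by side.

```python
import datetime
import io
from functools import singledispatch


@singledispatch
def print_out(self, text, *args):
    """Default implementation, used for every unregistered type."""
    if args:
        text = text % args
    print(text, file=self.stdout)


class Plain(object):
    """Hypothetical class that keeps the default behaviour."""
    def __init__(self):
        self.stdout = io.StringIO()


class AddTimeStamp(object):
    def __init__(self):
        self.stdout = io.StringIO()


@print_out.register(AddTimeStamp)  # add an implementation to print_out
def _(self, text, *args):
    """Implementation of print_out for AddTimeStamp instances."""
    if args:
        text = text % args
    print(datetime.datetime.now().isoformat(), text, file=self.stdout)


plain, stamped = Plain(), AddTimeStamp()
print_out(plain, "no stamp")
print_out(stamped, "with stamp")
```

Only the AddTimeStamp instance gets the ISO time stamp prefix; the Plain instance falls through to the default implementation.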

Notice that since the implementation of simplegeneric is simple, the internal registry of implementations is not exposed and there is no introspection API; moreover, simplegeneric works for single dispatch only and there is no explicit support for multimethod cooperation (i.e. call-next-method, for those familiar with Common Lisp). Still, you cannot expect too much from thirty lines of code ;)
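To show what a lack of cooperation means in practice, here is a sketch of a manual workaround. It relies on functools.singledispatch (Python 3.4+), which, unlike simplegeneric as described above, does expose a dispatch() method and a registry attribute; the class Shouting is hypothetical. Nothing here is a real call-next-method: the default implementation is simply looked up by hand.

```python
import io
from functools import singledispatch


@singledispatch
def print_out(self, text, *args):
    """Default implementation."""
    if args:
        text = text % args
    print(text, file=self.stdout)


class Shouting(object):
    """Hypothetical class whose output should be upper-cased."""
    def __init__(self):
        self.stdout = io.StringIO()


@print_out.register(Shouting)
def _(self, text, *args):
    if args:
        text = text % args
    # There is no call-next-method, so fetch the implementation
    # registered for object (the default one) and call it explicitly.
    default = print_out.dispatch(object)
    default(self, text.upper())


s = Shouting()
# Dispatch looks only at the type of the first argument (single dispatch).
print_out(s, "hello %s", "there")
```

With a full multimethod system the hand-written lookup would be replaced by a call-next-method style hook; here the specialised implementation has to know which fallback it wants.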

In this example I have named the AddTimeStamp implementation of print_out impl, but you could have used any valid Python identifier, including print_out_AddTimeStamp or _, if you preferred. Since the name print_out is explicit in the decorator and since in practice you do not need to access the implementation directly, I have settled for a generic name like impl. There is no standard convention since nobody uses generic functions in Python (yet).

There were plans to add generic functions to Python 3.0, but the proposal has been postponed to Python 3.1, with a syntax yet to be defined. Nevertheless, for people who don't want to wait, pkgutil.simplegeneric is already there and you can start experimenting with generic functions right now. Have fun!


Phillip J. Eby

Posts: 28
Nickname: pje
Registered: Dec, 2004

Re: Generic functions vs mixins: a case study Posted: Sep 3, 2008 2:40 PM
Don't forget the "simplegeneric" package on PyPI, at http://pypi.python.org/pypi/simplegeneric -- it's a more robust version of its stubbed cousin in pkgutil. Also, there's PEAK-Rules, which is considerably more featureful, at the cost of a bigger footprint.

Michele Simionato

Posts: 222
Nickname: micheles
Registered: Jun, 2008

Re: Generic functions vs mixins: a case study Posted: Sep 3, 2008 9:43 PM
Thanks for mentioning simplegeneric, Phillip.
Actually, people wanting to use generic functions in production should use that package. I never had the courage to use pkgutil.simplegeneric in production, since it is an undocumented feature and I am worried that it will be removed once generic functions enter the standard library.



Posts: 6
Nickname: rhymes
Registered: Oct, 2003

Re: Generic functions vs mixins: a case study Posted: Sep 7, 2008 10:10 AM
Great discovery Michele, now we can experiment in production code at work :P

Joking aside, thanks for the article and thanks to Eby for simplegeneric :)

andrew queisser

Posts: 6
Nickname: queisser
Registered: Oct, 2003

Re: Generic functions vs mixins: a case study Posted: Sep 18, 2008 2:54 PM
Wouldn't it be easier in your case to have a Logger base class with derived FileLogger, SocketLogger, StdioLogger and then just have each class instantiate the Logger it needs?

Disclaimer: I don't know any Python so I may be misunderstanding something.

Michele Simionato

Posts: 222
Nickname: micheles
Registered: Jun, 2008

Re: Generic functions vs mixins: a case study Posted: Sep 19, 2008 3:40 AM
This is certainly a possible approach, but it is the kind of design I am trying to avoid. I do not want to put the burden of logging on the framework class. Suppose for instance I instantiate the loggers in the constructor. If, later on, I want to choose a different logger, I need to subclass and override: that is too heavyweight for me.
Alternatively, I could use Dependency Injection and pass all of its loggers to the constructor, but this is also undesirable, since I would always be complicating the signature, even if I needed a non-standard logger only a few times. I could use different factories to instantiate the class with different loggers, of course, but I think that the generic function approach is a much better solution. It clearly decouples the logging capabilities from the other features of the framework class, which stays simple. Client code does not need to touch the framework class: it just registers its preferred logging function and it is done. Moreover, generic functions are not just for logging: they are a general mechanism that you can use in many other circumstances.

Megan Gay

Posts: 2
Nickname: meg557
Registered: Sep, 2008

Re: Generic functions vs mixins: a case study Posted: Sep 20, 2008 2:50 PM
Hi, I was wondering if anyone could help me figure out how to do these problems in Python. I am so lost! Please help me if you are able to.

1) Suppose you see the following program

def main():
num = input("Enter a number: ")
for i in range(5):
num = num / 2
print num

main()

Suppose the input to this program is 1024, what is the output? Do it first without a computer, and then run it to verify (or correct) your answer.

2) In a program, you find
x=input("Please enter a number")
a) Some user gets a little literal and types in a number. What happens?
b) A slightly smarter user realizes that just typing in a number won’t work; you need quotes around text. Said user types in "a number". Now what happens?

3) Why does
x=x+1
work in Python, but
x+1=x
not work?

4)
>>> x, y, z = 1, 2,3
>>> x, y = y, z
>>> y,z = z, x
>>> z, y = x, y
>>> print x, y, z

First try to predict what the answer will be, then verify.

Kai Weber

Posts: 1
Nickname: kwbr
Registered: Jan, 2009

Re: Generic functions vs mixins: a case study Posted: Jan 5, 2009 8:21 AM
Generic functions look elegant and well suited for some cases. I even confirm they are better than mixins. What I do not like about it:

* breaks good OO design which leads to
* they are not (unit) testable. Maybe you could provide an example.

As a side note, I wonder why in your example the functions argument is self and not self.stdout, self.stderr or self.stdin?

Michele Simionato

Posts: 222
Nickname: micheles
Registered: Jun, 2008

Re: Generic functions vs mixins: a case study Posted: Jan 5, 2009 9:49 AM
> Generic functions look elegant and well suited for some
> cases. I even confirm they are better than mixins. What I
> do not like about it:
>
> * breaks good OO design which leads to
> * they are not (unit) testable. Maybe you could provide an
> example.

Maybe you could give an example, since I have no idea of what you are talking about.

> As a side note, I wonder why in your example the functions
> argument is self and not self.stdout,
> self.stderr or self.stdin?

Because I am performing the dispatching on the class of self and not on the class of self.stdout/err/in. Does that help?
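A small sketch may make the distinction concrete. It uses functools.singledispatch (Python 3.4+) as a stand-in for simplegeneric, and the classes Console and Audit are hypothetical. Both objects write to a stream of exactly the same type, yet they get different implementations, because the dispatch is on the class of self rather than on the class of self.stdout. The in-memory buffer also shows one way such functions can be exercised in a unit test.

```python
import io
from functools import singledispatch


@singledispatch
def print_out(self, text):
    """Default implementation."""
    print(text, file=self.stdout)


class Console(object):
    """Hypothetical class using the default implementation."""
    def __init__(self, stream):
        self.stdout = stream


class Audit(object):
    """Hypothetical class with its own registered implementation."""
    def __init__(self, stream):
        self.stdout = stream


@print_out.register(Audit)
def _(self, text):
    print("[audit]", text, file=self.stdout)


# Same stream object, hence same stream type, for both instances.
shared = io.StringIO()
print_out(Console(shared), "hello")
print_out(Audit(shared), "hello")
```

Had the functions taken the stream itself as their first argument, both calls would have hit the same implementation, since both streams are io.StringIO instances.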

Michael Foord

Posts: 13
Nickname: fuzzyman
Registered: Jan, 2006

Re: Generic functions vs mixins: a case study Posted: Jan 6, 2009 3:18 PM
I think using functions over methods is a backwards step. :-)

Ethan Furman

Posts: 3
Nickname: stoneleaf
Registered: Sep, 2009

Re: Generic functions vs mixins: a case study Posted: Sep 27, 2009 1:12 PM
Practicality beats purity. ;-)

An excellent article, and an excellent module. I have downloaded it from PyPI and am still studying and learning.

Thanks to both of you!

~Ethan~

Don Sawatzky

Posts: 1
Nickname: donsaw
Registered: Jan, 2012

Re: Generic functions vs mixins: a case study Posted: Jan 25, 2012 7:48 AM
I've been reading many discussions on the web from the last few years about generic functions in Python. This began with my interest in what I know as overloaded functions. I have been able to overload functions and methods simply with a dictionary and two decorators. I would like to be informed about the differences between generic functions, multimethods, and overloaded functions.
I find that Eby's implementation of generic functions is a bit obtuse and smacks of the framework he was fitting them into. If generic functions are to become a feature of Python they will have to be more general in application and perhaps be closer to pkgutil.simplegeneric than PEAK-Rules. What say you?

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use