Artima Weblogs | Guido van van Rossum's Weblog | Discuss | Email | Print | Bloggers | Previous | Next
Sponsored Link •
I've just spent a few days hunting for a memory leak in M2Crypto, without success. Are there better options for interfacing to OpenSSL from Python? A small rant.
This is my first rant here. I just spent several entire days trying to figure out why a long-running multi-threaded program that is making lots of SSL connections using M2Crypto is leaking memory. Unsuccesful!
I found numerous problems in M2Crypto. If you use the default parameters for the HTTPSConnection class, it creates an SSL Context object which is artificially kept alive in a global map of all SSL Context objects; you have to close the Context object explicitly to remove it from that map, but closing the HTTPSConnection doesn't close the context it created.
After fixing that, it still leaked pretty severely. Despite valiant attempts with valgrind (a useful enough tool, but unfinished) I still wasn't able to track down the leak; the stack trace shown by valgrind was missing several crucial entries (apparently calls through function pointers confuse it?).
But even if I find the cause of the leak, I may not be able to plug it. M2Crypto is written using SWIG, which contains lots of obfuscated code and doesn't exactly give me a warm fuzzy feeling that it's always doing the reference counting correctly. And of course, OpenSSL itself is pretty much entirely opaque. (Where the heck is d2i_X509 defined? It's got its own man page, and is used in numerous place, but I can't find a single definition of it -- no macro, no function! They must be defining it implicitly via some meta-macro...)
M2Crypto has other problems too: a month ago, I was trying to create a multi-threaded server handling SSL and HTTP connections. While it was not leaking, it was crashing as soon as I had multiple clients connect to it simultaneously. M2Crypto's author tried to help me but could not reproduce the problem. Then, later, I could not either. So what was going on? My suspicion is that the bug isn't fixed but an unrelated change in my code happens to stop triggering it -- meaning it could come back any time.
M2Crypto hasn't been updated in a year. It contains (in SSL/httpslib.py) various versions of an HTTPSConnection class which try to adjust to different Python versions, but due to a lack of imagination it reverts to the pre-2.2 code when Python 2.4 or higher is used! There's a patch purporting to upgrade it to version 0.13.1 but the patch looks bogus to me (it uses an unbound method to override a bound one, which can't possibly work). The HTTPSConnection.close() method does nothing, allowing for even more potential leaks. The SSLServer class defines an error handler with an incorrect number of arguments.
I could go on, but I think I've ranted enough, except for one thing. I've yet to see an extension module using SWIG that doesn't make me think it was a mistake to use SWIG instead of manually written wrappers. The extra time paid upfront to create hand-crafted wrappers is gained back hundredfold by time saved debugging the SWIG-generated code later.
OK, now I've got it off my chest. Is there an alternative? Bill Janssen recommended PyOpenSSL. Anyone concur?
Have an opinion? Readers have already posted 14 comments about this weblog entry. Why not add yours?
If you'd like to be notified whenever Guido van van Rossum adds a new entry to his weblog, subscribe to his RSS feed.
|Guido van Rossum is the creator of Python, one of the major programming languages on and off the web. The Python community refers to him as the BDFL (Benevolent Dictator For Life), a title straight from a Monty Python skit. He moved from the Netherlands to the USA in 1995, where he met his wife. Until July 2003 they lived in the northern Virginia suburbs of Washington, DC with their son Orlijn, who was born in 2001. They then moved to Silicon Valley where Guido now works for Google (spending 50% of his time on Python!).|