The Artima Developer Community
Weblogs Forum
Math and the Current State of Coarse-Grained Parallelism in Python

4 replies on 1 page. Most recent reply: Mar 29, 2008 2:00 PM by Evan Cofsky

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Math and the Current State of Coarse-Grained Parallelism in Python
Posted: Mar 18, 2008 8:01 AM
One of my main objectives at Pycon 2008 was to hear about experiences regarding existing tools for parallelizing Python programs, and also to find out more about mathematical programming with Python.

I have a client who runs very compute-intensive and long-running simulations and we've been working on parallelizing the process using Python. In my explorations, I followed the common path of writing tools myself before discovering that others had already solved the problem (however, I now have the advantage of understanding the issues much better).

There were no official talks on parallelism at the conference, other than one that talked about using Amazon EC2, so I set up an open spaces discussion, and a second one happened later. (There were also a couple of talks involving Stackless Python, but that is a coroutine system that doesn't run on multiple processors.)

Although there seems to be a fair amount of exploration in this arena, the consensus appears to be that the two current practical contenders are Parallel Python and Processing. (Please correct me if I got this wrong or missed something; I was doing more discussing than taking notes during the open spaces).
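
As a rough illustration of the coarse-grained style discussed here: the Processing package was later adopted into the Python standard library as multiprocessing, so a minimal sketch can be written with no third-party dependencies. The run_simulation function below is a hypothetical stand-in for one independent, long-running simulation (it just estimates pi by random sampling), and the names run_all and workers are assumptions for this example, not part of any tool mentioned above.

```python
import random
from multiprocessing import Pool


def run_simulation(seed, steps=100_000):
    """Hypothetical stand-in for one independent simulation run:
    estimate pi by sampling random points in the unit square."""
    rng = random.Random(seed)  # per-task generator keeps runs reproducible
    hits = 0
    for _ in range(steps):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / steps


def run_all(seeds, workers=4):
    """Farm the independent runs out to a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(run_simulation, seeds)


if __name__ == "__main__":
    estimates = run_all(range(8))
    # Combine the per-process results in the parent process.
    print(sum(estimates) / len(estimates))
```

Because each run depends only on its seed, the tasks need no communication with each other, which is what makes this kind of problem a good fit for process-level parallelism.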

Another interesting possibility involves IPython, which I finally started to play with in earnest at the conference. This is a cross between the Python interpreter prompt and an IDE. What's very nice is that IPython does command completion and produces help. Basically it's a terrific way to explore Python and libraries. If you have setuptools installed, you can just say easy_install ipython at a command prompt to install it, then run ipython. Try importing a library and then using the '?' and command completion to see what IPython does for you.

IPython1 is the next version of IPython, and claims to have "a powerful architecture for parallel computing," but it is apparently still under development and no one in the open spaces session had used it for serious development.

The upcoming version of Jython 2.5 (first alpha soon, final sometime before the end of the year but summer sounded like it could be a "maybe") doesn't have a global interpreter lock (GIL) so you can use Jython to utilize the JVM's true threading. The removal of the GIL has the possibility of producing side effects that we may not actually notice right away, but that the Jython team assures me can be fixed as they are discovered.

When asked, Jim Baker got that faraway look in his eyes and said that yes, Stackless Jython could probably be implemented, but it wasn't exactly clear what that would really mean.

IronPython apparently also doesn't require a GIL. This doesn't help my particular situation because Linux machines are used in the cluster.

A number of people said they had very good experiences with Pyro, which is a Python distributed object system. This might also have possibilities for certain types of parallel solutions.

NumPy and SciPy

I came early to Pycon to take the NumPy and SciPy tutorials, given by Travis Oliphant and Eric Jones of Enthought, a company which supports and makes its living teaching and consulting about these open-source libraries.

You can install NumPy with easy_install numpy.

NumPy is basically about "arrays" where the term "array" includes multiple-dimension matrices as well. So you can, for example, do the classic "invert a matrix and multiply it by itself to produce the identity matrix" trick:

>>> from numpy import *
>>> A = mat([[1,2,4], [2,5,3], [7,8,9]])
>>> A.I # Matrix inversion
matrix([[-0.42857143, -0.28571429,  0.28571429],
        [-0.06122449,  0.3877551 , -0.10204082],
        [ 0.3877551 , -0.12244898, -0.02040816]])
>>> A * A.I # Matrix multiplication
matrix([[  1.00000000e+00,   5.55111512e-17,   1.38777878e-17],
        [  5.55111512e-17,   1.00000000e+00,   3.12250226e-17],
        [  3.33066907e-16,   1.52655666e-16,   1.00000000e+00]])
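
For readers on a current NumPy, here is a hedged equivalent of the session above using a plain two-dimensional array rather than the matrix class, which the NumPy documentation no longer recommends for new code. It assumes NumPy is installed; the variable names A_inv and product are chosen for this sketch only.

```python
import numpy as np

# Same matrix as in the session above, as a plain 2-D ndarray.
A = np.array([[1, 2, 4], [2, 5, 3], [7, 8, 9]], dtype=float)

A_inv = np.linalg.inv(A)  # matrix inversion
product = A @ A_inv       # matrix multiplication (not element-wise)

# Floating-point round-off leaves tiny off-diagonal residues, as in the
# output above, so compare against the identity with a tolerance.
print(np.allclose(product, np.eye(3)))
```

The tolerance-based comparison is the usual way to check results like this, since exact equality with the identity matrix will generally fail.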

NumPy does a lot more (even Fourier Transforms), but the core is this very efficient array mechanism, which is much more compact than arrays in Java, so you can make them huge without worrying about running out of memory. There is also support for handing these data structures to other, non-Python routines, which is where SciPy comes in.

SciPy has all kinds of high end math functions, many of which use the long-optimized C and Fortran routines directly. For example, here's how to produce a Bessel function (the solution to the vibrating drumhead problem):

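import numpy as np
from scipy import special

x = np.r_[0:100:0.1]  # sample points from 0 up to (not including) 100, step 0.1
j0x = special.j0(x)   # Bessel function of the first kind, order zero
print(j0x)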

Peter Skomoroch

Posts: 1
Nickname: pskomoroch
Registered: Mar, 2008

Re: Math and the Current State of Coarse-Grained Parallelism in Python Posted: Mar 18, 2008 1:56 PM

I'm sorry I missed your Open Spaces session. We only noticed the card on the board after it was over and really wanted to check it out.

You said:

"I have a client who runs very compute-intensive and long-running simulations and we've been working on parallelizing the process using Python."

This is something we run into a lot and have been trying to tackle using a number of different approaches. You want the simplicity of numpy syntax, but the ability to distribute the computation across multiple cores/machines.

This thread might interest you:

I gave the Amazon EC2 talk where I covered parallel processing of the Netflix data using numpy and mpi4py (part of scipy now).

slides here:

The takeaway for me is that using an MPI approach with numpy works, but requires stretching your brain a bit more than should be necessary for simple problems. For coarse-grained parallelism, I'm leaning more towards IPython1 and/or Hadoop now and will be writing up some tutorials/case studies for those approaches on my blog.

IPython1 will work well right now for running parallel simulations using a master/worker model... the project recently switched to Bazaar, but you can take a look at these Monte Carlo pricing examples on the old svn repo:

Hadoop Streaming combined with Python is another "ready-to-run" solution:
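
To make the streaming model concrete, here is a minimal word-count sketch. Hadoop Streaming passes records to the mapper and reducer as lines of text, with key and value separated by a tab, and sorts the mapper output by key before the reducer sees it. The function names mapper and reducer and the sample input are assumptions for illustration; in a real job each function would live in its own script that reads sys.stdin and prints to stdout.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each key. The input must already be sorted by
    key, which the streaming framework does between the two phases."""
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__":
    # Local dry run: sorted() stands in for Hadoop's shuffle/sort step.
    sample = ["the quick brown fox", "the lazy dog"]
    for record in reducer(sorted(mapper(sample))):
        print(record)
```

Keeping the logic in plain generator functions like this makes it easy to test the pipeline locally before submitting anything to a cluster.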

There was also some talk in our open spaces session about writing Python bindings to the Hadoop distributed filesystem (HDFS), but I'm not sure where that will end up.


Posts: 3
Nickname: scrambles
Registered: Jan, 2008

Re: Math and the Current State of Coarse-Grained Parallelism in Python Posted: Mar 18, 2008 8:17 PM
I've had excellent experiences using Twisted's Perspective Broker layer for the kind of RPC you could use for this. It's quite flexible and well-structured...

...though the learning curve is a little steep. But it's kind of hard to overstate how capable and powerful Twisted is when it comes to writing client/server/RPC sorts of apps, especially prototypes. Unlike other frameworks, I've never had any mixed feelings about the considerable time I've poured into learning it, simply because Twisted has enabled me to write apps that I couldn't have contemplated writing without it.

Berk Geveci

Posts: 1
Nickname: berksky
Registered: Mar, 2008

Re: Math and the Current State of Coarse-Grained Parallelism in Python Posted: Mar 19, 2008 5:51 PM
There are a few projects out there using Python in conjunction with MPI to build distributed simulation codes. One interesting project is pyMPI:
I used it with VTK to visualize large datasets in parallel on clusters. The ParaView Python bindings support client/server based computation where the server can run on a cluster over MPI. We ran paraview-python on clusters as well as supercomputers (BlueGene and Cray XT3). Of course, most of these are thin Python wrappers around Fortran/C/C++ code. It is possible to build distributed numpy algorithms using pyMPI, but I have little experience doing that.

A commercial solution (which I have not used) is Star-P. They claim to have all of numpy running distributed:

Evan Cofsky

Posts: 9
Nickname: theunixman
Registered: Jun, 2006

NumPy/SciPy Posted: Mar 29, 2008 2:00 PM
I've used these on a couple of projects with good results. They also make use of a few of the existing scientific libraries for high-performance computing if available, and vector instructions as well. There is also a module that wraps the nVidia CUDA library for high-performance computing. CUDA offloads a lot of vector operations to the GPU in the video card, and nVidia is even deploying special cards just for compute work. It's pretty interesting, and I'll be experimenting with it directly over the next few weeks with some DSP analysis.


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use