The Artima Developer Community

Weblogs Forum
Reply to Guido's Reply

21 replies on 2 pages. Most recent reply: Sep 19, 2007 11:12 PM by Larry H.

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Reply to Guido's Reply (View in Weblogs)
Posted: Sep 13, 2007 12:54 PM
Summary
Guido van Rossum published a reply to my article "Python 3K or Python 2.9."

Guido's response is here.

First, I bring up these issues because I want my favorite language to be better, not because I am criticizing Guido or his process. I've worked with a number of languages in some depth and the job done by Guido and the top-level committers is better than any I've seen -- and I've personally spent time with these folks. What's amazing about Python is that it is not a one-person operation; there's a real team that is making decisions, and Guido pushes down those decisions so that he doesn't make them personally if he doesn't have to (which, I think, would be too exhausting, for one thing).

Guido's continuing drumbeat of "contribute, contribute" is absolutely essential. And although contributing PEPs and code is contribution in the most real sense -- which I hope to do myself someday -- I think that creating conversations is also an important contribution.

Some of the requests I made are certainly unreasonable or impractical, but I think we need to be careful about issues that become superstition: "we talked about that once and it didn't work so that topic is done," "maybe that would be better but it's too much trouble" or (my favorite) "that couldn't possibly be efficient enough to be justified."

I can live with the other issues, but when it comes to concurrency, the world is changing and we can't keep our heads in the sand about it, especially considering that multiple CPUs could obliterate the "Python is slow" argument.

In particular (this is to everyone, not just Guido), be careful when assuming that threads are the right solution. We came to threads through a series of steps, like the temperature being turned up on a frog in a pan of water. People assume that you "must have threads to do concurrency properly." But threads are fraught with problems and notoriously difficult -- some experts even say impossible -- to get right (hey, the GIL might be your friend a lot more than you know). Yes, with processes you don't get everything you get with threads, but you can use multicores and multiple machines right now and write robust code because the OS is protecting you by not allowing you to share memory. That's a good thing! As far as overhead goes, I hope that we may start to see cleverer solutions -- in the same vein as ctypes solves its problem optimally -- as the issues become clearer. But for now, pretend that any expensive problem can be distributed to as many CPUs as you want, because from that perspective Python begins to look like the most effective language on all counts: if you can write it quickly and you can use distribution to make it run fast enough, then Python becomes the cheapest solution.
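
A minimal sketch of the process-based approach described above, assuming a current Python 3 and its standard `multiprocessing` module (which postdates this thread). The function names and the prime-counting workload are hypothetical, chosen only because the work is CPU-bound and splits cleanly. Each chunk runs in its own OS process, so no memory is shared between workers:

```python
import math
from multiprocessing import Pool


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi) by trial division (CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def split_range(lo, hi, parts):
    """Split [lo, hi) into at most `parts` contiguous chunks of near-equal size."""
    if hi <= lo:
        return []
    step = -(-(hi - lo) // parts)  # ceiling division
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def count_primes_parallel(lo, hi, processes=4):
    """Farm the chunks out to separate OS processes and sum the partial counts."""
    chunks = split_range(lo, hi, processes)
    with Pool(processes=processes) as pool:
        return sum(pool.map(count_primes, chunks))


if __name__ == "__main__":
    # The guard matters: on platforms that start workers by re-importing
    # this module, unguarded code would try to spawn pools recursively.
    print(count_primes_parallel(0, 200_000))
```

The only data crossing process boundaries is the `(lo, hi)` tuple going in and an integer coming back, which is roughly the "distribute it and let the OS protect you" trade-off: more copying overhead than threads, but nothing to lock.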


Mike Ivanov

Posts: 23
Nickname: mikeivanov
Registered: Jul, 2007

Re: Reply to Guido's Reply Posted: Sep 13, 2007 3:38 PM
Threads are poor man's forks. We have to say in a decent society that's not acceptable.

Andrew Binstock

Posts: 9
Nickname: binstock
Registered: Sep, 2006

Re: Reply to Guido's Reply Posted: Sep 13, 2007 9:41 PM
>...multiple CPUs could obliterate the "Python is slow" argument.

You seem to be implying that the primary reason for Python's slowness is lack of support for multiple processors. I'm not sure I understand how you come to that conclusion--if that is what you're saying. I'm no expert, but the problems with Python's slowness are considerably more pervasive than lack of support for multiple processors.

anthony boudouvas

Posts: 14
Nickname: objectref
Registered: Jun, 2007

Re: Reply to Guido's Reply Posted: Sep 14, 2007 1:40 AM
>I'm no expert,
> but the problems with Python's slowness are considerably
> more pervasive than lack of support for multiple
> processors.

Yes, but the usage of >=2 processors will surely speed things up.

Ah, we have talked for thousands of hours in recent years about the GIL and multi-core usage and processes and and and...

As I use a lot of C# in my daily job, I just read this: http://msdn.microsoft.com/msdnmag/issues/07/10/Futures/default.aspx

and I feel a little sad... It is the thing I would surely like to see some day in our beloved Python: parallelism that is supported mainly by the language/platform...

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Reply to Guido's Reply Posted: Sep 14, 2007 8:24 AM
> I'm no expert,
> but the problems with Python's slowness are considerably
> more pervasive than lack of support for multiple
> processors.

I thought it was interesting that in another one of these threads there was a benchmark showing that Jython was substantially faster (> 2.5X) than CPython when both were single-threaded.

http://blogs.warwick.ac.uk/dwatkins/entry/benchmarking_parallel_python_1_2/

I'm not going to vouch for the results, but they are provocative considering the short shrift that Jython seems to get from the Python community.

And, the addition of the invokeDynamic bytecode in the VM is supposedly going to increase the speed of languages on the JVM even more.

nes

Posts: 137
Nickname: nn
Registered: Jul, 2004

Re: Reply to Guido's Reply Posted: Sep 14, 2007 10:02 AM
> >...multiple CPUs could obliterate the "Python is slow"
> argument.
>
> You seem to be implying that the primary reason for
> Python's slowness is lack of support for multiple
> processors. I'm not sure I understand how you come to that

I agree.
C is slow, C++ is slow, Java is slow, but how many applications are written in hand-tuned assembly today? Unless you say Python is too slow to do X on Y hardware when doing it in a particular way Z, saying "Python is slow" is a lame excuse that should be ignored. There is no blanket solution to improve performance.

Bruno Gomes

Posts: 1
Nickname: blfgomes
Registered: Sep, 2007

Re: Reply to Guido's Reply Posted: Sep 14, 2007 11:11 AM
> > I'm no expert,
> > but the problems with Python's slowness are
> considerably
> > more pervasive than lack of support for multiple
> > processors.
>
> I thought it was interesting that in another one of these
> threads there was benchmark showing that Jython was
> substantially faster (> 2.5X) than CPython when both were
> single-threaded.
>
> http://blogs.warwick.ac.uk/dwatkins/entry/benchmarking_parallel_python_1_2/
>

I ran the benchmarks on my machine (Windows XP, Pentium(R) D 2.8GHz, 1GB, Jython 2.2, Python 2.5, Java 1.6.0_02 and also IronPython 1.1) and the results I obtained were far from the ones Daniel published. His setup was different from mine (including the OS), so I wonder if anyone was able to reproduce his results. On my machine, his scripts (using one worker) showed results compatible with pystone: IronPython performed best, followed by CPython. Jython was last, running almost twice as slow as CPython (instead of being 2.5x faster).

Jesse is working on his own benchmark (http://jessenoller.com/2007/09/12/have-gil-want-benchmarks/), let's see how that one goes.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Reply to Guido's Reply Posted: Sep 14, 2007 12:02 PM
> I ran the benchmarks on my machine (Windows XP, Pentium(R)
> D 2.8GHZ, 1GB, Jython 2.2, Python 2.5, Java 1.6.0_02 and
> also IronPython 1.1) and the results I obtained were far
> from the ones Daniel published. His setup was different
> than mine (including OS), so I wonder if anyone was able
> to reproduce his results. His scripts (using one worker)
> showed results compatible with pystone: IronPython
> performed better, followed by CPython. Jython was last,
> running almost twice as slow as CPython (instead of being
> being 2.5x faster).

Your results are more along the lines of what I would expect. But just for kicks, do you have time to download Jython 2.3 and rerun? I don't think the Jython version was specified in the post.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Reply to Guido's Reply Posted: Sep 14, 2007 12:06 PM
> You results are more along the lines of what I would
> expect. But just for kicks, do you have time to download
> Jython 2.3 and rerun? I don't think the Jython version
> was specified in the post.

Sorry, I forgot that you are not able to travel forward in time.

Mike Ivanov

Posts: 23
Nickname: mikeivanov
Registered: Jul, 2007

Re: Reply to Guido's Reply Posted: Sep 14, 2007 12:58 PM
> saying "Python is slow" is a lame excuse that should be
> ignored. There is no blanket solution to
> improve performance.

Except optimizing code by hand.

People think parallelism is a kind of magic wand. Just add some more threads and everything will run 73.5% faster. No, that's WRONG. Parallelism takes substantial work to be done properly. This is especially true for non-von Neumann architectures, such as the threading model.

Larry Bugbee

Posts: 13
Nickname: bugbee
Registered: Dec, 2004

Re: Reply to Guido's Reply Posted: Sep 14, 2007 3:10 PM
Two comments...

I have not played with this yet, but it looks promising. Is this a candidate for a future standard library?
http://code.google.com/p/papyros/

Second, the **perception** Python is slow could be exacerbated if the rest of the world moves to multiple processors and Python doesn't. It's almost like we gotta.

Paul Gresham

Posts: 2
Nickname: gresh
Registered: Mar, 2006

Re: Reply to Guido's Reply Posted: Sep 14, 2007 5:54 PM
There is the age-old discussion on threads happening, but threads, as in pthreads etc., are of course not really threads in the true sense. If you create two pthreads on a single-core, single-threaded processor, you have only one thread processing at any one time.

My laptop has two cores (aka two threads in the real sense). The minimum spec for machines at my company is now a 4-CPU sparcIV+ (8 threads), and many of our apps are hitting 8 or 12 CPUs in a minimum configuration (that means someone somewhere is not getting the response times they expect), all simply to handle thousands of queries on terabytes of data; there's no rocket science, just pure volume. Ignoring Intel and AMD, take a look at the sparc T1, a low-wattage 8-core CPU: it's a wonderful piece of kit, power efficient and so on. It's obviously successful, and so Sun are releasing the T2, which could have up to 32 cores (real threads).

It's just not up for debate that we must go parallel to keep up, but threads (a la pthreads) and the GIL are just not the solution; they are only one mechanism, which is really lightweight processes.

At my firm, our most critical apps are now going multi-host on commodity hardware. This is only feasible with decent middleware support, caching, efficient data transports, networking and so on. What we need are easy to use distributed models. By implementing some of these libraries, there may (or not) be changes to the core language that could ease and support these strategies.

Read the Google white paper on map/reduce if you haven't done so already, and look for the Java-based Nutch project.
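
For readers who have not met the idea, here is a single-process sketch of the map/reduce shape, assuming a toy word-count task. It is not Google's implementation and not Nutch's API; the function names are hypothetical. The point is only that the map step is independent per input and the reduce step merges partial results:

```python
from collections import Counter
from functools import reduce


def map_phase(document):
    """Map step: turn one document into its own partial word counts."""
    return Counter(document.lower().split())


def reduce_phase(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


def word_count(documents):
    """Apply the map step to every document, then fold the partial counts together."""
    partials = map(map_phase, documents)  # each call is independent of the others
    return reduce(reduce_phase, partials, Counter())


if __name__ == "__main__":
    print(word_count(["the GIL is the lock", "the lock is global"]))
```

Because `map_phase` never looks at any other document, the built-in `map` could in principle be replaced by a process pool's `map` or by workers on other hosts without touching the two step functions.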

My feeling is not to focus the core of Python on removing the GIL and better support for pthreads (as that route is essentially a dead-man's-shoes scenario, waiting for CPU vendors to give more cores), but to take a leaf out of the books of languages such as Erlang, and to look at Open MPI or the older PVM. There are concurrent versions of Haskell and Lisp (Multilisp, I think).

Python is rock-solid and stable in its core, and Guido is right in the sense that there seem to be no tangible gains from threads (pthreads), because the real gains in concurrency come from the strategies employed to solve the particular tasks at hand.

Oracle could be described as a mildly successful database company and their major strategy (whilst they may employ pthreads beneath the covers on a per process basis) is to have multiple, long running processes.

Krishna Sankar

Posts: 3
Nickname: ksankar
Registered: Nov, 2003

Re: Reply to Guido's Reply Posted: Sep 14, 2007 6:44 PM
Good points both sides, that is what makes a discussion lively, anyway ;o)

A) First of all, the question is not whether py is slow or fast, but the performance of a system written in py. That means the ability to leverage multi-core architectures, as well as control: control in terms of things like the ability to pin one process/task to a core, the ability to pin one or more homogeneous tasks to specific cores et al., as well as not waiting for a global lock and similar primitives. (Before anybody jumps to a conclusion, this is not about the GIL by any means ;o))

B) Second, it is clear that we need a good solution (not THE solution) for moderately massive parallelism in multi-core architectures (i.e., 8-32 cores). Share-nothing might not be optimal; we need some form of memory sharing, not just copying all data via messages. Maybe functional programming based on the blackboard pattern would work, who knows.

I have seen saturated systems still showing only ~25% CPU utilization (in a 4-core system!). It is because we didn't leverage multi-cores and parallelism. So while py3k will not be slow, lack of a cohesive multi-core strategy will show up in system performance and byte us later (pun intended!).

C) As Guido and Bruce (and others) echo, this is a call for participative action. Conversation is an excellent start; let us extend the conversation to a PEP addressing multi-core parallelism in Python and an implementation thereof. The good news is there are at least 2 or 3 paradigms with implementations and rough benchmarks.

D) While Guido is almost right in saying that this is a (std)library problem, it is not fully so. We would need a few primitives from the underlying PVM substrate. Possibly one reason for Guido's position is the lack of clarity as to what needs to be changed and why. IMHO, just saying take GIL off does not solve the problem.

E) And Guido is right in insisting on speed, and Bruce is right in asking for language constructs. Without pragmatic speed, folks won't use it; same is the case without the required constructs. Both are barriers to adoption. We have an opportunity to offer a solution for multi-core architectures and let us seize it - we will rush in where angels fear to tread!

F) What would everybody suggest? A PEP on support for multi-core parallelism, or should we de-scope the PEP to (say) declarative multi-core for web applications, apply our creativity and ingenuity to that one domain, and constrain the problem?
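
As a small illustration of the "pin one process/task to a core" control mentioned in point A, here is a sketch assuming a Linux host, where current Python exposes `os.sched_getaffinity` and `os.sched_setaffinity`. The helper name `pin_to_core` is hypothetical, and on platforms without these calls it simply reports failure:

```python
import os


def pin_to_core(core):
    """Restrict the calling process to a single core; return True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # the platform does not expose affinity calls
    allowed = os.sched_getaffinity(0)  # cores this process may currently use
    if core not in allowed:
        return False  # never request a core the OS has not granted us
    os.sched_setaffinity(0, {core})  # pid 0 means "the calling process"
    return True


if __name__ == "__main__":
    if pin_to_core(0):
        print("now restricted to:", os.sched_getaffinity(0))
    else:
        print("could not pin to core 0 on this platform")
```

This works per process, which fits the multi-process designs discussed in this thread: each worker process could call the helper once at startup with its own core number.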

Cheers
<k/>

P.S.: Bruce, you had written "Actually it was not purely via library constructs; there was a very important low-level change made to ensure cache coherency based on a seminal paper that came out in recent years and changed everyone's thinking about the issue (I don't have the reference, but Scott Meyers wrote about it, and I'm pretty sure Brian Goetz has as well)". Can you help me find this paper?

Krishna Sankar

Posts: 3
Nickname: ksankar
Registered: Nov, 2003

Re: Reply to Guido's Reply Posted: Sep 14, 2007 9:21 PM
A very relevant article http://www.ddj.com/architect/201804248

A couple of excellent points:

Program using abstractions (and let the machine handle the implementations)

Program chores not cores

No Locks &

Have a way to run with concurrency OFF.

If we could offer all of these capabilities I think we would be good to go.
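
To make a few of those points concrete, here is a rough sketch of what "program chores, not cores" plus "a way to run with concurrency OFF" might look like, assuming the `concurrent.futures` module of current Python 3. The `run_chores` helper and its parameters are hypothetical and are not taken from the article:

```python
from concurrent.futures import ThreadPoolExecutor


def run_chores(chore, items, concurrent=True, max_workers=4):
    """Apply `chore` to every item, returning results in input order.

    With concurrent=False everything runs serially in the caller, which
    makes it easy to rule concurrency in or out when hunting a bug.
    """
    if not concurrent:
        return [chore(item) for item in items]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(chore, items))


def square(x):
    """A chore that touches no shared state, so it needs no locks."""
    return x * x


if __name__ == "__main__":
    print(run_chores(square, range(5)))
    print(run_chores(square, range(5), concurrent=False))
```

The caller describes the chores and leaves scheduling to the executor; the pool size is a tuning knob rather than part of the program's logic. A process-based executor could be swapped in for CPU-bound chores.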

My $0.02

Cheers & have a nice weekend
<k/>

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Re: Reply to Guido's Reply Posted: Sep 15, 2007 6:30 AM
> P.S ; Bruce, you had written "Actually it was not purely
> via library constructs; there was a very important
> low-level change made to ensure cache coherency based on a
> seminal paper that came out in recent years and changed
> everyone's thinking about the issue (I don't have the
> reference, but Scott Meyers wrote about it, and I'm pretty
> sure Brian Goetz has as well)" Can you help me find this
> paper ?

Although the title is "C++ and the perils of double-checked locking," it actually goes into the details of issues like cache coherency and various problems in order to explain why double-checked locking doesn't work in C++:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf
