The Artima Developer Community

Weblogs Forum
It isn't Easy to Remove the GIL

54 replies on 4 pages. Most recent reply: Sep 12, 2008 8:12 PM by Patrick Stinson

Bruce Eckel

Posts: 868
Nickname: beckel
Registered: Jun, 2003

Re: It isn't Easy to Remove the GIL Posted: Sep 17, 2007 12:42 PM
> Bruce, I'm a bit disappointed about you sinking to such a
> low with your response. A personal attack of that form -
> nicely veiled as it was - clearly was not necessary.
> Especially, if you could have just assumed a little less
> and actually bothered to read what I wrote.

You're right. I apologize. Your post pushed my sarcasm button and I should have reworked it before posting it.

The thing that pushed the button is hearing "but threading just isn't that hard" one too many times. It took me a lot of study and experimentation -- years -- to begin realizing that threading really is that hard. And there were a number of periods where I thought I actually understood it, so I need to be more understanding when someone else is in one of those phases as well.

I have written giant chapters about it both in C++ and in Java and that may have convinced some people of the complexity. I probably need to write a much shorter chapter or article that can somehow make the case that threads in general are not the right solution. But it's not easy and people who think threads are easy are probably not going to be convinced. Threads are fascinating and they draw you in with the feeling that if you can just put one more locking mechanism into the system then threads will work OK. Not unlike "just one more static typing mechanism" in the static-vs-dynamic discussions.

We do have a fundamental disagreement, though. I have come to the conclusion that shared-memory concurrency is impossible for most programmers to get right, and you feel that threads are a reasonable solution for a significant class of problems, and that you are able to get it right when you need to.

I shouldn't have replied sarcastically, but your reply seemed to suggest that it was obvious that most programmers could write correct threaded programs. I guess that frustrates me because I don't know how to explain the difficulties of threading well enough.

Juergen Brendel

Posts: 8
Nickname: jbrendel
Registered: Sep, 2007

Re: It isn't Easy to Remove the GIL Posted: Sep 17, 2007 2:10 PM
> You're right. I apologize. Your post pushed my sarcasm
> button and I should have reworked it before posting it.

It's ok, don't worry. I'm gladly accepting your apology.

When someone greatly cares about a topic then it is often difficult to keep the discussion free of any emotions. You obviously care and know very much about the topic of threading, which makes this understandable.

I like to avoid conflicts when I can, and I'm happy about your graceful response, which helps us all to get back to the basics of this discussion.


> The thing that pushed the button is hearing "but threading
> just isn't that hard" one too many times. It took me a lot
> of study and experimentation -- years -- to begin
> realizing that threading really is that hard. And there
> were a number of periods where I thought I actually
> understood it, so I need to be more understanding when
> someone else is in one of those phases as well.

Ok. It's interesting to be described as being 'in a phase', but Ok. :-)


> I have written giant chapters about it both in C++ and in
> Java and that may have convinced some people of the
> complexity. I probably need to write a much shorter
> chapter or article that can somehow make the case that
> threads in general are not the right solution. But it's
> not easy and people who think threads are easy are
> probably not going to be convinced. Threads are
> fascinating and they draw you in with the feeling that if
> you can just put one more locking mechanism into the
> system then threads will work OK. Not unlike "just one
> more static typing mechanism" in the static-vs-dynamic
> discussions.

While I haven't written books about this, and also wouldn't say that I have studied threading for years, I would say that I have written multi-threading programs on a number of different platforms for many years now.

I don't believe for a moment that I am some sort of special threading superstar. I'm not. I have had my fair share of deadlocks, head-scratching, hard-resets on machines with locked kernel threads and so on. I can tell you that. I have written multi-threaded programs that worked 'really well' ... until I actually tried to run them on an SMP machine.

The scariest thing is when you get a memory corruption in your shared data structure and there doesn't seem to be any decent debugging tool available that tells you what's going on. Valgrind informs you where the corruption occurs, but you just can't see 'why' it happens, because it should be 'impossible'. Then you sit with your team at the table, staring at the print-out of the code and just hope that someone suddenly says 'Aha!' and points at the one line of code where your intricate locking falls apart. And if you don't get that 'Aha!' moment you will be royally stuck...

I've been through all of this plenty of times. I guess that is why I don't see myself as being particularly naive about the topic.


> We do have a fundamental disagreement, though. I have come
> to the conclusion that shared-memory concurrency is
> impossible for most programmers to get right, and you feel
> that threads are a reasonable solution for a significant
> class of problems, and that you are able to get it right
> when you need to.

Yeah, well, sometimes it was a pretty significant struggle, as you can see from what I wrote above. There were moments when we weren't so sure whether we would ever 'get it right'. :-)

I think that a shared-nothing approach, in which all communication between 'threads' (or processes) takes place via message queues definitely makes it easier to arrive at a correct program. You can program with this model quite easily, even using the normal threading API. I have had several problems where this was exactly the right approach to take. If this model is appropriate for a problem then one should stick to it.
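
A minimal sketch of this model in Python, using only the normal threading API plus queue.Queue, might look like the following. The names worker and run_sum are made up for illustration; the point is only that each thread keeps its state private and all communication goes through queues.

```python
import queue
import threading


def worker(inbox, outbox):
    # All state (total) is local to this thread; nothing is shared
    # except the two queues, which do their own locking.
    total = 0
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel: no more work for this worker
            break
        total += msg
    outbox.put(total)


def run_sum(values, n_workers=4):
    inbox = queue.Queue()
    outbox = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for v in values:
        inbox.put(v)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Each worker posted exactly one partial sum.
    return sum(outbox.get() for _ in threads)
```

Calling run_sum(range(100)) adds the numbers up across four threads without any explicit lock in user code.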

Maybe out of sheer (mis)fortune, though, I had to work on several projects where this approach didn't work in all cases. Sometimes, a legacy system with a large, 'central' data structure needed to be made to take advantage of multiple CPU cores, and so the work had to be broken out across threads without having to re-write or re-architect the core of the system.

In other cases, the nature of the shared data didn't lend itself to being communicated via messages. The overhead of doing so would have been prohibitive. For example, some large tree. You have locks on certain branches, or even the individual nodes at times, which allows multiple threads to work at the same time in the tree. This is very difficult to get right, but the performance requirements (and other considerations) pretty much mandated that this was the approach that had to be taken.
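
As a rough illustration of locking at the level of individual nodes, here is a deliberately tiny Python sketch. It assumes the shape of the tree never changes while the threads run, so only each node's value needs a lock; Node, increment and demo are hypothetical names. The real systems described above need far more than this (locking branches, changing the structure, lock ordering), which is where the difficulty comes from.

```python
import threading


class Node:
    def __init__(self, value=0):
        self.value = value
        self.children = []            # assumed fixed while threads run
        self.lock = threading.Lock()  # guards only this node's value


def increment(node, times):
    for _ in range(times):
        # Threads touching different nodes never contend with each other.
        with node.lock:
            node.value += 1


def demo(times=10000):
    root = Node()
    left, right = Node(), Node()
    root.children = [left, right]
    # Two threads per branch: they contend only within their own branch.
    threads = [
        threading.Thread(target=increment, args=(n, times))
        for n in (left, right, left, right)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return left.value, right.value
```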

Once something like this has been wrestled to the ground and has been made to work, the advantage is that all data can be modified in place. No additional copying (for messages) is necessary, no marshalling and unmarshalling when sending data, and so on. Sure, you need to consider the benefit of in-place modification if your individual cores have individual memory caches, but it still can often work out to your advantage.

So, I guess my point is this: Yes, avoid the shared memory if you can, but realize that sometimes it still is the right approach. The current threading API (all GIL issues aside) allows me to write shared-nothing, message-based systems. But it also allows me to use locks and shared data to my heart's content. That's why I like the current threading API. It gives me a choice. And sometimes, I need to make the (hopefully) informed choice to bite the bullet and go for the shared memory approach.

I'd be happy if all documentation chapters that mention multi-threading and shared memory come with a warning label, similar to cigarette packages...


> I shouldn't have replied sarcastically, but your reply
> seemed to suggest that it was obvious that most
> programmers could write correct threaded programs.

I hope that my reply clarified my view on this: Writing correct threaded programs with shared memory is not obvious, and it's not always easy. Not for many good programmers and definitely not for me either.

> I guess that frustrates me because I don't know how to explain the difficulties of threading well enough.

May I humbly suggest that in your next article or book about this you relay the experience I described above: Looking with your team at the print-outs, not knowing if you will ever find the problem with your program? In moments like this you can easily get that terrible, sinking feeling, especially if you are supposed to check in the 'fix' for the deadlock problem tonight for the upcoming release. It's already 6.30 pm and you have no idea when or even if you will ever fix it.

Of course, there are also plenty of other areas in software development where you can get just as stuck. But the potential for this certainly is there when you are dealing with shared memory multi-threading.

"Warning: Shared data multi-threading ahead. Proceed with caution!"

Sometimes I still have to proceed.

Adam Olsen

Posts: 11
Nickname: rhamph
Registered: Jun, 2007

Re: It isn't Easy to Remove the GIL Posted: Sep 19, 2007 3:00 AM
> So, I guess my point is this: Yes, avoid the shared memory
> if you can, but realize that sometimes it still is the
> right approach. The current threading API (all GIL issues
> aside) allows me to write shared-nothing, message-based
> systems. But it also allows me to use locks and shared
> data to my heart's content. That's why I like the current
> threading API. It gives me a choice. And sometimes, I need
> to make the (hopefully) informed choice to bite the bullet
> and go for the shared memory approach.
>
> I'd be happy if all documentation chapters that mention
> multi-threading and shared memory come with a warning
> label, similar to cigarette packages...

I'd be happy if cigarettes were made an illegal/prescription-only substance, after an appropriate deprecation period and with free addiction clinics. ;)

If it's not possible to provide a reasonably safe way to use threads then Python shouldn't provide them, and we should use processes or event-driven code instead. (Threads would still be available in C of course.)

I think we can make it work though. The trick is to not provide shared state as such. Language-enforced monitors or actors can partition away the data in a way that doesn't allow low-level corruption and encourages better structuring.
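
To make the monitor idea concrete, here is a small sketch in today's Python. Python does not enforce anything here, so this is a monitor by convention only: the data is private to the object and every method takes the same lock. BoundedBuffer and transfer are illustrative names, not part of any proposal in this thread.

```python
import threading


class BoundedBuffer:
    """Monitor by convention: state is touched only under self._cond."""

    def __init__(self, capacity):
        self._items = []
        self._capacity = capacity
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            while len(self._items) >= self._capacity:
                self._cond.wait()  # buffer full: wait for a get()
            self._items.append(item)
            self._cond.notify_all()

    def get(self):
        with self._cond:
            while not self._items:
                self._cond.wait()  # buffer empty: wait for a put()
            item = self._items.pop(0)
            self._cond.notify_all()
            return item


def transfer(values, capacity=2):
    # One producer (this thread) and one consumer thread.
    buf = BoundedBuffer(capacity)
    received = []

    def consume():
        for _ in range(len(values)):
            received.append(buf.get())

    t = threading.Thread(target=consume)
    t.start()
    for v in values:
        buf.put(v)
    t.join()
    return received
```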

Although there are some obvious cases where monitors or actors aren't sufficient, you can always write a specialized container in C that provides a safe API. For example, if you have a large list that you want to subdivide and process amongst many threads, you could have "views" that give exclusive access to a section of it at a time.
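
The "views" idea can be sketched in pure Python to show the shape of the API, even though the suggestion above is a container written in C. Everything here (PartitionedList, _View, double_all) is a hypothetical illustration: the container refuses to hand out overlapping ranges, so each thread has exclusive access to its own section.

```python
import threading


class PartitionedList:
    """Hands out views over disjoint index ranges of one list."""

    def __init__(self, items):
        self._items = list(items)
        self._lock = threading.Lock()  # guards the table of claims
        self._claimed = []             # (start, stop) ranges in use

    def view(self, start, stop):
        with self._lock:
            if not 0 <= start <= stop <= len(self._items):
                raise IndexError("range outside the list")
            for s, e in self._claimed:
                if start < e and s < stop:
                    raise RuntimeError("range already claimed")
            self._claimed.append((start, stop))
        return _View(self, start, stop)

    def _release(self, start, stop):
        with self._lock:
            self._claimed.remove((start, stop))

    def snapshot(self):
        # Intended for use once no views are outstanding.
        with self._lock:
            return list(self._items)


class _View:
    def __init__(self, owner, start, stop):
        self._owner, self._start, self._stop = owner, start, stop

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._owner._release(self._start, self._stop)
        return False

    def __len__(self):
        return self._stop - self._start

    def _index(self, i):
        if not 0 <= i < len(self):
            raise IndexError("index outside this view")
        return self._start + i

    def __getitem__(self, i):
        return self._owner._items[self._index(i)]

    def __setitem__(self, i, value):
        self._owner._items[self._index(i)] = value


def double_all(values, n_threads=3):
    plist = PartitionedList(values)
    n = len(values)
    step = max(1, -(-n // n_threads))  # ceiling division

    def work(start, stop):
        with plist.view(start, stop) as v:
            for i in range(len(v)):
                v[i] = v[i] * 2

    threads = [
        threading.Thread(target=work, args=(s, min(s + step, n)))
        for s in range(0, n, step)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return plist.snapshot()
```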

I see this as the same as pointer arithmetic. There are some tricks you have to give up in exchange for a much safer and more reliable programming environment. You can always write in C if you need them, but only a small minority of code does.

Andrew Inggs

Posts: 2
Nickname: aminggs
Registered: Apr, 2006

The Problem with Threads Posted: Oct 2, 2007 3:56 AM
I too have gone through phases where I thought I fully understood threading, only to find yet deeper flaws in my understanding. Here's an article from IEEE Computer that helped me come to the conclusion that threading is the wrong paradigm for writing concurrent programs:

The Problem with Threads
by Edward A. Lee
University of California, Berkeley
http://www.computer.org/portal/site/computer/menuitem.5d61c1d591162e4b0ef1bd108bcd45f3/index.jsp?&pName=computer_level1_article&TheCat=1005&path=computer/homepage/0506&file=cover.xml&xsl=article.xsl

Adam Olsen

Posts: 11
Nickname: rhamph
Registered: Jun, 2007

Re: The Problem with Threads Posted: Oct 2, 2007 2:53 PM
> I too have gone through phases where I thought I fully
> understood threading, only to find yet deeper flaws in my
> understanding. Here's an article from IEEE Computer that
> helped me come to the conclusion that threading is the
> wrong paradigm for writing concurrent programs:

But how can threads be so wrong when processes are so right? Having a shared address space (efficiently shared data) isn't the problem. Using it for mutable state, requiring a memory model to determine correctness, is.

Halfway down they discuss the difficulty in writing a listener pattern, but this has nothing to do with threads! I could have the same problems (wanting to send asynchronous, ordered notifications without risk of unbounded memory usage) with actors, processes, CSP, or UML. Even gtk's signals, which aren't threaded, must confront this problem. Solving it is more about making the cycles obvious (so they can be handled) than making them impossible.

I think much of the problem in the concurrency world is the focus on creating pure, elegant solutions that inherently avoid all problems (i.e., a silver bullet), rather than building a set of convenience functions that avoid them well in practise.

My approach to Python "threading" (which I could just as well call processes, actors, etc) is to build a solid foundation. The convenience tools to solve a broad set of use cases can be added later, as a library.

Myroslav Opyr

Posts: 1
Nickname: myroslav
Registered: Nov, 2007

Re: It isn't Easy to Remove the GIL Posted: Nov 23, 2007 9:39 PM
I was pointed to a good writeup that gives a hardware/software overview of trends in multi-core computing. It sheds some light on the progress made, across different languages, systems, and platforms, in the area Guido touched on.

http://notes-on-haskell.blogspot.com/2007/10/defanging-multi-core-werewolf.html

Bernd Will

Posts: 1
Nickname: berndwill
Registered: Jan, 2008

Re: It isn't Easy to Remove the GIL Posted: Jan 1, 2008 5:19 PM
Hello everybody,

I think the GIL shouldn't be removed, since many modules implicitly rely on this CPython-specific technology. But I think it is a shortcoming of the language if it can only use one CPU core when more CPUs are available. Maybe there could be more built-in functionality for parallel algorithms/code that spreads threads across several CPUs, while one central process controls them?

Then the GIL would serve robust single-threaded processes, while multitasking/multithreading functionality for controlling the other CPUs would be an extra benefit.

Regards
Bernd

Ray Horn

Posts: 1
Nickname: raychorn
Registered: Jun, 2008

Re: It isn't Easy to Remove the GIL Posted: Jun 12, 2008 11:24 AM
What about simply making each Python OS thread use a separate Python namespace? (See http://python2.near-by.info for the details.)

Patrick Stinson

Posts: 2
Nickname: patrickkid
Registered: Sep, 2008

Re: It isn't Easy to Remove the GIL Posted: Sep 12, 2008 8:11 PM
We have made an interesting case along the lines of multi-threaded audio applications (meaning more than one audio thread). I have described it on my blog here:

http://pkaudio.blogspot.com/2008/07/multiple-rt-threads-and-gil.html

Advantages to us of removing or migrating the GIL, thus allowing us to use threads instead of processes:

- Low startup speed overhead
- Low long-term memory footprint
- Easy debugging (very important)
- Being nice to the host sequencer app (they don't expect many processes)

Since we don't use extension modules and therefore have more need for the language than the entire VM platform, the problem becomes more the execution environment instead of the algorithmic environment. Since we run as a plugin in many host apps we should ideally run a light-weight thread to do audio compilation. We want to be able to script some control-rate computation, and are never allowed to block in the audio thread.

Interesting problem, really.
