The Artima Developer Community

Java Community News
A New Puzzler from Neal Gafter

Frank Sommers


Posted: Nov 22, 2006 2:44 PM
Summary
As co-author, with Josh Bloch, of the popular book Java Puzzlers, Neal Gafter is no stranger to Java pitfalls and traps. In a recent blog post, Gafter presents a bonus puzzle about how using a single thread pool for certain concurrent tasks can lead to deadlock, and suggests a solution.

In a recent blog post, A Thread Pool Puzzler, Neal Gafter reports on a concurrency problem he encountered while working on Google's calendaring application. At issue was a parallel task that could be expressed with a version of fork-join parallelism:

Many parallel programming problems can be expressed using fork-join parallelism, in which tasks spawn, or "fork", a number of subtasks that can be executed in parallel. The caller then waits for these subtasks to complete by "join"ing with them.

By forking off several threads of execution, work that would otherwise run as one large sequential program can be executed in parallel. In many cases that leads to better performance, especially when those threads are scheduled on different CPUs.

For this to work, however, enough threads must be available to do the work of each parallel subtask. Gafter points out the kind of problem that can arise with the current Java thread pool implementation when too few threads are available to perform the work. In his example, the tasks are scheduled for execution as follows:

import java.util.ArrayList;
import java.util.Collection;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// A single pool shared by every level of the computation. (In the deadlock
// scenario quoted below, the pool is bounded to ten threads.)
static ExecutorService threadPool =
    Executors.newCachedThreadPool();

// Submits n copies of runnable to the shared pool and blocks until all
// of them have completed.
static void loopNtimes(int n, Runnable runnable) {
    Collection<Callable<Object>> c = new ArrayList<Callable<Object>>();
    for (int i = 0; i < n; i++)
        c.add(Executors.callable(runnable));
    Collection<Future<Object>> futures = null;
    try {
        futures = threadPool.invokeAll(c);  // blocks until every task finishes
    } catch (InterruptedException ignored) {}
    if (futures != null)
        for (Future<Object> f : futures) {
            try {
                f.get();
            } catch (InterruptedException ex) {
            } catch (ExecutionException ex) {
                ex.printStackTrace();
                System.exit(1);
            }
        }
}

With ten threads in the pool the program prints 11 periods [the work performed in each thread] and then deadlocks! If you use a debugger to examine the state of the program to figure out what's going on, you'll find the main thread waiting for invokeAll(), three threads in doLayerOne() [the first parallel subtask] waiting for invokeAll(), seven threads in doLayerTwo() [the second parallel subtask] waiting for invokeAll(), and there are no threads left to do any of the work of calling doLayerThree() [the third subtask]. This is a classic thread starvation deadlock.
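
The excerpt omits the layer methods themselves, but the nesting Gafter describes might look roughly like the following sketch, in which every level submits its children to the same shared pool through loopNtimes() and then blocks waiting for them (the method names come from the quote; the fan-out counts are illustrative assumptions):

// Hypothetical sketch of the nesting described above -- not Gafter's actual
// code. Every level funnels its subtasks through the one shared threadPool
// and then blocks in loopNtimes() until they complete.
static void doLayerOne() {
    loopNtimes(3, () -> doLayerTwo());   // fan-out counts are assumptions
}

static void doLayerTwo() {
    loopNtimes(3, () -> doLayerThree());
}

static void doLayerThree() {
    System.out.print(".");               // the "periods" the program prints
}

public static void main(String[] args) {
    loopNtimes(3, () -> doLayerOne());
    // With a fixed pool of ten threads, every worker eventually sits blocked
    // in invokeAll() waiting for children that no free thread is left to run.
}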

In the blog post, Gafter discusses several possible solutions, finally settling on the following:

What we did to address this issue is avoid the single monolithic thread pool altogether. Instead, we use a separate thread pool at every level in the hierarchy. In terms of this example, we would have a thread pool for use in main, one for use in doLayerOne(), and one for use in doLayerTwo(). Every subsystem that requires concurrency gets its own personal thread pool.

As Gafter points out, this solution requires careful management of thread pool sizes. In addition, separate thread pools for each subtask work only if a parallel problem lends itself to clear compartmentalization into subtasks.
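
A minimal sketch of the per-level arrangement, assuming a variant of loopNtimes() that is told which pool to submit to (the extra parameter and the pool sizes are assumptions for illustration), might look like this:

import java.util.ArrayList;
import java.util.Collection;
import java.util.concurrent.*;

// Hypothetical sketch: one pool per level of the hierarchy. A thread that
// blocks waiting for its children never occupies a worker in the children's
// pool, so the starvation cycle cannot form. Pool sizes are illustrative.
static final ExecutorService mainPool     = Executors.newFixedThreadPool(4);
static final ExecutorService layerOnePool = Executors.newFixedThreadPool(4);
static final ExecutorService layerTwoPool = Executors.newFixedThreadPool(4);

// Assumed variant of loopNtimes(): main would pass mainPool, doLayerOne()
// would pass layerOnePool, and so on down the hierarchy.
static void loopNtimes(ExecutorService pool, int n, Runnable runnable) {
    Collection<Callable<Object>> c = new ArrayList<Callable<Object>>();
    for (int i = 0; i < n; i++)
        c.add(Executors.callable(runnable));
    try {
        for (Future<Object> f : pool.invokeAll(c))
            f.get();                     // wait for this level's children only
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    } catch (ExecutionException e) {
        throw new RuntimeException(e.getCause());
    }
}

Giving each subsystem its own pool breaks the waiting cycle, but, as noted above, each pool's size then has to be tuned separately.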

A better solution, according to Gafter, would be to change the way Java's invokeAll() method is implemented:

In my opinion, this kind of thread starvation should never occur because there is always one thread that can contribute processing power toward execution [of] the subtasks: the thread that is waiting for the subtasks to complete.

The problem is that, with the current implementation of invokeAll(), the waiting thread contributes no processing. In the concluding part of his post, Gafter outlines an invokeAll() implementation that lets the calling thread do some of the work as well, and thus does not deadlock through thread starvation.
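
Gafter's post sketches his own implementation; as a rough, hypothetical illustration of the general idea only (not his code), an invokeAll-style helper can let the caller execute any submitted task that no pool worker has started yet, for instance by relying on the fact that FutureTask.run() executes a task at most once:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Hypothetical sketch, not Gafter's implementation: the calling thread helps
// execute the submitted tasks instead of merely blocking, so a waiting thread
// always contributes processing power and cannot starve the pool.
static <T> List<Future<T>> invokeAllHelping(ExecutorService pool,
                                            List<Callable<T>> tasks)
        throws InterruptedException, ExecutionException {
    List<FutureTask<T>> futures = new ArrayList<FutureTask<T>>();
    for (Callable<T> task : tasks) {
        FutureTask<T> ft = new FutureTask<T>(task);
        futures.add(ft);
        pool.execute(ft);      // a pool worker may pick it up...
    }
    for (FutureTask<T> ft : futures) {
        ft.run();              // ...but if none has started it yet, the caller
                               // runs it here; run() is a no-op for a task that
                               // is already running or finished elsewhere
        ft.get();              // then wait for whichever thread executed it
    }
    return new ArrayList<Future<T>>(futures);
}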

At the same time, Gafter also notes that the problem is subtle, and that solutions to the thread starvation issue in a fork-join type problem are not trivial:

I discussed this issue with Doug Lea, and he warned me that selecting tasks for efficient scheduling in a fork-join concurrency framework is not trivial; the standard solution is to have a double-ended queue for each worker task where it enqueues its subtasks. The worker removes the most recently generated task from this queue for itself to process, thereby simulating a depth-first execution strategy in the single-thread case. When a worker finds itself without any work to do, it steals work from the other end of the queue of another task. That is, it should steal one of the least-recently created (coarse-grained) subtasks.
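
In outline, that discipline gives every worker its own double-ended queue; a hypothetical sketch of the queue operations (not Doug Lea's code) might look like this:

import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical sketch of the per-worker deque described above. The owning
// worker pushes and pops at the head, which yields depth-first order in the
// single-threaded case; idle workers steal from the tail, taking the oldest
// and therefore coarsest-grained subtasks.
class WorkStealingQueue {
    private final ConcurrentLinkedDeque<Runnable> deque =
            new ConcurrentLinkedDeque<Runnable>();

    void push(Runnable subtask) {    // owner forks a new subtask
        deque.addFirst(subtask);
    }

    Runnable pop() {                 // owner takes its most recently forked task
        return deque.pollFirst();
    }

    Runnable steal() {               // an idle worker takes the least recently
        return deque.pollLast();     // created task from the opposite end
    }
}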

What do you think of Gafter's solution for the thread starvation problem? Have you encountered a similar issue due to the current implementation of invokeAll()?

