Make Room for JavaSpaces, Part V
Make Your Compute Server Robust and Scalable
by Susan Hupfer
First Published in JavaWorld, June 2000

Adding Transactions to the Master

Recall the Master code from the compute server example, which calls the generateTasks method to generate a set of tasks and then calls the collectResults method to collect results:



import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

public class Master {
    private JavaSpace space;
    public static void main(String[] args) {
        Master master = new Master();
        master.startComputing();
    }
    private void startComputing() {
        space = SpaceAccessor.getSpace();
        generateTasks();
        collectResults();
    }
    private void generateTasks() {
        for (int i = 0; i < 10; i++) {          
            writeTask(new AddTask(new Integer(i), new Integer(i)));
        }
        for (int i = 0; i < 10; i++) {
            writeTask(new MultTask(new Integer(i), new Integer(i)));
        }
    }
    private void collectResults() {
        for (int i = 0; i < 20; i++) {
            ResultEntry result = takeResult();
            if (result != null) {
                System.out.println(result);
            }
        }
    }
    private void writeTask(Command task) {
        try {
            space.write(task, null, Lease.FOREVER);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    protected ResultEntry takeResult() {
        ResultEntry template = new ResultEntry();
        try {
            ResultEntry result = (ResultEntry)
                space.take(template, null, Long.MAX_VALUE);
            return result;
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }            
}

You'll notice in writeTask that the write occurs under a null transaction. If the write returns without throwing an exception, then you can trust that the entry was committed to the space. But if a problem occurs during the operation and an exception is thrown, you can't know for sure whether or not the task entry was written to the space. If a RemoteException is thrown (which can occur whenever a space operation communicates with the remote JavaSpace service), the task entry may or may not have been written. If any other type of exception is thrown, then you know the entry wasn't written to the space.
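To make that distinction concrete, here is a minimal sketch of how a caller might interpret the outcome of a write under a null transaction. The WriteOutcomeSketch class, its Outcome enum, and the classify helper are hypothetical names introduced only for illustration; the one real type used is java.rmi.RemoteException.

```java
import java.rmi.RemoteException;

// Hypothetical helper for illustration; not part of the article's Master class.
class WriteOutcomeSketch {
    enum Outcome { WRITTEN, NOT_WRITTEN, UNKNOWN }

    // Maps what a null-transaction write threw (or null if it threw nothing)
    // to what the caller can conclude about the entry.
    static Outcome classify(Exception thrown) {
        if (thrown == null) {
            return Outcome.WRITTEN;      // write returned normally
        }
        if (thrown instanceof RemoteException) {
            return Outcome.UNKNOWN;      // entry may or may not be in the space
        }
        return Outcome.NOT_WRITTEN;      // any other exception: nothing was written
    }

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify(new RemoteException("link down")));
        System.out.println(classify(new IllegalStateException("rejected")));
    }
}
```

The UNKNOWN case is the one that a null transaction gives you no way to resolve.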

The master isn't very fault tolerant, since you never know for sure whether or not a task is written successfully into the space. And if some of the tasks don't get written, the compute server won't be able to completely solve the parallel computation on which it's working. To make Master more robust, you'll first add the convenient getTransaction method that you saw previously. You'll also modify writeTask to make use of transactions and to return a boolean value that indicates whether or not it wrote the task:



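private boolean writeTask(Command task) {
    // try to get a transaction with a 10-min lease time (in milliseconds)
    Transaction txn = getTransaction(1000*10*60);
    if (txn == null) {
        throw new RuntimeException("Can't obtain a transaction");
    }
    try {
        try {
            // write the task under the transaction
            space.write(task, txn, Lease.FOREVER);
        } catch (Exception e) {
            // the write failed: abort so the entry can't appear in the space
            txn.abort();
            return false;
        }
        // the write succeeded: commit to make the entry visible
        txn.commit();
        return true;
    } catch (Exception e) {
        // the commit or abort itself failed
        System.err.println("Transaction failed");
        return false;
    }
}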

First, writeTask tries to obtain a transaction with a 10-minute lease by calling getTransaction, just as you did in the worker code. With the transaction in hand, you can retrofit the write operation to work under it. This time when you call write, you supply the transaction as the second argument.

Again, three things can happen as this code runs. If the write completes without throwing an exception, you attempt to commit the transaction. If the commit succeeds, then you know the task entry has been written to the space, and you return a status of true. On the other hand, if the write throws an exception, you attempt to abort the transaction. If the abort succeeds, you know that the task entry was not written to the space (if it was written, it's discarded), and you return a status of false.

The third possibility is that an exception is thrown in the process of either committing or aborting the transaction. In this case, the outer catch clause prints a message, the transaction expires when its lease time is up, and the write operation doesn't occur, so you return a status of false.
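The three outcomes can be traced in a small self-contained simulation. This is only a sketch under stated assumptions: FakeTxn and the List standing in for the space are hypothetical stand-ins, not the Jini Transaction or JavaSpace types, and the model assumes that a failed commit or abort leaves nothing visible in the space, as the text describes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of writeTask's control flow; no Jini classes involved.
class TxnWriteSketch {
    // Stand-in transaction: buffers writes until commit.
    static class FakeTxn {
        final List<String> pending = new ArrayList<>();
        final boolean failOnCompletion;   // simulate a failing commit/abort

        FakeTxn(boolean failOnCompletion) {
            this.failOnCompletion = failOnCompletion;
        }

        void commit(List<String> space) throws Exception {
            if (failOnCompletion) throw new Exception("commit failed");
            space.addAll(pending);        // buffered writes become visible
            pending.clear();
        }

        void abort() throws Exception {
            if (failOnCompletion) throw new Exception("abort failed");
            pending.clear();              // buffered writes are discarded
        }
    }

    // Same nested try/catch shape as the article's writeTask.
    static boolean writeTask(List<String> space, String task,
                             FakeTxn txn, boolean writeFails) {
        try {
            try {
                if (writeFails) throw new Exception("write failed");
                txn.pending.add(task);
            } catch (Exception e) {
                txn.abort();              // outcome 2: write failed, abort
                return false;
            }
            txn.commit(space);            // outcome 1: write and commit succeed
            return true;
        } catch (Exception e) {
            return false;                 // outcome 3: commit or abort failed
        }
    }

    public static void main(String[] args) {
        List<String> space = new ArrayList<>();
        System.out.println(writeTask(space, "a", new FakeTxn(false), false));
        System.out.println(writeTask(space, "b", new FakeTxn(false), true));
        System.out.println(writeTask(space, "c", new FakeTxn(true), false));
        System.out.println("entries in space: " + space);
    }
}
```

Only the first call leaves an entry in the simulated space; the other two return false with nothing written.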

Now that you have a new and improved writeTask method that tells you whether or not it has written a task into the space, you'll revamp the generateTasks method to make use of the new information. Here is a revised generateTasks that makes sure all the tasks get written into the space:



private void generateTasks() {
    boolean written;
    for (int i = 0; i < 10; i++) {
        written = false;
        // retry until the task is known to be in the space
        while (!written) {
            written = writeTask(new AddTask(new Integer(i), new Integer(i)));
        }
    }
    for (int i = 0; i < 10; i++) {
        written = false;
        while (!written) {
            written = writeTask(new MultTask(new Integer(i), new Integer(i)));
        }
    }
}

As you can see, the idea here is pretty simple. You wrap a loop around each call to writeTask and make the call repeatedly until it returns a status of true, indicating that it successfully wrote the task into the space.
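The retry pattern can be exercised on its own, without a space. The following is a minimal sketch: RetrySketch and retryUntilTrue are hypothetical names, and the lambda that fails twice before succeeding merely stands in for a writeTask call that returns false on failure.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-alone version of the retry loop around writeTask.
class RetrySketch {
    // Calls attempt until it reports success; returns how many calls were made.
    static int retryUntilTrue(BooleanSupplier attempt) {
        int calls = 0;
        boolean written = false;
        while (!written) {
            written = attempt.getAsBoolean();
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        int[] failuresLeft = {2};   // assumption: the first two attempts fail
        int calls = retryUntilTrue(() -> failuresLeft[0]-- <= 0);
        System.out.println("attempts: " + calls);
    }
}
```

Like the loop in generateTasks, this sketch retries without limit, so it only terminates if an attempt eventually succeeds.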

If you look back at the original Master code above, you'll notice that the takeResult method isn't particularly robust either, since you can't always be sure whether or not an entry has been removed from the space. If takeResult returns a non-null value, then you know the entry has been removed from the space. But if it returns null, you can't be sure. For instance, if a RemoteException is thrown, the entry could have been removed from the space but then lost before it made its way back to the client.

To make the takeResult method more fault tolerant, you'll follow the same scheme you used for writeTask. You'll wrap the take inside a transaction that's committed once you have the result entry in hand:



protected ResultEntry takeResult() {
    // try to get a transaction with a 10-min lease time (in milliseconds)
    Transaction txn = getTransaction(1000*10*60);
    if (txn == null) {
        throw new RuntimeException("Can't obtain a transaction");
    }
    ResultEntry template = new ResultEntry();
    ResultEntry result;
    try {
        try {
            // take a result under the transaction
            result = (ResultEntry)
                space.take(template, txn, Long.MAX_VALUE);
        } catch (Exception e) {
            // the take failed: abort so no entry is removed from the space
            txn.abort();
            return null;
        }
        // the result is in hand: commit to make the removal permanent
        txn.commit();
        return result;
    } catch (Exception e) {
        // the commit or abort itself failed
        System.err.println("Transaction failed");
        return null;
    }
}

As you can see, takeResult either returns the result entry that was removed from the space or null if no result entry was removed.
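The benefit of taking under a transaction shows up in the lost-reply scenario described earlier: if the entry never reaches the client, aborting the transaction puts it back. Here is a minimal in-memory sketch of that behavior. FakeSpace, FakeTxn, and the loseReply flag are hypothetical stand-ins invented for illustration; they are not the Jini or JavaSpaces types.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of a take under a transaction; no Jini classes involved.
class TxnTakeSketch {
    static class FakeSpace {
        final Deque<String> entries = new ArrayDeque<>();
    }

    static class FakeTxn {
        private final FakeSpace space;
        private String taken;   // entry removed under this transaction, not yet final

        FakeTxn(FakeSpace space) {
            this.space = space;
        }

        String take() {
            taken = space.entries.poll();
            return taken;
        }

        void commit() {
            taken = null;       // removal becomes permanent
        }

        void abort() {
            if (taken != null) {
                space.entries.addFirst(taken);   // entry returns to the space
                taken = null;
            }
        }
    }

    // loseReply simulates the entry being lost on its way back to the client.
    static String takeResult(FakeSpace space, boolean loseReply) {
        FakeTxn txn = new FakeTxn(space);
        String result;
        try {
            result = txn.take();
            if (loseReply) throw new RuntimeException("reply lost in transit");
        } catch (RuntimeException e) {
            txn.abort();
            return null;
        }
        txn.commit();
        return result;
    }

    public static void main(String[] args) {
        FakeSpace space = new FakeSpace();
        space.entries.add("result-1");
        System.out.println(takeResult(space, true));    // failed attempt
        System.out.println("entries left: " + space.entries.size());
        System.out.println(takeResult(space, false));   // successful retry
    }
}
```

Because the failed attempt leaves the entry in the simulated space, a later call can still retrieve it, which is what makes retrying takeResult safe.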

You'll also need to revise the collectResults method to make use of the new and improved takeResult:



private void collectResults() {
    ResultEntry result;
    for (int i = 0; i < 20; i++) {
        result = null;
        while (result == null) {
            result = takeResult();
        }
        System.out.println(result);
    }
}

Here, each time through the outer loop, your goal is to retrieve one result entry from the space. In the inner loop, you call takeResult in an attempt to retrieve a result. If the method returns null (meaning it couldn't remove a result entry from the space), you iterate through the loop again; you'll continue looping until you manage to get a result from the space.

By now, you've made your master and worker code more robust and improved the compute server considerably. But that's not quite the end of the story. You may still have to contend with the compute server's scalability issues. Let's take a closer look.
