The Artima Developer Community

Weblogs Forum
Java Threads

36 replies on 3 pages. Most recent reply: Oct 18, 2005 12:19 PM by Bruce Eckel

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Java Threads
Posted: Sep 8, 2005 3:15 PM
Summary
In my research for the "Concurrency" chapter for Thinking in Java 4e, I've read through a lot of material. This morning, I started perusing the 3rd edition of "Java Threads" by Oaks & Wong, from O'Reilly.
The promising thing about this book is that it focuses exclusively on the new threading material in J2SE5. A number of Java books that have appeared recently, promising to be a thorough coverage of J2SE5, have not delivered on their promises, at least where threading is concerned.

The third edition of the Oaks and Wong book appears to be a significant improvement over the first two editions. As I said, I've only just dipped into it, but so far there are two issues that I haven't been able to sort out, and I thought that perhaps readers of this weblog could offer clarifications:

The first is on page 10:

Worse yet, the word processor may periodically perform an autosave, which invariably interrupts the flow of typing and disrupts the thought process. In a threaded word processor, the save operation would be in a separate thread so that it didn't interfere with the work flow.

Hmm. I use Word for book creation (don't bother commenting about that; I've heard all the "you should use this" comments, and every one of them would require much more work to use than Word does. And OpenOffice, much as I love the concept, won't handle a book. When I find a truly better solution, I'll be the first to switch.). I've used it for many years, and although it appears to do a background save, any typing that you do during any kind of save (foreground or background) simply fills up the input buffer. That is, the operating system saves your keystrokes, but Word doesn't appear to start a separate task to perform the save.

After giving it some thought, I began to wonder if it would even be a feasible approach. In order to allow a background save to occur, I would think that you would have to clone your document and save the clone. If you started performing a save on a big document and that document was changing during the save, well, that sounds like it opens up a lot of awfully messy possibilities, especially if a complex document format could cause changes throughout the document.

So this worried me, but never having written an editor or word processor I'll grant the possibility that it's something I don't understand well enough.

But then, I came to their discussion of the volatile keyword on pages 41-43. Although volatile has always seemed trivial, in the last few years it has gotten much more complex (primarily because of a deeper understanding that has been developed about the interaction of caches and concurrency), so I was hoping to find some illumination in this book.

Instead, on page 43, I found this statement:

For example, operations like increment and decrement (e.g., ++ and --) can't be used on a volatile variable because these operations are syntactic sugar for a load, change, and a store.

My first impression was that the word "can't" meant that the compiler wouldn't let you do it, so I tried this:


public class VolatileIncrement {
  volatile int x = 1;
  public void f() { x++; }
  public String toString() { return Integer.toString(x); }
  public static void main(String[] args) {
    VolatileIncrement vi = new VolatileIncrement();
    vi.f();
    System.out.println(vi);
  }
}

The compiler had no complaints, and running the program produces a value of "2" so I must now assume that "can't" means "shouldn't" or "doing so won't give you the right results." But this is still unclear, because earlier in the paragraph they introduced the idea of atomicity:

They [volatile variables] can be used only when the operations that use the variable are atomic, meaning the methods that access the variable must use only a single load or store.

And again, I'm confused by the wording "can be used only," since clearly the operations available for a volatile variable include those that are non-atomic. In fact, no operations on long or double variables are atomic, and yet I can write this:


public class NonAtomic {
  volatile long a = 1;
  volatile double b = 1.0;
  public void f1() { a++; }
  public void f2() { b += 1.0; }
  public String toString() { 
    return Long.toString(a) + " " + Double.toString(b); 
  }
  public static void main(String[] args) {
    NonAtomic na = new NonAtomic();
    na.f1();
    na.f2();
    System.out.println(na);
  }
}

Again, no complaints from the compiler, and reasonable results at run time. So I don't really know what to think here: either they were speaking imprecisely, and meant that you shouldn't (but even then their explanation of volatile is vague at best), or they thought that you really couldn't but never actually tried it (which would be rather ominous for the rest of the book).

The section concludes in the second-to-last paragraph on page 43 with this:

The requirements of using volatile variables seem overly restrictive. Are they really important? This question can lead to an unending debate. For now, it is better to think of the volatile keyword as a way to force the virtual machine not to make temporary copies of a variable.

Two things bother me about this. The first is the suggestion that this is the subject of unending debate. My impression so far is that it is a fairly deterministic issue, but if it is really debatable I'd like to hear what the issues are (it was not clear to me from this book).

The second, and deeper, disturbance about the above statement is that my pre-new-memory-model understanding of volatile has always been "don't make any assumptions about this value when doing optimizations." That is, the optimizer might come along and look at the code surrounding a variable access and say "hey, there's nothing in this code that has changed that variable since the last code that accessed it, so I'll keep it in a register and just read from the register instead of going all the way back to main memory to access it." Ignoring the caching issue for now, volatile says "don't do that. Always read the variable as if some other process had changed it."

I suppose you could say that "putting the value of a variable in a register or a cache" could be called "making a temporary copy," but that seems imprecise to me; I think of a temporary as an actual variable that might, for example, be created by the compiler in order to evaluate a complex expression.

But I guess the real problem I'm having, after hardly getting into the book at all, is that I can't figure out whether what they're saying is wrong or whether they are just communicating it badly. But in either case it's starting to look like it will take a lot of effort to extract value from this particular book.

Any illumination is appreciated.


Andreas Mross

Posts: 12
Nickname: amross
Registered: Apr, 2004

Re: Java Threads Posted: Sep 8, 2005 4:37 PM
Regarding the word processor question:

In theory, the word processor document could be modelled using the Command pattern. The document would be stored as an ever growing List of edit Commands. The save thread could safely save these Commands to disk even as the Controller was adding new Commands (entered by the user) to the end of the List.
I imagine this approach to modelling a document would be completely impractical for any document over perhaps a page in length, as the list of Commands would grow too long to render to a View efficiently. Perhaps there is some kind of hybrid approach possible, where the state is calculated to some point then a short list of Commands is tacked on to the end.

The best approach, as you say, seems to be to lock the document model, clone it, then pass the clone off to a separate thread which would save the document to disk while the user kept working. I'm not sure what the confusion is here? It sounds pretty straightforward.
In a large document, the cloning may take considerable time in itself, and the user could not work while the cloning was taking place. This cloning could be optimised if required.
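
A minimal sketch of the lock, clone, and hand-off idea described above. The document model here is a hypothetical list of paragraphs, and the "save" only joins the snapshot into a string as a stand-in for real disk I/O; none of these names come from the thread or from any real word processor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical document model: just a list of paragraphs.
class SnapshotSaveSketch {
  private final List<String> paragraphs = new ArrayList<>();
  private final ExecutorService saver = Executors.newSingleThreadExecutor();

  // Edits hold the document lock only briefly.
  public synchronized void append(String paragraph) {
    paragraphs.add(paragraph);
  }

  // Clone while holding the lock, then process the clone on the saver thread.
  public Future<String> saveInBackground() {
    final List<String> snapshot;
    synchronized (this) {
      snapshot = new ArrayList<>(paragraphs); // the user is blocked only for this copy
    }
    // Stand-in for real disk I/O (an assumption of this sketch).
    return saver.submit(() -> String.join("\n", snapshot));
  }

  public void shutdown() {
    saver.shutdown();
  }

  public static void main(String[] args) throws Exception {
    SnapshotSaveSketch doc = new SnapshotSaveSketch();
    doc.append("first");
    Future<String> saved = doc.saveInBackground();
    doc.append("second");            // editing continues while the save runs
    System.out.println(saved.get()); // the snapshot holds only "first"
    doc.shutdown();
  }
}
```

The cost noted in the post shows up directly: the copy happens under the lock, so its duration is the pause the user would feel.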

Eric Gillespie

Posts: 13
Nickname: viking
Registered: Jun, 2005

Re: Java Threads Posted: Sep 8, 2005 7:29 PM
Depending upon how much of the document was loaded into memory, you would possibly double memory requirements at the time of cloning and saving. Is this a feasible way of doing the job of saving a big document at the same time as you continue to work upon it?
Could "save" sessions be tied into a "microbreak" anti-RSI loop? I.e., the user HAS to take a quick five-second break while the document is (at least somewhat) synchronised. That's just a totally random thought that popped into my head.

Brian Slesinsky

Posts: 43
Nickname: skybrian
Registered: Sep, 2003

Re: Java Threads Posted: Sep 8, 2005 10:31 PM
You could also do it by representing the document as an original and the differences between the latest version and the original. Editing the document just changes the diffs. When the save is finished, apply the diffs to the original.

Nail Samatov

Posts: 2
Nickname: nfsr
Registered: Sep, 2005

Re: Java Threads Posted: Sep 9, 2005 12:16 AM
Before the save begins, you can set a "SavingFlag" to true and then start the saving thread; while it runs, user changes are applied to a separate temporary stream (maybe using Command?). When the save is finished, set SavingFlag=false, apply the changes from the temporary stream to the main document, and then work as before.

P.S. sorry for my bad endlish :)

Nail Samatov

Posts: 2
Nickname: nfsr
Registered: Sep, 2005

Re: Java Threads Posted: Sep 9, 2005 12:17 AM
I mean english =)

Maarten Hazewinkel

Posts: 32
Nickname: terkans
Registered: Jan, 2005

Re: Java Threads Posted: Sep 9, 2005 1:10 AM
I think the word processor example is probably not a good one, given that current hardware speeds make for very quick saves (though using a network volume can make that less smooth).

Regarding the list-of-commands document representation, I seem to recall that Word actually does use such an approach. Among other things, it speeds up saves by only needing to append updates to the end of the document. Take a look at a Word document with a plain text editor. You generally won't find your document text in any normal order, but pieces all over the place, including duplicates.
Obviously you don't use this to store single-keystroke commands, but for larger sections, and at some point you need to flatten (parts of) the list.

Anyway, as a design for a separate saving thread, I'd try using a decorator on the document object which can buffer updates to the real document and also apply those updates to any read methods.

For a better example of using background threads, I'd look at something like taking/saving a snapshot of a real-time display or control system.

Krzysztof Sobolewski

Posts: 7
Nickname: jezuch
Registered: Dec, 2003

Re: Java Threads Posted: Sep 9, 2005 1:22 AM
> I think the word processor example is probably not a good one, given that current hardware speeds make for very quick saves (though using a network volume can make that less smooth).

I think you haven't seen OpenOffice saving a big document ;) AFAIR it also does that synchronously, so you can't edit it while saving.
I too think that the document is [should be?] stored as a sequence of Commands (for the undo buffer). This is not limitless: you have a preference to limit the undo buffer to, say, 100 undos; after that, all previous changes are "merged" into the main body. The save action makes a "snapshot" by remembering the last item in the buffer.
Of course, this is only speculation ;)
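
To make that speculation concrete, here is a rough single-threaded sketch of a bounded undo buffer whose oldest entries get merged into the main body. It is purely illustrative: the class name, the append-only "command", and the `snapshot()` method are all assumptions of this sketch, not a description of how OpenOffice or Word work.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model: every edit command just appends text.
class BoundedEditLog {
  private final StringBuilder body = new StringBuilder();   // merged, no longer undoable
  private final Deque<String> pending = new ArrayDeque<>(); // undoable edits, newest last
  private final int limit;

  BoundedEditLog(int limit) {
    this.limit = limit;
  }

  void append(String text) {
    pending.addLast(text);
    if (pending.size() > limit) {
      body.append(pending.removeFirst()); // the oldest edit is merged into the body
    }
  }

  // Drops the newest pending edit; returns false when nothing is left to undo.
  boolean undo() {
    return pending.pollLast() != null;
  }

  // What a save would write: the body plus every edit pending at this moment.
  String snapshot() {
    StringBuilder out = new StringBuilder(body);
    for (String edit : pending) {
      out.append(edit);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    BoundedEditLog log = new BoundedEditLog(2);
    log.append("a");
    log.append("b");
    log.append("c");                    // exceeds the limit, so "a" is merged
    log.undo();                         // removes "c"
    log.undo();                         // removes "b"
    System.out.println(log.undo());     // false: "a" can no longer be undone
    System.out.println(log.snapshot()); // a
  }
}
```

A background save would still need the locking or cloning discussed earlier in the thread; this sketch only shows the buffer bookkeeping.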

Nobody commented on volatile, so it looks like nobody knows the answer. I too always thought this is a flag to prevent over-optimisation. I also always thought that ++ and -- are atomic operations...

Nicky Bodentien

Posts: 1
Nickname: gratis
Registered: Sep, 2005

What volatile means Posted: Sep 9, 2005 3:38 AM
Saying that "you can't" apply ++ to a volatile field is plain wrong. They must mean that "you shouldn't".

About volatile: According to the 3rd edition of the Java Language Specification, declaring a long field to be volatile ensures that reads or writes to it occur atomically. However, I have not been able to find any guarantee that the effect of the ++ increment operator will occur atomically - so there probably is no such guarantee.

About the effects of volatile: Yes, the INTENTION of volatile in the old Java memory model was, intuitively, to prevent the optimizer from making assumptions about a variable that would only hold in a single-threaded environment. However, declaring a field volatile - according to the old memory model - simply didn't give the thread safety that people intuitively thought it would. In particular, see http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html about how volatile COULDN'T fix the double checked locking (anti)pattern. With the new memory model, the volatile keyword actually provides useful semantics, at the expense of performance.

Hanson Char

Posts: 2
Nickname: hchar
Registered: Sep, 2005

Re: What volatile means Posted: Sep 9, 2005 4:19 AM
"++" and "--" can't be used on a volatile variable means they can't be used in a ** thread-safe manner **. Atomic is different from thread-safe. The "++" and "--" operators involve 3 individually atomic operations (load, change, save), not an atomic transaction that involves 3 operations.

Each operation by itself is thread safe since it's atomic => "you can".

But as a whole (load, change, save) it's not thread safe => "you can't".
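
The distinction can be observed with a small experiment. This is only a sketch under assumptions: the class and method names are made up, and whether the volatile counter actually loses updates on a given run depends on the hardware, the JVM, and thread timing. `AtomicInteger` from `java.util.concurrent.atomic` is used as the contrasting case because its `incrementAndGet()` is a single indivisible read-modify-write.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LostUpdateDemo {
  static volatile int volatileCount = 0;
  static final AtomicInteger atomicCount = new AtomicInteger();

  // Runs the given number of threads, each incrementing both counters perThread times.
  static void run(int threads, int perThread) throws InterruptedException {
    volatileCount = 0;
    atomicCount.set(0);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          volatileCount++;               // load, add, store: three separate steps
          atomicCount.incrementAndGet(); // one indivisible read-modify-write
        }
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      t.join();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    run(4, 100_000);
    System.out.println("volatile: " + volatileCount + " (may be less than 400000)");
    System.out.println("atomic:   " + atomicCount.get());
  }
}
```

A single-threaded test, like the ones in the original post, can never show the difference, because no other thread gets a chance to interleave between the load and the store.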

Hanson Char

Posts: 2
Nickname: hchar
Registered: Sep, 2005

Re: What volatile means Posted: Sep 9, 2005 4:34 AM
In JDK1.5, there is a simple yet clever use of volatile in java.util.concurrent.CopyOnWriteArrayList to provide thread-safe read operations without the need to synchronize. Note the update operations are still synchronized, but the read operations are not. Worth a look.
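
The general idea can be sketched as follows. This is not the JDK source; it is a simplified, hypothetical int-only container that only illustrates the pattern described above: writers synchronize and publish a fresh array through a volatile field, while readers take one volatile read and then work on an array that is never modified again.

```java
import java.util.Arrays;

// Simplified illustration of the copy-on-write idea; not the JDK implementation.
class CopyOnWriteInts {
  private volatile int[] items = new int[0];

  // Writers serialize on the lock, modify a private copy, then publish it.
  public synchronized void add(int value) {
    int[] old = items;
    int[] next = Arrays.copyOf(old, old.length + 1);
    next[old.length] = value;
    items = next; // volatile write makes the fully built array visible to readers
  }

  // Readers do a single volatile read; the array they get never changes afterwards.
  public int sum() {
    int total = 0;
    for (int v : items) {
      total += v;
    }
    return total;
  }

  public int size() {
    return items.length;
  }

  public static void main(String[] args) {
    CopyOnWriteInts list = new CopyOnWriteInts();
    list.add(1);
    list.add(2);
    list.add(3);
    System.out.println(list.sum()); // 6
  }
}
```

The trade-off is that every update copies the whole array, so the pattern suits data that is read far more often than it is written.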

John D. Mitchell

Posts: 244
Nickname: johnm
Registered: Apr, 2003

Multi-threaded Word Processor Posted: Sep 9, 2005 6:51 AM
Well, I assume that since the quote is from page 10 that it's very generic and handwavy. There are lots of factors that could be at play depending on little things like how threads are implemented in the JRE being used, the data model of the word processor (and hence how concurrent access/mutation is handled, etc.), how I/O is managed, etc.

In terms of the data model, there are a fairly wide variety of approaches that deal with the various tradeoffs (browsing ease/speed, insertion ease/speed, concurrency, memory footprint, persistent footprint, versioning, I/O, etc.).

MS Word, in particular, has been the subject of a particularly nefarious history. It stores things in chunks. For example, it keeps old versions of the chunks and so people have been quite embarrassed when someone has looked through the actual .doc file and found that lingering data. One of the tricks has been to always do a Save As... of the file to get a clean .doc. In various versions in the past (at least :-), it's not the most robust at actually saving the data to the .doc files correctly.

John D. Mitchell

Posts: 244
Nickname: johnm
Registered: Apr, 2003

Volatile, Atomic, and ++ Posted: Sep 9, 2005 8:02 AM
Well, since I haven't read that book, I don't have the full context of what they are saying (or trying to say), but from your excerpts I gotta say that I'm quite concerned that, at the very least, they are being *quite* misleading. Nothing personal but after going through an earlier edition of the book, I won't even bother to look at this edition.

In Java v5 (JLS v3), volatile (and atomic) have "real" semantics due to all of the excellent work done on the new/fixed Java Memory Model. For the specification, check out, e.g., Chapter 17 of the JLS.

Basically, reads and writes of non-volatile longs/doubles are implementation dependent as to whether or not they are done atomically. However, reads and writes of volatile longs/doubles must now be atomic. [They also cleared up the fact that references, regardless of implementation size, must also always be read/written atomically.]

Semantically, the operators like ++ (+=, etc.) aren't specified as being atomic. They are composite actions (i.e., as if load, act, store).

In terms of checking the memory model/concurrency stuff out using programs, you need to be very careful as a lot of the nasties won't show up in simplistic tests or on uni-processors or....

In terms of the mental model of the new memory model, one of the keys is the notion of happens-before. If you really want to know, definitely read the specs and check out the mailing list archives for the concurrency and memory-model JSRs as well as the JLS. The basic idea is the control of the visibility of changes across threads. Atomicity and volatility are two of the tools to deal with that sort of inter-thread visibility. [FWIW, a very rough analogy is that of sequence points as defined in the C language standard.]
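
One common illustration of happens-before under the JSR-133 memory model is publishing a plain field through a volatile flag. The sketch below is an assumption-laden example rather than anything from the book or the specs: the class and method names are invented, and -1 is used as a "not yet published" sentinel, so published values are assumed to be non-negative.

```java
class Publication {
  private int payload;            // plain, non-volatile field
  private volatile boolean ready; // volatile flag used to publish the payload

  void publish(int value) {
    payload = value; // (1) ordinary write
    ready = true;    // (2) volatile write; (1) happens-before (2) by program order
  }

  // Returns the payload once the flag has been seen, or -1 if nothing is published yet.
  int tryRead() {
    if (ready) {      // (3) volatile read; if it sees (2), then (2) happens-before (3)
      return payload; // (4) therefore guaranteed to see the write from (1)
    }
    return -1;
  }

  // Starts a reader thread that polls until the value becomes visible.
  static int publishAndRead(int value) throws InterruptedException {
    Publication p = new Publication();
    int[] seen = new int[1];
    Thread reader = new Thread(() -> {
      int v;
      while ((v = p.tryRead()) == -1) {
        Thread.yield();
      }
      seen[0] = v;
    });
    reader.start();
    p.publish(value);
    reader.join(); // join() also establishes happens-before, so seen[0] is visible here
    return seen[0];
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(publishAndRead(42)); // 42
  }
}
```

If `ready` were not volatile, the reader would have no guarantee of ever seeing the flag change, nor of seeing the payload written before it.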

Does this help address your confusion/consternation?

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Re: Volatile, Atomic, and ++ Posted: Sep 9, 2005 9:49 AM
It's a start. The 4th edition of "The Java Programming Language" also has helpful information, but when they get close to things like the cache coherency problem, they punt and say it's too complex for the book. Doug Lea's "Concurrent Programming in Java, 2nd edition" is also a resource, although it can be heavy going at times because Doug covers everything without regard to complexity or how often you'll use it, or order of presentation.

Currently my struggle is to come up with a definitive understanding of volatile (rather than a description that "can lead to an unending debate"). Especially with the advent of multicore processors and the cache coherency problem, this can no longer be described in a simple fashion. You have to get into instruction reordering and memory barriers, at least to justify arguments for use of volatile. But it would be nice to have a clear description of when to use it, since it's no longer as obvious as it once was.

John D. Mitchell

Posts: 244
Nickname: johnm
Registered: Apr, 2003

Re: Volatile, Atomic, and ++ Posted: Sep 9, 2005 10:25 AM
> Currently my struggle is to come up with a definitive
> understanding of volatile (rather than a
> description that "can lead to an unending debate").
> Especially with the advent of multicore processors and the
> cache coherency problem, this can no longer be described
> in a simple fashion. You have to get into instruction
> reordering and memory barriers, at least to justify
> arguments for use of volatile. But it would be nice
> to have a clear description of when to use it, since it's
> no longer as obvious as it once was.

Well, at the bigger picture level, I'd say it's more important to have the people learn the new concurrency (JSR-166) libraries and focus on using them for their problems. Basically, anybody who truly has a problem that requires something that they don't cover will be far enough along to actually learn the whole memory model nitty gritty.
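
As a small illustration of leaning on the JSR-166 library instead of hand-rolled volatile or synchronized code, the sketch below splits a computation across an `ExecutorService` and collects results through `Future`s. The task itself (a sum of squares) and all names are arbitrary choices for this example; the visibility of each result is handled by `Future.get()`, so no shared mutable field is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LibraryFirst {
  // Computes 1^2 + 2^2 + ... + n^2 by striping the range across worker threads.
  static long sumOfSquares(int n, int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Long>> parts = new ArrayList<>();
      for (int t = 0; t < threads; t++) {
        final int offset = t;
        Callable<Long> task = () -> {
          long sum = 0; // local to the task, so nothing is shared between threads
          for (long i = offset + 1; i <= n; i += threads) {
            sum += i * i;
          }
          return sum;
        };
        parts.add(pool.submit(task));
      }
      long total = 0;
      for (Future<Long> part : parts) {
        total += part.get(); // get() safely hands the result across threads
      }
      return total;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(sumOfSquares(10, 3)); // 385
  }
}
```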

W.r.t. volatile specifically, a simplistic view is to say that it should be used anytime the variable needs to be reliably/predictably visible across threads and heavier synchronization mechanisms aren't being used. But, again, with the big caveat that people shouldn't be performing this sort of "optimization" unless they really do know what they are doing (and, frankly, not that many people really do -- look at all of the blather on this subject for the last 10 years -- that so-called "endless debate" is more of an endless having to explain this complexity to people who are not only ignorant but also indignant).
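
The textbook case for that simplistic view is a stop flag: one thread writes it, another only reads it, and no compound operation is involved. The following is a minimal sketch with invented names, not a recommendation over the java.util.concurrent facilities mentioned above.

```java
class StoppableWorker implements Runnable {
  // Written by the controlling thread, read by the worker thread.
  private volatile boolean stopRequested;

  public void requestStop() {
    stopRequested = true; // a single store, so no atomicity problem arises
  }

  @Override
  public void run() {
    // Without volatile, nothing would oblige the JVM to re-read the flag on each pass.
    while (!stopRequested) {
      Thread.yield();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    StoppableWorker worker = new StoppableWorker();
    Thread thread = new Thread(worker);
    thread.start();
    Thread.sleep(50);     // let the worker spin for a moment
    worker.requestStop();
    thread.join();        // returns once the worker has seen the flag
    System.out.println("worker stopped");
  }
}
```

As soon as the shared state needs a read-modify-write (a counter, a check-then-act), this pattern stops being sufficient and locks or atomic classes are needed instead.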


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use