Sponsored Link •
The Java collection class library has some fine qualities, and having a standardized set of collections is certainly far better than not having it. But the library also has a number of flaws that should be discussed and understood, so that they don't get perpetuated in future libraries. This article discusses two of the more irritating issues.
Several years ago, I was on a committee of the College Board, planning the switch of the Advanced Placement Computer Science exam from C++ to Java. That exam is offered to high school students who seek to get college credit for their high school computer science classes. One of the tasks of the committee was to specify a subset of the Java library that was fair game for exam questions. When the time came to discuss the collection class library, we all sighed and muttered about "Bloch's blunders".
A few weeks later, I was at Java One, where the same library was awarded a prize for "best" (or at least, "best known") class library of the year.
How could two audiences have such a different opinion of the same library? When I asked Josh Bloch, he told me that we academics should stop grousing about the "optional operations". But that wasn't our complaint. We understood that the optional operations are necessary for limited capability views, such as the view returned by Arrays.asList. There is nothing wrong with trading away some purity for a substantial increase in functionality.
The problem is trading away purity when the only result is confusion and inconvenience. The first issue is a serious problem for educators. We try to teach our students about the performance differences between data structures, such as linked lists and arrays. For example, it would be a poor idea to execute binary search on a linked list. After all, linked lists don't have efficient random access. But in the collection class library, both LinkedList and ArrayList implement the List interface, which has operations for random access. Why not have two interfaces, one for an ordered collection such as a linked list, and another for an indexed collection, such as an array? I asked Josh Bloch, and he suggested that some programmers might have a linked list with a few elements, and they might want to sort it, and who was he to deny them that? Huh? When is the last time you needed to do that? And anyway, there are perfectly good algorithms for sorting linked lists.
You may well wonder what happens when you call the Collections.binarySearch method on a linked list. Rather than carry out a horrendously inefficient binary search, the method actually switches to a linear search for a linked list, its name nonwithstanding. Before SDK 1.4, the test for the linear search involved a very ugly hack. Now there is, somewhat belatedly, a tagging interface RandomAccess that ArrayList implements but LinkedList does not. Ugh. Smalltalk did a lot better than that, 20 years earlier.
The second problem with the Java collection classes is iterator inconvenience. In C++, iterators were simple. If you have a collection with n elements, there are n + 1 iterator positions: At the first, second, . . ., nth element, and past the nth element. You could think of an iterator as a cursor like the old terminals used to have (and Emacs still has). Alternatively, you can think of the iterator as new-fangled caret--a vertical line that is between elements. Then the n + 1 positions are before the first, between the first and second, . . . between the (n-1)st and nth, and past the nth element. Both interpretations are essentially equivalent, but the "cursor" view makes it a bit easier to interpret the element that the iterator currently visits.
What should happen when you add an element at the iterator position? Both C++ and Java do the right thing--which is the same thing your text editor does. The element is inserted before the iterator, and the iterator moves past the inserted element, like this:
a|bc => ax|bc (after inserting x)
How about deleting the iterator position? C++ does what your text editor would do when you press the Delete key (ok, except on the Mac, don't get me going...). The element under the cursor (or to the right of the caret) is deleted, like this:
ax|bc => ax|c (after deleting)
How about Java? Well, you can't tell. It depends if the iterator had just moved to the right, to the left, or not at all. Huh? In the last case, the remove method even throws an exception! If you want to delete two consecutive elements, then you have to move the iterator past the first one, invoke the remove method, and move the iterator over the adjacent element before calling remove again. If my text editor worked like that, it would drive me crazy.
I don't think I am the only one who is aggrieved by Java iterators. Douglas Dunn has a long-winded section in his splendid book "Java Rules", trying to explain the behavior of iterators. (He takes a different tack, taking the remove functionality for granted and then laboriously tries to explain the inconsistency of the add method.)
Of course, these flaws are not fatal. The collection class library has some fine qualities, and having a standardized set of collections is certainly far better than not having it. Other libraries, such as the STL library of C++, have their share of bizarre features. What bugs be about the Java collection class library is that it seems to be uncritically accepted, and that so few people talks about its flaws. If we can't recognize our mistakes, how can we learn from them?
That's my pet peeve of the day.
|Cay Horstmann is a professor of computer science at San Jose State University. He is an experienced professional programmer and was VP of Technology and CTO at Preview Systems Inc. He wrote many successful professional and college books, including Core Java (with Gary Cornell), Big Java, Big C++ (with Tim Budd), and OO Design & Patterns.|