The Artima Developer Community
Computing Thoughts
Collection overused as an argument in Java Libraries?
by Bruce Eckel
July 12, 2005
Summary
For Thinking in Java 4e, I'm trying to analyze the presence of the Collection interface in java.util.

C++ has no equivalent explicit common interface for sequences. My perception is that whenever a generic STL function is created, it works with any sequence (via templates), but it turns around and works through iterators pulled from that sequence every time (or almost every time).

My perception is that almost every time Java library code takes a Collection as an argument, it pulls an Iterator from it and uses only that. Such a method could accept the Iterator rather than the Collection, and the result would be a more general-purpose piece of code.
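
As a rough Python analogue of that point (this is not Java library code, and the name total_length is made up for illustration), a function that makes a single pass needs nothing more than something it can iterate over. Here Python's list stands in for a Collection and iter() for pulling an Iterator:

```python
def total_length(words):
    """Makes one forward pass, so any iterator is enough."""
    total = 0
    for word in words:
        total += len(word)
    return total

names = ["alpha", "beta", "gamma"]

# The same function accepts a container, an iterator pulled from
# it, or a lazily generated sequence with no container behind it.
from_container = total_length(names)
from_iterator = total_length(iter(names))
from_generator = total_length(w.upper() for w in names)
```

The third call is the interesting one: there is no container at all, which is the extra generality gained by asking only for an iterator.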

I wrote a Python program to extract the method bodies of methods that take various types of container and iterator arguments, to make it easier to evaluate this assertion (note: program updated July 14). Try running it on your own Java library source to see the results:


'''Searches through standard Java source code files looking
for examples where the classes in "searchFor" are method
arguments. Formats results into the "output" file'''
# Requires Python 2.4
import os, re

src = r"C:\ProgTools\Java\src" # Your Java source directory
output = file("CollectionVsIteratorArguments.txt", 'w')
headerWidth = 59

searchFor = [ "Collection", "List", "LinkedList", "Set",
  "Map", "Iterator", "ListIterator", "Iterable", "Enumeration" ]

# Regular expressions to match argument types. Imperfect --
# only finds methods where the entire argument list is on
# a single line:
argSearch = {}
for srch in searchFor:
    argSearch[srch] = re.compile(r"\W%s( |<)" % srch)

# Capture argument list:
arglist = re.compile(r"\([^)]+?\)")

results = {}
for types in argSearch.keys():
    results[types] = [0, ""]

# Don't include files that are implementing container classes 
exclude = """CopyOnWriteArraySet.java 
AbstractSet.java PriorityQueue.java AbstractList.java 
LinkedList.java DefaultListModel.java RegularEnumSet.java 
JumboEnumSet.java DelayQueue.java Vector.java HashSet.java 
AbstractCollection.java ArrayList.java 
PriorityBlockingQueue.java CopyOnWriteArrayList.java 
LinkedHashSet.java IdentityHashMap.java AbstractQueue.java 
AbstractSequentialList.java LinkedBlockingQueue.java 
TreeSet.java Collections.java ArrayBlockingQueue.java 
SynchronousQueue.java ConcurrentLinkedQueue.java 
EnumSet.java Class.java BeanContextSupport.java 
AbstractMap.java EnumMap.java HashMap.java Hashtable.java 
LinkedHashMap.java TreeMap.java WeakHashMap.java 
ConcurrentHashMap.java""".split()

def captureMethod(javaFile, firstLine, lines, index):
    '''The caller has found a first line of a method. This
    captures the rest of the method, assuming it can do a
    simple brace count to detect the end of the method'''
    method = ""
    openBraces = 1
    method += "-" * headerWidth + "\n" + javaFile + "\n"
    method += firstLine
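    print firstLine,  # Echo each matched signature as progress
    # Stop at end of file if the braces never balance
    while openBraces and index < len(lines) - 1: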
        index += 1
        line = lines[index]
        method += line
        openBraces += line.count('{') - line.count('}')
    return method

# Walk the directory tree looking for appropriate Java files
for javaFile in (os.path.join(root, name)
        for root, dirs, files in os.walk(src)
        for name in files if name.endswith(".java")
        and name not in exclude):
    lines = file(javaFile).readlines()
    for index, line in enumerate(lines):
        methodArgs = arglist.search(line)
        if methodArgs \
                and line.strip().endswith("{") \
                and not line.strip().startswith("for") \
                and not line.strip().startswith("if") \
                and not line.strip().startswith("*") \
                and not line.strip().startswith("//"):
            args =  methodArgs.group(0)
            for argType in argSearch.keys():
                if argSearch[argType].search(args):
                    results[argType][0] += 1
                    results[argType][1] += captureMethod(
                        javaFile, line, lines, index)
            
for typeName, methods in results.items():
    print >>output, "#" * headerWidth
    print >>output, (typeName + " (%d instances)" %
        methods[0]).center(headerWidth)
    print "%s (%d instances)" % (typeName, methods[0])
    print >>output, "#" * headerWidth
    print >>output, methods[1]
output.close()
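
To see what the matching and brace counting in the program above actually pick up, here is a minimal sketch that applies the same two regular expressions to a few lines held in memory. The Java lines are invented for this illustration, not copied from the library source, and the helper name capture is hypothetical. It is written to run unchanged on newer Python versions as well.

```python
import re

# Same patterns as the program above, for the "Collection" case only.
arg_type = re.compile(r"\WCollection( |<)")
arglist = re.compile(r"\([^)]+?\)")

# Hypothetical Java source lines, invented for this illustration.
lines = [
    "    public boolean addAll(Collection<? extends E> c) {\n",
    "        boolean modified = false;\n",
    "        for (E e : c) {\n",
    "            modified |= add(e);\n",
    "        }\n",
    "        return modified;\n",
    "    }\n",
    "    public int size() {\n",
    "        return count;\n",
    "    }\n",
]

def capture(lines, index):
    """Collect lines from the signature at index until braces balance."""
    body = [lines[index]]
    open_braces = 1  # The signature line ends with "{"
    while open_braces and index < len(lines) - 1:
        index += 1
        body.append(lines[index])
        open_braces += lines[index].count("{") - lines[index].count("}")
    return body

matches = []
for index, line in enumerate(lines):
    args = arglist.search(line)
    if args and line.strip().endswith("{") \
            and not line.strip().startswith("for") \
            and arg_type.search(args.group(0)):
        matches.append(capture(lines, index))
```

Only addAll is captured. The for line has a parenthesized list and a trailing brace but is filtered out by its prefix, and size() has an empty argument list, which the arglist pattern does not match.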

C++ doesn't need the common interface because of the latent interface produced by templates.

However, if you are dealing with forward-only iterators, and this is the case for Iterator in Java (which cannot be reset or copied), you may need to pass through the sequence more than once, in which case you need the actual container rather than the iterator.
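
Python iterators are single-pass in the same way, so the problem can be sketched there. The function name spread is made up for illustration. It makes two passes over its argument, which works with a container but not with an iterator that the first pass has already used up:

```python
def spread(values):
    """Difference between largest and smallest, using two passes."""
    lowest = min(values)   # First pass
    highest = max(values)  # Second pass
    return highest - lowest

data = [3, 9, 4]

# A container can be traversed as many times as needed.
from_container = spread(data)

# An iterator is exhausted by min(), so max() sees an empty
# sequence and raises ValueError.
try:
    spread(iter(data))
    iterator_failed = False
except ValueError:
    iterator_failed = True
```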

But it still seems to me that there may be more cases where an Iterator could be passed to a method rather than Collection.

I also notice that the majority of the uses of Collection as an argument occur inside the implementation of the java.util library classes themselves.

Chuck Allison commented:

Algorithms have no knowledge of the underlying container in C++. That's why they require two iterators to delimit the sequence. If they need to make multiple passes through the sequence, they make a local copy of the iterator (which is why iterators must be copyable).
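
Python has no direct equivalent of copying a C++ iterator, but itertools.tee gives a loose approximation: it splits one iterator into independent ones, buffering items internally rather than copying a position. The sketch below is an analogy under that assumption, not a translation of any STL algorithm, and the name spread_two_pass is hypothetical:

```python
from itertools import tee

def spread_two_pass(values):
    """Two passes over a single-pass iterator, via two tee'd copies."""
    first_pass, second_pass = tee(values)
    lowest = min(first_pass)
    highest = max(second_pass)
    return highest - lowest

# Works even though the argument is an iterator, not a container.
result = spread_two_pass(iter([3, 9, 4]))
```

The cost is that tee has to hold on to every item until the slower copy has consumed it, so for a full second pass it ends up storing the whole sequence.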

Please comment on this idea. Thanks.


About the Blogger

Bruce Eckel (www.BruceEckel.com) provides development assistance in Python with user interfaces in Flex. He is the author of Thinking in Java (Prentice-Hall, 1998, 2nd Edition, 2000, 3rd Edition, 2003, 4th Edition, 2005), the Hands-On Java Seminar CD ROM (available on the Web site), Thinking in C++ (PH 1995; 2nd edition 2000, Volume 2 with Chuck Allison, 2003), C++ Inside & Out (Osborne/McGraw-Hill 1993), among others. He's given hundreds of presentations throughout the world, published over 150 articles in numerous magazines, was a founding member of the ANSI/ISO C++ committee and speaks regularly at conferences.

This weblog entry is Copyright © 2005 Bruce Eckel. All rights reserved.
