The Artima Developer Community
Weblogs Forum
Generics: Bounds Puzzle

10 replies on 1 page. Most recent reply: Oct 24, 2005 6:04 AM by Bruce Eckel

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Generics: Bounds Puzzle
Posted: Oct 24, 2005 6:04 AM
Summary
I was poking around in the Standard Java Libraries looking for examples of generic code, and came across something curious.

This is in the package java.lang.reflect (comments have been edited out):

public interface TypeVariable<D extends GenericDeclaration> extends Type {
    Type[] getBounds();
    D getGenericDeclaration();
    String getName();
}

Why is this curious? A generic parameter "erases to its first bound," which means that the compiler effectively replaces the parameter with its first bound. In this case, that bound is simply GenericDeclaration.
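
As a rough illustration of the erasure point, the sketch below asks reflection for both views of getGenericDeclaration(): the erased return type that the JVM sees, and the declared return type recorded in the class file's generic signature. The class name ErasureDemo and its helper methods are made up for this example; only the java.lang.reflect calls come from the JDK.

```java
import java.lang.reflect.GenericDeclaration;
import java.lang.reflect.Method;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

class ErasureDemo {
    // Looks up TypeVariable.getGenericDeclaration() reflectively.
    static Method accessor() {
        try {
            return TypeVariable.class.getMethod("getGenericDeclaration");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Erased (runtime) return type: the first bound of D.
    static Class<?> erasedReturnType() {
        return accessor().getReturnType();
    }

    // Name of the declared return type as kept in the generic signature.
    static String declaredReturnTypeName() {
        Type declared = accessor().getGenericReturnType();
        return (declared instanceof TypeVariable)
                ? ((TypeVariable<?>) declared).getName()
                : declared.toString();
    }

    public static void main(String[] args) {
        System.out.println(erasedReturnType().getName()); // java.lang.reflect.GenericDeclaration
        System.out.println(declaredReturnTypeName());     // D
    }
}
```

The erased type is indeed GenericDeclaration, but the compiler still uses the declared type D when checking callers, which is where the replies below pick up the argument.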

But if you look at the code, you'll see that there is no need for generics to be used here. You can just say that the return type of getGenericDeclaration() is GenericDeclaration, and skip the use of generics. So I think it should look like this, instead:

public interface TypeVariable extends Type {
    Type[] getBounds();
    GenericDeclaration getGenericDeclaration();
    String getName();
}

That's my take on it, anyway. Perhaps you can see something that I haven't.


Here is the Python program I used to hunt for generic declarations. It has a serious flaw: it only finds declarations where '<' and '>' appear on the same line. I know it's possible to write a regular expression that deals with all the possibilities of multiple lines and nested angle brackets, but I just wanted to do a quick scan, and regular expressions like that hurt my brain. Please feel free to suggest a correction.

The program attempts to remove any generic argument list that looks "complex enough to be justified," and just tries to find the ones that might be "too simple."

'''Searches through standard Java source code files'''
# Uses Python 2.4
import os, re

src = r"C:\ProgTools\Java\src" # Your Java source directory
output = file("GenericArgumentLists.txt", 'w')

# Limit: only matches when they're all on one line:
genericArglist = re.compile('<.*?>')

# Walk the directory tree looking for appropriate Java files
for javaFile in (os.path.join(root, name)
        for root, dirs, files in os.walk(src)
        for name in files if name.endswith(".java")):
    lines = file(javaFile).readlines()
    for index, line in enumerate(lines):
        methodArgs = genericArglist.search(line)
        if methodArgs:
            match = methodArgs.group(0)
            if match.find("?") == -1 \
            and match.find("extends") != -1 \
            and line.find("Class<") == -1 \
            and match[1:].find("<") == -1:
                print>>output, javaFile, "(" + str(index + 1) + "):\n\t", line.rstrip()


Noam Tamim

Posts: 26
Nickname: noamtm
Registered: Jun, 2005

Re: Generics: Bounds Puzzle Posted: Oct 24, 2005 7:04 AM
> D getGenericDeclaration();
> GenericDeclaration getGenericDeclaration();

Maybe I'm missing something here, but it looks like the reason for using generics here is simply to avoid having to cast the return value of getGenericDeclaration() to type D (or, more generally, type safety).
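
A minimal sketch of the cast-avoidance argument, using the real reflection API on Collections.emptyList() (declared as `<T> List<T> emptyList()`). The class NoCastDemo and its method names are illustrative only. The second method approximates the proposed non-generic signature by viewing the variable through a wildcard, so only GenericDeclaration is known statically.

```java
import java.lang.reflect.GenericDeclaration;
import java.lang.reflect.Method;
import java.lang.reflect.TypeVariable;
import java.util.Collections;

class NoCastDemo {
    // Looks up Collections.emptyList(), which declares one type parameter, T.
    static Method emptyListMethod() {
        try {
            return Collections.class.getMethod("emptyList");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // With the type parameter, the compiler knows the declaration is a Method.
    static String viaTypeParameter() {
        TypeVariable<Method> t = emptyListMethod().getTypeParameters()[0];
        Method owner = t.getGenericDeclaration(); // no cast needed
        return t.getName() + " declared by " + owner.getName();
    }

    // Without it, only the bound is known and the caller has to downcast.
    static String viaPlainBound() {
        TypeVariable<?> t = emptyListMethod().getTypeParameters()[0];
        GenericDeclaration owner = t.getGenericDeclaration();
        Method asMethod = (Method) owner; // explicit, unchecked-by-compiler downcast
        return t.getName() + " declared by " + asMethod.getName();
    }

    public static void main(String[] args) {
        System.out.println(viaTypeParameter()); // T declared by emptyList
        System.out.println(viaPlainBound());    // T declared by emptyList
    }
}
```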

You want to say it's not a good enough reason? Fine, but then why don't you complain about Comparable<T>?

Noam.

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Re: Generics: Bounds Puzzle Posted: Oct 24, 2005 9:23 AM
That may in fact be the case. I'll think about it, and in the meantime perhaps others can weigh in.

Patrick Mellinger

Posts: 2
Nickname: entername
Registered: Oct, 2005

Re: Generics: Bounds Puzzle Posted: Oct 24, 2005 10:14 AM
It's interesting to me that your regex won't catch anything that's not on the same line. I use regex all day, every day (with Java). Does Python handle the .*? differently than other languages? In my experience that will go right over the top of everything until that next character is matched.

Maybe there are no line-spanning generic declarations?

This expression may work better for you, even though in theory they do the same thing(ish).

genericArglist = re.compile('<[^>]*')
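
For what it's worth, two things seem to be in play here. The posted script reads each file with readlines() and searches one line at a time, so no pattern could span lines there. Separately, in both Python's re module and java.util.regex, '.' does not match a newline unless a "dot matches all" flag is set, whereas a negated class such as [^>] does. The sketch below shows the second point in Java; LineSpanDemo and firstMatch are names invented for this example, and the pattern used includes the closing '>'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineSpanDemo {
    // Returns the first match of the regex in the text, or null if there is none.
    static String firstMatch(String regex, int flags, String text) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // A generic parameter list spread over three lines.
        String source = "interface Foo<\n    T extends Number\n> {}";

        // '.' stops at line terminators by default, so this finds nothing.
        System.out.println(firstMatch("<.*?>", 0, source) == null);           // true

        // With DOTALL, '.' also matches '\n' and the list is found.
        System.out.println(firstMatch("<.*?>", Pattern.DOTALL, source) != null); // true

        // A negated character class matches '\n' without any flag.
        System.out.println(firstMatch("<[^>]*>", 0, source) != null);         // true
    }
}
```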

Roland Kaufmann

Posts: 2
Nickname: rolandk
Registered: Oct, 2005

Re: Generics: Bounds Puzzle Posted: Oct 24, 2005 11:42 AM
The interface itself may not need the generic declaration, but perhaps a client would like to constrain the implementations of it that it may receive and thus would like to impose it as a restriction (i.e. only allow classes that implement this interface using a certain subclass of GenericDeclaration)
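
One way to picture this kind of client-side restriction is a method that only accepts type variables declared by a class or interface. This is just a sketch: ConstrainedClient and describe are hypothetical names, while the TypeVariable and Class calls are the real JDK ones.

```java
import java.lang.reflect.TypeVariable;
import java.util.List;

class ConstrainedClient {
    // Accepts only type variables whose declaration is a Class.
    static String describe(TypeVariable<? extends Class<?>> variable) {
        Class<?> owner = variable.getGenericDeclaration(); // statically known to be a Class
        return variable.getName() + " belongs to " + owner.getSimpleName();
    }

    public static void main(String[] args) {
        // List<E> declares a single type parameter named E.
        System.out.println(describe(List.class.getTypeParameters()[0])); // E belongs to List

        // A TypeVariable<Method> obtained from Method.getTypeParameters()
        // would be rejected by the compiler if passed to describe().
    }
}
```

With the non-generic version of the interface, describe() could only take a plain TypeVariable and would have to check the declaration's kind at run time.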

Daniel Jimenez

Posts: 40
Nickname: djimenez
Registered: Dec, 2004

Re: Generics: Bounds Puzzle Posted: Oct 24, 2005 12:03 PM
Maybe a hint can be found by browsing some more JDK source ...

There are three classes in the JDK that implement GenericDeclaration: Class, Method, and Constructor.

interface GenericDeclaration { TypeVariable<?>[] getTypeParameters(); }


It seems that TypeVariable is parameterized in order to allow each of those subclasses to declare its implementation of getTypeParameters() to return a TypeVariable array parameterized on itself. (I believe Roland K is saying the same thing, only more compactly.)

E.g.,
class Method { public TypeVariable<Method>[] getTypeParameters(); }


This doesn't appear to be simply a matter of using generics to define covariant return types (it might be if GenericDeclaration were itself parameterized); I'm learning to consider that inappropriate usage since JDK 5 allows covariant returns without demanding generic declarations.

Of course, looking at what currently happens to be there isn't a complete answer of what possibilities the authors are trying to represent, but this seems to clear up some of the confusion.

I do have two additional questions though. Aren't parameterized arrays illegal (as in the return type of getTypeParameters())? How does the compiler get from TypeVariable<?>[] to TypeVariable<Method>[] when defining class Method?
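
A possible answer to both questions, sketched with hypothetical stand-in types (Decl for GenericDeclaration, Var for TypeVariable, Meth for Method). Declaring an array type with a parameterized component is legal; what the compiler rejects is creating one directly with `new Var<Meth>[n]`. And because arrays are covariant, Var<Meth>[] is a subtype of Var<?>[], so the narrower return type is an ordinary covariant override.

```java
class CovariantArrayDemo {
    // Hypothetical stand-in for GenericDeclaration.
    interface Decl {
        Var<?>[] vars();
    }

    // Hypothetical stand-in for TypeVariable<D>.
    static final class Var<D extends Decl> {
        final D owner;
        Var(D owner) { this.owner = owner; }
    }

    // Hypothetical stand-in for Method.
    static final class Meth implements Decl {
        // Covariant return: Var<Meth>[] is a subtype of Var<?>[].
        @Override
        public Var<Meth>[] vars() {
            // "new Var<Meth>[1]" does not compile, so the array is created
            // with a wildcard component type and then cast (unchecked).
            @SuppressWarnings("unchecked")
            Var<Meth>[] result = (Var<Meth>[]) new Var<?>[] { new Var<Meth>(this) };
            return result;
        }
    }

    public static void main(String[] args) {
        Meth m = new Meth();
        Meth owner = m.vars()[0].owner;         // no cast through the specific type
        Decl general = m;
        Var<?>[] viaInterface = general.vars(); // same array through the interface
        System.out.println(owner == m);          // true
        System.out.println(viaInterface.length); // 1
    }
}
```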

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Re: Generics: Bounds Puzzle Posted: Oct 24, 2005 12:08 PM
> It's interesting to me that your regex won't catch
> anything that's not on the same line. I use regex all day,
> every day (with Java). Does Python handle the .*?
> differently than other languages? In my experience that
> will go right over the top of everything until that next
> character is matched.
>
> Maybe there are no line spanning generic declarations?
>
> This expression may work better for you, even though in
> theory they do the same thing(ish).
>
> genericArglist = re.compile('<[^>]*')

Yes, you can span lines, and generic definitions do. I may have been overly lazy. I would probably have said:

genericArglist = re.compile('<[^>]*>')

But neither one of these solves the problem of nested generic definitions like:

class Foo<T extends Comparable<T>> { // ...

The regexp for that, I think, could get very hairy. There is a section that talks about embedded quoted strings in "Mastering Regular Expressions," but it's never made sense to me.

That said, I've often had good luck creating "multi-pass regular expressions," where the first pass pulls out part of what I'm looking for, then another pass looks for something within that. It's often much easier to create and maintain than trying to make one magical do-all regular expression.
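
A two-pass approach along these lines might look like the following sketch, where a regular expression only locates the start of a class or interface parameter list and a small depth counter then finds the matching '>'. All names (TwoPassDemo, balanced, parameterLists) are invented for the example, and it ignores comments, string literals, and generic methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoPassDemo {
    // Pass 1: a class or interface name followed by an opening '<'.
    private static final Pattern DECLARATION =
            Pattern.compile("\\b(?:class|interface)\\s+\\w+\\s*<");

    // Pass 2: starting at the '<' at index open, track nesting depth
    // until the matching '>' is reached. Returns null if unbalanced.
    static String balanced(String text, int open) {
        int depth = 0;
        for (int i = open; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<') {
                depth++;
            } else if (c == '>' && --depth == 0) {
                return text.substring(open, i + 1);
            }
        }
        return null;
    }

    // Collects the parameter list of every class/interface declaration found.
    static List<String> parameterLists(String source) {
        List<String> found = new ArrayList<String>();
        Matcher m = DECLARATION.matcher(source);
        while (m.find()) {
            String list = balanced(source, m.end() - 1); // m.end() - 1 is the '<'
            if (list != null) {
                found.add(list);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String source = "class Foo<T extends Comparable<T>> {}\ninterface Bar<\n  K,\n  V\n> {}";
        for (String list : parameterLists(source)) {
            System.out.println(list);
        }
    }
}
```

Because the counter works on the whole text rather than line by line, nesting depth and line breaks stop being special cases.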

Patrick Mellinger

Posts: 2
Nickname: entername
Registered: Oct, 2005

Re: Generics: Bounds Puzzle Posted: Oct 24, 2005 1:06 PM
I'm really not that versed in Comparable, or generics for that matter, but I think what you are looking for can be solved with a simple conditional, assuming that you can't nest more than one generic into the declaration (I really have no idea whether or not you can do that). At least, I think, this would be where I would start.

genericArglist = re.compile('<[^>]*(?:<[^>]*>)?[^>]*>')

If you can indeed nest more than once in there, then the conditional portion can be repeated a few times. This will decrease performance or crash your app if you get too many asterisks in there.
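
Trying the posted pattern in Java suggests one catch: [^>]* also consumes '<', so the optional group never gets a chance and the match ends at the first '>'. The sketch below compares it with a variant that excludes both brackets from the filler text. The variant is an assumption about what was intended, not something from the thread, and it still handles only one level of nesting; NestedOnceDemo and its members are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NestedOnceDemo {
    // The pattern as posted above.
    static final Pattern POSTED =
            Pattern.compile("<[^>]*(?:<[^>]*>)?[^>]*>");

    // Variant: filler may contain neither '<' nor '>', and any number of
    // inner <...> groups (one level deep) may appear between filler runs.
    static final Pattern ONE_LEVEL =
            Pattern.compile("<[^<>]*(?:<[^<>]*>[^<>]*)*>");

    // Returns the first match of the pattern in the text, or null.
    static String firstMatch(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "class Foo<T extends Comparable<T>> {";
        System.out.println(firstMatch(POSTED, line));    // <T extends Comparable<T>
        System.out.println(firstMatch(ONE_LEVEL, line)); // <T extends Comparable<T>>
    }
}
```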

Bruce Eckel

Posts: 875
Nickname: beckel
Registered: Jun, 2003

Re: Generics: Bounds Puzzle Posted: Oct 24, 2005 1:13 PM
> The interface itself may not need the generic declaration,
> but perhaps a client would like to constrain the
> implementations of it that it may receive and thus would
> like to impose it as a restriction (i.e. only allow
> classes that implement this interface using a certain
> subclass of GenericDeclaration)

I think these are actually two separate use cases. One for returning the exact type, and another for constraining the use of the interface.

I think that one of the complexity issues with generics is that you usually combine several of these idioms in any generic code. What I think would be interesting is to make a list of all the separate idioms (for example, covariance, contravariance, wildcard capture, etc.), partly to ensure that I cover them all, but also to express each one independently, so you know what that particular idiom is for.

Howard Lovatt

Posts: 321
Nickname: hlovatt
Registered: Mar, 2003

Re: Generics: Bounds Puzzle Posted: Oct 25, 2005 4:46 AM
You raise two issues:

1. Why is TypeVariable declared as it is?

1A. So that if you implement TypeVariable you need to give a GenericDeclaration as the generic (type) parameter, otherwise TypeVariable does not make sense - it is giving info. about a GenericDeclaration.

1B. So that if you program to TypeVariable then you don't need to cast when you call getGenericDeclaration.

2. How can you parse the source to find generic declarations?

You need something more sophisticated than regexes (whether they be in Python, Perl, awk, or Java). Try StreamTokenizer, e.g.:

package findgenerics; // Finds generic declarations but ignores generic usage, comments, etc.
 
import java.io.*;
import static java.io.StreamTokenizer.*;
import static java.lang.System.*;
import java.util.StringTokenizer;
 
public class Main {
    private static final String input =           // test input
            // should be found
            "class Foo< T1 > { ... }\n" +         // standard class
            "interface Foo\n< T2\n>\n{ ... }" +   // interface with line break
            "class Foo< /* comment1 */ T3 // comment2\n> { ... }\n" + // class with comments
            "{\n< T4 > void bar() { ... }\n" +    // generic method first thing in body
            ";\n< T5 > void bar() { ... }\n" +    // generic method following a field etc.
            "}\n< T6 > void bar() { ... }\n" +    // generic method following another method etc.
            "class Foo< T7 extends Comparable< Foo > > { ... }\n" + // embedded generic
            // shouldn't be found
            "\"class Foo< T8 > { ... }\"\n" +     // string
            "if ( x < T9 ) { ... }\n" +           // if with <
            "new List< T10 >() { ... }\n" +       // generic inner class (usage)
            "List< T11 > l = new ArrayList< T11 >() {};\n" + // use of generics (difficult case!)
            "class Foo { ... }";                  // no generics
    private static final StreamTokenizer st = new StreamTokenizer( new StringReader( input ) ); // tokenize the input
    private static boolean isPossibleGeneric = false; // start of generic declaration possibly found
    private static boolean braceFound = false;    // the < of the generic is found
    private static StringBuffer generic = null;   // the possible generic declaration for printing
    
    public static void main( final String[] notUsed ) throws IOException {
        st.resetSyntax();                         // start of setup of parser
        st.whitespaceChars( '\u0000', '\u0020' ); // normal def. of whitespace
        st.wordChars( '\u0020', '\u00FF' );       // make every other char a word - i.e. no chars are significant for parsing!
        st.slashSlashComments( true );            // ignore // comments
        st.slashStarComments( true );             // ignore /* comments
        st.quoteChar( '\'' );                     // ignore things between 's - probably doesn't matter!
        st.quoteChar( '"' );                      // ignore things between "s
        st.eolIsSignificant( true );              // separate line breaks - helps to display // comments
        st.ordinaryChar( '<' );                   // break input at < - significant characters
        st.ordinaryChar( '}' );                   // break input at }
        st.ordinaryChar( '{' );                   // break input at {
        st.ordinaryChar( ';' );                   // break input at ;
        while ( st.nextToken() != TT_EOF ) {
            // out.println( st + ", isPossibleGeneric = " + isPossibleGeneric + ", braceFound = " + braceFound );
            switch ( st.ttype ) {
                case TT_WORD:
                    if ( st.sval.startsWith( "class" ) || st.sval.startsWith( "interface" ) ) start(); // start of a new possible generic
                    else if ( isPossibleGeneric ) generic.append( st.sval ); // make the possible generic
                    break;                        // not inside a possible generic
                case '<':
                    if ( isPossibleGeneric ) {
                        generic.append( '<' );    // make the possible generic
                        braceFound = true;
                    }
                    break;                        // not inside a possible generic
                case TT_EOL:
                    if ( isPossibleGeneric ) generic.append( '\n' ); // make the possible generic
                    break;                        // not inside a possible generic
                case '\'':
                    isPossibleGeneric = false;    // reset - generics can't contain char constants
                    break;
                case '"':
                    isPossibleGeneric = false;    // reset - generics can't contain string constants
                    break;
                case '{':
                    if ( isPossibleGeneric && braceFound ) { // generic possibly found
                        newCheck: {               // check if a 'new' keyword is present
                            final String gs = generic.toString().trim();
                            for ( final StringTokenizer nc = new StringTokenizer( gs, " \t\n\r\f};=" ); nc.hasMoreElements(); )
                                if ( nc.nextToken().equals( "new" ) ) break newCheck;
                            out.println( "line " + st.lineno() + ": " + gs ); // print the generic
                            isPossibleGeneric = false; // reset for next generic
                        }
                    } else startNoKeyword();      // start of a new possible generic - generic method as 1st method in class
                    break;
                case '}':
                    startNoKeyword();             // start of a new possible generic - generic method following another method etc.
                    break;
                case ';':
                    startNoKeyword();             // start of a new possible generic - generic method following a field etc.
                    break;
                default:
                    throw new IllegalStateException( "Parsing error: " + st );
            }
        }
    }
    
    private static void start() { // start of possible new generic that is introduced by a keyword
        isPossibleGeneric = true;
        braceFound = false;
        generic = new StringBuffer( st.sval );
    }
    
    private static void startNoKeyword() { // start of possible new generic with no intro. keyword
        st.sval = "";
        start();
    }
}

Which gives:

line 1: class Foo<T1 >
line 5: interface Foo
<T2
>
line 6: class Foo</* comment1 */ T3 // comment2
>
line 8: <T4 > void bar()
line 10: <T5 > void bar()
line 12: <T6 > void bar()
line 13: class Foo<T7 extends Comparable<Foo > >

Sure, this is longer than a regex solution in any of the common languages, but it does work considerably better :)

Igor Mihalik

Posts: 1
Nickname: igm
Registered: Aug, 2005

Re: Generics: Bounds Puzzle Posted: Oct 26, 2005 1:05 AM
> > D getGenericDeclaration();
> > GenericDeclaration getGenericDeclaration();
>
> Maybe I'm missing something here, but it looks like the
> reason for using Generics here, is simply to avoid having
> to cast the return value of getGenericDeclaration() to
> type D (or, more generally, type safety).
>
> You want to say it's not a good enough reason? Fine, but
> then why don't you complain about Comparable<T>?
>
> Noam.

I also think this is the case.

Copyright © 1996-2019 Artima, Inc. All Rights Reserved.