Summary
One of the areas receiving a lot of attention for J7 is simplifying and extending the capabilities of inner classes or adding a new construct, the closure. This blog examines the options and compares the different proposals using a typical example, inject.
I have edited this blog to incorporate some improvements suggested by Neal
Gafter - thanks Neal.
Since this is my virgin blog I would like to thank Bill for agreeing to me
blogging and say that I will be blogging mostly about Design Patterns in Java,
particularly my own pet project, a Pattern
Enforcing Compiler™ (PEC™) for Java. However, this first blog
is about proposed new features for Java 7. Wow, we are talking about
7 and 6 isn't done and dusted yet!
One of the areas receiving a lot of attention for J7 is simplifying and extending
the capabilities of inner classes or adding a new construct, the closure. This
blog examines the options and compares the different proposals using a typical
example, inject. Amongst the options on the web are:
Do nothing — Mitts off my favourite language!
Add Closures - which are like inner classes but have read and write access
to local variables, and in which return exits from the enclosing method as
well as from the closure (some proposals support break and continue as
well as return). E.g. BGGA
Closures, proposed by Bracha, Gafter, Gosling,
and Ahe.
Add spiffy new syntax that makes anonymous inner class instance creation
shorter and lets inner classes have write access to local variables
(return acts as it currently does for an inner class). E.g. Concise Instance
Creation Expressions (CICE), proposed by Bloch, Lea, and Lee.
Shorter Syntax for Common Operations
in general but with some bias to make inner class syntax, in particular,
shorter and to give inner classes write access to locals (return acts
as it currently does for an inner class). E.g. SSCO Request For Enhancement
6389769, proposed by Yours Truly (note the wild mismatch in standing
between the authors of the first two proposals and the third :).
In the Appendix to this blog an alternative SSCO proposal to RFE 6389769 is
given and this proposal is used in the example given below. The example below
is straightforward, so you probably don't need to read the details of the
three proposals to understand the example. The details of the proposals are
however worth a read to understand the thought process behind the proposals.
Examples
Taking two examples will illustrate the four different approaches
(my apologies in advance if I don't quite get the syntax exactly correct for
other people's proposals). The examples are taken from some of my code, but
I think they are reasonably typical. I added to my original source code the
throwing of a generic exception (the original didn't throw anything), because
BGGA Closure makes a big deal of exception throwing. The examples are inject (also
called reduce, fold, or accumulate in
some libraries) and forEach (a.k.a. each). inject applies
a function to each element of a list and, when all the elements are processed,
returns the result, which is typically a scalar. Likewise forEach processes each element of a list, but does not return anything. Both
inject and
forEach are used in the examples for the common task of summing
a list of integers. Whilst I think inject and forEach are typical
methods, I am less convinced about the summing of integers example (I have never
actually done this with an inner class and I write scientific software).
However, summing of integers is a standard example.
Java inject:
Use:
final Integer sum = inject( input, 0,
new Injection< Integer, RuntimeException >() {
public void call( final Integer item ) { value += item; }
} );
Where:
public static < R, Es extends Throwable > R inject(
final Iterable< R > collection, final R initial,
final Injection< R, Es > injection )
throws Es {
injection.value = initial;
for ( final R item : collection ) { injection.call( item ); }
return injection.value;
}
public abstract class Injection< R, Es extends Throwable > {
public R value; // seeded by inject, so no constructor is needed
public abstract void call( R item ) throws Es;
}
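For anyone who wants to try the current-Java version, the fragments above can be assembled into one compilable unit along the following lines. This is only a sketch: the wrapper class InjectSketch, the helper method sum, and the sample list are my own additions for illustration, and Injection is written without a constructor because inject seeds value itself.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class (not in the original) so the fragments compile together.
class InjectSketch {
    // Holds the running result; inject() seeds it with the initial value.
    abstract static class Injection<R, Es extends Throwable> {
        public R value;
        public abstract void call(R item) throws Es;
    }

    static <R, Es extends Throwable> R inject(
            final Iterable<R> collection, final R initial,
            final Injection<R, Es> injection) throws Es {
        injection.value = initial;
        for (final R item : collection) { injection.call(item); }
        return injection.value;
    }

    // Sums a list with an anonymous inner class, as in the "Use" fragment.
    static Integer sum(final List<Integer> input) {
        return inject(input, 0, new Injection<Integer, RuntimeException>() {
            public void call(final Integer item) { value += item; }
        });
    }

    public static void main(final String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4))); // prints 10
    }
}
```

Because Es is bound to RuntimeException at the call site, the caller needs no try/catch even though inject is declared with throws Es.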
Java forEach:
Use:
final Tuple1< Integer > sum = tuple( 0 );
forEach( input, new Block1< Integer, RuntimeException >() {
public void call( final Integer item ) { sum.e1 += item; }
} );
Where:
public static < A, Es extends Throwable > void forEach(
final Iterable< A > collection, final Block1< A, Es > block )
throws Es {
for ( final A item : collection ) { block.call( item ); }
}
public interface Block1< A1, Es extends Throwable > {
void call( A1 item ) throws Es;
}
public class Tuples {
public static class Tuple1< E1 > {
public E1 e1;
public Tuple1( final E1 value ) { this.e1 = value; }
}
public static < E1 > Tuple1< E1 > tuple( final E1 value ) {
return new Tuple1< E1 >( value );
}
...
}
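The forEach fragments can likewise be put together as a single compilable sketch. The wrapper class ForEachSketch and the helper method sum are assumptions of mine, added only so the code can be run; the point to notice is that the local sum is a final reference to a mutable one-element tuple, which is what lets the inner class update it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class (not in the original) so the fragments compile together.
class ForEachSketch {
    interface Block1<A1, Es extends Throwable> {
        void call(A1 item) throws Es;
    }

    // Mutable one-element holder; a final reference to it can be captured by an inner class.
    static class Tuple1<E1> {
        public E1 e1;
        public Tuple1(final E1 value) { this.e1 = value; }
    }

    static <E1> Tuple1<E1> tuple(final E1 value) {
        return new Tuple1<E1>(value);
    }

    static <A, Es extends Throwable> void forEach(
            final Iterable<A> collection, final Block1<A, Es> block) throws Es {
        for (final A item : collection) { block.call(item); }
    }

    // Sums a list by mutating the captured tuple from inside the inner class.
    static int sum(final List<Integer> input) {
        final Tuple1<Integer> sum = tuple(0);
        forEach(input, new Block1<Integer, RuntimeException>() {
            public void call(final Integer item) { sum.e1 += item; }
        });
        return sum.e1;
    }

    public static void main(final String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));
    }
}
```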
BGGA inject:
Use:
final Integer sum = for inject( input, 0, final Integer item ) {
value += item;
};
Where:
public static < R, throws Es > R for inject(
final Iterable< R > collection, final R initial,
final Injection< R, Es > injection ) throws Es {
injection.value = initial;
for ( final R item : collection ) { injection.call( item ); }
return injection.value;
}
public abstract class Injection< R, Es extends Throwable > {
public R value;
public abstract void call( R item ) throws Es;
}
BGGA forEach:
Use:
Integer sum = 0; // cannot be final even if logically it should be
for each( input, final Integer item ) { sum += item; }
Where:
public static < A1, throws Es> void for each(
final Iterable< A1 > collection, final Block1< A1, Es > block )
throws Es {
for ( final A1 item : collection ) { block.call( item ); }
}
public interface Block1< A1, throws Es> { void call( A1 a1 ) throws Es; }
CICE inject:
Use:
final Integer sum = inject( input, 0,
Injection< Integer, RuntimeException >( final Integer item ) {
value += item;
} );
Where:
public static < R, Es extends Throwable > R inject(
final Iterable< R > collection, final R initial,
final Injection< R, Es > injection ) throws Es {
injection.value = initial;
for ( final R item : collection ) { injection.call( item ); }
return injection.value;
}
public abstract class Injection< R, Es extends Throwable > {
public R value;
public abstract void call( R item ) throws Es;
}
CICE forEach:
Use:
Integer sum = 0; // cannot be final even if logically it should be
forEach( input, Block1< Integer, RuntimeException >( final Integer item ) {
sum += item;
} );
Where:
public static < A1, Es extends Throwable > void forEach(
final Iterable< A1 > collection, final Block1< A1, Es > block )
throws Es {
for ( final A1 item : collection ) { block.call( item ); }
}
public interface Block1< A1, Es extends Throwable > { void call( A1 a1 ) throws Es; }
SSCO inject:
Use:
final Integer sum = inject input, 0, new.{ ( final item ) value += item };
Where:
public static < R, Throwable... Es > R inject(
final Iterable< R > collection, final R initial,
final Injection< R, Es > injection ) throws Es {
for ( final R item : collection ) injection.call item;
return injection.value
}
public abstract class Injection< R, Throwable... Es > {
public R value;
public Injection( final R initial ) value = initial;
public abstract void call( R item ) throws Es
}
SSCO forEach:
Use:
Integer sum = 0; // cannot be final even if logically it should be
forEach input, new().{ ( final item ) sum += item };
Where:
public static < A1, Throwable... Es > void forEach(
final Iterable< A1 > collection, final Block1< A1, Es > block )
throws Es
for ( final A1 item : collection ) block.call item;
public interface Block1< A1, Throwable... Es > void call( A1 item ) throws Es;
Vote
I obviously prefer SSCO, otherwise I wouldn't be proposing it! I think the
advantages are that it is consistent with current Java (this is also true of
CICE) and that it will not only help with instances of inner classes but will
also reduce verbosity in many places. E.g. it is the shortest code for both
use of the example and declaration of the example. But, which do you prefer?
Appendix
Acknowledgements: This new SSCO proposal has Rules taken
from or inspired by: RFE 6389769, the discussions in the 6389769 forum, Ruby,
and BGGA Closures.
Rule 0: For a method call allow brackets, (),
to be omitted if there are no arguments and for the last call in a chain of top
level calls
(method, constructor, or modified new, see Rules 1 and 2 below);
provided that the call isn't ambiguous with respect to a field. Top Level means
the call isn't an argument to another call. Note braces start a new statement
with a new top level; if a new statement is started without braces then this
does not create a new top level (see E.g. 2 for Rule 5 below). The last call
means that for the top level calls only: it is either the only call (and that
call isn't qualified) or it is the call following the last dot. The last call
rule means it must follow the last dot, even if that dot is a qualifier dot
rather than a call dot; so that an integer argument is distinct from a real
argument (see last example Rule 2). E.g.:
Rule 1: Allow new to be used as a qualifier
with a . (dot). This rule improves readability, is more consistent
with the rest of Java, and (importantly) enables Rule 2. E.g.:
ArrayList< String > strings = new.ArrayList< String >.add "A";
T[] numbers = (T[])new.Number[ 1 ]; // where T extends Number
Which are equivalent to:
ArrayList< String > strings = (new ArrayList< String >()).add( "A" );
T[] numbers = (T[])(new Number[ 1 ]); // where T extends Number
Rule 2: Allow new to infer type and use <> to
mean a raw type when a . is used (but not for a space). This rule
considerably shortens generic declarations and also instances of anonymous
inner classes. E.g.:
Rule 3: If an expression ends in } then there
is no need for ; before the brace (like array initializers and
argument lists). This rule helps the readability of instances of anonymous
classes. E.g.:
if ( end ) { break }
Rule 4: Consistently make {} optional for a
single line (not just if etc.). This rule helps the readability
of instances of anonymous classes, see last example Rule 5, and shortens many
simple class and method declarations. E.g.:
class IntWrapper public int i;
@Override public String toString() return "Hello"; // Also see Rules 9 & 10 below
Rule 5: Infer return type, method name, arguments and exceptions
if a single method is abstract (works for interfaces or abstract classes) and
it isn't ambiguous. In particular it may be necessary to supply the type of
the method arguments, see example at end. The brackets around the argument
list, even for an empty argument list cannot be omitted; this is needed to
distinguish between a method declaration using the new syntax and an instance
initializer. This rule helps the readability of instances of anonymous classes.
E.g.:
// E.g. 1: Using Rule 5 only
class MyAction implements ActionListener {
( notUsed ) {
textArea.append( textField.getText() );
}
}
// E.g. 2: Using Rules 0 to 5
textField.addActionListener new.{ ( notUsed ) textArea.append textField.getText() };
// Or
textField.addActionListener new.( notUsed ) textArea.append( textField.getText() );;
Rule 6: Allow extends type generic arguments
to be shortened to < Type name > (i.e. like normal
variable declarations), e.g.:
public static < K, V > Map< K, V > unmodifiableMap( Map< K ?, V ? > map ) ...
public class Enum< Enum< E > E > ...
Which are equivalent to
public static < K, V > Map< K, V > unmodifiableMap( Map< ? extends K, ? extends V > map ) ...
public class Enum< E extends Enum< E > > ...
Rule 7: A throws clause can be empty, which
means the method doesn't throw anything (equivalent to an absent throws clause).
This is useful in conjunction with generics, see rule 8 below. E.g.:
Void call() throws;
Rule 8: Generics are extended to allow varargs (only for
use with Throwables and derivatives). An empty generics varargs
list is allowed and it is equivalent to an absent throws clause (note Rule
7 above). E.g.:
< R, Es extends Throwable... > R call() throws Es;
< R, Throwable... Es > R call() throws Es; // See Rule 6 above
Rule 9: return may be omitted for last line
of a method and the returned value is the value of the last expression, e.g.:
@Override public String toString() "Hello"; // Also see rule 10
Rule 10: @Override name infers modifiers,
return type, argument types, and exceptions (like Rule 5 above), e.g.:
@Override toString() "Hello";
@Override equals( other ) ...
Rule 11: Non-final locals that are referenced by an inner
class are automatically wrapped in a final-tuple instance, like C# 3.0 does.
Note: non-final arguments need special treatment, namely a name-mangled temporary
(not shown in the example below). E.g.:
Given:
public class Tuple1< E1 > {
public E1 e1;
public Tuple1( final E1 e1 ) this.e1 = e1
}
public class Tuple2< E1, E2 > extends Tuple1 ...
...
public interface Function1< R, A1, Throwable... Es > R call( A1 a1 ) throws Es;
public interface Predicate1< A1, Throwable... Es > extends Function1< Boolean, A1, Es >;
...
public static < T, Throwable... Es > List< T > select( final Iterable< T > c, final Predicate1< T ?, Es > f ) throws Es ...
Then:
String beginning = "Fred";
ArrayList< String > names = new.add "Frederick";
ArrayList< String > filtered = select names, new.{ ( name ) name.startsWith beginning };
Which is equivalent to the following verbose version:
final Tuple1< String > beginning = new Tuple1< String >( "Fred" );
ArrayList< String > names = (new ArrayList< String >()).add( "Frederick" );
ArrayList< String > filtered = select( names, new Predicate1< String >() {
public Boolean call( final String name ) {
return name.startsWith( beginning.e1 );
}
} );
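For reference, here is one way the verbose version could be made to run in current Java. It is a sketch under several assumptions: the body of select is elided in the text above, so the filtering loop here is my guess at the obvious implementation; the exception type parameter is dropped for brevity; and SelectSketch and namesStartingWith are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class; simplified from the declarations given above.
class SelectSketch {
    static class Tuple1<E1> {
        public E1 e1;
        public Tuple1(final E1 e1) { this.e1 = e1; }
    }

    // Exception parameter omitted here to keep the sketch short.
    interface Predicate1<A1> {
        Boolean call(A1 a1);
    }

    // Assumed implementation: keep the items for which the predicate is true.
    static <T> List<T> select(final Iterable<T> c, final Predicate1<T> f) {
        final List<T> result = new ArrayList<T>();
        for (final T item : c) {
            if (f.call(item)) { result.add(item); }
        }
        return result;
    }

    static List<String> namesStartingWith(final List<String> names, final String prefix) {
        // The wrapping Rule 11 would do automatically: the local lives in a final tuple.
        final Tuple1<String> beginning = new Tuple1<String>(prefix);
        return select(names, new Predicate1<String>() {
            public Boolean call(final String name) {
                return name.startsWith(beginning.e1);
            }
        });
    }

    public static void main(final String[] args) {
        System.out.println(namesStartingWith(Arrays.asList("Frederick", "Alice", "Freda"), "Fred"));
    }
}
```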
Rule 12: If a method returns a Void, then make
the end of a method without a return and a return without
an argument synonymous with return null. E.g. the following are
identical.
Given:
interface Function0< R, Throwable... Es > R call() throws Es;
Then:
Function0< Void > block1 = new.{ () };
Function0< Void > block2 = new.{ () return };
Function0< Void > block3 = new.{ () return null };
Function0< Void > block4 = new Function0< Void >() {
public Void call() { return null; }
};
Are all the same.
> I think that if those are the simplest ways to call inject under all of the proposals, Bruce Tate is right. It's time to move on.
This is an option - inner classes are quite good on their own.
> Is this really supposed to compile or are there characters missing?
> forEach input, new().{ ( final item ) sum += item };
This is meant to compile. The steps are:
1. The last Top Level method can have brackets omitted, forEach( input, new().{ ( final item ) sum += item } );
2. The semicolon before a close brace can be omitted, forEach( input, new().{ ( final item ) sum += item; } );
3. Braces can be omitted for single statement methods, forEach( input, new().{ ( final item ) { sum += item; } } );
4. The types etc. of an abstract method can be omitted if there is just one abstract method, forEach( input, new().{ public A call( final A item ) { sum += item; } } );
5. new infers its type including generic arguments, forEach( input, new Block< Integer, RuntimeException >() { public Integer call( final Integer item ) { sum += item; } } );
6. sum is automatically wrapped in a final tuple, forEach( input, new Block< Integer, RuntimeException >() { public Integer call( final Integer item ) { sum.e1 += item; } } );
So the result in current Java is:
final T1< Integer > sum = new T1< Integer >( 0 );
forEach( input, new Block< Integer, RuntimeException >() {
public Integer call( final Integer item ) { sum.e1 += item; }
} );
Which as you say isn't too bad. I think the driver for this is competition from other languages, particularly Ruby, which has many libraries that use this style of coding, and the results are easy-to-use libraries. Swing would be another obvious beneficiary, with all its event handlers, ActionListener etc.
> Because all of the syntax you've shown me is more painful than just doing without.
> OTOH, you might borrow something from javascript here and allow functions to be first class objects - either anonymous or named.
I see it differently in some respects, in particular:
1. You have a single concept, the class
2. You can Serialize, Clone, Annotate, etc. a class
3. You can have multiple methods, e.g. a toString is commonly useful
4. You can inherit a partial implementation from another class
5. You can have fields
But in one respect, the proposed new syntax is similar to JavaScript's function syntax.
> Otherwise, I'd bag it and just use what you really want - go program in Smalltalk. That's what you're trying to 'fake' anyhow.
The BGGA Closure proposal is definitely from people who are fans of Smalltalk. My own bias is that I like inner classes more than closures or functions as outlined above. In fact I would go as far as to say:
Closure | First Class Function == Poor Man's Inner Class
All of the proposals are a bit painful syntactically. They are different enough from Java to make recognition hard, but don't achieve the simplicity you get in other languages. E.g., in Scala:
var sum = 0
input foreach (x => sum = sum + x)
Languages like Python, Ruby, Smalltalk are similarly concise. There's a popular claim that it is the dynamic typing that achieves the conciseness. The above example shows that static typing need not lead to bulky syntax (in fact, Haskell has demonstrated this all along).
Syntax aside, I agree with the idea that closures should be objects, and that therefore function types should be classes.
You raise a good point, i.e. maybe the syntax needs to be more radically different to make the change worthwhile.
I assume you teach Scala, therefore can I ask if the students have a problem with the difference between the following two examples?
def findIndex( inputs : Findable[] ) : Findable = {
for ( var item : inputs ) if ( item.isFound ) return item;
null;
}
def findIndex( inputs : Findable[] ) : Findable = {
var found = null;
inputs foreach ( item => if ( item.isFound ) {
found = item;
return; // Do students think this returns from findIndex like return in first example does?
} )
found;
}
(The examples are meant to be in Scala - hopefully I got the syntax at least close.)
Also - how do declarations like var found = null; work in Scala - does it need to look ahead to first and subsequent uses? I.e. like Haskell.
Howard,
> I assume you teach Scala, therefore can I ask if the students have a problem with the difference between the following two examples?
def findIndex( inputs : Findable[] ) : Findable = {
for ( var item : inputs ) if ( item.isFound ) return item;
null;
}
def findIndex( inputs : Findable[] ) : Findable = {
var found = null;
inputs foreach ( item => if ( item.isFound ) {
found = item;
return; // Do students think this returns from findIndex like return in first example does?
I hope so, because return does return from findIndex, just like the first example does.
> Also - how do declarations like var found = null; work in Scala - does it need to look ahead to first and subsequent uses? I.e. like Haskell.
No, Scala's type inference is purely local. So you have to write var found: Findable = null. If you don't give a type, it will infer the type of the right hand side, i.e. Null.
var array = [new SomeObject(), new SomeObject(), ...
where array is inferred to be ArrayList<SomeObject>
"foreach" sum:
int value = 0;
array.each( \elt -> value += elt.getSomeIntValue() );
"inject" sum:
sum = array.inject( 0, \value, elt -> value + elt.getSomeIntValue() );
The closure types are inferred from the data structure's statically inferred type in the first case, and the static type of the first arg plus the data structures inferred type in the second. It *DOESN'T* have to be any more complicated than this, kids.
As hard as Sun is trying to prove it, static typing and syntactic sanity are not mutually exclusive.
Thanks for clarifying Scala behaviour. I hadn't appreciated that return returns from the enclosing method when a Scala closure is used; I had assumed the Java-like behaviour, where return exits only the innermost method.
The local type inference is probably a good idea also, keeps things simple.
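To make the Java-like behaviour concrete, here is a small sketch in current Java. All the names (ReturnSketch, findWithLoop, findWithInnerClass) are hypothetical and strings stand in for the Findable type of the example above. With the plain loop, return leaves the whole method at the first match; inside the inner class, return only leaves call, so forEach carries on and a later match overwrites the earlier one.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of what "return" exits in a loop versus an inner class.
class ReturnSketch {
    interface Block1<A1> {
        void call(A1 item);
    }

    static <A> void forEach(final Iterable<A> c, final Block1<A> block) {
        for (final A item : c) { block.call(item); }
    }

    // return exits findWithLoop itself, so the first match wins.
    static String findWithLoop(final List<String> inputs, final String prefix) {
        for (final String item : inputs) {
            if (item.startsWith(prefix)) { return item; }
        }
        return null;
    }

    // return exits call() only; iteration continues, so the last match wins.
    static String findWithInnerClass(final List<String> inputs, final String prefix) {
        final String[] found = { null }; // one-element array as a mutable holder
        forEach(inputs, new Block1<String>() {
            public void call(final String item) {
                if (item.startsWith(prefix)) {
                    found[0] = item;
                    return; // leaves call(), not findWithInnerClass
                }
            }
        });
        return found[0];
    }

    public static void main(final String[] args) {
        final List<String> inputs = Arrays.asList("apple", "avocado", "banana");
        System.out.println(findWithLoop(inputs, "a"));       // prints apple
        System.out.println(findWithInnerClass(inputs, "a")); // prints avocado
    }
}
```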
Your suggestion is similar to RFE 6389769. Below the example is given using variations on the SSCO theme (and dropping final, giving the initial value as an argument to inject, and reducing white space to be consistent with your example):
SSCO RFE 6389769 like syntax
var sum = 0;
forEach list, (item) {sum += item};
final sum2 = inject list, 0, { ( item ) value += item };
SSCO this blog syntax
Integer sum = 0;
forEach list, new.{(item) sum += item};
final Integer sum2 = inject list, 0, new.{(item) value += item};
SSCO Scala like syntax
var sum = 0;
forEach list, {item => sum += item};
final sum2 = inject list, 0, {item => value += item};
The Scala like syntax is probably the neatest. The only real drawback is that if it is used to mean an inner class then will people mistakenly write this:
What is all that forEach stuff garbaging up the code?
I'm just a simple caveman, but it seems like the core idea here is summing up the values of an array. If you want to do that, all you should have to do is say: "take each element in this array and add it to a sum value"
array.each( \elt -> sum += elt );
Array. Operation. Name binding. Expression.
That's it. No more. Nada. Nothing. Hitting another key on the keyboard should be morally offensive to us all. It's certainly morally offensive to me.
And ask about why it is called forEach and not each and why it isn't a member, also you need to use Iterable and I think your proposed syntax needs modifying:
1. The each/forEach is just a preference, it's just a name.
2. The method can't be a member of Iterable because Iterable is widely used and therefore changing it would break code. Sure, you could write a replacement for Iterable but an alternative is a static method in Collections; which is what I have assumed.
3. In Java you would also need to be talking primarily about an Iterable rather than an array. The only place that Java currently boxes an array into an Iterable is in for ( item : array ) .... (Actually it doesn't box - but it appears to.)
4. I think using the Haskell-like lambda calculus expression requires an extra semicolon to mark the end of the expression, i.e. your example would need to be modified for an enhanced Java syntax to:
each list, \elt -> sum += elt; ;
or
each list, {\elt -> sum += elt};
Which isn't that different from other proposals, e.g.:
each list, {elt => sum += elt};
or
each list, new.{(elt) sum += elt};
The differences between all these options are minor and largely a matter of taste, which is heavily influenced by what other languages you know.
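Point 3 above can be checked in current Java with a few lines. This is only an illustration and the class and method names are made up: an array works directly in the enhanced for loop, but it is not an Iterable, so it has to be wrapped (here with Arrays.asList) before it can be passed to a method that takes one.

```java
import java.util.Arrays;

// Hypothetical illustration: arrays work in for-each but are not Iterable.
class ArrayIterableSketch {
    static int sum(final Iterable<Integer> items) {
        int total = 0;
        for (final Integer item : items) { total += item; }
        return total;
    }

    static int sumArray(final Integer[] items) {
        // for ( Integer item : items ) would compile here, since for-each accepts arrays,
        // but sum( items ) would not: Integer[] does not implement Iterable.
        return sum(Arrays.asList(items)); // wrap the array in a List view first
    }

    public static void main(final String[] args) {
        System.out.println(sumArray(new Integer[] { 1, 2, 3 }));
    }
}
```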
> And ask about why it is called forEach and not each and why it isn't a member, also you need to use Iterable and I think your proposed syntax needs modifying:
Bwha?
> 1. The each/forEach is just a preference, its just a name.
_shrug_ Sure. I prefer less typing, but I'm wacky like that.
> 2. The method can't be a member of Iterable because Iterable is widely used and therefore changing it would break code. Sure, you could write a replacement for Iterable but an alternative is a static method in Collections; which is what I have assumed.
I see no reason why we can't make improvements to the existing data structure classes. A good 50% of the whole constellation of supporting interfaces and classes that surround the core java collections classes could be done away with in day to day programming if we baked reasonable closure-based methods (map, etc.) into the core. Why on earth should anyone EVER have to type the word Iterator? What year is it?
> 3. In Java you would also need to be talking primarily about an Iterable rather than an array. The only place that Java currently boxes an array into an Iterable is in for ( item : array ) .... (Actually it doesn't box - but it appears to.)
Frankly, I don't care. Sun owns the language. They can add whatever features they like and it doesn't take much imagination to think up how to weld reasonable closure methods onto the existing collections in a backwards compatible manner. Hell, maybe they could even add mixins to do it... (Heaven forfend!)
> 4. I think using the Haskel like lamda calculus expression requires an extra semicolon to mark the end of the expression, i.e. your example would need to be modified for an enhanced Java syntax to:
> each list, \elt -> sum += elt; ;
> or
> each list, {\elt -> sum += elt};
Why on earth should it? It's just a '\' character. Sun owns the parser. They can do whatever they like. The rule I would adopt is: an expression closure body can stand alone without any additional syntax. A statement list closure must be enclosed in {}'s. Simple, easy to parse, and it makes the common case beautiful. The BNF is easy enough:
Again, it is Sun's parser. They can do whatever the hell they want with it.
> Which isn't that different from other proposals, e.g.:
> each list, {elt => sum += elt};
> or
> each list, new.{(elt) sum += elt};
"each" (and "map" and "inject", etc.) should be baked into lists/arrays/whatever-you-want-to-call-your-sequential-datastructure. It's the object oriented thing to do. Use ruby seriously for a month and you'll agree.
> The differences between all these options is minor and largely a matter of taste, which is heavily influenced by what other languages you know.
Nope. Syntax matters. Which is why ruby, despite all of its glaring problems, is gaining traction. The more insane syntax java piles on to deal with a solved problem, the closer to iCobol it will become.
>Closure | First Class Function == Poor Man's Inner Class
So you prefer more complexity in your language? Because that's what you are advocating.
Smalltalk blocks are just chunks of code - similar to CompiledMethods - which are also implemented as objects. Once you dig into it you find that the system is built out of one thing. Systems like that are exceptionally powerful and profound.
Versus the dozen or so things with different semantics you find in Java. Which means the language is much more complicated to grasp, it has many more rules, yet it is still less expressive. In general - more things implies less power.
Inner classes are definitely a poor man's closures. Because they are special cases and special cases are weaknesses.