Sponsored Link •
|
Summary
In a previous weblog a suggestion for reifying lambda types was given. The previous weblog give examples but no formal description of the process. This weblog gives a more formal description.
Advertisement
|
In a previous weblog a proposal for reified lambdas was given, http://www.artima.com/weblogs/viewpost.jsp?thread=277879. This weblog builds on the previous weblog and assumes familiarity with the previous weblog. Where the details present in this weblog contradict details in the previous weblog, this one supersedes the previous. In effect this weblog is an improved version of the previous.
Expanding on the previous weblog was suggested by Neal Gafter (http://mail.openjdk.java.net/pipermail/lambda-dev/2010-January/000218.html). This weblog is deliberately written in a 'dry' style and doesn't introduce lambdas or justify reification of them, other than section 1.1 (See previous weblog for an 'easier' read and discussions of these topics.)
The writing style is not saying 'we could do X', instead it says 'do X', this is purely a writing convenience and is not meant to imply that this is in any way an accepted proposal. The proposal is simply a proposal and has no formal backing from Sun (or anyone else for that matter).
The text below is not strictly a formal description of the processes, but it is hoped that it is sufficiently detailed to tease out any problems. Since the aim of this post is to tease out problems, feedback is most welcome.
The following source:
static double reduce(final double[] ds, double initial, final #double(double, double) reducer) { for (final double d : ds) { initial = reducer.callPrimitive(initial, d); } return initial; } static <T> T reduce(final T[] ts, T initial, final #T(T, T) reducer) { for (final T t : ts) { initial = reducer.call(initial, t); } return initial; } static <T> void time(final #T() c) { final long start = System.currentTimeMillis(); final T result = c.call(); final long finish = System.currentTimeMillis(); System.out.println(c + " took " + (finish - start) + " ms and returned " + result); } static void timeTest() { final int size = 1024 * 1024; final double[] ds = new double[size]; final Double[] ts = new Double[size]; for (int i = 0; i < size; i++) { ts[i] = ds[i] = i; } final #double(double, double) sumDs = #double(final double a1, final double a2) { return a1 + a2; }; time(new #double() { // Not supported in this proposal as yet @Override public double callPrimitive() { return reduce(ds, 0, sumDs); } @Override public String toString() { return "double[] test"; } }); final #Double(Double, Double) sumTs = new #Double(final Double a1, final Double a2) { return a1 + a2; }; time(new #Double() { // Not supported in this proposal as yet @Override public Double call() { return reduce(ts, 0.0, sumTs); } // note 0.0 - annoying inference bug! @Override public String toString() { return "Double[] test"; } }); }
Is translated to:
static double reduce(final double[] ds, double initial, final _Callable_double_double_double reducer) { for (final double d : ds) { initial = reducer.callPrimitive(initial, d); } return initial; } static <T> T reduce(final T[] ts, T initial, final _Callable_2<? extends T, ? super T, ? super T> reducer) { for (final T t : ts) { initial = reducer.call(initial, t); } return initial; } static <T> void time(final _Callable_0<T> c) { final long start = System.currentTimeMillis(); final T result = c.call(); final long finish = System.currentTimeMillis(); System.out.println(c + " took " + (finish - start) + " ms and returned " + result); } static void timeTest() { final int size = 1024 * 1024; final double[] ds = new double[size]; final Double[] ts = new Double[size]; for (int i = 0; i < size; i++) { ts[i] = ds[i] = i; } final _Callable_double_double_double sumDs = new _Callable_double_double_double() { @Override public double callPrimitive(final double a1, final double a2) { return a1 + a2; } }; time(new _Callable_double() { @Override public double callPrimitive() { return reduce(ds, 0, sumDs); } @Override public String toString() { return "double[] test"; } }); final _Callable_Dble_Dble_Dble sumTs = new _Callable_Dble_Dble_Dble() { // Compiler bug! Should be Double not Dble @Override public Double call(final Double a1, final Double a2) { return a1 + a2; } }; time(new _Callable_Dble() { // Compiler bug! Should be Double not Dble @Override public Double call() { return reduce(ts, 0.0, sumTs); } // note 0.0 - annoying inference bug! @Override public String toString() { return "Double[] test"; } }); }
And when run on my laptop gives:
double[] test took 20 ms and returned 5.497552896E11 Double[] test took 588 ms and returned 5.497552896E11
Which demonstrates the considerable performance increase reified lambdas offer (erased lambdas would perform similarly to the Double case above).
At this point in time the following issues are not addressed, they are open issues, and a brief note about there status given. These open issues are put into three groups of ones that are specific to this specification, ones that might influence the choice of one implementation over another, and ones that are largely orthogonal to the implementation. This categorization is open to interpretation and you could argue about the placement of an issue within in a particular group.
To java.lang add:
public interface Wrapper { Object getWrappee(); } public static final class Wrappers { private Wrappers() {} public static Object unWrap(Object from) { if (from == null) { return null; } while (from instanceof Wrapper) { from = ((Wrapper)from).getWrappee(); } return from; } public static boolean equals(final Object lhs, final Object rhs) { return unWrap(lhs) == unWrap(rhs); } } public static abstract class AbstractCallable { @Override public final boolean equals(final Object other) { return Wrappers.equals(this, other); } }
The following types are added by the class loader to java.lang as needed (the compiler internally also generates these classes but does not emit these classes). The reason these classes are dynamically generated by the class loader is twofold:
I haven't spelled out how the loader (or compiler) generates the required classes in detail, but two short paragraphs are given in the next two sub-sections on this topic. The reasons for not spelling this out is to save space (it is reasonably obvious what to do). The processes is to generate the type from its name, the name encodes all the information needed to generate the type.
Generic interfaces are used when any types involved are generic and as a base type for the reified types. These interfaces are interfaces that define a generic-abstract call method with n arguments and follow the naming convention _Callable_ n, e.g.:
public interface _Callable_0<R> { R call(); } public interface _Callable_1<R, A1> { R call(A1 a1); }
When the loader sees in a class file the interface _Callable_ n it first checks to see if it already has this interface and if not generates the appropriate interface. The name _Callable_ n gives the loader all the information it requires (which is simply n, the number of arguments). The name _Callable_ n is unique, since other _Callable_ types (described below) do not contain the integer n since an integer is not a valid type name.
The following description of dynamic classes uses '...' to indicate part of a name that is dependent upon the type of the lambda, e.g. #Object(String) has types Object and String and '...' is replaced by Object_String.
When the loader sees in a class file the class _Callable_... it first checks to see if it already has this class and if not generates the appropriate abstract class. The name _Callable_... gives the loader all the information it requires (in particular the '...' part encodes the reified types needed). The reified _Callable_... names are distinguished from generic _Callable_ n because n (an integer) is not a type name.
Reified, non-primitive, abstract classes are used when there are no generic types and no primitive types. These classes are abstract classes that define a reified-abstract call method via implementing the equivalent generic _Callable_ n classes with the following naming convention _Callable_..., e.g.:
public abstract class _Callable_String extends AbstractCallable implements _Callable_0<String> {} public abstract class _Callable_Object_String extends AbstractCallable implements _Callable_1<Object, String> {}
Note the extension of AbstractCallable class which defines an equals method, the reason for this is that some of the lambdas get wrapped for type conversion purposes (see below for more details).
Reified, primitive, abstract classes are used when there are no generic types and some primitive types. These classes are abstract classes that define an reified-abstract callPrimitive method via extending the equivalent non-primitive _Callable_... classes with the same naming convention as non-primitive (_Callable_...), e.g.:
public abstract class _Callable_int extends _Callable_Integer { @Override public final Integer call() { return callPrimitive(); } public abstract int callPrimitive(); } public abstract class _Callable_int_String extends _Callable_Integer_String { @Override public final Integer call(String a1) { return callPrimitive(a1); } public abstract int callPrimitive(String a1); } public abstract class _Callable_int_int extends _Callable_Integer_Integer { @Override public final Integer call(Integer a1) { return callPrimitive(a1); } public abstract int callPrimitive(int a1); }
Note that the call method is defined to call the callPrimitive method.
There are two variations of From class and the purpose of both these From classes is to 'convert' one lambda into another. Typically this conversion is achieved by wrapping the first lambda inside itself, but it could in the case of a wrapping and then unwrapping just give back the original lambda. Whether a From... class or a _Callable_... class (the original lambda) results, it is always reified lambda that results from a From conversion.
The reified _Callable_... abstract classes implement the equivalent generic _Callable_ n interface and therefore no 'cast' is needed from a reified lambda to a generic lambda.
The first of the two variations of the From classes are those that go in the reverse direction to their inheritance hierarchy, i.e. from a generic _Callable_ n interface to a the equivalent reified _Callable_... class (reifying From). The second variation of From class is for contravariant overriding of lambdas.
All From classes follow a naming convention: _From_...__To_... where the first '...' are as used in the 'from' _Callable_... and the second '...' are as used in the 'to' _Callable_... names. If the 'from' _Callable_... is a generic interface then the '...' include the _ n as well as the type names.
These classes convert a generic _Callable_ n interface to a the equivalent reified _Callable_... class, i.e. they reify the generic class, e.g.:
public final class _From_0_Integer__To_int extends _Callable_int implements Wrapper { private final _Callable_0<? extends Integer> f; private _From_0_Integer__To_int(final _Callable_0<? extends Integer> f) { this.f = f; } public static _Callable_int instance(final _Callable_0<? extends Integer> from) { if (from == null) { return null; } final _Callable_0<? extends Integer> f = (_Callable_0<? extends Integer>)Wrappers.unWrap(from); if (f instanceof _Callable_int) { return (_Callable_int)f; } return new _From_0_Integer__To_int(f); } @Override public int callPrimitive() { return f.call(); } @Override public Object getWrappee() { return f; } @Override public int hashCode() { return f.hashCode(); } }
Note use of callPrimitive; because the resultant, 'to', class is a primitive _Callable_..., if 'to' is a non-primitive _Callable_... then method call is defined instead.
These classes convert a reified _Callable_... class to another, contravariantly overriding, reified _Callable_... class.
Primitive types have an implicit conversion mechanism in Java, e.g. an int is automatically converted to a long. It is necessary to simulate this implicit conversion for lambdas that have primitive and/or primitive wrappers. The term "implicit conversion" is used below to indicate when such a non-inheritance based type conversion is required.
If going from the 'from' to the 'to' lambda types does not involve "implicit conversion" then the dynamically created class is for example:
public final class _From_String_Object__To_Object_String extends _Callable_Object_String implements Wrapper { private final _Callable_String_Object f; private _From_String_Object__To_Object_String(final _Callable_String_Object f) { this.f = f; } public _From_String_Object__To_Object_String instance(final _Callable_String_Object from) { if (from == null) { return null; } final _Callable_String_Object f = (_Callable_String_Object)Wrappers.unWrap(from); return new _From_String_Object__To_Object_String(f); } @Override public Object call(final String a1) { return f.call(a1); } @Override public Object getWrappee() { return f; } @Override public int hashCode() { return f.hashCode(); } }
Note if the 'from' lambda is primitive then use f.callPrimitive(a1) and if the 'to' lambda is primitive define method callPrimitive.
If "implicit conversion" is required then there are a number of cases to be covered.
Primitive-wrapper to primitive-wrapper for example:
public final class _From_Integer_Long__To_Long_Integer extends _Callable_Long_Integer implements Wrapper { private final _Callable_Integer_Long f; private _From_Integer_Long__To_Long_Integer(final _Callable_Integer_Long f) { this.f = f; } public _From_Integer_Long__To_Long_Integer instance(final _Callable_Integer_Long from) { if (from == null) { return null; } final _Callable_Integer_Long f = (_Callable_Integer_Long)Wrappers.unWrap(from); return new _From_Integer_Long__To_Long_Integer(f); } @Override public Long call(final Integer a1) { return f.call(a1.longValue()).longValue(); } @Override public Object getWrappee() { return f; } @Override public int hashCode() { return f.hashCode(); } }
Primitive-wrapper to primitive for example:
public final class _From_Integer_Long__To_long_int extends _Callable_long_int implements Wrapper { private final _Callable_Integer_Long f; private _From_Integer_Long__To_long_int(final _Callable_Integer_Long f) { this.f = f; } public _From_Integer_Long__To_long_int instance(final _Callable_Integer_Long from) { if (from == null) { return null; } final _Callable_Integer_Long f = (_Callable_Integer_Long)Wrappers.unWrap(from); return new _From_Integer_Long__To_long_int(f); } @Override public long callPrimitive(final int a1) { return f.call(Long.valueOf(a1)); } @Override public Object getWrappee() { return f; } @Override public int hashCode() { return f.hashCode(); } }
Primitive to primitive-wrapper for example:
public final static class _From_int_long__To_Long_Integer extends _Callable_Long_Integer implements Wrapper { private final _Callable_int_long f; private _From_int_long__To_Long_Integer(final _Callable_int_long f) { this.f = f; } public _From_int_long__To_Long_Integer instance(final _Callable_int_long from) { if (from == null) { return null; } final _Callable_int_long f = (_Callable_int_long)Wrappers.unWrap(from); return new _From_int_long__To_Long_Integer(f); } @Override public Long call(final Integer a1) { return Long.valueOf(f.callPrimitive(a1)); } @Override public Object getWrappee() { return f; } @Override public int hashCode() { return f.hashCode(); } }
Primitive to primitive (which is the same as the no "implicit conversion" case since Java automatically converts primitives) for example:
public final static class _From_int_long__To_long_int extends _Callable_long_int implements Wrapper { private final _Callable_int_long f; private _From_int_long__To_long_int(final _Callable_int_long f) { this.f = f; } public _From_int_long__To_long_int instance(final _Callable_int_long from) { if (from == null) { return null; } final _Callable_int_long f = (_Callable_int_long)Wrappers.unWrap(from); return new _From_int_long__To_long_int(f); } @Override public long callPrimitive(final int a1) { return f.callPrimitive(a1); } @Override public Object getWrappee() { return f; } @Override public int hashCode() { return f.hashCode(); } }
These classes provide the same functionality as the _From classes above do for individual lambdas but instead operate on arrays of lambdas. All the _Array_From classes create a new array, for the 'to', and convert each element of 'from'. The name of the _Array_From class is the same as the name of the appropriate _From class with '_Array' prepended, e.g:
public final class _Array_From_0_String__To_String { public static _Callable_String[] instance(final _Callable_0<? extends String>[] from) { if (from == null) { return null; } final _Callable_String[] to = new _Callable_String[from.length]; for (int i = 0; i < from.length; i++) { to[i] = _From_0_String__To_String.instance(from[i]); } return to; } }
To keep the weblog shorter I will limit the explanation to lambdas that have a return value and one argument, it is straightforward to extrapolate to a void lambda and down to zero or up to n arguments. The strawman proposal allows the return type of the lambda to be implied. Inferring the return type is a separate issue to how to implement lambdas and therefore the explanations assume the types are specified (similarly the proposed diamond operator is ignored). There are two basic constructs in lambdas:
There is a second new-lambda-instance construct in the strawman that allows statements, however it is easy to extrapolate to multiple statements from a single expression and therefore this form is not described.
In the above:
Generic types in lambdas do not include wildcards, i.e. no ?, ? extends $type$, or ? super $type$. If wildards are used as part of the lambda syntax, then it is an error.
From time to time, when explaining the expansion of lambda syntax, other textual place markers are introduced between dollars, e.g. $type$ as used above. If more than one $A$, for example, is needed then an additional letter is added to clarify which $A$ is referred to, e.g. $AL$ might be used to indicate the $A$ on the left hand side of an assignment.
New lambdas are created by the expression #$R$($A$ $N$) ($E$).
If either $R$ or $A$ are generic types then if either $R$ or $A$ are primitive they are converted to the equivalent primitive-wrapper classes and the expansion is:
new _Callable_1<$R$, $A$>() { @Overload public final $R$ call($A$ $N$) { return $E$; } }
Else the expansion is:
new _Callable_$R$_$A$() { @Overload public final $R$ call($A$ $N$) { return $E$; } }
The type of a lambda is given by #$R$($A$) and this may occur in different contexts as listed below.
EG #$R$($A$) lambda = ...;.
If either $R$ or $A$ are generic types then the expansion is:
_Callable_1<? extends $R$, ? super $A$>
Else the expansion is:
_Callable_$R$_$A$
EG #$R$($A$)[] $name$ = new #$R$($A$)[$size$];, the LHS is treated like a definition and the RHS like an instance, except that the RHS has special treatment for the generics case.
If either $R$ or $A$ are generic types then an unchecked warning is given, because you cannot create type-safe-generic arrays. The expansion is:
_Callable_1<? extends $R$, ? super $A$>[] $name$ = (_Callable_1<? extends $R$, ? super $A$>[]) new _Callable_1[$size$];
(There is no syntax in the strawman for a raw-type lambda, hence the need for this special expansion.)
Else the expansion is:
_Callable_$R$_$A$[] $name$ = new _Callable_$R$_$A$[$size$];
EG #$R$($A$).class.
If either $R$ or $A$ are generic types then an error is reported, because you cannot have class literals of generic types.
Else the expansion is:
_Callable_$R$_$A$.class
EG (#$R$($A$)).
If either $R$ or $A$ are generic types then an unchecked cast warning is given, because you cannot check generic casts at compile time, but the expansion proceeds and is:
(_Callable_1<$R$, $A$>)
Else the expansion is:
(_Callable_$R$_$A$)
EG $expression$ instanceof #$R$($A$).
If either $R$ or $A$ are generic types then an error is reported, because you cannot have instanceof tests of generic types.
Else the expansion is:
$expression$ instanceof _Callable_$R$_$A$
EG $genericType$<$optionalWildcardSyntax$ #$R$($A$)> is treated like the LHS of a variable declaration and is always generic and is therefore expanded to:
$genericType$<$optionalWildcardSyntax$ Callable$1<? extends $R$, ? super $A$>>
To facilitate reifying generic lambdas and contravariant-overriding, 'safe' automatic-conversion via loader-generated From classes are supported for assignment and initialization of variables, method parameters, and array elements (for arrays see next section).
A running example is used below:
$lValue$ = $rExpression$
Where the type, after expansion, of $lValue$ is _Callable_$RL$_$AL$ and the expanded-type of $rExpression$ is either _Callable_1<? extends $RR$, ? super $AR$> or _Callable_$RR$_$AR$ depending on if it is reified or not. Note $lValue$ is reified else no 'conversion' is added by the compiler (it is not an error or a warning for $lValue$ to be generic, it just means that the compiler does not need to add a 'conversion').
In the description below the phrase $R$ "is convertible to" $L$ is used to indicate that $R$ has is the same as $L$, $R$ is a subtype of $L$, or $R$ is a primitive or primitive-wrapper that can be converted to $L$ which is also a primitive or primitive-wrapper. The automatic conversion of primitives and primitive-wrappers is usual in Java, e.g. int to long.
In summary, the possible actions are:
If the the expanded type of $rExpression$ is _Callable_1<? extends $RR$, ? super $AR$> (i.e. it is generic) then a reifying cast is applicable (and as described above lValue should be reified).
If $RR$ or $AR$ are generic-parameter names rather than specific-type names then report an unchecked cast warning and substitute the limits of the parameters into $RR$ and $AR$ respectively, instead of the parameter names, and continue as below.
If $RR$ "is convertible to" $RL$ and if $AL$ "is convertible to" $AR$ then the expansion is:
lValue = _From_1_$RR$_$AR$__To_$RL$_$AL$.instance(rExpression);
Else there is a type error that should be reported.
If the the expanded type of $rExpression$ is _Callable_$RR$_$AR$ (i.e. it is reified) then a reified-to-reified cast is applicable (and as described above lValue should be reified).
If $RR$ is the same type as $RL$ and if $AL$ is the same type as $AR$ then do nothing (no cast necessary).
If $RR$ "is convertible to" $RL$ and if $AL$ "is convertible to" $AR$ then the expansion is:
lValue = _From_$RR$_$AR$__To_$RL$_$AL$.instance(rExpression);
Else there is a type error that should be reported.
To facilitate reifying arrays of generic-lambdas and contravariant-overriding of lambda arrays, 'safe' automatic-conversion via loader-generated Array_From classes are supported for assignment and initialization of array variables, method parameters, and array-array elements (for none arrays see previous section).
In the following:
$lValue$ = $rExpression$
The expanded type of:
If $RR$ is $RL$ and if $AR$ is $AL$ then no expansion is necessary.
Else if $lValue$ generic and $RL$ "is convertible to" $RR$ and if $AR$ "is convertible to" $AL$ then no expansion is necessary.
Else if ($RR$ "is convertible to" $RL$ and if $AL$ "is convertible to" $AR$) or ($RL$ "is convertible to" $RR$ and if $AR$ "is convertible to" $AL$) then the expansion is one of:
Both $lValue$ and $rExpression $ generic:
lValue = Array_From_1_$RR$_$AR$__To_1_$RL$_$AL$.instance(rExpression);
Generic $lValue$ and reified $rExpression $:
lValue = Array_From_1_$RR$_$AR$__To_$RL$_$AL$.instance(rExpression);
Reified $lValue$ and generic $rExpression $:
lValue = Array_From_$RR$_$AR$__To_1_$RL$_$AL$.instance(rExpression);
Both $lValue$ and $rExpression $ reified:
lValue = Array_From_$RR$_$AR$__To_$RL$_$AL$.instance(rExpression);
Else there is a type error that should be reported.
Reifying lambdas is not particularly straightforward, however it offers improved performance over some erased lambdas schemes and a smoother joining of lambdas to the rest of Java. A high-performance lambda implementation, e.g. reified lambdas, makes ParallelArray worth while. Without supporting high-performance lambdas you might well argue that provide a series of generic Callable interfaces and a short inner-class syntax is sufficient.
Are there any use cases were the above proposal is not ideal? (Other than the already justified, in previous weblog, wrapping for contravariant-overriding conversions.)
Are there any bugs?
Feedback would be greatly appreciated!
Have an opinion? Readers have already posted 4 comments about this weblog entry. Why not add yours?
If you'd like to be notified whenever Howard Lovatt adds a new entry to his weblog, subscribe to his RSS feed.
Dr. Howard Lovatt is a senior scientist with CSIRO, an Australian government owned research organization, and is the creator of the Pattern Enforcing Compiler (PEC) for Java. PEC is an extended Java compiler that allows Software Design Patterns to be declared and hence checked by the compiler. PEC forms the basis of Howard's 2nd PhD, his first concerned the design of Switched Reluctance Motors. |
Sponsored Links
|