Pattern Centric Blog
Clear, Consistent, and Concise Syntax (C3S) for Java
by Howard Lovatt
October 27, 2006

Summary
Many people think that inner classes in Java could be better; this Blog presents a new syntax that emphasizes Clarity, Consistency, and Conciseness (in that order!). An example usage is: withLock lock, method { out.println "Hello" };

Clear, Consistent, and Concise Syntax (C³S) for Java

Many people think that inner classes in Java could be better, and many proposals have been put forward. However, none of these proposals, including this author's, has really gained broad support. This Blog presents an alternative, building on and hopefully unifying past proposals, that emphasizes Clarity, Consistency, and Conciseness (in that order!). An example usage is: withLock lock, method{out.println "Hello"};

Like normal writing, program code should express intent clearly; you should not be left wondering what a line does, and the compiler should not have to make assumptions that may or may not agree with the programmer's assumptions. The language should be consistent; it should not have many similar concepts or syntaxes with overlapping usage (a few general features are better than many specific features, even if for some use cases the specific features are better than the general ones). The syntax should be concise because verbosity can obscure intent, not in order to save keystrokes. In an ideal world you would meet all three Cs (Clear, Consistent, and Concise); however, the Cs are at times contradictory and a compromise has to be made. It is therefore important to prioritise the Cs, and for a language like Java that aspires to ubiquity the priority order should be: Clear, Consistent, and Concise. Much of the Consistent constraint will be consistency with existing Java features and the existing Java syntax style, e.g. non-abbreviated keywords rather than symbols. The consistency constraint also precludes lifting syntax directly from other languages unless that syntax is similar in style to existing Java syntax. From the priority given it is obvious that I am not equating minimum typing with simplicity; however, I am advocating conciseness because I think that in places the current verbosity obscures intent.

Previous proposals are summarized in a previous blog; telling comments on that blog about these proposals include: “... all of the syntax you've shown me is more painful than just doing without” (Todd Blanchard), “The heavy reliance on the correct use of differently shaped brackets, colon and semi-colons is steadily reducing Java's scope from being a language for use by general programmers to a notation decipherable only by geeks” (Vincent O'Sullivan), and “All of the proposals are a bit painful syntactically” (Martin Odersky). These comments strike at the crux of the problem: the proposals are not Clear or Consistent. Also, Neal Gafter pointed out in an email that there are more use cases than iterating over collections and therefore the examples given in the previous blog were not the complete story. The new proposal is in Appendix 1, along with justifications, and three examples are given below: time, withLock, and each. It is hoped that this proposal unifies other proposals by adopting features from them and by suggesting a staged introduction of features (see Appendix 1), and that the use cases are considered realistic.

Examples

Java time

Method time times a series of methods and reports their timings. As is the case for all the examples, I will present the current Java code first to provide a basis of comparison. Formatting of the example is close to what I use in practice; this choice of formatting, although realistic, is less concise than often shown in blogs and other example forums, where an unrealistically short format is used to save space and also to confer (sometimes inadvertently) simplicity. In my preferred style I declare variables final (except for ones that do change!), which is common (or mandatory) in functional languages but unusual in Java. As an aside – take a look at your own code; I bet nearly all variables should be final so that it is clear to humans and machines that the variable is a pure declaration. This use of final in the example below is not because variables are passed to inner classes, though in some of the other examples it is necessary to use final.

Use:
    final Method0< R, RuntimeException > method1 = new Method0< R, RuntimeException >() { 
      public R call() { ... }
      public String toString() { return "Method1"; }
    };
    final Method0< R, RuntimeException > method2 = new Method0< R, RuntimeException >() { 
      public R call() { ... }
      public String toString() { return "Method2"; }
    };
    time( method1, method2 );

Given:
  public interface Method0< R, Ts extends Throwable > { R call() throws Ts; }
  ...
    public static < R, Ts extends Throwable > void time( final Method0< R, Ts >... methods ) throws Ts {
      for ( final Method0< R, Ts > m : methods ) {
        final long start = currentTimeMillis();
        final R result = m.call();
        final long end = currentTimeMillis();
        out.println( m + " returned " + result + " and took " + (end - start) + " ms." );
      }
    }
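
For readers who want to run the Java version, here is a minimal self-contained sketch of it. The wrapper class TimeSketch and the helper named are hypothetical names introduced only for this illustration, the static imports of currentTimeMillis and out are written out in full, and time additionally returns its report lines so that they can be inspected; none of that is part of the listing above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class so that the sketch compiles on its own.
class TimeSketch {
  interface Method0< R, Ts extends Throwable > { R call() throws Ts; }

  // Same loop as time above; it also collects the printed lines (an addition for illustration).
  @SafeVarargs
  static < R, Ts extends Throwable > List< String > time( final Method0< R, Ts >... methods ) throws Ts {
    final List< String > report = new ArrayList< String >();
    for ( final Method0< R, Ts > m : methods ) {
      final long start = System.currentTimeMillis();
      final R result = m.call();
      final long end = System.currentTimeMillis();
      final String line = m + " returned " + result + " and took " + (end - start) + " ms.";
      System.out.println( line );
      report.add( line );
    }
    return report;
  }

  // Hypothetical helper: a named method that simply returns the given value.
  static Method0< Integer, RuntimeException > named( final String name, final int value ) {
    return new Method0< Integer, RuntimeException >() {
      public Integer call() { return value; }
      public String toString() { return name; }
    };
  }

  public static void main( final String[] args ) {
    time( named( "Method1", 1 ), named( "Method2", 2 ) );
  }
}
```

Because Ts is inferred as RuntimeException at the call site, main needs neither a throws clause nor a try block.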

C³S time

The same example in the new syntax is given below. The most striking features are the use of type inference and the new method declaration using the keyword method. Declaring a method with a keyword is similar to other languages, e.g. JavaScript; however, method is chosen instead of function because it emphasizes dynamic dispatch instead of static binding.

Use:
    final method1 = Method0.new { 
      method call { ... }
      method toString { "Method1" }
    };
    final method2 = Method0.new { 
      method call { ... }
      method toString { "Method2" }
    };
    time method1, method2;

Or shorter:
    time new { 
      method call { ... }
      method toString { "Method1" }
    }, new { 
      method call { ... }
      method toString { "Method2" }
    };

Or shorter still:
    time enum { 
      Method1 { method { ... } },
      Method2 { method { ... } }
    }.values;

Given:
  public interface< R, Throwable... Ts > Method0 { method R call throws Ts }
  ...
    public static method< R, Throwable... Ts > void time( final Method0< R, Ts >... methods ) throws Ts {
      for ( final m : methods ) {
        final start = currentTimeMillis;
        final result = m.call;
        final end = currentTimeMillis;
        out.println m + " returned " + result + " and took " + (end - start) + " ms."
      }
    }

The new syntax is explained fully in Appendix 1. The most interesting aspects, not all of which are demonstrated above, are:

Keyword final infers a variable's type (there is an equivalent keyword, declare, for non-final variables).
Keyword return as the last statement of a method may be omitted, even if it returns a value. If the method is of type Void (object, not primitive) then the method always returns null, even if an explicit return or return value isn't given.
Keyword new follows the type and it uses a dot to closely associate it with the type in a manner similar to a static call. Keyword new can also be used for constructor declaration within a class and for creating instances of classes for method arguments. When used for constructor declarations, the constructor name, modifiers, argument types, and simple argument copy to fields are inferred by new. When used to create an instance for a method argument, the type of the class to be created is inferred.
Keyword method identifies the block as a method block with the given arguments and body and also indicates object creation. The syntax mimics the established syntax for other blocks, e.g. if (keyword, brackets, braces), and it uses a dot to closely associate it with the object type in a manner similar to a static call. Keyword method can also be used for method declaration within a class and for creating instances of classes for method arguments. When used within a class, the method name, method modifiers, and argument types are inferred by method. When used to create an instance for a method argument, method also infers the type of the class to be created.
Keywords enum, interface, and class can be used to declare anonymous types, as opposed to anonymous instances of anonymous types that new and method create. Static methods can be called on these anonymous types, as they can on anything of type Class. The notation Base.class { ... } declares an anonymous class type that extends Base; similarly for enum and interface. If anonymous types are used in a method call the type to extend is inferred, like new and method.
Generic type declarations are unified to appear in the same position in a declaration, generalized to any declaration including variables, their syntax for common cases changed to shorter variable-like declarations, and varargs for Throwables and derivatives are allowed.
The number of brackets and semicolons is reduced.

Java withLock

Method withLock allows the use of the new concurrent locks conveniently. Medium formatting is adopted in this example.

Use:
    withLock(lock, new Block0<RuntimeException>() {
      public Void call() { out.println( "Hello" ); return null; } 
    });

Given:
  public interface Block0<Ts extends Throwable> extends Method0<Void, Ts> {}
  ...
    public static <Ts extends Throwable> void withLock(Lock lock, Block0<Ts> block) throws Ts {
      try {
        lock.lock();
        block.call();
      }
      finally { lock.unlock(); }
    }
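
A runnable sketch of the Java withLock follows, using java.util.concurrent.locks.ReentrantLock. The class name WithLockSketch and the helper heldInsideBlock are hypothetical and exist only for this illustration. One deliberate difference from the listing above: the sketch acquires the lock before entering try, which is the usual idiom because finally then only ever unlocks a lock that was actually acquired.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical wrapper class so that the sketch compiles on its own.
class WithLockSketch {
  interface Method0< R, Ts extends Throwable > { R call() throws Ts; }
  interface Block0< Ts extends Throwable > extends Method0< Void, Ts > {}

  static < Ts extends Throwable > void withLock( final Lock lock, final Block0< Ts > block ) throws Ts {
    lock.lock(); // acquired outside try, so finally only releases a lock that is held
    try { block.call(); }
    finally { lock.unlock(); }
  }

  // Hypothetical helper: runs a block under the lock and reports whether the lock was held inside it.
  static boolean heldInsideBlock( final ReentrantLock lock ) {
    final boolean[] held = { false };
    withLock( lock, new Block0< RuntimeException >() {
      public Void call() {
        held[ 0 ] = lock.isHeldByCurrentThread();
        return null; // Void methods must still return null explicitly in Java
      }
    } );
    return held[ 0 ];
  }

  public static void main( final String[] args ) {
    final ReentrantLock lock = new ReentrantLock();
    System.out.println( heldInsideBlock( lock ) + " " + lock.isLocked() ); // true false
  }
}
```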

C³S withLock

Use:
    withLock lock, method { out.println "Hello" };

Given:
  public interface<Throwable... Ts> Block0 extends Method0<Void, Ts> {}
  ...
    public static method<Throwable... Ts> void withLock(Lock lock, Block0<Ts> block) throws Ts {
      try {
        lock.lock;
        block.call
      }
      finally { lock.unlock }
    }

Java each

Method each steps through each item in a collection. The example assumes that collections are given new member functions (i.e. each) and vararg constructors. (However, modifying the collections library is really a separate issue to C³S.) The example uses short names, is shown with minimum white space, and inlines definitions. I don't consider this extra short formatting realistic; but it is used in other blogs, some people prefer it, and a proposal must look OK in many different formatting styles.

Use:
    final ArrayList<Integer> list=new ArrayList<Integer>(1,2,3);
    final Tuple1<Integer> sum=new Tuple1<Integer>(0);
    list.each(new Block1<Integer,RuntimeException>(){
      public Void call(Integer x){sum.e1+=x;return null;}
    });

Given:
  public class Tuple1<E1>{
    public E1 e1;
    Tuple1(E1 e1){this.e1=e1;}
  }
  ...
  public interface Block1<A1,Ts extends Throwable>extends Method1<Void,A1,Ts>{}
  ...
    public <Ts extends Throwable> void each(Block1<E,Ts> b)throws Ts{
      for(E e:es){b.call(e);}
    }
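
Since java.util.ArrayList has neither a vararg constructor nor an each method, the listing above cannot be compiled as it stands. The sketch below is one way to try it out, assuming a hypothetical subclass EachList that supplies both; EachSketch, EachList, and the helper sum are names invented for this illustration, written in a roomier format than the compact listing.

```java
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical wrapper class so that the sketch compiles on its own.
class EachSketch {
  interface Method1< R, A1, Ts extends Throwable > { R call( A1 a1 ) throws Ts; }
  interface Block1< A1, Ts extends Throwable > extends Method1< Void, A1, Ts > {}

  // Mutable one-element holder, so that an inner class can update a captured value.
  static class Tuple1< E1 > {
    E1 e1;
    Tuple1( final E1 e1 ) { this.e1 = e1; }
  }

  // Hypothetical stand-in for an ArrayList with a vararg constructor and an each method.
  static class EachList< E > extends ArrayList< E > {
    @SafeVarargs
    EachList( final E... es ) { super( Arrays.asList( es ) ); }

    < Ts extends Throwable > void each( final Block1< ? super E, Ts > b ) throws Ts {
      for ( final E e : this ) { b.call( e ); }
    }
  }

  // Hypothetical helper: sums the values using each and an inner class.
  static int sum( final Integer... values ) {
    final EachList< Integer > list = new EachList< Integer >( values );
    final Tuple1< Integer > sum = new Tuple1< Integer >( 0 );
    list.each( new Block1< Integer, RuntimeException >() {
      public Void call( final Integer x ) { sum.e1 += x; return null; }
    } );
    return sum.e1;
  }

  public static void main( final String[] args ) {
    System.out.println( sum( 1, 2, 3 ) ); // 6
  }
}
```

The Tuple1 wrapper is needed only because an inner class cannot assign to a local variable of the enclosing method.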

C³S each

Use:
    final list=ArrayList.new(1,2,3);
    declare sum=0;
    list.each method(x){sum+=x};

Given:
  ...
    public method<Throwable... Ts> void each(Method1<Void,E,Ts>.interface{} b)throws Ts{
      for(final e:es){b.call(e)}
    }

Summary

Whilst the syntax proposed is not the shortest possible, I think it is the best proposal to date because it balances Clarity, Consistency, and Conciseness and because it unifies previous proposals. The syntax synergy of the proposal is demonstrated in the table below (the syntax in the table is slightly simplified to keep the table short).

Table 1: Syntax Synergy in C³S
Example	Keywords	Comment

Modifiers_opt Keyword< ... >_opt Type_opt Name_opt Extras_opt ( ... )_opt { ... ; ... }		Standard Template
if ( ... ) { ... ; ... }	if, for, while	Loops and Branches
Modifiers_opt class< ... >_opt Name_opt Extras_opt { ... ; ... }	class, interface, enum	Type Declarations
Modifiers_opt method< ... >_opt Type_opt Name_opt Extras_opt ( ... )_opt { ... ; ... }	method, new	Method and Constructor Declarations Inside Type Declarations
Modifiers_opt final< ... >_opt Type_opt Name ... , ... ;	final, declare	Fields, Locals, and Arguments

Type._opt Keyword< ... >_opt Type_opt Name_opt Extras_opt ( ... )_opt { ... ; ... }		Qualified Template
Type._opt interface { ... ; ... }	class, interface, enum	Anonymous Type Declaration
Type._opt method< ... >_opt Type_opt Name_opt Extras_opt ( ... )_opt { ... ; ... }	new, method	Anonymous Instance as Method Arguments or Variable Assignment

What do other people think – is this syntax an improvement on the other proposals? Does this proposal unify the other proposals?

Appendix 1 – C³S Rules

Stage 1

The concept is to implement stage 1 rules and then see if stage 2 suggestions are still considered necessary.

Rule 0: For a method call allow brackets, (), to be omitted if there are no arguments and for the last call in a chain of Top Level calls (method, constructor, or modified new, see Rules 2 and 3 below); provided that the call isn't ambiguous with respect to a field. Top Level means the call isn't an argument to another call. Note braces start a new statement with a new top level. The last call means that for the top level calls only: it is either the only call (and that call isn't qualified) or it is the call following the last dot. The last call rule means it must follow the last dot, even if that dot is a qualifier dot rather than a call dot; so that an integer argument is distinct from a real argument (see last example Rule 3). This rule is more concise. E.g.:

     size = list.size;
     list.set 0, 'A';
     list.add( 'A' ).add 'B';
     list.add string.charAt( 0 );

Rule 1: Allow final and a new keyword, declare, to infer the type of a declaration based on the initial value. If no initial value is given or the value is given as null then the type must be specified as normal (also the declare keyword may be used like final is currently used). Function arguments are as is, except that declare may prefix the declaration to emphasize that the parameter isn't final. To specify primitives the type must explicitly be given, see first example below. Generic arguments follow the keyword and are useful for recursive or repeated types. This rule improves clarity by using a keyword, consistency by always using a keyword and allowing generic parameters, and conciseness by allowing type inference. E.g. (also see Rules 2 and 3 for the new notation for new):

     declare sum = 0; // Integer, for int use: int sum = 0 or declare int sum = 0
     final< X > Tuple2< X, X > duplicate; // Not the same as Tuple2< ?, ? >!
     declare strings = ArrayList< String >.new.add "A";
     final strings = ArrayList< String >.new;

Rule 2: Allow new to be used as a qualifier with a . (dot) like a static method call (but generic arguments retain their normal position after the type) instead of as a prefix operator. This rule improves readability and hence clarity since new is tightly associated with the type, is more consistent with the rest of Java, and (importantly) enables Rule 3. E.g.:

     declare strings = ArrayList< String >.new( "a", "b", "c" ); // Assuming ArrayList had a vararg constructor!
     declare numbers = (T[])Number.new[ 1 ]; // where T extends Number

Rule 3: Allow new to infer generic type from constructor argument and use <> to mean a raw type. This rule considerably shortens generic declarations and also instances of anonymous inner classes. E.g.:

     declare strings = ArrayList<>.new; // Explicit raw type
     final strings = ArrayList< String >.new; // Explicit generic type
     final strings = ArrayList.new( "a", "b", "c" ); // Inferred generic type, assuming vararg constructor

Rule 4: Allow new to declare constructors in classes and to infer argument types from fields of the same name if no argument types are given (like final and declare do for variables – see Rule 1). If argument names are given without qualification and body is given as default then infer a simple Assignment of arguments to fields. This inferring of Assignment is similar default functionality to ML, Scala, and Fortress, but is more flexible since multiple constructors with different access and annotation modifiers are possible. Inside an instance method the notation this.new( ... )_opt may be used. This use of new improves clarity since new is a keyword, improves consistency since it emphasizes the similarity between constructors and static methods and is consistent with the new use of .new after a type name (Rule 2), and is more concise. E.g. (also see Rule 5 for omitting trailing ; and Rule 15 for position of generic arguments):

     class< E1 > T1 {
       declare E1 i;
       new( final E1 i ) { this.i = i }
     }

     class< E1 > T1 {
       declare E1 i;
       new( final i ) { this.i = i } // Infer argument type from field
     }

     class< E1 > T1 {
       declare E1 i;
       new( i ) { default } // Infer argument type from field and infer copy of argument into field
     }
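
For comparison, the three C³S declarations above all stand for the same thing in current Java, sketched below. ConstructorSketch is a hypothetical wrapper name; the class and constructor are left package-private here, on the assumption that the inferred modifiers follow the declaration.

```java
// Hypothetical wrapper class so that the sketch compiles on its own.
class ConstructorSketch {
  // What new( i ) { default } is shorthand for: the constructor name, the argument
  // type, and the copy of the argument into the field are all written out by hand.
  static class T1< E1 > {
    E1 i;
    T1( final E1 i ) { this.i = i; }
  }

  public static void main( final String[] args ) {
    final T1< String > t = new T1< String >( "a" );
    System.out.println( t.i ); // a
  }
}
```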

Rule 5: If an expression ends in } then there is no need for ; before the brace (like , in array initializers; note also that , at the end of an argument list is an error). I.e. treat ; as a statement separator rather than a statement terminator. This rule helps the readability of instances of anonymous classes and encourages braces to always be used even if a single statement is possible. E.g.:

     if ( end ) { break }

Rule 6: Introduce a new keyword, method, for declaring methods, which is syntactically placed just before any generic parameters (i.e. immediately after annotations and modifiers). If the method overrides a method then optionally infer @Override, other modifiers, return type, method argument types, and throws clause. If there is only one method to override then the method name can be inferred. If the method has no arguments the brackets may be omitted. This new keyword is clearer because it is a keyword, is more consistent because other declarations have keywords, and is more concise because it allows inference. E.g. (also see Rule 9 for new generic type syntax and Rule 12 for omitting keyword return):

     public static method< K, V > Map< K, V > unmodifiableMap( Map< K ?, V ? > map ) ...
     method toString { "Hello" }

Rule 7: Keyword, method, can also be used with a type followed by a dot notation to be a shorthand for creating an anonymous instance of an anonymous inner class. This new syntax is clearer because its intent is not obscured by verbosity, is more consistent because its syntax is like other blocks (e.g. if), and is more concise because of inference. E.g.:

     final action = AbstractAction.method ( final notUsed ) {
       controller.updateAllViews updateModel
     };

Rule 8: Keyword, method, can also be used to create anonymous instances of anonymous inner classes without a type and a dot notation if the type can be inferred from a method's arguments. This new syntax is clearer because its intent is not obscured by verbosity, is more consistent because its syntax is like other blocks (e.g. if), and is more concise because of inference. E.g.:

     final button = JButton.new method ( final notUsed ) {
       controller.updateAllViews updateModel
     };

Rule 9: Allow extends type generic arguments to be shortened to < Type name > (i.e. like normal variable declarations), e.g.:

     public static method< K, V > Map< K, V > unmodifiableMap( Map< K ?, V ? > map ) ...
     public class Enum< Enum< E > E > ...
 
   Which are equivalent to
     public static < K, V > Map< K, V > unmodifiableMap( Map< ? extends K, ? extends V > map ) ...
     public class Enum< E extends Enum< E > > ...

Rule 10: A throws clause can be empty, which means the method doesn't throw anything (equivalent to an absent throws clause). This is useful in conjunction with generics, see Rule 11 below. E.g.:

     method Void call() throws;

Rule 11: Generics are extended to allow varargs (only for use with Throwables and derivatives). An empty generics varargs list is allowed and it is equivalent to an absent throws clause (note Rule 10 above). E.g.:

     method< R, Throwable... Ts > R call() throws Ts;
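
For context, current Java already lets a single type parameter stand in a throws clause; Rule 11 generalizes this to a list that may be empty. The sketch below shows the existing single-parameter form, with RuntimeException playing the role of "throws nothing checked". ThrowsSketch, run, noChecked, and checked are hypothetical names used only for this illustration.

```java
import java.io.IOException;

// Hypothetical wrapper class so that the sketch compiles on its own.
class ThrowsSketch {
  interface Method0< R, Ts extends Throwable > { R call() throws Ts; }

  // Forwards whatever exception type the method declares.
  static < R, Ts extends Throwable > R run( final Method0< R, Ts > m ) throws Ts { return m.call(); }

  static String noChecked() {
    // Ts = RuntimeException: the caller needs no try/catch.
    return run( new Method0< String, RuntimeException >() {
      public String call() { return "ok"; }
    } );
  }

  static String checked() {
    try {
      // Ts = IOException: the compiler insists that the exception is handled or declared.
      return run( new Method0< String, IOException >() {
        public String call() throws IOException { throw new IOException( "failed" ); }
      } );
    } catch ( final IOException e ) {
      return "caught " + e.getMessage();
    }
  }

  public static void main( final String[] args ) {
    System.out.println( noChecked() ); // ok
    System.out.println( checked() );   // caught failed
  }
}
```

With one parameter a method that can throw two unrelated checked exceptions has to fall back to a common supertype, which is the gap the varargs form is aimed at.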

Rule 12: Keyword return may be omitted for last line of a method and the returned value is the value of the last expression, e.g.:

     method toString { "Hello" };

Rule 13: If a method returns a Void, then make the end of a method without a return and a return without an argument synonymous with return null. E.g. the following are identical.

   Given:
     interface< Throwable... Ts > Block0 extends Method0< Void, Ts > {}
   
  Then:
     final block = Block0.method {};
     final block = Block0.method { return };
     final block = Block0.method { return null };
     final block = Block0.new {
       method { return null }
     };
     final block = Block0.new {
       @Override method call { return null }
     };
     final block = Block0.new {
       method call { return null }
     };
     final Block0< RuntimeException > block = new Block0< RuntimeException >() {
       public Void call() { return null; }
     };

  Are all the same.

Rule 14: Non-final locals that are referenced by an inner class are automatically wrapped in a final-tuple instance, like C# 3.0 does. Note: special treatment, name mangled temporary required, of non-final arguments is needed (not shown in example below). E.g.:

   Given:
     public class< E1 > Tuple1 {
       public E1 e1;
       public new( e1 ) { default }
     }
     public class< E1, E2 > Tuple2 extends Tuple1< E1 > ...
     public interface< R, A1, Throwable... Ts > Method1 { R call( A1 a1 ) throws Ts }
     public interface< A1, Throwable... Ts > Predicate1 extends Method1< Boolean, A1, Ts > {}
     ...
       public static method< T, Throwable... Ts > List< T > select( final Iterable< T > c, final Predicate1< T ?, Ts > f ) throws Ts ...
   
  Then:
     final beginning = "Fred";
     final names = ArrayList.new "Frederick";
     final filtered = names.select method ( name ) { name.startsWith beginning };
   
  Which is equivalent to the following verbose version:
     final Tuple1< String > beginning = new Tuple1< String >( "Fred" );
     ArrayList< String > names = (new ArrayList< String >( "Frederick" ));
     ArrayList< String > filtered = names.select( new Predicate1< String >() {
       public Boolean call( final String name ) {
         return name.startsWith( beginning.e1 );
       }
     } );
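
The verbose version can be made runnable with a few supporting definitions. In the sketch below select is a plain static method taking the collection as its first argument, as in the Given section, and the names SelectSketch and startingWith are hypothetical, introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class so that the sketch compiles on its own.
class SelectSketch {
  interface Method1< R, A1, Ts extends Throwable > { R call( A1 a1 ) throws Ts; }
  interface Predicate1< A1, Ts extends Throwable > extends Method1< Boolean, A1, Ts > {}

  // The final-tuple wrapper that Rule 14 asks the compiler to generate.
  static class Tuple1< E1 > {
    E1 e1;
    Tuple1( final E1 e1 ) { this.e1 = e1; }
  }

  static < T, Ts extends Throwable > List< T > select( final Iterable< T > c, final Predicate1< ? super T, Ts > f ) throws Ts {
    final List< T > result = new ArrayList< T >();
    for ( final T t : c ) {
      if ( f.call( t ) ) { result.add( t ); }
    }
    return result;
  }

  // Hypothetical helper: keeps the names that start with the given prefix.
  static List< String > startingWith( final String prefix, final List< String > names ) {
    // The captured local is wrapped by hand and read through e1 inside the inner class.
    final Tuple1< String > beginning = new Tuple1< String >( prefix );
    return select( names, new Predicate1< String, RuntimeException >() {
      public Boolean call( final String name ) { return name.startsWith( beginning.e1 ); }
    } );
  }

  public static void main( final String[] args ) {
    System.out.println( startingWith( "Fred", Arrays.asList( "Frederick", "Wilma" ) ) ); // [Frederick]
  }
}
```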

Rule 15: Allow generic parameters to follow keyword for class and interface, i.e. like method and like Type[] for arrays. This improves consistency since in some circumstances it cannot follow the name, e.g. method declarations. E.g.:

     public class< E1 > Tuple1 {
       public E1 e1;
       public new( e1 ) { default }
     }

Rule 16: Allow anonymous type declarations, TypeToExtend._opt class { ... }, for class, interface, and enum. This is intentionally similar syntax to anonymous instances of anonymous classes, i.e. similar syntax to Type._opt new ( ... )_opt { ... } and Type._opt method ( ... )_opt { ... }. If used as an initializer for a variable the type is inferred as Class< $1 >. If used as a method argument the type to extend may be inferred. This is a useful generalization of the notion of creating anonymous instances of anonymous classes to the creation of anonymous types. E.g. (also see Rule 17):

    public method< Throwable... Ts > void each( final Method1< Void, E, Ts >.interface{} b )throws Ts {
      for ( final e : es ) { b.call( e ) }
    }

    final singleton = Method0.enum { // Singleton is of type Class< $1 > where $1 implements Method0 and is an Enum
      Method1 { method { ... } },
      Method2 { method { ... } }
    };
    time singleton.values;

    time enum { // Infer Method0
      Method1 { method { ... } },
      Method2 { method { ... } }
    }.values;

Rule 17: Allow static methods to be called on variables of type Class< T > using the normal dot notation. This is consistent with the concept that static members are what other languages call class members, e.g. Smalltalk. E.g. (also see Rule 16):

    final singleton = Method0.enum { // Singleton is of type Class< $1 > where $1 implements Method0 and is an Enum
      Method1 { method { ... } },
      Method2 { method { ... } }
    };
    time singleton.values;
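
The nearest equivalent in current Java needs a named enum, and because a static method cannot be called through a Class variable, reflection (Class.getEnumConstants) has to stand in for singleton.values. The sketch below illustrates this; EnumSketch, Methods, valuesOf, and describe are hypothetical names, and the constant bodies simply return 1 and 2 in place of the elided ones.

```java
// Hypothetical wrapper class so that the sketch compiles on its own.
class EnumSketch {
  interface Method0< R, Ts extends Throwable > { R call() throws Ts; }

  // Named counterpart of the anonymous enum: each constant supplies its own call body.
  enum Methods implements Method0< Integer, RuntimeException > {
    Method1 { public Integer call() { return 1; } },
    Method2 { public Integer call() { return 2; } }
  }

  // Reflection stands in for calling the static values method on a Class variable.
  static < E extends Enum< E > > E[] valuesOf( final Class< E > type ) { return type.getEnumConstants(); }

  // Hypothetical helper: calls every constant and joins name=result pairs.
  static String describe() {
    final StringBuilder sb = new StringBuilder();
    for ( final Methods m : valuesOf( Methods.class ) ) {
      sb.append( m ).append( '=' ).append( m.call() ).append( ' ' );
    }
    return sb.toString().trim();
  }

  public static void main( final String[] args ) {
    System.out.println( describe() ); // Method1=1 Method2=2
  }
}
```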

Rule 18: Allow variable declarations . E.g.:

    final singleton = Method0.enum { // Singleton is of type Class< $1 > where $1 implements Method0 and is an Enum
      Method1 { method { ... } },
      Method2 { method { ... } }
    };
    time singleton.values;

Stage 2

Having implemented stage 1 and gained some experience a second stage could be added with the following suggestions (these are very much suggestions and this section should be treated as a list of open issues). I have stuck with syntax related suggestions that are essentially sugar and no new semantics or modifications to JVM.

Suggestion A: Add support for statements break, continue, and exit multiple enclosing blocks using break, continue, or return. The syntax for break and continue for an enclosing loop would be unchanged. For exiting an enclosing method the syntax could be MethodName.return value;. This could be implemented using pre-made checked exceptions, thus allowing compile time checking of multiple block exits. Syntax could be unified by allowing Name.break and Name.continue and named blocks, e.g. if Name ( ... ) { ... }.
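
The exception-based implementation mentioned in Suggestion A can be tried by hand in current Java. The sketch below is only an illustration of that idea under stated assumptions: Break, BreakSketch, firstNegative, and the simplified void-returning Block1 are hypothetical names, and the pre-made exception is a single shared instance created without a stack trace.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class so that the sketch compiles on its own.
class BreakSketch {
  // Pre-made checked exception used purely as a control-flow signal.
  static final class Break extends Exception {
    static final Break INSTANCE = new Break();
    private Break() { super( null, null, false, false ); } // no message, cause, suppression, or stack trace
  }

  // Simplified block type for this sketch; it returns void rather than Void.
  interface Block1< A1, Ts extends Throwable > { void call( A1 a1 ) throws Ts; }

  static < E, Ts extends Throwable > void each( final Iterable< E > es, final Block1< ? super E, Ts > b ) throws Ts {
    for ( final E e : es ) { b.call( e ); }
  }

  // Returns the first negative value, leaving each early, or null if there is none.
  static Integer firstNegative( final List< Integer > values ) {
    final Integer[] found = { null };
    try {
      each( values, new Block1< Integer, Break >() {
        public void call( final Integer x ) throws Break {
          if ( x < 0 ) { found[ 0 ] = x; throw Break.INSTANCE; }
        }
      } );
    } catch ( final Break b ) {
      // The compiler requires this catch; that is the compile-time check on the exit.
    }
    return found[ 0 ];
  }

  public static void main( final String[] args ) {
    System.out.println( firstNegative( Arrays.asList( 3, -1, -7 ) ) ); // -1
  }
}
```

Because Break is checked, forgetting the catch around each is a compile error rather than a runtime surprise.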

Suggestion B: Add support for long strings with embedded variables using a new keyword, string, e.g.:

    string { // Trailing white space and single new line on { line not part of string
Embed variables including optional formatting, e.g.: %2.1f{ real }
    } // Leading white space and single new line on } line not part of the string

Suggestion C: Add support for union of types using & like generics (particularly for catch blocks).

Suggestion D: Allow final and declare to be used for the variable declaration in a catch block and allow type inference as the union of all declared or inferred (checked or not) exceptions.

Suggestion E: Add support for properties by inferring bodies and declarations for: method getName { default } and method setName { default }; i.e. like new { default }.

Suggestion F: Universally allow () in C³S to be omitted, e.g. if condition { ... ; ... } and method Type name ..., ... { ... ; ... }.

Suggestion G: Unify variable (field, local, and argument) declarations around the standard syntax, e.g. final Type { name1; name2 }.

Suggestion H: Make new line act as a semicolon and allow ,, +, -, =, etc. as last non-white space or non-comment on a line to mean continued on next line.


About the Blogger

Dr. Howard Lovatt is a senior scientist with CSIRO, an Australian government owned research organization, and is the creator of the Pattern Enforcing Compiler (PEC) for Java. PEC is an extended Java compiler that allows Software Design Patterns to be declared and hence checked by the compiler. PEC forms the basis of Howard's 2nd PhD; his first concerned the design of Switched Reluctance Motors.

