> I find myself unconvinced by arguments on both sides of this debate (all
> lacking any hard data or proof, but full of opinions stated
> unequivocally)
Let's see. Here are some arguments I've used in this debate along with a couple of additional notes.
> [The SML/NJ] compiler infers the types of expressions and responds with
> the result of the inference
The proof is right next to the statement. If you don't take my word for it you can try SML/NJ by yourself. The same applies to other type inferred languages with a REPL, such as Ocaml and Haskell.
> there are cases where Hindley-Milner style type-infering can yield
> rather difficult to understand errors, because the cause of the type
> error and the place where the error is detected may be far away. You can
> limit such problems by using type ascriptions.
Here is a link to a particular article on the subject of improving error messages from H-M type systems that specifically tries to improve the location information: http://citeseer.ist.psu.edu/680831.html .
> The undesirable declarations are the ones you write just because the
> language forces you to write them, but serve no other purpose than to
> keep the compiler happy (e.g. in Java, every variable declaration must
> be annotated by a type). The desirable kinds of declarations are the
> ones that you use to specify abstractions so that you can write modular
> programs. From another point of view, the desirable kinds of
> declarations are the ones you use to hide implementation details.
I assume you already know Java. Let me illustrate the two kinds of declarations. Let's first assume that SML would require type annotations on every binding. Here is how the Direction example using strings would look like:
structure LeakyStringDirection (* no sealing ascription *) =
struct
type direction = string
val north : direction = "north"
val south : direction = "south"
val west : direction = "west"
val east : direction = "east"
fun opposite (d : direction) : direction =
if d = north then south
else if d = south then north
else if d = west then east
else west
end
Although all the bindings are annotated with types, the above does not actually enforce the abstraction. We can, for example, pass an arbitrary string as a direction or to treat a direction as a string:
- LeakyStringDirection.opposite "look, no protection" ;
val it = "west" : LeakyStringDirection.direction
- print (LeakyStringDirection.north^"\n") ;
north
val it = () : unit
What I'm demonstrating here is that type annotations do not necessarily help you to provide any additional guarantees about your program.
In fact, the module LeakyStringDirection
would exhibit the same properties if we left out all the annotations:
structure LeakyStringDirection (* no sealing ascription *) =
struct
type direction = string
val north = "north"
val south = "south"
val west = "west"
val east = "east"
fun opposite d =
if d = north then south
else if d = south then north
else if d = west then east
else west
end
While the above is clearly less verbose than the one with ascriptions, it has the exact same safety properties. In other words,
> The undesirable declarations are a major cause of verbosity and are
> usually a hindrance rather than a help.
Fortunately, SML allows you to actually specify and enforce abstractions. Let's first see a signature for directions:
signature DIRECTION =
sig
eqtype direction
val north : direction
val south : direction
val west : direction
val east : direction
val opposite : direction -> direction
end
What the above signature says is that there is a type direction
whose values can be tested for equality. It also says that there are four (not necessarily distinct) named values of the type: north
, south
, west
, and east
. Finally, there is a function opposite
that maps a direction to a direction.
Let's then create a sealed module matching the DIRECTION
signature:
structure SealedStringDirection :> DIRECTION =
struct
type direction = string
val north = "north"
val south = "south"
val west = "west"
val east = "east"
fun opposite d =
if d = north then south
else if d = south then north
else if d = west then east
else west
end
The abstraction is now enforced. We can not pass an arbitrary string as a direction nor treat a direction as a string:
- SealedStringDirection.opposite "look, protection" ;
stdIn:1.1-8.25 Error: operator and operand don't agree [tycon mismatch]
operator domain: SealedStringDirection.direction
operand: string
in expression:
SealedStringDirection.opposite "look, protection"
uncaught exception Error
raised at: ../compiler/TopLevel/interact/evalloop.sml:52.48-52.56
../compiler/TopLevel/interact/evalloop.sml:35.55
- print (SealedStringDirection.north^"\n") ;
stdIn:1.7-8.16 Error: operator and operand don't agree [tycon mismatch]
operator domain: string * string
operand: SealedStringDirection.direction * string
in expression:
SealedStringDirection.north ^ "\n"
uncaught exception Error
raised at: ../compiler/TopLevel/interact/evalloop.sml:52.48-52.56
../compiler/TopLevel/interact/evalloop.sml:35.55
Essentially, SealedStringDirection.direction
is a new abstract type, distinct from the underlying type string
.
What the above demonstrates is the use of declarations to specify and enforce abstractions. Specification and enforcement of abstractions allows the compiler to detect errors.
> you can not express "concepts" in C++.
Try the book Generic Programming and the STL by Matthew Austern for starters. If you don't have it, you can look at http://www.osl.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf .
> Another example is the lack of any means of abstracting over primitive
> types in Java. If you change between int and long in an interface,
> you'll get a nasty ripple effect.
Suppose I have a Java class of the form:
public class Foo {
static int introduce();
static void eliminate(int token);
}
If, for some good reason (e.g. an external library changes), the interface needs to be changed to
public class Foo {
static long introduce();
static void eliminate(long token);
}
All clients of the interface will have to be changed. For instance, every place where a result of calling introduce
is stored in a variable must be changed. Every interface that you need to pass through values returned by introduce
potentially needs to change. This what I mean with the ripple effect.
In many other languages it is possible to avoid the ripple effect without a major programming effort. For example, in C++ you can use a typedef:
class Foo {
public:
typedef int token_type;
static token_type introduce();
static void eliminate(token_type token);
}
Unfortunately, the abstraction isn't actually enforced in C++. You can still write code that uses int
rather than Foo::token_type
. As I already demonstrated earlier, you can enforce abstractions like the above in SML by using abstract type specifications and opaque signature ascription:
structure Foo :>
sig
type token_type
val introduce : unit -> token_type
val eliminate : token_type -> unit
end =
struct
type token_type = int
val introduce = ...
val eliminate = ...
end
If we would change the above to
structure Foo :>
sig
type token_type
val introduce : unit -> token_type
val eliminate : token_type -> unit
end =
struct
type token_type = Int64.int
val introduce = (*...*)
val eliminate = (*...*)
end
in Standard ML, there would be no ripple effect, because it is essentially impossible to directly depend on the underlying representation of token_type
.
> Except for a few exceptions (and the definition of new datatypes) you
> can essentially write your entire program without any type annotations
> or declarations.
This can be formalized and proven by defining a subset of the language (SML, for example) that type checks without type ascriptions. I leave it as an exercise to the reader.
[In SML, the few exceptions are related to certain kinds of polymorphic (top-level) declarations and certain uses of record types.]
> In a type inferred language you can actually have both (and more) for
> little more than the price of one (the tests). You can use just the
> amount of type declarations that you find appropriate to enforce the
> abstractions that you want and you can write the tests in an attempt to
> find particular errors in the dynamic semantics.
I hope my additional explanations above have clarified this statement. I have demonstrated that type ascriptions are optional in SML. I have demonstrated that you can use signatures in SML to specify abstractions and signature ascription to enforce abstractions. It is quite obvious that you can write unit tests in SML. Here is a concrete example of an ad hoc unit test from my utility library (you can't compile it without the rest of the library):
local
open UnitTest
structure LP = ListProduct
val A = LP.A
val L = LP.L
val S = LP.S
val V = LP.V
in
val () =
unitTests
title "ListProduct"
test (fn () =>
verify (LP.all (thunk true) $))
test (fn () =>
verify (LP.all (fn s & i & r =>
s = Int.toString i andalso real i <= r)
L["1", "2"] L[1, 2] A(Array.fromList [1.0, 2.0, 3.0]) $))
test (fn () =>
verify (not (LP.allEq (fn i & s & r =>
s = Int.toString i andalso real i <= r)
L[1, 2, 3] L["1", "2"] L[1.0, 2.0] $)))
test (fn () => let
val results = ref []
in
LP.app (fn i & r =>
push (Real.toString (real i + r) ^ "\n",
results))
L[1, 2, 3] L[0.1, 0.2] $
; verify (["2.2\n", "1.1\n"] = !results)
end)
test (fn () =>
expect (fn () =>
LP.appEq ignore L[1, 2] L["1", "2", "3"] L[0.1, 0.2] $,
fn ListPair.UnequalLengths => ()))
test (fn () =>
verify (not (LP.exists (fn i & r =>
real i <= r)
L[3, 4] L[1.0, 2.0] $)))
test (fn () =>
verify ("c3.3b2.2a1.1\n" =
LP.foldl (fn (s & r & i, rest) =>
s^Real.toString (real i + r)^rest)
"\n"
L["a", "b", "c"] V(Vector.fromList [0.1, 0.2, 0.3]) L[1,2,3,4] $))
test (fn () =>
verify (["a" & 1, "b" & 2, "c" & 3] =
LP.foldr op:: [] L["a", "b", "c", "d"] L[1, 2, 3] $))
test (fn () =>
verify ([(~1, #"A", false), (~3, #"B", true), (~5, #"C", false)] =
LP.map (fn i & s & b =>
(~i, Char.toUpper s, not b))
L[1,3,5] S"abcd" L[true, false, true] $))
test (fn () =>
verify ([1, 3, 2] =
LP.map Real.round
L[1.0, 3.0, 2.0] $))
$
end
The above is an ad hoc unit test for a ListProduct
module. The module UnitTest
is a very simple unit testing "framework" I've written for my own use.
> dependent type systems
Use google: http://www.google.com/search?q=dependent+types
> Being able to specify and enforce abstractions is pretty much the
> essense of static typing.
Ditto: http://www.google.com/search?q=static+types+enforce+abstractions
...
If you would care to elaborate on what you think is "lacking proof or data" then maybe I can provide it.
Here is a summary of some of my positions:
- You should use types to specify and enforce abstractions. When you do so, the types actually help to detect errors.
- Type ascriptions that do not either specify or enforce abstractions are unnecessary. You don't fundamentally need them.
- Manifest typing (as in Java and C++) leads to verbosity (having to read/write unnecessary words).
- Type inferred languages allow you to omit the unnecessary type ascriptions.