Agile Buzz Forum - Overloading Semicolon, or, monads from 10,000 Feet



Oliver Steele


Oliver Steele is Chief Software Architect at Laszlo Systems, Inc.

Overloading Semicolon, or, monads from 10,000 Feet

Posted: Dec 2, 2007 8:12 PM

This post originated from an RSS feed registered with Agile Buzz by Oliver Steele.
Original Post: Overloading Semicolon, or, monads from 10,000 Feet
Feed Title: Oliver Steele on Software
Feed URL: http://feeds.feedburner.com/osteele
Feed Description: Languages of the real and artificial.

amichail on reddit asks about understanding monads in one minute. My thoughts ran longer than a comment and more than a minute, so I’ve placed them here.

The main message of this posting is that you already use monads, just without the labels. The complexity in most explanations comes from factoring out the different pieces of what you already know, and from the mathematical exposition in terms of category theory and monad laws. (I like the math, but you won’t find any of it here.) This posting trades away accuracy for ease; I hope it’s a helpful start.

A monad, as used in Haskell, is a rule that defines how to get from one statement in a program to the next. For example, there’s an implicit monad in the sequence var x = 1; var y = x+1. It describes how to get from var x = 1 to var y = x+1.

A monad describes the flow of control, and the flow of data. Typically, the flow of control is something like “execute the first statement once, and then execute the next statement once”, and the flow of data is something like “the first statement computes a value, and makes it available to the next statement” (through a variable binding, say).
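As a rough sketch of that typical rule, here it is written out as an ordinary function. This is Python rather than Haskell, and the name then is made up for illustration: it stands for the rule that sits between two statements.

```python
def then(value, next_step):
    """The 'typical' rule between two statements.

    Flow of control: run the next statement exactly once.
    Flow of data: hand it the value the first statement computed.
    """
    return next_step(value)


# var x = 1; var y = x + 1
y = then(1, lambda x: x + 1)
print(y)  # prints 2
```

The first statement's value and "everything that follows" (as a function of that value) are the only two ingredients; the rule decides what to do with them.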

Sometimes the rules are more complicated. For example, if one statement is throw or raise, the statements that follow it are skipped. If one statement sets a global variable, the statements that follow it can get additional data by reading that variable. And if one statement changes the world outside the program (say, that statement creates a file), this will affect what happens when the following statements peek at the world, maybe even in other ways (say, by reading the free space on that volume).

Also, “following statements” is a dynamic notion, not a static one. If function f calls g, then all the statements in f that follow f’s call to g also follow all the statements in g (at least, up until g’s return).
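Here is a small sketch of both points, using Python's built-in exceptions. The functions f and g match the names in the text; the log list is only there to record which statements actually ran.

```python
log = []


def g():
    log.append("g: before raise")
    raise ValueError("stop")
    log.append("g: after raise")   # skipped: it follows the raise


def f():
    log.append("f: before call")
    g()
    log.append("f: after call")    # also skipped: it follows g's statements


try:
    f()
except ValueError:
    log.append("handler")

print(log)  # prints ['f: before call', 'g: before raise', 'handler']
```

The statement after the call in f never runs, even though the raise is textually nowhere near it.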

You’ve used monads. You’ve used the State monad (which manages global variables), the Error monad (which enables exceptions), and the IO monad (which handles interactions with the file system, and other resources outside the program). You may not have thought much about these properties, because they come “for free”: in most languages, you don’t need to do anything special to get them.
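To make the State case a little more concrete, here is a simplified sketch of what "managing global variables" could look like if you had to spell the rule out yourself. It assumes a statement is a function from the current state to a pair (value, new state). The names then_state, read_counter and set_counter are invented for this example, and this is not how any real Haskell library is written.

```python
def then_state(step, next_step):
    """Combine two state-passing steps into one.

    A step is a function: state -> (value, new_state).
    next_step takes the first step's value and returns another step.
    """
    def combined(state):
        value, state_after_first = step(state)
        return next_step(value)(state_after_first)
    return combined


def read_counter(state):
    # Reads the "global variable" and leaves the state unchanged.
    return state["counter"], state


def set_counter(n):
    # Returns a step that writes the "global variable".
    def step(state):
        return None, {**state, "counter": n}
    return step


# n = counter; counter = n + 1; read counter again
increment_then_read = then_state(
    read_counter,
    lambda n: then_state(set_counter(n + 1), lambda _: read_counter),
)

value, final_state = increment_then_read({"counter": 41})
print(value, final_state)  # prints 42 {'counter': 42}
```

In a procedural language the threading of state from one statement to the next is invisible; here the rule does it explicitly.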

In Haskell, you do have to do something special. All you get by default is the “typical” case from above: one statement computes a value; the next statement reads it. If you want additional behavior (State, or Error, or IO), you have to say so. You can say so by declaring the type of your statement block. Just as every variable in a statically-typed language such as C or Java has a compile-time type (int if its values are integers, or String if its values are strings), every statement in Haskell has a compile-time type too: Error Int if it might raise an exception, or IO Int if it might interact with the world.

Why introduce all this complexity? There are two benefits: one that checks dynamism, and another that adds it.

The first benefit is that, if the only statements that might change the world have IO in their type, you can assert to the compiler that some compound statement or function not only doesn’t change the world within its body, but doesn’t call any functions that contain statements that do. And so on for other properties (“raises an exception”, “accesses global data”). This turns out to be useful.

The other benefit is that you can define your own monad rules. Remember that part about “execute the first statement once, and then execute the next statement”, and “the first statement computes a value, which the next statement can use”? You can replace that with “execute the first statement, but only execute the next statement if the value that the first one computes isn’t null; otherwise, it’s the end of the line for this whole sequence of statements, including callers”. This is the Maybe monad; it turns out to be useful too.
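A minimal sketch of that rule, with Python's None standing in for null. The function then_maybe and the little managers table are made up for illustration.

```python
def then_maybe(value, next_step):
    """Run the next statement only if the previous value isn't None."""
    if value is None:
        return None              # end of the line for the whole sequence
    return next_step(value)


managers = {"alice": "bob", "bob": "carol"}


def boss_of(name):
    return managers.get(name)    # dict.get returns None for a missing key


def grand_boss_shout(name):
    # boss = boss_of(name); grand_boss = boss_of(boss); grand_boss.upper()
    return then_maybe(boss_of(name), lambda boss:
           then_maybe(boss_of(boss), lambda grand_boss:
           grand_boss.upper()))


print(grand_boss_shout("alice"))  # prints CAROL
print(grand_boss_shout("bob"))    # prints None
```

For "bob", the second lookup comes back empty, so the call to upper is never attempted; with the ordinary rule it would have raised an error on None.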

Or, you could replace the rule with “the first statement computes a list of values, and the second statement runs once using each of them”. This is the List monad; it’s useful too.
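Sketched the same way, with an invented function then_list: the next statement runs once per value, and it returns a list itself, so the results are concatenated.

```python
def then_list(values, next_step):
    """Run the next statement once for each value; collect all the results."""
    results = []
    for value in values:
        results.extend(next_step(value))
    return results


# x = one of [1, 2, 3]; y = one of ["a", "b"]; yield (x, y)
pairs = then_list([1, 2, 3], lambda x:
        then_list(["a", "b"], lambda y:
        [(x, y)]))

print(pairs)
# prints [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]
```

Two "statements" in a row behave like two nested loops, without either one mentioning a loop.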

Back to the analogy with variable types. (Java and C++ have typed variables; Haskell has typed statements.) Some languages let you overload operators. Integer+Integer does one thing; you can define String+String to do another (hopefully string concatenation), and Vector+Vector to do a third (hopefully vector addition).

You can think of the semicolon as an operator that combines two statements. A definition for the semicolon operator is a monad: it defines the meaning of a compound statement composed of two simpler ones. Haskell lets you overload semicolon.
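Python won't let you overload its statement separator, but it does let you overload operators, so here is a sketch that uses >> as a stand-in for the semicolon. The classes Maybe and Many and the helper functions half and neighbors are invented for this example; each class supplies its own meaning for "and then".

```python
class Maybe:
    """A value that may be missing (None)."""

    def __init__(self, value):
        self.value = value

    def __rshift__(self, next_step):
        if self.value is None:
            return self                  # skip everything that follows
        return next_step(self.value)


class Many:
    """Zero or more values."""

    def __init__(self, values):
        self.values = list(values)

    def __rshift__(self, next_step):
        results = []
        for value in self.values:        # run what follows once per value
            results.extend(next_step(value).values)
        return Many(results)


def half(n):
    # Only even numbers have an integer half.
    return Maybe(n // 2 if n % 2 == 0 else None)


def neighbors(n):
    return Many([n - 1, n + 1])


print((Maybe(12) >> half >> half).value)              # prints 3
print((Maybe(6) >> half >> half >> half).value)       # prints None
print((Many([10]) >> neighbors >> neighbors).values)  # prints [8, 10, 10, 12]
```

The same operator between steps means "stop at the first missing value" for one type and "try every value" for the other, which is the overloading the analogy is after.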

I hope this helps. If it did, now go read a monad tutorial and see how it works for real. If it didn’t, go read a monad tutorial to see if it’s easier with actual examples, and the details and syntax filled in.

Some Lies

Here’s some of what I said above, that just isn’t true:

A monad isn’t just a rule. It’s a set (or type) of statements, and a rule that combines them. (There are more accurate and technical definitions, but that gets into the heavy math.)

Furthermore, the rule part of the monad isn’t just any rule. It has to have certain properties. (A moment with Google will turn up a dozen great tutorials that define the Monad Laws; take your pick.) However, you can perfectly well understand how to use a monad without being able to enumerate the monad laws, just like you can use numbers without being able to enumerate the properties of a field or ring. The properties just say that (1) there’s a way to turn any expression (that computes a value) into a monad (that passes that value on); and (2) a sequence of statements acts the way you think it should.
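For the curious, here is what those two properties look like when spot-checked on a list-style rule in Python. The names unit and then_list and the sample functions f and g are made up for illustration, and checking one example is of course not a proof.

```python
def unit(value):
    """Property (1): turn a plain value into a one-element 'statement'."""
    return [value]


def then_list(values, next_step):
    """The list rule: run next_step once per value and concatenate."""
    return [result for value in values for result in next_step(value)]


def f(x):
    return [x, x + 1]


def g(x):
    return [x * 10]


m = [1, 2]

# (1) Wrapping a value and passing it on is the same as just using it.
left_identity = then_list(unit(5), f) == f(5)
right_identity = then_list(m, unit) == m

# (2) How you group a sequence of statements doesn't change the result.
associativity = (then_list(then_list(m, f), g)
                 == then_list(m, lambda x: then_list(f(x), g)))

print(left_identity, right_identity, associativity)  # prints True True True
```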

The descriptions of the State and IO monads above are particularly oversimplified. I think they’re useful if you’re coming from procedural programming, but you’ll want to refine them in order to get to functional programming. Again, follow any monad tutorial.

In Haskell, you don’t even get statements by default. (I said above that you did.) Everything is an expression. As soon as you start using statements, you’re in monad land (even if it’s just the identity monad). The syntactic step from expressions to statements is the big one, and the steps between different monads aren’t as big a (syntactic) deal.

In Java, you do have to declare the types of some collections of statements. The collections are functions, and the type that you have to declare is the type of a statement that throws a checked exception. This fact about Java is almost universally despised, but it does give you a taste of Haskell.

Finally, thinking about monads as defining (or overloading) semicolon is a start, but sequencing is more general than that. Even in a language where every two lexical statements are separated by a semicolon, one statement can follow another without any particular relation between their sources, such as when one is in a function that is called by the other.
