Weblogs Forum
The adventures of a Pythonista in Schemeland/5

13 replies on 1 page. Most recent reply: Sep 14, 2010 2:38 AM by Michele Simionato

 Previous Topic Next Topic
 Flat View: This topic has 13 replies on 1 page
 Michele Simionato Posts: 222 Nickname: micheles Registered: Jun, 2008
The adventures of a Pythonista in Schemeland/5 (View in Weblogs)
Posted: Oct 6, 2008 4:44 AM
Summary
In this installment I will talk about tail call optimization, performance and the R6RS module system.

# About tail call optimization (and the module system)

In order to speak of Scheme performance one needs to write at least a few benchmarks. This is not completely trivial, for at least a couple of reasons. The first reason is shallow, but it may be baffling for beginners: since there is no for loop in Scheme, how are we supposed to write a benchmark, which is usually based on running many times the same instructions and measuring the total time spent? We will see in a moment that there is an easy solution to this question (use recursion). On the other hand, there is a second more serious question: since there is no fully portable way to write a library in Scheme, how can we write a benchmark library? There is no real answer, so we will restrict ourselves to R6RS-compliant Scheme implementations where there is a standard concept of library.

# There are no for loops in Scheme

The for loop is missing in Scheme as a primitive construct since it is useless in a language that guarantees tail call optimization. If you studied the concept of tail call at college you know what I am talking about; on the other hand, if you forgot it, or if you did study Physics like me, it is worth spending a couple of words on the subject. The ones who want to know more, may consult this Wikipedia article.

The important point beyond the tail recursion concept is that it is always possibile to convert a for into a recursive function in tail call form, i.e. a recursive function returning a value or a call to itself. For instance, the Python loop:

```# a loop printing 1 2 3
for i in range(1,4):
print i,
```

can be converted to a recursive function print_up_to_3:

```def print_up_to_3(i):
if i == 4: return
print i,
return print_up_to_3(i+1)

print_up_to_3(1)
```

Here the last instruction of the function (the tail) is a call to itself, hence the name tail call.

Tail call optimization is guaranteed by the Scheme language. Scheme compilers/interpreters are able to recognize recursive functions in tail call form and to convert them internally in for loops. As a consequence, the programmer has no need to write for loops directly: she can just use recursive function. Our example would look as follows in Scheme:

```(define (print-up-to-3 i)
(unless (= i 4)
(display i) (display " ")
(print-up-to-3 (+ i 1))))

(print-up-to-3 1)
```

This works, but it is not really readable; to improve the situation Scheme provides a little syntactic sugar called named let:

```(let loop ((i 1))
(unless (= i 4)
(display i) (display " ")
(loop (+ i 1))))
```

Traditionally the function in the named let construct is called loop to make clear to the programmer that we are emulating a for loop. In this example loop is exactly equivalent to print-up-to-3.

Let me point out two things before closing this paragraph:

1. there are other let forms, used to define local variables. The simplest one is let:

```> (let ((a 1) (b 2)) (+ a b)) ; => 3
```

The scope of a and b is limited to the current S-expression/form; if a and b are defined outside the let form, internally a and b shadow the outer names.

2. there is actually a do loop in the language, but it is cumbersome to use and redundant because the named let allows you to perform everything do does. I see it as a useless construct in a language that would like to be minimalist but it is not.

# There is no portable module system

As I have anticipated, libraries are the weak point of Scheme. There are few libraries available and it is also difficult to write portable ones. The reason is that historically Scheme never had any standard module system until very recently, with the R6RS document: that means nearly all current implementations provide different and incompatible module systems.

In order to understand the reason for this serious lack, you must understand the philosophy behind Scheme, i.e. the so called MIT approach: things must be done well, or not at all. For thirty years the Scheme community has not been able to converge on a well done single module system. It is only in 2007 that a standard module system has been blessed by the Scheme committee: but even that was done with a lot of opposition and there are implementors who said they will never support R6RS.

As a consequence of history and mentality, if you want to write a library for implementation X, you need to do a lot of boring and uninspiring work to port the library to implementations Y, Z, W, ... (there are dozens of different implementations). Moreover, a few implementations do not have a module system at all, so you may be forced to solve name clashes issue by hand, changing the names of the functions exported by your library, if they shadow names coming from third party libraries (!)

Personally, I picked up Scheme 5 years ago, but never used it because of the lack of a standardized module system. The main reason why I have decided to go back to Scheme and to write this series is the coming of the R6RS document last year. The R5RS standard has lots of defects, but at least now I can write a library and I can have people using different implementations install it and use it (nearly) seemlessly.

Since there is some hope for a large diffusion of R6RS module system in the future, I have decided to use it and to ignore implementations not supporting it. I should notice however that there are solutions to run R6RS modules on top of R5RS implementations, like the psyntax package, but I have not tried it, so I cannot comment on its reliability.

As first example of usage of the R6RS module system, I will define a repeat library exporting a call function which is able to call a procedure n times. Here is the code, that should be self-explanatory:

```(library (repeat)
(export call)
(import (rnrs))

(define (call n proc . args)
(let loop ((i 0))
(when (< i n) (apply proc args) (loop (+ 1 i))))))
```

The export declaration corresponds to Python's __all__: only the names listed in export are exported. In this case we will export only the function (call n proc . args). Notice the dot in the argument list: that means that the functions accept a variable number of arguments, which are collected in the list args. In other words, . args is the moral equivalent of *args in Python, with some difference that we will ignore for the moment. The apply function applies the argument list to the input procedure proc, which is called many times until the index i reaches the value n.

(import (rnrs)) imports all the libraries of the current version of the "Revised Report on Scheme", i.e. the R6RS report. At the REPL this is automatically done by the system, but for batch scripts it is mandatory (as Pythonistas say explicit is better than implicit). It is also possible to import subsections of the whole library. For instance (import (rnrs base)) imports only the base library of the R6RS, (import (rnrs io)) imports only the I/O libraries, et cetera.

The usage of the libray is trivial: it is enough to put the file repeat.sls somewhere in the Ikarus search path (specified by the environment variable IKARUS_LIBRARY_PATH). Then, you can import the library as follows:

```\$ rlwrap ikarus
Ikarus Scheme version 0.0.2
Copyright (c) 2006-2007 Abdulaziz Ghuloum
> (import (repeat))
> (call 3 display "hello!\n")
hello!
hello!
hello!
```

By default (import (repeat)) imports all the names exported by the module repeat, something that a Pythonista would never do (it is equivalent to a import * from repeat); fortunately it is possible to list the names to be imported, or to add a custom prefix:

```> (import (only (repeat) call)); import only call from repeat
call
#<procedure call>
> (import (prefix (repeat) repeat:)); import all with prefix repeat:
> repeat:call
#<procedure call>
```

# A simple benchmark

The main advantage of Scheme with respect to Python is the performance. In order to show the differences in performance I will go back to the factorial example of episode 4. I will compare the following Python script:

```# fact.py
import sys, timeit

def fact(x):
if x == 0: return 1
else: return x * fact(x-1)

if __name__ == '__main__':
n = int(sys.argv[-1])
t = timeit.Timer('fact(%d)' % n, 'from fact import fact')
print t.repeat(1, number=10000000)
print fact(n)
```

with the following R6RS-compliant script (written in Ikarus Scheme):

```; fact.ss
(import (rnrs) (only (repeat) call) (only (ikarus) time))

(define (fac x)
(if (= x 0) 1
(* x (fac (- x 1)))))

(define n
(string->number (car (reverse (command-line)))))

(time (call 10000000 (lambda () (fac n))))
(display "result:") (display (fac n)) (newline)
```

I will notice two things:

1. Python manages to compute the factorial of 995, but then it faces the stack wall and it raises a RuntimeError: maximum recursion depth exceeded whereas Scheme has no issues whatsoever;
2. In order to compute the factorial of 995 ten thousands times, Python takes 15.2 seconds, whereas Ikarus takes 7.2 seconds.

Notice that since the factorial of 995 is a large number, the computation time is spent in multiplication of large numbers, which are implemented in C. Python has its own implementation of long integers, whereas Ikarus uses the GNU Multiprecision library (gmp): the times measured here mean that the gmp implementation of long integers is more efficient than the Python one, but they say nothing on the relative performances of the two languages. It is more interesting to see what happens for small numbers. For instance, in order to compute the factorial of 7 for 10 millions of times, Python takes 30.5 seconds, whereas Ikarus takes 3.08 seconds and thus it is nearly ten times faster than Python. This is not surprising at all, since function calls in Python are especially slow whereas they are optimized in Scheme. Moreover Ikarus is a native code compiler.

It means Ikarus' REPL works by compiling expressions to native code, whereas Python's REPL compiles to bytecode. The technology is called incremental compilation and it is commonly used in Lisp languages from decades, even if it may look futuristic for C/C++ programmers. The factorial example is not very practical (on purpose), but it is significant, in the sense that it is legitimate to expect good performances from Scheme compilers. The fastest Scheme compiler out there is Stalin, but I would not recommend it to beginners.

The next episodes will be devoted to the dangers of benchmarks, do not miss it!

 Miki Tebeka Posts: 1 Nickname: tebeka Registered: Apr, 2007
Re: The adventures of a Pythonista in Schemeland/5 Posted: Oct 7, 2008 1:47 PM
Recursion is not considered the Pythonic way to do things.
A faster definition of fact will be:
`from operator import muldef fact(n): return reduce(mul, xrange(1, n + 1))`
On my machine it is twice as fast compared to ikarus (computing factorial of 995).

 Daniel Weinreb Posts: 3 Nickname: dlweinreb Registered: Oct, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Oct 12, 2008 1:50 PM
I'm sorry to hear that the next few articles are about benchmarking. What I was hoping to see was more about the Scheme language as seen from the point of view of a Python programmer. I hope you'll write more about that.

 Michele Simionato Posts: 222 Nickname: micheles Registered: Jun, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Oct 13, 2008 10:54 AM
Don't worry, there is only a single article about benchmarks, where I say that I do not like benchmark ;)

 Michele Simionato Posts: 222 Nickname: micheles Registered: Jun, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Oct 14, 2008 9:20 PM
> Recursion is not considered the Pythonic way to do
> things.

Yes, I say the same in the next episode ;)

> A faster definition of fact will be:
> `> from operator import mul> > def fact(n):> return reduce(mul, xrange(1, n + 1))> `
> On my machine it is twice as fast compared to ikarus
> (computing factorial of 995).

I cannot reproduce your results on my MacBook: reduce
is indeed faster (from 15.2 to 10.5 seconds) but not as fast as Ikarus (7.2s). This is probably very dependent on the architecture and how the multiprecision library is compiled. Yet another case of tricky benchmark. For more,
see the next episode!

 Matteo Pradella Posts: 1 Nickname: bzoto Registered: Oct, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Oct 21, 2008 8:40 AM
Well, here you are comparing an iterative version of the algorithm with a pure recursive one. A more similar Scheme version is the following (tail-recursive):

(define (fact n)
(define (fac n out)
(if (< n 1)
out
(fac (- n 1) (* out n))))
(fac n 1))

 Michele Simionato Posts: 222 Nickname: micheles Registered: Jun, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Oct 21, 2008 8:54 AM
The tail recursive version is considered in episode #6.

 Grant Rettke Posts: 23 Nickname: grettke Registered: Nov, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Nov 2, 2008 1:20 AM
The "MIT Approach" is *not* the reason that R5RS was so minimalistic and to your point, did not include a module system. Based on what I've read the reason is that the Scheme committee's members were just not interested in specifying things that exactly (there is supposedly undefined behavior all over R5RS, for example).

When you read into the history of the language you will find that the community was simply more interested in teaching and research than standardizing on a one-size-fits-all definition of the language.

Now look at the change in the committee between R5RS and R6RS, the differences between them don't explain the massive change in membership.

For contrast, read into the history of Haskell and look at how they addressed all of the different statically typed lazy languages variants (some incompatible) at the time!

 Grant Rettke Posts: 23 Nickname: grettke Registered: Nov, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Nov 2, 2008 1:20 AM
Why do you say that "There is no portable module system" in big bold letters when the whole point of this article is to use R6RS which has a module system?

 Grant Rettke Posts: 23 Nickname: grettke Registered: Nov, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Nov 2, 2008 1:20 AM
You should elaborate on the difference between a library and a script.

 Michele Simionato Posts: 222 Nickname: micheles Registered: Jun, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Nov 2, 2008 1:40 AM
> Why do you say that "There is no portable module system"
> in big bold letters when the whole point of this article
> is to use R6RS which has a module system?

Because as of now most Scheme implementations are not R6RS-compliant. Notice that this article was originally written before PLT Scheme added support for R6RS, but even now AFAIK Chicken, Bigloo, Gambit, Gauche, STklos, ... are not supporting R6RS so you cannot write a library and port it everywhere. Moreover, the R6RS module system has unspecified dark corners, so even for R6RS-compliant implementations it is not trivial to write portable code. If you want to know more, I can refer you to recent threads in comp.lang.scheme and in the Ikarus mailing list.

 Grant Rettke Posts: 23 Nickname: grettke Registered: Nov, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Nov 2, 2008 7:44 AM
If you want to read more about the rationale behind tail-recursion in Scheme you should read:

LAMBDA: The Ultimate Imperative

and

LAMBDA: The Ultimate GOTO

(I posted about them here:

http://www.wisdomandwonder.com/article/497/lambda-the-ultimate-imperative

and here:

http://www.wisdomandwonder.com/article/509/lambda-the-ultimate-goto)

It should have been called tail-transfer! :)

 Isaiah Gilliland Posts: 1 Nickname: jonsul Registered: Sep, 2010
Re: The adventures of a Pythonista in Schemeland/5 Posted: Sep 13, 2010 3:16 AM
I'm having trouble with Ikarus. I installed it with apt-get but when I try to "(import (only (ikarus) time))" it does it without error, but when I try to use "time" it comes up with an exception.
I want to follow your posts closely but I can't without getting time to work. Also my IKARUS_LIBRARY_PATH variable hasn't been set??? I know how to set it but I don't know where the default location is.

 Michele Simionato Posts: 222 Nickname: micheles Registered: Jun, 2008
Re: The adventures of a Pythonista in Schemeland/5 Posted: Sep 14, 2010 2:38 AM
The right place to ask questions about Ikarus is the Ikarus mailing list.

 Flat View: This topic has 13 replies on 1 page
 Previous Topic Next Topic