Summary
In the last dozen episodes I have defined plenty of macros, but I have not really explained what macros are and how they work. This episode closes the gap: it explains the true meaning of Scheme macros by introducing the concepts of syntax object and of transformer over syntax objects.
Syntax objects
Scheme macros - as standardized in the R6RS document -
are built over the concept of syntax object.
The concept is peculiar to Scheme and has no counterpart in other
languages (including Common Lisp), so it is worth spending some time
on it.
A syntax object is a kind of enhanced s-expression: it contains
the source code as a list of symbols and primitive values, but also
additional information, such as
the name of the file containing the source code, the position
of the syntax object in the file,
a set of marks to distinguish identifiers according to their
lexical context, and more.
The easiest way to get a syntax object is to use the syntax quoting
operation, i.e. the syntax (#') symbol you have seen in all the
macros I have defined until now. Consider for instance the following
script, which displays the string representation of
the syntax object #'1:
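A minimal sketch of such a script, assuming it is saved as x.ss and run as an R6RS top-level program (the name stx is only illustrative, and what gets printed depends on the implementation):

```scheme
(import (rnrs))

;; stx holds the syntax object for the literal 1
(define stx #'1)

;; the printed representation is implementation-specific
(display stx)
(newline)
```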
In some implementations the string representation of the syntax object #'1
contains the full pathname of the script and the line number/column number
where the syntax object appears in the source code. Clearly this
information is pretty useful for tools like IDEs and debuggers. The
internal implementation of syntax objects is not standardized at all,
so you get different
information in different implementations. For instance Ikarus
gives
$ ikarus --r6rs-script x.ss
#<syntax 1 [char 28 of x.ss]>
i.e. in Ikarus syntax objects do not store line numbers, they just store
the character position from the beginning of the file. If you are using
the REPL you will have less information, of course, and even more
implementation-dependency. Here are a few examples of syntax objects
obtained from syntax quoting:
> #'x ; convert a name into an identifier
#<syntax x>
> #''x ; convert a literal symbol
#<syntax 'x>
> #'1 ; convert a literal number
#<syntax 1>
> #'"s" ; convert a literal string
#<syntax "s">
> #''(1 "a" 'b) ; convert a literal data structure
#<syntax '(1 "a" 'b)>
Here I am running all my examples under Ikarus; your Scheme
system may have a slightly different output representation for syntax
objects.
Different syntax objects can be equivalent: for instance the improper
list of syntax objects (cons #'display (cons #'"hello" #'())) is
equivalent to the syntax object #'(display "hello") in the sense
that both correspond to the same datum:
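The equivalence can be checked directly with syntax->datum. Here is a small sketch in plain R6RS; the names a and b are just illustrative:

```scheme
(import (rnrs))

;; a is built by hand with cons, b is written with syntax quoting
(define a (cons #'display (cons #'"hello" #'())))
(define b #'(display "hello"))

(write (syntax->datum a)) ; (display "hello")
(newline)
(write (equal? (syntax->datum a) (syntax->datum b))) ; #t
(newline)
```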
The syntax macro is analogous to the quote macro.
Moreover, there is a quasisyntax macro, denoted by #`, which
is analogous to the quasiquote macro (`).
In analogy to
the operations comma (,) and comma-splice
(,@) on regular lists, there are two
operations unsyntax #, (sharp comma) and unsyntax-splicing #,@ (sharp comma splice) on lists and improper lists of
syntax objects.
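As a sketch of how the two operations behave, assuming a plain R6RS program (user, greeting, args and sum are illustrative names):

```scheme
(import (rnrs))

(define user "michele")
;; #, inserts the value of an expression into the template
(define greeting #`(display #,user))

(define args (list #'1 #'2 #'3))
;; #,@ splices a list of syntax objects into the template
(define sum #`(+ #,@args))

(write (syntax->datum greeting)) ; (display "michele")
(newline)
(write (syntax->datum sum))      ; (+ 1 2 3)
(newline)
```

The sketch prints only the underlying datums, since the printed form of greeting and sum themselves varies from one implementation to another.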
Notice that the result of such an interpolation - in Ikarus - is an improper list. This is
somewhat consistent with the behavior of usual quoting: for usual
quoting '(a b c) is a shortcut for (cons* 'a 'b 'c '()), which
is a proper list, and for syntax-quoting #'(a b c) is equivalent
to (cons* #'a #'b #'c #'()), which is an improper list. The
cons* operator here is an R6RS shortcut for nested conses: (cons* w x y z) is the same as (cons w (cons x (cons y z))).
However, the result of a quasisyntax interpolation is very much
implementation-dependent: Ikarus returns an improper list, but other
implementations return different results; for instance Ypsilon
returns a proper list of syntax objects whereas PLT Scheme returns
an atomic syntax object. The lesson here is that you cannot
rely on properties of the inner representation of syntax objects:
what matters is the code they correspond to, i.e. the result of
syntax->datum.
It is possible to promote a datum to a syntax object with the
datum->syntax procedure, but in order
to do so you need to provide a lexical context, which can be specified
by using an identifier:
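A minimal sketch, assuming plain R6RS; the identifier here is arbitrary and serves only to supply the lexical context:

```scheme
(import (rnrs))

;; promote the datum (display "hello") to a syntax object,
;; borrowing the lexical context of the identifier #'here
(define stx (datum->syntax #'here '(display "hello")))

;; converting back recovers the original datum
(write (syntax->datum stx)) ; (display "hello")
(newline)
```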
syntax-match is a general utility to perform pattern matching on
syntax objects; it takes a syntax object as input and returns a
syntax object as output. Here is an example of a simple transformer based on
syntax-match:
> (define transformer
    (syntax-match ()
      (sub (name . args) #'name))) ; return the name as a syntax object
> (transformer #'(a 1 2 3))
#<syntax a>
For convenience, syntax-match also accepts a second syntax
(syntax-match x (lit ...) clause ...) to match syntax expressions
directly. This is more convenient than writing
((syntax-match (lit ...) clause ...) x).
Here is a simple example:
> (syntax-match #'(a 1 2 3) ()
    (sub (name . args) #'args)) ; return the args as a syntax object
#<syntax (1 2 3)>
Here is an example using quasisyntax and unsyntax-splicing:
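A sketch of the idea, written with the standard R6RS syntax-case instead of syntax-match so that it runs without sweet-macros; reverse-args is a hypothetical transformer, not something taken from the library:

```scheme
(import (rnrs))

;; takes #'(name arg ...) and returns name followed by the
;; arguments in reverse order
(define (reverse-args stx)
  (syntax-case stx ()
    ((name arg ...)
     ;; #'(arg ...) is a list of syntax objects, so it can be
     ;; reversed and spliced back into the template with #,@
     #`(name #,@(reverse #'(arg ...))))))

(write (syntax->datum (reverse-args #'(f 1 2 3)))) ; (f 3 2 1)
(newline)
```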
The pattern variables introduced by with-syntax
are automatically expanded inside the syntax template, without any need to
resort to the quasisyntax notation (i.e. there is no need for
#`, #, and #,@).
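A sketch of with-syntax at work, using the standard R6RS syntax-case rather than syntax-match; reverse-args/ws and the pattern variable rev are hypothetical names:

```scheme
(import (rnrs))

;; takes #'(name arg ...) and returns name followed by the
;; arguments in reverse order, with no quasisyntax involved
(define (reverse-args/ws stx)
  (syntax-case stx ()
    ((name arg ...)
     ;; bind the pattern variable rev to the reversed argument list
     (with-syntax (((rev ...) (reverse #'(arg ...))))
       #'(name rev ...)))))

(write (syntax->datum (reverse-args/ws #'(f 1 2 3)))) ; (f 3 2 1)
(newline)
```

The template #'(name rev ...) reads like ordinary pattern-based code, which is the main advantage over spelling out the interpolation by hand.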
Macros are in one-to-one correspondence with syntax transformers, i.e. every
macro is associated with a transformer which converts a syntax object
(the macro and its arguments) into another syntax object (the
expansion of the macro). Scheme itself takes care of converting the
input code into a syntax object (if you wish, internally
there is a datum->syntax conversion) and the output syntax object
into code (an internal syntax->datum conversion).
Consider for instance a macro to apply a function to a (single)
argument:
(def-syntax (apply1 f a)
  #'(f a))
This macro can be equivalently written as
(def-syntax apply1 (syntax-match () (sub (apply1 f a) (list #'f #'a))))
The sharp-quoted syntax is more readable, but it hides the underlying list
representation which in some cases is pretty useful. This second form
of the macro is more explicit, but still it relies on syntax-match.
It is possible to provide the same functionality without using
syntax-match as follows:
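A sketch of such a definition in plain R6RS. As assumptions of this sketch, it uses the standard define-syntax in place of def-syntax, and let* with cadr and caddr in place of the let+ destructuring form:

```scheme
(import (rnrs))

(define-syntax apply1
  (lambda (x)
    ;; strip the syntax object down to the plain list (apply1 f a)
    (let* ((lst (syntax->datum x))
           (func (cadr lst))
           (arg (caddr lst)))
      ;; rebuild (f a) as a syntax object in the context of apply1
      (datum->syntax #'apply1 (list func arg)))))

(apply1 display "hey") ; prints hey
(newline)
```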
Here the macro transformer is explicitly written as a lambda function,
and the pattern matching is performed by hand by converting
the input syntax object into a list and by using the list
destructuring form let+ introduced in episode 15. At the
end, the resulting list is converted back to a syntax object
in the context of apply1. Here is an example of usage:
> (apply1 display "hey")
hey
sweet-macros provides a convenient feature:
it is possible to extract the associated
transformer for each macro defined via def-syntax. For instance,
here is the transformer associated to the apply1 macro:
The ability to extract the underlying transformer is useful in
certain situations, in particular when debugging. It can also
be exploited to define extensible macros, and I will come back
to this point in the future.
The previous paragraphs were a little abstract and
probably of unclear utility (but what would you expect from
an advanced macro tutorial? ;). Now let me be more
concrete. My goal is to provide
a nicer syntax for association lists (an association list is just
a non-empty list of non-empty lists) by means of an alist
macro expanding into an association list.
The macro accepts a variable
number of arguments; every argument is either of the form (name value) or
a single identifier. In the latter case it must be
magically converted
into the form (name value), where value is the value of the
identifier, assuming it is bound in the current scope; otherwise
an "unbound identifier" error is raised at run time. If you try to
pass an argument which is not of the expected form, a compile time
syntax error must be raised.
Concretely, the macro works as follows:
The expression #'(arg ...) expands into a list of syntax
objects which are then transformed by the syntax-match transformer,
which converts identifiers of the form n into pairs of the form
(n n), whereas it leaves pairs (n v) unchanged, just
checking that n is an identifier.
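The mechanism just described can be sketched in plain R6RS. This is not the sweet-macros definition: it assumes syntax-case in place of syntax-match, the helper normalize is hypothetical, and the expansion into nested list calls is one possible choice:

```scheme
(import (rnrs))

(define-syntax alist
  (lambda (stx)
    ;; normalize one argument: n becomes (n n), (n v) stays (n v);
    ;; anything else matches no clause and raises a syntax error
    (define (normalize arg)
      (syntax-case arg ()
        (n (identifier? #'n) #'(n n))
        ((n v) (identifier? #'n) #'(n v))))
    (syntax-case stx ()
      ((_ arg ...)
       (with-syntax ((((name value) ...) (map normalize #'(arg ...))))
         #'(list (list 'name value) ...))))))

(define x 1)
(write (alist x (y (* 2 x)))) ; ((x 1) (y 2))
(newline)
```

An argument such as a bare number fails both clauses of normalize, so the error is reported at expansion time rather than at run time.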
This is a typical use case for
syntax-match as a list matcher inside a bigger macro. We will
see other use cases in the next Adventures.