LISP – Why LISP Makes Implementing Macro Systems Easier

common-lisplispmacrosschemetemplates

I'm learning Scheme from the SICP and I'm getting the impression that a big part of what makes Scheme and, even more so, LISP special is the macro system. But, since macros are expanded at compile-time, why don't people make equivalent macro systems for C/Python/Java/whatever? For example, one could bind the python command to expand-macros | python or whatever. The code would still be portable to people who don't use the macro system, one would just expand the macros before publishing code. But I don't know of anything like that except templates in C++/Haskell, which I gather aren't really the same. What about LISP, if anything, makes it easier to implement macro systems?

Best Answer

Many Lispers will tell you that what makes Lisp special is homoiconicity, which means that the code's syntax is represented using the same data structures as other data. For example, here's a simple function (using Scheme syntax) for calculating the hypotenuse of a right-angled triangle with the given side lengths:

(define (hypot x y)
  (sqrt (+ (square x) (square y))))

Now, homoiconicity says that the code above is actually representable as a data structure (specifically, lists of lists) in Lisp code. Thus, consider the following lists and see how they "glue" together:

  1. (define #2# #3#)
  2. (hypot x y)
  3. (sqrt #4#)
  4. (+ #5# #6#)
  5. (square x)
  6. (square y)

Macros allow you to treat the source code as just that: lists of stuff. Each of those 6 "sublists" contain either pointers to other lists, or to symbols (in this example: define, hypot, x, y, sqrt, +, square).


So, how can we use homoiconicity to "pick apart" syntax and make macros? Here's a simple example. Let's reimplement the let macro, which we'll call my-let. As a reminder,

(my-let ((foo 1)
         (bar 2))
  (+ foo bar))

should expand into

((lambda (foo bar)
   (+ foo bar))
 1 2)

Here's an implementation using Scheme "explicit renaming" macros:

(define-syntax my-let
  (er-macro-transformer
    (lambda (form rename compare)
      (define bindings (cadr form))
      (define body (cddr form))
      `((,(rename 'lambda) ,(map car bindings)
          ,@body)
        ,@(map cadr bindings)))))

The form parameter is bound to the actual form, so for our example, it would be (my-let ((foo 1) (bar 2)) (+ foo bar)). So, let's work through the example:

  1. First, we retrieve the bindings from the form. cadr grabs the ((foo 1) (bar 2)) portion of the form.
  2. Then, we retrieve the body from the form. cddr grabs the ((+ foo bar)) portion of the form. (Note that this is intended to grab all the subforms after the binding; so if the form were

    (my-let ((foo 1)
             (bar 2))
      (debug foo)
      (debug bar)
      (+ foo bar))
    

    then the body would be ((debug foo) (debug bar) (+ foo bar)).)

  3. Now, we actually build the resultant lambda expression and call using the bindings and body we have collected. The backtick is called a "quasiquote", which means to treat everything inside the quasiquote as literal datums, except the bits after the commas ("unquote").
    • The (rename 'lambda) means to use the lambda binding in force when this macro is defined, rather than whatever lambda binding might be around when this macro is used. (This is known as hygiene.)
    • (map car bindings) returns (foo bar): the first datum in each of the bindings.
    • (map cadr bindings) returns (1 2): the second datum in each of the bindings.
    • ,@ does "splicing", which is used for expressions that return a list: it causes the list's elements to be pasted into the result, rather than the list itself.
  4. Putting all that together, we get, as a result, the list (($lambda (foo bar) (+ foo bar)) 1 2), where $lambda here refers to the renamed lambda.

Straightforward, right? ;-) (If it's not straightforward for you, just imagine how difficult it would be to implement a macro system for other languages.)


So, you can have macro systems for other languages, if you have a way to be able to "pick apart" source code in a non-clunky way. There are some attempts at this. For example, sweet.js does this for JavaScript.

† For seasoned Schemers reading this, I intentionally chose to use explicit renaming macros as a middle compromise between defmacros used by other Lisp dialects, and syntax-rules (which would be the standard way to implement such a macro in Scheme). I don't want to write in other Lisp dialects, but I don't want to alienate non-Schemers who aren't used to syntax-rules.

For reference, here's the my-let macro that uses syntax-rules:

(define-syntax my-let
  (syntax-rules ()
    ((my-let ((id val) ...)
       body ...)
     ((lambda (id ...)
        body ...)
      val ...))))

The corresponding syntax-case version looks very similar:

(define-syntax my-let
  (lambda (stx)
    (syntax-case stx ()
      ((_ ((id val) ...)
         body ...)
       #'((lambda (id ...)
            body ...)
          val ...)))))

The difference between the two is that everything in syntax-rules has an implicit #' applied, so you can only have pattern/template pairs in syntax-rules, hence it's fully declarative. In contrast, in syntax-case, the bit after the pattern is actual code that, in the end, has to return a syntax object (#'(...)), but can contain other code too.

Related Topic