I'm learning Scheme from the SICP and I'm getting the impression that a big part of what makes Scheme and, even more so, LISP special is the macro system. But, since macros are expanded at compile-time, why don't people make equivalent macro systems for C/Python/Java/whatever? For example, one could bind the python
command to expand-macros | python
or whatever. The code would still be portable to people who don't use the macro system, one would just expand the macros before publishing code. But I don't know of anything like that except templates in C++/Haskell, which I gather aren't really the same. What about LISP, if anything, makes it easier to implement macro systems?
LISP – Why LISP Makes Implementing Macro Systems Easier
common-lisplispmacrosschemetemplates
Best Answer
Many Lispers will tell you that what makes Lisp special is homoiconicity, which means that the code's syntax is represented using the same data structures as other data. For example, here's a simple function (using Scheme syntax) for calculating the hypotenuse of a right-angled triangle with the given side lengths:
Now, homoiconicity says that the code above is actually representable as a data structure (specifically, lists of lists) in Lisp code. Thus, consider the following lists and see how they "glue" together:
(define #2# #3#)
(hypot x y)
(sqrt #4#)
(+ #5# #6#)
(square x)
(square y)
Macros allow you to treat the source code as just that: lists of stuff. Each of those 6 "sublists" contain either pointers to other lists, or to symbols (in this example:
define
,hypot
,x
,y
,sqrt
,+
,square
).So, how can we use homoiconicity to "pick apart" syntax and make macros? Here's a simple example. Let's reimplement the
let
macro, which we'll callmy-let
. As a reminder,should expand into
Here's an implementation using Scheme "explicit renaming" macros†:
The
form
parameter is bound to the actual form, so for our example, it would be(my-let ((foo 1) (bar 2)) (+ foo bar))
. So, let's work through the example:cadr
grabs the((foo 1) (bar 2))
portion of the form.Then, we retrieve the body from the form.
cddr
grabs the((+ foo bar))
portion of the form. (Note that this is intended to grab all the subforms after the binding; so if the form werethen the body would be
((debug foo) (debug bar) (+ foo bar))
.)lambda
expression and call using the bindings and body we have collected. The backtick is called a "quasiquote", which means to treat everything inside the quasiquote as literal datums, except the bits after the commas ("unquote").(rename 'lambda)
means to use thelambda
binding in force when this macro is defined, rather than whateverlambda
binding might be around when this macro is used. (This is known as hygiene.)(map car bindings)
returns(foo bar)
: the first datum in each of the bindings.(map cadr bindings)
returns(1 2)
: the second datum in each of the bindings.,@
does "splicing", which is used for expressions that return a list: it causes the list's elements to be pasted into the result, rather than the list itself.(($lambda (foo bar) (+ foo bar)) 1 2)
, where$lambda
here refers to the renamedlambda
.Straightforward, right? ;-) (If it's not straightforward for you, just imagine how difficult it would be to implement a macro system for other languages.)
So, you can have macro systems for other languages, if you have a way to be able to "pick apart" source code in a non-clunky way. There are some attempts at this. For example, sweet.js does this for JavaScript.
† For seasoned Schemers reading this, I intentionally chose to use explicit renaming macros as a middle compromise between
defmacro
s used by other Lisp dialects, andsyntax-rules
(which would be the standard way to implement such a macro in Scheme). I don't want to write in other Lisp dialects, but I don't want to alienate non-Schemers who aren't used tosyntax-rules
.For reference, here's the
my-let
macro that usessyntax-rules
:The corresponding
syntax-case
version looks very similar:The difference between the two is that everything in
syntax-rules
has an implicit#'
applied, so you can only have pattern/template pairs insyntax-rules
, hence it's fully declarative. In contrast, insyntax-case
, the bit after the pattern is actual code that, in the end, has to return a syntax object (#'(...)
), but can contain other code too.