Hacker News
Macros in Python (github.com/lihaoyi)
300 points by lihaoyi on May 13, 2013 | 66 comments



If you're interested in doing some of this stuff in JS, there's sweet.js[1] for macros. I've also implemented some similar stuff as libraries: adt.js[2], matches.js[3], and tailrec.js[4]. By implementing them as libraries, I've given up some of the nicety of native-looking syntax, but it requires no preprocessing and only ES3.

[1] http://sweetjs.org/

[2] https://github.com/natefaubion/adt.js

[3] https://github.com/natefaubion/matches.js

[4] https://github.com/natefaubion/tailrec.js


This is awesome. Everything here (except the anonymous arguments to lambdas; seriously, just use x, y, and z as default argument names) should be in Python 4, or whatever you'd call the successor to Python.

I love love love the PEG implementation.


I agree, nice clean work. So many try to simply "outdo" with elegance and "smart code"; clean and simple is a gem that's hard to find. I will gladly contribute going forward.


If you're serious about wanting to contribute, my email is on my GitHub profile, so ping me and I can help you get up to speed!


^ or me


I implemented a sorting-network function generator, but Python's limited lambdas seem to force me to make the macro declare a real function with a magic name (see below). Also, I didn't figure out how to use the walker or quasiquotes, so I'm just generating source code and taking the AST of that.

  #run.py
  import macropy.core.macros
  import target

  #batcher.py
  # http://en.wikipedia.org/wiki/Batcher_odd%E2%80%93even_mergesort

  def oddeven_merge(lo, hi, r):
    step = r * 2
    if step < hi - lo:
        for x in oddeven_merge(lo, hi, step): yield x
        for x in oddeven_merge(lo + r, hi, step): yield x
        for i in range(lo + r, hi - r, step): yield (i, i+r)
    else:
        yield (lo, lo + r)

  def oddeven_merge_sort_range(lo, hi):
    if (hi - lo) >= 1:
        mid = lo + ((hi - lo) // 2)
        for x in oddeven_merge_sort_range(lo, mid): yield x
        for x in oddeven_merge_sort_range(mid + 1, hi): yield x
        for x in oddeven_merge(lo, hi, 1): yield x

  def oddeven_merge_sort(n):
    return list(oddeven_merge_sort_range(0, n-1))

  #target.py
  from macro_module import macros, my_expr_macro
  import dis

  my_expr_macro%(8)

  dis.dis(fun8)

  print fun8([2, 4, 3, 5, 6, 1, 7, 8])

  #macro_module.py
  from macropy.core.macros import Macros
  from macropy.core import *
  from ast import *
  macros = Macros()

  import macropy.macros

  from batcher import oddeven_merge_sort

  @macros.expr
  def my_expr_macro(tree):
      return parse_stmt(
          "def fun" + str(tree.n) + "(a):\n" +
          "\n".join("  if a[{0}] > a[{1}]: tmp = a[{0}]; a[{0}] = a[{1}]; a[{1}] = tmp".format(x, y)
                    for (x, y) in oddeven_merge_sort(tree.n)) +
          "\n  return a")
When run, it spits out the Python bytecode disassembly, showing the unrolled sorting loop.
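
For comparison, here is a hedged, macro-free sketch of the same idea in plain Python 3: the Batcher functions from the comment above (rewritten with `yield from`), plus a generator that builds the unrolled comparator source and `exec`s it. The names `make_sorter` and `fun8` are illustrative, not MacroPy API.

```python
def oddeven_merge(lo, hi, r):
    # Batcher odd-even merge, as in the comment above (Python 3 style).
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)
        yield from oddeven_merge(lo + r, hi, step)
        yield from ((i, i + r) for i in range(lo + r, hi - r, step))
    else:
        yield (lo, lo + r)

def oddeven_merge_sort_range(lo, hi):
    if (hi - lo) >= 1:
        mid = lo + ((hi - lo) // 2)
        yield from oddeven_merge_sort_range(lo, mid)
        yield from oddeven_merge_sort_range(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)

def make_sorter(n):
    # Emit one comparator line per (i, j) pair, then exec the source.
    lines = ["def sorter(a):"]
    for i, j in oddeven_merge_sort_range(0, n - 1):
        lines.append("    if a[%d] > a[%d]: a[%d], a[%d] = a[%d], a[%d]"
                     % (i, j, i, j, j, i))
    lines.append("    return a")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["sorter"]

fun8 = make_sorter(8)
print(fun8([2, 4, 3, 5, 6, 1, 7, 8]))  # -> [1, 2, 3, 4, 5, 6, 7, 8]
```

The difference from the macro version is only *when* the code is generated: at runtime here, at import time with MacroPy.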


Author here,

I just put up a Detailed Guide:

https://github.com/lihaoyi/macropy#detailed-guide

That goes into more detail about how macros are written, and walks you through writing a simple but useful macro (the quick-lambda macro). It should greatly help anyone who wants to try their hand at some metaprogramming =)


This is really impressive. The whole contents of that module are fantastic. Ever since I've started to play around with Scala, I've missed case classes and pattern matching in Python. The ability to write AST macros is even cooler.


This whole library is flat-out fascinating. I'm looking forward to trying it out. Macros? The PINQ stuff?


While the specific feature is fun to see in Python, I'm much more impressed by the fact that MacroPy enables you to do AST transformations on Python modules on the fly. This allows for developing DSLs, as they show with PINQ (a LINQ "clone").


Hy [1] is also making a lot of progress these days. It now has eval, classes, and macros working [2].

[1] Lisp in Python https://github.com/paultag/hy

[2] https://twitter.com/paultag/status/333637217765441537


Something about classes that are nested inside other classes suddenly being available as top level classes seems wrong. I'm not super deep on the innards of Python, but that doesn't look right.


Other author here: like lihaoyi said, it's true that the way we defined things doesn't match standard Python namespacing, but it saves some typing, and I think it's easy to get used to the semantics of our case classes. Our macro just transforms the "namespace tree" you would normally get from writing code like that into an "inheritance tree", where all the subclasses' names sit at the level of the top class. If you wanted to do something like this with proper namespaces, you could probably achieve something similar without macros by using metaclasses.
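
A rough sketch of that metaclass route (not MacroPy's implementation; `CaseMeta` is an invented name): each class nested in the body is rebuilt to inherit from the enclosing class, turning the "namespace tree" into an "inheritance tree" while keeping the names qualified (`List.Nil`, `List.Cons`) rather than top-level.

```python
class CaseMeta(type):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        for attr, value in list(namespace.items()):
            if isinstance(value, type):
                # Rebuild each nested class with the outer class as a base,
                # dropping the slots that type() would refuse to duplicate.
                body = {k: v for k, v in value.__dict__.items()
                        if k not in ('__dict__', '__weakref__')}
                setattr(cls, attr, type(value.__name__, (cls,), body))
        return cls

class List(metaclass=CaseMeta):
    class Nil:
        pass
    class Cons:
        pass

print(issubclass(List.Cons, List))  # -> True
```

The nested classes become subclasses of `List`, but are still only reachable as `List.Nil` and `List.Cons`, matching standard Python namespacing.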


> it saves some typing

Your macropy is very interesting, but please do not use this argument. "Saving some typing" is the least interesting side-effect of some macros.

To clarify my point: in most of the industry, we all fight hard to garden big codebases with intricate dependency problems. I am convinced that code complexity management is very far from a solved problem, and I hope advanced tools like macros can help there.

But "saving a few keystrokes" is very often at the heart of the stupides issues we have with code complexity.

Two examples:

+ "from x import " <-- this saves a few typing, but it become extremely annoying as soon as your code get more than 10 modules.

+ template.render(tpl_file, **locals()) <-- this feeds the template namespace with junk and makes it much harder to trace data between templates and models. I daily revile the guy who put "**locals()" in the Mako docs.


I urge you to take this criticism more seriously. With your library, a programmer can change the semantics of class nesting in the language just by prepending a very missable and ambiguous @case to the class. I'm no expert, but I think this is pretty bad in any language, especially one that says "explicit is better than implicit".


We do take it seriously; it's a hard tradeoff between Novelty and Utility. Given that the whole point of macros is to change the semantics of the Python language, this (valid) criticism applies not just to case classes but macros in general. If you go haywire with macros, all hell breaks loose.

We think that with some discipline, it is possible to come up with macro transformations which are both useful and understandable by a programmer. This means setting clearly defined semantics for the transformations, which is difficult but doable.

If you think macros are kinda crazy (well, they are!) you should look at the implementation of such pythonic constructs as `namedtuple`, the `ast` module, and the new Enum library coming out!


The implementation of namedtuple is pretty crazy, but the interface is simple and doesn't change the language semantics.


I'd argue that "fields as a string with spaces inside, or maybe a list of strings" is changing language semantics quite a bit.

We're used to namedtuples doing it like this now, but if namedtuples didn't exist, "fields as a string with spaces inside, or maybe a list of strings" would definitely not make it past code review, and would probably get me yelled at by my future colleagues.
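
For readers who haven't seen it, this is the interface in question: `namedtuple` accepts the field names either as one space- or comma-separated string or as a sequence of strings, and both produce the same class.

```python
from collections import namedtuple

# Two spellings of the same declaration.
PointA = namedtuple('Point', 'x y')      # one string, split on whitespace
PointB = namedtuple('Point', ['x', 'y'])  # a sequence of strings

print(PointA._fields == PointB._fields)  # -> True
print(PointA(1, 2).x)                    # -> 1
```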


Yes, it could be bad, but at the same time the "@case" isn't the only indicator that this class is special. A case class definition can be very visually distinct compared to a normal class.

One thing case classes let you do is implement functionality externally to a class using pattern-matching functions. When you do this, your case class definition will consist of nothing but empty class definitions:

  @case
  class List:
    class Nil():  pass
    class Cons(x, xs): pass
In any case, Python's enums will use fully qualified names, so we may change our current system for compatibility and consistency.


It definitely looks funny from a python point of view; it's inspired by ADTs and OCaml. Perhaps it would be better to leave them to be accessed via a qualified name (e.g. `List.Nil`, `List.Cons`). This is all up for debate.


A qualified name makes more sense to me. Having them exposed as top-level classes just doesn't feel right, given how they are defined.


True. The way it is now is counter-intuitive.


I see the docs mention that these only work when importing modules. The dev version of IPython lets you transform the AST of code that is typed in the REPL. It would be really cool to see an IPython AST transformer based on this. Here is an example from the test files in IPython: https://github.com/ipython/ipython/blob/master/IPython/core/...


Yep, there's an issue for that! https://github.com/lihaoyi/macropy/issues/14

I've also gotten a basic implementation working in the CPython REPL, which requires you to run a function `macro_repl()` before you can use them. A macro-powered REPL would be super cool!


MacroPy now works with the CPython REPL:

    PS C:\Dropbox\Workspace\6.945\Project> python
    Python 2.7 (r27:82525, Jul  4 2010, 07:43:08) [MSC v.1500 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import macropy.core.macros
    0=[]=====> MacroPy Enabled <=====[]=0
    >>> from macropy.macros2.tracing import macros, trace
    >>> trace%[x*2 for x in range(3)]
    range(3) -> [0, 1, 2]
    (x * 2) -> 0
    (x * 2) -> 2
    (x * 2) -> 4
    [(x * 2) for x in range(3)] -> [0, 2, 4]
    [0, 2, 4]
It's somewhat hacky, but it seems to work pretty well. I went through the examples and most of them work perfectly in the REPL, except for those where blank lines inside a class definition confuse it.


Wow, I am blown away by how complete this implementation is. Nice work! Now please post a writeup of how the internals work :)


It's actually pretty simple and almost all in this file:

https://github.com/lihaoyi/macropy/blob/master/macropy/core/...

Those already familiar with the imp mechanism and AST module could probably infer it from the description.

An absolutely beautiful facet of Python's nature is that, with relatively little background, it's quite straightforward to see how all sorts of 'magic' and hacks work at first glance. :)


I really liked this up until the inheritance bit.

That child classes have to be nested classes of the parent doesn't sit right with me. Also, wouldn't it be better if these classes inherited from object in some way?

For the record, I liked everything else that it mentioned. Just the inheritance bit was something I didn't like.


- The classes do inherit from object, iirc.

- We know the nesting is controversial; whether it will remain in its current form has yet to be seen. I think there is some merit in this way of doing it; we know others disagree.


Ah, okay.

Honestly, if I have classes that really do need to inherit from each other, I probably won't be using @case to do it anyways. I see this being extremely useful for the many structures that I need in my application (that may require a few convenience methods) more than anything.


Can anyone give a TL;DR of how this works? Don't x and y need to exist beforehand?


It's hard to give a TL;DR on the whole library, but case classes can be used with for example pattern matching (https://github.com/lihaoyi/macropy#pattern-matching).

The following code:

  with patterns:
      Foo(x, Bar(3, z)) << Foo(4, Bar(3, 8))
The constructor call after the << will be matched against the pattern before it. It will only match if the first argument of "Bar" is 3; if that's the case, x and z will be bound to 4 and 8.
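
To make that concrete, here is a hedged, macro-free rendition of what such a match checks and binds, with throwaway `Foo`/`Bar` classes standing in for MacroPy case classes (the attribute names are invented):

```python
class Foo:
    def __init__(self, first, second):
        self.first, self.second = first, second

class Bar:
    def __init__(self, first, second):
        self.first, self.second = first, second

value = Foo(4, Bar(3, 8))

# Pattern Foo(x, Bar(3, z)): check the shape, then bind the free names.
if (isinstance(value, Foo) and isinstance(value.second, Bar)
        and value.second.first == 3):
    x, z = value.first, value.second.second
    print(x, z)  # -> 4 8
else:
    raise ValueError("pattern match failed")  # the macro also raises on mismatch
```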

Does this help? :)


I'm sure there's a use-case for this, but I'm not coming up with one off the top of my head. This comes across as an extremely rare type of problem to encounter.

What's the purpose of only binding x and z if the first argument of Bar matches? What if it doesn't match? Is an exception thrown? Are x,z just None?

[Note: I'm talking about the example, not the @case decorator, which does seem useful]

Edit:

After reading the library examples, the explanation above is not entirely clear:

  with patterns:
      Foo(x, Bar(3, z)) << Foo(4, Bar(3, 8))
here's the rest of the example:

    print x   # 4
    print z   # 8
I was thinking that somehow a new Foo instance was created only when the pattern matched. (Oh, and an exception is thrown when the match fails)


Author here. The `<<` operator is being abused to mean "bind", so the right side is a normal expression (constructing a Foo and Bar) and the left side matches it to a particular "shape".

The purpose of pattern matching is to replace code that looks like this:

    if  (isinstance(tree, BinOp)
            and type(tree.left) is Name
            and type(tree.op) is Mod
            and tree.left.id in module.expr_registry):
        ...
with code that looks like this:

    if BinOp(Name(id), Mod(), body) << tree 
            and id in module.expr_registry:
        ...
Which looks much nicer, and more clearly says what you want: that `tree` "looks like" a particular shape.

EDIT: Here's another example. Turning this:

    if  ((isinstance(tree, ClassDef) or isinstance(tree, FunctionDef))
            and len(tree.decorator_list) == 1
            and tree.decorator_list[0]
            and type(tree.decorator_list[0]) is Name
            and tree.decorator_list[0].id in module.decorator_registry):
        ...
into:

    if  ((isinstance(tree, ClassDef) or isinstance(tree, FunctionDef))
            and [Name(id)] << tree.decorator_list 
            and id in module.decorator_registry):
        ...
Doesn't quite work yet, but we're getting there


Personally, I find those to be better 'real world' examples. I would suggest including them in the docs. Thanks for the explanation!


With pattern matching, the term on the left is a destructuring of the term on the right. Using a static instance like that isn't what you normally do in practice, but rather something like:

    myfoo = Foo(4, Bar(3, 8))
    with patterns:
        Foo(x, Bar(3, z)) << myfoo
        
So in this case, the match will only succeed if the Foo contains a Bar in its second slot and that Bar contains a 3 in its first slot, while also binding the 4 and the 8 to their own names, x and z respectively.


That's known as destructuring-bind: http://www.lispworks.com/documentation/HyperSpec/Body/m_dest...

Is that a macro created by this library or a feature of the library?

Edit: I did some reading, and it looks to be a macro provided by the library.


From the opening paragraph of the readme.md

> MacroPy provides a mechanism for user-defined functions (macros) to perform transformations on the abstract syntax tree (AST) of Python code at module import time

Basically it rewrites the code before it's compiled, so no, x and y don't need to exist (compilation hasn't finished when the macro runs)
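
A toy version of that rewrite step using the stdlib `ast` module (an invented example, not a MacroPy macro): the transformer below doubles every integer literal before the code is ever compiled, so the "macro" runs on syntax, not on live values.

```python
import ast

class DoubleInts(ast.NodeTransformer):
    # Rewrite each integer literal node into a doubled one.
    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(node.value * 2), node)
        return node

tree = ast.parse("answer = 20 + 1")
tree = ast.fix_missing_locations(DoubleInts().visit(tree))
namespace = {}
exec(compile(tree, "<macro>", "exec"), namespace)
print(namespace["answer"])  # -> 42
```

Nothing in the source has to exist yet when the transformer runs; it only sees syntax tree nodes.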


Decorators can do this? I did not know this. I thought decorators just get handed the object after initialization, so something like:

    @blah
    class Lol:
      pass
is equivalent to:

    class Lol:
      pass
    blah(Lol)
What am I missing?


> What am I missing?

Import hooks.
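
For the curious, here is a minimal sketch of that import-hook machinery using the modern importlib API (MacroPy itself used the older imp-based hooks in 2013; all class and module names here are illustrative). The loader parses the module source, transforms its AST, and compiles the result before the module body ever runs.

```python
import ast
import importlib.abc
import importlib.util
import os
import sys
import tempfile

class DoubleInts(ast.NodeTransformer):
    # Stand-in "macro": double every integer literal in the module.
    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(node.value * 2), node)
        return node

class RewritingLoader(importlib.abc.SourceLoader):
    def __init__(self, fullname, path):
        self.fullname, self.path = fullname, path

    def get_filename(self, fullname):
        return self.path

    def get_data(self, path):
        with open(path, "rb") as f:
            return f.read()

    def source_to_code(self, data, path, *, _optimize=-1):
        # The hook point: transform the AST before compiling the module.
        tree = ast.fix_missing_locations(DoubleInts().visit(ast.parse(data)))
        return compile(tree, path, "exec")

class RewritingFinder(importlib.abc.MetaPathFinder):
    def __init__(self, modules):
        self.modules = modules  # {module name: source path}; only these are hooked

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.modules:
            loader = RewritingLoader(fullname, self.modules[fullname])
            return importlib.util.spec_from_loader(fullname, loader)
        return None

with tempfile.TemporaryDirectory() as d:
    mod_path = os.path.join(d, "hooked_mod.py")
    with open(mod_path, "w") as f:
        f.write("value = 21\n")
    sys.meta_path.insert(0, RewritingFinder({"hooked_mod": mod_path}))
    import hooked_mod  # the loader rewrites 21 -> 42 at import time
    print(hooked_mod.value)  # -> 42
```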


Should've RTFM'ed. Thanks :)


isn't it equivalent to

  class Lol:
    pass
  Lol = blah(Lol)


Sorry, yes.
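
A quick sanity check of the corrected equivalence, with a throwaway decorator standing in for `blah`:

```python
def blah(cls):
    cls.tagged = True  # any observable change will do
    return cls

@blah
class Lol:
    pass

class Lol2:
    pass
Lol2 = blah(Lol2)  # the explicit form of the decorator sugar

print(Lol.tagged, Lol2.tagged)  # -> True True
```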


This is the coolest thing I've seen in a long time! Amazing!


It seems like the loss of code clarity isn't worth it for most of this, but the tail-call optimization is a nifty little feature to have.


Am I the only one who thinks Python is definitely not suitable for syntax magic, the way Ruby or other languages are?


Nope, you're not alone. Though this looks cool, it reminds me of the evilness in C macros.


totally agree


Just a small note: in the definition of the Cons class, the __iter__ method calls len each time before proceeding, and that is terribly inefficient.

  def __iter__(self):
      current = self
      while len(current) > 0:
          yield current.head
          current = current.tail
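
A sketch of the suggested fix, with minimal hypothetical Nil/Cons classes (MacroPy's actual case classes differ): stop on the Nil sentinel instead of calling len(), which presumably walks the whole tail on every step for a linked list.

```python
class Nil:
    pass

class Cons:
    def __init__(self, head, tail):
        self.head, self.tail = head, tail

    def __iter__(self):
        current = self
        while isinstance(current, Cons):  # O(1) test per step, no len()
            yield current.head
            current = current.tail

lst = Cons(1, Cons(2, Cons(3, Nil())))
print(list(lst))  # -> [1, 2, 3]
```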


Yeah, we haven't looked at performance at all.

"The Wonderful Thing About A Dancing Bear Is Not How Well He Dances, But That He Dances At All"

We'll get to teaching him to dance well later!


Speaking of concepts, perhaps Norvig's post "(How to Write a (Lisp) Interpreter (in Python))" http://norvig.com/lispy.html could be useful. The concept of a dotted pair (the car and the cdr) is an interesting bit of history.

I wish you the best for your presentation for the MIT class; it seems interesting, but perhaps it should link to some standard techniques like machine learning.

I am curious about the small projects; will they be available on the course page?


The homework assignments are all online; one of the project requirements is that the code be hosted somewhere public where people can go and see it, so it'd probably be pretty easy to get a list from the prof.


I'm basically seeing lots of things I love in Scala, but in Python. Makes me want to go learn Python now ^_^


A while ago I made a similar thing: https://github.com/mmikulicic/metacontext

It still has some issues with line numbers (in tracebacks, for example) in files where there is macro expansion.


It's a cool idea, but this basically sneaks new features into the Python language through a backdoor, bypassing the public review/feedback process. I would suggest the author create a PEP for these features instead.


Given the language's general trend away from functional programming and Guido's flat-out refusal to implement TCO, I think starting with an implementation, showing interest (if it exists), and then trying to standardize via PEPs is probably a reasonable route to go.


That's rather the point of macros - giving programmers the ability to add new features to a language.


Macros aren't solely used to implement high-level features, mind you - in CL they're introduced without much hassle to cut out gruntwork as well, precisely because an elided form in s-expressions reads just as naturally as the original forms.

Although that level of fragmentation you do talk about is plausible - I'd find it simultaneously disheartening and worth a bit of a chuckle. (I'm reminded of all of the concurrency abstractions CL seems "capable of" hosting with minimal friction - but none with definite thrust.)


What made you target Python 2 instead of Python 3?


Laziness; we already had it installed, and it's what a lot of other people are using.

It's not that we don't like Python 3; its ASTs are considerably simpler (e.g. no print statement), it has importlib, and a bunch of other nice things. It's just that the world we interact with mostly uses Python 2, so we just follow along.


Need a SLIME implementation, please.


[deleted]


ADTs are a more general feature and one that I've missed in Python on more than one occasion.

I'm a little sad that the enum PEP got accepted without them being actual ADTs.


Author here. If you look at my github, you'll see I do use plenty of Scala!


Well, yes; the code is clearly the work of someone who uses and likes Scala. My point was: the library seems designed to let you write "Scala in Python"; wouldn't it be better to use Scala when you want Scala syntax, and write more idiomatic Python when using Python?


Yes, probably, if you're starting from scratch. But I think it's generally good to have a flexible language that lets you write code in any reasonable style. Hopefully this library (the AST transformation part, not the specific macros) is making Python even more Pythonic! Although I suspect Guido would disagree with me.


Because it's hideous?



