Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Crafting Interpreters: Closures (craftinginterpreters.com)
269 points by azhenley on Sept 27, 2019 | hide | past | favorite | 60 comments


This is one of my favorite books(-in-progress) to follow along with. If you've ever even had a passing curiosity in how interpreters work, I highly recommend checking it out from the beginning: http://craftinginterpreters.com/welcome.html

The writing is charming and approachable, while still packed with an impressive amount of knowledge.


> The writing is charming and approachable, while still packed with an impressive amount of knowledge.

It's actually the best technical writing I have ever seen, and I have read a lot. He can explain a complex thing really easy, and bring it very personally, as if he's a good, relaxed and funny friend explaining things.


Then you should check his book on game programming patterns:

http://gameprogrammingpatterns.com/index.html


Haha, I'm in the book of game programming patterns :D (deWiTTERS game loop)


Thank you! :D


No, thank you for writing these! I haven't read your book on interpreters yet, but your game programming patterns book helped me a ton when I was building my own game engine.


>"The writing is charming and approachable, while still packed with an impressive amount of knowledge."

Agreed. I came here to say the same. "Charming" and "approachable" sums up the writing style nicely. I'm always excited when there's a new chapter.


Any more gems like this you would recommend?


“Compiler Construction” by Niklaus Wirth (2014) [pdf]

https://www.inf.ethz.ch/personal/wirth/CompilerConstruction/...

Modern Compiler Implementation in Java, C and ML.

https://www.cs.princeton.edu/~appel/modern/

From Nand to Tetris

https://www.nand2tetris.org/


Niklaus Wirth's, "Algorithms + Data Structures = Programs", is also a great read for the last chapter, "Language Structures and Compilers", which is a nice succinct read. It provides a little more formalism than Crafting Interpreters and it's introduction of representing grammar as a syntax diagram was especially helpful for me to visually understand the challenges of having more than 1 token of lookahead in LL(k) parsers.


Yes. I love this book. At a practical level, it's pretty (very) outdated. It talks about things like sorting when your data is stored on tape.

But it's just a beautiful, succinct book. If you get tired of how messy, fast-moving, and chaotic a lot of software development can be, it's a delightful respite. It cleanses and orders the mind.


From what I have read about Wirth his way of thinking is the computer science equivalent of Bauhaus minimalism. Which oddly enough doesn't appeal to me in the design of everyday things all that much (I can definitely respect it though), but feels very fitting for computer science.


Wirth's minimalism is, IMO, driven by some semi- or full understanding of formal semantics, wherein simplicity is essential to correctness in implementation of the language (as the language designer), and essential to the correct use of the language (as a programmer). This is my opinion formed from what I read so beware.

C.A.R. Hoare has excellent stuff to say on this and hardware design - here's his 1982 turing award lecture https://dl.acm.org/ft_gateway.cfm?id=1283936&type=pdf and it's well worth reading (quite short).

Extracts:

" [...]in May 1965, Niklaus Wirth was com- missioned to collate them into a single language design [the successor to algol 60]. I was delighted by his draft design which avoided all the known defects of ALGOL 60 and included several new features, all of which could be simply and efficiently implemented, and safely and conveniently used.

The description of the language was not yet complete. I worked hard on making suggestions for its improve- ment and so did many other members of our group. By the time of the next meeting in St. Pierre de Chartreuse, France in October 1965, we had a draft of an excellent and realistic language design which was published in June 1966 as "A Contribution to the Development of ALGOL", in the Communications of the A CM. It was implemented on the IBM 360 and given the title ALGOL W by its many happy users. It was not only a worthy successor of ALGOL 60, it was even a worthy predecessor of PASCAL."

But people wanted as much as possible in so instead...

"Three months came and went--not a word of the new draft appeared. After six months, in October 1966, the ALGOL working group met in Warsaw. It had before it an even longer and thicker document, full of errors corrected at the last minute, describing equally obscurely yet another different, and to me, equally unattractive language. The experts in the group could not see the defects of the design and they firmly resolved to adopt the draft, believing it would be completed in three months. In vain, I told them it would not. In vain, I urged them to remove some of the technical mistakes of the language, the predominance of references, the default type conversions. Far from wishing to simplify the lan- guage, the working group actually asked the authors to include even more complex features like overloading of operators and concurrency."

[...]

"At last, in December 1968, in a mood of black depression, I attended the meeting in Munich at which our long-gestated monster was to come to birth and receive the name ALGOL 68." [Edit: added this para]

I think Wirth (or Hoare?) said algol 60 was an improvement algol 68. [Sentence edited]

He then goes on to describe the creation of PL/1 which is an even worse experience.

Anyway, the whole paper's well worth reading. It has a lot of good stuff and is only 9 sides long.


Thank you for both that comment and that link!


> Niklaus Wirth (2014)

The fact that he's still writing is kind of amazing. Besides him and Knuth, who from their generation of computer scientists are still active?


Bob Nystrom (aka munificent)'s previous book "Game Programming Patterns"[0] is really nice, and there's lots of things in there that work outside of a game context.

[0] gameprogrammingpatterns.com


This is a great introduction to writing an interpreter if you know, or are learning, Go:

https://interpreterbook.com/

There's a follow-up as well: https://compilerbook.com/


Wish there were more books like this for other languages... I like learning programming languages, but between a job and a family, with 30 mins in the evening to myself until I'm totally brain dead, I can't write any code myself that's not job related. Books like these let me learn and experience new languages vicariously.


A colleague of mine just starting writing an online book about making a operating system for RISC-V in Rust: http://web.eecs.utk.edu/~smarz1/osblog/


Writing an Interpreter in Go and it's sequel Writing a Compiler in Go by Thorstan Ball are both excellent reads.

Also enjoyed Game programming patterns by Bob again, but that has already been mentioned.


Check this free book:

Introduction to Compilers and Language Design by Douglas Thain

http://compilerbook.org/


Crenshaw's "Let's Build a Compiler" is also a very lucid read: https://compilers.iecc.com/crenshaw/


The implementation described here is inspired by Lua. Even if you already know what a closure is, it might still be a worthwile read, since it probably implements them very differently from what you are thinking!

Instead of always heap-allocating variables that are used by closures, it starts by stack-allocating them and only moves them to the heap if the closure outlives its parent function. In order for this to work, the inner function always accesses the outer variables through an indirection (upvalue).

The main advantage of this approach is that it is compatible with a single-pass compiler, without sacrificing performance in the common case where closures are not present. Since the generated code for using a variable is the same no matter whether it is used by inner functions or not, the compiler can start emitting code as soon as it sees the variable declaration, without needing to look ahead to find where the variables are going to be used.


Another approach is to do the opposite - always store all local variables in the heap, and rely on scalar replacement of aggregates to put them back onto the stack where possible. This is what I do in my Ruby interpreter.

That’s definitely not compatible with a single-pass though!


A third option that can still be single pass is the Smalltalk way, where the stack frames themselves are first-class objects: the closures then keep a reference to the stack frame and the GC won't collect any live stack frames, which will contain closed-over variables.


Yes that's really what I'm talking about, but plus the (really required for practical performance) optimisation of virtualising the frames onto the stack if they don't escape an activation.


Very interesting! Have you written anything about your implementation? I'm always keen to learn new techniques. I don't have much back end optimization experience (yet).


Section 4.2 of http://lafo.ssw.uni-linz.ac.at/papers/2013_Onward_OneVMToRul... describes it.

> We ensure this by forcing an escape analysis ... of the array containing the values of local variables. This eliminates every access to the array and instead connects the read of the variable with the last write. ... After conversion to SSA form, guest-language local variables have no performance disadvantage compared to host language local variables.

Then if the closure escapes (stored into the heap or something), the escape analysis just naturally fails and the frame is stored on the heap again.


Author here! Happy to answer questions, accept criticism, etc. :)


I've been anticipating the Garbage Collection chapter (next on the list!) for quite some time! Do you have an estimated time frame for when it will be released? Thanks for the great book!


> I've been anticipating the Garbage Collection chapter (next on the list!) for quite some time!

Me too! It's one of the chapters I've been most looking forward to writing. (This chapter on closures was another.)

> Do you have an estimated time frame for when it will be released?

Chapters take me roughly a month, but there's pretty high variance depending on what else is going on in my life. I work on it every day, but the amount of time per day varies. My hope is that it will get faster as I get closer to the finish line. It's been a long marathon, longer than I initially anticipated.


Do you have the back cover ready for this book? The back of Game Programming Patterns makes me happy every time I see it.


Ha, this has definitely come up for discussion.

My original plan was to have a photo with me and both of my dogs with the implication that each later book must feature yet another additional dog.

Unfortunately, Ginny, the dog in the photo on the back of Game Programming Patterns died earlier this year. (This makes it even more meaningful to me that she is immortalized on the book. You'd be surprised how many strangers have asked me about her.)

So, I don't have any firm plans, but I might just try to do a new photo with me and my other dog, Benny. He's getting pretty old too, though, so I guess the pressure is on for me to finish the book.


Checked the table of content, didn't find continuations. Am I missing something or is this material too much outside of the intended scope?

It's perhaps also too different, but I'd be interested in ways of saving the state of program execution and restoring that state later. Different from continuations, but overlaps...


Out of scope, alas. One of the real challenges of the book was deciding which language features to include and which to omit. It's on track to an enormous book — it's over 200k words already. So I had to cut what I could. The language in the book doesn't even have arrays.

But it does have lexical scope, closures, objects, inheritance, a full operator precedence table, and a bunch of other fun stuff.


Thanks for creating this great resource and making it available for free!


Why would one want to build a single-pass interpreter? (Feel free to point me at an earlier chapter that answers this. I skimmed them a bit but didn't immediately find an answer.)


I subscribed to the updates and it's great! I recommended it to other people and they're also glad with it. Thank you for sharing your work.


A title like "Crafting Interpreters" on this gets my hopes up. I expect something about building an emulator like Qemu, MAME, or Xenia.

For a programming language, you'd want to use terms like "parser" and hopefully "compiler", not an unqualified "interpreter".


Tiny question about lox : does it support arrays ? I understand it might not be that interesting as a concept so its removed from the example language to keep it smaller, but I was wondering if I missed anything ? If not, how would those be declared or implemented ?


It does not. I agonized over whether to include them because the language is really limited without them. I even had an implementation. But it wasn't very conceptually interesting, so ultimately I decided it could be cut.

There is an exercise in one of the chapters to add them, and you can see my answer to that here:

https://github.com/munificent/craftinginterpreters/blob/mast...


So, there's one thing here that strikes me as really weird. And I think it's best summed up at the end, where it discusses closing over a loop variable.

It discusses how if you close over a loop variable, people expect that each function created will get a different value of the variable, even though there's only one variable, and closures close over variables, not values. It treats this as a special case. But to my mind this isn't a special case; it just highlights that having closures close over variables rather than values is, well, not right. Like, if I write, to use some sort of pseudocode,

  x = 1;
  f = function() { print(x) };
  x = 2;
  g = function() { print(x) };
...then I expect that f will print 1 and g will print 2. Because, well, I'm coming in from lambda calculus and Haskell where everything is values. Of course, there this problem doesn't come up because there's no mutability -- but that's still how I'd expect things to work once you introduce it. Closing over variables just seems horribly counterintuitive to me.

This came up earlier, too, when the was going through all the machinery with upvalues, and I was just like, why doesn't it just copy in the values? And after reading further, it's like, oh, it's because they want to be able to modify the variable outside the function! Which, like, whoa.

Like obviously what I said above about how I expect closures to work isn't workable if you want to be able to mutate the closed-over variables, outside the closure, from inside the closure. But maybe that just shouldn't be allowed? I think if I were writing a language from scratch, and it included lambdas, they'd close over values, not variables, and mutating the closed-over variables would have no effect on the world outside the closure (or perhaps be disallowed entirely).

So, basically... are you sure that people expect that behavior with loop variables because they are thinking of each iteration as a new variable? Or is it because they don't expect closures to close over variables at all, but rather values, and would be surprised that loop variables are being treated as a special case rather than closures just always working that way? Because in my case it's definitely the latter.


From a functional perspective, this of course doesn't make any sense. But from an imperative perspective it does.

A common use for closures in imperative languages is to have custom control structures similar to `if`, `for`, `while`, etc. And in imperative languages mutating values is very important, if you want a new control structure you have to be able to mutate your surrounding variables.

Smalltalk only uses this kind of control structures based on closures. Ruby also uses this a lot, it even changes the meaning of `return` in closures (blocks) to return not from the closure but the enclosing function.


This mechanism that you suggest about copying values is how Lua used to work before version 5.0, when they came up with the current upvalue mechanism.


I think some languages let you choose to capture variables or values - does C++ do that?


C++ and Rust do let you capture by value. Here is a Rust example that outputs exactly what you would naively expect.

    fn main() {
        let mut x = 1;
        let f = move || println!("{}", x);
        x = 2;
        let g = move || println!("{}", x);
        f();
        g();
    }


Sort of related with gcc's nested function extension, you can reference variables that in scope when the function is defined. If you pass a pointer to the nested function is can access those variables (via a trampoline on the stack).

To me that's been very useful to have both to get around not being able to use malloc.

Meaning I think it's a good idea actually.


Small piece of ui/ux feedback. It would be awesome if you moved the next and previous buttons to a static position so I can quickly page through the book. Right now they hop up and down.


Thanks for mentioning that. I notice this too and it bugs me. I've tried a few different layouts and positions for those navigation buttons and so far haven't found anything else I like better. My thinking at the time was that most readers aren't quickly paging through chapters so it's not a key affordance.

I'll probably do some site tweaks after I finish the last chapter. In the meantime, the layout is quite responsive. If you make the window narrower, eventually you get to a single column layout with the navigation statically positioned on top.


Hey, thanks for listening to the feedback. Also, layout feedback aside, I love your book. <3


I am recently working on a hobby programming language: https://github.com/rust-shell-script/rust-shell-script, which could compile to either rust code or shell script. This book helps me clarified a lot of things along the implementation.


This was a great time for this to be posted. I've been spending a lot of time with Racket, and closures are a big part of that. I believe (though maybe I'm wrong) that Racket's closures don't have the issue described in this chapter of closing over values vs. variables due to the functional nature of the language.


They do, actually. Scheme supports `(set! some-var)` to reassign values to variables, which means you can tell if a closure captures a value or variable.


Cool, thanks for bringing my attention to this!


Are there any advantages or disadvantages with learning how to build an interpreter versus learning how to build a compiler? I'm asking since I will be taking a compiler class in college soon and this book looks really interesting to me. Usually I see school curriculums have a class on compilers instead of interpreters.


As someone with a bit of a Lua background, reading an example with a local variable named `local` makes my head hurt :)


Does anyone know of any similar material that helps you create a database / distributed database from scratch?


I'm in the hunt because I try to build a relational language (http://tablam.org - looking for help!) and have collected as many resources as I could.

Apart of the link given by @azhenley, some interesting stuff:

About queries:

- "Building Efficient Query Engines in a High-Level Language" by Yannis Klonatos, Christoph Koch, Tiark Rompf1, and Hassan Chafi

- "Optimizing relational algebra operations using generic partitioning discriminators and lazy products" by Fritz Henglein

The relational model

- "An Introduction to Relational Database Theory" by Hugh Darwen

About the whole thing:

- "Database Systems: The Complete Book" by Hector Garcia-Molina Jeffrey D. Ullman Jennifer Widom

- "Database Management Systems 3rd Edition" by Ramakrishnan

About more modern stuff:

- Readings in Database Systems

- The Design and Implementation of Modern Column-Oriented Database Systems


Here is a 13 part series on implementing a database: https://cstack.github.io/db_tutorial/


It misses the optimization and execution part, i.e. how joins, filtering and aggregation is executed




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: