Is there any paper describing how this extends to objects containing N pointers ...

cwzwarich · on March 3, 2018

I find the easiest way to think about this trick is to consider the CPS transformation of a depth-first traversal of a tree (in the case of a general graph, you use a DFS spanning tree). At each recursive call, you need to be able to produce and store the continuation upon return. Upon returning from a recursive call, you need to rematerialize the continuation.

The object pointer for the most recent continuation is stored in a global variable, and others are stored in a stack constructed by using pointers in objects that can now be inferred from the stacking nature of the recursion.

Part of the information for the code pointer for the most recent continuation is stored in the object pointer, i.e. from the object header you narrow the possibilities to process it. For an object containing N fields, you need log_2 N bits to store the field you were last processing. Since each object can appear in at most one stack frame of the recursive traversal (modulo where you put the marked check to bail out in the graph case), you need to put these log_2 N bits somewhere in the object.

You have a few options for where to put the bits:

- Find a few bits lying around where you can store them. This works pretty well for objects with few fields.

- Add an extra field with log_2 N bits to store them. This is the most straightforward solution for a larger object with an arbitrary number of fields.

- If you have an easy way of locating object headers from internal pointers, you can use internal pointers as your back pointers.

- Steal a few bits from every field that has a pointer value stored in it. This works because you actually only need log_2 M bits where M is the number of fields containing a pointer. The non-pointer fields can all be skipped.

This scheme can be generalized to implement arbitrary structural recursion on polynomial datatypes:

https://www.sciencedirect.com/science/article/pii/0096055193...