2024-06-14-Smallness
2024-06-14-Smallness
I learned some interesting lessons when studying Sector Lisp (https://justine.lol/sectorlisp2/). Sector Lisp is especially small, not due only to its assembler tricks, but, due to some very fundamental programming ideas. Here are some of the things I discovered.
Keep Types Simple
e.g. using indices as type tests:
> 0 ≡ Atom
< 0 ≡ Cons cell
= 0 ≡ Nil
The above types are fundamentally supported by most CPUs. Branch on 0, branch on positive, branch on negative. From: McCarthy’s Lisp 1.5 and Sector Lisp.
Lots of branching, e.g. using if-then-else, causes the CPU to slow down. CPUs have several built-in operations for checking the sign of integers, and, these tend to be efficient. Use such simple tests for indices and pointers.
Developers want a rich set of types that can be checked by a type checker, but, end-users don’t care about such types. I wonder if types in end-user code should be simplified into hierarchies of simple binary tests?
Pure FP
Functional Programming (FP) means that data is always allocated on a stack in LIFO order, not in “random” order in a heap. This means that GC (Garbage Collection) can be greatly simplified.
For example, Cons cells can only be generated in a stack-like manner by any function. Reset the Cons cell stack pointer when the function terminates, keep only the cells that need to be returned from the function. Due to LIFO ordering, any retained cell that refers to other cells refers to cells strictly below it (I think of a stack growing upwards). All functions clean up after themselves, so there is no garbage left on the stack below the cells generated by the current function.
Sector Lisp uses the ABC garbage collection algorithm, which works atomically with the cells on the stack starting at the stack pointer when the function was entered. ABC has a pointer to the cell that needs to be returned. All other cells can be reclaimed by moving the kept cell down in the stack to the previous position of the stack pointer, as depicted in the following sketch.
In Sector Lisp, each function cleans up after itself - again, this can be done because of the pure functional nature of Sector Lisp. This incremental GC (Garbage Collection) means that there is no “stop the world” event and no need for complicated GC strategies. The stack is incrementally compacted.
Unfortunately, most current languages, like Python, Scheme, etc., rely on heavy-handed non-LIFO garbage collection, hence, are less efficient than they might be.
Certain kinds of things, like closures, must be kept on a persistent heap, but, the whole language doesn’t need to pay for this feature. A smaller, heap-based language, might be used in conjunction with a pure FP language, resulting in two smaller languages which co-exist and concentrate only on their own sweet-spots.
It is more efficient to have programmers separate between LIFO code and heap-based code by giving them two languages for programming, and, having the IDE join them together and rewrite the result in an optimized form for production.
Generational GC is a generalized technique for optimizing garbage collection based on peculiarities of the programs, but, it should be more efficient to let programmers tell compilers what’s going on (through language syntax and use), instead of trying to guess at runtime. The Odin programming language (https://odin-lang.org/) is a step in this direction, but, programmers must work harder to make persistent objects.
Of course, if you provide two languages for programming, why not more than two? Pattern-matching - like Prolog, PEG parsing, miniKanren, OhmJS, etc. - indicates a rich subset of capabilities needed by programmers.
Interning
All atoms are created only once. The algorithm is:
Read an Atom from the text input (e.g. a keyword) and convert it into internal form as follows:
Check if the Atom already exists in memory, return its pointer
If the Atom does not exist, create a new slot for it in memory and return a pointer to that.
This is essentially a hash-table strategy. McCarthy did not use hash tables and did a linear search through Atom space. We can do better, now.
See Also
References: https://guitarvydas.github.io/2024/01/06/References.html
Blog: https://guitarvydas.github.io/
Videos: https://www.youtube.com/@programmingsimplicity2980
Discord: https://discord.gg/Jjx62ypR
Leanpub: https://leanpub.com/u/paul-tarvydas
Gumroad: Gumroad: https://tarvydas.gumroad.com
Twitter: @paul_tarvydas