[This is part 1 of a short, ongoing exploration of the 2-stage compilation technique, exemplified by the Forth Haiku code. For part 2 go here.]
I was intrigued by the Forth Haiku mentioned in the Handmade Discord. Initially, I considered repurposing the back end for a PBP Pong game. Instead, I discovered a more significant concept: constructing notations and compilers in a two-stage process which is a method that was being developed several decades ago and results in simple construction and portability of compilers.
The first stage involves writing simple code, followed by optimizing it. I decided to document my analysis of this code as a series of YouTube and Substack articles.
To date, I have created two videos that provide an overview of the first stage of compilation. This approach is facilitated by Forth’s “concatenative code” nature. This method appears straightforward enough to implement manually, eliminating the need for a formal compiler to develop architectural applications and games in this manner.
This approach is reminiscent of RTL used in gcc and Cordy’s improvements to the technique. The optimizer appears to use symbolic execution. I have not yet fully analyzed that part. The fact that Forth Haiku is written in JavaScript suggests that this type of optimization is not overly complex, or, that by severely restricting the base language, we can drastically reduce the difficulty of using this idea.
RTL and OCG optimize code locally, whereas the optimizer in this code performs a more global optimization. I tend to believe that full-blown optimization is not necessary for most parts of most applications, so simple ideas like RTL and OCG are probably sufficient for most use cases.
I haven’t yet finished making analysis videos. We will see if my understanding changes.
An upcoming goal might be to separate the pass 1 and pass 2 code into distinct applications that can be run on the command line.
Further
Decision Tree Transmogrifer Playlist: This playlist demonstrates the creation of a minimal notation for writing decision trees. It primarily focuses on using text-to-text transmogrification (t2t) and straightforward code generation. The notation does not need to be Turing complete to be useful; this decision tree notation is severely restricted to choices, yes, no, and actions, which simplifies the implementation of a transmogrifier (transpiler or compiler). Designing general-purpose languages is a complex process, but designing restricted notations (Solution-Centric Notations, or SCNs) results in only a few hours of work. I have not yet created videos on building peepholers (pass 2 optimizations) using this method, but I have implemented exemplar code that has been shared on GitHub.
JSON Parser in T2T: This playlist primarily covers pass 1 of the process, again demonstrating the generation of a grammar and a rewriter in a few hours.
See Also
Email: ptcomputingsimplicity@gmail.com
Substack: paultarvydas.substack.com
Videos: https://www.youtube.com/@programmingsimplicity2980
Discord: https://discord.gg/65YZUh6Jpq
Leanpub: [WIP] https://leanpub.com/u/paul-tarvydas
Twitter: @paul_tarvydas
Bluesky: @paultarvydas.bsky.social
Mastodon: @paultarvydas
(earlier) Blog: guitarvydas.github.io
References: https://guitarvydas.github.io/2024/01/06/References.html








