Experiments With a Python Indenter
Towards Higher Level Syntax for Programming Languages 2024-09-09
Abstract
To use t2t techniques to emit Python, we must cope with the fact that Python is indentation-sensitive, whereas most of our software development tools, such as parser generators and compiler generators, encourage the use of context-free, indentation-insensitive, nested, bracketed languages.
In this article, I show a simple technique for generating correctly indented Python with a small post-pass that consists of some 40 lines of code.
This technique opens the door to inventing new indentation-based programming languages, for example by using Markdown as a programming language.
Spoiler
I write code that generates pseudo-Python code, then clean up the result to produce legally indented Python.
I insert indentation and de-indentation markers into the generated code, using two Unicode characters that I reserve for this purpose: “⤷” and “⤶”.
At the very end of the code generation process, I fix up the indentation by replacing the indentation markers with the appropriate number of spaces. The code for doing these fix-ups, kept in an isolated unit, is quite simple: some 40 lines of JavaScript.
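
To make the idea concrete, here is a minimal sketch of such a post-pass. It is written in Python for illustration and is not the JavaScript code described above. The function name `fix_indentation`, the 4-space indent width, and the exact marker semantics are my assumptions: each “⤷” raises the nesting depth for the text that follows it, each “⤶” lowers it, and every non-blank line is emitted at the depth in effect when its first visible character is reached. Blank lines are dropped in this sketch.

```python
INDENT = "⤷"  # marker: nesting depth goes up by one for what follows
DEDENT = "⤶"  # marker: nesting depth goes down by one for what follows


def fix_indentation(marked: str, width: int = 4) -> str:
    """Replace indentation markers with leading spaces (hypothetical helper)."""
    depth = 0
    lines = []
    for raw in marked.split("\n"):
        chars = []
        line_depth = None  # depth at the first visible character of this line
        for ch in raw:
            if ch == INDENT:
                depth += 1
            elif ch == DEDENT:
                depth -= 1
                if depth < 0:
                    raise ValueError("more de-indent markers than indent markers")
            else:
                if line_depth is None and not ch.isspace():
                    line_depth = depth
                chars.append(ch)
        body = "".join(chars).strip()
        if body:  # lines holding only markers or whitespace are dropped
            lines.append(" " * (width * line_depth) + body)
    return "\n".join(lines) + "\n"


# Pseudo-Python as a code generator might emit it: flat lines plus markers.
marked = (
    "def greet(name):⤷\n"
    "if name:⤷\n"
    "print('hello', name)⤶\n"
    "else:⤷\n"
    "print('hello')⤶⤶\n"
    "greet('world')\n"
)

print(fix_indentation(marked))
```

The generator never needs to track column positions. It only emits a marker where a block opens or closes, and the post-pass turns the running depth into spaces.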

