Portability
2025-10-07
Part 1: The Fox Critique - When Standardization Kills Performance
In 1999, Jeff Fox wrote what might be the most politically incorrect essay in programming language history (https://www.ultratechnology.com/antiansi.htm). Working as Programming Manager at iTV Corporation alongside Forth’s creator Chuck Moore, Fox watched two groups of programmers solve the same problems. One group used ANSI Forth—the freshly standardized, “portable” version of the language. The other used Machine Forth—Chuck Moore’s minimal, hardware-specific approach.
The performance gap wasn’t close. It was carnage.
By Fox’s account, Machine Forth code ran 10 to 1000 times faster and consumed 4 to 100 times less memory. In one case, a JPEG decoder, the Machine Forth version was 100 times smaller and 1000 times faster than its ANSI equivalent.
But the real kicker? Fox claims that 98% of the bugs they chased came from ANSI Forth’s promise of portability.
This wasn’t just a performance debate. It was a fundamental disagreement about what programming should be.
Section 1: What Fox Saw at iTV
Fox describes a workplace divided. Some programmers wrote exclusively in ANSI Forth. Others wrote Machine Forth. The code profiles made it obvious which was which.
In meetings, one programmer would report shrinking a module from 3K to 2K and making it 40% faster. Another would report adding 50K to fix a problem, and the result was now “too slow to measure.”
The ANSI programmers believed they were writing portable code. They tested in their development environments. The code compiled. It ran. It must work, right?
Wrong. When someone else tried to port that “portable” code to the actual target hardware, the bugs emerged. The code had hidden assumptions. It relied on implementation details. It hadn’t been written with the target machine in mind at all.
Fox’s most damning example: A programmer who knew C and ANSI Forth translated a JPEG decoder from C to ANSI Forth. It was buggy and slow. Later, a programmer who “had not been spoiled by ANSI Forth” looked at the problem fresh, thought it through, and wrote Machine Forth code. One hundred times smaller. One thousand times faster.
Section 2: The Philosophy - What Is Forth Supposed To Be?
Chuck Moore created Forth to be radically simple. Fox argues it should be “simpler than Logo” and “taught to children after counting and before long addition.”
The power comes from intimacy with the machine. In Machine Forth, when you write DUP, you know it’s a 2-nanosecond opcode. That knowledge enables optimization. You can feel the cost of every operation.
Machine Forth had 27 opcodes. Fox claims they could teach it in an hour. The next day, new programmers were writing code for embedded wireless ethernet controllers.
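To make “feeling the cost” concrete, here is a minimal sketch of a stack machine in which every opcode carries a fixed, known cost. It is an assumption-laden toy: only the 2-nanosecond figure for DUP comes from the text above. The other costs, the choice of opcodes, and the names COST_NS, run, and static_cost are made up for illustration and are not Machine Forth’s actual instruction set or timings.

```python
# Hypothetical per-opcode costs in nanoseconds. Only DUP's 2 ns figure comes
# from the text; the other numbers are placeholders for illustration.
COST_NS = {"DUP": 2, "DROP": 2, "OVER": 2, "+": 2, "XOR": 2}


def run(program, stack=None):
    """Execute a list of opcodes on a data stack; return (stack, total_ns)."""
    stack = list(stack or [])
    total_ns = 0
    for op in program:
        total_ns += COST_NS[op]  # the cost is known without measuring anything
        if op == "DUP":
            stack.append(stack[-1])
        elif op == "DROP":
            stack.pop()
        elif op == "OVER":
            stack.append(stack[-2])
        elif op == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "XOR":
            b, a = stack.pop(), stack.pop()
            stack.append(a ^ b)
    return stack, total_ns


def static_cost(program):
    """For straight-line code, the cost is just the sum of the opcode costs."""
    return sum(COST_NS[op] for op in program)


double = ["DUP", "+"]  # doubles the top of the stack
print(run(double, [21]))    # stack and accumulated cost after running
print(static_cost(double))  # the same cost, computed without running
```

The point of the toy is that the static cost and the measured cost agree by construction. When every word maps to one opcode with a known price, a programmer can budget time by reading the source.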
ANSI Forth, by contrast, has hundreds of words you need to learn just to get started. “Almost as complex as C,” Fox says.
The ANSI specification is thorough, well-written, created by smart people. But it tries to cover every implementation style, every machine type, every use case. It’s designed to be portable across “any type or size of machine or style of implementation.”
That comprehensiveness is precisely the problem.
Section 3: The Illusion of Portability
Fox identifies two dangerous illusions in ANSI Forth:
First illusion: Standardization gives Forth credibility in the business and computer science communities. Maybe. But at what cost?
Second illusion: ANSI Forth provides portable code. This one is actively harmful.
The programmers at iTV who wrote ANSI Forth couldn’t see their bugs. Their development environments hid the problems. They were writing code that worked in one ANSI Forth implementation but made assumptions that broke in another.
Someone else had to port the code. Someone else had to debug it. And, by Fox’s count, 98% of the bugs that someone else chased came from the portability illusion.
Fox uses a memorable phrase: ANSI Forth gives programmers “way too much room to stay with their bad programming habits from other languages.” It doesn’t constrain you to good practices. It certifies bad ones.
Section 4: The Resistance
The ANSI Forth programmers didn’t want to change. Fox describes trying to get them to try Machine Forth as “like trying to pull teeth from an angry pit bull by hand.”
Chuck Moore and Fox argued constantly: Machine Forth is easier, more productive, and forces good programming practices. Just try it.
Some programmers got it. Others insisted they were hired to write ANSI Forth. Period.
The code kept bloating. The product needed to be low-cost with small memory. But “portable” library routines ran 10,000 times slower than optimized versions Fox could write in a couple of hours.
Section 5: Moore’s Verdict
Chuck Moore had opposed ANSI standardization from the start. He worried it would be “a disaster and not merely a dubious advantage.”
All his fears came true. None of the advantages materialized.
“Any spirit of innovation has been thoroughly quelled,” Moore said. “Underground Forths are still needed.”
iTV was the first Forth project he’d been involved with where he wasn’t contributing code. That left him watching how other programmers used Forth, and what he saw disturbed him.
“They don’t always get it right,” he observed.
Section 6: Fox’s Conclusion - ANSI as Anti-Forth
Fox’s final assessment is harsh: ANSI Forth is “little more than an agreement to officially recognize the pigeon hole where Forth can die.”
It becomes “one more failed attempt to find the ideal portable scripting language.” It’s “an inefficient interactive portable scripting extension to C and not at all in the spirit of Forth.”
The standardization that was supposed to save Forth and give it credibility instead killed what made Forth powerful: radical simplicity, hardware intimacy, and optimization as a first principle.
Fox had been a strong ANSI advocate. He changed his mind. The evidence at iTV was too stark to ignore.
Part 2: The Portability Problem - A Pattern That Transcends Forth
Fox’s essay struck me because I’ve seen this pattern before. The details differ, but the fundamental mistake is the same: treating portability as a language feature rather than as an optimization problem to be solved in a separate production engineering step, after compilation.
The Forth community isn’t unique in this failure. Lisp made the same mistake. C made the same mistake. And we keep making it, over and over, because we’ve misunderstood what portability actually is.
Section 1: The Type System Trap
When languages try to achieve portability, they typically turn to the type system. The reasoning seems sound: if we can describe what operations mean abstractly enough, if we can specify behavior precisely enough, then code will work the same way on different machines.
But this approach has a fatal flaw: every edge case demands another type system feature. Every architectural quirk requires another epicycle. The specification grows. The language grows. The cognitive load on programmers grows.
Eventually, we end up with conditional compilation:
#ifdef PLATFORM_X
// do it this way
#else
// do it that way
#endif
This is an admission of defeat. The type system couldn’t actually capture portability. So we bolt on a weak macro system to paper over the cracks.
Both Lisp and C communities went down this path. Both ended up with conditional compilation as the escape hatch. Both produced languages that grew increasingly complex while failing to deliver on the portability promise.
Section 2: The Knowledge Location Problem
Here’s what bothers me most about the type system approach: the decision tree for handling different architectures lives in the heads of the people who wrote the specification.
The spec itself is just the artifact—an elaborate type system that attempts to formalize all those decisions. But the actual knowledge, the “why we made these choices,” remains implicit.
When you encounter a new edge case, you can’t easily see how similar problems were solved. You can’t learn from the accumulated wisdom. You just see more type rules.
Compare this to what I call the “compilation then production engineering” approach.
Section 3: Compilation Then Production Engineering - Portability as an Optimization Concern
The idea is straightforward, but requires understanding what we’re compiling to.
Stage 1 (Compilation): Move as much work as possible away from runtime. Type checking, binding resolution, semantic validation: do these once, not every time the code runs. Then generate correct code for a fictional target architecture. You get to define what that architecture is. Davidson and Fraser chose a machine with an infinite number of registers, so Stage 1 has no register allocation concerns [2]. Cordy’s Orthogonal Code Generator defines it as simple operations (like assignment) working on generalized data descriptors (constants, variables, dereferenced pointers), building on Holt’s work on data descriptors [1][3]. The point of the fictional architecture is to make Stage 1 straightforward: its only job is to preserve semantics.
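A minimal sketch of what Stage 1 might look like for arithmetic expressions, assuming a fictional machine with unlimited virtual registers named %0, %1, and so on. The tuple-based instruction format and the names compile_expr and compile_stage1 are hypothetical. They are not taken from Davidson and Fraser’s or Cordy’s work.

```python
import itertools


def compile_expr(expr, code, fresh):
    """Emit three-address code for a nested-tuple expression.

    expr is an int constant, a variable name, or (op, left, right).
    Returns the name of the virtual register holding the result.
    """
    if isinstance(expr, int):
        dest = f"%{next(fresh)}"
        code.append(("const", dest, expr))
    elif isinstance(expr, str):
        dest = f"%{next(fresh)}"
        code.append(("load", dest, expr))
    else:
        op, left, right = expr
        a = compile_expr(left, code, fresh)
        b = compile_expr(right, code, fresh)
        dest = f"%{next(fresh)}"  # never reuse: registers are free in Stage 1
        code.append((op, dest, a, b))
    return dest


def compile_stage1(expr):
    """Compile one expression for the fictional infinite-register machine."""
    code = []
    compile_expr(expr, code, itertools.count())
    return code


# (x + 2) * (x + 2), compiled naively but correctly
for instr in compile_stage1(("mul", ("add", "x", 2), ("add", "x", 2))):
    print(instr)
```

Note what the sketch does not do: it never asks how many registers exist, and it recomputes x + 2 instead of sharing it. Stage 1 only has to be correct. Making the code good for a particular machine is left entirely to Stage 2.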
Stage 2 (Production Engineering): Take that compiled code and custom-tune it for each target architecture’s capabilities. Pattern match idioms and rewrite them to exploit what the specific hardware does well. Map infinite registers to actual register sets. Map generalized operations to what the hardware does efficiently.
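One piece of Stage 2, mapping infinite registers onto an actual register set, can be sketched as follows. This is a deliberately simple allocator for straight-line code, assuming virtual registers are written %n and that a register can be recycled as soon as its value has been read for the last time. The names IR, last_uses, and allocate are hypothetical. A production allocator would also handle branches and spilling, which this sketch does not.

```python
# Input: the kind of infinite-register code Stage 1 would produce
# for (x + 2) * (x + 2). The format is (op, dest, *sources).
IR = [
    ("load", "%0", "x"),
    ("const", "%1", 2),
    ("add", "%2", "%0", "%1"),
    ("load", "%3", "x"),
    ("const", "%4", 2),
    ("add", "%5", "%3", "%4"),
    ("mul", "%6", "%2", "%5"),
]


def last_uses(code):
    """Map each virtual register to the index of the last instruction reading it."""
    last = {}
    for i, (_op, _dest, *srcs) in enumerate(code):
        for s in srcs:
            if isinstance(s, str) and s.startswith("%"):
                last[s] = i
    return last


def allocate(code, physical):
    """Rename virtual registers to physical ones, recycling dead registers.

    No spilling: raises RuntimeError if the target has too few registers.
    """
    last = last_uses(code)
    free = list(physical)
    assigned = {}
    out = []
    for i, (op, dest, *srcs) in enumerate(code):
        mapped = [assigned.get(s, s) for s in srcs]
        # A source read here for the last time is released first, so the
        # destination may reuse its register (sources are read before the write).
        for s in srcs:
            if last.get(s) == i and s in assigned:
                free.append(assigned.pop(s))
        if not free:
            raise RuntimeError(f"out of registers at instruction {i}")
        assigned[dest] = free.pop(0)
        out.append((op, assigned[dest], *mapped))
    return out


for instr in allocate(IR, ["r0", "r1", "r2"]):
    print(instr)
```

With three physical registers the seven virtual registers fit. With only two, allocate raises, which is the target telling you it needs a different strategy. That decision belongs to the per-target production engineering step, not to the source language.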
This isn’t a new idea. GCC’s register transfer language (RTL) borrows ideas from Davidson and Fraser’s retargetable peephole optimizer [2]. Jim Cordy’s Orthogonal Code Generator goes further with MIST: trees annotated with ‘if the target can do this operation efficiently, rewrite the code this way’ [3].
The key insight: optimization knowledge becomes an explicit, reusable artifact. The MIST database grows smarter over time. You can read it. You can understand the reasoning. You can add new patterns when you discover better approaches.
Instead of encoding portability decisions in language type systems—where they become frozen and implicit—you encode them in declarative transformation rules—where they remain flexible and explicit.
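As a rough illustration of “declarative transformation rules,” the sketch below keeps each rule as a piece of data: the capability it needs, the reason it exists, and the rewrite itself. This is not MIST’s actual format, which I am not reproducing here. The Rule class, the capability names “shift” and “inc,” and the shl and inc instructions are all assumptions invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    requires: str  # capability the target must advertise
    why: str       # the reasoning, stored next to the rule it justifies
    rewrite: Callable[[tuple], Optional[tuple]]  # returns None when no match


def mul_by_two(instr):
    op, dest, *args = instr
    if op == "mul" and len(args) == 2 and args[1] == 2:
        return ("shl", dest, args[0], 1)
    return None


def add_one(instr):
    op, dest, *args = instr
    if op == "add" and len(args) == 2 and args[1] == 1:
        return ("inc", dest, args[0])
    return None


RULES = [
    Rule("mul2-to-shift", "shift", "a shift is cheaper than a multiply here", mul_by_two),
    Rule("add1-to-inc", "inc", "the target has a dedicated increment", add_one),
]


def specialize(code, capabilities):
    """Apply the first matching rule whose capability the target has."""
    out = []
    for instr in code:
        for rule in RULES:
            if rule.requires in capabilities:
                new = rule.rewrite(instr)
                if new is not None:
                    instr = new
                    break
        out.append(instr)
    return out


IR = [("mul", "%0", "x", 2), ("add", "%1", "%0", 1)]
print(specialize(IR, set()))             # no capabilities: code is unchanged
print(specialize(IR, {"shift"}))         # only the multiply is rewritten
print(specialize(IR, {"shift", "inc"}))  # both rules fire
```

The rule list is the readable artifact. Someone meeting a new edge case can scan RULES, read the why field, and see how similar problems were handled, which is exactly what a frozen type rule does not let you do.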
Section 4: Machine Forth as the Right Abstraction
This reframes what Fox and Moore were arguing, though from a different angle. Machine Forth writes directly for the metal. The two-stage approach writes for a fictional architecture, then relies on production engineering to custom-tune that code for the metal of each specific target. Both avoid the ANSI trap of pretending the target doesn’t matter.
The conventional approach is backwards: write for an abstract machine, hope the compiler can optimize for real machines. This stacks abstraction on abstraction, each layer hiding details that matter for performance.
Fox’s 100x size and 1000x speed improvements weren’t magic. They came from eliminating unnecessary abstraction layers and writing directly for what the hardware could actually do efficiently.
Section 5: Why the Standard Approach Fails
The ANSI Forth programmers at iTV thought their code was portable because it compiled in their development environment. But they had never tested against the actual constraints of the target hardware.
This is the core problem with “write once, run anywhere”: it encourages thinking that you can ignore the target. But the target always matters.
Fox’s claim that 98% of bugs came from portability illusions isn’t just about Forth. It’s about what happens when you give programmers a false sense of security.
Portability isn’t “code that works the same everywhere.” That’s impossible. Portability is “explicit knowledge about how to map semantics to different targets efficiently.”
Section 6: The Bloat Accumulation Pattern
Every time a standards committee encounters an edge case, the instinct is to add complexity to the language specification.
Can’t handle this architectural difference? Add a type system feature.
Can’t express that constraint? Extend the standard library.
Found a new class of bugs? Introduce more compile-time checks.
Each addition seems justified in isolation. But collectively, they produce bloatware.
The language grows from 27 opcodes you can teach in an hour to hundreds of words you need to learn before you start. Then to thousands of pages of specification that no one fully understands.
We accept this complexity growth as inevitable. Fox and Moore suggest it’s not. It’s a consequence of the wrong approach to portability.
Section 7: What Would Better Look Like?
The conventional approach stems from function-based thinking: one input, one output. The compiler is conceived as a single function that must somehow generalize for every architecture known to man. This forces all the complexity into that one transformation step—the type system must capture every edge case, the intermediate representation must be abstract enough to work everywhere, and the optimizer must guess what every possible target might want.
This 1-in-1-out design is fundamentally at odds with the portability problem.
The two-stage approach is architecturally different: it’s 1-to-many. You compile once to an abstract Design Intent (DI) machine that captures what the program means. Then you fan out to custom little compilers—or punt to a tree-walker like MIST—that production-engineer the best code for each specific target.
This wasn’t practical in the 1970s. Building custom code generators for each target was expensive. Maintaining transformation rules was tedious. The tooling didn’t exist.
But today? With tools like OhmJS and PEG parsers, it’s straightforward. The entire production engineering step can be implemented as the simple act of rewriting strings to other strings. You match patterns in the abstract DI code and emit optimized sequences for the target. Add a new target? Write new rewrite rules. Discover a better optimization? Add it to the pattern library.
The fan-out architecture makes the 1-to-many relationship explicit in the design, rather than hiding it inside a monolithic compiler that pretends it’s doing a 1-in-1-out transformation.
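The fan-out can be sketched with nothing more than string rewriting. The example below uses Python’s re module as a stand-in for OhmJS or a PEG parser, and everything target-specific in it is invented: the DI syntax, the two target names, and the instruction mnemonics are assumptions for illustration only.

```python
import re

# Abstract Design Intent (DI) code: says what is meant, not how to do it.
DI = "y = x * 2\nz = y + 1"

# One ordered rule list per target: (pattern, template). More specific
# patterns come first. {0}, {1}, {2} refer to the captured groups.
TARGETS = {
    "shifter": [
        (r"(\w+) = (\w+) \* 2", "shl {0}, {1}, 1"),
        (r"(\w+) = (\w+) \* (\w+)", "mul {0}, {1}, {2}"),
        (r"(\w+) = (\w+) \+ (\w+)", "add {0}, {1}, {2}"),
    ],
    "accumulator": [
        (r"(\w+) = (\w+) \* (\w+)", "lda {1}\nmul {2}\nsta {0}"),
        (r"(\w+) = (\w+) \+ 1", "lda {1}\ninc\nsta {0}"),
        (r"(\w+) = (\w+) \+ (\w+)", "lda {1}\nadd {2}\nsta {0}"),
    ],
}


def emit(di_text, rules):
    """Rewrite each DI line using the first rule that matches it."""
    out = []
    for line in di_text.splitlines():
        for pattern, template in rules:
            m = re.fullmatch(pattern, line.strip())
            if m:
                out.append(template.format(*m.groups()))
                break
        else:
            raise ValueError(f"no rule for: {line!r}")
    return "\n".join(out)


def fan_out(di_text):
    """One DI program in, one tuned program per target out."""
    return {target: emit(di_text, rules) for target, rules in TARGETS.items()}


for target, code in fan_out(DI).items():
    print(f"--- {target} ---")
    print(code)
```

Adding a third target means adding a third entry to TARGETS. Neither the DI code nor the emit function changes, which is the 1-to-many shape made visible in the data structure.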
Imagine a language designed this way from the start:
You write in a minimal core—something like Machine Forth’s 27 opcodes. The semantics are clear because there’s not much to learn.
But you also write transformation rules: “On architectures with this capability, rewrite this pattern as that pattern.” The rules are declarative, human-readable, and accumulate over time.
Compilation becomes a two-part process: first generate correct code, then teach the production engineering step about what each target does efficiently.
When you encounter a new target, you add transformation rules. You don’t modify the language. You don’t change the type system. You just teach the production engineering system what this particular machine does well.
The knowledge becomes reusable. The decision tree becomes explicit. And you avoid Fox’s nightmare scenario where “portable” code hides bugs until someone tries to actually port it.
Closing
Fox changed his mind about ANSI Forth because the evidence was undeniable. Working alongside Chuck Moore at iTV, watching Machine Forth code outperform ANSI Forth code by orders of magnitude, he couldn’t maintain his earlier position.
But the lesson extends beyond Forth. We’re still making the same mistake in modern language design: treating portability as something to bake into the language rather than something to handle in a second step, production engineering.
The type system approach gives us complexity and bloat. The compilation then production engineering approach—with explicit, declarative transformation rules—gives us simplicity and performance.
Maybe it’s time we learned from the Forth wars.
References
[1] R.C. Holt, “Data Descriptors: A Compile-Time Model of Data and Addressing,” ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 9, 1987, pp. 367-389. https://dl.acm.org/doi/10.1145/24039.24051
[2] J.W. Davidson and C.W. Fraser, “The Design and Application of a Retargetable Peephole Optimizer,” ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 2, No. 2, April 1980, pp. 191-202. https://dl.acm.org/doi/10.1145/357094.357098
[3] J.R. Cordy, “An Orthogonal Model for Code Generation,” Ph.D. dissertation, Report CSRI-177, Computer Systems Research Institute, University of Toronto, Toronto, February 1985.

