Failure Driven Development (2024-06-15)
Failure, Most of the Time
During development, programs tend to fail.
After a program is debugged, it stops failing. You ship it and move on to develop another program - which, also, mostly fails as you debug it.
The name “waterfall development” is given to the idea that you are so sure about some aspect of a problem space - or about the whole problem space - that you can hard-code it as software up front.
The actual development workflow, though, does not resemble “waterfall development”. You learn new things about the problem space as you develop and debug.
The ideal workflow, then, is to iterate and to change your mind as you gain new insights about a problem. If you hard-code parts of the problem too early and work hard at it, you become reluctant to change your mind and rewrite the hard-coded software when you learn new things about the problem.
Thinking and learning are hard. Code should be cheap.
3GL programming languages, like Python, Rust, etc., encourage early hard-coding. Type-checking encourages early hard-coding, too. When you learn something new about the problem, you should revamp the software and revamp the types, but you tend not to, since you put in a lot of hard work to get to where you already are.
This situation is not an ideal development workflow. Waterfall development works when you know everything about a problem space, but, doesn’t work so well while you’re still learning about the problem space. Waterfall is what you want to do as a Production Engineer, but, not as a Design Engineer. You can apply waterfall thinking when the design has stabilized. Applying waterfall in the early stages leads to delays and poor designs.
Fred Brooks, in his famous book, The Mythical Man-Month, said that you fail, then you fail again, and only then do you succeed. That’s a failure rate of 67%.
How Can You Cope With an Iterative Workflow?
I cope with this kind of iteration by not writing software. Instead, I write software that writes software.
When I discover something new and want to change my mind, I feel free to wipe the slate clean, and, to delete most of the software developed up to that point. I make a change, then I push a button and regenerate all of the software.
I learned this kind of attitude by watching how compilers work. You write software in a “high level” notation. Then, the compiler grinds through it and produces assembler that can run on a CPU-based computer.
I don’t ever have to write assembler, any more. Compilers write assembler for me. I feed compilers with higher-level, more concise specifications of what I want to accomplish.
I don’t have to worry about changing my mind about data structures that represent the solution to the problem. I hit a button and - presto - new assembler code is automagically generated for me.
Ron Cain’s Small-C compiler taught me how I could get a machine to do work for me. The first program I wrote in Small-C was ‘a = b + c;’. Small-C output a comment containing ‘a = b + c;’ followed by a few lines of 8080 assembler that performed the actual task of adding the two numbers together and storing them into a memory location. The before and after text was displayed on the computer screen in front of my eyes. I simply had to pipe the assembler text through another workhorse program - the assembler - and out came bits and bytes that could be fed into my 8080.
That was then. I used to write programs in assembler. Then, I learned that I could write fewer lines of text and have the machine write the assembler code for me.
Now, though, I write all programs in C (or something even better) and those programs are becoming as laboriously huge and as hard-to-write as my assembler programs were, only a few decades ago.
So, I’m back at the same place - asking myself whether I can write less code and get a machine to do the heavy lifting. Can a machine write most of the code for me?
The answer is yes.
That, to me, is one of the bedrock principles of FDD - Failure Driven Development. Write as little code as possible. Feel encouraged to throw all of the code away and to regenerate it each time I learn something new about the problem.
One thing I learned while studying Physics is how to tackle seemingly insurmountable problems. Invent a tiny new notation, specific to the problem space, and use that tiny notation to describe the problem. Then use the same tiny notation to describe the solution to the problem.
In software development, that approach means creating little DSLs - I call them SCNs for Solution Centric Notations - for each project. If the project contains a number of sub-projects, invent a bunch of little SCNs, at least one for each sub-project.
Huh? But, writing a new language, even a little one, is hard work, right? I would have to build a whole compiler for each SCN, right?
Nope.
PEG parsing, manifested as OhmJS[1], lets you knock off little languages - little SCNs - in only a few hours. I don’t build a whole compiler each time. I simply transpile the little SCN notation into some existing language, like Python, and let the existing compiler do the rest of the work. In essence, Python, Haskell, Rust, JavaScript, Common Lisp, etc. are just “assemblers” to me[2]. I pick which existing language to use depending on several factors, including which project-specific libraries it has and how much syntactic hassle is involved. Common Lisp provides the least syntactic hassle, while Python, with its indentation syntax, provides a lot. PEGs encourage definition of SCNs that are nested and bracketed. In essence, the less human-readable and the more regular a language is, the better assembler it makes.
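As a dependency-free sketch of the transpile-don’t-compile idea - the toy pipeline notation and its Python output are invented for illustration; a real project would use an OhmJS grammar plus rewrite rules - here is a tiny SCN that rewrites `a -> b -> c` into nested Python calls:

```javascript
// Toy SCN (hypothetical): "read -> parse -> emit" means pipe read's
// output through parse, then emit. Transpile it to one line of Python.
// Informal grammar: pipeline <- ident ("->" ident)*
function transpilePipeline(src) {
  const idents = src.split("->").map(s => s.trim());
  if (idents.some(id => !/^[A-Za-z_]\w*$/.test(id))) {
    throw new Error("parse error in: " + src);
  }
  // Fold the names into nested Python calls, innermost first.
  return idents.reduce((acc, id) => `${id}(${acc})`, "");
}

console.log(transpilePipeline("read -> parse -> emit"));
// emit(parse(read()))
```

The “parser” is only a few lines because the notation is deliberately regular; all of the heavy lifting is delegated to the downstream Python compiler.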
I’ve written a tiny filter in JavaScript that does the final indentation for me, if I choose to target Python.
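Such a filter can be very small. A minimal sketch, assuming the code generator emits flat Python with `{` and `}` on their own lines as block markers - a made-up convention for illustration, not the author’s actual tool:

```javascript
// Turn brace markers emitted by a code generator into real Python
// indentation: "{" enters a block, "}" leaves it, everything else is
// emitted at the current depth.
function indentPython(flat) {
  let depth = 0;
  const out = [];
  for (const line of flat.split("\n")) {
    const t = line.trim();
    if (t === "{") { depth += 1; continue; }
    if (t === "}") { depth -= 1; continue; }
    if (t !== "") out.push("    ".repeat(depth) + t);
  }
  return out.join("\n");
}

const flat = ["def f(x):", "{", "return x + 1", "}"].join("\n");
console.log(indentPython(flat));
// def f(x):
//     return x + 1
```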
I don’t even bother to build air-tight semantic and type checking into the little SCNs, since each little SCN is essentially a throw-away, specific to only one project. This approach is kinda like using regexes, but better. Regexes don’t understand nested structures nor mutual recursion during parsing.
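The difference shows up with nesting. A single JavaScript regex cannot match arbitrarily deep balanced brackets, but a PEG rule can call itself. A hand-rolled sketch of the recursive rule `group <- "(" group* ")"`:

```javascript
// Recognize fully-balanced, arbitrarily nested parentheses - something
// a plain regex cannot do, but a self-referential PEG rule handles
// naturally.
function matchGroups(src) {
  let i = 0;
  function group() {           // group <- "(" group* ")"
    if (src[i] !== "(") return false;
    i += 1;
    while (src[i] === "(") {
      if (!group()) return false;
    }
    if (src[i] !== ")") return false;
    i += 1;
    return true;
  }
  return group() && i === src.length;
}

console.log(matchGroups("((())())")); // true
console.log(matchGroups("(("));       // false
```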
Without air-tight checking, debugging gets a bit harder - something that I’m willing to put up with. When I work on a project, I end up with hoary debugging problems regardless of how tight a language I use, so the trade-off isn’t actually that large.
In fact, having intimate control over the little SCNs allows me to stuff traces and other debugging helpers into the notation and allows me to debug at a much higher level. After debugging, I can remove the traces and regenerate all of the code at a “production quality” level automatically.
Many people say that strong type systems help them design by checking the consistency of their designs. Building little SCNs works that way, too. The control flow becomes whatever you want it to be - whatever lends itself to the problem space. And you can still lean on existing compilers to check your types.
I’ve even seen more formally-specified compilers used as linters. They aren’t inserted into the actual workflow, but are sometimes used as adjuncts to check intermediate thoughts and designs.
The syntax needed by little SCNs isn’t particularly spectacular, more like a bunch of macros. The PEG formalization is about all that is needed to keep your design in line, but, heavier technologies, like YACC, can be used to lint designs.
RWR - An SCN for Creating SCNs
One of the first projects that I used OhmJS for was to create a little SCN notation that I could use to specify rewrite rules for ASTs[3].
OhmJS gives you a way to specify pattern-matching grammars using the PEG formalism. After pattern-matching, though, OhmJS expects programmers to write JavaScript code to actually do something with the matches.
RWR is a little SCN that writes such JavaScript for me. I reduced the problem to one of simply rewriting strings and creating strings based on the pattern matches. JavaScript lets you do much more than this, but, after a few years of use, I’ve never found the RWR simplification to be lacking and have never needed to go back to manually writing JavaScript myself.
RWR deserves its own essay, so I won’t go into it further, here. RWR is quite simple, though. I think that it is self-explanatory. It should be enough to simply look at OhmJS code and RWR code to figure out what’s going on.
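For a rough flavor only - this is a hypothetical template format, not RWR’s actual syntax - a string-rewriting rule can be as small as a template filled in from the matched sub-parts:

```javascript
// Hypothetical illustration of string rewriting driven by pattern
// matches: [0], [1], ... in the template are replaced by the rewritten
// strings of the corresponding matched sub-parts.
function applyRule(template, parts) {
  return template.replace(/\[(\d+)\]/g, (_, n) => parts[Number(n)]);
}

// e.g. rewrite the pieces of a matched assignment into Python text
console.log(applyRule("[0] = [1]", ["total", "a + b"]));
// total = a + b
```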
Further Reading
A slide deck describing FDD can be found at https://guitarvydas.github.io/2021/04/23/Failure-Driven-Design.html.
A proof-of-concept t2t workbench can be seen in the repository https://github.com/guitarvydas/t2t
The blog https://guitarvydas.github.io/2024/02/05/T2T-Transpilation-To-Write-A-Compiler-or-Not-To-Write-A-Compiler.html discusses the issues of not-writing full-blown compilers.
See Also
References: https://guitarvydas.github.io/2024/01/06/References.html
Blog: https://guitarvydas.github.io/
Videos: https://www.youtube.com/@programmingsimplicity2980
Discord: https://discord.gg/Jjx62ypR
Leanpub: https://leanpub.com/u/paul-tarvydas
Gumroad: https://tarvydas.gumroad.com
Twitter: @paul_tarvydas
[1] And other libraries, like peg.js, ESRAP, Janet, etc.
[2] In fact, Alan Kay says “In a ‘real’ Computer Science, the best languages of an era should serve as ‘assembly code’ for the next generation of expression.” at 31:50 in
[3] More accurately, CSTs - Concrete Syntax Trees.