Recursive, Asynchronous Layering: What Shell Scripts Teach Us About Program Architecture
2025-09-18
Key Concepts
- Nested, recursive Container Parts and Leaf Parts (scripts and commands)
- Wires represented by triples {direction, sender, receiver} (direction == {down | across | up | through})
- One input queue per Part (mevents == {port id, payload})
- Visualize dataflow and sequencing; leave functions as text
- Use draw.io instead of building a graphics editor ('compile' the diagram's XML to JSON, removing graphic details)
---
Shell scripts get something fundamentally right about program structure, yet they don't go nearly far enough. They naturally partition code into two distinct types: recursive scripts that orchestrate, and atomic commands that execute. Meanwhile, our dominant programming paradigms collapse everything into a single type—synchronous functions—inadvertently creating fragility and overwhelming complexity as programs grow.
What if we took the shell's architectural hints seriously and extended them into a complete programming model?
The Shell's Hidden Wisdom
Every shell script programmer intuitively understands a crucial distinction:
- Scripts can call other scripts or commands recursively
- Commands are atomic units that don't recurse by default
This creates a natural hierarchy where scripts act as containers, orchestrating smaller units of work. It's a form of layered architecture that emerges organically from the medium itself.
But shells stop short of fully realizing this vision. They remain trapped in linear, text-based thinking where asynchronous behavior requires special syntax (`&`) and coordination between components remains primitive.
Five Key Revelations
Leaf Parts are what we traditionally call "code." Container Parts are something "new": they extract and lift out the essence of scripts. Leaf Parts remain textual code, whereas Container Parts are better expressed visually.
1. Two-Part Architecture: Containers and Leaves
The most important insight is embracing two fundamentally different types of components:
Container Parts (like shell scripts):
- Can contain and orchestrate other parts
- Handle routing and coordination
- Act as micro-dispatchers with their own event loops for processing mevents
- Maintain tables of connections between their children
Leaf Parts (like shell commands):
- Contain actual implementation code
- Cannot contain other parts
- Have single, focused responsibilities
- Provide the atomic units of computation
This partition prevents the "infinite canvas" problem where programs sprawl without structure. Instead of allowing unlimited nesting of synchronous function calls, we force architectural decisions at each level.
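Below is a minimal sketch, in Python, of what these two kinds of Parts might look like. The class names, the `inject`/`step` methods, and the handler signature are assumptions made for illustration, not a fixed API.

```python
from collections import deque

class Part:
    """Common base: every Part owns exactly one input queue and one output queue."""
    def __init__(self, name):
        self.name = name
        self.inq = deque()     # pending input mevents: (port, payload)
        self.outq = deque()    # produced output mevents: (port, payload)

    def inject(self, port, payload):
        """Sending is queueing, never a direct call."""
        self.inq.append((port, payload))


class Leaf(Part):
    """Leaf Part: holds the actual code; cannot contain other Parts."""
    def __init__(self, name, handler):
        super().__init__(name)
        self.handler = handler             # handler(part, port, payload) -> None

    def step(self):
        """Process at most one queued mevent."""
        if self.inq:
            port, payload = self.inq.popleft()
            self.handler(self, port, payload)


class Container(Part):
    """Container Part: holds children plus a wire table; no business logic of its own."""
    def __init__(self, name, children, wires):
        super().__init__(name)
        self.children = children           # {child name: Part}
        self.wires = wires                 # (direction, (sender, port), (receiver, port)) triples
```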
2. Asynchronous by Default
Most programming languages make synchronous execution the default and treat asynchronous behavior as a special case requiring extra syntax and complexity. This backwards thinking constrains our architectural options.
The better approach: Make asynchronous, message-passing behavior the default. Synchronous behavior becomes the special case when you explicitly need it.
This flip enables natural composition. Parts can be wired together without the tight coupling that comes from direct function calls. Each part maintains its own input and output queues, processing messages at its own pace.
This approach gives architecture a LEGO®-like "feel." Programmers can visually snap existing Parts together to form new applications, or visually rearrange the internals of an application's Containers to produce new ones. They can borrow ("reuse", "multiply-use") Parts to build new applications without writing any code, or write new Parts as needed. Programmers already treat code libraries in roughly this manner, but the synchronous, strongly coupled nature of code libraries severely restricts architectural flexibility.
3. Wire Triples, Not Pairs
Traditional approaches think of connections as simple sender-receiver pairs. But in hierarchical systems, you need richer routing information.
Wire specification: `{direction, sender, receiver}`
Where direction can be:
- down: Container input to child input
- across: Child output to child input (including feedback loops)
- up: Child output to container output
- through: Container input directly to container output
This triple structure enables proper message routing in nested hierarchies and makes fan-out (copying messages to multiple destinations) natural and explicit.
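Continuing the sketch above, routing one mevent through a wire table might look like the following. Treating each sender and receiver as a (part name, port) pair is an assumption of this sketch.

```python
# Direction names, matching the list above.
DOWN, ACROSS, UP, THROUGH = "down", "across", "up", "through"

def route(container, direction, sender, payload):
    """Deliver one mevent along every wire that matches (fan-out = several matches)."""
    for wdir, wsender, wreceiver in container.wires:
        if wdir != direction or wsender != sender:
            continue
        receiver_name, receiver_port = wreceiver
        if wdir in (DOWN, ACROSS):
            # down/across wires end at a child's input queue
            container.children[receiver_name].inject(receiver_port, payload)
        else:
            # up/through wires end at the container's own output queue
            container.outq.append((receiver_port, payload))
```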
4. Ticking Until Quiescence
Container parts operate by "ticking"—repeatedly processing available messages until no more work can be done in the current cycle. This creates natural event loops without the complexity of traditional thread management.
The process:
1. Container checks for incoming messages
2. Routes messages to appropriate children according to wire table
3. Children process their messages and potentially generate outputs
4. Container routes child outputs according to wire table
5. Repeat until no more messages flow (quiescence)
This is inherently recursive—when a Container ticks a child that is itself a Container, that child ticks its own children.
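A hedged sketch of that loop, reusing `route` and the direction names from the previous sketch; scheduling details (fairness, step budgets) are deliberately glossed over.

```python
def tick(container):
    """Run one Container until quiescence: no pending inputs, no unrouted outputs."""
    busy = True
    while busy:
        busy = False
        # 1. Route the container's own pending inputs down (or straight through).
        while container.inq:
            port, payload = container.inq.popleft()
            route(container, DOWN, (container.name, port), payload)
            route(container, THROUGH, (container.name, port), payload)
            busy = True
        # 2. Give every child a chance to work; child Containers tick recursively.
        for child in container.children.values():
            if child.inq:
                busy = True
            if isinstance(child, Container):
                tick(child)
            else:
                child.step()
        # 3. Route child outputs across to siblings or up to the container's outputs.
        for name, child in container.children.items():
            while child.outq:
                port, payload = child.outq.popleft()
                route(container, ACROSS, (name, port), payload)
                route(container, UP, (name, port), payload)
                busy = True
```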
5. Feedback Is Not Recursion
One of the most subtle but important distinctions:
- Recursion: new execution contexts jump to the head of the line (stack/LIFO behavior)
- Feedback: messages wait in line like everything else (queue/FIFO behavior)
When a part sends a message to itself, that message gets queued normally rather than creating immediate re-entry. This eliminates many traditional concurrency hazards while still allowing feedback loops and state machines.
Queueing preserves mevent order, which makes it possible to reason about sequencing and control flow and adds a new dimension to program development. This way of thinking about control flow is more structured than what popular, function-based programming languages currently offer.
A mevent is just a message. It has two parts: (1) a port identifier and (2) a payload. The port id (which can be as simple as a string) is required by the one-input-queue design discussed below.
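To make the distinction concrete, here is a hypothetical counter built from the sketches above. The feedback wire routes the counter's output back to its own input, so each fed-back mevent waits its turn instead of re-entering the handler.

```python
# Hypothetical example, reusing the Leaf/Container/route/tick sketches above.
def counting_handler(part, port, payload):
    if port == "count" and payload > 0:
        print(f"{part.name}: {payload}")
        # Feedback, not recursion: this mevent goes onto the output queue and will be
        # routed back to our own input queue, behind anything already waiting there.
        part.outq.append(("count", payload - 1))

counter = Leaf("counter", counting_handler)
loop = Container(
    "loop",
    children={"counter": counter},
    wires=[("down",   ("loop", "start"),    ("counter", "count")),
           ("across", ("counter", "count"), ("counter", "count"))])  # self-wire = feedback

loop.inject("start", 3)
tick(loop)   # prints counter: 3, counter: 2, counter: 1 -- FIFO feedback, no stack growth
```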
Practical Implications
One Queue Per Part
Each part maintains exactly one input queue and one output queue. Messages are tagged with port information: `{port, payload}`. This design choice prevents deadlock at a low level—though domain-specific deadlock issues may still need project-specific solutions.
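As a small illustration of the {port, payload} tagging, here is a hypothetical Leaf (reusing the `Leaf` sketch above) whose single queue multiplexes two logical ports; the port names are made up.

```python
# Hypothetical Leaf with two logical input ports sharing one physical input queue.
def logger_handler(part, port, payload):
    if port == "info":
        print(f"[info] {payload}")
    elif port == "error":
        print(f"[ERROR] {payload}")

logger = Leaf("logger", logger_handler)
logger.inject("info", "started")
logger.inject("error", "disk full")
logger.step()   # [info] started
logger.step()   # [ERROR] disk full
```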
Fan-Out for Debugging
The wire triple system makes "tapping" connections trivial. You can copy any message flow to debugging probes without modifying the original components. Fan-out becomes a first-class architectural concept rather than an afterthought.
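Building on the counter example above, adding a tap might look like this; the `probe` Part and its port name are hypothetical.

```python
# Hypothetical probe Leaf; it only prints, so the original flow is untouched.
probe = Leaf("probe", lambda part, port, payload: print(f"TAP {port}: {payload!r}"))

loop.children["probe"] = probe
# The tap is just one more wire with the same sender: fan-out by addition.
loop.wires.append(("across", ("counter", "count"), ("probe", "trace")))

loop.inject("start", 2)
tick(loop)   # counter prints 2 and 1; probe prints a TAP line for each fed-back mevent
```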
Visual Representation Without Drawing Editors
The move from text-based to visual representation is crucial for understanding program architecture at a glance. But you don't need to build sophisticated drawing editors from scratch.
Existing tools like draw.io and Excalidraw already save diagrams in machine-readable formats (XML and JSON, respectively). The key insight is to focus on parsing these formats rather than creating new visual editors. Let humans use familiar drawing tools while machines consume the structural information.
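A minimal sketch of such a "compiler" in Python. It assumes an uncompressed .drawio file in which boxes are `mxCell` elements with `vertex="1"` and connections are `mxCell` elements with `edge="1"`; real files may be compressed or structured differently, so treat this as a starting point rather than a robust parser.

```python
# Strip graphic details from an (uncompressed) draw.io file, keep only the structure.
import json
import xml.etree.ElementTree as ET

def compile_drawio(path):
    tree = ET.parse(path)
    parts, wires = [], []
    for cell in tree.iter("mxCell"):
        if cell.get("vertex") == "1":
            parts.append({"id": cell.get("id"), "name": cell.get("value", "")})
        elif cell.get("edge") == "1":
            wires.append({"sender": cell.get("source"), "receiver": cell.get("target")})
    return json.dumps({"parts": parts, "wires": wires}, indent=2)

print(compile_drawio("example.drawio"))   # hypothetical file name
```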
Shell-Out Is Cheap Now
Modern systems make process creation and inter-process communication remarkably efficient. Rather than avoiding shell-out operations, embrace them as a natural way to compose atomic units of functionality.
Shelling out creates a development system that can use more than one language at a time, instead of sticking with a single language throughout a whole project. Our current bias against multiple languages comes from treating one programming language as the entire development system, then piling on features and complication to cover every need. Adopting a multiple-language workflow instead lets you use only the best features of each language, laser-focused on specific issues with specific paradigms. For example, programmers can use the Prolog/Datalog relational paradigm to process factbases, then switch to the string interpolation features of languages like Python and JavaScript for string manipulation and formatting.
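For instance, a Leaf's body can simply be a shell-out. This sketch reuses the `Leaf` class from earlier and assumes a Unix-like system with `sort` on the PATH; the command is only illustrative.

```python
import subprocess

def shell_out_handler(part, port, payload):
    """Leaf body that delegates its work to an external command."""
    result = subprocess.run(["sort"], input=payload,
                            capture_output=True, text=True, check=True)
    part.outq.append(("sorted", result.stdout))

sorter = Leaf("sorter", shell_out_handler)
sorter.inject("lines", "pear\napple\nbanana\n")
sorter.step()
print(sorter.outq.popleft())   # ('sorted', 'apple\nbanana\npear\n')
```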
For a deeper discussion of optimization considerations, see the appendix on production engineering.
Layered Growth
Programs built this way grow in layers rather than sprawling across an infinite canvas. Each layer maintains approximately 7±2 significant elements—the sweet spot for human comprehension. When a layer grows too complex, it signals the need to create a new level of abstraction.
This approach prevents the common problem where programs become walls of detail that obscure their essential structure and intent.
Beyond Shells
While shell scripting provides the initial insight, this architectural approach applies to programming in general. The core principles—isolated async units, message passing, hierarchical composition—make software architecture easier to understand, modify, and debug.
The functional programming paradigm, despite its benefits, leads to synchronous thinking patterns that don't naturally accommodate all types of problems. Some domains require asynchronous, event-driven solutions that are awkward to express in purely functional terms.
By embracing asynchronous, message-passing components as the primary building blocks, we create systems that are more naturally composable and easier to reason about at scale.
Implementation Notes
Part Structure:
- Handler entry points for processing messages
- Template registration for creating instances
- Unique instantiation from templates to avoid shared state issues
Container Responsibilities:
- Maintain routing tables between children and self
- Process routing tables atomically to ensure consistency
- Act as micro-routers for their contained subsystems
Wire Processing:
- Fan-out and fan-in described by multiple 1:1 wires
- Wire tables must be processed atomically to maintain system consistency
- Direction information enables proper hierarchical message flow
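The template-registration and unique-instantiation notes might be sketched like this; the registry and function names are illustrative, and `counting_handler` comes from the feedback example earlier.

```python
# Hypothetical template registry: each instantiation builds a fresh Part, so no state is shared.
REGISTRY = {}

def register_template(name, factory):
    """factory() must return a brand-new Part every time it is called."""
    REGISTRY[name] = factory

def instantiate(name):
    return REGISTRY[name]()

register_template("counter", lambda: Leaf("counter", counting_handler))
a = instantiate("counter")
b = instantiate("counter")
assert a is not b and a.inq is not b.inq   # distinct instances, distinct queues
```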
Conclusion
Shell scripts hint at a powerful architectural pattern that we've barely begun to explore. By taking their two-part structure seriously and extending it with proper asynchronous messaging, visual representation, and hierarchical composition, we can build software systems that grow gracefully and remain comprehensible even at scale.
The path forward doesn't require abandoning existing tools or starting from scratch. It means recognizing the architectural wisdom already present in shells and systematically applying those principles to create more maintainable, composable software systems.
The shell's missing dimension isn't just visual representation—it's the full realization of layered, asynchronous architecture that shells accidentally discovered but never fully developed.
Appendix: Production Engineering and Optimization
The instinct to optimize shell-out operations and standardize on a single development language stems from a fundamental confusion between development efficiency and runtime efficiency. This represents a classic case of premature optimization—solving tomorrow's problems at the expense of today's productivity.
During the development phase, only one metric truly matters: is the program fast enough to maintain developer flow? The ability to iterate quickly, test ideas, and refine architectural decisions far outweighs concerns about process overhead or language switching costs.
Consider the deeper reality: every programming language eventually compiles down to the same machine code and assembly instructions. The path from high-level abstractions to optimized execution has been well-traveled for decades. Existing techniques—queue implementations, closure optimizations, and compiler transformations—provide clear pathways for converting working prototypes into production-ready systems.
The architectural approach advocated here deliberately separates design concerns from performance engineering. This separation accelerates development even when the final product requires significant rewrites and optimizations. A working system built with the right architectural structure can be systematically optimized without losing its essential clarity.
Modern tooling makes this transition even more practical. Prolog-style relational code that elegantly handles complex data relationships can be mechanically transformed into nested loops when performance demands it. Multi-language shell-out architectures can be consolidated into single-language implementations once the design stabilizes. Large language models now excel at facilitating such transformations, making it possible to have the best of both worlds—expressive development languages and optimized production code.
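As a toy illustration of that kind of transformation, the Datalog rule `grandparent(X, Z) :- parent(X, Y), parent(Y, Z).` can be rewritten mechanically as nested loops over a (made-up) factbase:

```python
# Made-up factbase; one nested loop per goal, joining on the shared variable Y.
parent = [("alice", "bob"), ("bob", "carol"), ("bob", "dave")]

grandparent = []
for x, y in parent:            # first goal: parent(X, Y)
    for y2, z in parent:       # second goal: parent(Y, Z)
        if y == y2:            # join on Y
            grandparent.append((x, z))

print(grandparent)   # [('alice', 'carol'), ('alice', 'dave')]
```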
The key insight is refusing to accept false tradeoffs. There's no reason to endure Prolog's awkward string handling when you need its powerful relational capabilities for part of your system. Use the right tool for each job during development, then optimize the critical paths for production deployment.
See Also
Email: ptcomputingsimplicity@gmail.com
Substack: paultarvydas.substack.com
Videos: https://www.youtube.com/@programmingsimplicity2980
Discord: https://discord.gg/65YZUh6Jpq
Leanpub: [WIP] https://leanpub.com/u/paul-tarvydas
Twitter: @paul_tarvydas
Bluesky: @paultarvydas.bsky.social
Mastodon: @paultarvydas
(earlier) Blog: guitarvydas.github.io
References: https://guitarvydas.github.io/2024/01/06/References.html