Closures Are State Machines

Towards Component Based Message Passing 2024-12-26

Dec 27, 2024

This article explores the inner structure of functions, closures, objects, state machines and message passing components.

From an internal perspective, functions are but a subset of closures, closures are but objects, objects are but a subset of state machines, and, state machines lead to straight-forward implementation of message-passing components.

Pure Functions

Pure functions consist of a block of code (a “lambda”) plus data cells for input parameters and return values.

Mutation of data cells for input and return is performed solely by the underlying function-programming engine.

Input parameter data cells can only be accessed in Read-Only mode by code in the pure function.

Mutation of return value data cells is hidden within the pure function’s code. Return value data cells cannot be explicitly mutated by the programmer, except through syntax - such as return statements - provided by the pure functional programming language. Language implementations usually use CPU registers or callstack locations as targets for return values.

The fact that pure functions cannot express explicit mutation of data cells contradicts the core aspect of CPU-based computers.

Procedures With Global Variables

Data cells made available on a global (“free variable”) basis to procedures.

Global access causes headaches when multiple writers are involved, leading to concerns about “thread safety”.

Scoping tries to mitigate and corral the problems, but, fixes symptoms instead of causes.

Closures

Closures contain procedures with private, mutable access to enclosed data cells.

Closures tend to be allocated in a (global) heap to allow multiple activations and non-scoped lifetimes of the closures.

Object

Objects in OOP are like closures, except that multiple procedures are exposed. These procedures are usually called “methods”.

Objects operate in a function/procedure call/return (blocking) manner, and, basically, do not perform true asynchronous message passing.

State Machines

State machines are like Objects wherein one private data cell is reserved to act as a state variable.

State machines expose only one entry point - the handler - but, internally, the state variable determines which block of code is executed when the handler is called.

The book Let Over Lambda[1] shows how to implement state machines using lambdas.

One can view closures as degenerate state machines with only one state. Since there is only one state, the state variable can be optimized away, resulting in a structure that is exactly like that of closures.

Message-Passing Components

Message-passing components are like State Machines, but, use queues for input and output.

Queues allow components to operate in a truly asynchronous manner.

Components support the fiction of multiple ports for input and output.

Instinctively, the tendency is to assign one queue to each port. This leads to issues of internal deadlock.

On the other hand, using only one input queue per component and only one output queue per component eliminates the possibility of deadlock within the low-level implementation of components. This does not eliminate the issue of deadlock, but punts the issue up to a higher level allowing Software Architects, Software Engineers and Software Implementors to deal with the deadlock issue(s) explicitly.

Using only one input and only one output queue preserves the time-ordering of arriving messages and of generated output messages. Time-ordering information is vital for reasoning about the sequencing of events in an asynchronous system.

Insisting on only one input and one output queue per component requires that messages be tagged by port ids to provide vital port information to the internal handler code.

Input queues and output queues contain the same kind of messages, but, the meaning of the ports needs to be interpreted differently, hence, the queues must be separate (or, messages need to be further tagged with their input/output direction). On input, messages can be directly handled by the handler code, whereas output messages are tagged by sending-component-relative port ids which must be re-mapped with receiver-relative port ids.

Message-Passing Components and Private Memory

This sketch attempts to show the read/write relationships between code and private memory within message-passing components. For simplicity of presentation, the three arrows between code and input queue, output queue and private variables is shown only for one code block, but, should be repeated for every code block.

It should be “easy” and straight-forward to implement components shaped this way in any modern programming language, e.g. WASM, Rust, Javascript, Python, etc. Most modern programming languages support closures and make it easy to create queues.

Bibliography

[1] Doug Hoyte. Let Over Lambda from https://letoverlambda.com

Programming Simplicity

Discussion about this post

Ready for more?