Understanding Pattern Matching

Bird’s Eye View 2025-01-01

Jan 01, 2025

This article is meant to give a flavour for what’s going on under the hood in pattern matching. In order to keep things simple, I gloss over details. Some of the details might not even be accurate.

A Simple Factbase

A Simple Query

Query 2

There are 2 questions (“queries”) here. All parts of both queries must be satisfied before success can be declared. This is called “unification”. The pattern-matching engine backs up an retries until everything is consistent.

In the above example, the constant parts (blue rectangles) must match up, and, the holes (clear ovals) must match up. Here we have two different holes that get filled (“unified”) only when the corresponding constant parts line up.

In more complicated practical uses, there might be many anded-together queries with a complex chain of constants and holes that have to line up. In practical programming languages, like Prolog, you can write pattern matching rules that call other pattern matching rules. Each rule might contain a complicated combination of required matches. This ability to chop up (“divide and conquer”) complicated patterns into sub-patterns makes logic programming an interesting and useful tool.

If asked, Prolog will keep re-trying until every possible consistent combination has been found. MiniKanren constructs an internal data structure that can then be used to find as many matches as the programmer wants.

Query 3

How To Keep Track of Where We’ve Been?

You need to keep a TODO list.

And you need to track which options have already been tried.

The TODO list cannot be a linear structure, it needs to be a tree of some sort, because each query might contain more than one “hole” to be filled.

You need to traverse the tree depth-first.

Every time you satisfy a constraint by filling a hole, you need to clear the hole and try again. Only when the current hole can’t be satisfied while avoiding previous attempts, you can back-up left-wards and right-wards, clear and try again.

In the above example, maybe Paul plays more than one instrument (say, bass and piano - not represented in the simplistic factbase at the top). The engine has to keep trying different combinations of Q1, while keeping “Paul” in the Name slot. So, it might find “Paul play bass. Paul born 1942.” and “Paul plays piano. Paul born 1942.”. Then, the engine needs to clear the Name slot and try again with anything but “Paul” in that slot.

You need to check that holes unify with holes above with the same name.

[Aside: Unification is assignment. The programmer doesn’t get to specify the details of the assignment, the pattern-matching engine figures this out. The pattern-matching engine can un-assign values and try again.]

Many implementations don’t build the tree explicitly. They encode the tree-walk in the call-stack and use continuations. Detailed implementations can be seen in some books[1, 2].

Using Loops

There is no magic here. You can implement pattern-matching with loops. And, because loops are really recursion in disguise, you can implement pattern-matching with recursion and mutually-recursive routines.

But,,, manually-written code for this stuff tends to get messy and provides lots of opportunities for bugs.

And,,, cramming all of the details of pattern matching into a blob of code written in a general purpose language, makes the resulting code essentially unreadable. The DI (Design Intent) gets lost in piles of little niggly details.

Pattern-matching languages, like Prolog, are DSLs for pattern-matching. Programmers can write patterns at a high level, then rely on pattern-matching engines to perform all of the looping required to accomplish exhaustive search.

Early on, someone wrote pattern-matching code as loops - in assembler, probably - enough times to get sick of blowing their own feet off and distilled the essence of this one aspect of programming - exhaustive search - and created a DSL for it. At the time, writing DSLs was a black art, since convenient tools for writing DSLs did not exist yet.

When people used the DSL, say in the form of Prolog, they discovered that pattern-matching was a powerful concept that could solve many problems. A whole community of “logic programmers” sprung up and tried to show that every problem could be solved in this way. I don’t particularly agree with this over-kill approach - many problems can be solved this way, but, many problems are better expressed and solved in other ways, for example simple string manipulation and string interpolation - string concatenation, and, substring extraction - which can be easily expressed in /bin/sh and in Javascript.

Resources

Nils Holm’s Prolog Control in Six Slides[3]

Scheme code for the simplest explanation of backtracking that I’ve encountered

PAIP[2]

book about pre-LLM AI
includes source code in Lisp
rewritten in Python in later editions

On Lisp[1]

book about using the power of Lisp
chapter on building Prolog, using continuations

miniKanren[4]

modern functional-programming library and toolkit for doing exhaustive search, without the use of backtracking
Explained in a book[5]

Nova[6, 7]

modern pattern-matching and rewriting language
Used in game programming
Presentation at Handmade Cities 2024

Cl-holm[8]

Nils Holm’s Prolog Scheme code manually ported to Common Lisp

Js-Prolog[9]

Nils Holm’s Prolog Scheme code automatically ported to Javascript
(Not heavily tested, may need polishing)

OhmJS in small steps[10]

Diary of porting Nils Holm’s Prolog Scheme code to Javascript using OhmJS[11]

Prolog for Programmers[12]

Video regarding simple difference between Prolog and 3GL languages, like Python, Javascript, Rust, etc.

SWIPL[13]

Modernized version of Prolog, including many libraries

Combine prolog+js+bash[14]

Demonstration of combining Prolog and Javascript into a single DSL

WAM[15]

Warren’s Abstract Machine
De facto “opcodes” and engine for implementing Prolog

Aït-Kaci’s tutorial on WAM[16]

A tutorial about WAM

Gnu prolog[17]

A version of Prolog that can show generated WAM code

Read and write JSON in Prolog[18]

Simple demo of how to read and write JSON in the Prolog language using SWIPL

Bibliography

[1] On Lisp from https://paulgraham.com/onlisp.html

[2] Paradigms of Artificial Intelligence Programming (PAIP) from https://github.com/norvig/paip-lisp

[3] Prolog Control in Six Slides from http://www.t3x.org/bits/prolog6.html

[4] minikanren from http://minikanren.org

[5] Daniel P. Friedman, William E. Byrd, Oleg Kiselyov, Jason Hemann, Duane Bibby, Guy L. Steele, Gerald Jay Sussman, and Robert Kowalski. 2018. The reasoned schemer. (Second edition. ed.). The MIT Press, Cambridge, Massachusetts.

[6] Nova from https://forum.nova-lang.net

[7] Democratizing Software from At about 1:24:00 of

[8] cl-holm-prolog from https://github.com/guitarvydas/cl-holm-prolog

[9] js-prolog from https://github.com/guitarvydas/js-prolog

[10] Ohm In Small Steps from https://computingsimplicity.neocities.org/blogs/OhmInSmallSteps.pdf

[11] OhmJS from https://ohmjs.org

[12] Prolog for Programmers from

[13] SWIPL from https://www.swi-prolog.org

[14] Combining Prolog and Javascript Into a Programming Language from https://programmingsimplicity.substack.com/p/combining-prolog-and-javascript-into?r=1egdky

[15] WAM from https://en.wikipedia.org/wiki/Warren_Abstract_Machine

[16] Warren’s Abstract Machine - A Tutorial Reconstruction from https://direct.mit.edu/books/monograph/4253/Warren-s-Abstract-MachineA-Tutorial-Reconstruction

[17] GNU Prolog from https://en.wikipedia.org/wiki/GNU_Prolog

[18] Read and Write JSON in PROLOG from

Programming Simplicity

Discussion about this post

Ready for more?