Episode 4: Seemingly Disconnected Things
Chris Patuzzo tells the story of creating the Sentient programming language, with diversions into NAND to Tetris, self-enumerating pangrams, the boolean satisfiability problem, and The Witness.
- Download an MP3
- Subscribe via RSS or iTunes
- Featuring @cpatuzzo and @tomstuart
- Music by Martin Austwick (@martinaustwick)
- Published on 2016-08-19 at 14:00 UTC
- Recorded on 2016-08-02 in Hackney
- Contact @whyarecomputers or firstname.lastname@example.org
Tom: Hi, welcome to Why Are Computers, a podcast where people talk about electronic stored-program digital computers with a mixture of reverence and dread.
I’m Tom Stuart. This is episode four, which I’m sorry about.
My guest today is Chris Patuzzo. Hi Chris!
Tom: Who on earth are you?
Chris: I’m a programmer, I work and live in central London. I don’t really have any big projects to my name, but I’m infinitely curious about computer science and I’m glad to be here.
Tom: You’re here today because you have a story. It all started with a puzzle. Can you tell me what happened?
Chris: Yeah. About four years ago, I was introduced to a word problem at work; a colleague was curious about this thing. It’s a word problem where you have to try and find a sentence that describes itself. These sentences are called “self-enumerating pangrams”. They contain a count of all of the letters within themselves, all the way through to Z, starting at A. So it will be something of the form “This sentence contains three As, two Bs, four Cs…”. And it’s a self-enumerating pangram if, when you actually tally up the characters in all of these words that make up the numbers in the sentence, it is in fact a true statement.
Tom: Okay. So the numbers are all written out in full as English words?
Tom: What makes this puzzle difficult or challenging? How difficult is it? Is this something that I could do with pencil and paper?
Chris: You could certainly try, and that’s where I started with this problem. I got a pad of paper and I tried it. And you quickly realise just how fiendishly difficult this thing is.
It’s very self-referential: as soon as you start changing any of the letters in the sentence, it throws off all of the other counts, and you’re back to square one. You notionally would like to zero in on a solution, and it’s very hard to do that. It’s this really chaotic thing.
Tom: So has this puzzle actually been solved? When you heard about it, did it come with a solution, or were you trying to solve it for the first time? What’s the status of the problem?
Chris: I heard that it had been solved, but I didn’t really go off and look for those solutions. I was hoping to solve it in isolation somewhat; through my own devices, come up with something that would work. I’d heard that other people at this company I worked for at the time had tried it, and played around with it for a while, and given up, and just moved on with their lives. And I’m not sure why I didn’t do that, but this one stuck with me, and it’s been pervasive in my thoughts ever since.
Tom: So what happened next?
Chris: I took this problem away, and I basically tried to write a program to do what I would do with pen and paper. I began by generating a random sentence with a mix of these numbers, and treating it like a minimisation problem: I wanted to minimise the number of errors, so I’d find the number-word which was off by the most characters, and I’d try and change that to something else. And I’d do this iteratively, over and over again, trying to get down to zero.
Tom: So that’s just brute force, basically?
Chris: Pretty much. Yeah. Using this heuristic of minimisation, but just trying sentence after sentence. The original program I wrote for this was something in Ruby; it was the language I was using every day at that point, so it was just from my grab-bag.
I iterated on it a little bit — I refined my program — and I ran this thing on a pretty beefy desktop computer at home, in my fairly small flat at the time, so I could hear it buzzing away. And this thing ran for about ten days…
Chris: …before it found one solution.
Chris: Yeah, I was successful, which was really exciting and thrilling — to see that one sentence appear on a screen after ten days of waiting. For now, that was enough for me. I was content to move on with my life. I’d done a little bit extra on this problem, I could put it under the rug for the moment and move on to other things.
Chris: About a year later, I was looking at YouTube videos on computer science, trying to satisfy my need for something a bit technical and a bit different from my day-to-day job. And I started looking at a book, which is colloquially known as “NAND to Tetris”. It’s a book that talks you through how to build a computer from the ground up, starting with really low-level NAND gates, in digital circuitry, and it has this cascade effect where you build multiplexers and half adders and full adders, and you go all the way up to a fully-blown arithmetic logic unit, and then a CPU. You then move on to the software side of things, so you write an assembler, and you work your way up the compilation stack.
And the idea is that this book walks you through the whole journey, not in a huge amount of depth at any stage, but it gives you a flavour for all of the bits of building a computer — at least in an abstract form — from the ground up. This thing really captivated me, and I started working on it at home. I was doing a chapter every day or two at the start, I was really addicted to it.
About halfway through this book, I’d pretty much built the hardware side of the stack. The hardware side of the computation stack is building all of the chips and componentry that you’d have in a computer.
Tom: So were you building this stuff physically, or in software, or were you not building it at all — just reading the book and understanding it?
Chris: I was building it in software. The book comes with a Java toolkit for building these things. I felt myself getting quite frustrated at this tooling, because I had been exposed to Ruby and a lot of the DSLs it has. So, as I invariably do, I ended up getting sidetracked, and instead of pushing all the way through and finishing my computer, and doing all of the software stack, I ended up trying to write a hardware description language, which was a replacement for what this Java toolkit was doing, but writing that in Ruby.
Tom: So, the Java tool gave you a way to write a description of a physical circuit as a text file, essentially, and then simulate it…
Tom: …and so what was the replacement that you were making? You were making a replacement for the Java simulator, but you were also making a new language for it to execute? Is that right?
Chris: Yeah. My ambition was to replace it, like for like, but with a slightly more elegant syntax. Something that was a little bit cleaner, but also more testable. I found one of my frustrations with the toolkit was, it was a lot of testing at the high level. You’d say, “here are my two hundred inputs, and I expect these two hundred outputs”, for, say, an arithmetic logic unit, which takes a huge bus of pins.
As someone who had done a lot of test-driven development, I kind of liked the idea of having a toolkit that let me work in a way that was a bit more compatible with how I’d worked in the past. It encouraged those incremental development behaviours that I like. That was a hugely interesting project, and taught me a lot about things like parsing syntax, and building abstract syntax trees, and then traversing over those to understand the semantics of a program.
It was interesting as well, because it was a declarative language, which means — in the context of these hardware description languages — you can order things in any way you like, and it will just figure out how everything fits together. It will figure out all of your circuitry and wiring, regardless of whether you declare an adder before something that actually uses that adder. That was really interesting.
I wrote this thing in Ruby and, it kind of got me thinking: I wonder if I can build a chip for solving this contrived word puzzle. An electronic circuit, essentially, which would take as input a bunch of numbers, which is an input to this self-enumerating pangram problem, and as output there’d be a single pin which is like a
false boolean, which is like, “yes, this is a true self-enumerating pangram” or “no, it’s not”.
Tom: So it’s a checker for self-enumerating pangrams? Not a piece of hardware that can compute one, but one that can just check one.
Chris: Just one that can check one.
Tom: What you said before is that a self-enumerating pangram has got a load of numbers in it, esentially — how many As, how many Bs, how many Cs — and so presumably the input would be the quantity of As, the quantity of Bs, the quantity of Cs, and so on, and then I just get a
false, one-or-zero output to tell me “yes, if you plug all of those numbers into a sentence, then you’ve got yourself a self-enumerating pangram”.
Chris: Exactly. At the time I was watching some lectures on YouTube, and one of the topics was something called “reductions”. The idea of reductions is that if you’ve got a problem in one form, you can often turn it into another form.
At the time, my thinking was that a lot of developments have been made in recent years with circuitry and optimising chip design, and making them really efficient and removing redundancy from them, so I had this idea that maybe if I can turn it into a circuit or a chip, then you could do a lot of analysis on that, and you could somehow work that backwards so that I could set that output pin to
true, and it would somehow give me a valid set of inputs that produces that output. Kind of turning it on its head.
Tom: So you didn’t necessarily have a plan to make a circuit that could directly solve a pangram…
Tom: …but you thought that if you made something that could check them, then maybe — like you said — you might be able to somehow run it backwards and get, “what are the inputs that I need to put into this to get
true on the output?”.
Chris: I had no idea about whether that would even be possible, but it seemed like an interesting project, and it was just a curiosity thing. Even if it hadn’t worked, it would have just been interesting, I think, to build that chip.
Tom: So you managed to design this chip?
Chris: Initially I didn’t go ahead right away and start building this chip. I realised this language I’d written was… it had a few issues, in that as soon as it started to scale, it started to fall over. It had really high memory usage. It really showed that building a HDL is a hard thing to do, and there were some really important features that my HDL couldn’t do that the Java one could. So I kind of noted that as a limitation, and thought “maybe I’ll revisit this again in the future”.
From Chips to Equations
Tom: So you had some performance issues with your Ruby implementation of this HDL. How did you fix those?
Chris: About a year after my initial attempt at this, I actually heard that there was a club in London — the London Computation Club — they were actually looking at this NAND to Tetris book. So I figured maybe I could go along and pick up where I left off, and potentially finish this book instead of getting sidetracked and building other things that were beside the point.
I started going to these meetings, and we were quite early on in the book at that stage, rebuilding that hardware stack again. It reignited my passion to try and build this pangram chip, but do it properly this time, and in order to do so I realised I’m probably going to need to architect this code better, to make it more performant.
So I spent a good while doing that. I’d learnt a lot from the initial implementation I wrote in Ruby. I’d been doing a lot of domain-driven design at the time as well, and that influenced my thinking of how to architect this thing.
Tom: Am I right that the challenge of this thing is that the HDL describes connections between things — so, it’ll say “this component is connected to this component, which is connected to this component”, but in no particular order — and then the job of the simulator is to try and figure out, when a certain number of inputs are set in a certain way, it has to figure out how those electrical currents would move through the simulated circuit. And so that’s why you need this topological sort, is so that you can make decisions about what order you have to figure out what the inputs and outputs of each component in the circuit are going to be?
Chris: Absolutely. If you actually want to simulate these circuits, i.e. run inputs through them and get outputs out, then you need to resolve that ordering issue. Otherwise all you end up with is a pretty-looking graph, but you can’t actually do anything with it.
Chris: I did. It works really well, actually. If I had more time I’d like to turn it into something that people could use as an alternative for this NAND to Tetris book. It still has a few rough edges; it doesn’t do literally every feature this thing does. But it does a few other things. The testing approach for this thing is nicer than the book’s; the language is a little bit simplified.
You also don’t get anything for free with this language. In this NAND to Tetris book they give you NAND for free, which is one of the primitive logic gates, whereas in my language you can define your own truth tables as well, so you can use those as starting points for building other chips.
It reached a point where I thought, “I’m going to have a crack at this pangram thing. Maybe I’ve been over-hyping it the whole time, and it’s a couple of days’ work, it’ll be fine, I’ll do it and I’ll look at it and it’ll be a pretty graph”. It didn’t turn out that way. I could leverage the test suite I’d put in place.
It took me probably about two or three weekends to build this pangram chip, which is not overly complicated. It’s a mixture of lots of addition operations: as you’d imagine, if you’re looking at the frequency of letters in a sentence, you need to add all of those letters up. And it would use a basic lookup table chip. These truth tables I had added to the language, I could pretty much just define a truth table for each of the number words — the word “one” would be “o”, “n”, “e”, I’d just put a bit in each of the columns that represented that letter.
I managed to actually get this thing working. I test-drove the pangram chip, which was hugely rewarding. I was grabbing examples from the internet where other people had solved this problem through all kinds of weird and wacky means, and those were forming my integration tests. It was like, here’s an actual genuine self-enumerating pangram, run it through this chip, and it produces the right output.
So that was great, but then I thought back to that initial inclination of, maybe there’s some way to do things with this; now that I’ve got this schematic of a chip, maybe I can use some kind of industry-standard simplification techniques and have a really efficient chip, or maybe there’s some way to turn this on its head and set that output pin to
true, and maybe that can ripple back through the circuit and give me valid inputs that produce that output.
Chris: I knew it wasn’t a deterministic thing. It had to make arbitrary choices at each stage, because I knew that there were multiple solutions to this problem. If you look at it from a purely functional perspective, it’s a many-to-one thing.
So I wasn’t really sure how to do that, but after doing some reading around — and I think I was in conversation with you at the time — I found something called the Tseitin transformation. Someone wrote a paper in 1968 about how you can take a combinatorial chip and turn that into something called the boolean satisfiability problem. Which sounds very technical and computer-sciencey, but all it is really is, if you have an equation of booleans, like
a or b and c, the idea is that you try and assign a
false value to each of those variables. So
a can be either
b can be
c can be
false. And you have to try and come up with a satisfying assignment — something that makes that entire statement
Tom: So this is almost like a mathematical puzzle.
Tom: It’s not a million miles away from the self-enumerating pangram problem, but it just has
falses in it.
Chris: Yeah, it had kind of gone full circle from this big software engineering project back into the world of theory and mathematics and reasoning about these boolean equations again. Which was kind of surprising at the time, that I was able to take this chip and think of it in terms of a boolean equation rather than a piece of software.
Tom: So what’s the advantage of being able to turn a circuit — that tells you whether a particular sentence is a self-enumerating pangram or not — into a big boolean expression that does the same job? What does that win you?
Chris: The thing it wins you is the fantastic work of decades of research into this very specific problem of trying to satisfy a boolean equation.
It turns out that SAT, as it’s abbreviated, is a very central problem. A lot of things reduce down to SAT. So that was the advantage of coming up with one of these satisfiability equations, in that I could then look at the existing tooling for trying to actually solve these equations.
Running It Backwards
Tom: You’ve already told quite a lot of story. And now you’re saying you had at least discovered this transformation that meant that you could turn, in principle, one of these HDL circuits into a boolean satisfiability problem, which is something that you could then feed into one of these SAT solvers in order to… do what? What’s the upshot of this? If I had one of these circuits, and I turned it into an instance of the boolean satisfiability problem, and I fed it into one of these solvers, what would I get out the other end?
Chris: The thing you’d get out is a set of assignments for each of the variables —
false for each variable in that very large equation. The idea is that you can then interpret the result of that and turn it back into something meaningful. You can look at all of those individual bits and you can reconstruct the integers that those bits represent in the first place. So you can turn that output back into the actual numbers that would appear in one of these sentences, very indirectly getting a solution for one of these self-enumerating pangrams.
Tom: So that’s ultimately why you were interested in this, is because this does provide you with a way of sort of running your circuit backwards, by turning it into a problem that you can feed into a SAT solver, and that can tell you “if you want the output to be
true, then here are all of the values of the inputs”. And then as you say, you can just re-interpret that back in the world of the circuit, and then you have the information that tells you: if you write a sentence that’s got three As and five Bs and three Cs then you’ll have a self-enumerating pangram.
Chris: That’s exactly right. One of the great things about the satisfiability problem is that it gives you an assignment of everything, and that includes all of the inputs to the chips. So when it finds that assignment that produces a
true output, it’ll give you the entire set of inputs that produce that. Which you can then translate back into a sentence.
Tom: Did you manage to do all this, and did you manage to generate a different self-enumerating pangram in less than ten days?
Chris: At that point my jaw dropped, that was amazing. Given I had tried this years ago and the best I could do was this cobbled-together algorithm that took ten days to compute, I’d gone all the way down to a few minutes. And most of that was on just throwing out huge amounts of redundancy from this equation. It was just simplifying things that just didn’t need to be there. There’s certainly scope for bringing that down further, and solving it probably in just a few seconds.
Tom: Well, it’s really impressive that you managed to make such a performance improvement. The only disappointment, I suppose, is that you’re solving a problem that’s already been solved. From what you said, it sounds as though plenty of other people had already generated these self-enumerating pangrams.
Chris: Yeah. It actually sparked my interest to go back and read where this problem came from in the first place. As far as I can tell, the first occurrence of this problem was in Douglas Hofstadter’s “Metamagical Themas” column in Scientific American, which is quite a prestigious magazine.
Someone had originally come up with a sentence in Dutch, and a famous electronics engineer-slash-mathematician had picked this up in England — I should state his name, it was Lee Sallows — and he’d looked at this problem as early as 1982, 1983. And surprisingly, he had gone a very similar route. He’d started thinking about writing a program, I think he’d tried to narrow down the search space, did a bit of reasoning about the problem.
He’d written a program which, at the time, it could check about ten solutions a second, which was probably quite good for that time. He ran it, I think, for a couple of days, and then did a bit of math on that problem and realised it was going to run for about thirty million years. So he then started looking at ways to improve that, thinking about the problem and scoping it down further. And eventually he kind of went down the hardware route as well. He ended up building this pangram machine which was a dedicated piece of hardware which would cycle through attempt after attempt. And he got it running at about a million attempts per second, which for the time must have just been amazing, this really specific piece of hardware to do this one specific task.
Tom: But unlike yours, this was real hardware, not simulated hardware.
Chris: Real hardware, and he wasn’t thinking about running the circuitry backwards or anything. He was taking that brute force approach, having narrowed down the search space as much as he could, and exploring regions that he thought there would be solutions.
Tom: So this hardware was just a way of brute-forcing it much faster than he could on a general-purpose computer.
Chris: Yeah. Originally I think he ran this thing for a month, which shows how times have changed: this very immature algorithm that I wrote and ran in ten days on a desktop computer could beat this very specific device of the day. But it just happened to be in the 1980s.
I felt like I’d solved it and I’d come up with a novel approach for doing so, but I still wasn’t ever so satisfied, in that it was already a solved problem, and this chip that I came up with was really hard to write. It took me many weekends. This hardware description language just felt like a very clunky way of actually trying to solve this thing.
So there were two issues. One was the originality issue, in that other people had already solved this problem. And secondly was just the fact that this really isn’t a very good way to be thinking about these problems, in terms of circuitry and chips, when actually when we think about these problems as humans, we’re thinking about letters and numbers and addition, and when you’re operating at a chip level you have to actually be very strict about what those things mean.
Tom: Going the route that you did seems quite circuitous. That’s a joke.
An Unsolved Problem
Chris: So, to tackle those two issues. The first one — originality — I actually contacted Lee Sallows to tell him about this approach for solving the problem. He was kind of like, “oh that’s great, well done, but here’s an even harder, more arcane, fiendishly difficult problem for you to think about”.
I thought, okay, I’ll rise to this challenge, I will try and attempt this slightly— I say “slightly”, it was quite a lot more difficult. It’s a form of this sentence where, instead of simply counting how many of each letter appear in the sentence, like “this sentence contains three As, three Bs…”, it would be in the form “this sentence contains three point one two percent As, two point six percent Bs…”, where each number in those percentages is written out as the word that represents that number, all the way through to Z.
On the surface it looks very similar, it’s certainly in the same class of problems, but this is a lot more difficult for two reasons. One is, you’re having to be a lot more precise about this thing. It’s no longer just a simple tally of things; there’s pretty much zero room for error here. On top of that, it’s far more chaotic than the original form of the sentence, in that as soon as you add a single character to the sentence, it actually throws every single other count off, because you’ve affected a percentage — not just a count, but something that’s a function of everything in that sentence.
Tom: Right. So the original problem’s difficult because, when you have those counts of letters, if the word “two” changes to the word “three”, then suddenly there’s one fewer W and one fewer O, and now there’s a new H and a new R and two more Es. But in the percentage version of it, not only do you change the counts of those letters, but you also change all of the percentages in the whole string because the length has changed, and then presumably if you go and update all of those percentages then that changes the length of the string again, and you’re stuck in this endless updating.
Chris: Yeah. It’s far more chaotic than the one before. The previous form of the sentence would… you’d change a few words, and a few of the numbers, and a few things would change; it was very hard to actually converge on a solution, but it wouldn’t throw everything off entirely. Whereas this form, it’s just impenetrable, if you were going to actually attempt this with pen and paper.
Tom: So is this a completely unsolved problem?
Chris: As far as I’m aware, it is. Or… it was!
Tom: Oh, interesting! So what happened next?
Chris: I spent another probably two or three weekends — it was probably a lot less, after I’d learnt all the Vim macros I needed to construct these arcane chips — I spent that time essentially modifying the existing chip I had, to figure out percentages.
Designing a chip that could do division in hardware is not a straightforward thing. I spent one of those weekends just figuring out how to do long division in binary on a whiteboard at home. And on top of that, most of the algorithms for doing long division in binary are sequential, so you have to unwind that loop and do the whole thing in a single pass.
It took a lot of thinking about. But I could essentially use the tooling I had already provided to modify what I had to attempt this harder version of the problem. I then rented another server from Amazon, and turned it into a boolean equation, which turned out to be about ten times as big as the one before. And then I ran it on my computer at home, and after about three days it had found a solution.
Chris: It had solved it, and I decided to dedicate my first solution to Lee Sallows, so I put his name in that original string. So yeah, three days of running, compared to five minutes or so for the simpler version of the problem. I think that just goes to show how much more complicated this thing was. And I think if we’d tried to build a dedicated piece of hardware in the 80s to do this, I think even an ASIC would have struggled with this back then.
Tom: I have the sentence written here. Shall we read it out?
Chris: Yeah? Sure.
Tom: Do you want to read it, or shall I read it?
Chris: I think you should take the honour.
Tom: Okay. “This sentence is dedicated to Lee Sallows, and to within one decimal place, four point five percent of the letters in this sentence are As, zero point one percent are Bs, four point three percent are Cs, zero point nine percent are Ds…
Tom: …zero point one percent are Ys, and one point six percent are Zs”.
Chris: Bravo! That’s fantastic.
Tom: So that’s a great success — you managed to solve an unsolved problem. You’re the first person in the world to ever generate one of these self-enumerating pangrams with percentages in. You got into the Guardian for this!
Chris: That’s a bit of an exaggeration. Lee Sallows got excited and emailed a few of his contacts, and I think it ended up on one of the puzzle pages in the Guardian somewhere on the web site.
Tom: But then as a result of that, earlier today I went and had a look at the Wikipedia page for self-enumerating pangrams, and you’re on there now, so you’re now published on Wikipedia!
Chris: Great, wow, okay.
Tom: But this isn’t the end of the story. What did you do next?
Chris: It’s not. So I’d solved that originality problem of solving a problem that previously hadn’t been tackled. So the next step was thinking about a better way to do this. I don’t want to be building some arcane circuitry every time somebody challenges me with a word puzzle, so I thought, what’s going on here? Can I do something better? And I knew immediately that writing Vim macros to be dealing with huge amounts of circuitry, a lot of that could be turned into code, and parse it; I could have some generative grammars or I could borrow some ideas from computer science in that regard.
But there was a nagging thing which was: none of this is intuitive yet. It all feels like I’m having to lower myself to the level of the machine. I’m having to lose my humanity a bit, and go down to boolean equations and circuitry, and that just didn’t seem like something scalable or that people would buy into as a concept. It didn’t seem like that was the way forward in this area.
So I started thinking about: could I write a programming language, or some kind of parser, or syntax thing, that would make this easier? I started looking at this area a bit more, and realised that there was a strong connection here with constraint solving and declarative programming, which are huge fields in programming — I’m almost embarrassed that I didn’t initially think of those technologies as a means of solving this problem.
I had a look at them, and the μKanren stuff is really good, and I’m yet to explore a lot of that in more detail. But as somebody who enjoys making things, and writing code, and learning as I build things, I wanted to see if I could actually make this a real thing: start to provide human abstractions for this stuff, rather than it being a machine-only thing.
I decided to start writing a programming language, if you would call it that. I had two goals of this programming language: one was to use all of the learnings I already had from turning something into a boolean satisfiability equation, and then feeding it to one of these fantastic solvers that already exist. But my second goal was to make it as expressive and — learning from ideas from Ruby and domain-specific languages — make it as human-friendly as possible. And by that I mean actually making it like a programming language: if you’re thinking about an array of numbers, having an array of numbers in this programming language that you can manipulate and you can work with, and be able to do addition on things, and not have to actually turn that into some circuitry under the covers.
Tom: So this is ultimately making a tool that, if it had existed when you first encountered this puzzle, you could’ve just written a little program that said “here’s the lookup table for how many letters there are in each human word, and the totals are just going to be found by mapping across all the characters in the string and adding up the numbers”, and if you’d had this programming language to begin with you could’ve just expressed the problem in a much more natural way, and you could’ve just executed it and it would’ve told you it found a solution, rather than you having to express it in terms of digital circuits and division chips and HDL.
Chris: Exactly that. I wanted to write the tool I wish I had to begin with. And maybe they already exist, but I wanted to go through that journey myself. So I started writing a programming language, which I coined “Sentient”, which is quite an audacious-sounding name in that it’s, you know, A.I. and computers thinking for themselves. So I tried to justify this with a very tenuous anagram of the word “Tseitin”, which is that final little solution I needed to turn a chip into a satisfiability equation. That’s how I justified it.
Tom: I think that makes it completely defensible.
Chris: Yeah. It’s not even close, though. We’ll just gloss over that.
Tom: Yeah, that’s fine.
Chris: Initially I grabbed Ruby from my grab-bag, and I started thinking about writing this object-oriented style. What I wanted to do was take a high-level syntax that you’d see in a programming language, and turn it into an abstract syntax tree, and eventually make sense of that to turn it into one of these satisfiability equations.
And I toyed around with Ruby for a while, and what I realised from this initial approach was, I wasn’t actually sure what the boundaries of this language were — like what things could it do, and what couldn’t it do. The travelling salesperson problem, where you’re trying to find a minimum of a maximum, that’s certainly something you can compute in a Turing complete programming language, but I wasn’t sure if I could compute that. I kind of knew that I almost certainly couldn’t, in that these things have to have a
false output — you can’t specify those types of operations on things.
So I scrapped that and took a different approach, where instead I started looking from the ground up, building a programming language from the bottom up, where I’d start with my boolean satisfiability equation and introduce the smallest possible abstraction I could to produce that boolean equation. So, introducing things like
nots, and equality, and those kinds of really low-level boolean operations. And then the idea was that if I had a really good level of abstraction, I could then step that up a level, and from booleans I could then build integers. And from integers I could build arrays. And I could keep pushing it up a level and figuring things out as I go, rather than knowing what the syntax is immediately and somehow figuring out how to turn that down into one of these things.
Tom: So, when you had the boolean layer working, that means that you could write programs in Sentient that said things like “
a = b and b or not c”, or something like that, and it could turn that into a boolean satisfiability problem, which it almost sounds like it is, but it’s like a more constrained version of that.
Chris: Yeah, it uses something called conjunctive normal form, which is a boolean equation but it has even more restrictions on what form it can take. You can’t have things in nested brackets. Everything has to be
ord together inside a term, and then terms can be
anded together; that’s where the “conjunctive” normal form bit comes from.
Tom: The language already had to do a fair amount of work to turn an arbitrary boolean expression containing equality and maybe boolean operations that aren’t just
ors; it has to turn that into a SAT problem. I assume there’s also the job of not just compiling that program into a SAT problem, but then there needs to be some kind of runtime that is able to hand that off to a SAT solver and then, when a solution comes back, it has to interpret the result of the SAT solver in terms of the original program, right? So I want to know: what is the value of
a? what is the value of
Chris: Yeah. To give a bit of context, if you
and together two things in a general way — in first-order logic, which is the first level of abstraction — that actually produces three different bits when you turn that into this very strict form.
So there was a kind of duality to this language, in that you had a compilation side that would turn it down into SAT, but then you had a kind of interpretive side that would, after you’d run this program on a SAT solver, it would then start to unpack that result and express it in the language that that level used. It was something that was a bit of a challenge, in that most programming languages you’re telling the machine what to do, and you’re not having to think about communicating back up through that stack. You’re operating on the machine and you’re drawing to the screen and those kinds of things, but you’re never unpacking something at that really low assembly level and then starting to decompile it back up the stack again. So to even build that first level I was having to already start to think about the architecture of this thing.
Tom: So once you had that boolean layer working, you were able to build integers on top of that, literally by implementing arithmetic in terms of boolean operations on individual bits?
Chris: Yeah. I borrowed a lot of my learnings from the original circuitry — I’d already implemented adders and I’d implemented division in hardware, so I kind of knew how to do those things. I implemented all of the typical arithmetic operations you’d expect that operate on integers, which are just comprised of bits. I used two’s complement notation which allows for negative integers as well.
This was one of the earliest times I started to really see the power of this language. Quite early on I was implementing multiplication and division, and I had figured out how to do multiplication first, and was like, “okay, that was a huge chore, it took a really long time to get working and to test and make sure it was consistent, I’m now going to move on to division”. And I realised that I could actually write division in terms of multiplication: I could invent two variables, multiply them together, and then state an invariant or constraint that this must equal the result, which is the quotient in integer division.
Tom: Right. So this is exactly the kind of program that I would have expected to be able to write in this language, is something like:
x + y = 4, and have Sentient come back and say, well,
x = 1 and
y = 3. But what you’re saying is, I could just as well say
x × y = 12, and it should be able to come back and say
x = 3 and
y = 4, and that’s how you got it to do division, was just by leveraging its existing ability to run multiplication backwards?
Chris: Yeah, pretty much exactly. If you are feeling adventurous and want to look at the code, you can see that one is really just an inverse of the other. I looked on Wikipedia at the definition of Euclidean division, which specifies that the remainder must be a positive integer, and it states that equation, and I really just implemented that using the things I had already built.
So already it was starting to bootstrap itself, and I could start to leverage things I’d built already to continue building upwards. And this reminded me a bit of those very early days of building chips and hardware and studying that NAND to Tetris book where you’d start to build things out of other things, and you’re kind of building up your toolkit as you went, and it was really quite reminiscent of that.
The Past and The Future
Tom: It’s interesting that this whole project has really just been a collision of lots of apparently unrelated things that you were interested in at the time, and they’ve all sort of come together in this one project.
Chris: Yeah. I find the most interesting projects are like that. It’s someone who has experienced quite disparate things and different ideas, and just want to mash everything together. I find that really fascinating as a concept for a project — just like, pick ideas out of a bag and see if you can build something that incorporates each of those ideas.
Tom: But in particular, you weren’t pursuing this as a goal initially. It’s not like you learned all of these different things so that you could build this programming language, it’s just that you happened to have already learned these different things following your interests, and then presumably this seemed like the obvious next step, was to make your own programming language?
Chris: I guess so, yeah. It did seem like everything pulled together, and the bits where I was quite fuzzy about “how on earth do I do multiplication in hardware?”, I went and learnt about those things, and it’s been great for just building a really broad understanding of a lot of quite seemingly disconnected things.
Tom: So you built some more of these layers — you had the booleans and then the integers, and you mentioned arrays — how many layers are there here? How sophisticated does this language get?
Chris: The language has four layers, and I think that’s probably where it will stop now. My first three layers in the stack — as you say: booleans, integers, arrays predominantly — those are all stack-based languages. You push variables onto stack, and you’d perform an operation on them, and that was all preparing myself for that layer where I start to actually turn this into a concrete syntax.
Chris: No, at this stage it was really just a really quite primitive stack-based machine. Each layer is almost a completely isolated thing, it has a contract with the layer above and below it. That’s kind of how I tried to architect things.
Tom: But now you’re at the point where you have added a concrete syntax to it, and people can write a Sentient program in something that looks more like a conventional programming language?
Chris: Absolutely. Again, looping back to Ruby and having done a lot of Ruby programming in my time, I tried to model a lot of my concrete syntax on what Ruby can do. It has a really good standard library, and there’s almost no overhead to calling methods on things.
It’s always going to struggle with being Turing complete — that’s a boundary I just can’t cross, as an inherent limitation of the language. But I’ve tried to approximate functions as best I can; you can’t have recursive functions, but you can define functions which are more like macros. It’s a means of calling the same code as you iterate over an array, for example. You can pass function pointers around, within certain constraints.
It’s really getting to that point where I’m able to write programs that really do express how I think about a problem. That original pangram problem, I’ve written it in this language and it’s pretty much exactly how I think about it, line by line. And it’s less than a hundred lines. I think half of that is just this lookup table which is mapping the number words into the digits that appear in those numbers.
Tom: Are there any other notable programs that you have written in it?
Chris: Yeah, I’m trying to capture lots of examples on a web site which I’m building, which is at sentient-lang.org. Things I’ve written as a means of testing the programming language are things like Sudoku solvers, the eight queens problem which is one of the “hello world” programs of declarative programming. Magic squares, these puzzles where all the rows and columns and diagonals add to the same number.
The most adventurous project I’d like to work on is, there’s a game called The Witness, which you and I have both played, where it has these geometric maze-solving problems. On the surface it sounds simple — you have to navigate a maze from start to finish — but slowly over the course of the game they start adding in additional constraints, and each of the different symbols in these puzzles have different rules. I kind of see that as the pinnacle of fiendishly difficult problems to solve with this language; if I am able to express that problem in this language, then I’ve kind of done what I wanted to.
I’m actually using problems as a means of stress-testing the language, as well. So if I’m writing a Sudoku solver, thinking “I really want to transpose this array here” — with a nested array I want to swap columns for rows, and then iterate over it, and make assertions that all of the elements are unique — I’m like “okay, well, I don’t have
uniq yet, so let’s add that to the standard library; I don’t have
transpose or different types of enumeration yet”. So I’ve been gradually building up this language as I go.
Tom: And just to clarify. For each of those things, if I wanted to write a Sudoku solver, or a magic square finder, or a Witness puzzle solver in Sentient, that challenge is just: how do I express the rules of Sudoku, how do I express the rules of magic squares. So, for magic squares, I would need to write a program that says “when you add up all the values in each column, they all add up to nine” or whatever, and specifically I don’t have to know how to implement an algorithm to find the solution, I just have to know how to explain what success looks like, and then I rely on the SAT solver to actually find a solution that satisfies the rules that I’ve written?
Chris: It almost seems too good to be true. You can write a program that just describes the problem, and then Sentient — by passing it to a SAT solver — will just try and figure it out by itself. If you were to write that Sudoku problem, you would just say like, “all of the columns must be unique, all of the rows must be unique, the boxes must be unique, and the numbers must be within this range”, and that’s pretty much the whole program. You’d then run that, and Sentient — probably at many solutions a second — it would solve that problem for you.
That ties back to a whole other area of computer science, around nondeterminism and NP problems versus P problems, and I find it fascinating that this really big engineering project has so many ties to a lot of the really theoretical stuff that dusty old professors talk about. And it’s actually doing something along those lines, and inventing something from it.
Tom: Yeah, and it’s really very impressive that you have got so far with this. That just by following your interest in various disconnected things, you’ve managed to make it all build up into this thing that — it sounds like — is now capable of actually solving problems in a useful way. And if it had existed when you’d first heard about this self-enumerating pangram problem, it would’ve been the perfect tool to use to solve that, and now, somehow, it exists!
Chris: Yeah, it’s really rewarding to look back at reckons I had a year or two ago about “I wonder if I can run these things backwards?”, or “wouldn’t it be great if there was a high-level language to do this?”, and it’s like, I’m not a hundred percent there yet, but it’s great to reflect on that and see that actually I’ve now got this thing I was craving all that time ago.
Tom: So looking to the future a bit: it’s not at version 1.0 yet. What will version 1.0 of Sentient look like? Have you thought that far ahead, or are you just taking it a day at a time?
So I’d like to build out that web site, and in doing so create a bunch of fun, interesting examples that demonstrate the capability of this language. And while I’m doing that, I’m learning more about what needs to be in the language, I’m stress-testing it. So I want to arrive at a stage where I’ve got a good set of examples, and I’m happy with the way the language solves those examples. I think I will release version 1.0 at that stage.
If you are interested in the language, it is online and I think it would be great to get feedback, especially from people who work with this stuff. If you’ve used Prolog before, compare the “hello world” example in Prolog with what that might look like in Sentient. Feedback on that would be great.
Tom: You mentioned that some of the work involved in doing this has been done at least in the company of other people — you mentioned London Computation Club, which I’m also a member of — and that you read a book, and that there have been other people involved, but it sounds like you’ve mostly done work on this on your own, it’s sort of been a solitary project for you.
I was interested in what it’s like to build something this large and this complex all by yourself.
Chris: It’s really difficult to work on something this big by yourself. If you’re having a bad day and you can’t get something to work, you don’t have anyone who has the context to bounce ideas off, and to go over to a whiteboard and try and figure something out; you have to just find that motivation to do it yourself.
On top of that, it doesn’t help that there’s no obvious way of making this a day job right now. It’s really a passion project that’s gone way out of hand, it takes a lot of my time up, and continually justifying that is an ongoing challenge.
Tom: Would you be working on it differently as part of a team of people?
Chris: I would certainly have worked on it differently at a macroscopic level — I would have been thinking about refactoring a lot more, I would have been documenting things better from the start, rather than trying to play catch-up now. I have a tendency to leave visual clues in the code that mean something to me, but they certainly wouldn’t mean things to other people.
I think there’s also an element of not wanting to be personally shamed if you’re working in a team, and leaving some gnarly method in there. No-one is pushing me to fix technical debt in this thing, it’s only when I actually bump into it again and I have to really do something about it.
Tom: But at the same time, it’s a project where you have been setting the goals. And it sounds as though you may not even have known necessarily what you wanted to do next. If you are building a web application as part of a team of twenty developers, you probably have a little bit more of a sense of where you’re going with it, and what the ultimate business goal is, whereas in this case it sounds like it would have been quite difficult for you to plan ahead, because you’ve really just been following this thread of your own interest for what sounds like years.
Chris: Yeah. It’s probably not at all agile in any way, this project. I haven’t been delivering anything throughout, really. It’s really been a passion project where I’ve gone away in a dark room for a long time and worked on this thing and wanted to get it right.
If I had been working in a team, I’d certainly be doing things differently. If I were working for a client I would be thinking about trying to ship something and seeing what works and getting feedback from people. But yeah, none of that has been happening.
The Grand Scheme of Things
Tom: Why did you do this? In hindsight it makes sense, but it sounds like you never really had a big plan about where you were going with this, and ultimately you’ve spent quite a lot of time and energy building all of these projects, and developing this programming language. Looking back on it, do you have a sense of why you have done all of this?
Chris: I think that’s really getting at the heart of this podcast! Why do we do any of this at all? I’m not sure. I think it’s something that has just always been at the back of my mind, and I feel like I just want to keep pushing it and see where it goes.
I actually went to see a talk by Tom Crayford. He talked about System R and some of the early database technology — relational databases — that were developed, and he kept really hammering this point that everything is invented. Everything we use in software engineering today, it’s all just invented by people. And if you look at the grand scheme of things, we haven’t been doing this a whole lot of time.
We’ve been going down this really imperative and also functional route; all of these languages are about telling the computer precisely what to do. And I just feel like there’s a gap there. I want to express problems to my computer in a human way, and I just don’t think we’re going to get there by going down imperative and functional routes.
We’ve seen a lot of problems today that just can’t be solved by straightforward algorithms — we need artificial intelligence techniques, machine learning, to begin to tackle these. And it’s really just huge amounts of curiosity that’s driving me. I’m not sure in retrospect if I’d give all of those weekends up again, but at the same time, it’s incredibly rewarding to end up with something that isn’t completely broken and actually does some interesting things, which is great.
Tom: You reminded me of The Witness. I remember you saying something about being able to encode SAT problems as Witness puzzles. Is that anything to do with Sentient, or is this just a side project that you had because you were playing The Witness?
Chris: That was an interesting day. At the London Computation Club, one of our meetings was about NP-completeness and trying to unpack that really quite challenging topic, and make sense of it.
The core idea is all about these reductions, and being able to transform problems into other problems. I had been toying with The Witness in terms of turning it into a boolean satisfiability problem with Sentient, and I kind of thought “well, maybe you can turn it on its head and go in the other direction”. So I had a go at actually taking an arbitrary boolean equation and turning it into a Witness problem, which is like… what does that even look like? How does that work, at all?
There are a bunch of different symbols that have different constraints on them, and I came up with a polynomial time algorithm for constructing a Witness puzzle that can essentially encode a SAT problem. So if you can make it from the start of this maze to the end of this maze, satisfying all of the constraints in this Witness puzzle, then you will have indirectly solved a SAT problem.
And if I get a bit of time in the coming months, I’d like to actually show this working; I’d like to take a classic travelling salesperson problem, turn it into SAT, turn it into The Witness, and then build a user interface where you’re navigating this grid, trying to solve this problem, and seeing what it’s doing with the travelling salesperson problem at the same time, the route that it’s trying as you try and navigate through this maze. I think that would be a fun thing.
It’s totally useless! As far as actually proving things about The Witness puzzles go, you couldn’t actually tell much from the fact that you can translate a SAT problem into a Witness problem. You’d learn things if you went in the other direction — that’s the really challenging problem.
Tom: So that does tell you that The Witness is at least as difficult as SAT. Do you have any intuition about whether The Witness is harder than SAT? Are you likely to be able to solve Witness puzzles in Sentient?
Chris: I believe The Witness problems are harder than SAT. To even check whether a puzzle is correct, for most of the types of puzzle pieces you can do that in polynomial time — you can check it with an algorithm that doesn’t explode combinatorially — but, without going into spoilers too much, there are certain pieces which, as far as we know, they require exponential time just to check one of these problems.
Tom: Whereas SAT is an NP problem, so it’s nondeterministic polynomial time, which means that you should be able to check the answer in polynomial time, even if finding an answer requires either exponential time or a nondeterministic Turing machine in polynomial time, right?
Chris: Right. I think if you eliminated that puzzle piece, or you limited the number of them, it becomes polynomial and therefore you could actually solve Witness puzzles.
I wrote a Sentient program that was just solving the maze problem: if you’re given an
n grid, finding paths that go from start to finish. That’s actually a studied problem; there’s something called a self-avoiding walk where you navigate through a geometric grid without stepping over yourself — you’re constantly avoiding your own path. And that has a huge combinatorial explosion, and I think if you were going to write a solver for those types of problems, you would need to be using something that’s really good at solving combinatorial problems. Whether that’s SAT, or other techniques like branch and prune, which is another common algorithm for this kind of thing.
I’d like to see where that goes with Sentient. One of the great properties it has is you can really layer constraints on top of each other: you can specify the basic constraints of your problem, and then you can almost forget about that bit and then start focusing on the next bit, and start to layer things up. I think if you were trying to write an algorithm to solve one of these things, you’d pretty much have to throw what you had away and start again, and take a different approach.
Tom: Well Chris, it’s been really interesting to hear about your epic journey through the world of declarative programming and self-enumerating pangrams. If people want to install Sentient and play with it, what do they do?
Chris: There are two ways. You can either
sentient and start compiling and running programs.
Tom: Great! Thank you very much for coming and telling me about it.
Chris: Thank you! It’s been my pleasure.