Episode 1: A Fairly Deep Yak Shave
- Featuring @jcoglan and @tomstuart
- Music by Martin Austwick (@martinaustwick)
- Published on 2013-11-04 at 16:00 UTC
- Recorded on 2013-10-24 in the Makeshift Shedscraper
- Contact @whyarecomputers or email@example.com
Tom: Hi, welcome to Why Are Computers, a podcast that attempts to answer that question by talking to people who reckon stuff about computers and hoping they accidentally say the answer.
I’m Tom Stuart. This is episode one, the shambolic pilot episode. If we get picked up for a full series then I’ll be recast and my parts of this episode will be redubbed by someone with a mellifluous voice and appealing personality.
I’m joined today by James Coglan. Hi James.
James: Hi Tom.
Tom: Who are you?
Tom: Doing stuff like this. Going into pubs.
Tom: One of the reasons I wanted to talk to you today is that, like most programmers, you are occasionally angry about things, specifically about how bad most software is. And probably like most programmers you think you can do better than most of the bad software that’s out there. But what’s unusual about you is that, unlike most programmers, you actually do something about it: when you get sufficiently annoyed about something, you’ll actually spend the time to write some new software that’s notionally an improvement on the software that currently exists.
James: That’s called “not helping”.
Tom: Possibly, but it strikes me as quite a positive anger management strategy — that you can channel your frustration into the medium of software rather than the medium of tweets and blog posts.
James: Yeah, sometimes. Sometimes that leads me down incredibly silly alleyways where it turns out I have no idea what I’m talking about, but one of the nice side effects of ranting about stuff on the internet is that occasionally people who know more than you come along and correct you and explain why you have no idea what you’re talking about, and then you learn things. So that’s quite nice.
The Great Terminus Yak Shave
Tom: I was thinking the other day about Capybara, which is one of my favourite Ruby libraries. For anyone who isn’t familiar with it: Capybara is a browser automation library for Ruby. It gives you a nice Ruby DSL for controlling a web browser. You can say “go to this URL, and click on this link, and fill in this form, and press this button”, and it has a pluggable driver architecture that allows you to implement that DSL against something that looks like a web browser.
There’s a Rack::Test driver for it that just pretends to be a web browser by talking the Rack interface directly to your Rack application (if it’s written in Ruby), and there’s a Mechanize driver that actually makes HTTP requests, and there’s an HtmlUnit driver, and so on, all the way up to Poltergeist, which is a driver for PhantomJS, and then Selenium which can drive real web browsers like Firefox and Chrome.
You wrote a Capybara driver called Terminus, right?
Tom: I wanted to pick at that a little bit. Firstly, what is Terminus and why is Terminus? (That’s a restriction of the “why are computers” question.) And secondly, can you tell me a bit of the story about all of the other software that this project generated? Because it seems kind of amazing to me.
All these things are basically software libraries that give you an interface that’s like a web browser but doesn’t have a GUI; they just behave like a web browser but don’t draw anything on the screen. You can still script them and interact with them and check things, they just don’t paint anything. That means you can use them from the command line, and put them in a continuous integration process, and all that kind of thing. But the thing about web browsers is that they’re all really different and they all have bugs, so fake browsers are not really — at least in my experience — all that useful, because they’ve tended to be a bad approximation of the real thing.
Tom: So even in the unlikely situation that your fake browser is actually a 100% faithful representation of the underlying specifications, it’s still not necessarily going to give you very much confidence that your web app is going to work right in Firefox or whatever.
James: Right. If you’re using a fake browser you’re really just testing on one more browser target that is not like the other browser targets. And yeah, I got frustrated with trying various of those. I always found it really easy to find fairly show-stopping bugs in them, things that do work in browsers that just totally crash the test system, so that was kind of annoying.
What I wanted to do was make it so that it was really easy to test full-stack stuff on any target browser — any browser, any device, anywhere, on mobile phones, remote machines, anything like that — and without needing any plugins, so you wouldn’t have to get Selenium hooked into your phone somehow, you could just point your phone at some web page and it would let you control the phone and navigate around your web site. So that’s sort of what Terminus was.
I’m not sure if it was a Capybara driver originally, because I’m not quite sure how the timing of that played out, but at some point fairly early in its life it became obvious that it should be a Capybara driver so that you can use the same Capybara testing API — just swap the backend out and all your tests can still run against this other thing. And that was really nice; Capybara is a really nice system. I wouldn’t say it’s easy to do a backend for it, but the contract between Capybara and its backends is quite cleanly separated; it’s quite a sensible API contract between Capybara and its driver plugins, so it’s easy to understand what you need to do. That doesn’t imply that actually doing that backend is easy, but the architecture is fairly sensible.
James: Exactly, yeah. So it’s more sort of controlling the browser from inside the web page rather than controlling it from outside with a plugin or some kind of other harness system.
Tom: Well that seems like a super useful idea, and I’m glad you did that. However, I gather that you had at least two problems.
James: Oh boy.
Tom: The two problems that I know about are: number one, you wanted to have that bidirectional communication with the server; and number two, the Capybara driver integration API that you just mentioned is done in terms of XPath, right?
Tom: So when you use the Capybara DSL and you say “I want to click on a link” or “I want to press a button”, ultimately the information about what link or what button to click on or press is communicated by sending strings of XPath up to the client?
James: Yeah, exactly.
Tom: So why were those two things problems, and what did you do about it? What terrible thing?
James: Er, well, the fact that we’re now talking about this and it’s four years later gives some indication of the scope creep involved.
Tom: You’re still working on all the software that fell out of this, right?
James: Most of it, yeah. Erm. God.
Tom: It seems like a totally plausible thing to do.
James: Yeah, it’s a fairly deep yak shave. Capybara’s front end API lets you pick elements out of the page with CSS or XPath, but if you use CSS it converts it into XPath so that the contract with the driver is simpler — there’s only one language that the driver has to understand. XPath can do things that CSS can’t, so if you’re going to compile one to the other, CSS to XPath is the one that makes sense, because if you compile XPath to CSS, there are some XPath things that you won’t be able to convert in that way.
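To make the CSS-to-XPath direction concrete, here is a minimal Python sketch that compiles a tiny subset of CSS (tag names, `#id`, `.class` and the descendant combinator) into XPath. The function name `css_to_xpath` and the supported subset are assumptions made for illustration; this is not Capybara's actual converter.

```python
import re

# One compound selector: optional tag, then any number of #id / .class parts.
_COMPOUND = re.compile(r"^([a-zA-Z][\w-]*|\*)?((?:[#.][\w-]+)*)$")
_PART = re.compile(r"([#.])([\w-]+)")


def css_to_xpath(selector: str) -> str:
    """Compile a small CSS subset into an XPath expression (illustrative only)."""
    steps = []
    for compound in selector.split():
        match = _COMPOUND.match(compound)
        if not match:
            raise ValueError(f"unsupported selector: {compound!r}")
        tag = match.group(1) or "*"
        predicates = []
        for kind, name in _PART.findall(match.group(2)):
            if kind == "#":
                predicates.append(f"[@id='{name}']")
            else:
                # Match one whole word inside the space-separated class list.
                predicates.append(
                    f"[contains(concat(' ', normalize-space(@class), ' '), ' {name} ')]"
                )
        steps.append(tag + "".join(predicates))
    if not steps:
        raise ValueError("empty selector")
    # CSS whitespace is the descendant combinator, which is '//' in XPath.
    return "//" + "//".join(steps)
```

Going the other way would get stuck quickly: an XPath step such as `parent::div` has no equivalent in this CSS subset, which is the asymmetry James describes.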
So, yeah, the first thing is, Capybara will give your driver XPath and you have to deal with that, which is fine, because most browsers have a function called document.evaluate, which is sort of clunky, but it runs XPath queries against the page, and then gives you an iterator that gives you all the nodes that match the thing. The other thing was, like you say, your test program which is written in Ruby needs to be able to tell the browser to do stuff, so you have to send commands to the browser and then find out what happened. So you can tell the browser to do stuff and you can ask the browser things, like “how many elements match this selector?”, “what’s the current URL?”, all that sort of thing. So you need bidirectional communication.
The second of those things, the communication problem, is basically why I started Faye. At the time we didn’t have WebSocket yet, that was still on the horizon, and I was looking around thinking “okay, how do you do this?”. I’d heard about this thing called Comet, which is an umbrella term for a bunch of hacks that people use to do two-way communication for browsers, so I looked into that. Then I found out about this open protocol for doing that called Bayeux, which there are a bunch of implementations for, but there wasn’t a Ruby one. What I really should’ve done is use one of the existing ones for Java or whatever and written a Ruby client for that.
Tom: That would’ve made a lot of sense.
James: It would’ve made a lot of sense. And my life would be quite different now if I’d done that.
Tom: That was probably bad software though. If you looked at the Java server it was probably unacceptable.
Tom: And at the time, this was all sitting on top of long polling and various strategies for faking out bidirectional communication over HTTP, right?
James: Yeah. Pre-WebSockets you did stuff like: the client would send a request, and the server would just not respond to it for ages until it got a message, and then it would send the response with that message in it, and when the client got that it would just immediately open another request.
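The long-polling loop described here can be simulated in a single process. The following Python sketch uses a thread and a queue in place of real HTTP requests; the names `LongPollServer`, `handle_poll` and `run_client` are hypothetical and are not taken from Faye or any Bayeux server.

```python
import queue
import threading


class LongPollServer:
    """Holds each poll open until a message is available or the timeout passes."""

    def __init__(self):
        self._messages = queue.Queue()

    def publish(self, message):
        self._messages.put(message)

    def handle_poll(self, timeout=1.0):
        # Block (that is, do not respond) until there is something to send.
        try:
            return self._messages.get(timeout=timeout)
        except queue.Empty:
            return None  # Empty response; the client simply polls again.


def run_client(server, expected_count, received):
    # As soon as one response arrives, immediately open the next request.
    while len(received) < expected_count:
        message = server.handle_poll(timeout=1.0)
        if message is not None:
            received.append(message)


server = LongPollServer()
received = []
client = threading.Thread(target=run_client, args=(server, 3, received))
client.start()
for text in ["one", "two", "three"]:
    server.publish(text)
client.join()
```

After the client thread finishes, `received` holds the three messages in the order they were published, even though the client only ever made plain request/response calls.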
Tom: But the intention is that Bayeux abstracts away the details of the underlying transport, so that in principle you might be able to replace that transport — if your browser suddenly had an API that allowed you to just open a TCP socket, then you would still be able to sit Bayeux on top of that?
James: Exactly. It’s within the scope of Bayeux that it defines a transport negotiation system where the client can say what it can do, and the server says what it can do, and somehow they agree on what type of network transport to use. That meant that when WebSockets came along they could be transparently integrated into that. It totally hides all the messiness of browser networking from you, which is quite nice.
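A simplified sketch of that negotiation, in Python: Bayeux handshake messages carry a `supportedConnectionTypes` list, and the two sides settle on a type they both support. The selection rule used here (first match in the client's order of preference) and the function name `negotiate_transport` are assumptions for illustration, not the exact behaviour of any particular implementation.

```python
def negotiate_transport(client_types, server_types):
    """Pick the first transport, in the client's preference order, that the server supports."""
    for connection_type in client_types:
        if connection_type in server_types:
            return connection_type
    raise ValueError("no common transport")


# A handshake request shaped like a Bayeux /meta/handshake message.
handshake_request = {
    "channel": "/meta/handshake",
    "version": "1.0",
    "supportedConnectionTypes": ["websocket", "long-polling", "callback-polling"],
}

# Hypothetical servers: one predating WebSocket support, one with it.
old_server = ["long-polling", "callback-polling"]
new_server = ["websocket", "long-polling", "callback-polling"]

before = negotiate_transport(handshake_request["supportedConnectionTypes"], old_server)
after = negotiate_transport(handshake_request["supportedConnectionTypes"], new_server)
```

The application code above the negotiation does not change between the two cases, which is the point about WebSockets being integrated transparently.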
So I looked at that and started hacking at a server for it, and used that for a Music Hack Day project to put iTunes in your browser. That was sort of fun. It would let you play songs, and it would make them play at the same time on other people’s machines, so you could have this sort of listening party, so that was quite neat. But it was a just-good-enough implementation and I put it away for six months. I haven’t even started working on Terminus at this point; this is still just like, “what technology do I need to even make that possible?”.
Tom: And this is still pre-WebSockets, right?
James: Yeah. This is mid-to-late 2009. And in 2010, Node started getting off the ground — Node 0.1, very early and a little unstable, but it was usable — and one weekend I thought “I’ll try this out and I’ll port this Faye thing to it”, which I hadn’t really announced or released, because it was not really production-quality software. So yeah, I ported the server from Ruby to Node to learn what this Node thing was all about. In Ruby I’d been using EventMachine to do this, so it was a very similar architecture in Node — it was all asynchronous network stuff, so that was a fairly easy translation.
Once I’d done the Node version it started looking like a real project, just because you revisit a project and change stuff and improve it and update it to make it work with whatever’s current now, and it started looking like a real thing. Or I started thinking of it as not just a weekend music hack day thing — I was like, “oh, this is a thing now”. Then over the first half of 2010, most of the early work that turned it into a project happened: I went through several release cycles, put a web site up for it, announced it, and said “hey, there’s this thing I’m working on”, and it started getting users.
Tom: And at this point you still haven’t actually written Terminus?
James: No. But it did let me start working on Terminus. By late 2010 I had an early version of Terminus that could do the stuff I wanted, that spoke to Capybara and would drive most of the standards-y browsers and would do a little bit of iPhone at a push. So it was showing a little bit of promise, but Terminus was still nowhere near being any sort of a reasonable product; it was like a demo at this point. That takes you to the first 18 months of this ridiculous project.
Tom: It sounds like, in principle, you could be close to being finished here. You’ve got a demo working, surely just polish it up and round the edges off and you’ve got yourself a product.
James: Yeah. Just ship it. Put a squirrel in a hat and it’s done. Yeah, so, web browsers. The other fun thing about web browsers is that one of them is really not very good. Actually, two of them in this particular instance aren’t very good: Internet Explorer and the Android browser don’t have this document.evaluate function that does XPath for you. That’s kind of a deal-breaker, because the big thing that’s a problem for cross-browser testing, the thing that you most want to target, doesn’t have this feature that Capybara needs in order to work.
So I spent ages looking around, and there are various XPath-y things that do things like, if you’ve got a document fragment from an XHR request you can do XPath on that, if you’re doing XHTML you can do XPath on some bits of that, sort of, but there was no polyfill for document.evaluate. I couldn’t find anything that was even close enough that would just let you go query the page; they let you either query a document fragment or query an XHR response or, you know, they just didn’t implement all of XPath, or they were just buggy, or whatever it was. And I really tried to find something, because I really didn’t want to write an XPath engine, for fairly obvious reasons.
Tom: It would probably turn into a massive yak shave.
James: Right. But no, I ran out of steam on that search, so I thought, “okay, if this is going to work at all, it needs to have an XPath thing”, so I just had to bite the bullet and write one, of course.
XPath, if you haven’t used it much, is sort of similar to CSS selectors in what it does: it lets you query an HTML or XML document using a selector language. It has a slightly different syntax from CSS, and it also has some features that aren’t in CSS. In particular, it lets you query the text of things, and their attributes, in ways that CSS doesn’t let you do. And the really neat thing about it is that XPath expressions can be nested inside each other, so you can say “select all the <div>s that contain anything matching this other selector”, so it’s sort of recursive in a way that CSS selectors aren’t.
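The nesting and text-matching features can be tried out with Python's standard library, whose `xml.etree.ElementTree` module implements a small subset of XPath. The sample markup below is made up for illustration.

```python
import xml.etree.ElementTree as ET

page = ET.fromstring(
    "<body>"
    "<div id='a'><a href='/home'>Home</a></div>"
    "<div id='b'><span>No link here</span></div>"
    "<div id='c'><a href='/about'>About</a></div>"
    "</body>"
)

# Nested expression: divs that contain an <a> child.
divs_with_links = [div.get("id") for div in page.findall(".//div[a]")]

# Querying by text: divs whose <a> child has exactly this text.
about_divs = [div.get("id") for div in page.findall(".//div[a='About']")]

# Querying by attribute value.
home_links = [a.text for a in page.findall(".//a[@href='/home']")]
```

The predicate in square brackets is itself a path expression evaluated relative to each candidate `div`, which is the recursive quality mentioned above. Full XPath 1.0 allows far richer predicates than this subset does.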
James: Yeah, job done. Except, I’m a total idiot, and I’m very, very fussy. But seriously, PEG.js is really good. I just had some very, very minor quibbles about its syntax, and about exactly how it generated stuff, which are really really minor quibbles — like, this is not even a criticism, it’s just me being a pedant. And I was also just curious about how this stuff worked, because I’d played around with doing Lisp interpreters before, I thought “this is the next stage down of that: how do you do parsers?” I’d been interested in language implementation for a while and this seemed like a good opportunity to learn some more about that.
So yeah, I thought, “how hard can it be, I’ll just write a parser generator myself to do this XPath thing”. And that was sort of scary but not as difficult as I’d first thought. I bootstrapped it by inventing a JSON format for doing the grammar so I didn’t have to parse a new syntax language. I invented some JSON representation for saying how the language worked, and then turned that into what are called parser combinators, which are just functions — you can make a function that matches a string, or a function that matches something one or more times. All the sort of things that you see in regexes, all those concepts can be turned into functions that just do that thing, and then you can compose them and those are called parser combinators.
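For readers who have not met parser combinators, here is a minimal Python sketch of the idea. Each combinator returns a function taking the input text and an offset, and returning either a `(value, new_offset)` pair or `None` on failure. The names and the return convention are assumptions chosen for this example; they are not Canopy's design.

```python
def string(expected):
    """Parser matching one literal string."""
    def parse(text, offset):
        if text.startswith(expected, offset):
            return expected, offset + len(expected)
        return None
    return parse


def sequence(*parsers):
    """Parser matching each given parser in order."""
    def parse(text, offset):
        values = []
        for parser in parsers:
            result = parser(text, offset)
            if result is None:
                return None
            value, offset = result
            values.append(value)
        return values, offset
    return parse


def choice(*parsers):
    """Parser trying alternatives in order and keeping the first success."""
    def parse(text, offset):
        for parser in parsers:
            result = parser(text, offset)
            if result is not None:
                return result
        return None
    return parse


def one_or_more(parser):
    """Parser matching the given parser repeatedly, at least once."""
    def parse(text, offset):
        values = []
        while True:
            result = parser(text, offset)
            if result is None:
                break
            value, new_offset = result
            if new_offset == offset:
                break  # No progress; stop rather than loop forever.
            values.append(value)
            offset = new_offset
        return (values, offset) if values else None
    return parse


# Compose small parsers into a grammar for expressions like "12+345".
digit = choice(*[string(str(n)) for n in range(10)])
number = one_or_more(digit)
addition = sequence(number, string("+"), number)
```

Calling `addition("12+345", 0)` walks the whole input and returns the nested list of matched pieces along with the final offset. Every call goes through several layers of closures, which hints at why a combinator-based parser can be slow.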
The performance of this thing was just unacceptable, so I decided I needed to do the next thing and make it faster, which meant I had to do code generation. Once you’ve done the combinator stuff, you can sort of see how you would generate code to do the same thing. I had some help, because I just went and looked at the code that Treetop and PEG.js generate to see what sort of thing they’re doing: oh, it turns every rule into a method, okay, and it turns a one-or-more selector into a while loop with a counter, and you can sort of see what’s going on.
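As a rough sketch of what "a while loop with a counter" might look like, the following Python generator emits source code for a one-or-more rule over a literal string and then compiles it. The function `generate_one_or_more` and the shape of the emitted code are invented for illustration; this is not the actual output of Treetop, PEG.js or Canopy.

```python
def generate_one_or_more(rule_name, literal):
    """Return Python source for a function matching `literal` one or more times."""
    if not literal:
        raise ValueError("literal must be non-empty")
    return (
        f"def {rule_name}(text, offset):\n"
        f"    count = 0\n"
        f"    while text.startswith({literal!r}, offset):\n"
        f"        offset += {len(literal)}\n"
        f"        count += 1\n"
        f"    # Succeed with the new offset only if the rule matched at least once.\n"
        f"    return offset if count >= 1 else None\n"
    )


source = generate_one_or_more("parse_ab_run", "ab")
namespace = {}
exec(source, namespace)  # Compile the generated source into a real function.
parse_ab_run = namespace["parse_ab_run"]
```

The generated function has no closures or intermediate parser objects, just a loop over the input, which is where the speed-up over combinators comes from.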
The code that these things generate is very verbose, but if you just play around with writing a grammar and changing it and seeing how that changes the compiled output, you can kind of get a feel for how it works. So I went through that process, up to the point where it was powerful enough that, instead of doing this JSON format for the grammar, I could do an actual grammar syntax format and then compile that into code that parses the same format, so it became self-hosting.
Somewhere in that project there is a file that describes its own syntax, which still does my head in quite a lot, and once you’ve bootstrapped it, working on it becomes really complicated because you have to use the previous version of the thing to compile the next version of the thing, and then if you get it wrong, you then have no working files any more, so that’s kind of fun.
All that ended up being a project called Canopy, which I released as open source. I think it’s probably just me that uses it, but it was kind of a fun learning experience — one of those things that I didn’t do because I thought loads of people needed it, I just did it because I wanted to learn how to do it and how to test it and all that sort of thing. But that was a pretty big diversion. I was working on that on and off between 2010 and 2012, just coming back to it — it was originally a Rhino script, and then I turned it into a Node thing…
Tom: And this is just getting you to the point where you can parse an XPath expression, right? You still had to deal with the problem of actually evaluating XPath.
Tom: Did that turn out to be straightforward in comparison to parsing the expressions in the first place?
James: Less work. I didn’t really know how XPath worked at the time, but it’s not very complicated to understand. If all you know is CSS, it’s got some slightly funky semantics, because it can do things like parent selection and has this concept called “axes”, which are strategies for walking the DOM. An axis expresses something like “select all the children that match this”, or “include the current node in this”, or “include all the descendants in this”. This combination of tag matchers and attribute matchers and axes and functions is all a bit weird.
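One way to picture axes as "strategies for walking the DOM" is to write each axis as a generator over a tree. This Python sketch uses a made-up `Node` class and covers only a few axes; it ignores details of real XPath such as document ordering of results, and it is not how Pathology is implemented.

```python
class Node:
    def __init__(self, tag, children=()):
        self.tag = tag
        self.parent = None
        self.children = list(children)
        for child_node in self.children:
            child_node.parent = self


def child(node):
    yield from node.children


def descendant(node):
    for kid in node.children:
        yield kid
        yield from descendant(kid)


def descendant_or_self(node):
    yield node  # "include the current node in this"
    yield from descendant(node)


def parent(node):
    if node.parent is not None:
        yield node.parent


def ancestor(node):
    current = node.parent
    while current is not None:
        yield current
        current = current.parent


AXES = {
    "child": child,
    "descendant": descendant,
    "descendant-or-self": descendant_or_self,
    "parent": parent,
    "ancestor": ancestor,
}


def step(node, axis, tag):
    """Evaluate one location step such as descendant::li starting from node."""
    return [n for n in AXES[axis](node) if tag == "*" or n.tag == tag]


item = Node("li")
tree = Node("body", [Node("div", [Node("ul", [item, Node("li")])]), Node("p")])
```

A step like `step(item, "ancestor", "*")` walks upwards from the list item, which is the kind of parent selection that CSS selectors cannot express.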
Tom: Is it called Pathology, the XPath implementation?
Tom: Did you reach a point where that was actually an acceptable implementation of XPath 1 that had enough XPath support for the Capybara driver API?
James: Yeah, it was just enough to make Capybara work, as in I just did it by running the Capybara test suite and adding stuff until it all passed, so it’s really not complete and it’s quite slow, and it’s by no means a complete XPath implementation. It was just enough to support the things that people are probably going to use Capybara for.
That was a kind of weird project in that I just started the GitHub project and put the name on it, which is Pathology, and I think the description just said “The goggles: they do nothing”. And it got followers. There was no code, it didn’t say anything that it did, it just aroused enough curiosity that people started watching it, which is very strange.
Tom: So, at this point in the story, does Terminus exist yet?
James: Yeah, so now it’s increasingly a real thing, because you can run it on Internet Explorer, and it can just about do Android browser stuff. There was still ongoing stuff: Capybara was still evolving, so I’d periodically do maintenance work to make it keep working, and browsers keep changing, and there is a lot of stuff with Terminus that is just quite egregious hacks around browser differences and all the weird sorts of things that Capybara can do to web pages.
Capybara isn’t just for testing Ruby stuff, you can point it at any web application and it will work; you can just say “go and talk to this hostname”, you can point it at Facebook or Gmail or whatever.
Tom: Assuming you’ve got a driver that can make HTTP requests, right?
localhost. There’s a bunch of rewriting, and then there’s a bunch of rewriting back in the opposite direction so that Capybara gets the real URL back out when it does stuff.
That’s all egregious. There’s stuff where it doesn’t really deal with chunked encoding very nicely, so you have to strip those headers out in the request phase so that you get non-chunked encoding text out and everything gets easier. There’s just piles and piles of messy nonsense to make it work.
Tom: But it did work.
James: It did eventually work. I think at some point in 2012 a web site for Terminus went up with a little bit of documentation, and a few people started using it.
Tom: And then WebSockets happened.
James: Right. Well, WebSockets had happened a while ago — that started in 2010, we started getting WebSockets in browsers. So yeah, at some point within a year of me starting work on Faye, then WebSockets came along and sort of changed everything. And because this was still really early days, and there weren’t really any good libraries for it, I was like, “okay, I’m going to have to implement this WebSocket protocol thingy”.
At the start that was pretty easy because the handshake format was really simple — you didn’t have to do any of the crypto stuff that they added later. You get these headers that told you it was a WebSocket, and you’d send these other headers that said “yep, okay, I can do that”, and then there was this really simple message framing format where it would send you a null byte and then a bunch of text and then a 255 byte, and those delimited the message and you would just run through everything you received over TCP and parse that.
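The early framing format is simple enough to sketch in a few lines of Python. The parser below accepts bytes in arbitrary chunks, as they would arrive over TCP, and returns any complete messages. The class name `LegacyFrameParser` and its handling of stray bytes outside a frame are assumptions of this sketch.

```python
class LegacyFrameParser:
    """Incrementally parses frames of the form 0x00, UTF-8 bytes, 0xFF."""

    def __init__(self):
        self._buffer = bytearray()
        self._in_frame = False

    def feed(self, chunk: bytes):
        """Consume bytes as they arrive; return any completed messages."""
        messages = []
        for byte in chunk:
            if not self._in_frame:
                if byte == 0x00:
                    self._in_frame = True
                    self._buffer.clear()
                # Bytes outside a frame are ignored in this sketch.
            elif byte == 0xFF:
                # 0xFF never occurs in valid UTF-8, so it is a safe terminator.
                messages.append(self._buffer.decode("utf-8"))
                self._in_frame = False
            else:
                self._buffer.append(byte)
        return messages


def frame(text: str) -> bytes:
    """Wrap a text message in the 0x00 ... 0xFF delimiters."""
    return b"\x00" + text.encode("utf-8") + b"\xff"
```

Because decoding only happens once the closing 255 byte is seen, a message (or a multi-byte character) split across two TCP reads is handled without special cases.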
Tom: That sounds brilliant and simple, but this was an evolving standard, right?
James: Right. Someone decided that they needed binary support. So previously this was like, “it’s UTF-8 text delimited by a zero and a 255 byte”, and there was a couple of versions of that that actually got deployed in browsers, that differed a little bit on how the initial headers were done, but that wasn’t really a big problem. And then sometime in 2011 we started getting the thing that would eventually become the RFC that was way more complicated, and it had UTF-8 and binary messages, and control frames, like ping/pong, explicit closing, the handshake was more complicated.
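For comparison with the early handshake, this Python sketch shows the hashing step that the final RFC (RFC 6455) requires of a server, along with the opcodes for the frame types mentioned here. The helper name `accept_value` is made up; the GUID and opcode numbers come from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

# Frame opcodes from RFC 6455: data frames and control frames.
OPCODES = {"text": 0x1, "binary": 0x2, "close": 0x8, "ping": 0x9, "pong": 0xA}


def accept_value(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

The server sends the result back in its `Sec-WebSocket-Accept` response header, which lets the client check that it is talking to something that actually understood the WebSocket request.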
It’s one of those things where I really didn’t know what I was getting into with doing this project, because like, “oh, I’ll just do this messaging thing for browsers”, “oh that seems to work”, and then people were like “it should really use WebSocket because that’s a better solution to this”, and I was like, “yeah, well, it is, okay, I’ll do that”.
Then people keep putting out new versions of this WebSocket thing, and you never knew when they were coming along exactly, or when they’d show up in browsers, so I always had to sort of keep up with it. 2011 had several panic weeks where a new version of Chrome came out and now Faye didn’t work any more. That was one of the biggest externalities of doing this project: I was unwittingly committing myself to maintaining this thing that I had no idea how it was going to evolve, and I didn’t even know it would exist when I started the project, just like, “okay, you want to do this, we’re going to do it this way now, that’s how the web is going to work now, you’ve got to deal with it”.
Tom: What was your goal at this point? Were you trying to support only the latest version of the WebSocket standard? I don’t know what the negotiation is like, but are you able to support all possible standards? Are they distinguishable?
James: They are distinguishable. At some point during the thing that led up to the RFC they put in an explicit version header, but even before they did that, you could tell based on other properties of the headers which one you were dealing with.
Tom: So did you end up with one of these horrendous grids of all of the different versions?
James: It’s not so much a grid. Nowadays, in terms of stuff that actually got put into browsers — there are things that tracked every little draft of the RFC that came out, but in terms of what actually got put into browsers — you really only had to support three things: there was the original, what’s called draft 75, that went into Chrome; there was draft 76, which was like that but with a different handshake; and then there was a series of things that led up to the RFC that, by the time they actually got put into browsers, were sort of mostly stable-ish.
By the time that stuff went into browsers it had stopped radically changing, the framing format. So it’s not so much a grid as there are basically three backends to this, and you can support all of them on the same server, and Faye still does. Draft 75 really doesn’t get used any more; draft 76 is still around — if anyone’s still running Safari 5, or actually PhantomJS I think is still running this, it’s still around a little bit, but most stuff is on the RFC now.
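A sketch of how one server might tell the three families apart from the request headers, in Python. The rules below reflect a common reading of those drafts (a version header for the RFC-track drafts, and a pair of key headers for draft 76) and should be treated as assumptions; the function name and the returned labels are hypothetical, and this is not Faye's actual detection code.

```python
def detect_protocol(headers):
    """Guess which WebSocket protocol family a handshake request uses."""
    names = {name.lower() for name in headers}
    if "sec-websocket-version" in names:
        return "hybi"      # The drafts leading up to the RFC, and the RFC itself.
    if "sec-websocket-key1" in names and "sec-websocket-key2" in names:
        return "draft-76"  # Same framing as draft 75, different handshake.
    return "draft-75"      # Original handshake with no key headers.


rfc_request = {"Upgrade": "websocket", "Sec-WebSocket-Key": "abc", "Sec-WebSocket-Version": "13"}
draft76_request = {"Upgrade": "WebSocket", "Sec-WebSocket-Key1": "1", "Sec-WebSocket-Key2": "2"}
draft75_request = {"Upgrade": "WebSocket", "Origin": "http://example.com"}
```

Once the family is known, the server can hand the connection to the matching backend, so all three can coexist on one port.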
Tom: Did you keep those two things in perfect sync? Every time you went and updated the WebSocket implementation, you were doing it twice?
James: Yeah, basically. Another case where I didn’t realise how much work I was making for myself. It was originally a Ruby project, and I needed to do Ruby stuff with it, but I also ported it to Node and people were using that; it was sort of, “yeah, I’ve got to maintain both things now”. I get very guilty about not maintaining stuff, and there’s a lot of stuff that I’ve done that is really badly maintained, it’s just that Faye has the most users and attention and people relying on it, so it’s what gets my attention.
Tom: At least initially this stuff was all baked into Faye, but now Faye has its own GitHub user, and this has produced enough software that you have github.com/faye and there are a whole bunch of repositories there. The WebSocket stuff is broken out into a faye-websocket library in Ruby and Node versions, and those depend on a websocket-driver library in both Ruby and Node versions. I gather that the websocket-driver stuff is just an implementation of the wire protocol without anything on top of it, and then I guess the faye-websocket stuff is the interface between…
James: The fact that it’s called faye-websocket tends to trick a lot of people, because what it really means is that it’s a WebSocket library that was extracted from Faye.
Tom: Oh, alright, okay.
James: It was a sequence of extracting things. Inside of Faye it was always done in a pretty modular way, but yeah, the first bit of extraction was that I broke out this thing called faye-websocket which is just for doing WebSocket handling for Rack and for Node.
Tom: So if I wanted to write a Rack app that had WebSocket support, I’d use faye-websocket to do that.
James: Exactly. And that made certain assumptions about how you were doing I/O, so on Ruby it assumed you wanted to do I/O with EventMachine, and you were using an EventMachine-based web server, which over time became more and more problematic. Thin’s been very popular for a long time, but it’s started getting supplanted a little bit by stuff like Puma, which isn’t based on EventMachine.
The thing that triggered the second phase was that the guys that maintain Puma and another project called Celluloid, which is an actor-based concurrency and I/O framework, asked if they could take just the protocol handling stuff out of faye-websocket and not take the EventMachine-based I/O system. That was the second bit of extraction: taking all the protocol logic and making it so that you could have a thing and you could attach it to any I/O system you wanted, and then you’d have a WebSocket, and you didn’t have to worry about how the protocol worked or anything like that.
Tom: So that’s what
James: Exactly. And then faye-websocket is really just a thing that glues that to EventMachine. It’s become very very small.
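The split between protocol logic and I/O can be sketched in Python as follows. The driver below never touches a socket: bytes go in through `parse`, and bytes come out through a `write` callable supplied by whatever I/O system is in use. To keep the example short it uses newline-delimited framing as a stand-in, which is not WebSocket framing, and the class and method names are hypothetical rather than the real websocket-driver API.

```python
class ProtocolDriver:
    """Protocol logic only: it never touches a socket itself."""

    def __init__(self, write, on_message):
        self._write = write            # Called with bytes to send, by any I/O system.
        self._on_message = on_message  # Called with each decoded text message.
        self._pending = bytearray()

    def parse(self, data: bytes):
        """Feed in bytes that the I/O layer read from the connection."""
        self._pending.extend(data)
        while b"\n" in self._pending:
            line, _, rest = bytes(self._pending).partition(b"\n")
            self._pending = bytearray(rest)
            self._on_message(line.decode("utf-8"))

    def text(self, message: str):
        """Frame an outgoing message and hand the bytes to the I/O layer."""
        self._write(message.encode("utf-8") + b"\n")


# "Glue" for one particular I/O system: here, two in-memory peers.
inbox_a, inbox_b = [], []
driver_a = ProtocolDriver(write=lambda data: driver_b.parse(data), on_message=inbox_a.append)
driver_b = ProtocolDriver(write=lambda data: driver_a.parse(data), on_message=inbox_b.append)
driver_a.text("ping from a")
driver_b.text("reply from b")
```

Swapping the two lambdas for callbacks from an event loop, a thread pool or an actor framework would not require any change to `ProtocolDriver`, which is the property the Puma and Celluloid maintainers were asking for.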
Tom: And then Faye itself sits on top of that and deals with all the Bayeux-level stuff.
James: Yeah, Faye uses that and EventSource and long polling and it implements the Bayeux protocol and all of that stuff. faye-websocket is just one of the transport components that makes that up.
Tom: Right. We’ve talked about quite a lot of software here. Before we spoke today, I was sitting down and trying to create a partial map of the Coglanverse — all of these pieces of software and how they related to each other. It seems like Terminus is where you started, then Terminus triggered the creation of Pathology, which triggered the creation of Canopy, and that was just along the axis of XPath.
Tom: And then separately, Terminus triggered the creation of Faye, and the subsequent arrival of WebSockets meant you ended up writing all of this code, which meant that you ended up with the faye-websocket stuff which sits on top of the
Tom: Is there anything I’ve missed out? I know there are a bunch of other peripheral things like the Redis integration and the cluster stuff, but I don’t know whether you use those in anger, or whether they’re on the critical path to just getting Terminus working in the way that the other projects are.
James: The Redis stuff wasn’t. The thing that makes Faye a bit weird is that the reason I was writing it was to do weird stuff with web browsers, and weird music projects, rather than doing large-scale messaging things. So when people came along and said “we want to do large-scale messaging things but you’ve given us this single-process Ruby messaging hub, we need something better”, then that’s where the Redis stuff came in. It was making the backend business-logic-y bit of Faye pluggable so that you could run a whole cluster of Faye servers and distribute your load.
That’s been quite fun because making that bit pluggable has meant that other people can write plugins. Myspace did their own sharded Redis backend that solved a bunch of problems that they were having, and they open-sourced that. It’s really nice to be able to take a bit of a system and figure out the abstraction boundary well enough so you can just defer that problem to other people. Because I’m not the one who’s doing these big large-scale messaging deployments of it, it’s really good to be able to offload solving that to people who are doing it, and really know what the problems are, and then they can share that back with the community.
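The pluggable backend idea can be sketched in Python like this: a front-end hub handles incoming messages and defers subscription storage and routing to an engine object. The class names `Hub` and `MemoryEngine`, and the two-method engine interface, are assumptions for illustration and do not describe Faye's real engine API. The `clientId` and `subscription` fields follow the shape of Bayeux `/meta/subscribe` messages.

```python
from collections import defaultdict


class MemoryEngine:
    """Single-process backend: subscriptions and delivery live in local memory."""

    def __init__(self):
        self._subscribers = defaultdict(set)
        self.delivered = []

    def subscribe(self, client_id, channel):
        self._subscribers[channel].add(client_id)

    def publish(self, channel, data):
        for client_id in sorted(self._subscribers[channel]):
            self.delivered.append((client_id, channel, data))


class Hub:
    """Front end that speaks to clients and defers storage and routing to an engine."""

    def __init__(self, engine):
        self._engine = engine  # Any object with subscribe() and publish().

    def handle(self, message):
        if message["channel"] == "/meta/subscribe":
            self._engine.subscribe(message["clientId"], message["subscription"])
        else:
            self._engine.publish(message["channel"], message.get("data"))


engine = MemoryEngine()
hub = Hub(engine)
hub.handle({"channel": "/meta/subscribe", "clientId": "alice", "subscription": "/chat"})
hub.handle({"channel": "/chat", "data": "hi"})
```

A clustered deployment would pass `Hub` a different engine with the same two methods, for example one that stores subscriptions in a shared Redis instance, without touching the front end.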
The Rest of the Coglanverse
Tom: I know that this process is ongoing, but, at least having got to the point where all of these pieces of software were implemented and working to your satisfaction, how much use are you getting out of Terminus these days?
James: Not very much. This is the weird thing: I thought Terminus would be really useful, and really it’s only a handful of people that care about it. Orders of magnitude more people found Faye useful than found Terminus useful, which I really didn’t expect, because it started out being this really hacky thing that didn’t work very well. I’m surprised that it became usable enough that people started using it in their companies and things. That was a big surprise.
Tom: It’s really interesting to me that you semi-accidentally have mined this very rich vein of utility for people. Terminus was not necessarily something that you had a burning need for, but you just felt like you wanted to make it either for fun or for a small amount of use. But just by pulling on that thread all of the rest of this stuff has come out, and just by following your own needs you’ve uncovered this really rich area. I know a lot of people who use Faye — I use Faye, as you know — so from the outside it seems like it’s the most used thing that you’ve made. Is that true?
James: Yeah, absolutely. If I’m at a conference or whatever, that’s what I introduce myself as working on, because it’s what people tend to have heard of.
Tom: Right. I don’t want to get too much into any more of this stuff — that’s a fascinating story, and I would love to hear more of the stories, but I think there are other stories that have a similar character, right? In my mind you are characterised by a lack of hesitation in just going out and just making your own thing. You’ve got jstest, and I guess jsclass and jsbuild are tied into that, and you’ve got wake for Make-y stuff; those all form a cluster in my mental model of the Coglanverse.
James: Do you have a map of this? Because I’m kind of terrified.
Tom: There does need to be one! And then there’s a separate constellation which is your Lisp implementations: you had Fargo and Heist in Ruby. And then various other bits: I remember you talking about Primer, your caching thing for Active Record.
James: Yeah, that was a terrible idea that went nowhere.
Tom: You subsequently completely disowned that idea, but at the time, you know, it seemed cool.
James: That was totally one of those things where I was just naïve enough to think it was a good idea and went and talked about it, and then learnt a bit more and discovered it was a terrible idea. Doing that sort of stuff is educational.
Tom: In the last year you were talking about Coping, the templating language that you were playing around with.
James: Yeah. That was a conference-day hack thing. That’s still an experiment that I show to people occasionally and ask if it’s any use.
Tom: This is more generally something that I was interested in asking you about: the way that it seems that you do a lot of your thinking — in public, and on GitHub. When you look at the stuff you’re responsible for on GitHub there’s kind of a power law: there’s Faye and all of its attendant projects; and then there’s the stuff under jcoglan, a few big-ticket items and a really long tail of stuff that’s got two commits in, and no README — a bot that does a thing, or something.
There’s lots and lots of stuff there and it gives me the impression at least that you’re someone who, when you start thinking about something, your mechanism for doing that may well involve hacking on some software to do it, and then your M.O. for hacking on that software is to just put it in public. I don’t know to what extent you get excited about coming up with a good name for it, and making sure you’ve got the gem name or the npm name or whatever it is.
James: I actually banned myself from naming projects at Songkick because I’m so bad at it.
Tom: It sounds as though this is an integral part of your process for figuring stuff out. With Primer, you were trying to solve the hard problem of cache invalidation, right? Given that you’re abdicating responsibility for the hard problem of naming, you thought you would focus on the cache invalidation. Roughly, you were doing runtime instrumentation of Active Record instances to try and figure out when they had changed, and it was an intriguing idea.
It’s the kind of thing that, if I’d had that idea down the pub, I would’ve nodded to myself for a minute and then I probably would’ve not gone anywhere with it. But you at least worked through it to the extent that you built a piece of software and then you went and spoke about it in public, and it seems like you maybe required that whole process of doing all of that work to even figure out whether or not you liked it.
James: Yeah, sort of. Looking back, it’s one of the very small set of problems for which the right answer is “do a new language”. It was this really hacky thing where it would try and infer and record how your templates depended on your data, so that when your data changed it could invalidate your caches. It was really bad, and the way it did that involved storing a huge amount of metadata that was just completely ridiculous.
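A rough Ruby sketch of that kind of dependency tracking, under stated assumptions: this is not Primer's API, and the class names `TrackedRecord` and `DependencyCache` are invented for illustration. It records which attributes are read while a cached value is being computed, and discards that value when one of them is written.

```ruby
# Hypothetical sketch of dependency-tracked caching; not Primer's real API.
class TrackedRecord
  def initialize(cache, attributes)
    @cache = cache
    @attributes = attributes
  end

  # Reading an attribute tells the cache who depends on it.
  def read(name)
    @cache.record_read(self, name)
    @attributes[name]
  end

  # Writing an attribute invalidates everything that read it.
  def write(name, value)
    @attributes[name] = value
    @cache.invalidate(self, name)
  end
end

class DependencyCache
  def initialize
    @values = {}                                       # cache key => computed value
    @dependents = Hash.new { |hash, k| hash[k] = [] }  # [record, attribute] => cache keys
    @rendering = []                                    # stack of keys being computed
  end

  def fetch(key)
    return @values[key] if @values.key?(key)
    @rendering.push(key)
    begin
      @values[key] = yield
    ensure
      @rendering.pop
    end
  end

  def record_read(record, name)
    key = @rendering.last
    @dependents[[record, name]] << key if key
  end

  def invalidate(record, name)
    @dependents.delete([record, name]).to_a.each { |key| @values.delete(key) }
  end
end

cache = DependencyCache.new
user = TrackedRecord.new(cache, name: "James")
render = -> { cache.fetch("greeting") { "Hello, #{user.read(:name)}" } }

puts render.call         # computed and cached
user.write(:name, "Tom") # drops the cached greeting
puts render.call         # recomputed with the new name
```

The `@dependents` table gets an entry for every attribute of every record that a cached value has read, which hints at how the metadata could become unmanageably large.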
Tom: I’ve kind of hypothesised an answer to the question already, but I wanted to ask you explicitly: why do you do this? This isn’t an intervention, it’s just: why do you live your life in this way? You’ve partially answered this question in what you said about the various stuff that spun off from Terminus, but do you feel like the production of all of this software is done primarily out of actual necessity or is it primarily out of curiosity?
It sounds like, for example, the reason why you wrote Canopy was primarily because you were interested. If you had not been interested, you probably just would’ve used PEG.js, right?
James: Yeah, it was a kind of necessity in that I wanted to use it in order to do something else, rather than just doing it for its own sake. Plus being… I wouldn’t say frustrated, just having some different design ideas than the existing tools that were available.
Tom: Because there’s no programming language inline with the grammar.
Tom: I remember the first time I saw the syntax for Treetop I was confused about that. As someone who was interested in parsing already, when I saw the inline Ruby code inside the grammar my first thought was: how does the parser for this grammar language work? Because in Treetop when you want to introduce — I can’t remember what they call them, semantic actions or whatever — inside the grammar, you have an opening curly brace, and then you have some Ruby, and then a closing curly brace.
The immediate question that occurred to me was: how do you know when you reach the closing curly brace? Because the Ruby’s going to have curly braces in it, and some of them are going to be inside string literals, and how do you know? And the answer in Treetop is: it just counts how many opening and closing curly braces it sees. It doesn’t know about Ruby.
James: Yeah. Right. But it at least needs to know enough about Ruby to know that Ruby’s use of curly braces requires balancing. Which means that if you’ve got that assumption, you can just say, “yeah, let’s just count balanced braces, that’s fine”.
Tom: But if your Ruby code is just “puts a literal string with a closing curly brace in it”, then…
James: Then it won’t work.
Tom: Then you’re screwed. Yeah.
James: I don’t know. Maybe you can escape it, or maybe it does something like in Make where you have to do the two dollar signs: “no, this isn’t a Make dollar sign, this is a Bash dollar sign”.
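The brace-counting approach discussed above can be sketched in a few lines of Ruby. This illustrates the idea only and is not Treetop's actual implementation; the method name `closing_brace_index` is made up.

```ruby
# Hypothetical helper, not Treetop's code: find the index of the "}" that
# closes the first "{" at or after `start`, by counting braces and nothing else.
def closing_brace_index(source, start = 0)
  depth = 0
  (start...source.length).each do |i|
    case source[i]
    when "{" then depth += 1
    when "}"
      depth -= 1
      return i if depth.zero?
    end
  end
  nil # ran out of input before the braces balanced
end

balanced = '{ items.map { |x| x * 2 } }'
tricky   = '{ puts "}" }'

p closing_brace_index(balanced) == balanced.length - 1 # finds the real closing brace
p closing_brace_index(tricky) == tricky.length - 1     # stops early, inside the string
```

Counting works for as long as every brace in the embedded Ruby is balanced. A closing brace inside a string literal ends the block early, because the counter has no notion of strings.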
Tom: Right. But this obviously annoyed you sufficiently that you felt…
James: It wasn’t so much annoyed as being like, “I have this other idea and I feel like I might just about understand enough to be able to have a go at doing it”.
Tom: That pretty much falls under curiosity. I made a joke at the beginning about you just being angry and channelling your anger into open source, but it feels like it’s slightly more virtuous than that. You’re actually just interested in how things might be done, and when you can see something like that where an opportunity has gone unexploited, you’re interested to see how that might work out, regardless of how much of your life it might consume in the process.
James: I wouldn’t say “regardless”. I’m much more cautious certainly about announcing and releasing stuff these days, because now I know what actually being a maintainer means in terms of time, I can’t take on a lot more of those commitments.
Tom: It seems like these things start life as something for you: you’re hacking on something locally, you quite quickly decide you’re going to push it up to GitHub just so you’ve got an offsite backup or whatever, for whatever reason — so you can give it a cool name — and then it’s up on GitHub. But then at some point some of your projects go through this phase transition where they stop being something primarily for you, and you start thinking of them as being something for other people.
This is one of the things that I think is very impressive about the projects where you’ve made that transition. You’re very conscientious about making sure that your stuff is well documented — and there are some other things we should also talk about, other properties that your software has — but primarily in terms of presenting it to the world, making sure you’re not just slapping in a half-finished README on the GitHub repo. You have a nice subdomain on jcoglan.com, and you go to the trouble of designing a nice set of pages, and you write narrative documentation and stuff. How do you know when something has tipped over into that state where you’re like, “oh God, time for a new subdomain”?
James: Well, there’s a bunch of reasons why it could happen. Either I have some indication that it’s something other people want…
Tom: Because people are already using it?
James: No, that it’s a thing where you see people being frustrated by a certain thing and you think you might have the answer to it. Which most of the time you don’t, but that’s fine.
I think there’s another angle to it. This templating stuff that you brought up earlier: I’ve got this sort of experimental project called Coping which is a typesafe templating library. For example, if you’ve got an HTML template and you’ve got a thing that drops a value into an HTML attribute, the templating library knows how that should be encoded when you drop it in. In Rails, when you drop stuff into a template it automatically gets HTML-encoded; this system can do more context-aware encoding of stuff. If you’re dropping something into a query string, it will CGI-encode it; if it’s into HTML, it will HTML-encode it; and it will even do that in different places in the same template. It understands the grammatical structure of what you’re doing, and it knows how things should be composed together.
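To make the idea of context-aware encoding concrete, here is a small Ruby sketch using only the standard `cgi` library. It is not Coping's API: the helper names are hypothetical, and the contexts are chosen by hand here, whereas the point of Coping as described is that the library works them out from the template's structure.

```ruby
require "cgi"

# Hypothetical helper names for illustration; this is not Coping's API.

# HTML context: text nodes and attribute values get HTML-encoded.
def encode_for_html(value)
  CGI.escapeHTML(value)
end

# Query-string context: parameter values get CGI-encoded.
def encode_for_query(value)
  CGI.escape(value)
end

# A link needs both layers: the parameter is CGI-encoded inside the URL,
# and then the whole URL is HTML-encoded inside the href attribute.
def search_link(term)
  url = "/search?q=#{encode_for_query(term)}&page=2"
  %(<a href="#{encode_for_html(url)}">#{encode_for_html(term)}</a>)
end

puts search_link("Tom & Jerry <3")
```

The same value ends up encoded two different ways within one line of markup, which is the "different places in the same template" case.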
That sort of project is more an argument-by-programming thing, where you’re going: well, we’ve got this big problem which is that encoding stuff is hard and security is hard, and doing all that stuff correctly is really really hard to explain. Someone’s written a URL builder that has & in it but doesn’t CGI-encode the stuff that’s being dropped into the parameters, because people view source on web pages and you can’t see the layers of language involved in how that works, so you don’t know what to attribute different symbols to.
So it’s trying to make a thing that makes it easy to do the right thing even if you don’t fully understand what the problem is. Sort of going: “okay, there’s this problem with web development, I haven’t seen any good solutions to it, here’s an idea, is it any good?” That’s not so much going “here is packaged, finished stuff”, it’s going like “what if you could do this? and here’s a demo”.
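The URL-builder mistake mentioned above looks roughly like this in Ruby. The method names and parameter values are made up for illustration; `URI.encode_www_form` and `URI.decode_www_form` are from the standard library.

```ruby
require "uri"

# Naive builder: joins pairs with "&" but never encodes the values.
def naive_query(params)
  params.map { |key, value| "#{key}=#{value}" }.join("&")
end

# Encoding builder: the standard library encodes each key and value.
def encoded_query(params)
  URI.encode_www_form(params)
end

params = { "artist" => "Simon & Garfunkel", "page" => 2 }

puts naive_query(params)   # the "&" in the value now looks like a separator
puts encoded_query(params) # the "&" in the value is percent-encoded

# Decoding the encoded form recovers the original value intact.
p URI.decode_www_form(encoded_query(params))
```

In the naive output nothing distinguishes the `&` that belongs to the artist's name from the `&` that separates parameters, which is the layering problem that viewing source does not reveal.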
Tom: So as of right now, Coping hasn’t made that transition from a thing that you made for your own satisfaction to something that other people are using?
James: No. It’s not on RubyGems, there’s no packaging, it’s minimally documented. The README is like “you can do this”; it’s not “here’s everything you can do with this”, it’s like “here’s one example of a thing you can do with it”.
Tom: What would need to happen for it to make that magic transition, to get coping.jcoglan.com?
There’s a point there, but how do you package a good idea in a way that makes it so easy that people will want to use it over what they are already doing? ERB has spoiled people with convenience: you just drop a string in here, that’s great. If you’re trying to put forward another alternative, you’re fighting a usability competition with ERB, which is really easy to use, but very hard to use correctly.
You can do a system and go “hey! let’s all become Haskell programmers and learn how to compose stuff properly”, or you can try and meet people where they are, and go “here’s this thing that looks mostly like what you’re doing, but it’s better”. It’s trying to trade off correctness versus what should the right interface be that makes people feel comfortable using it.
A Conservative Maintainer
Tom: That leads me into another thing I wanted to ask you about, which was generally about your attitudes as a maintainer of open source software. In my mind I characterise you as a relatively conservative maintainer, in as much as you seem to say “no” a lot more than you say “yes” to things. Even when you are saying “yes” to things, you’re certainly not one of these people who just wants to shovel as much stuff as possible into your software.
Maybe that’s related to the idea you just explained about software as argument: primarily it’s software, and people can use it to get stuff done, but the reason why you have constructed it is as an expression of an idea that you have. It seems to make sense that you’re not necessarily going to want other people’s — or even your own — whims to pollute the clarity. Quite a lot of the time I’ve seen you say to people, “that’s a great idea, you should do that as a plugin or as a different piece of software or something”, just because it doesn’t align exactly with what you’re trying to do with whatever the piece of software is.
I’m also aware that you have opinions on things like API stability and API design. You did some tweets a few months ago which I found sufficiently interesting that I wrote them down. You said: “Web software is built by the smart and young, and its pace of change privileges the mentally agile. This includes both consumer software and dev tools. Change spends people’s time.” That sounds like something that quite a conservative maintainer would say.
Obviously the stuff you’ve worked on has changed over time, and you are still actively maintaining a lot of this stuff, certainly Faye. How do you manage that process and how does your own attitude towards your own software shape how something like Faye changes over time?
James: It’s mostly influenced by what I find painful using other stuff. I think when you describe me as “conservative”, the core of that is that I really value stability. There’s that article about the “worse is better” school of design versus the “it must be correct” school of design; I’m more on the latter end of that spectrum. I will put up with any amount of complexity in the implementation to make sure that the interface is right.
That mostly comes from being frustrated by having spent seven years now doing Ruby and living in the Ruby ecosystem for that much time. None of the software that I use looks anything like it did when I started using it, and there are good reasons for that and there are bad reasons for that. The thing that most concerns me is stuff that gets changed because someone thought that the new way is how it should’ve been done in the first place and it’s obviously better, but it doesn’t give any real new capabilities or power and doesn’t really fix any mistakes, and it breaks their existing software.
To me the canonical example of that is the Ruby 1.9 hash syntax, which a lot of people are like “oh, it’s obviously better”, but it doesn’t let you write programs that you couldn’t write before, it doesn’t fix any mistakes that anyone was making before, and it means that if someone uses that syntax, that program now won’t run on an older thing. It’s purely an aesthetic change. The aesthetics of code are important, and it’s important to have stuff that’s readable, but if you have a thing that’s already been shipped, making those tiny little fussy aesthetic changes to it, to me, never seems really worth it.
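For reference, the two hash literal syntaxes side by side. The keys and values are made up; only the syntax matters.

```ruby
# Made-up values; only the literal syntax matters here.
hash_rocket = { :name => "Faye", :port => 9292 }  # parses on Ruby 1.8 and later
new_syntax  = { name: "Faye", port: 9292 }        # a syntax error before Ruby 1.9

puts hash_rocket == new_syntax  # both literals build the same hash

# The newer form only covers symbol keys; other keys still need "=>".
string_keys = { "name" => "Faye" }
puts string_keys.key?("name")
```

Both literals produce equal hashes with symbol keys, so the difference is purely how the source text reads and which Ruby versions can parse it.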
If you’re going to change something, you’d better be giving me a lot more power or preventing me making significant mistakes, because as a maintainer when you change stuff you’re spending people’s time. When you break people’s software, you’re spending their time, and you don’t know what kind of context that might be in. You don’t know how costly people’s time is. In a lot of cases it’s a lot more costly than you realise: companies have to pay people to do this stuff. People get locked into huge legacy problems where they can’t upgrade things and now they’ve got massive technical debt that slows entire companies down.
Keeping software in a state where people can upgrade it easily without having to spend a lot of their own time, to me, is really valuable. If you make software such that people can’t do that easily, I don’t think you’re helping them. You’re going, “okay, I made this tool, and you’ve adopted it, and you’ve got utility out of it, and now you’re sort of dependent on it, and now I’m going to make that dependency very costly for you”. If it’s unstable, someone then has a bunch of code that’s coupled to that, and now that code is really costly and it’s a liability, and I really don’t like doing that to people.
Tom: You don’t want that to be your fault.
James: Well, I don’t like it when people do that to me, therefore I try not to do it to other people.
Tom: It reminds me more generally of the problem of the instinctive tendency of developers to prioritise the concerns of developers. Your Ruby hash syntax example is a great example of that. As the author of some library, or as the creator of Ruby or whatever, there’s something that I am concerned about — whether it’s aesthetics, or the names of things in my API, or whether or not I’m using the new hotness, or whatever it is — so I’m going to make that my focus and not necessarily concentrate on what the knock-on effect is going to be on the other developers or regular people who are using that software.
That seems to be relatively endemic. A lot of things that I see you complaining about in the world generally are instances of that problem, right?
James: Yeah. I don’t mean to write this argument off entirely, because it is kind of valid, but I tend to discount very heavily any argument that’s based on what makes programmers happy. I like using tools that I like using, I like stuff that makes it easy for me to do my job, but an argument about making developers happy can never win over an argument about keeping users happy.
A lot of people make the argument of “just do whatever satisfies 95% of the users”, but there are some problems where 95% isn’t good enough. Government is an obvious example of that, but I think more generally the effect of all these private businesses going “keep the 95% happy” is that there are a set of people who are just cut out from the marketplace entirely. All of those businesses are making the right decisions in terms of their interests — it’s too costly, it’s not a good business decision for them to spend loads of money supporting whatever it is they need to do to make this possible — but the net effect of that is to cut a certain set of people out of being able to use any of this stuff. I’m not sure that I have a great answer to that beyond “smash capitalism”.
James: Whenever I bring this up on Twitter people say to me, “people shouldn’t disable it, if they disable it they deserve it”. This is people who don’t live in big cities, who have terrible mobile reception. They just cannot use your web site.
Tom: So you’d be quite happy in a world where we’re using “apps” that are hosted on the web and, whether it’s Ember or Angular or whatever strategy to do that stuff, you would fundamentally be happy with that as long as it works fine when you’re on your mobile phone? You’re not fundamentally opposed to that model of the web, it’s just that there are currently specific operational problems with it that need to be addressed?
Tom: Maybe once that problem is solved we can start addressing the problem of you not having received all of the web site’s fonts. And so on.
James: Well. Yeah.
James: It’s for web sites, yeah.
James: I think there’s a lot of people who identify that way.
The frustrating thing about it to me is that there are lots of tools that work just fine, but there’s also quite a lot of obfuscation. It’s really easy — as in, it’s not much typing — to go and get Jasmine or Mocha or QUnit or whatever you want, make a web page, put it on there, and write some tests and run them, and you’ve got a web page that runs your tests. There is however a problem — I’ve found this more in Ruby land — that it sort of gets obfuscated, that people make gems that say “okay, we’re going to make a thing that lets you test your Sinatra app with Jasmine on Phantom”, and they’ll make a gem that ties those things together.
It’s not that that isn’t a useful thing to do, but the way in which it’s packaged is sort of obfuscatory. I always start from the position of making a web page with some tests on it, and then the second step is how do you make those run on Phantom, or how do you make them run on your web server. It’s the same approach that I take to doing a lot of Ruby stuff: I make it work on its own, then make it work on Sinatra, then make it work on Rails, rather than taking all the Rails assumptions and baking them in at the start.
So, because these things tend to obfuscate stuff, they hide stuff like: how does the test page get constructed? If you want to put markup in there, how do you do that? Some things add in interfaces that let you load files off disk, and then you can’t run that code in the browser because there’s no way to do that in the browser, so you get coupled to that testing platform and you can’t take those tests and run them somewhere else.
The thing is, there are a lot of useful places to run your tests. There are various CI servers like TestSwarm and Buster, a whole class of things that’ll automatically run your tests in loads of browsers, and if you don’t start from the standpoint of just “have a static web page that runs some tests”, using those things becomes much harder. If you become reliant on server-side software, and booting a server process, integrating with those things becomes harder. If you’re relying on a file system API that somehow magically springs into life, that’s not going to be available in this other ecosystem.
We’ve got enough tools, there just isn’t enough thought put into how they compose. There are some things that are presented so that you can’t see how to compose them if you change your mind about something. Because the thing is, a lot of these test runners have good APIs. Jasmine has a thing where you can write a reporter plugin that will just emit test results however you want, and you can use that to make it report the results to any of these test runner frameworks; you can do that glue. It’s just that there’s a lot of projects that go, “okay, I’ve done the glue, but now I’m going to package all of it into this one blob” and that means people can’t really see how it works, so they find it confusing.
James: I wouldn’t say that the book is that. The book isn’t really about tools, it’s more about use cases. Like, how do I test a Node app that talks to a database? How do I test streams? How do I test something that uses
I would say in terms of code-as-argument, jstest — the test framework that I use — is more in that vein. It is a testing framework that doesn’t try to hide things from you. The documentation explains to you how to integrate it with anything you want, and it’s designed to be able to do that. It’s designed to just assume that you have a web page, and then if you have a web page you can run it on Phantom. It’s making minimal assumptions, not trying to create a big server-based framework, treating it more as a library than a framework: “here’s this thing; you run it like this; if you want to run it on this other platform, here’s how you do that; here’s how you make it talk to some other stuff”.
So yeah, the way that that is constructed is more in line with your code-as-argument idea. The book tries to stay away from tooling as much as possible.
Tom: It sounds like you’re sneaking the code-as-argument in under the radar.
James: Well, maybe.
Tom: As part of the cookbook, all of these arguments will be absorbed by osmosis as you are doing this.
James: Possibly. Some of that stuff is something that I do tell to people when I’m teaching them how to do this stuff: start with a web page, start with a thing you can just load in your browser. Because yeah, there are a lot of good tools for running this stuff, and the more your stuff is just based on static files, the easier it is to put it in your CI system without coupling to a lot of other complicated dependencies. There are bits of that that are just general good advice that you can apply whatever tools you’re using.
Tom: Well, that sounds great. When’s the book gonna be done? When you finish building a toolchain to generate .epub files, presumably?
James: That’s fine. Yeah, I said the other day: I don’t have a book yet, but I have a really fun Makefile I can sell you.
Tom: Well, it doesn’t come as a massive surprise to me that you’ve ended up in that situation.
James: It’s fine. Since leaving Songkick last month I’ve taken a bit of time off, gone on holiday, but now doing the book is my full-time job for a bit, so yeah, I’m making steady progress. Hopefully out by the end of the year if I get my act together. I’m being at once discouraged by people who’ve written PhDs and tell me how hard writing is, and encouraged by people I know who’ve written books in a week. Somewhere in between those extremes is hopefully what’s going to happen.
Tom: Well, good luck with it, I feel like I should let you get back to it, given that it is your full-time job now. But thanks very much for coming and talking to me.
James: Thank you for having me.
Tom: Oh, it’s been a pleasure. Thanks very much!