Book: Coders at Work: Reflections on the craft of programming

Joe Armstrong

Joe Armstrong is best known as the creator of the programming language Erlang and the Open Telecom Platform (OTP), a framework for building Erlang applications.

In the modern language landscape, Erlang is a bit of an odd duck. It is both older and younger than many popular languages: Armstrong started work on it in 1986—a year before Perl appeared—but it was available only as a commercial product and used primarily within Ericsson until it was released as open source in 1998, three years after Java and Ruby appeared. Its roots are in the logic programming language Prolog rather than some member of the Algol family. And it was designed for a fairly specific kind of software: highly available, highly reliable systems like telephone switches.

But the characteristics that made it good for building telephone switches also—and almost inadvertently—made it quite well suited to writing concurrent software, something which has drawn notice as programmers have started wrestling with the consequences of the multicore future.

Armstrong, too, is a bit of an odd duck. Originally a physicist, he switched to computer science when he ran out of money in the middle of his physics PhD and landed a job as a researcher working for Donald Michie—one of the founders of the field of artificial intelligence in Britain. At Michie’s lab, Armstrong was exposed to the full range of AI goodies, becoming a founding member of the British Robotics Association and writing papers about robotic vision.

When funding for AI dried up as a result of the famous Lighthill report, it was back to physics-related programming for more than half a decade, first at the EISCAT scientific association and later the Swedish Space Corporation, before finally joining the Ericsson Computer Science Lab, where he invented Erlang.

In our several days of conversation over his kitchen table in Stockholm, we talked about, among other things, the Erlang approach to concurrency, the need for better and simpler ways of connecting programs, and the importance of opening up black boxes.

Seibel: How did you learn to program? When did it all start?

Armstrong: When I was at school. I was born in 1950 so there weren’t many computers around then. The final year of school, I suppose I must have been 17, the local council had a mainframe computer—probably an IBM. We could write Fortran on it. It was the usual thing—you wrote your programs on coding sheets and you sent them off. A week later the coding sheets and the punch cards came back and you had to approve them. But the people who made the punch cards would make mistakes. So it might go backwards and forwards one or two times. And then it would finally go to the computer center.

Then it went to the computer center and came back and the Fortran compiler had stopped at the first syntactic error in the program. It didn’t even process the remainder of the program. It was something like three months to run your first program. I learned then, instead of sending one program you had to develop every single subroutine in parallel and send the lot. I think I wrote a little program to display a chess board—it would plot a chess board on the printer. But I had to write all the subroutines as parallel tasks because the turnaround time was so appallingly bad.

Seibel: So you would write a subroutine with, basically, a unit test so you would see that it had, in fact, run?

Armstrong: Yes. And then you’d put it all together. I don’t know if that counts as learning programming. When I went to university I was in the physics department at University College of London. I think we probably had programming from the first year. Then you had this turnaround of three hours or something. But again it was best to run about four or five programs at the same time so you got them back fairly quickly.

Seibel: In high school, was it an actual school course?

Armstrong: It was an after-hours course—computer club or something. We went to see the computer, I remember. Lots of serious-looking older men wearing white coats with pens stuck in their pockets wandering around, like, a church. It was a very expensive computer.

Seibel: You were studying physics; when did you shift to programming?

Armstrong: Well, as an undergraduate some of the courses involved writing programs and I really enjoyed that. And I got to be very good at debugging. If all else failed, I would debug people’s programs. The standard debugging was one beer. Then it would go up—a two-beer problem or a three-beer problem or something like that.

Seibel: That was in terms of how many beers they had to buy you when you debugged their program?

Armstrong: Yeah, when I fixed their program. I used to read programs and think, “Why are they writing it this way; this is very complicated,” and I’d just rewrite them to simplify them. It used to strike me as strange that people wrote complicated programs. I could see how to do things in a few lines and they’d written tens of lines and I’d sort of wonder why they didn’t see the simple way. I got quite good at that.

When I really got to programming was after I finished my first degree and I decided I wanted to do a PhD. So I started to do a PhD in high-energy physics and joined the bubble chamber group there and they had a computer. A DDP-516, a Honeywell DDP-516. And I could use it all by myself. It was punched cards, but I could run the programs there—I could put them into the thing and press a button and whoomp, out came the answer immediately. I had great fun with that. I wrote a little chess program for it.

This was when real core memory was knitted by little old ladies and you could see the cores—you could see these little magnets and the wires went in and out. Frightfully expensive—it had something like a 10MB disk drive that had 20 platters and weighed 15 kilos or something. It had a Teletype interface—you could type your programs in on that.

And then came this “glass TTY” which was one of the first visual display units and you could type your programs in and edit them. I thought this was fantastic. No more punched cards. I remember talking to the computer manager and saying, “You know, one day everybody will have these.” And he said, “You’re mad, Joe. Completely mad!” “Why not?” “Well, they’re far too expensive.”

That was really when I learned to program. And my supervisor at the time, he said, “You shouldn’t be doing a PhD in physics. You should stop and do computers because you love computers.” And I said, “No, no, no. I’ve to finish this stuff that I was doing.” But he was right, actually.

Seibel: Did you finish your PhD?

Armstrong: No, I didn’t because I ran out of money. Then I went to Edinburgh. When I was reading physics we used to go and study in the physics library. And in the corner of the physics library there was this section of computer science books. And there were these brown-backed volumes called Machine Intelligence, Volumes 1, 2, 3, and 4, which came from Edinburgh, from the Department of Machine Intelligence there. I was supposed to be studying physics but I was eagerly reading these things and thought, “Oh, that’s jolly good fun.” So I wrote to Donald Michie, who was the director of the Department of Machine Intelligence at Edinburgh, and said I was very interested in this kind of stuff and did he have any jobs. And I got back a letter that said, well, they didn’t at the moment but he would like to meet me anyway, see what sort of person I was.

Months later I got a phone call, or letter, from Michie, saying, “I’ll be in London next Tuesday; can we meet? I’m getting the train to Edinburgh; can you come to the station?” I went to the station, met Michie, and he said, “Hmmm! Well, we can’t have an interview here—well, we’ll find a pub.” So we went to a pub and I chatted to Michie and then a bit later I got another letter from him, he says, “There’s a research job at Edinburgh, why don’t you apply for it.” So I became Donald Michie’s research assistant and went to Edinburgh. That was my transition between physics and computer science.

Michie had worked with Turing at Bletchley Park during the second World War and got all of Turing’s papers. I had a desk in Turing’s library, so all around me were Turing’s papers. So I was a year at Edinburgh. After that Edinburgh kind of collapsed because James Lighthill, a mathematician, was hired by the government to go and investigate artificial intelligence at Edinburgh. And he came back and said, “Nothing of commercial value will ever come out of this place.”

It was like one gigantic playpen kind of place. I was a founding member of the British Robotics Association and we all thought this was really going to have enormous relevance. But the funding agencies—Robotics! What’s this stuff? We’re not going to fund this! And so there was a period around ’72, I guess, when all the funding dried up and everybody said, “Well, we had fun while we were here; better go and do something else.”

Then it’s back to being a physicist. I came to Sweden and I got a job as a physicist programmer for the EISCAT scientific association. My boss had come from IBM and he was older than me and he wanted a specification and he would go and implement it. We used to argue over this. He said, “What’s bad about the job is we don’t have a job description and we don’t have a detailed specification.” And I said, “Well, a job with no job description is a really good job. Because then you can form it how you like.” Anyway, he left after about a year and I got the boss’s job, the chief designer.

I designed a system for them and that was what I suppose you’d call an application operating system—it’s something that runs on top of the regular operating system. By now computers were becoming quite reasonable. We had NORD-10 computers which were Norwegian—I think they were an attempt to get into the PDP-11 market.

I worked there for almost four years. Then I got a job for the Swedish Space Corporation and built yet another application operating system to control Sweden’s first satellite, which was called Viking. That was a fun project—I’ve forgotten the name of the computer but it was a clone of the Amdahl computer. It still only had line editors. It didn’t have full-screen editors. And all your programs had to be in one directory. Ten letters for the file name and three letters for the extension. And a Fortran compiler or assembler and that’s it.

The funny thing is, thinking back, I don’t think all these modern gizmos actually make you any more productive. Hierarchical file systems—how do they make you more productive? Most of software development goes on in your head anyway. I think having worked with that simpler system imposes a kind of disciplined way of thinking. If you haven’t got a directory system and you have to put all the files in one directory, you have to be fairly disciplined. If you haven’t got a revision control system, you have to be fairly disciplined. Given that you apply that discipline to what you’re doing it doesn’t seem to me to be any better to have hierarchical file systems and revision control. They don’t solve the fundamental problem of solving your problem. They probably make it easier for groups of people to work together. For individuals I don’t see any difference.

Also, I think today we’re kind of overburdened by choice. I mean, I just had Fortran. I don’t think we even had shell scripts. We just had batch files so you could run things, a compiler, and Fortran. And assembler possibly, if you really needed it. So there wasn’t this agony of choice. Being a young programmer today must be awful—you can choose 20 different programming languages, dozens of frameworks and operating systems, and you’re paralyzed by choice. There was no paralysis of choice then. You just start doing it because the decision as to which language and things is just made—there’s no thinking about what you should do, you just go and do it.

Seibel: Another difference these days is that you can no longer understand the whole system from top to bottom. So not only do you have lots of choices to make, they’re all about which black boxes you want to use without necessarily fully understanding how they work.

Armstrong: Yeah—if these big black boxes don’t work properly, and you have to modify them, I reckon it’s easier just to start from scratch and just write everything yourself. The thing that really hasn’t worked is software reuse. It’s appallingly bad.

Seibel: Yet you’re the architect not only of Erlang but of an application framework, the Open Telecom Platform. Is it reusable?

Armstrong: To an extent it’s reusable. But the same problem will occur. If that framework exactly solves your problem—if some programmer who doesn’t know anything about the design criteria for OTP looks at it in a few years’ time and says, “Oh, that’s great; that’s exactly what I want to do,” then it’s fine and you get this measure of reusability. If it’s not, then you have a problem.

Fairly recently I’ve seen people say, “This is really kind of artificial, we’re twisting the code to fit into this OTP framework.” So I say, “Well, rewrite the OTP framework.” They don’t feel they can change the framework. But the framework’s just another program. It’s really rather easy. And I go into it and then it does what they want. They look at it and they say, “Yeah, well, that’s easy.” They accept that it’s easy. But they say, “Well, our project management doesn’t want us messing around with the framework.” Well, give it a different name then or something.

Seibel: But do you think it’s really feasible to really open up all those black boxes, look inside, see how they work, and decide how to tweak them to one’s own needs?

Armstrong: Over the years I’ve kind of made a generic mistake and the generic mistake is to not open the black box. To mentally think, this black box is so impenetrable and so difficult that I won’t open it. I’ve opened up one or two black boxes: I wanted to do a windowing system, a graphics system for Erlang, and I thought, “Well, let’s run this on X Windows.” What is X Windows? It’s a socket with a protocol on top of it. So you just open the socket and squirt these messages down it. Why do you need libraries? Erlang is message based. The whole idea is you send messages to things and they do things. Well, that’s the idea in X Windows—you’ve got a window, send it a message, it does something. If you do something in the window it sends you a message back. So that’s very much like Erlang. The way of programming X Windows, however, is through callback libraries—this happens and call this. That’s not the Erlang way of thinking. The Erlang way of thinking is, send a message to something and do something. So, hang on, let’s get rid of all these libraries in between—let’s talk directly to the socket.

And guess what? It’s really easy. The X protocol’s got, I don’t know, 100 messages, 80 messages or something. Turns out you only need about 20 of them to do anything useful. And these 20 messages you just map onto Erlang terms and do a little bit of magic and then you can start sending messages to windows directly and they do things. And it’s efficient as well. It’s not very pretty because I haven’t put much effort into graphics and artistic criteria—there’s a lot of work there to make it look beautiful. But it’s not actually difficult.

Another one is this typesetting system I did where the abstraction boundary I opened up is Postscript. As you get to that boundary you think, “I don’t want to go through the boundary,” because what’s underneath is—you imagine—enormously complicated. But again, it turns out to be very easy. It’s a programming language. It’s a good programming language. The abstraction boundary is easy to go through and once you’ve gone through, there’s a lot of benefit.

For my Erlang book, my publisher said, “We’ve got tools to make diagrams.” But the thing I don’t like about diagramming tools is it’s really difficult to get an arrow to meet exactly. And your hand hurts. I thought, “The amount of time to write a program that spits out Postscript and then say, ‘I want a circle there and the arrow goes exactly there,’ and get the program right, isn’t long.” It takes a few hours. Doing diagrams with programs takes about the same time as doing them in a WYSIWYG thing. Only there are two benefits. Your hand doesn’t hurt at the end and even when you blow the thing up to a magnification of 10,000, the arrow points exactly right.

I can’t say beginner programmers should open up all these abstractions. But what I am saying is you should certainly consider the possibility of opening them. Not completely reject the idea. It’s worthwhile seeing if the direct route is quicker than the packaged route. In general I think if you buy software, or if you use other people’s software, you have to reckon with an extremely long time to tailor it—it doesn’t do exactly what you want, it does something subtly different. And that difference can take a very long time to solve.

Seibel: So you started out saying software reuse is “appallingly bad,” but opening up every black box and fiddling with it all hardly seems like movement toward reusing software.

Armstrong: I think the lack of reusability comes in object-oriented languages, not in functional languages. Because the problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.

If you have referentially transparent code, if you have pure functions—all the data comes in its input arguments and everything goes out and leaves no state behind—it’s incredibly reusable. You can just reuse it here, there, and everywhere. When you want to use it in a different project, you just cut and paste this code into your new project.
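A minimal sketch of that contrast, written in Python rather than Erlang. Every name here (`TaxPolicy`, `Cart`, `total_with_tax`) is hypothetical and not from the interview. The method can only be reused if its surrounding objects come along with it, while the pure function needs nothing but its arguments.

```python
class TaxPolicy:
    """Hypothetical piece of the 'jungle': state the method silently depends on."""

    def __init__(self, rate):
        self.rate = rate


class Cart:
    """Hypothetical object whose method relies on an implicit environment."""

    def __init__(self, prices, policy):
        self.prices = prices
        self.policy = policy  # reusing total() means bringing this along too

    def total(self):
        return sum(self.prices) * (1 + self.policy.rate)


def total_with_tax(prices, rate):
    """Pure function: all data arrives as arguments and no state is left behind."""
    return sum(prices) * (1 + rate)


if __name__ == "__main__":
    print(Cart([10, 20], TaxPolicy(0.5)).total())
    print(total_with_tax([10, 20], 0.5))
```

Both calls compute the same number, but only the second can be pasted into another project without dragging `Cart` and `TaxPolicy` along.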

Programmers have been conned into using all these different programming languages and they’ve been conned into not using easy ways to connect programs together. The Unix pipe mechanism—A pipe B pipe C—is trivially easy to connect things together. Is that how programmers connect things together? No. They use APIs and they link them into the same memory space, which is appallingly difficult and isn’t cross-language. If the language is in the same family it’s OK—if they’re imperative languages, that’s fine. But suppose one is Prolog and the other is C. They have a completely different view of the world, how you handle memory. So you can’t just link them together like that. You can’t reuse things. There must be big commercial interests for whom it is very desirable that stuff won’t work together. It creates thousands of jobs for consultants. And thousands of tools to solve problems that shouldn’t exist. Problems that were solved years ago.

I think it’s really weird that we have very few programming languages that describe the interaction between things. I keep coming back to ways of gluing things together and ways of describing protocols. We don’t have ways of describing this protocol in between things: if I send you one of them then you send me one of these. We have ways of describing packets and their types but we have very restricted ways of describing the protocols.
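One way to picture a description of "if I send you one of them then you send me one of these" is a small state table. The sketch below is an illustration only: the states and message names (`login`, `ok`, `request`, and so on) are invented for the example and do not come from Erlang, OTP, or any real protocol.

```python
# Hypothetical protocol: each state maps an allowed message to the next state.
PROTOCOL = {
    "idle": {"login": "awaiting_ack"},
    "awaiting_ack": {"ok": "ready", "denied": "idle"},
    "ready": {"request": "awaiting_reply", "logout": "idle"},
    "awaiting_reply": {"reply": "ready"},
}


def follows_protocol(messages, start="idle"):
    """Return True if the whole message sequence is allowed by PROTOCOL."""
    state = start
    for message in messages:
        allowed = PROTOCOL[state]
        if message not in allowed:
            return False  # this message is not permitted in the current state
        state = allowed[message]
    return True


if __name__ == "__main__":
    print(follows_protocol(["login", "ok", "request", "reply", "logout"]))
    print(follows_protocol(["login", "request"]))
```

The table describes the order of the conversation, not just the shape of each packet, which is the part Armstrong says is usually missing.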

Programming is fundamentally different to the way we construct things in the real world. Imagine you’re a car manufacturer. You buy components from subcontractors. You buy a battery from Lucas and you buy a generator from somewhere. And you bolt things together—you construct things by placing things next to each other. You build a house by putting the bricks on top of each other and putting the door there. That’s how we make chips. You get a printed circuit board that basically just provides this connection. But you can think of making electronic things as you buy all these chips and you connect the legs of some to others with wires. And that’s how you make hardware. But we don’t make software like that. We should make software like that and we don’t.

The reason we don’t has to do with concurrency. You see, the chips, when you put them next to each other, they all execute in parallel. And they send messages. They are based on this message-passing paradigm of programming, which is what I believe in. And that’s not how we write software together. So I think one direction Erlang might take, or I would like it to take, is this component direction. I haven’t done it yet, but I’d like to make some graphic front ends that make components and I’d like to make software by just connecting them together. Dataflow programming is very declarative. There’s no notion of sequential state. There’s no program counter flipping through this thing. It just is. It’s a declarative model and it’s very easy to understand. And I miss that in most programming languages.

That’s not to say that what’s inside an individual black box isn’t very complicated. Take grep, for example. Seen from the outside—imagine a little square. The input is a stream of data, a file. You say cat foo | grep and grep has got some arguments, it’s got a regular expression it’s got to match. OK. And out of grep come all the lines that match that regular expression. Now, at a perceptual level, understanding what grep does is extremely simple. It has an input which is a file. It has an input which is a regular expression. It has an output which is a set of lines or a stream of lines that match the regular expression. But that is not to say that the algorithm inside the black box is simple—it could be exceedingly complicated.

What’s going on inside the black boxes can be exceedingly complicated. But gluing things together from these complicated components does not itself have to be complicated. The use of grep is not complicated in the slightest. And what I don’t see in system architectures is this clear distinction between the gluing things together and the complexity of the things inside the boxes.
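The separation between simple glue and complicated boxes can be sketched with Python generators standing in for pipeline stages. This is an analogy for the interface only: unlike real Unix processes, these stages share one process and do not run in parallel. The functions `cat`, `grep`, and `pipeline` are illustrative stand-ins, not the real tools.

```python
import re


def cat(lines):
    """Source stage: yield lines one at a time (stands in for `cat foo`)."""
    for line in lines:
        yield line


def grep(pattern, lines):
    """Filter stage: pass through only the lines that match the pattern."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line


def pipeline(source, *stages):
    """Glue: feed the output stream of each stage into the next one."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


if __name__ == "__main__":
    text = ["spawn a process", "send a message", "receive a message"]
    for line in pipeline(cat(text), lambda s: grep(r"message", s)):
        print(line)
```

The `pipeline` function knows nothing about regular expressions. However complicated a stage is inside, all the glue sees is a stream going in and a stream coming out.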

When we connect things together through programming language APIs we’re not getting this black box abstraction. We’re putting them in the same memory space. If grep is a module that exposes routines in its API and you give it a char* pointer to this and you’ve got to malloc that and did you deep copy this string—can I create a parallel process that’s doing this? Then it becomes appallingly complicated to understand. I don’t understand why people connect things together in such complicated ways. They should connect things together in simple ways.

Seibel: Comparing how you think about programming now with how you thought when you were starting out, what’s the biggest change in your thinking?

Armstrong: The big changes in how I think about programming have nothing to do with the hardware. Obviously it’s a lot faster and a lot more powerful but your brain is a million times more powerful than the best software tools. I can write programs and then suddenly, days later, say, “There’s a mistake in that program—if this happens and that happens and that happens and this happens, then it will crash.” And then I go and look in the code—yup, I was right. There has never been a symptom. Now you tell me a development system that can do that kind of stuff. So the changes that have happened as a programmer, they’re mental changes within me.

There are two changes and I think they’re to do with the number of years you program. One is, when I was younger quite often I would write a program and work at it until it’s finished. When it was finished I would stop working on it. It was done, finished. Then I’d get an insight—“Ah! Wrong! Idiot!” I’d rewrite it. Again: “Yeah, it’s wrong”—rewrite it.

I remember thinking to myself, “Wouldn’t it be nice if I could think all of this stuff instead of writing it?” Wouldn’t it be nice if I could get this insight without writing it. I think I can do that now. So I would characterize that period, which took 20 years, as learning how to program. Now I know how to program. I was doing experiments to learn how to program. I think I know how to program now and therefore I don’t have to do the experiments anymore.

Occasionally I have to do very small experiments—write extremely small programs just to answer some question. And then I think through things and they more or less work as I expect when I program them because I’ve thought through them. That also means it takes a long time. A program that you write, you get the insight, you rewrite—it might take you a year to write. So I might think about it for a year instead. I’m just not doing all this typing.

That’s the first thing. The second thing that’s happened is intuition. When I was younger, I would do the all-night hacks, programming to four in the morning and you get really tired and it’s macho programming—you hack the code in hour after hour. And it’s not going well and you persevere and you get it working. And I would program when the intuition wasn’t there.

And what I’ve learned is, programming when you’re tired, you write crap and you throw it all away the next day. And 20 years ago I would program although I was getting a strong feeling that this isn’t right—there’s something wrong with this code. I have noticed over the years, the really good code I would write was when I’m in complete flow—just totally unaware of time: not even really thinking about the program, just sitting there in a relaxed state just typing this stuff and watching it come out on the screen as I type it in. That code’s going to be OK. The stuff where you can’t concentrate and something’s saying, “No, no, no, this is wrong, wrong, wrong”—I was ignoring that years ago. And I’d throw it all away. Now I can’t program anymore if it says, “No.” I just know from experience, stop—don’t write code. Stop with the problem. Do something else.

Because I was good at math and that sort of stuff at school, I thought, “Oh, I’m a logical person.” But I took these psychology tests and got way high scores on intuition. And quite low scores on logical thinking. Not low—I can do math and stuff; I’m quite good at them. But because I was good at math I thought science was about logic and math. But I wouldn’t say that now. I’d say it’s an awful lot of intuition, just knowing what’s right.

Seibel: So now that you spend more time thinking before you code, what are you actually doing in that stage?

Armstrong: Oh, I’m writing notes—I’m not just thinking. Doodling on paper. I’m probably not committing much to code. If you were to monitor my activity it’d be mostly thinking, a bit of doodling. And another thing, very important for problem solving, is asking my colleagues, “How would you solve this?” It happens so many times that you go to them and you say, “I’ve been wondering about whether I should do it this way or that way. I’ve got to choose between A and B,” and you describe A and B to them and then halfway through that you go, “Yeah, B. Thank you, thank you very much.”

You need this intelligent white board—if you just did it yourself on a white board there’s no feedback. But a human being, you’re explaining to them on the white board the alternative solutions and they join in the conversation and suggest the odd thing. And then suddenly you see the answer. To me that doesn’t extend to writing code. But the dialog with your colleagues who are in the same problem space is very valuable.

Seibel: Do you think it’s those little bits of feedback or questions? Or is it just the fact of explaining it?

Armstrong: I think it is because you are forcing it to move from the part of your brain that has solved it to the part of your brain that has verbalized it and they are different parts of the brain. I think it’s because you’re forcing that to happen. I’ve never done the experiment of just speaking out loud to an empty room.

Seibel: I heard about a computer science department where in the tutor’s office they had a stuffed animal and the rule was you had to explain your problem to the stuffed animal before you could bother the tutor. “OK, Mr. Bear, here’s the thing I’m working on and here’s my approach—aha! There it is.”

Armstrong: Really? I must try that.

Seibel: Talk to your cats.

Armstrong: The cats—absolutely! I worked with this guy who was slightly older than me and very clever. And every time I’d go into his office and ask him a question, every single question, he would say, “A program is a black box. It has inputs and it has outputs. And there is a functional relationship between the inputs and the outputs. What are the inputs to your problem? What are the outputs to your problem? What is the functional relationship between the two?” And then somewhere in this dialog, you would say, “You’re a genius!” And you’d run out of the room and he would shake his head in amazement—“I wonder what the problem was, he never said.” So he’s your bear which you explain the problem to.

Seibel: The doodling—is that writing little snippets of code or is it literally graphical doodles?

Armstrong: It’s more bubbles with arrows. You know when you explain things to people on a white board—you draw bubbles and arrows and equations and notations. Not code. Code fragments—piddly bits of code sometimes because that’s a compact way to express something. This is in the thinking period. Very occasional code experiments because I don’t know how long it takes to do something. So I’ll write ten lines of code and time something.

Seibel: You mean how long it takes for the computer to do it?

Armstrong: Yeah. Does that take a millisecond or a microsecond—I don’t know. I can guess but I want to confirm that guess. And so I’m only looking at the bits I don’t really know. But I have a great stock of experience programming Erlang so I know pretty much what things are going to do. Problem solving was the same years ago. It was, identify the difficult bits, write the small prototypes, identify the areas of uncertainty, writing very small bits of code. Essentially I do the same thing now but I have less reason to do these small experiments. If it’s Erlang. If I’m doing Ruby or Java then I have to go back and do a lot of experiments because I don’t know what’s going to happen.
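A small timing experiment of the kind described here might look like the following sketch. It uses Python's standard `timeit` module rather than Erlang, and the helper `time_per_call` plus the dictionary lookup being measured are made up for illustration.

```python
import timeit


def time_per_call(stmt, setup="pass", number=100_000):
    """Return the average seconds per execution of stmt over `number` runs."""
    total = timeit.timeit(stmt, setup=setup, number=number)
    return total / number


if __name__ == "__main__":
    # Guess first, then confirm: is a dictionary lookup micro- or nanoseconds?
    per_call = time_per_call("d[500]", setup="d = {i: i for i in range(1000)}")
    print(f"dict lookup: about {per_call * 1e9:.0f} ns per call")
```

The printed figure varies between machines and runs, so the point is the order of magnitude, not the exact number.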

Seibel: Then somewhere in this thinking process you get to the point where you know how to write the code?

Armstrong: Yeah, then all the bits fit together. But maybe I can’t explain it to anybody. I just get a very strong feeling that if I start writing the program now it’ll work. I don’t really know what the solution is. It’s like an egg. The chicken’s ready to lay the egg. Now I’m ready to lay the egg.

Seibel: And that’s the point at which you need to go into flow and not be interrupted.

Armstrong: Yes, yes.

Seibel: So there are still presumably a lot of details to be sorted out at the code level which requires your concentration.

Armstrong: Oh yes. But then there are two types of those things. The stuff that really needs the concentration is the stuff that is not automatic—you’ve got to think about it. You’ve got this really tricky garbage collection—exactly what needs to be marked and exactly where—you’ve got to think hard about that. You know you’ll find a solution because you’ve kind of bounded it in. And you know it’s in the right little black box.

Michelangelo is doing the roof of the Sistine Chapel or something and he’s got a whole team of painters helping him. So he would sketch the big picture first. These huge areas have got to be done in blue and green. So that’s rather like writing a program. The first sketch is this broad sketch where everything’s in the right place. Some of these places are going to be filled with uniform color and just can be filled in fairly rapidly—you don’t have to think.

And then you get to the details of the eyes—that’s tricky stuff. You know you can do it. And the eye is in the right place because the picture is OK. So you go and do the eye and the detail. That’s not to say that’s easy—that’s the difficult bit, actually. You’ve got to really concentrate while you’re doing the eye. You don’t have to really concentrate while you’re doing the forehead or the cheeks because they’re fairly uniform. A bit of stubble here so you pay a sort of half concentration.

Then type it all in and get the syntax errors out and run a few little tests to make sure it works. And that’s all rather relaxing. See a little compiler error there and you fix it. Once you’re experienced at a language you don’t even bother to read the diagnostic. It just says the line number—you don’t read what it says. That line—oh, yeah. That’s wrong, you retype it.

I gave a course in Erlang in Chicago. I was wandering around the class and I’d notice, there’s something wrong. Oh, there’s a comma missing there or that’ll crash before that happens and you’re not linked. My wife’s very good at proofreading and she says errors spring out of the page at you. A missing comma or a spelling mistake—they literally spring out of the page at her.

And programming errors just spring out of the page if I look at other people’s code, wandering around. It doesn’t feel like conscious thought is involved—it’s holistic. You see everything on the screen and there’s the error, bumpf. So it’s just a matter of correcting those surface errors.

One that’s tricky is slight spelling errors in variable names. So I choose variable names that are very dissimilar, deliberately, so that error won’t occur. If you’ve got a long variable like personName and you’ve got personNames with an “s” on the end, that’s a list of person names, that will be something that my eye will tend to read what I thought it should have been. And so I’d have personName and then listOfPeople. And I do that deliberately because I know that my eye will see what I thought I’d written. But punctuation, I do see that—I do see the commas and the brackets as being wrong. And of course Emacs colors everything and auto-indents and the brackets are different colors. So this is really easy.

Seibel: At the point that you start typing code, do you code top-down or bottom-up or middle-out?

Armstrong: Bottom up. I write a little bit and test it, write a little bit and test it. I’ve gone over to this writing test cases first, now. Unit testing. Just write the test cases and then write the code. I feel fairly confident that it works.

Seibel: Back to a bit of your history, it was after the Swedish Space Corporation that you went to Ericsson’s research lab?

Armstrong: Yes. And it was a very, very fortunate time to come, it must have been ’84. I think I had come to the lab something like two years after it had started. So we were very optimistic. Our view of the world was, yes we’ll solve problems and then we’ll push them into projects and we will improve Ericsson’s productivity. This view of the world wasn’t yet tinged by any contact with reality. So we thought it would be easy to discover new and useful stuff and we thought that once we had discovered new and useful stuff then the world would welcome us with open arms. What we learned later was, it wasn’t all that easy to discover new stuff. And it’s incredibly difficult to get people to use new and better stuff.

Seibel: And Erlang was one of those new and useful things you expected them to use?

Armstrong: Yes. Absolutely. So what happened was, first of all it was just Prolog. I sort of made a little language and people started using it. And then Robert Virding came along and said, “Hey, this looks like fun.” And he’d been reading my Prolog and he said, “Can I modify it a bit?” That’s pretty dangerous because Robert says that and you end up with one comment at the top of the program that says, “Joe thought of this stuff and I’ve changed a bit,” and then it’s completely changed. So Robert and I just rewrote this stuff back and forth and we had great arguments—“Ahhh, I can’t read your code, it’s got blanks after all the commas.”

Then we found somebody inside Ericsson who wanted a new programming language or wanted a better way of programming telephony. We met up with them once a week for about, I can’t remember, six months, nine months. And the general idea was we would teach them how to program and they would teach us about telephony—what the problem was. I remember it was both frustrating and very stimulating. That changed the language because we had real people using it and that resulted in a study where they thought, “Yeah, this would be OK but it’s far too slow”—they measured the performance of it and said, “It’s gotta be 70 times faster.” So then we said, “This phase is now over. We’ll make it go 70 times faster and they’ll carry on programming it and we have to do this in two years or something.”

We had several false starts. And we had several really embarrassing moments. Big mistake: don’t tell people how fast something is going to be before you’ve implemented it. But ultimately we figured out how to do it. I wrote a compiler in Prolog. And Rob was doing the libraries and things. We’re now kind of two years in. Then I thought I could implement this abstract machine in C so I started writing my first-ever C. And Mike Williams came along and looked at my C and said, “This is the worst C I’ve ever seen in my entire life. This is appallingly bad.” I didn’t think it was that bad but Mike didn’t like it. So then Mike did the virtual machine in C and I did the compiler in Prolog. Then the compiler compiled itself and produced byte-code and you put it in the machine and then we changed the grammar and the syntax and compiled the compiler in itself and came out with an image that would bootstrap and then we’re flying. We’ve lost our Prolog roots and we’re now a language.

Seibel: Has there ever been anything that you’ve found difficult to work into the Erlang model?

Armstrong: Yeah. We abstract away from memory, completely. If you were turning a JPEG image into bitmap data, which depends on the placement of the data in a very exact sense, that doesn’t work very well. Algorithms that depend on destructively updating state—they don’t work well.

Seibel: So if you were writing a big image processing work-flow system, then would you write the actual image transformations in some other language?

Armstrong: I’d write them in C or assembler or something. Or I might actually write them in a dialect of Erlang and then cross-compile the Erlang to C. Make a dialect—this kind of domain-specific language kind of idea. Or I might write Erlang programs which generate C programs rather than writing the C programs by hand. But the target language would be C or assembler or something. Whether I wrote them by hand or generated them would be the interesting question. I’m tending toward automatically generating C rather than writing it by hand because it’s just easier.

But I’d use an Erlang structure. I’ve got some stuff that does my family images and things. So I use ImageMagick with some shell scripts. But I control it all from Erlang. So I just write wrappers around it and call os:command and then the ImageMagick command. So it’s quite nice to wrap up things in. Wouldn’t want to do the actual image processing in Erlang. It’d be foolish to write that in Erlang. C’s just going to be a lot better.
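The wrapper pattern described here, where a high-level language does the control and an external tool does the pixel work, can be sketched as follows. This is a minimal illustration in Python rather than Erlang, and it is not Armstrong's code: the function names `resize_command` and `resize` are hypothetical, and the sketch assumes ImageMagick's `convert` executable is on the PATH.

```python
import subprocess


def resize_command(src, dst, width, height):
    """Build the argument list for an ImageMagick resize; nothing runs here."""
    # `convert <in> -resize WxH <out>` is ImageMagick's command-line form.
    return ["convert", src, "-resize", f"{width}x{height}", dst]


def resize(src, dst, width, height):
    """Run the external tool; raise CalledProcessError on a non-zero exit."""
    subprocess.run(resize_command(src, dst, width, height), check=True)
```

Keeping the command construction separate from the call that executes it means the controlling code can be checked without ImageMagick installed.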

Seibel: Plus, ImageMagick is already written.

Armstrong: That doesn’t worry me in the slightest. I think if I was doing it in OCaml then I would go down and do it because OCaml can do that kind of efficiency. But Erlang can’t. So if I was an OCaml programmer: “OK, what do I have to do? Reimplement ImageMagick? Right, off we go.”

Seibel: Just because it’s fun?

Armstrong: I like programming. Why not? You know, I’ve always been saying that Erlang is bad for image processing—I’ve never actually tried. I feel it would be bad but that might be false. I should try. Hmmm, interesting. You shouldn’t tempt me.

The really good programmers spend a lot of time programming. I haven’t seen very good programmers who don’t spend a lot of time programming. If I don’t program for two or three days, I need to do it. And you get better at it—you get quicker at it. The side effect of writing all this other stuff is that when you get to doing ordinary problems, you can do them very quickly.

Seibel: Is there anything that you have done specifically to improve your skill as a programmer?

Armstrong: No, I don’t think so. I learned new programming languages but not with the goal of becoming a better programmer. With the goal of being a better language designer, maybe.

I like to figure out how things work. And a good test of that is to implement it yourself. To me programming isn’t about typing code into a machine. Programming is about understanding. I like understanding things. So why would I implement a JPEG thing like we talked about earlier? It’s because I’d like to understand wavelet transforms. So the programming is a vehicle to understand wavelet transformations. Or why do I try to do an interface to X Windows? Because I wanted to understand how the X protocol worked.

It’s a motivating force to implement something; I really recommend it. If you want to understand C, write a C compiler. If you want to understand Lisp, write a Lisp compiler or a Lisp interpreter. I’ve had people say, “Oh, wow, it’s really difficult writing a compiler.” It’s not. It’s quite easy. There are a lot of little things you have to learn about, none of which is difficult. You have to know about data structures. You need to know about hash tables, you need to know about parsing. You need to know about code generation. You need to know about interpretation techniques. Each one of these is not particularly difficult. I think if you’re a beginner you think it’s big and complicated so you don’t do it. Things you don’t do are difficult and things you’ve done are easy. So you don’t even try. And I think that’s a mistake.

Seibel: Several of the folks I’ve talked to have recommended learning different programming languages because it gives you different perspectives on how to solve problems.

Armstrong: Languages that do different things. There’s no point learning lots of languages that all do the same thing. Certainly I’ve written quite a lot of JavaScript and quite a lot of Tcl and quite a lot of C and quite a lot of Prolog—well, an enormous amount of Prolog and an enormous amount of Fortran and an enormous amount of Erlang. And a bit of Ruby. A bit of Haskell. I sort of read all languages and I’m not fluent at programming them all. Certainly I can program in quite a lot of languages.

Seibel: No C++?

Armstrong: No, C++, I can hardly read or write it. I don’t like C++; it doesn’t feel right. It’s just complicated. I like small simple languages. It didn’t feel small and simple.

Seibel: What languages influenced the design of Erlang?

Armstrong: Prolog. Well, it grew out of Prolog, obviously.

Seibel: There’s not a lot of Prolog discernible in it today.

Armstrong: Well, unification—pattern matching, that comes directly from Prolog. And the kind of data structures. Tuples and lists have slightly different syntax in Prolog but they’re there. Then there was Tony Hoare’s CSP, Communicating Sequential Processes. Also I’d read about Dijkstra’s guarded commands—that’s why I require that some pattern should always match, there shouldn’t be a default case—you should explicitly require that some branch always match. I think those are the main influences.

Seibel: And where did you get the functional aspect?

Armstrong: Once you’ve added concurrency to Prolog you really just had to make sure it didn’t backtrack after you’d done something. In Prolog you could call something and then backtrack over the solution to basically undo the effect of calling it. So you had to realize if this statement says, “Fire the missiles,” and whoom, off they go, you can’t backtrack over it and reverse that. Pure Prolog programs are reversible. But when you’re interacting with the real world, all the things you do are one way. Having said, fire the missiles, the missiles fire. Having said, “Change the traffic lights from red to green,” they change from red to green and you can’t say, “Oh, that was a bad decision; undo it.”

Now we’ve got a concurrent language and parallel processes and inside these processes we’re doing full Prolog with backtracking and all that kind of stuff. So the Prolog became very deterministic with cuts everywhere to stop it from backtracking.

Seibel: Where the irreversible things would be sending messages to other processes?

Armstrong: Yes. But it’s just a function call and maybe not of the function that fires the rockets but one that calls something else that calls something else that calls it so it’s just a pain kind of trying to keep these two worlds separate. So the code you wrote inside a process became more and more functional, sort of a dialect of Prolog which was a functional subset. And so if it’s a functional subset, might as well make it completely functional.

Seibel: Yet Erlang is pretty different from most functional languages these days in being dynamically typed. Do you feel like part of the functional language community?

Armstrong: Oh yes. When we go to functional programming conferences, I suppose we argue about our differences. We argue about eager evaluation and lazy evaluation. We argue about dynamic type systems and static type systems. But despite everything the central core of functional programming is the idea of nonmutable state—that x isn’t the name of a location in memory; it’s a value. So it can’t change. We say x equals three and you can’t change it thereafter. All these different communities say that has enormous benefits for understanding your program and for parallelizing your program and for debugging your program. Then there are functional languages with dynamic type systems like Erlang and functional languages with static type systems and they’ve both got their good and bad points.

It’d be really nice to have the benefits of a static type system in Erlang. Maybe in certain places we could annotate programs to make the types more explicit so the compiler can derive the types and generate much better code.

Then the static type people say, “Well, we really rather like the benefits of dynamic types when we’re marshaling data structures.” We can’t send an arbitrary program down a wire and reconstruct it at the other end because we need to know the type. And we have—Cardelli called it a system that’s permanently inconsistent. We have systems that are growing and changing all the time, where the parts may be temporarily inconsistent. And as I change the code in a system, it’s not atomic. Some of the nodes change, others don’t. They talk to each other—at certain times they’re consistent. At other times—when we go over a communication boundary—do we trust that the boundary is correct? They might fib. So we need to check certain stuff.

Seibel: So early on you earned your beer by debugging other people’s programs. Why do you think you were such a good debugger?

Armstrong: Well, I enjoyed debugging. At this point in the program you print out a few variables and things to see what’s going on and they’re all according to what you expect. And at this point in the program it’s right. And somewhere later it’s wrong. So you look halfway in between—it’s either right or wrong and you just do this interval halving. Provided you can reproduce an error. Errors that are nonreproducible, that’s pretty difficult to debug. But they weren’t giving me that. They were giving me reproducible errors. So just carry on halving until you find it. You must ultimately find it.
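The interval halving described above can be sketched as a short search over the steps of a program run. This is an illustrative Python sketch, not code from the interview: `first_bad_step` and `is_state_ok` are hypothetical names, and the sketch assumes the error is reproducible and that once the state goes wrong it stays wrong.

```python
def first_bad_step(is_state_ok, good, bad):
    """Return the first step at which is_state_ok(step) is False.

    Assumes is_state_ok(good) is True, is_state_ok(bad) is False, and that
    the state never becomes right again after it has gone wrong.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_state_ok(mid):
            good = mid  # still right here, so the fault lies later
        else:
            bad = mid   # already wrong here, so the fault is at or before mid
    return bad
```

Each probe halves the interval, so a run of n steps needs roughly log2(n) checks, which is why the search "must ultimately find it" as long as the failure can be reproduced.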

Seibel: So do you think you just had a more systematic view?

Armstrong: Yeah, they gave up. I don’t know why—I couldn’t really understand why they couldn’t debug programs. I mean, do you think debugging is difficult? I don’t. You just stop it and slow it down. I mean, I’m just talking about batch Fortran.

OK, debugging real-time systems or garbage collectors—I remember once Erlang crashed—it was early days—and it crashed just after I’d started it. I was just typing something. It had built in sort of Emacsy commands into the shell. And I typed erl to start it and you get into a read-eval-print loop. And I’d typed about four or five characters and made a spelling mistake. And then I backed the cursor a couple of times and corrected it and it crashed with a garbage collection error. And I knew that’s a deep, deep, error. And I thought, “Can I remember exactly what did I type in?” Because it was only about 12 characters or something. I restarted and typed and it didn’t crash. And I sat there for like an hour and a half trying probably a hundred different things. Then it crashed again! Then I wrote it down. Then I could debug it.

Seibel: What are the techniques that you use there? Print statements?

Armstrong: Print statements. The great gods of programming said, “Thou shalt put printf statements in your program at the point where you think it’s gone wrong, recompile, and run it.”

Then there’s—I don’t know if I read it somewhere or if I invented it myself—Joe’s Law of Debugging, which is that all errors will be plus/minus three statements of the place you last changed the program. When I worked at the Swedish Space Corporation my boss was a hardware guy. We were up at Esrange, the rocket-launching site and satellite-tracking station in the north. And one time he was banging his head, debugging some bug in the hardware, plugging in oscilloscopes, and changing things. And I said, “Oh, can I help?” And he said, “No Joe, you can’t help here—this is hardware.” And I said, “Yeah, but it must be like software—the bug will be pretty near to the last change you made to the hardware.” And he went, “I changed a capacitor. You’re a genius!” He’d replaced one capacitor with a bigger capacitor and he unsoldered it and put the original one back and it worked. It’s the same everywhere. You fix your car and it goes wrong—it’s the last thing you did. You changed something—you just have to remember what it was. It’s true with everything.

Seibel: So have you ever proved any of your programs correct? Has that kind of formalism ever appealed to you?

Armstrong: Yes and no. I’ve manipulated programs algebraically to just show that they were equivalent. Haven’t really gone into theorem proving as such. I did a course in denotational semantics and things like that. I remember giving up. The exercise was given: let x = 3 in let y = 4 in x plus y show that the eager evaluation scheme given by the equations foo and the lazy evaluation scheme given by the equations bar, both evaluate to seven.

Fourteen pages of lemmas and things later I thought, “Hang on—x is three, y is four, x plus y; yeah seven.” At the time I was writing the Erlang compiler. If it took lots of pages to prove that three plus four is seven then the proof that my compiler was in any sense correct would have been thousands and thousands of pages.

Seibel: Do you prefer to work alone or on a team?

Armstrong: I like a workplace of teams, if you see what I mean. I’m not antisocial. But I just like programming by myself. Certainly I like collaborating with people in the sense of discussing problems with them. I always thought the coffee break that you have when you got to work and out came all the ideas that you’d had on your walk to work was very valuable. You get a lot of insights then. Good to thrash your ideas out in front of the crowd. You’re put in a position of explaining your ideas which, for me, moves them from one part of my brain to another part. Often when you explain things then you understand them better.

Seibel: Have you ever pair programmed—sat down at a computer and produced code with another person?

Armstrong: Yeah. With Robert, Robert Virding. We would tend to do that when both of us were kind of struggling in the dark. We didn’t really know what we were doing. So if you don’t know what you’re doing then I think it can be very helpful with someone who also doesn’t know what they’re doing. If you have one programmer who’s better than the other one, then there’s probably benefit for the weaker programmer or the less experienced programmer to observe the other one. They’re going to learn something from that. But if the gap’s too great then they won’t learn, they’ll just sit there feeling stupid. When I have done pair programming with programmers about the same ability as me but neither of us knew what we were doing, then it’s been quite fun.

Then there are what I might call special problems. I wouldn’t attempt them if I’ve got a cold or I’m not on good physical form. I know it’s going to take three days to write and I’ll plan a day and not read email and start and it’s gonna be four hours solid. I’ll do it at home so I know I won’t be interrupted. I just want to do it and get into this complete concentrated state where I can do it. I don’t think pair programming would help there. It would be very disruptive.

Seibel: What’s an example of that kind of problem?

Armstrong: Figuring out bits of a garbage collector—it’s the imperative coding—where you’ve got to remember to mark all those registers. Or doing some lambda lifting in the compiler, which is pretty tough—you relabel all the variables and then you’ve got four or five layers of abstract data types all messing around and frames with different stuff in them and you think, “I’ve got to really understand this, really think deeply about it.” You want to concentrate.

I vary the tasks I do according to mood. Sometimes I’m very uninspired so I think to myself, “Ah, who shall I go and disturb now.” Or I’ll read some emails. Other times I feel, right now I’m going to do some hard coding because I’m in the mood for it. You’ve got to be sort of right to do the coding. So how’s that going to work with two people? One of them is just not in a concentrating mode and wants to read his emails and things.

Seibel: You did do a kind of serial pair programming with Robert Virding, when you passed the code back and forth rewriting it each time.

Armstrong: Yeah. One at a time. I would work on the program, typically two or three weeks, and then I’d say, “Well, I’ve had enough, here you are, Robert.” And he’d take it. Every time we did this, it would come back sort of unrecognizable. He would make a large number of changes and it’d come back to me and I’d make a large number of changes.

Seibel: And they were productive changes?

Armstrong: Oh, absolutely. I was delighted if he found better ways of doing things. We both got on very well. He used to generalize. I remember once I found a variable—I followed it round and round through about 45 routines and then, out it came, at the end, never even used. He just passed this variable in and out of 45 different functions. I said, “What’s that for? You don’t use it.” He said, “I know. Reserved for future expansion.” So I removed that.

I would write a specific algorithm removing all things that were not necessary for this program. Whenever I got the program, it became shorter as it became more specific. And whenever Robert took my program it became longer, adding generality. I believe this Unix philosophy—a program should do what it’s supposed to do and nothing else. And Robert’s philosophy is it should be a general program and then the program itself should be a specific case of the general program. So he would add generality and then specialize it.

Seibel: That seems like a pretty deep philosophical divide. Was there any benefit to having the program go through those two extremes?

Armstrong: Oh yes. Every cycle it improved. I think it was a lot better because of that. And probably better than either of us could have done on our own.

Seibel: Can you talk about how you design software? Maybe take the example of something like OTP.

Armstrong: OTP was designed by me and Martin Björklund and Magnus Fröberg. There were just the three of us did the original design. We met every morning at coffee and had a long conversation—about an hour to two hours—and we covered the white board in stuff. I’d take loads of notes—I wrote all the documentation immediately and they wrote all the code. Sometimes I’d write a bit of code as well. And when I was writing the documentation I’d discover, I can’t describe this, we have to change it. Or they would run into me and say, “Nah, it doesn’t work; this idea we had this morning, because of this, this, this, and this it doesn’t work.” At the end of the day we either got to the point where we got all the documentation and all the code or enough of the code and enough of the documentation that we knew it was going to work. And then we called it a day.

Some days it didn’t work so we said, “OK, we’ll do it again tomorrow.” There wasn’t enough time to do a second pass in a day. But about one pass in a day worked fine. Because it gives us about two hours to discuss it in the morning, about two hours to write the documentation or code it up. And if you spent four hours really thinking hard, that’s a good day’s work. So that worked very, very well. I don’t know how long we worked like that for. Ten weeks, twelve weeks, something like that. And then we got the basic framework and then we had more people. We’d specified the architecture—now we could start growing it. We’d get three or four more programmers in.

Seibel: And then how did you divvy up the work for those new folks?

Armstrong: Well, we knew what were prototypes and what were final versions. I’ve always taken the view of system design, you solve the hard problems first. Identify the hard problems and then solve them. And then the easy problems, you know they’ll just come out in the wash. So there’s a bit of experience there in classifying them as easy and hard. I know IP failover or something like that is going to be fairly hard. But I know that parsing a configuration file is going to be easy. In the prototype you might just have a configuration file that you read. You don’t syntax check it—you don’t have a grammar. In the production version you might do it in XML and have a complete grammar and validate it. But you know that that’s a mechanical step to do that. It will take a competent programmer several weeks, or whatever time it takes. But it’s doable, it’s predictable in time, and there shouldn’t be any nasty surprises on the way. But getting the communication protocols right, and getting them working properly when things fail, that I would do in a small group.

Seibel: So in this case you wrote the documentation before, or at least while, the code was being written. Is that how you usually do it?

Armstrong: It depends on the difficulty of the problem. I think with very difficult problems I quite often start right by writing the documentation. The more difficult it is, the more likely I am to document it first.

I like documentation. I don’t think a program is finished until you’ve written some reasonable documentation. And I quite like a specification. I think it’s unprofessional these people who say, “What does it do? Read the code.” The code shows me what it does. It doesn’t show me what it’s supposed to do. I think the code is the answer to a problem. If you don’t have the spec or you don’t have any documentation, you have to guess what the problem is from the answer. You might guess wrong. I want to be told what the problem is.

Seibel: Is the documentation you write at this stage internal documentation that another programmer would read or documentation for the user?

Armstrong: It’s for user guides. It sort of switches me into a different mode of thinking. I just start, in order to do this, create a directory called that, put this file in there, rename this as that and that is guiding the structure. I’ve sort of pondered the question. I bet Knuth would say, “Well, all programs are literate programs.” You don’t write the code and then write the documentation. You write both at the same time, so it’s a literate program. I’m not there. I don’t think that. I don’t know if his view is because he publishes his programs.

I don’t know if it’s a left-brain/right-brain shift, or what it is, but when you write the documentation you think about the program differently to when you write the code. So I guess writing literate programs forces that shift as you’re doing it. Which might be very productive. I did do some literate Erlang though I haven’t actually used it for a very long time. So that’s an interesting idea—perhaps I should wake it up again and write some stuff using literate Erlang. I’m not against the idea but I’m sort of impatient and wanted to write the code and not the documentation. But if you really want to understand it then I think writing the documentation is an essential step.

If I were programming Haskell, I would be forced to think about the types pretty early and document them and write them down. If you’re programming in Lisp or Erlang you can start writing the code and you haven’t really thought about the types. And in a way, writing the documentation is thinking about the types in a way. I suppose you start off with “is a”. You say, “A melody is a sequence of notes.” Right. OK. A melody is a sequence of chords where each chord is a parallel composition of notes of the same duration. Just by defining terms in your documentation—a something is a something—you’re doing a sort of type analysis and you’re thinking declaratively about what the data structures are.
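The "is a" definitions in the melody example translate almost word for word into type declarations. The following Python sketch is one possible reading, not anything from the interview: the names `Note`, `Chord`, and `Melody`, the string encoding of pitch, and the use of beats for duration are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Note:
    pitch: str       # e.g. "C4"; the pitch encoding is an assumption
    duration: float  # in beats; the unit is an assumption


@dataclass(frozen=True)
class Chord:
    """A chord is a parallel composition of notes of the same duration."""
    notes: Tuple[Note, ...]

    def __post_init__(self):
        # Reject a chord whose notes do not all share one duration.
        if len({note.duration for note in self.notes}) > 1:
            raise ValueError("all notes in a chord must share one duration")


# A melody is a sequence of chords.
Melody = List[Chord]
```

Writing the sentence "each chord is a parallel composition of notes of the same duration" forces the same-duration rule into the open, where it becomes a check in the constructor rather than an unstated assumption.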

Seibel: Do you think overall programming languages are getting better? Are we on a trajectory where we learn enough lessons from the past and come up with enough new ideas?

Armstrong: Yes. The new languages are good. Haskell and things like that. Erlang. Then there are some funny languages that should really be used. Prolog is a beautiful language but not widely used. It sort of peaked; Kowalski called it a solution looking for a problem.

Seibel: Dan Ingalls mentioned Prolog as an example of the kind of idea that we should really revisit now that we’ve had a couple decades of Moore’s Law.

Armstrong: Prolog is so different to all the other programming languages. It’s just this amazing way of thinking. And it’s not appropriate to all problems. But it is appropriate to an extremely large set of problems. It’s not widely used. And it’s a great shame because programs are incredibly short. I think I went into shock when I wrote my first Prolog program. It’s a kind of shocking experience. You just walk around going, where’s the program—I haven’t written a program. You just told it a few facts about the system, about your problem. Here it is figuring out what to do. It’s wonderful. I should go back to Prolog—drop Erlang.

Seibel: Are there other skills that are not directly related to programming that you feel have improved your programming or that are valuable to have as a programmer?

Armstrong: Writing is. There’s some computer scientist that said, “Oh, if you’re no good at English you’ll never be a very good programmer.”

Seibel: I think Dijkstra had something about that.

Armstrong: I’ve occasionally been asked to advise people at universities on choice of syllabus subjects for computer science courses, being as how I work for industry—what does industry want? And I say, “Well, turn ’em out being able to write and argue cogently.” Most graduates who come out, and they’ve got degrees in computer science, writing’s not their strong point.

I think it’s actually very difficult to teach because it’s very individual. Somebody’s got to take your text and a red pen and explain to you what you did wrong. And that’s very time consuming. Have you ever read Hamming’s advice to young researchers?

Seibel: “You and Your Research”?

Armstrong: He says things like, “Do good stuff.” He says, “If you don’t do good stuff, in good areas, it doesn’t matter what you do.” And Hamming said, “I always spend a day a week learning new stuff. That means I spend 20 percent more of my time than my colleagues learning new stuff. Now 20 percent at compound interest means that after four and a half years I will know twice as much as them. And because of compound interest, this 20 percent extra, one day a week, after five years I will know three times as much,” or whatever the figures are. And I think that’s very true. Because I do research I don’t spend 20 percent of my time thinking about new stuff, I spend 40 percent of my time thinking about new stuff. And I’ve done it for 30 years. So I’ve noticed that I know a lot of stuff. When I get pulled in as a troubleshooter, boom, do it that way, do it that way. You were asking earlier what should one do to become a better programmer? Spend 20 percent of your time learning stuff—because it’s compounded. Read Hamming’s paper. It’s good. Very good.

Seibel: Do you find some code beautiful?

Armstrong: Yes. Why this is I don’t know. The funny thing is, if you give two programmers the same problem—it depends on the problem, but problems of a more mathematical nature, they can often end up writing the same code. Subject to just formatting issues and relabeling the variables and the function names, it’s isomorphic—it’s exactly the same algorithms. Are we creating these things or are we just pulling the cobwebs off? It’s like a statue that’s there and we’re pulling the cobwebs off and revealing the algorithm that’s always been there. So are we inventing a new algorithm or are we inventing a structure that already exists? Some algorithms feel like that. I think it’s more the mathematical algorithms. I don’t get that feeling when I’m implementing a telephony protocol or something. That’s not a statue that I’m pulling the cobwebs off.

Seibel: So that’s similar to the beauty of math, because it’s part of nature. Then there are other levels at which code sort of has an aesthetic.

Armstrong: Yeah. It’s kind of feng shui. I like minimalistic code, very beautifully poised, structured code. If you start removing things, if you get to the point where if you were to remove anything more it would not work any more—at this point it is beautiful. Where every change that you could conceivably make, makes it a worse algorithm, at that point it becomes beautiful.

Seibel: You mentioned that when you and Robert Virding were passing the code back and forth how each of you changed the low-level details of formatting, stuff that programmers argue endlessly about.

Armstrong: That’s not affecting the beauty of the algorithm.

Seibel: But it’s part of the aesthetic. It’s people’s taste.

Armstrong: Yeah. But I wouldn’t say, “This is ugly code because there’s a blank after the comma.” Ugly is when it’s done with a linear search and it could have been done with a binary interval halving. Or it could have been done logarithmically and it’s done linearly. For the wrong reasons. Sure, do it linearly if we know we’re searching through a list of ten elements, who cares? But if it’s a big data structure then it should have been done with a binary search. And so it’s really not very pretty to do it in a linear form. The mathematical algorithms—that’s like Platonic beauty. This is more like architecture. You admire a fine building—it’s not a mathematical object. Not a solid or a sphere or a prism—it’s a skyscraper. It looks nice.
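The linear-versus-logarithmic contrast can be made concrete by counting comparisons. This is a minimal sketch in Python, not code from the interview; both function names are hypothetical, and the binary version assumes the list is already sorted.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 if target is absent."""
    for i, value in enumerate(items):
        if value == target:
            return i, i + 1
    return -1, len(items)

def binary_search(sorted_items, target):
    """Interval halving on a sorted list; returns (index, comparisons)."""
    lo, hi, comparisons = 0, len(sorted_items), 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_items[mid] < target:
            lo = mid + 1          # target can only be in the upper half
        else:
            hi = mid              # target is at mid or in the lower half
    found = lo < len(sorted_items) and sorted_items[lo] == target
    return (lo if found else -1), comparisons

if __name__ == "__main__":
    data = list(range(1_000_000))
    print(linear_search(data, 999_999))
    print(binary_search(data, 999_999))
```

For the last element of a million-entry list, the linear search makes a million comparisons while the halving search needs at most twenty. On a list of ten elements the difference is negligible, which is the exception Armstrong allows.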

Seibel: What makes a good programmer? If you are hiring programmers—what do you look for?

Armstrong: Choice of problem, I think. Are you driven by the problems or by the solutions? I tend to favor the people who say, “I’ve got this really interesting problem.” Then you ask, “What was the most fun project you ever wrote; show me the code for this stuff. How would you solve this problem?” I’m not so hung up on what they know about language X or Y. From what I’ve seen of programmers, they’re either good at all languages or good at none. The guy who’s a good C programmer will be good at Erlang—it’s an incredibly good predictor. I have seen exceptions to that but the mental skills necessary to be good at one language seem to convert to other languages.

Seibel: Some companies are famous for using logic puzzles during interviews. Do you ask people that kind of question in interviews?

Armstrong: No. Some very good programmers are kind of slow at that kind of stuff. One of the guys who worked on Erlang, he got a PhD in math, and the only analogy I have of him, it’s like a diamond drill drilling a hole through granite. I remember he had the flu so he took the Erlang listings home. And then he came in and he wrote an atom in an Erlang program and he said, “This will put the emulator into an infinite loop.” He found the initial hash value of this atom was exactly zero and we took something mod something to get the next value which also turned out to be zero. So he reverse engineered the hash algorithm for a pathological case. He didn’t even execute the programs to see if they were going to work; he read the programs. But he didn’t do it quickly. He read them rather slowly. I don’t know how good he would have been at these quick mental things.
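The transcript does not give the emulator's actual hash function, so the following is not a reconstruction of it. It is a toy double-hashing probe, with a hypothetical `probe_sequence` function, that shows the general kind of failure described: when both the starting slot and the step derived by a modulo come out as zero, the probe never leaves the first slot.

```python
def probe_sequence(hash_value: int, table_size: int, max_probes: int = 8):
    """Slots visited by a toy double-hashing probe (hypothetical scheme).

    The step is derived from the hash with a modulo; nothing guards
    against a step of zero, which is the bug being illustrated.
    """
    slot = hash_value % table_size
    step = hash_value % (table_size - 1)   # zero when hash_value is 0
    visited = []
    for _ in range(max_probes):            # cap so the demo terminates
        visited.append(slot)
        slot = (slot + step) % table_size
    return visited

if __name__ == "__main__":
    print(probe_sequence(10, 7))  # step is non-zero: the probe moves on
    print(probe_sequence(0, 7))   # step is zero: stuck on one slot
```

Without the `max_probes` cap, a lookup that finds slot 0 occupied by a different key would spin forever. A bug like this only shows up for one particular input, which is why it takes careful reading rather than quick testing to find.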

Seibel: Are there any other characteristics of good programmers?

Armstrong: I read somewhere, that you have to have a good memory to be a reasonable programmer. I believe that to be true.

Seibel: Bill Gates once claimed that he could still go to a blackboard and write out big chunks of the code to the BASIC that he had written for the Altair, a decade or so after he had originally written it. Do you think you can remember your old code that way?

Armstrong: Yeah. Well, I could reconstruct something. Sometimes I’ve just completely lost some old code and it doesn’t worry me in the slightest. I haven’t got a listing or anything; just type it in again. It would be logically equivalent. Some of the variable names would change and the ordering of the functions in the file would change and the names of the functions would change. But it would be almost isomorphic. Or what I would type in would be an improved version because my brain had worked at it.

Take the pattern matching in the compiler which I wrote ten years ago. I could sit down and type that in. It would be different to the original version but it’d be an improved version if I did it from memory. Because it sort of improves itself while you’re not doing anything. But it’d probably have a pretty similar structure.

I’m not worried about losing code or anything like that. It’s these patterns in your head that you remember. Well, I can’t even say you remember them. You can do it again. It’s not so much remembering. When I say you can remember a program exactly, I don’t think that it’s actually remembering. But you can do it again. If Bill could remember the actual text, I can’t do that. But I can certainly remember the structure for quite a long time.

Seibel: Is Erlang-style message passing a silver bullet for slaying the problem of concurrent programming?

Armstrong: Oh, it’s not. It’s an improvement. It’s a lot better than shared memory programming. I think that’s the one thing Erlang has done—it has actually demonstrated that. When we first did Erlang and we went to conferences and said, “You should copy all your data.” And I think they accepted the arguments over fault tolerance—the reason you copy all your data is to make the system fault tolerant. They said, “It’ll be terribly inefficient if you do that,” and we said, “Yeah, it will but it’ll be fault tolerant.”

The thing that is surprising is that it’s more efficient in certain circumstances. What we did for the reasons of fault tolerance, turned out to be, in many circumstances, just as efficient or even more efficient than sharing.

Then we asked the question, “Why is that?” Because it increased the concurrency. When you’re sharing, you’ve got to lock your data when you access it. And you’ve forgotten about the cost of the locks. And maybe the amount of data you’re copying isn’t that big. If the amount of data you’re copying is pretty small and if you’re doing lots of updates and accesses and lots of locks, suddenly it’s not so bad to copy everything. And then on the multicores, if you’ve got the old sharing model, the locks can stop all the cores. You’ve got a thousand-core CPU and one program does a global lock—all the thousand cores have got to stop.
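The two styles being compared can be sketched side by side. Python threads stand in for Erlang processes here, which is an assumption made for illustration only; this is not how Erlang is implemented, and because of Python's global interpreter lock the sketch shows structure, not performance. The function names are hypothetical.

```python
import copy
import queue
import threading

def _run(worker, n_workers):
    """Start n_workers threads running `worker` and wait for all of them."""
    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def shared_with_lock(n_workers: int, updates: int) -> int:
    """Every worker mutates one shared dict, so every update takes the lock."""
    state = {"total": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(updates):
            with lock:                     # all workers contend here
                state["total"] += 1

    _run(worker, n_workers)
    return state["total"]

def copy_and_send(n_workers: int, updates: int) -> int:
    """Workers keep private state and send a copy of it to one owner."""
    mailbox = queue.Queue()

    def worker():
        local = {"total": 0}               # private: no lock needed
        for _ in range(updates):
            local["total"] += 1
        mailbox.put(copy.deepcopy(local))  # the message is a copy

    _run(worker, n_workers)
    return sum(mailbox.get()["total"] for _ in range(n_workers))

if __name__ == "__main__":
    print(shared_with_lock(4, 1000), copy_and_send(4, 1000))
```

Both versions compute the same total. The shared version acquires the lock once per update, while the copying version synchronizes only once per worker, when the small message is handed over (the queue does its own locking internally at that point). That is the trade-off described above: many cheap-looking lock acquisitions against a few small copies.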

I’m also very skeptical about implicit parallelism. Your programming language can have parallel constructs but if it doesn’t map into hardware that’s parallel, if it’s just being emulated by your programming system, it’s not a benefit. So there are three types of hardware parallelism.

There’s pipeline parallelism—so you make a deeper pipeline in the chip so you can do things in parallel. Well, that’s once and for all when you design the chip. A normal programmer can’t do anything about the instruction-level parallelism.

There’s data parallelism, which is not really parallelism but it has to do with cache behavior. If you want to make a C program go efficiently, if *p is on a 16-byte boundary, if you access *p, then the access to *(p + 1) is free, basically, because the cache line pulls it in. Then you need to worry about how wide the cache lines are—how many bytes do you pull in in one cache transfer? That’s data parallelism, which the programmer can use by being very careful about their structures and knowing exactly how it’s laid out in memory. Messy stuff—you don’t really want to do that.

The other source of real concurrency in the chip are multicores. There’ll be 32 cores by the end of the decade and a million cores by 2019 or whatever. So you have to take the granules of concurrency in your program and map them onto the cores of the computer. Of course that’s quite a heavyweight operation. Starting a computation on a different core and getting the answer back is itself something that takes time. So if you’re just adding two numbers together, it’s just not worth the effort—you’re spending more effort in moving it to another core and doing it and getting the answer back than you are in doing it in place.
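The granularity argument can be written as a small cost model. This is a sketch under stated assumptions: two independent tasks of equal cost, a single fixed `dispatch_cost` covering both sending the job to another core and getting the answer back, and arbitrary time units. The function name and the numbers in the usage lines are hypothetical.

```python
def parallel_pays_off(work_cost: float, dispatch_cost: float) -> bool:
    """Two independent tasks of equal cost: run both here, or ship one away?

    Sequential time is 2 * work_cost. With a second core the local task
    overlaps the remote one, which additionally pays dispatch_cost
    (send the job plus receive the answer).
    """
    sequential = 2 * work_cost
    parallel = max(work_cost, dispatch_cost + work_cost)
    return parallel < sequential

if __name__ == "__main__":
    # Hypothetical figures: a tiny task versus a substantial one.
    print(parallel_pays_off(work_cost=1, dispatch_cost=1000))
    print(parallel_pays_off(work_cost=1_000_000, dispatch_cost=1000))
```

In this model, moving work to another core helps only when the task costs more than the round trip, so an addition stays in place and a large computation is worth shipping.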

Erlang’s quite well suited there because the programmer has said, “I want a process, I want another process, I want another process.” Then we just put them on the cores. And maybe we should be thinking about actually physically placing them on cores. Probably a process that spawns another process talks to that process. So if we put it on a core that’s physically near, that’s a good place to put it, not on one that’s a long way away. And maybe if we know it’s not going to talk to it a lot maybe we can put it a long way away. And maybe processes that do I/O should be near the edge of the chip—the ones that talk to the I/O processes. As the chips get bigger we’re going to have to think about how getting data to the middle of the chip is going to cost more than getting it to the edge of the chip. Maybe you’ve got two or three servers and a database and maybe you’re going to map this onto the cores so we’ll put the database in the middle of the chip and these ones talk to the client so we’ll put them near the edge of the chip. I don’t know—this is research.

Seibel: You care a lot about the idea of Erlang’s way of doing concurrency. Do you care more about that idea—the message-passing shared-nothing concurrency—or Erlang the language?

Armstrong: The idea—absolutely. People keep on asking me, “What will happen to Erlang? Will it be a popular language?” I don’t know. I think it’s already been influential. It might end up like Smalltalk. I think Smalltalk’s very, very influential and loved by an enthusiastic band of people but never really very widely adopted. And I think Erlang might be like that. It might need Microsoft to take some of its ideas and put some curly braces here and there and shove it out in the Common Language Runtime to hit a mass market.
