Book: Coders at Work: Reflections on the Craft of Programming

Bernie Cosell

In 1969 when the first two nodes of the ARPANET—the network that would become the core of the Internet—came on line, every packet that flowed over 50 kilobit/second leased lines was routed through two specialized computers called Interface Message Processors, or IMPs. The IMPs were designed and built by Bolt Beranek and Newman (BBN), and the software that ran the IMPs had been written by a team of three programmers, one of whom was Bernie Cosell, who had left MIT three years before, at the beginning of his junior year, to join BBN.

Originally hired as an application programmer on a project building one of the earliest timesharing systems, Cosell quickly moved to the systems programming side of things and was soon “czar of the PDP-1 timesharing system,” responsible for finishing the operating-system code and keeping the system running.

Over a 26-year career at BBN, Cosell would work on a little bit of everything, earning a reputation within BBN as a master debugger and “fixer” who could be thrown onto a struggling project to make the software work. And he hacked just for fun: to hone his Lisp skills he wrote DOCTOR, a version of Joseph Weizenbaum’s ELIZA, based on Weizenbaum’s description in a journal article. Written in BBN-LISP, which spread around the ARPANET along with the TENEX operating system, Cosell’s version of DOCTOR also had a wide distribution—wider than Weizenbaum’s original—inspiring new implementations and related programs.

In 1991 Cosell left BBN and bought a sheep farm in Virginia, where he now lives with his wife Lynn, three dogs, innumerable cats, and lots of sheep. He does some programming for a local ISP, hacks a bit on his own projects, and teaches a few courses in programming and computer security but is glad he no longer works as a full-time programmer. Ironically, as a result of his move to the country, Cosell—one of the fathers of the Internet—now has only dial-up access from his home.

In this interview we talked about how he won his reputation as a master debugger, the importance of writing clear code, and how he convinced the other programmers on the IMP project to stop patching the binary.

Seibel: When did you first get involved with programming?

Cosell: In high school. I don’t know if it was true or not but the rumor was that our high school was the first high school in the country to actually have its own computer. IBM donated a 1620 to our high school. I think it arrived either the year before I arrived, or the year I arrived at high school in ’59.

Seibel: And what high school was it?

Cosell: Bronx High School of Science in New York. I believe the previous generation of students were using Columbia University’s 650. But the head of the math department was very pleased that he had his own computer. In fact, he was writing a book on programming and this was back when there weren’t many books on programming. I ended up debugging all of his examples. Almost the only thing I remember about high school is learning to program.

Seibel: What were you programming then? Assembly on punch cards?

Cosell: Yeah. Well, it was punch cards but the 1620 also had a console. It had an IBM Selectric typewriter that was the input/output console, and you could input programs from that. To show you the era it was, they chose not to put arithmetic hardware in it. It had table-lookup arithmetic: there was an area of memory and when you wanted to do an addition, one digit gave you the row, one digit gave you the column, and the value was there. And part of every program was loading that part of memory with the addition and multiplication tables.
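
A minimal sketch in C of that table-lookup idea; the table layout and names are invented for illustration and are not the 1620’s actual addressing or digit format:

    #include <stdio.h>

    /* Table-lookup arithmetic: instead of adder hardware, a 10x10 table in
       memory holds the sum of every pair of decimal digits. Loading tables
       like this was part of every program. */
    static int add_table[10][10];

    static void load_add_table(void)
    {
        for (int row = 0; row < 10; row++)
            for (int col = 0; col < 10; col++)
                add_table[row][col] = row + col;   /* 0..18: sum digit plus carry */
    }

    /* Add two digits purely by lookup: one digit picks the row, the other
       picks the column. */
    static int add_digits(int a, int b, int *carry)
    {
        int v = add_table[a][b];
        *carry = v / 10;
        return v % 10;
    }

    int main(void)
    {
        load_add_table();
        int carry;
        int digit = add_digits(7, 8, &carry);
        printf("7 + 8 -> digit %d, carry %d\n", digit, carry);  /* digit 5, carry 1 */
        return 0;
    }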

So you could actually type from the typewriter but mostly we punched cards and loaded them in. There was a Fortran for it but I never did very much Fortran. Mostly, I programmed in 1620 assembler.

The other thing I learned in high school is how to wire plug boards. Someplace along the route, we had something like an old 403 calculating printer and I learned how to wire a plug board. It was such a primitive art, even at the time, but it turned out to be useful. At BBN, like ten years after I was in high school, we actually needed somebody to wire a plug board and I just said, “Oh, give me the manual on that thing.” And I read the manual and made an old standalone accounting-machine printer do a primitive protocol to serve as a line printer on our PDP-1.

Seibel: And in between high school and BBN you went to MIT?

Cosell: I graduated from high school and entered MIT in ’63. I was a solid math major at MIT, taking an odd computer course. Computers were still an occasional class taught out of the electrical engineering department; you couldn’t major in computer science. Folks were just starting to build the first time-sharing systems on the 709 or 7094, whatever they had at the computer center, but I was pretty busy doing math.

I took some EE courses and logic courses and I took the odd computer course and seemed to be OK at it. I didn’t understand what the really good programmers did because I was just a little kid. But I seemed to be able to program.

I did fall in with a group called the Tech Model Railroad Club. I really thought that was great. Relay logic was right up my alley. They had a railroad layout completely done with relay logic and stepping switches. Through that, I got slightly in touch with the people at RLE—Research Lab of Electronics. This was still in the era where we spent all of our time in the basement of Building 26 typing up punch cards on the keypunch, which we would then hand to the shaman, who would give us listings back the next day. Then I started hanging out at Project MAC. Basically, when I was supposed to have been doing lots of math lessons, I discovered I was spending more and more time hanging out in the computer places.

And after RLE, I went over to Tech Square. I met people like Richard Greenblatt and Bill Gosper. But I was just drifting through that world; I don’t think I was doing much programming. Like I remember how I got involved with Project MAC: I was really taken by Spacewar! on the PDP-1. But I didn’t approach it as a hacker or a programmer—“Let me see the source code. How did you do that?” I just thought the game was the neatest thing. I was just a gamer at that point, as opposed to a programmer, and I had heard that the guys over at Project MAC had done a super version of Spacewar!, that they had fancy consoles, and they had a spare PDP, so I wandered up there. So I got to meet Peter Samson in his great failed attempt to solve the New York City subway system, to ride the whole system on one ticket as fast as possible.

I was probably a sophomore, deeply entrenched in the usual sophomore things, watching all of these guys who were clearly adept and clearly knew what they were doing. I was writing little programs to solve a maze. The frog had to hop from lily pad to lily pad and get out of the middle of the pond. I remember writing that program and helping other students from my dorm get theirs working. But that’s where I was at. I had no clue what happened after I handed my deck in.

As I look back, I would say that at that point, I was learning the craft of programming. I could sort of make computers do what I wanted. But the light hadn’t gone off. I hadn’t internalized it; I didn’t really understand what was happening. It was all a little bit magical and strange. And that was how I was drifting through college. The thing that really made me a programmer was going to work at BBN.

One of the guys I had met at college, who had graduated and worked at BBN, said, “Come out here.” He took me out one night in the middle of the night because BBN was a 24-hour-a-day, 7-day-a-week weird place. It was sort of an extension of the MIT labs. People could come and go at all hours. And he was part of the night crew. So we went out one evening. It was all too mysterious and marvelous to understand; I just had no clue what he was showing me. Not long after that, he suggested that they hire me. And so they had me out, interviewed me, and hired me.

Seibel: This was when you were three years into MIT?

Cosell: Correct. In September of my junior year they hired me part-time. I believe I made it until October before I dropped out and went to work full-time at BBN.

In retrospect, I wasn’t very good. I had seen a PDP-1 but I had no idea how to program one. I didn’t know anything about time-sharing. That, of course, was not surprising, since there were probably maybe 50 people on the planet who knew what time-sharing was.

But BBN was working on a project with Massachusetts General Hospital to experiment with automating hospitals and I got brought onto that project. I started out as an application programmer because that was all I was good for. I think I spent about three weeks as an application programmer. I quickly became a systems programmer, working on the libraries that they were using. And not long after that, the two systems gurus, the guys that had written much of that PDP-1 time-sharing system, took me under their wing and designated me their heir apparent. That winter they both left BBN to go back to grad school. By January I was the czar of the PDP-1 timesharing system—I was responsible for the whole mess.

But in that little interval, a whole series of lightbulbs lit up. All of a sudden, I understood time-sharing. I understood real-time systems. Once I understood it, I absorbed the time-sharing system. And everything’s been downhill for me after that.

The project was quite ambitious for its time. The idea was that there would be a Model 33 Teletype—noisy and clunky and uppercase only—on each ward. There would be a Model 33 Teletype in each doctor’s office. There would be a Model 33 Teletype in the pharmacy. And there would be, I guess, a Model 33 Teletype in the admissions office. And our little timesharing system was going to coordinate all of that.

When a patient got in, they would be assigned a bed. The doctor would schedule lab tests. At which point, the nurse’s Teletype would say, “Take these samples. Put this number on it.” The lab would get a message saying, “Run these tests.” If the doctor prescribed something, the pharmacy would be told and the cart would be ready.

It was amazing to have those little noisy, silly things on the wards. Having that level of professionals dealing with these clunky things was really pretty offensive, so there was a lot of resistance. But I was sort of immune to all of that, because I had gravitated off to the systems part of the world.

And I had decided it was really important that the system not stop. I don’t know if they told me that or not, but I decided that we had to prove—I had to prove—that time-sharing could work. That it was a good enough and solid enough thing that you would consider running a hospital with it. I thought about what happened if a patient needed medication and the system crashed. Or worse, what if the system lost the prescription and the patient never got dosed? Or what if the system juggled prescriptions and the nurses had actually started trusting the system? So I started thinking the system should not crash. This system should be as good as Unix 30 years later.

But there was no real-time debugging. When the system crashed, basically the run light went out and that was it. You had control-panel switches where you could read and write memory. The only way to debug the system was to say, “What was the system doing when it crashed?” You don’t get to run a program; you get to look at the table that kept track of what it was doing. So I got to look at memory, keeping track on pieces of graph paper of what it was doing. And I got better at that.

In retrospect, I got scarily better at that. So they had me have a pager. This was back in the era when pagers were sort of cool and only doctors had them. It was a big, clunky thing and all it would do is beep. No two-way. No messages. And it only worked in the Boston area, because its transmitter was on top of the Prudential Center. But if I was within 50 miles of Boston, it worked.

And basically, I was a trained little robot: when my pager went beep, beep, beep, I called in to find out what the problem was. What was bizarre was that with no paper, in a parking lot, on a pay phone I could have them examining octal locations, changing octal locations and then I would say, “OK, put this address in and hit run,” and the system would come back up. I don’t know how the hell I managed to do that. But I could do those kinds of things. I took care of the time-sharing system for probably a good two or three years.

Seibel: At this point, you had presumably written a lot of the code despite having originally inherited the system.

Cosell: Yeah. The operating system was not done when I got it. It was buggy and there were pieces of it that were not finished when Steve Weiss and Bob Morgan went off to grad school. I did something that they hadn’t done—it was one of the things that I got known for around BBN, which is, I made things work.

I really believed that computers were deterministic, that you could understand what they were supposed to do, and that there was no excuse for computers not working, for things not functioning properly. In retrospect, I was surprisingly good at keeping the system running, putting in new code and having it not break the system.

That was the first instance of something I got an undeserved reputation for. I know that my boss, and probably some other of my colleagues, have said I was a great debugger. And that’s partly true. But there’s a fake in there.

Really what I was was a very careful programmer with the arrogance to believe that very few computer programs are inherently difficult. I would take some piece of code that didn’t look like it was working and I would try to read it. And if I could understand, then I could usually see what was wrong or poke around with it and fix it. But sometimes I would get a piece of code—often one that other people couldn’t make work—and I would say, “This is way too complicated.”

So I would think through what it was supposed to do, throw it away, and write it again from scratch. Some of the folks I worked with—like Will Crowther—who are terrific programmers, couldn’t tolerate that. They would believe that by doing that, I would probably have fixed the 2 bugs that were there and introduced 27 new bugs. But the fact is, I was good at that. So I would rewrite stuff completely and it would be organized differently than the original programmer had organized it because I had thought about the problem differently. Typically, it was simpler than it used to be, or at least simpler to my eyes. And it would work.

So I got this reputation—I fixed these mysterious bugs that nobody else could fix. Fortunately, they never asked me what the bug was. Because the truth of the matter is if they’d have asked, “How did you fix the bug?” my answer would have been, “I couldn’t understand the code well enough to figure out what it was doing, so I rewrote it.”

I did that a lot on the PDP-1 time-sharing system. There were chunks of the code that I would read and would say, “This doesn’t do what I think this part of the program is supposed to be doing,” or “It’s weird.” So I’d rewrite it. The only thing that kept me working there, with that attitude, was that I had a good track record. That’s one of the things, that if you’re not good at it, you make chaos. But if you are good at it, the world thinks that you can do things that you can’t, really.

Seibel: When you left MIT, was it a hard decision at all, deciding to drop out of school?

Cosell: No. In retrospect, it was surprisingly easy. I was hating school. It was making me crazy. MIT is really a pressure-filled place. And BBN was like the Promised Land. It was wonderful. They played with computers; the company was so laid back. It was more like Project MAC than Project MAC. This was back in an era when people routinely brought their dogs in with them. So pets were padding up and down the halls; people were working day and night.

I started working part-time because I almost always had a part-time job while I was at MIT. And it just instantly felt like home. I couldn’t believe it. My MIT stuff went completely to hell so I dropped out of school and went to full-time. Then I got settled in at BBN and was much more mellow and got my head in a better place. So the following fall, which would have been my senior year, I actually re-enrolled back at MIT. And I got back in again. So that all worked out.

Seibel: Did you feel like your MIT education was a good complement to your work experience?

Cosell: The programming courses that I took when I was an MIT undergraduate stood me in good stead in some abstract way, but didn’t actually teach me very much. Mostly what did was the environment at BBN. Nobody, other than maybe Steve Weiss, was really mentoring me, but I was sucking what I needed to know from everybody.

Seibel: Obviously there were fewer computer books available then than there are now, but are there books that you found particularly useful or books that you think programmers should read?

Cosell: Hard for me to say what programmers should do now. There was certainly nothing I can remember from back then in terms of how to program. The closest was when I got my copy of Knuth’s The Art of Computer Programming and sort of digested the volumes from cover to cover. But I would hardly recommend that as a tutorial text for people.

Seibel: You read Knuth straight through?

Cosell: Oh, it was hot stuff. I was in my prime back then. So each volume as it came out we mostly read and sucked into our heads cover to cover.

Seibel: That requires a fair bit of mathematical sophistication. Do you think most programmers need to be able to read Knuth cover to cover like that?

Cosell: I brought up Knuth as an example. I would not teach students Knuth per se for two reasons. First, it’s got all this mathematical stuff where he’s not just trying to present the algorithms but to derive whether they’re good or bad. I’m not sure you need that. I understand a little bit of it and I’m not sure I need any of it. But getting a feel for what’s fast and what’s slow and when, that’s an important thing to do even if you don’t know how much faster or how much slower.

The second problem is once students get sensitive to that, they get too clever by half. They start optimizing little parts of the program because, “This is the ideal place to do an AB unbalanced 2-3 double reverse backward pointer cube thing and I always wanted to write one of those.” So they spend a week or two tuning an obscure part of a program that doesn’t need anything, which is now more complicated and didn’t make the program any better. So they need a tempered understanding that there are all these algorithms, how they work, and how to apply them. It’s really more of a case of how to pick the right one for the job you’re trying to do as opposed to knowing that this one is an order n-cubed plus three and this one is just order n-squared times four.

If they’re interested in that it’s nice to know that Knuth is there, but no ordinary person needs to know that. But they need to know the wisdom in there. They need to know the data structures. They need to not be stunned when they see me building linked lists in Perl. When you know all of those data structures you can pick the right one. You don’t have to pick the fastest one. You don’t have to pick the one that’s cutest to implement. You can actually pick the one that best serves your data because you know the alternatives. Don’t tell Don that I fought through but didn’t have a lot of use for a lot of the gruesome numerical calculations he did to reduce those combinatorics. But boy, did I learn a lot about data structures, and that was good stuff.

Seibel: Do you have any advice for the many programmers who are self-taught?

Cosell: Write a lot of programs. That’s certainly what worked for me. Looking at the various courses I’ve taken, writing programs is what really did it. Not programming just to while away the hours but specifically, “I ought to learn something about this; why don’t I try writing a little program to do it?” That really does it.

You can’t see how these things work and how they interact until you’ve done it some. You don’t know what programming practices are dangerous until you’ve seen which ones make your programs take weeks to debug and then seen a good programmer fix it in five minutes. I don’t think you can get that from classes. Classes can give you a lot of stuff, but in the end programming is a craft you have to perfect by plying it.

If you’re lucky, you can do it at work. But even in a work environment, where you’re learning on the job, I think that to really be good you have to learn faster than your job will make you learn things. You have to supplement what your job is asking you to do. If your job requires that you do a Tcl thing, just learning enough Tcl to build the interface for the job is barely adequate. The right thing is, that weekend start hacking up some Tcl things so that by Monday morning you’re pretty well versed in the mechanics of it.

Seibel: How much of your own programming did you do for fun versus consciously doing things to learn particular techniques?

Cosell: Mostly I viewed computer programming as a means to get neat things done and I learned how to program in order to make things happen. There were things that seemed broken to me that I could fix. I thought it would be fun to do some Lisp programming not because I wanted to learn Lisp but because some of my friends across the bridge were big Lisp guys and it was all a little mysterious to me. So I wrote some programs and that just seemed like the natural thing for me to do as opposed to sitting at Dan Murphy’s knee and having him give me lectures on CONS and CDR and CAR.

Seibel: Are there areas in formal computer science that you think are particularly useful for people who ultimately want to work as programmers?

Cosell: There are a bunch of things. I know a lot of schools do a terrible job of it, but I think one is getting a good course in object-oriented programming in its abstract form. One of the things I fought about with some folks at a local college here was teaching object-oriented programming using C++. I asked how they make sure their students understand the distinctions between the philosophical concept of object-oriented programming versus the idiosyncrasies and weirdnesses of C++’s implementation of it.

One other thing I think schools can do is the stuff that’s in Knuth. I’m surrounded by people who think linked lists are magic. They don’t know anything about the 83 different kinds of trees and why some are better than others. They don’t understand about garbage collection. They don’t understand about structures and things.

Then the next volume: sorting and searching. If the programming language didn’t have a sort function, they wouldn’t have a clue about different types of sorting, or how to search for things, when you should build indexes, what it means that the database we’re using stores things in a B-tree. I think a good course would give them background not in, how do you write a linked list in C—that’s a craftsman thing—but what do linked lists do in an abstract sense?

Seibel: Perhaps the most famous project you worked on was the beginning of the ARPANET, when you, Will Crowther, and Dave Walden wrote the software for the original ARPANET IMPs. How did that come about?

Cosell: In Frank Heart’s group, our division, Frank viewed all of his programmer guys as this basic stable. He picked and chose how to move people from project to project. When my projects ran out, Frank would figure out what I should work on next. As opposed to the real consulting guys, who would start flying to Washington and writing proposals, I was spared having to do that. Somehow, Frank had decided that I was to be the third guy on the IMP project.

I was working on another project in the fall of ’68 when Dave and Willy and those guys had started. I think the contract had been awarded but wasn’t going to start until January. When I joined the project, not much was done. I think they had scraped out some of the code, but nothing was really cycling yet. By the time I came on board, Dave and Willy had started blocking out how the system was going to be organized and had taken hunks that they were starting to write. I just fit in and claimed a piece or two for myself. We all had different skills but we were all going to know how every line of code worked for the thing because it wasn’t that big a program. Complicated, but not that big.

And I know they couldn’t have gotten very much done when I joined because they were still doing offline assemblies. That involved taking a paper tape into the Honeywell room where there was a 516 and running paper tapes through, making an assembly listing by having it punch an entire box of paper tape, which they would then have to carry to another machine to print, because there was no line printer on the Honeywell machine. It was really pretty cumbersome doing the software management for that. One of the first concrete things I did on the project was I wrote a cross assembler for our PDP-1.

Then on the PDP-1 we could edit the files, assemble the files, make assembly listings of the files, run TECO macros over things. The only thing that got punched out was the comparatively small paper tape of the binary executable program, which would then go into the Honeywell machine.

Seibel: Was that the biggest challenge of writing the IMP software: making it go fast?

Cosell: Oh, that’s interesting. Well, let’s see. We didn’t think very much about how big it was because the idea was that the system was going to have to have a lot of space for buffering. And the code wasn’t going to be that big. And if the code was, say, ten percent larger than it could be if you squeezed it down, that would just mean that there would be a few fewer buffers. So we weren’t quite so much worried about counting how many instructions everything took.

Seibel: In terms of how much space it would take.

Cosell: Right. How much space. But we were concerned with speed, whether we were going to keep up with the bandwidth. And how do you organize a system so that it degrades gracefully and, in particular, degrades in a way that it can dig itself out of a hole as opposed to just collapsing and dying?

The second thing was just making the system work. There was a lot of untried, untested stuff. Were the protocols going to work? Will had come up with some ideas for the routing algorithm—was that going to work? There were still a lot of underlying questions. A question about congestion control. Did we know for sure that if everybody in the world sent packets to one poor guy, he would actually refuse the packets in the right order and dig himself out?

Seibel: So that was basically because nobody had ever tried to solve this problem before.

Cosell: Exactly right. It was a research project at that point—a lot of theory. A lot of people had written dissertations. A lot of people thought they knew what was going on. At that point, the rubber had to meet the road. We had to actually see whether the queuing theory was going to work, whether the routing algorithm could oscillate.

The third big challenge was simply how do you debug the thing. All of a sudden, you can’t talk to Cincinnati, Ohio. What went wrong? How do you figure it out? You call Cincinnati, Ohio, and you get a sleepy night watchman at 3:00 in the morning walking up to this little blinking box in the corner. What does he look at? What do you do? And even if you get the system back up, what went wrong? How do you fix it? Remember, I was a big things-don’t-crash, things-are-going to-keep-working guy.

I know that one of the things that impressed Will was there was some bug that they could not find and I found it. It turns out it was a bug in the handling of some protocol for the modems and it was sending the wrong packet at the wrong time. I put together a series of patches so that I could put a marker in a packet and when it saw that particular packet, it installed a patch on the system that looked for this other thing happening and as soon as it saw it, it stopped the system. Then once it stopped the system, we could use debuggers to figure out what was going on. Once I had done that, it took about two minutes to find the bug because the offending packet was still in memory; it hadn’t been written over.

I don’t remember the exact problem, but it was one of these problems that was not fatal. There was a bad pointer corrupting memory and the corruption wasn’t causing any trouble, but thousands and thousands of machine cycles later, the program crashed because some data structure was corrupt. But it turns out the data structure was used all the time, so we couldn’t put in code that says, “Stop when it changes.” So I thought about it for a while and eventually I put in this two- or three-stage patch that when this first thing happened, it enabled another patch that went through a different part of the code. When that happened, it enabled another patch to put in another thing. And then when it noticed something bad happening, it froze the system. I managed to figure how to delay it until the right time by doing a dynamic patching hack where one path through the code was patched dynamically to another piece of the code. And I was lucky because I guessed the right thing and we immediately found the problem.

Seibel: What enables that kind of intuition?

Cosell: On the systems I’m very good with like that, like the IMP system when I had it all in my head, or the PDP-1 time-sharing system, even though the system is a multiprogramming, multilayered, interrupt-driven system, I have all the dynamics of the system in my head. I know what order things are supposed to happen. I know somehow what’s not supposed to happen, when things are supposed to not be happening. That lets me build up a model for, “How could this thing possibly have happened?”

And at least some of those were two-machine problems, which also required some odd creativity to find. That is, the trouble is something goes wrong on my machine and the evidence of it shows up on yours. I can’t stop—my machine has already processed 6,000 more packets by the time yours hits the trap that says, “I got a bogus packet.” So now what do you do? We’d work through, the three of us, finding ways to track those things down and fix them and basically make the system pretty solid.

Seibel: Did you build in debugging code?

Cosell: No.

Seibel: So you had many different tricky bugs, each of which you had to track down in a unique way?

Cosell: As far as I can remember, we didn’t build in any debugging stuff. I mean, these days, I always point out that you’ve got to make programs so that they are testable. And the only way to make a program testable is to think about that before you write the first line of code. You can’t retrofit block points and assert points and test points that work efficiently and do the right thing if you wait until the program is working.

But I’m sure that we didn’t think about any of that. We were just trying to write this incredibly complicated real-time thing that had to be fast. It was a hard enough problem. We didn’t put in any real consistency checks; who would want to waste time on that? So these things were all ad hoc patches. Jump off into a spare part of memory, run through some hand-coded stuff to check this or that or the other, jump back, and continue.

In fact, it was even formalized. One of the things—I’m pretty sure I wrote it—was a patcher where you could submit a patch to the system and it would pull one buffer out of circulation and use it to hold the code and link up to that and then link back. We used to do that kind of stuff but it was all ad hoc. We would find some bug and we would crack our heads trying to figure out what it could be.

A lot of the times, just understanding what the bug is points you at the right piece of code. Now you read it more critically and you fix it. Other times, you need to collect more data. Other times, you need to bang your head against the wall trying to catch that little bit of evidence that illuminates the thing. And we did some of all of that.

Remember, we’re running on a machine that’s got no console, no nothing. In general, the patches would stash away some data and then halt the machine. Then we would probably use the front panel because I don’t think there was a debugger we could run from the terminal that wouldn’t trash the machine. So we’d look through the appropriate areas of memory from the front console, doing examines and deposits to go figure out what was going on.

Seibel: So that’s literally a row of lights?

Cosell: Yeah, a row of lights. Bit per light.

Seibel: And toggle switches to put in the address?

Cosell: Right. Actually, this is better. The PDP-1 had toggle switches. This one had, as I recall, push buttons.

Seibel: How did the three of you work together?

Cosell: One of the things that I remember doing shows a little bit of the style difference. Will was a brilliant intuitive programmer. All of the hardest problems that most people couldn’t understand how to do at all, he would find ways to do.

Like the AI engine in Adventure that he did in Fortran of all things. And the routing algorithm and all sorts of stuff in the dynamics of the IMP system, Will had cobbled together. One of the things about a real-time system is everything has to be timed out. You can’t wait forever for anything because there’s no forever in a real-time system.

And a bigger and bigger collection of time-outs was growing up all over the program. I tried to understand them and had a hard time doing it. So in one of my revisions of the source code, I tried to make an algebra for all of the time-outs. For example, the total time-out to get an acknowledgement for a message should be eight times the time-out for a single packet to transit the net plus something. Or, the total time-out for a message to transit the net is the maximum diameter of the net times the maximum time for the packet to make one hop.

I was sort of trying to find out what the basic constants in Will’s mind were when he put things together. When two time-outs had the same time were they supposed to be the same or were they coincidentally the same? Who knows? How many places do you have to change when you want to change one of the constants? If you discover dynamically that you’re not waiting long enough for something to happen and it is timing out when it shouldn’t, you know that you can’t just change that one time-out because these things are interrelated.

So I made a whole bunch of sharp sign defines, basically, to try to find the smallest number of independent constants. I remember doing that because it was just really scary. It was one of the places where I was dabbling in things that really nobody understood because a lot of those constants Will had put in intuitively and we had tuned to make work, one by one. The time-out isn’t big enough and so we would make it bigger, not doing it by first principles or algebra, but just tuning it until it works.
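
A minimal sketch in C of what that kind of time-out algebra looks like; the base constants, names, and multipliers here are made up for illustration and are not the IMP’s actual values:

    /* Hypothetical base constants: the small number of independent values. */
    #define HOP_TIME_MS        120  /* worst-case time for a packet to make one hop */
    #define MAX_NET_DIAMETER     8  /* longest path, in hops, across the net        */
    #define PULSE_TIME_MS       25  /* per-message bookkeeping slack                */

    /* Everything else is derived, so changing one base constant changes
       every time-out that depends on it, consistently. */
    #define PACKET_TRANSIT_MS  (MAX_NET_DIAMETER * HOP_TIME_MS)
    #define MSG_ACK_TIMEOUT_MS (8 * PACKET_TRANSIT_MS + PULSE_TIME_MS)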

Seibel: Did you find bugs that way or did you just put it on a more solid footing so that as things changed, you could change things in a way that wouldn’t require endless retuning?

Cosell: I don’t recall finding any bugs. But there were undoubtedly some places where there were timers that now had different values than they used to, but not operationally significant ones, just defensively different ones. It was less so that you could change it if you have to; really it was so that it made the program easier to understand. I hated having a program that had 200 randomly chosen independent constants scattered throughout it and knowing that they have something to do with the heartbeat of the network. I think it simplified some of the code. It made it easier to fathom what was going on. It also let us use more symbolic constants. Eight times diameter plus pulse time or something like that would be understandable.

Will was sort of the advanced idea man. I remember complaining to Frank Heart about this once, that he got to work on the projects right out of the box because BBN was doing a lot of very cutting-edge stuff and he was terrific at finding ways to do things that couldn’t be done before.

He was not as good at getting 100 percent done nailed-down code. He was really good at getting 75 or 80 percent pretty good code that worked most of the time. Will had already gone on to, I think, the TIP, and Dave and I were still working on the IMP system and that’s when I redid the routing algorithm because it had funny constants and I didn’t understand it. So it was still Will’s routing algorithm but recoded with my style. And I think it was a little more solid. At least I understood if it was going to oscillate, why it was going to oscillate, because I made it oscillate.

One of the places where Will Crowther and I absolutely differed—and I had to put in hours and hours of work and even then he was skeptical—was he believed that when you reassemble a program you add more bugs than you remove. So he used to keep notebooks with pages and pages of patches. He would go as long as he could patching the existing system before he had to reassemble. Those patches were of patches on top of patches and so complicated that often his prediction was a self-fulfilling prophecy. It was hard, after all of that, to get it just right so that it turned out to be what the patches were actually saying.

Seibel: So you had an original source listing that you could feed to an assembler—

Cosell: Right, and a binary image that was running. Then we would have a paper tape—or sometimes we’d just do it by hand—that plants a jump here out to a little area where these three lines of code were replaced by these five lines of code and then it transfers back to the subsequent thing so when you execute this code it goes off to the patch, executes some stuff, and comes back.

Seibel: So the paper tape held the binary version of the patch?

Cosell: Yeah. Later on, when I built a little interactive debugger that had the examine and deposit functions I was so fond of, we actually could build a little text tape that looked like, “Go to location 12785, value, value, value, value. Blank line. Go to location 12832, value, value, value, value, value.” If you had to load the program from scratch you loaded the program and then you loaded the patch tape when you were done.

Seibel: So at that point you didn’t actually have any source code that would assemble into the current state of the running binary?

Cosell: Exactly right. One of the troubles was we had different copies of the listings. One of the listings will have an inked mark at some place in the code where two lines will be crossed out and next to it the replacement code. Now, did every copy of the listing get that? Will was very good because he had his notebook and the final say was not a particular listing but his notebooks. That was his approach.

My approach was that the system should always run out of the box. I do not want to mark on the assembly listing. When I first came on the project it was hard to get all of his patches in. We would work all day and then I would edit and reassemble the system overnight so that the next morning we had another clean tape and we would start with that. It turns out when you’re doing it overnight you only have two or three changes and they’re located so you can read the code as you change it and it makes sense. Of course, that settled down right away.

So we almost never again had a problem of fixing a bug creating a new one, other than if the patch was wrong. But Will and I were at odds about that because he was really very fond of patching and staying away from the assembler if he could, partly because reassembling took a lot of time whereas he could patch and just go on, and partly because he didn’t trust the cycle because the editing was too scary.

Seibel: Do you consider your work on the IMP one of your important technical achievements?

Cosell: Oddly enough no. It was an interesting, hard program, but I’d written Doctor, I was doing Lisp stuff, I’d been czar of the hospital computer system. Certainly at that point the neatest thing I had worked on was understanding every line of code in this cutting-edge time-sharing system. This thing was just a little stand-alone communications processor. It didn’t have as many interrupt channels as we had on the PDP-1. It didn’t have to deal with what do you do when you only have 32 swapping slots and you have 40 people logged on.

The three of us got along famously, so it was fun and it was challenging. There were things about debugging it and implementing it that were hard to do. But hard to say that I would’ve thought it was the crown of my career. It was just the next program. The other thing that was anticlimactic about the IMP system was how bounded it was. The PDP-1 was basically hard. It was a time-sharing system and it had to evolve over time.

The thing about the IMP system that was so neat is how we did it in such a disciplined fashion. They started officially in January; I joined the project in February and in September it was done. Small value of “done”—we were still working on it fixing bugs and stuff but it got released in September and didn’t stop. Not long after that Will went on to his next project and Dave and I continued with it and somebody new came in.

The person I have to give a lot of credit to is Frank Heart. I don’t understand how he hit on the management style of mostly letting us be as crazy as we were. I’m hard-pressed to remember a software-review meeting. I’m hard-pressed to remember being hassled for documentation when the three of us had the program in our heads and couldn’t be bothered with a lot of that stuff. There was a level of trust and confidence that the three of us were going to do this thing and he left us alone. In retrospect, having been a project manager, that’s a stunning thing; that’s just bizarre. No weekly staff meetings, no PERT charts on the board. Willy was of course keeping track of what needed to be done and bugs we found and stuff, but the lack of oversight structure for that was pretty impressive. Throwing us together and telling us to go do it was, I think, a stroke of management bravery.

Another thing that Frank did, on other projects, was design reviews. He had the most scary design reviews and I actually carried that idea forward. People would quake in their boots at his design reviews. This was sort of like taking your orals for your dissertation. He would have a hand-picked collection of people in the audience and you would have to present your design. The people he picked were always good. The thing that made his design reviews so scary is he knew when you were bluffing.

I’m sure you’ve done design reviews where you didn’t work on some part of it real well and so you kind of slide past that part. You think you got this right but you didn’t really do the analysis so you don’t know quite what’s going on. He had an instinct, and it was abetted by having a good crew in there, of catching you when you were bluffing, catching you when you hadn’t thought it through.

The parts that you did absolutely fine hardly got a mention. We all said, “Oh.” But the part that you were most uncomfortable with, we would focus in on. I know some people were terrified of it. The trouble is if you were an insecure programmer you assumed that this was an attack and that you have now been shown up as being incompetent, and life sucks for you.

The reality—I got to be on the good side of the table occasionally—was it wasn’t. The design review was to help you get your program right. There’s nothing we can do to help you for the parts that you got right and now what you’ve got is four of the brightest people at BBN helping you fix this part that you hadn’t thought through. Tell us why you didn’t think it through. Tell us what you were thinking. What did you get wrong? We have 15 minutes and we can help you.

That takes enough confidence in your skill as an engineer, to say, “Well that’s wonderful. Here’s my problem. I couldn’t figure out how to do this and I was hoping you guys wouldn’t notice so you’d give me an OK on the design review.” The implicit answer was, “Of course you’re going to get an OK on the design review because it looks OK. Let’s fix that problem while we’ve got all the good guys here so you don’t flounder with it for another week or two.”

What you wanted to do with a design review was double-check that the parts that he thought he had right he did have right and potentially give him some insight on the parts that he didn’t. Once I apprehended that—I was only like 20 or 21—that seemed so obviously right, such an obvious good use of the senior talent doing the review.

Of course, the design review for the client is different. The design review for the client is all, “We know it all. It’s all going to be perfect.” But the internal design review was an opportunity and I was always surprised by how many people were absolutely scared about the prospect of a design review. These are good people but they just said, “My design is going to be torn to shreds.” It’s hard to convince them that it won’t get torn to shreds if it’s any good, that these guys are not vindictive. They’re going to try to continue the BBN mystique of getting it all right.

It’s also hard to tell them that you will never again in your career get this collection of people willing to spend an hour helping you think through your design. You’re going to be on your own after this, and that was just a wonderful experience.

Seibel: How often were these design reviews? At the beginning of a project or throughout?

Cosell: There weren’t multiple design reviews; the design review was basically once when the design was considered done.

Seibel: So the design was done before you had really started the coding part?

Cosell: Yeah, right. Yeah. Probably some of the coding had been done because a lot of people, including me, have to start blocking up little bits of code to see how a thing is actually going to work out. But typically we are in a cycle where we have to propose things and then we get funded later to do them. So what we have to do is propose to the client, “This is what we’re going to do,” and you want a good understanding because the client at this point is going to give you so much time and so much money and expect it to work. So it was typically at that point, we are about to finalize the proposal, we are going to have the technical description of what we’re going to do. Now we sit down for the design review to make sure we understand it. I don’t recall Frank stepping into contracts once they were afoot. Certainly the projects I was working on I can’t remember an ongoing project review including Frank.

Seibel: You just mentioned Doctor. What was that?

Cosell: When I was working on the PDP-1 time-sharing system, Dan Murphy and his friends were working on their PDP-1, bringing up this Lisp system. So I thought I would learn Lisp. That spring, Joe Weizenbaum had written an article for Communications of the ACM on ELIZA. I thought that was way cool. And I believed, as I likely still believe now, that anything I can understand, I can make a computer do. He described how ELIZA works and I said, “I bet I could write something to do that.” And so I started writing a Lisp program on Dan Murphy’s PDP-1 system at BBN. I had a Model 33 Teletype that was in my PDP-1 computer room connected to Dan Murphy’s PDP-1 so I could play on his computer from my computer room and pretend to be working on my system. I wrote that program and got it up and working. Playing with it was an all-BBN project. People would leave me comments: “It would be better if you did this” or, “I tried this, and it didn’t work.” That actually helped spread Weizenbaum’s idea beyond its boundaries. It was written, at first, in the PDP-1 Lisp. But they were building a Lisp on the PDP-6 at that point—or maybe the PDP-10. But it was the Lisp that had spread across the ARPANET. So Doctor went along with it, it turns out.

I got a little glimmer of fame because Danny Bobrow wrote up “A Turing Test Passed”. That was one of the first times I actually got some notice for my stupid hacking: I had left Doctor up. And one of the execs at BBN came into the PDP-1 computer room and thought that Danny Bobrow was dialed into that and thought he was talking to Danny. For us folk that had played with ELIZA, we all recognized the responses and we didn’t think about how humanlike they were. But for somebody who wasn’t real familiar with ELIZA, it seemed perfectly reasonable. It was obnoxious but he actually thought it was Danny Bobrow. “But tell me more about—” “Earlier, you said you wanted to go to the client’s place.” Things like that almost made sense in context, until eventually he typed something and he forgot to hit the go button, so the program didn’t respond. And he thought that Danny had disconnected. So he called Danny up at home and yelled at him. And Danny has absolutely no idea what was going on. Except Danny knew about my terminal. So he came in and tore the typescript off of the thing, to save it.

It was a very slick version of Weizenbaum’s thing. We improved the scripts a little bit. Lots of generations of hackers worked on it. And as I say, it traveled around the Net. And now, I guess, there’s a version of it written in Emacs macros. But that was my trial by fire in becoming a serious Lisp programmer.

Seibel: So I’m curious—I’ve observed that often the programmers that write the hairiest, most complicated code are the ones who can keep a ton of details in their mind. You obviously had the ability to keep details in your mind but still cared a lot about making code simple and clear.

Cosell: I have to admit that I did both. I would make things simple in the large. But when I say that programs should be easy, it’s not necessarily the case that specific pieces of the functionality of the program have to be easy. I could write some very complicated code to do the right thing, right there, code that people would cringe at and not be willing to touch. But it was always in an encapsulated place.

Most of the bad programs I ran into, the ones where I threw things out and recoded them, there wasn’t a little island of complexity you could try to understand and fix, but the complexity had oozed through the program.

I have a couple of rules that I try to impress on people, usually people fresh out of college, who believe that they understand everything there is to know about programming. The first is the idea that there are very few inherently hard programs. If you’re looking at a piece of code and it looks very hard—if you can’t understand what this thing is supposed to be doing—that’s almost always an indication that it was poorly thought through. At that point you don’t roll up your sleeves and try to fix the code; you take a step back and think it through again. When you’ve thought it through enough, you’ll find out that it’s easy.

We just did that recently at work. They were working on some big design project and it was just getting more and more convoluted. So we had a meeting and started shedding away things. I said, “That seems too complicated.” And all of a sudden, we had a block diagram for how the thing would work. And everybody was stunned because they understood how each block could possibly do its job. We hadn’t done the dull things where you have to write it all down but they understood that the interfaces were clean and they could make progress. I’ve done this business long enough to understand that there are some very hard problems. But very few. It’s invariably the case that when they think about it harder, it gets easier and all of a sudden it’s easy to program correctly.

The other rule is to realize that programs are meant to be read. Even though I’m guilty of writing pages of TECO macros back in my early days, I very quickly—probably when I was working on the PDP-1 time-sharing system and the complexity of the time-sharing system started to sink in—came to the belief that computer-program source code is for people, not for computers. Computers don’t care. I think it’s a good thing that Perl has both “if” and “unless.” Because it turns out that when you’re getting an intuition for what something is supposed to be doing, saying “if not some condition” doesn’t connote the same idea as saying “unless the condition.”

The binary bits are what computers want and the text file is for me. I would get people—bright, really good people, right out of college, tops of their classes—on one of my projects. And they would know all about programming and I would give them some piece of the project to work on. And we would start crossing swords at our project-review meetings. They would say, “Why are you complaining about the fact that I have my global variables here, that I’m not doing this, that you don’t like the way the subroutines are laid out? The program works.”

They’d be stunned when I tell them, “I don’t care that the program works. The fact that you’re working here at all means that I expect you to be able to write programs that work. Writing programs that work is a skilled craft and you’re good at it. Now, you have to learn how to program.” Some of these guys were fabulously good programmers and they’d never once read a line of anybody else’s code. In fact, some of them never even read their own code, so they never had the pain of seeing what happens six months later.

Some would rebel. Some were absolutely convinced that they were good programmers and I was just some over-the-hill old guy that didn’t know what he was doing. I know I would have said the same thing not long ago: “The program works. What is your problem?” When I say, “You don’t get credit because the program works. We’re going to the next level. Working programs are a given,” they say, “Oh.” Then they talk to other people and discover that that’s basically the BBN standard. You can’t explore a new idea in doing something if you are not craftsman enough to make the computer do what you had in your mind.

I had a preference of how I liked my global variables and how I liked my subroutines organized and I got into a multiday battle with one guy where he said, “Look, it works just fine” and he was such a good programmer that I didn’t want to pull rank. I felt it important that he understand that I was not just being a tyrannical turkey; that there was a reason why I wanted him to do it this other way. He didn’t realize how hard it is to understand a program with a single C subroutine that’s 42 pages of code long.

Seibel: Yikes!

Cosell: I argued with him because I’m a big fan of call-once subroutines where the only function of the subroutine is to abstract some little part of a parent subroutine. When you read the parent subroutine—in my approach to programming—and you get to this place in the code and you get distracted with the details of this big nexus of stuff, I like to pull that whole clump out. Now you have a single thing that says, “Sort the table and find the best route,” even though this is the only place it’s called. Someone optimizing the code would say, “That shouldn’t be a subroutine. Put that in line.” But it’s a little subroutine where I can isolate it. It’s obvious what the inputs are. You can see the algorithm, and only be concerned with that. He hated when I used to say, “Your routines are too complicated. Your routines are spanning big chunks of the design.” He’d say, “That’s OK because I can do it all in one routine.”
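
A small runnable illustration in C of the call-once style he is describing; the routing table, the names, and the sort are purely hypothetical:

    #include <stdio.h>
    #include <stdlib.h>

    struct route { int dest; int cost; };

    /* Hypothetical comparison for qsort: cheapest route first. */
    static int by_cost(const void *a, const void *b)
    {
        return ((const struct route *)a)->cost - ((const struct route *)b)->cost;
    }

    /* A call-once subroutine: it is called from exactly one place and exists
       only so the parent routine can say "sort the table and find the best
       route" instead of burying the reader in the details at that point. */
    static int sort_table_and_find_best_route(struct route *table, size_t n)
    {
        qsort(table, n, sizeof table[0], by_cost);
        return table[0].dest;
    }

    int main(void)
    {
        struct route table[] = { {3, 40}, {7, 12}, {5, 95} };

        /* The parent flow stays readable: one line names the whole clump. */
        int next_hop = sort_table_and_find_best_route(table, 3);
        printf("best next hop: %d\n", next_hop);   /* prints 7 */
        return 0;
    }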

He rebelled but eventually he did it my way. Then the next task he had was to take a big piece of code from one of the programmers working on an earlier effort and make it fit into our system. He worked on that for almost a week. He so hated the other guy’s program that he complained to my boss that there weren’t strict enough programming standards in the division. And the other guy was programming the way he had wanted to program but with a different spin. So he saw what happens when one very intense, very good programmer doesn’t segment it down. You get one very long program—it’s not that the program was spaghetti code but there were just so many levels of complexity in this one linear suite. He almost pissed me off because, as I say, he went over my head to demand that the department had to have standards to not allow that thing to happen.

Seibel: Not realizing that his own previous code would’ve probably fallen afoul of the same standards?

Cosell: No. He got that. He was a convert. It’s sort of like the guys who give up smoking and are the biggest pains in the butt about other people still smoking. He became one of the strongest guys on my project. He used to nag me when I wasn’t careful enough—when I compromised. My project was the first project of its type he had ever worked on. Communications, real time, all this stuff—all new to him. But he was a smart guy and he went through this little epiphany and came out of it the programmer I always thought he was going to be. Last I heard, he was doing wonderfully. With him it worked out. Other people didn’t like working with me because they found me too overbearing; I can’t imagine why.

Seibel: Did you have particular rules for how much or how little to comment?

Cosell: I don’t put a lot of comments in my code because I think you should be writing your code so that it is readable and your algorithms and thoughts are clear in the code. I put comments that say this routine is supposed to do this, and usually some description of how you call it—what do you do when you get exceptions, what the order of the arguments is, and things like that. But the code itself should clearly express what you are doing.

The only place I tend to put comments in my code is when my instinct says, “This particular piece of code, even though it works, doesn’t clearly state what I’m trying to accomplish.” And so I put a comment in the code to say, “This code sorts the table,” because it may not look like your standard table-sorting code since, as it turns out, I can take advantage of something.

I have never been a fan of structured programming listings where every subroutine has to have 18 lines of comment at the beginning and the arguments are in the right order. I don’t do a consistent segmentation of my programs. Some of my subroutines are complicated and some are simple. I do worry about things like layout; I’m part of the contingent that argues about curly braces.

One of the reasons is because I read code to understand what it does as opposed to reading code to see what each little piece of it is. So when I see an if statement, for example, I see the condition. I’m now thinking yea or nay on that condition, and if I want to skip the if statement I like having a program organization that lets my eye flip down to the end of the if statement without me having to process a lot of syntax. So I’m one of those old-fashioned guys that likes lined-up, open and close braces.

If you made column five go away, my code would look like, “operator, open brace, close brace; operator, open brace, close brace,” which lets me see the sequence of operators. Another part of that is related to something I mentioned before: if the open brace is too far from the close brace then often it’s doing too much, in which case I can pull it out. Sometimes even if it’s not doing too much I’ll still pull it out because I can’t apprehend what that little branch is doing if there’s too much crap in there.
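Roughly how that lined-up brace layout looks in practice. This little C loop is a made-up example, not code from the interview; it only shows the eye being able to skip from a condition straight down to its matching close brace.

    #include <stdbool.h>
    #include <stdio.h>

    struct stats { int accepted; int dropped; };

    /* Made-up validity check, just to give the if something to test. */
    static bool packet_is_valid(int len) { return len > 0 && len <= 1500; }

    int main(void)
    {
        struct stats stats = { 0, 0 };
        int lengths[] = { 64, 0, 1500, 9000 };

        for (int i = 0; i < 4; i++)
        {
            /* Lined-up braces: if this branch is not what you care about,
             * the eye drops straight down to the matching close brace. */
            if (packet_is_valid(lengths[i]))
            {
                stats.accepted++;
            }
            else
            {
                stats.dropped++;
            }
        }
        printf("accepted %d, dropped %d\n", stats.accepted, stats.dropped);
        return 0;
    }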

I try very hard to hide the crap, to move the crap someplace so that I can follow the flow of the code, so I can build the picture in my head of what the code is doing. I have a lot of trouble reading some programming styles because I have too much trouble trying to absorb the block structure. It’s interesting that the guy that did Python was clearly of a similar mind. He eliminated the syntax wars because he doesn’t have open and close braces. When you see an if, the open curly brace is always there implicitly and the close curly brace is also implicitly there, and if you need to find the next thing, it’s lined up under the if. I use an editor in C and in Perl, and I assume that editors in Python do the same thing, where you can click on a button and it shrinks the whole thing so you only see the outer structure.

I don’t like to fight these style wars on the basis that one style is ugly. I like to believe the reason I fight the style wars is that the other style interferes with my understanding of the code. I was always pretty good at that. Unless you can convince me you’re better at understanding code than I am, you have a tough fight convincing me your way is better.

Seibel: Certainly coming in cold to new code and debugging it is a particular skill that not every good programmer has, but which it sounds like you had.

Cosell: Indeed. And there are two aspects of that. There was another guy. His name was Steve Butterfield. And he was also a good fixer, but the antithesis of me. Steve was about the best I have seen at not having any clue how a program worked and fixing it. He could dive into a program and change some little ugly piece down in the bowels of the code to make it do something different. Big, complicated programs, Steve could leap in and fix little things leaving them, to my view, functionally better but worse off as programs.

I always tried to make the whole program better and that would often mean that even though there was one little problem, I would try to understand the whole program. I would try to find the problem by reasoning down from the top and finding it as opposed to saying, “Oh, this isn’t working. Do surgery here.” So there were some things where I just took too long and spent too much time trying to import the whole thing into my head, when a more directed approach just to go fix it would do.

But usually when Steve left a project, it was hard to revise the code to make it do stuff. Whereas I tried to keep things good, but it meant that if a program was really big and awful, I would spend a lot of time spinning my wheels before I felt comfortable diving in. But that didn’t happen very often because often when I debug things, I don’t do it by debugging.

As I mentioned before, there were many bugs that I never had any clue where they were. I just get to a point where I say, “This piece of code is supposed to be doing this. This does not look like it’s doing that. I mean, how could anybody have written this complicated bit of code to do this simple thing?” So I’d rip it out and replace it with a routine that does the simple thing I thought that piece of code was supposed to do. And the program magically works. In retrospect, what had happened is that the program had evolved and this little routine kept getting changed. Rather than being replaced when the program evolved, somebody was patching it to do different things and missed once.

I never debugged any of that stuff. I hack on it for a day or two doing all of this typing and nobody would have a clue what I’m doing and the program would get fixed. What a debugger! That is very dangerous because Will’s dictum is basically right, that if you rewrite a hundred lines of code, you may well have fixed the one bug and introduced six new ones. And at least with the one bug, you knew what to look for; you now have to start looking for the six new ones. And I was just fortunate because I had a very good track record over the years of managing to write code that, for the most part, worked.

Seibel: So you must have had some strategies for reading code. Even if there’s no bug but just a big pile of code that you’re going to work on, how did you tackle that?

Cosell: Not very well, it turns out. One of the reasons why I tend to rewrite chunks of code rather than fix them is because I reach points where I can’t manage to figure it out anymore. I don’t read the code as if it were a book. I try to figure out what the program is doing and then get hints about the code from the top down.

In parallel with reading the program I think about how would I solve this problem. Which means I’m looking for certain specific pieces so I can say, “Oh, here’s where the program does it.” Then I can say, in my usual arrogant way, that the guy that wrote this did it wrong. Or at least I now understand that they’re doing this some other way.

So I would go top-down. But some of the guys I knew were spectacularly good at bottom-up. They would start reading little subroutines and eventually find the one subroutine they needed. But mostly for those kinds of things I was a top-down kind of guy. That is, I’m looking at the program trying to figure out what the other programmers should have done. That was one of the things that led me to sometimes fix bugs where I didn’t know what the bug was. I’d hit a place where I say, “This piece of code—as I understand this program now—is supposed to be doing this” and then either the code I’m looking at doesn’t do that or the code is so complicated and seems to be doing six other things and it’s not making sense to me.

In either case my usual response at that point is to fix that piece of code so that it agrees with what I thought was supposed to be happening there in the program. You can see how fabulously dangerous that can be because there is no one correct way to organize a program and if the program was perfectly well organized but in a different way than I wanted, I have now just killed the program and now have an incredible avalanche of stuff to fix. But I was pretty lucky with that. Usually when I said, “This looks wrong and I’m going to fix it” it got fixed. And that was even true from the early days.

The first big program I worked on, the PDP-1 time-sharing system, I was just a raw programmer doing college-undergrad programming problems and I moved through the hospital project very quickly, from doing applications to coming under the wing of the systems guys. Even though I was six months into being a professional programmer I was perfectly willing to say that this little piece of the remote process swapper doesn’t look like it’s right and I would rewrite it.

Seibel: In addition to the danger of introducing new bugs, another risk is that you may have misapprehended what the program is supposed to do.

Cosell: That’s right. The path I took was, if you will, not for the faint of heart. At the time I was 19 years old and that seemed like the only way to do things. I had two convictions, which actually served me well: that programs ought to make sense and there are very, very few inherently hard problems. Anything that looks really hard or tricky is probably more the product of the programmer not fully understanding what they needed to do and pounding it with a hammer ’til they got code that looked like it did the right thing.

I don’t know why I had those two convictions. I arrived at BBN with no skill per se, but I had those principles in the back of my head for some reason. I thought I ought to be able to understand anything and it shouldn’t be so hard. I found that even for the time-sharing system and the IMPs—for that whole class of programs—that proved to be true. In general once I had the right understanding of what a program was supposed to do, the pieces would fall into place. The pieces that didn’t belong would stand out like a miscolored piece in a jigsaw puzzle.

Another principle was I always wanted clean listings. I wanted the thing to be just right. When you have to fix a bug in a program you never, ever fix the bug in the place where you find it. My rule is, “If you knew then what you know now about the fact that this piece of code is broken, how would you have organized this piece of the routine?” What were you thinking about wrong before? Fix the code so that can’t happen. When you finish with a routine I want every routine you work on to look as if it was just written. I do not want to see any evidence of afterthoughts or things gone wrong followed by something to correct the error or a mysterious piece of code saying, “This routine returns the wrong value every now and then so I’ve got to fix it.” I don’t want to see any of that. I want to see code that looks like through some divine inspiration you got it exactly right the first time.

Then I compound that with one other little trick. I got this when I was working on D.O.D. projects. They’ll never fund a new project. Both BBN and the government have too much invested in the current program, even when it has awful limitations that need to be fixed. The most common one is that something you did that was right when the program started is now hopelessly wrong because the program’s use or requirements have evolved. What you’d like to do is rip out that part of the program and just fix it. And they say, “What is that going to improve?” You say, “It’s not going to improve anything but it’ll make the program better for next week.” You’re not going to get permission to do that.

The method I took is the sneaky way and this has worked very well for me for a lot of programs. I do a design of the future version of the program. Knowing what I know now, this is how the program would have looked, now at the program level rather than at a subroutine level. Now when you go to fix a bug and you have a choice on how to fix it, fix it moving toward the better model. Don’t just fix it in the shortest way. Don’t just fix it in the way that fits, but move it toward the other model so that over several months instead of the program getting more and more mired in patches fixing up the stuff that was old and wrong, all the critical parts of the program all of a sudden look like they’re the new way of doing things. Often you can get to the point where there are so few places left that still do things the old way that you can slip in and get those fixed because you’re now not damaging the whole program.

So when they ask, “How long is it going to take you to put this change in?” you have three answers. The first is the absolute shortest way, changing the one line of code. The second answer is how long it would be using my simple rule of rewriting the subroutine as if you were not going to make that mistake. Then the third answer is how long if you fix that bug if you were actually writing this subroutine in the better version of the program. So you make your estimate someplace between those last two and then every time you get assigned a task you have a little bit of extra time available to make the program better. I think that that makes an incredible difference. It makes for programs that evolve cleanly. It’s amazing to have a program that’s still in version one but it’s like Washington’s hammer. It’s now a really sleek new thing because all the key parts have gotten fixed without any project manager having to actually authorize you to go rip out the guts and go fix it.

Seibel: Have you heard of refactoring?

Cosell: No, what is that?

Seibel: What you just described. I think now there’s perhaps a bit more acceptance, even among the project managers of this idea.

Cosell: Oh, that’s good because I used to need a bug that—I used to need a reason to change the piece of the code to do what you just said because I could never get permission just to rewrite it to make it cleaner. So I would have to wait till a bug or an improvement request came along touching that part of the code, but then I would do exactly that. I guess the thing about refactoring is that you have to spend some time thinking about what the right target is because it won’t do to refactor and have different people aiming in different directions or to have the target not be the right thing.

I never did that with a name; it just seemed like the only way I could do two things: manage the complexity and get a program that you didn’t have to throw out and code over. The PDP-1 taught me that. It ran for a lot of years and it was such a huge project. It took three or four guys to write the first two versions of it and there was no way to throw it out, but it had to get better.

Seibel: How do you hire programmers? How do you recognize the talented ones?

Cosell: I couldn’t ever get into the standard interviewing paradigm. People talk about—and I think Microsoft was famous for this—giving them little problems to solve. I seem to have taken a more intuitive approach. I basically glanced at the guy’s—or girl’s—résumé to get a feel for whether they felt like my type of person. Often the résumés were useless because they were just college seniors about to graduate. You read between the lines that this fancy-looking project was really a class project for some course in something or other. But I used to talk to them and just get a feel whether they had somehow the kind of inquiring, curious, precise kind of mind that I had grown to expect people around me to have.

What were their other interests, their nonprofessional interests? Did they show both aptitude for picking things up, and curiosity? It was kind of slapdash how I did all of that. I had this idealized image of a BBN-quality person, some vague thing about aptitude, curiosity, quickness of learning, interested in lots of different things, and kind of broadly based. I used to go on a hunt to see if I got the impression that this person was going to be BBN-quality folk.

Seibel: As you mentioned, Microsoft is famous for asking puzzle-type questions. And you like puzzles. What do you think of that as a way of gauging someone’s potential?

Cosell: Carefully chosen, I think it has potential. Not because the person solves the puzzle, but if it gives you a glimmer as to how they organize something to approach it. I have never used it. I would certainly not have handed somebody one of the little tchotchke puzzles and watched them while they tried to put it back together again. The problem is that a lot of these puzzles require different styles of solution, and either you know that or you don’t. That’s not so good because I don’t want somebody who’s really good at doing a sliding-block puzzle because they happen to know some good tricks for doing that.

BBN spent a lot of its time sailing in unknown waters, doing things that hadn’t been done, that we didn’t know how to do. And that requires both a degree of daring, because it’s not so easy, and a degree of skill in order to not founder. That’s the kind of thing I’m looking for, not looking for somebody with a knack for solving a particular puzzle but, thrown into this complicated thing that they have a need to deal with, can they approach it reasonably?

A case in point was when the Rubik’s Cubes arrived. We had just heard rumors about this wonderful puzzle and one of the guys was on a business trip to England and brought back a satchel of them. No books, no documentation, not yet a phenomenon in the U.S. at all. Just a strange little group theoretic puzzle. And we started to play with it. Several of us solved the puzzle in different ways, but it was interesting that we were able to cope with a puzzle like that. It’s just that that group of BBN people back then had the right stuff. That was what I used to look for.

I don’t know whether Microsoft’s little quizzes—and I’ve heard that Google has an aptitude test too, or something—I don’t know if those can give you a hint that this person has the right spark. But that’s what I used to look for. Does this person look like they’re ready to be BBN-quality folk? Often I would say no. They’re perfectly good, they’re terrific engineers, but as we’re talking I’m not getting that spark. My approach was to look for the spark, and I don’t know how I did it.

Seibel: Do you think programming is a young person’s game?

Cosell: I think that may be the case. I can even see, as I look back at some of the projects I worked on toward the end of my career at BBN, that the people I had doing the work for me were doing things that I couldn’t possibly have done. One of the guys working for me thought that using Tcl would be a neat thing for part of the interface, so in a day and a half he learned enough Tcl to bring the thing up and make it work, which I don’t think I could’ve done. It was amusing for me to think in the back of my head, “Gee, I used to be able to do things like that.”

I think that the actual production of code—of working, logical, good code—requires an intensity and a mental agility in terms of picking up new things that I, at least, find hard to do now. The other side of the coin is that you get a certain wisdom about things that you certainly didn’t have when you were younger. I know better now how to do things. So I find a better mix is to be able to give young and active people guidance. I think that by and large the sort of programming that I’ve been talking about is similar to the old saw about mathematics, that most mathematicians do most of their best work well before they’re 30. The kind of intensity, the kind of focus that you need to do really cutting-edge mathematics is probably similar to what you need to do the kind of crazy programming I used to do back when I was young.

Seibel: One part of the intensity is that it’s simply physically draining to work long hours. Are long hours necessary or are they just a side effect of the fact that we love it?

Cosell: I think that’s a personality side effect. The question of whether you can put something down and come back to it or whether you are compelled to stick with it and finish it is more of a personality thing. There were certainly many people I knew of at BBN out in the brilliant end of the spectrum who were perfectly able to work normal hours and not be interested in coming in on weekends. Then, of course, there were the other people who were at the lunatic fringe—for a while I was sleeping in the computer room because it took me too long to drive back to my apartment. I would just take a nap in the computer room and I have no idea how crazy people thought I was for doing that. But I don’t think that is necessary; I think it’s a byproduct of the fact that doing the kind of thing that we do is pretty exciting, especially when it starts to come together.

One of the really good guys at BBN ended up working a perfectly normal schedule and finished his PhD dissertation just by being astoundingly disciplined. He worked every Saturday on his thesis and he worked evenings on this and that. Part of it, I guess, is being organized. It is much easier to do something if you can do it all the way through, at which point you don’t have to think about being organized or careful or putting it down and picking it back up, because when you’re done you can forget about it. I’ve been learning that recently since my life is more normal now. I program around the cracks, putting things down and picking things back up. I find that if I don’t work on it again for two to three weeks it’s surprisingly hard to pick it back up. Often when I am lagging on some little personal programming thing and I really want to get it done, I try to say, “OK, I’m going to be like the people who exercise. I’m going to do it a couple hours every morning.” That mostly doesn’t work for me. What happens is at some point I get bored with having it not be done and I’ll spend a day or two on it and get it done.

I can, in bursts, get the kind of focus I used to have, but not so well anymore. So I think that makes this kind of special, fancy programming—what the true hackers do—more of a young person’s game. I have to admit almost all of the people I know who did stellar things when they were young did it intensely. It’s hard for me to think of any folks in the really stellar category who did it just a couple hours a day as if it were a job. Almost everyone I know would do it with this burst of maniacal focus and intensity and then get it done. But the focus is hard. It really wears you out. It certainly used to wear me out.

Seibel: Would you consider yourself a scientist, an engineer, an artist, a craftsman, or something else?

Cosell: A blend of those, obviously. I don’t consider myself a scientist, as I understand what scientists do. I’d like to believe I consider myself a combination of an artist and a craftsman. The way I do engineering is as a combination of art and craft.

Seibel: Let me first ask about the engineering part. There are certainly folks, like Watts Humphrey and the folks at the Software Engineering Institute, who say programming should be an engineering discipline just like building bridges. People can build bridges and they can predict how long it’s going to take, and the bridges, for the most part, don’t fall down.

Cosell: Exactly right. The analogy is very good except they put a non sequitur in the middle of it. It turns out that the guy that designs the bridge so it won’t fall down is not the guy that strings the cables or that inspects the cables to make sure the steel was right or pours the concrete or does any of those other things.

Programming is in that regard an engineering discipline. You have to know what to do. You have to know what your capabilities are. At the level I was working, I had to be able to envision how the pieces were going to fit together. I had to have some intuition for what things were fast and what things were slow, what things were hard to build, what things were easy to build, and come up with, at the engineering level, a model of what I thought was going to happen.

The artist part decides that the design should be elegant. Those fit together because in computer programs the artistry affects the longevity of it. Part of what I call the artistry of the computer program is how easy it is for future people to be able to change it without breaking it. That doesn’t have anything to do with constructing its functionality but with its life as an existing thing.

Seibel: So for you the beauty of code is tied up with the fact that people are going to have to change it.

Cosell: One or two things I wrote were black boxes that just had to run as long as the computer cycled, but most of them were code that generations of people were able to hammer on and mostly not keep breaking. When I talk about the artistry and the beauty of it, what I’m talking about is the idea that when you write a program you have a huge amount of discretion. How you organize your routines, how you lay them out on the page, where you decide to put comments, how you name your variables, whether you like your subroutines to all have uniform calling sequences or situationally appropriate code.

So you have to look at the program through the eyes of a new programmer sometime down the road. What’s the structure of the program? What are you doing? How are you doing it? And why are you doing it? The artistry is how the next guy that reads the program and understands that this subroutine was supposed to do this—he gets the message that it’s not right for him to go and muck it up to do something else and that he should keep the structure of the program.

Seibel: What about the tension between clarity and efficiency? Sometimes the simplest, easiest-to-read code isn’t the fastest.

Cosell: Programmers are the worst optimizers in the world. They always optimize the part of the code that’s most interesting to optimize, and almost never get the part of the code that actually needs optimization. So you get these little nuts of very difficult code that have no point. I always tell the people working with me, “Code it as lucidly, as easy to read, as crystal clear as you can. Do it the simple way. And then if it needs to be sped up, we’ll deal with that later. If you’ve done it right, we can draw a little box around this piece.”

Eons ago one of the versions of Emacs had one page of the source code that was a gigantic skull and crossbones in comments that said something like: “Seriously twisted code follows this thing.” It was some piece of the innermost guts of the search code or something like that that they had optimized the hell out of. That’s a place where I can see that this piece is really tough. So there’s a big black box around it saying, “Don’t stray in here, unless you know what you’re doing.”

But programs we write these days are bigger and clumsier, in terms of the elegance I learned when I was doing PDP-1, and slower. And that’s fine. It turns out it doesn’t matter. Now, the guys doing video synthesis and the guys doing the CGI animation stuff, clearly those guys don’t have that luxury. That takes some beaucoup careful programming. I could not do that anymore; I am way over the hill for that. But I could do that once. And I understand the guys that do that. But most of the programs we do are just routine crap.

At some college they had a two-semester course from September right through May and they had you work on some fairly hard program at the beginning. What they didn’t warn you about was that in April they were going to make you work on the program again, having now really run you through the hoops on other things. The idea was for you to be stunned at how hard it was to remember whatever it was you thought you understood perfectly clearly just six months ago.

Seibel: So all that stuff that you did at the last weekend before it was due comes back to haunt you.

Cosell: That’s right. I thought that was a brilliant scheme. It’s teaching them a lesson that is hard to get except in the real world.

Seibel: When I talked to Ken Thompson I asked him about whether there were inherent problems in C that lead to security problems and he basically said that’s not the problem. You teach courses on computer security—what do you think of that?

Cosell: I would hate to cross swords with him, but I say in my computer-security class that the biggest security problem to befall modern computers is C. It was designed to be a systems-programming language and it was such a comfortable systems-programming language that all the hotshots used it. We built operating systems out of it. We built real-time systems out of it.

I remember the wars in the days of Pascal. The argument was that computers should help you; that C was too dangerous a language. The two big voices I remember were Wirth and Dijkstra. On the other side was every systems programmer I know, including me. I wrote everything in C. So C sort of steamrolled the zillions of languages back then.

The government tried to mandate Ada and they wouldn’t let contracts unless they were in Ada. C steamrolled over those. It was just amazing. But as I look at it now, I am still stunned almost every day that it borders on the impossible to write a program of any real complexity in C and not have a security problem. The amount of care it takes for a programmer to never do a read into a buffer without explicitly making sure it can’t run over the end of the buffer, to never free a block of memory at the wrong time so a pointer way elsewhere in the program becomes stale, to never store something that’s the wrong size and happens to step on the next value—those problems can be so hard to find.
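A small sketch of the kind of care he means, in ordinary C. Nothing here is from the interview; it only shows every read and every write into a fixed buffer being explicitly bounded, which is exactly the discipline that is so easy to slip on.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char name[16];
        char greeting[32];

        /* Bounded read: fgets never writes past sizeof name, unlike gets()
         * or an unbounded scanf("%s", ...). */
        if (fgets(name, sizeof name, stdin) == NULL)
            return 1;
        name[strcspn(name, "\n")] = '\0';   /* strip the trailing newline */

        /* Bounded formatting: snprintf truncates rather than overrunning. */
        snprintf(greeting, sizeof greeting, "hello, %s", name);
        puts(greeting);
        return 0;
    }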

It has been such a boon in systems programming. The idea that we would write our systems in assembler and all our applications in Pascal sent shivers down my spine. I don’t think that was the right answer. But writing both systems and applications in C, I would have to say has proven just not to work very well. It’s just too hard.

It’s kind of like the problems we had with interrupt bugs. You could argue that there’s no real magic in writing a program with sequence breaks or interrupts. There’s no real problem with that. It takes a little bit of understanding and a little bit of care. But I know for a fact that very good programmers who understand all of that put those bugs in their program. A programmer like me would have to come and fix it and I had to do the Niklaus Wirth–style thing of inventing the computer language that wouldn’t let them make interrupt bugs.

For the IMP system I wrote a complicated set of assembler macros, so you could declare what you were doing. When you came in on an interrupt, you wrote a declaration that says, “I am on modem input” or “I am on the high-priority clock or the low-priority clock.” And then when it assembled your program, it actually tagged every instruction with which interrupt level it was running on and then there was a postprocessor that I probably wrote in TECO macros, honest to God, that processed that and looked for timesharing problems. It would look for a variable that was accessed by two different levels and it would say, “There is an interrupt conflict.” Now all of a sudden, the time-sharing bug would go away. Other programmers could understand that if they put the right declarations in, these macros would keep them from making timing bugs. I got a trip to Hungary to present how you could get programmers who don’t really understand real-time issues to be able to write solid real-time programs by using this technique to abstract out the conflict problems.
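His tool was assembler macros plus a TECO postprocessor, which is long gone. The sketch below is only a loose modern analogue in C, with invented names, of the central idea: every access to a shared variable is tagged with the level it runs on, and a variable touched from two different levels gets flagged as a potential conflict.

    #include <stdio.h>

    /* Loose analogue of the IMP macros: each shared variable remembers
     * which interrupt "level" first touched it; an access from a second
     * level is reported as a potential timing conflict. */

    enum level { LEVEL_NONE = 0, LEVEL_MODEM_INPUT, LEVEL_CLOCK_HI, LEVEL_CLOCK_LO };

    struct shared_int {
        const char *name;
        int         value;
        enum level  owner;      /* first level that touched this variable */
    };

    static enum level current_level = LEVEL_NONE;

    /* Analogue of declaring "I am on modem input," etc. */
    static void declare_level(enum level l) { current_level = l; }

    static int *access_shared(struct shared_int *v)
    {
        if (v->owner == LEVEL_NONE)
            v->owner = current_level;
        else if (v->owner != current_level)
            fprintf(stderr, "interrupt conflict: %s touched from level %d and %d\n",
                    v->name, v->owner, current_level);
        return &v->value;
    }

    int main(void)
    {
        struct shared_int pkt_count = { "pkt_count", 0, LEVEL_NONE };

        declare_level(LEVEL_MODEM_INPUT);
        (*access_shared(&pkt_count))++;     /* fine: first owner */

        declare_level(LEVEL_CLOCK_HI);
        (*access_shared(&pkt_count))++;     /* flagged: second level */
        return 0;
    }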

That’s a little bit the way I feel about C. I’m sure that there are good programmers, perhaps including me, who can write good C programs. But it’s just harder than it has to be. In the modern environment it’s gotten harder because the environment is so much more difficult—the number of places where C’s weaknesses can be exploited or overlooked, the amount of care it takes. That’s one reason why I’m very comfortable writing in Perl. Perl is slow. I’m sure it’s one of the slower languages, but in essence it repairs all of the security problems of programming in C. What happens when you index off the end of the array in Perl? It makes more array.

It knows what its pointers point to, so you can never misreference a pointer because you only say to go through it and it tells you where it’s going. So I’m much more comfortable building security-necessary applications in Perl because I have a world full of Perl people pounding on the core and it’s been stable for so many years. I don’t think we’re going to find too many allocate bugs or pointer bugs, and they’re hard to exploit from random Perl code anyway. I don’t have to trust the programmers around me to get every pointer check right.

And even then we get programs like the classic one where somebody wrote a web page that was looking up somebody in a table and some hacker put something in the input that looked like, “Joe;drop all tables.” That still happens. That’s obviously not C’s fault, but it shows programmers just can’t be careful enough. They don’t see all the places. And C makes too many places. Too scary for me, and I guess it’s fair to say I’ve programmed C only about five years less than Ken has. We’re not in the same league, but I have a long track record with C and know how difficult it is and I think C is a big part of the problem.
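The injection he describes is a different class of bug from C’s memory problems, and the standard defense is to keep the data out of the SQL text entirely. A minimal sketch using SQLite’s C API, with a made-up users table, since the web page he mentions is not shown:

    #include <stdio.h>
    #include <sqlite3.h>

    /* Look a user up by name without pasting the name into the SQL text.
     * The '?' placeholder is bound, so "Joe;drop all tables" is just data. */
    static int lookup_user(sqlite3 *db, const char *name)
    {
        sqlite3_stmt *stmt;
        if (sqlite3_prepare_v2(db, "SELECT id FROM users WHERE name = ?",
                               -1, &stmt, NULL) != SQLITE_OK)
            return -1;
        sqlite3_bind_text(stmt, 1, name, -1, SQLITE_TRANSIENT);

        int id = -1;
        if (sqlite3_step(stmt) == SQLITE_ROW)
            id = sqlite3_column_int(stmt, 0);
        sqlite3_finalize(stmt);
        return id;
    }

    int main(void)
    {
        sqlite3 *db;
        sqlite3_open(":memory:", &db);
        sqlite3_exec(db, "CREATE TABLE users (id INTEGER, name TEXT);"
                         "INSERT INTO users VALUES (1, 'Joe');",
                     NULL, NULL, NULL);

        printf("Joe -> %d\n", lookup_user(db, "Joe"));                    /* 1  */
        printf("hostile -> %d\n", lookup_user(db, "Joe;drop all tables"));/* -1 */

        sqlite3_close(db);
        return 0;
    }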

As these applications get more complicated, and built on more and more complicated libraries—and nobody will ever understand the security cracks in the libraries because they’re so immensely complicated—probably we’ll have to move toward application-development languages that are more fault-free. Processors are becoming blindingly fast and memory is becoming ridiculously cheap. I don’t know what tomorrow’s language is. I don’t think C or its derivatives such as C++ are going to really be the right vehicle for heavy-duty program application—even system development—going forward.

Java didn’t feel right. My old reflexes hit me. Java struck me as too authoritarian. That’s one of the reasons why I mentioned that Perl felt so good, because it’s got the safety and the checks but it is so damn multidimensioned that the artist part of me has a lot of free board to express things clearly and to think about the right way to do things. I have some freedom.

When I first messed with Java—this was when it was a little baby language, of course—I said, “Oh, this is just another one of those languages to help not-so-good programmers go down the straight and narrow by restricting what they can do.” But maybe we’ve come to a point where that’s the right thing. Maybe the world has gotten so dangerous you can’t have a good, flexible language that one percent or two percent of the programmers will use to make great art because the world is now populated with 75 million run-of-the-mill programmers building these incredibly complicated applications and they need more help than that. So maybe Java’s the right thing. I don’t know.

Seibel: When I spoke with Fran Allen, who worked at IBM on Fortran compilers, she was quite upset about C from a completely different perspective, which was it made it impossible to write really highly optimizing compilers because it was so low-level.

Cosell: Now, she’s in a different camp. She’s working on compilers; she sees C as this awful, clunky step down that you can’t do anything with. Whereas we were working with bit-twiddling assemblers and C was like a breath of fresh air. So of course most of the very best programmers back then were not the guys writing BASIC programs and not so much writing Fortran programs doing calculations. The real heavy hitters were of course the guys doing all the assembly code. So we went to C because C was like a breath of fresh air. If you think C has problems with array checks, try writing your array loops in assembler. So in that regard, it was a great boon.

I don’t want to say that C has outlived its usefulness, but I think it was used by too many good programmers so that now not-good-enough programmers are using it to build applications and the bottom line is they’re not good enough and they can’t. Maybe C is the perfect language for really good systems programmers, but unfortunately not-so-good systems and applications programmers are using it and they shouldn’t be.

Seibel: Do you think that the nature of programming has changed as a consequence of the fact that we can’t know how it all works anymore?

Cosell: Oh, yeah. That’s another thing that makes me a little bit more dinosaur-like. Everything builds assuming what came before. I remember on our old PDP-11, 7th edition Unix, we were doing some animation and graphics. That was a big deal. It was hard to program. The displays weren’t handy. There were no libraries.

Each generation of programmers gets farther and farther away from the low-level stuff and has fancier and fancier tools for doing things. The good part is they can do cleverer things. The baseline is so good that the next thing is spectacular and that then becomes the baseline and two years later it becomes even better. The trouble is that these baselines are getting more and more complicated. The PDP-1 instruction set was like a walk in the park compared to some of the stuff that’s happening.

I would hate to be the guys at Microsoft who have to build these operating systems that run on the quad-core multiprocessor. Video cards have grown to the point where they have multiple megs of memory, and complete pipeline parallel processors on them that can do array and vector things on the fly. So you now use your video card as this very fancy data processor. I keep thinking how hard it must be to program these things.

We had a thing called an IMLAC, which is one of the early machines that actually had a nice integrated vector display on it the way the old PDP-1 did but it was a minicomputer. There was a program for that that had you sitting on a little cart doing a 3-D display of a maze. So you saw the walls coming by. You could peek around corners. I was fascinated because it did hidden-line suppression. This is in the era where guys are writing articles in Communications of the ACM about algorithms. I have a whole book about how to use homogeneous coordinates and somebody’s algorithm for figuring out where two lines cross so you know where a line crosses a plane so you know that that’s where you have to stop the line because it now becomes hidden.

Doing the hidden-line thing was a big deal back then and that program did it. I was just stunned by that program. That was big deal code—singular stuff. Now, as far as I can tell, the video cards take 3-D coordinates and the video cards do the hidden-line suppression. Eight, nine years ago things like texture mapping and ray tracing were big deals. Hard to do in code. It took your program hours to get the glint off of a sphere.

And now I discover that video cards do the ray tracing. So on the one hand you have these guys working at NVIDIA and stuff who must be doing incredibly complicated stuff and you have the modern programmer who no longer can be content just writing a thing with little line-drawn walls—he has to master this incredible 3-D video environment that’s built on libraries that have gotten more and more complicated. They’re easier than it would be to write the code yourself, but I can’t fathom how people can absorb all of that these days. It just seems so huge to me.

I run into that just with doing Tk. I’ve been trying to do a little Tk program and I am stunned by how complicated Tk is and how many hooks it’s got, and it’s what you need to do in order to make the button be bigger or smaller or here or there. Mastering that thing is just a huge thing. Understanding the PDP-1 time-sharing system was simple by comparison.

So I don’t envy modern programmers, and it’s going to get worse. The simple things are getting packaged into libraries, leaving only the hard things. That stuff is getting so complicated, but the standards that people are expecting are stunning. One of the ones they showed me stunned me. He was showing me Google Maps that will do routes for you. One of the things you can do is you can grab a piece of the route with your mouse and drag that piece of the route somewhere else to tell Google that you want the route go there. Then it remaps the route so that it goes through where you just dragged the point. Now I know what’s going on in there: a pile of JavaScript code for the mouse tracking. When you let go of the mouse it has to do an Ajax XML request to tell momma system that he just put this point on the route. The route then has to do incremental updates. Calculating the route. I can’t even imagine how they do that code so well. People complain that you get routed through people’s backyards and stuff like that, but the optimal-route problems are one of the classic problems of computer science. How to take this arbitrary graph and find the shortest path through a graph. Just stunning.
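Google’s routing is of course far more elaborate, but the classic problem he is pointing at has a compact textbook answer. A minimal sketch of Dijkstra’s shortest-path algorithm in C, over a made-up five-node road graph:

    #include <stdio.h>
    #include <limits.h>
    #include <stdbool.h>

    #define N 5          /* number of nodes in the toy road graph */
    #define INF INT_MAX

    /* Classic Dijkstra on an adjacency matrix: dist[i] ends up holding the
     * cost of the shortest path from src to node i. */
    static void dijkstra(const int graph[N][N], int src, int dist[N])
    {
        bool done[N] = { false };
        for (int i = 0; i < N; i++)
            dist[i] = INF;
        dist[src] = 0;

        for (int iter = 0; iter < N; iter++) {
            /* pick the unfinished node with the smallest tentative distance */
            int u = -1;
            for (int i = 0; i < N; i++)
                if (!done[i] && (u == -1 || dist[i] < dist[u]))
                    u = i;
            if (dist[u] == INF)
                break;              /* remaining nodes are unreachable */
            done[u] = true;

            /* relax every edge leaving u */
            for (int v = 0; v < N; v++)
                if (graph[u][v] > 0 && dist[u] + graph[u][v] < dist[v])
                    dist[v] = dist[u] + graph[u][v];
        }
    }

    int main(void)
    {
        /* 0 means "no road"; weights are travel costs between intersections */
        const int roads[N][N] = {
            { 0, 4, 1, 0,  0 },
            { 4, 0, 2, 5,  0 },
            { 1, 2, 0, 8, 10 },
            { 0, 5, 8, 0,  2 },
            { 0, 0, 10, 2, 0 },
        };
        int dist[N];

        dijkstra(roads, 0, dist);
        for (int i = 0; i < N; i++)
            printf("shortest cost 0 -> %d: %d\n", i, dist[i]);
        return 0;
    }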

At one level I’m thinking, “This is way cool that you can do that.” The other level, the programmer in me is saying, “Jesus, I’m glad that this wasn’t around when I was a programmer.” I could never have written all this code to do this stuff. How do these guys do that? There must be a generation of programmers way better than what I was when I was a programmer. I’m glad I can have a little bit of repute as having once been a good programmer without having to actually demonstrate it anymore, because I don’t think I could.

This is a good time to be an over-the-hill programmer emeritus, because you have a few props because you did it once, but the world is so wondrous that you can take advantage of it, maybe even get a little occasional credit for it without having to still be able to do it. Whereas if you were in college—if you major in computer science and you have to go out there and you have to figure out how you are going to add to this pile of stuff—save me.
