Lessons 1 & 2 - Welcome to C; Historical Context

Self-discovered resources

History of C

"A damn stupid thing to do"—the origins of C (Ars Technica article): Talks about how origin story begins in England with CPL which became BCPL which became B which became C. Notable excerpt:
- BCPL is a "bootstrap" language because its compiler is capable of self-compiling. Essentially, a small chunk of the BCPL compiler was written in assembly or machine code, and the rest of the compiler would be written in a corresponding subset of BCPL. The section of the compiler written in BCPL would be fed into the section written in assembly code, and the resultant compiler program could be used to compile any program written in BCPL. Bootstrapping compilers dramatically simplifies the process of porting a language from one computer or operating system to another. Only the relatively small portion of the compiler written in the code that is specific to that computer needs to be changed to enable the compiler to run on another computer.
- When they [Ken Thompson and Bell Labs after pulling out of the Multics project and no longer having the GE computer to work with] did finally obtain a computer, it was a hand-me-down Digital PDP-7 that wasn't particularly powerful even by the standards of that era. Nevertheless, Thompson was able to get the first version of Unix up and running on that machine. The PDP-7 had 8,192 "words" of memory (a word in this instance was 18 bits—the industry had not yet standardized on the 8-bit 'byte'). Unix took up the first 4k, leaving 4k free for running programs.
  Word (computer architecture) and word- and byte-addressing
  Don't confuse the term "word" in computer architecture parlance with word-addressing, which has largely been abandoned in favor of byte-addressing. As this post and this post note, there's a good reason why computing switched from word-addressing to byte-addressing in the 1970s -- much of it had to do with how C wanted a character to be a core part of the language.
  Essentially, the early focus of computing was all about computation as opposed to communication. The gradual shift toward communication meant that characters and strings became increasingly important, and word-addressing was not at all efficient at handling them. Byte-addressing made string manipulation much easier, cleaner, and more efficient.
- The PDP-7 had 8,192 "words" of memory (a word in this instance was 18 bits—the industry had not yet standardized on the 8-bit ‘byte’). Unix took up the first 4k, leaving 4k free for running programs. Thompson took his copy of BCPL—which was CPL with reduced functionality—and further compressed it so that it would fit into the available 4k of memory on the PDP-7. In the course of doing this, he borrowed from a language he had encountered while a student at the University of California, Berkeley. That language, "SMALGOL," was a subset of ALGOL 60 designed to run on less powerful computers. The language that Thompson eventually ended up using on the PDP-7 was, as he described it to Ars, "BCPL semantics with a lot of SMALGOL syntax," meaning that it looked like SMALGOL and worked like BCPL. Because this new language consisted only of the aspects of BCPL that Thompson found most useful and that could be fit into the rather cramped PDP-7, he decided to shorten the name "BCPL" to just "B."
Unix at 50: How the OS that powered smartphones started from failure (Ars Technica article): Talks about how failure of Multics project between MIT, GE, and Bell Labs ultimately led to the profoundly successful Unix.

Reference notes

A History of C, UNIX, and Computation (a.k.a "The importance of 1978")

Link: A History of C, UNIX, and Computation (a.k.a "The importance of 1978")

1:23: The C Programming Language was published by Brian Kernighan and Dennis Ritchie in 1978. The key thing is that it is a significant moment in history where everything changed. And so we're looking at this textbook and the text in this textbook and the language itself in the context of how it's impacted history. The C language itself has a long history:
1969 - B Language - Word oriented (i.e. not byte oriented)
1972 - C - Multiple types, including byte / character
1972 - 1978 - C and UNIX co-evolved with a goal of increasingly less assembly language in UNIX
1978 - K&R C
1989 - C89 / ANSI - void type, C++ declarations, character sets, locales
1990 - C90 ISO C
1999 - C99 - complex type, // comments, Unicode
2011 - C11 - Library improvements
2018 - C17 - Cleanup of C11
There was a language called B and they were using it at Bell Labs to build utilities and operating systems stuff. But it was a little too word-oriented. As new computer hardware came out that supported byte addressing instead of word addressing and the ability to load a string of bytes and store a string of bytes instead of words, words being larger than a byte and being more than one character ... C wanted to make a character a sort of core, low-level data type that the language would handle. The mid-1970s saw C and Unix kind of co-evolve. They wanted to build something that would make Unix work well on a PDP-11/20 and, at the same time, make it so they could port Unix to other systems. But really it was about the PDP-11/20's cool memory architecture having to do with byte addressability. And what happened was they were carefully rewriting Unix and C, fixing C, and laying the groundwork for Unix portability. So by 1978 the K&R C book was published, and at that point you could think of it as a summary of over a decade of research in how to build a portable programming language and then use that programming language to build a portable operating system. By 1989, C had become popular and there was a need to standardize it. There was a variant of C, C89, that was released, so-called ANSI C.
C has continued to evolve from sort of 1990 to the present. There have been a number of major revisions, but the key thing is that these revisions do not attempt to make C an easy-to-use language like Python or JavaScript. C knows its place in the panoply of languages and does a good job of that. If we look at sort of what is the future of C ...
Challenges to using C as a general purpose language
No dynamic memory support in the core types / libraries
No "safe" string type
C++ is best thought of as a more powerful and flexible C for professional programmers and systems applications
Java / JavaScript / C# / Python - Types are usually objects - not "close to the metal" - Not as well suited for an operating system Kernel
The likely follow on to C in systems applications is Rust
Stays close to the metal while providing simple and safe core data types
Becoming the second official language in "Linux"
C is a difficult language to use as a general purpose language. Python is a great general purpose language but is not a great systems level programming language. Two things that are missing from C are solid dynamic memory support in the core types and libraries and a safe string type. There's no string in C. It's character arrays. And arrays have sizes. And if you start putting stuff beyond the boundary of that array things just blow up. And C++ to me is not the future version of C...it's really a more powerful, intricate, and flexible version of C for programmers who are doing professional, intricate systems applications. Writing good C++, in some ways, is more difficult than writing good C.
The languages that sort of take on C's mantle in the general purpose are those like Java, JavaScript, C#, or Python. The key thing with these languages is they don't give you sort of strings as just raw byte arrays. And they give us a simple object-oriented layer that keeps us away from the metal. The goal of C is to get close to the hardware. Close to the metal. And so all of those languages are great -- they're just not well-suited for writing an operating system kernel. The most likely language that is like C is Rust. The idea of Rust is that it stays close to the metal, but then gives us some simple and safe core data types. And recently Linux is starting to accept some Rust. That means Rust has to be mature. It can't be like evolving rapidly.

7:10: Now, C has been around for a long time. Before C, most of us would write assembly language or FORTRAN. But FORTRAN is not really a general purpose programming language. FORTRAN was really for scientific computations. In the earliest of computers, in the 1950s or 1960s, they were either really specialized toward like payroll or HR systems or they were really specialized to doing computations. C, as a language, was kind of none of the above in that it was aimed at writing system code, a kernel, an operating system, and the utilities around it, including like other languages. And so C is kind of the mother tongue of all kinds of other derivative languages. Things like Bash, Perl, Python, PHP, C++, JavaScript, Java, C#, Objective-C, etc., kind of were derivatives of this beginning of C, and that's why you see a lot of similar patterns in these other languages. JavaScript and Java, for example, both inherited their for loop syntax from C.

8:51: On top of this history of the C language we can look at a brief history of computers that starts really in the 1940s with a focus more on communication rather than computation. Communication and computation were very much connected throughout the 1940s and even today. In the 1950s, computers were basically multimillion dollar strategic assets. Every single computer. And a lot of them were custom built:
1940's - Top Secret / Military / WWII (ihts.dr-chuck.com)
Early 1950's - Custom built
Late 1950's - Companies like IBM, DEC, etc. begin selling computers
1960's - More companies, less expensive, wider range of options
Late 1960's - Many kinds of computers old/new/fast/slow
1970's - Searching for "the one" solution for software
1980's - Microprocessors and Personal Computers performance++
1990's - The network is the computer - performance++
2000's - Amazon AWS founded in 2002 - computing as commodity
The first computer at Michigan State was built by the electrical engineering students of that university based on some designs that they borrowed from Illinois. So things like the programming language, the operating system ... you didn't have a whole lot of generalization. You didn't have a lot of sharing. You tended to write code, put it on a paper tape, or later a magnetic tape, and load it and run it. And so you were just pretty happy if the code worked. You didn't need an operating system, and these weren't multiprocessing computers. So the software environment was very minimalist. But in the late 1950s and early 1960s you saw companies like IBM and Digital Equipment Corporation (DEC) begin selling general purpose computers. They just could make them and began selling them. And they were still expensive, so a business would have only a couple of computers to help with something like payroll. Something that was really really important because computers were expensive. But in the 1960s we really got to the point where the computer componentry (the chips etc.) was becoming a commodity. You could just go to a place and buy chips. And then you could make a computer by buying a bunch of chips and putting those things together. And because you weren't building everything from scratch the cost got a lot lower. The other thing about these less expensive computers is that they were a little slower. But by the end of the 1960s there were a lot of computers. There were some super expensive sort of one-offs, small production computers; there were previous generations of minicomputers, lots of them lying around computer science departments or businesses that weren't really quite sure what to do with them because they wanted to buy a new one. And then there were just innovative, low cost computers coming out. And in the 1970s, in this milieu of just new and old computer hardware, the question was whether or not there was a way we could do things with all of this old hardware. And is there sort of one solution? And that's where Unix and C came in.

11:42: After the 1970s we look at the 1980s and that's where microprocessors and personal computers came in. So we went from computers the size of refrigerators or desks to where a computer could be on a single chip. In the beginning those personal computers had really bad performance, but once you could get everything on a single chip the performance could get faster very quickly. And because personal computers became a mass market item, a lot of money could be invested in personal computers. By the 1990s, personal computers continued to grow. The need to communicate, talk, and exchange information became all the more important. So in the 1990s we saw really an increased focus on connecting computers with the Internet and other kinds of networks. Price kept going down, performance kept going up. By the time we get to the 2000s, Amazon AWS was founded in 2002 and it used personal computer microprocessors (like from Intel) and produced computing as a commodity. So you don't even buy computers anymore. You just go to Amazon and say you'll rent a computer for $7/month.

13:05: So we see in 1978 this moment where we're going from computers becoming more and more common and going down in price. There was a diversity in computers. These days there's actually less diversity. So let's take a look at the operating system (Unix):
1960s - Multics
1970 - UNIX on a DEC PDP 11/20
1973 - UNIX Rewritten in C - Ran only on the PDP 11
1978 - UNIX ran on the Interdata 8/32 - C Evolved as well to support portability so UNIX could be ported
1978 - Unix version 7 ran on DEC VAX systems
1978 - 1BSD Unix Released from Berkeley Software Distribution
1982 - Sun Microsystems Founded - UNIX Workstation
Late 1980's - Intellectual Property became complex
In the 1960s there was a multiuser operating system called Multics. Then, in the 1970s, they wanted to come up with yet another operating system, and they eventually called it Unix. And the DEC PDP 11/20 was a new commodity that was coming into the marketplace. And so in 1973, Unix was rewritten in C, but it was only there on the PDP 11, although they had laid the groundwork for portability from the beginning. They knew they wanted to have everything be portable. They just couldn't make it all portable. The first version they just had to make it run on the PDP 11. Then, by 1978, the second computer that Unix ran on was the Interdata 8/32 which was quite a different computer. And that was good because they learned a lot about making Unix a portable bit of software. In the early 1970s C was evolving in a way so that Unix could be ported. So it's like we got this problem between, say, the PDP 11 and the Interdata 8/32 -- how can we fix this? We can change how the operating system works, we can change the operating system code, we can change the C compiler, and we can rewrite the operating system code to get less and less assembly language and more and more C language. So the idea was to get to the point where there was a very very small amount of assembly language in Unix. And over the years that's gotten lower and lower. Unix was rewritten. A number of versions came out in the 1970s having to do with their portability. So by 1978 Unix version 7 could also run on a whole new architecture from DEC, namely the VAX systems. UC-Berkeley had their own distribution of Unix called BSD, Berkeley Software Distribution. That was really cool because universities often pushed things like networking, TCP/IP, ARPANET. BSD Unix was one of the first places some of us saw TCP/IP. In 1982 a company was founded that was based solely on Unix: Sun Microsystems. Some work at Stanford, some work at Berkeley, but all of it was based on Unix.
They created what in effect was the Unix workstation marketplace.
At this point you could imagine that the world was about to just adopt Unix. Unix was the greatest thing ever. Computer science departments were teaching Unix in the operating systems classes in the 1980s. The problem became that AT&T had never really come up with a business plan for what the purpose of Unix was. So there were some challenges as to how they could monetize this extremely popular thing. And they didn't do a great job and it took them a long time to figure out what was going to be successful. By the time AT&T figured things out, the market had moved on:
Late 1980's UNIX was very popular - AT&T saw an opportunity to commercialize their work. Many variations of UNIX had bits and pieces taken from AT&T UNIX - it got complex quickly
1987 - MINIX was developed as a fresh ground-up implementation by Andrew S. Tanenbaum to teach operating system concepts - it was free but modification and redistribution were restricted.
1991 - Linus Torvalds wanted to build a fresh ground up implementation of the "UNIX" kernel that was 100% free - some of the utility code came from the GPL-Licensed GNU project
1992 - Linux adopted the GPL license
MINIX was an operating system that was developed in the Netherlands by Andrew S. Tanenbaum. He built a completely free and open source operating system that was used for education. He built a textbook around it, and it was very popular. But he didn't want commercialization. At least not at first. He sort of held on to it too tightly, a kind of intellectual property mistake. In 1991, a programmer named Linus Torvalds decided he was going to build a fresh ground-up implementation of the Unix kernel that was 100% free. So he wasn't going to use Unix, he wasn't going to use MINIX. He wanted to create another thing. Originally, it was just like a hobby. By 1992, Linux started to work and it adopted the GPL (GNU General Public License), which is a strongly open source license, which means it's difficult to take Linux out of open source. Which meant that people could invest in Linux.

17:55: So Linux has become Unix. Linux in the modern world is the Unix-like system. And Unix tried to hold on for a while but they really really couldn't. So the remaining Unix distributions are a tiny tiny fraction of the marketplace. Linux is the marketplace.

Best programming language ever? (David Bombal YouTube interview)

Link: Best Programming Language Ever? (Free Course)

3:48: The modern era of computing started in 1978. Not networking. Programming, hardware, and the ability to make our modern M1 processor. In 1978, the foundation was laid so that Apple could switch from Intel to M1. Period. The M1 processor potentially wouldn't even exist if it weren't for the C language, Unix, and what they (the folks at Bell Labs) did then.

8:04: There was a fundamental change in architectures (from first edition of K&R to second edition), where we were going from word-oriented architectures to character-oriented architectures, and that's the difference between B (the language) and C. In 1978, K&R weren't sure which was the right architecture. But then what happens is the future architectures all happen, C happened, Unix happened, and in 1984 K&R wrote the same book and took all of the innocence out of it. They had changed the world between 1978 and 1984 -- they knew C was a good language by the time the second edition came out. And that's the version that has been in print all these years.

27:46: 1978 is the single most pivotal moment in all of computer science history, and the key thing is that you can draw a line and after 1978 literally nothing before 1978 was true; that is, there was a truth before 1978 and then after 1978 there was a new truth. The things that were true pre-1978 were not true post-1978. Pre-1978 people tended to write FORTRAN on computers that were word-oriented -- they tended to think about floating point and integer calculations for weather simulations and things like that. We tended to use computers to compute. We did not use computers to communicate or write word-processing documents or whatever. And so the architecture of computers was tuned to computation, to numeric computation. So the programming languages were tuned to numeric computation, the hardware was tuned to numeric computation, and generally the problems we solved tended to be tuned to numeric computation. All of this before 1978. And what's happening in the 1970s is being partly driven by the ARPANET and the internet is that computers are becoming mediators for human communication. So two people may be talking over Zoom, for instance, but computers are mediating their communication. What happened was this notion that a computer word had to be long so that you could have high accuracy computations, but characters are short so when pioneers were young you had 10 characters per word and you would mask and shift to get a character out -- and there used to be no lowercase on a computer. What was happening in 1978 and the thing that C and Unix and that Bell Labs had was that the smaller computers were not really aimed at computation. Think about AT&T -- they were a telephone company. It was thinking of communication, not computation. It wasn't doing ballistics or weather computations or supercomputer simulations -- it was about getting phone calls to go back and forth and to get billing to work. What is billing? 
Billing is text -- you made a long-distance call; we'll send a bill out later. And so the idea of billing and all that stuff moves from computation as the essential purpose of computers to mediated human interaction so you're just getting the bill from a computer. What happened was the computer architecture changed from these long words for high accuracy mathematical calculations to single characters, 8 bits. 6 bits. 9 bits. 10 bits. Whatever it was so you could do characters. 8 bits could do upper and lower case from a Western-Latin character set. In 1978 that's kind of what mattered. 8-bit bytes. And so the problem was that none of the programming languages were suitable, none of the operating systems were suitable, nothing was suitable. And here you have AT&T moving from mathematical computation to textual computation. There was a language before C called B. The difference between B and C is that B is a word-oriented language that looks like C and C is a character-oriented language that looks like C. That changed everything. What then happens is that these character-oriented computers are so slow and cheap compared to the big supercomputers that were extremely expensive. So now, for example, a bank or department could own a computer that cost around 100k dollars as opposed to supercomputers that would cost millions of dollars. These computers could be passed around and reused. But these computers didn't have much support in the way of operating systems or programming languages. The problem with them in the 1970s is that these little computers are coming out with high frequency, where several of them were different from each other (because they were exploring a different kind of architecture, namely a byte- or character-oriented architecture as opposed to a word-oriented architecture). The problem they solved at Bell Labs was woah we just wrote an operating system for this particular PDP-8, but now we just bought a bunch of these other things. Crap! 
Now what do we do? Well, maybe we should just make an operating system in a language that abstracts away the low-level bits of it and just make a compiler for this new thing. They had a whole range of new cool computers and old crappy computers, and they wanted to use them all to do things. The old hand-me-down computers they would use for, say, email or things like that. But they wanted to write an operating system that would work all the way from the cool new computer they bought down to the old hand-me-down computer. They were studying language and operating system portability. Their goal was to take Unix and reduce the amount of assembly language in it to a smaller and smaller fraction. That was the research -- all operating systems were written in assembly language before Unix. And now we write an operating system in a high-level language and then we still have to write a tiny bit of assembly language. And we have to write a compiler, but that compiler can generate code that's probably faster than hand-built assembly language.
What happened was that by the time they were done in 1978 you see them talking in the book about the Interdata and the PDP ("we did this in the PDP and we had to change Unix, and we had to change C so that it would work with the Interdata") ... and in that portability, in that sort of progress of hardware progress marching on, they figured out how, when they got a new piece of hardware, how to quickly get to the point where they had an operating system, a compiler, and some networking, and they could plug that system in and it was materially the same as all the other systems they had purchased for the last 5-6 years.
What this allowed for was separately procuring the operating system and the hardware, which meant if there was a little company that came up with an innovation that was a computer hardware innovation they could buy that and say ya know what, we've been buying these PDP-8s for a long time, but man this Interdata company they're cheaper, they're faster, they've got more memory. Let's just recompile everything and use this Interdata stuff because that's cheap. The research was how to use a wide variety of nuked old hardware to do the same thing. And then what happened was that because they had solved the problem of software portability, hardware vendors could begin to accelerate the iteration time. Even the same hardware vendor could come up with later versions of the same thing and make changes in the architecture once they realized that a single instruction slowed the computer down. They could take that instruction out. Woah. In the old days of assembly language, that would break your code. Well, not in C. You just change the compiler to not use that instruction and then you recompile everything and boom, then this new twice as fast computer that had one instruction removed is working. A week later you're up and running on that hardware. This permitted innovations to happen. And that innovation has led to Moore's law.
The most recent example of seeing this in action is how Apple transitioned from PowerPC as their core hardware architecture to Intel as their core hardware architecture to an ARM M1-based architecture. And it wasn't all that hard. Of course M1 works. It's using software portability as a defining characteristic. Apple is like we're going to change the architecture like every year because guess what. We're using C and everything's portable. We recompile it, we can add three instructions or take away three instructions, and it's okay on a year-by-year basis to iterate.
What's way more surprising than Apple's move from Intel to ARM was PowerPC to Intel. This was a very closely held secret by Apple. Steve Jobs spent a lot of time and money marketing how bad Intel was and how much better PowerPC was -- he wasn't lying but was being extremely selective in what he was pointing out. The PowerPC architecture was a better floating point processor than it was an integer processor. And the Intel was a better integer processor than it was a floating point processor. So all the marketing for Apple before they moved to Intel was how important floating point was to things like charcoal filtering in Photoshop. "Watch how fast charcoal filtering works on a Mac compared to Windows. It's 12 times faster!" And that's just because the algorithm for charcoal filtering was using floating-point arithmetic instead of integer arithmetic. Other than that, the PowerPC is a terrible architecture for a portable computer. It's big, it makes a lot of heat, uses up your battery real fast, and it was not nearly as good as Intel on integer calculations, which is what word-processing uses, which is most of what we do.
Steve Jobs had like a 5-year project that was 5 people and it was recompiling the Mac operating system and running it on Intel-based hardware. For 5 years in a secret lab that only like 10 people knew about. [...] Apple, in around 2008, had a bunch of prototype Intel-based Mac boxes in a showroom that worked for different users to try out. C is why they could do that. C is why they could have an operating system that they were secretly running on Intel hardware for 5 years all while selling machines that ran on PowerPC hardware. [...] Apple can just wake up on a Tuesday morning, look at all the data and say ya know what, we're going to turn down the chip space we're going to give for crypto, and we're going to turn up the chip space we're going to give for video editing, and that'll be called the M2.3 or whatever. And because of portable operating systems and the ability to recompile, they can just play that. This is part of why understanding computer architecture is important. Because now we can have this conversation about chips and about memory sizes and all this stuff and if you don't even know what assembly language is, then you don't quite understand how the M2 is profoundly different than the Intel processor. If you're a professional, then you should have some concept (not that you'll be a hardware developer) of the nature of change in hardware and how that changes what you can do in terms of power and software.

C programming language (Computerphile interview with Brian Kernighan)

Link: "C" Programming Language: Brian Kernighan - Computerphile

0:20: Multics was sort of the second version of time-sharing that was done at MIT with collaboration of Bell Labs and General Electric (very big system for providing, in effect, a computing utility for programmers). Multics was actually being written in high-level languages. They started with PL/I, which was not a good choice. Then they moved to BCPL, a language which was developed by Martin Richards of Cambridge.
The people working on Multics at Bell Labs, in particular Ken Thompson and Dennis Ritchie, had gotten used to the idea that you could write really interesting operating system kinds of software and tools that supported it in high-level languages which meant you weren't writing assembly language (all operating systems before then were written using assembly language). So when they started working on Unix, this very small stripped down version was done at Bell Labs, they were at the same time exploring the use of high level-languages, and there was a variety of these. There was a language called B, which was the first attempt, done by Ken Thompson, and it was a very simple language. I suppose you could say it was a very simple stripped down version of BCPL.
You could do interesting things with B, but it wasn't quite powerful enough. In particular, it was an interpreter, not a compiler, and that meant it didn't generate machine code for the particular machine it was running on -- it was just a program that interpreted what it was trying to say. So that meant it ran relatively slowly. The other thing is that the language did not support the architecture of newer machines that were showing up, in particular the PDP-11. Newer machines, instead of having everything in the machine be a single size of, in effect, integer, let's say 16 bits or something like that ... they had different sizes that the machine would efficiently and naturally manipulate (8 bits, 16 bits, maybe 32 bits). The language B didn't reflect that. And so Dennis Ritchie undertook to design what amounted to a version of B that was richer and had mechanisms in the language to say this is an 8 bit quantity, this is a 16 bit quantity, basically the char and int types that you see in the early versions of C.
And so he wrote a compiler for that and then, with the compiler in hand, Ritchie and Thompson started to rewrite the operating system itself in C. This took a while, somewhere on the order of 6 months to get that working. And at that point you've got the operating system, and of course all kinds of other critical core software, written in a high level language rather than in assembly language. And that's important for all kinds of reasons. It means that it's just a lot easier for lots of people to see what the code does -- you don't need so much expertise because the code is in a high level language. And the other thing is that it's portable in the sense that if someone makes a C compiler for a different kind of machine, which uses a different kind of architecture, then you can recompile whatever your program is for that new architecture, and that way you can move your program to a different machine. Of course, this has always been the advantage of high level languages, but now you could do it for the operating system. And that meant that Unix, the operating system, no longer was only for the PDP-11 but could run on a variety of other machines as well.

3:56: I think the concept of having a high level language is clear -- you compile it down to hexes and machine code that tells the processor to do stuff and if you have a different compiler for a different machine, it just changes what those hex codes are? (That's exactly right.) So it should be really simple to understand how you write an operating system in a high level language, but is it a bit like making a sharper tool and using that sharper tool to make a sharper tool?

4:24: The complication in writing an operating system in a high-level language and translating it into, let's say, multiple different architectures is that there's a lot of detail to get right there, and there are some things which the high-level language just doesn't support. So you need a tiny bit of assembly language assist for any given operating system. But what's more like the conceptual problem referenced with sharpening tools is how does the compiler itself get started because, of course, the C compiler is written in C. And so there's the how do you get it started and the idea of making the sharper tool to make a sharper tool and so on is metaphorically the right idea because what you do is you say I wrote some code in C on paper -- let me mechanically translate that into something that will process a bit more of C code and then bootstrap myself up from that. So it's a bootstrapping process. That's the basic idea.

Brian Kernighan interviews Ken Thompson at VCF East 2019

Link: Ken Thompson interviewed by Brian Kernighan at VCF East 2019

17:40: Multics was a monstrously overengineered big big project. It's a typical second system syndrome. We had a very nice time-sharing system at MIT, and they decided they were going to do the next one better. And that's the kiss of death. So they cooperated in a three-way thing with MIT, Bell Labs, and General Electric. And GE provided the machine because it couldn't run on a normal machine -- it had to have its own machine. Monster. And programming was done at Bell Labs. It was mostly designed at MIT. So we got these assignments do this, do this, do this. And we did them. And mostly I was uninspired. I'd do it and it was good work. I was a pretty good programmer. I just didn't know ... I was a notch in a big wheel. And it was producing something that I didn't know. Didn't want to use myself.
At some point, Bell Labs management realized this. And they backed out of the project. So now it was MIT and Honeywell. Bell Labs backed out with a nasty taste in their mouth. We don't do operating systems. No! Here I was and I wanted to do operating systems. But I found a work station. It was a PDP-7. It was a remote job entry for electrical engineering circuits. You draw a circuit on a CRT tube and you push a button and the circuit goes to the main computer by way of data sets or telephone lines. And they do the crunching. And they come back and then you can get the transfer function of various sorts on the screen. All electrical engineering. I just took it over and made some games on it. [...]
I was interested in a disk. This PDP-7 had a disk. No other PDP-7 had one like this. It was 6ft tall. A single platter, and the platter was on the vertical around. There was kind of a folklore -- you don't stand in front of it or behind it because if it got loose it would be like an airplane propeller. And it was fast. It was too fast for the machine. Which was kind of a shame you couldn't make good use of it. I wrote a disk-scheduling algorithm to try to maximize throughput on potentially any disk but, in particular, this disk. I got it going, but then I had to test it. I had to load it up with work. Ya know, you don't just ask it to read -- any disk can do that. So I had to really load it up and test its throughput under different algorithms and things like that. For that, I needed some programs on the side to do that. So I started writing these programs on the side and at some point I realized, without knowing it up until that point, that I was 3 weeks away from an operating system. With three programs, at one a week. An editor, to write code. An assembler, to turn the code into a language I could run. And I needed a little kernel kind of overlay. Call it an operating system. Luckily, right at that moment, my wife went on a 3-week vacation to take my roughly 1-year-old son to visit my in-laws who were in California. Disappeared and all alone. One week, one week, one week. And then we had Unix.
When Unix was running, it was by far, even though it was a crummy little, ya know, factor of 10 slower computer than the comp center ... I started picking up really impressive users on this machine. It had two stations. I put a scribble text on this scope. That was one station. And then I had a model 33 teletype on the other one. Two at a time. Dennis was a user. McIlroy was a user. Morris was a user. That was the order of the users on the single machine. Bell Labs still had a nasty taste in its mouth from its horrible experience with Multics. No more operating systems! So we made a proposal to get a PDP-10, which was like the time-sharing computer of the day, to port this operating system (Unix) that everyone liked onto (by everyone just meaning the four of us). The proposal was soundly rejected when it fit into the budget of a single person requesting something -- usually, a single person at Bell Labs could consume roughly their loaded salary in a year on the side. The notion was that somebody's salary, their loaded salary (that included the building and the guards and everything else), twice over was kind of the budget for Bell Labs. Ya know, if you get 4 people together you can get multi-million dollar things. So we were well within budget when we asked for the PDP-10, but they said no. We don't do operating systems. We don't do it.
So one of our fellows, Joe Ossanna, came up with a lie where the patent office was going to buy a special purpose editing complex to edit, store, and modify patents. They had their own formatting requirements. They had their own numbering conventions. They were unique. No normal editing thing. So some company was selling computers for patent applications. And we said aha! We can do that. And we'll save all this money. So the second proposal was to save money rather than to spend it. And for a machine -- no operating system, honest! -- that was for somebody else. It was a three-way win. It was impossible to say no. So we got it. And instantly ported Unix to it. And then Joe Ossanna believed us and wrote nroff and troff with enough macro power to do the patent. And then at some point we were actually doing patents. During the day we'd have 10 patent secretaries typing in patents. And we'd print them, but we'd do our own work at night. But we wouldn't do anything serious. Because this was an unprotected machine. And would crash if you did anything remotely malicious like writing machine language. And then the patent office took our machine. They loved it. And bought it. The acoustic people footed the bill.
The first one was a PDP-11. There were very few of them made. We got it before the peripherals came. Morris wrote DC, the very first program on the PDP-11. And while it was still a paper tape operating system because the disk hadn't come. And we were doing some porting and writing in assembler that we could live with for that machine and some other things in preparation and testing it with the assembly language for DC. Then when it came Unix went up almost instantly. Once the disk and the communications (teletype, remote teletype, and communications gear). It was feverishly trying to build in real time.

39:03: Doug McIlroy is the smartest of all of us and the least remembered or written down. He had a friend, Robert McClure, who had a language called TMG, and McClure got very commercial about his language and wouldn't let it out. When he left Bell Labs he said it was proprietary. It was a compiler-compiler. It was a yacc-like thing, but it was not assembled at a time, it was matching -- it was doing recursive descent. McIlroy sat there and wrote, on a piece of paper now, not a computer, TMG written in TMG. And then he now has TMG written in TMG and he decided to give this piece of paper to his piece of paper and write down what came out. The code, which he did. Then he came over to my editor and typed in his code, assembled it, and I won't say without error but with so few errors you'd be astonished. He came up with a TMG compiler on PDP-7, written in TMG. And it's the most basic, bare, and impressive self-compilation I've ever seen in my life. It was a very early tool on the PDP-7.
I decided no computer was complete without FORTRAN. It's gotta have FORTRAN. Nobody will ever buy a computer without FORTRAN. The PDP-7 was 8k of 18-bit words. And I ripped off 4k for the system (for myself), and the user swapped through the other 4k. And so I was FORTRAN and TMG. I was having a great time. And the first time I actually tried to do it it was like 10 times the size of the thing. So I started cutting pieces out of it. Down and down and down and down. When it finally got down to 4k, I called it B. But it was right at 4k because it came from above. There's no reason to stop when you get to where you want. Then I put features in that I liked. And it would blow over 4k. And it would not run. So I wrote a separate version of B which was a virtual B that would run the program out of disk. So it would grow over 4k, I'd run it on virtual B to get the B source, and hopefully what I wrote was a compaction of some sort so it would get smaller so it would get under 4k so I could bring it back to being roughly real time. This went over and over and over until the final version of B. There was one other added thing to B which was ... I saw Johnson semicolon version of the for loop. I put that in. I stole it. And that went virtual and came back down again.

43:34: And then Dennis took it and wanted to put it on the comp center. The big comp center. He wrote a compiler for B for the big comp center. And he called it NB for NewB. It was used to some extent and externally ported to several universities. And then he decided that wasn't enough. At this point, time had gone on and we now had a PDP-11 and we started getting more memory for the PDP-11 so we could think about expansion. And we decided that we had to write Unix in a higher level language. It was just mandatory. It was all assembly language until then. So he started mutating NewB into C. The big deal was types. B and the old old C were very very similar languages except for all the types. NewB only had words. You'd load, store, add. Everything was words. The PDP-11 was bytes. So something had to be done to not waste factor 4 ... long story short, all by himself, he converted that, made C. I then tried to rewrite the kernel in C, whatever this current language was called (it was called C). And failed three times. Three total complete failures. Basically, being an egotist, I blamed it on the language. So he'd go back, beef up the language for something, and then finally when structures came in the way that structures did come into the language, which was completely outside of B (B had nothing to do with or resembling structures) ... the port to Unix of C on PDP-11 worked. Before that it was too complicated. I couldn't keep it all together. So then there was the first C version of Unix.
