The Curious Case of the Longevity of C
Despite the industry’s reputation for fast change, some technology choices have stayed remarkably static over the past few decades. In this article we look at the C programming language, which is over 40 years old yet remains a core piece of AHL’s, and the world’s, technology stack.
It is the early 1980s. You have been programming in BASIC, FORTRAN and a bit of Pascal, and you read in BYTE magazine of a hot new language: C. Want to take a bet on it? There are only two bookshops in London that stock computer books and just one book on C: K&R. The book is expensive, about a week’s rent for a couple of hundred pages, but it is the only way to learn in these pre-Google1, pre-StackOverflow/GitHub2 days.
C is a bare-knuckle language and you struggle with dynamic memory allocation, pointer arithmetic, prefix vs postfix increment, operator precedence and all the rest. Any non-trivial data structure has to be crafted by hand. Useful library code is often distributed on floppy disks taped to the front of a magazine. Furthermore, machines were a lot more unforgiving to program in those days3. K&R C was introduced to a world where the preprocessor, now reviled, was considered to be a first-class solution to many programming problems. Incidentally, the preprocessor is so baroque that the standards committee tried to specify its behaviour and gave up. If you want an explanation of all the things that the preprocessor is doing, I know of only one tool that does this4.
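For anyone who has not met these particular struggles, here is a minimal sketch of three of them: postfix versus prefix increment, the `*p++` idiom that combines pointer arithmetic with precedence, and the classic `&` versus `==` precedence trap. The function names are made up for illustration and do not come from any real library.

```c
#include <stddef.h>

/* Postfix: yields the old value of *x, then increments it. */
int postfix_demo(int *x) { return (*x)++; }

/* Prefix: increments *x first, then yields the new value. */
int prefix_demo(int *x) { return ++(*x); }

/* *p++ parses as *(p++): read the current element, then step the
   pointer on by one int (pointer arithmetic scales by sizeof(int)). */
int sum_ints(const int *p, size_t n)
{
    int total = 0;
    while (n-- > 0)
        total += *p++;
    return total;
}

/* == binds tighter than &, so `flags & mask == 0` would mean
   `flags & (mask == 0)`. The parentheses are not optional. */
int is_clear(unsigned flags, unsigned mask)
{
    return (flags & mask) == 0;
}
```

Starting from `x == 5`, `postfix_demo(&x)` hands back 5 and leaves `x` at 6, whereas a following `prefix_demo(&x)` hands back 7.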
As you try to master this radically more powerful language, you soon start thinking “at least no one will be writing this kind of code in 2017” (well, to be fair, you probably thought “… in 10 years’ time”). And yet here we are: it’s 2017 and we still find ourselves reliant on C.
Next year will be the 40th anniversary of the first edition of K&R. Just consider the extraordinary advances made in other areas of computing over this period: hardware, networking, user interfaces and all the rest. How has C survived almost unscathed? Add to this the disastrously central role that C has played in our ongoing computer security nightmare.5 One standard textbook takes 534 pages to explain secure coding standards in C and, as far as I can see, the post-Heartbleed Core Infrastructure Initiative is focused exclusively on projects that use C.
In Man AHL we depend on C absolutely:
- Linux powers everything we do; here’s why it is written in C. With operating systems it pays to be conservative. This author once worked on a widely used operating system written in C++. It is now defunct6.
- We rely on Python, which is written in C89, a standard older than many of our developers. Perhaps the reluctance to move to a more ‘modern’ C is that even C99 can’t legally buy a drink7.
- We use `git` for source control. `hg`, largely written in Python, was started within a few days with the same goals. Who won?
- NumPy… well, you get the picture.
So there are many, many layers of technological sediment that have been laid down over the bedrock of C.
C is often referred to as cross-platform assembly. In a sense that is true, since C compilers can be found on pretty much any platform from bare metal upwards8, but cross-platform code usually needs to use conditional compilation and that rapidly becomes a monster. Here is what that looks like as visualised for a single Linux Kernel file (warning: large page).
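To give a flavour of conditional compilation at its most benign, here is a small sketch. `_WIN32`, `__APPLE__` and `__linux__` are macros that the usual compilers predefine on those platforms; `PLATFORM_NAME`, `PATH_SEPARATOR`, `platform_name` and `base_name` are names invented for this example.

```c
#include <string.h>

/* Pick per-platform settings at compile time. Only one branch
   survives preprocessing; the others never reach the compiler. */
#if defined(_WIN32)
#  define PLATFORM_NAME  "windows"
#  define PATH_SEPARATOR '\\'
#elif defined(__APPLE__)
#  define PLATFORM_NAME  "macos"
#  define PATH_SEPARATOR '/'
#elif defined(__linux__)
#  define PLATFORM_NAME  "linux"
#  define PATH_SEPARATOR '/'
#else
#  define PLATFORM_NAME  "unknown"
#  define PATH_SEPARATOR '/'
#endif

const char *platform_name(void)
{
    return PLATFORM_NAME;
}

/* Return a pointer to the final component of a path, using whichever
   separator was selected above. */
const char *base_name(const char *path)
{
    const char *sep = strrchr(path, PATH_SEPARATOR);
    return sep ? sep + 1 : path;
}
```

Four branches and two macros are easy to follow. The monster appears when dozens of such switches nest and interact, because each combination is in effect a different program and most of them are never compiled by any one developer.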
Any good worker needs to know their tools and, whilst early compilers played fast and loose with your code, nowadays, in the face of ever-predatory optimising compilers, any C developer needs to understand undefined behaviour. This requires some familiarity with the ISO C standard, not exactly the friendliest work of reference. A far cry from the simplicity of the original K&R, this monster weighs in at around 700 pages, and I always seem to find that the answer I’m looking for is in some obscure footnote many pages away from where I am reading.
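Signed integer overflow is probably the best known example of undefined behaviour. The sketch below shows one common way to stay on the right side of the standard: test against `INT_MAX` and `INT_MIN` before adding, rather than adding first and inspecting the wreckage. The name `add_checked` is made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Tempting but wrong, for b > 0:
 *
 *     if (a + b < a) { ...handle overflow... }
 *
 * If a + b overflows, the behaviour is undefined, so an optimising
 * compiler is entitled to assume it never happens and delete the test.
 */

/* Check first, using only subtractions that cannot themselves overflow.
   Returns false and leaves *out untouched if a + b is not representable. */
bool add_checked(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;

    *out = a + b;
    return true;
}
```

The design choice here is to do the range check in terms of values that are guaranteed to be representable, so that no expression in the function can ever invoke the behaviour it is trying to detect.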
All languages trade off the cost of writing code against the cost of running it, and C represents an extreme high/low on that scale. You might think that some higher level language that gets translated to C would solve a lot of these problems; indeed, Cython has carved out an effective niche here. The problem with code generators is that second order effects come into play and your debug story is likely to be somewhere between moderately worse and far worse. And none of this really protects you from some of the worst that C can do, as Cloudflare discovered.
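The Cloudflare incident was publicly reported as a buffer over-read in machine-generated parser code, where an end-of-buffer test relied on equality and so could be stepped over. The sketch below is not that code; it is a hypothetical length-prefixed parser (`count_fields` is an invented name) that shows the safer shape of the check, comparing against the bytes that remain rather than hoping to land exactly on the end.

```c
#include <stddef.h>

/* Count the length-prefixed fields in buf[0..len). Each field is one
   length byte followed by that many payload bytes. Returns -1 if the
   last field claims more bytes than the buffer holds. */
long count_fields(const unsigned char *buf, size_t len)
{
    size_t pos = 0;
    long fields = 0;

    while (pos < len) {
        size_t field_len = buf[pos++];

        /* Compare with what is left. A test such as `pos == len`
           after the jump would miss the case where pos lands
           beyond the end, and the loop would keep reading. */
        if (field_len > len - pos)
            return -1;

        pos += field_len;
        fields++;
    }
    return fields;
}
```

Note that `len - pos` cannot wrap here, because the loop condition and the single increment guarantee `pos <= len` at the point of the comparison.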
One attraction in writing in such a long lasting language is that your code can have a similar lifespan. I recently stumbled across some C code I wrote in 1987 that was a brute force approach to a particular graph colouring problem. Today I can compile and run the code by making only minor changes; `register i;` is no longer a thing so that piece of nostalgia had to go. In 1987 this code took around half an hour to run, today 0.03 seconds. Progress eh?
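For the curious, `register i;` leaned on the old implicit `int` rule, which C99 removed, so the type now has to be spelled out. The sketch below is emphatically not the 1987 program; it is a hypothetical stand-in (`k_colourable` and `MAX_V` are invented names) showing the same brute force idea with modern declarations.

```c
#include <stdbool.h>

#define MAX_V 8   /* small fixed limit, chosen only for this sketch */

/* Try every assignment of k colours to n vertices and report whether
   any of them gives adjacent vertices different colours.
   adj[i][j] is true when vertices i and j share an edge. */
bool k_colourable(int n, bool adj[MAX_V][MAX_V], int k)
{
    int colour[MAX_V] = {0};
    int i, j;   /* 1987 spelling: register i, j; (implicit int) */

    if (n <= 0) return true;
    if (n > MAX_V || k <= 0) return false;   /* outside this sketch's range */

    for (;;) {
        bool ok = true;
        for (i = 0; i < n && ok; i++)
            for (j = i + 1; j < n; j++)
                if (adj[i][j] && colour[i] == colour[j]) {
                    ok = false;
                    break;
                }
        if (ok) return true;

        /* Advance to the next assignment like an odometer in base k. */
        for (i = 0; i < n; i++) {
            if (++colour[i] < k) break;
            colour[i] = 0;
        }
        if (i == n) return false;   /* wrapped round: all k^n tried */
    }
}
```

A triangle makes a quick sanity check: it cannot be coloured with two colours but can with three.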
C remains a remarkably popular language if you accept the TIOBE index, and it is still rated number 1 by the IEEE. This is perhaps all the more remarkable as C was intended as a systems implementation language and there are not many OSs being written right now. Indeed, C’s reach has gone far beyond systems: for example, GitHub recently needed a new markdown parser and for that they chose C99.
As a programming language, C has an essential simplicity, even if in practice this readily translates to “Be scared. All of the time”. Despite this I do find great pleasure in writing in C; to get something elegant working fast and safe is a reward other languages fail to give. Although I confess that the pleasure is more towards the masochistic end of the spectrum than the joyful one.
Will C become the Latin of computer languages, historically interesting but of little practical use? I would argue that it is worth the effort to learn C as it is one of the few languages that exposes you to some fundamental truths about computing platforms that politer languages take great care to shield you from. I find that the lessons I learnt from C have been subtly useful in understanding those long tail bugs that crop up from time to time. As Dennis Ritchie wrote about the history of the language, “C is quirky, flawed, and an enormous success.” So will we still be programming in C in 20 years’ time or, more prosaically, would learning it lead to a valuable career? I think of C the same way I think of UNESCO World Heritage Sites: no one in their right mind would build one today, but in decades to come will there be greying artisans delicately propping up some old relic? You bet.
Perhaps now it is time for a new generation to make their mark. Will their efforts last 40 years? Rust, anyone?
1. Around a dozen years thence.
2. Around two dozen years thence.
3. Programming hell at that time: without memory protection a pointer error would usually bring the whole machine down. No stack trace, no core dump, nothing, nada. Such joy.
4. I confess, I wrote it.
5. Thanks to John Regehr for this catchy phrase.
7. Yes, I know, it’s Visual Studio.
8. I’m trying really hard here to avoid the phrase “the lowest common denominator”.