How do you program an embedded system? Say, the code that runs on the control processor on a big SoC? What programming language should you use?
I have never really programmed an embedded system, unless you count writing a couple of experimental programs for smartphones, but a fair bit of what counts as embedded system programming today was just system programming twenty years ago. I have written device drivers for disk drives. I have written the lower levels of a full networking stack. I have written full operating systems for small computers. When I was doing graduate work at university, the microprocessor had barely been invented (okay, the Intel 4004 was a thing, but take one look at its architecture and you will see it is closer to programming an FPGA than a processor). The PC, IBM or otherwise, didn't exist. The Apple II came along somewhere around then. Anyway, all that programming was in assembly code for the various computers. When you only have 64KB of memory on the entire machine, everything has to be very tight. But today, you have more memory and you will likely be using a programming language. So what are the most popular choices in 2019?
As it happens, just this week, IEEE Spectrum did a survey and the table below shows the top 10 most popular programming languages. Here is the whole article. The methodology is a little weird since some of the "languages" aren't really programming languages, but we'll go with their ranking. After all, if a lot of people sit in front of computers and write HTML, and other people come to the websites and "run" the HTML, it seems somewhat arbitrary to decide that is not "real" programming.
I think it is instructive to look at these languages and see what type of language they are and where the support comes from. First, here's the list:
First-choice Python is a language created by Guido van Rossum and first released in 1991. Two things I note about the language. First, it uses indentation to indicate control structure rather than having explicit words or brackets, unlike many of the other languages on this list, which are what a friend of mine calls "curly bracket languages". Second, it is dynamically typed and almost anything that you can write that looks like it should make sense will do what you expect, although perhaps not very efficiently. The language is interpreted (which is why it is sometimes described as a scripting language). Despite the snake on the covers of books about the language, and even in its logo, the language was actually named after Monty Python (which, by the way, was first broadcast 50 years ago next month on October 5, 1969). Many university computer science courses use Python, so it is often the language new graduates are most familiar with. Python's popularity is also partially driven by its being quasi-standard in machine learning (and various other domains), and so there are rich libraries available. In particular, TensorFlow's primary interface is Python (although much of its core is written in C++), and although Caffe is written in C++, it also provides a Python interface.
Java is a language invented at Sun Microsystems (before they became part of Oracle) under the lead of James Gosling. It was intended to be widely portable ("write once, run anywhere") across a wide range of machines from what we would now call IoT devices up to mainframes. It is the main language used for writing apps for Android smartphones, which represent about 85% of all smartphones. It was first released in 1995. The language is compiled into machine-independent bytecodes that can then be interpreted. It is typically JITted, which stands for "just in time", meaning that the compiler incrementally compiles and optimizes the code the first time it is run, for whatever machine it is running on, so there is a sense in which the program is ported on the fly this way.
C (just the letter, pronounced "see") is a language created by Dennis Ritchie at Bell Labs in about 1972 to create utilities (including the C compiler itself) for the Unix operating system. Subsequently, the Unix operating system was re-written almost entirely in C. C is a compiled language, and is regarded as very close to the actual underlying hardware so that there are few performance surprises. There are many implementations of the C compiler, the most well-known being the open-source implementations gcc and LLVM. But there are specialized proprietary compilers, too, such as those from Green Hills Software (which are certified for various domains).
C++ (pronounced see-plus-plus) was a derivative of C created by Bjarne Stroustrup to add Simula-style classes to C, and thus modernize the language. Its original compiler just output C that was then run through the normal compiler, but now a wide range of direct C++ compilers are available. The object-oriented structure makes it harder to write incorrect code since it is mostly strongly typed. But as Bjarne said at a seminar we had him run at VLSI Technology, "it makes it harder to shoot yourself in the foot, but when you do, it blows your whole leg off."
R (just a letter) is not really a language for writing general-purpose programs in. It is a language for statistical computing and data mining. Since this is the era of "big data" it has made its way onto this list in fifth place. The main implementation is an open-source version written mainly in C (and parts in R itself). R was first released in 1995, although it was originally an implementation of the S language developed at Bell Labs in the mid-1970s.
C# (pronounced see-sharp like the musical note) was created by Microsoft around 2000 as the language to use in its .NET initiative. It has since become an international standard. It was widely seen at the time as Microsoft's response to Sun's Java, so they could have similar capabilities but without depending on another company.
MATLAB is the language used within MATLAB (the word has more than one meaning), the main product of MathWorks (they used to be "The MathWorks" but they lost the "The"). The implementations and language are proprietary to MathWorks, although there have been various open-source attempts to clone it. MathWorks was founded in 1983, although the development of the language had started earlier in academia. MATLAB is the standard environment for developing digital signal processing and other similar algorithms. Since it can output Verilog or C++, there is a path to implementation through high-level synthesis with Stratus technology, or regular synthesis through Genus technology.
Swift was created by Apple as a successor to Objective-C, which NeXT adopted and which came to Apple when it acquired NeXT. It was introduced in 2014, where it was described as "Objective-C without the C". There are implementations for macOS, iOS, and Linux. Apple open-sourced the language in late 2015, although Apple still leads its development.
Go was created at Google by Robert Griesemer, Rob Pike, and Ken Thompson. Yes, the Ken Thompson of Unix fame, whom I wrote about in my post Why You Shouldn't Trust Ken Thompson. It was a sort of re-work of C with modern language concepts such as concurrency and memory safety. Development started in 2007 in the era of big data centers, multicore, and massive scale, where existing languages were a liability. It is open source, now has many contributors outside Google, and is widely used elsewhere (my friend who works at Lyft programs mostly in Go).
For the low levels of an embedded system (device drivers, protocol stacks), you probably would not use the interpreted languages here, which leaves you with C and C++ (and maybe Go and Swift). But at the higher levels of an embedded system, the interpreted languages are great, which is why Java is in the Android SDK.
Most of these languages are either fully open source, or the language is standardized and there are open-source implementations, although for several of the languages there are companies that will provide proprietary implementations with various advantages. In the list above of the top 10 languages, only MathWorks has a business making money by selling the language and its environment directly. But even so, Wikipedia says disapprovingly:
MATLAB is a proprietary product of MathWorks, so users are subject to vendor lock-in.
One language not on the list that is a good choice for embedded applications is Rust. This has been Stack Overflow's "most loved" language every year since 2016 (including this year, 2019). It was developed by Mozilla Corporation, who are the developers of the Firefox browser among other products.
Rust is somewhat like C++, so it is easy to learn if you already know C++ or even just C (it is a curly-bracket language). Its big advantage over C++ and C is its focus on safety, especially in a concurrent environment. It has no null pointers, and it is impossible to end up with anything like a pointer to memory that has been freed already (known as a dangling pointer). It is not garbage-collected (as are C#, Go, Java) but instead has a protocol for how ownership of objects is handled, which is way beyond the scope of this post (which is too long already). This is all checked at compile-time, which is why Rust is a good choice for embedded systems where safety is often a critical attribute.
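For the curious, here is a minimal sketch of what that ownership protocol looks like in practice. It is only an illustration, not a full treatment: the Reading type and the find_channel and sum_on_worker functions are names I made up for the example, and it uses the standard library (threads, Vec), which a bare-metal embedded target may not have.

```rust
use std::thread;

// Hypothetical sensor reading, invented purely for this illustration.
#[derive(Debug, PartialEq)]
struct Reading {
    channel: u8,
    millivolts: u32,
}

// Borrowing: the function gets read-only access and the caller keeps ownership.
// There is no null pointer; "nothing found" is an explicit Option::None.
fn find_channel(buffer: &[Reading], channel: u8) -> Option<&Reading> {
    buffer.iter().find(|r| r.channel == channel)
}

// Moving: ownership of the buffer passes into the worker thread, so the
// caller cannot touch it afterwards and no two threads share it by accident.
fn sum_on_worker(buffer: Vec<Reading>) -> u32 {
    thread::spawn(move || buffer.iter().map(|r| r.millivolts).sum::<u32>())
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let buffer = vec![
        Reading { channel: 0, millivolts: 3300 },
        Reading { channel: 1, millivolts: 1800 },
    ];

    // The compiler forces both cases to be handled.
    match find_channel(&buffer, 1) {
        Some(r) => println!("channel 1 reads {} mV", r.millivolts),
        None => println!("channel 1 not present"),
    }

    let total = sum_on_worker(buffer);
    println!("total across channels: {} mV", total);

    // Uncommenting the next line is a compile-time error (use of moved value),
    // which is how a use-after-free is ruled out before the code ever runs.
    // println!("{:?}", buffer);
}
```

The point is that the mistake is rejected by the compiler rather than discovered in the field: the memory for the buffer is released exactly once, when the worker thread is done with it, with no garbage collector involved.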
You will find plenty of people predicting that Rust will surpass C++ and C over time...but the death of C++ and, especially, C is confidently predicted all the time.
Talking of languages, did you know today is International Talk Like a Pirate Day? So I think today you shouldn't consider programming your project in R, you should program in Aaarrrrr.