In addition to being the master of ceremonies for the recent embedded neural network symposium, Chris Rowen also presented his own thoughts. Chris used to be the CTO of Tensilica, and after Cadence acquired them he became the CTO of the IP group. Last year he left to create a startup in the deep learning space, called Cognite Ventures.
Something Chris pointed out last year at the previous summit was that 99% of captured raw data are pixels (photographs and video). This dwarfs everything else, such as sound and motion. Since 2015, there have been more image sensors in the world than there are people, and the amount of data they produce is staggering (10^10 sensors x 10^8 pixels/second = 10^18 pixels/second). Making sense of all this raw data requires computer cognition.
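The back-of-the-envelope arithmetic above can be checked in a couple of lines (the sensor and pixel-rate figures are the rough orders of magnitude quoted in the talk, not measured values):

```python
# Rough estimate of global pixel throughput, using the orders of
# magnitude from the talk.
sensors = 10**10           # image sensors in the world
pixels_per_second = 10**8  # pixels each sensor captures per second

total = sensors * pixels_per_second
print(f"{total:.0e} pixels/second")  # 1e+18 pixels/second
```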
However, computer vision is fundamentally difficult. It is one thing to recognize dogs versus cats, but real cognition means, for example, being able to classify a photo of a dog as one of 120 breeds, not counting mutts.
That said, there has been rapid progress on three major fronts:
Vision isn't the only application of neural nets. Automatic speech recognition has improved, too. Twenty years ago, the error rate on word recognition was 40%; now it is 6%—and when whole sentences are considered, it is better still. The state of the art is moving to a unified end-to-end trained neural network that takes in the raw speech waveform and produces words. Google Translate recently switched to full sentence-at-a-time translation (as opposed to word-at-a-time translation), leading to, in Google's words, "improving more in a single leap than we've seen in the last 10 years."
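The error-rate figures quoted above are word error rates (WER), which the speech community computes as the word-level edit distance between the recognizer's output and a reference transcript, normalized by the reference length. A minimal sketch of that metric (my illustration, not code from the talk):

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: edit distance over words, divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance, counting word-level
    # insertions, deletions, and substitutions.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution ("the" -> "a") in a six-word reference: WER = 1/6
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```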
Until recently, efficiency has been the weakness of neural nets. Hand-tuned feature recognizers were more efficient (fewer MACs), but neural networks were more effective. Convolutional neural networks, however, allow high parallelism, low bit resolution, specialized architectures, and manageable memory bandwidth, so specialized hardware can deliver roughly a 1000X energy improvement over a general-purpose CPU, which compensates for the efficiency gap.
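The "low bit resolution" point is that a convolutional layer reduces to many multiply-accumulates, and those MACs tolerate narrow integer arithmetic: quantize weights and activations to int8, accumulate in int32, and rescale once at the end. A toy sketch of that scheme (my illustration, with a simple symmetric quantizer, not hardware from the talk):

```python
import numpy as np

def quantize(x, scale):
    """Symmetric quantization of floats to int8 with the given scale."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)  # layer weights
a = rng.normal(size=256).astype(np.float32)  # input activations

# Per-tensor scales chosen so the largest magnitude maps to 127.
sw, sa = np.abs(w).max() / 127, np.abs(a).max() / 127
qw, qa = quantize(w, sw), quantize(a, sa)

# Integer MACs: accumulate the int8 products in int32, rescale once.
approx = int(np.dot(qw.astype(np.int32), qa.astype(np.int32))) * sw * sa
exact = float(np.dot(w, a))
print(exact, approx)  # the int8 result closely tracks the float result
```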
The biggest advantage of neural networks is that training them is usually a lot easier than hand-coding equivalent algorithms. They typically perform better than even the best manual methods. There is a massive educational shift going on to make machine learning a basic computer science skill. At major universities, any course on machine learning is instantly over-subscribed.
This impacts productivity at three main levels:
Cognitive computing, deep learning, machine learning, neural networks. These are largely different names for the same thing and will be a long-term driver for electronics. Before the transistor, there were tubes and analog designs, and the heritage of that ancestry is today's analog IP. Digital circuits really got going with the invention of the integrated circuit, and today we reuse IP with massive sub-systems, consisting of millions of gates. Processor-based design, which is basically software, started as assembly and has moved to today's widely available open-source infrastructure. Chris thinks that cognitive computing is the next big thing on that sort of level, which is why he opened his talk saying that cognitive computing is the next Moore's Law—not that something in neural nets is doubling every couple of years, but that in the same way as Moore's Law has driven digital design and software, cognitive computing will drive the next stage of intelligent systems.
There are many applications other than vision. Audio and natural language have already been mentioned, and they show up in many market/application segments: surveillance cameras that can distinguish what is normal from what is unusual, autonomous driving, voice user interfaces (VUIs) like Alexa and Siri, real-time translation, and augmented reality like HoloLens.
Or, as the punchy summary has it, "training is the new programming."
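That slogan can be made concrete with a toy sketch (mine, not from the talk): instead of hand-coding the rule for a logical AND, we learn it from four labeled examples using a classic perceptron update.

```python
import numpy as np

# "Training is the new programming": learn AND from labeled examples
# instead of writing the rule by hand.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(100):                       # perceptron learning rule
    for xi, yi in zip(X, y):
        pred = 1.0 if xi @ w + b > 0 else 0.0
        w += lr * (yi - pred) * xi
        b += lr * (yi - pred)

print([1.0 if xi @ w + b > 0 else 0.0 for xi in X])  # [0.0, 0.0, 0.0, 1.0]
```

AND is linearly separable, so the perceptron convergence theorem guarantees this loop settles on a correct rule; the point is that nobody had to write the rule down.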