Socionext has developed a prototype chip that incorporates its newly developed quantised Deep Neural Network (DNN) technology, enabling advanced AI processing on small, low-power edge-computing devices.
Socionext’s prototype chip achieves YOLO v3 object detection at 30fps, whilst consuming less than 5W – ten times more efficient than conventional, general-purpose GPUs. The chip is also equipped with a high-performance, low-power Arm Cortex-A53 quad-core CPU. Unlike other accelerator chips, it can perform the entire AI processing without external processors.
The chip is the result of a research project on “Updatable and Low Power AI-Edge LSI Technology Development” commissioned by the Japanese New Energy and Industrial Technology Development Organisation (NEDO). The chip features a quantised DNN engine, optimised for deep learning inference processing at high speeds with low power consumption.
Today’s edge computing devices are based on conventional, general-purpose GPUs. These processors generally cannot keep up with the growing demand for AI-based processing, such as image recognition and analysis: meeting that demand requires larger devices at higher cost, because power consumption and heat generation increase. Such devices, with their limited performance, are poorly suited to state-of-the-art AI processing.
Quantised DNN Engine
Socionext’s proprietary architecture reduces the number of bits used for parameters and activations in deep learning. The result is higher AI processing performance along with lower power consumption. The architecture supports reduced bit widths, including 1-bit (binary) and 2-bit (ternary), in addition to the conventional 8-bit, together with the company’s original parameter compression technology, enabling a large amount of computation with fewer resources and significantly smaller amounts of data.
In addition, Socionext has developed a novel on-chip memory technology that provides highly efficient data delivery, reducing the need for the large-capacity on-chip or external memory typically required for deep learning.
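To make the bit reduction described above more concrete, the following is a minimal sketch of generic binary and ternary weight quantisation. It is not Socionext’s proprietary method, whose details are not given here. The function names, the shared per-tensor scale, and the 0.7 threshold ratio are all illustrative assumptions.

```python
# Illustrative sketch only: generic binary/ternary weight quantisation.
# All names and the threshold heuristic are assumptions, not Socionext's design.

def binarise(weights):
    """Map each weight to +1 or -1 (1 bit each) plus one shared scale."""
    scale = sum(abs(w) for w in weights) / len(weights)
    codes = [1 if w >= 0 else -1 for w in weights]
    return codes, scale

def ternarise(weights, threshold_ratio=0.7):
    """Map each weight to -1, 0 or +1 (stored in 2 bits) plus one shared scale.

    Weights whose magnitude is at or below threshold_ratio * mean(|w|)
    become 0. The ratio is an assumed heuristic value.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs
    codes = [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale

def dequantise(codes, scale):
    """Reconstruct approximate weights from the codes and the shared scale."""
    return [c * scale for c in codes]

def storage_bits(n_weights, bits_per_weight):
    """Bits needed for the codes alone (the shared scale is ignored)."""
    return n_weights * bits_per_weight

weights = [0.9, -0.05, 0.4, -0.7, 0.02, -0.3]
binary_codes, binary_scale = binarise(weights)
ternary_codes, ternary_scale = ternarise(weights)

for bits in (8, 2, 1):
    print(f"{bits}-bit storage for {len(weights)} weights: "
          f"{storage_bits(len(weights), bits)} bits")
```

In this sketch, moving from 8-bit to 2-bit or 1-bit codes cuts weight storage by a factor of four or eight respectively, at the cost of a coarser approximation of the original values. This is the general trade-off that quantised inference engines aim to manage.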