
Introduction to neural networks


By Mark Patrick, Mouser Electronics

Artificial intelligence (AI) seems to be on everyone’s mind at the moment. And it is not difficult to see why: AI is expected to have a profound impact on our society, becoming such an intrinsic part of our daily lives that we won’t even notice it is there, yet it will make everyday tasks easier and more seamless.

At the heart of any AI-based system is an artificial neural network, or ANN. Largely modelled on the human brain – the ultimate biological neural network – an ANN aims to mimic the brain’s processes. Within the human nervous system, neurons are specialised cells that carry nerve impulses. Each neuron’s axon conveys impulses away from the cell body towards other cells, which receive signals through branches called dendrites. The junctions between neurons are called synapses. External stimuli, for example from the eyes or skin, influence how these synaptic bonds are shaped, and it is the strength of these connections that underlies our learning.

Figure 1: A biological neural network

Like the biological structure, an ANN uses units of computation as neurons, with a weight factor applied to each input to indicate the strength of the synaptic connection. Learning takes place by adjusting the weights given to each input.
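As a sketch of that idea, a single artificial neuron can be written as a weighted sum of its inputs passed through an activation function; the input values and weights below are arbitrary, illustrative choices:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs - the weights play the role of synaptic strengths
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # A sigmoid activation squashes the result into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-total))

# Two inputs, two weights and a bias; the output is the neuron's "firing" level
output = neuron([0.5, 0.3], [0.8, -0.2], 0.1)
```

Adjusting the weights changes how strongly each input influences the output, which is exactly what training does.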

Just like the human brain, an ANN needs to learn – a process called “training” – which involves feeding the neural network with matched input and output pairs. In a computer-vision application, for example, these might be hundreds of images of animals (the inputs) and the corresponding species name of each image (the outputs). Training adjusts the weights so that, for a given animal image, the network predicts the correct species with high confidence, which requires training on a large number of images of each animal type. In this way, when the network encounters a picture it hasn’t seen before, it can identify the animal with a high degree of accuracy. This prediction phase of an ANN is called “inference”.
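The weight-adjustment loop at the heart of training can be illustrated with a single sigmoid neuron and a toy data set: after each example, the weight and bias are nudged so that the prediction moves towards the target output. The data, learning rate and iteration count below are arbitrary illustrative choices:

```python
import math

def predict(x, w, b):
    # A one-input sigmoid neuron
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

# Toy training pairs: the network should output ~1 for large x, ~0 for small x
pairs = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(1000):
    for x, target in pairs:
        y = predict(x, w, b)
        grad = (y - target) * y * (1 - y)  # gradient of squared error through the sigmoid
        w -= lr * grad * x                 # nudge the weight towards a better prediction...
        b -= lr * grad                     # ...and the bias likewise
```

After training, `predict` returns a value close to 1 for inputs like 2.0 and close to 0 for inputs like -2.0, even though those exact values were the only examples it saw – real networks do the same with millions of weights.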

Figure 2: The mathematical interpretation of an artificial neural network

Neural network architecture types

The architecture of a neural network varies and, like the different regions of the human brain, different architectures suit different tasks. Two popular network architectures are the recurrent neural network (RNN) and the convolutional neural network (CNN).

An RNN suits sequential data, such as handwriting or speech recognition: its layers contain feedback connections, so the output of one step is fed back in as context for the next. These networks are also regularly used for machine translation, for example translating text from English to German. Predicting the next word as a message is typed is another example, as used by many messaging applications today.
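The recurrence that lets an RNN carry context along a sequence can be sketched as a single update step: the new hidden state mixes the current input with the previous state. The weights below are arbitrary illustrative values:

```python
import math

def rnn_step(x, h, w_x, w_h, b):
    # The new hidden state combines the current input (x) with the
    # previous hidden state (h) - this is how context is carried forward
    return math.tanh(w_x * x + w_h * h + b)

h = 0.0                              # initial hidden state
for x in [0.5, -0.3, 0.8]:           # a toy input sequence
    h = rnn_step(x, h, w_x=1.2, w_h=0.7, b=0.0)
```

After the loop, `h` summarises the whole sequence seen so far, which is what a real RNN uses to predict the next word or character.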

A CNN is suited to interpreting images, such as in computer vision, facial recognition and vehicle registration-plate detection. Its architecture is similar to the animal visual cortex, where successive overlapping receptive fields build up a complete visual image. It is a multilayer network with a single input and output layer but possibly many hidden convolutional layers.
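The core convolution operation can be sketched in a few lines: a small kernel slides over the image, and each output value is the weighted sum of the pixels beneath it. The edge-detecting kernel below is a common textbook example, not taken from any particular network:

```python
def conv2d(image, kernel):
    # "Valid" convolution (no padding): slide the kernel over the image
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the patch of pixels under the kernel
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

# A vertical-edge kernel applied to an image whose right half is bright:
image = [[0, 0, 9, 9]] * 4
kernel = [[-1, 1], [-1, 1]]
result = conv2d(image, kernel)  # large values appear only at the edge
```

In a trained CNN the kernel values are learned rather than hand-written, and many such kernels run in parallel in each convolutional layer.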

A complex core

ANNs are highly complex, combining skills and expertise from data science and neuroscience. For a long time, their complexity put them out of reach of most commercial applications, leaving them the preserve of academic and research communities. However, with the rise of computational capabilities, always-on connectivity and demanding applications, industry initiatives are making neural networks accessible to all. Neural network frameworks have opened up the development of machine-learning-based applications and put high-level AI capabilities in the hands of product designers and engineers.

TensorFlow is an example of an open-source software library framework, initially developed by Google for its internal research and production systems but made publicly available in 2015. Covering a range of neural network types, TensorFlow comprises a comprehensive set of libraries, workflows, models and tools to develop and train neural networks. It supports a choice of programming languages, from Python to JavaScript, and runs on a variety of hardware, including CPUs and GPUs. Models can also be deployed to low-compute devices, such as Android, iOS and Raspberry Pi hardware, for edge-based inference.
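To give a flavour of how little code a framework demands, the sketch below defines and compiles a small classifier with TensorFlow’s Keras API; the feature count, layer sizes and class count are arbitrary illustrative choices, not a recommended architecture:

```python
import tensorflow as tf

# A minimal classifier: 4 input features, one hidden layer, 3 output classes
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),    # hidden layer
    tf.keras.layers.Dense(3, activation="softmax"),  # class probabilities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Training would then be a single call to `model.fit` with the input/output pairs described earlier; the framework handles the weight adjustment automatically.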

Caffe is another deep-learning framework that supports a variety of computing platforms, programming languages and capabilities for neural network development, training and application. Created at the University of California, Berkeley, Caffe offers a wide range of code examples on its website and is believed to provide among the fastest CNN image-classification implementations available, capable of processing over 60 million images per day.

AI is all around us

Neural networks are working all around us, and many are already on our smartphones and smart home devices. Since few of these devices have the computing resources to run inference promptly themselves, they typically rely on inference taking place in the cloud.

So, the next time you ask your smartphone assistant a question, take a moment to appreciate the complexities of what is going on behind the scenes – and how neural networks are responding to your query.
