Training a neural network made easy


3 January 2020

By Mark Patrick, Mouser Electronics

Training is a fundamental step in developing an artificial neural network (ANN). It involves providing the neural network, whatever its type (convolutional, recurrent, etc.), with a large set of relevant data and its associated metadata, such as labels, from which it learns to classify. This process is how the neural network algorithm “learns” and gains its knowledge, so that a machine vision application, for example, can identify a previously unseen image. In general, the more representative training data the algorithm observes, the more accurate its predictions become. However, for the data scientists and developers involved in creating an ANN-based application, gathering sufficient data can be arduous and time-consuming.
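To make the idea of learning from labelled examples concrete, here is a minimal sketch in Python of a single artificial neuron (a perceptron) trained on a toy dataset. The data, function names and learning rate are illustrative assumptions and are not taken from any particular framework; a practical ANN has many such units and would normally be trained with a library rather than hand-written code.

```python
def predict(weights, bias, features):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train(samples, epochs=20, learning_rate=0.1):
    """Fit a single neuron to (features, label) pairs with the perceptron rule."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            error = label - predict(weights, bias, features)
            # Nudge each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical toy data: two measurements per sample, labelled 1 or 0.
TRAINING_DATA = [
    ((2.0, 1.0), 1),
    ((3.0, 2.5), 1),
    ((2.5, 3.0), 1),
    ((-1.0, -0.5), 0),
    ((-2.0, -1.5), 0),
    ((-0.5, -2.0), 0),
]

weights, bias = train(TRAINING_DATA)
# A previously unseen sample is classified with the learned weights.
unseen_prediction = predict(weights, bias, (1.5, 2.0))
```

The labels are what allow the error to be computed at each step; without them the neuron would have nothing to correct itself against.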

Training data for machine vision applications

Consider a vision-based convolutional neural network (CNN) designed to identify geological features and formations. Doing this successfully requires hundreds, if not thousands, of images taken in many different locations and seasons so that the algorithm can distinguish individual features. Assembling such a set of images can take years and be extremely costly.

Thankfully, a growing number of open-source datasets, such as Google’s “awesome public datasets”, are available, covering a wide range of subjects and data types. For example, there are many global, regional and national weather and climate datasets, containing temperature, precipitation and wind data points collected over many years. For image-based applications there are also many sources; a popular one is ImageNet, which contains over 14 million images, ranging from animals and geological formations to people and plants.

ImageNet is also the foundation for many specialist datasets, such as the Stanford Dogs dataset (see Figure 1), which contains 20,580 images of 120 different dog breeds from around the world. Each image has a class label that identifies the breed and bounding box coordinates that mark the area of the image containing the dog.
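A labelled sample of this kind can be pictured as a small record that pairs the image with its annotations. The sketch below is an assumption about how such a record might be represented in Python; the class names, field names and file path are hypothetical and do not reflect the dataset's actual file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Pixel coordinates of the rectangle enclosing the labelled object."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def width(self) -> int:
        return self.x_max - self.x_min

    def height(self) -> int:
        return self.y_max - self.y_min


@dataclass(frozen=True)
class LabelledImage:
    """One training sample: an image file plus its class label and box."""
    path: str
    label: str
    box: BoundingBox


def crop(pixels, box):
    """Return the part of a 2-D pixel grid that falls inside the box."""
    return [row[box.x_min:box.x_max] for row in pixels[box.y_min:box.y_max]]


# Hypothetical example: a 4x4 greyscale image with the subject in the centre.
sample = LabelledImage(
    path="images/example_dog.jpg",  # placeholder path, not a real file
    label="beagle",
    box=BoundingBox(x_min=1, y_min=1, x_max=3, y_max=3),
)
pixels = [
    [0, 0, 0, 0],
    [0, 7, 8, 0],
    [0, 9, 6, 0],
    [0, 0, 0, 0],
]
subject = crop(pixels, sample.box)
```

Cropping to the bounding box is one common use of the coordinates: it lets a classifier be trained on the labelled object alone rather than on the whole scene.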

A similar Stanford University dataset covers cars: it classifies 196 different cars across more than 16,000 images, each labelled with make, model and year.

Another example dataset is Labeled Faces in the Wild, with over 13,000 identified face images collected from across the web; 1,680 of the individuals pictured have two or more distinct images. This dataset helps with facial recognition, particularly unconstrained recognition, where lighting, background visual distractions and facial angle are not controlled in advance.

Kaggle is another repository of open-source data. One of its datasets, Fruits 360, offers a total of 71,125 images across 103 different classes (types) of fruit and nuts, from apples to walnuts. Of these, 53,177 form a training dataset and 17,845 a separate test set; the small remainder are images showing more than one fruit. Such a dataset is well suited to machine vision applications that detect fruit on packaging or a production line.

Such open-source datasets are suitable not only for training a model: as in the examples here, many also include a completely separate set of images for testing the model once it has been trained. In addition, there are often several publicly available datasets for a particular category, so one can be used for training and another for testing.
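Where a dataset does not ship with its own test set, a held-out portion can be set aside before training begins. The following Python sketch shows one simple way to do this; the function name, the 25% test fraction (close to the proportion in the fruit dataset figures quoted above) and the fixed seed are illustrative assumptions.

```python
import random


def split_dataset(samples, test_fraction=0.25, seed=42):
    """Shuffle the samples and return (training, test) lists with no overlap."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical file names standing in for a labelled image collection.
all_images = [f"image_{i:03d}.jpg" for i in range(100)]
training_set, test_set = split_dataset(all_images)
```

Keeping the two lists disjoint matters because a model evaluated on images it has already seen during training will appear more accurate than it really is.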

Using a pre-trained model

To further ease the task of creating a neural network algorithm, there are now a number of open-source frameworks that provide not only the base algorithm code, libraries and drivers needed to implement a network quickly, but also pre-trained models that are ready to be used for the predictive, or inference, phase. Caffe, Keras and TensorFlow are popular examples of frameworks that significantly reduce the effort compared with implementing a neural network from scratch. An example of a Keras pre-trained CNN model is the car classification tool available on GitHub, which uses the Stanford Cars dataset mentioned earlier. This model is ready for use, needing only a hardware platform and camera to be operational.
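As an illustration of the inference phase with a pre-trained model, the sketch below loads a general-purpose ImageNet classifier that ships with Keras, rather than the car classification model mentioned above. It assumes that a recent TensorFlow 2.x release and NumPy are installed and that the pre-trained weights can be downloaded on first use; the function name `classify_image` is hypothetical.

```python
def classify_image(image_path, top=3):
    """Return the top ImageNet predictions for one image as (label, score) pairs."""
    # Imported here so the heavy TensorFlow dependency loads only when needed.
    import numpy as np
    from tensorflow.keras.applications.mobilenet_v2 import (
        MobileNetV2,
        decode_predictions,
        preprocess_input,
    )
    from tensorflow.keras.utils import img_to_array, load_img

    # Weights pre-trained on ImageNet are downloaded on first use.
    model = MobileNetV2(weights="imagenet")

    # This model expects 224x224 RGB input, scaled by preprocess_input.
    image = load_img(image_path, target_size=(224, 224))
    batch = preprocess_input(np.expand_dims(img_to_array(image), axis=0))

    predictions = model.predict(batch)
    # decode_predictions yields (class_id, class_name, score) tuples per image.
    decoded = decode_predictions(predictions, top=top)[0]
    return [(name, float(score)) for _, name, score in decoded]
```

Calling `classify_image("photo.jpg")`, where `photo.jpg` is a placeholder for an image of your own, returns the three most likely ImageNet labels with their scores. No training step is involved: the network's knowledge comes entirely from the downloaded weights.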

Pre-trained neural networks and open-source frameworks greatly assist in training, testing and deploying AI-based applications. Developers can then focus their time and effort on ensuring that those applications are reliable, robust and fully tested.