Training an image classifier service

Requirement: a 4GB to 12GB GPU
Total running time: minutes to an hour
Dataset download: minutes
Training time: minutes to an hour

In another tutorial, we showed how to set up an image classifier from an existing (i.e. pre-trained) neural network model. Here we show how to train this model with DeepDetect, which serves as a practical example of how to train your own image classification models.

We highly recommend using the DeepDetect Platform to train models, as it is a much easier and more powerful solution than the low-level DeepDetect server.

Setup of the cats & dogs dataset

The first step is to acquire and set up the dataset. We use the cats & dogs dataset from https://www.kaggle.com/c/dogs-vs-cats/data. Alternatively, you can get it from https://www.deepdetect.com/dd/datasets/cats_dogs.zip

Set up the directory for the model and data:


mkdir models
mkdir models/cats_dogs

Then unzip the data into models/cats_dogs.
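
Before going further, it can help to check that the data landed where the training call expects it. The sketch below counts the image files in each class subdirectory. It assumes the archive unpacks into a train/ directory with one subdirectory per class (for instance train/cats and train/dogs); that layout and the function name count_images_per_class are assumptions, not something this tutorial guarantees.

```shell
# Count the image files found in each class subdirectory of a dataset root.
# Assumption: one subdirectory per class, e.g. train/cats and train/dogs.
count_images_per_class() {
    root="$1"
    for class_dir in "$root"/*/; do
        # Skip the unexpanded pattern when the root has no subdirectory
        [ -d "$class_dir" ] || continue
        n=$(find "$class_dir" -type f \
            \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) \
            | wc -l | tr -d ' ')
        printf '%s %s\n' "$(basename "$class_dir")" "$n"
    done
}

# Example call (path is the one used in this tutorial):
# count_images_per_class models/cats_dogs/train
```

Two lines with roughly balanced counts suggest the dataset is in place. An empty output usually means the archive was unpacked one level too deep or too shallow.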

Copy the pre-trained model

We will use transfer learning, i.e. take a model pre-trained on ImageNet and specialize it for the cats vs dogs task. This eases the training task, makes it converge much faster, and yields near-perfect accuracy in a few thousand iterations.

To install the pre-trained model for our architecture of reference (see se_resnet_50 below):

cd models/cats_dogs
wget https://www.deepdetect.com/models/senets/se_resnet_50/SE-ResNet-50.caffemodel

Creating the service

First, assuming a Docker container for DeepDetect, start the server with


docker run -d -p 8080:8080 -v /path/to/models:/opt/models/ jolibrain/deepdetect_gpu

Then create the service with:


curl -X PUT "http://localhost:8080/services/catsdogs" -d '{
       "mllib":"caffe",
       "description":"image classification service",
       "type":"supervised",
       "parameters":{
         "input":{
           "connector":"image",
           "width":224,
           "height":224,
           "db":true
         },
         "mllib":{
           "template":"se_resnet_50",
           "nclasses":2,
           "finetuning":true
         }
       },
       "model":{
         "templates":"../templates/caffe/",
         "repository":"/opt/models/cats_dogs",
         "weight":"SE-ResNet-50.caffemodel"
       }
     }'

In the call above, we define a state-of-the-art image classification network called Squeeze-and-Excitation ResNet-50, and set it up for training.

Training the classifier

Training is a complex phase. Luckily, it is fully automated from within DeepDetect. Basically, the data flows into an image data connector. The connector prepares the data for the neural net and deep learning library. The neural net is trained and tested regularly until completion. At this stage, the machine learning service has a model to use for classifying images automatically. More details on each of the hidden steps:

building of training and testing sets of image databases: the image dataset set up above is turned into two image databases, one for training, the other for validating the net regularly during the training process. The rationale behind building a database is that each image is passed to the net repeatedly, and reading and re-reading it from the hard drive is too slow. The database is much more efficient for non-sequential access.
training of the net: batches of random images are passed to the net for training, and the process is repeated until the requested number of iterations has been reached. The training job can be stopped at any time through the API.
transfer learning: we use a pre-trained model and specialize it for the cats vs dogs task. This gives near-perfect accuracy in a few thousand training iterations.

Below is a training call for the model:


curl -X POST "http://localhost:8080/train" -d '{
       "service":"catsdogs",
       "async":true,
       "parameters":{
         "input":{
           "connector":"image",
           "test_split":0.1,
           "shuffle":true,
           "width":224,
           "height":224,
           "db":true
         },
         "mllib":{
           "gpu":true,
           "mirror":true,
           "net":{
             "batch_size":32
           },
           "solver":{
             "test_interval":500,
             "iterations":5000,
             "base_lr":0.001
           },
           "noise":{"all_effects":true, "prob":0.001},
           "distort":{"all_effects":true, "prob":0.01}
         },
         "output":{
           "measure":["acc","mcll","f1"]
         }
       },
       "data":["/opt/models/cats_dogs/train/"]
     }'

The main options are explained as follows:

batch_size: the number of training images sent at once for training
iterations: the total number of times a batch of images is sent in for training
base_lr: the initial learning rate
test_split: the part of the training set used for testing, e.g. 0.1 means 10% is used for testing
shuffle: whether to shuffle the dataset before training (recommended)
measure: the list of measures to be computed and returned as status by the server
noise and distort: automated data augmentation to make the model more robust

For more details, see the API.
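
To get a feel for how batch_size, iterations and test_split interact, the sketch below estimates how many passes over the training split (epochs) a run performs. The function name estimate_epochs is hypothetical, and the figure of 25,000 images is an assumption based on the size of the Kaggle labelled training set; replace it with the actual number of images you unpacked.

```shell
# Rough epoch count: images seen during training / images in the training split.
# All four inputs are assumptions to adapt to your own setup.
estimate_epochs() {
    total_images="$1"   # images in the dataset before splitting
    test_split="$2"     # fraction held out for testing
    batch_size="$3"     # images per iteration
    iterations="$4"     # number of training iterations
    LC_ALL=C awk -v n="$total_images" -v s="$test_split" \
                 -v b="$batch_size" -v i="$iterations" \
        'BEGIN { train = n * (1 - s); printf "%.2f\n", (i * b) / train }'
}

# Values from the training call above, with an assumed 25,000 images
estimate_epochs 25000 0.1 32 5000
```

With these values, 5000 iterations of 32 images cover 160,000 images over a training split of 22,500, so the call prints 7.11, i.e. about seven passes over the training data.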

Upon the start of the training, the server will output some image file processing information:


...
INFO - Processed 1000 files.
INFO - Processed 2000 files.
INFO - Processed 3000 files.
INFO - Processed 4000 files.
...

The bash script below polls the training status every 20 seconds. It should take around 5000 iterations to reach 98% accuracy or so.


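# Poll the training job of the catsdogs service created above
while true; do
    out=$(curl -s -X GET "http://localhost:8080/train?service=catsdogs&job=1&timeout=20")
    echo "$out"
    # Keep polling while the job reports that it is still running
    if [[ $out == *"running"* ]]
    then
        continue
    else
        break
    fi
done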
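
The loop above stops on any response that does not contain "running", which includes errors. A slightly stricter variant is sketched below. It assumes that the server returns compact JSON in which the job state is the only string-valued "status" field, with values such as "running" and "finished"; both the response shape and the helper names job_status and wait_for_job are assumptions to check against the API documentation. A JSON-aware tool such as jq would be more robust than grep if it is available.

```shell
# Extract the job state from a /train status response read on stdin.
# Assumption: the state is the only string-valued "status" field
# (the top-level "status" is assumed to be an object, not a string).
job_status() {
    grep -o '"status":"[a-z]*"' | head -n 1 | cut -d'"' -f4
}

# Poll a status URL until the job leaves the "running" state.
# Returns success only if the last reported state is "finished".
wait_for_job() {
    url="$1"
    while true; do
        state=$(curl -s -X GET "$url" | job_status)
        echo "job state: ${state:-unknown}"
        [ "$state" = "running" ] || break
    done
    [ "$state" = "finished" ]
}

# Example call:
# wait_for_job "http://localhost:8080/train?service=catsdogs&job=1&timeout=20"
```

Because wait_for_job returns a non-zero exit code for anything other than "finished", it can be chained with && before a prediction call in a script.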

Testing the classifier

Once training has completed, the service is immediately available for prediction. A simple prediction call looks like this:


curl -X POST "http://localhost:8080/predict" -d '{
       "service":"catsdogs",
       "parameters":{
         "input":{
           "width":224,
           "height":224
         },
         "output":{
           "best":3
         }
       },
       "data":["https://www.deepdetect.com/img/examples/cat.jpg"]
     }'
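
In a script, you often only need the top predicted label. The sketch below pulls it out of the response with standard tools. It assumes compact JSON in which each predicted class carries a "cat" field holding the label, and that classes are listed by decreasing probability, so that the first "cat" is the best guess; the helper name top_class and this response shape are assumptions to verify against an actual response.

```shell
# Print the first class label found in a /predict response read on stdin.
# Assumption: classes are sorted by decreasing probability and each one
# has a "cat" field with the label.
top_class() {
    grep -o '"cat":"[^"]*"' | head -n 1 | cut -d'"' -f4
}

# Example: pipe the output of the prediction call into the helper
# curl -s -X POST "http://localhost:8080/predict" -d '...' | top_class
```

Note that this only looks at the first image in the response; for multiple images in "data", a JSON-aware tool such as jq is a better fit.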

Note that the trained model is saved on disk, so the service can be safely destroyed while keeping the model. Simply create a new, identical service: it will load the existing model and be immediately ready for prediction.