Setting up an image classifier based on Imagenet

This tutorial sets up a classification service that distinguishes among 1000 different image tags, from ‘ambulance’ to ‘padlock’. It shows how to run a DeepDetect server with an image classification service based on a deep neural network pre-trained on a subset of Imagenet (ILSVRC12).

The machine learning service lets an application send images and receive in return a set of tags describing each image. The tags are encoded in JSON.

The following presupposes that DeepDetect has been built & installed.

Getting the pre-trained model

We use the model provided by the Caffe dev team, trained with the GoogleNet architecture (2). However, this tutorial can be reproduced with other provided models, including some from the Caffe model zoo (1).

Using ResNet

You can use one of the recent and more accurate Deep Residual Convolutional Networks. Otherwise, follow the instructions in the GoogleNet section.

To use the very accurate ResNet-50, first download the model file from either https://github.com/KaimingHe/deep-residual-networks or http://www.deepdetect.com/models/resnet/.

Create a model directory and put the ResNet-50-model.caffemodel file in it. Now go to the Setting up the DeepDetect service section and replace googlenet with resnet_50 in all calls.

Using Inception through Tensorflow

You can use the state-of-the-art inception and resnet-v2 architectures and pre-trained models through DeepDetect with Tensorflow. Otherwise, follow the instructions in the GoogleNet section.

To use the inception architecture and Tensorflow, first download the model file from https://deepdetect.com/models/tf/inception_resnet_v2.pb.

Create a model directory and put the .pb file into it. Now replace the call from the Setting up the DeepDetect service section with:

curl -X PUT "http://localhost:8080/services/imageserv" -d "{\"mllib\":\"tensorflow\",\"description\":\"image classification service\",\"type\":\"supervised\",\"parameters\":{\"input\":{\"connector\":\"image\",\"height\":224,\"width\":224},\"mllib\":{\"nclasses\":1001,\"inputlayer\":\"InputImage\"}},\"model\":{\"repository\":\"/path/to/model/\"}}"
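Escaping every quote by hand in the call above is error-prone. As a minimal sketch, the same request body can be built as a Python dictionary and serialized with the standard json module. The helper name build_tf_service_body is hypothetical; the fields and values are copied from the curl call above.

```python
import json


def build_tf_service_body(repository):
    """Return the body of the Tensorflow service creation call as a dict.

    build_tf_service_body is a hypothetical helper name; the fields and
    values mirror the curl call shown in this section.
    """
    return {
        "mllib": "tensorflow",
        "description": "image classification service",
        "type": "supervised",
        "parameters": {
            "input": {"connector": "image", "height": 224, "width": 224},
            "mllib": {"nclasses": 1001, "inputlayer": "InputImage"},
        },
        "model": {"repository": repository},
    }


if __name__ == "__main__":
    # Print the serialized body; the path is the placeholder used above.
    print(json.dumps(build_tf_service_body("/path/to/model/")))
```

The printed string can be saved to a file and passed to curl with -d @file, which avoids the backslash escapes altogether.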

Follow the remaining instructions, replacing caffe with tensorflow in the other calls where needed. Prediction calls to the /predict API do not depend on the deep learning library, which is very convenient for applications.

Using GoogleNet with Caffe

A download script is provided with Caffe. Let’s use it from the customized version of Caffe included in the DeepDetect source (the script below requires the python-yaml package to be installed):

cd build/caffe_dd/src/caffe_dd
./scripts/download_model_binary.py models/bvlc_googlenet/

You can verify the model is there by looking for its file:

ls -l models/bvlc_googlenet/*.caffemodel
-rw-rw-r-- 1 beniz beniz 53533754 May 11 22:21 bvlc_googlenet.caffemodel

Now let’s create our own repository to host our new service and the pre-trained model. This repository can go anywhere; here we put it at the root of the deepdetect repository:

cd deepdetect
mkdir models
mkdir models/imgnet
mv build/caffe_dd/src/caffe_dd/models/bvlc_googlenet/bvlc_googlenet.caffemodel models/imgnet

The final preparation step adds what DeepDetect refers to as a correspondence file, which maps the Imagenet class names, such as ‘ambulance’, to numeric class identifiers between 0 and 999. When training a model with DeepDetect, this file is automatically generated. However, when using a pre-trained model from outside DeepDetect, this file has to be explicitly added to the repository. In the present case, the correspondence file for Imagenet ILSVRC12 is provided with the DeepDetect source:

cp datasets/imagenet/corresp_ilsvrc12.txt models/imgnet/corresp.txt
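Before creating the service, it can help to confirm that the repository holds both files placed there by the steps above. The following is a minimal sketch using only the Python standard library. The helper name missing_repository_files is hypothetical, and it checks only for a .caffemodel file and corresp.txt, not for anything else DeepDetect may add later.

```python
from pathlib import Path


def missing_repository_files(repository):
    """List what a pre-trained Caffe model repository still lacks.

    Hypothetical helper: it only looks for the two files this tutorial
    places in the repository, a *.caffemodel file and corresp.txt.
    """
    repo = Path(repository)
    missing = []
    # Any file ending in .caffemodel counts as the pre-trained model.
    if not any(repo.glob("*.caffemodel")):
        missing.append("*.caffemodel")
    if not (repo / "corresp.txt").is_file():
        missing.append("corresp.txt")
    return missing


if __name__ == "__main__":
    # Path relative to the root of the deepdetect repository, as above.
    problems = missing_repository_files("models/imgnet")
    print("repository looks complete" if not problems else f"missing: {problems}")
```

An empty list means both files are in place; otherwise the list names what still has to be copied.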

We are now ready to create the classifier service, and one step away from using it.

Setting up the DeepDetect service

Let’s start the DeepDetect server:

cd deepdetect/build/main
./dede

and create a service:

curl -X PUT "http://localhost:8080/services/imageserv" -d '{
       "mllib":"caffe",
       "description":"image classification service",
       "type":"supervised",
       "parameters":{
         "input":{
           "connector":"image"
         },
         "mllib":{
           "template":"googlenet",
           "nclasses":1000
         }
       },
       "model":{
         "templates":"../templates/caffe/",
         "repository":"../../models/imgnet"
       }
     }'

or equivalently using the Python client:

from dd_client import DD

# connect to the DeepDetect server started above
dd = DD('localhost')
dd.set_return_format(dd.RETURN_PYTHON)

# same fields as in the JSON body of the curl call
description = 'image classification service'
mllib = 'caffe'
model = {'templates':'../templates/caffe/','repository':'../../models/imgnet'}
parameters_input = {'connector':'image'}
parameters_mllib = {'template':'googlenet','nclasses':1000}
parameters_output = {}

# create the service named 'imageserv'
dd.put_service('imageserv',model,description,mllib,
               parameters_input,parameters_mllib,parameters_output)

yields:

{
  "status":{
    "code":201,
    "msg":"Created"
  }
}

Note that:

  • we are creating the model from the ‘googlenet’ template that is provided by DeepDetect. This means that the neural network definition is automatically added to our service repository and loaded up along with the model;
  • the locations of the model repository (here “../../models/imgnet”) and of the neural net templates are paths relative to the directory from which dede was called.
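The service creation call can also be sent with nothing but the Python standard library. The sketch below builds the PUT request separately from sending it, so the body can be inspected first. The function names are hypothetical, and the URL assumes a server listening on localhost:8080 as in the calls above.

```python
import json
import urllib.request

SERVICE_URL = "http://localhost:8080/services/imageserv"


def build_create_request(url=SERVICE_URL):
    """Build (without sending) the PUT request that creates the service."""
    body = {
        "mllib": "caffe",
        "description": "image classification service",
        "type": "supervised",
        "parameters": {
            "input": {"connector": "image"},
            "mllib": {"template": "googlenet", "nclasses": 1000},
        },
        "model": {
            # Both paths are relative to where dede was started.
            "templates": "../templates/caffe/",
            "repository": "../../models/imgnet",
        },
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, method="PUT")


def create_service(url=SERVICE_URL):
    """Send the request and return the decoded JSON answer of the server."""
    with urllib.request.urlopen(build_create_request(url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, create_service() should return the answer shown above, with status code 201. Note that urlopen raises urllib.error.HTTPError when the server answers with an error status.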

Testing image classification

We can now pass any image filepath or URL to our new classifier service and it will produce tags along with probabilities. Here is a first example:

curl -X POST "http://localhost:8080/predict" -d '{
       "service":"imageserv",
       "parameters":{
         "input":{
           "width":224,
           "height":224
         },
         "output":{
           "best":3
         }
       },
       "data":["ambulance.jpg"]
     }'

{
  "status":{
    "code":200,
    "msg":"OK"
  },
  "head":{
    "method":"/predict",
    "time":1398.0,
    "service":"imageserv"
  },
  "body":{
    "predictions":{
      "uri":"../../main/ambulance.jpg",
      "loss":0.0,
      "classes":[
        {"prob":0.992520809173584,"cat":"n02701002 ambulance"},
        {"prob":0.007297487463802099,"cat":"n03977966 police van, police wagon, paddy wagon, patrol wagon, wagon, black Maria"},
        {"prob":0.00014072071644477546,"cat":"n04336792 stretcher"}
      ]
    }
  }
}

The call asks for the top three tags for the ambulance.jpg image. The top tag, with probability 0.99, is ‘ambulance’. This image is quite possibly part of the training set, so the call mainly serves to verify that everything works.
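The same prediction call can be issued from Python. This is a minimal sketch built on the standard library only. The function names are hypothetical, and it assumes the server at localhost:8080 and the imageserv service created earlier.

```python
import json
import urllib.request

PREDICT_URL = "http://localhost:8080/predict"


def build_predict_body(images, best=3, service="imageserv"):
    """Return the /predict body for a list of image paths or URLs."""
    return {
        "service": service,
        "parameters": {
            # Input size and number of returned tags, as in the curl call.
            "input": {"width": 224, "height": 224},
            "output": {"best": best},
        },
        "data": list(images),
    }


def predict(images, best=3, url=PREDICT_URL):
    """POST the body to the server and return its decoded JSON answer."""
    data = json.dumps(build_predict_body(images, best)).encode("utf-8")
    request = urllib.request.Request(url, data=data, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling predict(["ambulance.jpg"]) against the running server sends the same body as the curl call above.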

We can pass more images at once:

curl -X POST "http://localhost:8080/predict" -d '{
       "service":"imageserv",
       "parameters":{
         "input":{
           "width":224,
           "height":224
         },
         "output":{
           "best":3
         }
       },
       "data":["cat.jpg","alley-italy.jpg","thai-market.jpg"]
     }'

{
  "status":{
    "code":200,
    "msg":"OK"
  },
  "head":{
    "method":"/predict",
    "time":3516.0,
    "service":"imageserv"
  },
  "body":{
    "predictions":[
      {
        "uri":"cat.jpg",
        "loss":0.0,
        "classes":[
          {"prob":0.5992535948753357,"cat":"n02124075 Egyptian cat"},
          {"prob":0.30885550379753115,"cat":"n02123045 tabby, tabby cat"},
          {"prob":0.06913747638463974,"cat":"n02123159 tiger cat"}
        ]
      },
      {
        "uri":"alley-italy.jpg",
        "loss":0.0,
        "classes":[
          {"prob":0.41952434182167055,"cat":"n03899768 patio, terrace"},
          {"prob":0.1190955638885498,"cat":"n03781244 monastery"},
          {"prob":0.08107643574476242,"cat":"n03776460 mobile home, manufactured home"}
        ]
      },
      {
        "uri":"thai-market.jpg",
        "loss":0.0,
        "classes":[
          {"prob":0.3546510338783264,"cat":"n07760859 custard apple"},
          {"prob":0.24165917932987214,"cat":"n03089624 confectionery, confectionary, candy store"},
          {"prob":0.07271246612071991,"cat":"n03461385 grocery store, grocery, food market, market"}
        ]
      }
    ]
  }
}

Decoding the results:

  • cat.jpg predicted as an ‘egyptian cat’
  • alley-italy.jpg predicted as a ‘patio, terrace’
  • thai-market.jpg predicted as a ‘custard apple’, ‘confectionery’ then ‘grocery store’, in that order.
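An application usually needs to do this decoding itself. In the outputs above, "predictions" is a single object when one image is sent and a list when several are sent, so the sketch below normalizes both shapes before picking the most probable tag. The function name top_tags is hypothetical, and it expects the JSON answer already decoded into Python dictionaries.

```python
def top_tags(answer):
    """Map each image URI to its most probable (class names, probability).

    `answer` is the decoded JSON answer of a /predict call.
    """
    predictions = answer["body"]["predictions"]
    # A single image yields one object, several images yield a list.
    if isinstance(predictions, dict):
        predictions = [predictions]
    result = {}
    for prediction in predictions:
        best = max(prediction["classes"], key=lambda c: c["prob"])
        # "cat" holds the WordNet identifier followed by the class names.
        _wnid, _, names = best["cat"].partition(" ")
        result[prediction["uri"]] = (names, best["prob"])
    return result
```

Applied to the three-image answer above, it maps cat.jpg to ‘Egyptian cat’, alley-italy.jpg to ‘patio, terrace’ and thai-market.jpg to ‘custard apple’, each with its probability.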

Finally, try it out with any image by providing one or more URLs of images in the data field:

curl -X POST "http://localhost:8080/predict" -d '{
       "service":"imageserv",
       "parameters":{
         "input":{
           "width":224,
           "height":224
         },
         "output":{
           "best":3
         }
       },
       "data":["http://i.ytimg.com/vi/0vxOhd4qlnA/maxresdefault.jpg"]
     }'

{
  "status":{
    "code":200,
    "msg":"OK"
  },
  "head":{
    "method":"/predict",
    "time":1591.0,
    "service":"imageserv"
  },
  "body":{
    "predictions":{
      "uri":"http://i.ytimg.com/vi/0vxOhd4qlnA/maxresdefault.jpg",
      "loss":0.0,
      "classes":[
        {"prob":0.24278657138347627,"cat":"n03868863 oxygen mask"},
        {"prob":0.20703653991222382,"cat":"n03127747 crash helmet"},
        {"prob":0.07931024581193924,"cat":"n03379051 football helmet"}
      ]
    }
  }
}

Note that the service can run on either CPU or GPU.

Next steps

That’s it: you can now plug this 1000-category image classifier into the application of your choice.

The model above has state-of-the-art accuracy, but may be insufficient for your own purposes. In that case, you can train your own model on your own set of images for more targeted purposes, such as content moderation, image indexing in search engines, news categorization, etc.

References

(1) http://caffe.berkeleyvision.org/model_zoo.html

(2) http://arxiv.org/abs/1409.4842

