DeepDetect v0.14.0

DeepDetect v0.14 was released last week with inference for object detectors with torch, a novel Transformer architecture for time-series, and the new SAM optimizer.https://t.co/ISIrVaWR0w

Docker images at https://t.co/xNYgOQnG9h for CPU, CUDA & RT.#deeplearning #MLOps
— jolibrain (@jolibrain) March 10, 2021

DeepDetect release v0.14.0

DeepDetect v0.14.0 was released a couple weeks ago. Below we review the main features, fixes and additions.

In summary

Inference for torch object detection models
Novel Transformer architecture for time-series
Vision Transformer with Realformer support (https://arxiv.org/abs/2012.11747v2)
New Sharpness Aware Minimization optimizer with torch

Other goodies

Support for multiple test sets with the torch backend
Improved CSV input parser to handle quotes etc…
More configurable cropping action when chaining models
SSD MAP-x metric control when training an object detector

Docker images

CPU version: docker pull jolibrain/deepdetect_cpu:v0.14.0
GPU (CUDA only): docker pull jolibrain/deepdetect_gpu:v0.14.0
GPU (CUDA and Tensorrt) :docker pull jolibrain/deepdetect_cpu_tensorrt:v0.14.0
GPU with torch backend: docker pull jolibrain/deepdetect_gpu_torch:v0.14.0

All images available on https://hub.docker.com/u/jolibrain

DeepDetect v0.14.0 Release

Inference with torchvision object detection models

R-FCNN and RetinaNet object detectors now readily actionable in inference with DeepDetect server. As easy as:

curl -X PUT http://localhost:8080/services/detectserv -d '
{
    "description": "fasterrcnn",
    "mllib": "torch",
    "model": {
        "repository": "/path/to/model/"
    },
    "parameters": {
        "input": {
            "connector": "image",
	    "width": 224,
            "height": 224,
            "rgb": true,
            "scale": 0.0039
        },
        "mllib": {
            "template": "fasterrcnn"
        }
    },
    "type": "supervised"
}'

and for inference:

curl -X POST http://localhost:8080/predict -d '
{
    "data": [
        "/path/to/cat.jpg"
    ],
    "parameters": {
        "input": {
            "height": 224,
            "width": 224
        },
        "output": {
            "bbox": true,
            "confidence_threshold": 0.8
        }
    },
    "service": "detectserv"
}'

Transformer architecture for time-series

Transformers originate from NLP and can be applied to time-series forecasting via a series of small modifications and careful testing.

This is what we have done on very large datasets from our customers at Jolibrain. The carefully selected neural architectures and their proper setup are now part of DeepDetect.

Our preliminary results show that these architectures are very efficient with time-series as well.

The good thing with DeepDetect API is that this novel architecture is readily usable with ridiculously small changes to the API calls from our previous short tutorials on time-series. Simply use ttransformer as the neural template (for Temporal Transformer), and review the udpated DeepDetect API for ttransformer options. Note that using the default values always provides a good and easy start!

Realformer support with Vision Transformer (ViT)

Realformer is a simple change to the Transformer architecture that yields a slight improvement in accuracy on various tasks for a relatively small additional computational cost.

Bringing the Realformer to the Vision Transformer architecture is reminiscent of the residual (aka ResNet) architectures. The difference here is that the residuals are in between attention heads from successive hierarchical blocks.

Easily tested by simply passing the "realformer": true parameter to the mllib.template_params object.

Blog

10 March 2021