Multi-GPU training with DeepDetect

DeepDetect supports multi-GPU training via the Caffe backend. Read the Caffe Multi-GPU documentation for the recommended practices and lower-level details; the main points are summarized below.

Multi-GPU training works the same way in any of the tutorials on training from [images], [CSV] or [SVM]: pass the list of GPUs to use in the gpuid API parameter.


  • Using GPU 1 (default is GPU 0):
gpuid: 1
  • Using GPUs 1 and 2:
gpuid: [1,2]
  • Using all detected GPUs:
gpuid: -1
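
As an illustration of where gpuid might sit in a training call, here is a minimal sketch that builds a JSON request body in Python. Only gpuid, batch_size and base_lr come from this page; the service name, the data path, and the surrounding layout (parameters, mllib, net, solver) are assumptions made for the example and should be checked against the API reference before use.

```python
import json


def build_train_request(service, data, gpuid, batch_size=64, base_lr=0.01):
    """Build a training request body that selects GPUs through gpuid.

    The nesting below (parameters -> mllib -> net / solver) is an assumed
    layout for illustration, not a verified copy of the API schema.
    """
    return {
        "service": service,  # hypothetical service name
        "async": True,
        "parameters": {
            "mllib": {
                "gpu": True,
                "gpuid": gpuid,  # 1, [1, 2], or -1 for all detected GPUs
                "net": {"batch_size": batch_size},
                "solver": {"base_lr": base_lr},
            }
        },
        "data": data,  # hypothetical training data location(s)
    }


# Train on GPUs 1 and 2.
body = build_train_request("imgserv", ["/path/to/train_data"], [1, 2])
print(json.dumps(body, indent=2))
```

Passing gpuid=-1 to the same helper would request all detected GPUs, and gpuid=1 a single non-default GPU.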

Notes on usage

  • the specified batch_size applies to each GPU independently, i.e. using batch_size: 64 with 3 GPUs yields an effective batch size of 192

  • you may need to adjust the base learning rate base_lr to match the number of GPUs and the resulting effective batch size of your training setup
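
The batch size arithmetic above can be sketched in a few lines of Python. The effective batch size follows directly from the note on batch_size; the learning rate adjustment uses the common linear scaling heuristic, which is an assumption of this sketch and not a rule stated by DeepDetect or Caffe. Both function names are hypothetical.

```python
def effective_batch_size(batch_size, num_gpus):
    """batch_size applies per GPU, so the total grows with the GPU count."""
    return batch_size * num_gpus


def linearly_scaled_lr(base_lr, num_gpus):
    """Assumed heuristic: scale base_lr in proportion to the effective batch.

    This is only a starting point; the right value depends on the model
    and solver, and should be validated on the actual training run.
    """
    return base_lr * num_gpus


print(effective_batch_size(64, 3))  # 192, as in the example above
print(linearly_scaled_lr(0.01, 3))
```

In practice the scaled value is a first guess to tune from, and some setups train better with a smaller increase or with base_lr left unchanged.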

Notes on hardware and performance

Hardware considerations

  • the multi-GPU setup is constrained by the GPU with the least memory

  • mixing Nvidia GPUs with different architectures (e.g. Kepler and Pascal) is not supported


Performance considerations

  • scaling is sub-linear in the number of GPUs: expect about x1.8 with 2 GPUs, with scaling efficiency decreasing slightly as more GPUs are added

  • communication over the PCIe bridges may become the bottleneck; when it does, the software lets you know that it is Waiting for data.
