Why is there no distributed inference?

seungjun · November 27, 2020, 1:26am

So far I have only used a single-server multi-GPU environment, but in principle DDP can be used at inference time, too.

Two things get in the way of using DDP at inference:

the synchronization during the backward pass
DistributedSampler, which modifies the dataloader so that the number of samples is evenly divisible by the number of GPUs.
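
To make the second point concrete, here is a small pure-Python sketch of the pad-then-stride split as I understand it (no shuffling, default non-dropping behaviour assumed). `padded_shards` is a hypothetical helper written for illustration, not part of PyTorch.

```python
import math


def padded_shards(num_samples, world_size):
    """Illustrative only: pad the index list, then give each rank a strided slice.

    Assumes num_samples >= world_size and no shuffling.
    """
    per_rank = math.ceil(num_samples / world_size)
    total = per_rank * world_size
    indices = list(range(num_samples))
    # Pad by wrapping around to the start, so some samples appear twice.
    indices += indices[: total - num_samples]
    return [indices[rank:total:world_size] for rank in range(world_size)]


shards = padded_shards(num_samples=10, world_size=4)
for rank, shard in enumerate(shards):
    print(rank, shard)
```

With 10 samples on 4 GPUs, every rank gets 3 indices, so 12 samples are evaluated in total and two of them are counted twice. That skews any metric averaged over the evaluation set.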

At inference, you don’t need backward computation and you don’t want to modify the evaluation data.
You can use a custom dataloader for evaluation, similar to this example, to avoid these problems.
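
As a rough sketch of what such a custom setup could look like (this is not the linked example), the sampler below gives each rank a strided slice of the indices without any padding. The class name `UnpaddedShardSampler` is made up for illustration, and `rank` and `world_size` are passed in explicitly so the snippet runs without a process group.

```python
class UnpaddedShardSampler:
    """Hypothetical sampler: rank r gets indices r, r + world_size, ...

    No index is repeated or dropped, so shards may differ in length by one.
    """

    def __init__(self, dataset_len, rank, world_size):
        if not 0 <= rank < world_size:
            raise ValueError("rank must be in [0, world_size)")
        self.indices = range(rank, dataset_len, world_size)

    def __iter__(self):
        return iter(self.indices)

    def __len__(self):
        return len(self.indices)


# Check that 4 ranks cover a 10-sample dataset exactly once.
all_shards = [list(UnpaddedShardSampler(10, r, 4)) for r in range(4)]
covered = sorted(i for shard in all_shards for i in shard)
print(all_shards)
```

In a real job the two values would presumably come from `torch.distributed.get_rank()` and `torch.distributed.get_world_size()`, and the object would be passed as `sampler=` to the `DataLoader`. Since the shards can have unequal lengths, it is probably safer to run the unwrapped model (`model.module`) under `torch.no_grad()`, so that no rank ends up waiting on a collective call the others never make.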

A related thread is here.