Multi-GPU inference in PyTorch

I trained a model on multiple GPUs using Accelerate from Hugging Face https://github.com/huggingface/accelerate

However, at inference time I don't see both of my GPUs being used. Here is what I have done:

    import torch
    from accelerate import Accelerator

    accelerator = Accelerator()
    device = accelerator.device

    generator = Generator().to(device)
    # Load weights
    checkpoint = torch.load(weights_dir, map_location="cpu")
    generator.load_state_dict(checkpoint['gen_state_dict'])
    generator = accelerator.prepare(generator)
    generator.eval()


How would you do it for inference? 

Thanks!

I'm not very familiar with Accelerate, but what prevents the same approach from being used at inference time? For example, you could reuse the same Accelerate workflow, but disable gradient computation and set the model to eval mode.
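
Something along these lines might work (a rough sketch, not tested; `Generator`, `weights_dir`, and `gen_state_dict` come from your snippet, and `inference_loader` is a placeholder for whatever dataloader you run inference on):

    import torch
    from accelerate import Accelerator

    accelerator = Accelerator()

    # Build the model and load the checkpoint exactly as in training
    generator = Generator()
    checkpoint = torch.load(weights_dir, map_location="cpu")
    generator.load_state_dict(checkpoint['gen_state_dict'])

    # prepare() moves the model to each process's device; preparing the
    # dataloader as well shards the batches across the available GPUs
    generator, inference_loader = accelerator.prepare(generator, inference_loader)
    generator.eval()

    outputs = []
    with torch.no_grad():  # no gradient computation at inference time
        for batch in inference_loader:
            out = generator(batch)
            # collect the per-process results onto every process
            outputs.append(accelerator.gather(out))

One thing to keep in mind: as far as I know, Accelerate only spawns one process per GPU when the script is started with `accelerate launch your_script.py`. If you run it with plain `python`, it falls back to a single process on a single device, which may be why you only see one GPU in use.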