How to run inference under DDP

How can I run inference with a model trained under DistributedDataParallel?
I want to gather all predictions across ranks to calculate metrics and write the results to one file.

Hi! At a high level, after training your model with DDP, you can save its state_dict to a path and then load a local (non-DDP) model from that state_dict using load_state_dict. You can find full documentation here: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html#save-and-load-checkpoints.
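To address the metrics/single-file part of the question, here is a minimal sketch. It assumes a `gloo` (CPU) process group and uses `torch.distributed.all_gather_object` to collect each rank's predictions; the demo at the bottom runs single-process (world_size=1) just so the code is self-contained. The helper names (`save_on_rank0`, `gather_predictions`) and the output file name are my own, not from the tutorial.

```python
import os
import torch
import torch.distributed as dist


def save_on_rank0(ddp_model, path):
    # Only rank 0 writes the checkpoint; all ranks hold identical weights.
    if dist.get_rank() == 0:
        torch.save(ddp_model.module.state_dict(), path)
    dist.barrier()  # ensure the file exists before any rank tries to load it


def gather_predictions(local_preds):
    # Collect each rank's Python list of predictions on every rank.
    gathered = [None] * dist.get_world_size()
    dist.all_gather_object(gathered, local_preds)
    # Flatten: [rank0_preds, rank1_preds, ...] -> one combined list.
    return [p for rank_preds in gathered for p in rank_preds]


# Single-process demo with the gloo (CPU) backend.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29501")
dist.init_process_group("gloo", rank=0, world_size=1)

all_preds = gather_predictions([0, 1, 1])
if dist.get_rank() == 0:
    # Rank 0 alone writes the combined predictions to one file.
    with open("predictions.txt", "w") as f:
        f.writelines(f"{p}\n" for p in all_preds)

dist.destroy_process_group()
```

With multiple processes (e.g. launched via `torchrun`), each rank contributes its own `local_preds` and rank 0 ends up with the full set, so metrics can be computed in one place.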


I had a similar issue.
I'm now able to load the model, but I get this error:

AttributeError: 'DistributedDataParallel' object has no attribute 'generate'

Any thoughts on how to solve this?

Please show your code and your PyTorch version.

Hey, I found a way

When I wrap the model in DDP, all of the original model's attributes are available on model.module (where model is the DDP-wrapped model).
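The fix above can be sketched as follows. This is a toy stand-in model, not anyone's real code: `TinyModel` and its `generate()` method are hypothetical, and the demo initializes a single-process `gloo` group only so DDP can be constructed on CPU.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


class TinyModel(torch.nn.Module):
    """Stand-in for a model that defines a custom generate() method."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

    def generate(self, x):
        # Custom method that DDP does NOT proxy.
        return self.forward(x).argmax(dim=-1)


# Single-process demo with the gloo (CPU) backend.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29502")
dist.init_process_group("gloo", rank=0, world_size=1)

model = DDP(TinyModel())
x = torch.randn(3, 4)

# model.generate(x) would raise AttributeError: DDP only proxies forward().
preds = model.module.generate(x)  # reach the wrapped model via .module
print(preds.shape)  # torch.Size([3])

dist.destroy_process_group()
```

The same pattern applies when saving: `model.module.state_dict()` gives a checkpoint that loads cleanly into an unwrapped model.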