How can I run inference with a model trained under distributed data parallel?
I want to gather all predictions across ranks to calculate metrics and write the results to one file.
Hi, at a high level: after training your model with DDP, you can save its state_dict
to a path and then load a local (non-DDP) model from that state_dict using load_state_dict.
You can find the full documentation here: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html#save-and-load-checkpoints.
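To sketch both halves of the question, here is a minimal example (the `Linear(4, 2)` architecture, file path, and helper names are placeholders, not anything from the tutorial): rank 0 saves the underlying model's state_dict, every rank reloads it into a plain model for inference, and `torch.distributed.all_gather_object` collects each rank's predictions so one rank can write a single results file.

```python
import torch
import torch.distributed as dist


def save_and_reload(ddp_model, path="checkpoint.pt"):
    # Only rank 0 writes. Saving ddp_model.module.state_dict() (not
    # ddp_model.state_dict()) keeps the keys free of the "module." prefix,
    # so they load directly into an unwrapped model.
    if dist.get_rank() == 0:
        torch.save(ddp_model.module.state_dict(), path)
    dist.barrier()  # ensure the file exists before other ranks read it
    model = torch.nn.Linear(4, 2)  # assumption: same architecture as the wrapped model
    model.load_state_dict(torch.load(path, map_location="cpu"))
    model.eval()
    return model


def gather_predictions(local_preds):
    # Collect each rank's Python object (e.g. a list of predictions).
    # After this call every rank holds all ranks' lists; flatten them
    # so rank 0 can compute metrics and write one file.
    gathered = [None] * dist.get_world_size()
    dist.all_gather_object(gathered, local_preds)
    return [p for rank_preds in gathered for p in rank_preds]
```

If only rank 0 needs the predictions, `dist.gather_object` (which collects to a single destination rank) saves some communication compared to `all_gather_object`.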
I had a similar issue.
I'm now able to load the model, but I get this error:
AttributeError: 'DistributedDataParallel' object has no attribute 'generate'
Any thoughts on how to solve this?
Please show your code and PyTorch version.
Hey, I found a way:
when I wrap the model in DDP, all of my original model's attributes can be found under model.module (where model is the DDP-wrapped model).