Oktai15
(Oktai Tatanov)
April 3, 2018, 11:07pm
Hi there!
I successfully trained my model that was declared as:
model = ModelClass(...)
model = torch.nn.DataParallel(model, device_ids=[0, 1, 2])
During training, I saved my weights like this:
best_model = copy.deepcopy(model.state_dict())
torch.save(best_model, path)
And I could load my weights back for testing (the result isn’t NoneType, I checked):
w = torch.load(path) # type(w) is <class 'collections.OrderedDict'>
But! When I try:
trained_model = model.load_state_dict(w)
I get a NoneType! What happened? Maybe it is caused by DataParallel? What should I do?
You don’t have to assign the return value of load_state_dict to trained_model; model itself should have all weights loaded in place.
Have a look at the Serialization info.
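A minimal sketch of the point above, using a hypothetical nn.Linear as a stand-in for the poster’s ModelClass (the file name best_model.pth is also an assumption):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the poster's ModelClass.
model = nn.Linear(4, 2)

torch.save(model.state_dict(), "best_model.pth")
w = torch.load("best_model.pth")

# load_state_dict copies the weights into `model` in place. Its return
# value is NOT the model (very old PyTorch returned None, newer versions
# return an _IncompatibleKeys named tuple), so keep using `model` itself.
model.load_state_dict(w)
```

After this call, model carries the loaded weights and can be used for inference directly.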
Oktai15
(Oktai Tatanov)
April 3, 2018, 11:21pm
As far as I understood, all weights will be loaded into model properly, right?
Yes, you could also verify it by printing some weights (print(model.layer_name.weight)) before and after loading.
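That check can be sketched like this, with a hypothetical nn.Linear standing in for the real model and ckpt.pth as an assumed file name:

```python
import copy
import torch
import torch.nn as nn

# Hypothetical small model standing in for the trained one.
model = nn.Linear(3, 1)
saved = copy.deepcopy(model.state_dict())
torch.save(saved, "ckpt.pth")

# Perturb the weights so the effect of loading is visible.
with torch.no_grad():
    model.weight.zero_()

before = model.weight.clone()
model.load_state_dict(torch.load("ckpt.pth"))
after = model.weight.clone()

print(torch.equal(before, after))           # the load changed the weights
print(torch.equal(after, saved["weight"]))  # the original values are restored
```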
J_Liu
(ljsscq2012td@outlook.com)
November 30, 2019, 12:14am
You may try this procedure:
define model = some_model() and device = torch.device('cuda:0'),
wrap it with model = torch.nn.DataParallel(model, device_ids=[0, 1, 2]) and move it with model = model.to(device),
then load the pretrained weights:
model.module.load_state_dict(torch.load('path/to/model.pth'))
As mentioned in the answer above, you do not need to assign the return value of load_state_dict() to the model.
The following is enough:
G.load_state_dict(torch.load('G.pth'))
...
img = G(z)
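Put together, the save/load round trip through .module might look like the sketch below. An nn.Linear stands in for the real G (an assumption), and the G.pth file name follows the post; on a machine without GPUs, DataParallel simply forwards to the wrapped module.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the poster's generator G.
G = nn.DataParallel(nn.Linear(8, 8)).to(device)

# Saving through .module keeps the state-dict keys free of the
# "module." prefix that DataParallel adds...
torch.save(G.module.state_dict(), "G.pth")

# ...so they load straight back into the bare module:
G.module.load_state_dict(torch.load("G.pth"))

z = torch.randn(2, 8, device=device)
img = G(z)  # the DataParallel wrapper dispatches to the wrapped module
```

Saving G.module.state_dict() rather than G.state_dict() is what makes the checkpoint loadable later without a DataParallel wrapper.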
Oh, my fault. It is solved. Thank you.
edit: I deleted my question.