Torch has no attribute load_state_dict?

It depends on where the actual bottleneck is.
You could use the ImageNet example to get the actual data loading time. If it stays at a high value, you might have a data loading bottleneck.
If that’s the case, have a look at this post to see some potential workarounds.
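For a rough idea of what that measurement looks like, here is a minimal sketch of the timing pattern, written with the standard library only. The names `measure_data_time` and `train_step` are placeholders of mine, not part of the ImageNet example, and `loader` can be any iterable of batches such as a `DataLoader`. Note that CUDA calls are asynchronous, so on the GPU you would need `torch.cuda.synchronize()` inside the step for the compute time to be meaningful.

```python
import time


def measure_data_time(loader, train_step):
    """Return (data_seconds, compute_seconds) accumulated over one pass.

    `loader` is any iterable of batches and `train_step` is a hypothetical
    callable that runs one training step on a batch.
    """
    data_time = 0.0
    compute_time = 0.0
    end = time.perf_counter()
    for batch in loader:
        # Time spent waiting for the next batch to become available.
        start = time.perf_counter()
        data_time += start - end
        train_step(batch)
        end = time.perf_counter()
        # Time spent inside the training step itself.
        compute_time += end - start
    return data_time, compute_time
```

If `data_time` dominates, the loader is the likely bottleneck; if it stays near zero, look at the model instead.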

On the other hand, if you see the data loading time approaching zero, your model might be the bottleneck, in which case you could try to profile it (e.g. using Nsight) and see which operations are the slowest.

Increasing the number of workers should speed up data loading. However, there is usually a sweet spot, after which adding more workers might slow down the code again.
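One way to find that sweet spot is simply to time a full pass for a few worker counts. The sketch below is only an illustration: `sweep_num_workers` and `make_loader` are hypothetical names, and `make_loader(n)` is assumed to build a fresh loader that uses `n` workers.

```python
import time


def time_one_pass(loader):
    """Iterate once over the loader and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in loader:
        pass
    return time.perf_counter() - start


def sweep_num_workers(make_loader, candidates=(0, 2, 4, 8)):
    """Time one pass per candidate worker count.

    Returns the fastest candidate and a dict mapping each candidate
    to its measured time in seconds.
    """
    timings = {n: time_one_pass(make_loader(n)) for n in candidates}
    best = min(timings, key=timings.get)
    return best, timings
```

With PyTorch you could pass something like `lambda n: DataLoader(dataset, batch_size=64, num_workers=n)` as `make_loader`. The first pass often includes startup overhead, so repeating the measurement gives more reliable numbers.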

Hi ptrblck

I hope you are well. I am training my DL model: 2 CNN layers with 32 filters in the first layer and 64 filters in the second, followed by 3 FC layers. My samples are balanced, with 4000 positives and 4000 negatives. The ROC AUC is 0.3 with 10-fold cross-validation, which is very low. I checked my training set and the labeling is correct. Do you think overfitting is happening and I need more data for training? I used one dropout layer in the FC layers.

Cheers
S

How well does the model perform on the training data?
You would observe overfitting if there is a gap between the training and validation performance.

This type of error happened to me too.
Here’s how I solved it:

I had saved the model like this:

state = {'epoch': epoch + 1, 'state_dict': model.state_dict(),
         'optimizer': optimizer.state_dict(), 'loss': loss}
torch.save(state, save_path)

So in order to load the model, I first had to instantiate the model architecture, as below:

model = Net()
checkpoint = torch.load(path)
model.load_state_dict(checkpoint['state_dict'])

The model then loaded successfully and I tested it on the test set. :slight_smile:

Hello ptrblck,

I am following exactly the same process for model loading:

model = CQCCModel()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

model.load_state_dict(torch.load(model_path, map_location='cuda'))
optimizer.load_state_dict(torch.load(model_path, map_location='cuda'))

but I am getting this error:

self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for CQCCModel:
Missing key(s) in state_dict: "layer1.0.weight", "layer1.0.bias", "layer1.1.weight", "layer1.1.bias", "layer1.1.running_mean", "layer1.1.running_var", "layer2.0.conv1.weight", "layer2.0.conv1.bias", "layer2.0.bn1.weight", "layer2.0.bn1.bias", "layer2.0.bn1.running_mean", "layer2.0.bn1.running_var", "layer2.0.conv2.weight", "layer2.0.conv2.bias", "layer2.0.conv11.weight", "layer2.0.conv11.bias", "layer3.0.conv1.weight", "layer3.0.conv1.bias", "layer3.0.bn1.weight", "layer3.0.bn1.bias", "layer3.0.bn1.running_mean", "layer3.0.bn1.running_var", "layer3.0.conv2.weight", "layer3.0.conv2.bias", "layer3.0.conv11.weight", "layer3.0.conv11.bias", "layer3.0.pre_bn.weight", "layer3.0.pre_bn.bias", "layer3.0.pre_bn.running_mean", "layer3.0.pre_bn.running_var", "layer4.0.conv1.weight",

Unexpected key(s) in state_dict: "module.layer1.0.weight", "module.layer1.0.bias", "module.layer1.1.weight", "module.layer1.1.bias", "module.layer1.1.running_mean", "module.layer1.1.running_var", "module.layer1.1.num_batches_tracked", "module.layer2.0.conv1.weight", "module.layer2.0.conv1.bias", "module.layer2.0.bn1.weight", "module.layer2.0.bn1.bias", "module.layer2.0.bn1.running_mean", "module.layer2.0.bn1.running_var", "module.layer2.0.bn1.num_batches_tracked", "module.layer2.0.conv2.weight", "module.layer2.0.conv2.bias", "module.layer2.0.conv11.weight", "module.layer2.0.conv11.bias", "module.layer3.0.conv1.weight", "module.layer3.0.conv1.bias", "module.layer3.0.bn1.weight", "module.layer3.0.bn1.bias", "module.layer3.0.bn1.running_mean", "module.layer3.0.bn1.running_var", "module.layer3.0.bn1.num_batches_tracked", "module.layer3.0.conv2.weight", "module.layer3.0.conv2.bias", "module.layer3.0.conv11.weight", "module.layer3.0.conv11.bias", "module.layer3.0.pre_bn.weight", "module.layer3.0.pre_bn.bias", "module.layer3.0.pre_bn.running_mean", "module.layer3.0.pre_bn.running_var", "module.layer3.0.pre_bn.num_batches_tracked", "module.layer4.0.conv1.weight", "module.layer4.0.conv1.bias", "module.layer4.0.bn1.weight", "module.layer4.0.bn1.bias", "module.layer4.0.bn1.running_mean", "module.layer4.0.bn1.running_var", "module.layer4.0.bn1.num_batches_tracked", "module.layer4.0.conv2.weight", "module.layer4.0.conv2.bias", "module.layer4.0.conv11.weight", "module.layer4.0.conv11.bias"

I saved the model this way:

model = CQCCModel()

torch.save(model.state_dict(), os.path.join(model_save_path, 'epoch_{}.pth'.format(epoch)))

I don’t understand this error.
Could you please let me know why this error occurs and what the right way to load the model is?

Thanks in advance.

The model and optimizer would need their own state_dicts, while you are trying to load the model.state_dict() into both objects.

Store the checkpoint as:

checkpoint = {}
checkpoint['model'] = model.state_dict()
checkpoint['optimizer'] = optimizer.state_dict()
torch.save(checkpoint, PATH)

and load it via:

checkpoint = torch.load(PATH)
model = CQCCModel()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

model.load_state_dict(checkpoint['model'])
optimizer.load_state_dict(checkpoint['optimizer'])
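Separately, every unexpected key in the error above starts with `module.`, which usually means the state_dict was saved from a model wrapped in `nn.DataParallel` (or `DistributedDataParallel`). Assuming that is what happened here, a minimal sketch for loading such a file into an unwrapped model is to strip the prefix first. The helper name `strip_module_prefix` is my own, and the example uses plain floats as stand-ins for tensors.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only strip the prefix when it is at the very start of the key.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in for the dict returned by torch.load(...).
saved = OrderedDict([
    ("module.layer1.0.weight", 1.0),
    ("module.layer1.0.bias", 2.0),
])
print(list(strip_module_prefix(saved)))
```

You would then call `model.load_state_dict(strip_module_prefix(torch.load(model_path)))`. Alternatively, saving `model.module.state_dict()` from the wrapped model avoids the prefix in the first place.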

Thank you so much ptrblck for your reply.

After using this, it gives an error:

checkpoint = torch.load(PATH)
model = CQCCModel()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
model.load_state_dict(checkpoint['model'])

error: model.load_state_dict(checkpoint['model'])
KeyError: 'model'

If I use "model.load_state_dict(checkpoint[model])" instead, it shows this error:

error:

KeyError: CQCCModel(
  (layer1): Sequential(
    (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): LeakyReLU(negative_slope=0.03)
  )
  (layer2): Sequential(
    (0): ResNetBlock(
      (conv1): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (lrelu): LeakyReLU(negative_slope=0.01)
      (dropout): Dropout(p=0.5)
      (conv2): Conv2d(32, 32, kernel_size=(3, 3), stride=(3, 3), padding=(1, 1))
      (conv11): Conv2d(32, 32, kernel_size=(3, 3), stride=(3, 3), padding=(1, 1))
    )
  )
  (layer3): Sequential(
    (0): ResNetBlock(
      (conv1): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (lrelu): LeakyReLU(negative_slope=0.01)
      (dropout): Dropout(p=0.5)
      (conv2): Conv2d(32, 32, kernel_size=(3, 3), stride=(3, 3), padding=(1, 1))
      (conv11): Conv2d(32, 32, kernel_size=(3, 3), stride=(3, 3), padding=(1, 1))
      (pre_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): MaxPool2d(kernel_size=3, stride=3, padding=1, dilation=1, ceil_mode=False)
  )
  (layer4): Sequential(
    (0): ResNetBlock(
      (conv1): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (bn1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (lrelu): LeakyReLU(negative_slope=0.01)
      (dropout): Dropout(p=0.5)
      (conv2): Conv2d(32, 32, kernel_size=(3, 3), stride=(3, 3), padding=(1, 1))
      (conv11): Conv2d(32, 32, kernel_size=(3, 3), stride=(3, 3), padding=(1, 1))
      (pre_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): MaxPool2d(kernel_size=3, stride=3, padding=1, dilation=1, ceil_mode=False)
  )

Why does this error occur after defining the model (model = CQCCModel())?

Any suggestions would be helpful.

Thanks

In your line of code you are passing the model object as the key to the dict:

model.load_state_dict(checkpoint[model])

In my example I’ve used the strings "model" and "optimizer" for the checkpoint.
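The earlier `KeyError: 'model'` would also be consistent with loading an older file that was written with `torch.save(model.state_dict(), ...)`, i.e. a flat state_dict without a `'model'` entry. I can't tell from your post which file you loaded, so treat this as a guess. A small sketch that handles both layouts is below; `get_model_state` is a hypothetical helper, and the dicts use plain floats as stand-ins for tensors.

```python
def get_model_state(checkpoint, candidates=("model", "state_dict")):
    """Return the model state_dict from either a nested or a flat checkpoint."""
    for name in candidates:
        # String keys such as "model" are what the checkpoint dict was saved with.
        if name in checkpoint:
            return checkpoint[name]
    # No wrapper key found: assume the file holds the state_dict itself.
    return checkpoint


nested = {"model": {"layer1.0.weight": 1.0}, "optimizer": {}}
flat = {"layer1.0.weight": 1.0}
print(list(get_model_state(nested)), list(get_model_state(flat)))
```

Printing `checkpoint.keys()` right after `torch.load(PATH)` is the quickest way to see which layout a given file has.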

PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier. :wink: