Problem in Loading the Saved model

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

import torchvision
import torchvision.transforms as transforms

import numpy as np
from torch.utils.data.dataset import Dataset
ckpt_path='/home/students/soumyajit_po/hsd_cnn_code/SOUMYA/checkpoint/vggCifar10TEST.pth'
checkpoint=torch.load(ckpt_path)
print("Successfully Loaded")

This code is giving me the following error:

Traceback (most recent call last):
  File "Checking.py", line 16, in <module>
    checkpoint=torch.load(ckpt_path)
  File "/home/anaconda2/envs/pytorch_v36/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/anaconda2/envs/pytorch_v36/lib/python3.6/site-packages/torch/serialization.py", line 574, in _load
    result = unpickler.load()
AttributeError: Can't get attribute 'CifarVggnet' on <module '__main__' from 'Checking.py'>

While saving, I did:
if(test_acc>best_accuracy):
    best_accuracy=test_acc
    print('Saving...')
    if not os.path.isdir('checkpoint'):
        os.mkdir('checkpoint')
    torch.save(model,PATH)

Please tell me where I am going wrong as I am following the instruction given in https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-a-general-checkpoint-for-inference-and-or-resuming-training

@ptrblck @smth please help

Did you store the complete model using torch.save(model) or just the state_dict (the latter approach is recommended)?
Based on the error message it looks like you’ve used the former approach.
If so, you would need to restore the file structure in your current working directory as it was when you saved the model.
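
To illustrate why the complete-model approach depends on the original files: torch.save(model) pickles the object, and pickle stores only a reference to the class (its module name and class name), not the class code. The sketch below reproduces the same kind of AttributeError with plain pickle. The module name fake_training_script and the toy CifarVggnet class are made up for the demonstration and are not the code from this thread.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the script that originally defined the model.
mod = types.ModuleType("fake_training_script")
sys.modules["fake_training_script"] = mod


class CifarVggnet:  # toy class, not the real network
    def __init__(self):
        self.weights = [1.0, 2.0]


# Make the class look as if it had been defined inside that script.
CifarVggnet.__module__ = "fake_training_script"
CifarVggnet.__qualname__ = "CifarVggnet"
mod.CifarVggnet = CifarVggnet

# Pickling records "fake_training_script.CifarVggnet" plus the instance data.
payload = pickle.dumps(CifarVggnet())

# Simulate loading from a script where the class definition is missing.
del mod.CifarVggnet
try:
    pickle.loads(payload)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Once the definition is reachable under the same module and name, loading works.
mod.CifarVggnet = CifarVggnet
restored = pickle.loads(payload)
print(restored.weights)
```

Saving only the state_dict avoids this lookup, because the file then contains tensors rather than a reference to your class.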

PS: Please don’t tag specific people, as this might discourage others from posting an answer. :wink:

You are right, I previously used torch.save(model). But now I did the following:

I need to save three things: the epoch, best_acc, and the model itself.
I did it this way:
state={'net':model, 'epoch':epoch,'accuracy':best_acc}
torch.save(state,PATH)

But when I load it using checkpoint=torch.load(saved_path),
I get an AttributeError: Can't get attribute 'CifarVggnet' on <module '__main__' from 'Checking.py'>
Here, CifarVggnet is the name of the model class that I’m saving, and I’m loading it from Checking.py.

To avoid this error, I then defined the model as:
class CifarVggnet(nn.Module):
    def __init__(self):
        super(CifarVggnet,self).__init__()

I defined this in Checking.py, the file from which I’m loading the checkpoint.

Now I can access the contents of the saved checkpoint as checkpoint['net'], checkpoint['epoch'] and checkpoint['accuracy']. I printed them and they print correctly.

But now I have a new problem:
I can print the model, but I cannot access its features part.
That is, when I do print(net.features), I get this error:

Traceback (most recent call last):
  File "Checking.py", line 27, in <module>
    print(net.features)
  File "/home/anaconda2/envs/pytorch_v36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __getattr__
    type(self).__name__, name))
AttributeError: 'CifarVggnet' object has no attribute 'features'

How do I solve this now? Am I doing it the wrong way?

You would have to restore the complete scripts in your current working directory, i.e. with the complete model definition as explained here.
It seems you are just creating a dummy model using

class CifarVggnet(nn.Module):
    def __init__(self):
        super(CifarVggnet,self).__init__()

, while you would need the original model definition.

OK, thank you very much for your reply. I’m doing that now.
One more thing: I need to save the accuracy and epoch along with the model.
Will they be saved by torch.save(the_model.state_dict(), PATH) itself? If so, how do I access them?

I would recommend storing a dict with all necessary values, e.g. as given in the ImageNet example.
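
As a rough sketch of that suggestion (not the code from the ImageNet example itself), the checkpoint can be an ordinary dict holding the model's state_dict next to the epoch and accuracy. TinyNet, the key names, the placeholder values and the temporary path are all assumptions made for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):  # hypothetical stand-in for the real model class
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


model = TinyNet()

# Store only plain data: the weights as a state_dict plus Python numbers.
checkpoint = {
    "net": model.state_dict(),
    "epoch": 3,          # placeholder value
    "accuracy": 91.5,    # placeholder value
}
ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save(checkpoint, ckpt_path)

# Loading: build the model from its class first, then fill in the weights.
restored = TinyNet()
loaded = torch.load(ckpt_path, map_location="cpu")
restored.load_state_dict(loaded["net"])
start_epoch = loaded["epoch"]
best_accuracy = loaded["accuracy"]
print(start_epoch, best_accuracy)
```

The loading script still needs access to the class definition, but only to construct the model, not to unpickle the file.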

model = CifarVggnet(*args, **kwargs)
model.load_state_dict(torch.load(ckpt_path))
print(model)
I did that, but during loading I get this error:

  File "Checking.py", line 17, in <module>
    model = CifarVggnet(*args, **kwargs)
NameError: name 'CifarVggnet' is not defined

Where did you define CifarVggnet? It looks like this error is actually thrown in:

model = CifarVggnet(*args, **kwargs)

not in loading the state_dict.

Checking.py

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

import torchvision
import torchvision.transforms as transforms

import numpy as np
from torch.utils.data.dataset import Dataset
ckpt_path='/home/students/soumyajit_po/hsd_cnn_code/SOUMYA/checkpoint/vggCifar10TEST.pth'
model = CifarVggnet(*args, **kwargs)
model.load_state_dict(torch.load(ckpt_path))
print(model)

I am using Checking.py to load the model saved in ckpt_path.

The definition of CifarVggnet is still missing.
You should have some code like:

class CifarVggnet(nn.Module):
    def __init__(self):
        super(CifarVggnet, self).__init__()
        self....

    def forward(self, x):
        ...

Either define it in the same file before creating an instance of this model or import it from another file, where it was defined.

import torch
import torchvision
import torchvision.transforms as transforms
import os

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
trainset=torchvision.datasets.CIFAR10(root='./data',train=True,download=True,transform=transform_train)
trainloader=torch.utils.data.DataLoader(trainset,batch_size=4,shuffle=True,num_workers=2)

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
testset=torchvision.datasets.CIFAR10(root='./data',train=False,download=True,transform=transform_test)
testloader=torch.utils.data.DataLoader(testset,batch_size=4,shuffle=True,num_workers=2)

classes=('plane','car','bird','cat','deer','dog','frog','horse','ship','truck')

import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
device=torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

import torchvision.models as models
vggnet=models.vgg16_bn(pretrained=True)

import matplotlib.pyplot as plt

classifier=list(vggnet.classifier.children())[:-6]
features=list(vggnet.features.children())

features.extend(nn.Sequential(nn.AvgPool2d(1, stride=1, padding=0, ceil_mode=False, count_include_pad=True)))
avgpool=list(vggnet.avgpool.children())[:-1]

vggnet.features=nn.Sequential(*list(features))
vggnet.classifier=nn.Sequential(*list(classifier))
vggnet.classifier[0]=nn.Linear(512,10)
vggnet.avgpool=nn.Sequential(*list(avgpool))

class CifarVggnet(nn.Module):
    def __init__(self,vggnet):
        super(CifarVggnet,self).__init__()
        self.vggnet=vggnet

    def forward(self,x):
        return self.vggnet(x)

model=CifarVggnet(vggnet)
print(model)

PATH='./checkpoint/vggCifar10TEST.pth'
batch_size=128
num_epochs=1
model.to(device)
criterion=nn.CrossEntropyLoss()
optimizer=optim.Adam(model.parameters(),lr=0.0001)

torch.cuda.empty_cache()
globaliter=-1
best_accuracy=0
trainLOSS=[]
testACC=[]
testLOSS=[]
avg_train=0
avg_test_loss=0
for epoch in range(num_epochs):
    globaliter+=1
    running_loss=0
    total=0
    avg_train=0
    correct_classified=0
    model.train()
    for i,data in enumerate(trainloader):
        inputs,labels=data
        inputs,labels=inputs.to(device),labels.to(device)
        optimizer.zero_grad()
        outputs=model(inputs)
        loss=criterion(outputs,labels)
        loss.backward()
        optimizer.step()
        predicted=torch.argmax(outputs,1)
        total+=labels.size(0)
        correct_classified+=(predicted==labels).sum().item()
        running_loss+=loss.item()
        if i%200==199:
            print('Epoch:[%d, %5d] ' % (epoch+1,i+1))

    train_acc=100*(correct_classified/total)
    avg_train=float(running_loss)/total
    trainLOSS.append(avg_train)

    print('Train Accuracy:%.3f'%(train_acc))
    print('Train Average Loss:%.4f'%(avg_train))

    c=0
    total=0
    globaliter=-1
    l=0
    j=-1
    model.eval()
    with torch.no_grad():
        for data in testloader:
            globaliter+=1
            j=j+1
            inputs,labels=data
            inputs,labels=inputs.to(device),labels.to(device)
            optimizer.zero_grad()
            outputs=model(inputs)
            loss=criterion(outputs,labels)
            l=l+loss.item()
            predicted=torch.argmax(outputs,1)
            total+=labels.size(0)
            c+=(predicted==labels).sum().item()

    test_acc=(100*c/total)
    if(test_acc>best_accuracy):
        best_accuracy=test_acc
        print('Saving...')
        state={'net':model,'acc':best_accuracy,'epoch':epoch}
        if not os.path.isdir('checkpoint'):
            os.mkdir('checkpoint')
        # torch.save(state,PATH)
        # torch.save(model,PATH)
        torch.save(model.state_dict(),PATH)
    print('Accuracy of the network on test images:%.3f %%' % test_acc)

    testACC.append(test_acc)
    avg_test_loss=float(l)/total
    testLOSS.append(avg_test_loss)
    fig1 = plt.figure(1)
    plt.plot(range(epoch+1),trainLOSS,'r-',label='train loss')
    plt.plot(range(epoch+1),testLOSS,'g-',label='test loss')
    if epoch==0:
        plt.legend(loc='upper left')
        plt.xlabel('Epochs')
        plt.ylabel('Loss')
    fig2 = plt.figure(2)
    plt.plot(range(epoch+1),testACC,'g-',label='test')
    if epoch==0:
        plt.legend(loc='upper left')
        plt.xlabel('Epochs')
        plt.ylabel('Testing Accuracy')

print('Best Accuracy of the network on test images:%.3f %%'% best_accuracy)
fig1.savefig('trainloss_vs_epoch1.png')
fig2.savefig('testacc_vs_epoch1.png')

This is the code where I’m actually training and saving. For testing purposes I set num_epochs=1.

As you suggested, I put the class in a separate file, vgg.py, and kept it in a folder named model:

import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision.models as models
vggnet=models.vgg16_bn(pretrained=True)
classifier=list(vggnet.classifier.children())[:-6]
features=list(vggnet.features.children())
features.extend(nn.Sequential(nn.AvgPool2d(1, stride=1, padding=0, ceil_mode=False, count_include_pad=True)))
avgpool=list(vggnet.avgpool.children())[:-1]
vggnet.features=nn.Sequential(*list(features))
vggnet.classifier=nn.Sequential(*list(classifier))
vggnet.classifier[0]=nn.Linear(512,10)
vggnet.avgpool=nn.Sequential(*list(avgpool))
class CifarVggnet(nn.Module):
    def __init__(self,vggnet):
        super(CifarVggnet,self).__init__()
        self.vggnet=vggnet

    def forward(self,x):
        return self.vggnet(x)

model=CifarVggnet(vggnet)

-----------------------------------------------------------------------------------------------------------------------------------------
In a separate file, Checking.py, I used from model import *:

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

import torchvision
import torchvision.transforms as transforms
from model import *
import numpy as np
from torch.utils.data.dataset import Dataset

ckpt_path='/home/students/soumyajit_po/hsd_cnn_code/SOUMYA/checkpoint/vggCifar10TEST.pth.tar'
checkpoint=torch.load(ckpt_path,map_location=lambda storage,loc:storage)

---------------------------------------------------------------------------------------------------------------------------------------

The error I get is:
  File "Checking.py", line 15, in <module>
    checkpoint=torch.load(ckpt_path,map_location=lambda storage,loc:storage)
  File "/home/anaconda2/envs/pytorch_v36/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/anaconda2/envs/pytorch_v36/lib/python3.6/site-packages/torch/serialization.py", line 574, in _load
    result = unpickler.load()
AttributeError: Can't get attribute 'CifarVggnet' on <module '__main__' from 'Checking.py'>

I do not understand this error!
CifarVggnet, the model I saved, should be accessible now that I have imported the model.

Are you able to create a new instance using

model = CifarVggnet()

in Checking.py?

PS: You can post code snippets using three backticks ``` :wink:

No I am not able to do so.
Doing that gives me the following error:

Traceback (most recent call last):
  File "Checking.py", line 12, in <module>
    model=CifarVggnet()
NameError: name 'CifarVggnet' is not defined

So apparently your from model import * does not import CifarVggnet.

I would recommend avoiding the asterisk for now and trying to import the module directly.
If you manage to import the module and create a new instance, try to run your restoring code again.

I tried it now, but it is giving me the same error.

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

import torchvision
import torchvision.transforms as transforms
from model import vgg
import numpy as np
from torch.utils.data.dataset import Dataset

net=CifarVggnet()
ckpt_path='/home/students/soumyajit_po/hsd_cnn_code/SOUMYA/checkpoint/vggCifar10TEST.pth.tar'
#model=vgg.model
#print(model)
checkpoint=torch.load(ckpt_path,map_location=lambda storage,loc:storage)

And the error is

Traceback (most recent call last):
  File "Checking.py", line 12, in <module>
    net=CifarVggnet()
NameError: name 'CifarVggnet' is not defined

This is my vgg.py inside model folder

import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision.models as models
vggnet=models.vgg16_bn(pretrained=True)

classifier=list(vggnet.classifier.children())[:-6]
features=list(vggnet.features.children())
features.extend(nn.Sequential(nn.AvgPool2d(1, stride=1, padding=0, ceil_mode=False, count_include_pad=True)))
avgpool=list(vggnet.avgpool.children())[:-1]
vggnet.features=nn.Sequential(*list(features))
vggnet.classifier=nn.Sequential(*list(classifier))
vggnet.classifier[0]=nn.Linear(512,10)
vggnet.avgpool=nn.Sequential(*list(avgpool))

class CifarVggnet(nn.Module):
    def __init__(self,vggnet):
        super(CifarVggnet,self).__init__()
        self.vggnet=vggnet

    def forward(self,x):
        return self.vggnet(x)

model=CifarVggnet(vggnet)

Your script should already have thrown an error at from model import vgg, since model doesn’t seem to be a valid module.
I would recommend looking into Python modules to get an idea of how your model should be defined inside vgg.py.

That being said, I would generally recommend saving and loading the state_dict instead.
Using this approach your files could look like:

# models.py
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc = nn.Linear(1, 1)
        
    def forward(self, x):
        x = self.fc(x)
        return x

# main.py
import torch
import torch.nn as nn

from models import MyModel

# Create an instance first, then load the saved weights into it
model = MyModel()
model.load_state_dict(torch.load('STATE_DICT_PATH'))

Thank you very much for this help. I was able to solve the AttributeError.
I read the link you gave,
and then I understood the meaning of the error:

AttributeError: Can't get attribute 'CifarVggnet' on <module '__main__' from 'Checking.py'>

It turns out that Checking.py was apparently not importing the model definition from vgg.py.
To make it work I did the following:

from model.vgg import *

instead of:

from model import *

and

from model import vgg

This line resolved the problem.
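
For anyone hitting the same thing, here is a small sketch of what each of these import forms actually puts into the importing script's namespace. It builds a throwaway package in a temporary directory; the package name demo_model (standing in for the model folder), the empty __init__.py and the placeholder class body are assumptions for the demonstration, not the real project layout.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Build a throwaway project: <root>/demo_model/vgg.py defines CifarVggnet.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_model")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()  # empty package file
with open(os.path.join(pkg_dir, "vgg.py"), "w") as f:
    f.write(textwrap.dedent("""
        class CifarVggnet:  # placeholder body, not the real network
            pass
    """))
sys.path.insert(0, root)
importlib.invalidate_caches()

# 1) Star import from the package: vgg.py is never executed, so the class
#    is not brought into the namespace.
ns_star = {}
exec("from demo_model import *", ns_star)
print("CifarVggnet" in ns_star)

# 2) Importing the submodule binds only the name `vgg`; the class must be
#    reached as vgg.CifarVggnet.
ns_mod = {}
exec("from demo_model import vgg", ns_mod)
print("vgg" in ns_mod, "CifarVggnet" in ns_mod)

# 3) Importing from the submodule binds the class name itself.
ns_cls = {}
exec("from demo_model.vgg import CifarVggnet", ns_cls)
print("CifarVggnet" in ns_cls)
```

Importing the class by name, as in the third form, also makes it obvious at a glance which names the script depends on, which a star import hides.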
Thank you very much @ptrblck for bearing with my non-stop comments and replies.
