Low accuracy when loading the model and testing

Hey guys,

I trained my CIFAR-10 full-precision network using a VGG architecture and got 92.27% accuracy on the validation set. However, when I saved and loaded the model and then tested it with the loop below, I am getting only 35% accuracy. I am pretty sure the file is not corrupted.

import torch
import torch.nn.functional as F

device=torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# torch.load here returns the saved state_dict (a dict of tensors), not a model instance
model=torch.load('/content/cifar_fullprecison_vgg8.pth')
class Ternary_batch_rel(torch.nn.Module):
    
  def __init__(self,batchnorm_size):
    super(Ternary_batch_rel,self).__init__()
    self.l1=torch.nn.Sequential(
    torch.nn.ReLU(),
    torch.nn.BatchNorm2d(batchnorm_size)
    )
  
  def forward(self,x):
    out=self.l1(x)
    return out
    
z1=Ternary_batch_rel(128).to(device)
z2=Ternary_batch_rel(256).to(device)
z3=Ternary_batch_rel(512).to(device)
class Ternary_max_pool(torch.nn.Module):
  def __init__(self):
    
    super(Ternary_max_pool,self).__init__()
    self.l1=torch.nn.Sequential(
    torch.nn.MaxPool2d(kernel_size=2,stride=2))
      
  def forward(self,x):
    out=self.l1(x)
    return out

zm=Ternary_max_pool().to(device)
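
If the file was saved with torch.save(model.state_dict(), ...), torch.load returns a plain dictionary of tensors rather than a module. A quick sanity check (standard PyTorch, nothing specific to this model) is to print its keys and shapes to see exactly which parameters were saved:

# Inspect what was actually loaded: for a saved state_dict this is an
# OrderedDict mapping parameter names to tensors, not an nn.Module.
state=torch.load('/content/cifar_fullprecison_vgg8.pth',map_location='cpu')
print(type(state))
for name,tensor in state.items():
  print(name,tuple(tensor.shape))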

Testing loop

correct=0
total=0
for images,labels in test_loader:
  images=images.to(device)
  labels=labels.to(device)
  y1=F.conv2d(images,model['layer1.0.weight'],padding=1)
  y2=z1(y1)
  y2=F.conv2d(y2,model['layer1.3.weight'],padding=1)
  y3=z1(y2)
  y3=zm(y3)
  y4=F.conv2d(y3,model['layer2.0.weight'],padding=1)
  y4=z2(y4)
  y5=F.conv2d(y4,model['layer2.3.weight'],padding=1)
  y5=z2(y5)
  y6=zm(y5)
  y7=F.conv2d(y6,model['layer3.0.weight'],padding=1)
  y8=z3(y7)
  y9=F.conv2d(y8,model['layer3.3.weight'],padding=1)
  y10=z3(y9)
  y11=zm(y10)
  y11=y11.view(y11.size(0),-1)
  y12=F.linear(y11,model['layer4.0.weight'])
  y13=F.relu(y12)
  y14=F.dropout(y13)
  y15=F.linear(y14,model['layer4.3.weight'])
  _,predicted=torch.max(y15,1)
  total+=labels.size(0)
  correct+=(predicted==labels).sum().item()
print('Test accuracy of the model on the 10000 test images:{}%'.format((correct/total)*100))

I’m not sure why you are calling the model components explicitly rather than using the model definition’s forward method. Any good reason for that?

What about bias terms in conv, linear layers? Are you using them too?

I am training without bias terms, and the trained model that I am loading here was also trained without them.

Also, I am implementing a paper that does not train the weights directly but rather the parameters of a distribution from which the weights are drawn, so testing is not straightforward (weights need to be sampled from the distribution and then used to run inference on the data).
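
For illustration only, here is a minimal sketch of that sampling step, assuming (hypothetically) that the distribution is a Gaussian over each weight tensor parameterized by a learned mean and log standard deviation; the actual paper may use a different parameterization:

import torch
import torch.nn.functional as F

# Hypothetical example: sample a conv weight from a learned Gaussian and use it
# in a functional forward pass. mu/log_sigma are placeholder names, not
# parameters from the paper or from the post above.
mu=torch.randn(128,3,3,3)           # learned mean of the weight distribution
log_sigma=torch.full_like(mu,-3.0)  # learned log standard deviation

def sample_weight(mu,log_sigma):
  # reparameterization-style sample: w = mu + sigma*eps, eps ~ N(0,1)
  return mu+torch.exp(log_sigma)*torch.randn_like(mu)

images=torch.randn(8,3,32,32)       # dummy CIFAR-10 sized batch
w_sampled_1=sample_weight(mu,log_sigma)
y1=F.conv2d(images,w_sampled_1,padding=1)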


Did you call model.eval() after loading the model?
This is usually needed if your model contains layers such as dropout and batchnorm.

But the model here is a dictionary, since it contains the state_dict; it is not an instance of the class.

It’s not completely clear to me how you are using the model, but even if you call the layers in a functional way, you should set them to eval to get the proper validation accuracy.
You can call .eval() on each module separately.
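
Applied to the code in the first post, that would look roughly like the sketch below (assuming the z1/z2/z3/zm modules from that post and a test-set DataLoader); note that F.dropout also applies dropout unless training=False is passed:

# Switch the wrapper modules to eval mode so BatchNorm uses its running
# statistics instead of per-batch statistics.
for m in (z1,z2,z3,zm):
  m.eval()

with torch.no_grad():                # gradients are not needed for evaluation
  for images,labels in test_loader:
    images=images.to(device)
    labels=labels.to(device)
    y1=F.conv2d(images,model['layer1.0.weight'],padding=1)
    # ... rest of the functional forward pass as in the loop above, but with
    # dropout disabled explicitly at test time:
    # y14=F.dropout(y13,training=False)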

@ptrblck, let’s say the model were an instance of a neural network class:

model=Conv_net().to(device)

Now during the training phase, I would be using the forward function of the class to forward propagate and optimize the parameters. The parameters here are not the weights of the network, but rather the parameters of a probability distribution to which the weights belong.

During the eval() phase, I would have to first sample from the trained probability distribution (parameterized by the learned parameters) and then pass the sampled weights to the neural network, so I would not be able to do something like the following:

model.eval()
for images,labels in valid_loader:
  yout=model(images)
  ...

However, if I have samples from the distribution, I want to do the following to check my validation accuracy:

y1=F.conv2d(images,w_sampled_1,padding=1)
y2=z1(y1)
...

Is there a way to run inference of this kind using a model instance rather than a dictionary?
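
For illustration, one possible pattern (a sketch, not taken from this thread): copy the sampled tensors into the parameters of a model instance under torch.no_grad() and then reuse the normal forward pass. The sampled dictionary and its keys below are hypothetical:

# Sketch: overwrite the parameters of a model instance with sampled weights,
# then run the usual forward pass. 'sampled' is a hypothetical dict mapping
# parameter names (as in model.state_dict()) to sampled tensors.
model=Conv_net().to(device)
model.eval()

sampled={'layer1.0.weight':w_sampled_1}   # one entry per sampled weight tensor

with torch.no_grad():
  for name,param in model.named_parameters():
    if name in sampled:
      param.copy_(sampled[name])          # in-place overwrite of the parameter

  for images,labels in valid_loader:
    yout=model(images.to(device))

Alternatively, model.load_state_dict(sampled, strict=False) loads only the tensors present in the dictionary and leaves the remaining parameters untouched.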

@ptrblck you could check out the code written here:

Did you solve this problem? I have realized that there are accuracy problems when I save the parameters of the model at a certain epoch and try to reproduce its outputs. For instance, suppose I stop the model at epoch 120 and save the model and its outputs. If I load the state_dict and predict over the same data points, there is about ~0.1% error. Why would that be? Is there something I should set to save the state_dict with higher precision?
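
One way to narrow that down (a sketch, assuming the trained model is still in memory): torch.save stores the tensors at their original precision, so you can compare the reloaded state_dict element-wise against the in-memory one. If every tensor matches exactly, the ~0.1% difference more likely comes from train/eval mode or other nondeterminism than from the checkpoint itself:

# Compare the in-memory parameters with the reloaded ones tensor by tensor.
saved=model.state_dict()
torch.save(saved,'checkpoint_epoch120.pth')            # hypothetical filename
reloaded=torch.load('checkpoint_epoch120.pth',map_location='cpu')

for name in saved:
  if not torch.equal(saved[name].cpu(),reloaded[name]):
    print('mismatch in',name)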

I have the same problem: I trained my model to 95% accuracy on the training set and 70% on the validation set, but when I run inference on the same data (the validation set), I’m getting awful results. Did you find an answer?