Testing a single image gives different results and weights on each run

Diego_Hernandez · October 11, 2022, 10:06pm

Hi everyone, I hope you are having a great day!

I have already trained an EfficientNet-b3 model from efficientnet_pytorch on a multi-GPU workstation.
When I load the trained weights on a different, CPU-only server, I get a different test result on each run:

import os
import pickle
import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet

def to_device(data, device):
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class Efficientnet_b3(nn.Module):
    def __init__(self, weight_path):
        super().__init__()
        self.effb3 = EfficientNet.from_pretrained(model_name='efficientnet-b3', weights_path=weight_path, num_classes=1)
        self.effb3 = nn.DataParallel(self.effb3)
    def forward(self, xb): return self.effb3(xb)

device = torch.device('cpu')
imagenet_weight_path = 'xxxxxxx.pth'
model = to_device(Efficientnet_b3(imagenet_weight_path), device)
custom_weight_path = 'yyyyyyyyyyyyyyyyyyyy.pth'
model.load_state_dict(torch.load(custom_weight_path, map_location=device), strict=False)
model.eval()
img_pickle_path = os.path.join("zzzzzzzzz.pkl")
with open(img_pickle_path, 'rb') as f: xb = pickle.load(f)
with torch.no_grad():
    print(f"xb:{xb}")
    out = model(xb)
    print(f"out:{out}")

for name, param in model.named_parameters():
    if name == 'effb3.module._fc.weight':
        print(name)
        print(param)

The first results are:
out:tensor([[-0.0697]])
effb3.module._fc.weight
Parameter containing:
tensor([[-0.0079, -0.0221, -0.0182, …, 0.0254, -0.0004, -0.0096]],
requires_grad=True)
If I rerun it a second time:
out:tensor([[-0.0343]])
effb3.module._fc.weight
Parameter containing:
tensor([[-0.0018, 0.0209, 0.0190, …, -0.0212, 0.0082, 0.0083]],
requires_grad=True)
I am trying to figure out the reason for this behavior, given that I am in eval mode (model.eval()).
Another question: why does requires_grad show True if I am not computing gradients, as I would in model.train() mode?

Thanks for reading and for your kind help.

ptrblck · October 12, 2022, 5:13am

Based on your output it doesn’t seem that you are using the same pretrained model, since the parameter values change. I’m unsure if I’m missing something, but would assume you want to load the same model (you can format your code by wrapping it into three backticks ```, which makes debugging easier).

The requires_grad attributes of parameters are not changed by the model.train() or model.eval() calls, as these operations only switch the behavior of some layers. E.g. during eval() dropout layers will be disabled.
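As a minimal sketch of the point above (the small `nn.Sequential` model is a made-up stand-in, not the EfficientNet from the question): `eval()` leaves `requires_grad` untouched and only makes layers such as dropout deterministic, `torch.no_grad()` stops gradient tracking for the forward pass, and freezing parameters is a separate, explicit step.

```python
import torch
import torch.nn as nn

# Hypothetical toy model with a dropout layer, used only for illustration.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5), nn.Linear(4, 1))

model.eval()  # switches layer behavior (dropout off), nothing else
flags_after_eval = [p.requires_grad for p in model.parameters()]

x = torch.ones(2, 4)
with torch.no_grad():  # disables gradient tracking for these forward passes
    out1 = model(x)
    out2 = model(x)

# With dropout disabled, two passes over the same input match exactly.
eval_is_deterministic = torch.equal(out1, out2)
# The output is not attached to the autograd graph inside no_grad().
out_requires_grad = out1.requires_grad

# If the flag itself should read False, it has to be set explicitly.
for p in model.parameters():
    p.requires_grad_(False)
flags_after_freeze = [p.requires_grad for p in model.parameters()]

print(flags_after_eval, eval_is_deterministic, out_requires_grad, flags_after_freeze)
```

So seeing requires_grad=True on a printed parameter in eval mode is expected and does not mean the weights are being updated.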

Diego_Hernandez · October 12, 2022, 7:29pm

Hello @ptrblck, thank you. First of all, here is the formatted code:

import os
import pickle
import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet


def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list,tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class Efficientnet_b3(nn.Module):
    def __init__(self,weight_path):
        super().__init__()
        self.effb3 = EfficientNet.from_pretrained(model_name='efficientnet-b3',weights_path=weight_path,num_classes=1)
        self.effb3=nn.DataParallel(self.effb3)
    def forward(self, xb): return  self.effb3(xb)

device = torch.device('cpu')
imagenet_weight_path = 'xxxxxxx.pth'
model = to_device(Efficientnet_b3(imagenet_weight_path), device)
custom_weight_path = 'yyyyyyyyyyyyyyyyyyyy.pth'
model.load_state_dict(torch.load(custom_weight_path, map_location=device), strict=False)
model.eval()
img_pickle_path = os.path.join("zzzzzzzzz.pkl")
with open(img_pickle_path, 'rb') as f: xb = pickle.load(f)
with torch.no_grad():
    print(f"xb:{xb}")
    out = model(xb)
    print(f"out:{out}")
for name, param in model.named_parameters():
    if name=='effb3.module._fc.weight':
        print(param)

The first results are:
out:tensor([[-0.0697]])
effb3.module._fc.weight
Parameter containing:
tensor([[-0.0079, -0.0221, -0.0182, …, 0.0254, -0.0004, -0.0096]],
requires_grad=True)


If I rerun it a second time:

out:tensor([[-0.0343]])
effb3.module._fc.weight
Parameter containing:
tensor([[-0.0018, 0.0209, 0.0190, …, -0.0212, 0.0082, 0.0083]],
requires_grad=True)

I am trying to instantiate an EfficientNet model and then load pretrained weights to run inference on an image (before a sigmoid function).
The weights of the pretrained model change every time I run this piece of code (as if I were training), so I am trying to figure out the reason for this behavior, given that I am in eval mode (model.eval()) and using the no_grad context.

Regards,

Diego

ptrblck · October 13, 2022, 6:14am

While trying to load the state_dict you are using strict=False. Could you explain why you are using it and why you would expect to see missing or additional keys in the state_dict? In the worst case, nothing will be loaded if all keys create mismatches and you are using random models in each run, which would explain the outputs.
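One way to check this is to inspect the value returned by load_state_dict, which lists the keys that did not match. The sketch below is only an illustration under assumed names: the `Wrapper` class and the bare `nn.Linear` checkpoint are hypothetical and stand in for a model whose parameter names differ from those stored in the file.

```python
import torch
import torch.nn as nn

# Hypothetical checkpoint: saved from a bare layer, so keys are "weight" and "bias".
saved = nn.Linear(3, 1).state_dict()

# Hypothetical model that wraps the layer, so its keys are "net.weight" and "net.bias".
class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(3, 1)

    def forward(self, x):
        return self.net(x)

model = Wrapper()
before = model.net.weight.detach().clone()

# strict=False silently skips every key that does not match.
result = model.load_state_dict(saved, strict=False)
missing = sorted(result.missing_keys)        # model keys not found in the checkpoint
unexpected = sorted(result.unexpected_keys)  # checkpoint keys the model does not have
unchanged = torch.equal(before, model.net.weight)  # weights still at random init

# strict=True turns the same mismatch into an error instead of hiding it.
try:
    model.load_state_dict(saved, strict=True)
    strict_raised = False
except RuntimeError:
    strict_raised = True

print("missing:", missing)
print("unexpected:", unexpected)
print("weights unchanged:", unchanged, "| strict raised:", strict_raised)
```

If both lists come back non-empty for your checkpoint, the model is keeping its freshly initialized values for those parameters, which would change on every run.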