When I load a model I am seeing this error

I saved a model, but when I load it I see this error:

RuntimeError                              Traceback (most recent call last)
in <cell line: 3>()
      1 saved_model_path = "/content/aI_genrtd_img_dectr.pth"
      2
----> 3 model.load_state_dict(torch.load(saved_model_path))

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   2039
   2040         if len(error_msgs) > 0:
-> 2041             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2042                 self.__class__.__name__, "\n\t".join(error_msgs)))
   2043         return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for tinyvgg1:
size mismatch for Conv_block1.0.weight: copying a param with shape torch.Size([76, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 64, 3, 3]).
size mismatch for Conv_block1.0.bias: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.1.weight: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.1.bias: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.1.running_mean: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.1.running_var: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.4.weight: copying a param with shape torch.Size([76, 76, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 3]).
size mismatch for Conv_block1.4.bias: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.5.weight: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.5.bias: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.5.running_mean: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).
size mismatch for Conv_block1.5.ru… (output truncated)

These shape mismatches are raised if the parameters in the state_dict do not match the ones used in your model. Did you manipulate the model before saving this state_dict as it seems a lot of parameters changed?

I trained the model, then saved it.

Training the model won’t change the parameter shapes, so something else must have manipulated these.

How can I solve this problem?

I trained several times, but the same problem appears when I load.

Figure out why the parameter shapes were changed, and either remove that change or re-apply it to the model before loading the state_dict.

How can I find out why the parameter shapes changed?

Read through your code and search for any lines of code which replace modules or parameters directly. Again, PyTorch won’t automatically change your model architecture. In case you are using 3rd party packages, which might perform some network surgery, look through it too. It’s your code so you should know why the model changed.
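One way to do that search programmatically is to diff the checkpoint against the freshly built model. The sketch below uses two toy `nn.Sequential` models as stand-ins for the model that produced the checkpoint and the current one (in the real workflow, `checkpoint` would come from `torch.load(saved_model_path)` instead):

```python
import torch
from torch import nn

# Toy stand-ins: one model "produced" the checkpoint, the other is being loaded into.
model_a = nn.Sequential(nn.Conv2d(3, 76, 3))   # architecture that saved the weights
model_b = nn.Sequential(nn.Conv2d(3, 64, 3))   # architecture you are loading into

checkpoint = model_a.state_dict()              # in practice: torch.load(saved_model_path)
current = model_b.state_dict()

# Collect every parameter whose shape differs between checkpoint and current model.
mismatches = []
for key, tensor in checkpoint.items():
    if key not in current:
        mismatches.append((key, tuple(tensor.shape), None))
    elif current[key].shape != tensor.shape:
        mismatches.append((key, tuple(tensor.shape), tuple(current[key].shape)))

for key, ckpt_shape, model_shape in mismatches:
    print(f"{key}: checkpoint {ckpt_shape} vs model {model_shape}")
```

Every key this prints points at a layer whose constructor arguments differ between the run that saved the checkpoint and the code building the model now.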

I think I found the problem.

@ptrblck Respected sir, I tried, but it didn't help; it is still showing this error:
RuntimeError                              Traceback (most recent call last)
in <cell line: 3>()
      1 saved_model_path = "/content/aI_genrtd_img_dectr (2).pth"
      2
----> 3 model.load_state_dict(torch.load(saved_model_path))

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   2039
   2040         if len(error_msgs) > 0:
-> 2041             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2042                 self.__class__.__name__, "\n\t".join(error_msgs)))
   2043         return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for tinyvgg1:
size mismatch for Conv_block1.0.weight: copying a param with shape torch.Size([76, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 76, 3, 3]).
size mismatch for Conv_block1.0.bias: copying a param with shape torch.Size([76]) from checkpoint, the shape in current model is torch.Size([3]).

If you get stuck, post a minimal and executable code snippet reproducing the issue without any data dependencies, which I can use to debug the issue.

import torch
from torch import nn
import torchvision
from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.TrivialAugmentWide(num_magnitude_bins=31),
    transforms.ToTensor(),
])

test_transforms = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
])

from torchvision import datasets

train_data = datasets.ImageFolder(
    root=train_dir,
    transform=train_transforms
)

test_data = datasets.ImageFolder(
    root=test_dir,
    transform=test_transforms
)

import os
from torch.utils.data import DataLoader

train_dataloader = DataLoader(
    dataset=train_data,
    batch_size=28,
    num_workers=os.cpu_count(),
    shuffle=True
)

test_dataloader = DataLoader(
    dataset=test_data,
    batch_size=28,
    num_workers=os.cpu_count(),
    shuffle=False
)
class_list = train_data.classes

class tinyvgg1(nn.Module):
    def __init__(self, hidden, input, output):
        super().__init__()
        self.Conv_block1 = nn.Sequential(
            nn.Conv2d(in_channels=input, out_channels=hidden,
                      kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Conv2d(in_channels=hidden, out_channels=hidden,
                      kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.Conv_block2 = nn.Sequential(
            nn.Conv2d(in_channels=hidden, out_channels=hidden,
                      kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Conv2d(in_channels=hidden, out_channels=hidden,
                      kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.Conv_block3 = nn.Sequential(
            nn.Conv2d(in_channels=hidden, out_channels=hidden,
                      kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Conv2d(in_channels=hidden, out_channels=hidden,
                      kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.Conv_block4 = nn.Sequential(
            nn.Conv2d(in_channels=hidden, out_channels=hidden,
                      kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Conv2d(in_channels=hidden, out_channels=hidden,
                      kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )

        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=hidden * 14 * 14,
                      out_features=output)
        )

    def forward(self, x):
        x = self.Conv_block1(x)
        # print(x.shape)
        x = self.Conv_block2(x)
        # print(x.shape)
        x = self.Conv_block3(x)
        # print(x.shape)
        x = self.Conv_block4(x)
        # print(x.shape)
        x = self.classifier(x)
        return x

device = "cuda" if torch.cuda.is_available() else "cpu"
device
model_x = tinyvgg1(input=3,
                   hidden=76,
                   output=2).to(device)

def accuracy_fn(y_true, y_pred):
    correct = torch.eq(y_true, y_pred).sum().item()
    acc = correct / len(y_pred) * 100
    return acc

losfn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(params=model_x.parameters(),
                             lr=0.0001, weight_decay=0.00001)
import torch
from torch import nn

def train_step(model: torch.nn.Module,
               dataloader: torch.utils.data.DataLoader,
               optimizer: torch.optim.Optimizer,
               losfn: torch.nn.Module,
               accuracy_fn,
               device: torch.device):
    train_acc, train_loss = 0, 0
    model.train()
    l2 = 0.001
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        y_pred = model(X)
        loss = losfn(y_pred, y)
        train_loss += loss.item()
        train_acc += accuracy_fn(y_true=y, y_pred=y_pred.argmax(dim=1))
        l2_reg = torch.tensor(0., device=device)
        for param in model.parameters():
            l2_reg += torch.norm(param, p=2)
        loss = loss + l2 * l2_reg
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    train_acc /= len(dataloader)
    train_loss /= len(dataloader)
    print(f"Epoch {epoch} | Train acc: {train_acc:.2f}% | Train Loss {train_loss:.4f}")

def test_step(model: torch.nn.Module,
              dataloader: torch.utils.data.DataLoader,
              losfn: torch.nn.Module,
              accuracy_fn,
              device: torch.device):
    test_acc, test_loss = 0, 0
    model.eval()
    with torch.inference_mode():
        for batch, (X, y) in enumerate(dataloader):
            X, y = X.to(device), y.to(device)
            test_pred = model(X)
            loss1 = losfn(test_pred, y)
            test_loss += loss1.item()
            test_acc += accuracy_fn(y_true=y, y_pred=test_pred.argmax(dim=1))
    test_acc /= len(dataloader)
    test_loss /= len(dataloader)
    print(f"Test Acc {test_acc:.2f}% | Test Loss {test_loss:.4f}")

from tqdm.auto import tqdm

torch.manual_seed(42)

epochs = 1000

from timeit import default_timer as timer

start = timer()

for epoch in tqdm(range(epochs)):
    train_step(model=model_x.to(device),
               dataloader=train_dataloader,
               optimizer=optimizer,
               losfn=losfn,
               accuracy_fn=accuracy_fn,
               device=device)

    test_step(model=model_x.to(device),
              dataloader=test_dataloader,
              losfn=losfn,
              accuracy_fn=accuracy_fn,
              device=device)
end = timer()

print(f"Total time {end-start}")
from pathlib import Path

MODEL_PATH = Path("models")
MODEL_PATH.mkdir(parents=True, exist_ok=True)
MODEL_NAME = "aI_genrtd_img_dectr.pth"
MODEL_SAVE_PATH = MODEL_PATH / MODEL_NAME

print(f"Saving model to: {MODEL_SAVE_PATH}")
torch.save(obj=model_x.state_dict(),
           f=MODEL_SAVE_PATH)

model = tinyvgg1(3, 76, 2).to(device)

saved_model_path = "/content/aI_genrtd_img_dectr (2).pth"

model.load_state_dict(torch.load(saved_model_path))

Your code is neither properly formatted nor is it executable since you have a data dependency.
In any case, the error is raised since you are mixing up input arguments in the model creation.
At first you are using:

model_x = tinyvgg1(input=3, hidden=76, output=2).to(device)

and later:

model = tinyvgg1(3,76,2).to(device)

which corresponds to tinyvgg1(hidden=3, input=76, output=2).
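The mix-up is easy to reproduce with a minimal stand-in module (`TinyStub` below is hypothetical, but it uses the same `(hidden, input, output)` argument order as `tinyvgg1` in this thread):

```python
from torch import nn

# Hypothetical stand-in with tinyvgg1's argument order: (hidden, input, output).
class TinyStub(nn.Module):
    def __init__(self, hidden, input, output):
        super().__init__()
        self.conv = nn.Conv2d(in_channels=input, out_channels=hidden, kernel_size=3)

# Positional call: 3 binds to `hidden` and 76 to `input` -- not what was intended.
wrong = TinyStub(3, 76, 2)
# Keyword call matches the model that produced the checkpoint.
right = TinyStub(input=3, hidden=76, output=2)

print(wrong.conv.weight.shape)  # torch.Size([3, 76, 3, 3])
print(right.conv.weight.shape)  # torch.Size([76, 3, 3, 3])
```

These two shapes are exactly the pair reported in the size-mismatch error above, which is why passing the arguments by keyword (or fixing the positional order) resolves it.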


Okay, I will check and inform you.

It worked, thank you, sir.

@ptrblck Respected sir, the model I trained was saved correctly. After that, I tried to train the saved model using this code:

import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)
model.load_state_dict(torch.load('/content/aI_genrtd_img_dectrx.pth'))

Then I loaded the model, but again I saw an error:
RuntimeError                              Traceback (most recent call last)
in <cell line: 1>()
----> 1 model.load_state_dict(torch.load('/content/aI_genrtd_img_dectrx.pth'))

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   2039
   2040         if len(error_msgs) > 0:
-> 2041             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   2042                 self.__class__.__name__, "\n\t".join(error_msgs)))
   2043         return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for ResNet:
Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", "bn1.running_var", "layer1.0.conv1.weight", "layer1.0.bn1.weight", "layer1.0.bn1.bias", "layer1.0.bn1.running_mean", "layer1.0.bn1.running_var", "layer1.0.conv2.weight", "layer1.0.bn2.weight", "layer1.0.bn2.bias", "layer1.0.bn2.running_mean", "layer1.0.bn2.running_var", "layer1.1.conv1.weight", "layer1.1.bn1.weight", "layer1.1.bn1.bias", "layer1.1.bn1.running_mean", "layer1.1.bn1.running_var", "layer1.1.conv2.weight", "layer1.1.bn2.weight", "layer1.1.bn2.bias", "layer1.1.bn2.running_mean", "layer1.1.bn2.running_var", "layer2.0.conv1.weight", "layer2.0.bn1.weight", "layer2.0.bn1.bias", "layer2.0.bn1.running_mean", "layer2.0.bn1.running_var", "layer2.0.conv2.weight", "layer2.0.bn2.weight", "layer2.0.bn2.bias", "layer2.0.bn2.running_mean", "layer2.0.bn2.running_var", "layer2.0.downsample.0.weight", "layer2.0.downsample.1.weight", "layer2.0.downsample.1.bias", "layer2.0.downsample.1.running_mean", "layer2.0.downsample.1.running_var", "layer2.1.conv1.weight", "layer2.1.bn1.weight", "layer2.1.bn1.bias", "layer2.1.bn1.running_mean", "layer2.1.bn1.running_var", "layer2.1.conv2.weight", "layer2.1.bn2.weight", "layer2.1.bn2.bias", "layer2.1.bn2.running_mean", "layer2.1.bn2.running_var", "layer3.0.conv1.weight", "layer3.0.bn1.weight", "layer3.0.bn1.bias", "layer3.0.bn1.running_mean", "layer3.0.bn1.running_var", "layer3.0.conv2.weight", "layer3.0.bn2.weight", "layer3.0.bn2.bias", "layer3.0.bn2…
Unexpected key(s) in state_dict: "Conv_block1.0.weight", "Conv_block1.0.bias", "Conv_block1.1.weight", "Conv_block1.1.bias", "Conv_block1.1.running_mean", "Conv_block1.1.running_var", "Conv_block1.1.num_batches_tracked", "Conv_block1.4.weight", "Conv_block1.4.bias", "Conv_block1.5.weight", "Conv_block1.5.bias", "Conv_block1.5.running_mean", "Conv_block1.5.running_var", "Conv_block1.5.num_batches_tracked", "Conv_block2.0.weight", "Conv_block2.0.bias", "Conv_block2.1.weight", "Conv_block2.1.bias", "Conv_block2.1.running_mean", "Conv_block2.1.running_var", "Conv_block2.1.num_batches_tracked", "Conv_block2.4.weight", "Conv_block2.4.bias", "Conv_block2.5.weight", "Conv_block2.5.bias", "Conv_block2.5.running_mean", "Conv_block2.5.running_var", "Conv_block2.5.num_batches_tracked", "Conv_block3.0.weight", "Conv_block3.0.bias", "Conv_block3.1.weight", "Conv_block3.1.bias", "Conv_block3.1.running_mean", "Conv_block3.1.running_var", "Conv_block3.1.num_batches_tracked", "Conv_block3.4.weight", "Conv_block3.4.bias", "Conv_block3.5.weight", "Conv_block3.5.bias", "Conv_block3.5.running_mean", "Conv_block3.5.running_var", "Conv_block3.5.num_batches_tracked", "Conv_block4.0.weight", "Conv_block4.0.bias", "Conv_block4.1.weight", "Conv_block4.1.bias", "Conv_block4.1.running_mean", "Conv_block4.1.running_var", "Conv_block4.1.num_batches_tracked", "Conv_block4.4.weight", "Conv_block4.4.bias", "Conv_block4.5.weight", "Conv_block4.5.bias", "Conv_block4.5.running_mean", "Conv_block4.5.running_va…

You cannot load your state_dict into a resnet.
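To keep training a saved model, instantiate the same class that produced the checkpoint before calling `load_state_dict` (in this thread, that means `tinyvgg1`, not a resnet). A minimal sketch, using a hypothetical `SmallNet` as a stand-in for `tinyvgg1` and an in-memory buffer in place of the `.pth` file:

```python
import io
import torch
from torch import nn

# Hypothetical stand-in for the thread's tinyvgg1.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)

    def forward(self, x):
        return self.conv(x)

# Save a checkpoint from one instance (the buffer stands in for the .pth file).
buffer = io.BytesIO()
torch.save(SmallNet().state_dict(), buffer)
buffer.seek(0)

# To resume training, build the SAME architecture, load the weights, then train as usual.
model = SmallNet()
model.load_state_dict(torch.load(buffer))
model.train()  # switch back to training mode and continue from the loaded weights
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

After loading, the usual training loop (forward pass, loss, `backward()`, `optimizer.step()`) continues from the saved weights.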

I tried to load the model in order to continue training it. Is there any way to do that?

@ptrblck Hello sir, I tried to load the model in order to continue training it. Is there any way to do that?