Hi, I am loading a pretrained model like this:
    model = Net().to(device)
    model.load_state_dict(torch.load("cnn_alex.pkl"))
and I call model.train() before training, but it doesn't converge. Can anyone help me?
Even with a pretrained model, you still have to set things up as for any other model: create an optimizer over the model's parameters and train over mini-batches of your dataset. You can also freeze the model parameters by iterating over them:
    for param in model.parameters():
        param.requires_grad = False
and then fine-tune the last layers by setting requires_grad = True on them, or add a few new layers on top, to get better results.
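For reference, the freeze-then-fine-tune setup described here might look roughly like the sketch below. TinyNet, its layer names, the input sizes, and the learning rate are all hypothetical stand-ins, since the actual Net was not posted; only the pattern matters.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the poster's Net (the real one was not shown)."""

    def __init__(self):
        super().__init__()
        self.features = nn.Linear(8, 16)
        self.fc1 = nn.Linear(16, 16)
        self.fc2 = nn.Linear(16, 3)

    def forward(self, x):
        x = torch.relu(self.features(x))
        x = torch.relu(self.fc1(x))
        return self.fc2(x)


torch.manual_seed(0)
model = TinyNet()

# Freeze everything, then re-enable gradients only for the last layer.
for param in model.parameters():
    param.requires_grad = False
for param in model.fc2.parameters():
    param.requires_grad = True

# Give the optimizer only the parameters that are still trainable.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)  # lr is an arbitrary choice
criterion = nn.CrossEntropyLoss()

# One training step on made-up data.
inputs = torch.randn(4, 8)
labels = torch.tensor([0, 1, 2, 1])  # class indices, dtype torch.long

model.train()
optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()
```

After the backward pass, the frozen layers have no gradients and only fc2 is updated, which is a quick way to confirm the freezing did what was intended.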
Hi, I set requires_grad = False on the first layers using
    for name, param in model.named_parameters():
        if name == 'fc2.weight':
            break
        else:
            param.requires_grad_(False)
and left the last two fully connected layers with requires_grad = True. Then I trained and tested the model in the normal way. It still did not converge.
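One way to double-check a loop like this is to list which parameters are still trainable afterwards. The sketch below runs the same loop on a hypothetical ExampleNet with layers named conv, fc1, fc2 and fc3; these names are assumptions, as the real model's layers were not posted.

```python
import torch.nn as nn


class ExampleNet(nn.Module):
    """Hypothetical model; the layer names are assumed for illustration."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3)
        self.fc1 = nn.Linear(16, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 2)


model = ExampleNet()

# Same freezing loop as in the post: stop at the first parameter of fc2.
for name, param in model.named_parameters():
    if name == 'fc2.weight':
        break
    param.requires_grad_(False)

# Collect the names of parameters that will still receive gradients.
trainable_names = [n for n, p in model.named_parameters() if p.requires_grad]
frozen_names = [n for n, p in model.named_parameters() if not p.requires_grad]

print("trainable:", trainable_names)
print("frozen:", frozen_names)
```

If the trainable list is empty, or contains layers that were meant to be frozen, the name used in the comparison probably does not match the model's actual parameter names.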
1. Check whether your model and data are actually on the GPU.
2. If training doesn't converge, check whether the label tensors are of type long (integer) or float. Sometimes the loss or accuracy is calculated as zero when the labels are float.
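Both checks can be done in a few lines. This is only a sketch with made-up logits and labels, assuming a classification setup with nn.CrossEntropyLoss, which expects class-index targets of dtype torch.long.

```python
import torch
import torch.nn as nn

# Check 1: pick the GPU if one is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("using device:", device)

# Made-up model outputs for 4 samples and 3 classes.
logits = torch.tensor([[2.0, 0.0, 0.0],
                       [0.0, 2.0, 0.0],
                       [0.0, 0.0, 2.0],
                       [2.0, 0.0, 0.0]], device=device)

# Labels that were accidentally loaded as float.
float_labels = torch.tensor([0.0, 1.0, 2.0, 1.0])

# Check 2: convert labels to long and move them to the same device.
labels = float_labels.long().to(device)
print("label dtype:", labels.dtype)

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, labels)

# Accuracy: compare predicted class indices with the integer labels.
preds = logits.argmax(dim=1)
accuracy = (preds == labels).float().mean().item()
print("loss:", loss.item(), "accuracy:", accuracy)
```

Printing the dtype and device of one batch from the real data loader, in the same way, should show quickly whether either of the two points applies.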
See if this link helps; otherwise I can write some code and send it to you.
https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html