Hi. I am moving from Keras to PyTorch. My code runs fine, but the model does not train regardless of the parameter settings. I have not initialised the weights of any Conv2d layer, so PyTorch must be applying its defaults. Could the uninitialised weights be the reason the model is not training? Also, how can I initialise the Conv2d weights from a normal distribution?
For the normal initialisation you can simply do:

```python
my_conv = nn.Conv2d(...)
nn.init.normal_(my_conv.weight)  # in-place; the old nn.init.normal is deprecated
```
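For instance, here is a minimal sketch (layer sizes and the std of 0.02 are arbitrary choices, not from the thread) that applies a normal initialisation to every Conv2d in a model via `Module.apply`:

```python
import torch
import torch.nn as nn

# A small example model; the layer sizes are arbitrary.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
)

def init_weights(m):
    # Initialise every Conv2d weight from N(0, 0.02^2) and zero the bias.
    if isinstance(m, nn.Conv2d):
        nn.init.normal_(m.weight, mean=0.0, std=0.02)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

model.apply(init_weights)  # recursively visits every submodule
```

`apply` saves you from initialising each layer by hand when the model has many conv layers.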
What do you mean by “the model is not training”? Is your loss increasing? Or do the parameters remain unchanged?
My parameters are not changing, regardless of the learning rate.
Are you sure you defined your optimizer with the parameters of your model, and that you are calling `.backward()` on the loss before `optimizer.step()`? Can you show some lines of code?
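For reference, a minimal training step typically looks like the sketch below (the model, data, and learning rate here are placeholders, not taken from the original post):

```python
import torch
import torch.nn as nn

# Placeholder model and data, just to illustrate the update loop.
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
# The optimizer must receive the *model's* parameters,
# otherwise .step() has nothing to update.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(4, 10)
labels = torch.tensor([0, 1, 0, 1])

before = model.weight.clone()

optimizer.zero_grad()                     # clear old gradients
loss = criterion(model(inputs), labels)
loss.backward()                           # populate .grad on every parameter
optimizer.step()                          # update parameters using those gradients

# After one step the weights should have changed.
assert not torch.equal(before, model.weight)
```

If the parameters never change, the usual suspects are a missing `loss.backward()`, a missing `optimizer.step()`, or an optimizer built over the wrong (or empty) parameter list.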
Here is the code:

```python
for i, (train_images_1, train_labels_1) in enumerate(train_loader):
    running_loss = 0.0
    running_corrects = 0
    # wrap them in Variable
    if use_gpu:
        train_images_1, train_labels_1 = Variable(train_images_1.cuda()), \
                                         Variable(train_labels_1.cuda())
    # zero the parameter gradients
    optimizer.zero_grad()
    # forward
    outputs = cnn(train_images_1)
    _, preds = torch.max(outputs.data, 1)
    loss = criterion(outputs, train_labels_1)
    loss.backward()
    optimizer.step()
```
Ok, if I understand correctly, your targets are the argmax of the outputs. That may not work; you should have a vector of zeros with a one at the index where you want an activation:

```python
for i, (train_images_1, train_labels_1) in enumerate(train_loader):
    # prepare target
    target = torch.zeros(1, num_labels)
    target[0, train_labels_1] = 1
    running_loss = 0.0
    running_corrects = 0
    # wrap them in Variable
    train_images_1 = Variable(train_images_1.cuda(), requires_grad=True)
    target = Variable(target.cuda(), requires_grad=False)
    # zero the parameter gradients
    optimizer.zero_grad()
    # forward
    outputs = cnn(train_images_1)
    loss = criterion(outputs, target)
    loss.backward()
    optimizer.step()
```
I was using a one-hot vector before; PyTorch treated it as a multi-target problem and threw a runtime error. I made the changes above too, but it's still not working.
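If the criterion here is `nn.CrossEntropyLoss`, that error is expected: it takes raw class indices as targets, not one-hot vectors, and (in the PyTorch versions of that era) a 2-D float target is exactly what triggers the “multi-target not supported” error. A small sketch of the expected target format (the shapes are illustrative):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

outputs = torch.randn(4, 5)          # batch of 4 samples, 5 classes (raw logits)
labels = torch.tensor([1, 0, 4, 2])  # class *indices*, shape (4,) — no one-hot needed

loss = criterion(outputs, labels)    # works: targets are indices

# A one-hot float target of shape (4, 5) is what older PyTorch
# versions rejected with "multi-target not supported".
```

So the original code that passed `train_labels_1` directly was closer to correct; the likely bug is elsewhere (e.g. the optimizer setup). Note that recent PyTorch versions do additionally accept class-probability targets, but that was not the case back then.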
Thanks for your efforts.
Hey, did you find any solution? I am facing the same problem.
Could you describe your problem a bit and what you’ve tried so far?
Do you get an error or is the model not learning at all?
If you have a working Keras model, we could try to compare both implementations and look for differences and code bugs.
Hi, my model is not training properly after I added one layer to the pretrained VGG16 model. Here is the code I modified:

```python
num_ftrs = model_conv.classifier.out_features
model_conv = model_conv.to(device)
```
When I run the code I get: `RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn`. To avoid this I added the line `loss = Variable(loss, requires_grad=True)` to the `train_model` function. Can you please help me?
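That error usually means no parameter in the computation graph requires gradients (for example, every pretrained layer was frozen and the new layer never made it into the model), and wrapping the loss in a new `Variable` with `requires_grad=True` only silences the error by detaching the loss from the model, so nothing is trained. A small sketch of the healthy setup, using plain linear layers as stand-ins for a frozen backbone plus a new head:

```python
import torch
import torch.nn as nn

# Simulate a frozen "pretrained" backbone plus a freshly added trainable layer.
backbone = nn.Linear(8, 8)
for p in backbone.parameters():
    p.requires_grad = False       # frozen: no gradients for these

new_head = nn.Linear(8, 2)        # new layer, requires_grad=True by default

model = nn.Sequential(backbone, new_head)

x = torch.randn(3, 8)
loss = model(x).sum()

# loss has a grad_fn because at least one parameter requires grad;
# if *all* parameters were frozen, loss.backward() would raise the
# exact RuntimeError from the question.
assert loss.requires_grad

loss.backward()
assert new_head.weight.grad is not None   # the new layer receives gradients
assert backbone.weight.grad is None       # the frozen layer does not
```

So instead of re-wrapping the loss, check that the layer you intend to train is actually part of the model and still has `requires_grad=True`.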
If you want to replace the last linear layer in `model_conv.classifier`, you would have to reassign the new `nn.Sequential` module back to `.classifier`, instead of wrapping or overwriting the whole model.
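A sketch of that, using a stand-in module with a VGG16-style `.classifier` (in torchvision's VGG16 the last classifier layer is `classifier[6]`, a `Linear(4096, 1000)`; the class count of 10 below is just an example):

```python
import torch.nn as nn

# Stand-in for a pretrained model with a VGG-style classifier head.
class FakeVGG(nn.Module):
    def __init__(self):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(25088, 4096), nn.ReLU(),
            nn.Linear(4096, 4096), nn.ReLU(),
            nn.Linear(4096, 1000),
        )

model_conv = FakeVGG()

# Read in_features from the *last linear layer*; the nn.Sequential
# container itself has no out_features attribute, which is why the
# original line num_ftrs = model_conv.classifier.out_features fails.
num_ftrs = model_conv.classifier[-1].in_features

# Reassign back into .classifier rather than replacing the whole model.
model_conv.classifier[-1] = nn.Linear(num_ftrs, 10)  # 10 = new class count
```

This keeps the rest of the pretrained model intact, so only the replaced layer starts from fresh weights.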
Thanks, I solved the problem. Before, I had wrapped the whole base model in an `nn.Sequential`, which is why I had trouble accessing the trainable weights.