Model not training

Hi. I am moving from Keras to PyTorch. My code runs fine, but the model does not train regardless of the parameter settings. I have not initialised the weights of any Conv2d layer, so PyTorch must be using its defaults. Could the uninitialised weights be the reason for this behaviour? Also, how can I initialise the Conv2d weights from a normal distribution?

For the normal initialisation you can simply do:

my_conv = nn.Conv2d(...)
nn.init.normal_(my_conv.weight)
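
If you want to initialise every Conv2d layer in a model this way, here is a minimal sketch using Module.apply (the std value is just an example):

def init_weights(m):
    # draw conv weights from a normal distribution and zero the biases
    if isinstance(m, nn.Conv2d):
        nn.init.normal_(m.weight, mean=0.0, std=0.01)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

cnn.apply(init_weights)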

What do you mean by “the model is not training”? Is your loss increasing, or do the parameters remain unchanged?


My parameters are not changing, irrespective of the learning rate.

Are you sure you are defining your optimizer with the parameters of your model, and that you are calling the .backward() method before optimizer.step()? Can you show some lines of code?
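
For reference, a minimal correct loop looks roughly like this (cnn, criterion and the learning rate are placeholders):

optimizer = torch.optim.SGD(cnn.parameters(), lr=0.01)  # must receive the model's parameters

for images, labels in train_loader:
    optimizer.zero_grad()                  # clear old gradients
    loss = criterion(cnn(images), labels)
    loss.backward()                        # compute gradients
    optimizer.step()                       # update the parameters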

Here is code:

for i, (train_images_1, train_labels_1) in enumerate(train_loader):
    running_loss = 0.0
    running_corrects = 0

    # wrap them in Variable
    if use_gpu:
        train_images_1 = Variable(train_images_1.cuda())
        train_labels_1 = Variable(train_labels_1.cuda())

    # zero the parameter gradients
    optimizer.zero_grad()

    # forward
    outputs = cnn(train_images_1)
    _, preds = torch.max(outputs.data, 1)
    loss = criterion(outputs, train_labels_1)
    loss.backward()
    optimizer.step()

OK, if I understand correctly, your targets are the argmax class indices. This may not work with your criterion; you should have a vector of zeros with a “one” at the position where you want an activation:

for i, (train_images_1, train_labels_1) in enumerate(train_loader):
  # prepare target
  target = torch.zeros(1, num_labels)
  target[0,train_labels_1] = 1
  
  running_loss = 0.0
  running_corrects = 0
  #wrap them in Variable
  train_images_1 = Variable(train_images_1.cuda(), requires_grad=True)
  target = Variable(target.cuda(), requires_grad=False)
  
  # zero the parameter gradients
  optimizer.zero_grad()

  # forward
  outputs = cnn(train_images_1)
  loss = criterion(outputs, target)
  loss.backward()
  optimizer.step()

I was using a one-hot vector before, but then PyTorch treated it as a multi-target problem and threw a runtime error. I made the above changes too, but it is still not working.

Thanks for your efforts. 🙂
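
For context, nn.CrossEntropyLoss expects class indices as targets rather than one-hot vectors, which would explain the multi-target error above; a minimal illustration (shapes and values are just examples):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
outputs = torch.randn(4, 10)          # logits for a batch of 4 samples, 10 classes
targets = torch.tensor([1, 0, 9, 3])  # class indices, not one-hot vectors
loss = criterion(outputs, targets)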

Hey, did you find any solution?
I am facing the same problem.

Could you describe your problem a bit and what you’ve tried so far?
Do you get an error or is the model not learning at all?
If you have a working Keras model, we could try to compare both implementations and look for differences and code bugs.

Hi, my model is not training properly after I added one layer to the pretrained VGG16 model.
Here is the code I modified:
num_ftrs = model_conv.classifier[6].out_features
model_conv.add_module("classifier1", nn.ReLU(inplace=True))
model_conv.classifier1 = nn.Sequential(model_conv.classifier1, nn.Dropout(0.5), nn.Linear(num_ftrs, 2))
model_conv.classifier.requires_grad = True
print(model_conv)
model_conv = model_conv.to(device)

When I run the code, I get RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn. To avoid this, I added the line loss = Variable(loss, requires_grad=True) to the train_model function. Can you please help me?

If you want to replace the last linear layer in model_conv.classifier, you would have to reassign the new nn.Sequential module back to .classifier instead of .classifier1.
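
A minimal sketch of what that could look like (the two output classes follow the snippet above; the rest is an assumption):

import torch.nn as nn
from torchvision import models

model_conv = models.vgg16(pretrained=True)
num_ftrs = model_conv.classifier[6].out_features

# extend the existing classifier and assign the result back to .classifier
model_conv.classifier = nn.Sequential(
    *model_conv.classifier,
    nn.ReLU(inplace=True),
    nn.Dropout(0.5),
    nn.Linear(num_ftrs, 2),
)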

Thanks, I solved the problem. Earlier I had wrapped the base model in an nn.Sequential, which made it difficult to access the trainable weights.

@ptrblck I just saw this issue. I have a similar problem, described here: Training not working

Can you please help me out with this?