Training loss issue

When I'm training a ResNet model in PyTorch, the training loss stays constant every epoch.
PyTorch:

elif self.model_type == 'resnet':
    self.model = models.resnet50()
    self.model.adaptiveavgpool = nn.AdaptiveAvgPool2d(self.model.fc.out_features)
    self.model.fc = nn.Linear(in_features=self.model.fc.in_features, out_features=self.output_classes)
    self.model.activation = nn.Softmax(self.model.fc)

Keras:

elif self.model_type in ['resnet50']:
    # self.model.layers.pop()
    model_output = self.model.output
    model_output = GlobalAveragePooling2D()(model_output)
    predictions = Dense(self.output_classes, activation='softmax')(model_output)
    self.model = Model(self.model.input, predictions)

We are converting this model from Keras to PyTorch, and that is where the issue appeared.

Can you please give your suggestions on this topic?

The training output looks like this:
Epoch: [0] Train Loss: [1.636952] Valid Loss: [1.576530]
Epoch: [1] Train Loss: [1.636952] Valid Loss: [1.404914]
Epoch: [2] Train Loss: [1.636952] Valid Loss: [1.395322]
Epoch: [3] Train Loss: [1.636952] Valid Loss: [1.391647]
Epoch: [4] Train Loss: [1.636952] Valid Loss: [1.390293]
Epoch: [5] Train Loss: [1.636952] Valid Loss: [1.393284]
Epoch: [6] Train Loss: [1.636952] Valid Loss: [1.395637]
Epoch: [7] Train Loss: [1.636952] Valid Loss: [1.396015]

Train the model:

for epoch in range(num_epochs):
    # set the model in train mode
    self.model.train()

    self.model.optimizer.zero_grad()

    # training_input_data = training_input_data.resize_([training_batch_size, 3, 224, 224])
    training_ground_truth = training_ground_truth.resize_([training_batch_size])

    self.model = self.model.double()
    outputs = self.model(training_input_data.double())

    sm = torch.nn.Softmax()
    probabilities = sm(outputs)
    print("predicted probabilities while testing", probabilities)

    loss = self.model.criterion(outputs, training_ground_truth.long())
    loss.backward()
    self.model.optimizer.step()

    train_losses.append(loss.item())

There seem to be a few issues in your PyTorch code:

  • You are assigning layers to new (and thus unused) attributes. model.adaptiveavgpool does not exist in resnet50, so you are creating a new attribute (maybe you want to change model.avgpool instead?).
  • The same applies to model.activation.
  • If your criterion is nn.CrossEntropyLoss, pass the raw logits without the softmax to the criterion; internally, F.log_softmax and nn.NLLLoss will be applied. See the sketch after this list.
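Putting these points together, here is a minimal sketch of the fixed model setup (assuming nn.CrossEntropyLoss as the criterion; num_classes is a hypothetical stand-in for your self.output_classes):

```python
import torch.nn as nn
from torchvision import models

num_classes = 4  # hypothetical placeholder for self.output_classes

model = models.resnet50()
# Leave the existing model.avgpool (nn.AdaptiveAvgPool2d((1, 1))) untouched
# and replace only the final classification layer.
model.fc = nn.Linear(in_features=model.fc.in_features, out_features=num_classes)

# No softmax layer here: nn.CrossEntropyLoss expects raw logits and applies
# F.log_softmax + nn.NLLLoss internally.
criterion = nn.CrossEntropyLoss()
```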

The issue is that when I run the model on the CPU, the training loss and accuracy change, but when I move to the GPU the loss stays constant.
Can you please give the solution for this as soon as possible?

Could you post your updated model and training code so that we could have another look?

You can post source code directly by wrapping it into three backticks ``` :wink:

There is an error when I use model.avgpool, i.e. out of memory:
self.model = models.resnet50()
self.model.avgpool = nn.AdaptiveAvgPool2d(self.model.fc.out_features)
self.model.fc = nn.Linear(in_features=self.model.fc.in_features, out_features=self.output_classes)

Train the model:

for epoch in range(num_epochs):
    # set the model in train mode
    self.model.train()

    self.model.optimizer.zero_grad()

    # training_input_data = training_input_data.resize_([training_batch_size, 3, 224, 224])
    training_ground_truth = training_ground_truth.resize_([training_batch_size])

    self.model = self.model.double()
    outputs = self.model(training_input_data.double())

    sm = torch.nn.Softmax()
    probabilities = sm(outputs)
    print("predicted probabilities while testing", probabilities)

    loss = self.model.criterion(outputs, training_ground_truth.long())
    loss.backward()
    self.model.optimizer.step()

    train_losses.append(loss.item())

After training on the VGG16 architecture the losses change between epochs, but when I run the ResNet50 architecture the loss is always constant.
Can you please give the solution for that?

The provided code shouldn’t work, so please make sure you are really executing the posted code.

This line of code:

self.model.avgpool = nn.AdaptiveAvgPool2d(self.model.fc.out_features)

will initialize an adaptive pooling layer with a spatial output size of 1000x1000, which won't work with the following self.model.fc layer, which expects in_features = 2048.
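To make the shape problem concrete, here is a minimal sketch (assuming ResNet50's 2048-channel, 7x7 feature map for a 224x224 input):

```python
import torch
import torch.nn as nn

# Feature map entering ResNet50's pooling stage for a 224x224 input
x = torch.randn(1, 2048, 7, 7)

bad_pool = nn.AdaptiveAvgPool2d(1000)      # spatial output size 1000x1000
# bad_pool(x) would have shape [1, 2048, 1000, 1000]: roughly 8 GB of float32
# activations per sample, which also matches the out-of-memory error above.

good_pool = nn.AdaptiveAvgPool2d((1, 1))   # ResNet50's default avgpool
out = good_pool(x).flatten(1)
print(out.shape)                           # torch.Size([1, 2048]) -> matches fc's in_features
```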