Custom Loss Function Problem

Mr.Bank · July 5, 2020, 1:44pm

@ptrblck hi again,
as you said, loss.grad returns 1. However, the gradient of the parameters seems there are some problems.
All parameters have the same weight and bias regardless of the epoch except FC.bias tensor.
At least, they seem same but I used your code that mentioned here:

Many parameters have been printed so it means there are some changes between new and old state but it is so small for example:

 old_state_dict['encoder.1.weight']
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])


new_state_dict['encoder.1.weight']
tensor([0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997,
        0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997,
        0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997,
        0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997,
        0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997,
        0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997,
        0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997,
        0.9997])

Epoch 0:
FC.bias tensor(8.7036e-05, device=‘cuda:0’)
FC.bias tensor(8.7035e-05, device=‘cuda:0’)
FC.bias tensor(8.7035e-05, device=‘cuda:0’)
Epoch 1:
FC.bias tensor(8.7034e-05, device=‘cuda:0’)
FC.bias tensor(8.7032e-05, device=‘cuda:0’)
FC.bias tensor(8.7031e-05, device=‘cuda:0’)
Epoch 2:
FC.bias tensor(8.7029e-05, device=‘cuda:0’)
FC.bias tensor(8.7028e-05, device=‘cuda:0’)
FC.bias tensor(8.7026e-05, device=‘cuda:0’)

So are these updates enough? Can it be the main reason of not decreasing loss value?

My network model:

class ResNet50PairWise(nn.Module):
    def __init__(self, bits = 16):
        super().__init__()

        resnet = models.resnet50(pretrained=False)

        self.conv1 = nn.Conv2d(12, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
        self.encoder = nn.Sequential(
            self.conv1,
            resnet.bn1,
            resnet.relu,
            resnet.maxpool,
            resnet.layer1,
            resnet.layer2,
            resnet.layer3,
            resnet.layer4,
            resnet.avgpool
        )
        self.FC = nn.Linear(2048, bits)


        self.apply(weights_init_kaiming)
        self.apply(fc_init_weights)

    def forward(self, x):
        x = self.encoder(x)
        x = x.view(x.size(0), -1)

        logits = self.FC(x)
        sign = torch.sign(logits)
        binary_out = torch.relu(sign)

        return binary_out```