I want to use a custom loss function where I compute a condition on the predictions before applying MSELoss. Whatever I do I get a no_grad error, and if I set requires_grad=True on the intermediate tensor I get an OOM error instead. Note that my batch size was working fine before the custom loss_fn. When I reduce the batch size to 2 it runs, but the loss doesn't improve at all; for about 10k steps it stays at the same value.
pred = model(input)
loss_fn = nn.MSELoss()
# condition: 1.0 where pred > target, else 0.0, cast back to pred's dtype
diff = torch.gt(torch.sub(pred, target), 0).type(pred.type())
diff.requires_grad = True
tar = torch.ones(diff.size()).type(diff.type()).cuda()
loss = loss_fn(diff, tar)
I have already cast diff back to float with .type(pred.type()), so I don't think the dtype is the problem. It looks like the training graph gets detached when I use torch.gt and torch.sub.
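To check that suspicion, here is a minimal diagnostic sketch (assuming pred and target from the snippet above); it just prints requires_grad and grad_fn after each step to see where the graph breaks:

# quick check of where the autograd graph gets cut
print(pred.requires_grad, pred.grad_fn)              # True, <...Backward0> -> pred is in the graph
sub = torch.sub(pred, target)
print(sub.requires_grad, sub.grad_fn)                # True, <SubBackward0> -> sub is still fine
cond = torch.gt(sub, 0)
print(cond.dtype, cond.requires_grad, cond.grad_fn)  # torch.bool, False, None -> gt() detaches
diff = cond.type(pred.type())
print(diff.requires_grad, diff.grad_fn)              # False, None -> still detached after the cast

From what I can tell, torch.sub keeps the graph, but torch.gt returns a boolean tensor with no grad_fn, and casting it back to float doesn't reattach it.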