I am trying to implement the loss functions from the ICLR paper "Training Deep Neural Networks on Noisy Labels with Bootstrapping". I found a TensorFlow implementation of it. Below are my PyTorch attempts, but I do not think they are right. Can anyone help me?
class BCE_soft(nn.BCELoss):
    def __init__(self, beta=0.95):
        super(BCE_soft, self).__init__()
        self.beta = beta

    def forward(self, input, target):
        # soft bootstrapping: mix the noisy target with the raw prediction
        target = self.beta * target + (1 - self.beta) * input
        # detach so nn.BCELoss does not receive a target that requires grad
        target = target.detach()
        return super(BCE_soft, self).forward(input, target)
class BCE_hard(nn.BCELoss):
    def __init__(self, beta=0.8):
        super(BCE_hard, self).__init__()
        self.beta = beta

    def forward(self, input, target):
        # hard bootstrapping: mix the noisy target with the rounded prediction
        z = torch.round(input)
        z = z.detach()
        target = self.beta * target + (1 - self.beta) * z
        target = target.detach()
        return super(BCE_hard, self).forward(input, target)
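For reference, here is a minimal self-contained functional sketch of the same hard-bootstrapping step; the function name and the numbers below are my own, not from the paper:

```python
import torch
import torch.nn.functional as F

def bce_hard_bootstrap(pred, target, beta=0.8):
    """Hard bootstrapping: mix the noisy target with the rounded prediction.

    The mixed target is detached, i.e. treated as a constant, so BCELoss
    accepts it and no gradient flows through the pseudo-label term.
    """
    z = torch.round(pred).detach()
    mixed = (beta * target + (1 - beta) * z).detach()
    return F.binary_cross_entropy(pred, mixed)
```

For example, with `pred = [0.9, 0.2]` and noisy targets `[1.0, 1.0]`, the rounded pseudo-labels are `[1.0, 0.0]` and the mixed targets become `[1.0, 0.8]`.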
The reason I think this is wrong is that the new "target" contains information from "input", yet we cannot differentiate through that part: I have to detach it, because nn.BCELoss requires a target that does not require grad.
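One way to see what that detach() actually costs is to write the soft loss out by hand from the BCE formula, so the prediction term inside the target can optionally keep its gradient. This is only a sketch; `bce_soft_bootstrap`, the `detach_pred_in_target` flag, and the `eps` clamp are hypothetical choices of mine, not from the paper:

```python
import torch

def bce_soft_bootstrap(pred, target, beta=0.95, detach_pred_in_target=True):
    """Soft bootstrapping written as an explicit BCE, so we control
    whether gradient flows through the prediction inside the target."""
    q = pred.detach() if detach_pred_in_target else pred
    mixed = beta * target + (1 - beta) * q
    eps = 1e-12  # numerical safety for log(0); hypothetical choice
    loss = -(mixed * torch.log(pred + eps)
             + (1 - mixed) * torch.log(1 - pred + eps))
    return loss.mean()
```

Both settings give the same forward value; only the gradient with respect to `pred` changes, since detaching drops the `(1 - beta) * pred` contribution to the target.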