Thank you for reading my post.
I’m a college student currently developing a peak detection algorithm that uses a CNN to find the ideal convolution kernel, which can be interpreted as the ideal mother wavelet function that maximizes peak detection accuracy.
I tried to write my own IoU loss function for training the CNN, but it failed.
My loss function is shown below.
import torch
import torch.nn as nn
from typing import Optional


class IoU(nn.Module):
    '''
    1D intersection over union loss function class
    '''
    def __init__(self, thresh: float = 0.5):
        super().__init__()
        self.thresh = thresh

    def forward(self, inputs: torch.Tensor, targets: torch.Tensor, weights: Optional[torch.Tensor] = None, smooth: float = 0.0) -> torch.Tensor:
        # Binarize the predictions with a hard threshold
        inputs = torch.where(inputs < self.thresh, 0, 1)
        batch_size = targets.shape[0]
        # Count positions where both prediction and target are 1
        intersect = torch.logical_and(inputs, targets)
        intersect = intersect.view(batch_size, -1).sum(-1)
        # Count positions where prediction or target is 1
        union = torch.logical_or(inputs, targets)
        union = union.view(batch_size, -1).sum(-1)
        IoU = (intersect + smooth) / (union + smooth)
        IoU = IoU.mean()
        return IoU

I then tried to check whether it works using the simple test model below.
x = torch.randint(0, 1001, (256,)).float()  # e.g. [0, 200, 30, 1000, ...]
true = torch.randint(0, 2, (256,))  # e.g. [0, 1, 1, 0, 1, ...]
model = nn.Linear(256, 256)
criterion = IoU()
output = model(x)
loss = criterion(output, true)
loss.backward()  # I'm stuck here, because my loss function IoU is not differentiable
print(f"loss = {loss}")
print(f"model weight: {model.weight.grad}")
print(f"model params: {[p.grad for p in model.parameters()]}")
The output on the terminal is: RuntimeError: element 0 of variables does not require grad and does not have a grad_fn
This project is the first time I have ever used PyTorch, so I didn’t know what the error meant at first glance. After some quick research, I think I figured out why this loss function fails (though I’m not sure this is correct): my loss function IoU is not differentiable, and I believe this is where the chain rule of the loss function breaks:
IoU = torch.nan_to_num(IoU)
IoU = IoU.mean()
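One way to check where the graph is actually cut is to inspect `requires_grad` and `grad_fn` after each step. The snippet below is a minimal sketch, assuming the same `nn.Linear(256, 256)` test model; the variable name `hard` is only for illustration. It suggests the graph is already lost at the hard threshold (`torch.where` with constant outputs), before `logical_and`, `nan_to_num`, or `mean` are reached.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(256, 256)
x = torch.randint(0, 1001, (1, 256)).float()

# The raw model output is still attached to the autograd graph.
output = model(x)
print(output.requires_grad, output.grad_fn)

# Hard thresholding selects between two constants, so the result
# no longer depends on the model parameters in a differentiable way.
hard = torch.where(output < 0.5, 0, 1)
print(hard.requires_grad, hard.grad_fn)
```

Any loss computed only from `hard` has no path back to the model weights, which matches the `RuntimeError` raised by `loss.backward()`.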
Soon after I noticed this, I searched GitHub and Stack Overflow for other differentiable IoU loss functions, but I’m still not sure how to create a differentiable IoU loss function (especially for 1D data).
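For reference, one common way to make IoU differentiable is a "soft" IoU (Jaccard) loss: replace the hard threshold with a sigmoid, `logical_and` with an element-wise product, and `logical_or` with `p + t - p * t`. The following is a minimal sketch under those assumptions, not a definitive implementation. The class name `SoftIoULoss`, the default `smooth=1.0`, and the scaling of `x` to [0, 1] are hypothetical choices for illustration, not part of PyTorch.

```python
import torch
import torch.nn as nn


class SoftIoULoss(nn.Module):
    """Soft 1D IoU loss (hypothetical name, not a PyTorch built-in)."""

    def __init__(self, smooth: float = 1.0):
        super().__init__()
        self.smooth = smooth

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Squash raw outputs into (0, 1) instead of hard-thresholding them.
        probs = torch.sigmoid(logits)
        targets = targets.to(probs.dtype)
        batch_size = targets.shape[0]
        probs = probs.reshape(batch_size, -1)
        targets = targets.reshape(batch_size, -1)
        # Products and sums stand in for logical_and / logical_or.
        intersect = (probs * targets).sum(-1)
        union = (probs + targets - probs * targets).sum(-1)
        iou = (intersect + self.smooth) / (union + self.smooth)
        # Return 1 - IoU so that minimizing the loss maximizes the overlap.
        return 1.0 - iou.mean()


torch.manual_seed(0)
model = nn.Linear(256, 256)
# Inputs scaled to [0, 1] so the sigmoid does not saturate (an assumption).
x = torch.randint(0, 1001, (4, 256)).float() / 1000.0
true = torch.randint(0, 2, (4, 256))

loss = SoftIoULoss()(model(x), true)
loss.backward()
print(f"loss = {loss.item()}")
print(f"weight grad is set: {model.weight.grad is not None}")
```

The hard threshold can still be applied at evaluation time to report the usual binary IoU; only the training loss needs to stay smooth.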
If you have any experience with or insight into what I’m stuck on, please give me some guidance. Any advice is welcome, and I hope that I can eventually build my own IoU loss function for the CNN model.
Thank you