L2 distance in pixel space

I’m trying to create a loss function that measures the Euclidean distance between points, where the points are given not as coordinates but as activated pixels in a 2D map. For example:

import torch

a = torch.zeros(1, 1, 5, 5)  # (B, C, H, W)
b = torch.zeros(1, 1, 5, 5)  # (B, C, H, W)
a[0, 0, 1, 1] = 1.0
b[0, 0, 3, 4] = 1.0
loss(a, b)
# Expected output: tensor([3.60555])

I can easily do this calculation with a combination of nonzero() and pow(2).sum().sqrt(). However, nonzero() returns integer indices, so I don’t think I can backpropagate through it, right?
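For reference, the non-differentiable version I mean looks roughly like this (just a sketch; naive_distance is an illustrative name):

import torch

def naive_distance(a, b):
    # nonzero() returns integer index tensors, which detaches the result
    # from the computation graph -- no gradient flows back through it.
    p_a = a.nonzero()[0, 2:].float()  # (row, col) of the activated pixel in a
    p_b = b.nonzero()[0, 2:].float()  # (row, col) of the activated pixel in b
    return (p_b - p_a).pow(2).sum().sqrt()

# naive_distance(a, b) -> tensor(3.6056), but calling backward() on it
# produces no gradient w.r.t. a or b.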

How could I set up this loss function to train a network that minimizes the L2 distance between activated pixels, so that predictions are encouraged to overlap with the ground-truth binary mask?

Will there always be exactly two activated pixels?

No, but I can “calculate” the assignment into the pairs I’m interested in.


How about converting your pixel representation into coordinates as a first step, and then taking the distance? I believe this should work:

import torch

a = torch.zeros(1, 1, 5, 5)  # (B, C, H, W)
b = torch.zeros(1, 1, 5, 5)  # (B, C, H, W)

a[0, 0, 1, 1] = 1.0
b[0, 0, 3, 4] = 1.0

# The matmul with the index vector contracts the H dimension, so the
# weighted sum picks out the activated row index; permuting H and W
# does the same for the column index. Note this only yields the true
# coordinates when each map sums to 1.
x_a = (torch.arange(5).float() @ a).sum()
y_a = (torch.arange(5).float() @ a.permute([0, 1, 3, 2])).sum()

x_b = (torch.arange(5).float() @ b).sum()
y_b = (torch.arange(5).float() @ b.permute([0, 1, 3, 2])).sum()

((x_b - x_a).pow(2) + (y_b - y_a).pow(2)).sqrt()

Output:
tensor(3.6056)
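To use this for training, where the network output won’t be an exact one-hot map, you could normalize each map so it sums to 1 before taking the weighted sums (a soft-argmax). A minimal sketch, assuming non-negative maps with one point of interest per channel (point_loss is just a name I picked):

import torch

def point_loss(pred, target, eps=1e-8):
    # Normalize each (H, W) map to sum to 1, then take the expected
    # (row, col) coordinate -- a differentiable soft-argmax.
    B, C, H, W = pred.shape

    def expected_xy(m):
        m = m.clamp(min=0)
        m = m / (m.sum(dim=(2, 3), keepdim=True) + eps)
        rows = torch.arange(H, dtype=m.dtype)
        cols = torch.arange(W, dtype=m.dtype)
        y = (m.sum(dim=3) * rows).sum(dim=2)  # expected row index, shape (B, C)
        x = (m.sum(dim=2) * cols).sum(dim=2)  # expected column index, shape (B, C)
        return x, y

    x_p, y_p = expected_xy(pred)
    x_t, y_t = expected_xy(target)
    return ((x_p - x_t).pow(2) + (y_p - y_t).pow(2)).sqrt().mean()

For the example above, point_loss(a, b) returns tensor(3.6056), and every operation is differentiable, so gradients flow back into the prediction.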

Thank you, this looks good!