Last time I asked whether we can use a neural network to predict if each pixel in an image is a corner of a box. I want to know:
1- whether this code is correct or not;
2- once I have the probabilities, how I can distinguish the probability of the top-left corner from the probability of the bottom-right corner or of the other pixels.
import torch
import torch.nn as nn
import torch.nn.functional as F

input_size = 512  # image size
dim_out = 512 * 3  # 3 per pixel: top-left corner, bottom-right corner, other

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.base_conv = nn.Conv2d(512, 512, 3, 1, 1, bias=True)
        self.class_pred = nn.Conv2d(512, dim_out, 1, 1, 0)
        self.class_loss = 0

    def forward(self, image, bbox):
        conv1 = F.relu(self.base_conv(image), inplace=True)
        conv2 = self.class_pred(conv1)
        prob_corner = torch.sigmoid(conv2)
        # ------- loss calculations
        # (only possible if we know which probability belongs to which corner)
        return prob_corner
You can directly access the output values of the sigmoid, prob_corner, via indexing to get the corner pixels:
top_left = prob_corner[:, :, 0, 0]
top_right = prob_corner[:, :, 0, -1]
Let me know if I misunderstood the question.
I’m sorry, I didn’t explain my problem well. Here is what I mean:
I want to predict whether each pixel of an image is one of the 2 corners of a box (top left and bottom right). I have the pixel images and the coordinates of the boxes, and I want to use a neural network. I don’t know how to introduce the comparison between the pixels and the coordinates of the box.
I wrote this code, but the result contains numbers in [0, 1], and I can’t tell which ones belong to the bottom-right corner and which to the top-left corner.
How did you define your target containing the corners of your bounding box?
Did you define it as a “segmentation map” containing 1s at the corner pixels and 0s elsewhere?
If so, you could try to use the position of the prediction to determine which pixel is “top left” and which is “bottom right”. However, since your prediction can output multiple pixels, you would have to deal somehow with this ambiguous result (maybe filtering out invalid candidates?).
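Here is a minimal sketch of that idea, not a definitive implementation. It assumes a single box per image, a single-channel probability map, and a threshold of 0.5. The helper names make_corner_target and split_corners and the (x1, y1, x2, y2) box layout are made up for illustration. The target is a map with 1s at the two corner pixels, and the candidates above the threshold are separated by position: the one closest to the image origin is taken as "top left", the one farthest from it as "bottom right".

```python
import torch

def make_corner_target(box, height, width):
    """Build a [height, width] map with 1s at the two corner pixels of `box`.

    `box` is (x1, y1, x2, y2) in pixel coordinates (assumed layout).
    """
    x1, y1, x2, y2 = box
    target = torch.zeros(height, width)
    target[y1, x1] = 1.0  # top-left corner
    target[y2, x2] = 1.0  # bottom-right corner
    return target

def split_corners(prob_map, threshold=0.5):
    """Pick top-left / bottom-right candidates from a [H, W] probability map.

    Returns ((x, y), (x, y)), or None if no pixel exceeds the threshold.
    """
    ys, xs = torch.nonzero(prob_map > threshold, as_tuple=True)
    if ys.numel() == 0:
        return None
    # x + y is small near the top left and large near the bottom right
    score = xs + ys
    tl = torch.argmin(score)
    br = torch.argmax(score)
    return (xs[tl].item(), ys[tl].item()), (xs[br].item(), ys[br].item())

target = make_corner_target((10, 20, 200, 300), 512, 512)
# The target itself stands in for a perfect prediction here
corners = split_corners(target)
print(corners)
```

With several boxes in one image this simple rule is no longer enough, which is the ambiguity mentioned above.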
I think the usual workflow to predict a bounding box would be to output the coordinates directly instead of a heat map.
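As a rough sketch of what predicting the coordinates directly could look like, assuming one box per image and coordinates normalized to [0, 1]: the model name BoxRegressor, the layer sizes, and the target values below are all hypothetical and only illustrate the shapes involved.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoxRegressor(nn.Module):
    """Hypothetical toy model that predicts (x1, y1, x2, y2) in [0, 1]."""

    def __init__(self, in_channels=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # -> [N, 32, 1, 1] for any input size
        )
        self.box_head = nn.Linear(32, 4)

    def forward(self, image):
        x = self.features(image).flatten(1)      # [N, 32]
        return torch.sigmoid(self.box_head(x))   # normalized coordinates

model = BoxRegressor()
images = torch.randn(2, 3, 64, 64)
# Target boxes as (x1, y1, x2, y2) divided by the image size (made-up values)
targets = torch.tensor([[0.1, 0.2, 0.5, 0.6],
                        [0.3, 0.3, 0.9, 0.8]])
pred = model(images)
loss = F.smooth_l1_loss(pred, targets)
loss.backward()
```

Here the order of the four outputs tells you which value belongs to which corner, so there is nothing to disambiguate afterwards.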
Thanks, I will look at the positions in my result.