I’m trying to use a pretrained keypoint detector from torchvision to predict keypoints on a region extracted from an image, and I’d then like to visualize a saliency map of the prediction. My loss is just the Euclidean distance between each predicted set of keypoints and the closest ground-truth set. However, after I call loss.backward(), printing model.layer.[weight/bias].grad yields nothing. I was checking this to make sure gradients were getting back to the input image. I’ve included the code below.
- dets_for_kp is a list of Tensors (CxHxW) that have requires_grad=True.
- kp is the name of the model. It is a keypointrcnn_resnet50_fpn from torchvision models and is set to eval().
for p in kp.parameters():
    p.requires_grad = True
# run detections through keypoint detector
kps = kp(dets_for_kp)
# we could come up with a single keypoint estimate per image by either averaging
# or a linear combination weighted by scores
best = []
valids = []
for j, k in enumerate(kps):
    keypoints = k['keypoints']
    if len(keypoints) == 0:
        continue
    best.append(torch.round(keypoints[0]))
    valids.append(j)
# convert keypoints back to image coordinates. keypoints are (x,y,visible)
for i in range(len(valids)):
    j = valids[i]
    bbox, ph, pw = reverse_info[j]
    best[i][:,0] += (bbox[0] - pw[0])
    best[i][:,1] += (bbox[1] - ph[0])
# loop over keypoints and compare them to the ground truth
# since we don't know a priori which gt and which annotations are closest, and there aren't
# too many, just loop over each to get centroids and use closest centroid of gt for each
# annotation. loss is L2
anno_centroids = []
for a in annos:
    x = np.array(a)[0::3].mean()
    y = np.array(a)[1::3].mean()
    z = np.array(a)[2::3].mean()
    anno_centroids.append(np.array([x,y,z]))
k_centroids = []
for b in best:
    centroid = b.mean(0)
    k_centroids.append(centroid)
c_matches = []
for jj, c in enumerate(k_centroids):
    closest_dist = 1e10
    closest_centroid = None
    for ii, ac in enumerate(anno_centroids):
        if torch.norm(c - torch.Tensor(ac)) < closest_dist:
            closest_dist = torch.norm(c-torch.Tensor(ac))
            closest_centroid = ii
    c_matches.append((jj, ii))
loss = torch.Tensor([0])
for m in c_matches:
    loss += torch.norm(best[m[0]].view(-1) - torch.Tensor(annos[m[1]]))
loss.backward()
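For readers who want to see the matching step in isolation, here is a minimal NumPy sketch of the "closest ground-truth centroid per prediction" idea described in the comments above. The function name match_to_closest and the toy coordinates are made up for illustration and are not part of the original code; the sketch only shows the index matching, which does not itself need gradients.

```python
import numpy as np

def match_to_closest(pred_centroids, gt_centroids):
    """Return, for each predicted centroid, the index of the nearest ground-truth centroid (L2)."""
    pred = np.asarray(pred_centroids, dtype=float)   # shape (P, 3)
    gt = np.asarray(gt_centroids, dtype=float)       # shape (G, 3)
    # Pairwise Euclidean distances via broadcasting, shape (P, G)
    dists = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    return dists.argmin(axis=1)

# Hypothetical toy centroids in (x, y, visible) order
pred = [[0.0, 0.0, 1.0], [10.0, 10.0, 1.0]]
gt = [[9.0, 11.0, 1.0], [1.0, -1.0, 1.0]]
matches = match_to_closest(pred, gt)
print(matches.tolist())
```

Each entry of matches is the ground-truth index to pair with the prediction at the same position, so the pairs here would be (prediction index, matches[prediction index]).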
You can see that the gradients are empty by printing
kp.roi_heads.keypoint_predictor.kps_score_lowres.bias.grad
or
dets_for_kp[0].grad
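As a point of comparison, the same kind of check can be run on a toy model to see what a connected graph looks like. This is only a sketch under stated assumptions: toy is a stand-in nn.Linear, not the kp model, and describe is a hypothetical helper. It reports requires_grad and grad_fn for intermediate tensors, which may help locate the step where a tensor stops being attached to the autograd graph.

```python
import torch
import torch.nn as nn

def describe(name, t):
    """Summarize whether a tensor is still attached to the autograd graph."""
    return f"{name}: requires_grad={t.requires_grad}, grad_fn={type(t.grad_fn).__name__}"

torch.manual_seed(0)
toy = nn.Linear(4, 2)                      # stand-in model (assumption, not kp)
x = torch.rand(1, 4, requires_grad=True)   # stand-in for one input tensor

out = toy(x)
loss = torch.norm(out)
print(describe("out", out))
print(describe("loss", loss))

# .grad is only populated on leaf tensors after backward() has run
loss.backward()
print("bias grad populated:", toy.bias.grad is not None)
print("input grad populated:", x.grad is not None)

# A tensor cut off from the graph reports no grad_fn
detached = out.detach()
print(describe("detached", detached))
```

Calling describe on the intermediate tensors of the real pipeline (for example the entries of kps, best, and the final loss) would show the same two fields for each of them.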
I’ve tried making sure every parameter has requires_grad=True, along with the input. I’ve wondered whether I’m computing the loss incorrectly; I tried MSELoss instead but had no luck. What am I doing wrong?