Retaining gradients after converting to NumPy and back to PyTorch

Hello!

I have a question about retaining the gradients of tensors when converting them to NumPy arrays for computation. In the code below, kpts0 and kpts1 are tensors that require gradients. Unfortunately, the poselib library only accepts NumPy arrays as input, so I detach the tensors, convert them to NumPy, and then convert the output back to torch tensors with requires_grad set to True. Is it possible to convert to NumPy arrays, pass them to poselib, and convert back to torch tensors while retaining the gradients?

Thank you!


import poselib
import torch


def estimate_pose(kpts0, kpts1, K0, K1, thresh, conf=0.99999):
    if len(kpts0) < 5:
        return None

    # Detach and move to NumPy: this cuts the tensors out of the autograd graph.
    kpts0 = kpts0.detach().cpu().numpy()
    kpts1 = kpts1.detach().cpu().numpy()
    K0 = K0.detach().cpu().numpy()
    K1 = K1.detach().cpu().numpy()

    fx_0, fy_0, cx_0, cy_0 = K0[0, 0], K0[1, 1], K0[0, 2], K0[1, 2]
    fx_1, fy_1, cx_1, cy_1 = K1[0, 0], K1[1, 1], K1[0, 2], K1[1, 2]
    
    camera_0 = {"model": "PINHOLE", "width": int(2 * cx_0), "height": int(2 * cy_0), "params": [fx_0, fy_0, cx_0, cy_0]}
    camera_1 = {"model": "PINHOLE", "width": int(2 * cx_1), "height": int(2 * cy_1), "params": [fx_1, fy_1, cx_1, cy_1]}

    M, info = poselib.estimate_relative_pose(kpts0, kpts1, camera_0, camera_1, {"max_epipolar_error": thresh, "success_prob": conf})
    R = M.R
    t = M.t

    R = torch.tensor(R, dtype=torch.float32, requires_grad=True).unsqueeze(0)
    t = torch.tensor(t, dtype=torch.float32, requires_grad=True).unsqueeze(0)

    P = torch.cat([R, t], dim=-1)
    return P

Yes, this is possible in custom autograd.Functions. You could use the same numpy operation in its forward, but you would also need to write the backward method for the numpy op.
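
For reference, here is a minimal sketch of that pattern, using a toy NumPy operation (y = x ** 2) rather than poselib, with the analytic gradient written by hand in backward:

import numpy as np
import torch


class NumpySquare(torch.autograd.Function):
    # Toy example: y = x ** 2 is computed in NumPy, so autograd cannot see it,
    # and the gradient dy/dx = 2x is supplied manually in backward.

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)                 # keep the input for the backward pass
        y = np.square(x.detach().cpu().numpy())  # NumPy call, invisible to autograd
        return torch.from_numpy(y).to(x)         # back to a tensor on x's dtype/device

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * 2 * x               # hand-written gradient

x = torch.randn(4, requires_grad=True)
y = NumpySquare.apply(x)
y.sum().backward()
print(x.grad)  # equals 2 * x

The same structure would apply to the poselib call, except that backward would have to implement (or approximate) the gradient of the pose solver with respect to the keypoints, which autograd cannot derive for you.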

Thank you so much!

I tried following the PyTorch documentation and implemented the code below. I assume none of it is correct, but I guess I am on the right track?

I had another question, if I may: if I use the code from my initial post, then during backpropagation the gradients are only computed up to R and t, since the computational graph cannot be tracked any further back due to the gradient detachment, correct?

def estimate_pose(kpts0, kpts1, K0, K1, thresh, conf=0.99999):
    if len(kpts0) < 5:
        return None

    fx_0, fy_0, cx_0, cy_0 = K0[0, 0], K0[1, 1], K0[0, 2], K0[1, 2]
    fx_1, fy_1, cx_1, cy_1 = K1[0, 0], K1[1, 1], K1[0, 2], K1[1, 2]
    
    camera_0 = {"model": "PINHOLE", "width": int(2 * cx_0), "height": int(2 * cy_0), "params": [fx_0, fy_0, cx_0, cy_0]}
    camera_1 = {"model": "PINHOLE", "width": int(2 * cx_1), "height": int(2 * cy_1), "params": [fx_1, fy_1, cx_1, cy_1]}

    M, info = poselib.estimate_relative_pose(kpts0, kpts1, camera_0, camera_1, {"max_epipolar_error": thresh, "success_prob": conf})
    R = torch.tensor(M.R, dtype=torch.float32).unsqueeze(0)
    t = torch.tensor(M.t, dtype=torch.float32).unsqueeze(0)
    P_est = torch.cat([R, t], dim=-1)
    return P_est


class PoseEstimation(torch.autograd.Function):
    @staticmethod
    def forward(ctx, mkpts_0, mkpts_1, K_0, K_1, P_GT=None, ordering="xy"):
        ctx.save_for_backward(mkpts_0, mkpts_1)
        mkpts_0 = mkpts_0.squeeze().detach().cpu().numpy()
        mkpts_1 = mkpts_1.squeeze().detach().cpu().numpy()
        K_0 = K_0.squeeze().cpu().numpy()
        K_1 = K_1.squeeze().cpu().numpy()
        P_est = estimate_pose(mkpts_0, mkpts_1, K_0, K_1, 0.5, 0.99999)
        return P_est
    
    @staticmethod
    def backward(ctx, grad_output):
        mkpts_0, mkpts_1 = ctx.saved_tensors
        return grad_output * mkpts_0, grad_output * mkpts_1, None, None, None, None

Yes, operations on R and t will be tracked again and gradients thus also computed up to their creation.
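
A small illustration of where the graph stops in that case (the shapes are made up, just to show the effect):

import numpy as np
import torch

kpts = torch.randn(10, 2, requires_grad=True)   # stand-in for the matched keypoints
R_np = np.eye(3)                                 # stand-in for the rotation returned by poselib
R = torch.tensor(R_np, dtype=torch.float32, requires_grad=True)  # a brand-new leaf tensor

loss = R.sum()
loss.backward()
print(R.grad)     # populated: gradients reach back to R's creation
print(kpts.grad)  # None: the NumPy round trip cut the graph before the keypoints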

Thank you very much!

Hello! Sorry for the repetitive questions. I managed to get the forward function working, but during loss.backward() everything freezes: for example, when printing the gradients, the same values keep printing indefinitely and then everything hangs. I am not quite sure whether it is because my backward pass for the loss is incorrect?

self.pose_est = PoseEstimation.apply

P_est = self.pose_est(mkpts_0,
                      mkpts_1,
                      batch["camera_intrinsic_matrix"],
                      batch["camera_intrinsic_matrix"],
                      0.5,
                      0.99999)

rot_loss, transl_loss = relative_pose_error(P_est,
                                            batch["gt_relative_pose"],
                                            train=True)
loss += self.config["training"]["lambda_pose"] * (rot_loss + transl_loss)

self.running_loss.append(loss.item())

self.optimizer.zero_grad()

loss.backward()  # <------ everything freezes here

def estimate_pose(kpts0, kpts1, K0, K1, thresh, conf=0.99999):
    #if len(kpts0) < 5:
    #    return None

    fx_0, fy_0, cx_0, cy_0 = K0[0, 0], K0[1, 1], K0[0, 2], K0[1, 2]
    fx_1, fy_1, cx_1, cy_1 = K1[0, 0], K1[1, 1], K1[0, 2], K1[1, 2]
    
    camera_0 = {"model": "PINHOLE", "width": int(2 * cx_0), "height": int(2 * cy_0), "params": [fx_0, fy_0, cx_0, cy_0]}
    camera_1 = {"model": "PINHOLE", "width": int(2 * cx_1), "height": int(2 * cy_1), "params": [fx_1, fy_1, cx_1, cy_1]}

    M, info = poselib.estimate_relative_pose(kpts0, kpts1, camera_0, camera_1, {"max_epipolar_error": thresh, "success_prob": conf})
    R = torch.tensor(M.R, dtype=torch.float32, requires_grad=True).reshape(3, 3).unsqueeze(0)
    t = torch.tensor(M.t, dtype=torch.float32, requires_grad=True).reshape(3, 1).unsqueeze(0)
    P_est = torch.cat([R, t], dim=-1)
    return P_est


class PoseEstimation(torch.autograd.Function):
    
    @staticmethod
    def forward(ctx, mkpts_0, mkpts_1, K_0, K_1, thresh, conf=0.99999):
        ctx.mkpts0, ctx.mkpts1 = mkpts_0, mkpts_1
        mkpts_0 = mkpts_0.squeeze().detach().cpu().numpy()
        mkpts_1 = mkpts_1.squeeze().detach().cpu().numpy()
        K_0 = K_0.squeeze().cpu().numpy()
        K_1 = K_1.squeeze().cpu().numpy()
        P_est = estimate_pose(mkpts_0, mkpts_1, K_0, K_1, thresh, conf)
        ctx.P_est = P_est
        return P_est
    
    @staticmethod
    def backward(ctx, grad_output):
        mkpts_0, mkpts_1, P_est = ctx.mkpts0, ctx.mkpts1, ctx.P_est
        grad_mkpts_0  = torch.autograd.grad(P_est, mkpts_0, grad_output, retain_graph=False)
        grad_mkpts_1  = torch.autograd.grad(P_est, mkpts_1, grad_output, retain_graph=False)
        return grad_mkpts_0, grad_mkpts_1, None, None, None, None 

Does this mean the gradients are printed repeatedly? If so, your script might not be frozen but stuck in a loop. I also don’t quite understand how autograd.grad is supposed to work on non-differentiable operations, as I would expect to see the actual backward operations implemented in the backward method.
If you need to store intermediate tensors, you might also want to use ctx.save_for_backward.
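
For instance, a small sketch of storing an intermediate result with ctx.save_for_backward rather than as a plain ctx attribute (again a toy NumPy op, not the pose estimator):

import numpy as np
import torch


class NumpyExp(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        y = torch.from_numpy(np.exp(x.detach().cpu().numpy())).to(x)
        ctx.save_for_backward(y)   # store the intermediate via save_for_backward,
        return y                   # not as ctx.y = y

    @staticmethod
    def backward(ctx, grad_output):
        (y,) = ctx.saved_tensors
        return grad_output * y     # d/dx exp(x) = exp(x), reusing the saved output

x = torch.randn(3, requires_grad=True)
out = NumpyExp.apply(x)
out.sum().backward()
print(x.grad)  # equals exp(x)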

Yes, if I print the gradients in the backward pass, the same gradient keeps printing repeatedly and everything freezes; even CTRL+C does not work, and I have no idea why this is happening. Printing everything up to loss.backward() works fine, and then it hangs during loss.backward(). I am pretty certain it is a mistake on my end in how I implemented the PoseEstimation backward pass, as everything works fine if I do not include it.

Thank you again for your help! :smiley:

P_est tensor([[[ 1.0000e+00, -4.1633e-17, -2.7756e-17,  2.7756e-17],
         [ 4.1633e-17,  1.0000e+00, -8.3267e-17,  1.0000e+00],
         [ 2.7756e-17,  8.3267e-17,  1.0000e+00,  2.7756e-17]]],
       grad_fn=<PoseEstimationBackward>)
rot_loss tensor(0.0010, grad_fn=<MeanBackward0>)
transl_loss tensor(1.6500, grad_fn=<MeanBackward0>)

I haven’t executed your code, but I speculate you might have created a ref cycle and might need to use save_for_backward as mentioned before and as seen in this example. Also, I still claim you need to write the actual backward operations as I would expect Autograd to fail since you are using numpy ops.

Oh, I understand now! Thank you very much!