Unexpected Result from gradcheck

I have a function that, given a skeleton (joint-angle representation) and a set of keypoint locations, transforms the skeleton so that it matches the input keypoints. The functions are implemented in Python, but gradcheck fails when evaluating the result. I have narrowed the problem down to a minimal case.

def slice_grad_min(static_parameter, x):
    # In-place write: this mutates the input tensor itself
    static_parameter[0] = static_parameter[0] * static_parameter[1]
    static_parameter = static_parameter * x
    return static_parameter

With input

import torch
from torch.autograd import gradcheck

static_parameter = torch.tensor([1., 2.], dtype=torch.double)
x = torch.tensor([3., 4.], dtype=torch.double, requires_grad=True)
result = gradcheck(slice_grad_min, [static_parameter, x])
self.assertTrue(result)

Gradcheck Results:

RuntimeError: Jacobian mismatch for output 0 with respect to input 0,
numerical:tensor([[1.2000e+07, 0.0000e+00],
                  [4.8000e+07, 2.0000e+00]], dtype=torch.float64)
analytical:tensor([[2., 0.],
                   [0., 2.]], dtype=torch.float64)

This is as simple as I could make the problem. I need to modify static_parameter in a way that depends on the input x, and the function returns the modified static_parameter. Can anyone explain why gradcheck produces this result?

gradcheck doesn’t work with in-place modifications to its inputs. It evaluates the function many times with small perturbations to build the numerical Jacobian, and because your function writes into static_parameter in place, every evaluation sees the mutated result of the previous one; that compounding is why the numerical values blow up. To check the correctness of such a function, gradcheck the wrapper lambda x: f(x.clone()) instead.
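
Applied to your two-argument function, that would look something like the sketch below. Only static_parameter needs the clone, since it is the tensor that gets written to:

result = gradcheck(
    lambda p, x: slice_grad_min(p.clone(), x),
    [static_parameter, x])

The in-place write then lands on the clone, so the tensor gradcheck is perturbing stays untouched between evaluations.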

OK, I added .clone() calls to every rvalue. It still does not pass gradcheck. Did I do this right?

def slice_grad_min(static_parameter, x):
    # Still an in-place write into the input, despite the cloned rvalues
    static_parameter[0] = static_parameter[0].clone() * static_parameter[1].clone()
    static_parameter = static_parameter.clone() * x
    return static_parameter

Output:

RuntimeError: Jacobian mismatch for output 0 with respect to input 0,
numerical:tensor([[1.2000e+07, 0.0000e+00],
                  [4.8000e+07, 2.0000e+00]], dtype=torch.float64)
analytical:tensor([[2., 0.],
                   [0., 2.]], dtype=torch.float64)

No… none of the .clone() calls you added are needed. You are missing the only one that matters: cloning the input itself. The fix is to insert static_parameter = static_parameter.clone() at the beginning of the function, so the indexed assignment writes into the clone rather than into the tensor gradcheck is perturbing.
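
For completeness, the function with that one clone added (and the redundant ones removed) would look like:

def slice_grad_min(static_parameter, x):
    # Clone the input first so the indexed assignment below
    # writes into the copy, not into gradcheck's input tensor
    static_parameter = static_parameter.clone()
    static_parameter[0] = static_parameter[0] * static_parameter[1]
    static_parameter = static_parameter * x
    return static_parameter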