Thanks @ptrblck
I checked the output of my statistical model to see whether I am detaching the computation graph, but it does not seem to be detached.
Here is the code, along with the output of stat_model and the parameter:
fit_image, fit_landmarks = stat_model(theta_estimated)
print('fit landmark =', fit_landmarks)
print('theta=', theta_estimated[0])
fit landmark = tensor([[[ -93.7965, 41.2901, -457.3236, -118.0355, 23.2566, -394.9610,
-129.1178, -22.9549, -413.0228, -60.4286],
[ -14.2109, -392.8283, -112.7418, 32.2946, -766.2040, -65.1696,
31.8340, -795.4185, -35.9762, -2.9846],
[-788.4980, -108.9594, -9.7898, -800.8992, -49.7171, 31.8733,
-808.2354, -93.7749, 49.2322, -800.7413]],
[[ -86.7400, 28.3823, -492.1542, -93.9346, 29.3518, -440.1688,
-111.6750, 2.4048, -448.3925, -44.6627],
[ -23.6792, -451.9865, -117.1925, 11.8526, -759.7421, -62.1217,
39.6679, -760.5637, -32.7417, 18.6626],
[-752.7242, -92.1307, -17.7508, -781.5572, -45.2557, 43.6841,
-768.3282, -102.0547, 38.3073, -769.6452]],
[[ -78.7095, 17.3260, -457.3948, -111.3670, 22.8568, -394.5397,
-122.8169, -18.9796, -403.8733, -52.9702],
[ -10.9037, -390.0055, -114.4793, 26.6313, -784.2676, -56.4531,
25.9526, -799.3102, -33.1011, 2.2568],
[-785.2794, -100.1249, -14.8437, -806.3973, -41.8734, 26.0318,
-805.1998, -95.0777, 42.2077, -806.0328]]], device='cuda:0',
grad_fn=<ReshapeAliasBackward0>)
theta= tensor([-1.6324, -3.0613, 1.1880, -1.6660, -0.3259, -4.0532, 2.8580, -2.3427,
2.1786, 2.7516, -0.2029, -5.1932, -1.2288, -2.5457, 0.9505, -1.3430,
0.4178, -1.9190, 1.5987, 1.4101, -1.3779, -2.0695, -3.3447, 2.0617,
-3.6703, 2.3111, -2.0738, -7.8953, 1.4093, 0.3684, -0.1465, -5.3603,
3.9112, -1.7550, 3.0453, -3.8051, -2.8779, -0.9223, -2.2068, 0.7696,
-3.6210, -0.1781, 1.4597, -0.9524, 0.5832, -3.4065, -3.0134, -2.4941],
device='cuda:0', grad_fn=)
As you can see, the output used for the loss has grad_fn=<ReshapeAliasBackward0>, and the parameter also has a grad_fn.
What else could be detached?
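A non-None grad_fn only shows that the output is attached to some graph, not that the graph leads back to theta_estimated. A more direct check I could run is sketched below. It is a minimal sketch under assumptions: `reaches` is a hypothetical helper name, and the small matrix product is only a stand-in for stat_model, whose internals are not shown here.

```python
import torch


def reaches(output, param):
    """Return True if autograd can propagate a gradient from output to param."""
    (grad,) = torch.autograd.grad(
        output.sum(),        # reduce to a scalar so no grad_outputs is needed
        param,
        retain_graph=True,   # keep the graph so it can still be used afterwards
        allow_unused=True,   # give None instead of raising if param is not in the graph
    )
    return grad is not None


# Hypothetical stand-in for stat_model (assumption, not the real model).
theta = torch.randn(4, 48, requires_grad=True)
weight = torch.randn(48, 30, requires_grad=True)

# Graph intact: the output depends on theta.
connected = (theta @ weight).reshape(4, 3, 10)

# Graph cut: theta is detached, yet the output still has a grad_fn via weight.
broken = (theta.detach() @ weight).reshape(4, 3, 10)

print("connected reaches theta:", reaches(connected, theta))
print("broken has grad_fn:", broken.grad_fn is not None)
print("broken reaches theta:", reaches(broken, theta))
```

In my case the call would be `reaches(fit_landmarks, theta_estimated)`, and the same check could be applied to the tensors that go into the loss.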