I’ve just run into a strange situation with backward(): the grad comes out slightly different every time I call backward() on an identical conv layer and input. Here is my code:
import torch
import torch.nn as nn

torch.set_printoptions(precision=8)

def compare_for():
    grads = []
    for i in range(10):
        # Reload the same conv layer and input each iteration so every
        # backward() starts from an identical state.
        r2_input, r2_conv = torch.load("r2_input_conv.pkl")
        r2_output = r2_conv(r2_input)
        loss2 = nn.SmoothL1Loss(reduction='none')(r2_output, torch.ones_like(r2_output)).mean()
        loss2.backward()
        # assert (r2_conv.weight.grad == r3_conv.weight.grad).all()
        g2 = r2_conv.weight.grad
        grads.append(g2)

    # Compare every pair of collected gradients element-wise.
    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            is_same = (grads[i] == grads[j]).all()
            print(is_same)
            if not is_same:
                print(torch.abs(grads[i] - grads[j]).sum())

if __name__ == "__main__":
    compare_for()
The r2_input_conv.pkl file can be downloaded from:
Link: https://pan.baidu.com/s/1JoQLukOQYgT3I5hiztky9g?pwd=6wwa (extraction code: 6wwa)
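In case downloading the file is inconvenient, here is a self-contained variant of the same experiment. The input shape and conv configuration are made up (the real ones come from the pkl file), and the layer is deep-copied each iteration to mimic the repeated torch.load:

import copy
import torch
import torch.nn as nn

torch.set_printoptions(precision=8)

def compare_selfcontained():
    # Hypothetical stand-ins for the pickled tensors; shapes are assumptions.
    base_input = torch.randn(4, 16, 32, 32)
    base_conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)

    grads = []
    for _ in range(10):
        # Fresh copies so each iteration starts from the exact same state.
        x = base_input.clone()
        conv = copy.deepcopy(base_conv)
        out = conv(x)
        loss = nn.SmoothL1Loss(reduction='none')(out, torch.ones_like(out)).mean()
        loss.backward()
        grads.append(conv.weight.grad.clone())

    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            same = torch.equal(grads[i], grads[j])
            print(same)
            if not same:
                print(torch.abs(grads[i] - grads[j]).sum())

if __name__ == "__main__":
    compare_selfcontained()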
What factor causes the difference? I would expect the grad to be identical no matter how many times I run the code.
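For reference, here is a sketch of one check: forcing deterministic kernels and seeing whether the difference disappears. It assumes the tensors end up on a CUDA device with cuDNN, which the snippet above does not show, so treat it as a diagnostic idea rather than a confirmed explanation.

import torch
import torch.nn as nn

# Ask PyTorch/cuDNN to use deterministic kernels. This only matters on a
# CUDA device; some ops may additionally require CUBLAS_WORKSPACE_CONFIG
# to be set, and use_deterministic_algorithms can raise for ops that have
# no deterministic implementation.
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
torch.use_deterministic_algorithms(True)

def grad_once():
    r2_input, r2_conv = torch.load("r2_input_conv.pkl")
    out = r2_conv(r2_input)
    loss = nn.SmoothL1Loss(reduction='none')(out, torch.ones_like(out)).mean()
    loss.backward()
    return r2_conv.weight.grad.clone()

g_a, g_b = grad_once(), grad_once()
print(torch.equal(g_a, g_b), torch.abs(g_a - g_b).sum())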