I’ve just run into a strange situation with backward(): the grad comes out slightly different every time I call backward() on an identical conv layer and input. Here is my code:
import torch
import torch.nn as nn

torch.set_printoptions(precision=8)

def compare_for():
    grads = []
    for i in range(10):
        # Reload the same conv layer and input each iteration so every
        # backward() starts from an identical state.
        r2_input, r2_conv = torch.load("r2_input_conv.pkl")
        r2_output = r2_conv(r2_input)
        loss2 = nn.SmoothL1Loss(reduction='none')(r2_output, torch.ones_like(r2_output)).mean()
        loss2.backward()
        # assert (r2_conv.weight.grad == r3_conv.weight.grad).all()
        g2 = r2_conv.weight.grad
        grads.append(g2)

    # Compare every pair of collected gradients element-wise.
    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            is_same = (grads[i] == grads[j]).all()
            print(is_same)
            if not is_same:
                print(torch.abs(grads[i] - grads[j]).sum())

if __name__ == "__main__":
    compare_for()
The r2_input_conv.pkl file can be downloaded from:
Link: https://pan.baidu.com/s/1JoQLukOQYgT3I5hiztky9g?pwd=6wwa (extraction code: 6wwa)
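In case downloading the file is inconvenient, here is a self-contained variant of the same experiment. The input shape and conv configuration are made up (the real ones come from the pkl file), and the layer is deep-copied each iteration to mimic the repeated torch.load:

import copy
import torch
import torch.nn as nn

torch.set_printoptions(precision=8)

def compare_selfcontained():
    # Hypothetical stand-ins for the pickled tensors; shapes are assumptions.
    base_input = torch.randn(4, 16, 32, 32)
    base_conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)

    grads = []
    for _ in range(10):
        # Fresh copies so each iteration starts from the exact same state.
        x = base_input.clone()
        conv = copy.deepcopy(base_conv)
        out = conv(x)
        loss = nn.SmoothL1Loss(reduction='none')(out, torch.ones_like(out)).mean()
        loss.backward()
        grads.append(conv.weight.grad.clone())

    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            same = torch.equal(grads[i], grads[j])
            print(same)
            if not same:
                print(torch.abs(grads[i] - grads[j]).sum())

if __name__ == "__main__":
    compare_selfcontained()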
What factor causes the difference? I would expect the grad to be identical no matter how many times I run the code.
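For reference, here is a sketch of one check: forcing deterministic kernels and seeing whether the difference disappears. It assumes the tensors end up on a CUDA device with cuDNN, which the snippet above does not show, so treat it as a diagnostic idea rather than a confirmed explanation.

import torch
import torch.nn as nn

# Ask PyTorch/cuDNN to use deterministic kernels. This only matters on a
# CUDA device; some ops may additionally require CUBLAS_WORKSPACE_CONFIG
# to be set, and use_deterministic_algorithms can raise for ops that have
# no deterministic implementation.
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
torch.use_deterministic_algorithms(True)

def grad_once():
    r2_input, r2_conv = torch.load("r2_input_conv.pkl")
    out = r2_conv(r2_input)
    loss = nn.SmoothL1Loss(reduction='none')(out, torch.ones_like(out)).mean()
    loss.backward()
    return r2_conv.weight.grad.clone()

g_a, g_b = grad_once(), grad_once()
print(torch.equal(g_a, g_b), torch.abs(g_a - g_b).sum())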