Not using register_buffer correctly

Hi Guys,

I keep getting this error with a custom module I’m trying to create:

"expected device cuda:0 and dtype Float but got device cpu and dtype Float"

From what I can understand, I need to use register_buffer on the offending tensor so that it gets moved to CUDA when I call model.cuda(). I think I’m doing that below, but I keep getting the same error.

class NoiseInjection(nn.Module):
    def __init__(self, shape):
        super(NoiseInjection, self).__init__()
        self.injected_noise = self.build_noise_injection(shape)

    def build_noise_injection(self, shape):
        noise = torch.randn([shape[0], 1, shape[2], shape[3]], requires_grad=False)  # NCHW
        self.register_buffer('noise', noise)
        weights = nn.Parameter(torch.zeros([1, shape[1], 1, 1]))  # different weight for each channel
        noise_weighted = weights * self.noise
        return noise_weighted

    def forward(self, x):
        out = x + self.injected_noise
        return out

Anyone know what’s going on?

The input ‘x’ may have been pushed to cuda. Make sure you also push self.injected_noise to cuda, something along the lines of:
self.injected_noise = self.build_noise_injection(shape).cuda()
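
For what it’s worth, here is a minimal standalone reproduction of that mismatch (a sketch assuming a CUDA device is available; a and b are just stand-ins for x and self.injected_noise):

import torch

a = torch.randn(2, 3, device='cuda')  # like the input x after model.cuda()
b = torch.randn(2, 3)                 # like injected_noise, still on the CPU
print(a.device, b.device)             # cuda:0 cpu
out = a + b                           # raises the device-mismatch RuntimeError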

I did this and the code now runs:

 self.injected_noise = nn.Parameter(self.build_noise_injection(shape))

I set it as a parameter since it is composed of weights * self.noise, where weights is trainable.
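
For reference, here is a sketch of one common pattern (my assumption about what you’re after, not necessarily the only way): keep the raw noise as a registered buffer, keep the per-channel weights as an nn.Parameter, and combine them in forward(), so that model.cuda() moves both automatically:

import torch
import torch.nn as nn

class NoiseInjection(nn.Module):
    def __init__(self, shape):
        super().__init__()
        # fixed noise: registered as a buffer, so .cuda()/.to() moves it with the module
        self.register_buffer('noise', torch.randn(shape[0], 1, shape[2], shape[3]))  # NCHW
        # trainable per-channel weights: automatically registered as a parameter
        self.weights = nn.Parameter(torch.zeros(1, shape[1], 1, 1))

    def forward(self, x):
        # compute the weighted noise here, so it always lives on the module's device
        return x + self.weights * self.noise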

So to clarify - do I need to push every single variable in the graph to cuda, either defining it as a parameter or as a buffer?

Much thanks

Yeah, to avoid conflicts. You can’t operate on tensors that are on different devices.
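
To illustrate (a toy sketch; the attribute names are made up): anything registered as a parameter or a buffer is moved by model.cuda() / model.to(device), while a plain tensor attribute is left behind on the CPU.

import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(3))      # moved by .cuda()/.to()
        self.register_buffer('b', torch.zeros(3))  # moved by .cuda()/.to()
        self.plain = torch.zeros(3)                # plain attribute: NOT moved

m = Demo().cuda()  # assumes a CUDA device is available
print(m.w.device, m.b.device, m.plain.device)  # cuda:0 cuda:0 cpu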
