I am writing some code for something similar to RoI pooling. The gradient propagates back correctly when I use the CPU, but not on the GPU. Does anyone have any idea why? Thanks a lot.
Here is a minimal demo.
CPU:
import numpy as np
import torch
from torch.autograd import Variable

out = torch.zeros(1, 3, 6, 6)
vout = Variable(out)
fmap = np.arange(3 * 6 * 6).reshape((1, 3, 6, 6))
tmap = Variable(torch.from_numpy(fmap).float(), requires_grad=True)
mask = torch.zeros(1, 6, 6).byte()
mask[0, 2:5, 2:5] = 1
mask = Variable(mask.expand(1, 3, 6, 6))
masked = tmap.masked_select(mask).view(3, -1)
pooled = torch.max(masked, 1)[0][:, 0]
vout[0, :, 0, 0] = pooled
# similar to the operation above
mask = torch.zeros(1, 6, 6).byte()
mask[0, 3:6, 3:6] = 1
mask = Variable(mask.expand(1, 3, 6, 6))
masked = tmap.masked_select(mask).view(3, -1)
pooled = torch.max(masked, 1)[0][:, 0]
vout[0, :, 1, 1] = pooled
a = torch.mean(vout)
a.backward()
print(tmap.grad)
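For reference, the CPU demo above can be sketched in the current tensor API (a sketch assuming torch >= 0.4, where tensors carry autograd directly and Variable is no longer needed; behavior is the same masked max-pool):

```python
import torch

# Feature map 1x3x6x6 with values 0..107; track gradients on it.
fmap = torch.arange(3 * 6 * 6, dtype=torch.float32).reshape(1, 3, 6, 6)
fmap.requires_grad_(True)

out = torch.zeros(1, 3, 6, 6)

# Max over the 3x3 window at rows/cols 2:5, written to output position (0, 0).
mask = torch.zeros(1, 6, 6, dtype=torch.bool)
mask[0, 2:5, 2:5] = True
masked = fmap.masked_select(mask.expand(1, 3, 6, 6)).view(3, -1)
out[0, :, 0, 0] = masked.max(dim=1).values

# Same for the window at rows/cols 3:6, written to position (1, 1).
mask = torch.zeros(1, 6, 6, dtype=torch.bool)
mask[0, 3:6, 3:6] = True
masked = fmap.masked_select(mask.expand(1, 3, 6, 6)).view(3, -1)
out[0, :, 1, 1] = masked.max(dim=1).values

out.mean().backward()
# Gradient is 1/108 at each window's argmax (6 nonzero entries in total).
print(fmap.grad)
```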
GPU:
import numpy as np
import torch
from torch.autograd import Variable

out = torch.zeros(1, 3, 6, 6)
vout = Variable(out).cuda()
fmap = np.arange(3 * 6 * 6).reshape((1, 3, 6, 6))
tmap = Variable(torch.from_numpy(fmap).float(), requires_grad=True).cuda()
mask = torch.zeros(1, 6, 6).byte().cuda()
mask[0, 2:5, 2:5] = 1
mask = Variable(mask.expand(1, 3, 6, 6))
masked = tmap.masked_select(mask).view(3, -1)
pooled = torch.max(masked, 1)[0][:, 0]
vout[0, :, 0, 0] = pooled
mask = torch.zeros(1, 6, 6).byte().cuda()
mask[0, 3:6, 3:6] = 1
mask = Variable(mask.expand(1, 3, 6, 6))
masked = tmap.masked_select(mask).view(3, -1)
pooled = torch.max(masked, 1)[0][:, 0]
vout[0, :, 1, 1] = pooled
a = torch.mean(vout)
a.backward()
print(tmap.grad)
On the GPU, the printed tmap.grad is None.
I am using PyTorch 0.1.9.
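In case it helps diagnosis: .grad is only retained on leaf nodes, and calling .cuda() on a Variable is itself an autograd operation, so Variable(..., requires_grad=True).cuda() likely returns a new non-leaf node whose .grad stays None after backward. A minimal CPU-only sketch of that leaf/non-leaf distinction, using .clone() as a stand-in for .cuda() (modern tensor API, assuming torch >= 0.4):

```python
import torch

# Sketch of the suspected issue: gradients are only retained on leaf nodes.
# .clone() stands in for .cuda(); both are autograd ops whose output is a
# NEW, non-leaf node.
leaf = torch.arange(4.0, requires_grad=True)  # leaf: created by the user
nonleaf = leaf.clone()                        # non-leaf: output of an op

nonleaf.sum().backward()
print(leaf.grad)     # prints tensor([1., 1., 1., 1.])
print(nonleaf.grad)  # prints None: grads not retained on non-leaf nodes
```

If that is the cause here, moving the device transfer onto the tensor before wrapping, e.g. Variable(torch.from_numpy(fmap).float().cuda(), requires_grad=True), should keep tmap a leaf so tmap.grad gets populated.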