Gradient is None after applying custom mask

gesc · November 8, 2020, 3:40am

I have the forward function below, which uses the binary masks mask1 and mask2, but module1 gets zero gradients. How can I mask out1 so that module1 still gets updated?

# x and z are 2D matrices; mask1 and mask2 are column vectors with matching height
def forward(self, x, z, mask2, mask1):
    out1 = self.module1(x)
    in_2 = z * mask1 + out1 * mask2
    return self.module2(in_2)

Please help me.

ptrblck · November 9, 2020, 5:59am

module1 should create valid gradients as long as mask2 isn’t full of zeros, as seen here:

import torch
import torch.nn as nn

module1 = nn.Linear(10, 10)
module2 = nn.Linear(10, 10)

x = torch.randn(10, 10)
z = torch.randn(10, 10)

# random binary masks
mask1 = torch.randint(0, 2, (10, 10)).float()
mask2 = torch.randint(0, 2, (10, 10)).float()
# mask2 = torch.zeros(10, 10)  # uncomment to see module1's gradients become all zeros

# forward
out1 = module1(x)
in_2 = z * mask1 + out1 * mask2
out = module2(in_2)

# backward
out.mean().backward()
print(module1.weight.grad)
print(module2.weight.grad)
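
For the column-vector masks from the original question, the same behavior can be checked with a small sketch like the one below. The helper name module1_grad, the choice mask1 = 1 - mask2, and the layer sizes are assumptions made for illustration only; they are not taken from the original model. With an all-zero mask2 the gradient of module1 is a tensor of zeros (not None), while a single active row is already enough to produce nonzero gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def module1_grad(mask2):
    """Hypothetical helper: return module1's weight gradient for an (N, 1) column mask."""
    module1 = nn.Linear(10, 10)
    module2 = nn.Linear(10, 10)
    x = torch.randn(10, 10)
    z = torch.randn(10, 10)
    mask1 = 1.0 - mask2  # assumption: the two masks are complementary

    out1 = module1(x)
    # (N, 1) masks broadcast across the feature dimension of the (N, 10) inputs
    in_2 = z * mask1 + out1 * mask2
    module2(in_2).mean().backward()
    return module1.weight.grad

# mask2 full of zeros: out1 never reaches module2, so the gradient is all zeros
zero_mask_grad = module1_grad(torch.zeros(10, 1))

# mask2 with a single active row: module1 receives a gradient again
one_row_mask = torch.zeros(10, 1)
one_row_mask[3] = 1.0
one_row_grad = module1_grad(one_row_mask)

print(torch.count_nonzero(zero_mask_grad).item())
print(torch.count_nonzero(one_row_grad).item() > 0)
```

If module1.weight.grad were actually None rather than zeros, that would point to a different problem, such as out1 being detached from the graph before the masking step.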

gesc · November 9, 2020, 6:21am

Thanks for the clarification, it’s working now! My mask generator’s outputs were almost all zeros.
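
A quick way to catch this kind of issue early is to log how much of the mask is active before the forward pass. This is only a sketch: active_fraction is a hypothetical helper, and the example mask stands in for whatever the mask generator produces.

```python
import torch

def active_fraction(mask):
    """Hypothetical helper: fraction of nonzero entries in a mask tensor."""
    return torch.count_nonzero(mask).item() / mask.numel()

# stand-in for a mask generator output: only one of ten rows is active
mask2 = torch.zeros(10, 1)
mask2[3] = 1.0

frac = active_fraction(mask2)
print(f"mask2 active fraction: {frac:.2f}")
```

A value close to zero means module1 receives almost no gradient signal through out1 * mask2.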