I’m working on an image segmentation task and I’m trying to weight the gradients so that individual pixels in different images contribute with different weights. Some parts of the images are irrelevant to me, and I don’t want to bother the net with learning stuff that isn’t important. In Keras I could get around this by using sample_weight with sample_weight_mode = "temporal". In PyTorch, the only relatively easy way I’ve found so far is by using hooks, something like:
import torch.nn as nn
from torch.autograd import Variable

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.seq = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1),
            nn.BatchNorm2d(8),
            nn.ReLU(inplace=True),
            nn.Conv2d(8, 1, 3, padding=1),
        )

    def forward(self, input):
        output = self.seq(input)
        return output

# Scale grad_in[0] by the per-pixel weights; pass grad_in[1] and grad_in[2] through unchanged.
def hookFunc(module, grad_in, grad_out):
    return (grad_in[0] * appropriate_array, grad_in[1], grad_in[2])

# appropriate_array, rand_loader and optimizer are defined elsewhere.
model = MyModel().cuda()
model.seq[-1].register_backward_hook(hookFunc)
loss_fn = nn.BCEWithLogitsLoss()

for input, target in rand_loader:
    optimizer.zero_grad()
    input_var = Variable(input.cuda())
    target_var = Variable(target.cuda())
    output = model(input_var)
    loss = loss_fn(output, target_var)
    loss.backward()
    optimizer.step()
But this way I only change the gradients of the loss that influence what goes above the module (layer) that register_backward_hook is attached to. It seems like the gradients of the weights of the last conv module are already calculated at this point, and as far as I understand I can only change them by returning something else instead of grad_in[1], which means recalculating the weight gradients myself. I’m guessing there’s gotta be a better way than recalculating those gradients. Any hint on how to influence the per-pixel loss values before they are propagated to any weights would be appreciated. Sorry if I confused some terminology, like losses and gradients, or made no sense in some other way. =)
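To make the goal concrete, here is a minimal sketch of weighting the loss per pixel before anything is backpropagated, so that every layer (including the last conv) sees the weighted gradient. It is not taken from the code above: it assumes a recent PyTorch version where loss functions accept reduction="none", it runs on the CPU, and inputs, targets and pixel_weights are hypothetical random stand-ins for the real data and for appropriate_array.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Small stand-in for the model in the post (same layer layout).
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(inplace=True),
    nn.Conv2d(8, 1, 3, padding=1),
)

# Hypothetical batch: 4 single-channel 16x16 images with binary targets.
inputs = torch.randn(4, 1, 16, 16)
targets = torch.randint(0, 2, (4, 1, 16, 16)).float()

# Hypothetical per-pixel weights: ignore the left half of every image.
pixel_weights = torch.ones_like(targets)
pixel_weights[..., :8] = 0.0

logits = model(inputs)
logits.retain_grad()  # only needed to inspect the gradient at the output

# reduction="none" keeps one loss value per pixel, so each value can be
# weighted before the loss is reduced to a scalar.
per_pixel_loss = F.binary_cross_entropy_with_logits(
    logits, targets, reduction="none"
)

# Normalize by the total weight so the loss scale does not depend on how
# many pixels are masked out.
loss = (per_pixel_loss * pixel_weights).sum() / pixel_weights.sum().clamp(min=1.0)
loss.backward()
```

After loss.backward(), logits.grad is zero wherever pixel_weights is zero, so those pixels contribute nothing to any weight gradient and no hook is needed. F.binary_cross_entropy_with_logits also takes a weight argument that performs the same element-wise multiplication, but with the default mean reduction it divides by the number of pixels rather than by the sum of the weights.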