Unexpected behaviour from trying to control model saliency

SchulzKilian · June 22, 2024, 1:11pm

I have a model where, during training, the human user gets to select parts of the image that they don't consider important but that were still included in the saliency map. I then multiply the selected pixels with the gradient and backpropagate the result, hoping to make the model not take those parts of the image into account.

The code of that part looks like this:

Then I check the gradients created, and zero the weights whose gradient is too high (i.e. those that took the marked pixels into account). I get that I'm overfitting here, but even for one image the next saliency map often doesn't take out the marked regions. Instead it returns a weird reversed saliency map, where every image part that was salient now isn't and vice versa. Any explanations as to why, and how I can achieve what I want?

Thanks a lot in advance

SchulzKilian · June 24, 2024, 5:59am

I don't necessarily need people who know the answer; any ideas are very much welcome.

SchulzKilian · June 24, 2024, 9:37am

OK, so I realized it doesn't backpropagate properly because it doesn't manage to backpropagate to the marked pixels. How can I make it so that only the gradients of the marked pixels go into the loss?

soulitzer · June 24, 2024, 1:47pm

Do you have more code illustrating the problem? Looking at only this snippet, it should backprop toward the marked_pixels if they require grad.
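For reference, here is a minimal sketch of a loss built only from the input gradients at the marked pixels. It is not the code from this thread: the small `model`, the random `image`, and the `mask` are hypothetical stand-ins. The key assumption is that the saliency is obtained with `torch.autograd.grad(..., create_graph=True)`; without `create_graph=True` the returned gradient is detached from the weights, so a loss built from it cannot reach the model parameters.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model; Tanh keeps the second derivatives non-zero.
model = nn.Sequential(
    nn.Flatten(), nn.Linear(3 * 8 * 8, 16), nn.Tanh(), nn.Linear(16, 4)
)

image = torch.rand(1, 3, 8, 8, requires_grad=True)
mask = torch.zeros(1, 3, 8, 8)      # 1 where the user marked a pixel
mask[:, :, 2:5, 2:5] = 1.0          # the mask itself does not need requires_grad

score = model(image)[0, 0]          # logit of an arbitrarily chosen class

# create_graph=True records the graph of the gradient computation itself,
# so the saliency stays connected to the weights (double backward).
(saliency,) = torch.autograd.grad(score, image, create_graph=True)

# Only the marked pixels contribute to the loss.
loss = (saliency * mask).sum()

params = list(model.parameters())
# allow_unused=True: some parameters (e.g. the last bias) do not influence
# the input gradient at all, and their gradient comes back as None.
param_grads = torch.autograd.grad(loss, params, allow_unused=True)

for (name, _), g in zip(model.named_parameters(), param_grads):
    print(name, "no gradient" if g is None else f"{g.norm().item():.3e}")
```

If the first-layer weights show a non-zero gradient norm here, the masked loss does reach the weights in this setup.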

SchulzKilian · June 24, 2024, 3:13pm

Sure. I see two possible reasons why it doesn't backpropagate.

The first is that it actually doesn't backpropagate; here is the part of the code where I try to do it:

marked_pixels_grad = torch.tensor(self.marked_pixels, requires_grad=True)
loss1 = torch.sum(self.gradients * marked_pixels_grad)
loss1.backward(retain_graph=True)
self.zero_out_weights_with_non_zero_gradients()

def zero_weights_with_non_zero_gradients(self, instance_type=None):
    for param in self.model.parameters():
        if torch.isnan(param.grad).any():
            print("Gradient contains NaN values.")
            continue  # Skip this parameter
        sum1 = torch.sum(param.data)
        percentile = 1 - (self.marked_pixels_count * 3) / self.input.numel()
        # print(f"percentile is {percentile}")
        if param.grad is not None:
            limit = torch.quantile(abs(param.grad), percentile).item()
            # print(f"limit is {limit}")
        if param.grad is not None and instance_type is None:
            param.data[abs(param.grad) > limit] = 0
            param.data[abs(param.grad) <= limit] *= 1 / (1 - percentile)
            # print(f"Amount of zeros before is {param.data.numel()} amount removed is {param.data[abs(param.grad) > limit].numel()}")
        elif param.grad is not None and isinstance(param, instance_type):
            param.data[abs(param.grad) > limit] = 0
            param.data[abs(param.grad) <= limit] *= 1 / (1 - percentile)
        sum2 = torch.sum(param.data)
        # print(f"We went from {sum1} to {sum2}")
        param.requires_grad_()

The other possibility is that I'm making a mistake when checking whether I actually backpropagate to self.marked_pixels. The way I check is to create two losses, one with and one without marked_pixels, and compare the resulting gradients; there are no differences.
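A sketch of how such a comparison could be set up, assuming a hypothetical toy `model`, `image`, and `mask` rather than my real ones. It uses `torch.autograd.grad` instead of `.backward()` so that the two gradient sets are not accumulated into the same `.grad` buffers, which is one way a check like this can go wrong.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real model, input and marked-pixel mask.
model = nn.Sequential(
    nn.Flatten(), nn.Linear(3 * 8 * 8, 16), nn.Tanh(), nn.Linear(16, 4)
)
image = torch.rand(1, 3, 8, 8, requires_grad=True)
mask = torch.zeros(1, 3, 8, 8)
mask[:, :, 2:5, 2:5] = 1.0

score = model(image)[0, 0]
(saliency,) = torch.autograd.grad(score, image, create_graph=True)

params = list(model.parameters())

def param_grads(loss):
    """Gradients of `loss` w.r.t. all parameters, with None replaced by zeros."""
    grads = torch.autograd.grad(loss, params, retain_graph=True, allow_unused=True)
    return [torch.zeros_like(p) if g is None else g for p, g in zip(params, grads)]

grads_masked = param_grads((saliency * mask).sum())  # loss with the mask
grads_full = param_grads(saliency.sum())             # loss without the mask

# If the mask takes part in the loss, the two gradient sets should differ.
max_diff = max((a - b).abs().max().item() for a, b in zip(grads_masked, grads_full))
print(f"largest difference between the two gradient sets: {max_diff:.3e}")
```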

Also, only backpropagating to the current gradients would explain why the saliency map gets inverted: you punish the weights that encourage the current gradients and encourage the other ones. Maybe it doesn't backpropagate because the marked pixels aren't a continuous function? If so, is there a way to still add only parts of the input gradients to the loss?
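One idea related to the inversion: a signed sum over the marked gradients can be lowered simply by pushing those gradients negative, so minimising it does not necessarily shrink their magnitude. A rough sketch that penalises the squared magnitude instead, again on a hypothetical toy `model`, `image`, and `mask`, with an arbitrary learning rate and step count, and with no task loss included:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins; in practice this penalty would be added to the task loss.
model = nn.Sequential(
    nn.Flatten(), nn.Linear(3 * 8 * 8, 16), nn.Tanh(), nn.Linear(16, 4)
)
image = torch.rand(1, 3, 8, 8, requires_grad=True)
mask = torch.zeros(1, 3, 8, 8)
mask[:, :, 2:5, 2:5] = 1.0

def masked_saliency_penalty():
    """Sum of squared input gradients over the marked pixels."""
    score = model(image)[0, 0]
    (saliency,) = torch.autograd.grad(score, image, create_graph=True)
    # Squaring removes the sign, so the optimum is zero saliency, not negative saliency.
    return ((saliency * mask) ** 2).sum()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

before = masked_saliency_penalty().item()
for _ in range(200):
    optimizer.zero_grad()
    penalty = masked_saliency_penalty()
    penalty.backward()   # double backward through the saliency computation
    optimizer.step()
after = masked_saliency_penalty().item()

print(f"masked saliency energy: {before:.3e} -> {after:.3e}")
```

Whether this also avoids the inversion on a real network is something I would still have to test.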

Edit: Maybe I should go back to calculating the Hessian for only the marked pixels?