You would have to use ._grad in order to overwrite the gradient directly.
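For illustration, a minimal sketch of that route, rescaling the stored gradients after backward() (the model, shapes and the scale factor here are placeholders of my own, not your actual setup):

import torch

model = torch.nn.Linear(4, 1)            # placeholder model
loss = model(torch.randn(8, 4)).sum()    # placeholder loss
loss.backward()

grad_scale = 0.5                          # hypothetical rescaling factor
with torch.no_grad():
    for p in model.parameters():
        p.grad = p.grad * grad_scale      # overwrite the stored gradient in place

This quickly gets awkward when the weighting is per-sample rather than per-parameter, which is why the loss-side approach below is preferable.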
But you should definitely prefer to change the loss computation instead (it is much simpler and cleaner). smooth_l1_loss is straightforward to rewrite by hand, and you just need an extra step to multiply by your weights before summing over the batch dimension. Something like this:
def forward(self, input, target, weights):
    # elementwise smooth L1: 0.5 * d**2 where |d| < 1, |d| - 0.5 elsewhere
    diff = torch.abs(input - target)
    batch_loss = (diff < 1).float() * 0.5 * diff ** 2 + \
                 (diff >= 1).float() * (diff - 0.5)
    weighted_batch_loss = weights * batch_loss   # apply the per-sample weights
    weighted_loss = weighted_batch_loss.sum()    # reduce over the batch
    return weighted_loss
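Wrapped in an nn.Module it would be used roughly like this (a sketch; the class name WeightedSmoothL1Loss and the tensor shapes are my own assumptions):

import torch
import torch.nn as nn

class WeightedSmoothL1Loss(nn.Module):
    # hypothetical wrapper around the forward() shown above
    def forward(self, input, target, weights):
        diff = torch.abs(input - target)
        batch_loss = (diff < 1).float() * 0.5 * diff ** 2 + \
                     (diff >= 1).float() * (diff - 0.5)
        return (weights * batch_loss).sum()

criterion = WeightedSmoothL1Loss()
pred = torch.randn(8, 4, requires_grad=True)
target = torch.randn(8, 4)
weights = torch.rand(8, 1)        # one weight per sample, broadcast across features
loss = criterion(pred, target, weights)
loss.backward()                   # gradients now carry the per-sample weights

With all weights set to 1 this reduces to F.smooth_l1_loss(pred, target, reduction='sum'), which is a quick way to sanity-check the hand-written version.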