Adnan1588
(Adnan Siraj Rakin)
March 30, 2018, 3:27pm
1
I have an activation function, for example:
import torch

class _Quantize(torch.autograd.Function):
    def forward(self, input):
        self.save_for_backward(input)
        N_bits = 1
        ttanh = torch.tanh(input)
        abss = torch.abs(ttanh)
        maxx = torch.max(abss)
        Func = ttanh / (maxx * 2) + 0.5    # normalize tanh output to [0, 1]
        #Func = Func * 2 - 1
        output = (Func * (2 ** N_bits - 1)).round() / (2 ** N_bits - 1)    # quantize to 2**N_bits levels
        output = output * 2 - 1            # rescale to [-1, 1]
        return output

    def backward(self, grad_output):
        # saved tensors - tuple of tensors with one element
        input, = self.saved_tensors
        # straight-through estimator: pass the gradient, clipped outside [-1, 1]
        grad_input = grad_output.clone()
        grad_input[input.ge(1)] = 0
        grad_input[input.le(-1)] = 0
        return grad_input
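For reference, a minimal usage sketch (not from the original post), using the 0.3-era Function API this thread is written against; newer PyTorch versions instead require the static forward/backward(ctx, ...) style:

x = torch.autograd.Variable(torch.randn(4), requires_grad=True)
y = _Quantize()(x)     # forward pass through the quantizer
y.sum().backward()     # backward applies the custom clipped gradient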
How can I make one of these parameters, like N_bits, learnable? Thanks.
SimonW
(Simon Wang)
March 30, 2018, 4:22pm
2
You can have it as an input.
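A minimal sketch of that convention (the class name and the forward body here are placeholders, not code from this thread): when forward takes N_bits as a second argument, backward must return one value per forward input, and None means that input receives no gradient:

class _QuantizeWithArg(torch.autograd.Function):   # hypothetical name
    def forward(self, input, N_bits):
        self.save_for_backward(input)
        # placeholder body; the real quantization logic would go here
        return input.clamp(-1, 1)

    def backward(self, grad_output):
        input, = self.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input.ge(1)] = 0
        grad_input[input.le(-1)] = 0
        # one return value per forward input;
        # None means N_bits gets no gradient (and so cannot train)
        return grad_input, None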
Adnan1588
(Adnan Siraj Rakin)
March 30, 2018, 7:42pm
3
I have already defined it as an input, but then the backward pass has to return a gradient tensor for it in addition to grad_input. If I return None, the parameter never changes.
SimonW
(Simon Wang)
March 30, 2018, 7:43pm
4
I don’t understand. If you don’t want it to have a gradient, how can it be trained?
Adnan1588
(Adnan Siraj Rakin)
March 30, 2018, 7:46pm
5
I do want it to have a gradient, but I don’t know how to compute the second gradient tensor to return. For example:
def forward(self, input, N_bits):
    self.save_for_backward(input, N_bits)
    ttanh = torch.tanh(input)
    abss = torch.abs(ttanh)
    maxx = torch.max(abss)
    Func = ttanh / (maxx * 2) + 0.5    # normalize tanh output to [0, 1]
    #Func = Func * 2 - 1
    output = (Func * (2 ** N_bits - 1)).round() / (2 ** N_bits - 1)    # quantize
    output = output * 2 - 1            # rescale to [-1, 1]
    return output

def backward(self, grad_output):
    # saved tensors - tuple with two elements now
    input, N_bits = self.saved_tensors
    grad_input = grad_output.clone()
    grad_input[input.ge(1)] = 0
    grad_input[input.le(-1)] = 0
    # backward must return one gradient per forward input
    return grad_input, (??)
I am not sure about the ?? part.
SimonW
(Simon Wang)
March 30, 2018, 7:56pm
6
Since you are writing a customized gradient in an autograd.Function, you should have a formula for whatever gradient you need.
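For illustration only, since the thread never gives such a formula: round() has zero derivative almost everywhere, so one possible surrogate is to treat round() as the identity when differentiating (a straight-through estimator) and push the chain rule through the step count s = 2**N_bits - 1, for which ds/dN_bits = ln(2) * 2**N_bits. With o = 2 * round(F*s)/s - 1 this gives do/ds ≈ 2 * (F - round(F*s)/s) / s. The sketch below is a hypothetical variant, written in the static forward/backward(ctx, ...) style that current PyTorch requires, and it assumes N_bits is a real-valued scalar tensor with requires_grad=True; none of it is from the thread itself.

import math
import torch

class _QuantizeLearnable(torch.autograd.Function):   # hypothetical variant
    @staticmethod
    def forward(ctx, input, N_bits):
        ctx.save_for_backward(input, N_bits)
        s = 2 ** N_bits.item() - 1              # number of quantization steps
        ttanh = torch.tanh(input)
        maxx = torch.abs(ttanh).max()
        Func = ttanh / (maxx * 2) + 0.5         # normalize to [0, 1]
        output = (Func * s).round() / s         # quantize
        return output * 2 - 1                   # rescale to [-1, 1]

    @staticmethod
    def backward(ctx, grad_output):
        input, N_bits = ctx.saved_tensors
        n = N_bits.item()
        s = 2 ** n - 1
        ttanh = torch.tanh(input)
        maxx = torch.abs(ttanh).max()
        Func = ttanh / (maxx * 2) + 0.5
        q = (Func * s).round() / s              # quantized value in [0, 1]
        # straight-through gradient for the input, clipped outside [-1, 1]
        grad_input = grad_output.clone()
        grad_input[input.ge(1)] = 0
        grad_input[input.le(-1)] = 0
        # surrogate d(output)/d(N_bits) via the chain rule through s:
        # do/ds ~= 2 * (Func - q) / s   (round treated as identity)
        # ds/dn  = ln(2) * 2**n
        do_ds = 2 * (Func - q) / s
        ds_dn = math.log(2) * (2 ** n)
        grad_nbits = (grad_output * do_ds * ds_dn).sum().view_as(N_bits)
        return grad_input, grad_nbits

With this sketch you would create N_bits = torch.tensor(1.0, requires_grad=True) and call output = _QuantizeLearnable.apply(x, N_bits), so an optimizer can update N_bits; rounding it back to an integer bit width for evaluation is a separate design choice.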