Adnan1588
(Adnan Siraj Rakin)
March 30, 2018, 3:27pm
1
I have an activation function, for example:
import torch

class _Quantize(torch.autograd.Function):
    def forward(self, input):
        self.save_for_backward(input)
        N_bits = 1
        ttanh = torch.tanh(input)
        abss = torch.abs(ttanh)
        maxx = torch.max(abss)
        Func = ttanh / (maxx * 2) + 0.5    # normalize tanh output to [0, 1]
        #Func = Func * 2 - 1
        output = (Func * (2 ** N_bits - 1)).round() / (2 ** N_bits - 1)    # quantize to 2**N_bits levels
        output = output * 2 - 1            # rescale to [-1, 1]
        return output

    def backward(self, grad_output):
        # saved tensors - tuple of tensors with one element
        input, = self.saved_tensors
        # straight-through estimator: pass the gradient, clipped outside [-1, 1]
        grad_input = grad_output.clone()
        grad_input[input.ge(1)] = 0
        grad_input[input.le(-1)] = 0
        return grad_input
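For reference, a minimal usage sketch (not from the original post), using the 0.3-era Function API this thread is written against; newer PyTorch versions instead require the static forward/backward(ctx, ...) style:

x = torch.autograd.Variable(torch.randn(4), requires_grad=True)
y = _Quantize()(x)     # forward pass through the quantizer
y.sum().backward()     # backward applies the custom clipped gradient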
How can I make one of these parameters, like N_bits, learnable? Thanks.
SimonW
(Simon Wang)
March 30, 2018, 4:22pm
2
You can have it as an input.
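A minimal sketch of that convention (the class name and the forward body here are placeholders, not code from this thread): when forward takes N_bits as a second argument, backward must return one value per forward input, and None means that input receives no gradient:

class _QuantizeWithArg(torch.autograd.Function):   # hypothetical name
    def forward(self, input, N_bits):
        self.save_for_backward(input)
        # placeholder body; the real quantization logic would go here
        return input.clamp(-1, 1)

    def backward(self, grad_output):
        input, = self.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input.ge(1)] = 0
        grad_input[input.le(-1)] = 0
        # one return value per forward input;
        # None means N_bits gets no gradient (and so cannot train)
        return grad_input, None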
Adnan1588
(Adnan Siraj Rakin)
March 30, 2018, 7:42pm
3
I have already defined it as an input, but then the backward pass has to return a gradient tensor for it in addition to grad_input. If I return None, the parameter never changes.
SimonW
(Simon Wang)
March 30, 2018, 7:43pm
4
I don’t understand. If you don’t want it to have a gradient, how can it be trained?
Adnan1588
(Adnan Siraj Rakin)
March 30, 2018, 7:46pm
5
I do want it to have a gradient, but I don’t know how to compute the second gradient tensor to return. For example:
def forward(self, input, N_bits):
    self.save_for_backward(input, N_bits)
    ttanh = torch.tanh(input)
    abss = torch.abs(ttanh)
    maxx = torch.max(abss)
    Func = ttanh / (maxx * 2) + 0.5    # normalize tanh output to [0, 1]
    #Func = Func * 2 - 1
    output = (Func * (2 ** N_bits - 1)).round() / (2 ** N_bits - 1)    # quantize
    output = output * 2 - 1            # rescale to [-1, 1]
    return output

def backward(self, grad_output):
    # saved tensors - tuple with two elements now
    input, N_bits = self.saved_tensors
    grad_input = grad_output.clone()
    grad_input[input.ge(1)] = 0
    grad_input[input.le(-1)] = 0
    # backward must return one gradient per forward input
    return grad_input, (??)
I am not sure about the ?? part.
SimonW
(Simon Wang)
March 30, 2018, 7:56pm
6
Since you are writing a customized gradient in an autograd.Function, you should have a formula for whatever gradient you need.
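For illustration only, since the thread never gives such a formula: round() has zero derivative almost everywhere, so one possible surrogate is to treat round() as the identity when differentiating (a straight-through estimator) and push the chain rule through the step count s = 2**N_bits - 1, for which ds/dN_bits = ln(2) * 2**N_bits. With o = 2 * round(F*s)/s - 1 this gives do/ds ≈ 2 * (F - round(F*s)/s) / s. The sketch below is a hypothetical variant, written in the static forward/backward(ctx, ...) style that current PyTorch requires, and it assumes N_bits is a real-valued scalar tensor with requires_grad=True; none of it is from the thread itself.

import math
import torch

class _QuantizeLearnable(torch.autograd.Function):   # hypothetical variant
    @staticmethod
    def forward(ctx, input, N_bits):
        ctx.save_for_backward(input, N_bits)
        s = 2 ** N_bits.item() - 1              # number of quantization steps
        ttanh = torch.tanh(input)
        maxx = torch.abs(ttanh).max()
        Func = ttanh / (maxx * 2) + 0.5         # normalize to [0, 1]
        output = (Func * s).round() / s         # quantize
        return output * 2 - 1                   # rescale to [-1, 1]

    @staticmethod
    def backward(ctx, grad_output):
        input, N_bits = ctx.saved_tensors
        n = N_bits.item()
        s = 2 ** n - 1
        ttanh = torch.tanh(input)
        maxx = torch.abs(ttanh).max()
        Func = ttanh / (maxx * 2) + 0.5
        q = (Func * s).round() / s              # quantized value in [0, 1]
        # straight-through gradient for the input, clipped outside [-1, 1]
        grad_input = grad_output.clone()
        grad_input[input.ge(1)] = 0
        grad_input[input.le(-1)] = 0
        # surrogate d(output)/d(N_bits) via the chain rule through s:
        # do/ds ~= 2 * (Func - q) / s   (round treated as identity)
        # ds/dn  = ln(2) * 2**n
        do_ds = 2 * (Func - q) / s
        ds_dn = math.log(2) * (2 ** n)
        grad_nbits = (grad_output * do_ds * ds_dn).sum().view_as(N_bits)
        return grad_input, grad_nbits

With this sketch you would create N_bits = torch.tensor(1.0, requires_grad=True) and call output = _QuantizeLearnable.apply(x, N_bits), so an optimizer can update N_bits; rounding it back to an integer bit width for evaluation is a separate design choice.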