Options 1 and 2 both compute the correct result but are significantly slower than the built-in relu (they actually start out fine and then get slower and slower as the run goes on), while option 3 tells me "data must be a Tensor." Looking for any ideas about the slowdown, or a different way to do this!
Edit: I have now also tried return inp.clamp(min=0), which works but shows the same slowdown.
Got it! I don't know why this would cause a problem, but I was declaring self.reluForwarder = reluForward() in the Module's init. Moving it so that reluForwarder = reluForward() is created inside the forward function makes it run at the same speed as the regular relu.
To avoid this mistake, keep in mind that a Function is not an nn.Module: a Function instance should only be used once, so don't store one in your Module's init and reuse it on every call.
Also, for better performance, you should use the new-style (static method) Functions, as follows:
class reluForward(torch.autograd.Function):
    @staticmethod
    def forward(ctx, inp):
        # option 1:
        # return inp * (inp > 0).float()
        # option 2:
        # return F.relu(inp).data
        # option 3:
        return F.relu(inp)

    @staticmethod
    def backward(ctx, grad_out):
        # Straight-through: pass the incoming gradient back unchanged.
        return grad_out
# To use it:
inp = Variable(torch.rand(10, 10))
out = reluForward.apply(inp)  # call apply on the class, not on an instance!
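For completeness, here is a self-contained sketch of the same pattern on the current PyTorch API (plain tensors with requires_grad instead of Variable). The ReluForward name and the sanity checks at the end are my additions, not from the thread:

```python
import torch
import torch.nn.functional as F

class ReluForward(torch.autograd.Function):
    """ReLU in the forward pass; straight-through (identity) gradient in backward."""

    @staticmethod
    def forward(ctx, inp):
        return F.relu(inp)

    @staticmethod
    def backward(ctx, grad_out):
        # Pass the gradient through unchanged, matching the thread's backward.
        return grad_out

inp = torch.randn(10, 10, requires_grad=True)
out = ReluForward.apply(inp)  # use the class, never an instance
out.sum().backward()

# Forward matches the built-in relu; gradient is the straight-through identity.
print(torch.equal(out, F.relu(inp)))
print(torch.equal(inp.grad, torch.ones_like(inp)))
```

Because forward and backward are static methods, no state is stored on an instance between calls, which is exactly what avoids the slowdown described above.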