Hello,
I created an activation function to play around with, and when I implement it with torch.autograd.Function it is very slow. So I implemented ReLU myself (mirroring the built-in one in PyTorch) with custom forward and backward passes, then benchmarked it against the built-in nn.ReLU() and found it roughly 10x slower.
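For reference, this is roughly what the custom version looks like (a simplified sketch, not my exact code; the class names are placeholders):

```python
import torch
import torch.nn as nn

class MyReLUFunction(torch.autograd.Function):
    # Custom ReLU: forward clamps negatives to zero,
    # backward passes the gradient only where the input was non-negative.
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

class MyReLU(nn.Module):
    def forward(self, x):
        return MyReLUFunction.apply(x)

# Rough timing comparison (sketch of how I measured it)
x = torch.randn(1024, 1024, requires_grad=True)
for act in (nn.ReLU(), MyReLU()):
    import time
    start = time.time()
    for _ in range(1000):
        act(x).sum().backward()
    print(type(act).__name__, time.time() - start)
```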
Any ideas why this is happening?
Is it because I haven't written GPU-optimized code, or is something else going on? Any thoughts are welcome.
Thanks!