Why can't I use Tanh/Sigmoid?

I have written a network to classify images.
I tested ResNet and got good results, so the training works, but when I wanted to write my own Net it returns NaN after a few steps because the outputs get too high.
I wanted to keep my outputs low by using different output functions, but I always got the
"one of the variables needed for gradient computation has been modified by an inplace operation" error.
Is this even the right approach, or should I try something else?

import torch.nn as nn
import torch.nn.functional as F


class simple(nn.Module):
    def __init__(self):
        super(simple, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
        self.conv_64to128 = nn.Conv2d(64, 128, 3, padding=1)
        self.conv128 = nn.Conv2d(128, 128, 3, padding=1)
        self.bn128 = nn.BatchNorm2d(128)

        self.pool = nn.MaxPool2d(2, 2)

        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))  # defined output size

        # fully connected
        self.fc = nn.Linear(128, 9)

    def forward(self, x):  # 3, 200,200
        x = F.relu(self.conv1(x))  # 64, 100, 100 (stride 2 halves the spatial size)
        x = self.pool(F.relu(self.conv_64to128(x)))  # 128, 50, 50
        for i in range(2):
            x = self.pool(F.relu(self.conv128(x)))  # 128, (25, 12)
            ide = x.clone()
            ids = x.clone()
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x += ide
            x = F.relu(x)
            ide = x.clone()
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x += ide
            x = F.relu(x)
            ide = x.clone()
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x += ide
            x = F.relu(x)
            ide = x.clone()
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x += ide
            x = F.relu(x)
            x += ids
            x = F.relu(x)

        for i in range(3):
            ide = x.clone()
            ids = x.clone()
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x += ide
            x = F.relu(x)
            ide = x.clone()
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x += ide
            x = F.relu(x)
            ide = x.clone()
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x += ide
            x = F.relu(x)
            ide = x.clone()
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x = F.relu(self.bn128(self.conv128(x)))
            x += ide
            x = F.relu(x)
            x += ids
            x = F.relu(x)

        x = self.avgpool(x)
        x = x.reshape(x.shape[0], -1)
        x = self.fc(x)
        return x  # 9 classes

You are using in-place Python ops.
https://www.tutorialspoint.com/inplace-operator-in-python
In short, Python classes usually implement two methods, __add__ and __iadd__. x += ANYTHING is computed by the in-place addition operator.

Check some Python books.
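
For example, a minimal sketch of the difference (my own illustration, not part of the original answer):

import torch

x = torch.zeros(3)
ide = torch.ones(3)

ptr = x.data_ptr()
x += ide                      # __iadd__: modifies x's existing storage in place
print(x.data_ptr() == ptr)    # True, same storage

x = x + ide                   # __add__: allocates a brand-new tensor
print(x.data_ptr() == ptr)    # False, new storage

Autograd tracks in-place modifications through each tensor's version counter, which is how the error above gets raised when a tensor saved for the backward pass is changed.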

Thank you for your answer, but the "+=" usually works; it is even written like this in the official ResNet implementation. I am just wondering why it works with ReLU but not with any other function.

It depends on the (chain of) operations, as described here. If the activation output is needed for gradient computation, in-place operations are disallowed and PyTorch will raise an error.
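
To make that concrete, a minimal sketch (my own example, not from the thread): sigmoid saves its output for the backward pass (its gradient is out * (1 - out)), so mutating that output in place trips autograd's check, while the out-of-place add does not.

import torch

x = torch.randn(4, requires_grad=True)

out = torch.sigmoid(x)
out += 1                     # in-place add modifies the tensor sigmoid saved for backward
try:
    out.sum().backward()
except RuntimeError as err:
    print(err)               # "... has been modified by an inplace operation"

out = torch.sigmoid(x)
out = out + 1                # out-of-place add leaves the saved output untouched
out.sum().backward()         # works

Applied to the posted model, replacing x += ide with x = x + ide (and the same for ids) should therefore let Tanh/Sigmoid activations run without the error.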