Actually, there is a difference. The implementations in torch.xxx have their backward implemented using Python calls, while the functional counterparts have their backward implemented entirely in C/CUDA, so the functional backward code is more efficient.
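
If you want to check this on your own install, here is a minimal timing sketch. The helper `time_forward_backward` is a name made up for this example, and `torch.relu` / `F.relu` are just stand-ins I picked for the two variants, not the specific functions discussed above. It runs on CPU; on recent PyTorch versions both calls may end up in the same native code, so the gap could be negligible.

```python
import time

import torch
import torch.nn.functional as F


def time_forward_backward(fn, shape=(256, 256), repeats=20):
    """Return the mean seconds per forward+backward pass of fn (hypothetical helper)."""
    x = torch.randn(*shape, requires_grad=True)
    # Warm-up pass so one-time setup cost is not included in the measurement.
    fn(x).sum().backward()
    x.grad = None
    start = time.perf_counter()
    for _ in range(repeats):
        fn(x).sum().backward()
        x.grad = None  # reset so gradients do not accumulate across runs
    return (time.perf_counter() - start) / repeats


# Stand-in pair chosen for illustration only (an assumption, not from the post).
t_torch = time_forward_backward(torch.relu)
t_functional = time_forward_backward(F.relu)
print(f"torch.relu: {t_torch:.6f} s/iter")
print(f"F.relu:     {t_functional:.6f} s/iter")
```

Timings like these are noisy, so run it a few times before drawing conclusions. For GPU tensors you would also need to call `torch.cuda.synchronize()` before reading the clock, since CUDA kernels launch asynchronously.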