It’s not possible in general because some modules need tensors saved during the forward pass to be able to compute the backward.
For the case of the linear, if you want the gradients w.r.t. the weights, then the forward needs to have saved the input tensor so that those gradients can be computed.
Of course, in some special cases it’s possible, such as the one you showed. But I’m afraid you’ll have to write the mm yourself.
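A minimal sketch of what “writing the mm yourself” could look like, using a custom `torch.autograd.Function` (the class and variable names here are made up for illustration). The idea is that you only save a tensor if the backward actually needs it: the input is only needed for the weight’s gradient, and the weight is only needed for the input’s gradient. So if, say, the weight is frozen (`requires_grad=False`), the input is never saved:

```python
import torch

class MyMM(torch.autograd.Function):
    """Matmul that saves only the tensors its backward will actually use.
    (Hypothetical sketch, not the built-in implementation.)"""

    @staticmethod
    def forward(ctx, inp, weight):
        # Save the input only if we'll need the weight's gradient,
        # and the weight only if we'll need the input's gradient.
        ctx.save_for_backward(
            inp if weight.requires_grad else None,
            weight if inp.requires_grad else None,
        )
        return inp.mm(weight)

    @staticmethod
    def backward(ctx, grad_out):
        inp, weight = ctx.saved_tensors
        grad_inp = grad_out.mm(weight.t()) if weight is not None else None
        grad_weight = inp.t().mm(grad_out) if inp is not None else None
        return grad_inp, grad_weight

# Frozen weight: the input tensor is never saved for the backward.
x = torch.randn(4, 3, requires_grad=True)
w = torch.randn(3, 5)  # requires_grad=False
MyMM.apply(x, w).sum().backward()
```

Note that `ctx.save_for_backward` accepts `None` entries, which come back as `None` from `ctx.saved_tensors`, so the skipping logic stays simple.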