How does one go about overriding the derivative of a primitive matrix operation?

If I wanted to override the implementation of the backward pass of a primitive matrix operation, say, matmul or convolution, how might I go about it?

Some subquestions:
(a) Is there any way to do it (efficiently) in Python (just me hoping, haha)?
(b) Assuming the answer to (a) is no, can I implement each operation once in C++ and have it work for all backends, or would I need a separate implementation per backend, e.g. CUDA?


You could write a custom C++ extension as seen here. If you implement the backward using pure PyTorch operations, you won't need to write backend-specific code, since those operations already dispatch to the right backend. You would still have the option to write custom CUDA kernels if you need more performance.
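Regarding (a): you can override the backward of an op in pure Python with `torch.autograd.Function` (the same mechanism a C++ extension would hook into). A minimal sketch for matmul, with the standard gradients `grad_a = grad_out @ b.T` and `grad_b = a.T @ grad_out`, where `MyMatmul` is just an illustrative name:

```python
import torch

class MyMatmul(torch.autograd.Function):
    """Matmul with a custom (here: hand-written) backward pass."""

    @staticmethod
    def forward(ctx, a, b):
        # Stash inputs needed to compute gradients later.
        ctx.save_for_backward(a, b)
        return a @ b

    @staticmethod
    def backward(ctx, grad_out):
        a, b = ctx.saved_tensors
        # Custom derivative rules go here; these are the standard ones.
        grad_a = grad_out @ b.t()
        grad_b = a.t() @ grad_out
        return grad_a, grad_b

a = torch.randn(3, 4, requires_grad=True)
b = torch.randn(4, 5, requires_grad=True)
out = MyMatmul.apply(a, b)
out.sum().backward()  # populates a.grad and b.grad via the custom backward
```

Since the backward here is itself built from PyTorch ops, it runs on whatever device the inputs live on; the usual caveat is that a Python-level `Function` adds some interpreter overhead per call compared to a fused C++/CUDA kernel.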