How to implement a custom layer

Hi
I'd like to test a custom layer. I wrote the __init__ and forward methods, but when I test it on an MNIST test case, the training loss doesn't change. What did I miss? Should I also write a backward method?
Is there any good tutorial with examples somewhere?

Thanks for your help

This tutorial might be a good starting point. Generally you don't need to write a backward method as long as you stick to differentiable PyTorch operations, since autograd will build the backward pass for you.
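As a minimal sketch of what that could look like (the layer name and the learnable per-feature scale are just illustrative assumptions, not your actual layer):

```python
import torch
import torch.nn as nn

class ScaledLinear(nn.Module):
    """Toy custom layer: a linear transform with a learnable per-feature scale."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.scale = nn.Parameter(torch.ones(out_features))

    def forward(self, x):
        # Only differentiable PyTorch ops are used here, so autograd
        # derives the backward pass automatically; no backward method needed.
        return self.scale * (x @ self.weight.t() + self.bias)
```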

Thanks a lot. I have seen this tutorial, but I didn't understand it.
If my loss doesn't change over 20 epochs, I guess that means my gradients are zero or near zero. So is this a sign that I should write a backward method?

Not necessarily, and you would have to debug the issue first.
Check whether the .grad attributes of the used parameters show valid gradients after the first backward() call, or whether they are None (without calling optimizer.zero_grad() in between).
In the first case the gradients are properly calculated but might be small; in the latter case the computation graph might be detached, and you would need to check why that's the case (e.g. by re-wrapping a tensor, calling .detach() on it, using a 3rd-party library, etc.).
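A quick way to run that check might look like the snippet below; the stand-in model, batch shapes, and loss are only placeholders for your actual setup:

```python
import torch
import torch.nn as nn

# Stand-in for your model containing the custom layer.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.randn(8, 1, 28, 28)          # dummy MNIST-shaped batch
target = torch.randint(0, 10, (8,))

criterion = nn.CrossEntropyLoss()
loss = criterion(model(x), target)
loss.backward()

# Inspect the gradients after the first backward() call.
for name, param in model.named_parameters():
    if param.grad is None:
        print(f"{name}: grad is None -> the computation graph is likely detached")
    else:
        print(f"{name}: grad abs mean = {param.grad.abs().mean().item():.3e}")
```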

I rewrote it from scratch, and now it works perfectly. I don't know why…
Thanks for your help anyway.
