I have two questions about writing a custom backward function.
1- Can I pass a list of tensors to ctx.save_for_backward? I need a list of 24 activation outputs in my backward function.
2- Can I output something extra from the custom backward function? Let’s say I have a second copy of the model and would like to copy gradients to it during backward. How can I get the gradients into the second model when calling loss.backward()?
Regarding #2: my actual model is on the GPU, and I have an extra copy of the model that resides on the CPU. When I calculate the gradient for a layer, I copy it to the CPU model too. So can I attach the CPU model to ctx in the forward pass (e.g. ctx.cpu_model = cpu_model), retrieve it in the backward pass, copy the gradients from the GPU model to the CPU model there, and then read the gradients from the CPU model after calling loss.backward()?
Yes, that’s the way to do it.
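For anyone landing here later, a minimal sketch of both points might look like the following. It is only an illustration under some assumptions: the names MirroredLinear, gpu_layer, cpu_layer and acts are made up for the example, the layer is a bias-free linear layer, and the 24 activations are passed in as extra inputs whose own gradients are not needed (so backward returns None for them). ctx.save_for_backward takes tensors as separate arguments, so the list is unpacked with *. The CPU copy is not a tensor, so it is stored as a plain attribute on ctx.

```python
import torch


class MirroredLinear(torch.autograd.Function):
    """Hypothetical example: y = x @ W^T, mirroring W's gradient to a CPU copy."""

    @staticmethod
    def forward(ctx, x, weight, cpu_layer, *activations):
        # save_for_backward accepts tensors as positional arguments,
        # so a list of tensors has to be unpacked.
        ctx.save_for_backward(x, weight, *activations)
        # Non-tensor objects can be kept as ordinary attributes on ctx.
        ctx.cpu_layer = cpu_layer
        return x @ weight.t()

    @staticmethod
    def backward(ctx, grad_out):
        # saved_tensors comes back as a tuple in the order it was saved.
        x, weight, *activations = ctx.saved_tensors
        grad_x = grad_out @ weight
        grad_w = grad_out.t() @ x
        # Side effect: copy the weight gradient into the CPU copy.
        ctx.cpu_layer.weight.grad = grad_w.detach().cpu()
        # backward must return one value per forward input; None is used for
        # the non-tensor argument and (by assumption) for the activations.
        return (grad_x, grad_w, None) + (None,) * len(activations)


# Usage: falls back to CPU when no GPU is available.
torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"
gpu_layer = torch.nn.Linear(4, 3, bias=False).to(device)
cpu_layer = torch.nn.Linear(4, 3, bias=False)  # second copy, stays on CPU
x = torch.randn(5, 4, device=device)
acts = [torch.randn(5, 4, device=device) for _ in range(24)]

out = MirroredLinear.apply(x, gpu_layer.weight, cpu_layer, *acts)
loss = out.sum()
loss.backward()

# After backward, both copies hold the same weight gradient.
print(cpu_layer.weight.grad.device, cpu_layer.weight.grad.shape)
```

Note that backward itself still only returns gradients for the inputs of forward. The copy to the CPU model happens as a side effect inside backward, and the result is read from cpu_layer.weight.grad afterwards.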