According to https://pytorch.org/tutorials/beginner/examples_autograd/two_layer_net_custom_function.html, torch.autograd.Function subclasses now require forward and backward to be static. I am just coming back to a project I was working on way back in 0.3. It required a Function to keep track of some values, and these values were actually changed elsewhere in the program. I see ctx can be used to save values between the forward and backward passes. Is there a way to access, add to, or modify ctx outside of forward() and backward()? Or, alternatively, to pass a value into forward() or backward() when it is called without using ctx?
The recommended way to do this is to pass what you used to give to __init__ to the forward function instead, and add the corresponding number of None values to backward's return.
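As a minimal sketch of that pattern: the class name Scale and the argument factor below are hypothetical and only stand in for whatever used to be passed to __init__. The extra argument goes to forward, and backward returns one None for it.

```python
import torch

class Scale(torch.autograd.Function):
    """Hypothetical example: multiplies the input by a plain Python float."""

    @staticmethod
    def forward(ctx, inp, factor):
        # `factor` would have been an __init__ argument in the old style.
        # Non-tensor values can be stored as plain attributes on ctx.
        ctx.factor = factor
        return inp * factor

    @staticmethod
    def backward(ctx, grad_out):
        # One return value per forward input:
        # a gradient for `inp`, and None for the non-differentiable `factor`.
        return grad_out * ctx.factor, None

x = torch.ones(3, requires_grad=True)
y = Scale.apply(x, 2.5)
y.sum().backward()
print(x.grad)
```

Note that the function is invoked through Scale.apply rather than by instantiating the class.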
Now, if you want to access the ctx, note that this is Python, so you can do whatever you want (like saving it in a global during forward), but that is not recommended.
Do you have more details (and a small code sample) of what you want to do?
Unfortunately the value that is needed for the call to backward is not calculated until after the forward function. So that would not work.
A global could possibly work, though I would then need to keep track of a separate ctx, plus some kind of intelligent indexing, for every tensor that uses this function, correct?
Essentially what I want to do is that the forward pass is just used to put the tensor in the autograd graph, and then the backward pass does its own thing that’s unrelated to the 0 gradient to calculate the gradient using an external value. Here is a cropped piece of code which had been working in 0.3:
    class tagger(torch.autograd.Function):
        def forward(self, inp):
            return inp.clone() * 0

        def backward(self, grad_out):
            grad_out = self.valueThatGetsFilledInOutsideThisFunction
            .... math ....
            return grad_in
One thing I have just tried that appears to allow this to work is the following. Does this seem like a reasonable solution or is this potentially dangerous:
    class tagger2(torch.autograd.Function):
        @staticmethod
        def forward(ctx, inp, temp):
            ctx.save_for_backward(temp)
            return inp.clone() * 0, temp

        @staticmethod
        def backward(ctx, grad_out, temp_gradout):
            print(ctx.saved_tensors)

    anotherFunction():
        applier = self.taggers[i].apply
        temp = torch.Tensor(1)
        data[i], tempOut = applier(data[i], temp)
        temp.data = 12345
This does make the backward pass print 12345, so I could easily just make a temps array the same size as the taggers array. But I don't know for sure if it's OK to modify something with .data. If this does work, should I also return None rather than returning temp, which is meaningless? And do I still need an array of the taggers, or, now that they are static functions, do I no longer need to keep track of them?
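For comparison, here is a hedged sketch of one alternative to mutating .data: pass a small mutable Python object into forward and keep a reference to it on ctx. The names Holder and Tagger are hypothetical and not part of PyTorch, and this is only an illustration under those assumptions, not a confirmed recommendation from the thread.

```python
import torch

class Holder:
    """Hypothetical container for a value that is filled in after forward."""
    def __init__(self):
        self.value = None

class Tagger(torch.autograd.Function):
    @staticmethod
    def forward(ctx, inp, holder):
        # Store a reference (not a copy), so later changes to the holder
        # are visible when backward runs.
        ctx.holder = holder
        return inp.clone() * 0

    @staticmethod
    def backward(ctx, grad_out):
        # Ignore the incoming gradient and build one from the external value.
        grad_in = torch.full_like(grad_out, ctx.holder.value)
        # None is returned for the non-tensor `holder` argument.
        return grad_in, None

holder = Holder()
x = torch.ones(4, requires_grad=True)
out = Tagger.apply(x, holder)

holder.value = 12345.0  # set after forward, before backward
out.sum().backward()
print(x.grad)
```

Because forward and backward are static, Tagger.apply can be called directly; in this sketch only the per-call Holder objects would need to be tracked, not instances of the Function.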
If you just want to use a different grad_output from the provided one, you can just use a hook on the output:
output.register_hook(lambda grad: new_val)
Do this in your forward pass, in whatever function uses your custom autograd.Function.
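A small self-contained sketch of the hook approach follows. The multiplication by 2 stands in for the custom Function, and the replacement value new_val is an arbitrary assumption chosen for illustration.

```python
import torch

x = torch.ones(3, requires_grad=True)
output = x * 2  # stand-in for the output of a custom autograd.Function

# Hypothetical replacement gradient; it must match the shape of `output`.
new_val = torch.full((3,), 10.0)

# The hook receives the gradient flowing into `output` and returns
# the tensor that should be used in its place.
output.register_hook(lambda grad: new_val)

output.sum().backward()
print(x.grad)
```

Here the gradient that reaches x is new_val multiplied by 2, since the replaced gradient is still propagated through the multiplication.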
Looks like that worked! Thanks!! Unfortunately, it only solves the main problem of making grad_out, the actual input to backward, what I want. A big part of the math used member variables that need to be modified both inside and outside the call to backward(). I started a new thread, since it seems like a slightly different topic now: How to transition to functions not being allowed to have member variables