Hi @albanD,
I went through Implementing Truncated Backpropagation Through Time as you suggested, but I am a little confused about this part:
> this is not retro-active: you need to detach() (or detach_() which is just a shorthand for `x = x.detach()`) before using the Tensor, otherwise it won't have any effect.
So let me rephrase my question and see if it makes sense:
How do I detach only certain indices/slices of the original Tensor? The reason I am asking is that, at a given training iteration, I only want to track gradient updates corresponding to the last `k1` batches of inputs (`x`'s). So I am interested in keeping `requires_grad=True` only for those inputs (`x`) and having `requires_grad=False` for the rest of the inputs.
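
To make the question concrete, here is a minimal sketch of the kind of thing I mean (the shape, `k1` value, and variable names are just placeholders): build a new tensor where the earlier slices are `detach()`-ed and only the last `k1` slices keep their gradient history, then use that tensor in the forward pass.

```python
import torch

x = torch.randn(5, 3, requires_grad=True)
k1 = 2  # hypothetical: keep gradient tracking only for the last k1 rows

# Concatenate a detached copy of the early slices with the live last k1
# slices; gradients will only flow back through the last k1 rows of x.
x_mixed = torch.cat([x[:-k1].detach(), x[-k1:]], dim=0)

loss = x_mixed.sum()
loss.backward()

# x.grad is zero for the detached rows and one for the last k1 rows
print(x.grad)
```

I realize this creates a new tensor rather than truly detaching part of `x` in place, which is why I am asking whether there is a more direct way.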