# Correct way to do backpropagation through time?

This is in the context of model-based reinforcement learning. Say I have some reward at time T and I want to do truncated backprop through the network rollout. What is the best way to do this? Are there any good examples out there? I haven’t managed to find much.

Any help would be appreciated!

```python
# non-truncated
for t in range(T):
    out = model(out)
out.backward()

# truncated to the last K timesteps
for t in range(T):
    out = model(out)
    if T - t == K:
        out = out.detach()  # detach() is not in-place, so reassign
out.backward()
```
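To make the pseudocode above concrete, here is a minimal runnable sketch under some assumptions not in the post: a tiny `nn.Linear` stands in for `model`, the rollout feeds the output back in as the next input, and the "reward" at time T is just `out.sum()`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
T, K = 10, 3
model = nn.Linear(4, 4)  # toy stand-in for the rollout model

out = torch.randn(1, 4)  # initial state
for t in range(T):
    out = model(out)
    if T - t == K:
        out = out.detach()  # detach() is not in-place: reassign!

loss = out.sum()  # stand-in for the reward at time T
loss.backward()   # gradients flow only through the steps after the cut
print(model.weight.grad.shape)  # torch.Size([4, 4])
```

The key point is the reassignment: `out.detach()` returns a new tensor that shares storage but is cut from the graph; calling it without reassigning does nothing to the rollout.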

Shouldn’t the truncated example be

```python
# truncated to the last K timesteps
for t in range(T):
    out = model(out)
    if T - t == K:
        out.backward()       # backprop through the first T - K steps
        out = out.detach()   # detach() is not in-place, so reassign
out.backward()               # backprop through the last K steps
```
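One way to convince yourself that the `detach()` really cuts the graph is to check whether any gradient reaches the initial state. A small sketch, assuming a toy linear model and a scalar loss at the end of the rollout (names are illustrative, not from the posts above):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
T, K = 8, 2
model = nn.Linear(3, 3)
x0 = torch.randn(1, 3, requires_grad=True)

# full BPTT: the gradient reaches the initial state x0
out = x0
for t in range(T):
    out = model(out)
out.sum().backward()
print(x0.grad is not None)  # True

# truncated: the detach blocks gradient flow past step T - K
x0.grad = None
out = x0
for t in range(T):
    out = model(out)
    if T - t == K:
        out = out.detach()  # cut the graph here
out.sum().backward()
print(x0.grad)  # None: nothing flowed past the cut
```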

Yeah, you’re right. I was just backwarding through the last part.

I guess an alternative is to do a “head” truncation besides a “tail” truncation.

This is tail truncation –

```python
modelparameter.requires_grad = False
for t in range(T):
    out = model(out)
    if T - t == K:
        out.backward()
```
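An alternative way to keep gradients out of part of the rollout, instead of detaching the state, is to toggle `requires_grad` on the parameters. This is only a sketch under that interpretation, with a toy linear model and illustrative names: the parameters are frozen for the first T - K steps and unfrozen for the last K.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
T, K = 6, 2
model = nn.Linear(3, 3)

# freeze the parameters during the "head" of the rollout
for p in model.parameters():
    p.requires_grad_(False)

out = torch.randn(1, 3)
for t in range(T):
    if T - t == K:
        for p in model.parameters():
            p.requires_grad_(True)  # unfreeze for the last K steps
    out = model(out)

out.sum().backward()
print(model.weight.grad is not None)  # True: grad from the last K steps only
```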

I have a similar issue in this post.

I followed the pseudocode for the non-truncated BPTT in this conversation; the network trains, but I have the feeling that the gradient is not flowing through time. I posted my training code for the network.

Can someone give some tips?

Check out hooks. If you want to inspect a gradient, you can register a backward hook and drop the values into a print statement or TensorBoard.

E.g., in the code below I drop a hook to monitor the values passing through a softmax function (later I compute the entropy and pump it into TensorBoard).

```python
def monitorAttention(self, input, output):
    if writer.global_step % 10 == 0:
        monitors.monitorSoftmax(self, input, output, ' input ', writer, dim=1)

self.softmax.register_forward_hook(monitorAttention)
```
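The hook above is a forward hook, so it sees activations; to inspect gradients you register a backward hook instead, via `register_full_backward_hook`. A small self-contained sketch (layer and names are illustrative, not from the post above):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(4, 2)
grad_norms = []

def monitor_grad(module, grad_input, grad_output):
    # grad_output holds the gradient w.r.t. the layer's output
    grad_norms.append(grad_output[0].norm().item())

layer.register_full_backward_hook(monitor_grad)

out = layer(torch.randn(3, 4))
out.sum().backward()
print(len(grad_norms))  # 1: the hook fired once during backward
```

Logging these norms per timestep is a quick way to see whether the gradient is actually flowing back through the rollout or vanishing after a few steps.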

@riccardosamperna did you find the correct way to solve this problem? I followed the two posts you mentioned but found no solution.

Does this discussion help?