I traced it back from the output; inside the model I have this recurrent unit:
for i in range(timesteps):
    indices = torch.LongTensor([i])
    #ids = torch.LongTensor([1]).cuda()
    pyramidal1, self.state = self.unit(torch.squeeze(x[:,i,...], 1), self.state)
    print(pyramidal1.volatile)
    print("Timesteps: ", i)
The interesting thing is that it becomes volatile at the second timestep, so is there something wrong with the way I am handling the data?
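For context on why this tends to show up exactly one step late: in pre-0.4 PyTorch, `volatile` was "infectious" — if any input to an operation was volatile, the output was volatile too. Since the loop feeds `self.state` back into the next call, a volatile state produced at step 0 taints every later step. The sketch below (not the code above; hypothetical shapes) illustrates the same feedback-propagation mechanism using `requires_grad`, which propagates the same way in current PyTorch:

```python
import torch

# A flagged hidden state carried across timesteps flags every later output,
# because each step's output becomes the next step's input.
state = torch.zeros(1, requires_grad=True)  # flag set once, at the start
x = torch.ones(5, 1)                        # inputs are unflagged

for i in range(5):
    out = x[i] * state      # any flagged input flags the output
    state = out             # feeding the output back carries the flag forward
    print(i, out.requires_grad)  # True at every step
```

So the place to look is whatever produces `self.state` (or `x`) before the loop: if either becomes volatile there, the recurrence will carry it into all subsequent timesteps.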