A better way to slice


I use an RNN model to process video frame by frame such that the forward looks something like this:

for frame in range(x.shape[1]):
    out1, hidden = self.rnn_layer1(x[:,frame,:,:], hidden)
return out1

But that code raises the error “one of the variables needed for gradient computation has been modified by an inplace operation”.

After some trial and error I realized that the problem is that the sliced frame still holds gradient information of the whole video. I use an ugly hack that works just fine, but I’m looking for a cleaner way to do this. Here is the hack:

for frame in range(x.shape[1]):
    out1, hidden = self.rnn_layer1(torch.cat([x[:, frame, :, :]], dim=1).unsqueeze_(1), hidden)
return out1

Essentially concatenating the tensor to nothing…

Can you make sure your code is correct? It looks like your torch.cat call includes hidden as the second argument.

Although your actual code does not seem to work (as @aplassard mentioned), this should work:

x_perm = x.transpose(0,1)
for frame in x_perm.split(1):
    out1, hidden = self.rnn_layer1(frame.squeeze(0), hidden) 
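For what it's worth, torch.unbind should yield the same per-frame tensors as the transpose/split/squeeze combination above. Here is a small standalone check; the tensor sizes are made up for illustration and no RNN layer is involved:

```python
import torch

# Hypothetical input: 2 videos, 5 frames, each frame 3x4.
x = torch.randn(2, 5, 3, 4)

# Approach from the snippet above: move the frame axis to the front,
# split it into chunks of size 1, then drop the leading singleton axis.
frames_split = [f.squeeze(0) for f in x.transpose(0, 1).split(1)]

# unbind removes the given dimension and returns one tensor per index.
frames_unbind = list(x.unbind(dim=1))

# Both approaches should produce identical frames of shape (2, 3, 4).
same = all(torch.equal(a, b) for a, b in zip(frames_split, frames_unbind))
print(len(frames_unbind), frames_unbind[0].shape, same)
```

Note that both variants return views of x, so on their own they do not copy any data.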

Edit: what your hack actually does is allocate new storage and copy the tensor. Given that, a simple x[:,frame,:,:].clone() or x[:,frame,:,:].contiguous() might do the trick as well.
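
A minimal sketch of the view-versus-copy difference, assuming a made-up input shape (this is not the original poster's model): a plain slice shares storage with x, while clone() allocates new storage and still lets gradients flow back to x.

```python
import torch

# Hypothetical input: 2 videos, 5 frames, each frame 3x4.
x = torch.randn(2, 5, 3, 4, requires_grad=True)

view = x[:, 0, :, :]           # basic indexing returns a view of x
copy = x[:, 0, :, :].clone()   # clone() allocates new storage

# Frame 0 starts at storage offset 0, so the view points at x's memory.
view_shares = view.data_ptr() == x.data_ptr()
copy_shares = copy.data_ptr() == x.data_ptr()

# clone() is differentiable: the gradient reaches frame 0 of x only.
copy.sum().backward()
grad_frame0 = x.grad[:, 0].sum().item()
grad_rest = x.grad[:, 1:].abs().sum().item()
print(view_shares, copy_shares, grad_frame0, grad_rest)
```

In the forward loop this would correspond to passing x[:, frame, :, :].clone() to the layer instead of the bare slice.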