Hello everyone,
I have a problem that requires both a cnn and an rnn, trained together end-to-end.
I have begun by using a dataloader on sequences of images, where each sequence gets a single tag at the end. The batch size is always one, so each sample is just a sequence of image tensors and a tag. I want to pass the images through a cnn, and then use the resulting 1 x n_features feature vector of each image as the input to the rnn.
The forward pass seems to work, but when I call loss.backward(), I get:
    loss.backward()
  File "/scratch/Dimitris/software/pytorch/pytorchenv/lib/python2.7/site-packages/torch/autograd/variable.py", line 146, in backward
    self._execution_engine.run_backward((self,), (gradient,), retain_variables)
  File "/scratch/Dimitris/software/pytorch/pytorchenv/lib/python2.7/site-packages/torch/nn/_functions/thnn/auto.py", line 175, in backward
    update_grad_input_fn(self._backend.library_state, input, grad_output, grad_input, *gi_args)
RuntimeError: out of range at /b/wheel/pytorch-src/torch/lib/THC/generic/THCTensor.c:23
The images of a sequence are combined into a single tensor, and I treat the sequence length as the batch size when passing that whole tensor through the cnn:
for img in imgs_in_play:
    img_tensor = t(PIL.Image.open(img))
    img_tensor.unsqueeze_(0)
    imgs_raw.append(img_tensor)
imgs = torch.cat(imgs_raw, 0)
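For reference, the unsqueeze-then-cat pattern above produces the same tensor as a single torch.stack call. A minimal sketch, assuming a recent PyTorch version; the random tensors and their sizes are hypothetical stand-ins for the transformed images:

```python
import torch

# Hypothetical stand-ins for the transformed images: 5 frames of shape (3, 8, 8).
frames = [torch.randn(3, 8, 8) for _ in range(5)]

# Pattern from the post: add a leading dimension to each frame, then concatenate.
imgs_cat = torch.cat([f.unsqueeze(0) for f in frames], 0)

# Equivalent single call: stack the frames along a new leading dimension.
imgs_stack = torch.stack(frames, 0)

print(imgs_stack.shape)
```

Either way, the leading dimension is the sequence length, which the cnn then sees as its batch dimension.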
Then the cnn output is passed into the rnn one time step at a time:
imgs = torch.squeeze(imgs, 0)
imgs = Variable(imgs.cuda())
output = vp.model(imgs)
output.data.unsqueeze_(1)
vp.rnn_model.zero_grad()
hidden_rnn = vp.rnn_model.init_hidden()
for i in range(output.size()[0]):
    output_rnn, hidden_rnn = vp.rnn_model(output[i], hidden_rnn)
I suspect that concatenating all the images and treating them as one batch for the cnn causes trouble when the gradients are computed back through the rnn, but I am not sure what the proper way is to connect the ~100 cnn outputs to the inputs of the rnn.
All tensor shapes are as expected: the final cnn output before the rnn is (sequence length x 1 x 2048), and output_rnn contains 13 tag predictions, one per sequence.
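The approach described above, where the sequence length serves as the cnn batch dimension and the features are then fed to the rnn, can be checked for end-to-end gradient flow in isolation. A minimal sketch, assuming a recent PyTorch version (the post itself uses the older Variable API); the toy cnn, the GRU, the linear head, and all sizes are hypothetical stand-ins for vp.model and vp.rnn_model, and the sketch uses an out-of-place unsqueeze instead of editing output.data:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Arbitrary sizes chosen for illustration only.
seq_len, n_features, n_hidden, n_tags = 6, 32, 16, 13

# Toy cnn (stand-in for vp.model): maps (N, 3, 16, 16) -> (N, n_features).
cnn = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, n_features),
)
# Toy rnn (stand-in for vp.rnn_model) and a classifier head.
rnn = nn.GRU(n_features, n_hidden)
head = nn.Linear(n_hidden, n_tags)

imgs = torch.randn(seq_len, 3, 16, 16)  # one sequence, treated as a cnn batch
target = torch.tensor([4])              # one tag index for the whole sequence

features = cnn(imgs)                    # (seq_len, n_features)
rnn_in = features.unsqueeze(1)          # (seq_len, 1, n_features), tracked by autograd
rnn_out, hidden = rnn(rnn_in)           # rnn_out: (seq_len, 1, n_hidden)
logits = head(rnn_out[-1])              # (1, n_tags), from the last time step

loss = F.cross_entropy(logits, target)
loss.backward()

# The first conv layer receives a gradient, so both networks train together.
print(cnn[0].weight.grad is not None)
```

If the first conv layer has a gradient after backward, the batching trick itself is not what breaks the backward pass.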
Any help is appreciated!
EDIT: the convnet does not seem to be the problem, since creating random input tensors of the same shape gives the exact same error.
EDIT2: I was unsqueezing both the output_rnn tensor and the target_tensor, but output_rnn should not have been unsqueezed.
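A shape check before the loss call can surface this kind of mistake early, with a readable message instead of a low-level error. A minimal sketch, assuming a recent PyTorch version, a cross-entropy style loss, and that output_rnn holds one row of 13 tag scores per sequence; check_loss_shapes is a hypothetical helper, not part of PyTorch:

```python
import torch
import torch.nn.functional as F

def check_loss_shapes(logits, target):
    """Hypothetical helper: validate shapes for a per-sequence classification loss."""
    if logits.dim() != 2:
        raise ValueError(
            "expected logits of shape (batch, n_tags), got %s" % (tuple(logits.shape),)
        )
    if target.dim() != 1 or target.size(0) != logits.size(0):
        raise ValueError(
            "expected target of shape (batch,), got %s" % (tuple(target.shape),)
        )

# Stand-in for output_rnn: one sequence, 13 tag scores (an assumption about its layout).
logits = torch.randn(1, 13, requires_grad=True)
target = torch.tensor([4])  # one class index per sequence

check_loss_shapes(logits, target)  # passes: (1, 13) against (1,)
loss = F.cross_entropy(logits, target)
loss.backward()

# The mistake from EDIT2: an extra dimension on the network output.
try:
    check_loss_shapes(logits.unsqueeze(0), target)
    caught = False
except ValueError:
    caught = True
```

Here the correctly shaped pair goes through the loss and backward pass, while the unsqueezed output is rejected by the helper before it reaches the loss.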
Dimitris