Out of range exception in autograd

I have implemented a recurrent neural network module with an adaptive number of steps, driven by a sub-network (the termination network) that outputs the probability that the network should terminate. When I try to backprop through the sampled actions using REINFORCE, I get an index out of range exception.

The main part of the forward pass looks like this:

        s_t = query_embedding.float()
        c_t = Variable(torch.zeros((1, query_embedding.size()[1]))).float()

        if self.use_cuda:
            c_t = c_t.cuda()

        for _ in range(self.terminating_length):

            terminated = self.termination_module(s_t).squeeze()
            terminated_dist = torch.cat([terminated, 1 - terminated], dim=0)
            action = terminated_dist.multinomial()

            if action == 0:
                return self.answer_module(s_t)

            attended = self.attention_memory_module(s_t)

            joined = torch.cat([attended, s_t], dim=1)
            s_t, c_t = self.internal_rnn_cell(joined, (s_t, c_t))

        return self.answer_module(s_t)
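In plain NumPy terms, the per-step termination decision boils down to sampling from a two-way categorical distribution built from the termination probability (this is just a sketch with made-up numbers, not my actual module):

```python
import numpy as np

p_terminate = 0.3  # hypothetical output of the termination network at one step

# Analogous to torch.cat([terminated, 1 - terminated]) in the forward pass above.
dist = np.array([p_terminate, 1.0 - p_terminate])

# Analogous to terminated_dist.multinomial(): 0 -> answer now, 1 -> keep iterating.
action = np.random.choice(2, p=dist)
```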

For training I am doing:

    result = model.forward(question_variable, context_variable)
    loss = criterion(result.view(-1, vocab_size), answer_variable)
    rewards = torch.from_numpy(np.array(
        [0] * (len(model.saved_actions) - 1) + [POS_REWARD / loss.data.numpy()[0]],
        dtype=np.float32))

    for action, reward in zip(model.saved_actions, rewards):
        action.reinforce(reward)

    autograd.backward(model.saved_actions,
                      [None] * len(model.saved_actions), retain_variables=False)
    print(total_loss / args.batch_size)
    print("Example Sentence: ", ' '.join(one_hot_distribution_to_tokens(
        tokenizer, result.cpu().data.numpy())[0]))
    count += 1
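The intent of the reward construction is zero reward for every intermediate step and a single inverse-loss reward at the final step. With made-up numbers (the constants here are hypothetical, not from my actual run) it produces:

```python
import numpy as np

POS_REWARD = 1.0   # hypothetical reward constant from the training script
loss_value = 0.5   # hypothetical scalar loss
num_actions = 4    # hypothetical number of saved termination actions

# Zero for every intermediate step, POS_REWARD / loss at the final step.
rewards = np.array([0.0] * (num_actions - 1) + [POS_REWARD / loss_value],
                   dtype=np.float32)
# rewards -> [0., 0., 0., 2.]
```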

I am getting the following autograd error:

 File "sandbox.py", line 111, in <module>
    None] * len(model.saved_actions), retain_variables=False)
  File "/home/araghaja/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 42, in backward
    tuple(variables), tuple(grad_variables), retain_variables)
  File "/home/araghaja/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 314, in backward
    in zip(self.input_sizes, _accumulate(self.input_sizes)))
  File "/home/araghaja/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 313, in <genexpr>
    return tuple(grad_output.narrow(self.dim, end - size, size) for size, end
RuntimeError: out of range at /py/conda-bld/pytorch_1493674854206/work/torch/lib/TH/generic/THTensor.c:385

Any help would be appreciated!