I am trying to build a text summarization model using a GRU as the encoder and a pointer-generator network (PGN) as the decoder, but I am getting an error in the final distribution calculation. Can anyone please help me fix this error? I would be very thankful. Thank you.

    def forward(self, enc_input, enc_input_ext, dec_input, target=None, teacher_forcing_ratio=0.5):
        enc_output, enc_hidden = self.encoder(enc_input)

        dec_hidden = enc_hidden
        batch_size, seq_len = dec_input.size()
        outputs = torch.zeros(batch_size, seq_len, self.vocab_size).to(
            enc_input.device)  # Adjust self.vocab_size accordingly

        if self.training:
            loss = 0
            for t in range(seq_len - 1):  # Assuming dec_input includes <sos> and <eos>
                dec_input_t = dec_input[:, t].unsqueeze(1)
                true_output = target[:, t + 1]
                p_vocab, dec_hidden, p_gen = self.decoder(dec_input_t, dec_hidden, enc_output)
                context_vector, attn = self.attention(dec_hidden, enc_output)
                final_dist = self.get_final_distribution(enc_input_ext, p_gen, p_vocab, attn, self.max_oov)  # Simplified

                loss += f.cross_entropy(final_dist, true_output, ignore_index=self.pad_token_id)
            return loss / (seq_len - 1)

    def get_final_distribution(self, x, p_gen, p_vocab, attention_weights, max_oov):
        batch_size = x.size(0)
        # Clip the probabilities to avoid log(0) in loss computation
        p_gen = torch.clamp(p_gen, 0.001, 0.999)
        p_vocab_weighted = p_gen * p_vocab
        attention_weighted = (1 - p_gen) * attention_weights

        extension = torch.zeros((batch_size, max_oov)).float().to(x.device)
        p_vocab_extended = torch.cat([p_vocab_weighted, extension], dim=1)
        final_distribution = p_vocab_extended.scatter_add_(1, x, attention_weighted)

        return final_distribution

Here is the error in brief:
Traceback (most recent call last):
  File "/Users/sagar/PycharmProjects/TouchFYP/V2_main.py", line 201, in <module>
    train(model, train_loader,val_loader, optimizer, criterion,scheduler, device,5, )
  File "/Users/sagar/PycharmProjects/TouchFYP/V2_main.py", line 167, in train
    loss, _ = model(articles, ext_enc_inp, summaries[:, :-1], summaries[:, 1:], teacher_forcing_ratio=0.5)
  File "/Users/sagar/PycharmProjects/TouchFYP/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/sagar/PycharmProjects/TouchFYP/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/sagar/PycharmProjects/TouchFYP/my_V2.py", line 283, in forward
    final_dist = self.get_final_distribution(enc_input_ext, p_gen, p_vocab, attn, self.max_oov,
  File "/Users/sagar/PycharmProjects/TouchFYP/my_V2.py", line 359, in get_final_distribution
    attn_dist_extended[b, vocab_idx] += attention_weighted[b, idx]
IndexError: index 695 is out of bounds for dimension 1 with size 695

This indexing operation fails:

attn_dist_extended[b, vocab_idx] += attention_weighted[b, idx]

so you should check the shapes of the used tensors as well as the min/max values of the index tensors.
Based on the stack trace, this operation is called from final_dist = self.get_final_distribution(enc_input_ext, p_gen, p_vocab, attn, self.max_oov,...), but the failing line is not part of the code snippet you posted.
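
As a minimal sketch of that check, assuming p_vocab has shape [batch, vocab_size] and the extended-vocabulary ids in enc_input_ext are meant to lie in [0, vocab_size + max_oov), a helper like the hypothetical check_scatter_inputs below reports out-of-range indices before the scatter step runs. The helper name and all sizes are made up for illustration; they are not taken from the posted model.

```python
import torch


def check_scatter_inputs(index, src, target_width):
    """Validate the inputs of a scatter_add along dim=1.

    Hypothetical helper, not part of PyTorch or the posted model.
    index:        LongTensor [batch, src_len] with extended-vocabulary ids
    src:          FloatTensor [batch, src_len] with the weighted attention
    target_width: size of dim 1 of the tensor that is scattered into
    """
    # Stricter than PyTorch requires, but it matches this use case:
    # one attention weight per source token id.
    if index.shape != src.shape:
        raise ValueError(
            f"shape mismatch: index {tuple(index.shape)} vs src {tuple(src.shape)}"
        )
    lo, hi = int(index.min()), int(index.max())
    if lo < 0 or hi >= target_width:
        raise IndexError(
            f"index range [{lo}, {hi}] does not fit into width {target_width}"
        )
    return lo, hi


# Made-up sizes: 690 vocabulary entries plus room for 5 OOV words -> width 695.
vocab_size, max_oov = 690, 5
batch_size, src_len = 2, 7

p_vocab_extended = torch.zeros(batch_size, vocab_size + max_oov)
attention_weighted = torch.full((batch_size, src_len), 0.1)
enc_input_ext = torch.randint(0, vocab_size + max_oov, (batch_size, src_len))

# Valid ids pass the check, so the scatter is safe to run.
check_scatter_inputs(enc_input_ext, attention_weighted, p_vocab_extended.size(1))
final_dist = p_vocab_extended.scatter_add(1, enc_input_ext, attention_weighted)

# An id equal to the width (695 here) is exactly one past the last valid index.
bad_index = enc_input_ext.clone()
bad_index[0, 0] = vocab_size + max_oov
try:
    check_scatter_inputs(bad_index, attention_weighted, p_vocab_extended.size(1))
except IndexError as err:
    print(err)
```

If the check fails in the real model, the usual suspects are a max_oov that is smaller than the number of OOV words in the batch, or ids that were built against a different vocabulary size than the one used for p_vocab.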

Thank you for the help. It helped me solve this problem, but now I am getting a new error. I have made some updates to the final distribution calculation. If you have time, could you review my code to see what I am doing wrong? I can share the GitHub link to the code if you want. I don't have prior experience in NLP; this is my first time working with it and I am having a tough time.
It would be very helpful. Thank you.

Sure, feel free to follow up with the new error and the corresponding code.

Here is the link to my model code on GitHub: Text-Summerization/model.py at main · Sagar456-b/Text-Summerization · GitHub
The error I am facing is:

  File "/Users/sagar/PycharmProjects/TouchFYP/V2_main.py", line 252, in <module>
    train(model, train_loader,val_loader, optimizer, criterion,scheduler, device,5, )
  File "/Users/sagar/PycharmProjects/TouchFYP/V2_main.py", line 218, in train
    loss, _ = model(articles, ext_enc_inp, summaries[:, :-1], summaries[:, 1:], teacher_forcing_ratio=0.5)
  File "/Users/sagar/PycharmProjects/TouchFYP/venv/lib/python3.9/site-packages/torch/_tensor.py", line 1022, in __iter__
    raise TypeError("iteration over a 0-d tensor")
TypeError: iteration over a 0-d tensor

I don’t know which tensor you are trying to iterate, but the error is raised if a 0-dim tensor is used in e.g. a for loop:

>>> import torch
>>> x = torch.randn(1)
>>> for i in x:
...     print(i)
>>> for i in x[0]:
...     print(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/torch/_tensor.py", line 1040, in __iter__
    raise TypeError("iteration over a 0-d tensor")
TypeError: iteration over a 0-d tensor

Thank you, this has been a great help.
I found that the error goes away once I remove the underscore (_) from
loss, _ = model(articles, ext_enc_inp, summaries[:, :-1], summaries[:, 1:], teacher_forcing_ratio=0.5). Do you know the reason behind that?