I'm trying to extract multiple captions from a simple CNN+LSTM captioning model, each with a single probability (one probability = one caption). Instead, I'm getting multiple probabilities for one caption, one per generated word.

Main model code:

```
sampled_ids = []
inputs = features.unsqueeze(1)
for i in range(self.max_seg_length):
    hiddens, states = self.lstm(inputs, states)   # hiddens: (batch_size, 1, hidden_size)
    outputs = self.linear(hiddens.squeeze(1))     # outputs: (batch_size, vocab_size)
    _, predicted = outputs.max(1)                 # predicted: (batch_size)
    # My added code starts here
    sm = torch.nn.Softmax(dim=1)
    # update
    outputs1 = sm(outputs)                        # outputs1: (1, 9956)
    top1_prob, top1_label = torch.topk(outputs1, 1)
    #print(top1_label)
    #print('--------')
    print(top1_prob)                              # printed once per time step
    sampled_ids.append(predicted)
    inputs = self.embed(predicted)                # inputs: (batch_size, embed_size)
    inputs = inputs.unsqueeze(1)                  # inputs: (batch_size, 1, embed_size)
sampled_ids = torch.stack(sampled_ids, 1)         # sampled_ids: (batch_size, max_seq_length)
return sampled_ids
```

Result for just the top-1 caption (one value printed per time step):

```
tensor([ 0.9998])
tensor([ 0.4791])
tensor([ 0.3699])
tensor([ 0.9963])
tensor([ 0.5529])
tensor([ 0.1465])
tensor([ 0.2513])
tensor([ 0.9950])
tensor([ 0.7264])
tensor([ 0.9951])
tensor([ 0.3070])
tensor([ 0.9992])
tensor([ 0.4416])
tensor([ 0.9996])
tensor([ 0.4754])
tensor([ 0.6424])
tensor([ 0.9996])
tensor([ 0.5170])
tensor([ 0.5675])
tensor([ 0.9996])
```
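For reference, here is a minimal sketch of one common way to collapse these per-step values into a single score per caption: sum the log-probabilities of the chosen words, which is the log of their product. It uses plain Python on the numbers printed above. The helper name `caption_log_prob` is hypothetical, and the sketch assumes all 20 steps belong to the caption. In practice you would probably stop accumulating once the end token is produced.

```python
import math

# Per-step top-1 probabilities copied from the printed output above.
step_probs = [
    0.9998, 0.4791, 0.3699, 0.9963, 0.5529, 0.1465, 0.2513,
    0.9950, 0.7264, 0.9951, 0.3070, 0.9992, 0.4416, 0.9996,
    0.4754, 0.6424, 0.9996, 0.5170, 0.5675, 0.9996,
]


def caption_log_prob(probs):
    """Hypothetical helper: sum of per-word log-probabilities."""
    return sum(math.log(p) for p in probs)


log_prob = caption_log_prob(step_probs)
caption_prob = math.exp(log_prob)           # joint probability of the whole caption
avg_log_prob = log_prob / len(step_probs)   # length-normalized score

print(f"caption log-prob:      {log_prob:.4f}")
print(f"caption probability:   {caption_prob:.6e}")
print(f"avg log-prob per word: {avg_log_prob:.4f}")
```

Inside the sampling loop, the same idea would mean accumulating `torch.log(top1_prob)` at each step instead of printing it. Summing logs rather than multiplying raw probabilities avoids numerical underflow on long captions, and dividing by the length keeps longer captions from being penalized just for having more words.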