Here are the details:

```
sim_1 = torch.bmm(output, output_pos.transpose(1, 2)) # both output and output_pos have size of (6, 25, 100), so sim_1.shape = (6, 25, 25)
```

Then I want to apply `scipy.optimize.linear_sum_assignment` to each of the 6 matrices and sum the elements it selects, giving one value per matrix. But I can only think of a **for-loop** way to do it:

```
for i in range(sim_1.shape[0]):
    row_ind, col_ind = linear_sum_assignment(-sim_1[i].cpu().data.numpy())
    sim_1[i] = sim_1[i, row_ind, col_ind].sum().view(1, -1)  # This step failed. I want sim_1.shape = (6, 1) at last.
```

My torch version is 0.4. I want to ask, is there a correct way to do it? Thanks!
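
One possible approach, as a minimal sketch rather than a definitive answer: assigning into `sim_1[i]` cannot change the shape of `sim_1`, so the per-matrix sums can be collected in a list and stacked afterwards. The helper name `assignment_scores` and the random inputs are assumptions made for illustration. The loop over the batch stays, since `linear_sum_assignment` handles one 2-D cost matrix per call.

```python
import torch
from scipy.optimize import linear_sum_assignment


def assignment_scores(sim):
    """Return a (B, 1) tensor: the sum of optimally assigned entries per matrix.

    `assignment_scores` is a hypothetical helper name, not part of any library.
    """
    scores = []
    for i in range(sim.shape[0]):
        # SciPy minimizes cost, so negate the similarities to maximize them.
        cost = -sim[i].detach().cpu().numpy()
        row_ind, col_ind = linear_sum_assignment(cost)
        # Convert the NumPy indices to LongTensors on the same device as `sim`.
        rows = torch.from_numpy(row_ind).long().to(sim.device)
        cols = torch.from_numpy(col_ind).long().to(sim.device)
        # Indexing the original tensor keeps the sum differentiable.
        scores.append(sim[i, rows, cols].sum())
    # Stack the scalar sums into shape (B,), then reshape to (B, 1).
    return torch.stack(scores).view(-1, 1)


# Random stand-ins for the real `output` and `output_pos` (assumed data).
torch.manual_seed(0)
output = torch.randn(6, 25, 100)
output_pos = torch.randn(6, 25, 100)
sim_1 = torch.bmm(output, output_pos.transpose(1, 2))  # (6, 25, 25)

scores = assignment_scores(sim_1)
print(scores.shape)  # torch.Size([6, 1])
```

The result is stored in a new tensor (`scores`) instead of overwriting `sim_1`, which keeps the original similarity matrices available.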