I finally finished tuning my model, and it was running fairly fast, about 23 s per epoch. Now I want to compute the model's metrics, so I concatenated the results of every batch in order to do the math once the loop ends, but the epoch time jumped to 40 s. I suspect it's because (a) I'm calling torch.cat on every batch, and/or (b) the computation is too small to benefit from the GPU. I'm also not sure whether copying everything to the CPU would make things faster, because of the transfer overhead that would add.

mini-batch loop:

```
if y_true is None:
    # First batch: initialize the accumulators.
    y_true = y
    y_regression = torch.argmax(y_hat, dim=1)
else:
    # Later batches: grow the accumulators with torch.cat.
    y_true = torch.cat((y_true, y), dim=0)
    y_regression = torch.cat((y_regression, torch.argmax(y_hat, dim=1)), dim=0)
```
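For comparison, one common pattern is to append each batch's tensors to a Python list and concatenate once after the loop, which avoids reallocating a growing tensor on every batch. A minimal sketch, with hypothetical batch data standing in for the real loader:

```python
import torch

# Hypothetical stand-in for the real mini-batches: (labels, logits) pairs.
torch.manual_seed(0)
batches = [(torch.randint(0, 3, (8,)), torch.randn(8, 3)) for _ in range(5)]

y_true_parts, y_pred_parts = [], []
for y, y_hat in batches:
    # Collect per-batch results; detach so no autograd graph is kept alive.
    y_true_parts.append(y.detach())
    y_pred_parts.append(torch.argmax(y_hat, dim=1).detach())

# Single concatenation after the loop instead of one per batch.
y_true = torch.cat(y_true_parts, dim=0)
y_pred = torch.cat(y_pred_parts, dim=0)
print(y_true.shape, y_pred.shape)  # torch.Size([40]) torch.Size([40])
```

Appending to a list is O(1) per batch, so the total cost of accumulation becomes a single O(n) concatenation at the end rather than n of them.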

epoch loop, after batch loop ends:

```
# Epoch-level accuracy from the accumulated tensors.
acc = torch.sum(torch.eq(y_true, y_regression)) / y_true.numel()
print(f'Epoch: {epoch}, loss: {loss}, accuracy: {acc}')
```
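Since accuracy only needs the number of correct predictions, not the full prediction tensors, another option is to keep running scalar counts and skip the concatenation entirely. A sketch under the same hypothetical batch setup:

```python
import torch

# Hypothetical stand-in for the real mini-batches: (labels, logits) pairs.
torch.manual_seed(0)
batches = [(torch.randint(0, 3, (8,)), torch.randn(8, 3)) for _ in range(5)]

correct, total = 0, 0
for y, y_hat in batches:
    preds = torch.argmax(y_hat, dim=1)
    # .item() copies a single scalar to the CPU, which is much cheaper
    # than transferring or concatenating whole tensors every batch.
    correct += (preds == y).sum().item()
    total += y.numel()

acc = correct / total
print(f'accuracy: {acc:.3f}')
```

This keeps the per-batch bookkeeping to two integer updates, so the metric computation adds almost nothing to the epoch time regardless of dataset size.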