CNN prediction - CPU or GPU?

I finally finished tuning my model and it was running pretty fast, about 23 s per epoch. Now I want to compute the model's metrics, so I concatenate the results of every batch and do the math once the batch loop ends, but that pushed the time up to 40 s per epoch. I think it's because (a) I call torch.cat in every iteration, or (b) the computation is too small to be worth doing on the GPU. I'm not sure whether copying everything to the CPU would make things faster, because then I'd pay the overhead of the GPU-to-CPU transfers.

mini-batch loop:


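      # Accumulate labels and predicted classes across mini-batches.
      if y_true is None:  # first batch; use an identity check rather than `== None`
        y_true = y
        y_regression = torch.argmax(y_hat, dim=1)
      else:
        # torch.cat allocates a new tensor and copies all previous results on every call
        y_true = torch.cat((y_true, y), dim=0)
        y_regression = torch.cat((y_regression, torch.argmax(y_hat, dim=1)), dim=0)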

epoch loop, after batch loop ends:


    acc = torch.sum(torch.eq(y_true,y_regression))/y_true.numel()
    print(f'Epoch: {epoch}, loss: {loss}, accuracy: {acc}')

Instead of calling torch.cat in every iteration, append the per-batch results to a Python list and create the tensor once after the loop has finished. This should speed up the code, since only one copy is triggered instead of one per iteration.
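A minimal sketch of that suggestion, assuming the loop yields (y_hat, y) pairs as in the question. The helper names collect_predictions and accuracy are hypothetical, and the random batches in the demo only stand in for real model output:

```python
import torch


def collect_predictions(batches):
    """Gather labels and predicted classes with a single torch.cat per tensor.

    `batches` is any iterable of (y_hat, y) pairs; it is a hypothetical
    stand-in for the mini-batch loop in the question.
    """
    true_chunks, pred_chunks = [], []
    for y_hat, y in batches:
        # Appending to a list only stores a reference; nothing is copied here.
        true_chunks.append(y)
        pred_chunks.append(torch.argmax(y_hat, dim=1))
    # One concatenation after the loop instead of one per iteration.
    y_true = torch.cat(true_chunks, dim=0)
    y_pred = torch.cat(pred_chunks, dim=0)
    return y_true, y_pred


def accuracy(y_true, y_pred):
    """Fraction of matching entries, returned as a Python float."""
    return (y_true == y_pred).float().mean().item()


if __name__ == "__main__":
    # Fake data: 4 batches of 8 samples with 3 classes (assumed shapes).
    torch.manual_seed(0)
    fake_batches = [(torch.randn(8, 3), torch.randint(0, 3, (8,))) for _ in range(4)]
    y_true, y_pred = collect_predictions(fake_batches)
    print(f"accuracy: {accuracy(y_true, y_pred):.3f}")
```

The tensors stay on whatever device they were created on, so nothing is moved to the CPU until .item() is called once at the end of the epoch.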

That cut about 1.5 s per epoch. Not a huge improvement, but every bit helps. Thanks for your help!