I’ve trained a model on ~28,000 samples and am wondering what the recommended / efficient way to run predictions on that many samples is. If the input samples are in a tensor called x_data, I’ve found that simply running:
model.eval()
with torch.no_grad():
    pred = model(x_data)
results in very high memory growth and doesn’t complete. I’m able to run the above with batches of 1,000 samples inside a loop using tensor slicing (roughly the loop sketched below), but it feels rather inelegant.
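For reference, the slicing loop I’m currently using looks roughly like this (the batch size of 1,000 is arbitrary and the variable names are just placeholders):

model.eval()
batch_size = 1000  # chunk size that comfortably fits in memory
preds = []
with torch.no_grad():
    for i in range(0, x_data.shape[0], batch_size):
        batch = x_data[i:i + batch_size]   # slice out one chunk of inputs
        preds.append(model(batch))         # forward pass on just this chunk
pred = torch.cat(preds, dim=0)             # reassemble into one prediction tensor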
I’m curious to understand both why the memory footprint grows well beyond what the “pred” tensor would require for this number of samples, and whether there’s a standard practice here, such as using a DataLoader with a batch size. I went through the tutorial, but it only covers the workflow up through saving the trained model.
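In case it helps clarify what I’m asking, my guess is that the idiomatic version would look something like the following, wrapping x_data in a TensorDataset and iterating with a DataLoader (the batch size and variable names are just placeholders, and I’m not sure this is the intended pattern):

from torch.utils.data import TensorDataset, DataLoader

model.eval()
dataset = TensorDataset(x_data)                # wrap the input tensor in a Dataset
loader = DataLoader(dataset, batch_size=1000)  # iterate in fixed-size batches, no shuffling
preds = []
with torch.no_grad():                          # no autograd bookkeeping during inference
    for (batch,) in loader:                    # each item is a one-element tuple of inputs
        preds.append(model(batch))
pred = torch.cat(preds, dim=0)                 # one prediction tensor covering all samples

Is something like that the recommended approach, or is there a lighter-weight pattern for plain tensor inputs?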