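The behaviour of certain layers changes when the model is in evaluation mode, i.e. after calling model.eval(). For dropout, this means the layer won’t drop any neurons and passes its input through unchanged. For batch normalization, this means the layer normalizes with the running statistics accumulated during training (the default behaviour), and those statistics won’t be updated with the given data. You can calculate your metrics without putting the model in evaluation mode, but there is no good reason to do that: the numbers would be noisy and would depend on how each batch happens to be composed.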
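A minimal sketch of the two behaviours described above, assuming PyTorch is installed. The layer sizes, the seed, and the variable names are arbitrary choices made for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only to make the sketch repeatable

x = torch.ones(4, 8)
drop = nn.Dropout(p=0.5)
bn = nn.BatchNorm1d(8)

# Dropout in training mode: some entries are zeroed, and the survivors
# are scaled by 1 / (1 - p), which is 2 for p = 0.5.
drop.train()
train_out = drop(x)

# Dropout in evaluation mode: the input passes through unchanged.
drop.eval()
eval_out = drop(x)

# Batch norm in training mode: a forward pass updates the running statistics.
bn.train()
bn(torch.randn(4, 8))
mean_after_train = bn.running_mean.clone()

# Batch norm in evaluation mode: a forward pass leaves them untouched.
bn.eval()
bn(torch.randn(4, 8))
mean_after_eval = bn.running_mean.clone()

print(torch.equal(eval_out, x))                         # dropout is a no-op in eval mode
print(torch.equal(mean_after_train, mean_after_eval))   # running mean did not move in eval mode
```

Calling model.eval() on a whole model applies the same switch to every submodule, so you don’t need to toggle the layers one by one as done here.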
To try it yourself: if your model contains dropout and you’re not using model.eval(), you’ll notice that your evaluation metrics (and loss) change across multiple runs of your evaluation script, even when the data and the weights stay the same.
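One way to run that experiment is sketched below. The toy model, the random data, and the evaluate helper are all hypothetical stand-ins for your own model and evaluation script, not part of any particular codebase.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for the weights and the data

# Hypothetical toy model with a dropout layer.
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(32, 1),
)
x = torch.randn(64, 10)  # random stand-in for an evaluation set
y = torch.randn(64, 1)
loss_fn = nn.MSELoss()


def evaluate(model, x, y):
    """Compute the loss without tracking gradients (hypothetical helper)."""
    # torch.no_grad() only disables gradient tracking;
    # it does not switch the model to evaluation mode.
    with torch.no_grad():
        return loss_fn(model(x), y).item()


# Training mode (the default for a new module): dropout draws a fresh
# random mask on every forward pass, so the loss varies between runs.
model.train()
train_mode_losses = [evaluate(model, x, y) for _ in range(3)]

# Evaluation mode: dropout is disabled, so the loss is the same every time.
model.eval()
eval_mode_losses = [evaluate(model, x, y) for _ in range(3)]

print("train mode:", train_mode_losses)
print("eval mode: ", eval_mode_losses)
```

The three losses in training mode should differ from one another, while the three in evaluation mode should be identical. Remember to call model.train() again before resuming training.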