Measuring time for forward and backward pass

Hi everyone, I’m trying to measure the time needed for a forward and a backward pass separately on different models from the PyTorch Model Zoo. I’m using this code:

I do 5 dry runs, then time ten forward passes and ten backward passes, and compute the average and standard deviation of each. Something strange keeps happening: the first time I execute the code everything is fine, but if I relaunch the script right afterwards, one of the models will typically show a standard deviation much higher than the others (on the same order of magnitude as the average itself). If instead I let a substantial amount of time pass between two runs, everything is fine.
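For reference, a minimal sketch of that kind of timing loop might look like the code below. The model (resnet18 from torchvision), batch size, and loss function are placeholders rather than the exact setup used here, and explicit torch.cuda.synchronize() calls are assumed around each timed region so the GPU work is actually finished before the timer is read:

```python
import time
import torch
import torchvision.models as models

# Placeholder setup: resnet18 and a random batch are assumptions,
# not necessarily the configuration from the original script.
device = torch.device("cuda")
model = models.resnet18().to(device)
criterion = torch.nn.CrossEntropyLoss()
x = torch.randn(32, 3, 224, 224, device=device)
target = torch.randint(0, 1000, (32,), device=device)

def timed_pass():
    # Time the forward pass, synchronizing so GPU work is finished.
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    out = model(x)
    torch.cuda.synchronize()
    t_fwd = time.perf_counter() - t0

    # Time the backward pass the same way.
    loss = criterion(out, target)
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    loss.backward()
    torch.cuda.synchronize()
    t_bwd = time.perf_counter() - t0

    model.zero_grad()
    return t_fwd, t_bwd

# 5 dry runs to warm up (CUDA context, cuDNN selection, allocator).
for _ in range(5):
    timed_pass()

# 10 timed runs, then mean and standard deviation.
fwd, bwd = zip(*(timed_pass() for _ in range(10)))
fwd, bwd = torch.tensor(fwd), torch.tensor(bwd)
print(f"forward:  {fwd.mean():.4f} s ± {fwd.std():.4f}")
print(f"backward: {bwd.mean():.4f} s ± {bwd.std():.4f}")
```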

Any idea what might be causing this?
(Also, any advice on how best to measure these timings would be welcome :slight_smile: )

Check that your GPU is not downclocking or applying some kind of adaptive power boost. I’ve observed this in the past when I put my GPU into an “adaptive power boost” mode (or whatever NVIDIA calls it).
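For example (assuming a reasonably recent driver), you can watch the SM/memory clocks, temperature, and power draw refresh every second while the script runs:

nvidia-smi --query-gpu=clocks.sm,clocks.mem,temperature.gpu,power.draw --format=csv -l 1

If the clocks drop noticeably between the first and second launch, throttling or boost behaviour is the likely culprit.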

Thank you.

I’ve disabled auto-boost mode with

sudo nvidia-smi --auto-boost-default=DISABLED -i 0

I assumed it was successful, since “All done.” appeared in the terminal, but the problem persists.
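One way to double-check (on GPUs and drivers that expose a clocks policy section) is to query the clock information and see whether Auto Boost is reported as off:

nvidia-smi -q -d CLOCK -i 0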

Every GPU parameter shown in nvidia-smi looks fine.