One is usually enough; the main reason for a dry run is to put your CPU and GPU in their maximum performance state. This is especially useful on laptops, since laptop CPUs typically run in a power-saving mode by default.
CPUs and GPUs are very quick to switch to the maximum performance state, so a single 3000x3000 matrix multiplication before the actual benchmark should be enough, and it takes a couple of seconds at most.
Caveat: on some CPUs, AVX2 workloads will lower the CPU frequency (and AVX512 is worse).
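
For illustration, here is a minimal sketch of a timing helper that does one untimed dry run before measuring. The names `bench` and `matmul` are hypothetical, and the pure-Python matrix product is only a stand-in workload; in practice you would pass in whatever you are actually benchmarking, with the matrix size suggested above.

```python
import time


def bench(workload, warmup=1, repeats=5):
    """Run `workload` untimed `warmup` times, then return per-run timings in seconds."""
    # Dry runs: the results are discarded, they only give the CPU/GPU
    # a chance to clock up before anything is measured.
    for _ in range(warmup):
        workload()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return timings


def matmul(a, b):
    """Naive matrix product on lists of lists (stand-in workload only)."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


if __name__ == "__main__":
    n = 100  # kept small because pure Python is slow; the text suggests 3000
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    timings = bench(lambda: matmul(a, a), warmup=1, repeats=3)
    print(f"best of {len(timings)}: {min(timings):.4f} s")
```

Given the caveat above, it probably makes sense for the dry run to use the same kind of workload as the benchmark itself, so that any frequency drop is already in effect when timing starts.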