Inconsistent model inference time

Hi @milesyang, have you solved this problem yet?
I tested the code on an RTX 3090 (desktop) and an RTX 2080 Max-Q (laptop). On the RTX 3090, the Perf state is always P2 and the inference time stays the same regardless of the sleep time. On the RTX 2080 Max-Q, however, the Perf state changes with the sleep time. If the sleep time is less than 50 ms, the Perf state is always P0 and the inference time is normal. But if the sleep time is 500 ms, the Perf state jumps around (P0, P3, or P5), and the inference time fluctuates as well. So I assume P2 is the normal performance state for the RTX 3090 and P0 for the RTX 2080 Max-Q, but because the RTX 2080 Max-Q is a laptop GPU, its Perf state tends to jump.
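
For anyone who wants to reproduce this, here is a rough sketch of how the sleep time, Perf state, and inference time can be logged together. It is not the code from this issue: `query_pstate` and `measure` are names I made up, and `infer` stands for whatever callable runs one forward pass (it should block until the GPU is done, for example by synchronizing before returning). The Perf state is read through `nvidia-smi --query-gpu=pstate`; note that calling `nvidia-smi` takes some time itself and may influence the state being observed.

```python
import shutil
import subprocess
import time


def query_pstate(gpu_index=0):
    """Return the current Perf state string (e.g. 'P0'), or None if it cannot be read."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver tools on this machine
    try:
        result = subprocess.run(
            ["nvidia-smi", f"--id={gpu_index}",
             "--query-gpu=pstate", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=5, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    return result.stdout.strip() or None


def measure(infer, sleep_s, repeats=20):
    """Sleep, read the Perf state, then time one call to infer(); repeat.

    infer is a hypothetical zero-argument callable that runs one inference
    and returns only after the GPU work has finished.
    Returns a list of (pstate, elapsed_ms) tuples.
    """
    samples = []
    for _ in range(repeats):
        time.sleep(sleep_s)          # idle gap that may let the GPU clock down
        state = query_pstate()       # Perf state just before the inference
        start = time.perf_counter()
        infer()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        samples.append((state, elapsed_ms))
    return samples
```

Calling `measure(infer, 0.05)` and `measure(infer, 0.5)` with the same `infer` and comparing the two lists should show whether the slower runs line up with a Perf state other than P0.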