Seeing improvement only with deterministic settings


I’m developing a new method in PyTorch that is supposed to improve on my baseline. However, I only see this improvement when I run my code in deterministic mode (no matter what seed I use, with 4 workers), namely when I use the following lines:

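import torch.backends.cudnn as cudnn

if deterministic:
    cudnn.deterministic = True  # restrict cuDNN to deterministic algorithms
    cudnn.benchmark = False     # disable the cuDNN autotuner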

As soon as I turn off deterministic mode, the results are only comparable with my baseline. By improvement I mean better loss minimization and, in my application, better F1 and AUCPR scores.

Does this happen regardless of the seed value?
Can you try setting only the cudnn flags, without setting the seeds?
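
One way to run that experiment is to control the cudnn flags and the seeding separately, so each can be toggled on its own. The following is only a minimal sketch: the helper name configure_run and its arguments are hypothetical, and it assumes the training script seeds Python's random module and PyTorch (your code may seed other libraries as well).

```python
import random

import torch
import torch.backends.cudnn as cudnn


def configure_run(seed=None, deterministic=False):
    """Hypothetical helper: set the cuDNN flags and the seeds independently."""
    # Same two flags as in the question; left untouched when deterministic=False.
    if deterministic:
        cudnn.deterministic = True
        cudnn.benchmark = False

    # Seeding is skipped when seed is None, so the flags can be tested alone.
    if seed is not None:
        random.seed(seed)
        torch.manual_seed(seed)

    # Report what is actually in effect for this run.
    return {
        "deterministic": cudnn.deterministic,
        "benchmark": cudnn.benchmark,
        "seed": seed,
    }


# Flags only, no seeding: the experiment suggested above.
settings = configure_run(seed=None, deterministic=True)
print(settings)
```

Running the baseline and the new method once with configure_run(seed=None, deterministic=True) and once with configure_run(seed=0, deterministic=False) would show whether the improvement follows the flags or the seeding.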