I'm seeing a large variation in results when I run my model multiple times. The exact same architecture and training procedure gives anywhere from 91.5% to 93.4% accuracy on image classification (CIFAR-10).
The problem is that I don’t know how to use the torch random seed in order to get the better results, not the worse ones. I tried various values for the random seed, with:
and I get the lower bound of the results. Any ideas?
torch.backends.cudnn.deterministic = True in addition to:
if torch.cuda.is_available(): torch.cuda.manual_seed_all(999)
but accuracy for the same model and the same data still varies considerably across runs. I've even tried duplicating the above in the code and switching to the latest version of PyTorch (0.3.1), but I still get the same variability in accuracy across runs. Weird.
I was following this post because I ran into the same issue training an autoencoder. I don't know if the OP has solved the problem, but I ran a test last night on an AWS GPU with CUDA on, and the settings below gave me consistent results:
torch.backends.cudnn.deterministic = True
torch.manual_seed(999)
Furthermore, I explicitly call model.eval() after training, when computing the decoders and encoders.
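To illustrate why the model.eval() call matters for consistency, here is a minimal sketch. The small Sequential model with a Dropout layer is a hypothetical stand-in (the actual autoencoder is not shown in this thread), and it runs on CPU. In training mode dropout draws a fresh random mask on every forward pass, while in eval mode it is disabled, so repeated passes over the same input should match.

```python
import torch

torch.manual_seed(999)

# Hypothetical toy model; the Dropout layer is the source of randomness here.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 16),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(16, 4),
)
x = torch.randn(2, 16)

# Training mode: each forward pass samples a new dropout mask.
model.train()
train_a, train_b = model(x), model(x)

# Eval mode: dropout is a no-op, so the same input gives the same output.
model.eval()
with torch.no_grad():
    eval_a, eval_b = model(x), model(x)

print("train outputs identical:", torch.equal(train_a, train_b))
print("eval outputs identical:", torch.equal(eval_a, eval_b))
```

The same applies to layers such as BatchNorm, which also behave differently between train and eval mode.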
Alternatively, when I used the settings below, the results were inconsistent:
torch.backends.cudnn.deterministic = True
torch.cuda.manual_seed_all(999)
As a poster above mentioned, it seems that torch.manual_seed() applies to both CUDA and CPU devices in the latest version. So if you're not getting consistent results with torch.cuda.manual_seed_all, try just torch.manual_seed. This may depend on the PyTorch version you have installed. Hope this helps.
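A minimal sketch that combines the seeding calls discussed in this thread into one helper might look like the following. The names seed_everything and init_weights are made up for illustration, and setting cudnn.benchmark = False and seeding Python's random module are extra assumptions that nobody above reported testing. It only checks that weight initialization is repeatable, which is a much weaker property than a fully reproducible training run.

```python
import random

import torch


def seed_everything(seed: int = 999) -> None:
    """Seed the RNGs a typical PyTorch script touches (illustrative helper)."""
    random.seed(seed)
    torch.manual_seed(seed)  # CPU generator; recent versions also seed CUDA
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)  # explicit, for older versions
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False  # assumption: disable autotuning


def init_weights(seed: int) -> torch.Tensor:
    """Create a small layer after seeding and return a copy of its weights."""
    seed_everything(seed)
    layer = torch.nn.Linear(8, 4)
    return layer.weight.detach().clone()


w1 = init_weights(999)
w2 = init_weights(999)
w3 = init_weights(1000)

print("same seed, same weights:", torch.equal(w1, w2))
print("different seed, same weights:", torch.equal(w1, w3))
```

Calling the helper once at the very top of the script, before the model and data loaders are created, is the usual placement.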
Setting num_workers = 0 and torch.backends.cudnn.enabled = False is what actually works! I also see that if you train one step 10 times using only num_workers = 0, you get exactly the same output 8 times and a different output 2 times.
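For the data-loading side, here is a hedged sketch of pinning the shuffle order with num_workers = 0. The TensorDataset is a toy stand-in for the real data, batch_order is a made-up helper name, and the generator argument of DataLoader is only available in more recent PyTorch releases than the ones discussed above. This fixes the batch order only; it does not address nondeterministic GPU kernels.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for the real training set.
dataset = TensorDataset(torch.arange(20))


def batch_order(seed: int) -> list:
    """Return the shuffled batches produced by a loader seeded with `seed`."""
    g = torch.Generator()
    g.manual_seed(seed)
    loader = DataLoader(
        dataset,
        batch_size=4,
        shuffle=True,
        num_workers=0,  # load in the main process, no worker subprocesses
        generator=g,    # dedicated RNG that controls the shuffle order
    )
    return [batch[0].tolist() for batch in loader]


order_a = batch_order(999)
order_b = batch_order(999)

print("same shuffle order:", order_a == order_b)
```

On a CUDA machine, torch.backends.cudnn.enabled = False can be set alongside this, as suggested above, at the cost of slower training.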