# Expected behavior Dropout?

Hi,

I don’t know if I’m doing something wrong, but I went through the tutorial https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html#sphx-glr-intermediate-seq2seq-translation-tutorial-py to bring myself up to date with PyTorch 1.0 and applied the same changes to my code.

I have noticed that during inference, Dropout still randomly zeroes elements because self.training is True, even inside torch.no_grad(). Is this expected? If yes, how does the tutorial handle it? Previously we were supposed to call something like model.eval().

Here is a sample session:

```
In [1]: import torch

In [2]: a = torch.Tensor([1, 2, 3, 4, 5])

In [3]: d = torch.nn.Dropout()

In [4]: d = torch.nn.Dropout(0.2)

In [5]: d(a)
Out[5]: tensor([0.0000, 0.0000, 0.0000, 0.0000, 6.2500])

In [6]: d(a)
Out[6]: tensor([1.2500, 2.5000, 3.7500, 0.0000, 6.2500])

In [7]: d(a)
Out[7]: tensor([1.2500, 0.0000, 3.7500, 5.0000, 6.2500])

In [8]: with torch.no_grad():
   ...:     d(a)
   ...:

In [9]: with torch.no_grad():
   ...:     print(d(a))
   ...:
tensor([1.2500, 2.5000, 3.7500, 5.0000, 6.2500])

In [10]: with torch.no_grad():
    ...:     print(d(a))
    ...:
tensor([1.2500, 2.5000, 3.7500, 0.0000, 6.2500])

In [11]: with torch.no_grad():
    ...:     print(d(a))
    ...:
tensor([1.2500, 2.5000, 3.7500, 5.0000, 0.0000])

In [12]: with torch.no_grad():
    ...:     print(d(a))
    ...:
tensor([1.2500, 2.5000, 3.7500, 5.0000, 6.2500])

In [13]: with torch.no_grad():
    ...:     print(d(a))
    ...:
tensor([1.2500, 2.5000, 3.7500, 0.0000, 6.2500])

In [14]: with torch.no_grad():
    ...:     print(d(a))
    ...:
tensor([1.2500, 2.5000, 3.7500, 5.0000, 0.0000])

In [15]: with torch.no_grad():
    ...:     print(d(a))
    ...:     print(d.training)
    ...:
tensor([1.2500, 2.5000, 0.0000, 5.0000, 6.2500])
True

In [16]: with torch.no_grad():
    ...:     print(d(a))
    ...:     print(d.training)
    ...:
tensor([1.2500, 0.0000, 0.0000, 5.0000, 6.2500])
True

In [17]: print(d.training)
True
```

Yeah, this is expected, precisely because d.training is still True. And yes, you should call model.eval() before testing. Could you tell me where that tutorial suggests otherwise? I took a quick glance but didn’t see anything obvious.
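For what it’s worth, the difference is easy to verify directly. This is just a small sketch (the values and seed are arbitrary, chosen for reproducibility):

```python
import torch

torch.manual_seed(0)
drop = torch.nn.Dropout(p=0.5)
x = torch.ones(1000)

# torch.no_grad() only disables autograd bookkeeping;
# the module stays in training mode, so dropout is still active.
with torch.no_grad():
    out = drop(x)
print(drop.training)             # True
print((out == 0).any().item())   # True: elements are still being zeroed

# eval() flips the training flag; Dropout then becomes the identity.
drop.eval()
print(torch.equal(drop(x), x))   # True: no zeroing, no rescaling
```

So wrapping the forward pass in torch.no_grad() alone does not turn dropout off; only the eval()/train(False) switch does.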

Thank you for your quick answer! I’m also looking at the thread ‘model.eval()’ vs ‘with torch.no_grad()’, where they say that the recommended way is torch.no_grad() (and that it should be faster). In the end, it’s not very clear. What should we do?

I didn’t read the whole tutorial, but in the code you can see `self.dropout = nn.Dropout(self.dropout_p)` at line 451 and `with torch.no_grad():` at line 699, with no encoder.eval() or decoder.eval(). I could also find similar patterns in other tutorials.
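The two mechanisms are complementary rather than alternatives: eval() changes layer behavior (Dropout, BatchNorm), while torch.no_grad() skips building the autograd graph to save time and memory. A minimal inference sketch using both (the tiny model here is a made-up stand-in for the tutorial’s encoder/decoder, not its actual code):

```python
import torch

# Stand-in model containing a dropout layer
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8),
    torch.nn.Dropout(0.5),
    torch.nn.Linear(8, 2),
)

model.eval()                   # Dropout/BatchNorm switch to inference behavior
with torch.no_grad():          # no gradient graph is recorded
    out = model(torch.randn(3, 4))

print(model.training)          # False: dropout is a no-op here
print(out.requires_grad)       # False: no graph was built

model.train()                  # switch back before resuming training
```

With both in place you get deterministic outputs and the memory/speed benefit; using only torch.no_grad(), as in the tutorial, leaves dropout active during evaluation.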