Hi, I’ll show what I mean with an example. First, I set the seeds:

```
import torch
from torch.nn import functional as F
import numpy as np
import random
#####
torch.manual_seed(60)
torch.cuda.manual_seed(60)
np.random.seed(60)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
random.seed(60)
#####
```

Now let’s do something simple:

```
pred = torch.FloatTensor([0,1,2])
a = F.gumbel_softmax(pred)
b = F.gumbel_softmax(pred)
print("a:",a)
print("b:",b)
```

This prints:

a: tensor([0.4957, 0.1406, 0.3637])

b: tensor([0.0653, 0.6350, 0.2996])

Now if I remove a:

```
pred = torch.FloatTensor([0,1,2])
b = F.gumbel_softmax(pred)
print("b:",b)
```

This prints:

b: tensor([0.4957, 0.1406, 0.3637])

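The same pattern continues with three calls: whichever call runs first after seeding gets the first result, regardless of which variable it is stored in.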
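To make the observation easy to reproduce in a single script, here is a minimal sketch. The helper name `sample_after_seed` is my own (it is not part of PyTorch), and I am assuming CPU tensors and that reseeding with `torch.manual_seed` before each run is equivalent to restarting the script:

```python
import torch
from torch.nn import functional as F


def sample_after_seed(n_calls, seed=60):
    """Reseed, then call gumbel_softmax n_calls times and return every result."""
    torch.manual_seed(seed)
    pred = torch.tensor([0.0, 1.0, 2.0])
    return [F.gumbel_softmax(pred, dim=-1) for _ in range(n_calls)]


two = sample_after_seed(2)  # plays the role of a, b
one = sample_after_seed(1)  # plays the role of b alone, with a removed

# Does the lone call reproduce what the first of the two calls produced?
print("first of two == lone call:", torch.equal(two[0], one[0]))
# Does the lone call reproduce what the second of the two calls produced?
print("second of two == lone call:", torch.equal(two[1], one[0]))
```

If the result really is tied to the position of the call after seeding rather than to the variable name, the first comparison should be the one that matches.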

I understand that the output depends on the seed, but it seems impossible that applying Gumbel softmax to a vector should depend on whether I called it earlier and stored the result in another variable.

Moreover, calling Gumbel softmax twice, as follows:

```
pred = torch.FloatTensor([0,1,2])
a = F.gumbel_softmax(pred)
b = F.gumbel_softmax(pred)
print("a:",a)
print("b:",b)
```

consistently gave better results across multiple seeds.

Can anyone help me understand this behaviour!?