Hi,
I want to turn the softmax output into a one-hot gate.
For example:
x = nn.Linear(10, 20)(input)
task_gate = nn.Softmax(dim=-1)(x)  # e.g., the result is (0.5, 0.2, 0.3)
I want to change (0.5, 0.2, 0.3) to (1, 0, 0). Also, x needs to be optimized.
Hi,
The function that transforms (0.5, 0.2, 0.3) into (1, 0, 0) has gradients that are 0 almost everywhere.
So you won’t be able to optimize anything, since all the gradients you get will be 0.
Yes. I only want to get a gradient instead of a None gradient. So which function can help me? Thanks.
But if you get a gradient tensor, it will be full of 0s. Is that what you want? You can just create one with torch.zeros(your_size).
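The thread itself doesn’t name a fix, but a common workaround for exactly this situation is the straight-through estimator: use the hard one-hot values in the forward pass while letting gradients flow through the soft softmax in the backward pass. A minimal sketch, assuming a setup like the asker’s (the layer sizes and the toy loss here are made up for illustration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

linear = nn.Linear(10, 3)          # 3 gate choices, matching the (0.5, 0.2, 0.3) example
inp = torch.randn(1, 10)

logits = linear(inp)
soft = F.softmax(logits, dim=-1)   # soft probabilities, e.g. something like (0.5, 0.2, 0.3)

# Hard one-hot of the argmax, e.g. (1, 0, 0); not differentiable on its own
hard = F.one_hot(soft.argmax(dim=-1), num_classes=soft.size(-1)).float()

# Straight-through trick: forward value equals `hard`,
# but the gradient is taken with respect to `soft`
gate = hard + soft - soft.detach()

# Hypothetical downstream loss, just to show gradients reach the Linear layer
loss = (gate * torch.arange(3.0)).sum()
loss.backward()

print(gate)                                  # an exact one-hot vector
print(linear.weight.grad.abs().sum() > 0)    # gradients are not all zero
```

Note the gradient you get this way is the softmax gradient, not the (zero) gradient of the hard one-hot function itself, which is why optimization still works. PyTorch also ships a built-in variant of this idea: `F.gumbel_softmax(logits, hard=True)` returns a one-hot sample in the forward pass with soft gradients in the backward pass.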