Hello, I am trying to do the following forward calculation:
y_ij = ||x_i|| * cos(2 * theta_ij)
where theta_ij is the angle between x_i and w_j, x_i is the i-th row of matrix X, w_j is the j-th column of matrix W, and y_ij is an element of the resulting matrix Y.
There are two equivalent ways to realize it:
    xlen = x.pow(2).sum(1).pow(0.5).view(-1, 1)  # ||x||
    wlen = w.pow(2).sum(0).pow(0.5).view(1, -1)  # ||w||
    cos_theta = (x.mm(w) / xlen / wlen).clamp(-1, 1)
    theta = cos_theta.acos()
    cos_2_theta = torch.cos(2*theta)
    y = cos_2_theta * xlen.view(-1, 1)
Alternatively,
    xlen = x.pow(2).sum(1).pow(0.5).view(-1, 1)  # ||x||
    wlen = w.pow(2).sum(0).pow(0.5).view(1, -1)  # ||w||
    cos_theta = (x.mm(w) / xlen / wlen).clamp(-1, 1)
    cos_2_theta = 2 * cos_theta ** 2 - 1  # cos(2x) = 2cos(x)^2-1
    y = cos_2_theta * xlen
However, the first version is very unstable: the gradients turn to NaN after several iterations, while the second version trains without problems. Can anyone explain this issue?
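To narrow it down, here is a minimal sketch that compares the gradient of both formulations for a single cos_theta value. The helper `grad_at` and the test values 1.0 and 0.5 are made up for illustration and are not part of my training code. I am also assuming that the trouble starts when cos_theta lands exactly on the clamp boundary, where the derivative of acos, -1/sqrt(1 - c^2), is infinite; I have not confirmed that this is what happens in the real run.

```python
import math
import torch


def grad_at(c_value, use_acos):
    """Gradient of cos(2*theta) w.r.t. cos_theta for one scalar value (hypothetical helper)."""
    c = torch.tensor([c_value], dtype=torch.float32, requires_grad=True)
    cos_theta = c.clamp(-1, 1)
    if use_acos:
        # First version: go through the angle explicitly.
        cos_2_theta = torch.cos(2 * cos_theta.acos())
    else:
        # Second version: double-angle identity, no acos involved.
        cos_2_theta = 2 * cos_theta ** 2 - 1
    cos_2_theta.sum().backward()
    return c.grad.item()


# On the clamp boundary (x_i parallel to w_j).
grad_acos_edge = grad_at(1.0, use_acos=True)
grad_poly_edge = grad_at(1.0, use_acos=False)

# Away from the boundary both versions should agree.
grad_acos_mid = grad_at(0.5, use_acos=True)
grad_poly_mid = grad_at(0.5, use_acos=False)

print("edge:", grad_acos_edge, grad_poly_edge, "nan?", math.isnan(grad_acos_edge))
print("mid: ", grad_acos_mid, grad_poly_mid)
```

If the acos version is the only one that produces a non-finite gradient at the boundary, that would suggest the backward pass multiplies a zero (from the sine term) by an infinity (from acos), which gives NaN even though the forward values look identical.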
Thanks!