Hi all, I am trying to implement BPR (https://arxiv.org/pdf/1205.2618), but I am getting NaN values in the embeddings. I am wondering whether there is a bug in my code, or whether the cause is the known issue that the gradient of torch.norm at 0 (in version 0.2 and before) is NaN (embeddings become NaN).
My code is:
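As a side note on the torch.norm issue: a sum-of-squares penalty (like the `reg` term below) has the gradient 2*x, which is well defined everywhere, including at the zero vector, so it should not produce NaN by itself. A minimal standalone check:

```python
import torch

# Sum-of-squares regularizer: gradient is 2*x, finite even at x = 0,
# unlike the gradient of the 2-norm, which is x/||x|| (0/0 at zero).
x = torch.zeros(4, requires_grad=True)
(x * x).sum().backward()
print(x.grad)  # all zeros, no NaN
```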
```python
class BPR_MF(nn.Module):
    def __init__(self, user_num, item_num, embedding_dim):
        super(BPR_MF, self).__init__()
        self.user_embedding = nn.Embedding(user_num, embedding_dim)
        self.item_embedding = nn.Embedding(item_num, embedding_dim)
        self.logSigmoid = nn.LogSigmoid()
        self.item_num = item_num
        self.user_num = user_num

    def forward(self, input_triple):
        u, i, j = input_triple
        user_embed = self.user_embedding(u)
        item_i_embed = self.item_embedding(i)
        item_j_embed = self.item_embedding(j)
        # sum over the embedding dimension only, so each triple keeps its own score
        score = (user_embed * (item_i_embed - item_j_embed)).sum(dim=-1)
        log_prob = self.logSigmoid(score).sum()
        reg = (user_embed * user_embed).sum() + (item_i_embed * item_i_embed).sum() + (item_j_embed * item_j_embed).sum()
        return log_prob, reg
```
and the loss function is:

```python
loss = -1.0 * log_prob + reg * regularize  # target is to maximize log_prob, so multiply by -1
```
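For reference, here is a self-contained sketch of one training step with this model and loss, restating the class with the per-example dot product summed over the embedding dimension (`sum(dim=-1)`); the learning rate, batch size, and regularization weight below are arbitrary placeholders. It checks that the embeddings stay finite after the update:

```python
import torch
import torch.nn as nn

class BPR_MF(nn.Module):
    def __init__(self, user_num, item_num, embedding_dim):
        super().__init__()
        self.user_embedding = nn.Embedding(user_num, embedding_dim)
        self.item_embedding = nn.Embedding(item_num, embedding_dim)
        self.logSigmoid = nn.LogSigmoid()

    def forward(self, input_triple):
        u, i, j = input_triple
        user_embed = self.user_embedding(u)
        item_i_embed = self.item_embedding(i)
        item_j_embed = self.item_embedding(j)
        # per-triple score, then sum the log-probabilities over the batch
        score = (user_embed * (item_i_embed - item_j_embed)).sum(dim=-1)
        log_prob = self.logSigmoid(score).sum()
        reg = (user_embed * user_embed).sum() + (item_i_embed * item_i_embed).sum() \
            + (item_j_embed * item_j_embed).sum()
        return log_prob, reg

torch.manual_seed(0)
model = BPR_MF(user_num=10, item_num=20, embedding_dim=8)
opt = torch.optim.SGD(model.parameters(), lr=0.01)  # lr chosen arbitrarily

# a batch of 4 random (user, positive item, negative item) triples
u = torch.randint(0, 10, (4,))
i = torch.randint(0, 20, (4,))
j = torch.randint(0, 20, (4,))

log_prob, reg = model((u, i, j))
regularize = 0.01  # arbitrary regularization weight
loss = -1.0 * log_prob + reg * regularize
opt.zero_grad()
loss.backward()
opt.step()

# sanity check: embeddings stay finite after the update
assert torch.isfinite(model.user_embedding.weight).all()
assert torch.isfinite(model.item_embedding.weight).all()
```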
Thanks in advance