Hi,
I’m trying to port this TensorFlow model to PyTorch (you can see my progress here).
When I run my code, however, the loss does not drop at all and stabilizes around 0.76, compared to the original TensorFlow model’s loss of ~0.21.
How should I go about debugging my code?
I also have a question about the implementation: is my L2 regularization written correctly in PyTorch (compared with the reference code in TensorFlow)?
```python
for k, batch in progress:
    optimizer.zero_grad()
    score_pos, score_neg = model(batch)
    batch_loss = criterion(score_pos, score_neg)
    # adding l2 regularisation
    for name, param in model.named_parameters():
        if name in ['mem_layer.hop_mapping.weight',
                    'output_module.dense.weight',
                    'output_module.out.weight']:
            l2 = torch.sqrt(param.pow(2).sum())
            batch_loss += (config.l2_lambda * l2)
    batch_loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), config.grad_clip)
    optimizer.step()
```
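For what it’s worth, the two formulations differ: the loop above adds the L2 *norm* (sqrt of the sum of squares), whereas TensorFlow’s `tf.nn.l2_loss(t)` is defined as `sum(t ** 2) / 2`, with no square root. A minimal sketch of the difference, using a hypothetical weight tensor in place of one of the regularised parameters:

```python
import torch

# Hypothetical stand-in for one of the regularised weight matrices.
w = torch.tensor([[3.0, 4.0]])

# What the loop above adds to the loss: the L2 norm, sqrt(sum of squares).
l2_norm = torch.sqrt(w.pow(2).sum())   # sqrt(9 + 16) = 5.0

# What tf.nn.l2_loss(w) computes: sum of squares divided by 2, no sqrt.
tf_style = w.pow(2).sum() / 2          # 25 / 2 = 12.5

print(l2_norm.item(), tf_style.item())
```

If the TensorFlow reference uses `tf.nn.l2_loss`, matching it exactly would mean dropping the `torch.sqrt` and halving the sum; whether that explains the loss gap is a separate question, since the regularisation term also scales differently with the weights under the two definitions.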
Any help would be appreciated.