I am doing transfer learning with USE (Universal Sentence Encoder), training a Siamese network for similarity on the Quora question pairs dataset.
During training the loss stays almost constant, neither decreasing nor increasing, and after training all predictions have the same value.
Can someone please help? I have been stuck on this issue for the last 2 days!
Here is the colab notebook link.
Your notebook requires permission to access it. Could you post an “open” link so that we can check it for errors?
Let me know if you can access it.
Yes, I can access it, thanks.
I don’t see any obvious errors and the model is generally able to learn random data:
import torch

# torch_siamese and loss_func are defined in the linked notebook
model = torch_siamese()

# random inputs and targets just to check that the model can overfit noise
data1 = torch.randn(64, 512)
data2 = torch.randn(64, 512)
target = torch.randn(64, 1)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
nb_epochs = 100
for epoch in range(nb_epochs):
    optimizer.zero_grad()
    output = model(data1, data2)
    loss = loss_func(output, target)
    loss.backward()
    optimizer.step()
    print('epoch {}, loss {}'.format(epoch, loss.item()))
I would recommend checking the shapes of the output and target tensors and making sure they are equal. Unwanted broadcasting can otherwise be applied, e.g. if the target is missing a dimension, which would result in this bad training performance.
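To illustrate the broadcasting pitfall: here is a minimal sketch (tensor names are made up for the example) showing how a model output of shape `[64, 1]` compared against a target of shape `[64]` silently produces a `[64, 64]` difference matrix, so the loss averages over 4096 mismatched pairs instead of 64 samples:

```python
import torch

output = torch.randn(64, 1)  # model output with a trailing dim of 1
target = torch.randn(64)     # target missing that trailing dimension

# Broadcasting expands (64, 1) - (64,) to a (64, 64) matrix,
# comparing every sample against every target value.
diff = output - target
print(diff.shape)  # torch.Size([64, 64])

# With matching shapes the difference stays per-sample:
diff_ok = output - target.unsqueeze(1)
print(diff_ok.shape)  # torch.Size([64, 1])
```

Recent PyTorch versions emit a `UserWarning` from `nn.MSELoss` in this situation, so it is worth not ignoring that warning.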