Max-Margin Loss Drops to Zero, but Retrieval Metric Is Still Very Low


I am training a shared embedding model with a max-margin loss and evaluating it on an external retrieval metric.
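
For reference, here is a minimal sketch of the usual hinge form of a max-margin loss. It assumes cosine similarity as the scoring function and a list of sampled negatives per anchor; the names `cosine`, `max_margin_loss`, and `margin` are hypothetical, and the actual model and negative sampling are not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def max_margin_loss(anchor, positive, negatives, margin=0.2):
    """Mean hinge loss over negatives: max(0, margin - s(a, p) + s(a, n)).

    Assumption: similarity is cosine; other scoring functions are possible.
    """
    pos_score = cosine(anchor, positive)
    losses = [
        max(0.0, margin - pos_score + cosine(anchor, neg))
        for neg in negatives
    ]
    return sum(losses) / len(losses)


# An easy negative (orthogonal to the anchor) contributes no loss.
easy = max_margin_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]], margin=0.2)

# A hard negative (identical to the positive) contributes the full margin.
hard = max_margin_loss([1.0, 0.0], [1.0, 0.0], [[1.0, 0.0]], margin=0.2)
```

In this form the loss is exactly zero whenever every sampled negative scores at least `margin` below the positive, so a zero loss only says that the sampled negatives are already separated, not that the retrieval ranking is good.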

The loss drops to zero after just one iteration, but the metric is still very low.

I tried different margins, but the issue persists.

Are any special tricks required to train with a max-margin loss?