I am not sure whether this is the appropriate place to ask.

My loss has two parts, say L1 and L2. I want to minimize both, and at the same time I need to constrain L1 to always be greater than L2 (L1 > L2). Is the following correct: minimizing loss = L2 - L1?

By minimizing loss = L2 - L1 you are only really optimizing for making L1 greater than L2, not for minimizing either loss component. For example, parameters giving (L1=1000, L2=990) are just as good a solution as parameters giving (L1=10, L2=0): both yield a loss of -10.
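To make that concrete, here is the arithmetic with the two hypothetical parameter settings from above:

```python
# Under loss = L2 - L1, two very different solutions score identically:
loss_large = 990 - 1000   # L1 = 1000, L2 = 990  ->  -10
loss_small = 0 - 10       # L1 = 10,   L2 = 0    ->  -10
assert loss_large == loss_small  # the objective cannot tell them apart
```

Worse, the optimizer is rewarded for driving L1 up without bound, since that decreases the loss indefinitely.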

Another loss function which is more in line with your problem description could be:
loss = L1 + L2 + H(L1, L2), where H is a very large number if L2 > L1 and 0 otherwise.
To make H differentiable, you can approximate it with a continuous function of the difference L1 - L2 that increases rapidly around 0 (note that you are relaxing the inequality constraint by doing so).
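One possible sketch of such a relaxation in PyTorch, using a scaled softplus as the smooth penalty (the `weight` and `sharpness` hyperparameters are illustrative, not from the post, and would need tuning):

```python
import torch
import torch.nn.functional as F

def constrained_loss(L1, L2, weight=100.0, sharpness=10.0):
    """L1 + L2 plus a smooth stand-in for H(L1, L2).

    softplus(sharpness * (L2 - L1)) / sharpness is close to 0 while
    L1 > L2 and grows roughly linearly once L2 exceeds L1, so the hard
    inequality constraint becomes a differentiable penalty. Larger
    `sharpness` makes the transition at L1 = L2 steeper; larger `weight`
    makes violating the constraint more expensive.
    """
    penalty = F.softplus(sharpness * (L2 - L1)) / sharpness
    return L1 + L2 + weight * penalty
```

When the constraint is satisfied the penalty is negligible (e.g. L1=10, L2=0 gives a loss of about 10); when it is violated the penalty dominates (L1=1, L2=5 gives a loss of about 406), which is the behavior H was meant to provide.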

I am not entirely sure about this reasoning though. Please correct me if I’m wrong.

You could downscale L2 like so: loss = L1 + L2 * min(1, (L1/L2).detach().cpu().item()), assuming L1 and L2 have the same sign and L2 is nonzero. Kinda weird optimization constraint, though…
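A minimal sketch of that trick, assuming L1 and L2 are positive scalar tensors (the helper name is mine):

```python
import torch

def scaled_loss(L1, L2):
    """Downscale L2 so its contribution never exceeds L1's.

    The scale factor is detached, so gradients flow only through the raw
    L1 and L2 terms, not through the ratio itself. Assumes L1 and L2 are
    positive scalars; the ratio is undefined for L2 = 0 and the min() is
    meaningless if the signs differ.
    """
    scale = min(1.0, (L1 / L2).detach().item())
    return L1 + L2 * scale
```

For example, with L1=4, L2=2 the scale clips to 1 and the loss is just L1 + L2 = 6; with L1=2, L2=4 the scale is 0.5, so L2's contribution is capped at exactly L1's value and the loss is 4. Note this only caps L2's magnitude relative to L1, it does not actually enforce L1 > L2 on the underlying quantities.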