Confusion about loss objective with SGD

Hi, I'm confused about optimizing a loss objective with SGD.

  1. loss = 1 - F.cosine_similarity(x, y) versus loss = -F.cosine_similarity(x, y): do both of these minimize the similarity under SGD? (1 - cosine_sim) seems more natural to me, but I have seen (-cosine_sim) used as a contrastive self-supervised learning objective. How do these two behave differently?

  2. What is the difference between loss = -F.cross_entropy(logits, label) and loss = F.cross_entropy(-logits, label)? Do both of these maximize the cross entropy?

  1. No difference: the two losses differ only by the constant 1, which has zero gradient, so SGD takes exactly the same steps for both. (Note that minimizing either one maximizes the cosine similarity, which is why -cosine_sim shows up in contrastive self-supervised objectives.)
  2. These are very different:
    • -F.cross_entropy(logits, label) is unbounded below, so minimizing it will not converge; SGD just drives the loss toward minus infinity.
    • -F.cross_entropy(logits, label) is not equal to F.cross_entropy(-logits, label).
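The first point is easy to check directly: since the two cosine losses differ only by a constant, autograd gives identical gradients. A minimal sketch with random tensors:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 8, requires_grad=True)
y = torch.randn(4, 8)

# The two losses differ by the constant 1, which has zero gradient,
# so their gradients w.r.t. x are identical and SGD takes the same steps.
loss_a = (1 - F.cosine_similarity(x, y)).mean()
loss_b = (-F.cosine_similarity(x, y)).mean()

grad_a, = torch.autograd.grad(loss_a, x, retain_graph=True)
grad_b, = torch.autograd.grad(loss_b, x)
print(torch.allclose(grad_a, grad_b))  # True
```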

cross_entropy is equivalent to the combination of LogSoftmax and NLLLoss, so the logits first go through LogSoftmax.
For F.cross_entropy(-logits, labels), think of it this way: minimizing it yields the estimator that minimizes the cross entropy between softmax(-logits) and labels.
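A quick sketch (with random logits) confirming the LogSoftmax + NLLLoss decomposition, and showing that the two negated variants are simply different numbers:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)
labels = torch.tensor([0, 2, 1, 0])

# F.cross_entropy applies log_softmax to the logits, then nll_loss.
ce = F.cross_entropy(logits, labels)
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)
print(torch.allclose(ce, manual))  # True

# The two negated variants are different: -ce is negative, while
# F.cross_entropy(-logits, labels) is still an ordinary (positive)
# cross entropy, just computed on flipped logits.
neg_outside = -ce
neg_inside = F.cross_entropy(-logits, labels)
print(neg_outside.item(), neg_inside.item())
```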

Do these two maximize the cross entropy?

I don’t think so. Minimizing -F.cross_entropy(logits, label) does try to maximize the cross entropy, which is unbounded, so there is nothing to converge to. F.cross_entropy(-logits, label), on the other hand, just minimizes an ordinary cross entropy with flipped logits.
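To see why -F.cross_entropy has no minimizer, here is a small sketch (with made-up logits and a deliberately wrong label) where the loss drops without bound as the logits are scaled up:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0, 0.5]])
label = torch.tensor([1])  # deliberately the *wrong* class

# Scaling the logits up makes the cross entropy on a mispredicted label
# grow without bound, so loss = -F.cross_entropy has no minimum:
# SGD would keep inflating the logits forever instead of converging.
losses = []
for scale in (1, 10, 100):
    loss = -F.cross_entropy(scale * logits, label)
    losses.append(loss.item())
    print(scale, loss.item())
```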