# Confusion about loss objective with SGD

Hi, I'm confused about optimizing a loss objective with SGD.

1. With `loss = 1 - F.cosine_similarity(x, y)` and `loss = -F.cosine_similarity(x, y)`, do these two represent the same similarity objective under SGD? `(1 - cosine_sim)` seems more natural to me, but I have seen `(-cosine_sim)` used in a contrastive self-supervised learning objective. How do these two behave differently?

2. What is the difference between `loss = -F.cross_entropy(logits, label)` and `loss = F.cross_entropy(-logits, label)`? Do both of these maximize the cross entropy?

1. No difference. The constant 1 has zero gradient, so SGD takes identical steps; it only shifts the loss value so that it stays non-negative.
2. These are different objectives:
• `-F.cross_entropy(logits, label)` is not bounded below, so minimizing it will not converge
• `-F.cross_entropy(logits, label)` is not equal to `F.cross_entropy(-logits, label)`
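A quick check of point 1 (a sketch assuming PyTorch; the tensor shapes here are arbitrary): the constant 1 contributes nothing to the gradient, so both cosine losses produce identical SGD updates.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 8, requires_grad=True)
y = torch.randn(4, 8)

# Same data, two formulations of the loss.
loss_a = (1 - F.cosine_similarity(x, y)).mean()  # values in [0, 2]
loss_b = (-F.cosine_similarity(x, y)).mean()     # values in [-1, 1]

grad_a, = torch.autograd.grad(loss_a, x)
grad_b, = torch.autograd.grad(loss_b, x)

# The constant 1 has zero gradient, so the updates are identical;
# only the reported loss value differs, by exactly 1.
print(torch.allclose(grad_a, grad_b))
print((loss_a - loss_b).item())
```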

`cross_entropy` is equivalent to the combination of `LogSoftmax` and `NLLLoss`, so the `logits` input first goes through `LogSoftmax`.
For `F.cross_entropy(-logits, labels)`, think of it this way: minimizing it yields an estimator that minimizes the cross entropy between `softmax(-logits)` and `labels`.
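The decomposition, and the difference between negating the loss and negating the logits, can be verified directly (a sketch assuming PyTorch; the example logits and labels are made up):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 5)
labels = torch.tensor([0, 2, 4])

# cross_entropy == NLLLoss applied to LogSoftmax of the logits.
ce = F.cross_entropy(logits, labels)
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)
print(torch.allclose(ce, manual))

# Negating the loss vs. negating the logits are different quantities:
# cross entropy is always positive for finite logits, so -ce is negative
# while cross_entropy(-logits, labels) is another positive cross entropy.
print((-ce).item(), F.cross_entropy(-logits, labels).item())
```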

Do both of these maximize the cross entropy?

I don’t think so. Negating the loss gives an unbounded objective, and negating the logits just minimizes the cross entropy of a different distribution; neither is a usable "maximize cross entropy" objective.
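To make the contrast concrete, here is a small sketch (assuming PyTorch; the helper `sgd_steps`, the step count, and the learning rate are mine): plain SGD on `-cross_entropy(logits, labels)` keeps driving the loss toward minus infinity, while SGD on `cross_entropy(-logits, labels)` is bounded below by 0 and settles near it.

```python
import torch
import torch.nn.functional as F

labels = torch.tensor([1])

def sgd_steps(loss_fn, steps=200, lr=0.5):
    """Run a few manual SGD steps on a scalar loss of the logits."""
    logits = torch.zeros(1, 3, requires_grad=True)
    for _ in range(steps):
        g, = torch.autograd.grad(loss_fn(logits), logits)
        logits = (logits - lr * g).detach().requires_grad_(True)
    return loss_fn(logits).item()

# Minimizing -cross_entropy: unbounded below, the loss just keeps falling.
diverging = sgd_steps(lambda l: -F.cross_entropy(l, labels))
# Minimizing cross_entropy(-logits): a proper minimization, approaches 0.
converging = sgd_steps(lambda l: F.cross_entropy(-l, labels))
print(diverging, converging)
```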