Hello all, I have a loss function of the form
loss = loss1 + 0.1 * loss2
where loss1 and loss2 are both CrossEntropyLoss. loss1 takes the network outputs and the ground-truth labels as its two inputs (I call it the Supervised Loss), while loss2 takes the outputs and pseudo-labels derived from those same outputs (just threshold the outputs); I call it the Unsupervised Loss. The two terms are balanced by the weight 0.1.
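By "threshold the outputs" I mean roughly the following (a small sketch with my own variable names, not my exact code; since CrossEntropyLoss expects class indices I take the argmax as the pseudo-label, and the 0.5 threshold is meant to keep only confident predictions):

import torch
import torch.nn.functional as F

outputs = torch.randn(4, 3)              # dummy logits: batch of 4 samples, 3 classes
probs = F.softmax(outputs, dim=1)        # per-class probabilities
conf, pseudo_labels = probs.max(dim=1)   # predicted class and its confidence
mask = conf > 0.5                        # keep only confident samples for the unsupervised term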
This is my implementation:
optimizer.zero_grad()
###############
#Loss1: given images and labels
###############
criterion = nn.CrossEntropyLoss().to(device)
outputs = model(images)
loss1 = criterion(outputs, labels)
loss1.backward()
###############
#Loss2: given images only (pseudo-labels come from the outputs)
###############
outputs = model(images)
# pseudo-labels: predicted class per sample (CrossEntropyLoss expects class indices)
_, pseudo_labels = torch.max(outputs, 1)
loss2 = 0.1*criterion(outputs, pseudo_labels.detach())
loss2.backward()
optimizer.step()
Could you look at my implementation and give me comments on two things:
- Is the implementation correct for performing loss = loss1 + 0.1*loss2?
- Should optimizer.step() and optimizer.zero_grad() be applied once at the end of the step, or after each loss's backward() call?
For the second point, I mean:
optimizer.zero_grad()
###############
#Loss1: given images and labels
###############
criterion = nn.CrossEntropyLoss().to(device)
outputs = model(images)
loss1 = criterion(outputs, labels)
###############
#Loss2: given images
###############
outputs = model(images)
# pseudo-labels: predicted class per sample (CrossEntropyLoss expects class indices)
_, pseudo_labels = torch.max(outputs, 1)
loss2 = criterion(outputs, pseudo_labels.detach())
loss = loss1 + 0.1*loss2   # the 0.1 weight is applied once, here
loss.backward()
optimizer.step()
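In case it is useful, here is a small toy check I put together to convince myself that the two variants match (the nn.Linear model, the random data, and all names are made up by me). My understanding is that calling backward() per loss and calling it once on the weighted sum should leave the same values in .grad, since step() and zero_grad() only run once per iteration; I would appreciate confirmation that this is the right way to reason about it:

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(8, 3)                      # toy model
images = torch.randn(4, 8)                   # dummy batch
labels = torch.randint(0, 3, (4,))           # dummy ground-truth labels
criterion = nn.CrossEntropyLoss()

def grads_per_loss_backward():
    model.zero_grad()
    outputs = model(images)
    loss1 = criterion(outputs, labels)
    loss1.backward()
    outputs = model(images)
    _, pseudo = torch.max(outputs, 1)
    loss2 = 0.1 * criterion(outputs, pseudo.detach())
    loss2.backward()                         # accumulates into the existing .grad
    return [p.grad.clone() for p in model.parameters()]

def grads_summed_loss_backward():
    model.zero_grad()
    outputs = model(images)
    loss1 = criterion(outputs, labels)
    outputs = model(images)
    _, pseudo = torch.max(outputs, 1)
    loss2 = criterion(outputs, pseudo.detach())
    loss = loss1 + 0.1 * loss2
    loss.backward()
    return [p.grad.clone() for p in model.parameters()]

g1 = grads_per_loss_backward()
g2 = grads_summed_loss_backward()
print(all(torch.allclose(a, b) for a, b in zip(g1, g2)))   # expect True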
Thanks in advance!