I have a CNN that consists of a shared convolutional trunk followed by multiple small Sequential
head modules, and each head has a different number of output neurons, which I return at the end. Something like this:
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn_module = nn.Sequential()  # contains CNN layers
        self.network1 = nn.Sequential()  # Dense layers
        self.network2 = nn.Sequential()  # Dense layers
        self.network3 = nn.Sequential()  # Dense layers

    def forward(self, x):
        x = self.cnn_module(x)
        x = x.flatten(1)  # flatten everything except the batch dimension
        out_1 = self.network1(x)
        out_2 = self.network2(x)
        out_3 = self.network3(x)
        return [out_1, out_2, out_3]
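For context, a minimal runnable version of this multi-head pattern (the layer sizes and class counts below are made up for illustration, not from the original model) could look like:

```python
import torch
import torch.nn as nn

class MultiHeadCNN(nn.Module):
    """Shared conv trunk feeding three classification heads (sizes illustrative)."""
    def __init__(self, n_classes=(10, 5, 3)):
        super().__init__()
        self.cnn_module = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),  # fixed-size features regardless of input size
        )
        feat = 16 * 4 * 4
        self.network1 = nn.Sequential(nn.Linear(feat, 64), nn.ReLU(), nn.Linear(64, n_classes[0]))
        self.network2 = nn.Sequential(nn.Linear(feat, 64), nn.ReLU(), nn.Linear(64, n_classes[1]))
        self.network3 = nn.Sequential(nn.Linear(feat, 64), nn.ReLU(), nn.Linear(64, n_classes[2]))

    def forward(self, x):
        x = self.cnn_module(x)
        x = x.flatten(1)  # (batch, feat)
        return [self.network1(x), self.network2(x), self.network3(x)]

model = MultiHeadCNN()
outs = model(torch.randn(8, 3, 32, 32))
print([tuple(o.shape) for o in outs])  # [(8, 10), (8, 5), (8, 3)]
```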
I was confused about how to calculate the loss of each of the networks and then backpropagate the losses together. Would something like this work:
criterion = nn.CrossEntropyLoss()
model = CNN()

optimizer.zero_grad()
output = model(images)
loss_1 = criterion(output[0], labels_0)
loss_2 = criterion(output[1], labels_1)
loss_3 = criterion(output[2], labels_2)
loss = loss_1 + loss_2 + loss_3
loss.backward()
optimizer.step()
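A quick sanity check of this approach (a self-contained sketch with a tiny made-up model, not the actual network above): autograd guarantees that the gradient of a sum is the sum of the gradients, so backpropagating the summed loss deposits the sum of each head's gradients into the shared trunk.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in for the shared trunk plus two heads (sizes are illustrative).
trunk = nn.Linear(4, 4)
head1, head2 = nn.Linear(4, 3), nn.Linear(4, 2)
criterion = nn.CrossEntropyLoss()

x = torch.randn(5, 4)
y1 = torch.randint(0, 3, (5,))
y2 = torch.randint(0, 2, (5,))

# Gradient in the trunk from each loss on its own.
criterion(head1(trunk(x)), y1).backward()
g1 = trunk.weight.grad.clone()
trunk.zero_grad()
criterion(head2(trunk(x)), y2).backward()
g2 = trunk.weight.grad.clone()
trunk.zero_grad()

# Gradient in the trunk from the summed loss.
feats = trunk(x)
(criterion(head1(feats), y1) + criterion(head2(feats), y2)).backward()

print(torch.allclose(trunk.weight.grad, g1 + g2, atol=1e-6))  # True
```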
What is loss_4
? I don't see any reason why this shouldn't work. Did you give it a go?
If I am seeing this right, what you are doing is equivalent to concatenating all outputs and running a CrossEntropyLoss
against the concatenated labels.
So if that's correct, how can I pass weighted values to the loss function for each of the sub-networks, since the labels are different for each sub-network? Do I have to call nn.CrossEntropyLoss(weight=...)
each time I calculate the loss? So something like this:
criterion = nn.CrossEntropyLoss(weight=weights_label_0)
loss_1 = criterion(output[0], labels_0)
criterion = nn.CrossEntropyLoss(weight=weights_label_1)
loss_2 = criterion(output[1], labels_1)
criterion = nn.CrossEntropyLoss(weight=weights_label_2)
loss_3 = criterion(output[2], labels_2)
loss = loss_1 + loss_2 + loss_3
loss.backward()
optimizer.step()
Will calling nn.CrossEntropyLoss()
every time before calculating the loss mess up the backpropagation somehow?
Why call it each time? Constructing each criterion once will suffice, à la
criterion1 = nn.CrossEntropyLoss(weight=weights_label_0)
criterion2 = nn.CrossEntropyLoss(weight=weights_label_1)
criterion3 = nn.CrossEntropyLoss(weight=weights_label_2)
loss_1 = criterion1(output[0], labels_0)
loss_2 = criterion2(output[1], labels_1)
loss_3 = criterion3(output[2], labels_2)
No, it will not.
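For what it's worth, the weight argument of nn.CrossEntropyLoss is a 1-D tensor with one entry per class. One common recipe (an assumption here, not something prescribed in this thread) is inverse class frequency; the class counts below are hypothetical:

```python
import torch
import torch.nn as nn

# Hypothetical class counts for one head's labels (3 classes).
class_counts = torch.tensor([500.0, 100.0, 25.0])

# Inverse-frequency weights, normalized so they sum to the number of classes.
weights_label_0 = class_counts.sum() / (len(class_counts) * class_counts)

criterion1 = nn.CrossEntropyLoss(weight=weights_label_0)

logits = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
loss = criterion1(logits, labels)  # scalar; rare classes contribute more per sample
```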
Should the weight vector for each class be calculated per batch or per dataset?