Calculating loss for sub-networks

I have a CNN that consists of multiple small Sequential modules: a shared convolutional backbone followed by several dense heads, each with a different number of output neurons, and I return all of the head outputs at the end. Something like this:

import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn_module = nn.Sequential()  # contains the CNN layers
        self.network1 = nn.Sequential()    # Dense layers, head 1
        self.network2 = nn.Sequential()    # Dense layers, head 2
        self.network3 = nn.Sequential()    # Dense layers, head 3

    def forward(self, x):
        x = self.cnn_module(x)
        x = x.flatten(1)  # flatten everything except the batch dimension
        out_1 = self.network1(x)
        out_2 = self.network2(x)
        out_3 = self.network3(x)
        return [out_1, out_2, out_3]

I am confused about how to calculate the loss for each of the networks and then backpropagate the losses together. Would something like this work:

criterion = nn.CrossEntropyLoss()
model = CNN()

optimizer.zero_grad()
output = model(images)
loss_1 = criterion(output[0], labels_0)
loss_2 = criterion(output[1], labels_1)
loss_3 = criterion(output[2], labels_2)
loss = loss_1 + loss_2 + loss_3
loss.backward()
optimizer.step()

I don’t see any reason why this shouldn’t work. Did you give it a go?

If I am seeing this right, what you are doing is essentially equivalent to concatenating all the outputs along the batch dimension and running a single CrossEntropyLoss against the concatenated labels (up to a constant scale factor from the mean reduction, and assuming the heads predict over the same number of classes).
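For illustration, here is a minimal self-contained sketch (the layer sizes and data are made up) showing that summing the head losses and calling backward() once fills in gradients for the shared backbone as well as for every head:

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for the model above: shared backbone + three heads
# with different numbers of output classes.
backbone = nn.Sequential(nn.Conv2d(1, 4, 3, padding=1), nn.ReLU())
head1 = nn.Linear(4 * 8 * 8, 5)  # 5 classes
head2 = nn.Linear(4 * 8 * 8, 3)  # 3 classes
head3 = nn.Linear(4 * 8 * 8, 2)  # 2 classes

criterion = nn.CrossEntropyLoss()

images = torch.randn(16, 1, 8, 8)
labels_0 = torch.randint(0, 5, (16,))
labels_1 = torch.randint(0, 3, (16,))
labels_2 = torch.randint(0, 2, (16,))

features = backbone(images).flatten(1)
loss = (criterion(head1(features), labels_0)
        + criterion(head2(features), labels_1)
        + criterion(head3(features), labels_2))
loss.backward()

# Gradients now exist on the shared backbone and on every head.
print(backbone[0].weight.grad is not None)  # True
print(head1.weight.grad is not None)        # True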


So if that’s correct, how can I pass weighted values to the loss function for each of the sub-networks, since the labels are different for each sub-network? Do I have to call nn.CrossEntropyLoss(weight=...) each time I calculate the loss? So something like this:

criterion = nn.CrossEntropyLoss(weight=weights_label_0)
loss_1 = criterion(output[0], labels_0)

criterion = nn.CrossEntropyLoss(weight=weights_label_1)
loss_2 = criterion(output[1], labels_1)

criterion = nn.CrossEntropyLoss(weight=weights_label_2)
loss_3 = criterion(output[2], labels_2)

loss = loss_1 + loss_2 + loss_3
loss.backward()
optimizer.step()

Will calling nn.CrossEntropyLoss() every time before calculating the loss mess up the backpropagation somehow?

Why call it each time? Constructing each criterion just once will suffice, à la

criterion1 = nn.CrossEntropyLoss(weight=weights_label_0)
criterion2 = nn.CrossEntropyLoss(weight=weights_label_1)
criterion3 = nn.CrossEntropyLoss(weight=weights_label_2)

loss_1 = criterion1(output[0], labels_0)
loss_2 = criterion2(output[1], labels_1)
loss_3 = criterion3(output[2], labels_2)

No, it will not. nn.CrossEntropyLoss holds no state that autograd cares about (it just stores the weight tensor and the reduction mode), so re-instantiating it every iteration is wasteful but harmless.
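Putting the pieces together, a full training step could look like the sketch below. The weight values, optimizer settings, and loader are assumptions for illustration; CNN is the class from above, assuming its Sequential modules were filled with real layers:

import torch
import torch.nn as nn

# Hypothetical per-class weights; one entry per class of each head.
weights_label_0 = torch.tensor([1.0, 2.0, 1.0, 1.0, 0.5])
weights_label_1 = torch.tensor([1.0, 1.0, 3.0])
weights_label_2 = torch.tensor([0.5, 2.0])

criterion1 = nn.CrossEntropyLoss(weight=weights_label_0)
criterion2 = nn.CrossEntropyLoss(weight=weights_label_1)
criterion3 = nn.CrossEntropyLoss(weight=weights_label_2)

model = CNN()  # assumes the Sequential modules contain real layers
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for images, labels_0, labels_1, labels_2 in loader:  # your DataLoader
    optimizer.zero_grad()
    out_1, out_2, out_3 = model(images)
    loss = (criterion1(out_1, labels_0)
            + criterion2(out_2, labels_1)
            + criterion3(out_3, labels_2))
    loss.backward()
    optimizer.step()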

Should the weight vector for each class be calculated per batch or per dataset?
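For what it’s worth, class weights are usually computed once over the whole training set rather than per batch: per-batch estimates are noisy, and a class missing from a batch would get an undefined weight. A sketch, assuming a hypothetical train_labels tensor holding the class index of every training sample for one head:

import torch

# Hypothetical: class indices of all training samples for one head (5 classes).
train_labels = torch.randint(0, 5, (10_000,))

counts = torch.bincount(train_labels, minlength=5).float()
weights_label_0 = 1.0 / counts                          # inverse class frequency
weights_label_0 *= len(counts) / weights_label_0.sum()  # rescale so weights average to 1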