I am somewhat of a beginner to pytorch. I am implementing a paper where they have a classification CNN (input -> convolutional layers -> dense layers -> output). However, each filter of the final convolutional layer has its own loss calculation. During back-propagation, the gradient of the final loss of the network output is summed with the gradient of the loss for each particular filter. This is then back-propagated to lower convolutional layers as per usual.

I have no idea how to implement something like this. I have only ever implemented pre-defined loss functions (e.g. nn.CrossEntropyLoss()) and only for the final output of the network.

Do you have any suggestions of how to get started?

You can have the model return both the last conv layer output and the final output:

def forward(self, x):
    [...]
    c = self.last_conv(...)
    x = self.dense(x)
    return c, x
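To make that concrete, here is a minimal self-contained model sketch. All the sizes (1 input channel, 8 final filters, 10 classes, 28×28 inputs) are placeholder assumptions, not taken from your paper:

```python
import torch
import torch.nn as nn

class TwoHeadCNN(nn.Module):
    """Toy CNN that returns (last conv activations, final logits)."""

    def __init__(self):
        super().__init__()
        # Hypothetical architecture — adapt channels/sizes to the paper.
        self.features = nn.Sequential(
            nn.Conv2d(1, 4, 3, padding=1),
            nn.ReLU(),
        )
        self.last_conv = nn.Conv2d(4, 8, 3, padding=1)
        self.dense = nn.Sequential(
            nn.Flatten(),
            nn.Linear(8 * 28 * 28, 10),
        )

    def forward(self, x):
        x = self.features(x)
        c = self.last_conv(x)          # per-filter activations, shape (N, 8, 28, 28)
        x = self.dense(torch.relu(c))  # final logits, shape (N, 10)
        return c, x
```

Returning `c` alongside the logits is all you need — autograd keeps the graph through both outputs, so a loss computed on `c` will flow back into `last_conv` and everything below it.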

Next, depending on how you calculate the "loss for each particular filter," you can define your own loss function and sum it with the usual output loss (e.g. nn.CrossEntropyLoss):

def special_loss(c):
    # Implements the paper's per-filter algorithm
    return l

criterion = nn.CrossEntropyLoss()
c, output = model(input)
final_loss = criterion(output, target) + special_loss(c)
final_loss.backward()

Keep in mind that a loss function is just normal PyTorch code that returns a scalar tensor (on which you then call backward()).
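Putting the pieces together, here is a sketch of one full training step. The per-filter loss below (mean absolute activation of each filter, summed) is a made-up stand-in for your paper's algorithm, purely to show that autograd sums the two gradient contributions for you:

```python
import torch
import torch.nn as nn

# Toy layers — all shapes here are illustrative assumptions.
conv1 = nn.Conv2d(1, 4, 3, padding=1)      # an earlier conv layer
last_conv = nn.Conv2d(4, 8, 3, padding=1)  # the layer with per-filter losses
dense = nn.Linear(8 * 28 * 28, 10)

def special_loss(c):
    # Stand-in for the paper's algorithm: one scalar per filter
    # (mean |activation| over batch and spatial dims), then summed.
    per_filter = c.abs().mean(dim=(0, 2, 3))  # shape (num_filters,)
    return per_filter.sum()

x = torch.randn(2, 1, 28, 28)
target = torch.tensor([3, 7])

h = torch.relu(conv1(x))
c = last_conv(h)
output = dense(torch.relu(c).flatten(1))

final_loss = nn.CrossEntropyLoss()(output, target) + special_loss(c)
final_loss.backward()

# Because both terms depend on conv1's output, conv1.weight.grad now
# holds the *sum* of the gradients from the output loss and the
# per-filter loss — exactly the behavior the paper describes.
```

No manual gradient surgery is needed: summing the losses before `backward()` is equivalent to summing the gradients, by linearity of differentiation.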