A model with multiple outputs

Both return None. I’m not detaching the tensors. What’d be the ideal way to figure out the root cause?

You could check the previous tensors and narrow down where the first grad_fn is set to None to isolate the issue.
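A minimal sketch of that check, assuming a hypothetical two-layer model (the names fc1, fc2 and the shapes are made up for illustration). It prints grad_fn for each intermediate tensor, and shows that the same operations return tensors without a grad_fn once gradient tracking is disabled, e.g. inside torch.no_grad():

```python
import torch
import torch.nn as nn

# Hypothetical layers and input, used only to illustrate the check.
fc1 = nn.Linear(4, 8)
fc2 = nn.Linear(8, 2)
x = torch.randn(3, 4)

# Regular forward pass: each intermediate result carries a grad_fn.
h = fc1(x)
out = fc2(h)
print("h.grad_fn:  ", h.grad_fn)
print("out.grad_fn:", out.grad_fn)

# With gradient tracking disabled, the same operations
# produce tensors whose grad_fn is None.
with torch.no_grad():
    out_no_grad = fc2(fc1(x))
print("out_no_grad.grad_fn:", out_no_grad.grad_fn)
```

Walking backwards from the loss and printing grad_fn at each step shows the first tensor where the graph is cut.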

I am using pytorch-lightning, which runs validation sanity checks by default at the beginning, hence grad_fn was showing None for both. After disabling the sanity check, I see an AddmmBackward object for the output and AddBackward0 when printing the loss. However, the backward call still fails with:
Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

Somehow, the issue is with multiple output layers. When I tested with a single output layer for one output feature, it worked fine.

Could you post an executable code snippet reproducing this issue?
This error might be raised, e.g. if you are calling backward() multiple times after the intermediate forward activations were already freed.

Thanks for the code. You are appending the outputs to self.out in each iteration. Autograd will thus try to backpropagate through the current as well as the previous iterations and calculate the gradients, which will raise the error.
Assuming you don’t want to extend the computation graph in each iteration, you could reset the self.out list inside the forward method via self.out = [].
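The original code is not shown in the thread, so the following is only a sketch under assumptions: MultiOut, trunk, and heads are hypothetical names, and the layer sizes are arbitrary. It illustrates resetting self.out at the start of forward so that each iteration only holds outputs from the current computation graph:

```python
import torch
import torch.nn as nn

class MultiOut(nn.Module):
    """Hypothetical model that collects its outputs in a list."""

    def __init__(self, n_outputs=2):
        super().__init__()
        self.trunk = nn.Linear(4, 8)
        self.heads = nn.ModuleList([nn.Linear(8, 1) for _ in range(n_outputs)])
        self.out = []

    def forward(self, x):
        # Reset the list so outputs of earlier iterations (whose graphs
        # were already freed by backward) are not kept around.
        self.out = []
        h = torch.relu(self.trunk(x))
        for head in self.heads:
            self.out.append(head(h))
        return self.out

model = MultiOut()
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(3):
    x = torch.randn(5, 4)
    targets = [torch.randn(5, 1) for _ in range(2)]
    optimizer.zero_grad()
    outputs = model(x)
    loss = sum(criterion(o, t) for o, t in zip(outputs, targets))
    loss.backward()  # succeeds in every iteration
    optimizer.step()
```

Using a local list inside forward instead of an attribute would avoid the problem in the same way.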

Apologies for re-opening such an old thread, but I have a quick question. My task is image regression, yielding 2 floats from a single image. Those 2 floats are somewhat related to each other, but mostly depend on the image.

x = self.maxpool(x)
x = x.view(-1, ...)
x = self.fc1(x)
x = self.relu(x)
x = self.dropout(x)
x = self.fc2(x)
return x

So if I want to return 2 outputs, should I do something like

x = self.fc1(x)
...
y = self.fc2(x)
return x, y

So like a:

    [Final FC]
    /        \
[out_1]  [out_2]

And the training loop would be like:-

x, (y_1, y_2) = batch
y_hat_1, y_hat_2 = self.forward(x)
loss = MSELoss(y_hat_1, y_1) + MSELoss(y_hat_2, y_2)

where x is the image, y_1 and y_2 are the two targets, and y_hat_1 and y_hat_2 are the two outputs from forward.

Would torch (lightning) be able to backprop correctly? :hugs:
Lots of thanks!

Yes, your approach looks correct and PyTorch will be able to backpropagate through both outputs.
NIT: you should create an object of the criterion and call it afterwards. Instead of MSELoss(y_hat_1, y_1) use:

criterion = nn.MSELoss()
loss1 = criterion(y_hat_1, y_1)
...
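As a rough illustration of the answer above, here is a sketch with a hypothetical shared layer and two heads (the names shared, head_1, head_2 and all sizes are assumptions, not taken from the poster's model). Summing the two losses and calling backward once populates the gradients of both heads and of the shared layer:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical shared layer and two output heads.
shared = nn.Linear(8, 4)
head_1 = nn.Linear(4, 1)
head_2 = nn.Linear(4, 1)

x = torch.randn(16, 8)
y_1 = torch.randn(16, 1)
y_2 = torch.randn(16, 1)

# Create the criterion object once, then call it on (prediction, target).
criterion = nn.MSELoss()

features = torch.relu(shared(x))
y_hat_1 = head_1(features)
y_hat_2 = head_2(features)

# A single backward call on the summed loss reaches both branches.
loss = criterion(y_hat_1, y_1) + criterion(y_hat_2, y_2)
loss.backward()

for name, layer in [("shared", shared), ("head_1", head_1), ("head_2", head_2)]:
    print(name, "grad norm:", layer.weight.grad.norm().item())
```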

Thanks for such a quick reply! Just to confirm, is creating an object for the loss required (as in, does it change how Autograd views it) or just a good convention? :+1:

You would have to create an object, as you would run into an error otherwise since the input tensors are understood as the __init__ arguments:

output = torch.randn(1, 1, requires_grad=True)
target = torch.randn(1, 1)

# works
criterion = nn.MSELoss()
loss = criterion(output, target)
loss.backward()

# fails
loss = nn.MSELoss(output, target)
# > UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
#  warnings.warn(warning.format(ret))
loss.backward() 
# > AttributeError: 'MSELoss' object has no attribute 'backward'

:100: :rocket: Thanks a lot man, really appreciate it!

I tried using this sort of architecture:

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.maxpool2(x)
        x = self.conv3(x)
        x = self.relu(x)
        x = self.maxpool3(x)

        gate = x.view(-1)

        x = self.fc1(gate)
        x = self.relu(x)
        x = self.fc1(x)

        y = self.fc1(gate)
        y = self.relu(x)
        y = self.fc2(x)
        return x, y

with hidden_size=4608, it is returning tensors of

target: torch.Size([2]) # gt
output: torch.Size([4608])  #forward pass

which seems to indicate that it's returning the gate layer, when it shouldn't be? :thinking:
some more of the full details here:- hastebin with attached stdout

the final tensor printed is in the MAPE calculation (which is the first thing lightning does) and it looks a lot like gate

→ I am doing Image regression, where I would like to return 2 floats.
TIA! :hugs:

gate = x.view(-1) looks wrong, as you are flattening the entire activation tensor and are thus moving the feature dimension to the batch dimension.
Use gate = x.view(x.size(0), -1) instead.
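To make the difference concrete, here is a small sketch. The activation shape is hypothetical, chosen only so that each sample has 4608 features as in the post above:

```python
import torch

# Hypothetical activation: batch of 2, 128 channels, 6 x 6 spatial size,
# i.e. 128 * 6 * 6 = 4608 features per sample.
x = torch.randn(2, 128, 6, 6)

# Flattens everything, including the batch dimension.
flat_all = x.view(-1)

# Keeps the batch dimension and flattens only the features.
per_sample = x.view(x.size(0), -1)

print(flat_all.shape)    # torch.Size([9216])
print(per_sample.shape)  # torch.Size([2, 4608])
```

torch.flatten(x, 1) gives the same per-sample layout.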

In your code snippet you are also reusing self.fc1 and are replacing the y output in its second usage:

        y = self.fc1(gate)
        y = self.relu(x)
        y = self.fc2(x)

y would thus be the output of self.fc2(x) and the two previous lines of code can be removed.
Also note that x is already the output of x = self.fc1(x) from the previous code block.
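One way the two branches could be written is sketched below. This is only an assumption about the intended design: the class name, the separate per-branch layers (fc1_x, fc2_x, fc1_y, fc2_y), and the hidden size are hypothetical, and the convolutional part is left out so the example stays short.

```python
import torch
import torch.nn as nn

class TwoHeadRegressor(nn.Module):
    """Hypothetical two-branch head on top of flattened conv features."""

    def __init__(self, in_features=4608, hidden_size=64):
        super().__init__()
        self.relu = nn.ReLU()
        # Each branch gets its own layers, so no layer is reused by accident.
        self.fc1_x = nn.Linear(in_features, hidden_size)
        self.fc2_x = nn.Linear(hidden_size, 1)
        self.fc1_y = nn.Linear(in_features, hidden_size)
        self.fc2_y = nn.Linear(hidden_size, 1)

    def forward(self, features):
        # Keep the batch dimension when flattening.
        gate = features.view(features.size(0), -1)
        # Both branches start from gate and only touch their own variable.
        x = self.fc2_x(self.relu(self.fc1_x(gate)))
        y = self.fc2_y(self.relu(self.fc1_y(gate)))
        return x, y

model = TwoHeadRegressor()
features = torch.randn(2, 128, 6, 6)  # stand-in for the conv/maxpool output
x, y = model(features)
print(x.shape, y.shape)
```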

@ptrblck Hi,
I would like to ask: is it possible to implement multiple outputs with a model architecture provided by MONAI?
I am not sure whether the following sample code contains errors.

class Net(pytorch_lightning.LightningModule):
    def __init__(self):
        super().__init__()
        x = monai.networks.nets.DenseNet121(spatial_dims=3, 
                                            in_channels=1, 
                                            out_channels=2)
        self.function_loss1 = monai.losses.FocalLoss()
        self.function_loss2 = monai.losses.FocalLoss()

    def forward(self, x):
          out1 = self.fc1(x)
          out2 = self.fc2(x)
          return out1, out2

    def training_step(self, batch, batch_idx):
          images, label1, label2 = batch["image"], batch["label1"], batch["label2"]
          output1, output2 = self.forward(images)
          loss = self.self.function_loss1(output1, label1)+self.self.function_loss2(output2, label2)
          tensorboard_logs = {"train_loss": loss.item()}
          return {"loss": loss, "log": tensorboard_logs}

    def fc1(self, x):  #fc2 the same
          x = torch.nn.Linear(x, 10)
          x = torch.nn.Linear(x, 10)
          return x

My biggest question is how many out_channels should I set in monai.networks.nets.DenseNet121?

I don’t see where self.fc1 and self.fc2 are defined so would expect your code to fail.

I guess it depends on your use case, but I think the number of out_channels would correspond to the number of classes you are dealing with.

Assuming my self.fc1 and self.fc2 are as follows, is this possible?

def fc1(self, x): 
          x = torch.nn.Linear(x, 10)
          x = torch.nn.Linear(x, 10)
          return x

def fc2(self, x):
          x = torch.nn.Linear(x, 10)
          x = torch.nn.Linear(x, 10)
          return x

Thank you!

The initialization of nn.Linear layers is wrong, as you need to create the modules first and call them on the input later. Take a look at these tutorials for a quick overview.
Once properly initialized, you could certainly use fc1 and fc2, but note that the DenseNet121 is also never used.
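A minimal sketch of that pattern, under stated assumptions: TwoHeadNet, feature_dim, and the class counts are hypothetical, the ReLU between the linear layers is an addition for illustration, and a small stand-in backbone replaces the MONAI DenseNet121 so the example runs without MONAI installed. The layers are created once in __init__ and then called on the backbone output in forward:

```python
import torch
import torch.nn as nn

class TwoHeadNet(nn.Module):
    """Hypothetical wrapper: one backbone, two classification heads."""

    def __init__(self, backbone, feature_dim, num_classes_1, num_classes_2):
        super().__init__()
        self.backbone = backbone
        # Modules are created here, not inside forward.
        self.fc1 = nn.Sequential(
            nn.Linear(feature_dim, 10), nn.ReLU(), nn.Linear(10, num_classes_1)
        )
        self.fc2 = nn.Sequential(
            nn.Linear(feature_dim, 10), nn.ReLU(), nn.Linear(10, num_classes_2)
        )

    def forward(self, x):
        # The backbone is actually used, and both heads share its features.
        features = self.backbone(x)
        return self.fc1(features), self.fc2(features)

# Stand-in backbone for a 3D single-channel input (assumption, not MONAI).
backbone = nn.Sequential(nn.Flatten(), nn.Linear(1 * 8 * 8 * 8, 32), nn.ReLU())
net = TwoHeadNet(backbone, feature_dim=32, num_classes_1=2, num_classes_2=3)

images = torch.randn(4, 1, 8, 8, 8)
out1, out2 = net(images)
print(out1.shape, out2.shape)
```

With a real backbone, feature_dim would be whatever size that backbone outputs per sample.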

Hi @H_i_Tr_n_Minh,

You need to use the instance of the class, not the class itself. So,

cbamfc = CBamFC(vgg16_config[i-1], flat[c]) #class instance
out, m = cbamfc(x) #where x is the input

Also, you need to change self.Cbam to self.cbam or your model will crash. (As you will have an undefined object)

Thank you so much. How would I get m? “m” is the dense layer after Cbam, and I use it for another classification.

You already return m with,

cbamfc = CBamFC(vgg16_config[i-1], flat[c]) #class instance
out, m = cbamfc(x) #where x is the input

Do you mean to return the layer itself? Or return the values before you use the ReLU function?

My problem is: I create a cbam layer and a dense layer after it, but I only want to forward through the cbam layer, not the dense layer. So I want to split the 2 values in order to call vggnet with the cbam layer.