How to extract features from network output

Hi! I’m trying to implement a perceptual loss. To do that, I need to extract features from my network output and compare them with the features extracted from the ground truths.

For that I’ve created a custom loss class in PyTorch. However, the output of my network is not a valid input for the feature extractor, so I need to transform it inside the forward method of my custom loss class.

I’m getting the following error:

RuntimeError: Output 0 of AliasBackward0 is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

I believe the problem is that I can’t change a tensor with gradients in place… How can I get around it? I need to transform the output of my network in order to extract the features correctly with a VGG network… is this impossible to do? Essentially I want to do something like this:

perceptual_criterion = VGGLoss(_min=_min, _max=_max).to(device)
other_criterion_1 = ...
other_criterion_2 = ...

...
# Inside training loop...
perceptual_loss = perceptual_criterion(network_outputs, ground_truths)
...

# Total loss
total_loss = perceptual_loss + other_criterion_1_loss + other_criterion_2_loss

# Backward pass
total_loss.backward()
optimizer.step()

This is the forward method of my custom loss:

def forward(self, predictions, ground_truths):
    # Preprocess data as expected by VGG
    transformed_prediction = torch.stack([self.vgg_transforms(prediction) for prediction in predictions], dim=0)
    transformed_ground_truth = torch.stack([self.vgg_transforms(ground_truth) for ground_truth in ground_truths], dim=0)

    # Convert 3D data to 2D
    transformed_prediction_2d = transformed_prediction.view(-1, 3, 224, 224)
    transformed_ground_truth_2d = transformed_ground_truth.view(-1, 3, 224, 224)

    return self.criterion(self.vgg(transformed_prediction_2d), self.vgg(transformed_ground_truth_2d))

The VGG model I’m trying to use to extract features is a 2D model, so I need to reshape my data from 3D to 2D. The error comes from self.vgg_transforms, which I believe changes the tensor in place…

The transforms are implemented with MONAI:

# VGG transforms
self.vgg_transforms = transforms.Compose([
    transforms.Resize(spatial_size=(256, 256, 256), mode="trilinear"),
    transforms.CenterSpatialCrop(roi_size=(224, 224, 224)),
    transforms.NormalizeIntensity(subtrahend=_min, divisor=(_max - _min)),
    transforms.RepeatChannel(repeats=3),
    transforms.ToTensor(),
    transforms.NormalizeIntensity(subtrahend=[0.485, 0.456, 0.406], divisor=[0.229, 0.224, 0.225], channel_wise=True),
])

Am I doing something wrong? Is the problem MONAI-related? Can I change a tensor with gradients?

Thank you :slight_smile:

I would recommend checking which transformation is performed in-place first. Once you’ve isolated it, you could implement an out-of-place version by reusing its code.
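
For example, something along these lines would show which step modifies its input (just a rough sketch with toy callables standing in for the real transforms; you could run the same clone-and-compare loop over the individual transforms in your pipeline):

import torch

def scale_inplace(img):
    # toy stand-in for a transform that writes into its input
    img *= 2.0
    return img

def scale_outofplace(img):
    # toy stand-in for a transform that returns a new tensor
    return img * 2.0

x = torch.rand(1, 8, 8, 8)
for t in [scale_outofplace, scale_inplace]:
    before = x.clone()
    out = t(x)
    if not torch.equal(before, x):
        print(f"{t.__name__} modified its input in place")
    x = out
# prints: scale_inplace modified its input in place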

Hi :slight_smile:
I have spotted the in-place operation, but I just can’t write a valid out-of-place version that avoids the error…

The error comes from the line: img[slices] = (img[slices] - _sub) / _div

I’ve tried clone(), but even that doesn’t work…

You could keep iterating over the tensor, as you are already doing via slices, and append the outputs to a new list, which can then be stacked into the output tensor:

import torch

img = torch.randn(3, 224, 224)
out = []
for slices in range(img.size(0)):
    # out-of-place: build new tensors instead of writing back into img
    out.append((img[slices] - 1.) / 2.)
out = torch.stack(out)
print(out.shape)
# torch.Size([3, 224, 224])
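
Applied to the normalization line you quoted, an out-of-place version could look like the sketch below (normalize_channelwise is just a hypothetical helper; the per-channel values are the ImageNet mean/std from your transform):

import torch

def normalize_channelwise(img, subtrahend, divisor):
    # builds a new tensor per channel instead of writing back into img[slices]
    out = [(img[c] - s) / d for c, (s, d) in enumerate(zip(subtrahend, divisor))]
    return torch.stack(out, dim=0)

img = torch.rand(3, 224, 224, requires_grad=True)
normalized = normalize_channelwise(img, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
normalized.mean().backward()  # gradients flow back; no in-place error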