Hi,
first of all, thanks to everyone working on this awesome library!
I’ve been working with PyTorch for a couple of months now, but I’ve reached a point where I cannot explain a specific behavior. I am sorry that I could not come up with a smaller minimal example, but the problem does not appear in the smaller versions:
"""
@author: Utku Ozbulak - github.com/utkuozbulak
Available at: https://github.com/utkuozbulak/pytorch-cnn-visualizations/blob/master/src/cnn_layer_visualization.py
Small modifications:
"""
import numpy as np
import torch
from torch.optim import Adam
from torchvision import models


class CNNLayerVisualization:
    """
    Produces an image that minimizes the loss of a convolution
    operation for a specific layer and filter
    """

    def __init__(self, model, selected_layer, selected_filter):
        self.model = model
        self.model.eval()
        self.selected_layer = selected_layer
        self.selected_filter = selected_filter
        self._conv_output = None

    def visualise_layer(self):
        # Generate a pseudo-random image
        np.random.seed(123)
        random_image = np.random.uniform(0, 1, (3, 224, 224))
        # Process image => For simplicity, do not transform here:
        # processed_image = preprocess_image(random_image, False)
        # Add one more dimension to the beginning. Tensor shape = 1,3,224,224
        processed_image = torch.from_numpy(random_image).float().unsqueeze_(0)
        processed_image.requires_grad = True
        # Define optimizer for the image
        optimizer = Adam([processed_image], lr=0.1)
        for i in range(1, 5):
            optimizer.zero_grad()
            # Assign the created image to a variable to move forward in the model
            x = processed_image
            for index, layer in enumerate(self.model):
                # Forward pass layer by layer
                # x is not used after this point because it is only needed to trigger
                # the forward hook function
                x = layer(x)
                # Only need to forward until the selected layer is reached
                if index == self.selected_layer:
                    self._conv_output = x[0, self.selected_filter]
                    # TODO: Explain why it makes a difference when the following
                    # line is removed!
                    break
            # Loss function is the negative mean of the output of the selected
            # layer/filter, so minimizing it maximizes that filter's mean output
            loss = -torch.mean(self._conv_output)
            loss.backward()
            print("Loss:", "{0:.2f}".format(loss.item()))
            optimizer.step()


if __name__ == "__main__":
    cnn_layer = 0
    filter_pos = 2
    pretrained_model = models.vgg16(pretrained=True).features
    layer_vis = CNNLayerVisualization(pretrained_model, cnn_layer, filter_pos)
    layer_vis.visualise_layer()
The problem lies in the TODO line. I wanted to visualize different filters using gradient ascent, and I’ve used this awesome GitHub project as a reference.
Our own model, however, is not iterable like the vgg16 model used in the repository. In my opinion, it shouldn’t make a difference whether the input image, which is being optimized, goes through the full model or stops at the desired layer, as the backward pass only covers the operations leading up to the desired layer/filter. Smaller tests confirm this. But if you run the given script once with the break line and once without it, the loss differs significantly. I cannot explain why and hope that somebody can demystify this for me.
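For reference, here is a minimal sketch of the kind of smaller test I mean. The names `toy_model` and `filter_loss` are made up for this illustration and are not part of the script above. The toy layers are randomly initialized and use an out-of-place activation (`nn.ReLU()` with its default `inplace=False`), which is an assumption of this sketch and may not match how the layers of the pretrained model are configured. In this setup, the loss and the gradient with respect to the image come out the same with and without the early stop:

```python
import torch
from torch import nn


def filter_loss(model, image, selected_layer, selected_filter, stop_early):
    """Forward layer by layer and return the negative mean of one filter's output."""
    x = image
    conv_output = None
    for index, layer in enumerate(model):
        x = layer(x)
        if index == selected_layer:
            conv_output = x[0, selected_filter]
            if stop_early:
                break  # same role as the `break` in the full script
    return -torch.mean(conv_output)


torch.manual_seed(0)
# Hypothetical toy model: out-of-place ReLU, random weights (assumption).
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(4, 4, kernel_size=3, padding=1),
).eval()

image = torch.rand(1, 3, 8, 8, requires_grad=True)

loss_with_break = filter_loss(toy_model, image, 0, 2, stop_early=True)
loss_without_break = filter_loss(toy_model, image, 0, 2, stop_early=False)

# Gradients of each loss with respect to the input image
(grad_with_break,) = torch.autograd.grad(loss_with_break, image)
(grad_without_break,) = torch.autograd.grad(loss_without_break, image)

print("same loss:", torch.allclose(loss_with_break, loss_without_break))
print("same gradient:", torch.allclose(grad_with_break, grad_without_break))
```

So at least for a toy model built this way, running the remaining layers after the selected one does not change the result, which is why the behavior of the full script surprises me.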
Thanks!