How can I load my best model as a feature extractor/evaluator?

If the pretrained model was used for a classification or segmentation use case, it most likely already returns logits, since nn.CrossEntropyLoss (and nn.BCEWithLogitsLoss) expect raw logits as the model output.
Alternatively, log probabilities could be returned and you could use torch.exp to get the probabilities.
Since the logits or log probabilities would be the model output, I don’t think you need hooks for it.
Could you explain your use case a bit more so that we could check e.g. where the logits are created and if you need hooks for it?
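For example, a minimal sketch (the model and shapes are made up; it just assumes a classifier trained with nn.CrossEntropyLoss, i.e. returning raw logits):

import torch
import torch.nn as nn

# stand-in for the pretrained classifier (trained with nn.CrossEntropyLoss -> raw logits)
model = nn.Linear(10, 5)
x = torch.randn(2, 10)

logits = model(x)                      # raw logits, shape [2, 5]
probs = torch.softmax(logits, dim=1)   # probabilities

# if the model instead returned log probabilities (e.g. via nn.LogSoftmax),
# exponentiating recovers the probabilities
log_probs = torch.log_softmax(logits, dim=1)
probs_from_log = torch.exp(log_probs)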


Thank you for your reply. My main intention is to use the logits for implementing knowledge distillation, i.e., softened probabilities. I found that the pre-trained models output the logits, so you are right: no need for hooks in this case.
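For reference, this is roughly how the softening looks (a minimal sketch following Hinton et al.; the temperature value and tensor names are made up):

import torch
import torch.nn.functional as F

# hypothetical logits from the pretrained teacher and the student
teacher_logits = torch.randn(8, 10)
student_logits = torch.randn(8, 10, requires_grad=True)

T = 4.0  # distillation temperature (made-up value)
teacher_probs = F.softmax(teacher_logits / T, dim=1)            # softened targets
student_log_probs = F.log_softmax(student_logits / T, dim=1)
# KL divergence between the softened distributions, scaled by T**2 (Hinton et al.)
kd_loss = F.kl_div(student_log_probs, teacher_probs, reduction='batchmean') * T ** 2
kd_loss.backward()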

magnificent, works like a charm! Thanks!

Thanks for this code!

I have some doubts about how to proceed when I am interested in the activations of many layers.
What could I do to save all the activations (neurons) of, for example, resnet50?

Would I need to write this for each layer?

You could iterate all modules and register hooks with them as seen here.
I’m not sure if there is a cleaner way of checking for “layers” besides the used if condition.
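E.g. a rough sketch for resnet50 (the isinstance check and the printed key are just examples; adapt them to the layers you need):

import torch
import torch.nn as nn
import torchvision.models as models

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

model = models.resnet50()
for name, module in model.named_modules():
    # register a hook on every conv layer; named_modules() yields names such as 'layer4.2.conv3'
    if isinstance(module, nn.Conv2d):
        module.register_forward_hook(get_activation(name))

out = model(torch.randn(1, 3, 224, 224))
print(activation['layer4.2.conv3'].shape)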

Thanks for the reply. I still don’t get what I should do when I use the same module twice (in different layers).
What is happening in such a case?
See for example below, max_pool is used twice.

class LeNet5(nn.Module):
    def __init__(self, dim=32, in_channels=1,
                 out_channels_1=6, out_channels_2=16,
                 kernel_size=5, stride=1, padding=0, dilation=1,
                 mp_kernel_size=2, mp_stride=2, mp_padding=0, mp_dilation=1,
                 fcsize1=120, fcsize2=84, nclasses=10):

        super(LeNet5, self).__init__()

        # helpers for calculating the dimension after a conv/max_pool op
        def convdim(dim):
            return (dim + 2*padding - dilation * (kernel_size - 1) - 1)//stride + 1

        def mpdim(dim):
            return (dim + 2*mp_padding - mp_dilation * (mp_kernel_size - 1) - 1)//mp_stride + 1

        self.conv1 = nn.Conv2d(in_channels, out_channels_1, kernel_size, stride)
        self.max_pool = nn.MaxPool2d(mp_kernel_size,
                                     stride=mp_stride,
                                     padding=mp_padding,
                                     dilation=mp_dilation)
        self.conv2 = nn.Conv2d(out_channels_1, out_channels_2, kernel_size, stride)

        # final dimension after applying conv->max_pool->conv->max_pool
        dim = mpdim(convdim(mpdim(convdim(dim))))
        self.fc1 = nn.Linear(out_channels_2 * dim * dim, fcsize1)
        self.fc2 = nn.Linear(fcsize1, fcsize2)
        self.fc3 = nn.Linear(fcsize2, nclasses)

    def forward(self, x):
        nsamples = x.shape[0]
        x1 = F.relu(self.conv1(x))
        x2 = self.max_pool(x1)
        x3 = F.relu(self.conv2(x2))
        x4 = self.max_pool(x3)
        x5 = x4.view(nsamples, -1)
        x6 = F.relu(self.fc1(x5))
        x7 = F.relu(self.fc2(x6))
        x8 = self.fc3(x7)
        return x8

In that case, the hook would overwrite the activation and you could either append the activations to the key in the activation dict or create different pooling layers in your model.
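E.g. a minimal sketch of the appending approach, reusing the LeNet5 from above (assuming torch and torch.nn.functional as F are imported):

activation = {}
def get_activation(name):
    def hook(model, input, output):
        # append instead of overwrite, so repeated calls to the same module are kept
        activation.setdefault(name, []).append(output.detach())
    return hook

model = LeNet5()
model.max_pool.register_forward_hook(get_activation('max_pool'))
out = model(torch.randn(1, 1, 32, 32))
# the forward pass calls max_pool twice, so the list holds both activations
print(len(activation['max_pool']))   # 2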

Hi…my code is:

class MyModelA(nn.Module):
    def __init__(self, my_pretrained_model):
        super(MyModelA, self).__init__()
        self.model = nn.Sequential(*list(my_pretrained_model.children())[:-1])
        self.fc = nn.Sequential(nn.Linear(2048, 512),
                                nn.ReLU(),
                                nn.Dropout(p=0.5),
                                nn.Linear(512, 2),
                                nn.LogSoftmax(dim=1))
        for p in self.model.parameters():
            p.requires_grad = False

    def forward(self, x):
        x = self.model(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

model = MyModelA(my_pretrained_model=model).to(device)

After that I wrote:

activation = {}

def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

How can I use this function for my use case? I want to know the output of fc[0]. Could you please write this code for me?
model.fc[0].register_forward_hook(get_activation('fc[0]'))???

My network structure is similar to @neda_vida:

class CNN(nn.Module):
    def __init__(self, num_classes):
        nn.Module.__init__(self)
        self.conv1 = nn.Sequential(
            # Input shape (3, 64, 64)
            nn.Conv2d(
                in_channels=3,
                out_channels=6,
                kernel_size=5,
                stride=1,
                padding=2
            ),
            # Output shape (6, 64, 64)
            nn.ReLU(),
            # Output shape (6, 32, 32)
            nn.MaxPool2d(kernel_size=2)
        )
        self.conv2 = nn.Sequential(
            # Input shape (6, 32, 32)
            nn.Conv2d(
                in_channels=6,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2
            ),
            # Output shape (16, 32, 32)
            nn.ReLU(),
            # Output shape (16, 16, 16)
            nn.MaxPool2d(kernel_size=2)
        )
        self.fc = nn.Sequential(
            # FC head: flattened features (16 * 16 * 16) -> num_classes
            nn.Linear(in_features=16 * 16 * 16,
                      out_features=300),
            nn.ReLU(),
            nn.Linear(in_features=300,
                      out_features=84),
            nn.ReLU(),
            nn.Linear(in_features=84,
                      out_features=num_classes)
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size()[0], -1)
        x = self.fc(x)
        return x

when I do the activation hook:

        output = net(image)
        net.fc.register_forward_hook(get_activation('fc'))
        print(activation['fc'])

I got an error:

Traceback (most recent call last):
  File "C:/Users/dugr/PycharmProjects/NN/main.py", line 118, in <module>
    test_one_sample()
  File "C:/Users/dugr/PycharmProjects/NN/main.py", line 98, in test_one_sample
    print(activation['fc[0]'])
KeyError: 'fc[0]'

I’m not sure how to get the output of a layer when the network is implemented this way, or how I could rewrite this network so that it works with the activation hook code.
Thank you

You are trying to access an undefined key 'fc[0]' in print(activation['fc[0]']), while you are registering the hook with 'fc'.

Also, you are registering the hook after the forward pass, so you would have to rerun the forward pass to store the activation or register the hook before the first forward pass.

Yes, you are right. The order of execution was the main reason for the bug.
After fixing it as shown below, there are no more errors. Thank you!

import torch
import torch.nn as nn


class CNN(nn.Module):
    def __init__(self, num_classes):
        nn.Module.__init__(self)
        self.conv1 = nn.Sequential(
            # Input shape (3, 64, 64)
            nn.Conv2d(
                in_channels=3,
                out_channels=6,
                kernel_size=5,
                stride=1,
                padding=2
            ),
            # Output shape (6, 64, 64)
            nn.ReLU(),
            # Output shape (6, 32, 32)
            nn.MaxPool2d(kernel_size=2)
        )
        self.conv2 = nn.Sequential(
            # Input shape (6, 32, 32)
            nn.Conv2d(
                in_channels=6,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2
            ),
            # Output shape (16, 32, 32)
            nn.ReLU(),
            # Output shape (16, 16, 16)
            nn.MaxPool2d(kernel_size=2)
        )
        self.fc = nn.Sequential(
            # FC head: flattened features (16 * 16 * 16) -> num_classes
            nn.Linear(in_features=16 * 16 * 16,
                      out_features=300),
            nn.ReLU(),
            nn.Linear(in_features=300,
                      out_features=84),
            nn.ReLU(),
            nn.Linear(in_features=84,
                      out_features=num_classes)
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size()[0], -1)
        x = self.fc(x)
        return x

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

if __name__ == '__main__':
    import torch
    x = torch.randn(1, 3, 64, 64)
    net = CNN(10)
    net.fc[4].register_forward_hook(get_activation('fc[4]'))
    output = net(x)
    print(activation['fc[4]'])
    net.fc.register_forward_hook(get_activation('fc'))
    output = net(x)
    print(activation['fc'])

When I use this method with a TorchScript module, it shows the following error:

RuntimeError: register_forward_hook is not supported on ScriptModules

Hi,
I don’t know if your answer about using hooks might be outdated, since at the time of writing (2021) it seems that hooks are commonly used for debugging purposes instead, and I think I have a better solution: using model.layer_name. So, we would have:

# same as before
model = MyModel(...)
model.load_state_dict(my_model['state_dict'])
model.eval()
# instead of using a hook, and let's say the middle layer's name is fc3, we use:
output = model.fc3(new_sample)

What do you think? Do I misunderstand the question or am I wrong about something?

Usually you would like to get an intermediate activation from a specific layer, which was created by passing the input through all layers before it.
In your current example you are passing the input directly to model.fc3. While this could work, note that the new_sample shape would need to fit the self.fc3 shape (so it’s often not the original input shape) and also you won’t be using any layers before it.
Hooks are not outdated and can still be used.
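To illustrate the difference, a small sketch (the modules and shapes are made up):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 5),   # stand-in for the "middle" layer, e.g. fc3
)
x = torch.randn(1, 10)

# hook: stores the activation of model[2] after x passed through all preceding layers
feats = {}
model[2].register_forward_hook(lambda m, inp, out: feats.update(fc3=out.detach()))
_ = model(x)

# calling the layer directly skips the preceding layers and needs a matching input shape
direct = model[2](torch.randn(1, 20))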


Thank you so much for your quick reply and for clarifying the problem! I’ve just checked my code again and saw how I can pass the input directly into the layer and still get the same result as using a hook. It turns out that I’m trying to get the output of the whole first block (multiple layers) of the model, so the situation is the same as getting the output of the model’s first layer.

Thank you for your work. Sorry, I am new to PyTorch; why should the output be detached in the forward hook?

It depends on your use case: if you want to store the activations, e.g. for debugging or printing, you could detach them. On the other hand, if you want to calculate a loss and the gradients afterwards (via backward()), you shouldn’t detach them.
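As a small sketch of both variants:

activation = {}

# store only for inspection/printing: detach so no computation graph is kept alive
def debug_hook(model, input, output):
    activation['debug'] = output.detach()

# use the activation in a loss and call backward() later: do not detach
def loss_hook(model, input, output):
    activation['for_loss'] = output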

very clear, thank you very much

This solution seems a bit clumsy (probably a fault on my side), but I got an UnboundLocalError.
What I found to work quite effortlessly is to assign a new attribute to whatever module I want the activations of, like so:

import torch
import torch.nn as nn
import torch.nn.functional as F


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.cl1 = nn.Linear(25, 60)
        self.cl2 = nn.Linear(60, 16)
        self.fc1 = nn.Linear(16, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        
    def forward(self, x):
        x = F.relu(self.cl1(x))
        x = F.relu(self.cl2(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.log_softmax(self.fc3(x), dim=1)
        return x


def hook(model, input, output):
    if hasattr(model,'activations'):
        model.activations = torch.cat([model.activations,output.detach().unsqueeze(0)],dim=0)
    else:
        model.activations = output.detach().unsqueeze(0)


model = MyModel()
model.fc2.register_forward_hook(hook)
x = torch.randn(1, 25)
output = model(x)
print(model.fc2.activations)

Am I missing something with this solution?

Dear @ptrblck,

What is the benefit of getting activations using register_forward_hook rather than explicitly returning them in the forward pass?