How can I load my best model as a feature extractor/evaluator?

Thanks for the reply. I still don't understand what happens when I use the same module twice (in different layers).
What is happening in such a case?
See the example below, where max_pool is used twice.

import torch.nn as nn
import torch.nn.functional as F

class LeNet5(nn.Module):
    def __init__(self, dim=32, in_channels=1,
                 out_channels_1=6, out_channels_2=16,
                 kernel_size=5, stride=1, padding=0, dilation=1,
                 mp_kernel_size=2, mp_stride=2, mp_padding=0, mp_dilation=1,
                 fcsize1=120, fcsize2=84, nclasses=10):

        super(LeNet5, self).__init__()

        # helpers for calculating the spatial dimension after a conv/max_pool op
        def convdim(dim):
            return (dim + 2*padding - dilation * (kernel_size - 1) - 1)//stride + 1

        def mpdim(dim):
            return (dim + 2*mp_padding - mp_dilation * (mp_kernel_size - 1) - 1)//mp_stride + 1

        self.conv1 = nn.Conv2d(in_channels, out_channels_1, kernel_size, stride)
        self.max_pool = nn.MaxPool2d(mp_kernel_size,
                                     stride=mp_stride,
                                     padding=mp_padding,
                                     dilation=mp_dilation)
        self.conv2 = nn.Conv2d(out_channels_1, out_channels_2, kernel_size, stride)

        # final dimension after applying conv->max_pool->conv->max_pool
        dim = mpdim(convdim(mpdim(convdim(dim))))
        self.fc1 = nn.Linear(out_channels_2 * dim * dim, fcsize1)
        self.fc2 = nn.Linear(fcsize1, fcsize2)
        self.fc3 = nn.Linear(fcsize2, nclasses)

    def forward(self, x):
        nsamples = x.shape[0]
        x1 = F.relu(self.conv1(x))
        x2 = self.max_pool(x1)
        x3 = F.relu(self.conv2(x2))
        x4 = self.max_pool(x3)
        x5 = x4.view(nsamples, -1)
        x6 = F.relu(self.fc1(x5))
        x7 = F.relu(self.fc2(x6))
        x8 = self.fc3(x7)
        return x8

In that case the hook would overwrite the previously stored activation, so you could either append the activations to a list stored under the key in the activation dict or create separate pooling layers in your model (a sketch of the first option follows).
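
For illustration, a minimal sketch of the append-to-a-list approach, assuming the LeNet5 definition above:

import torch

activation = {}

def get_activation(name):
    # store a list per key, so a module that is called several times
    # (here: the shared max_pool) appends instead of overwriting
    def hook(model, input, output):
        activation.setdefault(name, []).append(output.detach())
    return hook

model = LeNet5()
model.max_pool.register_forward_hook(get_activation('max_pool'))

out = model(torch.randn(1, 1, 32, 32))
# first entry: pooling after conv1, second entry: pooling after conv2
print([a.shape for a in activation['max_pool']])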

Hi…my code is:

class MyModelA(nn.Module):
    def __init__(self, my_pretrained_model):
        super(MyModelA, self).__init__()
        self.model = nn.Sequential(*list(my_pretrained_model.children())[:-1])
        self.fc = nn.Sequential(nn.Linear(2048, 512),
                                nn.ReLU(),
                                nn.Dropout(p=0.5),
                                nn.Linear(512, 2),
                                nn.LogSoftmax(dim=1))
        for p in self.model.parameters():
            p.requires_grad = False

    def forward(self, x):
        x = self.model(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

model = MyModelA(my_pretrained_model=model).to(device)

After that I wrote:

activation = {}

def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

How can I use this function for my case? I want to know the output of fc[0]. Could you please write this code for me?
Is it model.fc[0].register_forward_hook(get_activation('fc[0]'))?
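
Something like the following minimal sketch, assuming the hook has to be registered before the forward pass and that the frozen backbone yields 2048 flattened features (e.g. a ResNet50), with a hypothetical input shape:

import torch

# register the hook before the forward pass
model.fc[0].register_forward_hook(get_activation('fc[0]'))

x = torch.randn(1, 3, 224, 224).to(device)  # hypothetical input shape
output = model(x)
print(activation['fc[0]'].shape)  # expected [1, 512], the output of the first Linear layer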

My network structure is similar to @neda_vida:

class CNN(nn.Module):
    def __init__(self, num_classes):
        nn.Module.__init__(self)
        self.conv1 = nn.Sequential(
            # Input shape (3, 64, 64)
            nn.Conv2d(
                in_channels=3,
                out_channels=6,
                kernel_size=5,
                stride=1,
                padding=2
            ),
            # Output shape (6, 64, 64)
            nn.ReLU(),
            # Output shape (6, 32, 32)
            nn.MaxPool2d(kernel_size=2)
        )
        self.conv2 = nn.Sequential(
            # Input shape (6, 32, 32)
            nn.Conv2d(
                in_channels=6,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2
            ),
            # Output shape (16, 32, 32)
            nn.ReLU(),
            # Output shape (16, 16, 16)
            nn.MaxPool2d(kernel_size=2)
        )
        self.fc = nn.Sequential(
            # FC head: flattened 16*16*16 features -> num_classes
            nn.Linear(in_features=16 * 16 * 16,
                      out_features=300),
            nn.ReLU(),
            nn.Linear(in_features=300,
                      out_features=84),
            nn.ReLU(),
            nn.Linear(in_features=84,
                      out_features=num_classes)
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size()[0], -1)
        x = self.fc(x)
        return x

When I register the activation hook:

        output = net(image)
        net.fc.register_forward_hook(get_activation('fc'))
        print(activation['fc[0]'])

I got an error:

Traceback (most recent call last):
  File "C:/Users/dugr/PycharmProjects/NN/main.py", line 118, in <module>
    test_one_sample()
  File "C:/Users/dugr/PycharmProjects/NN/main.py", line 98, in test_one_sample
    print(activation['fc[0]'])
KeyError: 'fc[0]'

I'm not sure how to get the output of a layer with this way of implementing the network, or how I could rewrite the network so it works with the activation hook code.
Thank you

You are trying to access an undefined key 'fc[0]' in print(activation['fc[0]']), while you are registering the hook with 'fc'.

Also, you are registering the hook after the forward pass, so you would have to rerun the forward pass to store the activation or register the hook before the first forward pass.

Yes, you are right. The order of execution was the main reason for the bugs.
After changing it as pointed out, it works without errors. Thank you!

class CNN(nn.Module):
    def __init__(self, num_classes):
        nn.Module.__init__(self)
        self.conv1 = nn.Sequential(
            # Input shape (3, 64, 64)
            nn.Conv2d(
                in_channels=3,
                out_channels=6,
                kernel_size=5,
                stride=1,
                padding=2
            ),
            # Output shape (6, 64, 64)
            nn.ReLU(),
            # Output shape (6, 32, 32)
            nn.MaxPool2d(kernel_size=2)
        )
        self.conv2 = nn.Sequential(
            # Input shape (6, 32, 32)
            nn.Conv2d(
                in_channels=6,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2
            ),
            # Output shape (16, 32, 32)
            nn.ReLU(),
            # Output shape (16, 16, 16)
            nn.MaxPool2d(kernel_size=2)
        )
        self.fc = nn.Sequential(
            # FC head: flattened 16*16*16 features -> num_classes
            nn.Linear(in_features=16 * 16 * 16,
                      out_features=300),
            nn.ReLU(),
            nn.Linear(in_features=300,
                      out_features=84),
            nn.ReLU(),
            nn.Linear(in_features=84,
                      out_features=num_classes)
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size()[0], -1)
        x = self.fc(x)
        return x

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

if __name__ == '__main__':
    import torch
    x = torch.randn(1, 3, 64, 64)
    net = CNN(10)
    net.fc[4].register_forward_hook(get_activation('fc[4]'))
    output = net(x)
    print(activation['fc[4]'])
    net.fc.register_forward_hook(get_activation('fc'))
    output = net(x)
    print(activation['fc'])

When I use this method on a TorchScript module, I get the error below:

RuntimeError: register_forward_hook is not supported on ScriptModules

Hi,
I don't know if your answer about using hooks might be outdated, since nowadays (2021) hooks seem to be commonly used for debugging purposes instead, and I think I have a better solution: calling model.layer_name directly. So we would have:

# same as before
model = MyModel(...)
model.load_state_dict(my_model['state_dict'])
model.eval()
# instead of using a hook, and assuming the middle layer's name is fc3, we use:
output = model.fc3(new_sample)

What do you think? Did I misunderstand the question, or am I wrong about something?

Usually you would like to get an intermediate activation from a specific layer, which was created by passing the input through all layers before it.
In your current example you are passing the input directly to model.fc3. While this could work, note that new_sample would need to match the expected input shape of self.fc3 (so it's often not the original input shape), and you also won't be using any of the layers before it (see the sketch below).
Hooks are not outdated and can still be used.
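
To illustrate the difference, a minimal sketch with a toy nn.Sequential model (hypothetical layer sizes, not the model from this thread):

import torch
import torch.nn as nn

# toy stand-in model
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))
model.eval()

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

# hook: the last layer receives the features produced by all preceding layers
model[2].register_forward_hook(get_activation('last'))
out = model(torch.randn(1, 10))
print(activation['last'])

# direct call: skips model[0] and model[1]; the input already has to match model[2]'s in_features
print(model[2](torch.randn(1, 20)))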


Thank you so much for your quick reply and for clarifying the problem! I've just checked my code again and seen how I can pass the input directly into the layer and still get the same result as using a hook. It turns out that I'm trying to get the output of the whole first block (multiple layers) of the model, so the situation is just the same as getting the output of the model's first layer.

Thank you for your work. Sorry, I am new to PyTorch; why should the output be detached in the forward hook?

It depends on your use case, i.e. if you want to store the activations e.g. for debugging or printing, you could detach it. On the other hand, if you want to calculate a loss and the gradients afterwards (via backward()), you shouldn’t detach them.
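
A minimal sketch of the second case (the toy model and auxiliary loss are assumptions, not from this thread): keep the hooked output attached to the graph so a loss computed on it can backpropagate.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output  # no .detach(): keep the graph so backward() works
    return hook

model[0].register_forward_hook(get_activation('lin0'))

out = model(torch.randn(4, 10))
aux_loss = activation['lin0'].pow(2).mean()  # e.g. an auxiliary loss on the intermediate activation
aux_loss.backward()
print(model[0].weight.grad.shape)  # gradients were computed through the stored activation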

very clear, thank you very much

This solution seems a bit clumsy to me; it's probably my fault, but I got an UnboundLocalError.
What I found to work quite effortlessly is to assign a new attribute to whatever module I want the activations of, like so:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.cl1 = nn.Linear(25, 60)
        self.cl2 = nn.Linear(60, 16)
        self.fc1 = nn.Linear(16, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        
    def forward(self, x):
        x = F.relu(self.cl1(x))
        x = F.relu(self.cl2(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.log_softmax(self.fc3(x), dim=1)
        return x


def hook(model, input, output):
    if hasattr(model,'activations'):
        model.activations = torch.cat([model.activations,output.detach().unsqueeze(0)],dim=0)
    else:
        model.activations = output.detach().unsqueeze(0)


model = MyModel()
model.fc2.register_forward_hook(hook)
x = torch.randn(1, 25)
output = model(x)
print(model.fc2.activations)

Am I missing something with this solution?

Dear @ptrblck ,

What is the benefit of getting activations using register_forward_hook rather than explicitly returning them in the forward pass?

I think it depends on your use case and probably also on coding style.
E.g. if I were working on a new model architecture where different features should now be returned, I would override forward. This would make sure I can initialize the model using its new definition without any manipulation of the model itself.
On the other hand, if I just want to check some intermediates, e.g. for debugging, I would use hooks, as I can add them directly to the model without any changes to it.
Also, I believe that hooks are not scriptable right now.

Your use case might of course be different.
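
For illustration, a minimal sketch (hypothetical toy model) of the "override forward and return the features" option mentioned above:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModelWithFeatures(nn.Module):
    # hypothetical toy model: forward returns an intermediate feature alongside the output
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 5)

    def forward(self, x):
        feat = F.relu(self.fc1(x))  # the intermediate activation we want to expose
        out = self.fc2(feat)
        return out, feat

model = MyModelWithFeatures()
out, feat = model(torch.randn(1, 10))
print(out.shape, feat.shape)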


Thank you for the prompt response!

@ptrblck
I'm wondering about my case. How could I print the output of Unet_Netzero.pretrained.layer1.3.0.act1?

Unet_Netzero(
  (quant): QuantStub()
  (pretrained): Module(
    (layer1): Sequential(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU(inplace=True)
      (3): Sequential(
        (0): DepthwiseSeparableConv(
          (conv_dw): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
          (bn1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (act1): ReLU()
          (hardtanh1): Hardtanh(min_val=0, max_val=6, inplace=True)
          (se): Identity()
          (conv_pw): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (act2): ReLU()
          (hardtanh2): Hardtanh(min_val=0, max_val=6, inplace=True)
          (skip_add): FloatFunctional(
            (activation_post_process): Identity()
          )
        )
      )

Thanks.

Forward hooks would work as described in this thread, e.g. in this post.

@ptrblck Thanks for the prompt reply.
I'm still not sure I understand that correctly.
In my case, the DepthwiseSeparableConv is defined in another nn.Module and doesn't appear in the forward() definition of the main network (i.e., Unet_Netzero).

If I want to access pretrained.layer1.3.0.bn1, is the following correct?
model.pretrained.layer1.3.0.bn1.register_forward_hook(get_activation('bn1'))

Thanks
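
As a hedged sketch (not a confirmed answer): numbered children of an nn.Sequential can't be reached via dotted attribute access in Python, so the registration would typically use indexing, assuming the module structure printed above:

# sketch: index into the nn.Sequential children instead of using a dotted numeric name
model.pretrained.layer1[3][0].bn1.register_forward_hook(get_activation('bn1'))

out = model(input_tensor)   # input_tensor is a placeholder for whatever input the model expects
print(activation['bn1'].shape)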