Extract features from CNN

Hello everyone, I’m doing a research project and I have a CNN model already trained. Now I want to extract features from this CNN to apply conventional Machine Learning algorithms.

I have saved the CNN:

torch.save(model.state_dict(), './cnn.pth')

Now, how do I extract features from this model, to apply conventional ML algorithms?


@ptrblck any idea?

It’s straightforward. Let’s assume that your model contains two parts: model.extractor and model.classifier.
Just pass an input batch to model.extractor, and you’ll get the desired features.

Sorry, I’m still very new to this, so I didn’t understand what you are suggesting.

model = CNN()


To extract the features, what do you suggest doing after loading the model?

Let’s assume that your CNN class looks like this:

class CNN(nn.Module):
    def __init__(self, *args, **kwargs):
        super(CNN, self).__init__()
        self.feature_extractor = ...  # CNN layers or whatever
        self.classifier = ...  # Linear layers

    def forward(self, x):
        x = self.feature_extractor(x)
        x = self.classifier(x.view(x.shape[0], -1))  # flattening depends on your architecture
        return x

After loading your trained state dict, you can extract the features from an input with features = model.feature_extractor(input).
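A minimal sketch of that suggestion, for anyone who wants to run it end to end. The layer sizes, the temporary file path, and the random input batch are assumptions made for illustration; they are not the original poster's model or data.

```python
import os
import tempfile

import torch
import torch.nn as nn


class CNN(nn.Module):
    """Toy stand-in for the trained model; layer sizes are assumptions."""

    def __init__(self):
        super().__init__()
        self.feature_extractor = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(8 * 14 * 14, 10)

    def forward(self, x):
        x = self.feature_extractor(x)
        return self.classifier(x.view(x.shape[0], -1))


# Stand-in for the training run: save a state dict as in the question.
path = os.path.join(tempfile.mkdtemp(), "cnn.pth")
torch.save(CNN().state_dict(), path)

# Rebuild the model, load the weights, and switch to inference mode.
model = CNN()
model.load_state_dict(torch.load(path))
model.eval()

batch = torch.randn(4, 1, 28, 28)  # placeholder for real 28x28 images
with torch.no_grad():  # no gradients are needed for feature extraction
    features = model.feature_extractor(batch)    # shape (4, 8, 14, 14)
    flat = features.view(features.shape[0], -1)  # shape (4, 1568)

print(features.shape, flat.shape)
```

Calling model.eval() and wrapping the call in torch.no_grad() matters mostly when the model has dropout or batch norm layers, but it is a reasonable habit whenever features are extracted for later use.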


Thank you, that’s what I was looking for.

I already did what you have suggested.

def forward(self, x):
    out = self.cnn1(x)
    out = self.relu1(out)
    out = self.maxpool1(out)

    out = self.cnn2(out)
    out = self.relu2(out)
    out = self.maxpool2(out)

    out = out.view(out.size(0), -1)

    out = self.fc1(out)
    return out

This is my architecture. Now (using what you have suggested), I will extract the features from the second-to-last layer, i.e. the flattened input to fc1 (which is what I wanted):

out = out.view(out.size(0), -1)

The output shape is: torch.Size([1, 1568])
Converting to numpy array: (1, 1568)
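One way to get these flattened features without editing forward() is a forward hook on the last pooling layer. This is only a sketch: the channel counts and kernel sizes below are assumptions, chosen so that a 28x28 input yields 32 * 7 * 7 = 1568 features as reported above, and the input is random rather than a real image.

```python
import torch
import torch.nn as nn


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Assumed sizes: picked so that 28x28 inputs give 32 * 7 * 7 = 1568 features.
        self.cnn1 = nn.Conv2d(1, 16, kernel_size=5, padding=2)
        self.relu1 = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(2)
        self.cnn2 = nn.Conv2d(16, 32, kernel_size=5, padding=2)
        self.relu2 = nn.ReLU()
        self.maxpool2 = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        out = self.maxpool1(self.relu1(self.cnn1(x)))
        out = self.maxpool2(self.relu2(self.cnn2(out)))
        out = out.view(out.size(0), -1)
        return self.fc1(out)


model = CNN().eval()
captured = {}


def save_features(module, inputs, output):
    # Flatten the pooled maps the same way forward() does before fc1.
    captured["features"] = output.detach().view(output.size(0), -1)


handle = model.maxpool2.register_forward_hook(save_features)
with torch.no_grad():
    logits = model(torch.randn(1, 1, 28, 28))  # random stand-in for one image
handle.remove()  # remove the hook once the features have been captured

features_np = captured["features"].numpy()
print(captured["features"].shape, features_np.shape, logits.shape)
```

The hook fires on every forward pass, so the same call returns the 10 class activations and stores the 1568 features side by side.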

Using these features, do you have any idea what I can do now to apply traditional ML methods and get an output like (-918.5343, -532.8511, -771.1676, -722.4738, -863.3235, -838.9160, -1015.9191, -593.9412, -680.5461, -670.0557) (class activations)?

How can I use this array of shape (1, 1568) to get a 10-class output?

Thank you for your answers and time.

Why don’t you directly use an additional Linear layer to produce your desired outputs?

I did that. Now I want to apply traditional ML techniques, just to compare their performance with methodologies based entirely on CNNs. But I’m not sure how to use the (1, 1568) features to get a 10-class output using ML methodologies.
I’m not sure if I am expressing myself correctly.

As far as I know, you should save all the outputs into a numpy array, then use that array as data for a machine learning model in scikit-learn or other libraries.

That’s what I’ve done so far. I just don’t understand how I will get a 10-class output from a 1568-dimensional input… But I will search more, and if I find a solution, I will post it here.
In the meantime, if anyone knows what I should do, you are more than welcome to give some tips. I think that with one concrete example I will be able to apply more techniques and compare performances! Thank you.

I think it all depends on which “ML traditional techniques” you want to use and, if you use a reference implementation, on which library it comes from.
But as @caonv said, you are most likely going for scikit-learn-like APIs, so you will have to extract all the features into one big Tensor (or numpy array) and then feed that to scikit-learn’s API.
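A rough sketch of the "one big array" step, assuming the features come from the convolutional part of the network. The extractor, the random images, and the batch size here are hypothetical placeholders; in practice the loader would wrap the real dataset and the extractor would be the trained layers.

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the convolutional part of the trained CNN.
# 28x28 input -> 32 channels of 7x7 after pooling -> 1568 features.
extractor = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(4),
    nn.Flatten(),
).eval()

# Random tensors in place of a real dataset: 50 "images" with labels 0..9.
images = torch.randn(50, 1, 28, 28)
labels = torch.arange(50) % 10
loader = DataLoader(TensorDataset(images, labels), batch_size=16)

feature_chunks, label_chunks = [], []
with torch.no_grad():
    for xb, yb in loader:
        feature_chunks.append(extractor(xb).cpu().numpy())
        label_chunks.append(yb.numpy())

X = np.concatenate(feature_chunks)  # one row of features per image
y = np.concatenate(label_chunks)    # one class label per image
print(X.shape, y.shape)
```

The resulting X and y have the (n_samples, n_features) and (n_samples,) layout that scikit-learn estimators expect in fit(X, y).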

Thank you, I think I’m understanding what you are saying. In this case, I have a CNN already trained on MNIST.
Now, if I pass an image as input to the trained model, the model will give me the activations of the classes (MNIST = 10 activations). But what I really want now is, with this information (which I already converted from tensor to numpy array):

out = out.view(out.size(0), -1)

For example, I want to apply ridge regression (scikit-learn 0.24.0 documentation, section 1.1, Linear Models) to the array of features extracted from the image, with shape = (1, 1568). Do I just have to do this?

import numpy as np
from sklearn import linear_model

reg = linear_model.RidgeCV(alphas=np.logspace(-6, 6, 13))
reg.fit(arr)


arr is the numpy array extracted from the second-to-last layer.
I don’t know if I made myself clear, but if you want I can provide more code so that you can better understand what I’m trying to say.

This looks like the right thing to do, yes.

OK, very nice. It gives me this:

TypeError: fit() missing 1 required positional argument: 'y'

y should be the label, right?

It depends on which function you are using. For RidgeCV from sklearn, I think it is the label, yes. But you should check the function’s documentation to know what the argument is expected to contain.
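For a concrete picture of what fit(X, y) expects, here is a small sketch. It swaps in RidgeClassifierCV, the classification counterpart of RidgeCV in scikit-learn, since the goal is a 10-class output; that substitution, the random features, and the sample count are assumptions for illustration only. In practice X would hold one 1568-dimensional feature row per training image and y the matching digit labels.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

# Synthetic stand-ins: 200 feature vectors of length 1568, labels 0..9.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1568))
y = np.arange(200) % 10

# Same alpha grid as in the question; y supplies the class labels.
clf = RidgeClassifierCV(alphas=np.logspace(-6, 6, 13))
clf.fit(X, y)

# One (1, 1568) feature row in, one score per class out.
scores = clf.decision_function(X[:1])  # shape (1, 10)
pred = clf.predict(X[:1])              # shape (1,)
print(scores.shape, pred.shape)
```

The model has to be fitted on many labelled feature rows first; a single (1, 1568) array is only enough at prediction time, where decision_function plays the role of the class activations.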


Alright, thank you! I will try other implementations of RidgeCV or other methods.

Update: I already did it! If someone needs it in the future, I can provide the code to do this! Thank you all.

Great to hear that. I think a summary of your key observations would save others a lot of time when facing the same situation.