Hello everyone, I’m doing a research project and I have a CNN model already trained. Now I want to extract features from this CNN to apply conventional Machine Learning algorithms.

I have saved the CNN:

torch.save(model.state_dict(), './cnn.pth')

Now, how do I extract features from this model, to apply conventional ML algorithms?

It’s straightforward. Let’s assume that your model contains two parts: model.feature_extractor and model.classifier.
Just pass an input batch to model.feature_extractor, and you’ll get the desired features.

class CNN(nn.Module):
    def __init__(self, *args, **kwargs):
        super(CNN, self).__init__()
        self.feature_extractor = ...  # CNN layers or whatever
        self.classifier = ....  # Linear layers

    def forward(self, x):
        x = self.feature_extractor(x)
        x = self.classifier(x.view(x.shape[0], -1))  # depends on your architecture
        return x

After loading your trained state dict, you can extract the features from an input with features = model.feature_extractor(input).
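Here is a minimal sketch of that load-and-extract step. The layer sizes are hypothetical (picked only so the example runs on 1x28x28 inputs), and an in-memory buffer stands in for './cnn.pth' so the snippet does not touch the disk.

```python
import io

import torch
import torch.nn as nn


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Hypothetical layers, chosen only so the example runs on 1x28x28 inputs.
        self.feature_extractor = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(8 * 14 * 14, 10)

    def forward(self, x):
        x = self.feature_extractor(x)
        return self.classifier(x.view(x.shape[0], -1))


# Stand-in for './cnn.pth': save the state dict to an in-memory buffer.
buffer = io.BytesIO()
torch.save(CNN().state_dict(), buffer)
buffer.seek(0)

# Rebuild the model, load the saved weights, and switch to inference mode.
model = CNN()
model.load_state_dict(torch.load(buffer))
model.eval()

batch = torch.randn(4, 1, 28, 28)  # random placeholder input batch
with torch.no_grad():  # no gradients are needed for feature extraction
    features = model.feature_extractor(batch)

# Flatten to (batch_size, n_features) and convert to numpy.
features = features.view(features.shape[0], -1).numpy()
print(features.shape)  # (4, 1568)
```

Calling model.eval() and wrapping the forward pass in torch.no_grad() matters mostly when the extractor contains dropout or batch norm layers, but it is a safe default either way.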

def forward(self, x):
    out = self.cnn1(x)
    out = self.relu1(out)
    out = self.maxpool1(out)
    out = self.cnn2(out)
    out = self.relu2(out)
    out = self.maxpool2(out)
    out = out.view(out.size(0), -1)
    out = self.fc1(out)
    return out

This is my architecture. Now (using what you have suggested), I will extract the features from the layer before the last one (which is what I wanted):

out = out.view(out.size(0), -1)

The output shape is: torch.Size([1, 1568])
Converting to numpy array: (1, 1568)
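One way to get those features without touching the trained weights is to move everything before fc1 into its own method. This is only a sketch: the class name CNNModel, the method name extract_features, and the layer hyperparameters are assumptions, chosen so that a 28x28 input gives 32 * 7 * 7 = 1568 features as in the shape above.

```python
import torch
import torch.nn as nn


class CNNModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Assumed hyperparameters: they yield 32 * 7 * 7 = 1568 features
        # for a 1x28x28 input, matching the shape reported in the thread.
        self.cnn1 = nn.Conv2d(1, 16, kernel_size=5, padding=2)
        self.relu1 = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(2)
        self.cnn2 = nn.Conv2d(16, 32, kernel_size=5, padding=2)
        self.relu2 = nn.ReLU()
        self.maxpool2 = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(32 * 7 * 7, 10)

    def extract_features(self, x):
        # Everything up to (but not including) the final linear layer.
        out = self.maxpool1(self.relu1(self.cnn1(x)))
        out = self.maxpool2(self.relu2(self.cnn2(out)))
        return out.view(out.size(0), -1)

    def forward(self, x):
        # Same computation as before: features followed by fc1.
        return self.fc1(self.extract_features(x))


model = CNNModel()
model.eval()
image = torch.randn(1, 1, 28, 28)  # random placeholder for one MNIST image
with torch.no_grad():
    feats = model.extract_features(image)
    logits = model(image)

print(feats.shape)          # torch.Size([1, 1568])
print(feats.numpy().shape)  # (1, 1568)
```

Because the layer attribute names are unchanged, a state dict saved from the original class should still load into this one.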

Using these features, do you have any idea what I can do now to apply traditional ML methods and get an output like (-918.5343, -532.8511, -771.1676, -722.4738, -863.3235, -838.9160, -1015.9191, -593.9412, -680.5461, -670.0557) (class activations)?

How can I apply this array of (1,1568) to get a 10 class output?

I did that. Now I want to apply traditional ML techniques, just to compare their performance with methodologies based entirely on CNNs. But I’m not sure how to use the (1, 1568) features to get a 10-class output using ML methods.
I’m not sure if I am expressing myself correctly.

As far as I know, you should save all the outputs into a numpy array, then use that array as data for a machine learning model in scikit-learn or other libraries.
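A rough sketch of that collection step is below. The extractor and the data are placeholders (an untrained stack of layers and random tensors standing in for the trained CNN and MNIST); only the loop and the final concatenation are the point.

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder extractor: stands in for the trained CNN without its last layer.
extractor = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
)

# Placeholder data: random tensors standing in for MNIST images and labels.
images = torch.randn(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
loader = DataLoader(TensorDataset(images, labels), batch_size=32)

extractor.eval()
feature_chunks, label_chunks = [], []
with torch.no_grad():
    for x, y in loader:
        # One row of features per image in the batch.
        feature_chunks.append(extractor(x).cpu().numpy())
        label_chunks.append(y.numpy())

# Stack the per-batch results into one array per quantity.
X = np.concatenate(feature_chunks)  # shape (n_samples, n_features)
y = np.concatenate(label_chunks)    # shape (n_samples,)
print(X.shape, y.shape)  # (100, 1568) (100,)
```

Keeping the labels alongside the features is what makes the arrays usable for supervised learning afterwards.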

That’s what I have done so far. I just don’t understand how I will get a 10-class output from a 1568-dimensional input… But I will search more, and if I find a solution, I will post it here.
In the meantime, if anyone knows what I should do, you are more than welcome to give some tips. I think that with one concrete example I will be able to apply more techniques and compare performances! Thank you.

I think it all depends on which “traditional ML techniques” you want to use and, if you use a reference implementation, which library it comes from.
But as @caonv said, you are most likely going for scikit-learn-like APIs, so you will have to extract all the features into one big Tensor (or numpy array) and then feed that to scikit-learn’s API.
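As a concrete (and hedged) illustration of how 1568 inputs turn into a 10-class output: each image becomes one row of 1568 features, and the classifier learns a mapping from such rows to the 10 labels. The sketch below uses LogisticRegression purely as an example of a traditional method, with random placeholder data instead of real CNN features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Placeholder data: in practice X holds the CNN features (one row per image)
# and y holds the matching MNIST labels.
X = rng.normal(size=(200, 1568))
y = np.arange(200) % 10  # every class 0..9 appears in the labels

clf = LogisticRegression(max_iter=200)
clf.fit(X, y)  # learns a 1568 -> 10 mapping from (features, label) pairs

one_image = X[:1]                      # shape (1, 1568), like a single feature vector
probs = clf.predict_proba(one_image)   # shape (1, 10): one probability per class
pred = clf.predict(one_image)          # shape (1,): the predicted class label
print(probs.shape, pred.shape)
```

Any other scikit-learn classifier (an SVM, a random forest, k-nearest neighbours) can be swapped in with the same fit and predict calls, which makes comparing methods fairly mechanical.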

Thank you, I think I’m understanding what you are saying. In this case, I have a CNN already trained on MNIST.
Now, if I pass an image as input to the trained model, the model will give me the activations of the classes (MNIST = 10 activations). But what I really want now is, with this information (which I already converted from tensor to numpy array):

import numpy as np
from sklearn import linear_model
reg = linear_model.RidgeCV(alphas=np.logspace(-6, 6, 13))
reg.fit(arr)

arr is the numpy array extracted from the layer before the last one.
I don’t know if I made myself clear, but if you want I can provide more code to help you understand what I’m trying to say.

It depends on which function you are using. For RidgeCV from sklearn, I think it is the label, yes. But you should check the function’s documentation to see what each argument is expected to contain.
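To make that concrete: scikit-learn estimators are fitted with fit(X, y), where X is the (n_samples, n_features) array and y holds one label per sample. Note that RidgeCV is a regressor; for a 10-class problem, RidgeClassifierCV from the same module is probably the closer match. A small sketch with random placeholder data in place of the real features and labels:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

rng = np.random.default_rng(0)

# Placeholders: arr stands in for the extracted features, labels for the digits.
arr = rng.normal(size=(100, 1568))
labels = np.arange(100) % 10  # one label per row of arr

clf = RidgeClassifierCV(alphas=np.logspace(-6, 6, 13))
clf.fit(arr, labels)  # X first, labels second

scores = clf.decision_function(arr[:1])  # shape (1, 10): one score per class
pred = clf.predict(arr[:1])              # shape (1,): the class with the highest score
print(scores.shape, pred.shape)
```

The per-class scores returned by decision_function play a role similar to the class activations of the CNN, although they are not on the same scale.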