I’m trying to pre-compute the output of the convolutional layers of a pre-trained model, specifically the densenet161 model, before training the fully-connected classifier layer. The goal is to speed up training: since the conv layers are frozen, the model shouldn’t have to repeatedly recompute them on every epoch.
I’m getting a size mismatch error when I pass the pre-computed values into my fully-connected layer, a single linear layer with in_features=2208 (the channel count of the conv output) and out_features=5005. I don’t think the issue is in my training function; I suspect I’m not passing the pre-computed data to the fully-connected layer in the right shape. I’m using batches of 7.
The size of the conv output is [7, 2208, 7, 7], so the channel dimension looks right… what am I misunderstanding? Thanks in advance!
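For reference, here is a quick shape check I did (plain NumPy as a stand-in for the real tensors, just to illustrate the dimensions involved) showing how many values each sample actually contains once flattened:

```python
import numpy as np

# Stand-in for one pre-computed batch of DenseNet-161 feature maps:
# 7 samples, 2208 channels, 7x7 spatial maps
batch = np.zeros((7, 2208, 7, 7), dtype=np.float32)

# Flattening everything after the batch dimension gives the per-sample
# feature count a Linear layer would actually receive
flat = batch.reshape(batch.shape[0], -1)
print(flat.shape)  # (7, 108192), i.e. 2208 * 7 * 7 per sample, not 2208
```

So the 2208 channels still carry a 7×7 spatial map each, which is where I suspect my mismatch comes from.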
Here’s the code for pre-computing the output of the frozen layers:
import numpy as np
import torch

# Pre-compute convolutional features and labels for a given DataLoader and model
def preconvfeat(data_loader, model):
    print('[preconvfeat]')
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    conv_features = []
    labels_list = []
    for inputs, labels in data_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        with torch.no_grad():  # layers are frozen, so no gradients needed
            output = model.features(inputs)  # run only the features block
        conv_features.extend(output.cpu().numpy())
        labels_list.extend(labels.cpu().numpy())
    # Stack the per-sample feature maps back into one array
    conv_features = np.concatenate([[feat] for feat in conv_features])
    return (conv_features, labels_list)
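After preconvfeat returns, I re-batch the arrays in chunks of 7 before feeding the classifier, roughly like this (shapes only; NumPy zeros stand in for my real features, and 21 samples is just an illustrative count):

```python
import numpy as np

# Hypothetical output of preconvfeat: 21 samples of DenseNet-161 feature maps
conv_features = np.zeros((21, 2208, 7, 7), dtype=np.float32)
labels_list = list(range(21))

# Re-batch the pre-computed features in chunks of 7, the way my
# classifier's training loop consumes them
batches = [conv_features[i:i + 7] for i in range(0, len(conv_features), 7)]
print(batches[0].shape)  # (7, 2208, 7, 7) -- the shape hitting the linear layer
```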