How to extract features from newly added layers in resnet34

Hi,

I added a few layers to an existing resnet34 for some trial purposes. I would like to extract the features from the output of the fc_4 layer (128 features) while iterating over the validation loader after the training phase. Code snippet:

model_ft = models.resnet34(pretrained=use_pretrained)
model_ft.fc_1 = nn.Sequential(nn.Dropout(0.1), nn.Linear(1000, 512))
model_ft.fc_2 = nn.Sequential(nn.Dropout(0.1), nn.Linear(512, 512))
model_ft.fc_3 = nn.Sequential(nn.Dropout(0.1), nn.Linear(512, 256))
model_ft.fc_4 = nn.Sequential(nn.Dropout(0.1), nn.Linear(256, 128))
# model_ft.avg_pool2 = nn.AdaptiveAvgPool2d(output_size=(1, 1))
model_ft.fc_5 = nn.Linear(128, 2)

feature_extractor = feature_extractor.to(device)
print(feature_extractor)

with torch.no_grad():
    model_ft.eval()   # set model to evaluate mode
    feature_extractor.eval()

    # iterate over data
    for inputs, labels in dataloaders:
        inputs = inputs.to(device)
        labels = labels.to(device)

        # feature extractor
        feature_tensor = feature_extractor(inputs)  # output now has the features corresponding to the input
        feature_arr = feature_tensor.cpu().detach().numpy().flatten()

Any ideas on how to extract the features from the fc_4 layer (128 features)? I need them as an n-length feature vector to be used for a later prediction step.

The code above gives me the error below:

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1845     if has_torch_function_variadic(input, weight):
   1846         return handle_torch_function(linear, (input, weight), input, weight, bias=bias)
-> 1847     return torch._C._nn.linear(input, weight, bias)
   1848 
   1849 

RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

The posted code wouldn’t work with torchvision.models.resnet34, since fc_1, fc_2, etc. are newly created attributes and thus never used in the original forward method unless you override it (which isn’t shown in the code).
Also, feature_extractor is undefined, so I’m unsure how it relates to the previously created model.
Could you post a minimal, executable code snippet to reproduce the issue?

Here is the full code example:
During training: forward pass, calculate the loss, optimize.
During testing: forward pass, calculate the loss, strip the last layer, and get the features from the output of the last-but-one layer.


# dataloaders_dict = contains dict of both loaders  {'train', 'val'}

# add some layers to resnet34
model_ft = models.resnet34(pretrained=True)
model_ft.fc_1 = nn.Sequential( nn.Dropout(0.1), nn.Linear(1000, 512))
model_ft.fc_2 = nn.Sequential( nn.Dropout(0.1), nn.Linear(512, 512))
model_ft.fc_3 = nn.Sequential( nn.Dropout(0.1), nn.Linear(512, 256))
model_ft.fc_4 = nn.Sequential( nn.Dropout(0.1), nn.Linear(256, 128))
model_ft.fc_5 = nn.Linear(128, 2)

model_ft = model_ft.to(device)

loss_function = nn.CrossEntropyLoss()

optimizer = torch.optim.Adam(model_ft.parameters(), lr=1e-4)

def train_or_test(model, dataloaders_dict, optimizer, loss_function, phase):
  # since = time.time()

  running_loss = 0.0
  running_corrects = 0
  results1 = []

  # some training
  if phase == 'train':
    model.train()
  else:  # 'val': use as a feature extractor to get the last-but-one layer values
    model.eval()

  for inputs, labels in dataloaders_dict[phase]:
    inputs = inputs.to(device)
    labels = labels.to(device)

    optimizer.zero_grad()

    outputs = model(inputs)
    loss = loss_function(outputs, labels)
    _, preds = torch.max(outputs, 1)

    # statistics
    running_loss += loss.item() * inputs.size(0)
    running_corrects += torch.sum(preds == labels.data)

    # backward + optimize only if in training phase
    if phase == 'train':
        loss.backward()
        optimizer.step()
    else: #'val'
      # use as extractor
      feature_extractor = torch.nn.Sequential(*list(model.children())[:-1])
      feature_extractor = feature_extractor.to(device)

      # forward pass input, get last but layer feature values
      feature_tensor = feature_extractor(inputs) # output now has the features corresponding to input x
      feature_arr = feature_tensor.cpu().detach().numpy().flatten()

      results1.append([labels[0].cpu().numpy(), feature_arr])

      
  if phase == 'train':
    epoch_loss = running_loss / 2488  # number of training samples
    epoch_acc = running_corrects.double() / 2488
    return model, epoch_loss, epoch_acc
  else: # 'val'
    epoch_loss = running_loss / 278  # number of validation samples
    epoch_acc = running_corrects.double() / 278
    return epoch_loss, epoch_acc, results1


model_ft, loss, acc = train_or_test(model_ft, dataloaders_dict, optimizer, loss_function, phase='train')
loss, acc, features = train_or_test(model_ft, dataloaders_dict, optimizer, loss_function, phase='val')






Thanks for the code snippet.
It still won’t work as described before, since the newly assigned attributes are never used:

model_ft = models.resnet34(pretrained=True)
model_ft.fc_1 = nn.Sequential( nn.Dropout(0.1), nn.Linear(1000, 512))
model_ft.fc_2 = nn.Sequential( nn.Dropout(0.1), nn.Linear(512, 512))
model_ft.fc_3 = nn.Sequential( nn.Dropout(0.1), nn.Linear(512, 256))
model_ft.fc_4 = nn.Sequential( nn.Dropout(0.1), nn.Linear(256, 128))
model_ft.fc_5 = nn.Linear(128, 2)

x = torch.randn(1, 3, 224, 224)
out = model_ft(x)
print(out.shape)
> torch.Size([1, 1000])

You might expect the output tensor to have the shape [batch_size=1, 2], but without overriding the forward method the newly assigned layers won’t be used.

Wrapping the model into an nn.Sequential container might also not work directly, since you would lose all functional API calls from the original forward (e.g. the torch.flatten call between avgpool and fc), and the modules would be appended to the nn.Sequential container based on their initialization order. I would thus recommend using nn.Sequential for strictly sequential models and writing a custom nn.Module for other use cases.
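
For illustration, here is a minimal sketch (reusing the layer sizes from your snippet and running on the CPU) of what goes wrong with the nn.Sequential approach: children() yields the modules in registration order, so [:-1] only drops fc_5, and the original fc layer then receives the unflattened avgpool output because the torch.flatten from the original forward is lost:

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet34(pretrained=True)
model.fc_1 = nn.Sequential(nn.Dropout(0.1), nn.Linear(1000, 512))
model.fc_2 = nn.Sequential(nn.Dropout(0.1), nn.Linear(512, 512))
model.fc_3 = nn.Sequential(nn.Dropout(0.1), nn.Linear(512, 256))
model.fc_4 = nn.Sequential(nn.Dropout(0.1), nn.Linear(256, 128))
model.fc_5 = nn.Linear(128, 2)

# [:-1] only removes fc_5; the original fc (nn.Linear(512, 1000)) is kept,
# but the torch.flatten between avgpool and fc is gone.
feature_extractor = nn.Sequential(*list(model.children())[:-1])

x = torch.randn(1, 3, 224, 224)
try:
    feature_extractor(x)
except RuntimeError as e:
    # On the CPU this raises a plain shape-mismatch error, since fc receives a
    # [1, 512, 1, 1] tensor; on the GPU the same mismatch can surface as the
    # CUBLAS_STATUS_INVALID_VALUE error you are seeing.
    print(e)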

I am quite new to PyTorch. Can you give me an example of how to add layers to the model and train it, and then during testing strip the last layer and get the features from the last-but-one layer?

model = models.resnet34(pretrained=True)
# add some layers to the model -- some example here

# train phase
x = torch.randn(1, 3, 224, 224)
out = model(x)
# test phase
feature_extractor = torch.nn.Sequential(*list(model.children())[:-1])
y = torch.randn(1, 3, 224, 224)
out = feature_extractor(y)  # in this example the 1000 features are needed

Adding to what @ptrblck said, one way to add new layers to a pretrained resnet34 model would be the following:

  • Write a custom nn.Module, say MyNet
  • Include a pretrained resnet34 instance, say myResnet34, as a layer of MyNet
  • Add your fc_* layers as other layers of MyNet
  • In the forward function of MyNet, pass the input successively through myResnet34 and the various fc_* layers, in order.

And one way to get the output of fc_4 is to just return it from the forward function, along with the output of the last layer, as, say, a tuple.
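
A minimal sketch of that approach, reusing the layer sizes from the earlier snippets (the class name MyNet and the attribute names are of course just placeholders):

import torch
import torch.nn as nn
from torchvision import models

class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # treat the entire pretrained resnet34 as a single "layer"
        self.myResnet34 = models.resnet34(pretrained=True)
        self.fc_1 = nn.Sequential(nn.Dropout(0.1), nn.Linear(1000, 512))
        self.fc_2 = nn.Sequential(nn.Dropout(0.1), nn.Linear(512, 512))
        self.fc_3 = nn.Sequential(nn.Dropout(0.1), nn.Linear(512, 256))
        self.fc_4 = nn.Sequential(nn.Dropout(0.1), nn.Linear(256, 128))
        self.fc_5 = nn.Linear(128, 2)

    def forward(self, x):
        x = self.myResnet34(x)          # [N, 1000]
        x = self.fc_1(x)
        x = self.fc_2(x)
        x = self.fc_3(x)
        features = self.fc_4(x)         # [N, 128] features you want to extract
        logits = self.fc_5(features)    # [N, 2]
        return logits, features         # return both as a tuple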

The key to solving this problem is to think of the entire pretrained myResnet34 as a single layer, because that is what it effectively is, in this case.

If you don’t know how to do the steps I described above, it will be well worth your while learning how to do so. The official PyTorch tutorial is a good place to start.
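
And a rough sketch of how MyNet could then be trained and used for feature extraction, reusing device and dataloaders_dict from the snippets above:

model = MyNet().to(device)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# training step: only the logits are needed
model.train()
for inputs, labels in dataloaders_dict['train']:
    inputs, labels = inputs.to(device), labels.to(device)
    optimizer.zero_grad()
    logits, _ = model(inputs)
    loss = loss_function(logits, labels)
    loss.backward()
    optimizer.step()

# validation step: additionally keep the 128-dim fc_4 features
model.eval()
results = []
with torch.no_grad():
    for inputs, labels in dataloaders_dict['val']:
        inputs = inputs.to(device)
        logits, features = model(inputs)
        results.append([labels.numpy(), features.cpu().numpy()])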