
  1. Creating a resnet50 with pretrained weights in a single line of code is quite convenient compared to writing a custom class. If you don’t want to change e.g. the forward pass or any other modules, you can just stick to torchvision.models.

  2. Have a look at this or this tutorial for an introduction to finetuning the models.

  3. You can create the computation graph dynamically in any form you wish.
    E.g. if you want to feed the output of one model into another, you can just write:

output = model1(x)
output = model2(output)
loss = criterion(output, target)
optimizer.zero_grad()  # clear gradients from the previous iteration
loss.backward()        # backpropagates through model2 and model1
optimizer.step()

Autograd will make sure to create the gradients in both models as long as you haven’t detached a tensor from the computation graph (e.g. by using numpy methods or calling tensor.detach()).
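A minimal runnable check of this behavior, using two hypothetical toy linear models in place of the larger networks:

```python
import torch
import torch.nn as nn

# Two chained toy models (stand-ins for any two nn.Modules)
model1 = nn.Linear(4, 8)
model2 = nn.Linear(8, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(
    list(model1.parameters()) + list(model2.parameters()), lr=0.1)

x = torch.randn(3, 4)
target = torch.randn(3, 2)

output = model1(x)
output = model2(output)  # feed model1's output into model2
loss = criterion(output, target)
optimizer.zero_grad()
loss.backward()  # gradients flow through both models
optimizer.step()

print(model1.weight.grad is not None)  # True
print(model2.weight.grad is not None)  # True
```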