I was browsing the web looking for how to implement ResNet in PyTorch and found this code, but I have spent a couple of days searching for information explaining how it works and have not found anything. That is why I hope to find information here about the parameters, what they are and how they are used, or some documentation.
Thanks.
import torch.nn as nn
import torchvision.models as models


class MyResNeXt(models.resnet.ResNet):
    def __init__(self, training=True):
        super(MyResNeXt, self).__init__(block=models.resnet.Bottleneck,
                                        layers=[3, 4, 6, 3],
                                        groups=32,
                                        width_per_group=4)
        self.fc = nn.Linear(2048, 1)
1.- Why do we init the ResNet with the Bottleneck block, any docs?
2.- layers=[3, 4, 6, 3]: does this depend on the type of resnet? 18, 50, 101, etc.?
3.- groups?
4.- width_per_group: can we use different sizes? 8, 16, 32, etc.?
5.- self.fc = nn.Linear(2048, 1): this is binary, but why not 2? In Keras you can use 2 as the final output.
The subblocks of the resnet architecture can be defined as BasicBlock or Bottleneck based on the used resnet depth. E.g. resnet18 and resnet34 use BasicBlock, while resnet>=50 use Bottleneck.
Yes. Your mentioned configuration would fit resnet34 and resnet50 as seen here.
Bottleneck layers support the groups argument to create grouped convolutions. (line of code)
Again, a ResNeXt-specific setup for the Bottleneck layer. You could try different values, but would most likely have to look into the paper to see how these values interact with the channels etc.
You could treat your binary classification use case as a 2 class multi-class classification use case, if you set the number of output features to 2.
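For reference, here is a minimal sketch (not the exact torchvision source, just a simplification of its factory functions) showing how the same ResNet class yields the different variants depending on the block type, the per-stage layer counts, groups and width_per_group:

import torchvision.models as models

def my_resnet34(**kwargs):
    # BasicBlock with a 3-4-6-3 stage layout -> resnet34
    return models.resnet.ResNet(models.resnet.BasicBlock, [3, 4, 6, 3], **kwargs)

def my_resnet50(**kwargs):
    # Bottleneck with the same 3-4-6-3 layout -> resnet50
    return models.resnet.ResNet(models.resnet.Bottleneck, [3, 4, 6, 3], **kwargs)

def my_resnext50_32x4d(**kwargs):
    # Same layout, but with grouped convolutions (32 groups, 4 channels per group)
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 4
    return models.resnet.ResNet(models.resnet.Bottleneck, [3, 4, 6, 3], **kwargs)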
@ptrblck, Wow!! this is great support. I am coming from the world of Keras and TensorFlow, and I believe that PyTorch is much better.
A couple of questions:
1.- Why does almost all the sample code I see on the web do it like this:
resnet = models.resnet50(pretrained=True) and not the other way? Is there any advantage?
2.- How can you fine tune this type of model?
3.- Can you concatenate models as you do in Keras?
Creating the resnet50 in a single line of code with pretrained weights is quite convenient instead of writing a custom class. If you don’t want to change e.g. the forward pass or any other modules, you could just stick to the torchvision.models.
Have a look at this or this tutorial for an introduction to finetuning the models.
You can create the computation graph dynamically in any form you wish.
E.g. if you want to feed the output of one model to another one, you can just write:
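A minimal sketch (modelA, modelB, criterion, x and target are placeholders):

output = modelA(x)            # forward pass through the first model
output = modelB(output)       # feed its output directly into the second model
loss = criterion(output, target)
loss.backward()               # backpropagate through both models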
Autograd will make sure to create the gradients in both models as long as you haven’t detached a tensor from the computation graph (e.g. by using numpy methods or calling tensor.detach()).
Hi @ptrblck, I am looking at this link, point #2, but there is something I am missing.
model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
# Here the size of each output sample is set to 2.
# Alternatively, it can be generalized to nn.Linear(num_ftrs, len(class_names)).
model_ft.fc = nn.Linear(num_ftrs, 2)
Then you create another model…
model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)
They have different model names, and in the first one you use all the layers except the last linear one and set it to 2.
In the second piece of code, you freeze all layers and then the rest is basically the same.
BUT I don’t see any fine-tuning here… at least not from the Keras way of doing it, where we reuse the same model, while here they are different models.
Do you mean we have to do this, or what is it that I am missing?
output = model_ft(x)
output = model2(output)
loss = criterion(output, target)
Maybe I’m using the wrong terminology, but by fine tuning a model I mean to use a pretrained model, make some necessary changes for the new dataset (e.g. number of output units) and train this model using the new dataset.
The tutorial explains two different approaches, where the first one trains all parameters, while the latter one only trains the last output layer.
Passing the output of one model to another one is independent from your fine tuning use case, so you can ignore it for now.
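As a rough sketch (paraphrasing the tutorial rather than quoting it), the practical difference between the two approaches is mainly which parameters the optimizer receives:

# Approach 1: fine tune all parameters
model_ft = models.resnet18(pretrained=True)
model_ft.fc = nn.Linear(model_ft.fc.in_features, 2)
optimizer_ft = torch.optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Approach 2: freeze the backbone and train only the new fc layer
model_conv = models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False
model_conv.fc = nn.Linear(model_conv.fc.in_features, 2)
optimizer_conv = torch.optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)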
I was looking into the second link and was wondering how you go about instantiating it. Is this correct? Or where can I find information on how to instantiate the ResNet 50 this way?
Not necessarily. I refer to “fine tuning” as using pretrained parameters and train the model on another dataset.
The author of the mentioned code snippet creates a new model called MyResNeXt by deriving from models.resnet.ResNet as the base model. This allows him to use all parent modules and might make implementing his model easier.
The author doesn’t have to use the pretrained parameters of the base class, so it’s unrelated to a fine tuning task.
I would recommend to use the torchvision.models, if they fit your use case. Initializing the resnet “manually” as shown in your code snippet is needed, if you want to manipulate the model in a non-trivial way (e.g. change whole blocks inside the model etc.).
I might have used the wrong wording again, but there is not really anything special about this approach.
You are writing models in the same way by just passing the output of one layer to the next one:
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 3, 1, 1)
        self.conv2 = nn.Conv2d(6, 6, 3, 1, 1)
        self.act = nn.ReLU()

    def forward(self, x):
        x = self.act(self.conv1(x))
        x = self.conv2(x)
        return x
In this example you are passing the output of the first convolution to the next one.
Since MyModel derives from nn.Module it can be treated as all other nn.Modules e.g. nn.Conv2d.
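A quick usage sketch (the input shape is just a placeholder):

import torch

model = MyModel()
x = torch.randn(1, 3, 24, 24)   # one 3-channel 24x24 "image"
out = model(x)                  # shape [1, 6, 24, 24]
print(out.shape)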
import torch.nn as nn
import torchvision.models as models


class MyResNeXt(models.resnet.ResNet):
    def __init__(self, training=True):
        super(MyResNeXt, self).__init__(block=models.resnet.Bottleneck,
                                        layers=[3, 4, 6, 3],
                                        groups=32,
                                        width_per_group=4)
        # checkpoint (the pretrained state_dict) is expected to be loaded
        # via torch.load before the model is instantiated, see the loop below
        self.load_state_dict(checkpoint)

        # Override the existing FC layer with a new one.
        self.fc = nn.Linear(2048, 1)
def freeze_until(net, param_name):
    # Freeze every parameter up to (but not including) param_name;
    # param_name and all parameters after it stay trainable.
    found_name = False
    for name, params in net.named_parameters():
        if name == param_name:
            found_name = True
        params.requires_grad = found_name
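A quick way to sanity check which parameters actually remain trainable after the call (illustrative only, assuming a model instance like model_ft from the loop below):

freeze_until(model_ft, "layer4.0.conv1.weight")
for name, param in model_ft.named_parameters():
    print(name, param.requires_grad)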
import os

import numpy as np
import torch
import torch.nn as nn
from torch.optim import lr_scheduler

######################################################################
# Finetuning the convnet
# ----------------------
#
# Load a pretrained model and reset final fully connected layer.
#
# device and train_model are assumed to be defined as in the tutorial.

# loop over the number of models to train
for i in np.arange(0, 10):
    # initialize the optimizer and model
    print("[INFO] training model {}/{}".format(i + 1, 10))
    checkpoint = torch.load("../models/resnext50_32x4d-7cdf4587.pth")
    model_ft = MyResNeXt().to(device)  # models.resnet18(pretrained=True)
    del checkpoint

    freeze_until(model_ft, "layer4.0.conv1.weight")

    criterion = nn.CrossEntropyLoss()

    # Observe that all parameters are being optimized
    optimizer_ft = torch.optim.Adam(model_ft.parameters(), lr=1e-5)

    # Decay LR by a factor of 0.1 every 7 epochs
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

    # Train and evaluate
    model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler, num_epochs=25)

    # save the model to disk
    p = os.path.join("models", "model_resnet18_adam{}.model".format(i))
    checkpoint = {"optimizer": optimizer_ft.state_dict(), "model": model_ft.state_dict()}
    torch.save(checkpoint, p)
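For completeness, reloading one of the saved checkpoints later would look roughly like this (a sketch, assuming model_ft and optimizer_ft are constructed the same way as above):

saved = torch.load(os.path.join("models", "model_resnet18_adam0.model"))
model_ft.load_state_dict(saved["model"])
optimizer_ft.load_state_dict(saved["optimizer"])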
Thanks again for your help!!! I really have been trying to find advanced info on the web about PyTorch, but nothing really goes beyond MNIST…
@ptrblck I found the solution on the forum. You and the forum are of great help!
“Sir, I have got the error. It was because i had two classes and was using 1 at the output layer, which is used in other frameworks. But as I changed it to 2 my code is running. Similar thing happened in the age data where total number of classes were 104 but the actual age was from 1-116 and I was f…”
But I thought that we need to use 1 for two-class classification, and 2 was for more than 2 classes.
No. You can use a single output with e.g. nn.BCEWithLogitsLoss for a binary classification.
Alternatively you could also use two outputs with nn.CrossEntropyLoss for a “two class classification”, which would also classify each sample to one of two classes.
Note that the latter approach will double the output units and thus the last weight matrix.
Besides using another loss function, the target would also be different. nn.BCEWithLogitsLoss expects the target as a FloatTensor with values in the range [0, 1], while nn.CrossEntropyLoss expects it to be a LongTensor with class indices in the range [0, num_classes-1].
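A small sketch of both setups (the shapes and values are just illustrative):

import torch
import torch.nn as nn

batch_size = 4
features = torch.randn(batch_size, 2048)

# Option 1: one output unit + nn.BCEWithLogitsLoss with float targets in [0, 1]
fc_binary = nn.Linear(2048, 1)
criterion_bce = nn.BCEWithLogitsLoss()
target_binary = torch.randint(0, 2, (batch_size, 1)).float()
loss_bce = criterion_bce(fc_binary(features), target_binary)

# Option 2: two output units + nn.CrossEntropyLoss with long targets holding class indices
fc_two_class = nn.Linear(2048, 2)
criterion_ce = nn.CrossEntropyLoss()
target_two_class = torch.randint(0, 2, (batch_size,))
loss_ce = criterion_ce(fc_two_class(features), target_two_class)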