I do not need the fully-connected layers; I just need the conv1_1 ~ relu5_3 layers of vgg16, and the output of relu5_3 given a 256x256 input. How can I force the pretrained weights of the conv1_1 ~ relu5_3 layers to accept a 256x256 input?
You don’t need to do anything to support different sizes for the convolutional layers.
I just tried your example with an input of size 1x3x256x256 and it worked just fine.
Do the input image pixel values need to be in [0,1] or in [0,255]?
The information about how to pre-process the inputs for the models can be found at https://github.com/pytorch/vision#models
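As a rough sketch of what that page describes: images are scaled to [0, 1] and then normalized per channel with the ImageNet mean and std. The helper name `preprocess` is made up for this example.

```python
import torch

# ImageNet statistics given in the torchvision README for its pretrained models.
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def preprocess(batch_uint8):
    """Hypothetical helper: N x 3 x H x W uint8 images in [0, 255] -> normalized floats."""
    x = batch_uint8.float() / 255.0  # scale to [0, 1] first
    return (x - MEAN) / STD          # then normalize each RGB channel

images = torch.randint(0, 256, (2, 3, 256, 256), dtype=torch.uint8)
inputs = preprocess(images)
print(inputs.shape, inputs.dtype)
```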
Regarding this approach
new_classifier = nn.Sequential(*list(model.classifier.children())[:-1])
model.classifier = new_classifier
This method worked for vgg16, but unfortunately it does not work for inception_v3, as that model does not have a classifier attribute. What would be the best approach for extracting features from the inception_v3 model?
Thanks!
Because of the way Inception_v3 is structured, you would need to do some more manual work to get the layers that you want.
Something along the lines of:
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class MyInceptionFeatureExtractor(nn.Module):
    def __init__(self, inception, transform_input=False):
        super(MyInceptionFeatureExtractor, self).__init__()
        self.transform_input = transform_input
        self.Conv2d_1a_3x3 = inception.Conv2d_1a_3x3
        self.Conv2d_2a_3x3 = inception.Conv2d_2a_3x3
        self.Conv2d_2b_3x3 = inception.Conv2d_2b_3x3
        self.Conv2d_3b_1x1 = inception.Conv2d_3b_1x1
        self.Conv2d_4a_3x3 = inception.Conv2d_4a_3x3
        self.Mixed_5b = inception.Mixed_5b
        # stop where you want, copy paste from the model def

    def forward(self, x):
        if self.transform_input:
            # index the channel dimension (dim 1), not the batch dimension
            x = x.clone()
            x[:, 0] = x[:, 0] * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
            x[:, 1] = x[:, 1] * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
            x[:, 2] = x[:, 2] * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        # 299 x 299 x 3
        x = self.Conv2d_1a_3x3(x)
        # 149 x 149 x 32
        x = self.Conv2d_2a_3x3(x)
        # 147 x 147 x 32
        x = self.Conv2d_2b_3x3(x)
        # 147 x 147 x 64
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        # 73 x 73 x 64
        x = self.Conv2d_3b_1x1(x)
        # 73 x 73 x 80
        x = self.Conv2d_4a_3x3(x)
        # 71 x 71 x 192
        x = F.max_pool2d(x, kernel_size=3, stride=2)
        # 35 x 35 x 192
        x = self.Mixed_5b(x)
        # copy paste from model definition, just stopping where you want
        return x

# newer torchvision versions take a `weights=` argument instead of `pretrained=`
inception = torchvision.models.inception_v3(pretrained=True)
my_inception = MyInceptionFeatureExtractor(inception)
Thanks for the quick response,
I get the gist of how to do it.
Again,
inception = torchvision.models['inception_v3']
inception does not have the attribute ‘Conv2d_1a_3x3’. How do I access these modules through inception?
It should have these modules, as shown in the link in my previous message.
If not, you are probably using a DataParallel on top of inception, so you first need to get the module back by using inception.module.Conv2d_1a_3x3. Just to be sure, print your model and inspect the names of the modules that are printed.
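To illustrate the DataParallel point, here is a small sketch with a made-up stand-in model (`Tiny` and `unwrap` are hypothetical names, not part of torchvision):

```python
import torch.nn as nn

class Tiny(nn.Module):
    """Hypothetical stand-in for inception, with one identically named layer."""
    def __init__(self):
        super().__init__()
        self.Conv2d_1a_3x3 = nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x):
        return self.Conv2d_1a_3x3(x)

model = Tiny()
wrapped = nn.DataParallel(model)

# The wrapper has a single child named "module"; the real layers live under it.
child_names = [name for name, _ in wrapped.named_children()]
print(child_names)  # ['module']

def unwrap(m):
    """Return the underlying model whether or not it is wrapped in DataParallel."""
    return m.module if isinstance(m, nn.DataParallel) else m

first_conv = unwrap(wrapped).Conv2d_1a_3x3
print(first_conv)
```

Accessing `wrapped.Conv2d_1a_3x3` directly raises an AttributeError, which matches the symptom described above.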
Got it to work. Thanks
So how do I perform the backward pass for a model with multiple outputs? Is it like this?
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 1, 3)
        self.conv2 = nn.Conv2d(1, 1, 3)
        self.conv3 = nn.Conv2d(1, 1, 3)

    def forward(self, x):
        out1 = F.relu(self.conv1(x))
        out2 = F.relu(self.conv2(out1))
        out3 = F.relu(self.conv3(out2))
        return out1, out2, out3

model = Net()
o1, o2, o3 = model(input)
loss1 = criterion(o1, target1)
loss2 = criterion(o2, target2)
loss3 = criterion(o3, target3)
loss1.backward()
loss2.backward()
loss3.backward()
If it is not like this, how do I make this work?
You can use autograd.backward, or just sum the losses and backprop the summed loss:
loss = loss1 + loss2 + loss3
loss.backward()
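A small sketch comparing the two options on the three-output Net from the question. The zero targets and the helper `grads_after` are made up for the example; the point is only that both routes leave the same gradients on the parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 1, 3)
        self.conv2 = nn.Conv2d(1, 1, 3)
        self.conv3 = nn.Conv2d(1, 1, 3)

    def forward(self, x):
        out1 = F.relu(self.conv1(x))
        out2 = F.relu(self.conv2(out1))
        out3 = F.relu(self.conv3(out2))
        return out1, out2, out3

torch.manual_seed(0)
model = Net()
criterion = nn.MSELoss()
x = torch.randn(1, 1, 12, 12)

def grads_after(backward_fn):
    """Hypothetical helper: run one forward/backward and return parameter grads."""
    model.zero_grad()
    outs = model(x)
    losses = [criterion(o, torch.zeros_like(o)) for o in outs]  # dummy targets
    backward_fn(losses)
    return [p.grad.clone() for p in model.parameters()]

# Option 1: sum the losses, then a single backward call.
g_sum = grads_after(lambda ls: sum(ls).backward())
# Option 2: hand all losses to torch.autograd.backward in one call.
g_multi = grads_after(lambda ls: torch.autograd.backward(ls))

same = all(torch.allclose(a, b, atol=1e-6) for a, b in zip(g_sum, g_multi))
print(same)
```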
I wrote a demo:
import torch
import torch.nn as nn
from torch.autograd import Variable

criterion = nn.MSELoss()
x = Variable(torch.randn(1, 100), requires_grad=True)
y = Variable(torch.randn(1, 40))

class ToyModel(nn.Module):
    def __init__(self):
        super(ToyModel, self).__init__()
        self.linear1 = nn.Linear(100, 50)
        self.linear2 = nn.Linear(50, 40)
        self.linear3 = nn.Linear(100, 40)

    def forward(self, x):
        out1 = self.linear1(x)
        out2 = self.linear2(out1)
        out3 = self.linear3(x)
        return out2, out3

model = ToyModel()
out2, out3 = model.forward(x)
print(out3)
loss1 = criterion(out2, y)
loss2 = criterion(out3, y)
# print(out2.grad)
# grad1 and grad2 are not defined anywhere; this is the part I am unsure about
torch.autograd.backward(x, [grad1, grad2])
But how do I get grad1 and grad2? print(out2.grad) just gives me 'None'.
Is there any bug in my code? Thanks!
But how to get grad1 and grad2 ?
You don’t have to; that’s why PyTorch is amazing:
model=ToyModel()
out2,out3 = model.forward(x)
loss1= criterion(out2,y)
loss2 = criterion(out3,y)
loss = loss1 + loss2
loss.backward()
If I modify it to:
loss = loss1 + 0.8 * loss2
then it is a weighted loss?
Yes, it is weighted. You can mix the losses as you want.
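A quick way to convince yourself, sketched with two made-up scalar losses that share one parameter `w`: the gradient of the weighted sum is the same weighted sum of the individual gradients.

```python
import torch

torch.manual_seed(0)
w = torch.randn(3, requires_grad=True)  # shared parameter (illustrative)
x = torch.randn(3)

loss1 = (w * x).sum() ** 2
loss2 = (w ** 2).sum()

# Gradients of each loss on its own; keep the graph for the later calls.
g1, = torch.autograd.grad(loss1, w, retain_graph=True)
g2, = torch.autograd.grad(loss2, w, retain_graph=True)

# Gradient of the weighted loss.
g_weighted, = torch.autograd.grad(loss1 + 0.8 * loss2, w)

print(torch.allclose(g_weighted, g1 + 0.8 * g2))
```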
What if I really need to get grad1 and grad2 in the demo I posted above?
Before you run backpropagation, .grad is None: gradients are only allocated when they are actually computed (lazy initialization).
import torch
import torch.nn as nn
from torch.autograd import Variable
model = nn.Linear(5, 7)
x = Variable(torch.randn(10, 5))
y = model(x)
print(model.weight.grad) # None
y.backward(torch.randn(y.size()))
print(model.weight.grad) # Prints a 7x5 tensor
What about the ToyModel from my demo above, which returns out2 and out3?
You need to use hooks if you want to inspect the gradients of intermediate variables.
Refer to the discussion here: Why cant I see .grad of an intermediate variable?
In your case you need to attach a hook to out2 and to out3; each hook is called with the gradient of that variable during the backward pass.
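A minimal sketch of the hook approach on the ToyModel from earlier in the thread, written against current PyTorch (plain tensors rather than Variable). The dictionary `grads` and its keys are just names chosen for this example.

```python
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(100, 50)
        self.linear2 = nn.Linear(50, 40)
        self.linear3 = nn.Linear(100, 40)

    def forward(self, x):
        out1 = self.linear1(x)
        out2 = self.linear2(out1)
        out3 = self.linear3(x)
        return out2, out3

torch.manual_seed(0)
model = ToyModel()
criterion = nn.MSELoss()
x = torch.randn(1, 100)
y = torch.randn(1, 40)

out2, out3 = model(x)
grads = {}  # filled in by the hooks during backward

# A tensor hook is called with the gradient w.r.t. that tensor.
# The lambdas return None, so the gradient itself is left unchanged.
out2.register_hook(lambda g: grads.__setitem__("grad1", g))
out3.register_hook(lambda g: grads.__setitem__("grad2", g))

loss = criterion(out2, y) + criterion(out3, y)
loss.backward()
print(grads["grad1"].shape, grads["grad2"].shape)
```

An alternative, if you only need to read the values afterwards, is to call `out2.retain_grad()` before `backward()` and then look at `out2.grad`.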
Hey guys, pardon my silly question. Of what use is the FC layer? How do you generalize to localization/detection of objects using the features from the last FC layer?