I want to use a VGG network for a dataset whose input size is 12x64x64. I also want to initialize the first 12 layers with the pretrained weights of its first layer. How can I do that?
To change the number of input channels in the first layer, you could just assign a new nn.Conv2d layer:
import torch
import torch.nn as nn
from torchvision import models
model = models.vgg16()
model.features[0] = nn.Conv2d(12, 64, 3, 1, 1)
x = torch.randn(1, 12, 64, 64)
output = model(x)
If you want to copy the pretrained weights from the original first layer, you could do so after reassigning the new layer.
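A minimal sketch of that copy step, assuming the goal is to reuse the pretrained 3-channel kernels for the first three input channels and leave the remaining channels at their default initialization. A freshly created `nn.Conv2d(3, 64, ...)` stands in for the pretrained `model.features[0]` so the snippet runs without downloading weights; `old_conv` and `new_conv` are illustrative names.

```python
import torch
import torch.nn as nn

# Stand-in for the pretrained first layer (model.features[0] in VGG16).
# This is an assumption so the sketch runs without downloading weights.
old_conv = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)

# Replacement layer that accepts 12 input channels.
new_conv = nn.Conv2d(12, 64, kernel_size=3, stride=1, padding=1)

with torch.no_grad():
    # Reuse the 3-channel kernels for the first three input channels.
    new_conv.weight[:, :3] = old_conv.weight
    # The bias has one entry per output channel, so its shape is unchanged.
    new_conv.bias.copy_(old_conv.bias)

x = torch.randn(1, 12, 64, 64)
out = new_conv(x)
print(out.shape)  # torch.Size([1, 64, 64, 64])
```

In a real model, `old_conv` would be the layer read from the pretrained network before it is replaced.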
I’m not sure how you would want to initialize the first 12 layers with the weights of the first one, as the shapes of the kernels are different. Could you explain your use case a bit more?
I tried the following, but it is not working.
vgg = models.vgg16_bn(pretrained=True)
temp = models.vgg16_bn(pretrained=True)
vgg.features[0] = nn.Conv2d(12, 64, 3, 1, 1)
params1 = temp.features[0].state_dict()
params2 = vgg.features[0].state_dict()
params2['weight'][:, :3, :, :] = params1['weight'][:, :, :, :]
params2['weight'][:, 3:6, :, :] = params1['weight'][:, :, :, :]
params2['weight'][:, 6:9, :, :] = params1['weight'][:, :, :, :]
params2['weight'][:, 9:12, :, :] = params1['weight'][:, :, :, :]
params2['bias'] = params1['bias']
vgg.features[0].load_state_dict(params2)
Can you please help?
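One way to express the four slice assignments above in a single step is `Tensor.repeat`, which tiles the 3-channel kernels along the input-channel dimension. This is only a sketch: a randomly initialized layer stands in for the pretrained `temp.features[0]`, and `pretrained_conv` and `new_conv` are illustrative names that are not part of the original code.

```python
import torch
import torch.nn as nn

# Stand-in for temp.features[0]; assumed here so no weights are downloaded.
pretrained_conv = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)

# New first layer with 12 input channels.
new_conv = nn.Conv2d(12, 64, kernel_size=3, stride=1, padding=1)

with torch.no_grad():
    # Tile the kernels four times along dim 1: (64, 3, 3, 3) -> (64, 12, 3, 3).
    new_conv.weight.copy_(pretrained_conv.weight.repeat(1, 4, 1, 1))
    new_conv.bias.copy_(pretrained_conv.bias)

# Check that every group of three input channels holds the same kernels.
groups_match = all(
    torch.equal(new_conv.weight[:, start:start + 3], pretrained_conv.weight)
    for start in range(0, 12, 3)
)
print(groups_match)
```

Copying inside `torch.no_grad()` keeps the initialization out of the autograd graph, and the layer's parameters remain trainable afterwards.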
I am pasting my code here. I am not able to execute the following code:
model = models.vgg16_bn(pretrained=True)
weight = model.features[0].weight.clone()
model.features[0] = nn.Conv2d(12, 64, kernel_size=3, stride=1, padding=1, bias=False)
with torch.no_grad():
    model.features[0].weight[:, :3] = weight
    model.features[0].weight[:, 3:6] = weight
    model.features[0].weight[:, 6:9] = weight
    model.features[0].weight[:, 9:12] = weight
features = list(model.features.children())[:-20]
model.features = nn.Sequential(*features)
avgpool = list(model.avgpool.children())[:-1]
model.avgpool = nn.Sequential(*avgpool)
CLASS_L1 = 10
features_classifier = list(model.classifier.children())[:-8]
features_classifier.extend(nn.Sequential(nn.Linear(256, CLASS_L1), nn.BatchNorm1d(CLASS_L1), nn.ReLU(inplace=True)))
model.classifier = nn.Sequential(*features_classifier)
X = np.load("file1.npy")
train_loader = DataLoader(dataset=X, batch_size=10, shuffle=True, num_workers=4, pin_memory=True)
for epoch in range(num_epochs):
    print("Epoch {}/{}".format(epoch, num_epochs))
    print('-*' * 20)
    loss_train = 0
    model.train(True)
    for i, input in enumerate(train_loader):
        inputs = input.float()
        print("shape:", inputs.shape)
        with torch.no_grad():
            if use_gpu:
                inputs = Variable(inputs.cuda())
            else:
                inputs = Variable(inputs)
        optimizer.zero_grad()
        outputs = model(inputs)
        _, preds = torch.max(outputs.data, 1)
        loss = new_Loss(outputs).cuda()
        loss_train += loss.data
        print("minibatch loss:", loss)
        loss.backward()
        optimizer.step()
        del inputs, outputs
        torch.cuda.empty_cache()
    print('*' * 25)
ERROR MESSAGE:
Epoch 0/250
--------------------
('shape:', (10, 12, 8, 8))
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument
vgg_new.py:69: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  temp=(torch.sum(softmax(input), dim=0))
vgg_new.py:71: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  ce=(torch.sum(softmax(input)*f.log_softmax(input, 1)*x))/len(input)
('classification entropy:', tensor(6.8673, device='cuda:0', grad_fn=))
('class entropy:', tensor(9.9495, device='cuda:0', grad_fn=))
vgg_new.py:84: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  return ((torch.sum(softmax(input)*f.log_softmax(input, 1)*x))/len(input))-torch.sum(temp*torch.log(temp)*x[0])
('minibatch loss:', tensor(-0.0291, device='cuda:0', grad_fn=))
Traceback (most recent call last):
  File "vgg_new.py", line 141, in <module>
    loss.backward()
  File "/home/students/.local/lib/python2.7/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/students/.local/lib/python2.7/site-packages/torch/autograd/__init__.py", line 91, in backward
    Variable._execution_engine.run_backward(tensors, grad_tensors, retain_graph, create_graph, allow_unreachable=True)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
Kindly advise how to fix it.
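The `UserWarning` lines in the log are probably unrelated to the cuDNN failure; they only ask for an explicit `dim` in the softmax call. A minimal sketch of what that looks like, assuming the model output has shape `(batch, classes)` as in the loop above. The `outputs` tensor here is random stand-in data, not the real model output.

```python
import torch
import torch.nn.functional as F

# Stand-in for the model output: batch of 10 samples, 10 classes (assumed shape).
outputs = torch.randn(10, 10)

# Passing dim=1 normalizes over the class dimension and avoids the
# "Implicit dimension choice for softmax" deprecation warning.
probs = F.softmax(outputs, dim=1)
log_probs = F.log_softmax(outputs, dim=1)

# Each row of probs is a distribution over classes.
row_sums = probs.sum(dim=1)
print(row_sums)
```

If `softmax` in the original script is an `nn.Softmax` module, the equivalent change would be constructing it as `nn.Softmax(dim=1)`.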