How to change parameters of first layer in pretrained network by copying its own weight iteratively

I want to use a VGG network for a dataset whose input size is 12x64x64. I also want to initialize the first 12 layers with the pretrained weights of its first layer. How can I do that?

To change the number of input channels in the first layer, you could just assign a new nn.Conv2d layer:

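import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16()
# Replace the first conv layer so it accepts 12 input channels instead of 3
model.features[0] = nn.Conv2d(12, 64, 3, 1, 1)
x = torch.randn(1, 12, 64, 64)
output = model(x)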

If you want to copy the pretrained weights from the original first layer, you could do so after reassigning the new layer.

I’m not sure how you would want to initialize the first 12 layers with the weights of the first one, as the shapes of the kernels are different. Could you explain your use case a bit more?

I tried the following, but it is not working.

vgg = models.vgg16_bn(pretrained=True)
temp= models.vgg16_bn(pretrained=True)

vgg.features[0] = nn.Conv2d(12, 64, 3, 1, 1)
params1 = temp.features[0].state_dict()
params2 = vgg.features[0].state_dict()

params2['weight'][:, :3, :, :] = params1['weight'][:, :, :, :]
params2['weight'][:, 3:6, :, :] = params1['weight'][:, :, :, :]
params2['weight'][:, 6:9, :, :] = params1['weight'][:, :, :, :]
params2['weight'][:, 9:12, :, :] = params1['weight'][:, :, :, :]
params2['bias'] = params1['bias']
vgg.features[0].load_state_dict(params2)

Can you please help?

This post gives you a small example.
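A minimal sketch of what such an example could look like, assuming the goal is to repeat the pretrained 3-channel kernel four times along the input-channel dimension. To keep it self-contained, `pretrained_conv` is a randomly initialized stand-in for the real pretrained `model.features[0]`; both names are illustrative.

```python
import torch
import torch.nn as nn

# Stand-in for the pretrained first layer (model.features[0] in VGG):
# 3 input channels, 64 output channels, 3x3 kernel.
pretrained_conv = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)

# New first layer that accepts 12 input channels.
new_conv = nn.Conv2d(12, 64, kernel_size=3, stride=1, padding=1)

with torch.no_grad():
    # Tile the [64, 3, 3, 3] kernel four times along dim 1 -> [64, 12, 3, 3].
    new_conv.weight.copy_(pretrained_conv.weight.repeat(1, 4, 1, 1))
    # The bias has one entry per output channel, so it copies over unchanged.
    new_conv.bias.copy_(pretrained_conv.bias)

x = torch.randn(1, 12, 64, 64)
out = new_conv(x)
print(out.shape)
```

With a real model, `pretrained_conv` would be the original `features[0]`, saved before it is replaced. Whether the tiled weights should also be rescaled (for example divided by 4) depends on what the 12 channels contain, so that choice is left open here.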

I am pasting my code here. I am not able to execute the following code.

import torch
import torch.nn as nn
from torchvision import models

model = models.vgg16_bn(pretrained=True)
# Keep a copy of the pretrained 3-channel kernel before replacing the layer
weight = model.features[0].weight.clone()
model.features[0] = nn.Conv2d(12, 64, kernel_size=3, stride=1, padding=1, bias=False)
with torch.no_grad():
    # Copy the pretrained kernel into each group of 3 input channels
    model.features[0].weight[:, :3] = weight
    model.features[0].weight[:, 3:6] = weight
    model.features[0].weight[:, 6:9] = weight
    model.features[0].weight[:, 9:12] = weight

# Truncate the feature extractor, pooling, and classifier
features = list(model.features.children())[:-20]
model.features = nn.Sequential(*features)
avgpool = list(model.avgpool.children())[:-1]
model.avgpool = nn.Sequential(*avgpool)
CLASS_L1 = 10
features_classifier = list(model.classifier.children())[:-8]
features_classifier.extend([nn.Linear(256, CLASS_L1), nn.BatchNorm1d(CLASS_L1), nn.ReLU(inplace=True)])
model.classifier = nn.Sequential(*features_classifier)

import numpy as np
from torch.utils.data import DataLoader

X = np.load("file1.npy")
train_loader = DataLoader(dataset=X, batch_size=10, shuffle=True, num_workers=4, pin_memory=True)

for epoch in range(num_epochs):
    print("Epoch {}/{}".format(epoch, num_epochs))
    print('-*' * 20)

    loss_train = 0
    model.train(True)

    for i, input in enumerate(train_loader):
        inputs = input.float()
        print("shape:", inputs.shape)
        with torch.no_grad():
            if use_gpu:
                inputs = Variable(inputs.cuda())
            else:
                inputs = Variable(inputs)
        optimizer.zero_grad()

        outputs = model(inputs)
        _, preds = torch.max(outputs.data, 1)

        loss = new_Loss(outputs).cuda()
        loss_train += loss.data

        print("minibatch loss:", loss)

        loss.backward()
        optimizer.step()

        del inputs, outputs
        torch.cuda.empty_cache()

        print('*' * 25)

ERROR MESSAGE:

Epoch 0/250
--------------------
('shape:', (10, 12, 8, 8))
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument
vgg_new.py:69: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  temp=(torch.sum(softmax(input), dim=0))
vgg_new.py:71: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  ce=(torch.sum(softmax(input)*f.log_softmax(input, 1)*x))/len(input)
('classification entropy:', tensor(6.8673, device='cuda:0', grad_fn=))
('class entropy:', tensor(9.9495, device='cuda:0', grad_fn=))
vgg_new.py:84: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  return ((torch.sum(softmax(input)*f.log_softmax(input, 1)*x))/len(input))-torch.sum(temp*torch.log(temp)*x[0])
('minibatch loss:', tensor(-0.0291, device='cuda:0', grad_fn=))
Traceback (most recent call last):
  File "vgg_new.py", line 141, in <module>
    loss.backward()
  File "/home/students/.local/lib/python2.7/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/students/.local/lib/python2.7/site-packages/torch/autograd/__init__.py", line 91, in backward
    Variable._execution_engine.run_backward(tensors, grad_tensors, retain_graph, create_graph, allow_unreachable=True)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Kindly advise how to fix it.
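The UserWarning lines in the log ask for an explicit `dim` in the softmax calls. A small sketch of that change, assuming the model output has shape [batch, classes]; the `logits` tensor and the entropy computation are illustrative and are not the `new_Loss` from the script. This addresses the warnings only and may not be what causes the cuDNN failure; running a single batch on the CPU often gives a more specific error message.

```python
import torch
import torch.nn.functional as F

# Hypothetical model output: 10 samples, 10 classes.
logits = torch.randn(10, 10)

# Pass dim explicitly so the softmax is taken over the class dimension.
probs = F.softmax(logits, dim=1)
log_probs = F.log_softmax(logits, dim=1)

# Illustrative quantity: mean per-sample entropy of the predictions.
entropy = -(probs * log_probs).sum(dim=1).mean()
print(entropy)
```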