I am trying to copy the weight matrix between the last conv layer and the first dense layer to a new architecture. In the original network, the output shape of the last conv layer is 256x6x6 and the first dense layer has 4096 nodes. In the new architecture, the output shape of the last conv layer is 200x6x6 and the number of nodes in the first dense layer is the same, i.e. 4096.
The shape of the original weight matrix is 4096x9216 (4096 nodes in the dense layer, and 256x6x6 = 9216 flattened inputs), and the shape of the new matrix is 4096x7200 (4096 nodes in the dense layer, and 200x6x6 = 7200 flattened inputs).
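To make sure I understand the layout correctly: flattening a (C, 6, 6) conv output in row-major order should place the 36 values of filter i at columns i*36 through (i+1)*36 - 1 of the dense layer's input. A quick sanity check (toy tensor, not my real weights):

```python
import torch

# Toy conv output with distinct values so the mapping is easy to verify.
conv_out = torch.arange(256 * 6 * 6).reshape(256, 6, 6)
flat = conv_out.reshape(-1)  # 9216 flattened features

i = 10  # any filter index
# Filter i's 36 values occupy the contiguous column block i*36:(i+1)*36.
assert torch.equal(flat[i * 36:(i + 1) * 36], conv_out[i].reshape(-1))
print(flat.shape)  # torch.Size([9216])
```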
I tried to copy the weights like this, but I am not able to figure out whether I am actually copying the weights of those specific 200 filters. The code I tried is here:
if isinstance(m0, nn.Linear):  # m0 is the original network layer from module.modules()
    filterindex = []  # filled with indexes of the filters to be copied (200 out of 256)
    print(m0.weight.data.shape[0])  # gives 4096
    weights = torch.ones(4096, 7200)  # create new tensor of size 4096x7200
    for new_index, filter_index in enumerate(filterindex):
        # copy the 36 columns belonging to this filter into the next free
        # 36-column slot of the new matrix, for all 4096 rows at once
        weights[:, new_index * 36:(new_index + 1) * 36] = \
            m0.weight.data[:, filter_index * 36:(filter_index + 1) * 36]
    print("data[0]", m0.weight.data[0][:36])  # print first 36 values of row 0
    m1.weight.data = weights.clone().cuda()  # set new weights on m1 (new network)
Basically, the output of the original network's last conv layer is 256x6x6, which is 9216 values after flattening. I want to copy only the weights for the filter indexes present in the filterindex list, i.e. 7200 values out of 9216 per row. Each filter contributes 6x6x1 = 36 values.
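One idea I considered is doing the selection in a single vectorized step instead of a loop: build the list of flattened column indexes owned by the kept filters, then index those columns all at once. A minimal sketch with placeholder data (filterindex and old_w stand in for my real filter list and m0.weight.data):

```python
import torch

filterindex = list(range(200))   # placeholder: the 200 kept filter indexes
old_w = torch.randn(4096, 9216)  # stands in for m0.weight.data

# Filter i owns flattened columns i*36 .. (i+1)*36 - 1.
cols = torch.tensor([i * 36 + j for i in filterindex for j in range(36)])
new_w = old_w[:, cols]  # shape: (4096, 7200)
```

The result would then be assigned the same way as in my loop version, e.g. m1.weight.data = new_w.clone().cuda().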
Any suggestions on how I can achieve this?