How to manually load weights (from a .txt file) into Conv2d.weight inside nn.Sequential?


(Yaluguo) #1

I have extracted some weight and bias data from a pre-trained TensorFlow CNN model and saved them in txt files.
How can I load this data into a model whose layers are wrapped in nn.Sequential, as in my PyTorch code below?

import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        # First conv block: 4 input channels -> 32 feature maps
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=4,
                out_channels=32,
                kernel_size=8,
                stride=4,
                padding=2,
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )


#2

You can access and set the convolution kernel like this (the assigned tensors must have the same shape as the existing weight and bias):

self.conv1[0].weight.data = pretrained_weight
self.conv1[0].bias.data = pretrained_bias
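To connect this with the txt files from the question, here is a minimal sketch of loading flat text files into the first layer of the nn.Sequential block. The file names are hypothetical, the files are generated with random values so the snippet runs on its own, and it assumes the values are already stored in PyTorch's (out_channels, in_channels, kH, kW) order. It copies in place under torch.no_grad() rather than assigning to .data, which is a common alternative and not the only option.

```python
import os
import tempfile

import numpy as np
import torch
import torch.nn as nn

conv1 = nn.Sequential(
    nn.Conv2d(in_channels=4, out_channels=32, kernel_size=8, stride=4, padding=2),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),
)

# Stand-in for the exported files: write random values so the sketch is runnable.
tmp_dir = tempfile.mkdtemp()
weight_path = os.path.join(tmp_dir, "conv1_weight.txt")  # hypothetical file name
bias_path = os.path.join(tmp_dir, "conv1_bias.txt")      # hypothetical file name
rng = np.random.default_rng(0)
np.savetxt(weight_path, rng.standard_normal(32 * 4 * 8 * 8))
np.savetxt(bias_path, rng.standard_normal(32))

# Load the flat values and reshape them to the layout nn.Conv2d expects.
# Assumption: the file already uses (out_channels, in_channels, kH, kW) order.
weight = np.loadtxt(weight_path, dtype=np.float32).reshape(32, 4, 8, 8)
bias = np.loadtxt(bias_path, dtype=np.float32)

# Copy in place without recording the operation for autograd.
with torch.no_grad():
    conv1[0].weight.copy_(torch.from_numpy(weight))
    conv1[0].bias.copy_(torch.from_numpy(bias))
```

Because copy_ checks shapes, a mismatch between the file contents and the layer raises an error instead of silently replacing the parameter with a tensor of the wrong size.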


(Yaluguo) #4

Thanks for your answers! But a few things still puzzle me.

I did the following:

mycnn = CNN()
print(mycnn.state_dict().keys())

It shows:
['conv1.0.weight', 'conv1.0.bias', 'conv2.0.weight', 'conv2.0.bias', 'conv3.0.weight', 'conv3.0.bias', 'fc1.weight', 'fc1.bias', 'out.weight', 'out.bias']

Then I tried the following:

print(mycnn.conv1[0].bias.data)
print(mycnn.state_dict()['conv1.0.bias'].data)

The outputs are different.

I also checked the gradient. It shows:

mycnn.conv1[0].bias.grad is None

mycnn.state_dict()['conv1.0.bias'].grad raises an error:
AttributeError: 'torch.FloatTensor' object has no attribute 'grad'

Can you explain the difference between "mycnn.conv1[0].bias" and "mycnn.state_dict()['conv1.0.bias']" in my PyTorch model?


#5

Maybe what is stored in mycnn.state_dict() are just PyTorch tensors, not Variables.
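The error message quoted earlier suggests an older PyTorch release, from before Variable and Tensor were merged in version 0.4. As a rough sketch of how the same distinction looks in recent releases: the attribute on the module is an nn.Parameter tracked by autograd, while the default state_dict() entry is a detached tensor that holds the same values. The single-layer model below is only an illustration, not the poster's network.

```python
import torch
import torch.nn as nn

# Illustrative one-layer model; its state_dict key is "0.bias" rather than "conv1.0.bias".
model = nn.Sequential(nn.Conv2d(4, 32, kernel_size=8, stride=4, padding=2))

param = model[0].bias                 # nn.Parameter, tracked by autograd
entry = model.state_dict()["0.bias"]  # detached tensor returned by state_dict()

print(type(param).__name__, param.requires_grad)
print(type(entry).__name__, entry.requires_grad)

# Both objects hold the same numbers and point at the same underlying memory.
same_values = torch.equal(param.detach(), entry)
shares_memory = param.data_ptr() == entry.data_ptr()
print(same_values, shares_memory)
```

Since the two share memory in this setup, writing into either one in place changes the values the layer uses, but only the parameter accumulates a gradient during backward().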


(Yaluguo) #7

I have built exactly the same model in both TF and PyTorch, and trained it in TF. For some reason, I have to transfer the pretrained weights to PyTorch.

The network is like:

In TF, the Conv2d filter shape is [filter_height, filter_width, in_channels, out_channels], while in PyTorch it is (out_channels, in_channels, kernel_size[0], kernel_size[1]).
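The layout difference described here can be handled with an axis permutation. The sketch below uses a random array as a stand-in for a filter exported from TF (the variable names are made up for illustration) and moves the axes from [height, width, in, out] to (out, in, height, width) before copying into the layer. A plain reshape to the target shape would produce the right dimensions but scramble the values.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical stand-in for a filter exported from TF: [height, width, in, out].
tf_filter = np.random.default_rng(0).standard_normal((8, 8, 4, 32)).astype(np.float32)

# Permute axes to (out, in, height, width) and make the result contiguous in memory.
pt_filter = np.ascontiguousarray(np.transpose(tf_filter, (3, 2, 0, 1)))

conv = nn.Conv2d(in_channels=4, out_channels=32, kernel_size=8, stride=4, padding=2)
with torch.no_grad():
    conv.weight.copy_(torch.from_numpy(pt_filter))

# Spot check: output channel 5, input channel 2, row 3, column 1.
print(pt_filter[5, 2, 3, 1] == tf_filter[3, 1, 2, 5])
```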

So I did the following in TF:
image

and transferred it to PyTorch like this:
image

It turns out that the DQN in PyTorch does not work as well as it does in TF!
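The thread does not say what caused the gap, so the following is only one thing worth checking, not a diagnosis. If the TF model flattened an NHWC feature map before the first fully connected layer, the rows of that layer's kernel follow (h, w, c) order, whereas PyTorch flattens an NCHW map in (c, h, w) order. Transposing the conv filters alone would then not be enough. The sketch uses NumPy only, with small made-up sizes, and verifies that the reordered weight reproduces the TF-style result.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W, OUT = 3, 2, 2, 5  # illustrative sizes, not taken from the thread

# Assumed TF dense kernel: [in_features, out_features], rows in (h, w, c) order.
tf_kernel = rng.standard_normal((H * W * C, OUT))

# Reorder to the nn.Linear layout (out_features, in_features), columns in (c, h, w) order.
pt_weight = (
    tf_kernel.reshape(H, W, C, OUT)
    .transpose(3, 2, 0, 1)     # -> (out, c, h, w)
    .reshape(OUT, C * H * W)
)

# One feature map in PyTorch's NCHW layout.
x_nchw = rng.standard_normal((1, C, H, W))

# TF-style: convert to NHWC, flatten, multiply by the original kernel.
tf_out = x_nchw.transpose(0, 2, 3, 1).reshape(1, -1) @ tf_kernel
# PyTorch-style: flatten NCHW directly, multiply by the reordered weight.
pt_out = x_nchw.reshape(1, -1) @ pt_weight.T

print(np.allclose(tf_out, pt_out))
```

Comparing the outputs of each layer on the same input in both frameworks, in the same way, is a straightforward way to find the first layer where the two models diverge.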