yaluguo
(Yaluguo)
August 11, 2017, 12:46pm
1
I have extracted the weights and biases from a pre-trained TensorFlow CNN model and saved them in txt files.
How can I load this data into a model built with nn.Sequential in my PyTorch code, like the one below?
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        # First conv block: 4 input channels -> 32 feature maps
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=4,
                out_channels=32,
                kernel_size=8,
                stride=4,
                padding=2,
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
hma
August 11, 2017, 1:34pm
2
You can access and set the convolution kernel and bias like this:
self.conv1[0].weight.data = pretrained_weight
self.conv1[0].bias.data = pretrained_bias
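As a rough sketch of the whole round trip, the snippet below loads flat text files with NumPy and copies them into the first conv layer. The file names and the flat one-value-per-line format are assumptions for illustration (the thread does not say how the txt files were written), and random data stands in for the real export. It also assumes the weights are already in PyTorch's (out_channels, in_channels, kH, kW) order.

```python
import os
import tempfile

import numpy as np
import torch
import torch.nn as nn

# Same first layer as in the question.
conv = nn.Conv2d(in_channels=4, out_channels=32, kernel_size=8, stride=4, padding=2)

# Stand-in for the exported files: random values written as flat text.
# (Hypothetical file names; real files would come from the TF export.)
tmp_dir = tempfile.mkdtemp()
weight_path = os.path.join(tmp_dir, "conv1_weight.txt")
bias_path = os.path.join(tmp_dir, "conv1_bias.txt")
np.savetxt(weight_path, np.random.randn(32 * 4 * 8 * 8))
np.savetxt(bias_path, np.random.randn(32))

# Load the flat text data and reshape it to the layout PyTorch expects.
# Assumes the values were saved in (out, in, kH, kW) order.
pretrained_weight = torch.from_numpy(
    np.loadtxt(weight_path, dtype=np.float32).reshape(32, 4, 8, 8)
)
pretrained_bias = torch.from_numpy(np.loadtxt(bias_path, dtype=np.float32))

# Copy in place without recording the operation for autograd.
with torch.no_grad():
    conv.weight.copy_(pretrained_weight)
    conv.bias.copy_(pretrained_bias)
```

Copying with copy_ inside torch.no_grad() keeps the existing parameter objects, so an optimizer that already holds references to them keeps working. It also raises an error if the shapes do not match.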
yaluguo
(Yaluguo)
August 12, 2017, 7:36am
4
Thanks for your answers! But I still have some questions.
I did the following:
mycnn = CNN()
print (mycnn.state_dict().keys())
It shows:
['conv1.0.weight', 'conv1.0.bias', 'conv2.0.weight', 'conv2.0.bias', 'conv3.0.weight', 'conv3.0.bias', 'fc1.weight', 'fc1.bias', 'out.weight', 'out.bias']
Then I tried the following:
print (mycnn.conv1[0].bias.data)
print (mycnn.state_dict()['conv1.0.bias'].data)
The outputs are different.
And I check the gradient:
It shows
mycnn.conv1[0].bias.grad = None
mycnn.state_dict()['conv1.0.bias'].grad raises an error:
AttributeError: 'torch.FloatTensor' object has no attribute 'grad'
Can you tell me the difference between "mycnn.conv1[0].bias" and "mycnn.state_dict()['conv1.0.bias']" in my PyTorch model?
hma
August 12, 2017, 9:54am
5
Maybe what is stored in mycnn.state_dict() is just plain PyTorch tensors, not Variables.
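A small check of that distinction, written against recent PyTorch releases, where Variable has been merged into Tensor (since 0.4), so the AttributeError from the 2017 version no longer appears. Here the module attribute is an nn.Parameter tracked by autograd, while the state_dict() entry is a detached tensor that views the same memory. The single conv layer is only a stand-in for the full model in the thread.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=4, out_channels=32, kernel_size=8, stride=4, padding=2)

param = conv.bias                  # nn.Parameter, tracked by autograd
entry = conv.state_dict()["bias"]  # detached tensor from the state dict

# Both objects should hold the same values and share the same storage.
same_values = torch.equal(param.detach(), entry)
same_memory = param.data_ptr() == entry.data_ptr()

print(type(param).__name__, param.requires_grad)
print(type(entry).__name__, entry.requires_grad)
print(same_values, same_memory)
```

Because the entry is detached, gradients never accumulate on it. To inspect .grad, go through the module attribute (or named_parameters()) instead.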
yaluguo
(Yaluguo)
August 13, 2017, 1:44pm
7
I have built exactly the same model in both TF and PyTorch, and I trained it in TF. For some reason, I have to transfer the pretrained weights to PyTorch.
The network is like:
In TF, the Conv2d filter shape is [filter_height, filter_width, in_channels, out_channels], while in PyTorch it is (out_channels, in_channels, kernel_size[0], kernel_size[1]).
So I have done below in TF:
and I transfer to pytorch like:
It turns out that the DQN in PyTorch is not working as well as in TF!
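For reference, the layout difference described above can be sketched as an axis permutation. This is not the code from the post (which is not shown), only a minimal illustration with a random array standing in for the TF kernel. Note that a plain reshape to (32, 4, 8, 8) would give the right shape but scramble the values; the axes have to be transposed.

```python
import numpy as np
import torch
import torch.nn as nn

# Stand-in for a TF conv kernel:
# [filter_height, filter_width, in_channels, out_channels]
tf_kernel = np.random.randn(8, 8, 4, 32).astype(np.float32)

# Reorder the axes to PyTorch's (out_channels, in_channels, kH, kW).
pt_kernel = np.ascontiguousarray(np.transpose(tf_kernel, (3, 2, 0, 1)))

conv = nn.Conv2d(in_channels=4, out_channels=32, kernel_size=8, stride=4, padding=2)
with torch.no_grad():
    conv.weight.copy_(torch.from_numpy(pt_kernel))

# Spot check: the same logical element must match in both layouts.
h, w, c_in, c_out = 3, 5, 2, 7
assert tf_kernel[h, w, c_in, c_out] == conv.weight[c_out, c_in, h, w].item()
```

If the conv output is flattened before a fully connected layer, the flatten order (NHWC in TF versus NCHW in PyTorch) may also need to be matched for those weights. Whether that applies here is not clear from the thread.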