MORI
May 25, 2017, 7:53pm
1
Hi,
I would like to initialize the weights and bias with arbitrary values, not with a uniform distribution or a constant.
class Model(nn.Module):
    def __init__(self, initial_param):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(512, 512, 3)
        self.conv1.weight = initial_param['weight']
        self.conv1.bias = initial_param['bias']

    def forward(self, x):
        return F.relu(self.conv1(x))

model = Model(initial_param)
initial_param['weight'] and initial_param['bias'] are torch.FloatTensor of size 512x512x3x3 and 512 respectively.
I got the following error:
TypeError: cannot assign 'torch.FloatTensor' as child module 'conv1' (torch.nn.Module or None expected)
How can I assign arbitrary values to the parameters?
My goal is to convert a Torch (Lua) model to a PyTorch model (torch.legacy.nn.Sequential to a PyTorch model).
I'm trying to extract the weights and biases from the legacy model and assign them to the PyTorch model, since there is no way to do it automatically.
Thanks.
mbp28
(mbp28)
May 25, 2017, 8:26pm
2
Hey,
I think something like this could work for you:
self.conv1.weight = torch.nn.Parameter(initial_param['weight'])
self.conv1.bias = torch.nn.Parameter(initial_param['bias'])
For example, the following code runs without error:
import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
class Kernel_Emb(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(Kernel_Emb, self).__init__()
        self.linear1 = nn.Linear(D_in, H)
        self.linear2 = nn.Linear(H, D_out)
        self.linear1.weight = torch.nn.Parameter(torch.zeros(D_in, H))
        self.linear1.bias = torch.nn.Parameter(torch.ones(H))

    def forward(self, x):
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred

# Example sizes; the original post did not define these values.
embd_dim, H1, kernel_dim = 10, 20, 5
net = Kernel_Emb(embd_dim, H1, kernel_dim)
print(net.linear1.bias)  # this prints ones
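Applied to the convolution from the original question, the same idea might look like the sketch below. The tensors in `initial_param` are random placeholders standing in for values extracted from the legacy model, and the channel count is reduced from 512 to 8 to keep the example light; both are assumptions. It also shows an alternative to replacing the parameter: copying into the existing one in place under `torch.no_grad()`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder values standing in for weights extracted elsewhere (assumption).
# nn.Conv2d(8, 8, 3) stores its weight as (out_channels, in_channels, kH, kW).
initial_param = {
    'weight': torch.randn(8, 8, 3, 3),
    'bias': torch.randn(8),
}

class Model(nn.Module):
    def __init__(self, initial_param):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(8, 8, 3)
        # Option 1: replace the parameter with a new nn.Parameter.
        self.conv1.weight = nn.Parameter(initial_param['weight'].clone())
        # Option 2: copy the values into the existing parameter in place.
        with torch.no_grad():
            self.conv1.bias.copy_(initial_param['bias'])

    def forward(self, x):
        return F.relu(self.conv1(x))

model = Model(initial_param)
out = model(torch.randn(1, 8, 5, 5))
```

Either option leaves `model.conv1.weight` and `model.conv1.bias` registered as parameters, so they still show up in `model.parameters()` and receive gradients.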
Came
(Came)
March 11, 2019, 11:26pm
4
Based on the discussion here,
I think the shape of the weight matrix in the linear layer should be reversed.
change:
self.linear1.weight = torch.nn.Parameter(torch.zeros(D_in,H))
to
self.linear1.weight = torch.nn.Parameter(torch.zeros(H,D_in))
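A quick way to check this is to inspect the shape of a freshly created layer. The sketch below uses made-up sizes (`D_in = 4`, `H = 3`) and suggests that `nn.Linear(D_in, H)` stores its weight as `(H, D_in)`, so a `(D_in, H)` weight is accepted on assignment but fails once the layer is called.

```python
import torch
import torch.nn as nn

D_in, H = 4, 3  # illustrative sizes, not from the thread

layer = nn.Linear(D_in, H)
print(layer.weight.shape)  # laid out as (out_features, in_features)

x = torch.randn(2, D_in)

# Correct shape: (H, D_in).
layer.weight = nn.Parameter(torch.zeros(H, D_in))
layer.bias = nn.Parameter(torch.ones(H))
y = layer(x)  # zero weight plus bias of ones, so every output is 1

# Reversed shape: the assignment succeeds, but the forward pass raises.
bad = nn.Linear(D_in, H)
bad.weight = nn.Parameter(torch.zeros(D_in, H))
try:
    bad(x)
    forward_failed = False
except RuntimeError:
    forward_failed = True
```

This is why the earlier example ran without error: it only constructed the module and printed the bias, and never called `forward`.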
You are right, the parameters should be in the reverse order.
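For the original goal of moving weights over from a legacy model, another option may be to collect the extracted tensors in a dictionary keyed by parameter name and pass it to `load_state_dict`, which checks both names and shapes. In this sketch the `extracted` dictionary is a hypothetical stand-in for tensors pulled from the legacy model, and the layer sizes are made up.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 4, 3),
    nn.ReLU(),
    nn.Conv2d(4, 2, 3),
)

# Hypothetical stand-in for tensors extracted from the legacy model.
# The keys follow the names reported by model.state_dict().
extracted = {
    '0.weight': torch.randn(4, 3, 3, 3),
    '0.bias': torch.randn(4),
    '2.weight': torch.randn(2, 4, 3, 3),
    '2.bias': torch.randn(2),
}

# With the default strict=True, missing or unexpected keys raise an error,
# and so does a tensor whose shape does not match the target parameter.
model.load_state_dict(extracted)
```

Because the shapes are validated at load time, a reversed weight like the one discussed above would be caught here rather than at the first forward pass.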