Learnable bias layer

srikarym · June 22, 2017, 1:44pm

I’m trying to implement a learnable bias layer that adds a single constant value to every element of the input. A learnable bias layer exists in Lasagne, but I was unable to find such a layer in PyTorch. I initialized the bias as a parameter, but its value is not changing. Is this the right way to do it?

import torch
import torch.nn as nn
from torch.autograd import Variable

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.bias = nn.Parameter(torch.ones(1))
        self.conv1 = nn.Conv2d(1, 1, kernel_size=1, bias=False)

    def forward(self, x):
        x = self.conv1(x)
        bias_matrix = Variable(torch.ones(x.size()) * self.bias.cpu().data[0]).cuda()
        x = x.add(bias_matrix)
        print(self.bias)
        return x

albanD · June 22, 2017, 1:58pm

Since you read the raw tensor out of your bias with self.bias.data, the result is cut off from the autograd graph, so no gradient ever reaches the bias.
If you are using the master branch with automatic broadcasting, you can simply do:

def forward(self, x):
    x = self.conv1(x)
    out = x + self.bias
    return out

If not, you will need to do something like: out = x + self.bias.view(1,1,1,1).expand_as(x)
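
To make the difference concrete, here is a minimal sketch, assuming a recent PyTorch release where plain tensors carry autograd state (so Variable is not needed) and running on the CPU. The names detached, attached, and explicit are only for illustration. It compares the .data approach with broadcasting and with the explicit view/expand_as form:

```python
import torch
import torch.nn as nn

bias = nn.Parameter(torch.ones(1))
x = torch.zeros(1, 1, 2, 2)

# Reading the value through .data gives a tensor with no autograd history,
# so the result does not depend on `bias` as far as autograd is concerned.
detached = x + torch.ones(x.size()) * bias.data[0]
print(detached.requires_grad)

# Adding the parameter directly (broadcasting) keeps it in the graph.
attached = x + bias
print(attached.requires_grad)

# Explicit equivalent for versions without automatic broadcasting.
explicit = x + bias.view(1, 1, 1, 1).expand_as(x)

# Backpropagating through the broadcast sum fills in bias.grad.
attached.sum().backward()
print(bias.grad)
```

Only the last two forms let a gradient flow back to the bias, which is why the parameter never moved in the original forward().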

srikarym · June 22, 2017, 2:06pm

Did that, but the value of bias remains the same. On a different note, this is the way in which I updated the parameters. Please tell me if there is an error.

model = Net().cuda()
criterion = nn.MSELoss().cuda()
optimizer = optim.SGD([{'params':[model.bias], 'lr':0.001}], lr=0.001)
input = Variable(torch.ones((1,1,2,2))).cuda()
target = Variable(torch.zeros((1,1,2,2))).cuda()
loss = criterion(model(input),target)
print(loss)
loss.backward()

albanD · June 22, 2017, 2:09pm

You need to call zero_grad() and step() on the optimizer to perform a step.
See the mnist example here: https://github.com/pytorch/examples/blob/master/mnist/main.py#L82-L86
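
As a rough sketch of what that looks like for the model in this thread (CPU only, recent PyTorch assumed; fixing the conv weight to 0.5 is my own assumption so the run is reproducible, and the five iterations are arbitrary):

```python
import torch
import torch.nn as nn
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.bias = nn.Parameter(torch.ones(1))
        self.conv1 = nn.Conv2d(1, 1, kernel_size=1, bias=False)

    def forward(self, x):
        # Broadcasting keeps self.bias in the autograd graph.
        return self.conv1(x) + self.bias

model = Net()
# Assumption for reproducibility: a fixed conv weight instead of random init.
nn.init.constant_(model.conv1.weight, 0.5)

criterion = nn.MSELoss()
optimizer = optim.SGD([model.bias], lr=0.001)

inp = torch.ones(1, 1, 2, 2)
target = torch.zeros(1, 1, 2, 2)

bias_before = model.bias.item()
for _ in range(5):
    optimizer.zero_grad()                  # clear gradients from the previous step
    loss = criterion(model(inp), target)
    loss.backward()                        # compute gradients
    optimizer.step()                       # apply the update to model.bias
bias_after = model.bias.item()

print(bias_before, bias_after)
```

Without optimizer.step() the gradients are computed but never applied, and without zero_grad() they accumulate across iterations.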

srikarym · June 22, 2017, 2:23pm

Thanks for the quick reply. The bias value is changing now. I defined the optimizer as optim.SGD([{'params': [model.bias], 'lr': 0.001}], lr=0.001), which updates only the bias value on each iteration. To update the weights of the conv layer as well as the bias, is optim.SGD(model.parameters(), lr=0.001) the right way to define the optimizer?

albanD · June 22, 2017, 2:29pm

Yes, this is the right way to do it.
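
If you want to check this yourself, the following sketch (same assumptions as before: recent PyTorch, CPU, and a conv weight fixed to 0.5 purely for reproducibility) lists what model.parameters() covers and confirms that one step changes both the bias and the conv weight:

```python
import torch
import torch.nn as nn
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.bias = nn.Parameter(torch.ones(1))
        self.conv1 = nn.Conv2d(1, 1, kernel_size=1, bias=False)

    def forward(self, x):
        return self.conv1(x) + self.bias

model = Net()
# Assumption for reproducibility: a fixed conv weight instead of random init.
nn.init.constant_(model.conv1.weight, 0.5)

# nn.Parameter attributes are registered automatically, so the custom bias
# shows up next to the conv weight.
names = [name for name, _ in model.named_parameters()]
print(names)

optimizer = optim.SGD(model.parameters(), lr=0.001)
criterion = nn.MSELoss()

weight_before = model.conv1.weight.item()
bias_before = model.bias.item()

optimizer.zero_grad()
loss = criterion(model(torch.ones(1, 1, 2, 2)), torch.zeros(1, 1, 2, 2))
loss.backward()
optimizer.step()

weight_after = model.conv1.weight.item()
bias_after = model.bias.item()
print(weight_before, weight_after, bias_before, bias_after)
```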