I have a network with two branches: the first produces the fully connected vector `fc_1`, and the second produces `fc_2`. The final `fc` is

fc = alpha_1 * fc_1 + alpha_2 * fc_2

where `alpha_i` is a parameter learned during training. For a **batch size of 1**, I used `alpha = Parameter(torch.ones(1))`, as in the code below:

```
import torch
import torch.nn as nn
from torch.nn import Parameter

class fc_combined(nn.Module):
    def __init__(self):
        super(fc_combined, self).__init__()
        self.alpha_list = nn.ParameterList([])
        for i in range(2):
            alpha = Parameter(torch.ones(1))
            self.alpha_list.append(alpha)

    def forward(self, fc_1, fc_2):
        # weight each branch by its learned alpha
        fc = self.alpha_list[0] * fc_1 + self.alpha_list[1] * fc_2
        return fc
```
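For reference, here is a self-contained, runnable sketch of the module above with a forward pass at batch size 1 (the feature dimension of 32 is taken from the shapes mentioned later in the question):

```python
import torch
import torch.nn as nn
from torch.nn import Parameter

class fc_combined(nn.Module):
    def __init__(self):
        super(fc_combined, self).__init__()
        # two learned scalar weights, one per branch
        self.alpha_list = nn.ParameterList(
            [Parameter(torch.ones(1)) for _ in range(2)]
        )

    def forward(self, fc_1, fc_2):
        return self.alpha_list[0] * fc_1 + self.alpha_list[1] * fc_2

model = fc_combined()
fc_1 = torch.randn(1, 32)  # batch size 1, feature dim 32 (assumed)
fc_2 = torch.randn(1, 32)
out = model(fc_1, fc_2)
print(out.shape)  # torch.Size([1, 32])
```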

My question is: how should I change the line `alpha = Parameter(torch.ones(1))` so that it works with a batch size bigger than 1? Since each vector in `fc_i` is weighted by alpha, if `fc_i` is a batch of shape `batch_size x 32`, then I think alpha should also have shape `batch_size x 1`. I have tried

```
alpha = Parameter(torch.ones(batch_size, 1))
```

But it does not work at validation time, because the validation batch size is 1 while the alpha learned during training has shape `batch_size x 1`. Any suggestions are welcome.
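For what it's worth, a quick check of the shapes involved (batch sizes 8 and 1 are just illustrative): a scalar alpha of shape `(1,)` broadcasts over the batch dimension, so the same learned weight is applied to every sample regardless of batch size.

```python
import torch
from torch.nn import Parameter

alpha = Parameter(torch.ones(1))  # single learned scalar

fc_train = torch.randn(8, 32)  # training batch, batch_size = 8
fc_val = torch.randn(1, 32)    # validation batch, batch_size = 1

# broadcasting applies the same alpha to every row of the batch,
# so the output keeps the input's shape in both cases
assert (alpha * fc_train).shape == (8, 32)
assert (alpha * fc_val).shape == (1, 32)
```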