Custom layer and Forward

Hi, I’m new to PyTorch and I have a problem with a custom layer in my model. The layer is a 2D weight matrix, defined as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import FloatTensor

# `opts` (providing `bootstrap_size` and `optimizer`) and `USE_CUDA`
# are globals defined elsewhere in my project.

class MatrixNet(nn.Module):

    @staticmethod
    def _create_weight(x, y, b):
        # Build a block-diagonal (b*x, b*y) weight and a (b, b*y) bias:
        # one independent, randomly initialised (x, y) block per head.
        weight = []
        bias = []
        for i in range(b):
            weight.append([])
            bias.append([])

            zeros_w = FloatTensor(x, y).fill_(0)
            zeros_b = FloatTensor(1, y).fill_(0)
            weight_i = FloatTensor(x, y).normal_()
            bias_i = FloatTensor(1, y).normal_()
            # Fill row i with zero blocks, then put the random block
            # on the diagonal.
            for _ in range(b):
                weight[i].append(zeros_w)
                bias[i].append(zeros_b)

            weight[i][i] = weight_i
            bias[i][i] = bias_i

            weight[i] = torch.cat(weight[i], dim=1)
            bias[i] = torch.cat(bias[i], dim=1)

        weight = torch.cat(weight)
        bias = torch.cat(bias)
        return (nn.Parameter(weight, requires_grad=True),
                nn.Parameter(bias, requires_grad=True))

    def __init__(self, net, lr: float):
        super().__init__()

        self._add_auxiliary_params()
        self.weight, self.bias = MatrixNet._create_weight(net[0], net[-1], opts.bootstrap_size)

        for name, param in self.named_parameters():
            print(name, param)

        if USE_CUDA:
            self.cuda()

        self._optimizer = opts.optimizer(self.parameters(), lr=lr)

    def _add_auxiliary_params(self):
        pass

    def forward(self, x):
        x = torch.mm(x, self.weight) + self.bias
        return x

    def _loss(self, outputs: FloatTensor, targets: FloatTensor):
        return F.mse_loss(outputs, targets)

    def update_parameters(self, inputs: FloatTensor, targets: FloatTensor):
        outputs = self(inputs)
        loss = self._loss(outputs, targets)
        self._optimizer.zero_grad()
        loss.backward()
        self._optimizer.step()

I can call “update_parameters” and get gradients for ONE sample, but I cannot pass a BATCH of samples.

What should I do to make “forward” automatically handle batched input, the way a standard layer like torch.nn.Linear does?

Thanks!

There is a separate torch.bmm function that does batched matrix multiplication; you will probably have to add an extra batch dimension via unsqueeze. Broadcasting should do the rest for you.
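
For example, a minimal sketch of that idea (shapes made up for illustration; note that torch.bmm itself does not broadcast, so the added batch dimension has to be expanded to the batch size):

import torch

batch, in_features, out_features = 8, 12, 6
x = torch.randn(batch, 1, in_features)           # one row vector per sample
weight = torch.randn(in_features, out_features)

# bmm needs both operands to be 3D with the same batch size, so add
# a batch dimension to the weight and expand it (expand makes no copy).
w_batched = weight.unsqueeze(0).expand(batch, -1, -1)
out = torch.bmm(x, w_batched).squeeze(1)         # (batch, out_features)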

Thank you for the reply!

In my case, one operand is a batch (the input) and the other is a single matrix (the network weight). To use torch.bmm, I would need to modify the network weight. However, I found that torch.matmul works for me.
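
For anyone finding this later, a small sketch of the difference (shapes made up): torch.matmul treats the last two dimensions as matrices and broadcasts over the rest, so a plain 2D weight works against batched input without being modified:

import torch

batch, n, in_features, out_features = 8, 4, 12, 6
x = torch.randn(batch, n, in_features)           # batched input
weight = torch.randn(in_features, out_features)  # single 2D weight

# torch.mm(x, weight) would fail here (mm is strictly 2D), but
# matmul broadcasts the weight over the leading batch dimension.
out = torch.matmul(x, weight)                    # (batch, n, out_features)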

Yes, your weights would have to have a batch dimension (of size 1), but if I remember correctly, that’s the way to do it. At least I think the convolutions handle it that way.
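
A tiny illustration of that size-1 batch dimension with torch.matmul (shapes made up again):

import torch

batch, n, in_features, out_features = 8, 4, 12, 6
x = torch.randn(batch, n, in_features)
w = torch.randn(in_features, out_features).unsqueeze(0)  # (1, in, out)

# The size-1 batch dimension of w broadcasts against x's batch dim.
out = torch.matmul(x, w)                                 # (batch, n, out_features)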