Flexible API for layers

I am interested in implementing a more flexible API for the Linear layer, where the only constructor argument is the output feature size. The input feature size is inferred from the shape of the input tensor. I have a minimal implementation:

from torch.nn import Module, Linear

class FlexibleLinear(Module):
    def __init__(self, out_feats):
        super(FlexibleLinear, self).__init__()
        self.out_feats = out_feats
        self.initialized = False
        self.linear = None

    def build(self, x):
        # Create the Linear layer lazily, once the input shape is known.
        if self.initialized:
            return

        in_feats = x.shape[1]  # assumes a 2D (batch, features) input
        out_feats = self.out_feats
        self.linear = Linear(in_feats, out_feats)
        self.initialized = True

    def forward(self, x):
        self.build(x)
        y = self.linear(x)
        return y

I am wondering if there is a better way to do this, and whether it could cause problems in a larger network.

I’m not really convinced by this code. There are two issues:

  • The layer is not allocated properly. The Linear is created on the default (CPU) device inside forward, so if the model and the input live on the GPU, the first call will raise a device-mismatch exception.
  • The layer is not tracked by the nn.Module until the first forward pass. The usual workflow builds the optimizer from model.parameters() before that, so the optimizer never sees the new weights and they are silently never updated (see the sketch after this list).
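
To make the second point concrete, here is a minimal sketch with the FlexibleLinear from above embedded in a larger model (the surrounding layers and sizes are made up for illustration):

import torch
from torch import nn
from torch.optim import SGD

# A FlexibleLinear buried inside a larger model.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    FlexibleLinear(out_feats=4),
)

# The optimizer is built before the first forward pass (the usual workflow),
# so it only sees the parameters of the eagerly built nn.Linear(16, 32).
optimizer = SGD(model.parameters(), lr=0.1)
print(len(list(model.parameters())))   # 2 (weight and bias of the first Linear)

x = torch.randn(8, 16)
y = model(x)                           # FlexibleLinear.linear is only created here

print(len(list(model.parameters())))   # 4 now, but the optimizer still holds only 2,
                                       # so the lazily created layer is never updated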

Although I don’t know the details of the new lazy modules, I would recommend that you check them out before coding this yourself:
https://pytorch.org/docs/stable/_modules/torch/nn/modules/conv.html#LazyConv2d

Thank you for pointing out the issues and for the link.
It seems LazyLinear is a new feature added in PyTorch 1.8:

https://pytorch.org/docs/stable/generated/torch.nn.LazyLinear.html
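
A minimal usage sketch (the layer sizes are arbitrary, just for illustration):

import torch
from torch import nn

# Only the output size is given; in_features is inferred from the first
# input the layer sees, and the parameters are materialized at that point.
model = nn.Sequential(
    nn.LazyLinear(64),
    nn.ReLU(),
    nn.LazyLinear(10),
)

x = torch.randn(8, 20)  # batch of 8, 20 input features
y = model(x)            # first call materializes (20 -> 64) and (64 -> 10)
print(y.shape)          # torch.Size([8, 10])

Note that the optimizer caveat above still applies: per the LazyModuleMixin docs, lazy modules should be initialized with a dry-run forward pass before the optimizer is constructed.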