nn.Linear for one-hot features

After performing Global Average Pooling, I have (N, C=3, 1, 1) (reshaped to (N, C=3, 1)) dimensional features, which I would like to pass to a linear layer. My desired output is of size (N, C, 1) or (N, C). However, I’m unsure of what dimensions to use for out_features.

Dimensions-wise, nn.Linear(1, 1) returns the correct dims, but I’m not sure it makes logical sense. If I’m not mistaken, features of different classes will be sharing the same weight, and this will be treated as a binary problem instead of a 3-class one. How do I perform nn.Linear preserving both the correct output dimensions and the multi-class nature of the problem?


With nn.Linear you’re doing a matrix multiplication: (N, n_in) @ (n_in, n_out) + bias = (N, n_out). Your n_out should be C (=3), unless you want to rescale and shift all channels by the same amounts, which is what nn.Linear(1, 1) does. (n_in=1, n_out=3) makes almost no sense, so this leaves two possibilities: 1) a full 3x3 linear transformation, as done by nn.Linear(3, 3) or nn.Conv1d(3, 3, 1); 2) independent per-channel transformations. nn.Linear is not suitable for the second case, but a simple w*x+b expression with per-channel w and b does it; another, fancier way is an nn.Conv1d(3, 3, 1, groups=3) layer.
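To make the two options concrete, here is a minimal sketch. The batch size N=4, the random input, and the variable names are my own assumptions for illustration; the weight copies are only there to show that the paired layers compute the same function.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, C = 4, 3                      # N=4 is an arbitrary batch size for the demo
x = torch.randn(N, C, 1)         # pooled features, shape (N, C, 1)

# Option 1: full 3x3 transformation that mixes channels.
linear = nn.Linear(C, C)
conv = nn.Conv1d(C, C, kernel_size=1)
with torch.no_grad():
    # Copy the Linear parameters into the conv so both compute the same map.
    conv.weight.copy_(linear.weight.unsqueeze(-1))   # (C, C) -> (C, C, 1)
    conv.bias.copy_(linear.bias)
out_linear = linear(x.squeeze(-1))   # (N, C)
out_conv = conv(x)                   # (N, C, 1)

# Option 2: independent scale and shift for each channel.
w = nn.Parameter(torch.randn(C))
b = nn.Parameter(torch.randn(C))
out_affine = w.view(1, C, 1) * x + b.view(1, C, 1)   # (N, C, 1)

grouped = nn.Conv1d(C, C, kernel_size=1, groups=C)
with torch.no_grad():
    # With groups=C the conv weight has shape (C, 1, 1): one scalar per channel.
    grouped.weight.copy_(w.view(C, 1, 1))
    grouped.bias.copy_(b)
out_grouped = grouped(x)             # (N, C, 1)
```

Option 1 has C*C + C = 12 learnable parameters and lets every output channel see every input channel; option 2 has 2*C = 6 and keeps the channels separate.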

About shapes: for nn.Linear you should reshape the input to (N, C) (or permute NCL -> NLC for L > 1). Transformations with nn.ConvXd work without a reshape.
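A short sketch of the shape handling, assuming N=4 and L=5 purely for illustration:

```python
import torch
import torch.nn as nn

N, C, L = 4, 3, 5                # N and L are arbitrary demo sizes
linear = nn.Linear(C, C)

# L == 1: flatten (N, C, 1) to (N, C) so channels are the last dimension.
pooled = torch.randn(N, C, 1)
out = linear(pooled.flatten(1))                              # (N, C)

# L > 1: nn.Linear acts on the last dimension, so permute NCL -> NLC,
# apply the layer, then permute back to NCL.
seq = torch.randn(N, C, L)
out_seq = linear(seq.permute(0, 2, 1)).permute(0, 2, 1)      # (N, C, L)

# nn.Conv1d consumes (N, C, L) directly, no reshape needed.
conv = nn.Conv1d(C, C, kernel_size=1)
out_conv = conv(seq)                                         # (N, C, L)
```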
