Conv1D Input Shape and Channels

hachiro · April 13, 2021, 10:38am

I have an input tensor of shape [8, 500, 502], where 8 is the batch size, 500 is the length of a bag (I'm using multiple instance learning), and 502 is my window size. One bag represents the concatenation of 2 histograms.

I want to use a Conv1d encoder-decoder (autoencoder) as a feature extractor.

Should I transpose my input with x = x.transpose(2, 1).contiguous(), or use something like x = x.view(8*500, 1, 502)? I am a bit confused by the concept of channels in 1D convolution.

Thanks in advance

ptrblck · April 14, 2021, 5:19am

Each kernel in a “standard” (i.e. non-grouped) convolution will stride through the spatial dimensions and will use all input channels. Whether to reshape, and how, depends on your actual use case and the logic your model should apply to the input. The CS231n - Conv notes explain the underlying operations of conv layers in more detail.
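
To make the channel question concrete, here is a minimal sketch of how nn.Conv1d would interpret the three candidate layouts. nn.Conv1d expects input of shape (N, C_in, L) and slides its kernel along L while each kernel spans all C_in channels. The out_channels=16, kernel_size=3 and padding=1 values are arbitrary assumptions for illustration, not something taken from the question, and which layout is right still depends on what the model should learn.

```python
import torch
import torch.nn as nn

batch, bag_len, window = 8, 500, 502
x = torch.randn(batch, bag_len, window)  # [8, 500, 502] as in the question

# Layer hyperparameters below are illustrative assumptions.
out_ch, k, pad = 16, 3, 1

# Option A: no reshape. The 500 bag entries act as channels and the kernel
# slides along the 502-long window; every kernel mixes all 500 instances.
conv_a = nn.Conv1d(in_channels=bag_len, out_channels=out_ch, kernel_size=k, padding=pad)
out_a = conv_a(x)  # [8, 16, 502]

# Option B: transpose. The 502 window positions act as channels and the kernel
# slides along the 500 bag entries.
x_t = x.transpose(2, 1).contiguous()  # [8, 502, 500]
conv_b = nn.Conv1d(in_channels=window, out_channels=out_ch, kernel_size=k, padding=pad)
out_b = conv_b(x_t)  # [8, 16, 500]

# Option C: fold the bag into the batch. Each instance becomes an independent
# single-channel signal of length 502, so instances are never mixed by the conv.
x_v = x.view(batch * bag_len, 1, window)  # [4000, 1, 502]
conv_c = nn.Conv1d(in_channels=1, out_channels=out_ch, kernel_size=k, padding=pad)
out_c = conv_c(x_v)  # [4000, 16, 502]

# Restore the bag dimension so per-instance features can be pooled per bag.
feats_c = out_c.view(batch, bag_len, out_ch, window)  # [8, 500, 16, 502]

# The weight shape is (out_channels, in_channels, kernel_size), which shows
# that each kernel covers all input channels.
print(conv_a.weight.shape, conv_b.weight.shape, conv_c.weight.shape)
```

If every instance in the bag should be encoded on its own, as is common in multiple instance learning, the view into [8*500, 1, 502] is the layout that keeps instances separate. The transpose instead treats the 500 bag entries as the axis being convolved over.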