I’m trying to reimplement the ResNet from this paper for use in time series classification. Since it’s my first time working with convolutional layers, I’m a bit confused about how to arrange the input tensor for the convolution.
The original implementation in Keras uses 2D layers. I'm given to understand that the convention in PyTorch for a 2D convolution is that the input tensor has shape [batch_size, channels, height, width]. For image input this makes sense to me, but I'm not sure how multivariate time series data maps onto this schema; in my mind, a 1D layer would make much more sense. Could somebody please point out how [sample_axis, time_axis, feature_axis] maps onto this schema, and why the paper uses a 2D layer?
Am I correct in assuming that in my case channels would be 1? Would that be equivalent to using Conv1d, or am I overlooking something?
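To make the question concrete, here is a minimal sketch of the two arrangements I'm comparing. The shapes and layer parameters (8 samples, 128 time steps, 6 features, kernel sizes) are just placeholders I picked for illustration, not from the paper:

```python
import torch
import torch.nn as nn

# Placeholder multivariate series: [sample_axis, time_axis, feature_axis]
x = torch.randn(8, 128, 6)

# Option A: Conv1d expects [batch, channels, length],
# so treat each feature as a channel and time as the length.
x1d = x.permute(0, 2, 1)  # -> [8, 6, 128]
conv1d = nn.Conv1d(in_channels=6, out_channels=16, kernel_size=7, padding=3)
print(conv1d(x1d).shape)  # torch.Size([8, 16, 128])

# Option B: Conv2d with a single channel, [batch, 1, height, width],
# e.g. height = features, width = time, with a kernel spanning all features.
x2d = x.permute(0, 2, 1).unsqueeze(1)  # -> [8, 1, 6, 128]
conv2d = nn.Conv2d(in_channels=1, out_channels=16,
                   kernel_size=(6, 7), padding=(0, 3))
print(conv2d(x2d).shape)  # torch.Size([8, 16, 1, 128])
```

Is option B (channels = 1) what the 2D layers in the paper amount to, and is it effectively the same computation as option A?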