I’m trying to reimplement the ResNet from this paper for use in time series classification. Since it’s my first time working with convolutional layers, I’m a bit confused about how to arrange the input tensor for the convolution.

The original implementation in Keras uses 2d layers. I'm given to understand that the convention in torch for a 2d layer is that the tensor should look like `[batch_size, channels, height, width]`. For image input this makes sense to me, but I'm not sure how multivariate time series data maps to this schema; in my mind, a 1d layer would make much more sense. Could somebody please point out how `[sample_axis, time_axis, feature_axis]` could be mapped to this schema, and why the paper uses a 2d layer?

Am I correct in assuming that in my case, `channels` would be 1? Is this the same as using Conv1d, or am I overlooking something?
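To make the question concrete, here is a minimal sketch of the two layouts I have in mind (the shapes, kernel sizes, and channel counts are placeholders, not the paper's actual values):

```python
import torch
import torch.nn as nn

batch_size, time_steps, n_features = 8, 128, 6

# My data: [sample_axis, time_axis, feature_axis]
x = torch.randn(batch_size, time_steps, n_features)

# Option A: treat the series as a one-channel "image" for Conv2d,
# i.e. [batch, channels=1, height=time, width=features]
x_2d = x.unsqueeze(1)
conv2d = nn.Conv2d(1, 64, kernel_size=(8, 1), padding="same")
out_2d = conv2d(x_2d)  # -> [8, 64, 128, 6]

# Option B: treat the features as channels for Conv1d,
# i.e. [batch, channels=features, length=time]
x_1d = x.permute(0, 2, 1)
conv1d = nn.Conv1d(n_features, 64, kernel_size=8, padding="same")
out_1d = conv1d(x_1d)  # -> [8, 64, 128]
```

Is option A (with `channels = 1`) what the paper is doing, and if so, is it effectively equivalent to option B?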