Convolution on (C, H) and (C, 1, W) for EEG data

A paper I am reading proposes an architecture for classifying EEG data, shown in the diagram below (note: this diagram is sufficient for implementation; no other details in the paper are relevant).

When I tried to implement this architecture, I ran into two issues:

  • In Conv-Pool Block 1, the second convolution seems to operate on (C, H) instead of (H, W), and I do not know how to do this with torch.nn.Conv2d().
  • In Conv-Pool Blocks 2-4, the convolution always seems to operate on (C, 1, W), and after every convolution I have to permute the output to make the next one possible, which seems a little odd to me. For example, in Conv-Pool Block 2, to convolve a (batchSize, 25, 1, 171) tensor with a (25, 10) filter, I need to permute it to (batchSize, 1, 25, 171), which then gives (batchSize, 50, 1, 162).
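
Concretely, the permute step I mean looks like this (a minimal sketch; the batch size of 8 is arbitrary):

```python
import torch
import torch.nn as nn

# Block 2 shapes as described above.
x = torch.randn(8, 25, 1, 171)    # output of Block 1: (batchSize, 25, 1, 171)
x = x.permute(0, 2, 1, 3)         # -> (batchSize, 1, 25, 171)

conv2 = nn.Conv2d(in_channels=1, out_channels=50, kernel_size=(25, 10))
y = conv2(x)
print(tuple(y.shape))             # (8, 50, 1, 162)
```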

Did I understand this diagram correctly? Any input will be appreciated.

Why not construct the whole algorithm with Conv1d?

From the diagram and the information given, it looks like the input is 3D (B x C x W) rather than 4D.
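
For instance, the Block 2 convolution you describe could be written without any permute (a sketch using the shapes from your post; the (25, 10) filter becomes in_channels=25 with kernel_size=10):

```python
import torch
import torch.nn as nn

# 3D view of the same data: (B, C=25, W=171), with no singleton height dim.
x = torch.randn(8, 25, 171)

# A 1-D conv with 25 input channels sweeps a (25, 10) window along W,
# which matches the 2-D (25, 10) filter after your permute.
conv = nn.Conv1d(in_channels=25, out_channels=50, kernel_size=10)
y = conv(x)
print(tuple(y.shape))    # (8, 50, 162)
```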

Sorry, I do not quite understand how I could use Conv1d(), since except for Conv-Pool Block 1, the convolutions in the other layers have 2-D filter sizes like (25, 10) rather than a scalar.

Additionally, the paper views the EEG signals from 44 electrodes as an "image" with 1 channel (that is, a dataset of shape (batchSize, 1, 44, 534)), so I think using Conv2d() makes more sense.
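
One way I can imagine Block 1 working with Conv2d() (a sketch; the kernel sizes are my guesses from the diagram): the second convolution uses a (44, 1) kernel that spans the whole electrode axis, and the 25 feature maps from the first convolution are mixed automatically through the channel dimension, so no permute is needed here.

```python
import torch
import torch.nn as nn

# EEG trial as a 1-channel "image": (batchSize, 1, 44 electrodes, 534 samples).
x = torch.randn(8, 1, 44, 534)

temporal = nn.Conv2d(1, 25, kernel_size=(1, 10))   # convolve along time only
spatial = nn.Conv2d(25, 25, kernel_size=(44, 1))   # spans C (via channels) and H

h = temporal(x)          # (8, 25, 44, 525)
h = spatial(h)           # (8, 25, 1, 525) -- kernel covers all 44 rows
print(tuple(h.shape))
```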
