How to implement this model architecture in PyTorch?
I could not view the paper, so I am not sure exactly what they are doing, but it seems they take the Fourier transform of the input. I suppose you should compute the Fourier transform right after the input layer (see torch.fft), then apply the convolution and max pooling operations, and finally take the inverse Fourier transform to get back to the spatial domain.
But again, I am not sure about that because I could not read the paper.
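To make the idea concrete, here is a minimal sketch of that pipeline (FFT, then convolution and max pooling, then inverse FFT). This is only my guess at the structure, not the paper's actual architecture. The class name `SpectralConvBlock`, the channel counts, the kernel size, and the choice to stack the real and imaginary parts as separate channels (so that an ordinary real-valued `nn.Conv2d` can be used) are all assumptions of mine.

```python
import torch
import torch.nn as nn


class SpectralConvBlock(nn.Module):
    """Hypothetical block: FFT -> conv -> max pool -> inverse FFT.

    Not taken from the paper; layer sizes are illustrative assumptions.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # Real and imaginary parts are stacked as channels, hence the factor 2.
        self.conv = nn.Conv2d(2 * in_channels, 2 * out_channels,
                              kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # 2D FFT over the last two (spatial) dimensions; the result is complex.
        freq = torch.fft.fft2(x)
        # Represent the complex tensor as a real one: (B, 2*C, H, W).
        feats = torch.cat([freq.real, freq.imag], dim=1)
        # Convolution and max pooling applied in the frequency domain.
        feats = self.pool(self.conv(feats))
        # Split the channels back into real and imaginary halves.
        real, imag = torch.chunk(feats, 2, dim=1)
        freq_out = torch.complex(real.contiguous(), imag.contiguous())
        # Inverse FFT back to the spatial domain; keep only the real part.
        return torch.fft.ifft2(freq_out).real


block = SpectralConvBlock(in_channels=3, out_channels=8)
x = torch.randn(2, 3, 16, 16)  # batch of 2 RGB images, 16x16
y = block(x)
print(y.shape)
```

Keeping only the real part after the inverse transform discards information, and pooling in the frequency domain is not equivalent to pooling in the spatial domain, so check both choices against the paper once you can read it.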