I am writing a custom operation that makes heavy use of torch.nn.functional.conv1d.
I have two questions.
1. It seems that torch.nn.functional.conv1d is very slow. I would expect it to be implemented via the fast Fourier transform and therefore to be fast, but convolving two vectors of the same length (I pad one of them periodically before feeding it into the function) turns out to be slower than multiplying the vector by an equivalent matrix. Is there a way to speed this up?
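For reference, this is roughly the comparison I am timing (a minimal sketch; the real kernel and shapes come from my operation):

```python
import torch
import torch.nn.functional as F

n = 1024
signal = torch.randn(1, 1, n)  # (batch, channels, length)
kernel = torch.randn(1, 1, n)  # filter of the same length n

# pad the signal periodically so conv1d computes a circular (cross-)correlation
padded = torch.cat([signal, signal[..., :n - 1]], dim=-1)
out_conv = F.conv1d(padded, kernel)  # output length n

# the same result as a circulant-matrix / vector product,
# which is faster for me than the conv1d call above
C = torch.stack([kernel[0, 0].roll(i) for i in range(n)])
out_mm = C @ signal[0, 0]
```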
2. It seems that torch.nn.functional.conv1d does not support the GPU; I get the following error message:
"Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'"
Also, torch.nn.functional.conv1d does not seem to support float64, which makes gradient checking (e.g. with torch.autograd.gradcheck) difficult.
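For context, this is roughly how I call it (a sketch; my real code builds the tensors differently):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 1024).cuda()
w = torch.randn(1, 1, 1024).cuda()
out = F.conv1d(x, w)  # this is where the error above is raised for me

# and the float64 variant I would need for gradient checking:
x64 = torch.randn(1, 1, 1024, dtype=torch.float64, requires_grad=True)
w64 = torch.randn(1, 1, 1024, dtype=torch.float64)
torch.autograd.gradcheck(lambda t: F.conv1d(t, w64), (x64,))
```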
Is there a way to get around these issues, at least to get my operation running on the GPU?
Any suggestions are appreciated.