In the past I used Theano, where I could select the convolution strategy (see: Theano convolution strategies). When I reimplemented my model in PyTorch, the memory usage was significantly higher, which suggests that a different convolution strategy is being used.
In PyTorch, I can find the algorithm definitions in this GitHub line.
My question is two-fold:
- Am I correct in deducing that `IMPLICIT_PRECOMP_GEMM` is the default strategy in PyTorch?
- Can I select the strategy manually (preferably per convolution), or can this only be done through the benchmarking function?
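For reference, the only knob I have found so far is the global cuDNN autotuner flag, which benchmarks the available convolution algorithms per input shape and caches the fastest one; this is what I mean by "the benchmarking function" above (a minimal sketch, not a per-convolution selection):

```python
import torch

# Global flag: when True, cuDNN benchmarks all available convolution
# algorithms for each new input shape and picks the fastest one.
# This is a process-wide setting, not per-convolution.
torch.backends.cudnn.benchmark = True

# For reproducibility one can instead force deterministic algorithms,
# which also restricts the algorithm choice (again globally):
torch.backends.cudnn.deterministic = False
```

As far as I can tell, these flags trade speed/determinism globally but do not expose a way to pin a specific algorithm (such as `IMPLICIT_PRECOMP_GEMM`) on an individual `nn.Conv2d` layer.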