Hello, I would like to know the proper practice for using a different convolution operator when performing inference with pre-trained models from TorchVision, such as those found at https://pytorch.org/vision/main/models.html. These models include one or more convolutional layers, but during inference all convolutions are executed by PyTorch’s default implementation (on GPU, typically via cuDNN).
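For concreteness, this is the kind of inference setup I mean (a minimal sketch; resnet18 is just an example, the same applies to any model in the zoo):

```python
import torch
from torchvision.models import resnet18, ResNet18_Weights

# Load a pre-trained model from the TorchVision zoo and run inference.
model = resnet18(weights=ResNet18_Weights.DEFAULT).eval().cuda()

x = torch.randn(1, 3, 224, 224, device="cuda")
with torch.no_grad():
    # Every nn.Conv2d here dispatches to PyTorch's default backend
    # (cuDNN on CUDA); I would like these calls to use my operator instead.
    y = model(x)
```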
To change this behavior and use my custom convolution operator, or one from an external library such as CUTLASS (https://github.com/NVIDIA/cutlass/blob/main/examples/python/02_pytorch_extension_grouped_gemm.ipynb), I would like to know whether:
- It is better to modify PyTorch’s source code, specifically the implementation behind torch.nn.Conv2d, to import and use CUTLASS or my custom convolution operator. For this, I assume it would be necessary to recompile PyTorch or TorchVision from source.
- It would be better to modify the model definitions in the TorchVision repository (https://github.com/pytorch/vision/tree/main/torchvision/models) to use my custom convolution operator directly in their implementation (a rough sketch of what I have in mind follows this list).
- Using custom operators, as described in https://pytorch.org/tutorials/advanced/cpp_extension.html, would be a better approach.
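To make the second option concrete, this is roughly the replacement I am picturing. Here `my_custom_conv2d` is a hypothetical placeholder for my own kernel or a CUTLASS-backed extension; in this sketch it simply falls back to `F.conv2d` so the code runs:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def my_custom_conv2d(x, weight, bias, stride, padding, dilation, groups):
    # Hypothetical placeholder: in practice this would call my custom
    # kernel or a CUTLASS-backed extension. It falls back to the stock
    # implementation here so the sketch is runnable.
    return F.conv2d(x, weight, bias, stride, padding, dilation, groups)

class CustomConv2d(nn.Conv2d):
    """Drop-in nn.Conv2d subclass that routes forward() to the custom kernel."""
    def forward(self, x):
        return my_custom_conv2d(x, self.weight, self.bias, self.stride,
                                self.padding, self.dilation, self.groups)
```

The question is essentially where such a class (or the operator behind it) should live: inside PyTorch itself, inside the TorchVision model files, or registered as a custom operator as in the tutorial above.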
I would like to know the best practice, or whether there is another, more feasible or simpler approach that minimizes changes to PyTorch’s source code or is more user-friendly. Thank you.