CUTLASS custom kernel integration in PyTorch

Hello, I would like to know whether, in addition to importing a custom kernel as detailed in this example, it is possible to import it into PyTorch and reuse it not for a single operation but for inference on a full neural network model that uses those operations. This would be similar to what is described here: pre-trained PyTorch models such as ResNet50, which contains some 50 convolutions among other operations.

I don’t understand this part of the question. Could you describe how the CUTLASS kernel should be used for “inference” but not for a specific operation (matmul)?

Sorry @ptrblck I didn’t explain myself well. Is it possible to use that custom convolution kernel from CUTLASS with a neural network model like ResNet50, which contains multiple convolution layers?

In the examples I’ve seen, such as https://pytorch.org/tutorials/advanced/cpp_extension.html, you can define new ops or import them using C++ kernels. However, it seems that they can only be used as single operations, for instance:

import torch
# LLTM is the nn.Module from the tutorial that wraps the compiled C++ extension

batch_size = 16
input_features = 32
state_size = 128

X = torch.randn(batch_size, input_features)
h = torch.randn(batch_size, state_size)
C = torch.randn(batch_size, state_size)

rnn = LLTM(input_features, state_size)
new_h, new_C = rnn(X, (h, C))

However, if I want all the convolution layers inside a pre-trained PyTorch model like ResNet50 to use my imported CUTLASS convolution kernel, I’m not sure how to do that or whether it’s even possible.
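One common pattern for this is module surgery: walk the model and swap every nn.Conv2d for a wrapper module that keeps the pre-trained weights but dispatches to the custom kernel in forward. Below is a minimal sketch; the CUTLASS extension call (`cutlass_ext.conv2d_forward`) is hypothetical, so the wrapper falls back to `F.conv2d` to stay runnable, and a tiny Sequential model stands in for ResNet50 (the same function works on `torchvision.models.resnet50()`).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CutlassConv2d(nn.Module):
    """Wraps an existing nn.Conv2d and would dispatch to a CUTLASS kernel.

    `cutlass_ext.conv2d_forward` is a hypothetical compiled extension;
    this sketch falls back to F.conv2d so it runs as-is.
    """
    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        # Reuse the pre-trained parameters so accuracy is preserved.
        self.weight = conv.weight
        self.bias = conv.bias
        self.stride = conv.stride
        self.padding = conv.padding
        self.dilation = conv.dilation
        self.groups = conv.groups

    def forward(self, x):
        # Replace this call with the custom extension, e.g.:
        #   return cutlass_ext.conv2d_forward(x, self.weight, ...)
        return F.conv2d(x, self.weight, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

def swap_convs(module: nn.Module) -> None:
    """Recursively replace every nn.Conv2d in `module` with the wrapper."""
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d):
            setattr(module, name, CutlassConv2d(child))
        else:
            swap_convs(child)

# Tiny stand-in model; call swap_convs(torchvision.models.resnet50()) the same way.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(8, 4, 3, padding=1))
swap_convs(model)
print(sum(isinstance(m, CutlassConv2d) for m in model.modules()))  # 2
out = model(torch.randn(1, 3, 16, 16))
print(out.shape)  # torch.Size([1, 4, 16, 16])
```

Because the wrapper keeps references to the original `weight` and `bias` tensors, the pre-trained state is untouched; only the forward implementation changes.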

Therefore, I have two questions: first, how do I import a custom CUTLASS kernel into PyTorch, and second, how do I use or link it so that pre-trained models run their operators (convolutions, in my case) through it instead of through PyTorch’s default convolution?
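For the second question, an alternative to editing the model at all is to intercept the convolution calls globally with `torch.overrides.TorchFunctionMode`: while the mode is active, every `F.conv2d` call made by any module routes through your handler. The sketch below only counts the intercepted calls and falls through to the default kernel; `cutlass_ext.conv2d_forward` is a hypothetical extension, and a single Conv2d stands in for a full pre-trained model.

```python
import torch
from torch.overrides import TorchFunctionMode

class CutlassConvMode(TorchFunctionMode):
    """Intercepts every conv2d call while active, so a pre-trained model
    can run unchanged. The CUTLASS call is hypothetical; this sketch
    counts interceptions and falls through to PyTorch's default kernel."""
    def __init__(self):
        super().__init__()
        self.calls = 0

    def __torch_function__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func in (torch.conv2d, torch.nn.functional.conv2d):
            self.calls += 1
            # Here you would dispatch to the compiled extension, e.g.:
            #   return cutlass_ext.conv2d_forward(*args)
        return func(*args, **kwargs)

conv = torch.nn.Conv2d(3, 4, kernel_size=3)   # stands in for a full model
x = torch.randn(1, 3, 8, 8)
mode = CutlassConvMode()
with mode:
    y = conv(x)
print(mode.calls)   # 1: the layer's conv2d call was intercepted
print(y.shape)      # torch.Size([1, 4, 6, 6])
```

This approach leaves the model object untouched, which can be convenient when you want to benchmark the same pre-trained network with and without the custom kernel.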