Hi there, I'm working with a Python model called TransNetV2 (GitHub - soCzech/TransNetV2: TransNet V2: Shot Boundary Detection Neural Network).
From what I understand, it should be possible to trace this model on my CPU and then later load the traced model and run inference on it in C++ using CUDA, but I'm having a lot of trouble with this.
I've been able to use the traced model completely fine in C++ on my CPU, but if I attempt to use my GPU I get this runtime error:
RuntimeError: Input type (CPUFloatType) and weight type (CUDAFloatType) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
The thing is, if I trace the model on CUDA I don't have this problem; I can use it in C++ without issue.
Am I doing something wrong when I trace? The original model was written in TensorFlow, but the repo includes a PyTorch implementation that I'm using (transnetv2_pytorch.py in the GitHub repo).
This is how I'm doing my tracing:
import torch
from transnetv2_pytorch import TransNetV2

# Initialize and load the model
model = TransNetV2()
state_dict = torch.load("inference-pytorch/transnetv2-pytorch-weights.pth")
model.load_state_dict(state_dict)
model.eval()

# Convert to TorchScript via tracing
# Example input: a batch of 100 frames, 27x48, RGB, uint8
example_input = torch.zeros(1, 100, 27, 48, 3, dtype=torch.uint8)
traced_script_module = torch.jit.trace(model, example_input)
traced_script_module.save("transnetv2_traced.pt")
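In case it helps frame the question: my understanding is that a CPU-traced module can be moved to the GPU at load time via `map_location`, as long as the input tensor is also moved to the same device as the weights. A minimal sketch of what I mean, using a toy stand-in module (`TinyModel` is hypothetical, just to illustrate the device handling, since I can't paste all of TransNetV2 here):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for TransNetV2, only to illustrate device handling.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

# Trace on CPU, exactly as in my snippet above.
model = TinyModel().eval()
example = torch.zeros(1, 3, 27, 48)
traced = torch.jit.trace(model, example)
traced.save("tiny_traced.pt")

# Load the CPU-traced module onto the target device, and move the
# input tensor to that SAME device before calling forward.
device = "cuda" if torch.cuda.is_available() else "cpu"
loaded = torch.jit.load("tiny_traced.pt", map_location=device)
out = loaded(example.to(device))
print(tuple(out.shape))  # (1, 8, 27, 48)
```

My assumption is the C++ side needs the equivalent two steps (load to the device, then move the input tensor to it), which is why I'm wondering whether the mismatch is on my end rather than in the trace itself.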