Hi there, I'm working with a Python model called TransNetV2 (GitHub - soCzech/TransNetV2: TransNet V2: Shot Boundary Detection Neural Network).
From what I understand, it should be possible to trace this model on my CPU and then later load the traced model and run inference on it in C++ using CUDA, but I'm having a lot of trouble with this.
I've been able to use the traced model completely fine in C++ on my CPU, but if I attempt to use my GPU I get this runtime error:
RuntimeError: Input type (CPUFloatType) and weight type (CUDAFloatType) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
The thing is, if I trace the model on CUDA I don't have this problem; I can use it in C++ without issue.
Am I doing something wrong when I trace? The original model was written in TensorFlow, but the repo includes a PyTorch implementation that I'm using (transnetv2_pytorch.py in the GitHub repo).
This is how I'm doing my tracing:
import torch
from transnetv2_pytorch import TransNetV2

# Initialize and load the model
model = TransNetV2()
state_dict = torch.load("inference-pytorch/transnetv2-pytorch-weights.pth")
model.load_state_dict(state_dict)
model.eval()

# Convert to TorchScript via tracing
# Example input: a batch of 100 frames, 27x48, RGB, uint8
example_input = torch.zeros(1, 100, 27, 48, 3, dtype=torch.uint8)
traced_script_module = torch.jit.trace(model, example_input)
traced_script_module.save("transnetv2_traced.pt")
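In case it helps frame the question: my understanding is that a CPU-traced module can be moved to the GPU at load time via `map_location`, as long as the input tensor is also moved to the same device as the weights. A minimal sketch of what I mean, using a toy stand-in module (`TinyModel` is hypothetical, just to illustrate the device handling, since I can't paste all of TransNetV2 here):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for TransNetV2, only to illustrate device handling.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

# Trace on CPU, exactly as in my snippet above.
model = TinyModel().eval()
example = torch.zeros(1, 3, 27, 48)
traced = torch.jit.trace(model, example)
traced.save("tiny_traced.pt")

# Load the CPU-traced module onto the target device, and move the
# input tensor to that SAME device before calling forward.
device = "cuda" if torch.cuda.is_available() else "cpu"
loaded = torch.jit.load("tiny_traced.pt", map_location=device)
out = loaded(example.to(device))
print(tuple(out.shape))  # (1, 8, 27, 48)
```

My assumption is the C++ side needs the equivalent two steps (load to the device, then move the input tensor to it), which is why I'm wondering whether the mismatch is on my end rather than in the trace itself.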