Fluctuating inference time using LibTorch

Hi, I’ve been working on a program of my own that supports inference through either TensorRT or LibTorch. With TensorRT the whole pipeline works well and the inference time stays smooth, but when I run inference with a .pt model through LibTorch, the inference time fluctuates noticeably.

The fluctuation happens ONLY when the input images come from a USB camera. If I instead feed frames from a video clip, the inference time with the .pt model in LibTorch looks just as smooth as with TensorRT.
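
For reference, the capture side looks roughly like this (a simplified sketch, not my exact code; cv::VideoCapture(0) stands in for the USB camera, "clip.mp4" is a placeholder for the video file, and the timing is only illustrative):

#include <chrono>
#include <iostream>
#include <opencv2/opencv.hpp>

cv::Mat predict(cv::Mat& image);  // LibTorch path, shown further down

void run_capture_loop(bool use_camera)
{
    cv::VideoCapture cap;
    if (use_camera)
        cap.open(0);            // USB camera
    else
        cap.open("clip.mp4");   // placeholder video file

    cv::Mat frame;
    while (cap.read(frame))
    {
        auto t0 = std::chrono::steady_clock::now();
        cv::Mat result = predict(frame);
        auto t1 = std::chrono::steady_clock::now();

        std::cout << "inference: "
                  << std::chrono::duration<double, std::milli>(t1 - t0).count()
                  << " ms" << std::endl;
    }
}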

I’m not sure what I’m doing wrong, and at the moment I have no idea how to improve this.
Below are the code snippets I implemented for the LibTorch inference path.

cv::Mat predict(cv::Mat& image)
{
    if (m_libtorch_model_loaded)
    {
        // Run the forward pass without building the autograd graph.
        torch::NoGradGuard no_grad;
        preprocess_libtorch(image);
        m_output_tensor = m_libtorch_model.forward({m_input_tensor}).toTensor();
        // Block until the GPU has finished so the measured time covers the whole forward pass.
        torch::cuda::synchronize();
    }

    postprocess();
    return m_colormap_image;
}
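
For completeness, the model is loaded once at startup, roughly like this (a simplified sketch with an assumed path and warm-up count, not my exact loading code):

#include <torch/script.h>

void load_libtorch_model(const std::string& path)  // e.g. "model.pt"
{
    m_libtorch_model = torch::jit::load(path);
    m_libtorch_model.to(torch::kCUDA);
    m_libtorch_model.eval();

    // A few warm-up passes so the first timed frames are not dominated by
    // CUDA context creation / kernel selection overhead.
    torch::NoGradGuard no_grad;
    auto dummy = torch::zeros({1, 3, m_input_h, m_input_w}, torch::kCUDA);
    for (int i = 0; i < 5; ++i)
    {
        m_libtorch_model.forward({dummy});
    }
    torch::cuda::synchronize();

    m_libtorch_model_loaded = true;
}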
void preprocess_libtorch(cv::Mat& image)
{
    // Resize to the network input size; m_input_image must stay alive because
    // from_blob() wraps its buffer without copying.
    cv::resize(image, m_input_image, cv::Size(m_input_w, m_input_h));
    m_input_tensor = torch::from_blob(m_input_image.data,
                                      {1, m_input_image.rows, m_input_image.cols, 3},
                                      torch::kByte);

    m_input_tensor = m_input_tensor.permute({0, 3, 1, 2});  // NHWC -> NCHW
    // Dividing the byte tensor by the float scalar promotes it to float,
    // then mean/std normalization runs on the CPU.
    m_input_tensor = (m_input_tensor / m_scalar_255 - m_mean_tensor) / m_std_tensor;

    if (torch::cuda::is_available())
    {
        // Copy the preprocessed tensor to the GPU.
        m_input_tensor = m_input_tensor.to(torch::kCUDA);
    }
}
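
The normalization members are plain CPU tensors shaped to broadcast over the NCHW input; they are initialized roughly like this (the ImageNet mean/std values here are only an example, not necessarily the ones my model uses):

// Illustrative values; the real ones depend on how the model was trained.
m_scalar_255  = torch::tensor(255.0f);
m_mean_tensor = torch::tensor({0.485f, 0.456f, 0.406f}).view({1, 3, 1, 1});
m_std_tensor  = torch::tensor({0.229f, 0.224f, 0.225f}).view({1, 3, 1, 1});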