Access Violation (c10_cuda.dll)

Hi, I am new to the C++ frontend. I am trying to implement the DDPG algorithm using the C++ frontend (Release build) and MuJoCo. The code is a bit complex, but in brief I am doing something like this:

#include <iostream>
#include <vector>
#include <torch/torch.h>

using namespace std;

int main()
{
    // Use the GPU when one is available, otherwise stay on the CPU.
    torch::Device device(torch::kCPU);
    if (torch::cuda::is_available())
    {
        device = torch::Device(torch::kCUDA);
        cout << device << endl;
    }
    // Stand-in for the sensor readings that MuJoCo fills in the actual code.
    double sensordata[] = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5};

    for (auto epoch = 0; epoch < 10000; epoch++)
    {
        // Copy the readings into a float vector, then build a tensor on the device.
        vector<float> v1(sensordata, sensordata + 15);
        cout << "trying to create a tensor for: " << epoch << " time" << endl;
        torch::Tensor a = torch::tensor(v1).to(device);
        cout << "success" << endl;

        for (auto time = 0; time < 10000; time++)
        {
            vector<float> v2(sensordata, sensordata + 15);
            torch::Tensor b = torch::tensor(v2).to(device);
            // do something
            a = b;
        }
    }
    return 0;
}

At a random epoch, creating the tensor “a” fails and the program exits with the message: Finished in "x"s with exit code 3221225477
When I run it in Visual Studio, I get the error message: “Unhandled exception at 0x00007FFC979C0990 (c10_cuda.dll) in example-app.exe: 0xC0000005: Access violation reading location 0x0000000000000019.”
Unfortunately, the error cannot be reproduced with the simplified code above.
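For reference, the exit code 3221225477 and the exception code 0xC0000005 in the two messages are the same number, once in decimal and once in hexadecimal, so both reports describe the same access violation. A minimal sketch of the conversion, using only the standard library (the helper name `exit_code_to_hex` is made up for this illustration):

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a Windows process exit code as an
// 8-digit uppercase hexadecimal string with a "0x" prefix.
std::string exit_code_to_hex(std::uint32_t code)
{
    std::ostringstream out;
    out << "0x" << std::uppercase << std::hex
        << std::setw(8) << std::setfill('0') << code;
    return out.str();
}
```

Calling `exit_code_to_hex(3221225477u)` yields the string `0xC0000005`. The very small faulting address (0x19) often points to a read through a null or invalid pointer plus a small offset, though that alone does not say where the bad pointer came from.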
In the actual code, the “sensordata” array is filled in by the MuJoCo simulator. I suspect that MuJoCo uses multithreading on the CPU in the background and that libtorch creates tensors on the GPU asynchronously, so the tensor “a” may be created before the “sensordata” array has been fully written.
I tried “cudaDeviceSynchronize()” and “.to(device, /*non_blocking=*/true)”, but I still get the same error at a random epoch.
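If the suspicion above is right and another CPU thread writes to the sensor buffer while it is being read, a device-side call such as cudaDeviceSynchronize() would not be expected to help, because it waits for GPU work and does not order accesses between host threads. One way to test the hypothesis is to take a snapshot of the buffer under a lock and build the tensor from that snapshot only. The sketch below uses only the standard library; `SharedSensorBuffer`, `write_sensors` and `snapshot_sensors` are hypothetical names, and it assumes the writer side can be made to take the same lock, which may not be the case for the simulator in the actual code.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical wrapper around a buffer that a simulator thread writes to.
struct SharedSensorBuffer {
    std::mutex mtx;            // guards every access to data
    std::vector<double> data;  // latest sensor readings
};

// Writer side (assumption: the code that fills the readings can call this).
void write_sensors(SharedSensorBuffer& buf, const double* src, std::size_t n)
{
    std::lock_guard<std::mutex> lock(buf.mtx);
    buf.data.assign(src, src + n);
}

// Reader side: copy the readings while holding the lock, converting to float.
// The returned vector is private to the caller, so later writes cannot
// change it while a tensor is being built from it.
std::vector<float> snapshot_sensors(SharedSensorBuffer& buf)
{
    std::lock_guard<std::mutex> lock(buf.mtx);
    return std::vector<float>(buf.data.begin(), buf.data.end());
}
```

In the training loop, the snapshot would then replace the direct read of the array, for example `torch::tensor(snapshot_sensors(buf)).to(device)`. If the crash still occurs with a private snapshot, the cause is probably not a race on the sensor array.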