Is it thread-safe to call `Tensor.to(device)` from multiple threads targeting the same GPU device?

I’m reading many small files (tensors) that need to be loaded into GPU memory. In this case, can I open a thread pool and use `.to()` to move all of them to the GPU? I have 8 GPUs in total, each will receive ~2000 such tensors, and this is done in Python with a `multiprocessing.pool.ThreadPool(32)`, roughly like the sketch below.
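A minimal sketch of what I mean (the glob pattern `data/*.pt` and the round-robin device assignment are placeholders for my actual setup):

```python
import glob
import torch
from multiprocessing.pool import ThreadPool

# "data/*.pt" is a stand-in for wherever the small tensor files live.
paths = sorted(glob.glob("data/*.pt"))

# Round-robin the files across the 8 GPUs.
jobs = [(p, torch.device(f"cuda:{i % 8}")) for i, p in enumerate(paths)]

def load_to_gpu(job):
    path, device = job
    t = torch.load(path, map_location="cpu")  # read the small tensor file on CPU
    return t.to(device)                       # move it to its target GPU

# 32 worker threads, all calling .to() concurrently, several per device.
with ThreadPool(32) as pool:
    gpu_tensors = pool.map(load_to_gpu, jobs)
```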

Follow-up question: in general, is it safe to access the same GPU device from different threads as long as each thread works only on its own thread-local objects (still in Python, e.g. using a thread pool)? For example, if I call `x = torch.rand(2, 2)` in two different threads and manipulate each thread’s `x` separately, could I run into any errors or undefined behavior?
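Concretely, this is the kind of pattern I mean, assuming the tensors live on the same device (here `cuda:0`; the workload is just a toy):

```python
import torch
from multiprocessing.pool import ThreadPool

def worker(i):
    # Each thread creates and manipulates its own tensor on the same GPU;
    # the tensors themselves are never shared between threads.
    x = torch.rand(2, 2, device="cuda:0")
    for _ in range(100):
        x = x @ x.t() + 1.0  # thread-local math on the shared device
    return x.sum().item()

with ThreadPool(8) as pool:
    print(pool.map(worker, range(8)))
```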