Same here.
OS: Win10
python: 3.8.1
torch: 1.4.0
CUDA: 10.1
I worked around the problem by writing my own replacement for the cat use case I had; the function below solved it for me.
It takes a list of four 4D tensors and concatenates them along dim=1.
The number of tensors is hardcoded, though.
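import torch

def cat(arr, device):
    # Concatenate four 4D tensors of shape (N, C_i, H, W) along dim=1.
    # Sum the channel counts of all inputs to size the output.
    total_depth = 0
    for x in arr:
        total_depth += x.size()[1]
    num_samples = arr[0].size()[0]
    h = arr[0].size()[2]
    w = arr[0].size()[3]
    # Preallocate the output (default dtype) on the target device.
    concated = torch.zeros((num_samples, total_depth, h, w), device=device)
    # Copy each input into its channel range; `last` tracks the next free channel.
    last = 0
    concated[:, :arr[0].size()[1], :, :] = arr[0]
    last = arr[0].size()[1]
    concated[:, last:last + arr[1].size()[1], :, :] = arr[1]
    last = last + arr[1].size()[1]
    concated[:, last:last + arr[2].size()[1], :, :] = arr[2]
    last = last + arr[2].size()[1]
    concated[:, last:, :, :] = arr[3]
    return concated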
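In case it helps anyone who needs more or fewer than four inputs: the hardcoded offsets above are just running sums of the channel counts, so they can be computed for any number of tensors. This is only a sketch of the offset bookkeeping in plain Python; the helper name `channel_slices` and the example channel counts are made up for illustration and are not part of the function above.

```python
from itertools import accumulate

def channel_slices(depths):
    """Return one slice per input, covering consecutive channel ranges."""
    ends = list(accumulate(depths))   # running total after each input
    starts = [0] + ends[:-1]          # each range starts where the previous one ended
    return [slice(s, e) for s, e in zip(starts, ends)]

# Hypothetical channel counts for four tensors of shape (N, C_i, H, W).
depths = [3, 5, 2, 6]
slices = channel_slices(depths)
total_depth = sum(depths)
```

With real tensors, the depths would presumably come from `[x.size()[1] for x in arr]`, and each copy would become `concated[:, s, :, :] = x` for the matching slice `s`, replacing the four hand-written assignments.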
Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] on win32
I am using pip, and everything is installed from the official site.
Also, the CUDA details:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.1, V10.1.243
How similar is it? Which Python distribution are you using? Do you have an NVIDIA GPU in your PC? Do you have the GPU driver installed? What is the exact error message? We didn’t compile packages for CUDA 10.2. So does "cuda: 10.2" mean that you compiled the package yourself?
Hi.
I also have this problem.
OS: win10
libtorch: 1.5.0 and 1.6.0, both downloaded from pytorch.org
CUDA: 10.1
It only happens in C++ when running a TorchScript-loaded model on the GPU; on the CPU it works fine.
There is also no problem when running the model in Python.
Update:
When I copy caffe2_nvrtc.dll into the same folder as the executable, it works fine. Is there any way to make it work without copying the DLL?