I cloned the NVlabs/wetectron project from GitHub
and followed its instructions to set up the environment.
The ninja version is 1.10.2.3
The conda version is 4.4.10.
I used the following command to install PyTorch: conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
so I got pytorch-1.11.0 and cuda-10.2.
Then I cloned apex from NVIDIA, and everything still worked fine up to this point.
But when I ran the install with ninja and symbolic links,
the build failed:
fatal error: THC/THC.h: No such file or directory
and another error is
I searched on Google but didn’t find anything useful.
I am not sure whether this is caused by a version mismatch, or whether I can simply download the missing file and put it somewhere.
I really appreciate your help, but I think the problem is not caused by apex. I removed the conda environment, cloned apex again with git clone from GitHub (NVIDIA/apex),
and installed it with the following command: pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
The error is still the same, but I found that it is some of the csrc/cuda/***.cu files that cannot find THC, for example: SigmoidFocalLoss_cuda.cu, deform_pool_cuda.cu, ROIPool_cuda.cu. Maybe I can find a newer version of these functions somewhere? Or is there any other advice on how to solve it?
I might have misunderstood the issue, as I thought you were running into a build error in apex.
Based on the function names, it seems you are trying to build maskrcnn_benchmark, which still uses the deprecated THC import, so you might want to either update it or skip its build.
Thank you for your help, I think I have solved the problem. It was caused by the versions of torch and cudatoolkit. I reinstalled PyTorch many times, from 1.01 to 1.4 to 1.6, and the project works now. Besides, I wonder why I get the CPU-only version when I use the conda command to install PyTorch.
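For the CPU-only question, a quick way to confirm what conda actually installed is to ask PyTorch itself. The snippet below is only a sketch: describe_build is a hypothetical helper written for this thread, while torch.__version__, torch.version.cuda (None on CPU-only builds) and torch.cuda.is_available() are the values it expects to receive.

```python
import importlib.util


def describe_build(torch_version, cuda_version, cuda_available):
    """Summarize a PyTorch install (hypothetical helper, not a PyTorch API).

    cuda_version is expected to be torch.version.cuda, which is None
    when the installed binary was built without CUDA support.
    """
    if cuda_version is None:
        return f"PyTorch {torch_version}: CPU-only build"
    if not cuda_available:
        return (f"PyTorch {torch_version}: built for CUDA {cuda_version}, "
                "but no usable GPU/driver was detected")
    return f"PyTorch {torch_version}: built for CUDA {cuda_version}, GPU available"


# Only query torch if it is installed in the current environment.
if importlib.util.find_spec("torch") is not None:
    import torch

    print(describe_build(torch.__version__,
                         torch.version.cuda,
                         torch.cuda.is_available()))
else:
    print("PyTorch is not installed in this environment")
```

If this reports a CPU-only build, the conda solver most likely picked a CPU package, and the install command or channel is worth re-checking before rebuilding any extensions.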
You could check which deprecated TH(C) methods are currently used in the code base and migrate them to the new ATen API. I would guess that the majority of actual function names should be equal but in a new namespace now (or without the TH tag in their name).
I’m trying to compile a convolutional layer repository named “DCNv2” and have the same problem. I have installed CUDA 11.3 (docker image) and PyTorch 1.12, and I have tried many things, but none of them solved it. I can’t use another PyTorch version because my GPU driver isn’t compatible with CUDA 11.6 (the other option for downloading PyTorch) and my GPU isn’t compatible with higher driver versions. I have tried to update the ATen version, but I haven’t found a higher version. I would appreciate any help. Thanks in advance.
PyTorch is compatible with all CUDA versions >= 10.2 at the moment, so you can pick whichever version works for you.
As already described, the TH/THC namespace is deprecated and functions from it were moved to the ATen namespace. To fix build issues, check which TH(C) methods are used in the repository you are trying to build and move them to the new ATen calls.
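As a starting point for that check, here is a rough sketch that lists which source files still include TH/THC headers or use TH(C)-prefixed identifiers. The function name find_th_usages and the regular expressions are my own assumptions, not part of PyTorch, and the symbol pattern is only a heuristic, so it can miss names or report false positives.

```python
import re
from pathlib import Path

# Includes such as <THC/THC.h> or "TH/TH.h".
TH_INCLUDE = re.compile(r'#\s*include\s*[<"](THC?/[^>"]+)[>"]')
# Identifiers like THCudaCheck or THCState: TH or THC followed by a
# capitalized word. All-caps macros such as THREADS_PER_BLOCK are skipped.
TH_SYMBOL = re.compile(r'\bTHC?[A-Z][a-z][A-Za-z0-9_]*')
SOURCE_SUFFIXES = {".cu", ".cuh", ".cpp", ".cc", ".h", ".hpp"}


def find_th_usages(root):
    """Map each source file under root to the TH(C) includes and symbols it uses."""
    root = Path(root)
    report = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(errors="ignore")
        includes = sorted(set(TH_INCLUDE.findall(text)))
        symbols = sorted(set(TH_SYMBOL.findall(text)))
        if includes or symbols:
            report[str(path.relative_to(root))] = {
                "includes": includes,
                "symbols": symbols,
            }
    return report
```

Calling find_th_usages("path/to/repo") and printing the result gives a per-file to-do list of the calls that need to be moved to ATen.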
Hi @ptrblck, thank you for the quick response! I have also figured out a solution from this post. I do have another general question: how do I find out which ATen calls correspond to which TH(C) calls? Is there any further guidance or a tutorial?
Often the function name is the same and only the namespace was moved, so you could remove the TH(C) from the operation and search for this op in the source code. If that doesn’t give you any matches, search for the TH(C) method and try to find the commit which moved it. I don’t think there is a complete mapping of all moved functions.
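The "remove the TH(C) and search" step can be sketched as a small helper that turns a TH(C) identifier into strings worth grepping for in the PyTorch source. candidate_search_terms is a hypothetical name, and the snake_case and upper-case variants are my own addition, on the assumption that the new name may use a different naming style (searching for ceil_div, for example, should lead to at::ceil_div).

```python
import re


def candidate_search_terms(th_name):
    """Suggest grep terms for a TH(C) identifier (heuristic, hypothetical helper)."""
    # Drop a leading TH or THC when it is followed by a capitalized word.
    stripped = re.sub(r"^THC?(?=[A-Z][a-z])", "", th_name)
    if stripped == th_name:
        return []  # does not look like a TH(C)-style name
    # Split CamelCase (and underscore-separated) parts into words.
    words = re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", stripped)
    snake = "_".join(word.lower() for word in words)
    # Keep the original order while removing duplicates.
    return list(dict.fromkeys([stripped, snake, snake.upper()]))


print(candidate_search_terms("THCudaCheck"))
print(candidate_search_terms("THCCeilDiv"))
```

If none of the suggested terms match anything, fall back to searching the commit history for the original TH(C) name, as described above.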