Even with the simplest calculation I get an OOM error, and I have already reinstalled conda and CUDA.
As for the env info:
PyTorch version: 1.4.0+cu100
Is debug build: No
CUDA used to build PyTorch: 10.0
OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: Could not collect
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: GeForce RTX 2080 Ti
Nvidia driver version: 460.91.03
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
@1309123499
You can check the GPU usage via the `nvidia-smi` command. There you can see whether an existing process is already taking up the GPU memory, and kill it if needed.
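To make that check scriptable, here is a minimal sketch that wraps `nvidia-smi`'s CSV query mode to list the processes currently holding GPU memory. The helper names (`parse_compute_apps`, `gpu_processes`) are hypothetical, and this assumes `nvidia-smi` is on the PATH:

```python
import subprocess

def parse_compute_apps(csv_text):
    """Parse the output of
    `nvidia-smi --query-compute-apps=pid,used_memory --format=csv,noheader`
    into a list of (pid, used_memory) tuples."""
    procs = []
    for line in csv_text.strip().splitlines():
        pid, mem = [field.strip() for field in line.split(",")]
        procs.append((int(pid), mem))
    return procs

def gpu_processes():
    """Query nvidia-smi for processes that are holding GPU memory.
    Requires an NVIDIA driver installed on the machine."""
    out = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,used_memory",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_compute_apps(out)
```

If `gpu_processes()` reports a PID you don't recognize (e.g. a crashed training run from a previous session), killing it with `kill <pid>` should free the memory.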
I connect to a remote server to train my model. After encountering this problem, I reinstalled CUDA, rebuilt the conda environment, and reconnected to the server. The problem appeared between two debugging runs, which means I changed nothing about the machine or the environment. It really confuses me.