I'm running Tacotron2 on Windows in a conda environment with an RTX 3090.
I can get it to run, but on Windows it seems to use twice as much VRAM as my Linux counterparts.
I've tried Python 3.6 and 3.7, installing PyTorch through pip and conda, and using cudatoolkit.
CUDA toolkits before v11 tell me the GPU isn't sm_86 compatible and freeze.
I didn't always have this issue, but I've gone back to my conda env backups and those don't help.
Is there a script that can tell me which CUDA version it's using? I removed all the CUDA paths, but somehow my conda env still knew where to find it and ran.
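One quick way to see which CUDA runtime PyTorch itself is using (a sketch, assuming PyTorch is installed in the env; `torch.version.cuda` reports the runtime bundled with the binary, not any locally installed toolkit):

```python
import torch

# Version of the CUDA runtime that PyTorch was built against and ships with.
print("PyTorch:", torch.__version__)
print("CUDA runtime:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())

# Device details are only available on a machine with a visible GPU.
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("Compute capability:", torch.cuda.get_device_capability(0))
```

On an RTX 3090 you would want the compute capability line to show `(8, 6)` and a CUDA runtime of 11.x or newer.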
My Colab GPU has 16GB and can handle a batch size of 48, while my Windows GPU has 24GB and can only handle a batch size of 32.
The conda binaries and pip wheels ship with their own CUDA runtime (which you select during installation) as well as cuDNN, NCCL, etc., so your local CUDA toolkit won't be used.
The local CUDA toolkit (and in particular the compiler) will be used if you are building a custom CUDA extension or PyTorch from source.
Are both machines using the same PyTorch, CUDA, and cuDNN installation (e.g. through conda) as well as the same GPU?
If so, are both GPUs free before you start the training or is e.g. Windows using it to display the desktop (which will use GPU memory)?
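To check how much memory the desktop (or anything else) is already holding before training starts, you can query the driver. A small sketch wrapping `nvidia-smi` (the helper name `gpu_memory_used_mb` is just for illustration; it returns `None` if `nvidia-smi` isn't on the PATH):

```python
import subprocess

def gpu_memory_used_mb():
    """Return a list of per-GPU used memory in MB as reported by nvidia-smi,
    or None if nvidia-smi is not available on this machine."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return [int(line) for line in out.splitlines() if line.strip()]

print(gpu_memory_used_mb())
```

On Windows the GPU driving the display will typically show a few hundred MB used before PyTorch allocates anything, which eats into the memory available for training.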
That’s good to hear. So the different memory behavior was observed between your local Windows machine and Colab?
It might be the case, but it would also depend on the device being used.
Could you create a single CUDA tensor and check the memory used by the CUDA context via nvidia-smi? Make sure the GPU memory is empty before running this test, or subtract the already-used memory.
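The suggested test above could be sketched like this (assuming PyTorch and `nvidia-smi` are available; the helper names are hypothetical). It reads the driver-reported used memory, creates a single tiny CUDA tensor to force context creation, and reports the difference, which approximates the CUDA context size:

```python
import subprocess
import torch

def gpu_used_mb(index=0):
    """Driver-reported used memory (MB) for one GPU via nvidia-smi,
    or None if nvidia-smi is unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", f"--id={index}",
             "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        return int(out)
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError):
        return None

def context_overhead_mb(index=0):
    """Create a single CUDA tensor and measure how much memory the CUDA
    context (plus the 1-element tensor) consumes, as seen by nvidia-smi.
    Returns None if no GPU or no nvidia-smi is available."""
    if not torch.cuda.is_available():
        return None
    before = gpu_used_mb(index)
    x = torch.ones(1, device=f"cuda:{index}")  # initializes the CUDA context
    torch.cuda.synchronize()
    after = gpu_used_mb(index)
    if before is None or after is None:
        return None
    return after - before

print(context_overhead_mb())
```

Run this in a fresh process so the context isn't already initialized; comparing the printed number on Windows vs. Linux would show whether the context itself accounts for the extra VRAM usage.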