I am trying to run DreamBooth on RunPod.
Unfortunately, the PyTorch team removed the older version of xformers.
I can't believe how smart they are.
Now we have to use Torch 2.
However, it is not working on RunPod.
Here are the errors and the steps I tried to solve the problem.
I installed Torch 2 on a RunPod.io instance with this command:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
Everything installed perfectly fine.
With Torch 1 and CUDA 11.7 I was not getting any error, but with Torch 2 the error below is produced:
Could not load library libcudnn_cnn_infer.so.8. Error: libnvrtc.so: cannot open shared object file: No such file or directory
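A minimal sketch for checking whether the two libraries named in the error can be opened by the dynamic loader at all. It uses only the standard ctypes module. The helper name check_libs is made up for illustration, and the library names are simply copied from the error message.

```python
import ctypes

# Library names copied from the error message; adjust them if your message differs.
LIBS = ["libcudnn_cnn_infer.so.8", "libnvrtc.so"]


def check_libs(names):
    """Try to open each shared library and return {name: True/False}."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError as exc:
            # Raised when the loader cannot find or open the library
            results[name] = False
            print(f"{name}: cannot be loaded ({exc})")
        else:
            results[name] = True
            print(f"{name}: loaded OK")
    return results


if __name__ == "__main__":
    check_libs(LIBS)
```

This only tests the default loader search path (for example LD_LIBRARY_PATH and the ldconfig cache), so copies bundled inside the torch package may not show up this way. A failure here would at least confirm that no system-wide copy is visible.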
How can I fix this?
The instance runs Linux.
On Windows the same procedure works very well.
I am using the Automatic1111 web UI to run Stable Diffusion.
I couldn't solve the error above,
so I did the following:
apt update
apt install sudo
sudo apt install nvidia-cudnn
sudo apt-get install python3-dev
After installing all of the above,
I now get this warning and training never progresses:
Steps: 0%| | 0/170 [00:00<?, ?it/s][2023-03-29 18:50:26,163] torch._inductor.utils: [WARNING] not enough cuda cores to use max_autotune mode
Now when I run the Python code below, everything looks good:
import torch

# Check if CUDA is available
if torch.cuda.is_available():
    print("CUDA is available")
    # Display the current GPU name
    print("GPU name: ", torch.cuda.get_device_name(torch.cuda.current_device()))
else:
    print("CUDA is not available")

# Verify the PyTorch version
print("PyTorch version: ", torch.__version__)

# Number of streaming multiprocessors on GPU 0
print(torch.cuda.get_device_properties(0).multi_processor_count)
test.py result
CUDA is available
GPU name: NVIDIA RTX A4500
PyTorch version: 2.0.0+cu118
56
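The warning seems to be about the multiprocessor count printed above. A rough sketch of the comparison I suspect is happening follows. The threshold of 80 is an assumption based on my reading of the PyTorch 2.0 inductor source and is not confirmed, and the function names are hypothetical.

```python
# Assumed threshold: my reading of the PyTorch 2.0 inductor source, not confirmed.
ASSUMED_MIN_SM_COUNT = 80


def meets_autotune_threshold(sm_count, minimum=ASSUMED_MIN_SM_COUNT):
    """Return True if the SM count is at or above the assumed minimum."""
    return sm_count >= minimum


def describe(sm_count, minimum=ASSUMED_MIN_SM_COUNT):
    """Build a one-line summary of how the SM count compares to the minimum."""
    if meets_autotune_threshold(sm_count, minimum):
        return f"{sm_count} SMs: at or above the assumed minimum of {minimum}"
    return f"{sm_count} SMs: below the assumed minimum of {minimum}"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        # Same property that test.py prints
        count = torch.cuda.get_device_properties(0).multi_processor_count
        print(describe(count))
    else:
        # Fall back to the value reported by test.py
        print(describe(56))
```

If that assumption holds, the A4500 with 56 multiprocessors would always trigger the warning, which would make it a notice about autotuning rather than the reason training stalls.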
It is able to generate images at 15.58 it/s, which is very fast.
Any help is very much appreciated.