maamli
(Ali)
January 4, 2023, 3:13am
1
I cannot use CUDA as a device because the kernel dies. It is not due to the number of parameters, which I have tested.
I ran the following:
print(torch.__version__)
print(torch.version.cuda)
print(torch.backends.cudnn.version())
print(torch.cuda.get_device_name(0))
print(torch.cuda.get_device_properties(0))
which gives the following output:
1.13.0
11.6
8302
NVIDIA GeForce RTX 3050 Ti Laptop GPU
_CudaDeviceProperties(name='NVIDIA GeForce RTX 3050 Ti Laptop GPU', major=8, minor=6, total_memory=4095MB, multi_processor_count=20)
When I run the following script, my kernel dies:
x = torch.randn(1, 3, 224, 224, device='cuda')
conv = torch.nn.Conv2d(3, 3, 3).cuda()
out = conv(x)
Please help me resolve this issue.
Could you show the exact error message you received?
Could you also share the output of pip list to see your Python packages?
maamli
(Ali)
January 4, 2023, 5:05am
3
Thanks for working with me on this. Kernel message is:
Kernel Restarting
The kernel appears to have died. It will restart automatically.
Pip list:
absl-py 1.2.0
aeppl 0.0.38
aesara 2.8.7
aiohttp 3.8.1
aiosignal 1.2.0
alabaster 0.7.12
anaconda-client 1.9.0
anaconda-navigator 2.1.4
anaconda-project 0.10.2
anyio 3.5.0
appdirs 1.4.4
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.2
arviz 0.14.0
astroid 2.6.6
astropy 5.0.4
asttokens 2.0.5
async-timeout 4.0.1
atomicwrites 1.4.0
attrs 21.4.0
audioread 3.0.0
Automat 20.2.0
autopep8 1.6.0
Babel 2.9.1
backcall 0.2.0
backports.functools-lru-cache 1.6.4
backports.tempfile 1.0
backports.weakref 1.0.post1
bambi 0.9.2
bcrypt 3.2.0
beautifulsoup4 4.11.1
binaryornot 0.4.4
bitarray 2.4.1
bkcharts 0.2
black 19.10b0
bleach 4.1.0
blis 0.7.9
bokeh 2.4.2
boto3 1.21.32
botocore 1.24.32
Bottleneck 1.3.4
brotlipy 0.7.0
cachetools 4.2.2
captum 0.5.0
catalogue 2.0.8
certifi 2021.10.8
cffi 1.15.0
cftime 1.6.2
chardet 4.0.0
charset-normalizer 2.0.4
click 8.0.4
cloudpickle 2.0.0
clyent 1.2.2
colorama 0.4.4
colorcet 2.0.6
conda 22.11.1
conda-build 3.22.0
conda-content-trust 0+unknown
conda-pack 0.6.0
conda-package-handling 1.8.1
conda-repo-cli 1.0.4
conda-token 0.3.0
conda-verify 3.4.2
confection 0.0.3
cons 0.4.5
constantly 15.1.0
cookiecutter 1.7.3
cryptography 3.4.8
cssselect 1.1.0
cycler 0.11.0
cymem 2.0.7
Cython 0.29.28
cytoolz 0.11.0
daal4py 2021.5.0
dask 2022.2.1
datashader 0.13.0
datashape 0.5.4
debugpy 1.5.1
decorator 5.1.1
deep-phonemizer 0.0.17
defusedxml 0.7.1
deprecat 2.1.1
diff-match-patch 20200713
dill 0.3.6
distributed 2022.2.1
docutils 0.17.1
en-core-web-sm 3.4.1
entrypoints 0.4
et-xmlfile 1.1.0
etuples 0.3.8
executing 0.8.3
fastjsonschema 2.15.1
fastprogress 1.0.3
filelock 3.6.0
flake8 3.9.2
Flask 1.1.2
fonttools 4.25.0
formulae 0.3.4
frozenlist 1.2.0
fsspec 2022.2.0
future 0.18.2
gensim 4.1.2
glob2 0.7
gmpy2 2.1.2
google-api-core 1.25.1
google-auth 1.33.0
google-auth-oauthlib 0.4.6
google-cloud-core 1.7.1
google-cloud-storage 1.31.0
google-crc32c 1.1.2
google-resumable-media 1.3.1
googleapis-common-protos 1.53.0
graphviz 0.20.1
greenlet 1.1.1
grpcio 1.42.0
h5py 3.6.0
HeapDict 1.0.1
holoviews 1.14.8
huggingface-hub 0.10.1
hvplot 0.7.3
hyperlink 21.0.0
idna 3.3
imagecodecs 2021.8.26
imageio 2.9.0
imagesize 1.3.0
importlib-metadata 4.11.3
incremental 21.3.0
inflection 0.5.1
iniconfig 1.1.1
intake 0.6.5
intervaltree 3.1.0
ipykernel 6.9.1
ipython 8.2.0
ipython-genutils 0.2.0
ipywidgets 7.6.5
isort 5.9.3
itemadapter 0.3.0
itemloaders 1.0.4
itsdangerous 2.0.1
jax 0.3.25
jaxlib 0.3.25
jdcal 1.4.1
jedi 0.18.1
jeepney 0.7.1
Jinja2 2.11.3
jinja2-time 0.2.0
jmespath 0.10.0
joblib 1.1.0
json5 0.9.6
jsonschema 4.4.0
jupyter 1.0.0
jupyter-client 6.1.12
jupyter-console 6.4.0
jupyter-core 4.9.2
jupyter-server 1.13.5
jupyterlab 3.3.2
jupyterlab-pygments 0.1.2
jupyterlab-server 2.10.3
jupyterlab-widgets 1.0.0
keyring 23.4.0
kiwisolver 1.3.2
langcodes 3.3.0
lazy-object-proxy 1.6.0
libarchive-c 2.9
librosa 0.9.2
llvmlite 0.38.0
locket 0.2.1
logical-unification 0.4.5
lxml 4.8.0
Markdown 3.3.4
MarkupSafe 2.0.1
matplotlib 3.5.1
matplotlib-inline 0.1.2
mccabe 0.6.1
miniKanren 1.0.3
mistune 0.8.4
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
mock 4.0.3
mpmath 1.2.1
msgpack 1.0.2
multidict 5.2.0
multipledispatch 0.6.0
munkres 1.1.4
murmurhash 1.0.9
mypy-extensions 0.4.3
navigator-updater 0.2.1
nbclassic 0.3.5
nbclient 0.5.13
nbconvert 6.4.4
nbformat 5.3.0
nest-asyncio 1.5.5
netCDF4 1.6.2
networkx 2.7.1
nltk 3.7
nose 1.3.7
notebook 6.4.8
numba 0.55.1
numexpr 2.8.1
numpy 1.21.0
numpydoc 1.2
numpyro 0.10.1
oauthlib 3.2.1
olefile 0.46
openpyxl 3.0.9
opt-einsum 3.3.0
packaging 21.3
pandas 1.4.2
pandocfilters 1.5.0
panel 0.13.0
param 1.12.0
parsel 1.6.0
parso 0.8.3
partd 1.2.0
pathspec 0.7.0
pathy 0.10.0
patsy 0.5.2
pep8 1.7.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.0.1
pip 21.2.4
pkginfo 1.8.2
plotly 5.6.0
pluggy 1.0.0
polyscope 1.3.1
pooch 1.4.0
poyo 0.5.0
preshed 3.0.8
prometheus-client 0.13.1
prompt-toolkit 3.0.20
Protego 0.1.16
protobuf 3.19.1
psutil 5.8.0
ptyprocess 0.7.0
pure-eval 0.2.2
py 1.11.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycodestyle 2.7.0
pycosat 0.6.3
pycparser 2.21
pyct 0.4.6
pycurl 7.44.1
pydantic 1.10.2
PyDispatcher 2.0.5
pydocstyle 6.1.1
pyerfa 2.0.0
pyflakes 2.3.1
Pygments 2.11.2
PyHamcrest 2.0.2
PyJWT 2.4.0
pylint 2.9.6
pyls-spyder 0.4.0
pymc 4.4.0
pymc3 3.11.5
pyodbc 4.0.32
pyOpenSSL 21.0.0
pyparsing 3.0.4
pyro-api 0.1.2
pyro-ppl 1.8.2
pyrsistent 0.18.0
PySocks 1.7.1
pytest 7.1.1
python-dateutil 2.8.2
python-lsp-black 1.0.0
python-lsp-jsonrpc 1.0.0
python-lsp-server 1.2.4
python-slugify 5.0.2
python-snappy 0.6.0
pytz 2021.3
pyviz-comms 2.0.2
PyWavelets 1.3.0
pyxdg 0.27
PyYAML 6.0
pyzmq 22.3.0
QDarkStyle 3.0.2
qstylizer 0.1.10
QtAwesome 1.0.3
qtconsole 5.3.0
QtPy 2.0.1
queuelib 1.5.0
regex 2022.3.15
requests 2.27.1
requests-file 1.5.1
requests-oauthlib 1.3.1
resampy 0.4.2
rope 0.22.0
rsa 4.7.2
Rtree 0.9.7
ruamel.yaml 0.16.12
ruamel.yaml.clib 0.2.6
ruamel-yaml-conda 0.15.100
s3transfer 0.5.0
sacremoses 0.0.43
scikit-image 0.19.2
scikit-learn 1.0.2
scikit-learn-intelex 2021.20220215.212715
scipy 1.8.0
Scrapy 2.6.1
seaborn 0.11.2
SecretStorage 3.3.1
semver 2.13.0
Send2Trash 1.8.0
service-identity 18.1.0
setuptools 61.2.0
sip 4.19.13
six 1.16.0
smart-open 5.2.1
sniffio 1.2.0
snowballstemmer 2.2.0
sortedcollections 2.1.0
sortedcontainers 2.4.0
soundfile 0.11.0
soupsieve 2.3.1
spacy 3.4.3
spacy-legacy 3.0.10
spacy-loggers 1.0.3
Sphinx 4.4.0
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.0
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
spyder 5.1.5
spyder-kernels 2.1.3
SQLAlchemy 1.4.32
srsly 2.4.5
stack-data 0.2.0
statsmodels 0.13.2
sympy 1.10.1
tables 3.6.1
tabulate 0.8.9
TBB 0.2
tblib 1.7.0
tenacity 8.0.1
tensorboard 2.10.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
terminado 0.13.1
testpath 0.5.0
text-unidecode 1.3
textdistance 4.2.1
Theano 1.0.5
Theano-PyMC 1.1.2
thinc 8.1.5
threadpoolctl 2.2.0
three-merge 0.1.1
tifffile 2021.7.2
tinycss 0.4
tldextract 3.2.0
tokenizers 0.11.4
toml 0.10.2
tomli 1.2.2
toolz 0.11.2
torch 1.13.0
torch-summary 1.4.5
torchaudio 0.13.0
torchvision 0.14.0
tornado 6.1
tqdm 4.64.0
traitlets 5.1.1
transformers 4.24.0
Twisted 22.2.0
typed-ast 1.4.3
typer 0.7.0
typing_extensions 4.1.1
ujson 5.1.0
Unidecode 1.2.0
urllib3 1.26.9
w3lib 1.21.0
wasabi 0.10.1
watchdog 2.1.6
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 0.58.0
Werkzeug 2.0.3
wheel 0.37.1
widgetsnbextension 3.5.2
wrapt 1.12.1
wurlitzer 3.0.2
xarray 2022.12.0
xarray-einstats 0.3.0
xlrd 2.0.1
XlsxWriter 3.0.3
yapf 0.31.0
yarl 1.6.3
zict 2.0.0
zipp 3.7.0
zope.interface 5.4.0
Just to check whether the problem is caused by Jupyter, could you run the same code inside a Python script?
Also, can you show the CUDA driver version?
There is a compatibility relation between the driver and the CUDA toolkit, see here
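One way to see the underlying error without leaving the notebook is to run the snippet in a separate interpreter and capture its stderr. This is only a minimal sketch: the helper name run_in_subprocess and the REPRO string are made up for illustration, and the code in REPRO is the reproduction from the first post with a print added.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout: float = 120.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter and capture stdout and stderr."""
    return subprocess.run(
        [sys.executable, "-c", code],  # same interpreter as the caller
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Reproduction from the first post, run out of process so a native abort
# cannot take the notebook kernel down with it.
REPRO = """
import torch
x = torch.randn(1, 3, 224, 224, device='cuda')
conv = torch.nn.Conv2d(3, 3, 3).cuda()
out = conv(x)
print(out.shape)
"""

if __name__ == "__main__":
    result = run_in_subprocess(REPRO)
    print("return code:", result.returncode)
    print("stdout:", result.stdout)
    print("stderr:", result.stderr)
```

If the child process aborts, the return code is non-zero and whatever the native library printed ends up in stderr instead of being lost with the kernel.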
maamli
(Ali)
January 4, 2023, 11:06pm
5
I ran nvcc --version for the CUDA driver version and it gave me the following output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
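Note that nvcc reports the version of the CUDA toolkit compiler, not of the GPU driver. The driver version can usually be read through nvidia-smi. Here is a rough sketch, assuming nvidia-smi is on PATH; the function name nvidia_driver_version is invented for this example, and it returns None when the tool is missing or fails.

```python
import shutil
import subprocess
from typing import Optional


def nvidia_driver_version() -> Optional[str]:
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # nvidia-smi is not on PATH
    try:
        proc = subprocess.run(
            [exe, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if proc.returncode != 0:
        return None
    lines = proc.stdout.strip().splitlines()
    # One line per GPU; the first one is enough on a single-GPU laptop.
    return lines[0].strip() if lines else None


print("driver version:", nvidia_driver_version())
```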
I ran it as a python script and saw the following error:
Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory
Please make sure libcudnn_cnn_infer.so.8 is in your library path!
Aborted
Do I need to reinstall anything specific, since the libcudnn packages are not installed? Please guide me.
cuDNN and CUDA come bundled with the PyTorch binaries. The issue is caused by WSL.
This answer in the GitHub repo of WSL should solve your problem: github link
maamli
(Ali)
January 6, 2023, 3:59am
7
Following the solution mentioned in the above link, I was able to run it through a Python script, and I see the output of the script:
x = torch.randn(1, 3, 224, 224, device='cuda')
conv = torch.nn.Conv2d(3, 3, 3).cuda()
out = conv(x)
But when I run the same thing in Jupyter, the kernel dies. What might be the issue here? Any ideas on how to debug this further?
My first guess is that it is because of the Python interpreter used by the Jupyter kernel. Can you make sure that the Jupyter kernel uses the same interpreter that you ran your Python script with?
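A quick way to compare the two environments is to print the interpreter path and the library search path from inside Python, once in the script and once in a notebook cell. The sketch below is only an illustration; runtime_fingerprint is a made-up helper name. Reading sys.executable is more direct than calling which python, because a shell command resolves python from PATH and does not necessarily report the interpreter the kernel is running.

```python
import os
import sys


def runtime_fingerprint() -> dict:
    """Collect the values most likely to differ between a script and a kernel."""
    return {
        "executable": sys.executable,  # interpreter actually running this code
        "prefix": sys.prefix,          # environment the interpreter belongs to
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
    }


for key, value in runtime_fingerprint().items():
    print(f"{key}: {value}")
```

If the interpreter matches but LD_LIBRARY_PATH differs, the two processes were started with different environment variables.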
maamli
(Ali)
January 6, 2023, 11:13pm
9
Running which python inside the terminal and inside Jupyter returns the same path:
anaconda3/bin/python
But I think I have moved closer to the solution. I previously used the export command below to make the Python script work. But when I ran it again today, it gave me the same error. I had to run the command again to make it work. After that, the Jupyter script worked as well.
export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH
So I think it now comes down to why I have to run the export command again and again.
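To check whether the current process (script or kernel) actually sees the setting, something like the following can be run before importing torch. It is a sketch under assumptions: the directory comes from the export command above, the library name libcuda.so comes from the earlier error message, and the helper names path_contains and can_load are invented here.

```python
import ctypes
import os

WSL_LIB_DIR = "/usr/lib/wsl/lib"  # directory taken from the export command above


def path_contains(search_path: str, directory: str) -> bool:
    """Return True if `directory` is one of the colon-separated entries."""
    entries = [os.path.normpath(p) for p in search_path.split(":") if p]
    return os.path.normpath(directory) in entries


def can_load(library: str) -> bool:
    """Return True if the dynamic loader can open `library` in this process."""
    try:
        ctypes.CDLL(library)
    except OSError:
        return False
    return True


ld_path = os.environ.get("LD_LIBRARY_PATH", "")
print("WSL lib dir on LD_LIBRARY_PATH:", path_contains(ld_path, WSL_LIB_DIR))
print("libcuda.so loadable:", can_load("libcuda.so"))
```

An export typed into one terminal only affects that shell and the processes started from it, so a new terminal or a Jupyter server launched elsewhere would not inherit it.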
Wait, the GitHub link says to add the export command to the .bashrc file. You did this, right?
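For reference, adding the line only once can be sketched like this. The helper ensure_line is hypothetical and the export line is copied from the post above; editing ~/.bashrc by hand works just as well.

```python
from pathlib import Path

# Export line copied from the earlier post.
EXPORT_LINE = "export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH"


def ensure_line(path: Path, line: str) -> bool:
    """Append `line` to `path` unless it is already there.

    Returns True if the file was changed, False otherwise.
    """
    existing = path.read_text() if path.exists() else ""
    if line in existing.splitlines():
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with path.open("a") as fh:
        fh.write(f"{prefix}{line}\n")
    return True


# Example call (not executed here, since it modifies a real file):
# ensure_line(Path.home() / ".bashrc", EXPORT_LINE)
```

The change only applies to shells opened afterwards, so Jupyter has to be restarted from a new terminal to inherit the variable.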
maamli
(Ali)
January 8, 2023, 3:48am
11
I can confirm that after adding it to .bashrc, the above code snippet runs successfully in Jupyter. Thanks so much for bearing with me to resolve this.