Hi,
My project runs fast on my workstation, at around 100% GPU utilization on an RTX 3090, but is very slow on a server machine with an H100 and many CPU cores.
The code simulates its data, so I don’t think the problem is related to reading from or writing to the SSD. I noticed that no matter how many workers I set on the cluster, 2 threads are at 100% utilization and all workers are almost idle. If I set 64 workers, the GPU waits for the CPU, goes through 64 batches at ~100% utilization, and then waits again. I used this high number to make the pattern easier to see.
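To put numbers on the "GPU waits for CPU" pattern, here is a minimal sketch of how the time spent waiting for the next batch could be separated from the time spent in the training step. The names `split_wait_and_compute` and `slow_batches` are hypothetical helpers for illustration, not part of my project; the real loop would pass the DataLoader and the actual training step, and on a GPU the step would likely need a `torch.cuda.synchronize()` at its end for the compute timing to be meaningful.

```python
import time


def split_wait_and_compute(batches, step):
    """Return (seconds spent waiting for batches, seconds spent inside step)."""
    wait = 0.0
    compute = 0.0
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # blocks while the producer is still busy
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)  # stand-in for forward/backward/optimizer step
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    return wait, compute


def slow_batches(n, delay):
    """Hypothetical stand-in for a slow, CPU-bound data simulation."""
    for i in range(n):
        time.sleep(delay)
        yield i


if __name__ == "__main__":
    wait, compute = split_wait_and_compute(slow_batches(5, 0.02), lambda b: None)
    print(f"waiting for data: {wait:.3f}s, compute: {compute:.3f}s")
```

If the waiting share dominates on the server but not on the workstation, that would point at batch production on the CPU side rather than at the GPU.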
I wonder what this could be. Could it be related to OpenMP threading? I tried setting OMP_NUM_THREADS=1, without luck. pin_memory is left at its default of False.
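One thing that could be checked here is how many cores the process is actually allowed to use on the server, since only 2 threads ever appear busy. The following is a minimal diagnostic sketch under some assumptions: the job runs on Linux (where `os.sched_getaffinity` is available) and possibly under SLURM, which the environment path suggests but which I have not confirmed matters. `cpu_report` is a hypothetical helper name.

```python
import os


def cpu_report():
    """Collect what the current process can see about its CPU budget."""
    report = {
        # Number of logical cores in the machine, regardless of restrictions.
        "cpu_count": os.cpu_count(),
        # Thread-related environment variables; None means unset.
        "OMP_NUM_THREADS": os.environ.get("OMP_NUM_THREADS"),
        "MKL_NUM_THREADS": os.environ.get("MKL_NUM_THREADS"),
        # Set by SLURM when --cpus-per-task is requested (assumption: SLURM job).
        "SLURM_CPUS_PER_TASK": os.environ.get("SLURM_CPUS_PER_TASK"),
    }
    # Cores this process may be scheduled on; only available on some platforms.
    if hasattr(os, "sched_getaffinity"):
        report["usable_cores"] = len(os.sched_getaffinity(0))
    else:
        report["usable_cores"] = None
    return report


if __name__ == "__main__":
    for key, value in cpu_report().items():
        print(f"{key}: {value}")
```

If `usable_cores` turned out to be much smaller than `cpu_count` (for example 2), all DataLoader workers would be sharing those few cores, which would match the utilization pattern described above. Running the same report inside a worker process would show whether the workers inherit the same restriction.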
Please find a screenshot of the utilization on the cluster attached.
The environment is:
# packages in environment at /g/ries/users/Lucas/slurm/conda/envs/decode_dev:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_kmp_llvm conda-forge
absl-py 1.4.0 pyhd8ed1ab_0 conda-forge
accessible-pygments 0.0.4 pyhd8ed1ab_0 conda-forge
aiohttp 3.8.4 py310h2372a71_1 conda-forge
aiosignal 1.3.1 pyhd8ed1ab_0 conda-forge
alabaster 0.7.13 pyhd8ed1ab_0 conda-forge
alsa-lib 1.2.8 h166bdaf_0 conda-forge
antlr-python-runtime 4.9.3 pyhd8ed1ab_1 conda-forge
aom 3.5.0 h27087fc_0 conda-forge
asttokens 2.2.1 pyhd8ed1ab_0 conda-forge
async-timeout 4.0.2 pyhd8ed1ab_0 conda-forge
attr 2.5.1 h166bdaf_1 conda-forge
attrs 23.1.0 pyh71513ae_1 conda-forge
babel 2.12.1 pyhd8ed1ab_1 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 pyhd8ed1ab_3 conda-forge
backports.functools_lru_cache 1.6.5 pyhd8ed1ab_0 conda-forge
backports.zoneinfo 0.2.1 py310hff52083_7 conda-forge
beautifulsoup4 4.12.2 pyha770c72_0 conda-forge
black 23.3.0 pypi_0 pypi
blas 1.0 mkl conda-forge
blinker 1.6.2 pyhd8ed1ab_0 conda-forge
blosc 1.21.4 h0f2a231_0 conda-forge
brotli 1.0.9 h166bdaf_8 conda-forge
brotli-bin 1.0.9 h166bdaf_8 conda-forge
brotlipy 0.7.0 py310h5764c6d_1005 conda-forge
brunsli 0.1 h9c3ff4c_0 conda-forge
bump2version 1.0.1 pyh9f0ad1d_0 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.19.1 hd590300_0 conda-forge
c-blosc2 2.9.3 hb4ffafa_0 conda-forge
ca-certificates 2023.5.7 hbcca054_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cachetools 5.3.0 pyhd8ed1ab_0 conda-forge
caerus 0.1.9 pypi_0 pypi
cairo 1.16.0 ha61ee94_1014 conda-forge
certifi 2023.5.7 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py310h255011f_3 conda-forge
cfitsio 4.2.0 hd9d235c_0 conda-forge
charls 2.4.2 h59595ed_0 conda-forge
charset-normalizer 3.1.0 pyhd8ed1ab_0 conda-forge
click 8.1.3 unix_pyhd8ed1ab_2 conda-forge
cloudpickle 2.2.1 pyhd8ed1ab_0 conda-forge
colorama 0.4.6 pyhd8ed1ab_0 conda-forge
commonmark 0.9.1 py_0 conda-forge
contourpy 1.1.0 py310hd41b1e2_0 conda-forge
cryptography 41.0.1 py310h75e40e8_0 conda-forge
cuda-cudart 11.8.89 0 nvidia
cuda-cupti 11.8.87 0 nvidia
cuda-libraries 11.8.0 0 nvidia
cuda-nvrtc 11.8.89 0 nvidia
cuda-nvtx 11.8.86 0 nvidia
cuda-runtime 11.8.0 0 nvidia
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
cytoolz 0.12.0 py310h5764c6d_1 conda-forge
dask-core 2023.6.0 pyhd8ed1ab_0 conda-forge
dav1d 1.2.1 hd590300_0 conda-forge
dbus 1.13.6 h5008d03_3 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
deprecated 1.2.14 pyh1a96a4e_0 conda-forge
docutils 0.19 py310hff52083_1 conda-forge
exceptiongroup 1.1.1 pyhd8ed1ab_0 conda-forge
execnet 1.9.0 pyhd8ed1ab_0 conda-forge
executing 1.2.0 pyhd8ed1ab_0 conda-forge
expat 2.5.0 hcb278e6_1 conda-forge
fftw 3.3.10 nompi_hc118613_108 conda-forge
filelock 3.12.2 pyhd8ed1ab_0 conda-forge
findpeaks 2.4.7 pypi_0 pypi
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.14.2 h14ed4e7_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
fonttools 4.40.0 py310h2372a71_0 conda-forge
freetype 2.12.1 hca18f0e_1 conda-forge
frozenlist 1.3.3 py310h5764c6d_0 conda-forge
fsspec 2023.6.0 pyh1a96a4e_0 conda-forge
future 0.18.3 pyhd8ed1ab_0 conda-forge
gettext 0.21.1 h27087fc_0 conda-forge
giflib 5.2.1 h0b41bf4_3 conda-forge
gitdb 4.0.10 pyhd8ed1ab_0 conda-forge
gitpython 3.1.31 pyhd8ed1ab_0 conda-forge
glib 2.76.3 hfc55251_0 conda-forge
glib-tools 2.76.3 hfc55251_0 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
gmpy2 2.1.2 py310h3ec546c_1 conda-forge
google-auth 2.20.0 pyh1a96a4e_0 conda-forge
google-auth-oauthlib 1.0.0 pyhd8ed1ab_0 conda-forge
graphite2 1.3.13 h58526e2_1001 conda-forge
grpcio 1.55.1 py310h1b8f574_1 conda-forge
gst-plugins-base 1.22.0 h4243ec0_2 conda-forge
gstreamer 1.22.0 h25f0c4b_2 conda-forge
gstreamer-orc 0.4.34 hd590300_0 conda-forge
h5py 3.9.0 nompi_py310h367e799_100 conda-forge
harfbuzz 6.0.0 h8e241bc_0 conda-forge
hdf5 1.14.0 nompi_hb72d44e_103 conda-forge
hydra-core 1.2.0 pyhd8ed1ab_0 conda-forge
hypothesis 6.79.3 pyha770c72_0 conda-forge
icu 70.1 h27087fc_0 conda-forge
idna 3.4 pyhd8ed1ab_0 conda-forge
imagecodecs 2023.1.23 py310ha3ed6a1_0 conda-forge
imageio 2.31.1 pyh24c5eb1_0 conda-forge
imagesize 1.4.1 pyhd8ed1ab_0 conda-forge
importlib-metadata 6.7.0 pyha770c72_0 conda-forge
importlib_metadata 6.7.0 hd8ed1ab_0 conda-forge
importlib_resources 5.12.0 pyhd8ed1ab_0 conda-forge
iniconfig 2.0.0 pyhd8ed1ab_0 conda-forge
ipython 8.14.0 pyh41d4057_0 conda-forge
jack 1.9.22 h11f4161_0 conda-forge
jedi 0.18.2 pyhd8ed1ab_0 conda-forge
jinja2 3.1.2 pyhd8ed1ab_1 conda-forge
joblib 1.2.0 pyhd8ed1ab_0 conda-forge
jpeg 9e h0b41bf4_3 conda-forge
jxrlib 1.1 h7f98852_2 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
kiwisolver 1.4.4 py310hbf28c38_1 conda-forge
krb5 1.20.1 h81ceb04_0 conda-forge
lame 3.100 h166bdaf_1003 conda-forge
lazy_loader 0.2 pyhd8ed1ab_0 conda-forge
lcms2 2.15 hfd0df8a_0 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
lerc 4.0.0 h27087fc_0 conda-forge
libabseil 20230125.2 cxx17_h59595ed_2 conda-forge
libaec 1.0.6 hcb278e6_1 conda-forge
libavif 0.11.1 h8182462_2 conda-forge
libblas 3.9.0 16_linux64_mkl conda-forge
libbrotlicommon 1.0.9 h166bdaf_8 conda-forge
libbrotlidec 1.0.9 h166bdaf_8 conda-forge
libbrotlienc 1.0.9 h166bdaf_8 conda-forge
libcap 2.67 he9d0100_0 conda-forge
libcblas 3.9.0 16_linux64_mkl conda-forge
libclang 15.0.7 default_h7634d5b_2 conda-forge
libclang13 15.0.7 default_h9986a30_2 conda-forge
libcublas 11.11.3.6 0 nvidia
libcufft 10.9.0.58 0 nvidia
libcufile 1.6.1.9 0 nvidia
libcups 2.3.3 h36d4200_3 conda-forge
libcurand 10.3.2.106 0 nvidia
libcurl 8.1.2 h409715c_0 conda-forge
libcusolver 11.4.1.48 0 nvidia
libcusparse 11.7.5.86 0 nvidia
libdb 6.2.32 h9c3ff4c_0 conda-forge
libdeflate 1.17 h0b41bf4_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.10 h28343ad_4 conda-forge
libexpat 2.5.0 hcb278e6_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libflac 1.4.3 h59595ed_0 conda-forge
libgcc-ng 13.1.0 he5830b7_0 conda-forge
libgcrypt 1.10.1 h166bdaf_0 conda-forge
libgfortran-ng 13.1.0 h69a702a_0 conda-forge
libgfortran5 13.1.0 h15d22d2_0 conda-forge
libglib 2.76.3 hebfc3b9_0 conda-forge
libgpg-error 1.47 h71f35ed_0 conda-forge
libgrpc 1.55.1 h59456c1_1 conda-forge
libhwloc 2.9.1 hd6dc26d_0 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
liblapack 3.9.0 16_linux64_mkl conda-forge
libllvm14 14.0.6 hcd5def8_3 conda-forge
libllvm15 15.0.7 hadd5161_1 conda-forge
libnghttp2 1.52.0 h61bc06f_0 conda-forge
libnpp 11.8.0.86 0 nvidia
libnsl 2.0.0 h7f98852_0 conda-forge
libnvjpeg 11.9.0.86 0 nvidia
libogg 1.3.4 h7f98852_1 conda-forge
libopus 1.3.1 h7f98852_1 conda-forge
libpng 1.6.39 h753d276_0 conda-forge
libpq 15.3 hbcd7760_1 conda-forge
libprotobuf 4.23.2 hd1fb520_5 conda-forge
libsndfile 1.2.0 hb75c966_0 conda-forge
libsqlite 3.42.0 h2797004_0 conda-forge
libssh2 1.11.0 h0841786_0 conda-forge
libstdcxx-ng 13.1.0 hfd8a6a1_0 conda-forge
libsystemd0 253 h8c4010b_1 conda-forge
libtiff 4.5.0 h6adf6a1_2 conda-forge
libtool 2.4.7 h27087fc_0 conda-forge
libudev1 253 h0b41bf4_1 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libvorbis 1.3.7 h9c3ff4c_0 conda-forge
libwebp-base 1.3.0 h0b41bf4_0 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxkbcommon 1.5.0 h79f4944_1 conda-forge
libxml2 2.10.3 hca2bb57_4 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
libzopfli 1.0.3 h9c3ff4c_0 conda-forge
lightning-utilities 0.8.0 pypi_0 pypi
line_profiler 4.0.3 py310hd41b1e2_0 conda-forge
llvm-openmp 16.0.6 h4dfa4b3_0 conda-forge
llvmlite 0.40.0 py310h1b8f574_0 conda-forge
locket 1.0.0 pyhd8ed1ab_0 conda-forge
lz4-c 1.9.4 hcb278e6_0 conda-forge
markdown 3.4.3 pyhd8ed1ab_0 conda-forge
markupsafe 2.1.3 py310h2372a71_0 conda-forge
mat73 0.60 pypi_0 pypi
matplotlib 3.7.1 py310hff52083_0 conda-forge
matplotlib-base 3.7.1 py310he60537e_0 conda-forge
matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge
mkl 2022.2.1 h84fe81f_16997 conda-forge
mpc 1.3.1 hfe3b2da_0 conda-forge
mpfr 4.2.0 hb012696_0 conda-forge
mpg123 1.31.3 hcb278e6_0 conda-forge
mpmath 1.3.0 pyhd8ed1ab_0 conda-forge
multidict 6.0.4 py310h1fa729e_0 conda-forge
munkres 1.1.4 pyh9f0ad1d_0 conda-forge
mypy-extensions 1.0.0 pypi_0 pypi
mysql-common 8.0.33 hf1915f5_0 conda-forge
mysql-libs 8.0.33 hca2cd23_0 conda-forge
ncurses 6.4 hcb278e6_0 conda-forge
networkx 3.1 pyhd8ed1ab_0 conda-forge
nspr 4.35 h27087fc_0 conda-forge
nss 3.89 he45b914_0 conda-forge
numba 0.57.0 py310h0f6aa51_2 conda-forge
numpy 1.21.6 py310h45f3432_0 conda-forge
oauthlib 3.2.2 pyhd8ed1ab_0 conda-forge
omegaconf 2.2.3 pyhd8ed1ab_0 conda-forge
opencv-python 4.7.0.72 pypi_0 pypi
openjpeg 2.5.0 hfec8fc6_2 conda-forge
openssl 3.1.1 hd590300_1 conda-forge
packaging 23.1 pyhd8ed1ab_0 conda-forge
pandas 2.0.2 py310h7cbd5c2_0 conda-forge
parso 0.8.3 pyhd8ed1ab_0 conda-forge
partd 1.4.0 pyhd8ed1ab_0 conda-forge
pathspec 0.11.1 pypi_0 pypi
patsy 0.5.3 pyhd8ed1ab_0 conda-forge
pcre2 10.40 hc3806b6_0 conda-forge
peakdetect 1.1 pypi_0 pypi
pexpect 4.8.0 pyh1a96a4e_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 9.4.0 py310h023d228_1 conda-forge
pip 23.1.2 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
platformdirs 3.8.0 pyhd8ed1ab_0 conda-forge
pluggy 1.2.0 pyhd8ed1ab_0 conda-forge
ply 3.11 py_1 conda-forge
pooch 1.7.0 pyha770c72_3 conda-forge
prompt-toolkit 3.0.38 pyha770c72_0 conda-forge
prompt_toolkit 3.0.38 hd8ed1ab_0 conda-forge
protobuf 4.23.2 py310hb875b13_1 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pulseaudio 16.1 hcb278e6_3 conda-forge
pulseaudio-client 16.1 h5195f5e_3 conda-forge
pulseaudio-daemon 16.1 ha8d29e2_3 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pyasn1 0.4.8 py_0 conda-forge
pyasn1-modules 0.2.7 py_0 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pydantic 1.10.9 py310h2372a71_0 conda-forge
pydata-sphinx-theme 0.13.3 pyhd8ed1ab_0 conda-forge
pygments 2.15.1 pyhd8ed1ab_0 conda-forge
pyjwt 2.7.0 pyhd8ed1ab_0 conda-forge
pyopenssl 23.2.0 pyhd8ed1ab_1 conda-forge
pyparsing 3.1.0 pyhd8ed1ab_0 conda-forge
pyqt 5.15.7 py310hab646b1_3 conda-forge
pyqt5-sip 12.11.0 py310heca2aa9_3 conda-forge
pysocks 1.7.1 pyha2e5f31_6 conda-forge
pytest 7.4.0 pyhd8ed1ab_0 conda-forge
pytest-xdist 3.3.1 pyhd8ed1ab_0 conda-forge
python 3.10.12 hd12c33a_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-tzdata 2023.3 pyhd8ed1ab_0 conda-forge
python_abi 3.10 3_cp310 conda-forge
pytorch 2.0.0 py3.10_cuda11.8_cudnn8.7.0_0 pytorch
pytorch-cuda 11.8 h7e8668a_5 pytorch
pytorch-lightning 1.8.6 pypi_0 pypi
pytorch-mutex 1.0 cuda pytorch
pytz 2023.3 pyhd8ed1ab_0 conda-forge
pyu2f 0.1.5 pyhd8ed1ab_0 conda-forge
pywavelets 1.4.1 py310h0a54255_0 conda-forge
pyyaml 6.0 py310h5764c6d_5 conda-forge
qt-main 5.15.8 h5d23da1_6 conda-forge
re2 2023.03.02 h8c504da_0 conda-forge
readline 8.2 h8228510_1 conda-forge
recommonmark 0.7.1 pyhd8ed1ab_0 conda-forge
requests 2.31.0 pyhd8ed1ab_0 conda-forge
requests-oauthlib 1.3.1 pyhd8ed1ab_0 conda-forge
rsa 4.9 pyhd8ed1ab_0 conda-forge
scikit-image 0.20.0 py310h9b08913_1 conda-forge
scikit-learn 1.2.2 py310hf7d194e_2 conda-forge
scipy 1.11.0 py310ha4c1d20_0 conda-forge
seaborn 0.12.2 hd8ed1ab_0 conda-forge
seaborn-base 0.12.2 pyhd8ed1ab_0 conda-forge
setuptools 68.0.0 pyhd8ed1ab_0 conda-forge
sip 6.7.9 py310hc6cd4ac_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
smmap 3.0.5 pyh44b312d_0 conda-forge
snappy 1.1.10 h9fff704_0 conda-forge
snowballstemmer 2.2.0 pyhd8ed1ab_0 conda-forge
sortedcontainers 2.4.0 pyhd8ed1ab_0 conda-forge
soupsieve 2.3.2.post1 pyhd8ed1ab_0 conda-forge
sphinx 5.3.0 pyhd8ed1ab_0 conda-forge
sphinx-autodoc-typehints 1.21.8 pyhd8ed1ab_0 conda-forge
sphinx-markdown-tables 0.0.17 pypi_0 pypi
sphinxcontrib-applehelp 1.0.4 pyhd8ed1ab_0 conda-forge
sphinxcontrib-devhelp 1.0.2 py_0 conda-forge
sphinxcontrib-htmlhelp 2.0.1 pyhd8ed1ab_0 conda-forge
sphinxcontrib-jsmath 1.0.1 py_0 conda-forge
sphinxcontrib-qthelp 1.0.3 py_0 conda-forge
sphinxcontrib-serializinghtml 1.1.5 pyhd8ed1ab_2 conda-forge
spline 0.11.1dev0 np121py310h93a0a19_1 haydnspass/label/dev
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
statsmodels 0.14.0 py310h278f3c1_1 conda-forge
structlog 23.1.0 pyhd8ed1ab_0 conda-forge
sympy 1.12 pypyh9d50eac_103 conda-forge
tbb 2021.9.0 hf52228f_0 conda-forge
tensorboard 2.13.0 pyhd8ed1ab_0 conda-forge
tensorboard-data-server 0.7.0 py310h34c0648_0 conda-forge
tensorboardx 2.6.1 pypi_0 pypi
threadpoolctl 3.1.0 pyh8a188c0_0 conda-forge
tifffile 2023.4.12 pyhd8ed1ab_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
toml 0.10.2 pyhd8ed1ab_0 conda-forge
tomli 2.0.1 pyhd8ed1ab_0 conda-forge
toolz 0.12.0 pyhd8ed1ab_0 conda-forge
torchmetrics 0.11.0 pypi_0 pypi
torchtriton 2.0.0 py310 pytorch
tornado 6.3.2 py310h2372a71_0 conda-forge
tqdm 4.65.0 pyhd8ed1ab_1 conda-forge
traitlets 5.9.0 pyhd8ed1ab_0 conda-forge
typing-extensions 4.6.3 hd8ed1ab_0 conda-forge
typing_extensions 4.6.3 pyha770c72_0 conda-forge
tzdata 2023c h71feb2d_0 conda-forge
unicodedata2 15.0.0 py310h5764c6d_0 conda-forge
urllib3 1.26.15 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.6 pyhd8ed1ab_0 conda-forge
werkzeug 2.3.6 pyhd8ed1ab_0 conda-forge
wget 3.2 pypi_0 pypi
wheel 0.40.0 pyhd8ed1ab_0 conda-forge
wrapt 1.15.0 py310h1fa729e_0 conda-forge
xcb-util 0.4.0 h516909a_0 conda-forge
xcb-util-image 0.4.0 h166bdaf_0 conda-forge
xcb-util-keysyms 0.4.0 h516909a_0 conda-forge
xcb-util-renderutil 0.3.9 h166bdaf_0 conda-forge
xcb-util-wm 0.4.1 h516909a_0 conda-forge
xkeyboard-config 2.38 h0b41bf4_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.1.1 hd590300_0 conda-forge
xorg-libsm 1.2.4 h7391055_0 conda-forge
xorg-libx11 1.8.4 h0b41bf4_0 conda-forge
xorg-libxau 1.0.11 hd590300_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h0b41bf4_2 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h0b41bf4_1003 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
yarl 1.9.2 py310h2372a71_0 conda-forge
zernike 0.0.32 pypi_0 pypi
zfp 1.0.0 h27087fc_3 conda-forge
zipp 3.15.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 hd590300_5 conda-forge
zlib-ng 2.0.7 h0b41bf4_0 conda-forge
zstd 1.5.2 h3eb15da_6 conda-forge