I’m trying to follow the tips in PyTorch’s performance tuning guide. One of them suggests trying Google’s tcmalloc.
Since I don’t have root permission on the server, I installed it locally and added the path to the `tcmalloc.so*` files to `LD_LIBRARY_PATH` and the path to the include files to `CPATH`. I then added the `# export LD_PRELOAD=<tcmalloc.so>:$LD_PRELOAD` comment right below the import statements in my Python code. But I don’t know whether PyTorch is actually making use of tcmalloc. Is there any way I can check or verify that?
P.S. I’m using host memory heavily so I thought that tcmalloc could help with that.
`export LD_PRELOAD` should not be a comment in your Python code; it should be exported in your current terminal before you launch Python.
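
If you prefer to start the job from a Python launcher instead of typing the export by hand, something like the following might work. The dynamic linker reads `LD_PRELOAD` when a process starts, so the variable has to be in the environment of the child process; setting it inside an already running interpreter does not affect that interpreter. This is only a sketch: `env_with_preload`, the library path, and `train.py` are made-up names for illustration.

```python
import os


def env_with_preload(lib_path, base_env=None):
    """Return a copy of the environment with lib_path prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Mirror `export LD_PRELOAD=<lib>:$LD_PRELOAD`, without a trailing colon
    # when nothing was preloaded before.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


# Hypothetical usage (path and script name are placeholders):
#   import subprocess, sys
#   subprocess.run([sys.executable, "train.py"],
#                  env=env_with_preload("/home/me/lib/libtcmalloc.so"))
```

The child interpreter then starts with the library preloaded, which is equivalent to exporting the variable in the shell first.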
Once this is done, you could use `LD_DEBUG=libs` to check whether the linker is indeed preloading the right library.
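
As an additional check from inside the running process, on Linux you could look at `/proc/self/maps`, which lists every file mapped into the current process. The sketch below assumes Linux and assumes the library file name contains `tcmalloc`; the helper names are made up for this example and are not part of PyTorch.

```python
from pathlib import Path


def find_loaded_libs(maps_text, needle="tcmalloc"):
    """Return sorted, unique mapped file paths whose file name contains `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # File-backed lines have six fields:
        # address perms offset dev inode pathname
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if needle in Path(path).name:
                paths.add(path)
    return sorted(paths)


def tcmalloc_in_this_process():
    """Return matching library paths for the current process ([] if none or not Linux)."""
    maps = Path("/proc/self/maps")
    if not maps.exists():
        return []
    return find_loaded_libs(maps.read_text())


if __name__ == "__main__":
    # Call this anywhere in your script, e.g. after `import torch`.
    hits = tcmalloc_in_this_process()
    print("tcmalloc mapped:" if hits else "tcmalloc not mapped", hits)
```

An empty list means the library was not mapped into the process, which usually means `LD_PRELOAD` was not set in the shell that launched Python.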