Tensor parallel spawns additional processes on GPU0 and uses additional memory

I solved my own issue. I guess it was caused by a well-known issue: Torch not able to utilize GPU ram properly - #6 by Tyan