Question about tensor parallel (DTensor, parallelize_module)

It's been a while since you posted this, but I had a similar-looking issue and resolved it with this: Torch not able to utilize GPU ram properly - #6 by Tyan