Using CUDA IPC memory handles

Hi everyone! I am trying to write a custom handle reader in PyTorch based on this topic.
I read the handle with torch.multiprocessing.reductions.rebuild_cuda_tensor and it works fine, but if I stop reading this handle, GPU memory leaks.

Can someone explain how this works and how to overwrite the handle to fix the memory leak?
Second, can I read the same handle with rebuild_cuda_tensor from multiple processes?
P.S. I use independent processes, so I cannot pass a Queue between them.