Memory management

Recently, I looked into PyTorch's memory management mechanism and found that memory blocks are allocated and cached separately for different streams, which I think can cause memory over-allocation in some cases.
I wonder why the developers designed the mechanism this way and what factors they considered.
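
To illustrate the kind of over-allocation I mean, here is a minimal toy model in plain Python. It is only a sketch of my understanding, not PyTorch's actual allocator: the class `ToyStreamCachingAllocator`, its methods, and the stream ids are all hypothetical, and the one assumption it encodes is that a freed block is cached for the stream it was allocated on and is only reused by that same stream.

```python
from collections import defaultdict


class ToyStreamCachingAllocator:
    """Toy model (hypothetical, not PyTorch code): freed blocks are cached
    per stream and can only be reused by the stream that allocated them."""

    def __init__(self):
        self.reserved = 0   # bytes requested from the simulated device
        self.allocated = 0  # bytes currently handed out
        self._free = defaultdict(list)  # stream id -> sizes of cached free blocks

    def malloc(self, size, stream):
        free_blocks = self._free[stream]
        # Look for the smallest cached block on this stream that fits.
        candidates = [b for b in free_blocks if b >= size]
        if candidates:
            block = min(candidates)
            free_blocks.remove(block)
        else:
            # Nothing reusable on this stream: reserve new device memory,
            # even if another stream has a suitable free block cached.
            block = size
            self.reserved += block
        self.allocated += block
        return block, stream

    def free(self, handle):
        block, stream = handle
        self.allocated -= block
        # The block goes back to the cache of its own stream.
        self._free[stream].append(block)


def demo():
    mb = 1 << 20
    alloc = ToyStreamCachingAllocator()
    a = alloc.malloc(100 * mb, stream=0)
    alloc.free(a)
    # Same size, different stream: the cached block from stream 0 is not reused.
    alloc.malloc(100 * mb, stream=1)
    return alloc.allocated // mb, alloc.reserved // mb


if __name__ == "__main__":
    allocated_mb, reserved_mb = demo()
    print(f"allocated: {allocated_mb} MB, reserved: {reserved_mb} MB")
```

In this model only 100 MB is in use at the end, but 200 MB has been reserved. If my reading of the allocator is right, I would expect a similar gap between `torch.cuda.memory_allocated()` and `torch.cuda.memory_reserved()` when the same pattern is run on two CUDA streams.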

Thanks and best wishes