High GPU memory overhead

I was doing a GPU memory usage study for deployment and noticed that by default PyTorch blocks out ~800 MB of VRAM. I thought it might be model-specific, so I tried moving a Linear(1, 1) layer to the GPU and it took the same amount of memory, but moving a second Linear(1, 1) didn't take any additional memory.
This happens in both PyTorch 1.4 and 1.5.
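
Here's a minimal sketch of the reproduction (process memory observed via nvidia-smi; exact figures vary by setup):

```python
import torch
import torch.nn as nn

# Tiny model: just two float32 parameters (weight + bias).
layer = nn.Linear(1, 1)

# The first .cuda() call initializes the CUDA context; nvidia-smi
# reports several hundred MB for this process from this point on,
# even though the parameters themselves are a handful of bytes.
layer.cuda()

# PyTorch's allocator statistics only count tensor storage, so they
# show the tiny true footprint (~1 KB after block rounding).
print(torch.cuda.memory_allocated())

# A second tiny layer reuses the existing context, so nvidia-smi
# shows no further increase.
layer2 = nn.Linear(1, 1).cuda()
```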

This memory is most likely allocated by the CUDA context, not by parameters or tensors.
Depending on the CUDA version, the device used, etc., you might see an allocation of ~700 MB or more.
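
As a quick check, you can initialize CUDA without allocating any tensors and see that the overhead appears anyway (a sketch; the reported numbers depend on your driver and device):

```python
import torch

# torch.cuda.init() forces context creation without allocating any
# tensors; the ~700+ MB shows up in nvidia-smi immediately.
torch.cuda.init()

# The caching allocator reports zero bytes, confirming the memory
# belongs to the context (driver state, kernel images), not tensors.
print(torch.cuda.memory_allocated())  # 0
print(torch.cuda.memory_reserved())   # 0
```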