Does PyTorch overprovision VRAM?

A model uses 6 GB of VRAM, bringing total usage to 7.8/8 GB. When I double the input size, it only goes up to 6.1 GB (7.9/8 GB total). This shouldn't be possible if PyTorch allocates only the minimum it needs each time, and it's a problem because I can't reduce VRAM use by switching to a smaller model. Is there any remedy?
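One thing worth checking is whether the extra VRAM is live tensor data or just PyTorch's caching allocator holding freed blocks for reuse (what `nvidia-smi` reports is the reserved total, not what tensors actually occupy). A minimal sketch to compare the two, assuming a CUDA device is present (`report_vram` is just an illustrative helper name):

```python
import torch

def report_vram():
    # memory_allocated: bytes currently occupied by live tensors
    # memory_reserved: bytes PyTorch has claimed from the driver,
    # including cached-but-free blocks (this is what nvidia-smi sees)
    if not torch.cuda.is_available():
        print("no CUDA device available")
        return
    alloc = torch.cuda.memory_allocated() / 2**30
    reserved = torch.cuda.memory_reserved() / 2**30
    print(f"allocated: {alloc:.2f} GiB, reserved: {reserved:.2f} GiB")

report_vram()

# If reserved is much larger than allocated, the gap is cache, and
# torch.cuda.empty_cache() will return those unused blocks to the driver.
```

If the numbers diverge, calling `torch.cuda.empty_cache()` after freeing the large model should bring the reserved figure down before loading a smaller one.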