I’m testing what happens when I load the same model onto the GPU many times, like this:
import numpy as np

import model  # my own module wrapping the network

tt = np.zeros((512, 512, 3), dtype=np.uint8)  # dummy input image
ms = {}
while True:
    for i in range(300):
        if i not in ms:
            ms[i] = model.Model()  # load another copy of the model
        for k, v in ms.items():
            v.infer(tt)  # run inference with every copy loaded so far
        print(i, 'loaded')
    print(len(ms), 'length')
When I load just 1 model, it occupies 1 GB of GPU memory.
But when I load 160 models like above, they occupy only about 16 GB of GPU memory in total. Why doesn't GPU memory usage grow linearly with the number of models?
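For reference, here is a sanity check on the numbers. This is just a hypothetical cost model, not a measurement: it assumes total GPU usage is a fixed one-time overhead (e.g. CUDA context, framework workspace) plus a constant per-model cost, and fits those two unknowns to the two observations above.

```python
# Hypothetical linear cost model: total_gb = overhead + n * per_model
# Fit to the two observed points: (n=1, 1 GB) and (n=160, 16 GB).
per_model = (16.0 - 1.0) / (160 - 1)  # incremental cost of each extra model
overhead = 1.0 - per_model            # fixed cost paid once, on the first load

print(round(per_model, 3))  # GB per additional model
print(round(overhead, 3))   # GB of one-time overhead
```

If this model held, each extra copy would cost only ~0.094 GB, and most of the first model's 1 GB would be one-time overhead (~0.906 GB), which would explain why usage does not scale linearly.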