I’m trying to experiment with different model architectures in a single Jupyter notebook.
At the end of the notebook, I want to run all n models I trained above on the test data.
But my GPU doesn’t have enough memory to hold n different models at once, so I need to free up space before training the next model. At the same time, I need to preserve the parameters of all n models so that I can use each of them later for inference on the test set.
This is roughly what I’m going for:
train_dataloader = DataLoader(...)
test_dataloader = DataLoader(...)
# model_1
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.bla = nn.Conv2d(...)

    def forward(self, x):
        x = self.bla(x)
        return x

model_1 = Model()

# train model_1
for x, y in train_dataloader:
    pred = model_1(x)
    ...
# save model_1 params and free up the GPU
# train model_2
# save model_2 params and free up the GPU
# train model_3
# save model_3 params and free up the GPU
# finally test all three models together
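Concretely, the per-model loop I have in mind might look something like the sketch below. `make_model`, the toy tensors, and the fixed count of 3 are all placeholders for my real architectures and dataloaders; the point is just the pattern of training on the GPU, parking the finished model on the CPU, and freeing the GPU cache before the next one.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for one of the real architectures.
def make_model():
    return nn.Linear(8, 2)

# Hypothetical stand-ins for train_dataloader / test_dataloader.
train_batches = [(torch.randn(4, 8), torch.randint(0, 2, (4,))) for _ in range(3)]
test_batches = [torch.randn(4, 8)]

device = "cuda" if torch.cuda.is_available() else "cpu"
trained_models = []

for _ in range(3):  # n models
    model = make_model().to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in train_batches:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    # Park the trained weights on the CPU so the next model fits on the GPU.
    trained_models.append(model.cpu())
    del optimizer  # drop optimizer state that still references GPU memory
    if device == "cuda":
        torch.cuda.empty_cache()  # return cached blocks to the driver

# Later: bring each model back to the GPU one at a time for inference.
for model in trained_models:
    model.to(device).eval()
    with torch.no_grad():
        for x in test_batches:
            pred = model(x.to(device))
    model.cpu()
```

The key idea in this sketch is that `model.cpu()` moves the parameters off the GPU without deleting them, so nothing is lost between training runs.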
How do I free up space on the GPU without throwing away the trained model and without restarting the kernel?
I came across torch.cuda.empty_cache(). Can I safely use it in this case?