I want to use one network to extract features and then feed those features as inputs to a second network. Since the first network doesn't need a backward pass, I call detach() on its output. However, I find that the GPU memory used by the first network is not freed. Is there any way to free the GPU memory used by the first network after extracting the features?
My code looks like this:
features = model1(inputs).detach()
outputs = model2(features)
loss = loss_func(outputs)
optimizer.zero_grad()
loss.backward()
optimizer.step()
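
For reference, one variant that may be worth comparing against is running the first network under torch.no_grad(), so that autograd does not save its intermediate activations during the forward pass at all (detach() only cuts the graph after the forward pass has already built it). This is a minimal sketch, not a confirmed fix: the two nn.Linear layers, the tensor shapes, the targets tensor, MSELoss and SGD are all hypothetical stand-ins for the real model1, model2, loss_func and optimizer, and it runs on CPU for simplicity.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two networks in the question.
model1 = nn.Linear(8, 4)  # feature extractor, never trained
model2 = nn.Linear(4, 2)  # the network that is actually trained
loss_func = nn.MSELoss()  # assumed loss; the original loss_func is not shown
optimizer = torch.optim.SGD(model2.parameters(), lr=0.1)

inputs = torch.randn(16, 8)   # assumed input shape
targets = torch.randn(16, 2)  # assumed targets, not part of the original code

# Run the feature extractor without recording an autograd graph,
# so its intermediate activations are not kept for a backward pass.
model1.eval()
with torch.no_grad():
    features = model1(inputs)

# Only model2 takes part in the backward pass.
outputs = model2(features)
loss = loss_func(outputs, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Note that PyTorch's caching allocator keeps freed GPU memory reserved for reuse, so nvidia-smi can still report it as used even after the tensors are gone. torch.cuda.empty_cache() returns unused cached blocks to the driver, and torch.cuda.memory_allocated() shows how much memory live tensors actually occupy.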