How to free graph after detach?

I want to use one network to extract features and then feed those features into a second network. Since the first network doesn't need a backward pass, I call detach() on its output. However, I find that the GPU memory used by the first network is not freed. Is there any way to free the GPU memory used by the first network after extracting the features?

The code is like:

features = model1(inputs).detach()   # cut the graph so gradients won't flow back into model1
outputs = model2(features)
loss = loss_func(outputs)
optimizer.zero_grad()
loss.backward()
optimizer.step()

You could detach and clone the original output, assign it to a new variable, and delete the original output afterwards.
However, wrapping the first model's forward pass in a with torch.no_grad() block should be easier and would avoid storing these intermediate activations in the first place.
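
For example, a minimal sketch of the no_grad() approach, reusing the names from your snippet (model1, model2, inputs, loss_func, optimizer), could look like this:

with torch.no_grad():
    features = model1(inputs)    # no graph is built here, so model1's activations are not kept

outputs = model2(features)       # only model2's graph is stored for the backward pass
loss = loss_func(outputs)
optimizer.zero_grad()
loss.backward()                  # gradients flow through model2 only
optimizer.step()

Since no graph is created inside the no_grad() block, features already has requires_grad=False, so the explicit .detach() call is no longer needed.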