I am working with pretrained models from OpenCLIP.
Once I load a CLIP model, does doing the following actually remove the rest of the components (the text encoder, etc.) from GPU memory, leaving only the visual encoder?
model, _, _ = open_clip.create_model_and_transforms()
model = model.visual
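To check whether the memory is actually released, I figure I can compare torch.cuda.memory_allocated() before and after dropping the full model, roughly like this (the ViT-B-32 / openai names are just an example checkpoint):

import gc
import torch
import open_clip

# Example checkpoint; any OpenCLIP model name / pretrained tag would do here
model, _, _ = open_clip.create_model_and_transforms('ViT-B-32', pretrained='openai')
model = model.cuda()
print(f"full model:  {torch.cuda.memory_allocated() / 1e6:.1f} MB")

# Keep only the visual tower and drop the reference to the full model
visual = model.visual
del model
gc.collect()
torch.cuda.empty_cache()
print(f"visual only: {torch.cuda.memory_allocated() / 1e6:.1f} MB")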
Or should I instead be doing something more explicit, like this:
import gc
import torch
import open_clip
full_model, _, _ = open_clip.create_model_and_transforms()
# Isolate the visual component
visual_model = full_model.visual
# Create a new state dict with only the visual component
visual_state_dict = {k: v for k, v in full_model.state_dict().items() if k.startswith('visual.')}
# Create a new model with only the visual component
isolated_model = torch.nn.Module()
isolated_model.visual = visual_model
# Load the visual state dict into the new model
isolated_model.load_state_dict(visual_state_dict, strict=False)
# Delete references to the full model and unused variables
del full_model, visual_model, visual_state_dict
# Force garbage collection
gc.collect()
# Clear CUDA cache if using GPU
if torch.cuda.is_available():
    torch.cuda.empty_cache()
Is there a better way to do this?