Stratified K-fold filling up my RAM too fast

skf = StratifiedKFold(n_splits=5)
skf.get_n_splits(train_csv)
i = 0
for idx1, idx2 in skf.split(train_csv['ID'], train_csv['Class']):
    model = timm.create_model('efficientnet_b1', pretrained=True)
    model.train()
    model.classifier = nn.Linear(in_features=1280, out_features=3, bias=True)
    model.to(device)
    optimizer1 = optim.Adam(model.parameters(), lr=0.001)
    optimizer2 = optim.SGD(model.parameters(), lr=0.001)
    scheduler1 = lr_scheduler.ReduceLROnPlateau(optimizer1, factor=0.33, mode="min", patience=4)
    scheduler2 = lr_scheduler.ReduceLROnPlateau(optimizer2, factor=0.33, mode="min", patience=4)
    train_data = train_csv.iloc[idx1]
    valid_data = train_csv.iloc[idx2]
    train_dataset = AgeDataset(train_data, IMG_DIR, train_transform())
    valid_dataset = AgeDataset(valid_data, IMG_DIR, valid_transform())
    train_loader = DataLoader(
        train_dataset,
        batch_size=4,
        shuffle=True,
        num_workers=workers,
    )
    valid_loader = DataLoader(
        valid_dataset,
        batch_size=4,
        shuffle=False,
        num_workers=workers,
    )
    model = train(model, train_loader, valid_loader, optimizer1, scheduler1, criterion, 5)
    model = train(model, train_loader, valid_loader, optimizer2, scheduler2, criterion, 5)
    i += 1
    torch.save(model, r"/content/drive/My Drive/Age Detection/effnet" + str(i) + ".pth")
    print("Saved Model " + str(i))
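One pattern that often helps in loops like this is explicitly dropping the per-fold objects before the next iteration starts, so the previous fold's model, optimizers, and loaders can actually be garbage-collected. A minimal sketch (the variable names stand in for the objects created inside the loop above):

```python
import gc

# Stand-ins for the per-fold objects created inside the loop
model = bytearray(1_000_000)
optimizer1 = object()
train_loader = object()

# At the end of each fold, drop the references so everything from the
# previous fold can be garbage-collected before the next fold begins
del model, optimizer1, train_loader
gc.collect()
# torch.cuda.empty_cache()  # additionally, if GPU memory is also growing
```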

This is how I implemented Stratified K-fold on my model, but this implementation is filling up my RAM too fast. Please suggest some changes or a more efficient implementation.

@ptrblck @albanD and others please help

Could you explain a bit, how fast your RAM is filled and in which iteration?
Are you seeing a contiguous increase in memory for each iteration?
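One way to answer that precisely is to log the process's peak resident set size at the end of each fold. A minimal sketch using the standard-library `resource` module (Unix/Colab only; on Linux, `ru_maxrss` is reported in KB):

```python
import resource

def peak_rss_mb() -> float:
    """Peak resident set size of this process in MB (Linux reports ru_maxrss in KB)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

# e.g. at the end of each fold:
# print(f"fold {i}: peak RSS so far = {peak_rss_mb():.0f} MB")
```

If the reported number jumps by roughly the same amount every fold, something per-fold is being kept alive.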

Actually, I used 10 epochs for training and my RAM was filled up after only 5 epochs. I am using the same code, though I think the code is not perfect :sweat_smile:

Is this the entire code for the model training or are you storing some output tensors, losses or other objects somewhere in the code?
Also, are you running out of GPU memory or system RAM?
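To illustrate why storing output tensors or losses matters: in PyTorch, appending `loss` instead of `loss.item()` (or `loss.detach()`) to a list keeps the whole autograd graph of every step alive. A library-agnostic sketch of the same failure mode, with a stand-in class in place of a tensor:

```python
import gc

class FakeLoss:
    """Stand-in for a loss tensor that still holds a large buffer (its autograd graph)."""
    def __init__(self):
        self.payload = bytearray(5_000_000)  # ~5 MB each

history_bad, history_ok = [], []
for step in range(3):
    loss = FakeLoss()
    history_bad.append(loss)              # keeps every step's object (and buffer) alive
    history_ok.append(len(loss.payload))  # storing a plain number lets old objects be freed

# All three "losses" are still reachable through history_bad
alive = sum(isinstance(o, FakeLoss) for o in gc.get_objects())
```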

I’m running out of system RAM.
At the end of the code, I save the models.

That should be alright, as you are storing them on your drive.
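One note on the saving itself: `torch.save(model, ...)` pickles the entire module, while saving only the `state_dict` produces a smaller checkpoint and doesn't tie it to the class definition. A sketch using a stand-in layer (the same applies to the full EfficientNet):

```python
import torch
import torch.nn as nn

# Stand-in for the classifier head used in the code above
model = nn.Linear(in_features=1280, out_features=3, bias=True)

# Save just the parameters rather than pickling the whole module
sd = model.state_dict()
# torch.save(sd, path)  # e.g. one checkpoint per fold

# Reload: rebuild the architecture, then load the parameters back in
fresh = nn.Linear(in_features=1280, out_features=3, bias=True)
fresh.load_state_dict(sd)
```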

Could you post a code snippet to reproduce this issue, please?
The model and the input data shapes would probably be enough so that we could run it locally.

train_csv’s shape is (19906, 2)
The model is efficientnet_b0 from timm
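Given those shapes, a synthetic stand-in for the data is enough to run the fold loop locally. A sketch (the three-class assumption comes from the `out_features=3` head in the code above; column names match the original `skf.split` call):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for train_csv: 19906 rows, two columns, 3 assumed classes
rng = np.random.default_rng(0)
train_csv = pd.DataFrame({
    "ID": [f"img_{i}.jpg" for i in range(19906)],
    "Class": rng.integers(0, 3, size=19906),
})

skf = StratifiedKFold(n_splits=5)
# Validation-fold sizes; each fold covers a disjoint fifth of the data
fold_sizes = [len(idx2) for _, idx2 in skf.split(train_csv["ID"], train_csv["Class"])]
```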