How to understand memory usage with PyTorch

Tomash · November 18, 2020, 7:52pm

Hi guys,
I am trying to train a model on a modified COCO dataset. While the data (images and targets; code at the end of the post) is being loaded for the dataloaders, htop shows roughly 100 processes, each using 60 GB of VIRT and about 40 GB of RES, yet the overall Mem bar shows only 50 GB / 504 GB. How should I interpret this? How can I be sure that I will not use too much memory?

Could you look at my code and tell me whether I am doing this correctly?

import torchvision
from torch.utils.data import DataLoader, Dataset


class LoadDataset(Dataset):
    def __init__(self):
        # Every image and target is built up front and kept in RAM.
        self.images = []
        self.targets = []

        img_path, ann_path = (
            "path_to_images",
            "path_to_annotations_json",
        )
        coco_ds = torchvision.datasets.CocoDetection(img_path, ann_path)
        for i in range(len(coco_ds)):
            img, ann = coco_ds[i]
            for a in ann:
                # collate is defined elsewhere (not shown in this post).
                # Two copies of the image are passed in for every annotation.
                images, targets = collate(
                    [img.copy(), img.copy()], [[a], [a]], coco_ds.coco
                )
                self.targets.extend(targets)
                self.images.extend(images)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Return the pre-built (image, target) pair.
        return self.images[idx], self.targets[idx]

and later in the code: …

train_loader = DataLoader(LoadDataset(), batch_size=24, shuffle=True)
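
For the memory question above, a minimal sketch of how the numbers could be checked from inside the script instead of from htop, assuming a Linux machine with /proc mounted. The helper name memory_snapshot is hypothetical and not part of PyTorch. VmSize and VmRSS in /proc/<pid>/status correspond to the VIRT and RES columns in htop, and Threads is the number of threads in the process. htop lists threads as separate rows by default, and threads share the memory of their process, so the 100 identical rows are likely one process counted many times and their VIRT/RES values should not be added up.

```python
import os


def memory_snapshot(pid="self"):
    """Return VmSize, VmRSS and Threads for a process as reported by the kernel.

    Hypothetical helper for illustration; reads /proc/<pid>/status (Linux only).
    Returns an empty dict when that file is not available.
    """
    wanted = {"VmSize", "VmRSS", "Threads"}
    snapshot = {}
    status_path = f"/proc/{pid}/status"
    if not os.path.exists(status_path):
        return snapshot
    with open(status_path) as status_file:
        for line in status_file:
            # Lines look like "VmRSS:     123456 kB" or "Threads:    12".
            key, _, value = line.partition(":")
            if key in wanted:
                snapshot[key] = value.strip()
    return snapshot


# Memory of the current Python process at this point in the script.
print(memory_snapshot())
```

Calling memory_snapshot() once before and once after the dataset is constructed should show how much resident memory the preloaded images actually take, which can then be compared against the Mem bar.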