Hi guys,
I am trying to train a model on a modified COCO dataset. While the data (images and targets, code at the end of the post) is being loaded into the dataloaders, htop shows roughly 100 processes, each using about 60 GB of VIRT and about 40 GB of RES, but the overall Mem bar shows only 50 GB / 504 GB. How should I interpret this? How can I be sure that I will not use too much memory?
Could you take a look at my code and tell me whether I am doing this correctly?
class LoadDataset(Dataset):
    def __init__(self):
        self.images = []
        self.targets = []
        img_path, ann_path = (
            "path_to_images",
            "path_to_annotations_json",
        )
        coco_ds = torchvision.datasets.CocoDetection(img_path, ann_path)
        for i in range(0, len(coco_ds)):
            img, ann = coco_ds[i]
            for a in ann:
                images, targets = collate(
                    [img.copy(), img.copy()], [[a], [a]], coco_ds.coco
                )
                for t in targets:
                    self.targets.append(t)
                for image in images:
                    self.images.append(image)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        target = self.targets[idx]
        return (
            img,
            target,
        )
and later in the code: …
train_loader = DataLoader(LoadDataset(), batch_size=24, shuffle=True)
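In case it helps with the memory question, here is a minimal sketch of how I could ask the process itself for its peak resident memory instead of adding up htop rows. It uses only the standard library `resource` module and assumes a Unix system (`ru_maxrss` is reported in kilobytes on Linux and in bytes on macOS). `peak_rss_mib` is a hypothetical helper name, not part of my training code, and the list of byte arrays is only a stand-in for `self.images`.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of the current process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is in kilobytes on Linux and in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


before = peak_rss_mib()
# Stand-in for the image list built in __init__ (assumption, not real data).
data = [bytearray(1024 * 1024) for _ in range(64)]
after = peak_rss_mib()
print(f"peak RSS before: {before:.1f} MiB, after: {after:.1f} MiB")
```

Calling something like this right after the dataset is built would give one number per process, which I could then compare with the Mem bar.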