I am training a semantic segmentation model with the PyTorch framework, using the same batch_size and num_workers in both the train and val stages.
Interestingly, during the train stage the CPU utilization of the main process is 100%, while the CPU utilization of each of the Python worker sub-processes (spawned by num_workers) is only about 25%, and the GPU utilization is 95%.
When the val stage is reached, something interesting happens: the CPU utilization of the main process stays around 100%, but the CPU utilization of the worker sub-processes soars to about 70%, while the GPU utilization drops to 85%.
In fact, I use the same transform rules for the validation data as for the training data:
# ------------ train data ------------
transform_train = Compose(Resize(resize_shape), Rotation(2),
                          Normalize(mean=mean, std=std))
train_dataset = My_Dataset(Dataset_Path, "train", transform_train)
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True,
                          collate_fn=train_dataset.collate, num_workers=5, pin_memory=False)
# ------------ val data ------------
transform_val_img = Resize(resize_shape)
transform_val_x = Compose(ToTensor(), Normalize(mean=mean, std=std))
transform_val = Compose(transform_val_img, transform_val_x) # Resize --> ToTensor --> Normalize
val_dataset = My_Dataset(Dataset_Path, "val", transform_val)
val_loader = DataLoader(val_dataset, batch_size=8, collate_fn=val_dataset.collate,
                        num_workers=5)
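If it helps to narrow this down, here is a rough way I could check whether the busier workers actually slow the val stage: time how long each iteration waits on the DataLoader versus how long the forward pass takes. This is only a sketch, not my actual evaluation loop; the model, device, and the (imgs, targets) unpacking are placeholders and depend on My_Dataset.collate.

import time
import torch

model.eval()
data_time = 0.0
compute_time = 0.0
end = time.perf_counter()
with torch.no_grad():
    for imgs, targets in val_loader:      # unpacking is a placeholder, depends on My_Dataset.collate
        t0 = time.perf_counter()
        data_time += t0 - end             # time spent waiting on the worker processes
        imgs = imgs.to(device)
        outputs = model(imgs)
        torch.cuda.synchronize()          # make sure the forward pass is fully included in the timing
        end = time.perf_counter()
        compute_time += end - t0
print(f"data loading: {data_time:.1f}s, forward pass: {compute_time:.1f}s")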