Hello Torch users,
I’m currently training a 3D ResNet18 on fMRI volumes of shape [27, 75, 93, 81]. A single epoch does not finish within 48 hours, even on two A100 GPUs.
I have already converted my data to NumPy arrays up front to speed up loading.
I use the following code to run the model on both GPUs:
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    # dim = 0 [30, xxx] -> [10, ...], [10, ...], [10, ...] on 3 GPUs
    model = nn.DataParallel(model)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)
By the way, how can I verify that both GPUs are actually being used?
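The only programmatic probe I know of is polling `torch.cuda.memory_allocated` for each device (besides watching `nvidia-smi` in another terminal). A minimal sketch; `gpu_memory_report` is just a helper name I made up:

```python
import torch

def gpu_memory_report():
    """Return (device, allocated-bytes) pairs for every visible GPU.

    Idea: call this after a forward pass; nonzero allocation on both
    cuda:0 and cuda:1 would suggest DataParallel really split the batch.
    """
    if not torch.cuda.is_available():
        return [("cpu", 0)]  # fallback so the snippet also runs on CPU-only machines
    return [(f"cuda:{i}", torch.cuda.memory_allocated(i))
            for i in range(torch.cuda.device_count())]

print(gpu_memory_report())
```

Is that a reliable way to tell, or is there something better?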
I use this for my train loader:
train_loader = torch.utils.data.DataLoader(train_set,
                                           batch_size=64,
                                           shuffle=True,
                                           num_workers=0)
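One variant I was wondering about is raising `num_workers` above 0 and enabling `pin_memory`. A sketch with dummy tensors that only mimic my per-sample fMRI shape (the real `train_set` would replace them):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-in with the same per-sample shape as my fMRI volumes.
dummy_set = TensorDataset(torch.randn(2, 27, 75, 93, 81),
                          torch.zeros(2, dtype=torch.long))

train_loader = DataLoader(dummy_set,
                          batch_size=2,
                          shuffle=True,
                          num_workers=2,    # load batches in parallel worker processes
                          pin_memory=True)  # page-locked memory for faster GPU copies

for x, y in train_loader:
    print(x.shape)  # torch.Size([2, 27, 75, 93, 81])
```

Would that help here, and what would a sensible `num_workers` be for data this large?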
Any ideas or tricks to speed up training?