Hi, a few beginner questions:
I'm using a single 1080 Ti GPU.
My model is a simple feedforward net with 5 hidden layers of 100 ReLU units each.
I have datasets of roughly 50-450 KB; each dataset is stored on my regular HDD as a .mat or .pt file, with the x's and y's stored as PyTorch tensors.
Right now I'm getting roughly 3-5% GPU utilization (per the Windows Task Manager), while dedicated GPU memory usage sits at around 5 GB out of 11 GB.
Now, here is my code:
import torch
import torch.utils.data as data_utils

# Load the full dataset from disk (x's and y's are already PyTorch tensors)
# and shuffle it once with a precomputed random permutation rand_idx.
x_train_org, y_train_org = torch.load('data.pt')
x_train_org = x_train_org[rand_idx, :]
y_train_org = y_train_org[rand_idx, :]

training_data = data_utils.TensorDataset(x_train_org, y_train_org)
train_data_loader = data_utils.DataLoader(training_data, batch_size=batch_size,
                                          shuffle=True, pin_memory=True)

for i in range(ep_num):
    estimator.train()
    for x, y in train_data_loader:
        # Move each mini-batch from host memory to the GPU.
        x, y = x.to(device), y.to(device)
        pred_log_probs = estimator(x)
        model_optimizer.zero_grad()
        loss1 = cost_func(pred_log_probs.permute([0, 2, 1]), y)
        loss1.backward()
        model_optimizer.step()

    # After each epoch, record the loss over the whole training set.
    estimator.eval()
    with torch.no_grad():
        pred_log_probs = estimator(x_train_org.to(device))
        train_loss[i + 1] = cost_func(pred_log_probs.permute([0, 2, 1]),
                                      y_train_org.to(device)).item()
So my questions are:
Is there something wrong with the flow of my code?
Should I call “.to(device)” on the training tensors directly, before feeding them to the DataLoader (see the first sketch below)?
Is there any reason to really use the DataLoader at all with a single GPU, when all the data is already sitting in PyTorch tensors (see the second sketch below)?
Is there anything I can do to speed things up?
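For question 2, this is roughly what I mean (a minimal sketch, assuming the whole dataset fits in GPU memory; x_train_org/y_train_org are the tensors from my code above):

# Hypothetical alternative: move the full dataset to the GPU once, up front,
# so the DataLoader yields batches that are already on the device.
x_gpu = x_train_org.to(device)
y_gpu = y_train_org.to(device)
training_data = data_utils.TensorDataset(x_gpu, y_gpu)
# pin_memory has to stay off here: pinning only applies to CPU (host) tensors.
train_data_loader = data_utils.DataLoader(training_data, batch_size=batch_size,
                                          shuffle=True)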
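And for question 3, this is the DataLoader-free version I have in mind (again just a sketch, reusing x_gpu/y_gpu from the previous snippet):

# Hypothetical manual batching: shuffle and slice the GPU tensors directly.
n = x_gpu.shape[0]
for epoch in range(ep_num):
    perm = torch.randperm(n, device=device)      # fresh shuffle every epoch
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        x, y = x_gpu[idx], y_gpu[idx]            # pure GPU indexing, no host copies
        pred_log_probs = estimator(x)
        model_optimizer.zero_grad()
        loss1 = cost_func(pred_log_probs.permute([0, 2, 1]), y)
        loss1.backward()
        model_optimizer.step()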