My GPU utilization is about 1% during training when I work with an image dataset passed to a DataLoader, and increasing batch_size and num_workers does not help. However, when I work with CSV data and do not feed it through a DataLoader (I pass the whole dataset through the model without batches), the GPU is used and everything works fine, but only if I wrap the tensor in a Variable; when I try to put the tensor on the device with tensor.to(device), nothing happens and it runs on the CPU. Thanks for the help, I have been trying to fix this for a long time.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms
from torchvision import datasets
train_path = r'D:\Datasets\fruits\5857_1166105_bundle_archive\fruits-360\Training'
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
])
train_data = datasets.ImageFolder(train_path, transform=transform)
train_loader = torch.utils.data.DataLoader(
    train_data,
    batch_size=200,
    num_workers=4,
    shuffle=True,
)
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 3)
        self.fc1 = nn.Linear(16 * 6 * 6, 64)
        self.fc2 = nn.Linear(64, 131)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 6 * 6)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
device = torch.device("cuda")
net = Model().cuda()
import torch.optim as optim
loss_fn = nn.CrossEntropyLoss()
opt = optim.Adam(net.parameters())
from torch.autograd import Variable
for epoch in range(10):
    for i, data in enumerate(train_loader):
        inputs, labels = data[0], data[1]
        inputs = Variable(inputs).cuda()
        labels = Variable(labels).cuda()
        opt.zero_grad()  # reset gradients before the backward pass
        out = net(inputs)
        loss = loss_fn(out, labels)
        loss.backward()
        opt.step()
        print(loss.item())
What is the resolution of your images? Are they in JPG format? I noticed with my CPU (R5 2600X) that it was already at its limit decoding the images, so I lowered the resolution to 1000x1000 with an external image compressor and used the compressed images for the dataset. I assume your CPU spends most of its time on JPEG decoding.
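For what it is worth, one rough way to check this (a sketch that assumes the train_loader from the code above) is to time a pass over the loader with no model work at all:

import time

# Iterate the DataLoader without touching the model. If this alone takes
# about as long as a full training epoch, the time is going into disk
# reads / JPEG decoding / resizing on the CPU, not into the GPU.
# On Windows, with num_workers > 0, this has to run under the usual
# `if __name__ == "__main__":` guard.
start = time.time()
for images, labels in train_loader:
    pass
print(f"One pass over the loader: {time.time() - start:.1f} s")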
Remove the .cuda() after the optim.Adam() call. I did not know you could send the optimizer to a device; that is usually reserved for the model or tensors. Happy to be told otherwise.
Hello, thank you for your suggestion, but I was told to try putting the optimizer on the GPU (opt.cuda()) because my code earlier in this conversation is barely using the GPU. Do you have any suggestions, please? I have not been able to make this work for a really long time.
Oh, sorry, I do not know why I stated optim.cuda() so firmly! Thanks @harsha_g for pointing that out. I meant loss functions, but that still would not make any difference, because cross entropy has no parameters.
Also, did you try removing Variable from your code and sending the inputs/outputs to the GPU using .cuda()?
Sending the model and the input/output tensors to CUDA is enough. Can you check that, after sending the inputs/outputs to CUDA, they are really on the GPU? .device will help.
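Something like this (just a sketch, reusing the Model and train_loader names from the code above) would show where things actually live. One detail that often trips people up: .to(device) and .cuda() return a new tensor rather than moving the original in place, so the result has to be assigned back:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = Model().to(device)
print(next(net.parameters()).device)      # expect cuda:0

for inputs, labels in train_loader:
    # .to() is not in-place: without the assignment, the original
    # CPU tensors are what get fed to the model.
    inputs = inputs.to(device)
    labels = labels.to(device)
    print(inputs.device, labels.device)   # expect cuda:0 cuda:0
    break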
I have tried everything: removing Variable, using .to(device). I read all the posts about this and none of them helped. But when I work with a CSV dataset and do not use a DataLoader, it uses the GPU at about 60%, like in this example:
import torch
import torch.nn as nn
import numpy as np
import pandas as pd
from torch.utils.data import DataLoader
from torch.autograd import Variable
import torch.nn.functional as F
from sklearn import preprocessing
nor = preprocessing.MinMaxScaler()
train_path = r"D:\Datasets\house prices\train.csv"
test_path = r"D:\Datasets\house prices\test.csv"
train_file = pd.read_csv(train_path)
test_file = pd.read_csv(test_path)
train_file.drop("Id", axis=1, inplace=True)
test_file.drop("Id", axis=1, inplace=True)
train_file.fillna(train_file.median(), inplace=True)
test_file.fillna(train_file.median(), inplace=True)
y_train = train_file["SalePrice"].values
y_train.resize(1460, 1)
num_cols = list(train_file._get_numeric_data().columns)
num_cols.remove("SalePrice")
train_file = pd.get_dummies(train_file, drop_first=True, dummy_na=True)
test_file = pd.get_dummies(test_file, drop_first=True, dummy_na=True)
train_file = nor.fit_transform(train_file)
test_file = nor.fit_transform(test_file)
X_train = train_file
x1 = pd.DataFrame(train_file)
x2 = pd.DataFrame(test_file)
x2 = x2.align(x1, axis=1)[0]
X_test = x2.values
X_train = Variable(torch.from_numpy(X_train).float())
y_train = Variable(torch.from_numpy(y_train).float())
X_test = Variable(torch.from_numpy(X_test).float())
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.fc1 = nn.Linear(289, 128)
        self.fc2 = nn.Linear(128, 128)
        self.fc3 = nn.Linear(128, 128)
        self.fc4 = nn.Linear(128, 128)
        self.fc5 = nn.Linear(128, 1)
        self.drop = nn.Dropout(0.2)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.drop(F.relu(self.fc2(x)))
        x = self.drop(F.relu(self.fc3(x)))
        x = F.relu(self.fc4(x))
        x = F.relu(self.fc5(x))
        return x
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = Model().cuda()
loss_fn = nn.MSELoss()
opt = torch.optim.Adam(net.parameters())
epochs = 10000
train_loss = 0
for epoch in range(epochs):
    preds = net(X_train.cuda())
    opt.zero_grad()
    loss = loss_fn(preds, y_train.cuda())
    loss.backward()
    opt.step()
    train_loss += loss.item()
    if epoch % 100 == 0:
        print(f"Loss: {loss.item()}")
This works with no problems, so I think there might be some problem with the DataLoader when passing images to the model. I really appreciate your help and time, thank you so much.
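A quick way to test that hypothesis (a sketch reusing net, opt, and loss_fn from the first post) is to bypass the DataLoader entirely and train on synthetic batches of the same shape as the real data; if GPU utilization jumps here, the model and CUDA setup are fine and the image pipeline is the bottleneck:

# Synthetic 32x32 RGB batches with 131 fruit classes, created directly
# on the GPU, so no data loading or JPEG decoding is involved.
fake_inputs = torch.randn(200, 3, 32, 32, device="cuda")
fake_labels = torch.randint(0, 131, (200,), device="cuda")

for _ in range(500):
    opt.zero_grad()
    out = net(fake_inputs)
    loss = loss_fn(out, fake_labels)
    loss.backward()
    opt.step()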
I just ran the code with the data and I find no reason to believe there is anything wrong with it. I ran it for 100 epochs with a batch size of 2500 and saw a peak of 60% on my GPU (Tesla P100). Each epoch takes at most 5 seconds.
When I tried to run it with a batch size of 2500, it did not use the GPU at all; the only thing it used was RAM. My GPU is a GTX 1050 Ti, TensorFlow runs with no problem, yet I did not even get through one epoch in 5 minutes. I do not know what the problem might be; this is like a nightmare, nothing I try solves it.
I reinstalled torch, and now CIFAR-10 from the PyTorch datasets uses the GPU at 50%, but there is no improvement on the fruit dataset. I do not think the dataset itself is the problem, though, because a TensorFlow model worked fine with it.
I have just found out that the problem is not in the model or in CUDA; it must be in the Dataset or the DataLoader, because CIFAR-10 from PyTorch runs with no problem. I do not think the dataset itself is the problem, because there are a lot of models on GitHub using fruit datasets, and it also does not work with the dogs-vs-cats dataset, so I think there is some problem with custom datasets and the DataLoader. Do you have any suggestions, please? Thanks for your help.
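One more thing worth measuring (a sketch, assuming the train_data ImageFolder from the first post): how long a single sample takes to produce. With ImageFolder, every sample is a file open, a JPEG decode, and a resize, so a few milliseconds per sample is already enough to starve the GPU at a batch size of 200:

import time

# Time ImageFolder.__getitem__ directly: open file + decode + resize + ToTensor.
start = time.time()
for i in range(100):
    img, label = train_data[i]
elapsed = time.time() - start
print(f"{elapsed / 100 * 1000:.2f} ms per sample")
# At ~5 ms per sample, a batch of 200 costs ~1 s of CPU work per step,
# which matches a GPU that sits almost idle.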