I’m trying to use PyTorch’s ResNet-18 model with my image data. Given the complexity of the model and the size of the data, I’d like to run it on CUDA. I’m doing the following:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

resnet_cnn = models.resnet18(pretrained=True)
num_ftrs = resnet_cnn.fc.in_features
resnet_cnn.fc = nn.Linear(num_ftrs, 8)  # replace the final layer with an 8-class head
criterion = nn.CrossEntropyLoss().cuda()
optimizer_ft = optim.SGD(resnet_cnn.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=5, gamma=0.1)
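(For what it’s worth, my understanding is that the device-agnostic version of this setup would look roughly like the sketch below; device is just a name I’m using here, not something from my actual script.)

# Sketch only, assuming a single GPU; not my actual code
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
resnet_cnn = resnet_cnn.to(device)            # move the model once, up front
criterion = nn.CrossEntropyLoss().to(device)
# tensors have to be reassigned, e.g. images = images.to(device),
# since .to()/.cuda() on a tensor are not in-place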
After this, I attempt to train and test my model with the following loop:
count = 0
loss_list = []
iteration_list = []
accuracy_list = []
epochs = 30

for epoch in range(epochs):
    for i, (images, labels) in enumerate(trainloader):
        resnet_cnn = resnet_cnn.cuda()
        images.cuda()
        labels.cuda()
        optimizer_ft.zero_grad()
        outputs = resnet_cnn(images.cuda())
        loss = criterion(outputs.cuda(), labels.cuda())
        loss.backward()
        optimizer_ft.step()
        count += 1
        if count % 50 == 0:
            correct = 0
            total = 0
            for i, (images, labels) in enumerate(testloader):
                # images.to(device)
                # labels.to(device)
                outputs = resnet_cnn(images.cuda())
                predicted = torch.max(outputs.data, 1)[1]
                total += len(labels)
                correct += (predicted == labels.cuda()).sum()
            accuracy = 100 * correct / float(total)
            loss_list.append(loss.data)
            iteration_list.append(count)
            accuracy_list.append(accuracy)
            if count % 500 == 0:
                print("Iteration: {} Loss: {} Accuracy: {} %".format(count, loss.data, accuracy))
But I am met with the following error:
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-48-cb669e8d47c0> in <module>()
      7 for epoch in range(epochs):
      8     for i, (images, labels) in enumerate(trainloader):
----> 9         resnet_cnn = resnet_cnn.cuda()
     10         images.cuda()
     11         labels.cuda()

3 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in cuda(self, device)
    489             Module: self
    490         """
--> 491         return self._apply(lambda t: t.cuda(device))
    492
    493     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    385     def _apply(self, fn):
    386         for module in self.children():
--> 387             module._apply(fn)
    388
    389         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    407             # `with torch.no_grad():`
    408             with torch.no_grad():
--> 409                 param_applied = fn(param)
    410             should_use_set_data = compute_should_use_set_data(param, param_applied)
    411             if should_use_set_data:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in <lambda>(t)
    489             Module: self
    490         """
--> 491         return self._apply(lambda t: t.cuda(device))
    492
    493     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

RuntimeError: CUDA error: device-side assert triggered
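From what I’ve read, CUDA errors are reported asynchronously, so the line the traceback points at may not be where the problem actually occurs (re-running with the environment variable CUDA_LAUNCH_BLOCKING=1 is supposed to give a more precise traceback), and a device-side assert is often caused by target labels being out of range for the loss. This is the sanity check I was planning to run on the CPU (a sketch; trainloader is my training loader from above):

# Sketch: with an 8-way output layer, CrossEntropyLoss expects labels in [0, 7]
for images, labels in trainloader:
    assert labels.min() >= 0 and labels.max() < 8, labels.unique()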
I can’t figure out what I’m doing wrong, as I’ve trained another manually-defined CNN in the same way. Thank you in advance.