Hi,
I am new to PyTorch and am trying to write simple NN code that runs on a GPU. The code below (part of a larger program) demonstrates my problem. It runs without any errors. However, when I run it under cuda-memcheck, I get a bunch of errors, all of which state:
Program hit cudaErrorCudartUnloading (error 4) due to “driver shutting down” on CUDA API call to cudaFree/cudaDeviceSynchronize/cudaEventDestroy etc.
I am using pytorchv1.5.0-gpu, gcc/8.3.0, cuda 10.2.89, and python-3.7-anaconda, and I am running my code on a Volta V100.
Here is my code:
import torch
import numpy as np

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred

def run_NN():
    N, D_in, H, D_out, tstep = 64, 1000, 100, 10, 1000
    print("*********************************************")
    print("Start python")
    dtype = torch.float
    device = torch.device("cuda:0")
    x = torch.randn(N, D_in, dtype=dtype, device=device)
    y = torch.randn(N, D_out, dtype=dtype, device=device)
    w1 = torch.randn(D_in, H, device=device, dtype=dtype)
    w2 = torch.randn(H, D_out, device=device, dtype=dtype)
    model = TwoLayerNet(D_in, H, D_out)
    y_pred = torch.matmul(x, w1)
    print("End python")
    print("*********************************************")

run_NN()
Could someone please help me?
Thank you
Edit: I tried cuda-memcheck on the examples listed at Simple NN examples Pytorch and I get the same errors.
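For comparison, here is a minimal sketch of a variant of the code above that synchronizes the device and releases cached GPU memory explicitly before the interpreter exits. The function name run_with_cleanup and the CPU fallback are hypothetical additions for illustration, and it is only an assumption that explicit cleanup changes what cuda-memcheck reports: the errors quoted above are raised during process teardown, so they may still appear.

```python
import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super().__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        # ReLU written as clamp(min=0), as in the code above
        return self.linear2(self.linear1(x).clamp(min=0))


def run_with_cleanup(N=64, D_in=1000, H=100, D_out=10):
    # Hypothetical helper: falls back to the CPU when no GPU is visible
    use_cuda = torch.cuda.is_available()
    device = torch.device("cuda:0" if use_cuda else "cpu")

    x = torch.randn(N, D_in, device=device)
    model = TwoLayerNet(D_in, H, D_out).to(device)  # move parameters to the device

    with torch.no_grad():
        y_pred = model(x)
    shape = tuple(y_pred.shape)

    if use_cuda:
        torch.cuda.synchronize()   # wait for all queued kernels to finish
        del x, y_pred, model       # drop the local references to GPU tensors
        torch.cuda.empty_cache()   # return cached blocks to the driver
    return shape


if __name__ == "__main__":
    print(run_with_cleanup())
```

If the same errors show up with this variant, that would suggest they come from the runtime shutting down at exit rather than from anything the script itself does.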