This is the environment I’m running in:
- CUDA 10.1
- Python 3.7
- Titan X
- PyTorch 1.5.1
The network used to train normally, but after I added a method to my model to compute graph normalizations, this error started to occur.
The error occurs at the very start of training (the progress bar shows 0.1% of training epoch 0). Running with CUDA_LAUNCH_BLOCKING=1 gives the following traceback:
Traceback (most recent call last):
  File "main.py", line 201, in <module>
    cross_validation_with_val_set(model, params)
  File "/afs/ece.cmu.edu/usr/xujinl/CSD/train_eval.py", line 110, in cross_validation_with_val_set
    epoch_index=epoch-1, params=params, writer=writer)
  File "/afs/ece.cmu.edu/usr/xujinl/CSD/train_eval.py", line 209, in train_test_eval
    train_acc, train_loss = train(model, loaders['train'], opt, params)
  File "/afs/ece.cmu.edu/usr/xujinl/CSD/train_eval.py", line 250, in train
    return run_windowed_model(model, loader, opt, params)
  File "/afs/ece.cmu.edu/usr/xujinl/CSD/train_eval.py", line 343, in run_windowed_model
    out = model(x_curr)
  File "/afs/ece.cmu.edu/usr/xujinl/anaconda3/envs/CSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/afs/ece.cmu.edu/usr/xujinl/CSD/modules/nets/linear.py", line 12, in forward
    x = self.MLP(x)
  File "/afs/ece.cmu.edu/usr/xujinl/anaconda3/envs/CSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/afs/ece.cmu.edu/usr/xujinl/CSD/modules/nets/linear.py", line 38, in forward
    x = F.dropout(F.relu(self.bn1(self.lin1(x))),p=self.dropout,training=self.training)
  File "/afs/ece.cmu.edu/usr/xujinl/anaconda3/envs/CSD/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/afs/ece.cmu.edu/usr/xujinl/anaconda3/envs/CSD/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/afs/ece.cmu.edu/usr/xujinl/anaconda3/envs/CSD/lib/python3.7/site-packages/torch/nn/functional.py", line 1610, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
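Since the failure is inside `self.lin1(x)`, one check that might narrow this down is comparing the shape of `x` with the input size the layer expects, in case the new normalization method changed the feature dimension. This is only a rough sketch: I am assuming (not confirmed) that a shape mismatch or an empty batch can surface as this cuBLAS error rather than as a clear size-mismatch message, and `check_linear_input` is a hypothetical helper name, not part of PyTorch.

```python
def check_linear_input(input_shape, in_features):
    """Return None if the shape looks valid for a linear layer, else a message.

    input_shape: shape of the tensor about to enter the layer, e.g. x.shape
    in_features: the layer's expected input size, e.g. self.lin1.in_features
    """
    shape = tuple(input_shape)
    if len(shape) == 0:
        return "input is a scalar; a linear layer needs at least one dimension"
    if any(dim == 0 for dim in shape):
        return "input has an empty dimension: shape={}".format(shape)
    if shape[-1] != in_features:
        return "last dimension {} does not match in_features {}".format(
            shape[-1], in_features
        )
    return None


# Hypothetical shapes, for illustration only (not taken from my model).
for candidate in [(32, 64), (32, 70), (0, 64)]:
    problem = check_linear_input(candidate, 64)
    print(candidate, "ok" if problem is None else problem)
```

In my case this would be called as `check_linear_input(x.shape, self.lin1.in_features)` just before line 38 of `modules/nets/linear.py`. Running the same batch on the CPU might also give a more readable error message.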
What might be the possible causes? Thank you.