Failed on GPU with AWS EC2 g2 instance

I installed PyTorch on an AWS EC2 g2 instance (with CUDA 7.5).

Everything runs fine on the CPU, but when I tried to run on the GPU, it threw the message below.
I tried to google the error message but couldn't find anything.
Thanks for any help!

torch.cuda.is_available()
THCudaCheck FAIL file=torch/csrc/cuda/Module.cpp line=108 error=30 : unknown error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 28, in is_available
    return torch._C._cuda_getDeviceCount() > 0
RuntimeError: cuda runtime error (30) : unknown error at torch/csrc/cuda/Module.cpp:108
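
Until the driver problem is sorted out, a script can at least avoid crashing on this check. The sketch below is only an illustration, not a fix for the underlying error: it assumes the failure surfaces as a RuntimeError, as in the traceback above, and treats that as "no usable GPU". The names cuda_usable and default_probe are hypothetical helpers, not part of PyTorch.

```python
def cuda_usable(probe):
    """Return True only if probe() succeeds and reports a GPU.

    probe is a zero-argument callable such as torch.cuda.is_available.
    cuda_usable is a hypothetical helper, not part of PyTorch.
    """
    try:
        return bool(probe())
    except RuntimeError as exc:
        # The check itself raised (as in the traceback above): treat as "no GPU".
        print("CUDA check failed, falling back to CPU:", exc)
        return False


def default_probe():
    """Query PyTorch if it is installed; report no GPU otherwise."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


use_gpu = cuda_usable(default_probe)
print("use_gpu =", use_gpu)
```

With use_gpu set this way, the rest of the script can decide whether to move tensors and models to the GPU or stay on the CPU.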

Maybe this post could help you.


Thanks! I will give it a try.

This quick start for AWS + PyTorch may help anyone struggling to get the GPU working: https://medium.com/@waya.ai/quick-start-pyt-rch-on-an-aws-ec2-gpu-enabled-compute-instance-5eed12fbd168

Hi,
Yes, it used to work fine. It no longer does because of a driver change by NVIDIA.

http://support.citrix.com/article/CTX202066

This article explains the issue.

"GPU-accelerated OpenGL on Amazon Web Services (AWS G2 instances) with NVIDIA GRID K520/K340 No Longer Works"

This is a known issue.

NVIDIA GRID drivers introduced a change in build 340. Earlier NVIDIA drivers enabled OpenGL sharing, but later versions do not. NVIDIA reports that the AWS behavior is as intended. NVIDIA OpenGL over RDP/RDSH is only supported on workstation (Quadro) and NVS products. The GRID K520 card on Amazon's G2 instances is not a workstation product, unlike the GRID K2.
Thanks,
Riya
