ImportError: No module named _C using whl

lorenzotara · December 5, 2018, 12:05am

Hello everyone,

This issue is related to this one.
I am trying to use PyTorch with PySpark.
To run my model on a large dataset on HDFS, I need to ship the PyTorch wheel to the cluster, either with

sc.addPyFile("torch-0.4.1-cp27-cp27mu-linux_x86_64.zip")

or directly when I submit the spark job with

--py-files torch-0.4.1-cp27-cp27mu-linux_x86_64.whl

I don’t have privileges to install PyTorch on the machines of the cluster, so I really need to use the whl file. When I try to import torch, this error happens:

File "/tmp/spark-940d3edb-efdf-4ceb-8955-f1f1e1c59939/userFiles-f88461bb-66de-4d0d-97ad-30a6716c3339/torch-0.4.1-cp27-cp27mu-linux_x86_64.zip/torch/__init__.py", line 80, in <module>
ImportError: No module named _C

I know I should run import torch from a directory other than the root one, but in this case I have no idea how to do that.

Thanks for your help!

albanD · December 5, 2018, 11:17am

I don’t know how PySpark works, but it seems it’s not handling our C-built modules properly. Are libraries like numpy supported properly?
Even without admin rights, you can create a local Python virtualenv and install PyTorch in it. That might be the simplest thing to do here.
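
For illustration, here is a minimal sketch of why a package with compiled parts may fail to import from a zip archive. Python's zip importer only loads .py and .pyc files and does not load shared libraries such as .so files, which may be what is happening with torch._C here. The sketch assumes Python 3 (the thread uses 2.7), and the package name fakepkg_demo and its fake _C.so file are hypothetical stand-ins, not part of PyTorch.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def zip_import_fails_for_extension():
    """Return True if a zipped package that needs a compiled submodule fails to import."""
    workdir = tempfile.mkdtemp()
    archive = os.path.join(workdir, "fakepkg_demo.zip")

    # Build a zip that mimics a wheel: a Python package plus a "compiled" file.
    # The .so here is a dummy; the zip importer never even tries to load it.
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("fakepkg_demo/__init__.py", "from fakepkg_demo import _C\n")
        zf.writestr("fakepkg_demo/_C.so", b"not a real shared object")

    # Putting the archive on sys.path is roughly what shipping a zip to workers does.
    sys.path.insert(0, archive)
    try:
        importlib.import_module("fakepkg_demo")
    except ImportError:
        # The pure-Python __init__.py is found, but the _C submodule is not.
        return True
    finally:
        sys.path.remove(archive)
        sys.modules.pop("fakepkg_demo", None)
    return False


if __name__ == "__main__":
    print("import from zip failed:", zip_import_fails_for_extension())
```

If this is indeed the cause, the wheel would have to be unpacked onto a real directory on each worker rather than imported directly from the archive.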

lorenzotara · December 5, 2018, 11:50pm

Creating a local Python virtualenv wouldn’t help, because that would install PyTorch only on the master node, while I need it on every machine that I am using with PySpark. That’s why I need to send the whl file to each node of the cluster.
Don’t you think the problem is the same as the one here?

suZhuo · December 6, 2018, 2:03am

You can try uninstalling numpy and then installing numpy+mkl. It can be downloaded from https://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy

albanD · December 6, 2018, 10:04am

No, the issue there was just that he was trying to import torch from the root of the GitHub repo. There is a torch folder there, so Python was loading that folder instead of the installed torch.
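
A quick way to check for this kind of shadowing is to ask Python where it would load a module from before importing it. The following is a small sketch using the standard importlib.util.find_spec; the helper names module_origin and loaded_from_cwd are made up for this example.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file Python would load for a top-level module, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def loaded_from_cwd(name):
    """Return True if the module would be loaded from under the current directory."""
    origin = module_origin(name)
    # Built-in modules report a non-path origin, so only real files are considered.
    if origin is None or not os.path.exists(origin):
        return False
    cwd = os.path.abspath(os.getcwd())
    return os.path.abspath(origin).startswith(cwd + os.sep)


if __name__ == "__main__":
    # Run this from the directory where the import fails.
    print("torch would be loaded from:", module_origin("torch"))
    print("shadowed by current directory:", loaded_from_cwd("torch"))
```

If the reported path points into a source checkout instead of site-packages, changing to another directory before importing should pick up the installed package.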