Recently, I updated PyTorch from 0.3 to 0.4, so I also rewrote my C extension using the new C++ APIs in 0.4. Previously, when I wrapped my C extension with DataParallel, the CPU usage of my script could go above 100% (around 170%). However, when I wrapped the new C++ extension with DataParallel, the CPU usage could not go above 100% and training slowed down. After doing some research, I suspected this was related to the GIL, and I found a way to release the GIL in my C++ source files.
After I added py::call_guard&lt;py::gil_scoped_release&gt;(), the CPU usage went above 100% again and the training speed returned to normal. I just wonder whether this could cause any problems down the line. Thanks in advance.
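For reference, a minimal sketch of what this looks like in a pybind11 binding (the function name `my_op` and its body are made up for illustration; the call_guard is the part described above):

```cpp
#include <torch/extension.h>  // pulls in pybind11 and ATen

// Hypothetical compute-heavy kernel that touches only ATen tensors,
// no Python objects, so it should be safe to run without the GIL.
at::Tensor my_op(const at::Tensor& input) {
    return input * 2;
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
    // py::call_guard<py::gil_scoped_release> releases the GIL for the
    // duration of the call and reacquires it before returning to Python.
    m.def("my_op", &my_op, "Doubles the input tensor",
          py::call_guard<py::gil_scoped_release>());
}
```

The guard reacquires the GIL automatically even if the C++ function throws, which is why it is generally preferred over releasing and reacquiring by hand.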
This should be fine. There shouldn't be any PyTorch-specific problems, but of course all the usual caveats about releasing the GIL in pybind11 still apply.
I was curious why my old C extensions did not suffer from the GIL. It looks like ffi, which was previously used to compile PyTorch extensions, releases the GIL by default. Is that correct?
Do you know of any documentation for the m.def arguments?
What are the implications? I wonder if it would break things assuming we stick to ATen and kernels (no Python object references).
PyTorch uses a library called pybind11 to create Python bindings. You can find its documentation here: https://pybind11.readthedocs.io/en/stable/index.html. I am not sure whether releasing the GIL would break anything.
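As a rough sketch, the trailing arguments to m.def can include a docstring, py::arg annotations (for keyword arguments and defaults), and call policies such as the call_guard discussed above. The function and module names here are illustrative, not from PyTorch itself:

```cpp
#include <pybind11/pybind11.h>
namespace py = pybind11;

// Trivial example function to bind.
int add(int a, int b) { return a + b; }

PYBIND11_MODULE(example, m) {
    m.def("add", &add,
          "Add two integers",                        // docstring
          py::arg("a"), py::arg("b") = 1,            // named args, default for b
          py::call_guard<py::gil_scoped_release>()); // release GIL during call
}
```

From Python this would be callable as example.add(2) or example.add(a=2, b=3).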