I want to launch a function from multiple threads, where each thread may run in a different device context, so I want to use the C++
thread_local keyword.
I wrote the code below. Importing the extension fails with an undefined symbol error:
ImportError: /home/user/anaconda3/.../_my_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN10foo14MyDeviceInfoD1Ev
However, this code is quite similar to the thread_local usage in ThreadLocalDebugInfo.cpp.
Could you give me a hint on how to solve this problem? How can I store device information per thread?
// mydevice.h
#pragma once
#include <c10/core/Device.h>

namespace foo {
class MyDeviceInfo {
 private:
  c10::Device device_;
  //...
 public:
  MyDeviceInfo() : device_(c10::Device(c10::DeviceType::CPU, 0)) {}
  MyDeviceInfo(c10::Device device) : device_(device) {}
  c10::Device& get_device() { return device_; }
  // ...
};
MyDeviceInfo& get_tls_MyDeviceInfo();
void set_tls_MyDeviceInfo(c10::Device device);
} // namespace foo
//-------------------------
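// mydevice.cpp
#include <memory>
#include "mydevice.h"

// One shared_ptr per thread; it is empty until set_tls_MyDeviceInfo() runs on that thread.
thread_local std::shared_ptr<foo::MyDeviceInfo> device_info;
namespace foo {
void set_tls_MyDeviceInfo(c10::Device device) {
  device_info = std::make_shared<MyDeviceInfo>(device);
}
MyDeviceInfo& get_tls_MyDeviceInfo() {
  // Return the MyDeviceInfo itself; get_device() would yield a c10::Device& instead.
  return *device_info;
}
} // namespace foo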
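
For comparison, here is a minimal standalone sketch of the same per-thread storage pattern that builds without PyTorch. Under the Itanium C++ ABI a mangled name ending in D1Ev refers to a destructor, so one possible cause of the error is a destructor that is declared in the elided part of the class but never defined, or a mydevice.cpp that is not among the extension's sources; both are guesses, since that part of the code is not shown. In the sketch, the Device struct is a hypothetical stand-in for c10::Device, the destructor is defined inline, and the lazy default in the getter and the helper index_seen_by_new_thread are assumptions added for illustration.

```cpp
#include <cassert>
#include <memory>
#include <thread>

namespace foo {

// Hypothetical stand-in for c10::Device so the sketch builds without PyTorch.
struct Device {
    int type = 0;   // 0 means CPU in this sketch
    int index = 0;
};

class MyDeviceInfo {
public:
    MyDeviceInfo() = default;
    explicit MyDeviceInfo(Device device) : device_(device) {}
    // Defined inline, so the destructor symbol is always available.
    // A destructor that is declared but never defined fails at link/load time.
    ~MyDeviceInfo() = default;
    Device& get_device() { return device_; }

private:
    Device device_;
};

namespace {
// Each thread gets its own independent pointer.
thread_local std::shared_ptr<MyDeviceInfo> device_info;
}  // namespace

void set_tls_MyDeviceInfo(Device device) {
    device_info = std::make_shared<MyDeviceInfo>(device);
}

MyDeviceInfo& get_tls_MyDeviceInfo() {
    // Assumption: fall back to a default (CPU) entry so that a thread which
    // never called the setter does not dereference an empty pointer.
    if (!device_info) {
        device_info = std::make_shared<MyDeviceInfo>();
    }
    return *device_info;
}

// Illustrative helper: returns the device index observed by a freshly started
// thread, which has not called set_tls_MyDeviceInfo and so sees the default.
int index_seen_by_new_thread() {
    int seen = -1;
    std::thread worker([&seen] {
        seen = get_tls_MyDeviceInfo().get_device().index;
    });
    worker.join();
    return seen;
}

}  // namespace foo
```

Calling set_tls_MyDeviceInfo on one thread changes only that thread's entry; a newly started thread still sees the default device until it calls the setter itself.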