C++ Segmentation fault during torch::jit::load()

Hello community,
I have a problem loading my model in C++.
On Ubuntu 18.04, this works fine after a restart:
module_ = torch::jit::load("my-path");
and I can apply the model without any issues.
But if I kill the program and restart it, I run into a segmentation fault during torch::jit::load().

GDB:

[New Thread 0x7f3e73ff7700 (LWP 8143)]
[New Thread 0x7f3e737f6700 (LWP 8144)]

Thread 1 "Recorder" received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at …/sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:431
431 …/sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.

The output of watch nvidia-smi looks normal.

I tried to find a way to unload the model before killing the program, but so far the forums mostly suggest methods that are no longer available, like module.reset();
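
As far as I can tell there is no separate "unload" call to make: a loaded module is an ordinary C++ object and is released when it goes out of scope. What can be arranged is that the program exits through its normal path when it is stopped, so destructors actually run. Below is a minimal sketch of that idea using only the standard library. ModelHolder, request_stop and run_until_stopped are hypothetical names for illustration, ModelHolder merely stands in for whatever owns the torch::jit module, and nothing here is confirmed to fix the crash in this thread.

```cpp
#include <csignal>

// Set by the signal handler and polled by the main loop.
// volatile std::sig_atomic_t is the type that is safe to write from a handler.
volatile std::sig_atomic_t g_stop_requested = 0;

extern "C" void request_stop(int /*signum*/) {
    g_stop_requested = 1;
}

// Hypothetical stand-in for the object that owns the loaded model.
// Its destructor records that cleanup really happened.
struct ModelHolder {
    bool* released;
    ~ModelHolder() { *released = true; }
};

// Installs handlers for SIGINT/SIGTERM and loops until a stop is requested
// or max_iterations is reached. Returns the number of iterations performed.
int run_until_stopped(bool& released, int max_iterations) {
    std::signal(SIGINT, request_stop);
    std::signal(SIGTERM, request_stop);

    ModelHolder holder{&released};  // in a real program: the loaded module
    int iterations = 0;
    while (!g_stop_requested && iterations < max_iterations) {
        ++iterations;  // inference work would go here
    }
    return iterations;
}  // holder is destroyed here, on the normal exit path
```

This only helps when the program is stopped with Ctrl+C or a plain kill (SIGINT/SIGTERM). SIGKILL (kill -9) cannot be caught, so no cleanup code runs in that case.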

I am using libtorch 1.8.0 (Stable (1.8.0), Linux, LibTorch, C++/Java, CUDA 10.2) (btw. the latest link of Stable (1.8.1) → Linux → LibTorch → C++/Java → CUDA 10.2 is not working, so I can’t update, see here)
I tried a few other versions and updated cuDNN; switching to the LibTorch CPU version did not change this behavior either.

I am grateful for any advice
BR Michael

The error seems to be raised by AVX, so I would assume that changing the CUDA version wouldn’t have any effect.
I assume your CPU is supporting AVX instructions, as the model seems to load fine after a restart.
Are you seeing any zombie processes after killing the main process?
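
In case it helps with the zombie check, here is a small sketch that scans /proc for processes in state Z. It assumes a Linux system with procfs mounted; parse_proc_state and find_zombie_pids are names made up for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <dirent.h>
#include <fstream>
#include <string>
#include <vector>

// Extracts the state character from the contents of /proc/<pid>/stat.
// The format is "pid (comm) state ..."; comm may contain spaces or
// parentheses, so the last ')' is used as the delimiter.
// Returns '\0' if the line cannot be parsed.
char parse_proc_state(const std::string& stat_line) {
    const std::size_t close = stat_line.rfind(')');
    if (close == std::string::npos || close + 2 >= stat_line.size()) {
        return '\0';
    }
    return stat_line[close + 2];
}

// Returns the PIDs of all processes currently reported as zombies ('Z').
std::vector<int> find_zombie_pids() {
    std::vector<int> zombies;
    DIR* proc = opendir("/proc");
    if (proc == nullptr) return zombies;  // no procfs available

    while (dirent* entry = readdir(proc)) {
        const std::string name = entry->d_name;
        // Only purely numeric directory names are process IDs.
        const bool is_pid = !name.empty() &&
            std::all_of(name.begin(), name.end(),
                        [](unsigned char c) { return std::isdigit(c) != 0; });
        if (!is_pid) continue;

        std::ifstream stat_file("/proc/" + name + "/stat");
        std::string line;
        if (std::getline(stat_file, line) && parse_proc_state(line) == 'Z') {
            zombies.push_back(std::stoi(name));
        }
    }
    closedir(proc);
    return zombies;
}
```

Calling find_zombie_pids() right after killing the main process and printing the result should show whether anything was left behind.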

I restarted to see whether any new process comes up, and realized that not every restart fixes the problem. I ran the program multiple times without a restart: on the 4th attempt it worked, the 5th worked as well, and the 6th ended in a segmentation fault again.
I can’t see any zombie process so far.
I use an i7-10750H CPU, so I assume AVX is supported. My GPU is an NVIDIA P620. I just updated the driver from 450 to 460.32.03 to rule out the driver as the cause, but nothing changed.
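
To confirm the AVX assumption rather than guess, a small check like the following could be compiled on the affected machine. It is only a sketch: it relies on the GCC/Clang builtins __builtin_cpu_init and __builtin_cpu_supports, which exist for x86 targets, and avx_status is a name invented here.

```cpp
#include <string>

// Reports whether the CPU this program runs on advertises AVX.
// On non-x86 targets or other compilers the sketch reports "unknown"
// instead of failing to build.
std::string avx_status() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // initialise the CPU feature data before querying
    return __builtin_cpu_supports("avx") ? "yes" : "no";
#else
    return "unknown";
#endif
}
```

Printing avx_status() from main gives a direct answer. Since the model does load successfully some of the time, a "no" here would be surprising.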

I found a similar issue on GitHub.
It reports that using regex before loading the model leads to a segmentation fault. I tested this with my model in a very simple program:

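#include <iostream>
#include <regex>
#include <torch/script.h>  // torch::jit::load

int main() {
    // Constructing a std::regex before loading is enough to trigger the crash.
    std::regex regstr("crash");
    int i = 200;
    while (i--) {
        auto module = torch::jit::load("resnet_cpp.pt");
        std::cout << i << std::endl;
    }
    return 0;
}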

If I remove the std::regex regstr("crash") line, it runs all 200 iterations without any problems. With it, not a single iteration completes. The error is:
Program received signal SIGSEGV, Segmentation fault.

0x00007f2534f7cfe6 in std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_dfs(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) () from /usr/lib/libtorch/lib/libc10.so

From stack:

std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_dfs(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) 0x00007f6f25a52fe6
std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_dfs(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) 0x00007f6f25a5309b
std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_M_dfs(std::__detail::_Executor<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::__cxx11::regex_traits<char>, true>::_Match_mode, long) 0x00007f6f25a532e4
bool std::__detail::__regex_algo_impl<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, char, std::__cxx11::regex_traits<char>, (std::__detail::_RegexExecutorPolicy)0, true>(__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, __gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__cxx11::match_results<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > >&, std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> > const&, std::regex_constants::match_flag_type) 0x00007f6f25a53a62
c10::Device::Device(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) 0x00007f6f25a4db36
torch::jit::Unpickler::readInstruction() 0x00007f6eb4f7cf88
torch::jit::Unpickler::run() 0x00007f6eb4f7e318
torch::jit::Unpickler::parse_ivalue() 0x00007f6eb4f7e53e
torch::jit::readArchiveAndTensors(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<std::function<c10::StrongTypePtr (c10::QualifiedName const&)> >, c10::optional<std::function<c10::intrusive_ptr<c10::ivalue::Object, c10::detail::intrusive_target_default_null_type<c10::ivalue::Object> > (c10::StrongTypePtr, c10::IValue)> >, c10::optional<c10::Device>, caffe2::serialize::PyTorchStreamReader&) 0x00007f6eb4f27d71
torch::jit::(anonymous namespace)::ScriptModuleDeserializer::readArchive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .constprop.848] 0x00007f6eb4f28010
torch::jit::(anonymous namespace)::ScriptModuleDeserializer::deserialize(c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) [clone .constprop.847] 0x00007f6eb4f2aeb7
torch::jit::load(std::shared_ptr<caffe2::serialize::ReadAdapterInterface>, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) 0x00007f6eb4f2b554
torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) 0x00007f6eb4f2e305
torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>) 0x00007f6eb4f2e40f
main main.cpp:8
__libc_start_main 0x00007f6e37752bf7
_start 0x00005642a276bbfa
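
One detail in the trace above: the std::regex internals (_M_dfs) are reported as coming from libc10.so, not from the test program. If the application and the LibTorch binaries were built with different compilers or libstdc++ versions, that might be relevant, although this is only a guess and nothing in this thread confirms it. A sketch like the following prints what the application itself is compiled with, which would make such a comparison possible. The macros used are the standard GCC/libstdc++ ones; toolchain_summary is a name made up for this example.

```cpp
#include <sstream>
#include <string>

// Collects compile-time information about the toolchain used to build
// the application. Macros that are not defined are simply skipped.
std::string toolchain_summary() {
    std::ostringstream out;
    out << "toolchain:";
#if defined(__clang__)
    out << " clang " << __clang_major__ << '.' << __clang_minor__;
#elif defined(__GNUC__)
    out << " gcc " << __GNUC__ << '.' << __GNUC_MINOR__;
#endif
#if defined(__GLIBCXX__)
    // Release date stamp of the libstdc++ headers in use.
    out << " libstdc++ " << __GLIBCXX__;
#endif
#if defined(_GLIBCXX_USE_CXX11_ABI)
    // 1 means the std::__cxx11 string/list ABI is active, 0 means the old one.
    out << " cxx11_abi=" << _GLIBCXX_USE_CXX11_ABI;
#endif
    return out.str();
}
```

Printing toolchain_summary() at startup shows the application side. What the LibTorch download was built with would have to be looked up separately.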

This is not my original problem, which still exists, but so far it is the closest issue I could find on the web. I use regex too, but well before the model is loaded. With all libs updated, I run into my original problem roughly 1 in 5 times.
Maybe this helps to solve both issues.
Here is the stack trace for my original problem:

__memmove_avx_unaligned_erms 0x00007f48e260ec21
void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >(__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, __gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::forward_iterator_tag) [clone .isra.101] 0x00007f49d4bd0604
c10::Device::Device(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) 0x00007f49d4bd147c
torch::jit::Unpickler::readInstruction() 0x00007f4960595f88
torch::jit::Unpickler::run() 0x00007f4960597318
torch::jit::Unpickler::parse_ivalue() 0x00007f496059753e
torch::jit::readArchiveAndTensors(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<std::function<c10::StrongTypePtr (c10::QualifiedName const&)> >, c10::optional<std::function<c10::intrusive_ptr<c10::ivalue::Object, c10::detail::intrusive_target_default_null_type<c10::ivalue::Object> > (c10::StrongTypePtr, c10::IValue)> >, c10::optional<c10::Device>, caffe2::serialize::PyTorchStreamReader&) 0x00007f4960540d71
torch::jit::(anonymous namespace)::ScriptModuleDeserializer::readArchive(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .constprop.848] 0x00007f4960541010
torch::jit::(anonymous namespace)::ScriptModuleDeserializer::deserialize(c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) [clone .constprop.847] 0x00007f4960543b82
torch::jit::load(std::shared_ptr<caffe2::serialize::ReadAdapterInterface>, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) 0x00007f4960544554
torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) 0x00007f4960547305
torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>) 0x00007f496054740f
Detector::start Detector.cpp:52
main main.cpp:41
__libc_start_main 0x00007f48e24a1bf7
_start 0x0000556d8a26638a

Is your code running fine if you remove the regex usage or is it still crashing?