Thread dies in c10::cuda::CUDACachingAllocator::emptyCache()

I am using LibTorch.

Once I am done with the trained model, I call c10::cuda::CUDACachingAllocator::emptyCache() to free the GPU memory.
Sometimes, though, the thread dies inside emptyCache().

Can anyone help me with this issue?

This seems unusual. Could there be a race between this thread and other parts of your LibTorch usage?