[Libtorch] Exception occurs when caffe2_nvrtc.dll is unloaded

When the program runs to completion, an exception occurs at exit (FAILURE_BUCKET_ID: BAD_INSTRUCTION_PTR_c0000005_caffe2_nvrtc.dll!Unloaded).

With WinDbg attached to the program, the STACK_TEXT is as follows:

<Unloaded_caffe2_nvrtc.dll>+0x110a
torch!thrust::cuda_cub::core::AgentLauncher<thrust::cuda_cub::__adjacent_difference::AdjacentDifferenceAgent<__int64 * __ptr64,__int64 * __ptr64,__int64,__nv_dl_wrapper_t<__nv_dl_tag<std::tuple<at::Tensor,at::Tensor,at::Tensor> (__cdecl*)(at::Tensor const & __ptr64,__int64,bool,bool,bool),&at::native::_GLOBAL__N__53_tmpxft_0000159c_00000000_15_Unique_compute_75_cpp1_ii_285fecc2::unique_dim_cuda_template<__int64>,3>,__int64,__int64 * __ptr64> > >::launch_impl<__int64 * __ptr64,__int64 * __ptr64,__int64 * __ptr64,__nv_dl_wrapper_t<__nv_dl_tag<std::tuple<at::Tensor,at::Tensor,at::Tensor> (__cdecl*)(at::Tensor const & __ptr64,__int64,bool,bool,bool),&at::native::_GLOBAL__N__53_tmpxft_0000159c_00000000_15_Unique_compute_75_cpp1_ii_285fecc2::unique_dim_cuda_template<__int64>,3>,__int64,__int64 * __ptr64>,__int64>+0x100 [c:\program files\nvidia gpu computing toolkit\cuda\v10.0\include\thrust\system\cuda\detail\core\agent_launcher.h @ 958]
torch!thrust::cuda_cub::core::AgentLauncher<thrust::cuda_cub::__adjacent_difference::AdjacentDifferenceAgent<__int64 * __ptr64,__int64 * __ptr64,__int64,thrust::not_equal_to<__int64> > >::launch_impl<__int64 * __ptr64,__int64 * __ptr64,__int64 * __ptr64,thrust::not_equal_to<__int64>,__int64>+0x80 [c:\program files\nvidia gpu computing toolkit\cuda\v10.0\include\thrust\system\cuda\detail\core\agent_launcher.h @ 957]
torch!_device_stub__ZN2at6native26batch_norm_backward_kernelIffxEEvNS_20PackedTensorAccessorIT_Ly3ENS_16DefaultPtrTraitsET1_EES6_S6_NS2_IS3_Ly1ES4_S5_EES7_S7_S7_S7_NS2_IT0_Ly1ES4_S5_EES9_bS8+0xea [c:\users\administrator\appdata\local\temp\tmpxft_00001a40_00000000-9_normalization.compute_75.cudafe1.stub.c @ 603]
torch!_device_stub__ZN2at6native26batch_norm_backward_kernelIffxEEvNS_20PackedTensorAccessorIT_Ly3ENS_16DefaultPtrTraitsET1_EES6_S6_NS2_IS3_Ly1ES4_S5_EES7_S7_S7_S7_NS2_IT0_Ly1ES4_S5_EES9_bS8+0x29a [c:\users\administrator\appdata\local\temp\tmpxft_00001a40_00000000-9_normalization.compute_75.cudafe1.stub.c @ 612]
torch!c10::tryTypeMetaToScalarType+0x2a5 [c:\w\1\s\windows\pytorch\c10\core\scalartype.h @ 139]
torch!cub::DispatchReduce<int * __ptr64,int * __ptr64,int,cub::Max>::InvokePasses<cub::DeviceReducePolicy<int,int,cub::Max>::Policy200,void (__cdecl*)(int * __ptr64,int * __ptr64,int,cub::GridEvenShare,cub::Max),void (__cdecl*)(int * __ptr64,int * __ptr64,int,cub::Max,int)>+0x508 [c:\w\1\s\windows\pytorch\third_party\cub\cub\device\dispatch\dispatch_reduce.cuh @ 572]
ucrtbase!execute_onexit_table+0x103
ucrtbase!register_onexit_function+0xeb
ucrtbase!execute_onexit_table+0x34
torch!cub::DispatchReduce<__int64 const * __ptr64,__int64 * __ptr64,int,cub::Max>::InvokeSingleTile<cub::DeviceReducePolicy<__int64,int,cub::Max>::Policy350,void (__cdecl*)(__int64 const * __ptr64,__int64 * __ptr64,int,cub::Max,__int64)>+0x96 [c:\w\1\s\windows\pytorch\third_party\cub\cub\device\dispatch\dispatch_reduce.cuh @ 458]
torch!cub::DispatchReduce<__int64 const * __ptr64,__int64 * __ptr64,int,cub::Max>::InvokeSingleTile<cub::DeviceReducePolicy<__int64,int,cub::Max>::Policy350,void (__cdecl*)(__int64 const * __ptr64,__int64 * __ptr64,int,cub::Max,__int64)>+0x194 [c:\w\1\s\windows\pytorch\third_party\cub\cub\device\dispatch\dispatch_reduce.cuh @ 469]
ntdll!RtlDeactivateActivationContextUnsafeFast+0x1bf
ntdll!LdrUnloadAlternateResourceModuleEx+0x32d
ntdll!RtlCreateHeap+0x1238
ntdll!RtlCreateHeap+0x13ad
ntdll!RtlCreateHeap+0x13ad
ntdll!LdrUnloadDll+0x113
ntdll!LdrUnloadDll+0x94
KERNELBASE!FreeLibrary+0x1d
000000000315fec0 0000000059de1d7b : 00000000028a2080 0000000000000000 00000000`0

SYMBOL_STACK_INDEX: 0

SYMBOL_NAME: caffe2_nvrtc+110a

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: caffe2_nvrtc

IMAGE_NAME: caffe2_nvrtc.dll

STACK_COMMAND: ~8s ; kb

BUCKET_ID: WRONG_SYMBOLS

FAILURE_BUCKET_ID: BAD_INSTRUCTION_PTR_c0000005_caffe2_nvrtc.dll!Unloaded

Followup: MachineOwner