I ran into a problem calling Conv1d forward from a DLL on CUDA. In the following C++ snippet, line 4 (Net->forward(X)) crashes with a stack overflow. The full CPP file (27 lines) is at the bottom of this message.
auto Net = torch::nn::Conv1d(torch::nn::Conv1dOptions(21, 2, 3));
Net->to(device);
torch::Tensor X = torch::rand({ 5,21,25 }).to(device);
torch::Tensor Y = Net->forward(X);
Having experimented on two PCs with different GPU types, I found that the problem reproduces consistently when all of the following conditions are met:
- DLL. The same code in a console EXE application runs normally.
- GPU/CUDA. There is no problem running the same code on the CPU.
- Convolutional layer. There is no problem with other layer types, e.g. linear.
I tried debugging the DLL in Visual Studio (debugger window screenshot attached). The call stack suggests that the stack overflow happens inside the cudnn_cnn_infer64_8.dll module, which is part of the Nvidia cuDNN library.
I am not sure whether this is a bug in PyTorch or in Nvidia cuDNN. If anyone has a suggestion on how to resolve this, please respond.
#include <fstream>  // std::ofstream
#include <torch/torch.h>

#define XLExport extern "C" __declspec(dllexport)

XLExport int _stdcall MLP_DLL();

int _stdcall MLP_DLL()
{
    std::ofstream log("output.txt");

    // Use CUDA when available, otherwise fall back to the CPU.
    torch::Device device(torch::kCPU);
    if (torch::cuda::is_available()) {
        log << "Cuda found" << std::endl;
        device = torch::Device(torch::kCUDA);
    }

    // Conv1d with 21 input channels, 2 output channels, kernel size 3.
    auto Net = torch::nn::Conv1d(torch::nn::Conv1dOptions(21, 2, 3));
    Net->to(device);

    // Random input of shape (batch = 5, channels = 21, length = 25).
    torch::Tensor X = torch::rand({ 5, 21, 25 }).to(device);
    log << " starting forward" << std::endl;
    torch::Tensor Y = Net->forward(X);  // the stack overflow happens here
    log << "Y = " << std::endl;
    log << Y << std::endl;
    return 0;
}
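
Since the call stack points at a stack overflow inside cuDNN and the same code runs fine in a console EXE, one thing that may be worth ruling out is the stack size of the thread that calls into the DLL. This is only an assumption, not a confirmed diagnosis: a host application may give its threads a smaller stack than the main thread of a console EXE gets. The sketch below runs a callable on a worker thread with an explicitly requested stack size. The names run_with_stack, Job and trampoline are hypothetical helpers, not part of LibTorch or cuDNN, and the POSIX branch is there only so the sketch also compiles outside Windows.

```cpp
// Sketch: run a callable on a worker thread with an explicit stack size.
// run_with_stack, Job and trampoline are hypothetical helper names.
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <utility>

#ifdef _WIN32
#include <windows.h>
#include <process.h>
#else
#include <pthread.h>
#endif

namespace {

struct Job {
    std::function<int()> fn;
    int result = -1;  // stays -1 if fn throws
};

// Runs the job; exceptions must not escape a thread entry point.
void run_job(void* p) {
    Job* job = static_cast<Job*>(p);
    try {
        job->result = job->fn();
    } catch (...) {
        job->result = -1;
    }
}

#ifdef _WIN32
unsigned __stdcall trampoline(void* p) { run_job(p); return 0; }
#else
void* trampoline(void* p) { run_job(p); return nullptr; }
#endif

}  // namespace

// Calls fn on a new thread whose stack is at least stack_bytes large,
// waits for it to finish, and returns fn's result (-1 if fn threw).
int run_with_stack(std::function<int()> fn, std::size_t stack_bytes) {
    Job job;
    job.fn = std::move(fn);
#ifdef _WIN32
    // STACK_SIZE_PARAM_IS_A_RESERVATION makes stack_bytes the reserve size.
    std::uintptr_t h = _beginthreadex(nullptr, static_cast<unsigned>(stack_bytes),
                                      trampoline, &job,
                                      STACK_SIZE_PARAM_IS_A_RESERVATION, nullptr);
    if (h == 0) throw std::runtime_error("could not create thread");
    WaitForSingleObject(reinterpret_cast<HANDLE>(h), INFINITE);
    CloseHandle(reinterpret_cast<HANDLE>(h));
#else
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, stack_bytes);
    pthread_t t;
    int rc = pthread_create(&t, &attr, trampoline, &job);
    pthread_attr_destroy(&attr);
    if (rc != 0) throw std::runtime_error("could not create thread");
    pthread_join(t, nullptr);
#endif
    return job.result;
}
```

To try this, the body of MLP_DLL could be moved into a lambda and called as run_with_stack([] { /* existing body */ return 0; }, 8 * 1024 * 1024). The 8 MB figure is an arbitrary value for the experiment. If the crash disappears, the caller's stack size is the likely culprit; if it persists, the cause lies elsewhere.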