I am writing a PyTorch extension in C++/CUDA and have run into a problem. I want to use the function inline void parallel_for(const int64_t begin, const int64_t end, const int64_t grain_size, const F& f); from ATen/Parallel.h. When I compile my code, I get this error:
/.local/lib/python3.6/site-packages/torch/include/ATen/Parallel.h:48:13: error: ‘void at::parallel_for(int64_t, int64_t, int64_t, const F&) [with F = geomean_pool2d_backward_out_frame(scalar_t*, scalar_t*, scalar_t*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int, int, int, int, int, int, bool, c10::optional) [with scalar_t = float; int64_t = long int]::<lambda(int64_t, int64_t)>; int64_t = long int]’, declared using local type ‘const geomean_pool2d_backward_out_frame(scalar_t*, scalar_t*, scalar_t*, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int, int, int, int, int, int, bool, c10::optional) [with scalar_t = float; int64_t = long int]::<lambda(int64_t, int64_t)>’, is used but never defined [-fpermissive]
Can you help me solve this problem? I would like to know how to use at::parallel_for() correctly in my own extension code. Any help is appreciated!