So indeed, this is the kernel that computes the forward pass.
It is called from this function that performs the forward pass.
That function is used in an automatically generated wrapper called CudaSpatialDepthwiseConvolution_updateOutput, which is then exposed to Python under the same name. You can then access it from Python at torch._C._THNN.CudaSpatialDepthwiseConvolution_updateOutput.
Yes, it's slightly tricky, especially since some of the bindings are in generated files that are not in the repo originally.
I would recommend compiling from source and then searching for the function name in the folder where you compiled. That way, your search will include all the generated files!
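If you'd rather do that search from Python than with grep, here is a minimal sketch. The helper name `find_symbol` and the list of file extensions are my own assumptions, not anything from the PyTorch build, so adjust them to whatever your build tree actually contains.

```python
import os


def find_symbol(root, symbol, extensions=(".c", ".cpp", ".cu", ".cuh", ".h", ".py")):
    """Return (path, line_number, line) for every line under `root` containing `symbol`.

    `extensions` is an assumed filter; widen it if the generated files
    in your build folder use other suffixes.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            try:
                # errors="ignore" keeps the scan going on files with odd encodings
                with open(path, "r", errors="ignore") as handle:
                    for line_number, line in enumerate(handle, start=1):
                        if symbol in line:
                            hits.append((path, line_number, line.strip()))
            except OSError:
                # unreadable file (permissions, broken symlink): skip it
                continue
    return hits


if __name__ == "__main__":
    # Run from the folder where you compiled, so generated files are included.
    for path, line_number, line in find_symbol(".", "CudaSpatialDepthwiseConvolution_updateOutput"):
        print(f"{path}:{line_number}: {line}")
```

Run it from the root of the folder you built in. Any hit that lives in a file that is not tracked in the repo is one of the generated bindings.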