I’m trying to correlate forward/backward operations with the autograd profiler. I’m using Nsight Systems now that nvprof is EOL, but I don’t think the issue is related to that.
According to the current docs for torch.autograd.profiler.emit_nvtx, you can correlate them based on a seq=<seq_nr> in forward pass operations and a stashed seq=<seq_nr> in the corresponding backward operations. However, in the output from 1.3 I don’t find any stashed seq numbers, just seq numbers. It looks like subsequent refactoring of torch/csrc/autograd/profiler.cpp may have altered this, as I can find the stashed seq in earlier code but not in current code.
As I understand it from the docs, the stashed seq= is useful for distinguishing backward operations that correlate to forward operations from backward operations that don’t correlate and will just have a seq=.
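To make the intended use concrete, here is a rough sketch of how I'd expect to pair ranges once their names are exported from the profile. The exact text of the range names is an assumption on my part (written here as `<op>, seq=<N>` and `<op>, stashed seq=<N>`), and `pair_by_seq` is a hypothetical helper, not anything provided by PyTorch or Nsight Systems:

```python
import re
from collections import defaultdict

# Matches "seq=12", "seq = 12" and "stashed seq=12".
# The tolerance for extra whitespace is a guess about the real range names.
SEQ_RE = re.compile(r"(?P<stashed>stashed\s+)?seq\s*=\s*(?P<nr>\d+)")


def pair_by_seq(range_names):
    """Pair forward ranges (seq=N) with backward ranges (stashed seq=N).

    Returns {N: (forward_name, [backward_names])} for every N that has
    both a plain and a stashed range. Ranges without a partner are dropped.
    """
    forward = {}
    backward = defaultdict(list)
    for name in range_names:
        match = SEQ_RE.search(name)
        if match is None:
            continue  # range carries no sequence number at all
        nr = int(match.group("nr"))
        if match.group("stashed"):
            backward[nr].append(name)
        else:
            # Keep the first plain seq=N seen; assumed to be the forward op.
            forward.setdefault(nr, name)
    return {nr: (forward[nr], names)
            for nr, names in backward.items() if nr in forward}


if __name__ == "__main__":
    # Made-up range names, only to show the shape of the result.
    example = [
        "addmm, seq=5",
        "relu, seq=6",
        "ReluBackward, stashed seq=6",
        "AddmmBackward, stashed seq=5",
        "mul, seq=9",
    ]
    for nr, (fwd, bwd) in sorted(pair_by_seq(example).items()):
        print(nr, fwd, bwd)
```

Without the stashed seq= in the output, the `backward` side of this stays empty and there is nothing to pair on, which is the problem I'm running into.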
I don’t have easily runnable code to post (but can put something together if that will help). I’m running with the code from the emit_nvtx docs:
with torch.cuda.profiler.profile():
    with torch.autograd.profiler.emit_nvtx():
        ...
Then launching with:
nsys profile -t cublas,cuda,cudnn,nvtx -c cudaProfilerApi --kill=none $(which python) my_script.py
Using NVIDIA_Nsight_Systems_Linux_2019.5.1.58 (the nvtx option was only added in recent versions). As noted, I get the seq= NVTX ranges, just not the stashed seq= ones, so it seems the NVTX ranges themselves are working.
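For reference, a self-contained script of the kind I'm profiling could look roughly like the sketch below. The toy model, the tensor sizes, and the `run_step` name are placeholders I made up for illustration, not my real workload, and it assumes a CUDA-enabled PyTorch build:

```python
import torch


def run_step(device="cuda"):
    # Hypothetical toy model; any forward/backward pair would do.
    model = torch.nn.Sequential(
        torch.nn.Linear(64, 64),
        torch.nn.ReLU(),
        torch.nn.Linear(64, 1),
    ).to(device)
    x = torch.randn(32, 64, device=device)

    # Warm up outside the profiled region so one-time CUDA setup
    # does not end up in the capture.
    model(x).sum().backward()
    torch.cuda.synchronize()

    # profile() starts/stops the CUDA profiler, which is what
    # `nsys ... -c cudaProfilerApi` waits for; emit_nvtx() wraps
    # each autograd op in an NVTX range.
    with torch.cuda.profiler.profile():
        with torch.autograd.profiler.emit_nvtx():
            loss = model(x).sum()
            loss.backward()
        # Wait for queued kernels before the profiler is stopped.
        torch.cuda.synchronize()
    return loss.item()


if __name__ == "__main__":
    if torch.cuda.is_available():
        print(run_step())
    else:
        print("CUDA is not available; nothing to profile.")
```

Saved as my_script.py, this can be launched with the nsys command above.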