All functions have the same name in autograd profiler

I have been looking for tools to profile the end-to-end training of a model (reduced epochs, of course), and the profiler torch.utils.bottleneck is returning some strange results.

It also leaves out a lot of information, such as which operations run in parallel (compared to the TensorFlow profiling tool), so I would love to hear about any other options!

The bottleneck profiler returned this output:

All the functions have the same name, and it is overall a bit difficult to get a clear picture of what the model is doing. I suspect things like the input_shapes being and each function getting called exactly once are signs that something is not quite right, but I have not found much documentation to confirm this suspicion.

Any advice or suggestions would be sincerely appreciated!

Hi,

You are running on GPU, right?
You can check in the doc here that the bottleneck tool won’t perform synchronization with the GPU.
CUDA kernels are launched asynchronously, so without synchronization most operations appear cheap on the CPU side. That is why the only expensive functions you see here are the transfers between CPU and GPU. This is expected, as these are the only places where we synchronize with the GPU and actually wait for it to do the work.
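
To illustrate the synchronization point, here is a minimal sketch of timing a GPU operation by hand. The helper name `timed` and the matrix size are made up for this example, and it falls back to CPU when no GPU is available. Without the `torch.cuda.synchronize()` calls, the measured time on GPU would mostly cover the kernel launch rather than the actual work.

```python
import time
import torch

def timed(fn, sync=True):
    """Hypothetical helper: run fn() and return (result, elapsed seconds)."""
    on_gpu = torch.cuda.is_available()
    if sync and on_gpu:
        torch.cuda.synchronize()  # wait for previously queued GPU work
    start = time.perf_counter()
    result = fn()
    if sync and on_gpu:
        torch.cuda.synchronize()  # wait until fn's kernels have finished
    return result, time.perf_counter() - start

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
a = torch.randn(512, 512, device=device)
b = torch.randn(512, 512, device=device)

# On GPU the unsynchronized number is usually much smaller, because it
# only measures how long it took to queue the kernel.
_, t_async = timed(lambda: a @ b, sync=False)
product, t_sync = timed(lambda: a @ b, sync=True)
print(f"no sync: {t_async:.6f}s, with sync: {t_sync:.6f}s")
```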

I think you want to use the autograd profiler (torch.autograd.profiler.profile) with the use_cuda setting turned on to get a better idea of which functions are actually taking a long time on the GPU.
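
Something along these lines should work as a starting point. It is only a sketch: the tiny `Linear` model, the tensor sizes, and the `profile_step` helper are placeholders for your own training step, and CUDA timing is only enabled when a GPU is present. Note that more recent PyTorch releases also ship a separate `torch.profiler` module, and `use_cuda` may emit a deprecation warning there.

```python
import torch
from torch.autograd import profiler

def profile_step(model, inputs, use_cuda):
    """Hypothetical helper: profile one forward/backward pass."""
    # use_cuda=True makes the profiler record GPU-side timings as well;
    # record_shapes=True stores the input shapes of each op.
    with profiler.profile(use_cuda=use_cuda, record_shapes=True) as prof:
        out = model(inputs)
        out.sum().backward()
    return prof

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Placeholder model and batch; substitute your own.
model = torch.nn.Linear(128, 64).to(device)
inputs = torch.randn(32, 128, device=device)

prof = profile_step(model, inputs, use_cuda)

# Sort by GPU time when it was recorded, otherwise by CPU time.
sort_key = "cuda_time_total" if use_cuda else "cpu_time_total"
table = prof.key_averages().table(sort_by=sort_key, row_limit=10)
print(table)
```

The table groups events by operator name, so you should see separate rows per op instead of every function sharing one name.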
