How do I interpret the chrome trace from the profiler?

kmaeng · May 27, 2020, 10:19pm

This seems like a newbie question, but I couldn’t find any information detailed enough for me to understand.
I am trying to understand how to interpret the chrome trace from the autograd profiler.
The code below generates a very simple chrome trace:

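import torch

# Net is the model class (a torch.nn.Module), defined elsewhere.
if __name__ == "__main__":
    # Profile CPU ops only: profiling enabled, CUDA timing disabled.
    with torch.autograd.profiler.profile(enabled=True, use_cuda=False) as prof:
        net = Net()
        optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
        optimizer.zero_grad()
        y = net(torch.tensor([1.0, 2.0]))  # forward pass
        y.sum().backward()                 # backward pass
        optimizer.step()                   # weight update
    print(prof.key_averages().table(sort_by="cpu_time_total"))
    prof.export_chrome_trace("test.json")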

And this is the chrome trace I get:

It seems like the three chunks roughly represent the forward pass, the backward pass, and the weight update.
However, I don’t understand what the two rows under “Process CPU functions” are. Are these two different processes? Two different threads? Two different CPUs?
From the JSON file, they look like two separate threads (their tid values are 1 and 2). However, I am not sure why only the backprop would run on a different thread. Are the two rows really two different threads?
If so, is there any way to view process/CPU activity as well? Or can I only collect a profile for a single process?
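
One way to check the thread question directly is to group the events in the exported file by their `pid` and `tid` fields. Below is a minimal sketch, not a definitive tool. It assumes the file is either a bare JSON array of events or an object with a `traceEvents` key, and that events carry `ts` and `dur` in microseconds, as in the Chrome trace event format. The helper names `load_events` and `summarize_by_thread` are hypothetical, and the sample events are made up for illustration, not taken from the real trace.

```python
import json
from collections import defaultdict


def load_events(path):
    """Load trace events from a Chrome trace file (hypothetical helper)."""
    with open(path) as f:
        data = json.load(f)
    # Assumption: the file is a bare list of events or {"traceEvents": [...]}.
    if isinstance(data, dict):
        return data.get("traceEvents", [])
    return data


def summarize_by_thread(events):
    """Group events by (pid, tid) and record count, time span, and op names."""
    rows = defaultdict(
        lambda: {"count": 0, "start": None, "end": None, "names": set()}
    )
    for ev in events:
        if "ts" not in ev:
            continue  # skip entries without a timestamp (e.g. metadata)
        row = rows[(ev.get("pid"), ev.get("tid"))]
        start = ev["ts"]
        end = start + ev.get("dur", 0)
        row["count"] += 1
        row["start"] = start if row["start"] is None else min(row["start"], start)
        row["end"] = end if row["end"] is None else max(row["end"], end)
        row["names"].add(ev.get("name"))
    return dict(rows)


# Made-up events that mimic the layout described above: same pid, two tids.
sample = [
    {"name": "forward_op", "ph": "X", "ts": 0, "dur": 10, "pid": "CPU functions", "tid": 1},
    {"name": "backward_op", "ph": "X", "ts": 20, "dur": 15, "pid": "CPU functions", "tid": 2},
    {"name": "step_op", "ph": "X", "ts": 40, "dur": 5, "pid": "CPU functions", "tid": 1},
]
summary = summarize_by_thread(sample)
for (pid, tid), row in sorted(summary.items(), key=lambda kv: str(kv[0])):
    print(pid, tid, row["count"], row["start"], row["end"], sorted(row["names"]))
```

For the real file, `summarize_by_thread(load_events("test.json"))` should give the same kind of summary. Rows that share a `pid` but differ in `tid` would suggest separate threads within one process rather than separate processes or CPUs.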

This seems like a very simple question, yet I cannot find a clear answer on the web. If there is a resource I am missing, please let me know.

Thank you!