Hi, please bear with this simple question. When using the PyTorch profiler, I find the device and host times confusing. Could anyone explain how each of them is defined?
For example, in this output the two vary quite a lot. How should I interpret the chart to identify the performance bottleneck?
Any help would be appreciated.