I just tried using the
There are several columns in the output:
Name | Self CPU % | Self CPU | CPU total % | CPU total | CPU time avg | Self CUDA | Self CUDA % | CUDA total | CUDA time avg | # of Calls
Where can I find an explanation of these columns? Also, if an operator involves GPU computation, does the time counted in CPU total overlap with the time in CUDA total?