Nvidia CUDA profiler is not able to profile certain code

Following this NVIDIA blog, it seems that the nvprof profiler is unable to profile unoptimized_cuda.cpp

Any advice?

What does the profiling output show if you profile the code?

nvprof --print-gpu-trace python train.py

See https://colab.research.google.com/drive/1V94-WC6Jf8M4Tj8iX3x5CuW93qSfuG6A#scrollTo=v-9VE8CK1vPy

==565== NVPROF is profiling process 565, command: python3 train.py
Files already downloaded and verified
Files already downloaded and verified
==565== Warning: Profiling results might be incorrect with current version of nvcc compiler used to compile cuda app. Compile with nvcc compiler 9.0 or later version to get correct profiling results. Ignore this warning if code is already compiled with the recommended nvcc version 
Epoch:[0][    0/196]	Loss 2.6561 (2.6561)	Acc@1  11.72 ( 11.72)	Acc@5  53.52 ( 53.52)
Epoch:[0][   10/196]	Loss 2.3622 (2.5195)	Acc@1  13.28 ( 11.75)	Acc@5  58.98 ( 54.40)
Epoch:[0][   20/196]	Loss 2.3080 (2.4316)	Acc@1  13.28 ( 12.31)	Acc@5  52.34 ( 55.25)
Epoch:[0][   30/196]	Loss 2.3827 (2.3952)	Acc@1  12.89 ( 12.17)	Acc@5  50.00 ( 55.77)
Epoch:[0][   40/196]	Loss 2.2848 (2.3696)	Acc@1  16.02 ( 12.60)	Acc@5  56.64 ( 55.98)
Epoch:[0][   50/196]	Loss 2.2625 (2.3500)	Acc@1  13.28 ( 12.82)	Acc@5  57.03 ( 56.44)
Epoch:[0][   60/196]	Loss 2.2893 (2.3396)	Acc@1  14.45 ( 12.90)	Acc@5  56.64 ( 56.59)
Epoch:[0][   70/196]	Loss 2.2458 (2.3307)	Acc@1  15.23 ( 13.04)	Acc@5  62.50 ( 57.00)
Epoch:[0][   80/196]	Loss 2.2450 (2.3217)	Acc@1  15.62 ( 13.18)	Acc@5  64.45 ( 57.38)
Epoch:[0][   90/196]	Loss 2.2953 (2.3150)	Acc@1  14.06 ( 13.19)	Acc@5  53.52 ( 57.65)
Epoch:[0][  100/196]	Loss 2.2742 (2.3094)	Acc@1  12.89 ( 13.36)	Acc@5  55.47 ( 57.95)
Epoch:[0][  110/196]	Loss 2.2674 (2.3049)	Acc@1  10.94 ( 13.37)	Acc@5  59.77 ( 58.09)
Epoch:[0][  120/196]	Loss 2.2868 (2.3047)	Acc@1  12.50 ( 13.23)	Acc@5  64.84 ( 58.14)
Epoch:[0][  130/196]	Loss 2.2845 (2.3025)	Acc@1  12.11 ( 13.24)	Acc@5  60.16 ( 58.16)
Epoch:[0][  140/196]	Loss 2.2811 (2.2997)	Acc@1   8.98 ( 13.28)	Acc@5  60.16 ( 58.16)
Epoch:[0][  150/196]	Loss 2.2415 (2.2970)	Acc@1  13.67 ( 13.36)	Acc@5  60.16 ( 58.33)
Epoch:[0][  160/196]	Loss 2.2664 (2.2950)	Acc@1  10.55 ( 13.38)	Acc@5  58.20 ( 58.35)
Epoch:[0][  170/196]	Loss 2.2792 (2.2932)	Acc@1  12.89 ( 13.40)	Acc@5  60.16 ( 58.45)
Epoch:[0][  180/196]	Loss 2.2615 (2.2911)	Acc@1  10.55 ( 13.43)	Acc@5  62.11 ( 58.50)
Epoch:[0][  190/196]	Loss 2.2503 (2.2893)	Acc@1  12.50 ( 13.51)	Acc@5  62.11 ( 58.58)
Valid:	Acc@1  15.00 ( 13.51)	Acc@5  55.00 ( 58.56)

Epoch:[1][    0/196]	Loss 2.2474 (2.2474)	Acc@1  12.89 ( 12.89)	Acc@5  58.59 ( 58.59)
Epoch:[1][   10/196]	Loss 2.2401 (2.2493)	Acc@1  15.23 ( 13.92)	Acc@5  60.16 ( 60.40)
Epoch:[1][   20/196]	Loss 2.2601 (2.2532)	Acc@1  14.06 ( 14.34)	Acc@5  60.94 ( 60.44)
Epoch:[1][   30/196]	Loss 2.2807 (2.2538)	Acc@1  12.89 ( 14.34)	Acc@5  57.03 ( 60.19)
Epoch:[1][   40/196]	Loss 2.2443 (2.2520)	Acc@1  13.67 ( 14.57)	Acc@5  61.72 ( 60.76)
Epoch:[1][   50/196]	Loss 2.2555 (2.2501)	Acc@1  18.36 ( 14.76)	Acc@5  64.84 ( 60.85)
Epoch:[1][   60/196]	Loss 2.2707 (2.2518)	Acc@1  12.50 ( 14.66)	Acc@5  60.16 ( 60.76)
Epoch:[1][   70/196]	Loss 2.2294 (2.2507)	Acc@1  18.75 ( 14.58)	Acc@5  63.28 ( 60.85)
Epoch:[1][   80/196]	Loss 2.2474 (2.2504)	Acc@1  12.11 ( 14.54)	Acc@5  64.84 ( 60.79)
Epoch:[1][   90/196]	Loss 2.2804 (2.2503)	Acc@1  12.50 ( 14.53)	Acc@5  57.81 ( 60.76)
Epoch:[1][  100/196]	Loss 2.2520 (2.2501)	Acc@1  14.84 ( 14.71)	Acc@5  58.98 ( 60.75)
Epoch:[1][  110/196]	Loss 2.2546 (2.2490)	Acc@1  12.89 ( 14.75)	Acc@5  59.38 ( 60.89)
Epoch:[1][  120/196]	Loss 2.2439 (2.2482)	Acc@1  14.84 ( 14.80)	Acc@5  64.84 ( 61.01)
Epoch:[1][  130/196]	Loss 2.2591 (2.2489)	Acc@1  12.11 ( 14.68)	Acc@5  59.38 ( 60.92)
Epoch:[1][  140/196]	Loss 2.2619 (2.2484)	Acc@1  11.72 ( 14.69)	Acc@5  57.81 ( 61.00)
Epoch:[1][  150/196]	Loss 2.2169 (2.2480)	Acc@1  16.41 ( 14.68)	Acc@5  61.33 ( 60.97)
Epoch:[1][  160/196]	Loss 2.2675 (2.2484)	Acc@1  12.11 ( 14.63)	Acc@5  57.81 ( 60.94)
Epoch:[1][  170/196]	Loss 2.2344 (2.2480)	Acc@1  17.58 ( 14.59)	Acc@5  64.45 ( 61.01)
Epoch:[1][  180/196]	Loss 2.2397 (2.2480)	Acc@1  11.33 ( 14.56)	Acc@5  65.23 ( 61.04)
Epoch:[1][  190/196]	Loss 2.2421 (2.2472)	Acc@1  18.36 ( 14.59)	Acc@5  63.67 ( 61.22)
Valid:	Acc@1  16.25 ( 14.59)	Acc@5  67.50 ( 61.23)

Epoch:[2][    0/196]	Loss 2.2174 (2.2174)	Acc@1  13.28 ( 13.28)	Acc@5  60.94 ( 60.94)
Epoch:[2][   10/196]	Loss 2.2603 (2.2238)	Acc@1  12.50 ( 14.35)	Acc@5  62.89 ( 64.52)
Epoch:[2][   20/196]	Loss 2.2073 (2.2253)	Acc@1  17.58 ( 14.36)	Acc@5  65.23 ( 64.83)
Epoch:[2][   30/196]	Loss 2.2392 (2.2297)	Acc@1  11.72 ( 14.00)	Acc@5  64.06 ( 64.36)
Epoch:[2][   40/196]	Loss 2.2196 (2.2283)	Acc@1  16.02 ( 14.11)	Acc@5  60.94 ( 64.22)
Epoch:[2][   50/196]	Loss 2.2230 (2.2252)	Acc@1  18.36 ( 14.16)	Acc@5  63.67 ( 64.43)
Epoch:[2][   60/196]	Loss 2.2560 (2.2271)	Acc@1  14.06 ( 14.11)	Acc@5  62.50 ( 64.11)
Epoch:[2][   70/196]	Loss 2.2240 (2.2267)	Acc@1  12.50 ( 14.19)	Acc@5  67.19 ( 64.36)
Epoch:[2][   80/196]	Loss 2.2233 (2.2262)	Acc@1  15.23 ( 14.05)	Acc@5  67.19 ( 64.42)
Epoch:[2][   90/196]	Loss 2.2691 (2.2256)	Acc@1  14.06 ( 14.08)	Acc@5  62.50 ( 64.61)
Epoch:[2][  100/196]	Loss 2.2128 (2.2230)	Acc@1  16.80 ( 14.19)	Acc@5  66.02 ( 64.94)
Epoch:[2][  110/196]	Loss 2.2231 (2.2226)	Acc@1  16.80 ( 14.21)	Acc@5  67.58 ( 65.04)
Epoch:[2][  120/196]	Loss 2.2471 (2.2224)	Acc@1  16.02 ( 14.29)	Acc@5  67.19 ( 65.15)
Epoch:[2][  130/196]	Loss 2.2120 (2.2234)	Acc@1  14.06 ( 14.26)	Acc@5  66.80 ( 65.10)
Epoch:[2][  140/196]	Loss 2.1905 (2.2223)	Acc@1  14.84 ( 14.32)	Acc@5  68.36 ( 65.14)
Epoch:[2][  150/196]	Loss 2.1803 (2.2222)	Acc@1  17.58 ( 14.26)	Acc@5  69.53 ( 65.14)
Epoch:[2][  160/196]	Loss 2.2102 (2.2217)	Acc@1  10.94 ( 14.32)	Acc@5  67.97 ( 65.16)
Epoch:[2][  170/196]	Loss 2.1830 (2.2210)	Acc@1  17.97 ( 14.34)	Acc@5  71.09 ( 65.23)
Epoch:[2][  180/196]	Loss 2.2337 (2.2208)	Acc@1  12.89 ( 14.27)	Acc@5  60.16 ( 65.23)
Epoch:[2][  190/196]	Loss 2.2121 (2.2201)	Acc@1  13.67 ( 14.23)	Acc@5  65.62 ( 65.28)
Valid:	Acc@1  13.75 ( 14.28)	Acc@5  62.50 ( 65.29)

Epoch:[3][    0/196]	Loss 2.2125 (2.2125)	Acc@1  15.23 ( 15.23)	Acc@5  62.89 ( 62.89)
Epoch:[3][   10/196]	Loss 2.2045 (2.1961)	Acc@1  12.89 ( 15.06)	Acc@5  65.62 ( 66.73)
Epoch:[3][   20/196]	Loss 2.2089 (2.2014)	Acc@1  17.19 ( 15.55)	Acc@5  67.58 ( 66.18)
Epoch:[3][   30/196]	Loss 2.2160 (2.2043)	Acc@1  10.94 ( 15.07)	Acc@5  63.67 ( 65.81)
Epoch:[3][   40/196]	Loss 2.2225 (2.2042)	Acc@1  16.02 ( 15.13)	Acc@5  65.62 ( 65.95)
Epoch:[3][   50/196]	Loss 2.2464 (2.2044)	Acc@1  13.67 ( 14.87)	Acc@5  63.67 ( 65.92)
Epoch:[3][   60/196]	Loss 2.2477 (2.2076)	Acc@1  11.72 ( 14.74)	Acc@5  66.41 ( 65.64)
Epoch:[3][   70/196]	Loss 2.2266 (2.2075)	Acc@1  12.50 ( 14.61)	Acc@5  61.33 ( 65.64)
Epoch:[3][   80/196]	Loss 2.1900 (2.2094)	Acc@1  17.58 ( 14.28)	Acc@5  68.75 ( 65.40)
Epoch:[3][   90/196]	Loss 2.2321 (2.2095)	Acc@1  12.50 ( 14.30)	Acc@5  65.23 ( 65.43)
Epoch:[3][  100/196]	Loss 2.2230 (2.2088)	Acc@1  14.84 ( 14.41)	Acc@5  62.89 ( 65.40)
Epoch:[3][  110/196]	Loss 2.1826 (2.2078)	Acc@1  17.58 ( 14.41)	Acc@5  68.75 ( 65.51)
Epoch:[3][  120/196]	Loss 2.1869 (2.2064)	Acc@1  14.84 ( 14.52)	Acc@5  66.80 ( 65.66)
Epoch:[3][  130/196]	Loss 2.1816 (2.2070)	Acc@1  14.45 ( 14.50)	Acc@5  67.19 ( 65.56)
Epoch:[3][  140/196]	Loss 2.1712 (2.2074)	Acc@1  14.84 ( 14.53)	Acc@5  64.06 ( 65.45)
Epoch:[3][  150/196]	Loss 2.1949 (2.2081)	Acc@1  16.41 ( 14.50)	Acc@5  68.75 ( 65.34)
Epoch:[3][  160/196]	Loss 2.2323 (2.2078)	Acc@1  11.72 ( 14.55)	Acc@5  58.98 ( 65.36)
Epoch:[3][  170/196]	Loss 2.2053 (2.2076)	Acc@1  13.28 ( 14.52)	Acc@5  62.89 ( 65.29)
Epoch:[3][  180/196]	Loss 2.2092 (2.2074)	Acc@1  11.33 ( 14.52)	Acc@5  64.45 ( 65.30)
Epoch:[3][  190/196]	Loss 2.1581 (2.2061)	Acc@1  16.80 ( 14.56)	Acc@5  66.41 ( 65.39)
Valid:	Acc@1  12.50 ( 14.62)	Acc@5  62.50 ( 65.42)

Epoch:[4][    0/196]	Loss 2.2185 (2.2185)	Acc@1  16.41 ( 16.41)	Acc@5  65.23 ( 65.23)
Epoch:[4][   10/196]	Loss 2.1946 (2.1780)	Acc@1  14.84 ( 16.23)	Acc@5  67.58 ( 67.72)
Epoch:[4][   20/196]	Loss 2.1826 (2.1765)	Acc@1  13.67 ( 15.23)	Acc@5  69.53 ( 67.71)
Epoch:[4][   30/196]	Loss 2.1741 (2.1804)	Acc@1  14.45 ( 15.01)	Acc@5  66.80 ( 67.75)
Epoch:[4][   40/196]	Loss 2.1733 (2.1816)	Acc@1  19.92 ( 15.46)	Acc@5  69.14 ( 67.59)
Epoch:[4][   50/196]	Loss 2.1708 (2.1816)	Acc@1  15.62 ( 15.53)	Acc@5  68.75 ( 67.16)
Epoch:[4][   60/196]	Loss 2.2342 (2.1827)	Acc@1  12.50 ( 15.55)	Acc@5  63.67 ( 66.88)
Epoch:[4][   70/196]	Loss 2.1912 (2.1833)	Acc@1  18.75 ( 15.54)	Acc@5  65.23 ( 66.84)
Epoch:[4][   80/196]	Loss 2.1691 (2.1820)	Acc@1  15.62 ( 15.59)	Acc@5  66.41 ( 66.83)
Epoch:[4][   90/196]	Loss 2.1957 (2.1810)	Acc@1  12.89 ( 15.64)	Acc@5  64.06 ( 66.85)
Epoch:[4][  100/196]	Loss 2.1498 (2.1797)	Acc@1  18.36 ( 15.78)	Acc@5  71.88 ( 66.84)
Epoch:[4][  110/196]	Loss 2.2000 (2.1792)	Acc@1  16.02 ( 15.80)	Acc@5  63.67 ( 66.73)
Epoch:[4][  120/196]	Loss 2.2101 (2.1793)	Acc@1  15.23 ( 15.84)	Acc@5  65.23 ( 66.83)
Epoch:[4][  130/196]	Loss 2.1756 (2.1796)	Acc@1  14.84 ( 16.00)	Acc@5  64.84 ( 66.70)
Epoch:[4][  140/196]	Loss 2.2108 (2.1792)	Acc@1  16.02 ( 16.05)	Acc@5  65.62 ( 66.64)
Epoch:[4][  150/196]	Loss 2.1349 (2.1793)	Acc@1  19.92 ( 16.09)	Acc@5  63.28 ( 66.54)
Epoch:[4][  160/196]	Loss 2.2058 (2.1793)	Acc@1  14.06 ( 16.11)	Acc@5  66.80 ( 66.63)
Epoch:[4][  170/196]	Loss 2.1717 (2.1791)	Acc@1  15.23 ( 16.14)	Acc@5  67.97 ( 66.66)
Epoch:[4][  180/196]	Loss 2.1978 (2.1788)	Acc@1  16.02 ( 16.11)	Acc@5  66.41 ( 66.60)
Epoch:[4][  190/196]	Loss 2.1366 (2.1780)	Acc@1  18.36 ( 16.19)	Acc@5  68.36 ( 66.63)
Valid:	Acc@1  15.00 ( 16.22)	Acc@5  63.75 ( 66.65)

Epoch:[5][    0/196]	Loss 2.1741 (2.1741)	Acc@1  18.36 ( 18.36)	Acc@5  69.14 ( 69.14)
Epoch:[5][   10/196]	Loss 2.1512 (2.1490)	Acc@1  17.19 ( 18.22)	Acc@5  69.14 ( 68.86)
Epoch:[5][   20/196]	Loss 2.1548 (2.1540)	Acc@1  19.14 ( 18.36)	Acc@5  66.80 ( 67.86)
Epoch:[5][   30/196]	Loss 2.1510 (2.1594)	Acc@1  14.84 ( 18.03)	Acc@5  68.36 ( 67.54)
Epoch:[5][   40/196]	Loss 2.1936 (2.1605)	Acc@1  18.36 ( 18.11)	Acc@5  61.33 ( 67.33)
Epoch:[5][   50/196]	Loss 2.1379 (2.1586)	Acc@1  17.58 ( 17.99)	Acc@5  71.88 ( 67.42)
Epoch:[5][   60/196]	Loss 2.2033 (2.1615)	Acc@1  14.06 ( 17.87)	Acc@5  63.28 ( 67.28)
Epoch:[5][   70/196]	Loss 2.1947 (2.1634)	Acc@1  17.58 ( 17.79)	Acc@5  61.72 ( 67.15)
Epoch:[5][   80/196]	Loss 2.1627 (2.1623)	Acc@1  18.36 ( 17.69)	Acc@5  69.92 ( 67.29)
Epoch:[5][   90/196]	Loss 2.1762 (2.1610)	Acc@1  19.53 ( 17.71)	Acc@5  66.80 ( 67.39)
Epoch:[5][  100/196]	Loss 2.1425 (2.1585)	Acc@1  20.31 ( 17.74)	Acc@5  68.75 ( 67.50)
Epoch:[5][  110/196]	Loss 2.1684 (2.1587)	Acc@1  18.36 ( 17.64)	Acc@5  65.62 ( 67.45)
Epoch:[5][  120/196]	Loss 2.1651 (2.1576)	Acc@1  14.84 ( 17.61)	Acc@5  68.75 ( 67.67)
Epoch:[5][  130/196]	Loss 2.1759 (2.1580)	Acc@1  17.19 ( 17.64)	Acc@5  69.53 ( 67.71)
Epoch:[5][  140/196]	Loss 2.1900 (2.1565)	Acc@1  15.23 ( 17.64)	Acc@5  62.89 ( 67.84)
Epoch:[5][  150/196]	Loss 2.1118 (2.1552)	Acc@1  20.70 ( 17.68)	Acc@5  70.31 ( 67.95)
Epoch:[5][  160/196]	Loss 2.1644 (2.1553)	Acc@1  14.84 ( 17.64)	Acc@5  69.92 ( 67.97)
Epoch:[5][  170/196]	Loss 2.1594 (2.1549)	Acc@1  17.19 ( 17.66)	Acc@5  73.44 ( 68.06)
Epoch:[5][  180/196]	Loss 2.1276 (2.1546)	Acc@1  20.70 ( 17.63)	Acc@5  70.31 ( 68.09)
Epoch:[5][  190/196]	Loss 2.1128 (2.1532)	Acc@1  19.53 ( 17.69)	Acc@5  71.09 ( 68.20)
Valid:	Acc@1  18.75 ( 17.73)	Acc@5  65.00 ( 68.27)

It seems your notebook timed out after 93 epochs:

Epoch:[93][   50/196]	Loss 1.7159 (1.6831)	Acc@1  33.20 ( 37.65)	Acc@5  87.50 ( 88.55)

If you are profiling code, make sure the code exits in a reasonable amount of time.
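One way to keep a profiling run short is to cap the number of batches the script processes before it returns. The following is only a minimal sketch: `run_short`, `step_fn`, and `max_batches` are hypothetical names, not part of train.py, and the `range(196)` loader is a stand-in for the 196 batches per epoch seen in the log above.

```python
import itertools


def run_short(batches, step_fn, max_batches=20):
    """Call step_fn on at most max_batches items, then return so the script can exit."""
    done = 0
    for batch in itertools.islice(batches, max_batches):
        step_fn(batch)
        done += 1
    return done


if __name__ == "__main__":
    # Stand-in for a DataLoader with 196 batches per epoch (assumption for illustration).
    fake_loader = range(196)
    seen = []
    n = run_short(fake_loader, seen.append, max_batches=20)
    print(f"ran {n} batches, then stopped")
```

With a cap like this the process ends after a handful of iterations, so nvprof gets the chance to write its summary instead of the notebook timing out first.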

I suppose the profiler is profiling the code for each and every epoch?

The profiler can be started and stopped from inside the code via torch.cuda.cudart().cudaProfilerStart() and torch.cuda.cudart().cudaProfilerStop(), respectively, or it can profile the complete script. Once the script finishes, nvprof will create the profiling output. If you are profiling the complete script (for 93+ epochs), the output will be large and might take some time to generate.
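As a rough illustration of the start/stop approach, the sketch below wraps the two cudart calls in a context manager and profiles only a few iterations after a warm-up. The names `profiler_window`, `train`, `warmup_iters`, and `profile_iters` are hypothetical and not taken from train.py; the fallback to a no-op when PyTorch or CUDA is missing is an assumption added so the sketch runs anywhere.

```python
import contextlib

try:
    import torch
    _HAVE_CUDA = torch.cuda.is_available()
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None
    _HAVE_CUDA = False


@contextlib.contextmanager
def profiler_window():
    """Mark the region the profiler should record; does nothing without CUDA."""
    if _HAVE_CUDA:
        torch.cuda.cudart().cudaProfilerStart()
    try:
        yield
    finally:
        if _HAVE_CUDA:
            # Wait for queued kernels so they fall inside the recorded window.
            torch.cuda.synchronize()
            torch.cuda.cudart().cudaProfilerStop()


def train(step_fn, warmup_iters=10, profile_iters=5):
    """Run warm-up iterations unprofiled, then a short profiled window."""
    for i in range(warmup_iters):
        step_fn(i)
    profiled = []
    with profiler_window():
        for i in range(warmup_iters, warmup_iters + profile_iters):
            step_fn(i)
            profiled.append(i)
    return profiled


if __name__ == "__main__":
    # step_fn stands in for one forward/backward pass on a batch.
    print(train(lambda i: None, warmup_iters=2, profile_iters=3))
```

For the start/stop calls to limit what is collected, nvprof would be launched with profiling initially disabled, for example `nvprof --profile-from-start off --print-gpu-trace python train.py`; otherwise it records from the beginning of the process.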