Following this NVIDIA blog post, it seems that the nvprof profiler is unable to profile unoptimized_cuda.cpp.
Any advice?
What does the profiling output show if you profile the code?
nvprof --print-gpu-trace python train.py
See https://colab.research.google.com/drive/1V94-WC6Jf8M4Tj8iX3x5CuW93qSfuG6A#scrollTo=v-9VE8CK1vPy
==565== NVPROF is profiling process 565, command: python3 train.py
Files already downloaded and verified
Files already downloaded and verified
==565== Warning: Profiling results might be incorrect with current version of nvcc compiler used to compile cuda app. Compile with nvcc compiler 9.0 or later version to get correct profiling results. Ignore this warning if code is already compiled with the recommended nvcc version
Epoch:[0][ 0/196] Loss 2.6561 (2.6561) Acc@1 11.72 ( 11.72) Acc@5 53.52 ( 53.52)
Epoch:[0][ 10/196] Loss 2.3622 (2.5195) Acc@1 13.28 ( 11.75) Acc@5 58.98 ( 54.40)
Epoch:[0][ 20/196] Loss 2.3080 (2.4316) Acc@1 13.28 ( 12.31) Acc@5 52.34 ( 55.25)
Epoch:[0][ 30/196] Loss 2.3827 (2.3952) Acc@1 12.89 ( 12.17) Acc@5 50.00 ( 55.77)
Epoch:[0][ 40/196] Loss 2.2848 (2.3696) Acc@1 16.02 ( 12.60) Acc@5 56.64 ( 55.98)
Epoch:[0][ 50/196] Loss 2.2625 (2.3500) Acc@1 13.28 ( 12.82) Acc@5 57.03 ( 56.44)
Epoch:[0][ 60/196] Loss 2.2893 (2.3396) Acc@1 14.45 ( 12.90) Acc@5 56.64 ( 56.59)
Epoch:[0][ 70/196] Loss 2.2458 (2.3307) Acc@1 15.23 ( 13.04) Acc@5 62.50 ( 57.00)
Epoch:[0][ 80/196] Loss 2.2450 (2.3217) Acc@1 15.62 ( 13.18) Acc@5 64.45 ( 57.38)
Epoch:[0][ 90/196] Loss 2.2953 (2.3150) Acc@1 14.06 ( 13.19) Acc@5 53.52 ( 57.65)
Epoch:[0][ 100/196] Loss 2.2742 (2.3094) Acc@1 12.89 ( 13.36) Acc@5 55.47 ( 57.95)
Epoch:[0][ 110/196] Loss 2.2674 (2.3049) Acc@1 10.94 ( 13.37) Acc@5 59.77 ( 58.09)
Epoch:[0][ 120/196] Loss 2.2868 (2.3047) Acc@1 12.50 ( 13.23) Acc@5 64.84 ( 58.14)
Epoch:[0][ 130/196] Loss 2.2845 (2.3025) Acc@1 12.11 ( 13.24) Acc@5 60.16 ( 58.16)
Epoch:[0][ 140/196] Loss 2.2811 (2.2997) Acc@1 8.98 ( 13.28) Acc@5 60.16 ( 58.16)
Epoch:[0][ 150/196] Loss 2.2415 (2.2970) Acc@1 13.67 ( 13.36) Acc@5 60.16 ( 58.33)
Epoch:[0][ 160/196] Loss 2.2664 (2.2950) Acc@1 10.55 ( 13.38) Acc@5 58.20 ( 58.35)
Epoch:[0][ 170/196] Loss 2.2792 (2.2932) Acc@1 12.89 ( 13.40) Acc@5 60.16 ( 58.45)
Epoch:[0][ 180/196] Loss 2.2615 (2.2911) Acc@1 10.55 ( 13.43) Acc@5 62.11 ( 58.50)
Epoch:[0][ 190/196] Loss 2.2503 (2.2893) Acc@1 12.50 ( 13.51) Acc@5 62.11 ( 58.58)
Valid: Acc@1 15.00 ( 13.51) Acc@5 55.00 ( 58.56)
Epoch:[1][ 0/196] Loss 2.2474 (2.2474) Acc@1 12.89 ( 12.89) Acc@5 58.59 ( 58.59)
Epoch:[1][ 10/196] Loss 2.2401 (2.2493) Acc@1 15.23 ( 13.92) Acc@5 60.16 ( 60.40)
Epoch:[1][ 20/196] Loss 2.2601 (2.2532) Acc@1 14.06 ( 14.34) Acc@5 60.94 ( 60.44)
Epoch:[1][ 30/196] Loss 2.2807 (2.2538) Acc@1 12.89 ( 14.34) Acc@5 57.03 ( 60.19)
Epoch:[1][ 40/196] Loss 2.2443 (2.2520) Acc@1 13.67 ( 14.57) Acc@5 61.72 ( 60.76)
Epoch:[1][ 50/196] Loss 2.2555 (2.2501) Acc@1 18.36 ( 14.76) Acc@5 64.84 ( 60.85)
Epoch:[1][ 60/196] Loss 2.2707 (2.2518) Acc@1 12.50 ( 14.66) Acc@5 60.16 ( 60.76)
Epoch:[1][ 70/196] Loss 2.2294 (2.2507) Acc@1 18.75 ( 14.58) Acc@5 63.28 ( 60.85)
Epoch:[1][ 80/196] Loss 2.2474 (2.2504) Acc@1 12.11 ( 14.54) Acc@5 64.84 ( 60.79)
Epoch:[1][ 90/196] Loss 2.2804 (2.2503) Acc@1 12.50 ( 14.53) Acc@5 57.81 ( 60.76)
Epoch:[1][ 100/196] Loss 2.2520 (2.2501) Acc@1 14.84 ( 14.71) Acc@5 58.98 ( 60.75)
Epoch:[1][ 110/196] Loss 2.2546 (2.2490) Acc@1 12.89 ( 14.75) Acc@5 59.38 ( 60.89)
Epoch:[1][ 120/196] Loss 2.2439 (2.2482) Acc@1 14.84 ( 14.80) Acc@5 64.84 ( 61.01)
Epoch:[1][ 130/196] Loss 2.2591 (2.2489) Acc@1 12.11 ( 14.68) Acc@5 59.38 ( 60.92)
Epoch:[1][ 140/196] Loss 2.2619 (2.2484) Acc@1 11.72 ( 14.69) Acc@5 57.81 ( 61.00)
Epoch:[1][ 150/196] Loss 2.2169 (2.2480) Acc@1 16.41 ( 14.68) Acc@5 61.33 ( 60.97)
Epoch:[1][ 160/196] Loss 2.2675 (2.2484) Acc@1 12.11 ( 14.63) Acc@5 57.81 ( 60.94)
Epoch:[1][ 170/196] Loss 2.2344 (2.2480) Acc@1 17.58 ( 14.59) Acc@5 64.45 ( 61.01)
Epoch:[1][ 180/196] Loss 2.2397 (2.2480) Acc@1 11.33 ( 14.56) Acc@5 65.23 ( 61.04)
Epoch:[1][ 190/196] Loss 2.2421 (2.2472) Acc@1 18.36 ( 14.59) Acc@5 63.67 ( 61.22)
Valid: Acc@1 16.25 ( 14.59) Acc@5 67.50 ( 61.23)
Epoch:[2][ 0/196] Loss 2.2174 (2.2174) Acc@1 13.28 ( 13.28) Acc@5 60.94 ( 60.94)
Epoch:[2][ 10/196] Loss 2.2603 (2.2238) Acc@1 12.50 ( 14.35) Acc@5 62.89 ( 64.52)
Epoch:[2][ 20/196] Loss 2.2073 (2.2253) Acc@1 17.58 ( 14.36) Acc@5 65.23 ( 64.83)
Epoch:[2][ 30/196] Loss 2.2392 (2.2297) Acc@1 11.72 ( 14.00) Acc@5 64.06 ( 64.36)
Epoch:[2][ 40/196] Loss 2.2196 (2.2283) Acc@1 16.02 ( 14.11) Acc@5 60.94 ( 64.22)
Epoch:[2][ 50/196] Loss 2.2230 (2.2252) Acc@1 18.36 ( 14.16) Acc@5 63.67 ( 64.43)
Epoch:[2][ 60/196] Loss 2.2560 (2.2271) Acc@1 14.06 ( 14.11) Acc@5 62.50 ( 64.11)
Epoch:[2][ 70/196] Loss 2.2240 (2.2267) Acc@1 12.50 ( 14.19) Acc@5 67.19 ( 64.36)
Epoch:[2][ 80/196] Loss 2.2233 (2.2262) Acc@1 15.23 ( 14.05) Acc@5 67.19 ( 64.42)
Epoch:[2][ 90/196] Loss 2.2691 (2.2256) Acc@1 14.06 ( 14.08) Acc@5 62.50 ( 64.61)
Epoch:[2][ 100/196] Loss 2.2128 (2.2230) Acc@1 16.80 ( 14.19) Acc@5 66.02 ( 64.94)
Epoch:[2][ 110/196] Loss 2.2231 (2.2226) Acc@1 16.80 ( 14.21) Acc@5 67.58 ( 65.04)
Epoch:[2][ 120/196] Loss 2.2471 (2.2224) Acc@1 16.02 ( 14.29) Acc@5 67.19 ( 65.15)
Epoch:[2][ 130/196] Loss 2.2120 (2.2234) Acc@1 14.06 ( 14.26) Acc@5 66.80 ( 65.10)
Epoch:[2][ 140/196] Loss 2.1905 (2.2223) Acc@1 14.84 ( 14.32) Acc@5 68.36 ( 65.14)
Epoch:[2][ 150/196] Loss 2.1803 (2.2222) Acc@1 17.58 ( 14.26) Acc@5 69.53 ( 65.14)
Epoch:[2][ 160/196] Loss 2.2102 (2.2217) Acc@1 10.94 ( 14.32) Acc@5 67.97 ( 65.16)
Epoch:[2][ 170/196] Loss 2.1830 (2.2210) Acc@1 17.97 ( 14.34) Acc@5 71.09 ( 65.23)
Epoch:[2][ 180/196] Loss 2.2337 (2.2208) Acc@1 12.89 ( 14.27) Acc@5 60.16 ( 65.23)
Epoch:[2][ 190/196] Loss 2.2121 (2.2201) Acc@1 13.67 ( 14.23) Acc@5 65.62 ( 65.28)
Valid: Acc@1 13.75 ( 14.28) Acc@5 62.50 ( 65.29)
Epoch:[3][ 0/196] Loss 2.2125 (2.2125) Acc@1 15.23 ( 15.23) Acc@5 62.89 ( 62.89)
Epoch:[3][ 10/196] Loss 2.2045 (2.1961) Acc@1 12.89 ( 15.06) Acc@5 65.62 ( 66.73)
Epoch:[3][ 20/196] Loss 2.2089 (2.2014) Acc@1 17.19 ( 15.55) Acc@5 67.58 ( 66.18)
Epoch:[3][ 30/196] Loss 2.2160 (2.2043) Acc@1 10.94 ( 15.07) Acc@5 63.67 ( 65.81)
Epoch:[3][ 40/196] Loss 2.2225 (2.2042) Acc@1 16.02 ( 15.13) Acc@5 65.62 ( 65.95)
Epoch:[3][ 50/196] Loss 2.2464 (2.2044) Acc@1 13.67 ( 14.87) Acc@5 63.67 ( 65.92)
Epoch:[3][ 60/196] Loss 2.2477 (2.2076) Acc@1 11.72 ( 14.74) Acc@5 66.41 ( 65.64)
Epoch:[3][ 70/196] Loss 2.2266 (2.2075) Acc@1 12.50 ( 14.61) Acc@5 61.33 ( 65.64)
Epoch:[3][ 80/196] Loss 2.1900 (2.2094) Acc@1 17.58 ( 14.28) Acc@5 68.75 ( 65.40)
Epoch:[3][ 90/196] Loss 2.2321 (2.2095) Acc@1 12.50 ( 14.30) Acc@5 65.23 ( 65.43)
Epoch:[3][ 100/196] Loss 2.2230 (2.2088) Acc@1 14.84 ( 14.41) Acc@5 62.89 ( 65.40)
Epoch:[3][ 110/196] Loss 2.1826 (2.2078) Acc@1 17.58 ( 14.41) Acc@5 68.75 ( 65.51)
Epoch:[3][ 120/196] Loss 2.1869 (2.2064) Acc@1 14.84 ( 14.52) Acc@5 66.80 ( 65.66)
Epoch:[3][ 130/196] Loss 2.1816 (2.2070) Acc@1 14.45 ( 14.50) Acc@5 67.19 ( 65.56)
Epoch:[3][ 140/196] Loss 2.1712 (2.2074) Acc@1 14.84 ( 14.53) Acc@5 64.06 ( 65.45)
Epoch:[3][ 150/196] Loss 2.1949 (2.2081) Acc@1 16.41 ( 14.50) Acc@5 68.75 ( 65.34)
Epoch:[3][ 160/196] Loss 2.2323 (2.2078) Acc@1 11.72 ( 14.55) Acc@5 58.98 ( 65.36)
Epoch:[3][ 170/196] Loss 2.2053 (2.2076) Acc@1 13.28 ( 14.52) Acc@5 62.89 ( 65.29)
Epoch:[3][ 180/196] Loss 2.2092 (2.2074) Acc@1 11.33 ( 14.52) Acc@5 64.45 ( 65.30)
Epoch:[3][ 190/196] Loss 2.1581 (2.2061) Acc@1 16.80 ( 14.56) Acc@5 66.41 ( 65.39)
Valid: Acc@1 12.50 ( 14.62) Acc@5 62.50 ( 65.42)
Epoch:[4][ 0/196] Loss 2.2185 (2.2185) Acc@1 16.41 ( 16.41) Acc@5 65.23 ( 65.23)
Epoch:[4][ 10/196] Loss 2.1946 (2.1780) Acc@1 14.84 ( 16.23) Acc@5 67.58 ( 67.72)
Epoch:[4][ 20/196] Loss 2.1826 (2.1765) Acc@1 13.67 ( 15.23) Acc@5 69.53 ( 67.71)
Epoch:[4][ 30/196] Loss 2.1741 (2.1804) Acc@1 14.45 ( 15.01) Acc@5 66.80 ( 67.75)
Epoch:[4][ 40/196] Loss 2.1733 (2.1816) Acc@1 19.92 ( 15.46) Acc@5 69.14 ( 67.59)
Epoch:[4][ 50/196] Loss 2.1708 (2.1816) Acc@1 15.62 ( 15.53) Acc@5 68.75 ( 67.16)
Epoch:[4][ 60/196] Loss 2.2342 (2.1827) Acc@1 12.50 ( 15.55) Acc@5 63.67 ( 66.88)
Epoch:[4][ 70/196] Loss 2.1912 (2.1833) Acc@1 18.75 ( 15.54) Acc@5 65.23 ( 66.84)
Epoch:[4][ 80/196] Loss 2.1691 (2.1820) Acc@1 15.62 ( 15.59) Acc@5 66.41 ( 66.83)
Epoch:[4][ 90/196] Loss 2.1957 (2.1810) Acc@1 12.89 ( 15.64) Acc@5 64.06 ( 66.85)
Epoch:[4][ 100/196] Loss 2.1498 (2.1797) Acc@1 18.36 ( 15.78) Acc@5 71.88 ( 66.84)
Epoch:[4][ 110/196] Loss 2.2000 (2.1792) Acc@1 16.02 ( 15.80) Acc@5 63.67 ( 66.73)
Epoch:[4][ 120/196] Loss 2.2101 (2.1793) Acc@1 15.23 ( 15.84) Acc@5 65.23 ( 66.83)
Epoch:[4][ 130/196] Loss 2.1756 (2.1796) Acc@1 14.84 ( 16.00) Acc@5 64.84 ( 66.70)
Epoch:[4][ 140/196] Loss 2.2108 (2.1792) Acc@1 16.02 ( 16.05) Acc@5 65.62 ( 66.64)
Epoch:[4][ 150/196] Loss 2.1349 (2.1793) Acc@1 19.92 ( 16.09) Acc@5 63.28 ( 66.54)
Epoch:[4][ 160/196] Loss 2.2058 (2.1793) Acc@1 14.06 ( 16.11) Acc@5 66.80 ( 66.63)
Epoch:[4][ 170/196] Loss 2.1717 (2.1791) Acc@1 15.23 ( 16.14) Acc@5 67.97 ( 66.66)
Epoch:[4][ 180/196] Loss 2.1978 (2.1788) Acc@1 16.02 ( 16.11) Acc@5 66.41 ( 66.60)
Epoch:[4][ 190/196] Loss 2.1366 (2.1780) Acc@1 18.36 ( 16.19) Acc@5 68.36 ( 66.63)
Valid: Acc@1 15.00 ( 16.22) Acc@5 63.75 ( 66.65)
Epoch:[5][ 0/196] Loss 2.1741 (2.1741) Acc@1 18.36 ( 18.36) Acc@5 69.14 ( 69.14)
Epoch:[5][ 10/196] Loss 2.1512 (2.1490) Acc@1 17.19 ( 18.22) Acc@5 69.14 ( 68.86)
Epoch:[5][ 20/196] Loss 2.1548 (2.1540) Acc@1 19.14 ( 18.36) Acc@5 66.80 ( 67.86)
Epoch:[5][ 30/196] Loss 2.1510 (2.1594) Acc@1 14.84 ( 18.03) Acc@5 68.36 ( 67.54)
Epoch:[5][ 40/196] Loss 2.1936 (2.1605) Acc@1 18.36 ( 18.11) Acc@5 61.33 ( 67.33)
Epoch:[5][ 50/196] Loss 2.1379 (2.1586) Acc@1 17.58 ( 17.99) Acc@5 71.88 ( 67.42)
Epoch:[5][ 60/196] Loss 2.2033 (2.1615) Acc@1 14.06 ( 17.87) Acc@5 63.28 ( 67.28)
Epoch:[5][ 70/196] Loss 2.1947 (2.1634) Acc@1 17.58 ( 17.79) Acc@5 61.72 ( 67.15)
Epoch:[5][ 80/196] Loss 2.1627 (2.1623) Acc@1 18.36 ( 17.69) Acc@5 69.92 ( 67.29)
Epoch:[5][ 90/196] Loss 2.1762 (2.1610) Acc@1 19.53 ( 17.71) Acc@5 66.80 ( 67.39)
Epoch:[5][ 100/196] Loss 2.1425 (2.1585) Acc@1 20.31 ( 17.74) Acc@5 68.75 ( 67.50)
Epoch:[5][ 110/196] Loss 2.1684 (2.1587) Acc@1 18.36 ( 17.64) Acc@5 65.62 ( 67.45)
Epoch:[5][ 120/196] Loss 2.1651 (2.1576) Acc@1 14.84 ( 17.61) Acc@5 68.75 ( 67.67)
Epoch:[5][ 130/196] Loss 2.1759 (2.1580) Acc@1 17.19 ( 17.64) Acc@5 69.53 ( 67.71)
Epoch:[5][ 140/196] Loss 2.1900 (2.1565) Acc@1 15.23 ( 17.64) Acc@5 62.89 ( 67.84)
Epoch:[5][ 150/196] Loss 2.1118 (2.1552) Acc@1 20.70 ( 17.68) Acc@5 70.31 ( 67.95)
Epoch:[5][ 160/196] Loss 2.1644 (2.1553) Acc@1 14.84 ( 17.64) Acc@5 69.92 ( 67.97)
Epoch:[5][ 170/196] Loss 2.1594 (2.1549) Acc@1 17.19 ( 17.66) Acc@5 73.44 ( 68.06)
Epoch:[5][ 180/196] Loss 2.1276 (2.1546) Acc@1 20.70 ( 17.63) Acc@5 70.31 ( 68.09)
Epoch:[5][ 190/196] Loss 2.1128 (2.1532) Acc@1 19.53 ( 17.69) Acc@5 71.09 ( 68.20)
Valid: Acc@1 18.75 ( 17.73) Acc@5 65.00 ( 68.27)
It seems your notebook timed out after 93 epochs:
Epoch:[93][ 50/196] Loss 1.7159 (1.6831) Acc@1 33.20 ( 37.65) Acc@5 87.50 ( 88.55)
If you are profiling code, make sure the code exits in a reasonable amount of time.
I suppose the profiler is profiling the code for each and every epoch?
The profiler can be started and stopped from inside the code via torch.cuda.cudart().cudaProfilerStart() and torch.cuda.cudart().cudaProfilerStop(), respectively, or it can profile the complete script. Once the script finishes, nvprof will create the profiling output. If you are profiling the complete script (for 93+ epochs), the output will be large and might of course take some time to generate.
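As a rough illustration of the start/stop approach, here is a minimal sketch that profiles only a small window of iterations instead of the whole run. The helper names run_profiled and cuda_profiler_hooks, as well as the iteration numbers, are hypothetical and not taken from the linked notebook; only the torch.cuda.cudart().cudaProfilerStart() and cudaProfilerStop() calls come from the suggestion above. The dry run at the bottom uses stand-in callables so it also works without a GPU.

```python
def run_profiled(train_step, num_iters, start_iter, stop_iter,
                 profiler_start, profiler_stop):
    """Run num_iters steps, profiling only iterations in [start_iter, stop_iter)."""
    if not 0 <= start_iter < stop_iter <= num_iters:
        raise ValueError("need 0 <= start_iter < stop_iter <= num_iters")
    for i in range(num_iters):
        if i == start_iter:
            profiler_start()   # begin collecting profiling data
        train_step(i)
        if i == stop_iter - 1:
            profiler_stop()    # stop collecting after the last profiled step


def cuda_profiler_hooks():
    """Return (start, stop) callables backed by the CUDA profiler API.

    Assumes PyTorch with CUDA is available; torch is imported lazily so the
    rest of this sketch can be tried on a machine without a GPU.
    """
    import torch

    def start():
        torch.cuda.synchronize()  # do not attribute earlier queued work to the window
        torch.cuda.cudart().cudaProfilerStart()

    def stop():
        torch.cuda.synchronize()  # let the profiled kernels finish first
        torch.cuda.cudart().cudaProfilerStop()

    return start, stop


# Dry run with stand-in callables (hypothetical), just to show the ordering.
events = []
run_profiled(
    train_step=lambda i: events.append(f"step {i}"),
    num_iters=5,
    start_iter=1,
    stop_iter=3,
    profiler_start=lambda: events.append("start"),
    profiler_stop=lambda: events.append("stop"),
)
print(events)
# → ['step 0', 'start', 'step 1', 'step 2', 'stop', 'step 3', 'step 4']
```

In a real training script you would pass the two callables returned by cuda_profiler_hooks() and your own training step, then launch with something like nvprof --profile-from-start off --print-gpu-trace python train.py so that nvprof waits for cudaProfilerStart() instead of recording from process start. Skipping the first few iterations also keeps warm-up work out of the trace, and the script should still exit after a small number of iterations so nvprof can write its output.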