I was trying to profile my code with line_profiler, because cProfile doesn't give me much useful information here for some reason (e.g. it doesn't attribute time to things like `a[x]`). It looks like the actual computation only happens when the resulting values are needed: in the code below, if I add a print, the runtime splits almost exactly between the two points where "inference" of the value is required (the print and the cast to float).
```
Line #  Hits      Time     Per Hit   % Time  Line Contents
    64   102      862955    8460.3      1.8  output_var = model(data)  # [B, C, H, W]
    65   101         600       5.9      0.0  class_n = output_var.size(1)
    66   101    20781085  205753.3     42.8  print(output_var.sum())
    67   101       10971     108.6      0.0  output_flat = output_var.permute(0, 2, 3, 1).contiguous().view(-1, class_n)
    68   101       15572     154.2      0.0  cross_ent = F.cross_entropy(output_flat, target.view(-1), size_average=False)
    69   101    18728730  185433.0     38.6  test_loss_t += cross_ent.data[0]
    70   100        5518      55.2      0.0  pred = output_var.data.max(1)[1]  # [B, H, W]
```
And without the print on line 66:
```
Line #  Hits      Time     Per Hit   % Time  Line Contents
    64   102      862955    8460.3      1.8  output_var = model(data)  # [B, C, H, W]
    65   102         606       5.9      0.0  class_n = output_var.size(1)
    66   102        7586      74.4      0.0  output_flat = output_var.permute(0, 2, 3, 1).contiguous().view(-1, class_n)
    67   102       12863     126.1      0.0  cross_ent = F.cross_entropy(output_flat, target.view(-1), size_average=False)
    68   102    39538526  387632.6     81.3  test_loss_t += cross_ent.data[0]
```
As you can see, the total runtime just gets split between those two lines.
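If I'm reading this right, it behaves like asynchronous execution in general: the work is queued when the op is called, and the cost lands on whichever line first waits for the result. A pure-Python analogy (no PyTorch involved, just illustrating the attribution effect with a thread pool; `slow_square` is a made-up stand-in for a long-running kernel):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):
    time.sleep(0.2)  # stands in for an expensive GPU kernel
    return x * x

with ThreadPoolExecutor() as pool:
    t0 = time.perf_counter()
    fut = pool.submit(slow_square, 7)  # returns immediately, like a kernel launch
    launch = time.perf_counter() - t0

    t0 = time.perf_counter()
    result = fut.result()              # blocks here, like print(tensor.sum())
    wait = time.perf_counter() - t0

print(f"launch took {launch:.3f}s, waiting for the value took {wait:.3f}s")
```

A line-by-line profiler would charge almost all of the time to the `fut.result()` line, even though the "work" was started on the `submit` line, which is exactly the pattern I'm seeing above.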
Am I interpreting this right? If so, is there a way to change this behaviour for profiling purposes? Thanks.
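For what it's worth, the workaround I'm considering (assuming the cause really is asynchronous CUDA kernel launches) is calling `torch.cuda.synchronize()` around the region being measured, so that each timed span includes the kernels it launched. A minimal sketch; the `timed` helper is my own made-up wrapper, not a line_profiler feature, and it degrades to plain timing on CPU-only setups:

```python
import time

try:
    import torch  # optional: only needed when profiling CUDA code
except ImportError:
    torch = None

def cuda_sync():
    # Flush all queued CUDA kernels; a no-op without torch or a GPU.
    if torch is not None and torch.cuda.is_available():
        torch.cuda.synchronize()

def timed(fn, *args, **kwargs):
    cuda_sync()                      # don't charge earlier pending work to fn
    start = time.perf_counter()
    out = fn(*args, **kwargs)
    cuda_sync()                      # wait for fn's kernels before stopping the clock
    return out, time.perf_counter() - start

value, seconds = timed(sum, range(1000))
print(f"result={value}, took {seconds:.6f}s")
```

I'm not sure whether that's the recommended way to make line_profiler's per-line numbers meaningful, though.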