Hi,
I’m using the default settings for model compilation.
In my case, compiling the model results in a roughly 19x slowdown. I left two models running for 8 hours (one compiled and one not), and the results are:
compiled: 873 steps in 8 hours
not-compiled: 16,256 steps in 8 hours
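
For reference, here is a quick sanity check of what those two numbers imply per step. It is plain arithmetic on the step counts above; the variable names are my own and nothing here touches the model.

```python
# Throughput implied by the two 8-hour runs reported above.
HOURS = 8
compiled_steps = 873
eager_steps = 16_256  # the not-compiled run

wall_seconds = HOURS * 3600

# Average wall-clock time per training step for each run.
compiled_s_per_step = wall_seconds / compiled_steps
eager_s_per_step = wall_seconds / eager_steps

# Ratio of steps completed in the same wall-clock time.
slowdown = eager_steps / compiled_steps

print(f"compiled:     {compiled_s_per_step:.1f} s/step")
print(f"not-compiled: {eager_s_per_step:.2f} s/step")
print(f"slowdown:     {slowdown:.1f}x")
```

This gives about 33.0 s/step compiled versus 1.77 s/step not compiled, i.e. about 18.6x. A one-off cost of roughly 5 minutes is only 300 of the 28,800 seconds, so the first step alone does not explain the gap.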
On every forward pass, I pass a tensor of exactly the same dimensions (BS x padded-len).
For the compiled model, the first step of the first batch takes a very long time (about 5 minutes; is that the compilation time?).
Compiled model:
2023/08/03 23:55:33 Epoch 0: 0%| | 1/6103 [04:53<497:03:28, 293.25s/it]
2023/08/03 23:56:01 Epoch 0: 0%| | 2/6103 [05:20<271:54:04, 160.44s/it, v_num=-332]
2023/08/03 23:56:28 Epoch 0: 0%| | 3/6103 [05:47<196:25:00, 115.92s/it, v_num=-332]
2023/08/03 23:57:00 Epoch 0: 0%| | 4/6103 [06:19<160:49:52, 94.93s/it, v_num=-332]
Not-compiled model:
2023/08/03 23:57:41 Epoch 0: 0%| | 1/6103 [00:07<12:09:16, 7.17s/it, v_num=-334]
2023/08/03 23:57:42 Epoch 0: 0%| | 2/6103 [00:08<7:23:44, 4.36s/it, v_num=-334]
2023/08/03 23:57:44 Epoch 0: 0%| | 3/6103 [00:10<5:48:41, 3.43s/it, v_num=-334]
2023/08/03 23:57:45 Epoch 0: 0%| | 4/6103 [00:11<5:00:26, 2.96s/it, v_num=-334]
2023/08/03 23:57:47 Epoch 0: 0%| | 5/6103 [00:13<4:32:01, 2.68s/it, v_num=-334]
2023/08/03 23:57:48 Epoch 0: 0%| | 6/6103 [00:14<4:12:42, 2.49s/it, v_num=-334]
2023/08/03 23:57:50 Epoch 0: 0%| | 7/6103 [00:16<3:58:51, 2.35s/it, v_num=-334]
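
The s/it value printed by the progress bar is a running average that still includes the slow first step, so it overstates the steady-state cost early on. A rough way to see the actual per-step time is to take differences between the log timestamps. The sketch below does that for the lines above; the helper `step_deltas` is a name I made up, and the timestamps only have 1-second resolution, so treat the output as approximate.

```python
from datetime import datetime

def step_deltas(timestamps):
    """Return seconds elapsed between consecutive log timestamps."""
    fmt = "%Y/%m/%d %H:%M:%S"
    times = [datetime.strptime(t, fmt) for t in timestamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

# Timestamps copied from the compiled-model log (steps 1 to 4).
compiled_log = [
    "2023/08/03 23:55:33",
    "2023/08/03 23:56:01",
    "2023/08/03 23:56:28",
    "2023/08/03 23:57:00",
]

# Timestamps copied from the not-compiled log (steps 1 to 7).
eager_log = [
    "2023/08/03 23:57:41",
    "2023/08/03 23:57:42",
    "2023/08/03 23:57:44",
    "2023/08/03 23:57:45",
    "2023/08/03 23:57:47",
    "2023/08/03 23:57:48",
    "2023/08/03 23:57:50",
]

compiled_deltas = step_deltas(compiled_log)
eager_deltas = step_deltas(eager_log)

print("compiled s/step after step 1:    ", compiled_deltas)
print("not-compiled s/step after step 1:", eager_deltas)
```

By this measure, steps 2 to 4 of the compiled run take roughly 27 to 32 seconds each, while the not-compiled run takes 1 to 2 seconds per step. That is consistent with the 8-hour averages, so the slowdown is already present right after the first step.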
Hardware for the runs:
V100 32GB, 1 GPU