I am training a fairly standard UNet model. The only extension is the loss function, which is a composition of three other losses. Each iteration takes about 1.3 minutes, but I notice that after about 100 iterations the time per iteration starts to grow, roughly linearly. Any idea why this is happening?
PS. I read on the forums that this can be caused by a missing .item() when summing the loss, i.e., the accumulation should be total += loss.item()
I added that .item(), but it doesn't help. Here is the progress output around the point where the slowdown starts:
Epoch [101] Loss 0.0020: 100%|██████████| 1/1 [01:28<00:00, 88.95s/it]
Epoch [102] Loss 0.0021: 100%|██████████| 1/1 [01:29<00:00, 89.56s/it]
Epoch [103] Loss 0.0020: 100%|██████████| 1/1 [01:29<00:00, 89.60s/it]
Epoch [104] Loss 0.0020: 100%|██████████| 1/1 [01:29<00:00, 89.59s/it]
Epoch [105] Loss 0.0019: 100%|██████████| 1/1 [01:30<00:00, 90.49s/it]
Epoch [106] Loss 0.0019: 100%|██████████| 1/1 [01:27<00:00, 87.88s/it]
Epoch [107] Loss 0.0018: 100%|██████████| 1/1 [01:28<00:00, 88.62s/it]
Epoch [108] Loss 0.0018: 100%|██████████| 1/1 [01:29<00:00, 89.18s/it]
Epoch [109] Loss 0.0018: 100%|██████████| 1/1 [01:27<00:00, 87.38s/it]
Epoch [110] Loss 0.0018: 100%|██████████| 1/1 [01:30<00:00, 90.76s/it]
Epoch [111] Loss 0.0017: 100%|██████████| 1/1 [01:27<00:00, 87.06s/it]
Epoch [112] Loss 0.0017: 100%|██████████| 1/1 [01:30<00:00, 90.43s/it]
Epoch [113] Loss 0.0017: 100%|██████████| 1/1 [01:29<00:00, 89.99s/it]
Epoch [114] Loss 0.0016: 100%|██████████| 1/1 [01:29<00:00, 89.87s/it]
Epoch [115] Loss 0.0016: 100%|██████████| 1/1 [01:26<00:00, 86.47s/it]
Epoch [116] Loss 0.0016: 100%|██████████| 1/1 [01:28<00:00, 88.58s/it]
Epoch [117] Loss 0.0015: 100%|██████████| 1/1 [01:26<00:00, 86.67s/it]
Epoch [118] Loss 0.0015: 100%|██████████| 1/1 [01:27<00:00, 87.48s/it]
Epoch [119] Loss 0.0015: 100%|██████████| 1/1 [01:29<00:00, 89.70s/it]
Epoch [120] Loss 0.0014: 100%|██████████| 1/1 [01:28<00:00, 88.29s/it]
Epoch [121] Loss 0.0014: 100%|██████████| 1/1 [01:29<00:00, 89.31s/it]
Epoch [122] Loss 0.0014: 100%|██████████| 1/1 [01:30<00:00, 90.73s/it]
Epoch [123] Loss 0.0014: 100%|██████████| 1/1 [01:59<00:00, 119.80s/it]
Epoch [124] Loss 0.0013: 100%|██████████| 1/1 [02:28<00:00, 148.24s/it]
Epoch [125] Loss 0.0013: 100%|██████████| 1/1 [02:06<00:00, 126.39s/it]
Epoch [126] Loss 0.0013: 100%|██████████| 1/1 [01:45<00:00, 105.44s/it]
Epoch [127] Loss 0.0012: 100%|██████████| 1/1 [02:21<00:00, 141.05s/it]
Epoch [128] Loss 0.0012: 100%|██████████| 1/1 [01:40<00:00, 100.25s/it]
Epoch [129] Loss 0.0012: 100%|██████████| 1/1 [02:01<00:00, 121.82s/it]
Epoch [130] Loss 0.0012: 100%|██████████| 1/1 [01:56<00:00, 116.14s/it]
Epoch [131] Loss 0.0011: 100%|██████████| 1/1 [03:12<00:00, 192.90s/it]
Epoch [132] Loss 0.0011: 100%|██████████| 1/1 [01:46<00:00, 106.17s/it]
Epoch [133] Loss 0.0011: 100%|██████████| 1/1 [02:12<00:00, 132.23s/it]
Epoch [134] Loss 0.0011: 100%|██████████| 1/1 [02:17<00:00, 137.05s/it]
Epoch [135] Loss 0.0011: 100%|██████████| 1/1 [02:05<00:00, 125.36s/it]
Epoch [136] Loss 0.0010: 100%|██████████| 1/1 [02:17<00:00, 137.81s/it]
Epoch [137] Loss 0.0010: 100%|██████████| 1/1 [02:27<00:00, 147.47s/it]
Epoch [138] Loss 0.0010: 100%|██████████| 1/1 [02:13<00:00, 133.29s/it]
Epoch [139] Loss 0.0010: 100%|██████████| 1/1 [02:00<00:00, 120.73s/it]
Epoch [140] Loss 0.0010: 100%|██████████| 1/1 [02:01<00:00, 121.87s/it]
Epoch [141] Loss 0.0010: 100%|██████████| 1/1 [02:01<00:00, 121.29s/it]
Epoch [142] Loss 0.0009: 100%|██████████| 1/1 [01:59<00:00, 119.16s/it]
Epoch [143] Loss 0.0009: 100%|██████████| 1/1 [02:13<00:00, 133.37s/it]
Epoch [144] Loss 0.0009: 100%|██████████| 1/1 [02:08<00:00, 128.09s/it]
Epoch [145] Loss 0.0009: 100%|██████████| 1/1 [02:20<00:00, 140.43s/it]
Epoch [146] Loss 0.0009: 100%|██████████| 1/1 [02:22<00:00, 142.48s/it]
Epoch [147] Loss 0.0008: 100%|██████████| 1/1 [02:18<00:00, 138.08s/it]
Epoch [148] Loss 0.0008: 100%|██████████| 1/1 [02:19<00:00, 139.63s/it]
Epoch [149] Loss 0.0008: 100%|██████████| 1/1 [02:24<00:00, 144.33s/it]
Epoch [150] Loss 0.0008: 100%|██████████| 1/1 [02:58<00:00, 178.68s/it]
Epoch [151] Loss 0.0008: 100%|██████████| 1/1 [02:33<00:00, 153.68s/it]
Epoch [152] Loss 0.0008: 100%|██████████| 1/1 [02:28<00:00, 148.57s/it]
Epoch [153] Loss 0.0009: 100%|██████████| 1/1 [02:33<00:00, 153.27s/it]
Epoch [154] Loss 0.0008: 100%|██████████| 1/1 [02:36<00:00, 156.50s/it]
Epoch [155] Loss 0.0007: 100%|██████████| 1/1 [03:12<00:00, 192.43s/it]
Epoch [156] Loss 0.0007: 100%|██████████| 1/1 [03:36<00:00, 216.21s/it]
Epoch [157] Loss 0.0007: 100%|██████████| 1/1 [04:17<00:00, 257.19s/it]
Epoch [158] Loss 0.0007: 100%|██████████| 1/1 [03:35<00:00, 215.72s/it]
Epoch [159] Loss 0.0007: 100%|██████████| 1/1 [05:49<00:00, 349.24s/it]
Epoch [160] Loss 0.0007: 100%|██████████| 1/1 [05:20<00:00, 320.25s/it]
Epoch [161] Loss 0.0007: 100%|██████████| 1/1 [05:10<00:00, 310.60s/it]
Epoch [162] Loss 0.0007: 100%|██████████| 1/1 [05:15<00:00, 315.77s/it]
Epoch [163] Loss 0.0006: 100%|██████████| 1/1 [05:12<00:00, 312.63s/it]
Epoch [164] Loss 0.0006: 100%|██████████| 1/1 [05:32<00:00, 332.89s/it]
Epoch [165] Loss 0.0006: 100%|██████████| 1/1 [05:38<00:00, 338.47s/it]
Epoch [166] Loss 0.0006: 100%|██████████| 1/1 [06:06<00:00, 366.14s/it]
Epoch [167] Loss 0.0006: 100%|██████████| 1/1 [04:36<00:00, 276.96s/it]
Epoch [168] Loss 0.0008: 100%|██████████| 1/1 [04:58<00:00, 298.22s/it]
Epoch [169] Loss 0.0012: 100%|██████████| 1/1 [05:06<00:00, 306.25s/it]
Epoch [170] Loss 0.0007: 100%|██████████| 1/1 [05:46<00:00, 346.06s/it]
Epoch [171] Loss 0.0008: 100%|██████████| 1/1 [04:49<00:00, 289.96s/it]
Epoch [172] Loss 0.0007: 100%|██████████| 1/1 [05:10<00:00, 310.00s/it]
Epoch [173] Loss 0.0008: 100%|██████████| 1/1 [05:52<00:00, 352.54s/it]
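
To put a number on the slowdown in the log above, the per-iteration times can be parsed from the progress lines and fitted with a straight line. Below is a minimal sketch using only the Python standard library. The helper names `parse_times` and `slope` are hypothetical, the regular expression assumes the tqdm line format shown in the log, and the sample is just four lines copied from it.

```python
import re

# Assumed line format (taken from the log above):
# Epoch [121] Loss 0.0014: 100%|...| 1/1 [01:29<00:00, 89.31s/it]
LINE_RE = re.compile(r"Epoch \[(\d+)\].*?, ([\d.]+)s/it\]")


def parse_times(log_text):
    """Return (iteration, seconds) pairs for every matching log line."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in LINE_RE.finditer(log_text)]


def slope(points):
    """Least-squares slope of seconds versus iteration number."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


# Four lines copied from the log, right where the times start to climb.
sample = """\
Epoch [121] Loss 0.0014: 100%|██████████| 1/1 [01:29<00:00, 89.31s/it]
Epoch [122] Loss 0.0014: 100%|██████████| 1/1 [01:30<00:00, 90.73s/it]
Epoch [123] Loss 0.0014: 100%|██████████| 1/1 [01:59<00:00, 119.80s/it]
Epoch [124] Loss 0.0013: 100%|██████████| 1/1 [02:28<00:00, 148.24s/it]
"""

points = parse_times(sample)
print(points)
print(f"slope: {slope(points):.2f} s per iteration")
```

Running the same two helpers over the whole log, once for the lines up to iteration 122 and once for the lines after it, would show whether the growth is really linear or closer to a series of jumps, which may help narrow down the cause.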