Hello, I am training some networks and have run into a strange problem: the loss suddenly becomes NaN between iterations 352400 and 352500, as the log below shows. In the log, "learning rate" is the learning rate at the current iteration, and args.lr is the initial learning rate of the program. Could you give me some suggestions about this problem? Thanks in advance.
I am using PyTorch 1.0.0, CUDA 8.0, Ubuntu 16.04, and Python 3.5.
Iters: 351700/[07], Loss 3.5787 (3.7740), 'Prec@1 43.810 (46.888), time: 1.00 s/iter, learning rate: 4.879625646604741e-05, [args.lr]: 0.0001
Iters: 351800/[07], Loss 3.4608 (3.7741), 'Prec@1 44.762 (46.922), time: 1.00 s/iter, learning rate: 4.872546293719906e-05, [args.lr]: 0.0001
Iters: 351900/[07], Loss 3.7157 (3.7688), 'Prec@1 51.429 (46.960), time: 1.00 s/iter, learning rate: 4.8654671964972e-05, [args.lr]: 0.0001
Iters: 352000/[07], Loss 3.1261 (3.7694), 'Prec@1 52.381 (46.949), time: 1.00 s/iter, learning rate: 4.8583883691367374e-05, [args.lr]: 0.0001
Iters: 352100/[07], Loss 3.6948 (3.7712), 'Prec@1 51.429 (46.961), time: 1.00 s/iter, learning rate: 4.85130982583809e-05, [args.lr]: 0.0001
Iters: 352200/[07], Loss 4.0376 (3.7710), 'Prec@1 41.905 (46.937), time: 1.00 s/iter, learning rate: 4.84423158080026e-05, [args.lr]: 0.0001
Iters: 352300/[07], Loss 3.6179 (3.7683), 'Prec@1 46.667 (46.949), time: 1.00 s/iter, learning rate: 4.8371536482216515e-05, [args.lr]: 0.0001
Iters: 352400/[07], Loss 4.9270 (3.7662), 'Prec@1 37.143 (46.954), time: 1.00 s/iter, learning rate: 4.830076042300044e-05, [args.lr]: 0.0001
Iters: 352500/[07], Loss nan (nan), 'Prec@1 0.000 (45.216), time: 0.99 s/iter, learning rate: 4.8229987772325544e-05, [args.lr]: 0.0001
Iters: 352600/[07], Loss nan (nan), 'Prec@1 0.000 (43.477), time: 0.99 s/iter, learning rate: 4.8159218672156253e-05, [args.lr]: 0.0001
Iters: 352700/[07], Loss nan (nan), 'Prec@1 0.000 (41.866), time: 0.99 s/iter, learning rate: 4.8088453264449795e-05, [args.lr]: 0.0001
Iters: 352800/[07], Loss nan (nan), 'Prec@1 0.000 (40.371), time: 0.99 s/iter, learning rate: 4.801769169115605e-05, [args.lr]: 0.0001
Iters: 352900/[07], Loss nan (nan), 'Prec@1 0.000 (38.979), time: 1.00 s/iter, learning rate: 4.7946934094217175e-05, [args.lr]: 0.0001
Iters: 353000/[07], Loss nan (nan), 'Prec@1 0.000 (37.680), time: 0.99 s/iter, learning rate: 4.787618061556734e-05, [args.lr]: 0.0001
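
To narrow down where this starts, I could add a check like the one below that records the first iteration at which the loss stops being finite, since my log only prints every 100 iterations. This is only a sketch under assumptions: `NonFiniteLossGuard` is a hypothetical helper (not part of PyTorch), and it assumes the training loop passes it a plain Python float such as `loss.item()`.

```python
import math


class NonFiniteLossGuard:
    """Hypothetical helper: remembers the first iteration with a NaN/inf loss."""

    def __init__(self):
        self.first_bad_iter = None  # stays None while every loss is finite

    def check(self, iteration, loss_value):
        """Return True if loss_value is finite; otherwise record the iteration."""
        ok = math.isfinite(loss_value)
        if not ok and self.first_bad_iter is None:
            self.first_bad_iter = iteration
        return ok


# Assumed usage inside a training loop (names are illustrative):
#   guard = NonFiniteLossGuard()
#   ...
#   if not guard.check(it, loss.item()):
#       print("non-finite loss first seen at iteration", guard.first_bad_iter)
#       break  # stop here so the offending batch and weights can be inspected
```

Stopping at the first bad iteration, rather than continuing to train, would let me inspect the batch and the parameters that produced the NaN.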