Hi, I'm using AMP to train my model, but the parameters never update, so the training is effectively a no-op.
I've found that the program always skips scaler.step(), yet the loss is neither NaN nor Inf, and I can't figure out why.
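This is how I confirm that the step is being skipped (a minimal self-contained sketch with a stand-in model; my assumption is that GradScaler lowers the scale whenever it skips a step, so a drop in get_scale() across update() means step() was skipped):

import torch
from torch.cuda.amp import autocast, GradScaler

model = torch.nn.Linear(10, 1).cuda()        # stand-in model, just for the check
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = GradScaler()

with autocast():
    loss = model(torch.randn(4, 10, device="cuda")).mean()

optimizer.zero_grad()
scaler.scale(loss).backward()
scale_before = scaler.get_scale()
scaler.step(optimizer)
scaler.update()
if scaler.get_scale() < scale_before:        # scale shrank -> the step was skipped
    print("optimizer step skipped, scale lowered to", scaler.get_scale())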
My full training code is shown below.
Training loop (backward pass):
import torch.optim as opt
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()

for epoch in range(num_epochs):              # num_epochs is a placeholder
    for i, data in enumerate(dataloader):
        with autocast():                     # forward pass in mixed precision
            loss = model(data)
        optimizer.zero_grad()
        scaler.scale(loss).backward()        # backward on the scaled loss
        scaler.step(optimizer)               # this is the call that gets skipped
        scaler.update()
    scheduler.step()                         # stepped once per epoch

Optimizer and scheduler:

optimizer = opt.AdamW([
    {'params': ..., 'initial_lr': lr},       # param groups elided in this post
], lr=lr, weight_decay=wd)
scheduler = opt.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=min_lr, last_epoch=epoch)
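Since, as far as I understand, GradScaler checks the gradients rather than the loss, a finite loss doesn't rule out inf/NaN gradients. Here is the inner loop body with an extra check I plan to add (scaler.unscale_() exposes the real fp32 gradients for inspection, and step() then skips its internal unscaling):

        optimizer.zero_grad()
        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)           # unscale grads in place so they can be inspected
        for name, p in model.named_parameters():
            if p.grad is not None and not torch.isfinite(p.grad).all():
                print(f"non-finite gradient in {name}")
        scaler.step(optimizer)               # safe: step() detects grads are already unscaled
        scaler.update()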
Environment:
OS: Ubuntu 20.04
Python: 3.8
PyTorch: 1.7.1
CUDA: 11.4
GPU: RTX 3090