Scheduler error

I got the following error while training my model:

---> 95         scheduler.step()
.
.
.
ValueError: Tried to step 1402 times. The specified number of total steps is 1400

But I’ve NOT specified anywhere in my code that the total steps should be 1400

please help

You need to provide more information. Maybe share the snippet of your minimal code, the line where the error stack originates, and so on.

However, I remember this sort of error popping up with pytorch-lightning. There, it occurs when you use a scheduler with the optimizer without passing the max_epochs argument.


Here’s the code snippet:

  criterion = nn.BCEWithLogitsLoss()
  optimizer = torch.optim.Adam(model.parameters(),lr=0.001)
  scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.001,
                                                steps_per_epoch=int(len(train_dl)),
                                                epochs=num_epochs,
                                                anneal_strategy='linear')

You are defining the total number of steps implicitly via steps_per_epoch and epochs (their product, which evaluates to 1400 in your case), as described in the docs:

  • epochs (int) – The number of epochs to train for. This is used along with steps_per_epoch in order to infer the total number of steps in the cycle if a value for total_steps is not provided. Default: None
  • steps_per_epoch (int) – The number of steps per epoch to train for. This is used along with epochs in order to infer the total number of steps in the cycle if a value for total_steps is not provided. Default: None

and as seen here:

import torch
import torch.nn as nn

model = nn.Linear(1, 1)

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(),lr=0.001)
num_epochs = 10
steps_per_epoch = 2
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.001,
                                                steps_per_epoch=steps_per_epoch,
                                                epochs=num_epochs,
                                                anneal_strategy='linear')

for epoch in range(num_epochs):
    for _ in range(steps_per_epoch):
        optimizer.step()
        scheduler.step()
        
# One more step than the num_epochs * steps_per_epoch = 20 inferred steps
optimizer.step()
scheduler.step()
# ValueError: Tried to step 22 times. The specified number of total steps is 20
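
One way to avoid this is to make sure scheduler.step() is never called more often than the scheduler's step budget. The following is only a minimal sketch under some assumptions: the dummy batch (inputs, targets) and the steps_taken counter are made up for illustration, steps_per_epoch stands in for len(train_dl), and total_steps is passed explicitly instead of being inferred from epochs and steps_per_epoch. Your actual training loop will look different.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(1, 1)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

num_epochs = 10
steps_per_epoch = 2  # assumption: stands in for len(train_dl) from the original post

# Make the step budget explicit instead of letting it be inferred.
total_steps = num_epochs * steps_per_epoch
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.001, total_steps=total_steps, anneal_strategy="linear"
)

# Hypothetical dummy batch, only so that backward() has something to do.
inputs = torch.randn(4, 1)
targets = torch.randint(0, 2, (4, 1)).float()

steps_taken = 0  # illustrative counter, not part of the PyTorch API
for epoch in range(num_epochs):
    for _ in range(steps_per_epoch):
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        scheduler.step()
        steps_taken += 1

# Any step outside the main loop is guarded so the budget is never exceeded.
if steps_taken < scheduler.total_steps:
    optimizer.step()
    scheduler.step()
    steps_taken += 1

print(steps_taken, scheduler.total_steps)
```

If the count printed for steps_taken in your own loop is larger than scheduler.total_steps, look for places where scheduler.step() is called in addition to the inner training loop, or where the scheduler is reused across training runs without being recreated.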