Step size for Cyclic scheduler

How do I choose a reasonable step size for the cyclic scheduler? This is a multilabel classification task with 21 labels.
My batch size is 8, the dataset has 7000 images, and I accumulate gradients over
4 batches.
Thank you

I think CyclicLR is usually called after each batch (and the docs also say so):

Cyclical learning rate policy changes the learning rate after every batch. step should be called after a batch has been used for training.

I didn’t formulate my question well.
I was talking about the step size for the increasing half of the cycle (step_size_up),
which I choose when initializing the scheduler.
I’m not talking about scheduler.step().
So, to go from the minimum lr to the maximum lr, how many steps should I choose?

Thanks for the clarification.
From the paper:

The length of a cycle and the input parameter stepsize
can be easily computed from the number of iterations in
an epoch. An epoch is calculated by dividing the number
of training images by the batchsize used. For example,
CIFAR-10 has 50,000 training images and the batchsize is
100 so an epoch = 50,000/100 = 500 iterations. The final
accuracy results are actually quite robust to cycle length but
experiments show that it often is good to set stepsize equal
to 2-10 times the number of iterations in an epoch. For
example, setting stepsize = 8 * epoch with the CIFAR-10
training run (as shown in Figure 1) only gives slightly better
results than setting stepsize = 2 * epoch.
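
Applied to the numbers in the question, the rule of thumb can be sketched as below. This is only a rough calculation, not a definitive recipe. It assumes scheduler.step() is called once per optimizer update (that is, once every 4 batches because of gradient accumulation) and that a leftover partial accumulation group at the end of an epoch still triggers an update. The helper names are made up for illustration.

```python
import math

def optimizer_steps_per_epoch(num_samples, batch_size, accumulation_steps=1):
    """Number of optimizer updates in one epoch (hypothetical helper)."""
    # Batches per epoch, counting a final partial batch.
    batches = math.ceil(num_samples / batch_size)
    # Assumption: one optimizer update per accumulation group, and a
    # partial group at the end of the epoch still counts as an update.
    return math.ceil(batches / accumulation_steps)

def step_size_range(steps_per_epoch, low=2, high=10):
    """Paper's rule of thumb: stepsize = 2 to 10 times the iterations per epoch."""
    return low * steps_per_epoch, high * steps_per_epoch

# Setup from the question: 7000 images, batch size 8, accumulation over 4 batches.
steps = optimizer_steps_per_epoch(7000, 8, accumulation_steps=4)
smallest, largest = step_size_range(steps)
print(steps, smallest, largest)  # 219 438 2190
```

So under these assumptions, a step_size_up somewhere between roughly 438 and 2190 would match the paper's suggestion. If scheduler.step() is instead called after every batch, an epoch is 875 iterations and the range becomes 1750 to 8750.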
