Hi All!
I want to use the CosineAnnealingWarmRestarts scheduler on Google Cloud TPU cores with distributed parallel training.
I couldn't find a tutorial that shows how to do that.
Please help me out.
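For context, here is a minimal pure-Python sketch of the learning-rate formula that `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts` implements (the SGDR schedule). The function name and parameter names are illustrative, not part of any API; in real training you would use the PyTorch class with your optimizer rather than this helper:

```python
import math

def cosine_warm_restarts_lr(step, base_lr, eta_min=0.0, t_0=10, t_mult=2):
    """Learning rate at `step` under cosine annealing with warm restarts.

    Within a cycle of length t_i, the LR follows:
        eta_min + (base_lr - eta_min) * (1 + cos(pi * t_cur / t_i)) / 2
    When a cycle ends, the LR resets to base_lr and the next cycle
    is t_mult times longer.
    """
    t_i, t_cur = t_0, step
    while t_cur >= t_i:      # walk forward through completed cycles
        t_cur -= t_i
        t_i *= t_mult
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_i)) / 2
```

With `base_lr=0.1` and `t_0=10`, the LR starts at 0.1, decays to `eta_min` over the cycle, and jumps back to 0.1 at step 10 when the first restart occurs.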
Hey @chhablanig
Do you have a working training script without using LR schedulers?
cc @vincentqb for optimizer question
cc @ailzhang for XLA question