Training accuracy (loss) increases (decreases) in a zigzag way?

kevinj22 · December 19, 2018, 1:54pm

There’s a similar post here, “Strange behavior with SGD momentum training”, which also shows a sawtooth loss curve. In that thread, smth suggested sampling with replacement.
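To make the suggestion concrete, here is a minimal sketch of the difference between the usual per-epoch shuffle (sampling without replacement) and sampling with replacement. It uses only the standard library; `epoch_indices` and the dataset size of 10 are made up for illustration and are not from either thread.

```python
import random

def epoch_indices(n, replacement, rng):
    """Return n dataset indices for one epoch (hypothetical helper)."""
    if replacement:
        # Each draw is independent, so an index may repeat or be skipped.
        return [rng.randrange(n) for _ in range(n)]
    # Without replacement: a permutation, so every index appears exactly once.
    indices = list(range(n))
    rng.shuffle(indices)
    return indices

rng = random.Random(0)
without = epoch_indices(10, replacement=False, rng=rng)
with_repl = epoch_indices(10, replacement=True, rng=rng)

print("without replacement:", without)
print("with replacement:   ", with_repl)
```

In PyTorch, the with-replacement variant should correspond to passing `sampler=torch.utils.data.RandomSampler(dataset, replacement=True)` to the `DataLoader` and leaving `shuffle` unset, since `sampler` and `shuffle` are mutually exclusive. Whether that removes the sawtooth in your case is something to check empirically.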