Why does training accuracy increase (and loss decrease) in a zigzag pattern?

There’s a similar post here, Strange behavior with SGD momentum training, which also shows a sawtooth loss curve. A suggestion by smth there is to sample the training data with replacement instead of shuffling without replacement.
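
For reference, one way to try that suggestion in PyTorch is to pass a `RandomSampler` with `replacement=True` to the `DataLoader`, instead of the usual `shuffle=True` (which permutes the dataset without replacement each epoch). A minimal sketch, with a placeholder dataset standing in for your own:

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, TensorDataset

# Placeholder dataset for illustration only.
dataset = TensorDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))

# DataLoader(shuffle=True) draws each example exactly once per epoch
# (sampling without replacement). Sampling WITH replacement removes the
# hard epoch boundary, which is what the suggestion targets.
sampler = RandomSampler(dataset, replacement=True, num_samples=len(dataset))
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

for inputs, targets in loader:
    ...  # your usual training step
```

Note that `sampler` and `shuffle=True` are mutually exclusive in `DataLoader`, so the sampler replaces the shuffle flag rather than adding to it.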