How important is it to shuffle the training data before each epoch?

My question is not about a specific programming issue. I want to know: how important is it to shuffle the training data before each epoch/iteration in PyTorch? Please share your experience.
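For context, here is a minimal sketch of what I mean by "shuffling", using the standard `torch.utils.data.DataLoader` (the dataset and batch size are placeholders, not my real setup):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset standing in for my real training data.
dataset = TensorDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))

# shuffle=True reshuffles the sample order at the start of every epoch.
train_loader_shuffled = DataLoader(dataset, batch_size=32, shuffle=True)

# shuffle=False keeps the same fixed sample order in every epoch.
train_loader_fixed = DataLoader(dataset, batch_size=32, shuffle=False)
```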

Also, if shuffling the test data changes the performance of a model that was trained without shuffling the training data, what can we infer from that? Can we say the model learned features that are sensitive to batch position? Or should we conclude that the model is probably implemented incorrectly?
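This is roughly how I compared the two cases; the model and test set below are placeholders standing in for my trained model and my actual test data:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholders for my trained model and test set.
model = nn.Linear(10, 2)
test_dataset = TensorDataset(torch.randn(200, 10), torch.randint(0, 2, (200,)))

model.eval()
with torch.no_grad():
    for shuffle in (False, True):
        loader = DataLoader(test_dataset, batch_size=32, shuffle=shuffle)
        correct = 0
        for x, y in loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
        print(f"shuffle={shuffle}: accuracy = {correct / len(test_dataset):.4f}")
```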

I am interested to know whether anyone has run into performance issues depending on whether or not the data was shuffled.