My training data is too large to load into main memory all at once. So I load a few blocks (a subset) of the data and train until convergence, then move on to the next subset and train until convergence, and so on. Is this the right approach?
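
The procedure described above can be sketched roughly as follows. This is a minimal illustration in plain Python (so it runs without any framework), not the actual training code: `load_block` is a hypothetical stand-in for reading one block from disk and simply generates synthetic data for y = 3x + 1, and the linear model, learning rate, and convergence tolerance are all assumptions chosen for the example.

```python
import random


def load_block(block_id, block_size=200):
    """Hypothetical stand-in for reading one block of data from disk."""
    rng = random.Random(block_id)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(block_size)]
    # Synthetic targets: y = 3x + 1 plus a little noise (assumption).
    return [(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.01)) for x in xs]


def mse(params, data):
    """Mean squared error of the linear model w*x + b on one block."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_until_converged(params, data, lr=0.1, tol=1e-6, max_epochs=1000):
    """Run SGD passes over a single block until its loss stops changing."""
    w, b = params
    prev_loss = mse((w, b), data)
    for _ in range(max_epochs):
        for x, y in data:
            err = w * x + b - y
            # Squared-error gradient step (the factor 2 is folded into lr).
            w -= lr * err * x
            b -= lr * err
        loss = mse((w, b), data)
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return w, b


def train_block_by_block(num_blocks=3):
    """Train to convergence on block 0, then block 1, and so on."""
    params = (0.0, 0.0)
    history = []
    for block_id in range(num_blocks):
        data = load_block(block_id)  # only one block is in memory at a time
        params = train_until_converged(params, data)
        history.append(mse(params, data))
    return params, history


if __name__ == "__main__":
    params, history = train_block_by_block()
    print("final (w, b):", params)
    print("loss on each block right after training on it:", history)
```

In this toy setup every block is drawn from the same distribution, so training on later blocks changes the parameters very little. Whether the same holds for real data depends on how the blocks were split.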

The model's performance stays roughly the same even after training on a new subset of the data.

Is this method fundamentally wrong, and if so, why?

I know this question is not specific to PyTorch, sorry about that, but I find this forum very active.

Thanks!