Scheduling batch size in DataLoader

I have a sequence classification problem that I am trying to solve with a form of “curriculum learning”, described below. My dataset consists of sequences, and (in the particular problem I am trying to solve) the longer the sequences are, the harder they are to classify correctly.

Therefore, I am scheduling the sequence length in the dataset so that sequences are initially small and their length increases progressively during training. Each mini-batch obtained from the DataLoader has the shape (N, L_i, D), where N is the batch size, L_i is the sequence length at the i-th epoch and D is the dimension of the data.
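
For reference, my dataset is set up roughly like this (a simplified sketch; `SequenceDataset` and `set_seq_len` are illustrative stand-ins for my actual code):

```python
from torch.utils.data import Dataset

class SequenceDataset(Dataset):
    """Stores full-length sequences and crops them to the scheduled length."""

    def __init__(self, sequences, labels, seq_len):
        self.sequences = sequences  # list of (L_max, D) tensors
        self.labels = labels
        self.seq_len = seq_len      # current L_i, updated once per epoch

    def set_seq_len(self, seq_len):
        self.seq_len = seq_len

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        # return only the first L_i time steps of the sequence
        return self.sequences[idx][: self.seq_len], self.labels[idx]
```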

In order to use GPU memory efficiently, I would like to use a variable batch size that starts large and decreases over the training epochs, to compensate for the extra memory required by the longer sequences. I thought it would be enough to do the following immediately before each epoch starts:

```python
train_loader.batch_size = N_i
```

where N_i would be the desired batch size for the i-th epoch. However, apparently, the DataLoader does not allow the batch size to be changed after initialization, since the following error appears:

```
ValueError: batch_size attribute should not be set after DataLoader is initialized
```

Is there any workaround for this limitation?

Re-initializing the DataLoader before each epoch might be a valid alternative; it should be quite cheap and therefore shouldn't hurt performance.
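Something along these lines (a rough sketch; `seq_len_schedule` and `batch_size_schedule` are placeholders for your curriculum, and `set_seq_len` reuses the method from your sketch above):

```python
from torch.utils.data import DataLoader

for epoch in range(num_epochs):
    # update the curriculum: sequence length L_i and batch size N_i
    dataset.set_seq_len(seq_len_schedule(epoch))
    train_loader = DataLoader(
        dataset,
        batch_size=batch_size_schedule(epoch),
        shuffle=True,
        num_workers=4,
        pin_memory=True,
    )
    for sequences, labels in train_loader:
        ...  # usual forward/backward/optimizer step
```

Constructing the DataLoader object itself is cheap; the worker processes are only spawned once you start iterating over it, which happens once per epoch anyway (as long as you don't set `persistent_workers=True`).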
Would that work for you or would that mess with your curriculum learning schedule?


Yes, that does the job, thank you! I thought it would be a bit more expensive than it actually is 🙂