My input has variable size. I haven’t found a way to use a DataLoader without padding the inputs to the maximum size in the batch. Is there any way around this? Is it possible without using the DataLoader class?
A batch must always consist of elements of the same size. However, if your input is large enough and you can handle the corresponding output sizes, you can feed batches of different sizes to the model.
How can I vary the batch size?
This does not work with the default DataLoader, but in general you could handle the loading yourself: simply add a batch dimension to each sample and use torch.cat to stack them into a batch:
```python
batch_elements = []
for i in range(curr_batch_size):
    # generate some sample data
    tmp = torch.rand((1, 50, 50))
    # add batch dimension
    tmp = tmp.unsqueeze(0)
    batch_elements.append(tmp)
batch = torch.cat(batch_elements, 0)
```
Replace tmp = torch.rand((1, 50, 50)) with your own data samples. In this case I used 50x50 pixel images with one channel as sample data. To show the workflow with general data, I did not integrate the batch dimension into the shape of the random tensor but added it afterwards.
EDIT: Alternatively, you could use something like this. But note that this will pad your input, and (depending on the maximal difference between your input sizes) you could end up padding an enormous number of zeros (or constants, or whatever value you choose).
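To illustrate how wasteful that padding can get, here is a minimal sketch using torch.nn.utils.rnn.pad_sequence (the two sequence lengths are made up for the example):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# two sequences with very different lengths (hypothetical sizes)
short_seq = torch.ones(5)
long_seq = torch.ones(1000)

# pad_sequence pads every sequence up to the longest one;
# default output shape is (max_len, batch_size)
padded = pad_sequence([short_seq, long_seq])

print(padded.shape)                    # torch.Size([1000, 2])
print((padded == 0).sum().item())      # 995 padding zeros for the short sequence
```

So here 995 of the 1000 entries in the short sequence's column are padding, which is the kind of overhead to keep in mind.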
How exactly are you doing this? Your answer seems to assume everything is already the same size.
The original question is whether padding is required for variable-size input. Let’s answer that directly: is it always necessary or not? When is padding necessary, and when is it not?
To my understanding it’s always required, because batches have to be of the same size (unless I’m misunderstanding or missing something).
Items in the same batch have to be the same size, yes, but with a fully convolutional network you can pass batches of different sizes, so no, padding is not always required. In the extreme case you could even use a batch size of 1, and your input size could be completely arbitrary (assuming you adjusted strides, kernel size, dilation, etc. appropriately).
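As a sketch of that point: a model built only from convolutional layers (a hypothetical toy network, not anyone's actual model) happily accepts inputs of different spatial sizes from batch to batch, because no layer bakes the input size into its weights:

```python
import torch
import torch.nn as nn

# fully convolutional toy model: no Linear layers,
# so the input's spatial size is not fixed by the architecture
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=3, padding=1),
)

# two batches with completely different spatial sizes both work,
# as long as all items *within* one batch share a size
out_a = model(torch.rand(4, 1, 50, 50))   # batch of 4, 50x50 images
out_b = model(torch.rand(1, 1, 123, 87))  # batch of 1, arbitrary size

print(out_a.shape)  # torch.Size([4, 1, 50, 50])
print(out_b.shape)  # torch.Size([1, 1, 123, 87])
```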
This is why it is hard to answer this question in general.
My take on how to solve this issue:
```python
def collate_fn_padd(batch):
    '''
    Pads a batch of variable-length sequences.

    note: it converts things to tensors manually here, since the ToTensor
    transform assumes it takes in images rather than arbitrary tensors.
    '''
    ## get sequence lengths
    lengths = torch.tensor([t.shape[0] for t in batch]).to(device)
    ## pad
    batch = [torch.Tensor(t).to(device) for t in batch]
    batch = torch.nn.utils.rnn.pad_sequence(batch)
    ## compute mask
    mask = (batch != 0).to(device)
    return batch, lengths, mask
```
There seems to be a large collection of posts scattered across the PyTorch forums, which makes this issue difficult to solve. I have collected a list of them, hopefully making things easier for all of us:
- How to create batches of a list of varying dimension tensors?
- How to create a dataloader with variable-size input
- Using variable sized input - Is padding required?
- DataLoader for various length of data
- How to do padding based on lengths?
Stack Overflow also has a version of this question: