Could you just double the batch size of the dataloader and split each batch in half?
E.g.:
import torch
from torch.utils.data import DataLoader

# assuming your data are images: each batch has shape (128, 3, H, W)
batch_size = 128
# drop_last=True guarantees every batch is exactly batch_size, so the split below always yields two equal halves
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True, drop_last=True)
for batch in dataloader:
    # pairs is a tuple of two tensors, each of shape (64, 3, H, W)
    pairs = torch.split(batch, batch_size // 2, dim=0)
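If it helps, here is a minimal self-contained sketch of what that split does, using a random tensor in place of a real dataloader batch (H = W = 8 is arbitrary, just for illustration):

```python
import torch

# stand-in for one dataloader batch of 128 images
batch = torch.randn(128, 3, 8, 8)

# split along the batch dimension (dim=0) into two halves of 64
first, second = torch.split(batch, 128 // 2, dim=0)

print(first.shape, second.shape)  # torch.Size([64, 3, 8, 8]) twice
```

Note that `torch.split` returns views of the original tensor, so this costs no extra memory; use `.clone()` on the halves if you need independent copies.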