However, I need my DataLoader to shuffle per batch, to allow duplicate sampling.
I assume this means you want to draw n samples with replacement for each batch of size n. You can write your own BatchSampler that does this and pass it to DataLoader through the batch_sampler argument.
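Here is a minimal sketch of such a batch sampler, assuming "with replacement" means each batch draws its indices independently from the whole dataset. The class name ReplacementBatchSampler and its parameters (data_len, batch_size, num_batches, seed) are made up for illustration. DataLoader accepts any iterable that yields lists of indices as batch_sampler, so the sketch does not subclass anything.

```python
import math
import random


class ReplacementBatchSampler:
    """Yields one list of indices per batch, drawn with replacement.

    Hypothetical helper: names and defaults are illustrative only.
    """

    def __init__(self, data_len, batch_size, num_batches=None, seed=None):
        self.data_len = data_len
        self.batch_size = batch_size
        # Assumption: by default an "epoch" has about as many batches
        # as a normal pass over the data would have.
        if num_batches is None:
            num_batches = math.ceil(data_len / batch_size)
        self.num_batches = num_batches
        self.rng = random.Random(seed)

    def __iter__(self):
        for _ in range(self.num_batches):
            # random.choices samples with replacement, so an index
            # may appear more than once within the same batch.
            yield self.rng.choices(range(self.data_len), k=self.batch_size)

    def __len__(self):
        return self.num_batches
```

It would be used roughly as DataLoader(dataset, batch_sampler=ReplacementBatchSampler(len(dataset), 32)). Note that when batch_sampler is given, batch_size, shuffle, sampler and drop_last should be left at their defaults.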
Alternatively, you can write a collate_fn that re-samples the items of each batch with replacement before collating them, and pass that collate_fn to DataLoader.
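A rough sketch of the collate_fn approach follows. Here the DataLoader shuffles as usual, and the collate function then draws n items with replacement from the n items it received. The factory name make_resampling_collate and the base_collate parameter are hypothetical; base_collate is whatever function should turn the list of samples into a batch.

```python
import random


def make_resampling_collate(base_collate, seed=None):
    """Build a collate_fn that bootstraps each batch before collating.

    Hypothetical helper: base_collate is the function that would
    normally collate the list of samples into a batch.
    """
    rng = random.Random(seed)

    def collate(batch):
        # Draw len(batch) items with replacement from this batch only,
        # so duplicates are possible and some items may be dropped.
        resampled = rng.choices(batch, k=len(batch))
        return base_collate(resampled)

    return collate
```

With PyTorch this could be used as DataLoader(dataset, batch_size=32, shuffle=True, collate_fn=make_resampling_collate(torch.utils.data.default_collate)). One caveat: with num_workers > 0 each worker likely ends up with its own copy of the random generator, so a fixed seed may produce the same resampling pattern in every worker.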