How to draw k random batches?

I would like to draw k random batches of 2 objects each from the CIFAR10 test dataset. Let’s say batch_size=128 and k=64. Any ideas how I can do that?

Hi Niki!

Could you clarify what you want to do? You say that you
want “batches of 2 objects,” but you also say that you want
“batch_size=128.” If 2 is not that batch size (2 items per
batch), what role does 2 play in your question?

Best regards.

K. Frank

Thank you for your reply @KFrank. 128 is the initial number of batches for the test data. I want to put 2 items per batch in k random batches, and I want to create these k random batches in the test function.

Hello Niki!

Here are ways to draw two different kinds of random batches.
Let nTest be the total number of test data items you have.
I will assume that nTest > 2 * k (because the second kind of
random batch below needs 2 * k distinct items).

First (ba in the example code below) we can construct random
batches of two items, where the items are drawn randomly from
your test data, with replacement. This means that a batch
might (but is rather unlikely to) contain two copies of the
same item, and two different batches might similarly contain
the same item.
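
As a minimal sketch of the "with replacement" case, all k pairs of
indices can also be drawn in a single call rather than inside a loop.
The names n_test, k, and data are placeholders chosen for illustration,
and random numbers stand in for the real test data.

```python
import numpy as np
import torch

n_test, k = 30, 5                  # hypothetical sizes, for illustration only
data = torch.randn(n_test, 3)      # stand-in for the real test data

rng = np.random.default_rng()
# k rows of 2 indices, each drawn independently and uniformly from [0, n_test)
idx_pairs = rng.integers(0, n_test, size=(k, 2))

# indexing with a (k, 2) index tensor gives a (k, 2, 3) tensor: k batches of 2 items
batches = data[torch.as_tensor(idx_pairs, dtype=torch.long)]
print(batches.shape)
```

Because each index is drawn independently, nothing prevents the same
item from showing up twice, either within a batch or across batches.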

If instead, you want no item to occur more than once in your
batches, I find it easiest to randomly permute the test data
(or, as in the example below, randomly permute indices into
the test data), and then loop through the random permutation
two at a time.
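
The permutation approach can be sketched without an explicit loop by
keeping the first 2 * k entries of the permutation and grouping them
into pairs. Again, n_test, k, and data are illustrative placeholders,
not anything taken from your code.

```python
import numpy as np
import torch

n_test, k = 30, 5                  # hypothetical sizes; needs n_test >= 2 * k
data = torch.randn(n_test, 3)      # stand-in for the real test data

rng = np.random.default_rng()
# random permutation of all indices; keep the first 2*k and group them in pairs
perm = rng.permutation(n_test)
idx_pairs = perm[: 2 * k].reshape(k, 2)

# every index occurs at most once, so no item is repeated in any batch
batches = data[torch.as_tensor(idx_pairs, dtype=torch.long)]
print(batches.shape)
```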

To make your batches (pseudo-)random, you use a
(pseudo-)random number generator. Both pytorch and
numpy have such things. I’ve used the ones from numpy.
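
For what it's worth, the same draws can be made with pytorch's own
generator. The sketch below assumes you also want repeatable batches,
so it seeds a dedicated torch.Generator. The function name
draw_pair_indices and the sizes are made up for illustration.

```python
import torch

n_test, k = 30, 5   # hypothetical sizes, for illustration only

def draw_pair_indices(seed):
    # a dedicated generator, so the draws don't depend on the global RNG state
    gen = torch.Generator().manual_seed(seed)
    # with replacement: k independent pairs of indices in [0, n_test)
    with_repl = torch.randint(0, n_test, (k, 2), generator=gen)
    # disjoint: first 2*k entries of a random permutation, grouped in pairs
    disjoint = torch.randperm(n_test, generator=gen)[: 2 * k].reshape(k, 2)
    return with_repl, disjoint

a1, b1 = draw_pair_indices(seed=0)
a2, b2 = draw_pair_indices(seed=0)
print(torch.equal(a1, a2), torch.equal(b1, b2))   # same seed, same index pairs
```

Seeding is optional. Without it you get fresh random batches on
every run, which may be what you want in a test function.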

This example assumes that you have your test data in data, a
torch.tensor, and creates k = 5 random batches of two items
each, using both the “with replacement” (ba) and disjoint (bb)
schemes:

import numpy as np
import torch

nTest = 30
k = 5
data = torch.randn (nTest, 3)
p = np.random.permutation (nTest)
# each ba is a batch of 2 drawn independently with replacement
# the bb's are a set of batches of 2 without duplicates, either
# within or among the batches
for i in range (k):
    ba = data[[np.random.randint (nTest), np.random.randint (nTest)]]
    bb = data[p[2*i : 2*i + 2]]
    print (ba)
    print (bb)

(Note, as far as I can tell, the number “128” doesn’t enter into
the specific question you are asking.)
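
If your test data lives in a Dataset rather than a single tensor, one
possible way to get the disjoint batches is to wrap 2 * k random
indices in a Subset and let a DataLoader with batch_size=2 do the
pairing. This is only a sketch: a TensorDataset of random "images"
stands in for the CIFAR10 test set, and n_test and k are made-up
values.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

n_test, k = 100, 8   # hypothetical sizes, for illustration only

# stand-in for the CIFAR10 test set: random 3x32x32 "images", labels in 0..9
images = torch.randn(n_test, 3, 32, 32)
labels = torch.randint(0, 10, (n_test,))
test_set = TensorDataset(images, labels)

# 2*k distinct random indices into the dataset
idx = torch.randperm(len(test_set))[: 2 * k].tolist()

# batch_size=2 and shuffle=False: consecutive indices in idx form one batch
loader = DataLoader(Subset(test_set, idx), batch_size=2, shuffle=False)

for x, y in loader:
    print(x.shape, y.shape)   # each batch holds 2 images and 2 labels
```

With a real torchvision dataset in place of test_set, the Subset and
DataLoader lines should work unchanged.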


K. Frank

Thank you very much for your reply @KFrank.