I am looking to draw k **random** batches of 2 objects from the CIFAR10 test dataset. Let’s say batch_size=128 and k=64. Any ideas how I can do that?

Hi Niki!

Could you clarify what you want to do? You say that you want “batches of 2 objects,” but you also say that you want “batch_size=128.” If 2 is not the batch size (2 items per batch), what role does 2 play in your question?

Best regards.

K. Frank

Thank you for your reply, @KFrank. 128 is the initial number of batches for the test data; I want to put 2 items per batch in k random batches. I want to create the k random batches in the test function.

Hello Niki!

Here are ways to draw two different kinds of random batches.

Let `nTest` be the total number of test data items you have. I will assume that `nTest > 2 * k` (because this will matter for one of the kinds of random batches).

First (`ba` in the example code below) we can construct random batches of two items, where the items are drawn randomly from your test data, *with replacement*. This means that a batch might (but is rather unlikely to) contain two copies of the same item, and two different batches might similarly contain the same item.
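To put rough numbers on "rather unlikely," here is a small sketch of the arithmetic. It assumes `nTest = 10000` (the size of the CIFAR10 test split) and takes `k = 64` from the question; the variable names are only for illustration. A repeat inside a single batch is rare, but with this many draws a repeat somewhere among the batches comes out more likely than not.

```python
# Assumed sizes: the CIFAR10 test split has 10,000 images; k = 64 is
# taken from the question.
nTest = 10_000
k = 64

# Probability that one batch of two holds the same item twice.
p_same_batch = 1.0 / nTest

# Probability that at least one item is repeated anywhere among the
# 2 * k independent draws (the "birthday problem").
p_all_distinct = 1.0
for i in range(2 * k):
    p_all_distinct *= (nTest - i) / nTest
p_any_repeat = 1.0 - p_all_distinct

print(f"same item twice in one batch: {p_same_batch:.4%}")
print(f"some repeat across all {k} batches: {p_any_repeat:.1%}")
```

If any repeat at all is unacceptable for your test, that argues for the permutation-based scheme rather than drawing with replacement.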

If instead, you want no item to occur more than once in your batches, I find it easiest to randomly permute the test data (or, as in the example below, randomly permute indices into the test data), and then loop through the random permutation two at a time.

To make your batches (pseudo-)random, you use a (pseudo-)random number generator. Both pytorch and numpy have such things. I’ve used the ones from numpy.

This example assumes that you have your test data in `data`, a `torch.Tensor`, and creates `k = 5` random batches of two items each, using both the “with replacement” (`ba`) and disjoint (`bb`) approaches:

```
import numpy as np
import torch
nTest = 30
k = 5
data = torch.randn (nTest, 3)
p = np.random.permutation (nTest)
# each ba is a batch of 2 drawn independently with replacement
# the bb's are a set of batches of 2 without duplicates, either
# within or among the batches
for i in range (k):
    ba = data[[np.random.randint (nTest), np.random.randint (nTest)]]
    bb = data[p[2*i : 2*i + 2]]
    print (ba)
    print (bb)
```
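Since pytorch has its own generators as well, the same two schemes can be sketched without numpy. This is only one possible variant, not a required approach: the seed value and the names `g`, `idx_a`, `idx_b`, `batches_a`, and `batches_b` are assumptions made for illustration, and the index tensors are built all at once rather than inside the loop.

```python
import torch

nTest = 30
k = 5
data = torch.randn(nTest, 3)

# A seeded generator makes the draws reproducible (the seed is arbitrary).
g = torch.Generator().manual_seed(0)

# With replacement: a (k, 2) tensor of independent indices in [0, nTest).
idx_a = torch.randint(nTest, (k, 2), generator=g)

# Without duplicates: take the first 2 * k entries of a random
# permutation and group them in pairs.
idx_b = torch.randperm(nTest, generator=g)[: 2 * k].reshape(k, 2)

# Indexing with a (k, 2) index tensor gives shape (k, 2, 3):
# k batches, 2 items per batch, 3 features per item.
batches_a = data[idx_a]
batches_b = data[idx_b]

for ba, bb in zip(batches_a, batches_b):
    print(ba)
    print(bb)
```

Passing an explicit `torch.Generator` keeps these draws separate from the global random state, which can help if the rest of the test code also relies on random numbers.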

(Note, as far as I can tell, the number “128” doesn’t enter into the specific question you are asking.)
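If your test data lives in a `Dataset` rather than a single tensor, the disjoint scheme can also be sketched with `Subset` and a `DataLoader` whose batch size is 2. This is a hedged sketch, not a tested recipe for your setup: `test_set` below is a hypothetical stand-in built from random tensors, and I am assuming that a map-style dataset such as torchvision's CIFAR10 test set could be used in its place.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Hypothetical stand-in for the CIFAR10 test set: 200 fake 3x32x32
# images with integer labels in [0, 10).
test_set = TensorDataset(torch.randn(200, 3, 32, 32),
                         torch.randint(10, (200,)))

k = 64
# 2 * k distinct random indices into the dataset.
idx = torch.randperm(len(test_set))[: 2 * k].tolist()

# batch_size=2 groups the selected items into k batches of two.
loader = DataLoader(Subset(test_set, idx), batch_size=2, shuffle=False)

for images, labels in loader:
    pass  # evaluate the model on each batch of two here

print(len(loader), images.shape, labels.shape)
```

Here the batch size passed to the `DataLoader` is 2, and `k` only controls how many indices are kept.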

Best.

K. Frank