Why does the PyTorch WeightedRandomSampler act so weird?

I am trying to use the WeightedRandomSampler class (torch.utils.data.WeightedRandomSampler) from PyTorch together with the DataLoader class (torch.utils.data.DataLoader), but I get weird results out of it. I use the CIFAR-10 dataset, where the examples are uniformly distributed across the 10 classes.

import numpy as np
import torch
import torchvision as tv

# Transform and batch_size are defined elsewhere in my script
train_set = tv.datasets.cifar.CIFAR10(root="./data/CIFAR10",
                                      train=True,
                                      download=True,
                                      transform=Transform)
sampler = torch.utils.data.WeightedRandomSampler(torch.ones(10) * 0.1, 1000)
Dataloader_tr = torch.utils.data.DataLoader(train_set, batch_size=batch_size, sampler=sampler)

c = []
for _, labels in Dataloader_tr:
    c.extend(labels.numpy())

d = np.bincount(np.array(c))
-> array([104, 0, 80, 0, 0, 291, 104, 199, 0, 222], dtype=int64)

My expectation would be something like this:
-> array([104, 99, 100, 95, 102, 101, 104, 96, 104, 95], dtype=int64)

Please help, I'm really stuck here.

weights should contain sample weights, not class weights, i.e. one weight per example in the dataset, not one weight per class. Since you passed only 10 weights, the sampler can only ever draw dataset indices 0 through 9, so your 1000 draws are repeats of the first ten images, and the bincount just reflects whichever classes those ten images happen to belong to.
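
You can see this with a tiny sketch: the sampler yields dataset indices drawn via torch.multinomial over the weights, so ten weights can only ever produce indices 0..9, no matter how large the dataset behind it is.

import torch

# 10 weights -> only indices 0..9 can ever be sampled
sampler = torch.utils.data.WeightedRandomSampler(torch.ones(10) * 0.1, 20)
print(list(sampler))  # e.g. [3, 0, 9, 1, ...], always within 0..9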
Have a look at this small code snippet for building per-sample weights from the labels.
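
This is a minimal sketch, assuming a torchvision version where the CIFAR10 dataset exposes its labels as train_set.targets (the batch size of 100 is an arbitrary placeholder):

import numpy as np
import torch
import torchvision as tv

transform = tv.transforms.ToTensor()
train_set = tv.datasets.cifar.CIFAR10(root="./data/CIFAR10", train=True,
                                      download=True, transform=transform)

# One weight per *sample*: look up each sample's class weight by its label
targets = np.array(train_set.targets)     # shape (50000,), one label per image
class_weights = np.ones(10) * 0.1         # uniform weight for each of the 10 classes
sample_weights = class_weights[targets]   # shape (50000,), one weight per image

sampler = torch.utils.data.WeightedRandomSampler(
    weights=torch.from_numpy(sample_weights),
    num_samples=1000,
)
loader = torch.utils.data.DataLoader(train_set, batch_size=100, sampler=sampler)

c = []
for _, labels in loader:
    c.extend(labels.numpy())
print(np.bincount(np.array(c)))  # now roughly 100 samples per class

With uniform class weights this reproduces plain uniform sampling; the same pattern lets you oversample rare classes by giving them larger weights, e.g. the inverse class frequency.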
