datasets.MNIST transforms.ToTensor() returns ByteTensor

Hello everyone,
The following code returns a ByteTensor in torchvision 0.2.1 on macOS:

    self.train_dataset = datasets.MNIST(root=root, train=True,
                                        transform=transforms.ToTensor())
    self.test_dataset = datasets.MNIST(root=root, train=False,
                                       transform=transforms.ToTensor())

Is this a bug, or am I missing something here? As far as I remember, it returned a FloatTensor when I used it previously.
Thanks in advance!

The underlying data might still be stored as bytes. However, your ToTensor() transformation should return a FloatTensor for each sample.
Try to check the type of an instance, since the transformation will be applied in __getitem__: print(self.test_dataset[0][0].type()).
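To illustrate the difference between the stored data and a transformed sample, here is a minimal sketch. It does not use torchvision: ByteImageDataset and to_float_tensor are hypothetical stand-ins for datasets.MNIST and ToTensor(), assuming only that the raw images are kept as uint8 and that the transform runs inside __getitem__.

```python
import torch
from torch.utils.data import Dataset


class ByteImageDataset(Dataset):
    """Hypothetical stand-in for datasets.MNIST: uint8 storage, lazy transform."""

    def __init__(self, transform=None):
        # Four fake 28x28 grayscale images stored as bytes.
        self.data = torch.randint(0, 256, (4, 28, 28), dtype=torch.uint8)
        self.targets = torch.tensor([0, 1, 2, 3])
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        img, target = self.data[index], int(self.targets[index])
        if self.transform is not None:
            img = self.transform(img)  # the transform only runs here
        return img, target


def to_float_tensor(img):
    # Rough imitation of ToTensor(): add a channel dim and scale to [0, 1].
    return img.unsqueeze(0).float().div(255)


dataset = ByteImageDataset(transform=to_float_tensor)
raw_dtype = dataset.data[0].dtype    # raw storage, transform bypassed
sample_dtype = dataset[0][0].dtype   # goes through __getitem__
print(raw_dtype, sample_dtype)
```

Indexing the internal tensor skips the transform, while indexing the dataset itself applies it, which is why the two type checks can disagree.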

print(self.test_dataset.test_data[0].type()) also gives a ByteTensor. As a temporary workaround I convert to FloatTensor manually. However, as far as I remember, this snippet worked without any problem before (maybe before 0.4.1; unfortunately I am not sure).

With your statement you are still accessing the underlying test_data.
Try print(self.test_dataset[0][0].type()) instead, without calling .test_data.

PS: I had a small error in my previous post, as you need to index the data of the returned tuple.

Oh, I thought they were the same :)
Yes, now it prints FloatTensor. I am trying to get a random subset of the train_dataset with the following code:

    for class_type in labels.unique():
        indices = np.where(labels == class_type)
        sample_indices = torch.randint(int(indices[0][0]),

        real_sample_pos =, real_indices[sample_indices]))

    if shuffle:

    self.small_train_set = self.train_dataset.train_data[real_sample_pos].type(torch.FloatTensor)
    self.small_train_labels = self.train_dataset.train_labels[real_sample_pos]

Is there a more elegant way that lets me avoid .type(torch.FloatTensor)?

Thank you very much for your help!

You could use a SubsetRandomSampler, keep your Dataset as it is, and just pass the sampler to your DataLoader.
Assuming real_sample_pos was somehow created, here is a small example:

    real_sample_pos = torch.randperm(len(train_dataset))[:100]
    sampler = sampler.SubsetRandomSampler(real_sample_pos)
    loader = DataLoader(train_dataset, sampler=sampler)

This would avoid working with the dataset internals; for example, your transformations might currently not be applied to self.small_train_set.
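For reference, a self-contained sketch of this approach on toy data. The dataset contents, samples_per_class, and the batch size are assumptions chosen for illustration, and the per-class index selection is just one possible way to build real_sample_pos, not necessarily the one used above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.sampler import SubsetRandomSampler

# Hypothetical toy data: 60 samples, 3 classes with 20 samples each.
features = torch.randn(60, 1, 28, 28)
labels = torch.arange(3).repeat_interleave(20)
dataset = TensorDataset(features, labels)

samples_per_class = 5  # assumed subset size per class
picked = []
for class_type in labels.unique():
    # Positions of all samples belonging to this class.
    class_indices = (labels == class_type).nonzero(as_tuple=True)[0]
    # Choose samples_per_class distinct positions within this class.
    choice = torch.randperm(len(class_indices))[:samples_per_class]
    picked.append(class_indices[choice])
real_sample_pos = torch.cat(picked).tolist()

# The sampler draws only from the chosen indices, in random order.
sampler = SubsetRandomSampler(real_sample_pos)
loader = DataLoader(dataset, batch_size=5, sampler=sampler)

seen_labels = torch.cat([y for _, y in loader])
print(torch.bincount(seen_labels))
```

Because the DataLoader fetches every sample through the dataset's __getitem__, a transform attached to the dataset would still run, and no manual .type(torch.FloatTensor) conversion is needed.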

Thank you for the insight!