So I’m following along this tutorial in the docs on custom datasets. I’m using the MNIST dataset instead of the fancy one in the tutorial. This is the extension of the Dataset class I wrote:
class KaggleMNIST(Dataset):
    def __init__(self, csv_file, transform=None):
        self.pixel_frame = pd.read_csv(csv_file)
        self.transform = transform

    def __len__(self):
        return len(self.pixel_frame)

    def __getitem__(self, index):
        if torch.is_tensor(index):
            index = index.tolist()
        image = self.pixel_frame.iloc[index, 1:]
        image = np.array([image])
        if self.transform:
            image = self.transform(image)
        return image
I do not know why the problem exists, exactly, but I found that casting to float64 seems to have solved it. I still wouldn’t mind a more detailed explanation as to what happened.
When you load your images, the pixel values are integers and the array has dtype int64.
This is the source code of normalize:
def normalize(tensor, mean, std, inplace=False):
    """Normalize a tensor image with mean and standard deviation.

    .. note::
        This transform acts out of place by default, i.e., it does not mutates the input tensor.

    See :class:`~torchvision.transforms.Normalize` for more details.

    Args:
        tensor (Tensor): Tensor image of size (C, H, W) to be normalized.
        mean (sequence): Sequence of means for each channel.
        std (sequence): Sequence of standard deviations for each channel.
        inplace(bool,optional): Bool to make this operation inplace.

    Returns:
        Tensor: Normalized Tensor image.
    """
    if not torch.is_tensor(tensor):
        raise TypeError('tensor should be a torch tensor. Got {}.'.format(type(tensor)))

    if tensor.ndimension() != 3:
        raise ValueError('Expected tensor to be a tensor image of size (C, H, W). Got tensor.size() = '
                         '{}.'.format(tensor.size()))

    if not inplace:
        tensor = tensor.clone()

    dtype = tensor.dtype  ############### here
    mean = torch.as_tensor(mean, dtype=dtype, device=tensor.device)
    std = torch.as_tensor(std, dtype=dtype, device=tensor.device)
    if (std == 0).any():
        raise ValueError('std evaluated to zero after conversion to {}, leading to division by zero.'.format(dtype))
    if mean.ndim == 1:
        mean = mean[:, None, None]
    if std.ndim == 1:
        std = std[:, None, None]
    tensor.sub_(mean).div_(std)
    return tensor
As you can see from the marked line, mean and std are converted from tuples or arrays to tensors with the same dtype as the input tensor. Your input is already int64, so mean and std become int64 as well, and a value such as 0.5 is truncated to 0. A std of 0 then triggers the "std evaluated to zero" check right after the conversion.
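To make the truncation concrete, here is a minimal sketch that repeats only the dtype conversion from normalize. The tensor values and variable names are made up for illustration and are not taken from your dataset.

```python
import torch

# Stand-in for an int64 image tensor of shape (C, H, W); values are made up.
int_image = torch.tensor([[[0, 128, 255]]], dtype=torch.int64)

# The same kind of conversion normalize applies to std (and to mean).
std_int = torch.as_tensor([0.5], dtype=int_image.dtype)
std_float = torch.as_tensor([0.5], dtype=torch.float64)

# With an integer dtype the fractional part is dropped, so the zero check fires.
std_is_zero_int = bool((std_int == 0).any())
std_is_zero_float = bool((std_float == 0).any())

print(std_int.dtype, std_is_zero_int)
print(std_float.dtype, std_is_zero_float)
```

With a float input the same 0.5 survives the conversion, which is consistent with casting to float64 having made the problem go away.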
The reason is that Normalize and the other transforms are designed for images, and the base library for image loading is PIL. A typical pipeline therefore works on PIL images (using ToPILImage first if the data is not one already) and then applies ToTensor, which converts the image to a float tensor scaled to the [0, 1] range. Since the input is float at that point, mean and std are converted to float too and keep whatever values you passed.
In your case, a simple solution is to add /255. where you convert the data to an array. Dividing by a float makes the array float64 and scales the pixel values to [0, 1].
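A rough sketch of that change, checked outside the Dataset class. The pixels array is a made-up stand-in for one row of self.pixel_frame, and the rest only verifies the resulting dtype, so treat it as an illustration rather than a drop-in replacement.

```python
import numpy as np
import torch

# Made-up pixel row standing in for self.pixel_frame.iloc[index, 1:].
pixels = np.array([0, 128, 255], dtype=np.int64)

# Dividing by the float literal 255. promotes the array to float64 in [0, 1].
image = np.array([pixels]) / 255.

# A tensor built from this array is float64, so std = 0.5 is no longer truncated.
tensor = torch.from_numpy(image)
std = torch.as_tensor([0.5], dtype=tensor.dtype)
std_nonzero = not bool((std == 0).any())

print(image.dtype, tensor.dtype, std_nonzero)
```

Inside __getitem__ this would correspond to writing image = np.array([image]) / 255. before the transform is applied.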