Why is this transform resulting in a divide by zero error?

So I’m following along with the tutorial in the docs on custom datasets. I’m using the MNIST dataset instead of the fancy one in the tutorial. This is the extension of the Dataset class I wrote:

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset

class KaggleMNIST(Dataset):

    def __init__(self, csv_file, transform=None):
        self.pixel_frame = pd.read_csv(csv_file)
        self.transform = transform

    def __len__(self):
        return len(self.pixel_frame)

    def __getitem__(self, index):
        if torch.is_tensor(index):
            index = index.tolist()

        image = self.pixel_frame.iloc[index, 1:]
        image = np.array([image])

        if self.transform:
            image = self.transform(image)

        return image

It works, until I try to use a transform on it:

from torchvision import transforms

tsf = transforms.Compose([transforms.ToTensor(),
                          transforms.Normalize((0.5,), (0.5,))])
trainset = KaggleMNIST('train/train.csv', transform=tsf)

image0 = trainset[0]

I’ve looked at the stack trace, and it seems like the normalization is happening in this line of code:

c:\program files\python38\lib\site-packages\torchvision\transforms\functional.py in normalize(tensor, mean, std, inplace)
--> 218     tensor.sub_(mean[:, None, None]).div_(std[:, None, None])

So I don’t get why there is a division by zero, since std should be 0.5, which is nowhere near zero.

Thanks for your help!

I do not know why the problem exists, exactly, but I found that casting to float64 seems to have solved it. I still wouldn’t mind a more detailed explanation as to what happened.

My solution:

image = self.pixel_frame.iloc[index, 1:].to_numpy(dtype='float64').reshape(1, -1)


When you load your images, the values are integers of type int64.
This is the source code of normalize:

def normalize(tensor, mean, std, inplace=False):
    """Normalize a tensor image with mean and standard deviation.

    .. note::
        This transform acts out of place by default, i.e., it does not mutates the input tensor.

    See :class:`~torchvision.transforms.Normalize` for more details.

    Args:
        tensor (Tensor): Tensor image of size (C, H, W) to be normalized.
        mean (sequence): Sequence of means for each channel.
        std (sequence): Sequence of standard deviations for each channel.
        inplace(bool,optional): Bool to make this operation inplace.

    Returns:
        Tensor: Normalized Tensor image.
    """
    if not torch.is_tensor(tensor):
        raise TypeError('tensor should be a torch tensor. Got {}.'.format(type(tensor)))

    if tensor.ndimension() != 3:
        raise ValueError('Expected tensor to be a tensor image of size (C, H, W). Got tensor.size() = '
                         '{}.'.format(tensor.size()))

    if not inplace:
        tensor = tensor.clone()

    dtype = tensor.dtype  # <-- here: mean and std take the input tensor's dtype
    mean = torch.as_tensor(mean, dtype=dtype, device=tensor.device)
    std = torch.as_tensor(std, dtype=dtype, device=tensor.device)
    if (std == 0).any():
        raise ValueError('std evaluated to zero after conversion to {}, leading to division by zero.'.format(dtype))
    if mean.ndim == 1:
        mean = mean[:, None, None]
    if std.ndim == 1:
        std = std[:, None, None]
    tensor.sub_(mean).div_(std)
    return tensor

As you can see on the marked line, mean and std are converted from tuples or arrays to tensors with the same dtype as the input tensor. Since your input is already int64, mean and std become int64 too, and a value of 0.5 is truncated to 0 in both.
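To see the truncation in isolation, here is a minimal sketch. The tiny `pixels` tensor is a made-up stand-in for one image, not data from the question.

```python
import torch

# Hypothetical (C, H, W) image with integer pixel values, like the CSV rows.
pixels = torch.tensor([[[0, 128, 255]]], dtype=torch.int64)

# Same conversion normalize() applies to std (and to mean).
std_int = torch.as_tensor((0.5,), dtype=pixels.dtype)
std_float = torch.as_tensor((0.5,), dtype=torch.float32)

print(std_int)    # 0.5 is truncated when stored as int64
print(std_float)  # 0.5 is preserved as float32
```

With an integer dtype the 0.5 cannot be represented, so the later division is a division by zero.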

The reason is that Normalize and the other transforms are designed for images, and the base library for image loading is PIL. When the input is a PIL image or a uint8 array, ToTensor automatically scales the data to the [0, 1] range and converts it to float, so mean and std keep their values whatever they are. Your array is int64, so that conversion is skipped.

For your case, a simple solution would be to divide by 255. where you convert the data to an array:

image = np.array([image])/255.
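A small sketch of why this works, using a made-up three-pixel row in place of the CSV data: true division by a float promotes the int64 array to float64, so mean and std keep their 0.5.

```python
import numpy as np
import torch

row = np.array([[0, 128, 255]], dtype=np.int64)   # hypothetical pixel row

scaled = row / 255.                                # int64 / float -> float64
tensor = torch.from_numpy(scaled)

# Same conversion normalize() performs, now with a float dtype.
mean = torch.as_tensor((0.5,), dtype=tensor.dtype)
std = torch.as_tensor((0.5,), dtype=tensor.dtype)
normalized = (tensor - mean) / std

print(scaled.dtype)
print(std)
print(normalized.min().item(), normalized.max().item())
```

One caveat, which is an assumption about how the data is used later: the resulting tensors are float64, while model weights default to float32, so an `.astype(np.float32)` on the array may be needed before feeding a network.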



That’s a really good explanation! Thank you so much!