I want to normalize the MNIST dataset.
Here is how I calculate the mean and standard deviation:
import torch
import torchvision as tv

transform = tv.transforms.Compose([tv.transforms.ToTensor()])
train_dataset = tv.datasets.MNIST('../data', train=True, download=True, transform=transform)
# scalar statistics over the whole training set (cast uint8 -> float first)
mean = train_dataset.data.float().mean()
std = train_dataset.data.float().std()
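For reference, this is the quick check I use to see what those statistics are computed over:

# dataset.data is the raw image tensor that MNIST stores internally
print(train_dataset.data.dtype)    # torch.uint8
print(train_dataset.data.shape)    # torch.Size([60000, 28, 28])
print(mean.item(), std.item())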
If I manually normalize the data like this (test_dataset is loaded the same way as train_dataset, just with train=False):
train_dataset.data = (train_dataset.data - mean) / std
test_dataset.data = (test_dataset.data - mean) / std
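(After this step, printing the statistics of the modified tensors is how I confirm the normalization took effect; for the training split they should come out close to 0 and 1:)

# quick check that the manual normalization worked on the training split
print(train_dataset.data.mean().item(), train_dataset.data.std().item())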
I get decent accuracy (~0.978), though it is marginally lower than without any normalization (~0.9796).
However, if I use the Normalize transform with the same mean and std:
transform = tv.transforms.Compose([tv.transforms.ToTensor(), tv.transforms.Normalize(mean, std)])
train_dataset = tv.datasets.MNIST('../data', train=True, download=True, transform=transform)
test_dataset = tv.datasets.MNIST('../data', train=False, download=True, transform=transform)
I get very low accuracy (0.135). Why is that, and how should I use Normalize instead?
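If it helps, this is how I inspect a single sample after ToTensor + Normalize (indexing the dataset applies the transform):

img, label = train_dataset[0]    # transform is applied on access
print(img.shape)                 # torch.Size([1, 28, 28])
print(img.min().item(), img.max().item(), img.mean().item())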
Second question: I also tried (manual) pixel-wise normalization:
px_mean = train_dataset.data.float().mean(dim=0)        # per-pixel mean over the training set, shape [28, 28]
px_std = train_dataset.data.float().std(dim=0) + 1e-10   # per-pixel std; small epsilon avoids division by zero
train_dataset.data = (train_dataset.data - px_mean) / px_std
test_dataset.data = (test_dataset.data - px_mean) / px_std
but this, too, gives me very low accuracy (~0.135). Am I doing it wrong, and if so, how do I do it correctly?
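For completeness, the per-pixel statistics have the shape I expect:

print(px_mean.shape, px_std.shape)   # both torch.Size([28, 28])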