I followed the normalization part of the tutorial and used torchvision.transforms.Normalize([0.5], [0.5]) to normalize the input. My data is simply a 2D array (like a grayscale bitmap that already stores the value of each pixel, so I only used one channel, [0.5]) saved as a .dat file. However, I find that the transform does not actually take effect: the input data is not transformed. Here is what I tried:
import os
import operator
from functools import reduce

import numpy as np
import pandas as pd
from torch.utils.data import Dataset
from torchvision import transforms


class DatDataSet(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        # one sub-folder per class; the folder name serves as the label
        self.label = os.listdir(root_dir)
        self.filepath = [os.path.join(root_dir, x) for x in self.label]
        # collect the full path of every .dat file in every class folder
        fullpath = [[os.path.join(x, i) for i in os.listdir(x)] for x in self.filepath]
        self.datpath = reduce(operator.add, fullpath)
        self.num = [len(x) for x in fullpath]
        # numeric label for each file, taken from the index of its class folder
        self.labellist = np.concatenate([counter * np.ones(x) for counter, x in enumerate(self.num)])

    def __len__(self):
        return int(np.sum(self.num))

    def __getitem__(self, idx):
        # each .dat file is a whitespace-separated 2D array of pixel values
        img = pd.read_table(self.datpath[idx], header=None, sep=r'\s+')
        sample = {'data': img.values, 'label': int(self.labellist[idx])}
        return sample
transform = transforms.Compose([transforms.Normalize([0.5], [0.5])])
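For context, what I expect Normalize([0.5], [0.5]) to do is map every pixel value x to (x - 0.5) / 0.5. A toy single-channel example, just for illustration and not part of my dataset code:

    import torch
    t = torch.tensor([[[0.0, 0.5, 1.0]]])          # shape (1, 1, 3): one channel, 1x3 "image"
    print(transforms.Normalize([0.5], [0.5])(t))   # tensor([[[-1., 0., 1.]]])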
In order to see the effect of the normalization, I defined two datasets, one with the transformation, dat_dataset = DatDataSet(root_dir=data_dir, transform=transform), and another without it, dat_dataset2 = DatDataSet(root_dir=data_dir, transform=None). Strangely, when I take the sample at the same index from dat_dataset and dat_dataset2, the values are exactly the same.
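Roughly, the check I did looks like this (index 0 is just an arbitrary example):

    sample1 = dat_dataset[0]   # dataset built with transform=transform
    sample2 = dat_dataset2[0]  # dataset built with transform=None
    # the two arrays come out identical, i.e. the normalization never happened
    print(np.allclose(sample1['data'], sample2['data']))  # prints True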
Is it true that transforms.Normalize([0.5], [0.5]) can only transform images, rather than samples from an arbitrary custom dataset? Or what is the proper way to normalize here?
Any comments and ideas are highly appreciated. Thank you!