How to get a DataLoader for .npy files

I have a few .npy files, each with shape (25, 512, 512), and I need to feed them into my network through a DataLoader. What should I do? Please!

You could write a custom Dataset to lazily load each numpy file:

import numpy as np
import torch
from torch.utils.data import Dataset


class MyDataset(Dataset):
    def __init__(self, np_file_paths):
        # store only the paths; each array is loaded lazily in __getitem__
        self.files = np_file_paths

    def __getitem__(self, index):
        # load a single (25, 512, 512) array and convert it to a float tensor
        x = np.load(self.files[index])
        x = torch.from_numpy(x).float()
        return x

    def __len__(self):
        return len(self.files)

After creating the Dataset instance, you could wrap it in a DataLoader, which will create batches of shape [batch_size, 25, 512, 512].
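As a minimal sketch (the file paths and batch size here are just placeholders):

from torch.utils.data import DataLoader

# hypothetical list of .npy file paths, each storing a (25, 512, 512) array
file_paths = ["volume_0.npy", "volume_1.npy", "volume_2.npy"]

dataset = MyDataset(file_paths)
loader = DataLoader(dataset, batch_size=2, shuffle=True)

for batch in loader:
    print(batch.shape)  # torch.Size([2, 25, 512, 512]); the last batch may be smaller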


Hi, I have similar .npy files. I wrote my Dataset like this:

import numpy as np
import torch
from torch.utils.data import Dataset


class MyDataset(Dataset):
    def __init__(self, dir_x, dir_y, transform=None):
        self.transform = transform
        # load all inputs at once, scale to [0, 1], and reshape to [N, 1, 64, 64]
        self.datax = np.load(dir_x)
        self.datax = self.datax / 255.0
        self.datax = self.datax.reshape(-1, 1, 64, 64)
        self.datax = torch.from_numpy(self.datax).to(torch.float32)

        # load the targets, scale them the same way, and reshape to [N, 1]
        self.datay = np.load(dir_y)
        self.datay = self.datay / 255.0
        self.datay = self.datay.reshape(-1, 1)

    def __len__(self):
        return self.datax.shape[0]

    def __getitem__(self, idx):
        X = self.datax[idx]
        y = self.datay[idx]
        if self.transform:
            X = self.transform(X)
        return X, y

but when I train it, I get a loss of 0. I don’t understand what the problem is.

A loss of 0 would indicate that your model is able to fit the training data perfectly. As a quick check, you can compare the predictions with the targets and see if that’s indeed the case.
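Something along these lines, assuming `model` and `train_loader` are your model and DataLoader (both names are placeholders here):

import torch

model.eval()
with torch.no_grad():
    # grab one batch and compare the model output to the targets
    X, y = next(iter(train_loader))
    preds = model(X)
    print(preds[:5])  # first few predictions
    print(y[:5])      # corresponding targets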