How to load multiple .mat files for training in PyTorch

Abhi · October 16, 2020, 3:53am

I have a 3D dataset (1000 files) in .mat format that I wish to load for training. Can anyone please suggest how to do this?

ptrblck · October 17, 2020, 10:17am

You could use scipy (scipy.io.loadmat) to load the .mat files, as shown here.
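
For the "many files" case in the question, one possible approach is a Dataset that loads one file per sample on demand. This is only a minimal sketch under assumptions not stated in the thread: each .mat file holds exactly one real-valued sample, stored under a variable name that is the same in every file. The class name `MatFolderDataset`, the `folder` argument, and the default key `"data"` are all hypothetical. Note also that `scipy.io.loadmat` does not read MATLAB v7.3 files, which need an HDF5 reader instead.

```python
import glob
import os

import numpy as np
import scipy.io as sio
import torch
from torch.utils.data import Dataset


class MatFolderDataset(Dataset):
    """Hypothetical dataset: one sample per .mat file, loaded lazily."""

    def __init__(self, folder, key="data"):
        # "key" is the variable name stored inside each .mat file (assumed).
        self.paths = sorted(glob.glob(os.path.join(folder, "*.mat")))
        self.key = key

    def __len__(self):
        # One sample per file, so the dataset length is the file count.
        return len(self.paths)

    def __getitem__(self, index):
        # Only the requested file is read, so memory use stays small.
        array = sio.loadmat(self.paths[index])[self.key]
        return torch.from_numpy(np.asarray(array, dtype=np.float32))
```

Passing such a dataset to `DataLoader(dataset, batch_size=32, shuffle=True)` should then produce batches, provided every file holds an array of the same shape.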

Abhi · October 21, 2020, 7:21am

Hi @ptrblck, I created the data loader using the referenced function, but my data is not divided into batches during training (the .mat file contains complex128 data).
One thing I don't understand: I am unable to set a batch size greater than the path length.

```
import numpy as np
import scipy.io as io
import torch
from torch.utils.data import Dataset, DataLoader


class MyDataset(Dataset):
    def __init__(self, mat_paths, col_name, transform):
        self.paths = mat_paths
        # Load the whole array from the .mat file once
        self.matX = io.loadmat(self.paths)[col_name]
        self.transform = transform

    def __getitem__(self, index):
        X = self.matX[index].astype(np.float32)
        X = torch.from_numpy(X)
        X = abs(X)
        return X

    def __len__(self):
        return len(self.paths)


mat_paths = (r'C:\Users\Abhi\Downloads\dataset\outputs\images\train1.mat')

trainDataset = MyDataset(mat_paths, 'train4', transform=transform)

trainLoader = DataLoader(trainDataset, batch_size=32, shuffle=True, num_workers=0, pin_memory=True, drop_last=True)
```

ptrblck · October 21, 2020, 8:00am

I assume self.matX is a numpy array containing all samples in dim0.
If that’s the case, you might need to return len(self.matX) or self.matX.shape[0] in the __len__ function instead of the length of self.paths.
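
A rough sketch of that change, assuming a single .mat file whose array holds all samples along dim 0. In the posted code `self.paths` is a single path string, so `len(self.paths)` is the number of characters in the path, which would explain why the batch size seemed capped by the "path length". The class name `SingleMatDataset` is hypothetical, and taking the magnitude with `np.abs` before the float32 cast is an assumption about what is wanted from the complex128 data (casting complex values straight to float32 drops the imaginary part).

```python
import numpy as np
import scipy.io as sio
import torch
from torch.utils.data import Dataset


class SingleMatDataset(Dataset):
    """Sketch: all samples stored along dim 0 of one array in one .mat file."""

    def __init__(self, mat_path, col_name, transform=None):
        self.matX = sio.loadmat(mat_path)[col_name]
        self.transform = transform

    def __len__(self):
        # Number of samples, not the length of the path string.
        return self.matX.shape[0]

    def __getitem__(self, index):
        # np.abs returns the magnitude of complex values as real floats
        # (assumption: the magnitude is what the model should see).
        x = np.abs(self.matX[index]).astype(np.float32)
        x = torch.from_numpy(x)
        if self.transform is not None:
            x = self.transform(x)
        return x
```

With `__len__` reporting the sample count, a `DataLoader` with `batch_size=32` and `drop_last=True` yields `shape[0] // 32` batches per epoch.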

PS: you can post code snippets by wrapping them into three backticks ```, which makes your code easier to debug.

Abhi · October 22, 2020, 4:56am

@ptrblck It works, thank you.
Okay, I will wrap my code in three backticks before posting.