I have two datasets stored as .mat files. I want to read them with the scipy.io and h5py libraries and use them in my program, but I don't know how to do it. Please give me some pointers, thank you. The code that loads the dataset is as follows.
This post might be helpful, if you would like to read these .mat files lazily.
However, it seems your dataset might be stored completely in the file.
In that case you could read it outside of the Dataset using scipy.io, convert the arrays to tensors via torch.from_numpy, and wrap them in a TensorDataset.
import scipy.io
import torch
from torch.utils.data import TensorDataset
mat = scipy.io.loadmat('test.mat')
data = mat['data'] # use the key for data here
target = mat['target'] # use the key for target here
data = torch.from_numpy(data).float()
target = torch.from_numpy(target).long() # change type to your use case
dataset = TensorDataset(data, target)
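If the file were too large to load at once, the lazy route mentioned above could look roughly like the sketch below. It only applies to -v7.3 files, which are HDF5 containers that h5py can slice; scipy.io.loadmat reads whole variables. The class name `LazyMatV73Dataset` and the key names are hypothetical, and the sketch assumes samples lie along the first MATLAB dimension (which h5py exposes as the last axis) with one integer label per sample.

```python
import h5py
import numpy as np
import torch
from torch.utils.data import Dataset


class LazyMatV73Dataset(Dataset):
    """Reads one sample at a time from a MATLAB -v7.3 (HDF5) file."""

    def __init__(self, path, data_key, label_key):
        self.path = path
        self.data_key = data_key
        self.label_key = label_key
        self._file = None  # opened on first access, so each worker gets its own handle
        with h5py.File(path, "r") as f:
            # h5py shows MATLAB arrays with reversed axes: N x D appears as (D, N)
            self.length = f[data_key].shape[-1]

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        if self._file is None:
            self._file = h5py.File(self.path, "r")
        # Slice a single sample from disk instead of loading the whole array
        x = np.asarray(self._file[self.data_key][..., idx])
        y = np.asarray(self._file[self.label_key][..., idx])
        x = torch.from_numpy(np.ascontiguousarray(x.T)).float()
        y = torch.tensor(int(y.reshape(-1)[0]), dtype=torch.long)
        return x, y
```

You would then pass an instance of this class to a DataLoader as usual. Slicing along the last HDF5 axis can be slow for large files, so this is only worth it when the data does not fit in memory.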
Thank you very much, I will try it. The other .mat file was saved in -v7.3 format and needs to be read with the h5py library. Could you please explain in detail how to do that, ideally with code?
```
src_dataset = scipy.io.loadmat('E:\\ADDA\\pytorch-adda-master-lab\\datasets\\lab\\maria\\mat\\test_target_domain_maria.mat')
testdata = src_dataset['testdata'] # use the key for data here
testlabel = src_dataset['testlabel'] # use the key for target here
testdata = torch.from_numpy(testdata).float()
testlabel = torch.from_numpy(testlabel).long() # change type to your use case
dataset = TensorDataset(testdata, testlabel)
# Note: src_dataset is now the dict returned by loadmat, so the two
# concatenations below raise a TypeError; they need a dataset-name string.
src_encoder_restore = os.path.join(model_root, src_dataset + "-source-encoder-final.pt")
src_classifier_restore = os.path.join(model_root, src_dataset + "-source-classifier-final.pt")
src_model_trained = True
```
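For the -v7.3 file, something along these lines might work. This is only a sketch: the helper name `load_mat_v73` is made up, and it assumes the variables are plain numeric arrays with one integer label per sample. A -v7.3 file is an HDF5 container, and h5py returns MATLAB arrays with their axes reversed, so the arrays are transposed back before building the TensorDataset.

```python
import h5py
import numpy as np
import torch
from torch.utils.data import TensorDataset


def load_mat_v73(path, data_key, label_key):
    """Load two numeric variables from a MATLAB -v7.3 file into a TensorDataset."""
    with h5py.File(path, "r") as f:
        # [()] reads the full dataset into a NumPy array;
        # .T restores MATLAB's axis order (h5py sees N x D as (D, N))
        data = np.ascontiguousarray(f[data_key][()].T)
        labels = np.ascontiguousarray(f[label_key][()].T)
    data = torch.from_numpy(data).float()
    # Assumption: labels are stored as an N x 1 (or 1 x N) vector of class ids
    labels = torch.from_numpy(labels).long().reshape(-1)
    return TensorDataset(data, labels)


# Hypothetical usage; replace the path and keys with your own:
# dataset = load_mat_v73('test_v73.mat', 'testdata', 'testlabel')
```

If you are unsure about the variable names, `list(h5py.File(path, "r").keys())` shows what is stored in the file. Cell arrays and structs are stored as HDF5 references and would need extra handling that this sketch does not cover.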
Hello sir, I understood the data part, but what are we supposed to use for the target = mat['target'] part? Are we supposed to enter the labels here?
target = mat['target'] # use the key for target here
assumes you’ve stored the target values in the .mat file using the 'target' key.
If that’s not the case and e.g. you don’t have any targets, you can skip this step.
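To check which keys your file actually contains, and to handle the case without targets, a small sketch like the following might help. The helper names `list_mat_variables` and `build_dataset` are hypothetical; the only fact relied on is that scipy.io.loadmat returns a dict whose metadata entries start with a double underscore.

```python
import scipy.io
import torch
from torch.utils.data import TensorDataset


def list_mat_variables(mat):
    """Return the variable names in a loadmat dict, skipping metadata such as __header__."""
    return [key for key in mat if not key.startswith("__")]


def build_dataset(mat, data_key, target_key=None):
    """Build a TensorDataset with targets if a target key is given and present."""
    data = torch.from_numpy(mat[data_key]).float()
    if target_key is None or target_key not in mat:
        # No labels stored: the dataset yields 1-tuples (sample,)
        return TensorDataset(data)
    # Assumption: one integer class label per sample
    target = torch.from_numpy(mat[target_key]).long().reshape(-1)
    return TensorDataset(data, target)


# Hypothetical usage:
# mat = scipy.io.loadmat('test.mat')
# print(list_mat_variables(mat))
# dataset = build_dataset(mat, 'data', 'target')
```

So yes, the target key should point to the array that holds your labels, under whatever name it was saved with in MATLAB.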
You can use tensors directly. DataLoaders are able to shuffle, create batches, use multiple workers to load the next batches etc., but they are not strictly required.
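As a rough illustration of working with the tensors directly, the sketch below does the shuffling and batching by hand. The function name `iterate_minibatches` is made up for this example, and it assumes data and target share the same first dimension.

```python
import torch


def iterate_minibatches(data, target, batch_size, shuffle=True):
    """Yield (data, target) mini-batches by indexing the tensors directly."""
    n = data.size(0)
    # A random permutation of sample indices stands in for DataLoader's shuffle
    order = torch.randperm(n) if shuffle else torch.arange(n)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        yield data[idx], target[idx]


# Hypothetical usage inside a training loop:
# for x, y in iterate_minibatches(data, target, batch_size=32):
#     ...
```

This skips the multi-worker loading a DataLoader provides, which usually does not matter when everything is already in memory.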