My task is to take an episode of a TV show together with its subtitles and make the subtitle timings more accurate (improving from roughly 200ms to 20ms). So I want to learn to classify what is speech and what is not.
I’ve now taken the audio, converted it into a spectrogram, and separated each column of the spectrogram into a single data item. So now I have two arrays:
```python
print(train_speech.size())   # torch.Size([93482, 201])
print(train_silence.size())  # torch.Size([35038, 201])
```
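For context, the preprocessing was roughly along these lines (just a sketch using torchaudio; the filename is a placeholder, and `n_fft=400` is what produces the 201 frequency bins shown above):

```python
import torchaudio

# Placeholder filename; in practice this is the episode's audio track.
waveform, sample_rate = torchaudio.load("episode.wav")

# n_fft=400 -> n_fft // 2 + 1 = 201 frequency bins per column.
spec = torchaudio.transforms.Spectrogram(n_fft=400)(waveform)  # (channels, 201, time)

# Transpose so each row is one spectrogram column: (time, 201).
frames = spec[0].T
```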
All I want to do is train a simple multi-layer linear NN to tell the difference between the two, where:
train_speech contains FFTs of frames where people are talking, and
train_silence contains frames with no talking (I used the subtitles to make the distinction).
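By a simple multi-layer linear NN I mean something like this sketch (the hidden width of 64 is just a guess on my part):

```python
import torch.nn as nn

# 201 FFT bins in, 2 classes out (speech vs. silence).
model = nn.Sequential(
    nn.Linear(201, 64),
    nn.ReLU(),
    nn.Linear(64, 2),
)
```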
My question is: what Dataset/DataLoader setup can I use to feed these two tensors into PyTorch?
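For concreteness, would something like the following work? (A minimal sketch assuming I can just concatenate the two tensors and attach 0/1 labels; the batch size is arbitrary.)

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stack both classes and label them: speech = 1, silence = 0.
features = torch.cat([train_speech, train_silence])               # (128520, 201)
labels = torch.cat([torch.ones(len(train_speech), dtype=torch.long),
                    torch.zeros(len(train_silence), dtype=torch.long)])

dataset = TensorDataset(features, labels)
loader = DataLoader(dataset, batch_size=256, shuffle=True)

for x, y in loader:   # x: (256, 201) spectrogram columns, y: (256,) labels
    pass              # feed each batch into the model
```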