Training on waveform vs spectrogram

I think the difference would be quite large: in the time domain the sampling rate can be high, so even a short clip becomes a very long sequence, which makes training quite challenging. If I remember correctly, one way to make some models work directly on waveforms was to use a stack of conv layers with specific dilation rates, so that long inputs stayed feasible to process.
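
To put rough numbers on this, here is a small sketch in plain Python. Everything in it is an assumption for illustration: the 16 kHz sampling rate, the hop length of 160 samples, the kernel size of 2, and the doubling dilation schedule (similar in spirit to WaveNet-style stacks, which may or may not be the models I was thinking of). The helper names `receptive_field` and `num_stft_frames` are made up for this example.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in samples, of stacked stride-1 dilated 1-D convolutions."""
    rf = 1
    for d in dilations:
        # Each layer widens the receptive field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


def num_stft_frames(num_samples, hop_length):
    """Number of frames of a centered (padded) STFT with the given hop length."""
    return 1 + num_samples // hop_length


# Assumed settings, not taken from any specific model.
sample_rate = 16_000                      # samples per second
num_samples = sample_rate * 1             # one second of audio
dilations = [2 ** i for i in range(10)]   # 1, 2, 4, ..., 512

rf = receptive_field(kernel_size=2, dilations=dilations)
frames = num_stft_frames(num_samples, hop_length=160)

print(f"waveform time steps:    {num_samples}")
print(f"spectrogram frames:     {frames}")
print(f"receptive field (10 layers, doubling dilation): {rf} samples")
```

Under these assumptions, one second of audio is 16000 time steps as a waveform but only 101 frames as a spectrogram, and ten dilated layers already see 1024 samples at once, whereas ten undilated layers with the same kernel would see only 11.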
