Hi there,

I am trying to implement a 1D CNN over raw audio data (mono channel).

Each of my wav files has, say, 60k samples. My model code looks like this:

…

self.conv_1 = nn.Conv1d(self.input_spec_size, self.cnn_filter_size, 3, 1)

self.max_pooling_1 = nn.MaxPool1d(3)

…

But it throws the error: Expected 3-dimensional input for 3-dimensional weight [64, 60000, 3], but got 2-dimensional input of size [40, 60000] instead

Here 64 is the number of output filters, 3 is the kernel size, and 40 is my batch size. Could someone please explain how to reshape this (60000, 1) input data?
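For reference, here is a minimal sketch of the layout nn.Conv1d expects, which is (batch, channels, length). It assumes the 60000 in the weight shape comes from self.input_spec_size being set to the number of samples, and that the number of input channels for mono audio should be 1 instead. The variable names and the small batch size of 2 (to keep the demo light; the post uses 40) are illustrative only.

```python
import torch
import torch.nn as nn

# Assumed sizes for illustration: 2 clips instead of 40 to keep memory low
batch_size, num_samples = 2, 60000

# Raw mono audio as loaded: one row of samples per clip -> (batch, length)
data = torch.randn(batch_size, num_samples)

# Conv1d wants (batch, channels, length); mono audio has 1 channel
x = data.unsqueeze(1)

# in_channels is the number of audio channels (1), not the number of samples
conv_1 = nn.Conv1d(in_channels=1, out_channels=64, kernel_size=3, stride=1)
max_pooling_1 = nn.MaxPool1d(3)

with torch.no_grad():
    out = max_pooling_1(conv_1(x))

print(x.shape)    # torch.Size([2, 1, 60000])
print(out.shape)  # torch.Size([2, 64, 19999])
```

With this layout the 60000 samples sit on the last (length) axis, and the kernel of size 3 slides along that axis.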

#########################

I tried reshaping the input as

data.reshape(1,60000)

but then I get the error:

Given groups=1, weight of size [64, 60000, 3], expected input[40, 1, 60000] to have 60000 channels, but got 1 channels instead
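Reading the two messages side by side: the weight size is reported as [out_channels, in_channels, kernel_size], so [64, 60000, 3] suggests the layer was built with 60000 input channels, while the input [40, 1, 60000] has the (batch, channels, length) layout with 1 channel. Below is a rough sketch of those two checks in plain Python, assuming groups=1. The helpers check_conv1d_input and conv1d_out_len are hypothetical names for illustration, not part of PyTorch; the length formula is the one given in the Conv1d and MaxPool1d documentation.

```python
def check_conv1d_input(weight_shape, input_shape):
    """Mimic the two shape checks behind the errors (hypothetical helper, groups=1)."""
    out_channels, in_channels, kernel_size = weight_shape
    if len(input_shape) != 3:
        return f"need (batch, channels, length), got {len(input_shape)}-D input"
    batch, channels, length = input_shape
    if channels != in_channels:
        return f"layer expects {in_channels} channels, input has {channels}"
    return "ok"


def conv1d_out_len(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along the last axis for Conv1d / MaxPool1d."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# First error: 2-D input, no channel axis
print(check_conv1d_input((64, 60000, 3), (40, 60000)))
# Second error: channel axis present, but layer built with in_channels=60000
print(check_conv1d_input((64, 60000, 3), (40, 1, 60000)))
# Layer built with in_channels=1 matches the mono input
print(check_conv1d_input((64, 1, 3), (40, 1, 60000)))

# Lengths after Conv1d(kernel 3, stride 1), then MaxPool1d(3) (stride defaults to 3)
after_conv = conv1d_out_len(60000, kernel_size=3, stride=1)
after_pool = conv1d_out_len(after_conv, kernel_size=3, stride=3)
print(after_conv, after_pool)  # 59998 19999
```

If that reading is right, the reshape already produces the correct layout, and the remaining mismatch is the first argument passed to nn.Conv1d.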