Trouble converting CNN from Keras to PyTorch

I have created a CNN using the Keras library that takes 1D arrays of 37000 data points as input and ends in a fully connected layer with 10 output nodes. I defined what I thought was an equivalent model in PyTorch, but my results are significantly different. Does anyone see any mistakes in my implementation? The Keras model is defined as follows:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, Dense, Flatten

model_aq = Sequential()
model_aq.add(Conv1D(32, 3, activation="relu", input_shape=(37000, 1)))
model_aq.add(Conv1D(32, 3, activation="relu"))
model_aq.add(Conv1D(32, 3, activation="relu"))
model_aq.add(Flatten())
model_aq.add(Dense(200, activation="relu"))
model_aq.add(Dense(10))
model_aq.compile(loss="mse", optimizer="RMSProp", metrics=[tf.keras.metrics.RootMeanSquaredError()])
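To sanity-check where the 36994 in the PyTorch model below comes from, printing the per-layer output shapes is useful; a quick check, assuming the snippet above has run:

model_aq.summary()
# Each Conv1D defaults to padding='valid', so kernel_size=3 trims 2 points per layer:
# 37000 -> 36998 -> 36996 -> 36994, and Flatten yields 32 * 36994 = 1183808 features.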

The model I created in PyTorch is defined as:
import torch
import torch.nn as nn

class ModelAQ(nn.Module):
    def __init__(self):
        super(ModelAQ, self).__init__()
        self.conv1 = nn.Conv1d(1, 32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv1d(32, 32, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv1d(32, 32, kernel_size=3, stride=1, padding=1)
        self.flatten = nn.Flatten()
        self.dense1 = nn.Linear(32 * 36994, 200)
        self.dense2 = nn.Linear(200, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.flatten(x)
        x = self.dense1(x)
        x = self.dense2(x)
        return x

pytorch_model_aq = ModelAQ()
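As a rough equivalence check (my assumption: a faithful port should match the Keras model parameter for parameter), the total parameter count can be compared against the "Total params" line of model_aq.summary():

n_params = sum(p.numel() for p in pytorch_model_aq.parameters())
print(n_params)  # should equal Keras's "Total params" if the architectures line up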

To train the PyTorch model I used the following:
import torch.optim as optim

criterion = nn.MSELoss(reduction='sum')

def train(model_inp, num_epochs=num_epochs):
    # optimizer = optim.RMSprop(model_inp.parameters(), lr=learning_rate)
    optimizer = optim.RMSprop(model_inp.parameters())
    for epoch in range(num_epochs):
        running_loss = 0.
        for inputs, labels in train_iter:
            outputs = model_inp(inputs)
            loss = criterion(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            running_loss += loss.item()
            optimizer.step()
        if epoch % 1 == 0:
            print(f'Epoch {epoch+1}/{num_epochs} running accumulative loss {running_loss:.3f}')
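Side note on defaults, in case it matters for the comparison: Keras's RMSprop uses learning_rate=0.001, rho=0.9 and epsilon=1e-07, while torch.optim.RMSprop defaults to lr=0.01, alpha=0.99 and eps=1e-08, and Keras's "mse" loss is a mean whereas the loop above sums. A sketch that aligns the PyTorch side with the Keras defaults:

criterion = nn.MSELoss(reduction='mean')  # Keras "mse" averages over elements
optimizer = optim.RMSprop(pytorch_model_aq.parameters(),
                          lr=0.001,    # Keras default learning_rate
                          alpha=0.9,   # Keras rho maps to PyTorch alpha
                          eps=1e-07)   # Keras default epsilon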

Your PyTorch model is missing the ReLU activations, so either add nn.ReLU modules or use the functional API via x = F.relu(x). I'm also unsure how the shape is defined in Keras via input_shape=(37000, 1): is this defining the shape as [batch_size, features], or is the batch size omitted and the shape defined as [sequence_length, channels]?
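For concreteness, here is a minimal sketch with the activations added via the functional API. One assumption baked in: Keras's Conv1D defaults to padding='valid', so I dropped padding=1 in favor of PyTorch's default padding=0, which is what makes the 32 * 36994 input to the first Linear consistent (the length of 37000 shrinks by 2 per convolution):

import torch
import torch.nn as nn
import torch.nn.functional as F

class ModelAQ(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv1d(1, 32, kernel_size=3)   # padding=0 mirrors Keras padding='valid'
        self.conv2 = nn.Conv1d(32, 32, kernel_size=3)
        self.conv3 = nn.Conv1d(32, 32, kernel_size=3)
        self.flatten = nn.Flatten()
        self.dense1 = nn.Linear(32 * 36994, 200)
        self.dense2 = nn.Linear(200, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = self.flatten(x)
        x = F.relu(self.dense1(x))  # the Keras Dense(200) also has activation="relu"
        return self.dense2(x)       # final Dense(10) has no activation, matching Keras

# Dummy forward pass; PyTorch Conv1d expects [batch, channels, length]:
out = ModelAQ()(torch.randn(2, 1, 37000))
print(out.shape)  # torch.Size([2, 10])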