I’m trying to implement a convolutional autoencoder. When I flatten the tensor coming out of a Conv2d layer to feed it into a Linear layer, I realize that the tensor holds not one but many samples. What is the correct way to do this conversion?
I believe that’s to be expected if you feed the network in batches! So flatten the output of Conv2d per sample and keep the batch dimension. You can do this using .view(-1, <the expected value of flattened conv2d>). The -1 tells PyTorch to infer that dimension from the total number of elements, so it automatically adjusts to your batch size.
Take a look at this sample code:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)  # this is the flatten part: (batch, 20, 4, 4) -> (batch, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)  # softmax over the class dimension
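In case it helps to see where the 320 comes from, here is a minimal sketch that pushes a random batch through the same two conv layers and prints nothing, just keeps the shapes around for inspection. The 28x28 single-channel input and the batch size of 64 are my assumptions (the sample above does not state an input size); with a different input size the flattened width will not be 320.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same conv layers as in the sample Net; the 28x28 input size is an assumption.
conv1 = nn.Conv2d(1, 10, kernel_size=5)
conv2 = nn.Conv2d(10, 20, kernel_size=5)

x = torch.randn(64, 1, 28, 28)          # batch of 64 single-channel images
x = F.relu(F.max_pool2d(conv1(x), 2))   # 28 -> 24 (conv) -> 12 (pool)
x = F.relu(F.max_pool2d(conv2(x), 2))   # 12 -> 8 (conv) -> 4 (pool)
conv_shape = tuple(x.shape)             # (batch, channels, height, width)

flat_view = x.view(-1, 320)             # 20 * 4 * 4 = 320 features per sample
flat_size = x.view(x.size(0), -1)       # fix the batch dim, infer the feature count
flat_fn = torch.flatten(x, 1)           # flatten every dim after the batch dim
```

All three flattened tensors should have the same shape and contents. The last two variants pin the batch dimension instead of the feature count, so if the hard-coded feature count is wrong you get a shape error at the Linear layer rather than a tensor whose rows silently mix several samples.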
Please correct me if I’m wrong