I recently got into neural networks and machine learning and have a question.
I built a small network that is supposed to take an input matrix (24x30) and an output matrix (24x30) and learn to predict the output (both are basically 2D grayscale images). The idea is that the output matrix is a subsequent development of the input matrix (i.e. a temporal follow-up).
My approach is to use a 2D convolution layer, ReLU, max pooling, and then three linear (fully connected) layers. My problem is that the prediction is a 1x30 vector instead of a 24x30 matrix, which makes sense given the last linear layer (out_features=30) and the .view part of the network.
I clearly don't understand the network well enough yet, so it would be nice if someone could help me out here.
The code is as follows:
class ConvNet(torch.nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        self.conv1 = torch.nn.Conv2d(in_channels=args.batchsize, out_channels=16, kernel_size=5)
        self.fc1 = torch.nn.Linear(in_features=2080, out_features=384)
        self.fc2 = torch.nn.Linear(in_features=384, out_features=90)
        self.fc3 = torch.nn.Linear(in_features=90, out_features=30)

    def forward(self, x):
        x = x.unsqueeze(0)
        x = torch.nn.functional.max_pool2d(torch.nn.functional.relu(self.conv1(x)), (2, 2))
        x = x.view(-1, self.num_flat_features(x))
        x = torch.nn.functional.relu(self.fc1(x))
        x = torch.nn.functional.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features
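To show where I think the 1x30 comes from, here is a small shape trace in plain Python (no torch needed). It is only a sketch under assumptions: args.batchsize is taken to be 1, the convolution uses the Conv2d defaults (stride 1, no padding), and the helper names conv2d_out and pool2d_out are my own, not PyTorch functions.

```python
# Shape bookkeeping only; this mirrors forward() but does no real computation.

def conv2d_out(size, kernel, stride=1, padding=0):
    """Length of one spatial dimension after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool2d_out(size, kernel):
    """Length of one spatial dimension after max pooling (stride = kernel)."""
    return (size - kernel) // kernel + 1

batchsize = 1                          # assumption: args.batchsize == 1
x_shape = (batchsize, 24, 30)          # what is passed to forward()

# x.unsqueeze(0): the batch dimension ends up in the channel position
after_unsqueeze = (1,) + x_shape

# conv1: in_channels=batchsize, out_channels=16, kernel_size=5
after_conv = (after_unsqueeze[0], 16,
              conv2d_out(after_unsqueeze[2], 5),
              conv2d_out(after_unsqueeze[3], 5))

# max_pool2d with a (2, 2) window
after_pool = (after_conv[0], after_conv[1],
              pool2d_out(after_conv[2], 2),
              pool2d_out(after_conv[3], 2))

# x.view(-1, num_flat_features(x)): everything except dim 0 is flattened
flat_features = after_pool[1] * after_pool[2] * after_pool[3]
after_view = (after_pool[0], flat_features)

# fc1 -> fc2 -> fc3 only change the last dimension: 2080 -> 384 -> 90 -> 30
after_fc3 = (after_view[0], 30)

print("after unsqueeze:", after_unsqueeze)
print("after conv1:    ", after_conv)
print("after pool:     ", after_pool)
print("after view:     ", after_view)
print("after fc3:      ", after_fc3)
```

The flattened size matches the in_features=2080 of fc1, and the last line is the 1x30 prediction I am seeing: the spatial layout is gone after .view, and fc3 only produces 30 values per sample.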