I have two questions about tensor dimensions.

How do I convert my model output from torch.Size([5, 1024, 7, 2, 4]) to torch.Size([5, 1024, 7, 1, 1])? I thought of slicing, but is that a good idea, or is there a better way to do it?

I have two outputs of torch.Size([5, 1024, 7, 1, 1]) from the network. I would like to concatenate them and feed the result to a two-layer fully connected network (layer 1 output: 1024, layer 2 output: 1). What would be the best way to do it?
Thank you.
Since I don’t want to modify the network, I used torch.max to reduce the shape from [5, 1024, 7, 2, 4] to [5, 1024, 7]:
out = torch.max(x, dim=3)[0]
out = torch.max(out, dim=3)[0]
I hope this does not discard any important features from the final layer.
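For reference, here is a minimal sketch of the same reduction done in a single call. It assumes a PyTorch version that provides torch.amax, and the random tensor x is only a stand-in for the real model output. With keepdim=True it also gives the [5, 1024, 7, 1, 1] shape from my first question.

```python
import torch

# Stand-in for the model output; the real tensor comes from the network.
x = torch.randn(5, 1024, 7, 2, 4)

# Two chained max reductions, as in the snippet above.
chained = torch.max(x, dim=3)[0]
chained = torch.max(chained, dim=3)[0]

# One reduction over the last two dimensions at once.
reduced = torch.amax(x, dim=(3, 4))

# keepdim=True keeps the reduced dimensions with size 1.
kept = torch.amax(x, dim=(3, 4), keepdim=True)

print(chained.shape, reduced.shape, kept.shape)
```

Either way, only the largest activation in each 2x4 window survives. Averaging with x.mean(dim=(3, 4)) would be one alternative if that turns out to be too lossy, but which works better is something I would have to test.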
I wanted to concatenate the two outputs as input to the fully connected layers. Yes, flattening helped me achieve this:
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, spatial_model, temporal_model):
        super(FusionNet, self).__init__()
        self.spatial_model = spatial_model
        self.temporal_model = temporal_model
        self.fc = nn.Sequential(
            nn.Linear(1024 * 7 * 2, 1024, bias=True),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(1024, 1),
        )

    def forward(self, spatial_input, temporal_input):
        spatial_output = self.spatial_model(spatial_input)
        temporal_output = self.temporal_model(temporal_input)
        # Assuming each branch returns [N, 1024, 7], this gives [N, 2048, 7].
        fused = torch.cat((spatial_output, temporal_output), dim=1)
        # Flatten all but the batch dimension (-1 infers 1024 * 7 * 2).
        out = fused.view(fused.size(0), -1)
        out = self.fc(out)
        return out
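As a quick sanity check on the shapes, here is a small sketch that runs the concatenate, flatten, and fully connected steps on dummy tensors. The two random tensors are stand-ins for the branch outputs, assuming each is [5, 1024, 7] after the max reduction. Note that the flatten step needs -1 in view so that the remaining dimensions are inferred.

```python
import torch
import torch.nn as nn

# Stand-ins for the two branch outputs (assumed shape [5, 1024, 7] each).
spatial_output = torch.randn(5, 1024, 7)
temporal_output = torch.randn(5, 1024, 7)

# Concatenate along dim 1, then flatten everything except the batch dim.
fused = torch.cat((spatial_output, temporal_output), dim=1)
flat = fused.view(fused.size(0), -1)

# Same two-layer head as in FusionNet.
fc = nn.Sequential(
    nn.Linear(1024 * 7 * 2, 1024),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),
    nn.Linear(1024, 1),
)
out = fc(flat)

print(fused.shape, flat.shape, out.shape)
```

The flattened width is 2048 * 7 = 14336, which matches the 1024 * 7 * 2 input features of the first Linear layer.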
Thanks again