I have the end result of the 3D convolutional part of my network, with shape:

n_samples, n_channels, height, width, depth (in this example, 1, 32, 25, 32, 32)

I wish to:

Global average pool this

Concatenate it with a linear extra input (which is of shape n_samples, n_extra_inputs)

Feed it to a fully connected layer

The issue I have is in flattening/squeezing the output of the AvgPool3d layer and then concatenating it with the extra input. I tried:

self.global_pool = nn.AvgPool3d(kernel_size=1)
...
x = self.global_pool(x)
torch.cat([x.squeeze(), extra_input])

but x.squeeze() just goes from shape [1, 32, 25, 32, 32] to [32, 25, 32, 32], which is clearly not what I need. This may be a case for view or flatten, but rather than trial and error I thought I'd ask for the recommended approach.

Your kernel_size is 1. That defeats the purpose of an AvgPool3d layer, as it will just return the input unchanged. If you want to average each channel down to a single value, try AdaptiveAvgPool3d(output_size=1).
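A minimal sketch of the difference, using the shapes from the question (random data stands in for the conv output):

```python
import torch
import torch.nn as nn

# Dummy conv output: 1 sample, 32 channels, 25x32x32 volume
x = torch.randn(1, 32, 25, 32, 32)

# kernel_size=1 averages over a single voxel, so the shape is unchanged
same = nn.AvgPool3d(kernel_size=1)(x)
print(same.shape)  # torch.Size([1, 32, 25, 32, 32])

# AdaptiveAvgPool3d(1) averages each channel's whole volume to one value
pooled = nn.AdaptiveAvgPool3d(output_size=1)(x)
print(pooled.shape)  # torch.Size([1, 32, 1, 1, 1])
```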

So I get a shape [1, 32, 1, 1, 1] from that, but squeeze() then goes to [32], because it removes the batch dimension along with the spatial ones. I then used torch.reshape(x, [1, 32]) to get what I needed.
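As a side note, torch.flatten(x, start_dim=1) avoids the hard-coded reshape by flattening everything except the batch dimension, so it generalises to any n_samples. A quick sketch:

```python
import torch

x = torch.randn(1, 32, 1, 1, 1)  # output of AdaptiveAvgPool3d(1)

# squeeze() removes *all* size-1 dims, including the batch dim
print(x.squeeze().shape)  # torch.Size([32])

# flattening from dim 1 onward keeps the batch dimension intact
flat = torch.flatten(x, start_dim=1)
print(flat.shape)  # torch.Size([1, 32])
```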

However, now I'm struggling to work out how to concatenate the extra data for each sample. Since my n_samples is currently 1 it's not so complicated, but it needs to generalise.

How do I concatenate tensors of shape [1, 32] and [1, 7] to correctly get a tensor of shape [1, 39]?
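A sketch of one standard way to do this: torch.cat with dim=1 joins the tensors along the feature dimension, leaving the batch dimension alone, so it works for any n_samples (the tensor names here are placeholders):

```python
import torch

pooled = torch.randn(1, 32)  # flattened pooled features, [n_samples, 32]
extra = torch.randn(1, 7)    # extra inputs, [n_samples, 7]

# Concatenate along dim=1 (features); dim=0 is the batch dimension
combined = torch.cat([pooled, extra], dim=1)
print(combined.shape)  # torch.Size([1, 39])
```

The same call with batched inputs, e.g. shapes [8, 32] and [8, 7], yields [8, 39].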