ResNet3D for regression

banikr · October 2, 2019, 5:21pm

Hello,
I am working on a regression project where I have to predict a 512-element vector from 3D images. Each image has dimensions 86×110×78.
I am unsure whether to use MaxPool3d layers. If I use max pooling, I have to pad the images to 88×112×80. Since there are 3 pooling layers, the final feature map will have shape 11×14×10.
Then, using a fully connected layer (with tanh activation), I will reduce the output to 512 values.
Are there better ways to do this without pooling? I know about strides, but I am not familiar with how to use them here.
I also need to keep the residual connections/identity mappings in mind.
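
To make the shape arithmetic above concrete, here is a small pure-Python sketch. It assumes each pooling layer is an `nn.MaxPool3d` with `kernel_size=2` and `stride=2`, no padding, and the default floor rounding; the post does not state the kernel size, so that part is an assumption. The helper names are made up for illustration.

```python
def pool_out(n, kernel=2, stride=2):
    """Output length along one axis for an unpadded pooling window (floor mode)."""
    return (n - kernel) // stride + 1


def after_pools(shape, num_pools=3):
    """Apply the pooling size rule num_pools times to every axis of shape."""
    for _ in range(num_pools):
        shape = tuple(pool_out(n) for n in shape)
    return shape


unpadded = after_pools((86, 110, 78))  # odd intermediate sizes lose a voxel per axis
padded = after_pools((88, 112, 80))    # every axis is divisible by 2**3

print(unpadded)  # (10, 13, 9)
print(padded)    # (11, 14, 10)
```

Under these assumptions the padded volume halves cleanly three times, while the unpadded one silently drops the last slice whenever an intermediate size is odd. With the padded input, the fully connected layer would see 11 × 14 × 10 = 1540 values per channel.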

Also, if I zero-pad the image, it will change the zero-mean, unit-variance normalization of the voxel values.
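
As a rough estimate of how large that effect is, the sketch below assumes the volume is normalized to mean 0 and variance 1 first and zero-padded afterwards (the order is an assumption, the post does not say). In that case the added zeros leave the sum at 0, so the mean stays 0, but the same sum of squares is divided by a larger voxel count.

```python
import math

original_voxels = 86 * 110 * 78  # 737,880 voxels before padding
padded_voxels = 88 * 112 * 80    # 788,480 voxels after padding

# Population statistics of the padded volume, assuming the original volume
# had mean 0 and variance 1 and every padded voxel is exactly 0.
mean_after = 0.0
var_after = original_voxels / padded_voxels
std_after = math.sqrt(var_after)

print(round(var_after, 3))  # 0.936
print(round(std_after, 3))  # 0.967
```

So under these assumptions the standard deviation drops by roughly 3%, and the mean is unaffected. Whether a shift of that size matters for training is something to check empirically.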

ptrblck · October 5, 2019, 3:26pm

The padding approach seems to be fine.
However, if you want to reduce the volumetric dimensions using strided convolutions, have a look here for some information on how strides work in conv layers.
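
As a quick illustration of the stride arithmetic, the sketch below applies the output-size formula from the `nn.Conv3d` documentation to the 86×110×78 input. The layer settings are assumptions chosen for the example: three stages, each downsampling with `kernel_size=3, stride=2, padding=1` on the main branch and a `kernel_size=1, stride=2` projection on the residual shortcut.

```python
def conv_out(n, kernel, stride=1, padding=0, dilation=1):
    """Output length along one axis, per the nn.Conv3d shape formula."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


shape = (86, 110, 78)  # unpadded input volume
for stage in range(3):
    # Main branch: 3x3x3 convolution with stride 2 and padding 1.
    main = tuple(conv_out(n, kernel=3, stride=2, padding=1) for n in shape)
    # Shortcut branch: 1x1x1 convolution with stride 2, so it can be added to main.
    shortcut = tuple(conv_out(n, kernel=1, stride=2) for n in shape)
    assert main == shortcut, "residual addition needs matching shapes"
    shape = main
    print(stage, shape)

final_shape = shape  # (11, 14, 10)
```

With these settings the network reaches 11×14×10 without padding the input beforehand, and the shortcut shape matches the main branch at every stage, so the residual addition stays valid.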

banikr · October 25, 2019, 7:08pm

Hi,
Also, this paper says that max-pooling layers can be replaced by strided convolutional layers with the same kernel size.

I tried that, but didn’t see any change in my results.
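
For reference, a small sketch of what that swap changes on paper. It assumes a window of size 2 with stride 2 and no padding for both layers, and a hypothetical width of 64 channels (neither is stated in the thread). Both layers follow the same sliding-window size rule, so the output shape is identical; the difference is that the convolution has learnable weights.

```python
def window_out(n, kernel, stride):
    """Output length for an unpadded, undilated sliding window (pooling or conv)."""
    return (n - kernel) // stride + 1


def conv3d_params(in_channels, out_channels, kernel):
    """Weights plus biases of a cubic-kernel 3D convolution."""
    return in_channels * out_channels * kernel ** 3 + out_channels


shape = (88, 112, 80)  # padded input from the first post
pooled = tuple(window_out(n, kernel=2, stride=2) for n in shape)
strided = tuple(window_out(n, kernel=2, stride=2) for n in shape)

channels = 64  # hypothetical channel count, for illustration only
extra_params = conv3d_params(channels, channels, kernel=2)

print(pooled, strided)  # (44, 56, 40) (44, 56, 40)
print(extra_params)     # 32832
```

Since the shapes match, the rest of the network is unaffected by the swap; only the downsampling step itself becomes trainable.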