Linear projection in residual networks

spring · January 11, 2021, 8:11am

Hi,

If the dimensions of F(X) and X differ in H(X) = F(X) + X, X must be linearly projected before the addition. Is the matrix W_s used for this projection a parameter to be learned, or is it a fixed, predefined matrix?

The linear projection looks like this:
y = F(x, {W_i}) + W_s * x

*Paper: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7780459

InnovArul · January 11, 2021, 9:37am

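Yes, W_s is learned. Whenever the shortcut path has parameters (that is, when a projection is needed), they are trained together with the rest of the network. When the dimensions already match, the shortcut is a plain identity and has no parameters.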

The actual implementation of the shortcut path in torchvision's ResNet is here:

github.com/pytorch/vision/blob/master/torchvision/models/resnet.py#L213-L217

if stride != 1 or self.inplanes != planes * block.expansion:
    downsample = nn.Sequential(
        conv1x1(self.inplanes, planes * block.expansion, stride),
        norm_layer(planes * block.expansion),
    )
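
To make the "trainable" part concrete, here is a minimal sketch of a residual block with a projection shortcut. It is not the torchvision code: the class name `ProjectionBlock`, the layer sizes, and the input shape are my own assumptions for illustration. It mirrors the condition in the snippet above (project when the stride or the channel count changes) and then checks that the 1x1 convolution acting as W_s receives a gradient.

```python
import torch
import torch.nn as nn


class ProjectionBlock(nn.Module):
    """Hypothetical minimal residual block; not the torchvision implementation."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Residual branch F(x, {W_i}): two 3x3 convolutions with batch norm.
        self.residual = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )
        # Shortcut branch: a 1x1 convolution (W_s) plus batch norm when the
        # shapes differ, otherwise a parameter-free identity.
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        # y = F(x, {W_i}) + W_s * x, followed by the usual ReLU.
        return torch.relu(self.residual(x) + self.shortcut(x))


block = ProjectionBlock(64, 128, stride=2)   # channel count and resolution both change
x = torch.randn(2, 64, 8, 8)                 # arbitrary example input
out = block(x)
out.sum().backward()                         # dummy loss, just to produce gradients

w_s = block.shortcut[0].weight               # the 1x1 conv weight plays the role of W_s
print(out.shape)                             # torch.Size([2, 128, 4, 4])
print(w_s.requires_grad)                     # True
print(w_s.grad is not None)                  # True
```

Because the 1x1 convolution is an ordinary `nn.Conv2d`, its weight shows up in `block.parameters()` and is updated by the optimizer like any other layer. With `ProjectionBlock(64, 64)` the shortcut falls back to `nn.Identity()`, which has nothing to learn.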

spring · January 11, 2021, 10:03am

Thanks for your kind response!