Linear projection in residual networks

Hi,

If F(X) and X have different dimensions in H(X) = F(X) + X, X must be linearly projected. Is the matrix W_s used for this projection a parameter to be learned, or is it a predefined (fixed) matrix?

The linear projection looks like this:
y = F(x, {W_i}) + W_s*x

Paper: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7780459

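Yes, W_s is learned. Any parameters in the shortcut path (they exist only when the shortcut is a projection rather than the identity) are trainable and are updated together with the rest of the network.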
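
To make that concrete, here is a minimal PyTorch sketch of a block with a projection shortcut. It is only an illustration, not the reference implementation: the class name `ProjectionBlock`, the layer sizes, and the choice of a 1x1 convolution for W_s are assumptions made for this example.

```python
import torch
import torch.nn as nn


class ProjectionBlock(nn.Module):
    """Hypothetical residual block computing y = F(x, {W_i}) + W_s * x."""

    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        # Residual branch F(x, {W_i}): two 3x3 convolutions.
        self.residual = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # Shortcut W_s (assumed here to be a 1x1 convolution). It is a
        # regular layer, so its weight is registered as a trainable parameter.
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False)

    def forward(self, x):
        # Both branches now have the same shape, so they can be added.
        return torch.relu(self.residual(x) + self.shortcut(x))


block = ProjectionBlock(64, 128)
x = torch.randn(2, 64, 8, 8)
y = block(x)
y.sum().backward()  # gradients flow into the shortcut weight as well

print(y.shape)
print(block.shortcut.weight.requires_grad)
print(block.shortcut.weight.grad is not None)
```

Since `shortcut` is an ordinary `nn.Conv2d`, its weight shows up in `block.parameters()` and any optimizer built from that will update it like every other weight.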

Actual implementation of shortcut path in ResNet is here:

Thanks for your kind response!