Residual connection implementation

I am currently studying “swish net”, a model for audio segmentation.

In that paper, they used strided convolution and a residual net. The following image is from

After passing through a conv layer with stride=2, the output length will be half of the input length.

Here is my question:
how can the output be merged with the input (residual connection) when their array dimensions are mismatched?

G.A is just a gated activation function, so it doesn't affect the output dimension.

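You have to apply a linear transformation to the shortcut (the input branch) so that its shape matches the output. ResNet (which has residual connections) uses linear projections to handle that, for example a 1x1 convolution with the same stride as the main path.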
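
Here is a minimal sketch of that idea in plain Python, so it runs without any framework. It is not taken from the paper: the helper names `conv1d` and `residual_block`, the toy weights, and the choice of a kernel-size-1 convolution with stride 2 on the shortcut are all assumptions made for illustration. The main path uses a strided convolution, and the shortcut uses a linear projection with the same stride, so both branches end up with the same shape and can be added.

```python
def conv1d(x, weight, stride=1, padding=0):
    """Naive 1D convolution. x: [C_in][L], weight: [C_out][C_in][K]."""
    c_in, length = len(x), len(x[0])
    k = len(weight[0][0])
    # Zero-pad both ends of every channel.
    padded = [[0.0] * padding + list(ch) + [0.0] * padding for ch in x]
    out_len = (length + 2 * padding - k) // stride + 1
    out = []
    for w_out in weight:
        row = []
        for t in range(out_len):
            start = t * stride
            row.append(sum(
                w_out[c][j] * padded[c][start + j]
                for c in range(c_in) for j in range(k)
            ))
        out.append(row)
    return out


def residual_block(x, w_main, w_shortcut, stride=2):
    """Strided main path plus a projected (kernel size 1, same stride) shortcut."""
    k = len(w_main[0][0])
    main = conv1d(x, w_main, stride=stride, padding=k // 2)
    # Kernel size 1 with the same stride: a per-time-step linear map that also
    # skips samples, so the shortcut has the same shape as the main path.
    shortcut = conv1d(x, w_shortcut, stride=stride, padding=0)
    return [[m + s for m, s in zip(mr, sr)] for mr, sr in zip(main, shortcut)]


# Hypothetical toy input: 2 channels, 8 time steps.
x = [[float(t) for t in range(8)], [1.0] * 8]
w_main = [[[0.5, 0.5, 0.5]] * 2 for _ in range(3)]  # 3 output channels, kernel 3
w_shortcut = [[[1.0], [0.0]] for _ in range(3)]     # 3 output channels, kernel 1
y = residual_block(x, w_main, w_shortcut)
print(len(y), len(y[0]))  # prints "3 4": 3 channels, length halved from 8 to 4
```

In a framework such as PyTorch, the shortcut would typically be something like `nn.Conv1d(in_channels, out_channels, kernel_size=1, stride=2)`. In practice the weights of both branches are learned; the fixed values above are only there to make the shapes easy to follow.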
