The error is raised by an `nn.Conv2d` layer that was created with `in_channels=300`, while the incoming activation has only 128 channels. To fix it, change the `in_channels` argument of this layer to 128 so that it matches the input.
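Here is a minimal sketch of the mismatch and the fix. The batch size, spatial size, `out_channels`, `kernel_size`, and `padding` are assumptions picked for illustration and are not taken from your model; only the 300 vs. 128 channel counts come from your error.

```python
import torch
import torch.nn as nn

# Hypothetical activation: batch of 4, 128 channels, 32x32 spatial size.
x = torch.randn(4, 128, 32, 32)

# Mismatched layer: expects 300 input channels, but x has only 128.
bad_conv = nn.Conv2d(in_channels=300, out_channels=64, kernel_size=3, padding=1)
try:
    bad_conv(x)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"Shape mismatch: {err}")

# Fix: set in_channels to the channel dimension of the incoming activation.
good_conv = nn.Conv2d(in_channels=x.shape[1], out_channels=64, kernel_size=3, padding=1)
out = good_conv(x)
print(out.shape)  # torch.Size([4, 64, 32, 32])
```

If you are not sure how many channels reach the layer, printing `x.shape` right before the failing call in your `forward` should show it (dimension 1 for the default `[N, C, H, W]` layout).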