Using these layers in the forward pass would work, but model.parameters() will not return their parameters, so an optimizer created with optim.SGD(model.parameters(), lr=1.) will not see them and will never update them.
Likewise, model.to('cuda') won't move these parameters to the desired device.
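Here is a minimal sketch of the difference, assuming the layers in question are stored in a plain Python list (my guess at the setup, it is not stated above). The class names PlainListModel and ModuleListModel are made up for illustration; nn.ModuleList is the standard container that registers the layers with the parent module.

```python
import torch
import torch.nn as nn


class PlainListModel(nn.Module):
    """Hypothetical model: layers kept in a plain Python list (not registered)."""

    def __init__(self):
        super().__init__()
        # A plain list hides the layers from nn.Module's bookkeeping.
        self.layers = [nn.Linear(4, 4) for _ in range(3)]

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


class ModuleListModel(nn.Module):
    """Same model, but the layers are registered through nn.ModuleList."""

    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(4, 4) for _ in range(3)])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


plain = PlainListModel()
registered = ModuleListModel()

x = torch.randn(2, 4)
plain_out = plain(x)            # the forward pass works in both cases
registered_out = registered(x)

# Each nn.Linear contributes one weight and one bias tensor.
n_plain = len(list(plain.parameters()))
n_registered = len(list(registered.parameters()))
print(n_plain, n_registered)    # 0 6
```

With the ModuleList version, model.parameters(), model.to(device) and model.state_dict() all include the layers, so the optimizer and device transfers behave as expected.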