Is the following scenario possible? I have two nets. The first consists only of sparse embedding layers, while the second has an initial sparse embedding layer followed by linear layers. I want the second net's embedding layer to share weights with one of the layers in net 1, but because the subsequent layers in net 2 are linear, I'm assuming the embeddings have to be dense? In any case, when I tried the following, the sparse optimizer for net 1 complained about receiving dense gradients. Is this scenario even possible in PyTorch?
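For reference, here is a minimal sketch of one way this kind of sharing can be set up. It is not the original code. The sizes, class names, and optimizer choices below are hypothetical. It relies on the fact that `nn.Embedding(..., sparse=True)` still returns an ordinary dense tensor from its forward pass; `sparse=True` only makes the gradient of the embedding weight sparse, so linear layers can consume the output directly. The sketch assumes the same `nn.Embedding` module object is reused in both nets, with `torch.optim.SparseAdam` for the embedding weights and a separate dense optimizer for the linear layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
NUM_EMBEDDINGS, DIM = 10, 4

# One embedding module, reused by both nets so the weight is truly shared.
shared = nn.Embedding(NUM_EMBEDDINGS, DIM, sparse=True)


class Net1(nn.Module):
    """Only sparse embedding layers."""

    def __init__(self, shared_emb):
        super().__init__()
        self.emb_a = shared_emb
        self.emb_b = nn.Embedding(NUM_EMBEDDINGS, DIM, sparse=True)

    def forward(self, idx):
        return self.emb_a(idx) + self.emb_b(idx)


class Net2(nn.Module):
    """Shared sparse embedding followed by a linear layer."""

    def __init__(self, shared_emb):
        super().__init__()
        self.emb = shared_emb
        self.fc = nn.Linear(DIM, 1)

    def forward(self, idx):
        # The embedding output is a dense tensor, so nn.Linear accepts it.
        return self.fc(self.emb(idx))


net1, net2 = Net1(shared), Net2(shared)

# Embedding weights (sparse gradients) go to SparseAdam;
# the linear layer (dense gradients) gets its own optimizer.
sparse_opt = torch.optim.SparseAdam(list(net1.parameters()))
dense_opt = torch.optim.SGD(net2.fc.parameters(), lr=0.1)

idx = torch.tensor([1, 3, 3])
loss = net1(idx).sum() + net2(idx).sum()

sparse_opt.zero_grad()
dense_opt.zero_grad()
loss.backward()
sparse_opt.step()
dense_opt.step()
```

If the shared weight is instead attached to a second embedding layer created with `sparse=False`, that layer contributes a dense gradient to the same parameter, which is a likely cause of `SparseAdam` complaining about dense gradients.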