Unused Parameters when using nn.TransformerEncoder

I am using nn.TransformerEncoder inside a DistributedDataParallel (DDP) wrapper and am getting an "unused parameters" error. The encoder is defined as:

self.ego_encoder_layer = nn.TransformerEncoderLayer(d_model=2048, nhead=16, activation='gelu')
self.ego_transformer_encoder = nn.TransformerEncoder(self.ego_encoder_layer, num_layers=6)

The forward pass is:

ego_transformer_features = self.ego_transformer_encoder(ego_sequences, src_key_padding_mask=src_padding_mask)

I used the following code, placed after the backward pass, to print the parameters that never received a gradient:

loss.backward()
print('-- FIND UNUSED PARAMETERS --')
for name, param in model.named_parameters():
    if param.grad is None:  # no gradient after backward() => parameter did not contribute to the loss
        print(name)
print('--' * 30)
optimizer.step()

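The check above works because autograd only populates `.grad` for parameters that actually participated in the loss. A toy model (hypothetical, not the original architecture) with a registered-but-never-called submodule shows the same pattern:

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 4)
        self.unused = nn.Linear(4, 4)  # registered as a submodule but never called

    def forward(self, x):
        return self.used(x)

model = Toy()
loss = model(torch.randn(2, 4)).sum()
loss.backward()

# Only the parameters of the unused submodule have grad is None.
unused = [name for name, param in model.named_parameters() if param.grad is None]
print(unused)  # prints ['unused.weight', 'unused.bias']
```

DDP performs essentially this bookkeeping at every backward pass, which is why a registered submodule that never runs in forward() triggers the error.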
The unused parameters are listed below:

ego_encoder_layer.self_attn.in_proj_weight
ego_encoder_layer.self_attn.in_proj_bias
ego_encoder_layer.self_attn.out_proj.weight
ego_encoder_layer.self_attn.out_proj.bias
ego_encoder_layer.linear1.weight
ego_encoder_layer.linear1.bias
ego_encoder_layer.linear2.weight
ego_encoder_layer.linear2.bias
ego_encoder_layer.norm1.weight
ego_encoder_layer.norm1.bias
ego_encoder_layer.norm2.weight
ego_encoder_layer.norm2.bias

@alband:

self.ego_encoder_layer = nn.TransformerEncoderLayer(d_model=2048, nhead=16, activation='gelu')
self.ego_transformer_encoder = nn.TransformerEncoder(self.ego_encoder_layer, num_layers=6)

Removing the self. prefix from ego_encoder_layer resolved the issue, though I am not sure why:

ego_encoder_layer = nn.TransformerEncoderLayer(d_model=2048, nhead=16, activation='gelu')
self.ego_transformer_encoder = nn.TransformerEncoder(ego_encoder_layer, num_layers=6)
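The likely explanation: nn.TransformerEncoder deep-copies the template layer num_layers times, so the copies inside encoder.layers are the parameters actually used in forward(). Storing the template on self additionally registers it as a submodule whose parameters never receive gradients, which is exactly what DDP flags. A minimal sketch with toy sizes (not the original d_model=2048 model):

```python
import torch
import torch.nn as nn

# nn.TransformerEncoder deep-copies the template layer for each of its layers.
layer = nn.TransformerEncoderLayer(d_model=16, nhead=2, activation='gelu')
encoder = nn.TransformerEncoder(layer, num_layers=2)

# The stacked layers are independent copies, not references to the template.
assert encoder.layers[0] is not layer
assert encoder.layers[0].linear1.weight is not layer.linear1.weight

x = torch.randn(5, 3, 16)  # (seq_len, batch, d_model); batch_first defaults to False
encoder(x).sum().backward()

# Gradients reach the copies but never the template layer.
assert encoder.layers[0].linear1.weight.grad is not None
assert layer.linear1.weight.grad is None
```

With the template kept as a local variable instead of an attribute, it is garbage-collected after __init__ and never appears in model.named_parameters(), so DDP has nothing to complain about.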