I am using nn.TransformerEncoder wrapped in DistributedDataParallel (DDP) and I am getting an “unused parameters” error.
self.ego_encoder_layer = nn.TransformerEncoderLayer(d_model=2048, nhead=16, activation='gelu')
self.ego_transformer_encoder = nn.TransformerEncoder(self.ego_encoder_layer, num_layers=6)
I do the forward pass with:
ego_transformer_features = self.ego_transformer_encoder(ego_sequences, src_key_padding_mask=src_padding_mask)
I used the following code to print the unused parameters:
loss.backward()
print('-- FIND UNUSED PARAMETERS --')
for name, param in model.named_parameters():
    if param.grad is None:
        print(name)
print('--' * 30)
optimizer.step()
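For context, here is a minimal self-contained sketch that reproduces the same check with toy sizes (the `ToyModel` name and dimensions are made up for illustration). Note that `nn.TransformerEncoder` deep-copies the layer passed to it, so a layer that is also stored as a submodule attribute keeps its own separate parameters:

```python
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Storing the layer as an attribute registers its parameters,
        # but nn.TransformerEncoder deep-copies it internally, so these
        # particular parameters are never used in the forward pass.
        self.ego_encoder_layer = nn.TransformerEncoderLayer(
            d_model=16, nhead=4, activation='gelu')
        self.ego_transformer_encoder = nn.TransformerEncoder(
            self.ego_encoder_layer, num_layers=2)

    def forward(self, x):
        return self.ego_transformer_encoder(x)

model = ToyModel()
out = model(torch.randn(5, 3, 16))  # (seq_len, batch, d_model)
out.sum().backward()

# Same diagnostic as above: parameters with no gradient after backward.
unused = [n for n, p in model.named_parameters() if p.grad is None]
print(unused)  # the ego_encoder_layer.* parameters show up here
```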
The unused parameters are listed below:
ego_encoder_layer.self_attn.in_proj_weight
ego_encoder_layer.self_attn.in_proj_bias
ego_encoder_layer.self_attn.out_proj.weight
ego_encoder_layer.self_attn.out_proj.bias
ego_encoder_layer.linear1.weight
ego_encoder_layer.linear1.bias
ego_encoder_layer.linear2.weight
ego_encoder_layer.linear2.bias
ego_encoder_layer.norm1.weight
ego_encoder_layer.norm1.bias
ego_encoder_layer.norm2.weight
ego_encoder_layer.norm2.bias