Pytorch model.named_modules() skips some layers

I am fusing layers for quantization.

This is the part of my model that I am going to fuse. My method is to use named_modules() to go through each submodule and check whether it is a Conv2d, BatchNorm, or ReLU.

(scratch): Module(
    (layer1_rn): Conv2d(24, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (layer2_rn): Conv2d(40, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (layer3_rn): Conv2d(112, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (layer4_rn): Conv2d(320, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (activation): ReLU()
    (refinenet4): FeatureFusionBlock_custom(
      (out_conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
      (resConfUnit1): ResidualConvUnit_custom(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (activation): ReLU()
        (skip_add): FloatFunctional(
          (activation_post_process): Identity()
        )
      )
      (resConfUnit2): ResidualConvUnit_custom(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (activation): ReLU()
        (skip_add): FloatFunctional(
          (activation_post_process): Identity()
        )
      )
      (skip_add): FloatFunctional(
        (activation_post_process): Identity()
      )
    )
for name, module in m.named_modules():
    print(name, module)

I found that it missed scratch.refinenet4.resConfUnit1.activation. The same thing happens with the activation in resConfUnit2.

Is this a bug?

scratch.layer1_rn Conv2d(24, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
scratch.layer2_rn Conv2d(40, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
scratch.layer3_rn Conv2d(112, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
scratch.layer4_rn Conv2d(320, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
scratch.activation ReLU()
scratch.refinenet4 FeatureFusionBlock_custom(
  (out_conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
  (resConfUnit1): ResidualConvUnit_custom(
    (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (activation): ReLU()
    (skip_add): FloatFunctional(
      (activation_post_process): Identity()
    )
  )
  (resConfUnit2): ResidualConvUnit_custom(
    (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (activation): ReLU()
    (skip_add): FloatFunctional(
      (activation_post_process): Identity()
    )
  )
  (skip_add): FloatFunctional(
    (activation_post_process): Identity()
  )
)
scratch.refinenet4.out_conv Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
scratch.refinenet4.resConfUnit1 ResidualConvUnit_custom(
  (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (activation): ReLU()
  (skip_add): FloatFunctional(
    (activation_post_process): Identity()
  )
)
scratch.refinenet4.resConfUnit1.conv1 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
scratch.refinenet4.resConfUnit1.conv2 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
scratch.refinenet4.resConfUnit1.skip_add FloatFunctional(
  (activation_post_process): Identity()
)
scratch.refinenet4.resConfUnit1.skip_add.activation_post_process Identity()
scratch.refinenet4.resConfUnit2 ResidualConvUnit_custom(
  (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (activation): ReLU()
  (skip_add): FloatFunctional(
    (activation_post_process): Identity()
  )
)
scratch.refinenet4.resConfUnit2.conv1 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
scratch.refinenet4.resConfUnit2.conv2 Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
scratch.refinenet4.resConfUnit2.skip_add FloatFunctional(
  (activation_post_process): Identity()
)
scratch.refinenet4.resConfUnit2.skip_add.activation_post_process Identity()
scratch.refinenet4.skip_add FloatFunctional(
  (activation_post_process): Identity()
)
scratch.refinenet4.skip_add.activation_post_process Identity()
scratch.refinenet3 FeatureFusionBlock_custom(
  (out_conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1))
  (resConfUnit1): ResidualConvUnit_custom(
    (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (activation): ReLU()
    (skip_add): FloatFunctional(
      (activation_post_process): Identity()
    )
  )
  (resConfUnit2): ResidualConvUnit_custom(
    (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (activation): ReLU()
    (skip_add): FloatFunctional(
      (activation_post_process): Identity()
    )
  )
  (skip_add): FloatFunctional(
    (activation_post_process): Identity()
  )
)
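For what it's worth, this skipping behavior is reproducible whenever the same module instance is registered under two different names: named_modules() keeps a memo of visited objects and yields each module object only once by default. A minimal sketch (the model and attribute names below are hypothetical, not your actual model), assuming a recent PyTorch where named_modules() accepts remove_duplicate:

```python
import torch.nn as nn

# Hypothetical setup: one ReLU instance is shared between a parent
# module and a child block, mirroring the shared "activation" above.
shared_relu = nn.ReLU()

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(4, 4, 3, padding=1)
        self.activation = shared_relu  # same object as the parent's

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.activation = shared_relu
        self.block = Block()

m = Model()

# Default traversal: each module *object* is yielded once, so the
# shared ReLU only appears under its first name ('activation') and
# 'block.activation' is absent from the listing.
print([name for name, _ in m.named_modules()])

# With remove_duplicate=False, every registered name is yielded,
# including repeats of the same object.
print([name for name, _ in m.named_modules(remove_duplicate=False)])
```

If this matches your situation, named_modules() is not dropping the layer; it is deduplicating a ReLU instance that is shared across submodules.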

Can you post the code that does the fusion? It is virtually impossible to tell what is going wrong by looking at the model alone.