Breakout Sequential

I have built a Darknet model using a lot of nn.Sequential modules. I chose to do so because I thought that putting the Sequential parts together in my nn.Module initializer would make the forward pass faster than iterating over an nn.ModuleList in the forward method.

This works fine, and I have checked the output shapes a dozen times. My problem now is that, to use this model for a yolov3 implementation, I need to access the residual blocks that are repeated 8 times. Identifying these blocks would not be an issue with self.backbone.modules().

This generates two problems:

  1. Using self.backbone.modules() to iterate over them in the forward method kind of defeats my idea.
  2. For some reason,
for layer in self.backbone.modules():
    x = layer(x)

results in an error, and it is really difficult to tell at which position in the network the error occurs (it is a shape problem). While

x = self.backbone(x)

works perfectly fine.
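A likely cause of the shape error: modules() walks the whole module tree recursively and yields the container itself first, followed by every nested descendant, so the loop applies the full backbone and then its layers again. children() yields only the direct submodules. Here is a minimal sketch of the difference; the tiny backbone below is a made-up stand-in, not the actual Darknet model.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the backbone: a Sequential with a nested Sequential.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.Sequential(
        nn.Conv2d(8, 16, kernel_size=3, padding=1),
        nn.ReLU(),
    ),
)

# modules() yields the container itself, then every descendant, recursively.
module_types = [type(m).__name__ for m in backbone.modules()]
# children() yields only the direct submodules, in registration order.
child_types = [type(m).__name__ for m in backbone.children()]
print(module_types)
print(child_types)

x = torch.randn(1, 3, 32, 32)

# Looping over modules() first runs the whole backbone, then feeds its
# 16-channel output into the first conv, which expects 3 channels.
modules_loop_failed = False
try:
    y = x
    for layer in backbone.modules():
        y = layer(y)
except RuntimeError:
    modules_loop_failed = True

# Looping over children() reproduces backbone(x).
out = x
for layer in backbone.children():
    out = layer(out)
loop_matches = torch.equal(out, backbone(x))
print(modules_loop_failed, loop_matches)
```

So if an explicit loop is wanted, iterating over children() (or over the Sequential directly) is the variant that matches x = self.backbone(x).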

What is best practice to solve both issues?

Some Ideas I have:

  1. Instead of running the backbone all at once, I could run it in three steps and return three different outputs (at the end and after both 8-time repeats).
  2. Naming the layers/outputs somehow and accessing them by name (I have no clue how to achieve this, but it sounds like the most reasonable approach).
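For the second idea, one possible approach is to build the Sequential from an OrderedDict so that each part has a name, and to register forward hooks on the named parts whose outputs are needed. This is only a sketch: the layer names (stem, res8_a, res8_b, tail) and the single conv layers standing in for the residual stages are hypothetical.

```python
from collections import OrderedDict

import torch
import torch.nn as nn

# Hypothetical, heavily shortened backbone; each conv stands in for a full stage.
backbone = nn.Sequential(OrderedDict([
    ("stem", nn.Conv2d(3, 8, kernel_size=3, padding=1)),
    ("res8_a", nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1)),
    ("res8_b", nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1)),
    ("tail", nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1)),
]))

features = {}

def save_output(name):
    # Returns a forward hook that stores the module's output under `name`.
    def hook(module, inputs, output):
        features[name] = output
    return hook

# Named entries of a Sequential are reachable as attributes.
handles = [
    getattr(backbone, name).register_forward_hook(save_output(name))
    for name in ("res8_a", "res8_b")
]

x = torch.randn(1, 3, 64, 64)
out = backbone(x)  # still a single call; the hooks fill `features`
shapes = {name: tuple(t.shape) for name, t in features.items()}
print(shapes, tuple(out.shape))

# Remove the hooks once they are no longer needed.
for handle in handles:
    handle.remove()
```

The backbone is still called in one go, and the intermediate tensors end up in the features dict as a side effect of the forward pass.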

I don’t think nn.Sequential gives you a speed benefit, since it also uses a for loop internally, as seen here.
If your submodule workflow is easier I would go for it.
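A submodule workflow along the lines of the first idea could look roughly like the sketch below: the backbone is split into three Sequential stages and forward returns all three feature maps. The channel counts, the ResidualBlock layout and the stage names are assumptions chosen for brevity, not the real Darknet-53 configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Simplified residual block (assumed layout): 1x1 conv, 3x3 conv, skip connection.
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=1),
            nn.LeakyReLU(0.1),
            nn.Conv2d(channels // 2, channels, kernel_size=3, padding=1),
            nn.LeakyReLU(0.1),
        )

    def forward(self, x):
        return x + self.body(x)

class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        # Each stage ends where an intermediate output is needed.
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            *[ResidualBlock(16) for _ in range(8)],
        )
        self.stage2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            *[ResidualBlock(32) for _ in range(8)],
        )
        self.stage3 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
        )

    def forward(self, x):
        c1 = self.stage1(x)   # after the first 8-time repeat
        c2 = self.stage2(c1)  # after the second 8-time repeat
        c3 = self.stage3(c2)  # final output
        return c1, c2, c3

model = Backbone()
c1, c2, c3 = model(torch.randn(1, 3, 64, 64))
print(tuple(c1.shape), tuple(c2.shape), tuple(c3.shape))
```

Each stage remains a plain nn.Sequential, so the forward method stays short while exposing the feature maps a detection head would need.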

Well, that might change my approach. What about code optimized for CUDA processing? Does Sequential perform better under any circumstances?

No, I doubt you would see any difference between using nn.Sequential and e.g. calling the modules line-by-line manually.
If you are concerned about the launch overhead of each CUDA kernel, check CUDA graphs, but note their limitations.
