Breakout Sequential

MaKaNu · March 31, 2022, 3:38pm

I have built a Darknet model using a lot of nn.Sequential modules. I chose to do so because I was thinking that putting the Sequential parts together in my nn.Module initializer would be faster later on than iterating over an nn.ModuleList in the forward method.

This works fine, and I have tested the output shapes a dozen times. My problem now is that if I want to use this model for a YOLOv3 implementation, I need to access the Residual Blocks that are repeated 8 times. Identifying these blocks would not be an issue with self.backbone.modules().

This generates two problems:

Using self.backbone.modules() to iterate over them in the forward method kind of defeats my idea.
For some reason

for layer in self.backbone.modules():
    x = layer(x)

results in an error, and it is very difficult to tell at which position in the network the error occurs (it is a shape problem). While

x = self.backbone(x)

works perfectly fine.

What is the best practice to solve both issues?

Some ideas I have:

Instead of running the backbone all at once, I could run it in three steps and return three different outputs (at the end and after both 8-times-repeated blocks).
Naming the layers/outputs and accessing them by name (I have no clue how to achieve this, but it sounds like the most reasonable approach).

ptrblck · April 1, 2022, 5:59am

I don’t think nn.Sequential is faster, since it also uses a for loop internally, as seen here.
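A minimal sketch of that point, using a toy stand-in for the backbone (the layer sizes and the names `block` and `backbone` are made up for illustration, not taken from the model in question). Iterating over the container's direct children reproduces what `backbone(x)` does. `modules()`, on the other hand, walks the tree recursively and also yields the container itself and every nested layer, so calling each yielded module in turn applies layers more than once, which would be consistent with the shape error described above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy backbone: one nested Sequential plus one more conv layer.
block = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
backbone = nn.Sequential(block, nn.Conv2d(8, 16, 3, stride=2, padding=1))
backbone.eval()

x = torch.randn(1, 3, 32, 32)

with torch.no_grad():
    reference = backbone(x)

    # Looping over the direct children does the same work as backbone(x).
    out = x
    for layer in backbone:  # equivalent to backbone.children()
        out = layer(out)
    same = torch.allclose(out, reference)

    # modules() yields the container first, then every nested submodule,
    # so chaining the calls feeds already-processed tensors into early layers.
    failed = False
    try:
        out = x
        for layer in backbone.modules():
            out = layer(out)
    except RuntimeError:
        failed = True

module_names = [type(m).__name__ for m in backbone.modules()]
child_names = [type(m).__name__ for m in backbone.children()]
print(module_names)
print(child_names)
```

`named_modules()` and `named_children()` yield the same modules together with their names, which can help when locating a specific block.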
If your submodule workflow is easier I would go for it.
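One way the submodule approach could look, as a rough sketch only: the backbone is split into three nn.Sequential stages and the forward method returns all three feature maps. The class name `StagedBackbone`, the helper `conv_block`, and the channel sizes are hypothetical placeholders, not the real Darknet-53 layout.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, stride=1):
    # Hypothetical helper: conv + batch norm + leaky ReLU.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1),
    )


class StagedBackbone(nn.Module):
    """Toy backbone split into three stages (placeholder layer sizes)."""

    def __init__(self):
        super().__init__()
        # Each stage stays an nn.Sequential, so it is still called in one go.
        self.stage1 = nn.Sequential(
            conv_block(3, 16, stride=2),
            conv_block(16, 32, stride=2),
            conv_block(32, 64, stride=2),
        )
        self.stage2 = conv_block(64, 128, stride=2)
        self.stage3 = conv_block(128, 256, stride=2)

    def forward(self, x):
        # Keep the intermediate outputs instead of discarding them.
        route1 = self.stage1(x)
        route2 = self.stage2(route1)
        out = self.stage3(route2)
        return route1, route2, out


model = StagedBackbone().eval()
with torch.no_grad():
    route1, route2, out = model(torch.randn(1, 3, 64, 64))

shapes = [tuple(t.shape) for t in (route1, route2, out)]
print(shapes)
```

The stage boundaries would sit after each of the two 8-times-repeated residual groups and at the end of the network, so a detection head can consume all three outputs.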

MaKaNu · April 1, 2022, 7:55am

Well, that might change my approach. What about code optimized for CUDA processing? Does Sequential perform better under any circumstances?

ptrblck · April 1, 2022, 8:56am

No, I doubt you would see any difference between using nn.Sequential and e.g. calling the modules line-by-line manually.
If you are concerned about the launch overhead for each CUDA kernel, check CUDA graphs, but note the limitations of using them.
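A rough sketch of what capturing an inference pass in a CUDA graph can look like, assuming a toy model with a fixed input shape (the model and the helper name `run` are placeholders). It falls back to a plain forward pass when no GPU is available. Among the limitations: the captured graph replays on fixed memory addresses, so new data has to be copied into the static input tensor, and input shapes cannot change between replays.

```python
import torch
import torch.nn as nn

# Hypothetical toy model; any module with static shapes could stand in here.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU()).eval()
example = torch.randn(1, 3, 32, 32)

if torch.cuda.is_available():
    model = model.cuda()
    static_input = example.cuda()

    # Warm up on a side stream before capturing.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side), torch.no_grad():
        for _ in range(3):
            model(static_input)
    torch.cuda.current_stream().wait_stream(side)

    # Capture one forward pass; static_output is overwritten on every replay.
    graph = torch.cuda.CUDAGraph()
    with torch.no_grad(), torch.cuda.graph(graph):
        static_output = model(static_input)

    def run(x):
        # Copy new data into the captured input buffer, then replay the kernels.
        static_input.copy_(x)
        graph.replay()
        return static_output.clone()
else:
    def run(x):
        # CPU fallback: ordinary eager execution.
        with torch.no_grad():
            return model(x)

out = run(example)
print(tuple(out.shape))
```

Whether this helps depends on how much of the runtime is launch overhead; for large convolutions the kernels themselves usually dominate.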