Reproducing the same network architecture (with dropout active) over multiple steps: zeroing the same elements each time with dropout

  1. Either use the proposed approach from your cross post or initialize the layers first and use a switch in the custom forward method.

  2. I would write a custom dropout layer that accepts an additional flag controlling whether the mask is resampled.
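
A minimal sketch of the second suggestion, assuming the goal is to reuse one dropout mask across several forward passes until a new one is explicitly requested. The class name `ReusableDropout` and the `resample` argument are hypothetical names chosen for illustration, not part of `torch.nn`:

```python
import torch
import torch.nn as nn


class ReusableDropout(nn.Module):
    """Dropout that caches its mask and redraws it only on request (hypothetical helper)."""

    def __init__(self, p=0.5):
        super().__init__()
        if not 0.0 <= p < 1.0:
            raise ValueError("p must be in [0, 1)")
        self.p = p
        self.mask = None  # cached mask, created on the first training-mode call

    def forward(self, x, resample=False):
        # Like nn.Dropout, do nothing in eval mode.
        if not self.training:
            return x
        # Draw a new mask if asked to, if none exists yet, or if the input shape changed.
        if resample or self.mask is None or self.mask.shape != x.shape:
            keep_prob = torch.full_like(x, 1.0 - self.p)
            # Scale kept elements by 1 / (1 - p) (inverted dropout).
            self.mask = torch.bernoulli(keep_prob) / (1.0 - self.p)
        return x * self.mask


torch.manual_seed(0)
drop = ReusableDropout(p=0.5)
drop.train()

x = torch.ones(4, 8)
first = drop(x, resample=True)  # draws a fresh mask
second = drop(x)                # reuses the cached mask, so the same elements are zeroed
```

In a custom `forward` method you could pass `resample=True` on the first step only and leave the default on later steps. Note that the cached mask is tied to the input shape, so this sketch redraws it whenever the shape (for example the batch size) changes.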
