Below is the high-level code. Would there be any issues from reusing variable names the way I do below?
NOTE: Jump below; the issue got cleared, and I now need some ideas to avoid wasting GPU memory.
The first pass is some normal convolutions:

```python
# pass 1 | conv
main = self.conv1(x)
main = self.bn1(main)
main = self.relu(main)
```
In the second pass, we form two branches and add their outputs at the end:
```python
# pass 2 | bL-module
little = main
main = self.conv2(main)
main = self.bn2(main)
main = self.relu(main)
little = self.littleblock(little)
main += little
```
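As a hedged, self-contained sketch (layer names and channel counts are illustrative assumptions, not the author's actual configuration), the second pass could be packaged as a module. Note the sketch uses an out-of-place add rather than `+=`, since an in-place add on the saved ReLU output can trip autograd's in-place version checks:

```python
import torch
import torch.nn as nn

class BLModule(nn.Module):
    """Illustrative sketch of pass 2: a big branch and a little branch,
    merged by addition. Sizes are assumptions for demonstration only."""

    def __init__(self, planes=16):
        super().__init__()
        self.conv2 = nn.Conv2d(planes, planes, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU()
        self.littleblock = nn.Conv2d(planes, planes, 1, bias=False)

    def forward(self, main):
        little = main                 # alias: both names refer to the same tensor
        main = self.conv2(main)
        main = self.bn2(main)
        main = self.relu(main)
        little = self.littleblock(little)
        return main + little          # out-of-place add; graph stays intact
```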
In the third pass, we again form two branches, pass the input through both, and the outputs are added inside a transition:
```python
# pass 3 | `ResBlockB`s & `ResBlockL`s
planes = 64
little = main
main = self.big_layer1(main)
little = self.little_layer1(little)
main = self.transition1(main, little)

def transition1(x1, x2):
    assert x1.shape == x2.shape
    out = x1 + x2  # merge via add
    out = conv(out)
    return out
```
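For concreteness, the transition step above could be written as a proper `nn.Module` so the conv has registered parameters; the class name and channel count here are assumptions, not the author's code:

```python
import torch
import torch.nn as nn

class Transition(nn.Module):
    """Illustrative version of transition1: merge two same-shaped branch
    outputs by addition, then apply a convolution."""

    def __init__(self, planes):
        super().__init__()
        self.conv = nn.Conv2d(planes, planes, kernel_size=1, bias=False)

    def forward(self, x1, x2):
        assert x1.shape == x2.shape
        out = x1 + x2          # merge via add
        return self.conv(out)  # then conv, as in transition1
```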
In the fourth pass, the output from the last module is again copied into two branches and finally merged with a transition:
```python
# pass 4
planes = 128
little = main
main = self.big_layer2(main)
little = self.little_layer2(little)
main = self.transition2(main, little)
```
Would there be issues with backpropagation when I try to optimize this network? AFAIK there should not be, but I wanted to confirm: by reusing the `main` and `little` variable names, would I be overwriting things (such as parts of the computation graph required to calculate the grads) that should not be overwritten, causing unwanted side effects?
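For what it's worth, here is a minimal check (not the author's network, just toy tensors) showing that rebinding a Python name does not discard the graph node it used to point to; autograd holds its own references to the tensors it needs for backward:

```python
import torch

x = torch.randn(4, requires_grad=True)

main = x * 2          # node A
little = main         # `little` is just another name for node A
main = main * 3       # rebind `main`; node A is still held by the graph
main = main + little  # merge the two branches, as in the bL-module

main.sum().backward()

# d/dx of (2x * 3 + 2x) = 8
print(x.grad)  # tensor([8., 8., 8., 8.])
```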
I am sorry if this is a basic question; I just wanted some sort of confirmation that there is no issue here.