Below is the high-level code. Would there be any issues when I reuse variable names like this?
NOTE: Jump below. The original issue got cleared; now I need some ideas to avoid wasting GPU memory.
Variables: `main` and `little`.
The first pass is some normal convolutions:
```python
# Conv
main = self.conv1(x)
main = self.bn1(main)
main = self.relu(main)
```
In the second pass, we form two branches and add them at the end of both branches:
```python
# pass 2 | bL-module
little = main
main = self.conv2(main)
main = self.bn2(main)
main = self.relu(main)
little = self.littleblock(little)
main += little
```
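(For context, `littleblock` is defined elsewhere in the module; it is a cheap branch whose output shape matches `main` so the add is valid. Something along these lines, where the channel counts and layer choices are placeholders, not my exact code:)

```python
import torch.nn as nn

# Placeholder little branch: run at reduced resolution, then upsample
# back so the output shape matches `main` for the elementwise add.
littleblock = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
)
```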
In the third pass, we again form two branches, pass the inputs through them, and the outputs are again added inside `transition1`:
```python
# pass 3 | ResBlockBs & ResBlockLs, planes = 64
little = main
main = self.big_layer1(main)
little = self.little_layer1(little)
main = self.transition1(main, little)
```
where `transition1` does roughly:

```python
def transition1(self, x1, x2):
    assert x1.shape == x2.shape
    out = x1 + x2  # merge via add
    out = self.conv(out)  # whatever conv the transition applies
    return out
```
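For reference, here is a self-contained sketch of the kind of transition module I mean (the class name, channel count, and the 1x1 conv are placeholders):

```python
import torch.nn as nn

class Transition(nn.Module):
    # Merge two same-shape branches by addition, then mix with a 1x1 conv.
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x1, x2):
        assert x1.shape == x2.shape
        out = x1 + x2  # out-of-place add: creates a new tensor
        return self.relu(self.bn(self.conv(out)))
```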
In the fourth pass, the output from the last module is again passed to two branches and finally merged with `transition2`:
```python
# pass 4 | planes = 128
little = main
main = self.big_layer2(main)
little = self.little_layer2(little)
main = self.transition2(main, little)
```
Would there be issues with backpropagation when I try to optimize this network? AFAIK there should not be, but I wanted to confirm: by reusing the `main` and `little` variables, would I be overwriting things (like parts of the computation graph that are required to calculate the grads) that should not be overwritten, causing unwanted side effects?
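To make the question concrete, here is a minimal sketch with plain tensors that mirrors the name-reuse pattern above (the ops are arbitrary, just enough to build a small graph):

```python
import torch

x = torch.randn(2, 3, requires_grad=True)
main = x * 2            # graph node A
little = main           # second name pointing at node A
main = main.relu()      # rebinding `main` only drops my handle to A;
                        # the autograd graph keeps its own reference
main = main + little    # both branches feed the sum (out-of-place add)
main.sum().backward()   # backprop walks the recorded graph
print(x.grad)           # grads come out, so nothing was overwritten
```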
Sorry if this is a basic question; I just wanted some confirmation that there is no issue here.