I was trying to implement a simple residual block, and I was looking for other people’s codes.
One thing that really caught my eye is that the residual block while in theory being x = f(x) + x, is implemented usually like this:
def forward(self, x): identity = x x = f(x) + identity return x
I was wondering why most are not writing this with the simpler form:
def forward(self, x): x += f(x) return x
The question might sound trivial or superficial, but I suspect maybe there is a deeper reason that just semantics at work here.