Anything that is computed in a differentiable way and that contributes to the loss will also contribute to the computed gradients. So yes, both will participate in the gradients of the parameters of net.
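To illustrate (a minimal sketch with hypothetical names — a small `net` plus two loss terms, `loss_main` and `loss_reg`, standing in for the two quantities discussed above):

```python
import torch
import torch.nn as nn

net = nn.Linear(4, 2)
x = torch.randn(8, 4)
target = torch.randn(8, 2)

out = net(x)
loss_main = nn.functional.mse_loss(out, target)
loss_reg = out.pow(2).mean()  # e.g. an activation penalty

# Both terms are computed differentiably from `out`, so both
# contribute to the gradients of net's parameters.
total = loss_main + 0.1 * loss_reg
total.backward()

print(net.weight.grad is not None)  # True: the grad reflects both terms
```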
@albanD then can we split a large batch into N small batches and accumulate the results over N forward passes to increase our effective batch size? I haven’t seen anyone increase the batch size this way; instead, they accumulate the back-propagated gradients.
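For reference, here is a minimal sketch of the gradient-accumulation pattern mentioned above, i.e. calling `backward()` per small batch so the `.grad` fields accumulate, and stepping the optimizer once every N batches. The names `net`, `criterion`, and `loader` are placeholders, not from the original post:

```python
import torch
import torch.nn as nn

net = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
# Placeholder data loader yielding small batches
loader = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(8)]

N = 4  # number of small batches per effective large batch
optimizer.zero_grad()
for i, (x, y) in enumerate(loader):
    # Scale each loss by 1/N so the accumulated gradient matches
    # the mean over the full effective batch.
    loss = criterion(net(x), y) / N
    loss.backward()  # .grad fields accumulate across backward() calls
    if (i + 1) % N == 0:
        optimizer.step()       # one update per N small batches
        optimizer.zero_grad()
```

Accumulating the summed loss and calling `backward()` once would give the same gradients mathematically, but it forces autograd to keep all N computation graphs alive until the single backward pass, so it costs roughly N times the activation memory. Per-batch `backward()` frees each graph immediately, which is why gradient accumulation is the usual approach.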