How can I tell whether the backward pass of a model is multi-threaded?

I recently implemented a model with two branches. Although the computational graphs of the two branches have no connections, they both modify a shared GPU buffer during backward. The losses of the two branches are summed, so they backward together. The non-deterministic backward results made me realize that the backward pass over the computation graph is not single-threaded but multi-threaded.

Multi-threaded backward driven by the autograd graph is clever, but how can I tell whether the gradient computation of certain parts of a model runs in multiple threads? Are there clear guidelines or rules? Under what conditions will two disconnected parts of the graph run their backward in separate threads?

a = conv1(x)
b = conv2(x)
c = a + b
c.sum().backward()

Is it guaranteed that the gradients of a and b will be computed in separate threads?
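One way I tried to probe this: record which thread executes the backward of each branch by attaching `register_hook` to the branch outputs and logging `threading.get_ident()` inside the hook. This is just a diagnostic sketch, not an official API for inspecting the engine; the module shapes and the `seen` list below are illustrative, and the CPU example here may behave differently from the GPU case I described.

```python
import threading
import torch
import torch.nn as nn

# Two independent branches, as in the snippet above (illustrative shapes).
conv1 = nn.Conv2d(3, 4, 3)
conv2 = nn.Conv2d(3, 4, 3)
x = torch.randn(1, 3, 8, 8)

seen = []  # (branch name, thread id) pairs, filled during backward

a = conv1(x)
b = conv2(x)
# Hooks fire when the gradient w.r.t. a / b is computed; returning None
# leaves the gradient unchanged.
a.register_hook(lambda g: seen.append(("a", threading.get_ident())))
b.register_hook(lambda g: seen.append(("b", threading.get_ident())))

(a + b).sum().backward()
print(seen)
```

Comparing the two recorded thread ids (and whether they differ from the main thread's id) at least shows whether the two branches' gradients were computed on the same engine thread in a given run.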