A 20 * 100 Variable a_1 is expanded to a 40 * 100 Variable a_2: what operations does PyTorch perform on the gradient when backpropagating from a_2 to a_1?

Suppose a 20 * 100 Variable a_1 is expanded to a 40 * 100 Variable a_2. What does the gradient look like when backpropagating from a_2 to a_1?
I am confused about this. Since I can backpropagate from a_2 to a_1, what operation does PyTorch perform to convert the 40 * 100 gradient into a 20 * 100 gradient?

What do you use to do this expand? I ask because the expand and expand_as functions can only expand dimensions that are originally of size 1, so they cannot turn a dimension of size 20 into one of size 40. When a size-1 dimension is expanded, the backward pass sums the gradients along each expanded dimension back onto the corresponding size-1 dimension of the original Variable.
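To make the summation concrete, here is a minimal sketch, assuming a recent PyTorch version where plain tensors with requires_grad=True take the place of Variable. The first part expands a 1 * 100 tensor to 40 * 100 and checks that the gradient reaching a_1 equals the upstream gradient summed over the expanded dimension. The second part is only a guess at what the question may have meant: it uses repeat (not expand) to tile a 20 * 100 tensor to 40 * 100, in which case the gradients of the two copies are added together. The names b_1, b_2, upstream and upstream_b are illustrative.

```python
import torch

# Expand a size-1 dimension: (1, 100) -> (40, 100). expand() copies no data.
a_1 = torch.randn(1, 100, requires_grad=True)
a_2 = a_1.expand(40, 100)

# Use a non-uniform upstream gradient so the summation is visible.
upstream = torch.randn(40, 100)
a_2.backward(upstream)

# The gradient on a_1 is the upstream gradient summed over the expanded dimension.
expected = upstream.sum(dim=0, keepdim=True)
grads_match = torch.allclose(a_1.grad, expected)
print(a_1.grad.shape, grads_match)

# Assumption: the 20 * 100 -> 40 * 100 case was produced by tiling with repeat(),
# since expand() cannot grow a dimension of size 20.
b_1 = torch.randn(20, 100, requires_grad=True)
b_2 = b_1.repeat(2, 1)  # rows 0..19 followed by rows 0..19 again
upstream_b = torch.randn(40, 100)
b_2.backward(upstream_b)

# Each original row receives the sum of the gradients of its two copies.
expected_b = upstream_b[:20] + upstream_b[20:]
repeat_match = torch.allclose(b_1.grad, expected_b)
print(b_1.grad.shape, repeat_match)
```

In both cases the rule is the same: every element of the original tensor receives the sum of the gradients of all the positions it was copied to.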