How to combine intermediate results across GPUs mid-forward when using PyTorch for multi-GPU parallel training

I need to immediately call backward() on an intermediate loss in the middle of the forward pass so I can adjust parameters right away, but with multiple GPUs the batch is split across devices, so I cannot compute that loss and call backward() on it from inside the parallelized forward.
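
For concreteness, here is a minimal sketch of the kind of thing I am trying to do; the model, shapes, and losses below are just placeholders. On a single GPU I can compute and backpropagate the intermediate loss directly inside forward(), but once the model is wrapped in nn.DataParallel each replica only sees its own shard of the batch, so the intermediate result is never combined and the immediate backward no longer works the way it does on one device.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Linear(32, 16)  # produces the intermediate result
        self.stage2 = nn.Linear(16, 4)   # produces the final output

    def forward(self, x, target_mid):
        mid = torch.relu(self.stage1(x))
        # Single-GPU version: compute the intermediate loss and backward it
        # immediately, keeping the graph alive for the final loss later.
        loss_mid = nn.functional.mse_loss(mid, target_mid)
        loss_mid.backward(retain_graph=True)
        return self.stage2(mid)

# Works as intended on one GPU:
net = Net().cuda()
x = torch.randn(8, 32).cuda()
target_mid = torch.randn(8, 16).cuda()
out = net(x, target_mid)  # intermediate loss already backpropagated here

# Problem case: with nn.DataParallel the batch is scattered across GPUs,
# so inside forward() each replica only has a partial `mid`, and the
# intermediate loss/backward cannot be done on the full batch there.
parallel_net = nn.DataParallel(Net()).cuda()
```

How can I gather (combine) the intermediate results from all GPUs so that I can compute this loss and call backward() on it before the rest of the forward pass continues?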