PyTorch: calculating loss.backward() when dividing a tensor into subgroups fed to different networks

I am new to PyTorch. I want to predict keypoint X and Y coordinates, and I am thinking of dividing the keypoints into subgroups and feeding the keypoints that belong to the same group into the same network. For example, I have a tensor x with shape [bs, 14, 128] (batch size, number of keypoints, feature size). I use tensor slicing to divide it into 3 groups:

group_one = x[:, :4, :]

group_two = x[:, 4:9, :]

group_three = x[:, 9:14, :]

and then feed those 3 subgroups into different networks and concatenate the outputs back into [bs, 14, 128]. Can the backward pass of the loss be computed as usual, and will the parameters be updated correctly?

Yes. Slicing produces views, and views are differentiable, so the gradients computed for each view flow back to the base tensor. Concatenating the outputs with torch.cat is likewise differentiable, so a single loss.backward() reaches the parameters of all three networks.
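
Here is a minimal sketch you can run to convince yourself; the three per-group networks are stand-in nn.Linear layers (your actual architecture will differ), and the MSE loss against a random target is just for illustration:

import torch
import torch.nn as nn

# Stand-ins for the real per-group networks (assumed 128 -> 128 features)
net_one = nn.Linear(128, 128)
net_two = nn.Linear(128, 128)
net_three = nn.Linear(128, 128)

bs = 2
x = torch.randn(bs, 14, 128, requires_grad=True)

# The slices are views of x; each subgroup goes through its own network
out_one = net_one(x[:, :4, :])
out_two = net_two(x[:, 4:9, :])
out_three = net_three(x[:, 9:14, :])

# Concatenate back along the keypoint dimension -> [bs, 14, 128]
out = torch.cat([out_one, out_two, out_three], dim=1)

loss = nn.functional.mse_loss(out, torch.randn(bs, 14, 128))
loss.backward()

# Gradients reach both the base tensor and every network's parameters
print(x.grad.shape)                     # torch.Size([2, 14, 128])
print(net_two.weight.grad is not None)  # True

If the gradient checks at the end print as shown, autograd is tracking the slicing and concatenation correctly, and an optimizer step over the parameters of all three networks will update them as usual.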