DDP - SyncBatchNorm - Gradient Computation Modified?

Okay, I realized that if I remove the second call, output_right = model(input_right), the error no longer occurs. How can I make SyncBatchNorm work when I pass two inputs through the same network and want to constrain the two outputs?

Works

output_left = model(input_left)
loss = torch.sum(output_left["output"][0] - 0)

Doesn't Work

output_left = model(input_left)
output_right = model(input_right)
loss = torch.sum(output_left["output"][0] - 0) + torch.sum(output_right["output"][0] - 0)
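One workaround I've been considering is to run a single forward pass per iteration so DDP and SyncBatchNorm only see the model called once: concatenate the two inputs along the batch dimension, then split the output afterwards. This is just a sketch, assuming input_left and input_right have the same shape and that the model returns a dict with an "output" key as in the snippets above:

import torch

# Stack the two inputs into one batch so the model is called only once
inputs = torch.cat([input_left, input_right], dim=0)
outputs = model(inputs)

# Split the batched output back into the left/right halves
out_left, out_right = torch.chunk(outputs["output"][0], 2, dim=0)

# Same dummy loss as above, now built from the split outputs
loss = torch.sum(out_left - 0) + torch.sum(out_right - 0)
loss.backward()

Another option I've seen suggested is constructing DistributedDataParallel with broadcast_buffers=False, since the buffer broadcast at the start of each forward seems related to this kind of in-place modification error, but I'm not sure whether that is appropriate when relying on SyncBatchNorm's running statistics.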