Learning a weighted sum of network outputs for multiple input transformations

Hello, I need to train a network that applies two different transformations to the input data and then takes a weighted sum of the outputs. I want to learn the summing weights w as well:

w = torch.tensor([.5, .5], requires_grad=True)
optimizer.add_param_group({'params': w})
# training loop for one batch of input data:
    output[0] = transform_1(data)
    output[1] = transform_2(data)

    output = torch.matmul(w, output)
    loss = criterion(output, label)
    loss.backward()
    optimizer.step()

I am working with images and the transforms include resizing, so transform_1(data).shape != transform_2(data).shape; therefore I cannot stack the outputs together along a new dimension.
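To make the setup concrete, here is a self-contained sketch of what I am trying to do (the branch sizes and the linear heads are made up for illustration): each branch resizes the image differently, but both end in a head producing the same output shape, so the weighted sum can be formed without stacking the differently-shaped intermediate tensors:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# hypothetical heads: different input sizes, same output shape (batch, 2)
head_1 = torch.nn.Linear(16 * 16, 2)
head_2 = torch.nn.Linear(8 * 8, 2)

def transform_1(x):
    # resize to 16x16, then classify
    r = F.interpolate(x, size=(16, 16))
    return head_1(r.flatten(1))

def transform_2(x):
    # resize to 8x8, then classify
    r = F.interpolate(x, size=(8, 8))
    return head_2(r.flatten(1))

# learnable mixing weights
w = torch.tensor([0.5, 0.5], requires_grad=True)

data = torch.randn(5, 1, 32, 32)
out = w[0] * transform_1(data) + w[1] * transform_2(data)
print(out.shape)  # torch.Size([5, 2])
```

Here w[0] and w[1] scale same-shaped final outputs directly, so no torch.stack over the differently-shaped resized tensors is needed.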

With this loop I get

Variable._execution_engine.run_backward(
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [5, 2]], which is output 0 of PermuteBackward, is at version 10; expected version 8 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

If I wrap the whole training loop in with torch.autograd.set_detect_anomaly(True):, I get the error

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [5, 2]], which is output 0 of PermuteBackward, is at version 10; expected version 8 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

Do you have any idea how to train a network while also learning the weights of the output sum? Thank you very much.
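For completeness, here is the smallest runnable version of what I am aiming for (the branch networks are simple stand-ins, not my real model). Instead of writing into a preallocated output tensor and calling torch.matmul, it builds a fresh weighted sum each step; does this look like the right direction?

```python
import torch

torch.manual_seed(0)

# stand-in branch networks with identical output shape (batch, 2)
branch_1 = torch.nn.Linear(16, 2)
branch_2 = torch.nn.Linear(16, 2)

# learnable mixing weights, registered as an extra param group
w = torch.tensor([0.5, 0.5], requires_grad=True)
optimizer = torch.optim.SGD(
    list(branch_1.parameters()) + list(branch_2.parameters()), lr=0.1)
optimizer.add_param_group({'params': [w]})

criterion = torch.nn.CrossEntropyLoss()
data = torch.randn(5, 16)
label = torch.randint(0, 2, (5,))

for _ in range(10):
    optimizer.zero_grad()
    # fresh tensors every step: no in-place writes into a reused buffer
    output = w[0] * branch_1(data) + w[1] * branch_2(data)
    loss = criterion(output, label)
    loss.backward()
    optimizer.step()
```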