You need to create two nn.Parameters for a and b on the right device (e.g. nn.Parameter(torch.tensor(1.0, device=my_device))) and pass them to the optimizer (e.g. list(model.parameters()) + [a, b] instead of just the model's parameters).
Note that training loss weights is generally mathematically tricky (e.g. the optimizer may simply drive the weights toward zero to minimize the weighted loss), but whether this applies in your case is hard to tell at this level of detail.
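A minimal sketch of this setup (the model, data, and the two-loss structure are illustrative assumptions, not from your code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 1).to(device)

# Trainable loss weights, created directly on the target device.
a = nn.Parameter(torch.tensor(1.0, device=device))
b = nn.Parameter(torch.tensor(1.0, device=device))

# Pass the model's parameters plus the extra weights to the optimizer.
optimizer = torch.optim.SGD(list(model.parameters()) + [a, b], lr=0.01)

x = torch.randn(8, 4, device=device)
target1 = torch.randn(8, 1, device=device)
target2 = torch.randn(8, 1, device=device)

out = model(x)
# Weighted combination of two losses; because a and b are leaves of the
# autograd graph, loss.backward() populates a.grad and b.grad automatically.
loss = a * F.mse_loss(out, target1) + b * F.mse_loss(out, target2)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

No custom backward is needed: autograd differentiates through the multiplication by a and b just like any other operation.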
So do I just initialize the parameters in the constructor of my network class and use them as netobject.a, or do I also have to write something in the backward function?