How to implement an updating weighted MSE Loss?

Hello everyone! I’m new here. :grinning: I’m an undergraduate student working on my research project.
I’m not a native English speaker, so apologies for my weird grammar.

My research topic is wind power prediction using an LSTM-NN and its application in power trading. I use only the time-series data of wind power as input, so the power production is predicted based on the power production observed in the past.

Long story short, while trading, we care not only about the forecast accuracy, but also about the power price. When the price is low, a not-very-accurate forecast is acceptable. When the price is high, however, the wind power producer would prefer a highly accurate power production forecast.

To achieve this goal, I tried to update my loss function during training. I used the power price as a weighting factor and multiplied it into my MSE loss, so I defined my loss function as below:

def weighted_mse_loss(input, target, weight):
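    # scale each squared error by the weight; note that no reduction is applied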
    return (weight * (input - target) ** 2)

And I tried to train my model like this:

    # training
    loss = 0
    for i in range(epochs):
        for (seq, label, price_label) in Dtr:
            seq = seq.to(device)
            label = label.to(device)
            y_pred = model(seq)
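            # price_label (the power price) is passed as the per-sample weight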
            loss = weighted_mse_loss(y_pred, label, price_label)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print('epoch', i, ':', loss.item())

    state = {'model': model.state_dict(), 'optimizer': optimizer.state_dict()}
    torch.save(state, path)

The sequence data I input was like this:

[(tensor([[-0.4238],
          [ 0.7864],
          [ 0.7743],
          [ 0.6549],
          [ 0.7195]]),
  tensor([0.7324]),
  [6.49]),

At any given time, for example, [0.7324] would be the true value of the forecast target, while the previous values (5 here for brevity; 144 in my actual code) would be used for the forecast. [6.49] would be the power price at that time, which I use directly as the weight.
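To make the weighting concrete, here is what the loss is meant to compute for one such sample (a minimal sketch; the predicted value 0.7000 is made up for illustration):

import torch

label = torch.tensor([0.7324])   # true power production at the next step
price = torch.tensor([6.49])     # power price, used directly as the weight
y_pred = torch.tensor([0.7000])  # stand-in for the model's forecast

loss = price * (y_pred - label) ** 2  # weighted squared error for this sample
print(loss)  # tensor([0.0068])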

However, when I ran my code, I got an error like this:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [118], in <cell line: 1>()
----> 1 train(LSTM_PATH)

Input In [117], in train(path)
     18 label = label.to(device)
     19 y_pred = model(seq)
---> 20 loss = weighted_mse_loss(y_pred, label, price_label)
     21 optimizer.zero_grad()
     22 loss.backward()

Input In [109], in weighted_mse_loss(input, target, weight)
      1 def weighted_mse_loss(input, target, weight):
----> 2     return (weight * (input - target) ** 2)

TypeError: only integer tensors of a single element can be converted to an index

How could I fix this? Or is there a better way to update my MSE loss function in response to fluctuations in the power price? I’m an EE student, so I’m not very good at coding :smiling_face_with_tear:

I’m unsure what exactly is raising the issue as I don’t see an indexing operation in the code.
Also, your code snippet works fine using:

def weighted_mse_loss(input, target, weight):
    return (weight * (input - target) ** 2)

x = torch.randn(10, 10, requires_grad=True)
y = torch.randn(10, 10)
weight = torch.randn(10, 1)

loss = weighted_mse_loss(x, y, weight)
loss.mean().backward()

Could you check the types of all inputs to the weighted_mse_loss function and post a minimal, executable code snippet here, if possible?

Thank you very much for your response!
The following is the code I used to make the input data:

def nn_seq_ms(B):
    print('data processing...')
    dataset = load_data()
    # split
    train = dataset[:'2017/12/31 23:50']
    test = dataset['2018/1/1 0:00':'2018/12/31 23:50']
    
    def process(data, batch_size):
        load = data[data.columns[1]]
        load = load.tolist()
        data = data.values.tolist()
        seq = []
        for i in range(len(data) - 144): # predict wind power using power production sequence data in the past 144 time intervals
            train_seq = [] # sequence data in the past 144 time intervals
            train_label = [] # prediction target at any given time
            price_label = [] # power price at that same time
            for j in range(i, i + 144):
                x = [load[j]]
                train_seq.append(x)

            train_label.append(load[i + 144])
            price_label.append(data[i + 144][0])
            train_seq = torch.FloatTensor(train_seq)
            train_label = torch.FloatTensor(train_label).view(-1)
#             price_label = torch.FloatTensor(price_label).view(-1)
            seq.append((train_seq, train_label, price_label))

        # print(seq[-1])
        seq = MyDataset(seq)
        seq = DataLoader(dataset=seq, batch_size=batch_size, shuffle=False, num_workers=0)

        return seq

    Dtr = process(train, B)
    Dte = process(test, B)

    return Dtr, Dte

Ultimately my input is a DataLoader, which I’m not sure is the type you are asking about.

Thanks for the follow-up. Your code doesn’t help much as the dataset is undefined and I can’t see its types etc.
Use print statements in your code, e.g. print(type(input), type(target), type(weight)), to check the input types and more attributes if necessary, to narrow down the difference between our code snippets.
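For example, a standalone sketch of what such a check might show if the price is never converted (the names and values are made up to mirror your loop):

import torch

y_pred = torch.randn(1, 1)  # mock model output
label = torch.randn(1, 1)   # mock target
price_label = [[6.49]]      # still a plain Python list if the conversion is skipped

print(type(y_pred), type(label), type(price_label))
# <class 'torch.Tensor'> <class 'torch.Tensor'> <class 'list'>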

I’m sorry for confusing you :smiling_face_with_tear: I really suck at coding…
Anyway, the following is my dataset:

[dataset screenshot: a ‘power (kWh)’ column and an ‘avg_price (yen/kWh)’ column]

The ‘power (kWh)’ is the (normalized) time-series data I used to train the model, while the ‘avg_price (yen/kWh)’ is the power price I want to use as the weights.

I’m not sure if that could help you. I’ve also used your code to check my data types, and it turned out like this:

[printout showing the input and target are tensors, while price_label is a plain Python list]

The error is most likely raised by using a list for the price_label, assuming these are the weights:

def weighted_mse_loss(input, target, weight):
    return (weight * (input - target) ** 2)

x = torch.randn(10, 10, requires_grad=True)
y = torch.randn(10, 10)
weight = torch.randn(10, 1).tolist()

loss = weighted_mse_loss(x, y, weight)
# TypeError: only integer tensors of a single element can be converted to an index

Could you transform it to a tensor? It should then work, based on my previous code snippet.
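For instance, a minimal sketch of the conversion (the batch values are made up; the key line mirrors the conversion that is commented out in your process() function):

import torch

def weighted_mse_loss(input, target, weight):
    return weight * (input - target) ** 2

y_pred = torch.randn(4, 1, requires_grad=True)
label = torch.randn(4, 1)
price_label = [[6.49], [7.12], [5.80], [6.01]]  # a list, as it comes out of the loader

price = torch.FloatTensor(price_label)  # convert to a tensor before weighting
loss = weighted_mse_loss(y_pred, label, price)
loss.mean().backward()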

I’ve transformed the price_label to a tensor, but it still didn’t work. I encountered another error when calling backward:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [129], in <cell line: 1>()
----> 1 train(LSTM_PATH)

Input In [128], in train(path)
     21     loss = weighted_mse_loss(y_pred, label, price_label)
     22     optimizer.zero_grad()
---> 23     loss.backward()
     24     optimizer.step()
     25 print('epoch', i, ':', loss.item())

File ~\anaconda3\envs\pytorch\lib\site-packages\torch\_tensor.py:396, in Tensor.backward(self, gradient, retain_graph, create_graph, inputs)
    387 if has_torch_function_unary(self):
    388     return handle_torch_function(
    389         Tensor.backward,
    390         (self,),
   (...)
    394         create_graph=create_graph,
    395         inputs=inputs)
--> 396 torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)

File ~\anaconda3\envs\pytorch\lib\site-packages\torch\autograd\__init__.py:166, in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)
    162 inputs = (inputs,) if isinstance(inputs, torch.Tensor) else \
    163     tuple(inputs) if inputs is not None else tuple()
    165 grad_tensors_ = _tensor_or_tensors_to_tuple(grad_tensors, len(tensors))
--> 166 grad_tensors_ = _make_grads(tensors, grad_tensors_, is_grads_batched=False)
    167 if retain_graph is None:
    168     retain_graph = create_graph

File ~\anaconda3\envs\pytorch\lib\site-packages\torch\autograd\__init__.py:67, in _make_grads(outputs, grads, is_grads_batched)
     65 if out.requires_grad:
     66     if out.numel() != 1:
---> 67         raise RuntimeError("grad can be implicitly created only for scalar outputs")
     68     new_grads.append(torch.ones_like(out, memory_format=torch.preserve_format))
     69 else:

RuntimeError: grad can be implicitly created only for scalar outputs

Your loss tensor contains more than a single element, so reduce it, e.g. via loss.mean().backward().

Thank you so much! I used loss.mean().backward() and it is working!
But I am still a little confused. During training, the loss is computed as the squared error between the true value and the predicted value, which is then multiplied by the weight. I thought it would eventually be a single value, so why do I need to reduce it?

The loss won’t be automatically reduced, and in your weighted_mse_loss you are using elementwise operations only.
Check the loss output from my first code snippet and you will see that its shape is equal to the input shapes of [10, 10] before I call loss.mean().
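For example:

import torch

def weighted_mse_loss(input, target, weight):
    return weight * (input - target) ** 2

x = torch.randn(10, 10, requires_grad=True)
y = torch.randn(10, 10)
weight = torch.randn(10, 1)

loss = weighted_mse_loss(x, y, weight)
print(loss.shape)         # torch.Size([10, 10]) -- one loss value per element
print(loss.mean().shape)  # torch.Size([]) -- a scalar, which backward() expects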

I fully understand this time! Thank you very much again for being so kind and patient and you really saved my life lol. It seems like I can graduate on time! :laughing: