Error "ValueError: only one element tensors can be converted to Python scalars" on optimizer.step(closure)

During training I intend to compute the loss using MSELoss, but to use the max error as the criterion for stopping training and for reporting the error.

Note that I set reduce=False, so loss_func returns a loss per input/target element. How can I make the optimizer work when I want to use the max error?

Error:

/Envs/climaenv/lib/python3.6/site-packages/torch/optim/lbfgs.py in step(self, closure)
    102         # evaluate initial f(x) and df/dx
    103         orig_loss = closure()
--> 104         loss = float(orig_loss)
    105         current_evals = 1
    106         state['func_evals'] += 1

ValueError: only one element tensors can be converted to Python scalars

Here is the code:


    def fit(self, x=None, y=None, lr=0.001, epochs=1000):
        '''
         Training function
         Arguments:
            x: features train set
            y: target train set
            lr: Learning rate
            epochs: number of epochs 
        '''
        optimizer = torch.optim.LBFGS([{'params': [self.hidden.weight,self.predict.weight]}], 
                                      lr=lr,max_iter=20)
        
        loss_func = torch.nn.MSELoss(reduce=False)    # per-element losses (reduction='none' in newer versions)
    
        def closure():
            optimizer.zero_grad()                  # clear gradients for next train
            prediction = self(x)                   # input x and predict based on x
            loss = loss_func(prediction, y) 
            loss.max().backward()                  # backpropagation, compute gradients 
            return loss
        
        for t in range(epochs):  
            optimizer.step(closure)                # apply gradient
            
        return self.loss_epochs

If you want to use the max error to stop your training, you could use

    loss = orig_loss.max().detach().numpy()

Currently orig_loss is not a scalar, since you use reduce=False in your loss function.
Therefore float() cannot cast it to a Python scalar.
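
For optimizer.step itself, the closure has to return a scalar tensor that float() can consume. Here is a minimal sketch of how the closure from your fit method could be adapted, reusing your names (self, x, y, loss_func, optimizer); note that backpropagating through loss.max() only sends gradients through the element that attains the maximum:

    def closure():
        optimizer.zero_grad()                  # clear gradients for this step
        prediction = self(x)                   # forward pass
        loss = loss_func(prediction, y)        # per-element losses (reduce=False)
        max_loss = loss.max()                  # scalar tensor: the max error
        max_loss.backward()                    # gradient flows only through the max element
        return max_loss                        # float(max_loss) now works inside LBFGS.step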

Thank you so much! Now I can stop the training. But I don't know if you got my main point:

And how can I use max to reduce my loss function? Currently the only options are sum or mean. Because of that I still get an error in the optimizer…

For unreduced losses you would need to provide the gradient:

    import torch
    import torch.nn as nn

    loss_fn = nn.MSELoss(reduce=False)                 # per-element losses
    loss = loss_fn(torch.randn(10, 10, requires_grad=True), torch.randn(10, 10))
    loss.backward(torch.ones_like(loss))               # same gradients as loss.sum().backward()
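
Putting the two answers together, here is a hypothetical, self-contained sketch; the toy model, random data, and tolerance max_tol are invented for illustration. The closure backpropagates the unreduced loss via torch.ones_like (equivalent to sum reduction) and returns a scalar for LBFGS, while the max error is checked only as the stopping criterion:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    model = nn.Linear(3, 1)                        # toy stand-in for the poster's network
    x, y = torch.randn(50, 3), torch.randn(50, 1)  # random stand-in data

    optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1, max_iter=20)
    loss_fn = nn.MSELoss(reduction='none')         # modern spelling of reduce=False

    def closure():
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)                # per-element losses
        loss.backward(torch.ones_like(loss))       # same gradients as loss.sum().backward()
        return loss.sum()                          # scalar tensor, so float() works in LBFGS.step

    max_tol = 1e-3                                 # invented stopping tolerance
    for epoch in range(1000):
        optimizer.step(closure)
        with torch.no_grad():
            max_err = loss_fn(model(x), y).max().item()
        if max_err < max_tol:                      # stop on the max error, as intended
            break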