Custom Tweedie loss throwing an error in PyTorch

Hello,

I'm having trouble implementing a GLM where the response y follows a Tweedie distribution using the statsmodels package. Is there a way to do this in PyTorch? I've searched and haven't found any literature or posts on it.

UPDATE

I’ve tried to define a custom loss function as such:

```
def tweedieloss(predicted, observed):
    '''
    Custom loss function designed to minimize the deviance using stochastic gradient descent
    '''
    p = torch.tensor([1.5])

    QLL = predicted**-p(((predicted*observed)/(torch.tensor([1])-p)) - ((predicted**2)/(torch.tensor([2])-p)))
    QLL.cuda()
    return -torch.abs(QLL)
```

I'm still not sure if this is exactly correct, but I do know it's giving me an error.
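For reference, assuming the goal is the quasi-log-likelihood of a Tweedie distribution with power 1 < p < 2, the quantity the code is aiming for can be written as follows (the factored form on the right is what the expression above is trying to implement):

```
% Tweedie quasi-log-likelihood per observation (mu = prediction, y = observation, 1 < p < 2)
Q(y, \mu) = \frac{y \, \mu^{1-p}}{1-p} - \frac{\mu^{2-p}}{2-p}
          = \mu^{-p} \left( \frac{y \, \mu}{1-p} - \frac{\mu^{2}}{2-p} \right)

% the unit deviance to minimize is, up to a term that depends only on y,
d(y, \mu) = -2 \, Q(y, \mu) + \mathrm{const}(y)
```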

I’m using it in the following model:

```
# Create the linear regression model
model = nn.Linear(X.shape[1], 1)
# Loss and optimizer
criterion = tweedieloss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# PyTorch uses float32 by default
# Numpy creates float64 by default
inputs = torch.from_numpy(X.astype(np.float32))
targets = torch.from_numpy(Y.astype(np.float32))

# Train the model
n_epochs = 20
losses = []
for it in range(n_epochs):
    # zero the parameter gradients
    optimizer.zero_grad()

    # Forward pass
    outputs = model(inputs)
    loss = criterion(outputs, targets)

    # keep the loss so we can plot it later
    losses.append(loss.item())

    # Backward and optimize
    loss.backward()
    optimizer.step()

    print(f'Epoch {it+1}/{n_epochs}, Loss: {loss.item():.4f}')
```

The error I'm getting is the following:

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-182-87301b9d4374> in <module>
      9     # Forward pass
     10     outputs = model(inputs)
---> 11     loss = criterion(outputs, targets)
     12 
     13     # keep the loss so we can plot it later

<ipython-input-180-288c1e1a6f6d> in tweedieloss(predicted, observed)
      5     p = torch.tensor([1.5])
      6 
----> 7     QLL = predicted**-p(((predicted*observed)/(torch.tensor([1])-p)) - ((predicted**2)/(torch.tensor([2])-p)))
      8     QLL.cuda()
      9     return -torch.abs(QLL)

RuntimeError: expected device cuda:0 but got device cpu
```

I'm not sure what in that custom loss function I should send to CUDA. When I try to run everything on the CPU, I get the following:
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-188-87301b9d4374> in <module>
      9     # Forward pass
     10     outputs = model(inputs)
---> 11     loss = criterion(outputs, targets)
     12 
     13     # keep the loss so we can plot it later

<ipython-input-183-43f44140ed7b> in tweedieloss(predicted, observed)
      5     p = torch.tensor([1.5])
      6 
----> 7     QLL = predicted**-p(((predicted*observed)/(torch.tensor([1])-p)) - ((predicted**2)/(torch.tensor([2])-p)))
      8     return -torch.abs(QLL)

TypeError: 'Tensor' object is not callable
```

I'm not sure how to implement this custom loss function.
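(As a side note on that TypeError: Python parses `predicted**-p(...)` as `predicted ** (-(p(...)))`, so the tensor `p` ends up being called like a function. A tiny, hypothetical reproduction, with the names `x` and `p` used purely for illustration:)

```
import torch

p = torch.tensor([1.5])
x = torch.tensor([2.0, 3.0])

# x ** -p(x)                  # TypeError: 'Tensor' object is not callable
y = x ** -p * (x / (1 - p))   # writing the * explicitly gives the intended product
print(y)
```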

Why do you need to use torch.tensor([1])? Can't you simply pass 1?

Because I was desperate to find an answer and thought that may help. When I take that out, I get the following:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-322-16213b95ac4b> in <module>
     12 #     print(loss)
     13     # keep the loss so we can plot it later
---> 14     losses.append(loss.item())
     15 
     16     # Backward and optimize

ValueError: only one element tensors can be converted to Python scalars
```

So I took the append out and printed the loss but it’s producing all NaNs.

This error is happening because item() can only be called on a Tensor containing a single element, for example [1]. If you call item() on [1, 1] it will give this error. Also, in PyTorch custom loss functions are supposed to return a scalar value.
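For example, a simple mean squared error loss that returns a scalar might look like this minimal sketch (the name `mse_loss` is just illustrative):

```
import torch

def mse_loss(predicted, observed):
    # element-wise squared error, reduced to a single scalar with mean()
    return torch.mean((predicted - observed) ** 2)
```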

Ahhh! Well, that got it to run on the CPU. When I run it on the GPU, I get the following:

```
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-355-5f0610597548> in <module>
      9     # Forward pass
     10     outputs = model(inputs)
---> 11     loss = tweedieloss(outputs, targets)
     12     print(loss)
     13     # keep the loss so we can plot it later

<ipython-input-345-982fe7ade6d7> in tweedieloss(predicted, observed)
     23     p = torch.tensor(1.5)
     24 
---> 25     QLL = torch.pow(predicted, (-p))*(((predicted*observed)/(1-p)) - ((torch.pow(predicted, 2))/(2-p)))
     26 
     27     return torch.mean(-torch.abs(QLL))

RuntimeError: iter.device(arg).is_cuda() INTERNAL ASSERT FAILED at C:/w/1/s/tmp_conda_3.7_100118/conda/conda-bld/pytorch_1579082551706/work/aten/src\ATen/native/cuda/Loops.cuh:197, please report a bug to PyTorch.
```

Not sure how I’m supposed to report the bug.

Did you check if both your outputs and targets are on the same CUDA device?
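For anyone hitting the same device errors: every tensor that takes part in the loss has to live on the same device, including the constant `p` created inside the loss (which lands on the CPU by default). A rough sketch of one way to arrange that (dummy shapes and names, just for illustration):

```
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# model and data on the same device (in the thread, inputs/targets come from X and Y)
model = nn.Linear(3, 1).to(device)
inputs = torch.rand(8, 3, device=device)
targets = torch.rand(8, 1, device=device)

def tweedieloss(predicted, observed):
    # build p on the same device as the predictions instead of the CPU default
    p = torch.tensor(1.5, device=predicted.device)
    QLL = torch.pow(predicted, -p) * ((predicted * observed) / (1 - p)
                                      - torch.pow(predicted, 2) / (2 - p))
    return torch.mean(-2 * QLL)
```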

I (and when I say I, I mean a friend) figured this out late yesterday. I had to change the function and we’re not sure why changing the simple math enabled it to run. But in case anyone wants a custom function that maximizes the Tweedie QLL, the below works:

```
def QLL(predicted, observed):
    p = torch.tensor(1.5)
    QLL = torch.pow(predicted, (-p))*(((predicted*observed)/(1-p)) - ((torch.pow(predicted, 2))/(2-p)))

    return QLL


def tweedieloss(predicted, observed):
    '''
    Custom loss function designed to minimize the deviance using stochastic gradient descent
    Tweedie deviance from McCullagh 1983
    '''
    d = -2*QLL(predicted, observed)
    return torch.mean(d)
```
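For completeness, a sketch of how this could slot into the training loop from earlier in the thread (it assumes the same `model`, `inputs`, and `targets` as above); since `torch.mean` reduces the deviance to a single element, `loss.item()` works again:

```
# sketch only: model, inputs, targets and tweedieloss as defined above
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

n_epochs = 20
for it in range(n_epochs):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = tweedieloss(outputs, targets)   # scalar tensor
    loss.backward()
    optimizer.step()
    print(f'Epoch {it+1}/{n_epochs}, Loss: {loss.item():.4f}')
```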