How to solve IndexError: too many indices for tensor of dimension 1

Hi all, I am plotting ROC curves for my multi-class classification problem. I compute FPR and TPR, following the multiclass ROC example from the scikit-learn documentation. For this, I take the prediction values from every epoch and use labels in place of y_test. This is my ROC code (initial stage):

print(__doc__)

import numpy as np
import matplotlib.pyplot as plt
from itertools import cycle

from sklearn import svm, datasets
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.multiclass import OneVsRestClassifier
from scipy import interp

# Compute ROC curve and ROC area for each class 
fpr = dict() 
tpr = dict() 
roc_auc = dict() 
for i in range(nb_classes): 
   fpr[i], tpr[i], _ = roc_curve(labels[:, i], prediction[:, i]) # here change y_test to labels 
   roc_auc[i] = auc(fpr[i], tpr[i]) 
# Compute micro-average ROC curve and ROC area 
fpr["micro"], tpr["micro"], _ = roc_curve(labels.ravel(), prediction.ravel()) #  and here
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])

Running it raises this error:

Automatically created module for IPython interactive environment
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-34-bda71c8fb3a6> in <module>()
     17 roc_auc = dict()
     18 for i in range(nb_classes):
---> 19    fpr[i], tpr[i], _ = roc_curve(labels[:, i], prediction[:, i]) # here change y_test to labels
     20    roc_auc[i] = auc(fpr[i], tpr[i])
     21 # Compute micro-average ROC curve and ROC area

IndexError: too many indices for tensor of dimension 1

I have checked the shapes of prediction and labels. Both are torch.Size([32]).
I wonder why it is throwing an error. Can anyone help? Thanks.

Hello Deb -

In prediction[:, i] you are passing prediction two indices.
(":" counts as an index.) If prediction.shape is indeed
torch.Size([32]) (a one-index tensor) you are indeed passing
it “too many indices.”
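Here is a small sketch of what I mean, assuming a tensor of zeros stands in for your prediction (the names scores and column are made up for illustration):

```python
import torch

# A one-index tensor, like the prediction described in the question
prediction = torch.zeros(32)  # shape: torch.Size([32])

try:
    prediction[:, 0]  # two indices on a 1-D tensor
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# A two-index tensor (samples x classes) accepts the same indexing
scores = torch.zeros(32, 5)  # shape: torch.Size([32, 5])
column = scores[:, 0]        # all samples, class 0 -> torch.Size([32])
```

With a [32, 5] tensor of per-class scores, scores[:, i] picks out one class column, which seems to be what the tutorial code expects.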

Best.

K. Frank


There is a new problem now, though. Can you help with this one?

Automatically created module for IPython interactive environment
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-37-5ffbb87ce56b> in <module>()
     17 roc_auc = dict()
     18 for i in range(nb_classes):
---> 19    fpr[i], tpr[i], _ = roc_curve(labels[i], prediction[i]) # here change y_test to labels
     20    roc_auc[i] = auc(fpr[i], tpr[i])
     21 # Compute micro-average ROC curve and ROC area

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/ranking.py in roc_curve(y_true, y_score, pos_label, sample_weight, drop_intermediate)
    616     """
    617     fps, tps, thresholds = _binary_clf_curve(
--> 618         y_true, y_score, pos_label=pos_label, sample_weight=sample_weight)
    619 
    620     # Attempt to drop thresholds corresponding to points in between and

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/ranking.py in _binary_clf_curve(y_true, y_score, pos_label, sample_weight)
    392     """
    393     # Check to make sure y_true is valid
--> 394     y_type = type_of_target(y_true)
    395     if not (y_type == "binary" or
    396             (y_type == "multiclass" and pos_label is not None)):

/usr/local/lib/python3.6/dist-packages/sklearn/utils/multiclass.py in type_of_target(y)
    247         raise ValueError("y cannot be class 'SparseSeries'.")
    248 
--> 249     if is_multilabel(y):
    250         return 'multilabel-indicator'
    251 

/usr/local/lib/python3.6/dist-packages/sklearn/utils/multiclass.py in is_multilabel(y)
    138     """
    139     if hasattr(y, '__array__'):
--> 140         y = np.asarray(y)
    141     if not (hasattr(y, "shape") and y.ndim == 2 and y.shape[1] > 1):
    142         return False

/usr/local/lib/python3.6/dist-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    499 
    500     """
--> 501     return array(a, dtype, copy=False, order=order)
    502 
    503 

/usr/local/lib/python3.6/dist-packages/torch/tensor.py in __array__(self, dtype)
    448     def __array__(self, dtype=None):
    449         if dtype is None:
--> 450             return self.numpy()
    451         else:
    452             return self.numpy().astype(dtype, copy=False)

TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

I have read that this happens when argmax is not used; however, I am using it:

for epoch in range(epochs):
  
  running_loss = 0
  model.train()
  for images, labels in dataloader_train:
    
    #steps += 1
    images, labels = images.to(device), labels.to(device)
    
    optimizer.zero_grad()
    
    output = model.forward(images)
    p = torch.nn.functional.softmax(output, dim=1)
    prediction = torch.argmax(p, dim=1)
    #loss = torch.nn.functional.nll_loss(torch.log(p), y)
    loss = criterion(output, labels)
    loss.backward()
    optimizer.step()
    
    running_loss += loss.item()
    
  #if steps % print_every == 0:
  valid_loss = 0
  accuracy = 0
  model.eval()
  for images, labels in dataloader_test:
    optimizer.zero_grad()
    with torch.no_grad():
       
      images, labels = images.to(device), labels.to(device)

      output = model.forward(images)
      p = torch.nn.functional.softmax(output, dim=1)
      prediction = torch.argmax(p, dim=1)
      loss = criterion(output, labels)
          
      valid_loss += loss.item()
          
      ps = torch.exp(output)
         
      top_p, top_class = ps.topk(1, dim = 1)
      equals = top_class == labels.view(*top_class.shape)
      accuracy += torch.mean(equals.type(torch.FloatTensor))
        
  print("Epoch: {}/{} " .format(epoch+1, epochs))
  print("Train loss: {:.4f}.. " .format(running_loss/len(dataloader_train)))
  print("Valid loss: {:.4f}.. " .format(valid_loss/len(dataloader_test)))
  print("Accuracy: {:.4f}.. " .format(accuracy/len(dataloader_test)))
  model.train()

I am now confused about how to plot my ROC. I want a multi-class ROC curve. I have not used any unnecessary numpy arrays. I am new to this. Could you please take a look? Thanks a lot.

Hello Deb!

First, in general, you should probably start a new thread for
things like this. (I do have some comments, below.)

I have no idea where the “CUDA tensor” error is coming from. I
suspect it might be a misleading error – not that it isn’t technically
true, but that it’s a (not really relevant) side effect of the “real” error.

(Would I be right that device is a gpu? You might try using just
the cpu, but I bet you would still get an error – just a different one.)
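For what it is worth, the traceback above does show sklearn calling np.asarray() on the tensor, and the message itself suggests Tensor.cpu(). Here is a minimal sketch of collecting the per-batch softmax outputs and moving them to the cpu before handing anything to sklearn. It assumes that fake_batches stands in for your dataloader_test and that the model outputs logits; both are illustrative assumptions, not your actual code:

```python
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for dataloader_test: 3 batches, 4 samples, 5 classes
torch.manual_seed(0)
fake_batches = [(torch.randn(4, 5), torch.randint(0, 5, (4,))) for _ in range(3)]

all_probs, all_labels = [], []
with torch.no_grad():
    for output, labels in fake_batches:
        output, labels = output.to(device), labels.to(device)
        p = torch.softmax(output, dim=1)  # per-class scores, shape [4, 5]
        all_probs.append(p.cpu())         # copy to host memory first
        all_labels.append(labels.cpu())

# NumPy arrays covering the whole test set, safe to pass to sklearn
probs_np = torch.cat(all_probs).numpy()    # shape (12, 5)
labels_np = torch.cat(all_labels).numpy()  # shape (12,)
```

Note that this keeps the full softmax output rather than the argmax, so each class still has its own column of scores.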

You say that you are working on a “multi-class classification”
problem, that is, that you have more than two classes. I suspect
that your prediction = torch.argmax(p, dim=1) returns values
greater than 1. (That is, if you had five classes, prediction would
take on values from [0, 1, 2, 3, 4].)

The documentation for sklearn.metrics.roc_curve says:

“Note: this implementation is restricted to the binary classification
task.”

The sample code you linked to (an example of how to use
roc_curve() with a multi-class classifier) has things like:

# Binarize the output
y = label_binarize(y, classes=[0, 1, 2])
n_classes = y.shape[1]

and I don’t see anything like that in the code you posted.
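As a small illustration of what that binarize step does (the label values here are made up):

```python
import numpy as np
from sklearn.preprocessing import label_binarize

# Integer class labels for four samples, three classes
labels = np.array([0, 2, 1, 2])

# One column per class; a 1 marks the sample's class
onehot = label_binarize(labels, classes=[0, 1, 2])
# onehot has shape (4, 3), so onehot[:, i] is a binary target for class i
```

One caveat: with only two classes, label_binarize returns a single column rather than two, so the [:, i] indexing applies to the three-or-more-class case.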

So I’m guessing that your error is that you are passing labels
(predictions) to roc_curve() that have values other than just
[0, 1], and that that is triggering a (not very helpful) error
message.
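A rough sketch of how the per-class loop might look once the labels are binarized. I have not run this against your model: the scores below are synthetic stand-ins for the softmax outputs gathered over the whole test set (shape [n_samples, nb_classes]), and nb_classes = 3 is an assumption for the example:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

nb_classes = 3
rng = np.random.default_rng(0)

# Synthetic integer labels (every class present) and fake softmax scores
labels = np.tile(np.arange(nb_classes), 20)            # shape (60,)
logits = rng.normal(size=(labels.size, nb_classes))
logits[np.arange(labels.size), labels] += 2.0          # favor the true class
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
scores = exp / exp.sum(axis=1, keepdims=True)          # rows sum to 1

# Binarize the integer labels: shape (60, 3), one column per class
y_true = label_binarize(labels, classes=list(range(nb_classes)))

# One binary (one-vs-rest) ROC curve per class
fpr, tpr, roc_auc = {}, {}, {}
for i in range(nb_classes):
    fpr[i], tpr[i], _ = roc_curve(y_true[:, i], scores[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

# Micro-average: treat every (sample, class) pair as one binary decision
fpr["micro"], tpr["micro"], _ = roc_curve(y_true.ravel(), scores.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])
```

Both arguments to roc_curve() are two-index NumPy arrays here, so the [:, i] indexing from the tutorial works, and each call only ever sees 0/1 targets.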

If this doesn’t help, please start a new thread and maybe
someone who (unlike me) actually knows how to use
sklearn.metrics.roc_curve could help.

Best.

K. Frank


Thank you. You pointed out exactly what I had missed. I really appreciate it.