Simple Prediction Manipulation and Graphing. Also Feedback?

Hi folks. I’m having an unreasonable amount of difficulty shaping model outputs for graphing. Is anyone available to talk through data types etc.?

I’m going to include a bunch of code below, and I’d like to know how best to save my predictions and the associated labels (given that I’m shuffling between epochs). In particular I’d like to make a confusion matrix, but I can’t seem to get my outputs into a single structure (list/array/tensor). I’m also interested in whether there is a good way to include the datum ID somewhere, so I can compare accuracy across some of the other fields I’m not using to train my model. I’d also be happy to know what I’m making harder than it needs to be and what’s just bad code.

If this sort of code dump and request for general help is inappropriate for the forum, please let me know. I’m just so frustrated with how messy this project has been… I’m about to finish a master’s degree and this shouldn’t be this hard…

Many many thanks for your help.

params = {'batch_size': 200,
          'shuffle': True}
learning_rate = 0.001
num_epochs = 5
...
train_pd  # pandas DataFrame with:
    #   index 'id': 5-char string
    #   additional columns not used in training
    #   'label': int in range [0, 8)
    #   'size', 'indices', 'values': sparse vector information
# val_pd: same structure

# Prep for Data Loader
partition={'train': train_pd.index,'val': val_pd.index}

labels_train=train_pd['label'].to_dict()
labels_val=val_pd['label'].to_dict()

i_train=train_pd['indices'].to_dict()
i_val=val_pd['indices'].to_dict()

v_train=train_pd['values'].to_dict()
v_val=val_pd['values'].to_dict()

inputSize = train_pd['size'].iloc[0]  # positional lookup; the index is a string 'id'

class Dataset(data.Dataset):
    'Characterizes a dataset for PyTorch'
    def __init__(self, list_IDs, labels, indices, values, vecSize):
        'Initialization'
        self.labels = labels
        self.list_IDs = list_IDs
        self.i = indices
        self.v = values
        self.vecSize = vecSize

    def __len__(self):
        'Denotes the total number of samples'
        return len(self.list_IDs)

    def __getitem__(self, index):
        'Generates one sample of data'
        # Select sample
        ID = self.list_IDs[index]
        try:
            # Build the dense input vector from its sparse representation
            X = torch.sparse.FloatTensor(
                torch.LongTensor([self.i[ID]]),
                torch.FloatTensor(self.v[ID]),
                torch.Size([self.vecSize])
            ).to_dense()
        except RuntimeError:
            print(ID)
            raise  # re-raise, otherwise X is undefined below

        y = self.labels[ID]
        return X, torch.LongTensor([y])
    
training_set = Dataset(partition['train'], labels_train,i_train,v_train,inputSize)
training_generator = data.DataLoader(training_set, **params)

validation_set = Dataset(partition['val'], labels_val,i_val,v_val,inputSize)
validation_generator = data.DataLoader(validation_set, **params)

# Model
model = nn.Sequential(
                    nn.Linear(inputSize,16),
                    nn.ReLU(),
                    nn.Linear(16,8)#,
)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  

# Train the model
losses=[]
accuracies=[]
comparisons=[]

total_step = len(training_generator)
for epoch in range(num_epochs):
    i=0
    total=0
    correct=0
    for local_batch, local_labels in training_generator:
        i+=1
        local_batch, local_labels = local_batch.to(device), local_labels.to(device)

        optimizer.zero_grad()
        # Forward pass
        outputs = model(local_batch)

        total += local_labels.size(0)
        labs=local_labels.flatten()
        pred = torch.argmax(outputs.detach(), dim=1)
        correct += (pred == labs).sum().item()
  
        acc=correct/total

        loss = criterion(outputs, local_labels.squeeze())
        losses.append(loss.item())
        accuracies.append(acc)
        
        if epoch==num_epochs-1:
            comparisons.append([pred,labs])

        # Backward and optimize
        loss.backward()
        optimizer.step()
        
        
        if i % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}, Accuracy: {:.4f}'
                  .format(epoch+1, num_epochs, i, total_step, loss.item(), acc))

# Validate the model
Vcomparisons=[]

total_step = len(validation_generator)
total=0
correct=0
Vaccuracies=[]
alllab=[]
allpred=[]
for local_batch, local_labels in validation_generator:
    local_batch, local_labels = local_batch.to(device), local_labels.to(device)

    # Forward pass
    outputs = model(local_batch)

    total += local_labels.size(0)
    labs = local_labels.flatten()
    pred = torch.argmax(outputs.detach(), dim=1)
    correct += (pred == labs).sum().item()

    alllab.append(labs.numpy())
    allpred.append(pred.numpy())

    acc=correct/total

    Vaccuracies.append(acc)

    Vcomparisons.append([pred,labs])
print('Accuracy: {:.4f}'.format(acc))

# np.concatenate takes a sequence of arrays, so the per-batch lists
# can be joined in a single call each
labs = np.concatenate(alllab)
preds = np.concatenate(allpred)

#Eventually sklearn confusion matrix here

For the confusion matrix you don’t necessarily need to store all predictions and targets. Just indexing at the corresponding row and column should be enough.

Do you mean the index by “datum id”? If so, you could just return the index with the data and target in your Dataset.

Your code looks alright. It seems you are using pandas to map some IDs to the data, which makes the code a bit harder to read, but if you need this mapping it’s totally fine. I’m not sure what you’re doing with Vaccuracies, but if you plan on taking their mean, your validation accuracy might have a small error, since the last batch might be smaller than the others. I would rather store the number of correct predictions and calculate the accuracy later using the length of your Dataset.
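To make the last-batch issue concrete, here is a tiny numeric sketch (the batch sizes and correct counts are made up):

```python
import numpy as np

# Hypothetical run: three batches of 10 samples plus a final batch of 2
batch_sizes = np.array([10, 10, 10, 2])
batch_correct = np.array([8, 9, 7, 0])  # correct predictions per batch

# Mean of per-batch accuracies gives the tiny last batch equal weight
mean_of_means = np.mean(batch_correct / batch_sizes)

# Correct global accuracy: total correct / total samples
global_acc = batch_correct.sum() / batch_sizes.sum()

print(round(mean_of_means, 4))  # 0.6  -> biased by the small last batch
print(round(global_acc, 4))     # 0.75
```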

Here is a small example to create the confusion matrix (using sklearn’s row-column logic) and storing the predictions as well as the ids:

class MyDataset(Dataset):
    def __init__(self, nb_samples, nb_features, nb_classes):
        self.data = torch.randn(nb_samples, nb_features)
        self.target = torch.randint(0, nb_classes, (nb_samples,))
        
    def __getitem__(self, index):
        x = self.data[index]
        y = self.target[index]
        
        return x, y, index
    
    def __len__(self):
        return len(self.data)


nb_samples = 100
batch_size = 10
nb_features = 20
nb_classes = 10

dataset = MyDataset(
    nb_samples,
    nb_features,
    nb_classes
)

loader = DataLoader(
    dataset,
    batch_size=batch_size,
    shuffle=True,
)

model = nn.Linear(nb_features, nb_classes)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.eval()
conf_mat = torch.zeros(nb_classes, nb_classes)
losses = 0.
preds = []
pred_ids = []
with torch.no_grad():
    for data, target, data_idx in loader:  # Assuming this is the val loader
        output = model(data)
        loss = criterion(output, target)
        losses += loss.item()
        pred = torch.argmax(output, 1)
        
        # Note: plain fancy indexing (conf_mat[target, pred] += 1) silently drops
        # duplicate (target, pred) pairs within a batch, so accumulate explicitly
        for t, p in zip(target, pred):
            conf_mat[t, p] += 1
        preds.append(pred)
        pred_ids.append(data_idx)

preds = torch.cat(preds)
pred_ids = torch.cat(pred_ids)
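If you’d rather hand the final step to scikit-learn, the stored tensors can be passed to sklearn.metrics.confusion_matrix, which uses the same rows-are-targets convention. A small standalone sketch with hypothetical predictions and targets (assuming scikit-learn is installed):

```python
import torch
from sklearn.metrics import confusion_matrix

# Hypothetical targets and predictions for 6 samples, 3 classes
targets = torch.tensor([0, 1, 2, 2, 0, 1])
preds = torch.tensor([0, 2, 2, 2, 0, 1])

# Rows = true labels, columns = predicted labels,
# matching the conf_mat[target, pred] indexing above
cm = confusion_matrix(targets.numpy(), preds.numpy(), labels=[0, 1, 2])
print(cm)
# [[2 0 0]
#  [0 1 1]
#  [0 0 2]]
```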

Let me know, if this helps or if I misunderstood your issue.

Cheer up! Almost done, so don’t let these minor issues bug you.


Thank you! This is exactly what I needed, there were some incorrect loops in my thinking.

I meant the ID that I’m mapping through pandas with. It’s not really necessary, except that it allows me to refer to the original data easily. So that’s what I’ll add to my dataloader outputs.

You’re right. I don’t need these. They are just copy/paste legacies.

I noticed your no_grad() environment:

I haven’t seen this in examples yet. Is there a comparable environment I could be using during training to eliminate the optimizer.zero_grad(), loss.backward(), and optimizer.step() lines?

Your confusion matrix is also exactly what I was looking for. Many thanks.

Yes, you can wrap any code in torch.no_grad() where you are sure you won’t need to call backward.
The with statement makes sure no intermediate activations are stored during the forward pass, which saves some memory, and is usually used during evaluation. There is no analogous shortcut for optimizer.zero_grad(), loss.backward(), and optimizer.step() during training, though; those calls are the optimization step itself.
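A minimal illustration of the effect, using a tiny linear model: outputs produced inside the context are detached from any autograd graph.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
x = torch.randn(3, 4)

# Normal forward pass: the output is connected to the autograd graph
out_train = model(x)
print(out_train.requires_grad)  # True

# Inside no_grad, no graph is built, so backward() is impossible here
with torch.no_grad():
    out = model(x)
print(out.requires_grad)  # False
```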