Accuracy is not calculated?

Hello, I want to classify isolated music notes.
My dataset has the shape: input[252, seq_len], target[88, seq_len].
The dataset is normalized.
Since seq_len is different for each note, I had to create my own custom collate_fn:

def music_collate_fn(batch):
    data = np.zeros((252, 0), dtype=np.float)
    target = np.zeros((88, 0), dtype=np.int)
    for item in batch:
        data = np.concatenate((data, item[0]), axis=1)
        target = np.concatenate((target, item[1]), axis=1)

    # to tensor
    data = torch.from_numpy(data).float()
    target = torch.from_numpy(target).float()
    # transpose
    data = data.t()
    target = target.t()
    # move to GPU
    data = data.to(device)
    target = target.to(device)
    return data, target
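
I plug it into the DataLoader roughly like this (just a sketch; trainSet is my Dataset instance and batch_size=10 is only an example value):

# sketch: passing the custom collate_fn to the DataLoader
train_loader = torch.utils.data.DataLoader(trainSet, batch_size=10,
                                           collate_fn=music_collate_fn)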

The target is a binary matrix. I have 88 musical notes. If the note is number 0, I have a 1 in row 0 for every frame of input data that represents note 0. If the note is number 77, I have a 1 in row 77. I think I am on the right track. For example, the target of note 0 before concatenation looks like:

trainSet[0][1]
array([[1, 1, 1, ..., 1, 1, 1],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

I am trying to calculate loss in my program:
Train code:

mlp = rn.MLP(input_dim, hidden_dim, output_dim).to(device)
optimizer = torch.optim.SGD(mlp.parameters(), lr=0.001, momentum=0.5)

def train(model, train_loader, optimizer, criterion):
    model.train()
    temp_loss = []
    i = 0
    for batch_idx, (input, target) in enumerate(train_loader):
        outputs = model(input)
        loss = criterion(outputs, target)
        temp_loss.append(loss.item())
        optimizer.zero_grad()           # zero the gradients
        loss.backward()                 # back propagation
        optimizer.step()                # update the parameters
        i += 1
        if i % 5 == 0:
            print('----------TRAIN-------------- LOSS: ', loss)
            print("Loss: {}".format(np.mean(temp_loss)))

    return temp_loss

The test code is:

def test( model, test_loader, optimizer, criterion):
    model.eval()
    test_loss = 0
    correct = 0
    temp_loss = []
    i=0
    with torch.no_grad():
        for data, target in test_loader:

            outputs = model(data)
            loss = criterion(outputs, target)
            temp_loss.append(loss.item())
            if(i == 3):
                print('\n----------TEST--------------\nLOSS: ', loss.item())
                i = 0
            pred = outputs.data.max(1, keepdim=True)[1]
            target = target.data.max(1,keepdim = True)[1]
            correct += pred.eq(target).sum()
            i += 1
    
    print('\nTest set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
    loss, correct, target.shape[0],
    correct / target.shape[0]))
    
    return temp_loss

I am trying to calculate accuracy, but the prediction is always a low value, like 0.0001%, so the model never learns.

Furthermore, I am using F.binary_cross_entropy instead of nn.BCELoss, because it is the only way I get any results. If I use nn.CrossEntropyLoss, the error returned is:


  File "<ipython-input-9-5c699a4fd8ff>", line 1, in <module>
    losses += train(mlp, train_loader, optimizer, criterion)

  File "<ipython-input-7-072d93cf4aa0>", line 9, in train
    loss = criterion(outputs, target)

  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\loss.py", line 898, in __init__
    super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)

  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\loss.py", line 23, in __init__
    super(_WeightedLoss, self).__init__(size_average, reduce, reduction)

  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\loss.py", line 16, in __init__
    self.reduction = _Reduction.legacy_get_string(size_average, reduce)

  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\_reduction.py", line 42, in legacy_get_string
    if size_average and reduce:

RuntimeError: bool value of Tensor with more than one value is ambiguous

Are you dealing with a multi-label classification use case, i.e. for each time step more than a single note can be active?
If so, could you print the shapes of the data and target as I think I didn’t understand it properly.
Anyway, here is a small dummy example:

import torch
import torch.nn as nn

batch_size = 10
nb_features = 256
nb_classes = 88
seq_len = 5

x = torch.randn(batch_size, seq_len, nb_features)
target = torch.randint(0, 2, (batch_size, seq_len, nb_classes)).float()

model = nn.Sequential(
    nn.Linear(nb_features, nb_classes),
    nn.Sigmoid(),
)
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-0)

for epoch in range(1000):
    optimizer.zero_grad()
    output = model(x)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
    print('Epoch {}, loss {}'.format(epoch, loss.item()))

Hello @ptrblck and thanks so much for answer!

I am not sure if the classification is multi-label, because I have to recognize notes, but the target is represented with 0s and 1s. I have a dataset with data for 88 notes. When I create a batch in my custom collate function, I join as many notes as the batch size. If I have a batch of 10, I collect 10 notes, so the model has to recognize those 10 notes. In the next batch, it will have to recognize 10 more notes.

I printed the shapes of the input, target, and outputs while debugging the train function for the first and second batch. I should say that seq_len is different in each epoch.

First batch: 
Input.shape -> torch.Size([28471,252])
target.shape -> torch.Size([28471,88])
outputs.shape-> torch.Size([28471,88])

Second batch: 
Input.shape -> torch.Size([28146,252])
target.shape -> torch.Size([28146,88])
outputs.shape-> torch.Size([28146,88])

I am not sure if I am on the right track. I am new to PyTorch. Thanks so much.

Thanks for the information.
Let me rephrase it: for each row of the target tensor, is there only one active note or multiple ones?
I.e. are you using only single notes or chords as well?

For each row of the target tensor, there is just one active note.

Here is a simple example. If the batch size is 2, I have two notes.
If note 0 has a shape like [88, 5], I have a target like:

[[1,1,1,1,1],   # row 0
 [0,0,0,0,0],   # row 1
 ...
 [0,0,0,0,0]]   # row 87

If note 1 has a shape like [88, 3], I have a target like:

[[0,0,0],   # row 0
 [1,1,1],   # row 1
 ...
 [0,0,0]]   # row 87

After concatenating the arrays:

[[1,1,1,1,1,0,0,0],
 [0,0,0,0,0,1,1,1],
 [0,0,0,0,0,0,0,0],
 ...
 [0,0,0,0,0,0,0,0]]

This representation is before the transpose.
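
As a small numpy sketch (toy shapes from the example above, not my real code), the concatenation step would be:

import numpy as np

t0 = np.zeros((88, 5), dtype=np.int)   # toy target for note 0, seq_len 5
t0[0, :] = 1
t1 = np.zeros((88, 3), dtype=np.int)   # toy target for note 1, seq_len 3
t1[1, :] = 1
target = np.concatenate((t0, t1), axis=1)   # shape (88, 8); transposed afterwards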

In that case, you could use nn.CrossEntropyLoss and the target tensor should only contain the class indices.
Given the target shape of [batch_size, nb_classes], you can just call

target = torch.argmax(target, 1)

to get the class indices.
Also note that nn.CrossEntropyLoss expects logits as the model output, so remove any non-linearity for the last layer in case you are using some.
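
A small sketch of that pattern could look like this (the shapes and the random target are just placeholders):

import torch
import torch.nn as nn

batch_size, nb_features, nb_classes = 10, 252, 88

x = torch.randn(batch_size, nb_features)

# stand-in for a one-hot target with exactly one active class per row
one_hot = torch.zeros(batch_size, nb_classes)
one_hot[torch.arange(batch_size), torch.randint(0, nb_classes, (batch_size,))] = 1.
target = torch.argmax(one_hot, 1)            # class indices, shape [batch_size]

model = nn.Linear(nb_features, nb_classes)   # no sigmoid; CrossEntropyLoss expects logits
criterion = nn.CrossEntropyLoss()

output = model(x)                            # raw logits
loss = criterion(output, target)
loss.backward()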

I tried that, but it returns an error.

criterion = nn.CrossEntropyLoss
#criterion = F.binary_cross_entropy

def train(model, train_loader, optimizer, criterion):
    model.train()
    for batch_idx, (input, target) in enumerate(train_loader):
        optimizer.zero_grad()           # zero the gradients
        outputs = model(input)
        target = torch.argmax(target, 1)
        loss = criterion(outputs, target)
        temp_loss.append(loss.item())

        loss.backward()                 # back propagation
        optimizer.step()                # update the parameters

I get the following error:


  File "<ipython-input-17-1aec7df1fd6b>", line 69, in <module>
    train_losses += train(epoch, mlp, train_loader, optimizer, criterion)

  File "<ipython-input-17-1aec7df1fd6b>", line 15, in train
    loss = criterion(outputs, target)

  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\loss.py", line 898, in __init__
    super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)

  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\loss.py", line 23, in __init__
    super(_WeightedLoss, self).__init__(size_average, reduce, reduction)

  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\loss.py", line 16, in __init__
    self.reduction = _Reduction.legacy_get_string(size_average, reduce)

  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\_reduction.py", line 42, in legacy_get_string
    if size_average and reduce:

RuntimeError: bool value of Tensor with more than one value is ambiguous

You have to create an instance of nn.CrossEntropyLoss, i.e. you have to call the constructor using parentheses:

criterion = nn.CrossEntropyLoss()

Hello again @ptrblck. I am asking here because I don't know if it is a good idea to create a new topic. If you think I should create a new topic, I will do it.

Recently, I studied my dataset and found that I wasn't loading it correctly.

Now I concatenate all the examples correctly, and everything is working almost correctly.

I have six matrices: two for training, two for testing, and two for validation.

I deleted music_collate_fn because now I can use the default collate function.
I made some changes in my code. Now the Dataset class is much simpler:

class MusicDataSet(Dataset):
    def __init__(self, input_matrix, target_matrix, function=None):
        self.ms = input_matrix
        self.target = target_matrix

    def __len__(self):
        return len(self.ms)

    def __getitem__(self, idx):
        inp = torch.from_numpy(self.ms[idx].astype(np.float))
        target = torch.from_numpy(self.target[idx].astype(np.double))

        return inp, target

And the model is very simple:

class MLP(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(MLP, self).__init__()

        self.dense1 = nn.Linear(input_dim, hidden_dim[0]) 
        self.dense2 = nn.Linear(hidden_dim[0], hidden_dim[1])  
        self.dense3 = nn.Linear(hidden_dim[1], hidden_dim[2])      
        self.dense4 = nn.Linear(hidden_dim[2], output_dim)
        self.relu = nn.ReLU()
        self.sigm = nn.Sigmoid()
        self.dropout = nn.Dropout(0.05)
     
    def forward(self, x):
        x = self.dense1(x)
        x = self.relu(x)
        x = self.dense2(x)
        x = self.relu(self.dropout(x))
        x = self.dense3(x)
        x = self.relu(self.dropout(x))
        x = self.dense4(x)
        x = self.sigm(x)
        return x

Preprocess code:

X_train, Y_train, X_val, Y_val , X_test, Y_test = sd.loadDatasets()

trainSet = mds.MusicDataSet(X_train, Y_train)
valSet = mds.MusicDataSet(X_val, Y_val)
testSet = mds.MusicDataSet(X_test, Y_test)
train_loader = torch.utils.data.DataLoader(trainSet, batch_size=batch_size_train, shuffle=False)
test_loader = torch.utils.data.DataLoader(testSet, batch_size=batch_size_test, shuffle=False)
val_loader = torch.utils.data.DataLoader(valSet, batch_size=batch_size_val, shuffle=False)

input_dim = 252
hidden_dim = (500, 600, 100)
output_dim = 88
if CUDA:
    mlp = rn.MLP(input_dim, hidden_dim, output_dim).to('cuda')
    mlp.double()

optimizer = torch.optim.SGD(mlp.parameters(), lr = learning_rate, momentum=momentum)
criterion = nn.BCELoss()

All matrices are transposed when they are loaded.
Each row of the input data has 252 elements.
Each row of the target data has 88 elements.
trainSet has 656445 rows, and the other datasets have fewer.

I was checking the accuracy on the test set, but it is very bad, less than 50%. I can't understand why.

This is the code to train, validate, and test the model:

def train(model, train_loader, optimizer, criterion):
    model.train()
    train_loss = 0
    for idx_batch, (input, target) in enumerate(train_loader):
        optimizer.zero_grad()
        input = input.cuda()
        target = target.cuda()
        outputs = model(input)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
    return train_loss

def val(model, val_loader, optimizer, criterion):
    model.eval()
    val_loss = 0
    with torch.no_grad():
        for idx_batch, (input, target) in enumerate(val_loader):
            input = input.cuda()
            target = target.cuda()
            outputs = model(input)
            loss = criterion(outputs, target)
            val_loss += loss.item()
    return val_loss

for epoch in range(1, N_EPOCHS+1):
    print("------------EPOCHS ",epoch,"------------")
    train_loss = 0
    val_loss = 0
    train_loss = train(mlp, train_loader, optimizer, criterion)
    val_loss = val(mlp, val_loader, optimizer, criterion)
    
    #Mean of losses
    train_loss = train_loss/len(train_loader.dataset)
    val_loss = val_loss/len(val_loader.dataset)
    
    #Stats training and validation.
    print('Epoch: {} \tTraining loss: {:.6f} \tValidation loss:{:.6f}'.format(epoch, train_loss, val_loss))
    
#Testing
mlp.eval()
test_loss = 0
correct = 0
for data, target in test_loader:
    data = data.to('cuda')
    target = target.to('cuda')
    outputs = mlp(data)
    loss = criterion(outputs, target)
    test_loss += loss.item() #*data.size(0)????
    pred = torch.max(outputs, 1)[1]
    t = torch.max(target, 1)[1]
    correct += pred.eq(t).sum()

Thanks so much.

The code looks generally alright.
You might need to play around with some hyperparameters (e.g. learning rate, hidden size, etc.).
It might also be a good idea to scale down your use case a bit and try to overfit your model on a small data sample (e.g. use just 10 samples from the train dataset).
If your model is not able to overfit these samples perfectly, there might be e.g. some code bugs we missed.
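
A rough sketch of that idea, reusing your trainSet and train function (the subset indices and epoch count are arbitrary):

from torch.utils.data import DataLoader, Subset

# take e.g. just the first 10 training samples and try to overfit them
small_set = Subset(trainSet, list(range(10)))
small_loader = DataLoader(small_set, batch_size=10, shuffle=False)

for epoch in range(500):
    small_loss = train(mlp, small_loader, optimizer, criterion)
    print('Epoch {}, loss {:.6f}'.format(epoch, small_loss))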

Yes, it really works now. Thanks so much. I made a lot of changes to my code and had a lot of problems along the way. Thank you.


Hey,

I tried helping you in another thread you started for the same problem. From what you replied, the suggestions did not seem to help much. Could you please summarize a few that did help?

Did you end up using CrossEntropyLoss()? Did you tweak the hyperparameters more?

I had to use torch.argmax before calculating the loss to get a correct loss and be able to make progress in training.

I am now using CrossEntropyLoss() because, thinking about my problem, it is a multi-class problem. I have 88 possible classes represented with 0s and 1s, so I use torch.argmax to convert the one-hot rows into class indices.

I think I am on the right track.

Thanks.