1D segmentation " multi-target not supported "

I’m trying to implement 1D segmentation for geological depth data. The input is a sequence of length 1100 (each point representing a different depth level), and the output should assign a category (0-5) to each point.

Here’s the end-to-end code to reproduce the issue with artificial data.

https://pastebin.com/fxQCFxtv


import numpy as np
import torch
import torch.nn as nn
from torch.utils import data

X = np.random.rand(4000, 1100).astype('float32')
y = np.random.randint(0, 5, size=(4000, 1100)).astype('int64')
train_ds = data.TensorDataset(torch.from_numpy(X[:-800]).unsqueeze(-1), torch.from_numpy(y[:-800]).unsqueeze(-1))
valid_ds = data.TensorDataset(torch.from_numpy(X[-800:]).unsqueeze(-1), torch.from_numpy(y[-800:]).unsqueeze(-1))


batch_size = 32
n_iters = 3000
num_epochs = n_iters / (len(train_ds) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_ds, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=valid_ds, 
                                          batch_size=batch_size, 
                                          shuffle=False)

'''
STEP 3: CREATE MODEL CLASS
'''

class LSTMModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(LSTMModel, self).__init__()
        # Hidden dimensions
        self.hidden_dim = hidden_dim

        # Number of hidden layers
        self.layer_dim = layer_dim

        # Building your LSTM
        # batch_first=True causes input/output tensors to be of shape
        # (batch_dim, seq_dim, feature_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)

        # Readout layer
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Initialize hidden state with zeros
        #######################
        #  USE GPU FOR MODEL  #
        #######################
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_().to(device)

        # Initialize cell state
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_().to(device)

        # Run the LSTM over the full sequence
        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        # Index hidden state of last time step
        # out.size() --> batch_size, seq_len, hidden_dim
        # out[:, -1, :] --> batch_size, hidden_dim --> just want last time step hidden states!
        out = self.fc(out[:, -1, :])
        # out.size() --> batch_size, output_dim
        return out

'''
STEP 4: INSTANTIATE MODEL CLASS
'''
input_dim = 1
hidden_dim = 100
layer_dim = 3  # number of stacked LSTM layers
output_dim = 5

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

#######################
#  USE GPU FOR MODEL  #
#######################

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()

'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.1

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)  

'''
STEP 7: TRAIN THE MODEL
'''

# Number of steps to unroll
seq_dim = 1100  

iter = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Load inputs and move them to the device
        #######################
        #  USE GPU FOR MODEL  #
        #######################
        images = images.view(-1, seq_dim, input_dim).requires_grad_().to(device)
        labels = labels.to(device)

        # Clear gradients w.r.t. parameters
        optimizer.zero_grad()

        # Forward pass to get output/logits
        # outputs.size() --> batch_size, output_dim
        outputs = model(images)

        # Calculate Loss: softmax --> cross entropy loss
        print(outputs.shape, labels.shape)
        print(outputs.dtype, labels.dtype)

        loss = criterion(outputs, labels)

        # Getting gradients w.r.t. parameters
        loss.backward()

        # Updating parameters
        optimizer.step()

        iter += 1

        if iter % 500 == 0:
            # Calculate Accuracy         
            correct = 0
            total = 0
            # Iterate through test dataset
            for images, labels in test_loader:
                #######################
                #  USE GPU FOR MODEL  #
                #######################
                images = images.view(-1, seq_dim, input_dim).to(device)
                labels = labels.to(device)

                # Forward pass only to get logits/output
                outputs = model(images)

                # Get predictions from the maximum value
                _, predicted = torch.max(outputs.data, 1)

                # Total number of labels
                total += labels.size(0)

                # Total correct predictions
                #######################
                #  USE GPU FOR MODEL  #
                #######################
                if torch.cuda.is_available():
                    correct += (predicted.cpu() == labels.cpu()).sum()
                else:
                    correct += (predicted == labels).sum()

            accuracy = 100 * correct / total

            # Print Loss
            print('Iteration: {}. Loss: {}. Accuracy: {}'.format(iter, loss.item(), accuracy))

Thanks!

Could you post the code here, wrapped in three backticks (```), please? I’m unable to visit the linked website.

Sure, I’ve updated the post. I would also like to compare this to a simple 1D U-Net, so if you foresee any caveats with that, your feedback would be great.

Thanks for the code.
nn.CrossEntropyLoss expects a multi-class target (not a multi-label target, where each sample can belong to more than a single class).
In your case, you should provide the target as:

X = np.random.rand(4000, 1100).astype('float32')
y = np.random.randint(0, 5, size=(4000,)).astype('int64')
train_ds = data.TensorDataset(torch.from_numpy(X[:-800]).unsqueeze(-1), torch.from_numpy(y[:-800]))
valid_ds = data.TensorDataset(torch.from_numpy(X[-800:]).unsqueeze(-1), torch.from_numpy(y[-800:]))
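
For reference, a minimal sketch of the shape convention nn.CrossEntropyLoss expects for a multi-class target: logits of shape (N, C) and a LongTensor of class indices with shape (N,). The shapes below are illustrative, not taken from the original code; with extra per-sample dimensions on the target, as in the original post, older PyTorch versions raise the "multi-target not supported" error.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Illustrative shapes: a batch of 32 samples and 5 classes
logits = torch.randn(32, 5)          # (N, C) raw scores from the model
target = torch.randint(0, 5, (32,))  # (N,)  one class index per sample

loss = criterion(logits, target)     # works

# A target with extra dimensions per sample (as in the original code)
# bad_target = torch.randint(0, 5, (32, 1100, 1))
# criterion(logits, bad_target)  # RuntimeError: multi-target not supported
```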

What would be the correct loss to use for a multi-label, multi-class target for a segmentation task where each of the 1100 time steps needs to be labeled from 0 to 5, similar to a part-of-speech tagging scenario?

You could use nn.BCEWithLogitsLoss for a multi-label classification.
Each output unit would correspond to a class activation, so that your target should be a FloatTensor with the same shape as your output.
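
A minimal sketch of that suggestion, assuming 5 classes and per-time-step multi-hot targets (the batch size, sequence length, and class count are just placeholders):

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

batch_size, seq_len, num_classes = 32, 1100, 5

# Per-time-step logits, one output unit per class
logits = torch.randn(batch_size, seq_len, num_classes)

# Multi-label target: FloatTensor with the same shape as the output,
# with a 1.0 for every class that is active at a given time step
target = torch.randint(0, 2, (batch_size, seq_len, num_classes)).float()

loss = criterion(logits, target)
```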

Got it. I was surprised, though, because for instance here, which is basically the exact same problem setup minus the first embedding layer, they use NLLLoss, which for me fails with the same error. I’m looking at all the dimensions and they seem the same, so I will keep debugging. Thanks

https://pytorch.org/tutorials/beginner/nlp/sequence_models_tutorial.html

I’m no expert in NLP, but it looks like multi-class classification, i.e. each output should correspond to only a single class.
This explanation also seems to go in this direction.
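
For completeness, a minimal sketch of the per-time-step multi-class setup used in the linked tutorial, with illustrative shapes: every time step is treated as its own multi-class sample, so NLLLoss receives log-probabilities of shape (seq_len, num_classes) and a target of class indices with shape (seq_len,). Using nn.CrossEntropyLoss directly on the raw logits is equivalent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

seq_len, num_classes = 1100, 5

# Per-time-step scores for a single sequence (e.g. from an LSTM + Linear head)
scores = torch.randn(seq_len, num_classes)
target = torch.randint(0, num_classes, (seq_len,))  # one class index per time step

# Tutorial-style: log-softmax + NLLLoss
nll = nn.NLLLoss()
loss_nll = nll(F.log_softmax(scores, dim=1), target)

# Equivalent: raw logits + CrossEntropyLoss
ce = nn.CrossEntropyLoss()
loss_ce = ce(scores, target)
```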