Unable to differentiate

Hello everyone (I’ve googled a lot before posting here, but at this point I’m kind of exhausted with PyTorch).
I want to pass my array of outputs through a sigmoid function and then threshold it at 0.5 (i.e. make every element lower than 0.5 a 0 and every element greater than or equal to 0.5 a 1) before providing it to my loss function.
I’ve already tried to do something like this:
function = torch.nn.Sigmoid()
outputs = (function(outputs) >= 0.5).float()
in order to provide it to my loss function, but if I do that, PyTorch cannot differentiate.
Any thoughts are appreciated 🙂

Comparison results are not mathematically differentiable functions: the thresholded output is piecewise constant, so its gradient is zero almost everywhere and autograd cannot propagate anything useful through it.
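You can check this directly: the comparison produces a tensor that is detached from the autograd graph. A minimal sketch (tensor names are just for illustration):

import torch

x = torch.randn(3, requires_grad=True)
probs = torch.sigmoid(x)        # differentiable: grad_fn is set
hard = (probs >= 0.5).float()   # comparison detaches from the graph

print(probs.requires_grad)  # True
print(hard.requires_grad)   # False -> hard.sum().backward() would raise an error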

Yes, I already know that, but I have a multi-label classification problem and I want my model to predict more than one label!

I see. Can you do BCELoss then?

From my searching so far, I found that MultiLabelMarginLoss might be a suitable loss function for me.
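If I understand the docs correctly, MultiLabelMarginLoss expects the targets as lists of positive class indices padded with -1, not as multi-hot vectors. A minimal sketch with made-up shapes:

import torch
import torch.nn as nn

loss_fn = nn.MultiLabelMarginLoss()

scores = torch.randn(2, 4)                # (batch, num_classes) raw scores
targets = torch.tensor([[0, 3, -1, -1],   # sample 0 has labels 0 and 3
                        [1, -1, -1, -1]]) # sample 1 has label 1; -1 pads the rest

loss = loss_fn(scores, targets)

Here is my current training loop for context: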

def train_epoch(_epoch, dataloader, model, loss_function):
    # switch to train mode -> enable regularization layers, such as Dropout
    model.train()
    running_loss = 0.0

    for i_batch, sample_batched in enumerate(dataloader, 1):

        # get the inputs (batch)
        inputs, labels, lengths, indices = sample_batched

        # sort batch (for handling inputs of variable length)
        lengths, (inputs, labels) = sort_batch(lengths, (inputs, labels))

        # convert to CUDA Variables
        if torch.cuda.is_available():
            inputs = Variable(inputs.cuda())
            labels = Variable(labels.cuda())
            lengths = Variable(lengths.cuda())

        # 1 - zero the gradients
        optimizer.zero_grad()

        # 2 - forward pass: compute predicted y by passing x to the model
        outputs = model(inputs, lengths)

        # 3 - compute loss
        loss = loss_function(outputs, labels)

        # 4 - backward pass: compute gradient wrt model parameters
        loss.backward()

        # 5 - update weights
        optimizer.step()

        running_loss += loss.data[0]

        # print statistics
        progress(loss=loss.data[0],
                 epoch=_epoch,
                 batch=i_batch,
                 batch_size=BATCH_SIZE,
                 dataset_size=len(train_set))

I guess I have to somehow modify the outputs so they are not continuous values when I want to predict the labels.

If you have a multi-label prediction problem, the best solution is to use BCELoss on the sigmoid of the outputs: nn.BCELoss()(torch.sigmoid(output), target).
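A minimal sketch of what that looks like for a multi-label batch (shapes and values are just for illustration); note that nn.BCEWithLogitsLoss fuses the sigmoid and the BCE into a single, numerically more stable call:

import torch
import torch.nn as nn

criterion = nn.BCELoss()

logits = torch.randn(3, 5, requires_grad=True)  # (batch, num_labels) raw model outputs
targets = torch.tensor([[1., 1., 1., 1., 1.],   # multi-hot rows: 1 = label present
                        [0., 1., 0., 0., 0.],
                        [0., 1., 0., 0., 1.]])

loss = criterion(torch.sigmoid(logits), targets)  # sigmoid keeps everything differentiable
loss.backward()                                   # gradients flow, unlike with hard thresholding

# at inference time you can still threshold, since no gradients are needed there:
predicted = (torch.sigmoid(logits) >= 0.5).float()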


Works like a charm, mate!! Thanks a lot! By the way, does it affect anything at all that the target has many vectors inside?
e.g. my targets have the form [ [1,1,1,1,1], [0,1,0,0,0,0], [0,1,0,0,1,1] ]?