Hello everyone (I’ve googled a lot before posting here, but at this point I’m kind of exhausted with PyTorch).
I want to pass my array of outputs through a sigmoid function and threshold the result at 0.5 (i.e. set every element lower than 0.5 to 0 and every element of 0.5 or higher to 1), then provide it to my loss function.
I’ve already tried to do something like this
function = torch.nn.Sigmoid()
outputs = (function(outputs)>=0.5).float()
in order to provide it to my loss function, but if I do that, PyTorch cannot differentiate through it.
Any thoughts are appreciated.
Comparison operations are not differentiable: the thresholded output is piecewise constant, so there is no useful gradient to backpropagate.
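To make that concrete, here is a minimal sketch of what happens to autograd tracking once the comparison is applied. It assumes a recent PyTorch release; the tensor names and values are made up for illustration and are not from the original post.

```python
import torch

# Hypothetical raw model outputs (logits) for 2 samples and 3 labels.
logits = torch.tensor([[0.3, -1.2, 2.0],
                       [1.5, 0.1, -0.7]], requires_grad=True)

probs = torch.sigmoid(logits)   # differentiable, still tracked by autograd
hard = (probs >= 0.5).float()   # comparison result, no longer tracked

print(probs.requires_grad)  # True
print(hard.requires_grad)   # False
print(hard.grad_fn)         # None
```

Since `hard` carries no gradient history, a loss computed from it cannot be backpropagated to the model parameters.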
Yes, I already know that, but I have a multi-label classification problem and I want my model to predict more than one label!
I see. Can you use BCELoss then?
From my search so far, I found that MultiLabelMarginLoss might be a suitable loss function for me.
def train_epoch(_epoch, dataloader, model, loss_function):
    # switch to train mode -> enable regularization layers, such as Dropout
    model.train()
    running_loss = 0.0

    for i_batch, sample_batched in enumerate(dataloader, 1):
        # get the inputs (batch)
        inputs, labels, lengths, indices = sample_batched

        # sort batch (for handling inputs of variable length)
        lengths, (inputs, labels) = sort_batch(lengths, (inputs, labels))

        # convert to CUDA Variables
        if torch.cuda.is_available():
            inputs = Variable(inputs.cuda())
            labels = Variable(labels.cuda())
            lengths = Variable(lengths.cuda())

        # 1 - zero the gradients
        optimizer.zero_grad()

        # 2 - forward pass: compute predicted y by passing x to the model
        outputs = model(inputs, lengths)

        # 3 - compute loss
        loss = loss_function(outputs, labels)

        # 4 - backward pass: compute gradient wrt model parameters
        loss.backward()

        # 5 - update weights
        optimizer.step()

        running_loss += loss.data[0]

        # print statistics
        progress(loss=loss.data[0],
                 epoch=_epoch,
                 batch=i_batch,
                 batch_size=BATCH_SIZE,
                 dataset_size=len(train_set))
I guess I have to somehow modify the outputs so that they are not continuous values when I want to predict the labels.
If you have a multi-label prediction problem, the best solution is to use BCELoss on the sigmoid outputs: torch.nn.BCELoss()(torch.sigmoid(output), target)
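A rough sketch of how this could look in practice: train on the continuous sigmoid outputs and apply the 0.5 threshold only when reading off predictions. The `nn.Linear` model, the input sizes, and the target values are placeholders I made up, not the poster's actual model. `BCEWithLogitsLoss` is included as an alternative that applies the sigmoid internally and is more numerically stable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real model: 4 input features, 3 labels.
model = nn.Linear(4, 3)
inputs = torch.randn(2, 4)
# Multi-hot targets as floats, same shape as the model output.
targets = torch.tensor([[1.0, 0.0, 1.0],
                        [0.0, 1.0, 0.0]])

logits = model(inputs)

# Training: keep the continuous sigmoid outputs so gradients can flow.
loss = nn.BCELoss()(torch.sigmoid(logits), targets)
loss.backward()

# Alternative: BCEWithLogitsLoss takes raw logits and applies the sigmoid itself.
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Prediction: threshold only here, outside the loss computation.
with torch.no_grad():
    predicted = (torch.sigmoid(model(inputs)) >= 0.5).float()
```

The thresholded `predicted` tensor is what you would compare against the labels for metrics such as accuracy or F1; it never enters the loss.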
Works like a charm, mate! Thanks a lot! By the way, does it matter that the target contains many vectors?
E.g. my targets have the form [ [1,1,1,1,1], [0,1,0,0,0,0], [0,1,0,0,1,1] ].
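For what it's worth, BCELoss expects the target to have the same shape as the input, so a batch of target vectors is fine as long as every row has one entry per label. A small sketch, assuming 6 labels per sample; the trailing 0 in the first row is my own padding for illustration, and the random logits stand in for real model outputs.

```python
import torch
import torch.nn as nn

# Hypothetical batch: 3 samples, 6 labels each, one row per sample.
targets = torch.tensor([[1, 1, 1, 1, 1, 0],
                        [0, 1, 0, 0, 0, 0],
                        [0, 1, 0, 0, 1, 1]]).float()  # BCELoss expects float targets

logits = torch.randn(3, 6)  # stand-in for model outputs of shape (batch, labels)
loss = nn.BCELoss()(torch.sigmoid(logits), targets)

print(targets.shape)  # torch.Size([3, 6])
print(loss.dim())     # 0
```

With the default reduction, the loss is averaged over all samples and labels into a single scalar.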