PyTorch multi-label classification error (full example helpful!)


I’m trying to do something similar to this, except using PyTorch to predict a multi-label output with a simple neural network.

However, I can’t seem to get it to work. I keep getting the following error, even after casting with .double() and .float():
RuntimeError: expected scalar type Float but found Double


  1. Is the last layer's activation function correct for this case? Do you still use a log softmax?
  2. Is there a working example of my code?


from sklearn.model_selection import train_test_split
from sklearn.datasets import make_multilabel_classification
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

X, y = make_multilabel_classification(n_samples=5000, n_features=10,
                                      n_classes=2, random_state=0)

xtrain, xtest, ytrain, ytest=train_test_split(X, y, 
                                              train_size=0.95, random_state=0)

x_shape = xtrain.shape

loss_fn = torch.nn.BCEWithLogitsLoss()

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(x_shape[1], 128)
        self.fc2 = nn.Linear(128, 2)

    def forward(self, x):
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1) # is this correct?
        return output

use_cuda = torch.cuda.is_available()
lr = 1
batch_size = 100
gamma = 0.7
epochs = 20
args = {'log_interval': 10, 'dry_run':False}

kwargs = {'batch_size': batch_size}
if use_cuda:
    kwargs.update({'num_workers': 1,
                   'pin_memory': True,
                   'shuffle': True})

# fall back to the CPU when CUDA is unavailable, so `device` is always defined
device = torch.device("cuda" if use_cuda else "cpu")
model = Net().to(device)
X, y = torch.tensor(X), torch.tensor(y)
optimizer = optim.Adam(model.parameters(), lr=lr)

idx = 0
scheduler = StepLR(optimizer, step_size=1, gamma=gamma)

# batching not correct, but testing whether model can run
for epoch in range(1, epochs + 1):
    batch_x, batch_y  = X[idx:idx+batch_size].to(device), y[idx:idx+batch_size].to(device)
    output = model(batch_x)
    loss = loss_fn(output, batch_y)
    if idx % args['log_interval'] == 0:
        print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
            epoch, idx * len(batch_x), len(X),
            100. * idx / len(X), loss.item()))
        if args['dry_run']:
            break  # stop after the first logged batch on a dry run
    idx += batch_size


The type mismatch error is most likely raised because you are passing DoubleTensors as input data to the model, whose parameters are float32 by default. NumPy uses float64 as its default dtype, and since you are using sklearn to create the dataset, I guess you would need to convert the tensors to FloatTensors via batch_x = batch_x.float() before passing them to the model (same for batch_y).
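To make the dtype point concrete, here is a minimal sketch. It assumes a random NumPy array as a stand-in for the sklearn output (the names features, layer and x32 are just for illustration) and shows where the float64 comes from and how .float() resolves it:

```python
import numpy as np
import torch
import torch.nn as nn

# NumPy creates float64 arrays by default, and torch.tensor keeps that dtype.
features = np.random.rand(4, 10)   # hypothetical stand-in for the sklearn X
x = torch.tensor(features)
print(x.dtype)                     # torch.float64

# nn.Linear parameters are float32 unless the default dtype was changed.
layer = nn.Linear(10, 2)
print(layer.weight.dtype)          # torch.float32

# Casting the input to float32 makes it match the layer's parameters.
x32 = x.float()
out = layer(x32)
print(out.dtype, out.shape)
```

Casting once, right after creating the tensors (for example X = torch.tensor(X).float()), should work just as well as casting every batch.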

nn.BCEWithLogitsLoss expects raw logits, so you should remove the F.log_softmax activation.
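A small sketch of why no activation is needed: nn.BCEWithLogitsLoss applies the sigmoid internally, so it should match nn.BCELoss applied to sigmoid(logits). The logits and targets below are made-up values for illustration, and the 0.5 threshold is an assumption, not something from the original post:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Made-up raw model outputs (logits) for 4 samples and 2 labels.
logits = torch.randn(4, 2)
# Multi-label targets: each label is an independent 0/1 value, stored as float.
targets = torch.tensor([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])

# BCEWithLogitsLoss takes the logits directly ...
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)
# ... and is equivalent to applying a sigmoid first and then BCELoss.
loss_manual = nn.BCELoss()(torch.sigmoid(logits), targets)
print(loss_with_logits.item(), loss_manual.item())

# At prediction time, apply the sigmoid yourself and threshold each label.
probs = torch.sigmoid(logits)
preds = (probs > 0.5).int()
print(preds)
```

A softmax (or log softmax) forces the outputs to compete across classes, which fits single-label classification; for multi-label outputs each label gets its own sigmoid.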

Hey, thanks for your reply! I still can’t seem to get it to work…

After casting, it shows: RuntimeError: result type Float can't be cast to the desired output type Long

Is my network even correct…?

Your code works fine after adding the suggested changes:

output = model(batch_x.float())
loss = loss_fn(output, batch_y.float())

As for the network: no, it is not correct as written, since you should remove the F.log_softmax activation if you are using nn.BCEWithLogitsLoss.
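Since a full example was requested, here is a minimal end-to-end sketch that puts the suggestions together. It is not the original script. Its assumptions: the data is a synthetic torch-only stand-in for make_multilabel_classification, the learning rate is 1e-3 rather than the lr = 1 from the question (which is probably too large for Adam), and DataLoader handles the batching. The targets are stored as float, which is presumably what avoids the "can't be cast to ... Long" error when the labels arrive as integers:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in data: 500 samples, 10 features, 2 independent binary labels.
X = torch.randn(500, 10)                 # float32 by default
true_w = torch.randn(10, 2)
y = (X @ true_w > 0).float()             # targets must be float for BCEWithLogitsLoss

class Net(nn.Module):
    def __init__(self, n_features, n_labels):
        super().__init__()
        self.fc1 = nn.Linear(n_features, 128)
        self.fc2 = nn.Linear(128, n_labels)

    def forward(self, x):
        # Return raw logits: no softmax or sigmoid here.
        return self.fc2(torch.relu(self.fc1(x)))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Net(10, 2).to(device)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loader = DataLoader(TensorDataset(X, y), batch_size=100, shuffle=True)

epoch_losses = []
for epoch in range(1, 21):
    running = 0.0
    for batch_x, batch_y in loader:
        batch_x, batch_y = batch_x.to(device), batch_y.to(device)
        optimizer.zero_grad()                     # clear gradients from the last step
        loss = loss_fn(model(batch_x), batch_y)
        loss.backward()                           # backpropagate
        optimizer.step()                          # update the weights
        running += loss.item() * len(batch_x)
    epoch_losses.append(running / len(X))
    print(f"epoch {epoch:2d}  mean loss {epoch_losses[-1]:.4f}")
```

With the sklearn data from the question, the same loop should apply after X = torch.tensor(X).float() and y = torch.tensor(y).float().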