Multi Inputs and Outputs - PyTorch

Dear Experts,
I need to predict five outputs (y1, y2, y3, y4, y5) from 32 inputs (x1, x2, x3, …, x32).
The inputs are a mix of categorical and ordinal variables, which I can handle with some encoding.
I have read several PyTorch examples, but I got confused.
So I need a straightforward example or tutorial. I also have a question about hidden layers: I read that there is no fixed formula for the number of layers and that it comes down to trial and error. I might be wrong.
Could you please assist me?

Could you explain your confusion or which tutorial makes you feel confused?

Regarding the number of hidden layers: that’s more or less true. If you don’t have a specific architecture in mind, you could start with some “working” models and adapt one of them to your use case.

Thanks ptrblck.
In fact, I have looked at several examples and most of them deal with images, while my scenario is different. I implemented it successfully with Keras and TensorFlow, but now I am looking for a PyTorch version.


If you have a working model, you could port it to PyTorch.
Let us know, if you get stuck or if your ported model doesn’t work.

Thanks again. Could you please provide an example of such a case so I can start working on it?

It’s a bit hard to give you an example without more information about the desired architecture.
This model would take 32 input features and output logits for 5 classes:

import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self, in_features, nb_classes):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(in_features, 64)   # hidden layer with 64 units
        self.fc2 = nn.Linear(64, nb_classes)    # output layer returning raw logits

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = MyModel(32, 5)
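
As a quick sanity check, here is a minimal sketch that pushes a random batch through this model and inspects the output shape. The batch size of 8 and the random inputs are placeholders chosen for illustration, not real data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyModel(nn.Module):
    def __init__(self, in_features, nb_classes):
        super().__init__()
        self.fc1 = nn.Linear(in_features, 64)
        self.fc2 = nn.Linear(64, nb_classes)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return self.fc2(x)


model = MyModel(32, 5)
x = torch.randn(8, 32)   # hypothetical batch: 8 samples, 32 features each
out = model(x)           # one row of 5 logits per sample
print(out.shape)         # torch.Size([8, 5])
```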

Thanks. At least it is a good starting point.

Hi again,
I have started with this example but I got an error: RuntimeError: 1D target tensor expected, multi-target not supported.

import numpy as np
import pandas as pd
import torch
import torch.nn as nn

class Model(nn.Module):

    def __init__(self, embedding_size, num_numerical_cols, output_size, layers, p=0.4):
        super().__init__()
        self.all_embeddings = nn.ModuleList([nn.Embedding(ni, nf) for ni, nf in embedding_size])
        self.embedding_dropout = nn.Dropout(p)
        self.batch_norm_num = nn.BatchNorm1d(num_numerical_cols)

        all_layers = []
        num_categorical_cols = sum((nf for ni, nf in embedding_size))
        input_size = num_categorical_cols + num_numerical_cols

        for i in layers:
            all_layers.append(nn.Linear(input_size, i))
            input_size = i

        all_layers.append(nn.Linear(layers[-1], output_size))

        self.layers = nn.Sequential(*all_layers)

    def forward(self, x_categorical, x_numerical):
        # embed each categorical column separately, then concatenate the results
        embeddings = []
        for i,e in enumerate(self.all_embeddings):
            embeddings.append(e(x_categorical[:,i]))
        x = torch.cat(embeddings, 1)
        x = self.embedding_dropout(x)

        x_numerical = self.batch_norm_num(x_numerical)
        x = torch.cat([x, x_numerical], 1)
        x = self.layers(x)
        return x

dataset = pd.read_csv('Optomiz_DataSet3.csv')


# start splitting inputs to categorical and numerical 
categorical_columns = ['c1', 'c2', 'c3', 'c4', 'c5']
numerical_columns = ['a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'a10', 'a11', 'a12', 'a13', 'a14', 'a15', 'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23', 'a24', 'a25', 'a26', 'a27', 'a28']

# o1 is categorical and others are numericals 
outputs =['o1', 'o2', 'o3', 'o4', 'o5', 'o6', 'o7', 'o8']

for category in categorical_columns:
    dataset[category] = dataset[category].astype('category')

dataset['o1'] = dataset['o1'].astype('category')

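# encoding inputs categorical data as integer codes
c1 = dataset['c1'].cat.codes.values
c2 = dataset['c2'].cat.codes.values
c3 = dataset['c3'].cat.codes.values
c4 = dataset['c4'].cat.codes.values
c5 = dataset['c5'].cat.codes.values

# encoding output categorical data as integer codes
dataset['o1'] = dataset['o1'].cat.codes.values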

categorical_data = np.stack([c1, c2, c3, c4, c5], 1)

categorical_data = torch.tensor(categorical_data, dtype=torch.int64)

numerical_data = np.stack([dataset[col].values for col in numerical_columns], 1)
numerical_data = torch.tensor(numerical_data, dtype=torch.float)

outputs_data = np.stack([dataset[col].values for col in outputs], 1)
outputs_data = torch.tensor(outputs_data,dtype=torch.float)

categorical_column_sizes = [len(dataset[column].cat.categories) for column in categorical_columns]

categorical_embedding_sizes = [(col_size, min(50, (col_size+1)//2)) for col_size in categorical_column_sizes]

total_records = 9369
test_records = int(total_records * .2)

categorical_train_data = categorical_data[:total_records-test_records]
categorical_test_data = categorical_data[total_records-test_records:total_records]
numerical_train_data = numerical_data[:total_records-test_records]
numerical_test_data = numerical_data[total_records-test_records:total_records]
train_outputs = outputs_data[:total_records-test_records]
test_outputs = outputs_data[total_records-test_records:total_records]

# 5 is output_size, I believe that there is a mistake here
model = Model(categorical_embedding_sizes, numerical_data.shape[1], 5, [200,100,50], p=0.4)

loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

epochs = 300
aggregated_losses = []

for i in range(epochs):
    i += 1
    y_pred = model(categorical_train_data, numerical_train_data)

    # I got runtime error with below function: RuntimeError: 1D target tensor expected, multi-target not supported
    single_loss = loss_function(y_pred, train_outputs)


    if i%25 == 1:
        print(f'epoch: {i:3} loss: {single_loss.item():10.8f}')


print(f'epoch: {i:3} loss: {single_loss.item():10.10f}')

So, what is the issue and why?

Thanks again.

nn.CrossEntropyLoss expects the target as a LongTensor containing the class indices.
Could you check the shape and values of train_outputs?
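
To make the expected target format concrete, here is a minimal sketch with made-up values: the logits have shape [batch_size, nb_classes], and the target is a 1D tensor of dtype torch.int64 holding one class index per sample. The batch size of 6 and the 4 classes are assumptions for illustration.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

logits = torch.randn(6, 4)                  # [batch_size, nb_classes], raw model output
target = torch.tensor([3, 3, 3, 0, 0, 0])   # [batch_size], class indices in [0, nb_classes - 1]

loss = criterion(logits, target)            # scalar tensor
print(target.shape, target.dtype)           # torch.Size([6]) torch.int64
```

Recent PyTorch releases also accept a floating point target with the same shape as the logits and interpret it as class probabilities, so the exact error message for a mismatched target may differ between versions.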

train_outputs shape:  torch.Size([7496, 8])
train_outputs values:  tensor([[3.0000e+00, 9.2000e+02, 3.6000e+01,  ..., 2.3280e+03, 0.0000e+00,
        [3.0000e+00, 9.2000e+02, 3.6000e+01,  ..., 2.3280e+03, 0.0000e+00,
        [3.0000e+00, 9.2000e+02, 3.7000e+01,  ..., 2.3280e+03, 0.0000e+00,
        [0.0000e+00, 9.2000e+02, 6.1000e+01,  ..., 0.0000e+00, 2.1450e+03,
        [0.0000e+00, 9.2000e+02, 6.1000e+01,  ..., 0.0000e+00, 2.1450e+03,
        [0.0000e+00, 9.2000e+02, 6.1000e+01,  ..., 0.0000e+00, 2.1450e+03,

Based on your code, I assume you are working on a multi-class classification use case.
However, the target seems to be a regression target?
Could you explain your use case and how the target is defined?

Sure. I am trying to predict outputs ([‘o1’, ‘o2’, ‘o3’, ‘o4’, ‘o5’, ‘o6’, ‘o7’, ‘o8’]) based on some inputs.
I found this example :

So I worked through it, then decided to make it more complex by adding more inputs and predicting more outputs (one of which is categorical).

If you believe this tutorial is not good, please feel free to point me to a better example or tutorial.

In the linked tutorial outputs[:5] seems to have values of

tensor([1, 0, 1, 0, 0])

which is a LongTensor with class indices for two classes, while your train_outputs tensor shows floating point values, which will not work.

Could you compare your code to the tutorial again and make sure to use the right tensor as the target?
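
If the first column of train_outputs is meant to hold the class index for o1 and the remaining columns hold numerical targets, the tensor could be split as in the sketch below. The small stand-in tensor and its values are made up for illustration, and the column layout is an assumption based on the order of the outputs list.

```python
import torch

# Stand-in for train_outputs: column 0 is assumed to hold the class index of o1,
# columns 1..7 the numerical targets o2..o8 (values are invented).
train_outputs = torch.tensor([
    [3.0, 920.0, 36.0, 1.0, 2.0, 3.0, 2328.0, 0.0],
    [0.0, 920.0, 61.0, 1.0, 2.0, 3.0, 0.0, 2145.0],
])

class_target = train_outputs[:, 0].long()   # shape [N], dtype torch.int64
reg_target = train_outputs[:, 1:]           # shape [N, 7], dtype torch.float32

print(class_target)       # tensor([3, 0])
print(reg_target.shape)   # torch.Size([2, 7])
```

Only class_target has the shape and dtype that nn.CrossEntropyLoss expects for class indices; the floating point columns are regression targets and need a different loss.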

Thanks again for your useful comments.
I totally agree with your comments about the tutorial, but I have changed it to predict multiple outputs. I want to go further with more complex situations to see the power of PyTorch. I have implemented it with Keras and now I want to compare the results with PyTorch.
The tutorial predicts a single output with values (1, 0), while I am trying to predict 8 outputs, each with several possible values: o1 is categorical with 4 classes, while o2-o8 are numerical.

I don’t know whether PyTorch can handle such complex use cases. I believe it can, but I need a little push to implement the outputs.

Yes, that’s what I want: how to use the right tensor for this use case. I am new to the PyTorch world.

Thanks again

For this mixed use case, you could use two different model output layers, where the first one would have 4 output units and return logits for all 4 classes, while the second output layer would have 7 output units for the regression task.
For the first use case, you should calculate the loss using nn.CrossEntropyLoss, while for the second you could use nn.MSELoss, and finally sum these losses together before calling backward().
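
The two-head idea can be sketched as follows. This is a minimal illustration, not a tuned architecture: the name TwoHeadModel, the single hidden layer of 64 units, the 32 plain input features, and the random placeholder data are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoHeadModel(nn.Module):
    def __init__(self, in_features, nb_classes, nb_reg_targets, hidden=64):
        super().__init__()
        self.body = nn.Linear(in_features, hidden)          # shared layer
        self.cls_head = nn.Linear(hidden, nb_classes)       # logits for o1
        self.reg_head = nn.Linear(hidden, nb_reg_targets)   # values for o2..o8

    def forward(self, x):
        x = F.relu(self.body(x))
        return self.cls_head(x), self.reg_head(x)


model = TwoHeadModel(in_features=32, nb_classes=4, nb_reg_targets=7)
cls_criterion = nn.CrossEntropyLoss()
reg_criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random placeholder batch of 16 samples
x = torch.randn(16, 32)
class_target = torch.randint(0, 4, (16,))   # class indices, dtype torch.int64
reg_target = torch.randn(16, 7)             # float regression targets

# One training step: sum both losses, then backpropagate once
optimizer.zero_grad()
logits, reg_out = model(x)
loss = cls_criterion(logits, class_target) + reg_criterion(reg_out, reg_target)
loss.backward()
optimizer.step()
```

If the regression targets live on a much larger scale than the classification loss, the MSE term can dominate the sum, so normalizing the targets or weighting the two terms may be worth trying.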

The general use case of a “mixed prediction” would also work with a single output layer, where you would have to index the specific locations in its output. I’m not sure how well this approach would work, but you might just try it out. :wink:
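
A minimal sketch of the single-layer variant, assuming the first 4 output units are treated as class logits and the remaining 7 as regression outputs. The layer sizes and random data are placeholders.

```python
import torch
import torch.nn as nn

nb_classes, nb_reg_targets = 4, 7

# One output layer with 4 + 7 = 11 units
model = nn.Sequential(
    nn.Linear(32, 64),
    nn.ReLU(),
    nn.Linear(64, nb_classes + nb_reg_targets),
)

x = torch.randn(16, 32)
class_target = torch.randint(0, nb_classes, (16,))
reg_target = torch.randn(16, nb_reg_targets)

out = model(x)                    # [16, 11]
logits = out[:, :nb_classes]      # first 4 columns: class logits
reg_out = out[:, nb_classes:]     # last 7 columns: regression outputs

loss = nn.CrossEntropyLoss()(logits, class_target) + nn.MSELoss()(reg_out, reg_target)
loss.backward()
```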

Thanks again. Let me see how I can work on it :grin: