Dear Experts,

I need to predict five outputs (y1, y2, y3, y4, y5) from 32 given inputs (x1, x2, x3, …, x32).

The inputs are a mix of categorical and ordinal variables, which can be handled with a suitable encoding.

I have read several PyTorch examples, but I got confused.

So I need a straightforward example or tutorial. I also have a question about hidden layers: I read that there is no fixed formula for the number of layers and that it comes down to trial and error. I might be wrong.

Could you please assist me?

Regards

Could you explain your confusion or which tutorial makes you feel confused?

That’s more or less true. If you don’t have a specific architecture in mind, you could start with some “working” models and adapt the model for your use case.

Thanks ptrblck.

In fact, I have searched for several examples, and most of them deal with images, while my scenario is different. I successfully implemented it with Keras and TensorFlow, but I am looking for a PyTorch version.

Regards

If you have a working model, you could port it to PyTorch.

Let us know, if you get stuck or if your ported model doesn’t work.

Thanks again. Could you please provide an example of such a case so I can start working on it?

Thanks

It’s a bit hard to give you an example without more information about the desired architecture.

This model would take 32 input features and output logits for 5 classes:

```
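import torch.nn as nn
import torch.nn.functional as F


class MyModel(nn.Module):
    def __init__(self, in_features, nb_classes):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(in_features, 64)  # hidden layer with 64 units
        self.fc2 = nn.Linear(64, nb_classes)   # one logit per class

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = self.fc2(x)  # raw logits, no softmax applied
        return x


model = MyModel(32, 5)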
```
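To show how this model could be trained, here is a minimal sketch of a single training step. The random inputs, the batch size of 16, and the choice of Adam are assumptions for illustration only; your real data and hyperparameters would replace them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyModel(nn.Module):
    def __init__(self, in_features, nb_classes):
        super().__init__()
        self.fc1 = nn.Linear(in_features, 64)
        self.fc2 = nn.Linear(64, nb_classes)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))


model = MyModel(32, 5)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Hypothetical data: 16 samples with 32 features, class indices in [0, 5)
x = torch.randn(16, 32)
target = torch.randint(0, 5, (16,))

logits = model(x)                 # shape [16, 5]
loss = criterion(logits, target)  # scalar loss

optimizer.zero_grad()
loss.backward()
optimizer.step()
```
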

Thanks. At least it is a good starting point.

Hi again,

I have started with this example but I got an error: RuntimeError: 1D target tensor expected, multi-target not supported.

```
import numpy as np
import pandas as pd
import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self, embedding_size, num_numerical_cols, output_size, layers, p=0.4):
        super().__init__()
        # one embedding table per categorical column
        self.all_embeddings = nn.ModuleList([nn.Embedding(ni, nf) for ni, nf in embedding_size])
        self.embedding_dropout = nn.Dropout(p)
        self.batch_norm_num = nn.BatchNorm1d(num_numerical_cols)

        all_layers = []
        num_categorical_cols = sum((nf for ni, nf in embedding_size))
        input_size = num_categorical_cols + num_numerical_cols

        # hidden layers: Linear -> ReLU -> BatchNorm -> Dropout
        for i in layers:
            all_layers.append(nn.Linear(input_size, i))
            all_layers.append(nn.ReLU(inplace=True))
            all_layers.append(nn.BatchNorm1d(i))
            all_layers.append(nn.Dropout(p))
            input_size = i

        all_layers.append(nn.Linear(layers[-1], output_size))
        self.layers = nn.Sequential(*all_layers)

    def forward(self, x_categorical, x_numerical):
        embeddings = []
        for i, e in enumerate(self.all_embeddings):
            embeddings.append(e(x_categorical[:, i]))
        x = torch.cat(embeddings, 1)
        x = self.embedding_dropout(x)

        x_numerical = self.batch_norm_num(x_numerical)
        x = torch.cat([x, x_numerical], 1)
        x = self.layers(x)
        return x


dataset = pd.read_csv('Optomiz_DataSet3.csv')

# start splitting inputs to categorical and numerical
categorical_columns = ['c1', 'c2', 'c3', 'c4', 'c5']
numerical_columns = ['a1', 'a2', 'a3', 'a4', 'a5', 'a6', 'a7', 'a8', 'a9', 'a10', 'a11', 'a12', 'a13', 'a14',
                     'a15', 'a16', 'a17', 'a18', 'a19', 'a20', 'a21', 'a22', 'a23', 'a24', 'a25', 'a26',
                     'a27', 'a28']
# o1 is categorical and others are numericals
outputs = ['o1', 'o2', 'o3', 'o4', 'o5', 'o6', 'o7', 'o8']

for category in categorical_columns:
    dataset[category] = dataset[category].astype('category')
dataset['o1'] = dataset['o1'].astype('category')

# encoding inputs categorical data
c1 = dataset['c1'].cat.codes.values
c2 = dataset['c2'].cat.codes.values
c3 = dataset['c3'].cat.codes.values
c4 = dataset['c4'].cat.codes.values
c5 = dataset['c5'].cat.codes.values

# encoding output categorical data
dataset['o1'] = dataset['o1'].cat.codes.values

categorical_data = np.stack([c1, c2, c3, c4, c5], 1)
categorical_data = torch.tensor(categorical_data, dtype=torch.int64)
numerical_data = np.stack([dataset[col].values for col in numerical_columns], 1)
numerical_data = torch.tensor(numerical_data, dtype=torch.float)
outputs_data = np.stack([dataset[col].values for col in outputs], 1)
outputs_data = torch.tensor(outputs_data, dtype=torch.float)

categorical_column_sizes = [len(dataset[column].cat.categories) for column in categorical_columns]
categorical_embedding_sizes = [(col_size, min(50, (col_size + 1) // 2)) for col_size in categorical_column_sizes]

# 80/20 train/test split by row order
total_records = 9369
test_records = int(total_records * .2)
categorical_train_data = categorical_data[:total_records - test_records]
categorical_test_data = categorical_data[total_records - test_records:total_records]
numerical_train_data = numerical_data[:total_records - test_records]
numerical_test_data = numerical_data[total_records - test_records:total_records]
train_outputs = outputs_data[:total_records - test_records]
test_outputs = outputs_data[total_records - test_records:total_records]

# 5 is output_size, I believe that there is a mistake here
model = Model(categorical_embedding_sizes, numerical_data.shape[1], 5, [200, 100, 50], p=0.4)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

epochs = 300
aggregated_losses = []
for i in range(epochs):
    i += 1
    y_pred = model(categorical_train_data, numerical_train_data)
    # I got runtime error with below function: RuntimeError: 1D target tensor expected, multi-target not supported
    single_loss = loss_function(y_pred, train_outputs)
    aggregated_losses.append(single_loss)

    if i % 25 == 1:
        print(f'epoch: {i:3} loss: {single_loss.item():10.8f}')

    optimizer.zero_grad()
    single_loss.backward()
    optimizer.step()

print(f'epoch: {i:3} loss: {single_loss.item():10.10f}')
```

So, what is the issue and why?

Thanks again.

`nn.CrossEntropyLoss` expects the target as a `LongTensor` containing the class indices.

Could you check the shape and values of `train_outputs`?
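As a quick illustration of what the criterion accepts in the class-index case, here is a minimal sketch with made-up logits and targets (the batch size of 4 and the 5 classes are assumptions): the model output has shape `[batch_size, nb_classes]`, while the target is a 1D `LongTensor` of shape `[batch_size]`.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Hypothetical values: 4 samples, 5 classes
logits = torch.randn(4, 5)           # model output, shape [batch_size, nb_classes]
target = torch.tensor([3, 0, 1, 4])  # class indices, shape [batch_size], dtype torch.int64

loss = criterion(logits, target)
print(target.dtype, target.shape, loss.item())
```
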

```
train_outputs shape: torch.Size([7496, 8])
train_outputs values: tensor([[3.0000e+00, 9.2000e+02, 3.6000e+01, ..., 2.3280e+03, 0.0000e+00,
0.0000e+00],
[3.0000e+00, 9.2000e+02, 3.6000e+01, ..., 2.3280e+03, 0.0000e+00,
0.0000e+00],
[3.0000e+00, 9.2000e+02, 3.7000e+01, ..., 2.3280e+03, 0.0000e+00,
0.0000e+00],
...,
[0.0000e+00, 9.2000e+02, 6.1000e+01, ..., 0.0000e+00, 2.1450e+03,
1.7000e+00],
[0.0000e+00, 9.2000e+02, 6.1000e+01, ..., 0.0000e+00, 2.1450e+03,
1.7000e+00],
[0.0000e+00, 9.2000e+02, 6.1000e+01, ..., 0.0000e+00, 2.1450e+03,
1.7000e+00]])
```

Based on your code, I assume you are working on a multi-class classification use case.

However, the target seems to be a regression target?

Could you explain your use case and how the target is defined?

Sure. I am trying to predict outputs ([‘o1’, ‘o2’, ‘o3’, ‘o4’, ‘o5’, ‘o6’, ‘o7’, ‘o8’]) based on some inputs.

I found this example : https://stackabuse.com/introduction-to-pytorch-for-classification/

So, I worked on it, then I decided to make it more complex by adding more inputs and predicting more outputs (one of the outputs is categorical).

If you believe this tutorial is not good, please feel free to point me to a better example or tutorial.

Thanks.

In the linked tutorial `outputs[:5]` seems to have values of

```
tensor([1, 0, 1, 0, 0])
```

which is a `LongTensor` with class indices for two classes, while your `train_outputs` tensor shows floating point values, which will not work.

Could you compare your code to the tutorial again and make sure to use the right tensor as the target?
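If the first column of `train_outputs` holds the class codes for `o1` (as the comment in the posted code suggests), one way to get a valid classification target is to slice that column out and cast it to `long`. This is only a sketch: the small tensor below is a hypothetical stand-in for the real data, and the names `class_target` and `reg_target` are made up for illustration.

```python
import torch

# Hypothetical stand-in for train_outputs: column 0 is the class code of o1,
# columns 1-7 are the numerical targets o2..o8
train_outputs = torch.tensor([
    [3.0, 920.0, 36.0, 1.0, 2.0, 2328.0, 0.0, 0.0],
    [0.0, 920.0, 61.0, 1.0, 2.0, 0.0, 2145.0, 1.7],
])

class_target = train_outputs[:, 0].long()  # shape [N], dtype torch.int64
reg_target = train_outputs[:, 1:]          # shape [N, 7], dtype torch.float32

print(class_target, reg_target.shape)
```
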

Thanks again for your useful comments.

I totally agree with your comments about the tutorial, but I have changed it to predict multiple outputs. I want to go further with more complex situations to see the power of PyTorch. I have implemented it with Keras, and now I want to compare the outcomes with PyTorch.

The predicted output in the tutorial is a single output with values (1, 0), while I am trying to predict 8 outputs, each of which has several values, i.e. o1 is categorical and contains 4 values, while o2-o8 are numerical.

I don’t know whether PyTorch can handle such complex use cases or not. I believe it does, but I need a little push to solve the implementation of the outputs.

Yes, that’s what I want: how to use the right tensor for this use case. I am new to the PyTorch world.

Thanks again

For this mixed use case, you could use two different model output layers, where the first one would have 4 output units and return logits for all 4 classes, while the second output layer would have 7 output units for the regression task.

For the first use case, you should calculate the loss using `nn.CrossEntropyLoss`, while for the second you could use `nn.MSELoss`, and finally sum these losses together before calling `backward()`.
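A minimal sketch of the two-output-layer idea described above could look like this. The class name `TwoHeadModel`, the hidden size of 64, the 32 plain float input features, and the random data are all assumptions for illustration; with the embedding-based model from the earlier post, only the final layer would need to be split in the same way.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoHeadModel(nn.Module):
    def __init__(self, in_features, nb_classes=4, nb_reg_targets=7, hidden=64):
        super().__init__()
        self.body = nn.Linear(in_features, hidden)         # shared hidden layer
        self.cls_head = nn.Linear(hidden, nb_classes)      # logits for o1
        self.reg_head = nn.Linear(hidden, nb_reg_targets)  # predictions for o2..o8

    def forward(self, x):
        x = F.relu(self.body(x))
        return self.cls_head(x), self.reg_head(x)


model = TwoHeadModel(in_features=32)
cls_criterion = nn.CrossEntropyLoss()
reg_criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Hypothetical batch: 16 samples
x = torch.randn(16, 32)
class_target = torch.randint(0, 4, (16,))  # LongTensor with class indices
reg_target = torch.randn(16, 7)            # float regression targets

cls_logits, reg_pred = model(x)
loss = cls_criterion(cls_logits, class_target) + reg_criterion(reg_pred, reg_target)

optimizer.zero_grad()
loss.backward()  # gradients flow through both heads and the shared body
optimizer.step()
```

Note that regression targets with a large value range could dominate the summed loss, so normalizing them or weighting the two losses might be worth trying.
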

The general use case of a “mixed prediction” would also work with a single output layer, where you would have to index the specific locations for each task. I’m not sure how well this approach would work, but you might just try it out.
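The single-layer variant could be sketched as follows, assuming the first 4 output units are treated as class logits and the remaining 7 as regression predictions. This ordering, the small `nn.Sequential` model, and the random data are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

# One output layer with 4 + 7 = 11 units
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 11))

# Hypothetical batch: 16 samples
x = torch.randn(16, 32)
class_target = torch.randint(0, 4, (16,))
reg_target = torch.randn(16, 7)

out = model(x)          # shape [16, 11]
cls_logits = out[:, :4]  # first 4 units: class logits
reg_pred = out[:, 4:]    # remaining 7 units: regression outputs

loss = nn.CrossEntropyLoss()(cls_logits, class_target) + nn.MSELoss()(reg_pred, reg_target)
loss.backward()
```
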

Thanks again. Let me see how I can work on it