Can one use categorical embedding with image classification?


Is it possible to use categorical embedding with image classification? I have a bunch of images and categorical data to go with those images. Can I use both?

There are many ways you can try, but I’ll start with the following.

Represent the categorical data as 1-hot vector then concatenate it to the flattened input to the last classification layer.

This way you rely on CNN to create a representation of your image then use the categorical information only to help the classifier.

Thank you seongmin. I’ve tried a few things and I’m stuck. I have categorical and numerical columns to bring in. I have pulled them from my dataframe, and turned them into tensors. I’ve then pulled in the resnet-50 model with a modification (to add an output of 2 instead of 1000) at the end for my images.

I defined a model for the categorical and numerical images and tried to concatenate with resnet. I’m getting the following error:

NotImplementedError Traceback (most recent call last)
—> 22 y_pred = combined_model(image, categorical_data, numerical_data)
23 single_loss = loss_function(y_pred, label)
24 aggregated_losses.append(single_loss)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\ in call(self, *input, **kwargs)
539 result = self._slow_forward(*input, **kwargs)
540 else:
–> 541 result = self.forward(*input, **kwargs)
542 for hook in self._forward_hooks.values():
543 hook_result = hook(self, input, result)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\ in forward(self, *input)
95 registered hooks while the latter silently ignores them.
96 “”"
—> 97 raise NotImplementedError
99 def register_buffer(self, name, tensor):


Here is how I defined my model, the loss and the training loop:

class Image_Embedd(nn.Module):

    def __ init__(self, model, embedding_size, num_numerical_cols, output_size, layers, p = 0.4):
    embedding_size: Contains the embedding size for the categorical columns
    num_numerical_cols: Stores the total number of numerical columns
    output_size: The size of the output layer or the number of possible outputs.
    layers: List which contains number of neurons for all the layers.
    p: Dropout with the default value of 0.5
    self.model = model
    #list of ModuleList objects for all categorical columns
    self.all_embeddings = nn.ModuleList([nn.Embedding(ni, nf) for ni, nf in embedding_size])
    #drop out value for all layers
    self.embedding_dropout = nn.Dropout(p)
    #list of 1 dimension batch normalization objects for all numerical columns
    self.batch_norm_num = nn.BatchNorm1d(num_numerical_cols)

    #the number of categorical and numerical columns are added together and stored in input_size
    all_layers = []
    num_categorical_cols = sum((nf for ni, nf in embedding_size))
    input_size = num_categorical_cols + num_numerical_cols
    #loop iterates to add corresonding layers to all_layers list above
    for i in layers:
        all_layers.append(nn.Linear(input_size, i))
        input_size = i
    #append output layer to list of layers    
    all_layers.append(nn.Linear(layers[-1], output_size))
    #pass all layers to the sequential class
    self.layers = nn.Sequential(*all_layers)
    #define the foward method
    def forward(self, x_categorical, x_numerical):
        #this starts the embedding of categorical columns
        embeddings = []
        for i,e in enumerate(self.all_embeddings):
        x =, 1)
        x = self.embedding_dropout(x)

        #normalizing numerical columns
        x_numerical = self.batch_norm_num(x_numerical)

        #concatenating numerical and categorical columns
        x =[x, x_numerical], 1)
        x = self.layers(x)

        x2 = model(x2)
        x_final = torch.concat(x, x2)
        x_final = F.softmax(x_final, dim = 1)
        return x

Instantiate Model

combined_model = Image_Embedd(model = CNNmodel, embedding_size=categorical_embedding_sizes
,num_numerical_cols=numerical_data.shape[1], output_size = 2, layers = [256,128,64,32,2]
, p = 0.4)

Loss, optimizer

criterion = nn.CrossEntropyLoss().cuda()
optimizer = torch.optim.Adam(combined_model.parameters(), lr=0.001)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
combined_model = combined_model.cuda()

Training Loop

epochs = 1
aggregated_losses = []

max_trn_batch = 25

for i in range(epochs):
for b, (image, label, policy) in enumerate(train_loader):
image = image.cuda()
label = label.cuda()
categorical_data = categorical_data.cuda()
numerical_data = numerical_data.cuda()
#print(image, label, categorical_data, numerical_data)

    #count batches
    b += 1
    #throttle teh batches
    if b == max_trn_batch:

    y_pred = combined_model(image, categorical_data, numerical_data)
    single_loss = loss_function(y_pred, label)
    # statistics
    running_loss += single_loss.item() * image.size(0)
    running_corrects += torch.sum(y_pred ==

    print(f'train-epoch: {i}, train-batch: {b}')