LSTM_CNN image sequences

Hello everyone,
I got an assignment and got stuck on it while going down the rabbit hole of learning PyTorch, LSTMs and CNNs. Using the well-known MNIST dataset, I take combinations of 4 digits, and each combination falls into one of 7 labels, e.g.:
1111 → label 1 (constant trend)
1234 → label 2 (increasing trend)
4321 → label 3 (decreasing trend)

7382 → label 7 (decreasing trend → increasing trend → decreasing trend)
After loading, my tensor has shape (3, 4, 28, 28), where 28 is the MNIST image width and height, 3 is the batch size and 4 is the number of images per sequence (currently treated as channels).
I’m somewhat stuck on how to pass this into a PyTorch-based LSTM and CNN, as basically all Google searches lead to articles where only a single image is passed in.
I’m rather new to all of this and looking for some guidance on how to move forward. I’ve been reading loads of articles, YouTube videos, etc., but they all seem to cover only the basics or variations of the same subject.
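
To make the shapes concrete, here is how I think the data should be arranged for the LSTM (just a placeholder tensor, not my real loader):

import torch

# Each 28x28 image is flattened to 784 features and becomes one timestep,
# so a batch of 3 sequences should end up as (batch=3, seq_len=4, features=784).
batch = torch.zeros(3, 4, 28, 28)        # (batch, images per sequence, height, width)
lstm_input = batch.view(3, 4, 28 * 28)   # (3, 4, 784)
print(lstm_input.shape)                  # torch.Size([3, 4, 784])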

I have written some code for the LSTM part, but when I run it, it gives the error shown below the code:

import numpy as np
import torch
import torch.nn as nn
from torch import optim, softmax
from sklearn.model_selection import train_test_split

# dataset: sequences of 4 MNIST images each
# data labels: 7 classes

#Data
x_train, x_test, y_train, y_test = train_test_split(dataset.data, dataset.data_label, test_size=0.15,
                                                    random_state=42)
#model
class Mylstm(nn.Module):
    def __init__(self, input_size, hidden_size, n_layers, n_classes):
        super(Mylstm, self).__init__()
        self.input_size = input_size
        self.n_layers = n_layers
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size, n_layers, batch_first=True)
        # readout layer
        self.fc = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # Initialize hidden state with zeros
        h0 = torch.zeros(self.n_layers, x.size(0), self.hidden_size).requires_grad_()
        # initialize the cell state:
        c0 = torch.zeros(self.n_layers, x.size(0), self.hidden_size).requires_grad_()
        out, (h_n, h_c) = self.lstm(x, (h0.detach(), c0.detach()))
        x = h_n[:, -1, :1]
        x = self.fc(x)
        x = softmax(x, dim=1)
        return x

#Hyperparameters
input_size = 784
hidden_size = 256
sequence_length = 28
n_layers = 2
n_classes = 7
learning_rate = 0.001
model = Mylstm(input_size, hidden_size, n_layers, n_classes)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)


#training
bs = 0
num_epochs = 5
batch_size = 3

if np.mod(x_train.shape[0], batch_size) == 0.0:
    iter = int(x_train.shape[0] / batch_size)
else:
    iter = int(x_train.shape[0] / batch_size) + 1
bs = 0
for i in range(iter):
    sequences = x_test[bs:bs + batch_size, :]
    labels = y_test[bs:bs + batch_size]
    test_images = dataset.load_images(sequences)
    bs += batch_size

for epoch in range(num_epochs):
    for i in range(iter):
        sequences = x_train[bs:bs + batch_size, :]
        labels = y_train[bs:bs + batch_size]
        input_images = dataset.load_images(sequences)
        bs += batch_size
        images = torch.from_numpy(input_images).view(batch_size, 4, -1)
        labels = torch.from_numpy(labels)
        optimizer.zero_grad()
        output = model(images).float()
        # calculate Loss
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()

The above code gives the following error:
ValueError: Expected input batch_size (2) to match target batch_size (3).

Any ideas on how to proceed from here?

If you want to use a CNN, you could pass (in_channels=4, …) to your first Conv2d layer, treating the 4 images of a sequence as input channels.
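
For example, something along these lines (a rough sketch; the SequenceCNN name, layer widths and kernel sizes are placeholders I made up, not a tuned architecture):

import torch
import torch.nn as nn

class SequenceCNN(nn.Module):
    def __init__(self, n_classes=7):
        super().__init__()
        # 4 input channels = the 4 images of one sequence stacked together
        self.conv1 = nn.Conv2d(in_channels=4, out_channels=16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)                # 28 -> 14 -> 7
        self.fc = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):                          # x: (batch, 4, 28, 28)
        x = self.pool(torch.relu(self.conv1(x)))   # (batch, 16, 14, 14)
        x = self.pool(torch.relu(self.conv2(x)))   # (batch, 32, 7, 7)
        x = x.flatten(1)                           # (batch, 32*7*7)
        return self.fc(x)                          # raw logits for nn.CrossEntropyLoss

model = SequenceCNN()
print(model(torch.zeros(3, 4, 28, 28)).shape)      # torch.Size([3, 7])
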
For this particular error you are facing, can you provide the values of the following?

output.shape
labels.shape
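
My guess, in case it helps while you check those shapes: h_n returned by nn.LSTM has shape (n_layers, batch, hidden_size) even with batch_first=True, so h_n[:, -1, :1] keeps the layer dimension (size 2) and drops the batch dimension (size 3), and that leading 2 would land where the loss expects the batch dimension, matching the error message. Also, nn.CrossEntropyLoss expects raw logits, so the softmax before returning isn't needed. A quick standalone check of the shapes (placeholder tensors only):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=784, hidden_size=256, num_layers=2, batch_first=True)
x = torch.zeros(3, 4, 784)          # (batch, seq_len, input_size)
out, (h_n, c_n) = lstm(x)

print(h_n.shape)                    # torch.Size([2, 3, 256]) = (n_layers, batch, hidden_size)
print(h_n[:, -1, :1].shape)         # torch.Size([2, 1])  -> batch dimension is gone
print(h_n[-1].shape)                # torch.Size([3, 256]) -> last layer's hidden state, one row per sequence

fc = nn.Linear(256, 7)
print(fc(h_n[-1]).shape)            # torch.Size([3, 7])   -> matches the 3 labels per batch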