Classifying images using temporal information

Hi, I have a problem where I want to classify images into classes A, B, and C based on the objects in the image. I also want to introduce a temporal vector that stores the time elapsed since the start of the sequence, and use it as an additional input, since the classes A, B, and C depend on time as well.
I understand how these can be trained as separate tasks, but I am curious whether they can be combined and trained as a single task.

Hi, I think it's essential to first decide whether you need to relate changes across a sequence. If you just want to use both the image and the time elapsed since the start of the sequence as inputs to your model, you can treat the time value as an additional input feature, for example by concatenating it with the image features before the classification layer. However, if you aim to capture temporal dependencies between frames, consider encoding a window of consecutive frames instead. This makes your model ‘temporally aware’ and is a common approach for video-related tasks.
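
To make the first option concrete, here is a minimal sketch in PyTorch of a single model trained end to end on both inputs. Everything in it is an assumption for illustration: the class name `ImageTimeClassifier`, the tiny CNN backbone, the layer sizes, and the idea that elapsed time is normalized to roughly [0, 1]. It is not a recommended architecture, just one way the image features and the time value could be fused.

```python
import torch
import torch.nn as nn


class ImageTimeClassifier(nn.Module):
    """Hypothetical model: predicts a class from an image plus a scalar elapsed time."""

    def __init__(self, num_classes: int = 3, time_dim: int = 16):
        super().__init__()
        # Small illustrative CNN; a pretrained backbone could be used instead.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 32, 1, 1)
            nn.Flatten(),             # -> (batch, 32)
        )
        # Embed the scalar time so it is not drowned out by the 32 image features.
        self.time_mlp = nn.Sequential(nn.Linear(1, time_dim), nn.ReLU())
        # Classification head over the concatenated image and time features.
        self.head = nn.Linear(32 + time_dim, num_classes)

    def forward(self, image: torch.Tensor, elapsed: torch.Tensor) -> torch.Tensor:
        img_feat = self.backbone(image)                    # (batch, 32)
        time_feat = self.time_mlp(elapsed.unsqueeze(-1))   # (batch, time_dim)
        fused = torch.cat([img_feat, time_feat], dim=1)    # (batch, 32 + time_dim)
        return self.head(fused)                            # (batch, num_classes) logits


# Dummy batch: 4 RGB images of 64x64 with assumed normalized elapsed times.
model = ImageTimeClassifier()
images = torch.randn(4, 3, 64, 64)
elapsed = torch.tensor([0.0, 0.1, 0.5, 1.0])
labels = torch.tensor([0, 1, 2, 1])  # classes A, B, C encoded as 0, 1, 2

logits = model(images, elapsed)
loss = nn.CrossEntropyLoss()(logits, labels)
loss.backward()  # a single loss updates both the image and the time branches
```

Because both branches feed one classification head and one loss, this is trained as a single task. For the second option, one possible variation would be to pass a stack of consecutive frames with shape (batch, frames, 3, H, W) through the same backbone per frame and combine the per-frame features with a recurrent or 3D convolutional layer, but that is a different design from what is sketched here.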