I’m working on an online action detection task (i.e. I have an input video stream and I want to classify each frame as it arrives). I see in the literature that optical flow helps a lot in capturing motion information from video, but as far as I know it is always computed as a pre-processing step, so it is not suitable for real-time applications.
Is there any algorithm capable of computing optical flow in real time?
PS: In any case, suggestions or workarounds are welcome.
Basically, my current model is a single-stream CNN (with RGB input) as feature extractor, followed by an LSTM that performs the prediction. I’m convinced that switching to a two-stream CNN (with optical flow as input to the second stream) would increase the accuracy.
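To make the kind of flow pre-processing I mean concrete, here is a toy sketch (pure NumPy, illustrative only, not my actual pipeline) of the classical Lucas–Kanade brightness-constancy solve between two consecutive frames; a real implementation would use something like OpenCV's Farneback or DIS optical flow instead:

```python
import numpy as np

def lucas_kanade_global(prev, curr):
    """Estimate a single (u, v) translation between two frames by
    solving the brightness-constancy equation Ix*u + Iy*v = -It
    in the least-squares sense over all pixels (Lucas-Kanade with
    one window covering the whole image -- a toy version)."""
    Ix = np.gradient(prev, axis=1)   # spatial derivative along x
    Iy = np.gradient(prev, axis=0)   # spatial derivative along y
    It = curr - prev                 # temporal derivative
    # 2x2 normal equations of the least-squares problem
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)

# Synthetic pair: the second frame is the first shifted by (0.5, 0.3) px
y, x = np.mgrid[0:200, 0:200].astype(float)
frame1 = np.sin(0.1 * x) + np.cos(0.1 * y)
frame2 = np.sin(0.1 * (x - 0.5)) + np.cos(0.1 * (y - 0.3))

u, v = lucas_kanade_global(frame1, frame2)
print(u, v)  # lands near (0.5, 0.3)
```

In a real two-stream setup the per-pixel dense flow (not this single global vector) between consecutive frames would be stacked and fed to the second CNN stream, which is exactly the step I'm worried is too slow to run frame-by-frame.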