Need help importing videos into PyTorch

I have trained a Faster R-CNN model to detect fish underwater. I currently train in batches on a random selection of frames from a 1-hour video. I have a few questions that might help improve the accuracy of the model:
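
To make the setup concrete, random frame selection from a video could look roughly like the sketch below. The function name, the 30 fps frame rate, and the sample size of 2000 are assumptions for illustration only, not the actual training code.

```python
import random


def sample_frame_indices(total_frames, num_samples, seed=0):
    """Pick a random subset of frame indices, returned in temporal order."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    num_samples = min(num_samples, total_frames)
    return sorted(rng.sample(range(total_frames), num_samples))


# Hypothetical numbers: a 1-hour video at 30 fps has 108,000 frames.
total_frames = 60 * 60 * 30
indices = sample_frame_indices(total_frames, num_samples=2000)
```

The selected indices would then be decoded into frames and wrapped in a dataset for batching. Shuffling then only changes the order in which these frames reach the model, not which frames are used.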

  1. Should I be training on the 1-hour video without shuffling the data? What’s the best way to go about this?
  2. Is there a way to inform the training of previous frames, so the model has some understanding of how fish move from frame to frame? This would help track fish between predicted frames and reduce the number of false positives (e.g. rocks, seaweed, misc. objects).
  3. Should I be considering alternatives to Faster R-CNN to achieve better results? (e.g. Mask R-CNN, instance segmentation, YOLO/DeepSORT)
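
To illustrate what question 2 is asking for, one possible reading is that each training sample carries a few earlier frames along with the current one. The sketch below only computes the frame indices for such a window. The helper name, the window size, and the stride are hypothetical, and it says nothing about how a detector would consume the extra frames.

```python
def context_window(index, num_previous, stride=1):
    """Return indices of `num_previous` earlier frames plus `index`, oldest first.

    Indices that would fall before the start of the video are clamped to 0.
    """
    return [max(index - stride * k, 0) for k in range(num_previous, -1, -1)]


# Frame 100 together with the three frames sampled 5 frames apart before it.
window = context_window(100, num_previous=3, stride=5)
```

Here `window` holds the indices 85, 90, 95, and 100. Near the start of the video the earliest frame is repeated rather than dropped, which keeps every window the same length.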

Cheers!