How to extract CNN features from video frames using pre-trained models, without a train/test split?

I have keyframes from the IACC.3 dataset and need to extract visual features from them using pre-trained networks such as VGGNet, ResNet, or GoogLeNet, then store the features in binary format (.bin). I do not intend to split the data into training and test sets. I am planning to write the code in PyTorch.

See this post on how to extract 2D CNN features from a video: "CNN LSTM implementation for video classification".

To save them in binary format, I think you can use torch.save together with io.BytesIO():

https://pytorch.org/docs/stable/torch.html#torch.save
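A rough sketch of how that could look, assuming the features are already a tensor. The helper names `features_to_bytes` and `save_features` are hypothetical, and the file path is whatever you choose.

```python
import io

import torch


def features_to_bytes(features: torch.Tensor) -> bytes:
    """Serialize a tensor into an in-memory bytes object with torch.save."""
    buffer = io.BytesIO()
    torch.save(features, buffer)
    return buffer.getvalue()


def save_features(features: torch.Tensor, path: str) -> None:
    """Write the serialized tensor to a binary file, e.g. 'video01.bin'."""
    with open(path, "wb") as f:
        f.write(features_to_bytes(features))


def load_features(path: str) -> torch.Tensor:
    """Read back a file written by save_features."""
    return torch.load(path)
```

Note that torch.save writes PyTorch's own serialization format, so the file is binary but only readable with torch.load. If another tool expects raw float32 values, `features.numpy().tofile(path)` may be a better fit, though you then have to record the shape yourself.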