If you have the image sequence as an array of shape vid = [w, h, c, n_frames], you can use np.split and specify the indices at which to cut along the frame axis. It returns a list of sub-arrays, from which you can pick your cropped sequences:
random_sequence = np.split(vid, indices_or_sections=[10,20,45,60,...], axis=3)
For this, you have to load the whole sequence into memory, which might be a bit memory-heavy.
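As a rough sketch of the np.split idea, assuming a dummy all-zeros video and randomly drawn cut points (the sizes, the seed and the names cuts and chunks are made up for illustration):

```python
import numpy as np

# Dummy video with shape (w, h, c, n_frames); the sizes are arbitrary.
w, h, c, n_frames = 4, 4, 3, 100
vid = np.zeros((w, h, c, n_frames), dtype=np.uint8)

# Draw 4 distinct cut points in [1, n_frames - 1] and sort them,
# since np.split expects increasing indices.
rng = np.random.default_rng(0)
cuts = np.sort(rng.choice(np.arange(1, n_frames), size=4, replace=False))

# Splitting along axis 3 (the frame axis) gives len(cuts) + 1 sub-arrays.
chunks = np.split(vid, cuts, axis=3)
chunk_lens = [chunk.shape[3] for chunk in chunks]

# Pick one of the sub-sequences at random.
random_sequence = chunks[rng.integers(len(chunks))]
```

Every chunk keeps the first three dimensions and only differs in the number of frames, so the chunk lengths add up to n_frames.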
Otherwise, you can just define t1 and t2 in __getitem__ as:
I assume the number of samples to be drawn is, for instance, k = 10 (the argument is missing in random.sample()), or, if at all, we consider k = max_n_frames_wanted. With a sequence of 400 frames, that gives:
t1 = [0, 1, 2, 3, 4,…389]
t2 = [0, 1, 2, 3, 4,…9]
I hope I am understanding this correctly.
But my main concern is that, when we use random sampling, the length returned by def __len__(self) varies, while the index still has to point to the sequence whose images are retrieved from disk.
First, you can set the __len__(self) to be the number of sequences you have, so in this case, 11. So instead of
you can specify it as len(self.seq_lens). Sure, an epoch will be smaller, but you can train for more epochs.
Then, I assumed that you wanted to have a variable number of frames (but in serialized order) on each __getitem__(self, idx) call, so that’s what I specified as t1 and t2. At each call, you would get a sequence index and then randomly pick t1 and t2 in that sequence. Then take all the frames between t1 and t1+t2.
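A minimal sketch of that idea, assuming a plain class that only returns frame indices (in practice you would subclass torch.utils.data.Dataset and load the frames from disk). The class name ClipIndexDataset is hypothetical, and treating the t1 to t1+t2 range as inclusive is my assumption:

```python
import random


class ClipIndexDataset:
    """Hypothetical sketch: one item per sequence, random clip inside it."""

    def __init__(self, seq_lens, max_n_frames_wanted):
        self.seq_lens = list(seq_lens)
        self.max_n_frames_wanted = max_n_frames_wanted

    def __len__(self):
        # One item per sequence, not per frame.
        return len(self.seq_lens)

    def __getitem__(self, idx):
        n = self.seq_lens[idx]
        # Random start; assumes n > max_n_frames_wanted,
        # otherwise randrange raises ValueError.
        t1 = random.randrange(0, n - self.max_n_frames_wanted)
        # Random extent of the clip beyond t1.
        t2 = random.randrange(0, self.max_n_frames_wanted)
        # Frames t1 .. t1 + t2 (inclusive, an assumption), in serialized order.
        frame_ids = list(range(t1, t1 + t2 + 1))
        return idx, frame_ids


# Usage: 11 sequences of 400 frames each, clips of at most 10 frames.
ds = ClipIndexDataset([400] * 11, max_n_frames_wanted=10)
seq_idx, frame_ids = ds[3]
```

With 400 frames and max_n_frames_wanted = 10, t1 falls in 0..389 and t2 in 0..9, so the clip always stays inside the sequence.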
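If you want to fix the number of frames, you can simply take the frames between t1 and t1+k. Then t1 can be specified as: t1 = random.sample(range(0, self.seq_lens[idx]-k), 1). Note that random.sample returns a list, so take its single element (for example with [0]) before using t1 as an index.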
If you don’t want to take sequential frames (t1, t1+1, ..., t1+k), just replace the 1 with k in the random.sample call for t1 and remove t2.
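The two fixed-length variants could look roughly like this. The function names contiguous_clip and scattered_frames are made up, and sorting the scattered indices to keep temporal order is my own assumption, not something stated above:

```python
import random


def contiguous_clip(seq_len, k):
    """k consecutive frame indices starting at a random t1."""
    # random.sample returns a list, so unpack its single element.
    t1 = random.sample(range(0, seq_len - k), 1)[0]
    return list(range(t1, t1 + k))


def scattered_frames(seq_len, k):
    """k distinct, non-consecutive frame indices (the 1 replaced by k, no t2)."""
    # Needs seq_len - k >= k, otherwise random.sample raises ValueError.
    picks = random.sample(range(0, seq_len - k), k)
    # Sorted so the frames stay in temporal order (an assumption).
    return sorted(picks)


# Usage with a 400-frame sequence and k = 10.
clip = contiguous_clip(400, 10)
scattered = scattered_frames(400, 10)
```

If you only ever need one start index, random.randrange(0, seq_len - k) gives the same result without the list unpacking.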