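The answer depends on the characteristics of the patches. If you’re doing a sliding window, you probably don’t want to extract the patches explicitly: instead, apply VGG16 convolutionally to the larger image. A convolution is already applied in a sliding-window manner, so the overlapping computation is shared between neighbouring windows. (You’ll need to replace the nn.Linear layers with equivalent nn.Conv2d layers: the first one with a kernel that covers the whole feature map it used to see, 7x7 for VGG16, and the remaining ones with 1x1 kernels.)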
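As a rough sketch of that replacement, the snippet below copies the weights of an nn.Linear into an nn.Conv2d that computes the same function. The helper name `linear_to_conv` and the toy sizes are made up for illustration and are not part of any library. For VGG16 the first classifier layer would use in_channels=512 and kernel_size=7, and the later ones in_channels=4096 and kernel_size=1.

```python
import torch
import torch.nn as nn


def linear_to_conv(fc: nn.Linear, in_channels: int, kernel_size: int) -> nn.Conv2d:
    """Hypothetical helper: build a Conv2d equivalent to `fc`, where `fc`
    expects a flattened (in_channels, kernel_size, kernel_size) feature map."""
    assert fc.in_features == in_channels * kernel_size * kernel_size
    conv = nn.Conv2d(in_channels, fc.out_features, kernel_size,
                     bias=fc.bias is not None)
    with torch.no_grad():
        # Flattening a (C, k, k) map is C-major, so a plain view of the
        # Linear weight matches the Conv2d weight layout (out, C, k, k).
        conv.weight.copy_(
            fc.weight.view(fc.out_features, in_channels, kernel_size, kernel_size))
        if fc.bias is not None:
            conv.bias.copy_(fc.bias)
    return conv


# Toy sizes (assumed for the example, much smaller than VGG16).
torch.manual_seed(0)
fc = nn.Linear(8 * 3 * 3, 5)
conv = linear_to_conv(fc, in_channels=8, kernel_size=3)

# On an input of exactly the original size, both layers agree.
feat = torch.randn(1, 8, 3, 3)
out_fc = fc(feat.flatten(1))      # shape (1, 5)
out_conv = conv(feat)             # shape (1, 5, 1, 1)

# On a larger feature map, the conv yields one score vector per window.
big = torch.randn(1, 8, 6, 7)
scores = conv(big)                # shape (1, 5, 4, 5)
```

Each spatial position of `scores` holds what the original layer would have produced on the corresponding 3x3 window of `big`.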
If you need to extract random patches, you’ll probably want to use the indexing operations on tensors.
Thanks! Applying VGG16 convolutionally worked great for me.
However, with the current PyTorch API, is it possible to build a function that takes a list of (row, col) indices and a (height, width) block size and efficiently returns the corresponding patches?
It might be useful to get patches that satisfy certain criteria like false negatives, etc.
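One possible way to do this with plain tensor indexing is sketched below. It assumes a single (C, H, W) image, one block size shared by all patches, and (row, col) giving each patch's top-left corner. The function name `extract_patches` is made up for this example and is not an existing PyTorch function.

```python
import torch


def extract_patches(image, tops, lefts, height, width):
    """Hypothetical helper.
    image: (C, H, W) tensor.
    tops, lefts: 1-D LongTensors of length N (top-left corner of each patch).
    Returns a (N, C, height, width) tensor."""
    # Row and column indices covered by each patch.
    rows = tops[:, None] + torch.arange(height, device=image.device)  # (N, height)
    cols = lefts[:, None] + torch.arange(width, device=image.device)  # (N, width)
    # Broadcast (N, height, 1) against (N, 1, width) so every patch is
    # gathered in a single advanced-indexing call.
    patches = image[:, rows[:, :, None], cols[:, None, :]]  # (C, N, height, width)
    return patches.permute(1, 0, 2, 3)


# Small example with recognisable values.
image = torch.arange(2 * 6 * 7).view(2, 6, 7)
tops = torch.tensor([0, 2, 3])
lefts = torch.tensor([1, 0, 4])
patches = extract_patches(image, tops, lefts, height=3, width=2)
```

For random patches, `tops` and `lefts` could be drawn with torch.randint; for criteria such as false negatives, they could come from the coordinates of the positions that satisfy the criterion. The caller is responsible for keeping every patch inside the image bounds.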