I read a batch of images and their annotations (each annotation marks a different object of interest in each image).
The batch of images (B, C, H, W) is fed into a network, which outputs new feature maps of shape (B, C, H, W).
Now, how can I center-crop a fixed-size patch (e.g. 64×64, ideally giving a tensor of shape (B, C, 64, 64)) from the generated feature maps, where each sample in the batch has a different center?
Can this kind of batched center crop be included in the training step? (My loss takes the cropped patch as y_pred and compares it to some y_true.)
You should be able to index the tensor directly and thus specify the start and end indices (or derive them from the center and the patch size).
Yes, slicing a tensor is differentiable, so it should also work during training.
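E.g. something like this loop over the batch (a minimal sketch; the function name and the (cy, cx) center layout are my own choices, and it assumes every center lies at least half the patch size away from the borders, with no padding or clamping handled):

```python
import torch

def center_crop_batch(features, centers, size=64):
    # features: (B, C, H, W); centers: (B, 2) integer tensor of (cy, cx) per sample.
    # Plain slicing keeps the autograd history, so gradients flow back to `features`.
    half = size // 2
    crops = []
    for feat, (cy, cx) in zip(features, centers):
        y0 = int(cy) - half
        x0 = int(cx) - half
        crops.append(feat[:, y0:y0 + size, x0:x0 + size])
    return torch.stack(crops)  # (B, C, size, size)
```

Calling `.backward()` on a loss computed from the stacked crops will propagate gradients only into the cropped regions of the feature maps, which is usually what you want here.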
Based on your answer, my initial idea is to turn the batched annotations into batched indices and then index the feature maps with them to get the cropped features. Is that the right understanding?
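Sketched in code, the idea would be something like this fully vectorized version, building index grids from the per-sample centers and using advanced indexing instead of a Python loop (names are illustrative, and it again assumes integer, in-bounds centers):

```python
import torch

def batched_index_crop(features, centers, size=64):
    # features: (B, C, H, W); centers: (B, 2) integer tensor of (cy, cx).
    B, C, H, W = features.shape
    half = size // 2
    offs = torch.arange(size) - half        # (size,) offsets around the center
    ys = centers[:, 0:1] + offs             # (B, size) row indices per sample
    xs = centers[:, 1:2] + offs             # (B, size) column indices per sample
    # Broadcast index tensors so the result comes out as (B, C, size, size).
    b = torch.arange(B)[:, None, None, None]
    c = torch.arange(C)[None, :, None, None]
    return features[b, c, ys[:, None, :, None], xs[:, None, None, :]]
```

Advanced indexing is also differentiable, so this should likewise work inside the training step.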
BTW, I saw this:
I think we have the same purpose. I tried the code in that reply; however, it doesn't seem to be an exact crop?