Lets say i have an CNN intermediate layer output tensor call it X with shape (B,C,H,W) batch, channels, height and width. I extract the regions of interest (ROIs) from the tensor based on some manually chosen criteria. Assume all the ROIs have same shape (B,N,C,h,w). N is number of ROIS, h is height, and w is width of ROI respectively. Lets call the ROI tensor Y. Now i perform an operation on Y (assume convolution), this operation does not alter the dimension or shape of the ROIs. Lets call the modified ROI tensor as Y’(shape: B,N,C,h,w).
Now i want to replace the locations where Y are extracted from X with Y’. This modified X is further processed in the subsequent layers of the model. So essentially if i do the following things
Y = X[location criteria]
Y’ = some_operation(Y)
X[location criteria] = Y’
Will i be able to train the model without any issues if i do the above?