Hi Soumith, thanks a lot for the answer. Ultimately the goal is to do classification on image ROIs, and since they come in different sizes, that's why I was asking. So for instance, if we have a training loop like this:
for epoch in range(num_epochs):
    for sample in samples:
        outputs = convnet(sample)
where each sample represents a batch and is actually a list of images
[img_1, img_2, ..., img_n], and img_1, img_2, ..., img_n are ROIs extracted from the original images.
Would that work, or do I need to specify something beforehand in the convnet architecture to make it work with this kind of data?
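For what it's worth, here is a rough sketch of what I understand the suggestion to be: a convnet whose `nn.AdaptiveAvgPool2d` layer maps any spatial size down to a fixed grid, so each ROI can be fed through one at a time (batch size 1). The class name `RoiClassifier` and the layer sizes are just made up for illustration, not from the thread:

```python
import torch
import torch.nn as nn

class RoiClassifier(nn.Module):
    """Toy convnet that accepts inputs of any spatial size (illustrative)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Any H x W feature map is averaged down to 1 x 1 here,
        # so the Linear layer below always sees 32 features.
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x)
        return self.fc(x.flatten(1))

net = RoiClassifier()
# Two ROIs of different sizes, each as a batch of one (NCHW layout):
out1 = net(torch.randn(1, 3, 238, 126))
out2 = net(torch.randn(1, 3, 68, 234))
print(out1.shape, out2.shape)  # both torch.Size([1, 10])
```

If that is the idea, the inner loop over the list of ROIs would just call the net on each image individually rather than stacking them into one tensor.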
I'm not quite sure about the benefits of
nn.AdaptiveAvgPool2d in this case. Would you mind elaborating a little bit? How could it help the convnet deal with data where each sample has a variable size, e.g.:
img1.shape = (238, 126, 3)
img2.shape = (68, 234, 3)
img3.shape = (225, 98, 3)
...
img_n.shape = ...
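To partly answer my own question, a quick check shows that `nn.AdaptiveAvgPool2d` squashes any spatial size to the requested output grid, so the differently sized images above all come out the same shape. Note the shapes listed above are HWC (height, width, channels), while PyTorch layers expect channels first, so the tensors would need permuting before going through a conv layer. The 7x7 target size below is an arbitrary choice for the example:

```python
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((7, 7))
# The three ROI sizes from above, in channels-first (CHW) layout:
for h, w in [(238, 126), (68, 234), (225, 98)]:
    x = torch.randn(3, h, w)
    print(tuple(pool(x).shape))  # (3, 7, 7) every time
```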