Input numpy ndarray instead of images in a CNN

Hi,

As I noted in another ticket, this implementation lacks vectorisation. When data is retrieved through the loader, MyDataset.__getitem__ is called once per training point, which adds up to millions of calls. This has become the bottleneck of my training on GPU. In Keras, a larger batch_size reduces training time; here, however, batch_size has little effect on training time because of the loop over the individual training points. Is there any way to avoid this?
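To make the question concrete, here is a minimal sketch of the kind of batched access I have in mind, assuming the whole dataset fits in memory as NumPy arrays. The function name `iter_batches`, the array shapes, and the dummy data are hypothetical and not taken from this repository; the point is only that each mini-batch is pulled with a single index operation instead of one call per sample.

```python
import numpy as np


def iter_batches(features, labels, batch_size, shuffle=True, seed=0):
    """Yield (features, labels) mini-batches, one array index per batch.

    Hypothetical helper: it replaces a per-sample __getitem__ loop with
    a single fancy-indexing operation for every mini-batch.
    """
    n = features.shape[0]
    if shuffle:
        order = np.random.default_rng(seed).permutation(n)
    else:
        order = np.arange(n)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        # One vectorised gather for the whole batch.
        yield features[idx], labels[idx]


# Dummy data standing in for the real arrays (shapes are assumptions).
features = np.zeros((1000, 1, 28, 28), dtype=np.float32)
labels = np.arange(1000) % 10

batches = list(iter_batches(features, labels, batch_size=256))
```

With this pattern the number of Python-level indexing calls per epoch is roughly the number of batches rather than the number of samples, so a larger batch_size should directly reduce the loading overhead. Whether something equivalent can be plugged into the existing loader is exactly what I am unsure about.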