DQN gives the same output for any input when using the RL code provided on the PyTorch website

I am trying to train a DQN using the same method as specified in the PyTorch tutorials. I extract features for 6 different images from a pretrained AlexNet and then feed the absolute difference of the features of 2 adjacent images to the DQN (i.e. my input size now becomes (3, feature_size)). However, my DQN always converges such that the output for all 3 of these inputs is the same. For preprocessing of the images, I convert them to PIL, randomly flip them horizontally, and convert them to Tensors.
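
To make the input construction concrete, here is a minimal, framework-free sketch of it. It is not the actual training code. It assumes the 6 feature vectors are paired without overlap as (0,1), (2,3), (4,5), which is what gives 3 rows. Plain Python lists stand in for tensors, `feature_size = 8` is a placeholder, and the helper names `pairwise_abs_diff` and `rows_identical` are hypothetical.

```python
import random


def pairwise_abs_diff(features):
    """Absolute difference of non-overlapping adjacent pairs: (0,1), (2,3), (4,5).

    Assumption: this pairing is inferred from 6 images -> 3 inputs.
    """
    assert len(features) % 2 == 0, "expected an even number of feature vectors"
    return [
        [abs(a - b) for a, b in zip(features[i], features[i + 1])]
        for i in range(0, len(features), 2)
    ]


def rows_identical(rows, tol=1e-6):
    """Return True if every row matches the first row within tol."""
    first = rows[0]
    return all(
        abs(x - y) <= tol
        for row in rows[1:]
        for x, y in zip(row, first)
    )


random.seed(0)
feature_size = 8  # placeholder; real AlexNet features are much larger
# Random stand-ins for the 6 extracted feature vectors.
features = [[random.random() for _ in range(feature_size)] for _ in range(6)]

dqn_input = pairwise_abs_diff(features)  # shape: (3, feature_size)
inputs_collapsed = rows_identical(dqn_input)
```

Running the same `rows_identical` check on the real difference features and then on the network outputs may help narrow down whether the three rows are already (nearly) equal before they reach the DQN, or whether they only collapse inside the network.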