Hi Shilpa, thanks for the answer!
Okay, it's good to know that the sequential network worked better than the CNN. I think my problem is different from yours, though. I explained it in this thread: https://discuss.pytorch.org/t/getting-a-zero-grad-in-non-linear-regression/140005