Can you share the training block of your code? You might have some bugs there. In addition, it seems that you are using the default weight values provided by PyTorch, which isn't ideal; check this for more info. It is better to initialize your layer weights explicitly with a method such as Xavier, which helps with convergence and training stability; check this Stack Overflow answer for more information.
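
As a rough sketch of what Xavier initialization could look like in PyTorch: I haven't seen your model, so the small `nn.Sequential` network and the helper name `init_weights_xavier` below are made up for illustration. Swap in your own model and layer types.

```python
import torch
import torch.nn as nn


def init_weights_xavier(module: nn.Module) -> None:
    """Apply Xavier (Glorot) uniform init to Linear and Conv2d layers.

    Hypothetical helper for illustration; extend the isinstance check
    to cover whatever layer types your model actually uses.
    """
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)


# Placeholder model (an assumption, not your architecture).
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.Tanh(),
    nn.Linear(32, 4),
)

# Module.apply calls the function on every submodule recursively.
model.apply(init_weights_xavier)
```

Call `model.apply(...)` once, right after constructing the model and before creating the optimizer or starting training. Note that Xavier was derived with tanh/sigmoid-style activations in mind; if your network uses ReLU, Kaiming initialization (`nn.init.kaiming_uniform_`) is the more common choice.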