Sorry, I just found that when I defined the network, some layers were never used in the forward pass (which means they are not actually part of the network). Deleting these unused layers solved the problem. But why do we have to use every layer defined in the init function? Won’t PyTorch automatically ignore the unused ones?
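
To make the situation concrete, here is a minimal sketch of what I mean. It is not my actual network: the class name `Net`, the layer names `fc1`, `fc2`, and `unused`, and all the sizes are made up for illustration. It only checks whether a layer that is defined but never called still shows up in the module's parameters, and whether it gets a gradient after `backward()`.

```python
import torch
import torch.nn as nn


class Net(nn.Module):  # hypothetical toy model, not the real one
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)
        # Defined here but never called in forward().
        self.unused = nn.Linear(8, 8)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


net = Net()

# Every submodule assigned in __init__ is registered, used or not.
param_names = [name for name, _ in net.named_parameters()]

loss = net(torch.randn(3, 4)).sum()
loss.backward()

# Parameters that took no part in the forward pass keep grad = None.
no_grad_names = [name for name, p in net.named_parameters() if p.grad is None]

print(param_names)
print(no_grad_names)
```

In this sketch `param_names` still lists `unused.weight` and `unused.bias`, and `no_grad_names` contains exactly those two. So the layer is not silently dropped: it stays registered in the module even though it never receives a gradient. I am not sure whether that is what caused my problem.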