PyTorch 0.3: loss decreases well, but error does not go down; with PyTorch 0.1.12 everything is OK

I have run into a new problem: with PyTorch 0.3, the loss decreases as expected, but the error does not go down. With PyTorch 0.1.12 everything is fine.
Key code:
criterion = nn.CrossEntropyLoss()
loss = criterion(output_var, target_var)

# take the top-1 prediction per sample and compare against the target
_, predictions_var = torch.topk(output_var, 1)
error = 1 - torch.eq(predictions_var, target_var).float().mean()
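One thing worth checking (an assumption, not verified against this setup): `torch.topk(output_var, 1)` returns a tensor of shape `(N, 1)`, while the target is typically `(N,)`. PyTorch gained NumPy-style broadcasting in version 0.2, so `torch.eq` on these shapes no longer compares elementwise but broadcasts to an `(N, N)` matrix, which distorts the error metric while leaving the loss untouched. The same broadcasting rule can be demonstrated with NumPy (used here only so the sketch is self-contained):

```python
import numpy as np

# Hypothetical top-1 predictions, shape (4, 1), as returned by topk(..., 1)
predictions = np.array([[2], [0], [1], [3]])
# Targets, shape (4,)
targets = np.array([2, 0, 1, 3])

# With broadcasting (PyTorch >= 0.2 follows NumPy semantics here),
# (4, 1) == (4,) expands to a (4, 4) comparison matrix.
broadcast_eq = predictions == targets
assert broadcast_eq.shape == (4, 4)

# The mean over that matrix is not the accuracy:
wrong_acc = broadcast_eq.mean()  # 0.25 here, although every prediction is correct

# Squeezing the extra dimension restores the elementwise comparison:
correct_acc = (predictions.squeeze(1) == targets).mean()  # 1.0
```

If this is the cause, `predictions_var.squeeze(1)` (or `target_var.view(-1, 1)`) before `torch.eq` should restore the 0.1.12 behavior.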

I train on the same data with the same network in both cases.
Here is the training log with PyTorch 0.1.12:
Train: (Epoch 0 of 300) [0449/1846] Time: 0.68629 (313.264) Loss: 1.50048 (1.616) Error: 0.71875 (0.761)
Train: (Epoch 0 of 300) [0450/1846] Time: 0.68532 (313.950) Loss: 1.50950 (1.616) Error: 0.65625 (0.761)
Train: (Epoch 0 of 300) [0451/1846] Time: 0.69083 (314.640) Loss: 1.51181 (1.615) Error: 0.65625 (0.760)
Train: (Epoch 0 of 300) [0452/1846] Time: 0.69078 (315.331) Loss: 1.56545 (1.615) Error: 0.68750 (0.760)
Train: (Epoch 0 of 300) [0453/1846] Time: 0.69094 (316.022) Loss: 1.64423 (1.615) Error: 0.71875 (0.760)
Train: (Epoch 0 of 300) [0454/1846] Time: 0.69243 (316.715) Loss: 1.56402 (1.615) Error: 0.62500 (0.760)
Train: (Epoch 0 of 300) [0455/1846] Time: 0.68892 (317.403) Loss: 1.56787 (1.615) Error: 0.78125 (0.760)
Train: (Epoch 0 of 300) [0456/1846] Time: 0.68399 (318.087) Loss: 1.50009 (1.615) Error: 0.65625 (0.760)
Train: (Epoch 0 of 300) [0457/1846] Time: 0.68671 (318.774) Loss: 1.55996 (1.615) Error: 0.75000 (0.760)
Train: (Epoch 0 of 300) [0458/1846] Time: 0.69036 (319.465) Loss: 1.50076 (1.615) Error: 0.68750 (0.759)
Train: (Epoch 0 of 300) [0459/1846] Time: 0.70112 (320.166) Loss: 1.56494 (1.614) Error: 0.78125 (0.759)
Train: (Epoch 0 of 300) [0460/1846] Time: 0.69292 (320.859) Loss: 1.59975 (1.614) Error: 0.65625 (0.759)
Train: (Epoch 0 of 300) [0461/1846] Time: 0.69803 (321.557) Loss: 1.56865 (1.614) Error: 0.71875 (0.759)
Train: (Epoch 0 of 300) [0462/1846] Time: 0.69104 (322.248) Loss: 1.42063 (1.614) Error: 0.62500 (0.759)
Train: (Epoch 0 of 300) [0463/1846] Time: 0.68754 (322.935) Loss: 1.49598 (1.614) Error: 0.81250 (0.759)
Train: (Epoch 0 of 300) [0464/1846] Time: 0.68633 (323.622) Loss: 1.43757 (1.613) Error: 0.59375 (0.759)
Train: (Epoch 0 of 300) [0465/1846] Time: 0.68728 (324.309) Loss: 1.46106 (1.613) Error: 0.62500 (0.758)
Train: (Epoch 0 of 300) [0466/1846] Time: 0.68766 (324.996) Loss: 1.58487 (1.613) Error: 0.68750 (0.758)
Train: (Epoch 0 of 300) [0467/1846] Time: 0.69090 (325.687) Loss: 1.62123 (1.613) Error: 0.84375 (0.758)
Train: (Epoch 0 of 300) [0468/1846] Time: 0.68906 (326.376) Loss: 1.56344 (1.613) Error: 0.78125 (0.758)
Train: (Epoch 0 of 300) [0469/1846] Time: 0.69489 (327.071) Loss: 1.72437 (1.613) Error: 0.81250 (0.759)
Train: (Epoch 0 of 300) [0470/1846] Time: 0.69617 (327.767) Loss: 1.63039 (1.613) Error: 0.75000 (0.759)
Train: (Epoch 0 of 300) [0471/1846] Time: 0.68704 (328.454) Loss: 1.56739 (1.613) Error: 0.71875 (0.758)
Train: (Epoch 0 of 300) [0472/1846] Time: 0.68853 (329.143) Loss: 1.57508 (1.613) Error: 0.75000 (0.758)
Train: (Epoch 0 of 300) [0473/1846] Time: 0.68571 (329.829) Loss: 1.51796 (1.613) Error: 0.75000 (0.758)
Train: (Epoch 0 of 300) [0474/1846] Time: 0.68808 (330.517) Loss: 1.46917 (1.612) Error: 0.43750 (0.758)
Train: (Epoch 0 of 300) [0475/1846] Time: 0.68482 (331.202) Loss: 1.57809 (1.612) Error: 0.71875 (0.758)
Train: (Epoch 0 of 300) [0476/1846] Time: 0.69689 (331.899) Loss: 1.49936 (1.612) Error: 0.68750 (0.757)
Train: (Epoch 0 of 300) [0477/1846] Time: 0.69660 (332.595) Loss: 1.64540 (1.612) Error: 0.78125 (0.758)
Train: (Epoch 0 of 300) [0478/1846] Time: 0.69536 (333.290) Loss: 1.49299 (1.612) Error: 0.78125 (0.758)
Train: (Epoch 0 of 300) [0479/1846] Time: 0.68811 (333.979) Loss: 1.57865 (1.612) Error: 0.62500 (0.757)
Train: (Epoch 0 of 300) [0480/1846] Time: 0.68936 (334.668) Loss: 1.49023 (1.612) Error: 0.56250 (0.757)
Train: (Epoch 0 of 300) [0481/1846] Time: 0.68498 (335.353) Loss: 1.70031 (1.612) Error: 0.75000 (0.757)
Train: (Epoch 0 of 300) [0482/1846] Time: 0.68599 (336.039) Loss: 1.53341 (1.612) Error: 0.75000 (0.757)
Train: (Epoch 0 of 300) [0483/1846] Time: 0.69284 (336.732) Loss: 1.53764 (1.611) Error: 0.65625 (0.757)
Train: (Epoch 0 of 300) [0484/1846] Time: 0.68952 (337.421) Loss: 1.57135 (1.611) Error: 0.81250 (0.757)

Here is the training log with PyTorch 0.3:
Train: (Epoch 0 of 300) [0467/2953] Time: 0.61195 (292.880) Loss: 1.61144 (1.618) Error: 0.81000 (0.796)
Train: (Epoch 0 of 300) [0468/2953] Time: 0.60961 (293.490) Loss: 1.57312 (1.618) Error: 0.75000 (0.796)
Train: (Epoch 0 of 300) [0469/2953] Time: 0.60911 (294.099) Loss: 1.57415 (1.618) Error: 0.77500 (0.796)
Train: (Epoch 0 of 300) [0470/2953] Time: 0.61323 (294.712) Loss: 1.58199 (1.617) Error: 0.75500 (0.796)
Train: (Epoch 0 of 300) [0471/2953] Time: 0.62353 (295.336) Loss: 1.54087 (1.617) Error: 0.74000 (0.795)
Train: (Epoch 0 of 300) [0472/2953] Time: 0.63124 (295.967) Loss: 1.56808 (1.617) Error: 0.78000 (0.795)
Train: (Epoch 0 of 300) [0473/2953] Time: 0.61456 (296.581) Loss: 1.65133 (1.617) Error: 0.85000 (0.796)
Train: (Epoch 0 of 300) [0474/2953] Time: 0.60898 (297.190) Loss: 1.55264 (1.617) Error: 0.86000 (0.796)
Train: (Epoch 0 of 300) [0475/2953] Time: 0.61363 (297.804) Loss: 1.48496 (1.617) Error: 0.80000 (0.796)
Train: (Epoch 0 of 300) [0476/2953] Time: 0.62919 (298.433) Loss: 1.60982 (1.617) Error: 0.86250 (0.796)
Train: (Epoch 0 of 300) [0477/2953] Time: 0.62011 (299.053) Loss: 1.54939 (1.617) Error: 0.82250 (0.796)
Train: (Epoch 0 of 300) [0478/2953] Time: 0.61035 (299.664) Loss: 1.56136 (1.617) Error: 0.81500 (0.796)
Train: (Epoch 0 of 300) [0479/2953] Time: 0.61940 (300.283) Loss: 1.53150 (1.616) Error: 0.83750 (0.796)
Train: (Epoch 0 of 300) [0480/2953] Time: 0.61538 (300.898) Loss: 1.57232 (1.616) Error: 0.76500 (0.796)
Train: (Epoch 0 of 300) [0481/2953] Time: 0.62098 (301.519) Loss: 1.78021 (1.617) Error: 0.84500 (0.796)
Train: (Epoch 0 of 300) [0482/2953] Time: 0.61404 (302.134) Loss: 1.64008 (1.617) Error: 0.80500 (0.796)
Train: (Epoch 0 of 300) [0483/2953] Time: 0.61893 (302.752) Loss: 1.56123 (1.617) Error: 0.84000 (0.796)
Train: (Epoch 0 of 300) [0484/2953] Time: 0.61640 (303.369) Loss: 1.49744 (1.616) Error: 0.76000 (0.796)
Train: (Epoch 0 of 300) [0485/2953] Time: 0.62221 (303.991) Loss: 1.75178 (1.617) Error: 0.82500 (0.796)
Train: (Epoch 0 of 300) [0486/2953] Time: 0.61246 (304.603) Loss: 1.53652 (1.616) Error: 0.80000 (0.796)
Train: (Epoch 0 of 300) [0487/2953] Time: 0.61301 (305.216) Loss: 1.55897 (1.616) Error: 0.77250 (0.796)
Train: (Epoch 0 of 300) [0488/2953] Time: 0.63178 (305.848) Loss: 1.76734 (1.617) Error: 0.79500 (0.796)
Train: (Epoch 0 of 300) [0489/2953] Time: 0.61366 (306.462) Loss: 1.52398 (1.616) Error: 0.78750 (0.796)
Train: (Epoch 0 of 300) [0490/2953] Time: 0.60878 (307.071) Loss: 1.58773 (1.616) Error: 0.81250 (0.796)
Train: (Epoch 0 of 300) [0491/2953] Time: 0.61080 (307.682) Loss: 1.68538 (1.617) Error: 0.81000 (0.796)
Train: (Epoch 0 of 300) [0492/2953] Time: 0.60971 (308.291) Loss: 1.57467 (1.616) Error: 0.75750 (0.796)
Train: (Epoch 0 of 300) [0493/2953] Time: 0.63329 (308.925) Loss: 1.62051 (1.616) Error: 0.85250 (0.796)
Train: (Epoch 0 of 300) [0494/2953] Time: 0.62522 (309.550) Loss: 1.53407 (1.616) Error: 0.75750 (0.796)
Train: (Epoch 0 of 300) [0495/2953] Time: 0.61182 (310.162) Loss: 1.53725 (1.616) Error: 0.79750 (0.796)
Train: (Epoch 0 of 300) [0496/2953] Time: 0.61881 (310.780) Loss: 1.57296 (1.616) Error: 0.84000 (0.796)
Train: (Epoch 0 of 300) [0497/2953] Time: 0.60919 (311.390) Loss: 1.53940 (1.616) Error: 0.82250 (0.796)
Train: (Epoch 0 of 300) [0498/2953] Time: 0.62138 (312.011) Loss: 1.62433 (1.616) Error: 0.80250 (0.796)
Train: (Epoch 0 of 300) [0499/2953] Time: 0.61154 (312.622) Loss: 1.58761 (1.616) Error: 0.79500 (0.796)

Why does this happen? Can anyone help me?