Hi, I’m working on the binarized neural network proposed in BinaryNet, using PyTorch. I saw there’s already a version available on GitHub (BinaryNet-PyTorch), which uses Adam as the optimizer. I tried to switch it to SGD, but then the network did not train at all.
I changed the initial learning rate from 5e-3 (Adam) to 5e-1 for SGD. I also experimented with various momentum values and batch sizes. Unfortunately, none of this worked: the loss just fluctuated around one value and never dropped.
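For reference, here is a minimal sketch of the optimizer swap I mean, using a toy stand-in model rather than the actual BinaryNet code (the model, data, and exact learning rates here are placeholders):

```python
# Hypothetical sketch: swapping Adam for SGD on a small placeholder model.
# This is NOT the BinaryNet code, just an illustration of the change.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)   # stand-in for the binarized network
criterion = nn.MSELoss()

# Original setup (roughly):
# optimizer = torch.optim.Adam(model.parameters(), lr=5e-3)
# My attempted replacement; note SGD may need a *smaller* learning rate
# than the 5e-1 I tried, plus momentum:
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

# Toy data so the snippet is runnable end to end
x = torch.randn(64, 10)
y = torch.randn(64, 1)

losses = []
for _ in range(50):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# On this toy regression the loss does decrease with SGD,
# so the plateau I see must be specific to the binarized setup.
print(losses[0] > losses[-1])
```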
Is there anything I should do or check?