Equivalent XOR implementation (dynet, chainer, pytorch)

iceiony · May 19, 2017, 4:33pm

In this link you can find an equivalent implementation of XOR for different dynamic NN frameworks (dynet, chainer, pytorch).

Although the implementation is parametrised the same for all frameworks, I’m finding that dynet converges to a solution in fewer iterations. I’m hoping someone can point out why.

jekbradbury · May 19, 2017, 11:03pm

Since the model is very small, I’d walk through it step by step with the same initial weight tensor and same data; each layer and the optimization step should produce exactly the same output between the three frameworks (but it looks like it won’t).
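A minimal sketch of that kind of walk, written in plain Python so it can serve as a reference that does not depend on any of the three frameworks. The 2-2-1 layout, the weight values, and the sigmoid output are assumptions made for illustration and are not taken from the linked code. The hidden activation is passed in explicitly so each intermediate value can be compared against what a framework reports for the same weights and input.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical fixed starting weights for a 2-2-1 network
# (illustrative values, not taken from the linked implementation).
W1 = [[0.5, -0.5], [0.3, 0.8]]  # one row per hidden unit
b1 = [0.0, 0.0]
W2 = [1.0, -1.0]
b2 = 0.0

def forward(x, hidden_act):
    """Return (pre-activations, hidden outputs, network output) for one input."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    hid = [hidden_act(p) for p in pre]
    out = sigmoid(sum(w * h for w, h in zip(W2, hid)) + b2)
    return pre, hid, out

XOR_INPUTS = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

# Print every intermediate value so it can be checked layer by layer
# against each framework's output for the same weights and data.
for x in XOR_INPUTS:
    for name, act in (("sigmoid", sigmoid), ("tanh", math.tanh)):
        pre, hid, out = forward(x, act)
        print(x, name, pre, hid, out)
```

If two frameworks agree on the pre-activations but not on the hidden outputs, the difference is in the activation; if they agree on the whole forward pass but diverge after one update, the difference is in the loss or the optimizer.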

iceiony · May 20, 2017, 9:14am

Ah, I was hoping not to have to do that.
I’m guessing there is no regularisation/normalisation applied in PyTorch by default.

smth · May 20, 2017, 6:43pm

No, there is no regularization by default; you can configure it in the optimizer options.
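A small sketch of what that looks like, assuming a plain SGD optimizer and a 2-2-1 model. Both the model and the hyperparameter values are illustrative assumptions, not the settings from the linked code. In PyTorch the L2 penalty is exposed through the optimizer's `weight_decay` argument, which defaults to 0.

```python
import torch

# Hypothetical 2-2-1 model, used only to have some parameters to optimize.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 2),
    torch.nn.Sigmoid(),
    torch.nn.Linear(2, 1),
)

# Default: no weight decay (no L2 regularization).
plain = torch.optim.SGD(model.parameters(), lr=0.1)

# Opt in to L2 regularization through the optimizer options.
decayed = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)

print(plain.defaults["weight_decay"])
print(decayed.defaults["weight_decay"])
```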

iceiony · May 22, 2017, 10:18am

I found that the hidden layer had a tanh activation in the dynet version. Sorry for the mistake; now all frameworks learn at the same rate.