I am using an `nn.Embedding` module to encode words for a Bayesian Skip-gram model. Multiple embeddings are summed and then passed through a ReLU activation. The problem, however, is that all weights in the `nn.Embedding` module are negative, resulting in all-zero activations after the ReLU. I am using `max_norm=1.0` on the embedding module. Does that have something to do with it?
This only happens when I train on a large corpus, not on a smaller development set. The dimensions of `nn.Embedding` are around `[10000, 300]` on the large set and `[250, 30]` on the development set. I have also trained the model multiple times, and every run ends with negative embedding weights.
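For reference, here is a minimal sketch of the setup I described (the dimensions, batch shapes, and variable names are placeholders, not my actual model code):

```python
import torch
import torch.nn as nn

# Placeholder dimensions matching the large-corpus setup described above.
vocab_size, embed_dim = 10000, 300
emb = nn.Embedding(vocab_size, embed_dim, max_norm=1.0)

# A batch of context word indices: sum their embeddings, then apply ReLU.
context = torch.randint(0, vocab_size, (4, 5))   # [batch, window]
summed = emb(context).sum(dim=1)                 # [batch, embed_dim]
activated = torch.relu(summed)                   # all-zero rows if summed <= 0

# Diagnostic: fraction of embedding weights that are negative
# (close to 0.5 at initialization; in my runs it drifts toward 1.0).
frac_negative = (emb.weight < 0).float().mean().item()
print(f"negative weight fraction: {frac_negative:.2f}")
```

Note that with `max_norm` set, the forward pass renormalizes the selected embedding rows in place, which is one of the things I suspect might interact badly with training.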