How to perform character dropout at the input layer?

I want to perform dropout at the character level, meaning replacing some chars with the zero vector. Should I do this manually before I convert the one-hot encodings to char embeddings, or use


where inp has size [max_sent_len x batch_size x vocab_size] and holds the one-hot encoding of each character over the char vocabulary.

For the last one, with p=0.5, I've seen that it turns the 1 in the one-hot encoding into 2 for the values it doesn't zero (it scales each kept value by 1/(1-p), which is 2 here). How does this affect the behavior in the end?
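
To make the comparison concrete, here is a minimal sketch of the two options. It assumes PyTorch and uses F.dropout as a stand-in for the built-in dropout call. The small sizes and the names char_ids, keep, elementwise and manual are only illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Illustrative sizes only (assumption, not taken from a real model).
max_sent_len, batch_size, vocab_size = 4, 2, 5
p = 0.5

# One-hot input of shape [max_sent_len, batch_size, vocab_size].
char_ids = torch.randint(vocab_size, (max_sent_len, batch_size))
inp = F.one_hot(char_ids, num_classes=vocab_size).float()

# Option A: built-in element-wise dropout on the one-hot tensor.
# Each entry is zeroed with probability p; kept entries are scaled by 1/(1-p).
elementwise = F.dropout(inp, p=p, training=True)

# Option B: manual character dropout, one Bernoulli draw per character.
# The mask has a trailing dimension of 1 so it broadcasts over the vocabulary,
# zeroing the whole one-hot vector of a dropped character. No rescaling is applied.
keep = (torch.rand(max_sent_len, batch_size, 1) >= p).float()
manual = inp * keep

print(elementwise.sum(-1))  # per character: 0 (dropped) or 1/(1-p) (kept)
print(manual.sum(-1))       # per character: 0 (dropped) or 1 (kept)
```

As far as I can tell, because each one-hot vector has a single non-zero entry, element-wise dropout either removes the whole character or keeps it scaled by 1/(1-p). So the two options differ only in that scaling, which carries over to the surviving char embeddings once the input goes through a linear embedding layer.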

Which one is the correct way?