Hello world!
I am working on some tasks connected with BERT and want to mask the input text with its special tokens.
For example, I want to turn: He loves you. I know.
Into: [CLS] He loves you. [SEP] I know. [SEP]
And then tokenize it.
How can I do this masking effectively?
Have you checked the original GitHub repository[1]? Out of the box it includes some classes for pre-training and fine-tuning; have you examined them?
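As a minimal sketch of the [CLS]/[SEP] insertion step, something like the following could work. It assumes a naive sentence split on `.`, `!`, `?` and a plain whitespace tokenizer as a stand-in; in a real pipeline you would use BERT's own WordPiece tokenizer (e.g. `BertTokenizer` from the `transformers` library, whose `encode` methods can add the special tokens for you).

```python
import re

def add_special_tokens(text):
    """Split text into sentences and wrap them with BERT's
    [CLS]/[SEP] markers, e.g.
    "He loves you. I know." -> "[CLS] He loves you. [SEP] I know. [SEP]"
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    # A production pipeline should use a proper sentence segmenter.
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", text.strip())
                 if s.strip()]
    return "[CLS] " + " [SEP] ".join(sentences) + " [SEP]"

def tokenize(marked_text):
    # Placeholder whitespace tokenizer; substitute BERT's WordPiece
    # tokenizer here in practice.
    return marked_text.split()

marked = add_special_tokens("He loves you. I know.")
tokens = tokenize(marked)
```

Here `marked` becomes `"[CLS] He loves you. [SEP] I know. [SEP]"`, and `tokens` is the corresponding whitespace-split list, ready to be replaced by real WordPiece output.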