Hi everyone, I am testing some models on an NER task and now I would like to use a CNN.
In particular, I am using a model which takes as input samples with a length of 512 tokens.
These are fed into a BERT model whose hidden_size = 768.
The BERT output shape is (batch_size, length, hidden_size) = (16, 512, 768).
Now I would like to add a CNN layer that takes these values as input and passes its output to a Linear layer, which in turn outputs values with shape (16, 512, num_tags), where num_tags is the number of classes to predict.
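To make the shapes concrete, here is a minimal sketch of that pipeline, assuming PyTorch and a single `nn.Conv1d` layer. The class name `CNNTagger`, `conv_channels=256`, `kernel_size=3`, and `num_tags=9` are placeholder choices for illustration, and a random tensor stands in for the real BERT output.

```python
import torch
import torch.nn as nn


class CNNTagger(nn.Module):
    """Hypothetical Conv1d + Linear head applied on top of BERT token embeddings."""

    def __init__(self, hidden_size=768, conv_channels=256, kernel_size=3, num_tags=9):
        super().__init__()
        # With stride 1 and an odd kernel_size, padding = kernel_size // 2
        # keeps the sequence length unchanged (512 in, 512 out).
        self.conv = nn.Conv1d(
            in_channels=hidden_size,
            out_channels=conv_channels,
            kernel_size=kernel_size,
            padding=kernel_size // 2,
        )
        self.classifier = nn.Linear(conv_channels, num_tags)

    def forward(self, bert_output):
        # Conv1d expects (batch, channels, length), so swap the last two axes.
        x = bert_output.transpose(1, 2)   # (batch, hidden_size, length)
        x = torch.relu(self.conv(x))      # (batch, conv_channels, length)
        x = x.transpose(1, 2)             # (batch, length, conv_channels)
        return self.classifier(x)         # (batch, length, num_tags)


# Random stand-in for the BERT output of shape (16, 512, 768).
bert_output = torch.randn(16, 512, 768)
logits = CNNTagger()(bert_output)
print(logits.shape)
```

The key point is the transpose: `nn.Conv1d` treats the second axis as channels, so hidden_size has to be moved there before the convolution and moved back before the Linear layer.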
I only recently started working with CNNs, and I am having trouble choosing the various hyperparameters and getting the dimensions right.
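For the dimension question, a small helper like the one below may help when trying out hyperparameters. It is a sketch that follows the output-length formula given in the PyTorch `Conv1d` documentation; the function name `conv1d_output_length` is made up for this example.

```python
def conv1d_output_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Sequence length after a 1D convolution (formula from the Conv1d docs)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# A few settings for a 512-token input:
same_length = conv1d_output_length(512, kernel_size=3, padding=1)          # 512
no_padding = conv1d_output_length(512, kernel_size=3)                      # 510
strided = conv1d_output_length(512, kernel_size=3, stride=2, padding=1)    # 256
print(same_length, no_padding, strided)
```

Since every token needs a tag, only settings that return 512 are directly usable here; with stride 1 and an odd kernel size, that means padding = (kernel_size - 1) / 2 when dilation is 1.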