I am using a CNN to classify text. The CNN input dimensions are B×M, where B is the batch size and M is the number of words/features (zero-padded to the length of the longest example in the batch). M varies from batch to batch. I would like to batch-normalize the input before the nn.Embedding() layer. How can I do this if I do not know the size of M a priori when setting up the model?
class Testmodel(nn.Module):
    def __init__(self, args):
        super(Testmodel, self).__init__()
        self.args = args
        V = args.embed_num  # vocabulary size
        D = args.embed_dim  # embedding dimension
        self.embed = nn.Embedding(V, D)
        ......
        self.fc1 = nn.Linear(1000, C)  # C: number of classes, defined in the omitted part
It seems to me that batch norm is not applicable to your situation.
For the training phase, it could actually work, since batch norm recomputes its statistics on every batch; in practice B and M can change from one batch to another.
For the test phase, it wouldn't be possible. A CNN's output for a given example must be independent of the other examples in the batch: if you predict on X1 alone and get y1, then predicting on the batch [X1, X2] should still yield y1 for X1, not some value that depends on X2.
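To make that concrete, here is a small sketch (the feature size 4 is arbitrary, not from your model): in eval mode batch norm uses its stored running statistics, so each example's output is independent of its batch companions; in train mode it uses the current batch's statistics, so the companions matter.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)  # 4 features, chosen only for illustration
x1 = torch.randn(1, 4)
x2 = torch.randn(1, 4)
x3 = torch.randn(1, 4)

# Test phase: fixed running mean/var -> output for x1 does not depend on x2.
bn.eval()
y1_alone = bn(x1)
y1_batched = bn(torch.cat([x1, x2]))[0:1]
print(torch.allclose(y1_alone, y1_batched))  # True

# Training phase: per-batch statistics -> output for x1 depends on the batch.
bn.train()
z_a = bn(torch.cat([x1, x2]))[0:1]
z_b = bn(torch.cat([x1, x3]))[0:1]
print(torch.allclose(z_a, z_b))  # False
```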
Therefore, during the test phase, the mean and variance vectors used by batch norm are fixed; thus, their size (M) is also fixed.
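This is also visible in the module itself (M = 10 here is an arbitrary illustration): nn.BatchNorm1d stores one running mean and variance per feature, sized at construction time, so a batch padded to a different M is rejected.

```python
import torch
import torch.nn as nn

M = 10  # feature count is fixed when the module is built
bn = nn.BatchNorm1d(M)
print(bn.running_mean.shape)  # torch.Size([10]) -- one statistic per feature

bn.eval()
bn(torch.randn(4, M))  # a batch with M features works fine
try:
    bn(torch.randn(4, M + 5))  # a batch padded to a different length does not
except RuntimeError as e:
    print("shape mismatch:", e)
```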