TextCNN: Kernel size can't be greater than actual input size

Hi, when I run my TextCNN model (my filter sizes are [3, 4, 5]), it runs successfully for the first 12 epochs (out of 20 in total), but at the start of epoch 13 it raises this error:

RuntimeError: Calculated padded input size per channel: (4 x 100). Kernel size: (5 x 100). Kernel size can't be greater than actual input size

I ran into the RuntimeError above yesterday on Colab. However, today my code runs successfully with filter sizes = [3, 4, 5] on Colab.

Could somebody help me understand what is going on? Any ideas?

Are you padding the input based on the largest sequence length?
If so, you might get unlucky and a batch might contain “short” examples, thus your activation might be too small at this particular layer.
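For example, a kernel of height 5 cannot slide over a batch whose (padded) sequences are only 4 tokens long. A minimal sketch of that failure mode (the shapes are made up for illustration):

import torch
import torch.nn as nn

embed_dim = 100
conv = nn.Conv2d(1, 32, kernel_size=(5, embed_dim))  # corresponds to filter size 5

# a batch whose longest sequence is only 4 tokens
x = torch.randn(8, 1, 4, embed_dim)  # [batch, channels, seq_len, embed_dim]
out = conv(x)  # RuntimeError: Kernel size can't be greater than actual input size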

@ptrblck

Thank you for your reply. I didn't pad the input to the largest sequence length; the model is just embed -> convolution -> max pooling -> concat -> output.

Any ideas on how to solve the RuntimeError?

The error points to an activation that is too small for the specified kernel size.
You would either have to enlarge your input, use a smaller kernel, or remove some pooling layers.
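If you want to keep the kernel size, one way to enlarge the input is to zero-pad short activations along the sequence dimension before the convolution. A minimal sketch, assuming the embedded input has shape [batch, 1, seq_len, embed_dim] and filter_sizes = [3, 4, 5]:

import torch.nn.functional as F

def pad_to_min_length(x, min_len):
    # x: [batch, 1, seq_len, embed_dim]; zero-pad the sequence dimension if it is too short
    if x.size(2) < min_len:
        x = F.pad(x, (0, 0, 0, min_len - x.size(2)))
    return x

# inside the model's forward, before the convolutions:
# x = pad_to_min_length(x, max(filter_sizes))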

@ptrblck

Thank you. Solved it by switching to a smaller kernel.
If I want to enlarge my input, do you have any suggestions for doing that with torchtext? I use torchtext to pre-process my text data.

SENTI_TOKENS = data.Field(batch_first=True,
                          tokenize=tokenize_and_cut,
                          preprocessing=tokenizer.convert_tokens_to_ids)

I think only padding would make sense for text data, so a smaller kernel seems to be more reasonable.

Could you please share some example code for padding with torchtext?
I tried fix_length, a parameter of data.Field(); would that pad the input?
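Something like this, maybe? (Just a sketch of what I mean; the value 500 is only an example.)

SENTI_TOKENS = data.Field(batch_first=True,
                          tokenize=tokenize_and_cut,
                          preprocessing=tokenizer.convert_tokens_to_ids,
                          fix_length=500)  # pad/truncate every example to 500 tokens?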

Your text length must be greater than or equal to your maximum filter size.

For example:

filter_sizes = [2, 3, 4]

The maximum filter size is 4, so your text should be at least 4 words long.

Example of predicting a single text (batch size 1):

def pred(text):
    # preprocess the text
    tokenized = text_cleaner(text).split()

    # pad to the maximum filter size (4 here)
    if len(tokenized) < 4:
        tokenized += ['<pad>'] * (4 - len(tokenized))  # I use '<pad>' because my vocab maps '<pad>' to 1

    # convert words to indices; stoi.get(w, 0) maps a word to its index if it is
    # in the vocab, otherwise to '<unk>', which is mapped to 0
    vectors = [TEXT.vocab.stoi.get(w, 0) for w in tokenized]

    # convert the list of indices to a tensor with a batch dimension of 1
    vectors = torch.tensor(vectors, dtype=torch.long, device=device).unsqueeze(dim=0)

    # predict
    with torch.no_grad():
        output = model(vectors)

    return output

Even if your batch size is greater than one (for example, 32), the sequence lengths must not be less than the maximum filter size.
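For a batch you can apply the same idea when building the batch. A rough sketch (the pad index 1 and the device variable follow my own setup above):

def pad_batch(token_id_seqs, min_len=4, pad_idx=1):
    # every sequence must be at least as long as the maximum filter size,
    # and all sequences in the batch must have the same length
    max_len = max(min_len, max(len(seq) for seq in token_id_seqs))
    padded = [seq + [pad_idx] * (max_len - len(seq)) for seq in token_id_seqs]
    return torch.tensor(padded, dtype=torch.long, device=device)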

I hope it’s clear now