T5 tokenizer differences

I am not an expert here, but this question has been on my mind for a while. I understand that the difference between the pre-trained T5 models is the number of layers and, consequently, the number of parameters. But what is the difference between the pre-trained tokenizers? All the models are pre-trained on C4, and if the tokenizer is also trained on the C4 corpus, then why load the tokenizer under different names? Is the pre-trained tokenizer actually the same for all models, and when we load it under a given model name we just go through that model's config, which points to the same pre-trained tokenizer?
Actually, I have tried the three tokenizers (small, base, large) on small samples of text and did not notice any difference. Comparing the vocabularies of the three tokenizers, I found that they are identical.
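For reference, this is roughly the comparison I ran (a minimal sketch using the standard `t5-small`, `t5-base`, and `t5-large` checkpoints from `transformers`; the sample sentence is just a placeholder):

```python
from transformers import T5Tokenizer

# The tokenizers are loaded under the three different checkpoint names
tok_small = T5Tokenizer.from_pretrained("t5-small")
tok_base = T5Tokenizer.from_pretrained("t5-base")
tok_large = T5Tokenizer.from_pretrained("t5-large")

sample = "The quick brown fox jumps over the lazy dog."

# Tokenizing a small sample gives the same pieces with each tokenizer
print(tok_small.tokenize(sample))
print(tok_base.tokenize(sample))
print(tok_large.tokenize(sample))

# The full vocabularies also compare as equal
print(tok_small.get_vocab() == tok_base.get_vocab() == tok_large.get_vocab())
```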
Another question, please, and correct me if I am wrong. To my knowledge, the tokenizer and the data distribution go hand in hand when training any model. If I want to pre-train the T5 model with different numbers of layers on masked language modeling on, let us say, any English text dataset from the Hugging Face Hub, do I need to train the tokenizer on this corpus, or is it enough to use the pre-trained T5 tokenizer? To make the question concrete, I mean something like the sketch below.
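This is only a rough sketch of what "training the tokenizer on the new corpus" would look like, assuming the `train_new_from_iterator` method of fast tokenizers; the dataset name and vocabulary size are placeholders:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Placeholder English corpus from the Hub; any text dataset would do
dataset = load_dataset("wikitext", "wikitext-103-raw-v1", split="train")

# Start from the pre-trained T5 tokenizer (SentencePiece / Unigram based)
old_tokenizer = AutoTokenizer.from_pretrained("t5-small")

def batch_iterator(batch_size=1000):
    # Yield batches of raw text for the tokenizer trainer
    for i in range(0, len(dataset), batch_size):
        yield dataset[i : i + batch_size]["text"]

# Re-train the same tokenization algorithm on the new corpus
# (32_000 roughly matches the original T5 vocabulary size)
new_tokenizer = old_tokenizer.train_new_from_iterator(batch_iterator(), vocab_size=32_000)
new_tokenizer.save_pretrained("t5-retrained-tokenizer")
```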