Embedding in PyTorch

The first parameter passed to the initialization determines the size of the embedding dictionary. Is there a guide for how big this should be? Does every element we want to embed have to be smaller than the size of the dictionary, or are the elements hashed?

num_embeddings is defined as the number of distinct indices you would like to pass in, in the range [0, num_embeddings-1]. E.g. if you are dealing with 100 different indices in [0, 99], num_embeddings would be set to 100.
You can pick the value for embedding_dim and check what works best for your use case.
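To make this concrete, here is a minimal sketch of that example: 100 indices in [0, 99], with an arbitrarily chosen embedding_dim of 16 (any positive value would work):

```python
import torch
import torch.nn as nn

# 100 distinct indices in [0, 99]; embedding_dim=16 is an arbitrary choice
emb = nn.Embedding(num_embeddings=100, embedding_dim=16)

idx = torch.tensor([0, 5, 99])  # every index must be < num_embeddings
out = emb(idx)
print(out.shape)  # torch.Size([3, 16])
```

Passing an index of 100 or higher here would raise an index error, since the lookup table only has 100 rows.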

Yes, though the values I am trying to embed are sparsely scattered between 1 and 10000. So if I set the first parameter to 10000, it seems like a lot of wasted space.

Would it be possible to shift the values to a contiguous range?
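One common way to do that shift is to build a lookup from the sparse raw values to contiguous indices, so num_embeddings only needs to cover the values that actually occur. A sketch, with made-up raw values for illustration:

```python
import torch
import torch.nn as nn

values = [3, 17, 9542, 128, 17]  # sparse raw values scattered in [1, 10000]

# Map each distinct raw value to a contiguous index in [0, len(vocab)-1]
vocab = {v: i for i, v in enumerate(sorted(set(values)))}

indices = torch.tensor([vocab[v] for v in values])
emb = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)
out = emb(indices)
print(out.shape)  # torch.Size([5, 8])
```

With this remapping the table holds only len(vocab) rows (4 here) instead of 10000, so no space is wasted on values that never appear. The same vocab dict must be reused at inference time so identical raw values map to identical rows.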