In my courses, I share prepared Jupyter notebooks with my students. All notebooks that handle textual data currently still use torchtext (e.g., this one and this one). Now that torchtext is no longer maintained/developed, I would like to “refresh” the notebooks to remove any use of torchtext.
What would be the recommended best practice? I basically used torchtext only for building the vocabulary and then mapping tokens/words to their respective indices, and vice versa.
Before using torchtext, I actually used my own implementation for that, so I know how to do it. But I would prefer to stick to popular libraries/tools/etc. and best practices to streamline my notebooks and keep the code clean.
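To make the requirement concrete, a minimal plain-Python sketch of the functionality in question might look like the following. This is only an illustration, not the code from my notebooks: the class name `Vocab`, the methods `encode`/`decode`, the `min_freq` parameter, and the special tokens `<pad>` and `<unk>` are all assumptions chosen for the example, and none of it is torchtext's API.

```python
from collections import Counter


class Vocab:
    """Minimal token <-> index mapping (hypothetical helper, not a library API)."""

    def __init__(self, token_lists, min_freq=1, specials=("<pad>", "<unk>")):
        # Count every token across all tokenized documents.
        counter = Counter(tok for toks in token_lists for tok in toks)
        # Special tokens come first so they get fixed, low indices.
        self.itos = list(specials)
        # Sort by descending frequency, then alphabetically, for determinism.
        for tok, freq in sorted(counter.items(), key=lambda x: (-x[1], x[0])):
            if freq >= min_freq and tok not in specials:
                self.itos.append(tok)
        self.stoi = {tok: idx for idx, tok in enumerate(self.itos)}
        # Assumes "<unk>" is among the specials; unknown tokens map to it.
        self.unk_index = self.stoi["<unk>"]

    def __len__(self):
        return len(self.itos)

    def encode(self, tokens):
        """Map tokens to indices, falling back to the <unk> index."""
        return [self.stoi.get(tok, self.unk_index) for tok in tokens]

    def decode(self, indices):
        """Map indices back to tokens."""
        return [self.itos[i] for i in indices]


# Toy, already-tokenized corpus (made up for this example).
corpus = [["the", "cat", "sat"], ["the", "dog", "sat", "down"]]
vocab = Vocab(corpus)

ids = vocab.encode(["the", "cat", "flew"])  # "flew" is out of vocabulary
tokens = vocab.decode(ids)
```

The resulting index lists can be turned into tensors with `torch.tensor(ids)` as usual. So the question is really whether a well-maintained library covers this, rather than each notebook carrying its own copy of such a helper.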
Any suggestions are welcome!