I hope this doesn’t violate any community guidelines; it’s just about sharing free resources.
I’m co-teaching an undergrad NLP course at the National University of Singapore. As part of our course materials, we provide our students with a collection of supplementary Jupyter notebooks for a more hands-on experience. By now, these notebooks are polished enough to share on GitHub.
So if you’re new to NLP, these notebooks might be useful. They are organized into three parts:
Foundations: Regular Expressions, Data Preprocessing (tokenization, normalization, stemming/lemmatization), N-gram Language Models, Vector Space Model, Naive Bayes Classification
Core Tasks: POS Tagging, Constituency and Dependency Parsing, Word Semantics, Keyword Extraction, Basic Text Summarization
Neural NLP: Multi-Layer Perceptrons (MLPs), Recurrent Neural Networks (RNNs), Word Embeddings (Word2Vec: CBOW + Skip-gram), Transformers, Applications (sentiment analysis, machine translation, language modeling)
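To give a flavor of the level these materials aim at, here is a minimal bigram language model of the kind covered in the Foundations part — a sketch I wrote for this post using only the Python standard library, not an excerpt from the notebooks themselves:

```python
from collections import Counter, defaultdict

def train_bigram_lm(sentences):
    # Count bigram frequencies over tokenized sentences,
    # padding each sentence with start/end markers.
    bigrams = defaultdict(Counter)
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, curr in zip(padded, padded[1:]):
            bigrams[prev][curr] += 1
    return bigrams

def bigram_prob(bigrams, prev, curr):
    # Maximum-likelihood estimate of P(curr | prev); 0.0 if prev is unseen.
    total = sum(bigrams[prev].values())
    return bigrams[prev][curr] / total if total else 0.0

corpus = [["i", "like", "nlp"], ["i", "like", "cats"]]
model = train_bigram_lm(corpus)
print(bigram_prob(model, "i", "like"))    # 1.0
print(bigram_prob(model, "like", "nlp"))  # 0.5
```

The notebooks build on this kind of counting approach before introducing smoothing and, later, neural alternatives.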
All examples in the Neural NLP part use PyTorch. I’ve already shared some individual notebooks in recent posts of mine here, so I figured I’d link the whole repository.
Again, these notebooks are for beginners and focus on learning and understanding the basics rather than the state of the art in NLP research.
Please let me know if you have any questions, or even suggestions for improvement.