Observing Opinions: What is Pre-Processing?

09/09/2025 · 19 min · Season 1, Episode 4

Episode Synopsis

In this episode, Prof. Jamal Abdul Nasir from the University of Galway explains why pre-processing is the backbone of all text analysis. He breaks down key steps: defining documents, tokenization, stop-word removal, unification, and stemming versus lemmatization. Jamal also explains unigrams versus bigrams and how modern NLP techniques such as byte-pair encoding are changing how text is pre-processed. Finally, he shares practical tips for making your pre-processing transparent and reproducible, so that your research holds up to scrutiny and scales to larger corpora.
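For readers who want to see what these steps look like in practice, here is a minimal Python sketch of a pre-processing pipeline. It is not taken from the episode. The stop-word list, the suffix-stripping rule, and all function names are illustrative assumptions, and "unification" is interpreted here simply as lowercasing.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in"}


def tokenize(document):
    """Lowercase the text (a simple form of unification) and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", document.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip a few common suffixes. A stand-in for a real stemmer, not the Porter algorithm."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples (n=1 unigrams, n=2 bigrams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def preprocess(document):
    """Tokenize, remove stop words, then stem each remaining token."""
    tokens = remove_stop_words(tokenize(document))
    return [crude_stem(t) for t in tokens]


tokens = preprocess("The voters are sharing opinions in the debates")
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
print(tokens)
print(bigrams)
```

Note that the crude stemmer turns "sharing" into "shar", which is not a word. That is the typical trade-off between stemming and lemmatization: a lemmatizer would return the dictionary form "share" but needs a vocabulary to do so. Keeping every step in a named function like this, with the stop-word list written out, is one simple way to make the pipeline transparent and reproducible.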

More episodes of the podcast "What is it about computational communication science?"