Topic Modeling

Intro The bag-of-words approach tries to represent documents in a dataset directly using the words that appear in them. But often these words depend on underlying parameters that vary among documents, such as the topic being discussed. In this blog, we’ll begin by discussing this hidden or latent Read more…
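
To make the bag-of-words idea concrete, here is a minimal sketch that turns documents into word-count vectors over a shared vocabulary. The two sample sentences, the whitespace tokenizer, and the `bag_of_words` helper are assumptions chosen for illustration, not taken from the post.

```python
from collections import Counter


def bag_of_words(doc: str) -> Counter:
    """Count word occurrences in one document (hypothetical helper)."""
    # Lowercase and split on whitespace; real pipelines usually tokenize more carefully.
    return Counter(doc.lower().split())


# Illustrative toy corpus (an assumption, not data from the post).
docs = ["the cat sat on the mat", "the dog chased the cat"]
bags = [bag_of_words(d) for d in docs]

# Shared vocabulary: every distinct word across all documents, in sorted order.
vocab = sorted(set().union(*bags))

# One count vector per document, aligned with the vocabulary.
vectors = [[bag[word] for word in vocab] for bag in bags]
```

Each vector records only which words occur and how often, so any topic the documents share is visible only indirectly through overlapping counts.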

Text Processing

In this post, you’ll learn how to read text data from different sources and prepare it for feature extraction. You’ll begin by cleaning it to remove irrelevant items, such as HTML tags. You will then normalize the text by converting it to lowercase and removing punctuation and extra spaces. Next, you Read more…
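
The cleaning and normalization steps named above could be sketched roughly as follows, using only the Python standard library. The `clean_text` helper is a hypothetical name, and the regex-based tag removal is a simplifying assumption; a real HTML parser is more robust on messy markup.

```python
import html
import re
import string

# Map every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(raw: str) -> str:
    """Strip HTML tags, lowercase, and remove punctuation and extra spaces."""
    # Remove anything that looks like an HTML tag (rough regex, not a full parser).
    text = re.sub(r"<[^>]+>", " ", raw)
    # Decode entities such as &amp; that the markup leaves behind.
    text = html.unescape(text)
    # Normalize case.
    text = text.lower()
    # Replace punctuation with spaces so adjacent words do not merge.
    text = text.translate(_PUNCT_TO_SPACE)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_text("<p>Hello,   World!</p>")
```

Replacing punctuation with spaces rather than deleting it keeps strings like "end.Next" from collapsing into a single token.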