Topic Modeling

Intro: The bag-of-words approach represents documents in a dataset directly by the words that appear in them. But often, these words depend on underlying parameters that vary among documents, such as the topic being discussed. In …
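As a rough illustration of the bag-of-words representation mentioned in this excerpt, the sketch below counts tokens per document and builds count vectors over a shared vocabulary. The two sample documents, the whitespace tokenization, and the names `bag_of_words`, `vocabulary`, and `vectors` are assumptions made for the example; they are not taken from the post itself.

```python
from collections import Counter


def bag_of_words(document):
    """Count how often each lowercase, whitespace-separated token occurs."""
    return Counter(document.lower().split())


# Hypothetical toy corpus, used only for illustration.
docs = ["the cat sat on the mat", "the dog chased the cat"]
bags = [bag_of_words(d) for d in docs]

# Shared vocabulary: every distinct token seen in any document, sorted.
vocabulary = sorted(set().union(*bags))

# One count vector per document, aligned with the vocabulary order.
vectors = [[bag[word] for word in vocabulary] for bag in bags]
```

Each vector records only which words occur and how often. Nothing in it captures a shared underlying topic, which is the gap that topic modeling tries to address.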

Implementation of RNN & LSTM

Learn how to represent memory in code. Then define and train RNNs in PyTorch and apply them to tasks that involve sequential data. The practical notebook can be found at http://14.232.166.121:8880/lab/workspaces/andy > char_rnn > Character_Level_RNN_Exercise.ipynb

Neural Turing Machines

In this post, we will focus on one of the two main foundations of Rasa Core, the Neural Turing Machine. We will also read through the original paper together, which can be found at https://arxiv.org/pdf/1410.5401.pdf. Reading paper is …

Project: Part of Speech Tagging

Introduction: In this project, you’ll use the Pomegranate library to build a hidden Markov model for part-of-speech tagging with a universal tagset. Hidden Markov models have been able to achieve >96% tag accuracy with larger tagsets on realistic text corpora. Hidden …

Text Processing

In this post, you’ll learn how to read text data from different sources and prepare it for feature extraction. You’ll begin by cleaning it to remove irrelevant items, such as HTML tags. You will then normalize text by converting it …
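A minimal sketch of the cleaning step described in this excerpt might look like the following. It uses only the Python standard library. The helper name `clean_text`, the regex-based tag stripping, and the choice to lowercase and drop punctuation are assumptions for illustration, since the excerpt is cut off before it says which normalization steps the post applies.

```python
import html
import re


def clean_text(raw):
    """Strip HTML tags, decode entities, lowercase, and collapse whitespace."""
    # Replace anything that looks like an HTML tag with a space.
    # (A regex is a simplification; a real HTML parser is more robust.)
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    # Decode entities such as &amp; into their literal characters.
    unescaped = html.unescape(no_tags)
    # Assumed normalization: lowercase, then drop non-alphanumeric characters.
    lowered = unescaped.lower()
    no_punct = re.sub(r"[^a-z0-9\s]", " ", lowered)
    # Collapse runs of whitespace into single spaces.
    return " ".join(no_punct.split())


cleaned = clean_text("<p>Hello, <b>World</b> &amp; all!</p>")
```

The output of a step like this is plain, normalized text that can be tokenized and passed on to feature extraction.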