NLP Big Picture

Published by duyanh on

In this post, I want to lay out the big picture of our NLP pillar: which topics we will cover and how we can grow together in this field.

Big Picture

Important topics in NLP

Knowledge Representation

In NLP, knowledge representation is about expressing knowledge (text, documents) as features that computational tasks can operate on. Well-known examples of such representations are word2vec and GloVe.

Knowledge representation can be critical for semantic search and for building recommender systems.
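To make the link to semantic search concrete, here is a minimal sketch of ranking words by cosine similarity between their vectors. The three-dimensional vectors and the vocabulary are made up for illustration; real word2vec or GloVe vectors are learned from a corpus and are much larger.

```python
import math

# Toy 3-dimensional "embeddings" with made-up values (an assumption for
# illustration only). Real word2vec/GloVe vectors are learned from text.
TOY_VECTORS = {
    "noodle": [0.9, 0.1, 0.0],
    "soup": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.3],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, vectors):
    """Rank every other word by cosine similarity to the query word."""
    q = vectors[query]
    scored = [(w, cosine_similarity(q, v)) for w, v in vectors.items() if w != query]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = most_similar("noodle", TOY_VECTORS)
print(ranking[0][0])  # the word whose vector is closest to "noodle"
```

The same ranking idea applies to documents once each document is also mapped to a vector, for example by averaging its word vectors.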

Seq2Seq Model (architecture)

Sequence-to-sequence (seq2seq) models play an important role in current NLP tasks such as:

  • Machine translation
  • Question answering
  • Dialogue engines
  • Image captioning

We would like to develop these models for the Vietnamese text domain. In particular, we hope to build a Vietnamese virtual assistant that can handle everyday Vietnamese dialogue naturally.

Generative Model

Generative models such as language models are very useful in supporting other NLP tasks, for example decoding in sequence-to-sequence models or language correction. A good Vietnamese language model would be a great tool for the NLP team to boost performance on other NLP tasks.
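As a rough illustration of how a language model can support correction, the sketch below trains a bigram model with add-one smoothing on a tiny made-up corpus and uses it to pick the more plausible of two candidate sentences. The corpus (Vietnamese written without diacritics for simplicity), the function names, and the smoothing choice are all assumptions for this example; a real Vietnamese language model would need far more data and proper tokenization.

```python
import math
from collections import Counter

def train_bigram_counts(sentences):
    """Count contexts and bigrams, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])  # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    vocab = {t for s in sentences for t in s.split()} | {"<s>", "</s>"}
    return contexts, bigrams, vocab

def log_prob(sentence, contexts, bigrams, vocab):
    """Log-probability of a sentence under add-one (Laplace) smoothing."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab)))
    return total

# Tiny made-up corpus, for illustration only.
corpus = ["toi an com", "toi an pho", "ban an com"]
contexts, bigrams, vocab = train_bigram_counts(corpus)

# Two candidate outputs, e.g. from a decoder or a spelling corrector.
candidates = ["toi com an", "toi an com"]
best = max(candidates, key=lambda s: log_prob(s, contexts, bigrams, vocab))
print(best)  # the candidate the model scores as more likely
```

In a seq2seq decoder the same score would typically be combined with the decoder's own score rather than used alone.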

Vietnamese NLP Dataset

Data is the fuel for all AI tasks. To build strong NLP pipelines and experiments, we need to start collecting and building Vietnamese datasets. The important elements in constructing a good dataset are:

  • Labelling tool
  • Data storage
  • Collecting public datasets
  • Data crawler

Memory Network

Dynamic Memory Networks (DMNs) have been reported as state of the art on Q&A benchmarks. Many NLP tasks can be framed as Q&A problems. To build a chatbot this way, we provide a series of input sentences and ask a question based on those sentences, and the model outputs the answer.
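The input format (a list of sentences plus a question) can be sketched with a toy example. The code below is not a DMN: it replaces the learned encoders and episodic memory with a simple word-overlap score, then applies a softmax to get attention-like weights over the input sentences. The sentences, the question, and the function name are assumptions chosen for illustration.

```python
import math

def attention_over_facts(facts, question):
    """Softmax over word-overlap scores between the question and each fact.

    A real memory network learns these scores; word overlap is only a
    stand-in so the example runs without any training.
    """
    q_words = set(question.lower().replace("?", "").split())
    scores = [len(q_words & set(f.lower().rstrip(".").split())) for f in facts]
    exp_scores = [math.exp(s) for s in scores]
    total = sum(exp_scores)
    return [e / total for e in exp_scores]

facts = ["Mary went to the kitchen.", "John went to the garden."]
question = "Where is Mary?"

weights = attention_over_facts(facts, question)
best_fact = facts[weights.index(max(weights))]
print(best_fact)  # the sentence the toy attention focuses on
```

A trained model would go one step further and generate the answer ("kitchen") from the attended sentences instead of returning the sentence itself.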

Reinforcement Learning for NLP

Reinforcement learning can be a good fit for NLP, because the system learns the behavior of a trainer in a simulated environment through trial and error. For example, to classify text from various domains without any labelled training data, we can create an environment (perhaps a classification game) and a software agent. The agent initially classifies the text (its action) arbitrarily. Based on the result, it receives a reward, which it uses to decide its next action.
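The classification game can be sketched as follows. This is a simplified, assumed setup: the environment holds a few made-up sentences with hidden labels, and the agent only sees a reward of 1 or 0 after each guess. Because there are no state transitions, it is closer to a contextual bandit with epsilon-greedy exploration than to full reinforcement learning. All names and hyperparameters are hypothetical.

```python
import random
from collections import defaultdict

LABELS = ["sports", "tech"]

# Hypothetical environment data. The agent never reads these labels
# directly; it only receives a reward after each guess.
HIDDEN = [
    ("the team won the match", "sports"),
    ("the striker scored a goal", "sports"),
    ("the phone has a new chip", "tech"),
    ("the laptop runs fast software", "tech"),
]

def score(q_values, words, label):
    """Sum of the learned value estimates for (word, label) pairs."""
    return sum(q_values[(w, label)] for w in words)

def choose_action(q_values, words, epsilon, rng):
    """Epsilon-greedy: explore at random, otherwise pick the best label."""
    if rng.random() < epsilon:
        return rng.choice(LABELS)
    return max(LABELS, key=lambda label: score(q_values, words, label))

def train(episodes=2000, epsilon=0.1, lr=0.1, seed=0):
    rng = random.Random(seed)
    q_values = defaultdict(float)  # value estimate per (word, label)
    for _ in range(episodes):
        text, true_label = rng.choice(HIDDEN)
        words = text.split()
        action = choose_action(q_values, words, epsilon, rng)
        reward = 1.0 if action == true_label else 0.0
        # Move each estimate for the chosen label toward the reward.
        for w in words:
            q_values[(w, action)] += lr * (reward - q_values[(w, action)])
    return q_values

def classify(q_values, text):
    """Greedy prediction with no exploration."""
    words = text.split()
    return max(LABELS, key=lambda label: score(q_values, words, label))

q_values = train()
predictions = [classify(q_values, text) for text, _ in HIDDEN]
print(predictions)
```

The agent starts by guessing arbitrarily, and the reward signal alone gradually shifts its estimates toward the labels the environment rewards.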

Automatic Speech Recognition

Speech is the most natural form of communication for us; it is second nature. Now our machines have started to recognize our speech, and they are getting better and better at communicating with us.

Voice assistants and devices like Amazon Alexa and Google Home are getting more and more popular each month. They are changing how we shop, how we search, and how we interact with our devices and even with each other.

It would be great if we could talk to “tigi”!

Besides these core concepts, we will dive deep into concepts across AI/ML/DL/RL.

We all aim to be a “FULL-STACK AI ENGINEER”, so let's grow together.

Cheers!

