Natural Language Processing

Natural language processing (NLP) is a branch of artificial intelligence concerned with designing algorithms that process natural language data. The three primary processes involved in NLP are speech recognition, natural language understanding, and natural language generation.

Modern NLP relies on machine learning to handle language tasks. These approaches use statistical inference to produce robust models that can cope with unanticipated input, and their accuracy generally improves as they are trained on more data.

The major tasks of natural language processing deal with syntax, semantics, discourse, and speech recognition.

Syntax-related tasks include:

  • Part-of-speech tagging (determining the part of speech – noun, verb, adverb, adjective – for each word in a sentence)
  • Parsing (grammatically analyzing a sentence)
  • Stemming and lemmatization (reducing the inflected forms of each word to a common base or root)
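
To make the idea of stemming concrete, here is a minimal Python sketch that strips a few common English suffixes. The suffix list, the minimum stem length, and the function names `naive_stem` and `stem_sentence` are assumptions chosen for illustration; this is a toy rule set, not the Porter stemmer or any library's implementation, and it will mis-stem many words.

```python
# Toy suffix-stripping stemmer (illustrative assumption, not a standard algorithm).
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical rule list, checked in this order
MIN_STEM_LENGTH = 3                  # assumed guard so short words are left alone


def naive_stem(word: str) -> str:
    """Lowercase the word and strip the first matching suffix, if any."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        # Only strip when enough of the word remains to act as a stem.
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= MIN_STEM_LENGTH:
            return lowered[: -len(suffix)]
    return lowered


def stem_sentence(sentence: str) -> list[str]:
    """Split on whitespace and stem each token independently."""
    return [naive_stem(token) for token in sentence.split()]


if __name__ == "__main__":
    print(stem_sentence("The dogs walked"))
```

Under these rules "walking", "walked", and "walks" all reduce to "walk", while "running" becomes "runn", which shows why practical stemmers need more careful rules and why lemmatization relies on a dictionary instead.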

Tasks involving semantics include:

  • Machine translation (translating text from one human language to another)
  • Named entity recognition (identifying proper names, such as those of people and places)
  • Sentiment analysis (inferring subjective information, such as public opinion trends in social media)
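
As a rough illustration of sentiment analysis, the sketch below scores text by counting words from two small hand-written word lists. The word lists and the names `sentiment_score` and `sentiment_label` are hypothetical; real systems are typically trained on labeled data rather than built from a fixed lexicon like this one.

```python
# Lexicon-based sentiment sketch; the word lists are illustrative assumptions.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "poor"}


def sentiment_score(text: str) -> int:
    """Return (# positive words) minus (# negative words) in the text."""
    # Strip basic punctuation from token edges and lowercase before lookup.
    tokens = [token.strip(".,!?").lower() for token in text.split()]
    return sum((token in POSITIVE) - (token in NEGATIVE) for token in tokens)


def sentiment_label(text: str) -> str:
    """Map the numeric score to a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(sentiment_label("I love this great phone!"))
```

A word-counting approach like this ignores negation and context ("not good" still scores as positive), which is one reason statistical models are preferred in practice.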

Discourse tasks are principally centered on providing automatic summarization of documents. The goal of speech recognition is to take an audio file of someone talking and create a textual representation that can be further processed.
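
One simple way to approach automatic summarization is extractive: score each sentence and keep the highest-scoring ones. The sketch below ranks sentences by the average document-wide frequency of their words. The scoring rule, the regular expressions, and the name `summarize` are assumptions made for illustration, not a description of any particular summarization system.

```python
import re
from collections import Counter


def summarize(text: str, max_sentences: int = 1) -> str:
    """Return the top-scoring sentences, kept in their original order."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole document (lowercased).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        # Average frequency of the sentence's words; assumed scoring rule.
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    print(summarize("Cats sleep. Cats eat fish. Dogs bark.", max_sentences=2))
```

Frequency-based scoring favors sentences that repeat the document's most common words, so a practical version would at least filter out stop words such as "the" and "of" before counting.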
