NLP Task examples

In this post, we will take a quick look at a selection of NLP tasks, each with a short explanation.

1. Named Entity Recognition (NER) : Locates named entities in text and classifies them into predefined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages.

2. Semantic textual similarity : Determines how similar two pieces of text are in meaning. This can take the form of assigning a score from 1 to 5. Related tasks are paraphrase identification and duplicate identification.

3. Question Answering (QA) : Builds systems that automatically answer questions posed by humans in a natural language.

4. Paraphrase Detection : Detecting whether multiple phrases have the same meaning.

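5. Information retrieval / extraction : Information retrieval finds the documents that might contain the information a user requires. It does not explicitly return answers to the user's questions; it reports the existence and location of documents that might contain the required information. Information extraction, by contrast, pulls structured facts out of unstructured text.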

6. Semantic role labeling : Assigns labels to the words or phrases in a sentence that indicate their semantic role in that sentence, such as agent, goal, or result.

7. Sentiment Analysis : Systematically identifies, extracts, quantifies, and studies affective states and subjective information in text.

8. Text Classification : Assigns categories from a predefined set to open-ended text.

9. Text Summarization : Process of summarizing the information in large texts for quicker consumption.

10. Text Generation : Produces new text automatically, e.g. generating news headlines.

11. Machine Translation : Investigates the use of software to translate text or speech from one language to another.

(** Semantic Parsing : Converts a natural-language utterance into a logical form, i.e. a machine-understandable representation of its meaning. It is used in applications such as MT and QA.)

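12. Fake news / Hate Speech detection : Classifies whether a piece of text contains fabricated or misleading news, or abusive language that targets a person or group.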

13. Topic Modeling : Discovers the abstract “topics” that occur in a collection of documents.

14. Relation extraction (RE) : Detects and classifies mentions of semantic relationships, usually between two or more entities, within a set of artifacts (typically text documents). It is similar to Information Extraction (IE), but IE additionally requires the removal of repeated relations and generally refers to the extraction of many different relationships.

15. Keyword extraction : Automatically identifies the terms that best describe the subject of a document.

16. Part-of-speech (POS) tagging : Marks each word in a text with its part of speech, such as noun, verb, or adjective.

17. Word Sense Disambiguation : Identifies which sense of a word is used in a sentence.

18. Speech-to-text / Text-to-speech : Speech-to-text transcribes spoken language into text; text-to-speech converts text into spoken voice output. Both are achieved by training machines to process human language so that they can act and react as humans usually do.

(** Speech Recognition : Another name for speech-to-text: recognizing the words spoken in an audio signal and converting them into text.)

19. Language identification : Determines which natural language a given piece of content is written in, e.g. figuring out a document’s source language. It is a special case of text categorization.

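20. Dialogue Understanding : Interprets the utterances in a conversation, for example by identifying each speaker's intent and keeping track of context across turns.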

21. Dependency Parsing : Extracts a dependency parse of a sentence, which represents its grammatical structure by defining the relationships between “head” words and the words that modify those heads.

22. Image Captioning : Generates a sentence or short description that explains the content of an input image.

23. Grammatical Error Correction : Detects and corrects grammatical errors in text.

24. Chunking : Also called shallow parsing. Groups the words of a sentence into non-overlapping phrases (“chunks”) such as noun phrases and verb phrases, usually on top of POS tags and without building a full parse tree.
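
To make one of the tasks above more concrete, here is a minimal sketch of semantic textual similarity (item 2) using only the Python standard library. It is a deliberately crude stand-in: it compares bag-of-words count vectors with cosine similarity, so it measures word overlap rather than meaning. The function names, the linear mapping onto the 1-to-5 scale, and the 0.8 paraphrase threshold are all assumptions made for illustration, not something taken from a benchmark or library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors (0 to 1)."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Words missing from counts_b contribute 0 to the dot product.
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text has no tokens
    return dot / (norm_a * norm_b)


def similarity_score(text_a, text_b):
    """Map the similarity linearly onto a 1-to-5 scale (assumed mapping)."""
    return 1 + 4 * cosine_similarity(text_a, text_b)


def is_paraphrase(text_a, text_b, threshold=0.8):
    """Hypothetical paraphrase check: threshold the similarity (arbitrary cutoff)."""
    return cosine_similarity(text_a, text_b) >= threshold


if __name__ == "__main__":
    a = "A man is playing a guitar"
    b = "A man plays the guitar"
    print(round(similarity_score(a, b), 2), is_paraphrase(a, b))
```

Thresholding the same score also shows why paraphrase detection (item 4) is listed as a related task. Real systems typically replace the count vectors with learned sentence embeddings, since two sentences can share a meaning while sharing few words.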

Although some of the tasks mentioned above are special cases of other tasks, I described them separately so that each one gets its own explanation. I hope this post helps you get a sense of what NLP tasks look like!
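
As one illustration of a task being a special case of another, language identification (item 19) can be framed as text classification (item 8) where the categories are languages. The sketch below is a toy version under strong assumptions: the three tiny stopword lists are hand-picked for this example, and the classifier simply counts how many tokens fall in each list. The names `STOPWORDS` and `identify_language` are hypothetical.

```python
import re

# Tiny hand-picked function-word lists; an assumption for illustration only.
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "it", "that"},
    "german": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
    "spanish": {"el", "la", "los", "y", "es", "de", "que", "en"},
}


def identify_language(text):
    """Return the language whose stopword list matches the most tokens, or None."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        language: sum(token in words for token in tokens)
        for language, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # With no stopword hits at all, refuse to guess.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    for sentence in [
        "The cat is in the garden",
        "Der Hund ist nicht zu Hause",
        "El gato es de la casa",
    ]:
        print(sentence, "->", identify_language(sentence))
```

The structure is the same as any text classifier: turn the text into features, score each predefined category, and return the best one. Practical language identifiers tend to use character n-gram statistics over many more languages instead of short stopword lists.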

Check the following links if you want to find state-of-the-art (SOTA) methods and datasets for various NLP tasks :

  • SQuAD : https://rajpurkar.github.io/SQuAD-explorer/
  • KorQuAD : https://korquad.github.io/
  • CoQA : https://stanfordnlp.github.io/coqa/
  • GLUE Benchmark : https://gluebenchmark.com/
  • Papers with Code : https://paperswithcode.com/area/natural-language-processing
  • NLP-progress : http://nlpprogress.com/

References :
  • https://www.wikipedia.org/
  • https://medium.com/@miranthaj/25-nlp-tasks-at-a-glance-52e3fdff32e2
  • https://natural-language-understanding.fandom.com/wiki/List_of_natural_language_processing_tasks
  • https://bluediary8.tistory.com/118