Saturday, October 31, 2009

Natural Language Processing

Content:

The main method used in natural language processing (NLP) is text mining: the process of discovering and extracting knowledge from unstructured data. NLP is concerned with human-computer interaction from the perspective of language. Its goal is to make machine language more readable to humans and human language more understandable to computers. The basic text mining activities include information retrieval, information extraction and data mining. Because language is ambiguous, it is usually not easy to analyze information in a single step, so natural language is dealt with at several levels.


In the biomedical field, the degree of ambiguity and the complexity of analysis are much greater. At the lexical level alone, tokenization and lexical variants are distinct problems that need to be handled. Morphological analysis is the way to unify lexical variants by assigning them a canonical base form. The concepts of precision and recall in morphological analysis are tied to probabilistic methods.
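To make the lexical-level steps above a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the method from the lectures: the regular expression, the tiny `BASE_FORMS` lexicon and the function names are all hypothetical, and a real biomedical system would rely on a much larger lexicon and more careful rules.

```python
import re

# Hypothetical toy lexicon: maps lexical variants to one canonical base form.
# A real system would use a large biomedical lexicon instead.
BASE_FORMS = {
    "tumour": "tumor",
    "tumours": "tumor",
    "tumors": "tumor",
    "binds": "bind",
    "binding": "bind",
    "bound": "bind",
}


def tokenize(text):
    """Split text into tokens, keeping hyphenated names such as IL-2 whole."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


def normalize(token):
    """Lowercase a token and map it to its base form if the lexicon knows it."""
    lowered = token.lower()
    return BASE_FORMS.get(lowered, lowered)


def precision_recall(predicted, gold):
    """Compare predicted base forms with a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


tokens = tokenize("IL-2 binds tumours; binding was bound.")
base_forms = [normalize(t) for t in tokens]
```

Here precision is the share of predicted base forms that are correct, and recall is the share of the correct base forms that were actually found. In this toy run, "was" stays unchanged because the lexicon has no entry for it, which is exactly the kind of miss that lowers both numbers.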

After the two series of lectures given by Dr. Gonzales, I once again appreciate the significance of ontology and the ubiquity of statistics. If words and terms had been created according to an ontology from the beginning, things might be much easier for scientists in the NLP field. But that is the glamour of natural language.


Posted by Xiaoxiao
