NLP, as I understood it, is the processing of natural (i.e., human) language to make it understandable to a computer. In the bioinformatics domain in particular, NLP is a very useful tool for extracting relevant information from the huge amount of literature available, where it is humanly impossible to read through every article. The first step in NLP is tokenization, which breaks a sentence into individual words (tokens) that are then used for searches. Morphological analysis takes the lexical words (the group of words) and identifies their variants, which can be linked back to the base word. The three ways of capturing such language are regular expressions, finite state automata and regular grammars. However, I still need to understand these methodological aspects better. What I understood is that using these methods in the bioinformatics world helps to recognize various related terms and therefore consolidate knowledge that can be scattered all over the literature, especially since the literature on genes is so ambiguous.
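To make the tokenization and morphological steps a bit more concrete for myself, here is a minimal Python sketch using only regular expressions. Everything in it is an assumption made for illustration: the example sentence is made up, the helper names (`tokenize`, `variant_pattern`, `group_variants`) are hypothetical, and the short suffix list is far cruder than what a real morphological analyzer would use.

```python
import re
from collections import defaultdict

def tokenize(sentence):
    # Lowercase the text and pull out runs of letters/digits,
    # allowing hyphens inside a token (e.g. "wild-type").
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())

def variant_pattern(base):
    # Hypothetical rule: a base word optionally followed by one of a
    # few common English suffixes. Real morphology is much richer.
    return re.compile(rf"^{re.escape(base)}(?:s|es|ed|ing)?$")

def group_variants(tokens, bases):
    # Link each token to every base word whose pattern it matches.
    groups = defaultdict(list)
    for tok in tokens:
        for base in bases:
            if variant_pattern(base).match(tok):
                groups[base].append(tok)
    return dict(groups)

# Made-up sentence, for illustration only.
sentence = ("BRCA1 mutations were detected; the mutation inhibits "
            "binding, and inhibited cells bind poorly.")

tokens = tokenize(sentence)
groups = group_variants(tokens, ["mutation", "inhibit", "bind"])

print(tokens)
print(groups)
```

Here the variants "mutations", "inhibits", "inhibited" and "binding" all end up grouped under their base words, which is the kind of consolidation of related terms I described above, only on a toy scale.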
I found the Stanford website a very useful resource, with various freely downloadable programs for NLP: http://nlp.stanford.edu/
Posted by Sheetal Shetty
Friday, October 30, 2009