CP7018 LANGUAGE TECHNOLOGIES
UNIT I INTRODUCTION
Natural Language Processing – Mathematical Foundations – Elementary Probability Theory – Essential Information Theory – Linguistics Essentials – Parts of Speech and Morphology – Phrase Structure – Semantics – Corpus-Based Work.
UNIT II WORDS
Collocations – Statistical Inference – n-gram Models – Word Sense Disambiguation – Lexical Acquisition.
UNIT III GRAMMAR
Markov Models – Part-of-Speech Tagging – Probabilistic Context-Free Grammars – Parsing.
UNIT IV INFORMATION RETRIEVAL
Information Retrieval Architecture – Indexing – Storage – Compression Techniques – Retrieval Approaches – Evaluation – Search Engines – Commercial Search Engine Features – Comparison – Performance Measures – Document Processing – NLP-Based Information Retrieval – Information Extraction.
UNIT V TEXT MINING
Categorization – Extraction-Based Categorization – Clustering – Hierarchical Clustering – Document Classification and Routing – Finding and Organizing Answers from Text Search – Text Categorization and Efficient Summarization Using Lexical Chains – Machine Translation – Transfer Metaphor – Interlingual and Statistical Approaches.
REFERENCES:
1. Christopher D. Manning and Hinrich Schütze, “Foundations of Statistical Natural Language Processing”, MIT Press, 1999.
2. Daniel Jurafsky and James H. Martin, “Speech and Language Processing”, Pearson, 2008.
3. Ron Cole, J. Mariani, et al., “Survey of the State of the Art in Human Language Technology”, Cambridge University Press, 1997.
4. Michael W. Berry, “Survey of Text Mining: Clustering, Classification and Retrieval”, Springer Verlag, 2003.