Syllabus/Synopsis
Basics of information retrieval (IR)
- Text representation and processing
- Retrieval models (Boolean, vector space, language model)
- Indexing
- Evaluation
Advanced IR topics
- Relevance feedback
  - real feedback, pseudo-relevance feedback
- Document and concept clustering
  - hierarchical clustering, k-means
- Web retrieval
  - PageRank, difficulties of Web retrieval
- Cross-language retrieval
  - queries in one language, documents in another
- Distributional and semantic similarity
  - automatic thesaurus construction
Basics of statistical machine translation (MT)
- Language models for MT
- Estimation from parallel texts
- Decoding (finding the most probable translation)