Most web data today consists of unstructured text. The mere existence of this data is of little value
unless it is made available in a way that lets users quickly find the information relevant to their needs.
This course covers the fundamental knowledge necessary to build search systems that meet this need, including
web crawling, index construction and compression, Boolean, vector-based, and probabilistic retrieval models,
text classification and clustering, link analysis algorithms such as PageRank, and computational advertising.
Students will also complete one programming project, in which they will build a complex application
that combines multiple algorithms into a system that solves real-world problems.
Time and Place
Monday/Wednesday 11:00am - 12:15pm in Social Sciences, Room 411
Email: msurdeanu AT email DOT arizona DOT edu
Office: Gould-Simpson 811
Office Hours: by request