mihai surdeanu
mihai AT surdeanu DOT info / msurdeanu AT email DOT arizona DOT edu


This page contains some of my recent talks that were given outside of conferences. Since several people asked for the slides, here they are. You are welcome to reuse these slides, but please give credit.

Leveraging Machines for Causal Modeling

Abstract: I will present a method for automated, large-scale reading of scientific literature that can capture cellular processes in mechanistic detail. The system uses expressive but compact natural language grammars that recognize common structures across syntactic patterns. Additionally, this method resolves coreference across multiple statements, and understands complex linguistic phenomena, such as speculative or nested statements. This method outperformed all other participants in a recent third-party evaluation, under a metric that combines accuracy and throughput. The proposed approach has the throughput to process millions of articles within days. We have successfully applied the extracted information to find novel cancer-driving mechanisms in different cancer types. Combining "big mechanisms" with "big data" can lead to a causal, predictive understanding of cellular processes and unlock important downstream applications in medicine and biology.
Where: Bill and Melinda Gates Foundation's Grand Challenges Meeting
When: October 2016
[Slides]

Why Natural Language Processing Is Important

Abstract: It is estimated that up to 80% of the web consists of unstructured text. Thus, we clearly need language processing to: a) search all this data effectively in response to complex information needs; and b) enhance other computer models so they can learn about the world directly from text. My talk will be structured around these two ideas.
In the first part of the talk, I will describe our work towards teaching computers to answer complex questions, such as manner ("How") and reason ("Why") questions. I will present a robust question answering model for such complex questions that integrates multiple sources of automatically acquired knowledge, such as lexical semantics and discourse information. I will describe how to evaluate the proposed system on two corpora from different genres and domains (one from Yahoo! Answers and one from the biology domain) and on two types of non-factoid questions: manner and reason. I will experimentally demonstrate that our contributions improve performance by up to 24% (relative) over a state-of-the-art model.
In the second part of the talk, I will introduce the novel task of identifying latent attributes in video scenes, such as the mental states of actors, using only large text collections as background knowledge and minimal information about the videos, such as activity and actor types. I will formalize the task and a measure of merit that accounts for the semantic relatedness of mental state terms. I will introduce several largely unsupervised information extraction models that identify the mental states of human participants in video scenes. I will show that these models produce complementary information and their combination significantly outperforms the individual models as well as other baseline methods. I believe this work has many important applications, from intelligent video surveillance systems to semantic search of videos and images.

Where: University of Arizona, Computer Science
When: November 2014
Slides available upon request.

Teaching Computers to Answer Non-Factoid Questions

Abstract: In this talk, I will describe our work towards teaching computers to answer complex questions, i.e., questions where the answer is a longer piece of text that explains a complex phenomenon, using linguistic information that is automatically acquired from free text. I will present a robust question answering model for non-factoid questions that integrates multiple sources of information, such as lexical semantics and discourse, driven by two representations of discourse: a shallow representation centered around discourse markers, and a deep one based on Rhetorical Structure Theory. I will describe how to evaluate the proposed system on two corpora from different genres and domains (one from Yahoo! Answers and one from the biology domain) and on two types of non-factoid questions: manner and reason. I will experimentally demonstrate that the discourse structure of non-factoid answers provides information that is complementary to the lexical semantic similarity between question and answer, improving performance by up to 24% (relative) over a state-of-the-art model that exploits lexical semantic similarity alone. I will further demonstrate excellent domain transfer of discourse information, suggesting that these discourse features have general utility for non-factoid question answering.
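The shallow, marker-based discourse representation can be sketched with a toy answer scorer: for reason ("Why") questions, candidate answers get credit for containing causal discourse markers on top of plain lexical overlap with the question. The marker list, weights, and function names below are illustrative stand-ins, not the actual model from the talk.

```python
import re

# Illustrative shallow discourse markers for causal ("Why") answers.
CAUSAL_MARKERS = {"because", "since", "therefore", "as a result"}

def tokens(text):
    """Lowercased word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def lexical_overlap(question, answer):
    """Fraction of question words that also appear in the answer."""
    q, a = tokens(question), tokens(answer)
    return len(q & a) / max(len(q), 1)

def score(question, answer, question_type="reason"):
    """Lexical similarity plus a bonus for causal discourse markers."""
    s = lexical_overlap(question, answer)
    if question_type == "reason":
        text = answer.lower()
        s += 0.5 * sum(m in text for m in CAUSAL_MARKERS)  # toy weight
    return s

q = "Why is the sky blue?"
good = "The sky looks blue because sunlight scatters off air molecules."
bad = "The sky is blue on most clear days."
```

Under this sketch, the explanatory answer outscores the non-explanatory one even though the latter shares more words with the question, which is the intuition behind combining discourse structure with lexical semantic similarity.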
Where: University of Arizona, Department of Linguistics
When: September 2014
Slides available upon request.

Extracting Latent Attributes from Video Scenes Using Text as Background Knowledge

Abstract: In this talk, I will introduce the novel task of identifying latent attributes in video scenes, such as the mental states of actors, using only large text collections as background knowledge and minimal information about the videos, such as activity and actor types. I will formalize the task and a measure of merit that accounts for the semantic relatedness of mental state terms. I will introduce several largely unsupervised information extraction models that identify the mental states of human participants in video scenes. I will show that these models produce complementary information and their combination significantly outperforms the individual models as well as other baseline methods. I believe this work has many important applications, from intelligent video surveillance systems to semantic search of videos and images.
Where: University of Arizona, Cognitive Science
When: September 2014
Slides available upon request.

Coreference Resolution Revisited

Abstract: To date, computational approaches to coreference resolution have focused mostly on resolving nominal and pronominal mentions, generally using machine learning algorithms that model pairs of mentions independently. In this talk I will present two ideas that challenge this status quo.
In the first part of the talk I will introduce a deterministic approach to coreference resolution that combines the global information and precise features of modern machine-learning models with the transparency and modularity of deterministic, rule-based systems. Our sieve architecture applies a battery of deterministic coreference models one at a time from highest to lowest precision, where each model builds on the previous model's cluster output. The two stages of our sieve-based architecture, a mention detection stage that heavily favors recall, followed by coreference sieves that are precision oriented, offer a powerful way to achieve both high precision and high recall. Further, our approach makes use of global information through an entity-centric model that encourages the sharing of features across all mentions that point to the same real-world entity. Despite its simplicity, our approach gives state-of-the-art performance on several corpora and genres, and has also been incorporated into hybrid state-of-the-art coreference systems for Chinese and Arabic. Our system thus offers a new paradigm for combining knowledge in rule-based systems that has implications throughout computational linguistics.
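The sieve idea can be sketched in a few lines. This is a minimal illustration, not the actual system: the two sieves, the mention list, and the pairwise merging below are simplified stand-ins (the real architecture applies many more sieves and shares features across all mentions of each entity).

```python
# Minimal sketch of the multi-sieve architecture: deterministic coreference
# models are applied one at a time, from highest to lowest precision, and
# each sieve operates on the clusters built by the previous sieves.

def exact_match_sieve(m1, m2):
    """High-precision sieve: identical surface strings corefer."""
    return m1.lower() == m2.lower()

def head_match_sieve(m1, m2):
    """Lower-precision sieve: mentions share a head word (last token here)."""
    return m1.split()[-1].lower() == m2.split()[-1].lower()

def resolve(mentions, sieves):
    # Start with singleton clusters: mention index -> cluster id.
    cluster_of = {i: i for i in range(len(mentions))}
    for sieve in sieves:  # ordered from highest to lowest precision
        for i in range(len(mentions)):
            for j in range(i):
                if cluster_of[i] != cluster_of[j] and sieve(mentions[i], mentions[j]):
                    # Merge the two clusters.
                    old, new = cluster_of[i], cluster_of[j]
                    for k, c in cluster_of.items():
                        if c == old:
                            cluster_of[k] = new
    return cluster_of

mentions = ["Barack Obama", "Obama", "the president", "he"]
clusters = resolve(mentions, [exact_match_sieve, head_match_sieve])
```

In this toy run, "Barack Obama" and "Obama" end up in the same cluster via the head-match sieve, while "the president" and "he" stay unresolved, since linking them safely requires the richer semantic and pronominal sieves of the full system.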
In the second part of the talk I will describe a coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 F1 points) than two strong baselines that resolve entities and events separately.
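A toy version of the cautious, iterative merging might look as follows. The features and weights are illustrative stand-ins for the learned linear-regression scorer, and the sketch covers only the greedy merge loop, not the flow of semantic-role information between entity and event clusters.

```python
# Toy sketch of cautious, iterative cluster merging scored by a linear
# model. Weights and features here are illustrative, not learned.

def features(c1, c2):
    """Hypothetical merge features: head-word overlap and argument overlap."""
    heads1 = {m["head"] for m in c1}
    heads2 = {m["head"] for m in c2}
    args1 = {a for m in c1 for a in m["args"]}
    args2 = {a for m in c2 for a in m["args"]}
    return [
        len(heads1 & heads2) / max(len(heads1 | heads2), 1),
        len(args1 & args2) / max(len(args1 | args2), 1),
    ]

WEIGHTS = [1.0, 0.8]   # stand-in for regression coefficients
THRESHOLD = 0.5        # merge only when the model is confident

def score(c1, c2):
    return sum(w * f for w, f in zip(WEIGHTS, features(c1, c2)))

def cluster(mentions):
    clusters = [[m] for m in mentions]  # start with singletons
    merged = True
    while merged:
        merged = False
        best = max(
            ((i, j) for i in range(len(clusters)) for j in range(i)),
            key=lambda ij: score(clusters[ij[0]], clusters[ij[1]]),
            default=None,
        )
        if best and score(clusters[best[0]], clusters[best[1]]) > THRESHOLD:
            i, j = best
            clusters[j].extend(clusters.pop(i))
            merged = True
    return clusters

mentions = [
    {"head": "acquired", "args": ("Google", "YouTube")},
    {"head": "bought", "args": ("Google", "YouTube")},
    {"head": "said", "args": ("the CEO",)},
]
result = cluster(mentions)
```

Here the two event mentions "acquired" and "bought" merge because they share arguments, even though their heads differ, illustrating how argument (semantic role) evidence can drive event coreference; in the full model, such merges in turn provide evidence for entity coreference.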

Note: 20+ slides contributed by Heeyoung Lee.
Where: University of Arizona, Department of Linguistics
When: April 2013
[Slides]
[Audio]

Learning from the World

Abstract: Natural language processing (NLP) applications have benefited immensely from the advent of "big data" and machine learning. For example, IBM's Watson learned to successfully compete in Jeopardy! by using a question answering model trained on millions of Wikipedia pages and other documents. However, this abundance of textual data does not always come free: a lot of it has low quality (e.g., the text is often ungrammatical) or does not illustrate exactly the problem of interest. In this talk I show that such data is still valuable and can be used to train models for end-to-end NLP applications. I will focus on two specific NLP applications: question answering trained from Yahoo! Answers question-answer pairs, and information extraction trained from Wikipedia infoboxes aligned with web texts. I will show that: (a) low-quality text can be made useful by converting it to semantic representations, and (b) training data that incompletely models the problem of interest can be successfully incorporated through anomaly-aware machine learning models.
Note: This is a more recent version of my job talk.
Where: University of Arizona, Computer Science Department
When: April 2013
[Slides]