HCSNet International Visiting Speaker Seminar: Bonnie Webber, University of Edinburgh

Summary

From scientific journals to newspapers, weblogs and email, text is a rich and expanding source of information. Language Technology (LT) aims to provide tools for mining this source and delivering text-based information to the people who need it: scientists, business and industry analysts, intelligence agents, and others. To date, the most effective LT techniques have been based on words and limited sequences of words (N-grams). But such techniques only go so far. More recently, Language Technology has begun to build on its success with words and N-grams to develop lexical approaches to sentence-level analysis, allowing it to unlock additional information from text. Still to come, though, is the higher-order information available from the sequences of sentences that make up discourse.

To show what is needed to recognize and extract such information, I will start by reviewing the features of discourse that trigger the additional, higher-order meaning it conveys. I will then describe a lexicalised approach to discourse, modelled on Lexicalised Tree-Adjoining Grammar, that provides a simple and coherent account of these triggers. Empirical evidence for the approach, as well as a rich and valuable resource for research and technology development, comes from the "Penn Discourse TreeBank", which saw its first release in March 2006 (http://www.seas.upenn.edu/~pdtb/). I will conclude the talk by describing features of this resource, what it might allow researchers to discover, and how such discoveries could enhance Language Technology.

Bio

Bonnie Webber has held the Chair of Intelligent Systems at the School of Informatics, University of Edinburgh, since 1998, where she is also Deputy Head of the School. Prior to this, she was a Professor in the Department of Computer and Information Science at the University of Pennsylvania.