Seminar: Martin Volk, Stockholm University

Locations and times

This seminar will take place at the following locations and dates:

Location: Building E6A, room 357, Macquarie University, North Ryde, Sydney
Contact: Rolf Schwitter, rolfs@ics.mq.edu.au

Location: Queensland University of Technology, Gardens Point Campus, Room S524
Contact: James Hogan, j.hogan@qut.edu.au

Location: University of Melbourne, Alan Gilbert Theatre 1
Contact: Tim Baldwin, tim@csse.unimelb.edu.au

Summary

This presentation deals with different aspects of multilingual language technology. We start by summarizing our work in a project on Cross-Language Information Retrieval in the Medical Domain. In this project we evaluated different means of bridging the gap between German queries and English documents, and vice versa. We worked with a parallel collection of medical abstracts in the two languages.

The combined research on such parallel corpora and on treebanks has recently led to parallel treebanks. A parallel treebank consists of syntactically annotated sentences in two or more languages, taken from translated documents. In addition, the syntax trees of two corresponding sentences are aligned on a sub-sentential level (word and phrase level). Parallel treebanks can be used as training or evaluation corpora for word and phrase alignment, as input for example-based machine translation, as training corpora for transfer rules, or for translation studies. We are developing a German-English-Swedish parallel treebank, with texts from financial documents and from a novel. We will report on our methods and tools for building the monolingual treebanks in the three languages and for aligning the corresponding units on the word and phrase level.

In a related project, the Computational Linguistics Group at Stockholm University has joined forces with a leading subtitling company to build a system for the automatic translation of film subtitles from Swedish to Danish. The company has provided a wealth of already translated subtitles, and our group is building a translation system that re-uses and re-assembles the previous translations at various levels of granularity. A first prototype has been built and produces good results. The output will be checked by a professional translator, but it is expected that at least a third of the automatically translated subtitles will not need to be edited. We will report on our experiences with handling the large parallel corpus and on the current status of the project.

Bio

Martin Volk received his PhD from the University of Koblenz (Germany) in 1994. He subsequently worked in Switzerland at the University of Zurich, the Zurich University of Applied Sciences, and Eurospider Information Technology AG. Since 2003 he has been a professor of Computational Linguistics at Stockholm University (Sweden). His main research interests are multilingual corpus annotation, cross-language information retrieval, and machine translation.