ComputingScienceColloquium Date: May 13 Time: 11:00 Room: BBL 509 ----+++ Speaker: [[http://www.cs.kun.nl/~kees/][Kees Koster]] ----+++ Title: Best-Only Parsing for Natural Languages ([[%ATTACHURL%/BestOnlyParsing.pdf][slides in pdf]]) ----+++ Abstract Rule-based grammars for Natural Languages are nondeterministic, due to syntactical and lexical ambiguity. Furthermore, they are often Attributed rather than Context Free and weighted by penalties or probabilities, making it hard to construct efficient parsers. In this talk we discuss how to construct Top-Down parsers for such grammars in spite of these complications. We first show how to optimize nondeterministic Context-Free parsing by means of memoization and then generalize this technique to certain attribute grammars. This makes the parsing process itself polynomial, but the generation of all parse trees may still demand exponential time. Although linguistic research may demand an enumeration of all parse trees, NLP applications usually require only the best analysis among all possible analyses of a sentence. Based on this observation, we introduce the Best-Only Parsing algorithm (BOP) for finding the single best analysis according to a weighted Attributed grammar. As an illustration, we consider an English grammar developed for Information Retrieval applications, for which only a single most probable analysis of the input is required. Using BOP, the time to find the single most probable analysis is dominated by the lexicalization time and depends practically linearly on the sentence length.