Transforming Dependency Structures to Logical Forms for Semantic Parsing

The strongly typed syntax of grammar formalisms such as CCG, TAG, LFG and HPSG offers a synchronous framework for deriving syntactic structures and semantic logical forms. In contrast—partly due to the lack of a strong type system—dependency structures are easy to annotate and have become a widely used form of syntactic analysis for many languages. However, the lack of a type system makes a formal mechanism for deriving logical forms from dependency structures challenging. We address this by introducing a robust system based on the lambda calculus for deriving neo-Davidsonian logical forms from dependency trees. These logical forms are then used for semantic parsing of natural language to Freebase. Experiments on the Free917 and WebQuestions datasets show that our representation is superior to the original dependency trees and that it outperforms a CCG-based representation on this task. Compared to prior work, we obtain the strongest result to date on Free917 and competitive results on WebQuestions.

a Work carried out during an internship at Google. b On leave from Columbia University.

Figure 1: The dependency tree is binarized into its s-expression, which is then composed into the lambda expression representing the sentence logical form.

Disney acquired Pixar
Introduction

In recent years, there have been significant advances in developing fast and accurate dependency parsers for many languages (McDonald et al., 2005; Nivre et al., 2007; Martins et al., 2013, inter alia). Motivated by the desire to carry these advances over to semantic parsing tasks, we present a robust method for mapping dependency trees to logical forms that represent underlying predicate-argument structures. We empirically validate the utility of these logical forms for question answering from databases. Since our approach uses dependency trees as input, we hypothesize that it will generalize better to domains that are well covered by dependency parsers than methods that induce semantic grammars from scratch.
The system that maps a dependency tree to its logical form (henceforth DEPLAMBDA) is illustrated in Figure 1. First, the dependency tree is binarized via an obliqueness hierarchy to give an s-expression that describes the application of functions to pairs of arguments. Each node in this s-expression is then replaced with a lambda-calculus expression, and the relabeled s-expression is beta-reduced to give the logical form in Figure 1(c). Since dependency syntax does not have an associated type theory, we introduce a type system that assigns a single type to all constituents, thus avoiding the need for type checking (Section 2). DEPLAMBDA uses this system to generate robust logical forms, even when the dependency structure does not mirror predicate-argument relationships in constructions such as conjunctions, prepositional phrases, relative clauses, and wh-questions (Section 3).
These ungrounded logical forms (Kwiatkowski et al., 2013; Reddy et al., 2014; Krishnamurthy and Mitchell, 2015) are used for question answering against Freebase by passing them as input to GRAPHPARSER (Reddy et al., 2014), a system that learns to map logical predicates to Freebase, resulting in grounded Freebase queries (Section 4). We show that our approach achieves state-of-the-art performance on the Free917 dataset and competitive performance on the WebQuestions dataset, whereas building the Freebase queries directly from dependency trees gives significantly lower performance. Finally, we show that our approach outperforms a directly comparable method that generates ungrounded logical forms using CCG. Details of our experimental setup and results are presented in Section 5 and Section 6, respectively.

Logical Forms
We use a version of the lambda calculus with three base types: individuals (Ind), events (Event), and truth values (Bool). Roughly speaking, individuals are introduced by nouns, events are introduced by verbs, and whole sentences are functions onto truth values. For types A and B, we use A × B to denote the product type, while A → B denotes the type of functions mapping elements of A to elements of B. We will make extensive use of variables of type Ind × Event. For any variable x of type Ind × Event, we use x = (xa, xe) to denote the pair of variables xa (of type Ind) and xe (of type Event). Here, the subscript denotes the projections ·a : Ind × Event → Ind and ·e : Ind × Event → Event.
An important constraint on the lambda calculus system is as follows: all natural language constituents have a lambda-calculus expression of type Ind × Event → Bool.
A "constituent" in this definition is either a single word, or an s-expression.
S-expressions are defined formally in the next section; examples are (dobj acquired Pixar) and (nsubj (dobj acquired Pixar) Disney).
Essentially, s-expressions are binarized dependency trees that include an ordering over the different dependencies to a head (in the example above, the dobj modifier is combined before the nsubj modifier).
Some examples of lambda-calculus expressions for single words (lexical entries) are as follows:

acquired ⇒ λx.acquired(xe)
Disney ⇒ λy.Disney(ya)
Pixar ⇒ λz.Pixar(za)

An example for a full sentence is as follows:

Disney acquired Pixar ⇒ λx.∃yz. acquired(xe) ∧ Disney(ya) ∧ Pixar(za) ∧ arg1(xe, ya) ∧ arg2(xe, za)

This is a neo-Davidsonian style of analysis. Verbs such as acquired make use of event variables such as xe, whereas nouns such as Disney make use of individual variables such as ya.
The restriction that all expressions are of type Ind × Event → Bool simplifies the type system considerably. While it leads to difficulty with some linguistic constructions (see Section 3.3 for some examples), we believe the simplicity and robustness of the resulting system outweigh these concerns. It also leads to some spurious variables that are bound by lambdas or existentials but do not appear as arguments of any predicate: for example, in the above analysis of Disney acquired Pixar, the variables xa, ye and ze are unused. However, these "spurious" variables are easily identified and discarded.
An important motivation for having variables of type Ind × Event is that a single lexical item sometimes makes use of both types of variables. For example, the noun phrase president in 2009 has semantics

λx.∃y. president(xa) ∧ president event(xe) ∧ arg1(xe, xa) ∧ 2009(ya) ∧ prep.in(xe, ya)

In this example president introduces the predicates president, corresponding to an individual, and president event, corresponding to an event; essentially a presidency event that may have various properties. This follows the structure of Freebase closely: Freebase contains an individual corresponding to Barack Obama, with a president property, as well as an event corresponding to the Obama presidency, with various properties such as a start and end date, a location, and so on. The entry for president is then

λx. president(xa) ∧ president event(xe) ∧ arg1(xe, xa)

Note that proper nouns do not introduce an event predicate, as can be seen from the entries for Disney and Pixar above.

Dependency Structures to Logical Forms
We now describe the system used to map dependency structures to logical forms. We first give an overview of the approach, then go into detail about various linguistic constructions.

An Overview of the Approach
The transformation of a dependency tree to its logical form is accomplished through a series of three steps: binarization, substitution, and composition. Below, we outline these steps, with some additional remarks.
Binarization. A dependency tree is mapped to an s-expression (borrowing terminology from Lisp). For example, Disney acquired Pixar has the s-expression

(nsubj (dobj acquired Pixar) Disney)

Formally, an s-expression has the form (exp1 exp2 exp3), where exp1 is a dependency label, and both exp2 and exp3 are either (1) a word such as acquired; or (2) an s-expression such as (dobj acquired Pixar).
We refer to the process of mapping a dependency tree to an s-expression as binarization, as it involves an ordering of modifiers to a particular head, similar to binarization of a context-free parse tree.
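As an illustration, the binarization step can be sketched in a few lines of Python. This is a toy sketch, not the authors' implementation, and the obliqueness hierarchy shown is a hypothetical fragment:

```python
# A hypothetical fragment of an obliqueness hierarchy: labels earlier in the
# list are attached to the head first (closer to the head).
OBLIQUENESS = ["dobj", "iobj", "prep", "nsubj", "conj-vp", "conj-np"]

def binarize(head, deps):
    """Binarize a head and its dependents into an s-expression (nested tuples).

    `deps` is a list of (label, child) pairs, where a child is either a word
    or a (head, deps) pair for a subtree.
    """
    rank = lambda d: OBLIQUENESS.index(d[0]) if d[0] in OBLIQUENESS else len(OBLIQUENESS)
    expr = head
    # Attach modifiers in obliqueness order: dobj before nsubj, and so on.
    for label, child in sorted(deps, key=rank):
        child_expr = binarize(*child) if isinstance(child, tuple) else child
        expr = (label, expr, child_expr)
    return expr

tree = ("acquired", [("nsubj", "Disney"), ("dobj", "Pixar")])
print(binarize(*tree))  # ('nsubj', ('dobj', 'acquired', 'Pixar'), 'Disney')
```

The nested-tuple output corresponds directly to the s-expression (nsubj (dobj acquired Pixar) Disney).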
Substitution. Each symbol (a word or a dependency label) in the s-expression is assigned a lambda expression. In our running example we have, among others, the following assignments:

acquired ⇒ λx.acquired(xe)
Disney ⇒ λy.Disney(ya)

Composition. Beta-reduction is used to compose the lambda-expression terms and compute the final semantics of the input sentence. In this step, expressions of the form (exp1 exp2 exp3) are interpreted as the function exp1 applied to the arguments exp2 and exp3. For example, (dobj acquired Pixar) receives the following expression after composition:

λz.∃x. acquired(ze) ∧ Pixar(xa) ∧ arg2(ze, xa)

Obliqueness Hierarchy. The binarization stage requires a strict ordering on the different modifiers to each head in a dependency parse. For example, in (nsubj (dobj acquired Pixar) Disney), the dobj is attached before the nsubj. The ordering is very similar to the obliqueness hierarchy in syntactic formalisms such as HPSG (Pollard and Sag, 1994).
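The substitution and composition steps can be simulated with a small Python sketch in which a word's semantics is a function from a variable name to a list of conjuncts, and a dependency label combines two such functions. This is a toy encoding of our own: existential quantification is left implicit and fresh variable names stand in for bound variables.

```python
import itertools

_fresh = itertools.count()

def word(pred, proj):
    """Entry for a single word: verbs use the event projection ('e'),
    nouns the individual projection ('a')."""
    return lambda x: [f"{pred}({x}{proj})"]

def dep(argnum):
    """Entry for a dependency label introducing arg1/arg2, as in nsubj/dobj."""
    def label(f, g):
        def sem(z):
            x = f"v{next(_fresh)}"  # implicitly existentially quantified
            return f(z) + g(x) + [f"arg{argnum}({z}e,{x}a)"]
        return sem
    return label

nsubj, dobj = dep(1), dep(2)
acquired, disney, pixar = word("acquired", "e"), word("Disney", "a"), word("Pixar", "a")

# Compose (nsubj (dobj acquired Pixar) Disney) and read off the conjuncts.
sem = nsubj(dobj(acquired, pixar), disney)("x")
print(" ∧ ".join(sem))
# acquired(xe) ∧ Pixar(v1a) ∧ arg2(xe,v1a) ∧ Disney(v0a) ∧ arg1(xe,v0a)
```

Up to renaming of the fresh variables, the output matches the full-sentence logical form given in Section 2.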
Type for Dependency Labels. Dependency labels are assigned functions over the expressions of their head and modifier. For example, the label nsubj receives the entry

nsubj ⇒ λf.λg.λz.∃x. f(z) ∧ g(x) ∧ arg1(ze, xa)    (1)

With entries of this form, Disney acquired Pixar receives the logical form

λx.∃yz. acquired(xe) ∧ Disney(ya) ∧ Pixar(za) ∧ arg1(xe, ya) ∧ arg2(xe, za)

This structure is isomorphic to the original dependency structure: there are variables xe, ya and za corresponding to acquired, Disney and Pixar, respectively; and the sub-expressions arg1(xe, ya) and arg2(xe, za) correspond to the dependencies acquired → Disney and acquired → Pixar.
By default we assume that the predicate-argument structure is isomorphic to the dependency structure, and many dependency labels receive a semantics of the form shown in (1). However, there are a number of important exceptions. As one example, the dependency label partmod receives the semantics

λf.λg.λz.∃x. f(z) ∧ g(x) ∧ arg1(xe, za)

with arg1(xe, za) in place of the arg1(ze, xa) in (1). This reverses the dependency direction to capture the predicate-argument structure of reduced relative constructions such as a company acquired by Disney.
Post-processing. We apply three post-processing steps (simple inferences over lambda-calculus expressions) to the derived logical forms. These relate to the handling of prepositions, coordination, and control, and are described and motivated in more detail under the respective headings below.

Analysis of Some Linguistic Constructions
In this section we describe in detail how various linguistic constructions not covered by the rule in (1) are handled in the formalism: prepositional phrases, conjunction, relative clauses, and wh-questions.

Prepositional Phrases. Prepositional phrase modifiers to nouns and verbs have similar s-expressions:

president in 2009 ⇒ (prep president (pobj in 2009))
acquired in 2009 ⇒ (prep acquired (pobj in 2009))

The following entries are used in these examples:

in ⇒ λx.in(xe)
prep ⇒ λf.λg.λz.∃x. f(z) ∧ g(x) ∧ prep(ze, xe)
pobj ⇒ λf.λg.λz.∃x. f(z) ∧ g(x) ∧ pobj(ze, xa)

where the entries for prep and pobj simply mirror the original dependency structure, with prep modifying the event variable ze.
The semantics for acquired in 2009 is as follows:

λx.∃py. acquired(xe) ∧ 2009(ya) ∧ in(pe) ∧ prep(xe, pe) ∧ pobj(pe, ya)

As a post-processing step, we replace in(pe) ∧ prep(xe, pe) ∧ pobj(pe, ya) by prep.in(xe, ya), effectively collapsing out the p variable while replacing the prep and pobj dependencies by a single dependency, prep.in. The final semantics are then as follows:

λx.∃y. acquired(xe) ∧ 2009(ya) ∧ prep.in(xe, ya)

In practice this step is easily achieved by identifying variables (in this case pe) participating in prep and pobj relations. It would be tempting to achieve this step within the lambda-calculus expressions themselves, but we have found the post-processing step to be more robust to parsing errors and corner cases in the usage of the prep and pobj dependency labels.
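The collapsing step described above can be sketched as a simple rewrite over conjunct strings. This is a toy sketch under a string encoding of our own; the actual system operates on lambda-calculus expressions:

```python
import re

def collapse_preps(conjuncts):
    """Collapse in(pe) ∧ prep(xe, pe) ∧ pobj(pe, ya) into prep.in(xe, ya).

    `conjuncts` is a list of predicate strings such as "prep(xe, pe)".
    """
    out = list(conjuncts)
    preps = {m[2]: m[1] for c in conjuncts
             if (m := re.fullmatch(r"prep\((\w+), (\w+)\)", c))}  # p-var -> head
    pobjs = {m[1]: m[2] for c in conjuncts
             if (m := re.fullmatch(r"pobj\((\w+), (\w+)\)", c))}  # p-var -> object
    for p, head in preps.items():
        if p in pobjs:
            # Recover the preposition word from its unary predicate, e.g. in(pe).
            word = next((m[1] for c in conjuncts
                         if (m := re.fullmatch(rf"(\w+)\({p}\)", c))), "prep")
            out = [c for c in out if p not in c]  # crude filter; fine for the toy
            out.append(f"prep.{word}({head}, {pobjs[p]})")
    return out

sem = ["acquired(xe)", "2009(ya)", "in(pe)", "prep(xe, pe)", "pobj(pe, ya)"]
print(collapse_preps(sem))  # ['acquired(xe)', '2009(ya)', 'prep.in(xe, ya)']
```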
Conjunctions. First consider a simple case of NP conjunction, Bill and Dave founded HP, whose s-expression is as follows:

(nsubj (dobj founded HP) (conj-np (cc Bill and) Dave))

We make use of the entry

conj-np ⇒ λf.λg.λx.∃yz. f(y) ∧ g(z) ∧ coord(x, y, z)

while the cc label simply passes through the semantics of the conjunct. The sentence Bill and Dave founded HP then receives the following semantics:

λe.∃xyzu. Bill(ya) ∧ Dave(za) ∧ founded(ee) ∧ HP(ua) ∧ coord(x, y, z) ∧ arg1(ee, xa) ∧ arg2(ee, ua)

Note how the x variable occurs in two subexpressions: coord(x, y, z) and arg1(ee, xa). It can be interpreted as a variable that conjoins the variables y and z. In particular, we introduce a post-processing step in which the subexpression coord(x, y, z) ∧ arg1(ee, xa) is replaced with arg1(ee, ya) ∧ arg1(ee, za), and the x variable is removed. The resulting expression is as follows:

λe.∃yzu. Bill(ya) ∧ Dave(za) ∧ founded(ee) ∧ HP(ua) ∧ arg1(ee, ya) ∧ arg1(ee, za) ∧ arg2(ee, ua)

VP coordination is treated in a very similar way. Consider the sentence Eminem signed to Interscope and discovered 50 Cent, whose s-expression is

(nsubj (conj-vp (cc s-to-I and) d-50) Eminem)

where s-to-I refers to the VP signed to Interscope and d-50 refers to the VP discovered 50 Cent. The lambda-calculus expression for conj-vp is identical to the expression for conj-np. The logical form for the full sentence is then

λe.∃xyz. Eminem(xa) ∧ coord(e, y, z) ∧ arg1(ee, xa) ∧ s-to-I(y) ∧ d-50(z)

where we use s-to-I(y) and d-50(z) as shorthand for the lambda-calculus expressions for the two VPs.
After post-processing, this is simplified in the same way, with the subject Eminem distributed over both VP events. Other types of coordination, such as sentence-level coordination and PP coordination, are handled with the same mechanism. All coordination dependency labels have the same semantics as conj-np and conj-vp. The only reason for having distinct dependency labels for different types of coordination is that different labels appear in different positions in the obliqueness hierarchy. This is important for getting the correct scope for different forms of conjunction. For instance, the following s-expression for the Eminem example would lead to an incorrect semantics:

(conj-vp (nsubj (cc s-to-I and) Eminem) d-50)

This s-expression is not possible under the obliqueness hierarchy, which places nsubj modifiers to a verb after conj-vp modifiers.
We realize that this treatment of conjunction is quite naive in comparison to that on offer in CCG.However, given the crude analysis of conjunction in dependency syntax, a more refined treatment is beyond the scope of the current approach.
Relative Clauses. Our treatment of relative clauses is closely related to the mechanism for traces described by Moortgat (1988; 1991); see also Carpenter (1998) and Pereira (1990). Consider the NP Apple which Jobs founded, whose s-expression contains (BIND f (nsubj (dobj founded f) Jobs)) for the embedded clause. Note that the s-expression has been augmented to include a variable f in dobj position, with (BIND f ...) binding this variable at the clause level. These annotations are added using a set of heuristic rules over the original dependency parse tree.
The BIND operation is interpreted in the following way. If we have an expression of the form (BIND f S), where f is a variable and the semantics g of S includes f, this is converted to λz. g|f=EQ(z), where g|f=EQ(z) is the expression g with EQ(z) substituted for f. EQ(z)(z′) is true iff z and z′ are equal (refer to the same entity). It can be verified that (BIND f (nsubj (dobj founded f) Jobs)) has semantics

λu.∃xyz. founded(xe) ∧ Jobs(ya) ∧ EQ(u)(z) ∧ arg1(xe, ya) ∧ arg2(xe, za)

and Apple which Jobs founded has semantics

λu.∃xyz. founded(xe) ∧ Jobs(ya) ∧ EQ(u)(z) ∧ arg1(xe, ya) ∧ arg2(xe, za) ∧ Apple(ua)

as intended. Note that this latter expression can be simplified, by elimination of the z variable, to

λu.∃xy. founded(xe) ∧ Jobs(ya) ∧ arg1(xe, ya) ∧ arg2(xe, ua) ∧ Apple(ua)

Wh Questions. Wh-questions are handled using the BIND mechanism described in the previous section. As one example, the s-expression for Who did Jim marry is as follows:

(wh-dobj (BIND f (nsubj (aux (dobj marry f) did) Jim)) who)

It can be verified that this gives the final logical form

λx.∃yz. TARGET(xa) ∧ marry(ye) ∧ Jim(za) ∧ arg1(ye, za) ∧ arg2(ye, xa)

Note that the predicate TARGET is applied to the variable that is the focus of the question. A similar treatment is used for cases with the wh-element in subject position (e.g., who married Jim) or where the wh-element is extracted from a prepositional phrase (e.g., who was Jim married to).

Comparison to CCG
In this section we discuss some differences between our approach and CCG-based approaches for mapping sentences to logical forms. Although our focus is on CCG, the arguments are similar for other formalisms that use the lambda calculus in conjunction with a generative grammar, such as HPSG and LFG, or approaches based on context-free grammars.
Our approach differs in two important (and related) respects from CCG: (1) all constituents in our approach have the same semantic type (Ind × Event → Bool); (2) our formalism does not make the argument/adjunct distinction, instead essentially treating all modifiers to a given head as adjuncts.
As an example, consider the analysis of Disney acquired Pixar within CCG. In this case acquired would be assigned a lexical entry of the following form:

acquired ⇒ (S\NP)/NP : λf2.λf1.λx.∃yz. acquired(xe) ∧ f1(y) ∧ f2(z) ∧ arg1(xe, ya) ∧ arg2(xe, za)

Note the explicit arguments corresponding to the subject and object of this transitive verb (f1 and f2, respectively). An intransitive verb such as sleeps would be assigned an entry with a single functional argument corresponding to the subject (f1):

sleeps ⇒ S\NP : λf1.λx.∃y. sleeps(xe) ∧ f1(y) ∧ arg1(xe, ya)

In contrast, the entries in our system for these two verbs are simply λx.acquired(xe) and λx.sleeps(xe). The two forms are similar, have the same semantic type, and do not include variables such as f1 and f2 for the subject and object.
The advantage of our approach is that it is robust and relatively simple, in that a strict grammar that enforces type checking is not required. However, there are challenges in handling some linguistic constructions. A simple example is passive verbs. In our formalism, the passive form of acquired has the entry λx.acquired.pass(xe), distinct from its active form λx.acquired(xe). The sentence Pixar was acquired is then assigned the logical form λx.∃y. acquired.pass(xe) ∧ Pixar(ya) ∧ arg1(xe, ya). Modifying our approach to give the same logical forms for active and passive forms would require a significant extension. In contrast, in CCG the lexical entry for the passive form of acquired can directly specify that the subject fills the arg2 position. As another example, correct handling of object- and subject-control verbs is challenging in the single-type system: in the analysis of John persuaded Jim to acquire Apple, the CCG analysis would have an entry for persuaded that explicitly takes three arguments (in this case John, Jim, and to acquire Apple) and assigns Jim as both the direct object of persuaded and as the subject of acquire. In our approach the subject relationship to acquire is instead recovered in a post-processing step, based on the lexical identity of persuaded.

Semantic Parsing as Graph Matching
We next describe how the ungrounded logical forms from the previous section are mapped to a fully grounded semantic representation that can be used for question answering against Freebase. Following Reddy et al. (2014), we treat this mapping as a graph matching problem, but instead of deriving ungrounded graphs from CCG-based logical forms, we use the dependency-based logical forms from the previous sections. To learn the mapping to Freebase, we rely on manually assembled question-answer pairs. For each training question, we first find the set of oracle grounded graphs (Freebase subgraphs which, when executed, yield the correct answer) derivable from the question logical form. These oracle graphs are then used to train a structured perceptron model.

Ungrounded Graphs
We follow Reddy et al. (2014) and first convert logical forms to their corresponding ungrounded graphs. Figure 2(a) shows an example for What is the name of the company which Disney acquired in 2006?. Predicates corresponding to resolved entities (Disney(ya) and 2006(va)) become entity nodes (rectangles), whereas remaining entity predicates (name(wa) and company(xa)) become entity nodes (wa and xa) connected to entity type nodes (name and company; rounded rectangles). The TARGET(wa) node (diamond) connects to the entity node whose denotation corresponds to the answer to the question.
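A minimal sketch of how such an ungrounded graph might be read off a set of conjuncts follows. This is toy code, not the paper's implementation; resolved entities and type predicates are not distinguished here:

```python
import re

def to_graph(conjuncts):
    """Read an ungrounded graph off neo-Davidsonian conjuncts (toy sketch).

    Unary predicates become node labels; binary predicates become labeled
    edges between an event variable and an entity variable.
    """
    nodes, edges = {}, []
    for c in conjuncts:
        if m := re.fullmatch(r"([\w.]+)\((\w+)\)", c):
            nodes.setdefault(m[2], []).append(m[1])
        elif m := re.fullmatch(r"([\w.]+)\((\w+), (\w+)\)", c):
            edges.append((m[1], m[2], m[3]))
    return nodes, edges

sem = ["company(xa)", "acquired(we)", "arg2(we, xa)", "Disney(ya)", "arg1(we, ya)"]
nodes, edges = to_graph(sem)
print(nodes)  # {'xa': ['company'], 'we': ['acquired'], 'ya': ['Disney']}
print(edges)  # [('arg2', 'we', 'xa'), ('arg1', 'we', 'ya')]
```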

Grounded Graphs
The ungrounded graphs are grounded to Freebase subgraphs by mapping their entity nodes, entity-entity edges, and entity type nodes to Freebase entities, relations, and types.

CONTRACT. The CONTRACT operation takes a pair of entity nodes connected by an edge and merges them into a single node. For example, in Figure 2(a) the entity nodes wa and xa are connected by an edge via the event we. After applying the CONTRACT operation to nodes wa and xa, they are merged. Note how in Figure 2(b) all the nodes attached to wa attach to the node xa after this operation. The contracted graph is now isomorphic to its Freebase subgraph.
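The CONTRACT operation can be sketched as follows, over (label, node, node) edge triples that are a simplification of our own, not the paper's data structure:

```python
def contract(edges, merged, head):
    """CONTRACT (toy sketch): merge entity node `merged` into `head`.

    Every edge at `merged` re-attaches to `head`; edges that would become
    self-loops are dropped.
    """
    out = []
    for lbl, a, b in edges:
        a = head if a == merged else a
        b = head if b == merged else b
        if a != b:
            out.append((lbl, a, b))
    return out

edges = [("name.arg1", "we", "wa"), ("name.prep.of", "we", "xa")]
print(contract(edges, "wa", "xa"))
# [('name.arg1', 'we', 'xa'), ('name.prep.of', 'we', 'xa')]
```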
EXPAND. Parse errors may lead to ungrounded graphs with disconnected components. For example, the ungrammatical question What to do Washington DC December? results in the lambda expression λz.∃xyw. TARGET(xa) ∧ do(ze) ∧ arg1(ze, xa) ∧ Washington DC(ya) ∧ December(wa). The corresponding ungrounded graph has three disconnected components (December and Washington DC, and the component with entity node xa linked to event ze). In such cases, the graph is expanded by linking disconnected entity nodes to the event node with the largest edge degree. In the example above, this would add edges corresponding to the predicates dep(ze, ya) ∧ dep(ze, wa), where dep is the predicate introduced by the EXPAND operation when linking ya and wa to ze. When there is no existing event node in the graph, a dummy event node is introduced.
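The EXPAND operation admits a similarly small sketch, again over a simplified edge-set encoding of our own:

```python
from collections import defaultdict

def expand(edges, entity_nodes, event_nodes):
    """EXPAND (toy sketch): link disconnected entity nodes to the event node
    with the largest edge degree, via a `dep` edge.

    `edges` is a set of (event, entity) pairs.
    """
    degree = defaultdict(int)
    connected = set()
    for ev, ent in edges:
        degree[ev] += 1
        connected.add(ent)
    # Fall back to a dummy event node when the graph has no event at all.
    target = max(event_nodes, key=lambda e: degree[e]) if event_nodes else "dummy_e"
    return edges | {(target, ent) for ent in entity_nodes if ent not in connected}

edges = {("ze", "xa")}
print(sorted(expand(edges, ["xa", "ya", "wa"], ["ze"])))
# [('ze', 'wa'), ('ze', 'xa'), ('ze', 'ya')]
```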

Learning
We use a linear model to map ungrounded to grounded graphs. The parameters of the model are learned from question-answer pairs. For example, the question What is the name of the company which Disney acquired in 2006? is paired with its answer {Pixar}. In line with most work on question answering against Freebase, we do not rely on annotated logical forms associated with the question for training, instead treating grounded graphs as latent variables.
Let q be a question, let u be an ungrounded graph for q, and let g be a grounded graph formed by grounding the nodes and edges of u to the knowledge base K (throughout we use Freebase as the knowledge base). Following Reddy et al. (2014), we use beam search to find the highest-scoring pair of ungrounded and grounded graphs (û, ĝ) under the model θ ∈ R^n:

(û, ĝ) = arg max_{(u,g)} θ · Φ(u, g, q, K)

where Φ(u, g, q, K) ∈ R^n denotes the features for the pair of ungrounded and grounded graphs. Note that for a given query there may be multiple ungrounded graphs, primarily due to the optional use of the CONTRACT operation. The feature function has access to the ungrounded and grounded graphs, to the question, as well as to the content of the knowledge base and the denotation |g|_K (the denotation of a grounded graph is defined as the set of entities or attributes reachable at its TARGET node). See Section 5.3 for the features employed.
The model parameters are estimated with the averaged structured perceptron (Collins, 2002; Freund and Schapire, 1999). Given a training question-answer pair (q, A), the update is

θ ← θ + Φ(u+, g+, q, K) - Φ(û, ĝ, q, K)

where (u+, g+) denotes the pair of gold ungrounded and grounded graphs for q. Since we do not have direct access to these gold graphs, we instead rely on the set of oracle graphs, O_{K,A}(q), as a proxy:

(u+, g+) = arg max_{(u,g) ∈ O_{K,A}(q)} θ · Φ(u, g, q, K)

where O_{K,A}(q) is defined as the set of pairs (u, g) derivable from the question q whose denotation |g|_K has minimal F1-loss against the gold answer A. We find the oracle graphs for each question a priori by performing beam search with a beam size of 10k, and only use examples with oracle F1 > 0 for training.
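The perceptron step with the oracle-as-proxy trick can be sketched as follows. This is a simplified sketch: features are dicts, weight averaging is omitted, and `candidates`/`oracle` stand in for the beam-search output (the function and variable names are our own):

```python
def perceptron_step(theta, candidates, oracle, features):
    """One structured perceptron step: promote the best oracle pair, demote
    the model's current prediction.

    `theta` maps feature names to weights; `candidates` are all (u, g) pairs
    from beam search; `oracle` is the subset with minimal F1-loss against the
    gold answer; `features` maps a pair to a feature dict.
    """
    def score(pair):
        return sum(theta.get(f, 0.0) * v for f, v in features(pair).items())

    predicted = max(candidates, key=score)
    gold_proxy = max(oracle, key=score)  # highest-scoring oracle pair as proxy
    if predicted not in oracle:
        for f, v in features(gold_proxy).items():
            theta[f] = theta.get(f, 0.0) + v
        for f, v in features(predicted).items():
            theta[f] = theta.get(f, 0.0) - v
    return theta

theta = perceptron_step({"is_b": 1.0}, ["a", "b"], ["a"],
                        lambda p: {"is_a": 1.0} if p == "a" else {"is_b": 1.0})
print(theta)  # {'is_b': 0.0, 'is_a': 1.0}
```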

Experimental Setup
We next verify empirically that our proposed approach derives a useful compositional semantic representation from dependency syntax. Below, we give details on the evaluation datasets and baselines used for comparison. We also describe the model features and provide implementation details.

Training and Evaluation Datasets
We evaluated our approach on the Free917 (Cai and Yates, 2013) and WebQuestions (Berant et al., 2013) datasets. Free917 consists of 917 questions manually annotated with their Freebase query. We retrieved the answer to each question by executing its query on Freebase, and ignore the query itself for all subsequent experiments. WebQuestions consists of 5810 question-answer pairs. The standard train/test splits were used for both datasets, with Free917 containing 641 train and 276 test questions, and WebQuestions containing 3030 train and 2780 test questions. For all our development experiments we tuned the models on held-out data consisting of 30% of the training questions, while for final testing we used the complete training data.

Baseline Models and Representations
In addition to the dependency-based semantic representation DEPLAMBDA (Section 3) and previous work on these datasets, we compare to three additional baseline representations outlined below. We use GRAPHPARSER to map these representations to Freebase.

DEPTREE.
In this baseline, an ungrounded graph is created directly from the original dependency tree. An event is created for each parent and its dependents in the tree. Each dependent is linked to this event with an edge labeled with its dependency relation, while the parent is linked to the event with an edge labeled arg0. If a word is a question word, an additional TARGET predicate is attached to its entity node.
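The DEPTREE construction can be sketched directly, as toy code over (word, dependents) trees; the node naming is our own:

```python
def deptree_graph(tree, graph=None):
    """DEPTREE baseline (toy sketch): one event per head with dependents,
    an arg0 edge to the head word, and one labeled edge per dependent.

    `tree` is (word, [(label, child), ...]).
    """
    if graph is None:
        graph = []
    word, deps = tree
    if deps:
        event = word + "_ev"
        graph.append(("arg0", event, word))
        for label, child in deps:
            child_word = child[0] if isinstance(child, tuple) else child
            graph.append((label, event, child_word))
            if isinstance(child, tuple):
                deptree_graph(child, graph)
    return graph

tree = ("acquired", [("nsubj", "Disney"), ("dobj", "Pixar")])
print(deptree_graph(tree))
# [('arg0', 'acquired_ev', 'acquired'), ('nsubj', 'acquired_ev', 'Disney'),
#  ('dobj', 'acquired_ev', 'Pixar')]
```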
SIMPLEGRAPH. This representation has a single event to which all entities in the question are connected by the predicate arg1. An additional TARGET node is connected to the event by the predicate arg0. This is similar to the template representations of Yao (2015) and Bast and Haussmann (2015). Note that this representation cannot capture any compositional structure.
CCGGRAPH. Finally, we compare to the CCG-based semantic representation of Reddy et al. (2014), adding the CONTRACT and EXPAND operations to increase its expressivity.

Implementation Details
Below are more details of our entity resolution model, the syntactic parser used, features in the grounding model and the beam search procedure.
Entity Resolution. For Free917, we follow prior work and resolve entities by string match against the entity lexicon provided with the dataset. For WebQuestions, we use eight handcrafted part-of-speech patterns to identify entity span candidates. We use the Stanford CoreNLP caseless tagger for part-of-speech tagging (Manning et al., 2014). For each candidate mention span, we retrieve the top 10 entities according to the Freebase API. We then create a lattice in which the nodes correspond to mention-entity pairs, scored by their Freebase API scores, and the edges encode the fact that no joint assignment of entities to mentions can contain overlapping spans. Finally, we generate ungrounded graphs for the top 10 paths through the lattice and treat the final entity disambiguation as part of the semantic parsing problem.

Syntactic Parsing. We recase the resolved entity mentions and run a case-sensitive second-order conditional random field part-of-speech tagger (Lafferty et al., 2001). The hypergraph parser of Zhang and McDonald (2014) is used for dependency parsing. The tagger and parser are both trained on the OntoNotes 5.0 corpus (Weischedel et al., 2011), with constituency trees converted to Stanford-style dependencies (De Marneffe and Manning, 2013). To derive the CCG-based representation, we use the output of the EasyCCG parser (Lewis and Steedman, 2014).
Features. We use the features from Reddy et al. (2014), which include edge alignment and stem overlap between ungrounded and grounded graphs, and contextual features such as word and grounded-relation pairs. In addition, we introduce a feature indicating the use of the CONTRACT operation: (MergedSubEdge, HeadSubEdge, MergedIsEntity, HeadIsEntity). For example, in Figure 2 the edge between wa and xa is contracted to xa, resulting in the feature (name.arg1, name.prep.of, False, False). The EXPAND operation is treated as a pre-processing step and no features are used to encode its use. Finally, the entity-lattice score is used as a real-valued feature.

Beam Search. We use beam search to infer the highest-scoring graph pair for a question. The search operates over entity-entity edges and entity type nodes of each ungrounded graph. For an entity-entity edge, we can ground the edge to a Freebase relation, contract the edge in either direction, or skip the edge.
For an entity type node, we can ground the node to a Freebase type, or skip the node. The order of traversal is based on the number of named entities connected to an edge. After an edge is grounded, the entity type nodes connected to it are grounded in turn, before the next edge is processed. To restrict the search, if two beam items correspond to the same grounded graph, the one with the lower score is discarded. A beam size of 100 was used in all experiments.
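The overall search can be sketched as a generic beam search over per-edge actions. This is a simplified sketch: `actions` and `score` are placeholders of our own for the grounding operations and the linear model:

```python
def beam_search(edges, actions, score, beam_size=100):
    """Beam search over per-edge grounding actions (simplified sketch).

    `edges`: ungrounded edges in traversal order; `actions(edge)`: candidate
    operations (e.g. ground/contract/skip); `score(partial)`: model score of a
    partial grounded graph, encoded as a tuple of (edge, action) choices.
    """
    beam = [()]
    for edge in edges:
        expanded = [partial + ((edge, act),)
                    for partial in beam for act in actions(edge)]
        # Discard the lower-scoring of any two items with the same grounding.
        best = {}
        for p in expanded:
            key = frozenset(p)
            if key not in best or score(p) > score(best[key]):
                best[key] = p
        beam = sorted(best.values(), key=score, reverse=True)[:beam_size]
    return beam[0] if beam else ()

# Toy run: two edges, a score that prefers "ground" over "skip".
result = beam_search([1, 2], lambda e: ["ground", "skip"],
                     lambda p: sum(1 for _, a in p if a == "ground"))
print(result)  # ((1, 'ground'), (2, 'ground'))
```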

Experimental Results
We examine the different representations for question answering along two axes. First, we compare their expressiveness in terms of answer reachability assuming a perfect model. Second, we compare their performance with a learned model. Finally, we conduct a detailed error analysis of DEPLAMBDA, with a comparison to the errors made by CCGGRAPH. For WebQuestions, evaluation is in terms of the average F1-score across questions, while for Free917, evaluation is in terms of exact answer accuracy.

Expressiveness of the Representations

The representations differ little in answer reachability on WebQuestions. This might come as a surprise, but it simply reflects the fact that the dataset does not contain questions that require compositional reasoning.

Results on WebQuestions and Free917
We use the best settings from the development set in subsequent experiments, i.e., with CONTRACT and EXPAND enabled. Table 3 shows the results on the WebQuestions and Free917 test sets, with additional entries for recent prior work on these datasets. The trend from the development set carries over, and DEPLAMBDA outperforms the other graph-based representations, while performing slightly below the state-of-the-art model of Yih et al. (2015), which uses the separately trained entity resolution system of Yang and Chang (2015) ("Y&C"). When using the standard Freebase API ("FB API") for entity resolution instead, the performance of their model drops to 48.4% F1. On Free917, DEPLAMBDA outperforms all other representations by a wide margin and obtains the best result to date. Interestingly, DEPTREE outperforms SIMPLEGRAPH in this case. We attribute this to the small training set and larger lexical variation of Free917. The structural features of the graph-based representations seem highly beneficial in this case.

Error Analysis
We categorized 100 errors made by DEPLAMBDA (+C +E) on the WebQuestions development set. In 43 cases the correct answer is present in the beam but is not ranked highest. Examples where DEPLAMBDA fails due to parse errors but CCGGRAPH succeeds include when was blessed kateri born and where did anne frank live before the war. Note that the EXPAND operation mitigates some of these problems. While CCG is known for handling comparatives elegantly (e.g., who was sworn into office when john f kennedy was assassinated), we do not have a special treatment for them in the semantic representation. Differences in syntactic parsing performance and the somewhat limited expressivity of the semantic representation are likely the reasons for CCGGRAPH's lower performance.

Related Work
There are two relevant strands of prior work: general-purpose ungrounded semantics and grounded semantic parsing. The former has been studied on its own and as a component in tasks such as semantic parsing to knowledge bases (Kwiatkowski et al., 2013; Reddy et al., 2014; Choi et al., 2015; Krishnamurthy and Mitchell, 2015), sentence simplification (Narayan and Gardent, 2014), summarization (Liu et al., 2015), paraphrasing (Pavlick et al., 2015), and relation extraction (Rocktäschel et al., 2015). There are two ways of generating these representations: either relying on syntactic structure and producing the semantics post hoc, or generating them directly from text. We adopt the former approach, which was pioneered by Montague (1973) and is becoming increasingly attractive with the advent of accurate syntactic parsers.
There have been extensive studies on extracting semantics from syntactic representations such as LFG (Dalrymple et al., 1995), HPSG (Copestake et al., 2001; Copestake et al., 2005), TAG (Gardent and Kallmeyer, 2003; Joshi et al., 2007) and CCG (Baldridge and Kruijff, 2002; Bos et al., 2004; Steedman, 2012; Artzi et al., 2015). However, few have used dependency structures for this purpose. Debusmann et al. (2004) and Cimiano (2009) describe grammar-based conversions of dependencies to semantic representations, but do not validate them empirically. Stanovsky et al. (2016) use heuristics based on linguistic grounds to convert dependencies to proposition structures. Bédaride and Gardent (2011) propose a graph-rewriting technique to convert a graph built from dependency trees and semantic role structures to a first-order logical form, and present results on textual entailment. Our work, in contrast, assumes access only to dependency trees and offers an alternative method based on the lambda calculus, mimicking the structure of knowledge bases such as Freebase; we further present extensive empirical results on recent question-answering corpora.
Structural mismatch between the source semantic representation and the target application's representation is an inherent problem with approaches using general-purpose representations. Kwiatkowski et al. (2013) propose lambda-calculus operations that generate multiple type-equivalent expressions to handle this mismatch. In contrast, we use graph-transduction operations, which are easier to interpret. There is also growing work on converting syntactic structures to the target application's structure without going through an intermediate semantic representation, e.g., answer-sentence selection (Punyakanok et al., 2004; Heilman and Smith, 2010; Yao et al., 2013) and semantic parsing (Ge and Mooney, 2009; Poon, 2013; Parikh et al., 2015; Xu et al., 2015; Wang et al., 2015; Andreas and Klein, 2015).
A different paradigm is to directly parse the text into a grounded semantic representation. Typically, an over-generating grammar is used whose accepted parses are ranked (Zelle and Mooney, 1996; Zettlemoyer and Collins, 2005; Wong and Mooney, 2007; Kwiatkowksi et al., 2010; Liang et al., 2011; Berant et al., 2013; Flanigan et al., 2014; Groschwitz et al., 2015). In contrast, Bordes et al. (2014) and Dong et al. (2015) discard the notion of a target representation altogether and instead learn to rank potential answers to a given question by embedding questions and answers into the same vector space.

Conclusion
We have introduced a method for converting dependency structures to logical forms using the lambda calculus. A key idea of this work is the use of a single semantic type for every constituent of the dependency tree, which provides us with a robust way of compositionally deriving logical forms. The resulting representation is subsequently grounded to Freebase by learning from question-answer pairs. Empirically, the proposed representation was shown to be superior to the original dependency trees and more robust than logical forms derived from a CCG parser.
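The compositional derivation summarized above can be illustrated with a toy interpreter over binarized s-expressions (cf. Figure 1 for the sentence Disney acquired Pixar). The encoding of logical forms as lists of atoms, the variable naming, and the function name below are illustrative assumptions for exposition, not the paper's actual implementation:

```python
import itertools

def interpret(sexp, counter=None):
    """Toy composition of a binarized dependency s-expression into a
    conjunction of neo-Davidsonian atoms. Each word contributes a unary
    predicate over a fresh variable; each dependency label contributes a
    binary relation between the head's variable and the child's variable.
    Returns (head_variable, atoms)."""
    if counter is None:
        counter = itertools.count()
    if isinstance(sexp, str):                 # leaf word, e.g. "Pixar"
        v = f"x{next(counter)}"
        return v, [(sexp, v)]
    label, head, child = sexp                 # (dep-label head child)
    hv, h_atoms = interpret(head, counter)
    cv, c_atoms = interpret(child, counter)
    return hv, h_atoms + c_atoms + [(label, hv, cv)]

# s-expression for "Disney acquired Pixar": (nsubj (dobj acquired Pixar) Disney)
v, atoms = interpret(("nsubj", ("dobj", "acquired", "Pixar"), "Disney"))
# atoms: [('acquired', 'x0'), ('Pixar', 'x1'), ('dobj', 'x0', 'x1'),
#         ('Disney', 'x2'), ('nsubj', 'x0', 'x2')]
```

The atom list corresponds to the existentially closed conjunction λz.∃xy. acquired(z) ∧ Pixar(y) ∧ dobj(z, y) ∧ Disney(x) ∧ nsubj(z, x); because every constituent receives the same semantic type, composition never fails.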
λx.∃p∃y. acquired(x_e) ∧ 2009(y_a) ∧ in(p_e) ∧ prep(x_e, p_e) ∧ pobj(p_e, y_a)

We replace in(p_e) ∧ prep(x_e, p_e) ∧ pobj(p_e, y_a) by prep.in(x_e, y_a) as a post-processing step, effectively collapsing out the p variable while replacing the prep and pobj dependencies with a single dependency, prep.in. The final semantics are then as follows:

λx.∃y. acquired(x_e) ∧ 2009(y_a) ∧ prep.in(x_e, y_a)

λe.∃x∃y∃z∃u. Bill(y_a) ∧ Dave(z_a) ∧ founded(e_e) ∧ HP(u_a) ∧ coord(x, y, z) ∧ arg1(e_e, x_a) ∧ arg2(e_e, u_a)

λe.∃y∃z∃u. Bill(y_a) ∧ Dave(z_a) ∧ founded(e_e) ∧ HP(u_a) ∧ arg1(e_e, y_a) ∧ arg1(e_e, z_a) ∧ arg2(e_e, u_a)

VP-coordination is treated in a very similar way. Consider the sentence Eminem signed to Interscope and discovered 50 Cent. This has the following s-expression: (nsubj (conj-vp (cc s-to-I and) d-50) Eminem)
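The preposition-collapsing post-processing described above can be sketched as a rewrite over the conjunction of atoms. The list-of-tuples encoding of logical forms and the function name below are illustrative assumptions, not the system's actual data structures:

```python
def collapse_preps(atoms):
    """Rewrite in(p) ∧ prep(x, p) ∧ pobj(p, y) into prep.in(x, y),
    collapsing out the intermediate variable p. Atoms are tuples
    (predicate, arg1[, arg2])."""
    prep_edges = {a[2]: a for a in atoms if a[0] == "prep"}  # p -> ("prep", x, p)
    pobj_edges = {a[1]: a for a in atoms if a[0] == "pobj"}  # p -> ("pobj", p, y)
    new, drop = [], set()
    for atom in atoms:
        # a unary atom naming the preposition, e.g. ("in", "p")
        if len(atom) == 2 and atom[1] in prep_edges and atom[1] in pobj_edges:
            p = atom[1]
            _, x, _ = prep_edges[p]
            _, _, y = pobj_edges[p]
            new.append((f"prep.{atom[0]}", x, y))            # prep.in(x, y)
            drop.update({atom, prep_edges[p], pobj_edges[p]})
    return [a for a in atoms if a not in drop] + new

atoms = [("acquired", "x"), ("2009", "y"),
         ("in", "p"), ("prep", "x", "p"), ("pobj", "p", "y")]
collapse_preps(atoms)
# → [('acquired', 'x'), ('2009', 'y'), ('prep.in', 'x', 'y')]
```

The same rewriting style extends to the coordination case, where the coord(x, y, z) atom licenses distributing arg1 over the conjuncts.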

Figure 2: The CONTRACT operation applied to the ungrounded graph for the question What is the name of the company which Disney acquired in 2006?. After CONTRACT has been applied, the graph is isomorphic to the representation in Freebase; in (b) we show the Freebase predicates after grounding in blue.

Table 1: Oracle statistics and accuracies on the WebQuestions development set. +(−)C: with(out) CONTRACT.

Table 3: Question-answering results on the WebQuestions and Free917 test sets.

In 43 cases the correct answer is present in the beam but ranked below an incorrect answer (e.g., for where does volga river start, the annotated gold answer is Valdai Hills, which is ranked second, with Russia and Europe ranked first). In 35 cases, only a subset of the answer is predicted correctly (e.g., for what countries in the world speak german, the system predicts Germany from the Freebase relation human_language.main_country, whereas the gold relation human_language.countries_spoken_in gives multiple countries). Together, these two categories account for roughly 80% of the errors. In 10 cases, the Freebase API fails to add the gold entity to the lattice (e.g., for who is blackwell, the correct blackwell entity was missing). Due to the way WebQuestions was crowdsourced, 9 questions have incorrect or incomplete gold annotations (e.g., what does each fold of us flag means is answered with USA). The remaining 3 cases are due to structural mismatch (e.g., in who is the new governor of florida 2011, the graph failed to connect the target node with both 2011 and Florida). Due to the ungrammatical nature of WebQuestions, CCGGRAPH fails to produce ungrounded graphs for 4.5% of the complete development set, while DEPLAMBDA is more robust, with only 0.9% such errors. The CCG parser is restricted to produce a sentence tag as the final category in the syntactic derivation, which penalizes ungrammatical analyses (e.g., what victoria beckham kids names and what nestle owns).