Unification Parsing
Typed Feature Structures
– Combination: what is the result (product) of the rule?
CFG parsing
• Example CFG rule:
𝑆 → 𝑁𝑃 𝑉𝑃
Result: (?)
𝐷𝑃##𝐷𝑒𝑡#𝑁𝑃
• The problem is that this result is probably not an input (RHS) to
another rule
• Cannot rule out: “Those sheep runs”
– subject-verb agreement is not encoded
– Subcategorization frames in their different stages
Insufficiency of CFGs
XP→Y X
• Feature structures can do that
1. operate on two (or more) input structures
2. produce exactly one new output structure, or
• This suggests that operation “⊔”:
– is information-preserving
– monotonically incorporates specific
information (from runtime inputs)
– …into more general structures (authored
rules)
Constraint-based parsing
• From graph-theory and Prolog we know that an ideal “⊔” is
graph unification.
– If compatible, which is “more specific”
Head-Driven Phrase Structure Grammar
• “HPSG,” Pollard and Sag, 1994
• Highly consistent and powerful formalism
• Monostratal, declarative, non-derivational, lexicalist,
constraint-based
• Has been studied for many different languages• Psycholinguistic
evidence
HPSG foundations: Typed Feature Structures
– A ⊔ B is defined for all types:
• “Compatible types” 𝐴 ⊔ B = C
• “Incompatible types” A ⊔ B = ⊥
Type Hierarchy (Carpenter 1992)
• In the view of constraint-based grammar
– A unique most general type: *top* T
– Each non-top type has one or more parent type(s) – Two types are
compatible iff they share at least one offspring type
– Each non-top type is associated with optional
constraints
• Constraints specified in ancestor types are monotonically
inherited
• Constraints (either inherited, or newly introduced) must be
compatible
multiple inheritance
The type hierarchy• A simple example
Automatically
introduce GLB types
so that any two types
that unify have
exactly one greater
lowest bound
Typed Feature Structures
Feature Structure Grammars• HPSG (Pollard & Sag 1994)

Properties of TFSes
• Finiteness
each node has single type which is defined in the hierarchy
TFS equivalent views
• Unification is the operation of merging
information-bearing structures, without loss of information if the
unificands are consistent (monotonicity).
• It is an information ordering: a subsumes b iff
a contains less information than b
• Feature structure unification (⊔) is the operation of combining two
feature structures so that the result is: – …the most general feature
structure that is subsumed by the two unificands (the least
upper bound)
– …if there is no such structure, then the unification fails.
• Two feature structures that can be unified are compatible (or
consistent). Comparability entails

TFS unification
TFS unification has much subtlety
For example, it can render authored co-references vacuous
– unifying the front of B (i.e. the value of its LIST feature) into
the tail of A (its LAST value) and
– using the tail of difference list B as the new tail for the result
of the concatenation.
Result of appending the lists
this 𝑥 ⋀ fierce 𝑥 ⋀ dog 𝑥 ⋀chased 𝑒, 𝑥, 𝑦 ⋀
that 𝑦 ⋀ angry 𝑦 ⋀ cat(𝑦)
Semantics desiderata
• Compositionality
– The meaning of a phrase is composed of the meanings of its parts
Copestake, A., Flickinger, D., Pollard, C. J., and Sag, I. A. (2005).
Minimal recursion semantics: an introduction. Research on
Language and Computation, 3(4):281–332.
• Used across DELPH-IN projects

DELPH-IN Consortium
• An informal collaboration of about 20 research sites worldwide focused
on deep linguistic processing since ~2002
– DFKI Saarbrücken GmbH, Germany
– Stanford University, USA
– University of Oslo, Norway
– Saarland University, Germany
– University of Washington, Seattle, USA
– Nanyang Tecnological University, Singapore
– …many others
Key DELPH-IN Projects
• English Resource Grammar (ERG)
Flickinger 2002, • The Grammar Matrix
Bender et al. 2002, • Other large grammars
JACY (Japanese, Siegel and Bender 2002) GG; Cheetah (German; Crysmann;
Cramer and Zhang 2009) Many others: • Operational instrumentation of
grammars
[incr tsdb()] (Oepen and Flickinger 1998)• Joint-reference formalism
tools
English Resource Grammar (Flickinger 2002)
• Primary approach to combating parse intractability
• Every new feature structure is checked for a subsumption
relationship with existing TFSs.
…with a different adjacency/proximity condition
– Instead of joining adjacent words (parsing) the generator joins
mutually-exclusive EPs
• Trigger rules
– Required for postulating semantically vacuous lexemes• Index
accessibility filtering
– Futile hypotheses can be intelligently avoided
• Skolemization
– Inter-EP relationships (‘variables’) are burned-in to the input
semantics to guarantee proper semantics
DELPH-IN Joint Reference Formalism• Key focus of DELPH-IN research:
computational Head- driven Phrase Structure Grammar
HPSG, Pollard & Sag 1994
• TDL: Type Description Language
Krieger & Schafer 1994
• A minimalistic constraint-based typed feature structure (TFS)
formalism that maintains computational
tractability
Carpenter 1992
• MRS: Minimum Recursion Semantics
Copestake et al. 1995, 2005
• Multiple toolsets: LKB, PET, Ace, agree
• Committed to open source
TDL: Type Description Language• A text-based format for authoring
constraint- based grammars
TDL: type definition language
sleeping.”
|
|
|
|
engineering system |
|
Project components







|
parser |
client app |
debugger |
|
|
engine utilities Thai text
utilities