Module4 SRL

Semantic Role Labeling (SRL) involves identifying the semantic roles of noun phrases in relation to verbs within clauses, such as agent, patient, and instrument. It is useful for tasks like question answering and machine translation, and relies on syntactic cues and selectional restrictions to resolve ambiguities. Various datasets, including PropBank and FrameNet, provide annotated examples to support SRL research and applications.

Natural Language Processing:

Semantic Role Labeling

Semantic Role Labeling (SRL)
• For each clause, determine the semantic role
played by each noun phrase that is an
argument to the verb.
– [agent John] drove [patient Mary] from [source Austin] to [destination Dallas] in [instrument his Toyota Prius].
– [instrument The hammer] broke [patient the window].
• Also referred to as “case role analysis,” “thematic analysis,” and “shallow semantic parsing.”
Semantic Roles
• Origins in the linguistic notion of case (Fillmore,
1968)
• A variety of semantic role labels have been
proposed; common ones are:
– Agent: Actor of an action
– Patient: Entity affected by the action
– Instrument: Tool used in performing action.
– Beneficiary: Entity for whom action is performed
– Source: Origin of the affected entity
– Destination: Destination of the affected entity

Use of Semantic Roles
• Semantic roles are useful for various tasks.
• Question Answering
– “Who” questions usually use Agents
– “What” questions usually use Patients
– “How” and “with what” questions usually use Instruments
– “Where” questions frequently use Sources and Destinations.
– “For whom” questions usually use Beneficiaries
– “To whom” questions usually use Destinations
• Machine Translation Generation
– Semantic roles are usually expressed using particular, distinct
syntactic constructions in different languages.
SRL and Syntactic Cues
• A semantic role is frequently indicated by a particular syntactic
position (e.g. object of a particular preposition).
– Agent: subject
– Patient: direct object
– Instrument: object of “with” PP
– Beneficiary: object of “for” PP
– Source: object of “from” PP
– Destination: object of “to” PP
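• However, these are preferences at best:
– The hammer hit the window. (subject is an Instrument, not an Agent)
– The book was given to Mary by John. (passive: subject is the Patient; the Agent is the object of “by”)
– John went to the movie with Mary. (object of “with” is a companion, not an Instrument)
– John bought the car for $21K. (object of “for” is the price, not a Beneficiary)
– John went to work by bus. (Instrument is the object of “by”, not “with”)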
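The default preferences above can be written as a lookup table. The following is a minimal sketch; the position labels ("subject", "direct_object", "pp_object") and the function name guess_role are hypothetical names chosen for illustration, not part of any SRL toolkit.

```python
# Default role preferences from the list above: syntactic position -> role.
PREPOSITION_ROLE = {
    "with": "instrument",
    "for": "beneficiary",
    "from": "source",
    "to": "destination",
}


def guess_role(position, preposition=None):
    """Return the preferred role for an argument, or None if no cue applies.

    position is "subject", "direct_object" or "pp_object"
    (labels invented for this sketch).
    """
    if position == "subject":
        return "agent"
    if position == "direct_object":
        return "patient"
    if position == "pp_object":
        return PREPOSITION_ROLE.get(preposition)
    return None


# "John went to the movie with Mary."
print(guess_role("subject"))            # agent
print(guess_role("pp_object", "to"))    # destination
print(guess_role("pp_object", "with"))  # instrument, although Mary is a companion
print(guess_role("pp_object", "by"))    # None: no default for "by"
```

The third call shows why such cues are only preferences: the table has no way to tell a companion from a tool.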
Selectional Restrictions
• Selectional restrictions are constraints that certain verbs
place on the filler of certain semantic roles.
– Agents should be animate
– Beneficiaries should be animate
– Instruments should be tools
– Patients of “eat” should be edible
– Sources and Destinations of “go” should be places.
– Sources and Destinations of “give” should be animate.
• Taxonomic abstraction hierarchies or ontologies (e.g.
hypernym links in WordNet) can be used to determine if
such constraints are met.
– “John” is a “Human” which is a “Mammal” which is a
“Vertebrate” which is an “Animate”
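Checking a restriction against a hierarchy amounts to walking hypernym links upward. The sketch below assumes a hand-written toy hierarchy (the HYPERNYM dictionary is illustrative, not real WordNet data) and hypothetical helper names is_a and satisfies.

```python
# Toy hypernym links (child -> parent), modeled on the chain above.
# These entries are illustrative assumptions, not real WordNet data.
HYPERNYM = {
    "John": "Human",
    "Human": "Mammal",
    "Mammal": "Vertebrate",
    "Vertebrate": "Animate",
    "hammer": "Tool",
    "Tool": "Artifact",
    "Artifact": "Inanimate",
}

# Generic restrictions from the list above: role -> required ancestor.
RESTRICTION = {"agent": "Animate", "beneficiary": "Animate", "instrument": "Tool"}


def is_a(concept, ancestor):
    """Follow hypernym links upward from concept, looking for ancestor."""
    seen = set()
    while concept is not None and concept not in seen:
        if concept == ancestor:
            return True
        seen.add(concept)
        concept = HYPERNYM.get(concept)
    return False


def satisfies(role, filler):
    """True if filler meets the restriction on role (or role has none)."""
    required = RESTRICTION.get(role)
    return required is None or is_a(filler, required)


print(satisfies("agent", "John"))         # True
print(satisfies("agent", "hammer"))       # False
print(satisfies("instrument", "hammer"))  # True
```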

Use of Selectional Restrictions
• Selectional restrictions can help rule in or
out certain semantic role assignments.
– “John bought the car for $21K”
• Beneficiaries should be Animate
• Instrument of a “buy” should be Money
– “John went to the movie with Mary”
• Instrument should be Inanimate
– “John drove Mary to school in the van”
“John drove the van to work with Mary.”
• Instrument of a “drive” should be a Vehicle

Selectional Restrictions and Syntactic Ambiguity
• Many syntactic ambiguities like PP
attachment can be resolved using selectional
restrictions.
– “John ate the spaghetti with meatballs.”
“John ate the spaghetti with chopsticks.”
• Instruments should be tools
• Patients of “eat” must be edible
– “John hit the man with a dog.”
“John hit the man with a hammer.”
• Instruments should be tools
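The attachment decision in these examples can be sketched as a single test on the noun inside the “with” PP. This is a toy illustration only: the TOOLS lexicon and the function attach_with_pp are hypothetical, and a real system would consult a taxonomy rather than a fixed set.

```python
# Hypothetical toy lexicon: nouns this sketch treats as tools.
TOOLS = {"chopsticks", "fork", "hammer"}


def attach_with_pp(verb, direct_object, pp_noun):
    """Choose where the PP "with <pp_noun>" attaches.

    Returns (site, head, role): the PP modifies the verb as an instrument
    when pp_noun is a tool, and otherwise modifies the direct object.
    """
    if pp_noun in TOOLS:
        return ("verb", verb, "instrument")
    return ("noun", direct_object, None)


for noun in ("meatballs", "chopsticks"):
    print(noun, attach_with_pp("ate", "spaghetti", noun))
for noun in ("dog", "hammer"):
    print(noun, attach_with_pp("hit", "man", noun))
```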

Selectional Restrictions and Word Sense Disambiguation
• Many lexical ambiguities can be resolved using
selectional restrictions.
• Ambiguous nouns
– “John wrote it with a pen.”
• Instruments of “write” should be WritingImplements
– “The bat ate the bug.”
• Agents (particularly of “eat”) should be animate
• Patients of “eat” should be edible
• Ambiguous verbs
– “John fired the secretary.”
“John fired the rifle.”
• Patients of DischargeWeapon should be Weapons
• Patients of CeaseEmployment should be Human
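The verb case can be sketched as filtering candidate senses by the type of the patient. The sense names come from the bullets above; the TYPE_OF lexicon and the function disambiguate are assumptions made for this example.

```python
# Toy lexicon mapping nouns to a semantic type (illustrative assumption).
TYPE_OF = {"secretary": "Human", "rifle": "Weapon"}

# Senses of "fire" and the type each sense requires of its patient.
SENSES = {
    "fire": {"DischargeWeapon": "Weapon", "CeaseEmployment": "Human"},
}


def disambiguate(verb, patient):
    """Return the senses of verb whose patient restriction the noun meets."""
    patient_type = TYPE_OF.get(patient)
    return [
        sense
        for sense, required in SENSES[verb].items()
        if required == patient_type
    ]


print(disambiguate("fire", "secretary"))  # ['CeaseEmployment']
print(disambiguate("fire", "rifle"))      # ['DischargeWeapon']
```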
Empirical Methods for SRL
• Difficult to acquire all of the selectional
restrictions and taxonomic knowledge needed for
SRL.
• Difficult to efficiently and effectively apply
knowledge in an integrated fashion to
simultaneously determine correct parse trees, word
senses, and semantic roles.
• Statistical/empirical methods can be used to
automatically acquire and apply the knowledge
needed for effective and efficient SRL.

SRL as Sequence Labeling

• SRL can be treated as a sequence labeling problem.
• For each verb, try to extract a value for each
of the possible semantic roles for that verb.
• Employ any of the standard sequence
labeling methods
– Token classification
– HMMs
– CRFs
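To use a sequence labeler, role spans have to be turned into one tag per token. The slides do not name an encoding; the sketch below assumes the common BIO scheme (B- begins a span, I- continues it, O is outside any role), and spans_to_bio is a hypothetical helper.

```python
def spans_to_bio(tokens, spans):
    """Convert role spans to one BIO tag per token.

    spans is a list of (start, end, role) with end exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end, role in spans:
        tags[start] = "B-" + role
        for i in range(start + 1, end):
            tags[i] = "I-" + role
    return tags


tokens = "The hammer broke the window".split()
spans = [(0, 2, "instrument"), (3, 5, "patient")]
for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(token, tag)
```

A token classifier, HMM, or CRF would then be trained to predict these tags for each verb in the sentence.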
SRL with Parse Trees
• Parse trees help identify semantic roles through
exploiting syntactic clues like “the agent is usually
the subject of the verb”.
• Parse tree is needed to identify the true subject.
• Parse of “The man by the store near the dog ate the apple.”:
(S
  (NPsg (Det The) (N man)
    (PP (Prep by) (NPpl the store near the dog)))
  (VPsg ate the apple))
• “The man” is the agent of “ate”, not “the dog”.
SRL with Parse Trees
• Assume that a syntactic parse is available.
• For each predicate (verb), label each node in the
parse tree as either not-a-role or one of the
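possible semantic roles.
• Color code (node labels): not-a-role, agent, patient, source, destination, instrument, beneficiary
• Example parse of “The big dog with the boy bit a girl.”:
(S
  (NP
    (NP (Det The) (A (Adj big) (A ε)) (N dog))
    (PP (Prep with) (NP (Det the) (A ε) (N boy))))
  (VP (V bit)
    (NP (Det a) (A ε) (N girl))))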
SRL as Parse Node Classification

• Treat the problem as classifying parse-tree nodes.
• Can use any machine-learning classification
method.
• Critical issue is engineering the right set of
features for the classifier to use.

Features for SRL
• Phrase type: The syntactic label of the
candidate role filler (e.g. NP).
• Parse tree path: The path in the parse tree
between the predicate and the candidate role
filler.

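Parse Tree Path Feature: Example 1
• Sentence: “The big dog with the boy bit a girl.”
• Predicate: V (“bit”); candidate role filler: the subject NP (“The big dog with the boy”)
• Path Feature Value: V ↑ VP ↑ S ↓ NP
(S
  (NP
    (NP (Det The) (A (Adj big) (A ε)) (N dog))
    (PP (Prep with) (NP (Det the) (A ε) (N boy))))
  (VP (V bit)
    (NP (Det a) (A ε) (N girl))))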

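Parse Tree Path Feature: Example 2
• Sentence: “The big dog with the boy bit a girl.”
• Predicate: V (“bit”); candidate role filler: the NP inside the PP (“the boy”)
• Path Feature Value: V ↑ VP ↑ S ↓ NP ↓ PP ↓ NP
(S
  (NP
    (NP (Det The) (A (Adj big) (A ε)) (N dog))
    (PP (Prep with) (NP (Det the) (A ε) (N boy))))
  (VP (V bit)
    (NP (Det a) (A ε) (N girl))))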
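The path feature in both examples can be computed by climbing from the predicate to the lowest common ancestor and then descending to the candidate. The following is a minimal sketch; the Node class and the helpers ancestors and tree_path are hypothetical names written for this example, not taken from a parsing library.

```python
class Node:
    """A parse-tree node that records its parent."""

    def __init__(self, label, children=(), word=None):
        self.label = label
        self.word = word
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Return [node, parent, grandparent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def tree_path(predicate, candidate):
    """Labels going up from predicate to the common ancestor, then down."""
    up = ancestors(predicate)
    down = ancestors(candidate)
    down_ids = {id(n) for n in down}
    # Index in `up` of the lowest common ancestor.
    top = next(i for i, n in enumerate(up) if id(n) in down_ids)
    # Index in `down` of that same node.
    meet = next(i for i, n in enumerate(down) if n is up[top])
    path = "↑".join(n.label for n in up[: top + 1])
    for n in reversed(down[:meet]):
        path += "↓" + n.label
    return path


# "The big dog with the boy bit a girl."
boy_np = Node("NP", [Node("Det", word="the"), Node("A"), Node("N", word="boy")])
pp = Node("PP", [Node("Prep", word="with"), boy_np])
dog_np = Node("NP", [
    Node("Det", word="The"),
    Node("A", [Node("Adj", word="big"), Node("A")]),
    Node("N", word="dog"),
])
subject = Node("NP", [dog_np, pp])
verb = Node("V", word="bit")
girl_np = Node("NP", [Node("Det", word="a"), Node("A"), Node("N", word="girl")])
root = Node("S", [subject, Node("VP", [verb, girl_np])])

print(tree_path(verb, subject))  # V↑VP↑S↓NP
print(tree_path(verb, boy_np))   # V↑VP↑S↓NP↓PP↓NP
```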

Features for SRL
• Phrase type: The syntactic label of the
candidate role filler (e.g. NP).
• Parse tree path: The path in the parse tree
between the predicate and the candidate role
filler.
• Position: Does candidate role filler precede
or follow the predicate in the sentence?
• Voice: Is the predicate an active or passive
verb?
• Head Word: What is the head word of the
candidate role filler?
Head Word Feature Example

• There are standard syntactic rules for
determining which word in a phrase is the
head.
• Example: for the subject NP (“The big dog with the boy”), Head Word: dog
(S
  (NP
    (NP (Det The) (A (Adj big) (A ε)) (N dog))
    (PP (Prep with) (NP (Det the) (A ε) (N boy))))
  (VP (V bit)
    (NP (Det a) (A ε) (N girl))))
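Head finding can be sketched as a table of preferred head children per phrase label. The HEAD_RULES table below is a deliberately tiny assumption that covers only this example; real head-rule tables are much larger. Trees are written as (label, children) or (label, word) tuples, and the empty ε nodes are omitted.

```python
# For each phrase label, child labels to try in order when looking for
# the head. A simplified table assumed for this sketch.
HEAD_RULES = {"S": ["VP"], "VP": ["V"], "NP": ["NP", "N"], "PP": ["Prep"]}


def head_word(tree):
    """Return the head word of a (label, children) or (label, word) tree."""
    label, rest = tree
    if isinstance(rest, str):
        return rest  # leaf node: rest is the word itself
    for wanted in HEAD_RULES.get(label, []):
        for child in rest:
            if child[0] == wanted:
                return head_word(child)
    return head_word(rest[-1])  # fallback: rightmost child


subject = ("NP", [
    ("NP", [("Det", "The"), ("A", [("Adj", "big")]), ("N", "dog")]),
    ("PP", [("Prep", "with"), ("NP", [("Det", "the"), ("N", "boy")])]),
])
sentence = ("S", [
    subject,
    ("VP", [("V", "bit"), ("NP", [("Det", "a"), ("N", "girl")])]),
])

print(head_word(subject))   # dog
print(head_word(sentence))  # bit
```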


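Complete SRL Example
• Sentence: “The big dog with the boy bit a girl.”
(S
  (NP
    (NP (Det The) (A (Adj big) (A ε)) (N dog))
    (PP (Prep with) (NP (Det the) (A ε) (N boy))))
  (VP (V bit)
    (NP (Det a) (A ε) (N girl))))
• Features for the subject NP:
Phrase type   Parse path   Position   Voice    Head word
NP            V↑VP↑S↓NP    precede    active   dog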
Issues in Parse Node Classification
• Many other useful features have been proposed.
– If the parse-tree path goes through a PP, what is the
preposition?
• Results may violate constraints like “an action has
at most one agent”.
– Use some method to enforce constraints when making
final decisions, i.e. determine the most likely
assignment of roles that also satisfies a set of known
constraints.
• Due to errors in syntactic parsing, the parse tree is
likely to be incorrect.
– Try multiple top-ranked parse trees and somehow
combine results.
– Integrate syntactic parsing and SRL.
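The constraint-enforcement idea above can be illustrated with a brute-force search over joint assignments. This is a sketch under stated assumptions: the per-node probabilities are invented, best_assignment is a hypothetical helper, and exhaustive enumeration stands in for whatever optimization method a real system would use.

```python
from itertools import product


def best_assignment(scores, unique_roles=("agent", "patient")):
    """Pick the most probable joint labeling that repeats no unique role.

    scores holds one dict per candidate node, mapping role -> probability.
    """
    best, best_score = None, -1.0
    for combo in product(*(list(s) for s in scores)):
        if any(combo.count(role) > 1 for role in unique_roles):
            continue  # violates "at most one" for some role
        score = 1.0
        for node_scores, role in zip(scores, combo):
            score *= node_scores[role]
        if score > best_score:
            best, best_score = combo, score
    return best


# Invented classifier outputs for two candidate nodes.
scores = [
    {"agent": 0.7, "patient": 0.2, "not-a-role": 0.1},
    {"agent": 0.6, "patient": 0.3, "not-a-role": 0.1},
]
print(best_assignment(scores))  # ('agent', 'patient')
```

Choosing the top label for each node independently would give two agents here; the joint search trades the second node's best label for a consistent assignment.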
More Issues in Parse Node Classification
• Break labeling into two steps:
– First decide if node is an argument or not.
– If it is an argument, determine the type.

SRL Datasets
• FrameNet:
– Developed at Univ. of California at Berkeley
– Based on notion of Frames
• PropBank:
– Developed at Univ. of Pennsylvania
– Based on elaborating their Treebank
• Salsa:
– Developed at Universität des Saarlandes
– German version of FrameNet

PropBank
• Project at U Penn led by Martha Palmer to add
semantic roles to the Penn Treebank.
• Roles (Arg0 to ArgN) specific to each individual
verb to avoid having to agree on a universal set.
– Arg0 basically “agent”
– Arg1 basically “patient”
• Annotated over 1M words of Wall Street Journal
text with existing gold-standard parse trees.
• Statistics:
– 43,594 sentences
– 99,265 propositions (verbs + roles)
– 3,324 unique verbs
– 262,281 role assignments

PropBank
• Proto-roles (proto-agent, proto-patient, etc.) and
verb-specific semantic roles are used.
• Example: sentences with the verb ‘increase’:
– He increased the price of the fruit.
– [Arg0]He increased [Arg1]{the price of the fruit}.

– The price of the fruit is increased by me again by 5%.
– [Arg1]{The price of the fruit} is increased by [Arg0]me again by [Arg2]5%.
– Arg0: causer
– Arg1: impacted
– Arg2: range of cause
FrameNet
• Project at UC Berkeley led by Chuck Fillmore for
developing a database of frames, general semantic
concepts with an associated set of roles.
• Roles are specific to frames, which are “invoked” by
multiple words, both verbs and nouns.
– JUDGEMENT frame
• Invoked by: V: blame, praise, admire; N: fault, admiration
• Roles: JUDGE, EVALUEE, and REASON
• Specific frames chosen, and then sentences that employed
these frames selected from the British National Corpus and
annotated by linguists for semantic roles.
• Initial version: 67 frames, 1,462 target words,
49,013 sentences, 99,232 role fillers

FrameNet
• Frame-specific semantic roles (core roles) and non-core
roles (similar to the ones in PropBank) are used.
• For example, the change_position_on_a_scale frame is
defined as:
– This frame consists of words that indicate the change of an Item’s
position on a scale (the Attribute) from a starting point (Initial
value) to an end point (Final value)
• Core Roles: ATTRIBUTE, ITEM, DIFFERENCE,
INITIAL_STATE, INITIAL_VALUE,
FINAL_STATE, FINAL_VALUE
• Non-core Roles: DURATION, SPEED, GROUP
• Ex: [ITEM Oil] rose [ATTRIBUTE in price] [DIFFERENCE by 2%]
[INITIAL_VALUE from 90] [FINAL_VALUE to 91.8].

FrameNet Results
• Gildea and Jurafsky (2002) performed SRL
experiments with initial FrameNet data.
• Assumed correct frames were identified and the
task was to fill their roles.
• Automatically produced syntactic analyses using
Collins (1997) statistical parser.
• Used simple Bayesian method with smoothing to
classify parse nodes.
• Achieved 80.4% correct role assignment.
Increased to 82.1% when frame-specific roles
were collapsed to 16 general thematic categories.
CONLL SRL Shared Task
• CONLL (Conference on Computational Natural
Language Learning) is the annual meeting of
SIGNLL (Special Interest Group on Natural
Language Learning) of the ACL (Association for
Computational Linguistics).
• Each year, CONLL has a “Shared Task” competition.
• PropBank semantic role labeling was used as the
Shared Task for CONLL-04 and CONLL-05.
• In CONLL-05, 24 teams participated.

CONLL-05 Learning Approaches
• Maximum entropy (8 teams)
• SVM (7 teams)
• SNoW (1 team) (ensemble of enhanced Perceptrons)
• Decision Trees (1 team)
• AdaBoost (2 teams) (ensemble of decision trees)
• Nearest neighbor (2 teams)
• Tree CRF (1 team)
• Combination of approaches (2 teams)
CONLL Experimental Method
• Trained on 39,832 WSJ sentences
• Tested on 2,416 WSJ sentences
• Also tested on 426 Brown corpus sentences to
test generalizing beyond financial news.
• Metrics:
– Precision: (# roles correctly assigned) / (# roles assigned)
– Recall: (# roles correctly assigned) / (total # of roles)
– F-measure: harmonic mean of precision and recall
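The three metrics can be computed directly from sets of predicted and gold role assignments. In this sketch each assignment is a (predicate, span, role) triple; that representation, the example data, and the function name precision_recall_f are assumptions made for illustration.

```python
def precision_recall_f(predicted, gold):
    """Score predicted role assignments against the gold standard.

    Both arguments are sets of (predicate, span, role) triples.
    """
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# "The big dog with the boy bit a girl." with token spans (end exclusive).
gold = {("bit", (0, 6), "agent"), ("bit", (7, 9), "patient")}
predicted = {
    ("bit", (0, 6), "agent"),       # correct
    ("bit", (7, 9), "instrument"),  # right span, wrong role
    ("bit", (4, 6), "patient"),     # spurious
}
p, r, f = precision_recall_f(predicted, gold)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")
```

With one correct assignment out of three predicted and two gold, precision is 1/3, recall is 1/2, and their harmonic mean is 0.4.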

Best Result from CONLL-05
• Univ. of Illinois system based on SNoW with
global constraints enforced using Integer Linear
Programming.

             P(%)    R(%)    F(%)
WSJ Test     82.28   76.78   79.44
Brown Test   73.38   62.93   67.75

Issues in SRL
• How to properly integrate syntactic parsing,
WSD, and role assignment so that they all
aid each other.
• How can SRL be used to aid end-use
applications:
– Question answering
– Machine Translation
– Text Mining
