
0% found this document useful (0 votes)
89 views148 pages

NLP Techknowledge

Uploaded by

Temp Acc
Copyright
© © All Rights Reserved

Mohd Laraib

MODULE 5
CHAPTER 5
Pragmatic and Discourse Processing

Syllabus
5.1 Discourse Reference Resolution, Reference Phenomena, Syntactic & Semantic constraint on coherence; Anaphora Resolution using Hobbs and Centering Algorithm.
5.2 Self-Learning topics: Discourse segmentation, Coreference resolution.

5.1 Pragmatic Analysis
5.2 Discourse Analysis
5.3 Reference Resolution
5.4 Reference Phenomena
5.4.1 Type of Referring Expression
5.4.2 Types of Referents which Complicate the Reference Resolution
5.5 Syntactic and Semantic Constraints on Coreference
5.6 Anaphora Resolution
5.6.1 Hobbs Algorithm
5.6.2 Centering Algorithm
Natural Language Processing (MU - Sem 7 - Comp) (Pragmatic & Discourse Proc.)

5.1 PRAGMATIC ANALYSIS

Pragmatic analysis is a step in extracting information from text. In particular, it is the step that analyses a group of text structures to determine their true, intended meaning.
The study of pragmatics, a branch of linguistics and semiotics, focuses on how context affects meaning.
In contrast to semantics, which examines meaning that is "coded" or conventional in a particular language, pragmatics investigates how the transmission of meaning depends not only on the speaker's and listener's structural and linguistic knowledge (grammar, lexicon, etc.), but also on the context of the utterance, any prior knowledge about those involved, the speaker's implied intent, and other elements.
For example, a purely semantic analysis of the sentence "The soldier fought like a lion" yields the meaning that the soldier fought using paws or bit using teeth, which is not the intended meaning. The intended meaning here is to highlight the ferociousness of the soldier.

Sarcasm identification or detection is one of the important applications of pragmatic analysis.

5.2 DISCOURSE ANALYSIS

Till now our attention has mostly been drawn to linguistic events that take place at the word or sentence level. Of course, language often consists of collocated, connected groupings of sentences rather than single, unrelated utterances. Such a collection of sentences is referred to as a discourse.
Language in use is what the word discourse in linguistics refers to. Discourse analysis is the practice of analysing texts or languages that entails understanding social interactions and text interpretation. Dealing with morphemes, n-grams, tenses, linguistic features, page layouts, and other elements can be part of discourse analysis.
One definition of discourse is a series of sentences.

5.3 REFERENCE RESOLUTION

Reference occurrences are abundant at the discourse level of language. Think about the conversation in the following example.

John went to Bill's car dealership to check out an Acura Integra. He looked at it for about an hour. (5.1)
(MU-New Syllabus w.e.f. academic year 22-23) (M7-83) Tech-Neo Publications...A SACHIN SHAH Venture
What do pronouns like 'he' and 'it' mean? A human reader has little trouble deducing that 'he' refers to John and not Bill, and that 'it' refers to the Integra and not Bill's car dealership, but what about a computer?
The mechanism through which speakers use terms like 'John' and 'he' in passage (5.1) to signify a person called John is the subject of this section's consideration of the reference problem. Reference resolution is the process of identifying this relation: when speakers use expressions like 'John' and 'he' in passage (5.1) to denote a person named John, the link between the expressions and the person is recovered.
We must first clarify a few terms before we can continue further.

1. Referring Expression and Referent

A natural language expression used to perform reference is called a referring expression, and the entity that is referred to is called the referent.

Example: The name 'John' and the pronoun 'he' in passage (5.1) are referring expressions, and John the person is their referent.

As a convenient shorthand, we will sometimes speak of a referring expression referring to a referent, e.g., we might say that 'he' refers to John the person.

2. Corefer

If two referring expressions refer to the same entity, they are said to corefer.
Example: The name 'John' and the pronoun 'he' corefer in passage (5.1), as both refer to the person John.

3. Antecedent

There is also a name for a referring expression that licenses the use of another, similar to how mentioning the name 'John' permits John the person to be afterwards referred to by the pronoun 'he'.

We call the name 'John' the antecedent of 'he'.

4. Anaphora and Anaphoric

Reference to an entity that has already been introduced into the discourse is called anaphora, and the referring expression used is said to be anaphoric.
Example: In passage (5.1), the pronouns 'he' and 'it' are anaphoric, and their use to refer back to John and the Integra is an instance of anaphora. The name 'John' is the antecedent of 'he'.

5.4 REFERENCE PHENOMENA

The range of referential phenomena offered by natural languages is quite extensive.
Five types of referring expression are covered here: indefinite noun phrases, definite noun phrases, pronouns, demonstratives, and names.

Additionally, it is important to note the three types of referents which complicate the reference resolution problem: inferrables, discontinuous sets, and generics.

5.4.1 Type of Referring Expression

There are five types of referring expressions: indefinite noun phrases, definite noun phrases, pronouns, demonstratives, and names.

Fig. 5.4.1: Types of Referring Expressions (indefinite noun phrases, definite noun phrases, pronouns, demonstratives, names)

1. Indefinite Noun Phrases

Indefinite references bring entities that are unfamiliar to the hearer into the discourse context. The determiner 'a' (or 'an') is used to indicate indefinite references most frequently, although they may also be indicated by quantifiers like 'some' or even the determiner 'this'.

Example
I saw 'an' Acura Integra today. (5.2)
'Some' Acura Integras were being unloaded at the local dealership today. (5.3)
I saw 'this' awesome Acura Integra today. (5.4)
I am going to the dealership to buy 'an' Acura Integra today. (5.5)


These noun phrases evoke, in the discourse model, a representation for a new entity that fits the description.

There may be a specific/non-specific ambiguity, since the indefinite determiner doesn't say whether the speaker can identify the object. Example (5.2) has only the specific reading, since the speaker is specifically thinking of the Integra that she saw.

Drawback: On the other hand, both interpretations are plausible in sentence (5.5). In other words, the speaker may already know which Integra she wants (specific) or may simply be preparing to choose one she likes (nonspecific).
Solution to the problem: In some instances, a following referring expression can be used to distinguish between the readings; if the expression is definite, the reading is specific (I hope they still have that option), and if it is indefinite, the reading is nonspecific (I hope they have a car I like).

There are several exceptions to this rule. For instance, definite expressions are acceptable in particular modal contexts (I'll park it in my garage), and this is compatible with the nonspecific reading.

2. Definite Noun Phrases

An entity that is identifiable to the hearer is referred to with a definite reference when
(i) it has already been mentioned in the discourse context (and is therefore represented in the discourse model),
(ii) it is a part of the hearer's worldview, or
(iii) the description itself implies the object's uniqueness.

Example
I saw an Acura Integra today. 'The Integra' was white and needed to be washed. (5.6)
'The Indianapolis 500' is the most popular car race in the US. (5.7)
'The fastest car in the Indianapolis 500' was an Integra. (5.8)

(5.6) showcases an example where the referent may be determined from the conversation context.
(5.7) showcases an example where the referent is identifiable from the hearer's set of beliefs.
(5.8) showcases an example where the referent is inherently unique.

It is important to note that, in order to refer with a definite noun phrase, the referent must be accessible from either the discourse model or the hearer's collection of worldviews. In the second instance, the reference also evokes a discourse model representation of the referent.
3. Pronouns

Pronominalization is another type of definite reference, as seen in example (5.9).

I saw an Acura Integra today. 'It' was white and needed to be washed. (5.9)

Pronominal reference has more restrictions than complete definite noun phrases, necessitating a high level of activation or salience for the referent in the discourse model.
While definite noun phrases can frequently refer further back, pronouns typically (but not always) refer to entities that were introduced no more than one or two sentences back in the current discourse. This is illustrated by the difference between sentences (5.10d) and (5.10d').

(5.10)
(a) John went to Bob's party, and parked next to a beautiful Acura Integra.
(b) He went inside and talked to Bob for more than an hour.
(c) Bob told him that he recently got engaged.
(d) He also said that he bought 'it' yesterday.
(d') He also said that he bought 'the Acura' yesterday.

By the time we get to the last sentence, the Integra no longer has the level of salience necessary to permit pronominal reference, so (d) is hard to interpret while (d') is fine.

Cataphora: In cataphora, pronouns are mentioned before their referents are, as in the following example (5.11). Here, it can be seen that the pronouns 'he' and 'it' both appear before the introduction of their respective referents.

Before he bought it, John checked over the Integra very carefully. (5.11)

Bound: Pronouns can be considered bound when they appear in quantified contexts, as in the following example (5.12). When read in the appropriate way, 'her' operates as a variable bound to the quantified expression 'every woman', rather than referring to any particular woman in the context.

Every woman bought her Acura at the local dealership. (5.12)


4. Demonstratives

Demonstrative pronouns like 'this' and 'that' function differently than simple definite pronouns like 'it'. They may show up both by themselves and as determiners, such as 'this Acura' and 'that Acura'.
The choice between the two demonstratives ('this' signalling proximity and 'that' signalling distance) is typically connected to some idea of spatial proximity. Depending on the situational setting of the conversation participants, geographical (spatial) distance might be measured as in (5.13).

John shows Bob an Acura Integra and a Mazda Miata.
Bob (pointing): I like 'this' better than 'that'. (5.13)

As an alternative, distance might be understood symbolically in terms of conceptual relations in the discourse model. Consider the example (5.14).

I bought an Integra yesterday. It's similar to the one I bought five years ago. That one was really nice, but I like this one even better. (5.14)

Here, 'this one' refers to the Acura that was purchased yesterday (closer temporal distance), and 'that one' refers to the Acura that was purchased five years ago (greater temporal distance).

5. Names

Names are a prevalent type of referring expression and include names of persons, organisations, and places. In the discourse, names may be used to refer to both new and previously mentioned entities.

(5.15)
(a) 'Miss Woodhouse' certainly had not done him justice.
(b) 'International Business Machines' sought patent compensation from Amazon; 'IBM' had previously sued other companies.

5.4.2 Types of Referents which Complicate the Reference Resolution

After describing several referring expression types, we will now focus on a few intriguing referent kinds that make the reference resolution problem more challenging.

Fig. 5.4.2: Types of Referents which Complicate the Reference Resolution (inferrables, discontinuous sets, generics)

1. Inferrables

In certain instances, a referring expression refers to an entity that has been implied rather than one that has been expressly evoked in the text. Such referents are called inferrables.

Example
I almost bought an Acura Integra today, but 'a door' had a dent and 'the engine' seemed noisy. (5.16)
Now, consider the expressions 'a door' and 'the engine'. In a typical situation, the indefinite noun phrase 'a door' would introduce a new door into the discourse context, but in this instance, the hearer is to infer something additional: that it is not just any door, but rather one of the doors of the Integra.

The definite noun phrase 'the engine' is also typically used to denote an engine that has already been evoked or is otherwise easily recognised. Although the engine of the aforementioned Integra has not been explicitly mentioned, the hearer infers that it is the referent.

Inferrables can also specify the outcomes of processes that are described by utterances in a discourse. Take into account the possible follow-ups (a-c) to the recipe sentence (5.17).

(5.17)
Mix the flour, butter, and water.
(a) Knead 'the dough' until smooth and shiny.
(b) Spread 'the paste' over the blueberries.
(c) Stir 'the batter' until all lumps are gone.

The expressions 'the dough' (a solid), 'the batter' (a liquid), and 'the paste' (something in between) can all be used to refer to the outcome of the actions in the first line, but they all indicate distinct characteristics of this outcome.

2. Discontinuous Sets

Plural referring expressions such as 'they' and 'them' generally refer to groups of items that are evoked collectively, either by another plural expression (their Acuras) or by a conjoined noun phrase (John and Mary), as shown in the following example.

John and Mary love their Acuras. They drive them all the time. (5.18)

However, plural references may also apply to groups of things that have been evoked by discontinuous phrases in the text, as in the following example.

John has an Acura, and Mary has a Mazda. They drive them all the time. (5.19)

Here, 'they' refers to John and Mary, and 'them' refers to the Acura and the Mazda. Also, take note that the second sentence in this instance will often be read pairwise or respectively, meaning that John drives the Acura and Mary drives the Mazda, rather than both of them driving both automobiles.

3. Generics

The existence of generic reference adds another layer of complexity to the reference problem. Consider example (5.20).

I saw no less than 6 Acura Integras today. 'They' are the coolest cars. (5.20)

The most obvious reading here is that 'they' refers to the class of Integras as a whole rather than to the specific six Integras mentioned in the first sentence.
5.5 SYNTACTIC AND SEMANTIC CONSTRAINTS ON COREFERENCE

Any effective reference resolution method must include the step of filtering the collection of potential referents based on a few reasonably rigid requirements. Here, we go through a few of these restrictions.

Fig. 5.5.1: Syntactic and Semantic Constraints on Coreference (number agreement, person and case agreement, gender agreement, syntactic constraints, selectional restrictions)

1. Number Agreement

Referring expressions and their referents must agree in number; in English, this entails differentiating between references to the singular and the plural.

In Table 5.5.1, pronouns are categorised according to their number.

Table 5.5.1: Pronouns according to Number Agreement

Singular | Plural | Unspecified
She, Her, He, Him, His, It | We, Us, They, Them | You
Example on number agreement

The following examples illustrate the number agreement constraints.

John has a new Acura. 'It' is red. (5.21)
John has three new Acuras. 'They' are red. (5.22)
John has a new Acura. 'They' are red. (5.23)
John has three new Acuras. 'It' is red. (5.24)

Here, the examples (5.21) and (5.22) are correct, whereas (5.23) and (5.24) are incorrect statements, as the term 'they' should be associated with plural and the term 'it' should be associated with singular. These statements violate the constraints.
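The number check can be sketched as a small filter over candidate referents. This is only an illustration under stated assumptions: the function names and the hand-assigned number labels on the candidates are hypothetical, and the pronoun lists are taken from Table 5.5.1.

```python
# Pronoun number features, following Table 5.5.1.
PRONOUN_NUMBER = {
    "she": "singular", "her": "singular", "he": "singular",
    "him": "singular", "his": "singular", "it": "singular",
    "we": "plural", "us": "plural", "they": "plural", "them": "plural",
    "you": "unspecified",
}


def number_agrees(pronoun, candidate_number):
    """Return True if the pronoun may refer to a candidate of this number."""
    number = PRONOUN_NUMBER[pronoun.lower()]
    return number == "unspecified" or number == candidate_number


def filter_by_number(pronoun, candidates):
    """Keep the names of (name, number) candidates that agree in number."""
    return [name for name, number in candidates
            if number_agrees(pronoun, number)]


# Candidates for the second sentence of (5.22) and (5.24); the number
# labels are assigned by hand for this illustration.
candidates = [("John", "singular"), ("three new Acuras", "plural")]
print(filter_by_number("They", candidates))
print(filter_by_number("It", candidates))
```

Number agreement alone still lets 'it' pair with John; removing that candidate is the job of the gender constraint, not of this filter.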

2. Person and Case Agreement

Three persons are recognised in English: first, second, and third.
Also, due to case agreement, various pronoun forms may be necessary when used in the subject position (nominative case, e.g., he, she, or they), object position (accusative case, e.g., him, her, or them), and genitive position (genitive case, e.g., his Acura, her Acura, their Acura).
Table 5.5.2 depicts a division of pronoun categories according to person and according to case agreement.

Table 5.5.2: Division of Pronoun Categories according to Person and Case Agreement

Case | First | Second | Third
Nominative | I, We | You | He, She, They
Accusative | Me, Us | You | Him, Her, Them
Genitive | My, Our | Your | His, Her, Their

Example
You and I have Acuras. 'We' love them. (5.25)
John and Mary have Acuras. 'They' love them. (5.26)
John and Mary have Acuras. 'We' love them. (Where, We = John and Mary) (5.27)
You and I have Acuras. 'They' love them. (Where, They = You and I) (5.28)

Here, the examples (5.25) and (5.26) follow the person and case agreement constraints, whereas (5.27) and (5.28) do not follow these constraints.
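Table 5.5.2 can be treated as a lookup structure. The sketch below is a minimal illustration, assuming the antecedent's person has already been determined by some earlier step; the names `PRONOUN_TABLE`, `pronoun_features`, and `person_agrees` are hypothetical.

```python
# Table 5.5.2 as a nested dictionary: case -> person -> pronoun forms.
PRONOUN_TABLE = {
    "nominative": {"first": ["i", "we"], "second": ["you"],
                   "third": ["he", "she", "they"]},
    "accusative": {"first": ["me", "us"], "second": ["you"],
                   "third": ["him", "her", "them"]},
    "genitive": {"first": ["my", "our"], "second": ["your"],
                 "third": ["his", "her", "their"]},
}


def pronoun_features(word):
    """Return every (person, case) cell of the table containing the word."""
    word = word.lower()
    return [(person, case)
            for case, row in PRONOUN_TABLE.items()
            for person, forms in row.items()
            if word in forms]


def person_agrees(pronoun, antecedent_person):
    """True if some reading of the pronoun has the antecedent's person."""
    return any(person == antecedent_person
               for person, _ in pronoun_features(pronoun))


# (5.25): 'We' with the first-person antecedent 'You and I' is accepted.
print(person_agrees("We", "first"))
# (5.27): 'We' with the third-person antecedent 'John and Mary' is rejected.
print(person_agrees("We", "third"))
```

Forms such as 'her' and 'you' sit in more than one cell of the table, which is why `pronoun_features` returns a list rather than a single pair.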

3. Gender Agreement

Referents must also agree with the gender specified by the referring expression. English third person pronouns distinguish between masculine, feminine, and nonpersonal genders, and, unlike in many other languages, the first two only apply to animate entities. Table 5.5.3 illustrates the pronouns under these gender categories.

Table 5.5.3: Pronouns under Gender Categories

Masculine | Feminine | Nonpersonal
He, Him, His | She, Her | It
Example

John has an Acura. 'He' is attractive. (Where, he = John, not the Acura) (5.29)
John has an Acura. 'It' is attractive. (Where, it = the Acura, not John) (5.30)
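Examples (5.29) and (5.30) can be reproduced with a lookup built from Table 5.5.3. This is a sketch under assumptions: the gender labels attached to 'John' and 'Acura' are supplied by hand here, whereas a real system would need a lexicon or a classifier for them, and the function name is hypothetical.

```python
# Gender categories of third person pronouns, following Table 5.5.3.
PRONOUN_GENDER = {
    "he": "masculine", "him": "masculine", "his": "masculine",
    "she": "feminine", "her": "feminine",
    "it": "nonpersonal",
}


def filter_by_gender(pronoun, candidates):
    """Keep the names of (name, gender) candidates matching the pronoun."""
    wanted = PRONOUN_GENDER[pronoun.lower()]
    return [name for name, gender in candidates if gender == wanted]


# Hand-labelled candidates for (5.29) and (5.30).
candidates = [("John", "masculine"), ("Acura", "nonpersonal")]
print(filter_by_gender("He", candidates))
print(filter_by_gender("It", candidates))
```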
4. Syntactic Constraints

When a referring expression and a potential antecedent noun phrase occur in the same sentence, the syntactic relations between them may limit reference relations.

Example
The pronouns in all of the following sentences are subject to the constraints indicated in brackets.

John bought 'himself' a new Acura. (himself = John) (5.31)
John bought 'him' a new Acura. (him ≠ John) (5.32)
John said that Bill bought 'him' a new Acura. (him ≠ Bill) (5.33)
John said that Bill bought 'himself' a new Acura. (himself = Bill) (5.34)
'He' said that 'he' bought John a new Acura. (He ≠ John and he ≠ John) (5.35)

Reflexive pronouns in English include 'himself', 'herself', and 'themselves'. To greatly simplify the issue, a reflexive corefers with the subject of the most immediate clause that contains it (5.31), but a nonreflexive cannot corefer with this subject (5.32).
Examples (5.33) and (5.34), in which the reverse reference pattern is evident between the pronoun and the subject of the higher clause, demonstrate that this rule only applies to the subject of the most immediate clause. A full noun phrase like 'John', on the other hand, cannot corefer with either the subject of the most immediate clause or with a higher-level subject (5.35).

These syntactic restrictions don't just apply to a referring expression and a certain potential antecedent noun phrase; they also forbid coreference between the two regardless of whether there are any other antecedents that can be used to signify the same item.
For example, a nonreflexive pronoun like 'him' would generally be able to corefer with the subject of the preceding sentence, as it does in example (5.36), but this is not possible in example (5.37) due to the presence of the coreferential pronoun 'he' in the second clause.

John wanted a new car. Bill bought 'him' a new Acura. (him = John) (5.36)
John wanted a new car. 'He' bought 'him' a new Acura. (he = John, him ≠ John) (5.37)
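The simplified reflexive rule stated above can be written as a single predicate. This sketch assumes the subject of the pronoun's most immediate clause has already been found by a parser; the function name `binding_allows` is hypothetical, and the rule is the deliberately simplified one from the text, not a full treatment of binding.

```python
REFLEXIVES = {"himself", "herself", "themselves"}


def binding_allows(pronoun, candidate, local_subject):
    """Simplified rule: a reflexive must corefer with the subject of its
    most immediate clause, and a nonreflexive must not."""
    if pronoun.lower() in REFLEXIVES:
        return candidate == local_subject
    return candidate != local_subject


# (5.31) John bought himself a new Acura.
print(binding_allows("himself", "John", local_subject="John"))
# (5.32) John bought him a new Acura.
print(binding_allows("him", "John", local_subject="John"))
# (5.33) John said that Bill bought him a new Acura.
print(binding_allows("him", "Bill", local_subject="Bill"))
print(binding_allows("him", "John", local_subject="Bill"))
# (5.34) John said that Bill bought himself a new Acura.
print(binding_allows("himself", "Bill", local_subject="Bill"))
```

The predicate only inspects the local subject, so it says nothing about the restriction on full noun phrases shown in (5.35).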


The criteria discussed above simplify the problem too much in a number of ways, and they do not apply in many situations. In truth, the facts become extremely complex upon closer examination. In fact, it seems implausible that mere syntactic relations could adequately explain all of the evidence.
For example, even though they appear in the same syntactic arrangements, the reflexive 'himself' and the nonreflexive 'him' in sentences (5.38) and (5.39), respectively, can both relate to the subject John.

John set the pamphlets about Acuras next to himself. (himself = John) (5.38)
John set the pamphlets about Acuras next to him. (him = John) (5.39)

5. Selectional Restrictions

Referents may be ruled out by the selectional restrictions that a verb imposes on its arguments, as in the following example (5.40).

John parked his Acura in the garage. He had driven it around for hours. (5.40)

Here, the term 'it' might refer to either the Acura or the garage. The direct object of the verb 'drive', however, must be a vehicle that can be driven, such as a car, truck, or bus, and cannot be a garage. As a result, 'the Acura' is the only potential referent, because the pronoun appears as the object of 'drive'. It is feasible that a real-world NLP system might have a very extensive set of selectional constraints for the verbs in its lexicon.
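A toy version of such a set of selectional constraints can be sketched as two small dictionaries. Everything named here is a hypothetical stand-in: `SELECTIONAL` plays the role of the verb lexicon, and `SEMANTIC_TYPE` stands in for whatever ontology a real system would use.

```python
# Hypothetical lexicon: the semantic types a verb accepts as direct object.
SELECTIONAL = {"drive": {"vehicle"}}

# Hypothetical type assignments for the candidate referents in (5.40).
SEMANTIC_TYPE = {"Acura": "vehicle", "garage": "building"}


def filter_by_selection(verb, candidates):
    """Drop candidates whose type the verb does not accept as its object.

    Verbs missing from the lexicon impose no restriction.
    """
    allowed = SELECTIONAL.get(verb)
    if allowed is None:
        return list(candidates)
    return [c for c in candidates if SEMANTIC_TYPE.get(c) in allowed]


# 'it' in "He had driven it around for hours."
print(filter_by_selection("drive", ["Acura", "garage"]))
```

A hard filter of this kind would wrongly reject figurative uses of a verb, so a practical system would more likely treat such restrictions as preferences.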

However, in the case of metaphors, selectional restrictions can be broken. For example, take a look at (5.41).

John bought a new Acura. It drinks gasoline like you would not believe. (5.41)

Although the verb 'drink' doesn't usually take an inanimate subject, in this context its subject 'it' can refer to a brand-new Acura.
Also, more general semantic constraints might be in force, but they are far more challenging to represent thoroughly. Take for example passages (5.42) and (5.43).

John parked his Acura in the garage. It is incredibly messy, with old bike and car parts lying around everywhere. (5.42)
John parked his Acura in downtown Beverly Hills. It is incredibly messy, with old bike and car parts lying around everywhere. (5.43)

In (5.42), a car is probably too small to have bike and car parts lying around "everywhere"; therefore the garage is almost surely the intended referent of 'it' in this case.
In order to resolve this reference, a system must be aware of the normal sizes of automobiles and garages, and the typical kinds of stuff one may find in each. On the other hand, one's familiarity with Beverly Hills might lead one to believe that the Acura is the referent of 'it' in passage (5.43).

5.6 ANAPHORA RESOLUTION

In this section, two algorithms for anaphora resolution will be presented.
5.6.1 Hobbs Algorithm

One method for pronoun resolution is the Hobbs algorithm. The syntactic parse trees of the sentences form the foundation of the algorithm.
Consider the Jack and Jill example (5.44) to better grasp the concept and how we as humans attempt to resolve the pronoun 'his'.

Jack and Jill went up the hill, To fetch a pail of water. Jack fell down and broke 'his' crown, And Jill came tumbling after. (5.44)

Here, Jack, Jill, hill, water, and crown are the potential resolution candidates for the pronoun 'his'.

Why, then, did we not consider the crown as a potential referent? Perhaps because the word 'his' was followed by the noun 'crown'. The Hobbs algorithm's initial presumption states that the referent search must always be confined to the left of the target, eliminating the crown.

If so, are Jill, water, or hill potential referents?

But since Jill is a girl, we are aware that 'his' cannot be referring to her. The pronouns 'he', 'his', 'she', 'her', etc. are typically used to refer to male or female animate objects, while 'it' is used to refer to inanimate objects. This characteristic, referred to as gender agreement, rules out the possibilities of Jill, hill, and water.
Only entities within a few sentences can be referenced with pronouns, and entities that are closest to the referring word are more significant than those that are farther away. This ultimately leaves us with Jack as the only option. The recency property is the name given to this property.
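The three human heuristics just described (search only to the left of the pronoun, require gender agreement, prefer the most recent mention) can be sketched as follows. This is an informal illustration and not the Hobbs algorithm itself; the mention positions and gender labels are assigned by hand, and the function name `resolve` is hypothetical.

```python
# (word, position, gender) for the mentions in passage (5.44).
# 'his' is taken to sit at position 5, just after the second 'Jack'.
MENTIONS = [
    ("Jack", 0, "masculine"),
    ("Jill", 1, "feminine"),
    ("hill", 2, "nonpersonal"),
    ("water", 3, "nonpersonal"),
    ("Jack", 4, "masculine"),
    ("crown", 6, "nonpersonal"),
]


def resolve(pronoun_position, pronoun_gender, mentions):
    """Pick the most recent gender-compatible mention left of the pronoun."""
    # 1. Only mentions to the left of the pronoun are considered.
    left = [m for m in mentions if m[1] < pronoun_position]
    # 2. Gender agreement removes incompatible candidates.
    agreeing = [m for m in left if m[2] == pronoun_gender]
    if not agreeing:
        return None
    # 3. Recency: the closest remaining mention wins.
    return max(agreeing, key=lambda m: m[1])[0]


print(resolve(5, "masculine", MENTIONS))
```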
Now that we have a better grasp of how humans digest text and resolve pronouns, let's explore how we can embed this intelligence (using the Hobbs algorithm) in robots that lack common sense, so that they can accomplish the task of pronoun resolution.

Algorithm

It is important to note that, when resolving pronouns, the algorithm makes use of the syntactic constraints mentioned in the previous sections. The pronoun to be resolved and the syntactic parses of the sentences up to and including the current sentence serve as the input to the Hobbs algorithm.
Fig 5.6.1 details the Hobbs algorithm.

Fig. 5.6.1: Hobbs Algorithm (flowchart). The boxes of the flowchart read as follows:

1. Construct a parse tree and identify the pronoun for co-reference resolution.
2. Move up the tree to the first 'NP' or 'S'; call it 'X' and call the path to it 'p'.
3. Search left-to-right below 'X' and to the left of 'p', and propose any 'NP' node which has an 'NP' or 'S' between it and 'X'.
4. If 'X' is the highest 'S' node in the sentence, search the previous trees in order of recency, left-to-right, breadth-first, proposing the NPs encountered.
5. Otherwise, go up to the first 'NP' or 'S' node encountered, call this 'X', and call the path to it 'p'.
6. If 'X' is an 'NP' and 'p' does not pass through an 'N-bar' that 'X' immediately dominates, propose 'X'.
7. Search below 'X' to the left of 'p', left-to-right, breadth-first, proposing any 'NP' encountered.
8. If 'X' is an 'S', search below 'X' to the right of 'p', left-to-right, breadth-first, but not going through any 'NP' or 'S', proposing any 'NP' encountered.
9. Satisfied with the answer? If yes, return the answer; if no, repeat from step 4.
Example
Let's look at two sentences:

(5.45)
Sentence 1 (S1): Jack is an engineer.
Sentence 2 (S2): Jill likes him.


[Figure: parse trees for S1 and S2 under a common root S. S1: NP (Jack), VP (is, NP (Det an, engineer)). S2: NP (Jill), VP (likes, NP (PRP him)).]

Fig. 5.6.2: Parse Tree for the given sentences

The problem identified is to resolve the pronoun 'him'.

Based on the algorithm:

1. Go back from 'him' to 'PRP' to 'NP'; this 'NP' will be called X, and the path from 'him' to 'PRP' to 'NP' will be called p.

2. As there is nothing to the left of p below this X, we again go back from 'NP' to 'VP' to 'S2', which is our new X.

3. Below S2 and to the left of p lies the branch containing 'Jill'. Due to the syntactic restrictions imposed by binding theory, however, the algorithm does not explore that branch. According to binding theory, a nonreflexive cannot refer to the subject of the most immediate clause in which it appears. However, a reflexive can. Reflexive words include 'himself', 'herself', 'themselves', etc. So, we again go backwards to 'S', which becomes our new X.

4. From 'S' we move to 'S1', which is our new X. Now we have encountered an NP ('Jack') when searching the path. This NP does not violate any constraints; hence, this is our solution or answer. That is, the 'him' in the given sentences refers to 'Jack'.

Fig. 5.6.3: Resolution Path for the pronoun 'him'.

Fig 5.6.3 shows the pronoun resolution path.
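The last part of this walkthrough, where the search moves to the previous sentence and proposes the first acceptable NP, can be sketched in code. This is a much-reduced illustration and not the full Hobbs algorithm: it only covers the breadth-first, left-to-right search of earlier parse trees, and it assumes the in-sentence candidate 'Jill' has already been rejected. The tuple encoding of the trees and the function names are hypothetical.

```python
from collections import deque

# Parse trees as nested tuples: (label, child, child, ...); leaves are words.
S1 = ("S", ("NP", "Jack"), ("VP", "is", ("NP", ("Det", "an"), "engineer")))
S2 = ("S", ("NP", "Jill"), ("VP", "likes", ("NP", ("PRP", "him"))))


def words(tree):
    """Return the words under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]


def noun_phrases_bfs(tree):
    """Yield NP nodes in breadth-first, left-to-right order."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):
            continue
        if node[0] == "NP":
            yield node
        queue.extend(node[1:])


def resolve_in_previous(previous_trees, agrees=lambda phrase: True):
    """Search earlier sentences, most recent first, and return the first
    NP phrase accepted by the `agrees` constraint check."""
    for tree in reversed(previous_trees):
        for np in noun_phrases_bfs(tree):
            phrase = " ".join(words(np))
            if agrees(phrase):
                return phrase
    return None


# Resolving 'him' in S2 after the search has left S2 itself.
print(resolve_in_previous([S1]))
```

The `agrees` argument is where the number, gender, and other checks from Section 5.5 would plug in; the default accepts every NP, which is enough for this two-sentence example.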

5.6.2 Centering Algorithm

An explicit discourse model representation is not used by the Hobbs algorithm. Centering theory, in contrast, is a family of models that explicitly represents a discourse model and also makes the claim that at any given point in the discourse, only one entity is being "focused", and that this entity should be distinguished from all other entities that have been evoked.

There are two main representations tracked in the Centering theory discourse model. In what follows, take $U_n$ and $U_{n+1}$ to be two adjacent utterances. The backward-looking center of $U_n$, denoted as $C_b(U_n)$, represents the entity currently being focused on in the discourse after $U_n$ is interpreted. The forward-looking centers of $U_n$, denoted as $C_f(U_n)$, form an ordered list containing the entities mentioned in $U_n$, all of which could serve as the $C_b$ of the following utterance. In fact, $C_b(U_{n+1})$ is by definition the most highly ranked element of $C_f(U_n)$ mentioned in $U_{n+1}$. (The $C_b$ of the first utterance in a discourse is undefined.) As for how the entities in $C_f(U_n)$ are ordered, for simplicity's sake we can use the grammatical role hierarchy below.

Subject > existential predicate nominal > object > indirect object or oblique > demarcated adverbial PP

We call the highest-ranked forward-looking center $C_p$ (preferred center).
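Ordering the forward-looking centers by grammatical role amounts to a sort over a fixed ranking. The sketch below assumes each mention already carries a role label from a parser; the labels used for the sample utterance are assigned by hand, and the function names are hypothetical.

```python
# Grammatical role hierarchy from the text, highest rank first.
ROLE_RANK = [
    "subject",
    "existential predicate nominal",
    "object",
    "indirect object or oblique",
    "demarcated adverbial PP",
]


def forward_centers(mentions):
    """Order (entity, role) mentions by role rank to obtain the Cf list."""
    ranked = sorted(mentions, key=lambda m: ROLE_RANK.index(m[1]))
    return [entity for entity, _ in ranked]


def preferred_center(mentions):
    """Return Cp, the highest-ranked forward-looking center, if any."""
    cf = forward_centers(mentions)
    return cf[0] if cf else None


# "John saw a beautiful 1961 Ford Falcon at the used car dealership."
# Role labels are hand-assigned; treating the dealership as an oblique
# is an assumption made for this illustration.
u1 = [
    ("dealership", "indirect object or oblique"),
    ("Ford", "object"),
    ("John", "subject"),
]
print(forward_centers(u1))
print(preferred_center(u1))
```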

The preferred referents of pronouns are determined by the algorithm based on the relationships between the forward-looking and backward-looking centers in neighbouring utterances. According to the link between $C_b(U_{n+1})$, $C_b(U_n)$, and $C_p(U_{n+1})$, four intersentential relationships between a pair of utterances $U_n$ and $U_{n+1}$ are specified; these are depicted in Fig 5.6.4.

| $C_b(U_{n+1}) = C_b(U_n)$ or undefined $C_b(U_n)$ | $C_b(U_{n+1}) \neq C_b(U_n)$
$C_b(U_{n+1}) = C_p(U_{n+1})$ | Continue | Smooth-Shift
$C_b(U_{n+1}) \neq C_p(U_{n+1})$ | Retain | Rough-Shift

Fig. 5.6.4 : Transitions
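The four transitions in Fig. 5.6.4 reduce to two comparisons. In the sketch below, the function name `transition` is hypothetical and an undefined backward-looking center is represented by `None`.

```python
def transition(cb_next, cp_next, cb_prev):
    """Classify the transition between two adjacent utterances.

    cb_next: Cb(U_n+1); cp_next: Cp(U_n+1);
    cb_prev: Cb(U_n), or None when it is undefined.
    """
    same_focus = cb_prev is None or cb_next == cb_prev
    if cb_next == cp_next:
        return "Continue" if same_focus else "Smooth-Shift"
    return "Retain" if same_focus else "Rough-Shift"


print(transition("John", "John", None))
print(transition("John", "Bob", "John"))
print(transition("Bob", "Bob", "John"))
print(transition("Bob", "John", "John"))
```

The four calls exercise the four cells of the table: Continue, Retain, Smooth-Shift, and Rough-Shift, in that order.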

The following rules are to be followed:

Rule 1: If any element of $C_f(U_n)$ is realized by a pronoun in utterance $U_{n+1}$, then $C_b(U_{n+1})$ must be realized as a pronoun also.



Rule 2: Transition states are ordered. Continue is preferred to Retain, which is preferred
to Smooth-Shift, which is preferred to Rough-Shift.
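The transition table and Rule 2 translate directly into a small classifier. The sketch below is illustrative only: the function names are hypothetical, entities are represented as plain strings, and `None` stands in for an undefined backward-looking center.

```python
# Rule 2 preference order, most preferred first.
TRANSITION_ORDER = ("Continue", "Retain", "Smooth-Shift", "Rough-Shift")


def classify_transition(cb_next, cp_next, cb_prev):
    """Classify the transition between two adjacent utterances.

    cb_next: Cb(U_n+1)
    cp_next: Cp(U_n+1)
    cb_prev: Cb(U_n), or None when it is undefined
    """
    # Left column of the table: same Cb as before, or no previous Cb at all.
    same_cb = cb_prev is None or cb_next == cb_prev
    if cb_next == cp_next:
        return "Continue" if same_cb else "Smooth-Shift"
    return "Retain" if same_cb else "Rough-Shift"


def preference(transition):
    """Return the rank of a transition under Rule 2 (0 is most preferred)."""
    return TRANSITION_ORDER.index(transition)
```

For example, `classify_transition("John", "John", None)` falls in the Continue cell because the previous backward-looking center is undefined, and `preference` lets candidate readings be compared numerically.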

The algorithm is defined as follows.

1. Create potential $C_b$-$C_f$ pairings for every conceivable arrangement of reference
   assignments.

2. Use constraints, such as selectional restrictions, centering rules, and syntactic
   coreference requirements, to narrow the results.

3. Rank the results by transition ordering.


If Rule 1 and other coreference requirements (gender, number, syntactic, and
selectional restrictions) are not broken, the pronominal referents that are assigned are
those that produce the most preferred relation in Rule 2. Let us step through
passage (5.46) to illustrate the algorithm.

(5.46) John saw a beautiful 1961 Ford Falcon at the used car dealership. ($U_1$)

       He showed it to Bob. ($U_2$)

       He bought it. ($U_3$)


Using the grammatical role hierarchy to order the $C_f$, for sentence $U_1$ we get:

$C_f(U_1)$: {John, Ford, dealership}
$C_p(U_1)$: John
$C_b(U_1)$: undefined
Sentence $U_2$ contains the pronouns "he", which is compatible with John, and "it", which
is compatible with the Ford or the dealership. John is by definition $C_b(U_2)$, because
he is the highest-ranking member of $C_f(U_1)$ mentioned in $U_2$ (since he is the only
possible referent for "he"). For each potential referent of "it", the ensuing transitions
are compared. The assignments would be as follows if we presume "it" refers to the Falcon:

$C_f(U_2)$: {John, Ford, Bob}
$C_p(U_2)$: John
$C_b(U_2)$: John

Result: Continue ($C_p(U_2) = C_b(U_2)$; $C_b(U_1)$ undefined)
If we assume it refers to the dealership, the assignments would be:

$C_f(U_2)$: {John, dealership, Bob}
$C_p(U_2)$: John
$C_b(U_2)$: John
Result: Continue ($C_p(U_2) = C_b(U_2)$; $C_b(U_1)$ undefined)
Since both options lead to a Continue transition, the algorithm is unable to determine
which should be chosen. We will assume for the purposes of example that ties are
broken in terms of the ranking on the prior $C_f$ list. As a result, we will assume that
"it" refers to the Falcon rather than the dealership, so the first set of assignments
above represents the current discourse model.


In sentence $U_3$, "he" is compatible with either John or Bob, whereas "it" is compatible
with the Ford. If we assume "he" refers to John, then John is $C_b(U_3)$, and the
assignments would be:

$C_f(U_3)$: {John, Ford}
$C_p(U_3)$: John
$C_b(U_3)$: John
Result: Continue ($C_p(U_3) = C_b(U_3) = C_b(U_2)$)

If we assume "he" refers to Bob, then Bob is $C_b(U_3)$ and the assignments would be:

$C_f(U_3)$: {Bob, Ford}
$C_p(U_3)$: Bob
$C_b(U_3)$: Bob
Result: Smooth-Shift ($C_p(U_3) = C_b(U_3)$; $C_b(U_3) \neq C_b(U_2)$)

Since a Continue is preferred to a Smooth-Shift per Rule 2, John is correctly taken to
be the referent.
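The ranking step of the walkthrough, including the tie-break on the prior $C_f$ list, can be sketched as follows. This is an illustration under stated assumptions rather than a complete centering implementation: the function names and the `(referent, cb_next, cp_next)` tuple layout are hypothetical, the $C_b$ and $C_p$ values for each reading are copied from the worked example rather than computed, and constraint filtering (step 2) is assumed to have happened already.

```python
# Rule 2 preference order, most preferred first.
TRANSITION_ORDER = ("Continue", "Retain", "Smooth-Shift", "Rough-Shift")


def classify_transition(cb_next, cp_next, cb_prev):
    """Classify a transition; cb_prev is None when Cb(U_n) is undefined."""
    same_cb = cb_prev is None or cb_next == cb_prev
    if cb_next == cp_next:
        return "Continue" if same_cb else "Smooth-Shift"
    return "Retain" if same_cb else "Rough-Shift"


def pick_referent(candidates, prev_cf, prev_cb):
    """Pick the referent whose reading gives the most preferred transition.

    candidates: list of (referent, cb_next, cp_next) tuples, one per reading.
    prev_cf:    Cf of the previous utterance, used only to break ties.
    prev_cb:    Cb of the previous utterance, or None if undefined.
    Every candidate referent is assumed to appear in prev_cf.
    """
    def sort_key(candidate):
        referent, cb_next, cp_next = candidate
        transition = classify_transition(cb_next, cp_next, prev_cb)
        # Primary key: Rule 2 rank. Secondary key: rank on the prior Cf list.
        return (TRANSITION_ORDER.index(transition), prev_cf.index(referent))

    return min(candidates, key=sort_key)[0]


# U2: "it" may be the Ford or the dealership; both readings give a Continue.
cf_u1 = ["John", "Ford", "dealership"]
it_u2 = pick_referent(
    [("dealership", "John", "John"), ("Ford", "John", "John")],
    prev_cf=cf_u1,
    prev_cb=None,
)

# U3: "he" may be John or Bob; values as listed in the worked example.
cf_u2 = ["John", "Ford", "Bob"]
he_u3 = pick_referent(
    [("Bob", "Bob", "Bob"), ("John", "John", "John")],
    prev_cf=cf_u2,
    prev_cb="John",
)
```

With these inputs the tie in $U_2$ is broken in favour of the Ford, which outranks the dealership on $C_f(U_1)$, and "he" in $U_3$ resolves to John because Continue outranks Smooth-Shift.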

The key salience factors that the centering algorithm implicitly takes into account are
the preferences for grammatical role, recency, and repeated mention. However, the
grammatical role hierarchy influences salience only indirectly, since it is the resulting
transition type that determines the final reference assignments. In particular, a referent
in a low-ranking grammatical role will be preferred to one in a more highly ranked role
if the former results in a more highly ranked transition.
Consequently, the centering algorithm can mistakenly resolve a pronoun to a
referent with low salience. Consider passage (5.47):

(5.47) Bob opened up a new dealership last week. John took a look at the Fords in his lot.
       He ended up buying one.

The centering algorithm will assign the third sentence's subject pronoun, "he," to Bob:
because Bob is $C_b(U_2)$, this assignment results in a Continue relation, whereas
assigning John results in a Smooth-Shift relation.
The Hobbs algorithm, on the other hand, will correctly designate John as the referent.
Like the Hobbs algorithm, the centering algorithm needs both a complete syntactic parse
and morphological gender detectors.

As a model of entity coherence, centering theory also has implications for other
discourse applications, such as summarization.

Chapter Ends..
