
Institute of Mathematical Statistics

LECTURE NOTES — MONOGRAPH SERIES

ESTIMATING FUNCTIONS: A SYNTHESIS OF LEAST
SQUARES AND MAXIMUM LIKELIHOOD METHODS

V.P. Godambe
University of Waterloo
ABSTRACT
The development of the modern theory of estimating functions is traced
from its inception. It is shown that this development has brought about
a synthesis of the two historically important methodologies of estimation,
namely the 'least squares' and the 'maximum likelihood'.
Key Words: Estimating functions; likelihood; score function.

1 Introduction
In common with most of the historical investigations, it is difficult to trace
the origin of the subject of this conference: 'Estimating Functions'. How-
ever, in the last two centuries clearly there are three important precursors of
the modern theory of estimating functions (EF): In the year 1805, Legendre
introduced the least squares (LS) method. At the turn of the last century
Pearson proposed the method of moments and in 1925 Fisher put forward
the maximum likelihood (ML) equations. Of these three, the method of
moments faded out in time for lack of any sound theoretical justification.
However, the other two methods, namely the LS and the ML, even at present
play an important role in statistical methodology. These two methods will
also concern us in the following. The LS method was justified
by what today is called the Gauss-Markoff (GM) theorem: The estimates
obtained from LS equations are 'optimal' in the sense that they have min-
imum variance in the class of linear unbiased estimates. This was a finite
sample justification. At about the same time Laplace provided a different
'asymptotic justification' for the method. Fisher justified the ML estimation,
for it produced estimates which are asymptotically unbiased with smallest
variance. This left open the question, is there a finite sample justification
for the ML estimation corresponding to the GM theorem justification for the
LS estimation?
The modern EF theory provided such a justification. According to the
'optimality criterion' of the EF theory, the score function (SF) is 'optimal'.

2 SF Optimality
To state the just mentioned result formally we briefly introduce some notation.
Let X = {x} be the sample (observation) space and let a class of possible
distributions (densities) on X be given by {f(·|θ), θ ∈ Ω}, Ω being the
parameter space, which we assume here to be the real line. If the function f
is completely specified up to the (unknown) parameter θ, f(·|θ) is called a
parametric model. For this model the score function is SF = ∂ log f(·|θ)/∂θ.
Any real function of x and θ, say g(x, θ), is called an estimating function
(EF). It is said to be unbiased if its mean value is zero for every θ ∈ Ω:
E(g) = 0. Further, for reasons which will become clear later, corresponding
to every EF g we define its standardized version g/E(∂g/∂θ). Now, in a class
G = {g} of unbiased estimating functions, g* is said to be 'optimal' if the
variance of the standardized EF is minimized at g = g*:

    E(g*²)/{E(∂g*/∂θ)}² ≤ E(g²)/{E(∂g/∂θ)}²,   θ ∈ Ω, g ∈ G.   (2.1)
SF Theorem (Godambe, 1960). For a parametric model f(·|θ), granting
some regularity conditions, in the class of all unbiased EFs the optimal
estimating function is given by the SF, i.e.

    g* = ∂ log f(·|θ)/∂θ.

The optimality of the SF given by the above Theorem should be distinguished
from the optimality of the LS estimates based on the GM theorem. The SF
optimality (though, with some additional assumptions, it implies asymptotic
optimality of the ML estimate) is essentially optimality of the 'estimating
function', while the LS optimality is optimality of the 'estimate'. The
concept underlying the optimality criterion of the EF theory became more
vivid and compelling in relation to the problem of nuisance parameters.
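
As a small numerical illustration of criterion (2.1), the following Python sketch (my own example, assuming an exponential model f(x|θ) = θ exp(−θx), which is not discussed in the text) compares the standardized variance E(g²)/{E(∂g/∂θ)}² of the score function with that of a moment-based unbiased EF; the score attains the smaller value, as the SF Theorem predicts.

```python
# A minimal Monte Carlo sketch of criterion (2.1), assuming an exponential
# model f(x|theta) = theta*exp(-theta*x) (an illustrative choice, not taken
# from the paper).  The score EF sum(1/theta - x_i) is compared with the
# moment-based unbiased EF sum(x_i**2 - 2/theta**2).
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 2.0, 50, 20000
x = rng.exponential(scale=1 / theta, size=(reps, n))

g_sf = np.sum(1 / theta - x, axis=1)          # score function
g_m = np.sum(x**2 - 2 / theta**2, axis=1)     # moment-based unbiased EF

# E(dg/dtheta) is known analytically for each EF under this model.
dg_sf = -n / theta**2
dg_m = 4 * n / theta**3

print("standardized variance, score EF :", np.mean(g_sf**2) / dg_sf**2)  # ~ theta^2/n
print("standardized variance, moment EF:", np.mean(g_m**2) / dg_m**2)    # larger
```
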

3 Conditional SF Optimality
Now let the parameter θ consist of two components θ₁ and θ₂, θ = (θ₁, θ₂),
and let the parametric model be f(·|θ₁, θ₂), where θ₁ is real and θ₂ is a
vector; θ ∈ Ω, θ₁ ∈ Ω₁, θ₂ ∈ Ω₂ and Ω = Ω₁ × Ω₂. Further suppose we want
to estimate only θ₁ (the parameter of interest), ignoring θ₂ (the nuisance
parameter). How should we proceed? To this question ML estimation provides
no satisfactory answer. If θ̂₁ and θ̂₂ are the joint ML estimates of θ₁ and θ₂
then, as is well known, the estimate θ̂₁ can be inconsistent (unacceptable)
when the dimensionality of the parameter θ₂ goes on increasing with the
number of observations (cf. Neyman and Scott, 1948). The EF theory, for the
present situation, implies restricting to that part of the likelihood function
which is governed by the parameter of interest θ₁ only. Formally, for the
parametric model f(·|θ₁, θ₂), let G₁ be the class of all unbiased EFs g(x, θ₁),
that is, functions of x and θ₁ only:

    G₁ = {g : g = g(x, θ₁), E(g) = 0, θ ∈ Ω}.

Further let t be a complete sufficient statistic for the parameter θ₂, for every
fixed θ₁. Assuming the statistic t is independent of the parameter θ₁, we have
Conditional SF Theorem (Godambe, 1976). Granting some regularity
conditions, in the class of EFs G₁ the 'optimal' EF g* is given by the
conditional SF, i.e. g* = ∂ log f(·|t; θ₁)/∂θ₁.
Note that in the above theorem the definition of optimality is obtained from
(2.1) just by replacing in it G by G₁ and consequently E(∂g/∂θ) by E(∂g/∂θ₁).
That is, the criterion of optimality is unconditional. In the case of the
Neyman-Scott example, unlike the ML estimate θ̂₁, the equation 'conditional
SF = 0' provides a consistent estimate of θ₁. Further, the EF optimality
criterion suggests a definition of 'conditional SF' in case the statistic t
depends on the parameter θ₁. If t(θ₁₀) is the value of t at θ₁ = θ₁₀, then we
define the conditional SF by g*, where

    g* = {∂ log f(·|t(θ₁₀); θ₁, θ₂)/∂θ₁}, evaluated at θ₁₀ = θ₁.   (3.1)

This definition is motivated as follows. The EF g* in (3.1) belongs to G₁,
though it depends on θ₂. It further is 'optimal' in G₁, though only locally
at θ₂ (Lindsay, 1982). Unlike the previous situation, when the sufficient
statistic t was independent of θ₁, now no universally optimal g* (i.e. for
all θ₂ ∈ Ω₂) exists in G₁. Further, though the EF g* in (3.1) depends on θ₂,
it is orthogonal to the marginal SF of the sufficient statistic t; hence the
substitution of an estimate θ̂₂, derived from the latter, into the former
would still leave the former nearly optimal for large samples (Lindsay, 1982;
Godambe, 1991; Small and McLeish, 1994; Liang and Zeger, 1995). The equation
g*(x, θ₁, θ̂₂) = 0 would then provide a (nearly optimal) consistent estimate
of θ₁.
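
To make the Neyman-Scott phenomenon concrete, the following Python sketch (my own illustration; the specific model and code are assumptions, not taken from the paper) simulates pairs x_{i1}, x_{i2} ~ N(μᵢ, σ²) with one nuisance mean μᵢ per pair: the joint ML estimate of σ² converges to σ²/2, whereas the root of the conditional SF, obtained by conditioning on the complete sufficient statistic (the pair mean), is consistent.

```python
# A small simulation of the Neyman-Scott problem (an illustrative sketch,
# not code from the paper): pairs x_{i1}, x_{i2} ~ N(mu_i, sigma2) with one
# nuisance mean mu_i per pair.  The joint ML estimate of sigma2 is
# inconsistent, while solving 'conditional SF = 0' (conditioning on the
# complete sufficient statistic t_i = pair mean) gives a consistent estimate.
import numpy as np

rng = np.random.default_rng(1)
n, sigma2 = 5000, 4.0
mu = rng.normal(0.0, 10.0, size=n)               # nuisance parameters, one per pair
x = rng.normal(mu[:, None], np.sqrt(sigma2), size=(n, 2))

d = x[:, 0] - x[:, 1]                            # d_i ~ N(0, 2*sigma2), free of mu_i
sigma2_ml = np.sum((x - x.mean(axis=1, keepdims=True))**2) / (2 * n)
sigma2_cond = np.sum(d**2) / (2 * n)             # root of the conditional score

print("true sigma2        :", sigma2)
print("joint ML estimate  :", sigma2_ml)         # ~ sigma2/2 (inconsistent)
print("conditional SF root:", sigma2_cond)       # ~ sigma2 (consistent)
```
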


Note that in the foregoing discussion conditioning is used just as a 'technique'
to obtain (unconditionally) 'optimum' EFs; it is not used as a principle
of inference. In fact, without invoking any conditioning at all, Godambe and
Thompson (1974) established, in the case of the normal distribution N(θ₁, θ₂),
the optimality of the EF (s² − θ₂) for the interest parameter θ₂, ignoring
the nuisance parameter θ₁. How this (unconditional) optimality leads to a
very 'flexible conditioning' will be discussed later.

For a general perspective on the topic of conditioning and optimality we
refer to Small and McLeish (1988), Lindsay and Waterman (1991) and Lindsay
and Li (1995).
Lloyd (1987) and Bhapkar (1991) have given results concerning optimal-
ity of 'marginal SF' under 'conditional completeness'.
From the above discussion it is clear that the EF theory has corrected a
major deficiency of the ML estimation in case of the nuisance parameters.
Some earlier references in respect of the nuisance parameters are Bartlett
(1936), Cox (1958), Barnard (1963), Kalbfleisch and Sprott (1970), Barndorff-
Nielsen (1973) and others. Some of these authors tried to obtain conditions
under which the marginal distribution of t does not contain any information
about θ₁, the parameter of interest. As we have seen, the optimality criterion
of the EF theory yields such a condition in terms of 'completeness of the
statistic t'. Though not universally applicable (as none can be, I suppose),
it by now has been commonly used for its mathematical manageability. It
also carries with it greater conviction for it is derived from an optimality
criterion which has proved to be fruitful very generally.
In the following we will show that the EF theory, just as it corrected
ML estimation, also corrects some major inadequacies of the LS estimation
and the GM theorem.

4 Quasi-Score Function
We now replace the abstract (observation) sample in the discussion by n real
variates xᵢ, i = 1, ..., n, which are assumed to be independently distributed
with means μᵢ(θ) and variances vᵢ(θ), μᵢ and vᵢ being some specified functions
of θ, i = 1, ..., n. For simplicity let θ be a scalar parameter. Initially
we consider the special case where the μᵢ are linear functions of θ and the
vᵢ are independent of θ. Here the LS equation is given by

    Σᵢ (xᵢ − μᵢ)(∂μᵢ/∂θ)/vᵢ = 0.

The solution of this equation, as said before, according to the GM theorem,
has smallest variance in the class of all linear unbiased estimates of θ;
hence it is 'optimal'. The estimating function Σᵢ (xᵢ − μᵢ)(∂μᵢ/∂θ)/vᵢ is
also 'optimal' according to criterion (2.1), in the class of all EFs of the
form

    g = Σᵢ₌₁ⁿ (xᵢ − μᵢ) aᵢ,   (4.1)

where the aᵢ can be arbitrary functions of θ. (Actually here we minimize
E(g²) subject to holding E(∂g/∂θ) = const.; this explains the standardization
of the EF mentioned earlier.) Note this EF optimality implies more than the
GM optimality, for the solutions corresponding to all the equations g = 0
include not only all linear unbiased estimates of θ but many more.

Now let the means μᵢ and variances vᵢ be arbitrarily specified functions of
θ. Here the LS equation is given by ḡ + B = 0, where

    ḡ = Σᵢ₌₁ⁿ (xᵢ − μᵢ)(∂μᵢ/∂θ)/vᵢ   (4.2)

and B is the additional term arising from the dependence of the vᵢ on θ.
Clearly in (4.2), E(ḡ) = 0 and E(B) = Σᵢ ∂ log vᵢ/∂θ. Note that for large n,
(ḡ/n) ≈ 0 while (B/n) could still be very large. Hence, because of the bias
term B, the LS equation ḡ + B = 0 would generally lead to an inconsistent
estimate. On the other hand, according to the optimality criterion (2.1) of
the EF theory, in the class of EFs given by (4.1) for different functions
aᵢ(θ), i = 1, ..., n, the EF ḡ given by (4.2) is 'optimal'. Generally the
equation ḡ = 0 would lead to a consistent solution. (Here the GM theorem
cannot be of any avail, for the solution of ḡ = 0 would generally be a biased
estimate.) For reasons to be explained soon, we shall call the estimating
function ḡ a quasi-score function (quasi-SF).
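
The following Python sketch (my own illustration, assuming Poisson-type counts with μᵢ = exp(θzᵢ) and vᵢ = μᵢ; neither the model nor the code comes from the paper) contrasts the root of the quasi-score equation ḡ = 0 with the minimizer of the naive weighted LS criterion whose weights depend on θ: the former is consistent, the latter is systematically biased.

```python
# A minimal sketch contrasting the quasi-score equation (4.2) with naive LS
# when both mu_i and v_i depend on theta.  The model is an illustrative
# assumption: counts x_i ~ Poisson(mu_i) with mu_i = exp(theta*z_i), v_i = mu_i.
import numpy as np
from scipy.optimize import brentq, minimize_scalar

rng = np.random.default_rng(2)
theta_true, n = 0.5, 2000
z = rng.uniform(0.5, 2.0, size=n)
x = rng.poisson(np.exp(theta_true * z))

def quasi_score(theta):
    mu = np.exp(theta * z)                      # means mu_i(theta)
    return np.sum((x - mu) * (z * mu) / mu)     # (x_i - mu_i)*(dmu_i/dtheta)/v_i

def ls_objective(theta):
    mu = np.exp(theta * z)
    return np.sum((x - mu) ** 2 / mu)           # weighted LS with theta-dependent weights

theta_qs = brentq(quasi_score, -2.0, 2.0)       # root of g-bar = 0
theta_ls = minimize_scalar(ls_objective, bounds=(-2.0, 2.0), method="bounded").x

print("true theta      :", theta_true)
print("quasi-score root:", theta_qs)            # close to theta_true
print("naive LS minimum:", theta_ls)            # systematically biased upward
```
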
Interestingly, the EF optimality of the quasi-SF ḡ was first established in
a wider setting of discrete stochastic processes with a martingale structure.
Quasi-SF Theorem (Godambe 1985). If μᵢ and vᵢ denote the means and
variances of xᵢ conditional on the past observations xᵢ₋₁, ..., x₀, i.e.
μᵢ = μᵢ(θ, x₀, ..., xᵢ₋₁) and vᵢ = vᵢ(θ, x₀, ..., xᵢ₋₁) for i = 1, ..., n, then

    ḡ = Σᵢ₌₁ⁿ (xᵢ − μᵢ)(∂μᵢ/∂θ)/vᵢ   (4.3)

is the optimal EF in the class of the EFs given by (4.1), where the aᵢ are
now functions of xᵢ₋₁, ..., x₀ in addition to θ.
Among the precursors to the EF ḡ in (4.3) above are the following:
Durbin (1960) gave a GM theorem analogue for linear time series models.
Klimko and Nelson (1978) obtained conditional LS equations. Kalbfleisch
and Lawless (1983) suggested a special case of ḡ for Markov models.
For further generalizations of the EF optimality results relating to ḡ in
(4.3), we refer to Godambe (1985), Godambe and Heyde (1987), and Godambe
and Thompson (1989).
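
As a concrete illustration of (4.3) in a stochastic-process setting, the following Python sketch assumes (my choice, not an example from the paper) an AR(1)-type model with conditional mean μᵢ = θxᵢ₋₁ and a past-dependent conditional variance; since ḡ is linear in θ here, the estimating equation ḡ = 0 has a closed-form root.

```python
# A minimal sketch of the quasi-SF (4.3) for a stochastic process, assuming
# (illustratively) an AR(1)-type model with conditional mean mu_i = theta*x_{i-1}
# and conditional variance v_i = 1 + 0.2*x_{i-1}**2.  The EF
# g-bar = sum (x_i - mu_i)*(dmu_i/dtheta)/v_i is linear in theta, so g-bar = 0
# can be solved in closed form.
import numpy as np

rng = np.random.default_rng(3)
theta_true, n = 0.6, 5000
x = np.zeros(n + 1)
for i in range(1, n + 1):
    v = 1.0 + 0.2 * x[i - 1] ** 2                # conditional variance given the past
    x[i] = theta_true * x[i - 1] + rng.normal(0.0, np.sqrt(v))

x_prev, x_curr = x[:-1], x[1:]
w = x_prev / (1.0 + 0.2 * x_prev ** 2)           # (dmu_i/dtheta)/v_i
theta_hat = np.sum(w * x_curr) / np.sum(w * x_prev)   # root of g-bar(theta) = 0

print("true theta:", theta_true, " quasi-SF estimate:", theta_hat)
```
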
Returning, for simplicity, to the case where the variates xᵢ, i = 1, ..., n,
are independently distributed, we summarize important properties of the EF
ḡ given by (4.2). The 'optimality' of ḡ is for the semi-parametric model
defined by the means μᵢ(θ) and variances vᵢ(θ). As a special case, when μᵢ
is linear in θ and vᵢ is independent of θ, the EF optimality of ḡ implies
the GM optimality of the LS estimates. In this special case, if the underlying
distribution is normal, the LS estimates coincide with the ML estimates.
Generally, for the exponential family distributions the SF coincides with
the optimal EF ḡ given by (4.2). Even outside the exponential family of
distributions, ḡ satisfies a very general property of the SF, E(SF²) =
−E(∂SF/∂θ); similarly we have E(ḡ²) = −E(∂ḡ/∂θ). Further, if SF denotes a
generic 'score function' for the class of distributions consistent with the
semi-parametric model mentioned above, then E(ḡ − SF)² ≤ E(g − SF)² (the
expectation being taken w.r.t. the distribution that corresponds to the SF)
for all the EFs g given by (4.1). These are the properties which justify the
previously introduced term 'quasi-SF' for ḡ. (Even before the EF optimality
of ḡ was discovered, the term quasi-likelihood was commonly used in the
literature on generalized linear models; McCullagh and Nelder 1983, 1989.)
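
The score-like identity E(ḡ²) = −E(∂ḡ/∂θ) can be checked numerically; the sketch below does so under an assumed non-exponential-family model (lognormal observations with mean μᵢ = exp(θzᵢ) and variance vᵢ = μᵢ²), an illustration of mine rather than an example from the text.

```python
# A Monte Carlo check of E(g-bar^2) = -E(d g-bar/d theta) for the quasi-SF
# (4.2), using an assumed non-exponential-family model: lognormal x_i with
# mean mu_i = exp(theta*z_i) and variance v_i = mu_i**2 (constant coefficient
# of variation).  Only the first two moments enter the quasi-score.
import numpy as np

rng = np.random.default_rng(4)
theta, n, reps = 0.3, 20, 50000
z = np.linspace(0.2, 1.5, n)
mu = np.exp(theta * z)
s2 = np.log(2.0)                                 # lognormal log-variance giving v_i = mu_i**2
x = rng.lognormal(mean=np.log(mu) - s2 / 2, sigma=np.sqrt(s2), size=(reps, n))

g = np.sum((x - mu) * z / mu, axis=1)            # quasi-SF: (dmu_i/dtheta)/v_i = z_i/mu_i
dg = np.sum(-z**2 * x / mu, axis=1)              # d g-bar / d theta

print("E(g-bar^2)          :", np.mean(g**2))    # both approximately sum(z_i^2)
print("-E(d g-bar/d theta) :", -np.mean(dg))
print("sum(z_i^2)          :", np.sum(z**2))
```
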
As we have seen previously, the EF theory corrected a major deficiency in
ML estimation relating to nuisance parameters. The above discussion points
to yet another accomplishment of the EF theory. It brought about, via the
quasi-score function ḡ, a kind of synthesis of two historically distinct
methods of estimation: LS for semi-parametric models and ML for parametric
models. The same criterion of EF optimality, namely (2.1), is satisfied in
the case of the latter by the SF and in the case of the former by the
quasi-SF ḡ in (4.2). Only the classes of competing EFs are different; they
are taken appropriate to the model (see Godambe and Thompson, 1989, Appendix).
Of course, the foregoing discussion also shows that the quasi-SF ḡ provides
not only a unification (of the two methods LS and ML) but much more: it
provides a generalization to deal with problems outside the scope of both
the LS and ML methods.
As a further contribution of the EF theory to statistics, below we briefly
outline a very 'flexible conditioning' that the theory permits and the conse-
quent incorporation of the Bayesian factor within its methodology.

5 A Generalization
It was mentioned earlier that, within the framework of martingales and the
corresponding filtering, the EF theory suggested the use of weighted conditional
least squares estimation, on grounds of its optimality property. But to deal
with general spatial processes one needs more flexible conditioning than used
before; this was provided by Godambe and Thompson (1989): Let, as before,
X = {x} be an abstract sample space and F = {F} be a class of distributions
on X. Further let θ be a real parameter defined on F: {θ(F), F ∈ F} = Ω.
Now suppose hⱼ is a real function on X × Ω and Xⱼ a specified partition (or
a σ-field generated by a partition) of X such that

    E(hⱼ | Xⱼ) = 0,   j = 1, ..., k.   (5.1)



The functions hⱼ, j = 1, ..., k, are called the elementary EFs; they are not
exhaustive. Their choice is determined by the problem at hand. Now suppose
the elementary EFs h₁, ..., hₖ are mutually orthogonal (Def. Godambe and
Thompson 1989) and the class of underlying distributions F satisfies certain
conditions. Then in the class of all EFs g of the form

    g = Σⱼ₌₁ᵏ qⱼ hⱼ,   (5.2)

where the qⱼ are some real functions on X × Ω which are measurable on Xⱼ,
j = 1, ..., k, the 'optimal' one is given by

    g* = Σⱼ₌₁ᵏ q*ⱼ hⱼ,   (5.3)

where q*ⱼ = {E(∂hⱼ/∂θ | Xⱼ)}/{E(hⱼ² | Xⱼ)}. Here the criterion of optimality,
as always, is unconditional, given by (2.1) for a real parameter θ (or its
appropriate version if θ is a vector); the expectation is taken with respect
to F ∈ F.
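
A minimal numerical sketch of (5.2) and (5.3), in the simplest situation where each conditioning partition Xⱼ is trivial (an illustrative setup of mine, not taken from the paper): two groups measure the same mean θ with different variances, hⱼ is the sum of residuals in group j, and the optimal weights q*ⱼ reduce to inverse-variance weights.

```python
# A minimal sketch of combining mutually orthogonal elementary EFs as in
# (5.2)-(5.3), in the simplest case where each partition X_j is trivial
# (an illustrative assumption).  Two groups measure the same mean theta with
# different variances; h_j = sum over group j of (x - theta) and
# q_j* = E(dh_j/dtheta)/E(h_j**2) = -1/sigma_j**2, i.e. inverse-variance
# weights (the common sign cancels in the root of the estimating equation).
import numpy as np

rng = np.random.default_rng(5)
theta_true, reps = 1.0, 20000
n1, n2, s1, s2 = 100, 100, 1.0, 3.0

x1 = rng.normal(theta_true, s1, size=(reps, n1))
x2 = rng.normal(theta_true, s2, size=(reps, n2))

# Roots of q1*h1 + q2*h2 = 0 for optimal weights and for equal weights.
w1, w2 = 1 / s1**2, 1 / s2**2
opt = (w1 * x1.sum(axis=1) + w2 * x2.sum(axis=1)) / (w1 * n1 + w2 * n2)
eq = (x1.sum(axis=1) + x2.sum(axis=1)) / (n1 + n2)

print("variance of optimally weighted estimate:", opt.var())
print("variance of equally weighted estimate  :", eq.var())
```
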
Up to the above results the EF theory was 'restricted' to the classical
setup, where distributions on the sample space X for some fixed values of the
parameters are considered. But the formalism of the EF optimality criterion
is flexible enough, and the just mentioned 'restriction' can be set aside if we
know something about the prior distribution of θ, for instance its mean (θ₀)
and variance (v₀). Under such a Bayesian setup the only changes required are
as follows: (i) In (5.1) Xⱼ is now not necessarily a partition of just the
sample space X, but it can be a partition of X × Ω, Ω as before being the
parameter space. (ii) Some elementary EFs hⱼ, j = 1, ..., k, can now be
functions exclusively of the parameter θ. (iii) All expectations in the
optimality criterion (2.1) are now with respect to the joint distribution of
(x, θ) (and not, as before, with respect to distributions of x given θ).
Following is an illustration.
Let the partitions of the sample space Xⱼ and the elementary estimating
functions hⱼ, j = 1, ..., k, be the same as in (5.1). Further, as suggested
before, let the mean value (θ₀) and the variance (v₀) of the prior distribution
of θ be known. Now, to the set of elementary estimating functions h₁, ..., hₖ
we add one more, namely hₖ₊₁ = θ − θ₀. In this case the optimal EF is
given by g* + (θ − θ₀)/v₀, where g* is the same as in (5.3). Similarly, the
quasi-SF ḡ in (4.2), which was obtained under the assumption 'θ is fixed',
will now have to be replaced by ḡ − (θ − θ₀)/v₀ (Godambe, 1994).
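
The following Python sketch illustrates this modification in the simplest assumed setting (my own example): for xᵢ ~ N(θ, σ²) the quasi-SF is ḡ(θ) = Σ(xᵢ − θ)/σ², and the root of ḡ − (θ − θ₀)/v₀ = 0 shrinks the sample mean toward the prior mean θ₀, improving the mean squared error when θ really is drawn from the assumed prior.

```python
# A minimal sketch of the Bayesian modification of the quasi-SF (an
# illustrative example): for x_i ~ N(theta, sigma2) the quasi-SF is
# g-bar = sum (x_i - theta)/sigma2, and solving g-bar - (theta - theta0)/v0 = 0
# shrinks the sample mean toward the prior mean theta0 with prior variance v0.
import numpy as np

rng = np.random.default_rng(6)
sigma2, theta0, v0, n, reps = 4.0, 0.0, 1.0, 10, 20000

theta = rng.normal(theta0, np.sqrt(v0), size=reps)        # theta drawn from the prior
x = rng.normal(theta[:, None], np.sqrt(sigma2), size=(reps, n))

xbar = x.mean(axis=1)
classical = xbar                                          # root of g-bar = 0
bayes = (n * xbar / sigma2 + theta0 / v0) / (n / sigma2 + 1 / v0)  # root of modified EF

print("MSE, root of g-bar = 0                      :", np.mean((classical - theta) ** 2))
print("MSE, root of g-bar - (theta - theta0)/v0 = 0:", np.mean((bayes - theta) ** 2))
```
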
The 'optimality' of the EF given by the derivative of the logarithm of the
posterior density was established in a 'parametric setup' by Ferreira (1982)
and Ghosh (1993). Naik-Nimbalkar and Rajarshi (1995) have established
some optimality results in 'semi-parametric Bayesian setup'.

6 Other Topics

Now, following are a few remarks (possibly only tangential) about the
likelihoods: empirical, partial, profile, quasi and the like. Basically, when
the likelihood function is precisely known, with no nuisance parameters, the
likelihood ratio test is 'optimal' in the conventional sense of the term. Also
the SF satisfies the EF criterion of 'optimality'. Now the various likelihoods
just mentioned, empirical, partial, quasi, try to 'approximate' the underlying
(true, precise) likelihood in situations of nuisance parameters and/or of
semi-parametric models. In similar situations the EF theory tries to
'approximate' the (true) underlying SF. However, unlike the former, the latter
'approximation' can be assessed with a plausible finite sample criterion.
Suppose g(x, θ) is a real function of the sample x and the parameter of
interest θ such that the expectation E(g) = 0 for all possible underlying
distributions F, i.e. for F ∈ F. Let further SF be a score function
corresponding to F in F. Then the finite sample criterion for assessing the
approximation g for SF is given by E(g − SF)², for all F ∈ F. This criterion,
as said before, leads to the 'optimality' criterion (2.1) of the EF theory.
As I have previously shown, optimal or approximately optimum EFs are found
in many practical problems and in fact by now they are in common use. Now,
while optimum EFs and approximations thereof can provide a handy instrument
for constructing confidence intervals and related tests, cf. Rao's test
(Rao 1947, Basawa 1991), for some other problems some kind of 'approximate
likelihood' would be more handy. I think, to be safer, construction of such
approximate likelihoods should be tied to the optimum EFs, whenever possible.
It is good to note already a strong trend in that direction (Qin and Lawless,
1994).
An often asked question (cf. Liang and Zeger 1995) is how the EF optimality
relates to the properties of the corresponding estimate. How good is the
estimate? Usually the answer is given in terms of the 'error' of the estimate.
Now this 'error' is somewhat of an involved concept. Certainly, error is not
just a square root of an arbitrary (unbiased or nearly so) estimate of
variance. However, for a parametric model the concept is clear. The error is
derived from the conditional (or the natural estimate of) variance of the SF.
Thus error is the inverse of the square root of the observed Fisher information
(Efron and Hinkley, 1978). This methodology is formalized and extended by the
EF theory. Consider the confidence intervals θ̂ ± const.(error), where the
estimate θ̂ is obtained from the unbiased estimating equation g(θ) = 0. Here a
more direct way of obtaining confidence intervals is by inverting the
distribution of the standardized version (cf. Godambe 1991, eq. 40) of the EF
g around θ. These intervals, compared to the former ones, are easier to
compute. Also, if g is the optimal EF, the corresponding intervals are the
shortest compared to those of any other unbiased EF (Godambe and Heyde 1987).

The standardizing factor of the EF g directly leads to the computation of
the 'error' for the estimate θ̂ (Godambe, 1995).
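
To illustrate interval estimation by inverting the standardized EF, the following Python sketch assumes the simplest case (my own example, not one discussed in the text) of xᵢ ~ Poisson(θ) with the unbiased EF g(θ) = Σxᵢ − nθ; inverting |g(θ)|/√(nθ) ≤ z gives a quadratic in θ, and the result is compared with the usual Wald interval θ̂ ± z√(θ̂/n).

```python
# A minimal sketch of interval estimation by inverting the standardized EF
# (an illustrative example): for x_i ~ Poisson(theta) the unbiased EF
# g(theta) = sum(x_i) - n*theta has variance n*theta, and the interval is the
# set of theta with |g(theta)|/sqrt(n*theta) <= z.
import numpy as np

rng = np.random.default_rng(7)
n, theta_true, z = 15, 0.8, 1.96
x = rng.poisson(theta_true, size=n)
theta_hat = x.mean()                             # root of g(theta) = 0

# |g|/sqrt(n*theta) <= z is the quadratic (theta - theta_hat)**2 <= z**2*theta/n.
b, c = -(2 * theta_hat + z**2 / n), theta_hat**2
lo = (-b - np.sqrt(b**2 - 4 * c)) / 2
hi = (-b + np.sqrt(b**2 - 4 * c)) / 2

wald = (theta_hat - z * np.sqrt(theta_hat / n), theta_hat + z * np.sqrt(theta_hat / n))
print("EF-inversion interval:", (lo, hi))
print("Wald interval        :", wald)
```
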
For important previous review articles on the subject we refer to Heyde
(1989) and Godambe and Kale (1991). The present review highlights some
more recent developments and presents older results with different emphasis
and interpretations. A further reference along this line is Desmond (1997).

References

Barnard, G.A. (1963). Some logical aspects of the fiducial argument. J.R.
Statist. Soc. B, 25, 111-114.
Barndorff-Nielsen, O.E. (1973). On M-ancillarity. Biometrika 60, 447-455.
Bartlett, M.S. (1936). The information available in small samples. Proc.
Camb. Phil. Soc., 34, 33-40.
Basawa, I.V. (1991). Generalized score tests for composite hypotheses. Es-
timating Functions (ed. V.P. Godambe), Oxford Univ. Press, Oxford,
121-131.
Bhapkar, V.P. (1991). Sufficiency, ancillarity and information in estimating
functions. Estimating Functions. (Ed. V.P. Godambe), Oxford Univ.
Press, Oxford. 240-254.
Cox, D.R. (1958). Some problems connected with statistical inference. Ann.
Math. Statist. 29, 357-372.
Desmond, A.F. (1997). Optimal estimating functions, quasi-likelihood and
statistical modelling (with discussion). J. Stat. Plan. Inf. 60, 77-121.
Durbin, J. (1960). Estimation of parameters in time series regression models.
J. Roy. Statist. Soc. B, 22, 139-153.
Ferreira, P.E. (1982). Multiparametric estimating equations. Ann. Stat.
Math. 34, 423-431.
Fisher, R.A. (1925). Theory of statistical estimation. Proc. Cambridge Phil.
Soc. 22, 700-706.
Ghosh, M. (1990). On a Bayesian analog of the theory of estimating func-
tions. C.G. Khatri Memorial Volume of Gujarat Statistical Review, 17A,
47-52.
Godambe, V.P. (1960). An optimum property of regular maximum likeli-
hood estimation. Ann. Math. Statist. 31, 1208-1212.
Godambe, V.P. (1976). Conditional likelihood and unconditional optimum
estimating equations. Biometrika, 63, 277-284.
Godambe, V.P. (1985). The foundations of finite sample estimation in
stochastic processes. Biometrika 72, 419-428.

Godambe, V.P. (1991). Orthogonality of estimating functions and nuisance


parameters. Biometrika 78, 143-151.
Godambe, V.P. (1994). Linear Bayes and optimal estimation. Tech. Report
STAT-94-11, University of Waterloo.
Godambe, V.P. (1995). Discussion of the paper, 'Inference Based on esti-
mating functions in the presence of nuisance parameters' by Liang, K.Y.
and Zeger, S.L. Statistical Science 10, 173-174.
Godambe, V.P. and Heyde, C.C. (1987). Quasi-likelihood and optimal esti-
mation. Int. Stat. Rev. 55, 231-244.
Godambe, V.P. and Kale, B.K. (1991). Estimating functions: an overview.
Estimating Functions. (Ed. V.P. Godambe), Oxford University Press,
Oxford. 1-20.
Godambe, V.P. and Thompson, M.E. (1974). Estimating equations in pres-
ence of nuisance parameters. Ann. Stat. 2, 568-571.
Godambe, V.P. and Thompson, M.E. (1989). An extension of quasi-likelihood
estimation (With Discussion). J. Stat. Plan. Inf. 22, 137-172.
Heyde, C.C. (1989). Quasi-likelihood and optimality of estimating functions:
some current unifying themes. Bull. Int. Stat. Inst. Book 1, 19-29.
Kalbfleisch, J.D. and Sprott, D.A. (1970). Applications of likelihood meth-
ods to models involving large number of parameters. (With Discussion).
J.R. Statist. Soc. B, 32, 175-208.
Kalbfleisch, J.D., Lawless, J.F. and Vollmer, W.M. (1983). Estimation in
Markov models from aggregate data. Biometrics 39, 907-919.
Klimko, L.A. and Nelson, P.I. (1978). On conditional least squares estima-
tion for stochastic processes. Ann. Statist. 6, 629-642.
Legendre, A.M. (1805). Nouvelles methodes pour la determination des or-
bites des cometes. Paris: Courcier.
Liang, K.Y. and Zeger, S.L. (1995). Inference based on estimating functions
in the presence of nuisance parameters. (With Discussion). Statistical
Science 10, 158-195.
Lindsay, B. (1982). Conditional score functions: some optimality results.
Biometrika 69, 503-512.
Lindsay, B. and Waterman, R.P. (1991). Extending Godambe's method in
nuisance parameter problems. Proceedings of a Symposium in honour of
Prof. V.P. Godambe. University of Waterloo, 1-43.
Lindsay, B.G. and Li, B. (1995). Discussion of the paper, 'Inference based
on estimating functions in the presence of nuisance parameters' by Liang,
K.Y. and Zeger, S.L. Statistical Science 10, 175-177.
Lloyd, C.J. (1987). Optimality of marginal likelihood estimating equations.
Comm. Stat. Theory and Meth. 16, 1733-1741.
McCullagh, P. and Nelder, J.A. (1983, 1989). Generalized linear models (1st
and 2nd editions). Chapman and Hall, London.
Naik-Nimbalkar, U.V. and Rajarshi, M.B. (1995). Filtering and smoothing
via estimating functions. J. Amer. Statist. Asso. 90, 301-306.

Neyman, J. and Scott, E.L. (1948). Consistent estimates based on partially


consistent observations. Econometrica, 16, 1-32.
Qin, J. and Lawless, J.F. (1994). Empirical likelihood and general estimating
equations. Annals of Statistics 22, 300-325.
Rao, C.R. (1947). Large sample tests for statistical hypotheses concerning
several parameters with applications to problems of estimation. Proc.
Camb. Phil. Soc. 44, 50-57.
Small, C. and McLeish, D.L. (1988). The theory and applications of sta-
tistical inference functions. Lecture Notes in Statistics No. 44, Springer-
Verlag, Heidelberg, New York, London.
Small, C. and McLeish, D.L. (1994). Hilbert space methods in probability
and statistical inference. John Wiley and Sons, Inc. New York.
