Copyright
Allan Birnbaum.

A unified theory of estimation. 1. (Rev. & extended Feb. 1960)



No. 196047




NEW YORK UNIVERSITY
INSTITUTE OF
MATHEMATICAL SCIENCES



IMM-NYU 266
APRIL 1960



A UNIFIED THEORY OF ESTIMATION. I

(Revised and extended, February 1960)



ALLAN BIRNBAUM



REPRODUCTION IN WHOLE OR IN PART

IS PERMITTED FOR ANY PURPOSE
OF THE UNITED STATES GOVERNMENT.



PREPARED UNDER
CONTRACT NO. NONR-285(38)
WITH THE

OFFICE OF NAVAL RESEARCH
UNITED STATES NAVY



IMM-NYU 266
April 1960



A UNIFIED THEORY OF ESTIMATION, I
(Revised and extended, February 1960)



Allan Birnbaum



This report represents results obtained at the Institute
of Mathematical Sciences, New York University, under the
sponsorship of the Office of Naval Research, Contract No.
Nonr-285(38). Some sections include results previously
reported (under the same title) obtained at Columbia University
under the sponsorship of the Office of Naval Research,
Contract No. Nonr-266(33).






0. Introduction and Summary. This paper extends and unifies some
previous formulations and theories of estimation for one-parameter
problems. The basic criterion used is admissibility of a point
estimator, defined with reference to its full distribution rather
than special loss functions such as squared error. Theoretical
methods of characterizing admissible estimators are given, and
practical computational methods for their use are illustrated in
a variety of examples.

Point, confidence limit, and confidence interval estimation are
included in a single theoretical formulation, and incorporated into
estimators of an "omnibus" form called "confidence curves." The
usefulness of the latter for some applications as well as theoretical
purposes is illustrated.

Fisher's maximum likelihood principle of estimation is generalized,
given exact (non-asymptotic) justification, and unified with
the theory of tests and confidence regions of Neyman and Pearson.
Relations between exact and asymptotic results are discussed.

An application of the general theory gives optimal sequential
estimators having prescribed precision in a specified interval.

Further developments, including multiparameter and nuisance parameter
problems, problems of choice among admissible estimators,
formal and informal criteria for optimality, and related problems
in the foundations of statistical inference, will be presented
subsequently.






1. A broad formulation of the problem of point estimation. We consider
problems of estimation with reference to a specified experiment
$E$, leaving aside here questions of experimental design, including
those of choice of a sample size or a sequential sampling rule;
some definite sampling rule, possibly sequential, is assumed specified
as part of $E$. Let $S = \{x\}$ denote the sample space of possible
outcomes $x$ of the experiment. Let $f(x, \theta)$ denote one of the
elementary probability functions on $S$ which are specified as possibly true.
Let $\Omega = \{\theta\}$ denote the specified parameter space. For each
$\theta$ in $\Omega$ and for each subset $A$ of $S$, the probability that
$E$ yields an outcome $x$ in $A$ is given by



$$\operatorname{Prob}\{X \in A \mid \theta\} = \int_A f(x, \theta)\, d\mu(x),$$

where $\mu$ is a specified $\sigma$-finite measure on $S$. (We assume tacitly
here and below that consideration is appropriately restricted to
measurable sets and functions only.)

If $\gamma = \gamma(\theta)$ is any function defined on $\Omega$ (e.g.
$\gamma(\theta) = \theta$ or $\gamma(\theta) =$ [illegible]), with range
$\Gamma$, a point estimator of $\gamma$ is any measurable
function $g = g(x)$ taking values in $\Gamma$ (or in $\bar{\Gamma}$, its
closure, if, for example, $\Gamma$ is an open interval). The problem of
choosing a good estimator, that is, an estimator which tends to take values
close to the true unknown value of $\gamma$, has been formulated
mathematically in various ways. Most formulations achieve mathematical
definiteness by introducing criteria of closeness which appear somewhat
arbitrary from some standpoints of application and undesirably schematic as
expressions of the intuitive notion of closeness.

If $\Omega$ is given no specific (parametric) structure, then the
latter features can be fully avoided only by a very broad formulation
which specifies only that if $\gamma$ is true, then an exactly correct
estimate ($g = \gamma$) is closer than any incorrect estimate
($g \neq \gamma$). If $\Omega$ is finite, $\Omega = \{\theta_1, \ldots, \theta_k\}$,
and $\gamma(\theta) = \theta$, this leads to the
formulation of Lindley [1] in which estimators are compared only
on the basis of their error probabilities

$$p_{ij} = \operatorname{Prob}[\theta^*(X) = \theta_j \mid \theta_i], \qquad i, j = 1, \ldots, k, \quad i \neq j,$$

where $\theta^*(x)$ is any estimator of $\theta$. This formulation has no very
useful extension to typical estimation problems in which, for
example, $\Omega$ is an interval, and in which the event
$\theta^*(X) = \theta$ exactly has typically negligible probability and
little interest.
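
As a small numerical illustration of error probabilities of this kind, the following sketch tabulates $p_{ij}$ for a two-point parameter space. Everything specific in it is an assumption made for illustration and is not taken from the report: the binomial experiment with 5 trials, the values 0.3 and 0.7, the maximum likelihood rule used as the estimator, the reading of $p_{ij}$ as the probability of estimating $\theta_j$ when $\theta_i$ is true, and the function names.

```python
from math import comb


def binomial_pmf(x, n, theta):
    """Probability of x successes in n Bernoulli(theta) trials."""
    return comb(n, x) * theta**x * (1.0 - theta) ** (n - x)


def error_probabilities(thetas, n, estimator):
    """Return the matrix p[i][j] = Prob[estimator(X) = thetas[j] | thetas[i]].

    Off-diagonal entries are the error probabilities; diagonal entries are
    the probabilities of an exactly correct estimate.
    """
    k = len(thetas)
    p = [[0.0] * k for _ in range(k)]
    for i, theta_i in enumerate(thetas):
        for x in range(n + 1):
            j = thetas.index(estimator(x))
            p[i][j] += binomial_pmf(x, n, theta_i)
    return p


# Hypothetical two-point problem (illustrative values only).
THETAS = [0.3, 0.7]
N_TRIALS = 5


def ml_estimator(x):
    """Pick the parameter value in THETAS with the larger likelihood at x."""
    return max(THETAS, key=lambda t: binomial_pmf(x, N_TRIALS, t))


if __name__ == "__main__":
    for row in error_probabilities(THETAS, N_TRIALS, ml_estimator):
        print([round(value, 5) for value in row])
```

With an odd number of trials the likelihoods never tie, so the rule reduces to estimating 0.7 whenever at least 3 successes are observed, and the two error probabilities coincide by symmetry.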

The case in which $\Omega$ is any set of real numbers, for example an
interval, and $\gamma(\theta) = \theta$, may be termed the central problem
of the theory of point estimation, although very important generalizations
of this problem have been treated extensively. For this problem,
closeness of $\theta^*$ to $\theta$ has been specified by the introduction of
specific loss functions: The absolute error criterion, $|\theta^* - \theta|$,
was introduced by Laplace. Gauss replaced this by the squared
error criterion $(\theta^* - \theta)^2$, which proved mathematically much more
tractable and provided a definite formulation of the problem which seemed
equally reasonable. A generalized squared error criterion,
$c(\theta) \cdot (\theta^* - \theta)^2$, where $c(\theta)$ is any specified
positive function, is used in some work in modern statistical decision theory.
Such criteria are sometimes used in conjunction with the requirement of
unbiasedness, $E(\theta^*(X) \mid \theta) \equiv \theta$; this is done
(evidently primarily to facilitate mathematical developments) particularly in
the theory of linear estimation due to Gauss; this reduces the mean squared
error criterion to a criterion of variance:
$E[(\theta^* - \theta)^2 \mid \theta] \equiv \operatorname{Var}(\theta^* \mid \theta)$.
(For a brief account of the history of the theory of
point estimation, cf. Neyman [2], pp. 9-14.)
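
The reduction of mean squared error to variance under unbiasedness is a special case of the standard decomposition of mean squared error, written out here in the notation of this section. The symbol $b(\theta)$ for the bias is a label introduced only for this illustration.

```latex
\begin{aligned}
E\big[(\theta^* - \theta)^2 \mid \theta\big]
  &= E\big[(\theta^* - E(\theta^* \mid \theta))^2 \mid \theta\big]
     + \big(E(\theta^* \mid \theta) - \theta\big)^2 \\
  &= \operatorname{Var}(\theta^* \mid \theta) + b(\theta)^2,
  \qquad b(\theta) = E(\theta^*(X) \mid \theta) - \theta .
\end{aligned}
```

The cross term vanishes because $E[\theta^* - E(\theta^* \mid \theta) \mid \theta] = 0$. When $E(\theta^*(X) \mid \theta) \equiv \theta$, the bias term is identically zero and only the variance remains.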

Each such definite specification of closeness can be criticized
as somewhat arbitrary, except in a context where one postulates
the reality of the indicated costs of errors of each possible kind.
To avoid such features it is evidently necessary and sufficient to
adopt the following weak specification of closeness: If Q'^

For formal convenience, we also define $a(\theta, \theta, \theta^*) = 0$.
When reference to a given estimator $\theta^*$ is understood, we may write
simply $a(u, \theta)$, $a(\theta-, \theta)$, or $a(\theta+, \theta)$. The
functions $a(\theta-, \theta)$ and $a(\theta+, \theta)$ of $\theta$ play a
useful technical role, and will be called
respectively the lower and upper location functions of $\theta^*$.

In many problems, estimators for which
$\operatorname{Prob}[\theta^*(X) = \theta \mid \theta] > 0$
for some $\theta$ are found not useful. The remaining estimators have
continuous c.d.f.'s, and have $a(\theta-, \theta) \equiv 1 - a(\theta+, \theta)$.
No two such estimators, having different location functions, can be comparable;
for $a(\theta-, \theta, \theta^*) < a(\theta-, \theta, \theta^{**})$ is
equivalent to $a(\theta+, \theta, \theta^*) > a(\theta+, \theta, \theta^{**})$;
this shows that neither estimator is at least as good as the other.
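
The trade-off between the two location functions can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes a single observation $X$ normally distributed with mean $\theta$ and unit variance, considers the hypothetical family of estimators $\theta^*(X) = X + c$, and reads the lower and upper location functions as the probabilities of underestimation and overestimation. The function names are invented for the example.

```python
import math


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def location_functions(shift):
    """Location functions of the estimator X + shift when X ~ N(theta, 1).

    lower = Prob[X + shift < theta | theta] = Phi(-shift)
    upper = Prob[X + shift > theta | theta] = 1 - Phi(-shift)
    Neither value depends on theta, because theta is a location parameter.
    """
    lower = std_normal_cdf(-shift)
    upper = 1.0 - lower
    return lower, upper


if __name__ == "__main__":
    for shift in (0.0, 0.5):
        lower, upper = location_functions(shift)
        print(f"shift={shift}: lower={lower:.4f}, upper={upper:.4f}")
```

Moving the shift from 0 to 0.5 lowers the probability of underestimation and raises the probability of overestimation by the same amount, so neither of the two estimators is at least as good as the other in both respects.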

The broad and "weak" definition of admissibility adopted here
leads to very large admissible classes in typical problems. However,
it does not seem unreasonable to conceive of the problem of point
estimation as one in which the investigator chooses an estimator on
the basis of consideration of the risk curves of all estimators in
some essentially complete class. In principle this consideration
should be complete, but of course the practical counterpart of this
can be at most a more or less extensive familiarity with an essentially
complete class, developed by study of the risk-curves of a
variety of specific estimators, possibly strengthened by some
general theoretical considerations (including envelope risk-curves,
discussed below), and perhaps also by reference to one or several loss
functions and criteria of optimality which may seem more or less
appropriate in specific applications. Such an approach is not so
difficult to carry out as might be anticipated, as will be illustrated.
Of course difficulties of computation or complexity may
sometimes dictate that an inadmissible estimator must be adopted;
even in such cases, the most general basis on which any particular
estimator might be justified as not too inefficient is evidently
the comparison of its risk-curves with those of other estimators,
especially admissible ones.

Example. Let $X$ be normally distributed with unknown mean $\theta$
and variance 1, with $\Omega = \{\theta \mid -\infty < \theta < \infty\}$.




interval estimator, this is taken as evidence for the conclusion
that the true unknown value of the parameter $\theta$ lies in the closed
interval $[\theta', \theta'']$.

The probability properties of any interval estimator $J$ may be
described in the following terms: It is natural to call
$a(\theta-, \theta, \theta'')$ the lower location function of $J$ (as well
as of $\theta''$), and to denote it when convenient by
$a(\theta-, \theta, J)$; similarly
$a(\theta+, \theta, J) \equiv a(\theta+, \theta, \theta')$
is the upper location function of $J$. As with point estimators,
these functions give respectively the probabilities of underestimation
and of overestimation when a given interval estimator $J$
is used. For example, it is natural to call $J$ a median-unbiased
interval estimator if for each $\theta$ we have equal probabilities of
overestimation and underestimation:
$a(\theta-, \theta, J) = a(\theta+, \theta, J)$. This
usage is compatible with the definition of a median-unbiased point
estimator.

A quantity of primary interest is the probability that the
conclusion indicated by any interval estimator $J$ ("$\theta$ lies in
$[\theta', \theta'']$") will be incorrect, for each possible true value
$\theta$. This probability is just the sum of the location functions of $J$:

$$\operatorname{Prob}[\theta \text{ not covered by } J(X) \mid \theta]
  = \operatorname{Prob}[\theta''(X) < \theta \mid \theta]
  + \operatorname{Prob}[\theta'(X) > \theta \mid \theta]
  = a(\theta-, \theta, J) + a(\theta+, \theta, J).$$

If this probability equals $\alpha$ for each $\theta$, then $J$ is a
$(1 - \alpha)$ confidence interval; if in addition $J$ is median-unbiased,
then $\theta'$ and $\theta''$ are $(1 - \tfrac{1}{2}\alpha)$ confidence
limits. As with point and confidence limit
estimators, it is of interest in general to consider the probabilities
of errors of underestimation and of overestimation of various
magnitudes in interval estimation; we denote these probabilities by

$$a(u, \theta, J) =
  \begin{cases}
    a(u, \theta, \theta')  & \text{for each } u > \theta, \\
    a(u, \theta, \theta'') & \text{for each } u < \theta.
  \end{cases}$$
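
The relation between the location functions of an interval estimator and its non-coverage probability can be illustrated numerically. The sketch below rests on assumptions made only for this example: a single observation $X$ normally distributed with mean $\theta$ and unit variance, and the hypothetical interval $J(X) = [X - d', X + d'']$ with fixed offsets. The function names are invented here.

```python
import math


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def interval_location_functions(lower_offset, upper_offset):
    """Location functions of J(X) = [X - lower_offset, X + upper_offset].

    under = Prob[X + upper_offset < theta | theta] = Phi(-upper_offset)
    over  = Prob[X - lower_offset > theta | theta] = Phi(-lower_offset)
    Both are free of theta under the assumed N(theta, 1) model.
    """
    under = std_normal_cdf(-upper_offset)
    over = std_normal_cdf(-lower_offset)
    return under, over


def noncoverage_probability(lower_offset, upper_offset):
    """Probability that J(X) fails to cover theta: sum of the two functions."""
    under, over = interval_location_functions(lower_offset, upper_offset)
    return under + over


if __name__ == "__main__":
    under, over = interval_location_functions(1.96, 1.96)
    print(f"under={under:.5f}, over={over:.5f}, "
          f"noncoverage={noncoverage_probability(1.96, 1.96):.5f}")
```

With equal offsets of 1.96 the two location functions agree, so this interval is median-unbiased in the sense defined above, and its non-coverage probability is close to 0.05. Unequal offsets give an interval with the same kind of coverage statement but unequal probabilities of underestimation and overestimation.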

In a formal sense, a point estimator may be regarded as an
interval estimator $J = (\theta', \theta'')$ having the special form:
$\theta'(x) =$

