the critical levels of the two one-sided tests of the hypothesis
that the true value of the parameter is θ, one against larger
alternatives and the other against smaller alternatives.
Figure 1a. Confidence curve estimate of a binomial proportion
p based on n = 75 observations and an observed
proportion p̂ = x/n = 45/75 = .6.
[Plot not reproduced. Vertical axis: c(p, .6), marked at .5, .3, .2, .1, and .05. Horizontal axis: scale of p, from .1 to 1.0.]
Figure 1b. Graphs of some percent points of p̂ = x/n.
[Plot not reproduced. Axes: scale of p and scale of p̂ = x/n, each running from 0 to 1.0; the observed value p̂ = .6 is marked.]
The typical form of a confidence curve estimate is illustrated
in Figure 1a, which is the graph of such an estimate of a binomial
mean (proportion) p, based on n = 75 observations and an observed
proportion p̂ = 45/75 = .6. The general formula for a confidence
curve estimator of a binomial proportion p, given an observed
proportion p̂ = x/n based on n observations, using the usual normal
approximation, is
\[ c(p, \hat{p}) = \Phi\left( -\sqrt{n}\,|\hat{p} - p| \,/\, \sqrt{p(1-p)} \right), \]
where Φ denotes the standard normal cumulative distribution function.
(Here θ becomes p, t becomes p̂ = x/n, and the confidence limit θ(t, γ)
may be designated p(p̂, γ), or p(x/n, γ). In this notation, the usual
(mean-unbiased) point estimator of p is x/n = p̂ = p(p̂, .5).) The
discreteness of distributions of t in such problems represents a
minor theoretical and computational complication; except with very
small sample sizes, the quantitative and theoretical significance
of such complications is minor, and it seems appropriate for
typical purposes of informative inference to use the usual
continuous approximations to the distributions involved. When
desired, such approximations can be replaced by exact probabilities
taken from tables of the binomial distribution; this is advisable
for p or p̂ near 0 or 1, and more generally when n is small. In the
present example we have
\[ c(p, .6) = \Phi\left( -(8.66)\,|p - .6| \,/\, \sqrt{p(1-p)} \right), \]
which may be evaluated, using tables of Φ, for as many values of p
as desired to provide a sketch of the confidence curve estimate.
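The evaluation just described can be sketched in Python. This is a minimal illustration and not the author's procedure: the function names are hypothetical, Φ is computed from math.erf in place of tables, and the "exact" version (the smaller of the two binomial tail probabilities at the observed x) is an assumed analogue of the normal-approximation formula, not something stated in the text.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def confidence_curve_normal(p, p_hat, n):
    """c(p, p_hat) under the usual normal approximation, for 0 < p < 1."""
    return phi(-math.sqrt(n) * abs(p_hat - p) / math.sqrt(p * (1.0 - p)))


def confidence_curve_exact(p, x, n):
    """Assumed exact analogue: smaller of P(X <= x | p) and P(X >= x | p)."""
    pmf = [math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]
    lower_tail = sum(pmf[: x + 1])  # P(X <= x | p)
    upper_tail = sum(pmf[x:])       # P(X >= x | p)
    return min(lower_tail, upper_tail)


if __name__ == "__main__":
    n, x = 75, 45
    p_hat = x / n
    for p in (0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75):
        approx = confidence_curve_normal(p, p_hat, n)
        exact = confidence_curve_exact(p, x, n)
        print(f"p = {p:.2f}   normal approx. {approx:.3f}   exact tail {exact:.3f}")
```

Plotting the first column against p gives a sketch of the curve in Figure 1a; the second column indicates how much the discreteness of x matters at this sample size.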
An alternative graphical method of construction of such a
confidence curve estimate is illustrated in Figure 1b. This
method is clearly applicable in any problem for which graphs (or
corresponding tables) of various quantiles (percent points)
of the basic statistic are available. The figure also illustrates
further the definition of a confidence curve estimate. Figure 1b
contains graphs of the functions p ± k(p(1-p))^{1/2} for
k = .277, .196, .124, .062, and 0. (These values of k correspond
to some of the graphs contained in the charts of 95% confidence
belts of Clopper and Pearson [2], namely those labeled 50, 100,
250, 1000, and the central diagonal line.) Each of these functions
gives quantiles of x/n as a function of p (based on the usual
normal approximation). For n = 75, these are the .008, .045, .142,
.295, .5, .705, .858, .955, and .992 quantiles of p̂ = x/75. The
inverses of these functions, easily read graphically, give upper
and lower confidence limits based on any observed value; in the
present example, we obtain in this way, as indicated by arrows in
the figure, the following point estimates, as a basis for a
sketch of the complete confidence curve estimate: lower 99.2%
limit: .45; lower 95.5% limit: .50; ...; median-unbiased estimate:
.60; upper 70.5% limit: .63; ...; upper 99.2% limit: .74.
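The graphical inversion can also be imitated numerically. The sketch below is an illustration under stated assumptions, not the method used to prepare the figure: it takes the quantile curves to be p ± k(p(1-p))^{1/2} with the k values quoted above, converts each k to a confidence level Φ(k√n) for n = 75, and inverts each curve at the observed p̂ = .6 by bisection. The function names and the use of bisection are choices made here for illustration; values read from a chart need not agree with the computed ones to the second decimal.

```python
import math

# k values quoted in the text (each is approximately 1.96 / sqrt(belt label)).
K_VALUES = (0.277, 0.196, 0.124, 0.062, 0.0)


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def quantile_curve(p, k):
    """Quantile of x/n as a function of p; k may be positive, negative, or zero."""
    return p + k * math.sqrt(p * (1.0 - p))


def confidence_level(k, n):
    """Level attached to the curve with constant k when the sample size is n."""
    return phi(k * math.sqrt(n))


def invert(p_hat, k, tol=1e-10):
    """Solve quantile_curve(p, k) = p_hat for p in (0, 1) by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quantile_curve(mid, k) < p_hat:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n, p_hat = 75, 45 / 75
    for k in K_VALUES:
        level = confidence_level(k, n)
        lower = invert(p_hat, +k)  # curve p + k*sqrt(p(1-p)) gives a lower limit
        upper = invert(p_hat, -k)  # curve p - k*sqrt(p(1-p)) gives an upper limit
        print(f"k = {k:.3f}  level {level:.3f}  lower {lower:.3f}  upper {upper:.3f}")
```

With k = 0 both limits reduce to the median-unbiased estimate p̂ itself.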
3. Interpretations of confidence curve estimates. The range
and flexibility of possible interpretations of a confidence curve
estimate can be illustrated by considering another example, shown
in Figure 2. This confidence curve is an estimate of the mean θ
of a normal distribution with known standard deviation σ = 5,
based on a sample of n = 4 independent observations y_i whose sample
mean is t = ȳ = 5. Thus Figure 2 is the graph of the confidence
curve estimate c(θ, t) = c(θ, 5) = Φ(-(.4)|θ - 5|). (The same estimate
would arise in the following different problem: Let θ = μ_2 - μ_1 be
the unknown difference between two means μ_i of normal distributions
with known standard deviations σ_1, σ_2; and suppose that a
difference of independent sample means ȳ_2 - ȳ_1 = 5 is observed,
based on sample sizes n_1, n_2, such that
$\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} = 2.5 = 1/(.4)$.)
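In the first formulation (known standard deviation 5, sample of 4 observations), the coefficient .4 is the reciprocal of the standard error of the sample mean. As a worked step, assuming the general normal-mean form $c(\theta, t) = \Phi(-|\theta - t| / (\sigma/\sqrt{n}))$, which is written here by analogy with the binomial formula and is not displayed in this section:

```latex
\[
\frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{4}} = 2.5, \qquad
c(\theta, 5) = \Phi\left( -\frac{|\theta - 5|}{2.5} \right)
             = \Phi\bigl( -(.4)\,|\theta - 5| \bigr).
\]
```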
The following are examples of the inference statements about θ,
incorporated in this confidence curve estimate, which may be read
by inspection of Figure 2:
1. A point estimate of θ is 5. (Here and in many common examples
the best median-unbiased estimate obtained in this way
coincides exactly or very nearly with standard estimates based
on the criterion of mean-unbiasedness or that of maximum
likelihood, except for very small sample sizes in some
examples.)
2. An upper .90 confidence limit for θ is 8.2.
Figure 2. Confidence curve estimate of a normal mean θ
based on a sample mean = 5, having a standard
error = 2.5.
[Plot not reproduced. Vertical axis: c(θ, t); horizontal axis: scale of θ.]
3. A .99 confidence interval for θ is: -1.4 to 11.4.
4. In testing the hypothesis H_0: θ = 0 against the one-sided
alternative H_1: θ > 0, we would just reject H_0 at the .025
significance level. (Hence we would not reject H_0 at the .01
level, but would reject it at the .05 level.)
5. In testing H_0: θ = 0 against the one-sided alternative
H_1: θ < 0, we would accept H_0 at any of the usual significance
levels.
6. In testing H_0: θ = 0 against the two-sided alternative
H_1: θ ≠ 0, we would just reject H_0 at the .05 (= 2(.025))
significance level (but at no smaller level).
7. The information conveyed in the preceding three inference
statements (4)-(6), which are in hypothesis-testing form, can
be expressed alternatively in confidence-limit statements as
follows: The value 0 is a .975-level lower confidence limit
for θ. (That is, we have moderately high confidence that the
true value of θ is not as small as 0. Our confidence in this
inference is not as strong as is represented by the .99
confidence level, but is stronger than is represented by the .95
confidence level.)
8. The one-sided test of (4) above, which rejects H_0: θ = 0 in
favor of larger alternatives at the .025 significance level,
has a Type II critical level of approximately .16 at the
alternative hypothesis θ = 7.5 (corresponding to power = .84
against this alternative). If alternatives of this or larger
magnitudes are of interest, the observed data may be considered
fairly strong evidence against H_0 favoring such alternatives,
since outcomes with sample means at least as large as that
observed have relatively small probability (.025) under H_0
but relatively large probability (.84 or greater) under such
alternatives.
9. For the same test, at the alternative hypothesis θ = 1 we have
a Type II critical level of approximately .95 (corresponding
to power = .05 only). If alternatives of about this magnitude
are of interest, the observed data cannot be considered very
helpful, since outcomes with sample means at least as large
as that observed have small probabilities of similar
magnitudes under the different hypotheses of interest
(θ = 0 or θ = 1).
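The statements in this list can be checked numerically rather than read from the figure. The following sketch assumes only the curve c(θ, 5) = Φ(-(.4)|θ - 5|) given above, that is, an observed mean of 5 with standard error 2.5; the function names are hypothetical, and "Type II critical level" is taken here to mean one minus the power at the stated alternative, as the paired figures in statements 8 and 9 suggest.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

T_OBS = 5.0  # observed sample mean t
SE = 2.5     # standard error of the sample mean, 1/(.4)


def confidence_curve(theta, t=T_OBS, se=SE):
    """c(theta, t) = Phi(-|theta - t| / se)."""
    return Z.cdf(-abs(theta - t) / se)


def upper_limit(level, t=T_OBS, se=SE):
    """Upper confidence limit for theta at the given one-sided level."""
    return t + Z.inv_cdf(level) * se


def lower_limit(level, t=T_OBS, se=SE):
    """Lower confidence limit for theta at the given one-sided level."""
    return t - Z.inv_cdf(level) * se


def type_ii_critical_level(theta_alt, t=T_OBS, se=SE):
    """P(sample mean < t | theta_alt): one minus the power, at theta_alt,
    of the test that rejects for sample means at least as large as t."""
    return Z.cdf((t - theta_alt) / se)


if __name__ == "__main__":
    print("point estimate (level .5 limit):", upper_limit(0.5))
    print("upper .90 limit:", round(upper_limit(0.90), 2))
    print(".99 interval:", round(lower_limit(0.995), 2),
          "to", round(upper_limit(0.995), 2))
    one_sided = confidence_curve(0.0)
    print("critical level, theta = 0 against theta > 0:", round(one_sided, 4))
    print("critical level, theta = 0 against theta != 0:", round(2 * one_sided, 4))
    for alt in (7.5, 1.0):
        beta = type_ii_critical_level(alt)
        print(f"theta = {alt}: Type II critical level {beta:.2f}, power {1 - beta:.2f}")
```

The one-sided critical level at θ = 0 comes out slightly below .025, which is consistent with the wording "just reject" in statement 4.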
4. Discussion. Systematic use of estimators consisting of
sets of confidence limits at various levels has been proposed by
Tukey [3] and by Cox [4], for reasons generally similar to those
described above. The particular form for such estimators proposed
above, confidence curves, seems to serve conveniently for typical
general purposes of informative inference, when based on standard
current statistical techniques as illustrated above. In addition,
this form of definition of an (omnibus) estimator serves well in
some extensions and unifications of the theory and practical
techniques of estimation which will be published separately [5].
The latter include certain generalizations and justifications of
maximum likelihood methods, and unification of the latter with the
theory and techniques of confidence limit estimation.
The problem of interpreting significance test results so as
to distinguish appropriately between formal statistical
significance on the one hand, and practical significance in a specific
context of application on the other hand, is met in a simple but
helpful way by use of confidence curves, as illustrated by points
(8) and (9) of the preceding section. The relatively informal
comments given in that section to illustrate the interpretations
of an observed outcome, as evidence relevant to the various
statistical hypotheses considered, seem to give unified and explicit
form to much current practice of applied statistics based on the
generally accepted principles and foundations of the theory of
Neyman and Pearson. At the same time, the writer feels that these
foundations of statistical inference will bear further discussion,
which will be offered elsewhere [6], and in which certain
refinements and modifications of current theoretical formulations
and practical techniques will be proposed.
For any specific application, the form and completeness in
which a confidence curve estimate is reported can of course vary
greatly. For many purposes a very rough sketch based on only
several computed points will suffice. And of course for many
purposes the more standard techniques, either tests or estimates
(of the point, confidence limit, or confidence interval form) may
well suffice; from a formal theoretical standpoint, it may sometimes
be useful to regard any one of these standard techniques as an
incomplete description of an underlying complete confidence curve
estimate, which suffices for a particular application. This
standpoint is helpful in avoiding interpretations of standard
techniques which are tied too formally to chosen fixed confidence
or significance levels, interpretations which seem inappropriately
schematic in typical contexts of informative inference.
REFERENCES
[1] Natrella, Mary G. "The relation between confidence intervals
and tests of significance," The American Statistician, Vol. 14
(1960), No. 1, pp. 20-22 and p. 38.
[2] Clopper, C. J., and Pearson, E. S. "The use of confidence or
fiducial limits illustrated in the case of the binomial,"
Biometrika, Vol. 26 (1934), p. 404.
[3] Tukey, J. W. "Standard confidence points," (unpublished;
preliminary report presented at meeting of the Institute of
Mathematical Statistics, New York City, April 1948).
[4] Cox, D. R. "Some problems connected with statistical inference,"
Annals of Mathematical Statistics, Vol. 29 (1958), p. 363.
[5] Birnbaum, A. "A unified theory of estimation. I," Tech.
Report, IMM-NYU 266, Institute of Mathematical Sciences,
New York University, 1960.
[6] Birnbaum, A. "On the foundations of statistical inference. I,"
Tech. Report, IMM-NYU 267, Institute of Mathematical Sciences,
New York University.
BASIC DISTRIBUTION LIST FOR UNCLASSIFIED TECHNICAL REPORTS

Address                                No. of Copies

Head, Statistics Branch                3
Office of Naval Research
Washington 25, D. C.

Commanding Officer                     2
Office of Naval Research
Branch Office, Navy 100
Fleet Post Office
New York, New York

ASTIA Document Service Cntr.           10
Arlington Hall Sta.
Arlington 12, Virginia

Office of Techn. Services              1
Department of Commerce
Washington 25, D. C.

Techn. Informa. Officer                6
Naval Research Laboratory
Washington 25, D. C.

Prof. T. W. Anderson                   1
Dept. of Math. Statistics
Columbia University
New York 27, New York

Prof. Z. W. Birnbaum                   1
Lab. of Stat. Research
Dept. of Mathematics
University of Washington
Seattle 5, Washington

Prof. Ralph A. Bradley                 1
Dept. of Stat. & Stat. Lab.
Virginia Polytechnic Inst.
Blacksburg, Virginia

Prof. Herman Chernoff
Appl. Math. & Stat. Lab.
Stanford University
Stanford, California

Prof. W. G. Cochran
Dept. of Statistics
Harvard University
Cambridge, Massachusetts

Prof. Benjamin Epstein
Appl. Math. & Stat. Lab.
Stanford University
Stanford, California

Prof. Harold Hotelling
Associate Director
Institute of Statistics
Univ. of North Carolina
Chapel Hill, North Carolina

Prof. I. R. Savage
School of Business Admin.
University of Minnesota
Minneapolis, Minnesota

Prof. Oscar Kempthorne
Statistics Laboratory
Iowa State College
Ames, Iowa

Dr. Carl F. Kossack
I. B. M.
Lamb Estate Research Center
P.O. Box 218
Yorktown Heights, New York

Prof. Gerald J. Lieberman
Appl. Math. & Stat. Lab.
Stanford University
Stanford, California

Prof. William G. Madow
Department of Statistics
Stanford University
Stanford, California

Prof. J. Neyman
Department of Statistics
University of California
Berkeley 4, California

Prof. Herbert Robbins
Math. Statistics Department
Columbia University
New York 27, New York

Prof. Murray Rosenblatt
Dept. of Mathematics
Brown University
Providence, Rhode Island

Prof. L. J. Savage
Statistical Res. Laboratory
Chicago University
Chicago 37, Illinois

Prof. Frank Spitzer                    1
Dept. of Mathematics
University of Minnesota
Minneapolis, Minnesota

Prof. S. S. Wilks                      1
Dept. of Mathematics
Princeton University
Princeton, New Jersey

Prof. Gertrude Cox                     1
Institute of Statistics
State College Section
North Carolina State College
Raleigh, North Carolina

Prof. J. Wolfowitz                     1
Department of Mathematics
Cornell University
Ithaca, New York

Prof. Harvey M. Wagner                 1
Stanford University
Applied Mathematics
and Statistics Labs.
Serra House
Stanford, California

Prof. W. H. Kruskal                    1
Department of Statistics
University of Chicago
Chicago 37, Illinois