Stephen Stanley Sledjeski.

Study of the power of multivariate analysis of variance on standardized achievement testing online



The values of the F-ratio and complement of the
cumulative distribution function for fourth- and fifth-grade
mean value and regression estimated data sets at the 15 per-
cent level are presented in Table 4. For the fourth-grade
sample, no F-ratio of the mean value estimated data sets
differed from the complete data set's F-ratio by more than
0.3063. Likewise, for the regression estimated data sets,
no F-ratio differed from the complete data set's F-ratio by
more than 0.1386. Equivalent ranges for the fifth-grade
sample were 0.2364 and 0.0412, respectively.

Examining the complement of the cumulative distribution
function for the fourth-grade sample, no P of the mean
value estimated data sets differed from the complete data
set's complement by a value greater than 0.002696. Likewise,
for the regression estimated data sets, no complement
differed from the complete data set's complement by a value
greater than 0.001496. Equivalent ranges for the fifth-grade
sample were 0.001050 and 0.000255, respectively.

[Table 4. Values of the F-ratio and complement of the cumulative distribution function for fourth- and fifth-grade mean value and regression estimated data sets at the 15 percent level. Table values are illegible in the source scan.]

Since the complement of the complete data set for
both the fourth and fifth grades was less than 0.05 while
at the same time the five complements of the mean value and
the regression estimated data sets were less than 0.05, the
three null hypotheses were not rejected at the 15 percent
level of missing subsamples.

Comparison of the Mean Value and the Regression
Estimated Data Sets with One Another and
with the Complete Data Set at the 20
Percent Level of Missing Subsamples

The values of the F-ratio and complement of the
cumulative distribution function for fourth- and fifth-grade
mean value and regression estimated data sets at the 20 per-
cent level are presented in Table 5. For the fourth-grade
sample, no F-ratio of the mean value estimated data sets
differed from the complete data set's F-ratio by more than
0.3305. Likewise, for the regression estimated data sets,
no F-ratio differed from the complete data set's F-ratio by
more than 0.1237. Equivalent ranges for the fifth-grade
sample were 0.2711 and 0.0479, respectively.

Examining the complement of the cumulative distribution
function for the fourth-grade sample, no P of the
mean value estimated data sets differed from the complete
data set's complement by a value greater than 0.002830.
Likewise, for the regression estimated data sets, no complement
differed from the complete data set's complement by a
value greater than 0.001361. Equivalent ranges for the
fifth-grade sample were 0.001159 and 0.000299, respectively.

[Table 5. Values of the F-ratio and complement of the cumulative distribution function for fourth- and fifth-grade mean value and regression estimated data sets at the 20 percent level. Table values are illegible in the source scan.]

Since the complement of the complete data set for
both the fourth and fifth grades was less than 0.05 while
at the same time the five complements of the mean value and
the regression estimated data sets were less than 0.05, the
three null hypotheses were not rejected at the 20 percent
level of missing subsamples.
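
The decision rule used in the comparisons above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the complements in the usage example are made-up placeholders, not values taken from Table 4 or Table 5.

```python
def same_decision(complete_p, estimated_ps, alpha=0.05):
    """Return True when every estimated data set leads to the same
    reject / do-not-reject decision as the complete data set."""
    complete_rejects = complete_p < alpha
    return all((p < alpha) == complete_rejects for p in estimated_ps)


# Hypothetical complements (P values), for illustration only.
complete = 0.0210
mean_value = [0.0195, 0.0232, 0.0221, 0.0204, 0.0237]
regression = [0.0212, 0.0206, 0.0219, 0.0215, 0.0209]

print(same_decision(complete, mean_value))   # True: all below 0.05
print(same_decision(complete, regression))   # True: all below 0.05
```

When the function returns True for both groups of estimated data sets, the MANOVA decision is unchanged by the estimation, which is the condition under which the null hypotheses were not rejected.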

Further Results

To determine which method of estimation investigated
was the stronger, an inspection of the values of the F-ratios
and complements of the cumulative distribution function was
conducted. The closeness of these values of the incomplete
data sets to that of the appropriate complete data set was
observed. For each group of five incomplete data sets at
each percent level, the range of values was found and
its width examined.
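
As a minimal sketch of this inspection, the snippet below reads "range" as the largest absolute difference between the complete data set's value and the values of the five estimated data sets, which is what the phrasing "differed ... by more than" in the earlier sections suggests. The function name and the F-ratios are hypothetical and are not taken from the tables.

```python
def max_deviation(complete_value, estimated_values):
    """Largest absolute difference between the complete data set's value
    and any of the estimated data sets' values."""
    return max(abs(v - complete_value) for v in estimated_values)


# Hypothetical F-ratios for one grade at one percent level.
complete_f = 2.90
mean_value_f = [2.75, 3.10, 2.95, 2.60, 3.05]
regression_f = [2.88, 2.95, 2.86, 2.93, 2.97]

print(round(max_deviation(complete_f, mean_value_f), 4))   # 0.3
print(round(max_deviation(complete_f, regression_f), 4))   # 0.07
```

Under this reading, the method with the smaller value at a given percent level is the one whose results stay closer to those of the complete data set.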

The largest range at each percent level of missing
data for the fourth-grade sample with mean value estimates
varied from 0.001388 to 0.003977, whereas for the regression
estimated samples it varied from only 0.000375 to
0.001688. For the fifth-grade samples with mean value
estimates, the range varied from 0.000196 to 0.001159. For
regression estimates, it varied from 0.000245 to 0.000842.
Only at the 2½ percent level of missing values did the mean
value complement range not exceed the regression complement
range.

A closer examination of the results revealed additional
information. One might presume that as the percent
of estimated data elements decreased, the range between the
value of the F-ratio of the complete data set and the most
distant value of the F-ratio of the data sets with estimated
values would become smaller. This was consistent neither
within the fourth- and fifth-grade samples nor within the
method of estimation. Arranged from the level with the
shortest range to the level with the longest range, the
percent levels of missing data for the fourth-grade sample
with mean value estimates are ordered 2½, 5, 15, 20, 10;
for the fourth-grade sample with regression estimates,
5, 2½, 20, 15, 10; for the fifth-grade sample with mean
value estimates, 2½, 10, 5, 15, 20; and for the fifth-grade
sample with regression estimates, 2½, 15, 20, 10, 5. The
same results hold for the complement of the cumulative
distribution function.

Another presumption might be that the value of the
F-ratio of the complete data set would be within the range
of the values of the F-ratios at a particular percent level
of missing data. This is consistent for the fourth- and
fifth-grade samples within a method of estimation but not
between methods of estimation. For both the fourth- and
fifth-grade samples having mean value estimates, the value
of the F-ratio of the complete data set is within the range
of the values of the F-ratios for all percent levels of
missing data. For regression estimated samples, this is
not the case. The fourth-grade samples have F-ratios whose
range does not include the complete data set's F-ratio
at the 2½ percent level; for the fifth grade, this occurs
at the 2½ and 20 percent levels. The value of the F-ratio
of the complete data set exceeds the values of the F-ratio
in the fifth-grade sample and falls below the values in the
fourth-grade sample.
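
The two checks described in the preceding paragraphs, ordering the percent levels by range and testing whether the complete data set's F-ratio falls inside the span of the estimated F-ratios, can be sketched as follows. Function names and all numbers are hypothetical illustrations, not values from the study.

```python
def order_levels_by_range(max_dev_by_level):
    """Percent levels sorted from shortest to longest range."""
    return sorted(max_dev_by_level, key=max_dev_by_level.get)


def complete_within_range(complete_value, estimated_values):
    """True when the complete data set's value lies between the smallest
    and largest values of the estimated data sets."""
    return min(estimated_values) <= complete_value <= max(estimated_values)


# Hypothetical maximum deviations keyed by percent of missing data.
deviations = {2.5: 0.05, 5: 0.12, 10: 0.40, 15: 0.31, 20: 0.33}
print(order_levels_by_range(deviations))  # [2.5, 5, 15, 20, 10]

print(complete_within_range(2.90, [2.75, 3.10, 2.95]))  # True
print(complete_within_range(2.90, [2.93, 2.95, 2.97]))  # False
```

The second call returning False mirrors the situation reported for some regression estimated samples, where every estimated F-ratio lies on the same side of the complete data set's F-ratio.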

Summary

In summary, this chapter has presented the statisti-
cal analysis of the data. The results of the study indicated
that no significant differences exist among the MANOVA
results of data sets having missing subscores estimated by
mean values, data sets having missing subscores estimated by
regression, and the complete data set with no missing values.
This was demonstrated for 100 samples with estimated
subscores. The estimated subsamples consisted of 2½, 5, 10,
15, and 20 percent of the complete samples of fourth- and
fifth-grade students.

Since inspection showed that the regression estimated
values provided MANOVA and complement results at each
percent level closer, in all instances, to those of the
complete data set, it is apparently the stronger of the two
estimation procedures. Both methods of estimation, though,
were demonstrated to provide MANOVA results not significantly
different from the results of the complete data sets.



CHAPTER V
DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS

Discussion

The intention of this study was to examine the
effect of different estimators for missing multiresponse
data on multivariate analysis of variance (MANOVA) results.
Mean value and regression techniques were used in deter-
mining estimates. The MANOVA results for the data sets
which employed the different estimation techniques were
compared to each other and to MANOVA results of the complete
data set.
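
A minimal sketch of the two estimation techniques follows. This part of the text does not say which subscores served as predictors in the regression method, so a single-predictor least-squares line is assumed here; the function names, the subscore labels, and the scores are all hypothetical.

```python
from statistics import mean


def mean_value_estimate(observed):
    """Estimate a missing subscore by the mean of the observed subscores."""
    return mean(observed)


def regression_estimate(x_obs, y_obs, x_new):
    """Estimate a missing subscore y from another subscore x using a
    least-squares line fitted to the complete cases (assumed model)."""
    x_bar, y_bar = mean(x_obs), mean(y_obs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_obs, y_obs))
    sxx = sum((x - x_bar) ** 2 for x in x_obs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept + slope * x_new


# Hypothetical subscores for students with complete records.
reading = [40.0, 50.0, 60.0, 70.0]
arithmetic = [45.0, 55.0, 65.0, 75.0]

# A student with reading = 80 is missing the arithmetic subscore.
print(mean_value_estimate(arithmetic))                 # 60.0
print(regression_estimate(reading, arithmetic, 80.0))  # 85.0
```

The mean value estimate ignores the student's other scores, while the regression estimate uses them, which is one plausible reason the regression estimated data sets could track the complete data set more closely.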

Specifically investigated were the achievement test
scores of a fourth-grade sample and a fifth-grade sample.
Fifty MANOVAs were conducted on each grade; 25 analyzed the
incomplete data sets with mean value estimates and 25 with
regression estimates. The 25 analyses were subgrouped into
five sets of analyses. Each set contained a different percent
level of missing data. These levels were 2½, 5, 10,
15, and 20 percent of the complete sample. Five samples
with different missing subsets of data were analyzed at each
level.
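
The layout of the analyses for one grade can be enumerated as in the sketch below. It assumes the lowest level is 2½ percent and that missing subsamples are drawn at random; the sample size of 200, the seed, and the helper name are hypothetical and serve only to make the example repeatable.

```python
import random

PERCENT_LEVELS = [2.5, 5, 10, 15, 20]   # percent of the complete sample
REPLICATES = 5                          # samples drawn at each level
METHODS = ["mean value", "regression"]


def draw_missing_subsample(n_students, percent, rng):
    """Randomly pick which students have a subscore treated as missing."""
    k = round(n_students * percent / 100)
    return sorted(rng.sample(range(n_students), k))


rng = random.Random(0)    # fixed seed so the sketch is repeatable
n_students = 200          # hypothetical sample size

analyses = [
    (method, level, rep)
    for method in METHODS
    for level in PERCENT_LEVELS
    for rep in range(REPLICATES)
]
print(len(analyses))   # 50 MANOVAs per grade

missing = draw_missing_subsample(n_students, 2.5, rng)
print(len(missing))    # 5 students at the 2.5 percent level
```

Two methods, five percent levels, and five samples per level give the 50 analyses per grade described above.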

The results of Chapter IV demonstrated that the
MANOVA results of both estimation techniques did not differ
significantly from one another nor from the results obtained
from the complete data set. Inspection of the F-ratios and
complements implied that the regression method was apparently
the stronger estimation technique.

The latter result was determined by the closeness of
the values of the F-ratios and the complements of the
cumulative distribution function for the estimated samples to
those of the complete data set.

In addition, two a posteriori results were observed.
First, it did not follow that as the percent of estimated
data elements decreased, the range between the value of the
F-ratio of the complete data set and the most distant value
of the F-ratio of the data sets with estimated values became
smaller. The absence of such a pattern held for
both grades of students and both methods of estimation.
This was likewise true for the complement of the cumulative
distribution function.

A second finding was that the F-ratio of the complete
data set was not within the range of the values of the F-ratios
at all percent levels of missing data estimated by regression
techniques. It did hold for mean value estimated data sets.
The same findings occurred among the complements of the
cumulative distribution function.

Conclusions

Three conclusions were drawn from the present
study:






1. Achievement data with up to 20 percent missing
subscores that are estimated by mean value
techniques when analyzed by MANOVA provide
results which do not differ significantly from
MANOVA results of the same achievement data
without any missing subscores.

2. Achievement data with up to 20 percent missing
subscores that are estimated by regression
techniques when analyzed by MANOVA provide
results which do not differ significantly
from MANOVA results of the same achievement
data without any missing subscores.

3. Achievement data with up to 20 percent missing
subscores that are estimated by mean value
techniques when analyzed by MANOVA provide
results which do not differ significantly
from MANOVA results of achievement data with
up to 20 percent missing subscores that are
estimated by regression techniques.

The above conclusions seem to suggest that there
exist for educators alternatives in data analysis other than
discarding incomplete multiresponse observations. The
alternatives provided here are the two methods of estimation:
mean value and regression. In addition, the mean value
method of estimation was demonstrated to be as appropriate
in MANOVA as the regression method, as indicated by the
nonrejection of the third hypothesis. Further data
considerations revealed that for all levels of missing data, the
F-ratio of the complete data set was located within the
range of the F-values determined for the data sets with
missing subsamples estimated by the mean value method.
This did not hold for the regression method.

Since the mean value method is straightforward
and has been shown to be an appropriate estimation
technique, data formerly lost to analysis can be retained.
No longer must estimates for omissions be avoided because of
complicated data manipulations, time, money, and resources.

Recommendations

The present study has operated under various limi-
tations which need to be investigated in order to extend
the inferences of this research. Bracht and Glass (1968)
stated:

The intent (sometimes explicitly stated, sometimes
not) of almost all experimenters is to generalize
their findings to some group of subjects and set
of conditions that are not included in the experiment.
To the extent and manner in which the
results of an experiment can be generalized to
different subjects, settings, experimenters, and,
possibly, tests, the experimenter possesses
external validity. (pp. 437-438)

The external validity of this study is restricted by the
lack of reported research dealing with statistical analyses
which employ data estimates without parametric estimates.
Areas which require further investigation in reference to
inferential conclusions are presented in the following list:

1. The samples consisted of fourth and fifth
graders. Other educational levels need to
be examined.

2. Achievement scores for two levels of one
standardized achievement test were analyzed.
Other standardized achievement tests need
to be investigated.

3. In addition to achievement tests, other types
of tests which measure not only the cognitive
domain but also the affective domain need to
be studied such as those dealing with self-
concept and social acceptance.






4. Other methods of estimation need to be considered
in a manner similar to the present
investigation and compared to mean value
methods for accuracy and simplicity.

5. Missing subsamples were determined randomly.
Actual missing subsamples need to be investi-
gated for possible commonalities.

6. The levels of missing data should be expanded
in order to determine maximum levels of missing
subsamples.

7. More than one missing subscore per experimental
unit needs inspection.

8. Experimental designs requiring analyses different
from multivariate analysis of variance need
probing.

These recommendations are listed not only to provide closure
to the present study but also to indicate the multidirectional
approaches involved in this specific area of research.
Closure is provided with respect to confining the present
research's inferences to the subset of investigations outside
of the above listing. The expanse of additional
approaches is suggested by the list itself. No one item
of the list is more worthy of study than another. All
need investigation in order to advance to the universal
set of estimators for omissions of multiresponse data.



REFERENCES



Afifi, A. and Elashoff, R. M. "Missing observations in
multivariate statistics I. Review of the literature."
Journal of the American Statistical Association,
1966, 61, 595-604.

Afifi, A. and Elashoff, R. M. "Missing observations in
multivariate statistics II. Point estimation in
simple linear regression." Journal of the
American Statistical Association, 1967, 62, 10-29.

Anderson, T. W. "Maximum likelihood estimates for a
multivariate normal distribution when some observations
are missing." Journal of the American Statistical
Association, 1957, 52, 200-203.

Baird, H. R. and Kramer, C. Y. "Analysis of variance of a
balanced incomplete block design with missing
observations." Applied Statistics, 1960, 9, 189-198.

Bhargava, R. Multivariate tests of hypotheses with incomplete
data. Applied Mathematics and Statistical Laboratories,
Technical Report 3, 1962.

Bracht, G. H. and Glass, G. V. "The external validity of
experiments." American Educational Research
Journal, 1968, 5, 437-474.

Buck, S. F. "A method of estimation of missing values in
multivariate data suitable for use with an electronic
computer." Journal of the Royal Statistical Society,
Series B, 1960, 22, 302-307.

Dagenais, M. G. "Further suggestions concerning the
utilization of incomplete observations in regression
analysis." Journal of the American Statistical
Association, 1971, 66, 93-98.



Dear, R. E. "A principal-component missing-data method for
multiple regression models." SP-86, Systems Development
Corporation, Santa Monica, California, 1959.

Dempster, A. P. "An overview of multivariate data analysis."
Journal of Multivariate Analysis, 1971, 1, 316-346.

Edgett, G. L. "Multiple regression with missing observations
among the independent variables." Journal of
the American Statistical Association, 1956, 51,
122-131.

Federspiel, C. F., Monroe, R. J., and Greenberg, B. G.
"An investigation of some multiple regression
methods for incomplete samples." University of
North Carolina, Institute of Statistics, Mimeo
Series, No. 236, August 1959.

Glasser, M. "Linear regression analysis with missing
observations among the independent variables."
Journal of the American Statistical Association,
1964, 59, 834-844.

Haitovsky, Y. "Missing data in regression analysis."
Journal of the Royal Statistical Society,
Series B, 1968, 30, 67-82.

Hartwell, T. D. and Gaylor, D. W. "Estimating variance
components for two-way disproportionate data with
missing cells by the method of unweighted means."
Journal of the American Statistical Association,
1973, 68, 379-383.

Hocking, R. R. and Smith, W. B. "Estimation of parameters
in the multivariate normal distribution with
missing observations." Journal of the American
Statistical Association, 1968, 63, 159-173.

Hopper, M. J., comp. Harwell Subroutine Library: A
Catalogue of Subroutines. London: Her Majesty's
Stationery Office, State House, 49 High Holborn,
1970.

Kleinbaum, D. G. Estimation and hypothesis testing for
generalized multivariate linear models. Doctoral
dissertation, University of North Carolina, Chapel
Hill, North Carolina, 1970.

Kramer, C. Y. and Glass, S. "Analysis of variance of a
Latin square design with missing observations."
Applied Statistics, 1960, 9, 43-50.



Lord, F. M. "Estimation of parameters from incomplete data."
Journal of the American Statistical Association,
1955, 50, 870-876.

Lord, F. M. "Estimation of latent ability and item parameters
when there are omitted responses." Psychometrika,
1974, 39, 247-264.

Matthai, A. "Estimation of parameters from incomplete data
with applications to design of sample surveys."
Sankhya, 1951, 2, 145-152.

Mitra, S. K. "Some remarks on the missing plot analysis."
Sankhya, 1959, 21, 337-344.

Morrison, D. F. "Expectations and variances of maximum
likelihood estimates of the multivariate normal
distribution parameters with missing data."
Journal of the American Statistical Association,
1971, 66, 602-604.

Nicholson, G. E., Jr. "Estimation of parameters from
incomplete multivariate samples." Journal of
the American Statistical Association, 1957, 52,
523-526.

Preece, D. A. "Query and answer: Non-additivity in two-way
classifications with missing values." Biometrics,
1972, 28, 574-577.

Pruzek, R. M. "Methods and problems in the analysis of
multivariate data." Review of Educational Research,
1971, 41, 163-190.

Raffeld, P. C. The effects of Guttman weights on the

