the within device variables as X\.
16
Multitokamak datasets are often poorly defined in the sense that the
number of different tokamaks, Nt, is only slightly larger than the number of
intertokamak variables, m . When this occurs, the data covariance matrix,
X f 5Z _1 X is usually close to singular.
When the data covariance matrix is almost degenerate, the GLS estima-
tor of f3 is ill-conditioned. This occurs because the inversion of X' Yl 1 X
multiplies components of X' ]C _1 y by A" 1 where Xj are the eigenvalues of
X £ X^ _1 X. Thus small errors in the observed values of y are multiplied by
-l
Statisticians refer to the problem of the near degeneracy of the data ma-
trix as "colinearity". We briefly describe the two most commonly used sta-
tistical techniques to reduce colinearity problems.
Principle component analysis decomposes the data covariance matrix
X £ £ _1 X into E'AE. The columns of E are the eigenvector of the covari-
ance matrix and A is a m x m diagonal matrix whose entries consist of the
corresponding eigenvalues Xj of £j. The eigenvectors £j are also called the
principle components. The eigenvalues Xj are equal to the variance of their
respective eigenvectors.
Small eigenvalues correspond to data directions which have only been
slightly varied in the dataset. The projection of the parameter vector /3
onto these directions is poorly determined. If an eigenvalue A m is identically
zero, then an arbitrary multiple of the eigenvector em may be added to the
parameter vector (3 without affecting the goodness of fit.
When one or more eigenvalues Xj are almost zero, then an arbitrary com-
bination of these eigenvectors may be added to ft while only slightly degrading
17
the goodness of fit. This explains why a large number of distinctly different
scaling laws can describe tokamak data 1 with only slightly different goodnesses
offit.
The k principle component estimator is denned by setting the (m — k)
most ill defined projections of to zero. Mathematically
${ c = {E'A^E^'EVy
where Aj, has the (tti — k) smallest eigenvalues, Xj set to zero.
Ridge regression 4 is a related method where a small multiple of the iden-
tity is added to the data covariance matrix. Thus the ridge estimator is
The ridge trace is the plot of the individual components, 0i(8) vs. 9 to
determine graphically when 8 is large enough to stabilize the fit coefficients.
Both ridge regression and principle components stabilize the ill condi-
tioned least squares estimator by shrinking the ill conditioned components.
Mathematically, there always exists a small enough 8 such that the ridge
estimator has a smaller variance than the least squares estimator. In other
words, the small amount of bias introduced by 8 is more than compensated
by the improvement in numerical stability.
In summary, the near degeneracy of X' £ -1 X means that many param-
eter vectors, /3 (scaling laws) can describe the data. Ridge regression sets
the poorly defined components of (3 to a much smaller value than the least
squares estimator. Finally, we note that the independent variables Xi should
be centered about the weighted mean value, x for both principle components
and ridge regression.
18
III. Tokamak to Tokamak Variation
As discussed earlier, almost all scaling studies assume that the errors (de-
partures from the fitted values) are independent from discharge to discharge.
This neglects the component of error which remains constant on any one
device.
If this tokamak to tokamak variation is attributable to one or more impor-
tant factors such as wall material or limiter/divertor configuration, statistics
is of little help in analyzing confinement. If, however, the tokamak to toka-
mak differences are due to many small factors, then it is possible to analyze
the data by averaging over the small factors.
We now present and apply a slight generalization of Swamy's random
coefficient model. In this model, we assume that the scaling coefficients for
each tokamak, j3{ have an a priori distribution where E\j3i] = /3 and
E[0i - 1)0% - ff\ = A**
where A is an m x m covariance matrix.
Given the covariance, A, the mean parameter vector may be estimated
using generalized least squares. The covariance matrix A may then be esti-
mated iteratively from the residuals, $ — /9.
The inter-tokamak variables, x , are usually too poorly determined to rule
out fixed values of the parameters. Thus we adopt a hybrid model where /3
is fixed and /3j is distributed about /3j with covariance matrix Aj. We do
assume the overall scaling constant is random and included in x\.
Following [6], we divide our design matrix into separate blocks, Xj, for
each tokamak. The block Xj is then a kj x (m + raj) matrix where kj is
19
the number of datapoints from the j-th tokamak. Xj is then divided into
The error matrix for the j-th block reduces to
£« = alii + X 1 ,A l X t lt
The GLS estimator for J3 is
Using results from [6], this simplifes to
i=l
[
=
N
\
-1
E ' \(o> jj E j + A i )-{F i ,I l .)
3=1 \ h
mi
/
N
E
(^ t + A l )-6r
where E 5 = (J^.Xi,.)"", Fj = kjEjX^. and A = (r^P^AiP^. We have
used the standard statistical notation that "-" denotes the Moore-Penrose
generalized inverse and P x is the projection onto the row space of Xj.
The covariance matrix, A, may then be iteratively estimated from
N
N
N
t=l • /v l t=l
AT
s b = Y,(K-K)(K-KY
i=i
To take into account the effects of different x for different tokamaks fe- is
defined as
.
A +/?„■*«
/ x \\
\ /
20
We note that the error in our estimate of /3 tends to zero as the square
root of the number of tokamaks not as the square root of the number of
datapoints.
The predicted performance of any new tokamak, with parameters u =
{u ,ui) is
y{u) = /3 -u±
\
N
( vt \
F.
\ Im > J
{a„E 3 + A,)" (Fj, I mi ) u + xtf AjU!
The first term in the square root represents the uncertainty in the estimate
of J3. The second term accounts for the intrinsic variability of tokamaks when
only the parameters in u are specified.
We close this section by noting that the ordinary least squares estimator
for f3 does tend to the correct value of /3 as the number of tokamaks increases.
However the GLS estimator using the random coefficient model is a more
efficient estimate of /3. Finally the error estimates for J3 and y(u) even scale
incorrectly when ordinary least squares is used.
IV. Discussion
In no way do we want to pretend that a complex dynamical system such
as a tokamak is completely describable as a linear system with six or eight
degrees of freedoms. We prefer to view scaling laws as a Taylor series expan-
sion about a particular point in parameter space, including only the most
important plasma variables.
Scaling laws describe subregions of parameter space where a single loss
mechanism or a special combination of losses is dominant. Thus scaling laws
21
should not be trusted in extrapolating to new regions in parameter space.
Luckily, scaling laws are most often used for the case when the physical,
dimensionless variables are only varied slightly and the major extrapolation is
in device size. Furthermore in most multimachine databases, the component
of the database which varies the most is strongly associated with device size.
Acknowledgment
The author wishes to thank S. K aye for several detailed discussions of his
previous analyses of tokamak data. Dr. Kaye presented the author with a
table of scaling laws for individual tokamaks. This table help to stimulate
the author to realize tokamak to tokamak correlations need to be included.
Professor K. Lackner originally suggested the use of principle components in
their joint work on temperature profile shape scalings.
This work was supported under U.S. Department of Energy Grant No.
DE-FG02-86ER53223.
REFERENCES
1. Kaye, S., Phys. Fluids., Vol. 28, No. 8 (1985) p. 2327.
2. Mardia, K.V., Kent, J.T. and Bibby, J.M., Multivariate Analysis, Aca-
demic Press, London (1982).
3. Braams, B. and Lackner, K., Nucl. Fusion, Vol. 26, No. 6 (1986) p.
699.
4. Hoerl, A.E. and Kennard, R.W., Technometrics, Vol. 12, p. 55 (1970).
22
5. Swamy, P.A.V.B., Econometrica, Vol. 38, No. 2, p. 311 (1970).
6. Riedel, K.S., submitted to Econometrica.
23
THE SWAMY RANDOM COEFFICIENT MODEL
WITH SINGULAR COVARIANCE MATRICES
K.S. Riedel
Courant Institute of MathematicaJ Sciences
New York University
New York, NY. 10012
Abstract
The Swamy random coefficient model is generalized to the case
where the covariance matrices XjXj for each individual and the dis-
persion matrix A for the free parameters are singular.
In many physical or economic systems, the free parameters (3j vary from
individual/individual device to individual (device). If the differences in the
parameter vectors (3 d are due to many small differences in the individuals,
instead of one or two significant latent variables, a statistical treatment is
still possible. In these cases, the random coefficient model, 1,2 introduced
by Swamy, is applied. Swamy assumes that the parameter vectors (3j are
randomly distributed about a mean vector 0. More precisely,
# = /5 + ik (1)
where E[fii] = 0, £[£&'] = A, and £[&,/Ij] = 0, i ^ j.
In many situations, not all independent variables can be varied in the in-
dividual experiments. One such problem is the extrapolation of performance
parameters in controlled nuclear fusion experiments. 3 A database consisting
of a large number of datapoints from eight tokamak experiments has been
24
collected. The energy confinement time is assumed to be a geometric function
of certain bulk variables such as plasma current, magnetic field and device
size. However, several of these independent variables, namely the device size
and device shape, cannot be varied in the individual experiments.
In this note, we generalized Swamy's random coefficient model to the case
where a number of coefficients can be determined only through interindivid-
ual comparisons. We assume that the z-th individual can be modeled as
where yi and e^ are (A\ x 1) vectors and e^ is distributed E[e{, e •'] = (TuS^Ir..
In contrast to the standard random coefficient model, we assume the
design matrix has a block structure:
X i = (X 0i .,X li ) (2)
where X,, is a ki x m matrix and X lt is a ki x mi matrix.
We further assume that none of the independent variables in X 0i vary
over the i-th individual. Thus we can write X , as
Xo. =j.®x : (3)
where ji, is the K{ vector consisting entirely of ones.
For many cases, including ours, the number of individuals is only slightly
larger than m , the number of parameters which can only be determined
by interindividual comparisons. Thus allowing these parameters, f3 0t to be
random is ill conditioned. Therefore we assume only the parameters, f}\,
25
describing Xi axe random, i.e. 1
/
A =
V
(4)
Aj
Finally, we assume that the intercept is a random variable, included in
Xj . This assusmption significantly simplifies the resulting algebra since it
implies that the column space of X,,,, A/(X 0) ), is contained in the column
space of X\ i .
The standard generalized least squares estimator of the mean parameter
vector is
( N \~ l N
= (X t 2- 1 X)- l X t Vy= IXX^xA E«W (5)
We make one further generalization of Swamy's 1 results by assuming the
individual covariance matrices X[ Xi may be singular.
Due to the assumed block structure of A, the covariance matrix for the
i-th error structure, XJ« simphfies to
E tt = <7>/ t + XnAiX^ (6)
To simplify these expressions we note that Rao's equality (Ref. [4], p.
33) generalized to
(
where E is the Moore-Penrose generalized inverse of X l X, E = (X l X)~ and
A is the normalized projection of A onto the row space of X, A = P X AP X .
We note that XEX t is the projection operator onto the column space
of X. Thus the first two terms of eq. (7) are the projection perpendicular
26
to the column space of X, P±. Since M(X 0i ) C M(Xi t ), P±X 0i = 0, where
P±, =U~ XiEX\. Thus
-> |E^(X 0j ,X lj ) =
/ pt \
j
\ *"* J
(4^ + A i )-(F i ,/ mi ) (7)
_ =?t
where Ej = (X{.Xi-)~, Fj = kjEjfeuxJ.). We have used X„ X^ = kjX 0] x x
where x\ } is the mean value of ij for the j-th individual. Similarly, X'-Ej j/j
simphfies to
/ ire
\ Vi
J (4^ + ^-6-
(8)
where 6~ is the generalized least squares estimator obtained using the Moore-
Penrose inverse.
Thus the generalized least squares estimator for (3 simphfies to
IE MM^ + ^n^O £[ F! ^Ei + LiYbT
ij =i
t=i
(9)
Swamy's estimator for the within individual variation remains unchanged
(10)
bl
Vi'PuVi
h -rn u
where m\ i is the actual number of degrees of freedom in the i th individ-
ual regression. However the estimator for Ai becomes significantly more
complicated when X\ Xi t is singular. The variance matrix is defined from
N
N
N
EQAiQ, = -— - S b - Y,*&K x u)
(11)
27
where Q t projects onto the column space of X\. Xj and
(12)
1=1
Our definition of Sb differs from Swamy's because we have used our es-
timated (3 vector to iteratively define b~ . This is not only a more efficient
estimator, but also necessary to take into account the effects of different x
for different individuals. The precise definition of b~ is
Qi
t
(iY\
K+K' io.
\
\°J)
(13)
The second term corrects the constant term in bi for x .
If A'j'.Yi, is never singular, the LHS of (11) reduces to NAi. Unfortu-
antely, for degenerate cases we must solve a m\ x m\ matrix problem. We
rewrite eq. (11) as
(JtQi®Q t )&i = R (HO
where denotes the Kroenecker product and
Aj = (An, A 2 i . . . A m ,i, A12 • • ■A mimi ) .
The resulting linear system may be solved to yield an estimate for Aj.
Acknowledgment
This work was performed under U.S. Department of Energy Grant No.
DE-FG02-86ER-53223.
28
REFERENCES
1. Swamy, P.A.V.B., "Efficient interference in a random coefficient model,"
Econometrica, Vol. 38, No. 2 (1970) p. 311.
2. Judge, G.C., Griffiths, W.E., Hill, R.C., Lutkepohl, H. and Lee, T.-C,
Theory and Practice of Econometrics, J. Wiley and Sons, Inc., New
York, 1985.
3. Riedel, K.S., "Tokamak to tokamak variation and colinearity in scaling
laws," submitted to Nuclear Fusion.
4. Rao, C.R., Linear Statistical Inference and Its Applications, J. Wiley
and Sons, New York, 1973.
29
APPENDIX:
Principal Components Analysis for Degenerate Uncentered
Matrices
We assume the k-th individual experiment is of the form
x/(l,x +Xi )
where x,x % are mi vectors of independent variables which can be varied in
the individual experiments. Let X? denote the n x (ttij + 1) data matrix and
Xv be the n x rri] data matrix consisting of the last m^ columns of X?. We
define the normalized, centered data matrix X as
X = (X v - l n x )/y/n
The data matrix has the form
( 1 *"
X T X T =n I _ _ t
\ x X l X + xx
We now derive a simple expression for the Moore-Penrose generalized
inverse of XjXj- when X l X is degenerate.
Lemma. Let E be a t X £ symmetric non-negative definite matrix of
rank £i and v, w be t vectors. Assume v G M(E) and w is orthogonal to
M{E). Let
fi = E + {v + w)(v + wY ,
30
the Moore-Penrose generalized inverse is
{wtfElfp + EmpVtU 1 )
^mp — E MP
\w\
_ WW
+ (l + tfE^v) —
D
To determine the generalized inverse for non-centered matrices, we set
\
U ...
E =
\
1
v + w =
I
i
... x l x
Let ei, ... e no be Lhe nonrmalized null vectors of X'X, we define
- v- - - ( l
xj = }_ ei â– x , w =
«=i ^ Xf
If X l X is nearly degenerate, the principal components regression can be
obtained by projecting out the nearly degenerate vectors.
Finally, the projection operator onto the row space of Xt is
P Xt = ET.pE +
WW
\w\
31
NYU MF- 118
Riedel, Kurt S
Advanced statistics
for
tokamak transport
colinearity and...
c.l
NYU MF- 118
Riedel, Kurt S
Advanced statistics for
tokamak transport
colinearity and... c.l
DATE DUE
BORROWER S NAME
This book may be kept
FOURTEEN DAYS
A fine will be charged for each day the book is kept overtime.
GAYLORO t«2
PRINTEO IN U t A
utf> r
art
C I^S