
WORKING PAPER

ALFRED P. SLOAN SCHOOL OF MANAGEMENT

NONLINEAR THREE STAGE LEAST SQUARES POOLING OF

CROSS SECTION AND AVERAGE TIME SERIES DATA

Dale W. Jorgenson*

and

Thomas M. Stoker**

April 1982

(Revised: August 1983)

Sloan School of Management Working Paper #1293-82

MASSACHUSETTS

INSTITUTE OF TECHNOLOGY

50 MEMORIAL DRIVE

CAMBRIDGE, MASSACHUSETTS 02139


*Department of Economics, Harvard University

Cambridge, Massachusetts 02138

**Sloan School of Management, Massachusetts Institute of Technology

Cambridge, Massachusetts 02139

NONLINEAR THREE STAGE LEAST SQUARES POOLING

OF CROSS SECTION AND TIME SERIES OBSERVATIONS

by

Dale W. Jorgenson and Thomas M. Stoker

l_, Iiitroduc t ion . The purpose of this paper is to discuss the pooling of

cross section and average time series data by the method of nonlinear three

stage least squares introduced by Jorgenson and Laffont (1974). ^ \,e consider

applications of this method to exact aggregation models, where there is a

unique correspondence between individual and aggregate behavior. This

correspondence makes exact aggregation models appropriate for the analysis of

individual data, average data, or both in coi.bina t ion. "

We consider observations on K individuals, indexed by k = 1, 2 ... K, for T time periods, indexed by t = 1, 2 ... T. We can represent the structural form of an exact aggregation model for the kth individual in the tth time period by:

y_nkt = x_kt' β_n(p_t, θ),   (n = 1, 2 ... N).

The observations y_nkt and x_kt vary over both individuals and time periods, while the vector of observations p_t varies over time periods, but is the same for all individuals in a given time period. The coefficients β_n(p_t, θ) are functions of the observations p_t and the vector of L structural parameters θ' = (θ_1, θ_2 ... θ_L). Restrictions on the parameters are embodied in the forms of these functions.

We can write the exact aggregation model for the kth individual in vector form:

y_kt = (I_N ⊗ x_kt') β(p_t, θ),   (1)

where y_kt is a vector of N observations, β(p_t, θ) is a vector of N coefficients, and I_N is the identity matrix of order N. By averaging the model (1) over all individuals for each time period, we obtain the structural form of the exact aggregation model for averaged data:

ȳ_t = (I_N ⊗ x̄_t') β(p_t, θ),   (2)

where ȳ_t and x̄_t' are vectors of N observations on averages of y_kt and x_kt' over all individuals.

The models for individual cross section and average time series observations contain the same parameter vector θ and the same coefficient vector β(p_t, θ). This reflects the correspondence between individual and aggregate behavior that characterizes exact aggregation models. The forms of the individual and aggregate models (1) and (2) are necessary and sufficient for exact aggregation, provided that the population distribution of x_kt is unrestricted.
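The correspondence between (1) and (2) can be checked numerically: averaging simulated individual models over k reproduces the aggregate model with the same coefficient vector. The following sketch (numpy; the particular coefficient function, dimensions, and data are all illustrative, not from the paper) verifies this by linearity.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, L = 3, 500, 4           # N equations, K individuals, L regressors per equation

# Hypothetical coefficient vector beta(p_t, theta): any function of prices alone.
def beta(p):                  # returns an (N*L,) stacked coefficient vector
    return np.concatenate([np.log(p).sum() * (n + 1) * np.ones(L) for n in range(N)])

p_t = np.array([1.0, 1.5, 0.8])
x = rng.uniform(1.0, 2.0, size=(K, L))        # individual regressors x_kt

# Individual model (1): y_kt = (I_N ⊗ x_kt') beta(p_t, theta)
Y = np.stack([np.kron(np.eye(N), xk) @ beta(p_t) for xk in x])

# Aggregate model (2): the same beta, applied to averaged regressors
y_bar_direct = Y.mean(axis=0)
y_bar_model = np.kron(np.eye(N), x.mean(axis=0)) @ beta(p_t)

assert np.allclose(y_bar_direct, y_bar_model)
```

Because the model is linear in x_kt for fixed p_t, the average of the individual systems is exactly the aggregate system; no distributional assumption on x_kt is needed.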

As an example of exact aggregation models we first consider the linear model that underlies previous discussions of pooling cross section and time series data:

y_nkt = p_t' θ_1n + z_kt' θ_2n,   (n = 1, 2 ... N),

where θ_1n and θ_2n are vectors of parameters. In this example the vector of parameters θ of models (1) and (2) includes the elements of θ_1n and θ_2n (n = 1, 2 ... N). The vector of coefficients β_n(p_t, θ)' is (p_t' θ_1n, θ_2n') and the vector of observations x_kt' is (1, z_kt').

Demand analysis provides many examples of nonlinear exact aggregation models. In each of these examples the theory of consumer behavior implies constraints on the parameters of the model that are incorporated through the form of the coefficients β_n(p_t, θ) (n = 1, 2 ... N). Demand systems generated by the Gorman polar form of the indirect utility function are nonlinear exact aggregation models. Specific examples include the linear expenditure system introduced by Klein and Rubin (1947-1948) and implemented by Stone (1954), the S-branch utility tree of Brown and Heien (1972), and the generalization of the S-branch utility tree of Blackorby, Boyce, and Russell (1978).

As an illustration, the linear expenditure system can be written in exact aggregation form as follows:

y_nkt = (p_nt c_n - b_n Σ_j c_j p_jt) + b_n M_kt,   (n = 1, 2 ... N),

where y_nkt denotes expenditure on the nth commodity by the kth individual in period t and p_nt is the price of this commodity (n = 1, 2 ... N); M_kt is total expenditure on all commodities. The vector of parameters θ includes the parameters b_n and c_n (n = 1, 2 ... N), the vector of coefficients β_n(p_t, θ)' is (p_nt c_n - b_n Σ_j c_j p_jt, b_n), and the vector of observations x_kt' is (1, M_kt).
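The exact aggregation form above is algebraically identical to the familiar Stone-Geary form of the linear expenditure system. A small sketch (numpy; the parameter values are illustrative, with the b_n chosen to sum to one so that expenditures exhaust the budget) checks the identity:

```python
import numpy as np

# Illustrative parameter values for N = 3 commodities (b_n sums to one in the LES)
b = np.array([0.2, 0.3, 0.5])
c = np.array([1.0, 2.0, 0.5])
p = np.array([1.5, 0.8, 2.0])    # prices p_t
M = 10.0                         # total expenditure M_kt of one individual

# Exact aggregation form: beta_n(p_t, theta)' = (p_nt c_n - b_n * sum_j c_j p_jt, b_n)
# applied to the regressors x_kt' = (1, M_kt)
intercept = p * c - b * (c @ p)
y_exact_agg = intercept * 1.0 + b * M

# Standard Stone-Geary form of the LES for comparison
y_les = p * c + b * (M - p @ c)

assert np.allclose(y_exact_agg, y_les)
assert np.isclose(y_exact_agg.sum(), M)   # expenditures add up since sum(b) = 1
```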

More complex nonlinear exact aggregation models have recently been introduced by Deaton and Muellbauer (1980a, 1980b) and by Jorgenson, Lau, and Stoker (1980, 1981, 1982). The AIDS models of Deaton and Muellbauer can be written:

y_nkt = (a_n + Σ_j c_nj ln p_jt - b_n ln P_t) M_kt + b_n M_kt ln M_kt,   (n = 1, 2 ... N),

where y_nkt, M_kt, and p_nt are defined as in the linear expenditure system and:

ln P_t = Σ_j a_j ln p_jt + (1/2) Σ_n Σ_j c_nj ln p_nt ln p_jt

is a price index. The vector of parameters θ includes the parameters a_n, b_n, c_nj (n, j = 1, 2 ... N), the vector of coefficients β_n(p_t, θ)' is (a_n + Σ_j c_nj ln p_jt - b_n ln P_t, b_n), and the vector of observations x_kt' is (M_kt, M_kt ln M_kt).
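The AIDS coefficients depend on p_t alone, so the model fits the exact aggregation mold with regressors (M_kt, M_kt ln M_kt). A sketch (numpy; the parameter values are illustrative and are chosen to satisfy the usual AIDS adding-up restrictions, which are not stated explicitly above) confirms that the implied expenditures exhaust the budget:

```python
import numpy as np

# Illustrative AIDS parameters for N = 3 goods, chosen to satisfy the standard
# adding-up restrictions: sum_n a_n = 1, sum_n b_n = 0, sum_n c_nj = 0.
a = np.array([0.4, 0.35, 0.25])
b = np.array([0.05, -0.02, -0.03])
C = np.array([[ 0.10, -0.04, -0.06],
              [-0.04,  0.08, -0.04],
              [-0.06, -0.04,  0.10]])
p = np.array([1.2, 0.9, 1.5])
M = 20.0

lnp = np.log(p)
lnP = a @ lnp + 0.5 * lnp @ C @ lnp           # the AIDS price index ln P_t

# Coefficient vector beta_n(p_t, theta)' = (a_n + sum_j c_nj ln p_jt - b_n ln P_t, b_n)
# applied to the regressors x_kt' = (M_kt, M_kt ln M_kt)
coef = np.column_stack([a + C @ lnp - b * lnP, b])
x = np.array([M, M * np.log(M)])
y = coef @ x                                  # expenditures on the N goods

assert np.isclose(y.sum(), M)                 # budget shares sum to one
```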

The translog model of Jorgenson, Lau and Stoker can be represented in the form:

y_nkt = [(a_n + Σ_j b_nj ln p_jt) / D(p_t)] M_kt + [b_nM / D(p_t)] M_kt ln M_kt + Σ_s [b_ns / D(p_t)] M_kt A_skt,

(n = 1, 2 ... N),

where y_nkt, M_kt and p_nt are defined as above, A_skt (s = 1, 2 ... S) represents demographic characteristics such as family size, age of head of household, and so on, and:

D(p_t) = -1 + Σ_j b_Mj ln p_jt.

In this example the vector θ consists of the parameters a_n, b_nj, b_nM, b_ns (n, j = 1, 2 ... N; s = 1, 2 ... S), the vector of coefficients β_n(p_t, θ)' is

((a_n + Σ_j b_nj ln p_jt) / D(p_t), b_nM / D(p_t), b_n1 / D(p_t) ... b_nS / D(p_t)),

and the vector of observations x_kt' is (M_kt, M_kt ln M_kt, M_kt A_1kt, M_kt A_2kt ... M_kt A_Skt).
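The translog coefficients again depend on p_t alone, through the common denominator D(p_t). A sketch (numpy; the parameter values are illustrative and satisfy the standard normalizations of the model, which are assumed here rather than stated in the text above) shows that the D(p_t) terms cancel in the budget total:

```python
import numpy as np

# Illustrative translog parameters for N = 3 goods and S = 1 demographic
# attribute, chosen to satisfy the standard normalizations:
# sum_n a_n = -1, sum_n b_nM = 0, sum_n b_ns = 0, and b_Mj = sum_n b_nj.
a  = np.array([-0.3, -0.3, -0.4])
B  = np.array([[ 0.06, -0.02, -0.01],
               [-0.02,  0.05, -0.02],
               [-0.01, -0.02,  0.04]])     # b_nj
bM = np.array([0.04, -0.01, -0.03])        # b_nM
bA = np.array([0.02,  0.01, -0.03])        # b_ns (a single column, S = 1)
p  = np.array([1.1, 0.9, 1.3])
M, A = 15.0, 2.0                           # total expenditure, demographic attribute

lnp = np.log(p)
D = -1.0 + B.sum(axis=0) @ lnp             # D(p_t), using b_Mj = sum_n b_nj

# beta_n(p_t, theta)' applied to x_kt' = (M_kt, M_kt ln M_kt, M_kt A_kt)
coef = np.column_stack([(a + B @ lnp) / D, bM / D, bA / D])
x = np.array([M, M * np.log(M), M * A])
y = coef @ x

assert np.isclose(y.sum(), M)              # expenditures exhaust the budget
```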

In this paper we focus on the implications of nonlinearity for the pooling of cross section and average time series data. In Section 2 we consider the stochastic specification of exact aggregation models (1) and (2). In Section 3 we present and characterize the nonlinear three stage least squares estimator for pooled time series and cross section observations. In Section 4 we discuss hypothesis testing and in Section 5 we consider estimation subject to inequality constraints. We close with a brief summary of the results and a discussion of applications.

2. Stochastic Specification. We begin by considering average observations for T time periods and a single cross section of K' individual observations. We assume that the observations are generated by exact aggregation models (1) and (2) with additive disturbance terms. Given the stochastic specification of the disturbance terms, the observations must be transformed to obtain disturbances that are homoscedastic and uncorrelated across observations.

For pooling of cross section and average time series data the transformation of observations to obtain homoscedastic and uncorrelated disturbances can be divided into two steps. The first step separates the data sets by transforming the average data so that time series disturbances are uncorrelated with cross section disturbances. The second step transforms the resulting data sets to a form where disturbances in each data set are homoscedastic and uncorrelated. We present the transformation for the first step explicitly, indicating the features of this transformation that result in increased efficiency. The second step involves standard techniques for transformation, which we illustrate by example.

We assume that individual observations are generated by the exact aggregation model (1) with an additive random component, say ε_kt:

y_kt = (I_N ⊗ x_kt') β(p_t, θ) + ε_kt.   (1')

We assume that the disturbance term ε_kt is distributed with mean zero and is uncorrelated across individuals, so that:

E(ε_kt ε'_k't') = 0,   k ≠ k'.

Any systematic correlation among individuals is assumed to be captured by selection of the variables x_kt. The disturbance term ε_kt is assumed to have variance Ω_ε and time series covariance structure E(ε_kt ε'_kt') = C_tt' Ω_ε, t ≠ t'. A wide variety of alternative time series structures for ε_kt can be represented by choosing an appropriate form for the matrix C_tt'.

We could obtain a stochastic version of the exact aggregation model (2) by averaging the individual observations in (1') for each time period. This would be the appropriate procedure if the average data were constructed by averaging the individual observations. However, we must allow for alternative methods for constructing the aggregate data. In demand analysis, for example, data on aggregate personal consumption expenditures are obtained from production accounts for the economy as a whole rather than by direct observation of quantities consumed by the entire population of individual households.

To allow for differences in methods of construction of the individual and aggregate data we introduce an additive random component ν_t into the exact aggregation model (2) for each time period. The model relating the averaged data ȳ_t to x̄_t and p_t is then:

ȳ_t = (I_N ⊗ x̄_t') β(p_t, θ) + u_t,   (2')

where u_t = ν_t + ε̄_t and ε̄_t is a vector of N averaged disturbances (ε̄_t = Σ_k ε_kt / K). The stochastic term ν_t is assumed to be distributed independently of ε_kt with mean zero, variance Ω_ν, and time series covariance structure E(ν_t ν'_t') = Ω^ν_tt' for t ≠ t'. To accommodate a variety of time series covariance structures for u_t we have:

E(u_t u'_t') = Ω^ν_tt' + (1/K) C_tt' Ω_ε.

In order to present methods for pooling cross section and time series data we consider a sample of K' individual observations, drawn from a single cross section period, say t̄. We can "stack" the equations (1') to obtain:

Y = (I_N ⊗ X) β(p_t̄, θ) + ε,   (3)

where Y is the vector of observations {y_nkt̄}, X is the matrix with rows {x_kt̄'}, and ε is the vector of disturbances with mean zero and covariance matrix Ω_ε ⊗ I_K'. Similarly, we can represent the equations (2') in the form:

Ȳ = f(θ) + u,   (4)

where Ȳ is the vector of averaged observations {ȳ_t},

f(θ) = [ (I_N ⊗ x̄_1') β(p_1, θ) ]
       [ (I_N ⊗ x̄_2') β(p_2, θ) ]
       [            ...          ]
       [ (I_N ⊗ x̄_T') β(p_T, θ) ]

and u is the vector of disturbances.

The first step in the transformation of observations eliminates the correlation between ε and u:

E(u_t ε'_kt̄) = (1/K) C_tt̄ Ω_ε,   (k = 1, 2 ... K'; t = 1, 2 ... T),   (5)

where t̄ denotes the cross section period. This correlation is removed by a nonsingular transformation of (3) and (4), which is equivalent to replacing ȳ_t, x̄_t and u_t in (2') by:

ȳ°_t = ȳ_t - (K'/K) C_tt̄ ȳ_cs,

x̄°_t = x̄_t - (K'/K) C_tt̄ x̄_cs,   (6)

u°_t = u_t - (K'/K) C_tt̄ ε̄_cs,

where ȳ_cs, x̄_cs and ε̄_cs denote the cross section averages of y_kt̄, x_kt̄ and ε_kt̄. The resulting disturbances u°_t are now uncorrelated with ε_kt̄ (k = 1, 2 ... K'), but have a more complicated time series structure than the original disturbances:

E(u°_t u°'_t') = Ω^ν_tt' + (1/K) C_tt' Ω_ε - (K'/K²) C_tt̄ C_t't̄ Ω_ε.   (7)
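The algebra behind (5)-(7) can be verified by exact covariance propagation. The sketch below (numpy; a scalar N = 1 setting with illustrative values, and a deliberately small population so that K'/K is not negligible) rebuilds the covariance of u°_t from the definitions, checks that it is uncorrelated with the sampled disturbances, and checks it against (7):

```python
import numpy as np

# Exact covariance bookkeeping for the first-step transformation (scalar case).
K, Kp = 50, 10                     # population size K and sample size K'
om_e, om_nu = 2.0, 0.5             # Omega_eps and Omega_nu
C = np.array([[1.0, 0.4],          # C_tt' for T = 2 periods (C_tt = 1)
              [0.4, 1.0]])
tbar = 0                           # the cross section period

def cov_u(t, s):                   # E(u_t u_s') = Omega^nu_ts + C_ts Omega_eps / K
    return (om_nu if t == s else 0.0) + C[t, s] * om_e / K

def cov_u_ecs(t):                  # E(u_t eps_bar_cs'), cf. equation (5) averaged over k
    return C[t, tbar] * om_e / K

var_ecs = om_e / Kp                # Var(eps_bar_cs) over the K' sampled individuals

def cov_u0(t, s):                  # covariance of u0_t = u_t - (K'/K) C_ttbar eps_bar_cs
    a_t = (Kp / K) * C[t, tbar]
    a_s = (Kp / K) * C[s, tbar]
    return cov_u(t, s) - a_t * cov_u_ecs(s) - a_s * cov_u_ecs(t) + a_t * a_s * var_ecs

for t in range(2):
    # u0_t is uncorrelated with each sampled disturbance eps_ktbar ...
    cov_u0_ek = C[t, tbar] * om_e / K - (Kp / K) * C[t, tbar] * (om_e / Kp)
    assert np.isclose(cov_u0_ek, 0.0)
    for s in range(2):             # ... and its covariance matches equation (7)
        f7 = (om_nu if t == s else 0.0) + C[t, s] * om_e / K \
             - (Kp / K**2) * C[t, tbar] * C[s, tbar] * om_e
        assert np.isclose(cov_u0(t, s), f7)
```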

The second step in the transformation of observations is to apply a nonsingular transform to the average data in (4) to obtain disturbances that are homoscedastic and uncorrelated. We illustrate this transformation below by example. We assume that the transformation has been performed, altering the model (4) to:

Ȳ* = f*(θ) + u*,   (8)

where u* is distributed with mean zero and variance Ω_u* ⊗ I_T. For estimation, we stack the equation systems (3) and (8):

Ỹ = f̃(θ) + U,   (9)

where U' = (ε', u*'), which is distributed with mean zero and variance:

Σ = [ Ω_ε ⊗ I_K'        0       ]   (10)
    [      0        Ω_u* ⊗ I_T  ]

The implementation of the transformations described above requires consistent estimates of the variances and covariances Ω_ε, C_tt', Ω^ν_tt' (t, t' = 1, 2 ... T). In general, these estimates require specific models of the processes generating the disturbances. The purpose of the transformations is to assure efficiency in estimation. Equation (2') shows that the contribution of the individual errors ε_kt to the covariance structure of u_t is likely to be negligible unless the matrices Ω^ν_tt' are the same order of magnitude as (1/K) C_tt' Ω_ε, where K is the population size. The benefits of performing the transformation (6) depend on the size of the cross section relative to the population. In many applications, K'/K will be extremely small, so that the transformation (6) leaves the observations unaffected. Typical numbers for an analysis of U.S. household demand behavior are K' = 10,000 and K = 70 million. Consequently, only when the cross section sample size is of the same order of magnitude as the size of the population will the correction yield significant benefits; otherwise it can be ignored.

The following examples illustrate different error structures, where we assume K'/K is very small. We take C_tt' = 0, t ≠ t', for simplicity, deferring further discussion of this time series structure until we have presented the examples. In Examples 1 and 2 we take Ω^ν_tt' = 0 for t ≠ t'.

Example 1 (Random Individual Errors): Suppose that ν_t arises because of an additional random component ν_kt at the individual level, which is distributed with mean zero and variance Ω_ν, so that ν_t = Σ_k ν_kt / K. Then u_t = Σ_k (ν_kt + ε_kt) / K, with:

E(u_t u'_t') = (1/K)(Ω_ν + Ω_ε),   t = t',
            = 0,                   t ≠ t'.

The second stage transformation is just a grouping correction, with u*_t of (8) given as u*_t = √K u_t, with Ω_u* = Ω_ν + Ω_ε.
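The grouping correction can be checked by exact covariance propagation, Cov(Aw) = A Cov(w) A', for the averaging map; the numerical values below are illustrative:

```python
import numpy as np

# The grouping correction of Example 1: u_t = sum_k (nu_kt + eps_kt) / K has
# variance (Omega_nu + Omega_eps) / K, so scaling by sqrt(K) restores Omega_u*.
K, om_nu, om_e = 200, 0.5, 2.0

A = np.full((1, 2 * K), 1.0 / K)                  # averages the 2K independent components
Sigma_w = np.diag([om_nu] * K + [om_e] * K)       # Cov of (nu_1t ... nu_Kt, eps_1t ... eps_Kt)
var_u = (A @ Sigma_w @ A.T).item()                # Var(u_t) = (om_nu + om_e) / K

assert np.isclose(var_u, (om_nu + om_e) / K)
assert np.isclose(K * var_u, om_nu + om_e)        # Var(sqrt(K) u_t) = Omega_u*
```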

Example 2 (Common Time Effect): Suppose that ν_t represents a common disturbance in the aggregate data with Ω^ν_tt = Ω_ν for all t. In practice one will usually encounter K Ω_ν >> Ω_ε, so that u_t = ν_t for purposes of estimation. Here no second stage correction is necessary, with u*_t = u_t and Ω_u* = Ω_ν.

Example 3 (Autocorrelated Common Time Effect): Suppose that Example 2 is modified to ν_t = γ ν_{t-1} + ω_t, where ω_t is distributed with mean zero and variance Ω_ω and is uncorrelated over time, with K Ω_ω >> Ω_ε. Then Ω_ν = Ω_ω / (1 - γ²). As above, the contribution of Σ_k ε_kt / K to u_t is negligible, so that u_t = ν_t. The second stage correction is now quasi-first differencing, replacing ȳ_t and x̄_t by ȳ_t - γ ȳ_{t-1} and x̄_t - γ x̄_{t-1} (with the standard adjustment to the first observation). Of course, u*_t = ω_t and Ω_u* = Ω_ω in this case.
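The quasi-first differencing of Example 3 can be written as a matrix transform and checked exactly: for a stationary AR(1) common effect, the Prais-Winsten matrix (with the standard √(1 - γ²) adjustment to the first observation) whitens the covariance Ω_ν γ^|t-s| to Ω_ω I_T. A sketch with illustrative values:

```python
import numpy as np

# AR(1) common effect nu_t = gamma * nu_{t-1} + omega_t, stationary covariance
# Cov(nu_t, nu_s) = Omega_nu * gamma^|t-s| with Omega_nu = Omega_w / (1 - gamma^2).
T, gamma, om_w = 5, 0.6, 1.0

om_nu = om_w / (1.0 - gamma**2)
t = np.arange(T)
Gamma = om_nu * gamma ** np.abs(t[:, None] - t[None, :])

# Prais-Winsten transform: sqrt(1 - gamma^2) on the first observation,
# then rows nu_t - gamma * nu_{t-1}.
P = np.eye(T) - gamma * np.eye(T, k=-1)
P[0, 0] = np.sqrt(1.0 - gamma**2)

assert np.allclose(P @ Gamma @ P.T, om_w * np.eye(T))   # whitened to Omega_w I_T
```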

Now suppose that C_tt' ≠ 0, so that we have a nontrivial time series correlation structure for ε_kt. In Examples 2 and 3 above, the effect of C_tt' ≠ 0 would be negligible, due to the unimportance of Σ_k ε_kt / K in u_t. In Example 1, however, the time series structure is potentially important, since √K u_t will have the same time series covariance structure as ε_kt and would require consideration in the second stage of the transformation of observations.

Example 3 illustrates the cost of pooling with very general error structures. In particular, in Example 3 the parameter γ is best relabeled as a component of θ, with the transformed error covariance structure now determined by Ω_ε and Ω_u* = Ω_ω. The treatment of autocorrelation will involve augmenting the list of parameters to be estimated, with the remaining error structure characterized by Ω_ε and Ω_u*. This modeling approach is standard practice in time series analysis. Consequently, in Section 3 we discuss only the consistent estimation of the parameters Ω_ε and Ω_u*, which we will regard as positive definite but otherwise unrestricted.

Before discussing the additional assumptions required for estimation of the complete model, we introduce instrumental variables. It is often appropriate to treat the variables x_kt and p_t as endogenous for the individual observations, the aggregate observations, or both. This can occur when the model is a simultaneous equations model in exact aggregation form or part of a larger system of simultaneous equations. For example, in demand analysis observations on prices can reflect both supply and demand influences, requiring aggregate instruments. Alternatively, in a study of savings, errors in variables may necessitate instruments for the individual data, while in the average data such errors may be negligible.

We assume that there are vectors of observations on instrumental variables, say {z_kt} and {z̄_t}. Denote as Z_1 and Z_2 the matrices with rows z_kt' and z̄_t' respectively, and as Z the block diagonal matrix:

Z = [ Z_1   0  ]
    [  0   Z_2 ]

Finally, we must introduce regularity assumptions in order to characterize the NL3SLS estimator. We include these in the Appendix. The assumptions are that the coefficient functions β(p_t, θ) are twice continuously differentiable in the components of θ, that the moment matrices defining the NL3SLS objective function converge to stable, well behaved limits, and that the parameter vector θ is identified. We collect all components of θ identified in the cross section in one set, all parameters identified in the time series in a second set, and all the remaining parameters in a third set.

3. The Nonlinear Three Stage Least Squares Estimator. The NL3SLS estimator θ̂ of θ* is found as the value of θ which minimizes:

S(θ) = [Ỹ - f̃(θ)]' [Σ̂⁻¹ ⊗ Z(Z'Z)⁻¹Z'] [Ỹ - f̃(θ)],   (11)

where:

Σ̂ = [ Ω̂_ε    0    ]
     [  0    Ω̂_u*  ]

is a consistent estimator of Σ as K', T → ∞.
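The structure of (11) can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: a single-equation (N = 1) exact aggregation model with a coefficient vector that is nonlinear in θ, pooled cross section and time series moments weighted by the diagonal blocks of Σ̂, and the objective minimized numerically. All data, instruments, the coefficient function, and the variance values are simulated and illustrative, and the first-stage variance estimates are simply taken as known for brevity.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Hypothetical coefficient function beta(p, theta)' = (theta_1 ln p, exp(theta_2)),
# shared by the individual and averaged models, applied to x' = (1, M).
theta_true = np.array([2.0, -0.5])
def beta(p, th):
    return np.array([th[0] * np.log(p), np.exp(th[1])])

T, Kp = 40, 300
p_ts = rng.uniform(0.5, 2.0, T)               # prices p_t
Mbar = rng.uniform(5.0, 10.0, T)              # averaged regressor xbar_t' = (1, Mbar_t)
M_cs = rng.uniform(2.0, 15.0, Kp)             # individual M_k in the cross section period
p_cs = p_ts[0]                                # cross section price p_tbar

y_cs = np.array([np.array([1.0, m]) @ beta(p_cs, theta_true) for m in M_cs])
y_cs += 0.3 * rng.standard_normal(Kp)
y_ts = np.array([np.array([1.0, Mbar[t]]) @ beta(p_ts[t], theta_true) for t in range(T)])
y_ts += 0.1 * rng.standard_normal(T)

Z1 = np.column_stack([np.ones(Kp), M_cs])                # individual instruments
Z2 = np.column_stack([np.ones(T), np.log(p_ts), Mbar])   # aggregate instruments
P1 = Z1 @ np.linalg.solve(Z1.T @ Z1, Z1.T)               # Z(Z'Z)^{-1}Z', block by block
P2 = Z2 @ np.linalg.solve(Z2.T @ Z2, Z2.T)
s_e, s_u = 0.3**2, 0.1**2                                # "consistent" variance estimates

def S(th):   # pooled NL3SLS objective, the N = 1 specialization of equation (11)
    e = y_cs - np.array([np.array([1.0, m]) @ beta(p_cs, th) for m in M_cs])
    u = y_ts - np.array([np.array([1.0, Mbar[t]]) @ beta(p_ts[t], th) for t in range(T)])
    return (e @ P1 @ e) / s_e + (u @ P2 @ u) / s_u

th_hat = minimize(S, x0=np.zeros(2), method="Nelder-Mead").x
assert np.max(np.abs(th_hat - theta_true)) < 0.2
```

Both data sets contribute to the same θ: the time series variation in p_t identifies θ_1 and the cross section sharpens the estimate of θ_2, which is the point of pooling.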
