WORKING PAPER
ALFRED P. SLOAN SCHOOL OF MANAGEMENT
$T_1$ SCALING
A Mathematical Programming Approach
to Thurstonian Scaling
James M. Lattin
WP 1183-81
January 1981
MASSACHUSETTS
INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MASSACHUSETTS 02139
My special thanks go to Professors Al Silk, Tom
Magnanti, and Roy Welsch for their support and advice
throughout the several stages of this work. I also
thank Paul Mireault and Michael Abraham for their
comments and insight.
ABSTRACT
We present a new technique, called $T_1$ scaling, for
determining scale estimates from paired comparisons data. We
present the new method in conjunction with a sensitivity
diagnostic that ascertains the extent to which intransitive
elements in the data influence the scale estimates from the
Thurstonian judgment scaling model. The $T_1$ scale estimates,
based upon the minimization of absolute deviations rather than
least squares, are relatively insensitive to the presence of
limited inconsistency. We apply the new solution technique,
shown to be a straightforward minimum cost network flow problem,
to several scaling problems in the literature. When no single
limited source of inconsistency is indicated, the scale estimates
thus obtained are consistent with the least squares estimates.
When isolated departures from the scaling model or possible data
errors are present, the $T_1$ procedure remains largely insensitive
to their presence, preserving the interval scale properties of
the estimates.
Introduction
This paper presents a new solution technique, called $T_1$
scaling, for determining scale estimates from paired comparisons
data. The development of the $T_1$ procedure was motivated by a
concern for the substantial influence of intransitive elements in
the data on the final solution produced by the least squares
approach of the Thurstonian judgment scaling model (Thurstone,
1927a). The new technique utilizes a discrete $L_1$ linear
approximation based upon the minimization of absolute deviations
(see Barrodale, 1968, and Barrodale and Roberts, 1973), where the
special structure of the scaling problem allows it to be solved
quite efficiently as a minimum cost network flow problem, using
standard techniques presented in such sources as Bradley, Hax,
and Magnanti (1977) or Shapiro (1979). The scale estimates thus
provided are in some sense more "robust" than in the traditional
approach in that they discount the influence of limited
inconsistency in the data.
The balance of the paper follows in several sections.
The first section reviews the least squares solution technique
for the Thurstonian judgment scaling model. The second section
reviews Mosteller's goodness-of-fit measure for the least squares
estimates (Mosteller, 1951), and demonstrates how seriously this
fit deteriorates in the presence of limited inconsistency. In
order to do this, we develop a sensitivity diagnostic to assess
the relative influence of each pair of items on the determination
of the least squares estimates. The third section presents the
robust $T_1$ scaling technique and shows that obtaining these scale
estimates is equivalent to solving a minimum cost network flow
problem. Finally, we apply the $T_1$ approach to several problems
from the literature, and compare the results to the least squares
scale estimates.
I. The Law of Comparative Judgment and the Thurstonian Judgment
Scaling Model
In the typical judgment scaling problem, we are presented
with k different objects, each exhibiting some degree of a
certain common characteristic. If this characteristic, such as
"height", "weight", or "age", is a directly measurable quality
of singular dimensionality, then we can order these k objects by
placing them along a continuum at the measured value of their
common characteristic. The positions of these objects, or scale
values, have the properties of the measurement scale. For
example, objects ordered on the basis of height or weight have
scale values with ratio properties, while objects ordered on the
basis of temperature in degrees Fahrenheit have interval properties.
When the objects share a common characteristic that is
not directly measurable, such as "beauty" or "softness", the
ordering of the objects must depend upon some subjective estimate
of the common characteristic exhibited by each object. In order
to facilitate the process of ordering the objects along a
continuum without an apparent scale, the method of paired
comparisons is used to exact a set of relative judgments from an
observer. Thus, for any given pair of objects the observer is
required only to judge which of the two exceeds the other with
respect to the underlying characteristic. This set of pairwise
judgments is used to determine scale values with interval
properties.
The law of comparative judgment established the
theoretical foundations for Thurstone's judgment scaling model.
Each object, when presented to the observer, acts as a stimulus
which excites a certain discriminal process within the observer.
Due to changing conditions in the experimental situation or
fluctuations within the observer, the same stimulus might trigger
a slightly different process, such that the position of the
stimulus on the specific psychological continuum is not always
the same. For example, an observer's subjective estimate of the
"beauty" of an object might be different when presented with the
object a second time, on account of the observer's mood, the time
of day, or the temperature of his surroundings.
In the Thurstonian model, the distribution of these
subjective estimates along the continuum is postulated to be
normal. The standard deviation of this distribution is called
the discriminal dispersion of the stimulus, and the mean is taken
to be the true scale value. The distributions of two stimuli, i
and j, might thus be represented as in Figure I.1. The scale
values are $s_i$ and $s_j$, and the discriminal dispersions are the
standard deviations, $\sigma_i$ and $\sigma_j$. The discriminal processes within
the observer are random variables denoted $d_i$ and $d_j$.
It is now possible to talk about a discriminal
difference, $(d_j - d_i)$, for any pair of stimuli i and j. If i and
j are presented to an observer a large number of times, the
discriminal differences will also form a normal distribution,
with standard deviation $\sigma_d = (\sigma_i^2 + \sigma_j^2 - 2 r_{ij}\sigma_i\sigma_j)^{1/2}$, where $r_{ij}$
is the correlation between the discriminal processes associated
with i and j.
FIGURE I.1
The discriminal distributions for stimuli i and j,
centered about the true scale values $s_i$ and $s_j$.
The observer, of course, is unable to assign a value to
the position of the stimulus along the appropriate psychological
continuum, but when presented with two stimuli, he is able to
judge which of the two is greater. In some cases, because the
distributions overlap, the observer may judge stimulus i to be
greater than j even though $s_j$ is actually greater than $s_i$. Over
a large number of comparisons, it is possible to determine the
approximate proportion of times stimulus j is judged greater
than i. These proportions are then used to determine the
relative positions of $s_i$ and $s_j$ on the continuum measuring their
common quality.
FIGURE I.2
The discriminal difference between stimuli i and j.

Figure I.2 shows the distribution of the discriminal
difference between i and j, where the shaded portion of the curve
indicates the proportion of times stimulus j appears greater than
stimulus i to the observer. The value $x_{ij}$ is the difference
between $s_i$ and $s_j$ measured in $\sigma_d$ units. Hence, $s_j - s_i = x_{ij}\sigma_d$,
or in its final form:
(1)    $s_j - s_i = x_{ij}(\sigma_i^2 + \sigma_j^2 - 2 r_{ij}\sigma_i\sigma_j)^{1/2}$
In this form, without limiting assumptions, the law of
comparative judgment is not solvable, as there are many more
unknowns than equations. Thurstone presented five cases of the
law, introducing certain assumptions into the model to make it
tractable. His case V is the most restrictive, assuming constant
standard deviations for all of the discriminal dispersions, and
no correlation between any of the discriminal processes (implying
zero covariance between stimuli). Mosteller (1951) later showed
that the assumption of equal correlations between processes leads
to a formulation equivalent to Thurstone's case V. In either
case, the unit of measurement for the psychological scale may be
determined arbitrarily; hence, the constant modifying the $x_{ij}$
term in (1) may be taken to be unity, leaving
(2)    $s_j - s_i = x_{ij}$
The law of comparative judgment, limited by assumptions
of equal dispersions and equal covariances, is most frequently
estimated using paired comparisons data. In this procedure, we
present each pair of stimuli to the observer a large number of
times, as described above. If we wish to examine the collective
discriminal process of an entire population, we present each
stimulus pair i,j to each individual in the population only once.
Thurstone, in his case II of the law of comparative judgment,
showed that the same formulation holds true for either approach,
under certain assumptions of homogeneity.
The observed proportion of times stimulus j exceeds i,
$p'_{ij}$, forms the matrix P'. Matrix P' has the property that
symmetric cells must sum to one; hence, $p'_{ij} + p'_{ji} = 1$. Matrix
P' determines matrix X', where each element $x'_{ij}$ is the unit
normal deviate corresponding to the observed proportion $p'_{ij}$. If
the range of stimuli along the psychological continuum is large
relative to the discriminal dispersion, there may in fact be
cases where one stimulus is never judged greater than another.
If stimulus i is judged less than j every time the pair is
presented, the value $p'_{ij}$ will equal one and the value $x'_{ij}$ will
approach infinity. This problem of an incomplete X' matrix is
usually solved by establishing lower and upper bounds on $p'_{ij}$ of
.01 and .99, thus ensuring stability of the resulting scale
estimates.
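The conversion from observed proportions to unit normal deviates, including the .01 and .99 bounds, can be sketched in a few lines of Python. This is an illustration only: the function name `proportions_to_deviates` and the nested-list layout (row i, column j holding the proportion of times j exceeds i) are assumptions made for the example, not part of the original procedure.

```python
from statistics import NormalDist


def proportions_to_deviates(P, lower=0.01, upper=0.99):
    """Map a k-by-k matrix of observed proportions to unit normal deviates.

    P[i][j] is assumed to hold the proportion of times stimulus j was
    judged greater than stimulus i. Proportions are clipped to
    [lower, upper] so that no deviate is infinite.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    k = len(P)
    X = [[0.0] * k for _ in range(k)]  # diagonal stays at zero
    for i in range(k):
        for j in range(k):
            if i != j:
                p = min(max(P[i][j], lower), upper)
                X[i][j] = std_normal.inv_cdf(p)
    return X


# Hypothetical two-stimulus case in which j is always judged greater than i.
X = proportions_to_deviates([[0.5, 1.0], [0.0, 0.5]])
```

Because symmetric cells of P' sum to one, the clipped deviates remain skew symmetric up to rounding, which is what the column-mean solution relies on.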
From (2) above, we see that the difference between the
estimates of any two scale values, $s'_i$ and $s'_j$, gives us $x''_{ij}$, an
estimate of the observed value $x'_{ij}$, as shown:
(3)    $s'_j - s'_i = x''_{ij}$
With errorless data, we can choose scale estimates $s'_j$ and $s'_i$ so
that the estimates $x''_{ij}$ will equal the observed $x'_{ij}$. Typically,
differences between the observed proportions and the true values
lead to a difference between $x''_{ij}$ and $x'_{ij}$, no matter how we
choose $s'_i$ and $s'_j$. Thurstone chose his scale estimates to
minimize Q, the sum of the squared deviations between $x''_{ij}$ and
$x'_{ij}$:
(4)    $Q = \sum_i \sum_j (x'_{ij} - x''_{ij})^2$
Substituting (3) into the equation above:
(5)    $Q = \sum_i \sum_j (x'_{ij} - s'_j + s'_i)^2$
Because X' is skew symmetric, minimizing (5) over row sums is
equivalent to minimizing over column sums, so Thurstone limited
his analysis to the columns of X'.
Differentiating Q with respect to $s'_j$ gives:
(6)    $\frac{\partial Q}{\partial s'_j} = -2 \sum_i (x'_{ij} - s'_j + s'_i)$
Setting the partial derivative to zero and solving for $s'_j$ gives:
(7)    $s'_j = \frac{1}{k}\sum_i x'_{ij} + \frac{1}{k}\sum_i s'_i$
where k is the number of objects in the scaling problem. The
rightmost term in (7) is simply the mean of the estimated scale
values. Because the origin of the psychological continuum is
arbitrary, we can take it to be the mean of the $s'_i$, giving:
(8)    $s'_j = \frac{1}{k}\sum_i x'_{ij}$
Thus, the least squares estimates of the true scale values are
the column means of the matrix X'. Torgerson (1958) presents a
more detailed discussion of this derivation.
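The column-mean solution is simple enough to check numerically. The sketch below is illustrative only; the function names and the three hypothetical scale values are assumptions made for the example. It builds an errorless X' with entries equal to the difference between the column and row scale values, takes column means, and confirms that the sum of squared deviations Q is zero.

```python
def case_v_scale_estimates(X):
    """Least squares scale estimates: the column means of X'."""
    k = len(X)
    return [sum(X[i][j] for i in range(k)) / k for j in range(k)]


def sum_squared_deviations(X, s):
    """Q: squared gaps between observed deviates and fitted differences."""
    k = len(X)
    return sum(
        (X[i][j] - (s[j] - s[i])) ** 2
        for i in range(k)
        for j in range(k)
    )


# Hypothetical true scale values, used only to build errorless data.
s_true = [-0.2, 0.0, 0.3]
X = [[s_true[j] - s_true[i] for j in range(3)] for i in range(3)]

estimates = case_v_scale_estimates(X)
Q = sum_squared_deviations(X, estimates)
```

The estimates recover the hypothetical values up to a shift of origin: they are centered at zero, while the differences between them match the differences between the values used to build X'.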
II. The Effects of Inconsistency in the Observed Data on the
Determination of the Thurstonian Scale Estimates
The method of paired comparisons does not always produce a
set of data appropriate for use in a scaling model such as
Thurstone's. If there is perfect agreement among the judges on
the ordering of the k objects being compared, then it is not
possible to determine scale estimates with interval
characteristics. Another problem occurs when an observer or some
observers are particularly bad judges, or are poorly motivated to
take the care required to produce consistent comparisons. A
third problem occurs if the experimenter asks too much of his
observers; the objects may be so close together with respect to
their common quality that distinguishing them becomes almost a
guessing game. Finally, it is possible that the quality common
to the objects under examination is not representable as a linear
variate. When any one or several of these difficulties is
present, the reported preferences may contain intransitivities
called circular triads, where object i is judged greater than
object j, j is judged greater than k, yet k is ultimately judged
greater than i. Such an ordering is impossible to represent on a
single dimensional scale, and thus interferes with the process of
determining scale estimates.
Kendall and Babington Smith (1940) observed that
"[Thurstone's] method is appropriate where one is entitled to
assume a priori or by reason of precautions taken in the
selection of material that a linear variable is involved and that
there exist perceptible differences between the items presented
for comparison" (p. 342). They proposed a coefficient of
consistence, $\zeta$, where

$$\zeta = \begin{cases} 1 - 24d/(k^3 - k), & k \text{ odd} \\ 1 - 24d/(k^3 - 4k), & k \text{ even,} \end{cases}$$

where k is the number of objects and where d is the number of
circular triads reported. The coefficient equals one when the
comparisons data contain no inconsistencies and equals zero when
the maximum number of circular triads is present. Thus, a value
of $\zeta$ near zero indicates potentially troublesome departures from
the scaling model.
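For a single observer, the count of circular triads and the coefficient of consistence can be computed by direct enumeration. The following Python sketch is illustrative; the function names and the 0/1 matrix `beats` (with `beats[a][b] == 1` when object a is judged greater than object b) are assumptions introduced for the example.

```python
from itertools import combinations


def count_circular_triads(beats):
    """Count triples of objects whose pairwise judgments form a cycle."""
    k = len(beats)
    d = 0
    for a, b, c in combinations(range(k), 3):
        cycle_one = beats[a][b] and beats[b][c] and beats[c][a]
        cycle_two = beats[a][c] and beats[c][b] and beats[b][a]
        if cycle_one or cycle_two:
            d += 1
    return d


def coefficient_of_consistence(beats):
    """Coefficient of consistence for k >= 3 objects."""
    k = len(beats)
    d = count_circular_triads(beats)
    denominator = k**3 - k if k % 2 == 1 else k**3 - 4 * k
    return 1 - 24 * d / denominator


# Hypothetical judgments: 0 > 1, 1 > 2, yet 2 > 0 (one circular triad).
cyclic = [[0, 1, 0],
          [0, 0, 1],
          [1, 0, 0]]
zeta_cyclic = coefficient_of_consistence(cyclic)

# Hypothetical fully transitive judgments over four objects.
transitive = [[0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]]
zeta_transitive = coefficient_of_consistence(transitive)
```

With three objects a single circular triad is the maximum possible, so the cyclic example sits at the lower end of the coefficient's range, while the transitive example sits at the upper end.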
For paired comparisons with fewer than eight objects,
Kendall and Smith also calculated the probabilities that a number
of circular triads d or greater would occur under a completely
random ranking scheme. If a single observer reports a number of
circular triads d that is likely to have come from a process of
unsystematic (random) judgment, his ability to discriminate
between objects should be questioned; if a number of observers
do the same, then a problem may lie in the difficulty of the task
or in the dimensionality of the quality under judgment.
Even when paired comparisons data are free of complete
intransitivity, there is usually some form of inconsistency
present. In Figure II.1 below, three stimuli a, b, and c are
shown equally spaced along the appropriate psychological
continuum. By assumption, their discriminal dispersions are all
equal (Thurstone's case V), and because the origin for the scale
is arbitrary it has been placed at the middle scale value. If we
presented an observer with stimulus pair a,b and stimulus pair
b,c a total of n times each, it is unlikely (due to statistical
fluctuation) that the observer would report a>b exactly the same
number of times he reported b>c. Even so, while an observer may
judge a>b and b>c approximately the same number of times each, he
might judge a>c only a slightly higher number of times, not
necessarily consistent with placing c twice as far from a as from
b.
FIGURE II.1
True underlying model for the three-stimulus example, demonstrating
that inconsistency need not take the form of intransitivity.
With Thurstonian scale estimates, stimulus c is
positioned on the scale somewhat closer to b than suggested by
the proportion of times b>c, yet not so close to a as suggested
by the proportion of times a>c. This "compromise" leads to
discrepancies between the observed values $x'_{ij}$ and the values
$x''_{ij}$ derived from the scale estimates, leaving a question as to
the fit of the final result.
With fallible data, it is helpful to have a measure of
the goodness-of-fit of the least squares estimates. Mosteller
(1951) presented a chi-square significance test for the fit
between the observed proportions $p'_{ij}$ and the fitted proportions
$p''_{ij}$. These fitted proportions are derived from the scale
estimates; they represent the proportion of the time stimulus i
would be judged greater than stimulus j if the true scale values
were actually $s'_i$ and $s'_j$. We can use the unit normal table to
find the proportion $p''_{ij}$ corresponding to each $x''_{ij}$, and form the
matrix of fitted proportions P".
Mosteller suggested the arcsine transformation developed
by R. A. Fisher to establish a chi-square testing criterion.
Given proportions $p'_{ij}$ and $p''_{ij}$ from a binomial sample of size n,

$\theta'_{ij} = \arcsin\sqrt{p'_{ij}}$    and    $\theta''_{ij} = \arcsin\sqrt{p''_{ij}}$

are distributed with variance $821/n$
when $\theta'_{ij}$ and $\theta''_{ij}$ are expressed in degrees. Thus, Mosteller
suggests the following test of the goodness-of-fit of the
estimates:

$\chi^2 = \sum_{i>j} \frac{(\theta'_{ij} - \theta''_{ij})^2}{821/n}$

where n is the total number of times each stimulus pair is
presented. The test covers the elements in the lower triangular
matrix; thus, for a scaling problem involving k stimuli, the
statistic is distributed as $\chi^2((k-1)(k-2)/2)$.
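The test statistic can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the function name `mosteller_chi_square` and the nested-list inputs are hypothetical, the angles are taken in degrees so that the constant 821 applies, and only the lower triangle (i > j) is summed.

```python
import math


def mosteller_chi_square(P_obs, P_fit, n):
    """Return (chi-square statistic, degrees of freedom).

    P_obs and P_fit are k-by-k matrices of observed and fitted
    proportions; n is the number of presentations of each pair.
    """
    k = len(P_obs)
    total = 0.0
    for i in range(k):
        for j in range(i):  # lower triangle only
            theta_obs = math.degrees(math.asin(math.sqrt(P_obs[i][j])))
            theta_fit = math.degrees(math.asin(math.sqrt(P_fit[i][j])))
            total += (theta_obs - theta_fit) ** 2
    chi_square = total / (821.0 / n)
    degrees_of_freedom = (k - 1) * (k - 2) // 2
    return chi_square, degrees_of_freedom
```

The statistic would then be compared against a chi-square table at the returned degrees of freedom; a value of zero corresponds to fitted proportions that reproduce the observed ones exactly.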
It is possible to assess the nature of the effect of
inconsistency on the fit of the scale estimates by constructing a
situation in which a single circular triad is exhibited in
otherwise errorless comparisons data. Figure II.2 shows the
placement of four stimuli, a, b, d, and e, along the
psychological continuum. The differences between these actual
scale values are shown in the matrix X' in Table II.1. The fifth
stimulus, c, is represented at two positions on the scale. With
respect to all stimuli but a, c is positioned at -.10 on the
scale (the true value, $c_{bde}$). With respect to a, however, c is
positioned at +.15 on the continuum ($c_a$). The result is a single
inaccurate observation for stimulus pair a,c, forming the single
circular triad (a>b, b>c, c>a).
If the observed proportion of times that c was judged
greater than a were overlooked, perhaps discounted as a
transcription error, the remaining data in X' would be consistent
FIGURE II.2
Contrived five-stimulus example, where inaccuracy is introduced
into the observation between stimuli a and c.
               a       b       c       d       e
a              0     -.10     .10     .15    -.20
b            .10       0     -.05     .25    -.10
c           -.10     .05       0      .30    -.05
d           -.15    -.25    -.30       0     -.35
e            .20     .10     .05     .35       0

Column mean  .01    -.04    -.04     .21    -.14

TABLE II.1
Matrix X' for the contrived five-stimulus
example of Figure II.2.
with the determination of scale estimates leading to a perfect
fit between P" and P'. Including the inconsistency produces the
distorted scale estimates shown in Figure II.3. Only the scale
estimates for stimuli a and c are different from the true
underlying values; stimuli b, d, and e have the same relative
positions as in the actual configuration. Because X' is a skew
symmetric matrix ($x'_{ac} = -x'_{ca}$) and the only columns affected by
the presence of intransitivity are those for a and c, the
estimated scale values differ from the actual values by the same
amount in opposite directions. As shown in Figure II.3, the
scale estimate for stimulus c is .05 units greater than its
actual value; for a, it is .05 units less.
FIGURE II.3
Thurstonian scale estimates for the five-stimulus example
compared to the true values for stimuli a and c.
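The size of this distortion can be reproduced with a short calculation. The Python sketch below is illustrative only: the positions assigned to a, b, d, and e (.05, -.05, .20, -.15) are assumptions chosen for the example, c is taken to lie at -.10 except when compared with a, where it is treated as lying at +.15, and the helper names are hypothetical.

```python
def deviates_from_positions(positions):
    """Errorless X': entry (i, j) is position j minus position i."""
    k = len(positions)
    return [[positions[j] - positions[i] for j in range(k)] for i in range(k)]


def column_means(X):
    """Least squares scale estimates as column means of X'."""
    k = len(X)
    return [sum(X[i][j] for i in range(k)) / k for j in range(k)]


labels = ["a", "b", "c", "d", "e"]
# Assumed positions; only c's two positions are described in the text.
true_positions = [0.05, -0.05, -0.10, 0.20, -0.15]

X = deviates_from_positions(true_positions)
clean_estimates = column_means(X)

# Perturb the single observation for pair a,c: c appears at +.15 to a.
a, c = 0, 2
X[a][c] = 0.15 - true_positions[a]
X[c][a] = -X[a][c]  # keep X' skew symmetric
distorted_estimates = column_means(X)

shifts = {
    label: distorted - clean
    for label, distorted, clean in zip(labels, distorted_estimates, clean_estimates)
}
```

Under these assumed positions, only the entries of `shifts` for a and c are nonzero, and they are equal in size and opposite in sign, because the perturbed cell enters exactly one term in each of the two affected column means.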
The distortion introduced into the scale due to the
inaccurate comparison of a and c degrades the fit of the least
squares estimates noticeably. In this example, where a single
circular triad is introduced into otherwise errorless data, the
fitted proportions differ from the observed in seven out of 10
cases, as shown in Table II.2. Because the least squares
procedure operates to minimize the sum of squared deviations, a
solution resulting in several small discrepancies is preferred to
a solution with a single large one. Thus, the least squares
procedure distorts the interval properties of several scale
estimates in order to compensate for a single potentially
problematic observation.
                a       b       c       d
b   observed   .54
    fitted     .516
c   observed   .46     .54
    fitted     .532    .516
d   observed   .44     .401    .353
    fitted     .417    .401    .386
e   observed   .56     .52     .48     .618
    fitted     .536    .52     .504    .618

TABLE II.2
Comparison of observed and fitted proportions for the five-stimulus
example, showing discrepancy in seven out of 10 cells in the
lower triangular matrix.
This small ad hoc analysis of limited intransitivity
motivates the design of a more general procedure. It would be
advantageous to have a diagnostic to determine the influence of
any one observation on the overall fit of the model. By
replacing the value in each cell of the lower triangle of the
matrix X' by a value determined from the other relative
comparisons, and then assessing the fit for these modified
values, it is possible to determine the improvement in fit
associated with the "discounting" of one observation. If this
improvement is substantial, it indicates that the interval
properties of the initial scale may have been degraded by
inconsistency. If the inconsistency can be traced to data
transcription error, or to some other uncontrolled influence
operating on a limited portion of the data, we might want to turn
to a more robust scaling procedure where outlying observations
have less influence and the discrepancy in fit is limited to as
few stimulus pairs as possible.
Consider the contrived five-stimulus example shown in
Table II.1. In this case, we introduced an intransitivity by
perturbing the observed value $x'_{ac}$. If we could somehow discount
this observation, so that scale estimates $s'_a$ and $s'_c$ depended
only on the relative comparison with stimuli b, d, and e, the
resulting value would reflect the proportion of times stimulus a
was reported greater than c, with c positioned accurately. We
can thus use the concept of adjusting a stimulus pair to design a
diagnostic technique for determining the sensitivity of paired
comparisons data to inconsistency. This notion of sensitivity is
similar to the one introduced by Hoaglin and Welsch (1978), based
loosely on the influence of a single observation on the fit of
the entire model rather than simply its own fitted value.
If stimulus pair i,j is discounted, only one value in
each of column i and column j of matrix X' changes; therefore,
all other scale estimates $s'_m$, $m \ne i$ or $j$, remain unaltered. We
can use these $k-2$ unaffected scale estimates to adjust the values
$s'_i$ and $s'_j$. The adjusted estimate $\hat{s}'_i$ will reflect the best
position for stimulus i relative to all other stimuli but j.
Similarly, $\hat{s}'_j$ will reflect the best position for stimulus j
without considering direct comparison to stimulus i.
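The following sets of equations determine the adjusted
scale estimates $\hat{s}'_i$ and $\hat{s}'_j$:

$$
\begin{aligned}
\hat{s}'_i - s'_1 &= x'_{1,i} & \hat{s}'_j - s'_1 &= x'_{1,j} \\
&\;\;\vdots & &\;\;\vdots \\
\hat{s}'_i - s'_{i-1} &= x'_{i-1,i} & \hat{s}'_j - s'_{i-1} &= x'_{i-1,j} \\
\hat{s}'_i - s'_{i+1} &= x'_{i+1,i} & \hat{s}'_j - s'_{i+1} &= x'_{i+1,j} \\
&\;\;\vdots & &\;\;\vdots \\
\hat{s}'_i - s'_{j-1} &= x'_{j-1,i} & \hat{s}'_j - s'_{j-1} &= x'_{j-1,j} \\
\hat{s}'_i - s'_{j+1} &= x'_{j+1,i} & \hat{s}'_j - s'_{j+1} &= x'_{j+1,j} \\
&\;\;\vdots & &\;\;\vdots \\
\hat{s}'_i - s'_k &= x'_{k,i} & \hat{s}'_j - s'_k &= x'_{k,j}
\end{aligned}
$$

($k-2$ equations in each set)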
With infallible data, such as that found in the five-stimulus
example in Table II.1 above, all $k-2$ equations render exactly the
same value for the adjusted scale estimate. However, since
paired comparisons data rarely offer perfect observations, we
again choose to use a least squares approach to obtain values for
$\hat{s}'_i$ and $\hat{s}'_j$.
Our results above show that the mean of the $k-2$
equations gives the least squares solution:

$\hat{s}'_i = \frac{1}{k-2}\sum_{m \ne i,j} (x'_{m,i} + s'_m)$

$\hat{s}'_j = \frac{1}{k-2}\sum_{m \ne i,j} (x'_{m,j} + s'_m)$
Rearranging terms for $\hat{s}'_i$ gives:

$\hat{s}'_i = \frac{1}{k-2}\sum_{m \ne i,j} x'_{m,i} + \frac{1}{k-2}\sum_{m \ne i,j} s'_m$
Because the scale origin has been arbitrarily centered at the
mean of the scale estimates, the equation above becomes:

$\hat{s}'_i = \frac{1}{k-2}\sum_{m \ne i,j} x'_{m,i} + \frac{1}{k-2}(-s'_i - s'_j)$
Similarly, because $s'_i$ is equal to the mean of column i of X',
the equation above becomes:

$\hat{s}'_i = \frac{1}{k-2}(k s'_i - x'_{j,i}) + \frac{1}{k-2}(-s'_i - s'_j)$
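As a numerical check on this derivation, the adjusted estimate for a stimulus can be computed both as the mean over the remaining stimuli and from the closed form in terms of the column means. The Python sketch below is illustrative only; the function names and the five assumed stimulus positions are hypothetical, and the closed form is valid only when the scale estimates are the column means of a skew symmetric X' with a zero diagonal.

```python
def column_means(X):
    """Least squares scale estimates as column means of X'."""
    k = len(X)
    return [sum(X[m][j] for m in range(k)) / k for j in range(k)]


def adjusted_estimate(X, s, i, j):
    """Mean of (x'_mi + s'_m) over all stimuli m other than i and j."""
    k = len(X)
    others = [m for m in range(k) if m != i and m != j]
    return sum(X[m][i] + s[m] for m in others) / (k - 2)


def adjusted_estimate_closed_form(X, s, i, j):
    """Same quantity, using column means centered at zero."""
    k = len(X)
    return ((k * s[i] - X[j][i]) + (-s[i] - s[j])) / (k - 2)


# Assumed positions for a, b, c, d, e; pair a,c carries the bad observation.
positions = [0.05, -0.05, -0.10, 0.20, -0.15]
X = [[positions[col] - positions[row] for col in range(5)] for row in range(5)]
X[0][2], X[2][0] = 0.10, -0.10  # c appears at +.15 when compared with a

s = column_means(X)
adjusted_a = adjusted_estimate(X, s, 0, 2)
adjusted_c = adjusted_estimate(X, s, 2, 0)
closed_a = adjusted_estimate_closed_form(X, s, 0, 2)
closed_c = adjusted_estimate_closed_form(X, s, 2, 0)
```

In this assumed example the two forms agree, and discounting the pair a,c returns both stimuli to the positions they would hold, relative to the centered origin, if the inconsistent observation were absent.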