University of Illinois (Urbana-Champaign campus).

The role of classification in the modern American library : papers presented at an institute conducted by the University of Illinois Graduate School of Library Science, November 1-4, 1959 (Volume 1959) online



and retrieval of information, and the actual slowness of the machine in
the linear searching of an index. Classification becomes one of the
methods proposed for dividing an index in order to shorten the time
required for a machine search.

Let us suppose we are searching for the name "Baker, Able
Charlie" in a village telephone book containing about 1,000 names. To
search for this name might take a minute or two, occupied with pick-
ing up the book, finding the proper page and column, and scanning the
proper column for the name being sought. Now it is quite practical to
utilize an IBM machine, or some similar machine, or even a deck of
edge-notched cards, to find one name in a random file of a thousand
names in about the same minute or two required for the manual search
of an alphabetical file. But suppose we are looking for
the name "Baker, Able Charlie" in a list of a million names compar-
able to the New York telephone book. It might take us a little longer
to lift the heavier book, to find the right page and the right column, and
to scan by the given names and address as well as the last name. Nev-
ertheless, the time required for a search for one name in an alphabet-
ical list of a million names is of the same order of magnitude as the
time required to find one name in an alphabetical list of a thousand
names. But a machine search for one name in a random list of a mil-
lion names will take one thousand times as long as a machine search
for one name in a thousand.
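
The thousandfold difference can be checked with a short sketch. The
Python below is an illustration only and is not part of the original
paper: integers stand in for names, linear_probes imitates a machine
scanning a file from front to back, and ordered_probes imitates a
searcher repeatedly halving an ordered list. Both function names are
assumptions made for this example.

```python
def linear_probes(entries, target):
    """Count the entries examined by a front-to-back scan of a file."""
    count = 0
    for entry in entries:
        count += 1
        if entry == target:
            break
    return count


def ordered_probes(sorted_entries, target):
    """Count the entries examined when an ordered list is halved repeatedly."""
    low, high, count = 0, len(sorted_entries), 0
    while low < high:
        mid = (low + high) // 2
        count += 1
        if sorted_entries[mid] < target:
            low = mid + 1
        else:
            high = mid
    return count


if __name__ == "__main__":
    for size in (1_000, 1_000_000):
        directory = range(size)  # integers stand in for names (assumption)
        target = size - 1        # worst case for the scan: the last entry
        print(size, linear_probes(directory, target),
              ordered_probes(directory, target))
```

In the worst case the scan examines every entry, so its cost grows a
thousandfold between the two files, while the halving search needs no
more than 10 probes for a thousand entries and no more than 20 for a
million.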

It was the more or less vague realization of this fact that led the
early advocates of the application of punched-card machines for the
organization and retrieval of information to recognize that machine
methods could not be applied efficiently to the random searching of
large masses of information. No machine search of a large random
list can approach the speed with which the mind can jump to the exact
position in an ordered list. It would be silly to randomize a list of
names in a phone book, or subject headings in an alphabetical index, in
order to search for any particular name or heading with punched-card
machines. An ordered list when it is over a certain size always
enables the mind which recognizes and utilizes the order to beat the
machine. The conclusion to be drawn here is that contrary to popular
misconceptions, the larger the number of qualitatively different units
in a linear system of information, the less applicable are standard
punched-card systems or even magnetic tape systems to the problem
of searching; and this conclusion leads, in turn, to a search for
(1) ways to cut down the size of indexes; and (2) ways to prefile or
classify items of information.

So long as it seemed that machines could only be used for linear
search of large files of information, the search for classification sys-
tems which could divide such files hierarchically or in any other way,
although doomed to defeat, still appeared to be necessary. However,
in recent years machine searching of literature has with rare
exceptions adopted the method of look-up and coordination, rather than
linear scanning, and this means that it is no longer necessary to invent
classification systems in order to make machine search of informa-
tion feasible or economic and practicable.
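
The method of look-up and coordination can be sketched as follows.
This is a minimal illustration, assuming each document is indexed
under a handful of terms; the sample documents, the terms, and the
function names are all hypothetical. Each term is looked up directly,
and the document numbers posted under the requested terms are then
intersected, so that no file is scanned from end to end.

```python
def build_postings(documents):
    """Map each index term to the set of document numbers posted under it."""
    postings = {}
    for doc_id, terms in documents.items():
        for term in terms:
            postings.setdefault(term, set()).add(doc_id)
    return postings


def coordinate(postings, *terms):
    """Return the documents posted under every one of the requested terms."""
    sets = [postings.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


# Hypothetical sample file: document number -> terms assigned to it.
DOCUMENTS = {
    1: {"corn", "oil", "refining"},
    2: {"olive", "oil"},
    3: {"corn", "oil", "nutrition"},
    4: {"corn", "starch"},
}

if __name__ == "__main__":
    postings = build_postings(DOCUMENTS)
    print(sorted(coordinate(postings, "corn", "oil")))
```

The cost of a search here depends on the length of the posting lists
consulted rather than on the size of the whole file.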

Some of those who wished to use classification in machine search-
ing systems developed the notion of generic coding. This large mouth-
ful means nothing more than the use of subordinate digits to indicate
subordinate topics, which every student of librarianship learns in
learning the Dewey Decimal system. For example, 500 is science in
general; 510 is mathematics; 520 is astronomy; 511 is arithmetic; 521
is theoretic astronomy; 511.3 is prime numbers; 521.3 is orbits, etc.
The advantage of such coding for machine systems lies in the ability
to search by a portion of the number rather than the whole number.

For example, if I search for everything on 51, I pick up everything
on prime numbers, without asking or knowing that anything is in the
system on prime numbers. There are some people who feel that this
type of generic searching is necessary for machine systems. This is
most usually the case in the field of chemistry, where instead of
searching for a specific compound I may wish to search for all amines
or all chlorides or all purines, etc. It has been felt that the coding for
any specific compound which is an amine should also contain the coding
for amines as a generic group. Without going too much into detail on
this matter, it can be said that this type of generic coding is totally
unnecessary in order to make generic searching possible. Further-
more, it is much more expensive than other methods of carrying out
generic searches. In a study of the cost of generic coding which we
published in 1956,³ based upon a study of the number of digits being
employed in some of the systems being experimented with by the Pat-
ent Office, we determined that generic coding would increase the size
of a mechanical store by a factor of three to one, as compared to other
and simpler methods of carrying out generic searches. Since that time
our conclusions have been reinforced by the attempt made by the Na-
tional Bureau of Standards to develop a system of generic coding for
compounds. The system developed by the Bureau of Standards used so
many digits that the computer was in actual fact slower in its look-up
procedure than an individual turning over and examining cards in a
3x5 drawer.
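
Searching by a portion of the number, as described above for generic
coding, amounts to matching a prefix. The sketch below uses only the
Dewey numbers quoted in this paper; the dictionary layout and the
function name generic_search are assumptions made for illustration,
and the sketch does not represent any of the machine systems
mentioned.

```python
# Class numbers and captions as quoted in the text.
SCHEDULE = {
    "500": "science in general",
    "510": "mathematics",
    "511": "arithmetic",
    "511.3": "prime numbers",
    "520": "astronomy",
    "521": "theoretic astronomy",
    "521.3": "orbits",
}


def generic_search(schedule, prefix):
    """Return every class whose number begins with the given digits."""
    return {number: caption
            for number, caption in schedule.items()
            if number.startswith(prefix)}


if __name__ == "__main__":
    for number, caption in generic_search(SCHEDULE, "51").items():
        print(number, caption)
```

A search on 51 returns 510, 511 and 511.3, so prime numbers are picked
up without being asked for by name.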




If it is the case that all classification systems so far produced or
suggested have shown themselves to be inadequate as instruments of
such bibliographical control, and if it is the case that such systems
are not necessary for mechanized retrieval of information, why, may
we ask, must we continually be faced with the problem of laying the
ghost of classification or dissipating its shadows in the clear light of
analysis? This has been a problem which has troubled me for some
time. The issue seems so clear and yet we have this recurrent inter-
est in and time spent on the theory and problem of classification in li-
brarianship. I found the answer to this question in the "Report of Con-
clusions and Recommendations" issued by the International Study Con-
ference on Classification for Information Retrieval, held at Beatrice
Webb House, Dorking, England, May 13th - 17th, 1957. In a certain
sense the classification group which has been started in this country
and this Conference itself may be considered reactions to the Dorking
Conference. In studying the conclusions and recommendations of this
Conference, Paragraph (1), called "The Scope of Classification,"
gives us our clue:

Traditional classification has been concerned with the construc-
tion of hierarchies of terms - chains of classes and co-ordinat-
ed arrays. Modern information retrieval techniques also neces-
sitate the combination of terms to express complex subjects.
This conference takes the term 'classification' to include the
problems raised by both these forms of relation. Some mem-
bers use the term 'codification' for this field of study.

This is a complicated way of saying what earlier defenders of clas-
sification have said, namely, that all intellectual organization is clas-
sification and that even such things as alphabetical indexing or numer-
ical arrays are species of classification. It is said that no matter how
much we try to get away from classification, we must come back to it.
And thus we see the Dorking Conference, which was presumably called
to deal with classification as a specific method of organizing informa-
tion, generalized the term so that classification became the name for
any method of organizing information. We wish to do more at this
point than quarrel about the meaning of words. Hence, we will admit
that there is a sense in which all intellectual activity involves classi-
fication. The modern theory of arithmetic involves the notion that all
numbers are classes, that is, one is the class of all classes having a
single member, two is the class of all doubles, three is the class of
all triples, etc., that is to say, a number is a class of classes. Fur-
ther, it is certainly true that any general term involves the notion of
class. Any word which does more than point or indicate this or that,
is a word connoting or denoting a class. For example, I can point to a
particular color, but I cannot use the term "red" without implying a
class of shades, or the term "color" without implying a class of hues.
When I use a man's name as the entry in a descriptive catalogue, his
name becomes the class of all items written by him. In an alphabetical
catalogue any subject heading is the class of all items which follow
it in the catalogue. Certainly in this sense we must admit that all in-
tellectual endeavor involves classification and that if we use the word
"classification" in this wide sense, then all particular systems of or-
ganizing information are species of or varieties of classification. But
on this point there is no quarrel nor really any reason to hold the type
of Institute we are now holding. It seems to me that if we have a con-
ference on classification, or if someone is asked to read a paper on
whether classification is substantial or shadowy, there must be implied
that there are other forms of organization of information, other
forms of library organization, to which the term "classification" does
not apply. In other words, if we say that a dictionary catalogue is a
classed catalogue in just the same sense in which the John Crerar Li-
brary catalogue is a classed catalogue, then the question of whether
we should have classed catalogues or dictionary catalogues becomes
meaningless, sort of like saying that "A includes B and B is not in-
cluded in A." This is a flat contradiction. What we must look for,
then, both at the present time and in the following papers presented at
this Institute, is a definition of classification which distinguishes it
from other forms of organization and which permits an evaluation of
classification as contrasted with an evaluation of other forms of or-
ganizing information. Unless we make this distinction, all of our dis-
cussion from now on will be shadowy and essentially meaningless. I
wish, then, to offer a simple definition of classification as librarians
have always used it which distinguishes it from other forms of organ-
ization. And here, if you will forgive me, we must utilize some sim-
ple logical notions to make this problem clear.

(1) The product of any two classes is a class, as illustrated
by the following diagram:




In this diagram A is a class, B is a class, AB is a class.

(2) The sum of any two classes is a class; that is, "A or B" is
a class.







(3) Given the situation where A includes B, AB is a class, but
the class B not A is null; that is, it has no members.




The class B is included in the class A, when all the mem-
bers of class B are also members of class A.
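
The three notions above can be tried out with ordinary sets. The
following is a sketch under stated assumptions: the classes and their
members are invented for illustration, and the function names
product, class_sum and includes are not taken from the paper.

```python
def product(a, b):
    """Logical product AB: the members common to both classes."""
    return a & b


def class_sum(a, b):
    """Logical sum 'A or B': the members of either class."""
    return a | b


def includes(a, b):
    """A includes B when the class 'B not A' is null."""
    return len(b - a) == 0


# Hypothetical classes of items.
A = {"algebra text", "star atlas"}
B = {"star atlas", "orbit tables"}
SCIENCE = {"algebra text", "star atlas", "orbit tables"}
ASTRONOMY = {"star atlas", "orbit tables"}

if __name__ == "__main__":
    print(product(A, B))                 # (1) the product is itself a class
    print(class_sum(A, B))               # (2) the sum is itself a class
    print(includes(SCIENCE, ASTRONOMY))  # (3) inclusion: 'B not A' is null
```

When one class includes another, their product is simply the included
class, which is the situation described in (3).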

A library classification system like the Dewey system, the L.C.
system, the U.D.C. system, etc., may now be defined as follows:
There is a set of main classes, illustrated as follows:





All sub-classes are included in only one main class:




And this relation of inclusion continues, no matter how far we
carry this subdivision; thus, all sub-sub-classes are included in only
one sub-class:








It seems to me that those who defend classification systems are
saying that knowledge, books, or the information in books can be or-
ganized in this way and that an organization carried out in this manner
will serve the interests of scientific research and other intellectual
activities. In terms of logic, class inclusion is only a special case of
class intersection. For example, two standard theorems in any logical
work are:

(y) (x) x ∩ y ⊂ x: The product of x and y is included in x
(y) (x) x ⊂ x ∪ y: x is included in the sum of x and y

This is equivalent to saying that class inclusion can be defined in a
Boolean system of products, sums and complements.
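
The claim that inclusion can be written with products, sums and
complements alone can be checked by brute force over a small universe.
The sketch below is illustrative: the three-member universe and the
function names are assumptions, and inclusion is defined here as
"the product of x with the complement of y is null."

```python
from itertools import combinations

UNIVERSE = frozenset({"a", "b", "c"})  # hypothetical three-member universe


def all_classes(universe):
    """Every class (subset) that can be formed from the universe."""
    items = sorted(universe)
    return [frozenset(group)
            for size in range(len(items) + 1)
            for group in combinations(items, size)]


def included_in(x, y, universe=UNIVERSE):
    """x is included in y when the product of x and not-y has no members."""
    return not (x & (universe - y))


def theorems_hold(universe=UNIVERSE):
    """Check both theorems for every pair of classes in the universe."""
    classes = all_classes(universe)
    return all(included_in(x & y, x, universe)
               and included_in(x, x | y, universe)
               for x in classes
               for y in classes)


if __name__ == "__main__":
    print(theorems_hold())
```

The check covers every pair of classes in the assumed universe; it
illustrates the two theorems and is not a proof for classes in
general.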

Where, then, does the issue lie? We have first rejected the notion
that classification is a purely general notion and insisted upon its dis-
tinction from other types of organization. Now it appears we have in-
sisted on the general character of Boolean relations and have pointed
out that hierarchical classification or class inclusion is only a special
relationship within Boolean algebra. What issue, then, remains? For
myself, I think there isn't any, but historically there have been two
issues which may provide substance in addition to shadow during the
coming deliberations of this Conference. There have been metaphysi-
cians, philosophers, and even some scientists, notably certain bota-
nists and zoologists, who have insisted that in addition to the mathe-
matical notion of class there do exist in the world real classes or
archetypes. These men would say that the class of geraniums is much
more real than the class which anyone may set up which has as its
members any two flowers, e.g., a geranium and a rose. These men
would say that the class of red things is more real than the class of
colored things. Following this line, it would be said that scientific
investigation will disclose that the universe and all the items in it are
organized in a set of real classes and that the business of library
classification or any other type of classification is not to make classes
but to discover such classes. It is my present feeling that there are
no serious scientists who still hold this position, at least not since
the development and popularization of the theory of evolution and since
the development of Boolean algebra in the middle of the Nineteenth
Century. Let me remind you that it is traditional in library literature
to recognize that Dewey was very much influenced by Harris, that
Harris was an Hegelian, and that Hegelians are a species of unscien-
tific German metaphysicians who believe that all reality is constituted
by an hierarchy of classes reaching up to the Prussian State as the
class of all classes. I would say further that the emphasis on real
classes in this sense in librarianship is a cultural lag which should be
eliminated at this time.

There remains one other problem. It might be said that an empir-
ical investigation of how men actually organize knowledge or write
books discloses that some classes are better than others and that
some classes include other classes and that a good library organiza-
tion should reflect this empirical fact of how people study, do research,
or use libraries. This is a valid point of view and if the empirical
facts could be demonstrated, then a library classification based upon
such empirical facts would certainly be useful. On the other hand, if
the librarians make classifications for themselves based upon
theoretical considerations and insist that the users of libraries modify
their own interests or own groupings in order to fit the librarians'
theoretical classifications, such a procedure would have no warrant
in either fact or logic.



Notes



1. J.J. Lund and Mortimer Taube, "A Non-Expansive Classification
System," The Library Quarterly, VII (July, 1937) 373-394.

2. M. Taube, "The Functional Approach to Bibliographical Organi-
zation," Bibliographic Organization, (Chicago: University of Chicago
Press, 1951), pp. 57-71.

3. M. Taube and Associates, Studies in Coordinate Indexing,
Vol. III, (Washington: Documentation Inc., 1956), pp. 34-57.






The Classified Catalogue
as an Aid to Research

Herman H. Henkle
Librarian, The John Crerar Library



Very little is known about the effectiveness of library subject cat-
alogues as tools of research. We know that they are indispensable
from a theoretical point of view, and from general observation of
their use and the results of a few studies we can conclude that they
are generally compatible with the library use habits of readers.

Some of the general conclusions which have been drawn from stud-
ies of the subject catalogue are: that there is no significant difference
between the amount of author catalogue use and subject catalogue use;
that the non-specialist ordinarily will make more use of the subject
catalogue than the specialist; and that most of the use of the subject
catalogue is for materials in English and of fairly recent date.

If the second of these generalizations is true, namely that subject
catalogue use is primarily by non-specialists, a discussion of the
classified catalogue as a research tool may be a somewhat sterile
exercise. On the other hand, we can remind ourselves that the im-
portance of research isn't determined by popular vote, so even a mi-
nority use should justify its consideration. In any case, classifica-
tion and classified catalogues have a high degree of relevancy. This
was my reason for agreeing to discuss the subject of the role of the
classified catalogue in research.

In evaluating what I have to say about classification, one general
caveat must be observed. My remarks on classification will relate ex-
clusively to its use in the classified catalogue. While some points
might have relevance to the classification of books for shelving, others
might have differing relevance or no relevance whatever. No effort
will be made here to indicate when there is or is not a common ground
in problems of shelf classification and the classified catalogue.

A second caveat is that the limitations of my experience with the
classified catalogue probably lend my judgements on its problems and
potentialities much less validity than they should have. I am aware of
the existence of several other classified card catalogues in current
use, but I have had no opportunity to examine them. All of what I have
to say is derived from experience with the classified catalogue of
Crerar Library. This being the case, I should begin with a brief des-
cription of this catalogue.

The first librarian of Crerar, Clement Walker Andrews, was a
chemist by first profession, and prior to accepting appointment to
establish a new library of science and technology in Chicago (in 1895)
had been serving as librarian of the Massachusetts Institute of Tech-
nology. He was, then, by both profession and experience, science-
oriented. He was working, also, in a period when there was an active
and rising interest in classification. Whether these factors were the
cause or only coincidences, he decided that the subject catalogue at
Crerar would be a classified catalogue; and he chose to base it on the
flourishing classification system developed by his contemporary,
Melvil Dewey.

The catalogue consists of a classified section with an alphabetical
subject index filed as a separate section immediately before the first
sections of the classified catalogue. The labels on the catalogue trays
are class numbers. In the trays, cards are arranged by class num-
bers in the upper left corner of each card (call numbers are on the
right); guide cards indicate breaks between classes (but not all of
them); and within each class, the cards are arranged chronologically
by date of imprint with the latest date first, followed progressively by
earlier dates toward the back of each tray.
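
The filing order just described (class number first, then date of
imprint with the latest first) can be expressed as a two-part sort
key. This is a sketch only: the Card fields and the sample cards are
hypothetical, and it assumes every class number has a three-digit
whole part so that comparing the numbers as text agrees with decimal
order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    class_number: str   # e.g. "665.3", kept as text and compared digit by digit
    imprint_year: int
    title: str


def filing_order(cards):
    """Arrange cards by class number, latest imprint first within a class."""
    return sorted(cards, key=lambda card: (card.class_number, -card.imprint_year))


# Hypothetical cards, for illustration only.
SAMPLE_CARDS = [
    Card("665.3", 1950, "Older work on vegetable oils"),
    Card("665.29", 1957, "Work in a neighbouring class"),
    Card("665.3", 1958, "Newer work on vegetable oils"),
    Card("665", 1940, "General work"),
]

if __name__ == "__main__":
    for card in filing_order(SAMPLE_CARDS):
        print(card.class_number, card.imprint_year, card.title)
```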

A crucial part of the classified catalogue system is the numerical
index, a classified card file maintained in the Catalog Department on
which a record is kept of every verbal heading in the subject index
which refers to each specific class number. In effect the subject in-
dex in reverse, it provides guidance to the cataloguer in the develop-
ment and maintenance of the subject index.
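
Because the numerical index is the subject index in reverse, it can be
derived by inverting the mapping from heading to class number. The
sketch below is a minimal illustration: the headings and numbers are
drawn from examples appearing elsewhere in these papers, while the
dictionary layout and the function name numerical_index are
assumptions.

```python
def numerical_index(subject_index):
    """Invert heading -> class number into class number -> headings."""
    reverse = {}
    for heading, class_number in subject_index.items():
        reverse.setdefault(class_number, []).append(heading)
    # Keep the file in class-number order, with headings alphabetized.
    return {number: sorted(headings)
            for number, headings in sorted(reverse.items())}


# Illustrative subject index: verbal heading -> class number.
SUBJECT_INDEX = {
    "Corn oil": "665.3",
    "Vegetable oils": "665.3",
    "Fats, Vegetable": "665.3",
    "Orbits": "521.3",
}

if __name__ == "__main__":
    for number, headings in numerical_index(SUBJECT_INDEX).items():
        print(number, headings)
```

With such a file the cataloguer can see at a glance every verbal
heading that already points to a given class number.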

The late Harriet Penfield, for many years chief classifier at
Crerar, once wrote that Mr. Andrews considered the basic factors in
the classified catalogue to be "(1) time, (2) geographical, and (3)
alphabetical sub-arrangements, and these have been built into the
catalogue from the first and are characteristic of it." This is quoted
from some manuscript notes Miss Penfield prepared at my request before
her retirement.² Further quotation will serve to round out a general
picture of the catalogue.

Of first importance also was more adequate provision of
schedules, for the Library grew very rapidly, and both the L.C.
and D.C. schedules were very meagerly developed in the
nineties. Accordingly, the Brussels Classification was adopt-
ed for most sections of the social sciences [no longer includ-
ed in Crerar collections], and the Zurich Consilium Bibliog-
raphicum for 59 Zoology and some parts of Biology ....
Other expansions were worked out or adapted from other
sources very early, and from time to time later as needed,
though if another edition of D.C. was promised soon we tried
to wait .... Sometimes, too, we have not liked a new D.C.
expansion any better for our purpose than our own and have
made no effort to adopt it in whole or in part. We also have
avoided the over-elaborateness of some of the later D.C.
editions.




The general pattern of the catalogue was continued along its orig-
inal lines through most of the first six decades of the Library's his-
tory. But by 1950 we had reached the conclusion that the catalogue
should undergo a thorough review. Obviously this would be a major
undertaking, and might take a long time. Yet it was realized that a
beginning must be made. Substantial progress has been made in a
decade,³ but there is still a vast amount of work to be done, made
doubly difficult by the fact that the frontier of science and technology
is constantly changing at a rate that exceeds our capacity to keep fully
abreast of it.

One of the evidences of need for change in current policy was the
statement just quoted from Miss Penfield's notes, namely: "We also
have avoided the over-elaborateness of some of the later D.C.
editions." This statement, we believe, reflects a serious misconception
of the principle of the classified catalogue. It equates use of classi-
fication in the catalogue with classification for the shelving of books.
Very strong reasons can be advanced for brief notation in shelf clas-
sification, but they are not applicable to the construction of the clas-
sified catalogue. They lead, in fact, to basic violation of the principle
of specificity in any type of subject indexing. And it is essential to
keep in mind that classification for use in a classified catalogue is not
classification of books, but subject indexing by means of class symbols.

Support for the position taken came from a number of what we re-
ferred to as test cases. These involved random selection of an index
entry followed by analysis of what was found in the classified cata-
logue. Two test cases will be described.

TEST CASE ONE
Index Entry

Corn oil

665.3 (Chemical technology)

Classified Catalogue

665.3 (Vegetable fats and oils)

This section was comprised of some 193 cards, including the
following subjects, not in any systematic order, and the index en-
tries referring to 665.3. The number of cards follows each sub-
ject; an asterisk indicates that there was an index card, but no
reference to 665.3.

Subject          Cards                       Subject          Cards
(General)        96, including 5 on waxes    Olive oil        21
Cocoa oil        11                          Palm-oil         9
Corn oil         4                           Peanut           1
Cotton seed      16                          Peppermint       2
*Flaxseed        2                           Soybean          12
*Kaoline         1                           Sunflower seed   1
Karite           2                           Turpentine       4
Maize            2                           Wormwood         1




Numerical Index

Index entries recorded in the Numerical Index were checked and
grouped under three headings, excluding entries discovered through
examination of the cards in the classified catalogue under 665.3.

Recognizable Synonyms

Absinthium, see Wormwood
Cocoanut-oil, see Cocoa-oil
Corn
Corn, Indian
Eupatoriaceae (Wormwood)
Indian Corn
Saponification
Shea-butter, see Karite

Entries for which no titles were identified

Argan
Castor-oil
Colza
Forest products
Rape plant
Rapeseed
Wax-palms
Zea mays

General Index Entries

Fats, Vegetable
Oil seeds
Oils, Vegetable
Vegetable fats
Vegetable oils



What is wrong with this picture? First, and most important, a
classified catalogue based on this pattern produces, in much too large
numbers, references comparable in part to what the machine men call
"false drops." Under the class, 665.3 (Vegetable fats and oils), there
are catalogue entries for 193 publications. The index entry for "corn
oil" refers us to the class, 665.3. Here we find four books dealing
with corn oil and one hundred eighty-nine false drops.
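
The arithmetic behind this count is simple enough to state as a sketch.
The figures 193 and 4 come from the test case above; the function
names, and the idea of expressing the result as a proportion of
relevant cards, are additions made here for illustration.

```python
def false_drops(cards_in_class, relevant_cards):
    """Cards the reader must examine that are not on the subject sought."""
    return cards_in_class - relevant_cards


def relevant_share(cards_in_class, relevant_cards):
    """Proportion of the cards under the class number that are relevant."""
    return relevant_cards / cards_in_class


if __name__ == "__main__":
    cards, relevant = 193, 4   # class 665.3 and "corn oil", per the text
    print(false_drops(cards, relevant))
    print(f"{relevant_share(cards, relevant):.1%}")
```

On these figures only about two cards in a hundred under 665.3 answer
a request for corn oil.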

There are at least three other undesirable conditions illustrated.

