Theoretical development of information science: A brief history
Birger Hjørland
University of Copenhagen, Royal School of Library and Information Science, Denmark
Pnl617@iva.ku.dk
Abstract
This paper presents a brief history of information science (IS) as viewed by the author. The term ‘information science’ goes back to 1955 and evolved in the aftermath of Claude Shannon’s ‘information theory’ (1948), which also inspired research into problems in the fields of library science and documentation. These subjects were a main focus of what became established as ‘information science’, which from 1964 onwards was often termed ‘library and information science’ (LIS). However, the usefulness of Shannon’s information theory as the theoretical foundation of the field has been challenged. Among the strongest “paradigms” in the field is a tradition derived from the Cranfield experiments in the 1960s and the bibliometric research following the publication of Science Citation Index from 1963 onwards. Among the competing theoretical frameworks, ‘the cognitive view’ became influential from the 1970s. Today information science is very fragmented, but a growing number of researchers find that the problems in the field should be related to theories of knowledge and understood from a social and cultural perspective, thereby re-establishing connections with ideas such as social epistemology, which may have remained implicit in the field much of the time.
1. Introduction
This paper briefly expounds theoretical developments in information science (IS). In practice information science today may be considered synonymous with library and information science, LIS (see Hjørland, 2013 for a more detailed discussion). This article is based on earlier accounts of the subject (e.g. Wersig 2003; Bates 2005 and Talja, Tuominen and Savolainen 2005). Information science has, however, a very disordered history, and the earlier descriptions need to be reconsidered and extended. Therefore, this article attempts to fill a gap and to provide a broad overview of the theoretical development of the field. It is necessarily selective and also subjective in the sense that it reflects the priorities made by the author. As Pierre Bourdieu wrote about his outline of the history of another field:
It is clear that it is not easy to construct the history of the sociology of science [or, as here, of information science], not only because of the vast volume of ‘literature’ but also because this is a field in which the history of the discipline is at stake (among others) in struggles. Each of the protagonists develops a vision of this history consistent with the interests linked to the position he occupies within the history; the different historical accounts are oriented according to the position of their producer and cannot claim the status of indisputable truth (Bourdieu, 2004, p. 9)
The point of departure in this paper is the term “information science”, which according to Shapiro (1995) was coined by Jason Farradane (1955). It was established in the same period in which terms such as ‘information technology’, ‘information processing’ and ‘information storage and retrieval’ appeared. All these terms seem to owe their appearance to the new ‘information theory’ developed by, among others, communications engineer Claude Shannon (1948) in the article A Mathematical Theory of Communication. As Proffitt (2010) noted about the Oxford English Dictionary’s coverage of the term “information”:
“The Supplement’s editors identified and included many of the earliest compounds evoking the sense of information as data, something to be stored, processed, or distributed electronically: information processing, information retrieval, information storage (all three dated from 1950). In quick succession came terms relating to the academic study of the phenomenon, appearing in a neatly logical sequence: first the idea (information theory, 1950), next its budding adherents (information scientist, 1953), then the established field of study (information science, 1955).”
Shannon’s (1948) ‘information theory’ seems therefore to be the direct or indirect reason for establishing ‘information science’ about seven years later. One may claim, however, that the field is older, that only the label is new. Rayward (1994, p. 238), for example, wrote that Paul Otlet’s (1934) Traité de Documentation is one of the first information science textbooks (implying that the content of information science is older than the name; see also Hjørland, in press c). We shall return to this “pre-history of information science” below and here start with Shannon, who brought not just a new theory, but in a much stronger way the idea that a theory in this field is possible at all. Shannon’s theory brought a new conception of ‘information’, which in the narrow sense is something which can be measured (e.g. in ‘bits’) and in a broader sense has been defined by Michael Buckland (1991) as “information as thing” and by the Oxford English Dictionary as:
[2]d. Separated from, or without the implication of, reference to a person informed: that which inheres in one of two or more alternative sequences, arrangements, etc., that produce different responses in something, and which is capable of being stored in, transmitted by, and communicated to inanimate things. [Oxford English dictionary, 2010 update]
Although information science thus seemingly owes its name to Shannon’s information theory, it also developed out of library science and documentation; but, as we shall see below, the information theory that gave the field the name ‘information science’ later lost influence as other theoretical frameworks became more important.
2. Selected aspects of the prehistory of information science
Fields like ‘library science’, ‘the science of bibliography’, ‘scientific information’ and ‘documentation’ can be understood as the predecessors of information science.
Called bibliography, documentation, and scientific information during the first five decades of the twentieth century, the field became known as information science in the early 1960s. (Kline, 2004)
One demonstration of this was the change in name that the American Documentation Institute (founded in 1937) underwent when it became the American Society for Information Science in 1968 (now the Association for Information Science & Technology). The many names for fields that are sometimes synonyms and sometimes separate fields in relation to information science, as well as their rather complicated relations, are outlined in Hjørland (2013c) and shall not be repeated here.
Before the term ‘information science’ was used (i.e. before 1955) fields existed which were concerned with how documents are described, classified, organized, communicated and used. In other words, information science may be seen as part of a family of fields that all aimed to provide optimal services, systems and infrastructures for different kinds of user groups. Such systems and services might be termed ‘information systems and information services’; however, this is an extremely broad concept that includes, among others, bibliographical systems and services, memory institutions, scientific and scholarly information systems, documentation systems, management information systems and knowledge organization systems. Many subtypes of what might be considered information systems and services tend to form separate fields of study with separate literatures. Library science, bibliography and documentation are basically about helping people find the books, articles, pictures, music, information and so on they need or would like to read or experience (including digital content and the application of advanced information technology), which may be termed document representation and searching (often termed information storage and retrieval). Librarians and information specialists help users retrieve the documents needed to solve tasks, including writing theses and research papers (and to make systems that make such retrieval optimal). They also help to ‘keep the valuable from oblivion’ (Wilson, 1968, p. 1). Thus, prior to the establishment of information science, the core concept was the document. A document should not be understood in the narrow, everyday meaning, but instead it is ‘any concrete or symbolic indication, preserved or recorded, for reconstructing or for proving a phenomenon, whether physical or mental’ (Briet, 1951, p. 7; here quoted from Buckland, 1991, p. 47). Briet’s understanding of documents seems to be influenced by semiotic theory, although this is not made explicit in her writings. Her famous example is that an antelope in Africa is not a document, but a specimen that is kept in a zoo is. The example shows that the concept of ‘document’ should be understood in connection to documenting activities.
Different theoretical views have had their effects in library science, bibliography and documentation. We shall not here consider them all. Subfields such as information retrieval (IR), knowledge organization (KO), bibliometrics, and information behavior each have a wide range of approaches. The facet analytic school with Ranganathan in KO, for example, will not be considered (see Hjørland, 2013b). The following presentation is thus highly selective and is constructed in order to demonstrate developments considered overall important by the author.
2.1. Melvil Dewey
Library pioneer Melvil Dewey (1851–1931) had a strong practicalist influence on the field. His classification system (DDC) did not attempt to optimize findability in any specific collection or for any specific user group. Nor did it try to find optimal scientific or philosophical solutions to the problem today termed ‘information retrieval’ (IR). Instead DDC was a compromise and a standard which could be used by many different collections. His system is the dream of library management much more than the dream of users. Dewey’s approach may have blocked the development of library science towards becoming a true scholarly field by not connecting the field to philosophy and subject fields. Although Dewey felt it important that libraries mediated high-quality books and culture, he saw it as the job of subject specialists to make the document selection. His library science was thereby reduced to purely technical issues (and such technical issues were not understood as being connected with content, but were based on a dualistic view of technology and content). It is also characteristic of Dewey that he took the cultural values of his time and of his class and sex for granted: they were not examined, but were considered as given.
2.2. Henry Bliss
Library scientist Henry Evelyn Bliss (1870–1955), on the other hand, based library classification on knowledge developed in science and scholarship. He actually studied the different disciplines in order to learn how scientists classified their fields. His main idea was that although there are many different perspectives, it is possible to find overall lines of consensus on which to base bibliographic classification (a view which was in accordance with the logical positivism dominant at that time). His view is thus not as practicalist as Dewey’s, but it made library science much better connected to and founded in scholarship, although his view on consensus perhaps seems problematic from the perspective of our post-Kuhnian era.
2.3. The documentation movement
The documentation movement has already been mentioned with Briet’s development of the concept of “document” as a broad term related to a semiotic point of view in which a document is understood as a sign used to document something. The founders of this movement were Paul Otlet (1868–1944) and Henri La Fontaine (1854–1943). This movement is not limited to libraries, but focuses on bibliography and the task of providing documentation services based on subject analysis and classification, but also on providing abstracts and using the most advanced information technology. Documentation (and the concept of document and the, often implicit, underlying semiotic philosophy) lost influence with the growing influence of the term “information” (see Hjørland, 2000). It is important to say, however, that an important re-introduction of documentation theory with the concept of documents has taken place within information science (see Buckland, 1991a; Hjørland, 2000, 2002; Frohmann, 2004; Furner, 2004; Lund, 2008 and Ørom, 2007).
2.4. Social epistemology
Library scientist Jesse Shera (1903–1982) and his colleague Margaret Egan (1905–1959) developed a conception termed ‘social epistemology’. Shera found that ‘previously, librarianship had developed merely “as a body of techniques evolved from certain ad hoc assumptions about how people use books ...”’ (Shera, 1970, p. 29) and he tried to develop the field on the basis of sociological theory. Social epistemology was defined as the study of those processes by which society as a whole seeks to achieve a perceptive or understanding relation to the total environment – physical, psychological, and intellectual (Egan and Shera, 1952, p. 132; original emphasis). The ‘focus of attention’ of this new discipline should be ‘the analysis of the production, distribution, and utilization of intellectual products’ (Egan and Shera, 1952, pp. 133–134). Jonathan Furner (2002) does not, however, consider Egan and Shera’s social epistemology to be related either to the later field known by this name or to the sociology of science; instead, Furner suggested that Egan and Shera’s view should be understood as a psychological or individualist approach later to be taken up by the cognitive perspective. I do not fully agree with this view (although Furner indicates some obvious links and also correctly says that their writings have ‘an air of quaintness when placed alongside representatives of newer sociologies’). Egan and Shera were not satisfied with the individualist approaches of their own time and tried to formulate an alternative:
As she [Egan] has pointed out, psychologists have studied behaviour with reference to the conduct of the individual; epistemologists have studied the origins, growth, and development of knowledge, but again with reference to the individual. The sociologists have studied the behaviour of people in groups, but never really with reference to the influence of knowledge upon that behaviour. In other words, epistemology has never been taken out of the realm of individual’s relation to knowledge and studied in relation to the sum total of social behaviour, social action. (Shera, 1970, p. 85).
It should be considered that Egan and Shera wrote at a time when Thomas Kuhn’s philosophy (which is a social epistemology, cf. Wray, 2011) had not yet revolutionized the theory of knowledge. Egan and Shera’s approach was also based on documents (or ‘graphic records’) as the core concept of the field. An interpretation in retrospect might be that they too were searching for something like a semiotic theory, in which the meaning attributed to documents is determined by human social documenting practices. However, given the background knowledge of their time, this project remained somewhat unclear, as indicated by Furner (2002). There are two issues which in my view make Egan and Shera’s approach social (contrary to Furner’s view):
(1) Shera found that librarianship has to be based on subject knowledge:
“[A] good undergraduate major in a subject field is essential to the librarian, and he should pursue his subject specialty as far as his resources permit” (Shera, 1968, p. 317)
(2) Shera was obviously interested in libraries and their social and cultural importance in a historical perspective (cf. Shera, 1968), which is a perspective clearly distinct from psychological and cognitive approaches. A positive evaluation was also expressed by philosopher and information scientist Patrick Wilson (1927–2003):
Social epistemology with a focus on textual objects and with an eye on the actual and possible roles of information systems is a productive approach to our field (Wilson, 2002, electronic source, no page).
Unfortunately, social approaches were discontinued or marginalized and less fruitful approaches came to dominate the field in the next decades. Today, however, such social-epistemological conceptions have experienced a renaissance, as discussed later.
jis.sagepub.co.uk
3. Information theory
As mentioned above, engineer Claude Shannon developed the so-called information theory in 1948 (which, however, is often considered a misnomer for a theory of data transmission). Information theory is a mathematical theory about the technological issues involved whenever data is transmitted, stored or retrieved. Its basic idea is that the harder it is to guess what is received, the more information one has got. The theory involves concepts such as communication channels, bandwidth, noise, data transfer rate, storage capacity, signal-to-noise ratio, error rate, feedback and so on (see Figure 1).
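The basic idea that a harder-to-guess message carries more information can be made concrete with a minimal sketch. The function names below are illustrative assumptions and are not taken from Shannon's article; the sketch assumes a source whose outcomes have known probabilities and measures information in bits.

```python
import math


def self_information(probability):
    """Information (in bits) gained on receiving an outcome with the given probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


def entropy(probabilities):
    """Average information (in bits) per outcome of a source.

    Outcomes with zero probability contribute nothing and are skipped.
    """
    return sum(p * self_information(p) for p in probabilities if p > 0)


# A fair coin is as hard to guess as a two-outcome source can be.
fair_coin = entropy([0.5, 0.5])

# A heavily biased coin is easy to guess, so each toss tells us less on average.
biased_coin = entropy([0.9, 0.1])
```

Note that the measure depends only on the probabilities of the outcomes, not on what the outcomes mean, which is the limitation discussed in the remainder of this section.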
Thomas Haigh (2001, p. 31) describes how Shannon’s theory became affiliated with documentation.
Information gained a new cachet from ‘information theory’ and Shannon’s information theory resonated far beyond its technical niche. During the late 1950s, ‘information’ seemed scientific, modern, and fashionable. The 1950s saw a flurry of interest in the problems of ‘scientific information’. Scientific and technical work was being published in unprecedented quantities, spurring interest in technologies and systems to classify, abstract, distribute, and index it. Alarmists warned that an ‘information explosion’ threatened Western scientific leadership during the cold war because America’s lack of centralized indexing and abstracting left scientists and engineers doomed to repeat previous published work.
Shannon’s ‘information theory’ has been and still is extremely important in engineering and computer science. However, the question for us is how important Shannon’s theory is for the field now established as ‘information science’ (or LIS). Linguist and information scientist Henning Spang-Hanssen (2001, electronic source, no page) wrote:
‘information theory’ is not concerned with documents, and not even primarily concerned with the content or meaning of documents or other symbolic representations, but concentrates on the efficient transmission of signals, which may – or may not – convey meaning. It is therefore unfortunate to confuse the term information theory with information as occurring in “information science” and “information retrieval”.
As already mentioned, information theory gave rise to a new understanding of ‘information’ and it became extremely popular, not just in telecommunications, but also in many other fields, including psychology, and it became common to consider libraries, journals, reference books and the whole scientific communication system as ‘information storage and retrieval systems’. An example that demonstrates this influence is The International Encyclopedia of the Social Sciences (Sills, 1968), which contains an overall entry for ‘Information Storage and Retrieval’ (IS&R), which is subdivided into five subsections:
I: The field [Information Storage and Retrieval] (Becker, 1968).
II: Information services (Mitchell, 1968).
III: Libraries (Shera, 1968).
IV: Reference materials and books (Vose, 1968).
V: Bibliographic issues in the behavioral sciences (Bry, 1968).
By assigning these subjects under the label ‘IS&R’ this entry (with its subentries) reflects a new information-theoretical view of libraries, bibliographies, documentation and the scholarly communication system. On the other hand, information theory is not really considered in the content of the entries. It seems just to be a new label for what was formerly termed library science or documentation. There is no direct discussion of the relation between the subjects presented and the terms ‘information’, ‘IS&R’ or ‘information theory’, although the article about the field (Becker, 1968) focused on the application of technology and the creation of a new research field named ‘IS&R’ urged by the problems caused by the so-called ‘information explosion’ (implying the concept of information defined in the Oxford English Dictionary (2010, sense 2d) and explicitly criticized by, for example, Buckland (1991b) and Spang-Hanssen (2001)). Also the article on information services (Mitchell, 1968) mentioned the application of new technology and the paper has ‘the information crisis’ as its point of departure, but at its core the paper reflects a traditional documentation perspective rather than anything inspired by information theory. It should also be mentioned that Shera (presented above) wrote the section about libraries (Shera, 1968) and if anything this article reflects an alternative to the information-theoretical point of view. It is therefore not convincingly demonstrated that the subjects described under the label ‘information storage and retrieval’ can adequately and fruitfully be presented and discussed from the perspective of information theory. On the one hand the International Encyclopedia of the Social Sciences provides an example of an influence of information theory in information science and on the other hand it indicates that information theory did not influence the content of the field in a deeper way.
This use of terms like ‘information’ and ‘information storage and retrieval’ probably reflected expectations and hopes about the usefulness of Shannon’s theory (or something like it) in the future more than it reflected the actual use of that theory or considerations about the nature of the field. Another indication of this expectation is a Danish conference in 1957 (Blegvad, Elberling, Johnsen and Rode, 1957), which documented that prominent scholars had then found, at last, a theory or theoretical framework (Shannon’s) which seemed to be fruitful for attacking the problems of scholarly and scientific communication.
Shannon’s theory gave rise to the measurement of information by the unit of the ‘bit’, which may be applied to, for example, how information can be compressed and stored on a disk drive (not to be confused with the number of binary digits that may be stored on a given drive, which are not ‘bits’ in Shannon’s sense). However, as pointed out by many, this measure is not particularly relevant to the field of library, information and documentation studies. Michael Buckland, for example, wrote: ‘There is a valid and respectable field of formal information theory based on propositions, algorithms, uncertainty, truth statements, and the like, but its formal strengths are also its limits and make [it] inappropriate and inadequate for the concerns of LIS’ (Buckland, 2005, p. 686). Spang-Hanssen explained why Shannon’s theory does not apply to information science:
The amount of information is here [in Shannon’s information theory] measured by the decrease of uncertainty resulting from the choice of a particular message among a set of possible messages. […] I shall only mention a few points to show the limitation of this measure to our conception of information.
- In Shannon’s sense, the amount of information is proportional to the length of the message (in a given code). This obviously does not apply to the utilization of literature as information. Among other things, an abstract may be as informative as the complete paper.
- Shannon’s amount of information presupposes a measure of the uncertainty on behalf of the receiver. By the utilization of literature as information no measurable uncertainty can be defined generally.
- Shannon’s amount of information applies to some explicit coding and cannot in the case of normal writing (or speech) account for semantic relations that are not shown by similarities of expression. E.g. the synonyms ‘serials’ and ‘periodicals’ would be treated as different messages (or parts of messages) having different ‘amounts of information’. (Spang-Hanssen, 2001, electronic source, no page).
As mentioned there were – particularly in the 1950s – great expectations that Shannon’s theory might, at last, provide a fruitful theoretical foundation for the study of scholarly communication, libraries, information searching, and reference books and so on. However, much of this must have been nothing but a dream. Most information researchers today do not find Shannon’s theory a proper theoretical basis for the field, although some (e.g. an editorial in Journal of Information Science) have argued otherwise:
‘The boundaries of information science: information theory is alive and well’ (Cawkell, 1990).
However, Cawkell’s examples are about technological problems rather than about library, documentation or information problems, because Cawkell was in my opinion talking about computer science rather than about information science. It is not the job of information scientists, for example, to construct an algorithm that makes it possible to compress images in order to reduce computer space (but rather to say something about which images should be retrieved for given purposes and how they should be made findable). Information theory does play a role in modern information retrieval research, in which the information of a given term is measured in relation to term frequencies (Baeza-Yates and Ribeiro-Neto, 2011, pp. 218–219). Although this is the case, it would be wrong to say that information theory is the theoretical basis for IR research. As van Rijsbergen stated:
In the context of information retrieval (IR), information, in the technical meaning given in Shannon's theory of communication, is not readily measured (Shannon and Weaver [1949]). In fact, in many cases one can adequately describe the kind of retrieval by simply substituting 'document' for 'information' (van Rijsbergen, 1979, p. 1).
Chew et al. (2011) is an example of a recent paper which takes information theory very seriously in relation to IR, but it also indicates that information theory has not been strongly connected with this area:
It would probably be fair to say that IT [information theory] (not to be confused with information science) has been at most incidental in the development of IR. Nevertheless, IT has not passed IR completely. (Chew et al., 2011, p. 39).
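The limited, term-level role of information theory in IR mentioned above can be illustrated with a minimal sketch. It assumes one common formulation, in which the information of a term is the negative base-2 logarithm of the fraction of documents that contain it (a form of inverse document frequency); the cited textbook pages are not reproduced here, and the function name and toy collection are hypothetical.

```python
import math


def term_information(term, documents):
    """Information (in bits) of a term, from the share of documents containing it.

    `documents` is a list of sets of terms. Rare terms score high and
    terms that occur in every document score zero.
    """
    containing = sum(1 for document in documents if term in document)
    if containing == 0:
        raise ValueError(f"term {term!r} does not occur in the collection")
    return -math.log2(containing / len(documents))


# Hypothetical toy collection of four indexed documents.
collection = [
    {"library", "serials"},
    {"library", "periodicals"},
    {"library", "index"},
    {"library", "catalogue"},
]

common_term_bits = term_information("library", collection)
rare_term_bits = term_information("serials", collection)
```

The sketch counts ‘serials’ and ‘periodicals’ as unrelated terms, which is Spang-Hanssen’s point about synonyms: the measure captures statistical rarity, not meaning.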
Whether modern IR research should be considered part of information science is also an issue. Historically, leading IR researchers used to publish in IS journals, but today the mathematical and statistical parts of IR seem to have migrated to computer science publications. This supports the view that information theory is affiliated with computer science, but not with IS.
Gernot Wersig (2003, p. 312) found that the very notion of semiotics ‘in fact became one of the most important critiques of too simple an application of information theory to human communication’. This may be an understatement, although each theory has, of course, its own domain in which it is the best model. Today, the information-theoretic understanding is often contrasted with the semiotic understanding, for example, by Fiske (2011, pp. 37–60). Information theory may have been a barrier to establishing information science in its own right because this field is related to meaning and semantics, dimensions that information theory does not consider (see Nöth, 2013, for an overview of the criticism of information theory from semiotic perspectives).
If information science has rightly abandoned its foundation in information theory, the concept of ‘information’ itself may turn out to be superfluous. Furner found that
…philosophers of language have modeled the phenomena fundamental to human communication in ways that do not require us to commit to a separate concept of ‘information.’ Indeed, we can conclude that such a concept is unnecessary for IS. Once the concepts of interest have been labeled with conventional names such as ‘data,’ ‘meaning,’ ‘communication,’ ‘relevance,’ etc., nothing is left (so it may be argued) to which to apply the term ‘information.’ One corollary of such a conclusion is the equally negative judgment that the field of IS is itself misnamed, and that its subject matter should more appropriately be treated as a branch of communication studies, semiotics, or library studies (Furner, 2004, p. 428).
Bernd Frohmann also deflated the idea that information is more important than documents, arguing instead that ‘information ... exists only as an effect of the ontologically primary elements: documents and documentary practices. It has, therefore, only a secondary or derived ontological status; it is an effect of the relative stability of documentary practices. Once practices stabilize, information can emerge’ (Frohmann, 2004, p. 18). It would be unfortunate now to abandon the name “information science”, but we should consider information as secondary to social documentary practices, as suggested by Frohmann (2004), Day (2011) and Goguen (1997), among others.