Tags: Knowledge Organization, Classification, Information Science, International Classification, Journal of Documentation, Information Processing and Management, Documentation, Information Processing, development, Tim Berners-Lee, thesauri, Classification scheme, index entry, Ontology, Ontologies, Indexing system, index entries, Library of Congress, Semantic Web, IEEE Intelligent Systems, Library Science, retrieval tools, The British Technology Index, uniform structure, metadata tools, Indexing systems, Berners-Lee, Science and Technology, Altavista, Index String Generator, World Wide Web Consortium, World Wide Web, theoretical basis, Search Engines, International Forum, digital information, Cambridge Language Research Unit, String, index term, developing, digital environment, subject approach, Classification system, Stanford University, International Cataloguing, information retrieval, American Society for Information Science and Technology, online catalog, Knowledge Representation, Automatic Classification, International Conference, Library Trends, Journal of Information Science, Information Technology, Classification practices, Dewey Decimal Classification, Lim, E., Drexel Library Quarterly, Bibliographic Classification, Library Quarterly, Sarada Ranganathan Endowment for Library Science, Classification schemes, Library of Congress Subject Headings, Herald of Library Science, Library Resources and Technical Services, Bibliographic Classification systems, Scientific Information, Classification Society Bulletin, Cataloging, American Society for Information Science, Classification systems, Information and Communication Technologies
Content: BARC/2015/I/002
BARC/2015/I/002 KNOWLEDGE ORGANIZATION RESEARCH: AN OVERVIEW by Sangeeta Deokattey, B.K. Sharma and G. Ravi Kumar, Scientific Information Resource Division and K. Bhanumurthy, Former Head, Scientific Information Resource Division
01 Security classification : Unclassified
02 Distribution : Internal
03 Report status : New
04 Series : BARC Internal
05 Report type : Technical Report
06 Report No. : BARC/2015/I/002
07 Part No. or Volume No. :
08 Contract No. :
10 Title and subtitle : Knowledge organization research: an overview
11 Collation : 48 p., 1 tab.
13 Project No. :
20 Personal author(s) : Sangeeta Deokattey; B.K. Sharma; K. Bhanumurthy; G. Ravi Kumar
21 Affiliation of author(s) : Scientific Information Resource Division, Bhabha Atomic Research Centre, Mumbai
22 Corporate author(s) : Bhabha Atomic Research Centre, Mumbai - 400 085
23 Originating unit : Scientific Information Resource Division, Bhabha Atomic Research Centre, Trombay, Mumbai - 400 085
24 Sponsor(s) Name : Department of Atomic Energy
   Type : Government
30 Date of submission : January 2015
31 Publication/Issue date : February 2015
40 Publisher/Distributor : Head, Scientific Information Resource Division, Bhabha Atomic Research Centre, Mumbai
42 Form of distribution : Hard copy
50 Language of text :
51 Language of summary :
52 No. of references : 229 refs.
53 Gives data on :
60 Abstract : The object of this literature review is to provide a historical perspective of R and D work in the area of Knowledge Organization (KO). This overview/summarization will provide information on major areas of KO. Journal articles published in core areas of KO (Classification, Indexing, Thesauri and Taxonomies, Internet and Subject approach to information in the electronic era, and Ontologies) will be predominantly covered in this literature review. Coverage in this overview may not be completely exhaustive, but it succinctly showcases major developments in the area of KO. This review is a good source of additional reading material on KO apart from prescribed reading material on KO.
71 UDC Class No :
99 Supplementary elements :
KNOWLEDGE ORGANIZATION RESEARCH: AN OVERVIEW
By Sangeeta Deokattey, B.K. Sharma and G. Ravi Kumar, Scientific Information Resource Division & K. Bhanumurthy

Abstract
The object of this literature review is to provide a historical perspective of R&D work in the area of Knowledge Organization (KO). This overview provides information on the major areas of KO. Journal articles published in the core areas of KO (Classification, Indexing, Thesauri & Taxonomies, Internet & Subject approach to information in the electronic era, and Ontologies) are predominantly covered. Coverage may not be completely exhaustive, but the review succinctly showcases major developments in the area of KO. It is a good source of additional reading material on KO, apart from the prescribed reading material.

Keywords: Knowledge Organization, Classification, Indexing, Thesauri, Ontologies, Library & Information Science

1. Introduction
Since the dawn of civilization, Man has been motivated by his innate need to record and preserve History (His+Story) for posterity. The beginnings go back to the Egyptian Civilization, where sheets made from the Papyrus plant were used to record History. The Aryans, Greeks, Romans and the Chinese perfected the use of stone, clay tablets and silk scrolls to record their day-to-day lives as well as major events. But the real impetus to the preservation and use of recorded (explicit) knowledge came with the invention of printing by Gutenberg in Germany in 1439. Several scientific discoveries in the 19th Century
and rapid industrial and technological developments around the world saw the growth of scientific and technical literature. Universities, specialized laboratories and research institutions started publishing the results of their R&D activities through journal articles, technical reports and conference papers. Patents and trademarks followed. Books were no longer objects to be preserved but information to be disseminated. Accordingly, the focus of libraries shifted from being object-centric to user-centric. This fundamental shift in focus led to the need for better organization of the collection of books and other printed material for quick reference and use.

2. Organization of Knowledge: the Beginning
When the question of organization of Knowledge was raised and debated in learned circles everywhere, it was universally recognized and accepted that the subjects or topics covered in a particular book have to be the main criterion for arranging books systematically on shelves. This would also ensure that books on the same subject would be together and readers could easily browse through the collection. Therefore almost all the tools for organizing knowledge are based on the subject approach to information. From Jacques Charles Brunet (1780-1867) in the 19th Century (Tsioli and Corsini, 1994) to the Classification schemes, Indexing systems, metadata tools and sophisticated search engines available today, the subject approach to information has had a long history. From Enumerative schemes of Classification to the special and faceted schemes on various subjects, the use of Classification-based schemes (for classifying traditional library materials, digital information systems, online information sources and Internet resources) continues to this day. This is all the more so considering the proliferation of hybrid libraries (which incorporate both traditional and digital library materials) all over the world.
2.1 Classification Theory and Research
The Classification Research Group (CRG), established in the UK in 1952, made significant contributions to Classification theory in the latter half of the 20th century. The theoretical work of this group involved the study of Facet analysis, relational operators and the theory of Integrative Levels. Foskett (1970) gave an account of the pioneering work of the CRG from 1952-62. An account of a decade of progress in the general theory of Classification was given by Maniez & de Grolier (1991) for the period 1950s-1960s to the FID/CR. Foskett (1968), Ranganathan (1967 and 1968), Mayne (1968), Sparck-Jones (1970) and Beghtol (1986a and 1986b) elaborated on the basic concepts of Classification theory. Distinguished members of the CRG like Vickery, Coates, Gilchrist, Aitchison and Langridge advocated the analytico-synthetic approach to Classification, to overcome the pitfalls of Enumerative Classification schemes. The first International Conference on Classification Research at Dorking, in 1957, with its theme "The need for a Faceted Classification as the basis of all methods of Information Retrieval", was a turning point in Classification theory. Ranganathan's Colon Classification became the forerunner of several faceted schemes of Classification in special subject disciplines. In the same period, in France, de Grolier (1962) and Gardin (1965) provided new insights into Classification theory through their contributions. Gardin and his team demonstrated the development of a SYNTagmatic Organizational Language (SYNTOL), which was able to process (on a syntactical level) terms taken from four different Classification schemes and was applicable to machine retrieval. It operated as a sort of metalanguage. At the Second International Conference on Classification Research, held at Elsinore in 1964, a new definition of Classification was formally adopted.

"By Classification is meant, any method creating relations, generic or other, between individual semantic units, regardless of the degree of hierarchy contained in the systems and of whether those systems would be applied in connection with traditional or more or less mechanized methods of document searching".

Richmond (1974), de Grolier (1967) and Wahlin (1966) continued to support Classification in the wake of rapid thesaurus development in the 1960s. According to Richmond, there were two main reasons for the acceptance and need of a totally new scheme of general Classification. The first was centralized processing and computerization (where standardization becomes a prerequisite); the second was the aims of the Unisist report (United Nations Educational, Scientific & Cultural Organization, 1971), which elaborated the development of a Broad System of Ordering (BSO) or Standard Reference Code (SRC) for broad classing and a Fine System of Ordering (FSO) for fine Indexing. This two-pronged approach to Classification and Indexing is seen in the British National Bibliography (BNB), which uses DDC for broad classing and PRECIS for fine Indexing. A proposal for such a system was developed by Dahlberg (Principles for the construction of a universal Classification system, 1974) and was also taken up by Svenonius (1972) and Lancaster (1972). The CRG, too, proposed the development of an Integrative Level Theory and the General System Theory for the construction of a new general scheme. The Third International Conference on Classification Research, at Mumbai, in 1975, recommended new areas of research for the development of the BSO. The BSO or SRC thus tries to bridge the gap between these systems: it is a kind of switching language between existing Classification schemes, thesauri and other information retrieval systems. It has a system of notation, which is used in groups of fractions, separated by commas. The Fourth International Conference on Classification Research, held at Augsburg in 1982, further emphasized the need for a Universal Classification, Subject Analysis and Ordering Systems.

"Classification research for Knowledge Representation and Organization" was the theme of the Fifth International Study Conference on Classification Research, held in Toronto in June 1991. The gradual transition of the meaning of the word "subject" in the traditional sense to
information, domain and knowledge came full circle with the selection of the theme "Knowledge organization for information retrieval" for the Sixth International Study Conference on Classification Research, held at University College, London in 1997. Ranganathan's contribution to Classification theory was the subject of the 15th IASLIC Conference where Gopinath (1992) identified the analogical processes in the Classification scheme devised by Ranganathan and the Knowledge Representation schemes used by AI researchers in Knowledge Based Systems (KBSs). This was one of the earliest papers in which Ontological approaches to the Organization of Knowledge were identified. Svenonius (1992) Ingwersen & Wormell (1992), Foskett (1992), Curras (1992) and Kumar (1992) discussed the theoretical and practical aspects of Ranganathan's contributions to Library & Information Science. Spiteri (1998) provided an overview of Ranganathan's contributions to Classification, particularly Facet Analysis. A review of studies in Classification and Indexing, was undertaken by Quinn (1994). Jaenecke (1994) highlighted the role played by Classification: both in KO and in the theory of Classification. India's contribution to Classification Theory and Practice was presented in a state-of-the-art report by Satija & Singh (1998). Two papers by Langridge (1999) and Hjorland & Albrechtsen (1999) traced the origins, history and elements of Classification founded on the historical and social understanding of knowledge. Soergel (1999) summarized the functions of Classification schemes / Ontologies, Thesauri and Dictionaries in which the parallels between Classification and Ontology were discussed. Kwasnik (1999) explored the link between Classification and Knowledge. 
Three papers by Beghtol (1998), Olson (2002) and Hurt (1997) reported the problems in the Classification of interdisciplinary and marginalized subject domains and the rigidity of the uni-dimensional Classification schemes in the current academic environment. Dilevko (2009) demonstrated the applicability of classification theory to various textual-analytic approaches. Hjorland (2009) explored the importance of theories of concepts for organizing knowledge. According to him, Knowledge Organization Systems (e.g., classification systems, thesauri, and ontologies) should be understood as systems
basically organizing concepts and their semantic relations. An interesting parallelism was reported by Mazzocchi (2010) between the Vaisesika categorial system (based on Vedic Philosophy) and the five fundamental categories (PMEST) of Ranganathan. Szostak (2012) provided insights into the classification of relationships among things in general, which could also be applied to Information Science. Li, Sheng-Tun (2013) proposed a novel classification framework based on fuzzy formal concept analysis to conceptualize documents. This approach is particularly useful in classifying Web 2.0 content. The International Federation for Information & Documentation / Committee on Classification Research (FID/CR) and the Classification Research Group (CRG) (UK) made outstanding contributions to the theory of Classification. (The FID was dissolved in 2002.) Two papers by Arntz et al. (1981) and Wilson (1972) feature the seminal R&D work carried out by these organizations.

2.1.1 Classification schemes
Classification schemes began to be developed around the beginning of the 19th Century, to group and classify the entire Universe of Knowledge. Every Classification scheme uses a system of notation to represent the subject contents of books, on the basis of which all books are systematically arranged on the shelves. Creators of Classification schemes use epistemological, literary, scientific / philosophical and cultural warrants (theories of justification) for the inclusion of new subjects in the Classification of recorded knowledge as and when needed. Classification schemes can be General, organizing the whole of recorded knowledge, or Special, organizing only a small domain of knowledge. Melvil Dewey was the first to devise and initiate the use of an enumerative Classification scheme for libraries, with his Dewey Decimal Classification (DDC), published in 1876. Other well-known enumerative schemes include Bliss Bibliographic Classification (BBC) and the
Library of Congress Classification (LCC). Ranganathan, in direct contrast to the enumerative schemes, evolved the faceted scheme of Classification called the Colon Classification (CC). The Universal Decimal Classification (UDC) is based on Ranganathan's principles. Classification schemes and their practical applications and problems were the subject of several studies. Two studies highlighting the problems of Enumerative Classification schemes were by Borden & Nelson (1969) and Austin (1996). Richmond (1969) and Tauber & Feinberg (1974) made comparative studies of the LCC and the DDC. Mowery (1976) discussed the fundamentals of Cutter Classification and Davis (1976) elaborated on Enumerative Classification schemes. A critical review of the 5th edition of the CC was given by Satija (1992). Developments in computer and communication technologies marked the beginning of the use of computers in Classification schemes. Rigby (1974) gave an account of a decade of progress (1963-73) on computers and the UDC. Markey & Demeyer (1986), Finni & Paulson (1987) and Comaromi (1986) reported the use of DDC in the computer era. LC Classification and problems in online retrieval were discussed by Williamson (1986). Sparck-Jones (1971) and Sparck-Jones & Jackson (1970) experimented with the use of automation for keyword Classification. Two studies by Larson (1992) and Guenther (1996) presented results on the automatic selection of LC Classification numbers. A Windows-based version of DDC was reviewed by Will & Will (1997). The LCC, DDC and the NLM Classification schemes were compared by Connaway & Sievert (1996) to determine their effectiveness in classifying materials in Health Insurance. The development and use of the Regensburg Classification scheme in Germany was reported by Lorenz (1997). Another comparative study, of CC and BC in the treatment of complex subjects, was reported by Chatterjee (2000).
Bianchini (2012) reported a comparative study between CC and DDC and argued in favour of faceted schemes being more hospitable in organizing documents. Doria (2012) proposed the
development of a new type of Knowledge Organization System (KOS) based on faceted classification. 2.1.2 Internet and Classification schemes The OCLC (2004) under its project "Scorpion" pioneered the use of DDC for automatic Classification and Indexing of Internet resources. The Scorpion Open Source project offered software that implemented a system for automatically classifying Web-accessible text documents. Vizine-Goetz (1998) presented a position paper on the use of library Classification schemes for organizing Internet information. Jenkins et al. (1998) reported the use of automatic classification for Internet information resources. Molholt (1995) enumerated the virtues of Classification to address the problems of organizing web information. Cutter's principles of Bibliographic Classification and their relevance to the Internet were discussed by Mandel & Wolven (1996). The structures of LC and DDC were explored to classify web resources as reported by Dodd (1996). One of the themes of the Sixth International Conference on Classification Research was the important role of Classification and Classification schemes in organizing information on the Internet and the WWW. Another paper by Mitchell (1998) reiterated the importance of Classification in Knowledge Organization in the new world electronic order. The use of Classification schemes in organizing subject information gateways on the Internet in S.E. Asia was reported by Lim (2000). During the Sixth Biennial Conference of ISKO (International Society for Knowledge Organization), held at Toronto in July 2000, the use of DDC to classify Internet resources was discussed. Godby & Vizine-Goetz (2000) and Broughton & Lane (2000) reviewed the use of UDC and BBC in applications to web Indexing and searching. Three papers by Saeed (2001), Zins (2002) and Devadason et al. (2002) explored the possibility of using Classification schemes (both enumerative and faceted) for organizing Internet resources. 
Joorabchi (2011) detailed the development of software for automatically classifying scientific research documents using the DDC scheme. The Online
Computer Library Center (OCLC), USA, provides an online facility to automatically generate the most commonly used classification numbers and subject headings for an individual work, under its OCLC Classify project. A review of this service was reported by De Fino (2011). The use of co-occurrence information in a text as an aid to enhance Classification was described by Figueiredo (2011).

3. Trends in Indexing
Indexing and Indexing systems evolved because of the problems of notation faced by Classificationists in classifying and filing books on compound subjects. Indexing has ranged from the traditional book index to the various Pre-coordinate and Post-coordinate Indexing systems used in libraries; one of the most widely used Post-coordinate Indexing tools has been the thesaurus. Thesauri continue to be successfully used by various libraries, information centres and on-line database systems. An off-shoot of the thesaurus was the Classaurus, which combined the advantageous features of a Classification scheme and a thesaurus. Indexing systems originated from the traditional book index. These book indexes, in due course of time, became more specific, to suit different contexts. The variations in book indexes had their impact upon the techniques introduced and developed by Cutter, Kaiser, Coates, Ranganathan, Farradane and Sharp, and thus led to the evolution of Indexing. Cutter (1904) was the first to discuss the concept of specific subject, in his "Rules for a Dictionary Catalog", first published in 1876. He suggested the entry of a work under a specific subject heading rather than under the heading of a class which included that subject. He also suggested entering a compound subject by its first word, and inverting the phrase only when some other word is more significant. The ambiguity of the term 'significant' led to inconsistencies in Indexing. Cutter's rules are still the basis for practical work in the USA and can be seen in LCSH and Sears List.
LCSH is based on the principles of Cutter. Sears List is very similar to LCSH. Efforts have been made to computerize LCSH by Markey (1988),
Metcalf (1983) and Vizine-Goetz & Markey (1989). Kaiser (1911) laid down the theoretical basis for fixing the significance order for composite subjects, in his "Systematic Indexing". The most important contribution to subject Indexing has been of Coates (1960). His significance order for composite subjects was Thing, Material and Action. According to him, "the most significant term in a compound subject is the one, which is most readily available to the memory of the enquirer". The British Technology Index is a fine example of Coates's theory. Metcalfe (1957) devoted three major works on subject approach and later, published a tentative code of rules for alphabetico-specific entry in 1959, combining the theories of Cutter and Kaiser. Lynch (1966) used a slightly complex procedure for his Articulated Subject Indexes (ASIs). Through computer manipulation of a simple sentence-like statement, a series of index strings or ASIs were devised. Ranganathan (1974) in direct contrast to Metcalfe, used a classificatory basis for his chain Indexing. Chain Indexing involves deriving subject index entries from the class number of a document. According to Guha (1983), "if an analytico-synthetic scheme of Classification is used, in the co-extensive representation of a subject, then, a retranslation of the class number using the schedules of the same scheme of Classification, would give a neatly structured formulation of the subject". Farradane (1966) through his Relational Indexing, provided a different approach, by creating a new type of syntax in Indexing. His subject formulation was based not on the characteristics of component subjects, but on the relationship that exists between each pair of components. He identified nine such types of relationships, each indicated by different operators and symbols. The relevant operator is inserted between two components, to indicate the precise relationship. 
Sharp (1965) suggested the use of a selective combination of the components constituting the subject, instead of providing access from each of the component terms. In his Selective Listing In Combination (SLIC) Indexing system, only certain combinations were used as index headings, irrespective of the citation order.
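As an illustration of the chain-indexing procedure described above (deriving subject index entries from the class number of a document), the following is a minimal Python sketch. It assumes a purely hierarchical notation in which every shorter prefix of a class number is an upper link of the chain. The schedule, its notation and its headings are hypothetical and are not taken from any real scheme, and refinements such as the treatment of false or unsought links are ignored.

```python
# Hypothetical, simplified schedule: class-number prefix -> verbal heading.
# The notation and headings are illustrative and not taken from any real scheme.
SCHEDULE = {
    "6": "Technology",
    "62": "Engineering",
    "621": "Mechanical engineering",
    "621.4": "Heat engines",
    "621.43": "Internal combustion engines",
}


def chain_links(class_number, schedule):
    """Return (notation, heading) links from the most general to the most specific."""
    links = []
    for end in range(1, len(class_number) + 1):
        prefix = class_number[:end]
        if prefix in schedule:  # prefixes without a heading (e.g. "621.") are skipped
            links.append((prefix, schedule[prefix]))
    return links


def chain_index_entries(class_number, schedule):
    """Derive one index entry per link, starting from the most specific link."""
    links = chain_links(class_number, schedule)
    entries = []
    for i in range(len(links), 0, -1):
        notation = links[i - 1][0]
        # Lead term first, then its upper links in specific-to-general order.
        heading = ": ".join(term for _, term in reversed(links[:i]))
        entries.append((heading, notation))
    return entries


if __name__ == "__main__":
    for heading, notation in chain_index_entries("621.43", SCHEDULE):
        print(f"{heading}  {notation}")
```

Each entry pairs a verbal heading with the class number of the link it leads with, so that a reader who enters the alphabetical index at any level of the chain is directed to the corresponding place in the classified sequence.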
3.1 Developments in Indexing: String Indexing
These systems were developed for two main reasons:
1. availability of computers for producing indexes and
2. inconvenience in using traditional manual methods.
A string index is a type of index with two main characteristics: each index term usually has a number of index entries containing at least some of the same terms, and computer software is used to generate the description part of each index entry. The description part of a string index entry is called an Index String and the computer software used to produce it is called an Index String Generator. The main advantage of string indexing is the economy of time and effort. With only a single input string and its number or locator, numerous index entries can be generated automatically. Due to the multiple overlapping of index entries and the clear syntactical rules inherent to all string indexes, high efficiency during searches is ensured. There are three basic types of String Indexing Systems:
1. Phrases in ordinary language, e.g. from titles of documents: KWIC, KWOC, PANDEX, PERMUTERM, ASI, KWPSI and Double-E-KWIC
2. Simple lists of terms or keywords: CLASE, ABC Spindex, TABLEDEX, SLIC, MULTITERM
3. String Indexing Systems with coded input strings: PRECIS, POPSI, CASIN, Statement Indexing, Automatic Library Catalog Displays, KWIDR, NEPHIS, LIPHIS, NETPAD, CIFT, PERMDEX, PASI and NILS.
Some indexing methods similar to string indexing are Manual Title Catchword systems, Manual Cross-Indexing systems, the Systematic Indexing unit card system initiated by the Library of Congress, UDC, Chain Indexing and the Universal Index Entry Generator of Keen.

3.2 Coordinate Indexing
The period from 1939-1948 witnessed a revolution in the indexing and retrieval of documents. Machine or mechanical aids to information storage and retrieval (IS&R) were introduced and widely used. This revolution was led by Batten and Cordonnier in Europe and Mooers and Taube in the USA.
Batten devised a special system for the retrieval of patents, which was based on a
hierarchical scheme of Classification. At the same time, Cordonnier in France developed a retrieval tool called the 'Selecto' system for the coordination of classes. This system was widely used in the USA and Europe and came to be known as the Optical Coincidence or Peek-a-Boo system. Mooers's Zatocoding system shared with the Batten cards the ability to coordinate classes freely. But the real impetus was given by Taube, founder of Documentation Incorporated (Studies on Coordinate Indexing; 1953-1959). Taube's Uniterm system dates from 1951. The indexing systems devised by Batten, Cordonnier, Mooers and Taube were more flexible than traditional classified and alphabetical subject catalogs, since they circumvented the dependence upon a linear sequence of terms to express a relationship between classes. Relationships here were thus based on combination, rather than on the permutation used by traditional catalogs. This type of system was referred to as a Coordinate Indexing system. Today's Coordinate Indexing systems gradually evolved from this early concept. No longer based on single-word terms (two or more words may be used to denote a term or a concept), they may be regarded as pre-coordinate systems where the terms are post-coordinated at the time of searching. Problems of redundancy and inaccuracy in the indexing of technical information were addressed in one of the earliest papers, by Loukopoulos (1966). A detailed comparative evaluation of the efficiency of Indexing systems was carried out by Cleverdon (1962) under the famous Aslib-Cranfield experiments. Caroll and Roeloffs (1969), Earl (1970), Cleverdon (1966), Litovsky (1969) and Salton (1967, 1970) experimented with the design of automatic indexing systems. Landry (1971) and Foskett (1972) addressed theoretical problems in Indexing. Green (1972) dealt with the problem of indexing scientific and technical literature belonging to different disciplines.
Bhattacharya (1974) and Braun & Schwind (1976) advocated the use of natural language based indexing. Craven (1977) described the NEsted PHrase Indexing System (NEPHIS) and the LInked PHrase Indexing System (LIPHIS), which were based on input strings of terms with additional codes for software. Farradane (1977) made a comparative study of another indexing system similar
to NEPHIS and LIPHIS called the Pragmatic Approach to Subject Indexing (PASI), and Brookes (1986) elaborated on Farradane's Relational Indexing. A method towards establishing homogeneity in different Indexing languages was described by Dahlberg (1981). A cognitive approach to achieving consistency in indexing was detailed by David & Giroux (1995). Fugmann (1997) provided a complementary approach to overcoming the differences between database indexing and book indexing. A comparative study of the selection and representation of concepts of identical documents in two different databases was undertaken by Iivonen & Kivimaki (1998). As far as the indexing of web resources is concerned, a number of studies highlighted the emerging trend of using a combination of manual and automatic approaches. Earle & Berry (1996) showed how online indexes could be designed with a flat structure in which each index entry is clearly worded and makes use of keywords from the subject matter. Srinivasan & Ruiz (1996) proposed a model for Indexing web resources which advocated applying key features of the Indexing strategies already in use. Ranganathan's Chain Indexing and its use in LISA were demonstrated by Kumar & Parameswaran (1998). Casey (1999) hypothesized the need for creating analytical indexes to access Internet resources. Ross (2000) reported the way in which the Indexing process had changed with technological advances. Toth (2000) examined three modes which affect the practice of cataloging and indexing online resources on the Internet. At the ASIS annual conference in 1996, Weinberg (1996) detailed the developments in Knowledge Representation and Knowledge Organization of alphabetical Indexing systems and their adoption, particularly in the USA. Pre-coordinate and post-coordinate approaches to indexing and their historical development were described by Miller & Teitelbaum (2002). Losee (2004) suggested rules that could be used to determine the number and length of
subject headings, assigned to a document. Williams (2010) traced the origins of title derivative automatic indexing and the contributions of Peter Luhn and Herbert M. Ohlman. 4. Thesauri as Vocabulary Control Tools The need for a "Controlled vocabulary" was strongly felt due to the semantic pitfalls of using natural language terms during machine translation, mostly at the time of searching through computerized indexing systems. Thesauri are coordinate indexing tools wherein terms are post coordinated at the time of searching using Boolean operators (AND, OR, NOT) to combine different terms in a search strategy. A thesaurus is called a controlled vocabulary tool as only a single term is used as a valid and unique descriptor and other terms related to it are placed under it. A common thesaurus word block thus comprises the Descriptor (the preferred term) and other terms which are broader in scope, narrower in scope or are semantically related to it. Thus some form of control is exercised in the organization and management of natural language both during indexing and retrieval of information. Two studies that initiated the development of thesaurus for Classification and Indexing were by Joyce & Needham (1958) and Masterman et al (1959). Prior to that, the Cambridge Language Research Unit (UK) was considering the use of a thesaurus, such as Roget's to overcome the semantic pitfalls of natural language in machine translation. The analogy between machine translation and retrieval was established by Masterman (1975). The ASTIA thesaurus, US Armed Services Technical Information Agency (ASTIA) (1962) was published with 70,000 descriptors. With the publication of the Thesaurus of Engineering and Scientific Terms (TEST), Engineers Joint Council (1964) followed by the enlarged version in 1968, Engineers Joint Council (1968), the foundation was laid for thesaurus building. 
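The thesaurus word block and the post-coordination of terms with Boolean operators, described at the start of this section, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the word blocks, the "used for" entry terms (a common thesaurus convention assumed here, not named in the text above) and the postings of document numbers are all hypothetical, and Python set operations stand in for the AND, OR and NOT of a real retrieval system.

```python
from dataclasses import dataclass, field


@dataclass
class WordBlock:
    descriptor: str                              # the preferred term
    used_for: set = field(default_factory=set)   # non-preferred entry terms (assumed convention)
    broader: set = field(default_factory=set)    # terms broader in scope
    narrower: set = field(default_factory=set)   # terms narrower in scope
    related: set = field(default_factory=set)    # semantically related terms


# Hypothetical word blocks and postings, invented for illustration only.
BLOCKS = [
    WordBlock("Classification", used_for={"Classing"},
              narrower={"Faceted classification"}, related={"Indexing"}),
    WordBlock("Indexing", used_for={"Subject indexing"}, related={"Classification"}),
    WordBlock("Thesauri", used_for={"Thesaurus"}, related={"Indexing"}),
]
POSTINGS = {
    "Classification": {1, 2, 5},
    "Indexing": {2, 3},
    "Thesauri": {3, 4, 5},
}

# Lead-in vocabulary: every entry term (preferred or not) points to one descriptor.
LEAD_IN = {}
for block in BLOCKS:
    LEAD_IN[block.descriptor.lower()] = block.descriptor
    for term in block.used_for:
        LEAD_IN[term.lower()] = block.descriptor


def docs(term):
    """Return the document numbers posted under the descriptor that controls `term`."""
    descriptor = LEAD_IN[term.lower()]  # raises KeyError for an uncontrolled term
    return set(POSTINGS.get(descriptor, set()))


if __name__ == "__main__":
    print(docs("Classing") & docs("Indexing"))        # AND
    print(docs("Classification") | docs("Thesauri"))  # OR
    print(docs("Classification") - docs("Thesauri"))  # NOT
```

The point of the sketch is that vocabulary control happens at entry (a searcher's term such as "Classing" is mapped to the single valid descriptor), while the coordination of descriptors is deferred until the time of searching.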
Davis's (1968) pioneering study merged a Classification scheme with a vocabulary to develop a thesaurus. Several studies, guidelines and standards evolved for developing new thesauri. Rostron (1968), Wall (1969), Rogers (1972), Friedman (1972) and
Lancaster (1972) advocated different methods for vocabulary building and control. Jachowicz (1979) provided a classificatory approach to thesaurus construction. Gopinath (1987) demonstrated a symbiosis between a Classification scheme and a thesaurus, and later (Gopinath, 1999) brought out the correlation between shifts in knowledge and the schedules of Classification, taking an example from the field of Aviation Engineering. Dykstra (1988) used the Library of Congress Subject Headings (LCSH) for developing a thesaurus. Wolff-Terroine et al. (1969), Salton (1972) and Devadason & Balasubramanian (1981) experimented with automatic thesaurus construction. Reconciling two or more thesauri was another topic of research. Angell (1968), Neville (1970), Wall & Barnes (1969), Aitchison (1981) and Rada (1987) worked on different aspects of compatibility between thesauri. Freeman (1976) was one of the early researchers to work on thesaurus construction in a diffuse subject area. The structure and functional relationships in thesauri were of special interest to information professionals and researchers. Willetts (1975) and Seetharama (1976) explored the term-concept relationships in a thesaurus. A variation of the thesaurus was the Classaurus, a faceted hierarchical scheme of terms with vocabulary control features. Devadason (1985) has worked on this area in some detail. Computer technology witnessed rapid developments in the generation and use of both CD-ROM and online databases in the 1990s. Kristensen (1993) advocated a novel method of online searching by merging free-text terms from users' query statements and a search-aid thesaurus. Use of conventional database technology in thesaurus construction was demonstrated by Jones (1993). Craven (1993) experimented with Artificial Intelligence techniques for making a thesaurus more dynamic. A statistical method (Bayesian networks) for thesaurus construction was reported by Park & Choi (1996).
Facet analysis was used as a basis for thesaurus construction by Spiteri (1999). A new method for computing a thesaurus from a text corpus was demonstrated by Schutze & Pederson (1997),
using co-occurrence information (where words were defined to be similar if they had similar co-occurrence patterns). Diaz & Velasco (1998) described a specific application of domain analysis method for generating a thesaurus. Enhancing information retrieval using thesauri was another area of R&D. Lee (1994), Takeda (1994), Mazur (1994), Pollitt & Ellis (1994) worked in this field. Chen & Martinez (1998) performed a large scale experiment using automatic indexing, co-occurrence analysis and parallel computing to remove uncertainties in online information retrieval. A mapping experiment between a thesaurus and a subject heading list was reported by Chaplan (1995). A comparative study between library Classification schemes and thesauri was undertaken by Weinberg (1995). Mandala & Tokunaga (2000) proposed a method of expanding queries using heterogeneous thesauri to enhance retrieval. In the current Internet environment, thesauri continue to play a dominant role in Knowledge Organization and Retrieval. De la Rosa (1999) reviewed terminology tools available on the Internet and the problems encountered in using them. According to de la Rosa, there was a lack of a proper format for online thesauri and the author advocated the use of XML and its specifications to overcome these problems. In one of the recent studies, Neelameghan (2009) described briefly the selection of terms for the design and development of a thesaurus for the special subject tuberculosis. An interesting study by Mazzocchi (2009) discussed the problems of polysemy in thesaurus construction. Based on the Information Coding Classification (ICC), which is a theory-based, faceted universal classification system of knowledge fields, Dahlberg (2012) proposed a new lexicon of all knowledge fields in the German language with the terms of the fields in English. Obsolescence of subject descriptors due to various reasons is a major problem in organizing knowledge. 
Buckland (2012) examined the causes of this obsolescence taking examples from literature.
4.1 Standardization Efforts in Thesauri and Vocabulary Control Tools

The following international organizations have played significant roles in international cooperation on vocabulary control.

4.1.1 Infoterm

The International Information Centre for Terminology was founded by Unesco in 1971 with the objective of supporting and coordinating international cooperation in the field of terminology. Initially, it worked with the Austrian Standards Institute, but later, through Unesco's support, it created an international network for institutions and organizations that were interested in terminology development. Thus, TermNet, the International Network on Terminology, an independent international non-profit organization, was founded in 1988. In 1997, the European Commission asked Infoterm to set up a European network for terminology information and documentation centres, involving 15 organizations. One of the major activities of Infoterm is supporting the Secretariat of ISO/TC 37 on Terminology and other language and content resources, and standardization activities in general.

4.1.2 International Society for Knowledge Organization (ISKO)

Founded in 1989, it is the leading international society for organization of knowledge. ISKO's mission is to advance conceptual work in knowledge organization in all kinds of forms and for all kinds of purposes, such as databases, libraries, dictionaries and the Internet. As an interdisciplinary society, ISKO brings together professionals from different fields. Its 500+ members are from the fields of Information Science, Philosophy, Linguistics and Computer Science, as well as from medical informatics.
ISKO's main activities include: organizing International Conferences every two years and national and regional conferences on special topics; publication of the journal "Knowledge Organization" (KO) (formerly "International Classification"); bringing out a newsletter, ISKO News (now part of KO); and bringing out the series Advances in Knowledge Organization (AKO) and another series, Knowledge Organization in Subject Areas (KOSA).
5. Computerization, Standardization of Metadata and Universal Bibliographic Control

The International Conference on Cataloguing Principles (ICCP), organized by IFLA in Paris in 1961, was a landmark in the standardization of the bibliographic record format. It set standards and guidelines for headings of author and title records in catalogs and bibliographies. In 1969, in Copenhagen, IFLA again convened an International Meeting of Cataloguing Experts (IMCE) to discuss bibliographic standards. Any kind of cooperation in the field of information handling and processing necessitates standardization of bibliographic records. Unisist's PGI (General Programme of Information) works closely with the ISO in this area. It is responsible for developing international standards for application in all areas of information activity, thus providing the tools necessary for the establishment of compatible information services. Its "Unisist guide to standards for information handling" (Unesco, 1973) is a de facto guidebook even today. Standards for monographs, ISBD(M), were published in 1974, followed by ISBD(S) for serials, ISBD(G) for general materials, ISBD(CM) for cartographic materials and ISBD(NBM) for non-book materials. The Library of Congress was the first to experiment with the MAchine Readable Catalogue (MARC) record format for exchange of bibliographic information, in 1966. The second version, MARC-II, was an improvement over the first one. The BNB, in cooperation with Aslib, OSTI (Office of Scientific and Technical Information, USA) and the Library of Congress, developed a MARC format of its own for the specific requirements of libraries in the UK. Thus the UKMARC manual first appeared in 1975. It was later revised to incorporate developments in AACR2, the use of non-book materials and the advent of BLAISE.
Several countries and regions, such as Australia, Canada, Denmark, France, Germany, Italy, Japan, Latin America, Norway and Sweden, developed their own MARC formats for bibliographic records. The problems faced by various libraries in the implementation of MARC were due to variations in linguistic, cultural and subject control
approaches. Therefore, the MARC International Format (MIF) was developed by IFLA in 1973. It was based on ISBD, and UNIMARC was developed and published in 1977. Work on the establishment of a Common Communication Format (CCF) started in 1978, when the need to reconcile and standardize different formats became urgent. Libraries were unsure whether to adopt the Unisist Reference Manual (Unesco, 1981) guidelines or the MARC formats for computerized bibliographic description. The two were not compatible with each other. It was to overcome these problems that the CCF was developed, based on ISO 2709, for the structure of a bibliographic record. Its structure, content designators and data elements have been designed for communication between two or more information systems. The first edition of the CCF was published in 1984 and the second in 1988 (Simmons & Hopkinson, 1988). Another format, the MIBIS (Microcomputer-based Bibliographic Information System) structure, was proposed by the IDRC (International Development Research Centre, Canada) as a tool for libraries and documentation centres setting out to computerize their bibliographic information systems. Its record structure is compatible with the 2nd edition of CCF. The ABNCD is a design approach to create an integrated database, devised by Neelameghan and his associates at the Documentation Research and Training Centre (DRTC), Bengaluru. It is compatible with the CDS/ISIS software of Unesco. Various applications of MARC, UNIMARC and CCF in traditional information systems as well as in the Internet environment have been discussed by Chowdhury (1996), Smet & Nieuwenhuysen (1997), Curwen (1997), Stoecker & Alford (1998) and Fleck & Rust (1998). In 1998, IFLA commissioned yet another format for bibliographic description, called FRBR (Functional Requirements for Bibliographic Records).
The aim of FRBR (2002) was to produce a framework that would provide a clear, precisely stated and commonly shared understanding of what it is that the bibliographic record aims to provide information about, and what it is that one expects the record to achieve in terms of answering users' needs. The Entity-Relationship methodology was used for the FRBR model (Zumer & Riesthuis, 2002).
5.1 Information handling and standardization in the digital environment

With the rapid advancement of technology and the explosion of digital information in the last decade, the importance of interoperable IR networks, to find and exchange quality information electronically, was realized. Various online services and the WWW have made it possible to access information in new ways. To facilitate this IR across diverse collections of data sources, a non-proprietary, standards-based communications protocol for IR, which is independent of database and computer environment, is essential. The following review gives an outline of the historical aspects of standardization of digital information using various metadata tools. Standardization of digital information gains utmost importance considering the fact that, apart from textual data, images, audio and video data also need to be addressed.

5.2 Internet and Related Developments

The beginnings of the Internet can be traced back to a project sponsored by the US Defense Advanced Research Projects Agency (DARPA) in 1969, to enable researchers and defense contractors to share information. The World Wide Web was created in 1990 by Tim Berners-Lee, a computer programmer working for CERN (European Organization for Nuclear Research, Geneva). Prior to the World Wide Web (WWW), also called the Web, accessing files on the Internet was a challenging task, requiring specialized knowledge and skills. The Web (comprising tools like HTTP, HTML and URL, also created by Berners-Lee) made it easy to retrieve a wide variety of digital files using the hypertext linking facility. The Internet has grown and continues to grow at a tremendous pace.

5.3 Development of Search Engines and Metadata Tools

Although sophisticated search and Information Retrieval (IR) techniques date back to the late 1950s and early 60s, these techniques were used for closed systems.
The early Internet search and retrieval tools lacked even the most basic capabilities mainly because it was
thought that traditional IR techniques would not work on an open unstructured information base like the Internet. Accessing a file on the Internet was through a programme called FTP (File Transfer Protocol). Files on FTP servers were organized in hierarchical directories, similar to the files on personal computers. This structure made it easy for the FTP server to display a listing of all the files stored on the server. The first servers were located at CERN, in Geneva. Initially, navigating the web was difficult as it lacked a cohesive, uniform structure. The early search engines extensively used simple programmes called Web Crawlers to gather new links to web pages and add them to their lists. From 2000 onwards, search engines began appearing at a rapid pace. But all of them had the same problem. They were designed only to find and index Web documents and to point users to the most relevant documents in response to keyword queries. This situation was feasible in the earlier years, when most web content consisted of HTML pages. But the Internet continued to evolve, with information being made available in many formats other than simple text documents, and the search engines appeared to be inadequate for the purpose of search and effective retrieval. This chaotic situation led to the development of the Semantic Web initiative by the World Wide Web Consortium (W3C), the Internet body responsible for developing standards to realize the full potential of the Web. Table 1 shows the timeline of the development of search engines.
Table 1: Growth of Internet search engines
Search Service
Vannevar Bush proposes "MEMEX"
Hypertext coined by Ted Nelson
Dialog: first commercial proprietary system
OWL guide hypermedia browser
Archie for FTP search; Tim Berners-Lee creates the Web
Gopher: WAIS distributed search
ALIWEB (Archie Linking), WWWWorm
WebCrawler, Lycos, Yahoo
Infoseek, Altavista, Excite
HotBot, LookSmart
Hundreds of search engines and tools
Source: Sherman, C. and Price, G.: The invisible web; uncovering information search engines can't see. Medford, Information Today Inc., 2003, p.15
5.4 Metadata Tools

Metadata is a critical component in the context of Knowledge Representation in digital libraries and also in the pre-web libraries. Metadata describes the attributes of a resource, where the resource may consist of bibliographic objects (as represented by MARC and other metadata formats), archival inventories and registers, geospatial objects etc. While metadata formats differ in their levels of specificity, structure and maturity, their primary objective is to describe, identify and define a resource with regard to access patterns and filtering, terms and conditions for use, authentication and evaluation, preservation and interoperability.
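As a rough illustration of metadata describing the attributes of a resource, the sketch below builds a small record and serializes it as XML. It assumes the Dublin Core element set (a format discussed later in this section) and its published element namespace. The wrapper element name `record`, the helper `build_record` and the field values are hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields: dict[str, str]) -> ET.Element:
    """Wrap attribute/value pairs in Dublin Core elements.

    The enclosing <record> element is an assumption of this sketch,
    not part of any standard.
    """
    root = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Illustrative values only.
record = build_record({
    "title": "Knowledge Organization Research: An Overview",
    "creator": "Deokattey, S.",
    "type": "Text",
    "language": "en",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Each element answers one descriptive question about the resource (what it is called, who created it, what kind of object it is), which is the role the paragraph above assigns to metadata regardless of the particular format used.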
One of the first metadata structures created by the library community is the humble catalogue card. Other computer metadata formats such as ISBDs have already been covered earlier in this review. The MARC format developed by the Library of Congress was a highly structured and semantically rich metadata structure. Another well-known metadata project is the Dublin Core Metadata Initiative (DCMI) (1991). Since the early 1990s, there has been a tremendous growth of Internet resources. To describe these adequately, several schemas (Ercegovac, 1999) have been developed and used. But the general trend among all these metadata formats has been overlapping attributes and also differences in the implementation of various datasets. Another development has been the Dublin Core Resource Description Framework (DCRDF) or simply DC. It is used to describe a variety of resources on the Internet for the purpose of resource communication. The W3C has helped develop the Resource Description Framework (RDF) specifications. The RDF is an enabling technology for Resource Description, Content Authoring and General-Purpose cooperative cataloguing. Another well-known initiative of the W3C is the development of XML. The DARPA Agent Metadata Language (DAML) and the Ontology Inference Layer (OIL) are together used for metadata encoding and manipulation. Several libraries, mostly in Europe and America, also use OCLC Connexion (which is compatible with RDF) for creation and sharing of metadata. The underlying infrastructure for most of the pre-Internet era metadata has been Standard Generalized Markup Language (SGML) and the eXtensible Markup Language (XML).

5.5 Standardization efforts in the area of vocabulary control

The Semantic Web, as envisioned by Berners-Lee, is an extension of the current web, in which the meaning of information is clearly and explicitly linked from the information itself, enabling computers and people to work in cooperation. The Semantic Web is a place where
strongly controlled or centralized metadata vocabularies can flourish alongside special-purpose vocabularies. The Semantic Web technology supports free intermingling of vocabularies. It also includes instructions for processing data in specific ways, using the same technology. Metadata permeates each layer of the Semantic Web architecture of Berners-Lee. Therefore, metadata vocabularies, including Ontologies, need to be further developed to realize the full potential of the Semantic Web.

6. The Growth and Development of Ontologies

Ontology (ontos = being and logos = study) means "the study of being". It is the theory of objects and their ties. Traditionally, Ontology as a subject was the focus of Philosophers and Logicians, who used the term to denote the study of what is, i.e. what exists, the kinds and structures of objects, properties and other aspects of reality of the universe. Ontology is the first part that actually belongs to Metaphysics. It is a pure doctrine of elements of all our a priori cognitions; or it contains the summation of all our pure concepts that we can have a priori of things. In contemporary philosophy, formal Ontology has been developed in two principal ways. The first approach has been to study formal Ontology as a part of Ontology and to analyze it using the tools and approach of formal logic. The second line of development analyzes the fundamental categories of object, state of affairs, part-whole etc., as well as the relations between the part and the whole and their laws of dependence. Modern Ontology, as a tool, is a fusion of both approaches.

6.1 An Artificial Intelligence (AI) Perspective

In the field of AI, Ontologies have been used in Problem Solving Methods (PSM) and in Knowledge Based Systems (KBS) from the 1990s. According to Chandrasekaran (1999), the term Ontology has largely come to mean one of two related things. First of all, Ontology is a representation vocabulary, often specialized to some domain or subject matter.
More
precisely, it is not the vocabulary as such that qualifies as an Ontology, but the conceptualizations that the terms in the vocabulary are intended to capture. Ontologies are important from two points of view.

1. Ontological analysis clarifies the structure of knowledge. Given a domain, its Ontology forms the heart of any system of knowledge representation for that domain. Without Ontologies, or the conceptualizations that underlie knowledge, there cannot be a vocabulary for representing knowledge. Thus, the first step in developing an effective KR system and vocabulary is to perform an effective Ontological analysis of the field or domain.

2. Ontologies enable knowledge sharing. In order to build a Knowledge Representation language based on analysis, one needs to associate terms with the concepts and relations in the Ontology and devise a syntax for encoding knowledge in terms of concepts and relations. This KR language can then be shared with others having similar needs.

Hobbs (1995) proposed a general structure for a different underlying conceptualization of the world; one that would be particularly well-suited to language, as opposed to philosophical Ontology, which is independent of language. Reynaud & Tort (1997), Heijst & Schreiber (1997) and Gomez-Perez & Benjamins (1999) developed Ontologies for PSMs. O'Leary (1997, 1998) highlighted the problems in using Ontologies for KBSs and discussed the role of Ontology in knowledge bases and knowledge management. An entire special issue of IEEE Intelligent Systems (Jan./Feb. 1999) was devoted to Ontologies. Another special issue on Ontologies, of the International Journal of Human-Computer Studies (Feb/March 1997), showed the increasing interest in Ontologies on the part of AI researchers.
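The idea of associating terms with concepts and relations, and of devising a syntax for encoding knowledge in those terms, can be sketched very simply. The example below assumes a triple syntax (subject, relation, object). The concept names, the relation names `is_a` and `has_part`, and both helper functions are hypothetical choices for illustration, not a method proposed by any of the authors cited here.

```python
# A toy domain ontology encoded as (subject, relation, object) triples.
# All concept and relation names are illustrative assumptions.
TRIPLES = {
    ("FastReactor", "is_a", "Reactor"),
    ("Reactor", "is_a", "Facility"),
    ("Reactor", "has_part", "FuelElement"),
}


def ancestors(concept: str) -> set[str]:
    """Return every concept reachable from `concept` through is_a links."""
    found: set[str] = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in TRIPLES:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def parts_of(concept: str) -> set[str]:
    """Parts declared on the concept itself or inherited from its ancestors."""
    scope = {concept} | ancestors(concept)
    return {
        obj
        for subject, relation, obj in TRIPLES
        if relation == "has_part" and subject in scope
    }
```

Because the knowledge is held in one shared syntax, a second system that adopts the same terms and relations could reuse the triples directly, which is the knowledge-sharing point made above. Here, for instance, `parts_of("FastReactor")` picks up `FuelElement` even though that part is only declared on `Reactor`.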
6.2 Ontological methodology

Two pioneering papers described how to develop and build Ontologies. One was by Guarino (1997) and the other by Noy & McGuinness (2001). A comparative review of the state-of-the-art in Ontology design was described by Noy & Hafner (1997). The use of specific tools and services to develop collaborative Ontologies was reported by Farquhar & Fikes (1997) in their study on the "Ontolingua" server. Borst & Akkermans (1997) described a formal Ontology called PHYSSYS in the domain of Engineering. Visser & Bench-Capon (1998) made a comparative study of four Ontologies in the field of Law. Valente & Russ (1999) presented a case study in building and reusing an Ontology in the field of air campaign planning. Lopez & Gomez-Perez (1999) gave guidelines for developing a chemical Ontology using two Ontology building tools: MethOntology and Ontology DEsign (ODE). Duineveld & Stoter (2000) compared various Ontology tools available on the Internet in the field of Engineering. Holsapple & Joshi (2002) provided different approaches, such as inspirational, inductive, deductive, synthetic and collaborative, in the design of Ontologies. Everett et al. (2002) described means to resolve issues of synonymy through the use of natural language in designing new Ontologies. Kohler et al. (2006) paved the way in bridging the gap between an HTML-based system and an RDF-based system, by linking words in texts to concepts in Ontologies. Jung-Min Kim et al. (2007) detailed the development of a methodology for an Ontology management system, based on Philosophical texts. Dahab et al. (2007) described an automatic Ontology construction method from natural language English text.

6.3 Ontologies: Knowledge Organization Perspective

Advances in Internet-based information services have precipitated the need to organize information in a more effective way.
This is more so in the present digital information explosion era, where interoperability among various systems is essential for information exchange. A number of research methods and experiments are currently under way to create a semantic web; a semantic interlinking of all the information on the web. These methods include the use of free text, i.e. "keywords" taken from the text, clustering methods based on
statistical co-occurrence of words, linguistic methods using semantic clustering and Neural Network methods for browsing and searching the web. The growth of interdisciplinary subject domains has added to the problem of effective organization of knowledge. Ontologies combine elements from traditional library classification schemes and indexing tools. But their full potential and function can only be realized in a semantic web environment. An Ontology is not simply a conceptual framework but a concrete syntactic structure that models the semantics of a domain (the conceptual framework) in a machine-understandable language. An Ontology offers a concise and systematic means for defining the semantics of web resources. Therefore, in the context of digital information management, Ontologies would be increasingly utilized. Dahlberg (1978) was one of the first Knowledge Organization professionals to identify the link between Classification structure and Ontology. In her "Ontical structures and universal Classification", she described the Ontological foundations of modern Classification systems. Walker (1981) identified four criteria on the basis of which a relationship between Classification theory, Cognitive Science and Artificial Intelligence can be established. A modified version of the same table (Deokattey et al., 2010) displays how this relationship can be extended to ontologies too. Gopinath (1999) and his colleagues at DRTC reinforced and corroborated her theories. Much later, Hjorland & Hartel (2003) delved into the Ontological, epistemological and sociological factors affecting a domain of knowledge. According to them, all domains are dynamic, and any KO tool should be able to reflect the constant changes in any domain and incorporate them in the new, ever-changing structure of knowledge. An experiment to convert a controlled vocabulary into an Ontology was reported by Qin & Paling (2001).
They used the controlled vocabulary of ERIC descriptors to develop a Domain Ontology on Education. According to them, "the major difference between the two models, lies in the values added through deeper semantics, in describing digital objects, both
conceptually and relationally". At the 7th International ISKO conference, in KR&O, the second session focused on epistemological foundations for knowledge structures and analysis. Silva & Rocha (2002) suggested an alignment process at the Ontological level for merging Ontologies. At the same conference, Negrini & Zozi (2002) focused on the way Ontological structures can aid in the understanding and modeling of works of Art. The Networked Knowledge Organization Systems (NKOS) group has held a series of workshops, in conjunction with the Digital Libraries Conference and the ACM+IEEE joint conference on digital libraries, since 2001. At their 6th workshop (Mai, 2002), "Building a meaningful web from traditional KO systems to new semantic tools", all the seven presentations focused on how traditional systems for KO can be transformed into Ontologies. In another study, Gnoli & Poli (2004) investigated the meaning of Ontology as a model for KO. Ding (2001) reviewed the importance of Ontologies in the development of the Semantic Web. He discussed the definition of Ontologies, kinds of Ontologies, Ontology tools, Ontology language and some important Ontology projects. Ding & Foo (2002), in another study, presented a two-part review. In the first part of the review, state-of-the-art techniques on semi-automatic and automatic Ontology generation were detailed. The second part of the review (Ding & Foo, 2002) dealt with Ontology mapping and evolving. Kim (2002) and McGuinness (2005) summarized their comments on the development of Ontologies and the web's growing dependence on them. As far as methodologies for developing Ontologies are concerned, an important study by Poli (2002) highlighted Ontological sub-theories and the use of domain analysis for developing an Ontology. Ironically, this methodology in the field of AI utilizes domain analysis, an integral part of KO.
Similarly, Prieto-Diaz (2003) also used a domain analysis and a faceted approach to build Ontologies with a software tool called `DARE'. Most of the current Ontological projects use readily-available Ontology tools for developing new Ontologies. Charlet et al. (2006) describe a methodology to build a medical Ontology from textual reports, using a natural language processing tool; the SYNTEX software. Sanchez & Moreno
(2008) describe an automatic and unsupervised methodology that addresses the nontaxonomic learning process for constructing domain Ontologies. Deokattey et al. (2010) discussed the basic differences between a thesaurus and a domain ontology and explored a web-based approach to developing a domain ontology in a multidisciplinary subject area. Hilera (2010) described a method to generate ontologies from glossaries of terms taking an example from the "IEEE Standard Glossary of Software Engineering Terminology". Kim (2011) detailed a methodology for semi-automatically generating domain ontologies from information extracted from the Web. Saab (2011) describes the ontologization of information using phenomenological analysis, to distinguish between data, information and knowledge and the underlying ontological linkages between them. Downey (2012) proposed a methodology to develop a taxonomy visualization framework for organizing web content. Concept modeling is an integral part of Knowledge Organization. The basic meaning of "Concept" was elucidated by Marradi (2012) in an interesting article. Deokattey & Bhanumurthy (2013) described a method for visualization of a domain using concept maps. Conceptualization is the most important aspect in developing a domain ontology and concept maps help visualize the relationships among different concepts, particularly in the case of interdisciplinary subject domains. A recent development on the Web is the concept of Folksonomy, Social tagging or keywords for content management. It is a collaborative effort in developing metadata tags and categorizing content. In contrast to traditional subject indexing where information specialists decide both metadata and vocabulary, a Folksonomy can be freely generated and used by the consumers of the web content. This is a healthy trend as far as general information on the Web is concerned. Users can tag the web documents of their choice for future retrieval and store them. 
They may also modify the tags and the vocabulary as and when required using interactive Web 2.0 technologies. But such efforts are more suited for content management at individual or group level than for serious scholarly pursuits. Folksonomies developed by web users do not address various semantic issues which affect their retrieval
function. To overcome these limitations, elements from an ontology in combination with user generated tags offer a better solution (Folksontologies). Lezcano (2012) described the implementation of such an integration combining Delicious and the OpenCyc knowledge base.

7. Conclusion

Information is characterized by its inherent potential for perpetual growth. It is a limitless resource unlike other resources. Information has been growing at an ever increasing rate, volume and also in complexity, especially in the field of Science and Technology. Libraries and Information Centres have been in a state of continuous flux, because of tremendous developments both in the physical manifestations of information content (through Papyrus, paper, floppies, microforms, CD-ROMs, digital resources and now the Internet) as well as in the complexities of contents of documents due to the evolution of new overlapping interdisciplinary and multidisciplinary subjects. Standardization at the syntactic level has been achieved to a great extent. Continuous efforts are going on at various national and international institutions and organizations to evolve better strategies for organizing knowledge at the semantic level. Information and Communication Technologies, Human Machine Interactions, Artificial Intelligence, Computational Linguistics and Cognitive Psychology have been impacting KO and will continue to impact the organization and retrieval of all forms of Explicit Knowledge. Ontology, in its philosophical meaning, is the discipline investigating the framework of reality. This reality is structured into a series of integrative levels, which in turn, forms the basis for representing knowledge and developing new models of Knowledge Organization. Therefore, future Ontological methods will be shaped by this philosophical perception and the subsequent semantic representation of various levels of reality.
KO&R in the domain of Information Science is rooted in Philosophical foundation and can thus provide a sound theoretical basis for representing this reality. The answers to the development of intelligent KO&R tools in the near future may lie at the intersection of all these domains. 30
References

Aitchison, J. 1981. Integration of thesauri in the Social Sciences. International Classification, 8(2), 75-85.
Angell, R.S. 1968. Compatibility in subject access vocabularies; the role of relations between index terms. Washington, Library of Congress.
Arntz, H., Gietz, R.A., Brown, K.R., Lazar, P., Afremov, V. Yu et al. 1981. 85 years of FID. International Forum on Information and Documentation, 6(3), Complete Special Issue.
Austin, D. 1996. Prospects for a new General Classification scheme. Journal of Librarianship, 1(3), 149-168.
Beghtol, C. 1986a. Bibliographic Classification theory and text linguistics; aboutness analysis, intertextuality and the cognitive act of classifying documents. Journal of Documentation, 42(2), 84-113.
Beghtol, C. 1986b. Semantic validity; concepts of warrant in Bibliographic Classification systems. Library Resources and Technical Services, 30(2), 109-125.
Beghtol, C. 1998. Knowledge domains: multidisciplinarity and Bibliographic Classification systems. Knowledge Organization, 25(1 & 2), 1-12.
Bhattacharya, K. 1974. The effectiveness of natural language in Science indexing and retrieval. Journal of Documentation, 30(3), 235-254.
Bianchini, C. 2012. Colon Classification and Nuovo Soggettario: The case of the Library of the Natural History Museum of Udine, Italy. Knowledge Organization, 39(1), 23-28.
Borden, G.A. and Nelson, W.F. 1969. Towards a viable Classification scheme. American Documentation, 20(4), 298-301.
Borst, P. and Akkermans, H. 1997. Engineering Ontologies. International Journal of Human-Computer Studies, 46(2/3), 365-406.
Braun, S. and Schwind, C. 1976. Automatic semantics-based Indexing of natural language texts for IR systems. Information Processing and Management, 12(2), 147-153.
Brookes, B.C. 1986. Jason Farradane and Relational Indexing. Journal of Information Science, 12(1 & 2), 15-18.
Broughton, V. and Lane, H. 2000. Classification schemes revisited: applications to Web Indexing and searching. Journal of Internet Cataloging, 2(3 & 4), 143-155.
Buckland, M.K. 2012. Obsolescence in subject description. Journal of Documentation, 68(2), 154-161.
Caroll, J.M. and Roeloffs, R. 1969. Computer selection of keywords using word frequency analysis. American Documentation, 20(3), 227-233.
Casey, C. 1999. An analytical index to the Internet: dreams of Utopia. College and Research Libraries, 60(6), 586-595.
Chandrasekaran, B. and Josephson, J.R. 1999, Feb. What are Ontologies and why do we need them? IEEE Intelligent Systems, 20-25.
Chaplan, M.A. 1995. Mapping Laborline Thesaurus terms to Library of Congress Subject Headings: implications for vocabulary switching. Library Quarterly, 65(1), 39-61.
Charlet, J., Bachimont, B. and Jaulent, M-C. 2006. Building medical ontologies by terminology extraction from texts: An experiment for the intensive care units. Computers in Biology and Medicine, 36(7 & 8), 857-870. (Special issue on Medical Ontologies).
Chatterjee, A. 2000. Treatment of complex subjects in Documentary Classification with special reference to CC and BC; a comparative study. Herald of Library Science, 39(1 & 2), 41-49.
Chen, H. and Martinez, J. 1998. Alleviating search uncertainty through concept associations: automatic indexing, co-occurrence analysis, and parallel computing. Journal of the American Society for Information Science, 49(3), 206-216.
Chowdhury, G.G. 1996. Record formats for integrated databases; a review and comparison. Information Development, 12(4), 218-223.
Cleverdon, C.W. 1962. Report on the testing and analysis of an investigation into the Comparative efficiency of indexing systems. College of Aeronautics, Cranfield, UK. (Aslib Cranfield Research Project).
Cleverdon, C.W., Mills, J., and Keen, E.M. 1966. Factors determining the performance of indexing systems. Cranfield-Aslib Research Project.
Coates, E.J. 1960. Subject catalogs. London, Library Association.
Comaromi, J.P. 1986. The DDC in the 21st century. Paper presented at the 43rd FID Congress, Montreal, 1986.
Connaway, L.S. and Sievert, M.C. 1996. Comparison of three Classification systems for information on health insurance. Cataloging and Classification Quarterly, 23(2), 89-104.
Craven, T.C. 1977. NEPHIS: A Nested Phrase Indexing System. Journal of the American Society for Information Science, 28(2), 107-114.
Craven, T.C. 1993. A thesaurus for use in a computer-aided abstracting tool kit. In Bonzi, S. (Ed.). Proceedings of the Annual Meeting of the American Society for Information Science, Columbus, Ohio, 24-28 October. (pp. 178-184). Medford, New Jersey, Learned Information Inc. (for American Society for Information Science).
Curras, E. 1992. Ranganathan's Classification theories under the systems science postulates. Journal of Library and Information Science, 17(1), 45-65.
Curwen, A.G. 1997. UNIMARC and international record exchange; an overview of recent projects and developments. Program, 31(3), 227-238.
Cutter, C.A. 1904. Rules for a dictionary catalog. 4th ed. Washington, Govt. Printing Office.
Dahab, M.Y., Hassan, H.A. and Rafea, A. 2007. Text Onto-Ex; automatic Ontology construction from natural English text. Expert Systems and Applications, 34(2), 1474-1480.
Dahlberg, I. 1974. Principles for the construction of a universal Classification system. In Wojciechowski, J.A. (Ed.). Conceptual basis of the Classification of knowledge, Proceedings of the Ottawa Conference on the Conceptual Basis of the Classification of Knowledge, 1-5 October, 1971. (pp. 450-471). Pullach, Verlag Dokumentation.
Dahlberg, I. 1978. Ontical structures and universal Classification. Bengaluru, Sarada Ranganathan Endowment for Library Science.
Dahlberg, I. 1981. Towards establishment of compatibility between indexing languages. International Classification, 8(2), 86-91.
Dahlberg, I. 2012. A systematic new lexicon of all Knowledge fields based on the Information Coding Classification. Knowledge Organization, 39(2), 142-150.
David, C. and Giroux, L. 1995. Indexing as problem solving: a cognitive approach to consistency. In Olson, H.A. & Ward, D.B. (Eds.) Proceedings of CAIS/ACSI 95, the proceedings of the 23rd Annual Conference of the Canadian Association for Information Science (Association Canadienne des sciences de l'information Travaux), (pp. 79-89). Alberta University, School of Library and Information Studies.
Davis, C.H. 1968. Integrating vocabularies with a Classification scheme. American Documentation, 19(1), 101.
Davis, C.H. 1976. Pragmatic expansion of an Enumerative Classification scheme. Journal of the American Society of Information Science, 27(3), 174-175.
De Fino, M. 2011. OCLC Classify. Technical Services Quarterly, 28(4), 456-457.
De Grolier, E. 1962. A study in general categories applicable to Classification and coding in Documentation. Paris, Unesco.
De Grolier, E. 1967. Synoptique Critique. Information Storage and Retrieval, 3, 385-396.
De la Rosa, A. 1999. Tools for terminology on the WWW: XML. [Instrumentos terminologicos en el www: xml.]. Profesional de la Informacion, 8(10), 14-20, 22-23, 26-27, 30-31, 34-36.
Deokattey, S., Neelameghan, A. and Kumar, V. 2010. A method for developing a Domain Ontology; a case study for a multidisciplinary subject. Knowledge Organization, 37(3), 173-184.
Deokattey, S. and Bhanumurthy, K. 2013. Domain visualization using concept maps; a case study. DESIDOC Journal of Library & Information Technology, 33(4), 295-299.
Devadason, F.J. and Balasubramanian, V. 1981. Computer generation of a thesaurus from structured subject propositions. Information Processing and Management, 17(1), 1-11.
Devadason, F.J. 1985. Online construction of alphabetical Classaurus; a vocabulary control and Indexing tool. Information Processing and Management, 21(1), 26.
Devadason, F.J., Intaraksa, N., Patamawongjariya, P. and Desai, K. 2002. Faceted indexing based system for organizing and accessing internet resources. Knowledge Organization, 29(2), 65-77.
Diaz, I. and Velasco, M. 1998. Semi-automatic construction of thesaurus applying domain analysis techniques. International Forum on Information and Documentation, 23(2), 11-19.
Dilevko, J. 2009. The relevance of Classification theory to textual analysis. Library & Information Science Research, 31(2), 92-100.
Ding, Y. 2001. A review of Ontologies with the Semantic Web in view. Journal of Information Science, 27(6), 377-384.
Ding, Y. and Foo, S. 2002. Ontology research and development; Pt.1: a review of Ontology generation. Journal of Information Science, 28(2), 123-136.
Ding, Y. and Foo, S. 2002. Ontology research and development; Pt.2: a review of Ontology mapping and evolving. Journal of Information Science, 28(5), 375-388.
Dodd, D.G. 1996. Grass-roots Cataloging and Classification: food for thought from World Wide Web subject-oriented hierarchical lists. Library Resources and Technical Services, 40(3), 275-286.
Doria, O.D. 2012. The role of activities awareness in Faceted Classification development. Knowledge Organization, 39(4), 283-291.
Downey, S. 2012. Visualizing a taxonomy for virtual worlds. Journal of Educational Multimedia and Hypermedia, 21(1), 53-69.
Dublin Core Metadata Element Set; Version 1.1, Reference description. DCMI, 1991. Available at
Duineveld, A.J. and Stoter, R. 2000. WonderTools? A comparative study of Ontological engineering tools. International Journal of Human-Computer Studies, 52(6), 1111-1133.
Dykstra, M. 1988. LC Subject headings disguised as a thesaurus. Library Journal, 113(4), 42-46.
Earl, L.L. 1970. Experiments in automatic indexing and extracting. Information Storage and Retrieval, 6(4), 313-330.
Earle, R.E. and Berry, R. 1996. Indexing online information. Technical Communication, 43(2), 146-156.
Engineers Joint Council 1964. Thesaurus of Engineering Terms. EJC, New York.
Engineers Joint Council 1968. Thesaurus of Engineering and Scientific Terms. EJC, New York.
Ercegovac, Z. 1999. Introduction on metadata. Special issue of Journal of the American Society for Information Science, 50(13), 1165-1168.
Everett, J.O., Bobrow, D.G., Stolle, R., Crouch, R.S., de Paiva, V., Condoravdi, C., van den Berg, M. and Polanyi, L. 2002. Making ontologies work for resolving redundancies across documents. Communications of the ACM, 45(2), 55-60.
Farquhar, A. and Fikes, A.R. 1997. The Ontolingua Server: a tool for collaborative Ontology construction. International Journal of Human-Computer Studies, 46(6), 707-727.
Farradane, J.E.L. 1966. Report on research into information retrieval by Relational Indexing. (Part 1; Methodology). London, City University.
Farradane, J.E.L. 1977. A comparison of some Permuted Alphabetical Subject Indexes (PASIs). International Classification, 4(2), 94-101.
Figueiredo, F. 2011. Word co-occurrence features for text classification. Information Systems, 36(5), 843-858.
Finni, J.J. and Paulson, P.J. 1987. The Dewey Decimal Classification enters the computer age. Preprint paper presented at the IFLA general conference, Brighton, UK.
Fleck, N.W. and Rust, M. 1998. MicroMARC for integrated formats. Library Hi Tech, 16(2), 37-44.
Foskett, A.C. 1972. The subject approach; recent developments in Indexing. Journal of Librarianship, 4(4), 240-252.
Foskett, D.J. 1968. Some historical aspects of the Classification of knowledge. Classification Society Bulletin, 1(4), 2-11.
Foskett, D.J. 1970. The CRG; 1952-1962. Libri, 12(2), 127-138.
Foskett, D.J. 1992. More on the Personality facet. Journal of Library and Information Science, 17(1), 39-44.
Freeman, F.H. 1976. Building a thesaurus for a diffuse subject area. Special Libraries, 67, 220-222.
Friedman, B. 1972. Thesauri for vocabulary control. Drexel Library Quarterly, 8(2), 125-128.
Fugmann, R. 1997. Bridging the gap between database indexing and book indexing. Knowledge Organization, 24(4), 205-212.
Gardin, J.C. 1965. SYNTOL. Graduate School of Library Science, New Jersey, Rutgers State University.
Gnoli, C. and Poli, R. 2004. Levels of reality and levels of representation. Knowledge Organization, 31(3), 151-160.
Godby, J. and Vizine-Goetz, D. 2000. ISKO participants discuss ways librarianship can improve responsiveness of the Web. OCLC Newsletter, 247, 22-25.
Gomez-Perez, A. and Benjamins, V.R. 1999. Applications of ontologies and problem-solving methods. AI Magazine, 20(1), 119-122.
Gopinath, M.A. 1987. Symbiosis between Classification and thesaurus. Library Science with a Slant to Documentation, 24(4), 42-46.
Gopinath, M.A. 1992. Ontological model and Ranganathan's contribution. In Chatterjee, A., Sen, S.K., Kapoor, S.K. (Eds.) Proceedings of the 15th National IASLIC Conference, Annamalai, Tamil Nadu, 26-29 Dec. 1992. Calcutta, IASLIC, 9-14.
Gopinath, M.A. 1999. Paradigms, paradigm shifts and Classification. Library Science with a Slant to Documentation and Information Studies, 36(2), 73-77.
Green, C.D. 1972. Some problems of the Indexing of specialist material drawn from several disciplinary systems. Journal of Documentation, 28(1), 37-43.
Guarino, N. 1997. Understanding, building and using Ontologies. International Journal of Human-Computer Studies, 46(2/3), 293-310.
Guenther, R.S. 1996. Automating the Library of Congress Classification scheme: implementation of the USMARC format for Classification data. Cataloging and Classification Quarterly, 21(3 & 4), 177-203.
Guha, B. 1983. Documentation and information. 2nd ed. Kolkata, World Press.
Heijst, G.V. and Schreiber, A.T. 1997. Roles are not classes: a reply to Nicola Guarino. International Journal of Human-Computer Studies, 46(2/3), 311-318.
Hilera, J.A. 2010. An evolutive process to convert Glossaries into Ontologies. Information Technology and Libraries, 29(4), 195-204.
Hjorland, B. and Albrechtsen, H. 1999. An analysis of some trends in Classification research. Knowledge Organization, 26(3), 131-139.
Hjorland, B. and Hartel, J. 2003. Ontological, epistemological and sociological dimensions of domains; afterword. Knowledge Organization, 30(3/4), 239-245.
Hjorland, B. 2009. Concept Theory. Journal of the American Society for Information Science and Technology, 60(8), 1519-1536.
Hobbs, J.R. 1995. Sketch of an Ontology underlying the way we talk about the world. International Journal of Human Computer Studies, 43, 819-830.
Holsapple, C.W. and Joshi, K.D. 2002. A collaborative approach to Ontology design. Communications of the ACM, 45(2), 42-47.
Hurt, C.D. 1997. Classification and subject analysis: looking to the future at a distance. Cataloging and Classification Quarterly, 24(1 & 2), 97-112.
IEEE Intelligent Systems, Special issue on Ontologies, Jan./Feb. 1999.
Iivonen, M. and Kivimaki, K. 1998. Common entities and missing properties: similarities and differences in the indexing of concepts. Knowledge Organization, 25(3), 90-102.
Ingwersen, P. and Wormell, I. 1992. Ranganathan in the perspective of advanced information retrieval. Libri, 42(3), 184-201.
International Journal of Human-Computer Studies, Special issue on Ontologies, 46(2/3), Feb/March 1997.
Jachowicz, R.L. 1979. Application of Classification as a basis for the formulation of a thesaurus. In: Neelameghan, A. (Ed.). Ordering Systems for Global Information Networks; (pp. 356-352). Proceedings of the 3rd International Study Conference on Classification Research, Bombay, India, 6-11, Jan., 1975. Bangalore, FID/CR, Sarada Ranganathan Endowment for Library Science.
Jaenecke, P. 1994. To what end knowledge organization. Knowledge Organization, 21(1), 3-11.
Jenkins, C., Jackson, M., Burden, P. and Wallis, J. 1998. Automatic Classification of web resources using JAVA and DDC. Computer Networks and ISDN Systems, 30(1-7), 646-648.
Jones, S. 1993. A thesaurus data model for an intelligent retrieval system. Journal of Information Science, 19(3), 167-178.
Joorabchi, A. 2011. An unsupervised approach to automatic classification of scientific literature utilizing bibliographic metadata. Journal of Information Science, 37(5), 499-514.
Joyce, T. and Needham, R.M. 1958. The thesaurus approach to information retrieval. American Documentation, 9, 192-197.
Jung-Min Kim, Byoung-Il Choi, Hyo-Phil Shin & Hyoung-Joo Ki. 2007. A methodology for constructing of Philosophy Ontology based on philosophical texts. Computer Standards and Interfaces, 29(3), 302-315.
Kaiser, J. (1911). Systematic Indexing. London, Pitman.
Kim, H. (2002). Predicting how Ontologies for the Semantic Web will evolve. Communications of the ACM, 45(2), 48-54.
Kim, J. (2011). Construction of Domain Ontologies: Sourcing the World Wide Web. International Journal of Intelligent Information Technologies, 7(2), 1-24.
Kohler, J., Philippi, S., Specht, M. and Rueg, A. (2006). Ontology-based text indexing and querying for the Semantic Web. Knowledge-Based Systems, 19(8), 744-754.
Kristensen, J. (1993). Expanding end-users' query statements for free text searching with a search-aid thesaurus. Journal of Information Processing and Management, 29(6), 733-744.
Kumar, K. (1992). Distinctive contribution of Ranganathan to library classification. Journal of Library and Information Science, 17(2), 115-127.
Kumar, T.V.R. and Parameswaran, M. (1998). Chain Indexing and LISA. Knowledge Organization, 25(1 & 2), 13-15.
Kwasnik, B.H. (1999). The role of Classification in knowledge representation and discovery. Library Trends, 48(1), 22-47.
Lancaster, F.W. (1972). Vocabulary control for information retrieval. Washington, Information Resources Press.
Landry, B.C. (1971). A theory of indexing; indexing theory as a model for IS&R. Ohio, Ohio State Univ.
Langridge, D.W. (1999). Classification; its kinds, elements, systems and applications. Herald of Library Science, 38(3-4), 256-257.
Larson, R.R. (1992). Experiments in automatic Library of Congress Classification. Journal of the American Society for Information Science, 43(2), 130-148.
Lee, J.H. (1994). Ranking documents in thesaurus-based Boolean retrieval systems. Information Processing and Management, 30(1), 79-91.
Lezcano, L. (2012). Bridging informal tagging and formal semantics via hybrid navigation. Journal of Information Science, 38(2), 140-155.
Li, Sheng-Tun. (2013). A fuzzy conceptualization model for text mining with application in opinion polarity classification. Knowledge-Based Systems, 39, 23-33.
Lim, E. (2000). SouthEast Asian subject gateways: an examination of their Classification practices. International Cataloguing and Bibliographic Control, 29(3), 45-48.
Litovsky, B. (1969). Utility of automatic classification system for IS&R. Philadelphia, Moore School of Electrical Engg., Univ. of Pennsylvania.
Lopez, M.F. and Gomez-Perez, A. (1999, Jan./Feb.). Building a chemical Ontology using MethOntology and the Ontology Design Environment. IEEE Intelligent Systems, 14(1), 37-46.
Lorenz, B. (1997). The Regensburg Classification: a short survey. Cataloging and Classification Quarterly, 25(1), 39-49.
Losee, R. (2004). A performance model of the length and number of subject headings and index phrases. Knowledge Organization, 31(4), 245-251.
Loukopoulos, L. (1966, Jan.). Indexing problems and some of their solutions. American Documentation, 17-25.
Lynch, M.F. (1966). Subject indexes and automatic document retrieval; the structure of entries in Chemical Abstracts. Journal of Documentation, 22(3), 167-185.
Mai, J.E. (2002). Organization of knowledge in a networked environment; report on the 6th Workshop on Organization of Knowledge in a Networked Environment. Knowledge Organization, 30(1), 36-37.
Mandala, R. and Tokunaga, T. (2000). Query expansion using heterogeneous thesauri. Information Processing and Management, 36(3), 361-378.
Mandel, C.A. and Wolven, R. (1996). Intellectual access to digital documents: joining proven principles with new technologies. Cataloging and Classification Quarterly, 22(3 & 4), 25-42.
Maniez, J. and de Grolier, E. (1991). A decade of research in Classification. International Classification, 18(2), 73-77.
Markey, K. and Demeyer, A.M. (1986). DDC online project. OCLC. Dewey Decimal Classification online project: evaluation of a library schedule and index integrated into the subject searching capabilities of an online catalog. Final report to the Council on Library Resources. Dublin, OH: OCLC. (OCLC Research Report No. CLC/OPR/RR/86/1).
Markey, K. (1988). Integrating the machine-readable LCSH in online catalogs. Information Technology and Libraries, 7, 299-312.
Marradi, A. (2012). The concept of Concept: Concepts and Terms. Knowledge Organization, 39(1), 29-54.
Masterman, M., Needham, R.M. and Sparck Jones, K. (1959). The analogy between mechanical translation and library retrieval. In: Proceedings of the International Conference on Scientific Information, Washington, D.C., November 16-21, 1958. Washington, DC, National Academy of Sciences, 917-935.
Masterman, M. (1975). Chasing the Enthymeme; Pt. 2; the nature of a philosophical text editor. In Aslib, Informatics 2, London, Aslib, 26-41.
Mayne, A.J. (1968). Some modern approaches to the Classification of knowledge. Classification Society Bulletin, 1(4), 13-17.
Mazur, Z. (1994). Models of a distributed information retrieval system based on thesauri with weights. Information Processing and Management, 30(1), 61-77.
Mazzocchi, F. (2009). Knowledge organization in the Philosophical domain: dealing with Polysemy in thesaurus building. Knowledge Organization, 36(2-3), 103-112.
Mazzocchi, F. (2010). S.R. Ranganathan's PMEST Categories: analyzing their Philosophical background and cognitive function. Information Studies, 16(3), 133-147.
McGuinness, D.L. (2005). Ontologies come of age. Stanford, Stanford University.
Metcalf, M. (1983, April). LCSH and online subject access. (Paper presented at the Cataloging and Indexing Group Seminar, unpublished).
Metcalfe, J. (1957). Information Indexing and subject cataloging. London, Scarecrow Press.
Miller, U. and Teitelbaum, R. (2002). Pre-coordination and post-coordination; past and future. Knowledge Organization, 29(2), 87-93.
Mitchell, J.S. (1998). In this age of WWW is Classification redundant? Catalogue and Index, 127, 5.
Molholt, P. (1995). Qualities of Classification schemes for the Information Superhighway. Cataloging and Classification Quarterly, 21(2), 19-22.
Mowery, R.L. (1976). Cutter Classification still at work. Library Resources and Technical Services, 20, 154-156.
Neelameghan, A. (2009). Building a thesaurus for a specialized subject: a case report. Information Studies, 15(1), 61-64.
Negrini, G. and Zozi, P. (2002). Ontological analysis of literary works of art. (Paper presented at the 7th ISKO International Conference on Challenges in Knowledge Representation and Organization for the 21st century; integration of knowledge across boundaries, Granada, July, 10-13).
Neville, H.H. (1970). Feasibility study of a scheme for reconciling thesauri covering a common subject. Journal of Documentation, 26, 313-336.
Noy, N.F. and Hafner, C.D. (1997, Fall). The state of the art in Ontology design: a survey and comparative review. AI Magazine, 18(3), 53-74.
Noy, N.F. and McGuinness, D.L. (2001). Ontology development 101; a guide to creating your first Ontology. Stanford, Stanford University.
OCLC (2004). Automatic Classification and Indexing of Internet Resources using DDC. (Project Scorpion; project closed as of 2006).
O'Leary, D.E. (1997). Impediments in the use of explicit Ontologies for KBS development. International Journal of Human-Computer Studies, 46(2/3), 327-337.
O'Leary, D.E. (1998). Using AI in knowledge management: knowledge bases and Ontologies. IEEE Intelligent Systems, 13(3), 34-39.
Olson, H.A. (2002). Mapping beyond Dewey's boundaries: constructing classificatory space for marginalized knowledge domains. Library Trends, 47(2), 233-254.
Park, Y.C. and Choi, K.S. (1996). Automatic thesaurus construction using Bayesian networks. Information Processing and Management, 32(5), 543-553.
Poli, R. (2002). Ontological methodology. International Journal for Human Computer Studies, 56, 639-664.
Pollitt, A.S. and Ellis, G.P. (1994). Improving search quality using thesauri for query specification and the presentation of search results. In Albrechtsen, H. and Oernager, S. (Eds.) Proceedings of the Third International Society for Knowledge Organization (ISKO) Conference: Knowledge organization and quality management, Copenhagen, Denmark, 20-24 June. (pp. 382-389). Frankfurt/Main, INDEKS Verlag.
Prieto-Diaz, R. (2003). A faceted approach to building Ontologies. In: Information Reuse and Integration, IRI 2003. IEEE International Conference, 27-29 Oct. Digital Object Identifier: 10.1109/IRI.2003.1251451
Qin, J. and Paling, S. (2001). Converting a controlled vocabulary into an Ontology; the case of GEM. Information Research, 6(2), available at
Quinn, B. (1994). Recent theoretical approaches in Classification and Indexing. Knowledge Organization, 21(3), 140-147.
Rada, R. (1987). Connecting and evaluating thesauri; issues and cases. International Classification, 14(2), 63-69.
Ranganathan, S.R. (1967). Hidden roots of Classification. Information Storage and Retrieval, 3(4), 399-410.
Ranganathan, S.R. (1968). Basic subjects and their kinds (Classification problems 27). Library Science with a Slant to Documentation, 5(1), 97-133.
Ranganathan, S.R. (1974). Subject heading and facet analysis. Journal of Documentation, 30(2), 195-206.
Reynaud, C. and Tort, F. (1997). Using Explicit Ontologies to create problem solving methods. International Journal of Human-Computer Studies, 46(2 & 3), 339-364.
Richmond, P.A. (1969). LC and Dewey; their relevance to modern information science. Paper presented at the ALA preconference on Subject Analysis of Library Materials, Atlantic City, NJ, 19-21 June.
Richmond, P.A. (1974). A reconsideration of Enumerative Classification for current information needs. Ciencia da Informacao, 3(1), 5-19.
Rigby, D. (1974). Computers and the DDC; a decade of progress (1963-1973). Paris, FID.
Rogers, V.G. (1972). Thesaurus construction; an introduction. Drexel Library Quarterly, 8(2), 117-124.
Ross, J. (2000). The impact of technology on indexing. Indexer, 22(1), 25-26.
Rostron, R.M. (1968). The construction of a thesaurus. Aslib Proceedings, 20(3), 181-187.
Saab, D.J. (2011). Information as Ontologization. Journal of the American Society for Information Science and Technology, 62(11), 2236-2246.
Saeed, H. (2001). Potential of bibliographic tools to organize knowledge on the Internet; the use of DDC for organizing web-based information resources. Knowledge Organization, 28(1), 17-26.
Salton, G. (1967). Designing automatic information systems; results obtained with the SMART programs. Social Science Information, 6(2), 11-117.
Salton, G. (1970, April). Automatic text analysis. Science, 335-353.
Salton, G. (1972). Experiments in automatic thesaurus construction for IR. In Information Processing '71, (pp. 115-123). Proceedings of the 1971 congress of the IFIPS, Amsterdam, North-Holland Publishers.
Sanchez, D. and Moreno, A. (2008). Learning non-taxonomic relationships from web documents for domain ontology construction. Data and Knowledge Engineering, 64, 600-623.
Satija, M.P. and Singh, S. (1998). Indian Classification schemes: an analysis. Library Science with a Slant to Documentation and Information Studies, 35(3), 165-178.
Satija, M.P. (1992). A critical introduction to the Seventh edition (1987) of the Colon Classification. Lucknow Librarian, 24(1), 32-41.
Schutze, H. and Pederson, J.O. (1997). A co-occurrence-based thesaurus and two applications to information retrieval. Information Processing and Management, 33(3), 307-318.
Seetharama, S. (1976). Term-concept relationship in an IR thesaurus. Library Science with a Slant to Documentation, 13(2), 67-73.
Sharp, J.R. (1965). Some fundamentals of information retrieval. London, Deutsch.
Silva, N. and Rocha, J. (2002). Merging Ontologies using a bottom-up lexical and structural approach. (Paper presented at the 7th ISKO International Conference on Challenges in Knowledge Representation and Organization for the 21st century; integration of knowledge across boundaries, Granada, July, 10-13).
Simmons, P. and Hopkinson, A. (Eds.). (2nd ed.). (1988). CCF; the Common Communication Format. Paris, Unesco.
Smet, E.D. and Nieuwenhuysen, P. (1997). The DANIS database system; integrating bibliographic and factual information using the CDS/ISIS software and the Common Communication Format. Journal of Information Science, 23(4), 327-337.
Soergel, D. (1999). The rise of Ontologies or the reinvention of classification. Journal of the American Society for Information Science, 50(12), 1119-1120.
Sparck-Jones, K. (1970). Some thoughts on Classification for retrieval. Journal of Documentation, 26(2), 89-101.
Sparck-Jones, K. (1971). Automatic keyword Classification for information retrieval. London, Butterworths.
Sparck-Jones, K. and Jackson, D.M. (1970). The use of automatically obtained keyword Classification for information retrieval. Information Storage and Retrieval, 5, 175-201.
Spiteri, L.F. (1998). A simplified model for Facet Analysis. Canadian Journal of Information and Library Science, 23(1 & 2), 1-30.
Spiteri, L.F. (1999). The essential elements of faceted thesauri. Cataloging and Classification Quarterly, 28(4), 31-52.
Srinivasan, P. and Ruiz, M.E. (1996, Oct. 21-24). An investigation of Indexing on the WWW. Proceedings of the 59th Annual Meeting of the American Society for Information Science, Baltimore, Maryland. Hardin, S. (Ed.) Medford, New Jersey: Information Today, Inc., for American Society for Information Science, 79-83.
Stoecker, N.K. and Alford, D.L. (1998). From catalog to web; desktop access to Sandia technical reports. Internet Reference Services Quarterly, 3(1), 37-50.
Svenonius, E. (1972). Effect of Indexing specificity on retrieval performance. Washington, National Science Foundation.
Svenonius, E. (1992). Ranganathan and Classification Science. Libri, 42(3), 176-183.
Szostak, R. (2012). Classifying relationships. Knowledge Organization, 39(3), 165-178.
Takeda, N. (1994). Problems in hierarchical structures of thesauri: their influences on the results of information retrieval. Online Kensaku, 15(4), 183-186.
Tauber, M.F. and Feinberg, H. (1974). Dewey Decimal and the LC Classifications; an overview. Drexel Library Quarterly, 10, 56-74.
Toth, B. (2000). Cataloguing and indexing and the Web: help urgently needed? Catalogue and Index, 135, 1-2.
Tsioli, M. and Corsini, S. (1994). Brunet-Parguez indexing of ancient books. [Indexation livres anciens Brunet-Parguez.] ARBIDO-Bulletin, 9(4), 17-18.
United Nations Educational Scientific & Cultural Organization. (1971). UNISIST; study report on the feasibility of a world science information system. Paris, Unesco.
United Nations Educational Scientific & Cultural Organization. (1981). Unisist Reference Manual for machine-readable bibliographic descriptions. (2nd ed.). Paris, Unesco.
US Armed Services Technical Information Agency (ASTIA). (1962). Thesaurus of ASTIA descriptors. Washington, ASTIA.
Valente, A. and Russ, T. (1999). Building and (re)using an Ontology of air campaign planning. IEEE Intelligent Systems, 14(1), 27-36.
Visser, P.R.S. and Bench-Capon, T.J.M. (1998). A comparison of four Ontologies for the design of legal knowledge systems. Artificial Intelligence and Law, 6(1), 27-57.
Vizine-Goetz, D. (1998). Using library classification schemes for Internet resources. (OCLC Internet Cataloguing Project Colloquium, position paper).
Vizine-Goetz, D. and Markey, K. (1989). Characteristics of subject heading records in the machine readable library of LC subject headings. Information Technology and Libraries, 8(2), 203-209.
Wahlin, E. (1966). Classification systems and their subjects. American Documentation, 17, 199-215.
Walker, D. (1981). The organization and use of information; contributions of Information Science, Computational Linguistics and Artificial Intelligence. Journal of the American Society for Information Science, 32, 347-363.
Wall, E. (1969). Vocabulary building and control techniques. American Documentation, 20(2), 161-164.
Wall, E. and Barnes, J. (1969). Intersystem compatibility and convertibility of subject vocabularies. Philadelphia, Auerbach Corp.
Weinberg, B.H. (1995). Library Classification and information retrieval thesauri: comparison and contrast. Cataloging and Classification Quarterly, 19(3/4), 23-44.
Weinberg, B.H. (1996). Complexity in indexing systems; abandonment and failure; implications for organizing the Internet. Proceedings of the ASIS Annual Conference, Oct. 19-24.
Will, L. and Will, S. (1997). Dewey for Windows. Electronic Library, 15(3), 192-195.
Willetts, M. (1975). An investigation of the nature of the relation between terms in thesauri. Journal of Documentation, 31(3), 71-79.
Williams, R.V. (2010). Hans Peter Luhn and Herbert M. Ohlman: their roles in the origins of Keyword-in-Context/Permutation Automatic Indexing. Journal of the American Society for Information Science and Technology, 61(4), 835-849.
Williamson, N.J. (1986). Library of Congress Classification; problems and prospects in online retrieval. International Cataloguing, 15(4), 45-48.
Wilson, T.D. (1972). The work of the British Classification Research Group. In Subject retrieval in the seventies: Proceedings of an international symposium, (pp. 62-71). University of Maryland School of Library and Information Services.
Wolff-Terroine, M., Simon, N. and Rembert, D. (1969). Use of a computer for compiling and holding a medical thesaurus. Methods of Information in Medicine, 8(1), 34-40.
Zins, C. (2002). Models for classifying Internet resources. Knowledge Organization, 29(1), 20-28.
Zumer, M. and Riesthuis, J.A. (2002). Consequences of implementing FRBR; are we ready to open Pandora's box? Knowledge Organization, 29(2), 78-86.

