Competency G

INFO 289 ePortfolio – Dr. Patricia Franks
SJSU School of Information / Fall 2015
Patricia Ayame Thomson

Competency G

Demonstrate understanding of basic principles and standards involved in organizing information such as classification and controlled vocabulary systems, cataloging systems, metadata schemas or other systems for making information accessible to a particular clientele.

Introduction

            In the face of the technological revolution, people are inundated with information, and the human mind naturally seeks to make sense of, organize, and store what it encounters. In the same way, libraries and information professionals have always been responsible for organizing and disseminating information items. Without shared standards and principles to establish uniformity across information organizations, library systems cannot interoperate and libraries cannot collaborate. It is therefore necessary for information professionals to understand the principles and standards that underpin information retrieval (IR) systems, content management systems (CMS), and document management systems (DMS). In the following sections, I define these standards and principles.

Cataloging and Bibliographic Control

            Resource Description and Access (RDA) is the unified cataloging standard designed to replace the Anglo-American Cataloging Rules, 2nd Edition (AACR2). RDA, which the Library of Congress implemented on March 31, 2013, was created to provide a scope wide enough to include any kind of information content or format, the flexibility to accommodate new and emerging technologies, data that is adaptable to all types of information environments, and stable access points for users and information professionals.

Since 1967, when the original Anglo-American Cataloging Rules (AACR) were published, catalogers in English-speaking countries have used the rules to standardize the descriptive cataloging of information items. AACR2 consists of two parts: the description of the item, and the headings that serve as essential access points.

So that computers can interpret the data, the information recorded under AACR2 is encoded as Machine-Readable Cataloging (MARC) records. Because MARC records are in a digital format, catalogers can practice copy cataloging instead of entering the original data for each new record. Original MARC records are shared through bibliographic utilities such as the Online Computer Library Center (OCLC), and catalogers at local libraries copy them, making minor revisions where needed to suit their library.

Classification

            Classification means that digital and physical information items are organized by subject. In other words, information items are grouped together according to similarities in their subject matter. When a collection is organized in this way, patrons can compare and contrast similar items in the library. The most common classification systems used in libraries are the Library of Congress Classification (LCC) system and the Dewey Decimal Classification (DDC) system. Both classifications are hierarchical in structure and identify each class with a notation (numbers in DDC, letters and numbers in LCC) arranged in a meaningful sequence.
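The hierarchical notation described above can be sketched in a few lines of Python. This is only an illustration: the three Dewey classes are a tiny sample, and the `DDC_SAMPLE` table and `broader_classes` helper are hypothetical names invented here, not part of any library system.

```python
# A tiny, illustrative slice of the Dewey Decimal Classification (DDC).
# A real schedule contains many more classes; this table is an assumption.
DDC_SAMPLE = {
    "500": "Natural sciences and mathematics",
    "510": "Mathematics",
    "516": "Geometry",
}


def broader_classes(notation):
    """Return the chain from the main class down to a three-digit notation."""
    main_class = notation[0] + "00"   # e.g. "516" -> "500"
    division = notation[:2] + "0"     # e.g. "516" -> "510"
    chain = []
    for candidate in (main_class, division, notation):
        # Skip duplicates (e.g. "510" is its own division) and unknown classes.
        if candidate in DDC_SAMPLE and candidate not in chain:
            chain.append(candidate)
    return [(code, DDC_SAMPLE[code]) for code in chain]


for code, caption in broader_classes("516"):
    print(code, caption)
```

Because the hierarchy is carried by the digits themselves, items about geometry (516) shelve next to other mathematics items (510), which is what lets patrons browse related material together.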

Metadata Schemas

            Metadata is data about other data: it describes an information resource and leads the user to it. A metadata schema is a defined set of elements and vocabulary developed by a community of members with a common interest. Members of that community apply the schema by adding descriptive attributes that help identify the information sought and make it more accessible to patrons. The more descriptive data is added to an information resource, the easier it is for the information retrieval system to return relevant results.
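A minimal sketch can show why added descriptive metadata helps retrieval. The element names (`title`, `creator`, `subject`) are borrowed loosely from Dublin Core as an assumption, and the two records and the `search` helper are hypothetical examples, not taken from the assignment.

```python
# Two hypothetical records for similar items. Only the second one has
# descriptive subject metadata attached.
RECORDS = [
    {"id": 1, "title": "Annual Report", "creator": "City Library",
     "subject": []},
    {"id": 2, "title": "Annual Report", "creator": "City Library",
     "subject": ["budgets", "public libraries"]},
]


def matches(record, query):
    """True if the query appears in any descriptive element of the record."""
    q = query.lower()
    values = [record["title"], record["creator"], *record["subject"]]
    return any(q in value.lower() for value in values)


def search(records, query):
    """Return the ids of all records whose metadata matches the query."""
    return [record["id"] for record in records if matches(record, query)]


print(search(RECORDS, "budgets"))
print(search(RECORDS, "annual report"))
```

In this sketch a search for "budgets" can only find the record that carries subject metadata, even though both items might be equally relevant to the patron.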

Controlled Vocabularies

            A controlled vocabulary is an organized set of restricted, preferred terms used to index information items and improve search and retrieval results. Because human language is rich, ambiguous, and highly variable, information retrieval systems have to impose order in the form of predetermined, restricted vocabularies.

Another helpful function of controlled vocabularies is their syndetic structure, the set of cross-references that expresses the relationships between terms and consequently facilitates successful and relevant search retrievals. Using a controlled vocabulary improves the communication between the user’s query and the search engines of information retrieval systems, databases, document management systems, and content management systems.
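The idea of mapping variable natural language onto preferred terms can be sketched as a simple lookup. The vocabulary below is a hypothetical example, and `ENTRY_TERMS` and `preferred_term` are names invented for this illustration.

```python
# Hypothetical controlled vocabulary: each entry term a user might type
# points to the single preferred term used for indexing.
ENTRY_TERMS = {
    "automobiles": "Automobiles",
    "cars": "Automobiles",
    "motorcars": "Automobiles",
    "motion pictures": "Motion pictures",
    "films": "Motion pictures",
    "movies": "Motion pictures",
}


def preferred_term(user_word):
    """Map a user's word to the preferred indexing term, or None if unknown."""
    return ENTRY_TERMS.get(user_word.strip().lower())


print(preferred_term("Cars"))
print(preferred_term("movies"))
```

Because every variant resolves to the same preferred term, a patron who types "cars" and an indexer who assigned "Automobiles" end up describing the item with the same word.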

Thesaurus

The thesaurus is a more structured kind of controlled vocabulary. Compared with a simple controlled vocabulary, the thesaurus is applied at the search stage and supports post-coordination in information retrieval (IR) systems. The thesaurus explicitly records synonyms (equivalence relationships), related terms (associative relationships), and narrower and broader terms (hierarchical relationships). National and international standards have been developed to guide thesaurus construction, including ISO 5964, ANSI/NISO Z39.19, and ISO 2788. Essentially, the thesaurus is a structured list of terms that records equivalence, hierarchical, and associative relationships in order to improve the effectiveness of information retrieval systems.
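The three relationship types can be sketched as a small data structure. The single entry below is a hypothetical example, and the `ThesaurusEntry` class and `expand_query` helper are assumptions made for illustration rather than a format prescribed by the standards named above.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    preferred: str
    used_for: list = field(default_factory=list)   # equivalence (synonyms)
    broader: list = field(default_factory=list)    # hierarchical, upward
    narrower: list = field(default_factory=list)   # hierarchical, downward
    related: list = field(default_factory=list)    # associative


# Hypothetical entry for illustration only.
ENTRIES = [
    ThesaurusEntry(
        preferred="public libraries",
        used_for=["municipal libraries"],
        broader=["libraries"],
        narrower=["branch libraries"],
        related=["library funding"],
    ),
]


def expand_query(word):
    """Resolve a word to its preferred term, then add narrower and related terms."""
    key = word.strip().lower()
    for entry in ENTRIES:
        if key == entry.preferred or key in entry.used_for:
            return [entry.preferred] + entry.narrower + entry.related
    return [key]  # unknown words are searched as typed


print(expand_query("Municipal libraries"))
```

In this sketch, a synonym typed by the user is first resolved through the equivalence relationship, and the search is then widened with narrower and related terms at query time.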

Three Artifacts as Evidence for Competency G

I respectfully present three artifacts to demonstrate my comprehension of Competency G.

First Artifact

LIBR 202 – Information Retrieval – Metadata

This artifact is from Dr. Mary Bolin’s course, LIBR 202 – Information Retrieval, offered in the School of Library and Information Science program at San José State University. The assignment is a discussion of metadata and the characteristics that are important for data warehouses. The artifact goes on to explain the differences between administrative, structural, and descriptive metadata. Morville (2005) states that search engines use metadata to index a page so that someone looking for the information on that page will be able to find it.

In addition, the assignment discusses Machine-Readable Cataloging (MARC) records, which became the U.S. national standard for disseminating bibliographic data in 1971 and an international standard in 1973. More recently, a coalition of libraries developed the Resource Description and Access (RDA) cataloging standard to keep up with rapidly changing technology. Furthermore, the assignment describes the importance of adding descriptive metadata so that the search engine can decipher meaning and disambiguate between words. The assignment concludes that indexers should describe the aboutness of the information entity precisely and exhaustively by adding descriptive metadata tags. I included this artifact to illustrate my understanding of descriptive metadata and metadata schemas.

Second Artifact

LIBR 202 – Information Retrieval – Online Catalog Access

The second artifact is from the LIBR 202 – Information Retrieval course in the School of Library and Information Science program at San José State University. The assignment required us to read and discuss Marcia Bates’ article presenting her ideas and suggestions for improving online subject catalog access. The suggestions rest on Bates’ assumption that fundamental changes can be made to improve the effectiveness of current online subject catalogs. Bates specifically mentions features that can improve online systems, such as natural-language search queries and automatic term stemming and weighting. Another feature discussed is the usefulness of hypertext in subject headings. Hypertext helps users skip a level and takes them one step closer to their goal, as well as linking to other resources.

The most important advantage of Bates’ proposed model is that it can be ‘wrapped around’ the existing Library of Congress subject headings without requiring the expense or labor of re-indexing. Bates suggests the use of front-end controlled vocabularies (pre-coordination) and a back-end thesaurus (post-coordination). Bates also cites three elements that make information retrieval problematic and indeterminate: ‘uncertainty, variety, and complexity,’ which stem mostly from variation in the way both users and indexers use language. I included this artifact to demonstrate my knowledge of pre-coordination and post-coordination in various information systems.

Third Artifact

LIBR 202 – Information Retrieval – Pre-coordination and Post-coordination

The third artifact is from Dr. Mary Bolin’s LIBR 202 – Information Retrieval course in the School of Library and Information Science at San José State University. The assignment was titled “Controlled Vocabulary and Subject Representation.” The professor separated the students into groups of four, and I participated as a member of one team. As a group, we had to select twenty articles related to libraries. Dr. Bolin then required each group to split into two pairs, because she wanted one pair to work on pre-coordination and the other on post-coordination.

My teammate and I were the pre-coordination team, and for each of the twenty articles we selected, we were responsible for assigning descriptive terms to include in the controlled vocabulary. The other two students worked on post-coordination by developing a back-end thesaurus; their job was to assign terms for the thesaurus based on the same twenty articles. I included this artifact because it demonstrates my capability to assign terms for the controlled vocabulary at the front end of the system to facilitate searching in information retrieval (IR) systems, online public access catalogs (OPAC), document retrieval systems (DRS), and content management systems (CMS). I believe this artifact illustrates my understanding of how a controlled vocabulary is created and embedded through pre-coordination.

Conclusion: Understanding Components of Information Retrieval Systems

As required for Competency G, I believe the introduction and the three artifacts presented above demonstrate my understanding of the basic principles and standards involved in organizing information in a variety of information retrieval systems. The principles described in the introduction, such as cataloging, authority control, classification, metadata schemas, and controlled vocabularies, are all ways of making information more accessible to patrons. In addition, my understanding of controlled vocabularies and thesauri will strengthen my own online searching skills in the future.