INFO 289 ePortfolio – Dr. Patricia Franks
SJSU School of Information / Fall 2015
Patricia Ayame Thomson
Competency E
Design, Query, and Evaluate Information Retrieval Systems
Introduction
In the age of technology, it is critical for information professionals to be competent in conducting successful online searches to provide the most accurate, authoritative, and relevant information to library patrons. There is a myriad of technological tools and online resources that information professionals use to provide access to information, including information retrieval systems, databases, social media sites, LibGuides, and online public access catalogs (OPACs), to name a few.
To provide excellent reference service, we need a fundamental understanding of the components and inner workings of information retrieval (IR) systems. In the sections that follow, I discuss the three principles of design, query, and evaluation in relation to information retrieval systems.
Design
At its core, an information retrieval system consists of a database and a search engine. Information is aggregated and organized within the database, and the search engine retrieves relevant results in response to the user’s query. For example, the underlying infrastructure of an information retrieval system can draw on a controlled vocabulary (pre-indexed terms), full-text (natural language) indexing, and classification (subject headings). In addition, a simple and clear interface design can make a significant difference in the system’s effectiveness and in user satisfaction.
Query
An online query is the request a user submits to an information retrieval system in order to find information. Among the types of searches that can be conducted on information retrieval systems are Boolean logic, field, proximity, cherry-picking, basic, advanced, subject, and keyword searches.
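To make one of these search types concrete, the following is a minimal Python sketch of Boolean searching over a small inverted index. The three-document collection and the function names are invented for illustration and do not come from any particular system; real retrieval systems add stemming, stop words, ranking, and much more.

```python
from collections import defaultdict

# Hypothetical three-document collection, invented for illustration only.
DOCS = {
    1: "library catalogs support subject searching",
    2: "keyword searching in online databases",
    3: "subject headings in library databases",
}

def build_index(docs):
    """Build an inverted index: each word maps to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

INDEX = build_index(DOCS)

def lookup(term):
    """Return the set of document IDs that contain the term (empty set if none)."""
    return INDEX.get(term.lower(), set())

# Boolean operators correspond to set operations on the posting sets.
and_result = lookup("library") & lookup("subject")    # library AND subject
or_result = lookup("keyword") | lookup("subject")     # keyword OR subject
not_result = lookup("databases") - lookup("library")  # databases NOT library

print(and_result, or_result, not_result)
```

In this sketch, AND narrows the result set, OR broadens it, and NOT excludes documents, which mirrors how the operators behave in a typical database search form.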
Evaluation
The primary measures of the effectiveness of an information retrieval system are precision and recall. Precision is the proportion of the results retrieved in response to the user’s query that are relevant. Recall, or completeness, is the proportion of all the relevant items in the collection that the query retrieves. In a way, precision and recall are analogous to quality and quantity.
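As a worked illustration, the short Python sketch below computes both measures using the standard definitions: precision is the number of relevant results retrieved divided by the total number of results retrieved, and recall is the number of relevant results retrieved divided by the total number of relevant items in the collection. The document IDs and relevance judgments are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: IDs of the results the system returned
    relevant:  IDs of every relevant item in the collection
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical search: 4 results returned, 5 relevant items exist in the collection.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.50, recall=0.40
```

Here two of the four retrieved results are relevant (precision 0.50), but only two of the five relevant items were found (recall 0.40), which shows how a search can score differently on the two measures.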
Morville (2005) explains that the type of search determines which measure matters more. If the user needs only a few relevant results, then “precision outweighs recall” (p. 49). Precision is also important when the user is searching for information that they already know exists. On the other hand, if the user wants all of the relevant results available in the collection, “recall is the key metric” (p. 50).
Three Pieces of Evidence to Fulfill Competency E
I respectfully present three artifacts to demonstrate my mastery of Competency E by applying the principles of design, query, and evaluation to information retrieval systems.
First Artifact
LIBR 202 Information Retrieval
I present the first artifact from the course LIBR 202, Information Retrieval, taught by Dr. Mary Bolin in the San José State University School of Library and Information Science program. I chose this assignment because it demonstrates my understanding of precision and recall as measurements of an effective search.
The article explains that information retrieval systems are not yet able to retrieve exactly the results the user is seeking for every single search. This variance stems from what Morville (2005) calls “the people problem.” Morville (2005) explains that there are significant inconsistencies between the words that patrons and indexers use to describe the same idea or object. Human language is complex, variable, and difficult to pin down.
As a result, findability is one of the most essential qualities of an information retrieval system. Morville (2005) explains that the findability of information is enhanced by adding exhaustive descriptions and populating the fields with metadata tags.
Although relevance ranking is based on the location and frequency of the search term within a document, the software has no way of determining the document’s aboutness. Describing the aboutness of each document in its metadata facilitates the system’s search by disambiguating one document from another.
This assignment deepened my understanding of information retrieval systems and, as a result, improved my online searching skills. After completing it, I am able to evaluate the effectiveness of searches by using precision and recall as measures.
Second Artifact
Controlled Vocabularies and Online Searching Techniques
LIBR 210 – Reference & Information Services
As my second piece of evidence, I present my assignment from the San José State University, School of Library and Information Science program’s course called LIBR 210—Reference and Information Services taught by Professor Cheryl Stenstrom.
The assignment describes a controlled vocabulary as a list of approved words and phrases indexed in the system. The terms in the controlled vocabulary populate the metadata fields and improve the effectiveness of the search engine. Integrating a controlled vocabulary enhances recall by giving the taxonomy a syndetic structure that connects terms through non-linear equivalent, hierarchical, and associative relationships.
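The way a syndetic structure can enhance recall may be sketched in a few lines of Python. The thesaurus entry below, along with its terms and the function names, is hypothetical and simplified; it only illustrates how equivalent, hierarchical, and associative relationships might be recorded and then used to expand a user’s query.

```python
# Hypothetical thesaurus entry, invented for illustration only.
THESAURUS = {
    "automobiles": {
        "used_for": ["cars", "motorcars"],  # equivalent (non-preferred) terms
        "broader": ["vehicles"],            # hierarchical: broader term
        "narrower": ["sports cars"],        # hierarchical: narrower term
        "related": ["roads"],               # associative: related term
    },
}

def preferred_term(term):
    """Resolve a user's term to the preferred term via the equivalence relationship."""
    term = term.lower()
    for preferred, entry in THESAURUS.items():
        if term == preferred or term in entry["used_for"]:
            return preferred
    return None  # the term is not in the controlled vocabulary

def expand_query(term, include_narrower=True):
    """Expand a query term with its equivalent and (optionally) narrower terms."""
    preferred = preferred_term(term)
    if preferred is None:
        return [term.lower()]  # fall back to a plain keyword search
    entry = THESAURUS[preferred]
    terms = [preferred] + entry["used_for"]
    if include_narrower:
        terms += entry["narrower"]
    return terms

expanded = expand_query("Cars")
print(expanded)  # ['automobiles', 'cars', 'motorcars', 'sports cars']
```

A patron who types "cars" would then also retrieve items indexed under "automobiles" or "sports cars", which is the recall gain that the equivalence and hierarchical relationships are meant to provide.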
Controlled vocabularies are pre-coordinated at the front end of the information retrieval system, when items are indexed. A thesaurus can also be applied at the back end, at the time of searching, which involves post-coordination. In addition, it is important to note that controlled vocabularies require frequent updates and constant upkeep and are costly and labor-intensive to maintain. The assignment concludes with a discussion of various online searching techniques, such as Boolean logic, advanced, keyword, field, and proximity searching.
This artifact helped me to understand how the controlled vocabulary enhances the syndetic structure of the information retrieval system by connecting the hierarchical, equivalent, and/or associative relationships of words.
Third Artifact
Database Evaluation of the J. Paul Getty Website
LIBR 202 Information Retrieval
This artifact is also from LIBR 202, Information Retrieval, in the San José State University School of Library and Information Science program. For the course, we were required to evaluate an information retrieval system, and I chose to review the J. Paul Getty website. The robust and sophisticated infrastructure of its information retrieval system makes the website effective, visually captivating, and easy to navigate. The multi-tiered system contains multiple databases, four ontologies, faceted classification, controlled vocabularies, and a hierarchical taxonomy.
The evaluation goes on to note that the information retrieval system has an effective functional design as well as wide scope and depth. Other areas of examination include the interface design, user model, search gateway, search function, help page, archive repositories, collection inventories, and library catalogs.
Overall, the well-funded, expansive, and well-executed J. Paul Getty website rests on a powerful information retrieval system with the capacity to store and index a vast number of images of past exhibits and artworks from throughout history.
Conclusion: I Understand the Infrastructure of Information Retrieval Systems
As information professionals, it is our job to search for and provide accurate, relevant, and authoritative information to library patrons. I believe my understanding of the inner workings of an information retrieval system improves my online searching skills. In addition, knowing how to access the controlled vocabulary and/or thesaurus of an information retrieval system will be a valuable asset in conducting successful searches in the future.