When using databases, research is focused and retrieval is targeted and successful. Best of all, time isn’t wasted on inaccurate or only partially useful materials. The results are “exactly” what was requested, or else the materials aren’t there.
Let’s think back to the earliest databases that were available through Dialog, BRS, and other fee-based services.1 Most of the information on these services was listings of reports and journal articles arranged into subject-specific databases. Queries used controlled vocabulary identified through LCSH (Library of Congress Subject Headings) and subject-specific thesauri. The query, constructed using Boolean logic, searched the database and retrieved only those items that contained the requested terms. If the search sets didn’t intersect or overlap, the items weren’t part of the retrieved materials. Boolean logic is very precise. Either the items met the search criteria or they didn’t.
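To make the idea of intersecting search sets concrete, here is a minimal sketch in Python. The record numbers, subject terms, and function names are all hypothetical, invented for illustration; this is not how Dialog or BRS were implemented, only a picture of the set logic behind AND, OR, and NOT.

```python
# Hypothetical mini-database: record id -> assigned subject terms
# (standing in for controlled vocabulary such as LCSH headings).
RECORDS = {
    1: {"solar energy", "photovoltaics"},
    2: {"solar energy", "public policy"},
    3: {"wind power", "public policy"},
    4: {"photovoltaics", "materials science"},
}


def postings(term):
    """Return the set of record ids indexed under one subject term."""
    return {rid for rid, terms in RECORDS.items() if term in terms}


def boolean_and(*terms):
    """Records that carry every term: the intersection of the search sets."""
    sets = [postings(t) for t in terms]
    return set.intersection(*sets) if sets else set()


def boolean_or(*terms):
    """Records that carry at least one term: the union of the search sets."""
    return set().union(*(postings(t) for t in terms))


def boolean_not(include, exclude):
    """Records indexed under `include` but not under `exclude`."""
    return postings(include) - postings(exclude)


if __name__ == "__main__":
    print(boolean_and("solar energy", "public policy"))
    print(boolean_and("wind power", "photovoltaics"))
```

In this made-up data, the second query returns an empty set because the two search sets never overlap. That is the precision described above: a record either meets the criteria or it is not retrieved.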
With the invention of search engines, search queries changed. An algorithm is used to retrieve information and resources based on the frequency of the requested term(s) within specific fields, usually title, author, abstract, and subject headings or assigned subject terminology. The results are usually displayed in ranked order; the more frequent the term(s), the higher on the list. This ranking is processed using “fuzzy logic,” or what I’ve described as “what you were looking for and then all the rest of the stuff that you might want.”
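The frequency-based ranking described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, assuming a raw count of query terms across a few fields; real search engines use far more elaborate and undisclosed algorithms, and the documents, field names, and function names here are made up.

```python
import re
from collections import Counter

# Hypothetical records with the kinds of fields mentioned above.
DOCS = [
    {"title": "Solar panels for homes",
     "abstract": "Solar panels cut energy bills.",
     "subjects": "solar energy"},
    {"title": "Wind turbines",
     "abstract": "Turbines and solar farms compared.",
     "subjects": "wind power"},
    {"title": "Gardening basics",
     "abstract": "Soil, water, and light.",
     "subjects": "horticulture"},
]
FIELDS = ("title", "abstract", "subjects")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(doc, query_terms):
    """Count how often the query terms occur across the searched fields."""
    counts = Counter()
    for field in FIELDS:
        counts.update(tokenize(doc.get(field, "")))
    return sum(counts[term] for term in query_terms)


def ranked_search(query):
    """Return titles of matching documents, most frequent matches first."""
    terms = tokenize(query)
    scored = [(score(doc, terms), doc["title"]) for doc in DOCS]
    hits = [(s, title) for s, title in scored if s > 0]
    return [title for s, title in sorted(hits, key=lambda pair: -pair[0])]


if __name__ == "__main__":
    print(ranked_search("solar"))
```

Note that the wind turbine record still appears for the query "solar" because the word occurs once in its abstract. It simply ranks lower, which is the “all the rest of the stuff” effect.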
Search engines use a very complex algorithm to find and display results. This fits well within the philosophy of “good enough” search results. In other words, what the search engine finds is good enough even if it wasn’t what you were looking for.2
Let’s step back a minute and think about that last statement as it pertains to the researchers in your specific industry or company. Is the information “good enough”? Is it sufficient and complete enough for the researcher to continue a project, a research or medical study, or even perform a task? Is a “sort of” answer what your researcher really needs? Why are the results less focused and less precise and yet “good enough”?
Database query versus search engines
What’s the difference between a database query function and a search engine? Over the decades, the difference has certainly blurred. Databases of articles, reports, monographs, and audiovisual materials are finite. Items are retrieved by searching the finite universe through identifiable fields such as author, title, subject, abstract, subject headings and thesauri terms, and identifiers such as call and item numbers. When items are input into the database, they are retrievable using one or more of the identifiable fields. The converse is also true. If collections haven’t been input into the database, then a search of the database or catalog won’t retrieve the item. Only the archivist or records manager with in-depth knowledge of the collections will know what else is tucked away in the storage area.
In contrast, a search engine searches all of the web, which, by its very nature, is infinite, changeable, and mutable. Searches via a search engine are often not replicable because of the variable nature of queries and the constant input of new content. Tags, natural language identifiers, and assigned metadata make items more visible to search engines and to the algorithms used for ranking results.
Finite Equals Focused
The same variability is true of some very large conglomerations of databases such as those for genealogy. Landing pages for large database sets are an amalgamation of these databases. The main search engine or search box will retrieve results from across a large and ever-changing number of databases. When you search across all of them, you don’t necessarily find what you are seeking because the information may be buried deep in the list of results.
Results from search engines and large conglomerations of databases are displayed based upon an undisclosed ranking system and algorithm. Search results are less focused and difficult if not impossible to replicate. On the other hand, if the researcher drills down to a specific database, the search query retrieves only results from within that finite collection. The search is focused and targeted.
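The difference between searching across everything and drilling down to one database can be sketched as follows. The database names and records are invented for illustration, and simple substring matching stands in for whatever query logic a real service would use.

```python
# Hypothetical collection of databases behind one landing-page search box.
DATABASES = {
    "census": [
        "Smith, John, 1880 census, Ohio",
        "Smith, Mary, 1900 census, Iowa",
    ],
    "military": [
        "Smith, John, Civil War pension file",
    ],
    "newspapers": [
        "Obituary of John Smith, 1912",
        "Smith family reunion notice",
    ],
}


def search_all(term):
    """Search every database; results from all sources are mixed together."""
    term = term.lower()
    return [
        (name, record)
        for name, records in DATABASES.items()
        for record in records
        if term in record.lower()
    ]


def search_one(db_name, term):
    """Search a single, finite database chosen by the researcher."""
    term = term.lower()
    return [r for r in DATABASES[db_name] if term in r.lower()]


if __name__ == "__main__":
    print(len(search_all("smith")))
    print(search_one("military", "smith"))
```

Even in this tiny example, the broad search returns every Smith from every source, while the scoped search returns only the pension file. With a large and changing set of databases, the gap between the two result lists grows accordingly.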
Summing it up
Searching specific databases and catalogs produces focused research and focused query results. By using advanced search features and taking advantage of facets or “search within” options, the researcher can hone both the search and the results.
Use search engines for “scatter shot” or preliminary research to narrow terminology. Then drill down: select the specific resources you want to search, and use subject-specific databases and reference resources to narrow and focus your search.
My next blog post will examine how active reading enhances focus and harnesses technology.
1 For more info on the history of dial-up databases, see: Susanne Bjørner, “Online Before the Internet, Part 1: Early Pioneers Tell Their Stories,” Searcher 11, no. 6 (June 2003), http://www.infotoday.com/searcher/jun03/ardito_bjorner.shtml
2 Robert Mackey, “Good Enough is the New Great,” NYT (Dec. 13, 2009).