Post Summary

In this post, I will give you a brief overview of Hathi Trust, its content, and how well the site’s built-in search tools let you determine if what you’ve found is Current, Relevant, Accurate, written by an Authority, and suited to your Purpose.

Overview of Hathi Trust

Hathi Trust is an online digital library and catalogue of 18 million+ digitized items, made freely available to information consumers where copyright law allows [1]: mostly items that are “public domain.” (See Dalhousie University’s Guide about Public Domain in Canada, and Stanford University Libraries for the US.)

In practical terms, this means that most of the items in the collection are older. For most subjects I searched (including chemistry, cats, and Canada), over 50% of the items with available full-text were published before 1909, and over 80% were from before 1949.

Bar graph of books on the subject of cats that Hathi Trust gives people access to based on decade of publication.
My research suggests that 60.1% of books on cats that Hathi Trust has access to were published before 1909, and 82.3% before 1949. Clearly, I need a time machine for maximum snuggles.

Several university and other libraries contributed scans of books and book-like objects. Most of these libraries are based in Canada and the US [2].

The content reflects the interests of libraries based in those countries – mostly items published in Canada, the US, and Europe. This does not mean that there is nothing from other parts of the world, just that the collection will be less strong in those areas.

Content in Hathi Trust

Most of the collection is based on the content of university and other post-secondary libraries. What is available tends to reflect the priorities of that type of library.

A very modern, alarmingly well-lit library
Image by Oli Götting from Pixabay

Hathi Trust gives you access to:

  • Academic and technical content: excellent range of topics, though almost exclusively older content.
  • Popular adult and children’s fiction: there isn’t a lot as a percentage of the total, but even a tiny fraction of 18 million+ items is more than the average human can read in several lifetimes. It isn’t great for browsing fiction – you should have some idea of what you are looking for, like the name of a very dead author you want to read.
  • Educational content: again, excellent range of subjects, though almost exclusively older. If you are looking for recently published educational stuff, you are much better off looking at an Open Education source like Open Educational Resources Commons.

More of these:

Old books on a shelf, photographed at an unusual angle
Image by Pexels from Pixabay

Fewer of these:

New books on shelves
Image by Lubos Houska from Pixabay

The main groups I would recommend Hathi Trust to include:

  • People who like reading old books (both professional historians, and members of the general public)
  • Writers, especially historical fiction writers and science fiction writers with time-travel plots
  • Anyone wanting detailed descriptions and/or drawings of plants, bugs, and animals like these. (Artists, biologists, etc.)
  • Tech-savvy ghosts wishing to read things published during their lives. (Assuming they died 95+ years ago.)
Drawing of a ghost reaching out
I want boooooooks!
Image by Dean Lewis from Pixabay

Since the majority of what Hathi Trust has is research focused, how good a job does the website do of letting users figure out if what they are looking at is CRAAP? Is it Current, Relevant, Accurate, written by an Authority, for an appropriate Purpose? (Read more about CRAAP.)

Poop Emoji
Image by Alexa from Pixabay

The more stars it gets in each section, the more tools you have to help you assess each part of CRAAP without actually reading each result.

How Hathi Trust’s search tools can help you find good stuff

The search tools and other information can help you find out if the search terms you are using are giving you good results, narrow down your results, or assess the quality of an individual search result.

Go to section on Currency, Relevant, Accuracy, Authority, or Purpose.

Currency

Vintage pocket watch
Image by Jakub Luksch from Pixabay

You have a couple of different ways you can either determine the age of what you are looking at, or narrow down your search based on age.

  • narrow down the search by the age of the material using the filters
Currency Screenshot 1: In the results from a search, you can scroll down until you see the “date of publication” filter, and open it up by clicking on the “v”. You can then filter your results by clicking on the date range you are interested in. If the date range you want doesn’t appear, you may find the advanced search in “Currency Screenshot 4” more useful.
  • see how old an individual item is in the summary result or in the record that opens when you click on the “catalog record” button below each result
Currency Screenshot 2: In the list of search results, you get a summary of each hit. There is a published date for this record, but in many cases, you’ll get a lot more by clicking on the white “catalog record” button for the item you are interested in.

Once you have found what you are looking for, check the button below the result. If a free digital version is available, you will see the orange “full view” button shown here. A white button, usually with the text “limited (search-only)”, means you don’t have access through Hathi Trust.

Clicking on the “catalog record” will take us to the next screenshot.
Currency Screenshot 3: Here, you can see more details about the publication date. In the row labelled “Published”, you can see the publisher and their location, and the date published. In this record “c1928” tells you that the date is not an exact publication date. The c may indicate that the person who catalogued this item was giving an approximate date (c meaning “circa”), or they may be giving a copyright date.
  • in an advanced search, limit your results to a date range
Currency Screenshot 4: When you click on the “advanced search” link shown below the search bar’s orange “search” button, you will get to a screen something like this. Scroll down to the “publication year” section, and fill out the appropriate options.
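If you copy a batch of “Published” dates out of catalog records and want to sort or filter them yourself, the “c1928” style of date needs a little handling. The sketch below is only an illustration, not anything Hathi Trust provides: the function names are made up, and it assumes the year is four digits with an optional leading “c”. It cannot tell you whether the “c” meant circa or copyright, only that the date should not be treated as exact.

```python
import re

# Hypothetical helper (not part of Hathi Trust): reads a catalogue date
# string such as "c1928" or "1928".
# Assumption: the year is four digits, optionally preceded by a "c".
DATE_PATTERN = re.compile(r"^\s*(c)?\s*(\d{4})")


def parse_published_date(text):
    """Return the year and whether it looks exact, or None if no year is found."""
    match = DATE_PATTERN.match(text)
    if match is None:
        return None
    has_c_prefix = match.group(1) is not None
    # A leading "c" may mean "circa" or "copyright"; either way, not exact.
    return {"year": int(match.group(2)), "exact": not has_c_prefix}


def in_range(text, start_year, end_year):
    """True if the parsed year falls inside the inclusive range."""
    parsed = parse_published_date(text)
    return parsed is not None and start_year <= parsed["year"] <= end_year
```

For example, `in_range("c1928", 1920, 1929)` treats the record as a 1920s item while `parse_published_date("c1928")["exact"]` reminds you that the year is approximate or a copyright date.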

Relevant

Can the site tell you what is inside the book by giving you a purrfect summary of the subject?

Image by Алексей Боярских from Pixabay

There are a couple of ways to do this using Hathi Trust, but there is one that gives the most predictable results. First do a basic search to find something close to what you want, then go to the catalog record for that item.

Relevance Screenshot 1: A search for ghosts gave way too many results. However, the “subject” filter doesn’t provide any useful help. Scroll through the hits that you got until you see something that looks like it is close to your topic, then click on the “catalog record” button below it.
Relevance Screenshot 2: We’ve found some ghosts on the internet. But are there more? Click on a relevant subject to get relevant results.
Relevance Screenshot 3: Look at all the ghosts in this search! Now when we scroll down to the “subject” filter, we can narrow down even better what we are looking for.

Not all catalog records list all the subjects! This means:

  1. you have to read some of the things to decide if they are about what you want
  2. in any search by subject, you will probably miss some relevant items!

Ranting from a mildly annoyed librarian ahead! Proceed with caution. If too technical, detour to the “Accuracy” section with the archery target.

Sign with text: "Warning: Flammable area"
Image by Clker-Free-Vector-Images from Pixabay

In this case I do not recommend using the advanced search for subject searches. You are much better off finding a relevant record, then narrowing down from there.

Relevance Screenshot 4: If your sample result is too specific, you can click on the broader term to the left of the “>” to get broader results.

In the click-to-find-subject method, clicking on the first part of “Ghosts > South Carolina” will get you any items with a subject heading containing ghosts (spooktacular!), and clicking on “South Carolina” will search for “Ghosts > South Carolina”, or only items that are at least partly about Ghosts in South Carolina. Being able to broaden or narrow a search is excellent, so +1/5 star.
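One way to picture the broaden-or-narrow behaviour is as matching on the leading parts of a subject heading. This is a rough sketch of that idea only, not how Hathi Trust actually runs the search: the function names are hypothetical, the prefix-matching rule is an assumption, and every heading in the sample list other than “Ghosts > South Carolina” is invented for illustration.

```python
def split_heading(heading):
    """Split a heading like "Ghosts > South Carolina" into its parts."""
    return [part.strip() for part in heading.split(">")]


def matches(record_heading, clicked_parts):
    """True if the heading starts with the parts that were clicked.

    Assumption: clicking a part searches for everything up to and
    including that part of the heading.
    """
    record_parts = split_heading(record_heading)
    return record_parts[:len(clicked_parts)] == clicked_parts


# Invented sample headings, apart from "Ghosts > South Carolina".
headings = [
    "Ghosts > South Carolina",
    "Ghosts > England",
    "Ghosts",
    "Folklore > South Carolina",
]

# Clicking "Ghosts": the broader search.
broad = [h for h in headings if matches(h, ["Ghosts"])]

# Clicking "South Carolina": the narrower search.
narrow = [h for h in headings if matches(h, ["Ghosts", "South Carolina"])]
```

Here `broad` keeps the three ghost headings, while `narrow` keeps only “Ghosts > South Carolina”.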

However…now we come to the problems. If the more intuitive click-to-find-stuff-on-the-same-subject method didn’t work, Hathi Trust would deserve only 1/5 for searching by relevance/subject, but I’d be giving it 4/5 since I wouldn’t know how many results I was missing!!!!

Look at the subject results from my initial basic search for “ghosts” (Relevance Screenshot 1, above).

See many ghosts, ghoulies, or spirits in that list? Me neither.

I’m not there, because I’m trying to finally finish my to-be-read booooks pile!

As best I can tell, Hathi Trust does correctly pull the subject headings from the record for each item. It then pulls apart the components making up the subject or subjects and then spits out the bits and pieces to make up the list of subjects for that filtering tool.

Relevance screenshot 5

This gave less than useful subjects to filter by.
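To show why pulling headings apart produces such unhelpful filters, here is a small sketch of the behaviour as I understand it from the outside. It is a guess at the effect, not Hathi Trust’s real code: the function name is hypothetical and the sample records are invented.

```python
from collections import Counter


def facet_counts(records):
    """Count how many records mention each fragment of a subject heading.

    `records` is a list of records, each a list of heading strings.
    Assumption: every heading is split on ">" and each fragment becomes
    its own entry in the subject filter.
    """
    counts = Counter()
    for headings in records:
        pieces = set()
        for heading in headings:
            pieces.update(part.strip() for part in heading.split(">"))
        # Each record counts once per fragment, however often it repeats.
        counts.update(pieces)
    return counts


# Invented sample records for illustration only.
records = [
    ["Ghosts > South Carolina"],
    ["Ghosts > England"],
    ["Folklore > South Carolina"],
]
counts = facet_counts(records)
```

In this toy example “South Carolina” ends up as a filter with two records behind it, even though only one of them has anything to do with ghosts. The fragment has lost the context that made it useful.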

As a librarian, my next instinct was to use advanced search. It did not go well.

Since “keywords” was not a search option in the advanced search drop-down menus, I went for “subject.”

Sadly, no matter how I typed in my search terms in the advanced search, I was missing a lot of results when compared to finding a relevant record, then accessing items on the same subject by clicking on the appropriate subjects in the catalogue record. Even when I typed in the same punctuation, my advanced search for ghosts turned up way fewer results than the basic search.

Relevance screenshot 6: I didn’t find as many ghosts in the advanced search as I did by using the earlier method.

If there is something I am missing between my advanced search (Relevance screenshot 6) and my basic search (Relevance screenshot 3), please let me know. Right now, I’m giving up looking for ghosts.

Accuracy

Image by Christian Plass from Pixabay

To be fair, this is normal for the vast majority of information sources.

You’ll have to look inside the items and compare them to other research results to determine how accurate they are. Or how accurate they would have been considered when they were first published. Enjoy!

Authority

Image by Kirill from Pixabay

Normal for an online database or library catalogue is 2-4 stars, so Hathi Trust doesn’t stick out either way in this.

Confirming that an author or creator knows what they are talking about is important [citation needed]. There are two slight hints in the record for an individual item that may help you figure this out. For hard-core research, however, you will need to dig deeper than what is available within Hathi Trust.

In the item record, you can see both the author and the publisher.

Authority screenshot 1: if you don’t know if the author knows what they are talking about, can you trust the publisher to have done some digging into the writer’s backyard? Or at least their background?

How does this help?

Well…not much, though the publisher is more likely to give you some hints. You can get some idea based on the name of the publisher – a name like “University of Somewhere Press” will probably have a higher academic standard in what it wants people to associate its name with.

I would bump this rating up if you could always click on the author and publisher like you can the subjects. The author is usually clickable, but not always – and it doesn’t always produce a meaningful number of results. This would at least let you know if either of those had published books on similar topics.

Since the content is mostly very vintage, many publishers (possibly the majority) have either gone out of business or been eaten by larger presses and won’t have a website to see what else they publish. Even those that do may have shifted focus so much over the decades, that they are no longer an authority in areas they used to do well with. *cough* The History Channel and The Learning Channel *cough, cough*

Purpose

Word art with examples of the purpose an information source can have, including: education, entertainment, advertising, propaganda, etc.

This low of a rating is normal for something so academically focused. Practically everyone who makes it to academia can read, although staying focused…is that a squirrel?

Image by Amy Spielmaker from Pixabay

Hathi Trust mostly leaves working out the purpose of a potential information source, however dark or light, as an exercise for the reader.

There is one tiny thing that you will sometimes see if you look in the subjects – an intended age group, like “juvenile.” If there is nothing like that there, you should assume it is meant for adults, but there is no reliable way to determine if it is aimed at a layperson, professional, or academic audience without reading it.

Purpose Screenshot 1

There is also no way to reliably know in advance if what you found is going to be a propaganda piece, advertising, etc.

You’ll have to determine that through other clues, like who wrote it, and the tone of the writing. Enjoy!

Random search tool tidbits

For basic searchers

The search is not case sensitive – searches for mcgill, Mcgill, McGill, and mCgiLL all gave me the same number of results. Plus, putting in those search terms caused exceptional stress for autocarrot auto correct. Bonus!

For librarians or people who want to improve their information search skills

  • You can use wildcards in some types of search (for example, wom?n finds woman or women).
  • Boolean operators (AND, OR, etc.): since the basic search is not case-sensitive, a search for cats and dogs gives the same number of hits as cats AND dogs. To actually use Booleans, you need to go into the advanced search.
  • As far as I can tell, this database does not have a subject thesaurus for searches. If I’m wrong, I’d love to know!
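If wildcards and case-insensitive matching are new to you, this small sketch shows the general idea on a handful of invented titles. It is not Hathi Trust’s search engine: the function names are hypothetical, Python’s `fnmatch` rules (where `?` stands for exactly one character) are used as a stand-in for the site’s wildcard behaviour, and requiring every term to match is only an approximation of a Boolean AND.

```python
from fnmatch import fnmatchcase


def matches_term(pattern, word):
    """Compare a wildcard pattern to a word, ignoring case."""
    return fnmatchcase(word.lower(), pattern.lower())


def search(titles, required_patterns):
    """Return the titles in which every pattern matches at least one word.

    Requiring all patterns to match behaves like joining them with AND.
    """
    hits = []
    for title in titles:
        words = title.lower().split()
        if all(any(matches_term(p, w) for w in words) for p in required_patterns):
            hits.append(title)
    return hits


# Invented titles for illustration only.
titles = ["Woman and Cats", "Women of Canada", "Dogs"]

wildcard_hits = search(titles, ["wom?n"])        # woman or women
both_terms = search(titles, ["CATS", "wom?n"])   # case is ignored
```

Here `wildcard_hits` contains the first two titles, and `both_terms` contains only “Woman and Cats”, whether you type CATS, cats, or cAtS.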

Summary

Overall, Hathi Trust is an excellent resource for finding online vintage material. With very few exceptions, the tools built into the site will help you assess the quality of the information that you found as well as or better than any rival.

Further Reading

For People with basic-intermediate search skills and/or who are short on time

For librarians or people who want to improve their information search skills

References

  1. https://www.hathitrust.org/about/
  2. https://www.hathitrust.org/member-libraries/contribute-content/contributors/ and https://www.hathitrust.org/the-collection/

Edited from original version: grammar fix, added Post Summary Section, and added links to sub-sections of the post for navigation.
