Web Archives

We use Archive-It to collect, preserve, and provide access to content from around the web that complements our existing collecting areas and the research interests of our faculty, students, and staff.

The websites we include in our web archiving collections are selected on an individual basis. We welcome suggestions for content within these areas.

How It Works

Archive-It crawls and stores web content using the Heritrix web crawler, which is designed to crawl websites without interfering with access. Most crawls run only a couple of times per year and last a few days. Once a crawl is complete, the crawler ceases to interact with the server. If you have any questions or concerns, please contact us at webarchiving@umich.edu.

All preserved sites will be prominently labeled as an “archived web page” with information about the crawl to avoid confusion with live websites. Archived versions of websites may appear incomplete because certain types of content are difficult to archive, such as streaming media, database-driven content, and JavaScript-driven content. We do not crawl areas of websites that are password-protected.

If you have questions or concerns about copyright, please see our Takedown Policy.

[Image: A wall of servers labeled “Internet Archive” down the middle.]