Text and Data Mining

Find text and data sets and explore related tools, methods, and analysis through our workshop, class instruction, and consultation offerings.

Text and data mining is the process of getting insights by analyzing large, machine-readable text or data sets to identify patterns, relationships, and other structured information that’s valuable for research and analysis.
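
As a rough illustration of what identifying patterns can mean in practice, the sketch below counts the most frequent terms across a few short texts. It is a minimal example rather than a recommended workflow: the sample sentences, the tiny stop-word list, and the function names tokenize and term_frequencies are all invented for illustration, and a real project would work with a much larger corpus and more careful text preparation.

```python
import re
from collections import Counter

# Hypothetical sample documents, standing in for a much larger corpus.
SAMPLE_DOCS = [
    "The committee held a hearing on data privacy.",
    "Privacy advocates testified at the hearing.",
    "The hearing record includes data on privacy complaints.",
]

# A tiny illustrative stop-word list (an assumption; real lists are longer).
STOP_WORDS = {"the", "a", "on", "at", "and", "of", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(docs):
    """Count how often each non-stop-word term appears across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(t for t in tokenize(doc) if t not in STOP_WORDS)
    return counts


if __name__ == "__main__":
    # Print the three most frequent terms and their counts.
    for term, count in term_frequencies(SAMPLE_DOCS).most_common(3):
        print(term, count)
```

Counting by hand is manageable for three sentences. The same loop applied to thousands of documents is where a computational approach starts to pay off.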

What do you want to do?

Whether text and data mining calls for a computational process, as opposed to analysis by hand, depends on the complexity of your project.

Review these questions as you consider what you want to find out:

  • Do you need to analyze an entire corpus (body of work), or just selected items from it? 
  • Do you need the content to be easily read by humans, or only by machines?
  • Do you need to download the entire contents? Or can your analysis be conducted on the platform where the content already resides?
  • What kind of analysis do you want to do?

Find resources for mining

The following research guides contain text and data in broad categories or genres — such as content from the last year of Twitter, Congressional hearings, or particular newspapers. Most are licensed for use only by U-M researchers.

You aren’t limited to these resources. We can help you get the data you need by locating additional text or data sources, advising on whether licensed content is available to mine, and, in some cases, negotiating access to datasets and text collections for U-M researchers.

How we can help

We can help you find and use text and data sets for mining. We also offer introductory consultations, workshops, and in-class instruction.

Contact our digital scholarship team at library-ds@umich.edu with specific questions or to place a request.

Request a consultation

We offer introductory training on specific tools for data processing and text analysis, and we make referrals when appropriate. We also provide consultations on data collections that the U-M Library owns or licenses.

Request a workshop or class instruction

Contact us to request a workshop or in-class instruction focused on text analysis. We offer overviews and introductions to text and data mining technologies, methodologies, and tools.

University of Michigan Library


Except where otherwise noted, this work is subject to a Creative Commons Attribution 4.0 license. For details and exceptions, see the Library Copyright Policy.
