Help Center

Text Mining Resources

What newspapers are available for data mining?

Baker Library has licensed the following newspapers from ProQuest for data mining. The files are currently available on hard drives. HBS researchers may contact BRDS@hbs.edu for more information.

Harvard affiliates may also want to explore ProQuest TDM Studio, a tool that allows you to mine large volumes of published content.

Newspaper Title                  Years of XML/PDF Articles  Article-Level vs. Page-Level
Atlanta Constitution             1868-1930 (XML only)       TBD
Austin American Statesman        1871-1926                  all years article-level
The Baltimore Sun                1837-1932                  all years article-level
The Boston Globe                 1872-1987                  all years article-level
Chicago Tribune                  1849-1935                  all years article-level
The Christian Science Monitor    1908-1995                  all years article-level
The Cincinnati Enquirer          1841-2009                  1841-1922 article-level; 1923-2009 page-level
Dayton Daily News                TBD                        TBD
Detroit Free Press               1831-1999                  1831-1922 article-level; 1923-1999 page-level
Hartford Courant                 1764-1934                  all years article-level
Los Angeles Times                1881-1950                  all years article-level
Louisville Courier-Journal       1830-2000                  1830-1922 article-level; 1923-2000 page-level
Nashville Tennessean             1812-2002                  1812-1922 article-level; 1923-2002 page-level
The New York Times               1851-1933 (XML only)       TBD
New York Tribune/Herald Tribune  1841-1962                  all years article-level
Newsday                          1940-1990                  all years article-level
Philadelphia Inquirer            1860-2001                  all years page-level
San Francisco Chronicle          1865-1922                  all years article-level
St. Louis Post-Dispatch          1874-2003                  1874-1922 article-level; 1923-2003 page-level
Wall Street Journal              1889-1932 (XML only)       TBD
Washington Post                  1877-1937                  TBD
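
When planning a project, it can help to check which titles cover a given year and whether that year is segmented at article level. The following is a minimal Python sketch, not an official tool: the Coverage type, the HOLDINGS list, and the titles_for_year function are hypothetical names introduced for illustration, and the list holds only a handful of rows copied from the table above.

```python
from typing import List, NamedTuple, Optional


class Coverage(NamedTuple):
    title: str
    first_year: int
    last_year: int
    # Last year available at article level; None means page-level only.
    article_level_through: Optional[int]


# Illustrative subset of the table above (rows marked TBD are omitted).
HOLDINGS = [
    Coverage("The Boston Globe", 1872, 1987, 1987),
    Coverage("The Cincinnati Enquirer", 1841, 2009, 1922),
    Coverage("Los Angeles Times", 1881, 1950, 1950),
    Coverage("Philadelphia Inquirer", 1860, 2001, None),
    Coverage("St. Louis Post-Dispatch", 1874, 2003, 1922),
]


def titles_for_year(year: int, article_level_only: bool = False) -> List[str]:
    """Return titles in HOLDINGS whose licensed range includes `year`."""
    matches = []
    for row in HOLDINGS:
        if not (row.first_year <= year <= row.last_year):
            continue
        is_article_level = (
            row.article_level_through is not None
            and year <= row.article_level_through
        )
        if article_level_only and not is_article_level:
            continue
        matches.append(row.title)
    return matches


print(titles_for_year(1930))
print(titles_for_year(1930, article_level_only=True))
```

For 1930, the second call keeps only The Boston Globe and Los Angeles Times, because the other listed titles are page-level after 1922 or page-level throughout.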

The Harvard Kennedy School also has a guide on resources available for text mining.

Text Analysis Tools

Still need help?

Our expert librarians are here to help you find what you’re looking for.
