Digitizing 150 Years of Historic Newspapers
By Innodata Inc. on June 1, 2017
When a publisher acquired the rights to the microfilm archives of two of America’s most prestigious newspapers – The Washington Post and The New York Times – executives knew they were sitting on a potential gold mine.
But extracting that gold – a century and a half of news articles, editorials and photos capturing the history of the United States – would not be easy. The two archives combined hold nearly 5.6 million newspaper pages, more than one million articles and more than 100,000 individual newspaper editions. Moreover, newspapers are notoriously difficult to digitize, given the large page format, page jumps and multiple photos and graphics.
The publisher also needed to convert the archives into a format other than microfilm so that they could be preserved indefinitely and their potential for reuse maximized. If the digitization was not cost-effective and efficient, the publisher would find it difficult to obtain an adequate return on its investment.