Innodata Isogen Digitizes Extensive Collection of Legacy Publications
A global financial organization charged with overseeing and monitoring foreign exchange rates, balance of payments and government economic models has consistently recorded its past and ongoing activities and viewpoints in an extensive publications library. These publications not only serve as documents of record, but also help financial analysts and other observers shape government economic policy in countries worldwide.
In the past, many of these documents existed only on paper. However, the organization recently decided to maximize the value of this content repository by converting 25 years’ worth of publications to an electronic format. This would allow the content to be searched and retrieved online, making it easier to use and vastly expanding the potential audience.
Moreover, the industry’s unrelenting price wars put the company under intense pressure to control costs. While the company needed to maintain adequate staffing levels in its technical publishing department to support product launches, it also wanted to streamline what had become a cumbersome process for producing documents for both print and the web.
This was hardly a straightforward digitization task. Many of these documents had been printed in different formats, from hard- and soft-cover books to document-style reports. Some 1,500 of the 7,000 publications were bound books, printed in a variety of trim sizes: 6” x 9”, 8 ½” x 11”, and 10” x 7”. Ranging in length from 60 to as many as 85 pages, the multilingual publications included photos, charts, tables, and graphs. In addition, the organization required each of the 7,000 paper documents to be delivered as six different electronic files.
To convert the diverse documents and their content into a searchable electronic format, the organization turned to Innodata Isogen, which has carried out some of the world’s largest data conversion efforts.
Accuracy is of paramount importance to the agency, which commands a singular position as a leading monitor of global financial policies. Innodata Isogen's proven ability to meet or exceed the base accuracy requirement of 99.95% was a key factor in the agency’s final decision.
Innodata Isogen began by assessing the technologies required to handle the volume and complexity of the source material. Using proprietary imaging technologies and processes, the team then processed the body text of the documents, along with the photographs, charts, and other graphic elements. In all, Innodata Isogen was required to provide six output files for each document.
To improve search capabilities, the team also set up a disciplined process for extracting targeted metadata. The six files for each publication were assigned a common file name structure, with distinguishing characters that identify the type of each file. Each publication has a base identifier that encodes information about its type and follows the naming schema.
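The naming approach described above can be illustrated with a minimal Python sketch. It is not the schema used on the project, whose details are not given here: the base identifier "WP1998-042", the single-letter type codes, and the underscore separator are all hypothetical, and only the PDF and TIFF outputs named in this case study are included.

```python
# Illustrative only: the real schema, type codes, and separator are not
# documented in this case study. Everything below is an assumption.

def build_file_names(base_id: str, type_codes: dict[str, str]) -> dict[str, str]:
    """Return one file name per output type, all sharing one base identifier."""
    return {
        code: f"{base_id}_{code}.{extension}"
        for code, extension in type_codes.items()
    }


def parse_file_name(file_name: str) -> tuple[str, str, str]:
    """Split a file name back into (base identifier, type code, extension)."""
    stem, extension = file_name.rsplit(".", 1)
    base_id, code = stem.rsplit("_", 1)
    return base_id, code, extension


# Hypothetical type codes for the two output formats the text mentions.
TYPE_CODES = {"p": "pdf", "t": "tiff"}

names = build_file_names("WP1998-042", TYPE_CODES)
for name in names.values():
    print(name, parse_file_name(name))
```

Because every file for a publication shares the same base identifier, all outputs for that publication can be grouped or looked up by parsing the name alone.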
Both Innodata Isogen and the customer conducted strict quality checks, including basic tasks such as verifying pagination, page order, and image quality for the PDF and TIFF files. Images needed to be clear, free of extraneous marks, and positioned properly on the page.
When the project was finished, the agency rapidly began making the new digital content available to economic officials, journalists, students, and others who rely on this unique content collection. As a result, the organization can now showcase its thought leadership in economic policy through this new, dynamic content repository.