– Case Study –

Text Annotation for Training an ML Recruiting Model

Large and highly accurate training datasets enabled the AI platform to match job seeker resumes with relevant job postings

Challenge

A job market analytics company needed data to train its AI platform so that it could evaluate job seekers and connect them with employers. The platform combs through thousands of resumes and parses critical keywords, terms, and skills. Applicants are then sorted and matched to the job openings that provide the best fit. For the platform to work, it required a large dataset of highly accurate, annotated job profiles.

Results: Data received a .94 kappa score

SOLUTION

To begin producing the training datasets, subject matter experts at Innodata created seven different taxonomies to annotate against. Innodata then passed 50,000 job profiles through its state-of-the-art annotation platform, which provided a first-pass annotation against the seven defined taxonomies. To ensure quality in the annotation of occupations, Innodata used a double-blind (inter-annotator) pass as well as an independent quality audit. After the platform's first pass, a human annotator annotated the dataset again. Where the two sets of annotations disagreed, an adjudicator decided between them.
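The two-pass review with adjudication described above can be sketched roughly as follows. This is a minimal illustration, not Innodata's platform: the profile IDs, the occupation labels, and the names `reconcile` and `prefer_second_pass` are all hypothetical, and the adjudicator is modeled as a simple callback standing in for a human decision.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: an adjudicator receives (profile_id, first_label, second_label)
# and returns the label that should be kept.
Adjudicator = Callable[[str, str, str], str]


def reconcile(
    first_pass: Dict[str, str],
    second_pass: Dict[str, str],
    adjudicate: Adjudicator,
) -> Tuple[Dict[str, str], List[str]]:
    """Merge two independent annotation passes over the same profiles.

    Labels that agree are accepted as-is; disagreements are sent to the
    adjudicator. Returns the final labels and the IDs that were disputed.
    """
    final: Dict[str, str] = {}
    disputed: List[str] = []
    for profile_id, label_a in first_pass.items():
        label_b = second_pass[profile_id]
        if label_a == label_b:
            final[profile_id] = label_a
        else:
            disputed.append(profile_id)
            final[profile_id] = adjudicate(profile_id, label_a, label_b)
    return final, disputed


def prefer_second_pass(profile_id: str, label_a: str, label_b: str) -> str:
    """Placeholder adjudicator (an assumption): always keeps the human label."""
    return label_b


# Made-up example data: the first pass comes from the platform,
# the second from a human annotator.
platform_labels = {"p1": "Nurse", "p2": "Data Analyst", "p3": "Electrician"}
human_labels = {"p1": "Nurse", "p2": "Data Scientist", "p3": "Electrician"}

final_labels, disputed_ids = reconcile(
    platform_labels, human_labels, prefer_second_pass
)
```

In a real workflow the callback would queue the item for a person rather than apply a fixed rule; the point of the sketch is only that agreement is accepted automatically and disagreement is escalated.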

IMPACT

The HR analytics company received highly accurate datasets for training its model. Innodata's annotation process produced data with a .94 kappa score, which indicates near-perfect agreement between annotators. The resulting datasets enabled the AI platform to automatically and correctly identify resumes that closely matched job profiles.
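For readers unfamiliar with the kappa score, the sketch below shows how such a figure can be computed for two annotators. It assumes Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance; the case study does not say which kappa variant was used, and the function name `cohen_kappa` and the sample labels are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both annotators must label the same non-empty item set")

    n = len(labels_a)

    # p_o: fraction of items on which the two annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # p_e: chance agreement, from each annotator's own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label for everything;
        # kappa is undefined here, so report full agreement by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up example: four profiles, one disagreement.
annotator_1 = ["nurse", "nurse", "engineer", "engineer"]
annotator_2 = ["nurse", "nurse", "engineer", "nurse"]
kappa = cohen_kappa(annotator_1, annotator_2)  # p_o = 0.75, p_e = 0.5
```

Because kappa discounts agreement that would occur by chance, it is a stricter measure than raw percent agreement: in the toy example the annotators agree on 75% of items, yet kappa is only 0.5.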
