Lourenco Miranda understands the role data annotation plays in driving the right business decisions for his company. As the regional head of capital management, resolution (risk), and model risk management at Société Générale, he needs access to up-to-the-minute information that is unbiased, ethical, and, most importantly, accurate. “If we train our models with the wrong data, we’re going to make the wrong decisions.”
Lourenco, like other business leaders experimenting with artificial intelligence and machine learning, needs annotated data to feed machine learning models. What’s more, because they work in a highly regulated environment like financial services, Lourenco and his team must understand how and why the data is tagged and labeled the way it is: it must be explainable. Even a minor inaccuracy could lead to costly consequences. This is why he believes data annotation is a critical step in machine learning that must be driven by subject matter experts.
Speaking alongside Innodata’s Chief Product Officer, Rahul Singhal, Lourenco shared some of the typical use cases he is seeing for annotated data and how it is driving the adoption of artificial intelligence. For his team, complex contracts are typically the most important type of asset that needs annotation, but according to the results of a poll we conducted with webinar attendees, the needs run across the board.
While text and images make up the bulk of the annotation work being done today, just about any type of data asset can be annotated to help train machine learning models.
No matter what is being annotated, it’s not easy to do. It’s a very manual process requiring time and resources. Right now, almost 16% of webinar attendees say they are spending over $100,000 on data annotation.
And while 20% say it’s less than $10,000, more than half are just not sure what it costs. As more companies recognize the role data annotation plays in artificial intelligence, we expect that number to increase and companies to start tracking the spend within their budgets.
Finally, Lourenco talked about the need for humans, specifically highly trained and experienced humans, to manage data annotation. His team handles the work with some external help, which seems to be the trend. Almost half of all companies we asked say they manage annotation both in-house and with an outside partner.
So, what challenges or opportunities have you encountered with data annotation? Be sure to check out the webinar recording to hear how Lourenco and his team are tackling it.