Highlights
[1:58] AI Learning Resources and Applications
[4:37] Ground Truth Data in Machine Learning
[9:09] Securing Needed Ground Truth Data
[15:30] Addressing Privacy Concerns with Data Collection
[17:46] Capturing the Human Side of AI
[24:04] Overcoming Biases in AI
[26:51] The Evolution and Future of AI
Michael's Insights
[3:50] “Basically anything you can think of, AI can be a part of it.”
[11:24] “When it comes to human interaction, it’s really hard to create synthetic data because everyone is different.”
[24:53] “I don’t think you can eliminate biases in AI, but what you can do is get as much data as possible.”
[29:18] “AI can help us do a lot of things we can’t do ourselves, especially in healthcare.”
Michael's Bio
Michael Nguyen is the VP of Global Data Practice and Partnerships at Innodata, Inc. He is an action-driven, technology-focused business development professional with over 20 years of experience delivering state-of-the-art products and services to global enterprises. For the past five years, Michael has focused on ground truth data for artificial intelligence and machine learning.
Show Notes
In this episode, Melody welcomes Michael Nguyen, VP of Global Data Practice and Partnerships at Innodata, Inc. Michael shares insights into all that AI is (and isn’t), the advances that have been made in ground truth data collection, and what it will take to overcome the biases that are still present in machine learning.
GROUND TRUTH DATA IN MACHINE LEARNING
Ground truth data is data that can’t be manufactured; it has to be captured. It is used primarily by companies that develop AI/ML products, and building a ground truth data set involves several critical considerations, starting with identifying what type of data you actually need. Michael points to data that usually isn’t readily available, such as human interaction data, and explains how his team overcomes the hurdles to secure it.
THE HUMAN SIDE OF AI
Some data are very easy to come by, while others are significantly more difficult to secure. Any data collection involving humans is more sensitive, and Michael highlights how his team addresses concerns and protects the privacy of everyone who shares data. Object and environment data collection is fairly straightforward, while speech, simulation, and human data are much more difficult to capture. According to Michael, the only way to mitigate bias in facial recognition is to capture as much data as possible.
THE EVOLUTION AND FUTURE OF AI
In the wake of the pandemic, a major focus of AI is shifting to healthcare. From robots performing surgery to self-driving cars, AI will continue to improve quality of life for everyone and take on tasks that humans are not best suited to handle.