Fairness in AI: It’s Not One-Size-Fits-All
Humans are inherently biased. Each of our thoughts, decisions, and actions is influenced by prior experience. And humans create and train AI models, which in turn learn from human-provided data and are retrained on the outcomes of their own prior predictions. Under these conditions, how can we expect machine learning models to be any more fair and equitable than we are?
The short answer is that we can’t. Perfect fairness may not even be possible, given the many types of bias, their varied origins, and the different ways it can be measured. But through well-designed tools, thoughtful decision-making, and careful monitoring, we can do our best to minimize bias and move toward fairness in AI.
What is Fairness in AI?
There is no universal, actionable definition of fairness in AI. Fairness has different meanings in different fields (e.g., law, social science, philosophy, and quantitative fields), can take various forms, and is measured in multiple ways. As a result, real-world applications of fairness are notoriously fraught.
Broadly speaking, fairness in AI refers to ML models that result in impartial treatment and/or equitable outcomes across all groups, particularly sensitive or protected classes of people — such as women and racial minorities — that have historically suffered from discrimination or unequal opportunities. But this definition is by no means complete.
Conflicting Definitions of Fairness
It is challenging to apply this broad idea of fairness to individual models and situations because of the many, often conflicting, variables. Fairness decisions require a significant amount of human judgment and a number of trade-offs:
- Individual fairness vs. group fairness: Should the model be optimized for fairness and accuracy at an individual level, or should it strive for fairness among groups? These are frequently at odds with each other.
- Fairness in treatment vs. fairness in outcome: Should there be equality in individual decision-making, or should the model be allowed to make “biased” decisions in order to achieve equality of outcomes between groups?
- Fairness in data collection vs. algorithmic development vs. system implementation: Should fairness be prioritized in the data, algorithms, or implementation, or all of the above? What happens when one type of fairness negates another, as is often the case?
- Equal outcomes vs. equal accuracy: Should the end-goal be equal outcomes between groups or equal error rates and types (false positives and false negatives) between groups? Both of these targets are difficult to achieve, and when base rates differ between groups, they generally cannot be achieved simultaneously. In addition, applying fairness constraints can reduce overall accuracy.
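To make the tension between equal outcomes and equal error rates concrete, here is a minimal Python sketch. The groups, labels, and predictions are invented for illustration; the point is that on this toy data the two groups receive positive predictions at the same rate (demographic parity holds) while their error rates still differ (equalized odds is violated):

```python
# Illustrative sketch (toy data, not from any real system): two common
# group-fairness criteria can conflict on the same predictions.

def selection_rate(preds):
    # Fraction of individuals receiving a positive prediction.
    return sum(preds) / len(preds)

def error_rates(labels, preds):
    # Returns (false positive rate, false negative rate) for one group.
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    negatives = sum(1 for y in labels if y == 0)
    positives = sum(1 for y in labels if y == 1)
    return fp / negatives, fn / positives

# Toy data: group A has a higher base rate of positive labels than group B.
labels_a, preds_a = [1, 1, 1, 0], [1, 1, 0, 0]
labels_b, preds_b = [1, 0, 0, 0], [1, 1, 0, 0]

# Demographic parity: selection rates are equal (both 0.5)...
parity_gap = selection_rate(preds_a) - selection_rate(preds_b)

# ...but error rates differ, so equalized odds is violated:
# group A has FNR 1/3 and FPR 0; group B has FPR 1/3 and FNR 0.
fpr_a, fnr_a = error_rates(labels_a, preds_a)
fpr_b, fnr_b = error_rates(labels_b, preds_b)
```

Satisfying one criterion here actively rules out the other, which is why fairness decisions require explicit human judgment about which target matters in a given context.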
Types and Causes of Bias
Improving fairness in AI requires finding and eliminating bias. Here again, the parameters and best practices are neither straightforward nor universal.
In AI, bias can make its way into the data, the algorithms, and the review process. Some of the more common types and sources of bias include:
- Historical bias – bias that exists in the real world and is represented in the data even if sampling methods, algorithms, etc., are fair and equitable.
- Selection bias – bias that arises when a data set is not representative of a population, i.e., when certain groups are under- or over-represented in the data.
- Measurement bias – bias that results from incorrect feature selection and labeling of the data. Easily available data may receive more attention than the features or labels that are actually relevant.
- Aggregation bias – bias that occurs when heterogeneous populations are incorrectly grouped together.
- Implicit bias – existing human biases that unconsciously affect how developers create and train models.
- Group attribution bias – bias that comes from generalizing individuals’ attributes to a group to which they belong.
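Selection bias in particular is easy to demonstrate. The following sketch uses an invented toy "population" (the groups and rates are hypothetical, chosen only for illustration) to show how a sample that over-represents one group skews an estimated rate away from the true population value:

```python
# Illustrative sketch of selection bias: the population below is invented.
# 500 "urban" individuals (60% positive) and 500 "rural" (20% positive).
population = ([("urban", 1)] * 300 + [("urban", 0)] * 200 +
              [("rural", 1)] * 100 + [("rural", 0)] * 400)

# True population-wide positive rate: 400 / 1000 = 0.4.
true_rate = sum(y for _, y in population) / len(population)

# A data-collection process that only reaches urban respondents
# over-represents the group with the higher positive rate.
biased_sample = [(g, y) for g, y in population if g == "urban"]

# Estimated rate from the biased sample: 300 / 500 = 0.6.
biased_rate = sum(y for _, y in biased_sample) / len(biased_sample)
```

A model trained on the biased sample would learn a picture of the world that is 50% off on this single statistic, before any algorithmic choices are made, which is why representativeness checks belong at the data-collection stage.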
Tools and Methods to Reduce Bias and Promote Fairness
Since fairness is multifaceted and tends to be a moving target, fairness practices (and even definitions) must be customized for different contexts and purposes. There are tools that can assist in this process. In general, such tools should be seen as starting points rather than end points; teams can use them to prevent, detect, mitigate, and monitor bias, but they must be careful not to rely too heavily on any single tool. Every method has strengths and limitations. What appears to work best is a combination of quantitative and qualitative tools, along with collaborative efforts to reduce bias at every stage of the AI lifecycle.
With these caveats in mind, examples of recommended quantitative tools (according to an informative and succinct fairness overview from UC Berkeley’s Haas School of Business) include:
- IBM’s AI Fairness 360 Toolkit
- Google’s What-If Tool
- Microsoft’s Fairlearn
Well-researched qualitative tools include the Co-Designed AI Fairness Checklist, developed by the Microsoft FATE (Fairness, Accountability, Transparency, and Ethics in AI) team, and the Fairness Analytic, by Mulligan et al.
In addition, transparency and explainability in AI models are critical for AI fairness. In order to effectively identify and mitigate bias in models, AI decisions must be visible and understandable.
Here are some broad but helpful guidelines for AI Fairness, excerpted directly from the UC Berkeley brief mentioned above:
- Identify fairness considerations and approaches up front, and ensure appropriate voices (i.e. experts in the relevant domain and across disciplines) are included and empowered in the conversation.
- Instead of trying to make an ML system completely fair (or “de-biasing” it), the goal can be to detect and mitigate fairness-related harms as much as possible. Questions that should always be asked include: Fair to whom? In what context?
- There aren’t always clear-cut answers, so document processes and considerations (including priorities and trade-offs).
- Use quantitative and qualitative approaches and tools to help facilitate these processes. Tools do not guarantee fairness! They are a good practice within the larger holistic approach to mitigating bias.
- Fairness doesn’t stop once an AI system is developed. Ensure users and stakeholders can see, understand and appeal choices made by AI systems.
In Short, Fairness Takes Effort
Fairness in AI can be a quagmire and an uphill battle. Because it is a relatively new field of inquiry, there are often no organizational procedures or frameworks in place to promote and support it. Hence, fairness implementation may fall on a few motivated individuals who must fight against tight deadlines and inflexible policies to make fairness a priority. In order for fairness in AI to become feasible and sustainable, this must change. Companies must examine their own practices and enable collaboration between departments, organizational levels, and teams to determine the best ways to promote fairness — for their specific industries, customers, projects, data, and users. Fairness is far too complex, malleable, and subjective for one-size-fits-all solutions. There is no one way to achieve absolute fairness in AI. But there are many practicable ways to get close, and with concerted effort and innovation, these methods will only improve.