ThinkTech
Risk Entry · 13 min read

AI Bias: Types, Impact Assessment, Detection, and Mitigation

AI systems can encode and amplify existing biases in training data, model design, and deployment context. Assessment frameworks and mitigation approaches.

By ThinkTech Research | Published April 12, 2026

You are deploying an AI system that makes or influences decisions about people, and you need to understand bias: where it comes from, how to detect it, what mitigation options exist, and what regulations require.

Definition

AI bias refers to systematic errors in AI system outputs that produce unfair outcomes for specific groups of people. Bias is not a single phenomenon. It enters AI systems at multiple points in the development and deployment lifecycle, and different sources of bias require different responses.

NIST Special Publication 1270 identifies three broad categories: systemic bias (rooted in institutions and society), statistical and computational bias (arising from data and algorithms), and human bias (from the people who design, develop, and use AI systems). These categories overlap. An AI hiring tool may reflect all three simultaneously.

The critical distinction: bias in AI systems is not limited to the training data. Bias can be introduced or amplified at every stage, from problem formulation through deployment and monitoring.

Documented bias types

Five well-documented bias types appear in the machine learning literature. Suresh and Guttag (2021) provide a taxonomy covering the full ML lifecycle:

| Bias Type | Source | Description | Example |
| --- | --- | --- | --- |
| Historical bias | Training data | The data reflects existing societal inequities, which the model learns and reproduces | A hiring model trained on historical hiring data learns to favor male candidates for engineering roles because historical hiring was skewed |
| Representation bias | Training data | Some groups are underrepresented in the training data, leading to lower performance for those groups | Facial recognition systems trained primarily on lighter-skinned faces perform significantly worse on darker-skinned faces (Buolamwini and Gebru, 2018) |
| Measurement bias | Data collection | The features used as proxies for the target variable are less accurate for some groups than others | Using zip code as a proxy for creditworthiness disadvantages applicants from historically redlined neighborhoods |
| Aggregation bias | Model design | A single model is used for populations with different underlying patterns, producing poor results for subgroups | A medical risk model trained on pooled data performs well on average but misses risk factors specific to certain demographic groups |
| Evaluation bias | Testing | The benchmarks and test data used to validate the model do not represent all affected populations | A language model is evaluated on English benchmarks and declared "accurate" without testing on dialects, accented speech, or non-standard usage patterns |

Impact assessment by domain

Bias creates different types of harm depending on the deployment context. The severity depends on the stakes of the decision and the vulnerability of the affected population:

| Domain | Impact Level | Primary Bias Risks | Affected Populations |
| --- | --- | --- | --- |
| Hiring and recruitment | Critical | Historical bias in training data replicates past discrimination; measurement bias when proxies correlate with protected characteristics | Job applicants from underrepresented groups |
| Lending and credit | Critical | Historical bias reflects discriminatory lending practices; aggregation bias when a single model serves diverse populations | Applicants from historically redlined communities, minority borrowers |
| Healthcare | Critical | Representation bias when clinical data underrepresents certain demographics; measurement bias when health metrics perform differently across populations | Patients from underrepresented racial, ethnic, or age groups |
| Criminal justice | Critical | Historical bias in arrest and sentencing data encodes systemic disparities; evaluation bias when models are validated on non-representative samples | Defendants from overpoliced communities |
| Content moderation | High | Historical bias in labeling data reflects cultural biases of annotators; aggregation bias when a single model moderates content across cultures | Users from marginalized communities, speakers of minority dialects |
| Insurance | High | Historical bias in claims data; measurement bias when risk factors correlate with protected characteristics | Applicants from groups historically charged higher premiums |

Detection methods

Disparate impact analysis

Compare AI system outcomes across demographic groups. Calculate the selection rate for each group and check whether any group's rate falls below four-fifths (80%) of the highest group's rate, the threshold the EEOC uses as a rule of thumb for disparate impact. This is a starting point, not a complete analysis.

**Limitations:** Requires access to demographic data, which may not be available or may raise its own privacy concerns. The four-fifths rule is a rough threshold; passing it does not guarantee fairness.
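
As a concrete illustration, the sketch below computes per-group selection rates and flags any group whose impact ratio falls below 0.8. The column names and toy data are illustrative assumptions; in practice the outcomes would come from the system's decision log.

```python
import pandas as pd

# Hypothetical outcome data: one row per person, with a group label and a
# binary selection decision. Column names are illustrative.
df = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "selected": [1,   1,   0,   1,   1,   0,   0,   0],
})

# Selection rate per group
rates = df.groupby("group")["selected"].mean()

# Impact ratio: each group's rate relative to the highest-rate group
impact_ratios = rates / rates.max()

# Flag groups that fall below the four-fifths (80%) threshold
flagged = impact_ratios[impact_ratios < 0.8]

print(rates)
print(impact_ratios)
print("Groups below the four-fifths threshold:", list(flagged.index))
```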

Fairness metrics

Apply quantitative fairness metrics to the model's predictions. Common metrics include:

  • **Demographic parity:** Equal selection rates across groups
  • **Equalized odds:** Equal true positive and false positive rates across groups
  • **Predictive parity:** Equal precision (positive predictive value) across groups
  • **Calibration:** Predicted probabilities mean the same thing across groups

**Limitations:** These metrics can conflict with each other. Optimizing for one may worsen another. The choice of metric depends on the context and values of the deploying organization.
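
The sketch below computes the per-group rates behind these metrics (selection rate, true and false positive rates, precision) so the gaps can be compared directly. The function name, column names, and toy arrays are illustrative assumptions, not a standard API.

```python
import numpy as np
import pandas as pd

def group_fairness_report(y_true, y_pred, group):
    """Per-group rates underlying common fairness metrics.

    y_true and y_pred are binary arrays; group holds group labels.
    """
    df = pd.DataFrame({"y": y_true, "p": y_pred, "g": group})
    rows = {}
    for g, sub in df.groupby("g"):
        tp = ((sub.p == 1) & (sub.y == 1)).sum()
        fp = ((sub.p == 1) & (sub.y == 0)).sum()
        fn = ((sub.p == 0) & (sub.y == 1)).sum()
        tn = ((sub.p == 0) & (sub.y == 0)).sum()
        rows[g] = {
            "selection_rate": sub.p.mean(),                      # demographic parity
            "tpr": tp / (tp + fn) if (tp + fn) else np.nan,      # equalized odds (part 1)
            "fpr": fp / (fp + tn) if (fp + tn) else np.nan,      # equalized odds (part 2)
            "precision": tp / (tp + fp) if (tp + fp) else np.nan # predictive parity
        }
    return pd.DataFrame(rows).T

report = group_fairness_report(
    y_true=np.array([1, 0, 1, 1, 0, 1, 0, 0]),
    y_pred=np.array([1, 0, 1, 0, 0, 0, 1, 0]),
    group=np.array(["A", "A", "A", "A", "B", "B", "B", "B"]),
)
# Large gaps between rows on any column indicate a potential problem on the
# corresponding metric.
print(report)
```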

Subgroup analysis

Break down model performance by relevant subgroups (not just binary categories). Intersectional analysis, examining performance across combinations of characteristics (for example, race and gender together), can reveal disparities that single-axis analysis misses. This was the approach used in the Gender Shades study.

**Limitations:** As the number of subgroups increases, sample sizes per subgroup decrease, making statistical analysis less reliable.
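
A minimal illustration of the difference between single-axis and intersectional breakdowns, using a toy pandas table; the attribute names and the 30-example cutoff for "too small to interpret" are assumptions for the example.

```python
import pandas as pd

# Hypothetical predictions with two demographic attributes; names are illustrative.
df = pd.DataFrame({
    "race":    ["A", "A", "B", "B", "A", "B", "A", "B"],
    "gender":  ["F", "M", "F", "M", "F", "M", "M", "F"],
    "correct": [1,   1,   0,   1,   1,   1,   1,   0],
})

# Single-axis accuracy can look acceptable even when an intersection does not.
print(df.groupby("race")["correct"].mean())
print(df.groupby("gender")["correct"].mean())

# Intersectional breakdown: accuracy and sample size per (race, gender) cell.
by_cell = df.groupby(["race", "gender"])["correct"].agg(["mean", "count"])
print(by_cell)

# Small cells (here, fewer than 30 examples) are statistically unreliable and
# should be flagged rather than over-interpreted.
print(by_cell[by_cell["count"] < 30])
```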

Red teaming for bias

Conduct structured testing where diverse teams probe the system for biased behavior. Include people from affected communities in the testing process. Test with inputs that are designed to surface differential treatment.

**Limitations:** Red teaming is limited by the perspectives and creativity of the testers. It is a complement to quantitative analysis, not a replacement.

Mitigation strategies

Data auditing

Before training, audit the dataset for representation, label quality, and historical patterns that may encode bias. Document the demographics of the data, known gaps, and any corrections applied. Remove or rebalance data that encodes discriminatory patterns.

**Limitations:** Historical bias is often embedded in the structure of the data, not just its composition. Correcting data is necessary but not sufficient.
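
A small sketch of the representation and label-rate checks described above, on a toy table; the column names ("group", "label") are illustrative.

```python
import pandas as pd

# Hypothetical training table; in practice this would be the real dataset.
train = pd.DataFrame({
    "group": ["A"] * 6 + ["B"] * 2,
    "label": [1, 1, 0, 1, 1, 0, 0, 0],
})

# 1. Representation: share of examples per group.
print(train["group"].value_counts(normalize=True))

# 2. Label balance within each group: large gaps in positive-label rates may
#    encode historical bias rather than real differences in the target.
label_rates = train.groupby("group")["label"].mean()
print(label_rates)
print("Largest gap in positive-label rate:", label_rates.max() - label_rates.min())

# 3. Record these figures, known gaps, and any rebalancing applied in the
#    dataset documentation so reviewers can see what was and was not corrected.
```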

Algorithmic fairness constraints

Apply fairness constraints during model training. Techniques include adversarial debiasing (training the model so that an adversary cannot recover group membership from its predictions or internal representations), reweighting (adjusting sample weights to equalize representation), and post-processing (adjusting decision thresholds per group to equalize outcomes).

**Limitations:** Fairness constraints often involve accuracy trade-offs. The right balance depends on the deployment context and cannot be determined by engineers alone.
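
As one concrete example of a pre-processing intervention, the sketch below implements reweighting in the style of Kamiran and Calders: each (group, label) cell is weighted so that group and label are statistically independent in the weighted sample, and the weights are passed to any estimator that accepts `sample_weight`. The synthetic data and the choice of scikit-learn's LogisticRegression are illustrative, not a prescription.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic training data with labels correlated with group membership,
# mimicking historical bias. All names and parameters are illustrative.
rng = np.random.default_rng(0)
n = 1000
group = rng.choice(["A", "B"], size=n, p=[0.7, 0.3])
x = rng.normal(size=(n, 3))
y = (x[:, 0] + (group == "A") * 0.8 + rng.normal(scale=0.5, size=n) > 0.5).astype(int)

df = pd.DataFrame({"group": group, "y": y})

# Reweighting: w(g, y) = P(g) * P(y) / P(g, y), so that group and label are
# independent in the weighted sample.
p_g = df["group"].value_counts(normalize=True)
p_y = df["y"].value_counts(normalize=True)
p_gy = df.groupby(["group", "y"]).size() / len(df)
weights = df.apply(
    lambda r: p_g[r["group"]] * p_y[r["y"]] / p_gy[(r["group"], r["y"])],
    axis=1,
)

# Any estimator that accepts sample_weight can use these weights directly.
model = LogisticRegression().fit(x, y, sample_weight=weights)
```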

Diverse evaluation

Test models on evaluation datasets that represent all affected populations. Include intersectional subgroups. Report disaggregated results, not just overall accuracy. If performance varies significantly across groups, the model is not ready for deployment.

**Limitations:** Representative evaluation datasets are difficult and expensive to create. Existing benchmarks often have the same representation gaps as training data.
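
One way to make "report disaggregated results" operational is to treat a large per-group gap as a release blocker. The sketch below compares per-group recall and fails if the gap exceeds a threshold; the arrays, the choice of recall, and the 0.05 threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import recall_score

# Hypothetical held-out evaluation set; values are illustrative.
y_true = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 0])
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

# Disaggregated recall (true positive rate) per group, not just overall.
per_group = {
    g: recall_score(y_true[group == g], y_pred[group == g])
    for g in np.unique(group)
}
gap = max(per_group.values()) - min(per_group.values())

print("Overall recall:", recall_score(y_true, y_pred))
print("Per-group recall:", per_group)

# Treat a large gap as a release blocker, not a footnote.
MAX_GAP = 0.05
if gap > MAX_GAP:
    raise SystemExit(f"Recall gap {gap:.2f} exceeds {MAX_GAP}; not ready to deploy.")
```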

Ongoing monitoring

Bias is not a problem that is solved once. Monitor AI system outputs in production for disparate impact, drifting performance across groups, and new patterns that emerge as the user population or operating context changes.

**Limitations:** Monitoring requires continuous access to outcome data and demographic information. Feedback loops between the AI system and the population it affects can create new biases over time.
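
A minimal monitoring sketch: recompute the impact ratio over calendar-month windows of the production decision log and flag months that fall below the same 0.8 threshold used before deployment. The log columns and toy data are illustrative; a real pipeline would also track per-group accuracy and calibration as outcome labels arrive.

```python
import pandas as pd

# Hypothetical production log of decisions; column names are illustrative.
log = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2026-01-03", "2026-01-10", "2026-01-17", "2026-01-24",
        "2026-02-02", "2026-02-09", "2026-02-16", "2026-02-23",
    ]),
    "group":    ["A", "B", "A", "B", "A", "B", "A", "B"],
    "selected": [1,   1,   1,   0,   1,   0,   1,   0],
})

# Selection rate per group per calendar month.
monthly = (
    log.set_index("timestamp")
       .groupby([pd.Grouper(freq="MS"), "group"])["selected"]
       .mean()
       .unstack("group")
)

# Impact ratio per month (lowest group rate / highest group rate); alert when
# it drops below the pre-deployment 0.8 threshold.
impact_ratio = monthly.min(axis=1) / monthly.max(axis=1)
print(monthly)
print(impact_ratio)
print("Months needing review:", list(impact_ratio[impact_ratio < 0.8].index))
```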

Impact assessments

Conduct formal impact assessments before deployment and at regular intervals. Document the purpose of the AI system, the populations affected, the potential for harm, the mitigations in place, and the residual risk. The EU AI Act requires these for high-risk systems.

**Limitations:** Impact assessments are only as good as the people and processes behind them. A cursory assessment that checks boxes without genuine analysis provides false assurance.

Regulatory requirements

Bias in AI systems is increasingly subject to legal and regulatory requirements:

| Regulation | Jurisdiction | Key Requirements |
| --- | --- | --- |
| EU AI Act (Regulation 2024/1689) | European Union | High-risk AI systems must be tested for bias using appropriate metrics, ensure training data is representative, and implement ongoing monitoring. Requires conformity assessment before deployment. |
| NYC Local Law 144 | New York City | Automated employment decision tools must undergo an annual independent bias audit. Results must be published. Candidates must be notified that an automated tool is being used. |
| EEOC Guidance on AI in Employment | United States | Title VII applies to AI-driven employment decisions. Employers are liable for disparate impact caused by AI tools, including vendor-provided tools. |
| UK Equality Act 2010 (applied to AI) | United Kingdom | Indirect discrimination provisions apply to AI-assisted decisions. The burden of justification falls on the deploying organization. |

Organizations deploying AI systems that affect people should assume they are subject to anti-discrimination law, even if no AI-specific regulation applies in their jurisdiction. General civil rights and anti-discrimination statutes apply to AI-driven decisions in most countries.

Organizational response framework

| Action | When | Who |
| --- | --- | --- |
| Data audit for representation and historical bias | Before model training or vendor selection | Data team, fairness reviewers |
| Fairness metric selection | During system design (before development starts) | Product team, legal, affected community representatives |
| Disparate impact analysis | Before deployment and quarterly thereafter | Data science, compliance |
| Independent bias audit | Annually (required by NYC LL144 for employment tools) | External auditor |
| Impact assessment | Before deployment, after significant changes, and annually | Cross-functional team including legal and ethics |
| User notification | At all times when AI influences decisions about people | Product team, legal |

Decision checklist

Before deploying an AI system that makes or influences decisions about people, confirm:

  • [ ] Training data has been audited for representation gaps and historical bias
  • [ ] Fairness metrics have been selected with input from affected stakeholders, not just engineers
  • [ ] Disparate impact analysis shows acceptable results across all relevant subgroups
  • [ ] Intersectional analysis has been conducted (not just single-axis)
  • [ ] The system has been tested by diverse testers, including people from affected populations
  • [ ] Regulatory requirements (EU AI Act, NYC LL144, EEOC guidance) have been assessed and addressed
  • [ ] An ongoing monitoring plan is in place to detect emerging bias in production
  • [ ] An impact assessment documents purpose, risks, mitigations, and residual risk
  • [ ] Users and affected individuals are notified that AI is involved in the decision process

Key takeaways

  1. Bias enters AI systems at every stage of the lifecycle, not just through training data. Problem formulation, feature selection, model design, evaluation, and deployment all introduce or amplify bias.
  2. No single fairness metric captures all forms of unfairness. The choice of metric reflects values and priorities that should be decided by stakeholders, not engineers alone.
  3. Regulatory requirements are expanding. Organizations that do not proactively address bias face legal liability under existing anti-discrimination law, and increasingly under AI-specific regulation.
  4. Bias mitigation involves trade-offs. Improving fairness for one group or metric may reduce accuracy or worsen fairness on another metric. These trade-offs must be documented and justified.
  5. Testing for bias is not a one-time event. Populations change, data drifts, and feedback loops create new patterns. Ongoing monitoring is not optional.

