What is Bias in Machine Learning? Real-World Examples That Show the Impact of AI Bias

Omar Trejo

Senior Data Scientist

As artificial intelligence (AI) becomes an increasingly common part of our everyday lives, understanding the systems behind this technology, as well as their failings, becomes equally important. It’s simply not acceptable to write AI off as a foolproof black box that outputs sage advice. In reality, AI can be as flawed as its creators, leading to negative outcomes in the real world for real people. Because of this, understanding and mitigating bias in machine learning (ML) is a responsibility the industry must take seriously.

Bias is a complex topic that requires a deep, multidisciplinary discussion. In this article, I’ll share some real-world cases where machine learning bias has had negative impacts, then define what bias really is, what causes it, and how to address it.

Table Of Contents

Real-World Examples of Bias in Machine Learning and Deep Learning
The Meaning of “Bias” and Different Types
Can ML Bias Be a Good Thing?
Navigating the Complexity of Bias in Machine Learning: Trade-offs, Ethics, and Regulation
Final Thoughts

At a high level, it’s important to remember that machine learning models, and computers in general, do not introduce bias by themselves. These machines are merely a reflection of what we as humans teach them. ML models use objective statistical techniques, and if they are somehow biased, it’s because the underlying data is already biased in at least one of many ways. Understanding and addressing these causes is necessary to ensure effective yet equitable use of the technology.


Real-World Examples of Bias in Machine Learning and Deep Learning

The COMPAS System

In the criminal justice system, there is a desire to predict whether someone who has done something unlawful is likely to offend again. Taking it further, it would be valuable to accurately classify these people on a scale from low to high risk to assist in decision making. Predicting recidivism is an important challenge for society, but it is also inherently very difficult to do accurately. This is especially so in the context of machine learning, given the many unobservable causes and contributing factors that cannot be neatly fed into a machine learning model.

This problem has been the subject of many sociological and psychological research studies and is the focus of the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) software. COMPAS is used in various states in the US to predict whether a person is likely to offend again, and its predictions are taken into account by judges when deciding sentences. It is also a commonly referenced example of a biased ML model.

In 2014, an 18-year-old African American girl was arrested for the theft of a bicycle and charged with burglary over property worth $80. In a separate case, a 41-year-old Caucasian man was picked up for shoplifting tools worth $86. The man was a repeat offender who had previously been convicted of many thefts and armed robberies. The girl had committed some minor offences when she was younger. Yet according to COMPAS, the girl was high risk and the man was low risk. Two years later, the girl had not been charged with any new crimes, while the man was serving 8 years in prison for another offence.
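One way to make a disparity like this measurable is to compare how often each group is wrongly flagged as high risk, that is, flagged despite not going on to reoffend. The sketch below is a minimal illustration in Python and is not the COMPAS method or a reproduction of any published analysis. The records are invented toy data, and the names `false_positive_rate_by_group` and `toy_records` are hypothetical.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return {group: rate}, where rate is the share of people who did NOT
    reoffend but were still predicted to be high risk."""
    wrongly_flagged = defaultdict(int)   # did not reoffend, predicted high risk
    did_not_reoffend = defaultdict(int)  # everyone who did not reoffend
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            did_not_reoffend[group] += 1
            if predicted_high_risk:
                wrongly_flagged[group] += 1
    # Groups with no non-reoffenders are omitted (the rate is undefined there).
    return {g: wrongly_flagged[g] / n for g, n in did_not_reoffend.items()}


# Invented toy data for illustration only:
# (group, predicted_high_risk, reoffended)
toy_records = [
    ("A", True, False),
    ("A", True, False),
    ("A", False, False),
    ("A", True, True),
    ("B", False, False),
    ("B", False, False),
    ("B", True, False),
    ("B", False, True),
]

rates = false_positive_rate_by_group(toy_records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: false positive rate = {rate:.2f}")
```

A large gap between the per-group rates would suggest that the model's mistakes fall more heavily on one group than another, even when its overall accuracy looks similar across groups.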

Originally published on May 4, 2020. Last updated on Aug 28, 2024.

Key Takeaways

Why is AI bias unethical?

AI bias is unethical because it can violate individuals' rights to meaningful explanations in automated decision-making, perpetuate human prejudices, and undermine fairness and trust in AI systems. Addressing unwanted bias and upholding fairness requires a thoughtful focus on data, diverse teams, and empathy, as both an ethical imperative and a legal responsibility.

What are some famous examples of AI bias?

There are a few famous examples of AI bias. One is the COMPAS system, a tool used in criminal justice that has been found to unfairly assess African American defendants, potentially leading to unjust sentencing. Google Translate has also faced criticism for perpetuating gender stereotypes in translations, reflecting societal biases present in its training data. Additionally, Google Photos has been known to mislabel photos of African Americans, highlighting racial bias in facial recognition algorithms.
