What Is Bias in Machine Learning?

Omar Trejo
Senior Data Scientist

As artificial intelligence, or AI, increasingly becomes a part of our everyday lives, understanding the systems behind this technology, as well as their failings, becomes equally important. It’s simply not acceptable to write AI off as a foolproof black box that outputs sage advice. In reality, AI can be as flawed as its creators, leading to negative outcomes in the real world for real people. Because of this, understanding and mitigating bias in machine learning (ML) is a responsibility the industry must take seriously.

What Is Bias in Machine Learning?

Bias is a complex topic that requires a deep, multidisciplinary discussion. In this article, I’ll share some real-world cases where Machine Learning bias has had negative impacts, before defining what bias really is, its causes, and ways to address it.

At a high level, it’s important to remember that Machine Learning models, and computers in general, do not introduce bias by themselves. These machines are merely a reflection of what we as humans teach them. ML models use objective statistical techniques, and if they are somehow biased it’s because the underlying data is already biased in at least one of many ways. Understanding and addressing the causes of this are necessary to ensure effective yet equitable use of the technology.

Real-World Examples

The COMPAS System

In the criminal justice system, there is a desire to predict if someone who has done something unlawful is likely to offend again. Taking it further, it would be valuable to be able to accurately classify these people on a scale of low to high risk going forward to assist in decision making. Predicting recidivism is an important challenge for society, but it is also inherently very difficult to do in an accurate way. This is especially so in the context of machine learning, given many unobservable causes and contributing factors that cannot be neatly fed into a Machine Learning model.

This problem has been the subject of many sociological and psychological research studies and is the focus of the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) software. COMPAS is used in various states in the US to predict whether a person is likely to offend again, and its predictions are taken into account by judges when deciding sentences. It also is a commonly referenced example of a biased ML model.

An 18-year-old African American girl was arrested in 2014 for the theft of a bicycle. She was charged with burglary amounting to $80. More recently, a 41-year-old Caucasian man was picked up for shoplifting tools worth $86. The man was a repeat offender and had previously been convicted of many thefts and armed robberies. The girl had committed some minor offences when she was younger. According to COMPAS, the girl was high risk and the man was low risk. Two years later, the girl had not been charged with any new crimes, while the man was serving 8 years in prison for another offence.

An example of bias in the COMPAS system comparing low vs high risk thefts

Current versions may be different, but while it was being used by states in the US, the model produced twice as many false positives for African Americans as for Caucasians, meaning that it was much more likely to wrongly label an African American as “high” risk than a Caucasian person.

The bias was obvious when results were seen from a specific angle, but removing such biases from these models was not deemed important, explicitly or implicitly. The opinion of many is that these models could have been “improved” to reduce unfair incarceration of African Americans, rather than exacerbating it. We will come back to this COMPAS system a couple of times during this article, given that it illustrates the real-world impact and complexities of bias in machine learning.

Google Translate

Examples of bias with more subtle implications can often be found in Natural Language Processing (NLP). Understanding language is very difficult for computers because of the nuance and context involved, and automatically translating between languages is even more of a challenge. Needless to say, it’s one of the hardest problems currently being tackled by AI.

Google Translate is a convenient tool that can be used to translate between languages of very different roots, and it works well enough most of the time. However, as with any ML tool, examples can be found with which the models perform poorly and exhibit bias. In the case of Google Translate, its current online version can be used to translate the sentence “She is a programmer. He is a nurse.” from English to Turkish and then back into English.

Screenshot of Bias in Google Translate from English to Turkish

The translation produces “O bir programcı. O bir hemşire.” If you then translate that back from Turkish into English, what you get is “He’s a programmer. She is a nurse,” which of course is not what we started with, and exhibits the presence of bias in the model.

Screenshot of Bias in Google Translate from Turkish to English

The bias is happening because Turkish uses a gender-neutral pronoun “o,” and in English, it gets translated into a gender-dependent pronoun, either “she” or “he.”

How will the models decide what gender-dependent pronoun to use when there’s not enough information in the text being translated?

One potential approach would be to use independent information that will allow the translation to be correct most of the time, even if incorrect in some cases. For example, the models could reference the following results from the annual StackOverflow survey to decide what gender-dependent pronoun to use.

Screenshot of the results of a survey to determine Stackoverflow users gender

As can be seen, if the models relied on such results, they would conclude that most of the time a programmer is male. If we looked at similar results for nurses, it would probably turn out that most of the time nurses are female. Note that I’m not claiming that the Google Translate models use StackOverflow survey results; I’m just trying to show how these models can become biased due to their reliance on statistics. In reality, Google Translate is probably using information from countless texts collected in digital records over decades, and the bias is being learned from our common language patterns.
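
To make the mechanism concrete, here is a minimal sketch in Python of a "translator" that resolves a gender-neutral pronoun by majority vote. Everything in it is an assumption made for illustration: the counts are invented, the function names are hypothetical, and it does not describe how Google Translate actually works.

```python
from collections import Counter

# Hypothetical co-occurrence counts of (occupation, pronoun) pairs in a toy
# training corpus. These numbers are invented for illustration only.
TOY_COUNTS = {
    "programmer": Counter({"he": 90, "she": 10}),
    "nurse": Counter({"he": 10, "she": 90}),
}


def pick_pronoun(occupation: str) -> str:
    """Return the pronoun seen most often with the occupation in the toy corpus."""
    return TOY_COUNTS[occupation].most_common(1)[0][0]


def translate_back(occupations):
    """Resolve the gender-neutral pronoun for each occupation by majority vote."""
    return [f"{pick_pronoun(job).capitalize()} is a {job}." for job in occupations]


# The original sentence was "She is a programmer. He is a nurse."
print(translate_back(["programmer", "nurse"]))
```

The rule is right most of the time on the toy corpus, yet it turns the original sentence into “He is a programmer. She is a nurse.” every single time, because the majority pattern is the only information it has.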

This is why I mentioned at the beginning of this section that NLP bias is more subtle. NLP models, especially those trained on digital records from online sources (e.g. Twitter, Facebook, Wikipedia, etc.), learn the patterns we exhibit when speaking and writing, and we would be naive to think that what we post online is not being used to train NLP models.

The way we use language today influences how ML will work for us in the future. This should make us stop and think about the implications of the words we use and how we use them, and how that will impact our future when these models are responsible for more critical decisions.

An example of how quickly NLP models can become problematic when trained with unfiltered data is that of Microsoft’s Tay Twitter bot, an online ML system created to interact with Twitter users in real time. Less than a day after its release, the bot went from publishing friendly tweets to very scary and downright offensive ones. The bot was quickly turned off to avoid further brand damage. This reminds me of the following quote on the importance of language:

Google Photos

Another example where removing bias from Machine Learning models is particularly hard is in image analysis, commonly referred to as Computer Vision (CV). Just as with NLP, CV is a very complex area where many things can go wrong, and where correctly identifying objects in a consistent fashion is very difficult. CV models are artifacts trained to recognize objects in pictures or videos. Significant improvements have been achieved using Deep Learning (DL), and sometimes these models can even identify objects more accurately than humans. However, these models are imperfect and still make mistakes that sometimes are inoffensive, and other times can significantly offend people.

There is a well-documented instance of an offensive categorization happening when Google Photos incorrectly identified a picture of African Americans as “gorillas” in 2015. The tweet is no longer out there, but prior to being taken down, it led to a lot of discussion in the CV community:

Screenshot of Bias in Google Photos

A common source of these biases is known as under-representation, which means that there are examples for which not enough data is being used to train the models. For example, if you search for wedding pictures online, you’ll probably find pictures of women in long white dresses and men in black suits, and that comes from the fact that there are many more available examples of those types of weddings than of other types. If you invert the problem, you’ll find that if you use a CV model to identify what’s in these pictures, the accuracy of the results will depend on the tropes present in the sample data.
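
A simple first check for under-representation is to count how many training examples each category has. The sketch below is a minimal illustration in Python; the `underrepresented` helper, the 10% threshold, and the toy labels are all assumptions made for this example and do not come from any particular library or dataset.

```python
from collections import Counter


def underrepresented(labels, min_share=0.1):
    """Return the categories whose share of the data falls below min_share."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        category: count / total
        for category, count in counts.items()
        if count / total < min_share
    }


# Toy, made-up training labels: one wedding style dominates the sample.
toy_labels = ["style_a"] * 95 + ["style_b"] * 5

flagged = underrepresented(toy_labels)
print(flagged)
```

A category that shows up in the result is one the model has seen too rarely to learn well. The right threshold depends on the problem, so the 10% used here should be treated as a placeholder.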

One of Google’s teams identified that problem and produced the example shown below.

Bias in Machine Learning in Google Photos

Since then, Google has taken a much more proactive and public stance in favor of producing ML that is socially conscious and respects people in ways that were not considered before. However, there’s still a long way to go on this front.

Bias is a Complex Topic

Bias Has Many Meanings

There are many definitions of bias, so it made sense to start this article with real-world examples of bias in the sense discussed here. Now that we have seen some examples, we need to define the term to enable a meaningful discussion. Here are some informal definitions from disciplines that are different from, but related to, Machine Learning, at different levels of technicality and abstraction.

  • “Bias” in data: when a sample is not representative of the underlying population it represents. When this happens, your statistical derivations become inaccurate.
  • “Bias” in statistics: the difference between an estimator’s expected value and the true value being estimated. When this happens, even if your estimates are precise and repeatable, they are systematically off.
  • “Bias” in sociology: tendency, known or unknown, to prefer one factor over another, preventing objectivity. When this happens, something is influencing decisions or observations in a way that is deemed undesirable.
  • “Bias” in neural networks: a learnable offset added to a neuron’s weighted inputs, which shifts its output independently of those inputs. Despite the shared name, it is unrelated to the bias in the bias-variance trade-off, which uses the statistical definition.
  • “Bias” as most people understand it: when past experiences or erroneous assumptions affect our perception in a way that negatively affects other people. When this happens, we often say that something is unfair.

As is seen above, “bias” is a loaded term that means very different things in different contexts and disciplines. To be clear, in this article we’re referring to the last definition of the term: “Bias” as most people understand it. However, this definition is somewhat vague and requires further exploration.

For example, what does “erroneous assumptions” mean? What does “negatively affect other people” mean? Even though they imply very complex concepts and discussions, as humans we intuitively understand what those phrases mean, or can fill in the blanks when considering specific situations. However, we need to be more precise since Machine Learning requires specific and explicit rules, and that’s where things get complicated.

Bias Can Be Good

Let me start by stating the obvious: qualifying something as being “good” or “bad” is always relative to many complexities around the context of the qualification. For example, killing someone is considered “bad”, but killing an intruder in your house to protect your family is considered “good”. The problems being tackled by ML are much more nuanced than this, which is what makes the challenges of bias so difficult to handle.

Following that same idea, biases can be “good” or “bad” depending on the context, and sometimes there’s no definite answer that can be agreed on by two people, let alone entire societies. “Bias” is not inherently “bad,” even if the term is mostly used with negative connotations in general, and particularly within this article.

If it’s hard to think of “biases” as being neutral (i.e. not necessarily “good” or “bad”), you can refer to them as “rules of thumb,” “shortcuts,” or “heuristics.”

“Stealing is bad” works as a general rule (bias) that can be useful to quickly classify the likely morality of an action when no further detail is present, but that isn’t where the analysis of a scenario should end. Biases often help us make quick decisions that would otherwise be impractical to make or help us compensate for lack of information, but problems arise when we do not contextualize them appropriately or don’t update them as we receive more information.

Bias Implies Complex Trade-Offs

In Thinking, Fast and Slow, Kahneman describes our brains as being composed of two systems:

  1. The first system is quick to judgment, much like the “shortcuts” we described before as “biases.”
  2. The second system is slower and weighs data more carefully before making a decision.

Kahneman believes that with training and experience, we can learn to disengage our first system and engage more proactively with the second one – equivalent to building a Machine Learning model that’s accurate through meaningful analysis, instead of it relying on bias as a broad shortcut to an answer that’s likely to be statistically (but not categorically) accurate.

With respect to Machine Learning, a bias may simply be a correlation, fair or unfair, that leads to a certain kind of classification. This process is inherently neutral until given context and evaluated in terms of fairness. Bias, in a negative sense, is a requirement for something to be “unfair” – but there is no standard definition for “fairness.” Societies differ on what is “fair.” Even people within societies differ on what they regard as fair. “Fairness” in ML represents both an opportunity and a challenge.

Identifying appropriate “fairness criteria” for a system requires accounting for human experiences and perceptions, as well as cultural, social, historical, political, legal, and ethical considerations. Claims about bias and fairness are often about outcomes differing between groups, and the question of which are the relevant groups to consider is fundamentally a practical and moral one. Furthermore, at what level of granularity should groups be defined, and how should the boundaries between these groups be decided? When is it fair to define a group at all versus a better factoring of individual differences?

Measuring Bias in Machine Learning

“You can’t change what you don’t measure.”

Imagine we have an urn that we can see contains 90 black balls and 10 white balls, and we’re told that we will win a prize if we correctly guess the color of the next ball that will be randomly taken out of the urn. What color would you pick? You would probably pick “black” as it’s more likely to be the case. Now, what if we go back to the COMPAS example and make similar assumptions? For example, let’s assume that historically 90% of repeat offenders are African American and 10% are Caucasian, and that the prize society wins by identifying repeat offenders is lower crime rates. Who would you pick as someone who is likely to be a repeat offender? What if the percentages are 60% and 40% respectively? Well, of course it shouldn’t be color-based! Or should it? Wheelan puts it nicely in his book “Naked Statistics”:

Of course, real-world problems are never simple and require precise measurement to evaluate. At a minimum, for classification problems, it’s important to pay attention to the overall accuracy, as well as rates of true and false positives and negatives. To measure Machine Learning bias, you will want to track the performance of various accuracy metrics for different groups in your data at various levels of granularity, as well as cross-validate them across different sets of randomly selected features and observations. This is only a starting point for ensuring the accuracy and equity of a Machine Learning model; a holistic approach with meticulous analysis is required. There is much debate in the community over how exactly to achieve this, and the company responsible for COMPAS made counter-arguments defending its methodology on statistical grounds.
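
As a rough sketch of what tracking metrics per group can look like, the Python snippet below computes accuracy, false positive rate, and false negative rate separately for each group. The `group_rates` function and the tiny dataset are assumptions made up for this illustration; they are not COMPAS data and not the methodology of any specific tool.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group accuracy, false positive rate, and false negative rate."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if pred == 1:
            key = "tp" if truth == 1 else "fp"
        else:
            key = "fn" if truth == 1 else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        total = sum(c.values())
        negatives = c["fp"] + c["tn"]  # actual negatives in this group
        positives = c["tp"] + c["fn"]  # actual positives in this group
        rates[group] = {
            "accuracy": (c["tp"] + c["tn"]) / total,
            "fpr": c["fp"] / negatives if negatives else float("nan"),
            "fnr": c["fn"] / positives if positives else float("nan"),
        }
    return rates


# Toy, made-up data: 1 means "high risk" (prediction) or "reoffended" (truth).
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
for group, metrics in sorted(rates.items()):
    print(group, metrics)
```

In this toy data both groups have the same accuracy (0.75), but all of group A’s errors are false positives and all of group B’s are false negatives. A single aggregate accuracy number would hide that difference entirely.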

This introduces an important question. How do we decide which measure of fairness is appropriate? It will depend on the expertise of the people involved in building these systems, but in general, answering the following two questions should help inform the decision of which metrics should be used for a given ML problem:

  1. What aspects of our society do we wish to be ignored by Machine Learning models?
  2. What biases in our society do we wish to see corrected or changed?

Finally, keep in mind that metrics are often aggregated. However, in some ML problems, even a single classification can be quite harmful, such as a false positive identifying an individual as a threat. Therefore, it’s not enough to control for those metrics in aggregate; it’s also important to identify, case by case, instances where incorrect biases can be harmful.

Regulatory Efforts

Bias in Machine Learning models has been recognized as a very important challenge to address, which has led to regulatory involvement. For example, in the banking and financial industry in the United States, the Equal Credit Opportunity Act for fair lending states that institutions cannot discriminate based on race, sex, age, national origin, or marital status, or proxies of these concepts. In the United States, postal codes are highly correlated with race, and therefore cannot be used to train Machine Learning models that will be used to decide whether to give credit to a person.
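
One rough way to spot a potential proxy is to ask how well a feature, on its own, predicts the protected attribute. The sketch below compares two guessing strategies on invented data; the `proxy_strength` function, the placeholder postal codes, and the group labels are assumptions for illustration only, and this is in no way a legal compliance test.

```python
from collections import Counter, defaultdict


def proxy_strength(feature_values, protected_values):
    """Compare two ways of guessing the protected attribute.

    Returns (baseline, with_feature):
    - baseline: accuracy of always guessing the most common protected value.
    - with_feature: accuracy of guessing the most common protected value
      within each distinct feature value.
    """
    n = len(feature_values)
    baseline = Counter(protected_values).most_common(1)[0][1] / n

    by_feature = defaultdict(Counter)
    for feature, protected in zip(feature_values, protected_values):
        by_feature[feature][protected] += 1
    hits = sum(c.most_common(1)[0][1] for c in by_feature.values())
    return baseline, hits / n


# Toy, made-up data: two placeholder postal codes and two placeholder groups.
postal_codes = ["zip_1"] * 5 + ["zip_2"] * 5
group_labels = ["x", "x", "x", "x", "y", "y", "y", "y", "y", "x"]

baseline, with_feature = proxy_strength(postal_codes, group_labels)
print(baseline, with_feature)
```

When the second number is much higher than the first, the feature carries information about the protected attribute, and a model trained on it can end up discriminating even though the protected attribute itself was never an input.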

An international example can be seen in the AI Principles published by the Organisation for Economic Co-operation and Development (OECD), which were adopted by forty-two countries and focus not only on fairness but also on the privacy aspects of Machine Learning models, another very important topic not touched on in this article. One of these principles directly refers to ML fairness and the necessity of introducing humans into the loop to keep the use of these ML models ethical.

The “Recommendation of the Council on Artificial Intelligence” is another document published by the OECD, which states that models should be transparent, explainable, accountable, and robust, all of which are concepts we’ve touched on in this article with respect to Machine Learning bias and fairness. A third example, often discussed in connection with data-related problems at companies, is Europe’s General Data Protection Regulation (GDPR), which states the following:

It’s a very positive sign that regulatory bodies are taking action when it comes to ensuring ML models are fair. It’s a socially relevant topic that would probably not be addressed if left to companies and ML professionals alone, since there’s no immediate commercial incentive for them to do so, especially when compared against the high cost of adequately dealing with Machine Learning bias.

Bias and Malicious Actors

Consider the case of the previously mentioned Tay Twitter bot. Microsoft argued that one of the reasons that Tay’s speech became so offensive was because of trolls intentionally sending it malicious text, which then served to train the bot. Failure by its developers to account for this led to the termination of the project, and likely many lessons learned for the ML community as a whole.

In scenarios where the training data or methodology (such as pitting two models against each other) involves input from other parties, considering the potential for malicious actors becomes necessary. This may seem like science fiction, but Adversarial ML and the corresponding counter-strategies are becoming a reality in the field. As AI systems become more prevalent, and as they begin to interact with each other and the public in general, there are sure to be conflicts and unexpected outcomes.


As we have seen, Machine Learning bias is a complex topic, both from a moral and technological standpoint. The negative impact of bias in real-world scenarios is clear, and as a result, the industry and regulators have been taking steps to minimize the extent to which a system can be viewed as “unfair.” Still, there is much work to be done as the field evolves.

We, as the worldwide ML community, must make sure we don’t leave these issues unaddressed, as these models are the building blocks over which more autonomous systems will be built. As AIs become more omnipresent in our daily lives, the efficacy of their design will have real consequences on our way of living. It’s up to us and to informed citizens to use our voices to drive policy and ensure a future where AI is a fair force for good in the world.


Originally published on May 4, 2020. Last updated on Aug 29, 2023.
