What is Exploratory Data Analysis? Steps & Examples

Nicolas Azevedo

Data Scientist and Machine Learning Engineer

One of the most important first steps in any data science project is to understand the dataset you’re working with. Without a proper data exploration process in place, it becomes much harder to identify critical issues or to carry out a deeper analysis of the dataset.

Table Of Contents

What Is Exploratory Data Analysis?
Types of Variables in the Exploratory Data Analysis
How to Analyze Data Using Univariate Analysis
How to Apply Multivariate Analysis
How to Use Graphical/Tabular Analysis
Checking the Existence of Outliers in the Variables
Other Tools and Techniques
Final Thoughts

What Is Exploratory Data Analysis?

Exploratory Data Analysis (EDA) in Data Science is a step in the analysis process that uses several techniques to visualize, analyze, and find patterns in the data. John Tukey, who developed the EDA method, likened it to detective work because you have to dig for clues and evidence before making any assumptions about the outcome.

A complete and solid Exploratory Data Analysis can help identify issues in your data like missing or wrong values, typos, and anomalies (outliers). In addition, you will learn about the distribution of the data, the relationship between variables, and find variables that may not affect the desired outcome.
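As a minimal sketch of the kind of checks described above, the snippet below counts missing values per column and flags potential outliers using the common 1.5 × IQR rule. The `rows` data, the column names, and the helper names `count_missing` and `iqr_outliers` are hypothetical and exist only for illustration; treating `None` and empty strings as "missing" is also an assumption that may not match your dataset.

```python
import statistics

def count_missing(rows, columns):
    """Count values that are None or an empty string, per column."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }

def iqr_outliers(values):
    """Return values outside the 1.5 * IQR fences (a common rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical records; one age and one city are missing.
rows = [
    {"age": 22, "city": "Lisbon"},
    {"age": 25, "city": "Porto"},
    {"age": None, "city": "Lisbon"},
    {"age": 28, "city": ""},
    {"age": 30, "city": "Braga"},
    {"age": 120, "city": "Porto"},
]

missing = count_missing(rows, ["age", "city"])
ages = [row["age"] for row in rows if row["age"] is not None]
outliers = iqr_outliers(ages)
```

A flagged value is only a candidate for review: whether an age of 120 is a typo or a genuine record is a question the data alone cannot answer.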

In this article, we’ll explore the principal techniques, tools, and graphs of Exploratory Data Analysis that help you understand the data better, so you can ultimately answer business questions and find insights that may surprise your stakeholders.

Types of Variables in the Exploratory Data Analysis

When you start to explore the dataset, the first thing you have to evaluate is the attributes of the data you’re working on. Understanding the type of each variable will help you in the process of choosing the proper technique for the attribute analysis.
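One rough way to take this first step is to label each attribute as numeric (quantitative) or categorical (qualitative) before choosing a technique. The sketch below is a simple heuristic, not a definitive rule; the `dataset` dictionary and the helper `classify_variable` are hypothetical names introduced for illustration.

```python
from numbers import Number

def classify_variable(values):
    """Label a column as 'quantitative' or 'qualitative' (simple heuristic)."""
    present = [v for v in values if v is not None]  # ignore missing entries
    # bool is a subclass of int in Python, so exclude it explicitly.
    if present and all(
        isinstance(v, Number) and not isinstance(v, bool) for v in present
    ):
        return "quantitative"
    return "qualitative"

# Hypothetical dataset stored as column name -> list of values.
dataset = {
    "age": [22, 25, None, 31],
    "income": [1800.0, 2400.5, 3100.0, 2750.0],
    "city": ["Lisbon", "Porto", "Lisbon", "Braga"],
    "is_customer": [True, False, True, True],
}

variable_types = {name: classify_variable(col) for name, col in dataset.items()}
```

A type check like this only looks at how values are stored. Numeric codes such as postal codes or category IDs would be labeled quantitative even though they behave as qualitative data, so the result should be reviewed by hand.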

Types of Data: Quantitative vs. Qualitative

Quantitative Data

Originally published on Jun 10, 2021. Last updated on Jul 21, 2023.

Key Takeaways

What is exploratory data analysis? Explain with an example.

How do I start analyzing data?

Univariate analysis is the simplest form of data analysis. 'Uni' means one: you analyze a single attribute at a time, using measures of central tendency to understand where its values sit in the dataset and measures of dispersion to understand how spread out they are. The first group includes the mode, median, mean, and quantiles. The second includes the variance, standard deviation, range (amplitude), interquartile range, and coefficient of variation.
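The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch for a single numeric attribute; the `ages` sample and the helper `univariate_summary` are hypothetical, and the choice of sample (rather than population) variance and of the "inclusive" quantile method are assumptions you may want to change.

```python
import statistics

def univariate_summary(values):
    """Central tendency and dispersion measures for one numeric attribute."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation
    return {
        # Central tendency and position
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "q1": q1,
        "q3": q3,
        # Dispersion
        "variance": statistics.variance(values),  # sample variance
        "std_dev": std_dev,
        "range": max(values) - min(values),       # also called amplitude
        "iqr": q3 - q1,
        "coef_variation": std_dev / mean,         # undefined when mean is 0
    }

ages = [22, 25, 25, 28, 30, 31, 35, 40, 60]  # hypothetical sample
summary = univariate_summary(ages)
```

Comparing the mean with the median is a quick first read on skew: here the single value of 60 pulls the mean above the median of 30.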

Is exploratory data analysis Qualitative or quantitative?

It can be both, depending on your dataset and the kind of research you're doing.
