What is Cohort Analysis ?

Cohort analysis is a subset of behavioural analytics that takes the data from various sources such as eCommerce platform, web application, Retail stores, etc and rather than considering all users as one group, it breaks them into related small groups. These related groups are referred as cohorts, usually share common characteristics or experiences within a defined time-span.

Cohort analysis is a tool to measure user engagement over the period of time. It helps to know whether user engagement is actually getting better over the time or it is improving with period of time because of growth only. Different organization uses cohort analysis to find the customer behaviour throughout the life cycle of every customer.

Cohort analysis is import because it helps to separate engagement ratio with growth ratio, as people can easily think growth as engagement of old customers. Generally, the retention of old customer is ignored because of the impressive growth in the sales by the new customers which result in increase of customer acquisition cost.

The platforms like ECommerce, Retail, Automobile, Banking, Insurance, Healthcare and Education they deal with large amount of data on daily basis. To analysis this large volume of data on regular basis is quite a changeling job as it needs a keen staff   to analysis it.

Cohort Matrix will help the companies to find how different group are using their products and accordingly they can work on their marking techniques, which will help them to determine when and how to connect with different groups of customers.

Types of Cohort Analysis

There are two most common types of cohort analysis:

1.      Acquisition Cohort: In acquisition cohort we create a cohort or groups based on first purchase date of the customer. Generally, we use this try of cohort to measure churn and retention rate of a customer. By using this cohort, we can measure how long a customer is using our products from the date of first purchases.

2.     Behavioural Cohort: In behavioural cohort we create a groups or cohorts based on behaviour and action of a customer. This helps us to analysis our active users in different demographics and with different behavioural patterns.

Acquisition cohorts help us to understand when an action is taking place, but behavioural cohorts are best for discovering and understanding churn rates, as they tell us why a user have taken an action.

Cohort Analysis to Improve Customer Retention

Cohort analysis contains looking at the groups of customers, over period of time, and observing how their behaviour changes with over the time. For example, let say that the number of customers who have used our services in the first month from that how many customers are again using our service in the next consecutive months. By seeing the cohort matrix, we can easily analysis whether we are able to retain our customers or not.

Cohort Analysis Using Power Bi:


Here ecommerce dataset has been taken for the analysis. Data contains multiple columns like Customer id which assign unique id to each customer, Customer Name, Market name from where the order is placed, Order Date which tell us date on which product was ordered, Order Id is assigned for each order placed and Sales done by each order.


Objective was to do:

  • Retention Rate month wise
    • Month wise retention percentage
  • Customer Behaviour
    • Count of customers per order number
    • Year wise revenue by old and new customers
    • Monthly contribution in total sales by last per and current year customer

Step 1: Importing the Ecommerce Dataset

First task in Power Bi is to import the Dataset. Here we have taken an eCommerce Sales dataset to understand the behaviour of customer in terms of sales and their retention. We need to import the dataset to Power Query Editor tool. Power Query editor will help us to clean the data if needed.

So, after importing the data to Power Query Editor our first task will be to check the data set for data cleaning. Once the data cleaning task is completed create a duplicate query (Table) of the given Sales dataset. In the duplicate query keep the column like Customer Id and Customer Name only and remove the remaining columns. After that Press Ctrl+A in the duplicate table to select the entire data and click on “Remove Rows” then “Remove Duplicate” this will delete the duplicate records from the table and we will be left with only distinct records. Next task is to rename the duplicate query (table) as “Customer”.

Once you are done with the above task then click on “Close & Apply” to save the changes and load the data into Power Bi desktop.

Step 2: Creating a Date Table

Create a date table using the minimum and maximum value of “Order Date” column as starting and ending dates respectively. To add a Date Table, we can use Calendar function, inside that use the min of “Order Date” will be staring date and max of order date will be ending date. We will get a column will all the date values between Starting and Ending date.

Steps to create a Date Table in Power Bi:

  1. Click on Modeling Tab.
  2. Then click on “New Table”.
  3. After that write the below DAX formula and click on commit button.

DAX Formula to create a new Date Table:

Step 3: Create a After Months Table

The next step is to create a After Months Table that will generate a sequential list of values which we will help us to visualize the cohort matrix by finding the outer months value.

The After Months table can have values starting from 0 to 23. This means that from current month till next 23 months this will give me the data.

DAX Formula to create a After Months Table:

Step 4: Adding a New Column in Customer Details Table

Add a new column in Customer Details table to find First Order Date for each customer. This will help us to know their churn rate.

Steps to add a new column in Customer Table:

  1. Click on Table View.
  2. Then select the Customer Table and Click on New Column.
  3. After that write the below DAX formula and click on commit button.

DAX Formula to find First Order Date for each customer:

Here we have found the End of Month (EOM) value for each First Order placed by each customer. After getting the result of EOM of First Order Date we will change the format of column to (MMMM,YYYY), this format will help us to get the result Month and Year wise.

After adding First Order Date column we need to add another column by the name “Total Number of Orders” in the Customer Table which will give the Total orders placed by each customer.

DAX Formula to find Total Numbers of orders:

Step 5: Creating a relationship between all the table

The next important task is to establish the relationship between the tables. In our case we have to create a relationship between Sales and Customer Table, also between Sales and Date Table.

Customer Id from Sales and Customer table is common, using this column we will establish a relationship between both the Tables. It will be an Many to One Relationship.

In the same way Date column from Date Table and Order Date column from Sales table has a relationship. It is also Many to One relationship.

Step 6: Creating a Total Sales Measure

Create a measure that will give Total Sales done by a customer. This measure we are going to use in our visual as well as to create another measure also.

DAX Formula to find sum of sales:

Step 7: Find Sales by new customer

Using Quick measure, we can easily write a DAX formula to find the sales done by new customers. This will help to know how much sales is done by new and old customers year by year and by that we will come to know whether we are able to retain the old customers or not. Also, what percentage of sales we are getting from New and Old Customer of that Month/Year.

Steps to add a measure using Quick Measure:

  1. Select the Sales Table then click on Modeling tab.
  2. After that click on Quick measure ribbon.
  3. Then under Filters section click on “Sales for new customers”
  4. Then we will get the flied to add the values.
  5. Under Sales Amount field add Total Sales measure.
  6. Under Customer ID field add Customer ID column.
  7. Under Date field add Date column from Date Table.
  8. Then click on ADD Button.
  9. After following the above steps, we will find that a new measure is created under Sales Table by the name “Sales for new customers”.

DAX Formula for Sales done by new customers:

Step 8: Finding Customer Retention Percentage

To find customer retention percentage over the period of time we have to create a measure that we will use in Matrix visual to see how many orders we are getting from old customers.

DAX Formula to find Customer Retention Percentage

Step 9: Adding a Matrix visual to find customer retention percentage

The next task is to add a Matrix Plot to find the customer retention rate. It will help us to know what percent of Orders are made by those customers who have made their first order in that month.

In this Matrix Plot we can find the percent of orders we are getting from the customers in next consecutive months. In this graph we can find the month value in first column which is representing the Month & Year of first order date of customer. In the next column we have value as 0 which represent current month value. All the records having 100% value in under column value 0, which is representing the orders made by customers who joined in that Month. After that we have columns with values from 1 to 23, which is representing the next 23 months data.

If we see the data row wise, we will find that what percent of orders is made by customer who made first order on that respective month. This will help us to know the repeat customer order count. This numbers are very important for all the industry to retain their customers and also they can work on their products.  

Step 10: Adding more visual to understand the customer behaviour

  • Stack Column Chart to find count of customers for each order value. Here year represent the Year of “first order date”.

Here we have added a Stacked Column Chart to find Count of customers by Total Numbers of order by First Order Date of a customer in Years. In our dataset we have the data for only two years i.e.  2016 and 2017. We can clearly observe that customer who have made their first order in year 2017 they have mostly made only 1 order, and customer who joined in year 2016 has made multiple orders also.

  • In the below chart we have use 100% Stacked Column Chart, to show the revenue  of sales done by customers who joined in year 2016 and 2017.

In the above visual we can clearly see that in year 2016 entire sales is done by the customer who joined in Year 2016 itself, which is obeys because that is the starting year that we have in our dataset. For year 2017 we can find that 69.94% sales we have got from customers who joined in 2016 and remaining we have got from year 2017.

By seeing this visual we can clearly say that company is able to retain its old customers. As 70% revenue of the total revenue in year 2017 we have received from old customers.

  • In the below chart we have use 100% Stacked Column Chart, to show the revenue of sales done per month in the year 2017 by customers who joined in year 2016 and 2017.

The above visual shows that from the month of January till September maximum sales is achieved from customers who joined in year 2016, but in the last three months majority of sales is done by customers who joined in year 2017. This visual clearly shows that company is able to retain its customer and getting a repeat order from them, which is a good sign for any organization.



By analysis the data set using Power Bi report we found that maximum revenue is generated by the customer who made the first purchase in the year 2016 which is a good sign for a company as it is able to retain its old customers. Total number of customers we are able to get is 18.01K and they have made 42.73K total orders in two years. Total revenue generated is 27.54M and gained profit of 2.61M from that. If we will calculate profit percentage from this then it will be 9.47%. So overall the company is profitable and this is possible because of sales done by old customers. Customers who joined in year 2017 generally they have made only one order but the customers who joined in year 2016 have ordered multiple times who shows that company has high retention rate and low churn rate.

Even the customers who joined in year 2017 are contributing more towards the end of the year this is possible because the number of new customers is increasing towards the last three months of year 2017. So overall the company is performing better by retaining old customers, acquiring new customer on a regular basis and gaining profit from the sales.

You can contact NIKHIL ANALYTICS for career counseling, and upskilling yourself in Data Analytics and Data Science. Nikhil Analytics counseling team is available at +91-9945339324.

You can visit our website https://www.nikhilanalytics.com for more details, and subscribe to our channel https://dataanalyticsedge.com for updates and tips.

Written By:

Mr. Saurabh Kumar is Educator & Data Scientist at Nikhil Analytics.

Be the first to comment on "What is Cohort Analysis ?"

Leave a comment

Your email address will not be published.



Subscribe for Data Analytics Edge Newsletter & Share..:-)

error: Content is protected !!