Python RFM Model for Customer Segmentation

RFM is a method for segmenting customers and analyzing customer lifetime value.

It is used in direct-to-consumer marketing, retail stores, e-commerce companies and professional service companies to gauge how strongly customers are engaged and how likely they are to stay with the company for the long term.

You’d be surprised how much revenue your returning customers generate. The RFM model could be your ally in the retention game.

What is RFM?

R -> Recency: The freshness of the customer activity, be it purchases or visits

F -> Frequency: The frequency of the customer transactions or visits

M -> Monetary: How much the customer spends, an indicator of purchasing power
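
To make the three measures concrete, here is a minimal sketch in plain Python that computes them for a handful of made-up transactions. The customer IDs, dates, amounts and the `rfm_table` helper are illustrative assumptions, not part of the dataset used later.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical transactions: (customer_id, purchase_date, amount)
transactions = [
    ("A", date(2024, 1, 5), 20.0),
    ("A", date(2024, 3, 1), 35.0),
    ("B", date(2024, 2, 10), 120.0),
    ("C", date(2023, 11, 20), 15.0),
    ("C", date(2023, 12, 2), 10.0),
    ("C", date(2024, 1, 15), 5.0),
]

def rfm_table(rows, snapshot):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last_seen = {}
    count = defaultdict(int)
    spend = defaultdict(float)
    for customer, day, amount in rows:
        # Keep the most recent purchase date per customer
        last_seen[customer] = max(last_seen.get(customer, day), day)
        count[customer] += 1
        spend[customer] += amount
    return {
        c: ((snapshot - last_seen[c]).days, count[c], spend[c])
        for c in last_seen
    }

# Snapshot is one day after the latest purchase in the data
snapshot = max(day for _, day, _ in transactions) + timedelta(days=1)
table = rfm_table(transactions, snapshot)
for customer, (recency, frequency, monetary) in sorted(table.items()):
    print(customer, recency, frequency, monetary)
```

A low recency value means the customer bought recently, while higher frequency and monetary values mean more purchases and more spend.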

What are the benefits of the RFM Model?

It helps you to identify or estimate the following:

  1. Top customers
  2. Customers contributing to churn
  3. Potential or valuable customers and customer lifetime value
  4. Customers who can be retained
  5. Likely response to customer engagement campaigns

We will be using simple Python programming to segment our customers.

These parameters can be grouped by:

  1. Percentiles or quantiles
  2. Pareto Rule — 80/20
  3. Business Acumen
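
As a rough illustration of the Pareto option, the sketch below finds the smallest group of top-spending customers that covers 80% of revenue. The revenue figures and the `pareto_head` helper are hypothetical.

```python
def pareto_head(revenue_by_customer, share=0.8):
    """Return the top customers whose combined revenue first reaches `share` of the total."""
    total = sum(revenue_by_customer.values())
    ranked = sorted(revenue_by_customer.items(), key=lambda kv: kv[1], reverse=True)
    head, running = [], 0.0
    for customer, revenue in ranked:
        if running >= share * total:
            break
        head.append(customer)
        running += revenue
    return head

# Made-up revenue per customer
revenue = {"A": 500, "B": 320, "C": 100, "D": 50, "E": 30, "F": 20}
top = pareto_head(revenue)
print(top, f"{len(top) / len(revenue):.0%} of customers")
```

On real data the split is rarely exactly 80/20, so the share is left as a parameter.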

Let’s use the percentile grouping for our approach and get on with some Python code.

Shopping data set link for reference: Customer Segmentation | Kaggle

# Import libraries
import pandas as pd
from datetime import timedelta
import matplotlib.pyplot as plt
import squarify
import seaborn as sns

# Read dataset
online = pd.read_csv('data.csv', encoding="ISO-8859-1")
online['InvoiceDate'] = pd.to_datetime(online['InvoiceDate'])

# Drop NA values from online
online = online.dropna()

# Create TotalSum column for online dataset
online['TotalSum'] = online['Quantity'] * online['UnitPrice']

# Create snapshot date
snapshot_date = online['InvoiceDate'].max() + timedelta(days=1)

# Grouping by CustomerID
data_process = online.groupby(['CustomerID']).agg({
    'InvoiceDate': lambda x: (snapshot_date - x.max()).days,
    'InvoiceNo': 'count',
    'TotalSum': 'sum'
})

# Rename the columns
data_process.rename(columns={
    'InvoiceDate': 'Recency',
    'InvoiceNo': 'Frequency',
    'TotalSum': 'MonetaryValue'
}, inplace=True)
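
The aggregation above can be tried on a small frame before running it on the full dataset. The sketch below uses made-up rows with the same column names and pandas named aggregation, which sets the final column names in one step. Counting `InvoiceNo` counts invoice lines, so `nunique` is shown alongside as an alternative in case each invoice should count once. The sample rows and the `UniqueInvoices` column are assumptions for illustration.

```python
import pandas as pd
from datetime import timedelta

# Made-up invoice lines using the same column names as the dataset
toy = pd.DataFrame({
    "CustomerID": [1, 1, 1, 2, 2],
    "InvoiceNo": ["A1", "A1", "A2", "B1", "B1"],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-01", "2011-01-01", "2011-01-10", "2011-01-05", "2011-01-05"]
    ),
    "Quantity": [2, 1, 3, 1, 4],
    "UnitPrice": [5.0, 10.0, 2.0, 20.0, 1.0],
})
toy["TotalSum"] = toy["Quantity"] * toy["UnitPrice"]
snapshot_date = toy["InvoiceDate"].max() + timedelta(days=1)

toy_rfm = toy.groupby("CustomerID").agg(
    Recency=("InvoiceDate", lambda x: (snapshot_date - x.max()).days),
    Frequency=("InvoiceNo", "count"),          # number of invoice lines
    UniqueInvoices=("InvoiceNo", "nunique"),   # number of distinct invoices
    MonetaryValue=("TotalSum", "sum"),
)
print(toy_rfm)
```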

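# Plot RFM distributions
plt.figure(figsize=(12, 10))

# Plot distribution of R
plt.subplot(3, 1, 1)
sns.histplot(data_process['Recency'], kde=True)

# Plot distribution of F
plt.subplot(3, 1, 2)
sns.histplot(data_process['Frequency'], kde=True)

# Plot distribution of M
plt.subplot(3, 1, 3)
sns.histplot(data_process['MonetaryValue'], kde=True)

# Show the plot
plt.show()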
# Calculate Recency (R) and Frequency (F) groups
# Create labels for Recency and Frequency
r_labels = range(4, 0, -1)
f_labels = range(1, 5)

# Assign these labels to 4 equal percentile groups
r_groups = pd.qcut(data_process['Recency'], q=4, labels=r_labels)

# Assign these labels to 4 equal percentile groups
f_groups = pd.qcut(data_process['Frequency'], q=4, labels=f_labels)

# Create new columns R and F
data_process = data_process.assign(R=r_groups.values, F=f_groups.values)

# Create labels for MonetaryValue
m_labels = range(1, 5)

# Assign these labels to 4 equal percentile groups
m_groups = pd.qcut(data_process['MonetaryValue'], q=4, labels=m_labels)

# Create new column M
data_process = data_process.assign(M=m_groups.values)
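
Two details of the scoring above are easy to miss. Recency labels run in reverse because fewer days since the last purchase is better, and `pd.qcut` raises an error when many customers share the same value and the quartile edges collide. The sketch below illustrates both on made-up values. Ranking with `method="first"` before cutting is one common workaround for ties and is an addition here, not something the original code does.

```python
import pandas as pd

recency = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])      # days since last purchase
frequency = pd.Series([1, 1, 1, 1, 1, 2, 3, 10])   # heavily tied counts

# Lower recency is better, so the smallest quartile gets the highest label
r_score = pd.qcut(recency, q=4, labels=[4, 3, 2, 1])

# Ranking first breaks ties so the quartile edges are unique
f_score = pd.qcut(frequency.rank(method="first"), q=4, labels=[1, 2, 3, 4])

print(r_score.tolist())
print(f_score.tolist())
```

Breaking ties by rank places customers with identical values in different groups, so whether this is acceptable depends on how the scores are used.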

# Concat RFM quartile values to create RFM Segments
def join_rfm(x):
    return str(x['R']) + str(x['F']) + str(x['M'])

data_process['RFM_Segment_Concat'] = data_process.apply(join_rfm, axis=1)
rfm = data_process

# Count num of unique segments
rfm_count_unique = rfm['RFM_Segment_Concat'].nunique()

# Calculate RFM_Score (cast the categorical quartile labels to int before summing)
rfm['RFM_Score'] = rfm[['R', 'F', 'M']].astype(int).sum(axis=1)
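
The concatenated segment and the summed score carry different amounts of detail. As a quick check, the sketch below enumerates every combination of quartile labels. It assumes four labels per dimension, as in the code above.

```python
from itertools import product

scores = range(1, 5)  # quartile labels 1 to 4

# Every possible concatenated segment, e.g. "443"
segments = {f"{r}{f}{m}" for r, f, m in product(scores, repeat=3)}

# Every possible summed score
totals = {r + f + m for r, f, m in product(scores, repeat=3)}

print(len(segments), "possible segments")
print(len(totals), "possible scores, from", min(totals), "to", max(totals))
```

The sum is easier to bucket into a few named levels, at the cost of treating combinations such as 4-1-1 and 1-1-4 as equivalent.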

# Define rfm_level function
def rfm_level(df):
    if df['RFM_Score'] >= 9:
        return 'Can\'t Loose Them'
    elif ((df['RFM_Score'] >= 8) and (df['RFM_Score'] < 9)):
        return 'Champions'
    elif ((df['RFM_Score'] >= 7
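
The listing breaks off partway through the function. A minimal sketch of how such a function could be completed is given below. It assumes each remaining segment takes the next one-point band of `RFM_Score`. The segment names come from the plot labels later in the article, but the order and thresholds from 'Loyal' downward are assumptions, and `rfm_level_from_score` is a hypothetical helper that takes the score directly rather than a row.

```python
def rfm_level_from_score(score):
    """Map a summed RFM score (3 to 12) to a segment name.

    The first two bands mirror the visible part of the original function.
    Every band from 'Loyal' down is an assumed continuation.
    """
    if score >= 9:
        return "Can't Loose Them"
    elif score >= 8:
        return "Champions"
    elif score >= 7:   # label assumed
        return "Loyal"
    elif score >= 6:   # threshold and label assumed
        return "Potential"
    elif score >= 5:   # threshold and label assumed
        return "Promising"
    elif score >= 4:   # threshold and label assumed
        return "Needs Attention"
    else:
        return "Require Activation"

for s in (12, 8, 7, 6, 5, 4, 3):
    print(s, rfm_level_from_score(s))
```

Applied to the table, this might look like `rfm['RFM_Level'] = rfm['RFM_Score'].apply(rfm_level_from_score)`, where the `RFM_Level` column name is likewise an assumption.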


Customer Segmentation RFM Plot

Customer Segmentation RFM Table

  1. Potential: customers with high potential to enter our loyal customer segments. You can act on this by giving them extra discounts or freebies, or by making them eligible for an elite customer club.
  2. Promising: customers showing promising signs in the quantity and value of their purchases, although it has been a while since they last bought something from you. You can target them based on their wishlists with limited-time discounts or bundle offers.
  3. Needs Attention: customers who made some initial purchases but have not been seen since. Was it a bad customer experience? Or product-market fit? Let’s spend some resources building our brand awareness with them.
  4. Require Activation: the poorest performers of our RFM model. They might have gone with our competitors for now and will require a different activation strategy to win them back.
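
The plotting code that follows relies on a summary table named `rfm_level_agg`, whose construction is not shown. One plausible way to build it, sketched here on made-up rows, is a `groupby` with a mean per metric plus a count. The `RFM_Level` column name and the exact aggregation are assumptions inferred from the column names assigned afterwards.

```python
import pandas as pd

# Made-up customers that already carry a segment label
toy = pd.DataFrame({
    "RFM_Level": ["Champions", "Champions", "Loyal", "Loyal", "Loyal"],
    "Recency": [5, 15, 30, 40, 50],
    "Frequency": [20, 10, 6, 4, 8],
    "MonetaryValue": [900.0, 700.0, 200.0, 300.0, 100.0],
})

# Mean of each metric per segment, plus the number of customers in it
rfm_level_agg = toy.groupby("RFM_Level").agg({
    "Recency": "mean",
    "Frequency": "mean",
    "MonetaryValue": ["mean", "count"],
}).round(1)

# Flatten the two-level column index into plain names
rfm_level_agg.columns = ["RecencyMean", "FrequencyMean", "MonetaryMean", "Count"]
print(rfm_level_agg)
```

Because `groupby` sorts segment names alphabetically, any list of plot labels has to follow the same order as the rows of this table.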

rfm_level_agg.columns = rfm_level_agg.columns.droplevel()
rfm_level_agg.columns = ['RecencyMean', 'FrequencyMean', 'MonetaryMean', 'Count']

# Create our plot and resize it.
fig = plt.gcf()
ax = fig.add_subplot()
fig.set_size_inches(16, 9)

# Draw one rectangle per segment, sized by customer count
squarify.plot(sizes=rfm_level_agg['Count'],
              label=['Can\'t Loose Them', 'Champions', 'Loyal', 'Needs Attention',
                     'Potential', 'Promising', 'Require Activation'],
              alpha=.6)
plt.title("RFM Segments", fontsize=18, fontweight="bold")
plt.show()

Customer Segmentation RFM Segments

RFM analysis reveals anomalies that point to priorities for each customer segment and helps you form appropriate value-offering strategies. You can also add extra coefficients to suit your business, products, relative seasonal index, and so on, and adjust them as needed.
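
As one illustration of such coefficients, the plain sum of the three scores can be replaced by a weighted sum. The weights below are arbitrary placeholders and the `weighted_rfm` helper is hypothetical. Suitable values depend on the business.

```python
def weighted_rfm(r, f, m, weights=(1.0, 1.0, 1.0)):
    """Combine quartile scores (1 to 4) into a single weighted score."""
    w_r, w_f, w_m = weights
    return w_r * r + w_f * f + w_m * m

# Equal weights reproduce the plain sum used for RFM_Score
plain = weighted_rfm(4, 2, 3)

# A hypothetical weighting that emphasises recency
recency_heavy = weighted_rfm(4, 2, 3, weights=(2.0, 1.0, 1.0))

print(plain, recency_heavy)
```

Changing the weights changes the range of possible scores, so any thresholds used to name segments would need to be rescaled accordingly.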