Churn Modelling for Mobile Telecommunications
What is Churn Modelling?

  • Churn represents the loss of an existing customer to a competitor
  • A prevalent problem in retail:
    • Mobile phone services
    • Home mortgage refinance
    • Credit cards
  • Churn is a problem for any provider of a subscription service or recurring purchase
    • Costs of customer acquisition and win-back can be high
    • Much cheaper to invest in customer retention
    • Difficult to recoup costs of customer acquisition unless customer is retained for a minimum length of time
  • Churn is especially important to mobile phone service providers
    • Easy for a subscriber to switch services
    • Phone number portability will remove the last important obstacle

Predicting Churn: Key to a Protective Strategy

  • Predictive modelling can assist churn management
    • By tagging customers most likely to churn
  • High risk customers should first be sorted by profitability
    • Campaign targeted to the most profitable at-risk customers
    • Typical retention campaigns include
      • Incentives such as price breaks
      • Special services available only to select customers
  • To be cost-effective, retention campaigns must be targeted to the right customers
    • Customers who would probably leave without the incentive
    • Costly to offer incentives to those who would stay regardless
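
One way to make the targeting idea above concrete is to rank high-risk customers by expected loss, that is, churn probability times customer value. The sketch below is only an illustration: the data frame, its column names (churn_prob, monthly_profit), the 0.5 risk cut-off and the helper rank_retention_targets() are all hypothetical and are not part of the telecom data set used later.

```r
# Hypothetical scored customers: churn_prob would come from a predictive model,
# monthly_profit from billing data. All values here are made up.
customers <- data.frame(
  id             = c("A", "B", "C", "D", "E"),
  churn_prob     = c(0.80, 0.10, 0.65, 0.90, 0.55),
  monthly_profit = c(40, 120, 90, 5, 60)
)

# Keep only high-risk customers, then sort them by the profit at risk.
rank_retention_targets <- function(df, risk_cutoff = 0.5) {
  at_risk <- df[df$churn_prob >= risk_cutoff, ]
  at_risk$expected_loss <- at_risk$churn_prob * at_risk$monthly_profit
  at_risk[order(-at_risk$expected_loss), ]
}

targets <- rank_retention_targets(customers)
print(targets)
```

In this toy example customer D is the most likely to churn but ranks last, because very little profit is at stake, while the low-risk customer B is not targeted at all.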


Here we have a sample telecom data set on which we will run churn modelling in R.

library(rattle)        # The weather data set and normVarNames().
library(randomForest)  # Random forests; na.roughfix() imputes missing values.
library(rpart)         # Decision trees.
library(corrgram)      # corrgram() correlation plots.
library(tidyr)         # Tidy the data set.
library(ggplot2)       # Visualize data.
library(dplyr)         # Data preparation and pipes %>%.
library(lubridate)     # Handle dates.

Loading the data directly from the web

nm <- read.csv("", skip = 4, colClasses = c("character", "NULL"),
               header = FALSE, sep = ":")[[1]]

dat  <- read.csv("", header = FALSE, col.names = c(nm, "Churn"))
nobs <- nrow(dat)


dsname <- "dat"
ds          <- get(dsname)


(vars <- names(ds))
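target <- "Churn"

# Recode the target as 0/1; as.factor() also handles Churn read as character (R >= 4.0).
ds$churn <- as.numeric(as.factor(ds$Churn)) - 1
# Drop the original column so that churn ~ . does not use the target as a predictor.
ds$Churn <- NULL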

## Split ds into train and test 
## 75% of the sample size
smp_size <- floor(0.75 * nrow(ds))

## set the seed to make your partition reproducible
set.seed(123)  # arbitrary seed value; any fixed number works
train_ind <- sample(seq_len(nrow(ds)), size = smp_size)
train <- ds[train_ind, ]
test  <- ds[-train_ind, ]
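
The split above can be wrapped in a small function so that the seed and the training fraction are explicit, and so that the churn rate in both parts can be compared afterwards. This is a minimal sketch, not part of the original script: split_train_test() is a hypothetical helper, the seed value is arbitrary, and the toy data frame only stands in for the telecom data.

```r
# Hypothetical helper: split a data frame into train and test parts.
split_train_test <- function(df, train_frac = 0.75, seed = 123) {
  set.seed(seed)                                  # fixed seed -> same split every run
  n_train   <- floor(train_frac * nrow(df))
  train_ind <- sample(seq_len(nrow(df)), size = n_train)
  list(train = df[train_ind, , drop = FALSE],
       test  = df[-train_ind, , drop = FALSE])
}

# Toy data standing in for the telecom set (values are made up).
toy   <- data.frame(x = 1:20, churn = rep(c(0, 0, 0, 1), 5))
parts <- split_train_test(toy)

nrow(parts$train)   # 15
nrow(parts$test)    # 5

# Compare churn rates to check that the two parts are roughly balanced.
mean(parts$train$churn)
mean(parts$test$churn)
```

If the churn rates in the two parts differ a lot, error measures on the test set become harder to compare with the training fit.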

corrgram(train, lower.panel=panel.ellipse, upper.panel=panel.pie);

[Figure: corrgram of the telecom churn training data]

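Fitting a Model

# Object names follow the glm.step / pred.glm.step / RMSE.glm.step pattern.
lm.fit <- lm(churn ~ ., data = train)
summary(lm.fit)
# Multiple R-squared:  0.1784,    Adjusted R-squared:  0.1724

How does the linear model perform?

pred.lm <- predict(lm.fit, test)
RMSE.lm <- sqrt(mean((pred.lm - test$churn)^2))
RMSE.lm  # 0.3232695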

# Building a simpler model with a similar R-squared
lm.step <- lm(churn ~ international.plan + voice.mail.plan +
                total.eve.minutes + total.night.charge + total.intl.calls +
                total.intl.charge + number.customer.service.calls, data = train)
# Multiple R-squared: 0.1767, Adjusted R-squared: 0.174

pred.lm.step <- predict(lm.step, test)
RMSE.lm.step <- sqrt(mean((pred.lm.step - test$churn)^2))
RMSE.lm.step  # 0.3227848 <- simpler model, slightly better RMSE
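
The same RMSE expression is repeated for every model, so it can help to wrap it in a small function and to compare each model against a naive baseline that predicts the overall churn rate for everyone. The sketch below is not from the original analysis: rmse() is a hypothetical helper, and the vectors are made-up stand-ins for test$churn and a model's predictions.

```r
# Hypothetical helper: root mean squared error between predictions and truth.
rmse <- function(predicted, actual) {
  sqrt(mean((predicted - actual)^2))
}

# Made-up example: 0/1 churn labels and some predicted scores.
actual    <- c(0, 0, 1, 0, 1, 0, 0, 0)
predicted <- c(0.1, 0.2, 0.7, 0.1, 0.4, 0.3, 0.0, 0.2)

rmse(predicted, actual)

# Baseline: predict the overall churn rate for everyone.
baseline <- rep(mean(actual), length(actual))
rmse(baseline, actual)
```

For a 0/1 target with churn rate p, this baseline RMSE equals sqrt(p * (1 - p)), which gives a reference point for judging whether an RMSE is good.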


How does the Logistic Regression perform?

# Logistic regression using a generalized linear model
glm.step <- glm(churn ~ international.plan + voice.mail.plan +
                  total.eve.minutes + total.night.charge + total.intl.calls +
                  total.intl.charge + number.customer.service.calls,
                family = binomial, data = train)
pred.glm.step <- predict(glm.step, newdata = test, type = "response")

RMSE.glm.step <- sqrt(mean((pred.glm.step - test$churn)^2))
RMSE.glm.step  # 0.3179586 <- better than the linear model
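
Because predict(..., type = "response") returns churn probabilities rather than class labels, a cut-off is needed before a logistic model can be compared with a classifier on accuracy. The sketch below uses simulated data, not the telecom set; the variable names, the coefficients used for the simulation and the 0.5 cut-off are assumptions chosen purely for illustration.

```r
set.seed(42)

# Simulated stand-in data: churn becomes more likely with more service calls.
n     <- 500
calls <- rpois(n, lambda = 2)
churn <- rbinom(n, size = 1, prob = plogis(-2.5 + 0.6 * calls))
sim   <- data.frame(calls = calls, churn = churn)

# Fit a logistic regression and obtain predicted probabilities.
fit  <- glm(churn ~ calls, family = binomial, data = sim)
prob <- predict(fit, newdata = sim, type = "response")

# Convert probabilities to 0/1 labels with an assumed cut-off of 0.5.
cutoff     <- 0.5
pred_class <- as.numeric(prob >= cutoff)

# Share of cases classified correctly (in-sample, for illustration only).
mean(pred_class == sim$churn)
```

Lowering the cut-off flags more customers as at risk at the price of more false alarms, which ties back to the cost of offering incentives to customers who would have stayed anyway.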


How does the Decision Tree perform?

# Build a decision tree based on the selected variables
rpart.step <- rpart(churn ~ international.plan + voice.mail.plan +
                      total.eve.minutes + total.night.charge + total.intl.calls +
                      total.intl.charge + number.customer.service.calls,
                    data = train, method = "class")

# Without type = "class", predict() returns a matrix of class probabilities,
# so this RMSE is not meaningful. See the correction below.
pred.rpart.step <- predict(rpart.step, test)
RMSE.rpart.step <- sqrt(mean((pred.rpart.step - test$churn)^2))
RMSE.rpart.step  # 0.6742183 <- much worse than the linear model

# Corrected: request class labels with type = "class"
pred.rpart.step <- as.numeric(predict(rpart.step, test, type = "class")) - 1
RMSE.rpart.step <- sqrt(mean((pred.rpart.step - test$churn)^2))
RMSE.rpart.step  # 0.2423902 <- better than the linear model

# Accuracy 0.941247: 94% of the test cases are classified correctly
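
For hard 0/1 predictions the squared error of each case is either 0 or 1, so the mean squared error is the misclassification rate and accuracy equals 1 - RMSE^2; this is how an RMSE of 0.2423902 corresponds to an accuracy of 0.941247. A confusion matrix shows the same information broken down by class. A minimal sketch with made-up label vectors (stand-ins for test$churn and pred.rpart.step):

```r
# Made-up actual and predicted classes (0 = stays, 1 = churns).
actual    <- c(0, 0, 0, 1, 1, 0, 1, 0, 0, 0)
predicted <- c(0, 0, 0, 1, 0, 0, 1, 0, 1, 0)

# Confusion matrix: rows are actual classes, columns are predicted classes.
conf_mat <- table(actual = actual, predicted = predicted)
print(conf_mat)

# Accuracy is the share of cases on the diagonal.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# For 0/1 predictions, accuracy and RMSE carry the same information.
rmse <- sqrt(mean((predicted - actual)^2))
all.equal(accuracy, 1 - rmse^2)
```

The off-diagonal cells separate missed churners from false alarms, a distinction that a single accuracy or RMSE figure hides.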


How does the Random Forest perform?

# Build Random Forest Ensemble
set.seed(415)
rf.step <- randomForest(as.factor(churn) ~ international.plan + voice.mail.plan +
                          total.eve.minutes + total.night.charge + total.intl.calls +
                          total.intl.charge + number.customer.service.calls,
                        data = train, importance = TRUE, ntree = 2000)
varImpPlot(rf.step)

pred.rf.step <- as.numeric(predict(rf.step, test)) - 1
RMSE.rf.step <- sqrt(mean((pred.rf.step - test$churn)^2))
RMSE.rf.step  # 0.2217221 <- improvement over the linear model, so a non-linear, tree-based approach works better here

# Accuracy 0.9508393: 95% of the test cases are classified correctly



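Model                                    RMSE       Comment
Linear Model                             0.3232695
Simpler Linear Model                     0.3227848  simpler, and slightly better RMSE
Logistic Regression                      0.3179586  better than the linear model
Decision Tree (without type = "class")   0.6742183  much worse than the linear model; not meaningful, because predict() returned class probabilities rather than labels
Decision Tree (with type = "class")      0.2423902  better than the linear model
Random Forest                            0.2217221  improvement over the linear model, so a non-linear, tree-based approach is better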

