Segmentation of Mobile Customers using Data Mining Techniques


Md. Mahfuz Reza
Assistant Professor
Department of Computer Science and Engineering, Mawlana Bhashani Science and Technology University
Santosh, Tangail-1902, Bangladesh

Sajedun Nahar
Student
Department of Computer Science and Engineering, Mawlana Bhashani Science and Technology University
Santosh, Tangail-1902, Bangladesh

Tanya Akter
Department of Computer Science and Engineering, Mawlana Bhashani Science and Technology University
Santosh, Tangail-1902, Bangladesh

Abstract – In today's competitive world, keeping customers satisfied is key to success for telecommunication companies. Data mining techniques are well suited to discovering customers' attributes and needs, which becomes possible by segmenting their behaviour. Segmentation is the process of developing meaningful customer groups whose members are similar in their account characteristics and behaviour. Using K-means clustering, the paper proposes an approach to customer segmentation for a telecommunication company. The prime objectives were to group customers by their behavioural characteristics and to provide services according to the group.

Keywords: Database of mobile customers, data mining, K-means clustering, segmentation, services


    The telecom industry is a typical data-intensive industry, in which data mining technologies can be used to obtain useful knowledge, provide customers with better services and find more commercial opportunities [2]. Customer satisfaction and attraction are among the most significant goals of leading companies today, because they directly affect a company's revenue. Customer profitability is the profit that the company makes from serving a customer or customer group over a specified period of time [3]. Customer segmentation is the process of dividing customers into homogeneous groups on the basis of common attributes [16]. Customers within the same group have the greatest similarity. Telecommunication companies use data mining to improve their marketing efforts, identify fraud and provide service to the customer [6].

    Data mining refers to extracting or mining knowledge from large amounts of data. Many other terms carry a similar or slightly different meaning, such as knowledge mining from data, knowledge extraction, data or pattern analysis, data archaeology and data dredging.

    Commonly used data mining techniques include association analysis, classification and prediction, cluster analysis, outlier analysis and evolution analysis. Among them, cluster analysis can be used to solve the problem of customer grouping [7].

    Clustering deals with grouping objects such that each group consists of similar or related objects. The main idea behind clustering is to maximize intra-cluster similarity and minimize inter-cluster similarity. In this paper, we use the K-means clustering technique to segment the customers [2] [5]. It is one of the most common and effective methods for grouping data because of its simplicity and its ability to handle voluminous data sets. Generally, it accepts the number of clusters and the initial set of centroids as parameters. The distance from each item in the data set to each of the cluster centroids is calculated, and the item is grouped with the centroid (mean) it is closest to. This distance is usually the Euclidean distance.
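For reference, the Euclidean distance and the nearest-centroid rule described above can be written out as follows. The notation is introduced here for illustration and is not taken from the paper: x is an item with n attributes, c_k is the centroid of cluster k, and K is the number of clusters.

```latex
d(x, c_k) = \sqrt{\sum_{j=1}^{n} \left( x_j - c_{k,j} \right)^2},
\qquad
\operatorname{cluster}(x) = \arg\min_{k \in \{1, \dots, K\}} d(x, c_k)
```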


    In the telecom sector, customer clustering or segmentation is one of the most significant methods used in marketing studies. To better justify the customer analysis used for providing offers, it is relevant to survey the related literature.

    Some research on customer segmentation already exists. Our work builds on the discussion by S.M.H. Jansen (2007), who used different clustering techniques to segment customers and a support vector machine to profile the segmented customers.

    Konstantinos Tsiptsis and Antonios Chorianopoulos (2009) discussed customer segmentation in telecommunications on the basis of user behaviour, using two approaches: behavioural segmentation and value-based segmentation. They used all the available usage data to reveal the natural groupings in their customer base. The behavioural segmentation included the application of a data reduction technique (PCA) to reveal the distinct dimensions of information, followed by a clustering technique to identify the segments. Value-based segmentation, by contrast, relies on a single field and does not need a data mining algorithm. It only involves sorting the records by a profitability index and assigning them to corresponding groups.
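As a rough illustration of the value-based approach summarized above, the following Python sketch sorts records by a profitability index and assigns them to ranked groups. The record layout, the number of groups and the equal-sized split are assumptions made for this example; they are not taken from the cited work.

```python
def value_based_segments(customers, n_groups=3):
    """Sort customers by a profitability index and split them into ranked groups.

    customers: list of (customer_id, profitability_index) pairs (hypothetical layout).
    Returns a dict mapping customer_id to a group number, 1 being the most profitable.
    """
    # Highest profitability index first.
    ranked = sorted(customers, key=lambda rec: rec[1], reverse=True)
    # Ceiling division so that every customer falls into one of n_groups groups.
    group_size = -(-len(ranked) // n_groups)
    return {cid: i // group_size + 1 for i, (cid, _) in enumerate(ranked)}


# Made-up records, for illustration only.
customers = [("A", 120.0), ("B", 15.5), ("C", 64.0), ("D", 230.0), ("E", 8.0), ("F", 71.2)]
segments = value_based_segments(customers)
```

No clustering algorithm is involved here, which matches the point made above that value-based segmentation needs only a sort and a group assignment.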

    At the 2017 3rd International Conference on Science and Technology – Computer (ICST), Andry Alamsyah and Bellania Nurriz presented "Monte Carlo Simulation and Clustering for Customer Segmentation in Business Organization", a business analysis in support of a customer segmentation case. They applied clustering methods to one of the branches of an Indonesian telecommunication company. They used a simulation to generate customer income data for Telkom Indonesia in Makassar city. The simulated data was then used to support customer segmentation analysis using K-means clustering.


    1. Data Mining

      Data mining is the process of searching and analysing data in order to find implicit but potentially useful information. It is a powerful tool that helps companies predict customer behaviour [1]. Basic data mining tasks include association rules, sequential patterns, clustering and classification.

    2. Clustering

      The objective of cluster analysis is to organize objects into groups according to the similarities among them [10]. The main idea behind clustering is to maximize intra-cluster similarity and minimize inter-cluster similarity. The K-means algorithm is a classical, commonly used clustering algorithm based on the centroid model [16]. Each cluster is represented by the mean value of the cases it contains.

      K-means works as follows:

      1. Select the number of clusters. Let this number be K.

      2. Pick K seeds as the centroids of the clusters. The seeds may be picked randomly.

      3. Compute the Euclidean distance of each object in the set from each of the centroids.

      4. Allocate each object to the cluster it is nearest to, based on the distances computed in the previous step.

      5. Compute the centroid of each cluster as the mean of the attribute values of its members.

      6. Check whether the stopping criterion has been met; if so, stop.

      7. Otherwise, go to step 3.
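The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the paper (which reports using R and Oracle tools). The stopping criterion chosen here (no object changes cluster, or a maximum number of iterations), the handling of empty clusters, the function names and the sample points are all assumptions.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster points (a list of tuples) into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # step 2: pick K seeds at random
    labels = None
    for _ in range(max_iter):
        # Steps 3-4: assign every object to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        # Step 6 (assumed criterion): stop once no object changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 5: recompute each centroid as the mean of its members' attributes.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


# Made-up two-attribute data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster receives which label number depends on the random seeds.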

    3. Segmentation

    Segmentation is the process of dividing the customers of a consumer or business market into groups based on some shared characteristics. Customers can be segmented on various aspects, such as demographic, behavioural and geographic [3].





    This section describes in detail the research methodology, including data collection, preparation, cluster analysis, segmentation, profiling, customer identification and service provision.

    Fig. 1: Flow chart of customer segmentation (stages: data preparation, clustering, customer identification; boxes: collection of dataset, raw data, data cleaning, data exploration, K-means clustering, clusters of data, data profiling, profiling outcomes)

    The first step is collecting the data from a telecom company.

    The attributes in our data set are given below [6]:



    Attribute names

    Network Type

    After the data is obtained, it needs to be cleaned and prepared. This pre-processing step comprises several activities, such as noise reduction, data cleaning, data integration and data reduction, to get the data into a better form. The next step is segmenting the customers using cluster analysis. K-means is one of the most important and commonly used methods for dividing a dataset into the requested number of clusters [5]. The resulting customer clusters are then analysed. Finally, we provide services to these customers.
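A small part of the cleaning step described above, counting missing values and dropping incomplete records, might look like the following Python sketch. The paper reports doing this check in R, so this is an illustration rather than the authors' code. The record layout and the attribute names other than Hour are hypothetical.

```python
def is_missing(value):
    """Treat None and the empty string as missing; 0 is a valid value."""
    return value is None or value == ""


def count_missing(records, attributes):
    """Return the number of missing values per attribute."""
    return {a: sum(1 for r in records if is_missing(r.get(a))) for a in attributes}


def drop_incomplete(records, attributes):
    """Keep only the records that have a value for every listed attribute."""
    return [r for r in records if not any(is_missing(r.get(a)) for a in attributes)]


# Made-up records; the field names are assumptions for this example.
records = [
    {"Customer": "C1", "Network.Type": "3G", "Hour": 9, "Mins": 12.5},
    {"Customer": "C2", "Network.Type": None, "Hour": 18, "Mins": 3.0},
    {"Customer": "C3", "Network.Type": "2G", "Hour": 22, "Mins": None},
    {"Customer": "C4", "Network.Type": "3G", "Hour": 7, "Mins": 0.0},
]
attributes = ["Network.Type", "Hour", "Mins"]
missing = count_missing(records, attributes)
clean = drop_incomplete(records, attributes)
```

Dropping incomplete records is only one possible policy. Imputing a value would be an alternative, and the paper does not say which was used.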


    We analyse fifteen days of historical data from an anonymized, coded dataset, to protect customer and company privacy. The implementation uses RStudio and Oracle 11gR2.

    1. Checking missing value:

      We check whether any missing values exist in our data set. We used the R language to count the missing values.

      Fig. 2: Calculation of missing value

    2. Data Exploration:

      Before clustering, the data should be explored. We use Oracle Data Miner (ODM) to explore the data.

      Fig. 3: Data exploration

    3. Structure of Dataset:

      Information on all attributes is included in this section.

      Fig. 4: Structure of Dataset

    4. Cluster Analysis:

      In this research, K-means clustering is used to group the customers for further segmentation.

      K-means is a centroid-based clustering method in which the notion of similarity is derived from how close a data point is to the centroid of a cluster. Each item is assigned to the cluster whose centroid is at the smallest distance from the item.

      Fig. 5: Centroid information

      We segment the Hour attribute into three time periods. These are:

        • Attribute Daytime.Mins denotes the period 6.00am-5.00pm.

        • Attribute Evening.Mins denotes the period 5.00pm-8.00pm.

        • Attribute Night.Mins denotes the period 8.00pm-6.00am.
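One way to derive the three period attributes from an hour-of-day value is sketched below in Python. The treatment of boundary hours (5.00pm counted as evening, 8.00pm as night, 6.00am as daytime), the 0-23 hour encoding and the call record layout are assumptions, since the paper does not spell them out.

```python
def period_of(hour):
    """Map an hour of day (0-23) to the period attribute it contributes to."""
    if 6 <= hour < 17:
        return "Daytime.Mins"
    if 17 <= hour < 20:
        return "Evening.Mins"
    return "Night.Mins"  # 20-23 and 0-5


def minutes_by_period(calls):
    """Sum call minutes per customer and period.

    calls: iterable of (customer_id, hour, minutes) tuples (hypothetical layout).
    """
    totals = {}
    for customer, hour, minutes in calls:
        row = totals.setdefault(
            customer, {"Daytime.Mins": 0.0, "Evening.Mins": 0.0, "Night.Mins": 0.0}
        )
        row[period_of(hour)] += minutes
    return totals


# Made-up call records, for illustration only.
calls = [("C1", 9, 10.0), ("C1", 17, 5.0), ("C1", 23, 2.5), ("C2", 5, 4.0), ("C2", 6, 1.0)]
totals = minutes_by_period(calls)
```

The resulting per-customer rows could then serve as input vectors for the clustering step.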

    5. Profiling

    We profile the customers using the Hour attribute, which provides the segments. The figure below shows the hour profile of each cluster.

    Fig. 6: The hour profile of each cluster

    1. Relativity

      The relationship among Daytime.Mins, Evening.Mins and Night.Mins customers is given below:

      Fig. 7: Relativity in Daytime.Mins, Evening.Mins, Night.Mins

      By examining the values of each cluster we can determine the profitability of its customers. From Fig. 6 we can determine that cluster 2 contains the most profitable customers and cluster 1 the least profitable. So, we can decide:

      Cluster 2: Highly profitable customers.

      Cluster 3: Profitable customers.

      Cluster 1: Low-profit customers.

      Cluster 2 contains both Daytime.Mins and Evening.Mins attributes.

      Fig. 8: Information of Evening.Mins user

      Fig. 9: Information of Daytime.Mins user

      By understanding the profitability and attributes of customers, companies can make decisions to improve their services. Companies need to provide better service by addressing the customer categories discussed above. After grouping the customers, we provide services to each group.

      Services provided for Daytime users:

      • On-net call time offers.

      • Bundle offers.

      • Bonus offers.

      • Offers for news, games facility.

      Services provided for Evening time users:

      • FnF offers.

      • Special call rate offers.

      • SMS bundle offers.

      • New technology services such as internet offers.

    We cannot ignore medium- and low-profit customers, because they are part of the company's profit and may one day become loyal customers. So, companies have to take care of them as well.


  1. Conclusion

    Recently, the mobile telecommunication marketplace has become highly competitive. Increasing the number of customers is the main challenge in the modern telecommunication industry [4]. In this paper, we have shown that, through customer segmentation, a telecommunication company can more easily attract customers with the right products and services [6]. This also helps in designing packages, offers and bundles for customers.

    So, companies must recognize the importance of customer segmentation and of profiling customers' behaviour to achieve better results by narrowing customer segments. Cluster analysis is able to solve the customer segmentation problem [18]. This paper adopts the K-means clustering method to carry out an analysis of telecom customer segmentation. Practical results indicate that the analysis of customer segmentation for the telecom sector is effective and successful [2]. The business objective was to group customers in terms of their behavioural characteristics and to provide services according to the group, considering which customers are profitable for the company. In other words, we discuss, at a theoretical level, the use of a data mining algorithm for serving customers with suitable offers.

  2. Future work

    For future research, we can predict at-risk customers using association rules. We can also assess accuracy for all customers using churn prediction, and study the performance of clustering when revenue behaviour is included.


[1] S.M.H. Jansen, "Customer Segmentation and Customer Profiling for a Mobile Telecommunications Company Based on Usage Behavior: A Vodafone Case Study", pp. 8-13, 17 July 2007.

[2] Cai Qiuru, Lua Ye, Xi Haixu, Liu Yijun, Zhu Guangping, "Telecom Customer Segmentation Based on Cluster Analysis", in proceedings of the International Conference on Computer Science and Information Processing (CSIP), 2012: IEEE.

[3] A. S. M. Shahadat Hossain, "Customer Segmentation using Centroid Based and Density Based Clustering Algorithms", in proceedings of the 3rd International Conference on Electrical Information and Communication Technology (EICT), 7-9 December 2017, Khulna, Bangladesh: IEEE.

[4] Anahita Namvar, Mehdi Ghazanfari, Mohsen Naderpour, "A Customer Segmentation Framework for Targeted Marketing in Telecommunication", in proceedings of the 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), 2017: IEEE.

[5] Andry Alamsyah, Bellania Nurriz, "Monte Carlo Simulation and Clustering for Customer Segmentation in Business Organization", in proceedings of the 3rd International Conference on Science and Technology - Computer (ICST), 2017: IEEE.

[6] Salar Masood, Moaz Ali, Faryal Arshad, Ali Mustafa Qamar, Aatif Kamal, Ahsan Rehman, "Customer Segmentation and Analysis of a Mobile Telecommunication Company of Pakistan using Two Phase Clustering Algorithms", 2013: IEEE.

[7] Zhang Yihua, "VIP Customer Segmentation Based on Data Mining in Mobile-communications Industry", in proceedings of the 5th International Conference on Computer Science & Education, Hefei, China, August 24-27, 2010: IEEE.

[8] Hasan Ziafat, Majid Shakeri, "Using Data Mining Techniques in Customer Segmentation", Int. Journal of Engineering Research and Applications, Vol. 4, Issue 9 (Version 3), September 2014, pp. 70-79.

[9] Prof. Tejal Upadhyay, "Customer Profiling and Segmentation using Data Mining Techniques", vol. 7, no. 2, March 2016-Sept 2016, pp. 65-67 [online]

[10] Mali, K., "Clustering and its validation in a symbolic framework", Patt. Recogn. Lett., vol. 24 (2003), pp. 2367-2376.

[11] Anshul Arora, Dr. Rajan Vohra, "Segmentation of Mobile Customers for Improving Profitability Using Data Mining Techniques", (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 5 (4), pp. 5241-5244, 2014.

[12] Hasitha Indika Arumawadu, R. M. Kapila Tharanga Rathnayaka, S. K. Illangarathne, "Mining Profitability of Telecommunication Customers Using K-Means Clustering", Journal of Data Analysis and Information Processing, 2015, vol. 3, pp. 63-71, [online] Available:, 36/jdaip.2015.33008 , [Accessed in August 2015 in SciRes].

[13] Emilia Mattila, "Behavioral Segmentation of Telecommunication Customers", Master of Science Thesis, Stockholm, Sweden, 2008.


[15] Konstantinos Tsiptsis, Antonios Chorianopoulos, "Segmentation Applications in Telecommunications", in Data Mining Techniques in CRM: Inside Customer Segmentation, pp. 291-332, 2009.

[16] Ye, Xi Haixu, Liu Yijun, Cai Qiuru, Yu Zhi-min, "Telecom Customer Segmentation Based on Cluster Analysis", in proceedings of the International Conference on Computer Science and Information Processing (CSIP), 2012: IEEE.

[17] Jinghua Zhao, Wenbo Zhang, Yanwei Liu, "Improved K-Means Cluster Algorithm in Telecommunications Enterprises Customer Segmentation", 2010: IEEE.

[18] Zhang Xiaozhou, Zhang Yonghong, Telecom Customer Marketing [M]. Beijing: Posts & Telecom Press, 2008.
