Prediction of Customer Behaviour Using CRM Framework

Malini R., B.E., (M.Tech), CS & E Dept., AIT, Chikkamagaluru; Mr. S. J. Prashanth, B.E., M.Tech, CS & E Dept., AIT, Chikkamagaluru

Abstract: In today's advanced business world, a CRM-data mining framework helps organizations establish close relationships with their customers and manage those relationships. Data mining has gained popularity in various CRM applications in recent years, and the classification model is an important data mining technique in this field. The model is used to predict the behaviour of customers and so enhance the decision-making processes for retaining valued customers. An efficient CRM-data mining framework is proposed, and two classification models, Naive Bayes and Neural Networks, are studied to show that the accuracy of Neural Networks is comparatively better.

Keywords: data mining framework; customer relationship management; prediction; classification


    Data mining is the computing process of discovering patterns in large data sets, involving methods at the intersection of machine learning, statistics, and database systems. It is an essential process in which intelligent methods are applied to extract data patterns, and it is an interdisciplinary subfield of computer science. Data mining is the analysis step of the "knowledge discovery in databases" process. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Customer relationship management (CRM) is an approach to managing a company's interaction with current and potential customers. It uses data analysis about customers' history with a company to improve business relationships with customers, specifically focusing on customer retention and ultimately driving sales growth. One important aspect of the CRM approach is the CRM systems that compile data from a range of different communication channels, including a company's website, telephone, email, live chat, marketing materials, and more recently, social media. Through the CRM approach and the systems used to facilitate it, businesses learn more about their target audiences and how best to cater to their needs.

    In today's advanced business world, a CRM-data mining framework establishes close customer relationships and manages the relationships between organizations and customers. Data mining has gained popularity in various CRM applications in recent years, and the classification model is an important data mining technique in this field. Information technology tools, advanced internet technologies and the explosion in customer data have improved the opportunities for marketing and have changed the way relationships between organisations and their customers are managed.

    Customer Relationship Management helps in building long-term and profitable relationships with valuable customers. The set of processes and supporting systems in CRM helps in developing a business strategy, and the enterprise approach understands and influences customer behaviour through meaningful communications so that customer acquisition, customer loyalty, customer retention and customer profitability are improved. The key factor in the development of a competitive CRM strategy is understanding and analyzing customer behaviour, which helps in acquiring and retaining potential customers so as to maximize customer value. A CRM-data mining framework helps organizations to identify valuable customers and predict their future behaviour.

    To illustrate the performance of classification models, CRM applications such as customer segmentation, prospecting and acquisition, affinity and cross-sell, profitability, retention and attrition, and risk analysis are considered in the banking domain. Instead of mass campaigns, banks focus on direct marketing campaigns as one measure to improve customer development. The banks use the data available to retain their best customers and to identify opportunities to sell them additional services. Two classification models are used: the Multilayer Perceptron Neural Network (MLPNN), which has its roots in artificial intelligence, and the Naive Bayes (NB) classifier, a simple probabilistic classifier based on applying Bayes' theorem.

    The main components of CRM are building and managing customer relationships through marketing, observing relationships as they mature through distinct phases, managing these relationships at each stage and recognizing that the distribution of value of a relationship to the firm is not homogenous. The final factor of CRM highlights the importance of CRM through accounting for the profitability of customer relationships. Relational Intelligence, or awareness of the variety of relationships a customer can have with a firm, is an important component to the main phases of CRM.


    1. Data Mining with Neural Networks and Support Vector Machines.

      Cortez P. [2] proposed data mining with neural networks and support vector machines using the R/rminer tool. rminer is an open-source library for the R tool that facilitates the use of data mining (DM) algorithms, such as neural networks (NNs) and support vector machines (SVMs), in classification and regression tasks. The fields of data mining (DM) and business intelligence (BI) arose due to the advances of information technology (IT), which led to an exponential growth of business and scientific databases. The aim of DM/BI is to analyze raw data and extract high-level knowledge for the domain user or decision-maker.

      In this work, the rminer library is presented, which is an integrated framework that uses a console-based approach and facilitates the use of DM algorithms in R. In particular, it addresses two important and common goals: (1) classification, labelling a data item as one of several predefined classes; and (2) regression, estimating a real value from several input attributes. While adopting R packages for the DM algorithms, rminer provides new features: (i) it simplifies the use of DM algorithms (e.g. NNs and SVMs) in classification and regression tasks by presenting a short and coherent set of functions; (ii) it performs automatic model selection (i.e. tuning of NN/SVM); (iii) it computes several classification/regression metrics and graphics, including the sensitivity analysis procedure for input relevance extraction.

      The aim is to show that the R/rminer results are consistent when compared with other DM tools. Two tutorial examples (e.g. satellite image classification) were used to show the R/rminer potential under the CRISP-DM methodology. Additional experiments were held in order to measure the predictive performance of the rminer library. Overall, competitive results were obtained, in particular by the SVM model for the classification tasks and by the NN for the regression ones.

    2. Application of Data Mining Techniques in Customer Relationship Management.

      EWT Ngai, L Xiu and DCK Chau [3] reviewed the application of data mining techniques in customer relationship management. Despite the importance of data mining techniques to customer relationship management (CRM), there was a lack of a comprehensive literature review and a classification scheme for it. This is the first identifiable academic literature review of the application of data mining techniques to CRM. CRM comprises a set of processes and enabling systems supporting a business strategy to build long-term, profitable relationships with specific customers. Customer data and information technology (IT) tools form the foundation upon which any successful CRM strategy is built.

      Application of data mining techniques in CRM is an emerging trend in the industry. It has attracted the attention of practitioners and academics. This paper identified eighty-seven articles related to the application of data mining techniques in CRM, published between 2000 and 2006. It aims to give a research summary on the application of data mining in the CRM domain and on the techniques which are most often used. Based on past publication rates and the increasing interest in the area, research on the application of data mining in CRM is expected to increase significantly in the future. The majority of the reviewed articles relate to customer retention. Of these, 51.9% (28 articles) and 44.4% (24 articles) are related to one-to-one marketing and loyalty programs respectively. These articles could provide insight to organization policy makers on the common data mining practices used in retaining customers.

      There are relatively few articles discussing target customer analysis. Data mining techniques, such as neural networks and decision trees, could be used to seek the profitable segments of customers through analysis of customers' underlying characteristics. The classification model is the most commonly applied model in CRM for predicting future customer behaviours. This is not surprising, as classification modelling could be used to predict the effectiveness or profitability of a CRM strategy through the prediction of customer behaviours.

    3. A data-driven approach to predict the success of bank telemarketing.

      S. Moro, P. Cortez and P. Rita [11] proposed a data mining (DM) approach to predict the success of telemarketing calls for selling bank long-term deposits. A Portuguese retail bank was addressed, with data collected from 2008 to 2013, thus including the effects of the recent financial crisis. A semi-automatic feature selection was explored in the modelling phase, performed with the data prior to July 2012, which allowed the selection of a reduced set of 22 features. Four DM models were compared: logistic regression, decision trees (DTs), neural network (NN) and support vector machine. Using two metrics, area under the receiver operating characteristic curve (AUC) and area under the LIFT cumulative curve (ALIFT), the four models were tested on an evaluation set, using the most recent data and a rolling window scheme.

      This research focuses on targeting through telemarketing phone calls to sell long-term deposits. Within a campaign, the human agents make phone calls to a list of clients to sell the deposit (outbound) or, if meanwhile the client calls the contact-center for any other reason, he or she is asked to subscribe to the deposit (inbound). Thus, the result is a binary outcome: an unsuccessful or a successful contact. In this work, four binary classification DM models are tested, as implemented in the rminer package of the R tool: logistic regression (LR), decision trees (DTs), neural network (NN) and support vector machine (SVM).

      During the modelling phase, and using a semi-automated feature selection procedure, a reduced set of 22 relevant features was selected. The four models were compared using the two metrics, AUC and ALIFT, both at the modelling and at the rolling window evaluation phases. For both metrics and phases, the best results were obtained by the NN, which resulted in an AUC of 0.80 and an ALIFT of 0.67 during the rolling window evaluation.
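The AUC metric used in that comparison can be computed directly from predicted scores. The following is a minimal sketch, not the rminer implementation: the function name `roc_auc` and the toy labels and scores are illustrative assumptions, and the rank-sum (Mann-Whitney) identity is one standard way to obtain the area under the ROC curve.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 values (1 = successful contact)
    scores: predicted probabilities of success, same length as labels
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    # Assign 1-based ranks; tied scores share the average of their ranks.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative example")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Toy example (made-up values, not the bank data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of successful contacts above unsuccessful ones, which gives a scale for reading the reported value of 0.80.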


    The objectives of the Customer Relationship Management framework are to present an efficient CRM-data mining framework for the prediction of customer behaviour in the domain of banking applications, and to understand and analyze customer behaviour in acquiring and retaining potential customers so as to maximize customer value.

    Fig.1.CRM-data mining framework

    The proposed CRM-data mining framework is shown in Fig.1. Understanding the business goals and requirements of the problem domain forms the initial phase of any data mining problem. A close study and management of customer relationships and their interactions helps to identify, attract and retain effective customers in the domain. The next phase, data preparation or pre-processing, prepares the data through cleaning, attribute selection, data transformation, etc. for the subsequent building of models and their evaluation. Model construction is a major step in the CRM framework, in which an effective model that satisfies the business requirements is constructed. These models help in predicting the behaviour of the customers. Model evaluation and visualization measure the effectiveness of the models so that their performance can be enhanced.

      1. Data Source

        1. Customer Identification: Financial institutions are required to verify the identity of individuals wishing to conduct financial transactions with them. Customer identification must include new account opening procedures that specify the identifying information that will be obtained from each customer. It must also include reasonable and practical risk-based procedures for verifying the identity of each customer.

        2. Customer development: Customer development is a formal methodology for building start-ups and new corporate ventures. The process assumes that early ventures have untested hypotheses about their business model (who the customers are, what features they want, what channel to use, revenue strategy/pricing tactics, how to get/keep/grow customers, strategic activities needed to deliver the product, internal resources needed, partners needed and costs). Customer development starts with the key idea that there are no facts inside your building, so you must get outside to test these hypotheses.

        3. Customer attraction and retention: Attracting customers is the primary goal of most public-facing businesses, because it is the customer who creates demand for goods and services. Customer retention refers to the activities and actions companies and organizations take to reduce the number of customer defections.

        4. Business/Domain Understanding: Running a business requires a lot of knowledge and hard work. There are ongoing activities, business operations, involved in the production of value for all the stakeholders. The intended outcome of these business operations is to harvest the value from the assets owned by the business.

        5. Data Preparation/Pre-processing: Data pre-processing is a data mining technique that involves transforming raw data into an understandable format. Real-world data is often incomplete, inconsistent, and/or lacking in certain behaviours or trends, and is likely to contain many errors.
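The cleaning and transformation steps of the pre-processing phase can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say which techniques were applied, so mean imputation, min-max scaling and one-hot encoding are choices of this sketch, and the function `preprocess` and the field names `age` and `job` are hypothetical.

```python
def preprocess(records, numeric_fields, categorical_fields):
    """Clean, scale and encode a list of dict records.

    Missing numeric values (None) are replaced by the column mean, numeric
    columns are min-max scaled to [0, 1], and categorical columns are
    one-hot encoded. Returns (feature_names, rows).
    """
    columns = {}
    for field in numeric_fields:
        present = [r[field] for r in records if r.get(field) is not None]
        mean = sum(present) / len(present)
        filled = [r[field] if r.get(field) is not None else mean for r in records]
        lo, hi = min(filled), max(filled)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        columns[field] = [(v - lo) / span for v in filled]
    for field in categorical_fields:
        # Missing categories are mapped to an explicit "unknown" value.
        values = [r.get(field) or "unknown" for r in records]
        for category in sorted(set(values)):
            columns[f"{field}={category}"] = [
                1.0 if v == category else 0.0 for v in values
            ]
    names = list(columns)
    rows = [[columns[name][i] for name in names] for i in range(len(records))]
    return names, rows


# Hypothetical raw customer records with missing values.
raw = [
    {"age": 30, "job": "admin"},
    {"age": None, "job": "technician"},
    {"age": 50, "job": None},
]
names, rows = preprocess(raw, numeric_fields=["age"], categorical_fields=["job"])
```

Scaling the numeric attributes to a common range is a common preparation step before training a neural network, because the sigmoid activation saturates for large input magnitudes.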

      2. Multilayer Perceptron Neural Network (MLPNN)

        A multilayer perceptron (MLP) is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. MLP utilizes a supervised learning technique called backpropagation for training.

        The multilayer perceptron neural network (MLPNN) is organized as a layered set of neurons. Among the input, output and hidden layers of neurons, the actual computations of the network are performed in the hidden layer, where each neuron sums its input attributes x_i after multiplying them by the strengths of the respective connection weights w_ij. Applying the activation function (AF) f to this sum gives the output y_j; the sigmoid function is the AF used in the experiment.

        $y_j = f\left(\sum_i w_{ij} x_i\right)$ (1)

        Back-propagation (BP) learning is the most common training technique used for MLPNN. The error E, based on the sum of squared differences between the desired and actual values of the output neurons, is defined as:

        $E = \frac{1}{2} \sum_j \left(y_{dj} - y_j\right)^2$ (2)

        where y_j is the output of neuron j and y_dj is its desired value. The weights w_ij in equation (1) are adjusted to reach the minimum of the error E in equation (2) as quickly as possible. BP applies weight corrections that reduce the difference between the network outputs and the desired ones. In this way the neural network learns and reduces future errors. Good learning ability, fast real-time operation, low memory demand and the ability to analyse complex patterns are some of the advantages of MLPNN; the disadvantages include the network's need for high-quality data, the need for careful a priori selection of variables, and so on.
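Equations (1) and (2) and the BP weight correction can be made concrete with a small sketch. It is not the network used in the experiment: the layer sizes, the initial weights, the learning rate `lr`, the bias handling (a constant 1.0 appended to each layer's input) and the logical-OR toy data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Apply equation (1) layer by layer; a trailing 1.0 acts as the bias input."""
    x_b = list(x) + [1.0]
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x_b))) for row in w_hidden]
    h_b = h + [1.0]
    y = [sigmoid(sum(w * hi for w, hi in zip(row, h_b))) for row in w_out]
    return x_b, h_b, y


def squared_error(y_desired, y):
    """Equation (2): E = 1/2 * sum_j (y_dj - y_j)^2."""
    return 0.5 * sum((d - o) ** 2 for d, o in zip(y_desired, y))


def backprop_step(x, y_desired, w_hidden, w_out, lr=0.5):
    """One gradient-descent update of all weights for a single example."""
    x_b, h_b, y = forward(x, w_hidden, w_out)
    # Output deltas: dE/dz_j = (y_j - y_dj) * y_j * (1 - y_j) for the sigmoid.
    delta_out = [(o - d) * o * (1.0 - o) for o, d in zip(y, y_desired)]
    # Hidden deltas: output deltas propagated back through w_out
    # (computed before w_out is modified; the bias unit has no delta).
    delta_hid = [
        h_b[k] * (1.0 - h_b[k])
        * sum(delta_out[j] * w_out[j][k] for j in range(len(w_out)))
        for k in range(len(w_hidden))
    ]
    for j, row in enumerate(w_out):
        for k in range(len(row)):
            row[k] -= lr * delta_out[j] * h_b[k]
    for k, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hid[k] * x_b[i]
    return squared_error(y_desired, y)  # error measured before the update


# Toy data: logical OR (illustrative only, not the banking data set).
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]
w_hidden = [[0.1, -0.2, 0.05], [-0.1, 0.2, -0.05]]  # 2 neurons, 2 inputs + bias
w_out = [[0.3, -0.3, 0.1]]                          # 1 neuron, 2 hidden + bias


def total_error():
    return sum(squared_error(d, forward(x, w_hidden, w_out)[2]) for x, d in data)


error_before = total_error()
for _ in range(2000):
    for x, d in data:
        backprop_step(x, d, w_hidden, w_out)
error_after = total_error()
```

Comparing `error_before` with `error_after` shows the effect described in the text: repeated weight corrections shrink the gap between the network outputs and the desired ones.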

      3. Naive Bayes (NB)

    In machine learning, naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features. Bayesian classifiers are helpful in predicting the probability that a sample belongs to a particular class. The technique is used for large databases because of its high accuracy and because its simple models are fast to train. The classifier requires only a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification. It also handles both real-valued and discrete data.

    Bayes' rule can be used as the basis for designing learning algorithms, as follows. To learn some target function f: U → V, or equivalently P(V|U), we use the training data to learn estimates of P(U|V) and P(V). Using these estimated probability distributions and Bayes' rule, new U examples can then be classified.

    In classification learning problems, a learner attempts to construct a classifier from a given set of training instances with class labels. The Naive Bayes classifier assumes that all attributes describing U are conditionally independent given V. This assumption dramatically reduces the number of parameters that must be estimated to learn the classifier. Naive Bayes is a widely used learning algorithm for both discrete and continuous U.
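The procedure above, estimating P(V) and each P(U_i|V) from training data and then classifying a new example with Bayes' rule, can be sketched for discrete attributes as follows. The function names, the Laplace smoothing constant `alpha` and the toy attributes (marital status, existing loan) are assumptions of this sketch and are not taken from the experiment.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Estimate the counts behind P(V) and P(U_i | V) from categorical data.

    alpha is a Laplace smoothing constant (an assumption of this sketch) that
    keeps unseen attribute values from producing zero probabilities.
    """
    class_counts = Counter(labels)
    n_features = len(rows[0])
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    feature_values = [set() for _ in range(n_features)]
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values, alpha


def predict(model, row):
    """Return the class v that maximises log P(v) + sum_i log P(u_i | v)."""
    class_counts, value_counts, feature_values, alpha = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            numerator = value_counts[(i, label)][value] + alpha
            denominator = count + alpha * len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_class, best_score = label, score
    return best_class


# Toy training set: (marital status, has loan) -> subscribed? (made-up values)
train_rows = [("married", "yes"), ("married", "no"), ("single", "no"), ("single", "yes")]
train_labels = ["no", "no", "yes", "yes"]
model = train_naive_bayes(train_rows, train_labels)
prediction = predict(model, ("single", "no"))
```

Because of the independence assumption, the model stores one small table of counts per attribute and class rather than a joint table over all attribute combinations, which is where the reduction in parameters comes from.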


    Fig.2. Calculation of statistical measures for a month

    Fig.2 shows the calculation of the statistical measures, namely classification accuracy, sensitivity and specificity, for the month of January; these measures are used to evaluate the performance of the classification models. The graph depicts the number of customers who subscribed to the scheme and the number who did not. The blue bar shows the number of customers who subscribed and the green bar shows the number who did not subscribe.

    Fig.3. Calculation of statistical measures for other month

    Fig.3 shows the calculation of the statistical measures, namely classification accuracy, sensitivity and specificity, for the month of June. The graph depicts the number of customers who subscribed to the scheme and the number who did not. The blue bar shows the number of customers who subscribed and the green bar shows the number who did not subscribe.
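The three measures named in the figures follow from the confusion matrix of a classifier. The sketch below uses the standard definitions; the function name, the choice of "yes" (subscribed) as the positive class and the toy predictions are illustrative assumptions, not results from the experiment.

```python
def classification_metrics(actual, predicted, positive="yes"):
    """Accuracy, sensitivity and specificity from actual and predicted labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # subscribed, predicted subscribed
            else:
                fp += 1  # not subscribed, predicted subscribed
        else:
            if a == positive:
                fn += 1  # subscribed, predicted not subscribed
            else:
                tn += 1  # not subscribed, predicted not subscribed
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy example (made-up labels).
actual = ["yes", "yes", "no", "no", "no"]
predicted = ["yes", "no", "no", "no", "yes"]
metrics = classification_metrics(actual, predicted)
```

Sensitivity is the share of actual subscribers that the model identifies, and specificity is the share of non-subscribers it correctly rejects; reporting both alongside accuracy matters when subscribers are much rarer than non-subscribers.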


In this work an efficient CRM-data mining framework for the prediction of customer behaviour is proposed. Two classification models were used to predict customer behaviour. To arrive at reliable research results it is preferable to use standard benchmark data sets such as the UCI data sets; hence a UCI data set is used in this work. The performance of the classifiers is compared in terms of accuracy, sensitivity and specificity. The model that achieved the best predictive performance was MLPNN, with an accuracy rate of 88.63%.

This work can be extended to other models such as neuro-fuzzy classifiers and ensembles of classifiers. The same experimental setup can be applied to other large live banking data sets.


[1] Berson A, Smith S, Thearling K. Building data mining applications for CRM, McGraw-Hill; 2000.

[2] Cortez P. Data Mining with Neural Networks and Support Vector Machines using the R/rminer Tool, In Proceedings of the 10th Industrial Conference on Data Mining, Germany: Springer; 2010. p. 572-583.

[3] EWT Ngai, L Xiu, DCK Chau. Application of Data Mining Techniques in Customer Relationship Management: A Literature Review and Classification, Expert Systems with Applications; 36-2, 2009. p. 2592-2602.

[4] EWT Ngai. Customer relationship management research (1992-2002): An academic literature review and classification, Marketing Intelligence & Planning; 23, 2005. p. 582-605.

[5] Femina Bahari T, Sudheep Elayidom M. An Efficient CRM-Data Mining Framework for the Prediction of Customer Behaviour (2015) 725-731.

[6] Hany AE. Bank Direct Marketing Analysis of Data Mining Techniques, International Journal of Computer Applications; 85-7, 2014.

[7] JW Han, M Kamber. Data mining concepts and techniques, 2nd ed. Morgan Kaufmann, San Francisco, CA; 2006.

[8] Ling R, Yen D. Customer relationship management: An analysis framework and implementation strategies, Journal of Computer Information Systems; 41, 2001. p. 82-97.

[9] Mitra S, Pal SK, Mitra P. Data mining in soft computing framework: A survey, IEEE Transactions on Neural Networks; 13, 2002. p. 3-14.

[10] MJA Berry, GS Linoff. Data Mining Techniques: For Marketing, Sales and Customer Relationship Management, Indianapolis: Wiley; 2004.

[11] S Moro, P Cortez, P Rita. A data-driven approach to predict the success of bank telemarketing, Decision Support Systems; 62, 2014. p. 23-31.

[12] S Moro, R Laureano, P Cortez. Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology, Proceedings of the European Simulation and Modelling Conference; Portugal, 2011. p. 117-121.

[13] Swift RS. Accelerating customer relationships: Using CRM and relationship technologies, N.J.: Prentice Hall PTR; 2001.

[14] T Munakata. Fundamentals of new artificial intelligence, 2nd ed. London: Springer-Verlag; 2008.

[15] Tom M. Mitchell. Machine Learning, 2nd ed. McGraw Hill; 2010.

[16] Turban E, Aronson JE, Liang TP, Sharda R. Decision support and business intelligence systems, 8th ed. Pearson Education; 2007.

[17] Witten I, Frank E. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed. USA: Elsevier; 2005.
