Abstract
Customer lifetime value (CLV) is the most reliable indicator in direct marketing for measuring customer profitability. This has motivated researchers to build models that maximize CLV and, consequently, strengthen the relationship between the firm and its customers. This review paper analyzes the contributions of dynamic programming models to maximizing CLV in direct marketing. It starts by reviewing the basic models, which focus on calculating, measuring, simulating, or optimizing CLV and only rarely on maximizing it. It then highlights the dynamic programming models, including the Markov Decision Process (MDP), Approximate Dynamic Programming (ADP), also called Reinforcement Learning (RL), deep RL, and double deep RL. Although MDP has contributed significantly to maximizing CLV, it has many limitations, which encouraged researchers to adopt ADP (i.e., RL) and, more recently, deep reinforcement learning (i.e., the deep Q network). These algorithms overcome the limitations of MDP and can solve complex problems without suffering from the curse of dimensionality. However, they still have limitations of their own, including overestimation of the action values. This was the main motivation behind proposing double deep Q networks (DDQN). Neither DDQN nor the algorithms that later outperformed it and overcame its limitations have yet been applied in direct marketing, which leaves room for future research.
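The overestimation issue mentioned above can be illustrated with a minimal sketch that compares the bootstrapped target of a deep Q network with that of a double deep Q network. The function names, the discount factor, and the numeric action values are hypothetical and chosen only for illustration; they are not taken from the chapter. The sketch assumes the usual formulation in which the deep Q network lets a single (target) network both select and evaluate the next action, while the double variant selects the action with the online network and evaluates it with the target network.

```python
import numpy as np

def dqn_target(reward, next_q_target, gamma=0.9):
    """Deep Q network target: one network selects and evaluates the next action."""
    return reward + gamma * np.max(next_q_target)

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.9):
    """Double deep Q network target: online network selects, target network evaluates."""
    best_action = int(np.argmax(next_q_online))
    return reward + gamma * next_q_target[best_action]

# Hypothetical action-value estimates for the next customer state,
# one entry per marketing action (e.g. no contact, email, catalog).
next_q_online = np.array([1.0, 2.0, 1.5])
next_q_target = np.array([1.2, 1.8, 2.5])
reward = 1.0  # hypothetical immediate profit from the current action

y_dqn = dqn_target(reward, next_q_target)
y_ddqn = double_dqn_target(reward, next_q_online, next_q_target)
print(y_dqn, y_ddqn)
```

Because the maximum of the target network's estimates is never smaller than its estimate for any single action, the double variant's target is never larger than the standard one for the same inputs, which is how it dampens the upward bias.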
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
AboElHamd, E., Shamma, H.M., Saleh, M. (2020). Dynamic Programming Models for Maximizing Customer Lifetime Value: An Overview. In: Bi, Y., Bhatia, R., Kapoor, S. (eds) Intelligent Systems and Applications. IntelliSys 2019. Advances in Intelligent Systems and Computing, vol 1037. Springer, Cham. https://doi.org/10.1007/978-3-030-29516-5_34
DOI: https://doi.org/10.1007/978-3-030-29516-5_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29515-8
Online ISBN: 978-3-030-29516-5
eBook Packages: Intelligent Technologies and Robotics (R0)