DOI: 10.1145/3639477.3639729
research-article
Open access

Enhancing Testing at Meta with Rich-State Simulated Populations

Published: 31 May 2024

Abstract

This paper reports the results of deploying Rich-State Simulated Populations at Meta for both automated and manual testing. We use simulated users (also known as test users) to mimic user interactions and acquire state in much the same way that real user accounts do. For automated testing, we present empirical results from deployment on the Facebook, Messenger, and Instagram apps for the iOS and Android platforms. These apps consist of tens of millions of lines of code, communicate with hundreds of millions of lines of backend code, and are used by over 2 billion people every day. Our results reveal that rich state increases average code coverage by 38% and endpoint coverage by 61%. More importantly, it also yields an average increase of 115% in the faults found by automated testing. The rich-state test user populations are also deployed in a (continually evolving) Test Universe, a web-enabled simulation platform for privacy-safe manual testing, which has been used by over 21,000 Meta engineers since its deployment in November 2022.

Cited By

  • Automated End-to-End Dynamic Taint Analysis for WhatsApp. 2024. In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering, 21-26. DOI: 10.1145/3663529.3663824. Online publication date: 10 Jul 2024.

Published In

ICSE-SEIP '24: Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice
April 2024
480 pages
ISBN: 9798400705014
DOI: 10.1145/3639477
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. software testing
  2. cyber cyber digital twins
  3. simulation-based testing
  4. machine learning

Qualifiers

  • Research-article

Conference

ICSE-SEIP '24

Article Metrics

  • Downloads (last 12 months): 64
  • Downloads (last 6 weeks): 25
Reflects downloads up to 09 Sep 2024
