Enhancing Testing at Meta with Rich-State Simulated Populations

Authors: Nadia Alshahwan, Kinga Bojarczuk, Andrea Ciancone, Natalija Gucevska, Michal Krolikowski, Simon Schellaert, Kate Ustiuzhanina, Will Lewis

ICSE-SEIP '24: Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice

Pages 1 - 12

https://doi.org/10.1145/3639477.3639729

Published: 31 May 2024

Abstract

This paper reports the results of deploying Rich-State Simulated Populations at Meta for both automated and manual testing. We use simulated users (also known as test users) to mimic user interactions and acquire state in much the same way that real user accounts acquire state. For automated testing, we present empirical results from deployment on the Facebook, Messenger, and Instagram apps for the iOS and Android platforms. These apps consist of tens of millions of lines of code, communicate with hundreds of millions of lines of backend code, and are used by over 2 billion people every day. Our results reveal that rich state increases average code coverage by 38% and endpoint coverage by 61%. More importantly, it also yields an average increase of 115% in the faults found by automated testing. The rich-state test user populations are also deployed in a (continually evolving) Test Universe, a web-enabled simulation platform for privacy-safe manual testing, which has been used by over 21,000 Meta engineers since its deployment in November 2022.


Published In

ICSE-SEIP '24: Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice

April 2024

480 pages

ISBN: 9798400705014

DOI: 10.1145/3639477

Co-chairs: Ana Paiva, Rui Abreu, Maurício Aniche (Delft University of Technology, Netherlands), Nachiappan Nagappan (Meta, USA)

Program Co-chairs: Abhik Roychoudhury, Margaret Storey

Copyright © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICSE-SEIP '24

Sponsor:

SIGSOFT

ICSE-SEIP '24: 46th International Conference on Software Engineering: Software Engineering in Practice

April 14 - 20, 2024

Lisbon, Portugal
