[go: up one dir, main page]

Academia.eduAcademia.edu

Infrastructure-Aware Functional Testing of MapReduce Programs

2016, 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW)

This paper is a post-print paper accepted in “International Conference on Future Internet of Things and Cloud (FiCloud), 2016“ The final version of this paper is available through IEEE Xplore in the next link: http://ieeexplore.ieee.org/document/7592719/ J. Morán, B. Rivas, C. De La Riva, J. Tuya, I. Caballero and M. Serrano, "Infrastructure-Aware Functional Testing of MapReduce Programs," 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), Vienna, 2016, pp. 171-176. doi: 10.1109/W-FiCloud.2016.45 IEEE copyright notice. © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works Infrastructure-Aware Functional Testing of MapReduce programs Bibiano Rivas Jesús Morán Department of Computing Institute of Technology and Information Systems University of Oviedo University of Castilla-La Gijón, Spain Mancha moranjesus@lsi.uniovi.es Ciudad Real, Spain Bibiano.Rivas@uclm.es Claudio de la Riva, Javier Tuya Ismael Caballero, Manuel Serrano Department of Computing University of Oviedo Gijón, Spain {claudio, tuya}@uniovi.es Abstract—Programs that process a large volume of data generally run in a distributed and parallel architecture, such as the programs implemented in the processing model MapReduce. In these programs, developers can abstract the infrastructure where the program will run and focus on the functional issues. However, the infrastructure configuration and its state cause different parallel executions of the program and some could derive in functional faults which are hard to reveal. In general, the infrastructure that executes the program is not considered during the testing, because the tests usually contain few input data and then the parallelization is not necessary. In this paper a testing technique is proposed to generate different infrastructure configurations for a given test input data, and then the program is executed in these configurations in order to reveal functional faults. This testing technique is automatized by using a test engine and applied in a case study. As a result, several infrastructure configurations are automatically generated and executed for a test case revealing a functional fault that is then fixed by the developer. Keywords— Software testing, MapReduce programs, Big Data Engineering, Hadoop I. INTRODUCTION The new trends in massive data processing have brought to light several technologies and processing models in the field called Big Data Engineering [1]. Among them, MapReduce [2] can be highlighted as it permits the analysis of large data based on the “divide and conquer” principle. These programs run two phases in a distributed infrastructure: the Mapper phase divides the problem into several subproblems, and then the Reducer phase solves each subproblem. Usually, MapReduce programs run on several computers with heterogeneous resources and features. This complex infrastructure is managed by a framework, such as Hadoop [3] which stands out due to its wide use in the industry [4]. From the developer point of view, a MapReduce program can be implemented only with Mapper and Reducer, without any consideration about the infrastructure. Then the framework that manages the infrastructure is also responsible to automatically deploy and run the program over several computers and lead the data processing between the input and output. Among others, the framework divides the input into Institute of Technology and Information Systems University of Castilla-La Mancha Ciudad Real, Spain {Ismael.Caballero, Manuel.Serrano}@uclm.es several subsets of data, then processes each one in parallel and re-runs some parts of the program if necessary. Despite the fact the program can be implemented abstracting the infrastructure, the developer needs to consider how the infrastructure configuration could affect the program functionality. A previous work [5] detects and classifies several faults that depend on how the infrastructure configuration affects the program execution and produces different output. These faults are often masked during the test execution because the tests usually run over an infrastructure configuration without considering the different situations that could occur in production, as for example different parallelism levels or the infrastructure failures [6]. On the other hand, if the tests are executed in an environment similar to the production, some faults may not be detected because it is common that the test inputs contain few data, which means that Hadoop does not parallelize the program execution. There are some tools to enable the simulation for some of these situations (for example computer and net failures) [7, 8, 9], but it is difficult to design, generate and execute the tests in a deterministic way because there are a lot of elements that need fine grained simulation, including the infrastructure and framework. The main contribution of this paper is a technique that can be used to generate automatically the different infrastructure configurations for a MapReduce application. The goal is to execute test cases with these configurations in order to reveal functional faults. Given a test input data, the configurations are obtained based on the different executions that can happen in production. Then each one of the configurations is executed in the test environment in order to detect functional faults of the program that may occur in production. The contributions of this work are: 1. A combinatorial technique to generate the different infrastructure configurations, taking into account characteristics related to the MapReduce processing and the test input data. 2. Automatic support by means of a test engine based on MRUnit [10] that allows the execution of the infrastructure configurations, together with the evaluation to detect failures. The rest of the paper is organized as follows. In Section II the principles of the MapReduce paradigm are introduced. The generation of the different configurations, the execution and the automatization of the tests are defined in Section III. In Section IV it is applied to a case study. In Section V the related work about software testing in MapReduce paradigm is presented. The paper ends with conclusions and future work in Section VI. II. MAPREDUCE PARADIGM The MapReduce program processes high quantities of data in a distributed infrastructure. The developer implements two functionalities: Mapper task that splits the problem into several subproblems and Reducer task that solves these subproblems. The final output is obtained from the deployment and the execution over a distributed infrastructure of several instances of Mapper and Reducer, also called tasks. The deployment and execution are automatically carried out by Hadoop or another framework. First, several Mapper tasks analyse in parallel a subset of input data and determine which subproblems these data need. When the execution of all Mappers are finished, several Reducers are also executed in parallel in order to solve the subproblems. Internally MapReduce handles <key, value> pairs, where the key is the subproblem identifier and the value contains the information to solve it. To illustrate MapReduce let us suppose a program that computes the average temperature per year from historical data about temperatures. This program solves one subproblem for each year, so the identifier or key is the year. The Mapper task receives a subset of temperature data and emits <year, temperature of this year> pairs. Then Hadoop aggregates all values per key. Therefore, the Reducer tasks receive subproblems like <year, [all temperatures of this year]>, that is all temperatures grouped per year. Finally, the Reducer calculates the average temperature. For example, in Fig. 1 an execution of the program considering the input is detailed: year 2000 with 3º, 2002 with 4º, 2000 with 1º, and 2001 with 5º. The first two inputs are analysed in one Mapper task and the remainder in another task. Then the temperatures are grouped per year and sent to the Reducer tasks. The first Reducer receives all the temperatures for the years 2000 and 2002, and the other task for the year 2001. Finally, each Reducer emits the average temperature of the analysed subproblems: 2º in the year 2000, 4º in 2002 and 5º in 2001. This program with the same input could be executed in another way by the framework, for example with three Mappers and three Reducers. Regardless of how the framework runs the program, it should generate the expected output. Additionally, to optimize the program, a Combiner functionality can be implemented. This task is run after the <2000, 3º> <2002, 4º> <2000, 1º> <2001, 5º> <2000, 3º> <2000, [3º, 1º]> <2002, 4º> <2002, [4º]> <2000, 2º> Mapper Task Reducer Task <2002, 4º> <2000, 1º> Mapper Task Reducer Task <2001, 5º> <2001, 5º> <2001, [5º]> Fig. 1. Program that calculates the average temperature per year Mapper and the goal is to remove the irrelevant <key, value> pairs to solve the subproblem. In MapReduce there are also other implementations such as for example Partitioner that decides for each <key, value> pair which Reducer analyses it, Sort that sorts the <key, value> pairs, and Group that aggregates the values of each key before the Reducer. The wrong implementation of these functionalities could cause a failure in one of the different ways in which Hadoop can run the program. These faults are difficult to detect during testing because the test cases usually contain few input data. In this way it is not necessary to split the inputs and therefore the execution is over one Mapper, one Combiner and one Reducer [2]. III. GENERATION AND EXECUTION OF TESTS The generation of the infrastructure configurations for the tests are defined in Section A, and a framework to execute the tests in Section B. A. Generation of the test scenarios To illustrate how the infrastructure configuration affects the program output, suppose that the example of Section II is extended with a Combiner in order to decrease the data and improve the performance. The Combiner receives several temperatures and then they are replaced by their average in the Combiner output. In this case, the program does not admit a Combiner because all the temperatures are needed to obtain the total average temperature. The error of adding the Combiner in order to optimize the program injects a functional fault in the program. Fig. 2 represents three possible executions of this program that could occur in production considering the different infrastructure configurations and the same input (year 1999 with temperatures 4º, 2º and 3º). The first configuration consists of one Mapper, one Combiner and one Reducer that produces the expected output. The second configuration also generates the expected output executing one Mapper that processes the temperatures 4º and 2º, another Mapper for 3º, two Combiner, and finally one Reducer. The third configuration also executes two Mapper, two Combiner and one Reducer, but produces an unexpected output because the first Mapper processes 4º and the second Mapper the temperatures 2º and 3º. Then one of the Combiner tasks calculates the average of 4º, and the other Combiner of 2º and 3º. The Reducer receives the previous averages (4º and <1999, 4º> <1999, 2º> <1999, 3º> <1999, 4º> <1999, 2º> <1999, 3º> <1999, 4º> <1999, 2º> <1999, 3º> Same input Mappper <1999, [4º, 2º, 3º]> <1999, [3º]> Combiner Reducer <1999, 3º> Mappper <1999, [4º, 2º]> Combiner Mappper Mappper Mappper Combiner <1999, [3º]> <1999, [4º]> Combiner <1999, [3º, 3º]> Reducer <1999, 3º> <1999, [4º, 2.5º]> Reducer <1999, 3.25º> Combiner <1999, [2º,3º]> Different scenario Different output Fig. 2. Different infrastructure configurations for a program that calculates the average temperature per year with Combiner task Automatic Test execution Test Input data Expected output Are equals? No (9) (optional) case (10) (8) Yes (1) Ideal Run scenario output (2) Ideal scenario (3) No (6) (7) Are Are all equals? Yes scenarios Yes tested? No (4) Output (5) Generation of new scenario Run scenario Fig. 3. a) General famework of test execution 2.5º), and calculates the total average in the year. This configuration produces 3.25º as output instead of the 3º of the expected output. The program has a functional fault only detected in the third configuration. The failure is produced whenever this infrastructure configuration is executed, regardless of the computer failures, slow net or others. This fault is difficult to reveal because the test case needs to be executed in the infrastructure configuration that detect it, and in a completely controlled way. Given a test input data, the goal is to generate the different infrastructure configurations, also called in this context scenarios. For this purpose, the technique proposed considers how the MapReduce program can execute these input data in production. First, the program runs the Mappers, then over their outputs the Combiners and finally the Reducers. The execution can be carried out over a different number of computers and therefore the Mapper-Combiner-Reducer can analyse a different subset of data in each execution. In order to generate each one of the scenarios, a combinatorial technique [11] is proposed to combine the values of the different parameters that can modify the execution of the MapReduce program. In this work the following parameters are considered based on previous work [5] that classifies different types of faults of the MapReduce applications:  Mapper parameters: (1) Number of Mapper tasks, (2) Inputs processed per each Mapper, and (3) Data processing order of the inputs, that is, which data are processed before other data in the Mapper and which data are processed after.  Combiner parameters for each Mapper output: (1) Number of Combiner tasks, and (2) Inputs processed per each Combiner.  Reducer parameters: (1) Number of Reducer tasks, and (2) Inputs processed per each Reducer. The different scenarios are obtained through the combination of all values that can take the above parameters and applying the constraints imposed by the sequential execution of MapReduce. The constraints considered in this paper are the following: Input: Test case with: input data expected output (optional) Output: scenario that reveals a fault (0) /* Generation of scenarios (section A)*/ (1) Scenarios ← Generate scenarios from input data (2) /* Execution of scenarios */ (3) ideal scenario output ← Execution of ideal scenario (4) ∀ scenario ∈ Scenarios: (5) scenario output ← Execution of scenario (6) IF scenario output <> ideal scenario output: (7) RETURN scenario with fault (8) IF ideal scenario output <> expected output: (9) RETURN ideal scenario (10) ELSE: (11) RETURN Zero faults detected b) Algorithm for test generation and execution of test scenarios 1. The values/combinations of the Mapper parameters depend on the input data because it is not possible more tasks than data. For example, if there are three data items in the input, the maximum number of Mappers is three. 2. The values/combinations of the Combiner parameters depend on the output of the Mapper tasks. 3. The values/combinations of the Reducer parameters depend on the output of the Mapper-Combiner tasks and another functionality executed by Hadoop before Reducer tasks. This other functionality is called Shuffle and for each <key, value> pair determines the Reducer task that requires these data, then sorts all the data and aggregates by key. To illustrate how the parameters are combined and how the constraints are applied, suppose the program of Fig. 2. The input of this program contains three data items, and these data constrain the values that the Mapper parameters can take because the maximum number of Mapper tasks is three (one Mapper per each <key, value> pair). The first scenario is generated with one Mapper, one Combiner and one Reducer. For the second scenario the parameter “Number of Mapper tasks” is modified to 2, where the first Mapper analyses two <key, value> pairs, and the second processes one pair. The third scenario maintains the parameter “Number of Mapper tasks” at 2, but modifies the parameter “Inputs processed per each Mapper”, so the first Mapper analyses one <key, value> pair and the other Mapper processes two pairs. The scenarios are generated by the modification of the values in the parameters in this way and considering the constraints. B. Execution of the test scenarios The previous section proposes a technique to generate scenarios that represent different infrastructure configurations according to the characteristics of the MapReduce processing. Fig. 3 describes a framework to execute systematically the tests with the scenarios generated by the technique of the previous section. The framework takes as input a test case that contains the input data and optionally the expected output. The test input data can be obtained with a generic testing technique or one Given a test case, the scenarios are generated according to the previous section, then they are iteratively executed and evaluated following the pseudocode of Fig. 3. For example, Fig. 2 contains the generation and execution of a program that calculates the average temperature per year in three scenarios considering the same test input: year 1999 with temperatures 4º, 2º and 3º. The first execution is the ideal scenario with one Mapper, one Combiner and one Reducer, that produces 3º as output. Then the second scenario is executed and also produces 3º. Finally, a third scenario is executed and produces 3.25º as output, this temperature is not equivalent to the 3º of the ideal scenario output. Consequently, a functional fault is revealed without any knowledge of the expected output of the test case. This approach is automatized by means of a test engine based on MRUnit library [10]. This library is used to execute each scenario. In MRUnit the test cases are executed in the ideal scenario, but this library is extended to generate other scenarios and enable parallelism supporting the execution of several Mapper, Combiner and Reducer tasks. CASE STUDY In order to evaluate the proposed approach, we use as case study the MapReduce program described in I8K|DQ-BigData framework [13]. This program measures the quality of the data exchanged between organizations according to part 140 of the ISO/TS 8000 [14]. The program receives (1) the data exchanged in a row-column fashion, together with (2) a set of mandatory columns that should contain data and (3) a percentage threshold that divides the data quality of each row in two parts: the first part is maximum if all mandatory columns contain data and zero otherwise, and the second part of the data quality is calculated as the percentage of the nonmandatory columns that contain data. The output of the TABLE I. TEST CASE OF THE I8K|DQ-BIGDATA PROGRAM Input Data quality threshold: 50% Name: Alice 50% City: (no data) 75% (average) Name: Bob Row 2 100% City: Vienna The procedure described in Section III is applied on the previous program using the previous test case as input. As a result, a fault is detected and reported to the developer. This failure occurs when the rows are processed in different Mappers and only the first Mapper receives the information related to the mandatory columns and the data quality threshold, because Hadoop splits the input data into several subsets. Without this information, the Mapper cannot calculate the data quality and does not emit any output. The bottom of Fig. 4 represents the scenario that produces the failure. There are two Mappers that process different rows. The first Mapper receives the data quality threshold (value of 50%), the mandatory column (“Name”) and the two columns of row 1 with only data in one column, so the Mapper emits 50% as data quality of row 1. The second Mapper processes only row 2, but no other information about the mandatory columns or data quality threshold, so this Mapper cannot emit any output. Then the Reducer receives only the data quality of row 1 and emits an incorrect output of the average data quality. This fault is difficult to detect because it implies the parallel and controlled execution of the program. Moreover, this fault is not revealed by the execution of the test case in the following environments: (a) Hadoop cluster in production with 4 computers, Hadoop in local mode (simple version of Hadoop with one computer), and (c) MRUnit unit testing library. These environments do not detect the fault because they only execute one scenario that masks the fault. Normally these environments run the program in the ideal scenario that is formed by one Mapper, one Combiner and one Reducer, and then the fault is masked due to a lack of parallelism. The test engine proposed in this paper executes the test case in the different scenarios that can occur in production with large data and infrastructure failures. In contrast with the other Threshold: 50% Mandatory: Name Row 1 Row 2 50% Row 1: 50% 100% Mapper Reducer Row 2: 100% Avg: 75% Threshold: 50% Mandatory: Name Row 1 Row 2 50% Row 1: 50% Mapper Reducer Avg: 50% Excepted output Mandatory columns: “Name” Row 1 Over the previous program, a test case is obtained using a specific MapReduce testing technique based on data flow [5]. The test input data and the expected output of the test case contain two rows represented in Table I. Row 1 contains two columns (Name and City), and only one column has data, so the data quality is 50%. Row 2 contains data in all columns, so the data quality is 100%. The total quality is 75%, which is the average of both rows. One scenario of the test engine IV. program is the data quality of each row, and the average of all rows. MRUnit or real environment specifically designed for MapReduce, such as MRFlow [12]. Then, the ideal scenario is generated (1) and executed (2, 3). This is the scenario formed by one Mapper, one Combiner and one Reducer which is the usual configuration executed in testing. Next, new scenarios are iteratively generated (4) and executed (5) through the technique of the previous section. The output of each scenario is checked against the output of the ideal scenario (6), revealing a fault if the outputs are not equivalent (7). Finally, if the test case contains the expected output, the output of ideal scenario is also checked against the expected output (8), detecting a fault when both are not equivalent (9, 10). Same input Mapper No output due the lack of threshold and mandatory columns Different scenario Fig. 4. Execution of the test case in different scenarios Different ouput environments, the test engine proposed does not need the expected output to detect faults. For example, in this case study the fault is revealed automatically because the outputs of the different scenarios are not equivalent to each other. The execution of some scenarios obtains an average quality of 75%, whereas the execution of other scenarios obtains 50%. These outputs are not equivalent, and the test engine detects automatically a fault despite the unknown expected output. After the detection and report of the fault during the test phase, the developer fixed the program and then the test case passed. V. RELATED WORK Despite the testing challenges of the Big Data applications [15, 16] and the progresses in the testing techniques [17], little effort is focused on testing the MapReduce programs [18], one of the principal paradigms of Big Data [19]. A study of Kavulya et al. [20] analyses several MapReduce programs and 3% of them do not finish, while another study by Ren et al. [21] places the number between 1.38% and 33.11%. Many of the works about testing of the MapReduce programs focus on performance and to a lesser degree functionality. A testing approach for Big Data is proposed by Gudipati et al. [22] specifying several processes, one of which is about MapReduce validation. In this process Camargo et al. [23] and Morán et al. [5] identify and classify several functional faults. Some of these faults are specific of the MapReduce paradigm and they are not easy to detect because they depend on the program execution over the infrastructure. One common type of fault is produced when the data should reach the Reducer in a specific order, but the parallel execution causes these data to arrive disordered. This fault was analysed by Csallner et al. [24] and Chen et al. [25] using some testing techniques based on symbolic execution and model checking. In contrast to the previous works, the approach of this paper is not focused on the detection of only one type of fault, it can also detect other MapReduce specific faults. To do this, the test input data is executed over different infrastructure configurations that could lead to failures. Several research lines suggest injecting infrastructure failures [26, 27] during the testing, and several tools support their injection [7, 8, 9]. For example, the work by Marynowski et al. [28] allows the creation of test cases specifying which computers fail and when. One possible problem is that some specific MapReduce faults could not be detected by infrastructure failures, but require full control of Hadoop and the infrastructure. In this paper, the different ways in which Hadoop could run the program are automatically generated from the functional point of view, regardless of the infrastructure failures and Hadoop optimizations. Furthermore, there are other approaches oriented to obtain the test input data of MapReduce programs, such as [12] that employs data flow testing and [29] based on a bacteriological algorithm. In this paper, given a test input data, several configurations are generated and then executed in order to reveal functional faults. The test input data could be obtained with the previous testing techniques. The functional tests can be executed directly in the production cluster or in one computer with Hadoop. Herriot [30] can be used to execute the tests in a cluster while providing access to their components supporting, among others, the injection of faults. Another option is to simulate a cluster in memory with the MiniClusters libraries [31]. In the unit testing, JUnit [32] could be used together with mock tools, or directly by MRUnit library [10] adapted to the MapReduce paradigm. These test engines only execute one infrastructure configuration and usually without parallelization. In this paper a test engine is implemented by an MRUnit extension that automatically generates and executes the different infrastructure configurations that could occur in production. VI. CONCLUSIONS A testing technique for the MapReduce programs is introduced and automatized in this paper as a test engine that reproduces the different infrastructure configurations for a given test case. Automatically and without an expected output, the test engine can detect functional faults specific to the MapReduce paradigm that are in general difficult to detect in the test/production environments. This approach is applied in a real program using a test case with few data. As a result, a functional fault is revealed allowing the developer to fix the program. In order to improve the generation of the infrastructure configurations, as part of the future we plan to extend the technique to select efficiently the configurations that are more likely to detect faults. The current approach is off-line because the tests are not carried out when the program is in production. As future work we plan to extend the approach to on-line testing, in order to monitor the functionality with the real data when the program is executed in production and detect the faults automatically. ACKNOWLEDGMENTS This work was supported in part by project TIN201346928-C3-1-R, funded by the Spanish Ministry of Science and Technology, and GRUPIN14-007, funded by the Principality of Asturias (Spain) and ERDF funds and Vice President for Research and Science Policy with BIN1637 INITIATION SCHOLARSHIP. REFERENCES [1] [2] [3] [4] [5] [6] ISO/IEC JTC 1 – Big Data, preliminary report 2014, ISO/IEC Std., 2015. J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” in Proc. of the OSDI - Symp. on Operating Systems Design and Implementation. USENIX, 2004, pp. 137–149. “Apache hadoop: open-source software for reliable, scalable, distributed computing,” https://hadoop.apache.org, accessed: 2016-0416. “Institutions that are using apache hadoop for educational or production uses,” http://wiki.apache.org/hadoop/PoweredBy, accessed: 2016-0416. J. Morán, C. de la Riva, and J. Tuya, “MRTree: Functional Testing Based on MapReduce’s Execution Behaviour,” in Future Internet of Things and Cloud (FiCloud), 2014 International Conference on, 2014, pp. 379–384. K. V. Vishwanath and N. Nagappan, “Characterizing cloud computing hardware reliability,” in Proceedings of the 1st ACM symposium on Cloud computing. ACM, 2010, pp. 193–204. [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] “Anarchyape: Fault injection tool for hadoop cluster from yahoo anarchyape,” https://github.com/david78k/anarchyape, accessed: 201604-16. “Chaos monkey,” https://github.com/Netflix/SimianArmy/wiki/ChaosMonkey, accessed: 2016-04-16. “Hadoop injection framework,” https://hadoop.apache.org, accessed: 2016-04-16. “Apache mrunit: Java library that helps developers unit test apache hadoop map reduce jobs,” http://mrunit.apache.org, accessed: 2016-0416. M. Grindal, J. Offutt, and S. F. Andler, “Combination testing strategies: a survey,” Software Testing, Verification and Reliability, vol. 15, no. 3, pp. 167–199, 2005. J. Morán, C. de la Riva, and J. Tuya, “Testing Data Transformations in MapReduce Programs,” in Proceedings of the 6th International Workshop on Automating Test Case Design, Selection and Evaluation, ser. A-TEST 2015. New York, NY, USA: ACM, 2015, pp. 20–25. B. Rivas, J. Merino, M. Serrano, I. Caballero, and M. Piattini, “I8k| dqbigdata: I8k architecture extension for data quality in big data,” in Advances in Conceptual Modeling. Springer, 2015, pp. 164–172. ISO/TS 8000-140, Data quality - Part 140: Master data: Exchange of characteristic data: Completeness, ISO/TS Std., 2009. S. Nachiyappan and S. Justus, “Getting ready for bigdata testing: A practitioner’s perception,” in Computing, Communications and Networking Technologies (ICCCNT), 2013 Fourth International Conference on. IEEE, 2013, pp. 1–5. A. Mittal, “Trustworthiness of big data,” International Journal of Computer Applications, vol. 80, no. 9, 2013. A. Bertolino, “Software testing research: Achievements, challenges, dreams,” in Future of Software Engineering, 2007. FOSE ’07, 2007, pp. 85–103. L. C. Camargo and S. R. Vergilio, “Mapreduce program testing: a systematic mapping study,” in Chilean Computer Science Society (SCCC), 32nd International Conference of the Computation, 2013. M. Sharma, N. Hasteer, A. Tuli, and A. Bansal, “Investigating the inclinations of research and practices in hadoop: A systematic review,” confluence The Next Generation Information Technology Summit (Confluence), 2014 5th International Conference -. S. Kavulya, J. Tan, R. Gandhi, and P. Narasimhan, “An analysis of traces from a production mapreduce cluster,” in Cluster, Cloud and [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] Grid Computing (CCGrid), 2010 10th IEEE/ACM International Conference on. IEEE, 2010, pp. 94–103. K. Ren, Y. Kwon, M. Balazinska, and B. Howe, “Hadoop’s adolescence: an analysis of hadoop usage in scientific workloads,” Proceedings of the VLDB Endowment, vol. 6, no. 10, pp. 853–864, 2013. M. Gudipati, S. Rao, N. D. Mohan, and N. K. Gajja, “Big data: Testing approach to overcome quality challenges,” Big Data: Challenges and Opportunities, pp. 65–72, 2013. L. C. Camargo and S. R. Vergilio, “Cassicação de defeitos para programas mapreduce: resultados de um estudo empírico,” in SAST 7th Brazilian Workshop on Systematic and Automated Software Testing, 2013. C. Csallner, L. Fegaras, and C. Li, “New ideas track: testing mapreduce-style programs,” in Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering. ACM, 2011, pp. 504–507. Y.-F. Chen, C.-D. Hong, N. Sinha, and B.-Y. Wang, “Commutativity of reducers,” in Tools and Algorithms for the Construction and Analysis of Systems. Springer, 2015, pp. 131–146. F. Faghri, S. Bazarbayev, M. Overholt, R. Farivar, R. H. Campbell, and W. H. Sanders, “Failure scenario as a service (fsaas) for hadoop clusters,” in Proceedings of the Workshop on Secure and Dependable Middleware for Cloud Monitoring and Management. ACM, 2012, p. 5. P. Joshi, H. S. Gunawi, and K. Sen, “Prefail: A programmable tool for multiple-failure injection,” in ACM SIGPLAN Notices, vol. 46, no. 10. ACM, 2011, pp. 171–188. J. E. Marynowski, A. O. Santin, and A. R. Pimentel, “Method for testing the fault tolerance of mapreduce frameworks,” Computer Networks, vol. 86, pp. 1–13, 2015. A. J. Mattos, “Test data generation for testing mapreduce systems,” in Master’s degree dissertation, 2011. “Herriot: Large-scale automated test framework,” https://wiki.apache.org/hadoop/HowToUseSystemTestFramework, accessed: 2016-04-16. “Minicluster: Apache hadoop cluster in memory for testing,” https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CLIMiniCluster.html, accessed: 2016-04-16. “Junit: a simple framework to write repeatable tests,” http://junit.org/, accessed: 2016-04-16.