THE EFFECT OF BASE RATE SENSITIZATION ON END-USER QUERY PERFORMANCE MODERATED BY CONSCIENTIOUSNESS

A. Faye Borthick
Teaching and Learning with Technology Center
Georgia State University
Atlanta, GA USA
borthick@gsu.edu

Paul L. Bowen
School of Business
University of Queensland
Brisbane Australia
p.bowen@business.uq.edu.au

David A. Robb
School of Business
University of Queensland
Queensland, Australia
a.robb@business.uq.edu.au

Abstract

End users with extensive experience with an organization’s data can often detect query errors when query results do not correspond to their ex ante expectations. Many end users, however, such as newly hired business analysts, compose queries on unfamiliar data. Their lack of familiarity means that they may be less able to evaluate the reasonableness of their query results. Although additional query experience will eventually give them the familiarity with the data that they need, in the interim, they may not recognize incorrect results from flawed queries. This paper develops and tests base rate sensitization as a means of enabling end users to improve their query performance. Contrary to the hypotheses, sensitizing end users to base rates, as a means of improving their assessments of the likely correctness of their query results, was not associated with significantly fewer query errors on a consistent basis. In a post hoc analysis, participant conscientiousness was found to moderate query performance. Participants with high conscientiousness who were sensitized to base rates made fewer query errors than those not sensitized. In contrast, base rate-sensitized participants with low conscientiousness made more errors than those not sensitized. In this interaction, high conscientiousness participants were able to take advantage of base rate information while low conscientiousness participants appeared to be hindered by base rate sensitization.
Keywords: Conscientiousness; end-user querying; query errors; base rate information

2003 - Twenty-Fourth International Conference on Information Systems, p. 774
Borthick et al./Effect of Base Rate Sensitization on End-User Query Performance

Introduction

After years of receiving and analyzing reports about their organizations, experienced managers and business analysts can classify query results as either reasonable or suspect. When confronted with suspect results, they can investigate whether business conditions have changed or whether the query that retrieved the data contained errors. From a Bayesian perspective, experienced managers and analysts have developed internal base rate knowledge[1] that they use to evaluate the reasonableness of query results (Biros et al. 2002; Klein et al. 1997; Roy and Lerch 1996). They can perform these evaluations before relying on the query results in their own work or making the results available to others in their organizations. Managers and business analysts who are not familiar with organizational data may not have developed the base rate knowledge necessary to evaluate their query results effectively (Ballou and Tayi 1999).

Organizations are increasingly relying on databases to support operations and provide timely information for decision making. Few organizations have enough experienced managers and business analysts, however, to satisfy the demand for query-proficient staff (Leonard-Barton 1995). Newly hired managers and business analysts are often not sufficiently adept at classifying their query results as reasonable or suspect, a weakness exacerbated by end users’ tendency to be overconfident in the correctness of their queries (Borthick et al. 2001). Time and experience are prerequisites for evaluating query results correctly. Expediting this learning process for new managers and business analysts has the potential to reduce the probability of decision errors resulting from reliance on incorrect query results.
This research investigates whether sensitizing new end users to using base rate information improves their query accuracy and their ability to align their confidence with the accuracy of their query results.

Base Rate Sensitization and Query Performance

Signal Detection

According to signal detection theory, an individual’s ability to discern between signal and noise is a function of the relationship between the response criterion and the discriminability of the distributions (Roy and Lerch 1996). Discriminability decreases when the distributions overlap and the response criterion occurs in the overlap, increasing the difficulty of distinguishing between signal and noise. Figure 1 illustrates partially overlapped data distributions without and with errors.

Figure 1. Overlapped Data Distributions Without and with Errors (Used with permission, B. D. Klein, D. L. Goodhue, and G. B. Davis, “Can Humans Detect Errors in Data? Impact of Base Rates, Incentives and Goals,” MIS Quarterly (21:2), 1997, pp. 169-194. Copyright, Regents of the University of Minnesota, MIS Quarterly, 321 19th Avenue South, Minneapolis, MN 55455)

New managers and business analysts, who are likely to be unfamiliar with organizational data, confront a signal versus noise dilemma when evaluating query results. Distinguishing between signal and noise is especially difficult when the observed value occurs in the intersection of the two distributions (Figure 1). Distinguishing between signal and noise is further complicated when response criterion guidance is absent. Incorrectly interpreting a signal as noise is likely to result in decisions based on incorrect information. Incorrectly interpreting noise as a signal may result in additional time for reformulating queries and may lead to incorrect queries, giving erroneous results. Individuals acting on instructions to heed the base rate frequency of signals have been found to improve their signal detection performance (Davies and Parasuraman 1981).
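The role of the base rate in placing the response criterion can be illustrated with a minimal sketch. It assumes the textbook equal-variance Gaussian model (noise distributed N(0, 1), signal distributed N(d', 1)) and equal costs for misses and false alarms; none of these assumptions, nor the function names or parameter values, come from the study itself. Here "signal" stands for a result produced by a flawed query and "noise" for ordinary variation in correct results.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def optimal_criterion(d_prime: float, signal_base_rate: float) -> float:
    """Criterion on the decision axis that minimizes the overall error rate.

    Assumes noise ~ N(0, 1), signal ~ N(d_prime, 1), d_prime > 0,
    and equal costs for misses and false alarms (illustrative assumptions).
    """
    beta = (1.0 - signal_base_rate) / signal_base_rate  # prior odds of noise
    return d_prime / 2.0 + math.log(beta) / d_prime


def error_rate(criterion: float, d_prime: float, signal_base_rate: float) -> float:
    """Probability of a wrong call when 'signal' is reported above the criterion."""
    miss = normal_cdf(criterion - d_prime)      # signal falls below the criterion
    false_alarm = 1.0 - normal_cdf(criterion)   # noise falls above the criterion
    return signal_base_rate * miss + (1.0 - signal_base_rate) * false_alarm


# Hypothetical values: heavily overlapped distributions, flawed results are rare.
D_PRIME = 1.0
BASE_RATE = 0.1

neutral = D_PRIME / 2.0                           # criterion that ignores the base rate
informed = optimal_criterion(D_PRIME, BASE_RATE)  # criterion that heeds the base rate

print(f"neutral criterion:  {neutral:.3f}, error rate {error_rate(neutral, D_PRIME, BASE_RATE):.3f}")
print(f"informed criterion: {informed:.3f}, error rate {error_rate(informed, D_PRIME, BASE_RATE):.3f}")
```

Under these assumptions, a rarer signal moves the criterion toward the signal distribution, and the base rate-informed criterion yields a lower overall error rate than the neutral one.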
Heeding base rate information has improved placement of the response criterion, thus providing the means for end users to discriminate between signal and noise (Roy and Lerch 1996). In the context of query formulation, providing base rate information allows end users to establish an anchor point against which to compare their query results. Significant divergence from the expected results should engender sufficient dissonance within the end user to signal a potential error in a query.

[1] A base rate refers to the relative frequency with which an event occurs or an attribute appears in a population (Ginossar and Trope 1987, Hinsz et al. 1988, Lanning 1987).

Attention to Base Rates

Kahneman and Tversky (1973) observe that decision makers appear to place insufficient weight on base rate information, to the detriment of decision quality. Research across several fields[2] implies that decision makers should use base rate information to a greater extent. Base rates need not be highly reliable nor extreme to have diagnostic value (Koehler 1996). In the absence of cues to the contrary (e.g., syntax error messages), inexperienced end users are apt to think they have prepared correct queries, leading to overconfidence in the quality of their decisions (Borthick et al. 2001). Being aware of base rate information, however, may create a standard of reference that can help them evaluate whether the information they extracted is reasonable (Figure 2). End users that are more sensitive to base rates, through experience or instruction, should be more likely to compare their results with historical base rates. In analyses of management reports, one of the primary considerations is how current results compare with results of prior periods adjusted for seasonal differences.
In the absence of major changes in the organization and its operating environment, analysts tend to expect current results to be similar to those of prior periods. If the results of two consecutive periods differ substantially, experienced analysts would search for explanations such as unintentional errors, timing problems, or deliberate misstatements. Accordingly, providing end users with historical base rate information in the form of management reports may prompt end users to detect errors in their queries.

Sensitization to Base Rates

Constructing a contingency table and being sensitized to base rates through a graphical representation of the setting have been associated with making improved probability assessments (Roy and Lerch 1996). While the graphical representation had a greater positive effect, both sensitized groups answered more questions correctly than did control groups. Similarly, a combination of training, error incidence identification, and warnings about compromised data was more effective than traditional training for helping inexperienced end users detect compromised data (Biros et al. 2002).

Left to their own devices, inexperienced end users tend to believe they have formulated queries correctly once they pass syntax checks (Borthick et al. 2001). Experienced end users, however, realize that queries passing syntax checks may still be incorrect. Because of their familiarity with organizational data embodied in base rates, they have the capability of performing reasonableness checks on the query results. If base rates were available to them and they learned to use them, inexperienced end users might be able to perform more like experienced end users in recognizing potential discrepancies between their query results and the base rate information. Once they recognize potential discrepancies, inexperienced end users might be able to reconsider and refine their query formulations.
This reasoning yields the following hypothesis:

H1: Increasing base rate sensitization will be associated with decreasing query errors.

Confidence in Query Correctness

In the Brunswik Lens Model of human judgment (Brunswik 1956), individuals make judgments or predictions based on cues they sense from the environment that might be related to the criterion value they are attempting to discern or predict. In a database query setting, the criterion value corresponds to results from correctly formulated queries. Individuals’ judgments correspond to their queries, and the base rate information represents the cue set.

[2] In information systems, Biros et al. (2002), Klein et al. (1997), and Roy and Lerch (1996); in accounting, Johnson et al. (2001) and Koonce (1993); in forensic science, Rogers (2000); and in law, Cunningham and Reidy (1998).

Figure 2. End-User Assessment of Query Reasonableness
[Flowchart with two paths. Inexperienced end users (not skilled at applying base rates): Information request, then Formulate and run query, then Query completed. Experienced end users (skilled at applying base rates): Information request, then Formulate and run query, then the decision "Reasonable query results?" with base rates as cues; No leads to revising the query, Yes leads to Query completed.]

If participants interpret base rate information (i.e., the cue set) correctly, the comparison is likely to prompt thoughtful end users to reflect on the correctness of their queries.[3] Reflection by end users with higher levels of sensitization to base rate information on query results that do not match the base rate information is likely to induce greater dissonance. Even if end users make some adjustment to allow for a result that does not match the base rate information exactly, their adjustments are usually insufficient.
These insufficient adjustments result in probability distributions that are too tight, i.e., they underestimate the range of actual occurrences (Tversky and Kahneman 1974). Accordingly, even for correct queries, this dissonance is likely to reduce the confidence that end users would otherwise have in the correctness of their queries. Furthermore, when query results differ substantially from the base rate information due to incorrect queries, this dissonance leads to end users being even less confident in the correctness of their queries. As a hypothesis, this prediction about the relationship between awareness and use of base rate information and end-user confidence is:

H2: Greater sensitization to base rate information will be associated with decreasing confidence in query correctness as the distance between the query result and base rate increases.

The Experiment

A laboratory experiment was conducted to test the hypotheses using a one-factor between-groups design (presence or absence of sensitization to base rate information) with two covariates (complexity of the information request and grade point average (GPA)).

[3] Reflecting on the results of syntactically correct queries and making comparisons with base rate information takes time. End users that believe their current results do not satisfy the information requests will have to submit revised queries and evaluate their results. Hence, we expect that base rate sensitization will be associated with increasing time to complete queries. Time to complete queries was logged to permit analysis of the potential time effect.

Participants

Participants, 78 advanced undergraduate and postgraduate IS students, had previously learned to use the SQL query language and had practiced using SQL to develop queries.
Participants were stratified according to their prior information systems and query experience and GPA and assigned to two groups, the top participant to Group A, the next participant to Group B, etc. The resulting equivalent groups were then randomly assigned to the control or treatment conditions.

Table 1. Samples of Management Reports

Management Report: Merchandise Line Mark Up Percentages, July-August 2002

Merchandise Line | Inventory Value at 1 October 2002 | Average Mark Up (2002) | Std Dev (2002)
Chemicals | 61355.20 | 88 | 0.06
Electrical | 98888.98 | 90 | 0.01
. | . | . | .
Audited Total Inventory Value at 1 Oct. Stock take | 1044463.59

Management Report: Sales and Deliveries, July-August 2002

Columns: Employee | Gross Sales to 1 Oct 2002 | Percentage of gross sales returns by Employee (2002): Average Returns, Std Dev | Percent of late deliveries by Employee (2002): Average Late Del, Std Dev | Late deliveries, average number of days by Employee (2002): Average Late Del, Std Dev

Akkerman | 178451.20 | 0.18, 0.00 | 14.21, 2.12 | 1.70, 0.55
Gillespie | 77615.00 | 0.00, 0.00 | 23.83, 2.78 | 0.95, 0.10
. | . | . | . | .
Total Gross Sales | 858782.90 | 0.088, 0.003 | 17.30, 2.67 | 1.53, 0.36
Less sales awaiting credit clearance | 29735.00

Management Report: Supplier Deliveries, July-August 2002

Columns: Supplier | Gross Purchases since 1 July 2002 | Percentage of gross purchase returns by Supplier (2002): Average Returns, Std Dev | Late Deliveries by Supplier (%): Avg Late Deliveries, Std Dev | Late deliveries, average number of days by Supplier: Avg Late Deliveries, Std Dev

AEG | 31854.90 | 0.00, 0.00 | 100.00, 25.50 | 1.50, 2.00
Apex | 8983.80 | 0.01, 0.01 | 0.00, 0.00 | 0.00, 0.00
. | . | . | . | .
Total gross purchases | 416822.10 | 8.60, 2.00 | 17.60, 7.75 | 0.25, 0.30
Percentage of Purchases delivered after want date since 1 July 2002 | 23.57
Percentage of total purchase rejects by supplier | 7.93

Procedure

During the first hour (part 1) of the experiment, participants satisfied a sequence of information requests by preparing and executing SQL queries. A UNIX script captured each query attempt in text files, including start and end times for each request. All participants (treatment and control groups) received base rate information in the form of management reports. These reports, samples of which appear in Table 1, were based on data in the database that participants queried. The treatment group was specifically directed to evaluate the correctness of query results based on related information for a prior period in the management reports. The control group received the same information requests and management reports but without the directions to heed the base rate information.

During the second hour (part 2) of the experiment, participants prepared queries for similar information requests. Both groups received base rate information in the form of management reports. Neither the treatment nor the control group, however, received explicit directions to heed the base rate information. Table 2 shows a sample information request and its base rate sensitization for the treatment group.

After each query attempt was executed, the interface displayed the SQL result, i.e., either the syntax errors or the records produced by the query. Participants could revise their queries as many times as they wished until they were satisfied with their results. After responding to a prompt for their confidence levels in the correctness of their queries, participants began work on the next information request.
Participants were not allowed to return to previously completed queries. After eliminating incomplete responses, two examiners independently corrected participants’ responses. Each discrete alteration (addition or deletion of a query component) counted as one error. The examiners compared their independent assessments to ensure that all errors had been found and that the revised query formulations produced the correct results.

Table 2. Example Information Request

Information request: For each employee, list their staff number, name, and the total retail dollar value of sales orders placed after 1 September 2002.

Base rate sensitization for treatment group: To maximize the likelihood of appropriately evaluating the correctness of your query output, establish a base rate by examining employee sales for the past quarter. The management report contains values for the last quarter (3 months). Your required dollar value is for one month (1 September to 1 October). Therefore total sales on your query output should be approximately 1/3 of the total sales for the quarter. Use the base rate to evaluate the correctness of the output of your query.

Analysis of Results

Planned Analyses

Preliminary analysis of the part 1 queries for hypothesis 1 indicated no statistically significant difference between the control and treatment groups with respect to query performance. The least-squares means for the number of errors for the control (8.5724) and treatment groups (9.2602) during part 1 of the experiment were not significantly different. Furthermore, the differences were opposite to the hypothesised direction. For part 2, the control and treatment groups’ least-squares means for the number of errors were 7.4207 and 8.3706, respectively. Again, not only was there no significant difference between the performance of the control and treatment groups, but the differences were not in the hypothesised direction.
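The reasonableness check that the Table 2 directions ask of the treatment group can be sketched in a few lines. This is an illustration only: the table and column names (employee, sales_order, staff_no, order_date, retail_value), the toy rows, the quarterly total of 900.0, and the 25% tolerance are all hypothetical, since the paper does not publish the experimental schema, data, or any numeric threshold. The employee names are borrowed from the sample report in Table 1.

```python
import sqlite3

# Hypothetical schema and toy data; not the study's actual database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (staff_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales_order (
    order_no     INTEGER PRIMARY KEY,
    staff_no     INTEGER REFERENCES employee(staff_no),
    order_date   TEXT,
    retail_value REAL
);
INSERT INTO employee VALUES (1, 'Akkerman'), (2, 'Gillespie');
INSERT INTO sales_order VALUES
    (1, 1, '2002-07-15', 300.0),
    (2, 2, '2002-08-10', 290.0),
    (3, 1, '2002-09-05', 200.0),
    (4, 2, '2002-09-20', 110.0);
""")

# Query that satisfies the information request in Table 2.
CORRECT_SQL = """
SELECT e.staff_no, e.name, SUM(o.retail_value)
FROM employee e JOIN sales_order o ON o.staff_no = e.staff_no
WHERE o.order_date > '2002-09-01'
GROUP BY e.staff_no, e.name
"""

# Syntactically valid but flawed query: the date restriction is missing.
FLAWED_SQL = """
SELECT e.staff_no, e.name, SUM(o.retail_value)
FROM employee e JOIN sales_order o ON o.staff_no = e.staff_no
GROUP BY e.staff_no, e.name
"""


def query_total(sql: str) -> float:
    """Sum the per-employee sales totals returned by a query."""
    return sum(row[2] for row in conn.execute(sql).fetchall())


def is_reasonable(total: float, quarterly_base_rate: float,
                  fraction: float = 1 / 3, tolerance: float = 0.25) -> bool:
    """Compare a one-month total with the share of the quarter it should represent.

    The tolerance is an arbitrary illustrative threshold.
    """
    expected = quarterly_base_rate * fraction
    return abs(total - expected) <= tolerance * expected


QUARTERLY_SALES = 900.0  # hypothetical base rate read from a management report

correct_total = query_total(CORRECT_SQL)
flawed_total = query_total(FLAWED_SQL)

print("correct query reasonable:", is_reasonable(correct_total, QUARTERLY_SALES))
print("flawed query reasonable: ", is_reasonable(flawed_total, QUARTERLY_SALES))
```

Both queries pass the syntax check, which is the point at which inexperienced end users tend to stop; only the comparison with the base rate flags the second result as suspect.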
Post Hoc Analysis

Given the unexpected results, previously collected participant personality data were included in the statistical analysis to investigate the potential moderating effect of participant conscientiousness on performance. With conscientiousness (Costa and McCrae 1992) as a covariate, there were statistically significant differences in performance of the treatment and control groups (Figure 3). Treatment group participants with above average levels of conscientiousness made significantly fewer errors than their counterparts in the control group. Although the difference was not statistically significant, treatment group participants with high conscientiousness also made fewer errors than their control group counterparts. In contrast, treatment group participants with low and below average levels of conscientiousness made significantly more errors than their control group counterparts. These results indicate the presence of an interaction effect between conscientiousness and the usefulness of base rate information. That is, conscientious participants were able to take advantage of the base rate information to improve their performance. In contrast, the presence of base rate information was associated with impaired performance for participants with lower levels of conscientiousness.

Figure 3. Performance Moderated by Conscientiousness
[Bar chart of LS mean errors (vertical axis, 0 to 12) for the control and treatment groups at four levels of conscientiousness (horizontal axis): Low (p = 0.0437), Below average (p = 0.0105), Above average (p = 0.0119), High (p = 0.2179). Bar values (pairing with bars not recoverable): 9.71, 10.66, 9.80, 8.18, 7.56, 6.44, 5.68, 6.17.]

Implications

If sensitization to base rates improved query performance for everyone, organizations might be well served by incorporating base rate sensitization into training and mentoring programs for new managers and business analysts.
But contrary to a one-size-fits-all model, post hoc analysis of participant queries detected an interaction effect. That is, while base rate-sensitization of participants with above average conscientiousness was associated with fewer query errors, sensitized participants with below average conscientiousness made more query errors. These results imply that if organizations consistently hire conscientious employees, sensitization to base rate information is likely to be beneficial. Organizations that hire less conscientious employees may need to place greater emphasis on helping users learn to formulate and evaluate queries systematically. Whether organizations hire employees with low or high conscientiousness, promoting mentoring relationships between experienced and new managers and business analysts is likely to improve end users’ query performance.

References

Ballou, D., and Tayi, G. “Enhancing Data Quality in Data Warehouse Environments,” Communications of the ACM (42:1), 1999, pp. 73-78.
Biros, D., George, J., and Zmud, R. “Inducing Sensitivity to Deception in Order to Improve Decision Making Performance: A Field Study,” MIS Quarterly (26), 2002, pp. 119-144.
Borthick, A. F., Bowen, P. L., Jones, D. R., and Tse, M. H. K. “The Effects of Information Request Ambiguity and Construct Incongruence on Query Development,” Decision Support Systems (32:1), 2001, pp. 3-25.
Brunswik, E. Perception and the Representative Design of Psychological Experiments, University of California Press, Berkeley, CA, 1956.
Costa, P. T., Jr., and McCrae, R. R. Revised NEO Personality Inventory and NEO Five-Factor Inventory Professional Manual, Psychological Assessment Resources Inc., Florida, 1992.
Cunningham, M., and Reidy, T. “Integrating Base Rate Data in Violence Risk Assessments at Capital Sentencing,” Behavioral Sciences and the Law (16:1), 1998, pp. 71-79.
Davies, D., and Parasuraman, R. The Psychology of Vigilance, Academic Press, London, 1981.
Ginossar, Z., and Trope, Y. “The Effects of Base Rates and Individuating Information on Judgments About Another Person,” Journal of Experimental Social Psychology (16), 1980, pp. 228-242.
Halstead, M. H. Elements of Software Science, Elsevier, Amsterdam, 1977.
Hinsz, V., Tindale, R., Nagao, D., Davis, J., and Robertson, B. “The Influence of the Accuracy of Individuating Information on the Use of Base Rate in Probability Judgment,” Journal of Experimental Social Psychology (24), 1988, pp. 127-145.
Johnson, P., Grazioli, S., Jamal, K., and Berryman, R. “Detecting Deception, Adversarial Problem Solving in a Low Base-Rate World,” Cognitive Science (25), 2001, pp. 355-392.
Kahneman, D., and Tversky, A. “On the Psychology of Prediction,” Psychological Review (80), 1973, pp. 237-251.
Klein, B. D., Goodhue, D. L., and Davis, G. B. “Can Humans Detect Errors in Data? Impact of Base Rates, Incentives and Goals,” MIS Quarterly (21:2), 1997, pp. 169-194.
Koehler, J. “The Base Rate Fallacy Reconsidered: Descriptive, Normative and Methodological Challenges,” Behavioral and Brain Sciences (19:1), 1996, pp. 1-53.
Koonce, L. “A Cognitive Characterization of Audit Analytical Review,” Auditing: A Journal of Practice & Theory (12), 1993, pp. 57-76.
Lanning, K. “Some Reasons for Distinguishing Between ‘Non-Normative Response’ and ‘Irrational Decision’,” The Journal of Psychology (12), 1987, pp. 109-117.
Leonard-Barton, D. Wellsprings of Knowledge, Harvard Business School Press, Boston, MA, 1995.
Rogers, R. “The Uncritical Acceptance of Risk Assessment,” Forensic Practice in Law and Human Behavior (24:5), 2000, pp. 595-605.
Roy, M. C., and Lerch, F. J. “Overcoming Ineffective Mental Representations in Base-Rate Problems,” Information Systems Research (7:2), 1996, pp. 233-247.
Tversky, A., and Kahneman, D. “Judgment Under Uncertainty: Heuristics and Biases,” Science (185), 1974, pp. 1124-1131.