Abstract
Data science, together with the related field of big data, is an emerging discipline concerned with analyzing data to solve problems and develop insights. This rapidly growing domain promises many benefits to both consumers and businesses. However, big data analytics can also raise ethical concerns, for example the possible loss of privacy or harm to a subgroup of the population caused by a classification algorithm. To help address these potential ethical challenges, this paper maps and describes the main ethical themes identified via a systematic literature review. It then proposes a possible structure for integrating these themes within a data science project, thereby bringing some structure to the ongoing debate about the ethical situations that can arise when using data science analytics.
Cite this article
Saltz, J.S., Dewar, N. Data science ethical considerations: a systematic literature review and proposed project framework. Ethics Inf Technol 21, 197–208 (2019). https://doi.org/10.1007/s10676-019-09502-5
DOI: https://doi.org/10.1007/s10676-019-09502-5