7/4/2022
Data Architecture Training Course
MODULE 2: DATA HANDLING ETHICS
1. Introduction
• Defined simply, ethics are principles of behavior based on
ideas of right and wrong. Ethical principles often focus on
ideas such as fairness, respect, responsibility, integrity,
quality, reliability, transparency, and trust.
• Data handling ethics are concerned with how to procure,
store, manage, use, and dispose of data in ways that are
aligned with ethical principles.
• Handling data in an ethical manner is necessary to the long-
term success of any organization that wants to get value
from its data. Unethical data handling can result in the loss
of reputation and customers, because it puts at risk people
whose data is exposed.
• In some cases, unethical practices are also illegal.
The ethics of data handling are complex, but they center on several core concepts:
• Impact on people: Because data represents characteristics of individuals
and is used to make decisions that affect people’s lives, there is an
imperative to manage its quality and reliability.
• Potential for misuse: Misusing data can negatively affect people and
organizations, so there is an ethical imperative to prevent the misuse of
data.
• Economic value of data: Data has economic value. Ethics of data
ownership should determine how that value can be accessed and by
whom.
• Organizations protect data based largely on laws and regulatory
requirements. Nevertheless, because data represents people (customers,
employees, patients, vendors, etc.), data management professionals should
recognize that there are ethical (as well as legal) reasons to protect data and
ensure it is not misused. Even data that does not directly represent
individuals can still be used to make decisions that affect people’s lives.
• There is an ethical imperative not only to protect data, but also to manage
its quality. People making decisions, as well as those impacted by decisions,
expect data to be complete and accurate. From both a business and a
technical perspective, data management professionals have an ethical
responsibility to manage data in a way that reduces the risk that it may
misrepresent, be misused, or be misunderstood. This responsibility extends
across the data lifecycle, from creation to destruction of data.
• Unfortunately, many organizations fail to recognize and
respond to the ethical obligations inherent in data
management. They may adopt a traditional technical
perspective and profess not to understand the data; or they
assume that if they follow the letter of the law, they have no
risk related to data handling. This is a dangerous assumption.
• The data environment is evolving rapidly. Organizations are
using data in ways they would not have imagined even a
few years ago. While laws codify some ethical principles,
legislation cannot keep up with the risks associated with
evolution of the data environment. Organizations must
recognize and respond to their ethical obligation to protect
data entrusted to them by fostering and sustaining a culture
that values the ethical handling of information.
2. Business Drivers
• Like W. Edwards Deming’s statements on quality, ethics means “doing it right
when no one is looking.” An ethical approach to data use is increasingly
being recognized as a competitive business advantage (Hasselbalch and
Tranberg, 2016).
• Ethical data handling can increase the trustworthiness of an organization
and the organization’s data and process outcomes.
• This can create better relationships between the organization and its
stakeholders.
• Creating an ethical culture entails introducing proper governance, including
institution of controls to ensure that both intended and resulting outcomes
of data processing are ethical and do not violate trust or infringe on human
dignity.
• Data handling doesn’t happen in a vacuum, and
customers and stakeholders expect ethical behavior and
outcomes from businesses and their data processes.
• Reducing the risk that data for which the organization is
responsible will be misused by employees, customers, or
partners is a primary reason for an organization to
cultivate ethical principles for data handling.
• There is also an ethical responsibility to secure data from
criminals (i.e., to protect against hacking and potential
data breaches).
• Different models of data ownership influence the ethics of data
handling. For example, technology has improved the ability of
organizations to share data with each other. This ability means
organizations need to make ethical decisions about their responsibility
for sharing data that does not belong to them.
• The emerging roles of Chief Data Officer, Chief Risk Officer, Chief
Privacy Officer, and Chief Analytics Officer are focused on controlling
risk by establishing acceptable practices for data handling. But
responsibility extends beyond people in these roles. Handling data
ethically requires organization-wide recognition of the risks associated
with misuse of data and organizational commitment to handling data
based on principles that protect individuals and respect the
imperatives related to data ownership.
3. Essential Concepts
3.1 Ethical Principles for Data
• Respect for Persons: This principle reflects the fundamental ethical
requirement that people be treated in a way that respects their dignity
and autonomy as human individuals. It also requires that in cases where
people have ‘diminished autonomy’, extra care be taken to protect their
dignity and rights.
• Beneficence: This principle has two elements: first, do not harm; second,
maximize possible benefits and minimize possible harms.
• Justice: This principle considers the fair and equitable treatment of
people.
• The United States Department of Homeland Security’s Menlo Report
adapts the Belmont Principles to Information and Communication
Technology Research, adding a fourth principle: Respect for Law and
Public Interest (US-DHS, 2012).
In 2015, the European Data Protection Supervisor published an opinion on digital
ethics highlighting the “engineering, philosophical, legal, and moral implications”
of developments in data processing and Big Data. It called for a focus on data
processing that upholds human dignity, and set out four pillars required for an
information ecosystem that ensures ethical treatment of data (EDPS, 2015):
• Future-oriented regulation of data processing and respect for the rights to
privacy and to data protection
• Accountable controllers who determine personal information processing
• Privacy conscious engineering and design of data processing products and
services
• Empowered individuals
The EDPS states that privacy is a fundamental human right.
3.2 Principles Behind Data Privacy Law
• Privacy law is not new. Privacy and information privacy as concepts are firmly
linked to the ethical imperative to respect human rights.
• In 1890, American legal scholars Samuel Warren and Louis Brandeis described
privacy and information privacy as human rights with protections in common
law that underpin several rights in the US Constitution.
• The concept of information privacy as a fundamental right was reaffirmed in
the US Privacy Act of 1974, which states that “the right to privacy is a
personal and fundamental right protected by the Constitution of the United
States”.
• In 1980, the Organization for Economic Co-operation and Development
(OECD) established Guidelines and Principles for Fair Information Processing
that became the basis for the European Union’s data protection laws.
The OECD’s eight core principles include:
• Limitations on data collection;
• An expectation that data will be of high quality;
• The requirement that when data is collected, it is done for a specific purpose;
• Limitations on data usage;
• Security safeguards;
• An expectation of openness and transparency;
• The right of an individual to challenge the accuracy of data related to himself or
herself;
• Accountability for organizations to follow the guidelines.
The OECD principles have since been superseded by
principles underlying the General Data Protection
Regulation of the EU (GDPR, 2016). See Table 1.
• These principles are balanced by and support
certain qualified rights individuals have to their
data, including the rights to access, rectification of
inaccurate data, portability, the right to object to
processing of personal data that may cause
damage or distress, and erasure. When processing
of personal data is done based on consent, that
consent must be an affirmative action that is
freely given, specific, informed, and unambiguous.
Canadian privacy law combines a
comprehensive regime of privacy protection
with industry self-regulation. PIPEDA
(Personal Information Protection and
Electronic Documents Act) applies to every
organization that collects, uses, and
disseminates personal information in the
course of commercial activities. It stipulates
rules, with exceptions, that organizations
must follow in their use of consumers’
personal information. Table 2 describes
statutory obligations based on PIPEDA.
In March 2012, the US Federal Trade Commission (FTC) issued a report recommending organizations design
and implement their own privacy programs based on best practices described in the report (i.e., Privacy by
Design) (FTC 2012). The report reaffirms the FTC’s focus on Fair Information Processing Principles (see Table 3).
These principles are developed to embody the concepts in the OECD Fair Information Processing Guidelines. Other
focuses for fair information practices include:
• Simplified consumer choice to reduce the burden placed on consumers
• The recommendation to maintain comprehensive data management procedures throughout the information
lifecycle
• Do Not Track option
• Requirements for affirmative express consent
• Concerns regarding the data collection capabilities of large platform providers; transparency and clear privacy
notices and policies
• Individuals’ access to data
• Educating consumers about data privacy practices
• Privacy by Design
There is a global trend towards increasing legislative
protection of individuals’ information privacy, following
the standards set by EU legislation. Laws around the
world place different kinds of restrictions on the
movement of data across international boundaries. Even
within a multinational organization, there will be legal
limits to sharing information globally. It is therefore
important that organizations have policies and guidelines
that enable staff to follow legal requirements as well as
use data within the risk appetite of the organization.
3.3 Online Data in an Ethical Context
Dozens of initiatives and programs designed to create a codified set of principles to inform
ethical behaviors online are now emerging in the United States (Davis, 2012).
Topics include:
• Ownership of data: The rights to control one’s personal data in relation to social
media sites and data brokers. Downstream aggregators of personal data can embed
data into deep profiles that individuals are not aware of.
• The Right to be Forgotten: To have information about an individual be erased from
the web, particularly to adjust online reputation. This topic is part of data retention
practices in general.
• Identity: Having the right to expect one identity and a correct identity, and to opt for
a private identity.
• Freedom of speech online: Expressing one’s opinions versus bullying, terror inciting,
‘trolling,’ or insulting.
3.4 Risks of Unethical Data Handling Practices
• Most people who work with data know that it is possible to use
data to misrepresent facts. The classic book How to Lie with
Statistics by Darrell Huff (1954) describes a range of ways that data
can be used to misrepresent facts while creating a veneer of
factuality. Methods include judicious data selection, manipulation
of scale, and omission of some data points. These approaches are
still at work today.
• The following scenarios describe unethical data practices that
violate these principles among others.
3.4.1 Timing
• It is possible to lie through omission or inclusion of certain data
points in a report or activity based on timing. Equity market
manipulation through ‘end of day’ stock trades can artificially raise
a stock price at closing of the market giving an artificial view of the
stock’s worth. This is called market timing and is illegal.
• Business Intelligence staff may be the first to notice anomalies. In
fact, they are now seen as valuable players in the stock trading
centers of the world, recreating trading patterns to look for such
problems, as well as analyzing reports and reviewing and
monitoring rules and alerts. Ethical Business Intelligence staff may
need to alert appropriate governance or management functions to
such anomalies.
3.4.2 Misleading Visualizations
• Charts and graphs can be used to present data in a
misleading manner. For instance, changing scale can make
a trend line look better or worse. Leaving data points out,
comparing two facts without clarifying their relationship,
or ignoring accepted visual conventions (such as that the
numbers in a pie chart representing percentages must add
up to 100 and only 100), can also be used to trick people
into interpreting visualizations in ways that are not
supported by the data itself.
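Two of the conventions mentioned above can be checked mechanically before a chart is published. The following is a minimal sketch, assuming Python; the function names, the tolerance, and the idea of a "truncation factor" are illustrative assumptions, not part of the course material.

```python
import math


def pie_percentages_valid(percentages, tolerance=0.5):
    """Return True if pie-chart slices are non-negative and sum to 100.

    `tolerance` (an assumed value) allows for rounding of displayed labels.
    """
    if any(p < 0 for p in percentages):
        return False
    return math.isclose(sum(percentages), 100.0, abs_tol=tolerance)


def truncation_factor(baseline_value, axis_min):
    """How many times larger a change looks when bars start at `axis_min`
    instead of zero. A result of 1.0 means the chart is proportional."""
    if not 0 <= axis_min < baseline_value:
        raise ValueError("axis_min must be between 0 and the baseline value")
    return baseline_value / (baseline_value - axis_min)
```

For instance, a bar chart whose axis starts at 90 when the first bar is 100 makes every change look ten times larger than it is, because `truncation_factor(100, 90)` evaluates to 10.0.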
3.4.3 Unclear Definitions or Invalid Comparisons
• A US news outlet reported, based on 2011 US Census Bureau data, that
108.6 million people in the US were on welfare yet only 101.7 million people
had full time jobs, making it seem that a disproportionate percentage of the
overall population was on welfare. Media Matters explained the
discrepancy: The 108.6 million figure for the number of “people on welfare”
comes from a Census Bureau’s account ... of participation in means-tested
programs, which include “anyone residing in a household in which one or
more people received benefits” in the fourth quarter of 2011, thus including
individuals who did not themselves receive government benefits. On the
other hand, the “people with a full time job” figure included only individuals
who worked, not individuals residing in a household where at least one
person works.
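The mismatch between a household-level count and an individual-level count can be reproduced on a toy dataset. This is a sketch only: the two households below are hypothetical and are not drawn from the Census Bureau figures; the field and function names are assumptions made for illustration.

```python
# Hypothetical toy data: each inner list is one household, each dict one person.
households = [
    [
        {"receives_benefits": True, "full_time_job": False},
        {"receives_benefits": False, "full_time_job": True},
        {"receives_benefits": False, "full_time_job": False},
    ],
    [
        {"receives_benefits": False, "full_time_job": True},
        {"receives_benefits": False, "full_time_job": True},
    ],
]


def people_in_benefit_households(households):
    """Household-level definition: everyone living with at least one recipient."""
    return sum(len(h) for h in households if any(p["receives_benefits"] for p in h))


def people_receiving_benefits(households):
    """Individual-level definition: only the recipients themselves."""
    return sum(p["receives_benefits"] for h in households for p in h)


def people_with_full_time_job(households):
    """Individual-level definition: only people who hold a full-time job."""
    return sum(p["full_time_job"] for h in households for p in h)
```

On this toy data the household-level benefit count is 3 and the individual-level job count is also 3, which suggests parity. Counting both at the individual level gives 1 recipient against 3 full-time workers, a very different picture from the same records.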
3.4.4 Bias
• Bias refers to an inclination of outlook. On the personal level, the term is
associated with unreasoned judgments or prejudices. In statistics, bias
refers to deviations from expected values. These are often introduced
through systematic errors in sampling or data selection. Bias can be
introduced at different points in the data lifecycle: when data is collected or
created, when it is selected for inclusion in analysis, through the methods by
which it is analyzed, and in how the results of analysis are presented.
• Using data without addressing the ways in which bias may be introduced
can compound prejudice while reducing transparency in process, giving the
resulting outcomes the veneer of impartiality or neutrality when they are
not neutral.
There are several types of bias:
• Data Collection for pre-defined result: The analyst is pressured to
collect data and produce results in order to reach a pre-defined
conclusion, rather than as an effort to draw an objective conclusion.
• Biased use of data collected: Data may be collected with limited bias,
but an analyst is pressured to use it to confirm a pre-determined
approach. Data may even be manipulated to this end (i.e., some data
may be discarded if it does not confirm the approach).
• Hunch and search: The analyst has a hunch and wants to satisfy that
hunch, but uses only the data that confirms the hunch and does not
account for other possibilities that the data may surface.
• Biased sampling methodology: Sampling is often a necessary part of
data collection. But bias can be introduced by the method used to
select the sample set. It is virtually impossible for humans to sample
without bias of some sort. To limit bias, use statistical tools to select
samples and establish adequate sample sizes. Awareness of bias in
data sets used for training is particularly important.
• Context and Culture: Biases are often culturally or contextually based,
so stepping outside that culture or context is required for a neutral
look at the situation.
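The advice to use statistical tools to select samples and establish adequate sample sizes can be sketched as follows. This assumes Python's standard library, simple random sampling, and Cochran's formula for estimating a proportion in a large population; the default margin of error, confidence level (z = 1.96 for roughly 95%), and function names are assumptions chosen for illustration, and other designs such as stratified sampling may suit a given study better.

```python
import math
import random


def required_sample_size(margin_of_error=0.05, z=1.96, proportion=0.5):
    """Cochran's formula: n = z^2 * p * (1 - p) / e^2, rounded up.

    proportion=0.5 is the most conservative choice when the true value is unknown.
    """
    n = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n)


def simple_random_sample(population, size, seed=None):
    """Draw `size` distinct items so that every item is equally likely to be chosen.

    Passing a seed makes the draw reproducible, which helps when the
    selection later needs to be audited.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), size)
```

With the defaults, `required_sample_size()` returns 385. Letting a random number generator pick the records, rather than a person, removes one common source of selection bias, although it does not correct bias already present in the population being sampled.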
3.4.5 Transforming and Integrating Data
Data integration presents ethical challenges because data is changed as it moves from system
to system. If data is not integrated with care, it presents risk for unethical or even illegal data
handling. These ethical risks intersect with fundamental problems in data management,
including:
• Limited knowledge of data’s origin and lineage
• Data of poor quality
• Unreliable Metadata: Data consumers depend on reliable Metadata, including consistent
definitions of individual data elements, documentation of data’s origin, and
documentation of lineage (e.g., rules by which data is integrated).
• No documentation of data remediation history: Organizations should also have auditable
information related to the ways data has been changed.
3.4.6 Obfuscation / Redaction of Data
Obfuscating or redacting data is the practice of making information anonymous, or removing sensitive information. But
obfuscation alone may not be sufficient to protect data if a downstream activity (analysis or combination with other
datasets) can expose the data. This risk is present in the following instances:
• Data aggregation: When aggregating data across some set of dimensions, and removing identifying data, a dataset
can still serve an analytic purpose without concern for disclosing personal identifying information (PII). Aggregations
into geographic areas are a common practice.
• Data marking: Data marking is used to classify data sensitivity (secret, confidential, personal, etc.) and to control
release to appropriate communities such as the public or vendors, or even vendors from certain countries or other
community considerations.
• Data masking: Data masking is a practice where only appropriate submitted data will unlock processes. Operators
cannot see what the appropriate data might be; they simply type in responses given to them, and if those responses
are correct, further activities are permitted.
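One way aggregation can still expose individuals is through very small groups: a count of one or two people in a small area may point to a specific person. A common mitigation is to suppress small counts. The sketch below assumes Python and a minimum group size of 5; that threshold, the record layout, and the function name are assumptions for illustration, and suppression on its own does not guarantee anonymity.

```python
from collections import Counter


def aggregate_with_suppression(records, key, min_group_size=5):
    """Count records per `key` value and suppress counts below `min_group_size`.

    Suppressed groups are reported as None rather than as a small number.
    """
    counts = Counter(record[key] for record in records)
    return {
        group: (count if count >= min_group_size else None)
        for group, count in counts.items()
    }
```

For example, six records for a hypothetical region "North" and two for "South" would be reported as 6 and None respectively.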
3.5 Establishing an Ethical Data Culture
• Establishing a culture of ethical data handling requires
understanding existing practices, defining expected
behaviors, codifying these in policies and a code of
ethics, and providing training and oversight to enforce
expected behaviors. As with other initiatives related to
governing data and to changing culture, this process
requires strong leadership.
• Improving an organization’s ethical behavior regarding
data requires a formal Organizational Change
Management (OCM) process.
3.5.1 Review Current State Data Handling Practices
• The first step to improvement is understanding the
current state. The purpose of reviewing existing data
handling practices is to understand the degree to which
they are directly and explicitly connected to ethical and
compliance drivers. This review should also identify how
well employees understand the ethical implications of
existing practices in building and preserving the trust of
customers, partners, and other stakeholders. The
deliverable from the review should document ethical
principles that underlie the organization’s collection, use,
and oversight of data, throughout the data lifecycle,
including data sharing activities.
3.5.2 Identify Principles, Practices, and Risk Factors
• The purpose of formalizing ethical practices around data handling is to reduce the risk that
data might be misused and cause harm to customers, employees, vendors, other
stakeholders, or the organization as a whole. An organization trying to improve its practices
should be aware of general principles, such as the necessity of protecting the privacy of
individuals, as well as industry-specific concerns, such as the need to protect financial
or health-related information.
• Guiding principle: People have a right to privacy with respect to information about their health.
• Risk: If there is wide access to the personal health data of patients, their right to
privacy is jeopardized.
• Practice: Only nurses and doctors will be allowed to access the personal health data of
patients and only for purposes of providing care.
• Control: There will be an annual review of all users of the systems that contain personal
health information of patients to ensure that only those people who need to have access
do have access.
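The annual review described in the control can be partly automated. The following is a minimal sketch, assuming Python and an export of system users with a role and an access flag; the `SystemUser` fields, the role names, and the function name are hypothetical and would need to match the organization's real access-management data.

```python
from dataclasses import dataclass

# Assumption: the roles permitted to access patient health data under the practice.
PERMITTED_ROLES = frozenset({"nurse", "doctor"})


@dataclass(frozen=True)
class SystemUser:
    username: str
    role: str
    has_health_data_access: bool


def flag_for_review(users, permitted_roles=PERMITTED_ROLES):
    """Return the usernames that hold access without a permitted role, sorted."""
    return sorted(
        user.username
        for user in users
        if user.has_health_data_access and user.role not in permitted_roles
    )
```

A report like this only identifies candidates for follow-up. Whether access is actually used "only for purposes of providing care" still has to be judged by people reviewing the flagged accounts and the access logs.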
3.5.3 Create an Ethical Data Handling Strategy and Roadmap
The component pieces of such a strategy include:
• Values statements: Values statements describe what the
organization believes in. Examples might include truth, fairness, or
justice. These statements provide a framework for ethical handling
of data and decision-making.
• Ethical data handling principles: Ethical data handling principles
describe how an organization approaches challenges presented by
data; for example, how to respect the right of individuals to
privacy. Principles and expected behaviors can be summarized in a
code of ethics and supported through an ethics policy. Socialization
of the code and policy should be included in the training and
communications plan.
• Compliance framework: A compliance framework includes factors that drive organizational obligations.
Ethical behaviors should enable the organization to meet compliance requirements. Compliance
requirements are influenced by geographic and sector concerns.
• Risk assessments: Risk assessments identify the likelihood and the implications of specific problems arising
within the organization. These should be used to prioritize actions related to mitigation, including employee
compliance with ethical principles.
• Training and communications: Training should include review of the code of ethics. Employees must sign off
that they are familiar with the code and the implications of unethical handling of data. Training needs to be
ongoing; for example, through a requirement for an annual ethics statement affirmation. Communications
should reach all employees.
• Roadmap: The roadmap should include a timeline with activities that can be approved by management.
Activities will include execution of the training and communications plan, identification and remediation of
gaps in existing practices, risk mitigation, and monitoring plans. Develop detailed statements that reflect the
target position of the organization on the appropriate handling of data, including roles, responsibilities, and
processes, and references to experts for more information. The roadmap should cover all applicable laws and
cultural factors.
• Approach to auditing and monitoring: Ethical ideas and the code of ethics can be reinforced through training.
It is also advisable to monitor specific activities to ensure that they are being executed in compliance with
ethical principles.
3.5.4 Adopt a Socially Responsible Ethical Risk Model
Data professionals involved in Business Intelligence, analytics, and Data Science are often
responsible for data that describes:
• Who people are, including their countries of origin and their racial, ethnic, and religious
characteristics
• What people do, including political, social, and potentially criminal activities
• Where people live, how much money they have, what they buy, who they talk with or text
or send email to
• How people are treated, including outcomes of analysis, such as scoring and preference
tracking that will tag them as ultimately privileged or not for future business
This data can be misused and counteract the principles underlying data ethics: respect for
persons, beneficence, and justice.
For example, an organization might set criteria for what it considers ‘bad’
customers in order to stop doing business with those individuals. But if that
organization has a monopoly on an essential service in a particular geographic
area, then some of those individuals may find themselves without that
essential service and they will be in harm’s way because of the organization’s
decision.
Projects that use personal data should have a disciplined approach to the use
of that data. See Figure 13. They should account for:
• How they select their populations for study (arrow 1)
• How data will be captured (arrow 2)
• What activities analytics will focus on (arrow 3)
• How the results will be made accessible (arrow 4)
• Within each area of consideration, they should address potential
ethical risks, with a particular focus on possible negative effects on
customers or citizens.
• A risk model can be used to determine whether to execute the
project. It will also influence how to execute the project. For
example, the data will be made anonymous, the private information
removed from the file, the security on the files tightened or
confirmed, and local and other applicable privacy law reviewed
with legal. Dropping customers may not be permitted under law if
the organization is a monopoly in a jurisdiction and citizens have
no other provider options, as with energy or water.
• DAMA International encourages data professionals to take a
professional stand and present the risk situation to business leaders
who may not have recognized the implications of particular uses of
data and these implications in their work.
3.6 Data Ethics and Governance
• Oversight for the appropriate handling of data falls under both data
governance and legal counsel. Together they are required to keep
up-to-date on legal changes, and reduce the risk of ethical
impropriety by ensuring employees are aware of their obligations.
• Data Governance must set standards and policies for and provide
oversight of data handling practices.
• Employees must expect fair handling, protection when reporting
possible breaches, and non-interference in their personal lives.
• Data Governance has a particular oversight requirement to review
plans and decisions proposed by BI, analytics and Data Science
studies.
Group Discussion
Q&A