Unit – III Data Integrity
By
Dr. Gopalakrishnan C
Assistant Professor
What Is Data Integrity?
➢ Data integrity refers to the accuracy, consistency, and reliability of
data throughout its lifecycle.
➢ It ensures that data remains unaltered and uncorrupted from its
original state during storage, processing, retrieval, and
transmission.
➢ Data integrity is crucial for maintaining the trustworthiness of
information and for making informed decisions based on that
data.
➢ Techniques such as validation, verification, and error checking are
employed to maintain data integrity in databases and information
systems.
Why Is Data Integrity Important?
➢ Data integrity is important for several reasons:
➢ Accurate decision-making: Reliable and accurate data is essential for making informed
decisions in various sectors, such as business, healthcare, and government. Data integrity
ensures that the information used for decision-making is trustworthy and valid.
➢ Operational efficiency: Maintaining data integrity helps organizations run their operations
smoothly, as it minimizes errors and discrepancies that can cause disruptions and
inefficiencies.
➢ Compliance and regulations: Many industries are subject to strict regulations that mandate
maintaining data integrity to ensure safety, privacy, and security. Adhering to these
regulations is vital to avoid legal and financial consequences.
Cont..,
➢ Data security: Data integrity goes hand in hand with data security. Ensuring that
data is not tampered with or corrupted helps protect it from unauthorized
access, breaches, and cyberattacks.
➢ Customer trust and reputation: Organizations that maintain high data integrity
standards are more likely to earn customer trust and maintain a positive
reputation. Accurate and reliable data is crucial for delivering quality products
and services.
➢ Data analysis and reporting: Data integrity is critical for accurate data analysis
and reporting, as corrupted or inconsistent data can lead to misleading or false
conclusions.
➢ System stability: Data integrity helps maintain the stability of information
systems and databases, preventing errors that can cause system crashes or
malfunctions.
Data Integrity vs. Data Security vs. Data Quality
Data integrity, data security, and data quality are related but
distinct concepts that contribute to the overall management and
protection of information in a system.
What Is Data Integrity?
➢ Data integrity refers to the accuracy, consistency, and reliability of
data throughout its lifecycle.
➢ It ensures that data remains unaltered and uncorrupted from its
original state during storage, processing, retrieval, and
transmission.
➢ Data integrity is crucial for maintaining the trustworthiness of
information and for making informed decisions based on that
data.
What is data security?
➢ Data security is concerned with protecting data from
unauthorized access, theft, or damage.
➢ It involves implementing various safeguards and measures to
ensure the confidentiality, availability, and integrity of data.
➢ Data security techniques include encryption, access controls,
firewalls, intrusion detection systems, and secure backup and
recovery procedures.
What is data quality?
➢ Data quality is the measure of the condition or fitness of data
for its intended purpose.
➢ High-quality data is accurate, complete, consistent, timely, and
relevant.
➢ Data quality management involves various processes and
techniques to improve the reliability and value of the data,
such as data cleansing, data profiling, and data enrichment.
The main differences
➢ Data integrity focuses on maintaining the accuracy and consistency
of data throughout its lifecycle, data security deals with protecting
data from unauthorized access or damage, and data quality is
concerned with ensuring data meets the required standards for its
intended use.
➢ While these concepts are distinct, they are interrelated and
contribute to the overall effectiveness of data management and
protection.
Types of Data Integrity
Data integrity can be broadly categorized into two types:
✓ Physical integrity
✓ Logical integrity
Each type focuses on different aspects of data preservation and
accuracy.
Physical Integrity
➢ Physical integrity refers to protecting data from physical damage or
corruption that could occur due to hardware failures, power outages,
natural disasters, or other external factors.
➢ Maintaining physical integrity ensures that the data remains accessible
and unaltered during storage, retrieval, and transmission.
➢ Key aspects of physical integrity include:
✓ Redundancy
✓ Disaster recovery
✓ Fault tolerance
Physical Integrity - Redundancy
Redundancy: Redundancy is the duplication of data or system
components to provide a backup in case of hardware failure or
data corruption.
Techniques such as RAID (redundant array of independent
disks) and backup systems are employed to store multiple
copies of data, ensuring its availability even if one storage device
fails.
Physical Integrity - Disaster recovery
Disaster recovery: Disaster recovery involves planning and
implementing strategies to restore data and system functionality
following a natural disaster, power outage, or other unforeseen
events.
This includes creating offsite backups, establishing alternate
processing sites, and defining recovery procedures to minimize
downtime and data loss.
Physical Integrity - Fault tolerance
Fault tolerance: Fault tolerance is the ability of a system to
continue operating even in the presence of hardware or software
failures.
Fault-tolerant systems use redundant components, error
detection and correction mechanisms, and self-healing
processes to maintain data integrity and system functionality.
Logical Integrity
➢ Logical integrity focuses on maintaining the accuracy, consistency, and
validity of data within a database or information system during
processing, storage, and retrieval.
➢ Logical integrity involves the implementation of rules, constraints, and
validation checks to ensure that data remains consistent and reliable. Key
aspects of logical integrity include:
➢ Data validation
➢ Referential integrity
➢ Transaction control
Logical Integrity - Data validation
➢ Data validation: Data validation is the process of verifying
that the data entered into a system meets predefined criteria
or constraints.
➢ This can include checking for data type, format, range, and
uniqueness.
➢ Data validation helps prevent the introduction of errors or
inconsistencies in the data.
Logical Integrity - Referential integrity
➢ Referential integrity: Referential integrity is a concept in
relational databases that ensures the consistency of
relationships between tables.
➢ It enforces the rules for foreign key constraints, ensuring that
related records in different tables remain linked and that any
updates or deletions of records are propagated correctly
across the related tables.
Logical Integrity - Transaction control
➢ Transaction control: Transaction control ensures that data
remains consistent during concurrent database operations.
➢ involves the use of transactions, which are sets of related
operations that are either executed completely or not at all.
➢ The ACID properties (atomicity, consistency, isolation, and
durability) govern transactions and help maintain data
integrity during concurrent database operations.
Data Integrity Risks and Threats
➢ Data integrity can be compromised by various risks
and threats, which can be classified into two main
categories:
✓ Unintentional threats
✓ Intentional threats
Unintentional Threats
These are risks that arise due to accidental or inadvertent actions, system errors, or hardware
failures. Common unintentional threats include:
✓ Human errors: Mistakes made by users, such as incorrect data entry, accidental data
deletion or modification, or misconfiguration of systems, can compromise data integrity.
✓ Software bugs: Errors in software code or algorithms can introduce inconsistencies,
corruption, or inaccuracies in the data.
✓ Hardware failures: Failures in hardware components, such as storage devices, memory,
or network equipment, can lead to data corruption or loss.
✓ Data transmission errors: Errors occurring during data transfer or transmission, such as
network interruptions or packet loss, can result in data corruption or incomplete data.
✓ Power outages: Sudden power loss can lead to incomplete transactions or data
corruption, especially if the system does not have appropriate safeguards in place, such as
battery backups or redundant power supplies.
Intentional Threats
These are risks arising from deliberate actions by malicious actors intending to
compromise data integrity. Common intentional threats include:
✓ Unauthorized access: Unauthorized users gaining access to data can modify,
delete, or corrupt it intentionally or unintentionally.
✓ Data tampering: Malicious actors may alter or manipulate data with the intent
to cause harm, mislead, or gain unauthorized benefits.
✓ Cyberattacks: Various cyberattacks, such as ransomware, malware,
or Distributed Denial of Service (DDoS) attacks, can compromise data integrity
by encrypting, corrupting, or deleting data.
✓ Insider threats: Employees or other insiders with authorized access to data
may intentionally modify, delete, or disclose data for personal gain or other
malicious reasons.
Maintain Data Integrity - Always Validate Input Data
Validating input data involves checking that the data entered into a system meets predefined criteria or
constraints, such as data type, format, range, and uniqueness. Implementing data validation techniques helps
prevent the introduction of errors or inconsistencies in the data. Here are recommended techniques:
✓ Client-side validation: Implement validation checks in user interfaces, such as web forms or mobile
applications, to ensure that data entered by users meets predefined criteria before it is submitted to the
server. This can help prevent the entry of incorrect or incomplete data.
✓ Server-side validation: Validate data on the server side as well, as client-side validation can be bypassed or
manipulated. Server-side validation can help catch any inconsistencies or errors in the data before it is
processed or stored.
✓ Regular data audits: Periodically review and audit data to identify and correct errors or inconsistencies
that may have been introduced during data entry or processing.
Maintain Data Integrity - Implement Access Controls
Access controls limit who can access, modify, or delete data in a system. It helps
ensure that users have the appropriate permissions based on their job
responsibilities, reducing the risk of unauthorized access or accidental data
manipulation. Here are recommended techniques:
✓ Role-based access control (RBAC): Assign access permissions based on user
roles and job responsibilities, ensuring that users only have the necessary
access to perform their tasks.
✓ Principle of least privilege: Limit user access to the minimum level necessary
to complete their job functions, reducing the risk of unauthorized data access
or modification.
✓ Strong authentication: Use strong authentication methods, such as multi-
factor authentication (MFA), to verify the identity of users accessing sensitive
data.
Maintain Data Integrity - Keep an Audit Trail
An audit trail is a record of all actions and changes made to data, including who made the
change, when it was made, and what was changed. Maintaining an audit trail enables
organizations to track and monitor data modifications, identify potential security breaches,
and ensure compliance with regulations. Consider the following when implementing an audit
trail:
✓ Logging: Implement detailed logging of user actions and data changes to create a
comprehensive record of all activities within the system.
✓ Monitoring: Regularly monitor audit logs to identify unusual or suspicious activities that
may indicate data integrity issues or security breaches.
✓ Compliance: Ensure that audit trails meet the requirements of applicable regulations and
industry standards, such as GDPR, HIPAA, or PCI DSS.
Maintain Data Integrity - Always Backup Data
Regularly backing up data ensures that an up-to-date copy of the data is
available in case of data loss, corruption, or hardware failure.
Consider the following when implementing data backup:
✓ Backup frequency: Determine the appropriate frequency for data backups
based on the criticality of the data and the organization’s tolerance for data
loss.
✓ Offsite backups: Store backup copies offsite or in the cloud to protect
against data loss due to local disasters, hardware failures, or cyberattacks.
✓ Backup testing: Periodically test backup procedures and data recovery
processes to ensure that they are working correctly and meet recovery
objectives.
Thank You