Checking The Accuracy of Data
5. Your web browser creates a session key, encrypts it with the server's public key and
sends the encrypted key to the server.
6. The server uses its private key to decrypt the session key.
7. The client and server use the session key to encrypt all further communications.
Range Check
A range check is commonly used when you are working with data which consists of
numbers, currency or dates/times.
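A range check can be sketched in a few lines of Python. This is only a minimal illustration: the function name in_range and the limits used (a mark between 0 and 100, a date within 2024) are made up for the example and are not taken from any particular system.

```python
from datetime import date

def in_range(value, lowest, highest):
    """Return True when value lies between lowest and highest (both included)."""
    return lowest <= value <= highest

# Hypothetical limits: a mark must be between 0 and 100.
print(in_range(67, 0, 100))
print(in_range(140, 0, 100))

# The same comparison works for dates.
print(in_range(date(2024, 2, 29), date(2024, 1, 1), date(2024, 12, 31)))
```

The same function works for numbers, currency amounts and dates because all of them can be compared with <= in Python.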
Type Check
When you begin to set up your new system you will choose the most appropriate data type
for each field.
A type check will ensure that the correct type of data is entered into that field. For
example, in a clothes shop, dress sizes may range from 8 to 18. A number data type would
be a suitable choice for this data. By setting the data type as number, only numbers could
be entered e.g. 10, 12, 14 and you would prevent anyone trying to enter text such as ‘ten’
or ‘ten and a half’.
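In a database the data type is normally set in the field properties, but the idea can be sketched in Python. The function name is_whole_number is an assumption for this example; it simply tests whether the typed text can be read as a whole number.

```python
def is_whole_number(text):
    """Return True if the typed text can be read as a whole number."""
    try:
        int(text)
    except ValueError:
        # int() raises ValueError for text such as 'ten' or '10.5'.
        return False
    return True

for entry in ["10", "12", "14", "ten", "ten and a half"]:
    print(entry, is_whole_number(entry))
```

Note that a type check only confirms the kind of data. It would still accept a dress size of 99, which is why it is often combined with a range check.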
Check Digit
A check digit is used when you want to be sure that a sequence of digits has been entered correctly.
There are many different schemes (algorithms) for creating check digits.
For example, the ISBN-10 numbering system for books makes use of 'Modulo-11' division.
In modulo division, the answer is the remainder of the division. For example
The check digit is the final number in the sequence, so in this example it is the final ‘2’.
The computer will perform a calculation on the other digits and then compare the answer to
the check digit. If both match, the data was almost certainly entered correctly.
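The ISBN-10 calculation can be sketched as follows. Each of the first nine digits is multiplied by a weight (10 for the first digit down to 2 for the ninth), the results are added up, and the check digit is the value that makes the overall total divide exactly by 11. The function names are made up for this sketch, and the ISBN 0-306-40615-2 is simply an illustrative valid number, not necessarily the one used in the original example.

```python
def isbn10_check_digit(first_nine):
    """Work out the ISBN-10 check digit for a string of nine digits."""
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    # A check value of 10 is written as the letter X.
    return "X" if check == 10 else str(check)

def isbn10_is_valid(isbn):
    """Recalculate the check digit and compare it with the one entered."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 10 or not digits[:9].isdigit():
        return False
    return isbn10_check_digit(digits[:9]) == digits[9].upper()

print(130 % 11)                          # modulo division: the remainder is 9
print(isbn10_check_digit("030640615"))   # 2
print(isbn10_is_valid("0-306-40615-2"))  # True
print(isbn10_is_valid("0-306-40612-5"))  # False: digits typed in the wrong order
```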
Length Check
Sometimes you may have a set of data which always has the same number of characters.
A length check could be set up to ensure that exactly 11 digits are entered into the
field. This type of validation cannot check that the 11 digits are correct, but it can
ensure that 10 or 12 digits aren't entered.
A length check can also be set up to accept data whose length falls within a certain range.
So you could set a length check for postcode to accept data which has a minimum of
5 characters and a maximum of 8.
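Both kinds of length check can be sketched with one small Python function. The name has_length and the sample values are assumptions made for this example only.

```python
def has_length(text, minimum, maximum=None):
    """Return True if the length of text is between minimum and maximum.

    If no maximum is given, the length must be exactly minimum.
    """
    if maximum is None:
        maximum = minimum
    return minimum <= len(text) <= maximum

print(has_length("07700900123", 11))   # exactly 11 characters
print(has_length("0770090012", 11))    # only 10 characters, so rejected
print(has_length("CV36 7TP", 5, 8))    # between 5 and 8 characters
```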
AS & A Level Information Technology Chapter 1: Data Processing and Information
Lookup Check
- a car showroom might put the car models into a lookup list
- a vet might list the most popular types of animals that they deal with
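A lookup check only accepts a value if it appears in a list of allowed values. The sketch below assumes a made-up list of car models; the names CAR_MODELS and in_lookup_list are hypothetical.

```python
# Hypothetical lookup list for a car showroom.
CAR_MODELS = {"Model A", "Model B", "Model C"}

def in_lookup_list(value, allowed):
    """Return True if the value is one of the allowed entries."""
    return value in allowed

print(in_lookup_list("Model B", CAR_MODELS))
print(in_lookup_list("Modle B", CAR_MODELS))  # a typing mistake is rejected
```

In a real system the list would usually be shown as a drop-down box so that the user picks a value instead of typing it.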
Picture/Format Check
You may see this validation technique referred to as either a picture check or a format
check; they are the same thing.
Think about a postcode. The majority of postcodes look something like this:
CV36 7TP
WR14 5WB
Replace each letter in either of those examples with L and each number with N and you
will end up with:
LLNN NLL
This means that you can set up a picture/format check for something like a postcode field
to ensure that a letter isn't entered where a number should be or a number in place of a
letter.
Example 2
A National Insurance number must be in the form XX 99 99 99 X. The first two and the
last characters must be letters. The other six characters are numbers. Any entry that does
not match this format will be rejected.
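The two pictures described above (LLNN NLL for the postcode and XX 99 99 99 X for the National Insurance number) can be written as regular expressions. This is a simplified sketch: the names are made up, real UK postcodes come in several other layouts, and real National Insurance numbers have extra rules that are not checked here.

```python
import re

# LLNN NLL: two letters, two digits, a space, one digit, two letters.
POSTCODE_PICTURE = re.compile(r"[A-Z]{2}[0-9]{2} [0-9][A-Z]{2}")
# XX 99 99 99 X: two letters, three pairs of digits, one letter.
NI_PICTURE = re.compile(r"[A-Z]{2} [0-9]{2} [0-9]{2} [0-9]{2} [A-Z]")

def matches_picture(text, picture):
    """Return True only if the whole of text fits the picture."""
    return picture.fullmatch(text) is not None

print(matches_picture("CV36 7TP", POSTCODE_PICTURE))
print(matches_picture("CV3S 7TP", POSTCODE_PICTURE))  # letter where a digit should be
print(matches_picture("AB 12 34 56 C", NI_PICTURE))
```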
Presence Check
There might be an important piece of data that you want to make sure is always stored.
For example, a school will always want to know an emergency contact number, and a video
rental store might always want to know a customer's address.
A presence check makes sure that a critical field cannot be left blank; it must be filled in. If
someone tries to leave the field blank then an error message will appear and they won't be
able to progress to another record or save any other data which they have entered.
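A presence check can be sketched like this. The function name is_present is an assumption for the example; it treats a field containing only spaces as blank.

```python
def is_present(value):
    """Return True if the field has been filled in with something other than spaces."""
    return value is not None and str(value).strip() != ""

print(is_present("01234 567890"))
print(is_present(""))
print(is_present("   "))  # spaces alone still count as blank
```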
Consistency check
A consistency check is a type of logical check that confirms the data has been entered in a
logically consistent way. It checks that data across two fields is consistent.
An example is checking if the delivery date is after the shipping date for a parcel.
When entering the gender of ‘M’ or ‘F’, a consistency check will prevent ‘F’ from being
entered if the title is ‘Mr’ and will prevent ‘M’ from being entered if the title is ‘Mrs’ or
‘Miss’.
When entering data about dispatching products, it would not be possible to mark an item
as being dispatched until after it has been packaged.
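Two of the examples above can be sketched in Python. The function names and the small title table are made up for illustration; titles that are not in the table are simply accepted.

```python
from datetime import date

def dates_consistent(shipping_date, delivery_date):
    """The delivery date must come after the shipping date."""
    return delivery_date > shipping_date

def title_gender_consistent(title, gender):
    """Check that the gender entered agrees with the title entered."""
    expected = {"Mr": "M", "Mrs": "F", "Miss": "F"}
    # Titles not listed (for example 'Dr') do not restrict the gender.
    return expected.get(title, gender) == gender

print(dates_consistent(date(2024, 3, 1), date(2024, 3, 4)))
print(dates_consistent(date(2024, 3, 4), date(2024, 3, 1)))
print(title_gender_consistent("Mr", "F"))
```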
Limit check
A limit check is similar to a range check, but the check is only applied to one boundary.
For example, in the UK you are only allowed to drive from the age of 17, but there is no
upper limit. If somebody applying for a driving licence enters an age lower than 17,
this will generate an error message.
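The driving age example can be sketched as a check against a single boundary. The names used are assumptions for this example.

```python
MINIMUM_DRIVING_AGE = 17  # lower boundary only; there is no upper limit

def old_enough_to_drive(age):
    """Limit check: the age must be at least the minimum driving age."""
    return age >= MINIMUM_DRIVING_AGE

print(old_enough_to_drive(16))
print(old_enough_to_drive(17))
print(old_enough_to_drive(95))
```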
Data verification
Verification means checking that the data on the original source document is identical to
the data that you have entered into the system. Verification can be performed in two
ways: double entry and visual check.
Double entry
Think about when you choose a new password: you often have to type it in twice. This lets
the computer check that you have typed it exactly the same both times and have not made a
mistake. It verifies that the first version is correct by matching it against the second
version.
Whilst this can help to identify many mistakes, it is not ideal for large amounts of data
because everything has to be typed twice.
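Double entry comes down to comparing the two versions character by character. A minimal sketch, with a made-up function name and sample passwords:

```python
def double_entry_matches(first_entry, second_entry):
    """Return True only if both entries are exactly the same."""
    return first_entry == second_entry

print(double_entry_matches("S3cret!pass", "S3cret!pass"))
print(double_entry_matches("S3cret!pass", "S3cret!pas"))  # second attempt mistyped
```

Note that if the same mistake is made both times, the two entries still match and the error is not found.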
Visual check
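A visual check means reading through the data that has been entered and comparing it by
eye with the original source document. This saves having to enter the data twice. It can
help pick up errors where data has been entered incorrectly or transposed.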
However, it isn’t always that easy to keep moving your eyes back and forth between a
monitor and a paper copy. Also, if you are tired or your eyes feel 'blurry' then you might
miss errors. An alternative method is to print out the data entered and compare the
printout side by side with the source document.
Another problem is that the person who is checking that the data has been entered
correctly may be the same person who entered it. It is very easy for them to overlook
their own mistakes. A possible way around this is to get somebody else to do the
check.
Parity check
A parity bit is a check bit which is added to a block of data for error detection purposes. It
is used to validate the integrity of the data. The parity bit is set to either 0 or 1 so that
the number of 1s in the message block is either even or odd, depending upon the type of
parity. A parity check is only reliable for detecting single-bit errors.
• Even Parity − Here the total number of 1-bits in the message, including the parity bit, is made even.
• Odd Parity − Here the total number of 1-bits in the message, including the parity bit, is made odd.
If this byte is using even parity, then the parity bit needs to be 0 since there is already an
even number of 1-bits (in this case, 4).
If odd parity is being used, then the parity bit needs to be 1 to make the number of 1-bits
odd.
If a byte has been transmitted from ‘A’ to ‘B’, and even parity is used, an error would be
flagged if the byte now had an odd number of 1-bits at the receiver’s end.
In this case, the receiver’s byte has three 1-bits, which means it now has odd parity whilst
the byte from the sender had even parity (four 1-bits). This clearly means an error has
occurred during the transmission of the data.
Parity bits only check to see if an error occurred during data transmission. They do not
correct the error. If an error occurs, then the data must be sent again.
Parity checks can find an error when a single bit is transmitted incorrectly, but there are
occasions when a parity check would not find an error if more than one bit is transmitted
incorrectly.
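The behaviour described above can be sketched in Python, with bits held as a string of 0s and 1s. The function names and the seven data bits 1011010 are made up for this example; they are not the byte from the original diagram.

```python
def parity_bit(data_bits, even=True):
    """Work out the parity bit to add to a string of data bits."""
    ones = data_bits.count("1")
    if even:
        return "0" if ones % 2 == 0 else "1"
    return "1" if ones % 2 == 0 else "0"

def passes_parity(received_bits, even=True):
    """Check the received bits (data plus parity bit) against the agreed parity."""
    ones = received_bits.count("1")
    return ones % 2 == 0 if even else ones % 2 == 1

data = "1011010"                  # four 1-bits
sent = data + parity_bit(data)    # even parity, so the parity bit is 0
print(sent)                       # 10110100

print(passes_parity("10110100"))  # True: arrived unchanged
print(passes_parity("10110000"))  # False: one bit changed, error detected
print(passes_parity("10100000"))  # True: two bits changed, error NOT detected
```

The last line shows the weakness mentioned above: when two bits change, the number of 1-bits is still even, so the check is passed even though the data is wrong.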
Checksum
A checksum is a value used to verify the integrity of a file or a data transfer. In other words,
it is a sum that checks the validity of data. Checksums are typically used to compare two
sets of data to make sure they are the same. Some common applications include verifying a
disk image or checking the integrity of a downloaded file. If the checksums don't match
those of the original files, the data may have been altered or corrupted.
A checksum can be calculated in many different ways, using different algorithms. For
example, a very simple checksum could be the number of bytes in a file. Just as a parity
check can be deceived when bits are transposed, this type of checksum would not be able
to notice if two or more bytes were swapped; the data would be different, but the
checksum would be the same.
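The weakness of the byte-count checksum can be seen in a short sketch. The second function, which adds up the byte values, is not from the text above; it is included only as an assumed example of a slightly stronger checksum. Both function names and the sample data are made up.

```python
def byte_count_checksum(data):
    """Very simple checksum: the number of bytes."""
    return len(data)

def sum_checksum(data):
    """Add up every byte value and keep the result within one byte (0 to 255)."""
    return sum(data) % 256

original = b"1234"
swapped = b"1243"   # last two bytes swapped
changed = b"1235"   # last byte changed

# Neither checksum notices the swapped bytes.
print(byte_count_checksum(original) == byte_count_checksum(swapped))
print(sum_checksum(original) == sum_checksum(swapped))

# Only the sum notices the changed byte.
print(byte_count_checksum(original) == byte_count_checksum(changed))
print(sum_checksum(original) == sum_checksum(changed))
```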
Common protocols that use checksums are the Transmission Control Protocol (TCP) and the
User Datagram Protocol (UDP). While checksum values that do not match can signal that
something went wrong during transmission, a few factors can cause this to happen, such as:
Different algorithms can be used to generate the checksum. Popular algorithms include
SHA-256, SHA-1 and MD5.
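Python's standard hashlib module provides these algorithms. The sketch below uses SHA-256; the function name sha256_of and the sample data are assumptions for this example.

```python
import hashlib

def sha256_of(data):
    """Return the SHA-256 checksum of some bytes as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

original = b"1234"
swapped = b"1243"   # last two bytes swapped

print(sha256_of(original))
print(sha256_of(swapped))
print(sha256_of(original) == sha256_of(swapped))  # False: the swap is detected
```

To check a downloaded file, the checksum calculated from the file's bytes is compared with the checksum published by the provider of the file.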
Hash total
A hash total is a method for ensuring that data in a file have not been altered. It is the
numerical sum of one or more fields in the file, including data not normally used in
calculations, such as account number. When necessary, the hash total is recalculated and
compared with the original. If data are lost or changed, a mismatch occurs which signals an
error.
Let’s consider a simple example. Sometimes, school examinations staff are asked to do a
statistical analysis of exam results. Here we have a small extract from the data that might
have been collected.
Normally, the Student ID would be stored as an alphanumeric type, so for the purpose of a
hash check, it would be converted to a number. The hash check involves adding all the
Student IDs together. In this example it would perform the calculation 4762 + 153 + 2539 +
4651 giving us a hash total of 12105.
The data would be transmitted along with the hash total and then the hash total would be
recalculated and compared with the original to make sure it was the same and that the
data had been transmitted correctly.
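The Student ID calculation above can be sketched as follows. The IDs are the four from the example, stored as text; the function name hash_total and the changed ID 2593 are assumptions for this sketch.

```python
def hash_total(student_ids):
    """Convert each ID from text to a number and add them all together."""
    return sum(int(student_id) for student_id in student_ids)

sent_ids = ["4762", "153", "2539", "4651"]
sent_total = hash_total(sent_ids)
print(sent_total)  # 12105

# At the receiving end the total is recalculated and compared.
received_ids = ["4762", "153", "2593", "4651"]  # one ID altered in transit
print(hash_total(received_ids) == sent_total)   # False, so an error is signalled
```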
Control total
A control total is calculated in exactly the same way as a hash total, but is only carried out
on numeric fields. There is no need to convert alphanumeric data to numeric. The value
produced is a meaningful one which has a use.