COA Unit 4 5 6 7
Stack Organization
Instruction Formats
Addressing Modes
Program Control
Introduction:
The main part of the computer that performs the bulk of data-processing operations is called the central processing unit (CPU).
The CPU is made up of three major parts, as shown in Fig. 8-1.
The register set stores intermediate data used during the execution of the instructions.
The arithmetic logic unit (ALU) performs the required microoperations for executing the instructions.
The control unit supervises the transfer of information among the registers and instructs the ALU as to which
operation to perform.
1. Stack Organization:
A stack, or last-in, first-out (LIFO) list, is a useful feature that is included in the CPU of most computers.
Stack:
o A stack is a storage device that stores information in such a manner that the item stored last is the first
item retrieved.
The operation of a stack can be compared to a stack of trays. The last tray placed on top of the stack is the first
to be taken off.
In a computer, the stack is essentially a memory unit with an address register that can only count (after an initial value is loaded into it).
The register that holds the address for the stack is called a stack pointer (SP). It always points at the top item in
the stack.
The two operations that are performed on a stack are insertion and deletion of items.
The operation of insertion is called PUSH.
The operation of deletion is called POP.
These operations are simulated by incrementing and decrementing the stack pointer register (SP).
Register Stack:
A stack can be placed in a portion of a large memory or it can be organized as a collection of a finite number of
memory words or registers.
The below figure shows the organization of a 64-word register stack.
The stack pointer register SP contains a binary number whose value is equal to the address of the word that is currently on top of the stack. Three items are placed in the stack: A, B, and C, in that order.
In the above figure C is on top of the stack, so the content of SP is 3.
To remove the top item, the stack is popped by reading the memory word at address 3 and decrementing the content of SP.
Now the top of the stack is B, so that the content of SP is 2.
Similarly for inserting the new item, the stack is pushed by incrementing SP and writing a word in the next-
higher location in the stack.
In a 64-word stack, the stack pointer contains 6 bits, because 2^6 = 64.
Since SP has only six bits, it cannot hold a number greater than 63 (111111 in binary).
When 63 is incremented by 1, the result is 0, since 111111 + 1 = 1000000 in binary, but SP can accommodate only the six least significant bits.
The one-bit register FULL is set to 1 when the stack is full.
Similarly, when 000000 is decremented by 1, the result is 111111, and the one-bit register EMTY is set to 1 when the stack is empty of items.
DR is the data register that holds the binary data to be written into or read out of the stack.
PUSH:
Initially, SP is cleared to 0, EMTY is set to 1, and FULL is cleared to 0, so that SP points to the word at address 0
and the stack is marked empty and not full.
If the stack is not full (if FULL = 0), a new item is inserted with a push operation.
The push operation is implemented with the following sequence of microoperations:

SP ← SP + 1
M[SP] ← DR
If (SP = 0) then (FULL ← 1)
EMTY ← 0
The stack pointer is incremented so that it points to the address of the next-higher word.
A memory write operation inserts the word from DR onto the top of the stack.
The first item stored in the stack is at address 1.
The last item is stored at address 0.
If SP reaches 0, the stack is full of items, so FULL is set to 1.
This condition is reached if the top item prior to the last push was in location 63 and, after incrementing SP, the last item is stored in location 0.
Once an item is stored in location 0, there are no more empty registers in the stack, so EMTY is cleared to 0.
POP:
A new item is deleted from the stack if the stack is not empty (if EMTY = 0).
The pop operation consists of the following sequence of microoperations:

DR ← M[SP]
SP ← SP − 1
If (SP = 0) then (EMTY ← 1)
FULL ← 0
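The push and pop behavior of the register stack described above can be sketched in Python. This is a minimal illustration, not a hardware model: the class name and the modulo-64 wraparound of SP are assumptions made for the sketch.

```python
class RegisterStack:
    """Sketch of a 64-word register stack with FULL/EMTY flags."""

    def __init__(self, size=64):
        self.size = size
        self.mem = [0] * size
        self.sp = 0       # 6-bit stack pointer, wraps modulo 64
        self.full = 0
        self.emty = 1     # stack starts empty

    def push(self, dr):
        if self.full:
            raise OverflowError("stack full")
        self.sp = (self.sp + 1) % self.size   # SP <- SP + 1
        self.mem[self.sp] = dr                # M[SP] <- DR
        if self.sp == 0:                      # wrapped past 63: stack is full
            self.full = 1
        self.emty = 0

    def pop(self):
        if self.emty:
            raise IndexError("stack empty")
        dr = self.mem[self.sp]                # DR <- M[SP]
        self.sp = (self.sp - 1) % self.size   # SP <- SP - 1
        if self.sp == 0:                      # first item popped: stack is empty
            self.emty = 1
        self.full = 0
        return dr
```

Note how the full condition is detected exactly as in the text: the 64th push wraps SP from 63 back to 0.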
Memory Stack:
In the above discussion a stack exists as a stand-alone unit. But in the CPU, a stack is usually implemented by assigning a portion of memory to stack operations and using a processor register as the stack pointer.
The below figure shows a portion of computer memory partitioned into three segments: program, data, and stack.
The program counter PC points at the address of the next instruction in the program.
The address register AR points at an array of data.
A new item is inserted with a push operation as follows:

SP ← SP − 1
M[SP] ← DR

The stack pointer is decremented so that it points at the address of the next word. A memory write operation inserts the word from DR into the top of the stack. A new item is deleted with a pop operation as follows:

DR ← M[SP]
SP ← SP + 1

The top item is read from the stack into DR. The stack pointer is then incremented to point at the next item in the stack.
Most computers do not provide hardware to check for stack overflow (full stack) or underflow (empty stack).
The stack limits can be checked by using processor registers:
o one to hold the upper limit (3000 in this case)
o Other to hold the lower limit (4001 in this case).
After a push operation, SP is compared with the upper-limit register, and after a pop operation, SP is compared with the lower-limit register.
The two microoperations needed for either the push or the pop are (1) an access to memory through SP, and (2) updating SP.
The advantage of a memory stack is that the CPU can refer to it without having to specify an address, since the address is always available and automatically updated in the stack pointer.
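A minimal Python sketch of such a memory stack, which grows toward lower addresses, can use the limit values 3000 and 4001 from the notes. The class and attribute names are hypothetical, chosen only for illustration.

```python
UPPER_LIMIT = 3000   # upper-limit register value (from the notes' example)
LOWER_LIMIT = 4001   # lower-limit register value

class MemoryStack:
    """Sketch of a stack kept in a region of memory, growing downward."""

    def __init__(self, memory):
        self.m = memory          # memory modeled as a dict: address -> word
        self.sp = LOWER_LIMIT    # SP starts just below the stack region
        self.full = False
        self.empty = True

    def push(self, dr):
        self.sp -= 1                           # SP <- SP - 1
        self.m[self.sp] = dr                   # M[SP] <- DR
        self.full = (self.sp == UPPER_LIMIT)   # compare SP with upper limit
        self.empty = False

    def pop(self):
        dr = self.m.get(self.sp, 0)            # DR <- M[SP]
        self.sp += 1                           # SP <- SP + 1
        self.empty = (self.sp == LOWER_LIMIT)  # compare SP with lower limit
        self.full = False
        return dr
```

The limit comparisons after each operation mirror the check against the upper-limit and lower-limit registers described above.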
Reverse Polish notation, combined with a stack arrangement of registers, is the most efficient way known for
evaluating arithmetic expressions.
This procedure is employed in some electronic calculators and also in some computers.
The following numerical example may clarify this procedure. Consider the arithmetic expression
(3 * 4) + (5 * 6)
In reverse Polish notation, it is expressed as
3 4 * 5 6 * +
Each box represents one stack operation and the arrow always points to the top of the stack.
Scanning the expression from left to right, we encounter two operands.
First the number 3 is pushed into the stack, then the number 4.
The next symbol is the multiplication operator *.
This causes a multiplication of the two topmost items in the stack.
The stack is then popped and the product is placed on top of the stack, replacing the two original operands.
Next we encounter the two operands 5 and 6, so they are pushed into the stack.
The stack operation results from the next * replaces these two numbers by their product.
The last operation causes an arithmetic addition of the two topmost numbers in the stack to produce the final
result of 42.
2. Instruction Formats:
The format of an instruction is usually depicted in a rectangular box symbolizing the bits of the instruction as
they appear in memory words or in a control register.
The bits of the instruction are divided into groups called fields.
The most common fields found in instruction formats are:
1. An operation code field that specifies the operation to be performed.
2. An address field that designates a memory address or a processor register.
3. A mode field that specifies the way the operand or the effective address is determined.
Computers may have instructions of several different lengths containing a varying number of addresses.
The number of address fields in the instruction format of a computer depends on the internal organization of its registers.
Most computers fall into one of three types of CPU organizations:
1. Single accumulator organization.
2. General register organization.
3. Stack organization.
In an accumulator type organization all the operations are performed with an implied accumulator register.
The instruction format in this type of computer uses one address field.
For example, the instruction that specifies an arithmetic addition is defined by an assembly language instruction as
ADD X
Where X is the address of the operand. The ADD instruction in this case results in the operation AC ← AC + M[X]. AC is the accumulator register and M[X] symbolizes the memory word located at address X.
General register organization:
The instruction format in this type of computer needs three register address fields.
Thus the instruction for an arithmetic addition may be written in an assembly language as
ADD R1, R2, R3
to denote the operation R1 ← R2 + R3. The number of address fields in the instruction can be reduced from three to two if the destination register is the same as one of the source registers.
Thus the instruction ADD R1, R2 would denote the operation R1 ← R1 + R2. Only the register addresses for R1 and R2 need be specified in this instruction.
General register-type computers employ two or three address fields in their instruction format.
Each address field may specify a processor register or a memory word.
An instruction symbolized by ADD R1, X would specify the operation R1 ← R1 + M[X].
It has two address fields, one for register R1 and the other for the memory address X.
Stack organization:
The stack-organized CPU has PUSH and POP instructions which require an address field.
Thus the instruction PUSH X will push the word at address X to the top of the stack.
The stack pointer is updated automatically.
Operation-type instructions do not need an address field in stack-organized computers.
This is because the operation is performed on the two items that are on top of the stack.
The instruction ADD in a stack computer consists of an operation code only with no address field.
This operation has the effect of popping the two top numbers from the stack, adding the numbers, and
pushing the sum into the stack.
There is no need to specify operands with an address field since all operands are implied to be in the stack.
To show the influence of the number of addresses on computer programs, we will evaluate the arithmetic statement
X = (A + B) * (C + D)
using zero, one, two, or three address instructions, and using the symbols ADD, SUB, MUL, and DIV for the four arithmetic operations; MOV for transfer-type operations; and LOAD and STORE for transfers to and from memory and the AC register.
We assume that the operands are in memory addresses A, B, C, and D, that the result must be stored in memory at address X, and that the CPU has general-purpose registers R1, R2, R3, and R4.
Three-address instruction formats can use each address field to specify either a processor register or a
memory operand.
The program in assembly language that evaluates X = (A + B) * (C + D) is shown below, together with comments that explain the register transfer operation of each instruction.

ADD R1, A, B    R1 ← M[A] + M[B]
ADD R2, C, D    R2 ← M[C] + M[D]
MUL X, R1, R2   M[X] ← R1 * R2
Two-address instruction formats use each address field to specify either a processor register or a memory word.
The program to evaluate X = (A + B) * (C + D) is as follows:

MOV R1, A    R1 ← M[A]
ADD R1, B    R1 ← R1 + M[B]
MOV R2, C    R2 ← M[C]
ADD R2, D    R2 ← R2 + M[D]
MUL R1, R2   R1 ← R1 * R2
MOV X, R1    M[X] ← R1
The MOV instruction moves or transfers the operands to and from memory and processor registers.
The first symbol listed in an instruction is assumed to be both a source and the destination where the result of the operation is transferred.
One-address instructions use an implied accumulator (AC) register for all data manipulation.
For multiplication and division there is a need for a second register. But for the basic discussion we will
neglect the second register and assume that the AC contains the result of all operations.
The program to evaluate X = (A + B) * (C + D) is

LOAD A     AC ← M[A]
ADD B      AC ← AC + M[B]
STORE T    M[T] ← AC
LOAD C     AC ← M[C]
ADD D      AC ← AC + M[D]
MUL T      AC ← AC * M[T]
STORE X    M[X] ← AC
All operations are done between the AC register and a memory operand.
T is the address of a temporary memory location required for storing the intermediate result.
A stack-organized computer does not use an address field for the instructions ADD and MUL.
The PUSH and POP instructions, however, need an address field to specify the operand that
communicates with the stack.
The following program shows how X = (A + B) * (C + D) will be written for a stack-organized computer. (TOS stands for top of stack.)

PUSH A    TOS ← A
PUSH B    TOS ← B
ADD       TOS ← (A + B)
PUSH C    TOS ← C
PUSH D    TOS ← D
ADD       TOS ← (C + D)
MUL       TOS ← (C + D) * (A + B)
POP X     M[X] ← TOS
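A hypothetical interpreter for zero-address programs of this kind can be sketched in Python; the tuple encoding of instructions and the sample operand values are assumptions made for illustration.

```python
def run_stack_machine(program, memory):
    """Interpret zero-address PUSH/POP/ADD/MUL instructions (sketch)."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":
            stack.append(memory[instr[1]])    # TOS <- M[addr]
        elif op == "POP":
            memory[instr[1]] = stack.pop()    # M[addr] <- TOS
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)               # operands are implied on the stack
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return memory

# Sample operand values (assumed): A=2, B=3, C=4, D=5
mem = {"A": 2, "B": 3, "C": 4, "D": 5}
prog = [("PUSH", "A"), ("PUSH", "B"), ("ADD",),
        ("PUSH", "C"), ("PUSH", "D"), ("ADD",),
        ("MUL",), ("POP", "X")]
run_stack_machine(prog, mem)   # mem["X"] = (2 + 3) * (4 + 5) = 45
```

Note that ADD and MUL carry no address field at all; only PUSH and POP name a memory operand, exactly as the text describes.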
To evaluate arithmetic expressions in a stack computer, it is necessary to convert the expression into
reverse Polish notation.
The name "zero-address" is given to this type of computer because of the absence of an address field in the computational instructions.
RISC Instructions:
The instruction set of a typical RISC processor uses only load and store instructions for communicating between memory and the CPU.
All other instructions are executed within the registers of CPU without referring to memory.
LOAD and STORE instructions have one memory address and one register address, and computational-type instructions have three addresses, with all three specifying processor registers.
The following is a program to evaluate X = (A + B) * (C + D):

LOAD R1, A       R1 ← M[A]
LOAD R2, B       R2 ← M[B]
LOAD R3, C       R3 ← M[C]
LOAD R4, D       R4 ← M[D]
ADD R1, R1, R2   R1 ← R1 + R2
ADD R3, R3, R4   R3 ← R3 + R4
MUL R1, R1, R3   R1 ← R1 * R3
STORE X, R1      M[X] ← R1
The load instructions transfer the operands from memory to CPU registers.
The add and multiply operations are executed with data in the registers without accessing memory.
The result of the computation is then stored in memory with a store instruction.
3. Addressing Modes
The way the operands are chosen during program execution depends on the addressing mode of the instruction.
Computers use addressing mode techniques for the purpose of accommodating one or both of the following
provisions:
o To give programming versatility to the user by providing such facilities as pointers to memory, counters
for loop control, indexing of data, and program relocation.
o To reduce the number of bits in the addressing field of the instruction
Most addressing modes modify the address field of the instruction; however, there are two modes that need no address field at all: the implied and immediate modes.
Implied Mode:
In this mode the operands are specified implicitly in the definition of the instruction.
For example, the instruction "complement accumulator" is an implied-mode instruction because the
operand in the accumulator register is implied in the definition of the instruction.
All register reference instructions that use an accumulator are implied mode instructions.
Zero-address instructions in a stack-organized computer are implied-mode instructions.
Immediate Mode:
In this mode the operand is specified in the instruction itself. In other words, an immediate-mode instruction has an operand field rather than an address field.
Immediate-mode instructions are useful for initializing registers to a constant value.
When the address field specifies a processor register, the instruction is said to be in the register mode.
Register Mode:
In this mode the operands are in registers that reside within the CPU.
The particular register is selected from a register field in the instruction.
Register Indirect Mode:
In this mode the instruction specifies a register in the CPU whose contents give the address of the operand in memory.
In other words, the selected register contains the address of the operand rather than the operand itself.
The advantage of a register indirect mode instruction is that the address field of the instruction uses fewer bits to select a register than would have been required to specify a memory address directly.
Autoincrement or Autodecrement Mode:
This is similar to the register indirect mode except that the register is incremented or decremented after (or before) its value is used to access memory.
The address field of an instruction is used by the control unit in the CPU to obtain the operand from memory.
Sometimes the value given in the address field is the address of the operand, but sometimes it is just an
address from which the address of the operand is calculated.
The two basic modes of addressing used in the CPU are the direct and indirect address modes.
Direct Address Mode:
In this mode the effective address is equal to the address part of the instruction.
The operand resides in memory and its address is given directly by the address field of the instruction.
In a branch-type instruction the address field specifies the actual branch address.
Indirect Address Mode:
In this mode the address field of the instruction gives the address where the effective address is stored in memory.
Control fetches the instruction from memory and uses its address part to access memory again to read the effective address.
A few addressing modes require that the address field of the instruction be added to the content of a specific
register in the CPU.
The effective address in these modes is obtained from the following computation:

effective address = address part of instruction + content of CPU register
Relative Address Mode:
In this mode the content of the program counter is added to the address part of the instruction in order to obtain the effective address.
Indexed Addressing Mode:
In this mode the content of an index register is added to the address part of the instruction to obtain the effective address.
An index register is a special CPU register that contains an index value.
Base Register Addressing Mode:
In this mode the content of a base register is added to the address part of the instruction to obtain the effective address.
This is similar to the indexed addressing mode except that the register is now called a base register instead of an index register.
To show the differences between the various modes, we will show the effect of the addressing modes on the
instruction defined in Fig. 8-7.
The two-word instruction at addresses 200 and 201 is a "load to AC" instruction with an address field equal to 500.
The first word of the instruction specifies the operation code and mode, and the second word specifies the
address part.
PC has the value 200 for fetching this instruction. The content of processor register R1 is 400, and the content of
an index register XR is 100.
AC receives the operand after the instruction is executed.
In the direct address mode the effective address is the address part of the instruction, 500, and the operand to be loaded into AC is 800 (the content of memory address 500).
In the immediate mode the second word of the instruction is taken as the operand rather than an address, so
500 is loaded into AC.
In the indirect mode the effective address is stored in memory at address 500. Therefore, the effective address
is 800 and the operand is 300.
In the relative mode the effective address is 202 + 500 = 702 and the operand is 325. (The value in PC after the fetch phase and during the execute phase is 202.)
In the index mode the effective address is XR+ 500 = 100 + 500 = 600 and the operand is 900.
In the register mode the operand is in R1 and 400 is loaded into AC.
In the register indirect mode the effective address is 400, equal to the content of R1 and the operand loaded
into AC is 700.
The auto-increment mode is the same as the register indirect mode except that R1 is incremented to 401 after
the execution of the instruction.
The auto-decrement mode decrements R1 to 399 prior to the execution of the instruction. The operand loaded
into AC is now 450.
Table 8-4 lists the values of the effective address and the operand loaded into AC for the nine addressing
modes.
4. Data Transfer and Manipulation:
Data transfer instructions move data from one place in the computer to another without changing the data
content.
The most common transfers are between memory and processor registers, between processor registers and
input or output, and between the processor registers themselves.
Table 8-5 gives a list of eight data transfer instructions used in many computers.
The load instruction has been used mostly to designate a transfer from memory to a processor register, usually
an accumulator.
The store instruction designates a transfer from a processor register into memory.
The move instruction has been used in computers with multiple CPU registers to designate a transfer from one
register to another and also between CPU registers and memory or between two memory words.
The exchange instruction swaps information between two registers or a register and a memory word.
The input and output instructions transfer data among processor registers and input or output terminals.
The push and pop instructions transfer data between processor registers and a memory stack.
Different computers use different mnemonic symbols to differentiate the addressing modes.
As an example, consider the load to accumulator instruction when used with eight different addressing modes.
Table 8-6 shows the recommended assembly language convention and actual transfer accomplished in each
case
Data manipulation instructions perform operations on data and provide the computational capabilities for the
computer.
The data manipulation instructions in a typical computer are usually divided into three basic types:
1. Arithmetic instructions
2. Logical and bit manipulation instructions
3. Shift instructions
1. Arithmetic instructions
The four basic arithmetic operations are addition, subtraction, multiplication and division.
Most computers provide instructions for all four operations.
Some small computers have only addition and possibly subtraction instructions. Multiplication and division must then be generated by means of software subroutines.
A list of typical arithmetic instructions is given in Table 8-7.
The increment instruction adds 1 to the value stored in a register or memory word.
A number with all 1's, when incremented, produces a number with all 0's.
The decrement instruction subtracts 1 from a value stored in a register or memory word.
A number with all 0's, when decremented, produces a number with all 1's.
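This wraparound behavior can be illustrated with a short Python sketch for an assumed 8-bit word; the helper names are hypothetical.

```python
MASK = 0xFF  # 8-bit word, for illustration

def increment(x):
    """Add 1 within an 8-bit word: all 1's wraps around to all 0's."""
    return (x + 1) & MASK

def decrement(x):
    """Subtract 1 within an 8-bit word: all 0's wraps around to all 1's."""
    return (x - 1) & MASK
```

Masking to the word width is what produces the wraparound described in the text.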
The add, subtract, multiply, and divide instructions may use different types of data.
The data type, assumed to reside in processor registers during the execution of these arithmetic operations, is defined by an operation code.
An arithmetic instruction may specify fixed-point or floating-point data, binary or decimal data, single-precision
or double-precision data.
The mnemonics for three add instructions that specify different data types are shown below.
ADDI Add two binary integer numbers
ADDF Add two floating-point numbers
ADDD Add two decimal numbers in BCD
A special carry flip-flop is used to store the carry from an operation.
The instruction "add with carry" performs the addition on two operands plus the value of the carry from the previous computation.
Similarly, the "subtract with borrow" instruction subtracts two words and a borrow that may have resulted from a previous subtract operation.
The negate instruction forms the 2's complement of a number, effectively reversing the sign of an integer when it is represented in signed-2's complement form.
3. Shift Instructions:
Shifts are operations in which the bits of a word are moved to the left or right.
The bit shifted in at the end of the word determines the type of shift used.
Shift instructions may specify logical shifts, arithmetic shifts, or rotate-type operations.
In either case the shift may be to the right or to the left.
Table 8-9 lists four types of shift instructions.
The arithmetic shift-right instruction is a shift-right operation with the end (sign) bit remaining the same.
The arithmetic shift-left instruction inserts 0 into the end position and is identical to the logical shift-left instruction.
The rotate instructions produce a circular shift. Bits shifted out at one end of the word are not lost as in a logical shift but are circulated back into the other end.
The rotate through carry instruction treats a carry bit as an extension of the register whose word is being
rotated.
Thus a rotate-left-through-carry instruction transfers the carry bit into the rightmost bit position of the register, transfers the leftmost bit position into the carry, and at the same time shifts the entire register to the left.
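The shift and rotate variants can be sketched in Python for an assumed 8-bit word; the function names are illustrative.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1

def logical_shl(x):
    return (x << 1) & MASK                          # 0 enters the LSB

def logical_shr(x):
    return (x & MASK) >> 1                          # 0 enters the MSB

def arith_shr(x):
    sign = x & (1 << (WIDTH - 1))                   # sign bit is preserved
    return ((x & MASK) >> 1) | sign

def rotate_left(x):
    return ((x << 1) | (x >> (WIDTH - 1))) & MASK   # MSB circulates into LSB

def rotate_right(x):
    return ((x >> 1) | (x << (WIDTH - 1))) & MASK   # LSB circulates into MSB

def rotate_left_carry(x, c):
    """Rotate left through carry: carry enters the LSB, MSB becomes the new carry."""
    new_c = (x >> (WIDTH - 1)) & 1
    return ((x << 1) | c) & MASK, new_c
```

The difference between the logical and arithmetic right shifts is only the bit that enters at the left end, exactly as the text states.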
5. Program Control:
Program control instructions specify conditions for altering the content of the program counter.
The change in value of the program counter as a result of the execution of a program control instruction causes
a break in the sequence of instruction execution.
This instruction provides control over the flow of program execution and a capability for branching to different
program segments.
Some typical program control instructions are listed in Table 8.10.
The ALU circuit in the CPU has a status register for storing the status bit conditions.
Status bits are also called condition-code bits or flag bits.
Figure 8-8 shows the block diagram of an 8-bit ALU with a 4-bit status register.
The four status bits are symbolized by C, S, Z, and V. The bits are set or cleared as a result of an
operation performed in the ALU.
o Bit C (carry) is set to 1 if the end carry C8 is 1. It is cleared to 0 if the carry is 0.
o S (sign) is set to 1 if the highest-order bit F7 is 1. It is set to 0 if the bit is 0.
o Bit Z (zero) is set to 1 if the output of the ALU contains all 0's. It is cleared to 0 otherwise. In other words, Z = 1 if the output is zero and Z = 0 if the output is not zero.
o Bit V (overflow) is set to 1 if the exclusive-OR of the last two carries is equal to 1, and cleared to 0 otherwise.
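The derivation of the four status bits can be sketched in Python for an 8-bit addition. The function name, and the way the carry into the sign position is recomputed from the low-order bits, are illustrative choices.

```python
def alu_add(a, b, width=8):
    """Add two words and derive the C, S, Z, V status bits (sketch)."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask)
    f = total & mask                              # ALU output F
    c = (total >> width) & 1                      # C: end carry (C8)
    s = (f >> (width - 1)) & 1                    # S: highest-order bit (F7)
    z = 1 if f == 0 else 0                        # Z: output is all 0's
    # carry into the sign position, from adding the low width-1 bits
    c_into_sign = (((a & (mask >> 1)) + (b & (mask >> 1))) >> (width - 1)) & 1
    v = c ^ c_into_sign                           # V: XOR of the last two carries
    return f, c, s, z, v
```

For example, adding 0x70 and 0x70 sets V (the signed result exceeds +127), while 0xFF + 0x01 sets C and Z but not V.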
The above status bits are used in conditional jump and branch instructions.
Subroutine Call and Return:
A subroutine is a self-contained sequence of instructions that performs a given computational task.
The most common names used are call subroutine, jump to subroutine, branch to subroutine, or
branch and save return address.
A subroutine is executed by performing two operations
(1) The address of the next instruction available in the program counter (the return address) is stored
in a temporary location so the subroutine knows where to return
(2) Control is transferred to the beginning of the subroutine.
The last instruction of every subroutine, commonly called return from subroutine, transfers the return address from the temporary location into the program counter.
Different computers use a different temporary location for storing the return address.
The most efficient way is to store the return address in a memory stack.
The advantage of using a stack for the return address is that when a succession of subroutines is
called, the sequential return addresses can be pushed into the stack.
A subroutine call is implemented with the following microoperations:

SP ← SP − 1
M[SP] ← PC
PC ← effective address

The instruction that returns from the last subroutine is implemented by the microoperations:

PC ← M[SP]
SP ← SP + 1
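The call and return behavior, using a memory stack for return addresses, can be sketched in Python; the dictionary-based CPU state is an illustrative assumption.

```python
def call(state, effective_address):
    """CALL: push the return address, then branch."""
    state["SP"] -= 1                        # SP <- SP - 1
    state["M"][state["SP"]] = state["PC"]   # M[SP] <- PC  (save return address)
    state["PC"] = effective_address         # PC <- effective address

def ret(state):
    """RET: pop the return address back into PC."""
    state["PC"] = state["M"][state["SP"]]   # PC <- M[SP]
    state["SP"] += 1                        # SP <- SP + 1

cpu = {"PC": 202, "SP": 4001, "M": {}}
call(cpu, 3500)    # branch to 3500; return address 202 is saved on the stack
ret(cpu)           # PC is restored to 202
```

Because the return addresses sit on a stack, nested calls unwind in last-in, first-out order, which is exactly why a stack is the most efficient place to keep them.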
Program Interrupt:
Program interrupt refers to the transfer of program control from a currently running program to another service program as a result of an externally or internally generated request.
The interrupt procedure is similar to a subroutine call except for three variations:
o The interrupt is initiated by an internal or external signal.
o Address of the interrupt service program is determined by the hardware.
o An interrupt procedure usually stores all the information necessary to define the state of the CPU, rather than storing only the PC content.
Types of interrupts:
There are three major types of interrupts that cause a break in the normal execution of a program.
They can be classified as
o External interrupts:
These come from input-output (I/O) devices, from a timing device, or from a circuit monitoring the power supply or some other external source.
o Internal interrupts:
These arise from illegal or erroneous use of an instruction or data, such as register overflow or an attempt to divide by zero. Internal interrupts are also called traps.
o Software interrupts:
A software interrupt is initiated by executing a special instruction that behaves like an interrupt rather than a subroutine call.
CISC and RISC:
A computer with a large number of instructions is classified as a complex instruction set computer, abbreviated CISC.
A computer with fewer instructions is classified as a reduced instruction set computer, abbreviated RISC.
CISC Characteristics:
RISC Characteristics:
UNIT-V
INPUT-OUTPUT ORGANIZATION
Peripheral Devices:
The Input / output organization of computer depends upon the size of computer and the
peripherals connected to it. The I/O Subsystem of the computer, provides an efficient mode
of communication between the central system and the outside environment.
Some common peripherals are:
i) Monitor
ii) Keyboard
iii) Mouse
iv) Printer
v) Magnetic tapes
The devices that are under the direct control of the computer are said to be connected
online.
Peripherals connected to a computer need special communication links for interfacing them
with the central processing unit.
The purpose of the communication link is to resolve the differences that exist between the central computer and each peripheral. The major differences are:
1. Peripherals are electromechanical and electromagnetic devices, and their manner of operation is different from that of the CPU and memory, which are electronic devices. Therefore, a conversion of signal values may be required.
2. The data transfer rate of peripherals is usually slower than the transfer rate of the CPU, and consequently, a synchronization mechanism may be needed.
3. Data codes and formats in the peripherals differ from the word format in the CPU and memory.
memory.
4. The operating modes of peripherals are different from each other and must be
controlled so as not to disturb the operation of other peripherals connected to the
CPU.
To resolve these differences, computers include special hardware components between the CPU and the peripherals to supervise and synchronize all input and output transfers.
These components are called Interface Units because they interface between the processor bus and the peripheral devices.
The I/O Bus consists of data lines, address lines and control lines.
The I/O bus from the processor is attached to all peripherals interface.
To communicate with a particular device, the processor places a device address on address
lines.
Each Interface decodes the address and control received from the I/O bus, interprets them for
peripherals and provides signals for the peripheral controller.
It also synchronizes the data flow and supervises the transfer between the peripheral and the processor.
For example, the printer controller controls the paper motion, the print timing, and the selection of printing characters.
The control lines are referred to as I/O commands. The commands are as follows:
Control command- A control command is issued to activate the peripheral and to inform it
what to do.
Status command- A status command is used to test various status conditions in the interface
and the peripheral.
Data Output command- A data output command causes the interface to respond by
transferring data from the bus into one of its registers.
Data Input command- The data input command is the opposite of the data output.
In this case the interface receives one item of data from the peripheral and places it in its buffer register.
I/O Versus Memory Bus
In addition to communicating with I/O, the processor must also communicate with the memory unit. Like the I/O bus, the memory bus contains data, address, and read/write control lines. There are three ways that computer buses can be used to communicate with memory and I/O:
i. Use two separate buses, one for memory and the other for I/O.
ii. Use one common bus for both memory and I/O but separate control lines for each.
iii. Use one common bus for memory and I/O with common control lines.
I/O Processor
In the first method, the computer has independent sets of data, address, and control buses, one for accessing memory and the other for I/O. This is done in computers that provide a separate I/O processor (IOP). The purpose of the IOP is to provide an independent pathway for the transfer of information between external devices and internal memory.
Asynchronous data transfer between two independent units requires that control signals be transmitted between the communicating units to indicate when data is being transferred. Two ways of achieving this are:
i. Strobe Control
ii. Handshaking
Strobe Signal:
The strobe control method of Asynchronous data transfer employs a single control line to
time each transfer. The strobe may be activated by either the source or the destination unit.
In the block diagram fig. (a), the data bus carries the binary information from source to
destination unit. Typically, the bus has multiple lines to transfer an entire byte or word. The
strobe is a single line that informs the destination unit when a valid data word is available.
In the timing diagram of fig. (b), the source unit first places the data on the data bus. The information on the data bus and the strobe signal remain in the active state long enough to allow the destination unit to receive the data.
In the destination-initiated method, the destination unit activates the strobe pulse to inform the source to provide the data. The source responds by placing the requested binary information on the data bus.
The data must be valid and remain on the bus long enough for the destination unit to accept it. Once accepted, the destination unit disables the strobe and the source unit removes the data from the bus.
Disadvantage of Strobe Signal:
The disadvantage of the strobe method is that the source unit that initiates the transfer has no way of knowing whether the destination unit has actually received the data item that was placed on the bus. Similarly, a destination unit that initiates the transfer has no way of knowing whether the source unit has actually placed the data on the bus. The handshaking method solves this problem.
Handshaking:
The handshaking method solves the problem of strobe method by introducing a second
control signal that provides a reply to the unit that initiates the transfer.
Principle of Handshaking:
The basic principle of the two-wire handshaking method of data transfer is as follow:
One control line is in the same direction as the data flow in the bus, from the source to the destination. It is used by the source unit to inform the destination unit whether there is valid data on the bus. The other control line is in the opposite direction, from the destination to the source. It is used by the destination unit to inform the source whether it can accept the data. The sequence of control during the transfer depends on the unit that initiates the transfer.
The sequence of events shows four possible states that the system can be in at any given time.
The source unit initiates the transfer by placing the data on the bus and enabling its data valid
signal. The data accepted signal is activated by the destination unit after it accepts the data
from the bus. The source unit then disables its data valid signal, the destination unit disables
its data accepted signal, and the system returns to its initial state.
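The four-state sequence above can be illustrated with a small simulation. This is a hypothetical sketch, not part of the original notes; the state tuples simply record the (data valid, data accepted) signal pair at each step:

```python
def source_initiated_handshake():
    """Walk the two control signals through the four states of a
    source-initiated transfer; record each (data_valid, data_accepted) pair."""
    states = []
    data_valid, data_accepted = 0, 0
    states.append((data_valid, data_accepted))  # initial state
    data_valid = 1          # source places data on the bus, enables data valid
    states.append((data_valid, data_accepted))
    data_accepted = 1       # destination accepts the data from the bus
    states.append((data_valid, data_accepted))
    data_valid = 0          # source disables data valid and removes the data
    states.append((data_valid, data_accepted))
    data_accepted = 0       # destination disables data accepted: initial state
    states.append((data_valid, data_accepted))
    return states
```

Running the sketch yields the cycle (0,0) → (1,0) → (1,1) → (0,1) → (0,0), matching the four states described in the text.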
Destination Initiated Transfer Using Handshaking:
The name of the signal generated by the destination unit has been changed to ready for data
to reflect its new meaning. The source unit in this case does not place data on the bus until
after it receives the ready for data signal from the destination unit. From there on, the
handshaking procedure follows the same pattern as in the source-initiated case.
The only difference between the source-initiated and the destination-initiated transfer is in
the choice of initial state.
Advantage of the Handshaking method:
The handshaking scheme provides a degree of flexibility and reliability because the
successful completion of a data transfer relies on active participation by both units.
If either unit is faulty, the data transfer will not be completed. Such an error can
be detected by means of a timeout mechanism, which raises an alarm if the transfer is
not completed within a preset time.
The transfer of data between two units may be serial or parallel. In parallel data transmission, the n bits
of the message are transmitted simultaneously through n separate conductor paths. In serial transmission,
each bit of the message is sent in sequence, one at a time.
Parallel transmission is faster but requires many wires. It is used for short distances and
where speed is important. Serial transmission is slower but less expensive.
In asynchronous serial transfer, each bit of the message is sent in sequence, one at a time, and binary
information is transferred only when it is available. When there is no information to be
transferred, the line remains idle.
i. Start Bit- The first bit, called the start bit, is always 0 and indicates the beginning
of a character.
ii. Stop Bit- The last bit, called the stop bit, is always 1 and indicates the end of a
character. The stop bit keeps the line in the 1-state to signify the idle or wait state.
iii. Character Bits- The bits between the start bit and the stop bit are known as character
bits. The character bits always follow the start bit.
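The framing above can be sketched as a small helper that builds the bit sequence for one character. This is an illustrative sketch; the 8-bit character width and LSB-first transmit order are assumptions (common in practice) rather than something the notes specify:

```python
def frame_character(byte):
    """Build one asynchronous serial frame:
    start bit (always 0), eight character bits (LSB first), stop bit (always 1)."""
    assert 0 <= byte <= 0xFF
    character_bits = [(byte >> i) & 1 for i in range(8)]  # LSB transmitted first
    return [0] + character_bits + [1]
```

For example, `frame_character(0x41)` (ASCII 'A') produces `[0, 1, 0, 0, 0, 0, 0, 1, 0, 1]`: a 0 start bit, the data bits LSB first, and a 1 stop bit that returns the line to the idle state.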
a) Asynchronous Communication Interface
It works as both a receiver and a transmitter. Its operation is initialized by the CPU sending a
byte to the control register.
The transmitter register accepts a data byte from the CPU through the data bus, which is
then transferred to a shift register for serial transmission.
The receiver portion receives information into another shift register, and when a
complete data byte is received it is transferred to the receiver register.
The CPU can select the receiver register to read the byte through the data bus. Bits in the
status register are used as input and output flags.
A First In First Out (FIFO) buffer is a memory unit that stores information in such a manner
that the item stored first is the first item out. A FIFO buffer comes with separate input and output
terminals. The important feature of this buffer is that it can input data and output data at two
different rates.
When placed between two units, the FIFO can accept data from the source unit at one rate
of transfer and deliver the data to the destination unit at another rate.
If the source is faster than the destination, the FIFO is useful when source data arrive in
bursts that fill up the buffer. FIFO buffers are useful in applications where data are transferred
asynchronously.
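A minimal sketch of such a buffer follows. The class and method names are illustrative inventions; the point is the separate input and output operations, each gated by its own readiness condition so the two sides can run at different rates:

```python
from collections import deque

class FIFOBuffer:
    """Fixed-capacity first-in first-out buffer with separate
    input and output operations, usable at two different rates."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._words = deque()

    def input_ready(self):
        # the source may insert a word only while the buffer is not full
        return len(self._words) < self.capacity

    def output_ready(self):
        # the destination may remove a word only while the buffer is not empty
        return len(self._words) > 0

    def insert(self, word):
        if not self.input_ready():
            raise OverflowError("FIFO full")
        self._words.append(word)

    def remove(self):
        if not self.output_ready():
            raise IndexError("FIFO empty")
        return self._words.popleft()
```

A burst from a fast source fills the buffer via `insert`; a slower destination later drains it via `remove`, receiving the words in the order they arrived.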
All the internal operations in a digital system are synchronized by means of clock pulses
supplied by a common clock pulse Generator. The data transfer can be
i. Synchronous or
ii. Asynchronous
When both the transmitting and receiving units use the same clock pulse, the data transfer
is called synchronous. On the other hand, if there is no common clock
and the sender operates at different moments than the receiver, the data transfer is
called asynchronous.
The data transfer can be handled in various modes. Some of the modes use the CPU as an
intermediate path; others transfer the data directly to and from the memory unit. The transfer can
be handled in the following three ways:
i. Programmed I/O
In this mode of data transfer, the operations are the result of I/O instructions that are
part of the computer program. Each data transfer is initiated by an instruction in the program.
Normally the transfer is from a CPU register to a peripheral device, or vice-versa.
Once a transfer is initiated, the CPU starts monitoring the interface to see when the next transfer
can be made. The instructions of the program keep close tabs on everything that takes place in
the interface unit and the I/O devices.
In this technique the CPU is responsible for reading data from memory for output
and storing received data in memory for input, as shown in the flowchart of programmed I/O.
The main drawback of programmed I/O is that the CPU has to monitor the units the whole
time the program is executing. Thus the CPU stays in a program loop until the I/O
unit indicates that it is ready for data transfer. This is a time-consuming process, and a lot
of CPU time is wasted keeping watch over the interface.
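The busy-wait loop described above can be sketched as follows. The `MockInterface` class is a hypothetical stand-in for a real interface unit (a data register plus a ready flag); `device_tick` simulates the peripheral depositing the next word:

```python
class MockInterface:
    """Illustrative stand-in for an I/O interface: data register + ready flag."""
    def __init__(self, incoming):
        self._incoming = list(incoming)
        self.flag = False
        self.data_register = None

    def device_tick(self):
        # the peripheral deposits the next word and raises the flag
        if not self.flag and self._incoming:
            self.data_register = self._incoming.pop(0)
            self.flag = True

def programmed_io_read(interface, count):
    """Programmed I/O input: the CPU loops, polling the flag, and reads a
    word only when the interface signals that one is ready."""
    words = []
    while len(words) < count:
        interface.device_tick()          # simulation stand-in for real time passing
        if interface.flag:               # CPU polls the flag in a tight loop
            words.append(interface.data_register)
            interface.flag = False       # clear the flag to acknowledge
    return words
```

The tight `while` loop is exactly the wasted CPU time the text complains about: the processor can do nothing else until the flag is raised.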
To remove this problem an Interrupt facility and special commands are used.
Interrupt-Initiated I/O :
In this method, an interrupt facility and an interrupt command are used to inform the device about
the start and end of transfer. In the meantime the CPU executes other programs. When the
interface determines that the device is ready for data transfer, it generates an interrupt request
and sends it to the computer.
When the CPU receives such a signal, it temporarily stops the execution of the program,
branches to a service program to process the I/O transfer, and after completing it returns
to the task it was originally performing.
In this type of I/O, the computer does not check the flag; it continues to perform its task.
Whenever any device wants attention, it sends an interrupt signal to the CPU.
The CPU then deviates from what it was doing, stores the return address from the PC, and
branches to the address of the subroutine.
Interrupts are of two types:
Vectored Interrupt
Non-vectored Interrupt
In a vectored interrupt, the source that interrupts the CPU provides the branch
information. This information is called the interrupt vector.
In a non-vectored interrupt, the branch address is assigned to a fixed location in the
memory.
Priority Interrupt:
When interrupts are generated by more than one device, a priority interrupt system
is used to determine which device is to be serviced first.
Devices with high-speed transfer are given higher priority and slow devices are given
lower priority.
Priority can be established:
Using Software
Using Hardware
Polling Procedure :
The branch address contains the code that polls the interrupt sources in sequence. The
highest-priority source is tested first.
The disadvantage is that with a large number of I/O devices, the time required to poll
them can exceed the time available to service them.
Using Hardware:
A hardware priority interrupt unit accepts interrupt requests and determines the priorities.
To speed up the operation, each interrupting device has its own interrupt vector.
No polling is required; all decisions are established by the hardware priority interrupt unit.
A device that wants attention sends an interrupt request to the CPU.
The CPU then sends the INTACK signal, which is applied to the PI (priority in) input of the
first device.
If that device had requested attention, it places its VAD (vector address) on the bus and
blocks the signal by placing 0 on its PO (priority out) line.
If not, it passes the signal to the next device through PO by placing 1 on it.
The device whose PI is 1 and PO is 0 is the device that sent the interrupt request.
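The PI/PO daisy chain can be sketched as a simple scan down the chain. This is an illustrative model only; devices are listed from highest priority (closest to the CPU) to lowest, and each entry is a (requesting, VAD) pair with hypothetical vector addresses:

```python
def daisy_chain_acknowledge(devices, intack=1):
    """devices: list of (requesting, vad) pairs, highest priority first.
    Returns the VAD placed on the bus by the acknowledged device,
    or None if no device requested attention."""
    pi = intack                          # INTACK drives PI of the first device
    for requesting, vad in devices:
        if pi == 1 and requesting:
            # PI = 1 and a request is pending: place VAD on the bus, set PO = 0,
            # blocking the acknowledge from reaching lower-priority devices
            return vad
        # a non-requesting device simply passes the signal along: PO = PI
    return None
```

With devices at hypothetical vectors 0x40, 0x44, 0x48 where the second and third are requesting, the acknowledge stops at the second device: it is the highest-priority requester with PI = 1 and PO = 0.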
The parallel priority interrupt method uses an interrupt register whose bits are set separately by the interrupting devices.
A mask register provides the facility for higher-priority devices to interrupt
while a lower-priority device is being serviced, or to disable all lower-priority devices
while a higher-priority one is being serviced.
Each interrupt bit and its corresponding mask bit are ANDed and applied to a priority encoder.
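The interrupt-register/mask-register/encoder combination can be sketched in a few lines. This is an illustrative model, not the hardware itself; index 0 is taken as the highest priority:

```python
def priority_encoder(interrupt_reg, mask_reg):
    """interrupt_reg / mask_reg: bit lists with index 0 as highest priority.
    Each interrupt bit is ANDed with its mask bit; the encoder returns the
    index of the highest-priority enabled request, or None if there is none."""
    for i, (intr, mask) in enumerate(zip(interrupt_reg, mask_reg)):
        if intr & mask:        # AND of interrupt bit and mask bit
            return i           # first (highest-priority) active line wins
    return None
```

A masked-out request (mask bit 0) never reaches the encoder, which is exactly how lower-priority devices are disabled while a higher-priority one is being serviced.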
Direct Memory Access (DMA):
In Direct Memory Access (DMA), the interface transfers data into and out of the
memory unit through the memory bus. The transfer of data between a fast storage device such
as a magnetic disk and memory is often limited by the speed of the CPU. Removing the CPU
from the path and letting the peripheral device manage the memory buses directly
improves the speed of transfer. This transfer technique is called Direct Memory Access
(DMA).
During the DMA transfer, the CPU is idle and has no control of the memory buses. A DMA
Controller takes over the buses to manage the transfer directly between the I/O device and
memory.
The CPU may be placed in an idle state in a variety of ways. One common method,
extensively used in microprocessors, is to disable the buses through special control signals
such as Bus Request (BR) and Bus Grant (BG).
These are the two control signals in the CPU that facilitate the DMA transfer. The Bus Request
(BR) input is used by the DMA controller to request the CPU to relinquish the buses. When this
input is active, the CPU terminates the execution of the current instruction and places the address
bus, data bus, and read/write lines into a high-impedance state. High-impedance state means that
the output is disconnected.
The CPU activates the Bus Grant (BG) output to inform the external DMA controller that it
can now take control of the buses to conduct memory transfers without processor
intervention.
When the DMA terminates the transfer, it disables the Bus Request (BR) line. The CPU then
disables the Bus Grant (BG) output, takes control of the buses, and returns to its normal operation.
i. DMA Burst :- In a DMA burst transfer, a block sequence consisting of a number of
memory words is transferred in a continuous burst while the DMA controller is master
of the memory buses.
ii. Cycle Stealing :- Cycle stealing allows the DMA controller to transfer one data word
at a time, after which it must return control of the buses to the CPU.
DMA Controller:
The DMA controller needs the usual circuits of an interface to communicate with the
CPU and I/O device. The DMA controller has three registers:
i. Address Register :- The address register contains an address specifying the desired
location in memory.
ii. Word Count Register :- The word count (WC) register holds the number of words to be
transferred. The register is decremented by one after each word transfer and internally
tested for zero.
iii. Control Register :- The control register specifies the mode of transfer.
The unit communicates with the CPU via the data bus and control lines. The
registers in the DMA are selected by the CPU through the address bus by enabling the
DS (DMA select) and RS (register select) inputs. The RD (read) and WR (write)
inputs are bidirectional.
When the BG (bus grant) input is 0, the CPU can communicate
with the DMA registers through the data bus to read from or write to the DMA
registers. When BG = 1, the DMA can communicate directly with the memory by
specifying an address on the address bus and activating the RD or WR control.
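The roles of the address and word count registers can be sketched in a short simulation. This is an illustrative model of the register behaviour only (the bus arbitration and cycle stealing are abstracted away), with hypothetical names:

```python
class DMAController:
    """Illustrative DMA controller holding the address and word
    count registers described in the text."""
    def __init__(self, start_address, word_count):
        self.address = start_address      # address register
        self.word_count = word_count      # word count (WC) register

    def run_transfer(self, memory, peripheral_words):
        """Move words from the peripheral into memory one at a time,
        advancing the address register and decrementing the word count
        until the internally tested count reaches zero."""
        source = iter(peripheral_words)
        while self.word_count != 0:       # WC internally tested for zero
            memory[self.address] = next(source)
            self.address += 1             # advance the address register
            self.word_count -= 1          # decrement the word count
```

After the transfer the word count register reads zero and the address register points one past the last word written, which is how the controller knows the block is complete.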
DMA Transfer:
The CPU communicates with the DMA through the address and data buses as with
any interface unit. The DMA has its own address, which activates the DS and RS
lines. The CPU initializes the DMA through the data bus. Once the DMA receives the
start control command, it can transfer between the peripheral and the memory.
When BG = 0, the RD and WR are input lines allowing the CPU to
communicate with the internal DMA registers. When BG = 1, the RD and WR are
output lines from the DMA controller to the random-access memory to specify the
read or write operation on the data.
Summary :
An interface is the point where a connection is made between two different parts of a
system.
The strobe control method of asynchronous data transfer employs a single control
line to time each transfer.
The handshaking method solves the problem of the strobe method by introducing a
second control signal that provides a reply to the unit that initiates the transfer.
In the programmed I/O mode of data transfer, the operations are the result of I/O
instructions that are part of the computer program.
In the interrupt-initiated I/O method, an interrupt facility and an interrupt command are
used to inform the device about the start and end of transfer.
In Direct Memory Access (DMA), the interface transfers data into and out of the
memory unit through the memory bus.
Input-Output Processor:
An IOP is similar to a CPU except that it is designed to handle the details of I/O operations.
Unlike DMA, which is initialized by the CPU, an IOP can fetch and execute its own
instructions.
Memory occupies the central position and can communicate with each processor by
DMA.
The IOP provides the path for transfer of data between various peripheral devices and
memory.
Data formats of peripherals differ from those of the CPU and memory; the IOP resolves
such differences.
Data are transferred from the IOP to memory by stealing one memory cycle at a time.
Instructions that are read from memory by the IOP are called commands, to distinguish
them from instructions that are read by the CPU.
UNIT VI
Memory Organization
The memory hierarchy system consists of all storage devices employed in a computer system,
from the slow but high-capacity auxiliary memory, to a relatively faster main memory, to an even smaller
and faster cache memory accessible to the high-speed processing logic.
Main Memory: memory unit that communicates directly with the CPU (RAM)
Auxiliary Memory: device that provide backup storage (Disk Drives)
Cache Memory: special very-high-speed memory to increase the processing speed (Cache
RAM)
Figure 12-1 illustrates the components in a typical memory hierarchy. At the bottom of the
hierarchy are the relatively slow magnetic tapes used to store removable files. Next are the magnetic
disks used as backup storage. The main memory occupies a central position by being able to
communicate directly with the CPU and with auxiliary memory devices through an I/O processor. Programs
not currently needed in main memory are transferred into auxiliary memory to provide space for currently
used programs and data.
The cache memory is used for storing segments of programs currently being executed in
the CPU. The I/O processor manages data transfer between auxiliary memory and main memory. The
auxiliary memory has a large storage capacity and is relatively inexpensive, but has low access speed
compared to main memory. The cache memory is very small, relatively expensive, and has very high
access speed. The CPU has direct access to both cache and main memory but not to auxiliary memory.
Multiprogramming:
Many operating systems are designed to enable the CPU to process a number of independent
programs concurrently.
Multiprogramming refers to the existence of 2 or more programs in different parts of the memory
hierarchy at the same time.
The part of the computer system that supervises the flow of information between auxiliary
memory and main memory is called the memory management system.
6 – 2 MAIN MEMORY
Main memory is the central storage unit in a computer system. It is a relatively large and
fast memory used to store programs and data during computer operation. The principal technology
used for the main memory is based on semiconductor integrated circuits. Integrated-circuit RAM chips
are available in two possible operating modes, static and dynamic.
Static RAM – consists of internal flip-flops that store the binary information.
Dynamic RAM – stores the binary information in the form of electric charges applied
to capacitors.
Most of the main memory in a general purpose computer is made up of RAM integrated circuit
chips, but a portion of the memory may be constructed with ROM chips.
Read-Only Memory – stores programs that are permanently resident in the computer and
tables of constants that do not change in value once production of the computer is
completed.
The ROM portion of main memory is needed for storing an initial program called a bootstrap
loader.
Bootstrap loader – its function is to start the computer software operating when power is turned on.
The bootstrap program loads a portion of the operating system from disk into main memory, and
control is then transferred to the operating system.
RAM chip – utilizes a bidirectional data bus with three-state buffers to communicate
with the CPU.
The block diagram of a RAM chip is shown in Fig. 12-2. The capacity of the memory is 128 words of
eight bits (one byte) per word. This requires a 7-bit address and an 8-bit bidirectional data bus. The read
and write inputs specify the memory operation, and the two chip select (CS) control inputs enable
the chip only when it is selected by the microprocessor. The read and write inputs are sometimes
combined into one line labelled R/W.
The function table listed in Fig.12-2(b) specifies the operation of the RAM chip. The unit is in
operation only when CS1=1 and CS2=0.The bar on top of the second select variable indicates that this
input is enabled when it is equal to 0. If the chip select inputs are not enabled, or if they are enabled but
the read or write inputs are not enabled, the memory is inhibited and its data bus is in a high-impedance
state. When CS1=1 and CS2=0, the memory can be placed in a write or read mode. When the WR input
is enabled, the memory stores a byte from the data bus into a location specified by the address input
lines. When the RD input is enabled, the content of the selected byte is placed into the data bus. The RD
and WR signals control the memory operation as well as the bus buffers associated with the bidirectional
data bus.
A ROM chip is organized externally in a similar manner. However, since a ROM can only be read,
the data bus can only be in an output mode. The block diagram of a ROM chip is shown in Fig. 12-3. The
nine address lines in the ROM chip specify any one of the 512 bytes stored in it. The two chip select
inputs must be CS1=1 and CS2=0 for the unit to operate. Otherwise, the data bus is in a high-impedance
state.
The interconnection between memory and processor is then established from knowledge of the
size of memory needed and the type of RAM and ROM chips available. The addressing of memory can
be established by means of a table that specifies the memory address assigned to each chip. The table,
called a memory address map, is a pictorial representation of the assigned address space for each chip in the
system.
The memory address map for this configuration is shown in table 12-1. The component column
specifies whether a RAM or a ROM chip is used. The hexadecimal address column assigns a range of
hexadecimal equivalent addresses for each chip. The address bus lines are listed in the third column. The
RAM chips have 128 bytes and need seven address lines. The ROM chip has 512 bytes and needs 9
address lines.
RAM and ROM chips are connected to a CPU through the data and address buses. The low
order lines in the address bus select the byte within the chips and other lines in the address bus select a
particular chip through its chip select inputs.
The connection of memory chips to the CPU is shown in Fig.12-4. This configuration gives a
memory capacity of 512 bytes of RAM and 512 bytes of ROM. Each RAM receives the seven low-order
bits of the address bus to select one of 128 possible bytes. The particular RAM chip selected is
determined from lines 8 and 9 in the address bus. This is done through a 2 X 4 decoder whose outputs go
to the CS1 inputs in each RAM chip. Thus, when address lines 8 and 9 are equal to 00, the first RAM chip
is selected; when 01, the second RAM chip is selected, and so on. The RD and WR outputs from the
microprocessor are applied to the inputs of each RAM chip. The selection between RAM and ROM is
achieved through bus line 10. The RAMs are selected when the bit in this line is 0, and the ROM when
the bit is 1. Address bus lines 1 to 9 are applied to the input address of ROM without going through the
decoder. The data bus of the ROM has only an output capability, whereas the data bus connected to the
RAMs can transfer information in both directions.
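The address decoding just described can be sketched as a function. This is an illustrative model of the map in Table 12-1 (chips numbered 0–3 here for convenience): bit 9 of the address (line 10) selects RAM or ROM, bits 7–8 (lines 8–9) drive the 2x4 decoder, and the low-order bits select the byte within the chip:

```python
def decode_address(addr):
    """Decode a CPU address for the 512-byte RAM + 512-byte ROM
    configuration: returns (chip type, chip number, byte within chip)."""
    assert 0 <= addr < 1024
    if addr & 0x200:                       # line 10 = 1: ROM is selected
        return ("ROM", 0, addr & 0x1FF)    # 9 low-order lines address 512 bytes
    chip = (addr >> 7) & 0b11              # lines 8-9 through the 2x4 decoder
    return ("RAM", chip, addr & 0x7F)      # 7 low-order lines address 128 bytes
```

For example, address 0x0085 falls in the second RAM chip (decoder value 01) at byte 5, while any address from 0x0200 upward selects the ROM.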
Magnetic Disks
A magnetic disk is a type of memory constructed using a circular plate of metal or plastic coated with magnetized
materials. Usually, both sides of the disks are used to carry out read/write operations. However, several disks may be
stacked on one spindle with read/write head available on each surface.
The following image shows the structural representation for a magnetic disk.
o The memory bits are stored in the magnetized surface in spots along the concentric circles called tracks.
o The concentric circles (tracks) are commonly divided into sections called sectors.
Magnetic Tape
Magnetic tape is a storage medium that allows data archiving, collection, and backup for different kinds of data. The
magnetic tape is constructed using a plastic strip coated with a magnetic recording medium. The bits are recorded as
magnetic spots on the tape along several tracks. Usually, seven or nine bits are recorded simultaneously to form a
character together with a parity bit.
Magnetic tape units can be halted, started to move forward or in reverse, or can be rewound. However, they cannot
be started or stopped fast enough between individual characters. For this reason, information is recorded in blocks
referred to as records.
Associative Memory
The time required to find an item stored in memory can be reduced considerably if stored data can be
identified for access by the content of the data itself rather than by an address. A memory unit accessed
by content is called an associative memory or content addressable memory (CAM).
CAM is accessed simultaneously and in parallel on the basis of data content rather than by
specific address or location
Associative memory is more expensive than a RAM because each cell must have storage
capability as well as logic circuits
Argument register –holds an external argument for content matching
Key register –mask for choosing a particular field or key in the argument word
Hardware Organization
It consists of a memory array and logic for m words with n bits per word. The
argument register A and key register K each have n bits, one for each bit of a word. The match register M
has m bits, one for each memory word. Each word in memory is compared in parallel with the content of
the argument register. The words that match the bits of the argument register set a corresponding bit in
the match register. After the matching process, those bits in the match register that have been set
indicate that their corresponding words have been matched. Reading is accomplished by
sequential access to memory for those words whose corresponding bits in the match register have been
set.
The relation between the memory array and the external registers in an associative memory is shown
in Fig. 12-7. The cells in the array are marked by the letter C with two subscripts. The first subscript gives
the word number and the second specifies the bit position in the word. Thus cell Cij is the cell for bit j in word
i. A bit Aj in the argument register is compared with all the bits in column j of the array, provided that Kj
= 1. This is done for all columns j = 1, 2, …, n. If a match occurs between all the unmasked bits of the
argument and the bits in word i, the corresponding bit Mi in the match register is set to 1. If one or more
unmasked bits of the argument and the word do not match, Mi is cleared to 0.
It consists of a flip-flop storage element Fij and the circuits for reading, writing, and matching the cell. The
input bit is transferred into the storage cell during a write operation. The bit stored is read out during a read
operation. The match logic compares the content of the storage cell with the corresponding unmasked bit of the
argument and provides an output for the decision logic that sets the bit in Mi.
Match Logic
The match logic for each word can be derived from the comparison algorithm for two binary numbers.
First, neglect the key bits and compare the argument in A with the bits stored in the cells of the word.
Word i is equal to the argument in A if Aj = Fij for j = 1, 2, …, n. Two bits are equal if they are both 1
or both 0. The equality of two bits can be expressed logically by the Boolean function
xj = Aj Fij + A'j F'ij
where xj = 1 if the pair of bits in position j are equal; otherwise, xj = 0. For word i to be equal to the
argument in A, we must have all xj variables equal to 1. This is the condition for setting the corresponding
match bit Mi to 1. The Boolean function for this condition is
Mi = x1 x2 x3 …… xn
Taking the key bits into account, the match signal becomes
Mi = (x1 + K'1)(x2 + K'2) …… (xn + K'n)
which includes position j in the comparison only when Kj = 1.
Each cell requires two AND gates and one OR gate. The inverters for Aj and Kj are needed once for each
column and are used for all bits in the column. The output of all OR gates in the cells of the same word goes
to the input of a common AND gate to generate the match signal Mi. Mi will be logic 1 if a match
occurs and 0 if no match occurs.
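The match logic can be checked with a direct bit-level sketch. This is an illustrative evaluation of the Boolean functions above for a single stored word, with A, K, and the stored word given as bit lists:

```python
def match_bit(A, K, F_word):
    """Evaluate the match logic for one stored word.
    x_j = A_j F_j + A_j' F_j' is 1 when the bits agree (XNOR); the word
    matches when every unmasked position agrees: M_i = prod of (x_j + K_j')."""
    m = 1
    for a, k, f in zip(A, K, F_word):
        x = (a & f) | ((1 - a) & (1 - f))   # x_j: bit equality (XNOR)
        m &= x | (1 - k)                     # K_j = 0 masks position j out
    return m
```

A position whose key bit is 0 always contributes a 1 to the product, so masked bits can disagree without spoiling the match, exactly as the (xj + K'j) term requires.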
Read Operation
If more than one word in memory matches the unmasked argument field, all the matched words
will have 1's in the corresponding bit positions of the match register.
In a read operation, all matched words are read in sequence by applying a read signal to each
word line whose corresponding Mi bit is a logic 1.
In applications where no two identical items are stored in the memory, only one word may match,
in which case we can use the Mi output directly as a read signal for the corresponding word.
Write Operation
1. Entire memory: writing can be done by addressing each location in sequence. This makes the unit a
random-access memory for writing and a content-addressable memory for reading. The number of
address lines needed for decoding is d, where m = 2^d and m is the number of words.
2. Individual locations: a tag register is used which has as many bits as there are words in memory.
For every active (valid) word in memory, the corresponding bit in the tag register is set to 1.
When a word is deleted, the corresponding tag bit is reset to 0.
A word is stored in the memory by scanning the tag register until the first 0 bit is
encountered; after storing the word, the bit is set to 1.
CACHE MEMORY
Locality of Reference
1. Temporal – means that a recently executed instruction is likely to be executed again very soon.
Information which will be used in the near future is likely to be in use already (e.g. reuse
of information in loops).
2. Spatial – means that instructions in close proximity to a recently executed instruction are also likely
to be executed soon. If a word is accessed, adjacent (nearby) words are likely to be accessed soon
(e.g. related data items (arrays) are usually stored together; instructions are executed sequentially).
3. If active segments of a program can be placed in a fast (cache) memory, then total execution time
can be reduced significantly.
4. The temporal aspect of locality of reference suggests that whenever an item of information
(instruction or data) is first needed, it should be brought into the cache.
5. The spatial aspect of locality of reference suggests that instead of bringing just one item from the
main memory to the cache, it is wise to bring several items that reside at adjacent addresses as
well (i.e. a block of information).
Principles of cache
The main memory can store 32K words of 12 bits each. The cache is capable of storing 512 of
these words at any given time. For every word stored in cache, there is a duplicate copy in main memory. The
CPU communicates with both memories. It first sends a 15-bit address to the cache. If there is a hit, the
CPU accepts the 12-bit data from the cache. If there is a miss, the CPU reads the word from main memory
and the word is then transferred to the cache.
When a read request is received from the CPU, the contents of a block of memory words containing the
location specified are transferred into the cache.
When the program references any of the locations in this block, the contents are read from the
cache. The number of blocks in the cache is smaller than the number of blocks in main memory.
The correspondence between main memory blocks and those in the cache is specified by a mapping
function.
Assume the cache is full and a memory word not in the cache is referenced.
Control hardware decides which block is to be removed from the cache to create space for the new block
containing the referenced word from memory.
The collection of rules for making this decision is called the replacement algorithm.
When the addressed word is in the cache, a write can be dealt with in two ways:
1. Update both the cache location and the main memory location simultaneously (Write Through).
2. Update only the cache location, mark it with a dirty or modified bit, and update the main memory
location at the time of cache block removal (Write Back or Copy Back).
When the addressed word is not in the cache, a read miss occurs; there are two ways this can be dealt
with:
1. The entire block of words that contains the requested word is copied from main memory to the cache and
the particular word requested is forwarded to the CPU from the cache (Load Through), or
2. The requested word from memory is sent to the CPU first and then the cache is updated (Early
Restart).
Mapping Functions
Correspondence between main memory blocks and those in the cache is specified by a memory
mapping function
There are three techniques in memory mapping
1. Direct Mapping
2. Associative Mapping
3. Set Associative Mapping
Direct mapping:
A particular block of main memory can be brought only to a particular block of cache memory, so this
method is not flexible.
In Fig. 12-12, the CPU address of 15 bits is divided into two fields. The nine least significant bits
constitute the index field and the remaining six bits form the tag field. The main memory needs an address
that includes both the tag and the index bits. The number of bits in the index field is equal to the number
of address bits required to access the cache memory.
The direct-mapping cache organization uses the n-bit address to access the main memory and
the k-bit index to access the cache. Each word in the cache consists of the data word and its associated tag.
When a new word is first brought into the cache, the tag bits are stored alongside the data bits. When the
CPU generates a memory request, the index field is used as the address to
access the cache. The tag field of the CPU address is compared with the tag in the word read from the
cache. If the two tags match, there is a hit and the desired data word is in the cache. If there is no match,
there is a miss and the required word is read from main memory.
In Fig. 12-14, the index field is divided into two parts: the block field and the word field. In a 512-word
cache there are 64 blocks of 8 words each, since 64 x 8 = 512. The block number is specified with a
6-bit field and the word within the block is specified with a 3-bit field. The tag field stored within the
cache is common to all eight words of the same block.
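The field layout above (6-bit tag, 6-bit block, 3-bit word, where block and word together form the 9-bit index) can be sketched as a small address-splitting helper; the function name is illustrative:

```python
def split_address(addr):
    """Split a 15-bit CPU address per the direct-mapped layout:
    returns (tag, block, word, index)."""
    assert 0 <= addr < 2 ** 15
    word  = addr & 0b111             # 3 bits: word within the block
    block = (addr >> 3) & 0b111111   # 6 bits: block number
    tag   = (addr >> 9) & 0b111111   # 6 bits: tag compared on each access
    index = addr & 0x1FF             # 9-bit index = block and word fields
    return tag, block, word, index
```

On a memory request, `index` addresses the cache and `tag` is compared against the stored tag to decide hit or miss, as described in the text.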
Associative mapping:
In this mapping function, any block of main memory can potentially reside in any cache block
position. This is a much more flexible mapping method.
In Fig. 12-11, the associative memory stores both the address and the content (data) of the memory word.
This permits any location in the cache to store any word from main memory. The diagram shows three words
presently stored in the cache. The address value of 15 bits is shown as a five-digit octal number and its
corresponding 12-bit word is shown as a four-digit octal number. A CPU address of 15 bits is placed in
the argument register and the associative memory is searched for a matching address. If the address is
found, the corresponding 12-bit data is read and sent to the CPU. If no match occurs, the main memory is
accessed for the word.
Set-associative mapping:
In this method, blocks of cache are grouped into sets, and the mapping allows a block of main
memory to reside in any block of a specific set. From the flexibility point of view, it is in between to the
other two methods.
The octal numbers listed in Fig. 12-15 refer to the main
memory contents. When the CPU generates a memory request, the index value of the address is used to
access the cache. The tag field of the CPU
address is then compared with both tags in the cache to determine whether a match occurs. The comparison
logic is done by an associative search of the tags in the set, similar to an associative memory search; hence the
name "set-associative".
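A two-way set-associative lookup can be sketched as follows. This is a simplified model (the set count `SETS = 64` and the names are my own choices, not taken from the figure):

```python
# Two-way set-associative lookup sketch: the index selects a set that
# holds two (tag, data) entries; both tags are compared with the tag
# field of the CPU address, as in an associative memory search.
SETS = 64                                     # assumed number of sets

cache = [[None, None] for _ in range(SETS)]   # each entry: (tag, data)

def lookup(addr):
    index, tag = addr % SETS, addr // SETS    # split address into index and tag
    for entry in cache[index]:                # compare both tags in the set
        if entry is not None and entry[0] == tag:
            return entry[1]                   # hit: return the cached word
    return None                               # miss: word must come from memory
```

Only the two entries of one set are examined, which is why the tag comparison hardware is cheaper than in fully associative mapping.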
Replacement Policies
When the cache is full and new data must be brought in, a decision must be made as to which
data is removed from the cache.
The guideline for making this decision is called the replacement policy.
The replacement policy depends on the mapping: there is no choice in direct mapping, since each
block of main memory can occupy only one position in the cache.
A simple procedure is to replace the cells of the cache in round-robin order whenever a new word
is requested from memory; this constitutes a first-in first-out (FIFO) replacement policy.
Common replacement policies:
o Random replacement
o First-in first-out (FIFO): the item chosen is the item that has been in the set longest.
o Least recently used (LRU): the item chosen is the item that has been least recently used by the CPU.
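The LRU policy above can be illustrated with Python's `OrderedDict`. This is a minimal sketch (the class name `LRUCache` and the default capacity of 4 are my own); a FIFO cache would be identical except that it would not refresh an entry's position on a hit.

```python
from collections import OrderedDict

# Sketch of LRU replacement: entries are kept in access order, and the
# least recently used entry is evicted when the cache is full.
class LRUCache:
    def __init__(self, size=4):
        self.size, self.d = size, OrderedDict()

    def access(self, key):
        if key in self.d:
            self.d.move_to_end(key)            # hit: mark as most recently used
        else:
            if len(self.d) >= self.size:
                self.d.popitem(last=False)     # evict the least recently used
            self.d[key] = True                 # insert the new entry
```

Under FIFO, the `move_to_end` call would be dropped, so the victim would be the oldest insertion regardless of how recently it was referenced.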
VIRTUAL MEMORY
Types of Memory
o Real memory: main memory
o Virtual memory: memory on disk
Virtual memory allows for effective multiprogramming and relieves the user of the tight
constraints of main memory.
An address used by a programmer is called a virtual address, and the set of such addresses is
called the address space.
An address in main memory is called a location or physical address, and the set of such
locations is called the memory space.
In computers with virtual memory, the address space is allowed to be larger than the memory space.
In a multiprogram computer system, programs and data are transferred to and from auxiliary
memory and main memory based on demands imposed by the CPU. Suppose that Program 1 is currently
being executed in the CPU. Program 1 and a portion of its associated data are moved from auxiliary
memory into main memory, as shown in Fig. 12-16. Portions of programs and data need not be in
contiguous locations in memory, since information is moved in and out, and empty spaces may be
available in scattered locations in memory.
Fig. 12-17 shows how a virtual address of 20 bits is mapped to a physical address of 15 bits. The mapping is a
dynamic operation, which means that every address is translated immediately as a word is referenced by
the CPU. The mapping table may be stored in a separate memory or in main memory. In the first case, an
additional memory unit is required as well as one extra memory access time. In the second case, the table
takes space from main memory and two accesses to memory are required, with the program running at half
speed. A third alternative is to use an associative memory.
The physical memory is broken down into groups of equal size called blocks, which may range
from 64 to 4096 words each. The term page refers to groups of address space of the same size. Portions
of programs are moved from auxiliary memory to main memory in records equal to the size of a page.
The term "page frame" is sometimes used to denote a block.
In Fig. 12-18, a virtual address has 13 bits. Since each page consists of 1024 words, the high-order
three bits of the virtual address specify one of the eight pages and the low-order 10 bits give the line
address within the page.
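The 13-bit split can be written out directly; a small sketch (the helper name `split_virtual` is my own), assuming the page size of 1024 words given above:

```python
# Split the 13-bit virtual address of Fig. 12-18: the high-order 3 bits
# select one of 8 pages, the low-order 10 bits give the line in the page.
def split_virtual(addr):
    page = addr >> 10        # 3-bit page number
    line = addr & 0x3FF      # 10-bit line address within the page
    return page, line
```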
The organization of the memory mapping table in a paged system is shown in Fig. 12-19. The
memory page table consists of eight words, one for each page. The address in the page table denotes the
page number and the content of the word gives the block number where that page is stored in main
memory. The table shows that pages 1, 2, 5, and 6 are now available in main memory in blocks 3, 0, 1, and 2,
respectively.
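The translation performed by this page table can be sketched as follows, using the resident pages and block numbers just listed (the function name `translate` and the dictionary representation are illustrative):

```python
# Page-table translation sketch for Fig. 12-19: pages 1, 2, 5, and 6
# are resident in main-memory blocks 3, 0, 1, and 2, respectively.
page_table = {1: 3, 2: 0, 5: 1, 6: 2}     # page number -> block number

def translate(virtual_addr):
    page, line = virtual_addr >> 10, virtual_addr & 0x3FF
    if page not in page_table:
        raise LookupError("page fault")   # page is not in main memory
    return (page_table[page] << 10) | line  # physical = block number + line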
The random-access memory page table can be replaced with an associative memory of four words, as
shown in Fig. 12-20. Each entry in the associative memory array consists of two fields: the first three bits
store the page number and the last two bits store the block number. The virtual address is placed in the
argument register.
Address Translation
A segment is a set of logically related instructions or data elements associated with a given
name. Segments may be generated by the programmer or by the operating system. The address
generated by a segmented program is called a logical address. The logical address may be larger than the
physical memory address, as in virtual memory, but it may also be equal to or even smaller
than the physical memory address.
The property of logical space is that it uses variable-length segments. The length of each
segment is allowed to grow and contract according to the needs of the program being executed.
Translation Lookaside Buffer
The mapping tables may be stored in two separate small memories or in main memory. A
memory reference from the CPU will then require three accesses to memory:
o One to fetch an entry from the segment table
o One to fetch an entry from the page table
o One to fetch the data from main memory
To overcome this overhead, a high-speed cache is set up for page table entries,
called a Translation Lookaside Buffer (TLB).
It contains the page table entries that have been most recently used.
If the page table entry is present (TLB hit), the frame number is retrieved and the real
address is formed.
If the page table entry is not found in the TLB (TLB miss), the page number is used to index
the process page table. The system first checks whether the page is already in main memory;
if it is not, a page fault is issued. The TLB is then updated to include the new page entry.
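The TLB lookup flow described above can be sketched in Python. This is a simplified model (the names `tlb` and `tlb_translate` are my own, and dictionaries stand in for the hardware tables):

```python
# TLB lookup sketch: check the TLB first; on a TLB miss fall back to the
# page table; if the page is not resident, a page fault is raised.
tlb = {}                                  # small cache of page-table entries

def tlb_translate(page, page_table):
    if page in tlb:                       # TLB hit: no page-table access needed
        return tlb[page]
    if page in page_table:                # TLB miss, but the page is resident
        tlb[page] = page_table[page]      # update the TLB with the new entry
        return tlb[page]
    raise LookupError("page fault")       # page must be brought in from disk
```

After the first reference to a page, subsequent references are served from the TLB without touching the page table.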
Numerical Example
Consider the 20-bit logical address specified in Fig. 12-22(a). The 4-bit segment number specifies
one of 16 possible segments, the 8-bit page number can specify up to 256 pages, and the 8-bit word field
implies a page size of 256 words.
The physical memory shown in Fig. 12-22(b) consists of 2^20 words of 32 bits each. The 20-bit
address is divided into two fields: a 12-bit block number and an 8-bit word number. Thus, physical
memory is divided into 4096 blocks of 256 words each.
Consider a program loaded into memory that requires five pages. The operating system may
assign to this program segment 6 and pages 0 through 4, as shown in Fig. 12-23(a). The total logical
address range for the program is from hexadecimal 60000 to 604FF. When the program is loaded into
physical memory, it is distributed among five blocks wherever the operating system finds
empty spaces. The correspondence between each memory block and logical page number is then
entered in a table, as shown in Fig. 12-23(b).
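The numerical example can be checked with a short sketch of the 20-bit field split (the helper name `split_logical` is illustrative), using the 4/8/8-bit layout of Fig. 12-22(a):

```python
# Split the 20-bit logical address of Fig. 12-22(a):
# 4-bit segment | 8-bit page | 8-bit word (page size 256 words).
def split_logical(addr):
    segment = addr >> 16         # high-order 4 bits
    page = (addr >> 8) & 0xFF    # middle 8 bits
    word = addr & 0xFF           # low-order 8 bits
    return segment, page, word
```

Applying this to the program's address range confirms that hexadecimal 60000 is segment 6, page 0, word 0, and 604FF is segment 6, page 4, word FF.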
MEMORY PROTECTION
Memory protection can be assigned to the physical address or the logical address. Protection
of memory through the physical address can be done by assigning to each block in memory a
number of protection bits that indicate the type of access allowed to its corresponding block.
The base address field gives the base of the page table address in segmented page organization.
Content:
■ It can also be used to edit or create spreadsheets, presentations, and even videos.
■ But the evolution of this complex system started around 1940 with the first Generation of
Computer and evolving ever since.
First Generation Of Computer:
Vacuum Tubes (1940–1956)
Introduction:
4. These electronic tubes were made of glass and were about the size of a light bulb.
Vacuum Tube Age
Advantage:
The vacuum tube technology made possible the advent of electronic computers.
Disadvantage:
Air conditioning is required.
Examples:
1. ENIAC
2. EDVAC
3. UNIVAC
4. IBM-701
5. IBM-650
Second Generation Of Computer:
Transistors (1956–1963)
Introduction:
1959—Introduction of the removable disk pack, providing users with fast access to stored
data.
1963—ASCII (American Standard Code for Information Interchange) introduced, which
enables computers to exchange information.
Advantage:
Portable
Accuracy improved compared with its predecessor.
Disadvantage:
Constant maintenance is required to work properly.
Commercial production was very difficult.
Still punched cards were used for input.
The cooling system is required.
More expensive and non-versatile.
Used for specific purposes.
Examples:
1. Honeywell 400
2. IBM 7094
3. CDC 1604
4. CDC 3600
5. UNIVAC 1108
Third Generation Of The Computer:
Introduction:
1. 1965–1971 is the period of the third generation of computers.
2. These computers were based on integrated circuits.
3. The IC was invented by Robert Noyce and Jack Kilby in 1958–1959.
4. An IC was a single component containing a number of transistors.
5. In 1964, computer manufacturers began replacing transistors with integrated circuits.
6. An integrated circuit (IC) is a complete electronic circuit on a small chip made of silicon.
7. These computers were more reliable and compact than computers made with transistors,
and they cost less to manufacture.
Integrated circuit age:
Advantages:
These computers were cheaper as compared to second-generation computers.
They were fast and reliable.
Use of IC in the computer provides the small size of the computer.
This generation of computers has big storage capacity.
Instead of punch cards, mouse and keyboard are used for input.
Disadvantages:
IC chips are difficult to maintain.
Highly sophisticated technology is required for the manufacture of IC
chips.
Air conditioning is required.
Examples:
1. PDP-8
2. PDP-11
3. ICL 2900
4. IBM 360
5. IBM 370
Fourth Generation Of Computer:
Microprocessors (1971–2010)
Introduction:
4. Graphics User Interface (GUI) technology was exploited to offer more comfort to users.
5. Many key advancements were made during this generation, the most
significant of which was the use of the microprocessor, a specialized chip developed for
computer memory and logic.
6. This revolutionized the computer industry by making it possible to use a single chip to
create a smaller "personal" computer (as well as digital watches, pocket calculators, copy
machines, and so on).
Microprocessor Age
1977—Apple Computer Inc. was Established.
1981—Introduction of the IBM PC, which contains an Intel microprocessor chip and
Microsoft's MS-DOS operating system.
1990—Microsoft releases Windows 3.0 with the ability to run multiple applications.
Advantage:
Fastest in computation, and size is reduced compared with the previous generation of
computers.
Heat generated is negligible.
Small in size as compared to previous generation computers.
Less maintenance is required.
All types of high-level languages can be used on this type of computer.
Disadvantage:
Microprocessor design and fabrication are very complex.
Air conditioning is required in many cases due to the presence of ICs.
Advanced technology is required to make the ICs.
Examples:
1. IBM 4341
2. DEC 10
3. STAR 1000
Fifth Generation Of Computer:
Artificial Intelligence (2010 — Present)
Introduction:
1. The period of the fifth generation is 1980 onwards.
3. The aim of the fifth generation is to make a device that can respond to natural
language input and is capable of learning and self-organization.
5. Our current generation has been referred to as the "Connected Generation" because of
the industry's massive effort to increase the connectivity of computers.
6. The rapidly expanding Internet, World Wide Web, and intranets have created an
information superhighway that has enabled both computer professionals and home
computer users to communicate with others across the globe.
Artificial Intelligence
Advantage:
Disadvantage:
Examples:
1. Desktop
2. Laptop
3. NoteBook
4. UltraBook
5. Chromebook
7.2 PARALLEL PROCESSING IN UNIPROCESSOR SYSTEM:
■ Most general uniprocessor systems have the same basic structure
– Uniprocessor Architecture
– Parallel Processing Mechanisms
– Balancing of System Bandwidth
– Multiprogramming and Time sharing
Uniprocessor Architecture:
1. Main Memory
■ It is divided into four units, referred to as Logical Storage Units (LSU), that are four-way
interleaved
■ The storage controller provides multiport connections between the CPU and the four
LSUs
2. CPU
■ The CPU contains the instruction decoding and execution units as well as a cache
3. I/O Subsystem
■ Peripherals are connected to the system via high-speed I/O channels, which operate
asynchronously with the CPU
Parallel Processing Mechanisms:
Parallelism and pipelining within the CPU:
■ Parallel adders
■ Pipelined instruction execution: instruction fetch, decode, operand fetch, arithmetic logic execution, store result
Use of hierarchical memory system:
Balancing of subsystems:
■ The CPU is the fastest unit in a computer. The bandwidth of a system is defined as the number
of operations performed per unit time. In the case of main memory, the memory bandwidth is
measured by the number of words that can be accessed per unit time.
■ Bandwidth balancing between CPU and memory: the speed gap between the CPU and
the main memory can be closed by using a fast cache memory between them. A block
of memory words is moved from main memory into the cache so that the needed
instructions are available from the cache most of the time.
Multiprogramming:
■ Within the same interval of time, there may be multiple processes active in a computer,
competing for memory, I/O, and CPU resources. Some programs are I/O-bound and
some are CPU-bound. Various types of programs are mixed to balance bandwidths among
functional units.
Example: whenever a process P1 is tied up with the I/O processor performing input/output,
the CPU can at the same moment be tied up with another process P2. This allows
simultaneous execution of programs. The interleaving of CPU and I/O operations
among several programs is called multiprogramming.
Time-Sharing:
■ The mainframes of the batch era were firmly established by the late 1960s, when advances
in semiconductor technology made solid-state memory and integrated circuits feasible.
These advances in hardware technology spawned the minicomputer era; minicomputers
were small, fast, and inexpensive.
■ Dedicating the CPU to a single program until it finishes leaves the CPU idle during I/O
waits. This problem can be overcome by a concept called time-sharing, in which every
process is allotted a time slice of CPU time; after its respective time slice is over, the CPU
is allotted to the next program. If a process is not completed, it waits in the queue for its
next chance to receive CPU time.
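The time-slicing described above can be sketched as a small round-robin simulation. This is an illustrative model (the function name `time_share`, the quantum of 2 time units, and the process names are my own choices):

```python
from collections import deque

# Round-robin time-sharing sketch: each process runs for one time slice;
# an unfinished process rejoins the back of the queue for its next turn.
def time_share(burst_times, quantum=2):
    queue = deque(burst_times.items())     # (process name, remaining time)
    order = []                             # order in which slices are granted
    while queue:
        name, remaining = queue.popleft()
        order.append(name)                 # this process gets the CPU now
        if remaining > quantum:
            queue.append((name, remaining - quantum))  # not done: requeue
    return order
```

For example, with P1 needing 3 units and P2 needing 2, the CPU alternates P1, P2, then P1 again to finish its remaining unit.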
Pipeline Computers
Array Processors
Multiprocessor systems
Pipelined processor:
– IF – Instruction Fetch
– ID – Instruction Decode
– OF – Operand Fetch
– EX – Execution
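The overlap among these four stages can be sketched with a simple cycle table. This is an idealized model (the helper name `schedule` is my own; stalls and hazards are ignored):

```python
# Four-stage pipeline sketch (IF, ID, OF, EX): in each clock cycle every
# stage works on a different instruction, so once the pipeline is full,
# one instruction completes per cycle.
STAGES = ["IF", "ID", "OF", "EX"]

def schedule(instructions):
    table = {}
    for i, instr in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            # instruction i enters stage s in cycle i + s + 1 (no stalls)
            table[(instr, stage)] = i + s + 1
    return table
```

With three instructions, I1 completes EX in cycle 4 and each later instruction completes one cycle after its predecessor, showing the one-result-per-cycle throughput of a filled pipeline.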
Functional Structure of Pipeline Computer:
Array Processor:
Multiprocessor System:
Parallel computers still follow this basic design, just multiplied in units; the basic, fundamental
architecture remains the same. Parallel computers can be classified based on various criteria:
• number of data and instruction streams
• computer hardware structure (tightly or loosely coupled)
• degree of parallelism (the number of binary digits that can be processed within a unit time by
a computer system)
Today there is no completely satisfactory classification of the different types of parallel systems.
The most popular taxonomy of computer architecture is Flynn's classification.
Classifications of parallel computers:
1. Flynn's classification (1966) is based on the multiplicity of instruction streams and data streams
in computer systems.
2. Feng's classification (1972) is based on serial versus parallel processing (degree of parallelism).
3. Handler's classification (1977) is determined by the degree of parallelism and pipelining in various
subsystem levels.
7.5 Parallel Processing Applications:
Parallel Processing has been considered to be "the high end of Processing", and has been used to model
difficult scientific and engineering problems found in the real world. Some examples:
Atmosphere, Earth, Environment, Space Weather
Physics - applied, nuclear, particle, condensed matter, high pressure, fusion, photonics
Bioscience, Biotechnology, Genetics
Chemistry, Molecular Sciences
Geology, Seismology
Mechanical and Aerospace Engineering
Electrical Engineering, Circuit Design, Microelectronics
Computer Science, Mathematics
Today, commercial applications provide an equal or greater driving force in the development of faster
computers. These applications require the processing of large amounts of data in sophisticated ways. For example:
Databases, data mining
Oil exploration
Web search engines, web based business services
Medical imaging and diagnosis
Pharmaceutical design
Management of national and multi-national corporations
Financial and economic modeling
Advanced graphics and virtual reality, particularly in the entertainment industry
Networked video and multi-media technologies
Collaborative work environments