
System Architecture (COM314)
Computer architecture refers to those attributes of a system visible to a
programmer or, put another way, those attributes that have a direct impact on
the logical execution of a program. Computer organization refers to the
operational units and their interconnections that realize the architectural
specifications. Examples of architectural attributes include the instruction set,
the number of bits used to represent various data types (e.g., numbers,
characters), I/O mechanisms, and techniques for addressing memory.
Organizational attributes include those hardware details transparent to the
programmer, such as control signals, interfaces between the computer and
peripherals, and the memory technology used.

For example, it is an architectural design issue whether a computer will have a multiply instruction. It is an organizational issue whether that instruction will be implemented by a special multiply unit or by a mechanism that makes repeated use of the add unit of the system. The organizational decision may be based on the anticipated frequency of use of the multiply instruction, the relative speed of the two approaches, and the cost and physical size of a special multiply unit.
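To make the organizational alternative concrete, here is a minimal Python sketch of multiplication realized by repeated use of an add unit. The function name and the example values are ours, purely for illustration:

```python
def multiply_by_repeated_addition(a: int, b: int) -> int:
    """Multiply two non-negative integers using only addition,
    as a machine without a dedicated multiply unit might."""
    result = 0
    for _ in range(b):        # b repeated uses of the "add unit"
        result = result + a
    return result

assert multiply_by_repeated_addition(6, 7) == 42
```

Either implementation satisfies the same architectural specification; a program that multiplies cannot tell which mechanism is underneath, only how fast it runs.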

Historically, and still today, the distinction between architecture and organization has been an important one. Many computer manufacturers offer a family of computer models, all with the same architecture but with differences in organization. Consequently, the different models in the family have different price and performance characteristics. Furthermore, a particular architecture may span many years and encompass a number of different computer models, its organization changing with changing technology. A prominent example of both these phenomena is the IBM System/370 architecture. This architecture was first introduced in 1970 and included a number of models. The customer with modest requirements could buy a cheaper, slower model and, if demand increased, later upgrade to a more expensive, faster model without having to abandon software that had already been developed. Over the years, IBM has introduced many new models with improved technology to replace older models, offering the customer greater speed, lower cost, or both. These newer models retained the same architecture so that the customer's software investment was protected. Remarkably, the System/370 architecture, with a few enhancements, has survived to this day as the architecture of IBM's mainframe product line.

In a class of computers called microcomputers, the relationship between architecture and organization is very close. Changes in technology not only influence organization but also result in the introduction of more powerful and more complex architectures. Generally, there is less of a requirement for generation-to-generation compatibility for these smaller machines. Thus, there is more interplay between organizational and architectural design decisions. An intriguing example of this is the reduced instruction set computer (RISC).

A computer is a complex system; contemporary computers contain millions of elementary electronic components. How, then, can one clearly describe them? The key is to recognize the hierarchical nature of most complex systems, including the computer. A hierarchical system is a set of interrelated subsystems, each of the latter, in turn, hierarchical in structure until we reach some lowest level of elementary subsystem.

The hierarchical nature of complex systems is essential to both their design and their description. The designer need only deal with a particular level of the system at a time. At each level, the system consists of a set of components and their interrelationships. The behaviour at each level depends only on a simplified, abstracted characterization of the system at the next lower level. At each level, the designer is concerned with structure and function:

• Structure: The way in which the components are interrelated

• Function: The operation of each individual component as part of the structure

Fig 1: A functional view of the Computer (the operating environment acts as source and destination of data; internally, the computer comprises a data movement apparatus, a data storage facility, and a data processing facility)


Function

Both the structure and functioning of a computer are, in essence, simple. Fig 1 depicts the basic functions that a computer can perform. In general terms, there are only four:

 Data processing
 Data storage
 Data movement
 Control

The computer, of course, must be able to process data. The data may take a
wide variety of forms, and the range of processing requirements is broad.
However, we shall see that there are only a few fundamental methods or types
of data processing.

It is also essential that a computer store data. Even if the computer is processing data on the fly (i.e., data come in and get processed, and the results go out immediately), the computer must temporarily store at least those pieces of data that are being worked on at any given moment. Thus, there is at least a short-term data storage function. Equally important, the computer performs a long-term data storage function. Files of data are stored on the computer for subsequent retrieval and update.

The computer must be able to move data between itself and the outside
world. The computer’s operating environment consists of devices that serve as
either sources or destinations of data. When data are received from or
delivered to a device that is directly connected to the computer, the process is
known as input–output (I/O), and the device is referred to as a peripheral.
When data are moved over longer distances, to or from a remote device, the
process is known as data communications. Finally, there must be control of
these three functions. Ultimately, this control is exercised by the individual(s)
who provides the computer with instructions. Within the computer, a control
unit manages the computer’s resources and orchestrates the performance of
its functional parts in response to those instructions.

COMPUTER FUNCTION

The basic function performed by a computer is execution of a program, which consists of a set of instructions stored in memory. The processor does the actual work by executing instructions specified in the program. This lecture provides an overview of the key elements of program execution. In its simplest form, instruction processing consists of two steps: the processor reads (fetches) instructions from memory one at a time and executes each instruction. Program execution consists of repeating the process of instruction fetch and instruction execution. The instruction execution may involve several operations and depends on the nature of the instruction.

Figure 2: Computer Components: Top-Level View

The processing required for a single instruction is called an instruction cycle. Using the simplified two-step description given previously, the instruction cycle is depicted in Figure 3 below. The two steps are referred to as the fetch cycle and the execute cycle. Program execution halts only if the machine is turned off, some sort of unrecoverable error occurs, or a program instruction that halts the computer is encountered.

Instruction Fetch and Execute

At the beginning of each instruction cycle, the processor fetches an instruction from memory. In a typical processor, a register called the program counter (PC) holds the address of the instruction to be fetched next. Unless told otherwise, the processor always increments the PC after each instruction fetch so that it will fetch the next instruction in sequence (i.e., the instruction located at the next higher memory address). So, for example, consider a computer in which each instruction occupies one 16-bit word of memory. Assume that the program counter is set to location 300. The processor will next fetch the instruction at location 300. On succeeding instruction cycles, it will fetch instructions from locations 301, 302, 303, and so on. This sequence may be altered, as explained presently.

Figure 3: Basic Instruction cycle

The fetched instruction is loaded into a register in the processor known as the instruction register (IR). The instruction contains bits that specify the action the processor is to take. The processor interprets the instruction and performs the required action. In general, these actions fall into four categories:

• Processor-memory: Data may be transferred from processor to memory or from memory to processor.

• Processor-I/O: Data may be transferred to or from a peripheral device by transferring between the processor and an I/O module.

• Data processing: The processor may perform some arithmetic or logic operation on data.

• Control: An instruction may specify that the sequence of execution be altered. For example, the processor may fetch an instruction from location 149, which specifies that the next instruction be from location 182. The processor will remember this fact by setting the program counter to 182. Thus, on the next fetch cycle, the instruction will be fetched from location 182 rather than 150.

An instruction’s execution may involve a combination of these actions.

Consider a simple example using a hypothetical machine that includes the characteristics listed in Figure 4. The processor contains a single data register, called an accumulator (AC). Both instructions and data are 16 bits long. Thus, it is convenient to organize memory using 16-bit words. The instruction format provides 4 bits for the opcode, so that there can be as many as 2^4 = 16 different opcodes, and up to 2^12 = 4096 (4K) words of memory can be directly addressed. Figure 5 illustrates a partial program execution, showing the relevant portions of memory and processor registers. The program fragment shown adds the contents of the memory word at address 940 to the contents of the memory word at address 941 and stores the result in the latter location.

Figure 4: Characteristics of a Hypothetical Machine

Figure 5: Example of Program Execution (contents of memory and registers in
hexadecimal)

Three instructions, which can be described as three fetch and three execute cycles, are required:

1. The PC contains 300, the address of the first instruction. This instruction (the value 1940 in hexadecimal) is loaded into the instruction register (IR) and the PC is incremented. Note that this process involves the use of a memory address register (MAR) and a memory buffer register (MBR). For simplicity, these intermediate registers are ignored.

2. The first 4 bits (first hexadecimal digit) in the IR indicate that the AC is to be loaded. The remaining 12 bits (three hexadecimal digits) specify the address (940) from which data are to be loaded.

3. The next instruction (5941) is fetched from location 301 and the PC is incremented.

4. The old contents of the AC and the contents of location 941 are added and the result is stored in the AC.

5. The next instruction (2941) is fetched from location 302 and the PC is incremented.

6. The contents of the AC are stored in location 941.

In this example, three instruction cycles, each consisting of a fetch cycle and an execute cycle, are needed to add the contents of location 940 to the contents of 941. With a more complex set of instructions, fewer cycles would be needed. Some older processors, for example, included instructions that contain more than one memory address. Thus the execution cycle for a particular instruction on such processors could involve more than one reference to memory. Also, instead of memory references, an instruction may specify an I/O operation.
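The fetch-execute behaviour of this hypothetical machine is small enough to simulate. The sketch below encodes only what the worked example states: opcode 1 loads the AC from memory, 5 adds a memory word to the AC, and 2 stores the AC. The data values 0003 and 0002 are assumed sample contents for locations 940 and 941, not taken from the text.

```python
# Simulation of the hypothetical machine of Figures 4 and 5.
# 16-bit words: top 4 bits = opcode, low 12 bits = address.
memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 0x0003, 0x941: 0x0002}   # sample data values (assumed)
pc = 0x300      # program counter
ac = 0x0000     # accumulator

for _ in range(3):                        # three instruction cycles
    ir = memory[pc]                       # fetch cycle: load IR, bump PC
    pc += 1
    opcode, address = ir >> 12, ir & 0x0FFF
    if opcode == 0x1:                     # execute cycle: load AC from memory
        ac = memory[address]
    elif opcode == 0x2:                   # store AC to memory
        memory[address] = ac
    elif opcode == 0x5:                   # add memory word to AC
        ac = (ac + memory[address]) & 0xFFFF

print(hex(memory[0x941]))                 # 0x5: 3 + 2 stored in location 941
```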

Interrupts

Virtually all computers provide a mechanism by which other modules (I/O, memory) may interrupt the normal processing of the processor. Table 1 below lists the most common classes of interrupts.

Table 1: Classes of interrupts

We shall focus on the communication between modules that results from
interrupts. Interrupts are provided primarily as a way to improve processing
efficiency. For example, most external devices are much slower than the
processor. Suppose that the processor is transferring data to a printer using
the instruction cycle scheme of Figure 3.

(a) No interrupts; (b) interrupts, short I/O wait; (c) interrupts, long I/O wait
Figure 6: Program Flow of Control without and with Interrupts

After each write operation, the processor must pause and remain idle until the printer catches up. The length of this pause may be on the order of many hundreds or even thousands of instruction cycles that do not involve memory. Clearly, this is a very wasteful use of the processor. Figure 6 illustrates this state of affairs. The user program performs a series of WRITE calls interleaved with processing. Code segments 1, 2, and 3 refer to sequences of instructions that do not involve I/O. The WRITE calls are to an I/O program that is a system utility and that will perform the actual I/O operation. The I/O program consists of three sections:

 A sequence of instructions, labelled 4 in the figure, to prepare for the actual I/O operation. This may include copying the data to be output into a special buffer and preparing the parameters for a device command.

 The actual I/O command. Without the use of interrupts, once this command is issued, the program must wait for the I/O device to perform the requested function (or periodically poll the device). The program might wait by simply repeatedly performing a test operation to determine if the I/O operation is done; a sketch of such a busy-wait loop follows this list.

 A sequence of instructions, labelled 5 in the figure, to complete the operation. This may include setting a flag indicating the success or failure of the operation.
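The busy-wait alternative mentioned in the second item can be sketched as follows. This is a toy model only: ToyPrinter and its methods are invented for illustration, standing in for whatever status register a real device exposes.

```python
class ToyPrinter:
    """Hypothetical device that becomes ready 1000 status tests
    after a write command is issued."""
    def __init__(self):
        self._countdown = 0
    def issue_write(self, data):
        self._countdown = 1000
    def ready(self):
        self._countdown = max(0, self._countdown - 1)
        return self._countdown == 0

def write_with_polling(device, data):
    """Programmed I/O without interrupts: issue the command, then
    repeatedly test device status until the operation completes."""
    device.issue_write(data)        # the actual I/O command
    wasted_tests = 0
    while not device.ready():       # processor does no useful work here
        wasted_tests += 1
    return wasted_tests

print(write_with_polling(ToyPrinter(), b"hello"))   # ~1000 wasted tests
```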

Because the I/O operation may take a relatively long time to complete, the I/O program is hung up waiting for the operation to complete; hence, the user program is stopped at the point of the WRITE call for some considerable period of time.

INTERRUPTS AND THE INSTRUCTION CYCLE

With interrupts, the processor can be engaged in executing other instructions while an I/O operation is in progress. Consider the flow of control in Figure 6b. As before, the user program reaches a point at which it makes a system call in the form of a WRITE call. The I/O program that is invoked in this case consists only of the preparation code and the actual I/O command. After these few instructions have been executed, control returns to the user program. Meanwhile, the external device is busy accepting data from computer memory and printing it. This I/O operation is conducted concurrently with the execution of instructions in the user program. When the external device becomes ready to be serviced, that is, when it is ready to accept more data from the processor, the I/O module for that external device sends an interrupt request signal to the processor. The processor responds by suspending operation of the current program, branching off to a program to service that particular I/O device, known as an interrupt handler, and resuming the original execution after the device is serviced. The points at which such interrupts occur are indicated by an asterisk in Figure 6b. From the point of view of the user program, an interrupt is just that: an interruption of the normal sequence of execution. When the interrupt processing is completed, execution resumes (Figure 7).
Figure 7: Transfer of control via interrupt

Thus, the user program does not have to contain any special code to accommodate interrupts; the processor and the operating system are responsible for suspending the user program and then resuming it at the same point. To accommodate interrupts, an interrupt cycle is added to the instruction cycle, as shown in Figure 8.

Figure 8: Instruction cycle with interrupts

In the interrupt cycle, the processor checks to see if any interrupts have occurred, indicated by the presence of an interrupt signal. If no interrupts are pending, the processor proceeds to the fetch cycle and fetches the next instruction of the current program. If an interrupt is pending, the processor does the following:

 It suspends execution of the current program and saves its context. This means saving the address of the next instruction to be executed (current contents of the program counter) and any other data relevant to the processor's current activity.

 It sets the program counter to the starting address of an interrupt handler routine. A toy model of this cycle is sketched below.
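As a rough illustration of Figure 8, the model below checks for a pending interrupt after each execute cycle, saves the program counter, runs a handler, and resumes. Everything here (the strings standing in for instructions, the cycle numbers) is invented for illustration.

```python
def run_with_interrupts(program, handler, interrupt_after):
    """Instruction cycle with an interrupt cycle added (cf. Figure 8)."""
    pc, executed = 0, []
    while pc < len(program):
        executed.append(program[pc])   # fetch cycle + execute cycle
        pc += 1
        if pc in interrupt_after:      # interrupt cycle: signal pending?
            saved_pc = pc              # save context: address of the
            executed.extend(handler)   # next instruction; run the handler
            pc = saved_pc              # resume the suspended program
    return executed

print(run_with_interrupts(["i0", "i1", "i2"], ["isr0", "isr1"], {2}))
# ['i0', 'i1', 'isr0', 'isr1', 'i2']: the handler runs between i1 and i2
```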

The First Generation: Vacuum Tubes

ENIAC The ENIAC (Electronic Numerical Integrator And Computer), designed and constructed at the University of Pennsylvania, was the world's first general-purpose electronic digital computer. The project was a response to U.S. needs during World War II. The Army's Ballistics Research Laboratory (BRL), an agency responsible for developing range and trajectory tables for new weapons, was having difficulty supplying these tables accurately and within a reasonable time frame. Without these firing tables, the new weapons and artillery were useless to gunners. The BRL employed more than 200 people who, using desktop calculators, solved the necessary ballistics equations. Preparation of the tables for a single weapon would take one person many hours, even days.

John Mauchly, a professor of electrical engineering at the University of Pennsylvania, and John Eckert, one of his graduate students, proposed to build a general-purpose computer using vacuum tubes for the BRL's application. In 1943, the Army accepted this proposal, and work began on the ENIAC. The resulting machine was enormous, weighing 30 tons, occupying 1500 square feet of floor space, and containing more than 18,000 vacuum tubes. When operating, it consumed 140 kilowatts of power. It was also substantially faster than any electromechanical computer, capable of 5000 additions per second.

The ENIAC was a decimal rather than a binary machine. That is, numbers were represented in decimal form, and arithmetic was performed in the decimal system. Its memory consisted of 20 "accumulators," each capable of holding a 10-digit decimal number. A ring of 10 vacuum tubes represented each digit. At any time, only one vacuum tube was in the ON state, representing one of the 10 digits. The major drawback of the ENIAC was that it had to be programmed manually by setting switches and plugging and unplugging cables. The ENIAC was completed in 1946, too late to be used in the war effort. Instead, its first task was to perform a series of complex calculations that were used to help determine the feasibility of the hydrogen bomb. The use of the ENIAC for a purpose other than that for which it was built demonstrated its general-purpose nature. The ENIAC continued to operate under BRL management until 1955, when it was disassembled.

THE VON NEUMANN MACHINE

The task of entering and altering programs for the ENIAC was extremely tedious. The programming process could be facilitated if the program could be represented in a form suitable for storing in memory alongside the data. Then, a computer could get its instructions by reading them from memory, and a program could be set or altered by setting the values of a portion of memory.

This idea, known as the stored-program concept, is usually attributed to the ENIAC designers, most notably the mathematician John von Neumann, who was a consultant on the ENIAC project. Alan Turing developed the idea at about the same time. The first publication of the idea was in a 1945 proposal by von Neumann for a new computer, the EDVAC (Electronic Discrete Variable Computer).

In 1946, von Neumann and his colleagues began the design of a new stored-program computer, referred to as the IAS computer, at the Princeton Institute for Advanced Studies. The IAS computer, although not completed until 1952, is the prototype of all subsequent general-purpose computers.

Figure 9 below shows the general structure of the IAS computer. It consists of:

• A main memory, which stores both data and instructions

• An arithmetic and logic unit (ALU) capable of operating on binary data

• A control unit, which interprets the instructions in memory and causes them to be executed

• Input and output (I/O) equipment operated by the control unit

Figure 9: Structure of the IAS Computer

MICROPROCESSORS

With improvement in technology, more elements were added to processor chips. As time went on, more and more elements were placed on each chip, so that fewer and fewer chips were needed to construct a single computer processor.

A breakthrough was achieved in 1971, when Intel developed its 4004. The 4004 was the first chip to contain all of the components of a CPU on a single chip: the microprocessor was born.

The 4004 can add two 4-bit numbers and can multiply only by repeated addition. By today's standards, the 4004 is hopelessly primitive, but it marked the beginning of a continuing evolution of microprocessor capability and power.

This evolution can be seen most easily in the number of bits that the processor deals with at a time. There is no clear-cut measure of this, but perhaps the best measure is the data bus width: the number of bits of data that can be brought into or sent out of the processor at a time. Another measure is the number of bits in the accumulator or in the set of general-purpose registers. Often, these measures coincide, but not always. For example, a number of microprocessors were developed that operate on 16-bit numbers in registers but can only read and write 8 bits at a time.

The next major step in the evolution of the microprocessor was the introduction in 1972 of the Intel 8008. This was the first 8-bit microprocessor and was almost twice as complex as the 4004.

Neither of these steps was to have the impact of the next major event: the introduction in 1974 of the Intel 8080. This was the first general-purpose microprocessor. Whereas the 4004 and the 8008 had been designed for specific applications, the 8080 was designed to be the CPU of a general-purpose microcomputer. Like the 8008, the 8080 is an 8-bit microprocessor. The 8080, however, is faster, has a richer instruction set, and has a larger addressing capability.

About the same time, 16-bit microprocessors began to be developed. However, it was not until the end of the 1970s that powerful, general-purpose 16-bit microprocessors appeared. One of these was the 8086. The next step in this trend occurred in 1981, when both Bell Labs and Hewlett-Packard developed 32-bit, single-chip microprocessors. Intel introduced its own 32-bit microprocessor, the 80386, in 1985.

The evolution of the Intel x86 architecture

The current x86 offerings represent the results of decades of design effort on complex instruction set computers (CISCs). The x86 incorporates the sophisticated design principles once found only on mainframes and supercomputers and serves as an excellent example of CISC design. An alternative approach to processor design is the reduced instruction set computer (RISC). The ARM architecture is used in a wide variety of embedded systems and is one of the most powerful and best-designed RISC-based systems on the market.
It is worthwhile to list some of the highlights of the evolution of the Intel
product line:

• 8080: The world's first general-purpose microprocessor. This was an 8-bit machine, with an 8-bit data path to memory. The 8080 was used in the first personal computer.

• 8086: A far more powerful, 16-bit machine. In addition to a wider data path and larger registers, the 8086 sported an instruction cache, or queue, that prefetches a few instructions before they are executed. A variant of this processor, the 8088, was used in IBM's first personal computer, securing the success of Intel. The 8086 is the first appearance of the x86 architecture.

• 80286: This extension of the 8086 enabled addressing a 16-MByte memory instead of just 1 MByte.

• 80386: Intel's first 32-bit machine, and a major overhaul of the product. With a 32-bit architecture, the 80386 rivalled the complexity and power of minicomputers and mainframes introduced just a few years earlier. This was the first Intel processor to support multitasking, meaning it could run multiple programs at the same time.

• 80486: The 80486 introduced the use of much more sophisticated and powerful cache technology and sophisticated instruction pipelining. The 80486 also offered a built-in math coprocessor, offloading complex math operations from the main CPU.

• Pentium: With the Pentium, Intel introduced the use of superscalar techniques, which allow multiple instructions to execute in parallel.

• Pentium Pro: The Pentium Pro continued the move into superscalar organization begun with the Pentium, with aggressive use of register renaming, branch prediction, data flow analysis, and speculative execution.

• Pentium II: The Pentium II incorporated Intel MMX technology, which is designed specifically to process video, audio, and graphics data efficiently.

• Pentium III: The Pentium III incorporates additional floating-point instructions to support 3D graphics software.
• Pentium 4: The Pentium 4 includes additional floating-point and other enhancements for multimedia.

• Core: This is the first Intel x86 microprocessor with a dual core, referring to
the implementation of two processors on a single chip.

• Core 2: The Core 2 extends the architecture to 64 bits. The Core 2 Quad provides four processors on a single chip.

Computer Memory
Characteristics of Memory Systems

The complex subject of computer memory is made more manageable if we classify memory systems according to their key characteristics. The most important of these are listed below:

1. Location

Internal (e.g. processor registers, main memory, cache)

External (e.g. optical disks, magnetic disks, tapes)

2. Capacity

Number of words

Number of bytes

3. Unit of Transfer

Word

Block

4. Access Method

Sequential

Direct

Random

Associative

5. Performance

Access time

Cycle time

Transfer rate

6. Physical Type

Semiconductor

Magnetic

Optical

Magneto-optical

7. Physical Characteristics

Volatile/nonvolatile

Erasable/nonerasable

8. Organisation

Memory modules

The term location above refers to whether memory is internal or external to the computer. Internal memory is often equated with main memory, but there are other forms of internal memory. The processor requires its own local memory, in the form of registers, and the control unit portion of the processor may also require its own internal memory. Cache is another form of internal memory. External memory consists of peripheral storage devices, such as disk and tape, that are accessible to the processor via I/O controllers.

An obvious characteristic of memory is its capacity. For internal memory, this is typically expressed in terms of bytes (1 byte = 8 bits) or words. Common word lengths are 8, 16, and 32 bits. External memory capacity is typically expressed in terms of bytes.

The Memory Hierarchy

The design constraints on a computer's memory can be summed up by three questions: How much? How fast? How expensive?

The question of how much is somewhat open ended. If the capacity is there, applications will likely be developed to use it. The question of how fast is, in a sense, easier to answer. To achieve greatest performance, the memory must be able to keep up with the processor. That is, as the processor is executing instructions, we would not want it to have to pause waiting for instructions or operands. The final question must also be considered. For a practical system, the cost of memory must be reasonable in relationship to other components.

As might be expected, there is a trade-off among the three key characteristics of memory: namely, capacity, access time, and cost. A variety of technologies are used to implement memory systems, and across this spectrum of technologies, the following relationships hold:

• Faster access time, greater cost per bit

• Greater capacity, smaller cost per bit

• Greater capacity, slower access time

The dilemma facing the designer is clear. The designer would like to use memory technologies that provide for large-capacity memory, both because the capacity is needed and because the cost per bit is low. However, to meet performance requirements, the designer needs to use expensive, relatively lower-capacity memories with short access times. The way out of this dilemma is not to rely on a single memory component or technology, but to employ a memory hierarchy. The following points are observed as one goes down the hierarchy:

a. Decreasing cost per bit

b. Increasing capacity

c. Increasing access time

d. Decreasing frequency of access of the memory by the processor

Thus, smaller, more expensive, faster memories are supplemented by larger, cheaper, slower memories. The key to the success of this organization is item (d): decreasing frequency of access.
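The arithmetic behind point (d) is worth a quick check. The timing figures below are assumed purely for illustration; the point is that when most accesses are satisfied by the fast level, the average access time of a two-level hierarchy approaches that of the fast memory alone.

```python
t1 = 0.01   # access time of the small, fast memory (microseconds, assumed)
t2 = 0.10   # access time of the large, slow memory (microseconds, assumed)

for fraction_fast in (0.50, 0.95):
    # An access found in level 1 costs t1; otherwise the word must first
    # be brought from level 2 and then delivered, costing t1 + t2.
    average = fraction_fast * t1 + (1 - fraction_fast) * (t1 + t2)
    print(f"{fraction_fast:.0%} of accesses in level 1 -> "
          f"average {average:.4f} microseconds")
# 50% -> 0.0600; 95% -> 0.0150, close to the fast level's 0.01 alone
```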

Cache memory

Cache is a small, high-speed memory that stores data from some frequently used addresses (of main memory). Cache memory is intended to give memory speed approaching that of the fastest memories available, and at the same time provide a large memory size at the price of less expensive types of semiconductor memories. The concept is illustrated in Figure 10a. There is a relatively large and slow main memory together with a smaller, faster cache memory. The cache contains a copy of portions of main memory. When the processor attempts to read a word of memory, a check is made to determine if the word is in the cache. If so, the word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of words, is read into the cache and then the word is delivered to the processor. Because of the phenomenon of locality of reference, when a block of data is fetched into the cache to satisfy a single memory reference, it is likely that there will be future references to that same memory location or to other words in the block.

Figure 10: Cache and Main Memory

Figure 10b depicts the use of multiple levels of cache. The L2 cache is slower and typically larger than the L1 cache, and the L3 cache is slower and typically larger than the L2 cache.

Figure 11 illustrates the read operation. The processor generates the read address (RA) of a word to be read. If the word is contained in the cache, it is delivered to the processor. Otherwise, the block containing that word is loaded into the cache, and the word is delivered to the processor. Figure 11 shows these last two operations occurring in parallel and reflects the organization shown in Figure 12, which is typical of contemporary cache organizations. In this organization, the cache connects to the processor via data, control, and address lines. The data and address lines also attach to data and address buffers, which attach to a system bus from which main memory is reached. When a cache hit occurs, the data and address buffers are disabled and communication is only between processor and cache, with no system bus traffic. When a cache miss occurs, the desired address is loaded onto the system bus and the data are returned through the data buffer to both the cache and the processor. In other organizations, the cache is physically interposed between the processor and the main memory for all data, address, and control lines. In this latter case, for a cache miss, the desired word is first read into the cache and then transferred from cache to processor.

Figure 11: Cache Read Operation

Figure 12: Typical Cache Organization
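The check/deliver/fill sequence of Figure 11 can be sketched in a few lines. This is a deliberately simplified model: one cache level, an arbitrarily chosen block size, reads only, and no replacement policy.

```python
BLOCK_SIZE = 4          # words per block (assumed for illustration)
main_memory = {addr: f"word{addr}" for addr in range(64)}
cache = {}              # block number -> list of words in that block

def read(ra):
    """Cache read operation in the style of Figure 11."""
    block, offset = divmod(ra, BLOCK_SIZE)
    if block in cache:                       # hit: deliver word to processor
        return cache[block][offset]
    base = block * BLOCK_SIZE                # miss: load the whole block
    cache[block] = [main_memory[base + i] for i in range(BLOCK_SIZE)]
    return cache[block][offset]              # then deliver the word

print(read(13))   # miss: block 3 (words 12-15) is loaded; 'word13' delivered
print(read(14))   # hit: locality of reference pays off within the block
```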

THE ARITHMETIC AND LOGIC UNIT

The ALU is that part of the computer that actually performs arithmetic and logical operations on data. All of the other elements of the computer system (control unit, registers, memory, I/O) are there mainly to bring data into the ALU for it to process and then to take the results back out. We have, in a sense, reached the core or essence of a computer when we consider the ALU. An ALU, and indeed all electronic components in the computer, are based on the use of simple digital logic devices that can store binary digits and perform simple Boolean logic operations.

Figure 13 indicates, in general terms, how the ALU is interconnected with the rest of the processor. Data are presented to the ALU in registers, and the results of an operation are stored in registers. These registers are temporary storage locations within the processor that are connected by signal paths to the ALU. The ALU may also set flags as the result of an operation. For example, an overflow flag is set to 1 if the result of a computation exceeds the length of the register into which it is to be stored. The flag values are also stored in registers within the processor. The control unit provides signals that control the operation of the ALU and the movement of the data into and out of the ALU.

Figure 13: ALU Inputs and Outputs

INTEGER REPRESENTATION

For purposes of computer storage and processing, we do not have the benefit of minus signs and periods. Only binary digits (0 and 1) may be used to represent numbers. If we are limited to nonnegative integers, the representation is straightforward.

An 8-bit word can represent the numbers from 0 to 255, including:
11111111 = 255

10000000 = 128

00101001 = 41

00000001 = 1

00000000 = 0
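These correspondences are easy to verify, for example in Python (a quick check, not part of the original notes):

```python
for bits in ("11111111", "10000000", "00101001", "00000001", "00000000"):
    print(bits, "=", int(bits, 2))   # 255, 128, 41, 1, 0
# An 8-bit word spans the nonnegative integers 0 through 2**8 - 1 = 255.
```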

INTEGER ARITHMETIC

Negation In sign-magnitude representation, the rule for forming the negation of an integer is simple: invert the sign bit. In twos complement notation, the negation of an integer can be formed with the following rules:

1. Take the Boolean complement of each bit of the integer (including the sign bit). That is, set each 1 to 0 and each 0 to 1.

2. Treating the result as an unsigned binary integer, add 1.

This two-step process is referred to as the twos complement operation, or the taking of the twos complement of an integer.

+18 = 00010010 (twos complement)

bitwise complement = 11101101

+ 1

= 11101110 = -18

As expected, the negative of the negative of that number is itself:

-18 = 11101110 (twos complement)

bitwise complement =00010001

+ 1

= 00010010 = 18
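The two-step rule translates directly into code. A minimal 8-bit sketch, using Python integers and masking to the word length:

```python
WORD = 8                                  # word length in bits

def twos_complement(x: int) -> int:
    """Step 1: Boolean complement of every bit; step 2: add 1."""
    mask = (1 << WORD) - 1
    return ((x ^ mask) + 1) & mask

plus18 = 0b00010010
minus18 = twos_complement(plus18)
print(f"{minus18:08b}")                   # 11101110
print(f"{twos_complement(minus18):08b}")  # 00010010: negation undoes itself
```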

Addition and Subtraction

Addition proceeds as if the two numbers were unsigned integers. The first four examples in Figure 14 illustrate successful operations. If the result of the operation is positive, we get a positive number in twos complement form, which is the same as in unsigned-integer form. If the result of the operation is negative, we get a negative number in twos complement form. Note that, in some instances, there is a carry bit beyond the end of the word (indicated by shading), which is ignored. On any addition, the result may be larger than can be held in the word size being used. This condition is called overflow. When overflow occurs, the ALU must signal this fact so that no attempt is made to use the result. To detect overflow, the following rule is observed:

OVERFLOW RULE: If two numbers are added, and they are both positive or both negative, then overflow occurs if and only if the result has the opposite sign.

Figure 14: Addition of Numbers in Twos Complement Representation

Figures 14e and f show examples of overflow. Note that overflow can occur
whether or not there is a carry. Subtraction is easily handled with the following
rule:

SUBTRACTION RULE: To subtract one number (subtrahend) from another (minuend), take the twos complement (negation) of the subtrahend and add it to the minuend.

Thus, subtraction is achieved using addition, as illustrated in Figure 15. The last two examples demonstrate that the overflow rule still applies.

Figure 15: Subtraction of Numbers in Twos Complement Representation (M - S)
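Both rules can be checked with a short sketch. The sign test below is a direct transcription of the overflow rule; 8-bit twos complement words are assumed.

```python
WORD = 8
MASK = (1 << WORD) - 1          # keep results within the word
SIGN = 1 << (WORD - 1)          # the sign bit

def add(a, b):
    """Twos complement addition; the carry out of the word is ignored.
    Overflow iff both operands share a sign and the result does not."""
    result = (a + b) & MASK
    overflow = (a & SIGN) == (b & SIGN) and (result & SIGN) != (a & SIGN)
    return result, overflow

def subtract(minuend, subtrahend):
    """Subtraction rule: add the twos complement of the subtrahend."""
    return add(minuend, (~subtrahend + 1) & MASK)

print(add(0b01000000, 0b01000000))       # (128, True): 64 + 64 overflows
print(subtract(0b00000010, 0b00000111))  # (251, False): 251 = 11111011 = -5
```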
