Chapter 1: Introduction to Computer Architecture
ORGANIZATION
Introduction
CONTENTS
Computer Evolution
Performance evaluation
INTRODUCTION TO BASIC COMPUTER
What is a Computer?
➢ A device that accepts input, processes data, stores
data, and produces output, all according to a series of
stored instructions.
➢ Software
➢ Programs that tell a computer what to do
COMPUTER FUNCTIONAL UNITS
Input Unit
Output Unit
Memory
Bus Structure
COMPUTER FUNCTIONAL UNITS
INPUT UNIT
Converts data from the external world into a binary
format that the CPU can understand
Mouse, joystick, keyboard, etc.
OUTPUT UNIT
Converts binary data into a format
that a person can understand
Monitor, printer, LCD, LED, etc.
COMPUTER FUNCTIONAL UNITS..
MEMORY
Two types:
RAM (random-access memory, also called R/W memory) and
ROM (read-only memory)
Computer organization
operational units and their interconnections that
realize the architectural specifications
Computer architecture
is the science of integrating those components to
achieve a level of functionality and performance.
COMPUTER ARCHITECTURE AND ORGANIZATION
Computer Architecture
Architectural attributes include:
Computer Organization
•The operational units and their interconnections that realize the architectural specifications
Organizational attributes include:
•Hardware details transparent to the programmer, control signals, interfaces between the computer and peripherals, memory technology used
INSTRUCTION SET ARCHITECTURE
A term that is often used interchangeably with computer
architecture
The attributes of a computing system as seen by the
programmer
It deals with
The Instruction Set (what operations can be performed?)
The Instruction Format (how are instructions specified?)
Data storage (where is data located?)
Addressing Modes (how is data accessed?)
Exceptional Conditions (what happens if something goes wrong?)
MACHINE ORGANIZATION
the view of the computer that is seen by the logic designer.
It deals with
[Figure: Von Neumann machine]
INTERNAL BUS ORGANIZATION
[Figure: Von Neumann machine]
CATEGORIES OF COMPUTER
Cost
Computational power
Type of application
CATEGORIES OF COMPUTERS CONT..
Personal computer
Processing & storage units, visual display & audio
units, keyboards
Work stations
More computational power than PC
Costlier
CATEGORIES OF COMPUTERS CONT..
Minicomputers
a computer of medium power, more than a
microcomputer but less than a mainframe.
Mainframe
Data processing in large organizations
CATEGORIES OF COMPUTERS CONT..
Super computers
Faster than mainframes
HISTORY OF COMPUTER
➢ Modern computers result from 2 streams
of evolution
▪ Mechanization of arithmetic calculating
machines (hardware)
▪ (software)
HISTORY OF COMPUTER – MECHANICAL ERA
➢ The abacus
▪ origin unknown
▪ used by the Chinese 3 to 4 thousand years ago
➢ Blaise Pascal (1623-1662)
➢ 1642 - Pascal’s Adder
▪ 1st mechanized adding machine
▪ gears and wheels
▪ add and subtract
▪ inaccurate
HISTORY OF COMPUTER – MECHANICAL ERA
▪ debugging it
HISTORY OF COMPUTER – VON NEUMANN (1946)
architecture
GENERATIONS OF ELECTRONIC COMPUTERS
➢ 1st Generation - [1945 – 1955]
▪ vacuum tubes and relays (ENIAC)
➢ 2nd Generation – [1955 – 1965]
▪ discrete transistors (IBM 7090)
➢ 3rd Generation – [1965 – 1980]
▪ integrated circuits or chips
▪ operating systems (IBM 360)
➢ 4th Generation – [1980 - ]
▪ large-scale integration - microprocessors
ELECTRONIC COMPUTERS
FIRST GENERATION (1945 –1955)
Vacuum tubes were used for digital logic elements
and memory
IAS computer
Fundamental design approach was the stored
program concept
Attributed to the mathematician John von Neumann
First publication of the idea was in 1945 for the EDVAC
Design began at the Princeton Institute for Advanced
Studies
Completed in 1952
Prototype of all subsequent general-purpose
computers
o A main memory, which stores both data and instructions
o A control unit, which interprets the instructions in memory and causes them to be executed
[Figure: IAS computer structure. Central processing unit (CPU) with AC, MQ, and MBR registers; instructions and data; addresses]
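IAS instruction word (bits 0 to 39, two instructions per word):
Bits 0-7: left opcode (8 bits); bits 8-19: left address (12 bits); bits 20-27: right opcode (8 bits); bits 28-39: right address (12 bits)
Memory address register (MAR) • Specifies the address in memory of the word to be written from
or read into the MBR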
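The two-instruction word layout can be sketched in code. This is a minimal illustration, assuming the word is held in a Python integer with IAS bit 0 as the most significant bit; the function names `encode_ias_word` and `decode_ias_word` are hypothetical and not part of the slides.

```python
WORD_BITS = 40  # one IAS word holds two 20-bit instructions


def encode_ias_word(left_op: int, left_addr: int, right_op: int, right_addr: int) -> int:
    """Pack two (opcode, address) pairs into one 40-bit word."""
    for op in (left_op, right_op):
        if not 0 <= op < (1 << 8):
            raise ValueError("opcode must fit in 8 bits")
    for addr in (left_addr, right_addr):
        if not 0 <= addr < (1 << 12):
            raise ValueError("address must fit in 12 bits")
    # IAS bit 0 is the leftmost bit, so the left opcode sits in the top 8 bits.
    return (left_op << 32) | (left_addr << 20) | (right_op << 12) | right_addr


def decode_ias_word(word: int):
    """Split a 40-bit word into ((left_op, left_addr), (right_op, right_addr))."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word must fit in 40 bits")
    left = ((word >> 32) & 0xFF, (word >> 20) & 0xFFF)
    right = ((word >> 12) & 0xFF, word & 0xFFF)
    return left, right


if __name__ == "__main__":
    w = encode_ias_word(0b00000101, 250, 0b00000110, 251)
    print(decode_ias_word(w))
```

Masking with `0xFF` and `0xFFF` mirrors the 8-bit opcode and 12-bit address widths given for the word.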
IAS Instruction Set (excerpt)
Arithmetic
Opcode    Symbolic         Description
00000101  ADD M(X)         Add M(X) to AC; put the result in AC
00000111  ADD |M(X)|       Add |M(X)| to AC; put the result in AC
00000110  SUB M(X)         Subtract M(X) from AC; put the result in AC
00001000  SUB |M(X)|       Subtract |M(X)| from AC; put the remainder in AC
00001011  MUL M(X)         Multiply M(X) by MQ; put most significant bits of result in AC, put least significant bits in MQ
00001100  DIV M(X)         Divide AC by M(X); put the quotient in MQ and the remainder in AC
00010100  LSH              Multiply accumulator by 2; i.e., shift left one bit position
00010101  RSH              Divide accumulator by 2; i.e., shift right one position
Address modify
Opcode    Symbolic         Description
00010010  STOR M(X,8:19)   Replace left address field at M(X) by 12 rightmost bits of AC
00010011  STOR M(X,28:39)  Replace right address field at M(X) by 12 rightmost bits of AC
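To make the accumulator-based style of these instructions concrete, here is a minimal sketch of how a few of the arithmetic opcodes act on AC. It is an illustration only: it uses unbounded Python integers rather than 40-bit words, it skips MUL, DIV, and the STOR variants, and the names `OPCODES` and `execute` are hypothetical.

```python
# Opcode values and mnemonics copied from the arithmetic part of the table.
OPCODES = {
    0b00000101: "ADD M(X)",
    0b00000111: "ADD |M(X)|",
    0b00000110: "SUB M(X)",
    0b00001000: "SUB |M(X)|",
    0b00010100: "LSH",
    0b00010101: "RSH",
}


def execute(opcode: int, ac: int, memory: dict, x: int = 0) -> int:
    """Return the new accumulator value after one instruction.

    memory maps an address X to the value M(X); LSH and RSH ignore x.
    """
    name = OPCODES[opcode]
    if name == "ADD M(X)":
        return ac + memory[x]
    if name == "ADD |M(X)|":
        return ac + abs(memory[x])
    if name == "SUB M(X)":
        return ac - memory[x]
    if name == "SUB |M(X)|":
        return ac - abs(memory[x])
    if name == "LSH":
        return ac * 2   # shift left one bit position
    return ac >> 1      # RSH: shift right one position


if __name__ == "__main__":
    mem = {3: -4}
    print(execute(0b00000101, 10, mem, 3))  # ADD M(3)
    print(execute(0b00000111, 10, mem, 3))  # ADD |M(3)|
```

Every instruction reads and writes AC implicitly, which is why each one needs only a single address field.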
ELECTRONIC COMPUTERS
SECOND GENERATION (1955 –1965)
Transistors were used to design the ALU & CU
High-level languages came into use (FORTRAN)
Compilers were used to convert HLL to MLL
Separate I/O processors were developed to
operate in parallel with the CPU, thus improving
performance
Enabled by the invention of the transistor, which was faster,
smaller, and required considerably less power to
operate
ELECTRONIC COMPUTERS
THIRD GENERATION (1965‐1975)
IC technology improved
Improved IC technology helped in designing low cost, high speed
processor and memory modules
ELECTRONIC COMPUTERS
FOURTH GENERATION (1975‐1985)
CPU – termed a microprocessor
Intel, Motorola, Texas Instruments, and National Semiconductor
started developing microprocessors
MILESTONES IN COMPUTER TECHNOLOGY
Year   Name               Made by         Comments
1800s  Analytical Engine  Babbage         First digital computer
1936   Z1                 Zuse            First relay machine
1943   COLOSSUS           British gov’t   First electronic computer
1944   Mark I             Aiken           First general-purpose computer
1946   ENIAC I            Eckert/Mauchly  Modern computer history starts
1949   EDSAC              Wilkes          First stored-program computer
1952   IAS                Von Neumann     Most computers use this design
1960   PDP-1              DEC             First minicomputer
1964   360                IBM             Computer family, architecture
1964   6600               CDC             First scientific supercomputer
1974   8080               Intel           First processor on a chip
1974   CRAY-1             Cray            First vector supercomputer
1981   IBM PC             IBM             Personal computer era
1985   MIPS               MIPS            First commercial RISC machine
1990   RS6000             IBM             First superscalar microprocessor
TERMINOLOGIES
Computer
A device that accepts input, processes data, stores data, and
produces output, all according to a series of stored
instructions.
Hardware
Includes the electronic and mechanical devices that process
the data; refers to the computer as well as peripheral
devices.
Software
A computer program that tells the computer how to perform
particular tasks.
TERMINOLOGIES
Network
Peripheral devices
TERMINOLOGIES
Input
Data
Information
Output
TERMINOLOGIES
Processing
Memory
Storage
Introduction
CPU Execution Time
The Performance Equation
Determinants of Performance
SPEC Benchmark
Other Performance Metrics
Examples
INTRODUCTION
Given a collection of computers, which one to buy?
Best Performance ?
Least Cost ?
Best Cost/Performance?
How to define performance?
Time required to finish a task (individual users)
Number of tasks executed per time (throughput)
Less time to finish implies better performance
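Computer X is n times faster than computer Y when
$$n = \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{ExecutionTime}_Y}{\text{ExecutionTime}_X}$$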
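The performance ratio can be computed directly from two measured execution times. The following is a small sketch; the function name `times_faster` and the sample times (10 s and 15 s) are made up for illustration.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y.

    Performance is the reciprocal of execution time, so
    n = Performance_X / Performance_Y = ExecutionTime_Y / ExecutionTime_X.
    """
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    return exec_time_y / exec_time_x


if __name__ == "__main__":
    # Hypothetical measurements: X finishes in 10 s, Y in 15 s.
    print(times_faster(10.0, 15.0))
```

A result above 1 means X is the faster machine; a result below 1 means Y is.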
CPU EXECUTION TIME
How to measure the execution time?
Almost all modern computers are based on a clock
The clock is a periodic square wave with known period
(cycle time)
Period = 1 / Frequency
However, not all instructions take the same time!
One way to think about execution time is that it equals the
number of instructions executed multiplied by the average
time per instruction
Time = IC x CPI x CC
CPU EXECUTION TIME
The average CPI is computed by
$$\text{Effective CPI} = \frac{\sum_{k=1}^{N} IC_k \times CPI_k}{IC}$$
Where
IC_k is the number of instructions of class k
executed
CPI_k is the number of clock cycles per instruction for
that instruction class
N is the number of instruction classes
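The effective CPI formula and Time = IC x CPI x CC translate into a few lines of code. This is a sketch under stated assumptions: the function names are hypothetical, the clock is given as a rate in Hz (so CC = 1 / rate), and the instruction mix in the demo is invented.

```python
def effective_cpi(classes):
    """classes: iterable of (IC_k, CPI_k) pairs, one per instruction class."""
    classes = list(classes)
    total_ic = sum(ic for ic, _ in classes)
    total_cycles = sum(ic * cpi for ic, cpi in classes)
    return total_cycles / total_ic


def cpu_time(ic: float, cpi: float, clock_rate_hz: float) -> float:
    """Time = IC x CPI x CC, with CC = 1 / clock rate. Result in seconds."""
    return ic * cpi / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical mix: 400 one-cycle and 100 four-cycle instructions at 2 GHz.
    mix = [(400, 1), (100, 4)]
    cpi = effective_cpi(mix)
    print(cpi, cpu_time(500, cpi, 2e9))
```

Summing cycles per class before dividing by IC weights each class by how often it actually executes, which is what separates the effective CPI from a plain average of the class CPIs.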
Performance = 1 / Execution Time
Notes
Three key factors for performance: IC, CPI, and CC
CC: The clock cycle time (1 / clock rate); the clock rate is usually given
IC: Overall count of executed instructions, obtained by
using profilers or simulators
CPI: Varies by instruction type and ISA
implementation
THE PERFORMANCE EQUATION
Example 1. In a certain program, 1000
instructions were executed on a CPU running at 1
GHz. If the instruction counts and CPI for each
class are given below, how long does it take to
execute the program?
Instruction Class  Instruction Count  Class CPI
1                  200                2
2                  300                3
3                  500                1
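One way to work Example 1 is to apply Time = IC x CPI x CC with the class counts above. The steps are written out here as a sketch of the arithmetic, not as the official solution from the slides:

```latex
\begin{aligned}
\text{Clock cycles} &= \sum_{k=1}^{3} IC_k \times CPI_k
  = 200 \times 2 + 300 \times 3 + 500 \times 1 = 1800 \\
\text{Effective CPI} &= \frac{1800}{1000} = 1.8 \\
CC &= \frac{1}{1\ \text{GHz}} = 1\ \text{ns} \\
\text{Time} &= IC \times CPI \times CC
  = 1000 \times 1.8 \times 1\ \text{ns} = 1.8\ \mu\text{s}
\end{aligned}
```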
is faster ?
The original processor
Approach 1. A cache is added and it reduces the average load time to 2 cycles.
Approach 2. A branch prediction scheme is used and it cuts the branch time by 1 cycle.
Approach 3. A second ALU is added to execute two ALU instructions at once.
                                       CPIk x F
Class          Frequency F  Class CPI  Original  App1   App2  App3
ALU            50%          1          0.5       0.5    0.5   0.25
Load           20%          5          1.0       0.4    1.0   1.0
Store          10%          3          0.3       0.3    0.3   0.3
Branch         20%          2          0.4       0.4    0.2   0.4
Effective CPI                          2.2       1.6    2.0   1.95
Speedup                                          1.375  1.10  1.128
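The effective CPI and speedup figures for the three approaches can be recomputed from the class frequencies and CPIs. The sketch below assumes, as the table values suggest, that Approach 3 halves the ALU CPI to 0.5 and that IC and the clock cycle are unchanged, so speedup equals the ratio of effective CPIs. The helper names are hypothetical.

```python
# (frequency, CPI) per instruction class for the original processor.
ORIGINAL = {"ALU": (0.5, 1), "Load": (0.2, 5), "Store": (0.1, 3), "Branch": (0.2, 2)}


def effective_cpi(mix):
    """Sum of CPI_k x F_k over all classes."""
    return sum(freq * cpi for freq, cpi in mix.values())


def with_cpi(mix, cls, new_cpi):
    """Copy of mix with the CPI of one class replaced."""
    changed = dict(mix)
    changed[cls] = (mix[cls][0], new_cpi)
    return changed


APPROACHES = {
    "App1": with_cpi(ORIGINAL, "Load", 2),    # cache: load takes 2 cycles
    "App2": with_cpi(ORIGINAL, "Branch", 1),  # prediction: branch 1 cycle shorter
    "App3": with_cpi(ORIGINAL, "ALU", 0.5),   # second ALU: two ALU ops at once
}

if __name__ == "__main__":
    base = effective_cpi(ORIGINAL)
    for name, mix in APPROACHES.items():
        cpi = effective_cpi(mix)
        print(name, round(cpi, 2), round(base / cpi, 3))
```

With these inputs the cache gives the largest gain, because loads account for the largest share of the original cycles.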
DETERMINANTS OF PERFORMANCE
Execution Time = IC x CPI x CC
                        IC  CPI  CC
Algorithm               X   X
Programming Language    X   X
Compiler                X   X
ISA                     X   X    X
Processor Organization      X    X
Technology                       X
SPEC BENCHMARK
What programs can be used to evaluate
different computers? Can we cheat? Need a
standard!
SPEC Benchmark
Standard Performance Evaluation Corp (SPEC)
Programs used to measure performance (CPU, Web,
I/O…)
Typical actual workloads
SPEC CPU2006
Elapsed time to execute a selection of programs
Negligible I/O, so focuses on CPU performance
Summarize as geometric mean of performance ratios
CINT2006 (integer) and CFP2006 (floating-point)
$$\text{Geometric Mean} = \sqrt[n]{\prod_{i=1}^{n} \text{Execution Time Ratio}_i}$$
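The geometric mean of n ratios is the n-th root of their product. A minimal sketch follows, computed through logarithms so that long products do not overflow. The function name and the sample ratios are illustrative only.

```python
import math


def geometric_mean(ratios):
    """n-th root of the product of n positive ratios, computed via logs."""
    ratios = list(ratios)
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("need at least one ratio, all positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


if __name__ == "__main__":
    # Hypothetical execution time ratios for three programs.
    print(geometric_mean([1.0, 10.0, 100.0]))
```

Unlike the arithmetic mean, the geometric mean of ratios gives the same ranking of two machines whichever machine is used as the reference, which is one common argument for using it to summarize normalized results.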
SPEC BENCHMARK
CINT2006 for Opteron X4 2356
Name        Description                    IC×10^9  CPI    Tc (ns)  Exec time (s)  Ref time (s)  SPECratio
perl        Interpreted string processing  2,118    0.75   0.40     637            9,777         15.3
bzip2       Block-sorting compression      2,389    0.85   0.40     817            9,650         11.8
gcc         GNU C Compiler                 1,050    1.72   0.40     724            8,050         11.1
mcf         Combinatorial optimization     336      10.00  0.40     1,345          9,120         6.8
go          Go game (AI)                   1,658    1.09   0.40     721            10,490        14.6
hmmer       Search gene sequence           2,783    0.80   0.40     890            9,330         10.5
sjeng       Chess game (AI)                2,176    0.96   0.40     837            12,100        14.5
libquantum  Quantum computer simulation    1,623    1.61   0.40     1,047          20,720        19.8
h264avc     Video compression              3,102    0.80   0.40     993            22,130        22.3
omnetpp     Discrete event simulation      587      2.94   0.40     690            6,250         9.1
astar       Games/path finding             1,082    1.79   0.40     773            7,020         9.1
xalancbmk   XML parsing                    1,058    2.70   0.40     1,143          6,900         6.0
Geometric mean                                                                                   11.7
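The columns of the CINT2006 table are tied together by Exec time = IC x CPI x Tc and SPECratio = Ref time / Exec time. The sketch below recomputes the ratios and their geometric mean. It assumes the gcc and sjeng rows read Tc = 0.40 ns with execution times of 724 s and 837 s, which is what IC x CPI x Tc gives for those rows. All function and variable names are hypothetical.

```python
import math

# (name, exec_time_s, ref_time_s) from the CINT2006 table.
# Assumption: gcc = 724 s and sjeng = 837 s (see the lead-in).
RESULTS = [
    ("perl", 637, 9777), ("bzip2", 817, 9650), ("gcc", 724, 8050),
    ("mcf", 1345, 9120), ("go", 721, 10490), ("hmmer", 890, 9330),
    ("sjeng", 837, 12100), ("libquantum", 1047, 20720),
    ("h264avc", 993, 22130), ("omnetpp", 690, 6250),
    ("astar", 773, 7020), ("xalancbmk", 1143, 6900),
]


def exec_time(ic: float, cpi: float, tc_seconds: float) -> float:
    """Exec time = IC x CPI x Tc (Tc is the clock cycle time in seconds)."""
    return ic * cpi * tc_seconds


def spec_ratio(exec_time_s: float, ref_time_s: float) -> float:
    """SPECratio = reference time / measured execution time."""
    return ref_time_s / exec_time_s


def geometric_mean(values):
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    ratios = {name: spec_ratio(e, r) for name, e, r in RESULTS}
    print(round(ratios["perl"], 1))
    print(round(geometric_mean(ratios.values()), 1))
    # Cross-check one row: perl has IC = 2,118 x 10^9, CPI = 0.75, Tc = 0.40 ns.
    print(exec_time(2118e9, 0.75, 0.40e-9))
```

The cross-check for perl gives about 635 s against the tabulated 637 s; small differences of this kind are expected because the CPI and Tc columns are rounded.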
[Figure: CINT2000 versus Clock Speed (GHz) for P4 Extreme, Xeon, Athlon, Athlon 64, Opteron, Pmac G5, Athlon FX (DC), Core Duo, and Core 2 Duo]
SPEC BENCHMARK
CFP2000 Results for Various Processors
[Figure: CFP2000 for Pentium 3, Pentium 4, P4 Extreme, Xeon, Athlon, Athlon 64, Opteron, Pmac G5, Athlon FX (DC), Core Duo, and Core 2 Duo]
cooling)
EXAMPLES
Example 4. Given a program with 10^6 instructions with
the following mix: 10% class A, 20% class B, 50% class
C, and 20% class D. If this program is executed on two
different processors with the specifications given below,
then
Processor  CR (GHz)  CPI Class A  CPI Class B  CPI Class C  CPI Class D
1          1.5       1            2            3            4
2          2         2            2            2            2
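The questions for Example 4 did not survive in these slides, so the sketch below only computes the two quantities the data supports directly: the effective CPI and the execution time on each processor. It assumes the program has 10^6 instructions and that CR is the clock rate in GHz. All names are hypothetical.

```python
IC = 10**6  # assumed instruction count for the program
MIX = {"A": 0.10, "B": 0.20, "C": 0.50, "D": 0.20}

PROCESSORS = {
    1: {"clock_hz": 1.5e9, "cpi": {"A": 1, "B": 2, "C": 3, "D": 4}},
    2: {"clock_hz": 2.0e9, "cpi": {"A": 2, "B": 2, "C": 2, "D": 2}},
}


def effective_cpi(mix, class_cpi):
    """Weighted average CPI: sum over classes of F_k x CPI_k."""
    return sum(mix[k] * class_cpi[k] for k in mix)


def exec_time(ic, cpi, clock_hz):
    """Time = IC x CPI x CC, with CC = 1 / clock rate."""
    return ic * cpi / clock_hz


if __name__ == "__main__":
    for pid, spec in PROCESSORS.items():
        cpi = effective_cpi(MIX, spec["cpi"])
        t_ms = exec_time(IC, cpi, spec["clock_hz"]) * 1e3
        print(f"Processor {pid}: CPI = {cpi:.2f}, time = {t_ms:.3f} ms")
```

Under these assumptions processor 2 finishes first: its uniform CPI of 2 and higher clock rate outweigh processor 1's cheaper class A instructions.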
modification requires increasing the clock cycle by 10%?
modification does not affect the clock cycle but requires
twice the amount of power to execute the program
require
TIME CAN BE MEASURED IN TERMS OF:
° Response time
other programs).
MEASUREMENT OF TIME
[Figure: clock signal (Clk) with one clock period marked]
EXAMPLE
If a computer’s clock rate increases from 200 MHz to 250
MHz and the other factors remain the same, how many
times faster will the computer be?
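A worked version of this example, assuming IC and CPI are unchanged so that only the clock cycle time differs:

```latex
\begin{aligned}
\text{Time} &= IC \times CPI \times CC = \frac{IC \times CPI}{\text{clock rate}} \\
\frac{\text{Time}_{200\ \text{MHz}}}{\text{Time}_{250\ \text{MHz}}}
  &= \frac{250\ \text{MHz}}{200\ \text{MHz}} = 1.25
\end{aligned}
```

So the computer is 1.25 times faster at 250 MHz.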
COMPUTING THE CPI
The CPI is the average number of cycles per instruction.
If, for each instruction type, we know its frequency and the number
of cycles needed to execute it, we can compute the overall CPI by
the formula $CPI = \sum_k CPI_k \times F_k$
For example
Operation  F    CPI  CPI x F  % Time
ALU        50%  1    0.5      23%
Load       20%  5    1.0      45%
Store      10%  3    0.3      14%
Branch     20%  2    0.4      18%
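The % Time column is each operation's CPI x F divided by the overall CPI. A short sketch that recomputes both from the F and CPI columns (the names `OPERATIONS`, `overall_cpi`, and `time_share` are hypothetical):

```python
# (frequency F, CPI) for each operation, from the example table.
OPERATIONS = {"ALU": (0.5, 1), "Load": (0.2, 5), "Store": (0.1, 3), "Branch": (0.2, 2)}


def overall_cpi(ops):
    """CPI = sum of CPI_k x F_k over all operation types."""
    return sum(f * cpi for f, cpi in ops.values())


def time_share(ops):
    """Fraction of total cycles spent in each operation type."""
    total = overall_cpi(ops)
    return {name: f * cpi / total for name, (f, cpi) in ops.items()}


if __name__ == "__main__":
    print(round(overall_cpi(OPERATIONS), 2))
    for name, share in time_share(OPERATIONS).items():
        print(f"{name}: {share:.0%}")
```

Loads are only 20% of the instructions but take close to half of the cycles, which is the kind of imbalance this breakdown is meant to expose.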