
COMPUTER ORGANIZATION AND ARCHITECTURE (COA)
EET 2211
4TH SEMESTER – CSE & CSIT
CHAPTER 18, LECTURE 36
By Ms. Arya Tripathy

MULTICORE COMPUTERS

2 MULTICORE COMPUTERS 7/16/2021


TOPICS TO BE COVERED
• HARDWARE PERFORMANCE ISSUES
  1. Increase in Parallelism and Complexity
  2. Power Consumption
• SOFTWARE PERFORMANCE ISSUES
  1. Software on Multicore
• MULTICORE ORGANIZATION
  1. Levels of Cache
  2. Simultaneous Multithreading
• HETEROGENEOUS MULTICORE ORGANIZATION
  1. Different Instruction Set Architectures
  2. Equivalent Instruction Set Architecture



LEARNING OBJECTIVES
• Understand the hardware performance issues that have driven the move to multicore computers.

• Understand the software performance issues posed by the use of multithreaded multicore computers.

• Present an overview of the two principal approaches to heterogeneous multicore organization.



MULTICORE PROCESSOR
• A multicore processor, also known as a chip multiprocessor, combines two or more processor units (called cores) on a single piece of silicon (called a die).

• Typically, each core consists of all of the components of an independent processor, such as registers, ALU, pipeline hardware, and control unit, plus L1 instruction and data caches.

• In addition to multiple cores, contemporary multicore chips also include L2 cache and often L3 cache.

• The most highly integrated multicore processors, known as systems on chip (SoCs), also include memory and peripheral controllers.



HARDWARE PERFORMANCE ISSUES
• Microprocessor systems have experienced a steady increase in execution performance for decades. This increase is due to a number of factors, including increases in clock frequency, increases in transistor density, and refinements in the organization of the processor on the chip. All of this increases the complexity of the chip.

• The 1st hardware performance issue is INCREASE IN PARALLELISM AND COMPLEXITY.

• The organizational changes in processor design have primarily been focused on exploiting ILP, so that more work is done in each clock cycle. These changes include, in chronological order:

1. Pipelining: Individual instructions are executed through a pipeline of stages, so that while one instruction is executing in one stage of the pipeline, another instruction is executing in another stage.

2. Superscalar: Multiple pipelines are constructed by replicating execution resources. This enables parallel execution of instructions in parallel pipelines, so long as hazards are avoided.



3. Simultaneous multithreading (SMT): Register banks are expanded so that multiple threads can share the use of pipeline resources. (A thread is the smallest sequence of programmed instructions that can be managed independently by a scheduler; scheduling is the method by which work is assigned to the resources that complete the work.)
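The benefit of pipelining described above can be quantified with the classic textbook timing model (the formula is not stated on the slide, so treat this as an illustrative sketch): with a k-stage pipeline, the first instruction takes k cycles and each subsequent instruction completes one cycle later, so n instructions take k + (n - 1) cycles instead of n * k.

```python
def pipeline_speedup(k: int, n: int) -> float:
    """Ideal speedup of a k-stage pipeline over unpipelined execution
    for n instructions (ignoring hazards and stalls)."""
    unpipelined_cycles = n * k          # each instruction takes k cycles alone
    pipelined_cycles = k + (n - 1)      # fill the pipe once, then 1 per cycle
    return unpipelined_cycles / pipelined_cycles

print(round(pipeline_speedup(5, 1000), 2))  # 4.98, approaching k = 5 for large n
```

For long instruction streams the speedup approaches k, which is why designers were tempted to keep adding stages, until the extra logic, interconnect, and control signals noted above ate into the gains.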

• With each of these innovations, designers have over the years attempted to increase the performance of the system by adding complexity.

• In the case of pipelining, for example, simple three-stage pipelines were replaced by pipelines with five stages.

• There is a practical limit to how far this trend can be taken, because with more stages there is the need for more logic, more interconnections, and more control signals.

• Similarly, with superscalar organization, increased performance can be achieved by increasing the number of parallel pipelines.

• Again, there are diminishing returns as the number of pipelines increases.

• More logic is required to manage hazards and to stage instruction resources.



• This same point of diminishing returns is reached with SMT, as the complexity of managing multiple threads over a set of pipelines limits the number of threads and the number of pipelines that can be effectively utilized.

• The increase in complexity to deal with all of the logical issues related to very long pipelines, multiple superscalar pipelines, and multiple SMT register banks means that increasing amounts of the chip area are occupied with coordinating and signal-transfer logic.

• This increases the difficulty of designing, fabricating, and debugging the chips.

• In general terms, the experience of recent decades has been encapsulated in a rule of thumb known as Pollack's rule, which states that performance increase is roughly proportional to the square root of the increase in complexity.

• In other words, if you double the logic in a processor core, then it delivers only about 40% more performance.

• In principle, the use of multiple cores has the potential to provide near-linear performance improvement as the number of cores increases, but only for software that can take advantage of them.
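The contrast drawn above between growing one core and adding cores can be made concrete with a small sketch of Pollack's rule (the square-root relationship stated on the slide):

```python
import math

def pollack_performance_gain(complexity_factor: float) -> float:
    """Pollack's rule: single-core performance scales roughly with the
    square root of the increase in core complexity (logic)."""
    return math.sqrt(complexity_factor)

# Doubling the logic in one core:
print(round(pollack_performance_gain(2.0), 2))  # 1.41, i.e. only ~40% more performance

# By contrast, doubling the number of cores can, in principle, roughly
# double throughput, provided the software is sufficiently parallel.
```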



• The 2nd hardware performance issue is POWER CONSUMPTION.

• To maintain the trend of higher performance as the number of transistors per chip rises, designers have resorted to more elaborate processor designs (pipelining, superscalar, SMT) and to high clock frequencies.

• Unfortunately, power requirements have grown exponentially as chip density and clock frequency have risen.

• One way to control power density is to use more of the chip area for cache memory.

• Memory transistors are smaller and have a power density an order of magnitude lower than that of logic.

• Power considerations provide another motive for moving toward a multicore organization. Because the chip has such a huge amount of cache memory, it becomes unlikely that any one thread of execution can effectively use all that memory.

• Even with SMT, multithreading is done in a relatively limited fashion and cannot therefore fully exploit a gigantic cache, whereas a number of relatively independent threads or processes has a greater opportunity to take full advantage of the cache memory.
SOFTWARE PERFORMANCE ISSUES
• The potential performance benefits of a multicore organization depend on the ability to effectively exploit the parallel resources available to the application.

• Let us focus first on a single application running on a multicore system.

• Amdahl's law states that:

  Speedup = 1 / [(1 - f) + f/N]

  where f is the fraction of the code that is parallelizable and N is the number of processors.

• This law appears to make the prospect of a multicore organization attractive.

• But as Figure (a) on the next slide shows, even a small amount of serial code has a noticeable impact.

• If only 10% of the code is inherently serial (f = 0.9), running the program on a multicore system with eight processors yields a performance gain of only a factor of 4.7.
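The 4.7 figure quoted above follows directly from Amdahl's law; a minimal sketch, using the slide's own parameters (f = 0.9, N = 8):

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Amdahl's law: f is the parallelizable fraction of the code,
    n is the number of processors."""
    return 1.0 / ((1.0 - f) + f / n)

print(round(amdahl_speedup(0.9, 8), 2))  # 4.71: only ~4.7x from 8 cores
print(amdahl_speedup(1.0, 8))            # 8.0: linear speedup when fully parallel
```

Note how the serial 10% dominates: even with infinitely many cores, speedup here can never exceed 1/0.1 = 10.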



[Figure (a): Speedup versus number of cores for various serial fractions. Even a small amount of serial code has a noticeable impact: with 10% inherently serial code (f = 0.9), eight processors yield a speedup of only 4.7.]



In addition, software typically incurs overhead as a result of communication and
distribution of work among multiple processors and as a result of cache coherence
overhead. This overhead results in a curve where performance peaks and then begins to
degrade because of the increased burden of the overhead of using multiple processors (e.g.,
coordination and OS management) as shown in Figure (b) below.



Contd.
• However, software engineers have been addressing this problem, and there are numerous applications in which it is possible to effectively exploit a multicore system.

• Database management systems and database applications are one area in which multicore systems can be used effectively.

• Many kinds of servers can also effectively use the parallel multicore organization, because servers typically handle numerous relatively independent transactions in parallel.

• In addition to general-purpose server software, a number of classes of applications benefit directly from the ability to scale throughput with the number of cores.
Contd.
• Some of these include the following:

1. Multithreaded native applications (thread-level parallelism): Multithreaded applications are characterized by having a small number of highly threaded processes.

2. Multiprocess applications (process-level parallelism): Multiprocess applications are characterized by the presence of many single-threaded processes.

3. Java applications: Java applications embrace threading in a fundamental way.

4. Multi-instance applications (application-level parallelism): Even if an individual application does not scale to take advantage of a large number of threads, it is still possible to gain from multicore architecture by running multiple instances of applications in parallel.
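The server pattern described above, many relatively independent transactions handled in parallel, can be sketched with a worker pool. This is a hypothetical illustration (the transaction handler and its names are invented for the example, not taken from the lecture); on a multicore chip the OS can schedule the pool's workers onto separate cores.

```python
from concurrent.futures import ThreadPoolExecutor

def handle_transaction(txn_id: int) -> str:
    """Stand-in for one independent unit of server work (e.g., a DB query)."""
    return f"txn-{txn_id}: done"

# Thread-level parallelism: a pool of workers services independent
# transactions concurrently, so throughput can scale with core count.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle_transaction, range(8)))

print(results[0])  # txn-0: done
```

Because the transactions share no state, there is no serial coordination between them, which is exactly the property that keeps Amdahl's serial fraction small for server workloads.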



