Processors Basic

The document discusses multicore processors, highlighting their design, functionality, and advantages such as enhanced performance and reduced power consumption. It also addresses the challenges associated with multicore technology, including software dependency and performance limitations. Major use cases for multicore processors include virtualization, databases, analytics, cloud computing, and visualization.

Uploaded by

Harshit Rathour

 Multi-Core Processors

 Super Scalar Processors

 Very Long Instruction Word (VLIW) Processors

 Vector Processors
Multi-core Processors

 A multicore processor is an integrated circuit that has two or

more processor cores attached for

• Enhanced performance and

• Reduced power consumption.

 These processors also enable more efficient simultaneous

processing of multiple tasks,

• such as parallel processing and multithreading.

Multi-core Processors Contd..

 A dual core setup is similar to having multiple, separate

processors installed on a computer.

 However, because the two processors are plugged into the same
socket,

• the connection between them is faster.

Multi-core Processors Contd..

 The use of multicore processors is one approach to boost

processor performance

• without exceeding the practical limitations of

semiconductor design and fabrication.

 Using multicores also ensures safe operation in areas such as heat

generation.
How do multicore processors work?

 The heart of every processor is an execution engine, also known

as a core.

 The core is designed to process instructions and data according

to
• the software programs in the computer's memory.

 Over the years, designers found that every new processor design
has limits.
How do multicore processors work?
Contd..
 Numerous technologies have been used to accelerate
performance, such as the following:

• Clock Speed

• Hyper-threading

• More Chips
How do multicore processors work?
(Clock Speed)
Clock Speed:

 One approach is to make the processor's clock faster.

 Clock is the "drumbeat" used to

• synchronize the processing of instructions and data

through the processing engine.

 Clock speeds have accelerated from several megahertz (MHz)

to several gigahertz (GHz) nowadays.
How do multicore processors work?
(Clock Speed) Contd..
 However, transistors use up power with each clock tick.

 As a result, clock speeds have nearly reached their limits

• using current semiconductor fabrication and heat

management techniques.
How do multicore processors work?
(Hyper Threading)
Hyper Threading:

 Another approach involved the handling of multiple instruction

threads.

 Intel calls this hyper-threading.

 With hyper-threading, processor cores are designed to

• handle two separate instruction threads at the same time.

How do multicore processors work?
(Hyper Threading) Contd..

 When properly enabled and supported by both the computer's

firmware and operating system,

• hyper-threading techniques enable one physical core to

function as two logical cores.

 Still, the processor only possesses a single physical core.
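The difference between logical and physical cores can be observed from software. The following is a minimal sketch, not a definitive method: `os.cpu_count()` reports logical CPUs, and the helper `physical_core_estimate` is a hypothetical name that simply assumes every physical core exposes two logical cores, which is not true on every system.

```python
import os


def logical_core_count() -> int:
    """Return the number of logical CPUs the OS reports (at least 1)."""
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1


def physical_core_estimate(logical: int, threads_per_core: int = 2) -> int:
    """Estimate physical cores, ASSUMING each core exposes `threads_per_core` logical cores."""
    return max(1, logical // threads_per_core)


if __name__ == "__main__":
    n = logical_core_count()
    print(f"logical cores: {n}, estimated physical cores: {physical_core_estimate(n)}")
```

On a machine without hyper-threading the estimate is wrong by design; it only illustrates that one physical core can appear as two logical cores.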

How do multicore processors work?
(Hyper Threading) Contd..

 The logical abstraction of the physical processor added little real

performance to the processor

• other than to help streamline the behavior of multiple

simultaneous applications running on the computer.
How do multicore processors work?
(More Chips)
More Chips:

 The next step is to add processor chips to the processor package,

• which is the physical device that plugs into the

motherboard.

 A dual-core processor includes two separate processor cores.

 A quad-core processor includes four separate cores.

How do multicore processors work?
Contd..
 Today's multicore processors can easily include 12, 24 or even
more processor cores.

 The multicore approach is almost identical to the use of

multiprocessor motherboards,

• which have two or four separate processor sockets.

How do multicore processors work?
Contd..
 Today's huge processor performance involves the use of
processor products

• that combine fast clock speeds and multiple hyper-

threaded cores.
How do multicore processors work?
Contd..
 However, multicore chips have several issues to consider.

 First, the addition of more processor cores doesn't automatically

improve computer performance.

 The OS and applications must direct software program

instructions to

• recognize and use the multiple cores.

How do multicore processors work?
Contd..
 This must be done in parallel,

• using various threads to different cores within the

processor package.

 Some software applications may need to be refactored to

• support and use multicore processor platforms.

 Otherwise, only the default first processor core is used, and any
additional cores are unused or idle.
How do multicore processors work?
Contd..
 Second, the performance benefit of additional cores is not a
direct multiple.

 That is, adding a second core does not double the processor's
performance,

• or a quad-core processor does not multiply the processor's

performance by a factor of four.
How do multicore processors work?
Contd..
 This happens because of the shared elements of the processor,
such as access to

• Internal memory or caches,

• External buses,

• and Computer system memory.
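A common way to reason about why extra cores do not give a direct multiple is Amdahl's law, which is not named in these slides and is brought in here only as an illustration. It models a fixed serial fraction of the work rather than the shared caches and buses listed above, so treat it as a related sketch under that assumption. The function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can use all cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time; the parallel part is divided by the core count.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "cores ->", round(amdahl_speedup(0.9, n), 2), "x")
```

With 90% of the work parallelizable, four cores give roughly a 3.1x speedup in this model, not 4x.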

How do multicore processors work?
Contd..
 The benefit of multiple cores can be substantial, but there are
practical limits.

 Still, the acceleration is typically better than a traditional

multiprocessor system because

• the coupling between cores in the same package is tighter and

• there are shorter distances and fewer components between

cores.
Why are multicore processors used?

 Multicore processors work on any modern computer hardware

platform.

 Virtually all PCs and laptops today include some multicore

processor model.

 However, the true power and benefit of these processors depend on

• software applications designed to emphasize parallelism.

Why are multicore processors used?
Contd..
 A parallel approach divides application work into numerous
processing threads,

• and then distributes and manages those threads across

two or more processor cores.
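The divide-and-distribute pattern described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and it uses `ThreadPoolExecutor` from the standard library. In CPython, threads do not speed up CPU-bound work because of the global interpreter lock, so a real CPU-bound program would more likely swap in `ProcessPoolExecutor`; the structure of the code stays the same.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, parts):
    """Split `data` into `parts` nearly equal contiguous slices."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(data, workers=None):
    """Divide the work into one chunk per worker, then combine the partial results."""
    workers = workers or os.cpu_count() or 1
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum(x * x for x in chunk), chunks)
        return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(1000)), workers=4))
```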
Major Use cases for Multicore Processors

There are several major use cases for multicore processors,

including the following five:

• Virtualization

• Databases

• Analytics & HPC

• Cloud

• Visualization
Major Use cases for Multicore Processors
(Virtualization)
Virtualization:

 A virtualization platform, such as VMware, is designed to

• abstract the software environment from the underlying

hardware.

 Virtualization is capable of abstracting physical processor cores into

• virtual processors or central processing units (vCPUs)

❑ which are then assigned to Virtual Machines (VMs).

Major Use cases for Multicore Processors
(Virtualization) Contd..

 Each VM becomes a virtual server capable of running its own

OS and application.

 It is possible to assign more than one vCPU to each VM,

• allowing each VM and its application to run parallel

processing software if required.
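As a rough illustration of assigning vCPUs to VMs, the sketch below totals the vCPUs handed out and compares them with the physical cores available. The term "overcommit ratio", the function name and the VM names are assumptions made for this example; they do not come from the slides or from any particular virtualization platform.

```python
def overcommit_ratio(vm_vcpus: dict, physical_cores: int) -> float:
    """Ratio of assigned vCPUs to physical cores (above 1.0 means cores are shared)."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return sum(vm_vcpus.values()) / physical_cores


if __name__ == "__main__":
    # Hypothetical VMs and vCPU counts, for illustration only.
    vms = {"web": 2, "db": 4, "cache": 2}
    print(overcommit_ratio(vms, physical_cores=8))
```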
Major Use cases for Multicore Processors
(Databases)
Databases:

 A database is a complex software platform that frequently needs

to run many simultaneous tasks such as queries.

 As a result, databases are highly dependent on multicore

processors to

• distribute and handle these many task threads.

Major Use cases for Multicore Processors
(Databases) Contd..
 The use of multiple processors in databases is often coupled with
extremely high memory capacity

• that can reach 1 terabyte or more on the physical server.

Major Use cases for Multicore Processors
(Analytics & HPC)
Analytics and HPC:

 Big data analytics, such as

• machine learning and High Performance Computing

(HPC) both require

❑ breaking large & complex tasks into smaller and

more manageable pieces.
Major Use cases for Multicore Processors
(Analytics & HPC) Contd..
 Each piece of the computational effort can then be solved by

• distributing each piece of the problem to a different

processor.

 This approach enables each processor to work in parallel to

• solve the overarching problem far faster and more

efficiently than with a single processor.
Major Use cases for Multicore Processors
(Cloud)
Cloud:

 Organizations building a cloud adopt multicore processors to

• support all the virtualization needed to

❑ accommodate the highly scalable,

❑ and highly transactional demands of cloud

software platforms such as OpenStack.
Major Use cases for Multicore Processors
(Cloud) Contd..
 A set of servers with multicore processors can allow the cloud to

• create and scale up more VM instances on demand.

Major Use cases for Multicore Processors
(Visualization)
Visualization:

 Graphics applications, such as games and data-rendering engines,

• have the same parallelism requirements as other HPC

applications.

 Visual rendering is task-intensive,

• So visualization applications can make extensive use of

multiple processors to distribute the calculations required.
Major Use cases for Multicore Processors
(Visualization) Contd..

 Many graphics applications rely on Graphics Processing Units

(GPUs) rather than CPUs.

 GPUs are tailored to optimize graphics-related tasks.

 GPU packages often contain multiple GPU cores, similar in

principle to multicore processors.
Pros and cons of multicore processors

 Multicore processor technology is mature and well-defined.

 However, the technology poses its share of pros and cons,

• which should be considered when buying and deploying

new servers.
Advantages of Multicore Processor

Some of the advantages of multicore processors are the following:

 Better application performance

 Better hardware performance

Advantages of Multicore Processor
(Better Application Performance)
Better Application Performance:

 The principal benefit of multicore processors is more potential

processing capability.

 Each processor core is effectively a separate processor that

OSes and applications can use.

 In a virtualized server, each VM can employ one or more

virtualized processor cores,

• enabling many VMs to coexist and operate

simultaneously on a physical server.
Advantages of Multicore Processor Contd..
(Better Application Performance)
 Similarly, an application designed for high levels of parallelism
may use any number of cores to

• provide high application performance that would be

impossible with single-chip systems.
Advantages of Multicore Processor
(Better Hardware Performance)
Better Hardware Performance:

 By placing two or more processor cores on the same device, it

can use shared components such as

• Common internal buses,

• and Processor caches more efficiently.

Advantages of Multicore Processor Contd..
(Better Hardware Performance)

 It also benefits from superior performance compared with

multiprocessor systems

• that have separate processor packages on the same

motherboard.
Disadvantages of Multicore Processor

Some of the disadvantages of multicore processors are the following:

 Software dependent,

 Performance boosts are limited,

 Power, heat and clock restrictions.

Disadvantages of Multicore Processor
(Software Dependent)

Software Dependent:

 Applications use processors, not the other way around.

 By default, software that is not written for multiple cores runs on a single core,

typically the first processor core, dubbed core 0.

 Any additional cores in the processor package will remain unused

or idle

• until software applications are enabled to use them.

Disadvantages of Multicore Processor
(Software Dependent)
Contd..
 Such applications include database applications and big data
processing tools like Hadoop.

 A business should consider what a server will be used for and

the applications it plans to run

• before making a multicore system investment

❑ to ensure that the system delivers its optimum

computing potential.
Disadvantages of Multicore Processor
(Performance boosts are limited)

Performance boosts are limited:

 Multiple processors in a processor package must share common

system buses and processor caches.

 The more processor cores share a package,

• the more sharing takes place across common processor

interfaces and resources.
Disadvantages of Multicore Processor
(Performance boosts are limited)
Contd..

 This results in diminishing returns to performance as cores are

added.

 For most situations, the performance benefit of having multiple

cores
• far outweighs the performance lost to such sharing,

❑ but it's a factor to consider when testing application

performance.
Disadvantages of Multicore Processor
(Power, heat and clock restrictions)

Power, heat and clock restrictions:

 A computer may not be able to drive a processor with many

cores

• as hard as a processor with fewer cores or a single-core

processor.

 A modern processor core may contain over 500 million

transistors.
Disadvantages of Multicore Processor
(Power, heat and clock restrictions)
Contd..

 Each transistor generates heat when it switches,

• and this heat increases as the clock speed increases.

 All of that heat generation must be safely dissipated

• from the core through the processor package.

Disadvantages of Multicore Processor
(Power, heat and clock restrictions)
Contd..

 When more cores are running,

• this heat can multiply and quickly exceed the cooling

capability of the processor package.

 Thus, some multicore processors may actually reduce clock

speeds for instance,

• from 3.5 GHz to 3.0 GHz to help manage heat.

Disadvantages of Multicore Processor
(Power, heat and clock restrictions)
Contd..
 This reduces the performance of all processor cores in the
package.

 High-end multicore processors require

• complex cooling systems,

• and careful deployment & monitoring

to ensure long-term system reliability.
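The link between clock speed and heat can be made concrete with the standard first-order model of dynamic power, P = a * C * V^2 * f. This formula is not stated in the slides; it is used here as an assumption, and the capacitance and voltage values below are placeholders rather than figures for any real processor. The sketch shows what the 3.5 GHz to 3.0 GHz reduction mentioned above does to dynamic power at a fixed voltage.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """First-order dynamic power model: P = activity * C * V^2 * f (in watts)."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


if __name__ == "__main__":
    # Placeholder electrical values; only the ratio between the two results matters.
    before = dynamic_power(1e-9, 1.0, 3.5e9)
    after = dynamic_power(1e-9, 1.0, 3.0e9)
    print(f"dynamic power falls to {after / before:.1%} of its original value")
```

At constant voltage the power falls in proportion to the clock (to 3.0/3.5, about 86%), and per-core performance falls by roughly the same proportion, which is the trade-off the slide describes.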

Architecture of Multicore Processors
Architecture of Multicore Processors
Contd..
The components of multicore processors are as follows:

 Cores

 Processor Support

 Caches
Architecture of Multicore Processors
(Cores)
Cores:

 Every multicore processor consists of two or more cores along

with a series of caches.

 Cores are the central component of multicore processors.

Architecture of Multicore Processors
(Cores) Contd..
 Cores contain

• all of the registers and circuitry,

• sometimes hundreds of millions of individual transistors,

needed to perform the closely synchronized tasks of

❑ ingesting data and instructions,

❑ processing content, and outputting logical decisions or

results.
Architecture of Multicore Processors
(Processor Support)
Processor Support:

 Processor support circuitry includes an assortment of

input/output control and management circuitry, such as

• Clocks,

• Cache consistency,

• Power & thermal control,

• and External bus access.

Architecture of Multicore Processors
(Caches)
Caches:

 Caches are relatively small areas of very fast memory.

 A cache retains often-used instructions or data,

• making that content readily available to the core

❑ without the need to access system memory.

Architecture of Multicore Processors
(Caches) Contd..

 A processor checks the cache first.

 If the required content is present,

• the core takes that content from the cache, enhancing

performance benefits.

 If the content is absent, the core will access system memory for
the required content.
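The check-the-cache-first behaviour described above can be sketched with a tiny software model. This is not how hardware caches are built; the class name `TinyCache` is hypothetical, a dictionary stands in for system memory, and least-recently-used replacement is an assumption chosen for simplicity.

```python
from collections import OrderedDict


class TinyCache:
    """Toy cache in front of a dict that stands in for system memory."""

    def __init__(self, capacity, memory):
        self.capacity = capacity
        self.memory = memory          # address -> value ("system memory")
        self.lines = OrderedDict()    # cached address -> value, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Check the cache first.
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)  # mark as most recently used
            return self.lines[address]
        # Miss: fetch from system memory and keep a copy in the cache.
        self.misses += 1
        value = self.memory[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used line
        return value


if __name__ == "__main__":
    cache = TinyCache(capacity=2, memory={a: a * 10 for a in range(8)})
    for addr in (0, 1, 0, 2, 1):
        cache.read(addr)
    print("hits:", cache.hits, "misses:", cache.misses)
```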
Architecture of Multicore Processors
(Caches) Contd..
 Level 1, or L1, cache is the smallest and fastest cache and is private to
every core.

 Level 2, or L2, cache is a larger storage space shared among

the cores.

 Some multicore processor architectures dedicate both L1 and

L2 caches to each core.
Homogeneous vs. Heterogeneous
Multicore Processors

Homogeneous vs. heterogeneous multicore processors:

 The cores within a multicore processor may be homogeneous or

heterogeneous.

 Mainstream Intel and AMD multicore processors

for x86 computer architectures

• are homogeneous and provide identical cores.

Homogeneous vs. Heterogeneous
Multicore Processors
Contd..

 However, dedicating a complex core to a simple job is often wasteful

when the goal is the greatest efficiency.

 There is a heterogeneous multicore processor market

• that uses processors with different cores for different

purposes.
Homogeneous vs. Heterogeneous
Multicore Processors
Contd..
 Heterogeneous cores are generally found in embedded or Arm
processors that

• might mix microprocessor and microcontroller cores in

the same package.
Goals for Heterogeneous Multicore Processors

There are three general goals for heterogeneous multicore

processors:

 Optimized performance

 Optimized power

 Optimized security
Goals for Heterogeneous Multicore Processors
(Optimized Performance)

Optimized Performance:

 While homogeneous multicore processors are typically intended

to
• provide universal processing capabilities,

❑ many processors are not intended for such

generic system use cases.
Goals for Heterogeneous Multicore Processors
(Optimized Performance) Contd..

 Instead, they are designed and sold for use in embedded,

dedicated or task-specific systems

• that can benefit from the unique strengths of different

processors.
Goals for Heterogeneous Multicore Processors
(Optimized Performance) Contd..

 For example, a processor intended for a signal processing device

• might use an Arm processor

❑ that contains a Cortex-A general-purpose processor,

❑ with a Cortex-M core for dedicated signal processing

tasks.
Goals for Heterogeneous Multicore Processors
(Optimized Power)

Optimized Power:

 Providing simpler processor cores reduces the transistor count

and eases power demands.

 This makes the processor package and the overall system cooler
and more power-efficient.
Goals for Heterogeneous Multicore Processors
(Optimized Security)

Optimized Security:

 Jobs or processes can be divided among different types of cores,

• enabling designers to deliberately build high levels of

isolation

❑ that tightly control access among the various

processor cores.
Goals for Heterogeneous Multicore Processors
(Optimized Security) Contd..

 This greater control and isolation offer better stability and

security for the overall system,

• though at the cost of general flexibility.

Examples of Multicore Processors

Examples of multicore processors:

 Most modern processors designed and sold for general-purpose

x86 computing include multiple processor cores.

 Examples of Intel 12th-generation multicore processors include the

following:

• Intel Core i9 12900 family provides 16 cores (8 performance and 8 efficient) and 24 threads.

• Intel Core i7 12700 family provides 12 cores (8 performance and 4 efficient) and 20 threads.

• Intel Core i5 12600K provides 10 cores (6 performance and 4 efficient) and 16 threads.

Examples of Multicore Processors
Contd..
 Examples of AMD Zen multicore processors include:

• AMD Zen 3 family (provides 4 to 16 cores).

• AMD Zen 2 family (provides up to 64 cores).

• AMD Zen+ family (provides 4 to 32 cores).

What is Superscalar Processor?

 A superscalar processor is a type of microprocessor that implements a form of

parallelism

• known as instruction-level parallelism within a single processor.

❑ It executes more than one instruction during a clock

cycle by

▪ simultaneously dispatching multiple instructions

to different execution units on the processor.
What is Superscalar Processor?
Contd..
 A scalar processor executes a single instruction per clock
cycle;

 A superscalar processor can execute more than one instruction

during a clock cycle.
Features of Superscalar Processors

Features of superscalar processors include the following:

 Superscalar architecture is a parallel computing technique

utilized in various processors.

 In a superscalar computer, the CPU manages several instruction

pipelines to

• perform numerous instructions simultaneously during a

clock cycle.
Features of Superscalar Processors
Contd..
 Superscalar architectures include all pipelining features;

• in addition, several instructions execute

simultaneously in the same pipeline stage.

 Superscalar design methods normally comprise

• Parallel register renaming,

• Parallel instruction decoding,

• Speculative execution & out-of-order execution.

Features of Superscalar Processors
Contd..
 These methods are normally used alongside complementary
design methods such as

• Caching,

• Pipelining,

• Branch prediction & multi-core in recent microprocessor

designs.
Superscalar Processor Architecture

 A superscalar processor is a CPU that

• executes more than one instruction per clock cycle.

❑ Because clock speed is measured in clock

cycles per second, completing more instructions per cycle raises throughput at the same clock speed.

 As a result, it is faster than a scalar processor running at the same clock speed.

Superscalar Processor Architecture
Contd..
 Superscalar processor architecture mainly includes parallel
execution units

• where these units can implement instructions

simultaneously.

 This parallel architecture was first implemented in

Reduced Instruction Set Computer (RISC) processors, which

• use simple, short instructions to execute

calculations.
Superscalar Processor Architecture
Contd..
 Because of their superscalar abilities,

• Reduced Instruction Set Computer

(RISC) processors have normally performed better than

❑ Complex Instruction Set Computer (CISC) processors

running at the same clock frequency.

 However, most CISC processors, such as the Intel Pentium, now incorporate
some RISC-style design as well,

• which allows them to execute instructions in parallel.

Block Diagram for Superscalar Processor
Superscalar Processor Architecture
Contd..

 The superscalar processor is equipped with several processing

units for handling

• various instructions in parallel in every processing stage.

 With the above architecture,

• several instructions can start execution within the same

clock cycle.
Superscalar Processor Architecture
Contd..
 These processors are capable of

• sustaining an execution rate of more than one

instruction per cycle.

 The processor in the previous architecture diagram has

two execution units:

• one for integer operations and the other for
floating-point operations.
Superscalar Processor Architecture
Contd..
 The Instruction Fetch Unit (IFU) can

• read several instructions at a time and store them in the

instruction queue.

 In every cycle, the dispatch unit retrieves and decodes

• up to 2 instructions from the front of the queue.

Superscalar Processor Architecture
Contd..
 If the two instructions are one integer and one floating-point instruction with no
hazards between them,

• then both instructions are dispatched in the same

clock cycle.
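The dual-issue rule above can be sketched as a small simulation. It is a simplified model under stated assumptions, not the logic of any real processor: there is exactly one integer unit and one floating-point unit, only read-after-write and write-after-write hazards between the two candidate instructions are checked, and every result is assumed to be ready by the next cycle. The names `Instr`, `can_pair` and `dispatch_schedule` are hypothetical.

```python
from collections import deque, namedtuple

# unit is "int" or "fp"; dest is the register written; srcs are the registers read.
Instr = namedtuple("Instr", "unit dest srcs")


def can_pair(first, second):
    """True if `second` may be dispatched in the same cycle as `first`."""
    if first.unit == second.unit:
        return False  # structural hazard: only one unit of each kind
    if first.dest in second.srcs:
        return False  # read-after-write hazard
    if first.dest == second.dest:
        return False  # write-after-write hazard
    return True


def dispatch_schedule(program):
    """Return a list of cycles, each holding the 1 or 2 instructions dispatched in it."""
    queue = deque(program)
    cycles = []
    while queue:
        group = [queue.popleft()]
        if queue and can_pair(group[0], queue[0]):
            group.append(queue.popleft())
        cycles.append(group)
    return cycles


if __name__ == "__main__":
    program = [
        Instr("int", "r1", ("r2", "r3")),
        Instr("fp", "f1", ("f2", "f3")),
        Instr("int", "r4", ("r1",)),
        Instr("int", "r5", ("r6",)),
        Instr("fp", "f4", ("f1",)),
    ]
    for number, group in enumerate(dispatch_schedule(program), start=1):
        print("cycle", number, [f"{i.unit}->{i.dest}" for i in group])
```

In this example the five instructions are dispatched in three cycles, because the third and fourth instructions both need the integer unit.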
Scalar Pipelining

Pipelining:

 Pipelining is the procedure of breaking down tasks into sub-steps &

• executing them within different processor parts.

 Pipelining architecture in the scalar processor and the superscalar

processor is shown in next slides.
Scalar Pipelining Contd..
Super scalar Pipelining

 The instructions in a superscalar processor are issued from a

sequential instruction stream.

 The processor must be able to issue multiple instructions per clock cycle, and

• the CPU must check dynamically for data dependencies

between instructions.
Super scalar Pipelining Contd..

 In the following superscalar pipeline,

• two instructions can be fetched and dispatched at a time to

❑ complete a maximum of 2 instructions per cycle.

Super scalar Pipelining
Super scalar Pipelining Contd..

 A scalar processor issues a single instruction per clock cycle and

• performs a single instance of each pipeline stage per clock cycle,

 whereas the superscalar processor in the previous example issues two instructions per

clock cycle and

• executes two instances of each stage in parallel.

Super scalar Pipelining Contd..

 So, the instruction execution in a scalar processor takes more

time
• whereas in a superscalar it takes less time to execute
instructions.
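The time difference can be estimated with the usual ideal-pipeline cycle count. The sketch below assumes no stalls or hazards, one cycle per stage, and that a superscalar pipeline always issues a full group of instructions; the function name and the 4-stage figure are assumptions chosen for illustration, not values from the slides.

```python
import math


def pipeline_cycles(n_instructions, stages, issue_width=1):
    """Cycles to finish `n_instructions` on an ideal pipeline with no stalls."""
    if n_instructions == 0:
        return 0
    # The first group needs `stages` cycles; each further group completes one cycle later.
    groups = math.ceil(n_instructions / issue_width)
    return stages + groups - 1


if __name__ == "__main__":
    print("scalar:     ", pipeline_cycles(6, stages=4, issue_width=1), "cycles")
    print("superscalar:", pipeline_cycles(6, stages=4, issue_width=2), "cycles")
```

For six instructions on a 4-stage pipeline this gives 9 cycles for the scalar case and 6 cycles for a 2-wide superscalar case.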
Types of Superscalar Processors

 Some of the different types of superscalar processors are as

follows:

• Intel Pentium Processor

• IBM PowerPC 601

Types of Superscalar Processors
(Intel Pentium Processor)
Intel Pentium Processor:

 In the superscalar pipelined architecture of the Intel Pentium processor,

• the CPU can execute two or more instructions

per cycle.

 This processor is widely used in personal computers.

Types of Superscalar Processors
(Intel Pentium Processor)
Contd..
 Intel Pentium processor devices are normally built for

• Online use,

• Cloud computing,

• & Collaboration.

 This processor therefore works well in tablets and Chromebooks,

• providing strong local performance & efficient online

interactions.
Types of Superscalar Processors
(IBM PowerPC 601)
IBM PowerPC 601:

 The IBM PowerPC 601 is a superscalar processor from the

PowerPC family of RISC microprocessors.

 This processor is capable of issuing as well as retiring three

instructions for each clock.
Types of Superscalar Processors
(IBM PowerPC 601)
Contd..
 Instructions may issue out of order for improved performance,

• but the PowerPC 601 makes execution appear to occur in order.

Types of Superscalar Processors
(IBM PowerPC 601)
Contd..
 The PowerPC 601 processor provides

• 32-bit logical addresses,

• 8-, 16- and 32-bit integer data types,

• 32- and 64-bit floating-point data types.

Characteristics of Super scalar Processor

Superscalar processor characteristics include the following:

 A superscalar processor extends the pipelined model:

• independent instructions are

executed in parallel instead of waiting for one another.

 A superscalar processor fetches & decodes

• several instructions of the incoming instruction stream at a time.

Characteristics of Super scalar Processor
Contd..
 The architecture of superscalar processor exploits

• the potential of instruction-level parallelism.

 Whereas scalar processors issue a single instruction per

cycle, superscalar processors issue more than one.

 The number of instructions issued per cycle mainly depends on

• the instructions within the instruction stream.

Characteristics of Super scalar Processor
Contd..
 Instructions are frequently reordered to fit the architecture of
the processor better.

 The superscalar method is usually associated with some

identifying characteristics.

 Instructions are normally issued from a sequential instruction

stream.
Characteristics of Super scalar Processor
Contd..
 The CPU checks dynamically for data dependencies in between
instructions at run time.

 The CPU executes multiple instructions for each clock cycle.

Disadvantages of Superscalar Processor

Disadvantages of the superscalar processor include the following:

 Superscalar processors are not used much in small embedded

systems due to power usage.

 Instruction scheduling problems can occur in this architecture.

 A superscalar design increases the complexity of

the hardware design.
Disadvantages of Superscalar Processor
Contd..
 Instructions in this processor are fetched in
their sequential program order,

• but this is not always the best execution order.

Applications of Superscalar Processor

Applications of a superscalar processor include the following:

 The superscalar execution is frequently used in a laptop or

desktop.

 The processor scans the program during execution to

• discover sets of instructions that can be executed together.

Applications of Superscalar Processor
Contd..
 A superscalar processor includes multiple copies of the datapath
hardware,

• which execute several instructions at once.

 This processor is mainly designed to achieve an execution

rate of more than one instruction per

• clock cycle for a single sequential program.

Introduction to VLIW Architecture

 The limitations of the superscalar processor become prominent

• as the task of scheduling instructions becomes complex.

Introduction to VLIW Architecture
Contd..
 The limits of intrinsic parallelism in the instruction stream,

 complexity,

 cost,

 and the branch instruction issue

• are addressed by a different instruction set architecture called

the Very Long Instruction Word (VLIW) architecture, implemented in VLIW
machines.
Introduction to VLIW Architecture
Contd..
 VLIW uses Instruction Level Parallelism,

• i.e., the program itself controls the parallel execution of

the instructions.
Introduction to VLIW Architecture
Contd..
 In other architectures, the performance of the processor is
improved by using any of the following methods:

• pipelining (breaking the instruction into subparts),

• parallel processing (independently executing instructions

in different parts of the processor),

• out-of-order execution (executing instructions in a different order from

the program)
Introduction to VLIW Architecture
Contd..
 But each of the previous methods adds a great deal of complexity to
the hardware.

 VLIW architecture deals with this by depending on the compiler.

 The compiled program decides the parallel flow of the instructions and

resolves conflicts.

 This increases compiler complexity but decreases hardware

complexity by a lot.
Features of VLIW Architecture

Features:

 The processors in this architecture have multiple functional units

• and fetch Very Long Instruction Words from the instruction cache.

 Multiple independent operations are grouped together in a

single VLIW instruction.

 They are initiated in the same clock cycle.

 Each operation is assigned an independent functional unit.

Features of VLIW Architecture
Contd..
 All the functional units share a common register file.

 Instruction words are typically of the length 64 to 1024 bits

depending on

• the number of execution units,

• and the code length required to control each unit.

Features of VLIW Architecture
Contd..
 Instruction scheduling and parallel dispatch of the word is
done statically by the compiler.

 The compiler checks for dependencies before scheduling

parallel execution of the instructions.
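Static scheduling by the compiler can be sketched as packing operations into fixed-width instruction words. The code below is a deliberately simple greedy, in-order packer, not a real VLIW compiler: it assumes one slot each for three hypothetical functional units (`alu`, `mem`, `fpu`), single-cycle latency for every operation, and it starts a new word whenever a slot is already taken or an operation depends on a result produced in the same word. Empty slots are filled with `nop`.

```python
from collections import namedtuple

# name is a label; unit selects the slot; dest is the register written; srcs are registers read.
Op = namedtuple("Op", "name unit dest srcs")


def pack_bundles(ops, units=("alu", "mem", "fpu")):
    """Greedily pack `ops` (in program order) into VLIW words with one slot per unit."""
    bundles = []
    current = {}       # unit -> Op for the word being built
    written = set()    # registers written by operations already in the current word

    for op in ops:
        if op.unit not in units:
            raise ValueError(f"unknown functional unit: {op.unit}")
        conflict = (
            op.unit in current                        # slot already occupied
            or any(src in written for src in op.srcs)  # needs a result from this word
            or op.dest in written                     # two writes to the same register
        )
        if conflict:
            bundles.append(current)
            current, written = {}, set()
        current[op.unit] = op
        written.add(op.dest)

    if current:
        bundles.append(current)
    # Render each word as a tuple of operation names, with "nop" in unused slots.
    return [tuple(b[u].name if u in b else "nop" for u in units) for b in bundles]


if __name__ == "__main__":
    program = [
        Op("load a", "mem", "r1", ("r0",)),
        Op("add", "alu", "r2", ("r3", "r4")),
        Op("fmul", "fpu", "f1", ("f2", "f3")),
        Op("sub", "alu", "r5", ("r1", "r2")),
        Op("load b", "mem", "r6", ("r5",)),
    ]
    for word in pack_bundles(program):
        print(word)
```

Only the first word is fully used here; the `nop` slots in the other two are the kind of unfilled slots that cost code size and instruction bandwidth.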
Applications of VLIW Architecture

Some common applications of VLIW architecture include:

 Digital Signal Processing

 Multimedia Processing

 Scientific Computing

 Embedded Systems
Applications of VLIW Architecture
(Digital Signal Processing)
Digital Signal Processing (DSP):

 VLIW processors are well-suited for DSP applications because of

• their ability to perform multiple operations in parallel.

 DSP applications require high computational power

• and often involve multiple parallel data streams,

❑ which VLIW processors can handle efficiently.

Applications of VLIW Architecture
(Multimedia Processing)
Multimedia Processing:

 VLIW processors are also used for multimedia applications such

as video and audio processing,

• where high throughput and parallelism are required.

Applications of VLIW Architecture
(Scientific Computing)
Scientific Computing:

 VLIW processors can be used for scientific computing

applications,

• where high-performance computing is required to solve

complex numerical problems.
Applications of VLIW Architecture
(Embedded Systems)
Embedded Systems:

 VLIW processors are used in many embedded systems, such as

• Automotive control systems,

• Medical devices,

• and Industrial automation equipment.

Applications of VLIW Architecture
(Embedded Systems) Contd..

 These systems require high-performance processors

• that can execute multiple instructions in parallel while

consuming minimal power.
Advantages of VLIW Architecture

Advantages:

 Reduces hardware complexity.

 Reduces power consumption because of reduction of hardware

complexity.
Advantages of VLIW Architecture
Contd..
 Since the compiler takes care of

• Data dependency checks,

• Decoding,

• Instruction issue,

the hardware becomes a lot simpler.

Advantages of VLIW Architecture
Contd..
 Increases potential clock rate.

 The compiler places each operation in the instruction packet

slot that corresponds to its functional unit.
Disadvantages of VLIW Architecture

Disadvantages:

 Complex compilers are required which are hard to design.

 Increased program code size.

Disadvantages of VLIW Architecture
Contd..
 Unscheduled events,

• for example, a cache miss, can stall

the entire processor.

 When a VLIW has unfilled opcode slots,

• memory space and instruction bandwidth are

wasted.
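The cost of unfilled slots can be estimated with a small calculation. The sketch below assumes a hypothetical four-slot word with 32 bits per slot; both numbers and the example words are made up for illustration.

```python
SLOT_BITS = 32  # assumed width of one operation slot (hypothetical)


def wasted_fraction(bundles):
    """Fraction of all slots that hold a no-op instead of useful work."""
    slots = sum(len(word) for word in bundles)
    nops = sum(op == "nop" for word in bundles for op in word)
    return nops / slots


def wasted_bits(bundles, slot_bits=SLOT_BITS):
    """Number of instruction bits fetched and stored without doing work."""
    return sum(op == "nop" for word in bundles for op in word) * slot_bits


code = [
    ["add", "mul", "nop", "nop"],
    ["sub", "nop", "nop", "nop"],
]
print(wasted_fraction(code))  # share of empty slots
print(wasted_bits(code))      # bits occupied by no-ops
```

In this example five of the eight slots are empty, so more than half of the fetched instruction bits carry no work.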
Vector Processor

 A vector processor is a central processing unit

• that can process a complete vector input

with a single instruction.

 More specifically, it is a complete unit of hardware resources

• that operates on a sequential set of similar data items in

memory using a single instruction.
Vector Processor Contd..

 The elements of a vector are stored in order at successive

memory addresses.

• This is why the data is processed sequentially.

 It holds a single control unit but has multiple execution units

• that perform the same operation on different data

elements of the vector.
Vector Processor Contd..

 Unlike scalar processors, which operate on only a single pair of

data, a vector processor operates on multiple pairs of data.

 However, one can convert a scalar code into vector code.

 This conversion process is known as vectorization.

 Vector processing allows operation on multiple data elements

with a single instruction.
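The difference between scalar and vector code can be modelled in a few lines. This is only a software sketch: `scalar_add` and `vector_add` are hypothetical helpers, and the "instructions issued" count is a simplified model that assumes the whole vector fits in one vector instruction.

```python
def scalar_add(a, b):
    """Scalar style: one add instruction is issued per pair of elements."""
    result, issued = [], 0
    for i in range(len(a)):
        result.append(a[i] + b[i])
        issued += 1
    return result, issued


def vector_add(a, b):
    """Vector style: a single instruction covers every pair of elements.

    Assumes the vector fits the hardware vector length (a simplification).
    """
    result = [x + y for x, y in zip(a, b)]
    return result, 1


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
scalar_result, scalar_issued = scalar_add(a, b)
vector_result, vector_issued = vector_add(a, b)
print(scalar_result, scalar_issued)
print(vector_result, vector_issued)
```

Both versions compute the same sums; vectorization changes how many instructions are needed to express the work, not the result.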
Vector Processor Contd..

 These instructions are called single instruction, multiple

data (SIMD) or vector instructions.

 Recent CPUs make use of vector processing because

it is more advantageous than scalar processing.
Architecture and Working

 The figure below represents the typical diagram showing vector

processing by a vector computer:
[Figure: vector processing by a vector computer]
Architecture and Working
Contd..

 The functional units of a vector computer are as follows:

• IPU or Instruction Processing Unit

• Vector Register

• Scalar Register

• Scalar Processor
Architecture and Working
Contd..

• Vector Instruction Controller

• Vector Access Controller

• Vector Processor
Architecture and Working
Contd..

 Because a vector computer has several functional pipelines, it can

execute instructions over the operands in parallel.

 Both data and instructions are present in the memory at the

desired memory location.

 The instruction processing unit (IPU) fetches the

instruction from the memory.
Architecture and Working
Contd..

 Once the instruction is fetched,

• the IPU determines whether the fetched instruction is scalar

or vector in nature.

 If it is scalar in nature, then

• the instruction is transferred to the scalar register

• and then further scalar processing is performed.

Architecture and Working
Contd..

 When the instruction is vector in nature,

• it is fed to the vector instruction controller.

 The vector instruction controller first decodes the vector

instruction

• and then determines the address of the vector

operand present in the memory.
Architecture and Working
Contd..

 It then signals the vector access controller

• to request the respective operand.

 The vector access controller then fetches the desired operand

from the memory.

 Once the operand is fetched, it is provided to the

instruction register

• so that it can be processed at the vector processor.
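The instruction flow described on the preceding slides can be sketched as a small software model. Everything here is an assumption made for illustration: the dictionary-based instruction format, the `memory` contents, the addresses, and the fact that only an add operation is supported. Real hardware performs these steps in dedicated units rather than function calls.

```python
# Hypothetical memory holding two vector operands at made-up addresses.
memory = {100: [1, 2, 3, 4], 200: [10, 20, 30, 40]}


def vector_access_controller(address):
    """Fetch the vector operand stored at the given address."""
    return memory[address]


def vector_processor(a, b):
    """Apply the same operation (here: add) to every pair of elements."""
    return [x + y for x, y in zip(a, b)]


def vector_instruction_controller(instr):
    """Decode the instruction, find operand addresses, request the operands."""
    a = vector_access_controller(instr["src_a"])
    b = vector_access_controller(instr["src_b"])
    return vector_processor(a, b)


def scalar_processor(instr):
    """Scalar path: operate on a single pair of values."""
    return instr["a"] + instr["b"]


def ipu_dispatch(instr):
    """IPU step: decide whether the instruction is scalar or vector."""
    if instr["kind"] == "scalar":
        return "scalar", scalar_processor(instr)
    return "vector", vector_instruction_controller(instr)


scalar_out = ipu_dispatch({"kind": "scalar", "a": 2, "b": 3})
vector_out = ipu_dispatch({"kind": "vector", "src_a": 100, "src_b": 200})
print(scalar_out)
print(vector_out)
```

The scalar instruction never touches the vector units, while the vector instruction passes through the controller, the access controller, and the vector processor in that order.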

Architecture and Working
Contd..

 At times, when multiple vector instructions are present,

• the vector instruction controller passes them

to the task system.

 If the task system shows that the vector task is very
long,

• the processor divides the task into sub-vectors.

Architecture and Working
Contd..

 These sub-vectors are fed to the vector processor

• that makes use of several pipelines

❑ to execute the instruction over the operands

fetched from the memory at the same time.

 The various vector instructions are scheduled by the vector

instruction controller.
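Dividing a long vector task into sub-vectors can be sketched as follows. The maximum sub-vector length of 4 is an arbitrary assumption, and `split_into_subvectors` and `process_long_vector` are hypothetical helper names, not part of any real instruction set.

```python
MAX_LEN = 4  # assumed number of elements the vector processor handles at once


def split_into_subvectors(vector, max_len=MAX_LEN):
    """Cut a long vector into consecutive pieces of at most `max_len` elements."""
    return [vector[i:i + max_len] for i in range(0, len(vector), max_len)]


def process_long_vector(a, b, max_len=MAX_LEN):
    """Add two long vectors piece by piece, one sub-vector pair per step."""
    result = []
    for sub_a, sub_b in zip(split_into_subvectors(a, max_len),
                            split_into_subvectors(b, max_len)):
        result.extend(x + y for x, y in zip(sub_a, sub_b))
    return result


pieces = split_into_subvectors(list(range(10)))
total = process_long_vector(list(range(10)), [100] * 10)
print(pieces)
print(total)
```

The last sub-vector may be shorter than the others when the vector length is not a multiple of the hardware limit.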
Classification of Vector Processor

 Vector processors are classified according to

• their ability to form vectors,

• and the presence of vector instructions for processing.

 Based on these criteria, vector processor architectures are

classified as follows:

• Register to Register Architecture

• Memory to Memory Architecture

Classification of Vector Processor
Contd..
Register to Register Architecture

 This architecture is widely used in vector computers.

 In this architecture, operands and previous

results

• are fetched from main memory indirectly,

through registers.
Register to Register Architecture
Contd..
 Several vector pipelines present in the vector computer help in

• retrieving data from the registers,

• and also storing the results in the register.

 These vector registers can be programmed by user instructions.

Register to Register Architecture
Contd..
 This means that according to the register address present in the
instruction,

• the data is fetched and stored in the desired register.

 These vector registers have a fixed length,

• like the registers in a normal processing unit.

Register to Register Architecture
Contd..
 Some examples of supercomputers using the register to register
architecture are the following:

• Cray-1 (Cray Research) and the Fujitsu VP series

Memory to Memory Architecture

 In memory to memory architecture,

• the operands and results are fetched directly from

memory instead of through registers.

 However, it is to be noted here that the address of the desired

data to be accessed

• must be present in the vector instruction.

Memory to Memory Architecture
Contd..
 This architecture enables the fetching of data of size 512 bits
from memory to pipeline.

 However, due to the high memory access time,

• the pipelines of the vector computer require a longer startup

time,

❑ as more time is needed to initiate the vector

instruction.
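The effect of startup time can be illustrated with a simple timing model. The model and its numbers are assumptions, not measurements: a vector instruction is taken to cost a fixed startup delay plus one cycle per element, and the two startup values are invented to contrast a register-based and a memory-based design.

```python
REGISTER_STARTUP = 10  # hypothetical startup cycles, register to register
MEMORY_STARTUP = 50    # hypothetical startup cycles, memory to memory


def total_cycles(n_elements, startup_cycles):
    """Assumed model: fixed startup cost plus one cycle per element."""
    return startup_cycles + n_elements


def cycles_per_element(n_elements, startup_cycles):
    """Average cost of one element once the startup cost is spread out."""
    return total_cycles(n_elements, startup_cycles) / n_elements


for n in (8, 64, 1000):
    print(n,
          cycles_per_element(n, REGISTER_STARTUP),
          cycles_per_element(n, MEMORY_STARTUP))
```

Under this model the startup penalty dominates for short vectors and fades for long ones, which is why a high startup time hurts most when vectors are short.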
Memory to Memory Architecture
Contd..
 Some examples of supercomputers that use the memory to
memory architecture are the following:

• Cyber 205, developed by CDC

Characteristics of Vector Processor

Characteristics of Vector Processor:

 Vector Processors are designed to process multiple data elements

in parallel,

• while Scalar Processors process one data element at a time.

 Vector Processors can be more efficient,

• as they can complete a given task with fewer instructions

than a Scalar Processor.
Characteristics of Vector Processor
Contd..
 Vector Processors are more complex than Scalar Processors,

• and require more memory as well as power to operate.

 Vector Processors are used for more demanding tasks, such as

• scientific calculations,

• and 3D game rendering.
Characteristics of Vector Processor
Contd..
 Scalar Processors, by contrast, are used for simpler tasks, such as

• Basic calculations,

• and Web browsing.

 Vector Processors are more suitable for data-intensive

applications,

• while Scalar Processors are better suited for applications

that require fewer calculations.
Characteristics of Vector Processor
Contd..
 Vector Processors can be more expensive than Scalar Processors,

• as they require more complex hardware and software.

 Register to register architecture is better than memory to

memory architecture

• because it offers a reduction in vector access time.

Advantages of Vector Processor

 Better performance

 Highly parallel

 High memory bandwidth

 Reduced software overhead

 Improved accuracy
Applications of Vector Processor

 Computer Aided Design

 Image Processing

 Virtual Reality

 Scientific Computing

 Artificial Intelligence

 Data Analysis

