Proceedings of the 6th WSEAS International Conference on Applied Computer Science, Hangzhou, China, April 15-17, 2007 436

A Performance Analysis for Microprocessor Architectures


Nakhoon Baek∗
School of EECS
Kyungpook National University
Daegu 702-701, Korea
oceancru@gmail.com

Hwanyong Lee
Solution Division
HUONE Inc.
Daegu 702-205, Korea
hylee@hu1.com

Abstract: In this paper, we select three different CPU architectures for performance analysis: single-core, dual-core, and hyper-threading CPUs. Four kinds of operations are executed on these architectures. After analyzing all the data, we found that the single-core and dual-core CPUs behave as usually expected: the execution times of the combined operations are very close to the sum of the execution times of their component operations. In contrast, the hyper-threading CPU shows better performance when each thread performs one specific kind of operation, rather than mixed operations.

Key–Words: CPU architectures, performance analysis, multi-threading

∗ Corresponding Author.

1 Introduction

Nowadays, the performance of microprocessors is approaching its physical limits. Large-scale computers, including supercomputers and mainframes, have already met this kind of technical limit in their CPU power. Thus, various parallel processing techniques have been developed for them, including multi-threading, super-threading, hyper-threading, and so on [1].

These days, microprocessors used in conventional PCs and even in high-end embedded systems have improved their ability to support parallel processing techniques effectively. Commercial multi-core CPUs are already available, including the Intel Core2 Duo, Intel Core2 Quad, Intel Xeon, the customized triple-core CPU for the Xbox 360, etc. [2, 3].

It is clear that conventional programming models based on the sequential processing paradigm are not well suited to these multi-core CPUs. We need computer programs based on the parallel processing paradigm, such as multi-processing and/or multi-threading.

In this paper, we present experimental results on the execution times of some CPU-intensive operations, consisting of a fixed amount of integer operations and/or floating-point operations, in order to analyze which programming architecture is more suitable for the newly introduced CPU architectures.

2 Background Works

In computer programming, a thread is a lightweight process, which executes a given region of program code with a dedicated stack area [4]. In contrast to ordinary processes, threads can share their memory with each other, which is one of their strong points. Comparing the conventional sequential programming paradigm with multi-threaded programming, one of the strongest points of multi-threading is that multiple threads can be executed simultaneously, in parallel [5].

Nowadays, multiple threads can be simultaneously executed on many computer systems. On single-processor systems, time sharing is used to execute several threads in turn. By switching the executing thread very frequently, the system creates the illusion of simultaneous execution. In fact, however, a single-processor system only alternates between multiple threads, rather than physically executing them in parallel.

Multi-processor systems and multi-core processors are capable of physically executing multiple threads at the same time. In other words, multi-processing is now possible. Thus, we can use multi-threading in wider areas of programming, although in the past it was recommended only for I/O-intensive work.

Multi-threading does not always yield an overall speed-up [6]. First of all, parallel programs based on multi-threading need many more steps before they start something useful, in comparison with sequential programs. Thus, in the worst case, the preparation and arrangement times for multi-threading can account for significant
portions of the overall execution time. Of course, programmers should avoid this situation.

Hyper-threading is a recently developed technology for more efficient multi-threading on the Pentium4 microprocessor architecture, delivered by Intel. It is officially known as HTT (hyper-threading technology). In this technology, while a processing core is active, the CPU pipelines that are not in use may be used by other threads, so that a single physical processor simulates two logical processors. So, we can expect two logical processors in a hyper-threading capable CPU [7]. Figure 1 shows the conceptual diagram of the hyper-threading environment.

Figure 1: Hyper-threading architecture. (Diagram labels: on-die cache, architectural state, APIC, processor core, system bus.)

In spite of its strong points, hyper-threading also has drawbacks. Since the logical cores share the level-1 and level-2 caches, there are some security holes and some slow-downs in real-world applications [8].

Multi-core microprocessors have two or more processing cores in a single physical processor package, as shown in Figure 2. In this case, each processing core has its own resources such as caches, registers, execution units, etc. Thus, there is no resource sharing in multi-core architectures, while hyper-threading involves some kind of resource sharing. Some multi-core processors are designed to cooperate with hyper-threading technology.

Figure 2: Dual-core architecture. (Diagram labels: on-die cache, architectural state, APIC, processor core, system bus.)

Although multi-core CPUs are a cost-effective way of implementing the parallel programming paradigm, they also have some drawbacks. At this time, multi-core CPUs have slower clocks than conventional single-core CPUs. Thus, current multi-core CPUs show worse scores for sequentially designed computer programs, in comparison with single-core CPUs [9].

At this time, dual-core and quad-core CPUs are commercially available. A customized CPU for the Xbox 360 has triple cores, while some CPUs for workstation computers also have multi-core architectures.

3 Performance Analysis

In this paper, we compare several CPU architectures: single-core, dual-core, and hyper-threading CPUs. For this purpose, we select four kinds of operations. We focus on arithmetic operations, to fully utilize the internal computing power of the CPUs. To test the integer operation units and the floating-point units separately, we prepare the following operations:

• integer operations: consist of 1,000 integer additions, which are repeated 5,000,000 times.

• floating-point operations: consist of 1,000 double-precision floating-point additions, repeated 5,000,000 times.

• mixed operations: consist of 1,000 integer additions and 1,000 floating-point additions. When multiple threads are used, each thread is allotted the same amount of integer and floating-point operations.

• separated operations: consist of 1,000 integer additions and 1,000 floating-point additions. When multiple threads are used, each thread is devoted entirely to integer operations or to
floating-point operations. That is, the integer and floating-point operations are separated into independent threads.

3.1 Single-core case

We use an Intel Pentium4 1.6 GHz processor with 1.5GB memory as the testing system for the single-core case. The measured execution times for the four kinds of operations are shown in Table 1. All the operations are measured for a varying number of threads, from 1 to 8. The graphical representation of these data is shown in Figure 3. Since there is only one CPU core, the execution time is independent of the number of threads.

Table 1: Execution times on the single-core CPU. (unit: sec)

num.             operations
threads   integer   double   mixed   separated
1         1.294     2.145    3.491   3.453
2         1.306     2.157    3.459   3.435
3         1.316     2.173    3.499   3.457
4         1.326     2.155    3.463   3.457
5         1.336     2.137    3.461   3.469
6         1.346     2.165    3.595   3.477
7         1.388     2.169    3.545   3.455
8         1.390     2.147    3.511   3.467

Figure 3: Single-core CPU performance. (Plot of execution time in seconds, from 0.0 to 4.0, against the number of threads, from 1 to 8, for the integer, double, mixed, and separated operations.)

As we can easily guess, the execution times for the separated-operations and mixed-operations threads are very close to the sum of those of the integer-operations and double-operations threads. Additionally, there is no noticeable difference between the execution times of the mixed-operations and separated-operations threads.

3.2 Dual-core case

For the dual-core case, we use an Intel Core2 E6400 2.13GHz CPU system with 1.0GB memory. The experiments are the same as in the single-core case. The experimental results are summarized in Table 2 and Figure 4.

Table 2: Execution times on the dual-core CPU. (unit: sec)

num.             operations
threads   integer   double   mixed   separated
1         0.691     1.772    2.716   2.459
2         0.366     0.894    1.256   1.775
3         0.403     0.941    1.294   1.334
4         0.400     0.919    1.297   1.331
5         0.400     0.931    1.306   1.306
6         0.409     0.928    1.319   1.303
7         0.412     0.925    1.319   1.281
8         0.416     0.903    1.316   1.294

Figure 4: Dual-core CPU performance. (Plot of execution time in seconds, from 0.0 to 4.0, against the number of threads, from 1 to 8, for the integer, double, mixed, and separated operations.)

Since we use a dual-core CPU, the execution times with 2 or more threads drop to roughly half of the execution time of the single-threaded case. Similar to the single-core case, the execution times for the separated-operations and mixed-operations threads are very close to the sum of those of the integer-operations and double-operations threads.

3.3 Hyper-threading case

To test the hyper-threading CPU case, an Intel Pentium4 2.8 GHz processor with hyper-threading facility and 1.0GB memory is used. The measured execution times are listed in Table 3, and the corresponding graphical representation is shown in Figure 5.

Table 3: Execution times on the hyper-threading CPU. (unit: sec)

num.             operations
threads   integer   double   mixed   separated
1         0.756     2.200    3.266   2.941
2         0.747     1.125    1.844   2.181
3         0.744     1.144    1.875   1.816
4         0.750     1.119    1.903   1.819
5         0.766     1.128    1.909   1.856
6         0.791     1.141    1.925   1.859
7         0.781     1.156    1.950   1.853
8         0.784     1.131    1.941   1.856

Figure 5: Hyper-threading CPU performance. (Plot of execution time in seconds, from 0.0 to 4.0, against the number of threads, from 1 to 8, for the integer, double, mixed, and separated operations.)

One remarkable thing on the graph is that the separated-operations threads show better performance than the mixed-operations threads. We guess that the processor core is fully utilized when one thread uses the integer-operation unit and another thread uses the floating-point unit.

4 Conclusion

In this paper, we selected three different CPU architectures for performance analysis: single-core, dual-core, and hyper-threading CPUs. Four kinds of operations are executed on these architectures. We measured the execution times of all these operations, for different numbers of threads from 1 to 8.

After analyzing all the data, we found that the single-core and dual-core CPUs behave as usually expected, i.e., the execution times of the combined operations are very close to the sum of those of their component operations. In contrast, the hyper-threading CPU shows better performance when each thread performs one specific kind of operation, rather than mixed operations.

In conclusion, in the case of hyper-threading CPUs, multi-threaded software should be designed to avoid threads with mixed operations. More experiments and analysis are needed for more precise inferences.

Acknowledgements: This work is financially supported by the Ministry of Education and Human Resources Development (MOE), the Ministry of Commerce, Industry and Energy (MOCIE) and the Ministry of Labor (MOLAB) through the fostering project of the Industrial-Academic Cooperation Centered University.
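The observations of Sections 3.1 to 3.3 can be cross-checked directly against the measured values. The following Python sketch is not part of the original experiments. It only recomputes three quantities from the numbers in Tables 1 to 3: how far the combined times deviate from the sum of the integer and double times, the speed-up obtained with 2 threads, and the thread counts at which the separated operations are faster than the mixed operations. The function names and the data layout are assumptions made for this illustration.

```python
# Execution times in seconds, transcribed from Tables 1 to 3.
# Row i holds the times for i + 1 threads, in the column order
# (integer, double, mixed, separated).
INTEGER, DOUBLE, MIXED, SEPARATED = range(4)

SINGLE_CORE = [
    (1.294, 2.145, 3.491, 3.453), (1.306, 2.157, 3.459, 3.435),
    (1.316, 2.173, 3.499, 3.457), (1.326, 2.155, 3.463, 3.457),
    (1.336, 2.137, 3.461, 3.469), (1.346, 2.165, 3.595, 3.477),
    (1.388, 2.169, 3.545, 3.455), (1.390, 2.147, 3.511, 3.467),
]
DUAL_CORE = [
    (0.691, 1.772, 2.716, 2.459), (0.366, 0.894, 1.256, 1.775),
    (0.403, 0.941, 1.294, 1.334), (0.400, 0.919, 1.297, 1.331),
    (0.400, 0.931, 1.306, 1.306), (0.409, 0.928, 1.319, 1.303),
    (0.412, 0.925, 1.319, 1.281), (0.416, 0.903, 1.316, 1.294),
]
HYPER_THREADING = [
    (0.756, 2.200, 3.266, 2.941), (0.747, 1.125, 1.844, 2.181),
    (0.744, 1.144, 1.875, 1.816), (0.750, 1.119, 1.903, 1.819),
    (0.766, 1.128, 1.909, 1.856), (0.791, 1.141, 1.925, 1.859),
    (0.781, 1.156, 1.950, 1.853), (0.784, 1.131, 1.941, 1.856),
]


def max_sum_deviation(table):
    """Largest relative gap between a combined time (mixed or separated)
    and the sum of the integer and double times, over all thread counts."""
    worst = 0.0
    for integer, double, mixed, separated in table:
        base = integer + double
        for combined in (mixed, separated):
            worst = max(worst, abs(combined / base - 1.0))
    return worst


def speedup(table, column, threads):
    """Single-thread time divided by the time measured with `threads` threads."""
    return table[0][column] / table[threads - 1][column]


def separated_faster(table):
    """Thread counts at which separated operations beat mixed operations."""
    return [n for n, row in enumerate(table, start=1)
            if row[SEPARATED] < row[MIXED]]


if __name__ == "__main__":
    print("single-core max deviation from sum:",
          round(max_sum_deviation(SINGLE_CORE), 4))
    print("dual-core 2-thread speed-up (integer, double):",
          round(speedup(DUAL_CORE, INTEGER, 2), 2),
          round(speedup(DUAL_CORE, DOUBLE, 2), 2))
    print("hyper-threading, separated faster at threads:",
          separated_faster(HYPER_THREADING))
```

With the values of Tables 1 to 3, the single-core combined times stay within 3% of the sum of the integer and double times, the dual-core integer and double times speed up by factors of about 1.9 and 2.0 with 2 threads, and on the hyper-threading CPU the separated operations are faster than the mixed operations at every thread count except 2.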
References:
[1] J. Stokes. Introduction to multithreading, superthreading and hyperthreading, 2005. http://arstechnica.com/articles/paedia/cpu/hyperthreading.ars.
[2] J. Stokes. Inside the Xbox 360, part I: procedural synthesis and dynamic worlds, 2005. http://arstechnica.com/articles/paedia/cpu/xbox360-1.ars.
[3] J. Stokes. Inside the Xbox 360, part II: the Xenon CPU, 2005. http://arstechnica.com/articles/paedia/cpu/xbox360-2.ars.
[4] K. Wackowski and P. Gepner. Hyper-threading technology speeds clusters. In Proc. 5th Int’l Conf. on Parallel Proc. and Appl. Math., pages 17–26, 2003.
[5] G. Keren. Multi-threaded programming with POSIX threads, 2002. http://users.actcom.co.il/~choo/lupg/index.html.
[6] D. Sarkar. Cost and time-cost effectiveness of multiprocessing. IEEE Trans. Parallel Distrib. Syst., 4(6):704–712, 1993.
[7] T. Martinez and S. Parikh. Understanding dual processors, hyper-threading technology, and multi-core systems. Intel Optimizing Center, 2005. http://www.devx.com/Intel/Article/27399.
[8] C. Percival. Cache missing for fun and profit. In BSDCan ’05, 2005.
[9] J. Handy. The Cache Memory Book. Academic Press, 1998.
