CN100394381C - Synchronous multi-thread processor circuit and operating method - Google Patents

Info

Publication number
CN100394381C
Authority
CN
China
Prior art keywords
threads
performance index
processor
currently running
performance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CNB2004100430627A
Other languages
Chinese (zh)
Other versions
CN1534463A (en
Inventor
朴基豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/631,601 external-priority patent/US7152170B2/en
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Publication of CN1534463A publication Critical patent/CN1534463A/en
Application granted granted Critical
Publication of CN100394381C publication Critical patent/CN100394381C/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181Instruction operation extension or modification
    • G06F9/30189Instruction operation extension or modification according to execution mode, e.g. mode flag
    • AHUMAN NECESSITIES
    • A41WEARING APPAREL
    • A41DOUTERWEAR; PROTECTIVE GARMENTS; ACCESSORIES
    • A41D19/00Gloves
    • A41D19/015Protective gloves
    • A41D19/01547Protective gloves with grip improving means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824Operand accessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Textile Engineering (AREA)
  • Power Sources (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Processing circuits associated with the operation of threads in an SMT processor can be configured to operate at different performance indexes based on the number of threads currently running on the SMT processor. For example, in some embodiments according to the invention, processing circuits associated with the operation of threads in the SMT processor, such as a floating point unit or a data cache, can operate in a high power mode or a low power mode based on the number of threads currently running on the SMT processor. Furthermore, as the number of threads run by the SMT processor increases, the performance index of the processing circuits can be reduced, thereby providing the benefits of the SMT processor architecture while allowing a reduction in the amount of power consumed by the processing circuits associated with the threads. Related computer program products and methods are also disclosed.

Description

Synchronous multi-thread processor circuit and operating method

This application claims priority from Korean Patent Application No. 2003-107595, filed on February 20, 2003, the entire contents of which are incorporated herein by reference.

Technical Field

The present invention relates generally to computer processor architectures, and more particularly to simultaneous multithreading computer processors, related computer program products, and methods of operating them.

Background

Simultaneous multithreading (SMT) is a processor architecture that uses hardware multithreading to allow multiple independent threads to issue instructions in each cycle. Unlike other hardware multithreading architectures, in which only a single hardware context (i.e., thread) is active in any given cycle, an SMT architecture allows all thread contexts to compete for and share processor resources simultaneously.

An SMT processor can use otherwise wasted cycles to execute instructions, which can reduce the impact of long-latency operations in the SMT processor. Moreover, performance may improve as the number of threads increases, which may also increase the power consumed by the SMT processor.

A block diagram of a conventional SMT processor is illustrated in FIG. 1. The operation of the conventional SMT processor of FIG. 1 is discussed in Dean M. Tullsen, Susan J. Eggers, Henry M. Levy, Jack L. Lo, Rebecca L. Stamm, et al., "Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor," The 23rd Annual International Symposium on Computer Architecture, pp. 191-202, 1996, the disclosure of which is incorporated herein by reference. The architecture and operation of conventional SMT processors are well known in the art and are not described in further detail here.

Summary of the Invention

Embodiments according to the invention can provide processing circuits, computer program products, and/or methods that operate at different performance indexes based on the number of threads run by a simultaneous multithreading (SMT) processor. For example, in some embodiments according to the invention, processing circuits associated with the operation of threads in the SMT processor, such as a floating point unit or a data cache, can operate in one of a high power mode and a low power mode based on the number of threads currently running on the SMT processor. Furthermore, as the number of threads run by the SMT processor increases, the performance index of the processing circuits can be reduced, thereby providing the benefits of the SMT processor architecture while allowing a reduction in the amount of power consumed by the processing circuits associated with the threads. In other words, the SMT processor can operate at the same power but with higher performance, or can consume more power but operate at a higher performance index than a conventional SMT processor.

In some embodiments according to the invention, the processing circuit can be configured to operate at a first performance index when the number of threads currently running on the SMT processor is less than or equal to a threshold, and at a second performance index when the number of threads currently running on the SMT processor is greater than the threshold.
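The single-threshold rule described above can be sketched in a few lines of Python. This is only an illustrative software model of a hardware decision, not the patented circuit; the names `PerformanceIndex` and `select_performance_index` are hypothetical and do not appear in the patent.

```python
from enum import Enum


class PerformanceIndex(Enum):
    """Hypothetical labels for the two performance indexes."""
    FIRST = "first (higher) performance index"
    SECOND = "second (lower) performance index"


def select_performance_index(running_threads: int, threshold: int) -> PerformanceIndex:
    """Return FIRST when running_threads <= threshold, otherwise SECOND."""
    if running_threads < 0:
        raise ValueError("thread count cannot be negative")
    if running_threads <= threshold:
        return PerformanceIndex.FIRST
    return PerformanceIndex.SECOND
```

Note that the boundary case (thread count equal to the threshold) selects the first performance index, matching the "less than or equal to" wording of the text.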

In some embodiments according to the invention, a performance index control circuit can be configured to provide a performance index to the processing circuit based on the number of threads currently running on the SMT processor. In some embodiments according to the invention, the performance index control circuit can raise the performance index provided to the processing circuit to a first performance index when the number of threads currently running on the SMT processor is less than or equal to a threshold. When the number of threads currently running on the SMT processor exceeds the threshold, the performance index control circuit can lower the performance index provided to at least one processing circuit to a second performance index that is less than the first performance index.

In some embodiments according to the invention, when the number of threads currently running on the SMT processor exceeds a second threshold that is greater than the first threshold, the performance index control circuit further lowers the performance index of the processing circuit to a third performance index that is less than the second performance index.

Various embodiments of performance index variation can be provided according to the invention. For example, in some embodiments according to the invention, the processing circuit can be a cache memory circuit including a tag memory and a data memory, where the data memory provides cached data concurrently with an access to the tag memory when the cache memory circuit operates at the first performance index. The data memory can be configured to provide cached data in response to a hit in the tag memory when the cache memory circuit operates at a second performance index that is less than the first performance index.

In some embodiments according to the invention, the cache memory can be at least one of a data cache memory for storing data operated on by instructions and an instruction cache memory for storing instructions that operate on associated data. In some embodiments according to the invention, the data cache memory can further be configured not to provide cached data in response to a miss in the tag memory when operating at the second performance index.

In some embodiments according to the invention, the processing circuit can be a floating point unit. In some embodiments according to the invention, the floating point unit can be a first floating point unit configured to operate at the first performance index when the number of threads run by the SMT processor is less than or equal to the threshold, and the SMT processor can further include a second floating point unit configured to operate at a second performance index, less than the first performance index, when the number of threads run by the SMT processor is greater than the threshold.

In some embodiments according to the invention, the performance index control circuit can be configured to increment or decrement the number of threads currently running on the SMT processor in response to threads being created and completed, respectively, in the SMT processor.
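The counting behavior described above (increment on thread creation, decrement on thread completion) can be modeled as a small Python class. This is a sketch under stated assumptions: the class name `PerformanceIndexController`, its methods, and the string labels `"first"` and `"second"` are hypothetical, and a single threshold is assumed.

```python
class PerformanceIndexController:
    """Toy model: tracks the running-thread count and reports a performance index."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.running_threads = 0

    def _current_index(self) -> str:
        # At or below the threshold the first (higher) index applies.
        return "first" if self.running_threads <= self.threshold else "second"

    def thread_created(self) -> str:
        """Increment the count for a newly created thread and return the index."""
        self.running_threads += 1
        return self._current_index()

    def thread_completed(self) -> str:
        """Decrement the count for a completed thread and return the index."""
        if self.running_threads == 0:
            raise RuntimeError("no running thread to complete")
        self.running_threads -= 1
        return self._current_index()
```

With a threshold of 2, creating a third thread moves the reported index from `"first"` to `"second"`, and completing one of the three threads moves it back.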

In some embodiments according to the invention, a second processing circuit can be configured to operate at a second performance index, less than the first performance index, in response to the number of threads currently running on the SMT processor increasing above the threshold.

In some embodiments according to the invention, the performance index control circuit can be configured to lower the performance index provided to at least one processing circuit in response to the creation of a new thread that increases the number of threads currently running on the SMT processor from less than or equal to the threshold to greater than the threshold. In some embodiments according to the invention, the performance index control circuit can be configured to lower the performance index of the processing circuit to one of a plurality of descending performance indexes as the number of threads currently running on the SMT processor exceeds each of a plurality of ascending thresholds.
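The mapping from ascending thresholds to descending performance indexes can be illustrated with Python's standard `bisect` module. This is a minimal sketch, assuming the indexes are supplied as an ordered list with one more entry than the list of thresholds; the function name `performance_index_for` and the example values are hypothetical.

```python
from bisect import bisect_left


def performance_index_for(running_threads, thresholds, indexes):
    """Pick an index by counting how many thresholds the thread count exceeds.

    thresholds: ascending thread-count thresholds, e.g. [2, 4]
    indexes:    descending performance indexes, one more than thresholds
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be ascending")
    if len(indexes) != len(thresholds) + 1:
        raise ValueError("need exactly one more index than thresholds")
    # bisect_left counts the thresholds strictly below running_threads,
    # i.e. the thresholds that the thread count has exceeded.
    exceeded = bisect_left(thresholds, running_threads)
    return indexes[exceeded]
```

For example, with thresholds `[2, 4]` and indexes `["first", "second", "third"]`, thread counts of 2, 3, and 5 select `"first"`, `"second"`, and `"third"` respectively.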

In some embodiments according to the invention, the performance index control circuit can be configured to maintain the first performance index for a first processing circuit and, in response to the number of threads currently running on the SMT processor increasing from less than or equal to the threshold to greater than the threshold, to provide a second performance index, less than the first performance index, to a second processing circuit.

In other embodiments according to the invention, a performance index control circuit can be configured to provide a performance index to a processing circuit in the SMT processor based on the number of threads currently running on the SMT processor.

In still other embodiments according to the invention, a thread management circuit can be configured to allocate processing circuits associated with the SMT processor to a thread running in the SMT processor after the thread is created. A performance index control circuit can be configured to provide one of a plurality of performance indexes to the processing circuits based on a comparison of the number of threads currently running on the SMT processor with at least one threshold.

In still other embodiments according to the invention, a cache memory associated with the SMT processor can include a tag memory and a data memory, where the data memory can be accessed either concurrently with, or after, an access to the tag memory, based on the number of threads currently running on the SMT processor.

Brief Description of the Drawings

FIG. 1 is a block diagram illustrating a conventional simultaneous multithreading (SMT) processor circuit architecture.

FIG. 2 is a block diagram illustrating an embodiment of an SMT processor according to the invention.

FIG. 3 is a block diagram illustrating an embodiment of a thread management circuit according to the invention.

FIG. 4 is a block diagram illustrating an embodiment of a performance index control circuit according to the invention.

FIG. 5 is a flowchart illustrating an embodiment of a performance index control circuit according to the invention.

FIG. 6 is a block diagram illustrating an embodiment of a cache memory according to the invention.

FIG. 7 is a block diagram illustrating an embodiment of an SMT processor according to the invention.

FIG. 8 is a block diagram illustrating an embodiment of an SMT processor according to the invention.

FIG. 9 is a block diagram illustrating an embodiment of an SMT processor according to the invention.

FIG. 10 is a block diagram illustrating an embodiment of a performance index control circuit according to the invention.

FIG. 11 is a flowchart illustrating an embodiment of a performance index control circuit according to the invention.

Detailed Description

The invention is described more fully below with reference to the accompanying drawings, in which illustrative embodiments of the invention are shown. The invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout.

It will be understood that although the terms "first" and "second" are used here to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. Thus, a first element discussed below could be termed a second element, and similarly a second element could be termed a first element, without departing from the scope of the disclosure.

As will be appreciated by one of skill in the art, the invention may be embodied as a circuit, a method, and/or a computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the invention may take the form of a computer program product on a computer-usable storage medium having computer-usable program code embodied in the medium. Any suitable computer-readable medium may be used, including hard disks, CD-ROMs, optical storage devices, or magnetic storage devices.

Computer program code, or "code", for carrying out operations according to the invention may be written in an object-oriented programming language such as Java, Smalltalk, or C++, or in JavaScript, Visual Basic, TSQL, Perl, or other programming languages. Software embodiments of the invention do not depend on implementation in a particular programming language. Portions of the code may execute entirely on one or more systems used by an intermediate server.

The code may execute entirely on one or more computer systems, or may execute partly on a server and partly on a client within a client device, or on a proxy server at an intermediate point in a communications network. In the latter scenario, the client device may be connected to the server through a local area network or a wide area network (for example, an intranet), or the connection may be made through the Internet (for example, via an Internet service provider). The invention may be embodied using various protocols over various types of computer networks.

The invention is described below with reference to block diagrams and flowchart illustrations of methods, systems, and computer program products according to embodiments of the invention. It will be understood that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a simultaneous multithreading (SMT) processor circuit, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the computer processor or other programmable data processing apparatus, create means for implementing the functions specified in the block or blocks of the block diagrams and/or flowcharts.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in the block or blocks of the block diagrams and/or flowcharts.

The computer program instructions may also be loaded onto an SMT processor circuit or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus, producing a computer-implemented process such that the instructions that execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the block or blocks of the block diagrams and/or flowcharts.

Embodiments according to the invention can provide processing circuits associated with the operation of threads in an SMT processor, where the processing circuits are configured to operate at different performance indexes based on the number of threads currently running on the SMT processor. It will be understood that different performance indexes can include different circuit operating speeds and/or different levels of precision. In some embodiments according to the invention, the processing circuits can operate at different clock speeds and/or use different circuit types (for example, different types of CMOS devices) to provide the different performance indexes. For example, in some embodiments according to the invention, processing circuits associated with the operation of threads in the SMT processor, such as a floating point unit or a data cache, can operate in a high power mode at a high clock speed or in a low power mode at a low clock speed, based on the number of threads currently running on the SMT processor. Furthermore, as the number of threads run by the SMT processor increases, the performance index of the processing circuits can be reduced, thereby providing the benefits of the SMT processor architecture while allowing a reduction in the amount of power consumed by the processing circuits associated with the threads.

It will be understood that embodiments according to the invention can exploit thread-level parallelism, which uses multiple threads that inherently run in parallel with one another. As used here, a "thread" can be a separate process with associated instructions and data. A thread can represent a process that is part of a parallel computer program having multiple processes. A thread can also represent a separate computer program that runs independently of other programs. Each thread can have an associated state defined, for example, by the respective states of the associated instructions, data, program counter, and/or registers. The associated state of a thread can include enough information for the thread to be executed by the processor.

In some embodiments according to the invention, a performance index control circuit is configured to provide respective performance indexes to the processing circuits allocated to threads created in the SMT processor. For example, the performance index control circuit can provide a first performance index so that the processing circuits operate in a high power mode, and can provide a second performance index to processing circuits that are to operate in a low power mode. In still other embodiments according to the invention, the performance index control circuit provides intermediate performance indexes (that is, additional performance indexes between high power and low power).

In some embodiments according to the invention, the processing circuit that operates at different performance indexes can be a cache memory including a tag memory and a data memory. When the cache memory operates at the first performance index (i.e., in the high power mode), the tag memory and the data memory can be accessed concurrently, regardless of whether the access to the tag memory results in a hit. Concurrent access to the data memory can provide higher performance when the hit rate in the tag memory is high. Alternatively, the cache memory can operate at the second performance index (i.e., in the low power mode), in which the data memory is accessed only in response to a hit in the tag memory. Therefore, if a tag miss occurs, some of the power consumption associated with accessing the data memory can be avoided. In addition, if a tag hit occurs, the accesses to the tag memory and the data memory are offset in time.
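The difference between the two cache access modes can be made concrete with a toy software model that counts how often the data memory is read. This is a sketch only: the class `ToyCache`, the direct-mapped organization, and the `data_reads` counter (used here as a stand-in for data-memory energy) are assumptions for illustration and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Direct-mapped toy cache that counts data-memory reads."""
    num_sets: int = 4
    tags: dict = field(default_factory=dict)   # set index -> stored tag
    data: dict = field(default_factory=dict)   # set index -> cached value
    data_reads: int = 0

    def fill(self, address: int, value) -> None:
        index, tag = address % self.num_sets, address // self.num_sets
        self.tags[index] = tag
        self.data[index] = value

    def lookup(self, address: int, high_power: bool):
        index, tag = address % self.num_sets, address // self.num_sets
        hit = self.tags.get(index) == tag
        if high_power:
            # First performance index: the data memory is read concurrently
            # with the tag check, so it is read even when the tag misses.
            self.data_reads += 1
            value = self.data.get(index)
            return value if hit else None
        # Second performance index: the data memory is read only after a tag hit.
        if not hit:
            return None
        self.data_reads += 1
        return self.data[index]
```

In this model a miss costs one data-memory read in the high power mode and none in the low power mode, which mirrors the power saving on tag misses described above; the extra latency of the serialized hit path is not modeled.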

In still other embodiments, the processing circuit associated with the operation of threads in the SMT processor can be an instruction cache or another type of processing circuit, such as a floating point circuit or an integer load/store circuit. Furthermore, each of these processing circuits can operate at different performance indexes. For example, in some embodiments according to the invention, the cache memory, the instruction cache, the floating point circuit, and the integer load/store circuit can operate concurrently at different performance indexes.

In still further embodiments according to the invention, processing circuits of the same type (for example, floating point circuits and integer load/store circuits) can be divided into different performance classes, so that some of the circuits are configured to operate at the first performance index while others are configured to operate at the second performance index. For example, in some embodiments according to the invention, some of the floating point circuits available for allocation to threads in the SMT processor can be configured to operate in a high power mode, while other floating point circuits available for allocation to threads in the SMT processor can be configured to operate in a low power mode.

FIG. 2 is a block diagram illustrating an embodiment of an SMT processor according to the invention. Referring to FIG. 2, when a new thread is created in the SMT processor 200, the thread management circuit 205 allocates a set of processing circuits for use by the newly created thread. The allocated processing circuits can include a program counter 215, a set of floating point registers 245, and a set of integer registers 250. Other processing circuits can also be allocated to the newly created thread. It will be understood that when a thread completes, the processing circuits allocated for use by that thread can be released so that they can be reallocated to subsequently created threads.
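The allocate-on-create and release-on-complete behavior described for the thread management circuit can be sketched as a simple resource pool. This is an illustrative software analogy, assuming each hardware context (program counter plus register sets) is identified by an integer; the class `ThreadManager` and its method names are hypothetical.

```python
class ThreadManager:
    """Toy model of allocating and releasing per-thread hardware contexts."""

    def __init__(self, num_contexts: int):
        self.free_contexts = list(range(num_contexts))
        self.assigned = {}  # thread id -> context number

    def create_thread(self, thread_id: str) -> int:
        """Allocate a free context to a newly created thread."""
        if thread_id in self.assigned:
            raise ValueError(f"thread {thread_id!r} already exists")
        if not self.free_contexts:
            raise RuntimeError("no free hardware context")
        context = self.free_contexts.pop(0)
        self.assigned[thread_id] = context
        return context

    def complete_thread(self, thread_id: str) -> None:
        """Release the context so it can be reallocated to a later thread."""
        context = self.assigned.pop(thread_id)
        self.free_contexts.append(context)
```

A context released by a completed thread becomes available to the next thread that is created, as in the last sentence of the paragraph above.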

In operation, the instruction fetch circuit 210 fetches an instruction from the instruction cache 220, based on the location provided by the allocated program counter 215, and provides the instruction to the decoder 225. The decoder 225 outputs the decoded instruction to the register renaming circuit 230. Depending on the type of the instruction, the register renaming circuit 230 provides the renamed instruction to either the floating point instruction queue 235 or the integer instruction queue 240. For example, if the instruction provided by the register renaming circuit 230 is a floating point instruction, the instruction is loaded into the floating point instruction queue 235, whereas if the instruction provided by the register renaming circuit 230 is an integer instruction, the instruction is loaded into the integer instruction queue 240.
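The routing step at the end of this pipeline (renamed instructions going to one of two queues by type) can be sketched as follows. This is a simplified illustration, assuming each instruction is a `(kind, text)` pair where `kind` is either `"fp"` or `"int"`; the function name `dispatch` and the instruction encoding are hypothetical, and fetch, decode, and renaming are not modeled.

```python
from collections import deque


def dispatch(instructions):
    """Route renamed instructions to a floating point or an integer queue."""
    fp_queue, int_queue = deque(), deque()
    for kind, text in instructions:
        if kind == "fp":
            fp_queue.append(text)    # stands in for floating point instruction queue 235
        elif kind == "int":
            int_queue.append(text)   # stands in for integer instruction queue 240
        else:
            raise ValueError(f"unknown instruction kind: {kind!r}")
    return fp_queue, int_queue
```

Each queue preserves the program order of the instructions routed to it, so a later stage can consume them first-in, first-out with `popleft()`.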

Instructions from the floating point instruction queue 235 or the integer instruction queue 240 are loaded into an associated register for execution by a floating point circuit 255 or an integer load/store circuit 260. In particular, floating point instructions are transferred from the floating point instruction queue 235 to the set of floating point registers 245, where they can be accessed by the floating point circuit 255. The floating point circuit 255 can also access floating point data stored in a data cache 265, for example when an instruction executed by the floating point circuit 255 (from the floating point registers 245) refers to data stored in the data cache 265.

Integer instructions are transferred from the integer instruction queue 240 to the integer registers 250. The integer load/store circuit 260 can access the integer instructions stored in the integer registers 250 in order to execute them. The integer load/store circuit 260 can also access the data cache 265, for example when an integer instruction stored in the integer registers 250 refers to integer data stored in the data cache 265.

According to embodiments of the invention, the thread management circuit 205 provides a performance level to the data cache 265. In particular, the performance level controls whether the data cache 265 operates at a first performance level or a second performance level (i.e., in a high power mode or a low power mode). For example, the thread management circuit 205 can provide the first performance level, at which the data cache 265 operates in the high power mode, or the second performance level, at which the data cache 265 operates in the low power mode. Although the data cache 265 is described as operating at either the first or the second performance level, more performance levels may be used in embodiments according to the invention.

FIG. 3 is a block diagram illustrating an embodiment of a thread management circuit according to the present invention. Referring to FIG. 3, the thread management circuit 305 receives information related to the creation of threads in the SMT processor from an operating system or, alternatively, from a thread generation circuit. The thread management circuit 305 includes a thread allocation circuit 330 that can allocate processing circuits to threads created in the SMT processor.

The thread management circuit 305 also includes a performance level control circuit 340 that provides a performance level to the processing circuits associated with threads created in the SMT processor. The performance level control circuit 340 can provide the performance level based on the number of threads currently running on the SMT processor. In particular, as the number of threads running on the SMT processor increases, the performance level control circuit 340 can provide a reduced performance level to the processing circuits associated with those threads. The performance level control circuit 340 can determine the number of threads currently running by incrementing and decrementing an internal count in response to the creation and completion of threads.

It will be understood that the performance level provided to the processing circuits can have a default value, such as the first performance level (the high power mode). As threads are added, the performance level provided to the processing circuits can be reduced, which lowers their performance and therefore their power consumption. The performance level can be provided to the processing circuits over a signal line that carries a signal having at least two states: the first performance level and the second performance level. For example, after the SMT processor is initialized, the number of threads it runs can be zero, and the performance level provided to the processing circuits takes its default value, the first performance level (high power mode). As threads are added and their number eventually exceeds a threshold, the performance level can be changed to the second performance level, for example by changing the state of the signal that indicates the performance level in use.

FIG. 4 is a block diagram illustrating an embodiment of a performance level control circuit according to the present invention. Referring to FIG. 4, a counter circuit 405 can receive information from the operating system or the thread generation circuit discussed with reference to FIG. 3 to determine the number of threads currently running on the SMT processor. For example, if the counter circuit 405 indicates that four threads are running when information about the creation of a new thread is received, the counter circuit 405 is incremented to reflect that the SMT processor is now running five threads.

The counter circuit 405 provides the number of threads currently running on the SMT processor to a comparator circuit 410, which also receives a threshold value. The threshold value can be a programmable value indicating the number of threads above which the performance level is changed. Thus, while the number of threads currently running on the SMT processor is less than or equal to the threshold, the performance level provided to the processing circuits remains at the first performance level, for example the high power mode. When the number of threads currently running exceeds the threshold, the performance level can be reduced to lower the power consumption of the SMT processor.
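The counter-plus-comparator arrangement can be modeled in a few lines. This is a sketch under stated assumptions rather than the circuit itself: the class name `PerformanceLevelControl`, the string constants `HIGH` and `LOW`, and the guard against decrementing below zero are all introduced for illustration.

```python
HIGH, LOW = "high_power", "low_power"  # first and second performance levels


class PerformanceLevelControl:
    """Hypothetical model of counter circuit 405 and comparator circuit 410."""

    def __init__(self, threshold):
        self.threshold = threshold  # programmable threshold value
        self.count = 0              # zero running threads after initialization

    def thread_created(self):
        self.count += 1

    def thread_completed(self):
        # Assumption: the count never goes negative.
        if self.count > 0:
            self.count -= 1

    def level(self):
        # Comparator: stay at the first level up to and including the threshold.
        return HIGH if self.count <= self.threshold else LOW


ctl = PerformanceLevelControl(threshold=4)
for _ in range(4):
    ctl.thread_created()
level_at_four = ctl.level()    # four threads, threshold four
ctl.thread_created()
level_at_five = ctl.level()    # fifth thread exceeds the threshold
ctl.thread_completed()
level_back_at_four = ctl.level()
```

With a threshold of four, the level stays high for four threads, drops to low when a fifth thread is created, and returns to high once a thread completes.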

FIG. 5 is a flowchart illustrating operations of an embodiment of a performance level control circuit according to the present invention. Referring to FIG. 5, when the SMT processor is initialized, the number of threads it is currently running is zero (block 500). As threads are created and completed in the SMT processor, the number N of threads currently running is incremented or decremented (block 505). For example, when four threads are running in the SMT processor, N is 4. When a new thread is created, N is incremented to 5; if a thread subsequently completes, N is decremented back to 4.

The number of threads currently running on the SMT processor is compared with the threshold (block 510). If the number is less than or equal to the threshold, the performance level control circuit provides the first performance level to the processing circuits allocated to the threads (block 515). For example, if the processing circuit allocated to the threads is the cache memory discussed with reference to FIG. 2, the cache memory can operate so that the tag memory and the data memory are accessed concurrently (i.e., in the high power mode). If, on the other hand, the number of threads running on the SMT processor is greater than the threshold (block 510), the performance level control circuit provides the second performance level to the processing circuits associated with the threads (block 520). For example, in the embodiment discussed above with reference to FIG. 2, the cache memory can operate at the second performance level, so that the data memory is accessed only in response to a hit in the tag memory (i.e., in the low power mode).

FIG. 6 is a block diagram illustrating an embodiment of the cache memory shown in FIG. 2 according to the present invention. Referring to FIG. 6, a tag memory 610 stores the addresses of the data stored in a data memory 620. The SMT processor accesses the tag memory 610 using an address associated with the data. A tag comparison circuit 630 compares entries in the tag memory 610 with the address to determine whether the data required by the SMT processor is stored in the data memory 620. If the tag comparison circuit 630 determines that the tag memory 610 indicates the required data is stored in the data memory 620, a tag hit occurs; otherwise a tag miss occurs. On a tag hit, an output enable circuit 650 enables data to be output from the data memory 620.

According to embodiments of the invention, the performance level provided by the performance level control circuit controls how the tag memory 610 and the data memory 620 operate. In particular, if the first performance level is provided to the cache memory, a data memory enable circuit 640 enables the data memory 620 to be accessed concurrently with the tag memory 610, regardless of whether a tag hit occurs. In contrast, if the second performance level is provided to the cache memory, the data memory enable circuit 640 does not enable access to the data memory 620 unless a tag hit occurs.

Therefore, according to embodiments of the invention, in the high power mode the tag memory 610 and the data memory 620 can be accessed concurrently to provide improved performance, whereas in the low power mode the data memory 620 is accessed only when the tag memory 610 indicates a tag hit, which allows the power consumption of the cache memory to be reduced.
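The difference between the two modes can be made concrete by counting data memory reads. The sketch below is a simplification under explicit assumptions: it models a direct-mapped cache with one tag per index, the function name `cache_read` is hypothetical, and a data memory read is used as a rough proxy for power. In hardware the high power mode gains latency by overlapping the two accesses, which a sequential model cannot show.

```python
def cache_read(tags, data, index, tag, low_power):
    """Return (value_or_None, data_memory_reads) for one lookup.

    Assumes a direct-mapped cache: `tags[index]` and `data[index]`
    describe the single line that the address can map to.
    """
    if not low_power:
        # High power mode: the data memory is read alongside the tag
        # lookup, whether or not the lookup turns out to be a hit.
        candidate = data[index]
        hit = tags[index] == tag
        return (candidate if hit else None), 1
    # Low power mode: the data memory is read only after a tag hit.
    if tags[index] == tag:
        return data[index], 1
    return None, 0


tags = [7, 3]
data = ["a", "b"]
```

On a miss, the high power lookup still spends one data memory read that is then discarded, while the low power lookup spends none.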

FIG. 7 is a block diagram illustrating an embodiment of the present invention applied to an instruction cache. Referring to FIG. 7, a thread management circuit 700 allocates an instruction cache 722 to a new thread. A performance level control circuit included in the thread management circuit 700 can provide a performance level to the instruction cache 722 to control how the instruction cache 722 operates.

In particular, the instruction cache 722 can operate in a high power mode in response to the first performance level and in a low power mode in response to the second performance level. As described above, for example with reference to FIG. 5, the first and second performance levels can be provided to the instruction cache 722 based on the number of threads currently running on the SMT processor. Moreover, the instruction cache 722 can operate at different performance levels in a manner similar to that described above with reference to FIG. 6, where, in the low power mode, the data memory 620 is accessed only in response to a tag hit. Different performance levels can also be provided in the instruction cache when, for example, consecutive memory accesses are determined to fall in the same cache line, which allows direct addressing. This type of restriction can be implemented using a direct-addressed cache, which avoids reading the tag random access memory (RAM) and eliminates the tag comparison. A direct-addressed cache can also avoid the translation from virtual address to physical address.

FIG. 8 is a block diagram illustrating an embodiment of separate processing circuits having different performance levels according to the present invention. Referring to FIG. 8, a first floating point circuit 805 is configured to operate at a first performance level, whereas a second floating point circuit 815 is configured to operate at a second performance level that is lower than the first. In other words, the first floating point circuit 805 is used in the high power mode, whereas the second floating point circuit 815 is used in the low power mode.

A first integer load/store circuit 810 is configured to operate at the first performance level, whereas a second integer load/store circuit 820 is configured to operate at the second performance level. A thread management circuit 800 provides two separate performance levels. In particular, the first performance level is provided to the first floating point circuit 805 and the first integer load/store circuit 810, and the second performance level is provided to the second floating point circuit 815 and the second integer load/store circuit 820. Accordingly, the first floating point circuit 805 and the first integer load/store circuit 810 are allocated to threads running at the first performance level, whereas the second floating point circuit 815 and the second integer load/store circuit 820 are allocated to threads running at the second performance level. It will be understood that the thread management circuit 800 can provide the first and second performance levels separately or concurrently. It will also be understood that more than two separate floating point circuits and integer load/store circuits can be provided to support additional performance levels.

According to embodiments of the invention, when the number of threads running on the SMT processor is less than or equal to a first threshold, the first performance level can be provided to the first floating point circuit 805 and the first integer load/store circuit 810. When the number of threads currently running on the SMT processor exceeds the first threshold, the second performance level can be provided to the second floating point circuit 815 and the second integer load/store circuit 820. Thus, once the number of threads running on the SMT processor exceeds the threshold, all threads (both those that already existed and those newly created) can use the second floating point circuit 815 and the second integer load/store circuit 820, reducing the power consumption of the SMT processor.
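The selection between the two sets of execution circuits can be summarized as a single comparison. The sketch below is illustrative only; the function name `select_units` and the string labels are hypothetical stand-ins for circuits 805, 810, 815, and 820.

```python
def select_units(running_threads, first_threshold):
    """Return the (floating point, integer load/store) pair used by all threads."""
    if running_threads <= first_threshold:
        # First performance level: the high power circuits.
        return ("fp_805", "int_ls_810")
    # Second performance level: existing and new threads alike move
    # to the low power circuits.
    return ("fp_815", "int_ls_820")
```

Because the result depends only on the current thread count, every thread switches together when the count crosses the threshold, as the paragraph above describes.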

It will be understood that the floating point circuits and integer load/store circuits according to the invention can operate at different clock speeds and/or use different circuit types (such as different types of CMOS devices) to provide the different performance levels. For example, in some embodiments according to the invention, the floating point circuits associated with the running of threads in the SMT processor can operate, based on the number of threads currently running on the SMT processor, in a high power mode at a high clock speed or in a low power mode at a low clock speed.

FIG. 9 is a block diagram illustrating an embodiment of an SMT processor including multiple processing circuits that respond to separate performance levels provided by a thread management circuit 900. In particular, the thread management circuit 900 provides three separate performance levels to an instruction cache 930, a data cache 965, first and second floating point circuits 905, 915, and first and second integer load/store circuits 910, 920. It will be understood that the performance level provided to the first and second floating point circuits 905, 915 and to the first and second integer load/store circuits 910, 920 can operate in the manner discussed above with reference to FIG. 8. Furthermore, the data cache 965 and the instruction cache 930 can operate in the manner discussed above with reference to FIGS. 2 and 7, respectively.

Accordingly, separate performance levels can be provided to different processing circuits so that those circuits operate at different performance levels, giving finer control over the trade-off between performance and power consumption. For example, the instruction cache can operate at the first performance level while the data cache 965, the first and second floating point circuits 905, 915, and the first and second integer load/store circuits 910, 920 operate at the second performance level. Other combinations of performance levels can also be used.

FIG. 10 is a block diagram illustrating an embodiment of the performance level control circuit included in the thread management circuit 900 of FIG. 9. In particular, the performance level control circuit includes a counter 1000 that is incremented and decremented in response to threads being created and completed in the SMT processor. First through third registers 1015, 1020, 1025 can each store a separate threshold for the number of threads currently running on the SMT processor. Three comparator circuits 1030, 1035, and 1040 are coupled to the corresponding registers 1015, 1020, 1025. In particular, the first register 1015, which stores a first threshold, is coupled to the first comparator circuit 1030. The second register 1020, which stores a second threshold, is coupled to the second comparator circuit 1035. The third register 1025, which stores a third threshold, is coupled to the third comparator circuit 1040.

Each of the comparator circuits 1030, 1035, 1040 compares the number of threads currently running on the SMT processor with the threshold stored in its respective register. If the first comparator circuit 1030 determines that the number of threads currently running on the SMT processor is greater than the first threshold in the first register 1015, the first comparator circuit 1030 generates a performance level 1045, which is coupled to the data cache 965 as shown in FIG. 9. Thus, when the number of threads running on the SMT processor exceeds the threshold in the first register 1015, the performance level of the data cache 965 changes from the first performance level to the second performance level (i.e., from the high power mode to the low power mode).

If the second comparator circuit 1035 determines that the number of threads currently running on the SMT processor exceeds the threshold stored in the second register 1020, the second comparator circuit 1035 generates a performance level 1050, which is coupled to the instruction cache 930, thereby changing the performance level of the instruction cache 930 from the first performance level to the second performance level (i.e., from the high power mode to the low power mode).

If the third comparator circuit 1040 determines that the number of threads currently running on the SMT processor exceeds the threshold stored in the third register 1025, the third comparator circuit 1040 generates a performance level 1055, which is coupled to the first and second floating point circuits 905, 915 and to the first and second integer load/store circuits 910, 920. The performance level of these processing circuits is thereby also changed from the first performance level to the second performance level (i.e., from the high power mode to the low power mode). It will be understood that the performance level 1055 coupled to the floating point circuits and the integer load/store circuits operates in the manner described above with reference to FIG. 8.

FIG. 11 is a flowchart illustrating operations of the embodiment of the performance level control circuit shown in FIG. 10. Referring to FIG. 11, when the SMT processor is initialized, the number of threads it is currently running is zero (block 1100). As threads are created and completed in the SMT processor, the number of threads currently running is incremented and decremented to provide a value N that represents the number of threads currently running on the SMT processor (block 1105).

If the number of threads currently running on the SMT processor is less than or equal to the first threshold (block 1110), all of the processing circuits continue to operate at the first (high) performance level (block 1115). If, on the other hand, the number of threads currently running exceeds the first threshold (block 1110), the processing circuit coupled to the performance level 1045 begins to operate at the second (low) performance level (block 1120).

If the number of threads currently running on the SMT processor is less than or equal to a second threshold (block 1125), the processing circuits coupled to the performance level 1050 (and those coupled to the performance level 1055) begin (or continue) to operate at the first performance level, while the processing circuit coupled to the performance level 1045 (as described above) continues to operate at the second performance level (block 1130).

If the number of threads currently running on the SMT processor exceeds the second threshold (block 1125), the processing circuit coupled to the performance level 1050, together with the processing circuit coupled to the performance level 1045, begins (or continues) to operate at the second performance level (block 1135), whereas the processing circuits coupled to the performance level 1055 continue to operate at the first performance level.

If the number of threads currently running on the SMT processor is less than or equal to a third threshold (block 1140), the processing circuits coupled to the performance level 1055 continue to operate at the first performance level, whereas the processing circuits coupled to the performance levels 1045 and 1050 continue to operate at the second performance level (block 1145). If the number of threads currently running exceeds the third threshold (block 1140), the processing circuits coupled to the performance level 1055 begin (or continue) to operate at the second performance level (i.e., in the low power mode) (block 1150).
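The staged behavior of the three comparators can be sketched as three independent comparisons against one shared thread count. This is an illustrative model only: the function name `performance_levels` and the dictionary keys are hypothetical, and the example values assume the thresholds are ordered first <= second <= third, which the flowchart ordering suggests but the text does not state outright.

```python
HIGH, LOW = "high_power", "low_power"  # first and second performance levels


def performance_levels(n, t1, t2, t3):
    """Map the running thread count n to a level for each circuit group.

    t1, t2, t3 stand in for the thresholds held in registers 1015,
    1020, and 1025; each comparison models one comparator circuit.
    """
    return {
        "data_cache_965": LOW if n > t1 else HIGH,         # level 1045
        "instruction_cache_930": LOW if n > t2 else HIGH,  # level 1050
        "fp_and_int_ls_units": LOW if n > t3 else HIGH,    # level 1055
    }
```

With thresholds of 2, 4, and 6, for example, three running threads put only the data cache in the low power mode, five also move the instruction cache, and seven move the floating point and integer load/store circuits as well.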

As described above, embodiments according to the invention can provide processing circuits associated with the running of threads in an SMT processor, where the processing circuits operate at different performance levels based on the number of threads currently running on the SMT processor. For example, in some embodiments according to the invention, processing circuits associated with the running of threads in the SMT processor, such as a floating point unit or a data cache, can operate in a high power mode or a low power mode based on the number of threads currently running on the SMT processor.

Furthermore, as the number of threads running on the SMT processor increases, the performance level of the processing circuits can be reduced, which preserves the advantages of the SMT processor architecture while allowing a reduction in the total power consumed by the processing circuits associated with the threads. For example, in some embodiments according to the invention, the processing circuits can operate at different clock speeds and/or use different circuit types (such as different types of CMOS devices) to provide the different performance levels. In some embodiments, processing circuits associated with the running of threads in the SMT processor, such as a floating point circuit or a data cache, can operate, based on the number of threads currently running on the SMT processor, in a high power mode at a high clock speed or in a low power mode at a low clock speed.

Many alterations and modifications may be made by those of ordinary skill in the art, given the benefit of the present disclosure, without departing from the spirit and scope of the invention. It must therefore be understood that the illustrated embodiments have been set forth only as examples and should not be taken as limiting the invention as defined by the following claims. The following claims are therefore to be read as covering not only the combination of elements that is literally set forth, but also all equivalent elements for performing substantially the same function in substantially the same way to obtain substantially the same result. The claims are thus to be understood to include what is specifically illustrated and described above, what is conceptually equivalent, and what incorporates the essential idea of the invention.

Claims (32)

1. A simultaneous multi-threading (SMT) processor, comprising: a performance level control circuit configured to provide a performance level for at least one processing circuit based on the number of threads currently running on the SMT processor; and at least one processing circuit associated with the running of threads in the SMT processor, the processing circuit being configured to operate at different performance levels based on the number of threads currently running on the SMT processor.
2. The SMT processor of claim 1, wherein the at least one processing circuit is configured to operate at a first performance level when the number of threads currently running on the SMT processor is less than or equal to a threshold; and wherein the at least one processing circuit is configured to operate at a second performance level when the number of threads currently running on the SMT processor is greater than the threshold.
3. The SMT processor of claim 1, wherein the performance level control circuit maintains a first performance level provided to the at least one processing circuit when the number of threads currently running on the SMT processor is less than or equal to a threshold; and wherein the performance level control circuit lowers the performance level provided to the at least one processing circuit to a second performance level, lower than the first performance level, when the number of threads currently running on the SMT processor exceeds the threshold.
4. The SMT processor of claim 3, wherein the threshold comprises a first threshold, and wherein the performance level control circuit further lowers the performance level provided to the at least one processing circuit to a third performance level, lower than the second performance level, when the number of threads currently running on the SMT processor exceeds a second threshold that is greater than the first threshold.
5. The SMT processor of claim 1, wherein the at least one processing circuit comprises a cache memory circuit including a tag memory and a data memory, the data memory being configured to provide data stored in the cache memory concurrently with an access to the tag memory when the cache memory circuit operates at a first performance level; and wherein the data memory is configured to provide data stored in the cache memory in response to a hit in the tag memory when the cache memory circuit operates at a second performance level lower than the first performance level.
6. The SMT processor of claim 5, wherein the cache memory circuit comprises a data cache memory for storing data operated on by instructions and an instruction cache memory for storing instructions that operate on associated data.
7. The SMT processor of claim 5, wherein the data memory is further configured not to provide data stored in the cache memory in response to a miss in the tag memory when operating at the second performance level.
8. The SMT processor of claim 1, wherein the at least one processing circuit comprises a floating point unit.
9. The SMT processor of claim 8, wherein the floating point unit comprises a first floating point unit configured to operate at a first performance level when the number of threads running on the SMT processor is less than or equal to a threshold, the SMT processor further comprising: a second floating point unit configured to operate at a second performance level, lower than the first performance level, when the number of threads running on the SMT processor is greater than the threshold.
10. The SMT processor of claim 1, wherein the at least one processing circuit comprises an integer register.
11. The SMT processor of claim 1, wherein the performance level control circuit is configured to determine the number of threads currently running on the SMT processor by incrementing or decrementing an internal count in response to the creation and completion, respectively, of threads in the SMT processor.
12. The SMT processor of claim 1, wherein the at least one processing circuit comprises a first processing circuit configured to operate at a first performance level in response to the number of threads currently running in the SMT processor being reduced to less than or equal to a threshold, the SMT processor further comprising: a second processing circuit configured to operate at a second performance level, lower than the first performance level, in response to the number of threads currently running in the SMT processor being increased above the threshold.
13. The SMT processor of claim 1, wherein the performance level control circuit is configured to lower the performance level provided to the at least one processing circuit in response to the creation of a new thread that increases the number of threads currently running on the SMT processor from less than or equal to a threshold to greater than the threshold.
14. The SMT processor of claim 1, wherein the performance level control circuit is configured to lower the performance level of the at least one processing circuit step by step, to the next lower of a series of descending performance levels, each time the number of threads currently running on the SMT processor exceeds one of a plurality of ascending thresholds.
15. The SMT processor of claim 1, wherein the performance level control circuit is configured to maintain a first performance level for a first processing circuit and, in response to the number of threads currently running on the SMT processor increasing from less than or equal to a threshold to greater than the threshold, to provide a second performance level, lower than the first performance level, to a second processing circuit.
16. A simultaneous multi-threading (SMT) processor, comprising: a performance level control circuit configured to provide a performance level to a processing circuit in the SMT processor based on the number of threads currently running on the SMT processor, wherein the performance level control circuit is further configured to determine the number of threads currently running on the SMT processor by incrementing an internal count in response to the creation of a new thread, thereby providing a new number of running threads, and to provide the performance level to the processing circuit based on the new number of threads running on the SMT processor.
17. The SMT processor of claim 16, wherein the performance level control circuit is configured to maintain a first performance level provided to the processing circuit when the number of threads currently running on the SMT processor is less than or equal to a threshold; and wherein the performance level control circuit lowers the performance level provided to the processing circuit to a second performance level, lower than the first performance level, when the number of threads currently running on the SMT processor exceeds the threshold.
18. The SMT processor of claim 17, wherein the performance level control circuit is further configured to maintain the first performance level for a first processing circuit and, in response to the number of threads currently running on the SMT processor increasing from less than or equal to the threshold to greater than the threshold, to provide a second performance level, lower than the first performance level, to a second processing circuit.
19. The SMT processor of claim 16, wherein the processing circuit comprises at least one of a floating point unit and a data cache memory.
20. The SMT processor of claim 16, wherein the processing circuit is configured to operate at a first performance level when the number of threads currently running on the SMT processor is less than or equal to a threshold; and wherein the processing circuit is configured to operate at a second performance level when the number of threads currently running on the SMT processor is greater than the threshold.
21. A simultaneous multi-threading (SMT) processor, comprising: a thread management circuit configured to allocate processing circuits associated with the SMT processor to threads running in the SMT processor as the threads are created; and a performance level control circuit configured to provide one of a plurality of performance levels to the processing circuits based on a comparison of the number of threads currently running on the SMT processor with at least one threshold.
22. The SMT processor of claim 21, wherein the performance level control circuit is configured to maintain a first performance level provided to the processing circuits when the number of threads currently running on the SMT processor is less than or equal to the at least one threshold; and wherein the performance level control circuit is configured to lower the performance level provided to the processing circuits to a second performance level, lower than the first performance level, when the number of threads currently running on the SMT processor exceeds the at least one threshold.
23. The SMT processor of claim 21, wherein the performance level control circuit is configured to lower the performance level provided to the processing circuits in response to the creation of a new thread that increases the number of threads currently running on the SMT processor from less than or equal to the at least one threshold to greater than the at least one threshold.
24. The SMT processor of claim 21, wherein the performance level control circuit is configured to lower the performance level provided to the processing circuits step by step, to the next lower of a series of descending performance levels, each time the number of threads currently running on the SMT processor exceeds one of a plurality of ascending thresholds.
25. The SMT processor of claim 21, wherein the performance level control circuit is configured to maintain a first performance level for a first processing circuit and, in response to the number of threads currently running on the SMT processor increasing from less than or equal to the at least one threshold to greater than the at least one threshold, to provide a second performance level, lower than the first performance level, to a second processing circuit.
26. A cache memory associated with the SMT processor of claim 1, the cache memory comprising a tag memory and a data memory that, depending on the number of threads currently running on the SMT processor, are either accessed concurrently or accessed with the data memory following the tag memory, wherein the tag memory and the data memory are accessed concurrently in response to the number of threads currently running on the SMT processor being less than or equal to a threshold, and wherein the data memory is accessed in response to a hit in the tag memory when the number of threads currently running on the SMT processor is greater than the threshold.
27. A method of operating a simultaneous multi-threading (SMT) processor, comprising: providing a performance level to at least one processing circuit based on the number of threads currently running on the SMT processor, wherein the providing step is preceded by: comparing the number of threads currently running on the SMT processor with a threshold in order to provide the performance level to the at least one processing circuit, wherein the comparing step is preceded by: incrementing the number of threads currently running on the SMT processor in response to a new thread starting in the SMT processor; and decrementing the number of threads currently running on the SMT processor in response to a thread ending in the SMT processor.
28. The method of claim 27, wherein the providing step comprises: providing a first performance level to the at least one processing circuit if the number of threads currently running on the SMT processor is less than or equal to the threshold; and providing a second performance level, lower than the first performance level, to the at least one processing circuit if the number of threads currently running on the SMT processor exceeds the threshold.
29. The method of claim 28, further comprising: providing further reduced performance levels to the processing circuit in response to the creation of new threads that increase the number of threads currently running on the SMT processor beyond further thresholds in addition to ascending first, second and third thresholds.
30. A simultaneous multi-threading (SMT) processor, comprising: means for providing a performance level to at least one processing circuit based on the number of threads currently running on the SMT processor; means for comparing the number of threads currently running on the SMT processor with a threshold in order to provide the performance level to the at least one processing circuit; means for incrementing the number of threads currently running on the SMT processor in response to a new thread starting in the SMT processor; and means for decrementing the number of threads currently running on the SMT processor in response to a thread ending in the SMT processor.
31. The SMT processor of claim 30, wherein the means for providing comprises: means for providing a first performance level to the at least one processing circuit if the number of threads currently running on the SMT processor is less than or equal to the threshold; and means for providing a second performance level, lower than the first performance level, to the at least one processing circuit if the number of threads currently running on the SMT processor exceeds the threshold.
32. The SMT processor of claim 31, further comprising: means for providing further reduced performance levels to the processing circuit in response to the creation of new threads that increase the number of threads currently running on the SMT processor beyond further thresholds in addition to ascending first, second and third thresholds.
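
As an informal illustration only, and not part of the claims, the thread-count mechanism recited above can be modeled in software. The sketch below is a behavioral model under stated assumptions: the class name PerformanceLevelController, the function cache_lookup, the convention that level 0 is the highest performance level, and the example thresholds are all hypothetical and do not come from the patent. It shows an internal count that is incremented and decremented as threads start and end, a comparison of that count against ascending thresholds, and a cache whose data memory is read either alongside the tag check or only after a tag hit.

```python
from bisect import bisect_left


class PerformanceLevelController:
    """Hypothetical model: maps the running-thread count to a level.

    Level 0 is assumed to be the highest performance level. Each ascending
    threshold that the thread count exceeds lowers the level by one step.
    """

    def __init__(self, thresholds):
        self.thresholds = list(thresholds)
        # Thresholds must be strictly ascending, e.g. [2, 4].
        if self.thresholds != sorted(set(self.thresholds)):
            raise ValueError("thresholds must be strictly ascending")
        self.thread_count = 0  # internal count of running threads

    def thread_created(self):
        """Increment the count for a new thread; return the new level."""
        self.thread_count += 1
        return self.level()

    def thread_completed(self):
        """Decrement the count for a finished thread; return the new level."""
        if self.thread_count == 0:
            raise RuntimeError("no running thread to complete")
        self.thread_count -= 1
        return self.level()

    def level(self):
        # bisect_left counts the thresholds strictly below the thread
        # count, which is the number of thresholds the count exceeds.
        return bisect_left(self.thresholds, self.thread_count)


def cache_lookup(tags, data, index, tag, level):
    """Return (hit, value, data_memory_reads) for a direct-mapped lookup.

    At level 0 the data memory is read together with the tag check, so it
    is read even on a miss. At lower levels the data memory is read only
    after a tag hit, and a miss reads nothing from it.
    """
    hit = tags[index] == tag
    if level == 0:
        value = data[index]  # read alongside the tag comparison
        return hit, (value if hit else None), 1
    if hit:
        return True, data[index], 1
    return False, None, 0
```

With thresholds of 2 and 4, for example, the model stays at level 0 for up to two running threads, drops to level 1 for three or four, and to level 2 for five or more. The count of data memory reads stands in for the access that the lower performance level avoids on a miss; actual hardware would gate the access rather than count it.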
CNB2004100430627A 2003-02-20 2004-02-20 Synchronous multi-thread processor circuit and operating method Expired - Lifetime CN100394381C (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
KR10759/2003 2003-02-20
KR10759/03 2003-02-20
KR20030010759 2003-02-20
US10/631,601 2003-07-31
US10/631,601 US7152170B2 (en) 2003-02-20 2003-07-31 Simultaneous multi-threading processor circuits and computer program products configured to operate at different performance levels based on a number of operating threads and methods of operating

Publications (2)

Publication Number Publication Date
CN1534463A CN1534463A (en) 2004-10-06
CN100394381C true CN100394381C (en) 2008-06-11

Family

ID=32044744

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004100430627A Expired - Lifetime CN100394381C (en) 2003-02-20 2004-02-20 Synchronous multi-thread processor circuit and operating method

Country Status (5)

Country Link
JP (1) JP4439288B2 (en)
KR (1) KR100594256B1 (en)
CN (1) CN100394381C (en)
GB (1) GB2398660B (en)
TW (1) TWI261198B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4574493B2 (en) * 2005-08-22 2010-11-04 キヤノン株式会社 Processor system and multi-thread processor
JP4687685B2 (en) * 2007-04-24 2011-05-25 株式会社デンソー Electronic control device for engine control and microcomputer
EP2159700A4 (en) * 2007-06-19 2011-07-20 Fujitsu Ltd CACHE MEMORY CONTROLLER AND CONTROL METHOD
KR101109029B1 (en) 2007-06-20 2012-01-31 후지쯔 가부시끼가이샤 Arithmetic unit
US9529727B2 (en) 2014-05-27 2016-12-27 Qualcomm Incorporated Reconfigurable fetch pipeline
CN105808444B (en) * 2015-01-19 2019-01-01 东芝存储器株式会社 The control method of storage device and nonvolatile memory
WO2018018494A1 (en) * 2016-07-28 2018-02-01 张升泽 Method and system for allocating power based on multi-zone allocation
WO2018018492A1 (en) * 2016-07-28 2018-02-01 张升泽 Method and system of allocating current in plurality of intervals in interior of multi-core chip
CN112631960B (en) * 2021-03-05 2021-06-04 四川科道芯国智能技术股份有限公司 Method for expanding cache memory

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1193144A (en) * 1997-03-11 1998-09-16 国际商业机器公司 Method for monitoring property of multi-line processor and system thereof
US6079025A (en) * 1990-06-01 2000-06-20 Vadem System and method of computer operating mode control for power consumption reduction
US6493741B1 (en) * 1999-10-01 2002-12-10 Compaq Information Technologies Group, L.P. Method and apparatus to quiesce a portion of a simultaneous multithreaded central processing unit

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5218704A (en) * 1989-10-30 1993-06-08 Texas Instruments Real-time power conservation for portable computers
JP3100241B2 (en) * 1992-10-09 2000-10-16 ダイヤセミコンシステムズ株式会社 Microprocessor drive controller
JP3461535B2 (en) * 1993-06-30 2003-10-27 株式会社日立国際電気 Wireless terminal device and control method therefor
US5630142A (en) * 1994-09-07 1997-05-13 International Business Machines Corporation Multifunction power switch and feedback led for suspend systems
US6073159A (en) 1996-12-31 2000-06-06 Compaq Computer Corporation Thread properties attribute vector based thread selection in multithreading processor
US6272616B1 (en) * 1998-06-17 2001-08-07 Agere Systems Guardian Corp. Method and apparatus for executing multiple instruction streams in a digital processor with multiple data paths
US7051329B1 (en) * 1999-12-28 2006-05-23 Intel Corporation Method and apparatus for managing resources in a multithreaded processor
US7487505B2 (en) * 2001-08-27 2009-02-03 Intel Corporation Multithreaded microprocessor with register allocation based on number of active threads
US6711447B1 (en) * 2003-01-22 2004-03-23 Intel Corporation Modulating CPU frequency and voltage in a multi-core CPU architecture

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Converting Thread-Level Parallelism to Instruction-Level Parallelism via Simultaneous Multithreading. Jack L. Lo, Susan J. Eggers, et al. ACM Transactions on Computer Systems, Vol. 15, No. 3. 1997
Converting Thread-Level Parallelism to Instruction-Level Parallelism via Simultaneous Multithreading. Jack L. Lo, Susan J. Eggers, et al. ACM Transactions on Computer Systems, Vol. 15, No. 3. 1997 *
Simultaneous Multithreading: Maximizing On-Chip Parallelism. Tullsen et al. Proceedings of the 22nd Annual International Symposium on Computer Architecture. 1995
Simultaneous Multithreading: Maximizing On-Chip Parallelism. Tullsen et al. Proceedings of the 22nd Annual International Symposium on Computer Architecture. 1995 *

Also Published As

Publication number Publication date
JP4439288B2 (en) 2010-03-24
JP2004252987A (en) 2004-09-09
KR100594256B1 (en) 2006-06-30
KR20040075287A (en) 2004-08-27
GB2398660B (en) 2005-09-07
CN1534463A (en) 2004-10-06
TW200421180A (en) 2004-10-16
GB2398660A (en) 2004-08-25
TWI261198B (en) 2006-09-01
GB0403738D0 (en) 2004-03-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CX01 Expiry of patent term

Granted publication date: 20080611