KR101603711B1

KR101603711B1 - System and Method for Allocating Job for Operating GPU

Info

Publication number: KR101603711B1
Application number: KR1020140049973A
Authority: KR
Inventors: 황태호; 김동순; 김선욱
Original assignee: 전자부품연구원
Priority date: 2014-04-25
Filing date: 2014-04-25
Publication date: 2016-03-15
Anticipated expiration: 2034-04-25
Also published as: KR20150123519A

Abstract

The present invention provides a task allocation system and method for the operation of a graphics processing unit (GPU). In allocating tasks to the GPU and having the GPU process them, the system adjusts the number of operating GPU cores to an optimized value based on the memory response times received from the GPU. By having an optimized number of cores operate in the GPU, the invention reduces the processing delay caused by memory bottlenecks while maintaining the GPU's processing speed.

Description

Task allocation system and method for the operation of a graphics processing unit

The present invention relates to a system and method for controlling the operation of a graphics processing unit (GPU), and more specifically to a task allocation system and method for the operation of a GPU that efficiently allocates tasks requested by a central processing unit (CPU) to the GPU.

A graphics processing unit (GPU) is a semiconductor core chip or device dedicated to graphics computation in a computer. In NVIDIA's Fermi GPU architecture, currently the most widely used, as many tasks as possible are assigned to every GPU core in order to make maximum use of GPU resources. That is, conventional GPU architectures seek to maximize the GPU's thread-level parallelism (TLP) by assigning the maximum number of tasks to all cores.

This task allocation policy of conventional GPU architectures makes maximum use of all cores. It works well for GPU programs that access memory infrequently, but it performs poorly for programs that access memory frequently, because of memory bottlenecks.

That is, when a memory-intensive GPU program runs, all cores issue a large number of memory requests at the same time, which creates a memory bottleneck. Because of the bottleneck, a GPU core cannot proceed with the program until the requested data arrives, and these stalls grow as the number of GPU cores increases. This ultimately degrades the performance of each GPU core significantly and prevents the cores from being used efficiently, which increases overall power consumption.

To solve the problem described above, an object of the present invention is to provide a task allocation system and method for the operation of a GPU that, for programs with frequent memory accesses, calculates and applies an optimized number of GPU cores based on the memory response time (memory latency), thereby reducing memory bottlenecks and improving the performance of individual GPU cores.

The present invention provides a task allocation system for the operation of a GPU, comprising: a GPU that includes a plurality of cores and processes tasks requested by a CPU; and a task manager that assigns the tasks requested by the CPU to the cores of the GPU, receives memory response time information from the GPU at regular intervals, and designates a target number of cores for the GPU based on the received memory response time information.

When the number of received memory response time values reaches or exceeds a preset number, the task manager calculates the average of the received memory response times and designates the target number of cores for the GPU based on the calculated average.

The task manager decreases the target number of cores for the GPU if the calculated average is greater than a first threshold, and increases it if the calculated average is less than a second threshold.

If the designated target number of cores is greater than the number of operating cores of the GPU, the task manager assigns tasks to as many unassigned cores as the difference between the target number of cores and the number of operating cores.

If the designated target number of cores is less than the number of operating cores of the GPU, the task manager excludes from task allocation as many of the operating cores as the difference between the target number of cores and the number of operating cores.

According to another aspect of the present invention, there is provided a task allocation method for the operation of a GPU, comprising: assigning tasks to a GPU that includes a plurality of cores; receiving information on the memory response time from the GPU at regular intervals; designating a target number of cores for the GPU based on the received memory response time information; and controlling task allocation so that tasks are assigned to a number of cores equal to the designated target number of cores.

The present invention uses only an optimized number of GPU cores, which reduces memory bottlenecks and improves the performance of individual GPU cores. With the improved per-core performance, it maintains the same execution speed as the prior art while using fewer GPU cores, and by leaving unneeded GPU cores unused it reduces power consumption compared with the prior art.

FIG. 1 shows the overall configuration of a task allocation system for the operation of a GPU according to an embodiment of the present invention.
FIG. 2 shows the interface between the task manager and the GPU in the task allocation system according to an embodiment of the present invention.
FIGS. 3 and 4 show the process of a task allocation method for the operation of a GPU according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. The invention is not limited to the embodiments disclosed below, however, and may be implemented in various different forms. These embodiments are provided only to make the disclosure complete and to fully convey the scope of the invention to those of ordinary skill in the art to which it pertains; the invention is defined by the claims.

The terminology used in this specification is for describing the embodiments and is not intended to limit the invention. In this specification, the singular form includes the plural unless specifically stated otherwise. As used herein, "comprises" and/or "comprising" do not exclude the presence or addition of one or more components, steps, operations, and/or elements other than those mentioned. Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 shows the overall configuration of a task allocation system for the operation of a GPU according to an embodiment of the present invention.

As shown in FIG. 1, the task allocation system according to an embodiment of the present invention includes a central processing unit (100), a graphics processing unit (300), and DRAM (400), together with a task manager (200) placed between the CPU (100) and the GPU (300) that controls task execution as a whole.

The task manager (200) is delegated task processing by the CPU (100), assigns the delegated tasks to the cores of the GPU (300), and controls the operation of the GPU (300). The task manager (200) includes a function for managing the computing resources of the GPU (300) and uses it to adjust the number of GPU (300) cores as proposed by the present invention.

FIG. 2 shows the interface between the task manager (200) and the GPU (300) in the task allocation system according to an embodiment of the present invention.

The task manager (200) passes the task information delegated by the CPU (100) to the GPU (300) and assigns tasks to each core of the GPU (300). It then passes the parameters of the assigned tasks to the GPU (300).

The GPU (300) processes the tasks assigned by the task manager (200) and, when an assigned task is complete, sends a task completion signal to the task manager (200). While performing the assigned tasks, the GPU (300) also reports memory response time (memory latency) information to the task manager (200) at regular intervals.

The task manager (200) stores a fixed number of the memory response times received from each core of the GPU (300) and calculates the average of the stored values (AML, average memory latency). Based on the calculated AML, the task manager (200) adjusts the number of operating cores so that an optimized number of GPU (300) cores operate. The process by which the task manager (200) adjusts the number of operating cores is described in detail with reference to FIGS. 3 and 4.
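The store-then-average step described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the class name `LatencyWindow`, the `window_size` parameter (standing in for the "fixed number" of samples), and the choice to clear the buffer after each average are all assumptions that the specification does not state.

```python
from typing import List, Optional


class LatencyWindow:
    """Collects memory response times and yields their average (AML).

    Hypothetical helper: the name and the reset-after-average policy
    are assumptions, not taken from the patent text.
    """

    def __init__(self, window_size: int) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size  # the preset number of samples
        self._samples: List[float] = []

    def add(self, latency: float) -> Optional[float]:
        """Store one sample; return the AML once enough samples exist."""
        self._samples.append(latency)
        if len(self._samples) < self.window_size:
            return None  # not enough samples yet, no AML available
        aml = sum(self._samples) / len(self._samples)
        self._samples.clear()  # assumption: start a fresh window
        return aml
```

A caller would feed each reported latency to `add` and act only when it returns a value rather than `None`.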

FIGS. 3 and 4 show the process of a task allocation method for the operation of a GPU according to an embodiment of the present invention. FIG. 3 shows how the task manager designates the optimized target number of cores, and FIG. 4 shows how task allocation to the GPU is controlled according to the optimized target number of cores and the current number of operating cores.

As shown in FIG. 3, the task manager of the task allocation system according to an embodiment of the present invention assigns the tasks delegated by the CPU to the cores of the GPU and receives the memory response time of each core from the GPU at regular intervals.

Using the memory response times received from each core, the task manager calculates the average memory response time (AML) across the whole GPU (S300). The task manager may calculate the AML once the number of memory response times received from the GPU reaches or exceeds a certain number.

The task manager compares the calculated AML with preset response-time thresholds and adjusts the target number of cores according to the result, so that an optimized number of cores operate.

Specifically, the task manager compares the AML with a first threshold, which is the maximum response-time threshold (S320), and decreases the target number of cores if the AML is greater than the first threshold (S340).

If the AML is not greater than the first threshold, the task manager compares the AML with a second threshold, which is the minimum response-time threshold (S360), and increases the target number of cores if the AML is less than the second threshold (S380).

If the AML is neither greater than the first threshold nor less than the second threshold, the target number of cores is kept at its current value.

In other words, the present invention compares the AML with preset response-time thresholds and adjusts the target number of cores according to the result, so that an optimized number of GPU cores operate. The first and second thresholds used to determine the target number of cores may be obtained experimentally or set arbitrarily by the user.
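The two-threshold decision of FIG. 3 (steps S320 to S380) can be illustrated with a short Python sketch. The function name, the step size of one core per adjustment, and the clamping of the result to the range from `min_cores` to `total_cores` are assumptions made for the example; the specification only says that the target is decreased or increased.

```python
def next_target_cores(
    aml: float,
    target: int,
    upper: float,          # first threshold (maximum response time)
    lower: float,          # second threshold (minimum response time)
    total_cores: int,
    step: int = 1,         # assumption: adjust by one core at a time
    min_cores: int = 1,    # assumption: never drop below one core
) -> int:
    """Return the new target core count for one AML measurement."""
    if aml > upper:
        # S320 -> S340: latency too high, so use fewer cores
        return max(min_cores, target - step)
    if aml < lower:
        # S360 -> S380: latency low, so more cores can be used
        return min(total_cores, target + step)
    # AML lies between the thresholds: keep the current target
    return target
```

The band between `lower` and `upper` acts as a dead zone, so an AML that hovers between the two thresholds leaves the target unchanged instead of oscillating.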

FIG. 4 shows the process of assigning tasks to the cores of the GPU according to the optimized target number of cores.

Once the target number of cores has been designated, the task manager compares it with the number of GPU cores currently operating (S400).

If the target number of cores is greater than the number of currently operating cores (S410), the task manager assigns tasks to cores that have no task assigned (S420), so that the number of operating cores equals or approaches the optimized target number of cores.
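A minimal Python sketch of the S410/S420 branch is given below, assuming the task manager tracks operating and idle cores as lists of core IDs. The function name and this bookkeeping are hypothetical. When fewer idle cores exist than the shortfall, the sketch returns what is available, which corresponds to the operating count only approaching the target.

```python
from typing import List


def cores_to_activate(target: int, active: List[int], idle: List[int]) -> List[int]:
    """Pick idle cores to receive tasks so the active count nears the target."""
    shortfall = target - len(active)
    if shortfall <= 0:
        return []  # target not greater than the operating count
    # Take as many idle cores as the shortfall, or all of them if fewer exist
    return idle[:shortfall]
```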

If the target number of cores is less than the number of currently operating cores (S430), the task manager selects, from the operating cores, the core with the least work assigned and excludes it from task allocation (S440). No further tasks are assigned to that core, so it completes only its currently assigned tasks and then stops operating.

Alternatively, as many cores as the difference between the target number of cores and the number of operating cores may be selected in ascending order of assigned work and excluded from task allocation.
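The selection of cores to exclude can be sketched in Python as shown below. The `pending` dictionary (operating core ID mapped to its number of assigned tasks) and the tie-break by core ID are assumptions introduced for the example; the specification only fixes the ordering by assigned work.

```python
from typing import Dict, List


def cores_to_exclude(target: int, pending: Dict[int, int]) -> List[int]:
    """Pick operating cores to drop from task allocation.

    pending maps each operating core ID to its count of assigned tasks.
    """
    surplus = len(pending) - target
    if surplus <= 0:
        return []  # nothing to exclude
    # Least-loaded cores first; ties broken by core ID (assumption)
    ranked = sorted(pending, key=lambda core: (pending[core], core))
    return ranked[:surplus]
```

Choosing the least-loaded cores means the excluded cores drain their remaining work soonest and can stop operating earliest.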

If the number of currently operating cores equals the target number of cores, the task manager designates the operating core with the least work assigned (S450). If a task can be assigned to the designated core immediately (S460), it is assigned right away. If the designated core already has too many tasks assigned and cannot accept more (S460), the task manager waits until it receives a task completion signal from a GPU core (S480) and assigns the task when the signal arrives.
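The steady-state branch (S450 to S480) can be sketched in Python as follows. The per-core `capacity` limit, the `pending` dictionary, and both function names are hypothetical; the specification says only that a core may be unable to accept further tasks and that the task manager then waits for a completion signal.

```python
from typing import Dict, Optional


def pick_core_for_task(pending: Dict[int, int], capacity: int) -> Optional[int]:
    """Return the least-loaded operating core, or None if it is full.

    pending maps operating core ID -> assigned task count;
    capacity is an assumed per-core limit on assigned tasks.
    """
    if not pending:
        return None
    core = min(pending, key=lambda c: (pending[c], c))  # S450
    if pending[core] < capacity:                        # S460
        return core
    return None  # caller waits for a completion signal (S480)


def on_task_complete(pending: Dict[int, int], core: int) -> None:
    """Record a completion signal from a core, freeing one slot."""
    if pending.get(core, 0) > 0:
        pending[core] -= 1
```

In use, a `None` result means the new task stays queued until `on_task_complete` has been called for some core, after which `pick_core_for_task` is tried again.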

The foregoing description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the invention pertains may make various modifications and variations without departing from its essential characteristics. The embodiments presented here are therefore intended to explain the technical idea of the invention, not to limit it, and the scope of the invention is not limited by these embodiments. The scope of protection of the invention should be interpreted according to the claims below, and all technical ideas within a scope equivalent to them should be interpreted as falling within the scope of the invention.

Claims

1. A task allocation system for the operation of a graphics processing unit, comprising:
a graphics processing unit that includes a plurality of cores and processes tasks requested by a central processing unit; and
a task manager that assigns the tasks requested by the central processing unit to the cores included in the graphics processing unit, receives memory response time information from the graphics processing unit at regular intervals, and designates a target number of cores for the graphics processing unit based on the received memory response time information,
wherein, if the designated target number of cores is greater than the number of operating cores of the graphics processing unit, the task manager assigns tasks to as many unassigned cores as the difference between the target number of cores and the number of operating cores.

2. The system of claim 1, wherein the task manager calculates the average of the received memory response times when the number of received memory response time values is equal to or greater than a preset number, and designates the target number of cores for the graphics processing unit based on the calculated average.

3. The system of claim 2, wherein the task manager decreases the target number of cores for the graphics processing unit if the calculated average is greater than a first threshold, and increases the target number of cores for the graphics processing unit if the calculated average is less than a second threshold.

4. (Deleted)

5. The system of claim 1, wherein, if the designated target number of cores is less than the number of operating cores of the graphics processing unit, the task manager excludes from task allocation as many of the operating cores as the difference between the target number of cores and the number of operating cores.

6. A task allocation method performed by a task manager for the operation of a graphics processing unit, the method comprising:
assigning tasks to a graphics processing unit that includes a plurality of cores;
receiving information on the memory response time from the graphics processing unit at regular intervals;
designating a target number of cores for the graphics processing unit based on the received memory response time information; and
controlling task allocation so that tasks are assigned to a number of cores equal to the designated target number of cores,
wherein, in the controlling step, if the designated target number of cores is greater than the number of operating cores, tasks are assigned to cores that have no task assigned.

7. The method of claim 6, wherein designating the target number of cores for the graphics processing unit based on the received memory response time information comprises:
calculating the average of the received memory response times; and
comparing the average with a preset threshold and designating the target number of cores based on the comparison result.

8. (Deleted)

9. The method of claim 6, wherein, in the step of controlling task allocation so that tasks are assigned to a number of cores equal to the designated target number of cores, if the designated target number of cores is less than the number of operating cores, cores are excluded from task allocation in ascending order of assigned work, starting with the core that has the least work assigned.

10. The method of claim 6, wherein, in the step of controlling task allocation so that tasks are assigned to a number of cores equal to the designated target number of cores, if the designated target number of cores is equal to the number of operating cores, the task is assigned to the core with the least work assigned or to the core that sent a task completion signal.