KR102179399B1

KR102179399B1 - Smart i/o stream detection based on multiple attributes

Info

Publication number: KR102179399B1
Application number: KR1020170040850A
Authority: KR
Inventors: 잔키 에스. 비마니; 징페이 양; 최창호; 지안지안 후오
Original assignee: 삼성전자주식회사
Priority date: 2016-05-02
Filing date: 2017-03-30
Publication date: 2020-11-16
Anticipated expiration: 2037-03-30
Also published as: KR20170124444A

Abstract

멀티-스트리밍 메모리 시스템은, 메모리, 및 상기 메모리와 연결되는 프로세서를 포함하고, 소프트웨어 컴포넌트를 실행하는 상기 프로세서는, 로직 블록 어드레스들과 각각 관련되고, 각각이 데이터 기입들의 복수의 스트림들 각각에 대응하는 다수의 속성들을 식별하고, 상기 스트림들 각각에 대응하는 상기 속성들 각각에 대한 중요도 계수를 평가하고, 그리고 상기 로직 블록 어드레스들 각각 및 할당된 스트림에 대한 모든 중요도 계수들에 기초하여 상기 로직 블록 어드레스들 각각에 스트림 ID를 할당하여 2개 이상의 상기 로직 블록 어드레스들을 클러스터링한다.The multi-streaming memory system includes a memory and a processor connected to the memory, the processor executing a software component, each associated with a logic block address, each corresponding to each of a plurality of streams of data writes Identify a plurality of attributes, evaluate the importance coefficient for each of the attributes corresponding to each of the streams, and based on each of the logic block addresses and all the importance coefficients for the assigned stream, the logic block Two or more of the logical block addresses are clustered by assigning a stream ID to each of the addresses.

Description

Smart input/output stream detection based on various properties {SMART I/O STREAM DETECTION BASED ON MULTIPLE ATTRIBUTES}

본 발명은 멀티-스트리밍 플래시 드라이브들에서 스트림 할당을 개선하는 방법들 및 매커니즘들에 관한 것이다.The present invention relates to methods and mechanisms for improving stream allocation in multi-streaming flash drives.

솔리드 스테이트 드라이브/솔리드 스테이트 디스크(solid-state drive/solid-state disk, SSD)는 집적 회로(IC) 어셈블리들을 메모리로서 사용하여 데이터를 지속적으로 저장하는 고체 상태 저장 장치이다. SSD 기술은 일반적으로 기존의 블록 입/출력(I/O) 하드 디스크 드라이브들 (HDDs)와 호환되는 전자 인터페이스들을 사용하므로 많은 일반적인 애플리케이션들에서 쉬운 교체를 제공한다.A solid-state drive/solid-state disk (SSD) is a solid-state storage device that continuously stores data using integrated circuit (IC) assemblies as memory. SSD technology typically uses electronic interfaces that are compatible with existing block input/output (I/O) hard disk drives (HDDs), providing easy replacement for many common applications.

쓰기 증폭(write amplification)은 솔리드 스테이트 드라이브들(SSDs)에 사용되는 NAND 플래시 메모리와 같은 몇몇 형태의 비 휘발성 메모리에 대응하는 문제를 기술한다. 쓰기 증폭은 SSD에서 비 휘발성 메모리에 대해 맡겨진(committed) 쓰기들 횟수와 호스트 컴퓨팅 플랫폼으로부터 오는 쓰기들 횟수의 비율로 설명될 수 있다. 쓰기 증폭은 SSD에 랜덤 기입들(random writes)에 대한 문제를 일으킬 수 있다. 예를 들어, 높은 쓰기 증폭은 SSD에 대한 기입 성능을 감소시킬 수 있고, 비 휘발성 메모리 셀들의 마모를 증가시킬 수 있으며, 그에 따라 메모리 장치의 내구성을 저하시킨다.Write amplification describes a problem corresponding to some types of nonvolatile memory, such as NAND flash memory used in solid state drives (SSDs). Write amplification can be described as the ratio of the number of writes committed to the nonvolatile memory in the SSD and the number of writes coming from the host computing platform. Write amplification can cause problems for random writes to the SSD. For example, high write amplification may reduce write performance to an SSD, increase wear of nonvolatile memory cells, and thus degrade durability of a memory device.

"멀티-스트림(multi-stream) SSD"로 불리는 개념은 상이한 속성들을 갖는 데이터를 개별적으로 저장하는 인터페이스들을 갖는 애플리케이션들 및 오퍼레이팅 시스템들을 제공한다. 이러한 개별 데이터 저장소들을 "스트림들(streams)"이라고 한다. 스트림들은 서로 다른 데이터 기입들이 언제 서로 연관되거나 비슷한 수명을 갖는지를 나타내는데 사용될 수 있다. 즉, 개별 데이터 쓰기들의 그룹은 집합 스트림의 일부일 수 있으며, 그리고 각 스트림은 오퍼레이팅 시스템 또는 대응하는 애플리케이션에 의해서 할당된 스트림 ID로 식별된다. 따라서, 유사한 특성들 또는 속성들을 갖는 상이한 데이터는 고유한 스트림 ID로 할당되어 그 스트림 ID에 대응하는 데이터가 SSD 내 동일한 블록에 기입될 수 있도록 한다.The concept called "multi-stream SSD" provides operating systems and applications with interfaces that individually store data with different attributes. These individual data stores are referred to as "streams." Streams can be used to indicate when different data writes are associated with each other or have a similar lifetime. That is, a group of individual data writes may be part of an aggregate stream, and each stream is identified by a stream ID assigned by the operating system or the corresponding application. Thus, different data having similar characteristics or attributes are assigned a unique stream ID so that data corresponding to that stream ID can be written to the same block in the SSD.

즉, 멀티-스트리밍 플래시 드라이브들은 서로 관련있는 기입 동작들을 SSDs에 함께 배치함으로써 더 많은 유연성을 가능하게 하며, 이에 따라 쓰기 증폭을 감소 및 SSD들의 성능을 향상 모두를 시킨다. 예를 들어, 멀티-스트리밍 플래시 드라이브들에서 효율적인 스트림 할당은 쓰기 증폭을 감소시키고 SSD들의 수명과 내구성 모두를 향상시킬 수 있다.That is, multi-streaming flash drives enable more flexibility by placing related write operations together on SSDs, thereby reducing write amplification and improving performance of SSDs. For example, efficient stream allocation in multi-streaming flash drives can reduce write amplification and improve both lifespan and durability of SSDs.

현재, 스트림 할당은 데이터 기입들을 수행하는 애플리케이션을 수정함으로써 애플리케이션 계층에서 일반적으로 발생하며, 이는 결과적으로 높은 유지 관리 오버헤드(overhead)로 인해 상이한 애플리케이션들의 다수의 인스턴스들(instances)을 갖는 시스템들에 적합하지 않을 수 있다. 또한 다수의 애플리케이션들이 백엔드(backend)에서 다수의 저장 장치들에 의해 지원되는 경우, 관련 오버헤드가 더 증가한다.Currently, stream allocation generally occurs at the application layer by modifying the application performing data writes, which in turn results in systems with multiple instances of different applications due to high maintenance overhead. May not be suitable. Also, when multiple applications are supported by multiple storage devices at the backend, the associated overhead further increases.

또한, 현재의 멀티-스트리밍 시스템들의 대부분에서, 자동 스트림 검출을 위해 단지 소수의 속성들(즉, 특정 엔티티(entity)의 액세스의 빈도 및 시간 지역성(temporal locality) 속성들)이 사용되므로, 스트림들을 검출하기 위한 분석 방법이 제한된다.Also, in most of the current multi-streaming systems, only a few attributes (i.e., frequency and temporal locality attributes of access of a particular entity) are used for automatic stream detection, so streams are The method of analysis to detect is limited.

따라서 본 발명은 멀티-스트리밍 플래시 드라이브들을 제공하며, 특히 쓰기 증폭을 감소시키고, 솔리드 스테이트 드라이브들(SSDs)의 성능을 향상시키는 것이다.Accordingly, the present invention provides multi-streaming flash drives, in particular to reduce write amplification, and to improve the performance of solid state drives (SSDs).

본 발명의 실시예에 따르면, 멀티-스트리밍 메모리 시스템은, 메모리, 및 상기 메모리와 연결되는 프로세서를 포함하고, 소프트웨어 컴포넌트를 실행하는 상기 프로세서는, 로직 블록 어드레스들과 각각 관련되고, 각각이 데이터 기입들의 복수의 스트림들 각각에 대응하는 다수의 속성들을 식별하고, 상기 스트림들 각각에 대한 상기 속성들 각각을 위한 중요도 계수를 평가하고, 그리고 상기 로직 블록 어드레스들 각각 및 상기 할당된 스트림에 대한 모든 중요도 계수들에 기초하여 상기 로직 블록 어드레스들 각각에 스트림 ID를 할당하여 2개 이상의 로직 블록 어드레스들을 클러스터링한다.According to an embodiment of the present invention, a multi-streaming memory system includes a memory and a processor connected to the memory, and the processor executing a software component is associated with each of the logic block addresses, each writing data Identifying a plurality of attributes corresponding to each of a plurality of streams of, evaluating an importance coefficient for each of the attributes for each of the streams, and all importance for each of the logical block addresses and the assigned stream Two or more logical block addresses are clustered by assigning a stream ID to each of the logical block addresses based on coefficients.

상기 소프트웨어 컴포넌트는 1개 이상의 속성들에 가중 계수를 할당하는 것을 더 포함하고, 상기 소프트웨어 컴포넌트는 상기 할당된 가중 계수에 근거해서 다른 중요도 계수들보다 상기 다수의 속성들에 대응하는 상기 중요도 계수들에 우선권을 부여하여 상기 스트림 ID를 상기 로직 블록 어드레스들에 할당한다.The software component further comprises assigning a weighting factor to one or more attributes, wherein the software component is based on the assigned weighting factor to the importance coefficients corresponding to the plurality of attributes rather than other importance coefficients. Priority is given and the stream ID is assigned to the logical block addresses.

상기 소프트웨어 컴포넌트는 nXm 어드레스들을 포함하는 nXm 특징 매트릭스를 생성하여 상기 로직 블록 어드레스들 각각에 대한 상기 속성들 각각에 대한 상기 중요도 계수를 평가하며, 여기서 n은 상기 속성들의 전체 개수이고, m은 상기 로직 블록 어드레스들의 전체 개수이다.The software component generates an nXm feature matrix containing nXm addresses to evaluate the importance factor for each of the attributes for each of the logical block addresses, where n is the total number of attributes, and m is the logic block address. This is the total number of block addresses.

상기 특징 매트릭스의 어드레스들 각각은 상기 중요도 계수에 대응하고, 상기 로직 블록 어드레스들 중 하나에 대응하는 속성이 중요한 지를 나타내는 단일-비트 이진 값을 포함한다.Each of the addresses of the feature matrix corresponds to the importance factor and contains a single-bit binary value indicating whether the attribute corresponding to one of the logical block addresses is significant.

상기 소프트웨어 컴포넌트는 상기 속성에 대응하는 임계값이 충족되는 지의 여부에 근거해서 상기 단일-비트 이진 값의 값을 결정하는 것을 더 포함한다.The software component further includes determining a value of the single-bit binary value based on whether a threshold value corresponding to the attribute is satisfied.

상기 소프트웨어 컴포넌트는 상기 특징 매트릭스의 대응하는 어드레스들에서 일치하는 이진 값들을 갖는 상기 로직 블록 어드레스들 중 하나에 동일한 스트림 ID를 할당한다.The software component assigns the same stream ID to one of the logical block addresses having binary values that match at corresponding addresses in the feature matrix.

상기 소프트웨어 컴포넌트는, 다중 타임스탬프 윈도우들 각각 동안, 상기 스트림들 중 어느 스트림 그룹이 동시 데이터 기입 요청들 발생에 대응하는 지를 판별하여 상기 로직 블록 어드레스들의 각각에 대한 속성들 중 하나로서 일치성 속성에 대한 중요도를 평가한다.The software component determines, during each of the multiple timestamp windows, which stream group of the streams corresponds to the occurrence of concurrent data write requests, and determines the consistency attribute as one of the attributes for each of the logical block addresses. Evaluate the importance of

상기 소프트웨어 컴포넌트가 상기 일치성 속성에 대한 중요도 계수를 평가하는 것은, 상기 타임스탬프 윈도우들 각각에 대한 큐 내의 엘리먼트 내 상기 상기 스트림 그룹의 상기 스트림들 각각의 상기 로직 블록 어드레스들을 나타내고, 큐가 채워지는 경우, 각각의 엘리먼트를 다른 모든 엘리먼트들과 비교하고, 그리고 복수의 엘리먼트들 각각에서 각각 표현되는 상기 스트림 그룹의 상기 스트림들의 상기 로직 블록 어드레스들을 식별하는 것을 포함한다.The software component evaluating the importance factor for the correspondence attribute indicates the logical block addresses of each of the streams of the stream group in an element in the queue for each of the timestamp windows, and the queue is filled. Case, comparing each element with all other elements, and identifying the logical block addresses of the streams of the stream group, respectively represented in each of the plurality of elements.

상기 소프트웨어 컴포넌트는 상기 로직 블록 어드레스 요청들 각각에 대한 상기 속성들 각각에 대한 상기 중요도 계수에 근거하여 다음 로직 블록 어드레스 요청들에 스트림 ID를 할당하기 위한 사전을 생성하는 것을 더 포함하고, 그리고 상기 메모리는 상기 사전을 저장한다.The software component further comprising generating a dictionary for allocating a stream ID to the next logical block address requests based on the importance factor for each of the attributes for each of the logical block address requests, and the memory Stores the dictionary.

본 발명의 다른 실시예에 따르면, 멀티-스트리밍 메모리 장치에서 데이터의 입/출력 스트림에 대응하는 로직 블록 어드레스에 각각 관련된 속성들을 식별하는 방법은, 상기 메모리 장치에 대응하는 입/출력 스트림을 검출하는 단계, 상기 검출 된 입/출력 스트림에 대응하는 상기 로직 블록 어드레스에 대응하는 복수의 속성들을 획득하는 단계, 상기 로직 블록 어드레스에 대한 속성들 각각의 중요도 계수를 나타내는 특징 매트릭스를 생성하는 단계, 및 상기 특징 매트릭스에 근거해서 수신된 데이터를 상이한 스트림들에 클러스터링하는 단계를 포함한다.According to another embodiment of the present invention, a method of identifying attributes respectively related to a logic block address corresponding to an input/output stream of data in a multi-streaming memory device includes: detecting an input/output stream corresponding to the memory device. Obtaining a plurality of attributes corresponding to the logic block address corresponding to the detected input/output stream, generating a feature matrix indicating importance coefficients of each of the attributes for the logic block address, and the Clustering the received data into different streams based on the feature matrix.

상기 방법은, 상기 수신된 데이터를 상기 상이한 스트림들에 클러스터링하는 양상들을 평가하기 위해 상기 특징 매트릭스에 근거해서 분석 모델을 생성하는 단계를 더 포함한다.The method further includes generating an analysis model based on the feature matrix to evaluate aspects of clustering the received data into the different streams.

상기 방법은, 각각의 속성의 중요도 레벨에 근거해서 획득된 속성들을 다르게 가중하여 분석 모델에 대한 상대 가중 계수를 도입하는 단계를 더 포함한다.The method further includes the step of introducing a relative weighting factor for the analysis model by weighting the acquired attributes differently based on the importance level of each attribute.

상기 방법은, 룩업(lookup)을 수행함으로써 대응하는 스트림 ID를 그 다음 수신된 스트림에 할당하기 위해 상기 분석 모델에 기초하여 사전을 생성하는 단계를 더 포함한다.The method further includes generating a dictionary based on the analysis model to assign a corresponding stream ID to the next received stream by performing a lookup.

상이한 스트림들에 상기 수신된 데이터를 클러스터링하는 단계는 대응하는 스트림 ID를 상기 로직 블록 어드레스에 할당하는 단계를 포함한다. Clustering the received data in different streams includes assigning a corresponding stream ID to the logical block address.

상기 복수의 속성들은 빈도, 시간적 지역성, 순차성 또는 일치성을 포함한다.The plurality of attributes include frequency, temporal locality, sequentiality, or consistency.

상기 특징 매트릭스의 각 어드레스는 상기 로직 블록 어드레스에 대응하는 속성에 대응하는 중요 계수를 나타내는 단일-비트를 포함한다.Each address in the feature matrix includes a single-bit representing a significant coefficient corresponding to an attribute corresponding to the logical block address.

상기 수신된 데이터를 상이한 스트림들에 클러스터링하는 단계는 어느 로직 블록 어드레스들이 상기 수신된 데이터에 대응하는 지를 판별하는 단계를 포함한다.Clustering the received data into different streams includes determining which logical block addresses correspond to the received data.

상기 수신된 데이터를 상이한 스트림들에 클러스터링하는 단계는 복수의 타임스탬프 윈도우들 각각동안 어느 로직 블록 어드레스들이 동시에 액세스되는 지를 판별하는 단계를 포함한다.Clustering the received data into different streams includes determining which logical block addresses are simultaneously accessed during each of a plurality of timestamp windows.

본 발명의 다른 실시예에 따르면, 일치성 속성의 중요 계수를 멀티-스트림 메모리 장치 내 데이터의 스트림에 할당하는 방법은, 로직 블록 어드레스들(LBAs)에 각각 대응하는 데이터의 스트림들을 수신하는 단계, 복수의 타임스탬프 윈도우들 각각에 액세스되는 고유한 로직 블록 어드레스들의 수가 임계값보다 작은 지를 판별하는 단계, 타임스탬프 윈도우들에 각각 대응하는 심볼들을 삽입하고, 상기 고유 로직 블록 어드레스들의 수가 상기 임계값보다 작지 않을 때 상기 로직 블록 어드레스들을 큐에 각각 표시하는 단계, 상기 삽입된 심볼들이 상기 큐에 가득 찼는지를 판별하는 단계, 및 상기 큐가 가득 찼을 때 상기 타임스탬프 윈도우들 중 일부들 각각 동안 어느 로직 블록 어드레스들이 동시에 액세스되는 지를 판별하는 단계를 포함한다.According to another embodiment of the present invention, a method of allocating an important coefficient of a correspondence attribute to a stream of data in a multi-stream memory device includes receiving streams of data respectively corresponding to logical block addresses (LBAs), Determining whether the number of unique logic block addresses accessed to each of a plurality of timestamp windows is less than a threshold value, inserting symbols corresponding to each of the timestamp windows, and the number of unique logic block addresses is greater than the threshold. Displaying each of the logical block addresses in a queue when not small, determining whether the inserted symbols are full in the queue, and which logic block during each of some of the timestamp windows when the queue is full And determining whether addresses are being accessed simultaneously.

상기 방법은 상기 타임스탬프 윈도우들 중 일부들 각각 동안 동시에 액세스된 것으로 판별된 로직 블록 어드레스들에 대응하는 상기 스트림들로 동일한 스트림 ID를 할당하는 단계를 더 포함한다.The method further includes allocating the same stream ID to the streams corresponding to logical block addresses determined to be accessed simultaneously during each of some of the timestamp windows.

이와 같은 본 발명에 의하면, 멀티-스트리밍 플래시 드라이브를 제공하되, 특히 쓰기 증폭을 감소시키고, 솔리드 스테이트 드라이브들(SSDs)의 성능을 향상시킬 수 있다.According to the present invention as described above, a multi-streaming flash drive is provided, in particular, it is possible to reduce write amplification and improve performance of solid state drives (SSDs).

본 발명의 특징은 상세한 설명, 청구 범위 및 첨부 도면들을 참조하여 이해되고 이해될 것이다.
도 1은 본 발명의 일 실시예에 따른 다중 속성들을 고려하기 위한 개선된 멀티-스레드 K-평균 클러스터링을 갖는 스마트 스트림 할당을 위한 분석 모델을 나타낸 블록도이다.
도 2는 본 발명의 일 실시예에 따른 신규한 일치성 속성을 설명하기 위한 샘플링 그래프이다.
도 3은 본 발명의 일 실시예에 따른 일치성 속성 할당 플로우차트이다.Features of the invention will be understood and understood with reference to the detailed description, claims and accompanying drawings.
1 is a block diagram showing an analysis model for smart stream allocation with improved multi-threaded K-means clustering to consider multiple attributes according to an embodiment of the present invention.
2 is a sampling graph for explaining a novel consistency attribute according to an embodiment of the present invention.
3 is a flowchart of a matching attribute allocation according to an embodiment of the present invention.

본 발명의 개념 및 그 실시 방법들의 특징은 이하의 실시예 및 첨부 도면들의 상세한 설명을 참조하면 보다 쉽게 이해될 수 있다. 이하, 첨부된 도면을 참조하여 본 발명의 바람직한 실시예에 대하여 상세히 설명한다. 그러나, 본 발명은 다양한 형태로 구체화될 수 있으며, 여기에 도시된 실시예들에만 한정되는 것으로 해석되어서는 안 된다. 오히려, 이들 실시예들은 본 개시가 완전하고 완벽할 것이고, 당업자에게 본 발명의 양태 및 특징을 충분히 전달할 수 있도록 예로서 제공된다. 따라서, 본 발명의 양상들 및 특징들의 완전한 이해를 위해 당업자에게 불필요한 프로세스들, 요소들, 및 기술들은 설명되지 않을 수 있다. 또 다른 언급이 없는 한, 동일한 도면 부호들은 첨부된 도면 및 상세한 설명 전반에 걸쳐 동일한 구성 요소를 지칭하므로, 그에 대한 설명은 반복되지 않을 것이다. 도면에서, 요소, 층 및 영역의 상대적 크기는 명확성을 위해 과장될 수 있다.The concept of the present invention and features of its implementation methods may be more easily understood with reference to the following embodiments and detailed description of the accompanying drawings. Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. However, the present invention may be embodied in various forms and should not be construed as being limited only to the embodiments shown here. Rather, these embodiments are provided by way of example so that this disclosure will be complete and complete, and will fully convey the aspects and features of the invention to those skilled in the art. Accordingly, processes, elements, and techniques unnecessary to those skilled in the art for a thorough understanding of aspects and features of the present invention may not be described. Unless otherwise stated, the same reference numerals refer to the same elements throughout the accompanying drawings and detailed description, and thus description thereof will not be repeated. In the drawings, the relative sizes of elements, layers, and regions may be exaggerated for clarity.

"제1", "제2", "제3" 등의 용어는 본 명세서에서 다양한 엘리먼트들, 구성 요소들, 영역들, 층들 및/또는 섹션들을 설명하기 위해 사용될 수 있지만, 이들 엘리먼트들, 구성 요소들, 영역들, 층들 및/또는 섹션들은 이들 용어들에 의해 제한되어서는 안 된다. 이들 용어는 하나의 엘리먼트, 구성 요소, 영역, 층 및/또는 섹션을 다른 엘리먼트, 구성 요소, 영역, 층 및/또는 섹션과 구별하기 위해 사용된다. 따라서, 이하에서 설명되는 제1 엘리먼트, 구성 요소, 영역, 층 또는 섹션은 본 발명의 사상 및 범위를 벗어나지 않고 제2 엘리먼트, 구성 요소, 영역, 계층 또는 섹션으로 지칭 될 수 있다.Terms such as “first”, “second”, “third” may be used herein to describe various elements, components, regions, layers, and/or sections, but these elements, configurations Elements, regions, layers and/or sections should not be limited by these terms. These terms are used to distinguish one element, component, region, layer and/or section from another element, component, region, layer and/or section. Accordingly, a first element, component, region, layer, or section described below may be referred to as a second element, component, region, layer or section without departing from the spirit and scope of the present invention.

"아래에", "아래쪽의", "하부", "낮은", "위에" "상부" 등과 같은 공간적으로 상대적인 용어들은 설명의 편의를 위해, 도면들에 도시된 방위들에 추가하여, 하나의 요소 또는 특징과 도면에 도시된 다른 요소(들) 또는 특징(들)과의 관계를 설명하기 위해 공간적으로 관련된 용어가 사용될 수 있다. 예를 들어, 도면의 장치가 뒤집힌다면, 다른 엘리먼트들 또는 특징들에 대해 "아래에" 또는 "아래쪽의" 또는 "하부"로 기술된 엘리먼트들은 다른 엘리먼트들 또는 특징들에 대해 "위"로 향하게 될 것이다. 따라서, "아래" 및 "아래쪽의"의 예시적인 용어들은 위와 아래의 방향 모두를 포함할 수 있다. 장치는 다른 방향으로 향할 수 있고 (예를 들어, 90도 또는 다른 방향들로 회전될 수 있음), 본 명세서에서 사용된 공간적으로 상대적인 기술 용어는 그에 따라 해석되어야 한다.Spatially relative terms such as "below", "lower", "lower", "lower", "above" "upper", etc. are, for convenience of description, in addition to the orientations shown in the drawings, one Spatially related terms may be used to describe the relationship between an element or feature and other element(s) or feature(s) shown in the drawings. For example, if the device in the figure is turned upside down, elements described as “below” or “bottom” or “bottom” for other elements or features face “up” for other elements or features. Will be Thus, exemplary terms of “below” and “below” may include both up and down directions. The device may be oriented in different directions (eg, may be rotated in 90 degrees or in other directions), and the spatially relative technical terminology used herein should be interpreted accordingly.

엘리먼트, 층, 영역 또는 구성 요소가 다른 엘리먼트, 층, 영역 또는 구성 요소의 "위에", "연결된" 또는 "결합된" 것으로 언급될 때, 이는 다른 구성 요소, 층, 영역 또는 구성 요소에 직접적으로 접속되거나, 연결되거나 또는 결합될 수 있으며, 하나 이상의 개재된 엘리먼트들, 층들, 영역들 또는 구성 요소들이 존재할 수 있다. 또한, 하나의 엘리먼트 또는 층이 2 개의 엘리먼트들 또는 층들의 "사이에" 있다고 언급될 때, 2 개의 엘리먼트들 또는 층들 사이의 유일한 구성 요소 또는 층, 또는 하나 이상의 개재하는 구성 요소들 또는 층들이 존재할 수도 있다.When an element, layer, region or component is referred to as “on”, “connected” or “coupled” to another element, layer, region or component, it is directly to the other component, layer, region or component. It may be connected, connected or combined, and there may be one or more intervening elements, layers, regions or components. In addition, when one element or layer is referred to as being “between” two elements or layers, there are only elements or layers between the two elements or layers, or one or more intervening elements or layers. May be.

본 명세서에서 사용된 용어는 특정 실시예들 만을 설명하기 위한 것이며 본 발명을 제한하고자 하는 것은 아니다. 본 명세서에서 사용된 단수 형태는 문맥 상 다르게 지시하지 않는 한 복수 형태를 포함하고자 한다. 본 명세서에서 사용되는 "포함하는" 및 "구비하는"이라는 용어는 명시된 특징들, 정수들, 단계들, 동작들, 엘리먼트들 및/또는 엘리먼트의 요소의 존재를 특정하는 것으로 하나 이상의 다른 특징들, 정수들, 단계들, 동작, 요소, 구성 요소 및/또는 이들의 그룹의 존재 또는 추가를 배제하지 않음을 이해할 것이다. 본 명세서에 사용된 바와 같이, "및/또는"이라는 용어는 하나 이상의 관련 열거된 항목의 임의 및 모든 조합을 포함한다. "적어도 하나"와 같은 표현이 엘리먼트들의 목록 앞에 있을 때 엘리먼트들 전체 목록을 수정하고 목록의 개별 엘리먼트들을 수정하지 않는다.The terms used in the present specification are intended to describe only specific embodiments and are not intended to limit the present invention. The singular form used in the present specification is intended to include the plural form unless otherwise indicated by context. As used herein, the terms "comprising" and "comprising" are specified features, integers, steps, actions, elements, and/or the presence of an element of one or more other features, It will be understood that the presence or addition of integers, steps, actions, elements, components and/or groups thereof is not excluded. As used herein, the term “and/or” includes any and all combinations of one or more related listed items. When an expression such as "at least one" precedes a list of elements, it modifies the entire list of elements and does not modify individual elements of the list.

본 명세서에서 사용된 용어 "실질적으로", "약" 및 유사한 용어들은 근사의 용어로서 사용된 것이지 정도(degree)의 용어로서 사용된 것은 아니며, 측정된 값 또는 계산된 값의 고유한 편차를 설명하기 위한 것으로 당업자에 의해 인식된다. 또한, 본 발명의 실시예를 설명할 때 "할 수 있다"를 사용하는 것은 "본 발명의 하나 이상의 실시예들"을 의미한다. 본 명세서에서 사용된 바와 같이, 용어들 "사용하다" 및 "사용되는"은 용어들 "활용" 및 "활용되는"과 동의어로 각각 간주될 수 있다. 또한, "예시적인"이라는 용어는 견본 또는 실례(illustration)를 의미한다.The terms "substantially", "about" and similar terms used herein are used as terms of approximation, not as terms of degree, and describe the inherent deviation of a measured value or a calculated value. It is recognized by one of ordinary skill in the art to do so. Also, the use of "can" when describing an embodiment of the present invention means "one or more embodiments of the present invention." As used herein, the terms “use” and “used” may be considered synonymous with the terms “utilized” and “used”, respectively. Also, the term "exemplary" means a sample or illustration.

특정 실시예가 다르게 구현될 수 있는 경우, 특정한 처리 순서는 기술된 순서와 다르게 수행될 수 있다. 예를 들어, 2 개의 연속적으로 기술된 프로세스들은 실질적으로 동시에 수행되거나 설명된 순서와 반대 순서로 수행될 수 있다.When a specific embodiment may be implemented differently, a specific processing order may be performed differently from the described order. For example, two consecutively described processes may be performed substantially simultaneously or may be performed in an order opposite to the described order.

본 명세서에 기술된 본 발명의 실시예에 따른 전자 또는 전기 장치 및/또는 임의의 다른 관련 장치 또는 구성 요소는 임의의 적합한 하드웨어, 펌웨어(예를 들어, 주문형 집적 회로), 소프트웨어 또는 소프트웨어의 조합을 이용하여 구현될 수 있다. 예를 들어, 이들 장치의 다양한 구성 요소들은 하나의 집적 회로(IC) 칩 상에 또는 개별 IC 칩 상에 형성될 수 있다. 또한, 이들 장치들의 다양한 구성 요소들은 플렉서블 인쇄 회로 필름, 테이프 캐리어 패키지(TCP), 인쇄 회로 기판(PCB) 또는 하나의 기판 상에 구현될 수 있다. 또한, 이들 장치들의 다양한 구성 요소들은 하나 이상의 프로세서들에서 실행되고, 하나 이상의 컴퓨팅 장치들에서 실행되며, 컴퓨터 프로그램 명령들을 실행하고 여기에 설명된 다양한 기능들을 수행하기 위해 다른 시스템 구성 요소들과 상호 작용하는 프로세스 또는 스레드(thread)일 수 있다. 컴퓨터 프로그램 명령들은 예를 들어, 랜덤 액세스 메모리(RAM)와 같은 표준 메모리 장치를 사용하여 컴퓨팅 장치에 구현될 수 있는 메모리에 저장된다. 컴퓨터 프로그램 명령들은 또한 예를 들어, CD-ROM, 플래시 드라이브 등과 같은 다른 비일시적인 컴퓨터 판독 가능 매체에 저장될 수 있다. 또한, 당업자는 본 발명의 예시적인 실시예들의 사상 및 범위를 벗어남 없이, 다양한 컴퓨팅 장치들의 기능이 단일 컴퓨팅 장치에 결합되거나 통합될 수 있거나 또는 특정 컴퓨팅 장치의 기능이 하나 이상의 다른 컴퓨팅 장치들을 통해 분산될 수 있음을 잘 이해할 것이다.The electronic or electrical device and/or any other related device or component according to the embodiments of the invention described herein may use any suitable hardware, firmware (e.g., custom integrated circuit), software, or combination of software. It can be implemented using For example, the various components of these devices may be formed on one integrated circuit (IC) chip or on individual IC chips. In addition, various components of these devices may be implemented on a flexible printed circuit film, a tape carrier package (TCP), a printed circuit board (PCB), or a single substrate. In addition, the various components of these devices run on one or more processors, run on one or more computing devices, execute computer program instructions, and interact with other system components to perform the various functions described herein. It may be a process or a thread. Computer program instructions are stored in memory that can be implemented in a computing device using standard memory devices such as random access memory (RAM), for example. Computer program instructions may also be stored on other non-transitory computer-readable media such as, for example, a CD-ROM, a flash drive, or the like. In addition, those skilled in the art are aware that functions of various computing devices may be combined or integrated into a single computing device, or functions of a specific computing device may be distributed through one or more other computing devices, without departing from the spirit and scope of the exemplary embodiments of the present invention. You will understand well that it can be.

달리 정의되지 않는 한, 본 명세서에서 사용되는 모든 용어들(기술 및 과학 용어들 포함)은 본 발명이 속하는 기술 분야의 당업자에 의해 일반적으로 이해되는 것과 동일한 의미를 갖는다. 또한, 일반적으로 사전에서 사용되는 정의와 같은 용어들은 관련 기술 및/또는 본 명세서와 관련하여 그 의미와 일치하는 의미를 갖는 것으로 해석되어야 하며, 이상적이거나 지나치게 형식적인 의미로 해석되지 않아야 한다.Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In addition, terms such as definitions generally used in the dictionary should be interpreted as having a meaning consistent with the meaning in relation to the related technology and/or the present specification, and should not be interpreted as an ideal or excessively formal meaning.

상술한 바와 같이, 멀티-스트리밍 플래시 드라이브들에서 효율적인 스트림 할당은 쓰기 증폭을 감소시킬 수 있고, SSD들의 수명, 내구성 및 성능을 향상시킬 수 있다. 데이터의 스트림들은 각기 다른 속성으로부터 얻은 이점을 효과적으로 종합함으로써, 그리고 대응하는 스트림 ID를 결정하는데 적합한 속성들을 식별(identifying) 및 획득(capturing)하는 것에 의해 더 효율적으로 할당될 수 있다.As described above, efficient stream allocation in multi-streaming flash drives can reduce write amplification and improve lifespan, durability, and performance of SSDs. Streams of data can be more efficiently allocated by effectively synthesizing the benefits gained from different attributes, and by identifying and capturing attributes suitable for determining the corresponding stream ID.

또한 상술한 바와 같이, 기존의 많은 멀티-스트리밍 시스템들은 자동 스트림 검출의 목적을 위해 특정 엔티티(entity)의 매우 적은 속성들(예를 들어, 액세스들의 빈도 및 시간 지역성)을 고려한다. 현재의 자동 스트림 검출 알고리즘들에 의한 낮은 성능 향상들은 식별된 스트림들의 새로운 속성들이 SSD들의 다중 스트림들에 저장될 데이터의 기대 수명을 반영하는데 적합할 수 있음을 나타낸다.Also, as described above, many existing multi-streaming systems take into account very few attributes of a particular entity (eg, frequency and temporal locality of accesses) for the purpose of automatic stream detection. The low performance improvements made by current automatic stream detection algorithms indicate that the new attributes of the identified streams may be suitable to reflect the life expectancy of data to be stored in multiple streams of SSDs.

따라서, 본 발명의 실시예들은 애플리케이션 계층(예를 들어, 장치 드라이버, 파일 시스템 등) 하에서 I/O 요구들에 대한 스트림 ID 할당을 향상시키고, 그리고 I/O 요청 트래픽을 다중 스트림들로 패킷화하기 위해, 멀티-스트리밍 메모리 장치로 다중 속성들을 획득하고 포함시키는 새로운 모델을 제공한다. 종래의 기술들과 달리, 설명된 실시예들은 스트림 할당을 위해 고려되는 더 많은 수의 속성들을 제공하면서도 다양한 속성들을 결합할 수 있는 확장 가능한 방법을 제공하며, 따라서 설명된 실시예들은 멀티-스트리밍 SSD들의 장점들을 보다 잘 활용할 수 있다. 즉, 설명된 실시예들은 다중 속성들을 고려하면서 개선된 멀티-스레드(multi-threaded) K-평균 클러스터링(K-means clustering)을 갖는 스마트한 스트림 할당을 위한 분석 모델을 개발할 수 있다. 설명된 실시예들은 또한 I/O 트레이스(trace)로부터 다중 속성들을 획득하기 위한 낮은/최소 오버헤드 트레이닝 단계(phase) 기술의 설계를 가능하게 하고, 또한 "일치성(congruency)"으로 불리는 신규 속성을 제공하여, 효율적인 스트림 검출 및 식별에서 일치성의 효과를 설명할 수 있다.Accordingly, embodiments of the present invention improve stream ID allocation for I/O requests under the application layer (eg, device driver, file system, etc.), and packetize I/O request traffic into multiple streams. To do this, we provide a new model for acquiring and including multiple attributes with a multi-streaming memory device. Unlike the conventional techniques, the described embodiments provide a scalable method capable of combining various attributes while providing a larger number of attributes considered for stream allocation, and thus the described embodiments provide a multi-streaming SSD You can better use their strengths. That is, the described embodiments can develop an analysis model for smart stream allocation with improved multi-threaded K-means clustering while considering multiple attributes. The described embodiments also enable the design of a low/minimum overhead training phase technique to obtain multiple attributes from an I/O trace, and also a new attribute called "congruency". Can explain the effect of consistency in efficient stream detection and identification.

도 1과 관련하여 이하 설명되는 실시예들은, 빈도(frequency), 시간 지역성(temporal locality), 순차성(sequentiality) 및 신규 속성인 "일치성(congruency)"등과 같은 다중 속성들을 고려할 수 있는 개선된 자동화 스트림 할당을 제공할 수 있다. 일치성은 도 2 및 도 3을 참조하여 상세히 설명된다.Embodiments described below with respect to FIG. 1 are improved automation that can take into account multiple attributes such as frequency, temporal locality, sequentiality, and a new attribute "congruency". Stream allocation can be provided. The correspondence will be described in detail with reference to FIGS. 2 and 3.

도 1은 본 발명의 일 실시예에 따라, 다중 속성들을 고려하기 위한 개선된 멀티-스레드 K-평균 클러스터링을 이용한 스마트 스트림 할당을 위한 분석 모델을 나타낸 블록도이다. 비록 본 실시예가 장치 드라이버 계층에 배치되는 것으로 설명되고, 비록 참조되는 관심 엔티티(entity)가 들어오는 데이터에 대응하는 로직 블록 어드레스(logical block address, LBA)이지만, 다른 실시예들에서, 관심 엔티티는 스트림 할당을 위해 설명된 실시예들의 배치 계층에 따라 다를 수 있다. 예를 들어, 관심 엔티티는 파일 시스템 계층에 배치되는 실시예들에 대해서는 파일로 대신 될 수 있다.FIG. 1 is a block diagram showing an analysis model for smart stream allocation using improved multi-threaded K-means clustering to consider multiple attributes according to an embodiment of the present invention. Although this embodiment is described as being placed in the device driver layer, and although the referenced entity of interest is a logical block address (LBA) corresponding to the incoming data, in other embodiments, the entity of interest is It may differ depending on the placement layer of the embodiments described for allocation. For example, the entity of interest may be replaced by a file for embodiments arranged in the file system layer.

도 1을 참조하면, 이 실시예의 분석 모델은 2 개의 단계들(phases)을 포함하는 것으로 설명되며, 2 개의 단계들은 트레이닝 단계(120) 및 테스트 단계(130)를 포함한다. 트레이닝 단계(120) 내 단계 S101에서, 애플리케이션 플랫폼은 SSD에 대한 검출된 I/O 스트림들에 대응하는 하나 이상의 적절한 속성들을 획득하기 위해 트레이닝 트레이스(trace, 추적)를 수행할 수 있다. 이러한 획득된 속성들은 예를 들어, 빈도, 시간 지역성, 순차성 및/또는 일치성을 포함할 수 있다. 단계 S101동안 트레이닝 트레이스로부터 획득된 이들 속성들은 이하에서 더 상세히 설명될 것이다.Referring to FIG. 1, the analysis model of this embodiment is described as including two phases, which include a training phase 120 and a testing phase 130. In step S101 in the training step 120, the application platform may perform a training trace to obtain one or more appropriate attributes corresponding to the detected I/O streams for the SSD. These obtained attributes may include, for example, frequency, temporal locality, sequentiality and/or consistency. These attributes obtained from the training trace during step S101 will be described in more detail below.

이후, 단계 S102에서, 획득된 속성들은 특징들(features)을 추출(extract)하기 위해 사용될 수 있으며. 따라서 검출된 I/O 스트림들 각각에 대해 추출된 특징들의 각각을 식별하고, 특징들 각각의 증요성 레벨을 나타내기 위해 사용되는 특징 매트릭스(feature matrix, 140)를 생성할 수 있다.Thereafter, in step S102, the obtained attributes may be used to extract features. Accordingly, it is possible to identify each of the extracted features for each of the detected I/O streams, and generate a feature matrix 140 that is used to indicate the significance level of each of the features.

특징 매트릭스(140)는 각 추출된 속성의 중요도(예를 들어, 중요도 계수(importance factor))가 로직 블록 어드레스(LBA) 또는 로직 블록 어드레스들(LBAs)의 그룹에 대응할 때 속성의 중요도를 효과적으로 나타낼 수 있다. 따라서, 특징 매트릭스(140)는 nXm 엔트리들을 갖는 nXm 매트릭스로 생각할 수 있다. 여기서, n은 스트림 검출 및 식별을 위해 분석 된 속성들의 총 개수이고, m은 로직 블록 어드레스들(LBAs)의 총 개수이다.The feature matrix 140 effectively represents the importance of the attribute when the importance of each extracted attribute (eg, an importance factor) corresponds to a logical block address (LBA) or a group of logical block addresses (LBAs). I can. Thus, the feature matrix 140 can be thought of as an nXm matrix with nXm entries. Here, n is the total number of attributes analyzed for stream detection and identification, and m is the total number of logic block addresses (LBAs).

중요도 계수는 특징 매트릭스(140)의 행들에 저장된 데이터 퀀텀들(data quanta)으로서 리스트될 수 있다. 특징 매트릭스(140)의 각 엔트리는, 그 속성이 특정 로직 블록 어드레스(LBA)에 대해 중요한 것으로 고려되는 지의 여부를 나타내는 단일 이진 비트로 표현될 수 있다. 예를 들어, 특징 매트릭스(140) 내 각 어드레스는 속성이 그 로직 블록 어드레스 (LBA)에 대해 중요함을 것을 나타내는 "1" 또는 속성이 그 로직 블록 어드레스(LBA)에 대해 중요하지 않음을 나타내는 "0"을 포함할 수 있다. 즉, 상대적으로 낮은 오버 헤드를 갖는 다양한 속성들(예를 들어, 빈도, 시간적 지역성, 순차성 및 일치성)을 획득하기 위해, 각 속성은 각 LBA에 대해 단일-이진 비트 값(single-bit binary value)으로 표시될 수 있으나, 각 LBA에 대한 중요도의 수준(예를 들어, 중요도의 변화 정도들)를 나타내기 위해 멀티-비트 값(multiple-bit value)이 사용될 수 있다. LBA 엔트리에 대한 이진 값은, 특징 매트릭스(140)의 속성이 임계값 벡터(threshold vector)의 대응하는 요소보다 작거나 같지 않을 때 0이고, 그렇지 않으면 1이다. 임계값 벡터는 nX1 엔트리들을 갖는 nX1 매트릭스로 생각될 수 있다. 여기서 n은 전체 속성들 수이고, 이 행렬의 각 엔트리는 각 속성의 임계값(threshold)에 대응한다. 개별적인 속성들 각각이 어떻게 획득되는 지는 아래에서 개별적으로 설명된다.The importance factor may be listed as data quanta stored in rows of feature matrix 140. Each entry in feature matrix 140 may be represented by a single binary bit indicating whether its attribute is considered important for a particular logical block address (LBA). For example, each address in feature matrix 140 may be "1" indicating that the attribute is important for its logical block address (LBA) or "1" indicating that the attribute is not important for that logical block address (LBA). May contain 0". That is, in order to obtain various properties (e.g., frequency, temporal locality, sequentiality, and consistency) with relatively low overhead, each property is a single-bit binary value for each LBA. ), but a multi-bit value may be used to indicate the level of importance (eg, degrees of change in importance) for each LBA. The binary value for the LBA entry is 0 when the attribute of the feature matrix 140 is less than or not equal to the corresponding element of the threshold vector, and 1 otherwise. The threshold vector can be thought of as an nX1 matrix with nX1 entries. Where n is the total number of attributes, and each entry in this matrix corresponds to the threshold of each attribute. How each of the individual attributes is obtained is described separately below.

특징 매트릭스(140)는 예를 들어, 상이한 로직 블록 어드레스들(LBAs)에 대응하는 데이터 기입들을 상이한 스트림들에 클러스터링(clustering)하기 위해 사용될 수 있다. 따라서, 개선된 멀티-스레드 K-평균(multi-threaded K-means) 방법은 다른 스트림들로의 데이터 기입들을 클러스터링 또는 그룹화하기 위한 클러스터링 모델로서 사용될 수 있고, K-평균은 반복적인 자율 클러스터링 애플리케이션이다. 획득된 속성들 각각은 데이터 클러스터링의 하나의 특징을 구성할 수 있다. 즉, 획득된 속성들의 특징들에 의해 데이터의 다양한 타입들을 식별하는데 사용되는 획득된 속성들 각각은 특징 매트릭스(140)의 하나의 행 또는 하나의 열에 대응할 수 있다.The feature matrix 140 may be used, for example, to cluster data writes corresponding to different logical block addresses (LBAs) in different streams. Therefore, the improved multi-threaded K-means method can be used as a clustering model for clustering or grouping data writes to different streams, and K-means is an iterative autonomous clustering application. . Each of the acquired attributes may constitute a feature of data clustering. That is, each of the acquired attributes used to identify various types of data by the characteristics of the acquired attributes may correspond to one row or one column of the feature matrix 140.

특징 매트릭스(140)가 생성되면, 단계 S103에서, 상이한 I/O 스트림들을 검출 및 식별하기 위한 분석 모델이 초기화될 수 있다. 초기화 모델은 K-평균 클러스트링을 나타낸다.When the feature matrix 140 is generated, in step S103, an analysis model for detecting and identifying different I/O streams may be initialized. The initialization model represents the K-means clustering.

따라서, 단계 S104에서, 초기화된 모델은 평가를 위해 모델(160)을 생성하도록 트레이닝(training)되고, 그 후 단계 S105에서 모델(160)이 평가(evaluate)된다. 모델(160)은 머신-러닝(machine-learning) 또는 셀프-러닝 알고리즘(self-learning algorithm)을 사용하여 효과적인 자체 형태(effectively shape itself) 또는 자체적인 트레이닝을 할 수 있다. 이는 모델 (160)이, 상이한 LBA들을 할당할 스트림 ID들을 결정함에 있어, 어떤 속성들이 다른 속성들보다 더 중요함을 효과적으로 강조하도록 허용할 수 있다.Accordingly, in step S104, the initialized model is trained to generate a model 160 for evaluation, and then in step S105, the model 160 is evaluated. Model 160 may effectively shape itself or train itself using machine-learning or self-learning algorithms. This may allow the model 160 to effectively emphasize that certain attributes are more important than others in determining the stream IDs to allocate different LBAs.

상이한 속성들 각각에 대응하는 상대 가중 계수(relative weight factor)는 기본적으로 단일(unity)일 수 있다. 그러나, 단계 S106에서, 선택적 상대 가중 계수가 모델(160)에 도입될 수 있다. 즉, 획득된 속성들 모두가 초기에 동일하게 가중될 수 있지만, 단계 S106에서, 대응하는 스트림 ID를 결정하고 할당할 때 각 속성의 상대적 중요성을 강조하기 위해, 속성들에 대응하는 상대 가중 계수 매트릭스가 선택적 입력으로서 생성된 모델(160)로 튜닝되어 전달될 수 있다. 예를 들어, 추출된 속성들 중 하나가 추출된 다른 속성들보다 상대적으로 더 높은 중요도 레벨을 갖는다면, 상대 가중 계수를 모델(160)에 도입함으로써 더 높은 가중 계수가 그 속성에 할당될 수 있고, 이는 단계 S105에서(예를 들어, 사용자 또는 시스템이 결정한 사양들을 준수하도록 조정되기 위해), 평가 또는 재평가될 수 있다.The relative weight factor corresponding to each of the different attributes may be basically unity. However, in step S106, an optional relative weighting factor may be introduced into the model 160. That is, all of the acquired attributes may be initially weighted equally, but in step S106, in order to emphasize the relative importance of each attribute when determining and assigning a corresponding stream ID, a relative weighting coefficient matrix corresponding to the attributes May be tuned to the generated model 160 and transmitted as an optional input. For example, if one of the extracted attributes has a relatively higher importance level than the other extracted attributes, a higher weighting factor can be assigned to that attribute by introducing a relative weighting factor into the model 160 , It may be evaluated or re-evaluated in step S105 (eg, to be adjusted to comply with specifications determined by the user or system).

트레이닝 단계(120)의 종료시에, 단계 S104, 단계 S105, 그리고 선택적으로 단계 S106에서, 상대적으로 적은 수의 반복들(예를 들어, 4회 또는 5회 반복)동안 클러스터링이 수행되면, 단계 S107에서, 대응하는 스트림 ID에 대한 LBA들(예를 들어) 각각의 관계를 표현하기 위한 사전(dictionary, 150)이 형성되어서, 식별된 대응하는 LBA에 근거해서 스트림 ID들은 새로운 데이터에 자동으로 할당될 수 있다. 다른 실시예들에서, 트레이닝 단계(120)는 방법의 초기 단계로서 수행될 수 있고, 그리고/또는 임의의 작업 부하(workload) 변화들을 보상하기 위해서 사전을 조정하기 위해 테스트 단계(130) 동안 주기적으로 수행될 수 있다. 사전(150)은 메모리의 디스크 상에 저장될 수 있고, 사전(150)이 테스트 단계(130)의 실행 시간 동안 사용될 수 있도록 예를 들어, 장치 계층, 블록 계층 또는 애플리케이션 계층에 저장될 수 있다At the end of the training step 120, if clustering is performed during a relatively small number of iterations (e.g., 4 or 5 iterations) in step S104, step S105, and optionally step S106, in step S107 , A dictionary 150 is formed to express the relationship of each of the LBAs (for example) to the corresponding stream ID, so that the stream IDs can be automatically assigned to new data based on the identified corresponding LBA. have. In other embodiments, the training step 120 may be performed as an initial step in the method, and/or periodically during the test step 130 to adjust the dictionary to compensate for any workload changes. Can be done. The dictionary 150 may be stored on a disk in memory, and may be stored in, for example, a device layer, a block layer, or an application layer so that the dictionary 150 can be used during the execution time of the test step 130.

따라서, 테스트 단계(130) 동안, 단계 S108에서, 새로운 데이터는 실행중인 작업 부하로 메모리에 기입되도록 들어온다. 그 후, 단계 S109에서, 룩업(lookup)은 사전(150)을 사용하여 수행될 수 있다. 즉, 단계 S108에서 새로운 데이터가 들어오면, 단계 S108에서 수신된 새로운 데이터에 대응하는 LBA가 식별될 수 있고, 그리고, 트레이닝 단계(120)동안 생성된 사전(150)의 룩업에 근거하여, 스트림 ID가 새로이 수신된 데이터에 대응하는 LBA에 근거하여 할당됨으로써 이에 따라 관련된 데이터를 그룹화할 수 있다.Thus, during the test step 130, in step S108, new data is entered to be written to the memory with the running workload. Thereafter, in step S109, a lookup may be performed using the dictionary 150. That is, when new data comes in in step S108, the LBA corresponding to the new data received in step S108 can be identified, and, based on the lookup of the dictionary 150 generated during the training step 120, the stream ID Is allocated based on the LBA corresponding to the newly received data, so that related data can be grouped accordingly.

그 후, 단계 S110에서, 대응하는 물리적 어드레스가 스트림 ID에 할당 될 수 있다. 즉, 작업 부하를 실행하는 동안, 사전(150)에 대한 빠른 룩업은 어떠한 런타임 계측 오버헤드 없이 오직 스트림 ID 할당에만 필요하다. 따라서, 테스트 단계(130)는 추가 오버헤드 없이 상대적으로 빠르게 수행될 수 있다.Thereafter, in step S110, a corresponding physical address may be assigned to the stream ID. That is, while running the workload, a quick lookup to dictionary 150 is only needed for stream ID assignment without any runtime measurement overhead. Therefore, the test step 130 can be performed relatively quickly without additional overhead.

상술한 분석 모델은 임의의 계층(예를 들어, 파일 시스템, 장치 드라이버 등)에 배치될 수 있으므로, 스트림 ID들을 할당하는 프로세스에 애플리케이션들이 수반될 어떤 필요를 없앤다는 점에 유의해야 한다. 또한 각 I/O 동작에 대해 그리고 구현 계층에 따라, 분석 모델은 SSD에 기록되도록 요청되는 로직 블록 번호들 또는 파일들을 알고 있으므로, 셀프-러닝 모델이 그러한 로직 블록 번호들 또는 파일들의 액세스에서 경향들을 찾는 것을 허용한다.It should be noted that the above-described analytic model can be placed in any layer (eg, file system, device driver, etc.), thus eliminating any need for applications to be involved in the process of allocating stream IDs. Also, for each I/O operation and depending on the implementation layer, the analytic model knows the logic block numbers or files that are requested to be written to the SSD, so that the self-learning model takes trends in accessing those logic block numbers or files. Allow to find.

상술한 분석 모델은 평가될 수 있는 다수의 전체 속성들에 대해 유연(flexible)하다는 것을 또한 유의해야 한다. 따라서 분석 모델은 임의의 수의 속성들을 고려하여 임의의 수의 스트림 ID들을 결정하기 위해 사용될 수 있다. 따라서, 본 발명은 특정 수의 속성들 또는 특정 수의 스트림들에 제한되지 않는다. 예를 들어, 본 발명의 실시예들은 멀티-스트림 플래시 드라이브들에 의해 지원되는 스트림들의 수가 변경되더라도, 이 분야의 미래 기술들을 지원할 수 있다.It should also be noted that the analysis model described above is flexible over a number of overall properties that can be evaluated. Thus, the analytic model can be used to determine any number of stream IDs taking into account any number of attributes. Thus, the invention is not limited to a specific number of attributes or a specific number of streams. For example, embodiments of the present invention can support future technologies in this field even if the number of streams supported by multi-stream flash drives is changed.

단계 S101에서의 트레이닝 트레이스에 의해 수행된 바와 같이, 다양한 속성들을 획득하기 위한 방법들의 예들이 아래에서 설명된다.Examples of methods for obtaining various attributes, as performed by the training trace in step S101, are described below.

빈도(frequency)의 속성은 데이터 입력의 결과로서 특정 어드레스가 얼마나 자주 액세스되려고 시도되었는 지에 대응할 수 있다. 빈도의 속성을 획득하기 위해, 임의의 특정 어드레스의 액세스 수가 미리 설정된 임계값(예를 들어 4 번)보다 작은 경우, 그 특정 어드레스에 대응하는 빈도 속성은 중요도를 나타내는 1로서 할당 될 수 있고, 그렇지 않으면 0으로 할당될 수 있다.The attribute of frequency may correspond to how often a particular address was attempted to be accessed as a result of data entry. In order to obtain the attribute of the frequency, if the number of accesses of any particular address is less than a preset threshold (e.g. 4 times), the frequency attribute corresponding to that particular address can be assigned as 1 representing the importance, otherwise Otherwise, it can be assigned to zero.

시간적 지역성(temporal locality)의 속성은 특정 어드레스가 얼마나 최근에 액세스되려고 시도되었는지에 대응할 수 있다. 시간적 지역성의 속성을 획득하기 위해, 트레이닝 트레이스의 최대 타임스탬프(timestamp)가 결정될 수 있으며, 최대 타임스탬프의 미리 설정된 임계값(예를 들어, 50%의 임계값) 이후에 액세스되는 어드레스들은 시간적 지역성 속성에 1을 할당하여 최근의 것으로 표시될 수 있고, 그렇지 않으면 시간적 속성을 0으로 할당한다.The attribute of temporal locality may correspond to how recently a particular address was attempted to be accessed. In order to obtain the property of temporal locality, a maximum timestamp of the training trace can be determined, and addresses accessed after a preset threshold of the maximum timestamp (e.g., 50%) are temporal locality. It can be marked as recent by assigning 1 to the property, otherwise assigning the temporal property to 0.

순차성(sequentiality)의 속성은 데이터 입력이 순차적인 순서로 특정 어드레스들에 액세스하고자 시도하는 지의 여부에 대응할 수 있다. 순차성의 속성을 획득하기 위해, 새로운 I/O 스트림이 시퀀스 검출기에 의해 수신될 때, 시퀀스 검출기는 주어진 트레이스에서 이전(또는 소정의 미리 정의된 윈도우 크기, 예를 들어, 16) LBA를 검색(look up)할 수 있다. 일치하는 것이 발견되면, 순차성 속성은 1로 할당되고, 그렇지 않으면 0으로 할당 될 수 있다.The attribute of sequentiality may correspond to whether or not the data input attempts to access specific addresses in a sequential order. In order to obtain the nature of the sequence, when a new I/O stream is received by the sequence detector, the sequence detector looks for the previous (or some predefined window size, e.g. 16) LBA in a given trace. up) you can. If a match is found, the sequentiality attribute is assigned 1, otherwise it can be assigned 0.

도 2는 본 발명의 실시예에 따른 신규한 일치성(congruency) 속성을 설명하는 샘플링 그래프이다.2 is a sampling graph illustrating a novel congruency attribute according to an embodiment of the present invention.

본 발명의 실시예들에 의해 도입된 속성인 일치성의 속성은 LBA들과 같은 상이한 관심사의 엔티티들이 규칙적으로 또는 거의 동시에 업데이트될 때에 대응한다. 예를 들어, 특정 그룹의 어드레스들이 자주 또는 거의 항상, 거의 비슷한 시간에 업데이트되는 경우, 그러한 LBA들의 그룹은 서로 일치한다고 말할 수 있다. 일치하는 어드레스들이 동시에 액세스되기 때문에, 메모리 장치의 성능을 향상시키기 위해 그들을 함께 그룹화하는 것이 유용할 수 있다.The attribute of consistency, an attribute introduced by embodiments of the present invention, corresponds when entities of different interests, such as LBAs, are updated regularly or almost simultaneously. For example, if the addresses of a particular group are updated frequently or almost always, at about the same time, it can be said that the groups of such LBAs match each other. Since matching addresses are accessed simultaneously, it may be useful to group them together to improve the performance of the memory device.

도 2에서, 샘플링 그래프의 x-축은 타임스탬프(예를 들어, 타임스탬프 윈도우 또는 시간 간격)에 대응하고, y-축은 LBA 공간에 대응한다. 본 실시예에서, 일치성은 하나 이상의 다른 LBA들과 일관되게 시간을 맞추는 하나 이상의 LBA들의 품질로서 간주될 수 있다. 따라서, 특정 LBA들 그룹이 거의 동시에 자주 업데이트되는 경우, 그 LBA들 그룹은 서로 일치하는 것으로 말할 수 있다. 따라서, 도 2에 도시된 예에서, LBA2 및 LBA3은 수직 점선들로 표시된 4 개의 타임스탬프 윈도우들/시간 간격들 중 3 개에서 동시에 모두 업데이트되기 때문에 서로 일치한다. 낮은 오버헤드를 갖는 일치성의 속성을 획득하기 위해, 고정 길이 큐가 유지될 수 있는데, 여기서 큐(queue)의 각 엘리먼트(element) 또는 슬롯(slot)은 관측된 시간 간격/타임스탬프 윈도우를 나타낸다.. In Fig. 2, the x-axis of the sampling graph corresponds to a timestamp (eg, a timestamp window or time interval), and the y-axis corresponds to the LBA space. In this embodiment, consistency can be considered as the quality of one or more LBAs consistently timed with one or more other LBAs. Thus, if a specific group of LBAs is frequently updated almost at the same time, the group of LBAs can be said to match each other. Thus, in the example shown in Fig. 2, LBA2 and LBA3 coincide with each other because they are all updated simultaneously in three of the four timestamp windows/time intervals indicated by vertical dotted lines. In order to obtain the attribute of consistency with low overhead, a fixed length queue can be maintained, where each element or slot of the queue represents an observed time interval/timestamp window. .

더욱 상세하게, 본 발명의 일 실시예의 장치는 특정 시간 간격들 사이에 또는 특정 시간 간격들에서 "스누프(snoop)"할 수 있다. 도 2에 도시된 예에서, 스누프의 사례들(instances)은 수직 점선들로 표시된다. LBA2 및 LBA3은 제1 스누프 시간(210)에서 일치하게 갱신됨을 알 수 있다. 유사하게, 도 2는 LBA2 및 LBA3이 제2 스누프 시간(220) 및 제3 스누프 시간(230)에서 동시에 액세스되는 것을 도시한다. 상이한 스누프 시간들은 LBA2 및 LBA3이 동시에(예를 들어, 일치하는(congruent)) 업데이트되는 경향을 갖고 있다는 것을 나타내기 때문에, LBA2와 LBA3이 유사한 라이프 사이클들을 가질 수 있으므로, LBA2와 LBA3을 단일 스트림으로 함께 그룹화하는 것이 이익일 수 있다.More specifically, the device of an embodiment of the present invention may "snoop" between or at specific time intervals. In the example shown in Fig. 2, instances of snoop are indicated by vertical dashed lines. It can be seen that LBA2 and LBA3 are updated coincidently at the first snoop time 210. Similarly, FIG. 2 shows that LBA2 and LBA3 are accessed simultaneously at the second snoop time 220 and the third snoop time 230. Since different snoop times indicate that LBA2 and LBA3 tend to be updated simultaneously (e.g., congruent), LBA2 and LBA3 may have similar life cycles, so LBA2 and LBA3 can be combined in a single stream. Grouping them together can be beneficial.

도 3은 본 발명의 일 실시예에 따른 일치성 속성 할당의 플로우차트이다.3 is a flow chart of concordance attribute assignment according to an embodiment of the present invention.

도 3을 참조하면, 일치성 속성을 획득하는 과정에서 고유한 LBA 어드레스들을 나타내기 위해 예시적인 심볼(symbol)들 [-, x, o, |, +]을 사용하여 일치성 속성을 결정하는 과정의 플로우차트가 도시된다. 큐 (300)는 다수의 엘리먼트들 또는 슬롯들을 포함하고, 큐(300)의 각 엘리먼트는 엘리먼트(단일 타임스탬프 윈도우를 참조하는 큐(300)의 각 엘리먼트)에 대응하는 해당 타임스탬프 윈도우 내에서 액세스되는 모든 고유한 LBA들의 리스트를 유지한다.Referring to FIG. 3, in the process of obtaining the consistency attribute, a process of determining the correspondence attribute using exemplary symbols [-, x, o, |, +] to indicate unique LBA addresses. A flowchart of is shown. The queue 300 includes a number of elements or slots, and each element of the queue 300 is accessed within a corresponding timestamp window corresponding to an element (each element of the queue 300 referring to a single timestamp window). It keeps a list of all the unique LBAs.

단계 S301에서, 고유한 LBA들의 수가 주어진 임계값(예를 들어, 임계값 "β")보다 작은 지의 여부가 판별된다. 각각의 타임스탬프 윈도우는 해당 타임스탬프 윈도우 내 액세스되는 모든 LBA들에 대응할 것이고, 특정 타임스탬프 윈도우를 큐(300)에 삽입하는 결정은 해당 타임스탬프 윈도우 내 액세스되는 모든 고유한 LBA들의 수에 의존하며, LBA들은 다양한 심볼들로 표현될 수 있다. 예를 들어, 이 실시예에서, 제1 타임스탬프 윈도우(310)에서, "-" LBA가 액세스되었고, 제2 타임스탬프 윈도우(320)에서 "x" LBA 및 "|" LBA가 함께 액세스되었고, 제3 타임스탬프 윈도우(330)에서 "-" LBA, "x" LBA, "|" LBA 및 "+" LBA가 함께 액세스되었고, 제4 타임스탬프 윈도우(340)에서 "-" LBA 및 "o" LBA가 함께 액세스되었다.In step S301, it is determined whether the number of unique LBAs is smaller than a given threshold value (eg, threshold value "β"). Each timestamp window will correspond to all LBAs accessed within that timestamp window, and the decision to insert a particular timestamp window into queue 300 depends on the number of all unique LBAs accessed within that timestamp window. , LBAs can be represented by various symbols. For example, in this embodiment, in the first timestamp window 310, the "-" LBA has been accessed, and in the second timestamp window 320, "x" LBA and "|" LBAs were accessed together, and in the third timestamp window 330 "-" LBA, "x" LBA, "|" LBA and “+” LBA were accessed together, and “-” LBA and “o” LBA were accessed together in the fourth timestamp window 340.

논리적으로, 주어진 타임스탬프 윈도우 내 적은 수의 LBA들이 액세스되지만, 그러한 적은 수의 LBA들의 조합이 일치할 때(즉, 그들은 여러 번 액세스되고 또한 복수의 그러한 여러 번에 대해 동시에 액세스된다), 그러한 LBA들을 동일한 스트림으로 그룹화하는 것의 중요성이 적합할 수 있다. 따라서, 일단 트레이스가 달성되면(예를 들어, 도 1에서 설명된 바와 같이), 입력이 분석될 수 있다.Logically, a small number of LBAs within a given timestamp window are accessed, but when the combination of such small number of LBAs matches (i.e. they are accessed multiple times and also accessed simultaneously for multiple such multiple times), such LBAs The importance of grouping them into the same stream may be appropriate. Thus, once the trace is achieved (eg, as described in FIG. 1), the input can be analyzed.

단계 S301에서, 타임스탬프 윈도우 내 액세스되는 고유한 LBA들이 소정의 임계값보다 작은 것으로 판단되면, 단계 S302에서, 특정 타임스탬프 윈도우가 큐(300)에 삽입된다. 그러나, 단계 S301에서 타임스탬프 윈도우 내 액세스되는 고유한 LBA들이 주어진 임계값보다 작지 않으면, 단계 S303에서, 프로세스는 다음 타임스탬프 윈도우로 이동한다. 비교적 많은 수의 액세스된 LBA들을 갖는 타임스탬프 윈도우를 무시하는 것이 유리할 수 있는데, 큰 수는 일치성에 대해서 중요도가 더 낮은 것을 나타낼 수 있기 때문이다. 따라서, 상대적으로 많은 수의 어드레스들을 갖는 타임스탬프 윈도우들을 큐(300)로 가져오지 않음으로써, 계산이 보다 빠르게 수행될 수 있고, 도 1에서 설명된 트레이닝 단계(120)가 더 효율적일 수 있다.In step S301, if it is determined that the unique LBAs accessed in the timestamp window are smaller than a predetermined threshold value, in step S302, a specific timestamp window is inserted into the queue 300. However, if the unique LBAs accessed in the timestamp window in step S301 are not smaller than the given threshold, in step S303, the process moves to the next timestamp window. It may be advantageous to ignore the timestamp window with a relatively large number of accessed LBAs, as a large number may indicate a lower importance for consistency. Therefore, by not bringing timestamp windows having a relatively large number of addresses to the queue 300, the calculation can be performed faster, and the training step 120 described in FIG. 1 can be more efficient.

단계 S304에서, 큐(300)가 가득 찼는지의 여부가 판별된다. 일단 큐(300)의 길이가 채워지면, 단계 S305에서, 일치성 속성에 값을 첨부함으로써 디큐(dequeue)가 트리거(trigger)된다. 큐(300)의 길이는 조정 될 수 있다.In step S304, it is determined whether the queue 300 is full. Once the length of the queue 300 is filled, in step S305, a dequeue is triggered by appending a value to the consistency attribute. The length of the queue 300 can be adjusted.

디큐하는 동안, 각 타임스탬프 윈도우는 모든 다른 타임스탬프 윈도우들과 비교되어 일치하는 엘리먼트들을 식별한다. 큐(300)의 제1 엘리먼트 리스트의 부분 집합 값들이 큐(300)의 다른 엘리먼트 리스트의 부분 집합 값들과 일치하는 것으로 디큐 동작 중에 판별되면, 그 부분 집합들의 값들은 1로 표시된다(예를 들어, 일치(congruent)). 엘리먼트 리스트의 부분 집합의 값들이 임의의 다른 엘리먼트 리스트의 부분 집합의 값들과 일치하지 않으면, 그 엘리먼트에 대응하는 부분 집합의 값들은 0으로 표시된다(예컨대, 불일치(not congruent)). 일치성을 판별하기 위한 부분 집합의 크기의 임계값은 조절될 수 있으며, 예를 들어, 3 개의 값들로 설정 될 수 있고, 단지 2 개의 매칭 값들을 포함하는 매칭 부분 집합들을 갖는 2 개의 상이한 엘리먼트들의 리스트들이 여전히 0(즉, 불일치)으로 표시될 수 있다. 단계 S305에서, 디큐는 FIFO(first in first out) 순서로 동작된다.During dequeue, each timestamp window is compared to all other timestamp windows to identify matching elements. When it is determined during the DQ operation that the subset values of the first element list of the queue 300 match the subset values of the other element list of the queue 300, the values of the subsets are displayed as 1 (for example, , Congruent). If the values of a subset of the element list do not match values of a subset of any other element list, the values of the subset corresponding to that element are marked as 0 (eg, not congruent). The threshold of the size of the subset for determining the correspondence can be adjusted, for example, can be set to three values, and of two different elements with matching subsets containing only two matching values. Lists can still be marked as 0 (ie, mismatch). In step S305, the D-Q is operated in FIFO (first in first out) order.

단계 S306에서, 매트릭스(예컨대, 일치성 매트릭스)는 LBA들을 나타내는 심볼들 각각에 단일-비트 이진 값을 할당함으로써 생성된다. "x" LBA 및 "|" LBA는 제2 및 제3 타임스탬프 윈도우들(320, 330)에서 모두 액세스되고 다른 타임스탬프 윈도우에서 액세스되지 않기 때문에 "x" LBA 및 "|" LBA는 모두 일치(congruent)로 판별되고, "1"의 값으로 할당되고, 나머지 LBA들은 불일치(non-congruent)로 판별되고, "0"의 값으로 할당된다.In step S306, a matrix (e.g., a correspondence matrix) is generated by assigning a single-bit binary value to each of the symbols representing LBAs. "x" LBA and "|" Since the LBA is accessed both in the second and third timestamp windows 320 and 330 and not in another timestamp window, the "x" LBA and "|" All LBAs are determined to be congruent and assigned a value of “1”, and the remaining LBAs are determined to be non-congruent and assigned a value of “0”.

또한 다른 실시들에서, 다양한 서브 속성들이 있을 수 있으므로, 단계 S306의 일치성 매트릭스에 다수의 열들이 있을 수 있고, 일치성 매트릭스의 각 행은 상이한 속성들로부터의 데이터를 참조할 수 있다.Also in other implementations, since there may be various sub-attributes, there may be multiple columns in the consistency matrix of step S306, and each row of the correspondence matrix may reference data from different attributes.

그 후, 일단 단계 S306에서 일치성 매트릭스가 생성되면, 도 2의 트레이닝 단계(120) 동안 생성된 사전(150)은 스트림 ID들을 다양한 데이터의 스트림들에 할당하기 위한 룩업 테이블로 사용된다.Thereafter, once the consistency matrix is generated in step S306, the dictionary 150 generated during the training step 120 of FIG. 2 is used as a lookup table for allocating stream IDs to streams of various data.

따라서 앞서 설명된 본 발명의 실시예들은 임의의 애플리케이션 레벨 변경들의 요구없이 구현될 수 있고, 따라서 다수의 애플리케이션들을 동시에 실행하기 위해 사용되는 임의의 애플리케이션, 작업 부하 또는 플랫폼에 대해 사용될 수 있고, 또한 다수의 속성들의 사용이 가능하게 될 수 있다. 더욱이, 위에서 설명된 적절한 상대 가중 계수들을 사용하여 상이한 속성들이 추가, 제거 또는 심지어 튜닝될 수 있으므로, 높은 품질의 스트림 ID 패킷화를 가능하게 한다.Accordingly, the embodiments of the present invention described above can be implemented without the need of any application level changes, and thus can be used for any application, workload or platform used to run multiple applications simultaneously, and can also The use of properties of can be made possible. Moreover, different attributes can be added, removed or even tuned using the appropriate relative weighting factors described above, thus enabling high quality Stream ID packetization.

또한, 테스트 단계(130)동안, 사전(150) 내의 단일의 하드 룩업(hard lookup)만이 사용되기 때문에, 계측 오버 헤드는 낮게/최소화되고, 따라서 테스트 동안 설명된 실시예들의 프로세스들을 빠르게 만든다. 또한, 실시예들은 다른 상황들(contexts)에 적용될 수 있기 때문에, 중요한 것이 중요하지 않은 것으로부터 분리되고, 설명된 실시예들은 고온, 상온 및 저온 데이터 식별 문제들, 다계층(multi-tiered) 스토리지, 효율적인 캐싱(caching)과 같은 다양한 분야들에 적용될 수 있다.Also, since only a single hard lookup in dictionary 150 is used during test step 130, metrology overhead is low/minimized, thus speeding up the processes of the described embodiments during testing. In addition, since the embodiments can be applied to different contexts, the important is separated from the non-critical, and the described embodiments are used for high temperature, room temperature and low temperature data identification problems, multi-tiered storage , It can be applied to various fields such as efficient caching.

상술한 내용은 예시적인 실시예들을 설명하기 위한 것이며, 본 발명을 제한하는 것으로 해석되어서는 안 된다. 비록 몇몇 예시적인 실시예들이 설명되었지만, 당업자는 예시적인 실시예들의 신규한 교시들 및 효과들로부터 실질적으로 벗어남 없이 예시적인 실시예들에서 많은 변형들이 가능하다는 것을 용이하게 이해할 것이다. 따라서, 그러한 모든 변형들은 청구항들에 정의된 예시적인 실시예들의 범위 내에 포함되도록 의도된다. 청구항들에서, 기능식 표현(means-plus-function) 용어들은 구조적 등가물뿐만 아니라 등가의 구조를 열거하여 여기에 설명된 구조를 포함하도록 의도되었다. 따라서, 상술한 실시예들은 예시적인 실시예들을 설명하기 위한 것이며, 개시된 특정 실시예들에 한정되는 것으로 해석되어서는 안되며, 첨부된 청구항들의 범위 내에서, 개시된 예시적인 실시예들 및 다른 예시적인 실시예들에 대한 변형들을 포함하는 것으로 이해되어야 한다. 본 발명의 개념은 다음의 청구항들에 의해 정의되고, 청구항들 범위의 균등물들을 포함한다. The above description is for describing exemplary embodiments and should not be construed as limiting the present invention. Although several exemplary embodiments have been described, those skilled in the art will readily appreciate that many variations are possible in the exemplary embodiments without substantially departing from the novel teachings and effects of the exemplary embodiments. Accordingly, all such modifications are intended to be included within the scope of the exemplary embodiments defined in the claims. In the claims, the terms mean-plus-function are intended to encompass the structure described herein by listing structural equivalents as well as equivalent structures. Accordingly, the above-described embodiments are for describing exemplary embodiments, and should not be construed as being limited to the specific disclosed embodiments, and within the scope of the appended claims, the disclosed exemplary embodiments and other exemplary embodiments It should be understood to include variations on the examples. The concept of the invention is defined by the following claims and includes equivalents of the scope of the claims.

140: 특징 매트릭스
150: 사전 160: 모델140: feature matrix
150: dictionary 160: model

Claims

For a multi-streaming memory system:
Memory; And
Including a processor connected to the memory,
The processor running a software component that improves stream allocation in multi-streaming flash drives comprises:
Identify attributes each associated with the logical block addresses, each corresponding to each of the plurality of streams of data writes, and each representing an expected life of data stored in the corresponding stream;
In order to evaluate the importance coefficient of any one of the attributes indicating the likelihood that the group of logical block addresses will be accessed with any one of the attributes for each of the streams, any one of the streams Determine whether to respond to simultaneous data write requests occurring in each of the timestamp windows,
The above is to determine:
Indicating the logical block addresses of each of the streams of the stream group in an element in the queue for each of the timestamp windows;
Comparing each of the elements with all other elements; And
Performed by identifying the logical block addresses of the streams of the stream group each represented within each of a plurality of the elements, and
Two or more clustered logic, based on each of the logical block addresses and all importance coefficients for a stream to which a stream ID is assigned, and an indication that the expected lifetime of the data of two or more logical block addresses will be similar. Clustering two or more of the logical block addresses into one or more data write streams by assigning a single stream ID to each of the block addresses; And
A multi-streaming memory system for writing the data in the streams of data writes.

The method of claim 1,
The software component further comprises assigning a weighting factor to one or more of the attributes, and
The software component assigns the stream ID to each of the logical block addresses by giving priority to importance coefficients corresponding to the attributes over other importance coefficients based on the assigned weighting coefficient. system.

The method of claim 1,
The software component evaluates the importance coefficient for each of the attributes for each of the logical block addresses by generating an n × m feature matrix comprising n × m addresses, wherein n is the total number of the attributes, And m is the total number of the logical block addresses.

The method of claim 3,
Each of the addresses of the feature matrix corresponds to the importance factor and includes a single-bit binary value indicating whether an attribute corresponding to any of the logical block addresses is significant.

The method of claim 4,
The software component further comprising determining a value of the single-bit binary value based on whether a threshold value corresponding to a corresponding one of the attributes is satisfied.

The method of claim 5,
And the software component allocating the same stream ID to logical block addresses having binary values that match at corresponding ones of the addresses of the feature matrix among the logical block addresses.

The method of claim 1,
The software component further comprises generating a dictionary for assigning a stream ID to the future logical block address requests based on the importance factor for each of the attributes for each of the logical block address requests, and
The memory is a multi-streaming memory system for storing the dictionary.

A method of identifying attributes respectively related to a logical block address corresponding to an input/output stream of data in a multi-streaming memory device for improving stream allocation in multi-streaming flash drives:
Detecting an input/output stream corresponding to the multi-streaming memory device;
Acquiring a plurality of attributes corresponding to a logic block address corresponding to the detected input/output stream, and indicating an expected lifetime of data to be stored in the detected input/output stream;
Generating a feature matrix representing importance coefficients of each of the attributes related to the logical block address;
To evaluate the importance factor of any one of the attributes indicating the likelihood that the group of logical block addresses will be accessed with any of the attributes for each of the streams, during each of a plurality of timestamp windows, the Determining which stream group among the streams corresponds to simultaneous data write requests;
Based on the feature matrix and an indication that the expected lifetime of the data of two or more of the logical block addresses is similar, the received data corresponding to the two or more of the logical block addresses is transferred to one or more streams of different data writes Clustering on; And
Writing the data of the streams of the data writes,
The determining step is:
Indicating logical block addresses of each of the streams of the stream group in an element in a queue for each of the timestamp windows;
Comparing each of the elements with all other elements; And
And identifying the logical block addresses of the streams of the stream group each represented in each of a plurality of the elements.

The method of claim 8,
And generating an analysis model based on the feature matrix to evaluate the clustering of the received data in the different streams.

A method of allocating an attribute importance factor to a stream of data in a multi-stream memory device for improving stream allocation in multi-streaming flash drives is:
Receiving the streams of data respectively corresponding to logical block addresses;
Determining whether the number of unique logical block addresses accessed to each of the plurality of timestamp windows is less than a threshold value;
Inserting symbols respectively corresponding to the timestamp windows and representing each of the logical block addresses into a queue when the number of unique logical block addresses corresponding to the queue is not less than the threshold value;
Determining whether the inserted symbols are full in the queue;
Determining which of the logical block addresses are simultaneously accessed during each of a plurality of timestamp windows among the timestamp windows when the queue is full; And
And writing the data of the streams of data.