
CN116955044B - Method, device, equipment and medium for testing cache working mechanism of processor - Google Patents


Info

Publication number
CN116955044B
Authority
CN
China
Prior art keywords
cache
test
signal
thread
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311174747.4A
Other languages
Chinese (zh)
Other versions
CN116955044A (en)
Inventor
郑楚育
邓晓宇
何伟
唐丹
包云岗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Open Source Chip Research Institute
Original Assignee
Beijing Open Source Chip Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Open Source Chip Research Institute filed Critical Beijing Open Source Chip Research Institute
Priority to CN202311174747.4A priority Critical patent/CN116955044B/en
Publication of CN116955044A publication Critical patent/CN116955044A/en
Application granted granted Critical
Publication of CN116955044B publication Critical patent/CN116955044B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • G06F11/2236 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test CPU or processors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2273 Test methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application provides a method, an apparatus, a device, and a medium for testing the cache working mechanism of a processor, relating to the technical field of processor testing. The method comprises: acquiring cache working mechanism information of a processor; configuring a preset initial test program according to the cache working mechanism information to obtain a test program for testing the cache working mechanism of the processor, the test program having a plurality of cache test modes; generating, by the test program and for each cache test mode, thread addresses in one-to-one correspondence with at least some threads of the processor, the thread addresses satisfying the cache test mode; and testing, by the test program, the working mechanism of the cache according to the thread addresses corresponding to each cache test mode to obtain a test result. The method thereby tests the working mechanism of the processor cache, adapts to the multiple scenarios in which threads use the cache, reduces the simulation time needed to reach those scenarios, and solves the problem of long simulation time in the prior art.

Description

Method, device, equipment and medium for testing cache working mechanism of processor
Technical Field
The present disclosure relates to the field of processor testing technologies, and in particular, to a method, an apparatus, a device, and a medium for testing a cache operating mechanism of a processor.
Background
To test the working mechanism of the cache (the design under test, DUT) while the processor (CPU, Central Processing Unit) is running, a method for testing the cache working mechanism of the processor is required.
In the prior art, the working mechanism of the cache at processor runtime is tested using a verification framework (a test program) written in the SystemVerilog language, in which the addresses allocated to the threads at processor runtime are randomly generated.
In implementing the present application, the inventors found that at least the following problem exists in the prior art: because the addresses allocated to the threads at processor runtime are randomly generated, and threads use the storage space of the cache in multiple scenarios, random factors make the simulation time needed to reach those scenarios long.
Disclosure of Invention
The embodiments of the application provide a method, an apparatus, a device, and a medium for testing the cache working mechanism of a processor, to solve the prior-art problem that, because the addresses allocated to threads at processor runtime are randomly generated and threads use the storage space of the cache in multiple scenarios, random factors make the simulation time needed to reach those scenarios long.
In a first aspect, an embodiment of the present application provides a method for testing a cache operating mechanism of a processor, where the method includes:
acquiring cache working mechanism information of a processor; the cache working mechanism information is used for representing a working mechanism of the cache when the processor runs;
configuring a preset initial test program according to the cache working mechanism information to obtain a test program for testing the working mechanism of the cache of the processor; the test program is provided with a plurality of cache test modes;
generating, by the test program and for each cache test mode, thread addresses in one-to-one correspondence with at least some threads of the processor, wherein the thread addresses satisfy the cache test mode;
and testing, by the test program, the working mechanism of the cache according to the thread addresses corresponding to each cache test mode to obtain a test result.
In a second aspect, an embodiment of the present application provides a test apparatus for a cache operating mechanism of a processor, where the apparatus includes:
a first acquisition module, configured to acquire cache working mechanism information of the processor; the cache working mechanism information is used for representing a working mechanism of the cache when the processor runs;
a second acquisition module, configured to configure a preset initial test program according to the cache working mechanism information to obtain a test program for testing the working mechanism of the cache of the processor; the test program is provided with a plurality of cache test modes;
a generating module, configured to generate, by the test program and for each cache test mode, thread addresses in one-to-one correspondence with at least some threads of the processor, wherein the thread addresses satisfy the cache test mode;
and a test module, configured to test, by the test program, the working mechanism of the cache according to the thread addresses corresponding to each cache test mode to obtain a test result.
In a third aspect, embodiments of the present application further provide an electronic device, including a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of the first aspect.
In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium storing instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the method of the first aspect.
In the embodiments of the application, the initial test program is configured according to the cache working mechanism information of the processor to obtain the test program, and the test program tests the working mechanism of the cache while the processor is running. For each cache test mode, thread addresses in one-to-one correspondence with at least some threads of the processor are generated, so as to cover the multiple scenarios in which threads use the storage space of the cache. This reduces the simulation time needed to reach those scenarios and solves the prior-art problem that, because the addresses allocated to threads at processor runtime are randomly generated and threads use the storage space of the cache in multiple scenarios, random factors make the simulation time for those scenarios long.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the present application are more readily apparent, specific embodiments of the present application are described below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art.
FIG. 1 is a flowchart of a method for testing a cache operating mechanism of a processor according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating the steps of another method for testing a cache operating mechanism of a processor according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a test procedure according to an embodiment of the present application;
FIG. 4 is a schematic diagram of another test procedure according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a plurality of cache test modes provided in an embodiment of the present application;
FIG. 6 is a block diagram of a test apparatus for a cache operating mechanism of a processor according to an embodiment of the present invention;
FIG. 7 is a block diagram of an electronic device provided by an embodiment of the present invention;
FIG. 8 is a block diagram of another electronic device in accordance with another embodiment of the invention.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms "first", "second", and the like in the description and in the claims are used to distinguish between similar objects and not necessarily to describe a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable where appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein. Objects identified by "first", "second", etc. are generally of one type, and the number of such objects is not limited; for example, the first object may be one or more. Furthermore, the term "and/or" as used in the specification and claims describes an association between associated objects and covers three relationships; for example, "A and/or B" may mean: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it. The term "plurality" in the embodiments of the present application means two or more, and other quantifiers are to be understood similarly.
The following describes in detail a method for testing a cache operating mechanism of a processor according to the embodiment of the present application through a specific embodiment and an application scenario thereof with reference to the accompanying drawings.
Fig. 1 is a flowchart of the steps of a method for testing a cache operating mechanism of a processor according to an embodiment of the present application. As shown in fig. 1, the method may include:
Step 101, obtaining cache working mechanism information of a processor.
The cache working mechanism information is used for representing a working mechanism of a cache when the processor runs.
In the embodiment of the application, the test program for testing the working mechanism of the cache of the processor is obtained by acquiring the cache working mechanism information of the processor and further configuring the preset initial test program according to the cache working mechanism information.
The cache of the processor includes a plurality of cache levels, such as a first-level cache (FLC, first-level cache), a second-level cache (MLC, mid-level cache), and a third-level cache (LLC, last-level cache).
Step 102, configuring a preset initial test program according to the cache working mechanism information to obtain a test program for testing the working mechanism of the cache of the processor; the test program has a plurality of cache test modes.
In the embodiment of the application, a preset initial test program is configured according to the cache working mechanism information to obtain a test program for testing the working mechanism of the cache of the processor, and then the test program is used for testing the working mechanism of the cache.
It should be noted that the initial test program is an initial verification framework (i.e. a test program) written and built for testing the working mechanism of the cache at processor runtime. The initial test program must be configured according to the working mechanism information of the cache to be tested, so as to form a test program that matches that information.
Step 103, generating thread addresses corresponding to at least part of threads of the processor one by one for each cache test mode through the test program.
Wherein the thread address satisfies the cache test mode.
In the embodiment of the application, the test program generates, for each cache test mode, thread addresses in one-to-one correspondence with at least some threads of the processor; the test program then tests the working mechanism of the cache according to the thread addresses corresponding to each cache test mode and obtains a test result.
A thread is the smallest unit that the operating system can schedule. It is contained in a process and is the actual unit of operation within the process. A thread is a single sequential flow of control in a process; multiple threads can run concurrently in one process, each executing a different task in parallel. A process is the basic unit of resource allocation in a system and the basis of the operating system structure. A program is a description of instructions, data, and their organization, and a process is a running instance of a program.
The thread address is the address of the cache storage space allocated to the corresponding thread, and the cache test mode is a scenario in which threads use the storage space of the cache. The scenarios include mutually independent storage spaces, partially overlapping storage spaces, completely overlapping storage spaces, scattered overlapping storage spaces, and the like, where the storage space allocated to each thread comprises a plurality of storage blocks. Mutually independent storage spaces means that the storage spaces allocated to all threads share no storage block. Partially overlapping storage spaces means that at least two of the storage spaces allocated to the threads share a first preset number of identical first storage blocks, and all the first storage blocks form a set of storage blocks with contiguous storage block addresses, where the first preset number is smaller than the total number of storage blocks of the corresponding storage space. Completely overlapping storage spaces means that at least two of the storage spaces allocated to the threads share identical second storage blocks, and the number of second storage blocks equals the total number of storage blocks of the corresponding storage space. Scattered overlapping storage spaces means that at least two of the storage spaces allocated to the threads share a second preset number of identical third storage blocks, and at least some of the third storage blocks form a set of storage blocks with non-contiguous storage block addresses, where the second preset number is smaller than the total number of storage blocks of the corresponding storage space.
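As an illustration only, the four scenario definitions above can be sketched as a small classifier over two storage spaces, each modeled as a set of storage block addresses. The function name `classify_overlap`, the 64-byte block size, and the mode labels are assumptions made for this sketch and do not come from the application.

```python
from typing import Set


def classify_overlap(space_a: Set[int], space_b: Set[int], block_size: int = 64) -> str:
    """Classify how two storage spaces (sets of block addresses) overlap.

    block_size is an assumed block granularity, not a value from the text.
    """
    shared = space_a & space_b
    if not shared:
        return "independent"   # no storage block in common
    if shared == space_a and shared == space_b:
        return "complete"      # every block of both spaces is shared
    ordered = sorted(shared)
    # Shared blocks are contiguous when neighbours differ by exactly one block.
    contiguous = all(hi - lo == block_size for lo, hi in zip(ordered, ordered[1:]))
    return "partial" if contiguous else "scattered"


c1 = {0, 64, 128, 192}
print(classify_overlap(c1, {256, 320}))        # independent
print(classify_overlap(c1, {128, 192, 256}))   # partial
print(classify_overlap(c1, set(c1)))           # complete
print(classify_overlap(c1, {0, 128, 256}))     # scattered
```

The sketch treats a single shared block as a contiguous (partial) overlap; the text does not address that corner case.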
Step 104, testing, by the test program, the working mechanism of the cache according to the thread addresses corresponding to each cache test mode to obtain a test result.
In the embodiment of the application, the working mechanism of the cache is tested according to the thread address corresponding to each cache test mode through the test program, and the test result is obtained, so that the working mechanism of the cache in the running process of the processor is tested.
It should be noted that, referring to Table 1 below, the test result may take the form of a test log and may be stored in a database. The test result may be the execution result of all access instructions in the test program, and may include, for each access instruction: the instruction execution period in which the access instruction is located (the period entry); the cache level and cache access mode used by the access instruction, including data storage and data acquisition (the client entry); the type of the access instruction (the opcode entry); the permission of the access instruction (the parameter entry); the storage block address of the access instruction (the addr entry); the data in the storage block corresponding to the access instruction (the data_hex entry); the source in decimal form (the source entry); the sink in decimal form (the sink entry); the source in hexadecimal form, converted from the decimal source; and the sink in hexadecimal form, converted from the decimal sink.
TABLE 1
The test program comprises a plurality of access instruction sequences, wherein the access instruction sequences are used for requesting to access the corresponding storage blocks; the access instruction sequence comprises a plurality of access instructions executed according to a preset sequence; the memory blocks have corresponding memory block addresses.
The access instruction is a TileLink bus message, such as Acquire, Grant, or GrantAck. TileLink is a chip-level interconnect standard that provides multiple master devices with coherent memory-mapped access to memory and other slave devices. TileLink is designed for use in a system on chip (SoC) to connect general-purpose multiprocessors, coprocessors, accelerators, direct memory access (DMA) engines, and simple or complex devices, using a fast, scalable interconnect that provides low-latency, high-throughput data transfer. The Acquire message is the request message that a master agent initiates when it intends to cache a copy of a data block locally; the master agent may also use it to upgrade the permissions on a block it has already cached (e.g., to obtain write permission for a read-only copy). The GrantData message is both a response and a request message, which the slave agent uses to provide an acknowledgement together with a copy of the data block to the original requesting master agent. The GrantAck message is used by the master agent to provide the final acknowledgement that the transaction has completed, and is also used by the slave agent to ensure global serialization of operations.
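The message order described above can be sketched as a simple check that an access instruction sequence follows the Acquire, Grant (or GrantData), GrantAck order. This is a minimal illustration, not the application's test program; the function name and the `PORT` table are assumptions for this sketch, with the port letters following the TileLink convention that Acquire travels on A, Grant and GrantData on D, and GrantAck on E.

```python
from typing import List

# TileLink port (channel) on which each message travels.
PORT = {"Acquire": "A", "Grant": "D", "GrantData": "D", "GrantAck": "E"}


def is_valid_acquire_transaction(messages: List[str]) -> bool:
    """Return True if the sequence is Acquire -> Grant/GrantData -> GrantAck."""
    if len(messages) != 3:
        return False
    request, response, ack = messages
    return (
        request == "Acquire"
        and response in ("Grant", "GrantData")
        and ack == "GrantAck"
    )


sequence = ["Acquire", "GrantData", "GrantAck"]
print(is_valid_acquire_transaction(sequence))   # True
print([PORT[m] for m in sequence])              # ['A', 'D', 'E']
```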
For example, in the test log, NULL represents a null value, and the test result of the access instruction Acquire is as follows: the instruction execution period in which the access instruction Acquire is located (the period entry) is 2018; the cache level and cache access mode used by the access instruction Acquire (the client entry) is L3 (data storage of the third-level cache, i.e., a storage block of the third-level cache receives the data); the type of the access instruction Acquire (the opcode entry) is Acquire; the permission of the access instruction Acquire (the parameter entry) is NtoB; the storage block address of the access instruction Acquire (the addr entry) is 0x1c900; the data in the storage block of the access instruction Acquire (the data_hex entry) is 0x0; the decimal-form source corresponding to the access instruction Acquire (the source entry) is 0; the decimal-form sink corresponding to the access instruction Acquire (the sink entry) is NULL; the hexadecimal-form source, converted from the decimal source, is 0x0; and the hexadecimal-form sink, converted from the decimal sink, is "0 x".
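One way to picture a single row of such a test log is as a small record type. The sketch below is illustrative only: the class name `AccessLogEntry` is hypothetical, the field names simply reuse the table entry names mentioned in the example, and representing NULL as `None` is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessLogEntry:
    """One test-log row for an access instruction (hypothetical layout)."""
    period: int             # instruction execution period
    client: str             # cache level and access mode, e.g. "L3"
    opcode: str             # type of the access instruction
    parameter: str          # permission, e.g. "NtoB"
    addr: int               # storage block address
    data_hex: str           # data held in the storage block
    source: Optional[int]   # decimal source, None stands for NULL
    sink: Optional[int]     # decimal sink, None stands for NULL

    def source_hex(self) -> Optional[str]:
        """Hexadecimal form of the decimal source, or None for NULL."""
        return None if self.source is None else hex(self.source)


# The Acquire example from the text, with NULL modeled as None.
entry = AccessLogEntry(
    period=2018, client="L3", opcode="Acquire", parameter="NtoB",
    addr=0x1C900, data_hex="0x0", source=0, sink=None,
)
print(entry.source_hex())   # 0x0
print(hex(entry.addr))      # 0x1c900
```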
In summary, in this embodiment of the present application, the initial test program is configured according to the cache working mechanism information of the processor to obtain the test program, and the test program tests the working mechanism of the cache while the processor is running. For each cache test mode, thread addresses in one-to-one correspondence with at least some threads of the processor are generated, so as to cover the multiple scenarios in which threads use the storage space of the cache. This reduces the simulation time needed to reach those scenarios and solves the prior-art problem that, because the addresses allocated to threads at processor runtime are randomly generated and threads use the storage space of the cache in multiple scenarios, random factors make the simulation time for those scenarios long.
In addition, through the embodiment of the application, the triggering probability of the specific event formed by the working mechanism of the cache when the thread uses the scene of the storage space of the cache is improved, and the simulation time of the specific event formed by the working mechanism of the cache when the thread uses the scene of the storage space of the cache is reduced as a whole.
Fig. 2 is a flowchart of specific steps of a method for testing a cache operating mechanism of a processor according to an embodiment of the present application, and as shown in fig. 2, the method may include:
Step 201, obtaining cache working mechanism information of a processor.
The cache working mechanism information is used for representing a working mechanism of a cache when the processor runs.
The implementation of this step is similar to the implementation of step 101, and will not be described here again.
Step 202, acquiring a target signal set matched with the cache working mechanism information from a plurality of signal sets preset in the initial test program.
The signal set is a set of correspondence between signal identifications and signal handles of a plurality of operation signals; the operation signal is a signal generated when the corresponding cache is operated according to the corresponding cache working mechanism information.
In the embodiment of the application, the target signal set matched with the information of the cache working mechanism is obtained from a plurality of signal sets preset in the initial test program, and then the signal identifier and the signal handle of the operation signal matched with the information of the cache working mechanism are obtained according to the target signal set.
For example, the signal identifier of the operation signal whose signal handle is source is signal identifier 1, and the signal identifier of the operation signal whose signal handle is sink is signal identifier 2.
It should be noted that TileLink has five ports, A, B, C, D, and E. In the initial test program, a signal set is created for each type of cache working mechanism, and each signal set includes the correspondence between the signal identifiers and signal handles of the operation signals of ports A, B, C, D, and E.
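A signal set of this kind can be pictured as a mapping from signal identifier to signal handle, with one such mapping preset per type of cache working mechanism. The following is a minimal sketch under stated assumptions: the mechanism type names `"inclusive"` and `"non_inclusive"`, the extra `param` handle, and the use of plain strings as handles are all hypothetical; only the pairing of identifier 1 with `source` and identifier 2 with `sink` follows the example in the text.

```python
from typing import Dict

SignalSet = Dict[int, str]  # signal identifier -> signal handle

# One preset signal set per type of cache working mechanism (names are hypothetical).
SIGNAL_SETS: Dict[str, SignalSet] = {
    "inclusive": {1: "source", 2: "sink"},
    "non_inclusive": {1: "source", 2: "sink", 3: "param"},
}


def select_target_signal_set(mechanism_type: str) -> SignalSet:
    """Return the preset signal set that matches the given mechanism type."""
    try:
        return SIGNAL_SETS[mechanism_type]
    except KeyError:
        raise ValueError(f"no signal set preset for {mechanism_type!r}") from None


target = select_target_signal_set("inclusive")
print(target[1], target[2])   # source sink
```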
Step 203, acquiring, according to the target signal set, a signal identifier and a signal handle of an operation signal matched with the cache working mechanism information.
In the embodiment of the application, the signal identifier and the signal handle of the operation signal matched with the cache working mechanism information are obtained according to the target signal set, so that the signal value matched with the cache working mechanism information is given to the signal handle of the operation signal matched with the cache working mechanism information, and a signal corresponding to the operation signal is obtained, and a test program is formed.
It should be noted that, according to the type information of the cache working mechanism information, a target signal set matched with the cache working mechanism information is obtained, and then a signal identifier and a signal handle of an operation signal matched with the cache working mechanism information are obtained.
Step 204, assigning a signal value matched with the cache working mechanism information to the signal handle of the operation signal matched with the cache working mechanism information, obtaining a signal corresponding to the operation signal, and forming the test program; the test program has a plurality of cache test modes.
In the embodiment of the application, a signal corresponding to the operation signal is obtained by giving a signal value matched with the cache working mechanism information to a signal handle of the operation signal matched with the cache working mechanism information, so as to form a test program, and further, a test result is obtained through the test program.
For example, referring to Table 1, the signal value assigned to the signal handle source of an operation signal in the access instruction Acquire is 0.
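The assignment in step 204 can be sketched as binding a value to each signal handle of the target signal set. This is an illustrative sketch only: the function name `build_signals` is hypothetical, handles are modeled as plain strings, and leaving an unassigned handle as `None` (standing for NULL) is an assumption of this sketch.

```python
from typing import Dict, Optional


def build_signals(
    signal_set: Dict[int, str], values: Dict[str, int]
) -> Dict[str, Optional[int]]:
    """Bind a signal value to every handle in the target signal set.

    Handles with no matching value are left as None (NULL).
    """
    return {handle: values.get(handle) for handle in signal_set.values()}


# Target signal set from the text's example: identifier 1 -> source, 2 -> sink.
target_set = {1: "source", 2: "sink"}
signals = build_signals(target_set, {"source": 0})
print(signals)   # {'source': 0, 'sink': None}
```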
Step 205, generating, by the test program, a thread address corresponding to at least a part of threads of the processor one by one for each of the cache test modes.
Wherein the thread address satisfies the cache test mode.
The implementation of this step is similar to the implementation of step 103, and will not be described here again.
Optionally, in some embodiments, the thread address is an address of a storage space of a cache allocated for the corresponding thread; the memory space comprises a plurality of memory blocks; the storage block is provided with a corresponding storage block address; step 205 includes the following sub-steps (sub-step 2051 to sub-step 2054):
Sub-step 2051, when the cache test mode is mutually independent storage spaces, generating, by the test program, first thread addresses in one-to-one correspondence with at least some threads of the processor.
And the storage spaces corresponding to all the first thread addresses do not have the same storage block.
In the embodiment of the application, when the cache test mode is mutually independent storage spaces, the test program generates first thread addresses in one-to-one correspondence with at least some threads of the processor, and then tests the working mechanism of the cache according to the thread addresses corresponding to each cache test mode to obtain a test result.
It should be noted that the access modes for thread addresses supported by the embodiment of the present application are divided into single-thread address access and multi-thread address access. Single-thread address access can quickly verify the multi-request processing and nested processing functions of the cache; multi-thread address access can verify the multi-load processing function of the cache. The access order for thread addresses supported by the embodiment of the application can be divided into sequentially increasing address access and out-of-order address access. Sequentially increasing address access is common in a processor, for example in a program loader or when writing data to storage. Out-of-order address access in a processor is typically caused by multiple pointers allocated out of order in a program, and by scattered pointers or variables generated while some programs run.
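The two access orders can be sketched as follows: the same list of storage block addresses is visited either in increasing order or in a shuffled order. The function name `access_order`, the 64-byte block size, and the fixed seed are assumptions of this sketch; a seeded shuffle is used here only so the out-of-order case is reproducible.

```python
import random
from typing import List


def access_order(
    start: int, num_blocks: int, block_size: int = 64,
    shuffled: bool = False, seed: int = 0,
) -> List[int]:
    """Return block addresses in increasing order, or shuffled if requested."""
    addrs = [start + i * block_size for i in range(num_blocks)]
    if shuffled:
        # Seeded so that an out-of-order run can be replayed exactly.
        random.Random(seed).shuffle(addrs)
    return addrs


sequential = access_order(0x1000, 4)
print([hex(a) for a in sequential])   # ['0x1000', '0x1040', '0x1080', '0x10c0']
unordered = access_order(0x1000, 4, shuffled=True)
print(sorted(unordered) == sequential)   # True
```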
Referring to a1 in fig. 5, a memory space c1 corresponding to a first thread address 1 and a memory space c2 corresponding to a first thread address 2 are independent of each other, i.e., a memory block in the memory space c1 and a memory block in the memory space c2 do not overlap (are not identical).
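A minimal sketch of how mutually independent storage spaces might be generated is to hand each thread its own back-to-back run of storage blocks. The function name, the base address, and the 64-byte block size are assumptions of this sketch, not details from the application.

```python
from typing import List


def independent_spaces(
    num_threads: int, blocks_per_thread: int,
    base: int = 0, block_size: int = 64,
) -> List[List[int]]:
    """Give each thread its own run of block addresses with no block shared."""
    spaces = []
    for thread in range(num_threads):
        start = base + thread * blocks_per_thread * block_size
        spaces.append([start + i * block_size for i in range(blocks_per_thread)])
    return spaces


c1, c2 = independent_spaces(num_threads=2, blocks_per_thread=4)
print(c1)                       # [0, 64, 128, 192]
print(c2)                       # [256, 320, 384, 448]
print(set(c1) & set(c2))        # set()
```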
Sub-step 2052, when the cache test mode is partially overlapping storage spaces, generating, by the test program, second thread addresses in one-to-one correspondence with at least some threads of the processor.
The memory space corresponding to at least two second thread addresses in the second thread addresses is provided with a first preset number of identical first memory blocks, and all the first memory blocks form a set of memory blocks with continuous memory block addresses; the first preset number is smaller than the total number of all storage blocks of the corresponding storage space.
In the embodiment of the application, under the condition that the cache test modes are partially overlapped in storage space, generating second thread addresses corresponding to at least partial threads of the processor one by one through the test program, and further testing the working mechanism of the cache according to the thread addresses corresponding to each cache test mode through the test program to obtain a test result.
Referring to a2 in fig. 5, a memory space c1 corresponding to the second thread address 1 and a memory space c2 corresponding to the second thread address 2 are partially overlapped, and the overlapping area is b1, that is, the memory block in the memory space c1 is partially identical to the memory block in the memory space c2 and the memory block address of the first memory block is continuous.
Sub-step 2053, generating, by the test program, third thread addresses in one-to-one correspondence with at least some threads of the processor when the cache test mode is that the storage spaces completely overlap.
The number of the second storage blocks is equal to the total number of storage blocks of the corresponding storage space.
In the embodiment of the application, when the cache test mode is that the storage spaces completely overlap, the test program generates third thread addresses in one-to-one correspondence with at least some threads of the processor, and then tests the working mechanism of the cache according to the thread addresses corresponding to each cache test mode to obtain a test result.
Referring to a3 in fig. 5, the storage space c1 corresponding to third thread address 1 and the storage space c2 corresponding to third thread address 2 completely overlap, and the overlapping area is b2; that is, the storage blocks in storage space c1 are exactly the same as the storage blocks in storage space c2.
Sub-step 2054, generating, by the test program, fourth thread addresses in one-to-one correspondence with at least some threads of the processor when the cache test mode is that the storage spaces overlap in a scattered manner.
The storage spaces corresponding to at least two of the fourth thread addresses share a second preset number of identical third storage blocks, and at least some of the third storage blocks form a set of storage blocks with non-contiguous storage block addresses; the second preset number is smaller than the total number of storage blocks of the corresponding storage space.
In the embodiment of the application, when the cache test mode is that the storage spaces overlap in a scattered manner, the test program generates fourth thread addresses in one-to-one correspondence with at least some threads of the processor, and then tests the working mechanism of the cache according to the thread addresses corresponding to each cache test mode to obtain a test result.
Referring to a4 in fig. 5, the storage space c1 corresponding to fourth thread address 1 and the storage space c2 corresponding to fourth thread address 2 overlap in a scattered manner, and one of the overlapping areas is b3; that is, some storage blocks in storage space c1 are identical to storage blocks in storage space c2, and the storage block addresses of at least two of these third storage blocks are non-contiguous.
By performing sub-steps 2051 to 2054, the test program can test the working mechanism of the cache according to the thread addresses corresponding to the four cache test modes and obtain a test result.
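The four cache test modes of sub-steps 2051 to 2054 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name block_sets, the use of plain block indices instead of real addresses, the two-thread setup and the "every other block" rule for the scattered case are all assumptions made for illustration.

```python
def block_sets(mode, total=8, shared=3):
    """Return (c1, c2): the storage-block indices used by two threads.

    Hypothetical helper: `total` is the number of blocks per storage space and
    `shared` plays the role of the first/second preset number.
    """
    if not 0 < shared < total:
        raise ValueError("shared must be between 1 and total - 1")
    c1 = list(range(total))
    if mode == "independent":
        # No common block between the two storage spaces.
        c2 = list(range(total, 2 * total))
    elif mode == "partial":
        # The last `shared` blocks of c1 are the first `shared` blocks of c2,
        # so the shared blocks have contiguous addresses.
        start = total - shared
        c2 = list(range(start, start + total))
    elif mode == "full":
        # Every block is shared.
        c2 = list(c1)
    elif mode == "scattered":
        # Share every other block of c1 so the shared blocks are non-contiguous.
        picked = c1[::2][:shared]
        if shared < 2 or len(picked) < shared:
            raise ValueError("scattered mode needs 2 <= shared <= ceil(total / 2)")
        fresh = list(range(total, 2 * total - shared))
        c2 = picked + fresh
    else:
        raise ValueError(f"unknown mode: {mode}")
    return c1, c2


for mode in ("independent", "partial", "full", "scattered"):
    c1, c2 = block_sets(mode)
    print(mode, sorted(set(c1) & set(c2)))
```

In a real test environment each block index would be mapped to a storage block address, for example by multiplying it by the cache block size.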
Optionally, in some embodiments, step 205 includes the following sub-steps (sub-step 2055):
Sub-step 2055, generating, by the test program and for each cache test mode, consecutive thread addresses in one-to-one correspondence with at least some threads of the processor, in the execution order of the threads.
In the embodiment of the application, the test program generates, for each cache test mode and in the execution order of the threads, consecutive thread addresses in one-to-one correspondence with at least some threads of the processor. This is suitable for simulating sequentially increasing address access. The test program then tests the working mechanism of the cache according to the thread addresses corresponding to each cache test mode and obtains a test result.
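As a rough illustration of sub-step 2055, the following Python sketch assigns sequentially increasing base addresses to threads in their execution order. The function name, the 64-byte block size, the base address and the idea of deriving the stride from the number of overlapping blocks are assumptions for illustration only.

```python
def sequential_thread_addresses(num_threads, blocks_per_thread,
                                overlap_blocks=0, block_size=64, base=0x1000):
    """Return one base address per thread, increasing in thread execution order.

    Hypothetical model: consecutive threads are separated by
    (blocks_per_thread - overlap_blocks) blocks, so overlap_blocks=0 gives
    independent storage spaces and a positive value gives partial overlap.
    """
    if not 0 <= overlap_blocks <= blocks_per_thread:
        raise ValueError("overlap_blocks must be in [0, blocks_per_thread]")
    stride = (blocks_per_thread - overlap_blocks) * block_size
    return [base + i * stride for i in range(num_threads)]


# Three threads, four 64-byte blocks each, no overlap.
print([hex(a) for a in sequential_thread_addresses(3, 4)])
```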
Optionally, in some embodiments, step 205 includes the following sub-steps (sub-step 2056 through sub-step 2057):
Sub-step 2056, selecting, by the test program, a target cache test mode from all the cache test modes according to a preset cycle order.
In the embodiment of the application, the test program selects a target cache test mode from all the cache test modes according to a preset cycle order, and a thread address corresponding to each target cache test mode is then generated.
For example, the cache test modes include storage spaces independent of each other, storage spaces partially overlapping, storage spaces completely overlapping, and storage spaces overlapping in a scattered manner, and the preset cycle order arranges them in exactly that order. The first selected target cache test mode is then storage spaces independent of each other, the second is storage spaces partially overlapping, the third is storage spaces completely overlapping, the fourth is storage spaces overlapping in a scattered manner, the fifth is again storage spaces independent of each other, the sixth is again storage spaces partially overlapping, and so on, so that the target cache test mode is selected cyclically.
Sub-step 2057, respectively generating a thread address corresponding to each target cache test mode.
In the embodiment of the application, a thread address is generated for each target cache test mode. While the threads are executed in thread order, the working mechanism of the cache is tested according to the thread address of each target cache test mode selected in the preset cycle order, so that each scenario in which threads use the storage space of the cache can be simulated in turn.
By performing sub-steps 2056 to 2057, the test program can, while the threads are executed in thread order, test the working mechanism of the cache according to the thread address of each target cache test mode selected in the preset cycle order, so that each scenario in which threads use the storage space of the cache is simulated in turn and a test result is obtained.
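The cyclic selection of sub-step 2056 can be sketched in Python with itertools.cycle. The mode names and the helper select_modes are illustrative labels, not identifiers from the test program.

```python
from itertools import cycle, islice

# Preset cycle order of the four cache test modes (names are illustrative).
MODES = ["independent", "partial", "full", "scattered"]


def select_modes(count, modes=MODES):
    """Return the first `count` target cache test modes in the preset cycle order."""
    return list(islice(cycle(modes), count))


# The fifth and sixth selections wrap around to the start of the cycle.
print(select_modes(6))
```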
Step 206, testing, by the test program, the working mechanism of the cache according to the thread address corresponding to each cache test mode and the signals, to obtain the test result.
In the embodiment of the application, the test program tests the working mechanism of the cache according to the thread address corresponding to each cache test mode and the signals to obtain the test result, so that the execution result of each signal can be obtained.
Optionally, in some embodiments, the thread address is an address of a storage space of a cache allocated for the corresponding thread; the memory space comprises a plurality of memory blocks; the storage block is provided with a corresponding storage block address; the test program comprises a plurality of access instruction sequences, wherein the access instruction sequences are used for requesting to access corresponding storage blocks; the access instruction sequence comprises a plurality of access instructions executed according to a preset sequence; the access instruction has a corresponding plurality of signals; in the signals corresponding to the first access instruction in the access instruction sequence, a first signal with a signal handle as a storage block address exists, and the first access instruction is an access instruction executed first in the access instruction sequence.
In this embodiment of the present application, the test program includes a plurality of access instruction sequences, each of which requests access to a corresponding storage block. An access instruction sequence includes a plurality of access instructions executed in a preset order, and each access instruction has a plurality of corresponding signals. The test program can therefore test the working mechanism of the cache according to the thread address corresponding to each cache test mode and the signals, and obtain a test result that includes the execution result of each signal.
For example, the access instruction Acquire, the access instruction Grant and the access instruction GrantAck form an access instruction sequence. The first access instruction in the sequence is Acquire: Acquire is executed first, then Grant, and finally GrantAck. Acquire has a signal whose signal handle is the storage block address and a signal whose signal handle is source; Grant has a signal whose signal handle is source and a signal whose signal handle is sink; GrantAck has a signal whose signal handle is sink.
Optionally, in some embodiments, the method further comprises the following steps (step 207 to step 209):
Step 207, establishing a first correspondence between the first signal value of the first signal and the second signal value of the corresponding second signal, and combining all the first correspondences corresponding to the first signal into a signal value set.
The second signal is a signal of the corresponding first access instruction other than the first signal.
In this embodiment of the present application, a first correspondence is established between the first signal value of the first signal and the second signal value of the corresponding second signal, and all the first correspondences corresponding to the first signal are combined into a signal value set; a target signal value set is then determined from all signal value sets according to a third signal value of a third signal of a second access instruction executed after the first access instruction.
Step 208, determining a target signal value set from all signal value sets according to the third signal value of the third signal of the second access instruction executed after the first access instruction.
The signal handle of the third signal is the same as the signal handle of one second signal in the target signal value set.
In this embodiment of the present application, a target signal value set is determined from all signal value sets according to a third signal value of a third signal of a second access instruction executed after the first access instruction, and then a second corresponding relationship is established between a fourth signal value of a fourth signal of the second access instruction, except for the third signal value, and a first signal value in the corresponding target signal value set, and the second corresponding relationship is stored in the corresponding target signal value set, and a third corresponding relationship between the second access instruction and the corresponding first signal value is established.
Optionally, in some embodiments, step 208 includes the following sub-steps (sub-step 2081):
substep 2081, using, as the target signal value set, a signal value set of signals having the same signal value as the third signal value, among all signal value sets.
In this embodiment of the present application, a signal value set of signals having the same signal value as the third signal value is used as the target signal value set, so that a second corresponding relationship is established between a fourth signal value of a fourth signal of the second access instruction, except for the third signal value, and a first signal value in the corresponding target signal value set, and the second corresponding relationship is stored in the corresponding target signal value set, and a third corresponding relationship between the second access instruction and the corresponding first signal value is established.
Step 209, establishing a second correspondence between the fourth signal value of the fourth signal of the second access instruction, except the third signal value, and the first signal value in the corresponding target signal value set, storing the second correspondence in the corresponding target signal value set, and establishing a third correspondence between the second access instruction and the corresponding first signal value.
In the embodiment of the application, the second corresponding relation is established between the fourth signal value of the fourth signal except the third signal value of the second access instruction and the first signal value in the corresponding target signal value set, and the second corresponding relation is stored in the corresponding target signal value set, and in the target signal value set, the third corresponding relation between the second access instruction and the corresponding first signal value is established, and then the test result is obtained according to the third corresponding relation.
Obtaining the test result comprises the following step:
Step 210, obtaining the test result according to the third correspondence.
In the embodiment of the application, the test result is obtained according to the third corresponding relation, so that the access instruction sequence and the test result of each access instruction in the access instruction sequence can be directly obtained.
By performing steps 207 to 210, the access instruction sequence and the test result of each access instruction in it can be obtained directly. In the prior art, by contrast, log information of the signals must be analyzed manually to recover the access instruction sequence before the test result of each access instruction can be obtained; the embodiment therefore improves the efficiency of obtaining the test result.
For example, in steps 207 to 210, the access instruction Acquire, the access instruction Grant and the access instruction GrantAck form an access instruction sequence. The first access instruction in the sequence is Acquire: Acquire is executed first, then Grant, and finally GrantAck. Acquire has a signal whose signal handle is the storage block address and a signal whose signal handle is source; Grant has a signal whose signal handle is source and a signal whose signal handle is sink; GrantAck has a signal whose signal handle is sink.
The first signal is the signal of the access instruction Acquire whose signal handle is the storage block address, and its first signal value is 0x1c900; the second signal is the signal of the access instruction Acquire whose signal handle is source, and its second signal value is 0. First correspondence 1 between the first signal value 0x1c900 and the second signal value 0 is created, and all first correspondences (here only first correspondence 1) form signal value set 1.
Third signal 1, corresponding to the access instruction Grant as second access instruction 1, is the signal whose signal handle is source, and third signal value 1 of third signal 1 is 0.
Assuming that the existing signal value sets include signal value set 1 (which contains the second signal value 0), signal value set 2 (which does not contain the second signal value 0) and signal value set 3 (which does not contain the second signal value 0), signal value set 1 is taken as target signal value set 1.
The fourth signal of the access instruction Grant is the signal whose signal handle is sink. If the fourth signal value of the fourth signal is 0, second correspondence 1 between the fourth signal value 0 and the first signal value 0x1c900 is created and stored in target signal value set 1, and third correspondence 1 between the access instruction Grant and the first signal value 0x1c900 is created.
Similarly, in some embodiments, third signal 2, corresponding to the access instruction GrantAck as second access instruction 2, is the signal whose signal handle is sink, and third signal value 2 of third signal 2 is 0.
The existing signal value sets include signal value set 1 (which contains the fourth signal value 0), signal value set 2 (which does not contain the fourth signal value 0) and signal value set 3 (which does not contain the fourth signal value 0). Since signal value set 1 contains a sink signal value equal to that of third signal 2, signal value set 1 is taken as target signal value set 2.
The access instruction GrantAck has no corresponding fourth signal, so only third correspondence 2 between the access instruction GrantAck and the first signal value 0x1c900 needs to be established.
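The bookkeeping in steps 207 to 210 can be sketched in Python as follows. This is only an illustration under assumptions: the class SignalValueTracker, its method names and the dictionary layout are hypothetical, and the sketch never retires completed signal value sets, which a real test environment would presumably need to do.

```python
class SignalValueTracker:
    """Groups access instructions into sequences keyed by storage block address."""

    def __init__(self):
        # One entry per signal value set: the first signal value (block address),
        # the known signal values by handle, and the instructions seen so far.
        self.sets = []

    def on_first(self, name, address, **signals):
        """Step 207: record the first access instruction and its other signals."""
        self.sets.append({"address": address,
                          "signals": dict(signals),
                          "instructions": [name]})

    def on_later(self, name, match_handle, match_value, **new_signals):
        """Steps 208 and 209: find the target set, store new signals, link the instruction."""
        for entry in self.sets:
            if entry["signals"].get(match_handle) == match_value:
                entry["signals"].update(new_signals)   # second correspondence
                entry["instructions"].append(name)     # third correspondence
                return entry["address"]
        raise LookupError(f"no signal value set with {match_handle}={match_value}")


tracker = SignalValueTracker()
tracker.on_first("Acquire", 0x1C900, source=0)
tracker.on_later("Grant", "source", 0, sink=0)
tracker.on_later("GrantAck", "sink", 0)
# Step 210: the sequence for block 0x1c900 can be read back directly.
print(hex(tracker.sets[0]["address"]), tracker.sets[0]["instructions"])
```

Matching on a shared handle value (source for Grant, sink for GrantAck) is what lets each later instruction be traced back to the storage block address without reading signal logs by hand.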
Optionally, in some embodiments, before step 205, the method further comprises the following step (step 211):
Step 211, configuring a preset initial test program according to the cache working mechanism information through a preset configuration interface of the initial test program, to obtain the test program.
In the embodiment of the application, the preset initial test program is configured according to the cache working mechanism information through the preset configuration interface of the initial test program, so that the test program matched with the cache working mechanism information can be obtained.
Optionally, in some embodiments, the initial test program is a program written in the SystemVerilog language; before step 211, the method further comprises the following steps (step 212 to step 213):
Step 212, creating a Verilog process interface in the initial test program.
In the embodiment of the application, the Verilog process interface is created in the initial test program, so that an application programming interface of CPython for calling the Verilog process interface is created, and the application programming interface is used as a configuration interface.
It should be noted that SystemVerilog, abbreviated as SV, is a language built on the Verilog language. It is an extension of the IEEE 1364 Verilog-2001 standard and is compatible with Verilog 2001; it combines a hardware description language (HDL) with a modern high-level verification language (HVL) and has become a language for next-generation hardware design and verification.
Verilog is a hardware description language that describes the structure and behavior of digital system hardware in text form, and can be used to represent logic circuit diagrams, logic expressions, and logic functions performed by digital logic systems.
The Verilog process interface (Verilog Procedural Interface, VPI) is a C-language interface to Verilog. It allows behavioral-level description code of a digital circuit to call C functions directly, and those C functions can in turn call standard Verilog system tasks. The Verilog process interface is part of the IEEE 1364 Programming Language Interface standard.
CPython is a Python interpreter implemented in the C language. An application programming interface (API) is code that a computer operating system or library provides for applications to call; here it refers to the API provided by CPython.
Step 213, creating an application programming interface of CPython for calling the Verilog process interface, and taking the application programming interface as the configuration interface.
In the embodiment of the application, the application programming interface of CPython for calling the Verilog process interface is created, and is used as a configuration interface, so that the preset initial test program is configured according to the cached working mechanism information through the preset configuration interface of the initial test program, and the test program is obtained.
By performing steps 212 to 213, the application programming interface serves as the configuration interface, through which the preset initial test program is configured according to the cache working mechanism information to obtain the test program. In the prior art, the SystemVerilog code of the initial test program must be modified and then recompiled and rebuilt into the test program, so obtaining the test program takes a long time.
Through the application programming interface of CPython, CPython instructions can directly assign the same signal value to the signal handles of all signals of the same type in one operation, whereas in the prior art a signal value must be assigned to each such signal handle separately in the SystemVerilog language.
Optionally, in some embodiments, the method further comprises the following steps (step 214 to step 215):
Step 214, generating an overlap ratio according to a preset random rule.
In the embodiment of the application, the overlap ratio is generated according to the preset random rule, and the first preset number is then calculated from the overlap ratio and the total number of storage blocks of the corresponding storage space.
It should be noted that the preset random rule may be implemented by a random generator, and the overlap ratio is the ratio of the first preset number to the total number of storage blocks of the corresponding storage space.
Step 215, calculating the first preset number according to the overlap ratio and the total number of storage blocks of the corresponding storage space.
In the embodiment of the application, the first preset number is calculated according to the overlap ratio and the total number of storage blocks of the corresponding storage space, and the second thread address can then be generated according to the first preset number.
Specifically, in some embodiments, the product of the overlap ratio and the total number of storage blocks of the corresponding storage space is taken as the first preset number.
By executing steps 214 to 215, the first preset number is obtained, and the second thread address can then be generated according to the first preset number.
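Steps 214 and 215 amount to a short calculation, sketched below in Python. The function name is hypothetical, and two details are assumptions rather than statements from the description: the ratio is drawn uniformly from [0, 1), and the product is rounded down to an integer block count.

```python
import random


def first_preset_number(total_blocks, rng):
    """Return (ratio, count): a random overlap ratio and the resulting block count.

    Assumption: flooring the product keeps count < total_blocks, which matches
    the requirement that the first preset number be smaller than the total.
    """
    ratio = rng.random()               # step 214: ratio in [0.0, 1.0)
    count = int(ratio * total_blocks)  # step 215: ratio * total, rounded down
    return ratio, count


rng = random.Random(2023)  # seeded only to make the sketch reproducible
ratio, count = first_preset_number(16, rng)
print(f"overlap ratio {ratio:.3f} -> {count} shared blocks out of 16")
```

Note that this sketch can yield a count of 0; a generator for the partially overlapping mode would likely redraw or clamp the value so that at least one block is shared.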
Optionally, referring to fig. 3, in some embodiments, the second-level cache of the processor uses cache working mechanism 1 and cache working mechanism 2, and the third-level cache of the processor uses cache working mechanism 3. The test program matched to cache working mechanism 1, cache working mechanism 2 and cache working mechanism 3 includes a cache test mode selection module, a thread address generation module, a simulation cache module 1, a simulation firmware module 1, a simulation cache module 2, a simulation firmware module 2, a cache monitor 1, a cache monitor 2, a scoreboard module, a consistency checker, and a database. The cache test mode selection module is configured to select a target cache test mode from a plurality of cache test modes; the implementation process is described above and is not repeated here. The thread address generation module is configured to generate a corresponding thread address according to the target cache test mode; the implementation process is described above and is not repeated here. The simulation cache module 1 is configured to simulate a cache access process according to cache working mechanism 1 and cache working mechanism 3, and the simulation firmware module 1 is configured to simulate a PTW working process according to cache working mechanism 1 and cache working mechanism 3. The simulation cache module 2 is configured to simulate a cache access process according to cache working mechanism 2 and cache working mechanism 3, and the simulation firmware module 2 is configured to simulate the PTW working process according to cache working mechanism 2 and cache working mechanism 3. The PTW is firmware used by the processor to store page table information. The cache monitor 1 is configured to monitor the processes of cache working mechanism 1 and cache working mechanism 2 and send test results to the scoreboard module, the consistency checker and the database, and the cache monitor 2 is configured to monitor the process of cache working mechanism 3 and send test results to the scoreboard module, the consistency checker and the database. The scoreboard module is configured to check the correctness of the data stored in the storage blocks. The consistency checker is configured to check whether the memory access process of the storage blocks matches and is consistent with cache working mechanism 1, cache working mechanism 2 and cache working mechanism 3.
Optionally, referring to fig. 4, in some embodiments, for the cache working mechanism used by the second-level cache and the third-level cache of the processor, the matched test program includes a cache test mode selection module, a thread address generation module, a simulation cache module, a simulation firmware module, a simulation engine module, a cache monitor, a scoreboard module, a consistency checker, and a database. The cache test mode selection module is configured to select a target cache test mode from a plurality of cache test modes; the implementation process is described above and is not repeated here. The thread address generation module is configured to generate a corresponding thread address according to the target cache test mode; the implementation process is described above and is not repeated here. The simulation cache module is configured to simulate a cache access process according to the cache working mechanism, the simulation firmware module is configured to simulate a PTW working process according to the cache working mechanism, and the simulation engine module is configured to simulate a DMA working process according to the cache working mechanism. The cache monitor is configured to monitor the process of the cache working mechanism and send the test result to the scoreboard module, the consistency checker and the database. The scoreboard module is configured to check the correctness of the data stored in the storage blocks. The consistency checker is configured to check whether the memory access process of the storage blocks matches the cache working mechanism.
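The role of the scoreboard module, checking the correctness of the data stored in the storage blocks, can be illustrated with a minimal Python reference model. The class and method names are hypothetical, and treating a read of a never-written block as "nothing to compare" is an assumption of this sketch, not something the description specifies.

```python
class Scoreboard:
    """Keeps the expected contents of each storage block and checks observed reads."""

    def __init__(self):
        self.expected = {}    # storage block address -> last written data
        self.mismatches = []  # (address, expected, observed) tuples

    def record_write(self, address, data):
        """Update the reference model when the monitor reports a write."""
        self.expected[address] = data

    def check_read(self, address, data):
        """Compare observed read data with the reference model; return True if consistent."""
        if address not in self.expected:
            return True  # assumption: nothing to compare against yet
        if self.expected[address] != data:
            self.mismatches.append((address, self.expected[address], data))
            return False
        return True


sb = Scoreboard()
sb.record_write(0x1C900, 0xDEADBEEF)
print(sb.check_read(0x1C900, 0xDEADBEEF), sb.check_read(0x1C900, 0x0))
```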
In summary, in this embodiment of the present application, an initial test program is configured according to the cache working mechanism information of a processor to obtain a test program, and the test program tests the working mechanism of the cache when the processor runs. For each cache test mode, thread addresses in one-to-one correspondence with at least some threads of the processor are generated, so that the test adapts to the multiple scenarios in which threads use the storage space of the cache and the simulation time for those scenarios is reduced. This solves the problem in the prior art that, because the addresses allocated to threads when the processor runs are randomly generated, the scenarios in which threads use the storage space of the cache arise by chance through random factors, which makes the simulation time long.
In addition, through the embodiment of the application, the triggering probability of the specific event formed by the working mechanism of the cache when the thread uses the scene of the storage space of the cache is improved, and the simulation time of the specific event formed by the working mechanism of the cache when the thread uses the scene of the storage space of the cache is reduced as a whole.
Referring to fig. 6, a test apparatus for a cache operating mechanism of a processor according to an embodiment of the present application is shown, where the apparatus includes:
A first obtaining module 301, configured to obtain cache working mechanism information of a processor; the cache working mechanism information is used for representing a working mechanism of the cache when the processor runs;
a second obtaining module 302, configured to configure a preset initial test program according to the cached working mechanism information, to obtain a test program for testing a cached working mechanism of the processor; the test program is provided with a plurality of cache test modes;
a generating module 303, configured to generate, for each of the cache test modes, a thread address corresponding to at least a part of threads of the processor one by one, where the thread address satisfies the cache test mode;
and the test module 304 is configured to test, by using the test program, the working mechanism of the cache according to the thread address corresponding to each cache test mode, and obtain a test result.
Optionally, the second obtaining module 302 specifically includes:
the first acquisition sub-module is used for acquiring a target signal set matched with the cache working mechanism information from a plurality of signal sets preset in the initial test program; the signal set is a set of corresponding relations between signal identifications and signal handles of a plurality of operation signals; the operation signal is a signal generated when the corresponding cache is operated according to the corresponding cache working mechanism information;
The second acquisition sub-module is used for acquiring a signal identifier and a signal handle of an operation signal matched with the cache working mechanism information according to the target signal set;
and the third acquisition sub-module is used for giving a signal value matched with the cache working mechanism information to a signal handle of the operation signal matched with the cache working mechanism information, obtaining a signal corresponding to the operation signal and forming the test program.
Optionally, the test module 304 specifically includes:
and the test sub-module is used for testing the working mechanism of the cache according to the thread address corresponding to each cache test mode and the signals through the test program to obtain the test result.
Optionally, the thread address is an address of a storage space of a cache allocated for the corresponding thread; the memory space comprises a plurality of memory blocks; the storage block is provided with a corresponding storage block address; the test program comprises a plurality of access instruction sequences, wherein the access instruction sequences are used for requesting to access corresponding storage blocks; the access instruction sequence comprises a plurality of access instructions executed according to a preset sequence; the access instruction has a corresponding plurality of signals; in the signals corresponding to the first access instruction in the access instruction sequence, a first signal with a signal handle as a storage block address exists, and the first access instruction is an access instruction executed first in the access instruction sequence.
Optionally, the apparatus 300 further includes:
the signal value set module is used for establishing a first corresponding relation between the first signal value of the first signal and the second signal value of each corresponding second signal, and forming a signal value set from all the first corresponding relations of that first signal; the second signal is a signal of the corresponding first access instruction other than the first signal;
a determining module, configured to determine a target signal value set from all signal value sets according to a third signal value of a third signal of a second access instruction executed after the first access instruction;
the storage module is used for establishing a second corresponding relation between a fourth signal value of a fourth signal of the second access instruction, other than the third signal, and the first signal value in the corresponding target signal value set, storing the second corresponding relation into the corresponding target signal value set, and establishing a third corresponding relation between the second access instruction and the corresponding first signal value;
and the third acquisition module is used for acquiring the test result according to the third corresponding relation.
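One way to picture the signal value sets is as a small tracker that links later access instructions back to the storage block address carried only by the first instruction of a sequence. The sketch below is an assumption-laden illustration, not the application's implementation: the class `TransactionTracker`, its method names, and the signal names `source` and `sink` are hypothetical.

```python
class TransactionTracker:
    """Links later access instructions to the block address of the first one."""

    def __init__(self):
        # first signal value (block address) -> set of (signal name, value) pairs
        self.value_sets = {}
        # later instruction label -> block address (the third corresponding relation)
        self.instruction_to_block = {}

    def first_access(self, block_addr, other_signals):
        """Record the remaining signals of a first access instruction."""
        value_set = self.value_sets.setdefault(block_addr, set())
        value_set.update(other_signals.items())

    def later_access(self, label, signals):
        """Find the set sharing a signal value, then extend it with the rest."""
        for block_addr, value_set in self.value_sets.items():
            if any(item in value_set for item in signals.items()):
                value_set.update(signals.items())  # store the new signal values
                self.instruction_to_block[label] = block_addr
                return block_addr
        return None  # no earlier instruction carries a matching signal value
```

For example, after `first_access(0xC0, {"source": 5})`, a later instruction carrying `{"source": 5, "sink": 9}` resolves to block `0xC0`, and a still later one carrying only `{"sink": 9}` resolves to the same block because the set was extended in the previous step.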
Optionally, the determining module specifically includes:
and the determining submodule is used for taking, from all the signal value sets, the signal value set that contains a signal whose signal value equals the third signal value as the target signal value set.
Optionally, the second obtaining module 302 specifically includes:
and the fourth acquisition sub-module is used for configuring the preset initial test program according to the cache working mechanism information through a preset configuration interface of the initial test program to obtain the test program.
Optionally, the initial test program is a program written in the SystemVerilog language; the apparatus 300 further comprises:
a first creation module for creating a Verilog process interface in the initial test program;
and the second creation module is used for creating an application programming interface of CPython for calling the Verilog process interface and taking the application programming interface as the configuration interface.
Optionally, the thread address is an address of a storage space of a cache allocated for the corresponding thread; the storage space comprises a plurality of storage blocks; each storage block has a corresponding storage block address; the generating module 303 specifically includes:
the first generation sub-module is used for generating, through the test program, first thread addresses in one-to-one correspondence with at least some threads of the processor when the cache test mode is that the storage spaces are mutually independent; the storage spaces corresponding to the first thread addresses share no storage block;
the second generation sub-module is used for generating, through the test program, second thread addresses in one-to-one correspondence with at least some threads of the processor when the cache test mode is that the storage spaces are partially overlapped;
the storage spaces corresponding to at least two of the second thread addresses share a first preset number of identical first storage blocks, and all the first storage blocks form a set of storage blocks with consecutive storage block addresses; the first preset number is smaller than the total number of storage blocks of the corresponding storage space.
Optionally, the generating module 303 specifically further includes:
the third generation submodule is used for generating third thread addresses corresponding to at least partial threads of the processor one by one through the test program under the condition that the cache test mode is that the storage spaces are completely overlapped;
the storage spaces corresponding to at least two third thread addresses in the third thread addresses have the same second storage blocks, and the number of the second storage blocks is equal to the total number of all storage blocks of the corresponding storage spaces;
the fourth generation sub-module is used for generating, through the test program, fourth thread addresses in one-to-one correspondence with at least some threads of the processor when the cache test mode is that the storage spaces are scattered and overlapped;
the storage spaces corresponding to at least two fourth thread addresses in the fourth thread addresses are provided with a second preset number of identical third storage blocks, and at least part of the third storage blocks form a set of storage blocks with discontinuous storage block addresses; the second preset number is smaller than the total number of all storage blocks of the corresponding storage space.
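The four cache test modes can be illustrated with a toy model in which a storage space is simply a list of integer storage block addresses. The following is a minimal sketch under that assumption; the function `generate_thread_blocks`, the mode strings, and the `shared` parameter are hypothetical names, and the even-address layout used for the scattered case is just one simple way to obtain non-consecutive shared blocks.

```python
def generate_thread_blocks(mode, num_threads, blocks_per_thread, shared=2):
    """Return, per thread, the list of storage block addresses it may access."""
    n = blocks_per_thread
    if mode == "independent":
        # Disjoint windows: no block is shared between threads.
        return [list(range(t * n, (t + 1) * n)) for t in range(num_threads)]
    if mode == "full_overlap":
        # Every thread gets exactly the same blocks.
        return [list(range(n)) for _ in range(num_threads)]
    if not 0 < shared < n:
        raise ValueError("shared must be positive and smaller than blocks_per_thread")
    if mode == "partial_overlap":
        # Sliding windows: adjacent threads share `shared` consecutive blocks.
        stride = n - shared
        return [list(range(t * stride, t * stride + n)) for t in range(num_threads)]
    if mode == "scattered_overlap":
        if shared < 2:
            raise ValueError("need at least two shared blocks to be non-consecutive")
        # Shared blocks sit at even addresses, so they are not consecutive.
        common = [2 * i for i in range(shared)]
        private = n - shared
        first_private = 2 * shared  # private regions start above the shared blocks
        return [
            common + list(range(first_private + t * private,
                                first_private + (t + 1) * private))
            for t in range(num_threads)
        ]
    raise ValueError(f"unknown mode: {mode}")
```

With two threads of four blocks each and `shared=2`, the partial-overlap mode yields blocks 0 to 3 and 2 to 5 (sharing the consecutive blocks 2 and 3), while the scattered mode yields `[0, 2, 4, 5]` and `[0, 2, 6, 7]` (sharing the non-consecutive blocks 0 and 2).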
Optionally, the apparatus 300 further includes:
the proportion generation module is used for generating an overlap ratio according to a preset random rule;
and the calculation module is used for calculating the first preset number according to the overlap ratio and the total number of storage blocks of the corresponding storage space.
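The calculation can be sketched as follows. The application does not specify the preset random rule, so a uniform draw is assumed here; the function name `first_preset_number` and the clamping to at least one shared block are likewise assumptions made only so that the result stays a valid partial overlap.

```python
import random


def first_preset_number(total_blocks, rng):
    """Draw an overlap ratio and convert it to a count of shared blocks."""
    if total_blocks < 2:
        raise ValueError("a partial overlap needs at least two blocks")
    ratio = rng.random()  # assumed rule: uniform in [0, 1)
    count = int(ratio * total_blocks)
    # Keep the count strictly below the total, and (by assumption) at least 1.
    count = max(1, min(total_blocks - 1, count))
    return ratio, count
```

Passing in a seeded `random.Random` instance keeps a test run reproducible, which is usually desirable when a failing scenario has to be replayed.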
Optionally, the generating module 303 specifically further includes:
and the fifth generation sub-module is used for generating, through the test program and for each cache test mode, consecutive thread addresses in one-to-one correspondence with at least some threads of the processor, according to the execution order of the threads.
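Read literally, this assigns addresses to threads in the order in which they execute. A minimal sketch of that reading is shown below; the function name, the thread labels, and the fixed `space_size` per thread are hypothetical, and how the step interacts with the overlap modes is not modelled.

```python
def consecutive_thread_addresses(execution_order, base_address, space_size):
    """Assign back-to-back base addresses following the thread execution order."""
    return {
        thread_id: base_address + position * space_size
        for position, thread_id in enumerate(execution_order)
    }
```

For instance, with execution order `["t2", "t0", "t1"]`, base address `0x1000`, and a space size of `0x100`, thread `t2` receives `0x1000`, `t0` receives `0x1100`, and `t1` receives `0x1200`.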
Optionally, the generating module 303 specifically further includes:
the selecting submodule is used for selecting a target cache test mode from all the cache test modes according to a preset cycle sequence through the test program;
and the sixth generation submodule is used for respectively generating the thread address corresponding to each target cache test mode.
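Selecting modes in a preset cycle order can be sketched with the standard library. The mode names and the particular order below are assumptions; the application only states that the order is preset.

```python
from itertools import cycle, islice

# Hypothetical preset cycle order of the four cache test modes.
MODES = ("independent", "partial_overlap", "full_overlap", "scattered_overlap")


def select_modes(rounds, order=MODES):
    """Return the target cache test mode for each of `rounds` test rounds."""
    return list(islice(cycle(order), rounds))
```

Cycling in a fixed order, rather than picking modes at random, guarantees that every mode is exercised once every four rounds.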
In summary, in the embodiments of the present application, an initial test program is configured according to the cache working mechanism information of a processor to obtain a test program, and the working mechanism of the cache during processor operation is tested through that test program. For each cache test mode, thread addresses are generated in one-to-one correspondence with at least some threads of the processor, so that the test covers the various scenarios in which threads use the cache storage space and the time needed to simulate those scenarios is reduced. This solves the problem in the prior art that, because the addresses allocated to threads during processor operation are randomly generated and threads use the cache storage space in many different scenarios, random factors make simulating those scenarios time-consuming.
In addition, the embodiments of the present application increase the probability of triggering the specific events produced by the cache working mechanism in scenarios where threads use the cache storage space, and reduce the overall time needed to simulate those events.
Fig. 7 is a block diagram of an electronic device 600, according to an example embodiment. For example, the electronic device 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 7, an electronic device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls overall operation of the electronic device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 602 can include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is used to store various types of data to support operations at the electronic device 600. Examples of such data include instructions for any application or method operating on the electronic device 600, contact data, phonebook data, messages, pictures, multimedia, and so forth. The memory 604 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 606 provides power to the various components of the electronic device 600. The power supply components 606 can include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 600.
The multimedia component 608 includes a screen that provides an output interface between the electronic device 600 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the electronic device 600 is in an operational mode, such as a shooting mode or a multimedia mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 610 is for outputting and/or inputting audio signals. For example, the audio component 610 includes a Microphone (MIC) for receiving external audio signals when the electronic device 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 604 or transmitted via the communication component 616. In some embodiments, audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 614 includes one or more sensors for providing status assessments of various aspects of the electronic device 600. For example, the sensor assembly 614 may detect an on/off state of the electronic device 600 and the relative positioning of components, such as the display and keypad of the electronic device 600. The sensor assembly 614 may also detect a change in position of the electronic device 600 or of a component of the electronic device 600, the presence or absence of user contact with the electronic device 600, the orientation or acceleration/deceleration of the electronic device 600, and a change in temperature of the electronic device 600. The sensor assembly 614 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact. The sensor assembly 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 614 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the electronic device 600 and other devices. The electronic device 600 may access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof. In one exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 616 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 600 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for implementing the method for testing a cache operating mechanism of a processor provided by the embodiments of the present application.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 604 including instructions executable by the processor 620 of the electronic device 600 to perform the above-described method. For example, the non-transitory storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
Fig. 8 is a block diagram of an electronic device 700, according to an example embodiment. For example, the electronic device 700 may be provided as a server. Referring to fig. 8, electronic device 700 includes a processing component 722 that further includes one or more processors and memory resources represented by memory 732 for storing instructions, such as application programs, executable by processing component 722. The application programs stored in memory 732 may include one or more modules that each correspond to a set of instructions. In addition, the processing component 722 is configured to execute instructions to perform a method for testing a cache operating mechanism of a processor according to an embodiment of the present application.
The electronic device 700 may also include a power supply component 726 configured to perform power management of the electronic device 700, a wired or wireless network interface 750 configured to connect the electronic device 700 to a network, and an input/output (I/O) interface 758. The electronic device 700 may operate based on an operating system stored in memory 732, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The embodiment of the application also provides a computer program product, which comprises a computer program, wherein the computer program is executed by a processor to realize the test method of the cache working mechanism of the processor.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The foregoing describes only preferred embodiments of the present application and is not intended to limit the application to those particular embodiments.
The method, apparatus, electronic device, and computer-readable storage medium for testing a cache operating mechanism of a processor provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the description of these examples is intended only to aid understanding of the method and its core idea. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (16)

1. A method for testing a cache operating mechanism of a processor, the method comprising:
acquiring cache working mechanism information of a processor; the cache working mechanism information is used for representing a working mechanism of the cache when the processor runs;
configuring a preset initial test program according to the cache working mechanism information to obtain a test program for testing the working mechanism of the cache of the processor; the test program is provided with a plurality of cache test modes;
generating thread addresses corresponding to at least partial threads of the processor one by one according to each cache test mode through the test program, wherein the thread addresses meet the cache test modes; the thread address is an address of a storage space of a cache allocated for a corresponding thread, the cache test mode is a scene mode that the thread uses the storage space of the cache, and the cache test mode comprises that the storage spaces are mutually independent, the storage spaces are partially overlapped, the storage spaces are completely overlapped and the storage spaces are scattered and overlapped;
And testing the working mechanism of the cache according to the thread address corresponding to each cache test mode by the test program to obtain a test result.
2. The method according to claim 1, wherein configuring a preset initial test program according to the cache working mechanism information to obtain a test program for testing the working mechanism of the cache of the processor includes:
acquiring a target signal set matched with the cache working mechanism information from a plurality of signal sets preset in the initial test program; the signal set is a set of corresponding relations between signal identifications and signal handles of a plurality of operation signals; the operation signal is a signal generated when the corresponding cache is operated according to the corresponding cache working mechanism information;
acquiring a signal identifier and a signal handle of an operation signal matched with the cache working mechanism information according to the target signal set;
and giving a signal value matched with the cache working mechanism information to a signal handle of the operation signal matched with the cache working mechanism information, obtaining a signal corresponding to the operation signal, and forming the test program.
3. The method according to claim 2, wherein said testing, by the test program, the cache operating mechanism according to the thread address corresponding to each cache test mode, to obtain a test result includes:
and testing the working mechanism of the cache according to the thread address corresponding to each cache test mode and the signals by the test program to obtain the test result.
4. The method of claim 2, wherein the memory space comprises a plurality of memory blocks; the storage block is provided with a corresponding storage block address; the test program comprises a plurality of access instruction sequences, wherein the access instruction sequences are used for requesting to access corresponding storage blocks; the access instruction sequence comprises a plurality of access instructions executed according to a preset sequence; the access instruction has a corresponding plurality of signals; in the signals corresponding to the first access instruction in the access instruction sequence, a first signal with a signal handle as a storage block address exists, and the first access instruction is an access instruction executed first in the access instruction sequence.
5. The method according to claim 4, wherein the method further comprises:
Establishing a corresponding first corresponding relation between a first signal value of the first signal and a second signal value of a corresponding second signal, and forming a signal value set by all the first corresponding relations corresponding to the first signal; the second signal is a signal of the corresponding first access instruction except the first signal;
determining a target signal value set from all signal value sets according to a third signal value of a third signal of a second access instruction executed after the first access instruction;
establishing a second corresponding relation between a fourth signal value of a fourth signal of the second access instruction except the third signal value and a first signal value in a corresponding target signal value set, storing the second corresponding relation into the corresponding target signal value set, and establishing a third corresponding relation between the second access instruction and the corresponding first signal value;
the obtaining the test result comprises the following steps:
and obtaining the test result according to the third corresponding relation.
6. The method of claim 5, wherein determining the set of target signal values from all signal value sets based on the third signal value of the third signal of the second access instruction executed after the first access instruction comprises:
taking, from all the signal value sets, the signal value set that contains a signal whose signal value equals the third signal value as the target signal value set.
7. The method according to claim 1, wherein configuring a preset initial test program according to the cache working mechanism information to obtain a test program for testing the working mechanism of the cache of the processor includes:
and configuring the preset initial test program according to the cache working mechanism information through a preset configuration interface of the initial test program to obtain the test program.
8. The method of claim 7, wherein the initial test program is a program written in the SystemVerilog language;
before the preset initial test program is configured according to the cache working mechanism information through the preset configuration interface of the initial test program to obtain the test program, the method further comprises:
creating a Verilog process interface in the initial test program;
and creating an application programming interface of CPython for calling the Verilog process interface, and taking the application programming interface as the configuration interface.
9. The method of claim 1, wherein the memory space comprises a plurality of memory blocks; the storage block is provided with a corresponding storage block address;
the generating, by the test program, a thread address corresponding to at least a part of threads of the processor one by one for each preset cache test mode includes:
generating first thread addresses corresponding to at least part of threads of the processor one by one through the test program under the condition that the cache test mode is that the storage spaces are mutually independent; all the storage spaces corresponding to the first thread addresses do not have the same storage block;
generating second thread addresses corresponding to at least partial threads of the processor one by one through the test program under the condition that the cache test mode is that the storage space is partially overlapped;
the memory space corresponding to at least two second thread addresses in the second thread addresses is provided with a first preset number of identical first memory blocks, and all the first memory blocks form a set of memory blocks with continuous memory block addresses; the first preset number is smaller than the total number of all storage blocks of the corresponding storage space.
10. The method of claim 9, wherein generating, by the test program, a thread address that corresponds one-to-one to at least a portion of threads of the processor for each preset cache test pattern, further comprises:
generating a third thread address corresponding to at least part of threads of the processor one by one through the test program under the condition that the cache test mode is that the storage spaces are completely overlapped;
the storage spaces corresponding to at least two third thread addresses in the third thread addresses have the same second storage blocks, and the number of the second storage blocks is equal to the total number of all storage blocks of the corresponding storage spaces;
generating fourth thread addresses corresponding to at least part of threads of the processor one by one through the test program under the condition that the cache test mode is that the storage spaces are overlapped in a scattered mode;
the storage spaces corresponding to at least two fourth thread addresses in the fourth thread addresses are provided with a second preset number of identical third storage blocks, and at least part of the third storage blocks form a set of storage blocks with discontinuous storage block addresses; the second preset number is smaller than the total number of all storage blocks of the corresponding storage space.
11. The method according to claim 9, wherein the method further comprises:
generating an overlap ratio according to a preset random rule;
and calculating the first preset number according to the overlap ratio and the total number of storage blocks of the corresponding storage space.
12. The method of claim 1, wherein generating, by the test program, a thread address for each of the predetermined cache test patterns that corresponds one-to-one to at least some threads of the processor comprises:
and generating continuous thread addresses corresponding to at least part of threads of the processor one by one according to the execution sequence of the threads for each cache test mode through the test program.
13. The method of claim 1, wherein generating, by the test program, a thread address for each of the predetermined cache test patterns that corresponds one-to-one to at least some threads of the processor comprises:
selecting a target cache test mode from all the cache test modes according to a preset cycle sequence through the test program;
and respectively generating thread addresses corresponding to each target cache test mode.
14. A test apparatus for a cache operating mechanism of a processor, the apparatus comprising:
the first acquisition module is used for acquiring cache working mechanism information of the processor; the cache working mechanism information is used for representing a working mechanism of the cache when the processor runs;
the second acquisition module is used for configuring a preset initial test program according to the cache working mechanism information to acquire a test program for testing the working mechanism of the cache of the processor; the test program is provided with a plurality of cache test modes;
the generating module is used for generating thread addresses corresponding to at least partial threads of the processor one by one according to each cache test mode through the test program, and the thread addresses meet the cache test modes; the thread address is an address of a storage space of a cache allocated for a corresponding thread, the cache test mode is a scene mode that the thread uses the storage space of the cache, and the cache test mode comprises that the storage spaces are mutually independent, the storage spaces are partially overlapped, the storage spaces are completely overlapped and the storage spaces are scattered and overlapped;
And the test module is used for testing the working mechanism of the cache according to the thread address corresponding to each cache test mode through the test program to obtain a test result.
15. An electronic device, comprising: a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of any one of claims 1 to 13.
16. A computer readable storage medium, characterized in that instructions in the computer readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any one of claims 1 to 13.
CN202311174747.4A 2023-09-12 2023-09-12 Method, device, equipment and medium for testing cache working mechanism of processor Active CN116955044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311174747.4A CN116955044B (en) 2023-09-12 2023-09-12 Method, device, equipment and medium for testing cache working mechanism of processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311174747.4A CN116955044B (en) 2023-09-12 2023-09-12 Method, device, equipment and medium for testing cache working mechanism of processor

Publications (2)

Publication Number Publication Date
CN116955044A CN116955044A (en) 2023-10-27
CN116955044B true CN116955044B (en) 2023-12-22

Family

ID=88451498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311174747.4A Active CN116955044B (en) 2023-09-12 2023-09-12 Method, device, equipment and medium for testing cache working mechanism of processor

Country Status (1)

Country Link
CN (1) CN116955044B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117648226A (en) * 2024-01-29 2024-03-05 北京开源芯片研究院 Method and device for testing working mechanism of processor cache
CN117762717B (en) * 2024-02-18 2024-04-26 北京开源芯片研究院 Method and device for testing working mechanism of processor cache

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101937392A (en) * 2010-08-27 2011-01-05 South China University of Technology A Dynamic Defect Detection Method for Embedded Software
CN103365631A (en) * 2012-04-05 2013-10-23 Nvidia Corporation Dynamic bank mode addressing for memory access
CN103729291A (en) * 2013-12-23 2014-04-16 Huazhong University of Science and Technology Synchrony relation based parallel dynamic data race detection system
US8762951B1 (en) * 2007-03-21 2014-06-24 Oracle America, Inc. Apparatus and method for profiling system events in a fine grain multi-threaded multi-core processor
CN106233254A (en) * 2014-03-27 2016-12-14 International Business Machines Corporation Address expansion and contraction in a multithreading computer system
CN114579473A (en) * 2022-05-09 2022-06-03 Taiping Financial Technology Services (Shanghai) Co., Ltd. Shenzhen Branch Application testing method, device, equipment and storage medium
CN115309661A (en) * 2022-09-15 2022-11-08 Beijing QIYI Century Science & Technology Co., Ltd. Application testing method and device, electronic equipment and readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8176475B2 (en) * 2006-10-31 2012-05-08 Oracle America, Inc. Method and apparatus for identifying instructions associated with execution events in a data space profiler
US20110258421A1 (en) * 2010-04-19 2011-10-20 International Business Machines Corporation Architecture Support for Debugging Multithreaded Code
JP5841457B2 (en) * 2012-03-01 2016-01-13 株式会社アドバンテスト Test equipment and test modules
US9804846B2 (en) * 2014-03-27 2017-10-31 International Business Machines Corporation Thread context preservation in a multithreading computer system
LT4027618T (en) * 2019-04-02 2024-08-26 Bright Data Ltd. Managing a non-direct url fetching service

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
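Analysis of False Cache Line Sharing Effects on Multicore CPUs; Suntorn Sae-eung; SJSU ScholarWorks; full text *
Research on hardware support for and evaluation of the Cilk language on many-core architectures; Long Guoping et al.; Chinese Journal of Computers (No. 11); full text *
Research on a dynamic cache region allocation mechanism induced by the permeation cache hit rate; Li Lingzhi et al.; Software Guide (No. 04); full text *
Stride-based and pointer-based prefetching in chip multiprocessors; Xiao Junhua et al.; Computer Engineering (No. 04); full text *
Hierarchical high-speed test and verification techniques for chip multiprocessors; Guo Songliu et al.; Journal of Harbin Engineering University (No. 05); full text *
Task scheduling and cache optimization of dataflow programs for X86 multi-core processors; Tang Jiufei et al.; Journal of University of Science and Technology of China (No. 03); full text *
A light-core array machine architecture for graphics and image processing; Li Tao et al.; Journal of Xi'an Institute of Posts and Telecommunications (No. 03); full text *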

Also Published As

Publication number Publication date
CN116955044A (en) 2023-10-27

Similar Documents

Publication Publication Date Title
CN116955044B (en) Method, device, equipment and medium for testing cache working mechanism of processor
CN112256563B (en) Android application stability testing method and device, electronic equipment and storage medium
CN115017053B (en) Test program generation method, device, equipment and readable storage medium
CN111459494B (en) A code processing method and device
CN111813407B (en) Game development method, game running device and electronic equipment
CN116775133A (en) Cache consistency verification method, device, equipment and medium
JP7685119B2 (en) Atomicity maintenance method, processor and electronic device
CN116795338A (en) Task processing method and device based on code development and electronic equipment
CN110674050B (en) Memory out-of-range detection method and device, electronic equipment and computer storage medium
CN114047885B (en) Method, device, equipment and medium for writing multi-type data
CN113535183B (en) Code processing method, device, electronic equipment and storage medium
CN110908882A (en) Performance analysis method and device of application program, terminal equipment and medium
CN114385487A (en) Execution time processing method and device and storage medium
CN119416410A (en) On-chip network simulation method, device, equipment and storage medium
CN116881171B (en) Seed use case processing method, device, equipment and storage medium in fuzzy test
CN117271374A (en) Chip simulation test methods, devices, equipment and storage media
CN116360859B (en) Power domain access method, device, equipment and storage medium
CN110162302B (en) Data processing method, device, electronic equipment and storage medium
CN117762717B (en) Method and device for testing working mechanism of processor cache
CN110851243A (en) Flow access control method and device, storage medium and electronic equipment
CN119835329B (en) Data processing method, device, equipment and storage medium
CN115114725B (en) Contact processing method, device and electronic equipment for automobile model
CN118642905B (en) Processor testing method, device, electronic device and readable storage medium
CN120764228B (en) Cache consistency testing methods, apparatus, equipment and readable storage media
CN119960830B (en) Method, device, equipment and storage medium for configuring immediate value of jump instruction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant