
CN109445854B - Data transmission method and device - Google Patents


Info

Publication number
CN109445854B
CN109445854B
Authority
CN
China
Prior art keywords
data
concurrent
pending
compiling
hardware
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811281863.5A
Other languages
Chinese (zh)
Other versions
CN109445854A (en)
Inventor
龚施俊
江树浩
鄢贵海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Yuanshu (beijing) Technology Co Ltd
Original Assignee
Zhongke Yuanshu (beijing) Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Yuanshu (beijing) Technology Co Ltd
Priority to CN201811281863.5A
Publication of CN109445854A
Application granted
Publication of CN109445854B
Legal status: Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3877Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The present invention provides a data transmission method and device. The method comprises: establishing, in memory, a concurrent data structure for storing application-layer data, compilation-related data, and hardware-layer data; establishing multiple threads; obtaining the application-layer data corresponding to the data to be processed and saving it to the concurrent data structure; concurrently accessing the concurrent data structure with the multiple threads, which execute operations according to a preset division of labor, so as to convert the application-layer data corresponding to the data to be processed into the corresponding hardware-layer data; and sending the hardware-layer data corresponding to the data to be processed to a coprocessor for processing. This scheme improves the response speed of the host-side system and thereby the data transmission efficiency between the host side and the coprocessor side.

Description

Data transmission method and device
Technical field
The present invention relates to the field of computer technology, and in particular to a data transmission method and device.
Background technique
With the end of Moore's Law, special-purpose chips serving as coprocessors have entered a "golden" era. These chips generally process data with architectures similar to single-instruction multiple-data (SIMD) designs; examples include graphics processing units (GPUs) and tensor processing units (TPUs). When such chips process data, data movement is the key factor affecting their performance, including carrying data from the host side to the coprocessor side and reading result data back from the coprocessor side. To improve transmission efficiency, various optimizations have been proposed, for example increasing GPU bandwidth to 12 Gbps using the novel GDDR5X memory, or reducing transfer time through data compression. With these techniques, the waiting time experienced by users when invoking a coprocessor to accelerate an application decreases and system fluency improves. At the same time, users' demands on transmission efficiency keep rising. Therefore, the data transmission efficiency between the host side and the coprocessor side still needs further improvement.
Summary of the invention
In view of this, the present invention provides a data transmission method and device to improve the response speed of the host-side system and thereby further improve the data transmission efficiency between the host side and the coprocessor side.
To achieve the above goals, the present invention adopts the following scheme:
In an embodiment of the invention, a data transmission method comprises:
establishing, in memory, a concurrent data structure for storing application-layer data, compilation-related data, and hardware-layer data; establishing multiple threads; and obtaining the application-layer data corresponding to the data to be processed and saving it to the concurrent data structure;
concurrently accessing the concurrent data structure with the multiple threads, which execute operations according to a preset division of labor, so as to convert the application-layer data corresponding to the data to be processed into the hardware-layer data corresponding to the data to be processed;
sending the hardware-layer data corresponding to the data to be processed to a coprocessor for processing.
In an embodiment of the invention, a data transmission device comprises:
a construction unit, configured to establish in memory a concurrent data structure for storing application-layer data, compilation-related data, and hardware-layer data, to establish multiple threads, and to obtain the application-layer data corresponding to the data to be processed and save it to the concurrent data structure;
a processing unit, configured to concurrently access the concurrent data structure with the multiple threads, which execute operations according to a preset division of labor, so as to convert the application-layer data corresponding to the data to be processed into the hardware-layer data corresponding to the data to be processed;
a transmission unit, configured to send the hardware-layer data corresponding to the data to be processed to a coprocessor for processing.
In an embodiment of the invention, a computer device comprises a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the steps of the method of the above embodiments when executing the program.
In an embodiment of the invention, a computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the method of the above embodiments.
The data transmission method, data transmission device, computer device, and computer-readable storage medium of the invention establish a concurrent data structure in memory, establish multiple threads, concurrently access the concurrent data structure with those threads, and execute operations according to a preset division of labor, converting the data to be processed from application-layer data into the corresponding hardware-layer data. This allows different host-side operations to execute concurrently while responding to the coprocessor, reducing the host system's response time and maximizing the use of computing resources, thereby improving the data transmission efficiency between the host side and the coprocessor side.
Detailed description of the invention
To explain the technical solutions of the embodiments of the invention, or of the prior art, more clearly, the drawings needed in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 is a schematic flowchart of the data transmission method of an embodiment of the invention;
Fig. 2 is a schematic flowchart of the method, in an embodiment of the invention, of establishing in memory a concurrent data structure for storing application-layer data, compilation-related data, and hardware-layer data;
Fig. 3 is a schematic flowchart of the method, in an embodiment of the invention, of concurrently accessing the concurrent data structure with multiple threads and executing operations by a preset division of labor to generate hardware-layer data;
Fig. 4 is a schematic flowchart of the method, in another embodiment of the invention, of concurrently accessing the concurrent data structure with multiple threads and executing operations by a preset division of labor to generate hardware-layer data;
Fig. 5 is a schematic flowchart of the data transmission method of another embodiment of the invention;
Fig. 6 is a schematic diagram of the data structure in an embodiment of the invention;
Fig. 7 is a schematic diagram of the multithreading in an embodiment of the invention;
Fig. 8 is a schematic diagram of the host-side system's operational process in an embodiment of the invention;
Fig. 9 is a schematic diagram of the data-transmission interaction of an embodiment of the invention;
Fig. 10 is a schematic structural diagram of the data transmission device of an embodiment of the invention.
Specific embodiment
To make the objects, technical solutions, and advantages of the embodiments of the invention clearer, the embodiments are described in further detail below with reference to the drawings. The illustrative embodiments and their descriptions are used to explain the invention, not to limit it.
Fig. 1 is a schematic flowchart of the data transmission method of an embodiment of the invention. As shown in Fig. 1, the data transmission method of some embodiments may include:
Step S110: establishing in memory a concurrent data structure for storing application-layer data, compilation-related data, and hardware-layer data; establishing multiple threads; and obtaining the application-layer data corresponding to the data to be processed and saving it to the concurrent data structure;
Step S120: concurrently accessing the concurrent data structure with the multiple threads, which execute operations according to a preset division of labor, converting the application-layer data corresponding to the data to be processed into the corresponding hardware-layer data;
Step S130: sending the hardware-layer data corresponding to the data to be processed to the coprocessor for processing.
Steps S110 to S130 above may be executed by the host side. The host side may be any general-purpose processor, for example a personal computer, tablet, mobile phone, or server. The coprocessor may be any accelerator that can assist the host side in completing certain processing tasks or improving processing speed, for example a special-purpose chip, or more specifically a dedicated time-series accelerator or chip. The host side is connected to the coprocessor; it can transfer the data to be processed to the coprocessor for processing and obtain the corresponding processing results.
In step S110, the concurrent data structure is used to store the relevant data and to provide concurrent access; it may be lock-based or lock-free, and can be accessed concurrently by multiple threads. The application-layer data may include the application-layer data corresponding to the data before and after coprocessor processing. The compilation-related data may include the compiler-compiled form of the application-layer data corresponding to the data before coprocessor processing, and may also include the to-be-compiled portion of that application-layer data. The hardware-layer data may include the hardware-layer data corresponding to the data before and after coprocessor processing. The application-layer data, compilation-related data, and hardware-layer data may be stored in one or more data tables. Establishing the concurrent data structure in memory facilitates fast, concurrent access by the multiple threads and improves data read/write speed.
The data to be processed may be read from disk or received in real time. Data from disk or received in real time may carry many attributes (for example name, identification number, data size, and address), from which the required attributes can be selected, or to which new attributes can be added, to form the application-layer data corresponding to the data to be processed. The data to be processed may be delivered to the threads in forms such as data streams or data segments; the segments may be produced by the host side through segmentation. The data to be processed may be of various types or fields, for example time-series data. "Data to be processed" refers to the data that the coprocessor will process.
In step S120, the preset division of labor may, for example, cover operations such as initializing the concurrent data structure, interacting with the compiler, releasing memory, compiling data, interacting with the coprocessor, and mapping the memory addresses of the data; these operations can be distributed among the multiple threads according to the actual situation. The specific assignment can be determined by how easily the threads cooperate, by the threads' processing flows, and so on.
Different threads may perform the same or different operations at the same time, and each thread may access the concurrent data structure while executing its own operations, either to read the data it needs or to write back the data it has processed. The operations of each thread can follow a preset flow, for example: obtaining the application-layer data corresponding to the data to be processed, generating the corresponding compilation-related data, and finally generating the corresponding hardware-layer data, which can be processed directly by the coprocessor.
In step S130, through the division of labor among the multiple threads, the host side can quickly and continuously prepare the hardware-layer data corresponding to different pieces of data to be processed, and send them to the coprocessor for processing in turn.
In this embodiment, a concurrent data structure is established in memory, multiple threads are established, the threads concurrently access the concurrent data structure and execute operations according to the preset division of labor, and the data to be processed is converted from application-layer data into the corresponding hardware-layer data. Different host-side operations can thus execute concurrently while responding to the coprocessor, reducing the host system's response time and maximizing the use of computing resources, thereby improving the data transmission efficiency between the host side and the coprocessor side.
Fig. 2 is a schematic flowchart of the method, in an embodiment of the invention, of establishing in memory a concurrent data structure for storing application-layer data, compilation-related data, and hardware-layer data. As shown in Fig. 2, step S110 above — establishing in memory the concurrent data structure for storing application-layer data, compilation-related data, and hardware-layer data — may include:
Step S111: using the C++ map data structure to establish in memory at least one data table for storing application-layer data, compilation-related data, and hardware-layer data;
Step S112: generating the concurrent data structure from the at least one data table.
In step S111, there may be multiple data tables, for example: at least one table for storing application-layer data, at least one table for storing compilation-related data, and at least one table for storing hardware-layer data. Using multiple tables to store the relevant data decouples the data and reduces conflicts under concurrency.
The at least one table for storing application-layer data may include: an application data table for storing the application-layer data corresponding to the data to be processed, and an application result table for storing the application-layer data corresponding to the processing results. The at least one table for storing compilation-related data may include: a compilation data table for storing the to-be-compiled portion of the application-layer data corresponding to the data to be processed, and a compile result table for storing the compiled form of that data. The at least one table for storing hardware-layer data may include: a hardware data table for storing the hardware-layer data corresponding to the data to be processed, and a hardware result table for storing the hardware-layer data corresponding to the processing results. Storing the application-layer data, compilation-related data, and hardware-layer data in these tables gives a clear classification and facilitates fast access.
The above data tables may be implemented mainly with the C++ map data structure. C++'s map is a key-value container suited to one-to-one data. It is built internally on a red-black tree, which automatically keeps the entries sorted by key; this is well suited to holding time-series data. Implementing the data tables with the C++ map therefore speeds up data queries.
In step S112, when multiple data tables are used to store the relevant data, they can be associated in various ways to generate the concurrent data structure. For example, the concurrency API (application programming interface) of C++11 can be used to generate a lock-based concurrent data structure from the at least one data table, thereby controlling the concurrent access of multiple threads; specifically, condition-based thread wake-up implemented through the C++11 concurrency API can manage the multiple threads and meet the system's high-concurrency demands. In other embodiments, a lock-based concurrent data structure may be generated in other ways, or a lock-free concurrent data structure may be generated; the concrete implementation can be chosen as needed.
In this embodiment, using multiple data tables to store the relevant data reduces conflicts under concurrency, and implementing the tables with the C++ map speeds up data queries.
In some embodiments, establishing multiple threads may include: establishing a main thread whose functions include initializing the concurrent data structure and invoking the compiler; and using the main thread to establish a first sub-thread and a second sub-thread, wherein the functions of the first sub-thread include compiling data with the compiler invoked by the main thread, and the functions of the second sub-thread include generating hardware-layer data from the data compiled by the first sub-thread. The multiple threads include the main thread, the first sub-thread, and the second sub-thread.
Fig. 3 is a schematic flowchart of the method, in an embodiment of the invention, of concurrently accessing the concurrent data structure with multiple threads and executing operations by a preset division of labor to generate hardware-layer data. As shown in Fig. 3, step S120 above — concurrently accessing the concurrent data structure with the multiple threads and executing operations by the preset division of labor, converting the application-layer data corresponding to the data to be processed into the corresponding hardware-layer data — may include:
Step S121: invoking a compiler with the main thread;
Step S122: accessing the concurrent data structure with the first sub-thread, obtaining the application-layer data corresponding to the data to be processed, compiling the data of at least some attributes of that application-layer data with the invoked compiler, and generating the compile result data corresponding to the data to be processed and saving it to the concurrent data structure;
Step S123: accessing the concurrent data structure with the second sub-thread, obtaining the compile result data corresponding to the data to be processed, performing address mapping according to it, and generating the hardware-layer data corresponding to the data to be processed and saving it to the concurrent data structure. The multiple threads include the main thread, the first sub-thread, and the second sub-thread.
In step S121, the main thread may be mainly responsible for invoking the compiler, realizing the interaction with it. In other embodiments, the main thread may also be responsible for one or more operations such as initializing the concurrent data structure, establishing the sub-threads, calling the application programming interface, and releasing memory on system exit. For example, after the main thread initializes the concurrent data structure and establishes the two sub-threads, it can generate the relevant instructions to invoke the compiler, and then enter a state of waiting for processing results. In other embodiments, the sub-threads may be established in other ways, more than two sub-threads may be established, and the functions of each thread can be allocated according to the design flow.
In step S122, different compiler parameters may be used depending on the coprocessor and other factors. The first sub-thread can access the table of the concurrent data structure that is related to the application-layer data and read the required data. The application-layer data may include multiple attributes, for example name, identification number, row size, column size, data size, and address. The first sub-thread may compile the data of all attributes of the application-layer data, or only the data of some attributes — for example only the identification number — thereby saving host-side computing resources.
The compile result data corresponding to the data to be processed may be in binary form. It may be saved together with the same attribute as the to-be-compiled data, or the data of other attributes may additionally be read from the application-layer data and saved alongside it — for example, the compile result data of the identification number may be saved together with the name. In other embodiments, the to-be-compiled data corresponding to the data to be processed may also be saved; the compile result data and the to-be-compiled data may be stored in the same table or in separate tables. When stored separately, the to-be-compiled data table can be accessed first to save and read the to-be-compiled data, which is then compiled into the compile result data and saved into the corresponding table.
In step S123, the address refers to an address in data memory. The second sub-thread can access the table of the concurrent data structure that is related to the compile result data and obtain the compile result data corresponding to the data to be processed. Address mapping can be realized by adding the data of the corresponding address attribute on top of the compile result data, possibly together with the data of other attributes, such as the application-layer data of row size, column size, and data size, generating the corresponding hardware-layer data. The hardware-layer data corresponding to the data to be processed may be saved into an independent table or together with the data of other layers.
In this embodiment, a main thread and two sub-threads convert the data to be processed from application-layer data into hardware-layer data. The first sub-thread carries out the main work of moving data from the application layer toward the hardware layer, and the second sub-thread completes that movement. As multiple pieces of data to be processed arrive one after another, the main thread and the two sub-threads can carry out their respective operations concurrently; compared with performing the reading, compiling, mapping, and other operations sequentially in a single thread, this yields a faster data response.
Fig. 4 is a schematic flowchart of the method, in another embodiment of the invention, of concurrently accessing the concurrent data structure with multiple threads and executing operations by a preset division of labor to generate hardware-layer data. As shown in Fig. 4, step S122 above — accessing the concurrent data structure with the first sub-thread, obtaining the application-layer data corresponding to the data to be processed, compiling the data of at least some attributes of that application-layer data with the invoked compiler, and generating and saving the compile result data to the concurrent data structure — may specifically include:
Step S1221: accessing, with the first sub-thread, the application data table in the concurrent data structure that stores the application-layer data corresponding to the data to be processed, reading the data of some attributes of that application-layer data, and saving it to the compilation data table in the concurrent data structure;
Step S1222: compiling, with the invoked compiler, the data of the some attributes saved into the compilation data table, and generating the compile result data corresponding to the data to be processed and saving it to the compile result table in the concurrent data structure.
In steps S1221 and S1222, the data of the some attributes may, for example, be the identification number of the application-layer data (which may include attributes such as name, identification number, row size, column size, data size, and address); compiling only the data of some attributes reduces computational overhead. The application-layer data corresponding to the data to be processed, the corresponding to-be-compiled data, and the corresponding compile result data are stored respectively in the application data table (with attributes such as name, identification number, row size, column size, data size, and address), the compilation data table (with attributes such as identification number), and the compile result table (with attributes such as name and identification number). Saving each kind of data in its own table simplifies the design of the multithreaded concurrent access.
As also shown in Fig. 4, step S123 above — accessing the concurrent data structure with the second sub-thread, obtaining the compile result data corresponding to the data to be processed, performing address mapping according to it, and generating and saving the corresponding hardware-layer data to the concurrent data structure — may specifically include:
Step S1231: accessing the application data table with the second sub-thread and reading the address data corresponding to the data to be processed, and accessing the compile result table and reading the compile result data corresponding to the data to be processed;
Step S1232: performing address mapping according to the address data and the compile result data corresponding to the data to be processed, generating the hardware-layer data corresponding to the data to be processed and saving it to the hardware data table in the concurrent data structure, and emptying the compilation data table.
In steps S1231 and S1232, compared with the other tables in the concurrent data structure, the application data table may contain the data of the most attribute types. The compile result table may not contain the data of the address attribute, so the address data can be obtained from the application data table. The hardware data table may, for example, include attributes such as identification number, row size, column size, data size, and address. After the hardware-layer data corresponding to the data to be processed is saved to the hardware data table, emptying the compilation data table makes room for saving the to-be-compiled data of new data to be processed. In other embodiments, the table need not be emptied; the data can instead be overwritten the next time data is saved.
In some embodiments, after the first sub-thread finishes compiling the data, it can wake the second sub-thread to perform the data mapping. After the second sub-thread finishes the data mapping, it can wake the first sub-thread so that it is ready to receive the exit instruction of the main thread and exit the main program.
Fig. 5 is a schematic flowchart of the data transmission method of another embodiment of the invention. As shown in Fig. 5, step S130 above — sending the hardware-layer data corresponding to the data to be processed to the coprocessor for processing — may include:
Step S131: sending, with the second sub-thread, the hardware-layer data corresponding to the data to be processed together with the table information of the hardware result table (the table in the concurrent data structure that stores the hardware-layer data corresponding to the processing results) to the coprocessor, so that the coprocessor processes the hardware-layer data corresponding to the data to be processed and returns the hardware-layer data corresponding to the processing results according to the table information of the hardware result table.
In step S131, the hardware result table may, for example, include attributes such as identification number, row size, column size, data size, and address. Sending the hardware-layer data corresponding to the data to be processed to the coprocessor enables the coprocessor to process that data. Sending the table information of the hardware result table (for example, its attribute information and structure information) to the coprocessor allows the coprocessor, after processing, to return the result data to the host side in the form or format of the hardware result table, which facilitates saving the results to that table.
In other embodiments, after the second sub-thread sends the hardware-layer data and the table information of the hardware result table to the coprocessor, it may also send an instruction to start processing, controlling the coprocessor to begin processing the data.
As further shown in Fig. 5, the data transmission method of the above embodiment may also include:
Step S140: using the second sub-thread, receive the hardware layer data corresponding to the processing result and save it to the hardware result table; and, according to the hardware layer data corresponding to the processing result, the compiling result table, and the compiling data table, obtain the application layer data corresponding to the processing result and save it to the application result table in the concurrent data structure;
Step S150: using the main thread, read from the application result table and send the application layer data corresponding to the processing result.
In the above step S140, the hardware data table and the hardware result table may have the same attributes, and the application data table and the application result table may have the same attributes. The application result table may include attributes such as name, identification number, row size, column size, data size, and address. After the coprocessor has finished processing the data, it can notify the host side to read the processing result; the processing result is sent to the host side and received by the second sub-thread. After the processing result is received, its data can be filled into the hardware result table and the application result table.
The specific way of obtaining the application layer data corresponding to the processing result according to the hardware layer data corresponding to the processing result, the compiling result table, and the compiling data table may include: querying the compiling result table with the identification number in the hardware layer data corresponding to the processing result to obtain the name of the data corresponding to the processing result, and, according to that name, filling the identification number from the compiling data table into the application result table. In other embodiments, the identification number from the compiling data table may instead be filled into the application data table according to the name of the data corresponding to the processing result. Specifically, for example, if the first character of the name of the data corresponding to the processing result is an underscore, the identification number in the compiling data table is filled into the application data table; otherwise, it is filled into the application result table. Data filled into the application data table can continue to be converted by the above threads and transmitted to the coprocessor for further processing.
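The underscore-based routing rule just described can be captured in a small pure function. This is a sketch under the stated convention; the type and function names are illustrative:

```cpp
#include <string>

// Routing rule from the description: a leading underscore in a result's name
// marks it as intermediate data whose identification number is written back
// to the application data table (and re-sent to the coprocessor); otherwise
// the identification number goes to the application result table.
enum class Dest { AppDataTable, AppResultTable };

Dest route_result(const std::string& name) {
    if (!name.empty() && name[0] == '_')
        return Dest::AppDataTable;    // intermediate: continue processing
    return Dest::AppResultTable;      // final: return to the upper-layer application
}
```

An empty name falls through to the result table here; the patent does not specify that corner case, so this is one possible choice.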
In the above step S150, after the main thread reads the application layer data corresponding to the processing result from the application result table, it can return the data to the upper-layer application, or further send it to a hard disk for storage, or output it in real time to another device.
In some embodiments, the data transmission method of the above embodiments may also include: when the main thread has not obtained pending data for more than a set period, it sends a first exit instruction and a second exit instruction to the first sub-thread and the second sub-thread respectively, and releases memory after receiving an exit signal from the first sub-thread and an exit signal from the second sub-thread. In this way, memory can be released when no data needs to be processed, ending the resource occupation on the host side.
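The idle-timeout shutdown path can be sketched with a timed condition wait. A minimal sketch only; the flag names, the fifty-millisecond period, and the busy work loop are assumptions for the example:

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool data_pending = false;                 // set by the producer of pending data
std::atomic<bool> exit_requested{false};   // the "exit instructions"
std::atomic<int>  exited{0};               // counts "exit signals" received

void sub_thread() {
    while (!exit_requested.load())
        std::this_thread::yield();         // stand-in for the real work loop
    ++exited;                              // exit signal back to the main thread
}

// Returns true if the set period elapsed with no pending data and a shutdown
// was therefore requested; returns false if data arrived in time.
bool shutdown_if_idle(std::chrono::milliseconds period) {
    std::unique_lock<std::mutex> lk(m);
    if (cv.wait_for(lk, period, [] { return data_pending; }))
        return false;                      // data arrived, keep running
    exit_requested = true;                 // tell both sub-threads to exit
    return true;
}

int run_shutdown_demo() {
    exit_requested = false;
    exited = 0;
    std::thread t1(sub_thread), t2(sub_thread);
    bool idle = shutdown_if_idle(std::chrono::milliseconds(50));
    t1.join();
    t2.join();
    // at this point the concurrent tables would be freed
    return idle ? exited.load() : -1;
}
```

The predicate form of `wait_for` distinguishes a genuine timeout from a spurious wake-up, which is what makes the idle check reliable.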
In order to help those skilled in the art better understand the present invention, embodiments of the present invention are illustrated below with a specific example.
The coprocessor end can be a dedicated time-series accelerator. In order to further reduce the response time of the host-side system, maximize resource utilization, and improve the data exchange capability between the host side and the dedicated time-series accelerator, one embodiment decouples the data transmission, data computation, and result read-back operations by means of multithreading, realizing multithreaded main-memory data management. Decoupling here means separating data transmission from computation, so that data transmission can proceed while the coprocessor is computing, thereby maximally amortizing the data transfer overhead. Since a dedicated accelerator is involved, the host-side system needs to interact and communicate with the compiler and the upper-layer application. In general, the data transmission method may include:
(1) Establish a lock-based concurrent data structure in memory for saving data information and for subsequent concurrent access; the data tables in the concurrent data structure can be implemented with a map data structure;
(2) Establish three threads: one main thread and two sub-threads. The main thread is responsible for system initialization, data interaction with the compiler, the final memory release, and other related operations; sub-thread 1 is used to carry data to the underlying hardware (the dedicated accelerator); sub-thread 2 is used to read result data back from the underlying hardware. Condition-based thread wake-up can be implemented with the C++11 concurrency API to manage the multiple threads.
Specifically, some data structures are first designed to save the relevant data and provide concurrent access. Fig. 6 is a schematic diagram of the data structures in one embodiment of the present invention. As shown in Fig. 6, the data information that needs to be saved may include name, id, row, col, len, and addr, corresponding respectively to the data's name, identification number, row size, column size, data size, and address in data memory. To speed up data queries, this embodiment mainly uses the C++ map data structure to implement the relevant tables. In addition, to reduce the conflicts generated during concurrency, this embodiment divides the data information into multiple data tables by system function, as shown in Fig. 6: an application data table (APP-DTab), an application result table (APP-RTab), a compiling data table (C-DTab), a compiling result table (C-RTab), a hardware data table (H-DTab), and a hardware result table (H-RTab). The C++11 concurrency API is used to implement the lock-based concurrent map data structure.
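One of the lock-based map tables can be sketched in C++11 as a `std::map` guarded by a mutex, with one record per datum carrying the six attributes listed above. The class and member names are illustrative:

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>

// One record per datum: name, identification number, row size, column size,
// data size, and address in data memory, as listed in the description.
struct Record {
    std::string    name;
    int            id;
    std::size_t    row, col, len;
    std::uintptr_t addr;
};

// A minimal lock-based table in the spirit of the "concurrent map data
// structure based on lock" (C++11: std::map guarded by std::mutex).
class ConcurrentTable {
public:
    void put(int id, Record r) {
        std::lock_guard<std::mutex> lk(m_);
        rows_[id] = std::move(r);
    }
    bool get(int id, Record& out) const {
        std::lock_guard<std::mutex> lk(m_);
        auto it = rows_.find(id);
        if (it == rows_.end()) return false;
        out = it->second;
        return true;
    }
    void clear() {                     // e.g. emptying C-DTab after mapping
        std::lock_guard<std::mutex> lk(m_);
        rows_.clear();
    }
private:
    mutable std::mutex    m_;
    std::map<int, Record> rows_;
};
```

Keying each of the six tables separately, as the embodiment does, means two threads touching different tables never contend on the same mutex.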
Fig. 7 is a schematic diagram of the multithreading in one embodiment of the present invention, depicting the detailed design of the multiple threads in the whole system of this embodiment and the data structures each thread needs to operate on. The system mainly includes three threads: one main thread and two sub-threads. Main thread 0 is mainly responsible for initializing the data structures, calling the application interface, and releasing memory when the system exits, and therefore needs access to all data tables. Sub-thread 1 mainly completes the related work the system needs when the compiler is called, and needs access to APP-DTab, C-DTab, and C-RTab. Sub-thread 2 mainly carries the host-side data and result information to the hardware and reads the result data back, and needs access to the C-DTab, C-RTab, H-DTab, and H-RTab data tables.
Fig. 8 is a schematic diagram of the operation flow of the host-side system of one embodiment of the present invention. Fig. 9 is a schematic diagram of the data transmission interaction of one embodiment of the present invention. Fig. 8 illustrates the flow of the whole system in detail. As shown in Fig. 8 and Fig. 9, first, main thread 0 completes the initialization of the data structures and establishes the two sub-threads. It then generates the relevant instructions, calls the compiler, and enters the stage of waiting for result data until it receives the wake-up signal from sub-thread 2. It then starts reading the result data (application layer data). At this point the current instruction is finished; if other instructions remain to be executed, it loops back and continues; otherwise, it sends exit instructions to sub-thread 1 and sub-thread 2, waits for their exit signals, finally exits the main program, and releases the associated memory.
Sub-thread 1 enters a waiting state after being created, until the compiler is called. Then, according to the parameters passed by the compiler call (which may differ for different dedicated accelerators), it accesses the application data table and fills in the compiling data table and the compiling result table. When the compiler call ends, it wakes sub-thread 2 and then judges whether to exit: if not, it returns to the stage of waiting for a compiler call; otherwise, it sends an exit signal to sub-thread 2 and then exits the routine.
Sub-thread 2 enters a waiting state after being created, until it is woken by sub-thread 1. It then accesses the application data table and, according to the compiling result table, fills in the hardware data table. It then wakes sub-thread 1 and empties the compiling data table. Next, it transfers the data in the hardware data table and the relevant information of the hardware result table to the hardware (the dedicated accelerator), sends a start-computation instruction, and enters the stage of waiting for result data. The result data read from the hardware is written directly into the hardware result table; the identification number in the result data is used to query the compiling result table to obtain the data name, and according to that data name the corresponding identification number in the compiling data table is filled into the application result table. Main thread 0 is then woken. Sub-thread 2 then judges whether sub-thread 1 has exited: if not, it returns to waiting to be woken by sub-thread 1; otherwise, it exits the routine.
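Sub-thread 2's read-back step, mapping a returned hardware identification number back through C-RTab and C-DTab into APP-RTab, can be sketched as a toy lookup. The table layouts here are deliberately simplified to plain maps, and the aliases are assumptions for the example:

```cpp
#include <map>
#include <string>

// Simplified stand-ins for the patent's tables:
using CRTab   = std::map<int, std::string>;  // hardware id -> data name (C-RTab)
using CDTab   = std::map<std::string, int>;  // data name -> app-layer id (C-DTab)
using AppRTab = std::map<std::string, int>;  // name -> id for the main thread (APP-RTab)

// Look up the result's name via C-RTab, then its application-layer
// identification number via C-DTab, and fill it into APP-RTab.
bool record_result(int hw_id, const CRTab& crtab, const CDTab& cdtab,
                   AppRTab& app_rtab) {
    auto n = crtab.find(hw_id);
    if (n == crtab.end()) return false;      // unknown result identification number
    auto d = cdtab.find(n->second);
    if (d == cdtab.end()) return false;      // name missing from the compiling data table
    app_rtab[n->second] = d->second;         // ready for main thread 0 to read
    return true;
}
```

In the real system this would run under the tables' locks and be followed by waking main thread 0.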
This embodiment provides a multithreaded main-memory data management method oriented to a time-series accelerator: a concurrent map data structure is designed, data-management multithreading is designed, and condition-based thread wake-up is implemented with the C++11 concurrency API to manage the multiple threads, so as to meet the system's high-concurrency requirements. By decoupling carrying data from the host side to the accelerator, waiting for the accelerator to compute, and finally reading the result back from the accelerator, and realizing this decoupling with multithreading, the response time of the system is effectively reduced and the utilization of resources is improved.
Based on the same inventive concept as the data transmission method shown in Fig. 1, an embodiment of the present invention also provides a data transmission device, as described in the following examples. Since the principle by which the data transmission device solves the problem is similar to that of the data transmission method, the implementation of the device may refer to the implementation of the method, and repeated details are omitted.
Fig. 10 is a structural schematic diagram of the data transmission device of one embodiment of the present invention. As shown in Fig. 10, the data transmission device of some embodiments comprises a construction unit 210, a processing unit 220, and a transmission unit 230, connected in sequence.
The construction unit 210 is configured to establish, in memory, a concurrent data structure for saving application layer data, compiling-related data, and hardware layer data, establish multiple threads, and obtain the application layer data corresponding to pending data and save it to the concurrent data structure;
The processing unit 220 is configured to concurrently access the concurrent data structure using the multiple threads, execute operations according to a set division of functions, and convert the application layer data corresponding to the pending data into the hardware layer data corresponding to the pending data;
The transmission unit 230 is configured to send the hardware layer data corresponding to the pending data to the coprocessor for processing.
In some embodiments, the construction unit 210 may include a data table establishing module and a data structure building module, which are connected to each other.
The data table establishing module is configured to establish, in memory, at least one data table for saving application layer data, compiling-related data, and hardware layer data, using the C++ map data structure;
The data structure building module is configured to generate the concurrent data structure according to the at least one data table.
In some embodiments, the data structure building module may include a lock-based concurrent data structure generation module.
The lock-based concurrent data structure generation module is configured to generate the lock-based concurrent data structure according to the at least one data table using the C++11 concurrency API.
In some embodiments, the at least one data table includes: at least one table for saving application layer data, at least one table for saving compiling-related data, and at least one table for saving hardware layer data.
In some embodiments, the processing unit 220 may include a compiler calling module, a compiling result data generation module, and a hardware layer data generation module, connected in sequence.
The compiler calling module is configured to call the compiler using the main thread;
The compiling result data generation module is configured to access the concurrent data structure using the first sub-thread, obtain the application layer data corresponding to the pending data, compile the data of at least part of the attributes in the application layer data corresponding to the pending data using the called compiler, and generate the compiling result data corresponding to the pending data and save it to the concurrent data structure;
The hardware layer data generation module is configured to access the concurrent data structure using the second sub-thread, obtain the compiling result data corresponding to the pending data, perform address mapping according to the compiling result data corresponding to the pending data, and generate the hardware layer data corresponding to the pending data and save it to the concurrent data structure; the multiple threads include the main thread, the first sub-thread, and the second sub-thread.
In some embodiments, the compiling result data generation module may include a compiling data generation and saving module and a compiling result generation and saving module, which are connected to each other.
The compiling data generation and saving module is configured to access, using the first sub-thread, the application data table in the concurrent data structure used to save the application layer data corresponding to the pending data, read the data of part of the attributes in the application layer data corresponding to the pending data, and save it to the compiling data table in the concurrent data structure;
The compiling result generation and saving module is configured to compile, using the called compiler, the data of the part of the attributes saved in the compiling data table, and generate the compiling result data corresponding to the pending data and save it to the compiling result table in the concurrent data structure.
In some embodiments, the hardware layer data generation module may include a compiling result data reading module and a hardware layer data generation submodule, which are connected to each other.
The compiling result data reading module is configured to access the application data table using the second sub-thread and read the address data corresponding to the pending data, and to access the compiling result table and read the compiling result data corresponding to the pending data;
The hardware layer data generation submodule is configured to perform address mapping according to the address data and the compiling result data corresponding to the pending data, generate the hardware layer data corresponding to the pending data and save it to the hardware data table in the concurrent data structure, and empty the compiling data table.
In some embodiments, the transmission unit 230 may include a sending module.
The sending module is configured to use the second sub-thread to send, to the coprocessor, the hardware layer data corresponding to the pending data and the table information of the hardware result table in the concurrent data structure used to save the hardware layer data corresponding to the processing result of the pending data, so that the coprocessor processes the hardware layer data corresponding to the pending data and returns the hardware layer data corresponding to the processing result according to the table information of the hardware result table.
In some embodiments, the data transmission device may also include a processing result receiving unit and a processing result reading unit, which are connected to each other; the processing result receiving unit is connected to the transmission unit 230.
The processing result receiving unit is configured to receive, using the second sub-thread, the hardware layer data corresponding to the processing result and save it to the hardware result table, generate the application layer data corresponding to the processing result according to the hardware layer data corresponding to the processing result in the hardware result table, and save it to the application result table in the concurrent data structure;
The processing result reading unit is configured to read from the application result table and send, using the main thread, the application layer data corresponding to the processing result.
An embodiment of the present invention also provides a computer device, including a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the program, implements the steps of the method of the above embodiments.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method of the above embodiments.
In summary, the data transmission method, data transmission device, computer device, and computer-readable storage medium of the embodiments of the present invention establish a concurrent data structure in memory, establish multiple threads, concurrently access the concurrent data structure using the multiple threads, and execute operations according to a set division of functions to convert pending data from application layer data into the corresponding hardware layer data. This enables the concurrent execution of different operations when the host side performs data interaction with the coprocessor, reducing the response time of the host-side system and maximizing the use of computing resources, thereby improving the data transmission efficiency between the host side and the coprocessor end.
In the description of this specification, reference to the terms "one embodiment", "a specific embodiment", "some embodiments", "for example", "an example", "a specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in conjunction with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. The order of the steps involved in each embodiment is used to schematically illustrate the implementation of the present invention; the step order is not limiting and may be adjusted as needed.
It should be understood by those skilled in the art that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical memory) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce a manufactured article including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The specific embodiments described above further describe the purpose, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit the protection scope of the present invention; any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (9)

1. A data transmission method, characterized by comprising:
establishing, in memory, a concurrent data structure for saving application layer data, compiling-related data, and hardware layer data, establishing multiple threads, and obtaining application layer data corresponding to pending data and saving it to the concurrent data structure;
concurrently accessing the concurrent data structure using the multiple threads, and executing operations according to a set division of functions, to convert the application layer data corresponding to the pending data into hardware layer data corresponding to the pending data;
sending the hardware layer data corresponding to the pending data to a coprocessor for processing;
wherein concurrently accessing the concurrent data structure using the multiple threads, and executing operations according to a set division of functions, to convert the application layer data corresponding to the pending data into the hardware layer data corresponding to the pending data, comprises:
calling a compiler using a main thread;
accessing the concurrent data structure using a first sub-thread, obtaining the application layer data corresponding to the pending data, compiling the data of at least part of the attributes in the application layer data corresponding to the pending data using the called compiler, and generating compiling result data corresponding to the pending data and saving it to the concurrent data structure;
accessing the concurrent data structure using a second sub-thread, obtaining the compiling result data corresponding to the pending data, performing address mapping according to the compiling result data corresponding to the pending data, and generating the hardware layer data corresponding to the pending data and saving it to the concurrent data structure; the multiple threads comprising the main thread, the first sub-thread, and the second sub-thread.
2. The data transmission method of claim 1, characterized in that establishing, in memory, the concurrent data structure for saving application layer data, compiling-related data, and hardware layer data comprises:
establishing, in memory, at least one data table for saving application layer data, compiling-related data, and hardware layer data, using a C++ map data structure;
generating the concurrent data structure according to the at least one data table.
3. The data transmission method of claim 2, characterized in that generating the concurrent data structure according to the at least one data table comprises:
generating a lock-based concurrent data structure according to the at least one data table using the C++11 concurrency API.
4. The data transmission method of claim 2, characterized in that the at least one data table comprises: at least one table for saving application layer data, at least one table for saving compiling-related data, and at least one table for saving hardware layer data.
5. The data transmission method of claim 1, characterized in that
accessing the concurrent data structure using the first sub-thread, obtaining the application layer data corresponding to the pending data, compiling the data of at least part of the attributes in the application layer data corresponding to the pending data using the called compiler, and generating the compiling result data corresponding to the pending data and saving it to the concurrent data structure, comprises:
accessing, using the first sub-thread, an application data table in the concurrent data structure for saving the application layer data corresponding to the pending data, reading the data of the part of the attributes in the application layer data corresponding to the pending data, and saving it to a compiling data table in the concurrent data structure;
compiling, using the called compiler, the data of the part of the attributes saved to the compiling data table, and generating the compiling result data corresponding to the pending data and saving it to a compiling result table in the concurrent data structure;
and that accessing the concurrent data structure using the second sub-thread, obtaining the compiling result data corresponding to the pending data, performing address mapping according to the compiling result data corresponding to the pending data, and generating the hardware layer data corresponding to the pending data and saving it to the concurrent data structure, comprises:
accessing the application data table using the second sub-thread and reading address data corresponding to the pending data, and accessing the compiling result table and reading the compiling result data corresponding to the pending data;
performing address mapping according to the address data and the compiling result data corresponding to the pending data, generating the hardware layer data corresponding to the pending data and saving it to a hardware data table in the concurrent data structure, and emptying the compiling data table.
6. The data transmission method of claim 5, characterized in that
sending the hardware layer data corresponding to the pending data to the coprocessor for processing comprises:
using the second sub-thread to send, to the coprocessor, the hardware layer data corresponding to the pending data and table information of a hardware result table in the concurrent data structure for saving hardware layer data corresponding to a processing result of the pending data, so that the coprocessor processes the hardware layer data corresponding to the pending data and returns the hardware layer data corresponding to the processing result according to the table information of the hardware result table;
the data transmission method further comprising:
receiving, using the second sub-thread, the hardware layer data corresponding to the processing result and saving it to the hardware result table, and obtaining, according to the hardware layer data corresponding to the processing result, the compiling result table, and the compiling data table, application layer data corresponding to the processing result and saving it to an application result table in the concurrent data structure;
reading from the application result table and sending, using the main thread, the application layer data corresponding to the processing result.
7. A data transmission device, characterized by comprising:
a construction unit, configured to establish, in memory, a concurrent data structure for saving application layer data, compiling-related data, and hardware layer data, establish multiple threads, and obtain application layer data corresponding to pending data and save it to the concurrent data structure;
a processing unit, configured to concurrently access the concurrent data structure using the multiple threads, and execute operations according to a set division of functions, to convert the application layer data corresponding to the pending data into hardware layer data corresponding to the pending data;
a transmission unit, configured to send the hardware layer data corresponding to the pending data to a coprocessor for processing;
wherein the processing unit comprises:
a compiler calling module, configured to call a compiler using a main thread;
a compiling result data generation module, configured to access the concurrent data structure using a first sub-thread, obtain the application layer data corresponding to the pending data, compile the data of at least part of the attributes in the application layer data corresponding to the pending data using the called compiler, and generate compiling result data corresponding to the pending data and save it to the concurrent data structure;
a hardware layer data generation module, configured to access the concurrent data structure using a second sub-thread, obtain the compiling result data corresponding to the pending data, perform address mapping according to the compiling result data corresponding to the pending data, and generate the hardware layer data corresponding to the pending data and save it to the concurrent data structure; the multiple threads comprising the main thread, the first sub-thread, and the second sub-thread.
8. A computer device, comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, characterized in that the processor, when executing the program, implements the steps of the method of any one of claims 1 to 6.
9. A computer readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the steps of the method of any one of claims 1 to 6.
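The apparatus of claim 7 describes a three-stage pipeline over a shared concurrent data structure: a main thread sets up the compiler, a first sub-thread turns application layer data into compiling result data, and a second sub-thread address-maps those results into hardware layer data. The following is an illustrative sketch only, not code from the patent; the compiler stand-in, the toy address mapping and all names (`fake_compiler`, `run_pipeline`, the queue variables) are hypothetical, with thread-safe queues standing in for the claimed concurrent data structure.

```python
import queue
import threading

# Queues stand in for the claimed concurrent data structure, one per stage.
app_layer_q = queue.Queue()       # application layer data awaiting compilation
compile_result_q = queue.Queue()  # compiling result data awaiting address mapping
hardware_layer_q = queue.Queue()  # hardware layer data ready to send

SENTINEL = None  # marks the end of the pending data stream

def fake_compiler(item):
    # Hypothetical stand-in for the called compiler: "compiles" at least
    # some attributes of the application layer data.
    return {"compiled": item["payload"].upper(), "attr": item["attr"]}

def compile_worker():
    # First sub-thread: application layer data -> compiling result data.
    while True:
        item = app_layer_q.get()
        if item is SENTINEL:
            compile_result_q.put(SENTINEL)
            break
        compile_result_q.put(fake_compiler(item))

def mapping_worker(base_addr=0x1000):
    # Second sub-thread: compiling result data -> hardware layer data,
    # via a toy sequential address mapping.
    offset = 0
    while True:
        res = compile_result_q.get()
        if res is SENTINEL:
            hardware_layer_q.put(SENTINEL)
            break
        hardware_layer_q.put({"addr": base_addr + offset, **res})
        offset += 4

def run_pipeline(pending):
    # Main thread: establishes the sub-threads and feeds the pending data.
    t1 = threading.Thread(target=compile_worker)
    t2 = threading.Thread(target=mapping_worker)
    t1.start()
    t2.start()
    for item in pending:
        app_layer_q.put(item)
    app_layer_q.put(SENTINEL)
    t1.join()
    t2.join()
    out = []
    while True:
        hw = hardware_layer_q.get()
        if hw is SENTINEL:
            break
        out.append(hw)
    return out

if __name__ == "__main__":
    data = [{"payload": "mov", "attr": 1}, {"payload": "add", "attr": 2}]
    for hw in run_pipeline(data):
        print(hw)
```

Because each queue has a single producer and a single consumer, the FIFO order of the pending data is preserved end to end, which is what lets the second sub-thread assign addresses sequentially.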
CN201811281863.5A 2018-10-31 2018-10-31 Data transmission method and device Active CN109445854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811281863.5A CN109445854B (en) 2018-10-31 2018-10-31 Data transmission method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811281863.5A CN109445854B (en) 2018-10-31 2018-10-31 Data transmission method and device

Publications (2)

Publication Number Publication Date
CN109445854A CN109445854A (en) 2019-03-08
CN109445854B true CN109445854B (en) 2019-11-05

Family

ID=65548960

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811281863.5A Active CN109445854B (en) 2018-10-31 2018-10-31 Data transmission method and device

Country Status (1)

Country Link
CN (1) CN109445854B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704112B (en) * 2019-08-30 2021-04-02 Advanced New Technologies Co., Ltd. Method and apparatus for concurrently executing transactions in a blockchain

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102722355A (en) * 2012-06-04 2012-10-10 Nanjing ZTEsoft Technology Co., Ltd. Workflow mechanism-based concurrent ETL (Extract, Transform and Load) conversion method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3796124B2 (en) * 2001-03-07 2006-07-12 株式会社ルネサステクノロジ Variable thread priority processor
US20050071438A1 (en) * 2003-09-30 2005-03-31 Shih-Wei Liao Methods and apparatuses for compiler-creating helper threads for multi-threading
JP4420055B2 (en) * 2007-04-18 2010-02-24 日本電気株式会社 Multi-thread processor and inter-thread synchronous operation method used therefor
KR101572879B1 (en) * 2009-04-29 2015-12-01 삼성전자주식회사 Systems and methods for dynamically parallelizing parallel applications
GB2495959A (en) * 2011-10-26 2013-05-01 Imagination Tech Ltd Multi-threaded memory access processor
CN103543989A (en) * 2013-11-11 2014-01-29 Zhenjiang Zhongan Communication Technology Co., Ltd. Adaptive parallel processing method aiming at variable length characteristic extraction for big data


Also Published As

Publication number Publication date
CN109445854A (en) 2019-03-08

Similar Documents

Publication Publication Date Title
CN105159761B (en) For carrying out the Application Programming Interface of data parallel on multiprocessor
CN103262038B (en) Graphics calculations process scheduling
CN102508712B (en) Middleware system and task execution method in heterogeneous multi-core reconfigurable hybrid system
CN105339897B (en) Efficient priority perceives thread scheduling
CN103392171B (en) Graphics process from user model is assigned
CN107679620A (en) Artificial neural network processing unit
CN111258744A (en) Task processing method based on heterogeneous computation and software and hardware framework system
CN103019838B (en) Multi-DSP (Digital Signal Processor) platform based distributed type real-time multiple task operating system
CN106462395B (en) Thread in multiline procedure processor framework waits
CN107004253A (en) The application programming interface framework based on figure with equivalence class for enhanced image procossing concurrency
CN106875012A (en) A kind of streamlined acceleration system of the depth convolutional neural networks based on FPGA
US20020065870A1 (en) Method and apparatus for heterogeneous distributed computation
CN102902512A (en) Multi-thread parallel processing method based on multi-thread programming and message queue
CN109690505A (en) Device and method for the mixed layer address of cache for virtualization input/output embodiment
CN106462219A (en) Systems and methods of managing processor device power consumption
US20220068005A1 (en) Techniques to manage execution of divergent shaders
CN105573959B (en) A kind of distributed computer calculating storage one
CN110515053B (en) CPU and multi-GPU based heterogeneous platform SAR echo simulation parallel method
CN101717817A (en) Method for accelerating RNA secondary structure prediction based on stochastic context-free grammar
CN103207782A (en) Method for establishing partition system based on multi-kernel MOS (Module Operating System)
CN108694687A (en) Device and method for protecting the content in virtualization and graphics environment
CN112612523A (en) Embedded equipment driving system and method
CN110308982A (en) A kind of shared drive multiplexing method and device
CN107123154B (en) The rendering intent and device of target object
US11783530B2 (en) Apparatus and method for quantized convergent direction-based ray sorting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant