CN117934261A - High-resolution image detection and storage method and system - Google Patents
- Publication number
- CN117934261A (application CN202311518964.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- image
- target
- npu
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/60—Memory management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0877—Cache access modes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
Abstract
The invention provides a high-resolution image detection and storage method and system, belonging to the technical field of image processing. The method comprises the following steps. S1: an image acquisition and processing module receives image data and stores it into a plurality of logic channels within the module; the module preprocesses the image data, calculates in real time the amount of data buffered in each logic channel, and sends a target data handling notification to an intelligent scheduling module when the data amount in a target logic channel is not less than a preset value. S2: an AI computing module calculates in real time the resource utilization of each NPU computing unit within it. S3: the intelligent scheduling module notifies a target NPU computing unit whose idle rate meets a preset threshold, selected according to the resource utilization of each NPU. S4: after the target NPU computing unit receives the notification instruction, the AI computing module controls it to transfer the target data from the FPGA into the NPU computing unit. Data buffered in the image acquisition and processing module is thus forwarded directly to the AI computing module.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a high-resolution image detection and storage method and system.
Background
With the development of aerial cameras and the rapid progress of artificial-intelligence technology, users place ever greater demands on the analysis and processing of aerial image data, and the performance requirements for aerial image detection and recognition, data analysis and processing, and data storage keep rising. A prior-art high-resolution image detection and storage system generally comprises an image acquisition and processing module, a resource scheduling module, an AI computing unit, and a storage module. The image acquisition and processing module typically consists of an FPGA (field-programmable gate array) or a dedicated image-processing chip plus peripheral chips; the resource scheduling module consists of a CPU, DDR memory, an operating system, drivers, and application software; the AI computing unit consists of one or more NPUs depending on the computing scale and complexity; and the storage module consists of one or more SSDs depending on the storage scale.
Existing high-resolution image detection and storage systems suffer from the following problems. 1. Image preprocessing performance: preprocessing performed on a general-purpose processor is limited by CPU computing power, which cannot meet the preprocessing needs of ultra-high-resolution images; preprocessing algorithms on dedicated ISP chips target specific scenes or data formats, lack protocol-interface extensibility and multi-scene handling, are not general-purpose, and struggle to meet the preprocessing needs of aerial UAVs under complex imaging conditions. 2. Image data storage performance: ultra-high-resolution images occupy large data bandwidth; storing the data through a standard fwrite file-system path via CPU memory consumes much bandwidth and heavy CPU resources, and processor capacity is insufficient when handling high-definition image data. 3. Data-transmission bandwidth bottleneck: in conventional intelligent detection and data storage, data must be buffered by the CPU; facing the massive data transfers of ultra-high-resolution images, even a DMA channel exceeds the CPU's performance limit. 4. Power consumption and size: an architecture using a graphics card plus a GPU coprocessor is bulky, heavy, and power-hungry.
While an SoC with an integrated NPU can achieve smaller size and lower power consumption, the CPU and NPU computing capacity of such an SoC is relatively low. It is generally used where intelligent detection and recognition performance requirements are modest, such as conventional video scenes, and its computing power cannot meet the image-processing needs of ultra-high-resolution aerial UAVs.
Disclosure of Invention
In order to solve the above problems, the embodiment of the application provides a high-resolution image detection and storage method and system.
In a first aspect, the present application provides a method for detecting and storing a high-resolution image, including the following steps:
S1: the image acquisition processing module receives image data and stores it into a plurality of logic channels within the module; the image acquisition processing module preprocesses the image data, calculates in real time the amount of data buffered in each logic channel, and sends a target data handling notification to the intelligent scheduling module when the data amount in a target logic channel is not less than a preset value;
s2: the AI calculation module calculates the resource utilization rate of each NPU calculation unit in the AI calculation module in real time;
S3: when receiving the target data handling notification, the intelligent scheduling module notifies a target NPU computing unit whose idle rate meets a preset threshold, selected according to the resource utilization of each NPU;
S4: after the target NPU computing unit receives the notification instruction, the AI computing module controls it to transfer the target data from the FPGA into the NPU computing unit.
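The four steps above can be sketched as a minimal scheduling loop. This is an illustrative model only: the class names, the 4 MiB threshold, and the choice of the highest-idle-rate unit as the "preset idle rate" criterion are assumptions, not part of the patent.

```python
from dataclasses import dataclass

@dataclass
class NpuUnit:
    unit_id: int
    utilization: float = 0.0  # S2: resource utilization, sampled in real time

    @property
    def idle_rate(self) -> float:
        return 1.0 - self.utilization

@dataclass
class LogicChannel:
    channel_id: int
    buffered_bytes: int = 0
    threshold: int = 4 * 1024 * 1024  # S1: preset value (assumed 4 MiB)

class Scheduler:
    """S3: selects the NPU with the highest idle rate for the transfer."""

    def __init__(self, npus):
        self.npus = npus

    def on_handling_notification(self, channel):
        target = max(self.npus, key=lambda n: n.idle_rate)
        # S4: the AI module would now start the DMA from FPGA DDR to this NPU.
        return target.unit_id

npus = [NpuUnit(0, 0.9), NpuUnit(1, 0.3), NpuUnit(2, 0.6)]
ch = LogicChannel(channel_id=0, buffered_bytes=5 * 1024 * 1024)
sched = Scheduler(npus)
if ch.buffered_bytes >= ch.threshold:  # S1: data amount not less than preset value
    print("target NPU:", sched.on_handling_notification(ch))
```

Here unit 1 (utilization 0.30) has the highest idle rate, so it is selected as the target NPU computing unit.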
Further, in step S1, the image acquisition processing module receives the image data and stores the image data into each logic channel in the image acquisition processing module specifically includes:
The FPGA in the image acquisition processing module receives image data in real time and defines each logic channel according to the image type or the interface type, and each logic channel receives the image data and stores the image data in the DDR cache defined by each logic channel.
Further, in step S1, when the data amount in the target logic channel is not less than the preset value, sending a target data handling notification to the intelligent scheduling module specifically includes:
When the data amount in the target logic channel is not less than a preset value, the FPGA in the image acquisition processing module sends out a target data handling notification in interrupt mode; the parameters of the notification comprise the head address of the current data in the FPGA DDR, the valid data length, the logic channel ID, the original image ID, and the sliced image ID.
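The notification parameters listed above can be grouped into one record. The field names and example values below are illustrative assumptions; the patent only names the kinds of information carried.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HandlingNotification:
    ddr_base_addr: int  # head address of the current data in FPGA DDR
    valid_length: int   # number of valid bytes buffered
    channel_id: int     # logic channel that crossed the preset value
    image_id: int       # original image ID
    slice_id: int       # cut (sliced) image ID

# Example notification as it might be delivered with the FPGA interrupt.
note = HandlingNotification(
    ddr_base_addr=0x8000_0000,
    valid_length=0x40_0000,
    channel_id=2,
    image_id=17,
    slice_id=3,
)
print(hex(note.ddr_base_addr), note.valid_length, note.channel_id)
```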
Further, the step S3 specifically includes: the intelligent scheduling module firstly analyzes the parameters, and then informs the target NPU computing unit with the highest idle rate of the received preprocessed image address and related parameters through a register command.
Further, the step S4 specifically includes: after receiving the control command, the target NPU computing unit starts DMA and moves the DDR internal data of the FPGA to the DDR of the target NPU computing unit in a PCIE SWITCH bus addressing mode.
Further, step S4 further includes:
s5: the target NPU calculation unit carries out detection, identification and calculation on target data, and the AI calculation module feeds back the identification result to the intelligent scheduling module in an interrupt mode;
s6: the intelligent scheduling module sends a post-processing instruction to the FPGA according to a preset image post-processing scheme;
S7: the FPGA post-processes the original image or the preprocessed image cached in the DDR according to the recognition result; a new logic channel is defined for the post-processed image, and the data is stored in the DDR cache defined for that logic channel.
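The S5–S7 loop can be illustrated with a small dispatch table mapping a recognition result to the post-processing instruction sent to the FPGA. The result labels and the mapping itself are assumptions; only the operation names (cropping, OSD, scaling) come from the post-processing examples given in the description.

```python
# S6: preset image post-processing scheme, modeled as a lookup table.
POST_PROCESSING_PLAN = {
    "target_found": "crop",    # crop around the detected target
    "target_tracked": "osd",   # overlay detection boxes on the image
    "no_target": "scale",      # downscale before archiving
}

def post_processing_instruction(recognition_result: str) -> str:
    """Map an S5 recognition result to the instruction sent to the FPGA."""
    return POST_PROCESSING_PLAN.get(recognition_result, "scale")

print(post_processing_instruction("target_found"))
```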
Further, receiving the preprocessed image address and related parameters specifically includes: upon receiving the FPGA data handling notification, first parsing the parameters, then finding the file corresponding to the channel according to the logic channel ID, and finally calling an extension interface of the file system to pass in the FPGA DDR head address and the valid length.
In a second aspect, an embodiment of the present application provides a high resolution image detection storage system, including an image acquisition processing module, an AI computing module, and an intelligent scheduling module, where the image acquisition processing module includes an FPGA and a plurality of logic channels, and the AI computing module includes a plurality of NPU computing units;
the image acquisition processing module is used for receiving the image data and storing the image data into a plurality of logic channels in the image acquisition processing module, preprocessing the image data, calculating the data quantity cached in each logic channel in real time, and sending a target data carrying notification to the intelligent scheduling module when the data quantity in the target logic channel is not smaller than a preset value;
The AI calculation module is used for calculating the resource utilization rate of each NPU calculation unit in real time;
The image acquisition processing module is used for sending a target data carrying notification to the intelligent scheduling module when the data volume in the target logic channel is not smaller than a preset value;
The intelligent scheduling module is configured to notify, when receiving the target data handling notification, a target NPU computing unit whose idle rate meets a preset threshold, selected according to the resource utilization of each NPU;
And the AI calculation module is used for controlling the target NPU calculation unit to carry the target data in the FPGA into the NPU calculation unit after the target NPU calculation unit receives the notification instruction.
In a third aspect, an embodiment of the present application provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method as provided in the first aspect or any one of the possible implementations of the first aspect when the computer program is executed.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of a method as provided by the first aspect or any one of the possible implementations of the first aspect.
The beneficial effects of the application are as follows: the application integrates high-resolution image acquisition, processing, intelligent detection, and high-speed storage. Data buffered in the image acquisition and processing module is forwarded directly to the AI computing module, so the data transmission path bypasses the intelligent scheduling module. This reduces the resource consumption of the intelligent scheduling module, greatly improves data storage performance, and allows more NPU computing units to be added for parallel computation, improving the real-time scheduling of the intelligent detection system; at the same time, even with various power-reduction measures applied to the device, the intelligent recognition processing flow remains unaffected.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. The drawings described below represent only some embodiments of the present application; other drawings may be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a schematic flow chart of a high-resolution image detection and storage method according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a high-resolution image detection storage system according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a conventional high-resolution image detection and storage system;
FIG. 5 is a schematic diagram of a conventional high-resolution image detection and storage system in the process of image acquisition, preprocessing, detection and identification and storage;
fig. 6 is a schematic architecture diagram of a high-resolution image detection storage system according to an embodiment of the present application;
Fig. 7 is a schematic diagram of a data flow of a high-resolution image detection and storage system in data acquisition, storage and detection according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a data flow of a post-processing image in a recognition result of a high resolution image detection storage system according to an embodiment of the present application;
in the figure, a 201-image acquisition processing module, a 202-AI calculation module and a 203-intelligent scheduling module.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application.
In the following description, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The following description provides various embodiments of the application that may be substituted for or combined with one another, and the application is therefore to be considered as embracing all possible combinations of the same and/or different embodiments described. Thus, if one embodiment includes features A, B, and C, and another embodiment includes features B and D, the application should also be seen as embracing one or more of all other possible combinations of A, B, C, and D, even though such an embodiment may not be explicitly recited below.
The following description provides examples and does not limit the scope, applicability, or examples set forth in the claims. Changes may be made in the function and arrangement of elements described without departing from the scope of the application. Various examples may omit, replace, or add various procedures or components as appropriate. For example, the described methods may be performed in a different order than described, and various steps may be added, omitted, or combined. Furthermore, features described with respect to some examples may be combined into other examples.
Please refer to figs. 1, 6, 7, and 8. Fig. 1 is a schematic flow chart of a high-resolution image detection and storage method according to an embodiment of the present application; fig. 6 is a schematic architecture diagram of the high-resolution image detection and storage system according to an embodiment of the present application; fig. 7 is a schematic diagram of the system's data flow during data acquisition, storage, and detection; and fig. 8 is a schematic diagram of the data flow for post-processing an image according to the recognition result. In an embodiment of the application, the method comprises the following steps:
S1: the image acquisition processing module 201 receives image data and stores it into a plurality of logic channels within the module; the image acquisition processing module 201 preprocesses the image data, calculates in real time the amount of data buffered in each logic channel, and sends a target data handling notification to the intelligent scheduling module 203 when the data amount in a target logic channel is not less than a preset value;
S2: the AI calculation module 202 calculates the resource utilization of each NPU calculation unit in real time;
S3: when receiving the target data handling notification, the intelligent scheduling module 203 notifies a target NPU computing unit whose idle rate meets a preset threshold, selected according to the resource utilization of each NPU;
S4: after the target NPU computing unit receives the notification instruction, the AI calculation module 202 controls it to transfer the target data from the FPGA into the NPU computing unit.
In the embodiment of the application, with the development of aviation cameras and the rapid progress of artificial-intelligence technology, users place more demands on the analysis and processing of aerial image data, and the performance requirements for aerial image detection and recognition, data analysis and processing, and data storage keep rising. The application ensures the recognition rate, real-time detection and recognition performance, real-time data retrieval and analysis performance, and high-speed data storage for high-resolution aerial image targets, while offering light weight, low power consumption, and a small footprint.
In the embodiment of the application, with scientific and technological innovation, aviation cameras have advanced greatly: sensor resolution, shooting frame rate, sensitivity, and response speed have all improved substantially, and the performance requirements for processing and analyzing the resulting massive image data have risen accordingly. Onboard intelligent detection equipment is embedded, single-unit equipment; when processing and computing high-resolution images it is constrained by limited computing resources, power consumption, size, CPU resource utilization, and excessive memory-bandwidth consumption, which restricts full use of the system's peripheral computing resources. Referring to fig. 4, fig. 4 is a schematic architecture diagram of an existing high-resolution image detection and storage system, which includes an image acquisition processing module, a resource scheduling module, an AI computing unit, and a storage module. The image acquisition processing module generally consists of an FPGA or a dedicated image-processing chip plus peripheral chips; the resource scheduling module consists of a CPU, DDR memory, an operating system, drivers, and application software; the AI computing unit consists of one or more NPUs depending on the computing scale and complexity; and the storage module may consist of one or more SSDs depending on the storage scale, generally NVMe-protocol SSDs. Referring to fig. 5, fig. 5 is a schematic diagram of the image acquisition, preprocessing, detection, recognition, and storage flow in the conventional high-resolution image detection and storage system: after an image is acquired, it is preprocessed by a hardware module and split into raw data and preprocessed image data; both are DMA-transferred into the resource-scheduling software's cache; the preprocessed image is forwarded to the NPU computing module for detection and recognition, while the remaining preprocessed and raw image data are written to the data storage module according to the data storage rules.
In one embodiment, the image capturing processing module 201 in step S1 receives the image data and stores the image data into each logic channel therein specifically includes:
the FPGA in the image acquisition processing module 201 receives image data in real time and defines each logic channel according to the image type or the interface type, and each logic channel receives the image data and stores the image data in the DDR cache defined by each logic channel.
In one embodiment, the sending the target data handling notification to the intelligent scheduling module 203 in step S1 when the data size in the target logical channel is not smaller than the preset value specifically includes:
When the data amount in the target logic channel is not less than the preset value, the FPGA in the image acquisition processing module 201 sends out a target data handling notification in interrupt mode; the notification parameters comprise the head address of the current data in the FPGA DDR, the valid data length, the logic channel ID, the original image ID, and the sliced image ID.
In one embodiment, step S3 specifically includes: the intelligent scheduling module 203 analyzes the parameters first, and then notifies the target NPU computing unit with the highest idle rate of the received preprocessed image address and related parameters through a register command.
In one embodiment, step S4 specifically includes: after receiving the control command, the target NPU computing unit starts DMA and moves the data in the FPGA DDR to the target NPU computing unit's DDR via PCIe-switch bus addressing.
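The move can be modeled as a flat copy between two address spaces: through the PCIe switch, the FPGA's DDR appears in the NPU's bus address space, so the NPU's DMA engine can read it directly. The sketch below models the two memories as byte arrays; real hardware would instead program bus addresses into DMA descriptors.

```python
def dma_move(fpga_ddr: bytearray, src_addr: int, length: int,
             npu_ddr: bytearray, dst_addr: int) -> None:
    """Model of the DMA: copy `length` bytes from FPGA DDR into NPU DDR."""
    npu_ddr[dst_addr:dst_addr + length] = fpga_ddr[src_addr:src_addr + length]

# FPGA DDR with an image tile buffered at offset 16 (illustrative layout).
fpga_ddr = bytearray(b"\x00" * 16 + b"IMAGE_TILE" + b"\x00" * 6)
npu_ddr = bytearray(32)

dma_move(fpga_ddr, src_addr=16, length=10, npu_ddr=npu_ddr, dst_addr=0)
print(npu_ddr[:10].decode())
```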
In one embodiment, step S4 further comprises:
s5: the target NPU calculation unit performs detection, identification and calculation on the target data, and the AI calculation module 202 feeds back the identification result to the intelligent scheduling module 203 in an interrupt mode;
S6: the intelligent scheduling module 203 sends a post-processing instruction to the FPGA according to a preset image post-processing scheme;
S7: the FPGA post-processes the original image or the preprocessed image cached in the DDR according to the recognition result; a new logic channel is defined for the post-processed image, and the data is stored in the DDR cache defined for that logic channel.
In one embodiment, receiving the preprocessed image address and related parameters specifically includes: upon receiving the FPGA data handling notification, first parsing the parameters, then finding the file corresponding to the channel according to the logic channel ID, and finally calling an extension interface of the file system to pass in the FPGA DDR head address and the valid length.
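This sequence can be sketched as follows, with `fs_write_extent` standing in for the file system's extension interface (a hypothetical name; the patent does not name the call, and the channel-to-file mapping shown is invented for illustration). The point of the design is that only the FPGA DDR head address and valid length travel through software; the data itself never makes a CPU-side copy.

```python
# Hypothetical binding of logic channel IDs to their per-channel files.
CHANNEL_FILES = {0: "/data/raw_ch0.bin", 1: "/data/pre_ch1.bin"}

def fs_write_extent(path: str, ddr_addr: int, length: int) -> tuple:
    # Stand-in for the file-system extension interface: a real driver would
    # submit an IO command carrying the FPGA DDR head address so the SSD
    # DMAs the data directly, without buffering it in CPU memory.
    return (path, ddr_addr, length)

def on_fpga_notification(channel_id: int, ddr_addr: int, length: int) -> tuple:
    path = CHANNEL_FILES[channel_id]  # file found from the logic channel ID
    return fs_write_extent(path, ddr_addr, length)

print(on_fpga_notification(1, 0x8010_0000, 0x20_0000))
```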
In the embodiment of the application, high-definition image acquisition is implemented in the FPGA, image preprocessing (cropping, noise reduction, defogging, and the like) is performed with heterogeneous FPGA resources, and CPU software schedules multiple NPU resources. Raw image data and processed image data are stored or forwarded to the NPU directly by the FPGA, without being buffered by the CPU, which further reduces CPU resource consumption and greatly improves data storage performance. More NPU modules can also be added for parallel computation, improving the real-time scheduling of the intelligent detection system; at the same time, even with various power-reduction measures applied to the device, the intelligent recognition processing flow remains unaffected.
In the embodiment of the application, with the development of aviation reconnaissance technology, the computing and storage tasks of onboard intelligent detection, recognition, and storage systems keep growing, placing higher requirements on device computing power and storage resources; in load-constrained environments, higher requirements are also placed on device size and power consumption. Improving the computing and storage capability of the equipment under the existing resource conditions is therefore of great significance. In the architecture of the intelligent detection system, a data transmission path is newly added so that data buffered in the FPGA can be forwarded directly to the storage and to the NPU, keeping the data transmission path off the CPU.
In the embodiment of the application, because the volume of aerial image data is large, data transmission consumes a large amount of CPU computing resources and DDR bandwidth. The data transmission path of the FPGA image-acquisition DMA architecture is therefore modified: instead of moving the raw and preprocessed image data from the FPGA DDR into CPU memory and then from the CPU to the storage and the NPU, the image data is moved from the FPGA DDR directly to the SSD storage or the NPU. The specific implementation process is as follows:
1) The FPGA receives image data in real time and defines each logic channel according to the image type or interface type. Each channel receives image data and stores it in the DDR cache defined for that logic channel;
2) The FPGA receives image data in real time and preprocesses the images (not limited to noise reduction, defogging, image bit-width conversion, scaling, slicing, etc.). While the FPGA is receiving image data, once the received data reaches the configured slice-height amount, the completed portion of the image can be preprocessed (the image is preprocessed while still being received). The preprocessed image is distributed in two ways: one copy goes to the data-recording logic channel cache, the other to the intelligent detection and recognition logic channel cache;
3) The FPGA calculates the buffered data amount of each logic channel in real time; when it reaches the transfer threshold, the FPGA notifies the intelligent scheduling module 203 in interrupt mode, with parameters including: the head address of the current data in the FPGA DDR, the valid data length, the logic channel ID, the original image ID, the sliced image ID, and other information;
4) The intelligent scheduling module 203 queries in real time the resource utilization of each NPU computing unit in the AI computing module 202. When it receives the FPGA data handling notification, the intelligent scheduling module 203 first parses the parameters and then, through a register command, notifies the NPU computing unit with the highest idle rate of the received preprocessed-image address and related parameters;
5) After receiving the control command, the NPU computing unit starts DMA and moves the data in the FPGA DDR to the NPU computing unit's DDR via PCIe-switch bus addressing, after which the NPU computing unit starts detection and recognition computation.
6) After the NPU computing unit finishes detecting and recognizing one frame of image, it feeds the detection and recognition result back to the intelligent scheduling module 203 in interrupt mode. The intelligent scheduling module 203 then issues a post-processing instruction to the FPGA according to the preset image post-processing scheme; the FPGA post-processes (cropping, OSD, scaling, etc.) the raw or preprocessed image cached in the DDR according to the recognition result, and a new logic channel is defined for the post-processed image, whose data is stored in that channel's DDR cache.
7) When it receives the FPGA data handling notification, the intelligent scheduling module 203 first parses the parameters, then finds the file corresponding to the channel according to the logic channel ID, and finally calls an extension interface of the file system to pass in the FPGA DDR head address and the valid length;
8) The kernel file system forms a single write operation according to the logical address, and the storage driver submits an IO command to the storage device; the parameters of the IO command include the FPGA DDR head address of the data, the valid length, and the storage logical address;
9) The storage device receives the IO command, starts DMA, and moves the data in the FPGA DDR to the corresponding block address via PCIe-switch bus addressing, completing the write to storage.
10) The FPGA monitors the image acquisition and image processing state in real time. When a complete frame has been acquired or a frame has been processed, it notifies the application software in interrupt mode; the application software builds index information from the interrupt information (image time, image type, sequence number, image size, bit width, image target information, corresponding file path, file name, file offset, and the like) and writes it into a database in real time.
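Step 10) can be illustrated with a minimal index schema written to a database in real time. The column set below is an assumption drawn from the kinds of information the step lists, and SQLite stands in for whatever database the device actually uses.

```python
import sqlite3
from dataclasses import dataclass, astuple

@dataclass
class ImageIndex:
    """Illustrative index record built from the FPGA interrupt information."""
    image_time: str
    image_type: str
    sequence_no: int
    width: int
    height: int
    bit_width: int
    file_path: str
    file_offset: int

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE image_index (
    image_time TEXT, image_type TEXT, sequence_no INTEGER,
    width INTEGER, height INTEGER, bit_width INTEGER,
    file_path TEXT, file_offset INTEGER)""")

# Write one index record as a frame-complete interrupt would trigger it.
rec = ImageIndex("2023-11-15T10:00:00", "raw", 42, 8192, 8192, 12,
                 "/data/raw_ch0.bin", 0)
db.execute("INSERT INTO image_index VALUES (?,?,?,?,?,?,?,?)", astuple(rec))

count = db.execute("SELECT COUNT(*) FROM image_index").fetchone()[0]
print(count)
```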
A high-resolution image detection and storage system according to an embodiment of the present application will be described in detail with reference to fig. 2. It should be noted that, for convenience of description, only the portions relevant to the embodiment of the present application are shown; for specific technical details not disclosed here, please refer to the method embodiment of the present application shown in fig. 1.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a high-resolution image detection storage system according to an embodiment of the application. As shown in fig. 2, the system includes an image acquisition processing module 201, an AI computing module 202, and an intelligent scheduling module 203, wherein the image acquisition processing module 201 includes an FPGA and a plurality of logic channels, and the AI computing module 202 includes a plurality of NPU computing units;
The image acquisition processing module 201 is configured to receive image data and store it in its plurality of logic channels, preprocess the image data, calculate the amount of data buffered in each logic channel in real time, and send a target data handling notification to the intelligent scheduling module 203 when the amount of data in a target logic channel is not less than a preset value;
The AI computing module 202 is configured to calculate the resource utilization of each NPU computing unit in real time;
The intelligent scheduling module 203 is configured to, upon receiving the target data handling notification, notify a target NPU computing unit whose idle rate meets the preset value, selected according to the resource utilization of each NPU;
The AI computing module 202 is configured to control the target NPU computing unit, after it receives the notification instruction, to transfer the target data from the FPGA into the NPU computing unit.
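The scheduling decision described above — notify an NPU whose idle rate meets the preset value — can be sketched in a few lines of Python. The function name, the utilization representation, and the tie-breaking rule (prefer the most idle unit) are illustrative assumptions, not the patent's exact policy.

```python
def pick_target_npu(utilization, idle_threshold=0.5):
    """Given per-NPU resource utilization in [0, 1], return the ID of the unit
    whose idle rate (1 - utilization) meets the preset threshold, preferring
    the most idle unit; return None if no unit qualifies."""
    idle = {npu: 1.0 - u for npu, u in utilization.items() if 1.0 - u >= idle_threshold}
    if not idle:
        return None  # all NPUs busy: the handling notification must wait or be queued
    return max(idle, key=idle.get)
```

Keeping this decision in the scheduling module means the FPGA only reports buffer state and the NPUs only report utilization; neither side needs global knowledge.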
In the embodiment of the present application, the image acquisition processing module 201 includes an FPGA, a dedicated image processing chip, peripheral chips, and an image processing logic unit, and is responsible for image acquisition, preprocessing, post-processing, interface conversion, and image output. The intelligent scheduling module 203 includes a CPU, DDR, an operating system, drivers, and resource scheduling software, and is responsible for intelligent pipeline resource scheduling, data flow control, external communication, and parameter setting. The AI computing module 202 includes a storage unit and a plurality of NPU computing units organized as parallel computing arrays; combining NPU units into arrays increases the rate of intelligent detection and recognition. The storage unit includes one or more SSDs; to sustain the storage rate, NVMe-protocol SSDs are generally used, and multiple SSDs may be combined in RAID to further increase the rate.
In the embodiment of the application, compared with existing intelligent detection and storage devices, the application has the following advantages:
1) High-speed storage is realized in a simple manner, reducing its dependence on a high-performance CPU. At the same time, the original characteristics of a general-purpose file system are retained: the resulting data files can be stored at high speed and accessed through a standard file access interface, which facilitates storage management.
2) Compared with the conventional approach, the original image is received and preprocessed, and the preprocessed image is forwarded directly to the NPU for detection and recognition computation without being buffered by the CPU; parallel computation across the cascaded multi-NPU modules improves the real-time performance of detection and recognition.
3) The recognition result and the post-processed image are stored in the database in association with each other, so the target image can be retrieved quickly after the event. Presentation and playback of the data do not require re-processing the images: the processed images can be forwarded directly, reducing system response time and ensuring real-time post-event retrieval and analysis.
4) Compared with the conventional approach, the original and processed image data are stored or forwarded to the NPU computing units directly by the FPGA without CPU buffering, which greatly reduces the load that data interaction places on CPU computing resources and DMA bandwidth; the system can therefore scale to more NPU computing units and storage disks, raising the upper limit of system performance.
5) Because CPU resource occupation is low, the system can use a lower-power embedded processor for data-flow scheduling and management, and can additionally lower the processor clock frequency or disable some processor cores. The reduced power consumption effectively shrinks the device's heat dissipation area and power supply size, making the equipment lightweight.
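Advantage 3) relies on the index database to locate a stored frame without re-processing it. A hedged Python sketch of such a lookup follows, using SQLite with a hypothetical `frame_index` schema whose columns are derived from the fields listed in the description (table and column names are illustrative):

```python
import sqlite3

# Minimal index with sample rows; the real system populates this per frame interrupt.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE frame_index (
    image_time TEXT, image_type TEXT, seq_no INTEGER,
    file_path TEXT, file_name TEXT, file_offset INTEGER)""")
conn.executemany(
    "INSERT INTO frame_index VALUES (?, ?, ?, ?, ?, ?)",
    [("2023-11-15T10:00:00", "raw",       1, "/data", "raw.bin",  0),
     ("2023-11-15T10:00:01", "raw",       2, "/data", "raw.bin",  1048576),
     ("2023-11-15T10:00:01", "processed", 1, "/data", "post.bin", 0)])

def find_frames(conn, image_type, t0, t1):
    """Locate frames by type and time window; returns (path, name, offset) tuples,
    so playback can read the already-processed image bytes directly from the file."""
    cur = conn.execute(
        "SELECT file_path, file_name, file_offset FROM frame_index "
        "WHERE image_type = ? AND image_time BETWEEN ? AND ? ORDER BY seq_no",
        (image_type, t0, t1))
    return cur.fetchall()
```

The query returns file coordinates rather than pixels, which is what lets playback skip the image-processing stage entirely.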
It will be clear to those skilled in the art that the technical solutions of the embodiments of the present application may be implemented by means of software and/or hardware. "Unit," "module," "portion," or "part" in this specification refers to software and/or hardware capable of performing a particular function, either alone or in combination with other components, where the hardware may be, for example, a Field-Programmable Gate Array (FPGA), an Integrated Circuit (IC), or the like.
The processing units and/or modules of the embodiments of the present application may be implemented by an analog circuit that implements the functions described in the embodiments of the present application, or may be implemented by software that executes the functions described in the embodiments of the present application.
Referring to fig. 3, a schematic structural diagram of an electronic device according to an embodiment of the present application is shown, where the electronic device may be used to implement the method in the embodiment shown in fig. 1. As shown in fig. 3, the electronic device 300 may include: at least one central processor 301, at least one network interface 304, a user interface 303, a memory 305, at least one communication bus 302.
Wherein the communication bus 302 is used to enable connected communication between these components.
The user interface 303 may include a display screen (Display) and a camera (Camera); optionally, the user interface 303 may further include a standard wired interface and a wireless interface.
The network interface 304 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
The central processor 301 may comprise one or more processing cores. Using various interfaces and lines, the central processor 301 connects the various parts of the electronic device 300, and performs the device's functions and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 305 and invoking data stored in the memory 305. Optionally, the central processor 301 may be implemented in at least one hardware form among Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The central processor 301 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, etc.; the GPU renders and draws the content to be shown on the display screen; the modem handles wireless communications. It will be appreciated that the modem may also not be integrated into the central processor 301 and may instead be implemented by a separate chip.
The memory 305 may include Random Access Memory (RAM) or Read-Only Memory (ROM). Optionally, the memory 305 includes a non-transitory computer-readable storage medium. The memory 305 may be used to store instructions, programs, code, code sets, or instruction sets. It may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the above method embodiments, and so on; the data storage area may store the data involved in the above method embodiments. Optionally, the memory 305 may also be at least one storage device located remotely from the central processor 301. As shown in fig. 3, the memory 305, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and program instructions.
In the electronic device 300 shown in fig. 3, the user interface 303 is mainly used for providing an input interface for a user, and acquiring data input by the user; and the central processor 301 may be configured to invoke a high resolution image detection storage application stored in the memory 305 and specifically perform the following operations:
S1: the image acquisition processing module 201 receives image data and stores it in a plurality of logic channels within the module; the image acquisition processing module 201 preprocesses the image data, calculates the amount of data buffered in each logic channel in real time, and sends a target data handling notification to the intelligent scheduling module 203 when the amount of data in a target logic channel is not less than a preset value;
S2: the AI computing module 202 calculates the resource utilization of each NPU computing unit in real time;
S3: upon receiving the target data handling notification, the intelligent scheduling module 203 notifies a target NPU computing unit whose idle rate meets the preset value, according to the resource utilization of each NPU;
S4: the AI computing module 202 controls the target NPU computing unit, after it receives the notification instruction, to transfer the target data from the FPGA into the NPU computing unit.
The present application also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method. The computer-readable storage medium may include, among other things, any type of disk including floppy disks, optical disks, DVDs, CD-ROMs, micro-drives, and magneto-optical disks, ROM, RAM, EPROM, EEPROM, DRAM, VRAM, flash memory devices, magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical functional division, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through service interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable memory. Based on this understanding, the essence of the technical solution of the present application, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method described in the embodiments of the present application. The aforementioned memory includes: a USB flash drive, Read-Only Memory (ROM), Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be completed by hardware associated with a program; the program may be stored in a computer-readable memory, which may include: a flash disk, Read-Only Memory (ROM), Random Access Memory (RAM), a magnetic disk, an optical disk, or the like.
The foregoing is merely exemplary embodiments of the present disclosure and is not intended to limit its scope; equivalent changes and modifications made according to the teachings of this disclosure fall within its scope. Other embodiments of the disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles, including such departures from the present disclosure as come within known or customary practice in the art. It is intended that the specification and examples be considered exemplary only, with the true scope and spirit of the disclosure being indicated by the claims.
Claims (10)
1. A high resolution image detection and storage method, comprising the steps of:
S1: the image acquisition processing module (201) receives image data and stores it in a plurality of logic channels within the module; the image acquisition processing module (201) preprocesses the image data, calculates the amount of data buffered in each logic channel in real time, and sends a target data handling notification to the intelligent scheduling module (203) when the amount of data in a target logic channel is not less than a preset value;
S2: the AI computing module (202) calculates the resource utilization of each NPU computing unit within the module in real time;
S3: upon receiving the target data handling notification, the intelligent scheduling module (203) notifies a target NPU computing unit whose idle rate meets the preset value, according to the resource utilization of each NPU;
S4: the AI computing module (202) controls the target NPU computing unit, after it receives the notification instruction, to transfer the target data from the FPGA into the NPU computing unit.
2. The high-resolution image detection and storage method according to claim 1, wherein the image acquisition processing module (201) in step S1 receiving the image data and storing it in each logic channel specifically comprises:
The FPGA in the image acquisition processing module (201) receives image data in real time, defines each logic channel according to the image type or interface type, and stores the received image data in the DDR buffer defined by each logic channel.
3. The high-resolution image detection and storage method according to claim 1 or 2, wherein in step S1, sending a target data handling notification to the intelligent scheduling module (203) when the amount of data in the target logic channel is not less than a preset value specifically comprises:
When the amount of data in the target logic channel is not less than the preset value, the FPGA in the image acquisition processing module (201) issues a target data handling notification in interrupt mode, wherein the parameters of the notification include the DDR start address of the current data in the FPGA, the valid data length, the logic channel ID, the original image ID, and the cropped image ID.
4. The high-resolution image detection and storage method according to claim 1 or 2, wherein step S3 specifically comprises: the intelligent scheduling module (203) first parses the parameters, and then notifies the target NPU computing unit with the highest idle rate of the received preprocessed image address and related parameters via a register command.
5. The high-resolution image detection and storage method according to claim 1 or 2, wherein step S4 specifically comprises: after receiving the control command, the target NPU computing unit starts DMA and moves the data from the FPGA DDR to the DDR of the target NPU computing unit via PCIe switch bus addressing.
6. The high-resolution image detection and storage method according to claim 1 or 2, further comprising, after step S4:
S5: the target NPU computing unit performs detection and recognition computation on the target data, and the AI computing module (202) feeds the recognition result back to the intelligent scheduling module (203) in interrupt mode;
S6: the intelligent scheduling module (203) sends a post-processing instruction to the FPGA according to a preset image post-processing scheme;
S7: the FPGA post-processes the original or preprocessed image buffered in the DDR according to the recognition result; the post-processed image defines a new logic channel, and the data is stored in the DDR buffer defined by that logic channel.
7. The method of claim 4, wherein receiving the preprocessed image address and related parameters comprises: upon receiving the data handling notification from the FPGA, first parsing the parameters, then finding the file corresponding to the channel according to the logic channel ID, and finally calling an extension interface of the file system to pass in the FPGA DDR start address and the valid length.
8. A high resolution image detection storage system, characterized by: the intelligent scheduling system comprises an image acquisition processing module (201), an AI computing module (202) and an intelligent scheduling module (203), wherein the image acquisition processing module (201) comprises an FPGA and a plurality of logic channels, and the AI computing module (202) comprises a plurality of NPU computing units;
The image acquisition processing module (201) is configured to receive image data and store it in a plurality of logic channels within the module, preprocess the image data, calculate the amount of data buffered in each logic channel in real time, and send a target data handling notification to the intelligent scheduling module (203) when the amount of data in a target logic channel is not less than a preset value;
The AI computing module (202) is configured to calculate the resource utilization of each NPU computing unit in real time;
The intelligent scheduling module (203) is configured to, upon receiving the target data handling notification, notify a target NPU computing unit whose idle rate meets the preset value, according to the resource utilization of each NPU;
The AI computing module (202) is configured to control the target NPU computing unit, after it receives the notification instruction, to transfer the target data from the FPGA into the NPU computing unit.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any of claims 1-7.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311518964.0A CN117934261A (en) | 2023-11-15 | 2023-11-15 | High-resolution image detection and storage method and system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN117934261A true CN117934261A (en) | 2024-04-26 |
Family
ID=90761847
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202311518964.0A Pending CN117934261A (en) | 2023-11-15 | 2023-11-15 | High-resolution image detection and storage method and system |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117934261A (en) |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150178243A1 (en) * | 2013-12-20 | 2015-06-25 | Rambus Inc. | High level instructions with lower-level assembly code style primitives within a memory appliance for accessing memory |
| US20190171965A1 (en) * | 2017-12-01 | 2019-06-06 | Deepwave Digital, Inc. | Artificial intelligence radio transceiver |
| CN110751676A (en) * | 2019-10-21 | 2020-02-04 | 中国科学院空间应用工程与技术中心 | Heterogeneous computing system and method based on target detection and readable storage medium |
| CN112731302A (en) * | 2021-04-06 | 2021-04-30 | 湖南纳雷科技有限公司 | STM32 and FPGA-based reverse radar signal processing system and method |
| WO2022005856A1 (en) * | 2020-07-01 | 2022-01-06 | Sony Interactive Entertainment LLC | High-speed save data storage for cloud gaming |
| CN115857805A (en) * | 2022-11-30 | 2023-03-28 | 合肥腾芯微电子有限公司 | Artificial intelligence computable storage system |
- 2023-11-15: application CN202311518964.0A filed in CN; published as CN117934261A (status: Pending)
Non-Patent Citations (2)
| Title |
|---|
| THOMA Y et al.: "FPGA-GPU communicating through PCIe", Microprocessors & Microsystems, 31 July 2015, pages 1-12 |
| PAN Yinfei: "Research on FPGA Acceleration Technology for Feature Extraction in Visual Inspection", China Doctoral Dissertations Full-text Database, Information Science and Technology, 15 February 2022, pages 135-57 |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120162602A (en) * | 2025-05-19 | 2025-06-17 | 上海壁仞科技股份有限公司 | Attention data processing method, device, medium and program product |
| CN120162602B (en) * | 2025-05-19 | 2025-08-15 | 上海壁仞科技股份有限公司 | Attention data processing method, device, medium and program product |
| CN120744980A (en) * | 2025-08-14 | 2025-10-03 | 凯云联创(北京)科技有限公司 | Image processing-oriented FPGA parallel computing unit scheduling method and device |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP3830715B1 (en) | Storage aggregator controller with metadata computation control | |
| US10528481B2 (en) | Apparatus and method for managing storage of data blocks | |
| CN117934261A (en) | High-resolution image detection and storage method and system | |
| WO2021120789A1 (en) | Data writing method and apparatus, and storage server and computer-readable storage medium | |
| CN112445725B (en) | Method, device and terminal device for pre-reading file pages | |
| CN111143242A (en) | A cache prefetch method and device | |
| CN105393236B (en) | Requiring rapid data read/writing method and apparatus | |
| KR102502569B1 (en) | Method and apparuts for system resource managemnet | |
| CN109413392A (en) | A kind of system and method for embedded type multichannel video image acquisition and parallel processing | |
| CN112954244B (en) | Method, device, equipment and storage medium for realizing storage of monitoring video | |
| CN117235088B (en) | Cache updating method, device, equipment, medium and platform of storage system | |
| CN110377527A (en) | A kind of method and relevant device of memory management | |
| US20190004968A1 (en) | Cache management method, storage system and computer program product | |
| WO2021190501A1 (en) | Data pre-fetching method and apparatus, and storage device | |
| WO2021204187A1 (en) | Layout analysis method and electronic device | |
| CN107278293B (en) | Sensor implementation device and method for virtual machine | |
| WO2019174206A1 (en) | Data reading method and apparatus of storage device, terminal device, and storage medium | |
| CN115840736A (en) | File sorting method, intelligent terminal and computer readable storage medium | |
| US10664952B2 (en) | Image processing method, and device, for performing coordinate conversion | |
| US10832132B2 (en) | Data transmission method and calculation apparatus for neural network, electronic apparatus, computer-readable storage medium and computer program product | |
| CN109583318A (en) | Medicinal plant recognition methods, device and computer equipment | |
| CN110134807B (en) | Target retrieval method, device, system and storage medium | |
| CN115543937B (en) | File defragmentation method and electronic device | |
| CN113010454A (en) | Data reading and writing method, device, terminal and storage medium | |
| US12373955B2 (en) | System and method for storage management of images |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |