CN118524248A - Video processing method, device, equipment and medium - Google Patents
- Publication number
- CN118524248A (application CN202410588928.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- color space
- video
- preprocessing
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/426—Internal components of the client ; Characteristics thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/443—OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Software Systems (AREA)
- Image Processing (AREA)
Abstract
Embodiments of the present disclosure relate to a video processing method, apparatus, device and medium. The method is applied to an electronic device including a graphics processor and includes the following steps: acquiring a target video through the graphics processor; decoding the target video and performing color space conversion through the graphics processor to obtain color space data; preprocessing the color space data through the graphics processor to obtain preprocessed data, and storing the preprocessed data in a video memory; and reading the preprocessed data through a target software development kit (SDK) in the graphics processor. Embodiments of the present disclosure omit the steps of encoding, writing to disk, and reading data back from the hard disk, reduce the time spent copying data directly between main memory and video memory, greatly reduce processing time and related resource consumption, and thereby effectively meet the efficiency requirements of video processing.
Description
Technical Field
The present disclosure relates to the technical field of video processing, and in particular to a video processing method, apparatus, device and medium.
Background
With the continuous development of video processing technology, tasks such as recognition and conversion can be performed through deep learning algorithms. At present, video frames need to be extracted before video processing. The process of implementing video frame extraction on a graphics processor (Graphics Processing Unit, GPU) may include video decoding, color space conversion, encoding, and writing to disk, after which the data is read back from the hard disk for algorithmic processing.
Disclosure of Invention
In order to solve the technical problems, the present disclosure provides a video processing method, apparatus, device and medium.
The embodiments of the disclosure provide a video processing method applied to an electronic device including a graphics processor, the method comprising:
acquiring a target video through the graphics processor;
decoding the target video and performing color space conversion through the graphics processor to obtain color space data;
preprocessing the color space data through the graphics processor to obtain preprocessed data, and storing the preprocessed data in a video memory; and
reading the preprocessed data through a target software development kit (SDK) in the graphics processor.
The embodiments of the disclosure also provide a video processing apparatus disposed in an electronic device including a graphics processor, the apparatus comprising:
an acquisition module configured to acquire a target video through the graphics processor;
a decoding and conversion module configured to decode the target video and perform color space conversion through the graphics processor to obtain color space data;
a preprocessing module configured to preprocess the color space data through the graphics processor to obtain preprocessed data, and to store the preprocessed data in a video memory; and
a reading module configured to read the preprocessed data through a target software development kit (SDK) in the graphics processor.
The embodiments of the disclosure also provide an electronic device comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to read the executable instructions from the memory and execute them to implement the video processing method provided by the embodiments of the disclosure.
The present disclosure also provides a computer-readable storage medium storing a computer program for executing the video processing method as provided by the embodiments of the present disclosure.
Compared with the prior art, the technical solution provided by the embodiments of the disclosure has the following advantages. According to the video processing solution provided by the embodiments of the disclosure, a target video is acquired through a graphics processor; the target video is decoded and color-space converted through the graphics processor to obtain color space data; the color space data is preprocessed through the graphics processor to obtain preprocessed data, which is stored in a video memory; and the preprocessed data is read through a target software development kit (SDK) in the graphics processor. With this solution, when video frames are extracted through the graphics processor, the color space data obtained by decoding the target video and converting its color space is preprocessed, and the resulting preprocessed data is read directly for subsequent processing. The steps of encoding, writing to disk, and reading data back from the hard disk are thus omitted, the time spent copying data directly between main memory and video memory is reduced, processing time and related resource consumption are greatly reduced, and the efficiency requirements of video processing are effectively met.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a schematic diagram of a video processing process in the related art;
fig. 2 is a schematic flow chart of a video processing method according to an embodiment of the disclosure;
FIG. 3 is a schematic diagram of a video processing process according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of another video processing process provided by an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a video processing apparatus according to an embodiment of the disclosure;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms are given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that the modifiers "a", "an" and "a plurality of" in this disclosure are illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
Extracting video frames with a graphics processor reduces time consumption compared with extracting them with a central processing unit (Central Processing Unit, CPU). For example, FIG. 1 is a schematic diagram of a video processing process in the related art. As shown in FIG. 1, implementing video frame extraction on a graphics processor may include decoding the video, converting the color space, encoding the frames, and writing them to disk. The whole procedure runs serially, and the copies between main memory and video memory are performed independently at each step. Decoding one video frame, converting its color space, encoding it, and writing it to the hard disk takes about 4.58 ms; for a long video such as a 45-minute, 25 fps episode of a television series, extracting all 67500 frames takes 515.25 seconds. When the application scenario is a software development kit (Software Development Kit, SDK) that performs algorithmic processing on the images extracted from the video, the SDK further needs to use OpenCV (Open Source Computer Vision Library) to read each image from the hard disk and copy it from main memory to video memory, which takes about 25.65 ms per image and therefore cannot meet the requirements of efficient production. The related art reduces time consumption only by adding sub-threads when extracting image frames, but the effect is limited and still cannot meet the requirement of efficient processing.
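The scale of the problem follows directly from the figures quoted above. The following back-of-envelope sketch (an illustration, not a benchmark) computes the frame count and the cost of the read-back stage alone from the per-frame numbers given in the text:

```python
# Arithmetic on the figures quoted in the text for the related-art pipeline.
def total_frames(minutes: int, fps: int) -> int:
    """Number of frames in a video of the given length and frame rate."""
    return minutes * 60 * fps

def stage_seconds(frames: int, per_frame_ms: float) -> float:
    """Total seconds a serial per-frame stage takes."""
    return frames * per_frame_ms / 1000.0

frames = total_frames(45, 25)             # 67500 frames, as stated in the text
read_back = stage_seconds(frames, 25.65)  # OpenCV disk read + host-to-device copy
print(frames, read_back)                  # the read-back stage alone costs ~1731 s
```

This is why removing the disk round-trip, rather than merely threading it, dominates the savings.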
In order to solve the above-described problems, embodiments of the present disclosure provide a video processing method, which is described below with reference to specific embodiments.
Fig. 2 is a schematic flow chart of a video processing method according to an embodiment of the disclosure. The method may be performed by a video processing apparatus, which may be implemented in software and/or hardware and may generally be integrated in an electronic device. As shown in fig. 2, the method is applied to an electronic device including a graphics processor and includes:
Step 101: acquiring a target video through the graphics processor.
The video processing method of the embodiments of the disclosure may be executed by a graphics processor; owing to the computing power of the graphics processor, frame extraction saves considerable time compared with performing it on a central processing unit. A graphics processor is a processor dedicated to processing graphics and image data; it can rapidly process large amounts of graphics data through parallel processing, providing a smoother visual experience. The target video may be any video requiring frame extraction, and its source is not limited; for example, it may be a local video or a video from the Internet.
Specifically, the central processing unit may read the target video from the hard disk into main memory and copy it from main memory into the video memory; the graphics processor may then acquire the target video from the video memory and perform frame extraction on it.
Step 102: decoding the target video and performing color space conversion through the graphics processor to obtain color space data.
Decoding restores the video stream data read from the target video to original pixel data for subsequent color space conversion and other processing, and is an essential step in the video frame extraction process. The decoding process involves a decoder, which parses each element of the video data according to the video coding standard, such as image frame data, motion vectors, and chrominance information. The image frame data may include a plurality of image frames; acquiring it allows subsequent processing to analyze and handle each image frame.
Color space conversion converts the image frame data obtained after video decoding from one color space to another to meet specific requirements. Color spaces include RGB (red, green, blue) and YUV (luminance and chrominance), among others. YUV is one of the most commonly used color spaces, where Y represents the luminance component and U and V represent the chrominance (color difference) components. Color space conversion may include RGB-to-YUV or YUV-to-RGB conversion; the specific conversion algorithm and formula depend on the definitions of the color spaces. The purpose of color space conversion is to convert the color representation of the image frame data from one format to another for subsequent processing and display; the converted representation is better suited to the inputs of different algorithm models and to OpenCV processing.
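As a concrete illustration of such a conversion formula, the following sketch applies the widely used BT.601 full-range YUV-to-RGB coefficients per pixel. The patent does not fix a particular standard, so these coefficients are an assumption for illustration:

```python
# Per-pixel YUV -> RGB conversion using BT.601 full-range coefficients
# (an illustrative choice; the disclosure does not mandate a standard).
def yuv_to_rgb(y: float, u: float, v: float) -> tuple:
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda x: max(0, min(255, round(x)))  # keep results in 8-bit range
    return clamp(r), clamp(g), clamp(b)

print(yuv_to_rgb(128, 128, 128))  # neutral gray maps to (128, 128, 128)
```

A GPU implementation would apply the same arithmetic to every pixel in parallel, which is what makes this stage a good fit for the graphics processor.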
The color space data is the data obtained after decoding and color space conversion of the target video. Specifically, after acquiring the target video, the graphics processor may decode it to obtain image frame data, perform color space conversion on the image frame data to obtain the color space data, and store the color space data in the video memory.
Step 103: preprocessing the color space data through the graphics processor to obtain preprocessed data, and storing the preprocessed data in the video memory.
Preprocessing is image processing performed on the extracted frame data of the target video before subsequent processing; it can improve the efficiency and suitability of that subsequent processing. The preprocessing may include one or more of processing through DALI (NVIDIA's Data Loading Library) in the graphics processor, cropping, scaling, rotation, grayscale conversion, noise reduction, normalization, binarization, brightness adjustment, contrast adjustment, saturation adjustment, filtering, watermark removal, smoothing, sharpening, enhancement, and the like.
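Two of the listed operations, normalization and grayscale conversion, can be sketched as follows. This is a CPU illustration for clarity; in the pipeline itself these operations would run on the GPU (for example via DALI), and the mean/std and luma weights used here are common conventions assumed for the example:

```python
# Minimal CPU sketches of two preprocessing operations listed above.
def normalize(pixels, mean=127.5, std=127.5):
    """Map 0..255 pixel values to roughly [-1, 1] (common model-input convention)."""
    return [(p - mean) / std for p in pixels]

def to_gray(rgb):
    """Luma-weighted grayscale using the standard BT.601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

print(normalize([0, 255]))              # [-1.0, 1.0]
print(round(to_gray((255, 255, 255))))  # 255
```

The point of doing this on the GPU, rather than in the SDK, is that the data never has to leave video memory between stages.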
The graphics processor may extract the color space data from the video memory, preprocess it to obtain the preprocessed data, and copy the preprocessed data into the video memory.
Step 104: reading the preprocessed data through a target software development kit (SDK) in the graphics processor.
The target SDK is an SDK that performs target processing on the video frame data of the target video. The target processing may be performed by a preset algorithm or model, which is not limited here; for example, it may be a deep learning algorithm.
After the graphics processor obtains the preprocessed data, the data may be provided to the target SDK for target processing; the target SDK may deploy the algorithm model on the GPU to obtain better performance. Because this procedure eliminates encoding the color-space-converted data and writing it to disk, and instead preprocesses the data directly and provides it to the target SDK, the repeated copies between video memory and main memory are removed and time consumption is greatly reduced. Adding the preprocessing step also reduces the data processing burden on the target SDK running on the GPU and improves its suitability, so that the target SDK can process the data efficiently and conveniently.
For example, fig. 3 is a schematic diagram of a video processing process provided in an embodiment of the present disclosure. As shown in fig. 3, compared with the related-art video processing process of fig. 1, the frame extraction performed by the graphics processor in fig. 3 removes the encoding and disk-writing steps. In fig. 3, the video file is the target video, and the RGB byte stream is the color space data obtained by decoding the target video and converting its color space. The color space data may be provided directly to the target SDK, or the tensor data obtained by preprocessing the color space data and converting it to tensors may be provided to the target SDK, where preprocessing and tensor conversion are optional steps. Removing the encoding and disk-writing steps, the copies between video memory and main memory, and the reading of data from the hard disk greatly improves processing efficiency and reduces the consumption of related resources.
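The difference between the two pipelines can be summarized as a step diff. The step names below are schematic labels chosen for this illustration, not identifiers from the disclosure:

```python
# Schematic step lists for the two pipelines (labels are illustrative).
RELATED_ART = ["decode", "color_convert", "encode", "write_to_disk",
               "read_from_disk", "copy_host_to_device", "sdk_process"]
DISCLOSED = ["decode", "color_convert", "preprocess", "sdk_process"]

# Steps the disclosed pipeline eliminates entirely:
removed = [s for s in RELATED_ART if s not in DISCLOSED]
print(removed)  # ['encode', 'write_to_disk', 'read_from_disk', 'copy_host_to_device']
```

All four removed steps are exactly the ones that touch the hard disk or cross the host/device boundary, which is where the related art spends most of its per-frame time.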
The encoding referred to above is compression encoding of an image, for example JPEG encoding, which divides the image into 8x8 pixel blocks, performs a discrete cosine transform (Discrete Cosine Transform, DCT) on each block to extract the frequency-domain features of the image, and then compresses those features using quantization, entropy coding, and similar techniques to reduce the storage space of the image. JPEG encoding is widely used in image compression; it maintains image quality to some extent while achieving a high compression ratio. In related-art video frame extraction, each frame is JPEG-encoded to reduce the storage space it occupies, which facilitates subsequent storage and transmission. For most algorithms or models, lossless images are preferable, but the storage space occupied by the data after video decoding and color space conversion is extremely large: a single 1080p image occupies 6.22 MB, and storing all 67500 decoded frames of a typical 45-minute, 25 fps television episode directly would require 419.85 GB. Compression encoding is therefore needed in the related art to reduce the data volume and the storage footprint.
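The storage figures quoted above follow from simple arithmetic on an uncompressed 1080p RGB frame at one byte per channel:

```python
# Reproducing the storage estimate in the text for uncompressed RGB frames.
def frame_bytes(width=1920, height=1080, channels=3):
    """Size of one uncompressed frame at 1 byte per channel."""
    return width * height * channels

per_frame_mb = frame_bytes() / 1e6                # ~6.22 MB per 1080p frame
total_gb = frame_bytes() * 45 * 60 * 25 / 1e9     # all 67500 frames, ~420 GB
print(round(per_frame_mb, 2), round(total_gb, 2))
```

The disclosed pipeline avoids this storage cost without compression, because the frames are consumed directly from video memory rather than persisted.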
The disk-writing step refers to saving data to the computer's hard disk; for example, the JPG byte stream obtained by encoding may be saved to the hard disk. In related-art video frame extraction, the decoded, color-space-converted, and encoded data is saved to the hard disk for subsequent processing and storage. The hard disk is one of a computer's main storage media; it has a large capacity and reasonably fast read/write speeds and is suitable for storing large-scale video data. The disk-writing process must consider factors such as the storage format, storage path, and storage performance of the data to ensure that video frames can be saved effectively and read and processed quickly.
According to the video processing solution provided by the embodiments of the disclosure, a target video is acquired through a graphics processor; the target video is decoded and color-space converted through the graphics processor to obtain color space data; the color space data is preprocessed through the graphics processor to obtain preprocessed data, which is stored in a video memory; and the preprocessed data is read through the target software development kit (SDK) in the graphics processor. With this solution, when video frames are extracted through the graphics processor, the color space data obtained by decoding the target video and converting its color space is preprocessed after the target video is acquired, and the resulting preprocessed data is read directly for subsequent processing. The steps of encoding, writing to disk, and reading data back from the hard disk are thus omitted, the time spent copying data directly between main memory and video memory is reduced, processing time and related resource consumption are greatly reduced, and the efficiency requirements of video processing are effectively met.
In some embodiments, decoding the target video and performing color space conversion through the graphics processor to obtain the color space data includes: creating at least two sub-processes through a main process of the graphics processor, the at least two sub-processes including at least one decoding sub-process and at least one color space conversion sub-process; and decoding the target video and performing color space conversion through the at least two sub-processes to obtain the color space data, and storing the color space data in the video memory. Optionally, the target SDK runs in the main process of the graphics processor.
The main process may be the initial process started on the graphics processor; it may acquire the target video and may include the target SDK for subsequent processing of the extracted frame data. A sub-process is a process created by the main process; it may perform tasks different from those of the main process and may communicate with the main process. Since the main process and each of its sub-processes are allowed to access the video memory in the graphics processor, the sub-processes and the main process can communicate through the video memory.
After the main process of the graphics processor acquires the target video, it may create at least two sub-processes when starting the frame-extraction procedure, including at least one decoding sub-process and at least one color space conversion sub-process. The target video is decoded and color-space converted in parallel through these sub-processes to obtain the color space data, which is stored in the video memory; the main process may then extract the color space data from the video memory and provide it to the target SDK for subsequent processing.
Optionally, decoding the target video and performing color space conversion through the at least two sub-processes to obtain the color space data may include: decoding the target video through the at least one decoding sub-process to obtain image frame data and storing the image frame data in the video memory; and extracting the image frame data from the video memory through the at least one color space conversion sub-process and performing color space conversion to obtain the color space data.
The image frame data is one form of data obtained by decoding the target video and may include a plurality of image frames. When the at least two sub-processes include at least one decoding sub-process and at least one color space conversion sub-process, the graphics processor may store the target video in the video memory after acquiring it through the main process, extract the target video from the video memory through the at least one decoding sub-process, decode it to obtain the image frame data, and store the image frame data in the video memory; the at least one color space conversion sub-process then extracts the image frame data from the video memory, performs color space conversion to obtain the color space data, and stores the color space data in the video memory.
Fig. 4 is a schematic diagram of another video processing process provided in an embodiment of the present disclosure; it shows the video processing process after multi-process optimization. Compared with the serial decoding and color space conversion of fig. 1, creating at least one decoding sub-process and at least one color space conversion sub-process allows the three stages of decoding, color space conversion, and target SDK processing in the main process to proceed in parallel in batches, increasing the processing speed.
In the above technical solution, video frame extraction through the graphics processor can be implemented with multiple processes: the target video is acquired through the main process of the graphics processor, and at least two sub-processes are created in the graphics processor; the target video is decoded and color-space converted through the at least two sub-processes to obtain the color space data, which is stored in the video memory; and the target SDK of the main process extracts the color space data from the video memory for subsequent processing. Multiple stages can thus run in parallel and in batches, which further increases processing speed and improves processing efficiency.
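The multi-stage pipeline above can be sketched as a producer/consumer chain. In this portable illustration, Python threads and in-memory queues stand in for the patent's sub-processes and shared video memory; an actual implementation would use separate processes with GPU-resident buffers, and the decode and convert bodies here are placeholders:

```python
# Producer/consumer sketch of the decode and color-conversion stages
# running concurrently (threads and queues stand in for sub-processes
# and video memory; stage bodies are placeholders).
import queue
import threading

def decoder(video_frames, out_q):
    for f in video_frames:
        out_q.put(("decoded", f))   # placeholder for GPU decode
    out_q.put(None)                 # end-of-stream sentinel

def color_converter(in_q, out_q):
    while (item := in_q.get()) is not None:
        _, f = item
        out_q.put(("rgb", f))       # placeholder for color space conversion
    out_q.put(None)

decoded_q, rgb_q = queue.Queue(), queue.Queue()
threading.Thread(target=decoder, args=(range(5), decoded_q)).start()
threading.Thread(target=color_converter, args=(decoded_q, rgb_q)).start()

results = []                        # the main "process" consumes converted frames
while (item := rgb_q.get()) is not None:
    results.append(item)
print(results)
```

Because each stage starts consuming as soon as the previous stage produces, frame N can be converted while frame N+1 is still being decoded, which is the batching effect the solution describes.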
In some embodiments, the at least two sub-processes further include at least one preprocessing sub-process, and preprocessing the color space data to obtain the preprocessed data may include: extracting the color space data from the video memory through the at least one preprocessing sub-process and preprocessing it to obtain the preprocessed data.
As described above, preprocessing is image processing performed on the extracted frame data of the target video before subsequent processing, and it improves the efficiency and suitability of that processing. A preprocessing sub-process is a sub-process that performs image preprocessing on the color space data; the number of preprocessing sub-processes may be set according to the actual situation.
When the main process of the graphics processor creates the at least two sub-processes, they may also include at least one preprocessing sub-process. The at least one preprocessing sub-process extracts the color space data from the video memory and preprocesses it to obtain the preprocessed data, which is also stored in the video memory; the target SDK of the main process may then extract the preprocessed data from the video memory for subsequent processing.
In some embodiments, after storing the preprocessed data in the video memory, the video processing method may further include: converting the preprocessed data into tensor data through the graphics processor and storing the tensor data in the video memory; and reading the tensor data through the target SDK in the graphics processor.
Tensor data is a data structure used in deep learning that can adapt to different coordinate systems; it is a generalization of arrays and can have any number of dimensions. The target SDK includes a deep learning algorithm, and the data type of the tensor data corresponds to the algorithm type of that deep learning algorithm. The data type of the tensor data includes its data structure and specific dimensions, and it is determined according to the algorithm type of the deep learning algorithm of the target SDK.
After the graphics processor stores the preprocessing data in the video memory, it can perform tensor conversion, converting the preprocessing data into tensor data and storing it in the video memory. The tensor data is then read by the target SDK in the graphics processor and analyzed by its deep learning algorithm. Since the data type of the tensor data corresponds to the algorithm type of the deep learning algorithm, both the speed and the suitability of this analysis can be improved.
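The tensor conversion step can be illustrated as follows; the (N, H, W, C) layout, the dict-based tensor representation, and the helper name `to_tensor` are assumptions for illustration only — a real pipeline would build a framework tensor directly in video memory, with no copy to host memory.

```python
def to_tensor(frames):
    """Stack preprocessed frames into one (N, H, W, C) 'tensor',
    represented here as a flat data list plus a shape tuple."""
    n = len(frames)                # batch of frames
    h = len(frames[0])             # rows per frame
    w = len(frames[0][0])          # pixels per row
    c = len(frames[0][0][0])       # channels per pixel
    flat = [ch for frame in frames for row in frame for px in row for ch in px]
    return {"shape": (n, h, w, c), "data": flat}

# Two 2x2 RGB frames of preprocessed (normalized) pixel values.
frames = [
    [[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
     [[0.5, 0.5, 0.5], [0.2, 0.2, 0.2]]],
    [[[0.1, 0.1, 0.1], [0.3, 0.3, 0.3]],
     [[0.4, 0.4, 0.4], [0.6, 0.6, 0.6]]],
]
tensor = to_tensor(frames)
```

The shape tuple and flat layout stand in for the "data structure and specific data dimensions" that must match the deep learning algorithm's expected input type.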
In this scheme, deploying the preprocessing and tensor-conversion steps in the graphics processor can increase the speed of subsequent processing. At least one preprocessing subprocess can be set to perform the preprocessing in parallel with the other subprocesses, which greatly increases processing speed and improves resource utilization.
The video processing method provided by the embodiments of the present disclosure is further described below through a specific example. By way of example, the video processing procedure of the graphics processor may include the following. The main process starts the frame extraction process using a process starting method (MemProcess()); the video memory return state (MemReturnFlag) = 1 indicates successful initialization (InitSuccess), and the size of the video memory return state is returned (return MemReturnFlag size). The frame extraction process maintains a decoding state (ProcessStatus) = 1 and a decoded frame number (DecodeNum) = 0. While the decoding state equals 1, the frame extraction process sequentially decodes the target video through the at least one decoding subprocess to obtain video frame data and stores it in the video memory, then performs color space conversion on the video frame data in the video memory through the at least one color space conversion subprocess to obtain color space data, which is stored in the video memory. When the decoded frame number is less than (acquired frame number (GetNum) + 127) % 127, the color space data is copied from the video memory to the designated Data slot of the video memory (CopyToData); preprocessing through the at least one preprocessing subprocess may additionally be performed before this copy, and the decoded frame number is then incremented (DecodeNum++).
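The decode loop described above can be sketched roughly as follows. The function names and placeholder bodies are reconstructions from the example text, and the `(GetNum + 127) % 127` ring-buffer check is copied from it as written:

```python
RING = 127  # modulus from the example's (GetNum + 127) % 127 check

def decode_frame(video, idx):
    # Placeholder for a decoding subprocess: returns one "video frame".
    return {"frame": video[idx]}

def convert_color_space(frame):
    # Placeholder for a color-space-conversion subprocess (e.g. YUV -> RGB).
    return {"rgb": frame["frame"]}

def frame_extraction_process(video, get_num=0):
    process_status = 1          # decoding state (ProcessStatus)
    decode_num = 0              # decoded frame number (DecodeNum)
    data_slots = []             # designated Data region of the video memory
    while process_status == 1 and decode_num < len(video):
        frame = decode_frame(video, decode_num)      # store in video memory
        color = convert_color_space(frame)           # store in video memory
        if decode_num < (get_num + RING) % RING:     # ring-buffer check
            data_slots.append(color)                 # CopyToData
        decode_num += 1                              # DecodeNum++
    return decode_num, data_slots

decoded, slots = frame_extraction_process(["f0", "f1", "f2"], get_num=5)
```

Here the loop terminates when the input is exhausted; in the example it would instead keep running while the decoding state remains 1.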
The main process copies data starting from the first Data slot (Data_0) in the video memory. Specifically, a video memory return method (MemReturnProcess()) can be executed with the acquired frame number (GetNum) = 0. While the decoding state = 1, define i = 0 and List = [ ], where i denotes the number of data items copied so far and List denotes the set of data to be copied. While i < batch size < D, where D denotes the decoded frame number (DecodeNum) and the batch size (BatchSize) is the amount of sample data the deep learning algorithm trains on at once, data is added to List, i is incremented (i++), and the acquired frame number is incremented (GetNum++); the batch size condition is then checked again. When i < batch size < D no longer holds, an error code, error message, and data flag (ReturnCode, MSG, DataFlag) may be returned, after which the decoding state is again compared with 1. The number of Data slots in the video memory may be set according to actual needs, for example 128, with a batch size of 2 as an example.
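A minimal sketch of this batch-copy loop, assuming Python stand-ins for the video memory Data slots; the `i < batch_size < D` condition and the returned code/message/flag triple follow the example, while the function name and return shape are illustrative assumptions:

```python
def mem_return_process(data_slots, decode_num, batch_size, get_num=0):
    """Copy up to one batch of decoded items from the Data slots
    (a sketch of the example's MemReturnProcess)."""
    i = 0
    lst = []                    # List: the set of data to be copied
    while i < batch_size < decode_num:
        lst.append(data_slots[get_num % len(data_slots)])
        i += 1                  # i++
        get_num += 1            # GetNum++
    if not lst:
        # i < batch_size < D not satisfied: return an error code,
        # error message, and data flag (ReturnCode, MSG, DataFlag).
        return None, get_num, {"code": -1, "msg": "no batch", "flag": 0}
    return lst, get_num, {"code": 0, "msg": "ok", "flag": 1}

slots = [f"data_{k}" for k in range(128)]   # e.g. 128 Data slots
batch, get_num, status = mem_return_process(slots, decode_num=5, batch_size=2)
```

With a batch size of 2 and five decoded frames, one call copies two items and advances the acquired frame number by two.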
In the above scheme, taking data provisioning for a deep learning model as the application scenario, images obtained by video frame extraction are an important way to obtain data. With the continuing spread of GPUs, the processing speed of deep learning models keeps increasing, which challenges currently used GPU frame extraction approaches. In the related art, frames are typically first extracted and written to disk, then read from the hard disk into main memory and imported into the video memory; this redundant chain of steps cannot meet efficient production requirements. The present scheme splits out the individual steps of GPU frame extraction, such as video decoding and image encoding, and directly eliminates the image encoding, disk writing, and data reading steps: after video decoding and color space conversion, the data is preprocessed and converted directly into tensors for the deep learning model, eliminating the copy from video memory to main memory and the subsequent copy from main memory back to video memory.
Fig. 5 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present disclosure. The apparatus may be implemented by software and/or hardware and may generally be integrated in an electronic device. As shown in fig. 5, the apparatus is provided in an electronic device including a graphics processor and includes:
an obtaining module 501, configured to obtain a target video through the graphics processor;
a decoding and conversion module 502, configured to perform decoding and color space conversion on the target video through the graphics processor to obtain color space data;
a preprocessing module 503, configured to preprocess the color space data through the graphics processor to obtain preprocessing data, and store the preprocessing data into a video memory; and
a reading module 504, configured to read the preprocessing data through a target software development kit (SDK) in the graphics processor.
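Purely as an illustration of how the four modules compose, the sketch below wires placeholder versions of them into a pipeline; the class and method names follow fig. 5, but none of the bodies reflect the actual GPU implementation:

```python
class VideoProcessingApparatus:
    """Toy stand-in for the apparatus of fig. 5 (all bodies are placeholders)."""

    def acquire(self, source):                     # obtaining module 501
        return list(source)                        # "target video" as frames

    def decode_and_convert(self, video):           # decoding/conversion module 502
        return [("rgb", f) for f in video]         # decode + color space convert

    def preprocess(self, color_data):              # preprocessing module 503
        return [("pre",) + c for c in color_data]  # store into "video memory"

    def read(self, video_memory):                  # reading module 504 (target SDK)
        return len(video_memory)                   # e.g. frames handed to the SDK

apparatus = VideoProcessingApparatus()
video = apparatus.acquire(["f0", "f1"])
memory = apparatus.preprocess(apparatus.decode_and_convert(video))
count = apparatus.read(memory)
```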
Optionally, the decoding-conversion module 502 includes:
a creation unit, configured to create at least two sub-processes through a main process of the graphics processor, wherein the at least two sub-processes include at least one decoding sub-process and at least one color space conversion sub-process; and
a processing unit, configured to perform decoding and color space conversion on the target video through the at least two sub-processes to obtain color space data, and store the color space data into the video memory.
Optionally, the processing unit is configured to:
decoding the target video through the at least one decoding sub-process to obtain image frame data, and storing the image frame data into the video memory; and
extracting the image frame data from the video memory through the at least one color space conversion sub-process and performing color space conversion to obtain the color space data.
Optionally, the at least two sub-processes further include at least one preprocessing sub-process, and the preprocessing module 503 is configured to:
extracting the color space data from the video memory through the at least one preprocessing sub-process and preprocessing it to obtain the preprocessing data.
Optionally, the apparatus further comprises a tensor conversion module configured to, after the preprocessing data is stored in the video memory:
convert the preprocessing data into tensor data through the graphics processor, and store the tensor data into the video memory; and
read the tensor data through the target SDK in the graphics processor.
Optionally, the target SDK includes a deep learning algorithm, and a data type of the tensor data corresponds to an algorithm type of the deep learning algorithm.
Optionally, the target SDK runs in a main process of the graphics processor.
The video processing device provided by the embodiments of the present disclosure can execute the video processing method provided by any embodiment of the present disclosure, and has functional modules corresponding to that method along with its beneficial effects.
Embodiments of the present disclosure also provide a computer program product comprising a computer program/instruction which, when executed by a processor, implements the video processing method provided by any of the embodiments of the present disclosure.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Referring now in particular to fig. 6, a schematic diagram of an electronic device 600 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device 600 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), car terminals (e.g., car navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 6 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 6, the electronic device 600 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
In general, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, magnetic tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 shows an electronic device 600 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via communication means 609, or from storage means 608, or from ROM 602. When executed by the processing device 601, the computer program performs the above-described functions defined in the video processing method of the embodiment of the present disclosure.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a target video through the graphics processor; perform decoding and color space conversion on the target video through the graphics processor to obtain color space data; preprocess the color space data through the graphics processor to obtain preprocessing data, and store the preprocessing data into a video memory; and read the preprocessing data through a target software development kit (SDK) in the graphics processor.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It will be appreciated that, before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with relevant laws and regulations, of the type, scope of use, and usage scenarios of the information involved, and the user's authorization should be obtained.
The foregoing description is merely of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to technical solutions formed by the specific combinations of features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example technical solutions in which the above features are replaced with (but not limited to) technical features having similar functions disclosed in the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.
Claims (10)
1. A video processing method, applied to an electronic device including a graphics processor, comprising:
acquiring a target video through the graphics processor;
performing decoding and color space conversion on the target video through the graphics processor to obtain color space data;
preprocessing the color space data through the graphics processor to obtain preprocessing data, and storing the preprocessing data into a video memory; and
reading the preprocessing data through a target software development kit (SDK) in the graphics processor.
2. The method of claim 1, wherein decoding and color space converting the target video by the graphics processor results in color space data, comprising:
creating at least two sub-processes through a main process of the graphics processor, wherein the at least two sub-processes include at least one decoding sub-process and at least one color space conversion sub-process; and
performing decoding and color space conversion on the target video through the at least two sub-processes to obtain the color space data, and storing the color space data into the video memory.
3. The method of claim 2, wherein decoding and color space converting the target video by the at least two sub-processes to obtain color space data comprises:
decoding the target video through the at least one decoding sub-process to obtain image frame data, and storing the image frame data into the video memory; and
extracting the image frame data from the video memory through the at least one color space conversion sub-process and performing color space conversion to obtain the color space data.
4. The method of claim 2, wherein the at least two sub-processes further comprise at least one preprocessing sub-process, and wherein preprocessing the color space data to obtain preprocessed data comprises:
extracting the color space data from the video memory through the at least one preprocessing sub-process and preprocessing it to obtain the preprocessing data.
5. The method of claim 1 or 4, wherein after storing the preprocessing data in the video memory, the method further comprises:
converting the preprocessing data into tensor data through the graphics processor, and storing the tensor data into the video memory; and
reading the tensor data through the target SDK in the graphics processor.
6. The method of claim 5, wherein the target SDK comprises a deep learning algorithm, and wherein a data type of the tensor data corresponds to an algorithm type of the deep learning algorithm.
7. The method of claim 1, wherein the target SDK runs in a main process of the graphics processor.
8. A video processing apparatus provided in an electronic device including a graphics processor, comprising:
an acquisition module, configured to acquire a target video through the graphics processor;
a decoding and conversion module, configured to perform decoding and color space conversion on the target video through the graphics processor to obtain color space data;
a preprocessing module, configured to preprocess the color space data through the graphics processor to obtain preprocessing data, and store the preprocessing data into a video memory; and
a reading module, configured to read the preprocessing data through a target software development kit (SDK) in the graphics processor.
9. An electronic device, the electronic device comprising:
A processor;
a memory for storing the processor-executable instructions;
The processor is configured to read the executable instructions from the memory and execute the instructions to implement the video processing method of any of the preceding claims 1-7.
10. A computer readable storage medium, characterized in that the storage medium stores a computer program for executing the video processing method according to any one of the preceding claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410588928.XA CN118524248A (en) | 2024-05-13 | 2024-05-13 | Video processing method, device, equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118524248A true CN118524248A (en) | 2024-08-20 |
Family
ID=92277871
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410588928.XA Pending CN118524248A (en) | 2024-05-13 | 2024-05-13 | Video processing method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118524248A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||