Detailed Description
To make the objects, technical solutions and effects of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some possible implementations of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort also fall within the scope of the present application.
In the related art, a decoder decodes a video file into a series of frames, annotation information is drawn onto each frame, the drawn annotation frames are synthesized with the original video frames, and the synthesized frames are displayed in a window. That is, the decoded image data is handed to the application, the application draws on it, and the modified image data is handed back for display. In OH (OpenHarmony) systems, however, the multimedia framework and the driver do not expose the decoded video images: the underlying layer sends the graphics directly to display, and the application never receives the hardware-decoded data.
The method, system and terminal for adding additional annotation information during video playing are described below with reference to the accompanying drawings. They address the problem that, in the related art, adding additional annotation information during video playing requires processing the decoded video images, while the multimedia framework and the driver of the OH system never touch the decoded images, so that adding additional annotation information is inefficient when playing video on the OH system.
In the embodiments of the application, two display layers are created on the video component: the lower layer plays the video normally, and the upper layer performs annotation drawing. During image-synthesis service processing, the two layers are composited like two stacked windows, so that graphics drawn according to the annotation information can be superimposed on the upper layer while the video plays.
The embodiment of the application has the following advantages:
(1) Double-layer display structure: two display layers are created on the video component, separating the video content from the annotation content, which makes adding and modifying annotations more flexible.
(2) Low resource consumption: the multimedia framework and the driver do not touch the decoded video images; the underlying layer sends the graphics directly to display, reducing system resource consumption and improving efficiency.
(3) Real-time synchronization: the annotation drawing cycle is synchronized with the video playing cycle, so the annotation information corresponds to the video content in real time, enhancing the accuracy and timeliness of the annotations.
The technical solution of the application is described in detail below through specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
The method for adding additional annotation information during video playing according to the preferred embodiment of the present application, as shown in fig. 1, comprises the following steps:
In step S101, a video playing component is constructed, where the video playing component includes a first display layer and a second display layer, the first display layer is used for video playing, and the second display layer is used for annotation drawing.
In one possible implementation, a video playing component that supports two display layers is initialized, one layer for video playing and one for annotation. Bottom-layer attributes are set for the bottom display layer and top-layer attributes for the top display layer; the two layers are associated for display according to these attributes, with the associated bottom display layer serving as the first display layer and the associated top display layer as the second display layer.
Specifically: a video playing component supporting multi-layer display, such as FFmpeg or DirectShow, is selected; two display layers are created on it, the bottom layer for video playing and the top layer for annotation drawing, the two layers acting as independent windows that do not interfere with each other; the bottom-layer attributes are set so that the bottom layer can play video normally, including configuring the decoder and setting the pixel format; the top-layer attributes are set so that the top layer can overlay annotation information on the video, including setting a transparent background and adjusting the coordinate system so that annotations are accurately superimposed on the video; the two layers are associated for display so that they work cooperatively, with the window attributes adjusted so that the top layer moves with the bottom layer and always stays on top; event processing is added to the top-layer window if needed, responding to user interaction, for example updating annotation information in real time when the user clicks or drags on the top-layer window; and resource management allocates appropriate memory and computing resources to each display layer to avoid performance bottlenecks in subsequent operation.
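As an illustrative sketch only (the embodiment does not prescribe a concrete API), the double-layer structure can be modeled as one opaque video buffer plus one transparent overlay buffer. The class and attribute names below are hypothetical, and NumPy is used for the pixel buffers:

```python
# Minimal sketch of the double-layer structure; all names are hypothetical.
import numpy as np

class VideoPlaybackComponent:
    def __init__(self, width: int, height: int):
        self.width, self.height = width, height
        # First (bottom) display layer: opaque video frames, 3 channels (BGR).
        self.video_layer = np.zeros((height, width, 3), dtype=np.uint8)
        # Second (top) display layer: annotations, 4 channels (BGRA);
        # alpha = 0 everywhere, so the video shows through by default.
        self.annotation_layer = np.zeros((height, width, 4), dtype=np.uint8)

    def clear_annotations(self) -> None:
        # Reset the top layer to fully transparent before each redraw.
        self.annotation_layer[:] = 0
```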
That is, by reasonably arranging and using a video playing component that supports multi-layer display, the embodiment of the application lays a solid foundation for the subsequent video playing and annotation work.
In step S102, a video source and annotation information are acquired.
In one possible implementation, video source preparation includes loading the video file and setting the decoder, and annotation source preparation includes loading the annotation information and initializing the drawing module. Specifically, the video source is obtained by reading the video file in response to a control operation on the video file, an annotation source is obtained, and the annotation information is read from the annotation source.
Specifically: the video source is selected, designating a local file, a network stream or another type of video input; the video source is loaded, reading the video file or stream into memory, which involves file I/O for local files and network communication and data reception for network streams; a suitable decoder is set according to the encoding format of the video file, for example a decoder supporting H.264 for H.264-encoded video; decoder parameters such as resolution and frame rate are configured to match the properties of the video source so that the video can be correctly decoded and played; the decoder is started, prepared to receive video frames, and connected to the video source as necessary; a caching strategy buffers video data before decoding to improve playing efficiency; resource management controls the memory and CPU used when loading the video source; and error handling deals with failures such as a missing video source or network errors during loading, ensuring the stability and reliability of the system.
For the annotation source: the source of the annotation data is specified, which may be a local file, a database, a web service or a real-time data stream; the annotation data is loaded from the specified source, involving file I/O operations, database queries or web requests; the annotation data (possibly in XML, JSON or another format) is parsed into internally usable structured data such as objects, lists or dictionaries; the drawing module is initialized by selecting and configuring a graphics library for drawing the annotations, such as OpenGL, DirectX, Cairo or HTML5 Canvas; drawing parameters such as color, font, line width and transparency are set according to the characteristics of the annotation data and the video frames; a drawing buffer may be created to reduce the performance overhead of drawing directly onto the screen; resource management allocates memory and GPU resources rationally to optimize performance and stability; error handling deals with damaged or format-mismatched annotation data; and if the system supports user interaction with the annotations, the corresponding interaction interface is prepared.
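A minimal sketch of source preparation, assuming OpenCV for decoding and a JSON annotation file; the field names ("start", "end", "text", "x", "y") are hypothetical examples, not a format prescribed by the embodiment:

```python
import json
import cv2  # OpenCV selects a decoder (e.g. for H.264) via its FFmpeg backend

def prepare_sources(video_path: str, annotation_path: str):
    # Video source preparation: open the file and verify the decoder is ready.
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"failed to open video source: {video_path}")
    # Annotation source preparation: parse the annotation data (JSON here)
    # into internally usable structured data, e.g. a list of dictionaries:
    # [{"start": 1.0, "end": 3.5, "text": "...", "x": 40, "y": 60}, ...]
    with open(annotation_path, "r", encoding="utf-8") as f:
        annotations = json.load(f)
    return cap, annotations
```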
Through loading and parsing the annotation data, the embodiment of the application cooperates closely with the video playing and image synthesis modules to ensure the accuracy and real-time performance of the annotation information.
In step S103, a video frame is displayed through the first display layer according to the video source, and a labeling graph is drawn through the second display layer according to the labeling information.
In one possible implementation, the video playing cycle decodes the video and plays it on the bottom display layer: when the video source is played, the video source is parsed to obtain parsed data, the parsed data is input into the decoder to obtain decoded video frames, and the video frames are synchronously displayed on the first display layer.
Specifically: the decoder is initialized before the video playing cycle starts, with its parameters set to match the encoding format of the video source; a suitable buffer is prepared for storing decoded video frames; the decoding cycle starts, continuously reading data from the video source and sending it to the decoder; a certain amount of data is read from the video source at a time, such as a file block or a network stream packet; the read data is sent to the decoder, which outputs decoded video frames; the decoding result is checked for errors, with corresponding error handling if decoding fails; the decoded video frames are displayed synchronously on the bottom display layer, which involves calls to the graphics library; the display device is instructed to refresh the display layer so that the latest video frame is shown; resource usage such as CPU and memory is continuously monitored to avoid bottlenecks; and the cycle ends when the video reaches its end or the user stops playback.
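A sketch of the video playing cycle under the same assumptions (OpenCV decoding and the hypothetical VideoPlaybackComponent above); the on_frame callback hands the current time stamp to the annotation drawing cycle:

```python
import cv2

def video_playing_cycle(cap, component, on_frame):
    # Read and decode frames until the source ends or the user quits.
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0   # fall back if FPS is unknown
    while True:
        ok, frame = cap.read()                 # read + decode one frame
        if not ok:
            break                              # end of stream or decode error
        component.video_layer = frame          # present on the bottom layer
        # Hand the frame's time stamp (seconds) to the annotation cycle.
        on_frame(cap.get(cv2.CAP_PROP_POS_MSEC) / 1000.0)
        # Crude timing control; 'q' lets the user end the cycle early.
        if cv2.waitKey(max(1, int(1000 / fps))) & 0xFF == ord("q"):
            break
```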
That is, the embodiment of the application continuously displays the decoded video frames on the screen through the video playing cycle to form a smooth video playing effect, and ensures the consistency and stability of video playing through accurate time control and efficient data processing.
In one possible implementation, the annotation drawing cycle draws the annotation information synchronously on the top display layer. The method comprises: obtaining the time stamp corresponding to the video frame, determining the annotation data corresponding to that time stamp, processing the annotation data to adapt it to the second display layer, and drawing the annotation graph on the second display layer according to the processed annotation data.
Specifically: the drawing module is initialized before the annotation drawing cycle starts, setting drawing parameters such as colors, fonts and line widths; a drawing buffer is prepared to reduce the performance overhead of drawing directly on the screen; the drawing cycle starts independently of the video playing cycle and continuously updates and displays annotation information; information about the currently displayed video frame, such as the frame number and time stamp, is obtained from the video playing cycle to keep the annotations synchronized with the video frames; the corresponding annotation information is read from the annotation data source according to the current video frame; the annotation data is processed, for example converting coordinates, adjusting sizes and applying styles, to adapt it to the current display layer; the processed annotation information is drawn onto the top display layer by the drawing module, which involves calls to the graphics library; the display device is instructed to refresh the top display layer so that the latest annotation information is shown; resource usage such as CPU and GPU is continuously monitored to avoid performance bottlenecks; user interaction with the annotations (such as editing or deleting) is handled during drawing if supported; drawing errors are handled so that the cycle can recover; and an end condition decides when the drawing cycle terminates.
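Continuing the sketch, a hypothetical annotation drawing step that redraws the transparent top layer with the annotations active at the current time stamp (OpenCV drawing calls accept the 4-channel BGRA overlay):

```python
import cv2

def draw_annotations(component, annotations, t: float):
    # Redraw the top layer for the annotations active at time stamp t.
    component.clear_annotations()
    for a in annotations:
        if a["start"] <= t <= a["end"]:
            # Draw opaque red text onto the BGRA overlay; coordinates are
            # assumed to already be in display-layer pixels.
            cv2.putText(component.annotation_layer, a["text"],
                        (a["x"], a["y"]), cv2.FONT_HERSHEY_SIMPLEX,
                        1.0, (0, 0, 255, 255), 2)
```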
That is, the embodiment of the application ensures that the annotation information can be synchronously updated and displayed with the video content through the parallel operation of the annotation drawing cycle and the video playing cycle, and ensures accurate time control and efficient data processing so as to ensure the accuracy and the instantaneity of the annotation.
In step S104, image synthesis is performed according to the video frame and the labeling graph, so as to obtain a synthesized image, and the synthesized image is displayed.
In one possible implementation, the video layer corresponding to the video frame and the annotation layer corresponding to the annotation graph are obtained, an image processing library or API is used to synthesize the video layer and the annotation layer into a synthesized image, and the synthesized image is output and displayed.
Specifically, image synthesis combines the contents of the different layers into a final image: the layer frames are obtained, i.e. the bottom-layer video frame from the video playing cycle and the top-layer annotation frame from the annotation drawing cycle; the layer attributes are adjusted so that the sizes, resolutions and pixel formats of the two layers are consistent for synthesis; transparency and blending modes are applied, so that if the top-layer annotation requires a transparent background or a special blending effect, the annotations are properly superimposed on the video; the two layers are combined into a complete image using an image processing library or API, involving pixel-level operations such as Alpha blending; the quality of the synthesized image is optimized as required, for example adjusting contrast, brightness or sharpening; the synthesized result is buffered so that it need not be re-synthesized each time it is displayed; the synthesized image is output to a display device such as a monitor or projector; resource usage such as CPU and GPU is continuously monitored during synthesis to avoid performance bottlenecks; and if real-time interaction is required, image quality may be traded off against synthesis speed so that the display remains smooth.
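A sketch of the pixel-level Alpha blending mentioned above, assuming the two NumPy layers of the hypothetical component have identical size and resolution:

```python
import numpy as np

def synthesize(component):
    # Alpha-blend the annotation layer over the video layer, per pixel:
    # out = alpha * annotation + (1 - alpha) * video
    video = component.video_layer.astype(np.float32)
    overlay = component.annotation_layer.astype(np.float32)
    alpha = overlay[:, :, 3:4] / 255.0        # broadcast over BGR channels
    synthesized = alpha * overlay[:, :, :3] + (1.0 - alpha) * video
    return synthesized.astype(np.uint8)
```

The synthesized image could then be shown with, for example, cv2.imshow, corresponding to outputting the synthesized image to the display device.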
That is, the embodiment of the application forms a final display effect by seamlessly combining the video content and the annotation information.
In one possible implementation, when the video source is played to the end, the resources corresponding to the video source and the annotation source are released, and the performance data of resource management is recorded.
Specifically, resources are released after video playing ends: during playing, the system continuously monitors resource usage, including memory, CPU and GPU utilization; the end of the video is detected, triggering the resource release flow; the reading and decoding of the video source are stopped first, ensuring that no new data is produced; the video playing cycle is stopped so that video frames are no longer rendered on the screen; the annotation drawing cycle is stopped if the system includes the annotation function; the decoder, a key component for processing video data, releases its occupied resources once it is no longer needed; any graphics resources used during video playing and annotation drawing, such as textures and buffers, are released; the video file or network stream is closed and all resources related to the video source are released; annotation data resources, such as occupied memory or file handles, are released; the user interface may need to be updated to reflect the end-of-playing state; it is confirmed that all related resources have been released to avoid memory leaks; and the performance data of resource management is recorded to facilitate future optimization of the system.
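A sketch of the release step under the same assumptions; a real system would also explicitly free decoder and GPU resources, which the OpenCV calls below subsume:

```python
import time
import cv2

def release_resources(cap, component, started_at: float):
    cap.release()                      # stop reading/decoding, close the source
    component.video_layer = None       # drop the layer buffers (graphics
    component.annotation_layer = None  # resources such as textures/buffers)
    cv2.destroyAllWindows()            # tear down the display windows
    # Record performance data of resource management for future optimization.
    print(f"playback session lasted {time.time() - started_at:.1f}s")
```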
It should be noted that the embodiments of the present application differ from the prior art solutions in that:
Double-layer display structure: the present application creates two display layers inside the video playing component rather than operating on a single frame buffer;
Separation processing: video decoding and annotation drawing are independent processing flows, reducing mutual interference;
System resource utilization: because the multimedia framework and the driver do not touch the decoded video images, system resource consumption is reduced;
Real-time performance: annotation drawing is synchronized with video playing, ensuring the real-time performance of the annotations;
User interactivity: better user interaction capabilities are provided.
The additional annotation information adding system for video playing according to the embodiment of the application is described with reference to the accompanying drawings.
Fig. 2 is a block diagram of an additional annotation information adding system for video playback according to an embodiment of the present application.
As shown in fig. 2, the system 10 for adding additional annotation information during video playing comprises a component initialization module 100, a file information acquisition module 200, a video playing and annotation drawing module 300 and an image synthesis module 400.
Specifically, the component initialization module 100 is configured to construct a video playing component, where the video playing component includes a first display layer and a second display layer, the first display layer is used for video playing, and the second display layer is used for annotation drawing;
The file information acquisition module 200 is used for acquiring a video source and annotation information;
The video playing and annotation drawing module 300 is configured to display a video frame through the first display layer according to the video source, and draw an annotation graph through the second display layer according to the annotation information;
and the image synthesis module 400 is configured to perform image synthesis according to the video frame and the annotation graph, obtain a synthesized image, and display the synthesized image.
Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present application. The terminal may include:
a memory 501, a processor 502, and a computer program stored on the memory 501 and executable on the processor 502.
The processor 502, when executing the program, implements the method for adding additional annotation information during video playing provided in the above embodiments.
Further, the terminal further includes:
a communication interface 503 for communication between the memory 501 and the processor 502.
Memory 501 for storing a computer program executable on processor 502.
The memory 501 may include high-speed RAM memory and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
If the memory 501, the processor 502, and the communication interface 503 are implemented independently, the communication interface 503, the memory 501, and the processor 502 may be connected to each other via a bus and communicate with each other. The bus may be an industry standard architecture (Industry Standard Architecture, abbreviated ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, abbreviated PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, abbreviated EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in fig. 3, but this does not mean there is only one bus or only one type of bus.
Alternatively, in a specific implementation, if the memory 501, the processor 502, and the communication interface 503 are integrated on a chip, the memory 501, the processor 502, and the communication interface 503 may perform communication with each other through internal interfaces.
The processor 502 may be a central processing unit (Central Processing Unit, abbreviated CPU), an application specific integrated circuit (Application Specific Integrated Circuit, abbreviated ASIC), or one or more integrated circuits configured to implement embodiments of the application.
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for adding additional annotation information during video playing as described above.
An embodiment of the present application provides a computer program product, which includes a computer program, where the computer program when executed by a processor implements a method for adding additional annotation information during video playing provided in any embodiment of the embodiments corresponding to fig. 1 of the present application.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, for example, two, three, etc., unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process. Additional implementations are included within the scope of the preferred embodiments of the present application in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable storage medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include an electrical connection (an electronic device) having one or more wires, a portable computer diskette (a magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable storage medium may even be paper or another suitable medium upon which the program is printed, as the program may be electronically captured, via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of techniques known in the art: discrete logic circuits with logic gates for implementing logic functions on data signals, application specific integrated circuits with appropriate combinational logic gates, programmable gate arrays (PGA), field programmable gate arrays (FPGA), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like. While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.
It is to be understood that the application is not limited in its application to the examples described above, but is capable of modification and variation in light of the above teachings by those skilled in the art, and that all such modifications and variations are intended to be included within the scope of the appended claims.
It should be noted that the above embodiments are merely for illustrating the technical solution of the present application and not for limiting the same, and although the present application has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that the technical solution described in the above embodiments may be modified or some or all of the technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the scope of the technical solution of the embodiments of the present application.