CN115988263A - Video engineering data conversion method, device, equipment and storage medium - Google Patents
- Publication number
- CN115988263A (application CN202211689484.6A)
- Authority
- CN
- China
- Prior art keywords
- video
- frame image
- frame
- data
- character
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Television Signal Processing For Recording (AREA)
Abstract
The invention relates to the field of data conversion and discloses a video engineering data conversion method, device, equipment and storage medium. The method comprises the following steps: acquiring video data; extracting a video frame image from the video frame image set, and performing character recognition processing on the video frame image according to a character recognition algorithm to obtain frame image characters; performing feature recognition processing on the video frame image according to an image recognition algorithm to obtain frame image features; cutting the frame image characters out of the video frame image based on the character outline coordinates to obtain character layer data; repairing and cutting the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data; combining the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data; and sequencing all the frame image engineering data according to the playing time axis to obtain video engineering data.
Description
Technical Field
The present invention relates to the field of data conversion, and in particular, to a method, an apparatus, a device, and a storage medium for converting video engineering data.
Background
In fields such as live broadcasting and self-media editing, videos are commonly overlaid with large amounts of text and decorative characters. This style lets viewers grasp the content of a video quickly and brings the creator closer to the audience, and it is one of the most widespread ways of producing video. A video, however, is ultimately played back as a sequence of pictures along a video track, and once the video has been packaged it cannot be edited a second time.
When a user needs to modify a video and the project file has been lost or deleted, secondary editing is impossible; the only option is to locate the original footage and restart the editing process, which is time-consuming and labor-intensive for the user. A new technology is therefore needed to address the problems that re-editing a video is currently time-consuming and labor-intensive, that a video cannot be edited in a customized manner, and that secondary editing cannot be performed once the project file is lost.
Disclosure of Invention
The invention mainly aims to solve the technical problems that re-editing a video is currently time-consuming and labor-intensive, that a video cannot be edited in a customized manner, and that secondary editing cannot be performed when the project file is lost.
The invention provides a video engineering data conversion method in a first aspect, which comprises the following steps:
acquiring video data, and splitting the video data based on a video playing time axis to obtain a video frame image set;
extracting video frame images in the video frame image set, and performing character recognition processing on the video frame images according to a preset character recognition algorithm to obtain frame image characters and character outline coordinates;
according to a preset image recognition algorithm, carrying out feature recognition processing on the video frame image to obtain frame image features and feature contour coordinates;
cutting the frame image characters out of the video frame image based on the character outline coordinates to obtain character layer data and a first cut-out frame image;
repairing and cutting the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image;
combining the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data;
and sequencing all the frame image engineering data according to the playing time axis to obtain video engineering data.
Optionally, in a first implementation manner of the first aspect of the present invention, the extracting a video frame image in the video frame image set includes:
capturing the video frame image from the video frame image set, and performing coordinate size recognition processing on the video frame image to obtain a coordinate system based on the frame image size, so that data conversion is carried out according to the coordinate system.
Optionally, in a second implementation manner of the first aspect of the present invention, the performing coordinate size recognition processing on the video frame image to obtain a coordinate system based on the frame image size includes:
analyzing horizontal and vertical pixels of the video frame image according to a preset coordinate recognition algorithm to obtain the number of horizontal pixels and the number of vertical pixels;
and establishing a coordinate system based on the frame image size from the ranges defined by the number of horizontal pixels and the number of vertical pixels.
Optionally, in a third implementation manner of the first aspect of the present invention, the performing, according to a preset character recognition algorithm, character recognition processing on the video frame image to obtain frame image characters and character outline coordinates includes:
performing character recognition processing on the video frame image according to a preset character recognition algorithm to obtain frame image characters, a frame image character format and character outline recognition points, and marking the frame image character format on the frame image characters;
and aggregating the character outline recognition points to obtain the character outline coordinates of the frame image characters.
Optionally, in a fourth implementation manner of the first aspect of the present invention, the combining the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data includes:
pushing the second cut-out frame image, the feature layer data and the character layer data onto a stack in that order and combining them to obtain frame image engineering data.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the repairing and cutting the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image includes:
repairing, according to a preset repair algorithm, the frame image features corresponding to the feature contour coordinates in the first cut-out frame image to obtain repaired frame image features;
and cutting the repaired frame image features out based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image.
Optionally, in a sixth implementation manner of the first aspect of the present invention, after the sequencing of all the frame image engineering data according to the playing time axis to obtain video engineering data, the method further includes:
receiving a time selection instruction on the playing time axis;
and querying and displaying the video engineering data corresponding to the time indicated by the time selection instruction.
The second aspect of the present invention provides a video engineering data conversion device, including:
the splitting module is used for acquiring video data and splitting the video data based on a video playing time axis to obtain a video frame image set;
the character recognition module is used for extracting the video frame images in the video frame image set and carrying out character recognition processing on the video frame images according to a preset character recognition algorithm to obtain frame image characters and character outline coordinates;
the characteristic identification module is used for carrying out characteristic identification processing on the video frame image according to a preset image identification algorithm to obtain frame image characteristics and characteristic outline coordinates;
the character cut-out module is used for cutting the frame image characters out of the video frame image based on the character outline coordinates to obtain character layer data and a first cut-out frame image;
the feature cut-out module is used for repairing and cutting the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image;
the frame image combination module is used for combining the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data;
and the time axis sequencing module is used for sequencing all the frame image engineering data according to the playing time axis to obtain video engineering data.
A third aspect of the present invention provides a video engineering data conversion device, including: a memory having instructions stored therein and at least one processor, the memory and the at least one processor being interconnected by a line; the at least one processor calls the instructions in the memory to cause the video engineering data conversion device to execute the video engineering data conversion method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to execute the above-mentioned method for converting engineering data of a video.
In the embodiment of the invention, the component elements of each video frame picture are analyzed along the video playing track, the area around each element is partially redrawn, and a project file based on the video playing track is finally generated from the attributes of the elements and the created layers so that an editing tool can use it directly for secondary creation. This solves the technical problems that re-editing a video is currently time-consuming and labor-intensive and that secondary editing cannot be performed once the project file is lost.
Drawings
FIG. 1 is a schematic diagram of an embodiment of an engineering data conversion method for a video according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an embodiment of character recognition in accordance with the present invention;
FIG. 3 is a schematic diagram of an embodiment of feature recognition in an embodiment of the present invention;
FIG. 4 is a schematic diagram of video engineering data with a play time axis according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of an apparatus for converting engineering data of a video according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of another embodiment of the apparatus for converting engineering data of a video according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an embodiment of an engineering data conversion device for video according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a method, a device, equipment and a storage medium for converting engineering data of a video.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be implemented in other sequences than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For understanding, a detailed flow of an embodiment of the present invention is described below, and referring to fig. 1, an embodiment of a method for converting engineering data of a video according to an embodiment of the present invention includes the steps of:
101. acquiring video data, and splitting the video data based on a video playing time axis to obtain a video frame image set;
102. extracting video frame images in the video frame image set, and performing character recognition processing on the video frame images according to a preset character recognition algorithm to obtain frame image characters and character outline coordinates;
In steps 101-102, the video data is obtained first and then split frame by frame according to the time axis of the video playing track, so that the video data is divided into a plurality of video frame images, which together form the video frame image set. One video frame image is then extracted from the set; the extraction may proceed in sequence. Referring to fig. 2, fig. 2 is a schematic diagram of an embodiment of character recognition in the embodiment of the present invention. Before the characters are recognized, the size of the received picture is identified, for example 900 × 1600, a certain point of the extraction tool is fixed as the origin to establish a coordinate system, and the characters and their coordinates in the video frame image are then recognized to obtain the frame image characters and the character outline coordinates.
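For illustration only, the frame-by-frame split of step 101 can be sketched in Python with OpenCV; the use of cv2 and the dictionary layout of the frame image set are assumptions of this sketch and not part of the disclosed method.

```python
import cv2  # OpenCV is assumed to be available for decoding the video


def split_video(path):
    """Split video data along its playing time axis into a video frame image set."""
    capture = cv2.VideoCapture(path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 25.0  # fall back if FPS metadata is missing
    frame_set = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break  # end of the video track
        # Record each video frame image together with its position on the playing time axis.
        frame_set.append({"time": index / fps, "image": frame})
        index += 1
    capture.release()
    return frame_set
```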
Further, in the "extracting the video frame map in the video frame map set", the following steps may be performed:
1021. and capturing the video frame images in the video frame image set, and carrying out coordinate size identification processing on the video frame images to obtain a coordinate system based on the size of the frame images so as to carry out data conversion according to the coordinate system.
In step 1021, since a video contains many video frame images, the operation speed could be increased by fixing the coordinate system at a certain point of the tool interface instead of establishing it through size recognition for every frame. In step 1021 of this embodiment, however, size recognition is performed by image processing on each video frame image as it is acquired: taking the lower left corner of the image as the origin, the horizontal axis as the X axis and the vertical axis as the Y axis, a coordinate system based on the frame image size is obtained, and the character outline coordinates and feature contour coordinates used in subsequent processing are calculated in this coordinate system.
Specifically, the following steps may be performed in "performing coordinate size recognition processing on the video frame image to obtain a coordinate system based on the frame image size":
10211. analyzing horizontal and vertical pixels of the video frame image according to a preset coordinate recognition algorithm to obtain the number of horizontal pixels and the number of vertical pixels;
10212. and establishing a coordinate system based on the frame image size from the ranges defined by the number of horizontal pixels and the number of vertical pixels.
In steps 10211-10212, the pixels of the video frame image in the horizontal and vertical directions are counted, giving for example a size of 900 × 1600; the coordinate system based on the frame image size is then limited to the range 0-900 in the horizontal direction and 0-1600 in the vertical direction.
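A minimal sketch of such a frame-image-size coordinate system with the lower left corner as the origin is given below; the helper names are illustrative only and the image is assumed to be a NumPy array as produced by the split above.

```python
def build_coordinate_system(frame):
    """Limit the coordinate system to 0..width horizontally and 0..height vertically."""
    height, width = frame.shape[:2]  # e.g. 1600 vertical pixels and 900 horizontal pixels

    def to_frame_coords(col, row):
        # Image rows grow downwards, so flip the vertical axis to place
        # the origin at the lower left corner of the frame image.
        return col, height - row

    return width, height, to_frame_coords
```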
Further, "according to a preset character recognition algorithm, performing character recognition processing on the video frame image to obtain frame image characters and character outline coordinates" may perform the following steps:
1022. performing character recognition processing on the video frame image according to a preset character recognition algorithm to obtain frame image characters, a frame image character format and character outline recognition points, and marking the frame image character format on the frame image characters;
1023. aggregating the character outline recognition points to obtain the character outline coordinates of the frame image characters.
In steps 1022-1023, referring to fig. 2, the picture positions of the characters in the video frame image are recognized (the layer areas are drawn), and the size w, h of the surrounding area of each group of frame image characters is obtained together with the coordinate data of each surrounding area. The picture content is then recognized and the text content is extracted, for example { this is a segment of text }, along with the text style: text color (color value), text size (font size), text background (color value), font (font name), font outline (thickness and color value) and alignment mode (left-aligned, center-aligned or right-aligned). In other words, not only the character strings are recognized; the frame image character format - character color, character size, character background, font, font outline and alignment mode - is also recognized and marked on the frame image characters, and the character outline recognition points are given character outline coordinates in the coordinate system based on the frame image size. The character outline coordinates may be the center coordinates of the bounding rectangle, or the set of outline coordinates of the bounding rectangle.
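One way to obtain frame image characters together with outline recognition points is an off-the-shelf OCR engine. The sketch below uses pytesseract as a stand-in for the preset character recognition algorithm; this choice is an assumption (the patent does not name an engine), and style attributes such as font and color would need additional analysis.

```python
import pytesseract
from pytesseract import Output


def recognize_characters(frame_rgb, frame_height):
    """Return frame image characters with character outline coordinates (lower-left origin)."""
    data = pytesseract.image_to_data(frame_rgb, output_type=Output.DICT)
    characters = []
    for text, left, top, w, h in zip(
        data["text"], data["left"], data["top"], data["width"], data["height"]
    ):
        if not text.strip():
            continue  # skip empty detections
        # Aggregate the outline recognition points into the center of the bounding rectangle.
        center = (left + w / 2, frame_height - (top + h / 2))
        characters.append({"text": text, "size": (w, h), "outline_center": center})
    return characters
```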
103. According to a preset image recognition algorithm, carrying out feature recognition processing on the video frame image to obtain frame image features and feature contour coordinates;
In this embodiment, referring to fig. 3, fig. 3 is a schematic diagram of feature recognition in the embodiment of the present invention. After the picture is processed, the sizes of area 1, area 2, area 3 and area 4 can be recognized, and coordinate data 1, coordinate data 2, coordinate data 3 and coordinate data 4 are generated respectively, where a picture feature may be a person or another image, but not characters. When an image feature is recognized, its contour coordinates are recorded: for example, if the feature is a person, the person content is extracted, the person contour coordinate data is recorded, the pixels inside the contour are marked as the image feature on the basis of the contour coordinate data, and the person contour coordinate data is taken as the feature contour coordinates.
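As a rough stand-in for the preset image recognition algorithm, contour detection can illustrate how feature contour coordinates and their interior pixels are recorded. A real deployment would more likely use a segmentation or person-detection model, so the Otsu thresholding below is purely an assumption made for this sketch.

```python
import cv2
import numpy as np


def recognize_features(frame_bgr):
    """Find candidate feature regions and record their contour coordinates."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    features = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)  # area size and position (image coordinates)
        mask = np.zeros(gray.shape, dtype=np.uint8)
        cv2.drawContours(mask, [contour], -1, 255, thickness=cv2.FILLED)  # mark interior pixels
        features.append({"bbox": (x, y, w, h), "contour": contour, "mask": mask})
    return features
```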
104. Cutting the frame image characters out of the video frame image based on the character outline coordinates to obtain character layer data and a first cut-out frame image;
105. repairing and cutting the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image;
In steps 104-105, the character layer data is cut directly out of the video frame image, leaving the first cut-out frame image from which the characters have been removed. The content surrounding the removed text elements is then identified and the layer is locally redrawn; after the frame image features are cut out, the picture content is completed: a. a layer is drawn according to the feature area of the image and superimposed under the feature layer data; b. the content surrounding the feature layer data is identified and the feature layer data is locally redrawn.
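The character cut-out of step 104 amounts to copying the pixels inside the character outline into a separate layer and blanking them in the frame. A minimal sketch under that assumption, where char_mask is a hypothetical mask built from the character outline coordinates:

```python
import numpy as np


def cut_out_characters(frame_bgr, char_mask):
    """Split a frame into character layer data and the first cut-out frame image.

    char_mask is a uint8 mask with 255 inside the character outline coordinates.
    """
    character_layer = np.zeros_like(frame_bgr)
    character_layer[char_mask == 255] = frame_bgr[char_mask == 255]  # keep only the characters
    first_cut_out = frame_bgr.copy()
    first_cut_out[char_mask == 255] = 0  # remove the characters; step 105 repairs this region
    return character_layer, first_cut_out
```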
Further, at 105, the following steps may be performed:
1051. repairing, according to a preset repair algorithm, the frame image features corresponding to the feature contour coordinates in the first cut-out frame image to obtain repaired frame image features;
1052. and cutting the repaired frame image features out based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image.
In steps 1051-1052, the pixels and content around the frame image features are identified, the layer is locally redrawn and repaired, and the cut-out of the portrait or image is then performed after the repair.
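OpenCV inpainting can serve as the preset repair algorithm of step 1051; using cv2.inpaint here, as well as the hole_mask and feature_mask inputs, is an assumption of this sketch, since the patent does not commit to a specific repair method.

```python
import cv2


def repair_and_cut_out(first_cut_out, hole_mask, feature_mask):
    """Repair holes left by the text cut-out, then cut the feature out of the frame.

    hole_mask marks pixels removed in step 104 that fall inside the feature;
    feature_mask marks all pixels inside the feature contour coordinates.
    Both are uint8 masks with 255 marking the region of interest.
    """
    # Locally redraw the damaged feature pixels from their surroundings (repair step).
    repaired = cv2.inpaint(first_cut_out, hole_mask, 3, cv2.INPAINT_TELEA)
    # Cut the repaired feature out into its own layer (feature layer data).
    feature_layer = cv2.bitwise_and(repaired, repaired, mask=feature_mask)
    # What remains is the second cut-out frame image (the background).
    second_cut_out = repaired.copy()
    second_cut_out[feature_mask == 255] = 0
    return feature_layer, second_cut_out
```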
106. Combining the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data;
In this embodiment, the editable picture engineering data is generated directly from the second cut-out frame image, the character layer data and the feature layer data; for the video as a whole, the same conversion is applied to every frame in the subsequent steps.
further, at 106, the following steps may be performed:
1061. pushing the second cut-out frame image, the feature layer data and the character layer data onto a stack in that order and combining them to obtain frame image engineering data.
In this embodiment, because the layers are superimposed in order, frame image engineering data with the correct ordering can be obtained by pushing the layers onto a stack in the order of the second cut-out frame image, the feature layer data and the character layer data. The engineering data can thus be converted stably and edited more conveniently, and no large deviation arises between the original picture and the engineering-data image.
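The stack-based combination of step 1061 can be illustrated with a plain Python list used as a stack; the dictionary layout of the resulting frame image engineering data is an assumption of this sketch.

```python
def combine_frame_engineering_data(second_cut_out, feature_layer, character_layer, time):
    """Push the layers in order so that later layers sit on top when the stack is rendered."""
    layer_stack = []
    layer_stack.append({"type": "background", "image": second_cut_out})   # bottom layer
    layer_stack.append({"type": "feature", "image": feature_layer})       # middle layer
    layer_stack.append({"type": "character", "image": character_layer})   # top layer
    return {"time": time, "layers": layer_stack}
```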
107. And sequencing all the frame image engineering data according to the playing time axis to obtain video engineering data.
In this embodiment, referring to fig. 4, fig. 4 is a schematic diagram of video engineering data with a playing time axis in the embodiment of the present invention. Each frame image of the video can be converted into engineering data, and the engineering data can then be combined with the playing time axis for editing, so that the engineering data of every frame image can be edited in real time.
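Step 107 then only needs to order the per-frame engineering data by its position on the playing time axis; in this sketch the time field is the one attached during the split in step 101, which is an assumed representation.

```python
def build_video_engineering_data(frame_engineering_items):
    """Sequence all frame image engineering data along the playing time axis."""
    return sorted(frame_engineering_items, key=lambda item: item["time"])
```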
Further, after 107, the following steps may also be performed:
1071. receiving a time selection instruction on the playing time axis;
1072. and querying and displaying the video engineering data corresponding to the time indicated by the time selection instruction.
In steps 1071-1072, when a time selection instruction is received, the engineering data obtained from the video conversion that corresponds to the selected time can be queried, called up and displayed based on that time.
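The time-selection query of steps 1071-1072 can be served from the sequenced data with a binary search; the sketch below assumes the sorted list produced above and is only one possible lookup strategy.

```python
import bisect


def query_engineering_data(video_engineering_data, selected_time):
    """Return the frame image engineering data whose time is closest to the selected time."""
    times = [item["time"] for item in video_engineering_data]
    pos = bisect.bisect_left(times, selected_time)
    if pos == 0:
        return video_engineering_data[0]
    if pos == len(times):
        return video_engineering_data[-1]
    # Pick whichever neighbour on the playing time axis is nearer to the selected time.
    before, after = video_engineering_data[pos - 1], video_engineering_data[pos]
    return before if selected_time - before["time"] <= after["time"] - selected_time else after
```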
In the embodiment of the invention, the component elements of each video frame picture are analyzed along the video playing track, the area around each element is partially redrawn, and a project file based on the video playing track is finally generated from the attributes of the elements and the created layers so that an editing tool can use it directly for secondary creation. This solves the technical problems that re-editing a video is currently time-consuming and labor-intensive and that secondary editing cannot be performed once the project file is lost.
With reference to fig. 5, the engineering data conversion method for a video in the embodiment of the present invention is described above, and an engineering data conversion device for a video in the embodiment of the present invention is described below, where an embodiment of the engineering data conversion device for a video in the embodiment of the present invention includes:
a splitting module 501, configured to acquire video data and split the video data based on a video playing time axis to obtain a video frame image set;
the character recognition module 502 is configured to extract a video frame image in the video frame image set, and perform character recognition processing on the video frame image according to a preset character recognition algorithm to obtain a frame image character and a character outline coordinate;
the feature recognition module 503 is configured to perform feature recognition processing on the video frame image according to a preset image recognition algorithm to obtain frame image features and feature contour coordinates;
a character cut-out module 504, configured to cut the frame image characters out of the video frame image based on the character outline coordinates to obtain character layer data and a first cut-out frame image;
a feature cut-out module 505, configured to repair and cut the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image;
a frame image combination module 506, configured to combine the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data;
and a time axis sorting module 507, configured to sort all the frame image engineering data according to the playing time axis to obtain video engineering data.
In the embodiment of the invention, the component elements of each video frame picture are analyzed along the video playing track, the area around each element is partially redrawn, and a project file based on the video playing track is finally generated from the attributes of the elements and the created layers so that an editing tool can use it directly for secondary creation. This solves the technical problems that re-editing a video is currently time-consuming and labor-intensive and that secondary editing cannot be performed once the project file is lost.
Referring to fig. 6, in another embodiment of the engineering data conversion device for video according to the embodiment of the present invention, the engineering data conversion device for video includes:
the splitting module 501 is configured to acquire video data and split the video data based on a video playing time axis to obtain a video frame image set;
the character recognition module 502 is configured to extract a video frame image in the video frame image set, and perform character recognition processing on the video frame image according to a preset character recognition algorithm to obtain a frame image character and a character outline coordinate;
the feature recognition module 503 is configured to perform feature recognition processing on the video frame image according to a preset image recognition algorithm to obtain frame image features and feature contour coordinates;
a character cut-out module 504, configured to cut the frame image characters out of the video frame image based on the character outline coordinates to obtain character layer data and a first cut-out frame image;
a feature cut-out module 505, configured to repair and cut the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image;
a frame image combination module 506, configured to combine the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data;
and a time axis sorting module 507, configured to sort all the frame image engineering data according to the playing time axis to obtain video engineering data.
Wherein the character recognition module 502 is specifically configured to:
capturing the video frame image from the video frame image set, and performing coordinate size recognition processing on the video frame image to obtain a coordinate system based on the frame image size, so that data conversion is carried out according to the coordinate system.
The character recognition module 502 may be further specifically configured to:
analyzing horizontal and vertical pixels of the video frame image according to a preset coordinate recognition algorithm to obtain the number of horizontal pixels and the number of vertical pixels;
and establishing a coordinate system based on the frame image size from the ranges defined by the number of horizontal pixels and the number of vertical pixels.
The character recognition module 502 may be further specifically configured to:
performing character recognition processing on the video frame image according to a preset character recognition algorithm to obtain frame image characters, a frame image character format and character outline recognition points, and marking the frame image character format on the frame image characters;
and aggregating the character outline recognition points to obtain the character outline coordinates of the frame image characters.
Wherein the frame image combination module 506 is specifically configured to:
push the second cut-out frame image, the feature layer data and the character layer data onto a stack in that order and combine them to obtain frame image engineering data.
The feature cut-out module 505 is specifically configured to:
repair, according to a preset repair algorithm, the frame image features corresponding to the feature contour coordinates in the first cut-out frame image to obtain repaired frame image features;
and cut the repaired frame image features out based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image.
The apparatus for converting engineering data of a video further includes a data selection module 508, where the data selection module 508 is specifically configured to:
receiving a time selection instruction on the playing time axis;
and querying and displaying the video engineering data corresponding to the time indicated by the time selection instruction.
In the embodiment of the invention, the component elements of each video frame picture are analyzed along the video playing track, the area around each element is partially redrawn, and a project file based on the video playing track is finally generated from the attributes of the elements and the created layers so that an editing tool can use it directly for secondary creation. This solves the technical problems that re-editing a video is currently time-consuming and labor-intensive and that secondary editing cannot be performed once the project file is lost.
Fig. 5 and fig. 6 describe the video engineering data conversion device in the embodiment of the present invention in detail from the perspective of modular functional entities; the video engineering data conversion device in the embodiment of the present invention is described in detail below from the perspective of hardware processing.
Fig. 7 is a schematic structural diagram of a video engineering data conversion device. The video engineering data conversion device 700 may differ considerably depending on configuration or performance, and may include one or more processors (CPUs) 710 (e.g., one or more processors), a memory 720, and one or more storage media 730 (e.g., one or more mass storage devices) for storing applications 733 or data 732. The memory 720 and the storage medium 730 may be transient storage or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations for the video engineering data conversion device 700. Further, the processor 710 may be configured to communicate with the storage medium 730 and execute the series of instruction operations in the storage medium 730 on the video engineering data conversion device 700.
The video engineering data conversion device 700 may also include one or more power supplies 740, one or more wired or wireless network interfaces 750, one or more input-output interfaces 760, and/or one or more operating systems 731, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the structure of the video engineering data conversion device shown in fig. 7 does not constitute a limitation of the device, which may include more or fewer components than those shown, combine some components, or arrange the components differently.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, or a volatile computer-readable storage medium, having stored therein instructions, which, when executed on a computer, cause the computer to perform the steps of the method for converting engineering data of a video.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses, and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A method for converting engineering data of a video is characterized by comprising the following steps:
acquiring video data, and splitting the video data based on a video playing time axis to obtain a video frame image set;
extracting video frame images in the video frame image set, and performing character recognition processing on the video frame images according to a preset character recognition algorithm to obtain frame image characters and character outline coordinates;
according to a preset image recognition algorithm, carrying out feature recognition processing on the video frame image to obtain frame image features and feature contour coordinates;
cutting the frame image characters out of the video frame image based on the character outline coordinates to obtain character layer data and a first cut-out frame image;
repairing and cutting the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image;
combining the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data;
and sequencing all the frame image engineering data according to the playing time axis to obtain video engineering data.
2. The method for converting engineering data of a video according to claim 1, wherein the extracting the video frame image from the video frame image set comprises:
capturing the video frame image from the video frame image set, and performing coordinate size recognition processing on the video frame image to obtain a coordinate system based on the frame image size, so that data conversion is carried out according to the coordinate system.
3. The method of claim 2, wherein the performing coordinate size recognition processing on the video frame image to obtain a coordinate system based on the frame image size comprises:
analyzing horizontal and vertical pixels of the video frame image according to a preset coordinate recognition algorithm to obtain the number of horizontal pixels and the number of vertical pixels;
and establishing a coordinate system based on the frame image size from the ranges defined by the number of horizontal pixels and the number of vertical pixels.
4. The method for converting engineering data of a video according to claim 1, wherein the performing character recognition processing on the video frame image according to a preset character recognition algorithm to obtain frame image characters and character outline coordinates comprises:
performing character recognition processing on the video frame image according to a preset character recognition algorithm to obtain frame image characters, a frame image character format and character outline recognition points, and marking the frame image character format on the frame image characters;
and aggregating the character outline recognition points to obtain the character outline coordinates of the frame image characters.
5. The method for converting engineering data of a video according to claim 1, wherein the combining the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data comprises:
pushing the second cut-out frame image, the feature layer data and the character layer data onto a stack in that order and combining them to obtain frame image engineering data.
6. The method for converting engineering data of a video according to claim 1, wherein the repairing and cutting the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image comprises:
repairing, according to a preset repair algorithm, the frame image features corresponding to the feature contour coordinates in the first cut-out frame image to obtain repaired frame image features;
and cutting the repaired frame image features out based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image.
7. The method for converting engineering data of a video according to claim 1, wherein after the sequencing of all the frame image engineering data according to the playing time axis to obtain video engineering data, the method further comprises:
receiving a time selection instruction on the playing time axis;
and querying and displaying the video engineering data corresponding to the time indicated by the time selection instruction.
8. An apparatus for converting engineering data of a video, the apparatus comprising:
the splitting module is used for acquiring video data and splitting the video data based on a video playing time axis to obtain a video frame image set;
the character recognition module is used for extracting the video frame images in the video frame image set and carrying out character recognition processing on the video frame images according to a preset character recognition algorithm to obtain frame image characters and character outline coordinates;
the characteristic identification module is used for carrying out characteristic identification processing on the video frame image according to a preset image identification algorithm to obtain frame image characteristics and characteristic outline coordinates;
the character cut-out module is used for cutting the frame image characters out of the video frame image based on the character outline coordinates to obtain character layer data and a first cut-out frame image;
the feature cut-out module is used for repairing and cutting the frame image features out of the first cut-out frame image based on the feature contour coordinates to obtain feature layer data and a second cut-out frame image;
the frame image combination module is used for combining the second cut-out frame image, the character layer data and the feature layer data to obtain frame image engineering data;
and the time axis sequencing module is used for sequencing all the frame image engineering data according to the playing time axis to obtain video engineering data.
9. An apparatus for converting engineering data of a video, comprising: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the engineering data conversion device of the video to perform the engineering data conversion method of the video of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the method for engineering data conversion of video according to any one of claims 1 to 7.
Priority Applications (1)
- CN202211689484.6A, filed 2022-12-27: Video engineering data conversion method, device, equipment and storage medium
Publications (1)
- CN115988263A, published 2023-04-18 (pending)
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination