
CN108986117B - Video image segmentation method and device - Google Patents

Video image segmentation method and device

Info

Publication number
CN108986117B
CN108986117B (application CN201810802302.9A)
Authority
CN
China
Prior art keywords
image
segmented
segmentation
determining
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810802302.9A
Other languages
Chinese (zh)
Other versions
CN108986117A (en)
Inventor
曾伟
刘浩
王爽
王昊
刘琦
聂冉
李泉材
李佳明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Youku Culture Technology Beijing Co ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN201810802302.9A priority Critical patent/CN108986117B/en
Publication of CN108986117A publication Critical patent/CN108986117A/en
Application granted granted Critical
Publication of CN108986117B publication Critical patent/CN108986117B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The disclosure relates to a video image segmentation method and device, the method comprising: identifying a target object in an image to be segmented, wherein the image to be segmented is a video frame image in a video; determining, in the image to be segmented, a segmentation region corresponding to the identified target object; and generating a segmented video image of the image to be segmented according to the segmentation region, and controlling a terminal to play the segmented video image. Embodiments of the disclosure can generate segmented video images for different target objects or different scenes from a single image to be segmented, so as to meet different viewing requirements and reduce the resource demands of the shooting and editing stages.

Description

Video image segmentation method and device
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a video image segmentation method and apparatus.
Background
In conventional video playing, different shooting devices are generally required to shoot video images of different scenes, or a single shooting device must shoot the video images of the different scenes repeatedly; during playing, the video images of one scene are selected as required and played after later-stage video editing. Resources are therefore wasted in both the video shooting and the video editing stages.
Disclosure of Invention
In view of this, the present disclosure provides a video image segmentation method and apparatus, so as to solve the problem of resource waste when shooting or editing a video.
According to an aspect of the present disclosure, there is provided a video image segmentation method, the method including:
identifying a target object in an image to be segmented, wherein the image to be segmented is a video frame image in a video;
determining a segmentation region corresponding to the identified target object in the image to be segmented;
generating a segmentation video image of the image to be segmented according to the segmentation region,
and controlling the terminal to play the segmented video image.
In one possible implementation, identifying a target object in an image to be segmented includes:
and identifying a target object in the image to be segmented by utilizing a deep learning algorithm.
In one possible implementation manner, determining a segmentation region corresponding to the identified target object in the image to be segmented includes:
and determining a segmentation region corresponding to the identified target object in the image to be segmented according to a composition rule by utilizing a deep learning algorithm.
In one possible implementation manner, determining a segmentation region corresponding to the identified target object in the image to be segmented includes:
and determining a segmentation region corresponding to the identified target object and conforming to a specified scene in the image to be segmented.
In one possible implementation manner, determining a segmentation region corresponding to the identified target object in the image to be segmented includes:
and determining a segmentation region comprising the target object in the image to be segmented.
In one possible implementation, determining a segmentation region including the target object in the image to be segmented includes:
in the image to be segmented, at least two segmentation regions comprising the target object are determined, wherein the image corresponding to the segmentation region with the larger area in the at least two segmentation regions comprises the image corresponding to the segmentation region with the smaller area.
In one possible implementation manner, determining a segmentation region corresponding to the identified target object in the image to be segmented includes:
identifying a target site on the target object;
and determining a segmentation region corresponding to the target part in the image to be segmented.
In one possible implementation, the method further includes:
determining resolution information of the image to be segmented;
determining the segmentation size information of the image to be segmented according to the resolution information;
determining a segmentation region corresponding to the identified target object in the image to be segmented, including:
and determining a segmentation area corresponding to the identified target object and conforming to the segmentation size information in the image to be segmented.
In one possible implementation, the method further includes:
determining definition information of the image to be segmented;
determining the segmentation size information of the image to be segmented according to the resolution information, wherein the determination comprises the following steps:
and determining the segmentation size information of the image to be segmented according to the resolution information and the definition information.
In one possible implementation, the method further includes:
determining first playing display information, wherein the first playing display information comprises screen physical size information and/or playing resolution information;
and the split video image played by the terminal accords with the screen physical size information and/or the playing resolution information.
In one possible implementation, the method further includes:
determining second playing display information, wherein the second playing display information comprises horizontal screen display information or vertical screen display information;
and the segmented video image played by the terminal accords with the horizontal screen display information or the vertical screen display information.
In a possible implementation manner, generating a segmented video image of the image to be segmented according to the segmented region includes:
determining the corresponding coordinate information of the segmentation region in the image to be segmented;
and generating a segmentation video image of the image to be segmented according to the coordinate information.
In a possible implementation manner, the controlling the terminal to play the segmented video image includes:
determining the weight of each segmented video image in the image to be segmented;
determining a recommendation result in each segmented video image according to the weight;
and controlling the terminal to play the recommendation result.
In a possible implementation manner, the controlling the terminal to play the segmented video image includes:
acquiring target object selection information and/or scene selection information;
determining a selection result in the segmented video image according to the target object selection information and/or the scene selection information;
and controlling the terminal to play the selection result.
According to an aspect of the present disclosure, there is provided a video image segmentation apparatus, the apparatus including:
the target object identification module is used for identifying a target object in an image to be segmented, wherein the image to be segmented is a video frame image in a video;
the segmentation region determining module is used for determining a segmentation region corresponding to the identified target object in the image to be segmented;
a segmentation video image generation module for generating a segmentation video image of the image to be segmented according to the segmentation region,
and the playing module is used for controlling the terminal to play the segmented video image.
In one possible implementation, the target object identification module includes:
and the first target object identification submodule is used for identifying a target object in the image to be segmented by utilizing a deep learning algorithm.
In one possible implementation manner, the segmentation area determination module includes:
and the first segmentation region determining submodule is used for determining a segmentation region corresponding to the identified target object in the image to be segmented according to the composition rule by utilizing a deep learning algorithm.
In one possible implementation manner, the segmentation area determination module includes:
and the second segmentation region determining submodule is used for determining a segmentation region which corresponds to the identified target object and accords with a specified scene in the image to be segmented.
In one possible implementation manner, the segmentation area determination module includes:
and the third segmentation region determining submodule is used for determining a segmentation region comprising the target object in the image to be segmented.
In a possible implementation manner, the third segmentation area determination submodule is configured to:
in the image to be segmented, at least two segmentation regions comprising the target object are determined, wherein the image corresponding to the segmentation region with the larger area in the at least two segmentation regions comprises the image corresponding to the segmentation region with the smaller area.
In one possible implementation manner, the segmentation area determination module includes:
a target part identification submodule for identifying a target part on the target object;
and the fourth segmentation region determining submodule is used for determining a segmentation region corresponding to the target part in the image to be segmented.
In one possible implementation, the apparatus further includes:
the resolution information determining module is used for determining the resolution information of the image to be segmented;
the segmentation size information determining module is used for determining the segmentation size information of the image to be segmented according to the resolution information;
the segmentation region determination module includes:
and the fifth segmentation area determination submodule is used for determining a segmentation area which corresponds to the identified target object and accords with the segmentation size information in the image to be segmented.
In one possible implementation, the apparatus further includes:
the definition information determining module is used for determining the definition information of the image to be segmented;
the division size information determination module includes:
and the first segmentation information determining submodule is used for determining the segmentation size information of the image to be segmented according to the resolution information and the definition information.
In one possible implementation, the apparatus further includes:
the first playing and displaying information determining module is used for determining first playing and displaying information, wherein the first playing and displaying information comprises screen physical size information and/or playing resolution information;
and the split video image played by the terminal accords with the screen physical size information and/or the playing resolution information.
In one possible implementation, the apparatus further includes:
the second playing and displaying information determining module is used for determining second playing and displaying information, wherein the second playing and displaying information comprises horizontal screen displaying information or vertical screen displaying information;
and the segmented video image played by the terminal accords with the horizontal screen display information or the vertical screen display information.
In one possible implementation, the segmented video image generation module includes:
the coordinate information determining submodule is used for determining the corresponding coordinate information of the segmentation area in the image to be segmented;
and the first segmentation video image generation submodule is used for generating the segmentation video image of the image to be segmented according to the coordinate information.
In one possible implementation manner, the playing module includes:
the weight determining submodule is used for determining the weight of each segmented video image in the image to be segmented;
a recommendation result determining submodule, configured to determine a recommendation result in each of the segmented video images according to the weight;
and the recommendation result playing submodule is used for controlling the terminal to play the recommendation result.
In one possible implementation manner, the playing module includes:
the selection submodule is used for acquiring target object selection information and/or scene selection information;
a selection result determining submodule, configured to determine a selection result in the segmented video image according to the target object selection information and/or the scene selection information;
and the selection result playing submodule is used for controlling the terminal to play the selection result.
According to an aspect of the present disclosure, there is provided a video image segmentation apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the above-described video image segmentation method.
According to an aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described video image segmentation method.
In the embodiment of the disclosure, after a target object in an image to be segmented is identified and a segmentation area corresponding to the target object is determined in the image to be segmented, a segmentation video image is generated according to the segmentation area, and the terminal is controlled to play the segmentation video image. According to the embodiment of the disclosure, different target objects can be determined, and different segmentation areas corresponding to the target objects can be determined, so that the segmentation video images aiming at different target objects or different scenes can be generated by using one image to be segmented, different viewing requirements can be met, and the resource requirements of shooting and editing links can be reduced.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
FIG. 1 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
FIG. 2 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
fig. 3 illustrates a schematic diagram of determining a segmentation region according to a composition rule in a video image segmentation method according to an embodiment of the present disclosure;
FIG. 4 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating different scenes in a video image segmentation method according to an embodiment of the present disclosure;
FIG. 6 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
FIG. 7 is a diagram illustrating a segmentation region in a video image segmentation method according to an embodiment of the present disclosure;
FIG. 8 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
fig. 9 is a schematic diagram illustrating a segmentation region corresponding to a target region in a video image segmentation method according to an embodiment of the present disclosure;
FIG. 10 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
FIG. 11 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
FIG. 12 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
FIG. 13 is a schematic diagram illustrating landscape and portrait displays in a video image segmentation method according to an embodiment of the present disclosure;
FIG. 14 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
FIG. 15 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
FIG. 16 is a diagram illustrating weights in a video image segmentation method according to an embodiment of the present disclosure;
FIG. 17 shows a flow diagram of a video image segmentation method according to an embodiment of the present disclosure;
fig. 18 illustrates a schematic diagram of segmenting a video image in a video image segmentation method according to an embodiment of the present disclosure;
FIG. 19 shows a schematic diagram of a video image segmentation apparatus according to an embodiment of the present disclosure;
FIG. 20 shows a schematic diagram of a video image segmentation apparatus according to an embodiment of the present disclosure;
FIG. 21 is a block diagram illustrating an apparatus for video image segmentation in accordance with an exemplary embodiment;
fig. 22 is a block diagram illustrating an apparatus for video image segmentation in accordance with an exemplary embodiment.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
Fig. 1 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 1, the video image segmentation method includes:
step S10, identifying a target object in an image to be segmented, which is a video frame image in a video.
In one possible implementation, the image to be segmented may include a video frame image in a video captured by any video capture device. The video may be a live video stream or a recorded video. The image to be segmented may be an original video frame image shot by a video shooting device, or may be a video frame image obtained by preprocessing the original video frame image, where the preprocessing may include noise reduction processing, resolution adjustment, and the like.
Any object in the image to be segmented can be determined as a target object according to requirements. For example, the target object may be a human, an animal, a plant, a vehicle, or the like. The target object may also be a sub-part of an object, such as a human face, a human leg, or an animal face. The target object may include one object or a plurality of objects. For example, suppose the image A to be segmented shows two people, Zhang San and Li Si, having a conversation. The target object may be Zhang San or Li Si, and either may be identified in the image A to be segmented. The target object may also be the face of Zhang San or the face of Li Si, and either face may be recognized in the image A to be segmented.
The target object may be identified in the image to be segmented using image recognition or the like.
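As an illustrative sketch of step S10 (not the disclosed implementation), recognition can be modeled as a callable that maps a decoded frame to a list of detections; here the detector is a stub standing in for any image-recognition or deep-learning backend, and the names `Detection`, `identify_targets`, and `stub_detector` are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Detection:
    label: str    # e.g. "face", "person"
    box: tuple    # (x, y, w, h) in pixel coordinates
    score: float  # detector confidence in [0, 1]

# A detector is any callable mapping a frame to detections; in practice it
# would wrap a real image-recognition model.
Detector = Callable[[object], List[Detection]]

def identify_targets(frame, detector: Detector, wanted_labels, min_score=0.5):
    """Step S10 sketch: keep only detections of the requested target classes."""
    return [d for d in detector(frame)
            if d.label in wanted_labels and d.score >= min_score]

# Stub detector for illustration: pretends two faces and a cat were found.
def stub_detector(frame):
    return [Detection("face", (100, 80, 60, 60), 0.97),
            Detection("face", (400, 90, 55, 55), 0.91),
            Detection("cat",  (10, 10, 40, 40), 0.88)]

targets = identify_targets(None, stub_detector, {"face"})
print(len(targets))  # 2 — only the face detections survive the filter
```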
Step S20, determining a segmentation region corresponding to the identified target object in the image to be segmented.
In one possible implementation, the segmentation region may have any shape, such as a rectangle, and may have a set area. The areas of the segmentation regions corresponding to different target objects in the image to be segmented may be the same or different. For example, for the target objects Zhang San and Li Si identified in the image A to be segmented, a segmentation region 1 corresponding to Zhang San and a segmentation region 2 corresponding to Li Si may be determined according to the set areas. The areas of segmentation region 1 and segmentation region 2 may be the same or different.
The area of the segmentation region can also be determined according to the area occupied by the target object in the image to be segmented. For example, in the image A to be segmented, based on the areas occupied by Zhang San and Li Si in the image, the area of segmentation region 1 corresponding to Zhang San may be set to 40% of the total area of the image A, and the area of segmentation region 2 corresponding to Li Si to 20% of the total area.
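The area-fraction rule above can be sketched as follows; keeping the frame's aspect ratio, centring the crop on the target, and clamping to the frame bounds are illustrative assumptions, not details prescribed by the disclosure:

```python
import math

def crop_for_fraction(frame_w, frame_h, cx, cy, fraction):
    """Return a crop (x, y, w, h) covering `fraction` of the frame area,
    with the frame's aspect ratio, centred on the target point (cx, cy)
    and clamped so it stays inside the frame. Illustrative only."""
    s = math.sqrt(fraction)  # linear scale factor for both dimensions
    w, h = round(frame_w * s), round(frame_h * s)
    x = min(max(cx - w // 2, 0), frame_w - w)
    y = min(max(cy - h // 2, 0), frame_h - h)
    return x, y, w, h

# e.g. a region that is 40% of a 1920x1080 frame, centred near (600, 500)
print(crop_for_fraction(1920, 1080, 600, 500, 0.40))
```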
A plurality of segmentation regions corresponding to the target object may be determined based on the entire target object, or a plurality of segmentation regions corresponding to one or more sub-parts of the target object may be determined based on those sub-parts.
The target object, or the sub-part of the target object, may be located at a set position within the segmentation region, for example at the center of the region or slightly below its middle.
A plurality of segmentation regions corresponding to one target object can be determined to meet different viewing requirements. For example, suppose the set segmentation-region areas are area a, area b, and area c, each of a different size, and the target objects identified in the image A to be segmented are Zhang San and Li Si. The segmentation regions determined for Zhang San may include region 1a with area a, region 1b with area b, and region 1c with area c; those determined for Li Si may include region 2a with area a, region 2b with area b, and region 2c with area c.
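The multiple set areas per target object can be sketched as nested crops sharing the target's centre, matching the later claim that the larger region's image contains the smaller region's image; the concrete sizes and the centring rule are assumptions for illustration:

```python
def centered_region(cx, cy, w, h, frame_w, frame_h):
    """Axis-aligned (x, y, w, h) region centred on the target, clamped to the frame."""
    x = min(max(cx - w // 2, 0), frame_w - w)
    y = min(max(cy - h // 2, 0), frame_h - h)
    return (x, y, w, h)

def contains(outer, inner):
    """True if rectangle `inner` lies entirely inside rectangle `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

frame_w, frame_h = 1920, 1080
cx, cy = 960, 540  # target centre
# Three set sizes (areas a > b > c) -- an assumption for illustration.
sizes = [(1280, 720), (960, 540), (640, 360)]
regions = [centered_region(cx, cy, w, h, frame_w, frame_h) for w, h in sizes]
# The region with the larger area contains the region with the smaller area.
assert contains(regions[0], regions[1]) and contains(regions[1], regions[2])
```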
And step S30, generating a segmentation video image of the image to be segmented according to the segmentation area.
In one possible implementation, the segmentation region may be used to cut out part of the image to be segmented, and the cut-out image may be determined as the segmented video image.
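A minimal sketch of step S30, assuming the decoded frame is available as a NumPy array and the segmentation region is an (x, y, w, h) rectangle (both assumptions, since the disclosure does not fix a representation):

```python
import numpy as np

def crop_segment(frame: np.ndarray, region):
    """Step S30 sketch: cut the segmentation region out of the frame.
    `region` is (x, y, w, h); the result is the segmented video image."""
    x, y, w, h = region
    return frame[y:y + h, x:x + w]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # stand-in decoded frame
segment = crop_segment(frame, (320, 180, 1280, 720))
print(segment.shape)  # (720, 1280, 3)
```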
A plurality of segmented video images may be generated for one target object in an image to be segmented. The multiple segmented video images of a target object may be used to represent one or more sub-parts of the target object, or to represent the target object in different views.

And step S40, controlling the terminal to play the segmented video image.
In a possible implementation manner, the terminal may be controlled to play the segmented video images according to requirements, and one or more segmented video images may be played on the screen of the terminal. For example, segmented video images of one or more target objects may be played. For the image A to be segmented, one segmented video image of Zhang San or Li Si may be played, or a plurality of segmented video images of Zhang San and/or Li Si may be played.
In a possible implementation manner, the video image segmentation method of the embodiments of the present disclosure may be performed on the server side, on the side of the terminal that plays the video, or jointly by the server and the terminal. When the server and the terminal perform the method together, the server may send the determined segmentation regions and the image to be segmented to the terminal as a file, and the terminal generates and plays the segmented video images according to the coordinate set of the segmentation regions and the image to be segmented. The server may also generate independent segmented videos from the segmented video images and provide them to the terminal for playing. The present disclosure does not limit the execution subject of the video image segmentation method.
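The file hand-off of segmentation coordinates from server to terminal could, for instance, be serialized as JSON; the schema and field names below are hypothetical, since the disclosure does not specify a format:

```python
import json

# Hypothetical file format for passing segmentation regions to the terminal.
segments = {
    "video_id": "example-stream",
    "frame_index": 1024,
    "regions": [
        {"target": "person-1", "box": [320, 180, 1280, 720]},
        {"target": "person-1-face", "box": [840, 260, 240, 240]},
    ],
}
payload = json.dumps(segments)

# The terminal parses the file and crops each listed region from the
# corresponding frame of the image to be segmented.
restored = json.loads(payload)
assert restored["regions"][0]["box"] == [320, 180, 1280, 720]
```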
In the embodiment, a target object in an image to be segmented is identified, a segmentation area corresponding to the target object is determined in the image to be segmented, a segmentation video image is generated according to the segmentation area, and the segmentation video image is controlled to be played by a terminal. By determining different target objects and different segmentation areas corresponding to the target objects, the generation of segmentation video images aiming at different target objects or different scenes by using one image to be segmented can be realized, so that different viewing requirements are met, and the resource requirements of shooting and editing links are reduced.
Fig. 2 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 2, step S10 in the video image segmentation method includes:
and step S11, identifying the target object in the image to be segmented by using a deep learning algorithm.
In one possible implementation, a deep learning algorithm forms more abstract high-level representations of attributes or features by combining low-level features, so as to discover a distributed feature representation of the data. Deep learning is a machine-learning method based on representation learning of data. An image, for example, can be represented in many ways, such as a vector of intensity values for each pixel, or more abstractly as a series of edges or specially shaped regions. A deep learning algorithm can learn a task (e.g., a face recognition task) from sample images using some particular representation and then perform the learned task on new images.
A deep neural network is a neural network based on a deep learning algorithm. The target object may be identified in the image to be segmented using a deep neural network.
Based on the strong processing capability of deep learning algorithms, the video image segmentation method can segment live video in real time and online. A live image shot by the shooting device can be divided into segmented video images, and during the live broadcast the segmented video images are selected and played according to requirements. In this way, the images shot by a single shooting device yield segmented video images of multiple scenes for playback, which avoids the resource waste of live video shooting and editing, enhances the expressive power of the live image, and meets the real-time requirements of live broadcasting.
In this embodiment, a target object in an image to be segmented may be identified using a deep learning algorithm. The deep learning algorithm has strong processing capability and accurate processing result, and can improve the identification accuracy and identification efficiency of the target object in the video image segmentation method.
In one possible implementation manner, determining a segmentation region corresponding to the identified target object in the image to be segmented includes:
and determining a segmentation region corresponding to the identified target object in the image to be segmented according to a composition rule by utilizing a deep learning algorithm.
In one possible implementation, a deep neural network is a neural network based on a deep learning algorithm. A segmentation region corresponding to the identified target object may be determined in the image to be segmented using a deep neural network.
The composition rule may include a rule for setting a position and an area occupied by the target object in the divided region. And determining the segmentation area of the target object according to the composition rule, and obtaining the segmented video image according to the segmentation area, wherein the segmented video image has harmonious and complete picture and strong artistic expressive force and is in accordance with the aesthetic sense of a viewer.
Fig. 3 is a schematic diagram illustrating segmentation of a video image according to composition rules in a video image segmentation method according to an embodiment of the present disclosure. As shown in fig. 3, the upper half of fig. 3 includes two composition rules. Composition rule 1 on the left is: the segmentation region is divided into nine sub-regions by four straight lines arranged in a nine-grid pattern, and the lines intersect at four points. These intersection points are the positions where the target object is preferentially placed during composition. Different weights may be set for the four intersections, and the intersection with the higher weight is the preferred placement position. Composition rule 2 on the right is a composition curve; the target object can be placed according to the regions divided by the curve.
In the lower half of fig. 3, three images are shown. The leftmost diagram shows a target object identified in the image to be segmented: the face of the person on the right side of the image. The middle diagram shows the identified target object being composed according to composition rule 1, where the target object is placed at one intersection point to determine the segmentation region. The rightmost diagram shows the segmented video image finally obtained from the segmentation region.
The deep neural network can be trained to use the set composition rule to perform composition in the image to be segmented according to the target object and determine the segmentation region. The segmentation area can be determined quickly and accurately by using the trained deep neural network.
In this embodiment, a deep learning algorithm is used, the segmentation region corresponding to the identified target object can be quickly and accurately determined in the image to be segmented according to the composition rule, and the determined segmentation region meets the composition rule, so that the aesthetic requirement of a viewer can be met.
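In practice the embodiment trains a deep neural network to apply the composition rule; purely as an illustrative sketch of the geometric part of composition rule 1, the crop can be positioned so that the target object's centre lands on a chosen nine-grid intersection. The function name, default intersection, and clamping policy are assumptions, not details from the disclosure:

```python
def thirds_crop(frame_w, frame_h, obj_cx, obj_cy, crop_w, crop_h,
                intersection=(1 / 3, 1 / 3)):
    """Place the target object's centre on a nine-grid intersection.

    The crop is positioned so (obj_cx, obj_cy) falls on the chosen
    intersection of the crop's nine-grid, then clamped to the frame
    so the segmentation region stays inside the image to be segmented.
    Returns (left, top, right, bottom) in frame coordinates.
    """
    ix, iy = intersection
    left = obj_cx - crop_w * ix
    top = obj_cy - crop_h * iy
    # Clamp the region to the frame bounds.
    left = max(0, min(left, frame_w - crop_w))
    top = max(0, min(top, frame_h - crop_h))
    return int(left), int(top), int(left + crop_w), int(top + crop_h)
```

A weighted variant would simply try the four intersections in descending weight order and keep the first placement that needs no clamping.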
Fig. 4 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 4, step S20 in the video image segmentation method includes:
step S21, determining a segmentation region corresponding to the identified target object and corresponding to the designated scene in the image to be segmented.
In one possible implementation, scene differences refer to differences in how much of the subject appears in the camera viewfinder, caused by the varying distance between the camera and the subject. Scenes are conventionally divided into five categories from near to far: close-up (above the shoulders), medium close-up (above the chest), medium shot (above the knees), full shot (the whole body and surrounding background), and long shot (the environment around the subject). Alternating among scenes in a video supports the narration of the story and the expression of the characters' thoughts and emotions, makes the treatment of character relationships more expressive, and enhances the appeal of the video.
Fig. 5 is a schematic diagram illustrating different scenes in a video image segmentation method according to an embodiment of the present disclosure. As shown in fig. 5, for a target person, different scenes such as extreme close-up, close-up, medium close-up, medium shot, medium long shot, full shot, and long shot can be divided based on the person's head. Different scene division modes can be provided for different target objects.
The appointed scene of each target object in the image to be segmented can be determined according to requirements. For example, the specified scene may be determined according to the content of the video, the viewer's preference or setting, and the like.
Different specified scenes may be set for the target objects in the image to be segmented. For example, for the target object Zhang San in image A to be segmented, who plays a leading role, the specified scenes may be close-up, medium close-up, and medium shot. For the target object Li Si in image A, who plays a supporting role, the specified scenes may be medium close-up and long shot. The same specified scene may also be set for several target objects in the image to be segmented; for example, the same specified scenes, close-up and medium close-up, may be determined for both Zhang San and Li Si.
In this embodiment, by setting the designated scene, the segmentation region corresponding to the designated scene can be determined in the image to be segmented, so that the generated segmented video image is more targeted, and different viewing requirements can be met.
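A minimal sketch of scene-specific region determination, assuming a detected head bounding box. The `SCENE_RATIOS` table, the portrait crop shape, and the headroom factor are all hypothetical illustration values, not parameters from the disclosure:

```python
# Hypothetical scene-to-extent ratios: crop height as a multiple of the
# detected head height. A real system would tune these per production.
SCENE_RATIOS = {"close-up": 2.0, "medium": 6.0, "full": 9.0}

def scene_region(head_box, scene, frame_w, frame_h):
    """Expand a detected head box into a segmentation region for a scene."""
    x1, y1, x2, y2 = head_box
    head_h = y2 - y1
    crop_h = min(frame_h, SCENE_RATIOS[scene] * head_h)
    crop_w = min(frame_w, crop_h * 9 / 16)      # assume a 9:16 portrait crop
    cx = (x1 + x2) / 2
    # Centre horizontally on the head, keep some headroom above it,
    # and clamp the region to the frame bounds.
    left = max(0, min(cx - crop_w / 2, frame_w - crop_w))
    top = max(0, min(y1 - 0.2 * crop_h, frame_h - crop_h))
    return int(left), int(top), int(left + crop_w), int(top + crop_h)
```

The same head box thus yields progressively wider regions as the specified scene moves from close-up toward full shot.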
Fig. 6 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 6, step S20 in the video image segmentation method includes:
step S22, determining a segmentation region including the target object in the image to be segmented.
In a possible implementation manner, after the target object is identified in the image to be segmented, no further sub-part segmentation is performed on the target object when its corresponding segmentation region is determined; the determined segmentation region includes the whole target object. When the target object has a plurality of corresponding segmentation regions, each segmentation region includes the whole target object.
The areas of the plurality of divided regions corresponding to the target object may be the same, and the target object may be located at different positions of the plurality of divided regions having the same area.
The areas of the plurality of divided regions corresponding to the target object may be different. The target object may be located at different positions or at the same position of a plurality of divided regions having different areas, respectively.
Different sub-parts of the target object may also be determined as target objects. And determining the segmentation areas corresponding to the target object and different sub-parts of the target object to generate segmentation video images of different scenes of the target object.
In this embodiment, the determined segmentation region includes the target object, and complete information of the target object may be retained in the generated segmentation video image. And a segmented region can be generated quickly for a target object.
In one possible implementation, step S22 includes:
in the image to be segmented, at least two segmentation regions comprising the target object are determined, wherein the image corresponding to the segmentation region with the larger area in the at least two segmentation regions comprises the image corresponding to the segmentation region with the smaller area.
Fig. 7 is a schematic diagram illustrating segmentation regions in a video image segmentation method according to an embodiment of the disclosure. As shown in fig. 7, the head of the person in the image to be segmented is the target object, and three segmentation regions are determined in the image to be segmented: the region with the smallest area in the middle is the first segmentation region, the region with the largest area is the third segmentation region, and the region between them is the second segmentation region. It can be seen that the image corresponding to a segmentation region with a larger area includes the image corresponding to a segmentation region with a smaller area.
In this embodiment, among a plurality of divided regions determined in an image to be divided, an image corresponding to a divided region having a larger area includes an image corresponding to a divided region having a smaller area. By nesting the segmentation areas, the segmentation areas can be quickly determined in the image to be segmented, and the relevance among the segmentation areas is strong, so that the segmentation areas can be conveniently selected for use in subsequent playing.
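The nesting property can be sketched as follows; the function name, scale factors, and shared-centre policy are assumptions for illustration. Because every region is anchored to the same centre and clamped to the same frame, each larger region contains the smaller ones:

```python
def nested_regions(frame_w, frame_h, cx, cy, base_w, base_h,
                   scales=(1.0, 1.5, 2.0)):
    """Generate nested segmentation regions around one target centre.

    Returns (left, top, right, bottom) tuples ordered from the smallest
    region to the largest; each larger region contains the smaller ones.
    """
    regions = []
    for s in scales:
        w, h = min(frame_w, base_w * s), min(frame_h, base_h * s)
        # Centre on the target, then clamp to the frame bounds.
        left = max(0, min(cx - w / 2, frame_w - w))
        top = max(0, min(cy - h / 2, frame_h - h))
        regions.append((int(left), int(top), int(left + w), int(top + h)))
    return regions
```

At playback time, switching among the nested regions of one target amounts to choosing how tightly to frame it.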
Fig. 8 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 8, step S20 in the video image segmentation method includes:
step S23, identifying a target site on the target object.
In one possible implementation, after the target object is identified in the image to be segmented, one or more target portions on the target object may be further identified as needed. For example, the recognized target object is a human, and sub-parts of the head, hands, legs, and the like of the human may be determined as the target parts. And further identifying the target portion in the image to be segmented.
The target site may be identified in the image to be segmented using a deep learning algorithm.
Step S24, determining a segmentation region corresponding to the target region in the image to be segmented.
In one possible implementation, one or more segmented regions may be determined for a target site. Fig. 9 is a schematic diagram illustrating a segmentation region corresponding to a target region in a video image segmentation method according to an embodiment of the present disclosure. As shown in fig. 9, the top half of fig. 9 is the image to be segmented. The target object identified in the image to be segmented is a person, the head and the hand of the person can be further used as target parts, after the head and the hand are further identified in the image to be segmented, segmentation areas corresponding to the head and the hand are determined, and two segmentation video images in the lower half part of fig. 9 are generated according to the segmentation areas of the head and the hand.
In the present embodiment, a segmented region is determined according to a target region on a target object, and segmented video images for different target regions are generated. Different details of the target object can be embodied, and richer video watching scenes are provided.
Fig. 10 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 10, the video image segmentation method further includes:
step S50, determining resolution information of the image to be segmented.
In one possible implementation, the image to be segmented may have different resolutions depending on the shooting device and the parameters set during shooting. As shooting devices continue to improve, the resolution of the image to be segmented can reach 2K (1920 × 1080), 4K (3840 × 2160), 5K (5120 × 2160), 8K (7680 × 4320), 10K (10240 × 4320), and so on. The higher the resolution, the greater the number of pixels in the image to be segmented, and the better the definition of the image at the same magnification.
And step S60, determining the segmentation size information of the image to be segmented according to the resolution information.
In one possible implementation, the segmentation size information may include a minimum segmentation size, which may be determined based on the resolution information. The higher the resolution of the image to be segmented, the smaller the minimum segmentation size: a segmented video image generated from a segmentation region no smaller than the minimum segmentation size remains clear enough to meet viewing requirements. When a segmentation region smaller than the minimum segmentation size is used, the generated segmented video image has poor definition and cannot meet viewing requirements.
The segmentation size information may also include a segmentation size interval, the size of the image to be segmented may be determined as a maximum segmentation size, and the segmentation size interval may be determined according to the maximum segmentation size and the minimum segmentation size determined according to the resolution.
Step S20 includes:
step S25, determining a segmentation region corresponding to the identified target object and conforming to the segmentation size information in the image to be segmented.
In one possible implementation, a plurality of segmentation regions of set sizes may be determined according to the segmentation size information, where the smallest of them is greater than or equal to the minimum segmentation size. Segmentation regions of arbitrary size may also be determined in the image to be segmented based on the segmentation size information, again with the smallest region being greater than or equal to the minimum segmentation size.
In this embodiment, after determining the segmentation size information according to the resolution information of the image to be segmented, a segmentation region corresponding to the identified target object and conforming to the segmentation size information may be determined in the image to be segmented. The generated segmentation video image can be clear based on the segmentation area determined by the resolution information, and the resolution of the segmentation video image can meet the watching requirement.
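A sketch of the resulting size check, using a hypothetical lookup table from source resolution to minimum segmentation size (the table values and function name are invented for illustration; the disclosure only specifies that higher resolution permits a smaller minimum size):

```python
# Hypothetical lookup: minimum segmentation size (w, h) per source resolution.
# Higher-resolution sources tolerate smaller crops while staying clear enough.
MIN_SPLIT_SIZE = {
    (1920, 1080): (960, 540),
    (3840, 2160): (640, 360),
    (7680, 4320): (480, 270),
}

def valid_region_size(src_resolution, crop_w, crop_h):
    """Check a requested crop against the segmentation size interval:
    at least the minimum segmentation size, at most the full frame."""
    src_w, src_h = src_resolution
    min_w, min_h = MIN_SPLIT_SIZE[src_resolution]
    return min_w <= crop_w <= src_w and min_h <= crop_h <= src_h
```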
In one possible implementation, the method further includes:
and determining definition information of the image to be segmented.
In one possible implementation, definition refers to the clarity of detail shading and its boundaries in an image. If the definition of the image to be segmented is poor, affected by the shooting environment, shooting angle, and lighting, the definition of the segmented video image generated from it will also be poor and will not meet viewing requirements.
Step S60 includes:
and determining the segmentation size information of the image to be segmented according to the resolution information and the definition information.
In a possible implementation manner, the segmentation size information of the image to be segmented may be determined jointly from the resolution information and the definition information. At the same resolution, the higher the definition, the smaller the minimum segmentation size; at the same definition, the higher the resolution, the smaller the minimum segmentation size.
Different weights can be set for the resolution information and the definition information, and the segmentation size information of the image to be segmented is determined together according to the resolution information, the resolution information weight, the definition information and the definition information weight.
In this embodiment, the segmentation size information of the image to be segmented is determined according to the definition and resolution information of the image to be segmented, so that the definition and resolution of the generated segmented video image can both meet the viewing requirement.
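The weighted combination of resolution and definition can be sketched as below. The weights, the 1080p baseline, the quality floor, and the (0, 1] definition score are all invented illustration values; the disclosure only states that the two factors are combined with configurable weights:

```python
def min_split_size(resolution, definition, res_weight=0.6, def_weight=0.4,
                   base=(1280, 720)):
    """Shrink a base minimum crop size as resolution and definition improve.

    `definition` is a score in (0, 1]; `resolution` is (w, h). A weighted
    quality score divides the base size: the better the source, the smaller
    the allowed minimum segmentation size.
    """
    res_score = (resolution[0] * resolution[1]) / (1920 * 1080)  # vs. 1080p
    quality = res_weight * res_score + def_weight * definition
    quality = max(quality, 0.25)  # floor: never allow crops above 4x the base
    return int(base[0] / quality), int(base[1] / quality)
```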
Fig. 11 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 11, the video image segmentation method further includes:
step S70, determining first playing display information, wherein the first playing display information comprises screen physical size information and/or playing resolution information; and the split video image played by the terminal accords with the screen physical size information and/or the playing resolution information.
In one possible implementation, the screens of different terminals have different physical sizes; for example, a mobile phone screen may be 3.5 inches, while a terminal screen used to play advertisements on the outer wall of a building may reach tens of meters. The playing resolution also differs among terminals, depending on the processing capability of the device and the configured screen. When a terminal plays a segmented video image, the segmented video image should conform to the terminal's screen physical size information and/or playing resolution information.
In one possible implementation manner, a segmented video image conforming to the screen physical size information and/or the playing resolution information of the playing terminal may be generated from the image to be segmented, and the terminal may directly play the generated segmented video image. For example, a segmented video image matching the screen physical size and/or playing resolution of a certain brand of mobile phone may be generated from the image to be segmented, and that phone may directly play it.
In a possible implementation manner, a plurality of sets of segmented video images corresponding to the screen physical size information and/or the playing resolution information of different terminals can be generated from the image to be segmented. When playing, a terminal selects the segmented video image matching its own screen physical size information and/or playing resolution information. For example, multiple sets of segmented video images may be generated for the screen sizes and/or playing resolutions of several brands of mobile phones and notebook computers, and a given phone then selects and plays the set that matches its own screen.
In this embodiment, the split video image corresponding to the screen physical size information and/or the playing resolution information of the terminal may be determined according to the screen physical size information and/or the playing resolution information. The split video image can be more in line with the playing requirement of the terminal.
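The terminal-side selection step can be sketched as a nearest-resolution lookup. The catalogue contents and function name are hypothetical; a real service would key pre-generated streams by whatever play display information the terminal reports:

```python
# Hypothetical catalogue of pre-generated segmented-video variants, keyed by
# (playback width, playback height); a real service would store stream URLs.
VARIANTS = {
    (1280, 720): "clip_720p.mp4",
    (1920, 1080): "clip_1080p.mp4",
    (3840, 2160): "clip_2160p.mp4",
}

def pick_variant(play_w, play_h):
    """Choose the variant whose resolution is closest to the terminal's
    reported playing resolution (first play display information)."""
    closest = min(VARIANTS, key=lambda wh: abs(wh[0] - play_w) + abs(wh[1] - play_h))
    return VARIANTS[closest]
```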
Fig. 12 is a flowchart illustrating a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 12, the video image segmentation method further includes:
step S80, determining second playing display information, wherein the second playing display information comprises horizontal screen display information or vertical screen display information; and the segmented video image played by the terminal accords with the horizontal screen display information or the vertical screen display information.
In a possible implementation manner, when the terminal plays the segmented video image, it can display in horizontal screen and/or vertical screen mode. The screen of a terminal is generally rectangular: when the longer side of the rectangular screen is horizontal, the terminal is in the horizontal screen display state; when the shorter side is horizontal, the terminal is in the vertical screen display state. The horizontal screen display information includes identification information indicating that the terminal is in the horizontal screen display state, and the vertical screen display information includes identification information indicating that the terminal is in the vertical screen display state.
According to the watching habit, when the same target object in the image to be segmented is displayed on the horizontal screen and the vertical screen, the content in the corresponding video playing image is different.
Fig. 13 is a schematic diagram illustrating horizontal screen and vertical screen display in a video image segmentation method according to an embodiment of the present disclosure. As shown in fig. 13, for the same image to be segmented, the upper half of fig. 13 shows the segmented video image determined for horizontal screen display, and the lower half shows the segmented video image determined for vertical screen display.
In a possible implementation manner, a segmented video image conforming to the horizontal screen display information or the vertical screen display information of the playing terminal may be generated from the image to be segmented, and the terminal may directly play the generated segmented video image. For example, a segmented video image matching the horizontal or vertical screen display information of a certain brand of mobile phone may be generated from the image to be segmented, and that phone may directly play it.
In a possible implementation manner, a plurality of sets of segmented video images corresponding to the horizontal screen display information or the vertical screen display information of different terminals can be generated from the image to be segmented. When playing, a terminal selects the segmented video image matching its own horizontal or vertical screen display information. For example, multiple sets of segmented video images may be generated for the horizontal and vertical screen display information of several brands of mobile phones and notebook computers, and a given phone then selects and plays the set that matches its own display state.
In this embodiment, the segmented video image conforming to the horizontal screen display information or the vertical screen display information of the terminal may be determined according to that information, so that the segmented video image better meets the playing requirements of the terminal.
Fig. 14 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 14, step S30 in the video image segmentation method includes:
and step S31, determining the corresponding coordinate information of the segmentation area in the image to be segmented.
In one possible implementation, the segmentation region may be expressed by its coordinate position in the image coordinate system of the image to be segmented. For example, the segmentation region may be a rectangle. Based on the target object Zhang San identified in image A to be segmented, the segmentation region 1a corresponding to Zhang San may be determined. The specific position of segmentation region 1a in the image to be segmented may include the positions of its four vertices in the image coordinate system: coordinate point 1 (x1, y1), coordinate point 2 (x1, y2), coordinate point 3 (x2, y1), and coordinate point 4 (x2, y2).
The coordinate information, in the image to be segmented, of a plurality of segmentation regions may be determined. The coordinate information may be represented in any form, such as an array or a matrix, and a text file (e.g., an XML file) may be generated from it.
And step S32, generating a segmentation video image of the image to be segmented according to the coordinate information.
In a possible implementation manner, according to the coordinate information, an image corresponding to the coordinate information may be obtained by segmentation in the image to be segmented. For example, a rectangular divided video image may be divided in the image to be divided according to the coordinate point 1 (x 1, y 1), the coordinate point 2 (x 1, y 2), the coordinate point 3 (x 2, y 1), and the coordinate point 4 (x 2, y 2).
In this embodiment, a segmented video image is obtained by determining the corresponding coordinate information of the segmented region in the image to be segmented. According to the coordinate information of the segmentation area, the segmentation video image can be accurately and quickly determined.
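The coordinate-based cut can be sketched with NumPy array slicing as a stand-in for the segmentation step (the function name is an assumption; the coordinate convention follows the example above, x across columns and y down rows):

```python
import numpy as np

def crop_by_coordinates(frame, x1, y1, x2, y2):
    """Cut the segmentation region (x1, y1)-(x2, y2) out of a frame.

    `frame` is an H x W x 3 array; rows are indexed by y, columns by x,
    matching the image coordinate system described above.
    """
    return frame[y1:y2, x1:x2].copy()

# A hypothetical 1080p frame and the rectangle from the coordinate example.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
region = crop_by_coordinates(frame, 400, 100, 1000, 700)
```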
Fig. 15 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 15, step S40 in the video image segmentation method includes:
and step S41, determining the weight of each segmented video image in the image to be segmented.
In one possible implementation, the weight of a segmented video image in the image to be segmented may be determined according to the target object in the segmented video image. The image to be segmented may include a plurality of target objects, and the weight of the segmented video image corresponding to a target object may be determined according to the proportion of the picture that the target object occupies and/or the importance of the target object in the video.
The weight of a segmented video image in the image to be segmented may also be determined according to the definition of the segmented video image.
For example, image A to be segmented includes Zhang San and Li Si, where Zhang San plays the leading role of the video and Li Si a supporting role. A higher weight may be determined for the segmented video image corresponding to Zhang San, and a lower weight for the one corresponding to Li Si.
Fig. 16 is a schematic diagram illustrating weights in a video image segmentation method according to an embodiment of the disclosure. As shown in fig. 16, the image to be segmented is a video frame image from a variety show. The weight of the host on the leftmost side of the picture is level A, the weight of the four guests is level B, and the weight of the back-row audience is level C, where level A is greater than level B and level B is greater than level C.
Step S42, determining a recommendation result in each of the segmented video images according to the weight.
In a possible implementation, since the image to be segmented may correspond to a plurality of segmented video images, typically only one of them needs to be played. So that the played segmented video image conveys the content of the image to be segmented as fully as possible, the recommendation result for playing can be determined among the plurality of segmented video images according to their weights. The recommendation result may include one or more segmented video images. As shown in fig. 16, the segmented video image corresponding to the host may be determined as the recommendation result according to the weights.
For example, after the segmented video images are sorted in descending order of weight, the segmented video image with the largest weight, or the top three by weight, may be determined as the recommendation result.
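The sort-and-pick step can be sketched as below; the function name and the numeric encoding of the A/B/C weight levels (3 > 2 > 1) are assumptions for illustration:

```python
def recommend(segmented, top_n=1):
    """Sort segmented video images by descending weight, keep the top ones.

    `segmented` maps an image id to a numeric weight, e.g. level A/B/C
    encoded as 3/2/1. Returns the ids of the recommendation result.
    """
    ranked = sorted(segmented.items(), key=lambda kv: kv[1], reverse=True)
    return [image_id for image_id, _ in ranked[:top_n]]
```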
And step S43, controlling the terminal to play the recommendation result.
In a possible implementation manner, one or more segmented video images with higher weights in the segmented video images can be played at the terminal according to requirements.
In this embodiment, the weight of the segmented video image may be determined, and the recommendation result may be determined according to the weight and then played at the terminal. According to the recommendation result determined by the weight, the representation content in the video can be well expressed, so that a viewer has good viewing experience.
Fig. 17 shows a flowchart of a video image segmentation method according to an embodiment of the present disclosure, and as shown in fig. 17, step S40 in the video image segmentation method includes:
in step S44, target object selection information and/or scene selection information is acquired.
Step S45, determining a selection result in the segmented video image according to the target object selection information and/or the scene selection information.
And step S46, controlling the terminal to play the selection result.
In one possible implementation, different viewers may select different target objects in the image to be segmented using the target object selection information. For example, viewer A may be interested in star A and wish to see more images of star A in the video; viewer A may then select star A using the target object selection information.
Some viewers are used to watching close-up scenes and close shots of characters, while others are used to watching full-shot images. Different viewers can also use the scene selection information to select segmented video images of different scene types.
An information entry box or option box for target object selection information and/or scene selection information may be provided for selection by the viewer.
The corresponding relation between each segmented video image and the target object and/or the scene can be established, the selection result can be determined in the segmented video images according to the target object or the scene selected by the viewer, and the selection result is played at the terminal.
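The correspondence-and-filter step can be sketched as below; the catalogue entries, tag names, and function name are hypothetical illustrations of the correspondence relation the embodiment describes:

```python
# Hypothetical index: each segmented video image is tagged with its target
# object and scene, per the correspondence relation described above.
CATALOGUE = [
    {"id": "c_close", "target": "person_C", "scene": "close-up"},
    {"id": "c_medium", "target": "person_C", "scene": "medium"},
    {"id": "d_medium", "target": "person_D", "scene": "medium"},
]

def select(target=None, scene=None):
    """Filter segmented video images by the viewer's target object
    selection information and/or scene selection information."""
    return [entry["id"] for entry in CATALOGUE
            if (target is None or entry["target"] == target)
            and (scene is None or entry["scene"] == scene)]
```

Omitting either argument leaves that dimension unconstrained, so a viewer can select by target object alone, by scene alone, or by both.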
In this embodiment, the selection result may be determined in the segmented video image by the target object selection information and/or the scene selection information, and played at the terminal. The watching requirements of different viewers can be met through the target object selection information and/or the scene selection information.
Application example:
Fig. 18 illustrates a schematic diagram of segmenting a video image in a video image segmentation method according to an embodiment of the present disclosure. As shown in fig. 18, the uppermost image in fig. 18 is a video frame image from a video captured by a shooting device and is the image to be segmented. The image to be segmented includes two target objects: person C on the left side and person D on the right side.
The specified scene may be the medium scene, and the two segmented video images in the middle of fig. 18 are obtained from the two target objects. From left to right, they are a segmented video image of the medium scene of person C and a segmented video image of the medium scene of person D.
The head and the hands of the target objects can be specified as target parts, and the four segmented video images at the bottom of fig. 18 are obtained from the target parts of person C and person D. From left to right, they are: a segmented video image of the hand of person C, of the head of person C, of the hand of person D, and of the head of person D. These four segmented video images for the target parts are segmented video images of close-up scenes of person C and person D.
When playing a video, a selection can be made among the six segmented video images in fig. 18. The selection can be performed according to the scene, or according to the target object. The uppermost image to be segmented in fig. 18 may also be played.
According to this embodiment, segmented video images of different scenes and different target objects are obtained from one image to be segmented, and they can be selected and played at the terminal as required, which avoids resource waste during video shooting and video broadcasting.
Fig. 19 shows a schematic diagram of a video image segmentation apparatus according to an embodiment of the present disclosure, as shown in fig. 19, the video image segmentation apparatus includes:
a target object identification module 10, configured to identify a target object in an image to be segmented, where the image to be segmented is a video frame image in a video;
a segmentation region determination module 20, configured to determine a segmentation region corresponding to the identified target object in the image to be segmented;
a segmented video image generation module 30, configured to generate a segmented video image of the image to be segmented according to the segmentation region; and
a playing module 40, configured to control the terminal to play the segmented video image.
Fig. 20 shows a schematic diagram of a video image segmentation apparatus according to an embodiment of the present disclosure, as shown in fig. 20, in one possible implementation manner, the target object recognition module 10 includes:
a first target object identification submodule 11, configured to identify a target object in the image to be segmented by using a deep learning algorithm.
In one possible implementation, the segmentation area determining module 20 includes:
a first segmentation region determining submodule 21, configured to determine, by using a deep learning algorithm and according to a composition rule, a segmentation region corresponding to the identified target object in the image to be segmented.
In one possible implementation, the segmentation area determining module 20 includes:
a second segmentation region determining submodule 22, configured to determine, in the image to be segmented, a segmentation region that corresponds to the identified target object and conforms to a specified scene.
In one possible implementation, the segmentation area determining module 20 includes:
a third segmentation region determining submodule 23, configured to determine a segmentation region including the target object in the image to be segmented.
In a possible implementation, the third segmentation area determination submodule 23 is configured to:
determine, in the image to be segmented, at least two segmentation regions including the target object, wherein the image corresponding to the segmentation region with the larger area among the at least two segmentation regions includes the image corresponding to the segmentation region with the smaller area.
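The containment constraint described above, namely that the image of the larger-area segmentation region includes the image of the smaller-area region, can be checked with simple rectangle arithmetic. The helpers below are an illustrative sketch, not part of the disclosure:

```python
def contains(outer, inner):
    """Return True if box `outer` (x, y, w, h) fully contains box `inner`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ox <= ix and oy <= iy
            and ix + iw <= ox + ow
            and iy + ih <= oy + oh)

def area(box):
    """Area of a box given as (x, y, w, h)."""
    _, _, w, h = box
    return w * h

# Two segmentation regions for one target object: a medium-scene region
# and a close-up region nested inside it.
medium = (100, 50, 800, 1000)
close_up = (300, 120, 200, 200)

assert area(medium) > area(close_up)
assert contains(medium, close_up)
```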
In one possible implementation, the segmentation area determining module 20 includes:
a target part identification submodule 24, configured to identify a target part on the target object; and
a fourth segmentation region determination submodule 25, configured to determine a segmentation region corresponding to the target part in the image to be segmented.
In one possible implementation, the apparatus further includes:
a resolution information determining module 50, configured to determine resolution information of the image to be segmented;
a segmentation size information determining module 60, configured to determine, according to the resolution information, segmentation size information of the image to be segmented;
the segmentation area determination module 20 includes:
a fifth segmentation region determination submodule 26, configured to determine, in the image to be segmented, a segmentation region that corresponds to the identified target object and conforms to the segmentation size information.
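One plausible realization of modules 50 and 60, deriving segmentation size information from the resolution information of the image to be segmented, is sketched below. The fixed scale factor is an assumption chosen only for illustration:

```python
def segmentation_size(resolution, scale=0.5):
    """Derive a segmentation size (width, height) as a fraction of the
    frame resolution, preserving the original aspect ratio."""
    width, height = resolution
    return (int(width * scale), int(height * scale))

# A 1920x1080 frame yields 960x540 segmentation regions at half scale.
print(segmentation_size((1920, 1080)))  # (960, 540)
```

In a fuller sketch, the scale could additionally depend on definition (sharpness) information, as in submodule 61, so that blurrier sources are segmented into larger regions.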
In one possible implementation, the apparatus further includes:
a definition information determining module 70, configured to determine definition information of the image to be segmented;
the segmentation size information determination module 60 includes:
a first segmentation information determining submodule 61, configured to determine the segmentation size information of the image to be segmented according to the resolution information and the definition information.
In one possible implementation, the apparatus further includes:
a first playing and displaying information determining module 80, configured to determine first playing and displaying information, where the first playing and displaying information includes screen physical size information and/or playing resolution information;
wherein the segmented video image played by the terminal conforms to the screen physical size information and/or the playing resolution information.
In one possible implementation, the apparatus further includes:
a second playing and displaying information determining module 90, configured to determine second playing and displaying information, where the second playing and displaying information includes horizontal screen displaying information or vertical screen displaying information;
wherein the segmented video image played by the terminal conforms to the horizontal screen display information or the vertical screen display information.
In one possible implementation, the segmented video image generation module 30 includes:
a coordinate information determining submodule 31, configured to determine coordinate information corresponding to the segmentation region in the image to be segmented; and
a first segmented video image generation submodule 32, configured to generate a segmented video image of the image to be segmented according to the coordinate information.
In one possible implementation manner, the playing module 40 includes:
a weight determining submodule 41, configured to determine a weight of each segmented video image in the image to be segmented;
a recommendation result determining submodule 42, configured to determine a recommendation result in each of the segmented video images according to the weight;
and a recommendation result playing submodule 43, configured to control the terminal to play the recommendation result.
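The recommendation flow of submodules 41 to 43 could take the following shape. How the weights themselves are computed is not fixed here; the scores below are invented solely to demonstrate the selection step:

```python
def recommend(weights):
    """Return the name of the segmented video image with the highest
    weight in the image to be segmented."""
    return max(weights, key=weights.get)

# Hypothetical weights for the segmented video images of fig. 18.
weights = {
    "C_medium": 0.9,
    "D_medium": 0.7,
    "C_head": 0.4,
    "D_head": 0.3,
}
assert recommend(weights) == "C_medium"
```

The terminal would then be instructed to play the recommended segmented video image.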
In a possible implementation manner, the playing module 40 includes:
a selection submodule 44 configured to obtain target object selection information and/or scene selection information;
a selection result determining submodule 45, configured to determine a selection result in the segmented video image according to the target object selection information and/or the scene selection information;
and a selection result playing submodule 46, configured to control the terminal to play the selection result.
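Correspondingly, the selection flow of submodules 44 to 46, filtering segmented video images by target object selection information and/or scene selection information, might look like this sketch. Keying segments by a (target, scene) pair is an assumption made for illustration:

```python
def select(segments, target=None, scene=None):
    """Filter segmented video images by target object and/or scene.

    `segments` maps (target, scene) keys to segmented video images;
    a None filter means no constraint on that dimension.
    """
    return {
        key: image
        for key, image in segments.items()
        if (target is None or key[0] == target)
        and (scene is None or key[1] == scene)
    }

# Hypothetical segmented video images for persons C and D.
segments = {
    ("C", "medium"): "c_medium.mp4",
    ("C", "close-up"): "c_head.mp4",
    ("D", "medium"): "d_medium.mp4",
    ("D", "close-up"): "d_head.mp4",
}

# Select every segmented video image of person C.
assert set(select(segments, target="C")) == {("C", "medium"), ("C", "close-up")}
```

The selection result would then be played at the terminal, as in submodule 46.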
Fig. 21 is a block diagram illustrating an apparatus 800 for video image segmentation in accordance with an exemplary embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 21, the apparatus 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power components 806 provide power to the various components of device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed status of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor assembly 814 may also detect a change in the position of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in the temperature of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium, such as the memory 804, is also provided that includes computer program instructions executable by the processor 820 of the device 800 to perform the above-described methods.
Fig. 22 is a block diagram illustrating an apparatus 1900 for video image segmentation according to an example embodiment. For example, the apparatus 1900 may be provided as a server. Referring to fig. 22, the device 1900 includes a processing component 1922 further including one or more processors and memory resources, represented by memory 1932, for storing instructions, e.g., applications, executable by the processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the above-described method.
The device 1900 may also include a power component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium, such as the memory 1932, is also provided that includes computer program instructions executable by the processing component 1922 of the apparatus 1900 to perform the above-described methods.
The present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), can execute the computer-readable program instructions and thereby implement aspects of the present disclosure by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements to the techniques in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (28)

1. A method for video image segmentation, the method comprising:
identifying a target object in an image to be segmented, wherein the image to be segmented is a video frame image in a video;
determining a segmentation region corresponding to an identified target object in the image to be segmented, wherein the identified target object corresponds to a plurality of segmentation regions;
generating a segmentation video image of the image to be segmented according to the segmentation area;
controlling a terminal to play the segmented video image;
wherein the determining of the segmentation region corresponding to the identified target object in the image to be segmented comprises:
determining, in the image to be segmented, a segmentation region that corresponds to the identified target object and conforms to a specified scene.
2. The method of claim 1, wherein identifying a target object in an image to be segmented comprises:
identifying a target object in the image to be segmented by using a deep learning algorithm.
3. The method according to claim 1 or 2, wherein determining a segmentation region corresponding to the identified target object in the image to be segmented comprises:
determining, by using a deep learning algorithm and according to a composition rule, a segmentation region corresponding to the identified target object in the image to be segmented.
4. The method according to claim 1 or 2, wherein determining a segmentation region corresponding to the identified target object in the image to be segmented comprises:
determining a segmentation region including the target object in the image to be segmented.
5. The method according to claim 4, wherein determining a segmentation region including the target object in the image to be segmented comprises:
determining, in the image to be segmented, at least two segmentation regions including the target object, wherein the image corresponding to the segmentation region with the larger area among the at least two segmentation regions includes the image corresponding to the segmentation region with the smaller area.
6. The method according to claim 1 or 2, wherein determining a segmentation region corresponding to the identified target object in the image to be segmented comprises:
identifying a target part on the target object;
and determining a segmentation region corresponding to the target part in the image to be segmented.
7. The method of claim 1, further comprising:
determining resolution information of the image to be segmented;
determining the segmentation size information of the image to be segmented according to the resolution information;
determining a segmentation region corresponding to the identified target object in the image to be segmented, including:
determining, in the image to be segmented, a segmentation region that corresponds to the identified target object and conforms to the segmentation size information.
8. The method of claim 7, further comprising:
determining definition information of the image to be segmented;
wherein determining the segmentation size information of the image to be segmented according to the resolution information comprises:
determining the segmentation size information of the image to be segmented according to the resolution information and the definition information.
9. The method of claim 1, further comprising:
determining first playing display information, wherein the first playing display information comprises screen physical size information and/or playing resolution information;
wherein the segmented video image played by the terminal conforms to the screen physical size information and/or the playing resolution information.
10. The method of claim 1, further comprising:
determining second playing display information, wherein the second playing display information comprises horizontal screen display information or vertical screen display information;
wherein the segmented video image played by the terminal conforms to the horizontal screen display information or the vertical screen display information.
11. The method according to claim 1, wherein generating a segmented video image of the image to be segmented from the segmented region comprises:
determining the corresponding coordinate information of the segmentation region in the image to be segmented;
and generating a segmentation video image of the image to be segmented according to the coordinate information.
12. The method of claim 1, wherein controlling a terminal to play the segmented video image comprises:
determining the weight of each segmented video image in the image to be segmented;
determining a recommendation result in each segmented video image according to the weight;
and controlling the terminal to play the recommendation result.
13. The method of claim 1, wherein controlling a terminal to play the segmented video image comprises:
acquiring target object selection information and/or scene selection information;
determining a selection result in the segmented video image according to the target object selection information and/or the scene selection information;
and controlling the terminal to play the selection result.
14. A video image segmentation apparatus, characterized in that the apparatus comprises:
the target object identification module is used for identifying a target object in an image to be segmented, wherein the image to be segmented is a video frame image in a video, and the identified target object corresponds to a plurality of segmentation areas;
the segmentation region determining module is used for determining a segmentation region corresponding to the identified target object in the image to be segmented;
the segmentation video image generation module is used for generating a segmentation video image of the image to be segmented according to the segmentation area;
the playing module is used for controlling the terminal to play the segmented video image;
the segmentation region determination module includes:
the second segmentation region determining submodule is used for determining, in the image to be segmented, a segmentation region that corresponds to the identified target object and conforms to a specified scene.
15. The apparatus of claim 14, wherein the target object identification module comprises:
the first target object identification submodule is used for identifying a target object in the image to be segmented by using a deep learning algorithm.
16. The apparatus of claim 14 or 15, wherein the segmentation region determination module comprises:
the first segmentation region determining submodule is used for determining, by using a deep learning algorithm and according to a composition rule, a segmentation region corresponding to the identified target object in the image to be segmented.
17. The apparatus of claim 14 or 15, wherein the segmentation region determination module comprises:
the third segmentation region determining submodule is used for determining a segmentation region including the target object in the image to be segmented.
18. The apparatus of claim 17, wherein the third segmentation area determination submodule is configured to:
determining, in the image to be segmented, at least two segmentation regions including the target object, wherein the image corresponding to the segmentation region with the larger area among the at least two segmentation regions includes the image corresponding to the segmentation region with the smaller area.
19. The apparatus of claim 14 or 15, wherein the segmentation region determination module comprises:
a target part identification submodule for identifying a target part on the target object;
and the fourth segmentation region determining submodule is used for determining a segmentation region corresponding to the target part in the image to be segmented.
20. The apparatus of claim 14, further comprising:
the resolution information determining module is used for determining the resolution information of the image to be segmented;
the segmentation size information determining module is used for determining the segmentation size information of the image to be segmented according to the resolution information;
the segmentation region determination module includes:
the fifth segmentation region determination submodule is used for determining, in the image to be segmented, a segmentation region that corresponds to the identified target object and conforms to the segmentation size information.
21. The apparatus of claim 20, further comprising:
the definition information determining module is used for determining the definition information of the image to be segmented;
the division size information determination module includes:
the first segmentation information determining submodule is used for determining the segmentation size information of the image to be segmented according to the resolution information and the definition information.
22. The apparatus of claim 14, further comprising:
the first playing and displaying information determining module is used for determining first playing and displaying information, wherein the first playing and displaying information comprises screen physical size information and/or playing resolution information;
wherein the segmented video image played by the terminal conforms to the screen physical size information and/or the playing resolution information.
23. The apparatus of claim 14, further comprising:
the second playing and displaying information determining module is used for determining second playing and displaying information, wherein the second playing and displaying information comprises horizontal screen displaying information or vertical screen displaying information;
wherein the segmented video image played by the terminal conforms to the horizontal screen display information or the vertical screen display information.
24. The apparatus of claim 14, wherein the segmented video image generation module comprises:
the coordinate information determining submodule is used for determining the corresponding coordinate information of the segmentation area in the image to be segmented;
and the first segmentation video image generation submodule is used for generating the segmentation video image of the image to be segmented according to the coordinate information.
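The coordinate-driven generation step of claim 24 amounts to applying one crop box to every frame of the source video. A minimal sketch, assuming frames are row-major grids of pixels and the box uses (x0, y0, x1, y1) pixel coordinates:

```python
def crop_frames(frames, box):
    """Generate the segmented video by applying the segmentation
    region's coordinate information (x0, y0, x1, y1) to each frame.

    Each frame is a list of rows; each row is a list of pixels.
    """
    x0, y0, x1, y1 = box
    return [[row[x0:x1] for row in frame[y0:y1]] for frame in frames]
```

In a real pipeline the same slicing would typically run on decoded frame arrays (e.g. NumPy/OpenCV images) before re-encoding, but the coordinate logic is identical.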
25. The apparatus of claim 14, wherein the playback module comprises:
the weight determining submodule is used for determining the weight of each segmented video image in the image to be segmented;
a recommendation result determining submodule, configured to determine a recommendation result in each of the segmented video images according to the weight;
and the recommendation result playing submodule is used for controlling the terminal to play the recommendation result.
26. The apparatus of claim 14, wherein the playback module comprises:
the selection submodule is used for acquiring target object selection information and/or scene selection information;
a selection result determining submodule, configured to determine a selection result in the segmented video image according to the target object selection information and/or the scene selection information;
and the selection result playing submodule is used for controlling the terminal to play the selection result.
27. A video image segmentation apparatus, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: perform the method of any one of claims 1 to 13.
28. A non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the method of any one of claims 1 to 13.
CN201810802302.9A 2018-07-18 2018-07-18 Video image segmentation method and device Active CN108986117B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810802302.9A CN108986117B (en) 2018-07-18 2018-07-18 Video image segmentation method and device

Publications (2)

Publication Number Publication Date
CN108986117A CN108986117A (en) 2018-12-11
CN108986117B true CN108986117B (en) 2021-06-04

Family

ID=64549449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810802302.9A Active CN108986117B (en) 2018-07-18 2018-07-18 Video image segmentation method and device

Country Status (1)

Country Link
CN (1) CN108986117B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276318A (en) * 2019-06-26 2019-09-24 北京航空航天大学 Nighttime highway rain recognition method, device, computer equipment and storage medium
CN112839227B (en) * 2019-11-22 2023-03-14 浙江宇视科技有限公司 Image coding method, device, equipment and medium
CN111246237A (en) * 2020-01-22 2020-06-05 视联动力信息技术股份有限公司 Panoramic video live broadcast method and device
CN112218160A (en) * 2020-10-12 2021-01-12 北京达佳互联信息技术有限公司 Video conversion method and device, video conversion equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102541494A (en) * 2010-12-30 2012-07-04 中国科学院声学研究所 Video size switching system and video size switching method facing display terminal
CN103747210A (en) * 2013-12-31 2014-04-23 深圳市佳信捷技术股份有限公司 Method and device for data processing of video monitoring system
CN105163188A (en) * 2015-08-31 2015-12-16 小米科技有限责任公司 Video content processing method, device and apparatus
CN106792092A (en) * 2016-12-19 2017-05-31 广州虎牙信息科技有限公司 Live video flow point mirror display control method and its corresponding device
CN107547803A (en) * 2017-09-25 2018-01-05 北京奇虎科技有限公司 Video segmentation result edge optimization processing method, device and computing device
CN107545576A (en) * 2017-07-31 2018-01-05 华南农业大学 Image edit method based on composition rule
CN108124194A (en) * 2017-12-28 2018-06-05 北京奇艺世纪科技有限公司 Live video streaming method and apparatus, and electronic device
CN108156459A (en) * 2016-12-02 2018-06-12 北京中科晶上科技股份有限公司 Telescopic video transmission method and system
Also Published As

Publication number Publication date
CN108986117A (en) 2018-12-11

Similar Documents

Publication Publication Date Title
CN109257645B (en) Video cover generation method and device
CN110620946B (en) Subtitle display method and device
CN108260020B (en) Method and device for displaying interactive information in panoramic video
CN108985176B (en) Image generation method and device
EP3176731A1 (en) Image processing method and device
CN109089170A (en) Barrage display methods and device
CN108986117B (en) Video image segmentation method and device
JP2021526698A (en) Image generation methods and devices, electronic devices, and storage media
US20150365600A1 (en) Composing real-time processed video content with a mobile device
KR101755412B1 (en) Method and device for processing identification of video file, program and recording medium
CN109862380B (en) Video data processing method, device and server, electronic equipment and storage medium
EP2953068A1 (en) Prompting method and device for seat selection
CN108737891B (en) Video material processing method and device
CN108924644B (en) Video clip extraction method and device
CN108900903B (en) Video processing method and device, electronic equipment and storage medium
CN108174269B (en) Visual audio playing method and device
CN113467603A (en) Audio processing method and device, readable medium and electronic equipment
KR20140089829A (en) Method and apparatus for controlling animated image in an electronic device
CN112541971A (en) Point cloud map construction method and device, electronic equipment and storage medium
CN112991381A (en) Image processing method and device, electronic equipment and storage medium
CN111354444A (en) Pathological section image display method and device, electronic equipment and storage medium
CN106954093B (en) Panoramic video processing method, device and system
CN109756783B (en) Poster generation method and device
EP3799415A2 (en) Method and device for processing videos, and medium
CN112887620A (en) Video shooting method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200424

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 100000 room 26, 9 Building 9, Wangjing east garden four, Chaoyang District, Beijing.

Applicant before: BEIJING YOUKU TECHNOLOGY Co.,Ltd.
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240611

Address after: 101400 Room 201, 9 Fengxiang East Street, Yangsong Town, Huairou District, Beijing

Patentee after: Youku Culture Technology (Beijing) Co.,Ltd.

Country or region after: China

Address before: 310052 room 508, 5th floor, building 4, No. 699 Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Patentee before: Alibaba (China) Co.,Ltd.

Country or region before: China