
WO2022003836A1 - Processing system and processing method - Google Patents

Processing system and processing method

Info

Publication number
WO2022003836A1
WO2022003836A1 (PCT/JP2020/025705)
Authority
WO
WIPO (PCT)
Prior art keywords
condition
scene
conditions
relationship
time zone
Prior art date
Application number
PCT/JP2020/025705
Other languages
French (fr)
Japanese (ja)
Inventor
遥 久保田
明 片岡
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to PCT/JP2020/025705
Priority to JP2022532896
Publication of WO2022003836A1

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/70 - Information retrieval of video data
    • G06F 16/73 - Querying
    • G06F 16/732 - Query formulation

Definitions

  • The present invention relates to a processing system and a processing method.
  • Video information can accurately reproduce the situation at the time of shooting. For this reason, it is used in many fields, for both personal and business purposes. For example, individuals use video information for leisure and sports, while businesses use it for security, understanding operations, and audit trails.
  • A search method using an index has been proposed in the past.
  • In this method, an index is attached at the start/end timing of a scene, or to each frame in between, and the corresponding scene is presented by selecting one of the indexes.
  • A method has also been proposed in which indexes are assigned by combining the image information and text information in the video, enabling scene searches based on the indexes using thresholds and the like (see, for example, Non-Patent Document 1).
  • A visualization method using a timeline has also been proposed. By presenting the ranges delimited by indexes on a timeline in chronological order, this method makes it possible to compare and analyze the time widths of the scenes, and to identify the required range directly on the timeline.
  • If the estimation accuracy of some conditions is poor, scene detection itself may become difficult.
  • When the degree to which conditions match varies due to individual or per-object differences, the conditions must be relaxed to cover all cases, which lengthens the total detected time and weakens the narrowing-down.
  • The present invention has been made in view of the above, and its purpose is to provide a processing system and a processing method capable of appropriately complementing scenes that could not be detected when detecting scenes in video information.
  • The processing system has: a first storage unit that stores a plurality of conditions for detecting scenes in video information based on parameters including time-series information associated with the video information, together with the directional relationships between the conditions; and a detection unit that, for an arbitrary condition, obtains the conditions related to it from the plurality of conditions based on those relationships and detects the scenes corresponding to the obtained conditions.
  • FIG. 1 is a diagram showing an example of a scene detection screen in the video information according to the first embodiment.
  • FIG. 2 is a diagram showing an example of the functional configuration of the display processing system according to the first embodiment.
  • FIG. 3 is a diagram showing an example of a relationship graph.
  • FIG. 4 is a diagram illustrating a display example by the visualization information display unit shown in FIG. 2.
  • FIG. 5 is a flowchart showing the procedure for setting conditions and the relationships between conditions, executed by the display processing system shown in FIG. 2.
  • FIG. 6 is a flowchart showing the procedure for displaying visualization information for a designated condition, executed by the display processing system shown in FIG. 2.
  • FIG. 7 is a flowchart showing a processing procedure of scene detection processing in video information executed by the display processing system shown in FIG. 2.
  • FIG. 8 is a diagram illustrating scene detection in video information in the conventional method.
  • FIG. 9 is a diagram illustrating scene detection in video information according to the first embodiment.
  • FIG. 10 is a diagram showing an example of a scene detection screen in the video information according to the second embodiment.
  • FIG. 11 is a diagram showing an example of the functional configuration of the display processing system according to the second embodiment.
  • FIG. 12 is a flowchart showing the procedure for setting conditions and the relationships between conditions, executed by the display processing system shown in FIG. 11.
  • FIG. 13 is a flowchart showing another procedure for setting conditions and the relationships between conditions, executed by the display processing system shown in FIG. 11.
  • FIG. 14 is a diagram showing an example of a computer in which a display processing system is realized by executing a program.
  • FIG. 1 is a diagram showing an example of a scene detection screen in the video information according to the first embodiment.
  • The upper part of FIG. 1 also shows an example of a scene detection screen produced by a conventional method.
  • As shown in the upper part of FIG. 1, in the conventional method, when a specific scene is to be detected from video information, a detection condition that determines whether a scene is applicable is set by combining parameters associated with the video.
  • The search user then searches for scenes using the labels given by the creator of the detection conditions.
  • For example, when a search user specifies a search condition (hereinafter, a condition) for detecting a scene in video information, the conventional method displays, as the detection result, a timeline in which the time zones of the scenes matching the specified condition are colored (see (1) and (2) in FIG. 1).
  • However, even if this timeline is insufficient as a detection result, the detection conditional expression (for example, the AND of conditions A to C; see (3) in FIG. 1) is not disclosed, so it is difficult for the user to complement or narrow down scenes on their own. Even if the search conditional expression (the AND of conditions A to C) were disclosed and editable, it would still be difficult for the search user to grasp which parameters of conditions A to C affect the result and how. The search user therefore needs to repeatedly adjust the specified conditions and try the search multiple times.
  • In the following, a timeline is used as an example of visualization information indicating the time zones of scenes matching an arbitrary condition; however, the visualization information is not limited to a timeline and may be character information indicating those time zones.
  • In the first embodiment, the timeline L1 (first visualization information) showing the time zones of the scenes matching the user-specified condition (the AND of conditions A to C) is displayed, and the constituent conditions A, B, and C related to the specified condition are also visualized as the timelines (second visualization information) LA to LC (see (5) in FIG. 1). By connecting the timeline L1 and the timelines LA to LC with the AND symbol M1 (third visualization information), the relationship between the specified condition and conditions A to C is visualized (see (6) in FIG. 1).
  • Further, the timelines LC-1 and LC-2 are displayed, and by connecting the timeline LC to the timelines LC-1 and LC-2 with the OR symbol M2, it is visualized that condition C is the OR of conditions C-1 and C-2 (see (7) in FIG. 1).
  • In this way, the timeline L1 indicating the time zones of scenes satisfying the specified condition and the timelines indicating the time zones of scenes matching the constituent conditions A to C of the specified condition are displayed hierarchically.
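As a rough illustration of this hierarchical display, the following sketch renders nested condition timelines as text. It is not taken from the patent: the condition names, operators, and time spans are invented, and a real implementation would draw colored bars in a UI rather than ASCII cells.

```python
# Minimal sketch: render a hierarchy of condition timelines as text.
# Each condition holds the (start, end) spans of its detected scenes,
# an optional combining operator ("AND"/"OR"), and sub-conditions.

def render_spans(spans, total, width=40):
    """One text row: '#' where any detected span overlaps the cell."""
    row = []
    for i in range(width):
        t0, t1 = total * i / width, total * (i + 1) / width
        row.append("#" if any(s < t1 and e > t0 for s, e in spans) else "-")
    return "".join(row)

def render_tree(cond, total, depth=0):
    op = f" ({cond['op']})" if cond.get("op") else ""
    label = "  " * depth + cond["name"] + op
    print(f"{label:<20}|{render_spans(cond['spans'], total)}|")
    for child in cond.get("children", []):
        render_tree(child, total, depth + 1)

designated = {  # hypothetical data mirroring L1 and LA to LC-2 in FIG. 1
    "name": "L1", "op": "AND", "spans": [(30, 45)],
    "children": [
        {"name": "A", "spans": [(10, 20), (28, 50)]},
        {"name": "B", "spans": [(25, 48)]},
        {"name": "C", "op": "OR", "spans": [(30, 47)],
         "children": [{"name": "C-1", "spans": [(30, 40)]},
                      {"name": "C-2", "spans": [(38, 47), (55, 60)]}]},
    ],
}
render_tree(designated, total=60)
```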
  • As a result, the search user can easily grasp the scene detection results of each related condition together with those of the specified condition. The user can then consult the detection results of more conditions, with priorities among them, and since the display is not limited to grasping the relationships between the conditions, it can also be put to general-purpose use in searches.
  • When the search user cannot sufficiently detect scenes under the specified condition, the user can be expected to intuitively retry the search using a related condition by referring to the timelines of the specified and related conditions. The first embodiment is therefore considered to reduce the repeated trials of parameter adjustment and condition specification that were conventionally required.
  • Further, by referring to the visualized timeline L1, the timelines LA to LC-2, and their mutual relationships, the search user can easily recognize conditions that are more concrete than the specified condition (for example, conditions A to C) and conditions that are more abstract (for example, conditions C-1 and C-2). The search user can then specify conditions by selecting or combining conditions A to C-2, which alleviates the difficulty of setting conditions. Therefore, according to the first embodiment, when the search user detects scenes from video information, the system can support the detection and appropriately supplement scenes that could not be detected.
  • FIG. 2 is a diagram showing an example of the functional configuration of the display processing system according to the first embodiment.
  • The display processing system 10 holds a plurality of conditions for detecting scenes in video information based on parameters including time-series information associated with the video information, together with the relationships between the conditions.
  • The display processing system 10 visualizes the scene detection result for an arbitrary condition and the detection results for the conditions related to it, together with the relationships between the conditions, thereby assisting the user in detecting scenes and making it possible to appropriately supplement scenes that could not be detected.
  • Here, the display processing system 10 is assumed to function as a terminal device, but the present invention is not limited to this; the display processing system 10 may function as a server and output the retrieved video scenes to a user terminal.
  • The display processing system 10 includes a raw data storage unit 11 (first input unit), a data processing unit 12, a UI (User Interface) unit 13, a condition storage unit 14 (first storage unit), and a processing data storage unit 15 (first storage unit, detection unit, second storage unit, third storage unit). Each unit is described below. Note that these units may be held in a distributed manner by a plurality of devices.
  • Each unit is realized by reading a predetermined program into a computer that includes a ROM (Read Only Memory), a RAM (Random Access Memory), a CPU (Central Processing Unit), and the like, and by having the CPU execute the program.
  • The display processing system 10 has a communication interface for transmitting and receiving various information to and from other devices connected via a network or the like.
  • For example, the display processing system 10 has a NIC (Network Interface Card) or the like and communicates with other devices via a telecommunication line such as a LAN (Local Area Network) or the Internet.
  • The display processing system 10 also has input devices such as a touch panel, a voice input device, a keyboard, and a mouse, a display device such as a liquid crystal display, and a printing device such as a printer, and uses them to input and output information.
  • The raw data storage unit 11 receives and stores the video information to be searched and the raw data used in combination with the video information.
  • The raw data is sensor information obtained in synchronization with the shooting of the video information.
  • The sensor information is, for example, GPS (Global Positioning System) information, acceleration information, and temperature information.
  • The data processing unit 12 detects the scenes in the video information matching each condition, based on the video information, the raw data, and each condition, and outputs the per-condition scene detection results to the processing data storage unit 15. The data processing unit 12 can also acquire raw data from the video information by processing it, storing the video information and the raw data in the raw data storage unit 11 in association with each other; for example, it acquires raw data based on object recognition results, position information obtained by SLAM (Simultaneous Localization and Mapping), and the like.
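As an illustration of how the data processing unit might turn time-series parameters into per-condition scene detections, here is a minimal sketch. It is an assumption for illustration, not the patent's implementation: the sample layout, the parameter name, and the threshold logic are invented.

```python
# Minimal sketch: detect scenes by applying a threshold condition to
# time-stamped sensor samples synchronized with the video. Consecutive
# matching samples are merged into (start, end) scene intervals.

def detect_scenes(samples, param, threshold, min_len=1.0):
    """samples: list of (timestamp, {param: value}) sorted by time."""
    intervals, start, prev_t = [], None, None
    for t, values in samples:
        if values.get(param, 0.0) >= threshold:
            if start is None:
                start = t
            prev_t = t
        elif start is not None:
            if prev_t - start >= min_len:
                intervals.append((start, prev_t))
            start = None
    if start is not None and prev_t - start >= min_len:
        intervals.append((start, prev_t))
    return intervals

samples = [(t, {"acceleration": a}) for t, a in
           [(0, 0.1), (1, 0.9), (2, 0.8), (3, 0.2), (4, 0.7), (5, 0.9), (6, 0.95)]]
print(detect_scenes(samples, "acceleration", 0.5))  # [(1, 2), (4, 6)]
```

Each condition stored in the condition storage unit would yield one such list of intervals, which is what the timelines visualize.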
  • The UI unit 13 has a condition setting unit 131, a visualization information display unit 132 (second input unit, first display unit), and a video display unit 133.
  • The condition setting unit 131 receives an instruction to create a search condition (condition) for scene detection through operation of the input device by the search user. Upon receiving such an instruction, the condition setting unit 131 creates a new condition using an arbitrary method and stores the new condition, together with its relationships to the existing conditions, in the condition storage unit 14.
  • The visualization information display unit 132 receives the input of the search condition (designated condition; first condition) specified for scene detection of the video information and outputs it to the processing data storage unit 15. The visualization information display unit 132 then outputs visualization information related to scene detection, based on the information, including the scene detection results, output from the processing data storage unit 15.
  • When the visualization information display unit 132 receives the designation of a video playback scene through operation of the input device by a search user who has referred to the visualization information, it outputs the playback range to the video display unit 133.
  • When the video display unit 133 receives the designation of a playback range from the visualization information display unit 132, it plays back the video of the designated range based on the video information stored in the raw data storage unit 11.
  • The condition storage unit 14 stores a plurality of conditions for detecting scenes in the video information and a relationship graph showing the directional relationships between the conditions. When the relationship graph is registered or updated, the condition storage unit 14 outputs the registered or updated graph to the processing data storage unit 15.
  • FIG. 3 is a diagram showing an example of a relationship graph.
  • The relationship graph G1 comprehensively holds the relationships between the conditions by showing, for each condition, logical products, logical sums, time-series connections, and inclusion relations (see (1) in FIG. 3).
  • When an upper condition is the logical sum or logical product of certain conditions, when it includes a certain condition, or when it arises from multiple conditions occurring in chronological order, the related conditions are linked beneath that upper condition.
  • Conditions may also be linked with attributes such as direction ("inspection" → "recording") or occurrence time (for example, within 10 seconds).
  • the condition of "inspection” includes the conditions of "stop” and "gaze”, and these two are associated with "inspection” in relation to the logical product.
  • Abstract conditions are linked beneath more specific conditions. As shown in the relationship graph G1, layering the relationships between conditions in this way makes it easier to prioritize which conditions the user should refer to. It also makes it possible to identify indirectly related conditions, such as "inspection" and "gaze 2 seconds".
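As a data-structure illustration, the relationship graph described above could be represented as follows. This is a sketch under assumed names, not the patent's actual model; the relation labels follow the "inspection" example in the text.

```python
# Minimal sketch: a directed relationship graph in which each node is a
# condition and the `relation` field says how its children relate to it
# ("AND", "OR", "INCLUDES", or a time-series link).

from dataclasses import dataclass, field

@dataclass
class ConditionNode:
    name: str
    relation: str = ""                 # how children combine under this node
    children: list = field(default_factory=list)

# "inspection" is the AND of "stop" and "gaze 5 sec"; "gaze 5 sec"
# includes the more relaxed "gaze 2 sec".
gaze2 = ConditionNode("gaze 2 sec")
gaze5 = ConditionNode("gaze 5 sec", "INCLUDES", [gaze2])
stop = ConditionNode("stop")
inspection = ConditionNode("inspection", "AND", [stop, gaze5])

def descendants(node):
    """Directly and indirectly related conditions; e.g. "gaze 2 sec"
    is only indirectly related to "inspection"."""
    for child in node.children:
        yield child
        yield from descendants(child)

print([c.name for c in descendants(inspection)])
# ['stop', 'gaze 5 sec', 'gaze 2 sec']
```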
  • The processing data storage unit 15 holds the relationship graph and, for an arbitrary condition, obtains the conditions related to it from the plurality of conditions based on the graph and detects the scenes corresponding to the obtained conditions.
  • The processing data storage unit 15 also holds the scene detection result for each condition together with the relationship graph.
  • The scene detection result for each condition is the result of detecting scenes from the video information using each condition stored in the condition storage unit 14, and is obtained through processing by the data processing unit 12.
  • When a designated condition is input from the visualization information display unit 132, the processing data storage unit 15 obtains, based on the relationship graph, the related conditions (second conditions) related to the designated condition from among the plurality of conditions. The processing data storage unit 15 then outputs, from the per-condition scene detection results, the result for the designated condition and the results for the related conditions to the visualization information display unit 132, together with information, based on the relationship graph, indicating the relationship between the designated condition and the related conditions.
  • Upon receiving from the processing data storage unit 15 the scene detection result for the designated condition, the detection results for the related conditions, and the information indicating their relationship, the visualization information display unit 132 outputs first visualization information indicating the time zones of the scenes matching the designated condition, second visualization information indicating the time zones of the scenes matching the related conditions, and third visualization information indicating the relationship between the designated condition and the related conditions.
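A sketch of how this lookup might assemble the three kinds of visualization information is shown below. The function and field names are hypothetical; the dictionaries stand in for the condition storage unit and the per-condition detection results.

```python
# Minimal sketch: given a designated condition, gather its own scene
# intervals, the intervals of its related (sub-)conditions, and the
# relation between them, i.e. the inputs for the first, second, and
# third visualization information.

def visualization_payload(designated, graph, results):
    """graph:   {condition: {"op": str, "children": [names]}}
    results: {condition: [(start, end), ...]} per-condition detections."""
    entry = graph.get(designated, {})
    return {
        "designated": (designated, results.get(designated, [])),
        "related": [(c, results.get(c, [])) for c in entry.get("children", [])],
        "relation": entry.get("op"),
    }

graph = {"inspection": {"op": "AND", "children": ["stop", "gaze 5 sec"]}}
results = {"inspection": [(30, 45)], "stop": [(10, 20), (28, 50)],
           "gaze 5 sec": [(30, 47)]}
print(visualization_payload("inspection", graph, results))
```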
  • FIG. 4 is a diagram illustrating a display example by the visualization information display unit 132 shown in FIG. 2.
  • As shown in FIG. 4, the visualization information display unit 132 displays a timeline showing the time zones of the scenes matching the user-designated condition "inspection" and, following the relationships shown in the relationship graph G1, displays the timelines showing the time zones of the scenes matching the related conditions "stop", "gaze 5 seconds", "move", and "overlook", with a frame or symbol (third visualization information) indicating that they are subordinate to the designated condition "inspection".
  • The visualization information display unit 132 further displays a timeline for the condition "gaze 2 seconds" included in the related condition "gaze 5 seconds".
  • By hierarchically displaying the timeline of the designated condition and its related information in this way, the visualization information display unit 132 lets the search user consult the information related to the designated condition "inspection" and intuitively use more related conditions as search conditions (see (1) in FIG. 4).
  • The visualization information display unit 132 can use any visualization format, not only the timeline format. It may also change the display method of the timelines or group the conditions, or accept such display changes and grouping between conditions from the search user.
  • FIG. 5 is a flowchart showing the procedure for setting conditions and the relationships between conditions, executed by the display processing system 10 shown in FIG. 2.
  • As shown in FIG. 5, the condition setting unit 131 receives an instruction to register the name or ID of a condition to be extracted for scene detection, through operation of an input device by the search user (step S1).
  • The condition setting unit 131 then selects one or more parameters expressing the condition from the available parameters, using an arbitrary method, and sets numerical conditions such as thresholds and reference values (step S2). The condition whose name was specified in step S1 and whose thresholds and the like were set in step S2 corresponds to the "new condition" in step S3 (described later).
  • Next, the condition setting unit 131 extracts the conditions related to the new condition from the condition storage unit 14 (step S3).
  • The new condition may be a condition that is the logical sum or logical product of existing conditions in the condition storage unit 14, a condition that includes an existing condition, or a condition expressed by combining existing conditions in chronological order.
  • A condition related to the new condition is one that the user designates as related from among the existing conditions, one that the display processing system 10 automatically detects as related by comparing numerical conditions, or a partial condition created and linked by dividing the new condition.
  • Note that the new condition does not necessarily have to be expressed as a combination of existing conditions. For a new condition generated by a user, related conditions may be found among the existing conditions manually or by comparing numerical conditions, or further new conditions may be generated by dividing the new condition.
  • Then, the condition storage unit 14 registers the condition added by the condition setting unit 131 and updates the relationship graph between this condition and the existing conditions so that more specific conditions are placed in the upper layers (step S4).
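Steps S1 to S4 might look roughly like the following sketch (illustrative only; the dictionary-based graph and the function name are assumptions, not the patent's interface):

```python
# Minimal sketch of steps S1-S4: register a new condition composed of
# existing conditions and link it into the relationship graph.

def register_condition(graph, name, op, parts):
    """Add `name` as an upper condition combining the existing `parts`
    with the operator `op` ("AND", "OR", "INCLUDES", ...)."""
    missing = [p for p in parts if p not in graph]
    if missing:
        raise ValueError(f"unknown sub-conditions: {missing}")
    graph[name] = {"op": op, "children": list(parts)}

graph = {"stop": {"op": None, "children": []},
         "gaze": {"op": None, "children": []}}
register_condition(graph, "inspection", "AND", ["stop", "gaze"])
print(graph["inspection"])  # {'op': 'AND', 'children': ['stop', 'gaze']}
```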
  • FIG. 6 is a flowchart showing the procedure for displaying visualization information for a designated condition, executed by the display processing system 10 shown in FIG. 2.
  • As shown in FIG. 6, when the visualization information display unit 132 receives from a search user the input of a search condition (designated condition) specified for scene detection (step S11), it outputs the designated condition to the processing data storage unit 15.
  • The processing data storage unit 15 obtains the designated condition and its related conditions from the plurality of conditions, outputs the scene detection result for the designated condition and those for the related conditions to the visualization information display unit 132, and also outputs, based on the relationship graph, information indicating the relationship between the designated condition and the related conditions (step S12).
  • The visualization information display unit 132 then displays the timelines of the designated condition and the related conditions together with the relationships between the conditions (step S13).
  • When the visualization information display unit 132 receives the designation of a video playback scene from the input device operated by a search user who has referred to the visualization information (step S14), it outputs the playback range to the video display unit 133. Upon receiving the designated playback range from the visualization information display unit 132, the video display unit 133 plays back the video of that range based on the video information stored in the raw data storage unit 11 (step S15).
  • FIG. 7 is a flowchart showing the procedure of the scene detection processing in video information executed by the display processing system 10 shown in FIG. 2.
  • As shown in FIG. 7, the display processing system 10 determines whether the data in the raw data storage unit 11 or the conditions in the condition storage unit 14 have been updated (step S21), and repeats this determination until an update occurs.
  • When an update occurs, the display processing system 10 determines whether to extract information from the video information within the system itself (step S22). If so, the data processing unit 12 extracts information such as GPS information, acceleration information, and temperature information from the video information (step S23).
  • Next, the data processing unit 12 refers to the detection conditions stored in the condition storage unit 14 (step S24), and in response the condition storage unit 14 outputs the stored detection conditions to the data processing unit 12 (step S25).
  • The data processing unit 12 then detects the matching scenes in the video information based on each condition (step S26) and outputs the scene detection results to the processing data storage unit 15, which saves the scene detection result for each condition (step S27) and ends the processing.
  • FIG. 8 is a diagram illustrating scene detection in video information in the conventional method.
  • In the conventional method, the user can search for scenes only in units of the labels prepared by the creator of the detection conditions (see (1) in FIG. 8), so only a timeline coloring the time zones of the scenes matching the specified condition is displayed as the detection result.
  • FIG. 9 is a diagram illustrating scene detection in video information according to the first embodiment.
  • In contrast, the display processing method according to the first embodiment holds a plurality of conditions for detecting scenes in video information and a relationship graph showing the directional relationships between the conditions. For a specified condition, the related conditions are obtained from the plurality of conditions based on those relationships, and the scenes matching the obtained conditions are detected. As shown in FIG. 9, in addition to the timeline showing the time zones of the scenes matching the designated condition, a timeline showing the time zones of the scenes matching each related condition is displayed, and the relationships between the conditions are shown as well by linking the related conditions beneath the designated condition and displaying the timelines hierarchically.
  • When a search user wants to detect a specific scene from the video information, the user can refer to the hierarchically visualized detection status (0/1 or a continuous value) of the designated condition and the related conditions, comparing the detection result of the designated condition with those of the related conditions. When the detection result is insufficient, the search user can narrow down the scenes by referring to the detection results of the related conditions. For example, to preferentially acquire the scenes matching the designated condition, the search user may narrow down to the time zones within the frames W11, W12, and W13 on the timeline of the designated condition (see (1) in FIG. 9).
  • Further, from the scene detection results of conditions A, B, and C within the frame W14, the search user may determine that conditions A and C are indispensable while the scenes are still useful for the intended purpose even when condition B is not satisfied; in that case, the user can set a condition combining only conditions A and C. The search user can also adjust a condition, for example relaxing condition C slightly, by referring to the conditions that condition C includes. In this way, the first embodiment can also handle cases where the user arbitrarily groups a plurality of conditions or temporarily hides a specific condition.
  • As described above, in the first embodiment, the search user can easily grasp the scene detection results of each related condition together with the detection result of the designated condition. When scenes cannot be sufficiently detected under the designated condition, the search user can be expected to intuitively retry the search using the related conditions by referring to the detection results of the designated and related conditions. The first embodiment is therefore considered to reduce the repeated trials of parameter adjustment and condition specification that were conventionally required, and makes it possible to appropriately supplement scenes that could not be detected when detecting scenes from video information.
  • FIG. 10 is a diagram showing an example of a scene detection screen in the video information according to the second embodiment.
  • In the second embodiment, when the search user specifies a range of scenes to be detected in the future, the conditions under which the scenes in that range can be detected from the video information can be compared and consulted.
  • Specifically, when a time zone is designated, the conditions whose detected scenes fall within the designated range (for example, conditions A to C-2) are obtained from the plurality of conditions, and the timelines LA to LC-2, which highlight the designated time zone among the time zones each condition matches, are output hierarchically for each obtained condition. The highlighting may, for example, surround the relevant part of the timelines LA to LC-2 with the frame W21, display it in a color or brightness different from the rest, or make it blink.
  • In other words, in the second embodiment, the search user specifies the time zone in which scenes are to be detected and can then compare and consult the combinations of conditions under which the scenes of that time zone are detected (see (2) in FIG. 10). By comparing the timelines LA to LC-2 with the designated time zone highlighted, the search user can create detection conditions to be used from then on for the time zones of interest, and can detect specific scenes more efficiently.
  • At that time, by giving priority to more specific or more restrictive conditions, the search user can create detection conditions that are effective for the search. Therefore, according to the second embodiment, supporting the creation of detection conditions capable of detecting the scenes the search user desires makes it possible to detect scenes in line with the user's wishes.
  • FIG. 11 is a diagram showing an example of the functional configuration of the display processing system according to the second embodiment.
  • Compared with the display processing system 10 shown in FIG. 2, the display processing system 210 has a UI unit 213 and a processing data storage unit 215 (setting unit). In the display processing system 210, the UI unit 213 has a condition setting unit 2131 (first setting unit), a visualization information display unit 2132 (second display unit), and a video display unit 2133 (third input unit, fourth input unit).
  • The video display unit 2133 plays back video information from any position designated by the search user. The video display unit 2133 then accepts the input of time zone designation information designating an arbitrary time zone in the video information, through operation of the input device by the search user, and outputs the designated time zone to the visualization information display unit 2132.
  • The visualization information display unit 2132 outputs the time zone specified by the time zone designation information to the processing data storage unit 215, and outputs visualization information according to the time zone designation information.
  • Based on the per-condition scene detection results, the processing data storage unit 215 obtains, from the plurality of conditions, the conditions whose detected scenes fall in the time zone specified by the time zone designation information, and outputs the scene detection results for the obtained conditions and relationship information indicating the relationships between them to the visualization information display unit 2132.
  • For each condition obtained by the processing data storage unit 215, the visualization information display unit 2132 outputs visualization information that highlights the time zone specified by the time zone designation information among the time zones of the scenes matching that condition, together with visualization information showing the relationships between the conditions.
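The overlap query behind this display could be sketched as follows (an assumed, simplified representation: per-condition lists of (start, end) scene intervals and a single designated time zone):

```python
# Minimal sketch: find every condition whose detected scenes overlap a
# user-designated time zone [t0, t1], keeping only the overlapping spans.

def conditions_covering(results, t0, t1):
    """results: {condition: [(start, end), ...]}."""
    return {cond: [(s, e) for s, e in spans if s < t1 and e > t0]
            for cond, spans in results.items()
            if any(s < t1 and e > t0 for s, e in spans)}

results = {"A": [(5, 12)], "B": [(40, 50)], "C": [(8, 20), (45, 60)]}
print(conditions_covering(results, 10, 15))  # {'A': [(5, 12)], 'C': [(8, 20)]}
```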
  • Through operation of the input device by a search user referring to the output visualization information, the condition setting unit 2131 accepts registration or update instructions for the conditions that detect scenes in the specified time zone and for the relationships between the detection conditions, and registers or updates the conditions and relationships stored in the condition storage unit 14 accordingly.
  • FIG. 12 is a flowchart showing the procedure for setting conditions and the relationships between conditions, executed by the display processing system 210 shown in FIG. 11.
  • As shown in FIG. 12, the video display unit 2133 plays back video information from an arbitrary position designated by the search user (step S31). The video display unit 2133 then determines whether a time zone in the video information has been designated through operation of the input device by the search user (step S32); if not (step S32: No), it returns to the determination of step S32.
  • The search user may also designate a plurality of time zones in the video information at the same time.
  • The designated time zone is output to the processing data storage unit 215 via the visualization information display unit 2132.
  • Based on the per-condition scene detection results, the processing data storage unit 215 obtains, from the plurality of conditions, the conditions having a detected scene in the designated time zone, and outputs the scene detection results for those conditions and the relationships between them to the visualization information display unit 2132 (step S33).
  • For the scene detection results from the processing data storage unit 215, the visualization information display unit 2132 outputs visualization information highlighting the designated time zone among the time zones of the scenes matching each condition, together with visualization information indicating the relationships between the conditions (step S34). When a plurality of time zones is designated, the visualization information display unit 2132 may display them so that each time zone is distinguishable, for example by using a different color for each.
  • Subsequently, the condition setting unit 2131 determines whether a registration or update instruction has been received for a condition for detecting scenes in the designated time zone and for the relationships between the detection conditions (step S35).
  • If no registration or update instruction has been received (step S35: No), the display processing system 210 ends the processing. If a registration or update instruction has been received (step S35: Yes), the condition setting unit 2131 accepts it, and the conditions stored in the condition storage unit 14 and the relationships between them are registered or updated (step S36).
  • In this way, in the second embodiment, the conditions having a detected scene in the designated time zone are obtained from the plurality of conditions, and for each obtained condition, visualization information highlighting the designated time zone among the time zones that condition matches is displayed together with the relationships between the conditions.
  • Therefore, when the search user wants to determine what kind of condition is effective for detecting scenes in a desired time zone, the user can consult the conditions capable of detecting the scenes of that time zone, and can judge the usefulness of a condition, preferring more specific or more restrictive conditions, by referring to the relationships between the conditions. For example, the search user can extract the conditions commonly detected in the relevant scenes and create a new condition as a template.
  • According to the second embodiment, the search user can determine conditions that are useful for detecting scenes in a desired time zone when detecting scenes from the video information, and can therefore detect the desired scenes appropriately.
  • FIG. 13 is a flowchart showing another procedure for setting conditions and the relationships between conditions, executed by the display processing system 210 shown in FIG. 11.
  • Step S41 and step S42 shown in FIG. 13 are the same processes as steps S31 and S32 shown in FIG. 12, respectively.
  • Based on the per-condition scene detection results, the processing data storage unit 215 obtains the conditions having a detected scene in the designated time zone from the plurality of conditions, and outputs the scene detection results for those conditions and relationship information indicating the relationships between them to the condition setting unit 2131 (second setting unit) (step S43).
  • The condition setting unit 2131 determines the conditions for detecting scenes in the designated time zone, based on the conditions and the relationship information output from the processing data storage unit 215 (step S44).
  • For example, the condition setting unit 2131 extracts the conditions commonly detected in the designated time zones and creates a new condition. The condition setting unit 2131 may also search the conditions having a detected scene in the designated time zone from the upper conditions down to the lower ones, and create a new condition for detecting scenes in the designated time zone by combining upper conditions that are more specific than the lower ones.
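The "commonly detected" extraction could be sketched as follows (illustrative; the data layout is an assumption):

```python
# Minimal sketch: find the conditions that have a detected scene in
# every designated time zone; these are candidates for a new template
# condition (e.g. their AND).

def common_conditions(results, zones):
    """results: {condition: [(start, end), ...]}; zones: [(t0, t1), ...]."""
    def hits(spans, t0, t1):
        return any(s < t1 and e > t0 for s, e in spans)
    return [cond for cond, spans in results.items()
            if all(hits(spans, t0, t1) for t0, t1 in zones)]

results = {"stop": [(10, 20), (40, 50)], "gaze": [(12, 18), (41, 49)],
           "move": [(0, 5)]}
print(common_conditions(results, [(11, 15), (42, 46)]))  # ['stop', 'gaze']
```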
  • The condition setting unit 2131 registers or updates the conditions stored in the condition storage unit 14 and the relationships between them based on the determination result of step S44 (step S45). For example, it registers the new condition in the condition storage unit 14 and registers or updates the relationships between the new condition and the existing conditions.
  • Each component of the display processing systems 10 and 210 is a functional concept and does not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of the functions of the display processing systems 10 and 210 is not limited to the illustrated one, and all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.
  • Each process performed in the display processing systems 10 and 210 may be realized, in whole or in any part, by a CPU, a GPU (Graphics Processing Unit), and a program analyzed and executed by the CPU and the GPU. Each process performed in the display processing system 10 may also be realized as hardware by wired logic.
  • FIG. 14 is a diagram showing an example of a computer in which the display processing systems 10 and 210 are realized by executing a program.
  • the computer 1000 has, for example, a memory 1010 and a CPU 1020.
  • the computer 1000 also has a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. Each of these parts is connected by a bus 1080.
  • The memory 1010 includes a ROM 1011 and a RAM 1012.
  • the ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System).
  • the hard disk drive interface 1030 is connected to the hard disk drive 1090.
  • the disk drive interface 1040 is connected to the disk drive 1100.
  • a removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100.
  • the serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120.
  • the video adapter 1060 is connected to, for example, the display 1130.
  • The hard disk drive 1090 stores, for example, an OS (Operating System) 1091, an application program 1092, a program module 1093, and program data 1094. That is, the program that defines each process of the display processing systems 10 and 210 is implemented as a program module 1093 in which code executable by the computer 1000 is described.
  • The program module 1093 is stored in, for example, the hard disk drive 1090.
  • For example, a program module 1093 for executing processing equivalent to the functional configurations of the display processing systems 10 and 210 is stored in the hard disk drive 1090.
  • the hard disk drive 1090 may be replaced by an SSD (Solid State Drive).
  • The setting data used in the processing of the above-described embodiments is stored as program data 1094 in, for example, the memory 1010 or the hard disk drive 1090. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 and executes them as needed.
  • The program module 1093 and the program data 1094 are not limited to being stored in the hard disk drive 1090; they may be stored in, for example, a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program module 1093 and the program data 1094 may be stored in another computer connected via a network (LAN (Local Area Network), WAN (Wide Area Network), etc.) and read from it by the CPU 1020 via the network interface 1070.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

A display processing system (10) has: a first storage unit that stores a plurality of conditions for detecting a scene in video information on the basis of a parameter which includes time-series information associated with the video information, together with the directional relationships between the conditions; and a processed data storage unit (15) that, with regard to a given condition, obtains a condition relevant to the given condition from the plurality of conditions on the basis of the relationships between the conditions, and detects a scene which corresponds to the obtained condition.

Description

Processing system and processing method
 The present invention relates to a processing system and a processing method.
 Video information can accurately reproduce the situation at the time of shooting. For this reason, it is used in many fields, for both personal and business purposes. For example, individuals use video information for leisure and sports, while businesses use it for security, understanding operations, and audit trails.
 In utilizing video information, there are many cases where one wants to detect only specific scenes from continuous video: for example, confirming whether scenes in which a specific person is the subject are being viewed, confirming the procedure of a specific task, or confirming whether viewing occurs only in a specific time zone.
 However, visually searching for a specific scene takes time, is prone to oversights, and is inefficient, because the amount of temporal and visual information is large and hard to survey at a glance, and because comparison against a schedule or the like is insufficient (the search granularity is coarse and responding to schedule changes is difficult). There is therefore a demand for an efficient way to present scenes in video information.
 In response to this demand, a scene division method based on screen switching has been proposed. However, this method is difficult to use for one-shot videos that contain no cuts.
 A search method using an index has also been proposed. In this method, an index is attached at the start/end timing of a scene, or to each frame in between, and the corresponding scene is presented by selecting one of the indexes. A method has further been proposed in which indexes are assigned by combining the image information and text information in the video, enabling scene searches based on the indexes using thresholds and the like (see, for example, Non-Patent Document 1).
 Furthermore, a visualization method using a timeline has been proposed. By presenting the ranges delimited by indexes on a timeline in chronological order, this method makes it possible to compare and analyze the time widths of the scenes, and to identify the required range directly on the timeline.
 However, in the conventional methods, avoiding missing scenes during indexing remains a problem.
 When indexes are assigned from multiple parameters in combination based on raw data, it is difficult to set the parameters so that few scenes are missed and the intended detection range is matched. Moreover, when it is unclear to the user how the parameters affect the detection results, such as when indexing is automated or when the user differs from the person who set up the indexes, it is also difficult for the user to supplement related scenes that could not be detected.
 Consider, as an example, judging whether each parameter is above a threshold. In this case, good scenes cannot be extracted even when the conditions are partially satisfied. Specifically, if the user wants to see scenes showing a car and the conditions include "a tire is visible", it is difficult to detect scenes in which almost the entire car is visible except for the tires.
 There is also the problem that semantically continuous scenes are excessively divided: if some condition is not satisfied at a point within a scene that should have been detected, the detected scene is split at that point.
 In addition, if the estimation accuracy of some conditions is poor, scene detection itself may become difficult. And when the degree to which conditions match varies due to individual or per-object differences, the conditions must be relaxed to cover all cases, which lengthens the total detected time and weakens the narrowing-down.
 Appropriate parameter settings are required to avoid these problems, but it is difficult for the user to judge how to set the parameters, and hence difficult to supplement related scenes that could not be detected.
 The present invention has been made in view of the above, and its purpose is to provide a processing system and a processing method capable of appropriately complementing scenes that could not be detected when detecting scenes in video information.
 To solve the above problems and achieve the object, a processing system according to the present invention has: a first storage unit that stores a plurality of conditions for detecting scenes in video information based on parameters including time-series information associated with the video information, together with the directional relationships between the conditions; and a detection unit that, for an arbitrary condition, obtains the conditions related to it from the plurality of conditions based on those relationships and detects the scenes corresponding to the obtained conditions.
 According to the present invention, when detecting scenes in video information, it becomes possible to appropriately supplement scenes that could not be detected.
FIG. 1 is a diagram showing an example of a scene detection screen for video information according to the first embodiment.
FIG. 2 is a diagram showing an example of the functional configuration of the display processing system according to the first embodiment.
FIG. 3 is a diagram showing an example of a relationship graph.
FIG. 4 is a diagram illustrating a display example by the visualization information display unit shown in FIG. 2.
FIG. 5 is a flowchart showing the procedure for setting conditions and the relationships between conditions, executed by the display processing system shown in FIG. 2.
FIG. 6 is a flowchart showing the procedure for displaying visualization information for a designated condition, executed by the display processing system shown in FIG. 2.
FIG. 7 is a flowchart showing the procedure of the scene detection processing in video information executed by the display processing system shown in FIG. 2.
FIG. 8 is a diagram illustrating scene detection in video information by a conventional method.
FIG. 9 is a diagram illustrating scene detection in video information according to the first embodiment.
FIG. 10 is a diagram showing an example of a scene detection screen for video information according to the second embodiment.
FIG. 11 is a diagram showing an example of the functional configuration of the display processing system according to the second embodiment.
FIG. 12 is a flowchart showing the procedure for setting conditions and the relationships between conditions, executed by the display processing system shown in FIG. 11.
FIG. 13 is a flowchart showing another procedure for setting conditions and the relationships between conditions, executed by the display processing system shown in FIG. 11.
FIG. 14 is a diagram showing an example of a computer in which the display processing system is realized by executing a program.
 Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings. The present invention is not limited to this embodiment. In the description of the drawings, the same parts are denoted by the same reference numerals.
[Embodiment 1]
 FIG. 1 is a diagram showing an example of a scene detection screen for video information according to the first embodiment. The upper part of FIG. 1 also shows an example of a scene detection screen produced by a conventional method.
As shown in the upper part of FIG. 1, in the conventional method, when a specific scene is to be detected in video information, a detection condition for determining whether a scene is applicable is set by combining parameters associated with the video. At this time, the search user performs a scene search using a label given by the creator of the detection condition.
For example, when the search user specifies a search condition (hereinafter simply a condition) for scene detection in the video information, the conventional method displays, as the detection result, a timeline in which the time zones of the scenes corresponding to the specified condition are colored (see (1) and (2) in FIG. 1).
However, in the conventional method, even if this timeline is insufficient as the detection result for the scenes the user wants to detect, the detection conditional expression (for example, the AND of conditions A to C; see (3) in FIG. 1) is not disclosed, so it is difficult for the user to supplement or narrow down scenes on his or her own. Moreover, even if the search conditional expression (the AND of conditions A to C) were disclosed and editable, it would be difficult for the search user to grasp which parameter of conditions A to C has what influence. For this reason, the search user needs to repeatedly set condition specifications and attempt the search multiple times.
In contrast, in the first embodiment, for the conditions that detect scenes based on the raw data associated with the video information, the time information of the scenes to which each condition applies is held per condition (see (4) in FIG. 1), and a relationship graph that hierarchically represents the directional relationships between conditions (logical sums, logical products, inclusion relationships, and the like) is also held. For example, for actions, "terminal operation" has the relationship that it can be expressed as the AND condition of "standing still" and "fixing the viewpoint". For object detection, there are relationships such as "tire" and "light" being included under "car". Note that the conditions need not be completely independent of one another, and a lower-layer condition is not necessarily a necessary condition of the upper layer.
Then, in the first embodiment, by using this relationship graph, the timeline of the condition specified by the search user is displayed together with the timelines of the related conditions, in a state accompanied by their relationships. In the first embodiment, a timeline is described as an example of the visualization information indicating the time zones of scenes corresponding to an arbitrary condition, but the visualization information is not limited to a timeline and may be character information indicating the time zones of the scenes corresponding to the condition.
Specifically, in the first embodiment, as shown in the lower part of FIG. 1, in addition to the timeline L1 (first visualization information) showing the time zones of scenes corresponding to the user-specified condition (the AND of conditions A to C), timelines (second visualization information) LA to LC are also displayed for the constituent conditions A, B, and C related to the specified condition (see (5) in FIG. 1). Then, by connecting the timeline L1 and the timelines LA to LC with an AND symbol M1 (third visualization information), the relationship between the specified condition and conditions A to C is visualized (see (6) in FIG. 1).
Further, for the conditions C-1 and C-2 subordinate to the constituent condition C, timelines LC-1 and LC-2 are also displayed, and by connecting the timeline LC with the timelines LC-1 and LC-2 using an OR symbol M2 (see frame W1 in FIG. 1), it is visualized that condition C is the OR of conditions C-1 and C-2 (see (7) in FIG. 1).
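The AND/OR composition of condition timelines described above can be made concrete with interval arithmetic. The following is a minimal sketch, assuming each condition's detection result is a list of (start, end) intervals in seconds; the helper names `intersect` and `union` and the sample intervals are illustrative assumptions, not the patent's implementation.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec) of one detected scene

def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """AND of two condition timelines: time zones where both conditions hold."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        if a[i][1] < b[j][1]:   # advance the interval that ends first
            i += 1
        else:
            j += 1
    return out

def union(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """OR of two condition timelines: time zones where either condition holds."""
    out: List[Interval] = []
    for start, end in sorted(a + b):
        if out and start <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], end))
        else:
            out.append((start, end))
    return out

# Condition C as the OR of C-1 and C-2; the specified condition as the AND of A, B and C.
cond_c = union([(3.0, 6.0)], [(5.0, 9.0), (14.0, 16.0)])
a_and_b = intersect([(0.0, 10.0)], [(2.0, 8.0)])
print(intersect(a_and_b, cond_c))  # -> [(3.0, 8.0)]
```

Under this representation, the timeline L1 is simply `intersect` applied across the timelines LA to LC, and the timeline LC is `union` applied to the timelines LC-1 and LC-2.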
In this way, in the first embodiment, together with the timeline L1 indicating the time zones of scenes satisfying the specified condition, timelines indicating the time zones of scenes corresponding to the constituent conditions A to C that make up the specified condition are also displayed hierarchically. By referring to these hierarchically displayed timelines, the search user can easily grasp the scene detection results of each related condition together with the scene detection result of the specified condition. The user can then refer to the scene detection results of more conditions with priorities attached, and because the relationships between conditions are not limited to a particular use, they can also be utilized for general-purpose searches.
When scenes cannot be sufficiently detected under the specified condition, the search user can then be expected to intuitively perform a re-search using the related conditions by referring to the timelines of the specified condition and its related conditions. For this reason, the first embodiment is considered to reduce the repeated trials of parameter adjustment or condition specification by the search user that were conventionally required.
For example, by referring to the visualized timeline L1, the timelines LA to LC-2, and their relationships, the search user can easily recognize conditions that are more concrete than the specified condition (for example, conditions A to C) and more abstract conditions (for example, conditions C-1 and C-2). By selecting or combining these conditions A to C-2 when specifying a condition, the difficulty of condition setting by the search user is alleviated. Therefore, according to the first embodiment, when the search user detects scenes from video information, it is possible to support the scene detection by the search user and to appropriately supplement the scenes that could not be detected.
[Display processing system]
Next, the display processing system according to the first embodiment will be described. FIG. 2 is a diagram showing an example of the functional configuration of the display processing system according to the first embodiment.
The display processing system 10 according to the first embodiment holds a plurality of conditions for detecting scenes in video information based on parameters including time-series information associated with the video information, together with the relationships between the conditions. Thereby, the display processing system 10 visualizes the scene detection result of an arbitrary condition together with the scene detection results of the conditions related to that condition, in a state accompanied by the relationships between the conditions, thereby supporting scene detection by the user and making it possible to appropriately supplement the scenes that could not be detected. In the example of FIG. 1, the display processing system 10 is illustrated on the assumption that it functions as a terminal device, but it is not limited to this; it may function as a server and output the searched video scenes to a user terminal.
The display processing system 10 has a raw data storage unit 11 (first input unit), a data processing unit 12, a UI (User Interface) unit 13, a condition storage unit 14 (first storage unit), and a processing data storage unit 15 (first storage unit, detection unit, second storage unit, third storage unit). Each unit is described below. Note that the units described above may be distributed among and held by a plurality of devices.
The display processing system 10 is realized, for example, by reading a predetermined program into a computer including a ROM (Read Only Memory), a RAM (Random Access Memory), a CPU (Central Processing Unit), and the like, and having the CPU execute the predetermined program. The display processing system 10 also has a communication interface for transmitting and receiving various information to and from other devices connected via a network or the like. For example, the display processing system 10 has a NIC (Network Interface Card) or the like and communicates with other devices via a telecommunication line such as a LAN (Local Area Network) or the Internet. The display processing system 10 further has a touch panel, a voice input device, input devices such as a keyboard and a mouse, a display device such as a liquid crystal display, and a printing device such as a printer, and inputs and outputs information.
The raw data storage unit 11 receives and stores the video information to be searched and the raw data used together with the video information. The raw data is, for example, sensor information acquired synchronously with the shooting of the video information. The sensor information is, for example, GPS (Global Positioning System) information, acceleration information, and temperature information.
The data processing unit 12 detects the scenes in the video information corresponding to each condition based on the video information, the raw data, and each condition, and outputs the scene detection result for each condition to the processing data storage unit 15. The data processing unit 12 also processes the video information to acquire raw data from within it, and stores the video information and the raw data in the raw data storage unit 11 in association with each other. For example, the data processing unit 12 acquires raw data based on object recognition results, position information obtained by SLAM (Simultaneous Localization and Mapping), and the like.
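As an illustration of this per-condition detection, the sketch below assumes the raw data is a sequence of timestamped sensor samples and that a condition is a predicate over a single sample (for example, a threshold on speed); the `Sample` type, the `min_duration` parameter, and the sample values are hypothetical, not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Sample:
    t: float          # seconds from the start of the video
    values: dict      # e.g. {"speed": 0.12, "gaze_still": 1.0}

def detect_scenes(samples: List[Sample],
                  predicate: Callable[[Sample], bool],
                  min_duration: float = 0.0) -> List[Tuple[float, float]]:
    """Return (start, end) time zones in which the predicate holds continuously."""
    intervals: List[Tuple[float, float]] = []
    start = None
    for s in samples:
        if predicate(s):
            if start is None:
                start = s.t
        elif start is not None:
            if s.t - start >= min_duration:
                intervals.append((start, s.t))
            start = None
    if start is not None and samples and samples[-1].t - start >= min_duration:
        intervals.append((start, samples[-1].t))
    return intervals

# "Standing still": speed below a threshold for at least 2 seconds.
samples = [Sample(t / 10.0, {"speed": 0.05 if 20 <= t <= 60 else 0.8})
           for t in range(100)]
print(detect_scenes(samples, lambda s: s.values["speed"] < 0.1, min_duration=2.0))
# -> [(2.0, 6.1)]
```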
The UI unit 13 has a condition setting unit 131, a visualization information display unit 132 (second input unit, first display unit), and a video display unit 133.
The condition setting unit 131 receives an instruction to create a search condition (condition) for scene detection through the search user's operation of an input device. Upon receiving an instruction to create a condition, the condition setting unit 131 creates a new condition using an arbitrary method, and stores the new condition and the relationships between the new condition and the existing conditions in the condition storage unit 14.
The visualization information display unit 132 receives the input of the search condition (specified condition) (first condition) designated for scene detection in the video information, and outputs the specified condition to the processing data storage unit 15. Then, based on the information including the scene detection results output from the processing data storage unit 15, the visualization information display unit 132 outputs visualization information relating to scene detection. When the visualization information display unit 132 receives the designation of a video playback scene through the operation of the input device by a search user who has referred to the visualization information, it outputs the playback range to the video display unit 133.
When the video display unit 133 receives the designation of a playback range from the visualization information display unit 132, it plays back the video of the designated range based on the video information stored in the raw data storage unit 11.
The condition storage unit 14 stores a plurality of conditions for detecting scenes in the video information and a relationship graph showing the directional relationships between the conditions. When the relationship graph is registered or updated, the condition storage unit 14 outputs the registered or updated relationship graph to the processing data storage unit 15.
FIG. 3 is a diagram showing an example of a relationship graph. As shown in FIG. 3, the relationship graph G1 comprehensively holds the relationships between the conditions by indicating, for each condition, logical products, logical sums, time-series connections, inclusion relationships, and the like (see (1) in FIG. 3). For example, a related condition is linked under a higher-level condition when the higher-level condition is the result of the OR or AND of certain conditions, when the higher-level condition includes a certain condition, or when the higher-level condition is satisfied by a plurality of conditions occurring in time series.
Specifically, in the relationship graph G1, the condition "current situation survey" has "inspection" and "recording" as subordinate conditions connected in time series. In the relationship graph G1, the conditions may be linked together with information such as direction ("inspection" → "recording") and occurrence time (for example, within 10 seconds). The condition "inspection" has the conditions "stopping" and "gazing", and these two are linked to "inspection" in a logical-product relationship. In the relationship graph G1, abstract conditions are related under more concrete conditions. As shown by the relationship graph G1, layering the relationships between conditions simplifies the user's prioritization of the requirements to be referred to. Layering the relationships between conditions as in the relationship graph G1 also makes it possible to identify indirectly related conditions, such as "inspection" and "gazing for 2 seconds".
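One possible encoding of such a relationship graph is sketched below; the node fields (`op`, `max_gap_sec`) and the way the "current situation survey" example is modeled are assumptions for illustration, not the patent's specified data structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ConditionNode:
    name: str
    op: Optional[str] = None          # "AND", "OR", "SEQ" (time series), "INCLUDES"
    children: List["ConditionNode"] = field(default_factory=list)
    max_gap_sec: Optional[float] = None  # for "SEQ": allowed gap between children

# "Current situation survey" = "inspection" followed by "recording" within 10 s;
# "inspection" = "stopping" AND "gazing".
inspection = ConditionNode(
    "inspection", op="AND",
    children=[ConditionNode("stopping"), ConditionNode("gazing")])
survey = ConditionNode(
    "current situation survey", op="SEQ",
    children=[inspection, ConditionNode("recording")], max_gap_sec=10.0)

def dump(node: ConditionNode, depth: int = 0) -> None:
    """Print the hierarchy, with more concrete conditions in the upper layers."""
    label = node.name + (f" [{node.op}]" if node.op else "")
    print("  " * depth + label)
    for child in node.children:
        dump(child, depth + 1)

dump(survey)
```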
The processing data storage unit 15 holds the relationship graph and, for an arbitrary condition, obtains the conditions related to the arbitrary condition from among the plurality of conditions based on the relationship graph, and detects the scenes corresponding to the obtained conditions.
The processing data storage unit 15 holds, together with the relationship graph, the scene detection results for each condition. The scene detection result for each condition is the detection result of the scenes detected from the video information using each condition stored in the condition storage unit 14, and is obtained through processing by the data processing unit 12.
When a specified condition is input from the visualization information display unit 132, the processing data storage unit 15 obtains, based on the relationship graph, the specified condition and the related conditions (second conditions) related to the specified condition from among the plurality of conditions. Then, from among the scene detection results for each condition, the processing data storage unit 15 outputs the scene detection results corresponding to the specified condition and the scene detection results corresponding to the related conditions to the visualization information display unit 132. At the same time, based on the relationship graph, the processing data storage unit 15 outputs information indicating the relationships between the specified condition and the related conditions to the visualization information display unit 132.
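Obtaining the related conditions for a specified condition can be viewed as a traversal of the directional graph. The sketch below assumes the graph is held as a flat dictionary of nodes with child edges; all condition names and the `related_conditions` helper are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    name: str
    op: str = ""                      # "AND", "OR", "SEQ", "INCLUDES", or "" for a leaf
    children: List[str] = field(default_factory=list)

# Directional graph: edges point from a condition to its constituent conditions.
graph: Dict[str, Node] = {
    "inspection": Node("inspection", "AND", ["stopping", "gazing 5s"]),
    "gazing 5s": Node("gazing 5s", "INCLUDES", ["gazing 2s"]),
    "stopping": Node("stopping"),
    "gazing 2s": Node("gazing 2s"),
}

def related_conditions(specified: str) -> List[str]:
    """Collect every condition reachable below the specified one (depth-first)."""
    found: List[str] = []
    stack = [specified]
    seen = {specified}
    while stack:
        node = graph[stack.pop()]
        for child in node.children:
            if child not in seen:
                seen.add(child)
                found.append(child)
                stack.append(child)
    return found

print(related_conditions("inspection"))
# -> ['stopping', 'gazing 5s', 'gazing 2s']
```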
Then, the visualization information display unit 132 receives from the processing data storage unit 15 the scene detection results corresponding to the specified condition, the scene detection results corresponding to the related conditions, and the information indicating the relationships between the specified condition and the related conditions, and outputs first visualization information indicating the time zones of the scenes corresponding to the specified condition, second visualization information indicating the time zones of the scenes corresponding to the related conditions, and third visualization information indicating the relationships between the specified condition and the related conditions.
FIG. 4 is a diagram illustrating a display example by the visualization information display unit 132 shown in FIG. 2. As shown in FIG. 4, in addition to the timeline showing the time zones of scenes corresponding to the user-specified condition "inspection", the visualization information display unit 132 displays, in accordance with the relationships shown in the relationship graph G1, the timelines showing the time zones of scenes corresponding to the related conditions "stopping", "gazing for 5 seconds", "moving", and "looking around", together with frames and symbols, as third visualization information, indicating that these are subordinate to the specified condition "inspection". Further, the visualization information display unit 132 displays a timeline for the condition "gazing for 2 seconds" included in the related condition "gazing for 5 seconds". By the visualization information display unit 132 hierarchically displaying the timelines of the specified condition and its related conditions in this way, the search user can refer to the related conditions of the specified condition "inspection" and intuitively use more of them as search conditions (see (1) in FIG. 4).
Note that the visualization information display unit 132 is not limited to the timeline format and can use any visualization format. The visualization information display unit 132 may also change the display method of the timelines or the like or group conditions together, and may accept such display-method changes or grouping of conditions from the search user.
[Setting conditions and relationships between conditions]
Next, the processing executed by the display processing system 10 will be described. First, the processing for setting conditions and the relationships between conditions executed by the display processing system 10 will be described. FIG. 5 is a flowchart showing the processing procedure for setting conditions and relationships between conditions executed by the display processing system 10 shown in FIG. 2.
As shown in FIG. 5, the condition setting unit 131 receives, through the search user's operation of the input device, an instruction to register the name or ID of a condition to be extracted for scene detection (step S1). Using an arbitrary method, the condition setting unit 131 selects, from the available parameters, one or more that express the condition, and sets numerical conditions such as threshold values and reference values (step S2). Note that the condition whose name is specified in step S1 and whose threshold values and the like are set in step S2 corresponds to the "new condition" in step S3 (described later).
Subsequently, the condition setting unit 131 extracts conditions related to the new condition from the condition storage unit 14 (step S3). The new condition is, for example, a condition that is the OR or AND of conditions existing in the condition storage unit 14, a condition that includes an existing condition, or a condition expressed by combining existing conditions in time series. The conditions related to the new condition are conditions that the user specifies as related from among the existing conditions, conditions that the display processing system 10 automatically detects as related by comparing numerical conditions, or partial conditions created by dividing the new condition and linked to it. The new condition does not necessarily have to be expressed as a combination of existing conditions. For example, for a new condition generated by the user, further new conditions may be generated by searching the existing conditions for related ones manually or by numerical condition comparison, or by dividing the new condition.
The condition storage unit 14 registers the condition added by the condition setting unit 131 and updates the relationship graph between that condition and the existing conditions (step S4), so that more concrete conditions are set in the upper layers.
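A minimal sketch of steps S3 and S4 follows, under the explicitly assumed heuristic that two numerical conditions are treated as related when they share a referenced parameter; the `register` helper and this heuristic are illustrative, since the patent leaves the extraction method arbitrary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class Condition:
    name: str
    params: Set[str]                 # parameters referenced by the numerical condition
    children: List[str] = field(default_factory=list)  # related, more abstract conditions

store: Dict[str, Condition] = {
    "stopping": Condition("stopping", {"speed"}),
    "gazing": Condition("gazing", {"gaze_still"}),
}

def register(new: Condition) -> None:
    """Step S3: find related existing conditions; step S4: update the graph."""
    for existing in store.values():
        # Heuristic: conditions sharing a parameter are treated as related, and the
        # existing (more abstract) condition is placed below the new one.
        if new.params & existing.params:
            new.children.append(existing.name)
    store[new.name] = new

register(Condition("inspection", {"speed", "gaze_still"}))
print(store["inspection"].children)  # -> ['stopping', 'gazing']
```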
[Display processing of visualization information for a specified condition]
Next, the display processing of visualization information for a specified condition executed by the display processing system 10 will be described. FIG. 6 is a flowchart showing the processing procedure for displaying visualization information for a specified condition executed by the display processing system 10 shown in FIG. 2.
As shown in FIG. 6, when the visualization information display unit 132 receives from the search user the input of a search condition (specified condition) designated for scene detection (step S11), it outputs the specified condition to the processing data storage unit 15.
The processing data storage unit 15 obtains, from among the plurality of conditions, the specified condition and the related conditions related to the specified condition, detects and outputs to the visualization information display unit 132 the scene detection results corresponding to the specified condition and the scene detection results corresponding to the related conditions, and also outputs, based on the relationship graph, information indicating the relationships between the specified condition and the related conditions (step S12).
As illustrated in FIGS. 1 and 4, the visualization information display unit 132 displays and outputs the respective timelines of the specified condition and the related conditions, together with the relationships between the conditions (step S13).
Then, when the visualization information display unit 132 receives the designation of a video playback scene from the input device of a search user who has referred to the visualization information (step S14), it outputs the playback range to the video display unit 133. Upon receiving the designation of the playback range from the visualization information display unit 132, the video display unit 133 plays back the video of the designated range based on the video information stored in the raw data storage unit 11 (step S15).
[Scene detection processing in video information]
Next, the scene detection processing in video information executed by the display processing system 10 will be described. FIG. 7 is a flowchart showing the processing procedure of the scene detection processing in video information executed by the display processing system 10 shown in FIG. 2.
As shown in FIG. 7, the display processing system 10 determines whether the data in the raw data storage unit 11 or the conditions in the condition storage unit have been updated (step S21). The display processing system 10 repeats the determination processing of step S21 until the data in the raw data storage unit 11 or the conditions in the condition storage unit are updated.
Then, when the data in the raw data storage unit 11 or the conditions in the condition storage unit have been updated (step S21: Yes), the display processing system 10 determines whether to extract information from the video information within its own system (step S22). When information is to be extracted from the video information within its own system (step S22: Yes), the data processing unit 12 extracts information such as GPS information, acceleration information, and temperature information from the video information (step S23).
When information is not to be extracted from the video information within its own system (step S22: No), or after step S23 ends, the data processing unit 12 refers to the detection conditions of each condition stored in the condition storage unit 14 (step S24). In response, the condition storage unit 14 outputs the stored detection conditions to the data processing unit 12 (step S25).
The data processing unit 12 detects the corresponding scenes in the video information based on each condition (step S26) and outputs the scene detection results to the processing data storage unit 15. The processing data storage unit 15 saves the scene detection results of each condition (step S27), and the processing ends.
[Effect of Embodiment 1]
FIG. 8 is a diagram illustrating scene detection in video information by the conventional method. In the conventional method, since the user can search for scenes only in units of the labels prepared by the creator of the detection conditions (see (1) in FIG. 8), only a timeline in which the time zones of the scenes corresponding to the specified condition are colored is displayed as the detection result.
However, in the conventional method, even if this timeline is insufficient as the detection result for the condition the user wants to detect, the concrete detection conditional expression is not visible to the user, so it is difficult for the user to supplement or narrow down scenes on his or her own. Even if the conditional expression were editable, it would be difficult for the user to grasp which parameter has what influence, so condition specifications would have to be set repeatedly and the search attempted multiple times.
FIG. 9 is a diagram illustrating scene detection in video information according to the first embodiment. In contrast to the conventional method, the display processing method according to the first embodiment holds a plurality of conditions for detecting scenes in the video information and a relationship graph showing the directional relationships between the conditions, and, for a specified condition, obtains the related conditions related to the specified condition from among the plurality of conditions based on the relationships between the conditions and detects the scenes corresponding to the obtained conditions. Then, in the first embodiment, as shown in FIG. 9, in addition to the timeline showing the time zones of scenes corresponding to the specified condition, a timeline showing the time zones of scenes corresponding to each related condition is displayed per related condition, and the relationships between the conditions are also displayed by linking the related conditions under the specified condition and displaying the timelines hierarchically.
When the search user wants to detect a specific scene in the video information, the user can refer to the hierarchically visualized detection statuses (0/1 or continuous values) of the specified condition and the related conditions, and can refer to the scene detection result of the specified condition while comparing it with the scene detection results of the related conditions. When the detection result is insufficient, the search user can then narrow down the scenes by referring to the scene detection results of the related conditions. For example, when the search user wants to preferentially acquire the scenes corresponding to the specified condition, the user may narrow down the scenes to the time zones within frames W11, W12, and W13 in the timeline corresponding to the specified condition (see (1) in FIG. 9).
Further, by visualizing the scene detection results of the related conditions related to the specified condition, it is also possible to detect effective scenes from unexpected combinations of conditions. For example, from the scene detection results of conditions A, B, and C in frame W14, the search user can determine that although conditions A and C are indispensable, a scene that merely fails to satisfy condition B is still effective for the intended use of the scene, and can therefore set a condition using only the combination of conditions A and C. By referring to the inclusion conditions of condition C, the search user can also make condition changes such as relaxing condition C slightly. In this way, the first embodiment can also handle cases in which the user arbitrarily groups a plurality of conditions or temporarily hides a specific condition.
Therefore, according to the first embodiment, the search user can easily grasp the scene detection results of each related condition together with the detection result of the specified condition. According to the first embodiment, when scenes cannot be sufficiently detected under the specified condition, the search user can be expected to intuitively perform a re-search using the related conditions by referring to the detection results of the specified condition and its related conditions. For this reason, the first embodiment is considered to reduce the repeated trials of parameter adjustment or condition specification by the user that were conventionally required. Accordingly, when the search user detects scenes from the video information, it is possible to appropriately supplement the scenes that could not be detected.
[Embodiment 2]
Next, the second embodiment will be described. FIG. 10 is a diagram showing an example of a scene detection screen for video information according to the second embodiment. In the second embodiment, when the search user specifies the range of scenes to be detected in the future, the conditions capable of detecting the scenes in this range from the video information can be compared and referred to.
For example, when the search user specifies an arbitrary time zone in the video information as a scene to be detected in the future (see (1) in FIG. 10), the second embodiment obtains, from among the plurality of conditions, the conditions (for example, conditions A to C) that have detected scenes in the specified time zone. Then, in the second embodiment, for each obtained condition, the timelines LA to LC-2 in which the specified time zone is highlighted among the time zones to which each condition applies are output hierarchically. For example, as the highlighting, the portions of the timelines LA to LC-2 corresponding to the specified time zone may be surrounded by a frame W21, displayed in a color or brightness different from the rest, or blinked.
In this way, in the second embodiment, by specifying the time zone in which scenes are to be detected, the search user can compare and refer to which combinations of conditions detect the scenes corresponding to this time zone (see (2) in FIG. 10). Then, by comparing and referring to the timelines LA to LC-2 in which the specified time zone is highlighted, the search user can create detection conditions to be used from the next time onward for the time zone in which scenes are to be detected, and can detect specific scenes more efficiently.
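This comparison amounts to an overlap query over the stored per-condition detection results. A minimal sketch is shown below, assuming interval lists per condition; the `conditions_in_time_zone` helper and the sample data are illustrative.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]

# Per-condition scene detection results: (start_sec, end_sec) intervals.
results: Dict[str, List[Interval]] = {
    "condition A": [(10.0, 25.0), (60.0, 80.0)],
    "condition B": [(30.0, 40.0)],
    "condition C": [(15.0, 22.0), (70.0, 75.0)],
}

def conditions_in_time_zone(zone: Interval) -> Dict[str, List[Interval]]:
    """Return, per condition, the detected intervals overlapping the specified zone."""
    z_start, z_end = zone
    hits: Dict[str, List[Interval]] = {}
    for name, intervals in results.items():
        overlapping = [(s, e) for s, e in intervals if s < z_end and e > z_start]
        if overlapping:
            hits[name] = overlapping
    return hits

# The user specifies a time zone to be detected in the future, e.g. 12 s to 20 s.
print(conditions_in_time_zone((12.0, 20.0)))
# -> {'condition A': [(10.0, 25.0)], 'condition C': [(15.0, 22.0)]}
```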
Then, in the second embodiment, by arranging the scene detection conditions A to C-2 according to the hierarchy of the relationship graph, the search user can make judgments starting preferentially from more concrete or more limited conditions and create detection conditions effective for the search. Therefore, according to the second embodiment, by supporting the creation of detection conditions capable of detecting the scenes desired by the search user, scene detection in line with the search user's wishes becomes possible.
[Display processing system]
Next, the display processing system according to the second embodiment will be described. FIG. 11 is a diagram showing an example of the functional configuration of the display processing system according to the second embodiment.
Compared with the display processing system 10 shown in FIG. 2, the display processing system 210 according to the second embodiment has a UI unit 213 and a processing data storage unit 215 (setting unit). In the display processing system 210, the UI unit 213 has a condition setting unit 2131 (first setting unit), a visualization information display unit 2132 (second display unit), and a video display unit 2133 (third input unit, fourth input unit).
The video display unit 2133 plays back the video information from an arbitrary position chosen by the search user. The video display unit 2133 then accepts, through the search user's operation of the input device, the input of time zone designation information designating an arbitrary time zone in the video information, and outputs the time zone designated by the time zone designation information to the visualization information display unit 2132.
The visualization information display unit 2132 outputs the time zone designated by the time zone designation information to the processing data storage unit 215, and outputs visualization information according to the time zone designation information.
Based on the scene detection results for each condition, the processing data storage unit 215 obtains, from among the plurality of conditions, the conditions having detected scenes in the time zone designated by the time zone designation information, and outputs to the visualization information display unit 2132 the scene detection results of the obtained conditions and relationship information indicating the relationships between the obtained conditions.
For each condition obtained by the processing data storage unit 215, the visualization information display unit 2132 outputs visualization information in which the time zone designated by the time zone designation information is highlighted among the time zones of the scenes corresponding to each condition, and also outputs visualization information indicating the relationships between the conditions.
The condition setting unit 2131 receives, through the operation of the input device by a search user who has referred to the output visualization information, a registration instruction or an update instruction for the conditions for detecting scenes in the time zone designated by the time zone designation information and for the relationships between the detection conditions. In accordance with the received registration or update instruction, the condition setting unit 2131 registers or updates the conditions stored in the condition storage unit 14 and the relationships between the conditions.
[Registration processing of detection conditions for scenes in a designated time zone]
Next, the registration processing of detection conditions for scenes in a designated time zone executed by the display processing system 210 will be described. FIG. 12 is a flowchart showing the processing procedure for setting conditions and relationships between conditions executed by the display processing system 210 shown in FIG. 11.
As shown in FIG. 12, the video display unit 2133 plays back the video information from an arbitrary position designated by the search user (step S31). The video display unit 2133 then determines whether a time zone in the video information has been designated through the search user's operation of the input device (step S32). When no time zone in the video information has been designated (step S32: No), the video display unit 2133 returns to the determination processing of step S32. Note that the search user can also designate a plurality of time zones in the video information at the same time.
When a time zone in the video information has been designated (step S32: Yes), the designated time zone is output to the processing data storage unit 215 via the visualization information display unit 2132. Based on the scene detection results for each condition, the processing data storage unit 215 obtains, from among the plurality of conditions, the conditions having detected scenes in the designated time zone, and outputs the scene detection results of the obtained conditions and the relationships between these conditions to the visualization information display unit 2132 (step S33).
For the scene detection results from the processing data storage unit 215, the visualization information display unit 2132 outputs visualization information in which the designated time zone is highlighted among the time zones of the scenes corresponding to each condition, and also outputs visualization information indicating the relationships between the conditions (step S34). When a plurality of time zones are designated, the visualization information display unit 2132 may display them so that each time zone can be distinguished, for example by color-coding each time zone.
The condition setting unit 2131 then determines whether a registration instruction or an update instruction has been received for the conditions for detecting scenes in the designated time zone and for the relationships between the detection conditions (step S35). When no registration or update instruction has been received for the conditions and the relationships between the detection conditions (step S35: No), the display processing system 210 ends the processing. When a registration or update instruction has been received for the conditions and the relationships between the detection conditions (step S35: Yes), the condition setting unit 2131 registers or updates the conditions stored in the condition storage unit 14 and the relationships between the conditions in accordance with the registration or update instruction (step S36).
[Effect of Embodiment 2]
As described above, in the second embodiment, when the search user specifies an arbitrary time zone in the video information as the range of scenes to be detected, the conditions having detected scenes in the specified time zone are obtained from among the plurality of conditions, and for each obtained condition, visualization information in which the specified time zone is highlighted among the time zones to which that condition applies is displayed together with the relationships between the conditions.
For this reason, according to the second embodiment, when the search user wants to judge what kinds of conditions are effective for detecting scenes in a desired time zone, the user can determine the usefulness of conditions, starting from more concrete or more limited conditions, by referring to the conditions that can detect the scenes in this time zone and to the relationships between the conditions. For example, the search user can extract the conditions commonly detected in the corresponding scenes and turn them into a new template condition.
Therefore, according to the second embodiment, when detecting scenes from video information, the search user can identify useful conditions for detecting scenes in a desired time zone, so that the desired scenes can be detected appropriately.
[Modification of Embodiment 2]
Note that the display processing system 210 may determine the conditions capable of detecting scenes in a desired time zone, and automatically register or update new conditions for the scenes in the desired time zone and the relationships between the conditions. FIG. 13 is a flowchart showing another processing procedure for setting conditions and relationships between conditions executed by the display processing system 210 shown in FIG. 11.
Steps S41 and S42 shown in FIG. 13 are the same processing as steps S31 and S32 shown in FIG. 12, respectively. When a time zone in the video information has been designated (step S42: Yes), the processing data storage unit 215 obtains, based on the scene detection results for each condition, the conditions having detected scenes in the designated time zone from among the plurality of conditions, and outputs the scene detection results of the obtained conditions and relationship information indicating the relationships between these conditions to the condition setting unit 2131 (second setting unit) (step S43).
Based on the conditions output from the processing data storage unit 215 and the relationship information indicating the relationships between the obtained conditions, the condition setting unit 2131 determines the conditions for detecting scenes in the designated time zone (step S44). The condition setting unit 2131 extracts the conditions commonly detected in the designated time zone and creates a new condition. Further, among the conditions having detected scenes in the designated time zone, the condition setting unit 2131 searches the conditions in order from higher-level to lower-level, and newly creates, as a condition for scene detection in the designated time zone, a condition that combines the more concrete higher-level conditions rather than the lower-level ones.
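As one reading of step S44, the "conditions commonly detected in the designated time zone" could be extracted with a coverage test over each condition's detected intervals. The sketch below uses a hypothetical full-coverage heuristic and illustrative names; other notions of "commonly detected" (for example, partial overlap) would also fit the description.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]

def covers(intervals: List[Interval], zone: Interval) -> bool:
    """True if the union of the intervals covers the whole designated zone."""
    start, end = zone
    t = start
    for s, e in sorted(intervals):
        if s > t:
            break
        t = max(t, e)
    return t >= end

results: Dict[str, List[Interval]] = {
    "stopping":  [(10.0, 30.0)],
    "gazing 2s": [(12.0, 18.0), (17.0, 32.0)],
    "moving":    [(40.0, 50.0)],
}

def common_conditions(zone: Interval) -> List[str]:
    """Conditions detected throughout the designated time zone (step S44 sketch)."""
    return [name for name, ivs in results.items() if covers(ivs, zone)]

zone = (12.0, 25.0)
members = common_conditions(zone)
print(members)  # -> ['stopping', 'gazing 2s']
# A new template condition could then be registered as the AND of these members.
new_condition = {"name": "auto condition", "op": "AND", "children": members}
```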
Based on the determination result in step S44, the condition setting unit 2131 registers or updates the conditions stored in the condition storage unit 14 and the relationships between the conditions (step S45). The condition setting unit 2131 registers the new condition in the condition storage unit 14 and also registers or updates the relationships between the new condition and the existing conditions.
[System configuration of the embodiments]
Each component of the display processing systems 10 and 210 is functionally conceptual and does not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of the functions of the display processing systems 10 and 210 is not limited to that illustrated, and all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.
All or any part of the processing performed in the display processing systems 10 and 210 may be realized by a CPU, a GPU (Graphics Processing Unit), and a program analyzed and executed by the CPU and the GPU. The processing performed in the display processing system 10 may also be realized as hardware by wired logic.
Of the processes described in the embodiments, all or part of the processes described as being performed automatically can also be performed manually. Conversely, all or part of the processes described as being performed manually can be performed automatically by known methods. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters described above and shown in the drawings can be changed as appropriate unless otherwise specified.
[Program]
FIG. 14 is a diagram showing an example of a computer in which the display processing systems 10 and 210 are realized by executing a program. The computer 1000 has, for example, a memory 1010 and a CPU 1020. The computer 1000 also has a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These units are connected by a bus 1080.
 The memory 1010 includes a ROM 1011 and a RAM 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected to, for example, a display 1130.
 The hard disk drive 1090 stores, for example, an OS (Operating System) 1091, an application program 1092, a program module 1093, and program data 1094. That is, the program that defines each process of the display processing systems 10 and 210 is implemented as a program module 1093 in which code executable by the computer 1000 is described. The program module 1093 is stored in, for example, the hard disk drive 1090. For example, a program module 1093 for executing processes equivalent to the functional configuration of the display processing systems 10 and 210 is stored in the hard disk drive 1090. The hard disk drive 1090 may be replaced by an SSD (Solid State Drive).
 The setting data used in the processes of the above-described embodiments is stored as program data 1094 in, for example, the memory 1010 or the hard disk drive 1090. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 and executes them as needed.
 The program module 1093 and the program data 1094 are not necessarily stored in the hard disk drive 1090; for example, they may be stored in a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program module 1093 and the program data 1094 may be stored in another computer connected via a network (a LAN (Local Area Network), a WAN (Wide Area Network), or the like) and read from the other computer by the CPU 1020 via the network interface 1070.
 Although the embodiments to which the invention made by the present inventors is applied have been described above, the present invention is not limited by the description and drawings that form part of the disclosure of the present invention according to these embodiments. That is, other embodiments, examples, operational techniques, and the like made by those skilled in the art based on these embodiments are all included within the scope of the present invention.
10,210 Display processing system
11 Raw data storage unit
12 Data processing unit
13 UI unit
14 Condition storage unit
15,215 Processing data storage unit
131,2131 Condition setting unit
132,2132 Visualization information display unit
133,2133 Video display unit

Claims (5)

  1. A processing system comprising:
     a first storage unit that stores a plurality of conditions for detecting scenes in video information based on parameters including time-series information associated with the video information, and relationships between the conditions, each relationship having directivity; and
     a detection unit that, with respect to an arbitrary condition, obtains a condition related to the arbitrary condition from among the plurality of conditions based on the relationships between the conditions, and detects a scene corresponding to the obtained condition.
  2. The processing system according to claim 1, further comprising:
     a first input unit that receives input of video information to be searched;
     a second storage unit that stores scene detection results in the video information to be searched under each condition stored in the first storage unit;
     a second input unit that receives input of a first condition designated for scene detection in the video information; and
     a first display unit that outputs visualization information on the scene detection results,
     wherein the detection unit obtains, from among the plurality of conditions, the first condition and a second condition related to the first condition, outputs to the first display unit, from among the scene detection results under the respective conditions, the scene detection result corresponding to the first condition and the scene detection result corresponding to the second condition, and outputs to the first display unit information indicating the relationship between the first condition and the second condition, and
     the first display unit outputs first visualization information indicating time zones of scenes corresponding to the first condition, second visualization information indicating time zones of scenes corresponding to the second condition, and third visualization information indicating the relationship between the first condition and the second condition.
  3. The processing system according to claim 1, further comprising:
     a third storage unit that stores scene detection results in the video information under each condition stored in the first storage unit;
     a third input unit that receives input of time zone designation information designating an arbitrary time zone in the video information;
     a second display unit that outputs visualization information corresponding to the time zone designation information; and
     a first setting unit that sets the conditions stored in the first storage unit and the relationships between the conditions,
     wherein the detection unit obtains, based on the scene detection results under the respective conditions, conditions among the plurality of conditions under which scenes are detected in the time zone designated by the time zone designation information, and outputs to the second display unit the scene detection results under the obtained conditions and relationship information indicating the relationships between the obtained conditions,
     the second display unit outputs, for each condition obtained by the detection unit, visualization information in which the time zone designated by the time zone designation information is highlighted among the time zones of the scenes corresponding to the condition, and also outputs visualization information indicating the relationships between the conditions, and
     the first setting unit registers or updates the conditions stored in the first storage unit and the relationships between the conditions in response to a registration instruction or an update instruction concerning the conditions for detecting scenes in the time zone designated by the time zone designation information and the relationships between the detection conditions.
  4. The processing system according to claim 1, further comprising:
     a third storage unit that stores scene detection results in the video information under each condition stored in the first storage unit;
     a fourth input unit that receives input of time zone designation information designating an arbitrary time zone in the video information; and
     a second setting unit that sets the conditions stored in the first storage unit and the relationships between the conditions,
     wherein the detection unit obtains, based on the scene detection results under the respective conditions, conditions among the plurality of conditions under which scenes are detected in the time zone designated by the time zone designation information, and outputs to the second setting unit the scene detection results under the obtained conditions and relationship information indicating the relationships between the obtained conditions, and
     the second setting unit determines conditions for detecting scenes in the time zone designated by the time zone designation information based on the conditions obtained by the detection unit and the relationship information indicating the relationships between the conditions obtained by the detection unit, and registers or updates the conditions stored in the first storage unit and the relationships between the conditions based on the determination result.
  5. A processing method executed by a processing system,
     the processing system having a storage unit that stores a plurality of conditions for detecting scenes in video information based on parameters including time-series information associated with the video information, and relationships between the conditions, each relationship having directivity,
     the processing method comprising a step of detecting, with respect to an arbitrary condition, a condition related to the arbitrary condition from among the plurality of conditions based on the relationships between the conditions.
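
For orientation only, and not as part of the claims, the following Python sketch shows one possible shape of the first storage unit and the related-condition lookup recited in claim 1, together with the highlighting split implied by claim 3. All identifiers are hypothetical, and the directed adjacency-list representation is an assumption.

```python
from collections import defaultdict

class ConditionStore:
    """Sketch of the first storage unit: conditions plus directed
    relationships between them, held as an adjacency list."""
    def __init__(self):
        self.conditions = {}             # condition id -> detection parameters
        self.related = defaultdict(set)  # directed edges: id -> related ids

    def relate(self, src, dst):
        self.related[src].add(dst)       # directivity: src -> dst only

def find_related(store, cond_id):
    """Detection-unit sketch: follow the directed relationships transitively
    to collect every condition related to the given one."""
    seen, stack = set(), [cond_id]
    while stack:
        for nxt in store.related[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def split_for_highlight(detected_zones, designated):
    """Claim 3 sketch: split each detected time zone into the part inside the
    designated zone (highlighted) and the parts outside it."""
    ds, de = designated
    inside, outside = [], []
    for s, e in detected_zones:
        lo, hi = max(s, ds), min(e, de)
        if lo < hi:
            inside.append((lo, hi))
            if s < lo:
                outside.append((s, lo))
            if hi < e:
                outside.append((hi, e))
        else:
            outside.append((s, e))
    return inside, outside
```

For instance, after store.relate("rally", "smash"), find_related(store, "rally") returns {"smash"}, and the detection unit would evaluate that condition alongside the designated one; both condition names here are made up for the example.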
PCT/JP2020/025705 2020-06-30 2020-06-30 Processing system and processing method WO2022003836A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2020/025705 WO2022003836A1 (en) 2020-06-30 2020-06-30 Processing system and processing method
JP2022532896A JP7439927B2 (en) 2020-06-30 2020-06-30 Treatment system and treatment method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/025705 WO2022003836A1 (en) 2020-06-30 2020-06-30 Processing system and processing method

Publications (1)

Publication Number Publication Date
WO2022003836A1 true WO2022003836A1 (en) 2022-01-06

Family

ID=79314971

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/025705 WO2022003836A1 (en) 2020-06-30 2020-06-30 Processing system and processing method

Country Status (2)

Country Link
JP (1) JP7439927B2 (en)
WO (1) WO2022003836A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07253986A (en) * 1994-03-15 1995-10-03 Sharp Corp Retrieval device for sound and moving picture
JP2008131170A (en) * 2006-11-17 2008-06-05 Nippon Hoso Kyokai <Nhk> Knowledge metadata generation device, digest generation device, knowledge metadata generation program, and digest generation program
JP2010026981A (en) * 2008-07-24 2010-02-04 Nippon Hoso Kyokai <Nhk> Particular scene learning system and program
JP2013037595A (en) * 2011-08-10 2013-02-21 Casio Comput Co Ltd Image searching device, animation searching device, image searching method, animation searching method and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MIYAZAKI, MASARU et al.: "Development of Digest Production System using Baseball Knowledgebase", Proceedings of the 6th Forum on Information Technology (FIT2007), Tokyo, Japan, September 2007, vol. 16, no. 2, pages 445-446, XP009534173 *

Also Published As

Publication number Publication date
JP7439927B2 (en) 2024-02-28
JPWO2022003836A1 (en) 2022-01-06

Similar Documents

Publication Publication Date Title
US7805451B2 (en) Ontology-integration-position specifying apparatus, ontology-integration supporting method, and computer program product
RU2449357C2 (en) Ranking diagram
US20110271224A1 (en) Pinning of tabs in tab groups
JP7509144B2 (en) Information processing device, information processing method, and program
JP2008097175A (en) Electronic file retrieving device
CN106249982B (en) Display control method, display control device, and control program
US10546013B2 (en) File management system facilitating the organization of content by using visualizations of the organizational state augmented by a configurable workflow
CN103477317B (en) Content display processing device, content display processing method and integrated circuit
JP2008108200A (en) Information extraction device, method, program and storage medium
JP2008102594A (en) Content search method and search device
US20110179003A1 (en) System for Sharing Emotion Data and Method of Sharing Emotion Data Using the Same
US20120317117A1 (en) Information Visualization System
US9245351B2 (en) Color evaluation apparatus, color evaluation method and computer program
US20090150429A1 (en) Data management apparatus and data processing method
JP5096850B2 (en) Search result display method, search result display program, and search result display device
WO2022003836A1 (en) Processing system and processing method
JP2006217046A (en) Video index image generator and generation program
US20150199419A1 (en) Information processing apparatus, information processing method and non-transitory computer readable medium
US9189497B2 (en) Information processing apparatus, control method therefor, and program for classifying a plurality of data
JP5302529B2 (en) Information processing apparatus, information processing method, program, and recording medium
JP2016122413A (en) Image processing apparatus, control method of image processing apparatus, and program
WO2022003842A1 (en) Processing system and processing method
US20240303887A1 (en) Systems and methods for identifying a design template matching a media item
EP4414445A1 (en) Cell image analysis system
JP5617535B2 (en) Information processing apparatus, information processing apparatus processing method, and program.

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20943632; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2022532896; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20943632; Country of ref document: EP; Kind code of ref document: A1)