
CN119763090B - Movie subtitle extraction method and device under background complex change scene based on OCR and color preprocessing

Info

Publication number
CN119763090B
Authority
CN
China
Prior art keywords
subtitle
ocr
movie
picture
color
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411815294.3A
Other languages
Chinese (zh)
Other versions
CN119763090A (en
Inventor
周晟
吴雨轩
卜佳俊
沈铭
李亮城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2024-12-11
Filing date
2024-12-11
Publication date
2025-09-26

Landscapes

  • Studio Circuits (AREA)
  • Studio Devices (AREA)

Abstract

The invention discloses a method and a device for extracting movie subtitles in scenes with complex background changes, based on OCR and color preprocessing. The method first locates and crops the subtitle region of the movie, extracts the color feature of the subtitles from that region, and then extracts the subtitle text after preprocessing the frames according to the subtitle color. This improves on traditional movie subtitle extraction methods, whose results are poor when background elements are cluttered or colors change. The method helps improve the recognition accuracy of OCR in movie scenes and overcomes the challenges posed by complex backgrounds while preserving efficiency.

Description

Movie subtitle extraction method and device under background complex change scene based on OCR and color preprocessing
Technical field:
The invention relates to optical character recognition technology, and in particular to a method for extracting movie subtitles in scenes with complex background changes, based on OCR and color-specific preprocessing.
Background:
Text recognition for movie subtitles often has to cope with relatively complex scene changes. Movie scenes often contain rich and varied background elements such as fast-moving objects, lighting changes, and special effects. Strong shadow effects may directly impair the clarity of the subtitles. Together, these factors make subtitle extraction from movies exceptionally complex.
Traditional movie subtitle extraction methods rely solely on optical character recognition (OCR), but they often perform poorly because subtitles can be confused with background elements, or affected by color changes, and are then difficult to recognize accurately. It is therefore difficult to extract subtitles accurately from movie scenes using OCR alone.
These problems currently need to be solved.
Summary of the invention:
The invention aims to overcome the shortcomings of the prior art by providing a method and a device for extracting movie subtitles in scenes with complex background changes, combining the advantages of OCR and color preprocessing.
To solve the above technical problems, the method for extracting movie subtitles in scenes with complex background changes based on OCR and color preprocessing comprises the following steps:
S110, sampling movie frames to obtain the positions of text frames (the bounding boxes of text detected in a picture): randomly sampling a plurality of picture frames from the movie, and obtaining the positions of all text frames in those picture frames by an OCR method;
S120, acquiring subtitle position information by using a clustering algorithm: clustering the text frames to find the class with the most members, taking the center point coordinates of that class as the center point of the subtitle position, taking the mode of the text frame heights in that class as the height of the subtitle position, and taking the width of the movie as the width of the subtitle position;
S130, calculating the subtitle color feature from the modes of the pixel values;
S140, extracting the subtitle text after preprocessing according to the subtitle color.
Further, step S110 of sampling movie frames to obtain the positions of text frames specifically includes:
S1101, randomly sampling n picture frames from a movie;
S1102, for the n picture frames, calculating the positions of all text frames by an OCR method, the number of text frames being $S_n$;
S1103, representing the position of each text frame by $\{X, Y, H\}$, where $X, Y$ are the center point coordinates of the text frame and $H$ is its height.
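The sampling and the {X, Y, H} representation of S1101 to S1103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the OCR engine returns axis-aligned boxes as (x_min, y_min, x_max, y_max) tuples, and the names `sample_frame_indices` and `box_to_xyh` are hypothetical. The OCR call itself is omitted because the source does not name an engine.

```python
import random


def sample_frame_indices(total_frames, n, seed=None):
    """S1101: pick n distinct frame indices uniformly at random."""
    rng = random.Random(seed)
    n = min(n, total_frames)
    return sorted(rng.sample(range(total_frames), n))


def box_to_xyh(box):
    """S1103: convert an assumed (x_min, y_min, x_max, y_max) box to (X, Y, H)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2, y_max - y_min)


# Hypothetical OCR output for one sampled frame: two detected text boxes.
boxes = [(700, 980, 1220, 1020), (100, 50, 300, 80)]
text_frames = [box_to_xyh(b) for b in boxes]
```

Collecting `text_frames` over all n sampled frames gives the set of text frames that the clustering step operates on.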
Further, step S120 of acquiring subtitle position information by using a clustering algorithm specifically includes:
S1201, performing K-means clustering on the positions $X, Y$ of the $S_n$ text frames, finding the class with the most members, and taking the center point coordinates of that class as the center point of the subtitle position;
S1202, computing the mode of the text frame heights $H$ within that class and taking it as the height of the subtitle position;
S1203, taking the width of the movie as the width of the subtitle position.
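Steps S1201 to S1203 can be sketched in pure Python as below. Several choices here are assumptions made for illustration: the source does not state the number of clusters, so `k=2` is arbitrary; the hand-written `kmeans` uses a deterministic farthest-point initialization (instead of the random initialization common in library implementations) so that the example is reproducible; and the function names are hypothetical.

```python
from collections import Counter


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means on 2-D points with deterministic farthest-point init."""
    def d2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: d2(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def subtitle_position(text_frames, movie_width, k=2):
    """S1201-S1203: return (center, height, width) of the subtitle region."""
    points = [(x, y) for x, y, _ in text_frames]
    centers, labels = kmeans(points, k)
    biggest = Counter(labels).most_common(1)[0][0]    # class with the most members
    heights = [h for (_, _, h), lab in zip(text_frames, labels) if lab == biggest]
    height = Counter(heights).most_common(1)[0][0]    # mode of H in that class
    return centers[biggest], height, movie_width


# Four text frames near the bottom center (subtitles) and two stray ones.
text_frames = [(950, 1000, 40), (960, 1000, 40), (970, 1000, 42),
               (960, 1000, 40), (200, 100, 30), (220, 110, 30)]
center, height, width = subtitle_position(text_frames, movie_width=1920)
```

Taking the largest cluster relies on subtitles recurring at the same place across sampled frames, while incidental on-screen text is scattered.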
Further, step S130 of calculating the subtitle color feature from the modes of the pixel values specifically includes:
S1301, for each picture frame sampled in S1101, cropping the movie picture at the subtitle position, and denoting the pixel values of the cropped picture by $\mathrm{Img}_{i,j}$;
S1302, for each column of pixels in the cropped picture, finding the most frequent pixel value, i.e. the mode of the pixel values in that column, denoted $C_j$;
S1303, calculating the mode of $\{C_1, \dots, C_j, \dots, C_w\}$, denoted $M$, where $w$ is the number of pixel columns in the picture;
S1304, repeating the above operations for all cropped picture frames, and taking the mode of the resulting values $\{M\}$ as the subtitle color feature.
Further, in step S130, the subtitle color feature is stored in RGB three-channel format.
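The nested mode computation of S1302 to S1304 can be sketched as follows. The sketch assumes each cropped picture is a list of rows whose pixels are (R, G, B) tuples, in line with the RGB three-channel format; the function names and the toy pictures are hypothetical.

```python
from collections import Counter


def mode(values):
    """Most frequent value; ties resolve to the value seen first."""
    return Counter(values).most_common(1)[0][0]


def frame_color_mode(crop):
    """S1302-S1303: crop is a list of rows, each row a list of (R, G, B) tuples."""
    width = len(crop[0])
    column_modes = [mode(row[j] for row in crop) for j in range(width)]  # C_1..C_w
    return mode(column_modes)                                            # M


def subtitle_color(crops):
    """S1304: mode of the per-frame values M over all sampled crops."""
    return mode(frame_color_mode(c) for c in crops)


W, B, R = (255, 255, 255), (20, 30, 40), (200, 10, 10)
crop_with_text = [[W, W, B, W],
                  [W, W, B, W],
                  [B, B, W, W]]
crop_without_text = [[R, R, R, R] for _ in range(3)]
color = subtitle_color([crop_with_text, crop_without_text, crop_with_text])
```

In the toy data, the frame without subtitles yields a background color as its value M, but the final mode over frames still selects the color that recurs across the sampled frames.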
Further, step S140 of extracting the subtitle text after preprocessing according to the subtitle color specifically includes:
S1401, extracting all picture frames of the movie and cropping each one at the subtitle position;
S1402, binarizing each cropped picture: pixels whose value equals the subtitle color feature are set to 255, and pixels whose value does not equal it are set to 0;
S1403, sending each binarized picture to OCR for subtitle text extraction.
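The binarization rule of S1402 can be sketched as below. The cropped picture is assumed to be a list of rows of (R, G, B) tuples. The `tol` parameter is a hypothetical addition, not part of the source: with the default `tol=0` the function applies the exact-equality rule stated above. The OCR call of S1403 is omitted because the source does not name an engine.

```python
def binarize(crop, color, tol=0):
    """S1402: 255 where a pixel matches the subtitle color, 0 elsewhere.

    tol is a hypothetical per-channel tolerance; tol=0 is the exact-equality
    rule stated in the text.
    """
    def matches(pixel):
        return all(abs(a - b) <= tol for a, b in zip(pixel, color))

    return [[255 if matches(p) else 0 for p in row] for row in crop]


WHITE = (255, 255, 255)
crop = [[WHITE, (20, 30, 40), WHITE],
        [(250, 250, 250), WHITE, (0, 0, 0)]]
mask = binarize(crop, WHITE)            # exact match, as in S1402
loose_mask = binarize(crop, WHITE, 5)   # assumed tolerance of 5 per channel
```

The resulting single-channel mask keeps only pixels of the subtitle color, so the picture handed to OCR contains no background of any other color.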
A second aspect of the present invention relates to a device for extracting movie subtitles in scenes with complex background changes based on OCR and color preprocessing, comprising a memory and one or more processors, wherein executable code is stored in the memory, and the one or more processors, when executing the executable code, implement the method of the present invention for extracting movie subtitles in scenes with complex background changes based on OCR and color preprocessing.
A third aspect of the present invention relates to a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the method for extracting movie subtitles in scenes with complex background changes based on OCR and color preprocessing as set forth in any one of claims 1 to 6.
The beneficial effects of the invention are as follows. The subtitle extraction method for scenes with complex movie background changes, based on OCR and color preprocessing, uses OCR to collect all text frames in the sampled picture frames, and applies clustering and mode operations to the text frame information to determine the subtitle position. The subtitle color feature is then determined by taking statistics in sequence over columns, rows, and movie frames. After binarization preprocessing based on this color information, the pictures are passed to OCR to extract the subtitles. This addresses the inability of existing OCR-only techniques to cope with complex scene changes behind movie subtitles, and thereby improves the accuracy of movie subtitle extraction.
Description of the drawings:
Fig. 1 is a flowchart of the subtitle extraction method for scenes with complex movie background changes, based on OCR and color preprocessing, according to an embodiment of the present invention.
Fig. 2 is a schematic view of the apparatus of the present invention.
Fig. 3 shows the effect of the binarization preprocessing of the present invention.
Detailed description of the embodiments:
The invention will now be described in further detail with reference to the accompanying drawings. The drawings are simplified schematic representations which merely illustrate the basic structure of the invention and therefore show only the structures which are relevant to the invention.
Example 1
As shown in Fig. 1, this embodiment provides a method for extracting movie subtitles in scenes with complex background changes based on OCR and color preprocessing. It improves the adaptability of the prior art to complex scene changes in movies, and thereby helps improve the accuracy of movie subtitle extraction.
Specifically, the method comprises:
S110, sampling a film frame to obtain a specific position of a text frame;
Specifically, n picture frames are randomly sampled from a movie, and the positions of all text frames in the n picture frames are calculated by an OCR method. The number of text frames is $S_n$. The position of each text frame is represented by $\{X, Y, H\}$, where $X, Y$ are the center point coordinates of the text frame and $H$ is its height.
S120, acquiring subtitle position information by using a clustering algorithm;
Specifically, K-means clustering is performed on the positions $X, Y$ of the $S_n$ text frames to find the class with the most members, and the center point coordinates of that class are taken as the center point of the subtitle position. The mode of the text frame heights $H$ within that class is taken as the height of the subtitle position, and the width of the movie is taken as the width of the subtitle position.
S130, calculating subtitle color features according to the pixel point modes;
Specifically, for each picture frame sampled in S1101, the movie picture is cropped at the subtitle position, and the pixel values of the cropped picture are denoted by $\mathrm{Img}_{i,j}$. For each column of pixels in the cropped picture, the most frequent pixel value, i.e. the mode of the pixel values in that column, is found and denoted $C_j$. The mode of $\{C_1, \dots, C_j, \dots, C_w\}$ is calculated and denoted $M$, where $w$ is the number of pixel columns in the picture. These operations are repeated for all cropped picture frames, and the mode of the resulting values $\{M\}$ is taken as the subtitle color feature.
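The three nested mode operations of S130 can be written compactly. This is a restatement for readability rather than a formula taken from the source: the row count $h$ of the cropped picture and the frame index $f$ (running over the n sampled frames) are notation introduced here for illustration.

```latex
C_j^{(f)} = \operatorname*{mode}_{1 \le i \le h} \mathrm{Img}^{(f)}_{i,j}, \qquad
M^{(f)} = \operatorname*{mode}_{1 \le j \le w} C_j^{(f)}, \qquad
\text{subtitle color} = \operatorname*{mode}_{1 \le f \le n} M^{(f)}
```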
And S140, extracting caption text according to caption color preprocessing.
Specifically, all picture frames of the movie are extracted and each is cropped at the subtitle position. Each cropped picture is binarized: pixels whose value equals the subtitle color feature are set to 255, and pixels whose value does not equal it are set to 0. Each binarized picture is then sent to OCR for subtitle text extraction.
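Cropping a picture at the subtitle position can be sketched as below. Because the subtitle position spans the full movie width, only the row range has to be computed. The frame is assumed to be a list of pixel rows; the function name, the rounding, and the clamping to the frame bounds are choices made for this illustration and are not specified in the source.

```python
def crop_subtitle(frame, center, height):
    """Cut the full-width band of `height` rows centered on center[1].

    frame is a list of pixel rows; the band is clamped to the frame bounds.
    """
    _, cy = center
    top = max(0, int(round(cy - height / 2)))
    bottom = min(len(frame), top + int(height))
    return frame[top:bottom]


# Toy frame with 10 rows and 4 columns; the red channel encodes the row index.
frame = [[(r, 0, 0)] * 4 for r in range(10)]
band = crop_subtitle(frame, center=(2.0, 7.0), height=4)
```

The same function would serve both the sampled frames used for the color feature and all frames processed in the final extraction pass.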
In summary, the subtitle extraction method for scenes with complex movie background changes, based on OCR and color preprocessing, uses OCR to collect all text frames in the sampled picture frames, and applies clustering and mode operations to the text frame information to determine the subtitle position. The subtitle color feature is determined by taking statistics in sequence over columns, rows, and movie frames. After binarization preprocessing based on this color information, the subtitles are extracted by OCR; the effect of the binarization preprocessing is shown in Fig. 3. This addresses the inability of existing OCR-only techniques to cope with complex scene changes behind movie subtitles, and thereby improves the accuracy of movie subtitle extraction.
Example 2
The present embodiment relates to a device for extracting movie subtitles in scenes with complex background changes based on OCR and color preprocessing. As shown in Fig. 2, it includes a memory and one or more processors, wherein executable code is stored in the memory, and the one or more processors, when executing the executable code, implement the method of embodiment 1 for extracting movie subtitles in scenes with complex background changes based on OCR and color preprocessing.
Example 3
The present embodiment relates to a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the method of embodiment 1 for extracting movie subtitles in scenes with complex background changes based on OCR and color preprocessing.
The preferred embodiments described above are illustrative. In light of this description, persons skilled in the relevant art can make various changes and modifications without departing from the technical idea of the present invention. The technical scope of the present invention is not limited to the content of the description and must be determined according to the scope of the claims.

Claims (5)

1. The method for extracting the movie subtitle under the background complex change scene based on OCR and color preprocessing is characterized by comprising the following steps:
S110, sampling movie frames to obtain the positions of text frames: randomly sampling a plurality of picture frames from the movie, and obtaining the positions of all text frames in those picture frames by an OCR method, wherein sampling movie frames to obtain the positions of text frames specifically comprises:
S1101, randomly sampling n picture frames from a movie;
S1102, for the n picture frames, calculating the positions of all text frames by an OCR method, and denoting the number of text frames by $S_n$;
S1103, representing the position of each text frame by $\{X, Y, H\}$, where $X, Y$ are the center point coordinates of the text frame and $H$ is its height;
S120, acquiring subtitle position information by using a clustering algorithm: clustering the text frames to find the class with the most members, taking the center point coordinates of that class as the center point of the subtitle position, taking the mode of the text frame heights in that class as the height of the subtitle position, and taking the width of the movie as the width of the subtitle position;
S130, calculating the subtitle color feature from the modes of the pixel values, which specifically comprises the following steps:
S1301, for each picture frame sampled in S1101, cropping the movie picture at the subtitle position, and denoting the pixel values of the cropped picture by $\mathrm{Img}_{i,j}$;
S1302, for each column of pixels in the cropped picture, finding the most frequent pixel value, i.e. the mode of the pixel values in that column, denoted $C_j$;
S1303, calculating the mode of $\{C_1, \dots, C_j, \dots, C_w\}$, denoted $M$, where $w$ is the number of pixel columns in the picture;
S1304, repeating the above operations for all cropped picture frames, and taking the mode of the resulting values $\{M\}$ as the subtitle color feature;
S140, extracting the subtitle text after preprocessing according to the subtitle color, which specifically comprises the following steps:
S1401, extracting all picture frames of the movie and cropping each one at the subtitle position;
S1402, binarizing each cropped picture: pixels whose value equals the subtitle color feature are set to 255, and pixels whose value does not equal it are set to 0;
S1403, sending each binarized picture to OCR for subtitle text extraction.
2. The method for extracting movie subtitles in a complex background changing scene based on OCR and color preprocessing according to claim 1, characterized in that:
step S120 of acquiring subtitle position information by using a clustering algorithm specifically includes:
S1201, performing K-means clustering on the positions $X, Y$ of the $S_n$ text frames, finding the class with the most members, and taking the center point coordinates of that class as the center point of the subtitle position;
S1202, computing the mode of the text frame heights $H$ within that class and taking it as the height of the subtitle position;
S1203, taking the width of the movie as the width of the subtitle position.
3. The method for extracting movie subtitles in a complex background changing scene based on OCR and color preprocessing according to claim 1, wherein in step S130 the subtitle color feature is stored in RGB three-channel format.
4. A device for extracting movie subtitles in a complex background changing scene based on OCR and color preprocessing, comprising a memory and one or more processors, wherein the memory stores executable code, and the one or more processors, when executing the executable code, implement the method for extracting movie subtitles in a complex background changing scene based on OCR and color preprocessing according to any one of claims 1 to 3.
5. A computer-readable storage medium, having stored thereon a program which, when executed by a processor, implements the method for extracting movie subtitles in a complex background changing scene based on OCR and color preprocessing as set forth in any one of claims 1 to 3.
CN202411815294.3A 2024-12-11 2024-12-11 Movie subtitle extraction method and device under background complex change scene based on OCR and color preprocessing Active CN119763090B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411815294.3A CN119763090B (en) 2024-12-11 2024-12-11 Movie subtitle extraction method and device under background complex change scene based on OCR and color preprocessing

Publications (2)

Publication Number Publication Date
CN119763090A (en) 2025-04-04
CN119763090B (en) 2025-09-26

Family

ID=95189817

Country Status (1)

Country Link
CN (1) CN119763090B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant