CN105631051A - Character recognition based mobile augmented reality reading method and reading system thereof

Info

Publication number: CN105631051A
Application number: CN201610111436.7A
Authority: CN
Inventors: 吕建明; 石嘉琪; 代涵宣; 刘宇阳; 徐辰沁; 马芮; 黄洁晶
Original assignee: South China University of Technology SCUT
Current assignee: South China University of Technology SCUT
Priority date: 2016-02-29
Filing date: 2016-02-29
Publication date: 2016-06-01

Abstract

The invention discloses a mobile augmented reality reading method based on text recognition, comprising the following steps: 1. The mobile device acquires a captured original image containing text; 2. The mobile device preprocesses the original image and uploads it to the server; 3. The server obtains the set of characters in the image and the position of each character; 4. The server obtains the set of keywords and the position of each keyword; 5. For each keyword, the server searches the knowledge base for related multimedia resources and returns the results to the mobile device; 6. For each group of received results, the mobile terminal accurately superimposes the multimedia resources on the original image. The invention also discloses a reading system implementing the method, comprising a mobile phone terminal and a server terminal that communicate through the Internet. Its advantages include a lower storage cost on the server.

Description

Mobile augmented reality reading method based on text recognition and reading system thereof

Technical field

The present invention relates to augmented reality system technology, and in particular to a mobile augmented reality reading method based on text recognition and a reading system thereof.

Background technology

In the conventional way of reading books and newspapers, the information people obtain comes only from the material being read, so the amount of information is small and limited. To learn more about content of interest, a reader usually has to enter keywords into a search engine on a PC or mobile terminal. This is cumbersome, and the interactivity between the reader and the reading material is poor.

In view of the problems of this reading mode, the present invention proposes an augmented reality (AR) technique that combines text recognition with knowledge base matching. Related text, images and videos are accurately superimposed on the text the reader is reading, which helps the reader obtain more information conveniently while reading and diversifies the types of information available.

Augmented reality is a technology that seamlessly integrates real-world information with virtual-world information. Traditional augmented reality on mobile devices captures an image with the phone camera and compares this real-world image with images stored in advance in a back-end database. If a matching image is found, virtual information related to it (text, video or images) is superimposed in the preview window of the phone camera, so that the user sees the real-world image seamlessly overlaid with virtual information, obtains more information, gains a sensory experience beyond reality, and understands reality better. Augmented reality has been applied successfully in electronic magazines, poster advertising, virtual furniture display and similar fields.

However, existing augmented reality generally requires the back-end database to store and process, in advance, the images used to produce the augmented reality effect; the corresponding augmented content is displayed only when the captured image contains one of these preset images. This approach, which relies on image recognition and matching, has two drawbacks. On the one hand, a large amount of image data for matching must be entered and stored on the server in advance, so the storage cost is high and the preparatory image entry work is tedious. On the other hand, because only preset images can be recognized and augmented, a mobile augmented reality terminal works only in a very limited set of specific image scenes, which greatly restricts the wider application of augmented reality.

Summary of the invention

The primary object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a mobile augmented reality reading method based on text recognition.

Another object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a mobile augmented reality reading system based on text recognition, which combines text recognition with knowledge base matching.

The primary object of the present invention is achieved through the following technical solution: a reading method applied to the mobile augmented reality reading system based on text recognition, comprising the following steps:

S1. The mobile device obtains a captured image P containing text.

S2. The mobile device preprocesses the image P obtained in step S1 to obtain image P', and uploads P' to the server.

S3. The server performs text detection and recognition on the received image P', obtaining the character set {W_i} and the position information {Loc_i} of each character in image P', where W_i denotes the i-th detected character and Loc_i denotes its position in image P'.

S4. Using a predefined keyword dictionary, the server performs keyword matching on the text of image P' obtained in step S3, obtaining the keyword set {T_j} and the position {Pos_j} of each keyword in image P', where T_j denotes the j-th detected keyword and Pos_j denotes the position of T_j in image P'.

S5. For each keyword T_j obtained in step S4, the server retrieves from the knowledge base the set S_j of multimedia resources related to T_j, and returns the retrieval result set {(T_j, S_j, Pos_j)} to the mobile device, where Pos_j is the position of keyword T_j in image P'.

S6. For each received result (T_j, S_j, Pos_j), the mobile terminal accurately superimposes the multimedia resource S_j at position Pos_j of the image P obtained in step S1.
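The triples (T_j, S_j, Pos_j) passed from the server to the mobile device in steps S5 and S6 can be pictured as a small data contract. The following is only a sketch: the patent does not specify a wire format, so JSON transport, the `Result` class and the field names `keyword`, `resources` and `pos` are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Tuple


@dataclass
class Result:
    """One triple (T_j, S_j, Pos_j) returned by the server in step S5."""
    keyword: str            # T_j
    resources: List[str]    # S_j, here hypothetical resource identifiers
    pos: Tuple[int, int]    # Pos_j as (x, y) pixel coordinates in P'


def encode_results(results: List[Result]) -> str:
    """Serialize the result set for transmission to the mobile device."""
    return json.dumps([asdict(r) for r in results], ensure_ascii=False)


def decode_results(payload: str) -> List[Result]:
    """Rebuild the triples on the mobile device (JSON arrays become tuples)."""
    return [Result(d["keyword"], d["resources"], tuple(d["pos"]))
            for d in json.loads(payload)]
```

Keeping Pos_j in the payload lets the mobile device place each overlay without repeating any recognition work locally.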

Step S1 is specifically: the camera of the mobile device photographs the reading material containing text to obtain image P.

Step S2 is specifically: the mobile device adjusts the resolution of image P and performs image enhancement and binarization to obtain image P', which is then uploaded to the server.
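A minimal sketch of such a preprocessing chain on a grayscale image is given below. The patent names the three operations but not the algorithms, so the choices here (pixel-skipping downsampling, linear contrast stretching as the "enhancement", and Otsu's method for the binarization threshold) are assumptions, as are all function names.

```python
from typing import List

Gray = List[List[int]]  # grayscale image, values 0..255


def downsample(img: Gray, step: int) -> Gray:
    """Reduce resolution by keeping every `step`-th pixel in each direction."""
    return [row[::step] for row in img[::step]]


def stretch_contrast(img: Gray) -> Gray:
    """Linearly rescale intensities so they span the full 0..255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [row[:] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def otsu_threshold(img: Gray) -> int:
    """Pick the threshold that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def preprocess(img: Gray, step: int = 2) -> Gray:
    """Produce P' from P: lower resolution, enhance, then binarize."""
    small = stretch_contrast(downsample(img, step))
    t = otsu_threshold(small)
    return [[255 if p > t else 0 for p in row] for row in small]
```

Doing this work on the device shrinks the upload and hands the server a cleaner image for recognition.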

Step S3 is specifically: after obtaining image P', the server detects the text regions in P', thereby obtaining the position of each character in the image, and then calls a recognition engine based on optical character recognition (OCR) technology to recognize the text in those regions.
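The detect-then-recognize structure of step S3 can be sketched as follows. This is a deliberately simplified illustration: it locates horizontal bands of text by scanning for rows that contain ink, whereas the patent locates individual characters, and the `ocr` argument is a hypothetical callable standing in for whatever OCR engine the server uses; no real library API is depicted.

```python
from typing import Callable, List, Tuple

Binary = List[List[int]]  # 0 = ink (text), 255 = background


def find_text_rows(img: Binary) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, that contain ink."""
    bands, start = [], None
    for y, row in enumerate(img):
        has_ink = any(p == 0 for p in row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(img)))
    return bands


def recognize(img: Binary,
              ocr: Callable[[Binary], str]) -> List[Tuple[str, Tuple[int, int]]]:
    """Run the supplied OCR callable on each band; pair text with its band."""
    return [(ocr(img[top:bottom]), (top, bottom))
            for top, bottom in find_text_rows(img)]
```

Recording the band alongside the recognized text is what later allows keyword positions to be reported back to the mobile device.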

Step S4 is specifically: the keyword dictionary is generated as follows. For a large number of resources collected in advance (articles, pictures, videos, etc.), a Chinese lexical analyzer with functions such as Chinese word segmentation, part-of-speech tagging, named entity recognition and new word recognition is applied to the titles or names of the resources to extract the key nouns, which are added to the keyword dictionary as keywords. The keywords in the dictionary are sorted by popularity. During keyword matching, the character sequence of image P' obtained in step S3 is first segmented into words, and each resulting word is looked up in the keyword dictionary; the words that appear in the dictionary are retained as keywords and form the keyword set {T_j}. The position Pos_j of each keyword T_j is defined as the position of the first character of that keyword in image P'.
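The matching stage can be illustrated with a short sketch. It is a simplification under stated assumptions: instead of running a full Chinese word segmenter and then looking up each word, it uses dictionary-driven forward maximum matching, which segments and looks up in one pass. The function name and the representation of Loc_i as (x, y) tuples are hypothetical.

```python
from typing import List, Set, Tuple

Pos = Tuple[int, int]


def match_keywords(chars: List[str], locs: List[Pos],
                   dictionary: Set[str]) -> List[Tuple[str, Pos]]:
    """Scan the recognized characters and return (T_j, Pos_j) pairs.

    At each index the longest dictionary entry starting there is taken
    (forward maximum matching); Pos_j is the location of its first character.
    """
    max_len = max((len(k) for k in dictionary), default=0)
    found, i = [], 0
    while i < len(chars):
        for n in range(min(max_len, len(chars) - i), 0, -1):
            candidate = "".join(chars[i:i + n])
            if candidate in dictionary:
                found.append((candidate, locs[i]))
                i += n  # skip past the matched keyword
                break
        else:
            i += 1  # no keyword starts here; move on one character
    return found
```

Preferring the longest match means that "广州塔" is reported as one keyword rather than as the shorter entry "广州".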

Step S5 is specifically: for each keyword T_j obtained in step S4, the server retrieves from the knowledge base the set S_j of multimedia resources related to T_j. The multimedia resources recorded in the knowledge base may be text, pictures, videos or three-dimensional models, and may come from web pages collected from the World Wide Web or from the internal resources of a particular organization. The knowledge base indexes the description information of the resources with an inverted index and supports keyword-based full-text search.
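An inverted index of the kind described maps each term to the set of resources whose description contains it. The toy sketch below assumes whitespace-separated terms and an in-memory store; a real deployment for Chinese text would tokenize with a word segmenter and would likely use a search engine library. The class and field names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class KnowledgeBase:
    """Toy knowledge base indexing resource descriptions by term."""

    def __init__(self) -> None:
        self.resources: Dict[int, dict] = {}
        self.index: Dict[str, Set[int]] = defaultdict(set)

    def add(self, rid: int, kind: str, description: str) -> None:
        """Store a resource and index each whitespace-separated term."""
        self.resources[rid] = {"id": rid, "kind": kind,
                               "description": description}
        for term in description.lower().split():
            self.index[term].add(rid)

    def search(self, keyword: str) -> List[dict]:
        """Return resources whose description contains every keyword term."""
        terms = keyword.lower().split()
        if not terms:
            return []
        ids = set.intersection(*(self.index.get(t, set()) for t in terms))
        return [self.resources[i] for i in sorted(ids)]
```

Lookup cost depends on the number of matching resources rather than on the size of the whole knowledge base, which is why the index is inverted.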

Step S6 is specifically: resources are superimposed as follows. For each received result (T_j, S_j, Pos_j), the mobile terminal highlights the region near position Pos_j in image P to indicate to the reader that it can be clicked. When the reader clicks the region, the resource information S_j associated with keyword T_j is displayed near it.
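Two details of this step lend themselves to a sketch: Pos_j is measured in P', whose resolution may differ from that of P, and a tap must be resolved to the highlighted region it falls in. The code below assumes each region is an axis-aligned box (x, y, width, height); the patent only defines Pos_j as the position of the keyword's first character, so the box representation and the function names are illustrative assumptions.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height


def to_original(box: Box, scale_x: float, scale_y: float) -> Box:
    """Map a box from P' coordinates to P coordinates.

    scale_x and scale_y are width(P) / width(P') and height(P) / height(P').
    """
    x, y, w, h = box
    return (round(x * scale_x), round(y * scale_y),
            round(w * scale_x), round(h * scale_y))


def hit_test(tap: Tuple[int, int],
             regions: List[Tuple[str, Box]]) -> Optional[str]:
    """Return the keyword whose highlighted region contains the tap, if any."""
    tx, ty = tap
    for keyword, (x, y, w, h) in regions:
        if x <= tx < x + w and y <= ty < y + h:
            return keyword
    return None
```

When `hit_test` returns a keyword, the client would look up the matching S_j and display it next to the region.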

Another object of the present invention is achieved through the following technical solution: a mobile augmented reality reading system based on text recognition, comprising a mobile phone terminal and a server. The mobile phone terminal and the server communicate over the Internet. The mobile phone terminal comprises a shooting module, an image preprocessing module and a resource superimposition module; the server comprises a text recognition module, a keyword matching module and a knowledge base retrieval module. The shooting module captures an image containing text with the phone camera; the image preprocessing module preprocesses the captured image; the text recognition module performs text detection and recognition on the received image; the keyword matching module matches keywords in the text of the image against the keyword dictionary; the knowledge base retrieval module retrieves from the knowledge base the set of multimedia resources related to each keyword; and the resource superimposition module accurately superimposes the multimedia resources on the image captured by the mobile phone terminal.

Working principle of the present invention: the invention uses text recognition to recognize the text in the reading material captured by the mobile terminal, retrieves information from the knowledge base according to the recognized text, and accurately superimposes the related text, image or video resources on the picture captured by the mobile terminal, so that the user obtains more related information on top of the reading material. The proposed mobile augmented reality system based on text recognition overcomes the limitations described above: when the mobile terminal photographs any material containing text, such as a magazine or newspaper, text recognition is performed first, the text is then compared with the back-end knowledge base, and the related text, picture or video information is accurately superimposed on the preview screen of the mobile terminal. This text-recognition-based approach has the following advantages. On the one hand, the corresponding images need not be stored on the server in advance, so the server's storage cost is lower and no preparatory image entry work is needed. On the other hand, the mobile terminal can recognize any material containing text and produce the augmented reality effect, which greatly expands the scope of application of the system.

Compared with the prior art, the present invention has the following advantages and effects:

1. The invention compensates for the small amount of information and poor interactivity of conventional book and newspaper reading. The information the reader obtains is no longer limited to the reading material: related resources are merged with the content of the physical reading material in a natural and efficient way, providing the reader with richer reading material.

2. The invention can recognize any reading material containing text and produce the augmented reality effect without entering or processing images of the reading material in advance, which greatly expands the scope of application of the system. The reading material only needs to contain specific keywords for clickable, interactive multimedia content to be superimposed near those keywords.

3. The text-recognition-based augmented reality proposed by the invention does not require images of the reading material to be stored on the server in advance, so the server's storage cost is lower.

Brief description of the drawings

Fig. 1 is a flowchart of the method of the invention.

Fig. 2 is a schematic diagram of a reading material.

Fig. 3 is a schematic diagram of the process of obtaining an image containing text.

Fig. 4 is a schematic diagram of keyword recognition and highlighting on the captured picture.

Fig. 5 is another schematic diagram of the process of obtaining an image containing text.

Fig. 6 is another schematic diagram of keyword recognition and highlighting on the captured picture.

Fig. 7 shows the resource list.

Fig. 8 shows a resource.

Fig. 9 is a block diagram of the reading system of the invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to an embodiment and the accompanying drawings, but embodiments of the invention are not limited thereto.

Embodiment

As shown in Fig. 1, a mobile augmented reality reading method based on text recognition mainly comprises the following six steps:

S1. The mobile device obtains a captured image P containing text.

S2. The mobile device preprocesses the image P obtained in step S1 to obtain image P', and uploads P' to the server.

S3. The server performs text detection and recognition on the received image P', obtaining the character set {W_i} and the position information {Loc_i} of each character in image P', where W_i denotes the i-th detected character and Loc_i denotes its position in image P'.

S4. Using a predefined keyword dictionary, the server performs keyword matching on the text of image P' obtained in step S3, obtaining the keyword set {T_j} and the position {Pos_j} of each keyword in image P', where T_j denotes the j-th detected keyword and Pos_j denotes the position of T_j in image P'.

S5. For each keyword T_j obtained in step S4, the server retrieves from the knowledge base the set S_j of multimedia resources related to T_j, and returns the retrieval result set {(T_j, S_j, Pos_j)} to the mobile device, where Pos_j is the position of keyword T_j in image P'.

S6. For each received result (T_j, S_j, Pos_j), the mobile terminal accurately superimposes the multimedia resource S_j at position Pos_j of the image P obtained in step S1.

To illustrate the mobile augmented reality reading method based on text recognition and its reading system more concretely, a detailed description is given below with reference to the drawings and the embodiment:

As shown in Fig. 2, which is a schematic diagram of a reading material, the reading material 201 may be any article containing text, such as a magazine, a poster or a textbook. Reading material 201 is open to pages 202 and 203. Page 202 contains a picture 204 of a building and descriptive text 205 about picture 204; page 203 contains a passage of text 206. In the conventional reading mode, the information obtained from reading material 201 is only the content shown on pages 202 and 203, and the information is limited to text and pictures. To learn more about something mentioned on a page, the user must enter the corresponding keyword in a browser to find resources, which is cumbersome. The present invention helps the user obtain the desired information more conveniently while reading.

As shown in Fig. 3, which is a schematic diagram of the process of using a mobile device 301 to obtain an image containing text, the mobile device 301 may be any mobile device with networking capability and a camera, such as a mobile phone or a tablet computer. Using the preview function of the built-in camera of mobile device 301, the image 303 displayed on screen 302 of mobile device 301 shows the content of page 202 of reading material 201; image 303 contains both text and a picture. After the user confirms the capture of image 303, the image is preprocessed and uploaded to the server. The server detects the text regions in the image, recognizes the text in those regions, performs keyword matching and retrieves the related multimedia resources. The result is shown in Fig. 4: image 401, the processed version of image 303, is displayed on mobile device 301, and the keyword "Guangzhou Tower" is enclosed by box 402 and highlighted to prompt the user that clicking it reveals more related multimedia resources.

As shown in Fig. 5, which is a schematic diagram of another use of mobile device 301 to obtain an image containing text, the camera preview image 304 of mobile device 301 contains the text passage on page 203 of reading material 201. After the user confirms the capture of image 304, the result of processing is shown in Fig. 6: the processed image 403 is displayed on mobile device 301, and the keyword "Guangzhou Tower" is enclosed by box 404 and highlighted to prompt the user that clicking it reveals more related multimedia resources.

When the user clicks region 402 or region 404 on the screen of mobile device 301, the screen shows the display in Fig. 7: a list 501 of multimedia resources related to Guangzhou Tower, covering several resource types, including video resources, article resources and 3D models. Clicking an item in resource list 501 shows the detailed resource information. For example, clicking region 502 brings up the Guangzhou Tower 3D model shown in Fig. 8. In Fig. 8, the size of the Guangzhou Tower 3D model 601 displayed on mobile device 301 changes with the distance between mobile device 301 and reading material 201; when the user changes the angle between mobile device 301 and the reading material, screen 302 of mobile device 301 shows the Guangzhou Tower 3D model 601 from a different angle. The displayed augmented reality resources thus respond to the user's actions, which helps the user understand the information from all sides and makes the experience highly interactive.
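The dependence of the model's displayed size on distance can be sketched with a simple rule. The patent does not give a formula; the version below assumes a pinhole-camera relationship in which apparent size is inversely proportional to the distance between device and reading material, and the function and parameter names are hypothetical.

```python
def model_scale(reference_distance: float, current_distance: float,
                reference_scale: float = 1.0) -> float:
    """Scale factor for the rendered model under a pinhole-camera assumption.

    The model is drawn at `reference_scale` when the device is at
    `reference_distance`; moving closer enlarges it, moving away shrinks it.
    """
    if current_distance <= 0:
        raise ValueError("distance must be positive")
    return reference_scale * reference_distance / current_distance
```

Under this assumption, halving the distance doubles the rendered size, which matches the intuition of moving the phone closer to the page.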

As shown in Fig. 9, a reading system implementing the mobile augmented reality reading method based on text recognition comprises a mobile phone terminal and a server. The mobile phone terminal and the server communicate over the Internet. The mobile phone terminal comprises a shooting module, an image preprocessing module and a resource superimposition module; the server comprises a text recognition module, a keyword matching module and a knowledge base retrieval module. The shooting module captures an image containing text with the phone camera; the image preprocessing module preprocesses the captured image; the text recognition module performs text detection and recognition on the received image; the keyword matching module matches keywords in the text of the image against the keyword dictionary; the knowledge base retrieval module retrieves from the knowledge base the set of multimedia resources related to each keyword; and the resource superimposition module accurately superimposes the multimedia resources on the image captured by the mobile phone terminal.

The above embodiment is a preferred embodiment of the present invention, but embodiments of the invention are not limited to it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the invention shall be regarded as an equivalent replacement and falls within the protection scope of the invention.

Claims

1. A mobile augmented reality reading method based on text recognition, characterized in that it comprises the following steps:

Step S1: the mobile device acquires a captured image P containing text;

Step S2: the mobile device preprocesses the image P obtained in step S1 to obtain an image P', and then uploads it to the server;

Step S3: the server performs text detection and recognition on the received image P', and obtains the character set {W_i} and the position information {Loc_i} of each character in the image P', where W_i represents the i-th detected character and Loc_i represents the position where that character appears in the image P';

Step S4: according to a predefined keyword dictionary, the server performs keyword matching on the text in the image P' obtained in step S3, and obtains the keyword set {T_j} and the position {Pos_j} of each keyword in the image P', where T_j represents the j-th detected keyword and Pos_j represents the position where the keyword T_j appears in the image P';

Step S5: for each keyword T_j obtained in step S4, the server retrieves from the knowledge base the multimedia resource set S_j related to T_j, and returns the retrieval result set {(T_j, S_j, Pos_j)} to the mobile device, where Pos_j is the position where the keyword T_j appears in the image P';

Step S6: for each group of received results (T_j, S_j, Pos_j), the mobile terminal accurately superimposes the multimedia resource S_j at the position Pos_j of the image P obtained in step S1.

2. The mobile augmented reality reading system combined with text recognition and knowledge base matching as claimed in claim 1, characterized in that, in the aforementioned step S1, the camera of the mobile device is used to photograph the reading material containing text to obtain the image P.

3. The mobile augmented reality reading system combined with text recognition and knowledge base matching as claimed in claim 1, characterized in that, in the aforementioned step S2, the mobile device adjusts the resolution of the image P and performs image enhancement and binarization to obtain the image P', and then uploads it to the server.

4. The mobile augmented reality reading system combined with text recognition and knowledge base matching as claimed in claim 1, characterized in that, in the aforementioned step S3, after the server obtains the image P', it detects the text regions in P', thereby obtaining the position of each character in the image, and calls a recognition engine based on optical character recognition technology to recognize the text in the text regions.

5. The mobile augmented reality reading system combined with text recognition and knowledge base matching as claimed in claim 1, characterized in that, in the aforementioned step S5, the multimedia resource information recorded in the knowledge base can be text, pictures, videos or three-dimensional models, and the source of the information can be web pages collected from the World Wide Web or the internal resources of a particular organization.

6. The mobile augmented reality reading system combined with text recognition and knowledge base matching as claimed in claim 1, characterized in that, in the aforementioned step S6, the specific method of resource superimposition is: for each group of received results (T_j, S_j, Pos_j), the mobile terminal highlights the region Pos_j where the keyword appears in the image P to indicate to the reader that this region can be clicked; when the reader clicks the keyword, the resource information S_j is displayed near the keyword region.

7. A reading system that implements the mobile augmented reality reading method based on text recognition as claimed in claim 1, characterized in that it comprises a mobile phone terminal and a server; the mobile phone terminal and the server communicate through the Internet; the mobile phone terminal includes a shooting module, an image preprocessing module and a resource superimposition module; the server includes a text recognition module, a keyword matching module and a knowledge base retrieval module; the shooting module captures an image containing text through the mobile phone camera; the image preprocessing module preprocesses the captured image; the text recognition module performs text detection and recognition on the received image; the keyword matching module performs keyword matching on the text in the image using the keyword dictionary; the knowledge base retrieval module searches the knowledge base for the set of multimedia resources related to each keyword; and the resource superimposition module accurately superimposes the multimedia resources on the image captured by the mobile phone terminal.