
CN110717397A - An online translation system based on mobile phone camera - Google Patents

An online translation system based on mobile phone camera

Info

Publication number
CN110717397A
CN110717397A
Authority
CN
China
Prior art keywords
image
text
recognition
grayscale
translation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910856607.2A
Other languages
Chinese (zh)
Inventor
仲国强
林鑫
岳国华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ocean University of China
Original Assignee
Ocean University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China filed Critical Ocean University of China
Priority to CN201910856607.2A priority Critical patent/CN110717397A/en
Publication of CN110717397A publication Critical patent/CN110717397A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63 Scene text, e.g. street names
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)
  • Character Input (AREA)

Abstract

The invention discloses an online translation system based on a mobile phone camera. The camera autofocuses and captures an image; the image is preprocessed; after the corresponding text information is extracted, the Tesseract text recognition engine provided by Google recognizes the text; once word capture and recognition are complete, the text to be translated is obtained and passed to the translation module, which returns the translation result. The beneficial effect of the invention is that recognition and translation succeed and produce accurate results.

Description

An online translation system based on a mobile phone camera

Technical field

The invention belongs to the technical field of online translation and relates to an online translation system based on a mobile phone camera.

Background art

With economic development, more and more people choose to travel abroad, but language barriers continue to trouble them. Software that can recognize and translate text in travel scenes would therefore meet the needs of domestic and foreign tourists. The travel translation assistant provides real-time OCR recognition and translation, helping tourists overcome language barriers while traveling. It recognizes and translates English and Chinese in different scenes on the Android platform. The application is developed in Java, is installed on an Android phone, and can be tested in a variety of natural scenes, including book and document recognition and translation, menu recognition and translation, and street-sign recognition and translation. Common translation software requires the text to be translated to be typed in manually; for the user's convenience, this travel translation assistant instead captures words with the camera, recognizes them, and translates them directly, eliminating the trouble of manual input.

Summary of the invention

The purpose of the present invention is to provide an online translation system based on a mobile phone camera. The beneficial effect of the invention is that recognition and translation succeed and produce accurate results.

The technical scheme adopted by the present invention proceeds according to the following steps:

Step 1: Autofocus and capture an image. The camera autofocuses on the content to be recognized every two seconds and captures a picture, dynamically locating and acquiring the image through autofocus.

Step 2: Image preprocessing. The preprocessing pipeline is: scan the image --> obtain the image --> parse the image --> obtain the image parameters --> denoise the image --> compute the image threshold --> binarize the image grayscale and save the grayscale values --> obtain a grayscale image.

Step 3: Recognition. After the corresponding text information is extracted, the Tesseract text recognition engine provided by Google recognizes the text.

Step 4: Translation. Once word capture and recognition are complete, the text to be translated is obtained; passing it to the translation module returns the translation result.

Further, in step 1, a Timer is first set up in the system; it executes a TimerTask at regular intervals, and the focusing method runs inside the TimerTask. The Camera class provides an autofocus method that receives an AutoFocusCallback. After this method executes, the camera focuses automatically; when focusing completes, the callback is triggered and the Timer is restarted, achieving autofocus at continuous intervals. Finally, a camera autofocus-success method is designed: once focusing is judged successful, it sends a message that indirectly triggers taking a photo to capture the image.

Further, step 2 includes:

(1) Image acquisition: the camera is invoked to scan a fixed rectangular region and obtain an image; the image is then parsed to obtain the parameters of the image to be processed;

(2) Image denoising: the denoising algorithm adopted by the system takes the median of all pixel values in a fixed-size region. A 3-by-3 window is used; pixel[0~8] is a one-dimensional sequence of pixel values whose center point is pixel[4]. The nine pixel values are sorted, the median is taken and stored back into pixel[4], the center point, yielding the denoised pixel value;

(3) Threshold computation: the system computes the image threshold with the one-dimensional maximum entropy method, Otsu's method, and the iterative method, and takes the best threshold after comparing the results;

(4) Grayscale binarization: the system binarizes the image grayscale using the previously computed optimal threshold;

(5) Feature extraction: the preprocessed image is segmented, image features are extracted region by region, and the text information in the image is extracted. The text extraction method uses the Android mechanism for passing data between Activities: the image is first processed in the Activity that takes the photo, where word capture is completed; the processing result is then returned to the translation module's Activity, where the text is recognized and translated.

Further, step 3 includes:

(1) The Tesseract text recognition engine is divided into two parts: page layout analysis, and character segmentation and recognition;

① Page layout analysis mainly analyzes connected regions and is the preparation for character recognition; a hybrid page layout analysis method based on tab-stop detection distinguishes the tables, text, pictures, and other content of the image;

② Character segmentation and recognition first finds block regions, locates text lines and words, and recognizes the text. A first pass of character recognition classifies character regions by type and recognizes characters against the character library according to the classification. Then, based on the recognized text characters, joined characters are split and wrongly split characters are merged, completing the fine segmentation of characters;

(2) Tesseract text recognition engine training:

The jTessBoxEditor tool is used to train the language data. The task to be completed in this system is the recognition and translation of Chinese and English, so during training the following data files in tessdata are trained to obtain the language recognition libraries used by the system:

chi_sim.traineddata,

chi_tra.traineddata, eng.traineddata, chi_tra_vert.traineddata, chi_sim_vert.traineddata. When this system is used for the first time, the recognition libraries must first be loaded into the root directory of the phone's storage before the recognition function can run correctly; recognition then calls the trained language recognition libraries.

Further, step (4) includes:

① A floating-point algorithm converts the scanned region image to grayscale; the image from the phone camera is usually a 24-bit RGB image, and grayscale conversion yields an 8-bit grayscale image;

② Ordinary binarization yields a black-and-white image; when binarizing, this system instead preserves the image's grayscale information, binarizing it according to the pixel values of each image so that the picture is reduced to an image with only two grayscale values. The final picture is a grayscale picture;

③ After grayscale binarization, the system first checks whether there are more dark-valued pixels than light-valued pixels; if so, the image colors are inverted.

Further, the translation workflow in step 4 is: start --> obtain the recognized text --> convert the text encoding --> MD5-encrypt and generate a signature --> transmit the information --> obtain the translation result --> display the translation result in the program interface --> end.

Detailed description

The present invention is described in detail below with reference to a specific embodiment.

Example:

Step 1: Autofocus and capture an image.

Autofocus: the camera autofocuses on the content to be recognized every two seconds and captures a picture, dynamically locating and acquiring the image through autofocus. First, a Timer is set up in the system; it executes a TimerTask at regular intervals, and the focusing method runs inside the TimerTask. The Camera class provides an autofocus method that receives an AutoFocusCallback. After this method executes, the camera focuses automatically; when focusing completes (whether it succeeds or fails), the callback is triggered and the Timer is restarted, achieving autofocus at continuous intervals. Finally, a camera autofocus-success method (raiseEvent_OnAutoFocusSuccess) is designed: once focusing is judged successful, it sends a message that indirectly triggers taking a photo to capture the image.
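The timer-driven focus loop described above can be sketched in plain Java. This is a minimal sketch, not the patent's code: the Android Camera.autoFocus call and its AutoFocusCallback are replaced by a hypothetical focus() stub so the scheduling structure (TimerTask fires, focus runs, callback re-arms the Timer) is visible on its own.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the continuous-autofocus loop: a Timer fires a TimerTask,
// the task triggers focusing, and the focus "callback" re-arms the
// Timer for the next interval. The camera itself is stubbed out.
public class AutoFocusLoop {
    private final Timer timer = new Timer(true); // daemon timer thread
    private final long intervalMs;
    public final AtomicInteger focusCount = new AtomicInteger();

    public AutoFocusLoop(long intervalMs) { this.intervalMs = intervalMs; }

    public void start() {
        try {
            timer.schedule(new TimerTask() {
                @Override public void run() { focus(); }
            }, intervalMs);
        } catch (IllegalStateException ignored) {
            // the timer was already cancelled by stop(); nothing to re-arm
        }
    }

    // Stand-in for Camera.autoFocus(AutoFocusCallback): here we "focus"
    // immediately and invoke the completion logic by hand.
    private void focus() {
        focusCount.incrementAndGet(); // a real app would take a picture here
        start();                      // re-arm the Timer, as the callback does
    }

    public void stop() { timer.cancel(); }
}
```

A real implementation would replace focus() with Camera.autoFocus and move the re-arming into the AutoFocusCallback, as the description states.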

Step 2: Image preprocessing.

The preprocessing pipeline is: scan the image --> obtain the image --> parse the image --> obtain the image parameters --> denoise the image --> compute the image threshold --> binarize the image grayscale and save the grayscale values --> obtain a grayscale image.

(1) Image acquisition: the camera is invoked to scan a fixed rectangular region and obtain an image. The image is then parsed to obtain the parameters of the image to be processed, such as color values, bitmap row width, and pixel count. This is the first step of image processing.

(2) Image denoising. The denoising algorithm adopted by this system takes the median of all pixel values in a fixed-size region. A 3-by-3 window is used; pixel[0~8] is a one-dimensional sequence of pixel values whose center point is pixel[4]. The nine pixel values are sorted, the median is taken and stored back into pixel[4], the center point, yielding the denoised pixel value.
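The 3-by-3 median filter just described can be written directly in Java. This is a minimal sketch over a 2D grayscale array (the class and method names are illustrative, and borders are simply copied through):

```java
import java.util.Arrays;

// 3x3 median filter: for each interior pixel, collect the 9 values of
// its neighborhood into pixel[0..8] (center at index 4), sort them,
// and write the median back to the center position.
public class MedianFilter {
    public static int[][] denoise(int[][] img) {
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][];
        for (int y = 0; y < h; y++) out[y] = img[y].clone(); // borders kept as-is
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                int[] pixel = new int[9];
                int k = 0;
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                        pixel[k++] = img[y + dy][x + dx];
                Arrays.sort(pixel);
                out[y][x] = pixel[4]; // the median replaces the center point
            }
        }
        return out;
    }
}
```

On an image with a single bright noise pixel surrounded by uniform values, the median of the window restores the center to the surrounding value, which is why this filter removes impulse noise well.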

(3) Threshold computation. The system computes the image threshold with the one-dimensional maximum entropy method, Otsu's method, and the iterative method, and takes the best threshold after comparing the results.
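Of the three thresholding methods named, Otsu's method is the most compact to sketch. The following is an illustrative Java version over an 8-bit grayscale array (the maximum-entropy and iterative variants, and how the patent "comprehensively compares" the three, are not shown):

```java
// Otsu's method: pick the threshold that maximizes the between-class
// variance of the background/foreground split of the gray histogram.
public class OtsuThreshold {
    public static int threshold(int[] gray) {
        int[] hist = new int[256];
        for (int g : gray) hist[g]++;
        int total = gray.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += (long) i * hist[i];

        long sumBack = 0;   // sum of gray values in the background class
        int wBack = 0;      // number of background pixels
        double bestVar = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            wBack += hist[t];
            if (wBack == 0) continue;
            int wFore = total - wBack;
            if (wFore == 0) break;
            sumBack += (long) t * hist[t];
            double mBack = (double) sumBack / wBack;           // background mean
            double mFore = (double) (sumAll - sumBack) / wFore; // foreground mean
            double between = (double) wBack * wFore * (mBack - mFore) * (mBack - mFore);
            if (between > bestVar) { bestVar = between; best = t; }
        }
        return best;
    }
}
```

For a clearly bimodal histogram (for example dark text on a light page), the returned threshold falls between the two clusters, which is the property the binarization step below relies on.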

(4) Grayscale binarization. The system binarizes the image grayscale using the previously computed optimal threshold.

① A floating-point algorithm converts the scanned region image to grayscale. The image from the phone camera is usually a 24-bit RGB image; grayscale conversion yields an 8-bit grayscale image.

② Ordinary binarization yields a black-and-white image, with only the two colors black and white, so each pixel has a bit depth of 1 and occupies a single binary bit. This system, however, preserves the image's grayscale information when binarizing: the grayscale information is binarized according to the pixel values of each image, reducing the picture to an image with only two grayscale values. The final picture is a grayscale picture, but after grayscale binarization, recognition accuracy is higher.

③ After grayscale binarization, the system can first check whether there are more dark-valued pixels than light-valued pixels; if so, the image colors are inverted.
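Steps ① to ③ can be sketched together in Java: a floating-point grayscale conversion, followed by threshold binarization, followed by inversion when dark pixels outnumber light ones. Note the assumptions: the patent does not state its grayscale coefficients, so the standard 0.299/0.587/0.114 luminance weights are used, and the two output grayscale values 0 and 255 are illustrative.

```java
// Floating-point grayscale conversion plus grayscale binarization with
// dark/light inversion, as in steps (1)-(3) above.
public class GrayBinarize {
    // 24-bit RGB channel values -> one 8-bit gray value, using the
    // standard luminance weights (assumed, not given in the patent).
    public static int toGray(int r, int g, int b) {
        return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    // Map each gray pixel to one of two grayscale values at the
    // threshold, then invert if dark pixels outnumber light ones so
    // that text ends up dark on a light background.
    public static int[] binarize(int[] gray, int threshold) {
        int[] out = new int[gray.length];
        int dark = 0;
        for (int i = 0; i < gray.length; i++) {
            out[i] = gray[i] <= threshold ? 0 : 255;
            if (out[i] == 0) dark++;
        }
        if (dark > gray.length - dark) {
            for (int i = 0; i < out.length; i++) out[i] = 255 - out[i];
        }
        return out;
    }
}
```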

(5) Feature extraction. The preprocessed image is segmented, image features are extracted region by region, and the text information in the image is extracted. The text extraction method uses the Android mechanism for passing data between Activities: the image is first processed in the Activity that takes the photo, where word capture is completed; the processing result is then returned to the translation module's Activity, where the text is recognized and translated.

Step 3: Recognition.

After the corresponding text information is extracted, the system uses the Tesseract text recognition engine (Tesseract-ocr) provided by Google to recognize the text.

(1) Tesseract-ocr principle: the Tesseract engine can be divided into two parts: page layout analysis, and character segmentation and recognition.

① Page layout analysis mainly analyzes connected regions and is the preparation for character recognition. A hybrid page layout analysis method based on tab-stop detection distinguishes the tables, text, pictures, and other content of the image.

② Character segmentation and recognition first finds block regions, locates text lines and words, and recognizes the text. A first pass of character recognition classifies character regions by type and recognizes characters against the character library according to the classification. Then, based on the recognized text characters, joined characters are split and wrongly split characters are merged, completing the fine segmentation of characters.

(2) Tesseract-ocr training:

The jTessBoxEditor tool is used to train the language data. The task we need to complete in this system is the recognition and translation of Chinese and English, so during training the following data files in tessdata are trained to obtain the language recognition libraries used by the system:

chi_sim.traineddata,

chi_tra.traineddata, eng.traineddata, chi_tra_vert.traineddata, chi_sim_vert.traineddata. When this system is used for the first time, the recognition libraries must first be loaded into the root directory of the phone's storage before the recognition function can run correctly. By calling the trained language recognition libraries, text information can be recognized accurately and efficiently.

Step 4: Translation.

Once word capture and recognition are complete, the text to be translated is obtained; we pass it to the translation module, which returns the translation result. The translation workflow is: start --> obtain the recognized text --> convert the text encoding --> MD5-encrypt and generate a signature --> transmit the information --> obtain the translation result --> display the translation result in the program interface --> end.

In this system, to secure the calling interface, the interface uses IP restrictions and encrypts the transmitted data with MD5 encoding when interacting with the server. The information to be translated is first hashed with MD5 to generate a signature for verification, and the information to be translated is then sent to the server via an HTTP POST. After the server returns the translation result in JSON format, the returned information is parsed and the final translation result is presented in the translation-result field in a familiar form, completing the translation function. Message Digest Algorithm 5 (MD5) is a hash function widely used in the field of computer security to protect message integrity; it ensures that transmitted information is complete and consistent and is one of the most widely used hash algorithms.
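The MD5 signing step can be sketched with the JDK's MessageDigest. This shows only the hashing of a string to a hex digest; which fields the real signature concatenates (text, keys, salts) is not specified in the patent, so the single-string input here is an assumption:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// MD5 digest of a UTF-8 string as a lowercase hex signature, the kind
// of value sent alongside the text to be translated for verification.
public class Md5Sign {
    public static String sign(String message) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(message.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a mandatory JDK algorithm, so this should not happen
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Note that MD5 provides only an integrity check and request signature here, not confidentiality: the text itself still travels in the HTTP POST body.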

The system of the invention not only performs real-time word capture and recognition but also provides online translation, giving users a better experience and convenience in daily life. Its main technical fields are pattern recognition and natural language processing; it implements real-time text recognition and translation and is a good translation assistant for people when traveling. Each submodule was designed using image processing, feature extraction, text recognition, and translation techniques, realizing direct translation after camera word capture and recognition and eliminating the trouble of manually entering text to translate. The system has broad application prospects and important practical significance.

The above is merely a preferred embodiment of the present invention and does not limit the present invention in any form; any simple modification, equivalent change, or refinement made to the above embodiment according to the technical essence of the present invention falls within the scope of the technical solution of the present invention.

Claims (6)

1.一种基于手机相机的在线翻译系统,其特征在于按照以下步骤进行:1. an online translation system based on mobile phone camera is characterized in that carrying out according to the following steps: 步骤1:自动对焦获取图片;摄像机对待识别的内容每隔两秒钟自动对焦一次并获取图片,从而通过自动对焦来动态定位并获取图像;Step 1: Auto-focus to acquire pictures; the camera automatically focuses on the content to be recognized every two seconds and acquires pictures, so as to dynamically locate and acquire images through auto-focus; 步骤2:图像预处理;对图片的预处理的步骤是:扫描图像-->获得图像-->解析图像-->获取图片参数-->图像去噪-->计算图像阈值-->对图像灰度二值化并保存灰度值-->获得灰度图像;Step 2: Image preprocessing; the steps of image preprocessing are: scan image --> obtain image --> analyze image --> obtain image parameters --> image denoising --> calculate image threshold --> pair Image grayscale binarization and save grayscale value --> get grayscale image; 步骤3:识别功能,在提取出相应文本信息后,利用谷歌提供的正方体文字识别引擎识别文字;Step 3: Recognition function, after extracting the corresponding text information, use the cube text recognition engine provided by Google to recognize the text; 步骤4:翻译功能,取词识别完成后获取到了需要翻译的文字,将待翻译的文字向翻译模块传递即可返回翻译结果。Step 4: Translation function, after the word recognition is completed, the text to be translated is obtained, and the translation result can be returned by passing the text to be translated to the translation module. 2.按照权利要求1所述一种基于手机相机的在线翻译系统,其特征在于:2. according to a kind of online translation system based on mobile phone camera according to claim 1, it is characterized in that: 所述步骤1中,首先在系统中设定了一个定时器Timer,它会每隔一段时间执行一个TimerTask,并且在TimerTask里执行对焦方法,Camera类提供了自动对焦的方法,它接收AotoFocusCallback回调,这个方法执行后相机就会自动对焦,当它对焦完成后触发回调方法并再次启动Timer,以实现连续间隔的自动对焦,最后设计相机自动对焦方法,判断对焦成功后发出一个消息,间接控制执行拍照获取图片。In the step 1, a timer Timer is first set in the system, it will execute a TimerTask at regular intervals, and execute the focus method in the TimerTask, the Camera class provides an automatic focus method, which receives the AotoFocusCallback callback, After this method is executed, the camera will automatically focus. 
When the focus is completed, the callback method is triggered and the Timer is started again to achieve continuous automatic focus. Finally, the camera autofocus method is designed. After judging that the focus is successful, a message is sent to indirectly control the execution of the photo. Get pictures. 3.按照权利要求1所述一种基于手机相机的在线翻译系统,其特征在于:所述步骤2包括3. a kind of online translation system based on mobile phone camera according to claim 1, is characterized in that: described step 2 comprises: (1)图像获取:通过调用摄像头,扫描得到固定矩形区域图像,得到图像后,解析图像,获取待处理图像的参数;(1) Image acquisition: by calling the camera, scan to obtain a fixed rectangular area image, after obtaining the image, analyze the image, and obtain the parameters of the image to be processed; (2)图像去噪,系统所采用的去噪算法是获取设置的固定大小区域中所有像素值的中间值,采取3乘以3的表格区域,pixel[0~8]是一个一维的像素值序列,中心点为pixel[4],对9个像素点的值进行排序,然后取中间值,再存入pixel[4],即中心点,得到去噪后的像素值;(2) Image de-noising, the de-noising algorithm used by the system is to obtain the median value of all pixel values in the fixed size area, and take a table area of 3 times 3, pixel[0~8] is a one-dimensional pixel Value sequence, the center point is pixel[4], sort the values of 9 pixel points, then take the middle value, and then store it in pixel[4], that is, the center point, to obtain the pixel value after denoising; (3)计算阈值,系统中使用一维最大熵法、大律法和迭代法分别计算图像阈值,综合对比取最优阈值结果;(3) Calculate the threshold value. 
In the system, the one-dimensional maximum entropy method, the big law and the iterative method are used to calculate the image threshold value respectively, and the optimal threshold value result is obtained by comprehensive comparison; (4)灰度二值化处理,系统根据之前计算的最优阈值对图像进行灰度二值化处理;(4) Grayscale binarization processing, the system performs grayscale binarization processing on the image according to the previously calculated optimal threshold; (5)提取特征,将经预处理后的图像分割,区域化提取图像特征,取出图像中的文本信息,系统中文字的提取方法使用了Android系统中数据在Activity之间的传递模块,首先将图像置于拍照所在的Activity模块进行处理,同时在该模块完成取词的功能,然后将图片经处理得到的结果返回给翻译模块的Activity,再对文字进行识别与翻译。(5) Extract features, segment the preprocessed image, extract image features regionally, and extract text information in the image. The text extraction method in the system uses the data transfer module between activities in the Android system. The image is placed in the Activity module where the photo is taken for processing. At the same time, the function of taking words is completed in this module, and then the result obtained from the image processing is returned to the Activity of the translation module, and then the text is recognized and translated. 4.按照权利要求3所述一种基于手机相机的在线翻译系统,其特征在于:所述步骤(4)包括4. according to a kind of online translation system based on mobile phone camera according to claim 3, it is characterized in that: described step (4) comprises ①利用浮点算法对扫描得到的区域图像进行灰度化处理,手机摄像头扫描得到的图像通常是24位深的RGB图像,灰度化处理后得到8位深的灰度层次图像;①Using floating-point algorithm to grayscale the scanned area image, the image scanned by the mobile phone camera is usually a 24-bit deep RGB image, and an 8-bit deep grayscale image is obtained after grayscale processing; ②图像通过普通二值化处理后得到黑白二色的图像,系统在利用二值化处理时保存了图像灰度信息,根据不同图像的不同像素值,对灰度信息进行二值化,将图片归为只有两个灰度值的图像,最终获得的图片是灰度图片;② After the image is processed by ordinary binarization, a black and white image is obtained. The system saves the grayscale information of the image when using the binarization process. 
According to the different pixel values of different images, the grayscale information is binarized, and the image It is classified as an image with only two grayscale values, and the final image is a grayscale image; ③在进行灰度二值化处理之后,先判断深色值像素数是否多于浅色值像素数,如果深色值像素数比浅色值像素数多,则进行反色处理。③ After the grayscale binarization process is performed, first determine whether the number of dark value pixels is more than the number of light value pixels, and if the number of dark value pixels is more than the number of light value pixels, perform inverse color processing. 5.按照权利要求1所述一种基于手机相机的在线翻译系统,其特征在于:所述步骤3包括5. a kind of online translation system based on mobile phone camera according to claim 1, is characterized in that: described step 3 comprises: (1)正方体文字识别引擎分为两部分:图片布局分析和字符分割和识别;(1) The cube text recognition engine is divided into two parts: image layout analysis and character segmentation and recognition; ①图片布局分析主要分析连通区域,是字符识别的准备工作,通过一种混合的基于制表位检测的页面布局分析方法,将图像的表格、文本、图片等内容进行区分;①The image layout analysis mainly analyzes the connected area, which is the preparation work for character recognition. Through a hybrid page layout analysis method based on tab stop detection, the table, text, picture and other contents of the image are distinguished; ②字符分割和识别过程是先找到块区域,定位文本行和单词,识别文本,先进行第一次字符识别,通过字符区域类型判定,根据判定结果对比字符库识别字符,然后,根据识别出来的文本字符,进行粘连字符的分割,同时把错误分割的字符合并,完成字符的精细切分;②The process of character segmentation and recognition is to first find the block area, locate the text line and word, identify the text, first perform the first character recognition, determine the character area type, and compare the character library according to the determination result. 
Then, for the recognized text characters, glued (touching) characters are split and wrongly split characters are merged, completing the fine segmentation of the characters; (2) Tesseract text recognition engine training: the jTessBoxEditor tool is used to train the language data files. The task of this system is the recognition and translation of Chinese and English, so during training the following data packages in tessdata are trained to obtain the language recognition libraries used in the system: chi_sim.traineddata, chi_tra.traineddata, eng.traineddata, chi_tra_vert.traineddata, chi_sim_vert.traineddata. When this system is used for the first time, the recognition language libraries must first be loaded into the root directory of the phone's storage; only then can the recognition function run correctly by calling the trained language recognition libraries. 6. The mobile-phone-camera-based online translation system according to claim 1, characterized in that the translation workflow in step 4 is: start --> obtain the recognized text information --> convert the text encoding --> MD5 encryption, generate signature --> transmit the information --> obtain the translation result --> display the translation result on the program interface --> end.
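The encode-sign-transmit portion of the workflow in claim 6 matches the signing convention of common public translation APIs. A minimal Python sketch follows; the parameter names (appid, q, salt, sign) and the appid + query + salt + key concatenation order are illustrative assumptions drawn from that convention, not taken from the patent.

```python
import hashlib
import random
from urllib.parse import urlencode

def build_translation_request(text, appid, secret_key, src="auto", dst="en"):
    # Claim 6 flow: take the recognized text, UTF-8 encode it, generate
    # an MD5 signature over appid + text + salt + secret_key, and
    # assemble the query string to transmit.  Field names follow a
    # common public translation-API convention (an assumption).
    salt = str(random.randint(10000, 99999))
    sign = hashlib.md5((appid + text + salt + secret_key)
                       .encode("utf-8")).hexdigest()
    return urlencode({"q": text, "from": src, "to": dst,
                      "appid": appid, "salt": salt, "sign": sign})
```

The salt makes each signature unique per request; the server recomputes the MD5 with its copy of the secret key to authenticate the caller before returning the translation result.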
CN201910856607.2A 2019-09-11 2019-09-11 An online translation system based on mobile phone camera Pending CN110717397A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910856607.2A CN110717397A (en) 2019-09-11 2019-09-11 An online translation system based on mobile phone camera

Publications (1)

Publication Number Publication Date
CN110717397A true CN110717397A (en) 2020-01-21

Family

ID=69209822

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910856607.2A Pending CN110717397A (en) 2019-09-11 2019-09-11 An online translation system based on mobile phone camera

Country Status (1)

Country Link
CN (1) CN110717397A (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101562694A (en) * 2009-05-26 2009-10-21 天津三星光电子有限公司 Method for realizing functions of character extraction and automatic translation of digital camera
CN104881405A (en) * 2015-05-22 2015-09-02 东莞中山大学研究院 A method for realizing photo translation based on a smart phone and the smart phone
CN108628858A (en) * 2018-04-20 2018-10-09 广东科学技术职业学院 The operating method and system of textual scan identification translation on line based on mobile terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PROGRAMOFAPE: "jTessBoxEditor usage", pages 107 - 108 *
ZHENG Shuquan, WANG Qian, WU Zhixia, XU Kan (eds.): "Digital Image Processing and Compression Coding Technology", Xidian University Press, pages: 225 - 67 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257466A (en) * 2020-11-03 2021-01-22 沈阳雅译网络技术有限公司 Model compression method applied to small machine translation equipment
CN112257466B (en) * 2020-11-03 2023-08-18 沈阳雅译网络技术有限公司 Model compression method applied to small machine translation equipment
CN113392847A (en) * 2021-06-17 2021-09-14 拉萨搻若文化艺术产业开发有限公司 OCR (optical character recognition) handheld scanning translation device and translation method for Tibetan Chinese and English
CN113392847B (en) * 2021-06-17 2023-12-05 拉萨搻若文化艺术产业开发有限公司 Tibetan Chinese-English three-language OCR handheld scanning translation device and translation method
CN113486892A (en) * 2021-07-02 2021-10-08 东北大学 Production information acquisition method and system based on smartphone image recognition
CN113486892B (en) * 2021-07-02 2023-11-28 东北大学 Production information collection method and system based on smartphone image recognition

Similar Documents

Publication Publication Date Title
CN110363102B (en) Object identification processing method and device for PDF (Portable document Format) file
Raghunandan et al. Riesz fractional based model for enhancing license plate detection and recognition
CN106951832B (en) Verification method and device based on handwritten character recognition
Gebhardt et al. Document authentication using printing technique features and unsupervised anomaly detection
CN112434690A (en) Method, system and storage medium for automatically capturing and understanding elements of dynamically analyzing text image characteristic phenomena
CN108399405A (en) Business license recognition methods and device
CN110717397A (en) An online translation system based on mobile phone camera
CN112069991B (en) PDF (Portable document Format) form information extraction method and related device
CN111652233A (en) An automatic recognition method of text verification code for complex background
CN104376315A (en) Detection method based on computer image processing and mode recognition and application of detection method
Roy et al. Wavelet-gradient-fusion for video text binarization
CN116740723A (en) A PDF document recognition method based on the open source Paddle framework
CN111310750B (en) Information processing method, device, computing equipment and medium
CN116704523A (en) Text typesetting image recognition system for publishing and printing equipment
CN105551044B (en) A kind of picture control methods and device
JP2004272798A (en) Image reading device
WO2019140641A1 (en) Information processing method and system, cloud processing device and computer program product
CN104156725A (en) Novel Chinese character stroke combination method based on angle between stroke segments
CN115810197A (en) Multi-mode electric power form recognition method and device
CN109147002B (en) Image processing method and device
KR102562170B1 (en) Method for providing deep learning based paper book digitizing service
Koushik et al. Automated marks entry processing in handwritten answer scripts using character recognition techniques
CN111476090B (en) Watermark identification method and device
CN116484844A (en) Document OCR recognition result error correction method, system, equipment and medium
Huang et al. Scene character detection and recognition based on multiple hypotheses framework

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20200121