JP7644282B1

JP7644282B1 - Information processing system, information processing method, and program

Info

Publication number: JP7644282B1
Application number: JP2024027768A
Authority: JP
Inventors: 健太郎園田
Original assignee: TIS Inc
Current assignee: TIS Inc
Priority date: 2024-02-27
Filing date: 2024-02-27
Publication date: 2025-03-11
Anticipated expiration: 2044-02-27

Abstract

A system capable of appropriately extracting information from an image is provided.
[Solution] In an information processing system, an electronic data generation device includes a second information acquisition unit and a character identification unit. When a first reference index, which is based on a first accuracy index indicating the degree of recognition accuracy for characters indicated by first text information, is determined to be lower than a first standard for the accuracy of recognition of characters included in an image, the second information acquisition unit acquires second text information from a second optical character recognition device different from a first optical character recognition device that recognizes each of a plurality of characters included in a target image as the first text information. The second optical character recognition device recognizes, as the second text information, at least one character included in a specific range image, which is an image of a portion of the target image including the character corresponding to the first reference index determined to be lower than the first standard. The character identification unit identifies the characters included in the target image based on the first text information and the second text information.
[Selected Figure] Figure 1

Description

本発明は、情報処理システム、情報処理方法、およびプログラムに関する。 The present invention relates to an information processing system, an information processing method, and a program.

機械学習モジュールを用いて文書画像から情報を抽出するシステムが開示されている（特許文献１）。 A system that uses a machine learning module to extract information from document images is disclosed (Patent Document 1).

特開２０２２－７９４３９号公報 JP 2022-79439 A

特許文献１に記載の文書データ抽出システムは、文書に関連付けられた画像データを取得し、光学式文字認識により画像データからメタデータを抽出する。メタデータは、テキストコンテンツ項目列と、テキストコンテンツ項目列の各テキストコンテンツ項目に関連付けられたテキストコンテンツ項目特徴とが指定される。文書データ抽出システムは、機械学習モジュールを用いて、テキストコンテンツ項目列とテキストコンテンツ項目特徴とに基づき、キーに関連付けられた１以上のテキストコンテンツ項目を決定する。これにより、文書データ抽出システムは文書画像から情報を抽出することができる。 The document data extraction system described in Patent Document 1 acquires image data associated with a document and extracts metadata from the image data by optical character recognition. The metadata specifies a string of text content items and text content item features associated with each text content item in the string of text content items. The document data extraction system uses a machine learning module to determine one or more text content items associated with a key based on the string of text content items and the text content item features. This enables the document data extraction system to extract information from a document image.

しかし、光学式文字認識を実行する一つの光学式文字認識装置によって画像データから適切にメタデータを生成できない場合、文書画像から適切に情報を抽出することができないという問題が生じる。 However, if an optical character recognition device that performs optical character recognition cannot properly generate metadata from image data, a problem arises in which information cannot be properly extracted from document images.

そこで、本発明は、上記の課題を解決するために、画像から適切に情報を抽出可能なシステムを提供することを目的とする。 Therefore, in order to solve the above problems, the present invention aims to provide a system that can appropriately extract information from images.

本発明の一態様に係る情報処理システムは、所定の装置から、文字を含む対象画像を取得する対象画像取得部と、前記対象画像に含まれる複数の文字のそれぞれを第１のテキスト情報として認識する第１の光学式文字認識装置から、前記複数の文字のそれぞれについての、前記第１のテキスト情報と、前記第１のテキスト情報が示す文字に対する認識の正確性の度合いを示す第１の正確性指標と、を取得する第１の情報取得部と、前記複数の文字における前記第１の正確性指標のうちの少なくとも一つに基づく第１の基準指標が、画像に含まれる文字の認識の正確性に関する第１の基準よりも低いと判定された場合、前記第１の基準よりも低いと判定された前記第１の基準指標に対応する文字を含む、前記対象画像の一部の範囲の画像である特定範囲画像であって、当該特定範囲画像に含まれる少なくとも一つの文字のそれぞれを第２のテキスト情報として認識する、前記第１の光学式文字認識装置とは異なる第２の光学式文字認識装置から、前記特定範囲画像に含まれる文字についての前記第２のテキスト情報を取得する第２の情報取得部と、前記第１のテキスト情報と、前記第２のテキスト情報と、に基づいて、前記対象画像に含まれる文字を特定する文字特定部と、を備える。 An information processing system according to one aspect of the present invention includes: a target image acquisition unit that acquires a target image including characters from a predetermined device; a first information acquisition unit that acquires, from a first optical character recognition device that recognizes each of a plurality of characters included in the target image as first text information, the first text information for each of the plurality of characters and a first accuracy index indicating the degree of recognition accuracy for the character indicated by the first text information; a second information acquisition unit that, when a first reference index based on at least one of the first accuracy indexes for the plurality of characters is determined to be lower than a first standard regarding the accuracy of recognition of characters included in an image, acquires second text information about the characters included in a specific range image from a second optical character recognition device different from the first optical character recognition device, the specific range image being an image of a part of the target image that includes the character corresponding to the first reference index determined to be lower than the first standard, and the second optical character recognition device recognizing each of at least one character included in the specific range image as the second text information; and a character identification unit that identifies the characters included in the target image based on the first text information and the second text information.

本発明の一態様に係る情報処理方法は、コンピュータが、所定の装置から、文字を含む対象画像を取得することと、前記対象画像に含まれる複数の文字のそれぞれを第１のテキスト情報として認識する第１の光学式文字認識装置から、前記複数の文字のそれぞれについての、前記第１のテキスト情報と、前記第１のテキスト情報が示す文字に対する認識の正確性の度合いを示す第１の正確性指標と、を取得することと、前記複数の文字における前記第１の正確性指標のうちの少なくとも一つに基づく第１の基準指標が、画像に含まれる文字の認識の正確性に関する第１の基準よりも低いと判定された場合、前記第１の基準よりも低いと判定された前記第１の基準指標に対応する文字を含む、前記対象画像の一部の範囲の画像である特定範囲画像であって、当該特定範囲画像に含まれる少なくとも一つの文字のそれぞれを第２のテキスト情報として認識する、前記第１の光学式文字認識装置とは異なる第２の光学式文字認識装置から、前記特定範囲画像に含まれる文字についての前記第２のテキスト情報を取得することと、前記第１のテキスト情報と、前記第２のテキスト情報と、に基づいて、前記対象画像に含まれる文字を特定することと、を実行する。 In an information processing method according to one aspect of the present invention, a computer executes: acquiring a target image including characters from a predetermined device; acquiring, from a first optical character recognition device that recognizes each of a plurality of characters included in the target image as first text information, the first text information for each of the plurality of characters and a first accuracy index indicating the degree of recognition accuracy for the character indicated by the first text information; when a first reference index based on at least one of the first accuracy indexes for the plurality of characters is determined to be lower than a first standard regarding the accuracy of recognition of characters included in an image, acquiring second text information about the characters included in a specific range image from a second optical character recognition device different from the first optical character recognition device, the specific range image being an image of a part of the target image that includes the character corresponding to the first reference index determined to be lower than the first standard, and the second optical character recognition device recognizing each of at least one character included in the specific range image as the second text information; and identifying the characters included in the target image based on the first text information and the second text information.

本発明の一態様に係るプログラムは、コンピュータに、所定の装置から、文字を含む対象画像を取得することと、前記対象画像に含まれる複数の文字のそれぞれを第１のテキスト情報として認識する第１の光学式文字認識装置から、前記複数の文字のそれぞれについての、前記第１のテキスト情報と、前記第１のテキスト情報が示す文字に対する認識の正確性の度合いを示す第１の正確性指標と、を取得することと、前記複数の文字における前記第１の正確性指標のうちの少なくとも一つに基づく第１の基準指標が、画像に含まれる文字の認識の正確性に関する第１の基準よりも低いと判定された場合、前記第１の基準よりも低いと判定された前記第１の基準指標に対応する文字を含む、前記対象画像の一部の範囲の画像である特定範囲画像であって、当該特定範囲画像に含まれる少なくとも一つの文字のそれぞれを第２のテキスト情報として認識する、前記第１の光学式文字認識装置とは異なる第２の光学式文字認識装置から、前記特定範囲画像に含まれる文字についての前記第２のテキスト情報を取得することと、前記第１のテキスト情報と、前記第２のテキスト情報と、に基づいて、前記対象画像に含まれる文字を特定することと、を実行させる。 A program according to one aspect of the present invention causes a computer to execute the following: acquire a target image including characters from a specified device; acquire, from a first optical character recognition device that recognizes each of a plurality of characters included in the target image as first text information, the first text information for each of the plurality of characters and a first accuracy index indicating the degree of recognition accuracy for the character indicated by the first text information; if a first reference index based on at least one of the first accuracy indexes for the plurality of characters is determined to be lower than a first reference standard for the accuracy of recognition of characters included in an image, acquire the second text information for the characters included in the specific range image from a second optical character recognition device different from the first optical character recognition device that recognizes each of at least one character included in the specific range image as second text information; and identify the characters included in the target image based on the first text information and the second text information.

本発明によれば、画像から適切に情報を抽出可能なシステムを提供することができる。 The present invention provides a system that can appropriately extract information from images.

電子データ生成システムの概要を示す図である。 FIG. 1 is a diagram showing an overview of an electronic data generation system.
対象画像情報の一例を示すデータベースである。 FIG. 2 is a database showing an example of target image information.
特定範囲画像情報の一例を示すデータベースである。 FIG. 3 is a database showing an example of specific range image information.
表示部に表示される画面例を示す図である。 FIG. 4 is a diagram showing an example of a screen displayed on a display unit.
電子データ生成システムの処理手順を示すフローチャートである。 FIG. 5 is a flowchart showing a processing procedure of the electronic data generation system.
一行の文字列のテキスト情報である行情報に対する正当性指標を示す表である。 FIG. 6 is a table showing validity indicators for line information, which is the text information of one line of a character string.
コンピュータのハードウェア構成の一例を示す図である。 FIG. 7 is a diagram showing an example of a hardware configuration of a computer.

以下に、本発明の一実施形態における電子データ生成システム１０について、図面を参照して詳細に説明する。ただし、以下に説明する実施形態は、あくまでも例示であり、以下に明示しない種々の変形や技術の適用を排除する意図はない。すなわち、本発明は、その趣旨を逸脱しない範囲で種々変形し、または各実施例を組み合わせるなどして実施することができる。また、以下の図面の記載において、同一または類似の部分には同一または類似の符号を付して表している。 The electronic data generation system 10 according to one embodiment of the present invention will be described in detail below with reference to the drawings. However, the embodiment described below is merely an example, and is not intended to exclude the application of various modifications or techniques not explicitly described below. In other words, the present invention can be implemented by modifying it in various ways or combining the various embodiments without departing from the spirit of the invention. In addition, in the description of the drawings below, identical or similar parts are denoted by the same or similar reference numerals.

また、本実施形態において、「部」、「装置」、「システム」とは、単に物理的手段を意味するものではなく、その「部」、「装置」、「システム」が有する機能をソフトウェアによって実現する場合も含む。また、一つの「部」、「装置」、「システム」が有する機能が２つ以上の物理的手段や装置により実現されてもよく、二つ以上の「部」、「装置」、「システム」の機能が１つの物理的手段や装置により実現されてもよい。さらには、電子データ生成システム１０を構成する複数の装置のそれぞれの以下に示す各種機能が、当該複数の装置における他の装置によって実行されるように構成されていてもよい。 In addition, in this embodiment, the terms "part", "device", and "system" do not simply mean physical means, but also include cases where the functions of the "part", "device", and "system" are realized by software. Furthermore, the functions of one "part", "device", or "system" may be realized by two or more physical means or devices, and the functions of two or more "parts", "devices", or "systems" may be realized by one physical means or device. Furthermore, the various functions described below of each of the multiple devices that make up electronic data generation system 10 may be configured to be executed by other devices in the multiple devices.

＝＝＝電子データ生成システム１０の概要＝＝＝ === Overview of Electronic Data Generation System 10 ===
＜＜構成の概要＞＞ <<Configuration Overview>>
図１を参照して、電子データ生成システム１０の概要について説明する。図１は、電子データ生成システム１０の概要を示す図である。 An overview of the electronic data generation system 10 will be described with reference to FIG. 1. FIG. 1 is a diagram showing an overview of the electronic data generation system 10.

電子データ生成システム１０は、複数の光学式文字認識装置を用いて、画像からデジタル文書を正確に生成するシステムである。具体的には、電子データ生成システム１０は、ＪＰＥＧ（ＪｏｉｎｔＰｈｏｔｏｇｒａｐｈｉｃＥｘｐｅｒｔｓＧｒｏｕｐ）、ＴＩＦＦ（ＴａｇｇｅｄＩｍａｇｅＦｉｌｅＦｏｒｍａｔ）、ＰＮＧ（ＰｏｒｔａｂｌｅＮｅｔｗｏｒｋＧｒａｐｈｉｃｓ）等のグラフィックフォーマットで指定された画像やＰＤＦ（Portable Document Format）データの画像（以下、「対象画像」という。）などを、複数の光学式文字認識装置を通じてデジタルデータであるテキスト情報を生成する。 The electronic data generation system 10 is a system that uses multiple optical character recognition devices to accurately generate digital documents from images. Specifically, the electronic data generation system 10 generates text information, which is digital data, from images specified in graphic formats such as JPEG (Joint Photographic Experts Group), TIFF (Tagged Image File Format), and PNG (Portable Network Graphics) and images in PDF (Portable Document Format) data (hereinafter referred to as "target images") through multiple optical character recognition devices.

対象画像は、例えば各種契約書や論文などの画像である。以下、便宜上、対象画像が一例として一頁単位の契約書の画像であるとして説明する。 The target image may be, for example, an image of various contracts or papers. For the sake of convenience, the following description will be given assuming that the target image is an image of a single page of a contract.

電子データ生成システム１０は、例えば、電子データ生成装置１００と、第１の光学式文字認識装置２００と、第２の光学式文字認識装置３００と、ユーザ端末４００とを含む。 The electronic data generation system 10 includes, for example, an electronic data generation device 100, a first optical character recognition device 200, a second optical character recognition device 300, and a user terminal 400.

電子データ生成装置１００は、異なる二つの光学式文字認識装置のそれぞれによる対象画像の文字認識の結果に基づき、対象画像の文字認識の結果であるデジタル文書を出力する装置である。 The electronic data generation device 100 is a device that outputs a digital document that is the result of character recognition of a target image based on the results of character recognition of the target image by each of two different optical character recognition devices.

第１の光学式文字認識装置２００は、対象画像に対して文字認識を実行する装置である。 The first optical character recognition device 200 is a device that performs character recognition on a target image.

第２の光学式文字認識装置３００は、電子データ生成装置１００から取得される、対象画像の所定の範囲の画像に対して文字認識を実行する装置である。 The second optical character recognition device 300 is a device that performs character recognition on an image within a predetermined range of the target image obtained from the electronic data generation device 100.

電子データ生成装置１００、第１の光学式文字認識装置２００および第２の光学式文字認識装置３００は、例えば、クラウドコンピュータ、サーバコンピュータ、パーソナルコンピュータ（例えば、デスクトップ、ラップトップ、タブレットなど）、メディアコンピュータプラットホーム（例えば、ケーブル、衛星セットトップボックス、デジタルビデオレコーダ）、ハンドヘルドコンピュータデバイス（例えば、ＰＤＡ、電子メールクライアントなど）、あるいは他種のコンピュータ、またはコミュニケーションプラットホームであってもよい。なお、電子データ生成装置１００、第１の光学式文字認識装置２００および第２の光学式文字認識装置３００における処理の少なくとも一部は、１以上のコンピュータ（限定ではなく例として、１以上のコンピュータにより構成されるクラウドコンピューティング）により実現されていてもよい。 The electronic data generating device 100, the first optical character recognition device 200 and the second optical character recognition device 300 may be, for example, a cloud computer, a server computer, a personal computer (e.g., a desktop, a laptop, a tablet, etc.), a media computer platform (e.g., a cable or satellite set-top box, a digital video recorder), a handheld computing device (e.g., a PDA, an email client, etc.), or other types of computers or communication platforms. At least a portion of the processing in the electronic data generating device 100, the first optical character recognition device 200 and the second optical character recognition device 300 may be realized by one or more computers (for example, but not limited to, cloud computing consisting of one or more computers).

ユーザ端末４００は、ユーザの操作入力を受け付けて各種情報を表示する装置である。 The user terminal 400 is a device that accepts user input and displays various information.

ユーザ端末４００は、例えば、スマートフォン、携帯電話（フィーチャーフォン）、パーソナルコンピュータ（例えば、デスクトップ、ラップトップ、タブレットなど）、メディアコンピュータプラットホーム（例えば、ケーブル、衛星セットトップボックス、デジタルビデオレコーダ）、ハンドヘルドコンピュータデバイス（例えば、ＰＤＡ（Personal Digital Assistant）、電子メールクライアントなど）、ウェアラブル端末（メガネ型デバイス、時計型デバイスなど）、他種のコンピュータ、またはコミュニケーションプラットホームであってもよい。 The user terminal 400 may be, for example, a smartphone, a mobile phone (feature phone), a personal computer (e.g., desktop, laptop, tablet, etc.), a media computing platform (e.g., cable, satellite set-top box, digital video recorder), a handheld computing device (e.g., PDA (Personal Digital Assistant), email client, etc.), a wearable device (glasses-type device, watch-type device, etc.), another type of computer, or a communication platform.

＜＜処理の概要＞＞ <<Processing Overview>>
図１を参照して、電子データ生成システム１０の処理の概要について説明する。 An overview of the processing of the electronic data generation system 10 will be described with reference to FIG. 1.

まず、ステップＳ１０において、電子データ生成装置１００は、所定の装置から取得された対象画像を、第１の光学式文字認識装置２００に送信する。 First, in step S10, the electronic data generation device 100 transmits a target image acquired from a specified device to the first optical character recognition device 200.

ステップＳ１１において、第１の光学式文字認識装置２００は、対象画像（例えば一頁単位の画像）に含まれる文字を認識して、認識した文字についてのテキスト情報（以下、「第１のテキスト情報」という。）を生成する。第１のテキスト情報には当該文字の対象画像上の座標が含まれていてもよい。このとき、第１の光学式文字認識装置２００は、生成した第１のテキスト情報が示す文字に対する認識の正確性の度合い（以下、「第１の正確性指標」という。）を生成する。以下では、便宜上、第１のテキスト情報、第１の正確性指標および座標をまとめて「第１の生成情報」ということもある。 In step S11, the first optical character recognition device 200 recognizes characters contained in a target image (e.g., an image of one page) and generates text information about the recognized characters (hereinafter referred to as "first text information"). The first text information may include the coordinates of the characters on the target image. At this time, the first optical character recognition device 200 generates a degree of recognition accuracy for the characters indicated by the generated first text information (hereinafter referred to as "first accuracy index"). For convenience, the first text information, first accuracy index, and coordinates may be collectively referred to as "first generated information" below.

第１の光学式文字認識装置２００は、第１の生成情報を電子データ生成装置１００に送信する。 The first optical character recognition device 200 transmits the first generated information to the electronic data generation device 100.

ステップＳ１２において、電子データ生成装置１００は、複数の文字における第１の正確性指標のうちの少なくとも一つに基づく基準指標（以下、「第１の基準指標」という。）が、画像に含まれる文字の認識の正確性に関する基準（以下、「第１の基準」という。）よりも低いと判定された場合、基準指標に対応する文字を含む、対象画像の所定の範囲の画像（以下、「特定範囲画像」という。）を特定する。 In step S12, if the electronic data generating device 100 determines that a reference index (hereinafter referred to as the "first reference index") based on at least one of the first accuracy indexes for multiple characters is lower than a standard (hereinafter referred to as the "first standard") regarding the accuracy of recognition of characters contained in an image, it identifies an image of a predetermined range of the target image (hereinafter referred to as the "specific range image") that contains characters corresponding to the reference index.

第１の基準指標とは、例えば、複数の文字のそれぞれの第１の正確性指標であってもよいし、行単位の画像やブロック単位の画像に含まれる複数の文字における第１の正確性指標の平均値であってもよい。 The first reference index may be, for example, the first accuracy index for each of the multiple characters, or the average value of the first accuracy index for the multiple characters included in a line-unit image or a block-unit image.

第１の基準とは、例えば第１の正確性指標と比較可能な閾値である。 The first standard is, for example, a threshold value that can be compared with the first accuracy index.

特定範囲画像とは、例えば、一頁単位の対象画像の一部をセグメント化した、一つの文章がまとまった画像（以下、「ブロック画像」という。）であってもよいし、一行分の画像（以下、「行画像」という。）であってもよいし、一文字の画像である文字画像であってもよい。以下、便宜上、特定範囲画像を「行画像」として説明する。 The specific range image may be, for example, an image of a single sentence (hereinafter referred to as a "block image") obtained by segmenting a portion of a page-unit target image, an image of one line (hereinafter referred to as a "line image"), or a character image, which is an image of a single character. For convenience, the specific range image will be described below as a "line image."

電子データ生成装置１００は、特定範囲画像を第２の光学式文字認識装置３００に送信する。すなわち、電子データ生成装置１００は、対象画像を文字認識させた光学式文字認識装置とは異なる光学式文字認識装置に、文字認識の正確性が低いと判定された文字を含む、例えば対象画像の一部の範囲の特定範囲画像（例えば行画像）を再度文字認識させる。 The electronic data generating device 100 transmits the specific range image to the second optical character recognition device 300. That is, the electronic data generating device 100 causes an optical character recognition device different from the optical character recognition device that performed character recognition on the target image to perform character recognition again on the specific range image (e.g., a line image) of, for example, a part of the target image, which includes characters determined to have low accuracy of character recognition.
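One way to picture how a line image could be cut out of the target image is to take the union of the coordinates of the characters on that line. The sketch below is only an illustration: the source does not specify the coordinate format, so the `(left, top, right, bottom)` box layout, the `union_box` helper, and the `margin` parameter are all assumptions.

```python
from typing import Iterable, Tuple

# Assumed coordinate layout: (left, top, right, bottom) in target-image pixels.
Box = Tuple[int, int, int, int]


def union_box(char_boxes: Iterable[Box], margin: int = 0) -> Box:
    """Return the smallest box enclosing every character box, padded by margin."""
    boxes = list(char_boxes)
    if not boxes:
        raise ValueError("at least one character box is required")
    left = min(b[0] for b in boxes) - margin
    top = min(b[1] for b in boxes) - margin
    right = max(b[2] for b in boxes) + margin
    bottom = max(b[3] for b in boxes) + margin
    # Clamp left/top at the image origin so the padded box never goes negative.
    return (max(left, 0), max(top, 0), right, bottom)


# Hypothetical coordinates for the three characters of one line.
line_boxes = [(10, 20, 30, 50), (32, 22, 52, 50), (54, 18, 74, 51)]
print(union_box(line_boxes, margin=2))  # (8, 16, 76, 53)
```

The resulting box would then be used to crop the region that is sent to the second optical character recognition device 300; clamping against the right and bottom image edges is omitted here because the image size is not part of this sketch.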

ステップＳ１３において、第２の光学式文字認識装置３００は、特定範囲画像に含まれる文字を認識して、認識した文字についてのテキスト情報（以下、「第２のテキスト情報」という。）を生成する。第２のテキスト情報には当該文字の対象画像上の座標が含まれていてもよい。このとき、第２の光学式文字認識装置３００は、生成した第２のテキスト情報が示す文字に対する認識の正確性の度合い（以下、「第２の正確性指標」という。）を生成する。なお、以下では、第２のテキスト情報、第２の正確性指標および座標をまとめて「第２の生成情報」ということもある。 In step S13, the second optical character recognition device 300 recognizes characters contained in the specific range image and generates text information about the recognized characters (hereinafter referred to as "second text information"). The second text information may include the coordinates of the characters on the target image. At this time, the second optical character recognition device 300 generates a degree of recognition accuracy for the characters indicated by the generated second text information (hereinafter referred to as "second accuracy index"). Note that, hereinafter, the second text information, second accuracy index, and coordinates may be collectively referred to as "second generated information".

第２の光学式文字認識装置３００は、第２の生成情報を電子データ生成装置１００に送信する。 The second optical character recognition device 300 transmits the second generated information to the electronic data generation device 100.

ステップＳ１４において、電子データ生成装置１００は、特定範囲画像に含まれる少なくとも一つの文字のそれぞれにおける第２の正確性指標のうちの少なくとも一つに基づく基準指標（以下、「第２の基準指標」という。）と、画像に含まれる文字の認識の正確性に関する基準（以下、「第２の基準」という。）との比較結果（以下、「第１の比較結果」という。）に基づいて、特定範囲画像に含まれる文字を特定する。 In step S14, the electronic data generation device 100 identifies characters included in the specific range image based on a comparison result (hereinafter referred to as the "first comparison result") between a reference indicator (hereinafter referred to as the "second reference indicator") based on at least one of the second accuracy indicators for each of at least one character included in the specific range image and a standard regarding the accuracy of recognition of characters included in the image (hereinafter referred to as the "second standard").

第２の基準指標とは、例えば、特定範囲画像に含まれる文字のそれぞれの第２の正確性指標であってもよいし、特定範囲画像に含まれる複数の文字における第２の正確性指標の平均値であってもよい。 The second reference index may be, for example, a second accuracy index for each character included in the specific range image, or may be an average value of the second accuracy indexes for multiple characters included in the specific range image.

第２の基準とは、例えば第２の正確性指標と比較可能な閾値である。 The second standard is, for example, a threshold value that can be compared with the second accuracy index.
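The comparison in step S14 can be sketched as follows. The passage only states that the characters are identified based on the comparison between the second reference index and the second standard, so the selection policy shown here (adopt the second result when it meets the standard, otherwise keep the first result) is an assumption made for illustration, as are the function names and the threshold value 0.8.

```python
from statistics import mean
from typing import Sequence


def reference_index(scores: Sequence[float], use_mean: bool = False) -> float:
    """Collapse per-character accuracy indexes into a single reference index.

    min() mirrors judging each character's index individually (the weakest
    character decides); mean() mirrors the average-value variant.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    return mean(scores) if use_mean else min(scores)


def identify_line(first_text: str, second_text: str,
                  second_scores: Sequence[float],
                  second_standard: float,
                  use_mean: bool = False) -> str:
    """Assumed policy: trust the second OCR result only if it meets the standard."""
    if reference_index(second_scores, use_mean) >= second_standard:
        return second_text
    return first_text


# Hypothetical values: the second engine is confident, so its text is adopted.
print(identify_line("ABD", "ABC", [0.98, 0.97, 0.95], second_standard=0.8))  # ABC
# Here one character is weak, so the first result is kept.
print(identify_line("ABD", "A8C", [0.98, 0.40, 0.95], second_standard=0.8))  # ABD
```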

電子データ生成装置１００は、特定した特定範囲画像に含まれる文字を示す情報（以下、「文字認識結果」という。）を含む画面をユーザ端末４００に送信する。 The electronic data generating device 100 transmits a screen including information indicating the characters contained in the identified specific range image (hereinafter referred to as the "character recognition result") to the user terminal 400.

以上のとおり、電子データ生成装置１００は、一頁単位の対象画像に含まれる文字に対する第１の光学式文字認識装置２００による文字認識の正確性が低い場合に、第１の光学式文字認識装置２００とは異なる第２の光学式文字認識装置３００によって、当該文字を含む、対象画像の一部の範囲の特定範囲画像（ここでは一例として行単位の行画像）を文字認識した結果を取得して、当該結果に基づき対象画像に含まれる文字を特定する。 As described above, when the accuracy of character recognition by the first optical character recognition device 200 for characters included in a target image on a page basis is low, the electronic data generation device 100 obtains the results of character recognition of a specific range image (here, as an example, a line image on a line basis) of a part of the target image including the characters using a second optical character recognition device 300 different from the first optical character recognition device 200, and identifies the characters included in the target image based on the results.
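The flow of steps S10 to S14 summarized above can be sketched end to end. This is a minimal sketch under stated assumptions: the two engines are passed in as plain callables with hypothetical interfaces, each line's weakest character is used as the first reference index, and the final identification step is simplified to adopting the second engine's result.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Line:
    text: str             # recognized characters of one line
    scores: List[float]   # one accuracy index per character
    image: object = None  # stand-in for the cropped line image


# Hypothetical engine interfaces, not the API of any real OCR product.
FirstOcr = Callable[[object], List[Line]]  # page image -> recognized lines
SecondOcr = Callable[[object], Line]       # line image -> recognized line


def recognize_page(page: object, first_ocr: FirstOcr, second_ocr: SecondOcr,
                   first_standard: float = 0.6) -> List[str]:
    result = []
    for line in first_ocr(page):                       # S10-S11: page-level OCR
        if line.scores and min(line.scores) < first_standard:  # S12: low accuracy?
            line = second_ocr(line.image)              # S13: re-recognize the line
        result.append(line.text)                       # S14 (simplified)
    return result


# Stub engines so the sketch runs without any OCR dependency.
def fake_first(page: object) -> List[Line]:
    return [Line("ABD", [0.99, 0.99, 0.55], image="line-0"),
            Line("XYZ", [0.97, 0.98, 0.99], image="line-1")]


calls: List[object] = []


def fake_second(line_image: object) -> Line:
    calls.append(line_image)
    return Line("ABC", [0.98, 0.97, 0.96])


print(recognize_page("page-0", fake_first, fake_second))  # ['ABC', 'XYZ']
```

Only the first line is sent to the second engine, which is the point of the design: the second engine is invoked for the low-accuracy range rather than for the whole page.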

すなわち、電子データ生成装置１００では、文字認識の対象範囲が異なる複数の光学式文字認識装置（例えば第１の光学式文字認識装置２００は一頁単位の文字認識で第２の光学式文字認識装置３００は行単位の文字認識）を用いて、文字認識の正確性が低い画像に対して文字認識することにより、正確性が高い文字認識を実現可能とする。 In other words, the electronic data generation device 100 uses multiple optical character recognition devices with different target ranges for character recognition (for example, the first optical character recognition device 200 performs character recognition on a page-by-page basis, and the second optical character recognition device 300 performs character recognition on a line-by-line basis) to perform character recognition on images with low character recognition accuracy, thereby making it possible to achieve highly accurate character recognition.

さらに言うと、電子データ生成システム１０では、文字認識の実行費用が第１の光学式文字認識装置２００よりも高額な第２の光学式文字認識装置３００を用いて対象範囲の文字認識の全てを実行するのではなく、第１の光学式文字認識装置２００では文字認識の正確性が低い、対象範囲よりもより狭い範囲の文字認識を実行することにより、文字認識の正確性を高めるとともに、文字認識にかかる費用を低減することが可能となる。 Moreover, the electronic data generation system 10 does not use the second optical character recognition device 300, whose character recognition costs more to run than that of the first optical character recognition device 200, for all character recognition in the target range. Instead, the second optical character recognition device 300 performs character recognition only on a range narrower than the target range, namely the range in which the first optical character recognition device 200 has low recognition accuracy. This makes it possible to increase the accuracy of character recognition while reducing its cost.

なお、第２の光学式文字認識装置３００に送信される特定範囲画像は、行画像であることに限定されず、一頁単位の画像であってもよく、ブロック画像または文字画像であってもよい。 Note that the specific range image sent to the second optical character recognition device 300 is not limited to being a line image, but may be an image of a page, a block image, or a character image.

また、電子データ生成システム１０は、第１の光学式文字認識装置２００および第２の光学式文字認識装置３００に加えて、さらに少なくとも一つの光学式文字認識装置を含んでいてもよい。この場合、当該光学式文字認識装置は、第１の光学式文字認識装置２００および第２の光学式文字認識装置３００のいずれかと同じ範囲の画像を文字認識するものであってもよいし、第１の光学式文字認識装置２００および第２の光学式文字認識装置３００が文字認識する範囲よりも狭い範囲の画像を文字認識するものであってもよい。これにより、電子データ生成システム１０は、より正確性が高い文字認識を実現可能となる。 The electronic data generation system 10 may further include at least one optical character recognition device in addition to the first optical character recognition device 200 and the second optical character recognition device 300. In this case, the optical character recognition device may perform character recognition on an image in the same range as either the first optical character recognition device 200 or the second optical character recognition device 300, or may perform character recognition on an image in a narrower range than the range in which the first optical character recognition device 200 and the second optical character recognition device 300 perform character recognition. This enables the electronic data generation system 10 to achieve more accurate character recognition.

＝＝＝電子データ生成装置１００＝＝＝ === Electronic Data Generation Device 100 ===
図１に示すように、電子データ生成装置１００は、記憶部１０１と、対象画像取得部１０２と、第１の情報取得部１０３と、第１の判定部１０４と、特定範囲特定部１０５と、情報送信部１０６と、第２の情報取得部１０７と、第２の判定部１０８と、文字特定部１０９と、表示処理部１１０とを含む。 As shown in FIG. 1, the electronic data generation device 100 includes a memory unit 101, a target image acquisition unit 102, a first information acquisition unit 103, a first determination unit 104, a specific range identification unit 105, an information transmission unit 106, a second information acquisition unit 107, a second determination unit 108, a character identification unit 109, and a display processing unit 110.

記憶部１０１は、例えば、対象画像情報Ｄ１０１ａと、特定範囲画像情報Ｄ１０１ｂとを含む。 The memory unit 101 includes, for example, target image information D101a and specific range image information D101b.

図２を参照して、対象画像情報Ｄ１０１ａについて説明する。図２は、対象画像情報１０１ａの一例を示すデータベースである。対象画像情報１０１ａは、対象画像に関するデータが格納されるデータベースである。 The target image information D101a will be described with reference to FIG. 2. FIG. 2 is a database showing an example of the target image information D101a. The target image information D101a is a database in which data related to the target image is stored.

図２に示すように、対象画像情報Ｄ１０１ａは、例えば、［対象画像ＩＤ］、［対象画像］、［ブロック情報］、［行情報］、［文字情報］、［座標］、［第１の正確性指標］などの項目を含む。［対象画像ＩＤ］は、対象画像を一意に識別可能な識別情報が格納される。［対象画像］は、対象画像が格納される。［ブロック情報］は、対象画像のうちの文章の一つのまとまりを示すブロック画像のテキスト情報（以下、「ブロック情報」という。）が格納される。［行情報］は、ブロック画像に含まれる一行ごとのテキスト情報（以下、「行情報」という。）が格納される。［文字情報］は、一行に含まれる文字のテキスト情報（以下、「文字情報」という。）が格納される。［座標］は、第１のテキスト情報のそれぞれが示す文字（文字情報）の対象画像中の座標が格納される。［第１の正確性指標］は、第１のテキスト情報のそれぞれが示す文字の正確性の度合いを示す第１の正確性指標が格納される。 As shown in FIG. 2, the target image information D101a includes items such as [target image ID], [target image], [block information], [line information], [character information], [coordinates], and [first accuracy index]. [Target image ID] stores identification information that uniquely identifies the target image. [Target image] stores the target image. [Block information] stores the text information of a block image, which represents one coherent passage of text in the target image (hereinafter referred to as "block information"). [Line information] stores the text information for each line included in the block image (hereinafter referred to as "line information"). [Character information] stores the text information of the characters included in a line (hereinafter referred to as "character information"). [Coordinates] stores the coordinates in the target image of the character (character information) indicated by each piece of first text information. [First accuracy index] stores the first accuracy index, which indicates the degree of accuracy of the character indicated by each piece of first text information.

図３を参照して、特定範囲画像情報Ｄ１０１ｂについて説明する。図３は、特定範囲画像情報１０１ｂの一例を示すデータベースである。特定範囲画像情報１０１ｂは、特定範囲画像に関するデータが格納されるデータベースである。 The specific range image information D101b will be described with reference to FIG. 3. FIG. 3 is a database showing an example of the specific range image information D101b. The specific range image information D101b is a database in which data related to specific range images is stored.

図３に示すように、特定範囲画像情報Ｄ１０１ｂは、例えば、［特定範囲画像ＩＤ］、［特定範囲画像］、［第２のテキスト情報］、［座標］、［第２の正確性指標］などの項目を含む。［特定範囲画像ＩＤ］は、特定範囲画像を一意に識別可能な識別情報が格納される。［特定範囲画像］は、特定範囲画像が格納される。［第２のテキスト情報］は、特定範囲画像（図３では行画像）に含まれる文字画像の第２のテキスト情報が格納される。［座標］は、第２のテキスト情報が示す文字のそれぞれの対象画像中の座標または特定範囲画像中の座標が格納される。［第２の正確性指標］は、第２のテキスト情報が示す文字の正確性の度合いを示す第２の正確性指標が格納される。 As shown in FIG. 3, the specific range image information D101b includes items such as a [specific range image ID], a [specific range image], a [second text information], a [coordinates], and a [second accuracy index]. The [specific range image ID] stores identification information that can uniquely identify the specific range image. The [specific range image] stores the specific range image. The [second text information] stores second text information of the character image included in the specific range image (line image in FIG. 3). The [coordinates] stores the coordinates in the target image or the coordinates in the specific range image of the characters indicated by the second text information. The [second accuracy index] stores a second accuracy index that indicates the degree of accuracy of the characters indicated by the second text information.
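The items of the target image information D101a and the specific range image information D101b could be modeled as nested records. The following is a sketch only: the class names, field names, and the `(left, top, right, bottom)` coordinate layout are assumptions, and storing an image path instead of the image itself is a simplification for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed (left, top, right, bottom) layout


@dataclass
class CharInfo:
    text: str        # one recognized character
    box: Box         # [coordinates]
    accuracy: float  # [first accuracy index] or [second accuracy index]


@dataclass
class LineInfo:
    chars: List[CharInfo] = field(default_factory=list)

    @property
    def text(self) -> str:
        """[line information]: the characters of the line joined in order."""
        return "".join(c.text for c in self.chars)


@dataclass
class BlockInfo:
    lines: List[LineInfo] = field(default_factory=list)

    @property
    def text(self) -> str:
        """[block information]: the lines of the block joined by newlines."""
        return "\n".join(line.text for line in self.lines)


@dataclass
class TargetImageRecord:            # one row of D101a
    target_image_id: str
    image_path: str
    blocks: List[BlockInfo] = field(default_factory=list)


@dataclass
class SpecificRangeImageRecord:     # one row of D101b
    specific_range_image_id: str
    image_path: str
    chars: List[CharInfo] = field(default_factory=list)


# Hypothetical sample data for a single line read as "ABD".
line = LineInfo([CharInfo("A", (10, 20, 30, 50), 0.99),
                 CharInfo("B", (32, 22, 52, 50), 0.99),
                 CharInfo("D", (54, 18, 74, 51), 0.55)])
record = TargetImageRecord("img-001", "contract_p1.png", [BlockInfo([line])])
print(record.blocks[0].lines[0].text)  # ABD
```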

対象画像取得部１０２は、所定の装置から対象画像を取得する。所定の装置は紙への印字を画像として取得可能な例えばスキャナー装置や対象画像を記憶するサーバ装置などである。対象画像取得部１０２は、取得した対象画像を第１の光学式文字認識装置２００に送信してもよい。 The target image acquisition unit 102 acquires a target image from a specific device. The specific device is, for example, a scanner device capable of acquiring printouts on paper as an image, or a server device that stores the target image. The target image acquisition unit 102 may transmit the acquired target image to the first optical character recognition device 200.

図１に戻り、第１の情報取得部１０３は、対象画像についての第１の生成情報を第１の光学式文字認識装置２００から取得する。具体的には、電子データ生成装置１００は、例えば、対象画像に含まれる文字画像のそれぞれについての第１のテキスト情報（座標を含む）および第１の正確性指標を第１の光学式文字認識装置２００から取得する。第１の生成情報は対象画像情報Ｄ１０１ａに格納される。 Returning to FIG. 1, the first information acquisition unit 103 acquires first generation information for the target image from the first optical character recognition device 200. Specifically, the electronic data generation device 100 acquires, for example, first text information (including coordinates) and a first accuracy index for each character image included in the target image from the first optical character recognition device 200. The first generation information is stored in the target image information D101a.

第１の判定部１０４は、第１の基準指標が第１の基準よりも低いか否かを判定する。具体的には、第１の判定部１０４は、第１の基準である閾値が「０．６」であり、対象画像の所定の範囲の画像（例えば行画像）が「ＡＢＣ」である場合、第１のテキスト情報である「Ａ」，「Ｂ」，「Ｄ」（文字画像「Ｃ」を「Ｄ」と誤認識）のそれぞれの第１の正確性指標が「０．９９」，「０．９９」，「０．５５」であるとすると、当該所定の範囲の画像における第１の基準指標（ここでは「Ｄ」に対応する第１の正確性指標「０．５５」）が第１の基準（ここでは閾値「０．６」）よりも低いと判定する。 The first determination unit 104 determines whether the first reference index is lower than the first reference. Specifically, suppose that the threshold value serving as the first reference is "0.6", that the image in a predetermined range of the target image (for example, a line image) reads "ABC", and that the first text information is "A", "B", and "D" (the character image "C" is misrecognized as "D") with first accuracy indices of "0.99", "0.99", and "0.55", respectively. In that case, the first determination unit 104 determines that the first reference index for the image in the predetermined range (here, the first accuracy index "0.55" corresponding to "D") is lower than the first reference (here, the threshold value "0.6").

すなわち、電子データ生成装置１００は、対象画像に含まれる所定の範囲の画像（例えば、ブロック画像、行画像または文字画像）における第１のテキスト情報に対応する第１の正確性指標のうちの少なくとも一つ（または平均値）が閾値よりも低い場合、当該所定の範囲の画像に対して正確に文字認識できていないと判定してもよい。 In other words, if at least one (or the average value) of the first accuracy indices corresponding to the first text information in a predetermined range of images (e.g., block images, line images, or character images) included in the target image is lower than a threshold value, the electronic data generation device 100 may determine that accurate character recognition has not been performed for the image in the predetermined range.
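
The first determination can be sketched as a single comparison against the threshold. This is an illustrative sketch only: the function name `needs_second_pass`, its parameters, and the treatment of an empty range are assumptions, not part of the described device.

```python
from statistics import mean


def needs_second_pass(accuracies, threshold=0.6, use_average=False):
    """Return True when a range image should be sent for a second recognition.

    accuracies: first accuracy indices of the characters in one range image.
    use_average: compare the average instead of the lowest index.
    """
    if not accuracies:
        return False  # assumption: an empty range never needs a second pass
    # The "first reference index" is either the lowest index or the average.
    reference_index = mean(accuracies) if use_average else min(accuracies)
    return reference_index < threshold
```

With the indices from the example above (0.99, 0.99, 0.55) and a threshold of 0.6, the lowest-index rule flags the line, whereas the average (about 0.84) would not, so the two variants can disagree on the same line.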

特定範囲特定部１０５は、第１の判定部１０４における判定結果（以下、「第１の判定結果」という。）に基づき、第１の基準よりも低いと判定された第１の基準指標に対応する文字を含む、対象画像の一部の範囲の画像である特定範囲画像を特定する。この場合、特定範囲特定部１０５は、第１の光学式文字認識装置２００から取得される対象画像に含まれる文字画像のそれぞれの座標を特定し、当該座標に基づき、第１の基準指標に対応する文字を含む特定範囲画像（例えば行画像）を特定する。 Based on the judgment result of the first judgment unit 104 (hereinafter referred to as the "first judgment result"), the specific range identification unit 105 identifies a specific range image, which is an image of a partial range of the target image, including characters corresponding to the first reference indicator that has been judged to be lower than the first standard. In this case, the specific range identification unit 105 identifies the coordinates of each character image included in the target image acquired from the first optical character recognition device 200, and identifies a specific range image (e.g., a line image) including characters corresponding to the first reference indicator based on the coordinates.
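
One simple way to derive a line image's range from per-character coordinates is to take the union of the character boxes. The sketch below assumes each coordinate is a `(left, top, right, bottom)` tuple and that the caller already knows which characters belong to the same line; neither detail is specified in the text.

```python
def line_box(char_boxes):
    """Return the smallest box enclosing all character boxes of one line.

    char_boxes: iterable of (left, top, right, bottom) tuples (assumed layout).
    """
    lefts, tops, rights, bottoms = zip(*char_boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))
```

The resulting upper-left and lower-right corners can then be used to crop the specific range image out of the target image.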

ここで、行画像を特定する処理（以下、「行特定処理」という。）の一例の概要について説明する。行特定処理では、まず、対象画像の左端の黒のドット画像を特定する。次に、行特定処理では、特定したドット画像から水平方向で右に向かって、高さ方向の所定の幅で黒ドットを特定しつつヒストグラム（例えば横軸が対象画像の左端からの距離、縦軸がドット画像の個数）を生成する。次に、水平に対して角度をずらして所定の幅で同様に黒のドット画像を特定しつつヒストグラムを生成する。そして、行特定処理では、ヒストグラムに基づき、行画像の左上のドットの座標と右下のドットの座標を特定することにより、行画像の範囲の座標を特定する。これにより、複数の文字を含む所定のまとまりの画像を適切に特定することが可能となる。 Here, an overview of an example of a process for identifying a line image (hereinafter referred to as "line identification process") will be described. In the line identification process, first, a black dot image at the left edge of the target image is identified. Next, in the line identification process, a histogram (e.g., the horizontal axis is the distance from the left edge of the target image and the vertical axis is the number of dot images) is generated while identifying black dots in a predetermined width in the height direction from the identified dot image to the right in the horizontal direction. Next, a histogram is generated while similarly identifying black dot images in a predetermined width at an angle shifted from the horizontal. Then, in the line identification process, the coordinates of the upper left dot and the lower right dot are identified based on the histogram, thereby identifying the coordinates of the range of the line image. This makes it possible to appropriately identify a predetermined group of images containing multiple characters.

なお、ヒストグラムにおいて、黒のドット画像が特定される第１の距離範囲と、黒のドット画像が特定される第２の距離範囲とが所定の距離を超える場合、第１の距離範囲の黒のドット画像が特定される範囲を第１の行画像として特定し、第２の距離範囲の黒のドット画像が特定される範囲を第１の行画像とは異なる第２の行画像として特定する。これにより、例えば同じ行ではあるものの、異なるブロック画像に含まれる行画像を異なる行として特定することが可能となる。 In addition, in the histogram, if the first distance range in which black dot images are identified and the second distance range in which black dot images are identified exceed a predetermined distance, the range in which black dot images are identified in the first distance range is identified as a first row image, and the range in which black dot images are identified in the second distance range is identified as a second row image that is different from the first row image. This makes it possible to identify row images that are in the same row but included in different block images as different rows.
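
The gap rule above can be sketched as a scan over the histogram. This is a simplified illustration that assumes the histogram is a list of black-dot counts indexed by distance from the left edge; it ignores the angle-shifted scans, and the function name and `max_gap` parameter are hypothetical.

```python
def split_row_into_line_ranges(column_counts, max_gap):
    """Split one text row into ranges separated by wide blank gaps.

    column_counts[x] is the number of black dots at distance x from the
    left edge. Returns inclusive (start, end) ranges; two inked ranges are
    treated as separate line images when the blank run between them is
    longer than max_gap.
    """
    ranges = []
    start = None     # first inked column of the current range
    last_ink = None  # most recent inked column seen so far
    for x, count in enumerate(column_counts):
        if count == 0:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 > max_gap:
            # The blank run exceeds the predetermined distance: close the range.
            ranges.append((start, last_ink))
            start = x
        last_ink = x
    if start is not None:
        ranges.append((start, last_ink))
    return ranges
```

Narrow gaps such as inter-character spacing stay inside one range, while a wide gap between two blocks on the same row yields two line images.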

情報送信部１０６は、特定された特定範囲画像（例えば行画像）を第２の光学式文字認識装置３００に送信する。 The information transmission unit 106 transmits the identified specific range image (e.g., a line image) to the second optical character recognition device 300.

第２の情報取得部１０７は、特定範囲画像に含まれる文字についての第２の生成情報を第２の光学式文字認識装置３００から取得する。具体的には、電子データ生成装置１００は、例えば、特定範囲画像である行画像（例えば「ＡＢＣ」）に含まれる文字画像（例えば「Ａ」「Ｂ」「Ｃ」）のそれぞれについての第２のテキスト情報（座標を含む）および第２の正確性指標（例えば「Ａ：０．９９」「Ｂ：０．９８」「Ｃ：０．９９」）を、第２の光学式文字認識装置３００から取得する。 The second information acquisition unit 107 acquires second generation information about the characters included in the specific range image from the second optical character recognition device 300. Specifically, the electronic data generation device 100 acquires from the second optical character recognition device 300, for example, second text information (including coordinates) and second accuracy indices (for example, "A: 0.99", "B: 0.98", "C: 0.99") for each of the character images (for example, "A", "B", "C") included in a line image (for example, "ABC") that is a specific range image.

第２の判定部１０８は、第２の基準指標と第２の基準との大小関係を判定する。具体的には、第２の判定部１０８は、例えば、特定範囲画像に含まれる文字ついての第２の正確性指標の全てが所定の閾値以上であるか否かを判定する。例えば、第２の判定部１０８は、所定の閾値が「０．６」であり、特定範囲画像（例えば行画像）が「ＡＢＣ」である場合、第２のテキスト情報である「Ａ」，「Ｂ」，「Ｃ」のそれぞれの第２の正確性指標が「０．９９」，「０．９９」，「０．９８」であるとすると、第２の基準指標が第２の基準以上であると判定する。なお、第２の判定部１０８は、特定範囲画像に含まれる文字のそれぞれの第２の正確性指標の平均値が所定の閾値以上であるか否かを判定してもよい。 The second determination unit 108 determines whether the second reference index is greater than or equal to the second reference. Specifically, the second determination unit 108 determines, for example, whether all of the second accuracy indices for the characters included in the specific range image are equal to or greater than a predetermined threshold. For example, when the predetermined threshold is "0.6" and the specific range image (e.g., a line image) is "ABC", if the second accuracy indices for the second text information "A", "B", and "C" are "0.99", "0.99", and "0.98", respectively, the second determination unit 108 determines that the second reference index is equal to or greater than the second reference. The second determination unit 108 may also determine whether the average value of the second accuracy indices for the characters included in the specific range image is equal to or greater than a predetermined threshold.

第２の判定部１０８は、特定範囲画像についての第２の正確性指標のうちの一つでも所定の閾値よりも低い場合、第２の基準指標が第２の基準よりも低いと判定してもよい。例えば、第２の判定部１０８は、特定範囲画像（例えば行画像）が「ＡＢＣ」である場合、第２のテキスト情報である「Ａ」，「Ｂ」，「Ｃ」のそれぞれの第２の正確性指標が「０．９９」，「０．９９」，「０．５０」であるとすると、第２の基準指標が第２の基準よりも低いと判定する。 The second determination unit 108 may determine that the second reference index is lower than the second reference when even one of the second accuracy indices for the specific range image is lower than a predetermined threshold. For example, when the specific range image (e.g., a line image) is "ABC" and the second accuracy indices of the second text information "A", "B", and "C" are "0.99", "0.99", and "0.50", respectively, the second determination unit 108 determines that the second reference index is lower than the second reference.

このように、電子データ生成装置１００は、例えば、特定範囲画像についての第２の正確性指標の全てが所定の閾値を超える場合に、第２のテキスト情報が第１のテキスト情報よりも対象画像についての文字画像に対して正確に文字を認識できていると判定する。 In this way, the electronic data generation device 100 determines that the second text information is able to recognize characters more accurately for the character image in the target image than the first text information, for example, when all of the second accuracy indices for the specific range image exceed a predetermined threshold value.

文字特定部１０９は、第２の判定部１０８における判定結果（以下、「第２の判定結果」という。）に基づき、特定範囲画像（すなわち対象画像）に含まれる文字を特定する。具体的には、文字特定部１０９は、第２の基準指標が第２の基準以上である場合、第２の正確性指標に対応する第２のテキスト情報が示す文字を特定範囲画像に含まれる文字として特定する。 The character identification unit 109 identifies characters included in the specific range image (i.e., the target image) based on the judgment result (hereinafter referred to as the "second judgment result") in the second judgment unit 108. Specifically, when the second reference index is equal to or greater than the second reference, the character identification unit 109 identifies characters indicated by the second text information corresponding to the second accuracy index as characters included in the specific range image.

例えば、文字特定部１０９は、第２のテキスト情報である「Ａ」，「Ｂ」，「Ｃ」のそれぞれの第２の正確性指標（第２の基準指標）が「０．９９」，「０．９９」，「０．９８」である場合、第２の基準指標が第２の基準（例えば閾値「０．６」）以上であるため、第２のテキスト情報が示す「ＡＢＣ」を特定範囲画像の文字列として特定する。 For example, if the second accuracy index (second reference index) of the second text information "A", "B", and "C" are "0.99", "0.99", and "0.98", respectively, the character identification unit 109 identifies "ABC" indicated by the second text information as a character string in the specific range image because the second reference index is equal to or greater than the second reference (e.g., a threshold value of "0.6").

これにより、第１の光学式文字認識装置２００による一度目の文字認識において正確性が低い文字を含む所定の範囲の文字列について、第２の光学式文字認識装置３００による二度目の文字認識において文字列を適切に認識することが可能となる。 This makes it possible for a character string within a certain range, including characters that are recognized with low accuracy the first time by the first optical character recognition device 200, to be properly recognized the second time by the second optical character recognition device 300.

一方、文字特定部１０９は、第２の基準指標が第２の基準よりも低い場合、第１の正確性指標と第２の正確性指標との大小関係を判定した結果に基づき、第１の正確性指標に対応する第１のテキスト情報が示す文字、または第２の正確性指標に対応する第２のテキスト情報が示す文字のいずれかを、特定範囲画像の文字として特定する。 On the other hand, when the second reference index is lower than the second reference, the character identification unit 109 identifies, based on the result of determining the magnitude relationship between the first accuracy index and the second accuracy index, either the character indicated by the first text information corresponding to the first accuracy index or the character indicated by the second text information corresponding to the second accuracy index as a character in the specific range image.

具体的には、文字特定部１０９は、第２の光学式文字認識装置３００から取得される特定範囲画像に含まれる文字画像のそれぞれの第２の正確性指標の最低値と、第１の光学式文字認識装置２００から取得される当該特定範囲画像に対応する画像の第１のテキスト情報に対応する第１の正確性指標のうちの最低値と、のうちの高い値を示す最低値を特定する。文字特定部１０９は、特定した最低値を示す正確性指標に対応するテキスト情報（第１のテキスト情報または第２のテキスト情報）が示す文字を特定範囲画像に含まれる文字として特定する。 Specifically, the character identification unit 109 compares two minimum values: the lowest of the second accuracy indices of the character images included in the specific range image, acquired from the second optical character recognition device 300, and the lowest of the first accuracy indices corresponding to the first text information for the same range of the target image, acquired from the first optical character recognition device 200. It identifies whichever of the two minimum values is higher. The character identification unit 109 then identifies the characters indicated by the text information (the first text information or the second text information) to which that higher minimum value belongs as the characters included in the specific range image.

例えば、文字特定部１０９は、行画像「ＡＢＣ」について、第１のテキスト情報である「Ａ」，「Ｆ」，「Ｄ」（ここでは、文字画像「Ｂ」を「Ｆ」と誤認識し、「Ｃ」を「Ｄ」と誤認識）における第１の正確性指標が「０．９９」，「０．４０」，「０．５５」であり、第２のテキスト情報である「Ａ」，「Ｂ」，「Ｅ」（文字画像「Ｃ」を「Ｅ」と誤認識）のそれぞれの第２の正確性指標が「０．９９」，「０．９９」，「０．５０」である場合、第２の正確性指標の最低値「０．５０」が第１の正確性指標の最低値「０．４０」よりも高い値を示すことを特定する。この場合、文字特定部１０９は、第２のテキスト情報が示す「ＡＢＥ」を特定範囲画像の文字列として特定する。 For example, for the line image "ABC", suppose that the first text information is "A", "F", and "D" (here, the character image "B" is misrecognized as "F" and "C" is misrecognized as "D") with first accuracy indices of "0.99", "0.40", and "0.55", and that the second text information is "A", "B", and "E" (the character image "C" is misrecognized as "E") with second accuracy indices of "0.99", "0.99", and "0.50", respectively. The character identification unit 109 then determines that the minimum value of the second accuracy indices, "0.50", is higher than the minimum value of the first accuracy indices, "0.40". In this case, the character identification unit 109 identifies "ABE", indicated by the second text information, as the character string of the specific range image.

これにより、第１の光学式文字認識装置２００による一度目の文字認識において正確性が低い文字を含む所定の範囲の文字列と、第２の光学式文字認識装置３００による二度目の文字認識において文字列とのうち、より正確に認識されたと推定される文字列を採用することが可能となる。 This makes it possible to adopt a character string that is estimated to have been recognized more accurately from a predetermined range of character strings including characters with low accuracy in the first character recognition by the first optical character recognition device 200 and a character string in the second character recognition by the second optical character recognition device 300.

なお、上記において、第２の正確性指標の最低値と第１の正確性指標のうちの最低値とのうちの高い値を示す最低値を特定するとして説明したが、これに限定されない。例えば、文字特定部１０９は、最低値に替えて平均値を用いてもよく、この場合、高い平均値を示す正確性指標（第２の基準指標）に対応するテキスト情報が示す文字を特定範囲画像の文字として特定してもよい。例えば、電子データ生成装置１００は、最低値のうちの高い値を示す最低値を特定する処理を実行することによりユーザによる修正の手間を縮減でき、一方、平均値のうちの高い平均値を特定する処理を実行することによりテキストが全体的に程よくまとまっていればよいようなテキストを採用することができるためユーザの修正の手間を縮減できる。 The above describes identifying whichever is higher of the minimum value of the second accuracy indices and the minimum value of the first accuracy indices, but the selection is not limited to this. For example, the character identification unit 109 may use average values instead of minimum values. In this case, it may identify the characters indicated by the text information whose accuracy indices (second reference index) show the higher average value as the characters of the specific range image. Selecting by the higher minimum value reduces the user's correction effort. Selecting by the higher average value adopts text that only needs to be reasonably consistent overall, which likewise reduces the user's correction effort.
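
The selection between the first and second text information, in both the minimum-value and average-value variants, can be sketched as follows. The function name `choose_text`, the `(character, accuracy)` pair layout, and the tie-breaking rule are assumptions for illustration; the text does not say what happens when the two values are equal.

```python
from statistics import mean


def choose_text(first, second, threshold=0.6, aggregate=min):
    """Pick the text for one range image.

    first, second: lists of (character, accuracy index) pairs from the
    first and second recognition of the same range image.
    aggregate: min for the lowest-value rule, statistics.mean for the
    average-value variant.
    """
    first_score = aggregate([a for _, a in first])
    second_score = aggregate([a for _, a in second])
    if second_score >= threshold:
        chosen = second   # second result meets the second reference
    elif second_score > first_score:
        chosen = second   # both are weak; the second is the less weak one
    else:
        chosen = first    # assumption: ties keep the first result
    return "".join(c for c, _ in chosen)
```

For the "ABC" example above, `choose_text` with the default `aggregate=min` compares 0.50 against 0.40 and returns "ABE"; passing `aggregate=mean` switches to the average-value variant.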

以上のように、電子データ生成装置１００は、対象画像の所定の範囲（例えば行単位）の画像に対する第１の光学式文字認識装置２００による文字認識の正確性が低い場合に、第１の光学式文字認識装置２００とは異なる第２の光学式文字認識装置３００によって当該所定の範囲の画像を文字認識した結果を取得する。すなわち、電子データ生成装置１００では、二つの異なる光学式文字認識装置で文字認識することによって、文字認識の正確性が低い画像について正確性の高い文字認識が実現可能となる。 As described above, when the accuracy of character recognition by the first optical character recognition device 200 for an image in a predetermined range (e.g., line units) of the target image is low, the electronic data generation device 100 obtains the result of character recognition of the image in the predetermined range by the second optical character recognition device 300, which is different from the first optical character recognition device 200. In other words, by performing character recognition using two different optical character recognition devices, the electronic data generation device 100 can achieve highly accurate character recognition for images with low character recognition accuracy.

さらに述べると、電子データ生成システム１０では、例えば、文字認識を実行するための費用が安く、文字認識の精度が低い第１の光学式文字認識装置２００によって広範囲（例えば一頁単位）の文字認識を実行し、文字認識を実行するための費用が第１の光学式文字認識装置２００よりも高く、文字認識の精度が第１の光学式文字認識装置２００よりも高い（例えば行単位での文字認識の精度が高い）第２の光学式文字認識装置３００によって、より狭い範囲の文字認識を実行することが望ましい。これにより、電子データ生成システム１０では、文字認識の正確性を高めるとともに、文字認識にかかる費用を低減することが可能となる。 More specifically, in the electronic data generation system 10, it is desirable to perform character recognition over a wide range (e.g., on a page-by-page basis) using the first optical character recognition device 200, which has a low cost of performing character recognition and a low character recognition accuracy, and to perform character recognition over a narrower range using the second optical character recognition device 300, which has a higher cost of performing character recognition than the first optical character recognition device 200 and a higher character recognition accuracy than the first optical character recognition device 200 (e.g., has a higher accuracy of character recognition on a line-by-line basis). This makes it possible for the electronic data generation system 10 to increase the accuracy of character recognition while reducing the cost of character recognition.
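
The cost argument above rests on sending only low-accuracy lines to the costlier device. The sketch below illustrates that routing under stated assumptions: `first_pass` stands for the page-wide result of the first device already grouped by line, `second_ocr` is a hypothetical callable standing in for the second device, and line IDs are plain strings.

```python
def recognize_page(first_pass, second_ocr, threshold=0.6):
    """Return (text per line, number of calls made to the second device).

    first_pass: dict mapping a line ID to a list of (character, accuracy)
                pairs produced by the first device for the whole page.
    second_ocr: callable taking a line ID and returning the same kind of
                list for that single line image (hypothetical interface).
    """
    results = {}
    second_calls = 0
    for line_id, chars in first_pass.items():
        first_min = min(a for _, a in chars)
        if first_min < threshold:
            # Only lines that fail the first determination incur the extra cost.
            second_calls += 1
            retry = second_ocr(line_id)
            # Because first_min is below the threshold, "higher minimum wins"
            # also covers the case where the retry meets the threshold.
            if min(a for _, a in retry) > first_min:
                chars = retry
        results[line_id] = "".join(c for c, _ in chars)
    return results, second_calls
```

The returned call count makes the cost of the second device visible, so a caller can check how many lines actually needed the more expensive recognition.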

表示処理部１１０は、対象画像と文字特定部１０９で特定されたテキスト情報（文字認識結果）とを関連づけて画面内に表示させる。以下、図４を参照して、画面例について説明する。図４は、表示部に表示される画面例を示す図である。 The display processing unit 110 associates the target image with the text information (character recognition result) identified by the character identification unit 109 and displays them on the screen. An example screen will be described below with reference to FIG. 4. FIG. 4 is a diagram showing an example screen displayed on the display unit.

図４に示すように、画面Ｔ１０は、第１の表示領域Ｔ１１と、第２の表示領域Ｔ１２とを含む。第１の表示領域Ｔ１１は対象画像が表示される領域である。第１の表示領域Ｔ１１は例えば画面の一方側の半分の領域である。第２の表示領域Ｔ１２は対象画像における第１のテキスト情報および特定範囲画像における第２のテキスト情報（図４では行情報）が表示される領域である。第２の表示領域Ｔ１２は例えば画面の他方側の半分の領域である。 As shown in FIG. 4, the screen T10 includes a first display area T11 and a second display area T12. The first display area T11 is an area in which the target image is displayed. The first display area T11 is, for example, half of one side of the screen. The second display area T12 is an area in which the first text information in the target image and the second text information (line information in FIG. 4) in the specific range image are displayed. The second display area T12 is, for example, half of the other side of the screen.

表示処理部１１０は、例えば、第１の表示領域Ｔ１１に表示される対象画像に含まれる文字のうち、第２の表示領域に表示される行情報が示す文字を識別可能に表示する。具体的には、図４に示すように、表示処理部１１０は、例えば、対象画像の行画像を識別可能なオブジェクトＯＴ１を表示させ、当該行画像と対応する行情報にオブジェクトＯＴ２を表示させる。例えばオブジェクトＯＴ１の表示色はオブジェクトＯＴ２の表示色と同じ色である。これにより、電子データ生成装置１００は、対象画像の所定の範囲を文字認識した結果である第１のテキスト情報および第２のテキスト情報と、対象画像との対応関係を、ユーザに対して提供することができるため、ユーザにおいて対象画像に対する誤認識などを容易に把握可能とさせる。 The display processing unit 110, for example, identifiably displays characters indicated by line information displayed in the second display area among characters included in the target image displayed in the first display area T11. Specifically, as shown in FIG. 4, the display processing unit 110, for example, displays an object OT1 that can identify a line image of the target image, and displays an object OT2 in the line information corresponding to the line image. For example, the display color of the object OT1 is the same color as the display color of the object OT2. This allows the electronic data generation device 100 to provide the user with the correspondence between the target image and the first text information and the second text information, which are the results of character recognition of a predetermined range of the target image, so that the user can easily understand misrecognition of the target image, etc.

＜＜変形例＞＞ <<Modifications>>
文字特定部１０９は、第２の光学式文字認識装置３００に特定範囲画像を入力した回数である入力回数に基づき、第１のテキスト情報が示す文字または第２のテキスト情報が示す文字のいずれかを、対象画像に含まれる文字として特定してもよい。具体的には、文字特定部１０９は、例えば、第２の光学式文字認識装置３００に特定範囲画像（例えば行画像）を入力した入力回数が予め定められた回数を超えた場合、第１の正確性指標に対応する第１のテキスト情報が示す文字を対象画像に含まれる文字として特定する。これにより、電子データ生成システム１０は、例えば、第１の光学式文字認識装置２００による文字認識の処理にかかる費用よりも、第２の光学式文字認識装置３００による文字認識の処理にかかる費用の方が高いような場合、一定の費用を超えるような場合は、より費用が低い光学式文字認識装置を用いて文字認識を実行することにより、費用縮減を実現できる。 The character identification unit 109 may identify either the character indicated by the first text information or the character indicated by the second text information as a character included in the target image based on the number of inputs, that is, the number of times a specific range image has been input to the second optical character recognition device 300. Specifically, when the number of times a specific range image (e.g., a line image) has been input to the second optical character recognition device 300 exceeds a predetermined number, the character identification unit 109 identifies the character indicated by the first text information corresponding to the first accuracy index as a character included in the target image. As a result, when character recognition by the second optical character recognition device 300 costs more than character recognition by the first optical character recognition device 200, the electronic data generation system 10 can reduce costs by relying on the lower-cost optical character recognition device once a certain cost would be exceeded.

この場合、表示処理部１１０は、第２の光学式文字認識装置３００に特定範囲画像を入力した回数である入力回数を画面Ｔ１０の所定の位置に表示させてもよい。具体的には、表示処理部１１０は、第２の表示領域の所定の位置に表示されてもよい。さらに言うと、図４に示すように、表示処理部１１０は、入力回数を超えた時点以降に第２の光学式文字認識装置３００に入力する対象となった行画像（図４では「サーバ」）に対応する第２のテキスト情報に対して、入力回数を関連づけて表示させてもよい（図４の「５回」）。これにより、電子データ生成システム１０は、第２の光学式文字認識装置３００による文字認識が回数制限により実行できなかった特定範囲画像について、ユーザにおいて容易に把握可能とさせる。 In this case, the display processing unit 110 may display the number of inputs, that is, the number of times a specific range image has been input to the second optical character recognition device 300, at a predetermined position on the screen T10. Specifically, the display processing unit 110 may display the number of inputs at a predetermined position in the second display area. Furthermore, as shown in FIG. 4, the display processing unit 110 may display the number of inputs ("5 times" in FIG. 4) in association with the second text information corresponding to a line image ("server" in FIG. 4) that became a candidate for input to the second optical character recognition device 300 after the predetermined number of inputs was exceeded. This allows the user to easily identify the specific range images for which character recognition by the second optical character recognition device 300 could not be performed because of the limit on the number of inputs.
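
The input-count limit can be sketched as a guard placed in front of the second recognition. This is one possible reading, assuming the limit is checked before each input so that the count never goes past `max_inputs`; the function name, parameters, and the `run_second_ocr` callable are hypothetical, and the accuracy comparison between the two results is omitted for brevity.

```python
def pick_with_input_limit(first_text, input_count, max_inputs, run_second_ocr):
    """Return (text, updated input count) for one range image.

    first_text:     text from the first recognition of this range image.
    input_count:    how many range images were already sent to the second device.
    max_inputs:     predetermined limit on inputs to the second device.
    run_second_ocr: zero-argument callable returning the second device's text.
    """
    if input_count >= max_inputs:
        # Limit reached: keep the first result and do not call the second device.
        return first_text, input_count
    return run_second_ocr(), input_count + 1
```

The returned count is the value that could be shown on the screen next to lines that fell back to the first result.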

＝＝＝第１の光学式文字認識装置２００＝＝＝ ===First Optical Character Recognition Device 200===
図１に戻り、第１の光学式文字認識装置２００の構成について説明する。第１の光学式文字認識装置２００は、例えば、対象画像が入力された場合、対象画像に含まれる文字を認識して、例えば認識した文字ごとに、第１のテキスト情報、第１の正確性指標および座標（第１の生成情報）を生成する装置である。 Returning to FIG. 1, the configuration of the first optical character recognition device 200 will be described. The first optical character recognition device 200 is a device that, for example, when a target image is input, recognizes characters included in the target image and generates, for example, first text information, a first accuracy index, and coordinates (first generation information) for each recognized character.

図１に示すように、第１の光学式文字認識装置２００は、例えば、記憶部２１０と、送受信部２２０と、処理部２３０とを備える。記憶部２１０は各種情報を記憶する。処理部２３０は文字認識するための処理を実行する。送受信部２２０は、電子データ生成装置１００との間で各種情報を送受信する。処理部２３０は、例えば文字を区別するように学習されたニューラル・ネットワークを使用して画像を分析する。ニューラル・ネットワークは、例えば複数の畳み込みネットワーク層及び再帰型ネットワーク層を備える。処理部２３０は、例えば対象画像についてページ、ブロック、行または文字ごとにセグメント化する。セグメント化した画像に含まれる文字について文字認識を実行することで、例えば文字ごとに第１の生成情報を生成する。処理部２３０は、例えばセグメント化した画像（例えば対象画像、ブロック画像または行画像）を一つのまとまりとしての第１の生成情報を生成してもよい。 As shown in FIG. 1, the first optical character recognition device 200 includes, for example, a storage unit 210, a transmission/reception unit 220, and a processing unit 230. The storage unit 210 stores various information. The processing unit 230 executes processing for character recognition. The transmission/reception unit 220 transmits and receives various information to and from the electronic data generation device 100. The processing unit 230 analyzes an image using, for example, a neural network trained to distinguish between characters. The neural network includes, for example, a plurality of convolutional network layers and a recurrent network layer. The processing unit 230 segments, for example, a target image into pages, blocks, lines, or characters. Character recognition is performed on the characters included in the segmented image to generate, for example, first generated information for each character. The processing unit 230 may generate, for example, the first generated information of a segmented image (for example, a target image, a block image, or a line image) as a single group.

＝＝＝第２の光学式文字認識装置３００＝＝＝ ===Second Optical Character Recognition Device 300===
図１を参照して、第２の光学式文字認識装置３００の構成について説明する。第２の光学式文字認識装置３００は、例えば、特定範囲画像が入力された場合、特定範囲画像に含まれる文字を認識して、例えば認識した文字ごとに、第２のテキスト情報、第２の正確性指標および座標（第２の生成情報）を生成する装置である。 The configuration of the second optical character recognition device 300 will be described with reference to FIG. 1. The second optical character recognition device 300 is a device that, for example, when a specific range image is input, recognizes characters included in the specific range image and generates, for example, second text information, a second accuracy index, and coordinates (second generation information) for each recognized character.

第２の光学式文字認識装置３００は、例えば特定範囲画像が行画像である場合に、第１の光学式文字認識装置２００による文字識別の正確性よりも高い正確性を実現可能な装置であることが望ましい。この場合、電子データ生成システム１０では、第１の光学式文字認識装置２００における文字認識の正確性が低い行画像に対して、行画像に対する文字認識の正確性が高い第２の光学式文字認識装置３００を用いることにより、文字認識の正確性の向上を図ることが可能となる。 It is desirable that the second optical character recognition device 300 is a device capable of achieving higher accuracy in character identification than the first optical character recognition device 200, for example, when the specific range image is a line image. In this case, in the electronic data generation system 10, it is possible to improve the accuracy of character recognition by using the second optical character recognition device 300, which has high character recognition accuracy for line images, for line images for which the first optical character recognition device 200 has low character recognition accuracy.

図１に示すように、第２の光学式文字認識装置３００は、例えば、記憶部３１０と、送受信部３２０と、処理部３３０とを備える。記憶部３１０は各種情報を記憶する。処理部３３０は文字認識するための処理を実行する。送受信部３２０は、電子データ生成装置１００との間で各種情報を送受信する。処理部３３０は、例えば文字を区別するように学習されたニューラル・ネットワークを使用して画像を分析する。ニューラル・ネットワークは、例えば複数の畳み込みネットワーク層及び再帰型ネットワーク層を備える。処理部３３０は、第１の光学式文字認識装置２００の処理部２３０と同じであってもよいが、行画像に対する文字認識に特化した処理を実行する機能部であってもよい。この場合、処理部３３０は、例えば行画像について文字ごとにセグメント化する。そして、処理部３３０は、当該文字について文字認識を実行することにより、例えば文字ごとに第２の生成情報を生成する。 As shown in FIG. 1, the second optical character recognition device 300 includes, for example, a storage unit 310, a transmission/reception unit 320, and a processing unit 330. The storage unit 310 stores various information. The processing unit 330 executes processing for character recognition. The transmission/reception unit 320 transmits and receives various information to and from the electronic data generation device 100. The processing unit 330 analyzes an image using, for example, a neural network trained to distinguish between characters. The neural network includes, for example, a plurality of convolutional network layers and a recurrent network layer. The processing unit 330 may be the same as the processing unit 230 of the first optical character recognition device 200, or may be a functional unit that executes processing specialized for character recognition of line images. In this case, the processing unit 330 segments, for example, the line image for each character. Then, the processing unit 330 executes character recognition on the character to generate, for example, second generation information for each character.

＝＝＝ユーザ端末４００＝＝＝ ===User Terminal 400===
図１を参照して、ユーザ端末４００の構成について説明する。図１に示すように、ユーザ端末４００は、例えば、記憶部４１０と、送受信部４２０と、表示処理部４３０との機能部を含む。各機能部は、例えば、プロセッサ１００１がメモリ１００２に格納されているプログラムを読み出して実現される機能である。 The configuration of the user terminal 400 will be described with reference to FIG. 1. As shown in FIG. 1, the user terminal 400 includes functional units such as a storage unit 410, a transmission/reception unit 420, and a display processing unit 430. Each functional unit is a function realized by, for example, a processor 1001 reading out a program stored in a memory 1002.

記憶部４１０は、各種情報を記憶する。送受信部４２０は電子データ生成装置１００との間で各種情報を送受信する。送受信部４２０で取得された各種情報は記憶部４１０に記憶される。表示処理部４３０は電子データ生成装置１００から取得する画面Ｔ１０を表示部に表示させる。 The storage unit 410 stores various information. The transmission/reception unit 420 transmits and receives various information to and from the electronic data generation device 100. The various information acquired by the transmission/reception unit 420 is stored in the storage unit 410. The display processing unit 430 displays the screen T10 acquired from the electronic data generation device 100 on the display unit.

＝＝＝処理手順＝＝＝ ===Processing Procedure===
図５、図６を参照して、電子データ生成システム１０の処理手順について説明する。図５は、電子データ生成システム１０の処理手順を示すフローチャートである。図６は、一行の文字列のテキスト情報である行情報に対する正当性指標を示す表である。以下では、一例として、対象画像に含まれる一行の文字列である「１００ＢＡＳＥ－ＴＸスイッチ一式」に対する文字認識について説明する。 The processing procedure of the electronic data generation system 10 will be described with reference to FIG. 5 and FIG. 6. FIG. 5 is a flowchart showing the processing procedure of the electronic data generation system 10. FIG. 6 is a table showing validity indices for line information, which is the text information of a one-line character string. In the following, as an example, character recognition for "100BASE-TX switch set", a one-line character string included in a target image, will be described.

ステップＳ１００において、電子データ生成装置１００は、所定の装置から対象画像を取得する。電子データ生成装置１００は、対象画像を記憶部１０１に記憶する。電子データ生成装置１００は、第１の光学式文字認識装置２００に対象画像を送信する。 In step S100, the electronic data generation device 100 acquires a target image from a specific device. The electronic data generation device 100 stores the target image in the storage unit 101. The electronic data generation device 100 transmits the target image to the first optical character recognition device 200.

ステップＳ１０１において、第１の光学式文字認識装置２００は、対象画像をセグメント化して、対象画像に含まれる文字ごとの第１のテキスト情報、第１の正確性指標および座標を生成する。第１の光学式文字認識装置２００は、第１の生成情報を電子データ生成装置１００に送信する。 In step S101, the first optical character recognition device 200 segments the target image to generate first text information, a first accuracy indicator, and coordinates for each character contained in the target image. The first optical character recognition device 200 transmits the first generated information to the electronic data generation device 100.

ステップＳ１０２において、電子データ生成装置１００は、対象画像に関連づけて、文字ごとに第１の生成情報を対象画像情報Ｄ１０１ａに記憶する。 In step S102, the electronic data generation device 100 associates the first generation information for each character with the target image and stores it in the target image information D101a.

ステップＳ１０３において、電子データ生成装置１００は、対象画像に含まれる文字の第１の基準指標が第１の基準よりも低いか否かを判定する。 In step S103, the electronic data generating device 100 determines whether the first reference index of the character contained in the target image is lower than the first reference.

第１の基準指標が第１の基準以上と判定された場合（ステップＳ１０３：ＮＯ）、ステップＳ１０４において、電子データ生成装置１００は、第１のテキスト情報が示す文字を対象画像に含まれる文字として特定する。 If it is determined that the first reference indicator is equal to or greater than the first reference (step S103: NO), in step S104, the electronic data generation device 100 identifies the character indicated by the first text information as a character included in the target image.

第１の基準指標が第１の基準よりも低いと判定された場合（ステップＳ１０３：ＹＥＳ）、ステップＳ１０５において、電子データ生成装置１００は、対象画像情報Ｄ１０１ａを参照して、第１の基準よりも低いと判定された第１の基準指標に対応する第１のテキスト情報を含む行情報を特定する。具体的には、電子データ生成装置１００は、図６（ａ）に示す第１のテキスト情報および第１の正当性指標を特定する。 If it is determined that the first reference indicator is lower than the first reference (step S103: YES), in step S105, the electronic data generation device 100 refers to the target image information D101a and identifies line information including first text information corresponding to the first reference indicator determined to be lower than the first reference. Specifically, the electronic data generation device 100 identifies the first text information and the first validity indicator shown in FIG. 6(a).

In step S106, the electronic data generation device 100 identifies, from the target image, the line image (specific range image) corresponding to the line information, based on the coordinates included in the identified line information. The electronic data generation device 100 transmits the identified line image to the second optical character recognition device 300.
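
One way to derive the line image range in step S106 is to take the smallest rectangle that encloses the coordinates of every character in the line. The sketch below assumes that each character's coordinates are stored as a `(left, top, right, bottom)` tuple in pixels; the text does not specify the coordinate format, so this layout and the name `line_bounding_box` are assumptions.

```python
def line_bounding_box(char_boxes):
    """Return the smallest (left, top, right, bottom) box enclosing all boxes."""
    if not char_boxes:
        raise ValueError("line information contains no character coordinates")
    lefts, tops, rights, bottoms = zip(*char_boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


# Two adjacent characters on the same line (made-up coordinates).
box = line_bounding_box([(10, 20, 30, 40), (32, 18, 50, 41)])
```

A box in this form could then be passed to an image library's crop routine (for example, Pillow's `Image.crop` accepts a 4-tuple in this order) to produce the image sent to the second optical character recognition device 300.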

In step S107, the second optical character recognition device 300 segments the line image and generates second text information, a second accuracy index, and coordinates for each character contained in the line image. The second optical character recognition device 300 transmits the second generated information to the electronic data generation device 100.

In step S108, the electronic data generation device 100 stores the second generated information for each character in the specific range image information D101b in association with the specific range image.

In step S109, the electronic data generation device 100 determines the magnitude relationship between the second reference index for the specific range image and the second standard. Specifically, it determines whether all of the second accuracy indices (the second reference index) of the line information shown in FIG. 6(b) exceed a threshold (the second standard).

If the second reference index is determined to be equal to or higher than the second standard (step S109: YES), in step S110 the electronic data generation device 100 identifies the characters indicated by the second text information included in the line information as the characters included in the line image.

If the second reference index is determined to be lower than the second standard (step S109: NO), in step S111 the electronic data generation device 100 compares the first accuracy indices with the second accuracy indices for the line information corresponding to the specific range image. Specifically, the electronic data generation device 100 compares the smallest of the first accuracy indices shown in FIG. 6(a) ("0.32" in FIG. 6(a)) with the smallest of the second accuracy indices shown in FIG. 6(b) ("0.57" in FIG. 6(b)).

In step S112, if the smallest of the first accuracy indices is determined to be greater than the smallest of the second accuracy indices, the electronic data generation device 100 identifies the characters of the line information containing the first text information (the line information of FIG. 6(a)) as the characters included in the line image. Conversely, if the smallest of the second accuracy indices is determined to be greater than the smallest of the first accuracy indices, the electronic data generation device 100 identifies the characters of the line information containing the second text information (the line information of FIG. 6(b)) as the characters included in the line image.

Note that, in step S112, the electronic data generation device 100 may instead compare the average of the first accuracy indices ("Average value" in FIG. 6(a)) with the average of the second accuracy indices ("Average value" in FIG. 6(b)). In this case, the electronic data generation device 100 identifies the characters of the line information corresponding to the larger of the two averages (the line information of FIG. 6(b)) as the characters included in the line image.
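
The selection logic of steps S109 to S112, including the averaging variant, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `choose_line` and the `use_mean` flag are hypothetical, the second standard is applied with "equal to or higher than", and a tie between the two passes keeps the first-pass result because the text does not say how a tie is resolved.

```python
from statistics import mean


def choose_line(first, second, second_standard, use_mean=False):
    """Pick the recognition result for one line.

    first and second are (text, accuracy_list) pairs from the first and
    second OCR passes; both accuracy lists are assumed to be non-empty.
    """
    first_text, first_acc = first
    second_text, second_acc = second
    # S109/S110: accept the second pass outright when every character
    # meets the second standard.
    if all(a >= second_standard for a in second_acc):
        return second_text
    # S111/S112: otherwise compare the weakest characters (or the averages).
    score = mean if use_mean else min
    if score(second_acc) > score(first_acc):
        return second_text
    return first_text  # assumption: a tie keeps the first-pass result


# Only the minima 0.32 and 0.57 come from FIG. 6; the texts and the
# remaining accuracy values are made up for illustration.
first = ("1O0 yen", [0.98, 0.32, 0.91])
second = ("100 yen", [0.97, 0.57, 0.93])
chosen = choose_line(first, second, second_standard=0.90)
```

With these values the second pass does not clear the second standard (0.57 is below 0.90), so the minima are compared and the second-pass line is adopted because 0.57 is greater than 0.32.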

In step S113, the electronic data generation device 100 causes the display unit of the user terminal 400 to display the screen T10 shown in FIG. 4, on which the target image can be compared with the result of character recognition of the target image.

If multiple pieces of line information are identified in step S106, the electronic data generation system 10 repeats steps S106 to S112 once for each piece of line information identified.

In this way, in the electronic data generation system 10, for a line judged to have low recognition accuracy as a result of character recognition by the first optical character recognition device 200, which is better suited to page-by-page character recognition, it is desirable to perform character recognition with the second optical character recognition device 300, which is better suited to line-by-line character recognition. The electronic data generation system 10 then compares the character recognition result of the first optical character recognition device 200 with that of the second optical character recognition device 300 and adopts the result with the higher accuracy. In other words, by having two different optical character recognition devices perform character recognition on different ranges, the electronic data generation device 100 can achieve highly accurate character recognition even for images on which recognition accuracy is low.
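
The overall flow of steps S103 to S112, repeated once per low-accuracy line, can be sketched as follows. This is a simplified illustration rather than the embodiment itself: `run_two_pass` and `fake_second_ocr` are hypothetical names, the second optical character recognition device is replaced by a stand-in callable, the reference indices are assumed to be the per-line minima, and a tie keeps the first-pass result.

```python
def run_two_pass(first_lines, second_ocr, first_standard, second_standard):
    """Return the adopted text for every line of a page.

    first_lines: list of (text, accuracy_list) pairs from the first pass.
    second_ocr: callable taking a line index and returning (text, accuracy_list).
    """
    adopted = []
    for index, (text1, acc1) in enumerate(first_lines):
        if min(acc1) >= first_standard:
            adopted.append(text1)        # S104: keep the first pass
            continue
        text2, acc2 = second_ocr(index)  # S106 to S108: re-read this line only
        if min(acc2) >= second_standard:
            adopted.append(text2)        # S110
        elif min(acc2) > min(acc1):
            adopted.append(text2)        # S112: second pass is stronger
        else:
            adopted.append(text1)        # S112: first pass is at least as strong
    return adopted


calls = []


def fake_second_ocr(index):
    """Stand-in for the second OCR device; records which lines were sent."""
    calls.append(index)
    return ("100 yen", [0.97, 0.57, 0.93])


# Made-up page: line 0 is clean, line 1 contains a low-accuracy character.
page = [("TOTAL", [0.99, 0.97]), ("1O0 yen", [0.98, 0.32, 0.91])]
result = run_two_pass(page, fake_second_ocr, 0.90, 0.90)
```

Only the flagged line is sent to the second device, which reflects the division of labor described above: page-level recognition first, line-level recognition only where needed.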

===Hardware Configuration===
With reference to FIG. 7, an example of a hardware configuration for implementing the electronic data generation device 100, the first optical character recognition device 200, the second optical character recognition device 300, and the user terminal 400 on a computer will be described. FIG. 7 is a diagram showing an example of the hardware configuration of a computer.

As shown in FIG. 7, the computer 1000 includes a processor 1001, a memory 1002, a storage device 1003, an input I/F unit 1004, a data I/F unit 1005, a communication I/F unit 1006, and a display unit 1007.

The processor 1001 is a control unit that controls various processes in the computer 1000 by executing programs stored in the memory 1002.

The memory 1002 is a storage medium such as a RAM (Random Access Memory). The memory 1002 temporarily stores the program code of the programs executed by the processor 1001 and the data required while the programs run.

The storage device 1003 is a non-volatile storage medium such as a hard disk drive (HDD) or flash memory. The storage device 1003 stores an operating system and various programs for implementing the configurations described above.

The input I/F unit 1004 is a device for receiving input from a user. Specific examples of the input I/F unit 1004 include a keyboard, a mouse, a touch panel, various sensors, and a wearable device. The input I/F unit 1004 may be connected to the computer 1000 via an interface such as USB (Universal Serial Bus).

The data I/F unit 1005 is a device for inputting data from outside the computer 1000. A specific example of the data I/F unit 1005 is a drive device for reading data stored in various storage media. The data I/F unit 1005 may be provided outside the computer 1000. In that case, the data I/F unit 1005 is connected to the computer 1000 via an interface such as USB.

The communication I/F unit 1006 is a device for performing wired or wireless data communication with devices external to the computer 1000 via the Internet N. The communication I/F unit 1006 may be provided outside the computer 1000. In that case, the communication I/F unit 1006 is connected to the computer 1000 via an interface such as USB.

The display unit 1007 is a device for displaying various types of information. Specific examples of the display unit 1007 include a liquid crystal display, an organic EL (Electro-Luminescence) display, and the display of a wearable device. The display unit 1007 may be provided outside the computer 1000. In that case, the display unit 1007 is connected to the computer 1000 via, for example, a display cable. When a touch panel is adopted as the input I/F unit 1004, the display unit 1007 can be integrated with the input I/F unit 1004.

===Summary===
<1> The electronic data generation system 10 in this embodiment includes: a target image acquisition unit 102 that acquires a target image including characters from a predetermined device; a first information acquisition unit 103 that acquires, from a first optical character recognition device 200 that recognizes each of a plurality of characters included in the target image as first text information, the first text information for each of the plurality of characters and a first accuracy index indicating the degree of recognition accuracy for the character indicated by the first text information; a second information acquisition unit 107 that, when a first reference index based on at least one of the first accuracy indices for the plurality of characters is determined to be lower than a first standard regarding the accuracy of recognition of characters included in an image, acquires second text information about the characters included in a specific range image from a second optical character recognition device 300 different from the first optical character recognition device 200, the specific range image being an image of a partial range of the target image that includes the character corresponding to the first reference index determined to be lower than the first standard, and the second optical character recognition device 300 recognizing each of at least one character included in the specific range image as second text information; and a character identification unit 109 that identifies the characters included in the target image based on the first text information and the second text information. As a result, the electronic data generation system 10 can achieve highly accurate character recognition on an image for which recognition accuracy is low, by using a plurality of optical character recognition devices whose target ranges for character recognition differ (for example, the first optical character recognition device 200 performs character recognition page by page and the second optical character recognition device 300 performs character recognition line by line).

<2> In addition, the second information acquisition unit 107 of the electronic data generation system 10 in this embodiment acquires, from the second optical character recognition device 300, second text information for each of at least one character included in the specific range image and a second accuracy index indicating the degree of recognition accuracy for the character indicated by the second text information. The character identification unit 109 identifies the characters included in the specific range image based on the result of determining the magnitude relationship between a second reference index, which is based on at least one of the second accuracy indices for the at least one character included in the specific range image, and a second standard regarding the accuracy of recognition of characters included in an image. As a result, for a character string in a predetermined range that includes a character recognized with low accuracy in the first character recognition by the first optical character recognition device 200, the electronic data generation system 10 can recognize the character string appropriately in the second character recognition by the second optical character recognition device 300.

<3> In addition, when the second reference index for all characters included in the specific range image is determined to be equal to or higher than the second standard, the character identification unit 109 of the electronic data generation system 10 in this embodiment identifies the characters indicated by the second text information corresponding to the second accuracy indices as the characters included in the specific range image. As a result, for a character string in a predetermined range that includes a character recognized with low accuracy in the first character recognition by the first optical character recognition device 200, the electronic data generation system 10 can recognize the character string appropriately in the second character recognition by the second optical character recognition device 300.

<4> In addition, when the second reference index for at least one of the characters included in the specific range image is determined to be lower than the second standard, the character identification unit 109 of the electronic data generation system 10 in this embodiment identifies the characters included in the specific range image based on the result of determining the magnitude relationship between the first accuracy index of at least one character included in the range of the target image corresponding to the specific range image and the second accuracy index of at least one character included in the specific range image. As a result, of the character string obtained in the first character recognition by the first optical character recognition device 200, which includes a character recognized with low accuracy, and the character string obtained in the second character recognition by the second optical character recognition device 300, the electronic data generation system 10 adopts the one estimated to have been recognized more accurately, and thereby achieves more appropriate character recognition.

<5> In addition, the target image acquisition unit 102 of the electronic data generation system 10 in this embodiment acquires a target image that is an image of one page, and the information transmission unit 106 transmits, to the second optical character recognition device 300, a specific range image that is an image of one line of the text included in the target image. As a result, for a line image on which the first optical character recognition device 200 has low recognition accuracy, for example, the electronic data generation system 10 can improve recognition accuracy by using the second optical character recognition device 300, which has high recognition accuracy on line images.

<6> In addition, when inputting the specific range image to the second optical character recognition device 300 satisfies a condition regarding the input of specific range images to the second optical character recognition device 300, the character identification unit 109 of the electronic data generation system 10 in this embodiment identifies the character indicated by the first text information corresponding to the first accuracy index as the character included in the specific range image. As a result, in a case where, for example, character recognition by the second optical character recognition device 300 costs more than character recognition by the first optical character recognition device 200, the electronic data generation system 10 can reduce cost by performing character recognition with the less expensive optical character recognition device whenever a certain cost would otherwise be exceeded.
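
A minimal sketch of such an input condition is shown below, assuming the condition is a cap on the number of inputs to the second optical character recognition device, as in the claim that recites a predetermined number of times. The class `SecondPassLimiter` and the function `recognize_line` are hypothetical names, and the second device is represented by a stand-in callable.

```python
class SecondPassLimiter:
    """Counts inputs to the second OCR device and refuses further inputs
    once a predetermined number has been reached (hypothetical helper)."""

    def __init__(self, max_inputs):
        self.max_inputs = max_inputs
        self.input_count = 0  # value that could be shown on the screen

    def try_acquire(self):
        """Return True and count one input if the limit has not been reached."""
        if self.input_count >= self.max_inputs:
            return False
        self.input_count += 1
        return True


def recognize_line(first_text, second_ocr, limiter):
    """Use the second device while the limit allows, else keep the first text."""
    if limiter.try_acquire():
        return second_ocr()
    return first_text


limiter = SecondPassLimiter(max_inputs=2)
outputs = [recognize_line("first", lambda: "second", limiter) for _ in range(3)]
```

After the limit is reached, the third request falls back to the first text information without calling the second device, so the number of billable inputs stays bounded.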

<7> In addition, the electronic data generation system 10 in this embodiment further includes a display processing unit 110 that displays the target image in a first display area T11 of the screen T10 and displays the second text information indicating the characters included in the specific range image in a second display area T12 of the screen T10 that is different from the first display area T11. As a result, the electronic data generation system 10 can present to the user the correspondence between the target image and the first and second text information obtained by character recognition of a predetermined range of the target image, so that the user can easily notice misrecognition of the target image.

<8> In addition, the display processing unit 110 of the electronic data generation system 10 in this embodiment displays, in the second display area T12, the second text information indicating the characters included in the specific range image together with the first text information indicating the characters included in the target image other than those included in the specific range image, and displays the second text information in an identifiable manner. As a result, the electronic data generation system 10 can display the second text information obtained from the second optical character recognition device 300 so that the user can easily pick it out, which makes it easy for the user to grasp the degree of misrecognition of the target image.
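
As a rough sketch of such a display, the text for the second display area could be assembled line by line with a marker on the lines taken from the second pass. The text does not say how the second text information is made identifiable, so the `"* "` prefix, the function name `render_recognized_text`, and the sample lines are all assumptions made for illustration.

```python
def render_recognized_text(lines, marker="* "):
    """Join recognized lines into display text.

    lines: list of (text, from_second_pass) pairs in reading order.
    Lines adopted from the second OCR pass are prefixed with marker;
    other lines are indented by the same width so columns stay aligned.
    """
    padding = " " * len(marker)
    return "\n".join(
        (marker if from_second_pass else padding) + text
        for text, from_second_pass in lines
    )


display_text = render_recognized_text([("TOTAL", False), ("100 yen", True)])
```

In an actual screen the same flag could drive a highlight color or an underline instead of a text prefix.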

<9> In addition, the display processing unit 110 of the electronic data generation system 10 in this embodiment displays, in a predetermined display area of the screen, the input count, which is the number of times a specific range image has been input to the second optical character recognition device 300. As a result, the electronic data generation system 10 makes it easy for the user to see how many times character recognition has been performed by the second optical character recognition device 300.

10...electronic data generation system, 100...electronic data generation device, 101...storage unit, 102...target image acquisition unit, 103...first information acquisition unit, 104...first determination unit, 105...specific range identification unit, 106...information transmission unit, 107...second information acquisition unit, 108...second determination unit, 109...character identification unit, 110...display processing unit, 200...first optical character recognition device, 300...second optical character recognition device, 400...user terminal.

Claims

An information processing system comprising:
a target image acquisition unit that acquires a target image including characters from a predetermined device;
a first information acquisition unit that acquires, from a first optical character recognition device that recognizes each of a plurality of characters included in the target image as first text information, the first text information for each of the plurality of characters and a first accuracy index indicating a degree of recognition accuracy for the character indicated by the first text information;
a second information acquisition unit that, when a first reference index based on at least one of the first accuracy indices for the plurality of characters is determined to be lower than a first standard related to accuracy of recognition of characters included in an image, acquires, from a second optical character recognition device that is different from the first optical character recognition device and that recognizes each of at least one character included in a specific range image as second text information, the second text information for each of the at least one character included in the specific range image and a second accuracy index indicating a degree of recognition accuracy for the character indicated by the second text information, the specific range image being an image of a partial range of the target image that includes the character corresponding to the first reference index determined to be lower than the first standard; and
a character identification unit that identifies a character included in the specific range image based on a result of determining a magnitude relationship between a second reference index, which is based on at least one of the second accuracy indices for the at least one character included in the specific range image, and a second standard related to accuracy of recognition of characters included in an image,
wherein, when the second reference index for at least one of the characters included in the specific range image is determined to be lower than the second standard, the character identification unit identifies the character included in the specific range image based on a result of determining a magnitude relationship between the first accuracy index of at least one character included in a range of the target image corresponding to the specific range image and the second accuracy index of at least one character included in the specific range image.

An information processing system comprising:
a target image acquisition unit that acquires a target image including characters from a predetermined device;
a first information acquisition unit that acquires, from a first optical character recognition device that recognizes each of a plurality of characters included in the target image as first text information, the first text information for each of the plurality of characters and a first accuracy index indicating a degree of recognition accuracy for the character indicated by the first text information;
a second information acquisition unit that, when a first reference index based on at least one of the first accuracy indices for the plurality of characters is determined to be lower than a first standard related to accuracy of recognition of characters included in an image, acquires second text information about characters included in a specific range image from a second optical character recognition device that is different from the first optical character recognition device, that receives the specific range image as input, and that recognizes each of at least one character included in the specific range image as the second text information, the specific range image being an image of a partial range of the target image that includes the character corresponding to the first reference index determined to be lower than the first standard; and
a character identification unit that identifies characters included in the target image based on the first text information and the second text information,
wherein, when the number of times that the specific range image has been input to the second optical character recognition device exceeds a predetermined number of times, the character identification unit identifies the character indicated by the first text information corresponding to the first accuracy index as a character included in the specific range image.

The information processing system according to claim 1, wherein, when the second reference index for all characters included in the specific range image is determined to be equal to or higher than the second standard, the character identification unit identifies the character indicated by the second text information corresponding to the second accuracy index as a character included in the specific range image.

The information processing system according to claim 1, wherein
the target image acquisition unit acquires the target image, which is an image of one page, and
the second information acquisition unit acquires, from the second optical character recognition device, the second text information about characters included in the specific range image, which is an image of one line of text included in the target image.

The information processing system according to claim 1, further comprising a display processing unit that displays the target image in a first display area of a screen and displays the second text information indicating characters included in the specific range image in a second display area of the screen different from the first display area.

The information processing system according to claim 5, wherein the display processing unit
displays, in the second display area, the second text information indicating characters included in the specific range image and the first text information indicating characters included in the target image other than the characters included in the specific range image, and
displays the second text information in an identifiable manner.

The information processing system according to claim 5, wherein the display processing unit displays, in a predetermined display area of the screen, an input count, which is the number of times the specific range image has been input to the second optical character recognition device.

The computer
Acquiring a target image including text from a predetermined device;
acquiring, from a first optical character recognition device that recognizes each of a plurality of characters included in the target image as first text information, the first text information for each of the plurality of characters and a first accuracy index indicating a degree of recognition accuracy for the character indicated by the first text information;
When a first reference indicator based on at least one of the first accuracy indicators for the plurality of characters is determined to be lower than a first reference regarding accuracy of recognition of characters included in an image, a specific range image is an image of a partial range of the target image including a character corresponding to the first reference indicator determined to be lower than the first reference, and each of at least one character included in the specific range image is recognized as second text information. The specific range image is obtained from a second optical character recognition device different from the first optical character recognition device, and the second text information and a second accuracy indicator indicating a degree of accuracy of recognition for the character indicated by the second text information are obtained for each of at least one character included in the specific range image;
identifying a character included in the specific range image based on a result of determining whether a second reference indicator based on at least one of the second accuracy indicators for each of at least one character included in the specific range image is larger than a second reference indicator related to accuracy of recognition of the character included in the image;
when it is determined that the second reference index for at least one character among the characters included in the specific range image is lower than the second reference, identifying characters included in the specific range image based on a result of determining a magnitude relationship between the first accuracy index of at least one character included in a range corresponding to the specific range image in the target image and the second accuracy index of at least one character included in the specific range image;
An information processing method comprising performing the above steps.
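
The flow recited in the method above can be illustrated with a minimal sketch. This is only one possible reading, not the claimed implementation, and it rests on several assumptions that the claims do not fix: each character's own accuracy index is used as the reference index, the specific range is taken as the span from the first to the last low-accuracy character, the second recognition device returns exactly one result per character in that span, and the "magnitude relationship" is resolved by keeping whichever candidate has the higher accuracy index. The names `CharResult`, `identify_characters`, `second_ocr`, and the threshold values 0.8 and 0.9 are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class CharResult:
    """One recognized character with its accuracy index (0.0 to 1.0)."""
    text: str
    accuracy: float


# Hypothetical interface: the second OCR device is asked to re-read the
# characters at positions [start, end) of the target image.
SecondOcr = Callable[[int, int], List[CharResult]]


def identify_characters(
    first: List[CharResult],
    second_ocr: SecondOcr,
    first_ref: float = 0.8,   # assumed first reference
    second_ref: float = 0.9,  # assumed second reference
) -> str:
    chars = [c.text for c in first]

    # Positions whose first accuracy index falls below the first reference.
    low = [i for i, c in enumerate(first) if c.accuracy < first_ref]
    if not low:
        return "".join(chars)  # first OCR result is accepted as-is

    # Assumed "specific range": span covering all low-accuracy characters.
    start, end = low[0], low[-1] + 1
    second = second_ocr(start, end)
    if len(second) != end - start:
        raise ValueError("sketch assumes one second result per character")

    if min(c.accuracy for c in second) >= second_ref:
        # Second result meets the second reference: adopt it for the range.
        chosen = [c.text for c in second]
    else:
        # Otherwise compare the two accuracy indices character by character
        # and keep the candidate with the higher one (ties keep the first).
        chosen = [
            b.text if b.accuracy > a.accuracy else a.text
            for a, b in zip(first[start:end], second)
        ]

    chars[start:end] = chosen
    return "".join(chars)
```

For example, if the first device reads "TlS" with a low accuracy index on the middle character and the second device re-reads that character as "I" with an accuracy index of 0.97, the sketch returns "TIS".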

On the computer,
Acquiring a target image including text from a predetermined device;
acquiring, from a first optical character recognition device that recognizes each of a plurality of characters included in the target image as first text information, the first text information for each of the plurality of characters and a first accuracy index indicating a degree of recognition accuracy for the character indicated by the first text information;
when a first reference index based on at least one of the first accuracy indices for the plurality of characters is determined to be lower than a first reference regarding accuracy of recognition of characters included in an image, acquiring, from a second optical character recognition device that is different from the first optical character recognition device and that recognizes, as second text information, each of at least one character included in a specific range image, the specific range image being an image of a partial range of the target image that includes the character corresponding to the first reference index determined to be lower than the first reference, the second text information for each of the at least one character included in the specific range image and a second accuracy index indicating a degree of recognition accuracy for the character indicated by the second text information;
identifying a character included in the specific range image based on a result of determining whether a second reference index, which is based on at least one of the second accuracy indices for the at least one character included in the specific range image, is larger than a second reference regarding accuracy of recognition of characters included in an image;
when it is determined that the second reference index for at least one character among the characters included in the specific range image is lower than the second reference, identifying characters included in the specific range image based on a result of determining a magnitude relationship between the first accuracy index of at least one character included in a range corresponding to the specific range image in the target image and the second accuracy index of at least one character included in the specific range image;
A program that causes the computer to execute the above steps.