JP2011164924A

JP2011164924A - Program and information processing apparatus

Info

Publication number: JP2011164924A
Application number: JP2010026824A
Authority: JP
Inventors: Yohei Yamane; 洋平山根; Motoyuki Takaai; 基行鷹合; Hiroshi Masuichi; 博増市
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2010-02-09
Filing date: 2010-02-09
Publication date: 2011-08-25

Abstract

PROBLEM TO BE SOLVED: To display both a character string presumed to follow the character string before a designated position in a character string to be processed, and a character string presumed to connect in front of the character string after that designated position. SOLUTION: An input receiving part 120 of an information processing apparatus 10 receives a designated position in target text, which is the character string to be processed. A phrase DB 110 stores a plurality of phrases. A candidate phrase acquisition part 140 identifies, from the phrase DB 110, phrases that start with the character string before the designated position in the target text and phrases that end with the character string after the designated position. An output processing part 160 causes a display device to display those candidate phrases that differ from the target text. COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to a program and an information processing apparatus.

Techniques for supporting text input are known. For example, Patent Document 1 discloses a technique for predicting the character string that follows a character string, containing kanji and hiragana, entered by a user. In the technique described in Patent Document 1, a plurality of character strings composed of hiragana, kanji, or kanji mixed with hiragana are stored in a dictionary, and a character string extracted from the dictionary using the entered character string as a key is output as the predicted continuation of the entered character string.

Patent Document 2 discloses a technique for a pen-input computer in which one or more candidate words are retrieved from a dictionary storage means, which stores a plurality of words and a plurality of example sentences, on the basis of character strings that are being input and/or have already been confirmed. The candidate words are displayed on a display means, and text is entered by having the user select a desired word from the displayed candidates.

Patent Document 3 discloses a technique for accepting input of a sentence containing an ellipsis mark and completing the character string that corresponds to the ellipsis. In the technique described in Patent Document 3, candidates for the omitted word are extracted from the characters before and after the ellipsis mark in the input sentence by referring to a word dictionary that stores words and their usage frequencies, and one word is then selected from the extracted candidates on the basis of a transition dictionary that stores transition information and transition probabilities between words.

Patent Document 1: JP-A-8-255158
Patent Document 2: JP-A-10-154033
Patent Document 3: JP-A-2000-330984

When revising text that has already been entered, a user may want to enter a character string that follows the character string before a designated position in the text, or may want to enter a character string that connects in front of the character string after the designated position.

An object of the present invention is to provide a program and an information processing apparatus that display both a character string presumed to follow the character string before a designated position in a character string to be processed and a character string presumed to connect in front of the character string after the designated position.

The invention according to claim 1 is a program for causing a computer to execute: a reception step of accepting a designated position in a target character string, which is a character string to be processed; a specifying step of specifying, from a plurality of character strings stored in a storage means, a character string that starts with the character string before the designated position in the target character string and a character string that ends with the character string after the designated position in the target character string; and an output step of outputting display control information including an instruction to cause a display means to display those character strings specified in the specifying step that differ from the target character string.

In the invention according to claim 2, which depends on claim 1, the storage means further stores an importance for each of the plurality of character strings, obtained in advance based on the appearance frequency of each character string in a set of documents. The specifying step further specifies the importance of each character string that starts with the character string before the designated position in the target character string and of each character string that ends with the character string after the designated position. The display control information further includes an instruction to cause the display means to display the specified character strings that differ from the target character string in accordance with a rank of each character string determined using its importance.

In the invention according to claim 3, which depends on claim 1 or 2, the program further causes the computer to execute a deletion step of deleting, from each of the character strings specified in the specifying step, the character string before the designated position and the character string after the designated position. The display control information further includes an instruction to display the character strings resulting from the deletion step.

In the invention according to claim 4, which depends on claim 3, the display control information further includes an instruction to cause the display means to display, in mutually different modes, the character strings that result from applying the deletion step to each of the following: character strings that include both the character string before the designated position and the character string after the designated position, character strings that include only the character string before the designated position, and character strings that include only the character string after the designated position.

In the invention according to claim 5, which depends on any one of claims 2 to 4, the rank of each character string is determined based on the importance of each character string specified in the specifying step and according to which of the character strings before and after the designated position in the target character string the specified character string includes.

In the invention according to claim 6, which depends on any one of claims 2 to 5, the rank of each character string is determined further based on the length of each character string specified in the specifying step.

In the invention according to claim 7, which depends on any one of claims 1 to 6, the target character string and the plurality of character strings stored in the storage means are each a unit of character string delimited by preset delimiters in the document that contains it.

The invention according to claim 8 is an information processing apparatus comprising: a reception means for accepting a designated position in a target character string, which is a character string to be processed; a specifying means for specifying, from a plurality of character strings stored in a storage means, a character string that starts with the character string before the designated position in the target character string and a character string that ends with the character string after the designated position in the target character string; and an output means for outputting display control information including an instruction to cause a display means to display those character strings specified by the specifying means that differ from the target character string.

According to the invention of claim 1 or 8, both a character string presumed to follow the character string before a designated position in the character string to be processed and a character string presumed to connect in front of the character string after the designated position can be presented to the user.

According to the invention of claim 2, both kinds of character string can be presented to the user in accordance with a rank determined using an importance based on the appearance frequency of each character string in a set of documents.

According to the invention of claim 3, for each of the two kinds of character string, only the portion that does not overlap the target character string can be presented to the user.

According to the invention of claim 4, each of the two kinds of character string can be presented to the user in a mode that makes it possible to distinguish whether it was presumed to connect to the character string before the designated position or to the character string after it.

According to the invention of claim 5, the two kinds of character string can be ranked taking into account which of the character strings before and after the designated position each one includes.

According to the invention of claim 6, the two kinds of character string can be ranked taking into account the length of each character string.

According to the invention of claim 7, candidates for revising a character string to be processed that is one complete sentence or clause can be presented to the user.

A block diagram showing a schematic example of the internal configuration of the information processing apparatus.
A diagram showing an example of a display screen.
A diagram showing an example of candidate phrases acquired from the phrase DB 110.
A diagram showing another example of candidate phrases acquired from the phrase DB 110.
A flowchart showing an example of the processing procedure of the information processing apparatus.
A diagram showing an example of the result of ranking the candidate phrases.
A diagram showing an example of the result of deleting, from the candidate phrases, the character strings before and after the designated position in the processing target text.
A diagram showing an example of a display mode.
A diagram showing another example of a display mode.
A diagram showing an example of the hardware configuration of a computer.

FIG. 1 schematically shows the internal configuration of an information processing apparatus according to an example embodiment of the present invention. The information processing apparatus 10 in the example of FIG. 1 includes a document DB (database) 100, a phrase DB 110, an input reception unit 120, a target text acquisition unit 130, a candidate phrase acquisition unit 140, a candidate phrase processing unit 150, an output processing unit 160, and a candidate phrase insertion unit 170.

The document DB 100 stores electronic documents. It may, for example, store documents used in a specific field. The processing performed by each unit of the information processing apparatus 10, described later, may take text (character strings) in the electronic documents stored in the document DB 100 as its processing target.

The phrase DB 110 stores a plurality of phrases that appear in the electronic documents stored in the document DB 100. Here, a phrase means an expression treated as a single unit in a document, such as a sentence or a clause making up a sentence. In this example embodiment, each unit of character string delimited by preset delimiters is treated as one phrase. The preset delimiters in this example are punctuation marks, question marks, exclamation marks, and line-break symbols. The phrase DB 110 stores the character string making up each phrase in association with that phrase's score. The score of a phrase is a numerical value representing its importance; it is obtained in advance from the appearance frequency of the phrase in the electronic documents stored in the document DB 100 and is stored in the phrase DB 110. The appearance frequency of a phrase may be used directly as its score. Alternatively, the appearance frequency may be weighted by a predetermined numerical value representing the usefulness of the phrase, and the weighted value used as the score. The usefulness of a phrase is determined, for example, according to the field of the content of the electronic documents stored in the document DB 100. For example, in documents in a particular field (medicine, industry, economics, law, and so on), it may be possible to predict which phrases are used frequently. Accordingly, when the document DB 100 stores electronic documents in a specific field, a phrase predicted to be used frequently in that field may be given a score larger than its actual appearance frequency in those documents by weighting that frequency. The predicted usage frequency may also differ among the phrases predicted to be used frequently. In that case, the weighting applied to each phrase's actual appearance frequency should reflect that difference; for example, a phrase predicted to be used more frequently is given a larger weight.
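
As an illustration only, the construction of the phrase DB 110 described above might be sketched in Python as follows. The function names, the exact delimiter set, and the representation of usefulness as a per-phrase multiplier are assumptions made for this sketch and do not come from the specification.

```python
import re
from collections import Counter

# Assumed delimiter set: Japanese comma and period, question marks,
# exclamation marks, and the line break.
DELIMITER_PATTERN = r"[、。？！?!\n]"

def split_into_phrases(text):
    """Split text at every delimiter and drop empty pieces."""
    return [p.strip() for p in re.split(DELIMITER_PATTERN, text) if p.strip()]

def build_phrase_db(documents, weights=None):
    """Map each phrase to a score.

    The score is the phrase's appearance frequency across all documents,
    multiplied by an optional usefulness weight (1.0 when none is given).
    """
    weights = weights or {}
    counts = Counter()
    for doc in documents:
        counts.update(split_into_phrases(doc))
    return {phrase: n * weights.get(phrase, 1.0) for phrase, n in counts.items()}
```

With no weights supplied, the score of each phrase is simply its appearance frequency, which corresponds to the case in which the frequency is used directly as the score.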

The appearance frequency of each phrase in the phrase DB 110 can change when the electronic documents in the document DB 100 change (through new registration, update, deletion, and so on). The score of each phrase in the phrase DB 110 may therefore be updated in response to changes to the electronic documents in the document DB 100 and the resulting changes in appearance frequency. The scores may be updated, for example, at the time an electronic document in the document DB 100 is changed, at regular intervals, or at a time specified by a user or an administrator.

The input reception unit 120 receives user input through input devices such as a mouse and a keyboard. Depending on its content, the input reception unit 120 passes the received input information to the target text acquisition unit 130 or to the candidate phrase insertion unit 170.

The target text acquisition unit 130 acquires the text to be processed from an electronic document in the document DB 100 or from input text received by the input reception unit 120. For example, it acquires, as the processing target, the character strings before and after a position designated by the user in text displayed on a display device (not shown). Referring to FIG. 2, suppose that on a display screen 200, where an electronic document in the document DB 100 or input text received by the input reception unit 120 is displayed on the display device, the user uses the input device to place the cursor at the position indicated by arrow C. In this case, the target text acquisition unit 130 acquires information representing the cursor position via the input reception unit 120, regards the cursor position as the position designated by the user, and acquires the character strings before and after the designated position. If an electronic document in the document DB 100 is displayed on the display screen 200, the corresponding character string is acquired from the document DB 100; if input text received by the input reception unit 120 is displayed, the corresponding character string is acquired from a temporary storage device (not shown) that stores the input text. The character string acquired here may be the character string lying between the first phrase delimiters encountered in the backward and forward directions from the designated position. For example, the characters before and after the designated position are examined one at a time, and when a preset phrase delimiter (period, exclamation mark, question mark, line-break symbol, or the like) is found, the character string between the designated position and that delimiter is acquired. The delimiters used here may be the same as those preset for determining the phrases stored in the phrase DB 110. In the example of FIG. 2, the single sentence containing the designated position, "有意なリンパ節腫大は認められません" ("No significant lymph node enlargement is observed"), is acquired.
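
The character-by-character scan described above can be sketched as follows. This is a minimal sketch that assumes the designated position is given as a character offset into the displayed text; the function name and the delimiter set are hypothetical.

```python
# Assumed delimiter set for this sketch: period, question marks,
# exclamation marks, and the line break.
DELIMITERS = set("。？！?!\n")

def extract_target(text, pos):
    """Return (before, after) for the cursor offset `pos`.

    `before` runs from the nearest delimiter to the left of the cursor up
    to the cursor; `after` runs from the cursor up to the nearest
    delimiter to the right. The delimiters themselves are excluded.
    """
    start = pos
    while start > 0 and text[start - 1] not in DELIMITERS:
        start -= 1
    end = pos
    while end < len(text) and text[end] not in DELIMITERS:
        end += 1
    return text[start:pos], text[pos:end]
```

Returning the two halves separately is a design choice of the sketch: the later lookup in the phrase DB 110 uses the string before the designated position and the string after it as separate search keys.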

In the description of this example embodiment, the character strings "before and after" the designated position mean the character strings to the left and right of the designated position in a horizontally written document, and the character strings above and below the designated position in a vertically written document.

Referring again to FIG. 1, the candidate phrase acquisition unit 140 acquires from the phrase DB 110 the phrases, together with their scores, that contain in their first part the character string before the designated position in the processing target text acquired by the target text acquisition unit 130, and the phrases, together with their scores, that contain in their last part the character string after the designated position. A "phrase that contains the character string before the designated position in its first part" is a phrase in which the character string starting at the first character of the phrase matches the character string before the designated position. A "phrase that contains the character string after the designated position in its last part" is a phrase in which the character string ending at the last character of the phrase matches the character string after the designated position. Hereinafter, a phrase acquired by the candidate phrase acquisition unit 140 is also called a "candidate phrase". A candidate phrase can be regarded as a phrase that contains a candidate for the character string to be inserted into the processing target text.

For the example processing target text described with reference to FIG. 2, "有意なリンパ節腫大は認められません", the candidate phrase acquisition unit 140 searches the phrase DB 110 using the character string before the designated position, "有意なリンパ節腫大" ("significant lymph node enlargement"), as the search key, and acquires the phrases that contain this character string in their first part together with their scores. FIG. 3 shows an example of the candidate phrases and scores acquired with "有意なリンパ節腫大" as the search key. The candidate phrase acquisition unit 140 also searches the phrase DB 110 using the character string after the designated position, "は認められません" ("is not observed"), as the search key, and acquires the phrases that contain this character string in their last part together with their scores. FIG. 4 shows an example of the candidate phrases and scores acquired with "は認められません" as the search key.
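
The two searches can be pictured as a prefix match and a suffix match over the stored phrases. The following Python sketch assumes the phrase DB is held as an in-memory dictionary from phrase to score; the function name is hypothetical, and a real store would more likely use an index than a linear scan.

```python
def find_candidates(phrase_db, before, after):
    """Return two dicts of phrase -> score.

    The first holds phrases that start with `before`; the second holds
    phrases that end with `after`. An empty search key matches nothing.
    """
    starts = {p: s for p, s in phrase_db.items()
              if before and p.startswith(before)}
    ends = {p: s for p, s in phrase_db.items()
            if after and p.endswith(after)}
    return starts, ends
```

A phrase that contains both search keys appears in both results, which matches the observation that such a phrase is retrieved once by each search.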

Returning to FIG. 1, the candidate phrase processing unit 150 processes the candidate phrases acquired by the candidate phrase acquisition unit 140. For example, it ranks the candidate phrases using the score acquired for each. It also deletes from each candidate phrase the character strings contained in the processing target text. For example, for the candidate phrase in row L2 of FIG. 3, "有意なリンパ節腫大や腹水の貯溜は認められません" ("No significant lymph node enlargement or ascites accumulation is observed"), the character strings "有意なリンパ節腫大" and "は認められません" from the processing target text are deleted from the candidate phrase to produce the character string "や腹水の貯溜" ("or ascites accumulation"). For the candidate phrase in row M1 of FIG. 4, "特に異常は認められません" ("No particular abnormality is observed"), the character string "は認められません" from the processing target text is deleted to produce the character string "特に異常" ("particular abnormality"). The processing of the candidate phrase processing unit 150 is described in more detail later.
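
The two operations named above, ranking by score and deleting the overlapping strings, might be sketched as follows. The function names are hypothetical, and the ranking shown is a plain sort by score only; the more detailed ranking that the specification defers to later is not reproduced here.

```python
def rank_by_score(candidates):
    """Sort (phrase, score) pairs from highest to lowest score."""
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

def strip_overlap(candidate, before, after):
    """Remove the leading `before` and the trailing `after` from a candidate.

    Each part is removed only if the candidate actually starts or ends
    with it, so a phrase matched on one side keeps its other side intact.
    """
    core = candidate
    if before and core.startswith(before):
        core = core[len(before):]
    if after and core.endswith(after):
        core = core[:-len(after)]
    return core
```

Applied to the two examples in the text, `strip_overlap` yields "や腹水の貯溜" for the row L2 phrase and "特に異常" for the row M1 phrase.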

The output processing unit 160 outputs the results of the processing by the candidate phrase processing unit 150. For example, it causes a display device (not shown) to display the candidate phrases, after the candidate phrase processing unit 150 has deleted the character strings contained in the processing target text, in the order of the ranking performed by the candidate phrase processing unit 150. The output processing unit 160 achieves this, for example, by generating display control information that specifies the content and mode of the display and outputting that information to the display device. Specific examples of the content and mode of the display are described later.

When the user selects one of the candidate phrases displayed by the output processing unit 160, the candidate phrase insertion unit 170 identifies the selected candidate phrase via the input reception unit 120 and inserts it at the designated position in the processing target text. If the processing target text was acquired from an electronic document in the document DB 100, the identified candidate phrase is inserted at the designated position in that electronic document. If the processing target text was acquired from the temporary storage device that stores the input text received by the input reception unit 120, the identified candidate phrase is inserted at the designated position in the input text held in that temporary storage device.
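
For illustration, the insertion can be sketched as a simple splice at a character offset. This sketch assumes that the string being inserted is the displayed form of the candidate, that is, the candidate with the strings already present in the target text removed; the specification at this point does not state that explicitly, and the function name is hypothetical.

```python
def insert_at(text, pos, insertion):
    """Return `text` with `insertion` spliced in at character offset `pos`."""
    return text[:pos] + insertion + text[pos:]
```

Under that assumption, inserting "や腹水の貯溜" at the designated position of "有意なリンパ節腫大は認められません" gives "有意なリンパ節腫大や腹水の貯溜は認められません".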

An example of the processing of the information processing apparatus 10 is described below. FIG. 5 is a flowchart showing an example of the processing procedure performed by the information processing apparatus 10. The information processing apparatus 10 starts the procedure of FIG. 5 when, for example, the user performs an input that designates a specific position in text displayed on the display device.

First, the input reception unit 120 acquires the designated position in the text that the user designated using the input device (step S10).

On receiving the designated position from the input reception unit 120, the target text acquisition unit 130 acquires the character strings before and after the designated position as the processing target text (step S12). In this example embodiment, in step S12 the target text acquisition unit 130 acquires the character strings between the designated position and the first preset delimiter found on each side of it. The target text acquisition unit 130 passes the acquired processing target text to the candidate phrase acquisition unit 140.

The remainder of the procedure is described for the case in which the example processing target text described above with reference to FIG. 2, "有意なリンパ節腫大は認められません" (with the designated position at the cursor indicated by arrow C in FIG. 2), is acquired in step S12.

The candidate phrase acquisition unit 140 acquires from the phrase DB 110 candidate phrases that contain candidates for the character string to be inserted into the processing target text received from the target text acquisition unit 130 (step S14). In step S14, the candidate phrase acquisition unit 140, for example, divides the processing target text at the designated position and searches the phrase DB 110 using the character string before the designated position and the character string after the designated position each as a search key. It then acquires from the phrase DB 110 the phrases that contain the character string before the designated position in their first part, the phrases that contain the character string after the designated position in their last part, and the score of each.

In this example, assume that in step S14 the candidate phrase and score pairs illustrated in the table of FIG. 3 are acquired as phrases containing the character string before the designated position, "有意なリンパ節腫大" ("significant lymphadenopathy"), in their first half, and that the pairs illustrated in the table of FIG. 4 are acquired as phrases containing the character string after the designated position, "は認められません" ("is not observed"), in their second half. As FIGS. 3 and 4 show, a candidate phrase that contains both the character string before the designated position and the character string after it is acquired from the phrase DB 110 twice. For example, the candidate phrases "有意なリンパ節腫大は認められません" ("No significant lymphadenopathy is observed") and "有意なリンパ節腫大や腹水の貯溜は認められません" ("No significant lymphadenopathy or ascites accumulation is observed") are each acquired both as a phrase containing the character string before the designated position in its first half and as a phrase containing the character string after the designated position in its second half (see rows L1 and L2 in FIG. 3 and rows M2 and M3 in FIG. 4). In step S14, a character string identical to the processing target text may also be acquired (see row L1 in FIG. 3 and row M2 in FIG. 4).

The candidate phrase acquisition unit 140 passes each candidate phrase acquired in step S14 and its score to the candidate phrase processing unit 150, together with information indicating whether the phrase was acquired as one containing the character string before the designated position or as one containing the character string after it.

Having received the candidate phrases and their scores from the candidate phrase acquisition unit 140, the candidate phrase processing unit 150 ranks the candidate phrases using the score of each candidate phrase (step S16). In the example of this embodiment, in step S16 the candidate phrase processing unit 150 first determines a new score from the score of each candidate phrase, according to which of the character strings before and after the designated position the candidate phrase contains. It then ranks the candidate phrases in descending order of the new score.

The new score of each candidate phrase is calculated, for example, according to the following formula (1).

S = s1 × α + s2 × β    (1)

In formula (1), s1 is the score of the candidate phrase as a phrase containing the character string before the designated position, and s2 is its score as a phrase containing the character string after the designated position. α and β are weighting coefficients. The values of α and β may be, for example, preset constants. Alternatively, the length of the candidate phrase being scored (the number of characters it contains) may be used as the values of α and β. When the length of the candidate phrase is used, α = β.

When the value S of formula (1) is calculated for a candidate phrase that contains only one of the character strings before and after the designated position, the score of the candidate phrase is substituted into s1 or s2, according to which character string it contains, and the value 0 is substituted into the other. For a candidate phrase that contains both the character string before and the character string after the designated position, the score of the candidate phrase is substituted into both s1 and s2. For example, the candidate phrase "有意なリンパ節腫大は認められません" ("No significant lymphadenopathy is observed") in row L1 of FIG. 3 and row M2 of FIG. 4 has a score of 189 and contains both the character string before the designated position and the character string after it, so formula (1) gives S = 189 × α + 189 × β. The candidate phrase "有意なリンパ節腫大や腹水の貯溜は認められません" ("No significant lymphadenopathy or ascites accumulation is observed") in row L2 of FIG. 3 and row M3 of FIG. 4 (score 167) likewise contains both character strings, so S = 167 × α + 167 × β. As an example of a candidate phrase containing only the character string before the designated position, row L3 of FIG. 3, "有意なリンパ節腫大は指摘できません" ("No significant lymphadenopathy can be identified", score 105), gives S = 105 × α + 0 × β. As an example of a candidate phrase containing only the character string after the designated position, row M1 of FIG. 4, "特に異常は認められません" ("No particular abnormality is observed", score 288), gives S = 0 × α + 288 × β.

FIG. 6 shows an example of the new scores determined according to formula (1) for the candidate phrases illustrated in FIGS. 3 and 4. FIG. 6 is an example for α = 2.0 and β = 1.0 in formula (1), and lists the candidate phrases in descending order of new score. In step S16 of FIG. 5, the candidate phrase processing unit 150 ranks the candidate phrases in the order of rows K1 to K5 of the table in FIG. 6, with K1 ranked highest.
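The scoring and ranking of step S16 can be illustrated with the four phrases whose scores are quoted above, using α = 2.0 and β = 1.0. This is a sketch under assumptions: the function name and the dictionary representation of the two search results are hypothetical, and only phrases whose scores appear in the description are included.

```python
def rank_candidates(front: dict[str, int], back: dict[str, int],
                    alpha: float, beta: float) -> list[tuple[str, float]]:
    """Apply S = s1 * alpha + s2 * beta, where s1 (s2) is the phrase score if
    the phrase was found by the before-string (after-string) search, else 0.
    Returns (phrase, S) pairs in descending order of S."""
    scored = {
        p: front.get(p, 0) * alpha + back.get(p, 0) * beta
        for p in set(front) | set(back)
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Search results for the before-string and after-string (scores from the text).
front = {
    "有意なリンパ節腫大は認められません": 189,
    "有意なリンパ節腫大や腹水の貯溜は認められません": 167,
    "有意なリンパ節腫大は指摘できません": 105,
}
back = {
    "特に異常は認められません": 288,
    "有意なリンパ節腫大は認められません": 189,
    "有意なリンパ節腫大や腹水の貯溜は認められません": 167,
}
ranked = rank_candidates(front, back, alpha=2.0, beta=1.0)
# New scores, highest first: 567.0, 501.0, 288.0, 210.0
```

With these weights, a phrase matching both sides (189 × 2.0 + 189 × 1.0 = 567) outranks the after-only phrase with the highest raw score (288 × 1.0 = 288).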

Returning to the procedure of the example in FIG. 5, after step S16 the candidate phrase processing unit 150 deletes the character strings before and after the designated position from each candidate phrase (step S18). In step S18, the portion of each candidate phrase that overlaps the processing target text is deleted. Consequently, a candidate phrase consisting of the same character string as the processing target text "有意なリンパ節腫大は認められません" ("No significant lymphadenopathy is observed"), such as row K1 of FIG. 6, is itself removed from the set of candidate phrases in step S18. FIG. 7 shows an example of the result of applying step S18 to the candidate phrases illustrated in FIG. 6. As FIG. 7 shows, row K1 of FIG. 6 (the candidate phrase identical to the processing target text) does not appear in the table of FIG. 7. The phrases in rows K2 to K5 of FIG. 7 are the candidate phrases in rows K2 to K5 of FIG. 6 with the portions overlapping the processing target text (that is, the character strings before and after the designated position in the processing target text) deleted.
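Step S18 can be sketched as stripping the before-string from the head and the after-string from the tail of each candidate phrase, and dropping any phrase that becomes empty. The function names are hypothetical, and the list of candidates is limited to phrases quoted in the description.

```python
def strip_overlap(phrase: str, before: str, after: str) -> str:
    """Remove the before-string from the start and the after-string from the
    end of a candidate phrase, leaving only the part to be inserted."""
    if before and phrase.startswith(before):
        phrase = phrase[len(before):]
    if after and phrase.endswith(after):
        phrase = phrase[:len(phrase) - len(after)]
    return phrase


def strip_all(phrases: list[str], before: str, after: str) -> list[str]:
    """Strip every phrase and drop those that become empty, i.e. phrases
    identical to the processing target text."""
    stripped = (strip_overlap(p, before, after) for p in phrases)
    return [s for s in stripped if s]


candidates = [
    "有意なリンパ節腫大は認められません",
    "有意なリンパ節腫大や腹水の貯溜は認められません",
    "特に異常は認められません",
    "有意なリンパ節腫大は指摘できません",
]
fragments = strip_all(candidates, "有意なリンパ節腫大", "は認められません")
# fragments == ["や腹水の貯溜", "特に異常", "は指摘できません"]
```

Because the input order is preserved, the ranking from step S16 carries over to the stripped fragments.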

Referring again to FIG. 5, the candidate phrase processing unit 150 passes the results of steps S16 and S18 to the output processing unit 160. The candidate phrase processing unit 150 also passes to the output processing unit 160 information indicating whether each candidate phrase was acquired from the phrase DB 110 as a phrase containing the character string before the designated position or as a phrase containing the character string after it.

The output processing unit 160 causes a display device (not shown) to display the candidate phrases from which the portions overlapping the processing target text have been deleted, in the order of the ranks assigned by the candidate phrase processing unit 150 (step S20). In the example of this embodiment, in step S20 the output processing unit 160 displays each candidate phrase in a different manner according to whether it was acquired from the phrase DB 110 as a phrase containing the character string before the designated position or the character string after it. That is, each candidate phrase is displayed in a different manner for each of three cases: (i) it was acquired only as a phrase containing the character string before the designated position in its first half; (ii) it was acquired only as a phrase containing the character string after the designated position in its second half; and (iii) it was acquired both as a phrase containing the character string before the designated position in its first half and as a phrase containing the character string after the designated position in its second half.

FIG. 8 shows an example of the display screen displayed on the display device in step S20. In the example of FIG. 8, a display screen 200 for the original text containing the processing target text and a display screen 300 for the candidate phrases are displayed. The display screen 300 is placed below the processing target text "有意なリンパ節腫大は認められません" ("No significant lymphadenopathy is observed"). On the display screen 300, the candidate phrases from the table of FIG. 7, "や腹水の貯溜" ("or ascites accumulation"), "特に異常" ("particular abnormality"), "は指摘できません" ("cannot be identified"), and "などの異状を指摘できません" ("or other abnormalities cannot be identified"), are displayed in descending order of new score (that is, in the order of the ranks determined in step S16). Of these, "や腹水の貯溜" corresponds to case (iii) above, "特に異常" to case (ii), and "は指摘できません" and "などの異状を指摘できません" to case (i). On the display screen 300, each candidate phrase is displayed in a frame whose background pattern differs according to cases (i), (ii), and (iii). Each candidate phrase is also placed on the front side or the rear side of the designated position, according to whether the character string before or after the designated position in the processing target text is connected to it. For example, the candidate phrase "特に異常", which corresponds to case (ii), is followed by the character string after the designated position and is never connected to the character string before it, so it is placed on the front side of the designated position. The other three candidate phrases all follow the character string before the designated position, so they are placed on the rear side of the designated position.

As a modification of the display screen 300 of FIG. 8, instead of displaying each candidate phrase in a frame with a different background pattern according to cases (i), (ii), and (iii), each candidate phrase may be displayed in a frame of a different color. In yet another example, each candidate phrase may be displayed in a different character color or font according to cases (i), (ii), and (iii).

FIG. 9 shows another example of the display screen displayed on the display device in step S20. In the example of FIG. 9, the display screen 200 for the original text containing the processing target text and a display screen 302 for the candidate phrases are displayed. The display screen 302 is placed below the processing target text "有意なリンパ節腫大は認められません" ("No significant lymphadenopathy is observed"), with its upper-left corner aligned with the designated position. As on the display screen 300 of FIG. 8, the candidate phrases "や腹水の貯溜" ("or ascites accumulation"), "特に異常" ("particular abnormality"), "は指摘できません" ("cannot be identified"), and "などの異状を指摘できません" ("or other abnormalities cannot be identified") are displayed on the display screen 302 in the order of the ranks determined in step S16. The label shown with each candidate phrase, "前後" ("before and after"), "後" ("after"), or "前" ("before"), depends on which of cases (i), (ii), and (iii) the candidate phrase falls under. The candidate phrase "や腹水の貯溜", which corresponds to case (iii), follows the character string before the designated position in the processing target text, "有意なリンパ節腫大" ("significant lymphadenopathy"), and is in turn followed by the character string after the designated position, "は認められません" ("is not observed"). The label "前後" is therefore shown with this candidate phrase, indicating that it connects to both the character string before and the character string after the designated position. The candidate phrase "特に異常", which corresponds to case (ii), connects in front of the character string after the designated position, "は認められません", and does not connect to the character string before the designated position. The label "後" shown with "特に異常" indicates this. The candidate phrases "は指摘できません" and "などの異状を指摘できません", which correspond to case (i), each connect after the character string before the designated position, "有意なリンパ節腫大", and do not connect to the character string after the designated position. The label "前" shown with each of these candidate phrases indicates this.
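The mapping from cases (i), (ii), and (iii) to the labels of FIG. 9 can be sketched as follows. The function name and the set-based representation of the two search results are assumptions for illustration; the label strings are the ones shown in FIG. 9.

```python
def connection_label(in_front: bool, in_back: bool) -> str:
    """Map how a candidate phrase was acquired to the label shown in FIG. 9.

    in_front: acquired as a phrase containing the before-string (first half).
    in_back:  acquired as a phrase containing the after-string (second half).
    """
    if in_front and in_back:
        return "前後"  # case (iii): connects to both sides
    if in_back:
        return "後"    # case (ii): connects only to the after-string
    if in_front:
        return "前"    # case (i): connects only to the before-string
    raise ValueError("phrase was not acquired by either search")


# Phrases acquired by each search (the same examples as in the description).
front = {
    "有意なリンパ節腫大や腹水の貯溜は認められません",
    "有意なリンパ節腫大は指摘できません",
}
back = {
    "有意なリンパ節腫大や腹水の貯溜は認められません",
    "特に異常は認められません",
}
labels = {p: connection_label(p in front, p in back) for p in front | back}
```

The same three-way classification could drive the frame pattern, frame color, character color, or font instead of a text label.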

As a modification of the display screen 302 of FIG. 9, the candidate phrases may be displayed in frames with different background patterns or colors, or in different character colors or fonts, according to which of cases (i), (ii), and (iii) each falls under.

Referring again to FIG. 5, the input receiving unit 120 receives the user's selection of one of the candidate phrases displayed in step S20 (step S22). In step S22, for example, when a user who has viewed a candidate phrase display screen such as that illustrated in FIG. 8 or FIG. 9 uses the input device to enter an instruction selecting one of the candidate phrases on the display screen, the input receiving unit 120 receives that input. The input receiving unit 120 passes information identifying the selected candidate phrase to the candidate phrase insertion unit 170.

Having received the information identifying the selected candidate phrase from the input receiving unit 120, the candidate phrase insertion unit 170 inserts the selected candidate phrase at the designated position in the processing target text (step S24). After step S24, the procedure of the example in FIG. 5 ends.

The processing described above with reference to FIG. 5 is merely one example of the processing procedure of the information processing apparatus 10, and the procedure is not limited to that example. For instance, the order of steps S16 and S18 in the procedure of FIG. 5 may be swapped. That is, the portions overlapping the processing target text may first be deleted from each candidate phrase acquired in step S14 (step S18), and the candidate phrases may then be ranked as described above for step S16. In this case, the candidate phrase identical to the processing target text is excluded from the candidate phrases acquired in step S14 before the new scores are determined and the candidate phrases are ranked.

As another modification of the procedure of FIG. 5, step S16 may only determine the new score of each candidate phrase, instead of both determining the new scores and ranking the candidate phrases. In this case, when the candidate phrases are output in step S20, the output processing unit 160 ranks them in descending order of new score and outputs them.

As yet another modification, instead of always inserting the selected candidate phrase at the designated position in the processing target text in step S24, part of the processing target text may be replaced with the candidate phrase, according to whether the candidate phrase was acquired as a phrase containing the character string before the designated position or the character string after it. For example, referring to FIG. 9, when the candidate phrase "や腹水の貯溜" ("or ascites accumulation"), which connects to both the character string before and the character string after the designated position, is selected, it is inserted at the designated position. When the candidate phrase "特に異常" ("particular abnormality"), which connects only to the character string after the designated position, is selected, the character string before the designated position, "有意なリンパ節腫大" ("significant lymphadenopathy"), is replaced with that candidate phrase. When the candidate phrase "は指摘できません" ("cannot be identified") or "などの異状を指摘できません" ("or other abnormalities cannot be identified"), which connects only to the character string before the designated position, is selected, the character string after the designated position, "は認められません" ("is not observed"), is replaced with the selected candidate phrase.
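The insert-or-replace variant of step S24 can be sketched as below. The function name and the string values used for `kind` are hypothetical; the three branches follow the three cases described in the preceding paragraph.

```python
def apply_candidate(before: str, after: str, fragment: str, kind: str) -> str:
    """Build the new text from the selected fragment.

    kind is one of:
      "both"        - fragment connects to both sides: insert at the position.
      "after_only"  - fragment connects only to the after-string:
                      it replaces the before-string.
      "before_only" - fragment connects only to the before-string:
                      it replaces the after-string.
    """
    if kind == "both":
        return before + fragment + after
    if kind == "after_only":
        return fragment + after
    if kind == "before_only":
        return before + fragment
    raise ValueError(f"unknown kind: {kind}")


before, after = "有意なリンパ節腫大", "は認められません"
inserted = apply_candidate(before, after, "や腹水の貯溜", "both")
replaced_front = apply_candidate(before, after, "特に異常", "after_only")
replaced_back = apply_candidate(before, after, "は指摘できません", "before_only")
```

In each case the result is one of the complete phrases originally stored in the phrase DB, which is why replacement, rather than plain insertion, yields a well-formed sentence for the one-sided candidates.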

The information processing apparatus 10 exemplified above is typically realized by having a general-purpose computer execute a program describing the functions or processing of each unit of the information processing apparatus 10. As shown in FIG. 10, the computer has, as hardware, a circuit configuration in which, for example, a CPU (central processing unit) 80, a memory (primary storage) 82, and various I/O (input/output) interfaces 84 are connected via a bus 86. A hard disk drive (HDD) 88 and a disk drive 90 for reading portable non-volatile recording media of various standards, such as CDs, DVDs, and flash memory, are connected to the bus 86, for example via the I/O interface 84. The drives 88 and 90 function as external storage devices for the memory. A program describing the processing of the embodiment is stored in a fixed storage device such as the HDD 88, via a recording medium such as a CD or DVD or via a network, and is installed on the computer. The processing of the embodiment is realized when the program stored in the fixed storage device is read into the memory and executed by the CPU.

Although the embodiment above describes an example in which the information processing apparatus 10 is realized by a single computer, the various functions of the information processing apparatus 10 described above may instead be distributed across a plurality of computers.

10 information processing apparatus, 80 CPU, 82 memory, 84 I/O interface, 86 bus, 88 HDD, 90 disk drive, 100 document DB, 110 phrase DB, 120 input receiving unit, 130 target text acquisition unit, 140 candidate phrase acquisition unit, 150 candidate phrase processing unit, 160 output processing unit, 170 candidate phrase insertion unit.

Claims

A program causing a computer to execute:
a reception step of receiving a designated position in a target character string, which is a character string to be processed;
a specifying step of specifying, from among a plurality of character strings stored in storage means, a character string that starts with the character string before the designated position in the target character string and a character string that ends with the character string after the designated position in the target character string; and
an output step of outputting display control information including an instruction to display, on display means, those character strings specified in the specifying step that differ from the target character string.

The program according to claim 1, wherein
the storage means further stores an importance of each of the plurality of character strings, obtained in advance based on the appearance frequency of each of the plurality of character strings in a set of documents,
in the specifying step, the importance of each of the character string that starts with the character string before the designated position in the target character string and the character string that ends with the character string after the designated position in the target character string is further specified, and
the display control information further includes an instruction to display on the display means, in accordance with a rank of each character string determined using the importance of each character string specified in the specifying step, those character strings specified in the specifying step that differ from the target character string.

The program according to claim 1 or 2, further causing the computer to execute
a deletion step of deleting the character string before the designated position and the character string after the designated position from each of the character strings specified in the specifying step, wherein
the display control information further includes an instruction to display the character strings resulting from the deletion step.

The program according to claim 3, wherein
the display control information further includes an instruction to cause the display means to display the character string resulting from the deletion step for each of: a character string including both the character string before the designated position and the character string after the designated position; a character string including only the character string before the designated position; and a character string including only the character string after the designated position.

The program according to any one of claims 2 to 4, wherein
the rank of each character string is determined based on the importance of each character string specified in the specifying step and according to which of the character strings before and after the designated position in the target character string the specified character string includes.

The program according to any one of claims 2 to 5, wherein
the rank of each character string is determined further based on the length of each character string specified in the specifying step.

The program according to any one of claims 1 to 6, wherein
each of the target character string and the plurality of character strings stored in the storage means is a character string in a unit delimited by a preset delimiter in the document that contains it.

An information processing apparatus comprising:
receiving means for receiving a designated position in a target character string, which is a character string to be processed;
specifying means for specifying, from among a plurality of character strings stored in storage means, a character string that starts with the character string before the designated position in the target character string and a character string that ends with the character string after the designated position in the target character string; and
output means for outputting display control information including an instruction to display, on display means, those character strings specified by the specifying means that differ from the target character string.