JP2011095802A - Machine translation device and program - Google Patents

Machine translation device and program

Info

Publication number
JP2011095802A
Authority
JP
Japan
Prior art keywords
translation
translated
phrase
text
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2009246135A
Other languages
Japanese (ja)
Other versions
JP5148583B2 (en
Inventor
Enko Sai
遠航 蔡
Yumiko Yoshimura
裕美子 吉村
Takashi Shibuya
貴志 澁谷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Toshiba Digital Solutions Corp
Original Assignee
Toshiba Corp
Toshiba Solutions Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp, Toshiba Solutions Corp
Priority to JP2009246135A
Publication of JP2011095802A
Application granted
Publication of JP5148583B2
Legal status: Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)

Abstract

PROBLEM TO BE SOLVED: To generate a translation that uses phrases appropriate to the context information or field information of a document when translating with translation examples.

SOLUTION: A translation example search means 32 searches a translation example database 29, using the original text to be translated as the search key, for a translation example identical or similar to that text. A difference association means 33 associates each phrase in the example translation that needs editing, because of a difference between the example original text and the original text to be translated, with the corresponding phrase in the original text to be translated. A difference phrase translation acquisition means 34 acquires, for each such phrase in the original text to be translated, a translation suited to the context or field, using information in the original text other than the phrase itself. The translation is completed by replacing the phrase that needs editing in the example translation with the acquired translation.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a machine translation device and program that translate an original text in a first language into a translation in a second language by using translation examples, each consisting of a first-language sentence paired with its second-language translation.

Machine translation devices use a computer to translate an original text in a first language into a sentence in a second language. In such a device, a number of translation examples, each pairing a first-language sentence with its second-language translation, are stored in advance in a translation example database. A translation example similar to the input text is retrieved and presented to the user together with the original text to be translated, to support the translation work. Because the example original text often does not match the original text to be translated exactly, the example translation can be used unedited only in limited cases. Normally the translation of the original text is obtained after the necessary edits have been made to the example translation (see, for example, Patent Document 1 and Patent Document 2).

Translation with such translation examples is laborious, because the user must decide where to edit and must select translations for the edited parts. Methods for performing these edits automatically have therefore been proposed. In such a method, the differences between the original text to be translated and the retrieved example original text are determined phrase by phrase, and the phrase in the example translation that corresponds to each difference in the example original text is identified. That phrase is replaced with the differing phrase from the original text to be translated to create a composite sentence, and the translation is then created by replacing the original-text phrase contained in the composite sentence with its translation (see, for example, Patent Document 3).

Patent Document 1: JP 2003-330924 A
Patent Document 2: JP 2005-339087 A
Patent Document 3: JP 2006-11842 A

In the method of Patent Document 3, however, the differing phrase is translated in isolation: the differences between the original text to be translated and the example original text are determined phrase by phrase, the corresponding phrase in the example translation is replaced with the differing phrase to form a composite sentence, and that phrase is then replaced with its translation. As a result, the differing phrase may not be translated in a way that reflects context information or the field information of the document. Because the replacement always uses the default entry among the translation candidates registered in the dictionary, the resulting translation can be inappropriate. Table 1 shows an example of such an inappropriate translation.

Table 1 (image: Figure 2011095802)

In Table 1, the difference between the original text to be translated and the example original text is "strain" versus "antigen", and the phrase in the example translation that corresponds to "antigen" in the example original text is "抗原" (antigen). Replacing "抗原" in the example translation with "strain" from the original text to be translated gives the composite sentence "結核の予防注射に、このstrainが使用されました。". The translation candidates for "strain" are then consulted, and the default candidate, "負荷" (load), is selected to produce the automatically generated translation. That translation is therefore "結核の予防注射に、この負荷が使用されました。" ("This load was used for vaccination against tuberculosis.").

The desired translation, however, is "結核の予防注射に、この菌種が使用されました。" ("This strain was used for vaccination against tuberculosis."), and the method cannot produce it. This happens because the default translation is substituted without considering the context of the original text to be translated or the field of the document.
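
The failure described above can be reproduced in a few lines. The following is a minimal sketch, not the actual procedure of Patent Document 3; the candidate list and the function name `naive_substitute` are assumptions made for illustration, with the first candidate treated as the default translation.

```python
# Hypothetical candidate list: the first entry is treated as the default translation.
CANDIDATES = {"strain": ["負荷", "菌種"]}


def naive_substitute(example_translation, phrase_to_edit, source_word, candidates):
    """Replace the phrase to edit with the default candidate, ignoring context."""
    default = candidates[source_word][0]
    return example_translation.replace(phrase_to_edit, default, 1)


example_translation = "結核の予防注射に、この抗原が使用されました。"
result = naive_substitute(example_translation, "抗原", "strain", CANDIDATES)
print(result)  # the default candidate "負荷" is used regardless of the medical context
```

Because the function never looks at the rest of the sentence, the candidate "菌種" can never be selected, which is the limitation the invention addresses.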

An object of the present invention is to provide a machine translation device and program that, when creating a translation by replacing a phrase in the example translation with the translation of a differing phrase, can generate a translation that uses a phrase appropriate to the context information or the field information of the document for the differing phrase in the original text to be translated.

The machine translation device of the present invention comprises: a storage device storing a machine translation program, a machine translation dictionary for translating a first language (the language to be translated) into a second language (the language translated into), and a translation example database accumulating translation examples, each pairing a first-language sentence with its second-language translation; an input device for inputting, as data, an original text in the first language to be translated; an output device for outputting the translated text in the second language; and an arithmetic control device that executes the machine translation program. The device is characterized by comprising: translation means for translating the first-language original text to be translated by using the machine translation dictionary; translation example search means for searching the translation example database, with the original text to be translated as the search key, for a translation example identical or similar to that text; difference association means for associating a phrase in the example translation that needs editing, because of a difference between the example original text and the original text to be translated, with the corresponding phrase in the original text to be translated; difference phrase translation acquisition means for acquiring, for the phrase in the original text to be translated that has been associated with the phrase in the example translation, a translation suited to the context or field of the original text to be translated, using information in that text other than the phrase itself; and translation replacement means for completing the translation by replacing the phrase that needs editing in the example translation with the translation acquired by the difference phrase translation acquisition means.

According to the present invention, it is possible to provide a machine translation device that, when creating a translation by replacing a phrase in the example translation with the translation of a differing phrase, can generate a translation that uses a phrase appropriate to the context information or the field information of the document for the differing phrase in the original text to be translated.

Functional block diagram of the machine translation device according to the first embodiment of the present invention.
Block diagram showing the hardware configuration of the machine translation device according to an embodiment of the present invention.
Flowchart showing the processing performed by the machine translation device according to the first embodiment of the present invention.
Flowchart showing the difference association process performed by the difference association means in the first embodiment of the present invention.
Explanatory diagram of the syntax trees of the example original text and the example translation after morphological analysis and syntactic analysis in the first embodiment of the present invention.
Flowchart showing the difference phrase translation acquisition process performed by the difference phrase translation acquisition means in the first embodiment of the present invention.
Functional block diagram of the machine translation device according to the second embodiment of the present invention.
Flowchart showing the difference phrase translation acquisition process performed by the difference phrase translation acquisition means in the second embodiment of the present invention.
Flowchart showing the difference phrase translation acquisition process performed by the difference phrase translation acquisition means in the third embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a functional block diagram of the machine translation device 11 according to the first embodiment of the present invention, and FIG. 2 is a block diagram showing the hardware configuration of the machine translation device according to the embodiment of the present invention.

In FIG. 2, the machine translation device 11 is realized, for example, by installing a software program such as a machine translation program on a general-purpose computer and executing that program on the processor 13 of the arithmetic control device 12.

The arithmetic control device 12 performs the various computations involved in machine translation and has a processor 13 and a memory 14. The memory 14 stores the machine translation program 15, and a work area 16 is used while the processor 13 executes processing. Computation results of the arithmetic control device 12 are displayed on the display device 18, which serves as the output device 17, and are also output to a communication network via the communication control device 19.

The input device 20 inputs information to the arithmetic control device 12 and consists of, for example, a mouse 21, a keyboard 22, a disk drive 23, and the communication control device 19. The mouse 21 and the keyboard 22 are used to enter commands to the arithmetic control device 12 via the display device 18, while the keyboard 22, the disk drive 23, and the communication control device 19 are used to input the document to be translated.

That is, the disk drive 23 reads and writes files of documents to be translated from and to a storage medium, and the communication control device 19 connects the machine translation device 11 to a communication network such as the Internet or a LAN. The communication control device 19 is a device such as a LAN card or a modem; data exchanged with the communication network through it is passed to and from the arithmetic control device 12 as input or output signals. In addition, a hard disk drive (HDD) 24 is provided to store the computation results of the arithmetic control device 12 and the translation dictionary, which holds the knowledge and rules needed for translation.

FIG. 1 is a functional block diagram of the machine translation device 11 according to the first embodiment of the present invention. Each functional block in the arithmetic control device 12 shown in FIG. 1 corresponds to one of the programs that make up the machine translation program 15 described above. That is, when the processor 13 executes those programs, the arithmetic control device 12 functions as the respective functional blocks. Each block of the storage device 25 corresponds to a storage area of the memory 14 in the arithmetic control device 12 or of the hard disk drive 24.

The input processing unit 26 receives the first-language original text data to be translated from the input device 20 and passes it to the translation unit 27. It also receives the information needed for operation from the input device 20 and issues the corresponding commands to the translation unit 27.

The display processing unit 28 displays on the display device 18 the first-language original text to be translated, input from the input device 20 via the input processing unit 26; the second-language translation produced by the translation unit 27; and the translation examples of the translation example database 29 stored in the storage device 25. Information needed for operation that is input from the input device 20 via the input processing unit 26 is likewise displayed on the display device 18 via the display processing unit 28.

The translation example database 29 accumulates a number of pre-registered translation examples, each pairing a first-language sentence with its second-language translation. The machine translation dictionary 30 holds the vocabulary and rules needed to translate the first language into the second language.

The translation unit 27 translates the first language into the second language using the vocabulary and rules of the machine translation dictionary 30. It has translation means 31, translation example search means 32, difference association means 33, difference phrase translation acquisition means 34, and translation replacement means 35.

The translation means 31 translates a first-language sentence or phrase into the second language by performing syntactic analysis and morphological analysis. The translation example search means 32 searches the translation example database 29 for a translation example similar to the original text to be translated, using the input first-language original text as the search key. The difference association means 33 associates the differing parts of the example original text retrieved by the translation example search means 32 and of the original text to be translated with each other, and also associates the phrase in the example translation that corresponds to each differing part, and therefore needs editing, with the phrase in the original text to be translated.

The difference phrase translation acquisition means 34 handles each phrase in the original text to be translated that has been associated with a phrase in the example translation, that is, with a phrase in the example translation that needs editing. For such a phrase it acquires an appropriate translation suited to the context or field of the original text to be translated, using information in that text other than the phrase itself. The translation replacement means 35 completes the translation by replacing the phrase that needs editing in the example translation with the translation acquired by the difference phrase translation acquisition means 34.
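
The division of labor between means 34 and means 35 can be illustrated as follows. This is a minimal sketch under stated assumptions, not the selection method of the embodiments: the field tags, the tables `CANDIDATES` and `FIELD_OF_WORD`, and the rule "pick the first candidate whose field matches a field of another word in the sentence" are all hypothetical.

```python
# Hypothetical candidates for a source word; each carries a set of field tags.
# The first entry plays the role of the default translation.
CANDIDATES = {
    "strain": [("負荷", {"engineering"}), ("菌種", {"medicine"})],
}
# Hypothetical field tags for other words that may appear in the sentence.
FIELD_OF_WORD = {"vaccinate": "medicine", "tuberculosis": "medicine"}


def acquire_translation(word, sentence_words, candidates, field_of_word):
    """Role of means 34: choose a candidate using words other than `word`."""
    context_fields = {
        field_of_word[w] for w in sentence_words if w != word and w in field_of_word
    }
    for translation, fields in candidates[word]:
        if fields & context_fields:
            return translation
    return candidates[word][0][0]  # no contextual evidence: fall back to the default


def replace_phrase(example_translation, phrase_to_edit, translation):
    """Role of means 35: substitute the acquired translation into the example."""
    return example_translation.replace(phrase_to_edit, translation, 1)


words = "the strain was used to vaccinate people against tuberculosis".split()
chosen = acquire_translation("strain", words, CANDIDATES, FIELD_OF_WORD)
final = replace_phrase("結核の予防注射に、この抗原が使用されました。", "抗原", chosen)
print(final)
```

Keeping acquisition and replacement as separate functions mirrors the separation of the two means, so a different selection strategy can be swapped in without touching the replacement step.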

FIG. 3 is a flowchart showing the processing performed by the machine translation device according to the first embodiment of the present invention. The following description uses as its example the combination of original text to be translated and translation example (example original text and example translation) shown in Table 2.

Table 2 (image: Figure 2011095802)

When a first-language original text to be translated is input from the input device 20 to the translation unit 27 via the input processing unit 26 and a translation request is issued, the translation means 31 of the translation unit 27 performs dictionary lookup and morphological analysis on the original text using the machine translation dictionary 30 (S1). This yields information for each morpheme, such as its part of speech, conjugation type, attributes, and translations.

Next, the translation example search means 32 searches the translation example database 29 (S2). That is, it checks whether a translation example (example original text and example translation) identical or similar to the input sentence is stored in the translation example database 29.

At this stage, step S1 has already produced the morphological analysis result for the original text to be translated and various information for each of its words. This includes information on notation variants: variation in katakana spelling and in okurigana (kana suffixes), variation between kanji and hiragana spellings, and variation in whether okurigana are present. With this information, notation variants can be absorbed, that is, not counted as differences, so that a translation example can match even when it is not character-for-character identical to the input sentence.
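
As a rough illustration of absorbing notation variants, the sketch below normalizes strings before comparison. It swaps in a simple character-level technique (Unicode NFKC normalization plus folding katakana to hiragana) instead of the morphological information the text describes, and it does not handle kanji versus hiragana spellings or okurigana, which would need dictionary readings. The function name `normalize_notation` is an assumption.

```python
import unicodedata


def normalize_notation(text):
    """Fold half-width/full-width forms and katakana/hiragana into one form."""
    # NFKC maps half-width katakana and full-width Latin letters to canonical forms.
    text = unicodedata.normalize("NFKC", text)
    folded = []
    for ch in text:
        code = ord(ch)
        # Katakana U+30A1..U+30F6 sit exactly 0x60 above the matching hiragana.
        if 0x30A1 <= code <= 0x30F6:
            folded.append(chr(code - 0x60))
        else:
            folded.append(ch)
    return "".join(folded)


# Two spellings of the same word compare equal after normalization.
same = normalize_notation("ｺﾝﾋﾟｭｰﾀ") == normalize_notation("コンピュータ")
print(same)
```

Comparing normalized keys rather than raw strings lets the search treat such variants as identical while leaving the stored examples unchanged.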

Alternatively, the search may be configured to succeed only when an example original text matches exactly, without absorbing notation variants. Another possible configuration sets a lower limit on the ratio of differing characters or differing words at which sentences are still regarded as equivalent, and uses it to control whether an example is returned. This makes it possible to extract example original texts whose phrases and characters are used in a similar way. With such a search method, the example search need not come after the dictionary lookup and morphological analysis; the translation example search step may instead be performed first in the process of FIG. 3.
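
A threshold of this kind can be sketched with the standard `difflib` module. This is an illustrative reading only: the limit is treated here as a maximum allowed ratio of differing words, and the value 0.2, the whitespace tokenization, and the function names are assumptions.

```python
import difflib


def differing_word_ratio(a, b):
    """Share of words (relative to the longer sentence) that do not match."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    longest = max(len(a_words), len(b_words))
    return 1.0 - matched / longest if longest else 0.0


def is_hit(target, example_source, max_ratio=0.2):
    """Accept the example when few enough words differ (hypothetical limit)."""
    return differing_word_ratio(target, example_source) <= max_ratio


target = "The strain was used to vaccinate people against tuberculosis."
example = "The antigen was used to vaccinate people against tuberculosis."
ratio = differing_word_ratio(target, example)
print(round(ratio, 3), is_hit(target, example))
```

Since this comparison works on raw tokens, it needs no prior morphological analysis, which is why such a search could run as the first step.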

The translation example search in step S2 may find several translation examples, but only the one with the highest priority is ultimately processed. The highest-priority example is generally chosen by similarity and, when similarity is equal, by a criterion such as the registration date and time of the example. Similarity is judged, for example, by treating a larger number of matching phrases as higher similarity.
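
The priority rule can be expressed as a sort key. In this minimal sketch, similarity is the number of matching whitespace-separated words, and ties are broken in favor of the more recently registered example; the text does not say whether newer or older examples win, so that direction, the record layout, and the sample dates are assumptions.

```python
from collections import Counter
from datetime import date


def matching_word_count(a, b):
    """Number of words the two sentences share (multiset intersection)."""
    return sum((Counter(a.split()) & Counter(b.split())).values())


def best_example(target, examples):
    """Highest similarity first; on a tie, the later registration date wins."""
    return max(
        examples,
        key=lambda ex: (matching_word_count(target, ex["source"]), ex["registered"]),
    )


target = "The strain was used to vaccinate people against tuberculosis."
examples = [  # hypothetical database records
    {"id": 1, "registered": date(2009, 1, 10),
     "source": "The antigen was used to vaccinate people against tuberculosis."},
    {"id": 2, "registered": date(2009, 5, 1),
     "source": "The antigen was used in the test."},
    {"id": 3, "registered": date(2009, 6, 1),
     "source": "The antigen was used to vaccinate people against tuberculosis."},
]
best = best_example(target, examples)
print(best["id"])
```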

Suppose that, for the original text to be translated "The strain was used to vaccinate people against tuberculosis.", the search returns a translation example whose example original text is "The antigen was used to vaccinate people against tuberculosis." and whose example translation is "結核の予防注射に、この抗原が使用されました。".

When such a translation example has been found (S3), the difference association means 33 performs the difference association process on the translation example to be used and on the original text to be translated (S4). FIG. 4 is a flowchart showing the difference association process (step S4) performed by the difference association means 33.

As shown in FIG. 4, the difference association means 33 first performs syntactic analysis on the original text to be translated (S11), and then performs morphological analysis and syntactic analysis on the example original text and the example translation (S12). This yields, for the original text to be translated, the example original text, and the example translation, the word sequence of each sentence, the part of speech, conjugation type, and translations of each word, other information needed for translation, and the syntactic structure. Using this information, the means associates the differences between the original text to be translated and the example original text (S13), determines which phrases in the example original text correspond to which phrases in the example translation (S14), and then uses both results to determine which phrase in the example translation corresponds to each difference between the original text to be translated and the example original text (S15).
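
Step S15 is in effect a composition of the two mappings produced in S13 and S14. The following minimal sketch assumes those results are already available as simple Python values; the data shapes and the function name `compose_associations` are hypothetical.

```python
def compose_associations(difference_pairs, source_to_translation):
    """S15: map each differing target phrase to the example-translation phrase.

    difference_pairs: (target phrase, example original phrase) pairs from S13.
    source_to_translation: example original phrase -> example translation phrase (S14).
    """
    return {
        target_phrase: source_to_translation[example_phrase]
        for target_phrase, example_phrase in difference_pairs
        if example_phrase in source_to_translation
    }


s13_result = [("strain", "antigen")]                   # assumed output of S13
s14_result = {"antigen": "抗原", "tuberculosis": "結核"}  # assumed output of S14
s15_result = compose_associations(s13_result, s14_result)
print(s15_result)
```

The resulting mapping says that "抗原" is the phrase in the example translation that must be edited to account for "strain".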

Table 3 shows the analysis result for the original text to be translated. Table 3 also shows, for each phrase of the original text to be translated, the phrase ID of the corresponding phrase in the example original text.

Table 3 (image: Figure 2011095802)

In the process of associating the differences between the original text to be translated and the example original text (step S13), Table 3 and Table 4 are compared to pair up the differences between the two texts. This shows that the difference lies between "strain" in the original text to be translated and "antigen" in the example original text.
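
The outcome of step S13 for this example can be reproduced with a word-level diff. This is a simplified sketch that compares whitespace-separated tokens with the standard `difflib` module, whereas the text compares the analysis results of Table 3 and Table 4; the function name `word_differences` is an assumption.

```python
import difflib


def word_differences(target, example_source):
    """Return (target words, example words) pairs for every non-matching span."""
    t_words, e_words = target.split(), example_source.split()
    matcher = difflib.SequenceMatcher(a=t_words, b=e_words, autojunk=False)
    return [
        (" ".join(t_words[i1:i2]), " ".join(e_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


target = "The strain was used to vaccinate people against tuberculosis."
example_source = "The antigen was used to vaccinate people against tuberculosis."
differences = word_differences(target, example_source)
print(differences)
```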

Next, the process of determining which phrases in the example original text correspond to which phrases in the example translation (step S14) is described.

FIG. 5 is an explanatory diagram of the syntax trees of the example original text and the example translation after morphological analysis and syntactic analysis. FIG. 5(a) is the syntax tree of the example original text, and FIG. 5(b) is the syntax tree of the example translation. "TW" in FIG. 5 denotes the list of translation candidates available in the second language, the language of the example translation. Where there are several candidates, they are listed separated by commas.

The process in step S14 of determining which phrases in the example original text correspond to which phrases in the example translation is described with reference to FIG. 5.

(1) The following processing is performed for each component of the analysis result of the translation example original text.

(1a) For each headword of the translation example original text (each element after morphological analysis), the elements of the analysis result of the translation example translation that have a “TW” attribute are searched, and any translation corresponding to the headword is extracted as a correspondence candidate. If there are several, all candidate pairs are extracted.

(1b) Each translation candidate of an element of the translation example original text is checked against the components of the analysis result of the translation example translation, and any match is extracted as a correspondence candidate. If there are several, all candidate pairs are extracted.

(2) Among the results of (1), correspondences that do not overlap and are uniquely determined are confirmed as corresponding phrases. Cases in which the number of elements differs, such as one element in one structure matching two or more consecutive elements in the other, are also extracted as corresponding phrases.

(3) If the results of (1) are ambiguous, for example because the same word is used twice, the more likely correspondence is selected and the ambiguity is resolved on the basis of direct modification relations with other corresponding-phrase candidates in the analysis result, whether modification relations cross, and co-occurrence with other corresponding-phrase candidates within the partial structure.

(4) Once corresponding-phrase candidates have been extracted through (3), the structurally contiguous portions are detected within the analysis result structure of the first-language sentence and that of the second-language sentence.

This association process yields the following correspondences: “antigen-抗原”, “use-使用する”, and “tuberculosis-結核”.
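
Steps (1b) and (2) can be sketched as a dictionary lookup followed by a uniqueness check. The sketch below is a simplification under stated assumptions: the function name `align`, the word lists, and the candidate lists in `tw` are hypothetical stand-ins for the “TW” attributes of FIG. 5, and the structural disambiguation of step (3) and the contiguity detection of step (4) are not implemented.

```python
def align(source_words, target_words, tw):
    """Pair source headwords with target words found among their TW candidates.

    Returns (confirmed, ambiguous): confirmed holds the one-to-one pairs of
    step (2); ambiguous holds candidates that step (3) would have to resolve.
    """
    # Step (1b): target-sentence words that appear in each headword's candidate list.
    candidates = {s: [t for t in target_words if t in tw.get(s, [])]
                  for s in source_words}
    # Record which source words claim each target word.
    claimed = {}
    for s, hits in candidates.items():
        for t in hits:
            claimed.setdefault(t, []).append(s)
    confirmed, ambiguous = {}, {}
    for s, hits in candidates.items():
        if len(hits) == 1 and len(claimed[hits[0]]) == 1:
            confirmed[s] = hits[0]   # step (2): unique, non-overlapping
        elif hits:
            ambiguous[s] = hits      # left for step (3)
    return confirmed, ambiguous


# Hypothetical analysis results and candidate lists, for illustration only.
source_words = ["antigen", "use", "vaccinate", "people", "tuberculosis"]
target_words = ["結核", "予防注射", "抗原", "使用する"]
tw = {
    "antigen": ["抗原"],
    "use": ["使用する", "用いる"],
    "vaccinate": ["予防注射をする"],
    "people": ["人々", "人間"],
    "tuberculosis": ["結核"],
}

confirmed, ambiguous = align(source_words, target_words, tw)
print(confirmed)  # {'antigen': '抗原', 'use': '使用する', 'tuberculosis': '結核'}
```

Words with no matching candidate, such as “vaccinate” and “people” here, simply receive no correspondence, which is what the “-1” entries in Tables 4 and 5 represent.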

Tables 4 and 5 list the internal data after this association. Table 4 shows the analysis result of the translation example original text. For each component of the translation example original text it holds the component itself, its part-of-speech information, and an ID number indicating the corresponding phrase in the data structure of the translation example translation. An element containing “-1” has no corresponding phrase.
Figure 2011095802

Table 5 shows the analysis result of the translation example translation. As in Table 4, for each component of the translation example translation it holds the component itself, its part-of-speech information, and an ID number indicating the corresponding phrase in the data structure of the translation example original text. An element containing “-1” has no corresponding phrase.
Figure 2011095802

From these correspondences, the difference between the original text to be translated and the translation example original text is “strain” versus “antigen”, and the correspondence between the translation example original text and the translation example translation shows that “antigen” corresponds to “抗原” in the translation example translation. Consequently, step S15 yields “strain-抗原” as the correspondence between the differing phrase of the original text to be translated and the phrase of the translation example translation.
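
Step S15 amounts to composing two mappings: the difference between the two source texts and the alignment within the translation example. A minimal sketch, assuming both are already available as plain Python values (the function name `compose` and the data layout are illustrative, not taken from the patent):

```python
def compose(diff_pairs, example_alignment):
    """Map each differing phrase of the text to be translated to the phrase
    of the translation example translation that will need editing."""
    result = {}
    for target_phrase, example_phrase in diff_pairs:
        translated = example_alignment.get(example_phrase)
        if translated is not None:  # skip phrases with no aligned counterpart
            result[target_phrase] = translated
    return result


diff_pairs = [("strain", "antigen")]                     # from step S13
example_alignment = {"antigen": "抗原", "use": "使用する",  # from step S14
                     "tuberculosis": "結核"}

print(compose(diff_pairs, example_alignment))  # {'strain': '抗原'}
```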

For the differing phrase pair “strain-抗原” obtained in this way by the difference association process of FIG. 3 (step S4), the difference phrase translation acquisition means 34 acquires an appropriate translation of “strain” to use in place of “抗原” (S5).

FIG. 6 is a flowchart of the difference phrase translation acquisition process (step S5) performed by the difference phrase translation acquisition means 34. First, the original text to be translated is translated (S21); that is, the translation means 31 translates it without using any translation example. Then the translation of the differing phrase between the original text to be translated and the translation example original text is extracted (S22).

The translation in step S21 is ordinary machine translation. Without using a translation example, the translation means 31 applies the translation selection rules defined in the machine translation dictionary 30, using information obtained by parsing such as dependency relations between clauses and co-occurrence relations, to obtain the translated words.

As shown in Table 1, the differing phrase “strain” in the original text to be translated has several possible translations (from 負荷, “load”, to 病原菌, “pathogen”). As knowledge for deciding which translation is appropriate, translation selection rules such as the following are stored in the machine translation dictionary 30.

<Rule> strain + vaccinate → strain = 菌種
(Meaning: when “strain” and “vaccinate” co-occur, “strain” is translated as “菌種”, a bacterial strain.)
Under this rule, the original text to be translated, “The strain was used to vaccinate people against tuberculosis.”, is translated as follows.

Translation: “菌種は人々に結核の予防注射をするために使用されました。”
Next, in step S22, the translation of the differing phrase “strain” is extracted from the translation obtained in step S21, “菌種は人々に結核の予防注射をするために使用されました。”. Because this translation carries “菌種” as the translation information for “strain”, “菌種” is acquired as the translation of the differing phrase between the original text to be translated and the translation example original text.
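
The co-occurrence rule above can be sketched as a small rule table consulted before falling back to a default translation. The table layout, the function name `select_translation`, and the choice of “負荷” as the default are assumptions for illustration; the patent does not specify how the machine translation dictionary 30 encodes its rules.

```python
import re

# Hypothetical rule table: (headword, co-occurring word, translation).
RULES = [("strain", "vaccinate", "菌種")]
# Hypothetical default translations, used when no rule fires.
DEFAULTS = {"strain": "負荷"}


def select_translation(headword, sentence):
    """Pick a translation for headword from words co-occurring in sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    for head, cooccurring, translation in RULES:
        if head == headword and cooccurring in words:
            return translation
    return DEFAULTS.get(headword)


sentence = "The strain was used to vaccinate people against tuberculosis."
print(select_translation("strain", sentence))  # 菌種
```

The sketch matches surface forms only; a real system would match lemmas and dependency relations obtained from parsing, as the description of step S21 indicates.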

Once “菌種”, the translation of the differing phrase between the original text to be translated and the translation example original text, has been acquired in this way by the difference phrase translation acquisition means 34 (step S5 of FIG. 3), the translation replacement means 35 performs the translation replacement process (S6). Using the “strain-抗原” correspondence obtained in the difference association process (step S4), this process replaces “抗原” in the translation example translation “結核の予防注射に、この抗原が使用されました。” with the “菌種” obtained in step S5 and so completes the translation. The result is “結核の予防注射に、この菌種が使用されました。”, a translation that uses a translated word appropriate to the context of the original text to be translated.
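
The replacement of step S6 is, in its simplest form, a single substitution in the translation example translation. The sketch below assumes the phrase to edit occurs as a literal substring; the function name `replace_phrase` is illustrative.

```python
def replace_phrase(example_translation, old_phrase, new_phrase):
    """Replace the first occurrence of old_phrase in the example translation."""
    if old_phrase not in example_translation:
        raise ValueError(f"{old_phrase!r} not found in the example translation")
    return example_translation.replace(old_phrase, new_phrase, 1)


example_translation = "結核の予防注射に、この抗原が使用されました。"
completed = replace_phrase(example_translation, "抗原", "菌種")
print(completed)  # 結核の予防注射に、この菌種が使用されました。
```

Substituting by phrase ID from the analysis result (Table 5) rather than by substring would avoid touching an unrelated occurrence of the same characters.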

FIG. 7 is a functional block diagram of a machine translation device according to a second embodiment of the present invention. In contrast to the example shown in FIGS. 1 to 6, in the second embodiment the difference phrase translation acquisition means 34 has an external translation system 36 translate the original text to be translated instead of using the translation means 31.

The second embodiment is also described using the combination in Table 2 of an original text to be translated and a translation example (translation example original text and translation example translation). For the example of Table 2, the processing up to the difference association process of FIG. 3 (step S4) is the same as in the first embodiment. The difference phrase translation acquisition process of FIG. 3 (step S5) is performed in cooperation with the external translation system 36.

FIG. 8 is a flowchart of the difference phrase translation acquisition process performed by the difference phrase translation acquisition means 34 in the second embodiment. First, translation of the original text to be translated is requested (S41); that is, the difference phrase translation acquisition means 34 asks the external translation system 36 to translate it. The translated sentence is then received from the external translation system 36 (S42), and the translation of the differing phrase is acquired from the received sentence (S43). The difference phrase translation acquisition means 34 does not need to know how the external translation system 36 produces its translation.

In step S43, to acquire the translation of the differing phrase from the sentence received from the external translation system 36, the same processing as step S14 of FIG. 4 is applied to the pair consisting of the original text to be translated and the received translation.

Suppose that, for the example of Table 2, the difference phrase translation acquisition means 34 asks the external translation system 36 for a translation and receives “人間の結核の予防注射にこの菌種が使用されました。”. To acquire the translated word from this sentence, the difference phrase translation acquisition means 34 performs in step S43 the same processing as step S14 of FIG. 4 on the pair consisting of the original text to be translated, “The strain was used to vaccinate people against tuberculosis.”, and the received translation. This yields “菌種” as the translation of “strain”.
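
Steps S41 to S43 can be sketched by treating the external translation system as an opaque callable. Everything named here is an assumption for illustration: `acquire_from_external`, the stand-in `fake_external_system`, and the candidate list for “strain”. The lookup in S43 is reduced to a substring check against the candidate translations, a much simpler substitute for the step S14 alignment that the description calls for.

```python
from typing import Callable, Dict, List, Optional


def acquire_from_external(source_sentence: str,
                          diff_phrase: str,
                          tw: Dict[str, List[str]],
                          translate: Callable[[str], str]) -> Optional[str]:
    """Request a translation (S41), receive it (S42), and pick out the
    translation of diff_phrase (S43, simplified to a candidate lookup)."""
    translated = translate(source_sentence)
    for candidate in tw.get(diff_phrase, []):
        if candidate in translated:
            return candidate
    return None


def fake_external_system(sentence: str) -> str:
    """Stand-in for the external translation system 36; returns the
    translation quoted in the description regardless of input."""
    return "人間の結核の予防注射にこの菌種が使用されました。"


# Hypothetical candidate translations of "strain".
tw = {"strain": ["負荷", "菌種", "病原菌"]}
source = "The strain was used to vaccinate people against tuberculosis."

print(acquire_from_external(source, "strain", tw, fake_external_system))  # 菌種
```

Because the caller only depends on the `translate` callable, the internal workings of the external system stay irrelevant, which mirrors the point made for FIG. 8.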

Finally, the translation replacement means 35 performs the translation replacement process. Using the “strain-抗原” correspondence obtained by the difference association means 33, it replaces “抗原” in the translation example translation “結核の予防注射に、この抗原が使用されました。” with “菌種” to complete the translation. The result is “結核の予防注射に、この菌種が使用されました。”, a translation that uses a translated word appropriate to the context of the original text to be translated.

In the second embodiment only the translated sentence is received from the external translation system 36, but the device may instead be configured to obtain from the external translation system 36, in addition to the translated sentence, the correspondence between the phrases of the original text to be translated and those of the translated sentence. In that case the “acquire difference phrase translation” processing of step S43 can obtain the translation of the differing phrase directly from that correspondence, and the processing equivalent to step S14 of FIG. 4 is unnecessary.

Next, a third embodiment of the present invention is described. In the first and second embodiments, the translation means 31 or the external translation system 36 translates the original text to be translated without using a translation example, and the translated word obtained from that translation, identified through the phrase correspondence between the original text and its translation, is used as the translation of the differing phrase. Alternatively, the translation means 31 or the external translation system 36 may determine the field of the first-language document containing the original text to be translated to obtain field information, compare it with the field information attached to the candidate translations of the phrase in the original text to be translated that was associated by the difference association means 33, and take the candidate whose field information matches as the appropriate translation.

Table 6 shows an example in which the translation of the differing phrase in the original text to be translated is acquired using the field information of the source document.
Figure 2011095802

In this case, the difference association process (step S4) of FIG. 3, performed by the difference association means 33, yields the correspondence “base-艦隊”. Translating the original text to be translated, “Formation and equipment of this base were reported.”, without using a translation example gives “この基礎の構成及び設備が報告されました。”.

In the first and second embodiments, “艦隊” (fleet) in the translation example translation would be replaced with “基礎” (foundation) taken from the translation produced without a translation example, giving “この基礎の編成及び装備が報道されました。”. In other words, “base” is rendered with its default translation “基礎”.

Some translated words in the machine translation dictionary carry field information. For example, “base” is translated as “ベース” in the baseball field, as “基地” (military base) in the military field, and as “基礎” (foundation) otherwise. In the machine translation dictionary 30 this field information is stored, for example, as shown in Table 7.
Figure 2011095802

The difference phrase translation acquisition means 34 therefore uses the field information of the document to obtain an appropriate translation of the differing phrase. FIG. 9 is a flowchart of this difference phrase translation acquisition process.

First, the difference phrase translation acquisition means 34 uses the translation means 31 or the external translation system 36 to determine the field of the source document (S31). The candidate translations of the differing phrase between the original text to be translated and the translation example original text are then examined in order (S32). It is determined whether the candidate carries field information (S33); if it does, it is determined whether the field information of the source document obtained in S31 matches that of the candidate (S34). If they match, the candidate is taken as the translation and the process ends (S35). If the candidate carries no field information in step S33, or the fields do not match in step S34, it is determined whether there is a next candidate (S36); if there is, the process returns to step S32. If there is none, the default translation of the phrase is used (S37).
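
The loop of steps S32 to S37 can be sketched as follows. The field determination of S31 is assumed to have been done already and is passed in as `document_field`; the function name `select_by_field`, the field labels, and the list layout are illustrative stand-ins for the dictionary entries of Table 7.

```python
def select_by_field(candidates, document_field, default):
    """Return the first candidate whose field matches the document's field.

    candidates: ordered list of (translation, field) pairs; field is None
    when the candidate carries no field information.
    """
    for translation, field in candidates:                   # S32, S36: next candidate
        if field is not None and field == document_field:   # S33, S34
            return translation                              # S35: adopt and stop
    return default                                          # S37: default translation


# Hypothetical dictionary entry for "base", modelled on the description.
BASE_CANDIDATES = [("ベース", "baseball"), ("基地", "military"), ("基礎", None)]

print(select_by_field(BASE_CANDIDATES, "military", default="基礎"))  # 基地
print(select_by_field(BASE_CANDIDATES, "medicine", default="基礎"))  # 基礎
```

Candidates without field information never match in this loop, so they can only be chosen through the default of step S37, which is the behaviour the flowchart describes.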

In the example of Table 6, step S31 first determines the field of the “source document” in Table 6 and obtains “field = military”. From step S32 onward, the individual translations of “base” are checked in order, and the translation “基地”, which has the same “field = military”, is obtained.

Then the translation replacement process of step S6 in FIG. 3 uses the “base-艦隊” correspondence to replace “艦隊” in the translation example translation with “基地” and complete the translation, giving “この基地の編成及び装備が報道されました。”.

According to the embodiments of the present invention, translation using a translation example proceeds by determining, phrase by phrase, the difference between the original text to be translated and the translation example original text, identifying the phrase of the translation example translation that corresponds to that difference, and replacing that phrase with the translation of the differing phrase. Because the translation of the differing phrase in the original text to be translated is determined using context information or the field information of the document, an appropriate translated word is obtained.

DESCRIPTION OF SYMBOLS 11 ... machine translation device, 12 ... arithmetic control device, 13 ... processor, 14 ... memory, 15 ... machine translation program, 16 ... work area, 17 ... output device, 18 ... display device, 19 ... communication control device, 20 ... input device, 21 ... mouse, 22 ... keyboard, 23 ... disk drive, 24 ... hard disk drive, 25 ... storage device, 26 ... input processing unit, 27 ... translation unit, 28 ... display processing unit, 29 ... translation example database, 30 ... machine translation dictionary, 31 ... translation means, 32 ... translation example search means, 33 ... difference association means, 34 ... difference phrase translation acquisition means, 35 ... translation replacement means, 36 ... external translation system

Claims (5)

1. A machine translation device comprising: a storage device storing a machine translation program, a machine translation dictionary for translating a first language to be translated into a second language, and a translation example database accumulating translation examples each consisting of a sentence in the first language paired with its translation in the second language; an input device for inputting an original text in the first language as data; an output device for outputting a translated text in the second language; and an arithmetic control device that executes the machine translation program, the machine translation device being characterized by comprising: translation means for translating an original text to be translated in the first language using the machine translation dictionary; translation example search means for searching the translation example database, using the original text to be translated as a search key, for a translation example identical or similar to the original text to be translated; difference association means for associating, for the portion in which the original text of the retrieved translation example differs from the original text to be translated, the phrase in the translation of the translation example that requires editing with the phrase in the original text to be translated; difference phrase translation acquisition means for acquiring, for the phrase in the original text to be translated that is associated with the phrase in the translation of the translation example, a translated word suited to the context or field of the original text to be translated, using information other than that phrase in the original text to be translated; and translation replacement means for completing the translation by replacing the phrase in the translation of the translation example that requires editing with the translated word acquired by the difference phrase translation acquisition means.

2. The machine translation device according to claim 1, wherein the difference phrase translation acquisition means has the translation means translate the original text to be translated and acquires, as the translated word suited to the context of the original text to be translated, the translated word in the resulting translation that corresponds to the phrase in the original text to be translated associated by the difference association means.

3. The machine translation device according to claim 1, wherein the difference phrase translation acquisition means has an external translation system translate the original text to be translated and acquires, as the translated word suited to the context of the original text to be translated, the translated word in the resulting translation that corresponds to the phrase in the original text to be translated associated by the difference association means.

4. The machine translation device according to claim 1, wherein the difference phrase translation acquisition means determines, using the translation means or an external translation system, the field of the first-language document containing the original text to be translated to obtain field information, compares it with the field information of the translated words of the phrase in the original text to be translated associated by the difference association means, and acquires as the translated word the one whose field information matches.

5. A machine translation program for a computer that is to function as a machine translation device comprising a storage device storing a machine translation program, a machine translation dictionary for translating a first language to be translated into a second language, and a translation example database accumulating translation examples each consisting of a sentence in the first language paired with its translation in the second language, an input device for inputting an original text in the first language as data, an output device for outputting a translated text in the second language, and an arithmetic control device that executes the machine translation program, the program causing the computer to realize: a function of translating an original text to be translated in the first language using the machine translation dictionary; a function of searching the translation example database, using the original text to be translated as a search key, for a translation example identical or similar to the original text to be translated; a function of associating, for the portion in which the original text of the retrieved translation example differs from the original text to be translated, the phrase in the translation of the translation example that requires editing with the phrase in the original text to be translated; a function of acquiring, for the phrase in the original text to be translated that is associated with the phrase in the translation of the translation example, a translated word suited to the context or field of the original text to be translated, using information other than that phrase in the original text to be translated; and a function of completing the translation by replacing the phrase in the translation of the translation example that requires editing with the translated word acquired by the difference phrase translation acquisition means.
JP2009246135A 2009-10-27 2009-10-27 Machine translation apparatus, method and program Active JP5148583B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2009246135A JP5148583B2 (en) 2009-10-27 2009-10-27 Machine translation apparatus, method and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2009246135A JP5148583B2 (en) 2009-10-27 2009-10-27 Machine translation apparatus, method and program

Publications (2)

Publication Number Publication Date
JP2011095802A true JP2011095802A (en) 2011-05-12
JP5148583B2 JP5148583B2 (en) 2013-02-20

Family

ID=44112668

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2009246135A Active JP5148583B2 (en) 2009-10-27 2009-10-27 Machine translation apparatus, method and program

Country Status (1)

Country Link
JP (1) JP5148583B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013196493A (en) * 2012-03-21 2013-09-30 Toshiba Corp Machine translation device, machine translation method and program
CN113191163A (en) * 2021-05-21 2021-07-30 北京有竹居网络技术有限公司 Translation method, translation device, translation equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05257966A (en) * 1992-03-11 1993-10-08 Nec Corp Machine translator
JP2006011842A (en) * 2004-06-25 2006-01-12 Sharp Corp Translation device and translation program
JP2008176536A (en) * 2007-01-18 2008-07-31 Toshiba Corp Device, method and program for mechanically translating input original language sentence to target language
JP2009116584A (en) * 2007-11-06 2009-05-28 Toshiba Corp Machine translation device and machine translation program

Also Published As

Publication number Publication date
JP5148583B2 (en) 2013-02-20

Similar Documents

Publication Publication Date Title
JP5264892B2 (en) Multilingual information search
US6055528A (en) Method for cross-linguistic document retrieval
US7752032B2 (en) Apparatus and method for translating Japanese into Chinese using a thesaurus and similarity measurements, and computer program therefor
JP2006004427A (en) System and method of searching content of complicated languages such as japanese
JP2008234656A (en) Method and system for translating cross language query request, and cross language information retrieval
US8402046B2 (en) Conceptual reverse query expander
KR20160124079A (en) Systems and methods for in-memory database search
CN101099153A (en) Systems, methods, software and interfaces for multilingual information retrieval
JP2004118740A (en) Question answering system, question answering method and question answering program
JP4160548B2 (en) Document summary creation system, method, and program
JP5148583B2 (en) Machine translation apparatus, method and program
JP5025603B2 (en) Machine translation apparatus, machine translation program, and machine translation method
JP2006343925A (en) Related-word dictionary creating device, related-word dictionary creating method, and computer program
JP4588657B2 (en) Translation device
JP2004086307A (en) Information retrieving device, information registering device, information retrieving method, and computer readable program
JP2009104475A (en) Similar document retrieval device, and similar document retrieval method and program
JP2008140204A (en) Data retrieval system and program
JP6707410B2 (en) Document search device, document search method, and computer program
JP4635585B2 (en) Question answering system, question answering method, and question answering program
JP4588417B2 (en) Translation device
JP5039114B2 (en) Machine translation apparatus and program
JP2007164635A (en) Method, device and program for acquiring synonymous vocabulary
JP5909123B2 (en) Machine translation apparatus, machine translation method and program
JP2012230460A (en) Machine translation system, method, and program
JP4034503B2 (en) Document search system and document search method

Legal Events

Date Code Title Description
A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20120427

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20120508

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20120523

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20121030

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20121128

R150 Certificate of patent or registration of utility model

Ref document number: 5148583

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20151207

Year of fee payment: 3

S531 Written request for registration of change of domicile

Free format text: JAPANESE INTERMEDIATE CODE: R313531

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350

S533 Written request for registration of change of name

Free format text: JAPANESE INTERMEDIATE CODE: R313533

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350