JPH08328578A

JPH08328578A - Text voice synthesizer

Info

Publication number: JPH08328578A
Application number: JP7135420A
Authority: JP
Inventors: Kaoru Tsukamoto; 薫塚本
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1995-06-01
Filing date: 1995-06-01
Publication date: 1996-12-13

Abstract

PURPOSE: To obtain easily understood speech, in a text-to-speech synthesizer that synthesizes speech from input character information, by pronouncing noun phrases more slowly than other portions when speed reading is specified. CONSTITUTION: The synthesizer comprises a text analysis section 8 that converts input character information into a phonological prosodic symbol string by referring to a pronunciation dictionary 3, a synthesis parameter generating section 4 that converts the symbol string into synthesis parameters for speech synthesis by referring to a speech unit storage section 5, and a speech synthesis section 6 that synthesizes a speech signal based on the parameters. A speed reading processing section 9 is provided in the section 8, and the phonological prosodic symbol string is generated so that only the noun phrases in the input character information are pronounced relatively slowly during speed-reading speech synthesis.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech synthesizer that synthesizes and outputs speech based on input character string information.

[0002]

[Prior Art] A speech synthesizer that takes character information such as text data as input, converts it into speech, and outputs it places no restriction on the output vocabulary, so it is expected to find application in various fields as a technology that replaces record-and-playback speech synthesis. For example, text data created with a word processor or the like can be converted into speech and output, and response messages can be created and changed simply by editing text, so the technology can also be used in telephone and other communication services.

[0003] FIG. 6 is a block diagram of a conventional speech synthesizer, showing a device that performs Japanese text-to-speech conversion with Japanese (mixed kanji and kana sentences) as input. An outline of the conventional device is described below with reference to FIG. 6. In FIG. 6, the conventional text analysis unit 2 uses the pronunciation dictionary 3 to generate a phonological prosodic symbol string from the mixed kanji and kana sentence input from the character information input unit 1. Here, the phonological prosodic symbol string describes the reading, accent, intonation, and so on of the input sentence as a character string, and is called an intermediate language. The reading, accent, and so on of each word are registered in the pronunciation dictionary 3, and the text analysis unit 2 generates the phonological prosodic symbol string while referring to this dictionary.

[0004] The synthesis parameter generation unit 4 extracts speech units (types of sound) based on the phonological prosodic symbol string and, following predetermined rules, generates parameters for speech synthesis (called synthesis parameters) such as phoneme duration (length of sound) and fundamental frequency (pitch of voice) pattern. Of these, speech units are analyzed and generated from pronunciation data recorded when words and the like are uttered. They are the basic units of speech for synthesis, and a synthesized waveform is generated by superimposing them. The speech unit data is stored in advance in the speech unit storage unit 5, which consists of, for example, a ROM. The synthesis parameter generation unit 4 recognizes the speech units from the phonological prosodic symbol string and retrieves the corresponding speech unit data by referring to the speech unit storage unit 5.

[0005] The speech synthesis unit 6 generates a synthesized waveform (speech signal) based on the synthesis parameters generated by the synthesis parameter generation unit 4. This synthesized speech signal is output as sound through the speaker 7 or transmitted to another device over a line. Many speech synthesizers also allow selection of the speech rate, a male or female voice, and so on, according to the intended use.

[0006]

[Problem to Be Solved by the Invention] When the text reading speed is increased, the conventional technique shortens nearly every phoneme uniformly, so the synthesized speech becomes monotonous and hard to follow. As a result, users often missed important words, and the only ways to avoid this were to have the text read slowly or to listen to it again each time.

[0007] The present invention provides a speech synthesizer that, when reading text aloud at or above a certain speed, utters nouns relatively more slowly than other words, so that the synthesized speech remains easy to follow even in speed reading. This makes it possible to listen to a large amount of text in a short time.

[0008]

[Means for Solving the Problem] The present invention is a text-to-speech synthesizer comprising: a character information input unit for inputting character information; a text analysis unit that converts the character information into a phonological prosodic symbol string indicating reading, accent, intonation, and so on, by referring to a pronunciation dictionary in which word readings, accents, and so on are registered in advance; a synthesis parameter generation unit that receives the phonological prosodic symbol string and converts it into synthesis parameters for speech synthesis by referring to a speech unit storage unit in which speech units, the basic units of speech, are registered in advance; and a speech synthesis unit that receives the synthesis parameters and synthesizes a speech signal. The invention is characterized in that the text analysis unit is provided with a speed reading processing unit which, during speed-reading speech synthesis, generates a phonological prosodic symbol string specifying that only the noun phrases in the input character information be pronounced relatively slowly.

[0009]

[Operation] The text analysis unit analyzes the character information input to the character information input unit and generates a phonological prosodic symbol string by referring to the pronunciation dictionary. When speed reading is specified, the speed reading processing unit operates during generation of the phonological prosodic symbol string and specifies that noun phrases be uttered relatively slowly. The synthesis parameter generation unit generates synthesis parameters based on the phonological prosodic symbol string by referring to the speech unit storage unit, and the speech synthesis unit converts the synthesis parameters into a speech signal. When speed reading is specified, only the noun phrase portions of the resulting speech signal are pronounced more slowly than the other portions.

[0010]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. In the following description, a "sentence" means the span from the beginning of a sentence up to the period ("。"), and a "passage" means something composed of multiple sentences. FIG. 1 is a block diagram of the speech synthesizer of the embodiment. In the figure, reference numeral 1 is a character information input unit for inputting Japanese passages such as mixed kanji and kana character strings.

[0011] Reference numeral 8 is the text analysis unit of the present invention, which receives sentences from the character information input unit 1, analyzes them, and generates a phonological prosodic symbol string describing the reading, accent, intonation, and so on of each sentence. The text analysis unit 8 has a speed reading processing unit 9, which performs speed reading processing characterized in that, when a passage is synthesized at a fast speaking rate, phrases (bunsetsu) containing nouns are specified to be uttered relatively more slowly than the other parts. Reference numeral 3 is the pronunciation dictionary, in which word readings, accents, and so on are registered in advance and which is consulted during processing by the text analysis unit 8.

[0012] Reference numeral 4 is the synthesis parameter generation unit, which recognizes speech units based on the phonological prosodic symbol string created by the text analysis unit 8, retrieves the speech unit data registered in the speech unit storage unit 5, and generates synthesis parameters. The speech unit data is analyzed and generated in advance from speech data. Reference numeral 6 is the speech synthesis unit, which receives the synthesis parameters output from the synthesis parameter generation unit 4 and generates a synthesized waveform (speech signal).

[0013] Reference numeral 7 is a speaker, which outputs the synthesized waveform produced by the device. FIG. 2 is an operation flowchart of the text analysis unit of the embodiment, and the flow of processing in the text analysis unit is described with reference to this figure. Sentence analysis of this kind would in general extend to the syntactic and semantic levels, but the embodiment shown here describes, as one example, the longest-match method at the word segmentation level (morphological analysis level).

[0014] When the device starts operating, the text analysis unit 8 reads the text, that is, a Japanese document, from the character information input unit 1 (S1). The Japanese passage input from the character information input unit 1 is analyzed into words in order from the beginning and matched against the words registered in the pronunciation dictionary 3. This analysis proceeds sequentially from the beginning of the sentence, comparing the input against the character strings of the headwords in the pronunciation dictionary 3. When multiple headwords match, the one with the longest character string is selected. In this way, the reading of the sentence and control information such as accent are generated.

[0015] To perform this sentence analysis, a character pointer is first set at the beginning of the sentence (S2). Next, words beginning with the character indicated by the character pointer (retrieved words) are fetched from the pronunciation dictionary 3 (S3). These retrieved words are matched against the input character string, starting with the longest word registered in the pronunciation dictionary 3 (S4). In sentence analysis using the longest-match method, matching fails when, from some position in the sentence onward, no candidate remains for the following word (S5). In that case, backtracking is necessary. The character pointer is therefore moved back by the length of the previous word (S6), and matching is redone (S4) with another candidate for the previous word.
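The longest-match procedure with backtracking in S2 to S6 can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary contents, the part-of-speech labels, and the function names `candidates` and `segment` are all assumptions, recursion stands in for the explicit character pointer, and the connection check of S7 and S8 is omitted.

```python
from typing import Optional

# Hypothetical toy dictionary: headword -> part of speech (assumed labels).
TOY_DICT = {
    "あなた": "pronoun",
    "は": "particle",
    "早口": "noun",
    "早口言葉": "noun",
    "言葉": "noun",
    "が": "particle",
    "得意です": "predicate",
}


def candidates(text: str, pos: int, dictionary: dict) -> list:
    """Headwords that match the input at text[pos], longest first (S3, S4)."""
    found = [w for w in dictionary if text.startswith(w, pos)]
    return sorted(found, key=len, reverse=True)


def segment(text: str, dictionary: dict, pos: int = 0) -> Optional[list]:
    """Longest-match segmentation of text[pos:] with backtracking.

    Returns the list of words, or None when no segmentation exists.
    """
    if pos == len(text):
        return []
    for word in candidates(text, pos, dictionary):
        rest = segment(text, dictionary, pos + len(word))
        if rest is not None:
            return [word] + rest
        # Trying the next, shorter candidate plays the role of moving the
        # character pointer back and matching again (S6 -> S4).
    return None


print(segment("あなたは早口言葉が得意です", TOY_DICT))
```

With this toy dictionary the longer headword "早口言葉" is preferred over "早口", and an input such as "abcd" with headwords "ab", "abc", and "cd" is segmented as "ab", "cd" only after the first choice "abc" fails.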

[0016] When a match is found (S5), the connection relationship with the previous word is examined (S7), and it is determined whether that connection is acceptable (S8). If the connection is unacceptable (NG), processing returns to S4 and the steps up to S8 are repeated. If the connection is acceptable (OK) (S8), a reading is assigned to the word (S9). The pronunciation dictionary 3 registers each word's kana reading, accent type, part-of-speech information, and so on, so this word information is obtained at the same time.

[0017] Next, if the reading speed is set at or above a certain level and text analysis shows the word to be a noun (S10), an emphasis mark is attached to the word (S11). The character pointer is then advanced by the length of the word (S12). It is then checked whether this series of steps has been completed for the sentence (S13). If not, processing returns to S3 and the same steps (S3 to S13) are performed until the whole sentence has been processed. When processing of the sentence is complete, accent phrases such as compound words and bunsetsu are generated by joining independent words with independent words, or independent words with attached words, and adjacent accent phrases that have a dependency relation are further grouped into one to form a breath group (S14). The shifting, appearance, and disappearance of accents caused by joining words follow the accent combination rules described in the text analysis unit 8. A breath group is the unit that a person utters in one breath.

[0018] Finally, in accordance with the emphasis marks attached in S11, symbols that lower the speed of the accent phrase containing each marked word by one step relative to its surroundings are inserted (S15), completing the phonological prosodic symbol string. The result is sent to the synthesis parameter generation unit 4 as in the conventional device (S16). Based on the generated phonological prosodic symbol string, the synthesis parameter generation unit 4 sets prosodic parameters such as phoneme duration, fundamental frequency pattern, and power, and retrieves the speech unit data to be used. The speech synthesis unit 6 generates a synthesized waveform based on the synthesis parameters, and the synthesized speech is output through the speaker 7 or the like.

[0019] If sentences remain to be analyzed (S17), processing returns to S1 and the same steps (S1 to S16) are repeated. Once none remain, the processing of the text analysis unit 8 ends. Next, the operation of text analysis is explained with an example sentence. FIG. 3 is an explanatory diagram of a processing example of the text analysis unit of the embodiment. Taking the sentence "あなたは早口言葉が得意です。" ("You are good at tongue twisters.") in this figure as an example, the processing of S2 to S13 in FIG. 2 segments it as "あなた / は / 早口言葉 / が / 得意です" (anata / wa / hayakuchi-kotoba / ga / tokui desu).

[0020] FIG. 4 is an explanatory diagram of an example text analysis result of the embodiment, showing the analysis settled by the above processing. It shows that this sentence divides into three accent phrases, "あなたは" (anata wa), "早口言葉が" (hayakuchi-kotoba ga), and "得意です" (tokui desu), of which the accent phrase containing a noun is the second, "早口言葉が". Note that "あなた" ("you") is a pronoun. Since a pronoun rarely becomes a keyword in a sentence, it receives no emphasis mark. Because the noun "早口言葉" ("tongue twister") receives an emphasis mark in S11, {V-} and {V+} are inserted before and after this accent phrase. {V-} is a symbol that makes the speed one step slower than it was before the symbol, and {V+} does the opposite.
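The insertion of {V-} and {V+} around an emphasized accent phrase (S15) can be sketched as below. The function name `insert_speed_symbols` and the romanized phrase strings are placeholders chosen for illustration; they are not the symbol strings shown in FIG. 3, and accent and phrase symbols are omitted.

```python
def insert_speed_symbols(phrases, speed_reading):
    """Wrap emphasized accent phrases in {V-} ... {V+}.

    phrases: list of (phrase_text, emphasized) pairs.
    speed_reading: True when speed reading is specified.
    """
    out = []
    for text, emphasized in phrases:
        if speed_reading and emphasized:
            # Slow down one step before the phrase, restore the speed after it.
            out.append("{V-}" + text + "{V+}")
        else:
            out.append(text)
    return " ".join(out)


example = [("anatawa", False), ("hayakuchikotobaga", True), ("tokuidesu", False)]
print(insert_speed_symbols(example, speed_reading=True))
print(insert_speed_symbols(example, speed_reading=False))
```

Because {V-} and {V+} are relative and always inserted as a pair, the speed after the phrase returns to whatever level was in effect before it.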

[0021] In the phoneme symbol string output example of FIG. 3, the symbol string for the normal-speed case contains no {V-} or {V+} symbols, but when speed reading is specified, {V-} and {V+} are inserted as shown in the speed-reading symbol string of FIG. 3. This produces a phonological prosodic symbol string that makes the noun phrase uttered one step more slowly than the surrounding accent phrases. The symbols "P1", "P3", and "P0" are phrase symbols indicating breath groups, and "}" and "]" are accent symbols. The symbols used in this description are for convenience only, and any symbols may be used in practice.
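One way a downstream stage such as the synthesis parameter generation unit might interpret the relative speed symbols when setting phoneme durations is sketched below. The document does not give the size of a speed step or any base duration, so `STEP_FACTOR`, `base_ms`, the whitespace-separated input format, and the function name `phoneme_durations` are all assumptions.

```python
STEP_FACTOR = 1.25   # assumed: each slower step lengthens durations by 25%


def phoneme_durations(symbol_string, base_ms=50.0):
    """Return (unit, duration_ms) pairs for a whitespace-separated symbol string.

    {V-} lowers the speed by one step for everything that follows,
    and {V+} raises it by one step.
    """
    slow_steps = 0
    result = []
    for token in symbol_string.split():
        if token == "{V-}":
            slow_steps += 1
        elif token == "{V+}":
            slow_steps -= 1
        else:
            result.append((token, base_ms * STEP_FACTOR ** slow_steps))
    return result


for unit, ms in phoneme_durations("a na ta wa {V-} ha ya ku chi {V+} to ku i"):
    print(unit, ms)
```

Tracking a running step count, rather than an absolute speed, mirrors the definition of {V-} as "one step slower than before this symbol".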

[0022] When a sentence contains many accent phrases that include nouns, the emphasis marks are prioritized so that the reading speed of the sentence as a whole does not drop. FIG. 5 is an explanatory diagram of an example of priorities for noun emphasis in the embodiment. An accent phrase in which a noun is followed by the particle "は" (wa) or "が" (ga) tends to be the subject and is often important, so its priority is raised. In this way, priorities are determined in advance according to the relationship with adjacent words, the position in the sentence, and so on, and emphasis marks are attached with those priorities taken into account. For example, in the sentence "厳選された小麦粉や砂糖、新鮮な卵や牛乳から作られた手作りのケーキは大変おいしい。" ("Handmade cakes made from carefully selected flour and sugar and fresh eggs and milk are very delicious."), the nouns are "小麦粉" (flour), "砂糖" (sugar), "卵" (eggs), "牛乳" (milk), "手作り" (handmade), and "ケーキ" (cake). When emphasis marks are attached according to FIG. 5, with points added so that a word that is not emphasized scores 0, a word with at least some possibility of being emphasized scores 1, and so on, "小麦粉", "砂糖", "卵", "牛乳", and "手作り" each receive 1 point and "ケーキ" receives 5 points (the processing of S11 in FIG. 2). In S15 of FIG. 2, speed symbols are then attached, following these emphasis marks, before and after the accent phrase containing the 5-point "ケーキ", here "ケーキは" (keeki wa).
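The prioritization described above can be sketched as a scoring pass followed by a selection pass. The table of FIG. 5 is not reproduced in this text, so the scoring rule below is an assumption chosen only to reproduce the 1-point and 5-point values of the example; `SUBJECT_BONUS`, `max_marks`, and the function names are likewise hypothetical.

```python
SUBJECT_PARTICLES = {"は", "が"}
SUBJECT_BONUS = 4    # assumed, so that a noun followed by は or が scores 5


def score(next_word):
    """Priority of a noun, given the word that follows it."""
    points = 1                       # any noun has some chance of being emphasized
    if next_word in SUBJECT_PARTICLES:
        points += SUBJECT_BONUS      # likely subject, so raise the priority
    return points


def select_emphasis(nouns, max_marks=1):
    """Keep only the max_marks highest-scoring nouns.

    nouns: list of (noun, following_word) pairs in sentence order.
    """
    scored = [(noun, score(nxt)) for noun, nxt in nouns]
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [noun for noun, _ in ranked[:max_marks]]


nouns = [("小麦粉", "や"), ("砂糖", "、"), ("卵", "や"),
         ("牛乳", "から"), ("手作り", "の"), ("ケーキ", "は")]
print([score(nxt) for _, nxt in nouns])
print(select_emphasis(nouns))
```

Capping the number of marked nouns is one simple way to keep the overall reading speed from dropping. A score threshold would serve the same purpose.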

[0023] Through the above processing in the text analysis unit, nouns in a sentence can be uttered more slowly than their surroundings. Listeners therefore hear the nouns in an emphasized state, so even at a fast reading speed they are less likely to miss nouns, which are often the keywords of a sentence. In addition, the variation in reading speed relieves the monotony of the synthesized speech and makes it easier to listen to.

[0024] Because the above processing makes the content easier to follow, the speed can be raised further, and a large amount of data can be listened to efficiently in a short time. In this embodiment, noun phrases are uttered more slowly than the surrounding phrases in order to vary the synthesized speech and draw attention to nouns. It is also effective to additionally generate the phonological prosodic symbol string so that the volume is raised by one step.

[0025]

[Effect of the Invention] Providing the text analysis unit with a speed reading processing unit that, when speed reading is specified, makes nouns uttered more slowly than their surroundings varies the synthesized speech and makes nouns stand out. This has the effect that the content is easy to follow even when speed reading is specified, and the further effect that the speaking speed can be raised still higher.

[Brief description of drawings]

FIG. 1 is a block diagram of the speech synthesizer of the embodiment.

FIG. 2 is an operation flowchart of the text analysis unit of the embodiment.

FIG. 3 is an explanatory diagram of a processing example of the text analysis unit of the embodiment.

FIG. 4 is an explanatory diagram of an example text analysis result of the embodiment.

FIG. 5 is an explanatory diagram of an example of priorities for noun emphasis in the embodiment.

FIG. 6 is a block diagram of a conventional speech synthesizer.

[Explanation of symbols]

1: character information input unit
3: pronunciation dictionary
4: synthesis parameter generation unit
5: speech unit storage unit
6: speech synthesis unit
8: text analysis unit
9: speed reading processing unit

Claims

[Claims]

1. A text-to-speech synthesizer comprising: a character information input unit for inputting character information; a text analysis unit that converts the character information into a phonological prosodic symbol string by referring to a pronunciation dictionary in which word readings, accents, and so on are registered in advance; a synthesis parameter generation unit that receives the phonological prosodic symbol string and converts it into synthesis parameters for speech synthesis by referring to a speech unit storage unit in which speech units, the basic units of speech, are registered in advance; and a speech synthesis unit that receives the synthesis parameters and synthesizes a speech signal, characterized in that the text analysis unit is provided with a speed reading processing unit that, during speed-reading speech synthesis, generates a phonological prosodic symbol string specifying that only noun phrases in the input character information be pronounced relatively slowly.

2. The text-to-speech synthesizer according to claim 1, wherein the speed reading processing unit assigns a priority to the noun phrase corresponding to each noun in the input character information according to the relationships of that noun, its position in the sentence, and so on, and, when many nouns are included, generates a phonological prosodic symbol string specifying that only noun phrases with high priority be pronounced relatively slowly.

3. The text-to-speech synthesizer according to claim 1, wherein the speed reading processing unit generates a phonological prosodic symbol string specifying that noun phrases that are pronounced slowly also be pronounced at a relatively high volume.