JP5527827B2

JP5527827B2 - Loudness adjusting device, loudness adjusting method, and program

Info

Publication number: JP5527827B2
Application number: JP2012094004A
Authority: JP
Inventors: 力濱島
Original assignee: NEC Engineering Ltd
Current assignee: NEC Engineering Ltd
Priority date: 2012-04-17
Filing date: 2012-04-17
Publication date: 2014-06-25
Anticipated expiration: 2032-04-17
Also published as: JP2013223130A

Description

The present invention relates to a loudness adjusting device, a loudness adjusting method, and a program.

Full-scale operation of digital terrestrial television broadcasting has begun. Digital television broadcasting can deliver the audio produced by a content creator to viewers unchanged. On the other hand, when the creator produces content with a wide dynamic range, the volume differences within the content become large. A viewer watching such content on a digital television or similar device amid everyday household noise may find quiet passages hard to hear. If the viewer then raises the volume with a remote control or the like, a loud sound may suddenly be output at a scene change, which can be unpleasant. In such a case, the user also has to readjust the volume.

In view of this situation, a worldwide standard for the volume of audio signals is being developed. The purpose of this standard is to calculate the average loudness value of each program or item of material, and to produce and operate material that matches a target loudness value. Loudness is the magnitude of sound as perceived by a human (the sensed quantity of sound). A loudness value is a calculated value of loudness derived from the digital recording level by a loudness measurement algorithm. The average loudness value is the loudness value over an arbitrary measurement interval. The target loudness value is the average program loudness that is targeted in order to keep the listening level of programs appropriate. For details, see Non-Patent Document 1.

Digital television broadcasting includes live programs such as sports coverage and press conferences, and these programs basically cannot be edited. A device is therefore needed that performs, in real time, the perceived-volume adjustment specified in the above standard.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2006-93908 (JP 2006-93908 A)

Non-Patent Document 1: Association of Radio Industries and Businesses (ARIB), technical report on loudness operation rules for digital television broadcast programs (ARIB TR-B32, version 1.0), established March 28, 2011

As described above, when live programs and the like are considered, it is desirable to adjust the perceived volume in real time. However, in the method described in the above standard (Non-Patent Document 1), the loudness value is a moving mean-square value. That is, the value is calculated only after waiting for a given interval of time, so the adjustment of perceived volume lags by a certain amount. When the program is not live (when real-time operation is not required), the audio signal can be delayed to allow for the time needed to calculate the average loudness value. For a live program (when real-time operation is required), however, delaying the audio signal in this way is not possible.

In other words, the general method described in the above standard (Non-Patent Document 1) has the problem that it is difficult to adjust the perceived volume while ensuring real-time operation.

The present invention has been made in view of the above problem, and its main object is to provide a loudness adjusting device, a loudness adjusting method, and a program capable of adjusting the perceived volume while ensuring real-time operation.

One aspect of the loudness adjusting device according to the present invention is
a loudness adjusting device that adjusts the loudness of an input digital audio signal, comprising:
a gain adjustment unit that performs gain adjustment on audio data extracted from the input digital audio signal;
a first filter unit that generates filtered audio data by applying characteristic filtering to the audio data gain-adjusted by the gain adjustment unit;
a boost processing unit that amplifies and outputs the audio data gain-adjusted by the gain adjustment unit when the filtered audio data generated by the first filter unit is equal to or less than a first threshold;
a second filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the boost processing unit; and
a limiter compressor unit that performs limiter compressor processing on the audio data amplified by the boost processing unit and outputs the result when the filtered audio data generated by the second filter unit is equal to or greater than a second threshold.

One aspect of the loudness adjusting method according to the present invention is
a loudness adjusting method that adjusts the loudness of an input digital audio signal, comprising:
a gain adjustment step of performing gain adjustment on audio data extracted from the input digital audio signal;
a first filter step of generating filtered audio data by applying characteristic filtering to the audio data gain-adjusted in the gain adjustment step;
a boost processing step of amplifying and outputting the audio data gain-adjusted in the gain adjustment step when the filtered audio data generated in the first filter step is equal to or less than a first threshold;
a second filter step of generating filtered audio data by applying characteristic filtering to the audio data output in the boost processing step; and
a limiter compressor step of performing limiter compressor processing on the audio data amplified by the boost processing unit and outputting the result when the filtered audio data generated in the second filter step is equal to or greater than a second threshold.

One aspect of the program according to the present invention is
a program that causes a computer to adjust the loudness of an input digital audio signal,
the program causing the computer to execute:
a gain adjustment step of performing gain adjustment on audio data extracted from the input digital audio signal;
a first filter step of generating filtered audio data by applying characteristic filtering to the audio data gain-adjusted in the gain adjustment step;
a boost processing step of amplifying and outputting the audio data gain-adjusted in the gain adjustment step when the filtered audio data generated in the first filter step is equal to or less than a first threshold;
a second filter step of generating filtered audio data by applying characteristic filtering to the audio data output in the boost processing step; and
a limiter compressor step of performing limiter compressor processing on the audio data amplified by the boost processing unit and outputting the result when the filtered audio data generated in the second filter step is equal to or greater than a second threshold.

According to the present invention, it is possible to provide a loudness adjusting device, a loudness adjusting method, and a program capable of adjusting the perceived volume while ensuring real-time operation.

FIG. 1 is a block diagram showing the overall configuration of the loudness adjusting device 1 according to Embodiment 1.
FIG. 2 is a conceptual diagram showing general suppression processing (ATK processing).
FIG. 3 is a conceptual diagram showing general release processing (REL processing).
FIG. 4 is a flowchart showing the suppression processing (ATK processing) of the control unit 19 according to Embodiment 1.
FIG. 5 is a flowchart showing the release processing (REL processing) of the control unit 19 according to Embodiment 1.
FIG. 6 is a conceptual diagram showing the loudness adjustment of the loudness adjusting device 1 according to Embodiment 1.
FIG. 7 is a conceptual diagram showing gain adjustment by the control unit 19 according to Embodiment 1.
FIG. 8 is a conceptual diagram showing gain adjustment by the control unit 19 according to Embodiment 1.
FIG. 9 is a conceptual diagram showing release processing (REL processing) by the control unit 19 according to Embodiment 1.
FIG. 10 is a block diagram showing the overall configuration of the loudness adjusting device 1 according to Embodiment 2.
FIG. 11 is a block diagram showing the overall configuration of the loudness adjusting device 1 according to Embodiment 3.
FIG. 12 is a block diagram showing the overall configuration of the loudness adjusting device 1 according to the present invention.

<Embodiment 1>
Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the loudness adjusting device 1 according to the present embodiment. The loudness adjusting device 1 is, for example, a device built into a digital television receiver. The loudness adjusting device 1 may also be built into a device that handles audio such as Internet broadcasts or radio broadcasts (for example, a computer or a radio device; also referred to as a video/audio processing device).

The loudness adjusting device 1 includes a decoder 11, a gain adjustment unit 12, an audio adjustment unit 13, a selector 18, a control unit 19, a moving mean-square calculation unit 20, a moving mean-square calculation unit 21, a K filter 22, and an encoder 23. The audio adjustment unit 13 includes a K filter 14, an audio data boost unit 15, a K filter 16, and a limiter compressor unit 17.

An input digital audio signal is supplied to the decoder 11. The input digital audio signal is data input from, for example, the antenna of a digital television receiver. The input audio data is, for example, an AES/EBU format signal or an embedded audio signal. The decoder 11 decodes the input digital audio signal, supplies the decoded audio data to the gain adjustment unit 12, and supplies the additional information obtained by decoding (audio mode information, switching information, loudness operation information, and the like) to the selector 18.

When the additional information obtained by decoding contains information instructing that loudness adjustment not be performed, the decoder 11 supplies the decoded audio data unchanged to the encoder 23 or to an arbitrary processing unit, and does not supply audio data to the gain adjustment unit 12.

The gain adjustment unit 12 adjusts the gain of the audio data input from the decoder 11 using the gain adjustment value input from the control unit 19. The gain adjustment unit 12 supplies the gain-adjusted audio data to the K filter 14 and the audio data boost unit 15.
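
As a rough illustration of the gain adjustment step, the sketch below applies a gain adjustment value to a block of samples. The function name `apply_gain_db`, the floating-point sample format, and the expression of the gain adjustment value in decibels are assumptions made for illustration; the specification does not fix any of them.

```python
def apply_gain_db(samples, gain_db):
    """Scale linear samples by a gain given in decibels (assumed representation)."""
    factor = 10.0 ** (gain_db / 20.0)  # convert a dB gain to an amplitude ratio
    return [s * factor for s in samples]
```

A gain of 0 dB leaves the samples unchanged, and a gain of -20 dB scales them to one tenth of their amplitude.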

The audio adjustment unit 13 performs boost processing and limiter compressor processing on the input audio data without waiting for the moving mean-square calculation described later. The boost processing is adapted to the frequency characteristics of the loudness operation standard. The processing of each unit within the audio adjustment unit 13 is described below.

The K filter 14 applies K-characteristic filtering (the characteristic of the perceptual weighting filter specified in ITU-R BS.1770) to the input audio data, and supplies the filtered audio data to the audio data boost unit 15.
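
For reference, K-characteristic filtering can be sketched as two cascaded second-order sections, using the coefficients that ITU-R BS.1770 tabulates for a 48 kHz sampling rate. The function names `biquad` and `k_filter` are hypothetical, and the 48 kHz rate is an assumption; the specification does not say how the K filters 14, 16, and 22 are implemented.

```python
# K-weighting coefficients for 48 kHz sampling, as tabulated in ITU-R BS.1770.
# Each entry is ((b0, b1, b2), (a1, a2)) with a0 normalized to 1.
SHELF = ((1.53512485958697, -2.69169618940638, 1.19839281085285),
         (-1.69065929318241, 0.73248077421585))
HIGHPASS = ((1.0, -2.0, 1.0),
            (-1.99004745483398, 0.99007225036621))


def biquad(samples, b, a):
    """Direct-form I second-order section."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def k_filter(samples):
    """High-shelf stage followed by the high-pass stage."""
    return biquad(biquad(samples, *SHELF), *HIGHPASS)
```

Because the second stage is a high-pass filter, a constant (DC) input decays toward zero at the output, while high-frequency content is emphasized by the shelf stage.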

The audio data boost unit 15 determines whether the audio data input from the K filter 14 is at or below a predetermined threshold. The determination is made, for example, by comparing decibel (dB) values. When the data is at or below the threshold, the audio data boost unit 15 amplifies the audio data input from the gain adjustment unit 12 (boost processing) and supplies the amplified audio data to the K filter 16 and the limiter compressor unit 17. When the data exceeds the threshold, the audio data boost unit 15 supplies the audio data input from the gain adjustment unit 12 unchanged to the K filter 16 and the limiter compressor unit 17.
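
The boost decision can be sketched as follows. The block-wise peak level used as the dB measure, the threshold of -40 dB, the boost amount of 6 dB, and the names `level_db` and `boost` are all assumptions for illustration; the specification only states that the K-filtered data is compared with a predetermined threshold and that the gain-adjusted data (not the filtered data) is what gets amplified.

```python
import math


def level_db(samples):
    """Peak level of a block in dB relative to full scale 1.0 (assumed measure)."""
    peak = max((abs(s) for s in samples), default=0.0)
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")


def boost(gain_adjusted, k_filtered, threshold_db=-40.0, boost_db=6.0):
    """Amplify the gain-adjusted block when the K-filtered block is at or below the threshold."""
    if level_db(k_filtered) <= threshold_db:
        factor = 10.0 ** (boost_db / 20.0)
        return [s * factor for s in gain_adjusted]
    return list(gain_adjusted)  # above the threshold: pass through unchanged
```

Note that the K-filtered signal serves only as the detection path; the audio that continues down the chain is the unfiltered, gain-adjusted data.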

The K filter 16 applies K-characteristic filtering (the characteristic of the perceptual weighting filter specified in ITU-R BS.1770) to the audio data input from the audio data boost unit 15, and supplies the filtered audio data to the limiter compressor unit 17.

The limiter compressor unit 17 determines whether the audio data input from the K filter 16 is at or above a predetermined threshold. The determination is made, for example, by comparing decibel (dB) values. When the data is at or above the threshold, the limiter compressor unit 17 performs limiter compressor processing on the audio data input from the audio data boost unit 15 and supplies the processed audio data to the K filter 22 and the encoder 23. The limiter compressor processing may be any audio level optimization processing used in general broadcast audio processing. When the data is not at or above the threshold, the limiter compressor unit 17 supplies the audio data input from the audio data boost unit 15 unchanged to the K filter 22 and the encoder 23.
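
Since the specification leaves the limiter compressor processing open, the sketch below substitutes a simple static compression curve as one possible choice. The function name `limiter_compressor`, the block-wise peak level measure, the -10 dB threshold, and the 4:1 ratio are assumptions and are not taken from the specification.

```python
import math


def limiter_compressor(boosted, k_filtered, threshold_db=-10.0, ratio=4.0):
    """Apply a static compression curve when the K-filtered level reaches the threshold."""
    peak = max((abs(s) for s in k_filtered), default=0.0)
    level = 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")
    if level < threshold_db:
        return list(boosted)  # below the threshold: pass through unchanged
    # At or above the threshold, the overshoot in dB is divided by the ratio.
    reduction_db = (level - threshold_db) * (1.0 - 1.0 / ratio)
    factor = 10.0 ** (-reduction_db / 20.0)
    return [s * factor for s in boosted]
```

With these example values, a block whose detected level is 0 dB is attenuated by 7.5 dB, and a block at -20 dB passes through untouched.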

The encoder 23 encodes the audio data input from the limiter compressor unit 17 or the decoder 11 into a predetermined format, and supplies the encoded audio data to an arbitrary processing unit (for example, the speaker of a digital television receiver).

The K filter 22 applies K-characteristic filtering (the characteristic of the perceptual weighting filter specified in ITU-R BS.1770) to the audio data input from the limiter compressor unit 17, and supplies the filtered audio data to the moving mean-square calculation unit 20 and the moving mean-square calculation unit 21.

The moving mean-square calculation units 20 and 21 calculate mean-square values used to obtain the signal power of the audio data input from the K filter 22. The two units calculate moving mean squares over different time spans. In the following description, the moving mean-square calculation unit 20 calculates the moving mean-square value in units of 100 ms (0.1 s), the minimum block specified in Non-Patent Document 1, and supplies the calculated value S (the moving mean-square value over a short time) to the control unit 19. The moving mean-square calculation unit 21 calculates the moving mean-square value in units of 3 s, the time specified in EBU R 128, and supplies the calculated value L (the moving mean-square value over a long time) to the control unit 19. For details of the moving mean-square calculation, see Non-Patent Document 1. The time units given above (0.1 s and 3 s) are only examples, and the units are not limited to these values.
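
A minimal sketch of the two moving mean-square calculations follows, assuming a rectangular sliding window updated per sample and a 48 kHz sampling rate. The class name `MovingMeanSquare`, the helper `to_loudness`, and the instance names are hypothetical. The -0.691 offset is the constant used in the ITU-R BS.1770 loudness formula, shown here for a single channel.

```python
import math
from collections import deque


class MovingMeanSquare:
    """Mean square of the most recent `window` samples (rectangular sliding window)."""

    def __init__(self, window):
        self.window = window
        self.buf = deque()
        self.total = 0.0

    def push(self, sample):
        sq = sample * sample
        self.buf.append(sq)
        self.total += sq
        if len(self.buf) > self.window:
            self.total -= self.buf.popleft()  # drop the oldest squared sample
        return self.total / len(self.buf)


def to_loudness(mean_square):
    """Convert a mean-square power to a loudness figure (single channel)."""
    if mean_square <= 0.0:
        return float("-inf")
    return -0.691 + 10.0 * math.log10(mean_square)


RATE = 48000                              # assumed sampling rate in Hz
calc_s = MovingMeanSquare(RATE // 10)     # 100 ms window, yields the value S
calc_l = MovingMeanSquare(RATE * 3)       # 3 s window, yields the value L
```

The short window reacts quickly to level changes, while the long window only reflects a change after several seconds of audio, which is why the audio adjustment unit 13 acts without waiting for either value.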

The selector 18 receives control information from outside and the additional information from the decoder 11. The control information can include instruction information requesting the high-speed release processing (REL processing) described later. As noted above, the additional information includes information on audio mode switching (for example, switching between commercial (CM) audio and program audio). From these two inputs, the selector 18 selects as appropriate the information that changes the operation of the control unit 19, and supplies the selected information to the control unit 19. The operation of the selector 18 is described in detail later with reference to FIG. 5.

The control unit 19 receives the information selected by the selector 18, the target loudness value, the calculated value S from the moving mean-square calculation unit 20, and the calculated value L from the moving mean-square calculation unit 21. Based on this information, the control unit 19 calculates a gain adjustment value and supplies it to the gain adjustment unit 12.

As described above, the target loudness value is the average loudness value that serves as the target for the output volume. The control unit 19 controls the gain adjustment of audio data in the gain adjustment unit 12 by performing suppression processing (ATK (attack) processing) or release processing (REL (release) processing) on the gain adjustment value. Suppression processing (ATK processing) means setting a gain adjustment value that suppresses the audio data when audio data at or above a specified level (dB) is input. Release processing (REL processing) means setting a gain adjustment value that releases the audio data when audio data at or below the specified level (dB) is input. The control unit 19 supplies the gain adjustment value it has set to the gain adjustment unit 12.

An overview of general suppression processing (ATK processing) and release processing (REL processing) is given with reference to FIGS. 2 and 3. When audio data reaches the specified level (th), processing that suppresses the audio data (ATK processing) is performed as shown in FIG. 2. For example, when the decibel (dB) value of the audio data is greater than the specified level (th), a gain adjustment value that lowers the level of the audio data is set. The time taken for the amount of suppression to reach a certain proportion (usually about 90% or more) is called the suppression time (attack time).

As shown in FIG. 3, when the audio data falls to or below the specified level (th) while in the suppressed state, processing that releases the suppressed state (REL processing) is performed. The time taken for the amount released to reach a certain proportion (usually about 90% or more) is called the release time. For details of suppression processing (ATK processing) and release processing (REL processing), see, for example, Patent Document 1.
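
To make the 90% definition of attack and release time concrete, the sketch below assumes that the gain value approaches its target along a one-pole (exponential) curve. That curve shape is an assumption; the text above only defines the time as the point at which about 90% of the change has taken place. The names `smoothing_coeff` and `ramp` are hypothetical.

```python
def smoothing_coeff(time_s, rate_hz, fraction=0.9):
    """Per-update coefficient such that `fraction` of a step change is covered in `time_s`."""
    steps = max(1, round(time_s * rate_hz))
    # After `steps` updates the remaining distance is coeff ** steps = 1 - fraction.
    return (1.0 - fraction) ** (1.0 / steps)


def ramp(start_db, target_db, coeff, steps):
    """Move a gain value toward its target by repeated one-pole updates."""
    g = start_db
    for _ in range(steps):
        g = target_db + coeff * (g - target_db)
    return g
```

For example, with an attack time of 10 ms at 1000 updates per second, a gain that starts at 0 dB and targets -10 dB has covered 9 dB of the change after 10 updates. A shorter time yields a smaller coefficient and a faster approach.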

The detailed processing of the control unit 19 is described below with reference to FIGS. 4 and 5. First, the suppression processing (ATK processing) performed by the control unit 19 is described with reference to FIG. 4. The control unit 19 compares the calculated value S (the output of the moving mean-square calculation unit 20) with the target loudness value (Lth) (ST11). When the calculated value S is not greater than the target loudness value (Lth) (ST11: No), the control unit 19 performs no processing. When the calculated value S is greater than the target loudness value (Lth) (ST11: Yes), the control unit 19 determines whether suppression processing (ATK processing) is in progress (ST12).

When suppression processing (ATK processing) is not in progress (ST12: No), the control unit 19 determines whether the calculated value L (the output of the moving mean-square calculation unit 21) is on an increasing trend (ST13). When the calculated value L is not on an increasing trend (ST13: No), the control unit 19 performs no processing. When the calculated value L is on an increasing trend (ST13: Yes), the control unit 19 executes suppression processing (ATK processing) (ST14). The gain adjustment value is adjusted according to the desired suppression time and the dB value of the input audio data.

When suppression processing (ATK processing) is in progress (ST12: Yes), the control unit 19 determines whether the calculated value L (the output of the moving mean-square calculation unit 21) is on an increasing trend (ST15). When the calculated value L is not on an increasing trend (ST15: No), the control unit 19 interrupts the suppression processing (ATK processing) (ST16). When the calculated value L is on an increasing trend (ST15: Yes), the control unit 19 executes suppression processing (ATK processing) (ST14).
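
The decision flow of steps ST11 to ST16 can be summarized in a small function. This is a sketch under stated assumptions: the name `atk_step` is hypothetical, "increasing trend" is approximated by comparing the current value of L with its previous value, and "no processing" at ST11 is taken to mean that the suppression state is left unchanged.

```python
def atk_step(s, l_now, l_prev, target, attacking):
    """Return True if suppression (ATK) is active after this step.

    s         -- calculated value S (short window)
    l_now     -- current calculated value L (long window)
    l_prev    -- previous calculated value L
    target    -- target loudness value (Lth)
    attacking -- whether suppression is currently in progress
    """
    if s <= target:              # ST11: No, so no processing (state assumed unchanged)
        return attacking
    rising = l_now > l_prev      # ST13 / ST15: assumed test for an increasing trend
    if rising:
        return True              # ST14: execute suppression
    # ST13: No leaves suppression inactive; ST15: No interrupts it (ST16).
    return False
```

The value S decides whether any action is considered at all, while the trend of L decides whether suppression starts, continues, or is interrupted.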

Next, the flow of release processing (REL processing) from suppression processing (ATK processing) is described with reference to FIG. 5. When the audio mode is changed, the selector 18 outputs a signal indicating the change. The control unit 19 determines from the output signal of the selector 18 whether the audio mode has been switched (ST21). When the audio mode has not been switched (ST21: No), the control unit 19 compares the threshold for release (also referred to as the release threshold (REL start th)) with the calculated value S (ST22). The release threshold is lower than the target loudness value.

When the calculated value S is at or below the release threshold (ST22: Yes), the control unit 19 sets the gain adjustment value so that normal release processing is performed (ST23). That is, the control unit 19 sets the gain adjustment value so that the release takes the predetermined release time (T1) (ST23). When the calculated value S is not at or below the release threshold (ST22: No), the control unit 19 performs no processing.

When the audio mode changes while suppression processing (ATK processing) is in progress (ST21: Yes), the control unit 19 sets the gain adjustment value so that high-speed release processing is performed (ST24). That is, the control unit 19 sets the gain adjustment value so that the release takes a release time (T2) shorter than the predetermined release time (T1) (ST24). In other words, the control unit 19 sets the gain adjustment value so that the amount released is larger than in normal release processing (REL processing). The gain adjustment value is adjusted according to the desired release time (T1 or T2) and the dB value of the input audio data.

In the above determination, when control information requesting high-speed release processing (REL processing) is input from the selector 18, the control unit 19 performs the same processing as when the audio mode is changed (ST21: Yes).
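
The release flow of steps ST21 to ST24 can be sketched in the same way. The function name `rel_release_time`, the example times of 1.0 s for T1 and 0.2 s for T2, and the use of `None` to mean "no processing" are assumptions for illustration. The sketch also assumes the function is only called while the device is in the suppressed state.

```python
def rel_release_time(s, release_threshold, mode_changed, fast_requested,
                     t1=1.0, t2=0.2):
    """Return the release time to apply, or None when no release is started.

    s                 -- calculated value S (short window)
    release_threshold -- release threshold, lower than the target loudness value
    mode_changed      -- selector 18 reported an audio mode switch
    fast_requested    -- control information requested high-speed release
    t1, t2            -- normal and fast release times in seconds (example values)
    """
    if mode_changed or fast_requested:  # ST21: Yes (or equivalent control information)
        return t2                       # ST24: high-speed release
    if s <= release_threshold:          # ST22: Yes
        return t1                       # ST23: normal release
    return None                         # ST22: No, so no processing
```

A mode switch or an explicit request takes priority over the threshold comparison, so the fast release applies even when S is still above the release threshold.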

Next, the relationship between the operation of each processing unit and the output audio data is described with reference to FIGS. 6 to 9. FIG. 6 is a graph showing the control performed by the audio adjustment unit 13 (K filter 14, audio data boost unit 15, K filter 16, and limiter compressor unit 17).

As described above, the loudness adjustment method described in Non-Patent Document 1 uses a moving mean square, which makes real-time loudness adjustment difficult. Consequently, as shown in FIG. 6, when audio data with a momentarily high volume is input, a general loudness adjustment method produces a large volume difference ((1) in the figure). This is unpleasant for the viewer.

In contrast, the loudness adjusting device 1 according to the present embodiment includes the audio adjustment unit 13, which suppresses the input audio data (limiter compressor processing) without using a moving mean-square value. The volume difference can therefore be kept small even when audio data with a momentarily high volume is input ((2) in the figure). This makes it possible to provide audio data that does not feel unnatural or unpleasant to the viewer.

FIGS. 7 and 8 are graphs showing how the control unit 19 adjusts the gain adjustment value. FIG. 7 shows the calculated value S, the thresholds (the target loudness value and the release threshold), and the increase and decrease of the calculated value L at each timing (TM1 to TM5). FIG. 8 shows the adjustment of the gain adjustment value by the control unit 19. FIGS. 7 and 8 display the same input audio data, calculated value S, and calculated value L.

At timing TM1, the calculated value S is equal to or greater than the target loudness value, and the calculated value L is increasing (TM1 in FIG. 7). Therefore, the control unit 19 starts the suppression process (ATK process) at timing TM1 (TM1 in FIG. 8). Between timings TM1 and TM2, the control unit 19 executes the suppression process (ATK process).

At timing TM2, the calculated value L is decreasing (TM2 in FIG. 7). Therefore, the control unit 19 suspends the suppression process (ATK process) from timing TM2 (TM2 in FIG. 8). Between timings TM2 and TM3, the control unit 19 does not perform the suppression process (ATK process).

At timing TM3, the calculated value S is equal to or greater than the target loudness value, and the calculated value L starts increasing again (TM3 in FIG. 7). Therefore, the control unit 19 resumes the suppression process (ATK process) from timing TM3 (TM3 in FIG. 8).

At timing TM4, the calculated value S falls to or below the release threshold (TM4 in FIG. 7). Therefore, the control unit 19 starts the release process (REL process) from timing TM4 (TM4 in FIG. 8).

At timing TM5, the calculated value S becomes equal to or greater than the target loudness value, and the calculated value L is increasing (TM5 in FIG. 7). Therefore, the control unit 19 starts the suppression process (ATK process) from timing TM5 (TM5 in FIG. 8). As described above, the control unit 19 adjusts the gain adjustment value using the calculated value S and the calculated value L.
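The decisions at timings TM1 to TM5 can be summarized as a small decision function. The following Python sketch is only an illustration under stated assumptions: the names `Action` and `decide_action`, the extra `HOLD`/`NONE` states, and all numeric values are hypothetical and do not appear in the specification.

```python
from enum import Enum


class Action(Enum):
    ATTACK = "ATK"    # suppression: lower the gain adjustment value
    HOLD = "HOLD"     # suppression suspended: keep the current gain
    RELEASE = "REL"   # release: return the gain toward its original value
    NONE = "NONE"     # no suppression is active and none is needed


def decide_action(s, l_now, l_prev, target, release_threshold, suppressing):
    """Pick the control action for one timing from the calculated values S and L.

    s, l_now, l_prev, target and release_threshold share one loudness unit.
    suppressing is True while earlier suppression is still applied.
    """
    # Release only once S has dropped to or below the release threshold.
    if suppressing and s <= release_threshold:
        return Action.RELEASE
    # At or above the target: attack only while L is increasing.
    if s >= target:
        return Action.ATTACK if l_now > l_prev else Action.HOLD
    # Between the two thresholds: keep whatever suppression is applied.
    return Action.HOLD if suppressing else Action.NONE
```

With a hypothetical target of -24 and release threshold of -30, feeding values shaped like FIG. 7 yields ATK, HOLD, ATK, REL, ATK for TM1 to TM5.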

FIG. 9 is a diagram showing the release process performed by the control unit 19. It is a graph comparing audio data generated with a general release process (comparison with the target loudness value) and audio data generated with a release process that uses the release threshold. As shown in the figure, the general release process releases immediately when the audio data falls below the target loudness value. As a result, suppression and release alternate repeatedly near the target loudness value, and the amplitude fluctuates relative to the original audio data. In other words, audio data that accurately reproduces the inflection of the original audio data cannot be generated.

In contrast, audio data generated with a release process that uses a release threshold smaller than the target loudness value no longer alternates between suppression and release near the target loudness value. This avoids amplitude fluctuation relative to the original audio data. Therefore, as shown in FIG. 9, audio data that accurately reproduces the inflection of the original audio data can be generated.
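The benefit of a separate release threshold is a hysteresis effect, which can be checked with a toy example. The sketch below is an assumption-laden illustration, not the control logic of the embodiment: `count_switches` is a hypothetical helper, and the level sequence and thresholds are invented values.

```python
def count_switches(levels, target, release_threshold):
    """Count how often control flips between suppression and release.

    Suppression starts when a level reaches the target and ends when a level
    falls to or below release_threshold. Passing release_threshold == target
    models the general method that releases as soon as the level drops back.
    """
    suppressing = False
    switches = 0
    for level in levels:
        if not suppressing and level >= target:
            suppressing = True
            switches += 1
        elif suppressing and level <= release_threshold:
            suppressing = False
            switches += 1
    return switches


# Levels hovering around a hypothetical target of -24.
levels = [-23, -25, -23, -25, -23, -25]
general = count_switches(levels, target=-24, release_threshold=-24)
with_hysteresis = count_switches(levels, target=-24, release_threshold=-30)
```

In this toy input the general method switches on every sample, while the lower release threshold leaves a single transition into suppression.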

Next, the effects of the loudness adjusting apparatus 1 according to the present embodiment will be described again. As described above, the audio adjustment unit 13 performs the boost process and the limiter compressor process in accordance with the characteristic filter process, without using a calculated moving mean square value. Unlike the calculation of a moving mean square, these processes can be performed immediately. With this configuration, extremely loud audio data can be suppressed, and audio data of normal volume that follows the extremely loud audio data is not suppressed unnecessarily. That is, the sense of volume can be adjusted appropriately while real-time performance is ensured. In other words, loudness adjustment that keeps the sound quality constant (generates audio that does not feel unnatural) can be realized while real-time performance is maintained.

Furthermore, as shown in FIG. 4, the control unit 19 performs the suppression process (ATK process) only while the calculated value L is increasing and suspends it while the calculated value L is decreasing, so that excessive suppression is not performed. This avoids outputting audio data that has been suppressed too much.

Furthermore, when shifting from the suppression process (ATK process) to the release process (REL process), the release process is performed only when the level of the audio data falls to or below the release threshold, which is smaller than the target loudness value. As a result, unnecessary release processing is avoided, and the inflection of the generated audio data can be brought close to that of the original audio data.

When the audio mode changes or an external instruction is received during the suppression process (ATK process), the control unit 19 sets the gain adjustment value so that the release time is shorter than in the normal release process (REL process); that is, it executes a high-speed release process. Audio signals for digital television mix program content and commercials, and the two may differ greatly in perceived volume. With a general loudness adjustment method, when the program content switches to a commercial, a suppression process (ATK process) based on the analysis of the audio data of the program content (calculation of the moving mean square) may still be applied. That is, suppression suited to the program content is also applied to the commercial. In the loudness adjusting apparatus according to the present embodiment, however, the high-speed release process described above cancels the suppression amount immediately, so that loudness adjustment suited to the content can be performed instantly.
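The difference between the normal and the high-speed release process can be pictured as two ramps of the same shape with different durations. The sketch below assumes a linear ramp in dB and a 0.1-second control step; the function name `release_ramp`, the ramp shape, and the release times (2.0 s and 0.2 s) are hypothetical choices for illustration, since the specification only states that the high-speed release completes in a shorter time.

```python
def release_ramp(suppression_db, release_time_s, step_s=0.1):
    """Return the remaining suppression (dB) after each control step.

    The suppression amount is reduced linearly and reaches 0 dB when
    release_time_s has elapsed.
    """
    steps = max(1, round(release_time_s / step_s))
    return [suppression_db * (1 - k / steps) for k in range(1, steps + 1)]


normal = release_ramp(6.0, release_time_s=2.0)  # hypothetical normal REL time
fast = release_ramp(6.0, release_time_s=0.2)    # hypothetical high-speed REL time
```

Both ramps end with no remaining suppression; the high-speed one simply gets there in fewer control steps, which is what lets the gain follow a change of content quickly.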

Furthermore, when the additional information obtained by decoding includes information instructing that loudness adjustment not be performed, the decoder 11 supplies the decoded audio data as it is to the encoder 23 or to an arbitrary processing unit. The input digital audio signals supplied to the decoder 11 include, in addition to signals for programs such as live broadcasts, signals for content whose loudness has already been adjusted as specified in Non-Patent Document 1. In the latter case, the loudness has already been adjusted as the producer intended, so there is no need to adjust it again. Because the decoder 11 outputs the latter kind of signal as it is, without loudness adjustment, the audio data can be supplied without altering the producer's intention.

<Embodiment 2>
The loudness adjusting apparatus 1 according to the present embodiment is characterized in that it can process multi-channel audio data (surround or a larger number of channels). The differences from Embodiment 1 are described below.

FIG. 10 is a block diagram showing the configuration of the loudness adjusting apparatus 1 according to the present embodiment. An input digital audio signal carrying multiple channels (surround or a larger number of channels) is input to the loudness adjusting apparatus 1.

The loudness adjusting apparatus 1 includes, in a number corresponding to the channels in the input digital audio signal, audio adjustment units 13 (13-1 to 13-n, where n is an integer of 2 or more; the same applies below), moving mean square calculation units 20 (20-1 to 20-n), moving mean square calculation units 21 (21-1 to 21-n), K filters 22 (22-1 to 22-n), and gain multiplication units 24 (24-1 to 24-n), as well as a mean square integration unit 25-1 and a mean square integration unit 25-2.

The decoder 11 decodes the input digital audio signal and supplies the audio data corresponding to each channel to the audio adjustment units 13-1 to 13-n. The processing units with the same reference numerals as in Embodiment 1 perform the same processing as in Embodiment 1 for each channel, so a detailed description is omitted.

The gain multiplication unit 24-1 multiplies the calculated value S and the calculated value L by the gain associated with the channel being processed. It then supplies the multiplication result for the calculated value S to the mean square integration unit 25-1 and the multiplication result for the calculated value L to the mean square integration unit 25-2.

The mean square integration unit 25-1 integrates (cumulatively adds) the calculated values S of the channels (moving mean square values in units of 100 ms (0.1 second), the minimum block) and supplies the integration result to the control unit 19.

The mean square integration unit 25-2 integrates (cumulatively adds) the calculated values L of the channels (moving mean square values in units of 3 seconds) and supplies the integration result to the control unit 19.

Based on the integrated values input from the mean square integration unit 25-1 and the mean square integration unit 25-2, the control unit 19 calculates the gain adjustment values and supplies them to the gain adjustment units 12-1 to 12-n.
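The per-channel gain multiplication and the subsequent integration can be sketched as a weighted sum of mean-square values. The Python below is an illustrative sketch only: `integrate_channels` and `to_loudness` are hypothetical names, the channel gains are assumed example values, and the conversion to a loudness value follows the form used in ITU-R BS.1770 (-0.691 + 10 log10 of the weighted sum), which this specification does not itself spell out.

```python
import math


def integrate_channels(mean_squares, gains):
    """Sum the per-channel mean-square values after applying each channel gain."""
    if len(mean_squares) != len(gains):
        raise ValueError("one gain per channel is required")
    return sum(g * z for g, z in zip(gains, mean_squares))


def to_loudness(integrated):
    """Convert an integrated mean square to a loudness value (BS.1770-style)."""
    if integrated <= 0:
        return float("-inf")  # digital silence has no finite loudness value
    return -0.691 + 10.0 * math.log10(integrated)


# Hypothetical 5-channel example (L, R, C, Ls, Rs) with assumed channel gains.
gains = [1.0, 1.0, 1.0, 1.41, 1.41]
s_values = [0.01, 0.01, 0.02, 0.005, 0.005]   # calculated value S per channel
integrated_s = integrate_channels(s_values, gains)
loudness_s = to_loudness(integrated_s)
```

The same two functions would be applied to the calculated values L, so that the control logic can keep comparing a single S and a single L against the thresholds regardless of the channel count.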

Next, the effects of the loudness adjusting apparatus 1 according to the present embodiment will be described. With the configuration described above, the loudness adjusting apparatus 1 can adjust the sense of volume while ensuring real-time performance even when an input digital audio signal carrying multiple channels is input.

<Embodiment 3>
The loudness adjusting apparatus 1 according to the present embodiment is characterized in that it includes a processing unit that determines whether the audio data is silent. The differences from Embodiment 1 are described below.

FIG. 11 is a block diagram showing the configuration of the loudness adjusting apparatus 1 according to the present embodiment. In addition to the configuration of Embodiment 1 (FIG. 1), the loudness adjusting apparatus 1 further includes a silence detection unit 26.

The decoder 11 supplies the decoded audio data to the gain adjustment unit 12 and the silence detection unit 26 as appropriate. The silence detection unit 26 monitors the audio data for level variation, that is, variation in decibels (dB), and notifies the control unit 19 when it detects any of the following: the level does not vary for a certain period of time; the audio data is all "0", as generated when the audio is muted (silent state) at a material switch; or a state in which the audio data takes a specific value continues for a certain period of time.
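A block-based detector along these lines might look as follows. This is a minimal sketch under assumptions: the class name `SilenceDetector`, the block-wise interface, the parameters `hold_blocks` and `tolerance_db`, and the choice to report an all-zero block immediately are all hypothetical, and the third condition (a specific data value persisting) is omitted for brevity.

```python
class SilenceDetector:
    """Report when a block is all zeros or the level has stopped varying."""

    def __init__(self, hold_blocks=5, tolerance_db=0.1):
        self.hold_blocks = hold_blocks      # blocks without variation before notifying
        self.tolerance_db = tolerance_db    # level change still treated as "no variation"
        self._last_level = None
        self._steady = 0

    def update(self, samples, level_db):
        """Feed one block; return True when the control unit should be notified."""
        if all(x == 0 for x in samples):
            return True  # muted block: every sample is zero
        if (self._last_level is not None
                and abs(level_db - self._last_level) <= self.tolerance_db):
            self._steady += 1
        else:
            self._steady = 0
        self._last_level = level_db
        return self._steady >= self.hold_blocks
```

In use, the control side would call `update` once per decoded block and trigger the high-speed release only if a suppression process is in progress when `True` is returned.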

When the control unit 19 receives a notification from the silence detection unit 26 while performing the suppression process (ATK process), it executes the high-speed release process described above (REL process, ST24 in FIG. 5).

Next, the effects of the loudness adjusting apparatus 1 according to the present embodiment will be described. Generally, when television program content switches to a commercial, there is a silent section of a certain length. However, the additional information extracted from the input digital audio signal may not include information about the content switch. In the loudness adjusting apparatus 1 according to the present embodiment, the silence detection unit 26 detects silent sections, and the control unit 19 executes the high-speed release process (REL process) according to the presence or absence of a silent section in addition to the additional information. As a result, the suppression process (ATK process) that was being performed before the content switched can be canceled immediately, and loudness adjustment suited to the content can be performed instantly.

Here, the outline of the present invention will be described again with reference to FIG. 12. FIG. 12 is a block diagram showing an outline of the loudness adjusting apparatus 1. The loudness adjusting apparatus 1 includes a gain adjustment unit 12 and an audio adjustment unit 13. The audio adjustment unit 13 includes a K filter 14, an audio data boost unit 15, a K filter 16, and a limiter compressor unit 17.

Audio data is input to the gain adjustment unit 12. Here, the audio data is data obtained by decoding, for example, an AES/EBU format signal or an embedded audio signal. The gain adjustment unit 12 adjusts the gain of the input audio data using a predetermined gain adjustment value and supplies the gain-adjusted audio data to the K filter 14 and the audio data boost unit 15.

The K filter 14 performs K-characteristic filtering on the input audio data and supplies the filtered audio data to the audio data boost unit 15.

The audio data boost unit 15 determines whether the audio data input from the K filter 14 is at or below a predetermined threshold. When it determines that the audio data is at or below the threshold, the audio data boost unit 15 amplifies the audio data input from the gain adjustment unit 12 (performs the boost process). The audio data boost unit 15 supplies the amplified audio data to the K filter 16 and the limiter compressor unit 17.

The K filter 16 performs K-characteristic filtering on the audio data input from the audio data boost unit 15 and supplies the filtered audio data to the limiter compressor unit 17.

The limiter compressor unit 17 determines whether the audio data input from the K filter 16 is at or above a predetermined threshold. When it determines that the audio data is at or above the threshold, the limiter compressor unit 17 performs limiter compressor processing on the audio data input from the audio data boost unit 15 and outputs the processed audio data.
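The signal flow just described uses the K-filtered data only to make the threshold decisions, while the boost and the limiter compressor act on the unfiltered data. The Python sketch below illustrates that flow sample by sample under several assumptions: `adjust_block` is a hypothetical name, the K filter is passed in as a function (an identity stand-in is used here instead of a real K-characteristic filter), and the fixed boost gain and the simple ratio-based compression curve are invented for illustration.

```python
def adjust_block(samples, k_filter, boost_threshold, boost_gain,
                 limit_threshold, ratio):
    """Boost quiet samples, then compress loud ones.

    k_filter maps a list of samples to a filtered list of the same length;
    its output drives the decisions only and is never sent to the output.
    """
    # Stage 1: boost decision on K-filtered data, gain applied to the input.
    detect1 = k_filter(samples)
    boosted = [x * boost_gain if abs(d) <= boost_threshold else x
               for x, d in zip(samples, detect1)]

    # Stage 2: limiter compressor decision on K-filtered boosted data.
    detect2 = k_filter(boosted)
    out = []
    for x, d in zip(boosted, detect2):
        excess = abs(x) - limit_threshold
        if abs(d) >= limit_threshold and excess > 0:
            sign = 1.0 if x >= 0 else -1.0
            x = sign * (limit_threshold + excess / ratio)
        out.append(x)
    return out


def identity(block):
    """Stand-in for the K filter, used only to keep the sketch self-contained."""
    return list(block)


result = adjust_block([0.01, 0.5, -0.9], identity,
                      boost_threshold=0.05, boost_gain=2.0,
                      limit_threshold=0.8, ratio=4.0)
```

Because neither stage waits for a moving mean square window to fill, each sample can be processed as it arrives, which is the real-time property the embodiment relies on.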

As described above, the loudness adjusting apparatus 1 performs the boost process and the limiter compressor process in accordance with the characteristic filter process, without using a calculated moving mean square value. Unlike the calculation of a moving mean square, these processes can be performed immediately. With this configuration, extremely loud audio data can be suppressed, for example as shown in FIG. 6, and audio data of normal volume that follows the extremely loud audio data is not suppressed unnecessarily. That is, the sense of volume can be adjusted appropriately while real-time performance is ensured.

The invention made by the present inventor has been described above specifically on the basis of the embodiments. Needless to say, however, the present invention is not limited to these embodiments, and various modifications can be made without departing from the gist of the invention.

The processes of the decoder 11, gain adjustment unit 12, audio adjustment unit 13, control unit 19, moving mean square calculation unit 20, moving mean square calculation unit 21, K filter 22, and encoder 23 described above can each be realized as a program that runs on an arbitrary computer.

The program can be stored on various types of non-transitory computer readable media and supplied to a computer. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (for example, flexible disks, magnetic tapes, hard disk drives), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memory (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (random access memory)). The program may also be supplied to a computer by various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can supply the program to a computer via a wired communication path, such as an electric wire or an optical fiber, or via a wireless communication path.

Part or all of the above embodiments can also be described as in the following appendices, but are not limited to them.

<Appendix 1>
A loudness adjustment device that adjusts the loudness of an input digital audio signal, comprising:
a gain adjustment unit that performs gain adjustment on audio data extracted from the input digital audio signal;
a first filter unit that generates filtered audio data by applying characteristic filtering to the audio data gain-adjusted by the gain adjustment unit;
a boost processing unit that, when the filtered audio data generated by the first filter unit is equal to or less than a first threshold, amplifies and outputs the audio data gain-adjusted by the gain adjustment unit;
a second filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the boost processing unit; and
a limiter compressor unit that, when the filtered audio data generated by the second filter unit is equal to or greater than a second threshold, performs limiter compressor processing on the audio data amplified by the boost processing unit and outputs the result.

<Appendix 2>
The loudness adjustment device according to Appendix 1, further comprising:
a third filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the limiter compressor unit;
a first mean square calculation unit that calculates a first mean square value from the filtered audio data generated by the third filter unit, by a moving mean square over a first time interval;
a second mean square calculation unit that calculates a second mean square value from the filtered audio data generated by the third filter unit, by a moving mean square over a second time interval longer than the first time interval; and
a control unit that calculates a gain adjustment value used for the gain adjustment on the basis of the first and second mean square values and a target loudness value.

<Appendix 3>
The loudness adjustment device according to Appendix 1, further comprising:
a third filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the limiter compressor unit;
a first mean square calculation unit that calculates a first mean square value from the filtered audio data generated by the third filter unit, by a moving mean square over a first time interval;
a second mean square calculation unit that calculates a second mean square value from the filtered audio data generated by the third filter unit, by a moving mean square over a second time interval longer than the first time interval; and
a control unit that calculates a gain adjustment value used for the gain adjustment on the basis of the first and second mean square values and a target loudness value.

<Appendix 4>
The loudness adjustment device according to Appendix 3, wherein the control unit:
when no control information is input from outside, the audio mode of the input digital audio signal has not changed, the suppression process is being executed, and the first mean square value is equal to or less than a release threshold, performs a release process that adjusts the gain adjustment value so that the release of the audio data extracted from the input digital audio signal completes in a predetermined time; and
when no control information is input from outside, the audio mode of the input digital audio signal has not changed, the suppression process is being executed, and the first mean square value is not equal to or less than the release threshold, does not perform the release process.

<Appendix 5>
The loudness adjustment device according to Appendix 3 or Appendix 4, wherein, when the suppression process is being executed and either an input of control information from outside or a change in the audio mode of the input digital audio signal occurs, the control unit adjusts the gain adjustment value so that the release of the audio data extracted from the input digital audio signal completes in a time shorter than the predetermined time.

<Appendix 6>
The loudness adjustment device according to any one of Appendices 1 to 5, further comprising a decoder that extracts the audio data from the input digital audio signal, wherein the decoder:
extracts additional information from the input digital audio signal and, when the additional information includes information instructing that gain adjustment not be performed, supplies the extracted audio data to any other output device without supplying it to the gain adjustment unit; and
otherwise supplies the extracted audio data to the gain adjustment unit.

<Appendix 7>
The loudness adjustment device according to Appendix 6, further comprising a silence detection unit that analyzes the audio data extracted by the decoder and notifies the control unit when a silent state continues for a certain period of time,
wherein, when the suppression process is being executed and a notification is received from the silence detection unit, the control unit adjusts the gain adjustment value so that the release of the audio data completes in a time shorter than the predetermined time.

<Appendix 8>
A loudness adjustment device that adjusts the loudness of an input digital audio signal, comprising:
an encoder that extracts first audio data and second audio data from the input digital audio signal;
a first gain adjustment unit that performs gain adjustment on the first audio data extracted by the encoder, using a first gain adjustment value;
a second gain adjustment unit that performs gain adjustment on the second audio data extracted by the encoder, using a second gain adjustment value;
a first filter unit that generates filtered audio data by applying characteristic filtering to the audio data gain-adjusted by the first gain adjustment unit;
a second filter unit that generates filtered audio data by applying characteristic filtering to the audio data gain-adjusted by the second gain adjustment unit;
a first boost processing unit that, when the filtered audio data generated by the first filter unit is equal to or less than a first threshold, amplifies and outputs the audio data gain-adjusted by the first gain adjustment unit;
a second boost processing unit that, when the filtered audio data generated by the second filter unit is equal to or less than the first threshold, amplifies and outputs the audio data gain-adjusted by the second gain adjustment unit;
a third filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the first boost processing unit;
a fourth filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the second boost processing unit;
a first limiter compressor unit that, when the filtered audio data generated by the third filter unit is equal to or greater than a second threshold, performs limiter compressor processing on the audio data amplified by the boost processing unit and outputs the result; and
a second limiter compressor unit that, when the filtered audio data generated by the fourth filter unit is equal to or greater than the second threshold, performs limiter compressor processing on the audio data amplified by the boost processing unit and outputs the result.

＜付記９＞
前記第１リミッタコンプレッサ部が出力した音声データを特性フィルタ処理したフィルタ済み音声データを生成する第５フィルタ部と、
前記第２リミッタコンプレッサ部が出力した音声データを特性フィルタ処理したフィルタ済み音声データを生成する第６フィルタ部と、
前記第５フィルタ部が処理したフィルタ済み音声データに対し、第１の時間区間を用いた移動二乗平均により算出した第１二乗平均値を算出する第１二乗平均算出部と、
前記第５フィルタ部が処理したフィルタ済み音声データに対し、第１の時間区間より長い第２の時間区間を用いた移動二乗平均により算出した第２二乗平均値を算出する第２二乗平均算出部と、
前記第６フィルタ部が処理したフィルタ済み音声データに対し、前記第１の時間区間を用いた移動二乗平均により算出した第３二乗平均値を算出する第３二乗平均算出部と、
前記第６フィルタ部が処理したフィルタ済み音声データに対し、前記第２の時間区間を用いた移動二乗平均により算出した第４二乗平均値を算出する第４二乗平均算出部と、
前記第１及び第２二乗平均値、及びターゲットラウドネス値に基づいて前記ゲイン調整値を算出する制御部と、
前記第１及び第３二乗平均値の積算値、前記第２及び第４二乗平均値の積算値、及びターゲットラウドネス値に基づいて、前記第１ゲイン調整値及び前記第２ゲイン調整値を算出する制御部と、を備えることを特徴とする付記８に記載のラウドネス調整装置。

<Appendix 9>
The loudness adjustment device according to appendix 8, further comprising:
a fifth filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the first limiter compressor unit;
a sixth filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the second limiter compressor unit;
a first mean square calculation unit that calculates a first mean square value by a moving mean square over a first time interval on the filtered audio data generated by the fifth filter unit;
a second mean square calculation unit that calculates a second mean square value by a moving mean square over a second time interval, longer than the first time interval, on the filtered audio data generated by the fifth filter unit;
a third mean square calculation unit that calculates a third mean square value by a moving mean square over the first time interval on the filtered audio data generated by the sixth filter unit;
a fourth mean square calculation unit that calculates a fourth mean square value by a moving mean square over the second time interval on the filtered audio data generated by the sixth filter unit;
a control unit that calculates the gain adjustment value based on the first and second mean square values and a target loudness value; and
a control unit that calculates the first gain adjustment value and the second gain adjustment value based on the integrated value of the first and third mean square values, the integrated value of the second and fourth mean square values, and the target loudness value.
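As an illustrative, non-limiting sketch of the two-channel case in appendix 9, the mean square values of the two channels can be combined per time interval before they are compared with the target loudness value. The sketch assumes that the "integrated value" is a plain sum with equal channel weights, and it converts the sum to a level using the expression -0.691 + 10 log10(sum) from ITU-R BS.1770; neither assumption is stated in this document, and the function name `combined_levels` is hypothetical.

```python
import math


def _level_db(mean_square_sum: float) -> float:
    """Convert a summed mean square value to a level in dB (BS.1770-style offset)."""
    if mean_square_sum <= 0.0:
        return float("-inf")  # digital silence has no finite level
    return -0.691 + 10.0 * math.log10(mean_square_sum)


def combined_levels(first_ms: float, second_ms: float,
                    third_ms: float, fourth_ms: float) -> tuple:
    """Return (short-interval level, long-interval level) for two channels.

    Assumption: the first and third values share the short interval and the
    second and fourth values share the long interval, as in appendix 9.
    """
    short_level = _level_db(first_ms + third_ms)
    long_level = _level_db(second_ms + fourth_ms)
    return short_level, long_level
```

A single pair of levels obtained this way could then drive both gain adjustment values, which would keep the balance between the two channels unchanged.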

＜付記１０＞
付記１乃至付記９のいずれか１項に記載のラウドネス調整装置を内蔵した映像／音声処理装置。

<Appendix 10>
A video/audio processing apparatus incorporating the loudness adjustment device according to any one of appendices 1 to 9.

＜付記１１＞
入力デジタル音声信号のラウドネス調整を行うラウドネス調整方法であって、
前記入力デジタル音声信号から抽出された音声データに対し、ゲイン調整を行うゲイン調整ステップと、
前記ゲイン調整ステップにおけるゲイン調整済みの音声データを特性フィルタ処理したフィルタ済み音声データを生成する第１フィルタステップと、
前記第１フィルタステップにて生成したフィルタ済み音声データが第１閾値以下の場合、前記ゲイン調整ステップにおいてゲイン調整済みの音声データを増幅して出力するブースト処理ステップと、
前記ブースト処理ステップにて出力された音声データを特性フィルタ処理したフィルタ済み音声データを生成する第２フィルタステップと、
前記第２フィルタステップにて生成したフィルタ済み音声データが第２閾値以上の場合、前記ブースト処理部が増幅した音声データに対してリミッタコンプレッサ処理を行って出力するリミッタコンプレッサステップと、を備えるラウドネス調整方法。

<Appendix 11>
A loudness adjustment method for adjusting the loudness of an input digital audio signal, comprising:
a gain adjustment step of performing gain adjustment on audio data extracted from the input digital audio signal;
a first filter step of generating filtered audio data by applying characteristic filtering to the audio data gain-adjusted in the gain adjustment step;
a boost processing step of amplifying and outputting the audio data gain-adjusted in the gain adjustment step when the filtered audio data generated in the first filter step is equal to or less than a first threshold;
a second filter step of generating filtered audio data by applying characteristic filtering to the audio data output in the boost processing step; and
a limiter compressor step of performing limiter compressor processing on the audio data amplified by the boost processing unit and outputting the result when the filtered audio data generated in the second filter step is equal to or greater than a second threshold.
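As an illustrative, non-limiting sketch, the steps of appendix 11 can be arranged as one function that processes a block of samples. Several simplifications are assumptions and not part of this document: the characteristic filter is replaced by an identity stand-in, the level of a block is taken as its mean square, the boost is a fixed multiplier, and the limiter compressor is reduced to hard clipping at a ceiling. All names are hypothetical.

```python
from typing import Callable, List, Sequence


def mean_square(block: Sequence[float]) -> float:
    """Mean of the squared samples; 0.0 for an empty block."""
    return sum(x * x for x in block) / len(block) if block else 0.0


def identity_filter(block: Sequence[float]) -> List[float]:
    """Stand-in for the characteristic filter (assumption: no weighting)."""
    return list(block)


def adjust_block(block: Sequence[float], gain: float,
                 first_threshold: float, second_threshold: float,
                 boost_gain: float = 2.0, ceiling: float = 1.0,
                 char_filter: Callable[[Sequence[float]], List[float]] = identity_filter
                 ) -> List[float]:
    # Gain adjustment step.
    adjusted = [gain * x for x in block]
    # First filter step, then boost only when the level is at or below the first threshold.
    if mean_square(char_filter(adjusted)) <= first_threshold:
        boosted = [boost_gain * x for x in adjusted]
    else:
        boosted = adjusted
    # Second filter step, then limit only when the level is at or above the second threshold.
    if mean_square(char_filter(boosted)) >= second_threshold:
        return [max(-ceiling, min(ceiling, x)) for x in boosted]
    return boosted
```

Measuring the level again after the boost lets the limiter stage react to the boosted signal rather than to the original one.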

＜付記１２＞
前記リミッタコンプレッサステップに手出力した音声データを特性フィルタ処理したフィルタ済み音声データを生成する第３フィルタステップと、
前記第３フィルタステップにて生成したフィルタ済み音声データに対し、第１の時間区間を用いた移動二乗平均により第１二乗平均値を算出する第１二乗平均算出ステップと、
前記第３フィルタステップにて生成したフィルタ済み音声データに対し、第１の時間区間より長い第２の時間区間を用いた移動二乗平均により第２二乗平均値を算出する第２二乗平均算出ステップと、
前記第１及び第２二乗平均値、及びターゲットラウドネス値に基づいて前記ゲイン調整に用いるゲイン調整値を算出する制御ステップと、
を備える付記１１に記載のラウドネス調整方法。

<Appendix 12>
The loudness adjustment method according to appendix 11, further comprising:
a third filter step of generating filtered audio data by applying characteristic filtering to the audio data output in the limiter compressor step;
a first mean square calculation step of calculating a first mean square value by a moving mean square over a first time interval on the filtered audio data generated in the third filter step;
a second mean square calculation step of calculating a second mean square value by a moving mean square over a second time interval, longer than the first time interval, on the filtered audio data generated in the third filter step; and
a control step of calculating a gain adjustment value used for the gain adjustment based on the first and second mean square values and a target loudness value.
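As an illustrative, non-limiting sketch, the two moving mean square calculations of appendix 12 can share one small class that is instantiated with a short and a long window. The window is expressed here as a number of samples, which is an assumption; the class name `MovingMeanSquare` is hypothetical.

```python
from collections import deque


class MovingMeanSquare:
    """Running mean of squared samples over the most recent `window` samples."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be a positive number of samples")
        self._window = window
        self._squares = deque()
        self._total = 0.0

    def update(self, sample: float) -> float:
        """Add one sample and return the current mean square value."""
        square = sample * sample
        self._squares.append(square)
        self._total += square
        # Drop the oldest squared sample once the window is exceeded.
        if len(self._squares) > self._window:
            self._total -= self._squares.popleft()
        return self._total / len(self._squares)
```

Feeding the same filtered samples to `MovingMeanSquare(short_window)` and `MovingMeanSquare(long_window)` yields the first and second mean square values; the shorter window follows sudden level changes while the longer one shows the trend.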

＜付記１３＞
前記制御ステップでは、前記第１二乗平均値が前記ターゲットラウドネス値よりも大きく、前記第２二乗平均値が増加傾向にある場合、前記入力デジタル音声信号から抽出された前記音声データを抑圧するように前記ゲイン調整値を調整する抑圧処理を行い、
前記第１二乗平均値が前記ターゲットラウドネス値よりも大きく、前記第２二乗平均値が減少傾向にある場合、前記抑圧処理を中断し、
上記以外の場合には、前記抑圧処理は行わない、
ことを特徴とする付記１２に記載のラウドネス調整方法。

<Appendix 13>
The loudness adjustment method according to appendix 12, wherein, in the control step:
when the first mean square value is greater than the target loudness value and the second mean square value is trending upward, a suppression process is performed that adjusts the gain adjustment value so as to suppress the audio data extracted from the input digital audio signal;
when the first mean square value is greater than the target loudness value and the second mean square value is trending downward, the suppression process is interrupted; and
in all other cases, the suppression process is not performed.
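As an illustrative, non-limiting sketch, the three cases of appendix 13 can be written as a single decision function. The sketch assumes that the trend of the second mean square value is judged by comparing it with its previous value and that the mean square values and the target loudness value are expressed in the same unit; neither detail is specified here, and the names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    SUPPRESS = "suppress"    # adjust the gain adjustment value to suppress the audio data
    INTERRUPT = "interrupt"  # interrupt an ongoing suppression process
    NONE = "none"            # do not perform the suppression process


def decide(first_ms: float, second_ms: float,
           previous_second_ms: float, target: float) -> Action:
    """Choose the control action from the short- and long-interval values."""
    if first_ms > target:
        if second_ms > previous_second_ms:   # long-interval value trending upward
            return Action.SUPPRESS
        if second_ms < previous_second_ms:   # long-interval value trending downward
            return Action.INTERRUPT
    # Every other case, including an unchanged long-interval value.
    return Action.NONE
```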

＜付記１４＞
前記制御ステップでは、
外部からの制御情報の入力が無く、前記入力デジタル音声信号の音声モードの変更がなく、前記抑圧処理の実行中であり、かつ前記第１二乗平均値が開放閾値以下である場合、所定時間で前記入力デジタル音声信号から抽出された前記音声データの開放が完了するように前記ゲイン調整値を調整する開放処理を行い、
外部からの制御情報の入力が無く、前記入力デジタル音声信号の音声モードの変更がなく、前記抑圧処理の実行中であり、かつ前記第１二乗平均値が前記開放閾値以下ではない場合、前記開放処理を行わない、
ことを特徴とする付記１３に記載のラウドネス調整方法。

<Appendix 14>
The loudness adjustment method according to appendix 13, wherein, in the control step:
when no control information is input from outside, the audio mode of the input digital audio signal has not changed, the suppression process is being executed, and the first mean square value is equal to or less than a release threshold, a release process is performed that adjusts the gain adjustment value so that release of the audio data extracted from the input digital audio signal is completed in a predetermined time; and
when no control information is input from outside, the audio mode of the input digital audio signal has not changed, the suppression process is being executed, and the first mean square value is not equal to or less than the release threshold, the release process is not performed.

＜付記１５＞
前記制御ステップでは、
前記抑圧処理の実行中であり、かつ外部からの制御情報の入力及び前記入力デジタル音声信号の音声モードの変更のいずれか一方が生じた場合、前記所定時間よりも短い時間で前記入力デジタル音声信号から抽出された前記音声データの開放が完了するように前記ゲイン調整値を調整する、
ことを特徴とする付記１３または付記１４に記載のラウドネス調整方法。

<Appendix 15>
The loudness adjustment method according to appendix 13 or 14, wherein, in the control step, when the suppression process is being executed and either control information is input from outside or the audio mode of the input digital audio signal changes, the gain adjustment value is adjusted so that release of the audio data extracted from the input digital audio signal is completed in a time shorter than the predetermined time.
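As an illustrative, non-limiting sketch, the release conditions of appendices 14 and 15 can be combined into one function that returns the time over which release should complete, or `None` when no release is started. The default durations are placeholders chosen only for the example, and the function and parameter names are hypothetical.

```python
from typing import Optional


def release_duration(suppressing: bool, external_control: bool, mode_changed: bool,
                     first_ms: float, release_threshold: float,
                     normal_release_s: float = 2.0,
                     fast_release_s: float = 0.2) -> Optional[float]:
    """Return the release time in seconds, or None if release is not performed."""
    if not suppressing:
        return None  # release only applies while the suppression process is running
    if external_control or mode_changed:
        # Appendix 15: complete release in a time shorter than the predetermined time.
        return fast_release_s
    if first_ms <= release_threshold:
        # Appendix 14: complete release in the predetermined time.
        return normal_release_s
    return None  # level still above the release threshold
```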

＜付記１６＞
入力デジタル音声信号のラウドネス調整をコンピュータに実行させるプログラムであって、
コンピュータに、
前記入力デジタル音声信号から抽出された音声データに対し、ゲイン調整を行うゲイン調整ステップと、
前記ゲイン調整ステップにおけるゲイン調整済みの音声データを特性フィルタ処理したフィルタ済み音声データを生成する第１フィルタステップと、
前記第１フィルタステップにて生成したフィルタ済み音声データが第１閾値以下の場合、前記ゲイン調整ステップにおいてゲイン調整済みの音声データを増幅して出力するブースト処理ステップと、
前記ブースト処理ステップにて出力された音声データを特性フィルタ処理したフィルタ済み音声データを生成する第２フィルタステップと、
前記第２フィルタステップにて生成したフィルタ済み音声データが第２閾値以上の場合、前記ブースト処理部が増幅した音声データに対してリミッタコンプレッサ処理を行って出力するリミッタコンプレッサステップと、
を実行させるプログラム。

<Appendix 16>
A program that causes a computer to adjust the loudness of an input digital audio signal, the program causing the computer to execute:
a gain adjustment step of performing gain adjustment on audio data extracted from the input digital audio signal;
a first filter step of generating filtered audio data by applying characteristic filtering to the audio data gain-adjusted in the gain adjustment step;
a boost processing step of amplifying and outputting the audio data gain-adjusted in the gain adjustment step when the filtered audio data generated in the first filter step is equal to or less than a first threshold;
a second filter step of generating filtered audio data by applying characteristic filtering to the audio data output in the boost processing step; and
a limiter compressor step of performing limiter compressor processing on the audio data amplified by the boost processing unit and outputting the result when the filtered audio data generated in the second filter step is equal to or greater than a second threshold.

＜付記１７＞
前記リミッタコンプレッサステップに手出力した音声データを特性フィルタ処理したフィルタ済み音声データを生成する第３フィルタステップと、
前記第３フィルタステップにて生成したフィルタ済み音声データに対し、第１の時間区間を用いた移動二乗平均により第１二乗平均値を算出する第１二乗平均算出ステップと、
前記第３フィルタステップにて生成したフィルタ済み音声データに対し、第１の時間区間より長い第２の時間区間を用いた移動二乗平均により第２二乗平均値を算出する第２二乗平均算出ステップと、
前記第１及び第２二乗平均値、及びターゲットラウドネス値に基づいて前記ゲイン調整に用いるゲイン調整値を算出する制御ステップと、
を備える付記１６に記載のプログラム。

<Appendix 17>
The program according to appendix 16, further causing the computer to execute:
a third filter step of generating filtered audio data by applying characteristic filtering to the audio data output in the limiter compressor step;
a first mean square calculation step of calculating a first mean square value by a moving mean square over a first time interval on the filtered audio data generated in the third filter step;
a second mean square calculation step of calculating a second mean square value by a moving mean square over a second time interval, longer than the first time interval, on the filtered audio data generated in the third filter step; and
a control step of calculating a gain adjustment value used for the gain adjustment based on the first and second mean square values and a target loudness value.

＜付記１８＞
前記制御ステップでは、前記第１二乗平均値が前記ターゲットラウドネス値よりも大きく、前記第２二乗平均値が増加傾向にある場合、前記入力デジタル音声信号から抽出された前記音声データを抑圧するように前記ゲイン調整値を調整する抑圧処理を行い、
前記第１二乗平均値が前記ターゲットラウドネス値よりも大きく、前記第２二乗平均値が減少傾向にある場合、前記抑圧処理を中断し、
上記以外の場合には、前記抑圧処理は行わない、
ことを特徴とする付記１７に記載のプログラム。

<Appendix 18>
The program according to appendix 17, wherein, in the control step:
when the first mean square value is greater than the target loudness value and the second mean square value is trending upward, a suppression process is performed that adjusts the gain adjustment value so as to suppress the audio data extracted from the input digital audio signal;
when the first mean square value is greater than the target loudness value and the second mean square value is trending downward, the suppression process is interrupted; and
in all other cases, the suppression process is not performed.

＜付記１９＞
前記制御ステップでは、
外部からの制御情報の入力が無く、前記入力デジタル音声信号の音声モードの変更がなく、前記抑圧処理の実行中であり、かつ前記第１二乗平均値が開放閾値以下である場合、所定時間で前記入力デジタル音声信号から抽出された前記音声データの開放が完了するように前記ゲイン調整値を調整する開放処理を行い、
外部からの制御情報の入力が無く、前記入力デジタル音声信号の音声モードの変更がなく、前記抑圧処理の実行中であり、かつ前記第１二乗平均値が前記開放閾値以下ではない場合、前記開放処理を行わない、
ことを特徴とする付記１８に記載のプログラム。

<Appendix 19>
The program according to appendix 18, wherein, in the control step:
when no control information is input from outside, the audio mode of the input digital audio signal has not changed, the suppression process is being executed, and the first mean square value is equal to or less than a release threshold, a release process is performed that adjusts the gain adjustment value so that release of the audio data extracted from the input digital audio signal is completed in a predetermined time; and
when no control information is input from outside, the audio mode of the input digital audio signal has not changed, the suppression process is being executed, and the first mean square value is not equal to or less than the release threshold, the release process is not performed.

＜付記２０＞
前記制御ステップでは、
前記抑圧処理の実行中であり、かつ外部からの制御情報の入力及び前記入力デジタル音声信号の音声モードの変更のいずれか一方が生じた場合、前記所定時間よりも短い時間で前記入力デジタル音声信号から抽出された前記音声データの開放が完了するように前記ゲイン調整値を調整する、
ことを特徴とする付記１８または付記１９に記載のプログラム。

<Appendix 20>
The program according to appendix 18 or 19, wherein, in the control step, when the suppression process is being executed and either control information is input from outside or the audio mode of the input digital audio signal changes, the gain adjustment value is adjusted so that release of the audio data extracted from the input digital audio signal is completed in a time shorter than the predetermined time.

１ラウドネス調整装置
１１デコーダ
１２ゲイン調整部
１３音声調整部
１４Ｋフィルタ
１５音声データブースト部
１６Ｋフィルタ
１７リミッタコンプレッサ部
１８セレクタ
１９制御部
２０移動二乗平均算出部
２１移動二乗平均算出部
２２Ｋフィルタ
２３エンコーダ
２４ゲイン乗算部
２５二乗平均積算部
２６無音検出部

DESCRIPTION OF SYMBOLS
1 Loudness adjustment device
11 Decoder
12 Gain adjustment unit
13 Audio adjustment unit
14 K filter
15 Audio data boost unit
16 K filter
17 Limiter compressor unit
18 Selector
19 Control unit
20 Moving mean square calculation unit
21 Moving mean square calculation unit
22 K filter
23 Encoder
24 Gain multiplication unit
25 Mean square integration unit
26 Silence detection unit

Claims

A loudness adjustment device for adjusting the loudness of an input digital audio signal, comprising:
a gain adjustment unit that performs gain adjustment on audio data extracted from the input digital audio signal;
a first filter unit that generates filtered audio data by applying characteristic filtering to the audio data gain-adjusted by the gain adjustment unit;
a boost processing unit that amplifies and outputs the audio data gain-adjusted by the gain adjustment unit when the filtered audio data generated by the first filter unit is equal to or less than a first threshold;
a second filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the boost processing unit; and
a limiter compressor unit that performs limiter compressor processing on the audio data amplified by the boost processing unit and outputs the result when the filtered audio data generated by the second filter unit is equal to or greater than a second threshold.

The loudness adjustment device according to claim 1, further comprising:
a third filter unit that generates filtered audio data by applying characteristic filtering to the audio data output by the limiter compressor unit;
a first mean square calculation unit that calculates a first mean square value by a moving mean square over a first time interval on the filtered audio data generated by the third filter unit;
a second mean square calculation unit that calculates a second mean square value by a moving mean square over a second time interval, longer than the first time interval, on the filtered audio data generated by the third filter unit; and
a control unit that calculates a gain adjustment value used for the gain adjustment based on the first and second mean square values and a target loudness value.

The loudness adjustment device according to claim 2, wherein the control unit:
performs a suppression process that adjusts the gain adjustment value so as to suppress the audio data extracted from the input digital audio signal when the first mean square value is greater than the target loudness value and the second mean square value is trending upward;
interrupts the suppression process when the first mean square value is greater than the target loudness value and the second mean square value is trending downward; and
does not perform the suppression process in all other cases.

The loudness adjustment device according to claim 3, wherein the control unit:
performs a release process that adjusts the gain adjustment value so that release of the audio data extracted from the input digital audio signal is completed in a predetermined time when no control information is input from outside, the audio mode of the input digital audio signal has not changed, the suppression process is being executed, and the first mean square value is equal to or less than a release threshold; and
does not perform the release process when no control information is input from outside, the audio mode of the input digital audio signal has not changed, the suppression process is being executed, and the first mean square value is not equal to or less than the release threshold.

The loudness adjustment device according to claim 4, wherein, when the suppression process is being executed and either control information is input from outside or the audio mode of the input digital audio signal changes, the control unit adjusts the gain adjustment value so that release of the audio data extracted from the input digital audio signal is completed in a time shorter than the predetermined time.

The loudness adjustment device according to claim 4 or 5, further comprising a decoder that extracts the audio data from the input digital audio signal, wherein:
the decoder extracts additional information from the input digital audio signal;
when the additional information includes information indicating that gain adjustment is not to be performed, the decoder supplies the extracted audio data to any other output device without supplying it to the gain adjustment unit; and
in all other cases, the decoder supplies the extracted audio data to the gain adjustment unit.

The loudness adjustment device according to claim 6, further comprising a silence detection unit that analyzes the audio data extracted by the decoder and notifies the control unit when a silent state continues for a certain period,
wherein, when the control unit is executing the suppression process and receives a notification from the silence detection unit, the control unit adjusts the gain adjustment value so that release of the audio data is completed in a time shorter than the predetermined time.

A video/audio processing apparatus incorporating the loudness adjustment device according to claim 1.

A loudness adjustment method for adjusting the loudness of an input digital audio signal, comprising:
a gain adjustment step of performing gain adjustment on audio data extracted from the input digital audio signal;
a first filter step of generating filtered audio data by applying characteristic filtering to the audio data gain-adjusted in the gain adjustment step;
a boost processing step of amplifying and outputting the audio data gain-adjusted in the gain adjustment step when the filtered audio data generated in the first filter step is equal to or less than a first threshold;
a second filter step of generating filtered audio data by applying characteristic filtering to the audio data output in the boost processing step; and
a limiter compressor step of performing limiter compressor processing on the audio data amplified in the boost processing step when the filtered audio data generated in the second filter step is equal to or greater than a second threshold.

A program that causes a computer to adjust the loudness of an input digital audio signal, the program causing the computer to execute:
a gain adjustment step of performing gain adjustment on audio data extracted from the input digital audio signal;
a first filter step of generating filtered audio data by applying characteristic filtering to the audio data gain-adjusted in the gain adjustment step;
a boost processing step of amplifying and outputting the audio data gain-adjusted in the gain adjustment step when the filtered audio data generated in the first filter step is equal to or less than a first threshold;
a second filter step of generating filtered audio data by applying characteristic filtering to the audio data output in the boost processing step; and
a limiter compressor step of performing limiter compressor processing on the audio data amplified in the boost processing step when the filtered audio data generated in the second filter step is equal to or greater than a second threshold.