JP2019144022A

JP2019144022A - Characteristic prediction method, characteristic prediction program and characteristic prediction device for beverage

Info

Publication number: JP2019144022A
Application number: JP2018026368A
Authority: JP
Inventors: Masahiro Terabe; Takamasa Yamano; Yume Mima; Kazuhiro Shibagaki; Toshihiro Kamata; Hideyuki Minami; Seiri Morikawa; Chikako Takayama
Original assignee: Mitsubishi Research Institute Inc; Kirin Co Ltd
Current assignee: Mitsubishi Research Institute Inc; Kirin Co Ltd
Priority date: 2018-02-16
Filing date: 2018-02-16
Publication date: 2019-08-29
Anticipated expiration: 2038-02-16
Also published as: JP7046637B2

Abstract

PROBLEM TO BE SOLVED: To provide a beverage characteristic prediction method, a characteristic prediction program, and a characteristic prediction device that estimate, with high accuracy, the characteristics of a beverage predicted from input trial brewing conditions. SOLUTION: In the beverage characteristic prediction method according to the present invention, a computer executes the steps of: generating a trial brewing sample whose explanatory variables are the setting items of the trial brewing conditions of the beverage; inputting the trial brewing sample into a plurality of prediction models and obtaining an estimated component value of the beverage for each prediction model; inputting the trial brewing sample into the error determination model corresponding to each prediction model and calculating a within-tolerance probability for each prediction model; and inputting the explanatory variables of the trial brewing sample and the plurality of within-tolerance probabilities into a method selection model and, based on the output value of the method selection model, selecting the estimated value to be presented as the component value of the beverage under the trial brewing conditions. [Selected drawing] FIG. 1

Description

The present invention relates to a beverage characteristic prediction method, a characteristic prediction program, and a characteristic prediction device.

In recent years, consumer preferences have come to change over short periods, and competition in the beverage market has intensified. Under these circumstances, new malt alcoholic beverages (for example, beer and happoshu) and beer-taste beverages that satisfy consumer demands and are highly competitive must be brought to market in a timely manner. In developing a new product, engineers repeat trial brewing while referring to accumulated data such as past recipes, aiming to obtain a beverage with the taste, aroma, color, alcohol content, and other characteristics that consumers want. How past data is used to decide the trial brewing conditions is left to each engineer. In general, engineers decide the conditions using the intuition and know-how cultivated through development experience, but it is difficult in practice to pass such intuition and know-how on as explicit or organizational knowledge. Trial brewing consumes materials, equipment, and man-hours, so as the number of trials grows, both the time and the resources consumed increase.

Therefore, to make trial production of malt alcoholic beverages, beer-taste beverages, and the like more efficient, one approach is to develop a technique that analyzes data accumulated in the past and predicts beverage characteristics. In the food field, techniques for predicting food quality and the like from past data have been developed. Patent Document 1 discloses a technique that predicts the growth time of microorganisms in food and, from it, the shelf life of the food. Patent Document 2 discloses a technique for predicting the filterability of the pre-filtration liquid, which is an intermediate product of a beer-taste beverage. Patent Document 3 discloses a technique for predicting the flavor stability of a fermented malt alcoholic beverage from the nonenal potential of the wort.

Japanese Patent No. 6121035; Japanese Patent No. 5894027; Japanese Patent No. 4191390

Cleary, John G., et al. K*: An Instance-based Learner Using an Entropic Distance Measure. ICML '95: Proceedings of the Twelfth International Conference on Machine Learning, July 1995.
Wang, Yong, et al. Inducing Model Trees for Continuous Classes. European Conference on Machine Learning, 1997.

The method of Patent Document 1 aims to predict the growth of microorganisms in food and does not predict the characteristics of the food produced. The method of Patent Document 2 can estimate the characteristics of an intermediate product in the beverage production process, but it does not predict the characteristics of the beverage finally produced by brewing. The method of Patent Document 3 can predict some characteristics of a beverage, but it does not take multiple variables describing the trial brewing conditions as input and predict the overall characteristics of the beverage. In other words, in the prior art both the number of conditions that can be input to the model as variables and the beverage characteristics that can be predicted are limited, so it is difficult to apply these methods directly to predicting the result of trial brewing (brewing) a beverage.

In view of these problems, for trial production of malt alcoholic beverages, beer-taste beverages, and the like, there is demand for a technique that analyzes data accumulated in the past and accurately predicts, from the input trial brewing conditions, the characteristics of the beverage that will be brewed (trial brewed). The present invention provides a beverage characteristic prediction method, a characteristic prediction program, and a characteristic prediction device that estimate, with high accuracy and on the basis of past data, the characteristics of a beverage predicted from input trial brewing conditions.

In the beverage characteristic prediction method according to the present invention, a computer executes the steps of: generating a trial brewing sample whose explanatory variables are the setting items of the trial brewing conditions of the beverage; inputting the trial brewing sample into a plurality of prediction models and obtaining an estimated component value of the beverage for each prediction model; inputting the trial brewing sample into the error determination model corresponding to each prediction model and calculating a within-tolerance probability for each prediction model; and inputting the explanatory variables of the trial brewing sample and the plurality of within-tolerance probabilities into a method selection model and, based on the output value of the method selection model, selecting the estimated value to be presented as the component value of the beverage under the trial brewing conditions.
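The four claimed steps can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions: the function `predict_component`, the toy lambdas, and the field names `malt_a_kg` and `boil_minutes` are hypothetical and do not come from the publication, and the toy selector (pick the highest within-tolerance probability) merely stands in for a learned method selection model.

```python
from typing import Callable, Dict, List, Sequence

# A trial brewing sample: explanatory variable name -> value (names are hypothetical).
Sample = Dict[str, float]


def predict_component(
    sample: Sample,
    predictors: Sequence[Callable[[Sample], float]],
    error_models: Sequence[Callable[[Sample], float]],
    selector: Callable[[Sample, List[float]], int],
) -> float:
    """Return the estimate chosen by the method selection model."""
    # One estimated component value per prediction model.
    estimates = [predict(sample) for predict in predictors]
    # One within-tolerance probability per prediction model.
    probabilities = [judge(sample) for judge in error_models]
    # The selector receives the explanatory variables and all probabilities
    # and returns the index of the estimate to present.
    chosen = selector(sample, probabilities)
    return estimates[chosen]


# Toy stand-ins for trained models (illustration only).
sample = {"malt_a_kg": 10.0, "boil_minutes": 60.0}
predictors = [lambda s: s["malt_a_kg"] / 2, lambda s: s["boil_minutes"] / 10]
error_models = [lambda s: 0.4, lambda s: 0.9]
selector = lambda s, probs: max(range(len(probs)), key=probs.__getitem__)

result = predict_component(sample, predictors, error_models, selector)
print(result)
```

In this toy run the second model has the higher within-tolerance probability, so its estimate is the one presented.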

A diagram showing a configuration example of the entire characteristic prediction device according to the first embodiment.
A diagram showing an example of cross-validation in the learning process.
A diagram showing an example of cross-validation in the learning process.
A diagram showing an example of brewing data storing past brewing cases.
A diagram showing an example of brewing data storing past brewing cases.
A diagram showing an example of learning data generated from the brewing data.
A diagram showing an example of the variables used for each component value to be predicted.
A diagram showing an example of a determination table for a given analysis value.
A diagram showing an example of an estimated value comparison table for a given analysis value.
A diagram showing an example of model data when M5' is used.
A diagram showing an example of model data when M5' is used.
A diagram showing an example of a sigmoid function.
A flowchart showing the learning process.
A flowchart showing the learning process.
A flowchart showing the learning process.
A flowchart showing the component value prediction process.
A flowchart showing the component value prediction process.
A diagram showing an example of a method for extracting related cases from the feature space.
A diagram showing an example of the trial brewing condition input screen.
A diagram showing a first example of the prediction result display screen.
A diagram showing a second example of the prediction result display screen.
A diagram showing an example of the hardware configuration of the characteristic prediction device.

Embodiments of the present invention are described below with reference to the drawings. In the drawings, identical components are given the same reference numerals, and their description is omitted where appropriate.

(First embodiment)
FIG. 1 is a diagram showing a configuration example of the entire characteristic prediction device according to the first embodiment.

The characteristic prediction device 1 of FIG. 1 includes a storage unit 2, a learning unit 3, a preprocessing unit 4, a display unit 5, and an operation unit 6. First, an outline of the processing executed by the characteristic prediction device 1 is given.

The characteristic prediction device 1 uses brewing data containing information on past brewing cases to output the beverage characteristics expected when brewing is performed under arbitrary conditions. A past brewing case includes information on the setting items (recipe) of at least one of the raw materials and the brewing process used when the beverage was brewed (trial brewed) in the past, together with the component values of the beverage produced. Examples of beverages include wort used to produce beverages that contain malt as a raw material (malt beverages), fermented beverages whose production process involves fermentation, non-fermented non-alcoholic beverages, and beer-taste beverages (beer-flavored effervescent carbonated beverages with an alcohol content of less than 1%).

Examples of fermented beverages include malt alcoholic beverages such as beer, happoshu, malt-based "new genre" beverages, and whiskey made with malt. However, the fermented beverage may also be another kind of brewed liquor such as wine or sake, or a distilled liquor whose production involves a distillation step, such as whiskey, vodka, shochu, baijiu, or brandy. No particular restriction is placed on the raw materials of the beverage, the types of yeast and fungi used in its production, whether the final beverage contains alcohol, or its alcohol content.

The characteristics of a malt alcoholic beverage are specified by component values such as the apparent attenuation limit (AAL), chromaticity, pH, total nitrogen, total amino acids, and bitterness units (BU). The following description takes as an example the case of predicting, when trial brewing a malt alcoholic beverage, the component values (characteristics) of the malt alcoholic beverage produced under given trial brewing conditions. The trial brewing conditions include information on the setting items (recipe) of at least one of the raw materials used and the brewing process.

Setting items relating to the raw materials of a malt alcoholic beverage include, for example, the amount of malt used, the type of malt, the malt lot, the malt blend ratio, the malt component values, the amount of hops used, the type of hops, the hop lot, the hop blend ratio, the hop component values, the type of adjunct, the amount of adjunct used, the type of water treatment agent, the amount of water treatment agent used, the type of enzyme agent, and the amount of enzyme agent used. In brewing, adjusting these items makes it possible to produce malt alcoholic beverages with a variety of characteristics and flavors. These raw-material settings are examples, and other setting items may be included in the brewing cases and the trial brewing conditions.

Setting items relating to the brewing process of a malt alcoholic beverage include the change of temperature over time during saccharification, the saccharification time, the boiling rate, the boiling time, the cooling time, the cooling temperature, the fermentation time, the fermentation temperature, the maturation time, the maturation temperature, and the type of filter used in filtration. These setting items are examples, and other setting items may be included in the brewing cases and the trial brewing conditions.

The characteristic prediction device according to the present invention sets the setting items described above as explanatory variables (features) and the component values of the fermented beverage described above as objective variables, and then applies machine learning methods to estimate the values of the objective variables. The user inputs into the characteristic prediction device the setting items of the trial brewing conditions whose characteristics (component values) are to be predicted, and can thereby obtain a prediction of the component values of the beverage produced under those trial brewing conditions.

As described above, in the field of beverages such as malt alcoholic beverages, development must be accelerated while keeping down the time, cost, and resources required for trial production. In this situation, the characteristics of the beverage obtained by brewing (trial brewing) under arbitrary conditions need to be predicted with high accuracy. In addition, because the characteristics of a beverage are specified by multiple component values, multiple component values also need to be predicted efficiently.

When predicting each component value, the characteristic prediction device according to the present invention uses, among a plurality of machine learning methods, the method for which the error between the estimated value and the actual component value in the learning data is smallest, and presents that method's estimate to the user as the official predicted value of the component value. Here, an estimated value is a component value estimated by an individual machine learning method.

The deviation between the value estimated by each machine learning method and the actual component value (true value) in the learning data can be examined by the cross-validation shown in FIGS. 2 and 3. When cross-validation is performed, the learning data L is divided into a plurality of blocks L_1, L_2, L_3, ..., L_k in the learning process. For example, when the learning process (generation of the prediction models and the error determination models) is performed for component values Y_1, Y_2, Y_3, ..., Y_m, blocks L_2, L_3, ..., L_k are used as training data for learning (model generation). Block L_1 is then used as test data to obtain the error of each of the machine learning methods.

The number of blocks and the assignment of blocks to test data and training data described above are only an example. As long as part of the learning data is set as test data and the part of the learning data not set as test data is used as training data, no particular restriction is placed on the number of blocks or on the assignment of test data and training data. For example, more than one block may be set as test data. The learning process (generation of the prediction models and the error determination models) for each component value (objective variable) may use the same combination of test data and training data, or different combinations.

In the following, k denotes the number of blocks into which the learning data L is divided, and m denotes the number of component values to be predicted. No particular relationship between k and m is required. In the examples of FIGS. 2 and 3, all of the remaining blocks not used as test data are used as training data, but it is not necessary to use all of the remaining blocks as training data.
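The block division described above can be sketched as follows. This is a minimal illustration, not the publication's implementation: the helper names `split_blocks` and `hold_out` are hypothetical, and assigning samples to blocks round-robin is an assumption, since the text leaves the assignment open.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_blocks(samples: Sequence[T], k: int) -> List[List[T]]:
    """Divide the learning data L into k blocks L_1 .. L_k of near-equal size."""
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    # Round-robin assignment: sample i goes to block i mod k (an assumption).
    return [list(samples[i::k]) for i in range(k)]


def hold_out(blocks: List[List[T]], test_index: int) -> Tuple[List[T], List[T]]:
    """Use one block as test data and all remaining blocks as training data."""
    test = blocks[test_index]
    train = [s for i, block in enumerate(blocks) if i != test_index for s in block]
    return train, test


learning_data = list(range(10))          # stand-in for ten past samples
blocks = split_blocks(learning_data, k=5)
train, test = hold_out(blocks, test_index=0)   # L_1 is test, L_2 .. L_k train
print(test, train)
```

Holding out several blocks at once, or training on only some of the remaining blocks, would be variations of `hold_out` that the text also permits.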

In this embodiment, for each component value (objective variable) to be predicted, prediction models are learned using M5' as the first method (first machine learning method) and K* as the second method (second machine learning method). The prediction model of each method is then used to obtain an estimate of the analysis value. The estimate from the prediction model of the first method is called the first estimated value, and the estimate from the prediction model of the second method is called the second estimated value. For each component value, the error between the true value in the learning data and the first estimated value is compared with the error between the true value in the learning data and the second estimated value, and the estimate of the method with the smaller error is used as the official predicted value of that component value.

M5' and K* above are examples, and other machine learning methods may be used. Estimates of the analysis value may also be obtained with a larger number of machine learning methods, and the method whose estimate has the smallest error from the true value may be selected.

By performing the learning process described above, each component value under the trial brewing conditions can be predicted using the estimate of the machine learning method that is expected to give the smallest error.
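The comparison of methods by their error on held-out data can be sketched as below. Everything here is an assumption made for illustration: the two lambdas are toy stand-ins rather than M5' or K* models, the error measure (mean absolute error) is chosen only because the passage does not name one, and `select_method` is a hypothetical helper.

```python
from typing import Callable, Dict, List, Tuple

# (explanatory variable, true component value); one feature keeps the toy small.
Row = Tuple[float, float]


def mean_abs_error(predict: Callable[[float], float], rows: List[Row]) -> float:
    """Average absolute gap between estimated and true component values."""
    return sum(abs(predict(x) - y) for x, y in rows) / len(rows)


def select_method(methods: Dict[str, Callable[[float], float]], rows: List[Row]) -> str:
    """Return the name of the method with the smallest error on the test rows."""
    errors = {name: mean_abs_error(f, rows) for name, f in methods.items()}
    return min(errors, key=errors.get)


test_rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
methods = {
    "first_method": lambda x: 2.0 * x,    # stand-in for a first prediction model
    "second_method": lambda x: x + 1.5,   # stand-in for a second prediction model
}
best = select_method(methods, test_rows)
print(best)
```

Because `methods` is a dictionary, adding a third or fourth method requires no change to the selection logic, which matches the remark that more than two methods may be compared.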

Next, the components of the characteristic prediction device 1 are described.

The storage unit 2 is a storage area that holds data on past brewing cases, learning data, data generated in intermediate steps of machine learning, data of the models generated by machine learning, the characteristic prediction program, and the like. As internal components, the storage unit 2 includes a brewing database 21, a learning data storage unit 22, a determination data storage unit 23, and a model database 24. Details of the brewing database 21, the learning data storage unit 22, the determination data storage unit 23, and the model database 24 are described later.

The storage unit 2 may be a volatile memory such as SRAM or DRAM, or a nonvolatile memory such as NAND, MRAM, or FRAM. It may also be a storage device such as an optical disk, a hard disk, or an SSD. The storage unit 2 may be built into the characteristic prediction device 1 or may be a storage device external to the characteristic prediction device 1. The storage unit 2 may also be a removable storage medium such as an SD card, a microSD card, or a USB memory.

The learning unit 3 outputs the component values of the beverage expected when brewing (trial brewing) is performed under arbitrary conditions. As internal components, the learning unit 3 includes a variable selection unit 11, a cross-validation unit 12, a model generation unit 13, a first estimation unit 14, a second estimation unit 15, a verification unit 16, and a method selection unit 17. Details of these components are also described later.

The preprocessing unit 4 converts data on past brewing cases into the learning data format that the learning unit 3 can use. Details of the processing performed by the preprocessing unit 4 are also described later.

The display unit 5 is a display that shows the GUI (Graphical User Interface) or CLI (Command Line Interface) that the user uses to operate the characteristic prediction device, various data, the trial brewing condition input screen, the screens for prediction results and related cases, and so on. The display may be, for example, a liquid crystal display, an organic electroluminescence display, a projector, or an LED display, but other types of display may be used. In the example of FIG. 1 the display unit 5 is built into the characteristic prediction device 1, but its location is not particularly restricted. The display unit 5 may be installed in a room or building away from the characteristic prediction device 1, or may be the display of a wireless communication terminal such as a tablet or a smartphone.

The operation unit 6 is a device that provides the means by which the user operates the characteristic prediction device 1. The operation unit 6 is, for example, a keyboard, a mouse, a switch, or a voice recognition device, but is not limited to these. The operation unit 6 may also be a touch panel integrated with the display unit 5. The location of the operation unit 6 is likewise not particularly restricted. The operation unit 6 may be installed in a room or building away from the characteristic prediction device, or may be the touch panel of a wireless communication terminal such as a tablet or a smartphone.

Next, each component inside the storage unit 2 is described.

The brewing database 21 stores brewing data, which is information on the setting items (recipe), such as raw materials and brewing process, of cases in which a beverage was brewed in the past (past brewing cases), together with the component values of the beverage produced. The tables of FIGS. 4 and 5 show an example of brewing data. Column 30 of the table of FIG. 4 stores the brewing ID, an identifier that uniquely identifies each brewing case. The brewing ID may be, for example, an identifier made up of several alphanumeric characters, but the identifier format is not particularly restricted. Each row identified by a brewing ID stores the value of each setting item in that brewing case and the component values (analysis values) of the beverage produced.

Column 31 of the table of FIG. 4 stores the values of several setting items relating to the raw materials. Column 31 includes the amount of malt A used, the amount of malt B used, the amount of malt C used, the amount of hop D used, the amount of hop E used, the amount of adjunct F used, and the amount of adjunct G used. These setting items are examples, and different setting items may be used. For example, when several raw materials are mixed before use, component values of the mixed raw material, such as total nitrogen and chromaticity, may be used as setting items. When component values are used as raw-material setting items, they may be values measured before or after the wort is saccharified; the timing of the measurement is not particularly restricted.

The table of FIG. 5 shows the data stored further to the right of the table of FIG. 4. Column 32 of the table of FIG. 5 stores the values of several setting items relating to the brewing process. Column 32 includes the temperature change pattern, the saccharification time, the boiling rate, and the boiling time. These setting items are examples; different setting items may be used, and further setting items may be included.

Column 33, on the left side of column 32, stores several component values of the beverage produced. Column 33 includes component values for AAL (apparent attenuation limit), chromaticity, pH, total nitrogen, total amino acids, and BU. These component values are examples, and different component values may be used. Because the characteristic prediction device 1 predicts each component value separately using machine learning methods, each component value shown in column 33a corresponds to an objective variable in machine learning.

The component values of the beverage recorded in the brewing data may be those of the beverage after all processes have been completed, or those of the wort or of an intermediate product at some stage of the process. In other words, the component values that the characteristic prediction device 1 predicts may relate to the final product or to something produced partway through production; the point in the brewing process is not particularly restricted.

The learning data storage unit 22 stores learning data generated from the brewing data. The table in FIG. 6 shows an example of learning data. Like the brewing data, the learning data is uniquely identified by brewing ID. Column 30 shows the brewing IDs in the learning data. In the brewing data, the data in each row identified by a brewing ID was called a (brewing) case; in the learning data, the data in each row is called a sample. The table in FIG. 6 also shows columns 34 and 35. Column 34 holds the explanatory variables of the learning data; in the example of FIG. 6 it contains n explanatory variables, x_1 through x_n. Column 35 stores the component values of the brewing case corresponding to each sample; in the example of FIG. 6 it contains m component values, Y_1 through Y_m.

Hereinafter, a sample in the learning data that corresponds to a past brewing case is called a past sample. Each past sample is uniquely identified by a brewing ID. A sample generated from trial brewing conditions for which component values are to be predicted is called a trial brewing sample.

Next, the learning data generation processing (preprocessing) performed by the preprocessing unit 4 is described.

Through the preprocessing executed by the preprocessing unit 4, columns 31 and 32 (setting items) of the brewing data are converted into the explanatory variables shown in column 34 of FIG. 6. If data corresponding to the brewing data is stored across multiple databases, the preprocessing may extract the necessary data and join it. If the accumulated data on past brewing cases is not in a form that can be uniquely identified by brewing ID, the information may be aggregated per brewing ID. If the brewing data contains abnormal data, the abnormal data may be removed.

The preprocessing may also reflect engineers' knowledge and explicit knowledge such as literature when converting to learning data. For example, if experience shows that a certain component value is essential for predicting a particular component value of the product, that component value can be set so that it is always included among the explanatory variables used in machine learning. Such a setting is made, for example, by marking the attribute information of the explanatory variables in the learning data or the metadata that stores information about the explanatory variables. If the characteristics of the produced beverage are known to vary with the equipment in which brewing takes place, the values of the explanatory variables may be corrected according to the equipment used in the brewing case. A user (for example, an engineer or researcher) may refer to the data displayed on the display unit 5 and use the operation unit 6 to correct abnormal or mis-entered data or to exclude it from use.

Information described in the literature can also be used in the conversion to learning data. For example, if the literature describes the conditions under which a particular enzyme used in saccharification is activated, then for brewing IDs that satisfy the activation conditions, information on that enzyme can be set to always be included among the explanatory variables, and for brewing IDs that do not satisfy them, that information can be set to be excluded from the explanatory variables.

The literature may also contain theoretical formulas for beverage component values. With a theoretical formula, it is sometimes possible to derive the component value expected for a past brewing case even when the component value was not actually measured. A component value derived from a theoretical formula may be added to the learning data as an explanatory variable or an objective variable.

FIG. 7 shows an example of the explanatory variables and objective variable used for each component value to be predicted. Table 36 is shown in the upper part of FIG. 7 and table 37 in the lower part. Tables 36 and 37 are extracted from the learning data of FIG. 6. Table 36 shows the explanatory variables x_1, x_2, x_3 used for estimation when the component value Y_1 is set as the objective variable y_1. Table 37 shows the explanatory variables x_4, x_5, x_6 used for estimation when the component value Y_2 is set as the objective variable y_2. Three explanatory variables are selected in each of tables 36 and 37, but a different number may be selected. The explanatory variables used to estimate different objective variables may also overlap. Variable selection of the kind shown in tables 36 and 37 is performed by the variable selection unit 11, which is described in detail later.

The determination data storage unit 23 stores data used to select the machine learning method used for estimating a component value. Examples of such data are the determination table and the estimated value comparison table. Examples of both tables stored in the determination data storage unit 23 are described below with reference to FIGS. 8 to 10.

FIG. 8 shows the determination table 38. A determination table is generated for each combination of prediction model (for example, a machine learning method such as the first method or the second method) and component value to be predicted (Y_1, Y_2, ..., Y_m). Column 39 of the determination table 38 stores brewing IDs. Column 40 holds the explanatory variables corresponding to each brewing ID. Column 41 holds the actual component value Y_1 for each brewing ID (each brewing case), which is used as the true value when computing the error of the model's estimate. Column 42 holds the objective variable y_1 of the prediction model for a given method, that is, the estimate calculated by that prediction model.

Column 43 stores the within-tolerance determination result, which takes one of two values, "TRUE" or "FALSE". If the magnitude of the error between the model's estimate (column 42) and the actual analysis value used as the true value (column 41) is within the allowable error, the value in column 43 is "TRUE"; if the error is larger than the allowable error, the value is "FALSE". The allowable error can be, for example, 1%, 2%, 5%, or 10% of the true value, but the setting is not limited to these. As described later, the within-tolerance determination results in column 43 are used as the teacher signal when training the error determination model.
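The within-tolerance determination for column 43 can be sketched as follows. This is a minimal illustration, assuming the allowable error is a fixed fraction of the true value; the function name `within_tolerance`, the 5% default, and the sample rows are hypothetical and are not taken from the specification.

```python
def within_tolerance(estimate, true_value, rel_tol=0.05):
    """Return True when the estimate lies within rel_tol * |true_value| of the true value.

    rel_tol is an assumed setting (5% here); the text also mentions 1%, 2%, and 10%.
    """
    return abs(estimate - true_value) <= rel_tol * abs(true_value)


# Hypothetical rows of a determination table: true value (column 41)
# and model estimate (column 42) for two brewing IDs.
rows = [
    {"brewing_id": "B001", "true": 4.40, "estimate": 4.45},
    {"brewing_id": "B002", "true": 4.40, "estimate": 4.90},
]

# Fill in the within-tolerance determination result (column 43).
for row in rows:
    row["within_tolerance"] = within_tolerance(row["estimate"], row["true"])
```

The resulting TRUE/FALSE labels are what a classifier such as the error determination model would be trained to reproduce.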

Column 43a stores the within-tolerance probability corresponding to each brewing ID. The within-tolerance probability is used to train the method selection model and to select the estimate presented as the formal predicted value. It is described in detail later.

Columns 39 to 41 in FIG. 8 have the same content as the learning data in FIG. 6. The determination table may therefore share a table with the learning data. Alternatively, the determination table may omit the data of columns 40 and 41 and obtain it from the learning data table.

FIG. 9 shows the estimated value comparison table 44. An estimated value comparison table is generated for each component value to be predicted (Y_1, Y_2, ..., Y_m). Column 45 stores brewing IDs; these are the brewing IDs of the samples in the block used as test data. Because the estimated value comparison table 44 is the table for component value Y_1, it stores the brewing IDs of the samples in block L_1. Column 46 stores the first estimated value obtained by the prediction model of the first method, and column 47 stores the second estimated value obtained by the prediction model of the second method. The example of FIG. 9 shows estimates from two methods, but the estimated value comparison table 44 may contain estimates from more methods. Column 48 stores the actual component value, which is the true value.

Column 49 stores the "preferred method" for predicting component value Y_1. It is selected on the basis of the errors (deviations) of the first estimated value, from the prediction model of the first method, and the second estimated value, from the prediction model of the second method, relative to the actual component value (true value). When estimates are obtained using the test data, the method with the smaller error from the true value can be selected as the "preferred method" for each brewing ID. As described later, the "preferred method" in column 49 is used as a teacher signal.
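Labeling the "preferred method" for one row of the estimated value comparison table can be sketched as below. The function name, the method labels, and the numeric values are hypothetical; the only rule taken from the text is that the method with the smaller error from the true value is preferred. Tie handling is an assumption of this sketch.

```python
def preferred_method(estimates, true_value):
    """Return the name of the method whose estimate is closest to the true value.

    estimates maps a method name to its estimate for one brewing ID.
    On a tie, the method listed first wins (an assumption of this sketch).
    """
    return min(estimates, key=lambda name: abs(estimates[name] - true_value))


# Hypothetical row: first estimate (column 46), second estimate (column 47),
# and true value (column 48).
label = preferred_method({"first": 4.5, "second": 4.1}, true_value=4.4)
```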

The model database 24 stores the data (model data) of the prediction models that output estimates, the error determination models, and the method selection models. Prediction models are generated by the model generation unit 13, error determination models by the verification unit 16, and method selection models by the method selection unit 17. The content and storage format of the model data depend on the method used for learning and the type of model.

FIGS. 10 and 11 show examples of model data when M5' is used as the learning method (machine learning method) of the prediction model. Each figure shows a tree of depth 3. A tree is generated by splitting the learning data. FIG. 10 is the tree generated from the learning data using the explanatory variables for estimating the objective variable y_1, and FIG. 11 is the tree generated using the explanatory variables for estimating the objective variable y_2. A regression equation is shown at each node of the trees in FIGS. 10 and 11. Each regression equation is generated from the learning data contained in its node and outputs an estimate of the component value (Y_1 or Y_2) as the objective variable (y_1 or y_2).

When K* is used as the method, the K* model data can be stored in the model database 24. K* can calculate an estimate of the analysis value (objective variable) from the transition probabilities between the trial brewing sample and the past samples and the component values of the brewing cases corresponding to the past samples. The model generation unit 13, M5', and K* are described in detail later.

Next, the internal components of the learning unit 3 are described.

The variable selection unit 11 selects, from the learning data stored in the learning data storage unit 22, the explanatory variables used to estimate each objective variable y_1, y_2, ..., y_m. The variable selection processing corresponds to generating tables 36 and 37 of FIG. 7 from the table of FIG. 6 described above. The variable selection unit 11 may perform variable selection using a linear dimension reduction method such as principal component analysis, or may narrow down the explanatory variables by another method. It may also use information recorded by the preprocessing unit 4 in the attribute information or metadata of the learning data, namely which explanatory variables must be used for particular brewing IDs and which are excluded from use for particular brewing IDs. The user may select the explanatory variables to use, or may modify the selection made by the variable selection unit 11.

The characteristic prediction device 1 need not include the variable selection unit 11. That is, the value of an objective variable may be estimated using all the explanatory variables x_1, x_2, ..., x_n in the learning data without variable selection (dimension reduction). In the following description, the term "learning data" by itself covers both the original learning data before variable selection (dimension reduction) and the learning data after variable selection (dimension reduction) has been performed for estimating each objective variable.

The cross-validation unit 12 divides the learning data L into a plurality of blocks L_1, L_2, L_3, ..., L_k. For the training of the prediction model and the corresponding error determination model used to estimate each component value to be predicted (Y_1, Y_2, ..., Y_m), it decides which block is used as test data and which blocks are used as training data. The details of this processing are as given in the description of FIGS. 2 and 3. The cross-validation unit 12 may store the information on this assignment of blocks to uses in the storage unit 2.
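The division into blocks and the test/training assignment can be sketched as below. This is only an illustration: the round-robin split and the rule "one block for testing, all others for training" are assumptions of the sketch, since the actual assignment is defined in the description of FIGS. 2 and 3. The function names and brewing IDs are hypothetical.

```python
def split_into_blocks(samples, k):
    """Divide the learning data into k blocks L_1..L_k (round-robin, an assumed scheme)."""
    return [samples[i::k] for i in range(k)]


def fold_assignments(blocks):
    """For each block, pair it as test data with all remaining blocks as training data."""
    folds = []
    for i, test_block in enumerate(blocks):
        training = [s for j, block in enumerate(blocks) if j != i for s in block]
        folds.append((test_block, training))
    return folds


brewing_ids = ["B1", "B2", "B3", "B4", "B5", "B6"]  # hypothetical brewing IDs
blocks = split_into_blocks(brewing_ids, k=3)
folds = fold_assignments(blocks)
```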

The model generation unit 13 generates, from the learning data, a prediction model (machine learning model) that outputs an estimate of the objective variable. The generated prediction model is stored in the model database 24. The description here uses M5' (the first method) as the machine learning model (method), but the estimate of the objective variable may be calculated using other models such as M5 or a neural network. An outline of M5', one example of the methods used by the model generation unit 13, follows.

In M5' (the first method), the learning data is split successively so as to maximize the standard deviation reduction (SDR). The SDR in M5' is defined by equation (1) below.

$$\mathrm{SDR} = \frac{m}{|T|}\,\beta(i)\left(\mathrm{sd}(T) - \sum_{j \in \{L,R\}} \frac{|T_j|}{|T|}\,\mathrm{sd}(T_j)\right) \tag{1}$$

Here, m is the number of samples without missing values, |T| is the number of samples in the node before the split, β(i) is a correction factor, sd(T) is the standard deviation at the node before the split, |T_j| is the number of samples in one of the nodes (left or right) created by the split, and sd(T_j) is the standard deviation at that node.

The SDR of equation (1) is one example. As in M5, the predecessor of M5', an SDR consisting only of the term in parentheses in equation (1) may be used instead.
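As a rough illustration of the split criterion, the sketch below computes an SDR of the form (m/|T|) · β · (sd(T) − Σ_j (|T_j|/|T|) · sd(T_j)), built from the quantities defined in the text. Using the population standard deviation, defaulting m to |T|, and defaulting β to 1 (which reduces the expression to the M5-style parenthesized term) are assumptions of this sketch; the function name and data are hypothetical.

```python
from statistics import pstdev


def sdr(parent, left, right, m=None, beta=1.0):
    """Standard deviation reduction for splitting `parent` into `left` and `right`.

    parent, left, right: non-empty lists of objective-variable values.
    m: number of samples without missing values (defaults to len(parent)).
    beta: correction factor; 1.0 is an assumed default.
    """
    n = len(parent)
    if m is None:
        m = n
    weighted_child_sd = sum(len(child) / n * pstdev(child) for child in (left, right))
    return (m / n) * beta * (pstdev(parent) - weighted_child_sd)


values = [1.0, 1.0, 5.0, 5.0]                         # hypothetical component values
good_split = sdr(values, [1.0, 1.0], [5.0, 5.0])      # separates the two groups
poor_split = sdr(values, [1.0, 5.0], [1.0, 5.0])      # leaves both groups mixed
```

A split that separates the two groups cleanly removes all of the spread, while a split that leaves both children mixed removes none, so the first candidate would be chosen.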

Splitting the learning data according to equation (1) yields subsets of the past samples in the learning data with minimized variance (standard deviation), for example the leaf nodes of FIGS. 10 and 11. Splitting can be stopped when the reduction in standard deviation no longer appears statistically significant (for example, when the standard deviation of the data after the split is less than 5% of the standard deviation of the original learning data) or when the number of samples in a node reaches a lower limit (for example, a value of 3 or more). Once the tree has been generated by splitting, a regression equation that outputs an estimate of the objective variable is generated from the samples in each node. The regression equation can be calculated by linear regression, for example, but any regression method may be used.

Once a regression equation has been obtained for each node, the tree can be pruned. For the past samples in every node except the root node, the mean absolute error between the estimate from the regression equation and the actual component value of the corresponding brewing case is calculated. Then, for each subtree in the tree, the sum of the mean absolute errors over its branches is compared with the mean absolute error at the root node of the subtree. If the former is larger, the subtree is pruned. This simplifies the tree without loss of prediction accuracy and reduces the effect of overfitting.

With M5', a regression equation can be prepared for each subset of past samples obtained by splitting the learning data. This can give higher prediction accuracy than fitting a single regression equation to the whole of the learning data.

The first estimation unit 14 estimates the component value (objective variable) of the beverage using the prediction model of the first method (M5') generated by the model generation unit 13. Hereinafter, the estimate obtained by the prediction model of the first method (M5') is called the first estimated value, to distinguish it from the estimate obtained by the prediction model of the second method (K*).

The calculation of the first estimated value when M5' is used as the first method is as follows. Using the M5' model, the model generation unit 13 has split the learning data and generated a tree in which a regression equation is associated with each node (for example, FIGS. 10 and 11). The first estimation unit 14 refers to the model database 24 and identifies the tree data corresponding to the component value Y_i (objective variable y_i) to be estimated. It then refers to the values of the explanatory variables of the trial brewing sample used for estimating Y_i and determines which node of the tree the trial brewing sample belongs to. Once the node is determined, the value of the objective variable y_i is calculated using the regression equation associated with that node, and the first estimation unit 14 outputs this value as the first estimated value.
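The lookup performed by the first estimation unit can be sketched as a walk down a model tree followed by evaluation of the regression equation at the node reached. The `Node` structure, the threshold-style split test, and all coefficients below are hypothetical; the sketch evaluates the equation at a leaf and omits refinements such as smoothing that M5' implementations may apply.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """A model-tree node: either a split (feature/threshold) or a leaf with a linear model."""
    intercept: float = 0.0
    coefs: Sequence[float] = ()
    feature: Optional[int] = None      # index of the explanatory variable tested; None for a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None      # taken when x[feature] <= threshold
    right: Optional["Node"] = None     # taken otherwise


def predict(node, x):
    """Find the leaf the trial brewing sample x belongs to and evaluate its regression equation."""
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.intercept + sum(c * v for c, v in zip(node.coefs, x))


# Hypothetical tree: split on explanatory variable 0 at 60.0, one linear model per leaf.
tree = Node(
    feature=0,
    threshold=60.0,
    left=Node(intercept=1.0, coefs=(0.1, 0.0)),
    right=Node(intercept=2.0, coefs=(0.0, 0.5)),
)
first_estimate = predict(tree, (50.0, 2.0))
```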

The second estimation unit 15 estimates the component value (objective variable) of the beverage using the prediction model of the second method (K*) generated by the model generation unit 13. Hereinafter, the estimate obtained by the prediction model of the second method (K*) is called the second estimated value, to distinguish it from the first estimated value obtained by the prediction model of the first method (M5').

The calculation of the second estimated value when K* is used as the second method is as follows. K* is an instance-based method that uses the past samples in the learning data to obtain an estimate of the component value under the trial brewing conditions. The following describes how a component value is predicted using K* (the second method).

Let I be a set of samples taking continuous values, and let T be a set of transformation operations t on I. As shown in equation (2) below, a transition from sample a to sample b can be made by a combination of transformation operations t.

$$b = \bar{t}(a) = t_n\bigl(t_{n-1}(\cdots t_1(a)\cdots)\bigr), \qquad \bar{t} = t_1, \ldots, t_n \tag{2}$$

Let P be the set of transformations that realize transitions between samples in I. A probability function p satisfying the relationship in equation (3) below can then be defined, where the sum runs over all sequences of transformation operations in P.

$$\sum_{\bar{t} \in P} p(\bar{t}) = 1 \tag{3}$$

Equation (3) states that the probabilities of all transition patterns that can occur starting from a given sample sum to 1.

Equation (4) below gives P*, the sum of the probabilities of the transformations along every path from sample a to sample b in the feature space.

$$P^*(b \mid a) = \sum_{\bar{t} \in P:\ \bar{t}(a) = b} p(\bar{t}) \tag{4}$$

The K* function in equation (5) below expresses P* of equation (4) in a form resembling entropy.

$$K^*(b \mid a) = -\log_2 P^*(b \mid a) \tag{5}$$

The K* function is used as the distance function in the K* learner. When each sample contains multiple explanatory variables, as the past samples in this embodiment do, the transition probability P* between samples is the product of the transition probabilities obtained for the individual explanatory variables, and the K* function between samples is the sum of the K* function values obtained for the individual explanatory variables.

The transition probabilities between samples can be used to estimate a component value with K*. First, for each explanatory variable x_i, the transition probability between the trial brewing sample and each past sample in the learning data is calculated. Next, the transition probabilities obtained for the individual explanatory variables are multiplied together, giving the transition probability between the trial brewing sample and each past sample over all explanatory variables. Finally, for each past sample, the product of the actual component value of the corresponding brewing case and its transition probability with the trial brewing sample is calculated, and these products are summed over all past samples in the target data (learning data). Dividing this sum by the sum of the transition probabilities between the past samples and the trial brewing sample gives the estimate of the component value.
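The estimation steps above amount to a probability-weighted average of the past component values, which can be sketched as follows. The per-variable transition probability used here, an exponential kernel with a scale parameter per explanatory variable, is an assumption made only for illustration; this passage does not give the form of that probability. All names and numbers are hypothetical.

```python
import math


def transition_prob(a, b, scales):
    """Transition probability between samples a and b as a product over explanatory variables.

    Each factor is an exponential kernel exp(-|a_i - b_i| / s_i) / (2 * s_i);
    this kernel is an assumed stand-in for the per-variable transition probability.
    """
    p = 1.0
    for a_i, b_i, s_i in zip(a, b, scales):
        p *= math.exp(-abs(a_i - b_i) / s_i) / (2.0 * s_i)
    return p


def kstar_estimate(trial, past_samples, past_values, scales):
    """Weighted average of past component values, weighted by transition probability."""
    weights = [transition_prob(trial, s, scales) for s in past_samples]
    return sum(w * y for w, y in zip(weights, past_values)) / sum(weights)


past_samples = [(0.0,), (2.0,)]       # hypothetical explanatory-variable values
past_values = [10.0, 20.0]            # hypothetical component values of the brewing cases
second_estimate = kstar_estimate((0.5,), past_samples, past_values, scales=(1.0,))
```

Because the trial brewing sample lies closer to the first past sample, the estimate falls between the two component values but nearer to the first.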

When the error of the second method is checked in the learning process using the block set as test data, the transition probabilities are calculated between each past sample in the test block and the past samples in the training data.

K* can also be used to classify samples into categories. For example, the probability that sample a belongs to category C is obtained as the sum of the probabilities that sample a transitions to each of the samples belonging to category C. That is, equation (6) below can be used as P*.

$$P^*(C \mid a) = \sum_{b \in C} P^*(b \mid a) \tag{6}$$

When the trial brewing sample is added to a feature space in which the past samples corresponding to the brewing cases are classified into multiple categories and equation (6) above is applied, a value of P* can be calculated for each category. The trial brewing sample can then be judged to belong to the category with the largest P*.
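The category decision can be sketched as below: the transition probabilities from the trial brewing sample to the past samples are summed per category, and the category with the largest total is chosen. The transition probabilities are taken as given inputs, and the function name, category labels, and values are hypothetical.

```python
from collections import defaultdict


def most_probable_category(transition_probs, categories):
    """Sum the transition probabilities per category and return the category with the largest sum.

    transition_probs[i] is the transition probability from the trial brewing sample
    to past sample i, and categories[i] is that past sample's category.
    """
    totals = defaultdict(float)
    for p, c in zip(transition_probs, categories):
        totals[c] += p
    best = max(totals, key=totals.get)
    return best, dict(totals)


# Hypothetical values: the single most probable past sample is in "B",
# but category "A" has the larger total.
best, totals = most_probable_category([0.30, 0.25, 0.40], ["A", "A", "B"])
```

The example is chosen so that the category of the single nearest sample is not the winner, which is the difference between summing over a category and a plain nearest-neighbor rule.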

In the learning process, when an estimate is obtained using the test data, the verification unit 16 determines whether the error between the estimate and the true value (actual component value) is within the allowable error, producing the within-tolerance determination result. Then, using the training data with the within-tolerance determination results as the teacher signal, it trains an error determination model that classifies (TRUE/FALSE) whether the deviation is within the allowable error. Finally, it applies the trained error determination model to the test data to obtain the within-tolerance probabilities.

Because this processing by the verification unit 16 is executed for each component value to be predicted and each method used for prediction, an error determination model is generated for each combination of component value to be predicted (Y_1, Y_2, ..., Y_m) and machine learning method (for example, the first method or the second method). The processing executed by the verification unit 16 in the prediction process is described next.

In the prediction process, the explanatory variables of the trial brewing sample are input to the error determination models, and the within-tolerance probability is calculated for each method. Based on the within-tolerance probabilities, the method selection unit 17 (method selection model) described later selects the method estimated to have the smaller error (higher accuracy).

The error determination model, which performs a binary classification (TRUE/FALSE) of whether the deviation between the estimated value and the true value is within the allowable error, can be learned by, for example, logistic regression. The following outlines the learning process for an error determination model that uses logistic regression.

The logistic model used by the verification unit 16 is expressed as equation (7) below.

$$Z = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_{n+1} X_{n+1} \qquad (7)$$

Here, Z is the objective variable, X_1 to X_{n+1} are explanatory variables, b_0 is a constant, and b_1 to b_{n+1} are regression coefficients. For example, the explanatory variables x_1 to x_n of the trial brewing sample can be used as the explanatory variables X_1 to X_n. As the explanatory variable X_{n+1}, the distance in the feature space from the trial brewing sample to the nearest past sample (the nearest-neighbor distance) can be used. Examples of distances include the Euclidean distance, Manhattan distance, Minkowski distance, Chebyshev distance, and Mahalanobis distance; any type of distance may be used. Note that equation (7) is only one example of the model, and the nearest-neighbor distance need not be included among the explanatory variables of the logistic model.
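As a rough illustration of the explanatory variables described above, the sketch below appends the nearest-neighbor distance to the explanatory variables of a trial brewing sample. It assumes the Euclidean distance (the specification allows any distance), and the function names and coordinates are hypothetical.

```python
import math

def nearest_distance(sample, past_samples):
    """Euclidean distance from the sample to its closest past sample."""
    return min(math.dist(sample, past) for past in past_samples)

def error_model_features(sample, past_samples):
    """Explanatory variables of the sample followed by the nearest-neighbor distance."""
    return list(sample) + [nearest_distance(sample, past_samples)]

# Hypothetical two-dimensional feature space with three past samples.
past_samples = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
features = error_model_features((3.0, 0.0), past_samples)
```

A large nearest-neighbor distance indicates that the trial brewing conditions lie far from any past brewing case, which is one plausible reason to include it as a predictor of estimation error.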

By performing logistic regression with the within-tolerance determination result as the teacher signal, the values of the constant b_0 and the regression coefficients b_1 to b_{n+1} are calculated. In the numerical calculation, the values of b_0 to b_{n+1} are updated repeatedly, using the steepest descent method, stochastic gradient descent, or the like, so that the log-likelihood reaches an extremum. The numerical algorithm used is not particularly limited.
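As one illustration of the iterative update described above, the sketch below fits a logistic model by batch gradient ascent on the log-likelihood (equivalently, steepest descent on its negative). The learning rate, the iteration count, and the toy data are assumptions chosen for illustration: a single explanatory variable stands in for the nearest-neighbor distance, and label 1 means "within tolerance".

```python
import math

def sigmoid(z):
    """Logistic sigmoid, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, learning_rate=0.1, iterations=2000):
    """Return [b_0, b_1, ..., b_n] fitted by batch gradient ascent."""
    n = len(rows[0])
    b = [0.0] * (n + 1)  # b[0] is the constant term
    for _ in range(iterations):
        grad = [0.0] * (n + 1)
        for x, y in zip(rows, labels):
            z = b[0] + sum(bk * xk for bk, xk in zip(b[1:], x))
            residual = y - sigmoid(z)  # gradient of the log-likelihood w.r.t. z
            grad[0] += residual
            for k, xk in enumerate(x, start=1):
                grad[k] += residual * xk
        b = [bk + learning_rate * g / len(rows) for bk, g in zip(b, grad)]
    return b

# Toy data: small distances were within tolerance, large ones were not.
rows = [[0.1], [0.4], [1.5], [2.0]]
labels = [1, 1, 0, 0]
b = fit_logistic(rows, labels)
p_near = sigmoid(b[0] + b[1] * 0.1)  # within-tolerance probability near past data
p_far = sigmoid(b[0] + b[1] * 2.0)   # within-tolerance probability far from past data
```

On this toy data the fitted coefficient for the distance is negative, so the estimated within-tolerance probability falls as the sample moves away from past cases.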

Once the values of the constant b_0 and the regression coefficients b_1 to b_{n+1} have been obtained, the within-tolerance probability p can be calculated by substituting the objective variable Z into the sigmoid function expressed by equation (8) below.

$$p = \frac{1}{1 + e^{-Z}} \qquad (8)$$

As shown in FIG. 12, the output value p of the sigmoid function takes a value in the range (0, 1). With logistic regression analysis, when the analysis target takes one of two states, 1 or 0, the probability that the objective variable takes either value can be estimated. When estimated values are obtained by two machine learning methods, the verification unit 16 calculates, for each component value to be predicted, the within-tolerance probability p_1 for the first estimated value and the within-tolerance probability p_2 for the second estimated value.

The above describes, as an example, the case where the error determination model is learned using logistic regression, but other methods may be used to learn the error determination model. Another example of a method that performs probabilistic classification is least-squares probabilistic classification.

In the learning process, the method selection unit 17 selects the machine learning method with the smaller error (higher accuracy) as the "preferred method", based on the error (deviation) between the estimated value obtained by inputting the test data into the prediction model of each method and the actual component value (true value). The detailed processing is as described above with reference to FIG. 9. Next, the method selection unit 17 adds the within-tolerance probabilities obtained by the verification unit 16 (for example, the first within-tolerance probability and the second within-tolerance probability) to the explanatory variables of the learning data and learns the method selection model with the "preferred method" as the teacher signal.

Because the method selection model is learned for each component value to be predicted, one method selection model is generated for each component value to be predicted (Y_1, Y_2, ..., Y_m). Next, the processing executed by the method selection unit 17 in the prediction process is described.

In the prediction process, the within-tolerance probabilities for the respective methods calculated by the verification unit 16 (for example, the probabilities p_1 and p_2) and the explanatory variables of the trial brewing sample are input to the method selection model, and the estimated value to be used as the official predicted value is selected (for example, when there are a first method and a second method, either the first estimated value or the second estimated value).

Next, the learning process for the method selection model is described. When the estimated value of one of two methods, such as the first method and the second method, is selected as the official predicted value, logistic regression can be used, as with the error determination model. The following describes, as an example, the case where logistic regression is used to learn the method selection model.

To learn the method selection model, a logistic model such as equation (9) below is used.

$$Z' = d_0 + d_1 X'_1 + d_2 X'_2 + \cdots + d_{n+2} X'_{n+2} \qquad (9)$$

Here, Z' is the objective variable, X'_1 to X'_{n+2} are explanatory variables, d_0 is a constant, and d_1 to d_{n+2} are regression coefficients.

For example, the method selection unit 17 can use the within-tolerance probability p_1 for the first estimated value as the explanatory variable X'_1, the within-tolerance probability p_2 for the second estimated value as the explanatory variable X'_2, and the explanatory variables x_1 to x_n of the trial brewing sample as the explanatory variables X'_3 to X'_{n+2}. In other words, the within-tolerance probability of the first estimated value and the within-tolerance probability of the second estimated value under the trial brewing conditions are added to the explanatory variables of the trial brewing sample to form the explanatory variables of the logistic regression. The method selection unit 17 then performs logistic regression with the "preferred method" shown in column 49 of FIG. 9 as the teacher signal and calculates the values of the constant d_0 and the regression coefficients d_1 to d_{n+2}. The numerical method used is not particularly limited.

Once the value of the objective variable Z' has been obtained, the probability p' is calculated by substituting Z' into the sigmoid function of equation (10) below.

$$p' = \frac{1}{1 + e^{-Z'}} \qquad (10)$$

The value of the probability p' is then binarized. One example of binarization uses a threshold. For example, when the probability p' is greater than the threshold, the output value is set to 1 and the model-based method (first estimated value) is selected. When the probability p' is less than or equal to the threshold, the output value is set to 0 and the case-based method (second estimated value) is selected. A value of 0.5, for example, can be used as the threshold, but a different value may be used. Hereinafter, the probability p' of equation (10) is called the method variable to distinguish it from the within-tolerance probability p.
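The selection step can be sketched as follows, assuming the method selection model is a logistic model whose explanatory variables are p_1, p_2, and the explanatory variables of the trial brewing sample, in that order. The coefficient values, the function name, and the inputs are hypothetical; in practice the coefficients would come from the learning process.

```python
import math

def select_estimate(d, p1, p2, x, first_estimate, second_estimate, threshold=0.5):
    """Return (chosen method, chosen estimate, method variable p')."""
    features = [p1, p2] + list(x)  # X'_1, X'_2, then X'_3 .. X'_{n+2}
    z = d[0] + sum(dk * fk for dk, fk in zip(d[1:], features))
    p_method = 1.0 / (1.0 + math.exp(-z))  # method variable p'
    if p_method > threshold:
        return "first", first_estimate, p_method   # output value 1
    return "second", second_estimate, p_method     # output value 0

# Hypothetical coefficients d_0 .. d_3 for a sample with one explanatory variable.
d = [0.0, 2.0, -2.0, 0.1]
choice_a = select_estimate(d, p1=0.9, p2=0.4, x=[1.0],
                           first_estimate=5.2, second_estimate=4.9)
choice_b = select_estimate(d, p1=0.2, p2=0.8, x=[1.0],
                           first_estimate=5.2, second_estimate=4.9)
```

With these made-up coefficients, a high p_1 relative to p_2 pushes the method variable above the threshold and the first estimated value is presented; the reverse selects the second estimated value.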

The above describes, as an example, the case where logistic regression is used to learn the method selection model, but other methods may be used. For example, least-squares probabilistic classification may be used, or the method selection model may be learned using support vector classification, decision trees, ensemble learning, or the like.

Because a method selection model is generated for each component value to be predicted (Y_1, Y_2, ..., Y_m), the method selected in the actual prediction process may differ depending on the component value being predicted. In the prediction process, the method selection unit 17 may display on the display unit 5 the type of machine learning method selected for predicting each component value and the official predicted value of the component value under the trial brewing conditions.

Next, the processing executed by the characteristic prediction device 1 is described in detail. FIGS. 13 to 15 are flowcharts showing the learning process executed by the characteristic prediction device. Before a user predicts a component value under given brewing conditions, the learning process for predicting that component value must have been executed. The processing is described below with reference to the flowcharts of FIGS. 13 to 15.

First, the preprocessing unit 4 generates learning data from the brewing data and stores the learning data in the learning data storage unit 22 (step S101). The processing executed in step S101 is as described in the description of the preprocessing unit 4. Next, the variable selection unit 11 selects the variables to be used for learning for each combination of a component value to be predicted (objective variable) and a method used to estimate the objective variable (step S102). Details of the processing executed in step S102 are as described with reference to FIGS. 6 and 7 and in the description of the variable selection unit 11. At this time, the variable selection unit 11 may store information on the selected variables in the learning data storage unit 22.

The cross-validation unit 12 then divides the learning data L into k parts, generating a plurality of blocks L_1, L_2, ..., L_k (step S103). At this time, the cross-validation unit 12 determines which blocks are assigned as test data and which as training data when learning the prediction models and error determination models for the component values Y_1, Y_2, ..., Y_m. The number of divisions k of the learning data L is not particularly limited. Details of the processing executed in step S103 are as described with reference to FIGS. 2 and 3 and in the description of the cross-validation unit 12.

Once the learning data has been divided into a plurality of blocks, some of the blocks are set as test data and the remaining blocks are set as training data (step S104). When the block assignment of test data and training data has been decided, the prediction model and the error determination model are learned for each combination of a component value to be predicted (objective variable) and a machine learning method. The following describes, as an example, the case where the prediction models and error determination models are learned by iterative processing. First, 1 is assigned to the loop counter i for the iterative processing (step S105).
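The block division and test/training assignment can be sketched as below. The sketch assumes ordinary k-fold rotation, in which each block serves once as the test data; the actual assignment used by the cross-validation unit 12 is described with FIGS. 2 and 3, which are not reproduced here. The function names and the integer stand-ins for brewing records are hypothetical.

```python
def split_into_blocks(records, k):
    """Split records into k blocks L_1 .. L_k of near-equal size."""
    return [records[i::k] for i in range(k)]

def train_test_assignments(blocks):
    """Yield (training data, test data), using each block once as test data."""
    for t in range(len(blocks)):
        test = blocks[t]
        train = [r for j, block in enumerate(blocks) if j != t for r in block]
        yield train, test

records = list(range(10))  # stand-ins for ten brewing records
blocks = split_into_blocks(records, k=3)
folds = list(train_test_assignments(blocks))
```

Rotating the test block in this way yields a within-tolerance determination result for every record in the learning data.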

After step S105 is executed, the characteristic prediction device 1 executes in parallel the processing of steps S106 to S111 (learning the prediction model of the first method and the corresponding error determination model) and the processing of steps S112 to S118 (learning the prediction model of the second method and the corresponding error determination model). The processing of steps S106 to S111 is described first.

For the component value Y_i (objective variable y_i) to be estimated, the model generation unit 13 generates a prediction model by the first method (for example, M5') using the training data (step S106). The model generation unit 13 stores the prediction model generated by the first method in the model database 24 as model data (step S107). The model data is stored together with information indicating the component value (objective variable) estimated by the model. Details of the processing executed in steps S106 and S107 are as described in the description of the model generation unit 13.

Next, the first estimation unit 14 uses the test data to obtain the first estimated value of the component value Y_i (objective variable y_i) from the prediction model. The verification unit 16 then calculates and stores the magnitude of the error between the first estimated value and the true value (actual component value) for each brewing ID. The verification unit 16 determines whether the magnitude of the error is within the allowable error (step S108).

The verification unit 16 stores the within-tolerance determination result (TRUE/FALSE) for each brewing ID included in the test data (step S109). The verification unit 16 can store the within-tolerance determination results in, for example, the determination table of FIG. 8. Details of the processing executed in steps S108 and S109 are as described with reference to FIG. 8.

The verification unit 16 then uses the training data, with the within-tolerance determination result as the teacher signal, to learn an error determination model that estimates the probability that the deviation (error) is within the allowable error, and stores the model in the model database 24 of the storage unit 2 (step S110). Details of the processing executed in step S110 are as described in the description of the error determination model. Next, the verification unit 16 applies the error determination model to the test data and obtains the first within-tolerance probability for each brewing ID (step S111). The first within-tolerance probability can be stored in the determination table.

Next, the processing of steps S112 to S118 is described.

For the component value Y_i (objective variable y_i) to be estimated, the model generation unit 13 performs learning by the second method using the training data (step S112). When K* is used as the second method, preprocessing for calculating the transition probabilities may be executed in step S112. If no preprocessing or the like needs to be executed, the processing of step S112 may be omitted. When a model-based method is used as the second method, another prediction model is generated in step S112.

Next, the model generation unit 13 stores the data used by the second method in the model database 24 as model data (step S113). If there is no model data to store, the processing of step S113 may be omitted.

The second estimation unit 15 then uses the test data to obtain the second estimated value of the component value Y_i (objective variable y_i) by the second method. The verification unit 16 calculates and stores the magnitude of the error between the second estimated value and the true value (actual component value) for each brewing ID. The verification unit 16 determines whether the magnitude of the error is within the allowable error (step S114).

The verification unit 16 stores the within-tolerance determination result (TRUE/FALSE) for each brewing ID included in the test data (step S115). The verification unit 16 can store the within-tolerance determination results in, for example, the determination table of FIG. 8. Details of the processing executed in steps S114 and S115 are as described with reference to FIG. 8.

The verification unit 16 then uses the training data, with the within-tolerance determination result as the teacher signal, to learn an error determination model that estimates the probability that the deviation (error) is within the allowable error, and stores the model in the model database 24 of the storage unit 2 (step S117). Details of the processing executed in step S117 are as described in the description of the error determination model. Next, the verification unit 16 applies the error determination model to the test data and obtains the second within-tolerance probability for each brewing ID (step S118). The second within-tolerance probability can be stored in the determination table.

When the processing of steps S111 and S118 has finished for the component value Y_i (objective variable y_i) to be predicted, i + 1 is assigned to the loop counter i (step S119). Next, the value of the loop counter i is compared with the number m of component values (objective variables) to be predicted (step S120). If i ≤ m, the processing returns to steps S106 and S112 (parallel processing). If i > m, the processing proceeds to step S121. In this way, the iterative processing is executed until a prediction model and the corresponding error determination model have been learned for every component value Y_i (objective variable y_i) to be predicted.

In the example shown in the flowcharts of FIGS. 13 and 14, the prediction model and the corresponding error determination model for each component value Y_i (objective variable y_i) to be predicted are learned by iterative processing, but the degree of parallelism may differ from this. For example, the prediction models and corresponding error determination models for a plurality of component values Y_i (objective variables y_i) may be learned in parallel, or the models of the first method and the second method may be learned sequentially. Parallel learning can be realized by using, for example, a plurality of computers or a plurality of CPUs (Central Processing Units).
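The control flow of this learning loop can be sketched as follows. The two training functions are placeholders standing in for steps S106 to S111 and S112 to S118; their names and return values are hypothetical. The sketch uses a thread pool only to show the structure. Because CPU-bound Python threads do not run truly in parallel, a process pool or separate machines would be the closer analogue of the multiple CPUs or computers mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor

def train_first_method(i):
    """Placeholder for steps S106 to S111 (first method) for component value Y_i."""
    return ("first", i)

def train_second_method(i):
    """Placeholder for steps S112 to S118 (second method) for component value Y_i."""
    return ("second", i)

def run_learning(m):
    """Loop over component values Y_1 .. Y_m, training both methods side by side."""
    results = {}
    with ThreadPoolExecutor(max_workers=2) as pool:
        for i in range(1, m + 1):  # loop counter i (steps S105, S119, S120)
            first = pool.submit(train_first_method, i)
            second = pool.submit(train_second_method, i)
            results[i] = (first.result(), second.result())
    return results

results = run_learning(m=3)
```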

Next, for each objective variable y (component value Y), the method selection unit 17 compares the error between the first estimated value from the first method and the true value (actual component value) with the error between the second estimated value from the second method and the true value. For that objective variable y, the method selection unit 17 selects the method with the smaller error as the "preferred method" for estimating the objective variable (step S121). Details of the selection of the "preferred method" are as described with reference to FIG. 9 and in the description of the method selection unit 17.
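A minimal sketch of the "preferred method" labeling follows. It assumes the comparison uses absolute errors and that ties go to the first method; the tie rule, the brewing IDs, and the values are assumptions, since the detailed procedure is given with FIG. 9.

```python
def preferred_method(first_estimate, second_estimate, true_value):
    """Return the method whose estimate is closer to the true value."""
    error_first = abs(first_estimate - true_value)
    error_second = abs(second_estimate - true_value)
    return "first" if error_first <= error_second else "second"

# Hypothetical rows: brewing ID -> (first estimate, second estimate, true value).
rows = {
    "B001": (5.1, 5.4, 5.0),
    "B002": (4.0, 4.6, 4.7),
}
preferred = {bid: preferred_method(*values) for bid, values in rows.items()}
```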

Once a "preferred method" has been selected for each brewing ID in the learning data, the method selection model is learned without dividing the learning data into training data and test data. In the example of FIG. 15, learning of the method selection model is repeated for each component value (objective variable) to be predicted, so 1 is assigned to the loop counter j, which corresponds to each objective variable y (component value Y) (step S122).

First, the method selection unit 17 adds the first within-tolerance probability and the second within-tolerance probability to the explanatory variables of the learning data and, for the objective variable y_j (component value Y_j), learns the method selection model with the "preferred method" as the teacher signal (step S123). Details of the learning of the method selection model are as described in the description of the method selection unit 17. The method selection unit 17 then stores the learned method selection model for the objective variable y_j (component value Y_j) in the model database 24 of the storage unit 2 (step S124).
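The construction of the training rows for the method selection model might look like the sketch below. The record layout (keys "p1", "p2", "x", "preferred") is hypothetical, and the encoding of the teacher signal as 1 for the first method and 0 for the second is an assumption, chosen to be consistent with the binarization described for the method variable.

```python
def build_selection_rows(records):
    """Return (rows, labels) for learning the method selection model."""
    rows, labels = [], []
    for record in records:
        # Within-tolerance probabilities first, then the original explanatory variables.
        rows.append([record["p1"], record["p2"]] + list(record["x"]))
        labels.append(1 if record["preferred"] == "first" else 0)
    return rows, labels

# Hypothetical learning data for two brewing IDs.
records = [
    {"p1": 0.9, "p2": 0.4, "x": [12.5, 3.0], "preferred": "first"},
    {"p1": 0.3, "p2": 0.8, "x": [10.0, 4.5], "preferred": "second"},
]
rows, labels = build_selection_rows(records)
```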

When the processing of step S124 has finished for the component value Y_j (objective variable y_j) to be predicted, j + 1 is assigned to the loop counter j (step S125). Next, the value of the loop counter j is compared with the number m of component values (objective variables) to be predicted (step S126). If j ≤ m, the processing returns to step S123. If j > m, the entire learning process ends. In this way, the iterative processing is executed until a method selection model has been learned for every component value Y_j (objective variable y_j) to be predicted.

In the example shown in the flowchart of FIG. 15, the method selection model for each component value Y_i (objective variable y_i) to be predicted is learned by iterative processing, but the method selection models for a plurality of component values Y_i (objective variables y_i) may be learned in parallel. Parallel learning can be realized by using, for example, a plurality of computers or a plurality of CPUs (Central Processing Units).

Next, the prediction process performed by the characteristic prediction device 1 is described.

FIGS. 16 and 17 are flowcharts showing the prediction process for a component value Y_p. Once the learning process has been completed for a component value (objective variable) to be predicted, the prediction process can be executed for that component value. The processing is described below with reference to the flowcharts of FIGS. 16 and 17.

First, the user decides on the trial brewing conditions and on the component values of the beverage to be predicted under those conditions (step S201). One component value or a plurality of component values may be predicted. Next, the user uses the operation unit 6 to input the trial brewing conditions and the component value Y_p to be predicted into the characteristic prediction device 1 (step S202). The user then issues a command from the operation unit 6 to start the component value prediction process.

When the command to start the prediction process is issued, the characteristic prediction device 1 selects, from the trial brewing conditions, the explanatory variables and the objective variable for each component value estimation method (for example, a model-based method such as M5' or a case-based method such as K*) (step S203). The explanatory variables and objective variable selected here are called a trial brewing sample. In step S203, a trial brewing sample is generated for each component value to be predicted under the trial brewing conditions. Details of the processing executed in step S203 are as described with reference to FIG. 7.

After the processing of step S203 is executed, the characteristic prediction device 1 executes in parallel the processing of steps S204 and S205 (estimation by the first method and calculation of the within-tolerance probability) and the processing of steps S206 and S207 (estimation by the second method and calculation of the within-tolerance probability). The processing of steps S204 and S205 is described first.

The first estimation unit 14 inputs the trial brewing sample into the prediction model of the first method, which estimates the component value Y_p (objective variable y_p), and calculates the first estimated value (step S204). When M5' is used as the first method, the estimated value of the component value (objective variable) is obtained from the regression equation at the corresponding node of the tree.

Next, the verification unit 16 inputs the explanatory variables of the trial brewing sample into the error determination model corresponding to the prediction model of the first method and calculates the first within-tolerance probability (step S205). The verification unit 16 may store the calculated first within-tolerance probability in the storage unit 2.

Next, the processing of steps S206 and S207 is described.

The second estimation unit 15 inputs the trial brewing sample into the prediction model of the second method, which estimates the component value Y_p (objective variable y_p), and calculates the second estimated value (step S206). When K* is used as the second method, the second estimated value can be calculated by summing the products of the transition probability between the trial brewing sample and each past sample and the component value of the brewing case corresponding to that past sample, and then dividing by the sum of the transition probabilities.
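The weighted average described for K* can be written as a short sketch. It takes the transition probabilities as given inputs (how they are computed is covered elsewhere in the specification), and the function name and numbers are hypothetical.

```python
def kstar_estimate(transition_probs, component_values):
    """Transition-probability-weighted average of past component values."""
    total = sum(transition_probs)
    if total == 0:
        raise ValueError("transition probabilities sum to zero")
    weighted = sum(p * v for p, v in zip(transition_probs, component_values))
    return weighted / total

# Hypothetical transition probabilities to three past samples and their component values.
second_estimate = kstar_estimate([0.2, 0.1, 0.1], [4.0, 5.0, 6.0])
```

Dividing by the sum of the transition probabilities normalizes the weights, so the estimate always lies between the smallest and largest past component values.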

Next, the verification unit 16 inputs the explanatory variables of the trial brewing sample into the error determination model corresponding to the prediction model of the second method and calculates a second within-tolerance probability (step S207). The verification unit 16 may store the calculated second within-tolerance probability in the storage unit 2.

Once steps S205 and S207 have both been executed, step S208 is executed. In step S208, the method selection unit 17 inputs the explanatory variables of the trial brewing sample, the first within-tolerance probability, and the second within-tolerance probability into the method selection model corresponding to the component value Y_p (objective variable y_p), and obtains the output value of the method selection model. When the method selection model has been trained by logistic regression, the output value lies in the range (0, 1).

Based on the output value of the method selection model, the method selection unit 17 decides whether the first estimated value or the second estimated value is presented to the user as the official predicted value of the component value Y_p (step S209). When the method selection model has been trained by logistic regression, the output value is compared with a threshold (for example, 0.5). For example, when the output value is greater than the threshold, the first estimated value is selected as the official predicted value; when the output value is less than or equal to the threshold, the second estimated value is selected.
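
Steps S208 and S209 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes a logistic-regression method selection model whose weights and bias have already been learned, and the names `selector_output` and `select_estimate`, along with all numeric values, are hypothetical.

```python
import math
from typing import Sequence, Tuple


def selector_output(weights: Sequence[float], bias: float,
                    explanatory: Sequence[float],
                    p_first: float, p_second: float) -> float:
    """Output of an assumed logistic-regression method selection model.

    The feature vector is the explanatory variables of the trial brewing
    sample followed by the two within-tolerance probabilities.
    """
    features = list(explanatory) + [p_first, p_second]
    if len(features) != len(weights):
        raise ValueError("weights must match the number of features")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))  # always in the open interval (0, 1)


def select_estimate(first_estimate: float, second_estimate: float,
                    output: float, threshold: float = 0.5) -> Tuple[str, float]:
    """Pick the official predicted value by comparing output with threshold."""
    if output > threshold:
        return ("first", first_estimate)
    # Output less than or equal to the threshold: use the second method.
    return ("second", second_estimate)


if __name__ == "__main__":
    out = selector_output([0.1, -0.2, 1.5, -1.5], 0.0, [1.0, 2.0], 0.9, 0.4)
    print(out, select_estimate(4.2, 4.6, out))
```

Note that an output exactly equal to the threshold selects the second estimated value, matching the "less than or equal to" rule in the text.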

Here, one of the estimated values from the two prediction models is selected as the official predicted value of the component value Y_p, but depending on the determination result, the estimated values from both prediction models may be selected as official predicted values. When estimated values from three or more prediction models have been calculated, either a single estimated value or several estimated values may be selected as the official predicted value of the component value Y_p.

Next, the learning unit 3 extracts the past samples to be presented as related cases, based on the position of the trial brewing sample in the feature space of the learning data (step S210). For example, past samples classified into the same category (cluster) as the trial brewing sample in the feature space, or past samples located near the trial brewing sample in the feature space, can be extracted as related cases.

The extraction of related cases is described below with reference to FIG. 18. FIG. 18 shows the feature space of the learning data. In the example of FIG. 18, the past samples in the learning data are classified into category #1, category #2, and category #3. The classification into categories can be performed by, for example, K*, k-means, decision trees, or ensemble classification, but the classification method is not particularly limited.

In the feature space of FIG. 18, the black circle marks the position of the trial brewing sample 60. When past samples classified into the same category (cluster) as the trial brewing sample are extracted as related cases, the past samples belonging to category #3 are extracted.

Related cases can also be extracted in a feature space that has not been classified into categories (clusters). The surface of the sphere 61 centered on the trial brewing sample 60 (black circle) in FIG. 18 represents the set of points equidistant from the trial brewing sample 60. In the feature space of FIG. 18, for example, the past samples lying within the sphere 61 can be extracted as related cases. The distance in the feature space may be the Euclidean distance, Manhattan distance, Minkowski distance, Chebyshev distance, Mahalanobis distance, or any other type of distance.

In this way, the past samples within a certain distance of the trial brewing sample may be extracted as related cases, or a predetermined number of past samples may be extracted in order of increasing distance from the trial brewing sample. The predetermined number may be, for example, 10, 15, or 20, but the setting is not particularly limited.
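
The two distance-based extraction rules above (all past samples within a fixed radius, or a fixed number of nearest past samples) can be sketched in Python as follows. The sketch assumes the Euclidean distance, which is only one of the distances the text permits; the function names `within_radius` and `k_nearest` and the sample coordinates are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def within_radius(query: Point, past: Sequence[Point], radius: float) -> List[int]:
    """Indices of past samples inside (or on) the sphere around the query."""
    return [i for i, s in enumerate(past) if math.dist(query, s) <= radius]


def k_nearest(query: Point, past: Sequence[Point], k: int) -> List[int]:
    """Indices of the k past samples closest to the query, nearest first."""
    order = sorted(range(len(past)), key=lambda i: math.dist(query, past[i]))
    return order[:k]


if __name__ == "__main__":
    trial_sample = (0.0, 0.0)
    past_samples = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0), (0.5, 0.5)]
    print(within_radius(trial_sample, past_samples, radius=1.0))
    print(k_nearest(trial_sample, past_samples, k=2))
```

In practice the explanatory variables would usually be scaled before distances are computed, since variables with large numeric ranges otherwise dominate the result; that preprocessing is outside this sketch.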

The description now returns to the flowchart of FIG. 17.

Finally, the official predicted value of the component value Y_p and the related cases for the trial brewing conditions are presented to the user (step S211). In step S211, it is sufficient that the predicted component value is presented to the user by some means. For example, the predicted value may be shown on a display, printed on paper and handed to the user, sent to the user's email address, or announced by voice. When the characteristic prediction device 1 functions as a web service or an intranet server, the predicted component values for the trial brewing conditions may be made viewable from a web browser. Examples of the screen display in step S211 are described later.

The predicted value presented (displayed) to the user in step S211 may be a predicted value from a single prediction model or predicted values from a plurality of prediction models. The number of related cases presented to the user is not particularly limited.

In the flowcharts of FIG. 16 and FIG. 17, the estimation and within-tolerance probability calculation by the first method and the estimation and within-tolerance probability calculation by the second method are executed in parallel, but these processes may instead be executed sequentially. When a plurality of component values are predicted, all or some of the component values may be predicted in parallel, or each component value may be predicted in turn.
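
The choice between parallel and sequential execution of the two branches can be sketched in Python with the standard-library `concurrent.futures` module. This is an illustration only: the helper names `run_branch` and `run_branches` are hypothetical, and the lambda functions stand in for the real prediction models and error determination models.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]
Branch = Tuple[Callable[[Sample], float], Callable[[Sample], float]]


def run_branch(estimate_fn: Callable[[Sample], float],
               probability_fn: Callable[[Sample], float],
               sample: Sample) -> Tuple[float, float]:
    """One branch: estimate the component value, then its within-tolerance probability."""
    return estimate_fn(sample), probability_fn(sample)


def run_branches(sample: Sample, branches: Sequence[Branch],
                 parallel: bool = True) -> List[Tuple[float, float]]:
    """Run every (prediction model, error determination model) branch.

    Results are returned in branch order in both modes.
    """
    if not parallel:
        return [run_branch(e, p, sample) for e, p in branches]
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = [pool.submit(run_branch, e, p, sample) for e, p in branches]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # Stand-in models (assumptions): real models would be trained beforehand.
    first = (lambda s: sum(s) / len(s), lambda s: 0.9)
    second = (lambda s: max(s), lambda s: 0.4)
    print(run_branches([1.0, 2.0, 3.0], [first, second]))
```

Because each branch depends only on the trial brewing sample, the branches are independent and both modes give the same results.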

Next, examples of the screens displayed on the display unit 5 by the characteristic prediction device 1 are described. FIG. 19 shows an example of a trial brewing condition input screen. Screen 50 in FIG. 19 is the trial brewing condition input screen of the "Beverage Characteristic Prediction System". On screen 50, the user can modify a past brewing case and set the result as the trial brewing conditions. For malt, the component values of the malt at the start of brewing can be calculated after combining malt from multiple lots. Screen 50 allows detailed brewing conditions to be set, including the types of malt and the amount of each, the types of hops and the amount of each, the types of auxiliary materials and the amount of each, the types of water treatment agents and the amount of each, the types of enzyme agents and the amount of each, and the temperature conditions.

FIG. 20 shows a first example of a prediction result display screen shown on the display. Table 51a in the upper part of screen 51 in FIG. 20 displays the official predicted values of several component values: final apparent degree of fermentation (AAL), color, pH, total nitrogen, total amino acids, and BU. Table 51b in the lower part of screen 51 displays a plurality of related cases together with their brewing IDs and other information. By clicking the table in the lower part of screen 51, the user can view more detailed information about a related case. The detailed information includes, for example, the setting items of each related case and the component values of the beverage produced in that case.

FIG. 21 shows a second example of the prediction result display screen shown on the display. The table in FIG. 21 displays the values predicted by the first method and by the second method for the final apparent degree of fermentation (AAL), color, pH, total nitrogen, total amino acids, and BU. In this way, the estimated values from a plurality of machine learning methods (prediction models) may be displayed as the predicted component values of the beverage.

By using the characteristic prediction device according to the present invention, the characteristics of a beverage that would be obtained by brewing under arbitrary conditions (trial brewing conditions) can be predicted with high accuracy from brewing data accumulated in the past. The more cases the brewing data underlying the learning data contains, the more accurately the component values can be predicted. The characteristic prediction device according to the present invention also presents the user with related cases that are relevant to the trial brewing conditions the user has entered. This frees the user from the effort of searching a vast number of past brewing cases for ones that can serve as a reference for development. By referring to the related cases, the user can adjust and reconsider the trial brewing conditions (setting items such as raw materials and brewing processes) so as to obtain a beverage with the desired characteristics (component values).

By referring to the related cases, new and junior employees assigned to a research and development department can learn which raw materials to use and how to adjust the brewing process in order to produce a beverage with the desired characteristics (component values). With the characteristic prediction device according to the present invention, such employees can efficiently grasp the brewing know-how and intuition that engineers have until now acquired through many years of repeated trial brewing. The characteristic prediction device according to the present invention reduces the time, cost, and resources required for beverage prototyping, speeds up development, and contributes to employee training and the transfer of know-how.

(Second Embodiment)
The second embodiment describes the hardware configuration of the characteristic prediction device according to the present invention. The characteristic prediction device according to the present invention is configured by a computer 100. The computer 100 may be any of various information processing devices, such as a server, a client terminal, a microcontroller of an embedded device, a tablet, a smartphone, a feature phone, or a personal computer. The computer 100 may also be realized by a virtual machine (VM), a container, or the like.

FIG. 17 is a diagram showing an example of the computer 100. The computer 100 in FIG. 17 includes a processor 101, an input device 102, a display device 103, a communication device 104, and a storage device 105, which are connected to one another by a bus 106.

The processor 101 is an electronic circuit that includes the control unit and the arithmetic unit of the computer 100. The processor 101 may be, for example, a general-purpose processor, a central processing unit (CPU), a microprocessor, a digital signal processor (DSP), a controller, a microcontroller, a state machine, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a programmable logic device (PLD), or a combination of these.

The processor 101 performs arithmetic processing based on data and programs input from the devices connected via the bus 106 (for example, the input device 102, the communication device 104, and the storage device 105), and outputs the results and control signals to the devices connected via the bus 106 (for example, the display device 103, the communication device 104, and the storage device 105). Specifically, the processor 101 executes the operating system (OS) of the computer 100, the characteristic prediction program, and the like, and controls each device included in the computer 100.

The characteristic prediction program is a program that causes the computer 100 to execute the processing of each component of the characteristic prediction device. The characteristic prediction program is stored in a non-transitory, tangible, computer-readable storage medium. The storage medium is, for example, an optical disk, a magneto-optical disk, a magnetic disk, a magnetic tape, a flash memory, or a semiconductor memory, but is not limited to these. When the processor 101 executes the characteristic prediction program, the computer 100 functions as the characteristic prediction device.

The input device 102 is a device for inputting information to the computer 100. The input device 102 is, for example, a keyboard, a mouse, or a touch panel, but is not limited to these. Using the input device 102, the user can enter a command to start preprocessing of the brewing data, operations to modify the learning data, a command to start the learning process, variable selection operations on the learning data, the brewing conditions for which component values are to be predicted, the component values to be predicted, a command to start the component value prediction process, commands to change the displayed content, and so on.

The display device 103 is a device for displaying images and video. The display device 103 is, for example, an LCD (liquid crystal display), a CRT (cathode-ray tube), an organic EL (organic electroluminescence) display, a projector, or an LED display, but is not limited to these. As described above, the display device 103 displays the contents of the brewing data and the learning data, the predicted component values for the brewing conditions, and so on.

The communication device 104 is a device that the computer 100 uses to communicate with external devices wirelessly or by wire. The communication device 104 is, for example, a NIC (Network Interface Card), a communication module, a modem, a hub, or a router, but is not limited to these. The computer 100 may collect, via the communication device 104, brewing data accumulated at remote factories or laboratories. When the computer 100 (characteristic prediction device 1) is a server installed in a data center or machine room, the computer 100 may accept operation commands from a remote terminal and display screen content on the remote terminal via the communication device 104.

The storage device 105 is a storage medium that stores the OS of the computer 100, the characteristic prediction program, the data required to execute the characteristic prediction program, the data generated by executing the characteristic prediction program, and so on. The storage device 105 includes a main storage device and an external storage device. The main storage device is, for example, RAM, DRAM, or SRAM, but is not limited to these. The external storage device is, for example, a hard disk, an optical disk, a flash memory, or a magnetic tape, but is not limited to these. The brewing database 21, the learning data storage unit 22, the determination data storage unit 23, and the model database 24 described above may be built on the storage device 105 or on an external server or storage.

The computer 100 may include one or more of each of the processor 101, the input device 102, the display device 103, the communication device 104, and the storage device 105. Peripheral devices such as a printer or a scanner may also be connected to the computer 100.

The characteristic prediction device may be configured by a single computer 100 or by an information system in which a plurality of computers 100 are connected to one another.

Furthermore, the characteristic prediction program may be stored in advance in the storage device 105 of the computer 100, stored in a storage medium external to the computer 100, or uploaded to the Internet. In any of these cases, the functions of the characteristic prediction device can be realized by installing the characteristic prediction program on the computer 100 and executing it.

Several embodiments of the present invention have been described, but these embodiments are presented as examples and are not intended to limit the scope of the invention. They can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention and, likewise, in the invention described in the claims and its equivalents.

Description of Symbols
1 Characteristic prediction device
2 Storage unit
3 Learning unit
4 Preprocessing unit
5 Display unit
6 Operation unit
11 Variable selection unit
12 Cross-validation unit
13 Model generation unit
14 First estimation unit
15 Second estimation unit
16 Verification unit
17 Method selection unit
21 Brewing database
22 Learning data storage unit
23 Determination data storage unit
24 Model database
30, 31, 32, 33, 33a, 34, 35, 39, 40, 41, 42, 43, 43a, 45, 46, 47, 48, 49 Columns
36, 37, 38, 44, 51a, 51b Tables
50, 51, 52 Screens
60 Trial brewing sample
61 Sphere
100 Computer
101 Processor
102 Input device
103 Display device
104 Communication device
105 Storage device
106 Bus

Claims

A beverage characteristic prediction method in which a computer executes the steps of:
generating a trial brewing sample whose explanatory variables are the setting items of trial brewing conditions for a beverage;
inputting the trial brewing sample into a plurality of prediction models and obtaining, for each prediction model, an estimated value of a component value of the beverage;
inputting the trial brewing sample into an error determination model corresponding to each prediction model and calculating, for each prediction model, a within-tolerance probability; and
inputting the explanatory variables of the trial brewing sample and the plurality of within-tolerance probabilities into a method selection model and selecting, based on an output value of the method selection model, the estimated value to be presented as the component value of the beverage under the trial brewing conditions.

The beverage characteristic prediction method according to claim 1,
further including the step of displaying, on a display, the estimated value from at least one of the prediction models as the component value predicted for the beverage under the trial brewing conditions.

The beverage characteristic prediction method according to claim 1 or 2, in which the computer further executes the steps of:
generating learning data that includes past samples whose explanatory variables are the setting items of brewing cases of the beverage;
dividing the learning data into a plurality of blocks, setting some of the blocks as test data, and setting the blocks not set as test data as training data;
generating, using the training data, the plurality of prediction models that estimate the component value of the beverage;
obtaining, using the test data, the estimated values of the component value from the plurality of prediction models;
determining whether the magnitude of the error between each estimated value and the true value of the component value in the brewing case is within an allowable error, and storing the result as a within-tolerance determination result;
generating, for each prediction model, using the training data and with the within-tolerance determination result as a teacher signal, the error determination model that estimates the probability that the magnitude of the error in the estimated value is within the allowable error;
applying the plurality of error determination models to the test data and calculating the within-tolerance probability for each prediction model;
for each past sample included in the test data, comparing the errors between the true value and the estimated values from the plurality of prediction models, selecting the prediction model with the smallest error, and storing it as a preferred method; and
generating the method selection model, which selects the estimated value to be presented as the component value of the beverage, by adding the plurality of within-tolerance probabilities to the explanatory variables of the learning data and using the preferred method as a teacher signal.

The beverage characteristic prediction method according to claim 3,
further including the step of extracting the past samples classified into the same category as the trial brewing sample in the feature space of the learning data, or the past samples located in the vicinity of the trial brewing sample in the feature space, and presenting them as related cases.

The beverage characteristic prediction method according to claim 4,
further including the step of displaying, on a display, the setting items of the related cases and the component values of the beverages produced in the related cases.

The beverage characteristic prediction method according to any one of claims 3 to 5,
in which any one of the prediction models calculates the estimated value by summing, over a plurality of the past samples, the product of the transition probability to each past sample determined by K* and the component value of the brewing case corresponding to that past sample, and dividing the result by the sum of the transition probabilities.

The beverage characteristic prediction method according to any one of claims 1 to 6,
in which at least one of the error determination model and the method selection model is generated by logistic regression.

The beverage characteristic prediction method according to any one of claims 1 to 7,
in which any one of the prediction models is generated by M5 or M5′.

The beverage characteristic prediction method according to any one of claims 1 to 8,
in which the beverage is any one of beer, happoshu, a new-genre beverage, and a beer-taste beverage.

The beverage characteristic prediction method according to any one of claims 1 to 9,
in which the beverage is a beverage containing malt as a raw material.

The beverage characteristic prediction method according to any one of claims 1 to 10,
in which the setting items include at least information on either the raw materials of the beverage or the brewing process of the beverage.

A characteristic prediction program that causes a computer to execute the steps of:
generating a trial brewing sample whose explanatory variables are the setting items of trial brewing conditions for a beverage;
inputting the trial brewing sample into a plurality of prediction models and obtaining, for each prediction model, an estimated value of a component value of the beverage;
inputting the trial brewing sample into an error determination model corresponding to each prediction model and calculating, for each prediction model, a within-tolerance probability;
inputting the explanatory variables of the trial brewing sample and the plurality of within-tolerance probabilities into a method selection model and selecting, based on an output value of the method selection model, the estimated value to be presented as the component value of the beverage under the trial brewing conditions; and
displaying, on a display, at least one of the estimated values as the component value predicted for the beverage under the trial brewing conditions.

A characteristic prediction device that:
generates a trial brewing sample whose explanatory variables are the setting items of trial brewing conditions for a beverage;
inputs the trial brewing sample into a plurality of prediction models and obtains, for each prediction model, an estimated value of a component value of the beverage;
inputs the trial brewing sample into an error determination model corresponding to each prediction model and calculates, for each prediction model, a within-tolerance probability;
inputs the explanatory variables of the trial brewing sample and the plurality of within-tolerance probabilities into a method selection model and selects, based on an output value of the method selection model, the estimated value to be presented as the component value of the beverage under the trial brewing conditions; and
displays, on a display, at least one of the estimated values as the component value predicted for the beverage under the trial brewing conditions.