JP2022180119A - Data analysis system

Info

Publication number: JP2022180119A
Application number: JP2021087045A
Authority: JP
Inventors: 正晃加納; Masaaki Kano
Original assignee: JTEKT Corp
Current assignee: JTEKT Corp
Priority date: 2021-05-24
Filing date: 2021-05-24
Publication date: 2022-12-06

Abstract

PROBLEM TO BE SOLVED: To provide a data analysis system that exploits the high prediction accuracy of ArcFace while solving the problem that learning sometimes fails to progress with ArcFace, and that can thereby generate a machine learning model capable of high-precision prediction.

SOLUTION: A data analysis system 1 includes: a loss function calculation unit 25 that, when the angle θ between a feature vector x' and the weight vector W of the correct class is smaller than a predetermined value θ_Th, calculates a loss function value Loss_arc by applying ArcFace margin addition processing to the feature vector x' and the weight vector W, and, when the angle θ is equal to or greater than the predetermined value θ_Th, calculates a loss function value Loss_cos by applying CosFace margin addition processing to the feature vector x' and the weight vector W; and a learning processing unit 26 that trains a machine learning model 11 by a gradient method on the basis of the loss function values Loss_arc and Loss_cos.

SELECTED DRAWING: Figure 2

Description

The present invention relates to a data analysis system.

Patent Literature 1 describes a visual inspection apparatus that uses a discriminator built from a neural network to inspect appearance images for the presence or absence of defects. In general, a machine learning model that performs class classification, as in inspecting appearance images for defects, should have clear boundaries between classes.

Various methods are known for training a machine learning model that consists of a neural network and performs class classification. One approach trains on distances in the feature space, such as the Euclidean distance, the Manhattan distance, or the Chebyshev distance. In that case a loss function for distance learning, such as Center Loss, is used.

ArcFace, CosFace, SphereFace, and others are known as distance learning methods that use angles. These methods train by adding a margin, for the correct class, to the inner product of the feature vector and the weight vector of the correct class. They train so as to reduce the angle θ between the feature vector and the weight vector of the correct class, and they differ from one another in how the margin is applied. In general, ArcFace gives the best prediction accuracy, followed by CosFace and then SphereFace.

Patent Literature 1: Japanese Patent Application Laid-Open No. 2020-106461

The inventor found that, when ArcFace margin addition processing is applied as angle-based distance learning and training is performed by a gradient method, training may fail to progress if the initial cos distance (also called cos similarity) for the angle θ between the feature vector and the weight vector is small.

The present invention has been made in view of this problem. It aims to provide a data analysis system that exploits the high prediction accuracy of ArcFace, solves the problem that training fails to progress with ArcFace, and can generate a machine learning model capable of highly accurate prediction.

One aspect of the present invention is a data analysis system configured by a computer device comprising an arithmetic processing device and a storage device, wherein:
the storage device stores a machine learning model that is composed of a neural network, outputs a feature vector using the weight vector of the final fully connected layer, and performs class classification based on the cos distance between the feature vector of target data and the feature vector of class representative data; and
the arithmetic processing device comprises:
a machine learning model execution unit that, when learning data is input, outputs the feature vector by executing the machine learning model;
a weight vector acquisition unit that acquires the weight vector in effect when the machine learning model execution unit outputs the feature vector;
a loss function calculation unit that, when the angle θ between the feature vector and the weight vector of the correct class is smaller than a predetermined value, calculates a loss function value by applying ArcFace margin addition processing to the feature vector and the weight vector, and, when the angle θ is equal to or greater than the predetermined value, calculates a loss function value by applying CosFace margin addition processing to the feature vector and the weight vector; and
a learning processing unit that trains the machine learning model by a gradient method based on the loss function value.

Training of the machine learning model uses ArcFace and CosFace together as angle-based distance learning. Specifically, the loss function calculation unit applies ArcFace margin addition processing when the angle θ between the feature vector and the weight vector is smaller than a predetermined value, and applies CosFace margin addition processing when the angle θ is equal to or greater than the predetermined value.

CosFace margin addition processing is therefore applied in the region where training might not progress if only ArcFace margin addition processing were used. As a whole, no region remains in which ArcFace prevents training from progressing, and CosFace ensures that training advances.

Furthermore, as training progresses, the angle θ between the feature vector and the weight vector of the correct class decreases. If, early in training, the angle θ is greater than the predetermined value, CosFace margin addition processing is applied. As training then progresses, the angle θ decreases until it reaches the predetermined value, and training moves from the CosFace region into the ArcFace region. Continued training with ArcFace then yields a machine learning model capable of highly accurate prediction.

FIG. 1 is a functional block diagram of the data analysis system in the learning phase.
FIG. 2 is a diagram showing the detailed configuration of the data analysis system in the learning phase.
FIG. 3 is a graph of the relationship between the cos distance and the loss function value, in which the solid line shows a first combined pattern of ArcFace and CosFace and the dashed line shows the ArcFace-only pattern.
FIG. 4 is a graph of the relationship between the cos distance and the loss function value, in which the solid line shows a second combined pattern of ArcFace and CosFace and the dashed line shows the ArcFace-only pattern.

(1. Machine learning model)
The machine learning model is composed of a neural network. The term "neural network" here covers both the case of a single hidden layer and the case of multiple hidden layers; a network with multiple hidden layers is what is called a deep neural network. A convolutional neural network, for example, can of course be used as the neural network. Because the machine learning model is composed of a neural network, it outputs a feature vector corresponding to the input data.

In the machine learning model of this embodiment, each neuron performs processing using a weight, a bias, an activation function, and so on. The weights and biases are the main targets of training. The output of the final fully connected layer is the feature vector of the machine learning model, and that layer outputs the feature vector using its weight vector.

In this embodiment, the machine learning model is a model for classifying target data. For example, the class to which the target data belongs can be determined by judging whether the feature vector of the target data is close to the feature vector of the representative data of each class. In this embodiment, the machine learning model classifies the target data based on the cos distance between the feature vector of the target data and the feature vector of the class representative data.

For example, the machine learning model can be a model used to classify image data. In this case, the machine learning model classifies the target image data based on the cos distance between the feature vector of the target image data and the feature vector of the class representative image data.

In particular, a model used for visual inspection of industrial products is taken as an example. In this case, image data of the appearance of an industrial product is input, and the product is judged to be non-defective or defective. Specifically, the feature vector output when the target appearance image data is input to the machine learning model is acquired. The feature vector output when representative image data of a non-defective product is input is also acquired, as is the feature vector output when representative image data of a defective product is input.

Whether the target appearance image data belongs to the non-defective class is then determined from the cos distance between the feature vector of the target appearance image data and the feature vector of the non-defective representative image data. Likewise, whether it belongs to the defective class is determined from the cos distance between the feature vector of the target appearance image data and the feature vector of the defective representative image data. In this way, an industrial product can be judged non-defective or defective based on the cos distance to the feature vector of each class's representative image data.

When there are multiple types of defect, the cos distances between the feature vector of the target appearance image data and the feature vectors of the representative image data of each defect type can also be used to determine which defect class the target appearance image data belongs to. The industrial products concerned can be various parts, such as vehicle parts, industrial machine parts, and consumer equipment parts.
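The classification rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the feature vectors are made-up two-dimensional values rather than outputs of the machine learning model, and the names `cos_similarity`, `classify`, and the class labels `"good"`, `"scratch"`, and `"dent"` are hypothetical.

```python
import math

def cos_similarity(u, v):
    """Cosine of the angle between two non-zero vectors (the 'cos distance' of the text)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def classify(target_vec, representatives):
    """Return the label whose representative feature vector has the largest cos distance."""
    return max(
        representatives,
        key=lambda label: cos_similarity(target_vec, representatives[label]),
    )

# Hypothetical feature vectors of the class representative images.
representatives = {
    "good": [1.0, 0.1],
    "scratch": [0.0, 1.0],
    "dent": [-1.0, 0.2],
}

label = classify([0.9, 0.3], representatives)
print(label)
```

The same function covers the two-class case (non-defective versus defective) and the case with several defect classes, since it only compares the target against every stored representative.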

(2. Outline of the learning phase of the data analysis system 1)
An overview of the configuration of the data analysis system 1 in the learning phase is described with reference to FIG. 1. The data analysis system 1 is configured by a computer device comprising a storage device 2 and an arithmetic processing device 3.

The storage device 2 stores the machine learning model 11 described above. It also stores learning data 12 used in the learning phase of the machine learning model. The learning data 12 is, for example, appearance image data of industrial products and includes image data of non-defective products and image data of defective products. The learning data 12 further includes information on whether each image shows a non-defective or defective product, that is, a correct label 12a.

The storage device 2 further stores a loss function calculation program 13. The loss function calculation program 13 includes a program that calculates the loss function value by applying ArcFace margin addition processing and a program that calculates the loss function value by applying CosFace margin addition processing.

The arithmetic processing device 3 includes a machine learning model execution unit 21, a feature vector normalization processing unit 22, a weight vector acquisition unit 23, a weight vector normalization processing unit 24, a loss function calculation unit 25, and a learning processing unit 26.

When image data is input to the machine learning model 11, the machine learning model execution unit 21 outputs a feature vector for that image data. In the learning phase, the machine learning model execution unit 21 takes as input the appearance-image learning data 12 stored in the storage device 2 and outputs a feature vector for the input learning data 12. The feature vector normalization processing unit 22 applies L2 normalization to the feature vector output by the machine learning model execution unit 21.

The weight vector acquisition unit 23 acquires the weight vector of the final fully connected layer in effect when the machine learning model execution unit 21 outputs the feature vector. The weight vector normalization processing unit 24 applies L2 normalization to the weight vector acquired by the weight vector acquisition unit 23.

By executing the loss function calculation program 13 stored in the storage device 2, the loss function calculation unit 25 calculates the loss function value Loss with ArcFace margin addition processing applied. By executing the same program, it also calculates the loss function value Loss with CosFace margin addition processing applied. The two kinds of margin addition processing are applied as outlined below.

When the angle θ between the feature vector and the weight vector of the correct class is smaller than a predetermined value, the loss function calculation unit 25 calculates the loss function value Loss by applying ArcFace margin addition processing to the feature vector and the weight vector. When the angle θ is equal to or greater than the predetermined value, it calculates the loss function value Loss by applying CosFace margin addition processing to the feature vector and the weight vector. The calculation of the loss function value Loss uses the correct label 12a included in the learning data 12 stored in the storage device 2.

The learning processing unit 26 trains the machine learning model 11 by a gradient method based on the loss function value Loss calculated by the loss function calculation unit 25. It learns the weights and biases of the machine learning model 11 stored in the storage device 2. In this embodiment, the learning processing unit 26 applies gradient descent so that the loss function value Loss reaches a global minimum or a local minimum.

(3. Detailed configuration of the learning phase of the data analysis system 1)
The detailed configuration of the learning phase of the data analysis system 1 is described with reference to FIG. 2. The machine learning model execution unit 21 takes as input the image data included in the learning data 12 and executes the machine learning model, and thereby outputs a feature vector. With x denoting the image data included in the input learning data 12 and x' the output feature vector, the relationship is expressed by equation (1), x' = f(x), where f() is the function representing the machine learning model 11.

The feature vector normalization processing unit 22 applies L2 normalization to the feature vector x' and outputs the normalized feature vector x''. The feature vector x' and the normalized feature vector x'' are related by equation (2), x'' = x' / ||x'||. The normalized feature vector x'' is the feature vector x' rescaled to length 1.

The weight vector acquisition unit 23 acquires the weight vector W of the final fully connected layer of the machine learning model 11 executed by the machine learning model execution unit 21. The weight vector normalization processing unit 24 applies L2 normalization to the weight vector W and outputs the normalized weight vector W'. The weight vector W and the normalized weight vector W' are related by equation (3), W' = W / ||W||. The normalized weight vector W' is the weight vector W rescaled to length 1.
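As a small illustration of the L2 normalization applied to both the feature vector and the weight vector, the following sketch rescales a vector to length 1. The function name `l2_normalize` and the sample values are assumptions made for this example; the zero-vector check is a defensive choice of the sketch, not something the text specifies.

```python
import math

def l2_normalize(vec):
    """Return vec divided by its Euclidean (L2) norm, so the result has length 1."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]

x_prime = [3.0, 4.0]           # hypothetical feature vector x'
W = [0.0, 2.0]                 # hypothetical weight vector W of one class

x_norm = l2_normalize(x_prime)  # plays the role of x''
W_norm = l2_normalize(W)        # plays the role of W'
print(x_norm, W_norm)
```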

The loss function calculation unit 25 includes an ArcFace application unit 30 and a CosFace application unit 40. The ArcFace application unit 30 includes an inner product calculation unit 31, an ArcFace calculation unit 32, a logit calculation unit 33, a softmax function calculation unit 34, and a loss calculation unit 35. The CosFace application unit 40 includes an inner product calculation unit 41, a CosFace calculation unit 42, a logit calculation unit 43, a softmax function calculation unit 44, and a loss calculation unit 45. Because the inner product calculation unit 31 of the ArcFace application unit 30 and the inner product calculation unit 41 of the CosFace application unit 40 perform the same processing, they may execute a common program.

The inner product calculation unit 31 of the ArcFace application unit 30 acquires the normalized feature vector x'' and the normalized weight vector W'. As shown in equation (4), it then calculates the cos distance cos θ = W' · x'', the inner product of the normalized feature vector x'' and the normalized weight vector W'. The inner product calculation unit 41 of the CosFace application unit 40 does the same.
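Because both vectors have length 1, their inner product is the cosine of the angle between them. The sketch below shows that step and recovers θ with `math.acos`. The names `cos_distance` and `angle` are hypothetical, and the clamping to [-1, 1] is an addition of this sketch to guard against floating-point rounding.

```python
import math

def cos_distance(x_norm, w_norm):
    """Inner product of two unit vectors, i.e. cos(theta)."""
    dot = sum(a * b for a, b in zip(x_norm, w_norm))
    # Clamp so that rounding error cannot push the value outside acos's domain.
    return max(-1.0, min(1.0, dot))

def angle(x_norm, w_norm):
    """Angle theta in radians between two unit vectors."""
    return math.acos(cos_distance(x_norm, w_norm))

x_norm = [0.6, 0.8]   # already L2-normalized feature vector (made-up values)
w_norm = [1.0, 0.0]   # already L2-normalized weight vector (made-up values)

print(cos_distance(x_norm, w_norm), angle(x_norm, w_norm))
```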

In accordance with equation (5), when the angle θ between the normalized feature vector x'' and the normalized weight vector W' of the correct class j is smaller than the predetermined value θ_Th, the ArcFace calculation unit 32 calculates the angle θ' by ArcFace from the angle θ between x'' and W'. For the correct class j, the ArcFace calculation unit 32 performs the process of adding the ArcFace margin m_a (margin addition processing). For a class other than the correct class j, it does not add the margin m_a and uses the angle θ between x'' and W' unchanged.

Here, the predetermined value θ_Th is given by equation (6).

When the angle θ is smaller than the predetermined value θ_Th, the ArcFace calculation unit 32 uses the angle θ'_j calculated by equation (5) to calculate the cos distance (cos θ'_j) as expressed in equation (7). That is, for the correct class j, the ArcFace calculation unit 32 calculates the cos distance (cos θ'_j) with the margin m_a added. For a class other than the correct class j, it does not add the margin m_a and uses the cos distance (cos θ'_j) calculated by the inner product calculation unit 31 as is.

In accordance with equation (8), the CosFace calculation unit 42 performs CosFace processing when the angle θ between the normalized feature vector x'' and the normalized weight vector W' of the correct class j is equal to or greater than the predetermined value θ_Th. For the correct class j, the CosFace calculation unit 42 performs the process of adding the CosFace margin m_c (margin addition processing) and calculates the cos distance (cos θ'_j) with the margin m_c added. For a class other than the correct class j, it does not add the margin m_c and uses the cos distance (cos θ'_j) calculated by the inner product calculation unit 31 as is.

Here, the CosFace margin m_c is given by equation (9).
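The switching between the two margin types can be sketched as a single function. This sketch makes several assumptions: the ArcFace margin is taken in its usual form cos(θ + m_a) and the CosFace margin in its usual form cos θ − m_c; equations (6) and (9), which define θ_Th and m_c, are not reproduced in this text, so both are passed in as plain parameters; and the numeric values below are arbitrary placeholders, not values from the patent.

```python
import math

def margin_cos(theta, is_correct_class, m_a, m_c, theta_th):
    """Margin-adjusted cos distance for one class.

    theta            -- angle between normalized feature and weight vectors (radians)
    is_correct_class -- True only for the correct class j
    m_a, m_c         -- ArcFace (angular) and CosFace (cosine) margins
    theta_th         -- threshold angle at which the margin type switches
    """
    if not is_correct_class:
        return math.cos(theta)          # no margin for the other classes
    if theta < theta_th:
        return math.cos(theta + m_a)    # ArcFace: margin added to the angle
    return math.cos(theta) - m_c        # CosFace: margin subtracted from the cosine

# Placeholder parameters chosen only for illustration.
M_A, M_C, THETA_TH = 0.5, 0.35, 2.0

print(margin_cos(1.0, True, M_A, M_C, THETA_TH))   # ArcFace branch
print(margin_cos(2.5, True, M_A, M_C, THETA_TH))   # CosFace branch
print(margin_cos(2.5, False, M_A, M_C, THETA_TH))  # other class, no margin
```

In both branches the adjusted value for the correct class is smaller than the plain cos θ, which is what makes the training target stricter than the raw similarity.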

In accordance with equation (10), the logit calculation unit 33 of the ArcFace application unit 30 calculates the logit by multiplying the cos distance (cos θ'_j) calculated by the ArcFace calculation unit 32 by a scale parameter s. The logit calculation unit 43 of the CosFace application unit 40 performs the same processing: in accordance with equation (10), it calculates the logit by multiplying the cos distance (cos θ'_j) calculated by the CosFace calculation unit 42 by the scale parameter s.

In accordance with equation (11), the softmax function calculation unit 34 of the ArcFace application unit 30 transforms the logit calculated by the logit calculation unit 33 with the softmax function. Similarly, the softmax function calculation unit 44 of the CosFace application unit 40 transforms the logit calculated by the logit calculation unit 43 with the softmax function in accordance with equation (11).
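The logit and softmax steps can be illustrated as follows. This is a minimal sketch: the function names are hypothetical, the scale value `s = 30.0` is an arbitrary placeholder, and subtracting the largest logit before exponentiating is a standard numerical-stability measure added here rather than something stated in the text.

```python
import math

def logits_from_cos(cos_values, s):
    """Multiply each margin-adjusted cos distance by the scale parameter s."""
    return [s * c for c in cos_values]

def softmax(logits):
    """Convert logits to probabilities that sum to 1."""
    peak = max(logits)                          # shift for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up cos distances for three classes; index 0 is the correct class
# and is assumed to already include its margin.
cos_values = [0.41, 0.10, -0.30]
probs = softmax(logits_from_cos(cos_values, s=30.0))
print(probs)
```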

In accordance with equation (12), the loss calculation unit 35 of the ArcFace application unit 30 calculates the ArcFace loss function value Loss_arc. The loss calculation unit 35 applies cross entropy as the loss function to the logit calculated by the logit calculation unit 33 and thereby obtains the cross-entropy error as the loss function value Loss_arc. The correct label 12a in the learning data 12 is used in calculating the cross-entropy error.

In other words, by adding the margin m_a, the ArcFace loss function of equation (12) in effect underestimates the cos distance between the weight vector W_yi of the correct class j and the feature vector x'_i, and overestimates the cos distance between the weight vector W of an incorrect class and the feature vector x'. Because the loss function value Loss_arc is assigned so that the feature vector moves closer to the weight vector W_yi of the correct class j than to the weight vectors W of the other classes, it has the effect of pulling the weight vector of the correct class and the weight vectors of the incorrect classes apart.

In accordance with equation (13), the loss calculation unit 45 of the CosFace application unit 40 calculates the CosFace loss function value Loss_cos. The loss calculation unit 45 applies cross entropy as the loss function to the logit calculated by the logit calculation unit 43 and thereby obtains the cross-entropy error as the loss function value Loss_cos. The correct label 12a in the learning data 12 is used in calculating the cross-entropy error.

The CosFace loss function of equation (13), by adding the margin m_c, works in basically the same way as ArcFace. That is, it in effect underestimates the cos distance between the weight vector W_yi of the correct class j and the feature vector x'_i, and overestimates the cos distance between the weight vector W of an incorrect class and the feature vector x'. Because the loss function value Loss_cos is assigned so that the feature vector moves closer to the weight vector W_yi of the correct class j than to the weight vectors W of the other classes, it has the effect of pulling the weight vector of the correct class and the weight vectors of the incorrect classes apart.
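To make the effect of the margin on the cross-entropy error concrete, the sketch below compares the loss for the same sample with and without an ArcFace-style margin. It assumes the standard cross-entropy over softmax probabilities, computed directly from the logits; the function name and all numeric values (`s`, `m_a`, the cos distances) are illustrative assumptions, not values from the patent.

```python
import math

def cross_entropy_from_logits(logits, correct_index):
    """Cross-entropy error: -log(softmax(logits)[correct_index])."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[correct_index]

s = 10.0             # placeholder scale parameter
m_a = 0.5            # placeholder ArcFace margin (radians)
cos_correct = 0.8    # made-up cos distance to the correct class
cos_other = 0.1      # made-up cos distance to one incorrect class

# Loss without any margin.
plain = cross_entropy_from_logits([s * cos_correct, s * cos_other], 0)

# Loss with the margin added to the angle of the correct class only.
theta = math.acos(cos_correct)
with_margin = cross_entropy_from_logits(
    [s * math.cos(theta + m_a), s * cos_other], 0
)

print(plain, with_margin)
```

The margin makes the correct class look less similar than it really is, so the loss stays larger and the gradient keeps pulling the feature vector toward the correct class's weight vector even after the unadjusted prediction is already right.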

When the angle θ between the normalized feature vector x'' and the normalized weight vector W' of the correct class j is smaller than the predetermined value θ_Th, the learning processing unit 26 trains the machine learning model 11 by the gradient method using the loss function value Loss_arc (equation (12)) calculated by the loss calculation unit 35 of the ArcFace application unit 30.

When the angle θ between the normalized feature vector x'' and the normalized weight vector W' of the correct class j is equal to or greater than the predetermined value θ_Th, the learning processing unit 26 trains the machine learning model 11 by the gradient method using the loss function value Loss_cos (equation (13)) calculated by the loss calculation unit 45 of the CosFace application unit 40.

(4. Cos distance and loss function value when the first margin is applied)
The relationship between the cos distance between the normalized feature vector x″ and the weight vector W′ of the correct class, and the loss function value Loss, when the first margin is applied will be described with reference to FIG. 3. In FIG. 3, the solid line shows this embodiment, in which ArcFace and CosFace are used together in training the machine learning model 11, and the broken line shows a comparative example in which only ArcFace is applied.

In the comparative example in which only ArcFace is applied, the loss function value as a function of cos distance has an extreme value (a local maximum in this embodiment) at cos θ_E, which lies in the region where the cos distance is small. If, at the initial stage of learning, the cos distance is larger than cos θ_E, training by the gradient method moves the cos distance toward larger values. Learning therefore proceeds in the direction of increasing the cos distance and approaches the ideal state. If, on the other hand, the cos distance at the initial stage of learning is smaller than cos θ_E, training by the gradient method moves the cos distance toward smaller values. In that case learning does not proceed in the direction of increasing the cos distance.

As shown in Equation (7), the margin addition processing by ArcFace adds an angle margin m_a to the formed angle θ. Therefore, when the formed angle θ is large (near 180°) and θ + m_a exceeds 180°, the direction in which the converted cos distance changes may be reversed. This is why the margin addition processing by ArcFace exhibits the relationship described above.
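The reversal can be made explicit by differentiating the margin-adjusted cosine with respect to θ. The following short derivation assumes 0 < m_a < π and treats θ in radians; it is offered as an illustration and is not taken from the equations of the source.

```latex
\frac{d}{d\theta}\cos(\theta + m_a) = -\sin(\theta + m_a)
\quad
\begin{cases}
< 0, & 0 \le \theta < \pi - m_a \\
= 0, & \theta = \pi - m_a \\
> 0, & \pi - m_a < \theta \le \pi
\end{cases}
```

Under these assumptions the adjusted cosine stops decreasing at θ = π − m_a and increases beyond it, which would correspond to the angle θ_E at which the ArcFace-only loss takes its extreme value when the logits of the other classes are held fixed.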

When this embodiment is applied, as shown in FIG. 3, there is a region to which CosFace is applied and a region to which ArcFace is applied. As shown in Equation (8), the margin addition processing by CosFace subtracts the margin m_c from cos θ. The converted cos distance therefore changes in the same direction as cos θ, and no reversal occurs as it does with ArcFace.

The boundary between the region to which ArcFace is applied and the region to which CosFace is applied is the cos distance value cos θ_Th, which corresponds to the formed angle θ being equal to the predetermined value θ_Th. The margin addition processing by ArcFace and the margin addition processing by CosFace are set so that, on both sides of the point where the formed angle θ equals θ_Th, the loss function value Loss is monotonically decreasing or monotonically increasing with respect to the cos distance. In this embodiment, they are set so that the loss function value Loss decreases monotonically as the cos distance increases.

In particular, the predetermined value θ_Th is set so that the corresponding cos distance value cos θ_Th is larger than cos θ_E, the cos distance at which the loss function value Loss_arc takes its extreme value in the margin addition processing by ArcFace.

In other words, this embodiment has no extreme value of the kind that appears when only ArcFace is applied. Therefore, wherever the cos distance lies at the initial stage of learning, training by the gradient method moves the cos distance toward larger values. Learning thus proceeds in the direction of increasing the cos distance and approaches the ideal state.
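The absence of an extreme value can be checked numerically on the margin-adjusted target logit. The sketch below assumes the logits of the other classes are held fixed, so that a target logit that never increases with θ corresponds to a loss that never increases with cos distance. The helper names and the values of `m_a` and `theta_th` are arbitrary illustrations, and `m_c` is chosen here simply so that the two branches meet at the boundary.

```python
import math

def arcface_logit(theta, m_a):
    # ArcFace only: the angle margin is added for every theta.
    return math.cos(theta + m_a)

def hybrid_logit(theta, theta_th, m_a, m_c):
    # Combined use: ArcFace below theta_th, CosFace at or above it.
    if theta < theta_th:
        return math.cos(theta + m_a)
    return math.cos(theta) - m_c

def is_non_increasing(values, tol=1e-12):
    # True when no value exceeds its predecessor by more than tol.
    return all(b <= a + tol for a, b in zip(values, values[1:]))

m_a = 0.5                   # illustrative angle margin in radians
theta_th = math.pi / 2      # illustrative threshold, below pi - m_a
m_c = math.cos(theta_th) - math.cos(theta_th + m_a)  # branches meet at theta_th

thetas = [math.pi * k / 1000 for k in range(1001)]   # grid over [0, pi]
arc_only = [arcface_logit(t, m_a) for t in thetas]
hybrid = [hybrid_logit(t, theta_th, m_a, m_c) for t in thetas]

print("ArcFace only is monotone:", is_non_increasing(arc_only))
print("Combined use is monotone:", is_non_increasing(hybrid))
```

With these illustrative values the ArcFace-only logit turns upward once θ + m_a passes 180°, while the combined logit keeps decreasing over the whole range of θ.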

Furthermore, the margin addition processing by ArcFace and the margin addition processing by CosFace are set so that the loss function value Loss is continuous with respect to the cos distance at the point where the formed angle θ equals the predetermined value θ_Th. In this case, training proceeds continuously across the boundary, so learning is stable. It is even better if the loss function value Loss is set to be smoothly continuous with respect to the cos distance at the point where the formed angle θ equals θ_Th. That is, at θ = θ_Th, the partial derivatives of the loss function value Loss with respect to the cos distance on the ArcFace side and on the CosFace side coincide.

The margin addition processing by ArcFace and the margin addition processing by CosFace can be set as described above by, for example, choosing the predetermined value θ_Th and the CosFace margin m_c appropriately.
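One possible choice of θ_Th and m_c that gives a smoothly continuous junction is sketched below. It is a derivation under stated assumptions, not a setting given in the source: the target logit is cos(θ + m_a) on the ArcFace side and cos θ − m_c on the CosFace side, and the logits of the other classes are held fixed. The slope of the ArcFace logit with respect to cos θ is sin(θ + m_a) / sin θ and that of the CosFace logit is 1, so the slopes match at θ_Th = (π − m_a) / 2, and matching the values there gives m_c = 2 sin(m_a / 2). The function names are hypothetical.

```python
import math

def smooth_join_params(m_a):
    """Return (theta_th, m_c) for which both logit branches match in value and slope."""
    theta_th = (math.pi - m_a) / 2.0   # where sin(theta + m_a) == sin(theta)
    m_c = 2.0 * math.sin(m_a / 2.0)    # equals cos(theta_th) - cos(theta_th + m_a)
    return theta_th, m_c

def arc_logit(c, m_a):
    # ArcFace target logit expressed as a function of the cos distance c.
    return math.cos(math.acos(c) + m_a)

def cos_logit(c, m_c):
    # CosFace target logit expressed as a function of the cos distance c.
    return c - m_c

m_a = 0.5
theta_th, m_c = smooth_join_params(m_a)
c0 = math.cos(theta_th)
print("value gap at boundary:", arc_logit(c0, m_a) - cos_logit(c0, m_c))
```

Because (π − m_a) / 2 is smaller than π − m_a for any 0 < m_a < π, this choice also places the boundary on the side of the ArcFace turning point where the cos distance is larger.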

(5. Cos distance and loss function value when the second margin is applied)
The relationship between the cos distance between the normalized feature vector x″ and the weight vector W′ of the correct class, and the loss function value Loss, when the second margin is applied will be described with reference to FIG. 4. As in FIG. 3, the solid line in FIG. 4 shows this embodiment, in which ArcFace and CosFace are used together in training the machine learning model 11, and the broken line shows a comparative example in which only ArcFace is applied.

When the second margin shown in FIG. 4 is applied, in contrast to FIG. 3, the margin addition processing by ArcFace and the margin addition processing by CosFace are set so that the loss function value Loss is discontinuous with respect to the cos distance at the point where the formed angle θ equals the predetermined value θ_Th. As before, however, the boundary between the region to which ArcFace is applied and the region to which CosFace is applied is the cos distance value cos θ_Th corresponding to θ = θ_Th, and the two margin addition processes are set so that, on both sides of that point, the loss function value Loss is monotonically decreasing or monotonically increasing with respect to the cos distance.

In this case as well, this embodiment has no extreme value of the kind that appears when only ArcFace is applied. Therefore, wherever the cos distance lies at the initial stage of learning, training by the gradient method moves the cos distance toward larger values. Learning thus proceeds in the direction of increasing the cos distance and approaches the ideal state.

The margin addition processing by ArcFace and the margin addition processing by CosFace can be set as described above by, for example, choosing the predetermined value θ_Th and the CosFace margin m_c appropriately.
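A condition on m_c that keeps the loss monotone across a jump at θ_Th can be written down under the same assumptions as before: the target logit is cos(θ + m_a) on the ArcFace side and cos θ − m_c on the CosFace side, θ_Th is smaller than π − m_a, and the logits of the other classes are held fixed. This is an illustrative derivation, not a condition stated in the source.

```latex
\cos\theta_{Th} - m_c \;\le\; \cos(\theta_{Th} + m_a)
\quad\Longleftrightarrow\quad
m_c \;\ge\; \cos\theta_{Th} - \cos(\theta_{Th} + m_a)
```

Equality gives the continuous case, and a strictly larger m_c gives a jump at cos θ_Th in which the loss on the CosFace side stays above the loss on the ArcFace side, so the loss still decreases as the cos distance increases.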

(6. Effects)
As described above, the machine learning model 11 is trained using both ArcFace and CosFace as angle-based distance learning. Specifically, when the angle θ between the feature vector x′ and the weight vector W is smaller than the predetermined value θ_Th, the loss function calculation unit 25 applies margin addition processing by ArcFace as shown in Equations (5), (7), and (12). When the formed angle θ is equal to or greater than the predetermined value θ_Th, the loss function calculation unit 25 applies margin addition processing by CosFace as shown in Equations (8) and (13).

Furthermore, as learning progresses, the angle θ between the feature vector x′ and the weight vector W of the correct class becomes smaller. At the initial stage of learning, when the angle θ between the feature vector x′ and the weight vector W of the correct class is larger than the predetermined value θ_Th (that is, when the cos distance is smaller than cos θ_Th), margin addition processing by CosFace is applied. As learning then progresses, the formed angle θ decreases, reaches the predetermined value θ_Th, and the processing moves from the CosFace region to the ArcFace region. Learning with ArcFace then proceeds, producing a machine learning model 11 capable of highly accurate prediction.

REFERENCE SIGNS LIST
1 data analysis system
2 storage device
3 arithmetic processing device
21 machine learning model execution unit
23 weight vector acquisition unit
25 loss function calculation unit
26 learning processing unit

Claims

1. A data analysis system configured by a computer device comprising an arithmetic processing device and a storage device, wherein
the storage device stores a machine learning model that is composed of a neural network, outputs a feature vector, and performs class classification using a weight vector of a fully connected layer at the final stage, and
the arithmetic processing device comprises:
a machine learning model execution unit that outputs the feature vector by executing the machine learning model when learning data is input;
a weight vector acquisition unit that acquires the weight vector when the machine learning model execution unit outputs the feature vector;
a loss function calculation unit that, when the angle θ formed by the feature vector and the weight vector of the correct class is smaller than a predetermined value, calculates a value of a loss function by applying margin addition processing by ArcFace using the feature vector and the weight vector, and, when the formed angle θ is equal to or greater than the predetermined value, calculates a value of the loss function by applying margin addition processing by CosFace using the feature vector and the weight vector; and
a learning processing unit that trains the machine learning model by a gradient method based on the value of the loss function.

2. The data analysis system according to claim 1, wherein the margin addition processing by ArcFace and the margin addition processing by CosFace are set so that, on both sides of the point where the formed angle θ equals the predetermined value, the value of the loss function is monotonically decreasing or monotonically increasing with respect to the cos distance between the feature vector and the weight vector of the correct class.

3. The data analysis system according to claim 2, wherein the margin addition processing by ArcFace and the margin addition processing by CosFace are set so that, at the point where the formed angle θ equals the predetermined value, the value of the loss function is continuous with respect to the cos distance between the feature vector and the weight vector of the correct class.

4. The data analysis system according to claim 2, wherein the margin addition processing by ArcFace and the margin addition processing by CosFace are set so that, at the point where the formed angle θ equals the predetermined value, the value of the loss function is discontinuous with respect to the cos distance between the feature vector and the weight vector of the correct class.

5. The data analysis system according to any one of claims 1 to 4, wherein the predetermined value is set so that the value of the cos distance corresponding to the predetermined value is greater than the value of the cos distance at which, in the margin addition processing by ArcFace, the value of the loss function with respect to the cos distance between the feature vector and the weight vector of the correct class takes an extreme value.

6. The data analysis system according to any one of claims 1 to 5, wherein the machine learning model is a model for performing class classification based on a cos distance between the feature vector relating to target image data and the feature vector relating to representative image data of a class.

7. The data analysis system according to claim 6, wherein the machine learning model is a model used for appearance inspection of an industrial product, and is a model for determining whether the industrial product is a non-defective product or a defective product based on input image data of the appearance of the industrial product.

8. The data analysis system according to claim 7, wherein the machine learning model determines whether the industrial product is the non-defective product or the defective product based on the cos distance between the feature vector relating to the input image data and the feature vector relating to representative image data of the non-defective product, and the cos distance between the feature vector relating to the input image data and the feature vector relating to representative image data of the defective product.