KR20200121206A

KR20200121206A - Teacher-student framework for light weighted ensemble classifier combined with deep network and random forest and the classification method based on thereof

Info

Publication number: KR20200121206A
Application number: KR1020190043980A
Authority: KR
Inventors: 고병철; 허두영
Original assignee: 계명대학교 산학협력단
Priority date: 2019-04-15
Filing date: 2019-04-15
Publication date: 2020-10-23
Anticipated expiration: 2039-04-15
Also published as: KR102224253B1

Abstract

The present invention relates to a teacher-student framework for making an ensemble classifier that combines a deep network and a random forest lightweight, and to a classification method based on the framework. The teacher-student framework comprises: a teacher training module (100) that trains a teacher model, formed of a teacher deep network and a teacher random forest, using data set A; a soft target data generation module (200) that inputs data set B into the teacher deep network and teacher random forest trained by the teacher training module (100) and combines the two outputs to generate soft target data set B′; a student training module (300) that trains a student model, formed of a student network and a student random forest, using the data set B′ generated by the soft target data generation module (200); and a classification module (400) that performs classification by combining the two outputs of the student network and student random forest trained by the student training module (300).

Description

TEACHER-STUDENT FRAMEWORK FOR LIGHT WEIGHTED ENSEMBLE CLASSIFIER COMBINED WITH DEEP NETWORK AND RANDOM FOREST AND THE CLASSIFICATION METHOD BASED ON THEREOF

The present invention relates to a teacher-student framework and a classification method based on it, and more specifically to a teacher-student framework for making an ensemble classifier that combines a deep network and a random forest lightweight, and a classification method based on that framework.

A neural network is a structure that computers adopt to solve problems in a way similar to how humans process problems with the brain. When neurons, as mathematical models, are interconnected to form a network, the result is called a neural network or an artificial neural network.

Because each neuron acts as an independently operating processor, a neural network has excellent parallelism. Because information is distributed across many connections, a problem in a few neurons does not greatly affect the whole system, which gives the network fault tolerance. A neural network can also learn from a given environment. Owing to these characteristics, neural networks are used to solve problems in artificial intelligence and in fields such as character recognition, speech recognition, classification, diagnosis, and prediction.

Among the various deep neural networks, the convolutional neural network (CNN) is highly accurate and is widely used in fields such as pedestrian detection, computer-vision-based human pose orientation estimation (POE), and speech recognition. Depending on the application, a CNN requires a large number of data sets for training and testing. In addition, because it is computationally expensive, it requires large-scale, high-end computing hardware compared with conventional classifiers. Therefore, in fields such as POE that need fast and accurate classification, a method is needed that reduces processing time and memory usage while preserving high performance.

Meanwhile, a random forest in machine learning is an ensemble learning method used for classification, regression analysis, and the like. It operates by outputting a class (classification) or a mean prediction (regression) from the many decision trees built during training. A random forest is an ensemble method that trains several decision trees with randomization. The method consists broadly of a training stage, which builds the decision trees, and a test stage, which classifies or predicts when an input vector arrives. Random forests are used in applications such as detection, classification, and regression.
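The aggregation step described above can be sketched in a few lines. This is a minimal illustration, assuming the per-tree outputs are already available; the function names and the sample values are hypothetical and do not come from the patent.

```python
from collections import Counter
from statistics import mean

def forest_classify(tree_votes):
    """Return the class predicted by the most trees (ties go to the first seen)."""
    return Counter(tree_votes).most_common(1)[0][0]

def forest_regress(tree_predictions):
    """Return the average of the per-tree predictions."""
    return mean(tree_predictions)

# Hypothetical per-tree outputs for one input vector
votes = ["front", "left", "front", "front", "back"]
label = forest_classify(votes)               # classification: majority class
estimate = forest_regress([2.0, 2.5, 3.0])   # regression: mean prediction
```

Training the individual trees is omitted here; only the combination of their outputs is shown.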

The parameters with the greatest influence on a random forest include the forest size (number of trees) and the maximum allowed depth. The forest size determines how many trees make up the forest. If the forest is small, that is, if it has few trees, building and testing the trees takes little time, but generalization is poor and the forest is more likely to give a wrong result for an arbitrary input data point. Conversely, a large forest with many trees delivers high performance, but training and testing take longer and memory usage grows. There is therefore a need for an improved random forest method that reduces processing time and memory usage while preserving high performance.
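The accuracy side of this trade-off can be illustrated with an idealized calculation. The sketch below assumes, which the source does not state, a two-class problem in which every tree votes independently and is correct with the same probability p; `majority_vote_accuracy` is a hypothetical helper name.

```python
from math import comb

def majority_vote_accuracy(n_trees: int, p: float) -> float:
    """Probability that a strict majority of n_trees independent voters is correct.

    Idealized assumption: two classes, each tree correct with probability p,
    and votes independent. Trees in a real forest are correlated, so this is
    an illustration only, not a performance prediction.
    """
    if n_trees % 2 == 0:
        raise ValueError("use an odd number of trees to avoid ties")
    need = n_trees // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(n_trees, k) * p**k * (1 - p)**(n_trees - k)
               for k in range(need, n_trees + 1))

# Accuracy of the vote for growing forest sizes, with p = 0.6 per tree
for n in (1, 5, 25, 101):
    print(n, round(majority_vote_accuracy(n, 0.6), 4))
```

Under these assumptions the vote becomes more reliable as trees are added, while the cost of evaluating the forest grows linearly with the number of trees.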

Prior art related to the present invention includes Korean Registered Patent No. 10-1901307 (title: Class classification method, apparatus, and computer-readable recording medium using a deep neural network based on weighted fuzzy membership functions) and Korean Patent Publication No. 10-2018-0046122 (title: Method and apparatus for generating a toxicity prediction model of oxide nanomaterials using a random forest technique).

The present invention is proposed to solve the above problems of previously proposed methods. Its object is to provide a teacher-student framework for making an ensemble classifier that combines a deep network and a random forest lightweight, and a classification method based on that framework. A new ensemble classifier is developed by combining a deep network and a random forest, and a student model is trained on the soft target data set B^*, which is the output of the teacher model. Through the teacher-student framework, consisting of the teacher model and the student model, the ensemble classifier becomes lighter while producing more flexible classification results. In addition, by training the teacher model using data set A, which includes class labels, and data set B, which does not, overfitting of the teacher model can be prevented.

To achieve the above object, a teacher-student framework for making an ensemble classifier that combines a deep network and a random forest lightweight, according to a feature of the present invention, comprises:

a teacher learning module that trains a teacher model, consisting of a teacher deep network and a teacher random forest, using data set A;

a soft target data generation module that inputs data set B into the teacher deep network and teacher random forest trained by the teacher learning module and combines the two outputs to generate a soft target data set B^*;

a student learning module that trains a student model, consisting of a student network and a student random forest, using the data set B^* generated by the soft target data generation module; and

a classification module that performs classification by combining the two outputs of the student network and student random forest trained by the student learning module.

Preferably, data set A may be a hard target data set that includes class labels.

Preferably, data set B may be a data set that does not include class labels.

Preferably, the teacher learning module may include:

a first teacher learning unit that trains the teacher deep network using data set A; and

a second teacher learning unit that trains the teacher random forest using a feature map of the teacher deep network.

Preferably, the teacher deep network may apply a softened softmax function to obtain a soft target output, that is, a probability value for each class.
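The text does not give the formula for the softened softmax. A minimal sketch, assuming the temperature-scaled form commonly used in knowledge distillation, p_j = exp(z_j / T) / sum_k exp(z_k / T), is shown below; the `temperature` parameter and the sample logits are assumptions made for illustration.

```python
import math

def softened_softmax(logits, temperature=1.0):
    """Softmax over logits / temperature; temperature > 1 flattens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical network outputs (logits) for three classes
logits = [4.0, 1.0, 0.5]
hard_like = softened_softmax(logits, temperature=1.0)  # sharply peaked
soft = softened_softmax(logits, temperature=4.0)       # softer class probabilities
```

With a higher temperature the ranking of the classes is unchanged, but the smaller probabilities become larger, so the output carries more information about how the teacher relates the classes.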

Preferably, the framework may further include a preprocessing module that preprocesses an input image by applying a wavelet transform.
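The text does not say which wavelet is used. As a hedged illustration, the sketch below applies one level of a 2D Haar transform, chosen here only because it is the simplest wavelet, to a small grayscale image stored as a list of rows with even dimensions. All function names are hypothetical.

```python
def haar_1d(row):
    """One level of the Haar transform on an even-length sequence.

    Returns (averages, differences), each half the input length.
    """
    avg = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
    diff = [(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
    return avg, diff

def transpose(m):
    return [list(col) for col in zip(*m)]

def haar_2d(image):
    """One level of a 2D Haar transform; returns the four sub-bands.

    The sub-bands are named LL, LH, HL, HH here; naming conventions for the
    two mixed bands differ between references.
    """
    # Transform each row into a low-pass half and a high-pass half.
    lows, highs = zip(*(haar_1d(r) for r in image))

    # Then transform the columns of each half.
    def columns(half):
        a, d = zip(*(haar_1d(c) for c in transpose(half)))
        return transpose(a), transpose(d)

    ll, lh = columns(list(lows))
    hl, hh = columns(list(highs))
    return ll, lh, hl, hh

# Hypothetical 2x2 image
ll, lh, hl, hh = haar_2d([[1, 3], [5, 7]])
```

The LL band is a half-resolution approximation of the image, and the other three bands hold the detail removed by the averaging.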

Preferably, the student learning module may include:

a first student learning unit that trains the student network using the data set B^* generated by the soft target data generation module; and

a second student learning unit that trains the student random forest using the data set B^* generated by the soft target data generation module.

More preferably, the first student learning unit may train the student network by performing the steps of:

(3-1-1) initializing the parameters (W_S) of the student network;

(3-1-2) inputting the data set B^* into a pre-trained network;

(3-1-3) calculating a loss function L(W_S);

(3-1-4) updating [expression not reproduced in the source text] to W_S^*; and

(3-1-5) selecting the optimal parameters W_S^* for the student network.
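The five steps above can be sketched as a small training loop. This is an illustration under several assumptions that are not in the source: the student is a linear softmax classifier, the loss is the cross-entropy against the soft targets, the update is plain gradient descent with a hypothetical learning rate `lr`, and the pre-trained network of step (3-1-2) is not modeled (the student starts from small random weights).

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(W, x):
    """Class probabilities of a linear student: softmax(W x)."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])

def loss(W, data):
    """Mean cross-entropy between soft targets and student outputs (assumed form)."""
    return -sum(t * math.log(p + 1e-12)
                for x, target in data
                for t, p in zip(target, predict(W, x))) / len(data)

def train_student(data, n_classes, n_features, epochs=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    # (3-1-1) initialize the student parameters W_S
    W = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
         for _ in range(n_classes)]
    best_W, best_loss = [row[:] for row in W], loss(W, data)
    for _ in range(epochs):
        # (3-1-2), (3-1-3) feed the soft-target samples and accumulate the loss
        # gradient, which for softmax with cross-entropy is (p - t) x^T
        grad = [[0.0] * n_features for _ in range(n_classes)]
        for x, target in data:
            p = predict(W, x)
            for j in range(n_classes):
                for k in range(n_features):
                    grad[j][k] += (p[j] - target[j]) * x[k] / len(data)
        # (3-1-4) update the parameters
        W = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W, grad)]
        # (3-1-5) keep the parameters with the lowest loss seen so far
        current = loss(W, data)
        if current < best_loss:
            best_W, best_loss = [row[:] for row in W], current
    return best_W, best_loss

# Hypothetical soft-target set B^*: (feature vector, teacher class probabilities)
B_star = [([1.0, 0.0], [0.9, 0.1]),
          ([0.0, 1.0], [0.2, 0.8]),
          ([1.0, 1.0], [0.6, 0.4])]
W_best, final_loss = train_student(B_star, n_classes=2, n_features=2)
```

A real student network would be a small CNN trained with a deep learning library; the linear model is used here only to keep the example self-contained.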

Even more preferably, in step (3-1-3), the loss function may be calculated using the following equation.

[Equation not reproduced in the source text]

In the equation, N is the number of samples in data set B^*, C is the number of classes, and P_T(x_i|c_j) and P_S(x_i|c_j) are the posterior class probabilities of the teacher and the student, respectively, for input vector x_i.
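The equation itself is missing from this text. Given the symbol definitions above, one plausible form, offered only as an assumption and not as the patent's actual formula, is the cross-entropy between the teacher and student posteriors averaged over the samples:

```latex
L(W_S) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{C} P_T(x_i \mid c_j)\, \log P_S(x_i \mid c_j)
```

Under this assumed form, the loss is smallest when the student's class probabilities match the teacher's soft targets for every sample.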

More preferably, the second student learning unit may train the student random forest by performing the following steps until T random decision trees have been constructed:

(3-2-1) performing transfer learning by copying the t-th tree structure of the teacher random forest (T-RF_t) to the t-th tree structure of the student random forest (S-RF_t);

(3-2-2) inputting the data set B^*, with input vector v, into one of the decision trees of S-RF_t;

(3-2-3) generating a split function f(v) at node O;

(3-2-4) calculating the information gain ΔE at node O;

(3-2-5) calculating the cross-entropy Tr(Te, S)_t; and

(3-2-6) if the calculated cross-entropy is below the minimum threshold at which boosting stops, storing the current S-RF_t, and otherwise repeating from step (3-2-2).
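Steps (3-2-3) to (3-2-6) rely on two quantities whose formulas are not given in this text. The sketch below assumes the standard definitions: information gain as the reduction in Shannon entropy of the class distribution at a node, computed from soft targets, and cross-entropy between a teacher and a student class distribution. The threshold split, the sample data, and `min_threshold` are hypothetical.

```python
import math

def class_distribution(samples):
    """Normalized sum of the soft-target vectors of the samples at a node."""
    n_classes = len(samples[0][1])
    totals = [sum(t[j] for _, t in samples) for j in range(n_classes)]
    s = sum(totals)
    return [v / s for v in totals]

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist if p > 0)

def information_gain(samples, feature, threshold):
    """Entropy reduction from the split f(v): v[feature] < threshold."""
    left = [s for s in samples if s[0][feature] < threshold]
    right = [s for s in samples if s[0][feature] >= threshold]
    if not left or not right:
        return 0.0  # a split that sends everything one way gains nothing
    parent = entropy(class_distribution(samples))
    children = sum(len(part) / len(samples) * entropy(class_distribution(part))
                   for part in (left, right))
    return parent - children

def cross_entropy(teacher, student, eps=1e-12):
    return -sum(t * math.log(s + eps) for t, s in zip(teacher, student))

# Hypothetical samples at node O: (input vector v, soft target)
samples = [([0.1], [1.0, 0.0]), ([0.2], [1.0, 0.0]),
           ([0.8], [0.0, 1.0]), ([0.9], [0.0, 1.0])]
gain = information_gain(samples, feature=0, threshold=0.5)

# Stopping test of step (3-2-6) with a hypothetical threshold
min_threshold = 0.2
keep_tree = cross_entropy([0.9, 0.1], [0.85, 0.15]) < min_threshold
```

In a full implementation, several candidate split functions would be generated at each node and the one with the largest gain kept; that search is omitted here.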

To achieve the above object, a classification method based on a teacher-student framework for making an ensemble classifier that combines a deep network and a random forest lightweight, according to a feature of the present invention, comprises the steps of:

(1) training a teacher model, consisting of a teacher deep network and a teacher random forest, using data set A;

(2) inputting data set B into the teacher deep network and teacher random forest trained in step (1) and combining the two outputs to generate a soft target data set B^*;

(3) training a student model, consisting of a student network and a student random forest, using the data set B^* generated in step (2); and

(4) performing classification by combining the two outputs of the student network and student random forest trained in step (3).

Preferably, data set A may be a hard target data set that includes class labels.

Preferably, data set B may be a data set that does not include class labels.

Preferably, step (1) may include the steps of:

(1-1) training the teacher deep network using data set A; and

(1-2) training the teacher random forest using a feature map of the teacher deep network.

Preferably, the method may further include the step of:

(0) preprocessing an input image by applying a wavelet transform.

Preferably, step (3) may include the steps of:

(3-1) training the student network using the data set B^* generated in step (2); and

(3-2) training the student random forest using the data set B^* generated in step (2).

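More preferably, step (3-1) may train the student network through steps (3-1-1) to (3-1-5) described above for the first student learning unit, including step (3-1-4) of updating [expression not reproduced in the source text] to W_S^*.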

More preferably, step (3-2) may train the student random forest through steps (3-2-1) to (3-2-6) described above for the second student learning unit, repeated until T random decision trees have been constructed. In step (3-2-6), if the calculated cross-entropy is below the minimum threshold at which boosting stops, the current S-RF_t is stored; otherwise the procedure repeats from step (3-2-2).

According to the teacher-student framework for making an ensemble classifier that combines a deep network and a random forest lightweight, and the classification method based on it, proposed in the present invention, a new ensemble classifier is developed by combining a deep network and a random forest, and a student model is trained on the soft target data set B^*, which is the output of the teacher model. Through the teacher-student framework, consisting of the teacher model and the student model, the ensemble classifier becomes lighter while producing more flexible classification results. In addition, by training the teacher model using data set A, which includes class labels, and data set B, which does not, overfitting of the teacher model can be prevented.

FIG. 1 is a diagram illustrating a teacher-student framework for a lightweight ensemble classifier combining a deep network and a random forest, and a classification method based on it, according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the configuration of the teacher-student framework for a lightweight ensemble classifier combining a deep network and a random forest according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating the flow of the classification method based on the teacher-student framework for a lightweight ensemble classifier combining a deep network and a random forest according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the detailed flow of step S100 in the classification method based on the teacher-student framework according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the detailed flow of step S300 in the classification method based on the teacher-student framework according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating an algorithm that describes the student network training procedure of step S310 in the classification method based on the teacher-student framework according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating an algorithm that describes the student random forest training procedure of step S320 in the classification method based on the teacher-student framework according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating pedestrian orientation class classification as an example in the teacher-student framework for a lightweight ensemble classifier combining a deep network and a random forest and the classification method based on it.
FIG. 9 is a diagram comparing the pedestrian orientation estimation results of eight experiments, including the classification method based on the teacher-student framework according to an embodiment of the present invention.
FIG. 10 is a diagram showing, as a confusion matrix, the POE classification accuracy (Acc) for each orientation class classified using the teacher-student framework according to an embodiment of the present invention.
FIG. 11 is a diagram showing experimental results for determining the number of trees of the student random forest in the teacher-student framework according to an embodiment of the present invention.
FIG. 12 is a diagram comparing the accuracy, number of parameters, and number of operations of four experiments, including the classification method based on the teacher-student framework according to an embodiment of the present invention.
FIG. 13 is a diagram summarizing experimental results for five CNN-based methods, including the classification method based on the teacher-student framework according to an embodiment of the present invention.
FIG. 14 is a diagram showing POE classification results on the (a) TUD and (b) KITTI data sets using the teacher-student framework according to an embodiment of the present invention.

Hereinafter, preferred embodiments are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice it. In describing the preferred embodiments, detailed descriptions of related known functions or configurations are omitted where they could unnecessarily obscure the subject matter of the present invention. The same reference numerals are used throughout the drawings for parts that have similar functions and operations.

In addition, throughout the specification, when a part is said to be connected to another part, this covers not only a direct connection but also an indirect connection with another element in between. Likewise, saying that a part includes a component means that it may further include other components, not that it excludes them, unless specifically stated otherwise.

A deep learning network requires many parameters to build a deep model, and therefore needs a large amount of memory and time to perform the many multiplications involved. To address this drawback of deep network models, the present invention adopts a teacher-student framework and constructs a shallower student model that achieves the same level of performance as the teacher deep network on which it is based.

FIG. 1 is a diagram illustrating a teacher-student framework for a lightweight ensemble classifier combining a deep network and a random forest, and a classification method based on it, according to an embodiment of the present invention. As shown in FIG. 1, the framework and method develop a new ensemble classifier by combining a deep network and a random forest, and train a student model on the soft target data set B^*, which is the output of the teacher model. Through the teacher-student framework, consisting of the teacher model and the student model, the ensemble classifier becomes lighter while producing more flexible classification results. In addition, by training the teacher model using data set A, which includes class labels, and data set B, which does not, overfitting of the teacher model can be prevented.

As shown in FIG. 1, the teacher model combines the outputs of the teacher deep network and the teacher random forest to generate a soft target (probability value) for each class, and the student model can be trained on these soft target values. In more detail, (a) data set A, labeled with hard targets, is input to (b) the teacher deep network and (c) the teacher random forest, and (d) the unlabeled data set B is input to the two trained teacher models. (e) The soft outputs of the two teachers (teacher deep network and teacher random forest) are combined into a single soft target vector, (f) the soft target data set B^* is input to the student model, and (g) the student model is trained to obtain (h) the final class probabilities. In this way, the teacher-student framework of the present invention not only reduces the size of the network but also makes it easy to construct a compressed student model that can mimic the classification function of the teacher model.
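Steps (d) to (f) can be sketched as follows. The text says only that the two soft outputs are combined; the weighted average used here, the `weight` parameter, and the stand-in teacher functions are assumptions made for illustration.

```python
def combine_soft_outputs(p_network, p_forest, weight=0.5):
    """Weighted average of two class-probability vectors, renormalized."""
    mixed = [weight * a + (1 - weight) * b for a, b in zip(p_network, p_forest)]
    total = sum(mixed)
    return [m / total for m in mixed]

def make_soft_target_set(unlabeled, teacher_network, teacher_forest):
    """Build B^* as (sample, soft target) pairs from the unlabeled data set B."""
    return [(x, combine_soft_outputs(teacher_network(x), teacher_forest(x)))
            for x in unlabeled]

# Hypothetical stand-ins for the two trained teachers; real teachers would
# compute class probabilities from the input x.
def teacher_network(x):
    return [0.7, 0.2, 0.1]

def teacher_forest(x):
    return [0.5, 0.4, 0.1]

B = [[0.3, 0.9], [0.1, 0.4]]   # unlabeled samples
B_star = make_soft_target_set(B, teacher_network, teacher_forest)
```

Each entry of `B_star` pairs an unlabeled sample with a probability vector, which is the form of training data the student model consumes in steps (f) and (g).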

FIG. 2 is a diagram illustrating the configuration of the teacher-student framework for a lightweight ensemble classifier combining a deep network and a random forest according to an embodiment of the present invention. As shown in FIG. 2, the framework may include: a teacher learning module (100) that trains a teacher model, consisting of a teacher deep network and a teacher random forest, using data set A; a soft target data generation module (200) that inputs data set B into the teacher deep network and teacher random forest trained by the teacher learning module (100) and combines the two outputs to generate a soft target data set B^*; a student learning module (300) that trains a student model, consisting of a student network and a student random forest, using the data set B^* generated by the soft target data generation module (200); and a classification module (400) that performs classification by combining the two outputs of the student network and student random forest trained by the student learning module (300).

In addition, as shown in FIG. 2, the teacher learning module (100) may include a first teacher learning unit (110) that trains the teacher deep network using data set A, and a second teacher learning unit (120) that trains the teacher random forest using a feature map of the teacher deep network. The student learning module (300) may include a first student learning unit (310) that trains the student network using the data set B^* generated by the soft target data generation module (200), and a second student learning unit (320) that trains the student random forest using the same data set B^*.

Each component of the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, according to an embodiment of the present invention, is described in detail below together with the corresponding step of the classification method based on the teacher-student framework.

FIG. 3 is a diagram illustrating the flow of a classification method based on the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, according to an embodiment of the present invention. As shown in FIG. 3, the classification method may include: training a teacher model using data set A (S100); inputting data set B to the teacher deep network and the teacher random forest and combining the two outputs to generate a soft target data set B^* (S200); training a student model using data set B^* (S300); and performing classification by combining the two outputs of the trained student network and student random forest (S400). The method may further include preprocessing an input image by applying a wavelet transform (S10).

Hereinafter, each step of the classification method based on the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, according to an embodiment of the present invention, is described in detail.

In step S100, a teacher model composed of a teacher deep network and a teacher random forest may be trained using data set A. Step S100 may be processed by the teacher learning module 100.

The teacher model may be constructed from a teacher deep network and a teacher random forest (RF) that achieves a high level of performance based on a large amount of training data. Here, data set A may be a hard target data set that includes class labels. That is, in step S100, the teacher model may be trained using data set A, whose labels are 0 or 1.

The teacher deep network may apply a softened softmax function to obtain a soft target output, that is, a probability value for each class. Unlike the softmax function of a general CNN model, the teacher deep network T applies the softened softmax function of Equation 1 below to the vector of teacher pre-softmax activations a_T in order to obtain soft targets (output probabilities). The basic idea is to allow the student network to capture not only the information provided by the true labels but also the finer structure learned by the teacher deep network.

Here, the unsoftened output obtained from a_T is very close to a single hard target representation of the sample's true label, whereas the softened output P_T of the teacher becomes more smoothly distributed as the temperature τ (τ > 1) increases.
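
The effect of the temperature can be illustrated with a short sketch. It assumes that Equation 1 takes the commonly used temperature-scaled form P_T = softmax(a_T / τ); the function name `softened_softmax` and the activation values are hypothetical and chosen only for illustration.

```python
import math


def softened_softmax(activations, tau=1.0):
    """Temperature-scaled softmax, assumed form: softmax(a / tau)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    scaled = [a / tau for a in activations]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pre-softmax activations a_T for three classes.
a_T = [6.0, 2.0, 1.0]
hard_like = softened_softmax(a_T, tau=1.0)  # close to a hard target
soft = softened_softmax(a_T, tau=4.0)       # smoother distribution
```

With τ = 1 almost all probability mass falls on the first class, while τ = 4 spreads it over the remaining classes without changing their ranking.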

This method softens the signal at the output of the teacher deep network and can provide more information to the student network while the student model is trained. However, because the performance of the student network is sensitive to the temperature, this value must be determined empirically for each set of training data, and considerable effort is required to find the optimal temperature.

In step S100 of the present invention, in order to reduce the effort required to determine the temperature and obtain the soft output of the teacher model, a new soft output is obtained by combining the soft output of the teacher deep network with that of the teacher random forest, as shown in FIG. 1. A random forest, which is an ensemble classifier of decision trees, is known to handle very large amounts of data with faster training than conventional classifiers. In addition, a random forest inherently provides a smoother distribution of classification results for a given class.

FIG. 4 is a diagram illustrating the detailed flow of step S100 in the classification method based on the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, according to an embodiment of the present invention. As shown in FIG. 4, step S100 may include training the teacher deep network using data set A (S110) and training the teacher random forest using a feature map of the teacher deep network (S120).

In step S110, the teacher deep network may be trained using data set A. Step S110 may be processed by the first teacher learning unit 110. More specifically, in step S110, the teacher deep network is first trained using training data set A, where data set A may be a hard target data set that includes class labels. Data set A = {(x_i, y_i) | i = 1, 2, …, N} may consist of M-dimensional input vectors x_i = (x_i1, x_i2, …, x_iM) and scalar class labels y_i ∈ {g_1, g_2, …, g_c} assigned to x_i by an expert.

The teacher deep network may be built from class-labeled (hard target) training data based on the ResNet-101 model (He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770-778). The teacher deep network may consist of 101 parameter layers, one average pooling layer, and one fully connected layer. ResNet adds a shortcut connection that skips one or more layers to each pair of 3×3 filters, but its basic architecture may otherwise be the same as that of a plain CNN. In addition, ResNet uses identity mapping for all shortcuts and zero-padding to increase dimensions. The output of each shortcut connection is added to the output of the stacked layers.

Given training data set A, a ResNet-101 model pre-trained on ImageNet may be fine-tuned on the smaller data set A, updating all network weights for the new task. After the teacher deep network has been trained, Equation 1 provides a soft output for each output unit (class). That is, an output probability is provided for each of a predetermined number of classes.

In step S120, the teacher random forest may be trained using a feature map of the teacher deep network. Step S120 may be processed by the second teacher learning unit 120. That is, in step S120, each decision tree of the teacher random forest, which serves as the second classifier, may be trained using the final feature vector for the input vector x_i with hard class label y_i. Training of a decision tree is based on random sampling of subsets and on selecting split functions using information gain. The final class distribution of a sample x may be generated as the ensemble (arithmetic mean) of the class probability distributions p_t(c_i|x) of all T trees, as shown in Equation 2 below.
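
The arithmetic-mean ensemble described for Equation 2 can be sketched as follows. This is a minimal illustration, assuming each tree returns a class probability list p_t(c|x) for the sample; the function name `forest_class_distribution` and the per-tree values are hypothetical.

```python
def forest_class_distribution(tree_distributions):
    """Arithmetic mean of per-tree class distributions p_t(c|x) over all T trees."""
    if not tree_distributions:
        raise ValueError("need at least one tree")
    n_trees = len(tree_distributions)
    n_classes = len(tree_distributions[0])
    return [
        sum(tree[c] for tree in tree_distributions) / n_trees
        for c in range(n_classes)
    ]


# Hypothetical leaf distributions returned by T = 3 trees for one sample x.
per_tree = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.75, 0.25],
]
p = forest_class_distribution(per_tree)
```

Even when individual trees are confident, the averaged distribution assigns non-zero probability to several classes, which is the smoothing property the text attributes to random forests.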

In step S200, data set B may be input to the teacher model trained in step S100, and a soft target data set B^* may be generated from the resulting soft outputs. Step S200 may be processed by the soft target data generation module 200. More specifically, in step S200, as in (e) of FIG. 1, data set B is input to the teacher deep network and the teacher random forest trained in step S100, and the output of the teacher deep network and the output of the teacher random forest are combined into a single soft target vector, thereby generating the soft target data set B^*, which consists of probability values for each class. In step S200, data set B may be a data set that does not include class labels.

More specifically, once training of the teacher model is complete, a much larger, unlabeled training data set B is applied to the teacher model, and a new data set B^* consisting of soft targets (class probabilities), as opposed to hard targets, may be constructed.

Unlike algorithms that generate soft targets by applying only a single data set A to the teacher model, the approach of the present invention may use an additional training data set B to prevent overfitting of the teacher model. The new soft target data set B^* generated in step S200 preserves the relationships between different classes and can therefore capture more information than the original hard target data. In addition, it can yield more flexible classification results than a hard target data set.

After all M samples x in data set B have been passed through the teacher model, a new data set B^*, expressed in terms of class probabilities p_i^* (soft targets), may be generated as B^* = {(x_i, p_i^*) | i = 1, 2, …, M}. Recognition performance may differ depending on the number of decision trees used in the random forest.
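
The construction of B^* can be sketched as below. The text does not state the exact rule for combining the two teacher outputs, so an equally weighted average is assumed here; `combine_soft_outputs`, `build_soft_target_set`, and the two toy teacher functions are hypothetical stand-ins for the trained teacher deep network and teacher random forest.

```python
def combine_soft_outputs(p_net, p_rf, weight=0.5):
    """Merge two class distributions into one soft target (assumed: weighted mean)."""
    mixed = [weight * a + (1.0 - weight) * b for a, b in zip(p_net, p_rf)]
    total = sum(mixed)
    return [m / total for m in mixed]


def build_soft_target_set(samples, teacher_net, teacher_rf):
    """Return B* = [(x_i, p_i*)] for unlabeled samples x_i."""
    return [
        (x, combine_soft_outputs(teacher_net(x), teacher_rf(x)))
        for x in samples
    ]


# Toy stand-ins for the trained teachers (two classes, one feature).
def toy_net(x):
    return [0.9, 0.1] if x[0] > 0 else [0.2, 0.8]


def toy_rf(x):
    return [0.7, 0.3] if x[0] > 0 else [0.4, 0.6]


B = [(1.0,), (-1.0,)]  # unlabeled data set B
B_star = build_soft_target_set(B, toy_net, toy_rf)
```

Each entry of `B_star` pairs an input with a class probability vector rather than a single label, which is the form the student model is later trained on.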

Meanwhile, the classification method based on the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, according to an embodiment of the present invention, may further include step S10. In step S10, a preprocessing module 500 may preprocess an input image by applying a wavelet transform. In particular, step S10 may be performed additionally when the classification method is applied to images. More specifically, in step S10, two high-pass filtered sub-images and one low-pass filtered sub-image are generated, and in step S100, the teacher model may be trained using the three sub-images generated in step S10.

Specifically, in addition to the softened softmax function, three hand-crafted filter responses of the wavelet transform may be provided to the model. That is, providing suitable hand-crafted features by using two high-pass filtered sub-images (LH and HL) and one low-pass filtered sub-image (LL), obtained with the Daubechies D4 wavelet, together with the gray image as shown in (a) of FIG. 1, can improve results for certain classification problems. Moreover, because the wavelet transform has good space-frequency localization and can preserve the spatial and gradient information of an image, it can help improve classification performance under varying illumination conditions.
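
A single-level Daubechies D4 decomposition that yields the LL, LH, and HL sub-images can be sketched in plain Python as follows. The periodic boundary handling, the requirement of even image dimensions, and the sub-band naming (first letter for the row filter, second for the column filter) are assumptions of this sketch; the text does not specify them.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Daubechies D4 low-pass (scaling) taps and the matching high-pass (wavelet) taps.
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def analyze_1d(signal, taps):
    """Filter, then downsample by 2, wrapping around at the border."""
    n = len(signal)
    return [
        sum(taps[k] * signal[(2 * i + k) % n] for k in range(4))
        for i in range(n // 2)
    ]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def filter_rows(image, taps):
    return [analyze_1d(row, taps) for row in image]


def filter_cols(image, taps):
    return transpose(filter_rows(transpose(image), taps))


def d4_subbands(gray):
    """Return the LL, LH and HL sub-images of a gray image (list of rows)."""
    low_rows = filter_rows(gray, LOW)
    high_rows = filter_rows(gray, HIGH)
    return {
        "LL": filter_cols(low_rows, LOW),   # low-pass in both directions
        "LH": filter_cols(low_rows, HIGH),  # low-pass rows, high-pass columns
        "HL": filter_cols(high_rows, LOW),  # high-pass rows, low-pass columns
    }


# Hypothetical 8x8 gray image.
gray = [[float((r + c) % 5) for c in range(8)] for r in range(8)]
bands = d4_subbands(gray)
```

Each sub-image has half the width and height of the input, so the three bands can be stacked as input channels alongside a resized gray image.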

In step S300, a student model composed of a student network and a student random forest may be trained using the data set B^* generated in step S200. Step S300 may be processed by the student learning module 300.

After the teacher model has been trained, a lightweight student model may be constructed using the soft target data set B^* generated by the teacher model. Like the teacher model, the student model may consist of one student network and one student random forest.

FIG. 5 is a diagram illustrating the detailed flow of step S300 in the classification method based on the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, according to an embodiment of the present invention. As shown in FIG. 5, step S300 may include training the student network using data set B^* (S310) and training the student random forest using data set B^* (S320).

In step S310, the student network may be trained using the data set B^* generated in step S200. Step S310 may be processed by the first student learning unit 310.

The student network may be created by modifying the DarkNet reference model (Darknet reference model. Available online: https://pjreddie.com/darknet/imagenet/#reference (accessed on 27 December 2018)). This is because the DarkNet reference model, with 1/5 and 1/10 the number of parameters, runs 16 times faster than ResNet-101 and 2 times faster than AlexNet, respectively, on a single CPU. Therefore, instead of compressing the teacher deep network, the shallow DarkNet reference model may be used as the student network and retrained using the soft target data set generated by the teacher model.

The student network may consist of a total of eight convolutional layers, with seven max pooling layers, one after each of the first seven convolutional layers, and one average pooling layer. The first seven convolutional layers use 3×3 convolution filters and are each followed by a max pooling layer with a 2×2 filter. The last convolutional layer uses a 1×1 convolution filter and is followed by an average pooling layer instead of a fully connected layer, which helps prevent overfitting and removes the learnable parameters that a fully connected layer would require. In addition, batch normalization is applied to each convolutional layer, and the leaky ReLU (LReLU) may be used as the activation function to address the dying ReLU problem. As shown in Equation 3 below, the LReLU function f(x) takes a small negative value when x < 0 instead of being 0.
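
The behaviour described for Equation 3 can be written as a one-line function. The negative-side slope is not given in the text, so `alpha=0.1` is an illustrative assumption only.

```python
def leaky_relu(x, alpha=0.1):
    """Leaky ReLU: identity for x > 0, small negative slope alpha otherwise."""
    return x if x > 0 else alpha * x


# Positive inputs pass through; negative inputs are scaled instead of zeroed.
activations = [leaky_relu(v) for v in (-2.0, 0.0, 3.0)]
```

Because the output for negative inputs is not clamped to zero, the gradient there stays non-zero, which is how the leaky variant avoids units that stop updating.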

To train the student network, convolutional weights pre-trained on ImageNet may be used, and fine-tuning may be performed using the soft target data set B^*. Training may be based on frame-wise minimization of the cross-entropy criterion in which the hard target vector is replaced by the soft target vector, as shown in Equation 4 below.

Here, N is the number of samples in data set B^* and C is the number of classes. P_T(x_i|c_j) and P_S(x_i|c_j) are the posterior class probabilities of the teacher and the student, respectively, for the input vector x_i.
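
A cross-entropy with soft targets over N samples and C classes can be sketched as follows. The sketch assumes Equation 4 has the form L(W_S) = -(1/N) Σ_i Σ_j P_T(x_i|c_j) log P_S(x_i|c_j); the 1/N normalization and the clipping constant `eps` are assumptions, and the probability values are hypothetical.

```python
import math


def soft_cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Mean over samples of -sum_j P_T(x_i|c_j) * log P_S(x_i|c_j)."""
    n = len(teacher_probs)
    total = 0.0
    for p_t, p_s in zip(teacher_probs, student_probs):
        # eps guards against log(0) for classes the student rules out.
        total -= sum(t * math.log(max(s, eps)) for t, s in zip(p_t, p_s))
    return total / n


# Hypothetical teacher soft targets and two candidate student outputs.
teacher = [[0.8, 0.2], [0.3, 0.7]]
matched = soft_cross_entropy(teacher, teacher)
mismatched = soft_cross_entropy(teacher, [[0.5, 0.5], [0.5, 0.5]])
```

The loss is smallest when the student reproduces the teacher's distribution, in which case it equals the mean entropy of the soft targets.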

FIG. 6 is a diagram showing an algorithm that describes the student network training procedure of step S310 in the classification method based on the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, according to an embodiment of the present invention. In step S310, the first student learning unit 310 may first train the student network using the algorithm shown in FIG. 6.

More specifically, in step S310, the student network may be trained by performing the steps of: (3-1-1) initializing the parameters W_S of the student network; (3-1-2) inputting data set B^* to the pre-trained network; (3-1-3) calculating the loss function L(W_S); (3-1-4) updating [expression not reproduced in the source text] to W_S^*; and (3-1-5) selecting the optimal parameters W_S^* for the student network.
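
The five steps above can be sketched as a small training loop. To stay self-contained, this sketch replaces the DarkNet-based student network with a linear softmax classifier trained by plain gradient descent, and it starts from zero weights rather than ImageNet pre-trained weights; the names `train_student`, `predict`, and `loss`, the learning rate, and the toy data are all hypothetical.

```python
import math


def softmax(z):
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """One weight row per class; the last entry of each row is the bias."""
    return softmax([sum(w * v for w, v in zip(row[:-1], x)) + row[-1] for row in weights])


def loss(weights, data):
    """Soft-target cross-entropy L(W_S), averaged over the samples."""
    total = 0.0
    for x, target in data:
        probs = predict(weights, x)
        total -= sum(p * math.log(max(q, 1e-12)) for p, q in zip(target, probs))
    return total / len(data)


def train_student(data, n_classes, lr=0.5, epochs=200):
    n_features = len(data[0][0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]  # (3-1-1)
    best_weights, best_loss = [row[:] for row in weights], loss(weights, data)
    for _ in range(epochs):
        grads = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, target in data:                                      # (3-1-2)
            probs = predict(weights, x)
            for c in range(n_classes):
                err = probs[c] - target[c]  # gradient of the loss w.r.t. logit c
                for k in range(n_features):
                    grads[c][k] += err * x[k]
                grads[c][-1] += err
        for c in range(n_classes):                                  # (3-1-4)
            for k in range(n_features + 1):
                weights[c][k] -= lr * grads[c][k] / len(data)
        current = loss(weights, data)                               # (3-1-3)
        if current < best_loss:                                     # (3-1-5)
            best_weights, best_loss = [row[:] for row in weights], current
    return best_weights, best_loss


# Hypothetical soft target data set B* with one feature and two classes.
B_star = [((1.0,), [0.8, 0.2]), ((-1.0,), [0.3, 0.7])]
W_S_star, final_loss = train_student(B_star, n_classes=2)
```

Keeping the best parameters seen so far mirrors the final selection step; in a real setting the selection would normally use a held-out validation set rather than the training loss.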

In step S320, the student random forest may be trained using the data set B^* generated in step S200. Step S320 may be processed by the second student learning unit 320.

As with the student network, the initial decision trees of the student random forest may use the same structure as the teacher random forest, including the number of trees, the tree depth, and the split function with a split threshold at each node of each tree.

Training of a decision tree of the student random forest takes an input vector v and class probabilities p_i^* as inputs. The input vector has 128 dimensions, generated from the last feature maps (4×4×8), and the class probabilities are estimated from the output of the teacher model. From the training data, which consist of output vectors with the class probabilities of the soft target data set B^*, each decision tree of the student random forest may randomly select p variables with class probabilities from the samples.

Let B'_O denote the samples at node O. A pre-trained, randomly generated split function f(v_p) may iteratively split the random subset B'_O at node O into a left subset (B'_l) and a right subset (B'_r). To select the best split function, the entropy E(O) of node O may be estimated using only the p variables with probability distribution P_j^*. In the present invention, the entropy E(B'_O) of node O may be defined as in Equation 5 below.

In the same way, after the samples at node O are split into the left and right subsets B'_l and B'_r, the entropies E(B'_l) and E(B'_r) may be calculated. From the three entropies, the information gain ΔE of node O may be calculated using Equation 6 below.

This process may be repeated for a number of candidate split functions, and the function with the maximum ΔE may be chosen as the best split function f(v_p) for node O.
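
Split selection with soft targets can be sketched as follows. It assumes that Equation 5 is the Shannon entropy of the mean class probability P_j^* at a node and that Equation 6 is the usual size-weighted information gain, ΔE = E(B'_O) - (|B'_l|/|B'_O|) E(B'_l) - (|B'_r|/|B'_O|) E(B'_r). Both forms, the threshold-type split function, and all names and values are assumptions made for illustration.

```python
import math


def node_entropy(soft_targets):
    """Entropy of the mean soft class distribution P_j* at a node."""
    n = len(soft_targets)
    n_classes = len(soft_targets[0])
    mean = [sum(p[j] for p in soft_targets) / n for j in range(n_classes)]
    return -sum(m * math.log(m) for m in mean if m > 0)


def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropies of the two children."""
    n = len(parent)
    return (
        node_entropy(parent)
        - len(left) / n * node_entropy(left)
        - len(right) / n * node_entropy(right)
    )


def best_split(samples, candidates):
    """samples: [(v, p_star)]; candidates: [(feature_index, threshold)]."""
    parent = [p for _, p in samples]
    best = None
    for feature, threshold in candidates:
        left = [p for v, p in samples if v[feature] < threshold]
        right = [p for v, p in samples if v[feature] >= threshold]
        if not left or not right:
            continue  # a split that leaves one side empty carries no information
        gain = information_gain(parent, left, right)
        if best is None or gain > best[0]:
            best = (gain, feature, threshold)
    return best


# Hypothetical node samples: 2-D feature vectors with two-class soft targets.
samples = [
    ((0.1, 5.0), [0.9, 0.1]),
    ((0.2, 1.0), [0.8, 0.2]),
    ((0.8, 4.0), [0.1, 0.9]),
    ((0.9, 2.0), [0.2, 0.8]),
]
gain, feature, threshold = best_split(samples, [(0, 0.5), (1, 3.0)])
```

In this toy data the first feature separates the two soft-label groups, so the candidate on feature 0 is selected, while the candidate on feature 1 leaves both children as mixed as the parent.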

After the initial decision tree Tr_t has been grown, the probability distribution over the C classes may be stored in each leaf node. Next, the cross-entropy may be estimated from P_ij^*(Te), the j-th class distribution of sample i of data set B^* as transcribed by the teacher random forest, and P_ij^*(S_t), the j-th class distribution of sample i based on the constructed decision tree t. The general form of the final cross-entropy is given by Equation 7 below.

Drawing on the high performance of boosted random forests, boosting may be repeated to update the t-th weak decision tree until Tr(Te, S)_t falls below a minimum criterion. Once T random decision trees have been completed, the student random forest finally consists of T trees, each holding per-class probability distributions.

FIG. 7 is a diagram showing an algorithm that describes the student random forest training procedure of step S320 in the classification method based on the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, according to an embodiment of the present invention. In step S320, the second student learning unit 320 may train the student random forest using the algorithm shown in FIG. 7.

More specifically, in step S320, the student random forest may be trained by repeating the following steps until T random decision trees have been constructed: (3-2-1) copying the t-th tree structure of the teacher random forest (T-RF_t) to the t-th tree structure of the student random forest (S-RF_t) as transfer learning; (3-2-2) inputting data set B^*, with input vectors v, to one of the decision trees of S-RF_t; (3-2-3) generating a split function f(v) at node O; (3-2-4) calculating the information gain ΔE at node O; (3-2-5) calculating the cross-entropy Tr(Te, S)_t; and (3-2-6) storing the current S-RF_t if the calculated cross-entropy is below the minimum threshold θ at which boosting stops, and otherwise repeating from step (3-2-2).
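
The loop over steps (3-2-1) to (3-2-6) can be sketched in a heavily simplified form. In this sketch each tree is reduced to a single split (a stump), the copied teacher structure is reduced to the feature index each teacher tree splits on, the cross-entropy is assumed to be the mean of -Σ_j P_ij^*(Te) log P_ij^*(S_t), and a `max_rounds` guard is added so the loop always terminates. All of these, and every name and value below, are assumptions for illustration rather than the procedure of FIG. 7.

```python
import math
import random


def mean_distribution(targets):
    n = len(targets)
    return [sum(p[j] for p in targets) / n for j in range(len(targets[0]))]


def entropy(targets):
    return -sum(m * math.log(m) for m in mean_distribution(targets) if m > 0)


def fit_stump(samples, feature, rng, n_candidates=10):
    """(3-2-3), (3-2-4): keep the random threshold with the largest information gain."""
    values = [v[feature] for v, _ in samples]
    parent = [p for _, p in samples]
    best = None
    for _ in range(n_candidates):
        threshold = rng.uniform(min(values), max(values))
        left = [p for v, p in samples if v[feature] < threshold]
        right = [p for v, p in samples if v[feature] >= threshold]
        if not left or not right:
            continue
        gain = entropy(parent) - (
            len(left) * entropy(left) + len(right) * entropy(right)
        ) / len(parent)
        if best is None or gain > best[0]:
            # Leaves store the mean soft class distribution of their samples.
            best = (gain, threshold, mean_distribution(left), mean_distribution(right))
    return best


def stump_predict(stump, feature, v):
    _, threshold, left_dist, right_dist = stump
    return left_dist if v[feature] < threshold else right_dist


def cross_entropy(samples, stump, feature):
    """(3-2-5): cross-entropy between teacher soft targets and the tree output."""
    total = 0.0
    for v, p_teacher in samples:
        p_tree = stump_predict(stump, feature, v)
        total -= sum(t * math.log(max(s, 1e-12)) for t, s in zip(p_teacher, p_tree))
    return total / len(samples)


def train_student_forest(samples, teacher_features, theta, max_rounds=50, seed=0):
    rng = random.Random(seed)
    forest = []
    for feature in teacher_features:      # (3-2-1): reuse the teacher's structure
        kept = None
        for _ in range(max_rounds):       # (3-2-2) to (3-2-6)
            stump = fit_stump(samples, feature, rng)
            if stump is None:
                continue
            ce = cross_entropy(samples, stump, feature)
            if kept is None or ce < kept[0]:
                kept = (ce, feature, stump)
            if ce < theta:                # (3-2-6): stop boosting for this tree
                break
        if kept is not None:
            forest.append(kept)
    return forest


# Hypothetical soft target samples and teacher structure (both trees use feature 0).
samples = [
    ((0.1, 5.0), [0.9, 0.1]),
    ((0.2, 1.0), [0.8, 0.2]),
    ((0.8, 4.0), [0.1, 0.9]),
    ((0.9, 2.0), [0.2, 0.8]),
]
forest = train_student_forest(samples, teacher_features=[0, 0], theta=0.5)
```

Each stored entry holds the cross-entropy reached, the feature index, and the fitted stump, so the forest output for a new vector can be obtained by averaging the leaf distributions returned by `stump_predict`.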

In step S400, classification may be performed by combining the two outputs of the student network and the student random forest trained in step S300. Step S400 may be processed by the classification module 400. More specifically, in step S400, the output values of the trained student network and student random forest are combined to generate final probabilities, and the input is assigned to the class with the highest final probability. That is, in the present invention, an input is fed to the trained student model, and the output of the student network is combined with the output of the student random forest to generate the final probabilities, which allows more accurate classification.
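
The final decision can be sketched as below. The combination rule is not specified in the text, so an unweighted average of the two class distributions is assumed; the function name `classify` and the probability values are hypothetical.

```python
def classify(p_network, p_forest):
    """Combine two class distributions (assumed: plain average) and pick the top class."""
    final = [(a + b) / 2.0 for a, b in zip(p_network, p_forest)]
    label = max(range(len(final)), key=final.__getitem__)
    return label, final


# Hypothetical outputs of the student network and the student random forest.
label, final = classify([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])
```

In this example the two students disagree on the top class, and the combined probabilities select the class that has the higher joint support.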

Experimental results

To demonstrate the effectiveness of the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, and of the classification method based on it, according to an embodiment of the present invention, the invention was applied to estimating the pose orientation of pedestrians for collision avoidance in an advanced driver assistance system (ADAS). Its performance was evaluated, and comparative experiments were carried out against other approaches presented in recent studies. In ADAS, a vehicle can detect a pedestrian and predict the pedestrian's intention in advance based on pose orientation estimation (POE). The driver can therefore be warned when a pedestrian steps onto the road without noticing the vehicle, which can greatly reduce the likelihood of a collision. Because the aim here is POE from a single image captured by a moving vehicle, 3D POE using a stereo camera or an RGB-D sensor is not considered.

The performance of the present invention was evaluated on a benchmark database, and comparative experiments were carried out against other approaches presented in recent studies.

In this experiment, the POE performance was first verified in order to show that the teacher-student framework for lightening an ensemble classifier in which a deep network and a random forest are combined, and the classification method based on it, are effective for estimating the orientation of various pedestrian poses. The experiment was conducted on an Intel Core i7 processor with 24 GB of RAM running Microsoft Windows 10. All RF approaches, including the teacher random forest and the student random forest, were run on the CPU, and the teacher deep network was run on a single Titan Xp GPU.

To train the teacher deep network, the batch size, momentum, learning rate, and weight decay were set to 32, 0.9, 0.001, and 0.0005, respectively. For a random forest, the important parameters with respect to performance and the memory needed to store the trees are the tree depth and the number of trees. In this experiment, the maximum tree depth was set to 20 and the number of trees in the teacher random forest to 300. To determine the number of trees in the student random forest, the number of trees was reduced sequentially to 250, 200, 150, 100, 70, and 50; based on the experimental results described later, it was set to 70 for more accurate and faster computation.
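
By way of illustration only, the training settings above can be read as one update step of stochastic gradient descent with momentum and L2 weight decay. The description does not name the optimizer, so the use of SGD is an assumption, as are the function name `sgd_momentum_step` and the example values of `w` and `g`.

```python
import numpy as np

# Values reported in the description for training the teacher deep network.
BATCH_SIZE = 32          # samples per gradient estimate
MOMENTUM = 0.9
LEARNING_RATE = 0.001
WEIGHT_DECAY = 0.0005


def sgd_momentum_step(weights, grad, velocity,
                      lr=LEARNING_RATE, momentum=MOMENTUM,
                      weight_decay=WEIGHT_DECAY):
    """One SGD-with-momentum update (assumed optimizer, not confirmed by the source)."""
    # L2 weight decay adds weight_decay * w to the gradient.
    grad = grad + weight_decay * weights
    # The velocity accumulates a decaying sum of past gradients.
    velocity = momentum * velocity - lr * grad
    return weights + velocity, velocity


# Hypothetical two-parameter example.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
g = np.array([0.5, 0.5])
w, v = sgd_momentum_step(w, g, v)
```

In a real training loop, `g` would be the loss gradient averaged over a mini-batch of `BATCH_SIZE` images.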

The student network and the student random forest were retrained using the soft target training data set B^*, following Algorithm 1 and Algorithm 2 shown in FIGS. 6 and 7. Many data sets for pedestrian detection are available, but relatively few studies have addressed pedestrian orientation estimation. The POE experiments were therefore performed on the popular TUD multi-view pedestrian data set, which consists of 5,228 pedestrian images annotated with bounding boxes and discrete orientations. The data set contains 4,732 full-body pedestrian images for training, 248 for validation, and 248 for testing. The images were taken in real street scenes and include a wide variety of poses and clothing, which makes the data set considerably more challenging. A model trained on a small data set generally fails to generalize to the validation and test sets, that is, it overfits. To reduce overfitting, the present invention enlarges the data set with data augmentation: image shifting, zooming in and out, rotation by a random angle between -15 and +15 degrees, left-right flipping, and cropping. The training images provided to the teacher model comprise the original images and their augmented copies. With this augmentation, 4,732 images were assigned to data set A and 4,732 images to data set B.
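
Two of the augmentations listed above can be sketched as follows. This is a minimal sketch, not the implementation used in the experiments: the class abbreviations, the `FLIP_LABEL` mapping, and the function names are assumptions. The description does not say how labels are handled, but a mirrored pedestrian faces the opposite side, so the sketch swaps left and right labels on flipping. Rotation and zooming are omitted because they need interpolation.

```python
import numpy as np

# Class names follow the eight TUD orientation annotations; the abbreviations
# and this flip mapping are illustrative assumptions.
FLIP_LABEL = {
    "Back": "Back", "Front": "Front",
    "Left": "Right", "Right": "Left",
    "Lback": "Rback", "Rback": "Lback",
    "Lfront": "Rfront", "Rfront": "Lfront",
}


def flip_left_right(image, label):
    """Mirror an H x W (x C) image and swap the left/right orientation label."""
    return image[:, ::-1].copy(), FLIP_LABEL[label]


def shift(image, dx, dy):
    """Translate an image by (dx, dy) pixels, filling uncovered pixels with zeros."""
    out = np.zeros_like(image)
    h, w = image.shape[:2]
    # Source window that remains visible after the shift (empty if shifted out).
    src = image[max(0, -dy):max(0, min(h, h - dy)),
                max(0, -dx):max(0, min(w, w - dx))]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out
```

Positive `dx` moves the content to the right and positive `dy` moves it down.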

FIG. 8 illustrates an example of pedestrian orientation classes in the teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest, and in the classification method based on it. In pedestrian orientation estimation, the number of classes grows if every direction is to be predicted. Most existing studies therefore divide the orientation into N clusters, as shown in FIG. 8. For example, the TUD data set consists of 5,228 pedestrian images whose bounding boxes carry one of eight orientation annotations: back, front, left, right, left-back, right-back, left-front, and right-front. In the TUD data set, the orientation classes are thus spaced 45 degrees apart.
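
The 45-degree division can be illustrated by mapping an angle to the nearest of eight classes. This is a sketch under assumptions: the description does not state which angle corresponds to which class, so the reference direction (0 degrees as "Front"), the class order, and the name `orientation_class` are hypothetical.

```python
# Class order is an illustrative assumption: 0 degrees is taken as "Front" and
# the angle increases through the listed classes in 45-degree steps.
CLASSES = ["Front", "Lfront", "Left", "Lback", "Back", "Rback", "Right", "Rfront"]
BIN_WIDTH = 360 / len(CLASSES)  # 45 degrees per class


def orientation_class(angle_deg):
    """Map a body orientation in degrees to the nearest 45-degree class."""
    # Shift by half a bin so that each class is centred on a multiple of 45 degrees.
    index = int(((angle_deg % 360) + BIN_WIDTH / 2) // BIN_WIDTH) % len(CLASSES)
    return CLASSES[index]
```

Each class then covers the 45-degree interval centred on its nominal direction, and angles outside 0 to 360 degrees wrap around.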

To validate the orientation estimation over the eight classes, precision, recall, and false positive rate (FPR) were measured on the TUD data set. These measures are commonly used to evaluate object recognition performance. In addition, accuracy (Acc) is used together with confusion matrices to compare performance across the orientation classes. Accuracy is the ratio of correct detections to the total number of cases examined.
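
As a minimal sketch, the four measures can be computed from a confusion matrix in the one-vs-rest manner that is standard for multi-class problems. The function name and the 2 x 2 example matrix are hypothetical and are not results from the experiments.

```python
import numpy as np


def per_class_metrics(confusion):
    """Precision, recall and false positive rate per class, plus overall accuracy.

    confusion[i, j] counts samples of true class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    tp = np.diag(confusion)
    fp = confusion.sum(axis=0) - tp   # predicted as the class, but wrong
    fn = confusion.sum(axis=1) - tp   # belonging to the class, but missed
    tn = total - tp - fp - fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    fpr = fp / (fp + tn)
    accuracy = tp.sum() / total
    return precision, recall, fpr, accuracy


# Hypothetical two-class example.
cm = [[8, 2],
      [1, 9]]
precision, recall, fpr, accuracy = per_class_metrics(cm)
```

Averaging the per-class vectors, for example `precision.mean()`, gives class-averaged values. A class that is never predicted makes its precision undefined (division by zero), which the sketch does not guard against.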

Performance evaluation on the TUD data set

To verify the effectiveness of POE classification with the present invention, its performance was compared with five state-of-the-art methods. The experiments are as follows: (1) MoAWG, which classifies POE using an array of extremely randomized tree classifiers; (2) PLS-RF, which uses a partial least squares-based model combined with a random forest classifier; (3) MACF, which uses a sparse representation technique to recognize body pose orientation; (4) VGG-16, a CNN with 16 weight layers applied to low-resolution images; (5) ResNet-101, based on deep residual nets; (6) the proposed teacher model without handcrafted filters; (7) the proposed teacher model (proposed T-Model); and (8) the proposed student model (proposed S-Model), comprising the student network and the student random forest. Of the eight methods, (4) to (8) are CNN-based.

FIG. 9 compares the pedestrian orientation estimation results of the eight experiments, including the classification method based on the teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest according to an embodiment of the present invention, in terms of average precision (AP), average recall (AR), and average FPR (AFPR). As shown in FIG. 9, the CNN-based methods ((4) to (8)) achieve better classification performance in every experiment than the conventional methods based on handcrafted features and classifiers. MoAWG performed best among the three conventional approaches (MoAWG, PLS-RF, MACF), but its results were 0.2%, 3.2%, and 0.6% worse than those of VGG-16, the weakest of the deep network-based approaches. VGG-16 and ResNet-101 outperformed the conventional approaches, but because they use basic CNN models, their performance is lower than that of the present invention. Among the three proposed methods, the T-Model achieved the best performance on all three measures because it uses the teacher deep network and the teacher random forest together.

The proposed T-Model without handcrafted filters, which omits the preprocessing step S10 using handcrafted filters, performed worse than the original T-Model on all three measures. This result indicates that the wavelet transform has good spatial-frequency localization properties and preserves the spatial and gradient information of the image.

Compared with the T-Model, the proposed S-Model showed slightly lower performance, by 7.3% in AP and 5.3% in AR. However, because this loss is small relative to the reduction in model size, the proposed method effectively reduces memory and computation-time requirements while largely maintaining performance. Compared with the other CNN-based methods, the proposed S-Model has relatively high AP and AR and low AFPR, which indicates that the proposed method is robust to complex backgrounds and blurred pedestrian outlines.

FIG. 10 shows, as a confusion matrix, the POE classification accuracy (Acc) for each orientation class obtained with the teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest according to an embodiment of the present invention. As shown in FIG. 10, most orientations show similar classification performance, except 'Back' and 'Lback'. The main reason these two orientations are less accurate than the others is that they look similar to each other, even though the wavelet transform is applied before the CNN. In contrast, 'Lfront' and 'Rback' show the best classification performance because their appearance is distinctive.

Determining the optimal number of decision trees for the student RF

For the student random forest, the number of decision trees is a key factor in reducing the number of parameters, and therefore the processing time and memory. To determine the optimal number of trees, experiments were performed on the TUD data set with the number of trees reduced sequentially to 200, 150, 100, 70, and 50, comparing precision, recall, and accuracy.

FIG. 11 shows the experimental results used to determine the number of trees in the student random forest in the teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest according to an embodiment of the present invention. As shown in FIG. 11, precision, recall, and accuracy increase with the number of trees, but the number of parameters also increases and the speed and compression ratio decrease. On these results, 70 trees give performance similar to or slightly higher than the other tree counts and can be regarded as the optimal number for the student random forest. In the present invention, the number of trees in the student random forest was therefore set to 70 to keep accuracy high while reducing the number of parameters.

Model compression evaluation

The goal of model compression is to produce an optimal student model that needs fewer parameters and operations while performing similarly to the teacher model. The proposed student model was therefore compared with MobileNet, a popular model compression method, on the TUD data set in terms of the number of parameters and operations. The comparison models were fine-tuned on the TUD training data starting from pretrained parameters. MobileNet is based on depthwise separable convolutions, which reduce the number of parameters and operations. In this experiment, the three comparison methods were run on a single Titan-X GPU.

FIG. 12 compares the accuracy, the number of parameters, and the number of operations of four experiments, including the classification method based on the teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest according to an embodiment of the present invention. As shown in FIG. 12, the proposed student model reduces the number of parameters by a factor of about 5 and the number of operations by a factor of about 19.6 compared with the teacher model. That is, the POE classification accuracy of the student model is somewhat lower than that of the teacher model, but far fewer operations and parameters are required. In addition, the student model is 17.9% more accurate in POE than MobileNet and uses 5 times fewer operations. The student network, however, uses standard convolutions, which increase the number of operations by a factor of 19.6. The comparison shows that the proposed model compression method outperforms the existing compression method in both POE classification accuracy and number of operations.

Performance evaluation on the KITTI data set

To verify whether the algorithm proposed in the present invention can be applied effectively to other data sets, it was also applied to the KITTI data set and the results were compared.

The KITTI data set, used as the second data set, is a real-world computer vision benchmark that includes stereo imaging, optical flow, visual odometry, 3D object detection, and 3D tracking. Of the nine available categories, the experiments were performed on the pedestrian category. The pedestrian category of the KITTI data set was divided into a training set of 5,415 images and a validation set of 2,065 images. Data augmentation was applied to the training set only, to increase its size, and a full training data set of 4,732 images was used. The difficulty of the data is defined as "easy", "moderate", or "hard" according to size, occlusion, and truncation level. Detections in regions marked as unimportant and detections smaller than the minimum size are not counted as false positives. To train the student model on the KITTI data set, the training data were fed to the teacher model, and the soft target data output by the teacher model were fed to the student network and the student random forest. During training, the pedestrian orientation was normalized to eight angles, and continuous orientation values were estimated using Equation 8.

Because the pedestrian data in the KITTI data set, unlike those in TUD, are annotated with continuous orientations, Average Orientation Similarity (AOS) was used to validate the orientation estimation over the eight classes on the KITTI data set.
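
For reference, AOS as defined by the KITTI benchmark can be written as follows. The formula is not reproduced in this description; it is given here on the assumption that the experiments followed the standard KITTI definition.

```latex
\mathrm{AOS} = \frac{1}{11} \sum_{r \in \{0, 0.1, \ldots, 1\}} \max_{\tilde{r} : \tilde{r} \ge r} s(\tilde{r}),
\qquad
s(r) = \frac{1}{|\mathcal{D}(r)|} \sum_{i \in \mathcal{D}(r)} \frac{1 + \cos \Delta_{\theta}^{(i)}}{2} \, \delta_i
```

Here r is the detection recall, D(r) is the set of all detections at recall r, Δθ(i) is the difference between the estimated and ground-truth orientation of detection i, and δ_i is 1 if detection i has been assigned to a ground-truth bounding box and 0 otherwise. A perfect orientation estimate contributes 1 and an estimate that is off by 180 degrees contributes 0.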

For the performance evaluation, accuracy was compared with the following state-of-the-art methods: (1) DPM-VOC+VP, which extends the deformable part model to handle different viewpoints; (2) Mono3D, which uses a CNN to detect 3D objects from a single monocular image; (3) SubCNN, based on a subcategory-aware convolutional neural network; (4) FRCNN, which performs viewpoint inference on top of a highly optimized CNN-based detection framework; and (5) the proposed student model.

FIG. 13 summarizes the experimental results for five CNN-based methods, including the classification method based on the teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest according to an embodiment of the present invention. As shown in FIG. 13, in the experiments on the KITTI data set, DPM-VOC+VP and FRCNN showed lower AOS than the other three methods when the input image was small and the pedestrian outline was blurred. SubCNN, by contrast, uses an image pyramid to detect small pedestrians and therefore showed better AOS than the other methods. Although SubCNN achieved a relatively good AOS on the KITTI data set, its network is deeper and wider than the student model proposed in the present invention, and it has the drawback of requiring additional networks for region proposal and object detection. The teacher-student framework according to an embodiment of the present invention, on the other hand, improves the AOS by applying the teacher-student structure, and its two compressed classifiers (the student network and the student random forest) compensate for each other's shortcomings, giving good performance on the easy, moderate, and hard data of the KITTI data set.

FIG. 14 shows POE classification results on (a) the TUD and (b) the KITTI data sets obtained with the teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest according to an embodiment of the present invention. As shown in FIG. 14, the student model proposed in the present invention predicts the pedestrian's orientation correctly even when the pedestrian's body is distorted or partially occluded by another pedestrian, and even when the image is blurred.

As described above, in the present invention the teacher model combines the output of the teacher deep network with the output of the teacher random forest to generate a probability value for each class, and these soft target values are used as input to train the student model. Combining two different classification models not only reduces the model size but also yields a student model that mimics the classification behavior of the teacher model. Furthermore, unlike existing CNN-based classification approaches, the present invention selects a new soft output by combining the outputs of the teacher deep network and the teacher random forest, and constructs a student model with performance comparable to that of the teacher model.
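
The generation and use of soft targets summarized above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the equal-weight averaging of the two teacher outputs, the temperature value, the cross-entropy form of the student loss, and all function names are hypothetical choices, since this part of the description does not give the combination rule or the loss equation.

```python
import numpy as np


def softened_softmax(logits, temperature=2.0):
    """Softmax with temperature; larger temperatures give a softer distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def soft_targets(network_logits, forest_probs, temperature=2.0):
    """Combine teacher network and teacher forest outputs into soft targets.

    Equal-weight averaging is an assumption made for this sketch.
    """
    p_net = softened_softmax(network_logits, temperature)
    p = 0.5 * (p_net + np.asarray(forest_probs, dtype=float))
    # Renormalize in case the forest probabilities do not sum exactly to one.
    return p / p.sum(axis=-1, keepdims=True)


def soft_cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Mean cross-entropy between teacher soft targets and student posteriors."""
    t = np.asarray(teacher_probs, dtype=float)
    s = np.asarray(student_probs, dtype=float)
    return float(-(t * np.log(s + eps)).sum(axis=-1).mean())
```

For each unlabeled sample in data set B, `soft_targets` would produce one row of B^*, and the student would be trained to lower `soft_cross_entropy` against those rows.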

The present invention described above may be modified or applied in various ways by those of ordinary skill in the art to which it pertains, and the scope of the technical idea of the present invention is to be determined by the claims below.

100: teacher learning module
110: first teacher learning unit
120: second teacher learning unit
200: soft target data generation module
300: student learning module
310: first student learning unit
320: second student learning unit
400: classification module
500: preprocessing module
S10: performing preprocessing on the input image by applying a wavelet transform
S100: training the teacher model using data set A
S110: training the teacher deep network using data set A
S120: training the teacher random forest using the feature map of the teacher deep network
S200: inputting data set B into the teacher deep network and the teacher random forest, and combining the two outputs to generate the soft target data set B^*
S300: training the student model using the data set B^*
S310: training the student network using the data set B^*
S320: training the student random forest using the data set B^*
S400: performing classification by combining the two outputs of the trained student network and student random forest

Claims

A teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest, the framework comprising:
a teacher learning module (100) that trains a teacher model composed of a teacher deep network and a teacher random forest using data set A;
a soft target data generation module (200) that inputs data set B into the teacher deep network and the teacher random forest trained by the teacher learning module (100) and combines the two outputs to generate a soft target data set B^*;
a student learning module (300) that trains a student model composed of a student network and a student random forest using the data set B^* generated by the soft target data generation module (200); and
a classification module (400) that performs classification by combining the two outputs of the student network and the student random forest trained by the student learning module (300).

The teacher-student framework of claim 1, wherein the data set A
is a hard target data set that includes class labels.

The teacher-student framework of claim 1, wherein the data set B
is a data set that does not include class labels.

The teacher-student framework of claim 1, wherein the teacher learning module (100) comprises:
a first teacher learning unit (110) that trains the teacher deep network using the data set A; and
a second teacher learning unit (120) that trains the teacher random forest using the feature map of the teacher deep network.

The teacher-student framework of claim 1, wherein the teacher deep network
applies a softened softmax function to obtain a soft target output, which is the probability value of each class.

The teacher-student framework of claim 1,
further comprising a preprocessing module (500) that performs preprocessing on the input image by applying a wavelet transform.

The teacher-student framework of claim 1, wherein the student learning module (300) comprises:
a first student learning unit (310) that trains the student network using the data set B^* generated by the soft target data generation module (200); and
a second student learning unit (320) that trains the student random forest using the data set B^* generated by the soft target data generation module (200).

The teacher-student framework of claim 7, wherein the first student learning unit (310) trains the student network by performing the steps of:
(3-1-1) initializing the parameters (W_S) of the student network;
(3-1-2) inputting the data set B^* into a pre-trained network;
(3-1-3) calculating a loss function L(W_S);
(3-1-4)

updating to W_S^*; and
(3-1-5) selecting the optimal parameters W_S^* for the student network.

The teacher-student framework of claim 8, wherein in step (3-1-3)
the loss function is calculated using the following equation:

where N is the number of samples in the data set B^*, C is the number of classes, and P_T(x_i|c_j) and P_S(x_i|c_j) are the posterior class probabilities of the teacher and the student, respectively, for the input vector x_i.

The teacher-student framework of claim 7, wherein the second student learning unit (320) trains the student random forest by repeating the following steps until T random decision trees are constructed:
(3-2-1) copying the t-th tree structure (T-RF_t) of the teacher random forest to the t-th tree structure (S-RF_t) of the student random forest to perform transfer learning;
(3-2-2) inputting the data set B^*, having an input vector v, into one of the decision trees of S-RF_t;
(3-2-3) generating a split function f(v) at node O;
(3-2-4) calculating an information gain ΔE at node O;
(3-2-5) calculating a cross-entropy Tr(Te, S)_t; and
(3-2-6) storing the current S-RF_t if the calculated cross-entropy is less than the minimum threshold for stopping boosting, and otherwise repeating from step (3-2-2).

A classification method based on a teacher-student framework for lightening an ensemble classifier that combines a deep network and a random forest, the method comprising:
(1) training a teacher model composed of a teacher deep network and a teacher random forest using data set A;
(2) inputting data set B into the teacher deep network and the teacher random forest trained in step (1), and combining the two outputs to generate a soft target data set B^*;
(3) training a student model composed of a student network and a student random forest using the data set B^* generated in step (2); and
(4) performing classification by combining the two outputs of the student network and the student random forest trained in step (3).

The method of claim 11, wherein the data set A
is a hard target data set that includes class labels.

The method of claim 11, wherein the data set B
is a data set that does not include class labels.

The method of claim 11, wherein step (1) comprises the steps of:
(1-1) training the deep teacher network by using the data set A; and
(1-2) training the teacher random forest by using a feature map of the deep teacher network.

The method of claim 11, wherein the deep teacher network applies a softened softmax function to obtain a soft target output, which is a probability value for each class.
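A softened softmax can be illustrated with the temperature-scaled form commonly used in knowledge distillation. The claim does not give the formula, so the `temperature` parameter and the example logits below are assumptions for this sketch.

```python
import math


def softened_softmax(logits, temperature=1.0):
    """Class probabilities from logits divided by a temperature before softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0, 0.0]  # hypothetical teacher outputs for three classes
hard = softened_softmax(logits, temperature=1.0)  # ordinary softmax
soft = softened_softmax(logits, temperature=4.0)  # flatter distribution
```

A higher temperature flattens the distribution, so the less likely classes keep more probability mass while the class ranking stays the same.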

The method of claim 11, further comprising the step of:
(0) performing pre-processing on an input image by applying a wavelet transform.
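As an illustration of wavelet pre-processing, the sketch below applies a single-level 2D Haar transform to a small image. The claim does not name the wavelet or the decomposition level, so the Haar basis, the averaging normalisation, and the function names are assumptions.

```python
def haar_1d(values):
    """One Haar level on an even-length sequence: (averages, differences)."""
    avg = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    diff = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return avg, diff


def _transform_columns(matrix):
    """Apply haar_1d down every column and return (low, high) matrices."""
    pairs = [haar_1d(list(col)) for col in zip(*matrix)]
    low_cols = [p[0] for p in pairs]
    high_cols = [p[1] for p in pairs]
    # Transpose the per-column results back into row-major matrices.
    return [list(r) for r in zip(*low_cols)], [list(r) for r in zip(*high_cols)]


def haar_2d(image):
    """Single-level 2D Haar transform of an image with even height and width."""
    lows, highs = [], []
    for row in image:
        avg, diff = haar_1d(row)
        lows.append(avg)
        highs.append(diff)
    ll, lh = _transform_columns(lows)    # approximation, row-direction detail
    hl, hh = _transform_columns(highs)   # column-direction detail, diagonal
    return ll, lh, hl, hh


ll, lh, hl, hh = haar_2d([[1, 2], [3, 4]])
flat_ll, flat_lh, flat_hl, flat_hh = haar_2d([[5, 5, 5, 5]] * 4)
```

The approximation band `ll` is a half-resolution version of the image, while the three detail bands are zero wherever the image is locally constant.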

The method of claim 11, wherein step (3) comprises the steps of:
(3-1) training the student network by using the data set B* generated in step (2); and
(3-2) training the student random forest by using the data set B* generated in step (2).

The method of claim 17, wherein in step (3-1) the student network is trained by performing the steps of:
(3-1-1) initializing the parameter W_S of the student network;
(3-1-2) inputting the data set B* into a pre-trained network;
(3-1-3) calculating a loss function L(W_S);
(3-1-4) updating the parameter to W_S*; and
(3-1-5) selecting the optimal parameter W_S* for the student network.
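The training loop of steps (3-1-1) to (3-1-5) can be sketched with a tiny linear softmax student. The loss equation and update rule are not reproduced in this text, so the cross-entropy loss against the soft targets, plain gradient descent, the linear model, and the toy data are all assumptions.

```python
import math
import random


def softmax(z):
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Linear student: one weight row per class, followed by softmax."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def loss(weights, data):
    """Mean cross-entropy between soft targets and student outputs (assumed L(W_S))."""
    total = 0.0
    for x, target in data:
        probs = predict(weights, x)
        total -= sum(t * math.log(max(p, 1e-12)) for t, p in zip(target, probs))
    return total / len(data)


def train_student(data, n_classes, n_features, epochs=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    # (3-1-1) initialise W_S with small random values
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
               for _ in range(n_classes)]
    best_w, best_loss = [row[:] for row in weights], loss(weights, data)
    for _ in range(epochs):
        # (3-1-2), (3-1-3): forward pass over the soft targets and gradient of the loss
        grad = [[0.0] * n_features for _ in range(n_classes)]
        for x, target in data:
            probs = predict(weights, x)
            for c in range(n_classes):
                for j in range(n_features):
                    grad[c][j] += (probs[c] - target[c]) * x[j] / len(data)
        # (3-1-4): gradient-descent update of W_S
        for c in range(n_classes):
            for j in range(n_features):
                weights[c][j] -= lr * grad[c][j]
        # (3-1-5): keep the parameters with the lowest loss seen so far
        current = loss(weights, data)
        if current < best_loss:
            best_w, best_loss = [row[:] for row in weights], current
    return best_w, best_loss


# Hypothetical soft target set: (input vector, teacher probabilities).
soft_data = [([1.0, 0.0], [0.9, 0.1]), ([0.0, 1.0], [0.2, 0.8])]
initial_loss = loss([[0.0, 0.0], [0.0, 0.0]], soft_data)
w_star, final_loss = train_student(soft_data, n_classes=2, n_features=2)
```

Because the targets are probability vectors rather than one-hot labels, the lowest reachable loss is the entropy of the targets, not zero.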

The method of claim 18, wherein in step (3-1-3) the loss function is calculated using the following equation.

The method of claim 17, wherein in step (3-2) the student random forest is trained by repeating the following steps until T random decision trees are constructed:
(3-2-1) copying the t-th tree structure (T-RF_t) of the teacher random forest to the t-th tree structure (S-RF_t) of the student random forest to perform transfer learning;
(3-2-2) inputting the data set B* having an input vector v into one of the decision trees of S-RF_t;
(3-2-3) generating a split function f(v) at node O;
(3-2-4) calculating an information gain ΔE at node O;
(3-2-5) calculating the cross-entropy Tr(Te, S)_t; and
(3-2-6) storing the current S-RF_t if the calculated cross-entropy is less than the minimum threshold for stopping boosting, and otherwise re-performing from step (3-2-2).
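The two quantities computed in steps (3-2-4) and (3-2-5) can be illustrated as follows. The claims do not give their exact definitions, so the entropy-based information gain over soft targets, the threshold split f(v), and the function names are assumptions for this sketch.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a class-probability vector."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def mean_distribution(targets):
    """Average the soft targets that reach a node into one class distribution."""
    n = len(targets)
    return [sum(col) / n for col in zip(*targets)]


def information_gain(samples, feature, threshold):
    """ΔE for the assumed split f(v): v[feature] < threshold goes left."""
    left = [t for v, t in samples if v[feature] < threshold]
    right = [t for v, t in samples if v[feature] >= threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing gains nothing
    parent = entropy(mean_distribution([t for _, t in samples]))
    n = len(samples)
    children = (len(left) / n) * entropy(mean_distribution(left)) \
        + (len(right) / n) * entropy(mean_distribution(right))
    return parent - children


def cross_entropy(teacher, student, eps=1e-12):
    """Cross-entropy (in nats) between teacher and student class distributions."""
    return -sum(t * math.log(max(s, eps)) for t, s in zip(teacher, student))


# Hypothetical node samples: (input vector v, soft target).
node = [([0.1], [1.0, 0.0]), ([0.2], [1.0, 0.0]),
        ([0.8], [0.0, 1.0]), ([0.9], [0.0, 1.0])]
gain = information_gain(node, feature=0, threshold=0.5)
matched = cross_entropy([0.9, 0.1], [0.9, 0.1])
mismatched = cross_entropy([0.9, 0.1], [0.5, 0.5])
```

The stopping test of step (3-2-6) would then compare a cross-entropy value like `mismatched` against the chosen minimum threshold; the cross-entropy is smallest when the student distribution equals the teacher distribution.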