KR102531763B1

KR102531763B1 - Method and Apparatus for Recovery of no-Fine-Tuning Training and Data for Lightweight Neural Networks using Network Pruning Techniques

Info

Publication number: KR102531763B1
Application number: KR1020210112340A
Authority: KR
Inventors: 최동완; 이건호; 김민수
Original assignee: 인하대학교 산학협력단
Priority date: 2021-08-25
Filing date: 2021-08-25
Publication date: 2023-05-11
Anticipated expiration: 2041-08-25
Also published as: KR102531763B9; KR20230030290A

Abstract

A performance recovery method and apparatus that require neither data nor fine-tuning for a neural network compressed by network pruning are presented. The proposed method includes: representing a damaged filter using the linear relationship between the filters removed from a layer of a pretrained convolutional neural network and the filters that are preserved; defining a loss function for the damage caused by the removed filters, taking the batch normalization layer and the activation layer into account; and restoring the performance of the pruned network by expressing the coefficient values that minimize the loss function as elements of a transfer matrix and performing the operation between the filters of the pretrained convolutional neural network and the pruning matrix.

Description

Method and Apparatus for Recovery of no-Fine-Tuning Training and Data for Lightweight Neural Networks using Network Pruning Techniques

The present invention relates to a performance recovery method and apparatus that require neither data nor fine-tuning for a neural network compressed by network pruning.

Network pruning is an actively studied approach to compressing overfitted neural networks. Its goal is to reduce the number of parameters and shorten inference time by removing parameters while preserving the performance of the pretrained convolutional neural network as far as possible. Network pruning is broadly divided into two types: weight pruning (unstructured pruning) and filter pruning. Weight pruning makes the network sparse by replacing individual parameters with '0' without changing the network structure. It is effective at reducing the number of parameters, but because the structure is unchanged, special hardware and software are needed to shorten inference time. Filter pruning, by contrast, removes whole filters and therefore changes the network structure, so inference time can be reduced without additional hardware or software.
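The difference between the two pruning types can be illustrated with a minimal NumPy sketch. The tensor shape, the median-magnitude threshold, and the L1-norm ranking below are illustrative assumptions; the source does not prescribe a selection criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical convolution weights: 4 filters, 3 input channels, 3x3 kernels.
W = rng.standard_normal((4, 3, 3, 3))

# Weight (unstructured) pruning: zero out the smaller-magnitude half of the weights.
# The tensor keeps its shape, so the layer structure is unchanged.
threshold = np.median(np.abs(W))
W_weight_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

# Filter pruning: drop the two filters with the smallest L1 norm (assumed criterion).
# The tensor gets smaller, so the layer structure changes.
l1_norms = np.abs(W).sum(axis=(1, 2, 3))
keep = np.sort(np.argsort(l1_norms)[2:])
W_filter_pruned = W[keep]

print(W_weight_pruned.shape, W_filter_pruned.shape)
```

Only the second variant yields a smaller dense tensor, which is why it speeds up inference on ordinary hardware.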

In general, filter pruning takes one of two forms: iterative pruning, which repeatedly removes filters while an untrained network is being trained, and one-shot pruning, which prunes a pretrained convolutional neural network and then recovers performance through retraining. Both, however, require sufficient data and hardware for training, and the training process is time-consuming. The following prior art has been proposed to address these problems.

Prior art [1] proposed a method that, in order to recover the performance of the compressed network with little training, uses the original data to greedily select the filters to remove in each layer based on the reconstruction error. However, prior art [1] still requires data to compute the reconstruction error and the entire original dataset for retraining. Prior art [2] proposed selecting the filters to remove globally across the network using a Kullback-Leibler divergence criterion computed from a small amount of data, and retraining the compressed network with knowledge distillation so that only a small amount of training is needed. Even so, it does not remove the need for data and retraining. To recover performance after pruning a pretrained convolutional neural network without data or retraining, the one-to-one methods of the prior art [3,4] find, for each removed filter, the single most similar preserved filter, multiply the removed filter by a scale, and add it to that preserved filter, thereby compensating for the performance loss caused by the removed filter.
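The selection step of a one-to-one method can be sketched as follows. This is only an illustration: the helper name `most_similar_preserved`, the use of cosine similarity, and the norm-ratio scale are assumptions, since the source does not state which similarity measure the prior art [3,4] uses.

```python
import numpy as np

def most_similar_preserved(W, removed, preserved):
    """Return (index, scale) of the preserved filter closest to W[removed].

    Similarity is cosine similarity on flattened filters (an assumption);
    scale is the norm ratio, so that W[removed] is roughly scale * W[index].
    """
    flat = W.reshape(W.shape[0], -1)
    target = flat[removed]
    candidates = flat[preserved]
    cosine = candidates @ target / (
        np.linalg.norm(candidates, axis=1) * np.linalg.norm(target)
    )
    best = int(np.argmax(cosine))
    scale = np.linalg.norm(target) / np.linalg.norm(candidates[best])
    return preserved[best], scale

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3, 3, 3))
W[3] = 2.0 * W[1]  # make filter 3 an exact multiple of filter 1
idx, scale = most_similar_preserved(W, removed=3, preserved=[0, 1, 2])
print(idx, scale)
```

When no preserved filter is close to the removed one, the best candidate is still returned, which is the weakness the source attributes to similarity-based recovery.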

The technical problem addressed by the present invention is that prior-art methods, which restore a compressed neural network based on similarity, may select dissimilar filters in the later layers of the network. To solve this, the invention provides a method and apparatus that restore the performance of the compressed network based on a linear combination relating each removed filter to the preserved filters, rather than selecting the single filter most similar to the removed one.

In one aspect, the performance recovery method proposed in the present invention, which requires neither data nor fine-tuning for a neural network compressed by network pruning, includes: representing a damaged filter using the linear relationship between the filters removed from a layer of a pretrained convolutional neural network and the filters that are preserved; defining a loss function for the damage caused by the removed filters, taking the batch normalization layer and the activation layer into account; and restoring the performance of the pruned network by expressing the coefficient values that minimize the loss function as elements of a transfer matrix and performing the operation between the filters of the pretrained convolutional neural network and the pruning matrix.

The step of representing the damaged filter transforms the pruning matrix used in filter pruning of the pretrained convolutional neural network into a transfer matrix, so that the information of the removed filters is inherited by the preserved filters, and represents each removed filter as a linear combination of the preserved filters.
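Representing a removed filter as a linear combination of preserved filters can be sketched with an ordinary least-squares fit on the flattened weights. This is a simplified illustration under stated assumptions: it minimizes only the residual between the removed filter and its approximation, and omits the batch normalization and activation terms that the source's loss function also includes. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 3, 3
W = rng.standard_normal((m, n, k, k))  # hypothetical pretrained filters
removed, preserved = 5, [0, 1, 2, 3, 4]

# Columns of P are the flattened preserved filters; f is the flattened removed filter.
P = W[preserved].reshape(len(preserved), -1).T  # shape (n*k*k, t)
f = W[removed].reshape(-1)                      # shape (n*k*k,)

# Coefficients of the linear combination that best approximates f, using weights only.
coeffs, *_ = np.linalg.lstsq(P, f, rcond=None)
residual = f - P @ coeffs

print(coeffs.shape, np.linalg.norm(residual))
```

No input data appears anywhere in the fit, which is the sense in which the coefficients are obtained without data.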

The step of defining the loss function approximates the activation map by the feature map so that the loss function can be minimized. It then defines the loss function by expressing, with respect to the residual error, the activation map produced by the batch normalization and activation layers as the value obtained when the feature map passes through the batch normalization layer, and finds the coefficient values that minimize the residual error, the batch normalization error, and the activation error of the loss function.

The step of restoring performance multiplies the removed filter by the coefficient values that minimize the loss function and adds the result to the preserved filters, thereby finding the transfer matrix that minimizes the reconstruction error between the pretrained convolutional neural network and the compressed network. It then expresses those coefficient values as elements of the transfer matrix and performs a 2-mode product between the filters of the pretrained convolutional neural network and the pruning matrix, restoring the performance of the pruned network without data or training.

In another aspect, the performance recovery apparatus proposed in the present invention, which requires neither data nor fine-tuning for a neural network compressed by network pruning, includes: a pruning modeling unit that represents a damaged filter using the linear relationship between the filters removed from a layer of a pretrained convolutional neural network and the filters that are preserved, and defines a loss function for the damage caused by the removed filters, taking the batch normalization layer and the activation layer into account; and a reconstruction modeling unit that restores the performance of the pruned network by expressing the coefficient values that minimize the loss function as elements of a transfer matrix and performing the operation between the filters of the pretrained convolutional neural network and the pruning matrix.

The method and apparatus according to embodiments of the present invention restore a neural network compressed by filter pruning to the performance of the original network without data or training. Because recovery is based on a linear combination relating each removed filter to the preserved filters, rather than on selecting the single filter most similar to the removed one, the invention solves the problem that prior-art similarity-based recovery may select dissimilar filters in the later layers of the network.

FIG. 1 is a diagram showing the layers of a deep neural network according to the prior art.
FIG. 2 is a diagram showing the configuration of a performance recovery apparatus that requires neither data nor fine-tuning for a neural network compressed by network pruning, according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a performance recovery method that requires neither data nor fine-tuning for a neural network compressed by network pruning, according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining a pruning matrix and a transfer matrix according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an algorithm for restoring the performance of a neural network compressed by pruning without data or training, according to an embodiment of the present invention.
FIG. 6 is a diagram showing the residual error (RE), batch normalization error (BE), and weighted average reconstruction error (WARE) of the performance recovery method for a convolutional neural network according to an embodiment of the present invention, compared with NM.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram showing the layers of a deep neural network according to the prior art.

To restore the performance of a neural network compressed by pruning without data or training, the prior art [3,4] recovered performance using the similarity between removed and preserved filters. However, as shown in FIG. 1 [4], recently developed deep neural networks have many layers, and the similarity between filters decreases as the layers get deeper. Similarity-based recovery of a compressed network can therefore select dissimilar filters in the later layers of the network [3,4]. To address this, NM [4] proposed selecting similar filters with a distance-based measure that takes the network's batch normalization and activation function into account. This method is still similarity-based, however, so a dissimilar filter may still be selected by mistake.

The present invention therefore restores the performance of the compressed network based on a linear combination relating each removed filter to the preserved filters, rather than selecting the single filter most similar to the removed one. The invention proposes a method and apparatus for restoring a neural network compressed by filter pruning to the performance of the original network without data or training.

FIG. 2 is a diagram showing the configuration of a performance recovery apparatus that requires neither data nor fine-tuning for a neural network compressed by network pruning, according to an embodiment of the present invention.

The proposed performance recovery apparatus includes a pruning modeling unit 210 and a reconstruction modeling unit 220.

The performance recovery apparatus according to an embodiment of the present invention is briefly described with reference to FIG. 2. The proposed apparatus can recover, without data or training, the performance degradation that pruning causes in the compressed network. The degradation arises as follows. As shown in FIG. 2, in the pruning modeling unit (that is, the pruned model) 210, removing filters from the $\ell$-th layer of the pretrained convolutional neural network removes the corresponding output channels of the $\ell$-th layer, which in turn removes the corresponding channels of the filters in the next, $(\ell+1)$-th, layer, so the output is damaged. These errors accumulate layer by layer and ultimately degrade the performance of the compressed network. To recover from this degradation, the present invention uses the linear relationship between the removed and preserved filters and defines a loss function that accounts for the batch normalization layer and the activation layer. It then finds the coefficient values that minimize this loss function, multiplies the removed filter by them, and adds the result to the preserved filters, so that the reconstruction error between the pretrained convolutional neural network and the compressed network is minimized. Regarding the reconstruction modeling unit (that is, the restored model) 220 in FIG. 2 in more detail, the present invention first expresses filter pruning in a convolutional neural network using the N-mode product [5], formulates the task as minimizing the reconstruction error between the original and pruned networks, and proposes a method for restoring the pruned network without data or training.

The pruning modeling unit 210 according to an embodiment of the present invention represents a damaged filter using the linear relationship between the filters removed from a layer of the pretrained convolutional neural network and the filters that are preserved, and defines a loss function for the damage caused by the removed filters, taking the batch normalization layer and the activation layer into account. The pruning modeling unit 210 transforms the pruning matrix used in filter pruning of the pretrained convolutional neural network into a transfer matrix, so that the information of the removed filters is inherited by the preserved filters, and can represent each removed filter as a linear combination of the preserved filters.

To minimize the loss function, the pruning modeling unit 210 according to an embodiment of the present invention approximates the activation map by the feature map (221). It then defines the loss function by expressing, with respect to the residual error, the activation map produced by the batch normalization and activation layers as the value obtained when the feature map passes through the batch normalization layer, and can find the coefficient values that minimize the residual error, the batch normalization error, and the activation error of the loss function.

The reconstruction modeling unit 220 restores the performance of the pruned network by expressing the coefficient values that minimize the loss function as elements of the transfer matrix and performing the operation between the filters of the pretrained convolutional neural network and the pruning matrix.

The reconstruction modeling unit 220 according to an embodiment of the present invention multiplies the removed filter by the coefficient values that minimize the loss function and adds the result to the preserved filters, thereby finding the transfer matrix that minimizes the reconstruction error between the pretrained convolutional neural network and the compressed network. It then expresses those coefficient values as elements of the transfer matrix and performs a 2-mode product between the filters of the pretrained convolutional neural network and the pruning matrix, so that the performance of the pruned network can be restored without data or training.

FIG. 3 is a flowchart illustrating a performance recovery method that requires neither data nor fine-tuning for a neural network compressed by network pruning, according to an embodiment of the present invention.

The proposed performance recovery method includes: a step 310 of representing a damaged filter using the linear relationship between the filters removed from a layer of a pretrained convolutional neural network and the filters that are preserved; a step 320 of defining a loss function for the damage caused by the removed filters, taking the batch normalization layer and the activation layer into account; and a step 330 of restoring the performance of the pruned network by expressing the coefficient values that minimize the loss function as elements of a transfer matrix and performing the operation between the filters of the pretrained convolutional neural network and the pruning matrix.

In step 310, a damaged filter is represented using the linear relationship between the filters removed from a layer of the pretrained convolutional neural network and the filters that are preserved.

In step 310 according to an embodiment of the present invention, the pruning modeling unit transforms the pruning matrix used in filter pruning of the pretrained convolutional neural network into a transfer matrix, so that the information of the removed filters is inherited by the preserved filters, and can represent each removed filter as a linear combination of the preserved filters.

In step 320, a loss function is defined for the damage caused by the removed filters, taking the batch normalization layer and the activation layer into account.

In step 320 according to an embodiment of the present invention, the pruning modeling unit approximates the activation map by the feature map so that the loss function can be minimized. It then defines the loss function by expressing, with respect to the residual error, the activation map produced by the batch normalization and activation layers as the value obtained when the feature map passes through the batch normalization layer, and can find the coefficient values that minimize the residual error, the batch normalization error, and the activation error of the loss function.

In step 330, the performance of the pruned network is restored by expressing the coefficient values that minimize the loss function as elements of the transfer matrix and performing the operation between the filters of the pretrained convolutional neural network and the pruning matrix.

In step 330 according to an embodiment of the present invention, the reconstruction modeling unit multiplies the removed filter by the coefficient values that minimize the loss function and adds the result to the preserved filters, thereby finding the transfer matrix that minimizes the reconstruction error between the pretrained convolutional neural network and the compressed network. It then expresses those coefficient values as elements of the transfer matrix and performs a 2-mode product between the filters of the pretrained convolutional neural network and the pruning matrix, so that the performance of the pruned network can be restored without data or training.

FIG. 4 is a diagram for explaining a pruning matrix and a transfer matrix according to an embodiment of the present invention.

FIG. 4(a) shows a pruning matrix according to an embodiment of the present invention, FIG. 4(b) shows the transfer matrix for LBYL, and FIG. 4(c) shows the transfer matrix for the one-to-one method.
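The three matrices of FIG. 4 can be sketched as follows. The figure itself is not reproduced in this text, so the layout is an assumption: one row per original filter, one column per preserved filter, a single 1 per preserved row, and the removed filter's row holding either zeros (pruning), a full coefficient vector (linear combination), or a single scale (one-to-one). The helper names and the numeric values are hypothetical.

```python
import numpy as np

def pruning_matrix(m, preserved):
    """m x t matrix with a single 1 in each preserved filter's row; removed rows are zero."""
    S = np.zeros((m, len(preserved)))
    for col, row in enumerate(preserved):
        S[row, col] = 1.0
    return S

def transfer_matrix(m, preserved, removed_coeffs):
    """Pruning matrix whose removed rows are filled with coefficient vectors of length t."""
    S = pruning_matrix(m, preserved)
    for row, coeffs in removed_coeffs.items():
        S[row, :] = coeffs
    return S

preserved = [0, 1, 3]  # filter 2 is removed
S_prune = pruning_matrix(4, preserved)
S_linear = transfer_matrix(4, preserved, {2: [0.5, -0.2, 0.7]})      # linear combination
S_one_to_one = transfer_matrix(4, preserved, {2: [0.0, 1.3, 0.0]})   # one similar filter

print(S_prune, S_linear, S_one_to_one, sep="\n\n")
```

Seen this way, the one-to-one transfer matrix is the special case of the linear-combination matrix in which all but one coefficient are zero.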

Before filter pruning in a pretrained convolutional neural network according to an embodiment of the present invention is described, note that the result of the $\ell$-th operation of a convolutional neural network with $L$ layers is:

$$Z^{(\ell)} = A^{(\ell-1)} \circledast W^{(\ell)}, \qquad A^{(\ell)} = \mathcal{F}\big(\mathcal{N}(Z^{(\ell)})\big)$$

Here, $\circledast$ denotes the convolution operation, $A^{(\ell-1)}$ is the output of the $(\ell-1)$-th layer and also the input activation map of the $\ell$-th layer, and $W^{(\ell)} \in \mathbb{R}^{m \times n \times k \times k}$ is the convolution filter. $Z^{(\ell)}$ and $A^{(\ell)} \in \mathbb{R}^{m \times w \times h}$ denote the feature map and the activation map, respectively. Here $m$ is the number of filters, $n$ is the number of channels of the previous layer's activation map, $w \times h$ is the size of the activation map, and $k \times k$ is the size of the filter. $\mathcal{F}$ is the activation layer and $\mathcal{N}$ is the batch normalization layer.

Performing filter pruning in the $\ell$-th layer changes the structure of the pretrained convolutional neural network and hence the dimensions of the convolution filter, giving the damaged filter $\tilde{W}^{(\ell)} \in \mathbb{R}^{t \times n \times k \times k}$, feature map $\tilde{Z}^{(\ell)} \in \mathbb{R}^{t \times w \times h}$, and activation map $\tilde{A}^{(\ell)} \in \mathbb{R}^{t \times w \times h}$. Here $t$ is the number of remaining filters, with $t < m$. As shown in FIG. 4(a), this can be expressed using the 1-mode product of the pretrained filter $W^{(\ell)}$ and a pruning matrix $S \in \mathbb{R}^{m \times t}$:

$$\tilde{W}^{(\ell)} = W^{(\ell)} \times_1 S^{\top}$$

Here, each $i_k$-th filter is not removed, and the other $(m-t)$ filters are removed completely.
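The 1-mode product with a pruning matrix can be sketched in NumPy as below. The helper `mode_product` is a hypothetical name, written with the common convention that the matrix's columns are contracted against the chosen tensor axis; the shapes and the preserved indices are illustrative assumptions.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """n-mode product: contract tensor axis `mode` (0-based) with the columns of `matrix`."""
    moved = np.tensordot(matrix, tensor, axes=(1, mode))  # new axis lands at position 0
    return np.moveaxis(moved, 0, mode)

rng = np.random.default_rng(0)
m, n, k = 4, 3, 3
W = rng.standard_normal((m, n, k, k))  # hypothetical pretrained filters of one layer
preserved = [0, 1, 3]                  # indices of the filters that are kept

# Pruning matrix: one column per preserved filter, a single 1 in that filter's row.
S = np.zeros((m, len(preserved)))
S[preserved, np.arange(len(preserved))] = 1.0

# 1-mode product (axis 0 in 0-based indexing) selects the preserved filters.
W_pruned = mode_product(W, S.T, mode=0)
print(W_pruned.shape)
```

With a 0/1 pruning matrix the product is equivalent to plain indexing, `W[preserved]`; the matrix form matters once the zero rows are replaced by coefficients.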

Removing filters from the $\ell$-th layer of the network removes the corresponding output channels of the $\ell$-th layer, which in turn removes the corresponding channels of the $(\ell+1)$-th layer's filters, so the output is damaged. That is, the $(\ell+1)$-th filter $W^{(\ell+1)} \in \mathbb{R}^{m' \times m \times k' \times k'}$ is changed to $\tilde{W}^{(\ell+1)} \in \mathbb{R}^{m' \times t \times k' \times k'}$. Here $m'$ is the number of filters in the $(\ell+1)$-th layer and $k' \times k'$ is the size of the filter. Because the dimensions of the filters change but their number does not, the output of the $(\ell+1)$-th layer has the same dimensions as the output of the pretrained convolutional neural network. However, the errors caused by the removed filter channels accumulate layer by layer and ultimately degrade the performance of the compressed network. The damaged filter $\tilde{W}^{(\ell+1)}$ can be expressed as the 2-mode product of the pretrained filter $W^{(\ell+1)}$ and the pruning matrix $S$:

$$\tilde{Z}^{(\ell+1)} = \tilde{A}^{(\ell)} \circledast \tilde{W}^{(\ell+1)} = \tilde{A}^{(\ell)} \circledast \big(W^{(\ell+1)} \times_2 S^{\top}\big)$$

Here, $\tilde{Z}^{(\ell+1)}$ denotes the damaged feature map and $Z^{(\ell+1)}$ is the feature map of the pretrained convolutional neural network.
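The effect of the 2-mode product on the following layer can be sketched with `numpy.einsum`. The shapes and preserved indices are illustrative assumptions; the einsum contracts the input-channel axis of the next layer's filters with the rows of the pruning matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
m_next, m, k = 5, 4, 3
# Hypothetical filters of the following layer: m_next filters, each with m input channels.
W_next = rng.standard_normal((m_next, m, k, k))

preserved = [0, 1, 3]  # filters kept in the previous layer
S = np.zeros((m, len(preserved)))
S[preserved, np.arange(len(preserved))] = 1.0

# 2-mode product: contract the input-channel axis (second axis) with the pruning matrix.
W_next_pruned = np.einsum("oikl,it->otkl", W_next, S)

print(W_next.shape, "->", W_next_pruned.shape)
```

The number of filters stays at `m_next` while each filter loses the channels fed by removed filters, which matches the observation that the output dimensions are unchanged but the values are damaged.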

In the embodiment of the present invention, the pruning matrix used in filter pruning, shown in FIG. 4(a), is transformed into the transfer matrix shown in FIG. 4(b), so that the information of each removed filter is inherited by the preserved filters. That is, when a filter of the ℓ-th layer is removed and the corresponding channel of the (ℓ+1)-th layer filters is removed in turn, the aim is to find a transfer matrix with which the output of the compressed neural network approximates the output of the pretrained convolutional neural network, thereby recovering the lost performance.

Accordingly, the objective of the present invention is to find the transfer matrix that minimizes the reconstruction error between the corresponding feature maps of the pretrained convolutional neural network and the compressed neural network.

However, a feature map is an output that can be obtained only when data is available. The present invention therefore proposes a method for minimizing the reconstruction error between feature maps without data. To this end, it is assumed that each removed filter can be expressed as a linear combination of the preserved filters. The weights of the removed filter are multiplied by the resulting coefficient values and added to each of the preserved filters. For the case in which a single filter is removed, the transfer matrix is defined in terms of these coefficients.

Here, each coefficient is a scalar that indicates how strongly the corresponding preserved filter is related to the removed filter. FIG. 4(b) shows an example in which two filters are removed.
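The structure of such a transfer matrix can be sketched as follows for a case with two removed filters, in the spirit of FIG. 4(b). The (m, t) orientation, the helper name `transfer_matrix`, and the coefficient values are arbitrary illustrative assumptions. The sketch only shows how one-hot rows for preserved filters and coefficient rows for removed filters combine when the matrix is applied to the channels of the next layer.

```python
import numpy as np

def transfer_matrix(m, preserved, coeffs):
    """Build an (m, t) transfer matrix.

    preserved: list of the t preserved filter indices (column order).
    coeffs: dict mapping each removed filter index to its t coefficients.
    """
    S = np.zeros((m, len(preserved)))
    for col, idx in enumerate(preserved):
        S[idx, col] = 1.0                        # preserved filter: one-hot row
    for idx, s in coeffs.items():
        S[idx, :] = np.asarray(s, dtype=float)   # removed filter: coefficient row
    return S

# m = 5 filters; filters 1 and 3 are removed (made-up coefficients).
S = transfer_matrix(5, [0, 2, 4], {1: [0.5, 0.2, 0.0], 3: [0.0, -0.3, 0.7]})

# Next-layer weights with 2 filters, 5 input channels, and 1x1 kernels.
w_next = np.arange(10, dtype=float).reshape(2, 5, 1, 1)

# Each preserved channel receives its own weights plus the weighted
# weights of the removed channels.
w_next_new = np.einsum("nmhw,mt->nthw", w_next, S)
```

For the first next-layer filter, whose channel weights are 0, 1, 2, 3, 4, the compensated channels are 0 + 0.5·1 = 0.5, 2 + 0.2·1 − 0.3·3 = 1.3, and 4 + 0.7·3 = 6.1.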

One layer of a convolutional neural network consists of a convolutional filter, a batch normalization layer, and an activation layer. The composition of the reconstruction error is first examined without considering batch normalization and the activation layer. For simplicity, only a single channel of the feature map in the objective above is considered, in which case the feature map of the compressed neural network can be written out explicitly.

The reconstruction error between the pretrained convolutional neural network and the compressed neural network for a given feature map of a given layer can then be written explicitly. From that expression, the reconstruction error of the (ℓ+1)-th layer when a single filter is removed from the ℓ-th layer is obtained. On this basis, minimizing the objective reduces to minimizing a simpler expression.

Here, coefficients that satisfy the required condition must be found. Because batch normalization and the activation layer have not yet been taken into account, the activation map can simply be approximated by the feature map, and the above expression can be rewritten accordingly.

In the rewritten expression, one factor is a value that cannot be obtained without data, so coefficients must be found that minimize the remaining part. This remaining part is called the residual error (Residual Error).

The residual error can be minimized using only the parameters of the pretrained convolutional neural network.
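One way to see that such a residual depends only on the network parameters is the following sketch. It assumes, for illustration only, that the residual is measured as the squared Euclidean distance between the flattened removed filter and a linear combination of the flattened preserved filters, and it solves for the coefficients with ordinary least squares. The patent's own closed form also involves batch normalization, so this is a simplified stand-in rather than the claimed formula; the function name is hypothetical.

```python
import numpy as np

def residual_coefficients(filters, preserved, removed):
    """Least-squares coefficients expressing one removed filter by preserved ones.

    filters: array of shape (m, c, k, k) holding the filters of one layer.
    Returns the coefficient vector (length t) and the squared residual.
    """
    flat = filters.reshape(filters.shape[0], -1)   # one row per filter
    basis = flat[preserved].T                      # (d, t): preserved filters as columns
    target = flat[removed]                         # (d,): the removed filter
    s, *_ = np.linalg.lstsq(basis, target, rcond=None)
    residual = target - basis @ s
    return s, float(residual @ residual)

rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 3, 3, 3))
# Make filter 3 an exact linear combination of filters 0 and 1.
filters[3] = 2.0 * filters[0] - 1.0 * filters[1]

s, err = residual_coefficients(filters, [0, 1, 2], 3)
```

No input data appears anywhere in the computation: only the stored filter weights are used. In this constructed example the removed filter lies exactly in the span of the preserved filters, so the recovered coefficients are approximately 2, −1, and 0 and the residual is numerically zero.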

When the batch normalization layer is taken into account, the activation map is the value of the feature map after it has passed through the batch normalization layer, and the corresponding error can be expressed as stated in the following lemma.

Lemma 1.

Here, the remaining quantities are the batch normalization parameters.

With batch normalization taken into account, the earlier term is replaced by its batch-normalized counterpart. The second term of the above expression is called the batch normalization error (Batch Normalization Error).
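The appearance of a data-independent second term can be checked numerically with the sketch below. It assumes standard inference-time batch normalization, applied per channel as gamma * (z - mu) / sigma + beta with sigma the square root of the running variance plus epsilon. The split into a data-dependent term and a constant term is plain algebra on that formula and is offered as an illustration; the patent's own expression may be arranged differently, and all names are hypothetical.

```python
import numpy as np

def batch_norm(z, gamma, beta, mu, sigma):
    """Inference-time batch normalization as a per-channel affine map."""
    return gamma * (z - mu) / sigma + beta

def split_bn_difference(z, gamma, beta, mu, sigma, j, preserved, s):
    """Split A_j - sum_i s_i * A_i into a data term and a constant term.

    z: (m, n) pre-normalization feature values, one row per channel.
    j: index of the removed channel; preserved: indices of kept channels.
    """
    s = np.asarray(s, dtype=float)
    p = np.asarray(preserved)
    # Rescale each coefficient so that it acts on the pre-normalization maps.
    scale = gamma[p] * sigma[j] / (sigma[p] * gamma[j])
    data_term = gamma[j] / sigma[j] * (z[j] - (s * scale) @ z[p])
    # Constant offset contributed by batch normalization to each channel.
    shift = beta - gamma * mu / sigma
    constant_term = shift[j] - s @ shift[p]
    return data_term, constant_term

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 6))
gamma = rng.uniform(0.5, 1.5, size=4)
beta = rng.normal(size=4)
mu = rng.normal(size=4)
sigma = rng.uniform(0.5, 1.5, size=4)
j, preserved, s = 3, [0, 1, 2], np.array([0.4, -0.2, 0.9])

A = batch_norm(z, gamma[:, None], beta[:, None], mu[:, None], sigma[:, None])
direct = A[j] - s @ A[preserved]
data_term, constant_term = split_bn_difference(z, gamma, beta, mu, sigma, j, preserved, s)
```

Here `direct` equals `data_term + constant_term`. The constant term depends only on the batch normalization parameters and the coefficients, which is why a term of this kind can be evaluated without data.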

Definition 2.

Finally, when the activation layer is also taken into account, the activation map is the value of the feature map after it has passed through batch normalization and the activation layer, and the corresponding error can be expressed as stated in the following theorem.

Theorem 1.

Here, the additional term is called the activation error (Activation Error).

By extending the analysis to batch normalization and the activation layer, the final reconstruction error can be defined as in the expression above. According to that expression, the coefficients should be set to values that minimize the residual error, the batch normalization error, and the activation error. However, reducing the activation error requires considering every case for every element of the feature map, which is impossible without data. Therefore, the coefficients are sought that minimize a loss function that can be evaluated without data.

The coefficients that minimize the above loss function can be obtained in closed form.

Theorem 2.

Here, each filter vector in the closed form is the vectorized form of a single filter of a given layer.

FIG. 5 is a diagram illustrating an algorithm for recovering the performance of a neural network compressed by pruning, without data or training, according to an embodiment of the present invention.

The coefficients that minimize the loss function are obtained from the closed form above for each removed filter and are set as the element values of the transfer matrix. Performing the 2-mode product of this transfer matrix with the filter of the pretrained convolutional neural network recovers the performance of the neural network compressed by pruning, without data or training. The algorithm is shown in FIG. 5.

For each layer of the convolutional neural network that is pruned, the transfer matrix of that layer is initialized (line 2). For each filter preserved in the layer, a one-hot vector is written into the transfer matrix, with a 1 at the index of that preserved filter and 0 elsewhere (lines 4-5). For each removed filter, the coefficients that minimize the loss function are found and written as elements of the transfer matrix (lines 6-7). Once the transfer matrix of a layer has been obtained, the 2-mode product of the parameters of the pretrained convolutional neural network and the transfer matrix is computed to recover the performance lost to pruning. This procedure is carried out for every layer that is pruned.
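The steps above can be sketched as a short NumPy routine. This is a minimal illustration under stated assumptions, not the algorithm of FIG. 5 itself: the network is a plain chain of convolution weights without batch normalization or activations, the coefficients come from ordinary least squares rather than the patent's closed form, the last layer is never pruned, and the names `recover_network` and `forward` are hypothetical.

```python
import numpy as np

def recover_network(weights, keep):
    """weights: list of (out, in, k, k) arrays; keep: {layer index: preserved filter indices}."""
    new = [w.copy() for w in weights]
    for l, preserved in keep.items():
        m = weights[l].shape[0]
        flat = weights[l].reshape(m, -1)           # pretrained filters of layer l
        basis = flat[preserved].T                  # preserved filters as columns
        S = np.zeros((m, len(preserved)))          # initialize the transfer matrix
        for col, idx in enumerate(preserved):      # preserved filters: one-hot rows
            S[idx, col] = 1.0
        for idx in range(m):                       # removed filters: coefficient rows
            if idx not in preserved:
                S[idx] = np.linalg.lstsq(basis, flat[idx], rcond=None)[0]
        new[l] = new[l][preserved]                 # drop the removed filters of layer l
        # 2-mode product: fold the removed channels of layer l+1 into the preserved ones.
        new[l + 1] = np.einsum("nmhw,mt->nthw", new[l + 1], S)
    return new

def forward(weights, x):
    """Apply a chain of 1x1 convolutions to a single spatial position."""
    for w in weights:
        x = w[:, :, 0, 0] @ x
    return x

rng = np.random.default_rng(0)
w0 = rng.normal(size=(4, 6, 1, 1))
w0[3] = 2.0 * w0[0] - 1.0 * w0[1]                  # filter 3 lies in the span of the others
weights = [w0, rng.normal(size=(5, 4, 1, 1)), rng.normal(size=(2, 5, 1, 1))]

recovered = recover_network(weights, {0: [0, 1, 2]})
x = rng.normal(size=6)
```

In this constructed case the removed filter is an exact linear combination of the preserved ones and there are no nonlinearities, so `forward(recovered, x)` matches `forward(weights, x)` up to floating-point error. In a real network the match is only approximate, which is what the residual, batch normalization, and activation errors quantify.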

Tables 1 to 4 show the results on the CIFAR and ImageNet data of applying the technique proposed in the present invention, which recovers the performance of a pretrained convolutional neural network compressed by pruning without data or training.

<Table 1>

<Table 2>

<Table 3>

<Table 4>

The network pruning criteria used in the experiments were the L1-norm [6], L2-norm [7], and L2-GM [8] of the convolutional filters. A random criterion was also used to show that the proposed method is robust even when filters are pruned at random. The top row of each table gives the performance of the pretrained convolutional neural network on the corresponding data. Among the compared methods, NM is a one-to-one compensation method, and Prune denotes the compressed neural network without retraining after pruning. Tables 1 to 4 confirm that the proposed method achieves, on average, higher performance than the compared methods across the various datasets and pretrained convolutional neural networks.
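How norm-based and random criteria choose the preserved filters can be sketched as follows. The sketch assumes the common convention that the filters with the smallest norm are removed; the L2-GM criterion is omitted, and the function name and interface are hypothetical.

```python
import numpy as np

def select_preserved(filters, num_keep, criterion="l1", rng=None):
    """Return the sorted indices of the filters to preserve.

    filters: array of shape (m, c, k, k).
    criterion: "l1", "l2", or "random".
    """
    m = filters.shape[0]
    flat = filters.reshape(m, -1)
    if criterion == "l1":
        scores = np.abs(flat).sum(axis=1)            # L1-norm of each filter
    elif criterion == "l2":
        scores = np.sqrt((flat ** 2).sum(axis=1))    # L2-norm of each filter
    elif criterion == "random":
        rng = np.random.default_rng() if rng is None else rng
        scores = rng.random(m)                       # random ranking
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    keep = np.argsort(scores)[-num_keep:]            # keep the highest-scoring filters
    return sorted(keep.tolist())

# Four filters with magnitudes 3, 1, 4, and 2.
filters = np.ones((4, 2, 3, 3)) * np.array([3.0, 1.0, 4.0, 2.0]).reshape(4, 1, 1, 1)
kept_l1 = select_preserved(filters, 2, "l1")
kept_l2 = select_preserved(filters, 2, "l2")
kept_random = select_preserved(filters, 2, "random", np.random.default_rng(0))
```

With these magnitudes, both norm criteria keep filters 0 and 2. The recovery step itself does not depend on which criterion produced the preserved indices, which is what the random-criterion experiment probes.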

FIG. 6 is a diagram showing the residual error (RE), batch normalization error (BE), and weighted average reconstruction error (WARE) of the performance recovery method for a convolutional neural network according to an embodiment of the present invention and of NM.

FIG. 6 shows the residual error (RE), batch normalization error (BE), and weighted average reconstruction error (WARE) of the proposed method and of NM. The weighted average reconstruction error was computed on 1,000 training samples by comparing the average reconstruction error of the network recovered by the proposed method with that of the network recovered by NM. As FIG. 6 shows, the proposed method has the lower residual error. This is because NM passes the information of each removed filter to a single preserved filter, whereas the proposed method takes into account the relationship between the removed filter and all of the preserved filters. The batch normalization error and the weighted average reconstruction error of the two methods are similar in the early layers of the network because, as shown in FIG. 1, similar filters are more common in the early layers. In the later layers, however, the errors of NM are higher than those of the proposed method. These results confirm that the proposed method varies less from layer to layer than NM and keeps all three errors lower.

The apparatus described above may be implemented with hardware components, software components, and/or a combination of hardware components and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will appreciate that a processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, a processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, computer storage medium, or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiments may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described above with reference to a limited set of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from that described, and/or if components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from that described, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

<References>

[1] J. Luo, J. Wu, and W. Lin. Thinet: A filter level pruning method for deep neural network compression. In IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pages 5068-5076. IEEE Computer Society, 2017.

[2] J. Luo and J. Wu. Neural network pruning with residual-connections and limited-data. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020, pages 1455-1464. IEEE, 2020.

[3] S. Srinivas and R. V. Babu. Data-free parameter pruning for deep neural networks. In X. Xie, M. W. Jones, and G. K. L. Tam, editors, Proceedings of the British Machine Vision Conference 2015, BMVC 2015, Swansea, UK, September 7-10, 2015, pages 31.1-31.12. BMVA Press, 2015.

[4] W. Kim, S. Kim, M. Park, and G. Jeon. Neuron merging: Compensating for pruned neurons. In H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020.

[5] T. G. Kolda and B. W. Bader. Tensor decompositions and applications. SIAM Rev., 51(3):455-500, 2009.

[6] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf. Pruning filters for efficient convnets. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net, 2017.

[7] Y. He, G. Kang, X. Dong, Y. Fu, and Y. Yang. Soft filter pruning for accelerating deep convolutional neural networks. In J. Lang, editor, Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden, pages 2234-2240. ijcai.org, 2018.

[8] Y. He, P. Liu, Z. Wang, Z. Hu, and Y. Yang. Filter pruning via geometric median for deep convolutional neural networks acceleration. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019, pages 4340-4349. Computer Vision Foundation / IEEE, 2019.

Claims

A performance recovery method for a convolutional neural network, comprising:
representing, through a pruning modeling unit, a damaged filter by using a linear relationship between a filter removed from a layer of a pretrained convolutional neural network and a preserved filter;
defining, through the pruning modeling unit, a loss function due to the damaged filter according to a batch normalization layer and an activation layer; and
recovering, through a reconstruction modeling unit, the performance of the neural network compressed by pruning by representing coefficient values that minimize the loss function as element values of a transfer matrix and performing an operation between a filter of the pretrained convolutional neural network and a pruning matrix.

The performance recovery method for a convolutional neural network according to claim 1,
wherein the representing of the damaged filter by using the linear relationship between the filter removed from the layer of the pretrained convolutional neural network and the preserved filter comprises
transforming the pruning matrix used in filter pruning of the pretrained convolutional neural network into the transfer matrix so that the information of the removed filter is inherited by the preserved filters, and representing the removed filter as a linear combination of the preserved filters.

The performance recovery method for a convolutional neural network according to claim 1,
wherein the defining of the loss function due to the damaged filter according to the batch normalization layer and the activation layer comprises
approximating the activation map by the feature map in order to minimize the loss function, then defining the loss function by expressing, with respect to the residual error, the activation map according to the batch normalization layer and the activation layer as the value of the feature map passed through the batch normalization layer, and finding the coefficient values that minimize the residual error, the batch normalization error, and the activation error of the loss function.

The performance recovery method for a convolutional neural network according to claim 1,
wherein the recovering of the performance of the neural network compressed by pruning comprises:
finding a transfer matrix that minimizes the reconstruction error between the pretrained convolutional neural network and the compressed neural network by multiplying the removed filter by the coefficient values that minimize the loss function and adding the result to the preserved filters; and
representing the coefficient values that minimize the loss function as element values of the transfer matrix and performing a 2-mode product operation between the filter of the pretrained convolutional neural network and the pruning matrix, thereby recovering the performance of the neural network compressed by pruning without data and training.

A performance recovery device for a convolutional neural network, comprising:
a pruning modeling unit that represents a damaged filter by using a linear relationship between a filter removed from a layer of a pretrained convolutional neural network and a preserved filter, and defines a loss function due to the damaged filter according to a batch normalization layer and an activation layer; and
a reconstruction modeling unit that recovers the performance of the neural network compressed by pruning by representing coefficient values that minimize the loss function as element values of a transfer matrix and performing an operation between a filter of the pretrained convolutional neural network and a pruning matrix.

The performance recovery device for a convolutional neural network according to claim 5,
wherein the pruning modeling unit
transforms the pruning matrix used in filter pruning of the pretrained convolutional neural network into the transfer matrix so that the information of the removed filter is inherited by the preserved filters, and represents the removed filter as a linear combination of the preserved filters.

The performance recovery device for a convolutional neural network according to claim 5,
wherein the pruning modeling unit
approximates the activation map by the feature map in order to minimize the loss function, then defines the loss function by expressing, with respect to the residual error, the activation map according to the batch normalization layer and the activation layer as the value of the feature map passed through the batch normalization layer, and finds the coefficient values that minimize the residual error, the batch normalization error, and the activation error of the loss function.

The performance recovery device for a convolutional neural network according to claim 5,
wherein the reconstruction modeling unit
finds a transfer matrix that minimizes the reconstruction error between the pretrained convolutional neural network and the compressed neural network by multiplying the removed filter by the coefficient values that minimize the loss function and adding the result to the preserved filters, and
represents the coefficient values that minimize the loss function as element values of the transfer matrix and performs a 2-mode product operation between the filter of the pretrained convolutional neural network and the pruning matrix, thereby recovering the performance of the neural network compressed by pruning without data and training.