KR20240086365A - Multi-agent reinforcement learning based digital tomosynthesis image denoising system using data transformation

Info

Publication number: KR20240086365A
Application number: KR1020220171849A
Authority: KR
Inventors: 이승완; 남기복
Original assignee: 건양대학교산학협력단
Priority date: 2022-12-09
Filing date: 2022-12-09
Publication date: 2024-06-18
Anticipated expiration: 2042-12-09
Also published as: KR102791115B1

Abstract

The present invention relates to a multi-agent reinforcement learning based digital tomosynthesis image denoising system. To generate a denoising model for digital tomosynthesis (DT) images, the system applies a multi-agent reinforcement learning network that incorporates a convolutional gated recurrent unit (convGRU) and reward map convolution (RMC), and applies data transformation to it.

Description

Multi-agent reinforcement learning based digital tomosynthesis image denoising system using data transformation

The present invention relates to noise removal from digital tomosynthesis images. More specifically, it relates to a multi-agent reinforcement learning based digital tomosynthesis image denoising system that, to generate a denoising model for digital tomosynthesis (DT) images, applies a multi-agent reinforcement learning network incorporating a convolutional gated recurrent unit (convGRU) and reward map convolution (RMC), and applies data transformation to it.

Digital tomosynthesis (DT) acquires three-dimensional images by scanning over a limited angular range and reconstructing the projections with algorithms similar to those used for computed tomography (CT) or magnetic resonance (MR) imaging.

Because DT delivers a lower exposure dose than CT, which also provides 3D images, it is used for dose-sensitive regions such as the breast. In addition, compared with plain radiography, it shows less overlap of structures, so structures and lesions can be seen more clearly.

However, because of the limited scan angle and the small number of projections, DT images have poor image quality, which can make accurate diagnosis difficult.

To address these shortcomings, research on improving DT image quality is actively under way. However, among existing methods, supervised learning requires a large amount of data and a long time to train the model. Moreover, supervised training of neural networks has difficulty minimizing the loss function because of overfitting, and it can distort fine details of the image.

In contrast, reinforcement learning (RL) does not require a large amount of labeled data to train a network, and applying deep learning to RL can accelerate training. Furthermore, if the pixels and patches of the input image are used as multiple agents, region-specific features and subtle changes in pixel values can be reflected in the output image.

In particular, multi-agent reinforcement learning (MARL) learns by extracting the state of the input image and selecting the action that maximizes the reward in the current state. Because the actions the model may perform can be defined in advance, an output image optimized for the intended purpose can be obtained.

Republic of Korea Patent Publication No. 10-2022-0086937 (2022.06.24)

The present invention was devised to meet the above need. Its object is to provide a multi-agent reinforcement learning based digital tomosynthesis image denoising system that applies a multi-agent reinforcement learning network incorporating a convolutional gated recurrent unit (convGRU) and reward map convolution (RMC) to generate a denoising model for DT images, and that applies data transformation to it.

To achieve this object, the present invention comprises: a learning data generation unit that generates a standard image and a plurality of training input images obtained by applying noise of set levels to the standard image; a transform unit that applies a wavelet transform and an Anscombe transform to the training images; a learning model configuration unit that configures a multi-agent reinforcement learning model for removing noise from an input image, the model consisting of sub-networks connected in a fully convolutional network manner, namely a shared network that extracts features of the input image, a value network that computes a reward by comparing the feature-extracted input image with the standard image and learns so as to maximize the reward, and a policy network that determines, from a predefined chart, an action suited to the state of the input image; a learning unit that inputs the standard image and the plurality of input images into the multi-agent reinforcement learning model and trains the model to reduce the difference between the standard image and the input images; and an inverse transform unit that removes noise from an input digital tomosynthesis image by applying the trained multi-agent reinforcement learning model.

Preferably, the shared network consists of one convolution layer and three dilated convolution layers, the value network consists of two dilated convolution layers and one convolution layer, and the policy network consists of two dilated convolution layers and one convolution layer.

Preferably, the dilation rates of the dilated convolution layers in the shared network are set to 2, 3, and 4, respectively, and the filter stride is fixed to 1 in all layers of the multi-agent reinforcement learning model.

Preferably, the value network computes the reward through [Equation 3] and [Equation 4] below.

[Equation 3]

[Equation 4]

(The symbols in [Equation 3] and [Equation 4] denote the reward of the k-th agent at time step t; the pixel value I_k of the standard image; the state of the k-th agent, including image features; the total reward computed at the k-th agent; the discount factor γ; the convolution filter weight ω of the p-th agent; and F(k), the center of the receptive field at the k-th agent.)

Preferably, the dilation rates of the dilated convolution layers of the policy network are set to 3 and 2, respectively, after which a convolutional gated recurrent unit (convGRU) is applied.

Preferably, the policy network outputs a suitable action through [Equation 5].

[Equation 5]

(π_k is the optimal policy for applying action α_k to state s_k)

The present invention effectively removes noise from DT images, whose quality has been poor because of the limited scan angle and the small number of projections, so that structures and lesions can be diagnosed clearly. It can also greatly reduce the radiation exposure that was previously unavoidable when obtaining high-quality images with modalities such as CT.

Figure 1 is a block diagram of a digital tomosynthesis image denoising system according to an embodiment of the present invention.
Figure 2 is a structural diagram of the MARL model for denoising DT images according to the present invention.
Figure 3 is a flowchart of a digital tomosynthesis image denoising method according to an embodiment of the present invention.
Figure 4 shows DT images denoised using the conventional MARL model and MARL with the data transformation of the present invention applied.
Figure 5 is a graph showing SNR measurements for the input image and the output image of each model according to an embodiment of the present invention.

Hereinafter, the multi-agent reinforcement learning based digital tomosynthesis image denoising system using data transformation according to the present invention is described in detail with reference to the accompanying drawings.

The present invention introduces a multi-agent reinforcement learning network to generate a denoising model for DT images. When the contrast between internal structures in a DT image is insufficient, the network has difficulty separating noise patterns, which hinders accurate recognition of noise features during training. Therefore, the present invention applies wavelet and Anscombe transforms so that the noise features of the DT image are selectively passed to the multi-agent reinforcement learning network and training accuracy improves.

Compared with conventional learning methods in which the standard image and input image are fixed, the multi-agent reinforcement learning model changes the input image as learning proceeds. This has the effect of supplying enough training data, which would otherwise be scarce, for the model to be trained.

Figure 1 is a block diagram of a digital tomosynthesis image denoising system according to an embodiment of the present invention. Its main components are a learning data generation unit (110), a transform unit (120), a learning model configuration unit (130), a learning unit (140), and an inverse transform unit (150).

The learning data generation unit (110) generates a standard image and a plurality of training input images obtained by applying noise of set levels to the standard image.

The standard image is a high-resolution reference image, and the training input images are DT images with various noise levels. In the embodiment, the input images were generated by adding Gaussian noise with standard deviations of 0.05, 0.07, 0.10, 0.13, and 0.15.
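The generation of training inputs described above can be sketched as follows. This is a minimal illustration, not the implementation used in the embodiment: the function name `make_training_inputs`, the fixed random seed, and the assumption that the standard image is a floating-point array scaled to roughly [0, 1] are all hypothetical, since the text states only the five standard deviations.

```python
import numpy as np

# Standard deviations listed in the embodiment.
NOISE_SIGMAS = (0.05, 0.07, 0.10, 0.13, 0.15)


def make_training_inputs(standard_image, sigmas=NOISE_SIGMAS, seed=0):
    """Return a dict mapping each noise level to one noisy copy of the image.

    Assumption: intensities are floats on a scale where these sigmas are
    meaningful (for example [0, 1]); no clipping is applied.
    """
    rng = np.random.default_rng(seed)
    standard_image = np.asarray(standard_image, dtype=np.float64)
    return {
        sigma: standard_image + rng.normal(0.0, sigma, size=standard_image.shape)
        for sigma in sigmas
    }


if __name__ == "__main__":
    reference = np.full((64, 64), 0.5)  # stand-in for a standard image
    for sigma, noisy in make_training_inputs(reference).items():
        print(sigma, noisy.shape)
```

Each noisy copy is paired with the same standard image during training, so one reference yields five training pairs.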

The transform unit (120) applies a wavelet transform and an Anscombe transform to the training images so that noise features can be recognized accurately during network training.

The wavelet transform represents both the time and frequency content of a signal whose frequency components change over time. The Fourier transform generally expresses frequency components under the assumption that the signal does not vary in time, whereas the wavelet transform is used to express the time and frequency components of signals whose frequency content changes over time, such as chirped signals, electrocardiogram (ECG) signals, and image signals. In this case, low-frequency components are represented with high frequency resolution, and high-frequency components with high time resolution.

In the present invention, a wavelet transform such as [Equation 1] generates non-redundant subband images that contain the directional information of high-frequency components, and allows noise components to be extracted from the complex patterns of the input DT image.

[Equation 1]

Here, M and N are the numbers of pixels along the x and y axes of the input image. The filters at transform level j either approximate the image signal (A) or separate it along the horizontal (H), vertical (V), and diagonal (D) directions.
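As an illustration of how a single transform level splits an image into the A, H, V, and D subbands, the sketch below implements a one-level 2D Haar transform and its inverse in NumPy. The choice of the Haar wavelet is an assumption made for brevity; the text does not name the wavelet family, and [Equation 1] is not reproduced here, so this should not be read as the transform of the embodiment. The function names are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_dwt2(image):
    """One-level 2D Haar DWT; returns (A, H, V, D), each of size M/2 x N/2."""
    x = np.asarray(image, dtype=np.float64)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("image dimensions must be even")
    # Filter along the x direction (neighbouring columns).
    lo = (x[:, 0::2] + x[:, 1::2]) / SQRT2
    hi = (x[:, 0::2] - x[:, 1::2]) / SQRT2
    # Filter along the y direction (neighbouring rows).
    a = (lo[0::2] + lo[1::2]) / SQRT2  # approximation
    h = (lo[0::2] - lo[1::2]) / SQRT2  # responds to horizontal edges
    v = (hi[0::2] + hi[1::2]) / SQRT2  # responds to vertical edges
    d = (hi[0::2] - hi[1::2]) / SQRT2  # responds to diagonal structure
    return a, h, v, d


def haar_idwt2(a, h, v, d):
    """Exact inverse of haar_dwt2."""
    rows, cols = a.shape
    lo = np.empty((2 * rows, cols))
    hi = np.empty((2 * rows, cols))
    lo[0::2], lo[1::2] = (a + h) / SQRT2, (a - h) / SQRT2
    hi[0::2], hi[1::2] = (v + d) / SQRT2, (v - d) / SQRT2
    x = np.empty((2 * rows, 2 * cols))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return x
```

The four subbands together hold exactly M x N coefficients, which is what makes the decomposition non-redundant, and the inverse recovers the input exactly.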

The Anscombe transform is a variance-stabilizing transformation that converts a Poisson-distributed random variable into a variable with an approximately standard Gaussian distribution.

In the present invention, an Anscombe transform such as [Equation 2] converts signal-dependent noise into signal-independent noise, thereby reducing the complexity of the noise properties contained in the signal.

[Equation 2]

Here, the signal term denotes the noise-free signal in the spatial domain, and n(x,y) is the signal-independent noise.
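The variance-stabilizing effect can be checked numerically. The sketch below uses the classical Anscombe definition, 2·sqrt(x + 3/8), together with its simple algebraic inverse. Whether [Equation 2] has exactly this form is not visible in the text, so the formula is an assumption, as are the function names.

```python
import numpy as np


def anscombe(x):
    """Classical Anscombe transform: Poisson counts -> roughly unit-variance data."""
    return 2.0 * np.sqrt(np.asarray(x, dtype=np.float64) + 3.0 / 8.0)


def inverse_anscombe(y):
    """Algebraic inverse; known to be biased for very low counts."""
    return (np.asarray(y, dtype=np.float64) / 2.0) ** 2 - 3.0 / 8.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for mean in (20, 100):
        counts = rng.poisson(mean, size=200_000)
        # Raw variance tracks the mean; transformed variance stays near 1.
        print(mean, counts.var(), anscombe(counts).var())
```

After the transform the noise level no longer depends on the signal level, which is the property the paragraph above relies on.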

The learning model configuration unit (130) configures a multi-agent reinforcement learning model that removes noise from an input image. The model consists of sub-networks connected in a fully convolutional network manner: a shared network that extracts features of the input image, a value network that computes a reward by comparing the feature-extracted input image with the standard image and learns so as to maximize the reward, and a policy network that determines, from a predefined chart, an action suited to the state of the input image.

Figure 2 is a structural diagram of the MARL model for denoising DT images according to the present invention. The MARL model presented here consists of three sub-networks in total, connected in a fully convolutional network manner.

The shared network consists of one convolution layer (Conv) and three dilated convolution layers (Dilated conv).

By enlarging the receptive field, the dilated convolution layers can extract features of the input image quickly and reduce computational cost. The dilation rates of the dilated convolution layers in the shared network were set to 2, 3, and 4, respectively, and the filter stride was fixed to 1 in all layers of the model presented here.
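To make the receptive-field argument concrete, the short calculation below compares the shared network's layer stack (one ordinary convolution followed by dilation rates 2, 3, and 4, all with stride 1) against four undilated layers. The 3x3 kernel size is an assumption; the kernel size is not stated in the text, and the function name is hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive-field width of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    width = 1
    for dilation in dilations:
        width += (kernel_size - 1) * dilation
    return width


if __name__ == "__main__":
    # Assumed 3x3 kernels; dilation 1 stands for the ordinary convolution layer.
    print(receptive_field(3, [1, 2, 3, 4]))  # prints 21
    print(receptive_field(3, [1, 1, 1, 1]))  # prints 9
```

Under this assumption the dilated stack sees a 21-pixel-wide neighborhood with the same number of layers and weights that would otherwise cover only 9 pixels.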

The shared network extracts the state of the input image to which an action has been applied, and the extracted state is used as the input to both the value network and the policy network.

The value network consists of two dilated convolution layers and one convolution layer. In the embodiment, the dilation rates of the value network's dilated convolutions were set to 3 and 2.

The value network computes a reward by comparing the state of the input image extracted by the shared network with the standard image. Learning proceeds in the direction that maximizes this reward, so that an output image closely resembling the standard image can be generated.

The policy network consists of two dilated convolution layers, with dilation rates of 3 and 2, respectively, and one convolution layer; a convolutional gated recurrent unit (convGRU) is applied after the two dilated convolution layers. The convGRU addresses the time-dependence problem, in which the initial state loses importance as training time increases, and can thereby improve learning accuracy.

In other words, in a conventional straight feed-forward learning scheme, the early values can fade as the network becomes deeper; the convGRU prevents this loss of information.
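How a convGRU carries earlier information forward can be seen in a single update step. The sketch below is a generic single-channel ConvGRU cell written in NumPy, not the layer of the embodiment: the 3x3 kernels, the omission of bias terms, the gate convention h_new = (1 - z)·h + z·n, and all function and key names are assumptions for illustration.

```python
import numpy as np


def conv2d_same(x, kernel):
    """Zero-padded 'same' cross-correlation of a 2D map with an odd-sized kernel."""
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(x.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def conv_gru_step(x, h, w):
    """One ConvGRU update for input map x and hidden state h.

    w is a dict of kernels with keys 'xz', 'hz', 'xr', 'hr', 'xn', 'hn'.
    """
    z = sigmoid(conv2d_same(x, w["xz"]) + conv2d_same(h, w["hz"]))      # update gate
    r = sigmoid(conv2d_same(x, w["xr"]) + conv2d_same(h, w["hr"]))      # reset gate
    n = np.tanh(conv2d_same(x, w["xn"]) + conv2d_same(r * h, w["hn"]))  # candidate state
    # The update gate decides, per pixel, how much of the old state survives.
    return (1.0 - z) * h + z * n
```

Because the new state is a per-pixel blend of the previous state and the candidate, information from early steps is retained wherever the update gate stays small.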

The policy network determines the action best suited to the current state of the input image.

[Table 1] lists the set of predefined actions used in the policy network; the action executed by each agent was selected from this predefined set.

[Table 1]

The predefined action set consisted of conventional image filters designed in the spatial domain, chosen empirically for noise removal. The optimal policy at episode t updates the state, and the updated state is used as the input to the shared sub-network at episode t+1. The detailed configuration of the proposed multi-agent RL network is summarized in [Table 2].
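The per-pixel state update can be sketched as follows. Because the contents of [Table 1] are not reproduced in the text, the three actions used here (leave unchanged, 3x3 box filter, 3x3 median filter) are purely hypothetical stand-ins, as are the function names and the greedy selection rule.

```python
import numpy as np


def _neighbourhood3(img):
    """Stack the nine 3x3 neighbours of every pixel (edge-replicated borders)."""
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    return np.stack([padded[i:i + rows, j:j + cols] for i in range(3) for j in range(3)])


def box_filter3(img):
    return _neighbourhood3(img).mean(axis=0)


def median_filter3(img):
    return np.median(_neighbourhood3(img), axis=0)


# Hypothetical action set; index = action id.
ACTIONS = (lambda img: img, box_filter3, median_filter3)


def greedy_actions(policy_probs):
    """Pick, for each pixel, the action with the highest policy probability."""
    return np.argmax(policy_probs, axis=0)


def apply_actions(state, action_map):
    """Update each pixel with the output of the action chosen for that pixel."""
    state = np.asarray(state, dtype=np.float64)
    candidates = np.stack([act(state) for act in ACTIONS])  # (n_actions, H, W)
    return np.take_along_axis(candidates, action_map[None], axis=0)[0]
```

The array returned by `apply_actions` plays the role of the updated state that is fed back into the shared sub-network at the next step.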

[Table 2]

The learning unit (140) inputs the standard image and the plurality of input images into the multi-agent reinforcement learning model and trains the model to reduce the difference between the standard image and the input images.

The inverse transform unit (150) removes noise from an input digital tomosynthesis image by applying the multi-agent reinforcement learning model trained by the learning unit (140).

Figure 3 is a flowchart of a digital tomosynthesis image denoising method according to an embodiment of the present invention.

In the first step (S110), the input image, obtained by applying the data transformation to the noisy image, is fed into the shared network.

In the second step (S120), the features of the input image extracted by the shared network are passed to the value and policy networks.

In the third step (S130), the value network outputs the reward for the current state.

The value network outputs the reward for the current state through the reward calculation of [Equation 3] and [Equation 4].

[Equation 3]

[Equation 4]

(The symbols in [Equation 3] and [Equation 4] denote the reward of the k-th agent at time step t; the pixel value I_k of the standard image; the state of the k-th agent, including image features; the total reward computed at the k-th agent; the discount factor γ; the convolution filter weight ω of the p-th agent; and F(k), the center of the receptive field at the k-th agent.)

In this step, to reduce the computational cost, convolution was applied to the reward calculation at the L-th learning step.
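Since [Equation 3] and [Equation 4] are not reproduced in the text, the sketch below shows only one plausible reading of the symbol definitions, following a common pixel-wise formulation: the step reward is the decrease in squared error relative to the standard image, and the total reward adds a discounted, convolution-weighted sum of neighboring values. The exact form, the value of `gamma`, the square odd-sized weight kernel, the edge padding, and the function names are all assumptions.

```python
import numpy as np


def step_reward(standard, state_t, state_next):
    """Per-pixel reward: how much the squared error to the standard image dropped."""
    return (standard - state_t) ** 2 - (standard - state_next) ** 2


def total_reward(reward, next_value, weights, gamma=0.95):
    """Reward plus discounted neighbourhood values weighted by a conv filter.

    Assumes `weights` is a square kernel with odd side length.
    """
    half = weights.shape[0] // 2
    padded = np.pad(next_value, half, mode="edge")
    rows, cols = reward.shape
    neighbourhood = np.zeros(reward.shape, dtype=np.float64)
    for i in range(weights.shape[0]):
        for j in range(weights.shape[1]):
            neighbourhood += weights[i, j] * padded[i:i + rows, j:j + cols]
    return reward + gamma * neighbourhood
```

With a kernel whose only nonzero weight is a 1 at the center, `total_reward` reduces to the usual single-agent return `reward + gamma * next_value`; other weights let each agent account for the effect of its action on neighboring pixels.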

In the fourth step (S140), the policy network outputs a suitable action based on the reward output by the value network.

In the present invention, the policy network outputs this action according to [Equation 5], based on the reward output by the value network.

[Equation 5]

The output policy is passed to the agents, which update the state.

In the fifth step (S150), the preceding steps are repeated for the set number of iterations; once training is complete, the trained MARL model is used to remove noise from DT images.

Figure 4 shows DT images denoised using the conventional MARL model and MARL with the data transformation of the present invention applied: (a) the standard image, (b) the input image with added noise, (c) the image obtained with the conventional MARL model, (d) the image obtained with MARL plus the wavelet transform, and (e) the image obtained with MARL plus the Anscombe transform.

As Figure 4 shows, noise is removed relative to the input image when MARL and the data transformation are applied. The results of the quantitative comparison performed to evaluate the degree of noise removal are shown in Figure 5.

Figure 5 is a graph showing SNR measurements for the input image and the output image of each model according to an embodiment of the present invention. The measurements confirm that MARL with the data transformation applied yields better quantitative results than the output image of the conventional MARL model.
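A quantitative comparison of this kind can be sketched as below. The text does not say how SNR was measured, so the definition used here (signal power of the standard image divided by the power of the residual, in decibels) is an assumption, and `snr_db` is a hypothetical helper rather than the metric of the embodiment.

```python
import numpy as np


def snr_db(standard, image):
    """SNR in dB of `image` measured against the standard (reference) image."""
    standard = np.asarray(standard, dtype=np.float64)
    residual = np.asarray(image, dtype=np.float64) - standard
    return 10.0 * np.log10(np.sum(standard ** 2) / np.sum(residual ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = np.ones((128, 128))
    for sigma in (0.15, 0.05):
        noisy = reference + rng.normal(0.0, sigma, reference.shape)
        print(sigma, round(float(snr_db(reference, noisy)), 1))
```

Under this definition a denoised output that lies closer to the standard image than the noisy input does will report a higher SNR.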

The scope of rights of the present invention is not limited to the embodiments described above but is defined by the claims, and it is evident that a person of ordinary skill in the art can make various modifications and adaptations within the scope of the claims.

110: Learning data generation unit 120: Transform unit
130: Learning model configuration unit 140: Learning unit
150: Inverse transform unit

Claims

A digital tomosynthesis image denoising system comprising:
a learning data generation unit (110) that generates a standard image and a plurality of training input images obtained by applying noise of set levels to the standard image;
a transform unit (120) that applies a wavelet transform and an Anscombe transform to the training images;
a learning model configuration unit (130) that configures a multi-agent reinforcement learning model for removing noise from an input image, the model consisting of sub-networks connected in a fully convolutional network manner, namely a shared network that extracts features of the input image, a value network that computes a reward by comparing the feature-extracted input image with the standard image and learns so as to maximize the reward, and a policy network that determines, from a predefined chart, an action suited to the state of the input image;
a learning unit (140) that inputs the standard image and the plurality of input images into the multi-agent reinforcement learning model and trains the model to reduce the difference between the standard image and the input images; and
an inverse transform unit (150) that removes noise from an input digital tomosynthesis image by applying the trained multi-agent reinforcement learning model.

The digital tomosynthesis image denoising system according to claim 1,
wherein the shared network consists of one convolution layer and three dilated convolution layers, the value network consists of two dilated convolution layers and one convolution layer, and the policy network consists of two dilated convolution layers and one convolution layer.

The digital tomosynthesis image denoising system according to claim 2,
wherein the dilation rates of the dilated convolution layers in the shared network are set to 2, 3, and 4, respectively, and the filter stride is fixed to 1 in all layers of the multi-agent reinforcement learning model.

The digital tomosynthesis image denoising system according to claim 2,
wherein the value network computes the reward through [Equation 3] and [Equation 4] below.
[Equation 3]

[Equation 4]

(The symbols in [Equation 3] and [Equation 4] denote the reward of the k-th agent at time step t; the pixel value I_k of the standard image; the state of the k-th agent, including image features; the total reward computed at the k-th agent; the discount factor γ; the convolution filter weight ω of the p-th agent; and F(k), the center of the receptive field at the k-th agent.)

The digital tomosynthesis image denoising system according to claim 2,
wherein the dilation rates of the dilated convolution layers of the policy network are set to 3 and 2, respectively, after which a convolutional gated recurrent unit (convGRU) is applied.

The digital tomosynthesis image denoising system according to claim 4,
wherein the policy network outputs a suitable action through [Equation 5].
[Equation 5]

(π_k is the optimal policy for applying action α_k to state s_k)