KR102774323B1

KR102774323B1 - Device for ensembling data and operating method thereof

Info

Publication number: KR102774323B1
Application number: KR1020180081304A
Authority: KR
Inventors: 임명은; 박흰돌; 정호열; 최재훈; 한영웅
Original assignee: 한국전자통신연구원
Priority date: 2018-01-30
Filing date: 2018-07-12
Publication date: 2025-03-04
Anticipated expiration: 2038-07-12
Also published as: KR20190092217A

Abstract

The present invention relates to an operating method of a device for ensembling data received from a plurality of health prediction devices. A method according to an embodiment of the present invention includes: providing raw learning data to first and second health prediction devices; receiving first and second learning result data generated by the first and second health prediction devices; generating a target relationship model based on a correlation between feature data having the same feature among the feature data included in each of the first and second learning result data; generating a feature relationship model based on a correlation between feature data having different features included in the learning result data; and building an ensemble model by merging the target relationship model and the feature relationship model. According to the present invention, the performance of the ensemble model can be improved.

Description

DEVICE FOR ENSEMBLING DATA AND OPERATING METHOD THEREOF

The present invention relates to the processing of data for predicting future health, and more specifically, to a device for ensembling data received from a plurality of health prediction devices and an operating method thereof.

To lead a healthy life, there is a growing demand to go beyond treating current diseases and predict future health conditions. To predict future health conditions, there is increasing demand to analyze big data in order to diagnose diseases or predict future disease risk. Advances in industrial technology and in information and communication technology support the construction of big data. Technologies such as artificial intelligence, which use such big data to train electronic devices such as computers and provide various services, are emerging. In particular, methods of constructing a learning model from various medical data or health data have been proposed for predicting future health conditions.

A larger volume of data is advantageous for accurate prediction, but sharing data among medical institutions can be difficult in practice for various reasons, such as ethical issues, legal issues, and personal privacy issues. As a result, it is practically difficult to build big data that integrates medical data into one body. As a way to address this problem unique to medical data, instead of building a single predictor on integrated multi-institution big data, approaches are being sought in which individual prediction models are trained on data built separately at each medical institution, and their prediction results are used to predict a patient's future health status.
Prior art 1: KR 10-2009-0053578 A (Publication date: 2009-05-27)
Prior art 2: KR 10-2016-0143512 A (Publication date: 2016-12-14)

The present invention can provide a device for ensembling data received from a plurality of health prediction devices, and an operating method thereof, so as to secure reliability, accuracy, and efficiency in predicting future health conditions.

According to an operating method of an ensemble prediction device according to an embodiment of the present invention, data received from a plurality of health prediction devices are ensembled. The operating method of the ensemble prediction device includes: providing raw learning data to a first health prediction device and a second health prediction device; receiving, from the first health prediction device, first learning result data generated based on the raw learning data; receiving, from the second health prediction device, second learning result data generated based on the raw learning data; generating a target relationship model that provides, for each feature, a weight for each of the first and second health prediction devices, based on correlations between feature data having the same feature among the feature data included in each of the first and second learning result data; generating a feature relationship model that provides a weight for each of the different features, based on correlations between feature data having different features among the feature data included in the first or second learning result data; and constructing an ensemble model by merging the target relationship model and the feature relationship model.

An ensemble prediction device according to an embodiment of the present invention includes a network interface, an ensemble model learning unit, a health prediction unit, and a processor. The network interface provides raw learning data or medical data to a plurality of health prediction devices, and receives learning result data generated by the plurality of health prediction devices based on the raw learning data and the medical data, together with a plurality of pieces of meta information about the plurality of health prediction devices. The ensemble model learning unit selects target learning data based on the similarity between the pieces of meta information to build ensemble learning data, and generates an ensemble model based on correlations between the target learning data and correlations between the feature data included in each of the target learning data. The health prediction unit inputs result data generated from the medical data into the ensemble model to predict a health status of a user. The processor controls the network interface, the ensemble model learning unit, and the health prediction unit.

A device for ensembling data according to an embodiment of the present invention, and an operating method thereof, can alleviate overfitting caused by a large number of similar target learning data by selecting target learning data before generating the ensemble model.

In addition, the device for ensembling data according to an embodiment of the present invention, and its operating method, learn the correlation between target learning data and the correlation between the feature data included in each of the target learning data separately when training the ensemble model. The characteristics of the plurality of health prediction devices and the characteristics of the learning data are thereby considered together, which can improve the performance of the ensemble model.

FIG. 1 is a diagram illustrating a health status prediction system according to an embodiment of the present invention.
FIG. 2 is an exemplary block diagram of the ensemble prediction device of FIG. 1.
FIG. 3 is a flowchart of an operating method of the ensemble prediction device of FIG. 2.
FIG. 4 is a flowchart detailing step S130 of FIG. 3.
FIG. 5 is a diagram for explaining step S131 of FIG. 4 in detail.
FIG. 6 is a diagram for explaining step S132 of FIG. 4 in detail.
FIG. 7 is a diagram for explaining step S133 of FIG. 4 in detail.
FIG. 8 is a diagram for explaining step S134 of FIG. 4 in detail.

Below, embodiments of the present invention are described clearly and in detail so that a person of ordinary skill in the art can easily practice the present invention.

FIG. 1 is a diagram illustrating a health status prediction system according to an embodiment of the present invention. Referring to FIG. 1, the health status prediction system (100) includes first to nth health prediction devices (111 to 11n), a terminal (120), an ensemble prediction device (130), and a network (140). For convenience of explanation, the number of health prediction devices is illustrated as n, but the number is not limited, and any plural number of health prediction devices may be provided.

Each of the first to nth health prediction devices (111 to 11n) can predict the health status of a user based on an individually constructed prediction model. Here, the prediction model may be a structure modeled to predict the health status at a future point in time using time-series medical data. Each of the first to nth health prediction devices (111 to 11n) can generate and train its prediction model using the corresponding one of the first to nth learning data (11 to 1n). The first to nth health prediction devices (111 to 11n) can be provided at different medical institutions or public institutions. The first to nth learning data (11 to 1n) can be kept in separate databases so that each institution can generate and train its own prediction model. The different medical institutions or public institutions can each train a prediction model individually and apply a user's time-series medical data to the resulting prediction model, thereby predicting the user's health status at a future point in time.

Each of the first to nth health prediction devices (111 to 11n) can receive raw learning data (31) from the ensemble prediction device (130) via the network (140). Here, the raw learning data (31) can be understood as data for training the ensemble model constructed in the ensemble prediction device (130). Each of the first to nth health prediction devices (111 to 11n) can apply the raw learning data (31) to its constructed prediction model to generate first to nth learning result data. Here, the first to nth learning result data can be understood as the results of each of the first to nth health prediction devices (111 to 11n) predicting a future health state from the raw learning data (31). The first to nth learning result data can be provided to the ensemble prediction device (130) via the network (140).

Since the first to nth learning result data are generated by different prediction models, they can have different data values. This is because each of the first to nth health prediction devices (111 to 11n) trains and builds its prediction model on different medical data, that is, on different first to nth learning data (11 to 1n). Because of the characteristics of medical data, such as ethical issues, legal issues, and personal privacy issues, it is difficult for medical institutions to share data and to consolidate it into big data. Therefore, the first to nth health prediction devices (111 to 11n) build their prediction models individually, and the ensemble prediction device (130) ensembles the result data predicted by the first to nth health prediction devices (111 to 11n), which makes possible a future health prediction that reflects learning from diverse data.

The terminal (120) can provide a request signal for predicting the future health of a user. The terminal (120) can be any electronic device capable of providing the request signal, such as a smartphone, a desktop, a laptop, or a wearable device. For example, the terminal (120) can provide the request signal to the ensemble prediction device (130) through the network (140), and the health status prediction system (100) can diagnose the user's health status or predict the user's future health status using the first to nth health prediction devices (111 to 11n) and the ensemble prediction device (130). To this end, the terminal (120) can provide time-series medical data to the ensemble prediction device (130) together with the request signal. The time-series medical data can mean data representing the user's health status generated by diagnosis, treatment, examination, medication prescription, or the like, and can be, for example, EMR (Electronic Medical Record) data or PHR (Personal Health Record) data.

The ensemble prediction device (130) trains an ensemble model using the first to nth learning result data. Here, the ensemble model may be a structure modeled to produce a final prediction of a future health state by ensembling the learning result data in which each of the first to nth health prediction devices (111 to 11n) predicted a health state. As described above, the ensemble prediction device (130) receives the first to nth learning result data, which are the results produced by each of the first to nth health prediction devices (111 to 11n) from the raw learning data (31). The ensemble prediction device (130) may generate ensemble learning data (32) by integrating the first to nth learning result data. The ensemble prediction device (130) trains the ensemble model based on the ensemble learning data (32).
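The integration of learning result data into ensemble learning data can be pictured with a minimal sketch. The record layout, the device identifiers, and the function name `build_ensemble_learning_data` are assumptions made for illustration; the description does not specify a data format.

```python
from typing import Dict, List

# Hypothetical shape: each device returns one row per raw-learning-data sample,
# and each row holds one predicted value per feature.
LearningResult = List[List[float]]


def build_ensemble_learning_data(results: Dict[str, LearningResult]) -> List[dict]:
    """Combine per-device learning result data into one flat list of records."""
    sample_counts = {len(rows) for rows in results.values()}
    if len(sample_counts) > 1:
        # All devices are assumed to predict on the same raw learning data.
        raise ValueError("devices returned different numbers of samples")
    records = []
    for device_id in sorted(results):
        for sample_idx, row in enumerate(results[device_id]):
            records.append(
                {"device": device_id, "sample": sample_idx, "features": list(row)}
            )
    return records
```

Keeping the device identifier and sample index on every record is a design choice of this sketch: it allows the same data to be regrouped later, either per feature or per device.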

The ensemble model can achieve higher performance as the diversity of the first to nth health prediction devices (111 to 11n) increases. This diversity can be determined by the diversity of the algorithms of the prediction models built in the respective health prediction devices, the diversity of the data values provided to each prediction model, and the diversity of the features (for example, blood pressure, cholesterol level, etc.) included in the data. However, the ensemble prediction device (130) cannot directly intervene in the prediction models built in the first to nth health prediction devices (111 to 11n). Therefore, if the prediction models were produced by learning from mutually similar data, algorithms, or features, overfitting may occur, in which accuracy drops sharply for dissimilar data.

The ensemble prediction device (130) may select target learning data to alleviate overfitting. The target learning data may be the learning data selected from among the first to nth learning result data for training the ensemble model. To select the target learning data, the ensemble prediction device (130) may receive first to nth meta information from the first to nth health prediction devices (111 to 11n), respectively. Each of the first to nth meta information may include information about the features, the algorithm, the scale, and the like that the corresponding health prediction device learns. The ensemble prediction device (130) may select the target learning data based on the similarity between the first to nth meta information, and may integrate the selected target learning data to generate the ensemble learning data (32). Here, integration may be understood as a simple listing or concatenation of data. The specific process of selecting target learning data is described later.
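As an illustration only, the selection of target learning data by meta-information similarity might be sketched as follows. The similarity score (an average of feature-set overlap, algorithm match, and scale ratio), the greedy rule, the 0.9 threshold, and all names are assumptions of this sketch, not the selection procedure of the embodiment, which is described later.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class MetaInfo:
    """Hypothetical meta information reported by one health prediction device."""
    device_id: str
    features: FrozenSet[str]  # features the device learns, e.g. blood pressure
    algorithm: str            # name of the learning algorithm
    scale: int                # assumed meaning: number of training records


def similarity(a: MetaInfo, b: MetaInfo) -> float:
    """Assumed score in [0, 1]: mean of feature overlap, algorithm match, scale ratio."""
    union = a.features | b.features
    jaccard = len(a.features & b.features) / len(union) if union else 1.0
    same_algorithm = 1.0 if a.algorithm == b.algorithm else 0.0
    larger = max(a.scale, b.scale)
    scale_ratio = min(a.scale, b.scale) / larger if larger else 1.0
    return (jaccard + same_algorithm + scale_ratio) / 3.0


def select_targets(metas: List[MetaInfo], threshold: float = 0.9) -> List[str]:
    """Greedily keep a device only if it is not too similar to any kept device."""
    kept: List[MetaInfo] = []
    for meta in metas:
        if all(similarity(meta, other) < threshold for other in kept):
            kept.append(meta)
    return [meta.device_id for meta in kept]
```

With this rule, two devices that report identical features, algorithm, and scale contribute only one set of learning result data, which is one way to limit the influence of near-duplicate predictors on the ensemble.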

Using the ensemble learning data (32), the ensemble prediction device (130) can generate an ensemble model based on the correlation between the target learning data and the correlation between the feature data (hereinafter, "features") included in the target learning data. The ensemble prediction device (130) can generate and train a target relationship model by classifying the ensemble learning data (32) by feature and inputting it into the target relationship model. This target relationship model can be used to analyze the correlation between the target learning data. The ensemble prediction device (130) can generate and train a feature relationship model by classifying the ensemble learning data (32) by learning data and inputting it into the feature relationship model. This feature relationship model can be used to analyze the correlation between features. Thereafter, the ensemble prediction device (130) can merge the target relationship model and the feature relationship model and tune the result to optimize the ensemble model. The specific process of generating the ensemble model is described later.
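The two groupings of the ensemble learning data can be sketched as two views of the same nested list. The `[device][sample][feature]` layout and the function names are assumptions made for illustration; the models that consume these views are not shown.

```python
from typing import List

# Assumed layout of the ensemble learning data: data[device][sample][feature].
Tensor3 = List[List[List[float]]]


def group_by_feature(data: Tensor3) -> Tensor3:
    """Return view[feature][sample][device].

    Each slice holds every device's prediction of one feature, which is the
    grouping fed to the target relationship model.
    """
    n_devices = len(data)
    n_samples = len(data[0])
    n_features = len(data[0][0])
    return [
        [[data[d][s][f] for d in range(n_devices)] for s in range(n_samples)]
        for f in range(n_features)
    ]


def group_by_learning_data(data: Tensor3) -> Tensor3:
    """Return view[device][sample][feature].

    Each slice holds all features predicted by one device, which is the
    grouping fed to the feature relationship model.
    """
    return [[list(row) for row in device_rows] for device_rows in data]
```

The first view puts the same feature from different devices side by side, and the second puts different features from the same device side by side, mirroring the two kinds of correlation the paragraph distinguishes.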

The ensemble prediction device (130) can predict and analyze the user's future health status based on the ensemble model. At the request of the terminal (120), the ensemble prediction device (130) can provide the time-series medical data to the first to nth health prediction devices (111 to 11n). The ensemble prediction device (130) can receive first to nth prediction result data from the first to nth health prediction devices (111 to 11n), respectively. The ensemble prediction device (130) can predict the user's future health status by ensembling the first to nth prediction result data based on the ensemble model.
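The prediction flow above (distribute the time-series data, collect each device's prediction, ensemble the results) can be sketched as follows. Local callables stand in for the networked health prediction devices, and a per-device, per-feature weighted average stands in for the ensemble model; both are assumptions of this sketch rather than the embodiment's model.

```python
from typing import Callable, List, Sequence

# A stand-in for one remote health prediction device:
# time series (rows of feature values) -> one predicted value per feature.
Predictor = Callable[[Sequence[Sequence[float]]], List[float]]


def ensemble_predict(
    series: Sequence[Sequence[float]],
    devices: List[Predictor],
    weights: List[List[float]],
) -> List[float]:
    """Combine device predictions; weights[d][f] is an assumed learned weight."""
    predictions = [device(series) for device in devices]  # one call per device
    n_features = len(predictions[0])
    combined = []
    for f in range(n_features):
        total = sum(weights[d][f] for d in range(len(devices)))
        weighted = sum(weights[d][f] * predictions[d][f] for d in range(len(devices)))
        combined.append(weighted / total)  # normalize so weights need not sum to 1
    return combined


def last_value(series: Sequence[Sequence[float]]) -> List[float]:
    """Toy predictor: repeat the most recent observation."""
    return list(series[-1])


def mean_value(series: Sequence[Sequence[float]]) -> List[float]:
    """Toy predictor: average each feature over time."""
    return [sum(col) / len(col) for col in zip(*series)]
```

For example, `ensemble_predict([[1.0, 10.0], [3.0, 30.0]], [last_value, mean_value], [[1.0, 3.0], [1.0, 1.0]])` weights the two toy predictors equally for the first feature and 3:1 for the second.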

The network (140) may be configured to carry data communication among the first to nth health prediction devices (111 to 11n), the terminal (120), and the ensemble prediction device (130). The first to nth health prediction devices (111 to 11n), the terminal (120), and the ensemble prediction device (130) may exchange data over the network (140) by wire or wirelessly. Unlike what is illustrated in FIG. 1, the network for data communication between the first to nth health prediction devices (111 to 11n) and the ensemble prediction device (130) may be separate from the network for data communication between the terminal (120) and the ensemble prediction device (130).

FIG. 2 is an exemplary block diagram of the ensemble prediction device of FIG. 1. The block diagram of FIG. 2 should be understood as an exemplary configuration for generating and training an ensemble model and for predicting or analyzing a future health state using the ensemble model; the structure of the ensemble prediction device (130) is not limited thereto. Referring to FIG. 2, the ensemble prediction device (130) may include a network interface (131), a processor (132), a memory (133), storage (136), and a bus (137). For example, the ensemble prediction device (130) may be implemented as a server, but is not limited thereto. For convenience of explanation, FIG. 2 is described with reference to the reference numerals of FIG. 1.

The network interface (131) is configured to communicate with the first to nth health prediction devices (111 to 11n) and the terminal (120) via the network (140) of FIG. 1. For example, to generate the ensemble model, the network interface (131) may provide the raw learning data (31) to the first to nth health prediction devices (111 to 11n). The network interface (131) may receive the first to nth learning result data, which are the analysis results of the first to nth health prediction devices (111 to 11n), and provide them to the processor (132), the memory (133), or the storage (136) via the bus (137).

To predict or analyze the user's future health, the network interface (131) can receive the request signal and the time-series medical data from the terminal (120), and provide the time-series medical data to the first to nth health prediction devices (111 to 11n). The network interface (131) can receive the first to nth prediction result data from the first to nth health prediction devices (111 to 11n), and provide them to the processor (132), the memory (133), or the storage (136) via the bus (137). The network interface (131) can provide the terminal (120) with the final prediction of the future health state generated by ensembling the first to nth prediction result data.

The processor (132) may function as the central processing unit of the ensemble prediction device (130). The processor (132) may perform the control operations and arithmetic operations required for generating and training the ensemble model and for predicting and analyzing future health based on the ensemble model. For example, under the control of the processor (132), the network interface (131) may provide the raw learning data (31) or the time-series medical data to the first to nth health prediction devices (111 to 11n), and receive the learning result data or the prediction result data. Under the control of the processor (132), the operation of selecting target learning data for generating the ensemble model, the operations of training the target relationship model and the feature relationship model, and the like may be performed. The processor (132) may operate using the computational space of the memory (133), and may read from the storage (136) the files for running an operating system and the executable files of applications. The processor (132) may execute the operating system and various applications.

The memory (133) can store data and process codes that have been or are to be processed by the processor (132). For example, the memory (133) can store the raw learning data (31), the first to nth learning result data, information for selecting target learning data, the ensemble learning data (32), or information for constructing the ensemble model. In addition, the memory (133) can store the time-series medical data, the prediction result data provided by the health prediction devices, or information on the final future-health prediction produced by ensembling. The memory (133) can be used as the main memory of the ensemble prediction device (130). The memory (133) can include DRAM (Dynamic RAM), SRAM (Static RAM), PRAM (Phase-change RAM), MRAM (Magnetic RAM), FeRAM (Ferroelectric RAM), RRAM (Resistive RAM), and the like.

The memory (133) may include an ensemble model learning unit (134) and a health prediction unit (135). The ensemble model learning unit (134) and the health prediction unit (135) may be part of the computational space of the memory (133). In this case, the ensemble model learning unit (134) and the health prediction unit (135) may be implemented as firmware or software. For example, the firmware may be stored in the storage (136) and loaded into the memory (133) when it is executed. The processor (132) may execute the firmware loaded into the memory (133). The ensemble model learning unit (134) may operate under the control of the processor (132) to generate and train the ensemble model. The health prediction unit (135) may operate under the control of the processor (132) to predict and analyze the user's future health status using the ensemble model. The specific operations of the ensemble model learning unit (134) and the health prediction unit (135) are described later.

Unlike what is shown in FIG. 2, the ensemble model learning unit (134) and the health prediction unit (135) may be implemented as separate hardware for constructing the ensemble model and predicting the user's future health status. For example, the ensemble model learning unit (134) and the health prediction unit (135) may be implemented as a neuromorphic chip or the like that constructs the ensemble model by learning through an artificial neural network, or as a dedicated logic circuit such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).

스토리지(136)는 운영 체제 또는 어플리케이션들에 의해 장기적인 저장을 목적으로 생성되는 데이터, 운영 체제를 구동하기 위한 파일, 또는 어플리케이션들의 실행 파일 등을 저장할 수 있다. 예를 들어, 스토리지(136)는 앙상블 모델 학습부(134) 및 건강 예측부(135)의 실행을 위한 파일들을 저장할 수 있다. 스토리지(136)는 앙상블 예측 장치(130)의 보조 기억 장치로 이용될 수 있다. 스토리지(136)는 플래시 메모리, PRAM (Phase-change RAM), MRAM (Magnetic RAM), FeRAM (Ferroelectric RAM), RRAM (Resistive RAM) 등을 포함할 수 있다.Storage (136) can store data generated for long-term storage by an operating system or applications, files for operating an operating system, or executable files of applications. For example, storage (136) can store files for executing an ensemble model learning unit (134) and a health prediction unit (135). Storage (136) can be used as an auxiliary memory device of an ensemble prediction device (130). Storage (136) can include flash memory, PRAM (Phase-change RAM), MRAM (Magnetic RAM), FeRAM (Ferroelectric RAM), RRAM (Resistive RAM), etc.

버스(137)는 앙상블 예측 장치(130)의 구성 요소들 사이에서 통신 경로를 제공할 수 있다. 네트워크 인터페이스(131), 프로세서(132), 메모리(133), 및 스토리지(136) 는 버스(137)를 통해 서로 데이터를 교환할 수 있다. 버스(137)는 앙상블 예측 장치(130)에서 이용되는 다양한 유형의 통신 포맷을 지원하도록 구성될 수 있다.The bus (137) can provide a communication path between components of the ensemble prediction device (130). The network interface (131), the processor (132), the memory (133), and the storage (136) can exchange data with each other through the bus (137). The bus (137) can be configured to support various types of communication formats used in the ensemble prediction device (130).

도 3은 도 2의 앙상블 예측 장치의 동작 방법에 대한 순서도이다. 도 3을 참조하면, 앙상블 예측 장치의 동작 방법은 앙상블 모델을 학습하는 단계(S100) 및 미래 건강 상태를 예측하는 단계(S200)로 구분될 수 있다. 도 3의 각 단계들은 도 2의 프로세서(132)에 의하여 실행될 수 있다. S100 단계는 프로세서(132)의 제어 하에, 앙상블 모델 학습부(134)에서 처리될 수 있다. S200 단계는 프로세서(132)의 제어 하에, 건강 예측부(135)에서 처리될 수 있다. 설명의 편의상 도 1 및 도 2의 도면 부호를 참조하여, 도 3이 설명된다.FIG. 3 is a flowchart of an operation method of the ensemble prediction device of FIG. 2. Referring to FIG. 3, the operation method of the ensemble prediction device can be divided into a step (S100) of learning an ensemble model and a step (S200) of predicting a future health state. Each step of FIG. 3 can be executed by the processor (132) of FIG. 2. Step S100 can be processed in an ensemble model learning unit (134) under the control of the processor (132). Step S200 can be processed in a health prediction unit (135) under the control of the processor (132). For convenience of explanation, FIG. 3 will be described with reference to the drawing symbols of FIGS. 1 and 2.

S110 단계에서, 앙상블 예측 장치(130)는 원시 학습 데이터(31)를 건강 예측 장치들(111~11n)에 제공한다. 원시 학습 데이터(31)는 시계열 데이터이다. 원시 학습 데이터(31)는 시간의 흐름에 따른 특징 데이터를 포함할 수 있다. 예를 들어, 원시 학습 데이터(31)는 측정 또는 진단된 시간을 나타내는 시간 데이터를 포함할 수 있다. 원시 학습 데이터(31)는 혈압, 콜레스테롤 수치, 몸무게 등 다양한 건강 지표를 나타내는 특징 데이터를 포함할 수 있다.In step S110, the ensemble prediction device (130) provides raw learning data (31) to the health prediction devices (111 to 11n). The raw learning data (31) is time series data. The raw learning data (31) may include feature data according to the flow of time. For example, the raw learning data (31) may include time data indicating a time of measurement or diagnosis. The raw learning data (31) may include feature data indicating various health indicators such as blood pressure, cholesterol level, and body weight.

S120 단계에서, 앙상블 예측 장치(130)는 건강 예측 장치들(111~11n)로부터 학습 결과 데이터 및 메타 정보들을 수신한다. 학습 결과 데이터는 건강 예측 장치들(111~11n) 각각이 원시 학습 데이터(31)를 이용하여 건강 상태를 예측한 결과 데이터일 수 있다. 건강 예측 장치들(111~11n) 각각에 대응되는 메타 정보는 해당 예측 모델에서 학습한 특징 데이터, 학습 알고리즘, 및 제1 내지 제n 학습 데이터(11~1n) 중 해당 건강 예측 장치에 대응되는 학습 데이터의 규모 등에 대한 정보를 포함할 수 있다. 앙상블 모델 학습부(134)는 메타 정보들 및 학습 결과 데이터를 제공받을 수 있다.In step S120, the ensemble prediction device (130) receives learning result data and meta information from the health prediction devices (111 to 11n). The learning result data may be result data in which each of the health prediction devices (111 to 11n) predicts a health status using the raw learning data (31). The meta information corresponding to each of the health prediction devices (111 to 11n) may include information about feature data learned from the corresponding prediction model, a learning algorithm, and the size of learning data corresponding to the corresponding health prediction device among the first to nth learning data (11 to 1n). The ensemble model learning unit (134) may be provided with meta information and learning result data.

S130 단계에서, 앙상블 예측 장치(130)는 수신된 메타 정보들 및 학습 결과 데이터에 기초하여 앙상블 모델을 생성할 수 있다. S130 단계는 프로세서(132)의 제어 하에, 앙상블 모델 학습부(134)에서 수행될 수 있다. 앙상블 모델 학습부(134)는 메타 정보들 사이의 유사도에 기초하여 학습 결과 데이터 중 일부를 선별할 수 있다. 선별된 학습 결과 데이터는 타겟 학습 데이터, 즉 앙상블 학습 데이터(32)로 결정될 수 있다. 앙상블 모델 학습부(134)는 앙상블 학습 데이터(32)에 기초하여 타겟 관계 모델, 특징 관계 모델을 생성하고, 이를 병합 및 튜닝하여 앙상블 모델을 생성할 수 있다. 생성된 앙상블 모델은 스토리지(136)에 구축될 수 있으나, 이에 제한되지 않고, 별도의 서버 또는 저장 매체에 구축될 수 있다. S130 단계의 구체적인 과정들은 도 4에서 후술된다.In step S130, the ensemble prediction device (130) can generate an ensemble model based on the received meta information and learning result data. Step S130 can be performed in the ensemble model learning unit (134) under the control of the processor (132). The ensemble model learning unit (134) can select some of the learning result data based on the similarity between the meta information. The selected learning result data can be determined as target learning data, i.e., ensemble learning data (32). The ensemble model learning unit (134) can generate a target relationship model and a feature relationship model based on the ensemble learning data (32), and merge and tune them to generate an ensemble model. The generated ensemble model can be built in the storage (136), but is not limited thereto, and can be built in a separate server or storage medium. Specific processes of step S130 are described below in FIG. 4.

S200 단계에서, 앙상블 예측 장치(130)는 생성된 앙상블 모델에 기초하여, 사용자의 미래 건강 상태를 예측할 수 있다. 이를 위하여, S210 단계에서, 앙상블 예측 장치(130)는 단말기(120)로부터 시계열 의료 데이터를 수신할 수 있다. 프로세서(132)의 제어 하에, 네트워크 인터페이스(131)는 네트워크(140)를 통하여 시계열 의료 데이터를 수신할 수 있다. 시계열 의료 데이터는 사용자의 다양한 건강 지표를 나타내는 다양한 특징 데이터를 포함할 수 있다.At step S200, the ensemble prediction device (130) can predict the future health status of the user based on the generated ensemble model. To this end, at step S210, the ensemble prediction device (130) can receive time series medical data from the terminal (120). Under the control of the processor (132), the network interface (131) can receive the time series medical data through the network (140). The time series medical data can include various feature data representing various health indicators of the user.

S220 단계에서, 앙상블 예측 장치(130)는 제1 내지 제n 건강 예측 장치들(111~11n)에 건강 예측을 요청할 수 있다. 이를 위하여, 앙상블 예측 장치(130)는 단말기(120)로부터 수신된 시계열 의료 데이터를 제1 내지 제n 건강 예측 장치들(111~11n)에 제공할 수 있다. 사용자의 시계열 의료 데이터는 제1 내지 제n 건강 예측 장치들(111~11n) 각각의 개별적으로 구축된 예측 모델에 입력될 수 있다. 그 결과, 제1 내지 제n 건강 예측 장치들(111~11n)은 개별적인 예측 모델에 의하여, 제1 내지 제n 예측 결과 데이터를 생성할 수 있다. 제1 내지 제n 예측 결과 데이터는 네트워크(140)를 통하여 앙상블 예측 장치(130)에 제공될 수 있다.In step S220, the ensemble prediction device (130) can request health prediction from the first to nth health prediction devices (111 to 11n). To this end, the ensemble prediction device (130) can provide time series medical data received from the terminal (120) to the first to nth health prediction devices (111 to 11n). The user's time series medical data can be input into individually constructed prediction models of each of the first to nth health prediction devices (111 to 11n). As a result, the first to nth health prediction devices (111 to 11n) can generate first to nth prediction result data by their individual prediction models. The first to nth prediction result data can be provided to the ensemble prediction device (130) through the network (140).

S230 단계에서, 앙상블 예측 장치(130)는 제1 내지 제n 건강 예측 장치들(111~11n)로부터 수신된 제1 내지 제n 예측 결과 데이터를 앙상블할 수 있다. S230 단계는 프로세서(132)의 제어 하에, 건강 예측부(135)에서 수행될 수 있다. 건강 예측부(135)는 S130 단계에서 생성된 앙상블 모델에 기초하여, 제1 내지 제n 예측 결과 데이터를 앙상블하고, 사용자의 미래 건강 상태를 예측 및 분석할 수 있다. 예측 및 분석된 사용자의 미래 건강 상태에 대한 정보는 네트워크(140)를 통하여, 단말기(120)에 제공될 수 있다. In step S230, the ensemble prediction device (130) can ensemble the first to nth prediction result data received from the first to nth health prediction devices (111 to 11n). Step S230 can be performed in the health prediction unit (135) under the control of the processor (132). The health prediction unit (135) can ensemble the first to nth prediction result data based on the ensemble model generated in step S130, and predict and analyze the future health status of the user. The predicted and analyzed information on the future health status of the user can be provided to the terminal (120) through the network (140).
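The prediction phase (steps S210 to S230) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the source: each health prediction device is modeled as a plain Python callable, the ensemble model is reduced to one fixed weight per device, and the names `predict_future_health`, `last_value`, and `mean_value` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# A time series is modeled as: feature name -> list of measurements.
TimeSeries = Dict[str, List[float]]
# Assumption: a device returns one predicted value per input feature.
Device = Callable[[TimeSeries], Dict[str, float]]


def predict_future_health(
    series: TimeSeries,
    devices: Sequence[Device],
    weights: Sequence[float],
) -> Dict[str, float]:
    """S220: send the series to every device; S230: ensemble the results."""
    if len(devices) != len(weights):
        raise ValueError("one weight per device is required")
    results = [device(series) for device in devices]  # S220
    total = sum(weights)
    combined: Dict[str, float] = {}
    for feature in series:  # S230: weighted average per feature
        combined[feature] = sum(
            w * r[feature] for w, r in zip(weights, results)
        ) / total
    return combined


# Two toy devices: one repeats the last value, one averages the history.
def last_value(s: TimeSeries) -> Dict[str, float]:
    return {k: v[-1] for k, v in s.items()}


def mean_value(s: TimeSeries) -> Dict[str, float]:
    return {k: sum(v) / len(v) for k, v in s.items()}
```

In the source, the combination in step S230 is performed by the learned ensemble model rather than by fixed weights; the weighted average here only stands in for that model.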

앙상블 예측 장치(130)를 이용하여, 다양한 기관들(제1 내지 제n 건강 예측 장치들(111~11n))로부터 학습된 예측 모델을 이용하여, 시계열 의료 데이터를 분석할 수 있다. 시계열 의료 데이터는 시간의 흐름에 따른 특징들에 대한 정보를 나타내므로, 시간의 경과에 따른 건강 상태의 추이가 분석될 수 있다. 이를 이용하면, 미래의 특정 시점에서의 건강 상태가 분석될 수 있다. 다만, 건강 예측 장치는 한정된 학습 데이터를 이용하여 예측 모델이 생성되므로, 앙상블 예측 장치(130)는 다양한 기관들로부터 출력된 예측 결과 데이터를 통합하여 건강 상태 예측의 정확성을 증가시킬 수 있다. 이러한 통합에 기관들 각각의 예측 모델들 사이의 상관 관계 및 특징들 사이의 상관 관계가 고려되어, 미래의 특정 시점의 건강 상태에 대한 예측 정확성이 증가될 수 있다.With the ensemble prediction device (130), time-series medical data can be analyzed using prediction models trained at various institutions (the first to nth health prediction devices (111 to 11n)). Since the time-series medical data represents information about features over time, the trend of health status over time can be analyzed. Using this, the health status at a specific point in the future can be analyzed. However, since each health prediction device builds its prediction model from limited learning data, the ensemble prediction device (130) can increase the accuracy of health status prediction by integrating the prediction result data output from the various institutions. Because this integration takes into account the correlations between the institutions' prediction models and the correlations between features, the accuracy of predicting the health status at a specific point in the future can be increased.

도 4는 도 3의 S130 단계를 구체화한 순서도이다. 즉, 도 4는 앙상블 예측 장치(130)의 앙상블 모델을 생성하는 단계를 구체화한 도면이다. 도 4의 각 단계들은 도 2의 프로세서(132)의 제어 하에, 앙상블 모델 학습부(134)에서 처리될 수 있다. 설명의 편의상 도 1 및 도 2의 도면 부호를 참조하여, 도 4가 설명된다.Fig. 4 is a flowchart that embodies step S130 of Fig. 3. That is, Fig. 4 is a diagram that embodies a step of generating an ensemble model of an ensemble prediction device (130). Each step of Fig. 4 can be processed in an ensemble model learning unit (134) under the control of the processor (132) of Fig. 2. For convenience of explanation, Fig. 4 is described with reference to the drawing symbols of Figs. 1 and 2.

S131 단계에서, 앙상블 모델 학습부(134)는 타겟 예측 장치를 선별할 수 있다. 상술하였듯이, 앙상블 예측 장치(130)는 제1 내지 제n 건강 예측 장치들(111~11n)로부터 원시 학습 데이터(31)의 송신에 응답하여, 메타 정보들 및 학습 결과 데이터를 수신한다. 앙상블 모델 학습부(134)는 메타 정보들 사이의 유사도에 따라 메타 정보들을 하나 이상의 그룹으로 클러스터링하고, 클러스터링된 그룹 별로 하나의 대표를 선택할 수 있다. 즉, 제1 내지 제n 건강 예측 장치들(111~11n) 중 앙상블 모델을 생성 및 학습하기 위한 대상들, 즉 타겟 예측 장치들이 선택될 수 있다. 선택된 대표들에 대응되는 학습 결과 데이터가 타겟 학습 데이터, 즉 앙상블 학습 데이터(32)로 결정될 수 있다. 이에 대한 내용은 도 5에 도시된다.In step S131, the ensemble model learning unit (134) can select target prediction devices. As described above, after transmitting the raw learning data (31), the ensemble prediction device (130) receives, in response, meta information and learning result data from the first to nth health prediction devices (111 to 11n). The ensemble model learning unit (134) can cluster the meta information into one or more groups according to the similarity between the meta information and select one representative for each clustered group. That is, among the first to nth health prediction devices (111 to 11n), the targets for generating and learning the ensemble model, that is, the target prediction devices, can be selected. The learning result data corresponding to the selected representatives can be determined as the target learning data, that is, the ensemble learning data (32). The details thereof are illustrated in FIG. 5.

S132 단계에서, 앙상블 모델 학습부(134)는 앙상블 학습 데이터(32)에 기초하여 타겟 관계 모델을 생성 및 학습할 수 있다. 앙상블 모델 학습부(134)는 앙상블 학습 데이터(32)로 결정된 복수의 타겟 학습 데이터 사이의 상관 관계에 기초하여 타겟 관계 모델을 생성 및 학습할 수 있다. 앙상블 모델 학습부(134)는 특징을 기준으로, 앙상블 학습 데이터(32)를 재구성할 수 있다. 예를 들어, 앙상블 모델 학습부(134)는 제1 내지 제n 타겟 학습 데이터 각각의 혈압과 관련된 특징 데이터 간의 상관 관계를 분석 가능하도록, 특징 별로 앙상블 학습 데이터(32)를 재구성할 수 있다. 앙상블 모델 학습부(134)는 타겟 학습 데이터 각각에 포함된 동일한 특징에 대응되는 특징 데이터 사이의 상관 관계를 분석함으로써, 타겟 학습 데이터 사이의 상관 관계를 분석할 수 있다. 타겟 관계 모델은 다양한 특징(건강 지표)들에 대한 기관들(건강 예측 장치들) 별 예측 정확성을 분석하도록 구축되고, 이에 따라 기관들 각각에 대한 가중치를 결정할 수 있다. 이에 대한 내용은 도 6에 도시된다.In step S132, the ensemble model learning unit (134) can generate and learn a target relationship model based on the ensemble learning data (32). The ensemble model learning unit (134) can generate and learn the target relationship model based on a correlation between the plurality of target learning data determined as the ensemble learning data (32). The ensemble model learning unit (134) can reconstruct the ensemble learning data (32) on a per-feature basis. For example, the ensemble model learning unit (134) can reconstruct the ensemble learning data (32) by feature so that the correlation between the blood-pressure-related feature data of each of the first to nth target learning data can be analyzed. The ensemble model learning unit (134) can analyze the correlation between the target learning data by analyzing the correlation between feature data corresponding to the same feature included in each of the target learning data. The target relationship model is constructed to analyze the prediction accuracy of each institution (health prediction device) for various features (health indicators), and accordingly, the weight for each institution can be determined. This is illustrated in Fig. 6.

S133 단계에서, 앙상블 모델 학습부(134)는 앙상블 학습 데이터(32)에 기초하여 특징 관계 모델을 생성 및 학습할 수 있다. 앙상블 모델 학습부(134)는 복수의 타겟 학습 데이터 각각에 포함된 복수의 특징 데이터 사이의 상관 관계에 기초하여 특징 관계 모델을 생성 및 학습할 수 있다. 앙상블 모델 학습부(134)는 타겟 학습 데이터를 기준으로, 앙상블 학습 데이터(32)를 분리할 수 있다. 예를 들어, 앙상블 모델 학습부(134)는 하나의 타겟 학습 데이터에 포함된 제1 내지 제x 특징 데이터 간의 상관 관계를 분석 가능하도록, 타겟 학습 데이터 별로 앙상블 학습 데이터(32)를 분리할 수 있다. 앙상블 모델 학습부(134)는 타겟 학습 데이터 내의 서로 다른 특징들 사이의 상관 관계를 분석할 수 있다. 특징 관계 모델은 다양한 특징(건강 지표)들 사이의 연관성 및 유사성을 분석하도록 구축되고, 이에 따른 특징들 각각에 대한 가중치를 결정할 수 있다. 이에 대한 내용은 도 7에 도시된다.In step S133, the ensemble model learning unit (134) can generate and learn a feature relationship model based on the ensemble learning data (32). The ensemble model learning unit (134) can generate and learn a feature relationship model based on a correlation between a plurality of feature data included in each of a plurality of target learning data. The ensemble model learning unit (134) can separate the ensemble learning data (32) based on the target learning data. For example, the ensemble model learning unit (134) can separate the ensemble learning data (32) by target learning data so as to be able to analyze a correlation between the first to xth feature data included in one target learning data. The ensemble model learning unit (134) can analyze a correlation between different features in the target learning data. The feature relationship model is constructed to analyze the correlation and similarity between various features (health indicators), and can determine a weight for each feature accordingly. The details thereof are illustrated in FIG. 7.

S134 단계에서, 앙상블 모델 학습부(134)는 타겟 관계 모델 및 특징 관계 모델을 병합함으로써, 앙상블 모델을 구축할 수 있다. 앙상블 모델 학습부(134)는 타겟 관계 모델의 출력과 특징 관계 모델의 입력을 연결함으로써, 두 모델들을 병합(머징)하고, 앙상블 모델을 생성할 수 있다. 앙상블 모델 학습부(134)는 병합된 앙상블 모델에 다시 앙상블 학습 데이터(32)를 입력할 수 있다. 그리고, 앙상블 모델 학습부(134)는 앙상블 모델의 출력 결과를 분석하여, 타겟 학습 데이터를 생성하는 건강 예측 장치들(기관들), 그리고 특징들에 대한 가중치를 조정하는 튜닝 과정을 수행할 수 있다. 이에 대한 내용은 도 8에 도시된다.In step S134, the ensemble model learning unit (134) can build an ensemble model by merging the target relationship model and the feature relationship model. The ensemble model learning unit (134) can merge the two models by connecting the output of the target relationship model to the input of the feature relationship model, and thereby generate an ensemble model. The ensemble model learning unit (134) can input the ensemble learning data (32) again into the merged ensemble model. Then, the ensemble model learning unit (134) can analyze the output of the ensemble model and perform a tuning process that adjusts the weights for the health prediction devices (institutions) that generate the target learning data and the weights for the features. The details thereof are illustrated in FIG. 8.
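The merge in step S134, where the output of the target relationship model feeds the input of the feature relationship model, can be illustrated with two linear stages. This is a sketch under assumptions: both models are reduced to plain weight matrices, each feature is a single scalar rather than a time series, and `target_stage`, `feature_stage`, and `ensemble` are hypothetical names.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def target_stage(by_feature: Matrix, tm_weights: Matrix) -> List[float]:
    """Target relationship stage: per feature, combine the n institutions' values."""
    return [
        sum(v * w for v, w in zip(values, weights))
        for values, weights in zip(by_feature, tm_weights)
    ]


def feature_stage(features: Sequence[float], fm_weights: Matrix) -> List[float]:
    """Feature relationship stage: mix the x per-feature values with each other."""
    return [sum(f * w for f, w in zip(features, row)) for row in fm_weights]


def ensemble(by_feature: Matrix, tm_weights: Matrix, fm_weights: Matrix) -> List[float]:
    """Merged model: the target stage output is wired into the feature stage input."""
    return feature_stage(target_stage(by_feature, tm_weights), fm_weights)
```

Tuning, as described above, would then adjust `tm_weights` (per institution) and `fm_weights` (per feature) together by feeding the ensemble learning data through the merged function.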

S135 단계에서, 앙상블 모델 학습부(134)는 구축된 앙상블 모델의 성능을 평가한다. 앙상블 모델 학습부(134)는 앙상블 모델로부터 출력된 결과 데이터와 원시 학습 데이터(31)에 의하여 기대되는 결과 데이터를 비교할 수 있다. 이러한 비교에 기초하여, 앙상블 모델 학습부(134)는 앙상블 모델의 성능을 평가할 수 있다. 원시 학습 데이터(31) 및 이에 대하여 기대되는 결과 데이터, 즉 원시 학습 데이터(31)에 대한 미래 건강 상태의 예측 결과는 앙상블 모델의 구축을 위하여, 미리 설정될 수 있고, 메모리(133)에 저장될 수 있다.In step S135, the ensemble model learning unit (134) evaluates the performance of the constructed ensemble model. The ensemble model learning unit (134) can compare the result data output from the ensemble model with the result data expected by the raw learning data (31). Based on this comparison, the ensemble model learning unit (134) can evaluate the performance of the ensemble model. The raw learning data (31) and the result data expected therefor, that is, the prediction result of the future health status for the raw learning data (31), can be set in advance for constructing the ensemble model and can be stored in the memory (133).

S136 단계에서, 앙상블 모델 학습부(134)는 앙상블 모델의 평가된 성능과 기준 성능을 비교할 수 있다. 기준 성능은 미리 설정될 수 있고, 메모리(133)에 저장될 수 있다. 앙상블 모델의 성능이 기준 성능 이상인 경우 (또는 높은 경우), 구축된 앙상블 모델이 최종 앙상블 모델로 결정되어 앙상블 모델을 생성하는 단계가 종료될 수 있다. 앙상블 모델의 성능이 기준 성능보다 낮은 경우 (또는 이하인 경우), S131 단계가 다시 진행된다. 이 경우, 앙상블 모델 학습부(134)는 타겟 학습 데이터를 다시 선별할 수 있다. 앙상블 모델 학습부(134)는 메타 정보들을 다시 클러스터링하거나, 클러스터링된 그룹에서 대표를 다시 선택할 수 있다. 앙상블 모델 학습부(134)는 앙상블 모델의 성능이 기준 성능을 만족할 때까지, S132 단계 내지 S135 단계를 반복할 수 있다.In step S136, the ensemble model learning unit (134) can compare the evaluated performance of the ensemble model with the reference performance. The reference performance can be preset and stored in the memory (133). If the performance of the ensemble model is equal to or higher than (or higher than) the reference performance, the constructed ensemble model can be determined as the final ensemble model, and the step of generating the ensemble model can be terminated. If the performance of the ensemble model is lower than (or equal to or lower than) the reference performance, step S131 is performed again. In this case, the ensemble model learning unit (134) can re-select the target learning data. The ensemble model learning unit (134) can re-cluster the meta information or re-select a representative from the clustered group. The ensemble model learning unit (134) can repeat steps S132 to S135 until the performance of the ensemble model satisfies the reference performance.
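The control flow of steps S131 to S136 amounts to a loop that re-selects targets and rebuilds the model until the evaluated performance reaches the reference. The sketch below is one possible reading: the three callables are placeholders for the steps described above, and `max_attempts` is an added safeguard that the source does not mention.

```python
from typing import Callable, Optional, Tuple, TypeVar

Model = TypeVar("Model")
Data = TypeVar("Data")


def build_until_good_enough(
    select_targets: Callable[[int], Data],  # S131 (attempt index -> data)
    build_model: Callable[[Data], Model],   # S132 to S134
    evaluate: Callable[[Model], float],     # S135
    reference: float,
    max_attempts: int = 10,                 # assumption: not in the source
) -> Tuple[Optional[Model], float, int]:
    """Repeat selection and building until the score reaches `reference`."""
    best_model, best_score = None, float("-inf")
    for attempt in range(max_attempts):
        data = select_targets(attempt)
        model = build_model(data)
        score = evaluate(model)
        if score > best_score:
            best_model, best_score = model, score
        if score >= reference:              # S136: "equal to or higher"
            return model, score, attempt + 1
    # Reference never reached: fall back to the best model seen so far.
    return best_model, best_score, max_attempts
```

Passing the attempt index to `select_targets` lets a caller vary the clustering or the representative choice on each retry, which corresponds to the re-clustering and re-selection described for the failing case.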

도 5는 도 4의 S131 단계를 구체적으로 설명하기 위한 도면이다. 즉, 도 5는 앙상블 예측 장치(130)의 타겟 예측 장치를 선별하는 단계를 구체화한 도면이다. 도 5의 각 단계들은 도 2의 프로세서(132)의 제어 하에, 앙상블 모델 학습부(134)에서 처리될 수 있다. 설명의 편의상 도 1 및 도 2의 도면 부호를 참조하여, 도 5가 설명된다.FIG. 5 is a drawing specifically explaining step S131 of FIG. 4. That is, FIG. 5 is a drawing specifically illustrating a step of selecting a target prediction device of an ensemble prediction device (130). Each step of FIG. 5 can be processed in an ensemble model learning unit (134) under the control of the processor (132) of FIG. 2. For convenience of explanation, FIG. 5 is explained with reference to the drawing symbols of FIG. 1 and FIG. 2.

S131a 단계에서, 앙상블 모델 학습부(134)는 메타 정보의 유사도를 계산한다. 앙상블 모델 학습부(134)는 제1 내지 제n 건강 예측 장치들(111~11n) 각각에 대한 메타 정보들을 수신한다. 예시적으로, 타겟 풀 상에 메타 정보들이 원형으로 도시된다. 앙상블 모델 학습부(134)는 메타 정보들을 통하여, 건강 예측 장치들(111~11n) 각각에 구축된 예측 모델들이 학습한 특징들, 예측 모델들의 알고리즘들, 학습 데이터(11~1n)의 규모를 분석할 수 있다. 앙상블 모델 학습부(134)는 분석된 결과에 기초하여 타겟 풀 상에 메타 정보들을 배치할 수 있다. 앙상블 모델 학습부(134)는 타겟 풀 상에 배치된 메타 정보들 사이의 벡터 값에 기초하여, 메타 정보의 유사도를 계산할 수 있다.In step S131a, the ensemble model learning unit (134) calculates the similarity of the meta information. The ensemble model learning unit (134) receives the meta information for each of the first to nth health prediction devices (111 to 11n). By way of example, the meta information items are depicted as circles on the target pool. Through the meta information, the ensemble model learning unit (134) can analyze the features learned by the prediction models built in each of the health prediction devices (111 to 11n), the algorithms of the prediction models, and the scale of the learning data (11 to 1n). The ensemble model learning unit (134) can place the meta information on the target pool based on the analyzed result. The ensemble model learning unit (134) can calculate the similarity of the meta information based on the vector values between the meta information items placed on the target pool.
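One way to make the similarity calculation of step S131a concrete is to encode each meta information item as a numeric vector and compare vectors by cosine similarity. The source does not specify an encoding or a similarity measure, so everything below is an assumption: the `MetaInfo` fields mirror the three kinds of information named in the text, and `encode` and `cosine_similarity` are hypothetical helpers.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class MetaInfo:
    features: FrozenSet[str]  # features the prediction model was trained on
    algorithm: str            # name of the learning algorithm
    train_size: int           # number of records in the learning data


def encode(meta: MetaInfo, feature_vocab: Sequence[str],
           algo_vocab: Sequence[str]) -> List[float]:
    """Multi-hot features + one-hot algorithm + log-scaled data size."""
    vec = [1.0 if f in meta.features else 0.0 for f in feature_vocab]
    vec += [1.0 if a == meta.algorithm else 0.0 for a in algo_vocab]
    vec.append(math.log10(meta.train_size + 1))
    return vec


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

The log scaling keeps a very large learning data set from dominating the vector; other encodings or distance measures would fit the description equally well.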

S131b 단계에서, 앙상블 모델 학습부(134)는 메타 정보의 유사도에 기초하여, 메타 정보들을 하나 이상의 그룹으로 클러스터링할 수 있다. 예시적으로, 도 5의 타겟 풀에서, 메타 정보들은 유사도에 기초하여 제1 내지 제3 그룹들(C1~C3)로 클러스터링되는 것으로 도시된다. 앙상블 모델 학습부(134)는 메타 정보가 유사한 유사군별로 메타 정보들을 클러스터링한다. 즉, 동일한 그룹에 속하는 메타 정보에 대응되는 건강 예측 장치는 유사한 학습을 통하여 구축된 예측 모델을 포함하는 것으로 이해될 수 있다.In step S131b, the ensemble model learning unit (134) can cluster the meta information into one or more groups based on the similarity of the meta information. By way of example, in the target pool of FIG. 5, the meta information is illustrated as being clustered into first to third groups (C1 to C3) based on the similarity. The ensemble model learning unit (134) clusters the meta information into groups whose members have similar meta information. That is, the health prediction devices corresponding to meta information belonging to the same group can be understood to include prediction models built through similar learning.
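The grouping of step S131b can be illustrated with a simple threshold-based clustering over a pairwise similarity matrix. The source leaves the clustering algorithm open, so single-linkage grouping with a union-find structure is used here purely as a stand-in; `cluster_by_similarity` and `threshold` are hypothetical.

```python
from typing import Dict, List, Sequence


def cluster_by_similarity(sim: Sequence[Sequence[float]],
                          threshold: float) -> List[List[int]]:
    """Group indices linked by pairwise similarity >= threshold (single linkage)."""
    n = len(sim)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                parent[find(i)] = find(j)  # merge the two groups

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With four meta information items where items 0 and 1 are similar and items 2 and 3 are similar, a threshold of 0.7 yields two groups.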

S131c 단계에서, 앙상블 모델 학습부(134)는 타겟 학습 데이터를 선택한다. 이를 위하여, 앙상블 모델 학습부(134)는 원시 학습 데이터(31)에 대한 학습 결과 데이터의 정확도를 평가한다. 상술하였듯이, 앙상블 예측 장치(130)는 원시 학습 데이터(31)에 대하여 기대되는 결과 데이터, 즉 미래 건강 상태의 예측 결과를 미리 설정할 수 있다. 앙상블 모델 학습부(134)는 미리 설정된 결과 데이터에 기초하여, 그룹들 내의 메타 정보에 대응되는 학습 결과 데이터의 정확도를 평가할 수 있다. 앙상블 모델 학습부(134)는 각각의 그룹들 내에서 평가 결과 가장 높은 정확도를 갖는 학습 결과 데이터를 타겟 학습 데이터로 결정할 수 있다.In step S131c, the ensemble model learning unit (134) selects target learning data. To this end, the ensemble model learning unit (134) evaluates the accuracy of learning result data for the raw learning data (31). As described above, the ensemble prediction device (130) can preset expected result data for the raw learning data (31), that is, the prediction result of the future health status. The ensemble model learning unit (134) can evaluate the accuracy of learning result data corresponding to the meta information within the groups based on the preset result data. The ensemble model learning unit (134) can determine learning result data having the highest accuracy as the evaluation result within each group as the target learning data.

예를 들어, 앙상블 모델 학습부(134)는 제1 그룹(C1)의 세 개의 메타 정보들에 대응되는 학습 결과 데이터와 기대되는 결과 데이터를 비교할 수 있다. 이 중, 제1 타겟 메타 정보(T1)에 대응되는 학습 결과 데이터의 정확도가 가장 높은 경우, 앙상블 모델 학습부(134)는 제1 타겟 메타 정보(T1)에 대응되는 학습 결과 데이터를 타겟 학습 데이터로 선택할 수 있다. 유사한 방식으로, 앙상블 모델 학습부(134)는 제2 그룹(C2) 및 제3 그룹(C3) 내의 학습 결과 데이터 중 가장 높은 정확도를 갖는 제2 타겟 메타 정보(T2) 및 제3 타겟 메타 정보(T3)에 대응되는 학습 결과 데이터를 타겟 학습 데이터로 선택할 수 있다. 즉, 앙상블 학습 데이터(32)는 제1 내지 제3 타겟 메타 정보(T1~T3)에 대응되는 학습 결과 데이터를 포함할 수 있다.For example, the ensemble model learning unit (134) can compare the learning result data corresponding to the three meta information of the first group (C1) with the expected result data. Among these, if the learning result data corresponding to the first target meta information (T1) has the highest accuracy, the ensemble model learning unit (134) can select the learning result data corresponding to the first target meta information (T1) as the target learning data. In a similar manner, the ensemble model learning unit (134) can select the learning result data corresponding to the second target meta information (T2) and the third target meta information (T3) having the highest accuracy among the learning result data in the second group (C2) and the third group (C3) as the target learning data. That is, the ensemble learning data (32) can include the learning result data corresponding to the first to third target meta information (T1 to T3).
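The selection in step S131c, picking the most accurate learning result data within each group, can be sketched as follows. The accuracy measure is an assumption (the fraction of predictions matching the preset expected results, with results modeled as discrete labels), and `accuracy` and `select_targets` are hypothetical names.

```python
from typing import Dict, List, Sequence


def accuracy(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Fraction of predictions that match the preset expected results."""
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


def select_targets(groups: Dict[str, List[str]],
                   results: Dict[str, Sequence[int]],
                   expected: Sequence[int]) -> Dict[str, str]:
    """For each group, pick the device whose learning result is most accurate."""
    return {
        name: max(members, key=lambda d: accuracy(results[d], expected))
        for name, members in groups.items()
    }
```

The learning result data of the selected devices would then form the ensemble learning data (32). On a tie, `max` keeps the first member listed in the group.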

S131a 내지 S131c 단계들을 수행한 결과 선택된 타겟 학습 데이터에 기초하여, 앙상블 모델이 생성된다. 이후, 도 4의 S136 단계에서, 앙상블 모델의 성능이 기준 성능에 도달하지 못한 경우, S131a 내지 S131c 단계들이 다시 수행될 수 있다. 이 경우, S131b 단계에서, 앙상블 모델 학습부(134)는 메타 정보들을 다시 클러스터링 할 수 있다. 예를 들어, 앙상블 모델 학습부(134)는 그룹 내에서 상대적으로 메타 정보의 유사도가 낮은 메타 정보를 해당 그룹에서 제외시키거나, 다른 그룹에 포함시킬 수 있다. 또한, S131c 단계에서, 앙상블 모델 학습부(134)는 타겟 학습 데이터를 다시 선택할 수 있다. 예를 들어, 앙상블 모델 학습부(134)는 다시 수행된 클러스터링에 의하여 변경된 그룹들 내의 학습 결과 데이터의 정확도를 다시 평가하고, 타겟 학습 데이터를 다시 선별할 수 있다. Based on the target learning data selected as a result of performing steps S131a to S131c, an ensemble model is generated. Thereafter, in step S136 of FIG. 4, if the performance of the ensemble model does not reach the reference performance, steps S131a to S131c may be performed again. In this case, in step S131b, the ensemble model learning unit (134) may re-cluster the meta information. For example, the ensemble model learning unit (134) may exclude meta information with relatively low meta information similarity within a group from the group or include it in another group. In addition, in step S131c, the ensemble model learning unit (134) may re-select the target learning data. For example, the ensemble model learning unit (134) may re-evaluate the accuracy of the learning result data within the groups changed by the re-performed clustering, and select the target learning data again.
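The re-clustering option described above, excluding the member whose meta information is least similar to the rest of its group, might look like the following. The criterion used here (lowest mean similarity to the other members) is an assumption, as is the name `drop_least_similar`.

```python
from typing import List, Sequence


def drop_least_similar(group: List[int],
                       sim: Sequence[Sequence[float]]) -> List[int]:
    """Remove the member with the lowest mean similarity to the other members."""
    if len(group) <= 2:
        return list(group)  # too small to identify an outlier

    def mean_sim(i: int) -> float:
        return sum(sim[i][j] for j in group if j != i) / (len(group) - 1)

    outlier = min(group, key=mean_sim)
    return [i for i in group if i != outlier]
```

The removed member could then be left out or assigned to another group before the accuracy of each changed group is re-evaluated.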

예시적으로, S131a 내지 S131c 단계들은 기계학습 방식의 학습 모델에 기초하여 진행될 수 있다. 기계학습 방식의 학습 모델은 유사도 계산 기반의 클러스터링을 수행하도록 구현될 수 있다. 유사도 계산 기반의 클러스터링을 이용하여, 타겟 학습 데이터를 선별함으로써, 앙상블 모델의 오버 피팅이 완화될 수 있다. S131a 내지 S131c 단계들에 따른, 유사도 계산, 클러스터링, 및 정확도 평가는 입력되는 메타 정보의 종류, 클러스터링 알고리즘, 및 정확도 평가 계산 방식 등에 기초하여 다양하게 설정될 수 있다. For example, steps S131a to S131c may be performed based on a machine learning-based learning model. The machine learning-based learning model may be implemented to perform clustering based on similarity calculation. By selecting target learning data using clustering based on similarity calculation, overfitting of an ensemble model may be alleviated. Similarity calculation, clustering, and accuracy evaluation according to steps S131a to S131c may be variously set based on the type of input meta information, clustering algorithm, and accuracy evaluation calculation method.

도 6은 도 4의 S132 단계를 구체적으로 설명하기 위한 도면이다. 즉, 도 6은 앙상블 학습 데이터(32)를 이용하여 앙상블 예측 장치(130)가 타겟 관계 모델(TM)을 학습하는 과정을 나타낸다. 타겟 관계 모델(TM)은 도 2의 프로세서(132)의 제어 하에, 앙상블 모델 학습부(134)에서 학습될 수 있다. 설명의 편의상 도 1 및 도 2의 도면 부호를 참조하여, 도 6이 설명된다.FIG. 6 is a drawing for specifically explaining step S132 of FIG. 4. That is, FIG. 6 shows a process in which an ensemble prediction device (130) learns a target relationship model (TM) using ensemble learning data (32). The target relationship model (TM) can be learned in an ensemble model learning unit (134) under the control of the processor (132) of FIG. 2. For convenience of explanation, FIG. 6 is explained with reference to the drawing symbols of FIG. 1 and FIG. 2.

앙상블 학습 데이터(32)는 제1 내지 제n 타겟 학습 데이터(Ha~Hn)을 포함하며, 복수의 건강 예측 장치들로부터 생성된 학습 결과 데이터 중 n개의 학습 결과 데이터가 선택되었음을 의미한다. 제1 내지 제n 타겟 학습 데이터(Ha~Hn) 각각은 다양한 특징 데이터를 포함한다. 예를 들어, 제1 타겟 학습 데이터(Ha)는 제1 내지 제x 특징 데이터(a1~ax)을 포함하고, 제2 타겟 학습 데이터(Hb)는 제1 내지 제x 특징 데이터(b1~bx)을 포함하고, 제n 타겟 학습 데이터(Hn)는 제1 내지 제x 특징 데이터(n1~nx)을 포함한다.The ensemble learning data (32) includes first to n-th target learning data (Ha to Hn), and means that n learning result data are selected from learning result data generated from a plurality of health prediction devices. Each of the first to n-th target learning data (Ha to Hn) includes various feature data. For example, the first target learning data (Ha) includes first to x-th feature data (a1 to ax), the second target learning data (Hb) includes first to x-th feature data (b1 to bx), and the n-th target learning data (Hn) includes first to x-th feature data (n1 to nx).

특징 데이터는 원시 학습 데이터(31)를 생성하기 위하여 진단, 검사, 또는 처방된 항목인 특징에 대응될 수 있다. 특징은 혈압, 콜레스테롤 수치, 몸무게 등 다양한 건강 지표를 나타낼 수 있다. 도 6에 도시된 특징 데이터는 동일한 숫자를 갖는 경우, 동일한 특징을 나타내는 것으로 가정한다. 예를 들어, 제1 타겟 학습 데이터(Ha)의 제1 특징 데이터(a1)와 제2 타겟 학습 데이터(Hb)의 제1 특징 데이터(b1)는 동일한 특징을 나타내는 것으로 이해될 것이다.The feature data may correspond to a feature that is a diagnosis, examination, or prescription item to generate raw learning data (31). The feature may represent various health indicators such as blood pressure, cholesterol level, and body weight. It is assumed that the feature data illustrated in Fig. 6 represent the same feature if they have the same number. For example, the first feature data (a1) of the first target learning data (Ha) and the first feature data (b1) of the second target learning data (Hb) will be understood to represent the same feature.

앙상블 모델 학습부(134)는 타겟 관계 모델(TM)을 학습하기 위하여, 동일한 특징 별로 제1 내지 제n 타겟 학습 데이터(Ha~Hn)을 재구성할 수 있다. 예를 들어, 제1 타겟 학습 데이터(Ha)의 제1 특징 데이터(a1), 제2 타겟 학습 데이터(Hb)의 제1 특징 데이터(b1), 및 제n 타겟 학습 데이터(Hn)의 제1 특징 데이터(n1)는 타겟 관계 모델(TM)의 동일한 레이어에 입력되도록 재구성될 수 있다. 즉, 동일한 특징은 동일한 입력 레이어에 제공될 수 있다.The ensemble model learning unit (134) can reconstruct the first to nth target learning data (Ha to Hn) by the same feature in order to learn the target relationship model (TM). For example, the first feature data (a1) of the first target learning data (Ha), the first feature data (b1) of the second target learning data (Hb), and the first feature data (n1) of the nth target learning data (Hn) can be reconstructed to be input to the same layer of the target relationship model (TM). That is, the same feature can be provided to the same input layer.
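The per-feature reconstruction described above is essentially a transposition: data organized as one record per target (Ha to Hn) becomes one row per feature. A minimal sketch, assuming each feature is a single scalar instead of a time series and using the hypothetical name `regroup_by_feature`:

```python
from typing import Dict, List


def regroup_by_feature(targets: Dict[str, List[float]]) -> List[List[float]]:
    """Return one row per feature, each holding that feature from every target."""
    rows = list(targets.values())
    if len({len(r) for r in rows}) > 1:
        raise ValueError("every target must carry the same number of features")
    # zip(*rows) pairs up the k-th feature of every target.
    return [list(column) for column in zip(*rows)]
```

Row `k` of the result corresponds to the input of the `k`-th target relationship model, for example (a1, b1, ..., n1) for the first feature.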

타겟 관계 모델(TM)은 타겟 학습 데이터 사이의 관계를 고려하여, 특징 별 미래 시점의 예측 결과를 도출할 수 있다. 타겟 관계 모델(TM)은 제1 내지 제x 타겟 관계 모델들(TM1~TMx)을 포함할 수 있고, 타겟 관계 모델들(TM1~TMx)의 개수는 특징 데이터의 개수에 대응될 수 있다. 제1 내지 제x 타겟 관계 모델들(TM1~TMx) 각각은 제1 내지 제n 타겟 학습 데이터(Ha~Hn) 각각에 포함된 특징 데이터 중 한가지 종류의 특징 데이터를 입력 받는다. 예를 들어, 제1 타겟 관계 모델(TM1)은 제1 내지 제n 타겟 학습 데이터(Ha~Hn) 각각에 포함된 제1 특징 데이터(a1~n1)을 입력 받을 수 있다. The target relationship model (TM) can derive a prediction result of a future point in time for each feature by considering the relationship between target learning data. The target relationship model (TM) can include first to x-th target relationship models (TM1 to TMx), and the number of target relationship models (TM1 to TMx) can correspond to the number of feature data. Each of the first to x-th target relationship models (TM1 to TMx) receives one type of feature data among the feature data included in each of the first to n-th target learning data (Ha to Hn). For example, the first target relationship model (TM1) can receive first feature data (a1 to n1) included in each of the first to n-th target learning data (Ha to Hn).

제1 내지 제x 타겟 관계 모델들(TM1~TMx) 각각은 제1 내지 제n 타겟 학습 데이터(Ha~Hn) 각각에 포함된 특징 데이터 중 한가지 종류의 특징 데이터 사이의 상관 관계를 학습할 수 있다. 예를 들어, 제1 타겟 관계 모델(TM1)은 제1 내지 제n 타겟 학습 데이터(Ha~Hn) 각각에 포함된 제1 특징 데이터(a1~n1) 사이의 상관 관계를 분석할 수 있다. 이를 통하여, 제1 타겟 관계 모델(TM1)은 제1 특징 데이터(a1~n1) 각각에 가중치를 부여할 수 있다. Each of the first to xth target relationship models (TM1 to TMx) can learn a correlation between feature data of one type among the feature data included in each of the first to nth target learning data (Ha to Hn). For example, the first target relationship model (TM1) can analyze a correlation between the first feature data (a1 to n1) included in each of the first to nth target learning data (Ha to Hn). Through this, the first target relationship model (TM1) can assign a weight to each of the first feature data (a1 to n1).

예를 들어, 도 1의 제1 건강 예측 장치(111)가 제공되는 의료 기관은 다른 의료 기관들에 비하여 심혈관 질환 등에 특화될 수 있고, 제2 건강 예측 장치(112)가 제공되는 의료 기관은 다른 의료 기관들에 비해 호흡기 질환 등에 특화될 수 있다. 제1 건강 예측 장치(111)가 제1 타겟 학습 데이터(Ha)를 생성하고, 제2 건강 예측 장치(112)가 제2 타겟 학습 데이터(Hb)를 생성한 경우, 타겟 관계 모델(TM)은 제1 타겟 학습 데이터(Ha)의 심혈관 질환과 관련된 특징 데이터의 가중치를 다른 타겟 학습 데이터의 심혈관 질환과 관련된 특징 데이터보다 크게 부여할 수 있다. 또한, 타겟 관계 모델(TM)은 제2 타겟 학습 데이터(Hb)의 호흡기 질환과 관련된 특징 데이터의 가중치를 다른 타겟 학습 데이터의 호흡기 질환과 관련된 특징 데이터보다 크게 부여할 수 있다. 이를 통하여, 다양한 의료 기관들의 예측 모델들을 이용하여 미래 건강 상태가 예측될 수 있고, 미래 건강 상태의 예측 정확성이 증가할 수 있다.For example, a medical institution provided with the first health prediction device (111) of FIG. 1 may be specialized in cardiovascular diseases, etc., compared to other medical institutions, and a medical institution provided with the second health prediction device (112) may be specialized in respiratory diseases, etc., compared to other medical institutions. When the first health prediction device (111) generates the first target learning data (Ha) and the second health prediction device (112) generates the second target learning data (Hb), the target relationship model (TM) may assign a greater weight to the feature data related to cardiovascular diseases of the first target learning data (Ha) than to the feature data related to cardiovascular diseases of the other target learning data. In addition, the target relationship model (TM) may assign a greater weight to the feature data related to respiratory diseases of the second target learning data (Hb) than to the feature data related to respiratory diseases of the other target learning data. Through this, future health states may be predicted using prediction models of various medical institutions, and the accuracy of prediction of future health states may be increased.
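The weighting idea in this example, where an institution that predicts a given feature more accurately receives a larger weight for that feature, can be illustrated with inverse-error weights. In the source the weights are learned by the target relationship model; the closed-form rule below is only a stand-in, and `institution_weights` and `weighted_prediction` are hypothetical names.

```python
from typing import List, Sequence


def institution_weights(errors: Sequence[float], eps: float = 1e-9) -> List[float]:
    """Turn per-institution errors on ONE feature into weights that sum to 1."""
    inverse = [1.0 / (e + eps) for e in errors]  # smaller error -> larger weight
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_prediction(values: Sequence[float], weights: Sequence[float]) -> float:
    """Combine the institutions' predictions for one feature."""
    return sum(v * w for v, w in zip(values, weights))
```

For a cardiovascular feature where the first institution's error is one third of the second's, the first institution receives roughly three quarters of the weight.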

Each of the first to x-th target relationship models (TM1 to TMx) can be organized into a plurality of layers. Although the first to x-th target relationship models (TM1 to TMx) are illustrated as neural network models, they are not limited to a specific model, and various learning models capable of performing machine learning can be applied.

FIG. 7 is a drawing for explaining step S133 of FIG. 4 in detail. That is, FIG. 7 shows a process in which the ensemble prediction device (130) trains the feature relationship model (FM) using the ensemble learning data (32). The feature relationship model (FM) can be trained in the ensemble model learning unit (134) under the control of the processor (132) of FIG. 2. For convenience of explanation, FIG. 7 is described with reference to the reference numerals of FIG. 1 and FIG. 2.

The ensemble learning data (32) includes a plurality of target learning data, and each of the plurality of target learning data includes a plurality of feature data. For example, the first target learning data includes the first to x-th feature data (a1 to ax). The feature data may correspond to items that were diagnosed, examined, or prescribed in order to generate the raw learning data (31). To train the feature relationship model (FM), the ensemble model learning unit (134) may separate the ensemble learning data (32) by target learning data. The ensemble learning data (32) is then input to the feature relationship model (FM) one target learning data at a time.

The feature relationship model (FM) can derive a prediction result for a future point in time by considering the relationships among the feature data (a1 to ax) within the target learning data. The feature relationship model (FM) receives data in units of target learning data. For example, the feature relationship model (FM) can receive the first target learning data, and then the second to n-th target learning data in sequence. The feature relationship model (FM) can analyze the correlation among the first to x-th feature data (a1 to ax) of one target learning data. Through this, the feature relationship model (FM) can assign a weight to each of the first to x-th feature data (a1 to ax).

For example, in relation to cardiovascular disease, the first feature data (a1) may be a more important health indicator than the other feature data. In this case, the feature relationship model (FM) may assign a greater weight to the first feature data (a1) than to the other feature data. In addition, in relation to respiratory disease, the second feature data (a2) and the x-th feature data (ax) may be used as similar health indicators. In this case, the feature relationship model (FM) may set the weight assigned to the operation between the second feature data (a2) and the x-th feature data (ax) greater than the weight assigned to operations between other feature data. Through this, various features may be considered in combination to predict the future health state, and the accuracy of that prediction may increase.
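The two kinds of weights mentioned above, a weight on an individual feature and a weight on an operation between two features, can be illustrated with the following minimal Python sketch. It assumes a simple second-order linear score rather than the neural network of the embodiment; `feature_relationship`, `weights`, and `pair_weights` are hypothetical names used only for this example.

```python
from itertools import combinations


def feature_relationship(features, weights, pair_weights):
    """Score one target learning data from its feature values.

    features[i] is the i-th feature value (a1 to ax, zero-indexed here).
    weights[i] is a hypothetical weight on feature i alone.
    pair_weights[(i, j)] with i < j is a hypothetical weight on the
    product of features i and j; missing pairs count as weight 0.
    """
    linear = sum(w * a for w, a in zip(weights, features))
    pairwise = sum(
        pair_weights.get((i, j), 0.0) * features[i] * features[j]
        for i, j in combinations(range(len(features)), 2)
    )
    return linear + pairwise
```

In this sketch, a large `weights[0]` plays the role of the greater weight on the first feature data (a1), and a large entry in `pair_weights` for the second and last features plays the role of the greater weight on the operation between a2 and ax.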

The feature relationship model (FM) can be organized into a plurality of layers. Although the feature relationship model (FM) is illustrated as a neural network model, it is not limited to a specific model, and various learning models capable of performing machine learning can be applied.

FIG. 8 is a drawing for explaining step S134 of FIG. 4 in detail. That is, FIG. 8 shows a process in which the ensemble prediction device (130) builds the ensemble model (EM) using the ensemble learning data (32). The ensemble model (EM) can be built in the ensemble model learning unit (134) under the control of the processor (132) of FIG. 2. For convenience of explanation, FIG. 8 is described with reference to the reference numerals of FIG. 1 and FIG. 2.

The ensemble model (EM) is first generated by merging the target relationship model (TM) generated as described with reference to FIG. 6 and the feature relationship model (FM) generated as described with reference to FIG. 7. The ensemble model learning unit (134) can connect the output of the target relationship model (TM) to the input of the feature relationship model (FM). Through this, the ensemble model (EM) can comprehensively consider both the relationships among the health prediction devices corresponding to the target learning data and the relationships among the features included in each of the target learning data.

Because the target relationship model (TM) and the feature relationship model (FM) are trained separately, an ensemble model (EM) generated by simply merging the two models may perform below the reference performance. Therefore, after the target relationship model (TM) and the feature relationship model (FM) are merged, the ensemble learning data (32) is input to the ensemble model (EM) again. The ensemble learning data (32) may include the first to n-th target learning data (Ha to Hn). The ensemble learning data (32) may be regrouped by identical feature, as in the data input method of FIG. 6. For example, the first feature data (a1) of the first target learning data (Ha), the first feature data (b1) of the second target learning data (Hb), and the first feature data (n1) of the n-th target learning data (Hn) may be regrouped so that they are input to the same layer of the target relationship model (TM).
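The regrouping by identical feature described above amounts to transposing the data from a per-device layout to a per-feature layout. A minimal Python sketch follows; the function name `regroup_by_feature` and the dictionary layout are assumptions made for illustration, not part of the disclosure.

```python
def regroup_by_feature(target_learning_data):
    """Regroup per-device feature lists into per-feature groups.

    target_learning_data maps a label such as "Ha" to that device's
    feature values in a fixed feature order. The result has one list per
    feature, holding that feature's value from every device in the
    insertion order of the mapping.
    """
    rows = list(target_learning_data.values())
    if len({len(row) for row in rows}) != 1:
        raise ValueError("every target learning data needs the same features")
    return [list(group) for group in zip(*rows)]
```

For instance, inputs labeled `Ha`, `Hb`, and `Hn` holding `[a1, a2]`, `[b1, b2]`, and `[n1, n2]` are regrouped into `[a1, b1, n1]` and `[a2, b2, n2]`, so that each group can be fed to the part of the target relationship model (TM) handling that feature.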

The ensemble model (EM) can be optimized based on the result of inputting the ensemble learning data (32) into the ensemble model (EM). That is, the weights of the ensemble model (EM) can be updated. For example, the weights of the ensemble model (EM) can be changed by comparing the output of the ensemble model (EM) with a preset prediction result of the future health state. This weight update may be restricted to the feature relationship model (FM), although it is not limited thereto. In addition, to minimize distortion introduced when the target relationship model (TM) and the feature relationship model (FM) are merged, and to smooth the data, an aggregation layer (AL) may be provided between the output of the target relationship model and the input of the feature relationship model.
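The merge-then-retrain procedure can be sketched as follows, under strong simplifying assumptions: both models are reduced to weighted sums, the aggregation layer (AL) is a pass-through by default because its form is not specified here, and a single gradient step on squared error updates only the weights of the feature relationship model (FM). The names `forward`, `fine_tune_step`, `aggregate`, and `lr` are hypothetical.

```python
def forward(grouped, tm_weights, fm_weights, aggregate=lambda values: values):
    """Run the merged model: TM output -> aggregation layer -> FM input.

    grouped[f] holds feature f from every device.
    tm_weights[f] holds the per-device weights for feature f (TM part).
    fm_weights[f] is the weight of feature f in the FM part.
    """
    tm_out = [
        sum(w * v for w, v in zip(ws, vals))
        for ws, vals in zip(tm_weights, grouped)
    ]
    hidden = aggregate(tm_out)  # placeholder for the aggregation layer
    prediction = sum(w * h for w, h in zip(fm_weights, hidden))
    return hidden, prediction


def fine_tune_step(grouped, label, tm_weights, fm_weights, lr=0.1):
    """One gradient step on 0.5 * (prediction - label) ** 2.

    Only the FM weights are updated; the TM weights stay frozen.
    """
    hidden, prediction = forward(grouped, tm_weights, fm_weights)
    error = prediction - label
    return [w - lr * error * h for w, h in zip(fm_weights, hidden)]
```

Freezing the TM weights mirrors the option of restricting the weight change to the feature relationship model (FM); updating both parts would be an equally valid variant, since the restriction is stated as optional.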

The foregoing describes specific examples for carrying out the present invention. The present invention includes not only the embodiments described above but also embodiments obtainable through simple design changes or easy modifications. The present invention further includes techniques that can be readily modified and implemented in the future using the embodiments described above.

100: Health status prediction system
130: Ensemble prediction device

Claims

In a method of operating a device for ensembling data received from multiple health prediction devices,
A step of providing raw learning data to a first health prediction device and a second health prediction device;
A step of receiving first learning result data generated based on the raw learning data from the first health prediction device;
A step of receiving second learning result data generated based on the raw learning data from the second health prediction device;
A step of generating a target relationship model that provides, for each feature, weights for each of the first and second health prediction devices, based on the correlation between feature data having the same feature among the feature data included in each of the first learning result data and the second learning result data;
A step of generating a feature relationship model that provides weights for each of the different features based on the correlation between feature data having different features among the feature data included in the first learning result data or the second learning result data; and
A method comprising the step of building an ensemble model by merging the target relationship model and the feature relationship model.

In claim 1,
Further comprising a step of selecting target learning data from among the first learning result data and the second learning result data based on the first meta information received from the first health prediction device and the second meta information received from the second health prediction device,
A method comprising a step of generating the target relationship model and the feature relationship model based on the target learning data.

In claim 1,
A step of receiving time series medical data from an ensemble prediction device and generating a plurality of prediction result data using the constructed ensemble model; and
A method further comprising a step of merging and analyzing the plurality of prediction result data.

A network interface that receives first learning result data from a first health prediction device, receives second learning result data from a second health prediction device, and receives time series medical data from a terminal;
An ensemble prediction device including a processor that generates ensemble learning data based on the first learning result data and the second learning result data, and generates an ensemble model based on the ensemble learning data.

In claim 4,
The ensemble prediction device selects target learning data from among the first learning result data and the second learning result data based on first meta information received from the first health prediction device and second meta information received from the second health prediction device, and generates the ensemble learning data based on the target learning data.

In claim 4,
An ensemble prediction device wherein the time series medical data includes at least one of an Electronic Medical Record (EMR) and a Personal Health Record (PHR).

In claim 4,
The ensemble prediction device includes an ensemble model learning unit and a health prediction unit,
The ensemble model learning unit generates a target relationship model and a feature relationship model based on the ensemble learning data,
An ensemble prediction device wherein the health prediction unit performs ensemble prediction by merging target relationships extracted through the target relationship model and feature relationships extracted through the feature relationship model for a plurality of prediction result data generated based on the ensemble model.

In claim 7,
An ensemble prediction device wherein the ensemble model learning unit and the health prediction unit include at least one of a neuromorphic chip, an FPGA (Field Programmable Gate Array), and an ASIC (Application Specific Integrated Circuit).