KR102254341B1

KR102254341B1 - methods for diagnosing the high risk group of Diabetes based on Genetic Risk Score

Info

Publication number: KR102254341B1
Application number: KR1020200132461A
Authority: KR
Inventors: 김봉조; 김영진; 문상훈; 황미영; 한소희; 장혜미
Original assignee: 대한민국
Priority date: 2020-10-14
Filing date: 2020-10-14
Publication date: 2021-05-21
Anticipated expiration: 2040-10-14

Abstract

The present invention relates to a kit for diagnosing diabetes or predicting the risk of its onset, and to a diagnostic method using the kit. Genetic risk scores are calculated for 175 genetic variants previously reported in relation to type 2 diabetes and for 56 genetic variants associated with fasting blood glucose, and the two scores are analyzed in combination. A genetic high-risk group for type 2 diabetes can thereby be selected using only each individual's genetic information, without additional testing for age, sex, environmental influences, and the like, and the selection can be used for customized diagnosis, treatment, and prevention.

Description

Diagnosing the high risk group of diabetes based on genetic risk assessment {methods for diagnosing the high risk group of Diabetes based on Genetic Risk Score}

The present invention relates to a method for diagnosing a high-risk group for diabetes based on genetic risk assessment.

Diabetes, a metabolic disease in which carbohydrate utilization is reduced and lipid and protein utilization is enhanced, is caused by an absolute or relative deficiency of insulin. In more severe cases, diabetes is characterized by chronic hyperlipidemia, glycosuria, loss of water and electrolytes, ketoacidosis, and coma. Long-term complications include neuropathy, nephropathy, generalized degenerative changes in large and small blood vessels, and increased susceptibility to infection. The most common form of diabetes is type 2 diabetes, which is characterized by hyperglycemia resulting from impaired insulin secretion and insulin resistance in target tissues.

Type 2 diabetes is characterized by insulin resistance, and the impaired response of body tissues to insulin is believed to involve the insulin receptor. The abnormality that appears early in type 2 diabetes is reduced insulin sensitivity; at this stage, hyperglycemia can be reversed by medication and various measures that improve insulin sensitivity or reduce hepatic glucose production. Type 2 diabetes is mainly attributable to lifestyle factors or heredity, and several lifestyle factors, including obesity (defined as a body mass index above 30), lack of exercise, poor diet, stress, and urbanization, are known to be important in its development.

In addition, according to the Korean Diabetes Association, as of 2016 one in seven adults aged 30 or older had diabetes, and the number of patients nationwide was estimated at about 5.01 million. Diabetes medical expenses rose 33.3%, from 1.4 trillion won in 2010 to 1.8 trillion won in 2015, and the socioeconomic burden of diabetes in Korea is accelerating.

Clinical and epidemiological variables conventionally used to diagnose diabetes, such as blood glucose and glycated hemoglobin, and the levels of various markers differ mainly after the age of 40, when the incidence of diabetes rises sharply. They can therefore be used to predict diabetes from short-term changes, but it is difficult to use them to screen high-risk individuals early, before their 40s, and to apply the result to prevention. Methods for predicting type 2 diabetes using several dozen previously identified genetic variants have also been proposed, but they have a low disease prediction rate (about 70%) and low statistical explanatory power for the disease. Recent studies using genome-wide association studies (GWAS) have reported about 300 genetic variants related to type 2 diabetes. When the genetic high-risk group for type 2 diabetes, defined as the top 5% of the population, was analyzed using this variant information, its incidence of diabetes was found to be about 2 to 3 times higher than that of the remaining 95% of the population. However, a considerable number of individuals in the selected high-risk group were normal subjects without diabetes. Therefore, a kit with excellent accuracy for screening the genetic high-risk group for type 2 diabetes needs to be developed.

An object of the present invention is to provide a method of providing information necessary for diagnosing or predicting the onset of type 2 diabetes.

Another object of the present invention is to provide a polymorphic marker composition for diagnosing or predicting the onset of type 2 diabetes.

Another object of the present invention is to provide a composition for diagnosing or predicting the onset of type 2 diabetes, comprising the marker composition.

Another object of the present invention is to provide a kit for predicting the risk of developing type 2 diabetes, comprising the marker composition.

The present invention provides a method of providing information necessary for diagnosing or predicting the onset of type 2 diabetes, the method comprising:

calculating, from a biological sample, a genetic risk score (GRS) for fasting blood glucose-related polymorphic markers including the polymorphic site located at the sixth base of each of SEQ ID NOs: 1 to 26;

calculating, from the biological sample, a genetic risk score (GRS) for type 2 diabetes-related polymorphic markers;

selecting a high-risk group based on each of the fasting blood glucose genetic risk score and the type 2 diabetes genetic risk score; and

determining that a subject belongs to the high-risk group for type 2 diabetes if the subject belongs to both the high-risk group based on the fasting blood glucose genetic risk and the high-risk group based on the type 2 diabetes genetic risk.
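The selection logic of this method can be illustrated with a minimal Python sketch. It assumes that each high-risk group is defined as the top fraction of a cohort ranked by the corresponding score; the cutoff value, the function names, and the sample data are hypothetical and are not specified in this passage.

```python
import math
from typing import Dict, Set


def top_fraction(scores: Dict[str, float], fraction: float) -> Set[str]:
    """Return the sample IDs whose score lies in the top `fraction` of the cohort."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scores, key=lambda sample: scores[sample], reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return set(ranked[:keep])


def combined_high_risk(
    fg_grs: Dict[str, float], t2d_grs: Dict[str, float], fraction: float = 0.05
) -> Set[str]:
    """Samples that are high-risk on BOTH the fasting-glucose GRS and the T2D GRS."""
    return top_fraction(fg_grs, fraction) & top_fraction(t2d_grs, fraction)


# Hypothetical scores for five samples (illustrative values only).
fg = {"s1": 0.9, "s2": 0.8, "s3": 0.1, "s4": 0.2, "s5": 0.3}
t2d = {"s1": 1.5, "s2": 0.2, "s3": 1.4, "s4": 0.1, "s5": 0.3}
print(combined_high_risk(fg, t2d, fraction=0.4))
```

Only a sample that ranks high on both scores is retained, which is the intersection step described in the final determination step.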

Next, the present invention provides a polymorphic marker composition for diagnosing or predicting the onset of type 2 diabetes, comprising a primer or probe that specifically binds to a polynucleotide consisting of 10 or more contiguous nucleotides that include the polymorphic site located at the sixth position of each nucleotide sequence of the polynucleotides represented by SEQ ID NOs: 1 to 26, or to a complementary polynucleotide thereof.

In addition, the present invention provides a composition for diagnosing or predicting the onset of type 2 diabetes, comprising the marker composition.

Finally, the present invention provides a kit for diagnosing or predicting the onset of type 2 diabetes, comprising the marker composition.

The method of the present invention for diagnosing or predicting the onset of type 2 diabetes scores the genetic risk for 175 genetic variants previously reported in relation to type 2 diabetes and for 56 genetic variants associated with fasting blood glucose, and analyzes the two in combination. The genetic high-risk group for type 2 diabetes can thereby be selected using only each individual's genetic information, without additional testing for age, sex, environmental influences, and the like, and the selection can be used for customized care, treatment, and prevention.

FIG. 1 is a schematic flow chart of the quality control of Korean Chip genomic information.
FIG. 2 is a table showing the main content of the Korean Chip.
FIG. 3 is a table showing the basic statistics of the analysis subjects.
FIG. 4 is a graph showing the Manhattan plot of the 56 genetic variants related to fasting blood glucose.
FIG. 5 is a graph showing the change in the prevalence of type 2 diabetes according to fasting blood glucose.

Hereinafter, the terms used in the present invention are described in detail.

The term "polymorphism" as used in the present invention refers to the presence of two or more alleles at a single genetic locus; a polymorphic site at which only a single base differs between individuals is called a single nucleotide polymorphism (SNP). Preferred polymorphic markers have two or more alleles that each occur at a frequency of at least 1%, more preferably at least 5% or 10%, in the selected population. Accordingly, a genetic association with a polymorphic marker means an association with one or more specific alleles of that marker. Markers may include alleles of any type of variation found in the genome, including single nucleotide polymorphisms (SNPs), microsatellites, insertions, deletions, duplications, and translocations.

The term "allele" as used in the present invention refers to one of several forms of a gene present at the same locus on homologous chromosomes. Alleles are also used to describe polymorphisms; for example, a SNP has two alleles (biallelic).

The term "rs_id" as used in the present invention refers to the rs-ID, an independent identifier that NCBI, which began accumulating SNP information in 1998, assigns to every SNP at initial registration. In the present invention it is written in a form such as rs831571. An rs_id listed in the tables refers to a SNP marker, which is a polymorphic marker of the present invention. A person skilled in the art can readily determine the location and sequence of a SNP from its rs_id. The specific sequence corresponding to an rs_id, the number in NCBI's dbSNP (The Single Nucleotide Polymorphism Database), may change slightly over time. It will be apparent to those skilled in the art that the scope of the present invention also extends to such changed sequences.

The term "diagnosis or onset prediction" as used in the present invention includes every type of analysis used to predict the occurrence of a disease and to determine or derive the risk of its occurrence, and preferably means diagnosing whether a subject has type 2 diabetes or predicting the risk of its onset.

Hereinafter, the present invention is described in more detail.

The present inventors obtained information on about 8.3 million single nucleotide polymorphisms (SNPs) by analyzing the genomic information of 125,872 individuals and, using this information, identified 26 novel genetic variants and 30 previously reported genetic variants associated with fasting blood glucose. The inventors further confirmed that, when an individual belongs to both the high-risk group based on the genetic risk for the 175 genetic variants previously reported in relation to type 2 diabetes and the high-risk group based on the genetic risk for the 56 genetic variants associated with fasting blood glucose, the prediction accuracy for the genetic high-risk group with a higher incidence of type 2 diabetes is markedly improved, and thereby completed the present invention.

The present invention includes the step of calculating, from a biological sample, a genetic risk score (GRS) for fasting blood glucose-related polymorphic markers including the polymorphic site located at the sixth base of each of SEQ ID NOs: 1 to 26;

In addition, the fasting blood glucose-related polymorphic markers may further include the polymorphic markers located at the sixth base of each of SEQ ID NOs: 27 to 56, but any polymorphic marker known in the art to be related to fasting blood glucose may be used without limitation.

In addition, the biological sample may be selected from the group consisting of tissue, cells, whole blood, serum, plasma, and saliva, but is not limited thereto.

In addition, the genetic risk score (GRS) may be calculated using Equation 1.

[Equation 1]

$$\mathrm{GRS}_j = \sum_{i=1}^{n} \beta_i \, x_{ij}$$

In Equation 1,

n is the total number of polymorphic markers.

i is the polymorphic marker sequence number.

j is the sample number.

β is the beta coefficient, i.e., the genetic effect (effect size) of the polymorphic marker.

x is the allele value, one of 0, 1, or 2, assigned to sample j according to the presence or absence of the effect allele. When genotype imputation is used, the allele value is replaced by a real number between 0 and 2.
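As an illustration of the definitions above, the following Python sketch computes the score for one sample as a sum over markers of effect size times allele value. This weighted-sum form is an assumption inferred from the definitions of n, i, j, β, and x; the function name and the numeric values are hypothetical.

```python
import math
from typing import Sequence


def genetic_risk_score(betas: Sequence[float], allele_values: Sequence[float]) -> float:
    """Weighted score for one sample: sum over markers of beta_i * x_ij.

    `allele_values` may hold hard calls (0, 1, 2) or imputed dosages
    (any real number between 0 and 2).
    """
    if len(betas) != len(allele_values):
        raise ValueError("betas and allele_values must have the same length")
    for x in allele_values:
        if not 0.0 <= x <= 2.0:
            raise ValueError("each allele value must lie between 0 and 2")
    return sum(beta * x for beta, x in zip(betas, allele_values))


# Hypothetical effect sizes for three markers (illustrative values only).
betas = [0.10, 0.05, -0.02]

hard_calls = [2, 1, 0]        # genotyped directly
imputed = [1.8, 0.4, 1.0]     # dosages from genotype imputation

grs_hard = genetic_risk_score(betas, hard_calls)
grs_imputed = genetic_risk_score(betas, imputed)
print(round(grs_hard, 4), round(grs_imputed, 4))
```

The same function handles both genotyped and imputed markers because an imputed dosage simply replaces the integer allele count.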

In addition, the genetic risk score (GRS) is preferably calculated by assigning the number of risk alleles present at each SNP locus as a sub-score and summing the sub-scores.

In one embodiment of the present invention, the step of calculating the genetic risk score may include: determining, from a biological sample, the presence or absence of the effect allele at the loci of all polymorphic markers listed in Tables 1 and 2 below; assigning each locus an allele value of 0, 1, or 2 according to the presence or absence of the effect allele at that locus, namely 0 when the effect allele is absent, 1 when the effect allele is present on only one of the two haploid copies, and 2 when it is present on both; and calculating the genetic risk score (GRS) by obtaining the beta coefficient (β) at each locus from a comparison of the effect allele frequency in type 2 diabetes patients with that in normal controls.
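The 0/1/2 coding described in this embodiment can be sketched in Python as follows. The function name and the example genotypes are hypothetical; the sketch assumes a diploid genotype given as a pair of allele strings, so that multi-base alleles such as ATCTCTC are handled the same way as single bases.

```python
from typing import Tuple


def effect_allele_count(genotype: Tuple[str, str], effect_allele: str) -> int:
    """Return 0, 1, or 2: the number of copies of the effect allele in a diploid genotype."""
    if len(genotype) != 2:
        raise ValueError("a diploid genotype must contain exactly two alleles")
    return sum(1 for allele in genotype if allele == effect_allele)


# Illustrative genotypes only.
print(effect_allele_count(("G", "G"), "A"))              # effect allele absent
print(effect_allele_count(("A", "G"), "A"))              # present on one copy
print(effect_allele_count(("ATCTCTC", "ATCTCTC"), "ATCTCTC"))  # present on both copies
```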

Here, the weighted genetic risk score (wGRS) method has been described as one example of calculating disease genetic risk, but the present invention is not limited thereto; depending on the embodiment, various risk calculation methods may be used, such as a genetic risk score, a polygenic risk score, machine learning methods, and linear regression analysis.

In addition, the genetic risk may be converted into a value relative to the disease genetic risk of other users in the group to which the user belongs (e.g., the same country, the same residential area, or the same age group). The disease genetic risk is converted into a relative value because the absolute value of a genetic risk score can vary greatly with the number of genetic markers used, and because the disease genetic risk and the phenotypic genetic risk may be calculated differently; the conversion normalizes both to a relative rank.
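One simple way to express a score as a relative rank is a percentile within a reference group. The Python sketch below assumes this percentile definition (the share of reference scores less than or equal to the user's score); the text does not prescribe a particular formula, and the function name and values are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def percentile_rank(score: float, reference_scores: Sequence[float]) -> float:
    """Percentage of reference scores that are less than or equal to `score`."""
    ordered = sorted(reference_scores)
    if not ordered:
        raise ValueError("reference_scores must not be empty")
    return 100.0 * bisect_right(ordered, score) / len(ordered)


# Hypothetical scores from users in the same group (illustrative values only).
group_scores = [0.1, 0.2, 0.3, 0.4, 0.5]
rank = percentile_rank(0.4, group_scores)
print(rank)
```

Because the result depends only on rank order, it is unaffected by how many markers contributed to the raw score, which is the motivation given above for normalizing.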

In addition, the effect allele is the allele that serves as the reference for the effect size. In the present invention, when the sixth polymorphic site of each nucleotide sequence in the polynucleotides represented by SEQ ID NOs: 1 to 26 carries the effect allele and the effect size is a positive real number, increased fasting blood glucose and a high risk of developing type 2 diabetes can be predicted.

In addition, in the above method, a high risk of developing type 2 diabetes can be predicted when the polymorphic sites of SEQ ID NOs: 1 to 26 have a homozygous or heterozygous genotype.

Specifically, increased fasting blood glucose and a high risk of developing type 2 diabetes can be predicted if the effect size is a positive real number when the genotype of the polymorphic site is any of the following:

SEQ ID NO: 1: A
SEQ ID NO: 2: A
SEQ ID NO: 3: ATCTCTC
SEQ ID NO: 4: G
SEQ ID NO: 5: A
SEQ ID NO: 6: C
SEQ ID NO: 7: T
SEQ ID NO: 8: A
SEQ ID NO: 9: T
SEQ ID NO: 10: A
SEQ ID NO: 11: C
SEQ ID NO: 12: A
SEQ ID NO: 13: T
SEQ ID NO: 14: C
SEQ ID NO: 15: A
SEQ ID NO: 16: C
SEQ ID NO: 17: C
SEQ ID NO: 18: G
SEQ ID NO: 19: C
SEQ ID NO: 20: AAAAC
SEQ ID NO: 21: C
SEQ ID NO: 22: A
SEQ ID NO: 23: G
SEQ ID NO: 24: T
SEQ ID NO: 25: G
SEQ ID NO: 26: C

In addition, the diagnosis is preferably performed on Koreans, but is not limited thereto.

In addition, the genotype of the polymorphic markers of the present invention may be determined by known methods such as sequencing analysis, sequencing analysis using an automated sequencer, pyrosequencing, hybridization on a microarray, PCR-RFLP (restriction fragment length polymorphism), PCR-SSCP (single strand conformation polymorphism), PCR-SSO (specific sequence oligonucleotide), ASO (allele specific oligonucleotide) hybridization combining PCR-SSO with dot hybridization, TaqMan-PCR, MALDI-TOF/MS, RCA (rolling circle amplification), HRM (high resolution melting), primer extension, Southern blot hybridization, and dot hybridization. The results may be processed with statistical analysis methods commonly used in the art; for example, they may be analyzed using continuous variables, categorical variables, odds ratios, and 95% confidence intervals obtained through Student's t-test, the chi-square test, linear regression analysis, multiple logistic regression analysis, and the like.
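As a worked example of one of the statistics named above, the Python sketch below computes an odds ratio and an approximate 95% confidence interval from a 2x2 table of effect-allele carrier status in cases and controls. It uses the log-odds (Woolf) standard error, which is one common choice and is an assumption here, since the text does not say how the interval is computed. The counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and approximate confidence interval for a 2x2 table.

    a: cases carrying the effect allele      b: cases not carrying it
    c: controls carrying the effect allele   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf (logit) approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (illustrative values only).
or_value, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=20, d=80)
print(f"OR={or_value:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

An interval that contains 1 would indicate that the association is not statistically significant at the chosen level.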

Next, the present invention provides a polymorphic marker composition for diagnosing or predicting the onset of type 2 diabetes, comprising a primer or probe that specifically binds to a polynucleotide consisting of 10 or more contiguous nucleotides that include the polymorphic site located at the sixth position of each nucleotide sequence of the polynucleotides represented by SEQ ID NOs: 1 to 26, or to a complementary polynucleotide thereof.
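To make the target definition concrete, the Python sketch below lists every contiguous stretch of at least 10 nucleotides that covers the sixth position of a sequence, and computes the reverse complement to which a probe for the opposite strand would correspond. The 11-nucleotide sequence is a hypothetical placeholder, not one of SEQ ID NOs: 1 to 26, and the sketch assumes a single-base polymorphic site.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def windows_covering_site(seq: str, site_pos: int = 6, min_len: int = 10) -> List[str]:
    """All contiguous subsequences of length >= min_len that include the 1-based site_pos."""
    site = site_pos - 1  # convert to a 0-based index
    if not 0 <= site < len(seq):
        raise ValueError("site_pos lies outside the sequence")
    windows = []
    for start in range(0, site + 1):
        first_end = max(site + 1, start + min_len)
        for end in range(first_end, len(seq) + 1):
            windows.append(seq[start:end])
    return windows


# Hypothetical 11-nt sequence with the polymorphic site at position 6.
example = "ACGTAGCTAGC"
targets = windows_covering_site(example)
print(targets)
print(reverse_complement(example))
```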

In addition, the polymorphic site may be a single nucleotide polymorphism, and when the polymorphic site carries the effect allele, a high risk of developing type 2 diabetes can be predicted. The effect alleles of the polymorphic sites of SEQ ID NOs: 1 to 26 are as described above.

In addition, the primer or probe that specifically binds to the polynucleotide or its complementary polynucleotide is allele-specific.

The "primer" refers to a short nucleic acid sequence having a free hydroxyl group that can form base pairs with a complementary template and serves as the starting point for copying the template strand. The primer of the present invention can be chemically synthesized using methods known in the art, such as the phosphoramidite solid support method.

The "probe" refers to a nucleic acid fragment, such as RNA or DNA, consisting of several to several hundred bases and capable of specifically binding to mRNA; it is labeled so that the presence or absence of a specific mRNA can be determined. Probes may be prepared in the form of oligonucleotide probes, single-stranded DNA probes, double-stranded DNA probes, RNA probes, and the like, and may be labeled with biotin, FITC, rhodamine, DIG, or the like, or with a radioisotope.

In addition, the probe may be labeled with a detectable substance, for example a radioactive label that provides a suitable signal and has a sufficient half-life. The labeled probe can be hybridized to nucleic acid on a solid support as known in the literature (Sambrook et al., Molecular Cloning: A Laboratory Manual, 1989).

In addition, the probe is an allele-specific probe: because the polymorphic site lies within the nucleic acid fragment, the probe hybridizes to a fragment containing one allele but may not hybridize to a fragment containing the other allele. In this case, the hybridization conditions must be sufficiently stringent that the hybridization strength differs significantly between alleles and the probe hybridizes to only one of them. Preferably, the probe may be single-stranded for maximum hybridization efficiency, but is not limited thereto.

In addition, the present invention provides a composition for diagnosing or predicting the onset of type 2 diabetes, comprising the polymorphic marker composition.

In addition, the composition can diagnose or predict the onset of type 2 diabetes using a polymorphic marker comprising a primer or probe capable of detecting the polymorphic site corresponding to the sixth base of the polynucleotides represented by SEQ ID NOs: 1 to 26, which is a feature of the present invention.

In addition, detection of a specific nucleic acid using a primer can be performed by amplifying the sequence of the target gene with an amplification method such as PCR and then confirming, by a method known in the art, whether the gene has been amplified. Detection of a specific nucleic acid using a probe can be performed by contacting the sample nucleic acid with the probe under suitable conditions and then confirming whether a hybridized nucleic acid is present.

In addition, methods for detecting a specific nucleic acid using the probe or primer include, but are not limited to, polymerase chain reaction (PCR), DNA sequencing, RT-PCR, primer extension (Nikiforov et al., Nucl Acids Res, 22, 4167-4175, 1994), oligonucleotide extension analysis (Nickerson et al., Proc Natl Acad Sci USA, 87, 8923-8927, 1990), allele-specific PCR (Rust et al., Nucl Acids Res, 6, 3623-3629, 1993), RNase mismatch cleavage (Myers et al., Science, 230, 1242-1246, 1985), single strand conformation polymorphism (Orita et al., Proc Natl Acad Sci USA, 86, 2766-2770, 1989), simultaneous SSCP and heteroduplex analysis (Lee et al., Mol Cells, 5:668-672, 1995), denaturing gradient gel electrophoresis (DGGE, Cariello et al., Am J Hum Genet, 42, 726-734, 1988), denaturing high-pressure liquid chromatography (Underhill et al., Genome Res, 7, 996-1005, 1997), hybridization reactions, and DNA chips. Examples of the hybridization reactions include Northern hybridization (Maniatis T. et al., Molecular Cloning, Cold Spring Harbor Laboratory, NY, 1982), in situ hybridization (Jacquemier et al., Bull Cancer, 90:31-8, 2003), and microarray methods (Macgregor, Expert Rev Mol Diagn, 3:185-200, 2003).

The composition of the present invention for diagnosing type 2 diabetes or predicting its onset may further include reagents commonly used in the nucleic acid detection methods described above. For example, it may include the dNTPs (deoxynucleotide triphosphates), thermostable polymerase, and metal ion salts such as magnesium chloride that are required for PCR, as well as the dNTPs and Sequenase required for sequencing.

Finally, the present invention provides a kit for diagnosing type 2 diabetes or predicting its onset, comprising the polymorphic marker composition.

In addition to polynucleotides, primers, or probes for identifying one or more of the polymorphic markers, the kit may include one or more other component compositions, solutions, or devices suited to the analysis method.

For example, the kit of the present invention may be a kit containing the essential elements needed to perform PCR. In addition to polynucleotides, primers, or probes specific for the polymorphic markers, a PCR kit may include test tubes or other suitable containers, reaction buffers (varying in pH and magnesium concentration), deoxynucleotides (dNTPs), enzymes such as Taq polymerase and reverse transcriptase, DNase, RNase inhibitors, DEPC-treated water (DEPC-water), and sterile water.

According to the method of the present invention for providing information needed to diagnose type 2 diabetes or predict its onset, genetic risk scores are calculated separately for the 175 genetic variants previously reported in relation to type 2 diabetes and for the 56 genetic variants associated with fasting blood glucose, and the two scores are analyzed in combination, so that type 2 diabetes can be diagnosed early and the risk of onset can be predicted.

Hereinafter, examples of the present invention are described in more detail with reference to the accompanying drawings. The following examples serve only to illustrate the present invention, and the present invention is not limited by them.

<Example 1> Quality control of genomic data

Referring to Fig. 1, genomic data were generated with the Korean Chip for 134,721 participants in the genome epidemiology study project. Referring to Fig. 2, the data were then cleaned according to a quality control pipeline comprising genotype calling, removal of low-quality genetic variants, removal of low-quality samples, and removal of outliers by MDS (multi-dimensional scaling) / PCA (principal component analysis). After quality control, genomic data for 125,872 participants were retained and used for the subsequent analyses.

<Example 2> Analysis of genomic data

The quality-controlled genomic data for the 125,872 participants were phased with Eagle software and imputed with IMPUTE v4 software. This yielded information on about 8.3 million single nucleotide polymorphisms (SNPs). From the clinical and epidemiological data of the 125,872 participants, traits such as fasting plasma glucose (FPG) and glycated hemoglobin (hemoglobin A1c, HbA1c) were analyzed, and records with a disease history or medication history that could affect each trait were excluded. In particular, patients with type 2 diabetes were excluded from the analysis of each trait (see Fig. 3).

Type 2 diabetes patients and the normal group were defined as follows, according to the criteria of the ADA (American Diabetes Association). The number of type 2 diabetes patients selected by these criteria was 12,135.

Type 2 diabetes patients: FPG >= 126 mg/dL (7.0 mmol/L), 2-hour blood glucose after an oral glucose tolerance test (OGTT) >= 200 mg/dL (11.1 mmol/L), or HbA1c >= 6.5% (48 mmol/mol). Participants diagnosed with type 2 diabetes according to their medical history are also included.

Normal group: FPG < 100 mg/dL (5.6 mmol/L), 2-hour OGTT glucose < 140 mg/dL (7.8 mmol/L), or HbA1c < 6% (42 mmol/mol). No participant in this group may have ever been diagnosed with type 2 diabetes.
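The two definitions above amount to a small decision rule. The following is a minimal Python sketch of that rule, not the inventors' implementation. The function name, the `unclassified` label, and the treatment of a missing measurement (`None`) as simply not evaluated are assumptions made for illustration. The normal-group test joins the three measurements with "or", as the text does, and HbA1c is read in percent.

```python
from typing import Callable, Optional


def classify_glycemic_status(
    fpg_mg_dl: Optional[float] = None,
    ogtt_2h_mg_dl: Optional[float] = None,
    hba1c_pct: Optional[float] = None,
    prior_t2d_diagnosis: bool = False,
) -> str:
    """Return 't2d', 'normal', or 'unclassified' (hypothetical labels)."""

    def meets(value: Optional[float], test: Callable[[float], bool]) -> bool:
        # A missing measurement (None) never satisfies a criterion.
        return value is not None and test(value)

    # Patient criteria are checked first; any single criterion is sufficient.
    if (
        prior_t2d_diagnosis
        or meets(fpg_mg_dl, lambda v: v >= 126)
        or meets(ogtt_2h_mg_dl, lambda v: v >= 200)
        or meets(hba1c_pct, lambda v: v >= 6.5)
    ):
        return "t2d"

    # Normal criteria, joined by "or" as in the definition above.
    if (
        meets(fpg_mg_dl, lambda v: v < 100)
        or meets(ogtt_2h_mg_dl, lambda v: v < 140)
        or meets(hba1c_pct, lambda v: v < 6.0)
    ):
        return "normal"

    # Everything else (for example FPG between 100 and 125 mg/dL alone).
    return "unclassified"
```

Checking the patient criteria before the normal criteria means that a prior diagnosis or any single elevated value takes precedence over a normal reading on another measurement.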

<Example 3> Genome-wide association analysis method

A genome-wide association study (GWAS) was used to select genetic variants associated with fasting blood glucose. Using the imputed data and the clinical and epidemiological data described above, the association between fasting blood glucose and every genetic variant was analyzed in the 125,872 participants. Previously reported genetic variants related to type 2 diabetes were also extracted, and 175 of these were used in this study to analyze the association with type 2 diabetes (Table 1). The association analysis was performed by linear regression, adjusted for age, sex, and cohort region.

To secure statistical power for discovering genetic variants associated with fasting blood glucose, the results were combined with the association results between fasting blood glucose and genetic variants released by BioBank Japan (about 160,000 individuals, about 6 million variants) in an inverse-variance weighted meta-analysis. Results with high heterogeneity (P < 0.001) were removed after the analysis. As a result, 26 novel variants not previously reported and 30 previously reported variants associated with fasting blood glucose were identified. These 56 variants are shown in Fig. 4 and Table 2. Fig. 4 is a Manhattan plot of the 56 variants associated with fasting blood glucose: the x-axis is the chromosome, the y-axis is -log10(P-value), blue marks previously reported variants, and red marks novel variants.
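The meta-analysis step can be sketched for a single variant as follows. This is an illustrative fixed-effect inverse-variance weighted calculation for exactly two cohorts, written with the standard library only; the function name and return layout are hypothetical, and the text does not state which software or heterogeneity statistic was actually used. Cochran's Q is assumed here as the heterogeneity test.

```python
import math
from typing import Sequence, Tuple


def ivw_meta_two_cohorts(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Fixed-effect inverse-variance weighted meta-analysis of two cohorts.

    Returns (pooled beta, pooled standard error, Cochran's Q, heterogeneity P).
    """
    if len(betas) != 2 or len(ses) != 2:
        raise ValueError("this sketch handles exactly two cohorts")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    # With two cohorts Q has 1 degree of freedom, so its upper-tail
    # chi-square probability equals erfc(sqrt(Q / 2)).
    p_het = math.erfc(math.sqrt(q / 2.0))
    return pooled, pooled_se, q, p_het
```

Under these assumptions, a variant would be kept when the returned heterogeneity P is at least 0.001 and dropped otherwise, mirroring the filter described above.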

Table 1
Chromosome  Position  Effective allele  Other allele  Probe
1  22068326  G  A  ACCAGACATGG
1  40035928  T  G  ATGAAGAAACC
1  51256091  T  C  ACAGGCGTGAG
1  120526982  T  C  CACTACGGGTC
1  177889025  C  A  CAAAAAGGAAT
1  205114873  G  C  ACTCTCGGGCT
1  206593900  G  C  GTGTCCTAGGA
1  214159256  C  T  GTATATAGCCC
1  219748818  G  C  TGCAACTCTTT
1  229672955  A  G  TTTGGGAACTA
1  235690800  A  G  GTTACGTTGGT
2  25643221  T  G  TGAAAGGTCAT
2  27730940  C  T  CTTGCTGGTGA
2  58921777  A  G  AATTTGTACTT
2  59307725  A  G  TCTGAGTGATT
2  60583665  A  G  CCAGAGTGTGG
2  65287896  G  A  TGTACATTGAG
2  120231070  G  C  ACAGGCCAGAT
2  147861633  T  C  CCCCACCTCAA
2  149455385  G  A  ATTCTAGTGCC
2  227101411  G  A  AAACAATCAGT
2  234303281  G  C  CCCACCCTTTT
3  12336507  A  G  GCAAGGTCATA
3  23455582  C  T  CCAAATGGTAG
3  46925539  C  T  GAATGTAGGTC
3  49980596  T  C  GGTTGCTTCTG
3  63962339  A  G  GGACAGTGATG
3  77671721  A  C  AAATACAAAAT
3  121956953  A  G  GAACAGTCACA
3  124926637  C  T  TGCACTGGCTG
3  152086533  G  A  AAGTTATATTC
3  170733076  A  G  AACATGGTGAA
3  183738460  C  A  ACAGGAGTGAG
3  185503456  A  T  AAAAATAATAA
3  186665645  T  C  CACACCCCTTG
4  744972  T  G  CGTTTGTGGGA
4  1784403  T  C  CCCAGCGCCTT
4  6306763  C  G  TCACCGTATGG
4  45186139  A  G  GATCTGCTAAG
4  71748245  A  G  ATGAGGGTCTA
4  83578271  G  A  TTGTCAATTTT
4  85301870  C  T  AGAAGTAGAAA
4  104140848  A  C  GCAGACATAGA
4  137083193  C  A  TTTACATAAGT
4  153513369  A  T  CACTTTCCTTT
5  44682589  G  A  TCTCAATTATT
5  52094244  AT  A  AGATAATGACA
5  52100489  G  A  TTCATATATTC
5  53271420  A  G  TGCTGGCTTTA
5  55808475  T  C  AGCATCCAGGG
5  75003678  C  T  CATAATGAAGC
5  76424949  A  G  CCAACGTTTTC
5  78430607  A  C  ATATGCCATGA
5  133864599  A  G  TAAAGGGCCAG
5  176513896  A  C  CGCGGCCACGC
6  7231843  A  G  GGGTGGACCTG
6  20679709  G  A  GTTTTAGATCT
6  40409243  C  T  GAGTCTGCCTG
6  43814190  T  C  ATATACGGTAG
6  50788778  C  A  GGACTAAGGTT
6  107431688  A  G  TTGGAGAAGGA
6  117996631  C  T  GTCCATGTACT
6  126792095  G  A  ATTTTACAGGC
6  127416930  G  A  AAAAAAAGTAA
6  137300960  G  A  CATGGAACATT
6  160770312  G  A  CGCTTATGGTC
6  164133001  T  C  ACAGGCGTGGG
7  13888699  C  G  AAACAGACACA
7  15063569  T  C  GAACTCCAACA
7  23512896  C  G  TTTAAGTTCCA
7  30728452  T  C  AGCCACGGCAA
7  44255643  A  G  GTGCCGTGGCA
7  69406661  T  A  CCATTATAGTC
7  103444978  T  C  ATCTACACACA
7  117495667  A  C  CCCCCCAAACA
7  130457914  G  A  TTACAACCTTA
7  156930550  C  G  ACCGTGCTCGG
8  19830921  T  C  AATCACTGTAC
8  30863938  C  T  AACCCTGTCTC
8  41508577  A  C  GCGCGCGAGAG
8  95961626  C  T  TCCGCTCGCCG
8  110123183  G  C  AGTTCCAGACC
8  118185025  A  G  TTCATGTCATG
8  129568078  T  C  CACAGCTAGTA
8  145507304  T  C  GAACTCCTGGG
9  4291928  C  A  CAGTCATTAAC
9  19067833  G  A  TTATAAGCATG
9  20241069  C  T  AGTAATGAAGA
9  22134068  A  G  TGTCAGCAGCT
9  28410683  C  T  CTGTATTTGAA
9  34074476  C  T  AGCCTTTAGAA
9  81359113  G  C  GCCGCCATGGT
9  81905590  G  A  ACAACATCTAT
9  84308948  A  G  TGTGAGTTGCT
9  97001682  C  A  TCCTCATAGCT
9  136149229  C  T  CCTTTTATGTG
9  139241030  G  A  TCCCAATTGGG
10  12307894  T  C  AGTGTCTACTG
10  77314617  A  T  AGAAATGAAGT
10  80952826  C  G  CTACCGAGGTT
10  94462427  C  T  TCTCTTCCATC
10  114758349  T  C  AGATACTATAT
10  122929493  C  T  TATAATCAGTC
10  124193181  G  T  CTTCCTGTGTG
11  2197286  G  A  GGAGGAGCCCA
11  2857194  C  A  GTCCCAGGGGT
11  34982148  A  C  TTATTCAGTCT
11  35437863  C  G  TTTCAGATGAT
11  65294799  T  C  AGCATCCCTAA
11  72460398  C  A  ACCTAATTTCT
11  92708710  G  C  CATCTCCTATC
11  128389391  T  C  AGCTACGCTGT
11  128398938  T  C  ATTTCCGAACT
12  12871099  G  T  GGATGTCAGCG
12  26453283  G  A  TGTCAATCTGT
12  27965150  T  C  CACATCCCAGA
12  66221060  T  A  AGAAAATATGT
12  71522953  C  G  GAGATGATGGG
12  97848775  G  A  TAACCATTGAA
12  108629780  A  G  TTGCCGCATTT
12  118406696  T  C  CCCCGCGCTGC
12  118412373  A  G  TCAACGGAGAG
12  133069698  A  G  CAAGAGCAAGG
13  26776999  G  A  GAGTCAGCAAC
13  31042452  T  G  CAGCGGTGCAA
13  33554302  A  G  CTAACGTTGAA
13  51088809  A  G  ATCAGGAAAGG
13  51096095  T  A  ATTACACAAGG
13  80717156  A  G  TTCATGGATAG
13  109947213  C  T  ATACATCCCTA
14  23288935  G  C  CCAAGCGTATC
14  38848419  T  G  AACATGGAGTT
14  91963722  G  A  CACACACGATA
15  41809205  A  G  CCACTGTGCCT
15  53091553  T  C  TCCTTCTCTAT
15  62394264  C  G  GGATGGCTTCC
15  63871292  T  C  ATTCACAGTAG
15  68080886  T  A  CTGTTACTGTT
15  75932129  T  G  GCCACGGCAGC
15  77818128  A  C  GTCCCCAGAAG
15  90423293  C  T  GTGCATCTTAG
16  295795  C  T  TTAGTTCATCA
16  3583173  T  C  TTCTGCCTCAT
16  20323168  A  G  ATGCTGACTAA
16  28915217  A  G  GGGCCGCTGGC
16  30045789  G  C  CCCAGCTATGG
16  53800954  C  T  CATGATATTGA
16  69651866  T  C  ACCACCCTGAC
16  75234872  A  C  CAGGACCAGCC
16  81534790  C  T  GGGCCTGATGG
16  89564055  A  T  TAAATTAATTA
17  36099952  A  T  TAATTTATTTA
17  40731411  C  G  GGGCTGGTGGG
17  47060322  A  C  TGGCACAATCT
17  65892507  C  G  AGAGGGTGGGA
18  7070642  C  T  AAGGCTCCTCT
18  57848369  T  A  TCATTAAAAAA
18  60845884  C  T  TTGGGTCTGAT
19  4948862  A  G  CAGGCGTGGTG
19  7970635  G  A  GAATCAGTCGT
19  13038415  A  G  GTGGCGGGTGC
19  19388500  T  A  CACAGACCCCA
19  22257558  C  T  TTTTATATTTT
19  33890838  G  C  TCCTTCGCCTG
19  45411941  C  T  ACGTGTGCGGC
19  46157019  G  A  GGGGAATTCAG
19  47569003  A  G  CATGCGCCAGG
20  21466795  C  T  ATTGCTTGAAC
20  32596704  A  G  AAGCAGTTCTG
20  48832135  T  C  GGGGCCGGTTA
20  50155386  C  T  AGAGATGAGAT
20  51223594  T  A  AATTTAAAAAA
20  57394628  G  C  TGTGTCTGATC
22  44324730  T  C  ATCCCCTTCTA
22  50356850  T  C  ACTTTCTCCTG

Chromosome and position are given relative to hg19.

Effective allele: the allele that serves as the reference for the effect size.

Other allele: the other allele.

Probe: an 11-base sequence made up of the allele position and the 5 flanking bases on each side. The 6th base corresponds to the Other allele. When the allele is 2 or more bases long, the Other allele occupies the bases starting at the 6th position, for the length of the Other allele.
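The probe convention above can be checked mechanically. The sketch below reads the allele out of an 11-base probe and compares it with the listed Other allele; the function and constant names are hypothetical and exist only for this illustration.

```python
PROBE_LENGTH = 11
ALLELE_START = 5  # 0-based index of the 6th base


def allele_in_probe(probe: str, allele_length: int = 1) -> str:
    """Return the bases a probe carries from its 6th position onward."""
    if len(probe) != PROBE_LENGTH:
        raise ValueError(f"expected an {PROBE_LENGTH}-base probe, got {len(probe)}")
    return probe[ALLELE_START:ALLELE_START + allele_length]


def probe_carries_other_allele(probe: str, other_allele: str) -> bool:
    """True when the probe shows the Other allele starting at its 6th base."""
    return allele_in_probe(probe, len(other_allele)) == other_allele
```

For the first row of the 175-variant table (probe ACCAGACATGG, Other allele A), `probe_carries_other_allele("ACCAGACATGG", "A")` holds because the 6th base is A. For an entry whose Other allele is two bases long, such as TA with probe TTTTTTAAAAG, the comparison covers the 6th and 7th bases.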

Table 2
Novelty  rs_id  CHR  POS  Effective allele  Other allele  Probe  SEQ ID NO
Novel  -  1  43,457,376  T  TA  TTTTTTAAAAG  24
Novel  rs2199501  1  226,442,395  A  G  ATCACGAGGTC  1
Novel  rs348330  1  229,672,955  A  G  TTTGGGAACTA  2
Novel  -  2  37,575,181  ATCTCTC  A  TTTCTATCTCT  3
Novel  rs243018  2  60,586,707  G  C  TTCCACGTATA  4
Novel  rs3849330  2  118,893,366  A  T  AGACATCCCTT  5
Novel  rs832189  3  63,841,942  C  T  TGTGGTGGCAG  6
Novel  rs7656416  4  1,254,535  T  C  CTACACGCCTC  7
Novel  rs147834269  4  6,303,731  A  G  ACGGCGAGGCC  8
Novel  rs1741676  6  117,277,631  T  A  TGAACAATCTT  9
Novel  -  6  153,421,680  G  GA  AAAAAGAAAAA  25
Novel  rs11145912  9  139,300,876  A  G  AGATCGTGCCA  10
Novel  rs61848342  10  12,303,813  C  T  GGGCATGGTGG  11
Novel  rs61839365  10  26,515,666  A  G  ATACTGTGTCA  12
Novel  rs139027698  10  94,468,247  T  C  ACCTCCGTCTC  13
Novel  -  12  97,853,954  C  CA  TTTCACAATTT  26
Novel  rs4992759  12  133,140,409  C  T  GAGCTTGTGTG  14
Novel  rs34238147  13  26,776,255  A  G  GAACCGAGGCC  15
Novel  rs6495240  15  77,766,556  C  T  TGGCTTGAGCA  16
Novel  rs7219123  17  17,337,839  C  G  GGCAGGTACAG  17
Novel  rs9897302  17  63,163,415  G  A  CCCAGATTCAA  18
Novel  rs2302783  17  66,447,073  C  T  TGGATTGATCA  19
Novel  -  19  46,184,757  AAAAC  A  TCTTAAAAACA  20
Novel  rs6021276  20  50,155,386  C  T  AGAGATGAGAT  21
Novel  rs34885433  20  61,273,960  A  G  TGCTGGCGTAG  22
Novel  rs1757697  20  62,137,971  G  A  CAGGTATGGTG  23
Previously reported  -  2  27,744,364  A  ATT  AGCTAATTTTT  27
Previously reported  rs12712928  2  45,192,080  C  G  TGTTTGCTGTC  28
Previously reported  rs781534836  2  169,775,546  A  ATTTG  ATCTTATTTGT  29
Previously reported  -  2  173,597,431  T  TG  AGAGATGGGGG  30
Previously reported  rs6767514  3  123,151,762  A  C  AAGCACCCAGA  31
Previously reported  rs9873618  3  170,733,076  A  G  AACATGGTGAA  32
Previously reported  rs11705729  3  185,507,299  T  A  CCACCAGCAGC  33
Previously reported  rs9379084  6  7,231,843  A  G  GGGTGGACCTG  34
Previously reported  rs10440833  6  20,688,121  A  T  TTTTTTAATTT  35
Previously reported  rs3765467  6  39,033,595  A  G  CAAGCGAGGGG  36
Previously reported  rs4721401  7  15,064,896  G  T  GAGTTTGGGGT  37
Previously reported  -  7  44,252,190  C  CTG  AGGCTCTGTGG  38
Previously reported  rs7791286  7  50,856,792  A  G  GTGAGGAGTTC  39
Previously reported  rs13266634  8  118,184,783  T  C  CCAGCCGGGAC  40
Previously reported  rs59757115  9  628,941  G  A  TGGGCATGGTG  41
Previously reported  -  9  4,284,961  A  AT  GTGCTATTTTT  42
Previously reported  rs10965248  9  22,132,878  C  T  GGGGTTTCACC  43
Previously reported  rs649129  9  136,154,304  T  C  CCTAACGCAGT  44
Previously reported  rs11595776  10  113,003,311  A  G  GTTGCGGGGCT  45
Previously reported  rs60808706  11  2,857,233  A  G  GAGGCGGGGCA  46
Previously reported  rs204926  11  8,255,106  A  G  GGGCTGGTCCA  47
Previously reported  rs174538  11  61,560,081  A  G  CGTCGGCAGGA  48
Previously reported  rs74333814  11  72,457,487  T  C  AAGCACGGCAC  49
Previously reported  -  11  92,695,708  TTGAGAAATCTCTATACTATTTTCCATAGTGACTG  T  GCTCTTTGAGA  50
Previously reported  rs671  12  112,241,766  A  G  ACACTGAAGTG  51
Previously reported  rs3812861  13  28,493,489  T  A  TTTTTAAAAAA  52
Previously reported  rs2858980  13  33,554,587  A  G  TAGACGAAATG  53
Previously reported  -  15  62,391,184  A  AT  CATATATTTTT  54
Previously reported  -  15  99,292,396  C  CTT  AAATCCTTTTT  55
Previously reported  -  20  22,562,326  AG  A  GGTGAAGAAGA  56

Other allele: the other allele.

<Example 4> Genetic risk analysis method

All polymorphic markers listed in Tables 1 and 2 were extracted from the Korean Chip genomic data used for the genome-wide association analysis in Example 3, and the presence or absence of the effect allele was determined for each polymorphic marker.

A value of 0 was assigned when the effect allele (effective allele) was absent, 1 when it was present on only one haploid copy, and 2 when it was present on both haploid copies. When the allele count is obtained by genotype imputation, it takes a real value between 0 and 2 according to the posterior probability of the genotype. The effect allele frequency at each polymorphic marker locus in type 2 diabetes patients was compared with that in normal controls to obtain the beta coefficient (β) at each locus, and these values were substituted into Equation 1 below to calculate the genetic risk score (GRS). The calculated genetic risk scores were standardized to follow a standard normal distribution.
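Equation 1 itself does not survive in this text. Assuming it is the usual weighted sum of allele counts, which is what the symbol definitions that follow describe, it could be written as shown here. The exact form, including any normalization by the number of markers, is an assumption.

```latex
\mathrm{GRS}_j = \sum_{i=1}^{n} \beta_i \, x_{ij}
```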

n: total number of polymorphic markers,

i: index of the polymorphic marker (SEQ ID NO),

j: sample number,

β: beta coefficient (genetic effect size of the polymorphic marker),

x: allele count of sample j, equal to 0, 1, or 2 according to the presence or absence of the effect allele (effective allele). When genotype imputation is used, the allele count is replaced by a real value between 0 and 2.
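The scoring and standardization steps can be sketched as follows, assuming the genetic risk score is a plain weighted sum of allele counts (β times x, summed over markers). The function names are hypothetical, and the use of the population standard deviation for standardization is also an assumption, since the text only states that the scores were standardized.

```python
import statistics
from typing import List, Sequence


def genetic_risk_scores(
    dosages: Sequence[Sequence[float]], betas: Sequence[float]
) -> List[float]:
    """Weighted sum of allele counts per sample.

    dosages[j][i] is the allele count of marker i in sample j: 0, 1, or 2,
    or a real value between 0 and 2 when the genotype was imputed.
    """
    scores = []
    for row in dosages:
        if len(row) != len(betas):
            raise ValueError("each sample needs one allele count per marker")
        scores.append(sum(b * x for b, x in zip(betas, row)))
    return scores


def standardize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to mean 0 and standard deviation 1."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0:
        raise ValueError("scores are constant and cannot be standardized")
    return [(s - mean) / sd for s in scores]
```

For example, with two markers of effect size 0.5 and 0.2, a sample with an imputed dosage of 1.6 at the first marker and 0 at the second would receive a raw score of 0.8 before standardization.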

<Example 5> Analysis of the genetic high-risk group

To analyze the genetic high-risk group, the calculated genetic risk scores were sorted in ascending order and divided by percentile or into 30 groups according to the sorted values.

The type 2 diabetes prevalence in the genetic high-risk group was compared with that in the median group (the 15th of 30 groups, or the 40-60% percentile group) and in the low-risk group (for example, the 1st of 30 groups, about the bottom 3.3%). Logistic regression was used for the analysis, adjusted for sex, age, and cohort region.
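The grouping step can be sketched as below. The sketch sorts the scores in ascending order, assigns each sample to one of 30 equal-sized groups, and reports the unadjusted prevalence per group. It does not reproduce the logistic regression adjusted for sex, age, and cohort region; the function names and the handling of group sizes that do not divide evenly are assumptions.

```python
from typing import Dict, List, Sequence


def assign_risk_groups(scores: Sequence[float], n_groups: int = 30) -> List[int]:
    """Return a group number per sample: 1 = lowest scores, n_groups = highest."""
    n = len(scores)
    order = sorted(range(n), key=lambda j: scores[j])  # ascending by score
    groups = [0] * n
    for rank, j in enumerate(order):
        # Integer arithmetic spreads ranks 0..n-1 evenly over the groups.
        groups[j] = rank * n_groups // n + 1
    return groups


def prevalence_by_group(
    groups: Sequence[int], is_case: Sequence[bool]
) -> Dict[int, float]:
    """Unadjusted fraction of cases in each group."""
    counts: Dict[int, int] = {}
    cases: Dict[int, int] = {}
    for g, c in zip(groups, is_case):
        counts[g] = counts.get(g, 0) + 1
        cases[g] = cases.get(g, 0) + (1 if c else 0)
    return {g: cases[g] / counts[g] for g in counts}
```

With these groups, the high-risk group corresponds to group 30, the median group to group 15, and the low-risk group to group 1.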

As a result, when the fasting blood glucose genetic risk score was divided into 30 groups, the highest-risk group (top 3.3%) had a type 2 diabetes prevalence of 14.9%, which was 1.5 times that of the middle group and 3 times that of the lowest group (bottom 3.3%) (Fig. 5). In addition, for the genetic high-risk group based on the fasting blood glucose genetic risk score, the risk of diabetes increased 2.27-fold in the top 1% and 2.04-fold in the top 5%, while for the genetic high-risk group based on the type 2 diabetes genetic risk score it increased 4.58-fold in the top 1% and 3.43-fold in the top 5% (Table 3).

Table 3
Category     GRS type     Top GRS group  Odds ratio  95% CI      P-value
Single       FPG          Top 20%        1.67        1.57-1.77   8.96E-66
Single       FPG          Top 10%        1.9         1.77-2.04   4.45E-74
Single       FPG          Top 5%         2.04        1.87-2.23   4.65E-59
Single       FPG          Top 1%         2.27        1.92-2.69   3.38E-21
Single       T2D          Top 20%        2.27        2.14-2.40   1.39E-173
Single       T2D          Top 10%        2.88        2.70-3.08   8.28E-221
Single       T2D          Top 5%         3.43        3.17-3.71   2.71E-204
Single       T2D          Top 1%         4.58        3.96-5.30   1.61E-93
Combination  FPG and T2D  Top 20%        3.26        2.91-3.66   7.29E-91
Combination  FPG and T2D  Top 10%        4.34        3.79-4.97   1.54E-99
Combination  FPG and T2D  Top 5%         5.28        4.40-6.34   2.75E-71
Combination  FPG and T2D  Top 1%         8.08        4.55-14.35  9.33E-13

<Example 6> Combined analysis of the genetic high-risk groups

The fasting blood glucose genetic risk score and the type 2 diabetes genetic risk score were each divided by percentile, and the group falling in the high-risk range on both scores was compared with the remaining participants. Logistic regression was used for the analysis, adjusted for sex, age, and cohort region.

As a result, the genetic high-risk group in the top 1% for both the fasting blood glucose and the type 2 diabetes genetic risk scores showed an 8.08-fold increase in the risk of diabetes, and the group in the top 5% for both scores showed a 5.28-fold increase.
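The combined selection in this example can be sketched as follows: a sample is flagged only when it falls in the top fraction on both genetic risk scores, and the flagged group is then compared with everyone else. The odds ratio here is a crude 2x2 estimate used as a stand-in; it is not the logistic regression adjusted for sex, age, and cohort region, so it would not reproduce the reported values. All function names are hypothetical.

```python
from typing import List, Sequence


def top_fraction_flags(scores: Sequence[float], fraction: float) -> List[bool]:
    """Flag samples whose score is at or above the top-fraction cutoff."""
    n_top = max(1, int(len(scores) * fraction))
    cutoff = sorted(scores, reverse=True)[n_top - 1]
    # Ties at the cutoff are all included, so slightly more than
    # n_top samples may be flagged.
    return [s >= cutoff for s in scores]


def combined_high_risk(
    fpg_grs: Sequence[float], t2d_grs: Sequence[float], fraction: float
) -> List[bool]:
    """High risk only when in the top fraction on BOTH scores."""
    fpg_top = top_fraction_flags(fpg_grs, fraction)
    t2d_top = top_fraction_flags(t2d_grs, fraction)
    return [a and b for a, b in zip(fpg_top, t2d_top)]


def crude_odds_ratio(high_risk: Sequence[bool], is_case: Sequence[bool]) -> float:
    """Unadjusted odds ratio of being a case, high-risk group versus the rest."""
    a = sum(1 for h, c in zip(high_risk, is_case) if h and c)
    b = sum(1 for h, c in zip(high_risk, is_case) if h and not c)
    c_ = sum(1 for h, c in zip(high_risk, is_case) if not h and c)
    d = sum(1 for h, c in zip(high_risk, is_case) if not h and not c)
    if b == 0 or c_ == 0:
        raise ValueError("odds ratio is undefined when a comparison cell is empty")
    return (a * d) / (b * c_)
```

Requiring membership in both top groups shrinks the selected set, which is consistent with the wider confidence intervals reported for the combined top 1% group.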

To validate these results in independent genomic data, the same genetic variants were extracted from Korean Chip genomic data for 22,686 individuals, the genetic risk scores were computed in the same way as above, and the incidence of type 2 diabetes was analyzed. Because this sample is much smaller than the roughly 126,000 individuals used in the initial analysis, the top 1% high-risk group was excluded from the results. The results are shown in Table 4.

As shown in Table 4, the genetic high-risk group in the top 5% for both the fasting blood glucose and the type 2 diabetes genetic risk scores showed a 3.51-fold diabetes incidence, exceeding the 1.94- to 2.65-fold incidence of the high-risk groups based on either the fasting blood glucose or the type 2 diabetes genetic risk score alone.

These results mean the following. The method of the present invention analyzes the two genetic risk scores in combination and selects as the genetic high-risk group for type 2 diabetes those individuals who fall in both the high-risk group based on the fasting blood glucose (FPG) genetic risk score and the high-risk group based on the type 2 diabetes (T2D) genetic risk score. Compared with existing methods that select the type 2 diabetes high-risk group by analyzing the genetic risk score for FPG-related variants or for T2D-related variants alone, this method has markedly better selection accuracy and can precisely select a genetic high-risk group with a higher incidence of type 2 diabetes, so it can be used for customized care, treatment, and prevention based on the selection of the genetic high-risk group for type 2 diabetes.

[Table 4]

| Category | GRS type | Top GRS group | Odds ratio | 95% CI | P-value |
|---|---|---|---|---|---|
| Single | FPG | Top 20% | 1.56 | 1.56-2.10 | 3.85E-15 |
| Single | FPG | Top 10% | 1.85 | 1.85-2.61 | 3.38E-19 |
| Single | FPG | Top 5% | 1.94 | 1.94-3.00 | 1.71E-15 |
| Single | T2D | Top 20% | 2.09 | 2.09-2.78 | 8.79E-33 |
| Single | T2D | Top 10% | 2.48 | 2.48-3.47 | 5.13E-36 |
| Single | T2D | Top 5% | 2.65 | 2.65-4.00 | 1.87E-29 |
| Combination | FPG and T2D | Top 20% | 2.84 | 2.84-5.04 | 9.16E-20 |
| Combination | FPG and T2D | Top 10% | 3.44 | 3.44-6.87 | 3.44E-19 |
| Combination | FPG and T2D | Top 5% | 3.51 | 3.51-9.13 | 1.16E-12 |
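For readers unfamiliar with the quantities reported in the table, the sketch below shows one common way an odds ratio and its 95% confidence interval are obtained from a 2x2 table of counts (the unadjusted Woolf logit method). The source does not state how its odds ratios were estimated, so this is an assumption for illustration only; the function name `odds_ratio_with_ci` and all counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(
    cases_high: int,
    controls_high: int,
    cases_other: int,
    controls_other: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald (Woolf logit) confidence interval.

    cases_high / controls_high: diabetes cases and controls in the high-risk group.
    cases_other / controls_other: cases and controls in the remaining samples.
    z = 1.96 gives an approximate 95% interval.
    """
    counts = (cases_high, controls_high, cases_other, controls_other)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (cases_high * controls_other) / (controls_high * cases_other)
    # Standard error of log(OR) is the square root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the patent.
or_value, ci_low, ci_high = odds_ratio_with_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

An analysis adjusted for covariates would instead take the odds ratio from a logistic regression coefficient; the unadjusted form is used here only because it needs nothing beyond the four counts.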

Specific embodiments of the present invention have been described in detail above. Those skilled in the art who understand the spirit of the present invention could, by adding, changing, or deleting components within the scope of the same spirit, readily propose other regressive inventions or other embodiments that fall within the scope of the spirit of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

<110> Korea Disease Control and Prevention Agency, KDCA
<120> methods for diagnosing the high risk group of Diabetes based on Genetic Risk Score
<130> PN2007-308
<160> 56
<170> KoPatentIn 3.0
<210> 1 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 1 atcacgaggt c 11
<210> 2 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 2 tttgggaact a 11
<210> 3 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> ATCTCTC/A <400> 3 tttctatctc t 11
<210> 4 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> G/C <400> 4 ttccacgtat a 11
<210> 5 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/T <400> 5 agacatccct t 11
<210> 6 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/T <400> 6 tgtggtggca g 11
<210> 7 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> T/C <400> 7 ctacacgcct c 11
<210> 8 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 8 acggcgaggc c 11
<210> 9 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> T/A <400> 9 tgaacaatct t 11
<210> 10 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 10 agatcgtgcc a 11
<210> 11 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/T <400> 11 gggcatggtg g 11
<210> 12 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 12 atactgtgtc a 11
<210> 13 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> T/C <400> 13 acctccgtct c 11
<210> 14 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/T <400> 14 gagcttgtgt g 11
<210> 15 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 15 gaaccgaggc c 11
<210> 16 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/T <400> 16 tggcttgagc a 11
<210> 17 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/G <400> 17 ggcaggtaca g 11
<210> 18 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> G/A <400> 18 cccagattca a 11
<210> 19 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/T <400> 19 tggattgatc a 11
<210> 20 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> AAAAC/A <400> 20 tcttaaaaac a 11
<210> 21 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/T <400> 21 agagatgaga t 11
<210> 22 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 22 tgctggcgta g 11
<210> 23 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> G/A <400> 23 caggtatggt g 11
<210> 24 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(7) <223> T/TA <400> 24 ttttttaaaa g 11
<210> 25 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(7) <223> G/GA <400> 25 aaaaagaaaa a 11
<210> 26 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(7) <223> C/CA <400> 26 tttcacaatt t 11
<210> 27 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(8) <223> A/ATT <400> 27 agctaatttt t 11
<210> 28 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/G <400> 28 tgtttgctgt c 11
<210> 29 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(10) <223> A/ATTTG <400> 29 atcttatttg t 11
<210> 30 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(7) <223> T/TG <400> 30 agagatgggg g 11
<210> 31 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/C <400> 31 aagcacccag a 11
<210> 32 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 32 aacatggtga a 11
<210> 33 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> T/A <400> 33 ccaccagcag c 11
<210> 34 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 34 gggtggacct g 11
<210> 35 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/T <400> 35 ttttttaatt t 11
<210> 36 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 36 caagcgaggg g 11
<210> 37 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> G/T <400> 37 gagtttgggg t 11
<210> 38 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(8) <223> C/CTG <400> 38 aggctctgtg g 11
<210> 39 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 39 gtgaggagtt c 11
<210> 40 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> T/C <400> 40 ccagccggga c 11
<210> 41 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> G/A <400> 41 tgggcatggt g 11
<210> 42 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(7) <223> A/AT <400> 42 gtgctatttt t 11
<210> 43 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> C/T <400> 43 ggggtttcac c 11
<210> 44 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> T/C <400> 44 cctaacgcag t 11
<210> 45 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 45 gttgcggggc t 11
<210> 46 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 46 gaggcggggc a 11
<210> 47 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 47 gggctggtcc a 11
<210> 48 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 48 cgtcggcagg a 11
<210> 49 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> T/C <400> 49 aagcacggca c 11
<210> 50 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> TTGAGAAATCTCTATACTATTTTCCATAGTGACTG/T <400> 50 gctctttgag a 11
<210> 51 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 51 acactgaagt g 11
<210> 52 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> T/A <400> 52 tttttaaaaa a 11
<210> 53 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> A/G <400> 53 tagacgaaat g 11
<210> 54 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(7) <223> A/AT <400> 54 catatatttt t 11
<210> 55 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6)..(8) <223> C/CTT <400> 55 aaatcctttt t 11
<210> 56 <211> 11 <212> DNA <213> Homo sapiens <220> <221> variation <222> (6) <223> AG/A <400> 56 ggtgaagaag a 11

Claims

A method of providing information necessary for diagnosing type 2 diabetes or predicting its onset, the method comprising:
calculating, from a biological sample, a genetic risk score (GRS) for fasting blood glucose-related polymorphic markers, each including a polymorphic site located at the 6th nucleotide of one of SEQ ID NOs: 1 to 26;
calculating, from the biological sample, a genetic risk score (GRS) for type 2 diabetes-related polymorphic markers;
selecting a high-risk group based on each of the fasting blood glucose genetic risk score and the type 2 diabetes genetic risk score; and
determining a high-risk group for type 2 diabetes when the sample belongs to both the high-risk group based on the fasting blood glucose genetic risk score and the high-risk group based on the type 2 diabetes genetic risk score.

The method of claim 1,
wherein the fasting blood glucose-related polymorphic markers further comprise polymorphic markers each including a polymorphic site located at the 6th nucleotide of one of SEQ ID NOs: 27 to 56.

The method of claim 1,
wherein the biological sample is selected from the group consisting of tissue, cells, whole blood, serum, plasma, and saliva.

The method of claim 1,
wherein the genetic risk score (GRS) is calculated using Equation 1 below:

[Equation 1]

$$\mathrm{GRS}_j = \sum_{i=1}^{n} \beta_i \, x_{ij}$$

In Equation 1, n is the total number of polymorphic markers, i is the index of the polymorphic marker, β_i is the genetic effect of the i-th polymorphic marker, and x_ij is the allele value of sample j at the i-th polymorphic marker.
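As an illustration of how such a score might be computed for one sample, the sketch below assumes the score is the weighted sum of allele values that the definitions of n, β, and x suggest. The function name `genetic_risk_score`, the effect sizes, and the allele values (taken here as 0, 1, or 2 copies of the effect allele) are hypothetical and are not taken from the patent.

```python
from typing import Sequence


def genetic_risk_score(betas: Sequence[float], allele_values: Sequence[float]) -> float:
    """Weighted sum over markers for one sample: sum of beta_i * x_ij.

    betas[i] is the genetic effect of marker i; allele_values[i] is the
    sample's allele value at marker i (assumed to be 0, 1 or 2 copies).
    """
    if len(betas) != len(allele_values):
        raise ValueError("betas and allele_values must have the same length")
    return sum(beta * x for beta, x in zip(betas, allele_values))


# Hypothetical effect sizes and allele values for three markers.
example_betas = [0.10, 0.05, 0.20]
example_alleles = [2, 0, 1]

example_grs = genetic_risk_score(example_betas, example_alleles)
print(round(example_grs, 3))
```

Running the same function once with the fasting blood glucose marker set and once with the type 2 diabetes marker set would give the two scores on which the high-risk groups are defined.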

The method of claim 1,
When the genotype of the polymorphic site of SEQ ID NO: 1 is A,
When the genotype of the polymorphic site of SEQ ID NO: 2 is A,
When the genotype of the polymorphic site of SEQ ID NO: 3 is ATCTCTC,
When the genotype of the polymorphic site of SEQ ID NO: 4 is G,
When the genotype of the polymorphic site of SEQ ID NO: 5 is A,
When the genotype of the polymorphic site of SEQ ID NO: 6 is C,
When the genotype of the polymorphic site of SEQ ID NO: 7 is T,
When the genotype of the polymorphic site of SEQ ID NO: 8 is A,
When the genotype of the polymorphic site of SEQ ID NO: 9 is T,
When the genotype of the polymorphic site of SEQ ID NO: 10 is A,
When the genotype of the polymorphic site of SEQ ID NO: 11 is C,
When the genotype of the polymorphic site of SEQ ID NO: 12 is A,
When the genotype of the polymorphic site of SEQ ID NO: 13 is T,
When the genotype of the polymorphic site of SEQ ID NO: 14 is C,
When the genotype of the polymorphic site of SEQ ID NO: 15 is A,
When the genotype of the polymorphic site of SEQ ID NO: 16 is C,
When the genotype of the polymorphic site of SEQ ID NO: 17 is C,
When the genotype of the polymorphic site of SEQ ID NO: 18 is G,
When the genotype of the polymorphic site of SEQ ID NO: 19 is C,
When the genotype of the polymorphic site of SEQ ID NO: 20 is AAAAC,
When the genotype of the polymorphic site of SEQ ID NO: 21 is C,
When the genotype of the polymorphic site of SEQ ID NO: 22 is A,
When the genotype of the polymorphic site of SEQ ID NO: 23 is G,
When the genotype of the polymorphic site of SEQ ID NO: 24 is T,
When the genotype of the polymorphic site of SEQ ID NO: 25 is G, and
when the genotype of the polymorphic site of SEQ ID NO: 26 is C,
the risk of developing type 2 diabetes is predicted to be high.

The method of claim 1,
wherein the diagnosis or prediction of onset is for Koreans.

A polymorphic marker composition for diagnosing type 2 diabetes or predicting its onset, comprising a primer or probe that specifically binds to a polynucleotide, or to a polynucleotide complementary thereto, wherein the polynucleotide consists of 10 or more consecutive nucleotides of a polynucleotide represented by any one of the nucleotide sequences of SEQ ID NOs: 1 to 26 and includes the polymorphic site located at the 6th position of that nucleotide sequence.

The polymorphic marker composition of claim 7,
The allele of the polymorphic site of SEQ ID NO: 1 is A,
The allele of the polymorphic site of SEQ ID NO: 2 is A,
The allele of the polymorphic site of SEQ ID NO: 3 is ATCTCTC,
The allele of the polymorphic site of SEQ ID NO: 4 is G,
The allele of the polymorphic site of SEQ ID NO: 5 is A,
The allele of the polymorphic site of SEQ ID NO: 6 is C,
The allele of the polymorphic site of SEQ ID NO: 7 is T,
The allele of the polymorphic site of SEQ ID NO: 8 is A,
The allele of the polymorphic site of SEQ ID NO: 9 is T,
The allele of the polymorphic site of SEQ ID NO: 10 is A,
The allele of the polymorphic site of SEQ ID NO: 11 is C,
The allele of the polymorphic site of SEQ ID NO: 12 is A,
The allele of the polymorphic site of SEQ ID NO: 13 is T,
The allele of the polymorphic site of SEQ ID NO: 14 is C,
The allele of the polymorphic site of SEQ ID NO: 15 is A,
The allele of the polymorphic site of SEQ ID NO: 16 is C,
The allele of the polymorphic site of SEQ ID NO: 17 is C,
The allele of the polymorphic site of SEQ ID NO: 18 is G,
The allele of the polymorphic site of SEQ ID NO: 19 is C,
The allele of the polymorphic site of SEQ ID NO: 20 is AAAAC,
The allele of the polymorphic site of SEQ ID NO: 21 is C,
The allele of the polymorphic site of SEQ ID NO: 22 is A,
The allele of the polymorphic site of SEQ ID NO: 23 is G,
The allele of the polymorphic site of SEQ ID NO: 24 is T,
The allele of the polymorphic site of SEQ ID NO: 25 is G, and
the allele of the polymorphic site of SEQ ID NO: 26 is C, in the polymorphic marker composition for diagnosing type 2 diabetes or predicting its onset.

The polymorphic marker composition of claim 7,
wherein the diagnosis or prediction of onset is for Koreans.

A composition for diagnosing or predicting the onset of type 2 diabetes, comprising the marker composition of any one of claims 7 to 9.

A kit for diagnosing or predicting the onset of type 2 diabetes, comprising the marker composition of any one of claims 7 to 9.