
CN106202989A - Method for obtaining a child's individual biological age based on the oral microbial community


Info

Publication number
CN106202989A
Authority
CN
China
Prior art keywords
age
child
oral
information
microbial community
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510213461.1A
Other languages
Chinese (zh)
Other versions
CN106202989B (en)
Inventor
滕飞
杨芳
黄适
徐健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Institute of Bioenergy and Bioprocess Technology of CAS
Original Assignee
Qingdao Institute of Bioenergy and Bioprocess Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Institute of Bioenergy and Bioprocess Technology of CAS
Priority to CN201510213461.1A
Publication of CN106202989A
Application granted
Publication of CN106202989B
Legal status: Active
Anticipated expiration

Landscapes

  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present invention provides a method for obtaining a child's individual biological age based on the oral microbial community. The method comprises obtaining a sample containing the child's oral microorganisms; extracting DNA from the oral microorganisms; converting the DNA information into microbial community information; and using the random forest algorithm to regress age on the oral microbial community information, building a regression model from which the individual age of a child in the Chinese population is obtained. The scheme can accurately determine the biological age of individual children in the Chinese population. Saliva or dental plaque samples can be obtained non-invasively, simply, and efficiently, and a child's individual age can be monitored over the long term. This helps to assess the host's current physiological health status quickly, provides clues for health monitoring, and speeds up the early diagnosis of disease.

Description

A method for obtaining the individual biological age of children based on the oral microbial community

Technical field

The invention relates to the field of microbial detection models, and in particular to a method for obtaining the individual biological age of children based on the oral microbial community.

Background

Humans do not live alone: every human body carries billions of microorganisms, and a person together with his or her symbiotic microorganisms forms a "superorganism". The womb contains no microorganisms, and a human first encounters microorganisms in the birth canal. After birth, more microorganisms migrate into the body through feeding on milk and contact with the external environment. The human microbial community has age-related characteristics: it is established gradually as a person grows and continues to change throughout life along with physiological development. The microorganisms that enter the body after birth and strongly influence health are important carriers of acquired traits. In effect, the body hosts another genome, in addition to its own, whose expression helps regulate human health. Symbiotic microorganisms are currently regarded as this second genome of the human body; the sum of their genetic information is called the microbiome, and it gives humans complex individual characteristics that do not depend on the evolution of the human genome itself. A comprehensive understanding of the human symbiotic flora can therefore reveal in depth its influence on health and disease, and so link the presence of and changes in the microbial community to the physiological state of the host.

The oral cavity is a hub connecting the inside and outside of the human body and a very important habitat for the body's symbiotic flora. Maintaining a healthy balance in the structure and function of the oral flora is of profound importance to human health. Compared with blood tests and bone age as diagnostic media, oral sampling is minimally invasive, inexpensive, and simple and quick to collect and process.

Human growth and development can be described by two "ages": biological age and chronological age. Biological age refers to an individual's current position within his or her potential lifespan; it is a composite index of health status and an objective expression of how far the body has aged. Chronological age is the actual age counted from the date of birth, measured by elapsed calendar time. Because of nutrition, disease, genetics, environment, and other factors, some people's chronological age does not match their degree of development (biological age). Chronological age therefore cannot truly reflect how developed and mature a body is, whereas biological age is closely related to an individual's physiological health.

Skeletal age (bone age; SA or BA) is currently widely used to assess biological age or maturity. Bone age can be determined from the wrist, elbow, knee, foot, and other body parts; because the wrist is sensitive to developmental change and easy to image, wrist X-rays are the usual clinical choice for children. However, the hand and wrist contain many bones: 8 carpals, 5 metacarpals, and 14 phalanges, which together with the ulna and radius make 29, and the sesamoid bone on the inner side of the thumb is also an important marker of skeletal development. The main assessment methods are the atlas method and the scoring method. The atlas method is simple and intuitive, but in practice the assessor still compares and reads the whole X-ray film, so interpretation is highly subjective and the many possible combinations of bone maturity are hard to handle. The scoring method is relatively objective, but its bone development grades are too finely divided and its standards are hard to apply, which lowers the reliability of bone age assessment; standard bone age atlas libraries and computer-assisted reading systems also still need to be developed. In addition, although X-rays are almost harmless to the human body, long-term follow-up of a child's growth still requires protective measures during imaging, and non-radiographic methods of bone age assessment are at an early stage; ultrasound, for example, has low reading accuracy and unresolved methodological problems. Apart from bone age, dental maturity and the development of secondary sexual characteristics are commonly used to determine biological age, but these methods rely largely on the assessor's subjective judgment, yield ranges rather than single values, make an exact individual biological age difficult to calculate, and use indicators that vary considerably between individuals. An objective, accurate, easy-to-operate, non-invasive, high-throughput method of assessing biological age is therefore urgently needed.

Summary of the invention

In view of the above deficiencies in the prior art, the technical problem to be solved by the invention is to provide a method for obtaining the individual biological age of children based on the oral microbial community.

The technical solution adopted by the invention to achieve this object is a method for obtaining the individual biological age of children based on the oral microbial community, comprising the following steps:

Data collection: collect oral samples from individual children at multiple time points;

Data conversion: extract DNA from the oral samples and use bioinformatics methods to convert the DNA information into oral microbial community information;

Preliminary construction of the model: take the oral microbial community information as input variables and use the random forest method to regress age on them, giving a preliminary mathematical model that detects biological age from oral microbial community information;

Optimization and determination of the model: rank the variables by their importance in the model, simplify the set of variables without impairing model performance, and so determine the final model for detecting a child's individual age;

Detection of a child's individual biological age: take the required microbial community information as input variables and apply the established mathematical model to obtain the child's biological age at that time.

The oral sample is a saliva or supragingival plaque sample.

Converting the DNA information into oral microbial community information comprises the following steps:

Obtain 16S rRNA gene or whole-genome information from the DNA by high-throughput sequencing;

Assign the 16S rRNA gene or whole-genome information to bacterial taxa from the phylum level down to the species level;

For each sample, count the number of sequences assigned to each species and divide it by the total number of sequences obtained for that sample, giving the relative abundance of each species.
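
The relative-abundance step above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `relative_abundance` and the species labels in the example are assumptions, and each read is assumed to have already been assigned to a species.

```python
from collections import Counter

def relative_abundance(species_assignments):
    """Return {species: fraction of the sample's reads} for one sample.

    `species_assignments` holds one species label per sequence read
    (hypothetical input format; the patent does not prescribe one).
    """
    counts = Counter(species_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {species: n / total for species, n in counts.items()}

# Illustrative labels only; they are not data from the patent.
reads = ["species_A"] * 6 + ["species_B"] * 3 + ["species_C"]
profile = relative_abundance(reads)
```

Because every count is divided by the same per-sample total, the fractions of one sample sum to 1, which is what allows samples with different sequencing depths to be compared.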

The preliminary construction of the model comprises the following steps:

1) Take the species-level composition of all bacteria in the oral samples, together with their relative abundances, as input variables;

2) Use the random forest method to regress the children's ages on the input variables, giving a preliminary mathematical model that detects biological age from oral microbial community information.

The optimization and determination of the model comprises the following steps:

1) Obtain the importance to model performance of each variable (each representing a bacterial species) in the preliminary model;

2) Rank the variables by importance from least to most important, reduce the number of variables step by step, and use the random forest method to regress on age at each step, giving models with different combinations of variables;

3) Identify the simplest combination of variables that does not reduce model performance, designate these as the age-related variables, and so determine the final optimized model.
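
The three optimization steps above amount to a backward elimination driven by variable importance. The sketch below shows only that selection logic under stated assumptions: `simplify_features`, `importance`, `evaluate`, and `tolerance` are hypothetical names, `evaluate` stands in for retraining a random forest on a subset and returning its error (lower is better), and the toy scorer in the example is not from the patent.

```python
def simplify_features(importance, evaluate, tolerance=0.0):
    """Return the smallest importance-ranked subset whose error does not
    exceed the full model's error by more than `tolerance`.

    importance: {feature_name: importance value from the preliminary model}
    evaluate:   callable(list_of_features) -> error (assumed interface)
    """
    # Most important first, so the least important are dropped first.
    ranked = sorted(importance, key=importance.get, reverse=True)
    baseline = evaluate(ranked)
    best = ranked
    for k in range(len(ranked) - 1, 0, -1):
        subset = ranked[:k]
        if evaluate(subset) <= baseline + tolerance:
            best = subset  # a smaller subset that performs as well
    return best

# Toy scorer: the error stays low only while both "A" and "B" are kept.
toy_importance = {"A": 0.5, "B": 0.3, "C": 0.1, "D": 0.05}
toy_evaluate = lambda feats: 1.0 if {"A", "B"} <= set(feats) else 2.0
selected = simplify_features(toy_importance, toy_evaluate)
```

The `tolerance` argument makes explicit what "without reducing model performance" means in practice; the source does not state a threshold, so zero is used here as a placeholder.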

The detection of a child's individual biological age comprises the following steps:

1) Obtain DNA from the child's oral sample;

2) Use bioinformatics methods to convert the DNA information into oral microbial community information;

3) Obtain the relative abundances of the child's age-related variables;

4) Using the random forest method, supply the composition and abundance of the age-related variables to the established age detection model as input variables to obtain the child's biological age at that time.

The method further comprises comparing the child's biological age with his or her actual age to assess the child's current growth and development: a biological age lower than the actual age suggests that growth may be delayed by disease or other factors; a biological age equal to the actual age suggests normal growth and development; and a biological age higher than the actual age suggests possible precocious development.
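
The comparison rule above can be written as a small function. This is a sketch under one explicit assumption: since a regression model returns a continuous value that will almost never equal the actual age exactly, a `tolerance` (in years) is introduced to define "equal". The tolerance value and the function name `interpret_age_gap` are illustrative and are not specified in the patent.

```python
def interpret_age_gap(biological_age, chronological_age, tolerance=0.25):
    """Classify the gap between predicted and actual age (both in years).

    `tolerance` is an assumed band within which the two ages are
    treated as equal; the source does not state a threshold.
    """
    gap = biological_age - chronological_age
    if gap < -tolerance:
        return "possible growth delay"
    if gap > tolerance:
        return "possible precocious development"
    return "development consistent with age"
```

For example, `interpret_age_gap(4.0, 5.0)` falls below the band and is flagged as a possible growth delay, while `interpret_age_gap(5.1, 5.0)` falls inside it.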

The invention has the following advantages and beneficial effects:

1. Sample collection and processing are simple, non-invasive, and inexpensive;

2. Model construction and optimization are easy to carry out, and data processing is efficient;

3. The assessment is objective and automated and yields an exact numerical value;

4. The invention is widely applicable: it is suitable both for large-scale population assessment and for long-term monitoring of individuals, and it can both detect a child's current biological age and serve as an auxiliary method for assessing individual growth, development, and health.

Description of the drawings

Figure 1 shows the experimental design of the embodiment of the invention;

Figure 2 shows the structural characteristics of the oral microbial community in the embodiment;

Figure 3 shows the age-related oral microbial taxa selected by random forest regression and their contributions to model performance;

Figure 4 shows the results of applying the optimized model to the healthy and caries groups.

Detailed description

The invention is described in further detail below with reference to the drawings and an embodiment.

The embodiment constructs and optimizes a model that detects the biological age of children from the microbial communities of dental plaque and saliva (Figure 1), and comprises the following:

(1) Collection of clinical information on the children's oral health status (Table 1):

The oral health of full-time pupils at Guangzhou Nanfang Chinese-English Kindergarten was followed up. Examinations were made every six months, giving three examinations over one year, followed by a further examination one year later. Based on the recorded dmfs (decayed, missing, filled) index and the purpose of the study, children showing one of the following three patterns of oral health change were enrolled: ① healthy group (H2H): 17 children whose caries status remained healthy throughout; ② caries groups, comprising the caries-onset group (H2C): 21 children who went from healthy to newly developed caries, and the caries-progression group (C2C): 12 children who went from existing caries to more advanced caries. Inclusion criteria: aged about 4 years, with all 20 primary teeth erupted. Exclusion criteria: systemic disease; oral conditions such as periodontal disease or halitosis; antibiotic use within three months. The guardians of the volunteers consented to all details of the experimental procedure and to later data release, and signed informed consent forms. A total of 284 supragingival plaque and saliva samples taken during the oral examinations of all enrolled children were selected.

Survey method: examinations were carried out by two endodontic specialists using visual inspection combined with probing. Instruments were sterilized at high temperature and pressure, and soft debris was removed with cotton swabs where necessary. The examiners agreed on concepts, methods, and criteria beforehand, and the Kappa values in the consistency test all exceeded 0.92. Caries was diagnosed according to the World Health Organization's "Oral Health Surveys: Basic Methods" (1997). Diagnostic criteria for coronal caries: a lesion in a pit, fissure, or smooth surface with an unmistakable cavity, obvious undermining of the enamel, or a detectably softened floor or wall is recorded as caries, including teeth that have a filling or fissure sealant and are also decayed. The following findings, in the absence of other positive signs, are not recorded as caries: ① white or chalky spots; ② discolored or rough spots that are not soft on probing; ③ stained enamel pits or fissures without visible undermining of the enamel; ④ moderate to severe dental fluorosis with shiny, hard, pitted enamel; ⑤ lesions that, judging from their distribution or history together with tactile and visual examination, are due to abrasion.

Table 1. Clinical data of the samples in the example of the invention

(2) Collection of saliva and supragingival plaque samples:

Subjects avoided eating and drinking for one hour before sampling, and every sample was taken between 9:00 and 12:00 in the morning, with the child seated upright, head tilted slightly back, and eyes closed. About 3-5 ml of unstimulated saliva was collected in a 50 ml sterile centrifuge tube and divided into 1 ml aliquots in 1.5 ml centrifuge tubes. A sterile toothbrush was then used for 1 minute to collect supragingival plaque from all erupted primary teeth, and the plaque adhering to the brush was transferred to a 50 ml centrifuge tube containing 10 ml of double-distilled water. Contact with the mucosa and other oral sites was avoided during sampling. Samples were numbered and stored at -80°C until DNA extraction.

(3) Genomic DNA extraction and PCR amplification of 16S rRNA gene fragments

A high-salt DNA extraction method was used. The tubes containing plaque and saliva were centrifuged at 13,000 rpm for 15 min and the supernatant was discarded. 1 ml of lysis buffer was added to each tube, followed by 30 μL of proteinase K and 150 μL of 10% SDS, and the mixture was incubated overnight at 53°C in a shaking water bath. 400 μL of 5 M NaCl was added, and the mixture was incubated on ice for 10 min and centrifuged at 13,000 rpm for 10 min. An equal volume of saturated phenol was added and mixed with the aqueous phase to form an emulsion, the mixture was centrifuged at 13,000 rpm for 15 min, the upper viscous aqueous phase was transferred to a new tube, and the phenol extraction was repeated once. An equal volume of chloroform-isoamyl alcohol (24:1) was added and mixed by rotation, the mixture was centrifuged at 13,000 rpm for 15 min, and the upper viscous aqueous phase was transferred. 800 μL of isopropanol was added, and the mixture was incubated at room temperature for 1 min and centrifuged at 13,000 rpm for 15 min. The supernatant was discarded, and the pellet was washed twice with 70% ethanol, dried, and dissolved in 50 μL of TE buffer.

DNA concentration was quantified with a Qubit instrument, and DNA integrity was checked by electrophoresis. Extracted DNA was stored at -20°C. About 15 ng of DNA was used to build the 16S amplicon library.

To obtain relatively accurate phylogenetic information, the V1-V3 hypervariable region of the 16S rRNA gene (Escherichia coli positions 5-534) was chosen as the PCR target. The forward primer was 5'-NNNNNNN-TGGAGAGTTTGATCCTGGCTCAG-3' and the reverse primer was 5'-NNNNNNN-TACCGCGGCTGCTGGCAC-3'. NNNNNNN is the ID tag, a seven-base combination designed to distinguish sample sources and added to the 5' end of each primer. This parallel tagging allows many samples to be sequenced together in one run.
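
To make the role of the seven-base ID tag concrete, the sketch below assigns a read to its sample by reading the tag and then stripping the forward primer. It is a simplified illustration: the forward primer sequence and the tag length come from the text, but the function name `demultiplex`, the tag-to-sample table, and the exact-match rule are assumptions (the quality-control step in this document tolerates a small number of mismatches, which this sketch does not implement).

```python
FORWARD_PRIMER = "TGGAGAGTTTGATCCTGGCTCAG"  # forward primer given in the text
TAG_LENGTH = 7                              # length of the NNNNNNN ID tag

def demultiplex(read, tag_to_sample):
    """Return (sample_id, insert_sequence), or None if the read cannot
    be assigned. Exact matching only: a simplifying assumption."""
    tag, rest = read[:TAG_LENGTH], read[TAG_LENGTH:]
    sample = tag_to_sample.get(tag)
    if sample is None or not rest.startswith(FORWARD_PRIMER):
        return None
    return sample, rest[len(FORWARD_PRIMER):]

# Hypothetical tag table and read, for illustration only.
tags = {"ACGTACG": "child01_saliva"}
example_read = "ACGTACG" + FORWARD_PRIMER + "GATTACA"
assigned = demultiplex(example_read, tags)
```

Reads whose tag is unknown or whose primer is missing are returned as `None`, so they can be counted and discarded rather than silently misassigned.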

Each sample was amplified by PCR in triplicate. The 25 μL reaction contained 12.5 μL of GoTaq Hot Start polymerase, 1 μL each of forward and reverse primer (5 pM), 1 μL of genomic DNA (5 ng μL⁻¹), and 9.5 μL of PCR-grade sterile water, and was run on a Thermocycler PCR system. Cycling conditions were: initial denaturation at 95°C for 2 min; 25 cycles of denaturation at 94°C for 30 s, annealing at 56°C for 25 s, and extension at 72°C for 25 s; and a final extension at 72°C for 5 min. The PCR products were pooled and run in full on a gel (1.2% agarose, 5 V cm⁻¹, 40 min) to confirm amplification. The gel was placed under UV light, the band of about 500 bp was excised, and the target DNA was recovered and purified following the Qiagen MinElute kit protocol and eluted in 20 μL.

(4) 454 GS FLX Titanium sequencing

The main workflow was as follows: ① Library preparation: samples were quantified jointly with an Agilent BioAnalyzer 2100 and PicoGreen, mixed in equimolar amounts to build three DNA libraries, ligated to specific adapters, and denatured to recover single-stranded DNA. ② Emulsion PCR: the DNA library was fixed to magnetic beads and emulsified into an oil-water mixture so that each DNA fragment was amplified independently and in parallel in its own microreactor, producing millions of identical copies; the emulsion was then broken and the DNA fragments bound to the beads were recovered and purified. ③ Sequencing reaction: the DNA-carrying beads were mixed with the other reagents, loaded onto a PTP plate, and placed in the 454 GS FLX Titanium instrument; each addition of a nucleotide complementary to the template strand produces a light signal that is captured by the CCD camera, and the sequence is built up step by step. ④ Data collection: bases were called from the sequencing reaction data with the system's informatics tools.

(5) Conversion of the high-throughput data into microbial community data

Sequence quality control: the 454 high-quality sequence analysis workflow was based mainly on the mothur platform. Quality control criteria were set, and reads meeting them were retained as high-quality sequences: ① at least one primer end can be matched, with an edit distance (number of inserted, deleted, missing, or mismatched bases) of no more than 2; ② sequence length greater than 150 bp; ③ a 50 bp window is slid along each read one base at a time from the first base, the mean quality score within the window is computed at each position, and this mean must exceed 35; ④ no ambiguous bases; ⑤ no more than 1 mismatch in the tag sequence. After this initial filtering, the reads were further screened for sequencing errors, including "preclustering" and chimera detection; the UCHIME program was used to find and remove chimeric sequences.
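
Criteria ② to ④ above (length, sliding-window quality, ambiguous bases) can be sketched as a single predicate. This is not the mothur implementation: the function name `passes_quality` and its parameters are assumptions, and the primer and tag matching of criteria ① and ⑤ are left out.

```python
def passes_quality(seq, quals, min_len=150, window=50, min_mean_q=35):
    """Return True if a read passes the length, ambiguity, and
    sliding-window quality criteria described in the text.

    seq:   base string; quals: per-base quality scores (same length).
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if len(seq) <= min_len:                        # must be longer than 150 bp
        return False
    if any(base not in "ACGT" for base in seq.upper()):
        return False                               # ambiguous base present
    window = min(window, len(quals))
    window_sum = sum(quals[:window])
    if window_sum / window <= min_mean_q:          # mean must exceed 35
        return False
    for i in range(window, len(quals)):
        # Slide the window one base: add the new score, drop the oldest.
        window_sum += quals[i] - quals[i - window]
        if window_sum / window <= min_mean_q:
            return False
    return True

good = passes_quality("ACGT" * 40, [40] * 160)
decayed = passes_quality("ACGT" * 40, [40] * 100 + [10] * 60)
```

Updating a running sum instead of re-averaging each window keeps the check linear in read length, which matters when millions of reads are filtered.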

Phylogenetic analysis against a 16S database: the mothur classification method was used to assign reads to bacterial taxa from phylum to species level against the human oral core microbiome 16S database (CORE). For each sample, the number of sequences of each taxon at each taxonomic level was counted and divided by the total number of sequences obtained for that sample, giving the relative abundance of each taxon at each level.

(6) Influence of different factors on the distribution of the oral flora (Figure 2):

Community structure based on the Jensen-Shannon distance matrix: besides the evolutionary distance between samples, differences in species-level abundance between samples can also be examined. The distribution of bacterial species abundance in a sample can be treated as a probability distribution over species, and the Jensen-Shannon divergence (JSD) between these distributions measures how much the microbiomes of two samples differ. The distance D(a,b) between samples is calculated as follows:

$$D(a,b) = \mathrm{JSD}(P_a, P_b)$$

$P_a$ and $P_b$ are the abundance distributions in samples a and b, respectively. JSD(X,Y) is the Jensen-Shannon divergence between two probability distributions X and Y:

$$\mathrm{JSD}(X,Y) = \frac{1}{2}\,\mathrm{KLD}(X,m) + \frac{1}{2}\,\mathrm{KLD}(Y,m)$$

$$m = \frac{1}{2}(X + Y)$$

KLD is the Kullback-Leibler divergence between X and Y, calculated as follows:

$$\mathrm{KLD}(X,Y) = \sum_i X_i \log\frac{X_i}{Y_i}$$
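
The divergence defined above can be computed directly from two relative-abundance vectors. The sketch below is a minimal stdlib version under two assumptions not stated in the source: the natural logarithm is used, and terms with zero abundance in the first argument contribute zero (the usual convention). The function names `kld` and `jsd` are illustrative.

```python
import math

def kld(x, y):
    """Kullback-Leibler divergence sum_i x_i * log(x_i / y_i).
    Terms with x_i == 0 contribute nothing (assumed convention)."""
    return sum(xi * math.log(xi / yi) for xi, yi in zip(x, y) if xi > 0)

def jsd(x, y):
    """Jensen-Shannon divergence between two abundance distributions
    given as equal-length sequences that each sum to 1."""
    m = [(xi + yi) / 2 for xi, yi in zip(x, y)]
    return 0.5 * kld(x, m) + 0.5 * kld(y, m)

sample_a = [0.6, 0.3, 0.1]
sample_b = [0.2, 0.3, 0.5]
distance_ab = jsd(sample_a, sample_b)
```

Because the midpoint m is positive wherever either input is positive, the division inside `kld` is always defined when called from `jsd`, even for species absent from one sample. The result is symmetric, is zero for identical profiles, and with the natural logarithm is bounded above by log 2.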

Unsupervised principal coordinates analysis: principal coordinates analysis (PCoA) was applied to the Jensen-Shannon distance matrix to display the structure of the oral microbial community across samples. PCoA treats each species as an independent variable and analyzes the sample × species relative-abundance matrix, giving an unbiased, overall view of the intrinsic structure of the flora without regard to environmental factors. It finds one or more latent variables (principal coordinates, PCs) that best explain the variation among samples in a low-dimensional space; each principal coordinate accounts for a share of the overall structural variation, so the data are reduced in dimension and the samples are ordinated. A sample's score is a linear combination of the species scores.

Permutational multivariate analysis showed that the oral microbial community has clear age-related characteristics, which are associated with the developmental maturity and health status of the individual. This supports establishing a method for assessing the biological age of individual children from the oral microbial community (Fig. 1, Fig. 2):

① At every ecological site, time/age was the most important factor determining the distribution of the microbiota.

② At every ecological site, the other important factors affecting the distribution of the microbiota were, in order of importance: health/disease status, sample grouping, and individual heterogeneity.

③ Across the groups (H2H, H2C and C2C), time had the greatest influence on the microbiota in the healthy group, whereas in the caries group the influence of time on the microbiota was suppressed by the disease state.

These results suggest that the oral microbiota can serve as a medium for detecting individual age and can reflect the oral health status of the host.
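The text does not say how the permutational multivariate analysis was computed. One common form is a one-factor PERMANOVA on the distance matrix, sketched below with NumPy; the names `pseudo_f` and `permanova`, the number of permutations and the seed are all assumptions for illustration.

```python
import numpy as np

def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F statistic for one grouping factor."""
    d2 = np.asarray(dist, dtype=float) ** 2
    labels = np.asarray(labels)
    n = len(labels)
    groups = np.unique(labels)
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n
    ss_within = 0.0
    for g in groups:
        idx = np.flatnonzero(labels == g)
        sub = d2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_between = ss_total - ss_within
    a = len(groups)
    return (ss_between / (a - 1)) / (ss_within / (n - a))

def permanova(dist, labels, n_perm=999, seed=0):
    """Permutation p-value: how often shuffled labels reach the observed F."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    f_obs = pseudo_f(dist, labels)
    hits = sum(pseudo_f(dist, rng.permutation(labels)) >= f_obs
               for _ in range(n_perm))
    return float(f_obs), float((hits + 1) / (n_perm + 1))
```

Running it once per factor (age, health status, grouping, individual) and comparing the resulting statistics is one way to arrive at a ranking like the one listed above.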

(7) Preliminary establishment of a mathematical model for oral state detection (Fig. 1)

In machine learning, a random forest is a model composed of many decision trees; for classification, its output class is the mode of the classes output by the individual trees. The model is widely used to uncover relationships between a target variable and many explanatory variables. It can build classification or regression models, identify the variables that distinguish a particular state or label, and gauge each variable's discriminating power through its importance value. In this example, the random forest was implemented with the R package randomForest using 5000 trees, with all other settings left at their defaults. Two thirds of the input data were used as the training set and one third as the test set, and the random split was repeated 100 times to reduce error.

Using the bacterial species data of the oral microbial communities in the H2H group as input variables and the actual age in months of each sample as the sample information, the data were regressed onto a discrete output variable (predicted age in months) to preliminarily establish a mathematical model for detecting the biological age of individual children.
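The procedure just described (5000 trees, a 2/3 to 1/3 split, 100 random repetitions) was carried out in R. The Python sketch below swaps in scikit-learn's `RandomForestRegressor` purely for illustration, so it is not the authors' code. The function name is hypothetical, mean absolute error is an assumed error measure, and `max_features=1/3` is set to approximate the R package's regression default of trying about one third of the variables at each split.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def repeated_rf_regression(abund, age_months, n_repeats=100, n_trees=5000, seed=0):
    """Fit an age regression on 2/3 of the samples, test on 1/3, repeat.

    abund      : samples x species relative-abundance array
    age_months : chronological age of each sample in months
    Returns the mean and standard deviation of the test error (in months).
    """
    errors = []
    for r in range(n_repeats):
        x_tr, x_te, y_tr, y_te = train_test_split(
            abund, age_months, test_size=1 / 3, random_state=seed + r)
        model = RandomForestRegressor(
            n_estimators=n_trees, max_features=1 / 3, random_state=seed + r)
        model.fit(x_tr, y_tr)
        errors.append(np.mean(np.abs(model.predict(x_te) - y_te)))
    return float(np.mean(errors)), float(np.std(errors))
```

Averaging over many random splits makes the reported error less dependent on any single partition of the samples.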

Random forests (RF) are a classifier-based machine learning method proposed by Leo Breiman. Using bootstrap resampling, samples are repeatedly drawn at random with replacement from the original training set of n samples to produce k bootstrap training sets; a classification tree is grown on each bootstrap set, and the k trees together form the random forest. A new observation is classified according to the votes of the trees, and the classification error depends on the strength of each tree and on the correlation between trees. A single tree may classify poorly, but once a large number of trees have been generated at random, the most likely class for a test sample can be chosen by tallying the results of all the trees. Aggregating many trees improves prediction accuracy; because the method is resistant to overfitting and predicts accurately, it is widely used to uncover relationships between a target variable and many explanatory variables.
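To make the bootstrap-and-vote mechanism concrete, here is a minimal hand-rolled sketch that builds the forest from individual scikit-learn decision trees. It is a teaching illustration rather than a replacement for a tested random forest implementation; `fit_forest` and `predict_majority` are hypothetical names.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_forest(x, y, n_trees=25, seed=0):
    """Grow one tree per bootstrap sample of the training set."""
    rng = np.random.default_rng(seed)
    n = len(y)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)   # draw n rows with replacement
        tree = DecisionTreeClassifier(
            max_features="sqrt",           # random subset of variables per split
            random_state=int(rng.integers(1 << 31)))
        trees.append(tree.fit(x[idx], y[idx]))
    return trees

def predict_majority(trees, x):
    """Classify each row by the most frequent vote across trees."""
    votes = np.stack([t.predict(x) for t in trees])   # (n_trees, n_samples)
    winners = []
    for col in votes.T:
        labels, counts = np.unique(col, return_counts=True)
        winners.append(labels[np.argmax(counts)])
    return np.array(winners)
```

For regression, as used for age prediction here, the vote is replaced by the average of the tree outputs.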

(8) Optimizing the established model for detecting the biological age of individual children (Fig. 3)

Besides building detection models and making predictions, the random forest method can be used to evaluate the importance of explanatory variables. Feature selection splits each node using a random method and then compares the errors produced in the different cases. Intuitively, the more important a variable is, the more it affects the prediction. The importance of explanatory variables in a random forest is evaluated by a similar criterion: the values of one explanatory variable are randomly permuted across all test samples, the original random forest model predicts the test samples again, and the larger the increase in out-of-bag error, the more important the variable. The increase in out-of-bag error therefore gives a quantitative measure of variable importance. This patent uses ten-fold cross-validation to evaluate the minimum number of variables needed to build the model; the procedure is repeated randomly 100 times, and the mean is taken as the estimate of the algorithm's accuracy.

Cross-validation (CV) is a statistical method for verifying classifier performance, used mainly to obtain a reliable and stable model during model evaluation. The original data set is divided into groups, one part serving as the training set and the other as the validation set. The classifier is first trained on the training set, the trained model is then tested on the validation set, and the classification error of each round is used as the measure of classifier performance. Ten-fold cross-validation divides the data set into ten parts and, in turn, uses nine parts as training data and one part as test data. Each round yields a classification error, and the mean of the ten results is taken as the estimate of the algorithm's accuracy.
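The two ideas in this step, permutation-based importance and ten-fold cross-validation, can be sketched as follows. Two simplifications are assumed: importance is measured on a held-out set rather than on out-of-bag samples, and scikit-learn stands in for the R package used in the text. Function names and the error measures (mean squared and mean absolute error) are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def permutation_importance_mse(model, x, y, seed=0):
    """Increase in squared error after shuffling each variable in turn."""
    rng = np.random.default_rng(seed)
    base = np.mean((model.predict(x) - y) ** 2)
    gains = np.empty(x.shape[1])
    for j in range(x.shape[1]):
        shuffled = x.copy()
        shuffled[:, j] = rng.permutation(shuffled[:, j])  # break the link to y
        gains[j] = np.mean((model.predict(shuffled) - y) ** 2) - base
    return gains

def ten_fold_cv_error(x, y, n_trees=500, seed=0):
    """Mean absolute error averaged over ten train/test folds."""
    fold_errors = []
    folds = KFold(n_splits=10, shuffle=True, random_state=seed)
    for train, test in folds.split(x):
        model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
        model.fit(x[train], y[train])
        fold_errors.append(np.mean(np.abs(model.predict(x[test]) - y[test])))
    return float(np.mean(fold_errors))
```

A variable whose shuffling barely changes the error carries little age information and is a candidate for removal.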

The variables were ranked by their importance for the age regression, and the combination of variables for which the ability of the random forest regression model to discriminate age did not change significantly as variables were removed was taken as the final set of age-related microbial markers. The markers derived from dental plaque are Prevotella loescheii, Kingella denitrificans, Leptotrichia BU064, Fusobacterium nucleatum subsp. polymorphum, Bergeyella 602D02, Cardiobacterium valvarum, Streptococcus mitis/Streptococcus pneumoniae/Streptococcus infantis/Streptococcus oralis, Neisseria flava/Neisseria mucosa/Neisseria pharyngis, Campylobacter gracilis, and Neisseria flavescens. The 15 markers derived from saliva are Porphyromonas CW034, Streptococcus gordonii, Veillonella atypica/Veillonella dispar/Veillonella parvula, Peptostreptococcus stomatis, Streptococcus parasanguinis/Streptococcus oralis, Leptotrichia BU064, Porphyromonas catoniae, TM7 oral taxon 352, Prevotella oral taxon 299, Prevotella melaninogenica, Eubacterium sulci/Eubacterium infirmum, Bergeyella 602D02, Neisseria flavescens, Neisseria meningitidis/Neisseria polysaccharea, and Granulicatella elegans (Fig. 3).

Using the random forest method with the age-related microbial markers as input variables and the actual age in months of each sample as the sample information, the data were regressed onto a discrete output variable (predicted age in months) to establish the final, optimized model for detecting the biological age of individual children.
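One way to picture the marker selection is to rank the variables by importance, evaluate models built on the top k variables for increasing k, and keep the smallest set whose cross-validated error stays close to the best. The sketch below does this with scikit-learn. The 5% tolerance, the use of impurity-based `feature_importances_` for the ranking, and the name `select_markers` are assumptions for illustration; the text does not state the exact stopping criterion.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def select_markers(x, y, tolerance=0.05, n_trees=500, seed=0):
    """Return the smallest top-k variable set with near-best CV error."""
    def cv_mae(cols):
        model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
        scores = cross_val_score(model, x[:, cols], y, cv=10,
                                 scoring="neg_mean_absolute_error")
        return -scores.mean()

    full = RandomForestRegressor(n_estimators=n_trees, random_state=seed).fit(x, y)
    ranked = np.argsort(full.feature_importances_)[::-1]   # most important first
    errors = {k: cv_mae(ranked[:k]) for k in range(1, len(ranked) + 1)}
    best = min(errors.values())
    # Hypothetical criterion: error within `tolerance` of the best model.
    k_min = min(k for k, e in errors.items() if e <= best * (1 + tolerance))
    return ranked[:k_min], errors
```

The returned column indices play the role of the age-related markers, and a final model would then be refitted on those columns only.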

(9) Application and performance of the optimized model (Fig. 4)

The optimized model was applied to the different groups: the composition and abundance of the age-related microbial markers in each sample were used as input variables, and regression with the age detection model gave the biological age of the sample at that time point. In the healthy group, the biological age obtained from the oral microbial community was essentially consistent with chronological age. In the caries group, the biological age obtained from the oral microbial community was significantly lower than chronological age (t-test, p<0.05), suggesting that oral disease may suppress the maturation of the microbiota and thereby lower the age of the oral microbiota. These results show that the established model can assess the biological age of individual children well, and suggest that the age of the oral microbiota can reflect a child's oral health status.
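The comparison between predicted and chronological age could look like the sketch below. The text only says "t-test"; a paired test over the same samples is assumed here, and the function name, the 0.05 threshold as a parameter and the verdict labels are illustrative.

```python
import numpy as np
from scipy import stats

def compare_ages(predicted_months, actual_months, alpha=0.05):
    """Paired t-test of microbiota-predicted age against chronological age."""
    predicted = np.asarray(predicted_months, dtype=float)
    actual = np.asarray(actual_months, dtype=float)
    _, p_value = stats.ttest_rel(predicted, actual)
    mean_gap = float(np.mean(predicted - actual))   # negative: microbiota looks younger
    if p_value >= alpha:
        verdict = "consistent"
    elif mean_gap < 0:
        verdict = "younger"
    else:
        verdict = "older"
    return mean_gap, float(p_value), verdict
```

With predictions taken from the optimized model, a "younger" verdict for a group corresponds to the pattern reported for the caries group.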

For the random-forest-based regression analysis used in the present invention, see Breiman L (2001) Random forests. Mach Learn 45:5–32, and Knights D, Costello EK, Knight R. Supervised classification of human microbiota. FEMS Microbiol Rev. 2011 Mar;35(2):343-59. doi:10.1111/j.1574-6976.2010.00251.x. Epub 2010 Oct 7. Review. PubMed PMID: 21039646.

Of course, the above description does not limit the present invention, and the present invention is not limited to the above examples. Changes, modifications, additions or substitutions made by those of ordinary skill in the art within the scope of the present invention shall also fall within the protection scope of the present invention.

Claims (7)

1. A method for obtaining the biological age of an individual child based on the oral microbial community, characterized in that it comprises the following steps:
Data collection: collecting oral samples from individual children at multiple time points;
Data conversion: extracting DNA information from the oral samples and converting the DNA information into oral microbial community information using bioinformatics methods;
Preliminary construction of the data model: using the obtained oral microbial community information as input variables, regressing it against age information with the random forest method, and preliminarily constructing a preliminary mathematical model that detects biological age from oral microbial community information;
Optimization and determination of the mathematical model: ranking the variables by their importance to the model, simplifying the combination of model variables without affecting model performance, and finally determining the model for detecting the age of individual children;
Detection of the biological age of an individual child: using the microbial community information to be tested as input variables and performing regression analysis with the established mathematical model to obtain the current biological age of the child being tested.
2. The method for obtaining the biological age of an individual child based on the oral microbial community according to claim 1, characterized in that the oral sample is saliva or a supragingival dental plaque sample.
3. The method for obtaining the biological age of an individual child based on the oral microbial community according to claim 1, characterized in that converting the DNA information into oral microbial community information comprises the following steps:
Obtaining the 16S rRNA or whole-genome information of the DNA by high-throughput sequencing;
Assigning the 16S rRNA or whole-genome information to bacterial taxa at each level from phylum to species;
Counting, at the species level, the number of sequences of each species in each sample, and dividing it by the total number of sequences obtained for that sample, thereby obtaining the relative abundance of each species in each sample.
4. The method for obtaining the biological age of an individual child based on the oral microbial community according to claim 1, characterized in that the preliminary construction of the data model comprises the following steps:
1) using the composition and relative abundance of all bacterial species of the obtained oral microbiota as input variables;
2) regressing the input variables against the age information of the individual children with the random forest method, and preliminarily constructing a preliminary mathematical model that detects biological age from oral microbial community information.
5. The method for obtaining the biological age of an individual child based on the oral microbial community according to claim 1, characterized in that the optimization and determination of the data model comprises the following steps:
1) obtaining the importance to model performance of each variable representing a bacterial species in the preliminary mathematical model;
2) ranking the variables by their importance to the model from smallest to largest, progressively reducing the number of variables, and performing regression analysis against age with the random forest method to obtain models for the different variable combinations;
3) identifying the simplified combination of variables that does not reduce model performance, defining these as the age-related variables, and thereby determining the final optimized model.
6. The method for obtaining the biological age of an individual child based on the oral microbial community according to claim 1, characterized in that the detection of the biological age of an individual child comprises the following steps:
1) obtaining the DNA of an oral sample from the individual child;
2) converting the DNA information into oral microbial community information using bioinformatics methods;
3) obtaining the relative abundance of the age-related variables for the individual child;
4) using the composition and abundance of the age-related variables as variables and performing regression analysis with the established age detection model, by the random forest method, to obtain the current biological age of the individual child.
7. The method for obtaining the biological age of an individual child based on the oral microbial community according to claim 1, characterized in that it further comprises: comparing the obtained biological age of the individual child with the child's actual age to learn the child's current growth and development status. If the biological age is lower than the actual age, this indicates that the child may have delayed growth and development caused by factors such as disease; if the biological age equals the actual age, this indicates that the child's growth and development are normal; if the biological age is higher than the actual age, this indicates that the child may be developing precociously.
CN201510213461.1A 2015-04-30 2015-04-30 A method of children's individual biological age is obtained based on oral microbial community Active CN106202989B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510213461.1A CN106202989B (en) 2015-04-30 2015-04-30 A method of children's individual biological age is obtained based on oral microbial community

Publications (2)

Publication Number Publication Date
CN106202989A true CN106202989A (en) 2016-12-07
CN106202989B CN106202989B (en) 2018-12-21

Family

ID=57458361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510213461.1A Active CN106202989B (en) 2015-04-30 2015-04-30 A method of children's individual biological age is obtained based on oral microbial community

Country Status (1)

Country Link
CN (1) CN106202989B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006094803A (en) * 2004-09-30 2006-04-13 Shimane Univ Method for analyzing microbial community structure using T-RFLP
US20080305475A1 (en) * 2007-06-07 2008-12-11 Kao Corporation Microbial community analysis
CN101833613A (en) * 2010-06-04 2010-09-15 中国科学院青岛生物能源与过程研究所 A kind of oral microbial community database and its application
WO2014179965A1 (en) * 2013-05-09 2014-11-13 The Procter & Gamble Company Biomarker identifying method and system
CN103305607A (en) * 2013-05-22 2013-09-18 宁波大学 Disease prediction method for aquaculture based on microflora change

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
NIMA KIANOUSH 等: "Bacterial profile of dentine caries and the impact of pH on bacterial population diversity", 《PLOS ONE》 *
RAMÓN DIAZ-URIARTE: "GeneSrF and varSelRF: a web-based tool and R package for gene selection and classification using random forest", 《BMC BIOINFORMATICS》 *
凌均棨 等: "新一代高通量技术在龋病相关的口腔菌群宏基因组学研究中的应用", 《中华口腔医学研究杂志》 *
张洋洋 等: "人颊黏膜微生物群落年龄相关性演替研究", 《华西口腔医学杂志》 *
李贞子 等: "随机森林回归分析及在代谢调控关系研究中的应用", 《中国卫生统计》 *
武晓岩 等: "随机森林在基因表达数据分析中的应用及研究进展", 《2007年中国卫生统计学术大会论文集》 *
田芳云 等: "16SrRNA基因序列分析在人体微生物组学研究中的进展及应用", 《中国老年学》 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273711B (en) * 2017-06-22 2021-03-23 宁波大学 Screening method of prawn health condition indicating flora
CN107273711A (en) * 2017-06-22 2017-10-20 宁波大学 A kind of shrimp disease quantitative forecasting technique based on enteron aisle bacterial indicator
CN108268753A (en) * 2018-01-25 2018-07-10 清华大学 A kind of microorganism group recognition methods and device, equipment
CN108268753B (en) * 2018-01-25 2021-12-03 清华大学 Method, device and equipment for identifying microbiome
CN113574604A (en) * 2018-10-26 2021-10-29 深度青春有限公司 Aging markers of the human microbiome and the microbiome's aging clock
CN111261222B (en) * 2018-12-03 2023-08-11 中国科学院青岛生物能源与过程研究所 Construction method of oral microbial community detection model
CN111261222A (en) * 2018-12-03 2020-06-09 中国科学院青岛生物能源与过程研究所 Construction method and application of oral microbial community detection model
CN110277151A (en) * 2019-06-11 2019-09-24 浙江大学 Human physiological age analysis method, system and model based on routine physical examination indicators
JP2021010343A (en) * 2019-07-08 2021-02-04 三菱ケミカル株式会社 Health condition prediction method by oral cavity bacteria
CN110957038B (en) * 2019-11-29 2021-05-14 广州市雷德医学检验实验室有限公司 Immune age determination system, method, device and storage medium
CN110957038A (en) * 2019-11-29 2020-04-03 广州市雷德医学检验实验室有限公司 Immune age determination system, method, device and storage medium
CN110931082A (en) * 2019-12-12 2020-03-27 爱尔生基因医学科技有限公司 Method and system for gene detection and evaluation
CN113257344A (en) * 2020-02-12 2021-08-13 大江基因医学股份有限公司 Method for establishing cell state evaluation model
CN111816307A (en) * 2020-04-15 2020-10-23 浙江大学 Method and evaluation method of constructing biological age evaluation model of Chinese population based on clinical markers
CN111477273A (en) * 2020-05-18 2020-07-31 中国人民解放军国防科技大学 Method for predicting individual age information based on brain tissue gene expression
WO2021254299A1 (en) * 2020-06-15 2021-12-23 The Chinese University Of Hong Kong Use of bacteria in children development assessment and treatment
CN113643750A (en) * 2021-08-09 2021-11-12 浙江大学 Method for predicting growth traits of offspring based on rumen flora structure of female ruminant
CN113689913A (en) * 2021-08-26 2021-11-23 江南大学 Method for predicting age of pit mud of Luzhou-flavor liquor pit
CN113528688A (en) * 2021-09-09 2021-10-22 北京泱深生物信息技术有限公司 Use of microorganisms in the preparation of products for the diagnosis of growth retardation
CN114023386A (en) * 2021-10-26 2022-02-08 艾德范思(北京)医学检验实验室有限公司 Metagenome data analysis and characteristic bacteria screening method
WO2023229279A1 (en) * 2022-05-26 2023-11-30 주식회사 엘지생활건강 Method for determining age by using microbiome

Also Published As

Publication number Publication date
CN106202989B (en) 2018-12-21

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant