
CN105044361B - A diagnostic marker suitable for early diagnosis of esophageal squamous cell carcinoma and its screening method


Info

Publication number
CN105044361B
Authority
CN
China
Prior art keywords
acid
metabolic
serum
diagnostic
lysopc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510497914.8A
Other languages
Chinese (zh)
Other versions
CN105044361A (en
Inventor
王家林
张涛
朱正江
薛付忠
赵德利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Institute of Cancer Prevention and Treatment
Original Assignee
Shandong Institute of Cancer Prevention and Treatment
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Institute of Cancer Prevention and Treatment
Priority to CN201510497914.8A
Publication of CN105044361A
Application granted
Publication of CN105044361B
Legal status: Active


Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6893 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids related to diseases not provided for elsewhere
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6848 Methods of protein analysis involving mass spectrometry
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/92 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving lipids, e.g. cholesterol, lipoproteins, or their receptors
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 Column chromatography
    • G01N2030/022 Column chromatography characterised by the kind of separation mechanism
    • G01N2030/027 Liquid chromatography

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Chemical & Material Sciences (AREA)
  • Urology & Nephrology (AREA)
  • Biomedical Technology (AREA)
  • Physics & Mathematics (AREA)
  • Hematology (AREA)
  • Analytical Chemistry (AREA)
  • Pathology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Food Science & Technology (AREA)
  • Microbiology (AREA)
  • Cell Biology (AREA)
  • Biotechnology (AREA)
  • Medicinal Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Endocrinology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a diagnostic marker suitable for early diagnosis of esophageal squamous cell carcinoma and a method for screening it. The invention identifies 25 serum metabolic markers and 10 associated metabolic pathways. Diagnostic markers for esophageal cancer can be obtained by combining these 25 serum metabolic markers. The screening method is highly practicable, and the diagnostic markers can be used to construct a diagnostic model that performs well, with high sensitivity and good specificity, and is suitable for diagnosing not only advanced but also early esophageal cancer. With a diagnostic model built from these markers, diagnosis requires only a blood draw. It is non-invasive and inexpensive, can effectively replace the current invasive diagnostic approach, and greatly reduces patient suffering. Diagnosis is fast and convenient, takes little time, and improves work efficiency, which facilitates early detection and early treatment of esophageal cancer and gives the invention good value for clinical use and wider adoption.

Description

A diagnostic marker suitable for early diagnosis of esophageal squamous cell carcinoma and its screening method

Technical field

The invention relates to the diagnosis of esophageal squamous cell carcinoma, and in particular to a diagnostic marker for esophageal squamous cell carcinoma, a method for screening the diagnostic marker, a diagnostic model built on the diagnostic marker, and a method for constructing the diagnostic model. The diagnostic marker and diagnostic model perform well in diagnosing esophageal carcinoma in situ, early esophageal cancer, and advanced esophageal cancer, and are particularly suitable for early diagnosis. The invention belongs to the technical field of esophageal squamous cell carcinoma diagnosis.

Background art

Esophageal cancer is a malignant lesion formed by abnormal proliferation of the squamous or glandular epithelium of the esophagus. According to the latest World Health Organization data, about 400,000 people worldwide die of esophageal cancer each year. China has the highest incidence and mortality of esophageal cancer of any country, and in 90% of patients the histological type is squamous cell carcinoma (ESCC). Esophageal cancer has an insidious onset: it is asymptomatic or has very atypical symptoms in the early stage, is usually already clinically advanced when discovered, and generally has a poor prognosis (5-year survival of about 13%). Early diagnosis and early treatment are therefore the key to improving prognosis and reducing mortality. At present, the early screening and diagnostic methods commonly used on early-diagnosis and early-treatment platforms in high-incidence areas and in clinical practice include esophageal balloon cytology, barium-meal radiography, endoscopic ultrasonography of the esophagus, and esophageal endoscopy. However, these methods are all invasive, complicated to perform, and expensive, which limits their wide use in screening and early diagnosis of esophageal cancer.

The development of esophageal cancer is a complex process involving multiple factors, multiple stages, the accumulation of variations in multiple genes, and interaction with environmental factors. It includes changes at the molecular level in numerous proto-oncogenes, tumor suppressor genes, and proteins, as well as the effects of long-term unhealthy lifestyle or dietary habits (eating foods high in nitrosamines, such as pickled vegetables or moldy food, habitually eating scalding-hot food, smoking, drinking alcohol, and so on). Metabolomics is the qualitative and quantitative detection of all small-molecule metabolites with a molecular weight below 1000 Da (small biomolecules such as fatty acids, amino acids, nucleosides, and steroids) in biological samples (such as serum, urine, or saliva), so as to monitor the metabolic response of endogenous substances after the body is perturbed by disease or by accumulated risk factors. Biological information in the body is passed from genes through transcription to proteins and is ultimately expressed as small-molecule metabolites. Whereas genomics and proteomics reflect the intrinsic differences of an organism, metabolomics extends to the mutual influence and interaction between the organism and its environment. Small-molecule metabolites are not only the material basis of the body's life activities and biochemical metabolism; they also reflect how certain external factors alter the metabolic environment in the body. Differences between individuals in the concentrations of certain characteristic metabolites therefore reflect both the intrinsic manifestations and the external causes of disease. Studies in recent years have found that the body's basic biochemical metabolism changes markedly during the onset and progression of diseases such as metabolic diseases and malignant tumors (for example, ovarian cancer). These findings will play an important role in understanding the metabolic mechanisms of complex diseases and provide new technical approaches for their screening and early diagnosis.

The onset and progression of esophageal cancer result from the interaction of multiple genes with environmental factors. First, the expression of the relevant functional genes changes or the genes mutate; next comes a series of changes in cell signaling and protein synthesis; finally, under interaction with environmental factors, the metabolites change. Esophageal cancer arises as environmental risk factors gradually accumulate and continually damage the homeostasis of the body's metabolic pathways. Metabolomics has already been applied to the study of esophageal cancer, for example by Wu et al. (Wu H, Xue R, Lu C, et al. Metabolomic study for diagnostic model of oesophageal cancer using gas chromatography/mass spectrometry. J Chromatogr B Analyt Technol Biomed Life Sci, 2009, 877(27): 3111-7.), Zhang et al. (Zhang J, Bowers J, Liu L, et al. Esophageal cancer metabolite biomarkers detected by LC-MS and NMR methods. PLoS One, 2012, 7(1): e30181.), and Xu et al. (Xu J, Chen Y, Zhang R, et al. Global and targeted metabolomics of esophageal squamous cell carcinoma discovers potential diagnostic and therapeutic biomarkers. Mol Cell Proteomics, 2013, 12(5): 1306-18.).

The onset and progression of esophageal cancer are accompanied by changes in many metabolites in the body and generally take several years or even more than a decade. If the disease is detected in the early stage of carcinogenesis and treated early, the prognosis can be effectively improved. Studies have in fact shown that endogenous substances produce a corresponding metabolic response before disease onset, or while risk factors are still accumulating. For example, Zhao et al. used metabolic fingerprinting to reveal the metabolic characteristics of prediabetic patients and confirmed that metabolic changes in fatty acids, tryptophan, uric acid, bile acids, and other compounds occur long before clinical symptoms appear, which opens new possibilities for screening, early diagnosis, and intervention in metabolic diseases. However, the esophageal cancer metabolomics studies cited above mainly enrolled patients with advanced esophageal cancer, and most included few or no early-stage cases. In advanced esophageal cancer, lymph node and distant metastases have already occurred and tumor cachexia may even be present, so the body's metabolism has changed greatly. These studies can therefore only find differences in metabolic profile between late-stage esophageal cancer and healthy controls. Such differences allow advanced esophageal cancer to be diagnosed reasonably well but cannot be used to diagnose early esophageal cancer; that is, they do not achieve early diagnosis. Second, those studies obtained only a small fraction of the metabolites associated with the development of esophageal cancer (at the research level, they amount to metabolic target analysis rather than metabolic profiling or metabolomic analysis). In addition, most of them did not evaluate the translational-medicine potential and practical performance of metabolomics from the standpoint of an objective molecular screening or early-diagnosis model for esophageal cancer, and most did not report the sensitivity, specificity, or area under the ROC curve (AUC) of the selected metabolites for screening or diagnosing esophageal cancer.
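For readers unfamiliar with the three performance measures named above, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed from model scores. The function names, the toy labels, the toy scores, and the 0.45 cutoff are illustrative assumptions and do not come from the patent.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity), calling a sample positive when score >= threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = patient, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
sens, spec = sensitivity_specificity(labels, scores, threshold=0.45)
area = auc(labels, scores)
```

Sensitivity and specificity depend on the chosen cutoff, while AUC summarizes ranking quality across all cutoffs, which is why studies are usually expected to report all three.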

The inventors have previously carried out a series of studies on esophageal cancer, using metabolomics to develop a metabolomic analysis model capable of early screening for esophageal cancer. That technology has application and dissemination value for identifying high-risk individuals in China's high-incidence areas and is currently being piloted in Feicheng City, Shandong Province. However, screening of high-incidence populations and early clinical diagnosis still differ in many respects. Screening in high-incidence areas takes healthy or apparently healthy people in those areas as its subjects. Its purpose is to find, within the healthy population, people who appear healthy but are suspected of having esophageal lesions (high-risk individuals); those who test positive on screening must undergo further diagnosis or intervention. Diagnosis, by contrast, takes patients or suspected patients in a clinical setting as its subjects. Its purpose is to determine whether the patient has the disease in question and to judge the patient's condition promptly and correctly so that effective treatment can be given; patients who are diagnosed positive clinically receive treatment (such as surgery, chemotherapy, or radiotherapy). At present, hospitals generally diagnose esophageal cancer clinically using imaging examinations that are invasive, complicated to perform, and expensive, and most patients are already at an advanced stage by the time they seek care on their own initiative. A simple and effective serum biomarker for clinical diagnosis of esophageal cancer, especially early esophageal cancer, is therefore still lacking. Finding specific, sensitive, economical, and non-invasive serum metabolic markers for early diagnosis, and establishing a safe and effective molecular model for early diagnosis of esophageal cancer, thus has important clinical value.

Summary of the invention

In view of the shortcomings of the prior art, in which diagnosis of esophageal squamous cell carcinoma (hereinafter esophageal cancer) is complicated, expensive, and invasive, and current markers are highly sensitive only for advanced esophageal cancer and cannot achieve early diagnosis, the invention provides a diagnostic marker suitable for early diagnosis of esophageal squamous cell carcinoma. The marker has good sensitivity and specificity for esophageal carcinoma in situ, early esophageal cancer, and advanced esophageal cancer. It can be used not only to diagnose advanced esophageal cancer but also for early diagnosis of esophageal squamous cell carcinoma, which is of great significance for improving prognosis and reducing mortality.

The invention also provides a method for screening the above diagnostic markers suitable for early diagnosis of esophageal squamous cell carcinoma. The markers obtained by this method have good sensitivity and specificity for both early and advanced esophageal cancer and are especially suitable for early diagnosis, which is clinically important for the treatment of esophageal cancer.

The invention also provides a diagnostic model for esophageal squamous cell carcinoma and a method for constructing it. The construction method is simple, and the model can replace current invasive diagnostic methods. It is convenient and fast, spares the person being tested from suffering, and has good sensitivity and specificity for esophageal carcinoma in situ, early esophageal cancer, and advanced esophageal cancer, providing effective technical support for early diagnosis and early treatment of esophageal squamous cell carcinoma.

The invention also provides a method for diagnosing esophageal squamous cell carcinoma using this diagnostic model. With the model, diagnosis requires only a blood draw. It is convenient, fast, and non-invasive, and has high sensitivity and good specificity, especially for early esophageal cancer, giving it good clinical value.

At present, the field mostly searches for diagnostic markers of esophageal squamous cell carcinoma among genes and macromolecular proteins. The invention departs from that line of research and proposes, for the first time, screening for such markers using serum metabolomics. It identifies markers that are particularly suitable for early diagnosis, providing a good diagnostic method for early esophageal squamous cell carcinoma, which is otherwise hard to detect. Drawing on the esophageal cancer screening and follow-up cohort of the "National Demonstration Base for Early Diagnosis and Early Treatment of Esophageal Cancer (Feicheng City, Shandong Province)", the inventors obtained serum specimens from patients with esophageal carcinoma in situ (hereinafter carcinoma in situ; stage 0, 39 cases), early esophageal cancer (hereinafter early cancer; stage I, 17 cases; stage II, 11 cases), and advanced esophageal cancer (hereinafter advanced cancer; stage III, 30 cases). Healthy people confirmed to have no esophageal lesions and no other metabolic diseases (such as hyperthyroidism, hypothyroidism, hypertension, diabetes, or kidney disease) were randomly selected as healthy controls. UPLC-QTOF/MS was used to obtain metabolic fingerprints covering 1466 small-molecule metabolites. Comparing and analyzing the fingerprints of esophageal cancer patients and healthy subjects yielded diagnostic markers suitable for early diagnosis of esophageal squamous cell carcinoma. A diagnostic model for esophageal cancer was then constructed from these markers. The model can quickly determine whether esophageal cancer is present, and in particular can detect early esophageal cancer, with high sensitivity and good specificity, and has value for clinical use and wider adoption.

In the invention, esophageal carcinoma in situ means stage 0 of the TNM staging standard: severe atypical hyperplasia within the mucosal epithelium or the epidermis that involves the full thickness of the epithelium but has not yet broken through the basement membrane to invade downward. Early esophageal cancer means stages I and II of the TNM staging standard: cancer confined to the mucosa or submucosa, with no lymph node involvement and no distant metastasis. Advanced esophageal cancer means stages III and IV of the TNM staging standard: cancer that has involved the muscularis or reached or extended beyond the adventitia, with regional or distant lymph node metastasis. TNM staging follows the American Joint Committee on Cancer (AJCC) TNM Classification of Carcinoma of the Esophagus and Esophagogastric Junction (7th ed., 2010).
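The stage grouping defined above can be sketched as a simple lookup. The example below applies it to the case counts reported for the study cohort (stage 0: 39, stage I: 17, stage II: 11, stage III: 30); the names `STAGE_GROUP`, `COHORT`, and `group_counts` are hypothetical and chosen only for illustration.

```python
from typing import Dict

# TNM stage -> diagnostic group, following the definitions in the text.
STAGE_GROUP: Dict[str, str] = {
    "0": "carcinoma in situ",
    "I": "early",
    "II": "early",
    "III": "advanced",
    "IV": "advanced",
}

# Case counts per TNM stage as reported for the cohort (no stage IV cases are listed).
COHORT: Dict[str, int] = {"0": 39, "I": 17, "II": 11, "III": 30}


def group_counts(cohort: Dict[str, int]) -> Dict[str, int]:
    """Sum the per-stage case counts into the three diagnostic groups."""
    totals: Dict[str, int] = {}
    for stage, n in cohort.items():
        group = STAGE_GROUP[stage]
        totals[group] = totals.get(group, 0) + n
    return totals


counts = group_counts(COHORT)
total_patients = sum(COHORT.values())
```

Under these counts the carcinoma in situ and early groups together outnumber the advanced group, which is consistent with the stated aim of covering early-stage disease.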

The diagnostic marker and diagnostic model of the invention can detect early esophageal cancer that is asymptomatic or has inconspicuous symptoms. They are non-invasive, which spares the person being tested from suffering, and the diagnostic process is simple and fast, which improves work efficiency. This is of great significance for early diagnosis and early treatment of esophageal cancer, improved prognosis, and reduced mortality. The specific technical solution of the invention is as follows:

A diagnostic marker suitable for early diagnosis of esophageal squamous cell carcinoma, which is any one of the following 25 serum metabolic markers or a combination of more than one of them: beta-alanine-lysine (beta-Ala-Lys), L-carnosine, cis-9-hexadecenoic acid (cis-9-palmitoleic acid), palmitic acid, oleic acid, lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), phospholipid PC(14:1(9Z)/P-18:1(11Z)), phospholipid PC(16:0/18:2(9Z,12Z)), phospholipid PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)), linoleic acid, nicotinamide adenine dinucleotide (NADH), cortisol, L-tyrosine, L-tryptophan, glycocholic acid, taurocholate, hypoxanthine, allantoic acid, inosine, sphingosine 1-phosphate, 3-O-sulfogalactosylceramide (d18:1/20:0), and lactosylceramide (d18:1/22:0).

The diagnostic marker may be any one of the above 25 serum metabolic markers, or any combination of two or more of them. When a combination of two or more serum metabolic markers is used as the diagnostic marker, the diagnostic performance is better than when a single serum metabolic marker is used.
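To give a sense of how broad "any one or any combination" of 25 markers is, the short sketch below counts the possible marker panels. The helper name `n_panels` is an assumption made for illustration; only the figure of 25 markers comes from the text.

```python
from math import comb

N_MARKERS = 25  # number of serum metabolic markers listed in the text


def n_panels(n: int, min_size: int = 1) -> int:
    """Count the subsets of n markers that contain at least min_size markers."""
    return sum(comb(n, k) for k in range(min_size, n + 1))


any_panel = n_panels(N_MARKERS)        # single markers and all combinations
multi_marker = n_panels(N_MARKERS, 2)  # combinations of two or more markers only
```

The count of non-empty subsets equals 2**25 - 1, so the narrower panels named later in the text can be read as preferred choices within a very large space of candidates.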

Further, the diagnostic marker may be any one of the following combinations (a) to (h) of serum metabolic markers:
(a) beta-alanine-lysine (beta-Ala-Lys) and L-carnosine;
(b) cis-9-hexadecenoic acid (cis-9-palmitoleic acid), palmitic acid, and oleic acid;
(c) lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), and lysolecithin LysoPC(24:0);
(d) phospholipid PC(14:1(9Z)/P-18:1(11Z)), phospholipid PC(16:0/18:2(9Z,12Z)), phospholipid PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)), and linoleic acid;
(e) nicotinamide adenine dinucleotide (NADH), L-tyrosine, and L-tryptophan;
(f) cortisol, glycocholic acid, and taurocholate;
(g) hypoxanthine, allantoic acid, and inosine;
(h) sphingosine 1-phosphate, 3-O-sulfogalactosylceramide (d18:1/20:0), and lactosylceramide (d18:1/22:0).
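Read together, combinations (a) to (h) appear to divide the 25 listed markers into eight non-overlapping groups. The sketch below encodes both lists and checks that property; the variable names and the `is_partition` helper are illustrative assumptions, and the marker strings are shortened labels taken from the English names in the text.

```python
from typing import Dict, List

MARKERS_25: List[str] = [
    "beta-Ala-Lys", "L-Carnosine", "cis-9-Palmitoleic acid", "Palmitic acid",
    "Oleic acid", "LPA(18:1(9Z)/0:0)", "LysoPC(14:0/0:0)", "LysoPC(18:2(9Z,12Z))",
    "LysoPC(24:0)", "PC(14:1(9Z)/P-18:1(11Z))", "PC(16:0/18:2(9Z,12Z))",
    "PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z))", "Linoleic acid", "NADH",
    "Cortisol", "L-Tyrosine", "L-Tryptophan", "Glycocholic acid", "Taurocholate",
    "Hypoxanthine", "Allantoic acid", "Inosine", "Sphingosine 1-phosphate",
    "3-O-Sulfogalactosylceramide(d18:1/20:0)", "Lactosylceramide(d18:1/22:0)",
]

GROUPS: Dict[str, List[str]] = {
    "a": ["beta-Ala-Lys", "L-Carnosine"],
    "b": ["cis-9-Palmitoleic acid", "Palmitic acid", "Oleic acid"],
    "c": ["LPA(18:1(9Z)/0:0)", "LysoPC(14:0/0:0)", "LysoPC(18:2(9Z,12Z))",
          "LysoPC(24:0)"],
    "d": ["PC(14:1(9Z)/P-18:1(11Z))", "PC(16:0/18:2(9Z,12Z))",
          "PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z))", "Linoleic acid"],
    "e": ["NADH", "L-Tyrosine", "L-Tryptophan"],
    "f": ["Cortisol", "Glycocholic acid", "Taurocholate"],
    "g": ["Hypoxanthine", "Allantoic acid", "Inosine"],
    "h": ["Sphingosine 1-phosphate", "3-O-Sulfogalactosylceramide(d18:1/20:0)",
          "Lactosylceramide(d18:1/22:0)"],
}


def is_partition(groups: Dict[str, List[str]], universe: List[str]) -> bool:
    """True if every marker in universe appears in exactly one group and nothing else does."""
    members = [m for group in groups.values() for m in group]
    return len(members) == len(set(members)) and set(members) == set(universe)


group_sizes = [len(g) for g in GROUPS.values()]
covers_all = is_partition(GROUPS, MARKERS_25)
```

The group sizes sum to 25, so each of the 25 markers belongs to exactly one of the eight combinations.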

Further, the diagnostic marker may be a combination of two or more of the following 15 serum metabolic markers: lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), phospholipid PC(14:1(9Z)/P-18:1(11Z)), phospholipid PC(16:0/18:2(9Z,12Z)), phospholipid PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)), nicotinamide adenine dinucleotide (NADH), cortisol, L-tryptophan, taurocholate, hypoxanthine, inosine, 3-O-sulfogalactosylceramide (d18:1/20:0), and lactosylceramide (d18:1/22:0).

Further, the diagnostic marker may be a combination of two or more of the following 7 serum metabolic markers: lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), phospholipid PC(14:1(9Z)/P-18:1(11Z)), phospholipid PC(16:0/18:2(9Z,12Z)), and phospholipid PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)).

Further, the diagnostic marker may be a combination of two or more of the following 5 serum metabolic markers: L-tyrosine, L-tryptophan, glycocholic acid, taurocholate and cortisol.

Preferably, the diagnostic marker is the combination of the following 25 serum metabolic markers (denoted diagnostic marker A, likewise hereinafter): beta-alanine-lysine (beta-Ala-Lys), L-carnosine, cis-9-hexadecenoic acid (cis-9-palmitoleic acid), palmitic acid, oleic acid, lysophosphatidic acid LPA(18:1(9Z)/0:0), lysophosphatidylcholine LysoPC(14:0/0:0), LysoPC(18:2(9Z,12Z)), LysoPC(24:0), phosphatidylcholine PC(14:1(9Z)/P-18:1(11Z)), PC(16:0/18:2(9Z,12Z)), PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)), linoleic acid, nicotinamide adenine dinucleotide (NADH), cortisol, L-tyrosine, L-tryptophan, glycocholic acid, taurocholate, hypoxanthine, allantoic acid, inosine, sphingosine 1-phosphate, 3-O-sulfogalactosylceramide (d18:1/20:0) and lactosylceramide (d18:1/22:0).

Preferably, the diagnostic marker is the combination of the following 7 serum metabolic markers (denoted diagnostic marker B, likewise hereinafter): lysophosphatidic acid LPA(18:1(9Z)/0:0), lysophosphatidylcholine LysoPC(14:0/0:0), LysoPC(18:2(9Z,12Z)), LysoPC(24:0), phosphatidylcholine PC(14:1(9Z)/P-18:1(11Z)), PC(16:0/18:2(9Z,12Z)) and PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)).

Preferably, the diagnostic marker is the combination of the following 5 serum metabolic markers (denoted diagnostic marker C, likewise hereinafter): L-tyrosine, L-tryptophan, glycocholic acid, taurocholate and cortisol.

The present invention provides diagnostic markers consisting of individual serum metabolites or combinations of serum metabolites. In total, these diagnostic markers involve 25 serum metabolic markers, which are closely associated with 10 metabolic pathways. Specifically: beta-alanine-lysine (beta-Ala-Lys) and L-carnosine are closely associated with the beta-alanine metabolism pathway; cis-9-hexadecenoic acid (cis-9-palmitoleic acid), palmitic acid and oleic acid with the fatty acid biosynthesis pathway; lysophosphatidic acid LPA(18:1(9Z)/0:0), LysoPC(14:0/0:0), LysoPC(18:2(9Z,12Z)) and LysoPC(24:0) with the glycerophospholipid metabolism pathway; PC(14:1(9Z)/P-18:1(11Z)), PC(16:0/18:2(9Z,12Z)), PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)) and linoleic acid with both the glycerophospholipid metabolism and the linoleic acid metabolism pathways; nicotinamide adenine dinucleotide (NADH) with the oxidative phosphorylation pathway; L-tyrosine and L-tryptophan with the phenylalanine, tyrosine and tryptophan biosynthesis pathway; glycocholic acid and taurocholate with the primary bile acid biosynthesis pathway; cortisol with the pathways in cancer and bile secretion pathways; hypoxanthine, allantoic acid and inosine with the purine metabolism pathway; and sphingosine 1-phosphate, 3-O-sulfogalactosylceramide (d18:1/20:0) and lactosylceramide (d18:1/22:0) with the sphingolipid metabolism pathway.

The present invention also provides a method for screening the above diagnostic markers suitable for the early diagnosis of esophageal squamous cell carcinoma (ESCC), comprising the following steps:

(1) Collect serum samples from ESCC patients and from healthy subjects as analysis samples, where the ESCC serum samples include serum from esophageal carcinoma in situ, early esophageal cancer and advanced esophageal cancer;

(2) Analyze each analysis sample by UPLC-QTOF/MS serum metabolomics to obtain the raw metabolic fingerprint of each serum sample;

(3) Using the XCMS package for R, preprocess the raw metabolic fingerprints of the ESCC serum samples and of the healthy serum samples separately to obtain a two-dimensional matrix in which each row is an analysis sample and each column is a metabolite feature, and annotate the metabolite peaks in this matrix with the CAMERA package for R for subsequent statistical analysis;

(4) Subject the two-dimensional matrix from step (3) to principal component analysis followed by partial least squares discriminant analysis (PLS-DA) to obtain a PLS-DA model; this model shows that ESCC patients and healthy subjects differ in metabolic pattern and separate clearly into two classes;

(5) Based on the PLS-DA model, screen for differential metabolites using the variable importance in projection (VIP) score from the PLS-DA model together with a univariate non-parametric test; the screening criteria are VIP ≥ 1 and a q value below 0.05 after false discovery rate (FDR) correction for multiple testing;

(6) For the differential metabolites obtained in step (5), determine the quasi-molecular ion, adduct and isotope information using the CAMERA package for R to obtain potential metabolic markers;

(7) For these potential metabolic markers, combine their MS1 and MS/MS spectra with the quasi-molecular ion, adduct and isotope information to infer the molecular mass and molecular formula of each marker, then compare and match them against existing standard compounds to obtain the serum metabolic markers. A single serum metabolic marker, or a combination of serum metabolic markers, can then serve as a diagnostic marker suitable for the early diagnosis of ESCC.
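The filter in step (5) can be sketched as follows. This is a minimal illustration only: it assumes the FDR correction is the Benjamini-Hochberg procedure (the text names FDR but not the specific method), and the function names, VIP scores and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q values), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    q_values = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def select_differential_features(vip, p_values, vip_cutoff=1.0, q_cutoff=0.05):
    """Return indices of features with VIP >= vip_cutoff and q < q_cutoff."""
    q_values = benjamini_hochberg(p_values)
    return [i for i in range(len(vip))
            if vip[i] >= vip_cutoff and q_values[i] < q_cutoff]


# Hypothetical VIP scores and raw non-parametric test p-values for four features.
vip = [1.8, 0.6, 1.2, 1.1]
p_values = [0.001, 0.002, 0.03, 0.20]
selected = select_differential_features(vip, p_values)
```

Both criteria must hold at once, so a feature with a small q value but VIP below 1 (the second feature here) is not retained.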

In the above screening method, the healthy subjects are persons free of upper gastrointestinal lesions and of other metabolic diseases (such as hyperthyroidism, hypothyroidism, hypertension, diabetes and kidney disease).

In the above screening method, during the LC-MS serum metabolomics analysis one quality control (QC) sample is added for every 10 analysis samples, so that quality can be monitored in real time from sample pretreatment through to analysis. The QC sample is a pooled mixture of 5 esophageal cancer serum samples and 5 healthy serum samples.
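The one-in-ten QC scheme can be sketched as a run-order builder. This is an illustration under stated assumptions: the text gives only the ratio, so opening the sequence with a QC injection is an assumption, and the function name and sample IDs are hypothetical.

```python
def build_run_order(sample_ids, qc_interval=10, qc_label="QC"):
    """Insert one QC injection after every `qc_interval` analysis samples.

    Assumption: the sequence also starts with a QC injection; the source
    text only states the one-in-ten ratio.
    """
    run_order = [qc_label]
    for position, sample_id in enumerate(sample_ids, start=1):
        run_order.append(sample_id)
        if position % qc_interval == 0:
            run_order.append(qc_label)
    return run_order


# 202 hypothetical sample IDs (39 + 28 + 30 + 105 analysis samples).
samples = [f"S{i:03d}" for i in range(1, 203)]
run_order = build_run_order(samples)
qc_count = run_order.count("QC")
```

Under this assumed placement, 202 analysis samples give 21 QC injections, which agrees with the 21 QC samples reported in the detailed description, although the actual placement used is not stated.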

In the above screening method, the analysis samples and QC samples undergo the following pretreatment before injection:

(1) Pipette 50 μl of analysis sample or QC sample into a 96-well plate of the Bravo automated sample preparation system;

(2) Add 150 μl of methanol for extraction, vortex for 30 s, and incubate at -20°C to precipitate proteins;

(3) Centrifuge in a high-speed centrifuge at 4000 rpm and 4°C for 20 min;

(4) Transfer the supernatant from step (3) into an LC-MS vial and store at -80°C until LC-MS analysis.

In the above screening method, preprocessing of the raw metabolic fingerprints means: converting the raw metabolic fingerprints into mzData files with MassHunter software, then preprocessing the mzData files with the XCMS package (retention time correction, peak detection, peak matching and peak alignment) to obtain the two-dimensional matrix.

In the above screening method, the metabolite peak annotation performed on the two-dimensional matrix with the R package CAMERA covers isotope peaks, adducts and fragment ions.

In the above screening method, when each analysis sample is analyzed by LC-MS serum metabolomics, the liquid chromatography column is a Waters ACQUITY UPLC HSS T3 column (100 mm × 2.1 mm, 1.8 μm); the injection volume is 6 μL, the injection temperature is 4°C and the flow rate is 0.5 ml/min. The mobile phase consists of two solvents, A and B. In positive ion (ESI+) mode, A is 0.1 wt% formic acid in water and B is 0.1 wt% formic acid in acetonitrile; in negative ion (ESI-) mode, A is 0.5 mmol/L ammonium fluoride in water and B is pure acetonitrile. The gradient elution program is: 1% B from 0 to 1 min; a gradual increase from 1% B to 100% B from 1 to 8 min; a rapid drop from 100% B to 1% B from 10 to 10.1 min; then 1% B held for 1.9 min.
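The gradient program can be expressed as a piecewise-linear function of time. The sketch below is illustrative only: the hold at 100% B between 8 and 10 min is an assumption (the text does not describe that interval), and the names `GRADIENT` and `percent_b` are hypothetical.

```python
# Gradient breakpoints as (time in min, % B by volume). The hold at 100% B
# from 8 to 10 min is an assumption; the text does not describe that interval.
GRADIENT = [(0.0, 1.0), (1.0, 1.0), (8.0, 100.0),
            (10.0, 100.0), (10.1, 1.0), (12.0, 1.0)]


def percent_b(t_min, gradient=GRADIENT):
    """Linearly interpolate the % B of the mobile phase at time t_min."""
    if not gradient[0][0] <= t_min <= gradient[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


mid_ramp = percent_b(4.5)  # halfway through the 1-8 min ramp
```

With the final 1.9 min re-equilibration, the program as read here ends at 12.0 min per injection.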

In the above screening method, when each analysis sample is analyzed by LC-MS serum metabolomics, mass spectrometric detection uses a quadrupole time-of-flight (Q-TOF) mass spectrometer with an electrospray ion source operated in positive ion mode (ESI+) and negative ion mode (ESI-). The ion source temperature is 400°C, the cone gas flow is 12 L/min, the desolvation gas temperature is 250°C and the desolvation gas flow is 16 L/min. The capillary voltage is +3 kV in positive ion mode and -3 kV in negative ion mode, and the cone voltage is 0 V in both modes. The cone pressure is 20 psi in positive ion mode and 40 psi in negative ion mode. Spectra are acquired over a mass-to-charge range of 50-1200 m/z, with one scan every 0.25 s.

In a preferred embodiment of the present invention, screening uses 39 patients with esophageal carcinoma in situ, 28 patients with early esophageal cancer, 30 patients with advanced esophageal cancer and 105 healthy subjects.

In a preferred embodiment of the present invention, the PLS-DA model obtained during screening has R2X = 0.167, R2Y = 0.569 and Q2Y = 0.523.

The present invention also provides a method for constructing a diagnostic model of esophageal squamous cell carcinoma (ESCC), comprising the following steps:

(1) Collect serum samples from ESCC patients and from healthy subjects as analysis samples, where the ESCC serum samples include serum from esophageal carcinoma in situ, early esophageal cancer and advanced esophageal cancer;

(2) Analyze each analysis sample by LC-MS serum metabolomics to obtain the raw metabolic fingerprint of each serum sample;

(3) Using the XCMS package for R, preprocess the raw metabolic fingerprint of each serum sample to obtain a two-dimensional matrix in which each row is an analysis sample and each column is a metabolite feature, and annotate the metabolite peaks in this matrix with the R package CAMERA for subsequent statistical analysis;

(4) Using mass-to-charge ratio and retention time, extract from the two-dimensional matrix the data for the diagnostic markers of the present invention suitable for the early diagnosis of ESCC, obtaining a diagnostic marker matrix;

(5) From this diagnostic marker matrix, build a random forest model with the randomForest package for R to obtain the ESCC diagnostic model.
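Step (4) above, selecting marker columns by mass-to-charge ratio and retention time, can be sketched as a tolerance match. This is only an illustration: the 10 ppm and 15 s tolerances, the function name and the feature list are assumptions, since the text does not state matching tolerances. The L-tryptophan reference values are those quoted for Figure 4.

```python
def match_markers(features, markers, ppm_tol=10.0, rt_tol_s=15.0):
    """Map each marker name to the index of its closest matching feature.

    features: list of (mz, rt_seconds) tuples, one per matrix column.
    markers:  dict of name -> (mz, rt_seconds) reference values.
    A feature matches when its m/z lies within ppm_tol and its retention
    time within rt_tol_s of the reference; unmatched markers map to None.
    """
    matches = {}
    for name, (ref_mz, ref_rt) in markers.items():
        best_index, best_error = None, None
        for index, (mz, rt) in enumerate(features):
            ppm_error = abs(mz - ref_mz) / ref_mz * 1e6
            if ppm_error <= ppm_tol and abs(rt - ref_rt) <= rt_tol_s:
                if best_error is None or ppm_error < best_error:
                    best_index, best_error = index, ppm_error
        matches[name] = best_index
    return matches


# Reference values for L-tryptophan as quoted for Figure 4;
# the feature list below is hypothetical.
markers = {"L-Tryptophan": (205.0974, 181.38)}
features = [(205.0990, 182.1), (205.0974, 400.0), (132.1019, 95.0)]
matches = match_markers(features, markers)
```

The second hypothetical feature has the exact m/z but elutes far from 181.38 s, so it is rejected; requiring both criteria is what distinguishes isobaric features.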

In the above construction method, esophageal carcinoma in situ refers to stage 0 of the TNM staging standard: severe atypical hyperplasia within the mucosal epithelium (or the epidermis of the skin) that involves the full thickness of the epithelium but has not breached the basement membrane to infiltrate downward. Early esophageal cancer refers to TNM stages I and II: cancer confined to the mucosa or submucosa, without lymph node involvement or distant metastasis. Advanced esophageal cancer refers to TNM stages III and IV: cancer that has invaded the muscularis or reached or extended beyond the adventitia, with regional or distant lymph node metastasis. TNM staging follows the American Joint Committee on Cancer (AJCC) TNM Classification of Carcinoma of the Esophagus and Esophagogastric Junction (7th ed., 2010).
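The stage grouping defined above can be written as a lookup table. The sketch below is illustrative; the names `STAGE_TO_GROUP` and `group_counts` are hypothetical, and the per-stage case counts are those reported in the study-subjects section of the detailed description.

```python
# TNM stage -> study group, following the definitions in the text.
STAGE_TO_GROUP = {
    "0": "carcinoma in situ",
    "I": "early",
    "II": "early",
    "III": "advanced",
    "IV": "advanced",
}


def group_counts(stage_counts):
    """Aggregate per-stage case counts into the three study groups."""
    totals = {}
    for stage, count in stage_counts.items():
        group = STAGE_TO_GROUP[stage]
        totals[group] = totals.get(group, 0) + count
    return totals


# Case counts per stage as reported for the study subjects.
counts = group_counts({"0": 39, "I": 17, "II": 11, "III": 30})
```

The 17 stage I and 11 stage II cases together account for the 28 early esophageal cancer patients used for screening and model construction.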

In a preferred embodiment of the present invention, the random forest model is built with the modeling parameter ntree = 5000.

In a preferred embodiment of the present invention, the model is built on the following sample numbers: 39 patients with esophageal carcinoma in situ, 28 patients with early esophageal cancer, 30 patients with advanced esophageal cancer and 105 healthy subjects.

In a preferred embodiment of the present invention, the diagnostic threshold of the resulting diagnostic model is 0.3552 when the diagnostic marker is the combination of 25 serum metabolic markers (diagnostic marker A), 0.7431 when it is the combination of 7 serum metabolic markers (diagnostic marker B), and 0.4943 when it is the combination of 5 serum metabolic markers (diagnostic marker C). A predicted value from the diagnostic model that is greater than or equal to the diagnostic threshold indicates esophageal squamous cell carcinoma; a value below the threshold indicates its absence.
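The decision rule reduces to a per-panel comparison against the stated threshold. A minimal sketch, in which the names `THRESHOLDS` and `classify` and the example scores are hypothetical, while the three threshold values are those given in the text:

```python
# Diagnostic thresholds for marker panels A (25 markers), B (7) and C (5).
THRESHOLDS = {"A": 0.3552, "B": 0.7431, "C": 0.4943}


def classify(score, panel):
    """Return True (ESCC) when the model score reaches the panel's threshold."""
    return score >= THRESHOLDS[panel]


# Hypothetical model outputs; a score exactly at the threshold counts as ESCC.
calls = [classify(0.40, "A"), classify(0.40, "B"), classify(0.4943, "C")]
```

The same score of 0.40 is read as positive under panel A and negative under panel B, so a score is only meaningful together with the panel that produced it.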

The present invention also provides a diagnostic model of esophageal squamous cell carcinoma, built according to the above construction method. As above, in a preferred embodiment the diagnostic threshold of the model is 0.3552 when the model uses diagnostic marker A, 0.7431 when it uses diagnostic marker B, and 0.4943 when it uses diagnostic marker C.

The present invention also provides a method of using the ESCC diagnostic model, i.e. a method for diagnosing esophageal squamous cell carcinoma with the model, comprising the following steps:

(1) Take the serum sample to be tested, pretreat it to meet the injection requirements, and analyze the pretreated sample by LC-MS serum metabolomics to obtain its raw metabolic fingerprint;

(2) Preprocess the raw metabolic fingerprint with the XCMS package for R and annotate the metabolite peaks to obtain a two-dimensional matrix suitable for statistical analysis;

(3) Using mass-to-charge ratio and retention time, extract from the two-dimensional matrix the data for the diagnostic markers of the present invention suitable for the early diagnosis of ESCC, obtaining a diagnostic marker matrix;

(4) Feed the diagnostic marker matrix into the ESCC diagnostic model, and determine from the value returned by the model and the model's diagnostic threshold whether the sample indicates early ESCC.

In a preferred embodiment of the present invention, when the combination of 25 serum metabolic markers (diagnostic marker A) is used, a model output greater than or equal to 0.3552 is diagnosed as esophageal cancer and any lower value as not; when the combination of 7 serum metabolic markers (diagnostic marker B) is used, a model output greater than or equal to 0.7431 is diagnosed as esophageal cancer and any lower value as not; when the combination of 5 serum metabolic markers (diagnostic marker C) is used, a model output greater than or equal to 0.4943 is diagnosed as esophageal cancer and any lower value as not.

The advantages of the present invention are as follows. Using serum metabolomics and statistical data analysis, the invention obtains diagnostic markers suitable for the early diagnosis of esophageal squamous cell carcinoma and an ESCC diagnostic model, and identifies 10 metabolic pathways closely associated with the diagnostic markers. The marker screening method is readily operable and the model construction method is simple; the resulting diagnostic model performs well, with high sensitivity and good specificity, and is suitable not only for diagnosing advanced esophageal cancer but especially for diagnosing early esophageal cancer. Diagnosis requires only a blood draw, so it is non-invasive and inexpensive, can replace the invasive diagnostic procedures in current use, and greatly reduces patient discomfort. Diagnosis is also fast and convenient, which improves working efficiency, favors early detection and early treatment of esophageal cancer, and gives the invention good value for clinical use and wider adoption.

Description of drawings

Figure 1. Total ion chromatograms of the raw metabolic fingerprints (LC-QTOF/MS; +ESI is positive ion mode, -ESI is negative ion mode). The horizontal axis is retention time (min) and the vertical axis is relative metabolite concentration. Normal denotes healthy serum samples and ESCC denotes esophageal cancer serum samples.

Figure 2. PCA score plot from the preliminary analysis of metabolic profiles. Normal denotes healthy serum samples, ESCC denotes esophageal cancer serum samples, and QC denotes quality control samples.

Figure 3. (A) Three-dimensional PLS-DA score plot comparing the metabolic profiles of esophageal cancer and healthy controls; the model has R2X = 0.167, R2Y = 0.569 and Q2Y = 0.523. (B) The corresponding validation plot for the PLS-DA model, based on random permutation. Normal denotes healthy serum samples and ESCC denotes esophageal cancer serum samples.

Figure 4. Identification workflow for L-tryptophan. (a) Retention time profile in the chromatogram at m/z 205.0974; (b) MS1 spectrum at a retention time of 181.38 s; (c) MS/MS fragment spectrum of the ion at retention time 181.38 s and m/z 205.0974; (d) fragmentation mechanism of the metabolite (RT: 181.38 s, m/z: 205.0974).

Figure 5. ROC curve on the external test samples for the random forest model built from the 25 metabolic markers.

Figure 6. ROC curve on the external test samples for the random forest early esophageal cancer diagnostic model built from the 7 metabolic markers.

Figure 7. ROC curve on the external test samples for the random forest early esophageal cancer diagnostic model built from the 5 metabolic markers.

Detailed description

The present invention is further explained below through the following specific embodiments, which also further demonstrate its advantages.

The screening of the diagnostic markers of the present invention, the construction of the diagnostic model and the verification of their performance are described below:

1. Study subjects

This study relied on the esophageal cancer screening platform of the National Demonstration Base for Early Diagnosis and Treatment of Esophageal Cancer (Feicheng City, Shandong Province). Among residents of Feicheng aged 40-69 who underwent gastroscopy with iodine staining and targeted biopsy (used as the gold standard for confirmation), cases of esophageal carcinoma in situ (stage 0, 39 cases), early esophageal cancer (stage I, 17 cases; stage II, 11 cases) and advanced esophageal cancer (stage III, 30 cases) were collected. In addition, 105 healthy subjects free of upper gastrointestinal lesions, i.e. screening participants with negative iodine staining under gastroscopy, were randomly selected as healthy samples.

2. Serum metabolomics analysis by LC-MS

All collected serum samples were centrifuged and stored in a -80°C freezer. Metabolomics analysis was performed with an ultra-high performance liquid chromatography-mass spectrometry system (UPLC-QTOF/MS 6550, Agilent) and a Bravo automated sample preparation system (Agilent, USA), in 3 large batches with quality control, to obtain for each sample a raw metabolic fingerprint containing chromatographic and mass spectral information. The procedure was as follows:

2.1 Instruments and equipment

Equipment: UPLC-QTOF/MS 6550 system (Agilent, USA), Bravo system (Agilent, USA), high-speed refrigerated centrifuge, vortex mixer, nitrogen drying device, 4°C refrigerator (Haier), water purification system (Siemens).

Consumables: Waters ACQUITY HSS T3 column (particle size 1.8 μm; 100 mm length × 2.1 mm), liquid nitrogen, high-purity nitrogen, conical-bottom sample vials, 2 ml centrifuge rotor, 2 ml round-bottom centrifuge tubes, pipettes, 1000 μl pipette tips, 200 μl pipette tips, marker pens, latex gloves, face masks.

Reagents: methanol (Dima, HPLC grade), acetonitrile (Dima, HPLC grade), formic acid (Guangfu Institute of Fine Chemistry, Tianjin), pure water (TOC < 10 ppb).

2.2血清样本预处理2.2 Serum sample pretreatment

血清样本预处理前,制备21份质量控制样品(QC),将所有早期食管癌血清样本、食管原位癌血清样本、晚期食管癌血清样本、健康血清样本和质量控制样品进行随机编号,以早期食管癌血清样本、食管原位癌血清样本、晚期食管癌血清样本、健康血清样本作为分析样本,每隔10个分析样本加入一个质量控制样品。将早期食管癌血清样本、食管原位癌血清样本和晚期食管癌血清样本统称为食管癌血清样本,质量控制样品为5份食管癌血清样本和5份健康血清样本的混合样品。食管癌血清样本、健康血清样本和质量控制样品均进行预处理,预处理包括以下4个步骤:Before the pretreatment of serum samples, 21 quality control samples (QC) were prepared, and all early esophageal cancer serum samples, esophageal carcinoma in situ serum samples, advanced esophageal cancer serum samples, healthy serum samples and quality control samples were randomly numbered, and early Esophageal cancer serum samples, esophageal carcinoma in situ serum samples, advanced esophageal cancer serum samples, and healthy serum samples were used as analysis samples, and a quality control sample was added every 10 analysis samples. Early esophageal cancer serum samples, esophageal carcinoma in situ serum samples, and advanced esophageal cancer serum samples were collectively referred to as esophageal cancer serum samples, and the quality control sample was a mixed sample of 5 esophageal cancer serum samples and 5 healthy serum samples. Esophageal cancer serum samples, healthy serum samples and quality control samples were all pretreated, and the pretreatment included the following 4 steps:

(1)用移液器抽取50μl分析样本或质量控制样品,置于Bravo自动标本处理系统(Agilent,USA)的96孔板上;(1) Use a pipette to extract 50 μl of analytical samples or quality control samples, and place them on a 96-well plate of the Bravo automatic specimen processing system (Agilent, USA);

(2)加入150μl甲醇提取,涡旋30s,并在-20℃下孵化以沉淀蛋白。(2) Add 150 μl of methanol for extraction, vortex for 30 s, and incubate at -20°C to precipitate protein.

(3)然后于高速离心机中在4℃下以4000转/分离心20min;(3) Centrifuge at 4000 rpm for 20 minutes at 4°C in a high-speed centrifuge;

(4) Transfer the supernatant from step (3) into an LC-MS sample vial and store it at -80°C until LC-MS analysis.

2.3 Serum UPLC-QTOF/MS detection

A UPLC system (1290 series, Agilent) injected a 6 μL aliquot of each pretreated sample onto an ACQUITY UPLC HSS T3 column (particle size 1.8 μm; 100 mm length × 2.1 mm; Waters, Milford, USA). The injection order was fully randomized to exclude bias from injection order. The mobile phase consisted of two solvents. Solvent A was 0.1 wt% formic acid in water (positive ion mode, ESI+) or 0.5 mM ammonium fluoride in water (negative ion mode, ESI-). Solvent B was 0.1 wt% formic acid in acetonitrile (ESI+) or 100% acetonitrile (ESI-). The gradient was: 1% B from 0 to 1 min; a gradual increase from 1% B to 100% B from 1 to 8 min; a rapid decrease from 100% B to 1% B from 10 to 10.1 min; then 1% B held for 1.9 min. The flow rate was 0.5 ml/min. Samples were kept at 4°C throughout the analysis. The percentages of A and B are volume percentages.
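The gradient program can be written out as a small function, which makes the 12 min total run time (10.1 min plus the 1.9 min hold) explicit. This is only an illustrative sketch: the function name `percent_b` is hypothetical, the ramps are assumed to be linear, and the composition between 8 and 10 min, which the text does not state, is assumed to be held at 100% B.

```python
def percent_b(t_min: float) -> float:
    """Return the volume percentage of solvent B at time t_min (minutes)."""
    total_run = 10.1 + 1.9  # 12.0 min, from the stated segments
    if t_min < 0 or t_min > total_run:
        raise ValueError("time outside the 0-12 min run")
    if t_min <= 1.0:
        return 1.0                                   # 0-1 min: 1% B
    if t_min <= 8.0:
        return 1.0 + 99.0 * (t_min - 1.0) / 7.0      # 1-8 min: assumed linear ramp
    if t_min <= 10.0:
        return 100.0                                 # 8-10 min: assumed hold (not stated in the text)
    if t_min <= 10.1:
        return 100.0 - 99.0 * (t_min - 10.0) / 0.1   # 10-10.1 min: rapid return
    return 1.0                                       # re-equilibration at 1% B


if __name__ == "__main__":
    for t in (0.5, 4.5, 9.0, 11.0):
        print(t, percent_b(t))
```

Under these assumptions the midpoint of the ramp (4.5 min) corresponds to 50.5% B.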

Mass spectrometric detection used an Agilent quadrupole time-of-flight mass spectrometer (Q-TOF 6550, Agilent) with an electrospray ion source in positive ion mode (ESI+) and negative ion mode (ESI-). The ion source temperature was set to 400°C and the cone gas flow to 12 L/min. The desolvation gas temperature was set to 250°C and the desolvation gas flow to 16 L/min. The capillary voltage was +3 kV in positive ion mode and -3 kV in negative ion mode, and the cone voltage was 0 V in both modes. The cone pressure was 20 psi (positive ion mode) and 40 psi (negative ion mode). Spectra were acquired over a mass-to-charge range of 50 to 1200 m/z with a scan time of 0.25 s. For MS/MS analysis, high-purity nitrogen was used as the collision gas to generate fragments of the target ions, with the collision energy set to 10, 20, or 40 eV.

3. XCMS spectrum preprocessing

The raw metabolic fingerprint data acquired by UPLC-QTOF/MS in ESI+ and ESI- modes (see Figure 1) were converted to mzData files with Agilent MassHunter software and then preprocessed with the XCMS package in R. Preprocessing included retention time correction, peak detection, peak matching, peak alignment, noise filtering, deconvolution of overlapping peaks, threshold selection, and normalization. The XCMS parameters were: full width at half maximum of 10 (fwhm=10) and retention time window of 10 (bw=10); all other parameters were left at their defaults. XCMS preprocessing yields a two-dimensional matrix suitable for statistical analysis, in which each row is a sample (observation), each column is a metabolite (variable), and each entry is the corresponding metabolite concentration. Each metabolite peak is characterized by its retention time (RT) and mass-to-charge ratio (m/z). The matrix was then annotated with the R package CAMERA to label metabolite peaks (isotope peaks, adducts, and fragment ions). Samples were normalized before statistical analysis, and the retention time range analyzed was set to 0.5 to 10 min. After XCMS preprocessing, the data matrix contained 981 metabolite peaks in positive ion mode and 485 metabolite peaks in negative ion mode, for a total of 1466 metabolite peaks.

4. LC-MS experimental quality control

During metabolomic analysis of the serum samples, the prepared QC samples were inserted evenly among the analytical samples, one QC sample for every 10 analytical samples, so that quality could be monitored in real time from sample pretreatment through detection. After XCMS preprocessing of the raw metabolic fingerprints, the %RSD (coefficient of variation) of each metabolite across the QC samples was calculated. The %RSD of the vast majority of metabolites was below 30%, indicating that quality control from sample pretreatment through detection was good and that the resulting metabolomic data are reliable.
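The QC check described above reduces to a per-peak coefficient of variation across the QC injections. The sketch below is illustrative only: the function names and the toy intensity values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not specify it.

```python
import statistics


def percent_rsd(values):
    """Relative standard deviation (%) of one peak across QC injections."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("mean intensity is zero; %RSD is undefined")
    # Sample standard deviation (n - 1); an assumption, not stated in the text.
    return 100.0 * statistics.stdev(values) / mean


def fraction_below(qc_matrix, limit=30.0):
    """Fraction of peaks whose %RSD is below `limit`.

    qc_matrix: rows are QC injections, columns are metabolite peaks.
    """
    columns = list(zip(*qc_matrix))
    passed = sum(1 for col in columns if percent_rsd(col) < limit)
    return passed / len(columns)


if __name__ == "__main__":
    # Hypothetical intensities: 3 QC injections x 2 peaks.
    qc = [[100.0, 10.0], [110.0, 30.0], [90.0, 50.0]]
    print(fraction_below(qc))
```

In the toy matrix the first peak is stable (10% RSD) and the second is not, so half of the peaks pass the 30% limit.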

5. PCA-based preliminary analysis of metabolic profiles

An unsupervised method, principal component analysis (PCA), was used for a preliminary look at between-group separation and outliers (see Figure 2). The tight clustering of the QC samples in the figure indicates that the LC-MS experiment was well controlled. The figure also shows a separation trend between esophageal cancer and healthy controls, but the groups still partly overlap, so a supervised learning method is needed for further classification.

6. PLS-DA-based metabolic profile analysis

The two-dimensional matrix data were randomly divided, with 4/5 used as training data and the remaining 1/5 as external test data (see Table 1). To reduce the bias caused by within-group variation and obtain a clearer grouping trend, a supervised method, partial least squares-discriminant analysis (PLS-DA), was applied to the training data to show the differences in metabolic profile and the separation between esophageal cancer and healthy controls. As shown in Figure 3, esophageal cancer and healthy controls differ in metabolic pattern and show a clear between-group separation, with model statistics R2X=0.167, R2Y=0.569, and Q2Y=0.523.
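The 4/5 to 1/5 partition can be illustrated with a seeded shuffle. This sketch covers only the random split, not the PLS-DA model; the function name and the fixed seed are hypothetical, and whether the original split was stratified by group is not stated, so a plain random split is assumed.

```python
import random


def split_train_test(sample_ids, test_fraction=0.2, seed=0):
    """Randomly split sample identifiers into (training, test) lists."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = split_train_test(range(100))
    print(len(train), len(test))
```

Keeping the test samples out of every modelling step, including feature screening, is what makes the later evaluation an external validation.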

Table 1. Baseline and clinicopathological characteristics of the metabolomic study for early diagnosis of esophageal cancer

7. Screening of differential metabolites and chemical identification for early diagnosis of esophageal cancer

To screen for differential metabolites for the diagnosis of early esophageal cancer, we combined the variable importance in projection (VIP) score from the PLS-DA model with a univariate nonparametric test (Kruskal-Wallis rank sum test). The selection criteria were VIP ≥ 1 and a q value below 0.05 after false discovery rate (FDR) correction for multiple testing. Under these criteria, 551 serum metabolomic features were differentially expressed between esophageal cancer and healthy controls. The quasi-molecular ions, adducts, and isotope information of these features were then determined with the CAMERA package in R, and chemical signals and compounds not found in the human body were excluded, leaving 242 potential metabolic markers.
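The two-part criterion (VIP ≥ 1 and q < 0.05) can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption; the function names and the toy VIP and p values are hypothetical, and the p values would in practice come from the Kruskal-Wallis test.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def select_features(vip, p_values, vip_min=1.0, q_max=0.05):
    """Indices of features with VIP >= vip_min and q < q_max."""
    q = benjamini_hochberg(p_values)
    return [i for i in range(len(vip)) if vip[i] >= vip_min and q[i] < q_max]


if __name__ == "__main__":
    vip = [1.5, 0.8, 1.0, 2.0]        # hypothetical VIP scores
    p = [0.01, 0.04, 0.03, 0.005]     # hypothetical raw p values
    print(select_features(vip, p))
```

In the toy data the second feature is dropped on its VIP score even though its q value passes, which shows that both conditions must hold.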

The 242 potential metabolic markers were identified by the following chemical identification steps (for an example, see the identification of the metabolic marker L-tryptophan, RT 181.38 s, m/z 205.0974, in Figure 4):

(1) Based on the MS1 spectral pattern of each potential metabolic marker, and using the CAMERA package in R (Collection of annotation related methods for mass spectrometry data, http://bioconductor.org/packages/release/bioc/html/CAMERA.html), determine the quasi-molecular ion, adducts, and isotope information of the potential metabolic marker, and infer its molecular mass and molecular formula;

(2) Search the online human metabolite databases HMDB (http://www.hmdb.ca/) and METLIN (http://metlin.scripps.edu/) by molecular mass to obtain a set of candidate compounds;

(3) Perform RRLC-QTOF/MS/MS experiments on the 242 potential metabolic markers to obtain the fragment ion information of each metabolite, and match the fragment ions against the MS/MS spectra of the candidate compounds in the databases; confirm the final identity by comparison with the chromatographic and mass spectral characteristics of compounds in a standard sample library.
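Steps (1) and (2) amount to converting an observed m/z into a neutral mass and comparing it with database masses. The worked example below uses the L-tryptophan value quoted above (m/z 205.0974) and assumes that this ion is the protonated molecule [M+H]+; that assignment, the function names, and the 5 ppm tolerance are assumptions made for illustration. The monoisotopic mass of tryptophan (C11H12N2O2) is taken as 204.089878 Da.

```python
PROTON_MASS = 1.007276                 # Da
TRYPTOPHAN_MONOISOTOPIC = 204.089878   # Da, C11H12N2O2


def neutral_mass_from_mh(mz):
    """Neutral monoisotopic mass, assuming a singly charged [M+H]+ ion."""
    return mz - PROTON_MASS


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def matches(observed, theoretical, tolerance_ppm=5.0):
    """True if the masses agree within the tolerance (illustrative value)."""
    return abs(ppm_error(observed, theoretical)) <= tolerance_ppm


if __name__ == "__main__":
    mass = neutral_mass_from_mh(205.0974)
    print(round(mass, 6), matches(mass, TRYPTOPHAN_MONOISOTOPIC))
```

Under the [M+H]+ assumption the inferred neutral mass is about 204.0901 Da, roughly 1.2 ppm from the tryptophan value, which is why mass alone only narrows the candidates and step (3) is still needed to confirm the identity.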

Using this identification method, a total of 25 serum metabolic markers were successfully identified with confirmation by MS/MS or reference standards. The common names of these 25 serum metabolic markers are: beta-Ala-Lys; L-Carnosine; cis-9-Palmitoleic acid; Palmitic acid; Oleic acid; LPA(18:1(9Z)/0:0); LysoPC(14:0/0:0); LysoPC(18:2(9Z,12Z)); LysoPC(24:0); PC(14:1(9Z)/P-18:1(11Z)); PC(16:0/18:2(9Z,12Z)); PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)); Linoleic acid; NADH; Cortisol; L-Tyrosine; L-Tryptophan; Glycocholic acid; Taurocholate; Hypoxanthine; Allantoic acid; Inosine; Sphingosine 1-phosphate; 3-O-Sulfogalactosylceramide(d18:1/20:0); Lactosylceramide(d18:1/22:0).

Compared with the published literature, all 25 of these serum metabolic markers are reported here for the first time in early esophageal squamous cell carcinoma, which is of great significance for the diagnosis and treatment of early esophageal cancer. The Chinese names used in the original text are: beta-Ala-Lys (beta-丙氨酸-赖氨酸), L-Carnosine (左旋肌肽), cis-9-Palmitoleic acid (顺-9-十六碳烯酸), Palmitic acid (棕榈酸), Oleic acid (油酸), LPA (溶血磷脂酸), LysoPC (溶血卵磷脂), PC (磷脂), Linoleic acid (亚油酸), NADH (烟酰胺腺嘌呤二核苷酸), Cortisol (皮质醇), L-Tyrosine (L-酪氨酸), L-Tryptophan (L-色氨酸), Glycocholic acid (甘氨胆酸), Taurocholate (牛磺胆酸盐), Hypoxanthine (次黄嘌呤), Allantoic acid (尿囊酸), Inosine (肌苷), Sphingosine 1-phosphate (1-磷酸鞘氨醇), 3-O-Sulfogalactosylceramide (硫酸半乳糖基酰基鞘氨醇), and Lactosylceramide (乳糖神经酰胺). Because the Chinese names may be imprecise translations, the English standard names prevail.

The database search information for the 25 serum metabolic markers in HMDB and METLIN is shown in Table 2. Using the HMDB ID and METLIN ID in the table, a person skilled in the art can obtain detailed information on each of the 25 markers, such as its chemical structure:

Table 2. Database search information for the 25 serum metabolic markers in HMDB and METLIN

In addition, KEGG enrichment and pathway analysis showed that the 25 serum metabolic markers are closely associated with the following 10 metabolic pathways, given by their standard KEGG names (http://www.genome.jp/kegg/): Glycerophospholipid metabolism; Linoleic acid metabolism; beta-Alanine metabolism; Fatty acid biosynthesis; Oxidative phosphorylation; Phenylalanine, tyrosine and tryptophan biosynthesis; Primary bile acid biosynthesis; Pathways in cancer, and Bile secretion; Purine metabolism; Sphingolipid metabolism. This indicates that these 10 metabolic pathways are perturbed early in the development of esophageal cancer, a finding that provides useful guidance for the prevention of esophageal cancer and for drug development.

Table 3 below lists the differences in the 25 serum metabolic markers between esophageal cancer patients and healthy subjects; L-Tryptophan was detected in both positive and negative ion modes. FC is the fold change in esophageal cancer relative to healthy controls. The FC values show that beta-Ala-Lys, LPA(18:1(9Z)/0:0), LysoPC(14:0/0:0), PC(16:0/18:2(9Z,12Z)), NADH, L-Tyrosine, L-Tryptophan, Glycocholic acid, Allantoic acid, Inosine, and Sphingosine 1-phosphate were markedly higher in the esophageal cancer group than in the healthy group, whereas the other metabolic markers were markedly lower in the esophageal cancer group.

FDR is the false discovery rate after multiple-comparison correction of the nonparametric test, and all values are below 0.05. AUC is the area under the ROC curve for each metabolomic marker evaluated as a diagnostic test. When each of the 25 serum metabolic markers is used alone to discriminate esophageal cancer from non-esophageal cancer, the lowest AUC is 0.61 and the highest is 0.85. As single-component diagnostic markers, the 25 serum metabolic markers screened by the present invention therefore have a meaningful diagnostic effect, and diagnosis with a single serum metabolic marker has some value for clinical research.
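The single-marker AUC has a simple rank interpretation: it is the probability that a randomly chosen patient has a higher marker value than a randomly chosen control, with ties counted as one half. A minimal sketch of that calculation follows; the function name and the toy values are hypothetical, and this is not claimed to be the software used in the source.

```python
def auc(case_values, control_values):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    wins = 0.0
    for c in case_values:
        for h in control_values:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5   # ties count as half
    return wins / (len(case_values) * len(control_values))


if __name__ == "__main__":
    # Hypothetical marker levels.
    print(auc([2.0, 5.0], [1.0, 3.0, 4.0]))
```

For a marker that is lower in patients than in controls, this value falls below 0.5; such a marker is equally informative, and its AUC is conventionally reported as one minus the computed value.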

To improve diagnostic performance, the serum metabolic markers can be used in combination. For example, they can be grouped according to their relationship to the metabolic pathways, giving the following 8 diagnostic markers: (a) a combination of beta-Ala-Lys and L-Carnosine; (b) a combination of cis-9-Palmitoleic acid, Palmitic acid, and Oleic acid; (c) a combination of LPA(18:1(9Z)/0:0), LysoPC(14:0/0:0), LysoPC(18:2(9Z,12Z)), and LysoPC(24:0); (d) a combination of PC(14:1(9Z)/P-18:1(11Z)), PC(16:0/18:2(9Z,12Z)), PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)), and Linoleic acid; (e) a combination of NADH, L-Tyrosine, and L-Tryptophan; (f) a combination of Cortisol, Glycocholic acid, and Taurocholate; (g) a combination of Hypoxanthine, Allantoic acid, and Inosine; (h) a combination of Sphingosine 1-phosphate, 3-O-Sulfogalactosylceramide(d18:1/20:0), and Lactosylceramide(d18:1/22:0).

Several metabolic markers with good AUC values can also be combined to form a diagnostic marker. For example, the diagnostic marker may be a combination of two or more of the following serum metabolic markers: LPA(18:1(9Z)/0:0), LysoPC(14:0/0:0), LysoPC(18:2(9Z,12Z)), LysoPC(24:0), PC(14:1(9Z)/P-18:1(11Z)), PC(16:0/18:2(9Z,12Z)), PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)), NADH, Cortisol, L-Tryptophan, Taurocholate, Hypoxanthine, Inosine, 3-O-Sulfogalactosylceramide(d18:1/20:0), and Lactosylceramide(d18:1/22:0).

Alternatively, the diagnostic marker may be a combination of two or more of the following serum metabolic markers with good AUC values: LPA(18:1(9Z)/0:0), LysoPC(14:0/0:0), LysoPC(18:2(9Z,12Z)), LysoPC(24:0), PC(14:1(9Z)/P-18:1(11Z)), PC(16:0/18:2(9Z,12Z)), and PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)).

Alternatively, the diagnostic marker may be a combination of two or more of the following serum metabolic markers with good AUC values: L-Tyrosine, L-Tryptophan, Glycocholic acid, Taurocholate, and Cortisol.

Table 3. The 25 serum metabolic markers differentially expressed between esophageal cancer and healthy controls

FDR: false discovery rate (multiple comparisons); AUC: area under the ROC curve; FC: fold change; RSD%: coefficient of variation calculated from the quality control samples.

The application of 3 preferred diagnostic markers of the present invention is described in detail below; the application of the other diagnostic markers is not described individually here.

8. Early diagnosis model for esophageal cancer and external validation

8.1 Using the combination of the 25 identified serum metabolic markers as the diagnostic marker, an early diagnosis model for esophageal cancer was built on the training samples with a random forest. The random forest was implemented with the randomForest package in R, with the modelling parameter ntree=5000 (equivalent to b below).

The random forest modelling steps are as follows:

(1) Given an original training set of N samples, draw b new bootstrap sample sets at random with replacement and build b classification trees from them; the samples not drawn in each round form b sets of out-of-bag (OOB) data;

(2) Given m_all variables in total, randomly draw m_try variables at each node of each tree (m_try much smaller than m_all), then select from these m_try variables the one with the greatest discriminating power; the split threshold for the variable is determined by examining every candidate split point;

(3) Each classification tree in the random forest is a binary tree generated by top-down recursive splitting, that is, the training set is partitioned successively starting from the root node. Each tree is grown to its maximum size without pruning;

(4) The generated classification trees together form the random forest, which is used to classify new data; the classification result is determined by the votes of the individual tree classifiers;

(5) ROC curve analysis of the vote score against the actual classification then gives the diagnostic threshold. The diagnostic threshold of this model is 0.3552.
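Steps (1) and (4) above can be illustrated without building any trees: each tree sees a bootstrap resample of the N training samples, the samples it never saw are its out-of-bag set, and the forest score for a new sample is the fraction of trees voting for the positive class. The sketch below is a conceptual illustration with hypothetical function names, not the randomForest implementation used in the source.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    out_of_bag = sorted(set(range(n)) - set(in_bag))
    return in_bag, out_of_bag


def vote_fraction(tree_votes):
    """Fraction of trees voting 1 (cancer); votes are 0 or 1."""
    return sum(tree_votes) / len(tree_votes)


if __name__ == "__main__":
    rng = random.Random(1)
    in_bag, oob = bootstrap_indices(50, rng)
    print(len(in_bag), len(oob))
    print(vote_fraction([1, 0, 1, 1]))
```

Because sampling is with replacement, a substantial share of the training samples is left out of each tree, which is what allows the out-of-bag data to serve as an internal check.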

The random forest model built above serves as the esophageal cancer diagnostic model. To make a diagnosis, the data for the 25 serum metabolic markers in the serum under test are fed into the random forest model. If the vote score of the model is greater than or equal to the diagnostic threshold, the diagnosis is positive (esophageal squamous cell carcinoma); if it is below the diagnostic threshold, the diagnosis is negative (no esophageal squamous cell carcinoma).
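The decision rule is a comparison of the vote score with the threshold. How the threshold was read off the ROC curve is not stated in the text; the sketch below assumes the common choice of maximizing Youden's index (sensitivity + specificity - 1), and all function names and toy scores are hypothetical.

```python
def youden_threshold(scores, labels):
    """Pick the cutoff maximizing sensitivity + specificity - 1.

    labels: 1 = esophageal cancer, 0 = healthy control.
    A sample is called positive when score >= cutoff.
    Returns (cutoff, sensitivity, specificity).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]


def diagnose(vote_score, threshold=0.3552):
    """Apply the stated rule: positive if score >= threshold."""
    return "positive" if vote_score >= threshold else "negative"


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]   # hypothetical vote scores
    labels = [0, 0, 0, 1, 1, 1]
    print(youden_threshold(scores, labels))
    print(diagnose(0.41), diagnose(0.20))
```

The default of 0.3552 in `diagnose` is the threshold reported for the 25-marker model; a different marker panel would need its own threshold.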

The two-dimensional matrix data for the 25 serum metabolic markers in the external test samples were fed into the random forest model built above to obtain the predicted probability of esophageal cancer for each test sample. ROC curve analysis against the actual pathological result (esophageal cancer or healthy) (see Figure 5) gave the sensitivity, specificity, and area under the ROC curve (AUC) of the random forest model; the results are shown in Table 4. Figure 5 and Table 4 show that the esophageal cancer diagnostic model performs well: for the diagnosis of esophageal cancer, the AUC is 0.895 (0.784 to 1), the sensitivity is 85.00%, and the specificity is 90.48%.
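Sensitivity and specificity follow directly from the confusion counts on the external test set. The sketch below uses hypothetical counts (17 of 20 patients and 19 of 21 controls correctly classified) chosen only because they reproduce the reported percentages; the actual test-set counts are given in the source tables and are not assumed here.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positives among all patients
    specificity = tn / (tn + fp)   # true negatives among all controls
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    sens, spec = sensitivity_specificity(tp=17, fn=3, tn=19, fp=2)
    print(f"{sens:.2%} {spec:.2%}")
```

With a test set of this size, a single reclassified sample shifts either figure by about five percentage points, which is consistent with the wide confidence interval reported for the AUC.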

Further, ROC curve analysis was performed separately for each stage by comparing the predicted probability of esophageal cancer for test samples of that stage with the actual pathological result (esophageal cancer or healthy), in order to evaluate the diagnostic performance of the model for different stages of esophageal cancer. The sensitivity, specificity, and AUC of the random forest model for each stage are shown in Table 4. The table shows that AUC and specificity tend to increase as the cancer progresses, while sensitivity is better for carcinoma in situ and advanced cancer and somewhat lower for early cancer. Overall, the model performs best for advanced esophageal cancer, but the AUC for carcinoma in situ and early esophageal cancer still reaches an acceptable level above 0.85, so the model also has value for early diagnosis. This also shows that the serum metabolic markers screened by the present invention already undergo metabolic changes at the early stage of esophageal cancer and even at the carcinoma in situ stage.

Carcinoma in situ is an earlier stage than early (stage I and II) esophageal cancer. Esophageal cancer is harder to diagnose at an early stage and relatively easier at an advanced stage. The data in the table show that the diagnostic model of the present invention can reliably determine whether esophageal cancer is present. It performs well not only for advanced esophageal cancer but also, in accuracy, sensitivity, and specificity, for early esophageal cancer and carcinoma in situ. It can therefore detect carcinoma in situ and early esophageal cancer when symptoms are not obvious, which reduces the rate of missed diagnosis, favors early detection and early treatment, helps improve prognosis and reduce mortality, and gives the model good value for clinical use and wider adoption.

Table 4. ROC analysis results for external validation of the esophageal cancer diagnostic model

8.2 A combination of 7 serum metabolic markers was used as the diagnostic marker to build a model for diagnosing esophageal cancer, as follows:

The two-dimensional matrix data were randomly divided, with 4/5 used as training data and the remaining 1/5 as external test data (see Table 1). Only 7 metabolic markers were used as the diagnostic marker: LPA(18:1(9Z)/0:0), LysoPC(14:0/0:0), LysoPC(18:2(9Z,12Z)), LysoPC(24:0), PC(14:1(9Z)/P-18:1(11Z)), PC(16:0/18:2(9Z,12Z)), and PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)). An early diagnosis model for esophageal cancer was built on the training samples with a random forest, implemented with the randomForest package in R with the modelling parameter ntree=5000; the modelling steps were the same as above.

When the constructed random forest model is used for diagnosis, the data for the 7 serum metabolic markers in the serum under test are fed into the model. If the voting result of the model classifier is greater than or equal to the diagnostic cut-off, the diagnosis is positive (esophageal squamous cell carcinoma present); if it is below the cut-off, the diagnosis is negative (esophageal squamous cell carcinoma absent). The diagnostic cut-off (Threshold) of this model is 0.7431.
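The decision rule is a simple comparison against the cut-off. The sketch below assumes that the "voting result" is the fraction of trees voting for the cancer class (a value between 0 and 1); the source does not define it more precisely. The function name `diagnose` is hypothetical.

```python
def diagnose(vote_fraction, threshold):
    """Return 'positive' when the vote fraction for ESCC reaches the cut-off."""
    if not 0.0 <= vote_fraction <= 1.0:
        raise ValueError("vote_fraction must lie in [0, 1]")
    # The source specifies "greater than or equal to" for a positive call.
    return "positive" if vote_fraction >= threshold else "negative"

THRESHOLD_7_MARKERS = 0.7431  # cut-off reported for the 7-marker model

for votes in (0.80, 0.7431, 0.60):
    print(votes, diagnose(votes, THRESHOLD_7_MARKERS))
```

A sample whose vote fraction exactly equals the cut-off is classified as positive, matching the "greater than or equal to" wording.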

The two-dimensional matrix data for the 7 serum metabolic markers of the external test samples were fed into the random forest model built above to obtain the predicted probability of esophageal cancer for each test sample. These predictions were compared with the actual pathological results (esophageal cancer or healthy) by ROC curve analysis (see Figure 6) to obtain the sensitivity, specificity, and area under the ROC curve (AUC) of the random forest model; the results are shown in Table 5. Figure 6 and Table 5 show that the esophageal cancer diagnostic model built above performs well: its AUC for esophageal cancer diagnosis is 0.876 (0.752-1), its sensitivity is 90%, and its specificity is 85.71%.
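For readers unfamiliar with these metrics, the following sketch shows how sensitivity, specificity, and AUC can be computed from predicted scores and true labels. It is an illustration only: the scores and labels are made-up toy data, not the patent's test set, and the AUC is computed by the pairwise-comparison definition rather than by whatever routine the authors used.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive.

    labels: 1 = esophageal cancer, 0 = healthy. Both classes must be present.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > h else 0.5 if c == h else 0.0
               for c in cases for h in controls)
    return wins / (len(cases) * len(controls))

# Toy data for illustration only.
scores = [0.9, 0.8, 0.75, 0.5, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(scores, labels, threshold=0.7431)
print(sens, spec, auc(scores, labels))
```

Sensitivity and specificity depend on the chosen cut-off, whereas AUC summarizes performance across all possible cut-offs.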

Further, ROC curve analysis was performed separately for each stage of esophageal cancer by comparing the predicted probabilities for the test samples of that stage with the actual pathological results (esophageal cancer or healthy), in order to evaluate the diagnostic performance of the model at different stages. The sensitivity, specificity, and AUC of the random forest model for the different stages are shown in Table 5 below. The table shows that, as esophageal cancer progresses, AUC and sensitivity tend to increase, while specificity is better for carcinoma in situ and advanced cancer and somewhat lower for early cancer. Overall, the model performs best for advanced esophageal cancer, but its diagnostic performance (AUC) for carcinoma in situ and early esophageal cancer still reaches an acceptable level of 0.83 or above, so it also has value for early diagnosis. This also indicates that the serum metabolic markers screened by the present invention already show metabolic changes at the early esophageal cancer stage and even at the carcinoma in situ stage.
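A stage-wise evaluation of this kind can be sketched as follows. The sketch assumes that each stage's cases are compared against the same set of healthy controls from the test set; the source does not state this explicitly. The stage labels and scores are hypothetical toy values.

```python
from collections import defaultdict

def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties = 1/2)."""
    wins = sum(1.0 if c > h else 0.5 if c == h else 0.0
               for c in case_scores for h in control_scores)
    return wins / (len(case_scores) * len(control_scores))

def auc_by_stage(records):
    """records: iterable of (stage, score); the stage 'healthy' marks controls."""
    by_stage = defaultdict(list)
    for stage, score in records:
        by_stage[stage].append(score)
    controls = by_stage.pop("healthy")
    # Each stage is evaluated against the full control group (an assumption).
    return {stage: auc(scores, controls) for stage, scores in by_stage.items()}

# Toy records for illustration only.
records = [
    ("healthy", 0.10), ("healthy", 0.30), ("healthy", 0.55),
    ("in_situ", 0.50), ("in_situ", 0.80),
    ("early", 0.60), ("early", 0.70),
    ("advanced", 0.90), ("advanced", 0.95),
]
stage_auc = auc_by_stage(records)
print(stage_auc)
```

With small per-stage case counts, as in a 1/5 hold-out set, stage-wise AUC estimates carry wide uncertainty, which is worth keeping in mind when comparing stages.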

The data in the table show that the diagnostic model built from the 7 serum metabolic markers of the present invention performs somewhat worse than the model built from 25 serum metabolic markers, but it can still reliably determine whether a subject has esophageal cancer. It performs well not only for advanced esophageal cancer but also achieves good accuracy, sensitivity, and specificity for early esophageal cancer and carcinoma in situ. It can therefore effectively detect carcinoma in situ and early esophageal cancer that lack obvious symptoms, reducing the rate of missed diagnoses. This favors early detection and early treatment, helps improve prognosis and reduce mortality from esophageal cancer, and gives the model good value for clinical use and wider adoption.

Table 5. ROC analysis results for external validation of the esophageal cancer diagnostic model

8.3 Modeling with a combination of 5 serum metabolic markers as the diagnostic marker and using the model to diagnose esophageal cancer, as follows:

The resulting two-dimensional matrix data were randomly divided, with 4/5 used as training samples (training data) and the remaining 1/5 as external test samples (test data) (see Table 1). Five serum metabolic markers were used as the diagnostic marker: L-tyrosine, L-tryptophan, glycocholic acid, taurocholate, and cortisol. An early-diagnosis model for esophageal cancer was built on the training samples using random forest. The random forest was implemented with the randomForest package in R, with the modeling parameter ntree=5000; the modeling steps were the same as above.

When the constructed random forest model is used for diagnosis, the data for the 5 serum metabolic markers in the serum under test are fed into the model. If the voting result of the model classifier is greater than or equal to the diagnostic cut-off, the diagnosis is positive (esophageal squamous cell carcinoma present); if it is below the cut-off, the diagnosis is negative (esophageal squamous cell carcinoma absent). The diagnostic cut-off (Threshold) of this model is 0.4943.

The two-dimensional matrix data for the 5 serum metabolic markers of the external test samples were fed into the random forest model built above to obtain the predicted probability of esophageal cancer for each test sample. These predictions were compared with the actual pathological results (esophageal cancer or healthy) by ROC curve analysis (see Figure 7) to obtain the sensitivity, specificity, and area under the ROC curve (AUC) of the random forest model; the results are shown in Table 6. Figure 7 and Table 6 show that the esophageal cancer diagnostic model built above performs well: its AUC for esophageal cancer diagnosis is 0.84 (0.703-0.978), its sensitivity is 95%, and its specificity is 76.19%.

Further, ROC curve analysis was performed separately for each stage of esophageal cancer by comparing the predicted probabilities for the test samples of that stage with the actual pathological results (esophageal cancer or healthy), in order to evaluate the diagnostic performance of the model at different stages. The sensitivity, specificity, and AUC of the random forest model for the different stages are shown in Table 6 below. The table shows that these 5 serum metabolic markers exhibit different trends for carcinoma in situ, early cancer, and advanced cancer.

The data in the table show that the diagnostic model built from the 5 serum metabolic markers of the present invention performs somewhat worse than the models built from 25 and from 7 serum metabolic markers, but it can still reliably determine whether a subject has esophageal cancer. It performs well not only for advanced esophageal cancer but also achieves good accuracy, sensitivity, and specificity for early esophageal cancer and carcinoma in situ. It can therefore effectively detect carcinoma in situ and early esophageal cancer that lack obvious symptoms, reducing the rate of missed diagnoses. This favors early detection and early treatment, helps improve prognosis and reduce mortality from esophageal cancer, and gives the model good value for clinical use and wider adoption.

Table 6. ROC analysis results for external validation of the early diagnosis model of esophageal cancer

9. Conclusions

9.1 Any one of the 25 serum metabolic markers obtained in the present invention gives a fairly good diagnostic effect when used alone as a diagnostic marker for esophageal cancer, but combining multiple serum metabolic markers works better.

9.2 The three preferred diagnostic markers of the present invention (diagnostic markers A, B, and C) and the diagnostic models built from them have a good diagnostic effect on esophageal cancer and are of clinical value.

Validation shows that the diagnostic markers and diagnostic models obtained by the present invention have good application value and can be used clinically to diagnose esophageal cancer. The steps are as follows:

(1) Collect the serum to be tested, centrifuge it, and pretreat it following steps (1)-(4) of section 2.2 above, ready for injection;

(2) Analyze the pretreated serum sample by LC-MS following the steps of section 2.3 above to obtain the raw metabolic fingerprint;

(3) Preprocess the raw metabolic fingerprint according to the method of step 3 above and annotate the metabolite peaks to obtain the two-dimensional matrix for the serum under test;

(4) Using mass-to-charge ratio and retention time, select the information for the relevant diagnostic marker (diagnostic marker A, B, or C) from the two-dimensional matrix to obtain the diagnostic-marker two-dimensional matrix;

(5) Feed the diagnostic-marker two-dimensional matrix into the corresponding diagnostic model and determine whether esophageal squamous cell carcinoma is present from the value output by the model and the model's diagnostic cut-off (Threshold). If the value output by the model is greater than or equal to the cut-off, the diagnosis is positive (esophageal squamous cell carcinoma present); if it is below the cut-off, the diagnosis is negative (esophageal squamous cell carcinoma absent).
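Step (4) above, selecting marker columns by mass-to-charge ratio and retention time, can be sketched as a tolerance-based lookup. This is a minimal illustration: the tolerances (`mz_tol`, `rt_tol`), the marker names, and all m/z and retention-time values are hypothetical placeholders, since the source gives no matching tolerances in this section.

```python
def select_marker_columns(features, targets, mz_tol=0.01, rt_tol=10.0):
    """Map each target marker to the column index of its best-matching feature.

    features: list of (mz, rt_seconds), one entry per column of the 2D matrix.
    targets:  dict of marker name -> (mz, rt_seconds).
    Returns a dict of marker name -> column index, or None when nothing matches.
    """
    selected = {}
    for name, (target_mz, target_rt) in targets.items():
        candidates = [
            (abs(mz - target_mz), abs(rt - target_rt), idx)
            for idx, (mz, rt) in enumerate(features)
            if abs(mz - target_mz) <= mz_tol and abs(rt - target_rt) <= rt_tol
        ]
        # Prefer the smallest m/z error, then the smallest retention-time error.
        selected[name] = min(candidates)[2] if candidates else None
    return selected

# Hypothetical feature list and targets, for illustration only.
features = [(100.000, 60.0), (250.500, 240.0), (400.250, 410.0), (250.503, 300.0)]
targets = {"marker_1": (250.502, 242.0), "marker_2": (600.100, 430.0)}

columns = select_marker_columns(features, targets)
print(columns)
```

Requiring both m/z and retention time to match avoids picking up a different compound that happens to share nearly the same mass, as with the fourth feature in the toy data.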

In addition, to improve efficiency, serum samples from multiple people can be collected at the same time and numbered, and the samples can then go through LC-MS detection, spectrum preprocessing, metabolite peak annotation, selection of the diagnostic-marker two-dimensional matrix, and data import in a single batch.

In practice, more samples can be selected for modeling according to the modeling method of the present invention in order to improve the accuracy of the model. The above describes the present invention and does not limit it; other embodiments based on the concept of the present invention all fall within its scope of protection.

Claims (14)

1. A diagnostic marker suitable for early diagnosis of esophageal squamous cell carcinoma, characterized in that it comprises:
serum metabolic marker A: phospholipid PC(14:1(9Z)/P-18:1(11Z)), and
serum metabolic marker B: phospholipid PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)).

2. The diagnostic marker according to claim 1, characterized in that it further comprises serum metabolic marker C, the serum metabolic marker C being selected from one or more of the following serum metabolic markers: beta-alanyl-lysine (beta-Ala-Lys), L-carnosine (L-Carnosine), cis-9-hexadecenoic acid (cis-9-Palmitoleic acid), palmitic acid (Palmitic acid), oleic acid (Oleic Acid), lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), phospholipid PC(16:0/18:2(9Z,12Z)), linoleic acid (Linoleic acid), nicotinamide adenine dinucleotide (NADH), cortisol (Cortisol), L-tyrosine (L-Tyrosine), L-tryptophan (L-Tryptophan), glycocholic acid (Glycocholic Acid), taurocholate (Taurocholate), hypoxanthine (Hypoxanthine), allantoic acid (Allantoic acid), inosine (Inosine), sphingosine 1-phosphate (Sphingosine 1-phosphate), 3-O-Sulfogalactosylceramide (d18:1/20:0), and Lactosylceramide (d18:1/22:0).

3. The diagnostic marker according to claim 2, characterized in that it further comprises serum metabolic marker C, the serum metabolic marker C being selected from one or more of the following serum metabolic markers: lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), phospholipid PC(16:0/18:2(9Z,12Z)), nicotinamide adenine dinucleotide (NADH), cortisol (Cortisol), L-tryptophan (L-Tryptophan), taurocholate (Taurocholate), hypoxanthine (Hypoxanthine), inosine (Inosine), 3-O-Sulfogalactosylceramide (d18:1/20:0), and Lactosylceramide (d18:1/22:0).

4. The diagnostic marker according to claim 3, characterized in that it further comprises serum metabolic marker C, the serum metabolic marker C being selected from one or more of the following serum metabolic markers: lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), nicotinamide adenine dinucleotide (NADH), cortisol (Cortisol), taurocholate (Taurocholate), 3-O-Sulfogalactosylceramide (d18:1/20:0), and Lactosylceramide (d18:1/22:0).

5. The diagnostic marker according to claim 4, characterized in that it further comprises serum metabolic marker C, the serum metabolic marker C being selected from one or more of the following serum metabolic markers: lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), taurocholate (Taurocholate), and Lactosylceramide (d18:1/22:0).

6. The diagnostic marker according to claim 1, characterized in that it further comprises serum metabolic marker C, the serum metabolic marker C being selected from one or more of the following serum metabolic markers: lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), and phospholipid PC(16:0/18:2(9Z,12Z)).

7. The diagnostic marker according to claim 1, characterized in that it further comprises serum metabolic marker C, the serum metabolic marker C being selected from one or more of the following combinations:
Combination 1: beta-alanyl-lysine (beta-Ala-Lys) and L-carnosine (L-Carnosine);
Combination 2: cis-9-hexadecenoic acid (cis-9-Palmitoleic acid), palmitic acid (Palmitic acid), and oleic acid (Oleic Acid);
Combination 3: lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), and lysolecithin LysoPC(24:0);
Combination 4: phospholipid PC(16:0/18:2(9Z,12Z)) and linoleic acid (Linoleic acid);
Combination 5: nicotinamide adenine dinucleotide (NADH), L-tyrosine (L-Tyrosine), and L-tryptophan (L-Tryptophan);
Combination 6: cortisol (Cortisol), glycocholic acid (Glycocholic Acid), and taurocholate (Taurocholate);
Combination 7: hypoxanthine (Hypoxanthine), allantoic acid (Allantoic acid), and inosine (Inosine);
Combination 8: sphingosine 1-phosphate (Sphingosine 1-phosphate), 3-O-Sulfogalactosylceramide (d18:1/20:0), and Lactosylceramide (d18:1/22:0).

8. The diagnostic marker according to claim 1, characterized in that it further comprises serum metabolic marker C, the serum metabolic marker C being the combination of the following 23 serum metabolic markers: beta-alanyl-lysine (beta-Ala-Lys), L-carnosine (L-Carnosine), cis-9-hexadecenoic acid (cis-9-Palmitoleic acid), palmitic acid (Palmitic acid), oleic acid (Oleic Acid), lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), lysolecithin LysoPC(24:0), phospholipid PC(16:0/18:2(9Z,12Z)), linoleic acid (Linoleic acid), nicotinamide adenine dinucleotide (NADH), cortisol (Cortisol), L-tyrosine (L-Tyrosine), L-tryptophan (L-Tryptophan), glycocholic acid (Glycocholic Acid), taurocholate (Taurocholate), hypoxanthine (Hypoxanthine), allantoic acid (Allantoic acid), inosine (Inosine), sphingosine 1-phosphate (Sphingosine 1-phosphate), 3-O-Sulfogalactosylceramide (d18:1/20:0), and Lactosylceramide (d18:1/22:0).

9. The diagnostic marker according to claim 7, characterized in that:
beta-alanyl-lysine (beta-Ala-Lys) and L-carnosine (L-Carnosine) are closely related to the beta-alanine metabolism (beta-Alanine metabolism) pathway;
cis-9-hexadecenoic acid (cis-9-Palmitoleic acid), palmitic acid (Palmitic acid), and oleic acid (Oleic Acid) are closely related to the fatty acid biosynthesis (Fatty acid biosynthesis) pathway;
lysophosphatidic acid LPA(18:1(9Z)/0:0), lysolecithin LysoPC(14:0/0:0), lysolecithin LysoPC(18:2(9Z,12Z)), and lysolecithin LysoPC(24:0) are closely related to the glycerophospholipid metabolism (Glycerophospholipid metabolism) pathway;
phospholipid PC(14:1(9Z)/P-18:1(11Z)), phospholipid PC(16:0/18:2(9Z,12Z)), phospholipid PC(24:1(15Z)/22:6(4Z,7Z,10Z,13Z,16Z,19Z)), and linoleic acid (Linoleic acid) are closely related to two pathways, glycerophospholipid metabolism (Glycerophospholipid metabolism) and linoleic acid metabolism (Linoleic acid metabolism);
nicotinamide adenine dinucleotide (NADH) is closely related to the oxidative phosphorylation (Oxidative phosphorylation) pathway;
L-tyrosine (L-Tyrosine) and L-tryptophan (L-Tryptophan) are closely related to the phenylalanine, tyrosine and tryptophan biosynthesis (Phenylalanine, tyrosine and tryptophan biosynthesis) pathway;
glycocholic acid (Glycocholic Acid) and taurocholate (Taurocholate) are closely related to the primary bile acid biosynthesis (Primary bile acid biosynthesis) pathway;
cortisol (Cortisol) is closely related to the pathways in cancer and bile secretion (Pathways in cancer, and Bile secretion) pathways;
hypoxanthine (Hypoxanthine), allantoic acid (Allantoic acid), and inosine (Inosine) are closely related to the purine metabolism (Purine metabolism) pathway;
sphingosine 1-phosphate (Sphingosine 1-phosphate), 3-O-Sulfogalactosylceramide (d18:1/20:0), and Lactosylceramide (d18:1/22:0) are closely related to the sphingolipid metabolism (Sphingolipid metabolism) pathway.

10. A method for screening the diagnostic marker suitable for early diagnosis of esophageal squamous cell carcinoma according to any one of claims 1-9, characterized in that it comprises the following steps:
(1) Collect serum samples from patients with esophageal squamous cell carcinoma and from healthy people as analysis samples, the esophageal squamous cell carcinoma serum samples including serum samples of esophageal carcinoma in situ, early esophageal cancer, and advanced esophageal cancer;
(2) Analyze each analysis sample by LC-MS serum metabolomics to obtain the raw metabolic fingerprint of each serum sample;
(3) Using the R package XCMS, preprocess the raw metabolic fingerprints of the esophageal squamous cell carcinoma serum samples and the healthy serum samples separately to obtain a two-dimensional matrix in which each row is an analysis sample and each column is metabolite information, and annotate the metabolite peaks in the two-dimensional matrix with the R package CAMERA for further statistical analysis;
(4) Subject the two-dimensional matrix from step (3) to principal component analysis and then to partial least squares discriminant analysis to obtain a PLS-DA model, the PLS-DA model showing a difference in metabolic pattern and a clear classification trend between patients with esophageal squamous cell carcinoma and healthy people;
(5) Based on the PLS-DA model obtained above, screen for differential metabolites using the variable importance score (VIP) from PLS-DA modeling and a univariate non-parametric test, the screening criteria being VIP ≥ 1 and a q value below 0.05 after false discovery rate (FDR) correction for multiple testing;
(6) Use the R package CAMERA to determine the quasi-molecular ion, adduct, and isotope information of the differential metabolites obtained by the above screening, yielding potential metabolic markers;
(7) On the basis of the potential metabolic markers, combine their first-order and second-order mass spectrometry information, quasi-molecular ion information, adduct information, and isotope information to infer the molecular mass and molecular formula of the diagnostic markers, and compare and match them against existing standard compounds to obtain the diagnostic marker suitable for early diagnosis of esophageal squamous cell carcinoma.

11. The screening method according to claim 10, characterized in that: during LC-MS serum metabolomics analysis, one quality control sample is added for every 10 analysis samples in order to monitor in real time the quality of the analysis samples from pre-injection processing through analysis, the quality control sample being a pooled mixture of 5 esophageal cancer serum samples and 5 healthy serum samples.

12. The screening method according to claim 10, characterized in that the analysis samples and quality control samples undergo the following pretreatment before injection:
(1) Draw 50 μl of analysis sample or quality control sample with a pipette and place it in a 96-well plate of the Bravo automated specimen processing system;
(2) Add 150 μl of methanol for extraction, vortex for 30 s, and incubate at -20°C to precipitate protein;
(3) Centrifuge in a high-speed centrifuge at 4°C and 4000 rpm for 20 min;
(4) Transfer the supernatant from step (3) into an LC-MS sample vial and store at -80°C until LC-MS detection.

13. The screening method according to claim 10, characterized in that preprocessing the raw metabolic fingerprint means: converting the raw metabolic fingerprint into mzData files with the Masshunter software, then preprocessing the mzData files with the XCMS package, including retention time correction, peak detection, peak matching, and peak alignment, to obtain the two-dimensional matrix; and the metabolite peak annotation performed on the two-dimensional matrix with the R package CAMERA includes annotation of isotope peaks, adducts, and fragment ions.

14. The screening method according to claim 10, characterized in that, when each analysis sample is analyzed by LC-MS serum metabolomics:
the liquid chromatography column is a Waters ACQUITY UPLC HSS T3 column, 100 mm × 2.1 mm, 1.8 μm; the injection volume is 6 µL, the injection temperature is 4°C, and the flow rate is 0.5 ml/min; the mobile phase consists of two solvents, A and B: in positive-ion ESI+ mode A is 0.1 wt% formic acid in water, in negative-ion ESI- mode A is 0.5 mmol/L ammonium fluoride in water, in positive-ion ESI+ mode B is 0.1 wt% formic acid in acetonitrile, and in negative-ion ESI- mode B is pure acetonitrile; the gradient elution conditions are: 1% B for 0-1 min, a gradual increase from 1% B to 100% B over 1-8 min, a rapid decrease from 100% B to 1% B over 10-10.1 min, and then 1% B held for 1.9 min;
mass spectrometric detection uses a quadrupole time-of-flight mass spectrometer (Q-TOF) with an electrospray ion source in positive-ion mode (ESI+) and negative-ion mode (ESI-); the ion source temperature is 400°C, the cone gas flow is 12 L/min, the desolvation gas temperature is 250°C, and the desolvation gas flow is 16 L/min; the capillary voltage is +3 kV in positive-ion mode and -3 kV in negative-ion mode, and the cone voltage is 0 V in both modes; the cone pressure is 20 psi in positive-ion mode and 40 psi in negative-ion mode; the mass-to-charge range for data acquisition is 50-1200 m/z, and the acquisition scan rate is 0.25 s.
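As an illustration of the screening criterion in step (5) of claim 10 (VIP ≥ 1 and FDR-corrected q < 0.05), the sketch below filters features by VIP score and by q-values computed with the Benjamini-Hochberg procedure. The choice of Benjamini-Hochberg is an assumption, since the claims name only "FDR" correction, and the VIP scores and p-values are made-up toy values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

def select_differential(vip, p_values, vip_min=1.0, q_max=0.05):
    """Indices of features with VIP >= vip_min and q < q_max."""
    q_values = benjamini_hochberg(p_values)
    return [i for i in range(len(vip))
            if vip[i] >= vip_min and q_values[i] < q_max]

# Toy values for illustration only.
vip = [1.5, 0.8, 1.2, 2.0]
p_values = [0.001, 0.01, 0.03, 0.2]

q_values = benjamini_hochberg(p_values)
selected = select_differential(vip, p_values)
print(q_values, selected)
```

Both conditions must hold: the second feature passes the q-value test but fails on VIP, and the fourth has a high VIP but is not significant after correction.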
CN201510497914.8A 2015-08-14 2015-08-14 A kind of diagnostic marker and its screening technique for being suitable for esophageal squamous cell carcinoma early diagnosis Active CN105044361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510497914.8A CN105044361B (en) 2015-08-14 2015-08-14 A kind of diagnostic marker and its screening technique for being suitable for esophageal squamous cell carcinoma early diagnosis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510497914.8A CN105044361B (en) 2015-08-14 2015-08-14 A kind of diagnostic marker and its screening technique for being suitable for esophageal squamous cell carcinoma early diagnosis

Publications (2)

Publication Number Publication Date
CN105044361A CN105044361A (en) 2015-11-11
CN105044361B true CN105044361B (en) 2017-07-28

Family

ID=54451064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510497914.8A Active CN105044361B (en) 2015-08-14 2015-08-14 A kind of diagnostic marker and its screening technique for being suitable for esophageal squamous cell carcinoma early diagnosis

Country Status (1)

Country Link
CN (1) CN105044361B (en)

CN116699035A (en) * 2023-06-29 2023-09-05 长治医学院 Application of gibberellin A34, 12-hydroxydodecanoic acid in diagnosis reagent of esophageal squamous cell carcinoma
CN117147812B (en) * 2023-10-26 2024-01-16 中日友好医院(中日友好临床医学研究所) Sphingolipid metabolism marker as well as analysis method and application thereof
CN117147737B (en) * 2023-10-27 2024-02-02 中国医学科学院药物研究所 Plasma combined marker for esophageal squamous carcinoma diagnosis, kit and detection method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103197006A (en) * 2013-03-26 2013-07-10 中国药科大学 Method for determining serous metabolic biomarker of heroin abuse crowd
CN103604875A (en) * 2013-06-08 2014-02-26 江苏警官学院 Method for measuring serum metabolism markers in methylamphetamine abusers

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1310033C (en) * 2004-03-23 2007-04-11 中国医学科学院肿瘤医院肿瘤研究所 New method for detecting esophageal cancer haemocyanin fingerprint
US8653006B2 (en) * 2010-09-03 2014-02-18 Purdue Research Foundation Metabolite biomarkers for the detection of esophageal cancer using NMR
EP2444464A1 (en) * 2010-10-21 2012-04-25 Centre National de la Recherche Scientifique (CNRS) Novel neutral (bio)material
EP3076979B1 (en) * 2013-12-05 2020-04-29 University of Miami Compositions and methods for reducing intraocular pressure
CN104713971B (en) * 2015-04-01 2017-03-29 山东省肿瘤医院 Method for analyzing serum metabolomics using a serum metabolomics analysis model for esophageal cancer primary screening
CN104713970B (en) * 2015-04-01 2017-03-15 山东省肿瘤医院 Construction method for a serum metabolomics analysis model
CN104713969B (en) * 2015-04-01 2017-04-12 山东省肿瘤医院 Construction method for serum metabonomics analysis model for esophagus cancer primary screening

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Biomarker identification and pathway analysis by serum metabolomics of childhood acute lymphoblastic leukemia; Yunnuo Bai, et al.; Clinica Chimica Acta; 2014-06-05; Vol. 436; 207–216 *

Also Published As

Publication number Publication date
CN105044361A (en) 2015-11-11

Similar Documents

Publication Publication Date Title
CN105044361B (en) A kind of diagnostic marker and its screening technique for being suitable for esophageal squamous cell carcinoma early diagnosis
CN109884302B (en) Early diagnosis markers and application of lung cancer based on metabolomics and artificial intelligence technology
WO2023082820A1 (en) Marker for lung adenocarcinoma diagnosis and application thereof
CN103616450B (en) A kind of Serum of Patients with Lung Cancer specific metabolic production spectra and method for building up thereof
CN108414660B (en) Application of group of plasma metabolism small molecule markers related to early diagnosis of lung cancer
CN102323351B (en) Bladder cancer patient urine specific metabolite spectrum, establishing method and application
Liang et al. Metabolomic analysis using liquid chromatography/mass spectrometry for gastric cancer
Flatley et al. MALDI mass spectrometry in prostate cancer biomarker discovery
CN111562338B (en) Application of clear renal cell carcinoma metabolic markers in early screening and diagnostic products for renal cell carcinoma
Liang et al. Serum metabolomics uncovering specific metabolite signatures of intra-and extrahepatic cholangiocarcinoma
CN102332387B (en) Biological tissue direct-spray mass spectrum device and analysis method
Kałużna-Czaplińska et al. Current applications of chromatographic methods for diagnosis and identification of potential biomarkers in cancer
CN105044240B (en) A kind of diagnostic marker for being suitable for esophageal squamous cell carcinoma early diagnosis
CN105044343B (en) A method for constructing a diagnostic model of esophageal squamous cell carcinoma, the resulting diagnostic model, and a method for using the model
Buszewska-Forajta et al. New approach in determination of urinary diagnostic markers for prostate cancer by MALDI-TOF/MS
Li et al. A pilot study for colorectal carcinoma screening by instant metabolomic profiles using conductive polymer spray ionization mass spectrometry
CN113406226B (en) Method for detecting imatinib metabolite in plasma of GIST patient based on non-targeted metabonomics
CN116106453B (en) Application of D-sorbitol in screening of esophageal squamous cell carcinoma
CN113567585A (en) A peripheral blood-based screening marker and kit for esophageal squamous cell carcinoma
Zhang et al. Altered phosphatidylcholines expression in sputum for diagnosis of non-small cell lung cancer
CN105044342B (en) A kind of diagnostic marker for being suitable for cancer of the esophagus early diagnosis
Zou et al. Small molecules as potential biomarkers of early gastric cancer: A mass spectrometry imaging approach
JP7650375B2 (en) Biomarker composition for diagnosing oral cancer comprising acylcarnitine metabolites - Patent Application 20100223633
CN113484518B (en) Diagnostic biomarker for distinguishing lung diseases
Xu et al. Discovery of potential therapeutic targets for non-small cell lung cancer using high-throughput metabolomics analysis based on liquid chromatography coupled with tandem mass spectrometry

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant