KR102523938B1 - Modeling method for a homologous recombination deficiency prediction model - Google Patents
Modeling method for a homologous recombination deficiency prediction model
- Publication number
- KR102523938B1 (application KR1020220014718A)
- Authority
- KR
- South Korea
- Prior art keywords
- hrd
- data
- gene expression
- modeling method
- cancer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2537/00—Reactions characterised by the reaction format or use of a specific feature
- C12Q2537/10—Reactions characterised by the reaction format or use of a specific feature the purpose or use of
- C12Q2537/165—Mathematical modelling, e.g. logarithm, ratio
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Chemical & Material Sciences (AREA)
- Medical Informatics (AREA)
- Genetics & Genomics (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Organic Chemistry (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Software Systems (AREA)
- Biotechnology (AREA)
- Immunology (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Wood Science & Technology (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Zoology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Analytical Chemistry (AREA)
- Evolutionary Biology (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- Oncology (AREA)
- Hospice & Palliative Care (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Microbiology (AREA)
- Public Health (AREA)
- Databases & Information Systems (AREA)
- Bioethics (AREA)
- Biochemistry (AREA)
Abstract
Description
Fig. 2 is an ROC graph of HRD status for the training set.
Fig. 3 is an ROC graph of HRD status for the test set.
Claims (7)
- A modeling method for an HRD prediction model, comprising:
a) collecting mRNA gene expression data and HRD data by downloading HRD data, which is TCGA (The Cancer Genome Atlas) pan-cancer data, from the Genomic Data Commons (GDC) website of the NCI (National Cancer Institute), and downloading the gene expression data provided by TCGA;
b) extracting a target variable from the HRD data by filtering with the median absolute deviation, and extracting predictor variables from the mRNA gene expression data by using the median absolute deviation to remove data of low relative importance; and
c) modeling a penalized logistic regression model using the target variable and the predictor variables,
wherein step c) comprises,
as a data preprocessing process, determining two hyperparameters, namely the amount of regularization (λ) and the proportion (α) of the LASSO (Least Absolute Shrinkage and Selection Operator) penalty, and
performing hyperparameter optimization using a grid search for model selection in terms of AUPR (Area Under the Precision-Recall curve).
- (Deleted)
- (Deleted)
- (Deleted)
- The modeling method of claim 1,
wherein, in step c),
modeling is performed using 3/4 of the predictor variables as a training set and the remainder as a test set.
- (Deleted)
- The modeling method of claim 1,
wherein λ = {10^-5, 10^-4, 10^-3, 10^-2, 10^-1, 10^0} and α = {0.0, 0.25, 0.5, 0.75, 1.0}.
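The steps recited in the claims above can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the data are synthetic, all function names (`mad`, `top_mad_columns`, `fit_elastic_net_logistic`, `aupr`, `run_demo`) are hypothetical, and several details are assumptions the claims do not state. In particular, the penalty is assumed to take the elastic-net form λ(α‖w‖₁ + (1 − α)/2 ‖w‖₂²), the solver is a plain proximal gradient descent chosen for transparency, the 3/4 split is read as a split over samples, and the grid search scores candidates on a validation subset held out from the training set.

```python
import math
import random
from statistics import median

# Grids as listed in the last claim (reading the final lambda value as 10^0).
LAMBDAS = [10.0 ** e for e in range(-5, 1)]
ALPHAS = [0.0, 0.25, 0.5, 0.75, 1.0]

def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median(abs(v - m) for v in values)

def top_mad_columns(X, k):
    """Indices (ascending) of the k columns with the largest MAD."""
    scores = [mad([row[j] for row in X]) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]

def fit_elastic_net_logistic(X, y, lam, alpha, lr=0.1, epochs=150):
    """Proximal gradient descent on mean log-loss plus
    lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        errors = [pi - yi for pi, yi in zip(predict_proba((w, b), X), y)]
        b -= lr * sum(errors) / n
        for j in range(p):
            grad = sum(e * row[j] for e, row in zip(errors, X)) / n
            # Gradient step on the smooth part (log-loss + ridge term) ...
            step = w[j] - lr * (grad + lam * (1.0 - alpha) * w[j])
            # ... then soft-thresholding for the LASSO term.
            w[j] = math.copysign(max(abs(step) - lr * lam * alpha, 0.0), step)
    return w, b

def aupr(y_true, scores):
    """Average precision; tied scores are ranked in input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def run_demo(seed=0, n_samples=80, n_genes=12, n_keep=6):
    rng = random.Random(seed)
    # Synthetic stand-in for expression data: genes 0-2 vary widely and drive the label.
    X = [[rng.gauss(0.0, 2.0 if j < 3 else 0.3) for j in range(n_genes)]
         for _ in range(n_samples)]
    y = [1 if r[0] + r[1] - r[2] + rng.gauss(0.0, 0.5) > 0 else 0 for r in X]

    keep = top_mad_columns(X, n_keep)          # drop low-MAD genes
    X = [[row[j] for j in keep] for row in X]

    idx = list(range(n_samples))
    rng.shuffle(idx)
    cut = (3 * n_samples) // 4                 # 3/4 training, remainder test
    train, test = idx[:cut], idx[cut:]
    inner = (3 * len(train)) // 4              # validation split (an assumption)
    fit_idx, val_idx = train[:inner], train[inner:]

    def rows(ids):
        return [X[i] for i in ids], [y[i] for i in ids]

    (X_fit, y_fit), (X_val, y_val) = rows(fit_idx), rows(val_idx)
    best = None
    for lam in LAMBDAS:                        # grid search scored by AUPR
        for alpha in ALPHAS:
            model = fit_elastic_net_logistic(X_fit, y_fit, lam, alpha)
            score = aupr(y_val, predict_proba(model, X_val))
            if best is None or score > best[0]:
                best = (score, lam, alpha)

    X_train, y_train = rows(train)
    X_test, y_test = rows(test)
    final = fit_elastic_net_logistic(X_train, y_train, best[1], best[2])
    return {"lambda": best[1], "alpha": best[2], "val_aupr": best[0],
            "test_aupr": aupr(y_test, predict_proba(final, X_test))}

if __name__ == "__main__":
    print(run_demo())
```

With α = 0 the penalty reduces to a pure ridge term and with α = 1 to a pure LASSO term, which is why the α grid spans 0.0 to 1.0. In practice a library solver and cross-validation would likely replace the hand-written loop and the single validation split used here.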
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
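KR1020220014718A KR102523938B1 (ko) | 2022-02-04 | 2022-02-04 | Modeling method for a homologous recombination deficiency prediction model |
PCT/KR2023/000172 WO2023149672A1 (ko) | 2022-02-04 | 2023-01-04 | Modeling method for a homologous recombination deficiency prediction model |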
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020220014718A KR102523938B1 (ko) | 2022-02-04 | 2022-02-04 | Modeling method for a homologous recombination deficiency prediction model |
Publications (1)
Publication Number | Publication Date |
---|---|
KR102523938B1 (ko) | 2023-04-19 |
Family
ID=86142267
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020220014718A (Active) KR102523938B1 (ko) | Modeling method for a homologous recombination deficiency prediction model | 2022-02-04 | 2022-02-04 |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR102523938B1 (ko) |
WO (1) | WO2023149672A1 (ko) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015200873A1 (en) * | 2014-06-26 | 2015-12-30 | Icahn School Of Medicine At Mount Sinai | Methods for diagnosing risk of renal allograft fibrosis and rejection |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9850542B2 (en) * | 2013-03-04 | 2017-12-26 | Board Of Regents, The University Of Texas System | Gene signature to predict homologous recombination (HR) deficient cancer |
CA2908745C (en) * | 2013-04-05 | 2023-03-14 | Myriad Genetics, Inc. | Methods and materials for assessing homologous recombination deficiency |
ES2946251T3 (es) * | 2014-08-15 | 2023-07-14 | Myriad Genetics Inc | Methods and materials for assessing homologous recombination deficiency |
JP7224185B2 (ja) * | 2016-05-01 | 2023-02-17 | ゲノム・リサーチ・リミテッド | Method of characterizing a DNA sample |
- 2022-02-04: KR application KR1020220014718A, patent KR102523938B1 (ko), status: Active
- 2023-01-04: WO application PCT/KR2023/000172, publication WO2023149672A1 (ko), status: Application Filing
Non-Patent Citations (1)
Title |
---|
medRxiv preprint (https://doi.org/10.1101/2021.12.20.21267985) * |
Also Published As
Publication number | Publication date |
---|---|
WO2023149672A1 (ko) | 2023-08-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ross et al. | | Tissue-based genomics augments post-prostatectomy risk stratification in a natural history cohort of intermediate- and high-risk men |
Freedland et al. | | Utilization of a genomic classifier for prediction of metastasis following salvage radiation therapy after radical prostatectomy |
Sauter et al. | | Integrating tertiary Gleason 5 patterns into quantitative Gleason grading in prostate biopsies and prostatectomy specimens |
US20220064737A1 (en) | | Detecting cancer, cancer tissue of origin, and/or a cancer cell type |
JP2019521673A5 (ko) | | |
AU2020212057A1 (en) | | Detecting cancer, cancer tissue of origin, and/or a cancer cell type |
CN108504555B (zh) | | Device and method for identifying and evaluating tumor progression |
CN111161882A (zh) | | Breast cancer survival prediction method based on a deep neural network |
Qian et al. | | Radiogenomics of lower-grade gliomas: a radiomic signature as a biological surrogate for survival prediction |
CN108475300B (zh) | | Method and system for customized drug selection using genomic sequence mutation information and survival information of cancer patients |
KR20210073526A (ko) | | Transcription factor profiling |
WO2020132544A1 (en) | | Anomalous fragment detection and classification |
JP2024528489A (ja) | | Systems and methods for classifying homologous repair deficiency |
CN107851136B (zh) | | Systems and methods for prioritizing variants of unknown significance |
KR102523938B1 (ko) | | Modeling method for a homologous recombination deficiency prediction model |
Kikutake et al. | | Pan-cancer analysis of intratumor heterogeneity associated with patient prognosis using multidimensional measures |
KR20240073026A (ko) | | Methylated fragment probabilistic noise model with noise region filtering |
Cattelani et al. | | Triple and quadruple optimization for feature selection in cancer biomarker discovery |
CN113724782A (zh) | | Method for screening disease prognosis markers based on alternative polyadenylation sites |
Toh et al. | | Analysis of copy number variation from germline DNA can predict individual cancer risk |
Liu et al. | | A support vector machine model predicting the risk of duodenal cancer in patients with familial adenomatous polyposis at the transcript levels |
KR102737606B1 (ko) | | Method for constructing cancer patient-specific networks based on second-order partial correlation between gene expression and DNA methylation |
Ho et al. | | Evolutionary learning-derived lncRNA signature with biomarker discovery for predicting stage of colon adenocarcinoma |
KR20230118307A (ko) | | Modeling method for a BRAF variant prediction model |
Kim et al. | | Inferring modes of evolution from colorectal cancer with residual polyp of origin |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PA0109 | Patent application | Patent event code: PA01091R01D; Comment text: Patent Application; Patent event date: 20220204 |
 | PA0201 | Request for examination | |
 | PA0302 | Request for accelerated examination | Patent event date: 20220216; Patent event code: PA03022R01D; Comment text: Request for Accelerated Examination. Patent event date: 20220204; Patent event code: PA03021R01I; Comment text: Patent Application |
 | PE0902 | Notice of grounds for rejection | Comment text: Notification of reason for refusal; Patent event date: 20221020; Patent event code: PE09021S01D |
 | E701 | Decision to grant or registration of patent right | |
 | PE0701 | Decision of registration | Patent event code: PE07011S01D; Comment text: Decision to Grant Registration; Patent event date: 20230328 |
 | GRNT | Written decision to grant | |
 | PR0701 | Registration of establishment | Comment text: Registration of Establishment; Patent event date: 20230417; Patent event code: PR07011E01D |
 | PR1002 | Payment of registration fee | Payment date: 20230417; End annual number: 3; Start annual number: 1 |
 | PG1601 | Publication of registration | |