
CN108268752B - A chromosomal abnormality detection device - Google Patents


Info

Publication number
CN108268752B
Authority
CN
China
Prior art keywords
coverage
value
window
cnv
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810047686.8A
Other languages
Chinese (zh)
Other versions
CN108268752A (en)
Inventor
糜庆丰
彭春方
张娟
赵宇
陈样宜
饶兴蔷
罗东红
黄铨飞
刘丽菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CapitalBio Genomics Co Ltd
Original Assignee
CapitalBio Genomics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CapitalBio Genomics Co Ltd
Priority to CN201810047686.8A
Publication of CN108268752A
Application granted
Publication of CN108268752B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Mining & Analysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)

Abstract

The invention discloses a chromosomal abnormality detection device. Existing chromosomal abnormality analysis is based on a read-count statistical model, which can only remove duplicate sequences aligned to the same start position in the genome and cannot remove reads that have different start positions but overlap one another. By introducing a sequence coverage statistical model, the device effectively removes the duplicate sequences and overlapping regions caused by the amplification bias of single-cell whole-genome amplification. This markedly improves data uniformity, which in turn reduces data noise, raises the detection rate for positive samples, and lowers the false positive rate.

Description

A chromosomal abnormality detection device
Technical field
The present invention relates to data processing technology, and in particular to a chromosomal abnormality detection device.
Background art
In recent years, more and more patients have turned to assisted reproduction to achieve pregnancy. Extensive clinical experience shows that, during assisted reproduction, the embryos of some high-risk couples are prone to repeated implantation failure or unexplained spontaneous abortion, and that the overall live birth rate of in vitro fertilization (IVF) is below 30%. Studies have found that embryonic chromosomal abnormality is the main cause of IVF failure. Performing preimplantation chromosomal abnormality detection on embryos and then selecting healthy embryos for implantation can therefore significantly improve the pregnancy rate and live birth rate of IVF.
Preimplantation chromosomal abnormality detection requires single-cell amplification of blastocyst trophectoderm cells or blastomeres so that the DNA reaches the input amount required by a high-throughput sequencing platform, that is, from picogram-level DNA to microgram-level DNA. Current mainstream single-cell amplification methods fall into three classes by principle: PCR-based single-cell amplification (such as DOP-PCR) [1], multiple displacement amplification (MDA) [2], and multiple annealing and looping-based amplification cycles (MALBAC) [3]. Because all of these methods use dozens of rounds of exponential amplification, the amplification bias at certain positions in the genome is amplified without limit, producing large numbers of duplicate sequences (duplicate reads). This markedly reduces the uniformity of sequencing depth and ultimately leads to many outliers and false positive results in the sample analysis. Removing the duplicate sequences caused by amplification bias is therefore very important for preimplantation chromosomal abnormality detection based on single-cell amplification.
At present, chromosomal abnormality analysis of embryos is based on read counting (reads number): the reads generated by sequencing are aligned to the reference genome; reads aligned to the same start position in the genome (duplicate reads) are filtered out; the reference genome is divided into N fixed-length statistical windows and the number of reads in each window is counted; the read counts are GC-corrected; the read counts are normalized and converted into read ratios (reads ratio); finally, the read ratios across the genome are analyzed statistically to determine whether the embryo under test carries a chromosomal abnormality. In removing duplicate sequences, this workflow can only remove duplicate reads aligned to the same start position in the genome; reads with different start positions that overlap one another cannot be removed effectively. A more effective duplicate-removal method is therefore needed to improve the accuracy of chromosomal abnormality detection based on single-cell whole-genome amplification.
References
[1] Telenius H, Carter NP, Bebb CE, et al. Degenerate oligonucleotide-primed PCR: general amplification of target DNA by a single degenerate primer [J]. Genomics, 1992, 13(3): 718-725.
[2] Dean FB, Nelson JR, Giesler TL, et al. Rapid amplification of plasmid and phage DNA using Phi 29 DNA polymerase and multiply-primed rolling circle amplification [J]. Genome Research, 2001, 11(6): 1095-1099.
[3] Zong C, Lu S, Chapman AR, et al. Genome-wide detection of single-nucleotide and copy-number variations of a single human cell [J]. Science, 2012, 338(6114): 1622-1626.
[4] Olshen AB, Venkatraman ES, Lucito R, et al. Circular binary segmentation for the analysis of array-based DNA copy number data [J]. Biostatistics, 2004, 5(4): 557-572.
[5] Venkatraman ES, Olshen AB. A faster circular binary segmentation algorithm for the analysis of array CGH data [J]. Bioinformatics, 2007, 23(6): 657-663.
Summary of the invention
To solve the above technical problem, the object of the present invention is to provide a chromosomal abnormality detection device.
The technical solution adopted by the invention is as follows:
A chromosomal abnormality detection device, comprising:
Sequencing data acquisition unit: obtains the read fragments produced by high-throughput sequencing;
Alignment unit: aligns the read fragments to the human genome reference sequence and obtains the position and length of each read fragment;
Coverage calculation unit: divides the human genome reference sequence into a number of first windows; calculates the coverage of each first window from the position and length of the read fragments; performs Loess correction based on the coverage and GC content of the first windows; merges several consecutive first windows into a second window; and calculates the Loess-corrected coverage of each second window and its coverage ratio;
Candidate CNV identification unit: identifies chromosome breakpoint positions with the circular binary segmentation (CBS) algorithm, calculates the CBS ratio between adjacent breakpoints, and identifies candidate CNV regions according to a CBS ratio threshold;
False positive filtering unit: calculates the significance level (P-value) of the CBS ratio of each candidate CNV region and filters out false positive regions according to the P-value, giving the CNV regions and the karyotype result of the sample under test.
In particular, coverage = total number of bases covered within the interval / length of the interval; coverage ratio = coverage of the interval / coverage of all autosomes.
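The two definitions above can be written compactly as follows. This is only a restatement in symbols; reading "bases covered" as bases covered by at least one read is an interpretation based on the stated goal of discarding overlap between reads, and the symbol CR for the coverage ratio is chosen here for brevity.

```latex
\[
\mathrm{coverage}(I) \;=\; \frac{\#\{\text{bases in } I \text{ covered by at least one read}\}}{|I|},
\qquad
\mathrm{CR}(I) \;=\; \frac{\mathrm{coverage}(I)}{\mathrm{coverage}(\text{all autosomes})}
\]
```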
In particular, the CBS ratio is the mean coverage ratio of all second windows between adjacent breakpoints identified by the circular binary segmentation algorithm.
In the coverage calculation unit, the first window is a non-overlapping interval of 10 to 50 Kb; preferably, the first window is a non-overlapping interval of 20 Kb.
In the coverage calculation unit, the second window length is 0.1 to 2 Mb; preferably, the second window length is selected from 100 Kb, 500 Kb, and 1 Mb.
Preferably, in the candidate CNV identification unit, the CBS ratio threshold is [1.4, 2.6]; a region whose CBS ratio falls outside this range is identified as a candidate CNV region.
Preferably, in the false positive filtering unit, calculating the P-value comprises:
Building a random sampling database from the results of normal reference samples; drawing from it, at least 100,000 times, simulated CBS segments of the same length as the candidate CNV region; obtaining the density distribution curve of the simulated CBS ratio values; and calculating the significance level (P-value) of the CBS ratio of the candidate CNV region.
Preferably, in the false positive filtering unit, a candidate CNV region with P-value < 0.001 is identified as a CNV region; otherwise, it is filtered out as a false positive region.
Further, the device also includes a sequencing unit:
It is connected to the sequencing data acquisition unit and performs high-throughput sequencing on a library constructed from the sample; the sample may be one that has undergone single-cell amplification, one that has undergone PCR pre-amplification, or one that has not undergone PCR pre-amplification.
Further, the device also includes a filtering unit:
It is connected to the alignment unit and, according to the alignment result, rejects read fragments located at tandem repeat and transposon repeat positions, as well as read fragments that are of low quality, align to multiple locations, or do not fully match the chromosome.
The beneficial effects of the present invention are as follows:
Existing chromosomal abnormality analysis is based on a read-count statistical model, which can only remove duplicate sequences aligned to the same start position in the genome and cannot remove reads that have different start positions but overlap one another. By introducing a sequence coverage statistical model, the device of the invention effectively removes the duplicate sequences and overlapping regions caused by the amplification bias of single-cell whole-genome amplification. This markedly improves data uniformity, which in turn reduces data noise, raises the detection rate for positive samples, and lowers the false positive rate.
Brief description of the drawings
Fig. 1 is a schematic diagram of the chromosomal abnormality detection workflow;
Fig. 2 shows the distribution of copy number values across the 24 chromosomes of sample T1 at 1 Mb resolution; panel A shows the result of the traditional read-counting method, and panel B shows the result of the coverage-based method provided by the invention;
Fig. 3 shows the distribution of copy number values across the 24 chromosomes of sample T8 at 1 Mb resolution; panel A shows the result of the traditional read-counting method, and panel B shows the result of the coverage-based method provided by the invention;
Fig. 4 shows the distribution of copy number values across the 24 chromosomes of sample T19 at 1 Mb resolution; panel A shows the result of the traditional read-counting method, and panel B shows the result of the coverage-based method provided by the invention;
Fig. 5 shows the distribution of copy number values across the 24 chromosomes of sample T2 at 1 Mb resolution; panel A shows the result of the traditional read-counting method, and panel B shows the result of the coverage-based method provided by the invention.
Detailed description of the embodiments
Concept of the invention: for samples with a low starting DNA content (such as single-cell samples), exponential single-cell amplification raises the DNA amount from the picogram level to the microgram level, and in the process amplification bias is often amplified without limit. This produces large numbers of duplicate sequences (duplicate reads) and leaves the sample with poor uniformity. In removing duplicate sequences, the traditional read-counting method of chromosomal abnormality analysis can only remove duplicate reads aligned to the same start position in the genome; reads with different start positions that overlap one another cannot be removed effectively. As a result, the number of sequencing reads that the traditional method counts in genomic regions favored by amplification differs from the number counted in regions that are not favored, so that the sequence ratio of some regions in the analysis result is markedly higher (or lower) than normal and false positive results appear. To keep single-cell amplification from affecting the detection result, the device of the invention detects chromosomal abnormalities with a coverage-based statistical model. Because this model effectively removes the overlapping regions between adjacent sequencing reads, it reduces the false positive results caused by single-cell amplification bias and achieves highly accurate chromosomal abnormality detection. It follows from this concept that the device is suitable not only for preimplantation screening of trace samples that require single-cell amplification, but also for chromosomal abnormality detection in samples that require PCR pre-amplification, such as miscarriage tissue, and for bulk samples that need no PCR pre-amplification. The device is a general-purpose chromosomal abnormality detection device, and it performs especially well on samples affected by PCR amplification bias.
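The contrast between the two statistics can be illustrated with a toy example. The sketch below is not taken from the patent: the read coordinates, the 100 bp read length, and the function names are all illustrative assumptions. It compares the read count left after start-position duplicate removal with the number of bases covered by at least one read.

```python
def dedup_by_start(reads):
    """Keep one read per start position (the traditional duplicate filter)."""
    seen, kept = set(), []
    for start, end in reads:
        if start not in seen:
            seen.add(start)
            kept.append((start, end))
    return kept


def covered_bases(reads):
    """Count bases covered by at least one read (half-open intervals)."""
    total, cur_start, cur_end = 0, None, None
    for start, end in sorted(reads):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)   # overlapping read: extend the block
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Toy reads as (start, end) on one chromosome; all values are illustrative.
even_region = [(0, 100), (200, 300), (400, 500)]
# Over-amplified locus: one exact duplicate plus reads shifted by a few bases.
biased_region = [(0, 100), (0, 100), (5, 105), (10, 110), (15, 115),
                 (200, 300), (400, 500)]

for name, reads in [("even", even_region), ("biased", biased_region)]:
    print(name, len(dedup_by_start(reads)), covered_bases(reads))
```

In this toy case the de-duplicated read count doubles from 3 to 6 in the biased region, while the covered-base count rises only from 300 to 315, which is the effect the coverage model relies on.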
The chromosomal abnormality detection device provided by the invention comprises:
Sequencing data acquisition unit: obtains the read fragments produced by high-throughput sequencing;
Alignment unit: aligns the read fragments to the human genome reference sequence and obtains the position and length of each read fragment;
Coverage calculation unit: divides the human genome reference sequence into a number of first windows; calculates the coverage of each first window from the position and length of the read fragments; performs Loess correction based on the coverage and GC content of the first windows; merges several consecutive first windows into a second window; and calculates the Loess-corrected coverage of each second window and its coverage ratio;
Candidate CNV identification unit: identifies chromosome breakpoint positions with the circular binary segmentation (CBS) algorithm, calculates the CBS ratio between adjacent breakpoints, and identifies candidate CNV regions according to a CBS ratio threshold;
False positive filtering unit: calculates the significance level (P-value) of the CBS ratio of each candidate CNV region and filters out false positive regions according to the P-value, giving the CNV regions and the karyotype result of the sample under test.
In particular, coverage = total number of bases covered within the interval / length of the interval; coverage ratio = coverage of the interval / coverage of all autosomes.
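A minimal sketch of the per-window coverage calculation follows, assuming reads are given as (position, length) pairs on a single toy chromosome and that "covered" means covered by at least one read. The function name and the toy data are illustrative; a real run would use the whole reference and take the denominator of the coverage ratio over all autosomes.

```python
import numpy as np


def window_coverage(reads, chrom_len, window):
    """Fraction of bases in each fixed-size window covered by at least one read.

    reads: iterable of (start, length), 0-based; window: window size in bases.
    """
    diff = np.zeros(chrom_len + 1, dtype=np.int64)
    for start, length in reads:
        end = min(start + length, chrom_len)
        diff[start] += 1                      # depth goes up where the read starts
        diff[end] -= 1                        # and down just past where it ends
    covered = np.cumsum(diff[:-1]) > 0        # per-base mask: covered or not
    n_win = chrom_len // window               # a trailing partial window is dropped
    per_win = covered[: n_win * window].reshape(n_win, window).sum(axis=1)
    return per_win / window


# Toy data: a 40-base "chromosome", 10-base windows, three reads.
reads = [(0, 10), (2, 10), (25, 5)]
cov = window_coverage(reads, chrom_len=40, window=10)

# Coverage ratio: window coverage over the coverage of the whole toy region.
# With equal-sized windows the overall coverage equals the mean window coverage.
ratio = cov / cov.mean()
print(cov, ratio)
```

The first two reads overlap on eight bases, yet those bases are counted once, so the first window has coverage 1.0 rather than a doubled signal.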
In particular, the CBS ratio is the mean coverage ratio of all second windows between adjacent breakpoints identified by the circular binary segmentation algorithm.
In the coverage calculation unit, the first window is a non-overlapping interval of 10 to 50 Kb; preferably, the first window is a non-overlapping interval of 20 Kb.
In the coverage calculation unit, the second window length is 0.1 to 2 Mb; preferably, the second window length is selected from 100 Kb, 500 Kb, and 1 Mb.
Preferably, in the candidate CNV identification unit, the CBS ratio threshold is [1.4, 2.6]; a region whose CBS ratio falls outside this range is identified as a candidate CNV region.
Preferably, in the false positive filtering unit, calculating the P-value comprises:
Building a random sampling database from the results of normal reference samples; drawing from it, at least 100,000 times, simulated CBS segments of the same length as the candidate CNV region; obtaining the density distribution curve of the simulated CBS ratio values; and calculating the significance level (P-value) of the CBS ratio of the candidate CNV region.
Preferably, in the false positive filtering unit, a candidate CNV region with P-value < 0.001 is identified as a CNV region; otherwise, it is filtered out as a false positive region.
Further, the device also includes a sequencing unit:
It is connected to the sequencing data acquisition unit and performs high-throughput sequencing on a library constructed from the sample; the sample may be one that has undergone single-cell amplification, one that has undergone PCR pre-amplification, or one that has not undergone PCR pre-amplification.
Further, the device also includes a filtering unit:
It is connected to the alignment unit and, according to the alignment result, rejects read fragments located at tandem repeat and transposon repeat positions, as well as read fragments that are of low quality, align to multiple locations, or do not fully match the chromosome.
The sequencing unit, sequencing data acquisition unit, alignment unit, filtering unit, coverage calculation unit, candidate CNV identification unit, and false positive filtering unit described above may be program modules or hardware device modules.
The invention is further explained below with reference to a specific example, but the scope of protection of the invention is not limited to it.
Example 1
The chromosomal abnormality detection device provided by the invention is applied to chromosomal abnormality detection based on single-cell amplification. The processing steps are as follows, and the workflow is shown in Fig. 1.
1. Obtaining whole-genome sequencing data
A batch of cell lines of known karyotype was purchased from Coriell. A total of 25 samples, numbered T1 to T25, took part in this test: 2 negative samples; 3 samples with sex chromosome abnormalities; 7 samples with autosomal aneuploidy; 1 sample with a sex chromosome microduplication or microdeletion; and 12 samples with an autosomal microduplication or microdeletion. The samples underwent single-cell whole-genome amplification, library construction, and high-throughput sequencing to obtain read fragments.
2. Alignment
The read fragments were aligned to the human genome standard sequence hg19 so that each read fragment was placed at its corresponding chromosomal position. This gave the alignment information of each read fragment, including its position, length, and quality control information.
3. Filtering
Based on the quality control information in the alignment result, read fragments located at tandem repeat and transposon repeat positions were rejected, as were read fragments that were of low quality, aligned to multiple locations, or did not fully match the chromosome.
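The filtering step can be sketched as a simple predicate over aligned reads. Everything below is a hypothetical illustration: the `AlignedRead` fields, the `min_mapq` cutoff of 30, and the repeat-region table are assumptions, since the text does not specify the aligner, the quality threshold, or the repeat annotation used.

```python
from dataclasses import dataclass


@dataclass
class AlignedRead:
    chrom: str
    start: int        # 0-based start on the reference
    length: int       # aligned length in bases
    mapq: int         # mapping quality reported by the aligner
    n_hits: int       # number of reference locations the read aligns to
    full_match: bool  # True if the whole read aligned to the chromosome


def overlaps(start, end, regions):
    """True if [start, end) intersects any (r_start, r_end) region."""
    return any(start < r_end and r_start < end for r_start, r_end in regions)


def keep_read(read, repeat_regions, min_mapq=30):
    """Return True if the read survives every filter named in the step above."""
    if read.mapq < min_mapq:      # low quality (cutoff is an assumed value)
        return False
    if read.n_hits != 1:          # aligned to multiple locations
        return False
    if not read.full_match:       # not fully matched to the chromosome
        return False
    regions = repeat_regions.get(read.chrom, [])
    return not overlaps(read.start, read.start + read.length, regions)


# Hypothetical tandem-repeat / transposon intervals per chromosome.
repeats = {"chr1": [(1000, 2000)]}
reads = [
    AlignedRead("chr1", 100, 50, 60, 1, True),    # passes
    AlignedRead("chr1", 1980, 50, 60, 1, True),   # overlaps a repeat region
    AlignedRead("chr1", 300, 50, 10, 1, True),    # low mapping quality
    AlignedRead("chr1", 400, 50, 60, 3, True),    # multi-mapped
    AlignedRead("chr1", 500, 50, 60, 1, False),   # partial match
]
kept = [r for r in reads if keep_read(r, repeats)]
print(len(kept))
```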
4. Coverage calculation
The human genome reference sequence was divided into a number of first windows, each a non-overlapping region of 20 kb. The coverage of each first window was calculated from the position and length of the filtered read fragments, and GC bias was corrected by Loess based on the coverage and GC content of the first windows. Several consecutive first windows were then merged into second windows, each 1 Mb long, and the Loess-corrected coverage of each second window and its coverage ratio (CR) were calculated. Here, coverage = total number of bases covered within the interval / length of the interval; coverage ratio = coverage of the interval / coverage of all autosomes.
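The GC correction and window merging in this step might look like the sketch below. The text names Loess but gives no parameters, so several choices here are assumptions: the local linear fit with tricube weights written out in NumPy, the span `frac=0.5`, the multiplicative rescaling by the median, and the synthetic data. Fifty 20 kb first windows per 1 Mb second window follows from the window sizes stated above.

```python
import numpy as np


def loess_fit(x, y, frac=0.5):
    """Local linear regression with tricube weights, evaluated at each x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(3, int(np.ceil(frac * n)))            # neighbourhood size
    fitted = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        h = max(np.sort(dist)[k - 1], 1e-12)      # bandwidth: k-th nearest distance
        w = np.clip(1 - (dist / h) ** 3, 0, None) ** 3
        # Weighted least squares for y ~ a + b * (x - x[i]); a is the fit at x[i].
        design = np.column_stack([np.ones(n), x - x[i]])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        fitted[i] = coef[0]
    return fitted


def gc_correct(coverage, gc, frac=0.5):
    """Flatten the GC-dependent trend (assumed multiplicative correction)."""
    coverage = np.asarray(coverage, dtype=float)
    expected = loess_fit(gc, coverage, frac)
    return coverage * np.median(coverage) / expected


def merge_windows(values, n_per_bin):
    """Average consecutive first windows into larger second windows."""
    n_bins = len(values) // n_per_bin
    return values[: n_bins * n_per_bin].reshape(n_bins, n_per_bin).mean(axis=1)


# Synthetic first windows: coverage that rises with GC content, plus noise.
rng = np.random.default_rng(0)
gc = rng.uniform(0.3, 0.6, 1000)
raw = 0.5 * (1 + 2 * (gc - 0.45)) + rng.normal(0, 0.01, 1000)

corrected = gc_correct(raw, gc)
second = merge_windows(corrected, n_per_bin=50)   # 50 x 20 kb = 1 Mb
cr = second / corrected.mean()                    # coverage ratio per second window
print(np.corrcoef(gc, raw)[0, 1], np.corrcoef(gc, corrected)[0, 1])
```

Because the first windows are equal in length, the mean of their coverages equals the coverage of the merged interval, which is why a plain average is used for the second windows.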
5. Candidate CNV identification
Chromosome breakpoint positions were identified with the circular binary segmentation (CBS) algorithm [4][5]. The CBS ratio threshold was set to [1.4, 2.6]: a region whose CBS ratio falls outside this range is identified as a candidate CNV region; otherwise, the region is judged to have no chromosomal abnormality. Here, the CBS ratio is the mean coverage ratio of all second windows between adjacent breakpoints identified by CBS.
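Given breakpoints from a CBS implementation, the CBS ratio and threshold test reduce to a few lines. The sketch below does not implement CBS itself; the breakpoints are supplied by hand. It also assumes that the coverage ratio is rescaled by a diploid baseline of 2 before thresholding, which the range [1.4, 2.6] suggests but the text does not state.

```python
import numpy as np


def cbs_ratios(coverage_ratio, breakpoints, baseline=2.0):
    """Mean value of the second windows between adjacent breakpoints.

    breakpoints: sorted window indices where a new segment starts (0 excluded).
    baseline: assumed copy-number scale factor (2 for a diploid genome).
    """
    values = baseline * np.asarray(coverage_ratio, dtype=float)
    edges = [0, *breakpoints, len(values)]
    return [(s, e, float(values[s:e].mean())) for s, e in zip(edges[:-1], edges[1:])]


def candidate_cnvs(segments, low=1.4, high=2.6):
    """Segments whose CBS ratio falls outside [low, high]."""
    return [(s, e, r, "gain" if r > high else "loss")
            for s, e, r in segments if r < low or r > high]


# Toy coverage ratios for 30 second windows with one gain and one loss.
cr = [1.0] * 10 + [1.5] * 5 + [1.0] * 10 + [0.5] * 5
segments = cbs_ratios(cr, breakpoints=[10, 15, 25])
print(candidate_cnvs(segments))
```

Segments that pass this test are only candidates; they still go through the significance filter described in the next step.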
6. False positive filtering
A random sampling database was built from the results of normal reference samples. Simulated CBS segments of the same length as the candidate CNV region were drawn from it 100,000 times to obtain the density distribution curve of the simulated CBS ratio values, from which the significance level (P-value) of the CBS ratio of the candidate CNV region was calculated. False positive regions were then filtered according to this P-value: a candidate CNV region with P-value < 0.001 is identified as a CNV region; otherwise, it is filtered out as a false positive region. This finally gives the CNV regions and karyotype result of the sample under test.
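One way to realize this resampling test is sketched below. The text does not say how the P-value is read off the density curve, so the two-sided empirical tail probability with an add-one correction is an assumption, as are the function name, the synthetic reference values, and the simplification of treating the reference as one continuous run of second-window values.

```python
import numpy as np


def empirical_p_value(observed, reference, seg_len, n_draws=100_000, seed=0):
    """Two-sided empirical P-value of an observed segment mean.

    reference: second-window values from normal reference samples.
    seg_len: number of second windows in the candidate CNV region.
    """
    rng = np.random.default_rng(seed)
    reference = np.asarray(reference, dtype=float)
    # Prefix sums give the mean of any run of seg_len windows in O(1).
    csum = np.concatenate([[0.0], np.cumsum(reference)])
    starts = rng.integers(0, len(reference) - seg_len + 1, size=n_draws)
    simulated = (csum[starts + seg_len] - csum[starts]) / seg_len
    centre = np.median(simulated)
    extreme = np.abs(simulated - centre) >= abs(observed - centre)
    return (extreme.sum() + 1) / (n_draws + 1)    # add-one avoids P = 0


# Synthetic normal reference on a copy-number scale centred at 2.
rng = np.random.default_rng(1)
reference = rng.normal(2.0, 0.05, 5000)

p_gain = empirical_p_value(3.0, reference, seg_len=5)
p_normal = empirical_p_value(2.0, reference, seg_len=5)
print(p_gain < 0.001, p_normal < 0.001)
```

With 100,000 draws the smallest attainable value is about 1e-5, so the 0.001 cutoff in the text is resolvable under this scheme.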
The inventors compared this example (hereinafter the "coverage method") with the traditional chromosomal abnormality detection method based on read counting (hereinafter the "read-counting method"), and also analyzed the same batch of samples with a chip method.
Table 1 gives the chromosomal abnormality detection results for the 25 cell samples of known karyotype. For 24 samples, the read-counting method and the coverage method gave identical results that agree with the chip karyotype results. For 1 sample (T2), the two methods gave different results, and the chip karyotype result agrees with the result of this example. The chromosomal abnormality detection device provided by the invention is therefore reliable and accurate.
Table 1. Chromosomal abnormality detection results for cells of known karyotype
Table 2 gives the CV values of the above samples at 1 Mb resolution under the read-counting method and the coverage method. The CV value represents the dispersion of the data and reflects how uniformly the sequenced read fragments are distributed over the reference genome, and hence the quality of amplification uniformity. The CV values obtained with the coverage method are clearly lower, which shows that the chromosomal abnormality detection device provided by the invention mitigates poor amplification uniformity.
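For reference, the CV (coefficient of variation) is the standard deviation divided by the mean. The sketch below applies it to per-window values; which windows enter the calculation (for example autosomes only) and the use of the sample standard deviation are assumptions, since the text does not specify them.

```python
import numpy as np


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()


# Synthetic per-window values: same mean, different spread.
rng = np.random.default_rng(0)
noisy = rng.normal(2.0, 0.25, 3000)
uniform = rng.normal(2.0, 0.15, 3000)
print(coefficient_of_variation(noisy), coefficient_of_variation(uniform))
```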
Table 2. CV values of all samples at 1 Mb resolution under the two detection methods
Samples T1, T2, T8, and T19 are taken from the above samples as examples to illustrate the results further.
Fig. 2 shows the distribution of copy number values across the 24 chromosomes of sample T1 at 1 Mb resolution; Fig. 2A is the result of the read-counting method, and Fig. 2B is the result of the coverage method. T1 is a negative sample (46,XX). The spread of the points in Fig. 2A and Fig. 2B shows directly that the chromosomal abnormality detection device provided by the invention improves amplification uniformity.
Fig. 3 and Fig. 4 take sample T8 (47,XY,+15) and sample T19 (46,XX,del(8)(pter-p12)) as examples of a chromosomal aneuploidy sample and a segmental CNV sample, and show the distribution of copy number values across the 24 chromosomes at 1 Mb resolution. Fig. 3A and Fig. 4A are the results of the read-counting method, and Fig. 3B and Fig. 4B are the results of the coverage method. Fig. 3, Fig. 4, and Table 3 together show that the chromosomal abnormality detection device provided by the invention improves amplification uniformity without affecting the detected values of positive results.
Table 3. Detected values of the CNV regions of samples T2, T8, and T19 under the two detection methods
Fig. 5 shows the distribution of copy number values across the 24 chromosomes of sample T2 (46,XY) at 1 Mb resolution; panel A is the result of the read-counting method, and panel B is the result of the coverage method. The spread of the points at 1 Mb resolution in Fig. 5A shows that the amplification uniformity of sample T2 is poor. With the read-counting method, its CV value at 1 Mb resolution is 0.123, higher than that of the other samples, and a false positive CNV was detected (located in region q11.21 of chromosome 7, about 5 Mb long). With the coverage-based detection device provided by the invention, the uniformity of sample T2 is effectively improved (as shown in Fig. 5B): the CV value at 1 Mb resolution falls to 0.073, the final result agrees with the chip karyotype, and no false positive CNV region appears. This shows that when amplification uniformity is poor, the traditional read-counting method may introduce false positive results, whereas analysis with the chromosomal abnormality detection device provided by the invention improves amplification uniformity and reduces the probability of false positive results.
The above describes preferred embodiments of the invention, but the invention is not limited to these embodiments. Those skilled in the art can make various equivalent variations or substitutions without departing from the spirit of the invention, and all such equivalent variations or substitutions fall within the scope defined by the claims of this application.

Claims (10)

1. a kind of chromosome abnormality detection device that can remove repetitive sequence, comprising:
Sequencing data acquiring unit: for obtaining the reading long segment obtained through high-flux sequence;
Comparing unit: being compared for that will read long segment with human genome reference sequences, obtains the position letter for reading long segment Breath and length information;
Coverage computing unit: for human genome reference sequences to be divided into several first windows, according to reading long segment Location information and length information, calculate the coverage of each first window, carried out according to the coverage of first window and G/C content Loess correction;Several continuous first windows are merged into the second window, the covering after calculating the second window Loess correction Degree and its coverage accounting;Wherein, the base sum/section length covered in coverage=section;Coverage accounting=section Coverage/all autosomal coverages;
Candidate CNV recognition unit: for the breakpoint location using cyclic annular binary segmentation algorithm identification chromosome, adjacent breakpoint is calculated Between CBS ratio, according to CBS ratio threshold value identify the candidate region CNV;
False positive filter element: for calculating the significance P-value of the candidate region CNV CBS ratio value, according to P- Value filters false positive region, obtains the region CNV and the results of karyotype of sample to be tested;
Wherein, CBS ratio is all second window coverage accountings between the adjacent breakpoint that cyclic annular binary segmentation algorithm identifies Mean value.
2. the apparatus according to claim 1, it is characterised in that: in coverage computing unit, the first window be 10~ The non-duplicate section of 50Kb.
3. the apparatus of claim 2, it is characterised in that: in coverage computing unit, the first window is 20Kb Non-duplicate section.
4. the apparatus according to claim 1, it is characterised in that: in coverage computing unit, second length of window is 0.1~2Mb。
5. device according to claim 4, it is characterised in that: in coverage computing unit, second length of window is appointed Selected from 100Kb, 500Kb, 1 Mb.
6. the apparatus according to claim 1, it is characterised in that: in candidate CNV recognition unit, the CBS ratio threshold value For [1.4,2.6], it is determined as the candidate region CNV beyond threshold range.
7. The device according to claim 1, characterized in that: in the false positive filtering unit, calculating the P-value comprises:
forming a random sampling database from the results of normal reference samples; drawing from it, at least 100,000 times, a simulated CBS segment equal in length to the candidate CNV region; obtaining the density distribution of the simulated CBS ratio values; and calculating the significance P-value of the CBS ratio value of the candidate CNV region.
8. The device according to claim 1, characterized in that: in the false positive filtering unit, a candidate CNV region with P-value < 0.001 is determined to be a CNV region; otherwise, it is filtered out as a false positive region.
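The sampling procedure of claims 7 and 8 can be sketched as an empirical significance test. The Python below is one possible reading under stated assumptions, not the patented implementation: the reference database is assumed to be a list of per-sample lists of second-window coverage ratios, segment length is counted in windows, and the P-value is taken as the two-sided fraction of simulated segment means at least as far from the simulated mean as the observed value, with a +1 correction. The claims do not specify how the P-value is derived from the density distribution, so that part is an assumption, as are all names.

```python
import random


def empirical_p_value(observed, reference, seg_len, n_draws=100_000, seed=0):
    """Two-sided empirical P-value of an observed CBS ratio.

    `reference`: list of per-sample lists of window coverage ratios from
    normal samples (assumed layout). Each draw picks a random sample and a
    random run of `seg_len` consecutive windows, then averages it.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_draws):
        sample = rng.choice(reference)
        start = rng.randrange(len(sample) - seg_len + 1)
        sims.append(sum(sample[start:start + seg_len]) / seg_len)
    center = sum(sims) / len(sims)
    deviation = abs(observed - center)
    extreme = sum(1 for s in sims if abs(s - center) >= deviation)
    # +1 correction keeps the estimate strictly positive (an assumption).
    return (extreme + 1) / (n_draws + 1)


def is_cnv(p_value, alpha=0.001):
    """Claim 8: P-value < 0.001 means CNV, otherwise false positive."""
    return p_value < alpha
```

With this estimator, a P-value below 0.001 is only reachable when there are at least 1,000 draws, which is consistent with the claim's much larger minimum of 100,000.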
9. The device according to any one of claims 1 to 8, characterized in that the device further comprises a sequencing unit:
connected to the sequencing data acquisition unit, for performing high-throughput sequencing on a library constructed from a sample, the sample including a sample amplified by single-cell amplification, a sample pre-amplified by PCR, or a sample not pre-amplified by PCR.
10. The device according to any one of claims 1 to 8, characterized in that the device further comprises a filtering unit:
connected to the alignment unit, for rejecting, according to the alignment result, reads located at tandem repeat positions and transposon repeat positions, as well as low-quality reads, multi-mapped reads, and reads not fully matched to a chromosome.
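The read filtering of claim 10 might look like the following Python sketch. Everything here is illustrative: the `AlignedRead` record, its field names, and the mapping-quality cutoff of 30 are assumptions (the claim names no threshold), and the repeat annotation is assumed to be a list of (chromosome, start, end) intervals covering tandem repeats and transposon repeats.

```python
from dataclasses import dataclass


@dataclass
class AlignedRead:
    """Hypothetical minimal alignment record."""
    chrom: str
    start: int
    length: int
    mapq: int            # mapping quality
    n_hits: int          # number of genomic locations the read maps to
    fully_matched: bool  # True if the whole read aligns to the chromosome


def overlaps_any(read, masked):
    """True if the read overlaps any masked (chrom, start, end) interval."""
    end = read.start + read.length
    return any(c == read.chrom and s < end and read.start < e
               for c, s, e in masked)


def keep_read(read, masked, min_mapq=30):
    """Reject low-quality, multi-mapped, partially matched, or repeat reads."""
    if read.mapq < min_mapq:
        return False
    if read.n_hits != 1:
        return False
    if not read.fully_matched:
        return False
    return not overlaps_any(read, masked)
```

A linear scan over the masked intervals is enough for a demonstration; a production pipeline would more likely use an interval index.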
CN201810047686.8A 2018-01-18 2018-01-18 A chromosomal abnormality detection device Active CN108268752B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810047686.8A CN108268752B (en) 2018-01-18 2018-01-18 A chromosomal abnormality detection device

Publications (2)

Publication Number Publication Date
CN108268752A CN108268752A (en) 2018-07-10
CN108268752B true CN108268752B (en) 2019-02-01

Family

ID=62775981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810047686.8A Active CN108268752B (en) 2018-01-18 2018-01-18 A chromosomal abnormality detection device

Country Status (1)

Country Link
CN (1) CN108268752B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110268044B (en) * 2017-03-07 2022-08-02 深圳华大生命科学研究院 Method and device for detecting chromosome variation
CN109920480B (en) * 2019-03-14 2020-02-21 深圳市海普洛斯生物科技有限公司 Method and device for correcting high-throughput sequencing data
CN113496761B (en) * 2020-04-03 2023-09-19 深圳华大生命科学研究院 Methods, devices and applications for determining CNV in nucleic acid samples
CN114283881A (en) * 2021-12-29 2022-04-05 广州解序基因科技有限公司 Thalassemia Panel Data CNV Analysis System
CN115019892B (en) * 2022-06-13 2023-04-07 郑州大学第一附属医院 Confidence determination method for sequence coverage in sequencing of environmental microbiota metagenome

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104428425A (en) * 2012-05-04 2015-03-18 考利达基因组股份有限公司 Methods for determining absolute genome-wide copy number variations of complex tumors
CN104781421A (en) * 2012-09-04 2015-07-15 夸登特健康公司 Systems and methods to detect rare mutations and copy number variation
CN106650312A (en) * 2016-12-29 2017-05-10 安诺优达基因科技(北京)有限公司 Device for detecting DNA copy number variation of circulating tumor

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8645343B2 (en) * 2008-08-26 2014-02-04 23Andme, Inc. Processing data from genotyping chips
US8486630B2 (en) * 2008-11-07 2013-07-16 Industrial Technology Research Institute Methods for accurate sequence data and modified base position determination
GB201215449D0 (en) * 2012-08-30 2012-10-17 Zoragen Biotechnologies Llp Method of detecting chromosonal abnormalities
KR20160039386A (en) * 2014-10-01 2016-04-11 삼성에스디에스 주식회사 Apparatus and method for detection of internal tandem duplication

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
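Circular binary segmentation for the analysis of array-based DNA copy number data; Olshen A B, et al.; Biostatistics; 2004-12-31; Vol. 5, No. 4; 557-72 *
DNA copy number variation and its research progress; Ma Tianjun, et al.; Chinese Journal of Clinicians (《中华临床医师杂志》); 2013-07-31; Vol. 7, No. 14; 309-312, 317 *
An improved gene copy number variation detection algorithm; Li Ping, et al.; Computer Engineering (《计算机工程》); 2013-01-31; Vol. 39, No. 1; 6592-6594 *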

Also Published As

Publication number Publication date
CN108268752A (en) 2018-07-10

Similar Documents

Publication Publication Date Title
CN108268752B (en) A chromosomal abnormality detection device
CN109637590B (en) Microsatellite instability detection system and method based on genome sequencing
CN107423578B (en) Device for detecting somatic cell mutation
EP4073805B1 (en) Systems and methods for predicting homologous recombination deficiency status of a specimen
CN106599616B (en) Ultralow frequency mutational site determination method based on duplex-seq
CN108256292A (en) A kind of copy number variation detection device
CN106156543B (en) A kind of tumour ctDNA information statistical method
CN104462869A (en) Method and device for detecting somatic cell SNP
CN108319813A (en) Circulating tumor DNA copies the detection method and device of number variation
CN113160882A (en) Pathogenic microorganism metagenome detection method based on third generation sequencing
CN105986008A (en) CNV detection method and CNV detection apparatus
CN106980763A (en) A kind of cancer based on gene mutation frequency drives the screening technique of gene
CN106778073A (en) A kind of method and system for assessing tumor load change
CN110060733B (en) Second-generation sequencing tumor somatic variation detection device based on single sample
WO2018054254A1 (en) Method and system for identifying tumor load in sample
CN107949845A (en) The new method of sex of foetus and fetus sex chromosomal abnormality can be distinguished on multiple platforms
CN111341383A (en) Method, device and storage medium for detecting copy number variation
CN115083521B (en) Method and system for identifying tumor cell group in single cell transcriptome sequencing data
CN111091868A (en) Method and system for analyzing chromosome aneuploidy
CN114446389A (en) A tumor neoantigen feature analysis and immunogenicity prediction tool and its application
CN113903398A (en) Intestinal cancer early-screening marker, detection method, detection device, and computer-readable medium
CN113862351B (en) Kit and method for identifying extracellular RNA biomarkers in body fluid sample
CN112102944A (en) NGS-based brain tumor molecular diagnosis analysis method
WO2020124625A1 (en) Ctdna-based gene detection method and apparatus, storage medium, and computer system
CN107885972A (en) It is a kind of based on the fusion detection method of single-ended sequencing and its application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant