CN120843648A

CN120843648A - A non-fixed value composite quality control product for high-throughput sequencing of DNA pathogen metagenomics and its preparation method

Info

Publication number: CN120843648A
Application number: CN202511349003.0A
Authority: CN
Inventors: 杨启文; 张栋; 杜娟; 朱盈; 褚晓冰; 王冰; 杨宗兵; 王洋; 王军; 苏慧婷; 高弈; 陈新飞; 陆旻雅; 郭佳钰
Original assignee: Beijing Shuimujiheng Biotechnology Co ltd; Peking Union Medical College Hospital Chinese Academy of Medical Sciences
Current assignee: Beijing Shuimujiheng Biotechnology Co ltd; Peking Union Medical College Hospital Chinese Academy of Medical Sciences
Priority date: 2025-09-19
Filing date: 2025-09-19
Publication date: 2025-10-28
Anticipated expiration: 2045-09-19
Also published as: CN120843648B

Abstract

The present invention relates to a non-fixed-value composite quality control product for high-throughput sequencing of DNA pathogen metagenomics and a preparation method thereof, belonging to the field of DNA detection technology. The preparation method comprises the following steps: S1. preparing a DNA positive quality control product: (1) extracting bacterial genomic nucleic acid and fungal genomic nucleic acid; (2) obtaining viral genomic nucleic acid; (3) extracting human genomic nucleic acid; (4) mixing each milliliter of the DNA positive quality control product according to the following relative abundance: 1% target pathogen genome and 99% human genome, wherein the target pathogen genome includes bacterial genomic nucleic acid, fungal genomic nucleic acid, and viral genomic nucleic acid; (5) mixing the target pathogen genomes according to a preset relative abundance p_pD. By adding human genomic DNA to the DNA positive quality control product and using a negative quality control product prepared from human cells, the present invention constructs a complex background system that is highly similar to clinical samples and provides a quality control effect that is closer to reality.

Description

Non-constant composite quality control product for DNA pathogen metagenome high-throughput sequencing and preparation method thereof

Technical Field

The invention belongs to the technical field of DNA detection, and particularly relates to a non-constant composite quality control product for DNA pathogen metagenome high-throughput sequencing and a preparation method thereof.

Background

With the rapid development of next-generation sequencing (NGS), metagenomic sequencing has become an important means of studying the composition and function of microbial communities in complex samples such as environmental and human specimens. By sequencing and analyzing all microbial genomes in a sample as a whole, the technology can detect and identify hundreds of pathogens simultaneously, and it is widely used in infectious disease diagnosis, microbiome research, environmental monitoring and other fields. Metagenomic sequencing requires no prior hypothesis, can discover novel pathogens, and covers many microorganisms in a single test, but it also faces technical challenges: complex sample nucleic acid composition, strong human background interference, low pathogen content, limited detection rate, and non-uniform data analysis standards. For quality control, laboratories commonly use commercial plasmids, synthetic fragments, or single microbial cultures as internal quality controls, and ultrapure water or extraction reagents as negative controls; however, the lack of uniform inter-laboratory quality assessment standards results in poor comparability and reproducibility of results between laboratories.

Existing quality control schemes fall into two types: commercial quality control products and laboratory self-made quality control products. Commercial quality control products suffer from limited target sequences, high price, and insufficient stability. Laboratory self-made quality control products mainly rely on plasmid cloning and mixed-strain culture; they are relatively simple to operate and low in cost, but cannot truly simulate the complexity of clinical samples, and their preparation is cumbersome with large batch-to-batch differences. As sequencing technology advances, with longer read lengths, higher throughput, and lower cost, its applications have expanded to clinical diagnosis, public health, microbiome research and other areas, and the requirements for standardization, automation, and traceability of quality control products keep rising. The main problems of current quality control products include the lack of a complex human background, a single pathogen species, discontinuous abundance gradients, large batch-to-batch differences, insufficient stability, and harsh storage conditions. Few comprehensive quality control products on the market simultaneously contain DNA pathogens such as bacteria, fungi, and viruses and simulate the real clinical background, so the diverse requirements of sequencing technology cannot be met. It is therefore necessary to establish a standardized quality control system, develop a stable preparation method for quality control products, and provide a traceable value-assignment scheme, so as to meet multi-platform compatibility requirements, support full-process quality monitoring, and ensure the comparability of results.

Current technical schemes for metagenomic sequencing quality control fall into three major categories: synthetic nucleic acid quality controls, pathogenic microorganism culture quality controls, and mixed-matrix quality controls. Synthetic nucleic acid quality controls are prepared mainly by chemical synthesis and plasmid cloning: chemical synthesis yields oligonucleotide fragments of 20-200 bp, longer sequences can be constructed by PCR amplification and fragment ligation, and plasmid cloning can amplify target gene fragments of 0.1-15 kb.

Pathogenic microorganism culture quality controls are mainly prepared by inactivating pathogens. Inactivation methods include heat inactivation (56-121 ℃), irradiation (gamma rays or electron beams), and treatment with chemical reagents (formaldehyde, beta-propiolactone, etc.). The inactivated pathogens can be used as quality controls for evaluating the sensitivity and specificity of a detection method.

Mixed-matrix quality controls mainly comprise cell line matrices and clinical sample mimics. A cell line matrix is prepared from human immortalized cell lines, primary cultured cells, or genetically engineered cells through cell culture and expansion, collection and counting, lysis and nucleic acid extraction, and similar steps. A clinical sample mimic is prepared by pooling multiple clinical specimens and spiking in pathogens of known concentration.

Various commercial products already exist in the prior art, including quality control products launched by international and domestic manufacturers, and they are used in scenarios such as inter-laboratory quality assessment, methodology verification, and routine quality control.

Current metagenomic sequencing technology has the following problems in the field of laboratory internal quality control:

1. Existing synthetic nucleic acid quality control products have the advantage of a controllable sequence, but their application is limited in several respects. Owing to the limitations of synthesis technology, synthetic DNA fragments are generally short and can hardly mimic a complete pathogen genome. Meanwhile, the DNA conformation of quality control products obtained by plasmid cloning differs markedly from that of natural nucleic acid, so they cannot truly reflect the nucleic acid characteristics of clinical samples. In addition, chemical synthesis and plasmid cloning are costly, which makes large-scale production hard to justify economically.

2. Pathogenic microorganism culture quality controls face a number of technical bottlenecks in preparation and application. Many important pathogens are difficult to culture because of special growth requirements, or require higher-level biosafety facilities. During preparation, every inactivation method damages pathogen nucleic acid to some degree, which reduces the reliability of the quality control product. Meanwhile, differences in the growth state of cultures between batches lead to marked batch-to-batch variation in the final product, making standardized production difficult.

3. Mixed matrix quality control has unique advantages in simulating real clinical specimens, but its manufacturing process faces multiple challenges. When a quality control product is prepared by using a cell line, the growth state of cells is difficult to control accurately, resulting in fluctuation of product quality. The preparation of clinical sample simulants is limited by the source of the sample, and it is difficult to obtain sufficient raw materials continuously and stably. In addition, due to the complexity of the matrix components, the performance of the quality control product is affected by the interference of various physical and chemical factors. These characteristics make standardized production and scale-up applications of mixed matrix quality control materials more difficult.

4. The current commercial quality control products have common problems, and the wide application of the commercial quality control products is severely restricted. Firstly, the production cost of the existing quality control product is high, so that the product is high in price, and the running cost of a laboratory is increased. Secondly, the stability of the quality control product is poor, the requirements on storage conditions and transportation environment are strict, and a plurality of inconveniences are brought to the logistics and the use of the product. In addition, the traceability of quality control products is insufficient, and a unified quality standard system is difficult to establish. Finally, due to the limitation of the production process, uniformity among different batches of products is difficult to ensure, and comparability of detection results is affected.

5. At present, no composite quality control system on the market meets the detection requirements of DNA pathogens. Most existing quality control products are designed for a single type of pathogen or a single detection workflow, and most were developed for traditional molecular detection methods such as PCR, so their design concept and performance characteristics can hardly meet the multi-faceted quality control needs of metagenomic sequencing. Because of this, laboratories must purchase and use several quality control products at the same time, which not only increases testing cost but may also affect result interpretation owing to performance differences between products. Meanwhile, without uniform quality control standards, results obtained with different types of quality control products are difficult to compare. It is therefore difficult for a laboratory to evaluate the detection performance of metagenomic sequencing comprehensively, or to control accurately the quality of detection results for different types of pathogens.

6. Lack of a standardized quality control system: existing internal quality control for metagenomic sequencing lacks uniform positive and negative quality control products, so the sensitivity, specificity, and accuracy of the sequencing process cannot be effectively evaluated.

7. Insufficient simulation of real samples: some quality control products are prepared only from synthetic fragments or single-source samples, and can hardly simulate the complex human sample background or the actual proportion of pathogenic microorganisms.

8. Difficult technical performance verification: existing quality control products are limited to a single detection technology, cannot be widely applied to DNA pathogen metagenomic analysis by next-generation sequencing (NGS), and lack versatility and flexibility.

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a non-constant composite quality control product for DNA pathogen metagenome high-throughput sequencing and a preparation method thereof.

The technical scheme of the invention is as follows:

A preparation method of a non-constant composite quality control product for DNA pathogen metagenome high-throughput sequencing comprises the following steps:

S1, preparing a DNA positive quality control product:

(1) Extracting bacterial genomic nucleic acid and fungal genomic nucleic acid;

(2) Obtaining viral genome nucleic acid;

(3) Extracting human genome nucleic acid;

(4) Each milliliter of the DNA positive quality control product is mixed according to a relative abundance of 1% target pathogen genome and 99% human genome, wherein the target pathogen genome comprises bacterial genomic nucleic acid, fungal genomic nucleic acid, and viral genomic nucleic acid;

(5) The target pathogen genomes are mixed according to a preset relative abundance p_pD;

In practice, the target pathogens may be any pathogens, in particular any bacteria, fungi, or viruses, and can be routinely selected by a person skilled in the art according to practical needs and detection requirements. Once the specific target pathogens are determined, the relative abundance of each can be preset according to the types and number of target pathogens, such that the preset relative abundances of all target pathogens in the quality control product sum to 100%. This can be done routinely by a person skilled in the art based on the specific pathogen types and number, in combination with conventional technical knowledge reported in the prior art, which includes:

1. Clinical application expert consensus on metagenomic next-generation sequencing for the detection of infectious pathogens in China [with attached correction] [J]. Chinese Journal of Infectious Diseases, 2020, 38(11): 681-689. DOI: 10.3760/cma.j.cn311365-20200031-00132.

2. The expert consensus states that metagenomic next-generation sequencing of pathogens covers a wide range of pathogens: viruses, fungi, bacteria, and parasites can be detected simultaneously, regardless of whether culture of the clinical sample succeeds, provided that the sample contains detectable DNA or RNA.

3. Knight, R., Vrbanac, A., Taylor, B.C. et al. Best practices for analysing microbiomes. Nat Rev Microbiol 16, 410–422 (2018). https://doi.org/10.1038/s41579-018-0029-9. This article states that relative abundances are compositional data; any valid statistical analysis must normalize them to a constant sum (e.g., 100%) to avoid spurious correlations.

In some embodiments, the target pathogen genomes and their preset relative abundances p_pD are as follows: 15% Pseudomonas aeruginosa, 10% Mycobacterium avium, 10% Klebsiella pneumoniae, 8% Ochrobactrum anthropi, 5% Bacteroides fragilis, 5% Streptococcus pyogenes, 5% Haemophilus influenzae, 4.5% Streptococcus oralis, 4% Staphylococcus aureus, 3% Veillonella parvula, 2% Peptostreptococcus anaerobius, 1% Legionella pneumophila, 0.5% Streptococcus pneumoniae, 10% human bocavirus type 1, 5% adenovirus type 7, 4% Candida albicans, 4% Cryptococcus gattii, and 4% Aspergillus fumigatus. The innovation of this relative abundance gradient design is that the DNA pathogens in the quality control product are arranged in a gradient from high to low, forming a continuous relative abundance gradient, so that the ability of the detection system to detect pathogens of different abundance can be comprehensively evaluated.
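The requirement that the preset relative abundances sum to 100% can be checked mechanically. The following is a minimal sketch that uses only the percentages listed in the embodiment above, grouped by pathogen type in the order given; the function and variable names are illustrative and are not part of the patent.

```python
# Preset relative abundances p_pD (percent), in the order listed in the embodiment.
BACTERIA = [15, 10, 10, 8, 5, 5, 5, 4.5, 4, 3, 2, 1, 0.5]
VIRUSES = [10, 5]
FUNGI = [4, 4, 4]


def group_totals():
    """Return the summed relative abundance (percent) of each pathogen group."""
    return {
        "bacteria": sum(BACTERIA),
        "viruses": sum(VIRUSES),
        "fungi": sum(FUNGI),
    }


def is_normalized(totals, tol=1e-9):
    """True when the group totals add up to 100 percent within a tolerance."""
    return abs(sum(totals.values()) - 100.0) < tol


totals = group_totals()
print(totals)                 # per-group share of the target pathogen fraction
print(is_normalized(totals))  # whether the full panel sums to 100 percent
```

With the listed values, bacteria account for 73%, viruses for 15%, and fungi for 12% of the target pathogen fraction.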

The target pathogen genomes are mixed according to the preset relative abundance p_pD by a method comprising the following steps:

1) Calculating the initial input volume V_pD1 of each target pathogen using formula (a):

(a)

In formula (a), g_p represents the genome size of the target pathogen, and the concentration term is the mass-volume concentration of the target pathogen nucleic acid;

2) Mixing the nucleic acid of each target pathogen according to the initial input volume V_pD1 calculated in step 1), adding genomic DNA of a human cell line, and finally adding TE buffer (1×, pH 8.0) to form a sample;

3) Withdrawing 20-40% of the sample volume, constructing a library with metagenomic sequencing technology, sequencing on the instrument, and performing bioinformatic analysis to obtain the effective sequence number r_D1 of each target pathogen;

4) Calculating the actual input volume V_pD2 of each target pathogen according to formula (b):

(b)

In formula (b), r_D2 is randomly assigned within 0-1000, and r_D2 is not equal to 0;

the r_D2 values of any two pathogens α and β satisfy the conversion relation of formula (c):

(c)

In formula (c), r_D2α represents the effective sequence number of pathogen α, r_D2β represents the effective sequence number of pathogen β, p_pD2α represents the relative abundance of pathogen α, and p_pD2β represents the relative abundance of pathogen β;

5) Adjusting the nucleic acid of each target pathogen to the actual input volume V_pD2 calculated in step 4), mixing, adding human cell line genomic DNA at 100 ng/mL, and finally adding TE buffer to obtain the DNA positive quality control product.
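Steps 3) to 5) describe a pilot sequencing run followed by a volume correction. Formulas (b) and (c) themselves do not appear in this text, so the sketch below rests on two assumptions that are not confirmed by the source: that formula (c) is the proportionality r_D2α / r_D2β = p_pD2α / p_pD2β, and that formula (b) rescales the pilot volume as V_pD2 = V_pD1 × r_D2 / r_D1. The function names, pathogen labels, and all numbers are hypothetical.

```python
def assign_target_reads(abundance, anchor, anchor_reads):
    """Assumed formula (c): r_D2 ratios between pathogens equal p_pD ratios.

    `anchor_reads` is the r_D2 value chosen for the pathogen named `anchor`;
    the text only requires 0 < r_D2 <= 1000.
    """
    if not 0 < anchor_reads <= 1000:
        raise ValueError("r_D2 must be greater than 0 and at most 1000")
    scale = anchor_reads / abundance[anchor]
    return {name: p * scale for name, p in abundance.items()}


def actual_input_volumes(v_pd1, r_d1, r_d2):
    """Assumed formula (b): V_pD2 = V_pD1 * r_D2 / r_D1 for each pathogen."""
    return {name: v_pd1[name] * r_d2[name] / r_d1[name] for name in v_pd1}


# Hypothetical two-pathogen example (all numbers invented for illustration).
abundance = {"pathogen_alpha": 15.0, "pathogen_beta": 0.5}  # preset p_pD, percent
v_pd1 = {"pathogen_alpha": 2.0, "pathogen_beta": 1.0}       # pilot volumes, microliters
r_d1 = {"pathogen_alpha": 1800, "pathogen_beta": 15}        # effective reads, pilot run

r_d2 = assign_target_reads(abundance, "pathogen_alpha", 900)
v_pd2 = actual_input_volumes(v_pd1, r_d1, r_d2)
print(r_d2)   # target effective reads per pathogen
print(v_pd2)  # corrected input volumes per pathogen
```

Choosing the most abundant pathogen as the anchor keeps every other assigned r_D2 below the anchor value, and therefore within the stated 0-1000 range.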

The target pathogens are selected from Pseudomonas aeruginosa, Mycobacterium avium, Klebsiella pneumoniae, Ochrobactrum anthropi, Bacteroides fragilis, Streptococcus pyogenes, Haemophilus influenzae, Streptococcus oralis, Staphylococcus aureus, Veillonella parvula, Peptostreptococcus anaerobius, Legionella pneumophila, Streptococcus pneumoniae, human bocavirus type 1, adenovirus type 7, Candida albicans, Cryptococcus gattii, and Aspergillus fumigatus;

Preferably, the genome sizes of the target pathogens are as follows: Pseudomonas aeruginosa 6839777 bp, Mycobacterium avium 4956752 bp, Klebsiella pneumoniae 5548441 bp, Ochrobactrum anthropi 5226429 bp, Bacteroides fragilis 5234583 bp, Streptococcus pyogenes 1844942 bp, Haemophilus influenzae 1850809 bp, Streptococcus oralis 1931995 bp, Staphylococcus aureus 2806340 bp, Veillonella parvula 2132186 bp, Peptostreptococcus anaerobius 2192403 bp, Legionella pneumophila 3407565 bp, Streptococcus pneumoniae 2096425 bp, human bocavirus type 1 5099 bp, adenovirus type 7 35197 bp, Candida albicans 14735515 bp, Cryptococcus gattii 17527853 bp, Aspergillus fumigatus 28825722 bp;

The target pathogen genomes and their preset relative abundances p_pD are as follows: 15% Pseudomonas aeruginosa, 10% Mycobacterium avium, 10% Klebsiella pneumoniae, 8% Ochrobactrum anthropi, 5% Bacteroides fragilis, 5% Streptococcus pyogenes, 5% Haemophilus influenzae, 4.5% Streptococcus oralis, 4% Staphylococcus aureus, 3% Veillonella parvula, 2% Peptostreptococcus anaerobius, 1% Legionella pneumophila, 0.5% Streptococcus pneumoniae, 10% human bocavirus type 1, 5% adenovirus type 7, 4% Candida albicans, 4% Cryptococcus gattii, and 4% Aspergillus fumigatus.
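The listed genome sizes span more than three orders of magnitude, so equal masses of nucleic acid correspond to very different numbers of genome copies. The sketch below shows the standard mass-to-copy-number conversion for double-stranded DNA. It assumes an average of about 650 g/mol per base pair, which is a common approximation and not a value from the patent, and it is not a reconstruction of formula (a).

```python
AVOGADRO = 6.02214076e23  # molecules per mole
BP_MASS = 650.0           # approximate g/mol per base pair of dsDNA (assumption)


def genome_copies_per_ul(conc_ng_per_ul, genome_bp):
    """Convert a mass concentration (ng/uL) into genome copies per microliter."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    moles_per_ul = grams_per_ul / (genome_bp * BP_MASS)
    return moles_per_ul * AVOGADRO


# Genome sizes taken from the list above; the 1 ng/uL concentration is hypothetical.
hbov1 = genome_copies_per_ul(1.0, 5099)            # human bocavirus type 1
fumigatus = genome_copies_per_ul(1.0, 28825722)    # Aspergillus fumigatus
print(f"{hbov1:.3e} copies/uL, {fumigatus:.3e} copies/uL")
print(f"copy-number ratio at equal mass: {hbov1 / fumigatus:.0f}")
```

At equal mass concentration the copy-number ratio equals the inverse ratio of genome sizes, which is why a genome-size term is needed whenever relative abundance is converted into an input volume.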

In S1, extracting the bacterial genomic nucleic acid and fungal genomic nucleic acid comprises: selecting suitable culture media to culture the bacteria or fungi separately, collecting cells in the logarithmic growth phase, extracting genomic DNA, and performing quality testing and quantification;

Obtaining the viral genomic nucleic acid comprises: obtaining the full-length genome sequence of the target virus by in vitro synthesis, recombining it with an Ad5 adenovirus backbone via plasmids, amplifying in BJ5183-AD-1 and Stbl3 bacteria, transfecting 293T cells, packaging to obtain DNA pseudovirus particles, and extracting the nucleic acid of the DNA pseudovirus particles.

In the DNA positive quality control product, the final concentration of the nucleic acid is 100 ng/mL.

The preparation method of the non-constant composite quality control product for DNA pathogen metagenome high-throughput sequencing further comprises step S2, preparing a negative quality control product: washing a human cell line 3-5 times, wherein the human cell line is free from contamination by viruses or other exogenous microorganisms.

A non-constant composite quality control product for DNA pathogen metagenome high-throughput sequencing is prepared by the above preparation method, wherein the non-constant composite quality control product comprises a DNA positive quality control product;

the DNA positive quality control product comprises pathogen genomic nucleic acid and human genomic nucleic acid;

the pathogen genomic nucleic acid includes DNA virus genomic nucleic acid, bacterial genomic nucleic acid, and fungal genomic nucleic acid.

The bacterial genomic nucleic acid comprises the following components in relative abundance: 15% Pseudomonas aeruginosa, 10% Mycobacterium avium, 10% Klebsiella pneumoniae, 8% Ochrobactrum anthropi, 5% Bacteroides fragilis, 5% Streptococcus pyogenes, 5% Haemophilus influenzae, 4.5% Streptococcus oralis, 4% Staphylococcus aureus, 3% Veillonella parvula, 2% Peptostreptococcus anaerobius, 1% Legionella pneumophila, and 0.5% Streptococcus pneumoniae;

the DNA virus genomic nucleic acid comprises the following components in relative abundance: 10% human bocavirus type 1 and 5% adenovirus type 7;

the fungal genomic nucleic acid comprises the following components in relative abundance: 4% Candida albicans, 4% Cryptococcus gattii, and 4% Aspergillus fumigatus;

The non-constant composite quality control product further comprises a negative quality control product, wherein the negative quality control product comprises a human cell line pellet that has been washed 3-5 times and is free from contamination by viruses or other exogenous microorganisms.

The human cell line refers to a cell line which can be subjected to in vitro passage and is obtained by immortalizing human cells.

The human cell line pellet contains 1 × 10⁵ cells.

The cell line is a cell line free of contamination by viruses or other foreign microorganisms.

The innovations of the invention are as follows:

1. Standardized quality control system: a dual internal quality control system is constructed from the specially designed DNA positive control (DPC) and negative control (NC) quality control products, and is used to evaluate the sensitivity, specificity, and consistency of DNA pathogen metagenomic sequencing.

2. Simulation of a real sample background: the quality control product contains nucleic acid extracted from pathogenic microorganisms such as bacteria, fungi, and pseudoviruses, together with a human cell nucleic acid background in a precise proportion, so it closely simulates the complexity of real clinical samples.

3. The quality control product is suitable for next-generation sequencing (NGS) platforms, covers the DNA pathogen detection workflow, and can be widely applied in clinical practice, scientific research, and laboratory standardization testing.

4. Relative abundance gradient design: the DNA pathogens in the quality control product are arranged in a gradient from high to low, forming a continuous relative abundance gradient, so that the ability of the detection system to detect pathogens of different abundance can be comprehensively evaluated.

The beneficial effects of the invention are as follows:

First, the invention covers the whole DNA pathogen metagenomic sequencing process by integrating several forms of quality control. The DNA positive quality control product combines synthetic nucleic acid, inactivated strain DNA, and human genomic DNA, simulating as closely as possible the coexistence of pathogens and host background in clinical samples, while the negative quality control product provides a real human background to ensure detection specificity. The invention proposes and implements, for the first time, this design of a quality control product for DNA pathogen metagenomic sequencing, and builds a complete quality control scheme by integrating DNA pathogen nucleic acid (bacteria, fungi, viruses, etc.) and human background nucleic acid into the same quality control system. Specifically, the DNA positive quality control product adopts a composite design of nucleic acid extracted from artificially synthesized pseudoviruses, inactivated strain DNA, and human genomic DNA, which together with the negative quality control product constitutes a breakthrough in quality control system design.

Second, the invention adopts an efficient and reliable production process: pseudovirus technology avoids the risks of live virus culture and inactivation, a standardized preparation process ensures batch-to-batch stability, and precise quantitative design ensures reliable detection results, markedly improving the production efficiency and quality of the quality control product.

Third, the invention achieves accurate quantification and batch uniformity of the quality control product through standardized preparation technology and proportioning design. By adding human genomic DNA to the DNA positive quality control product as host background nucleic acid and preparing the negative quality control product from human cells, it constructs a complex background system highly similar to clinical samples, simulates as closely as possible the coexistence of pathogen and host nucleic acid in clinical samples, and makes the quality control results realistic and reliable. The quality control product is suitable for the whole metagenomic sequencing process of DNA extraction, library construction, and sequencing analysis, and therefore has wide coverage.

Fourth, the invention proposes for the first time the design concept of a DNA pathogen metagenomic quality control product. By integrating DNA from bacteria, fungi, and viruses together with human background nucleic acid into one quality control system, it overcomes the limitation of existing single quality control schemes, provides a complete quality control solution for clinical laboratories, and meets the detection requirements of different types of pathogens at the same time. Background simulation: the human cell pellet provides a complex nucleic acid background that accurately simulates the real environment in clinical samples. Stability and long-term storage: the quality control product undergoes uniformity and stability evaluation to ensure consistent performance in short-term and long-term storage. The quality control system provides comprehensive quality control coverage, supports full-process quality control from sample processing to bioinformatic analysis, and is particularly suitable for high-throughput sequencing platforms such as NGS. Because the advantages of the different types of quality control products complement one another, multiple quality control functions are achieved in one system.

Fifth, the invention adopts an abundance gradient design that forms a continuous distribution from high to low through precise control of pathogen relative abundance. In the DNA positive quality control product, the gradient from 15% Pseudomonas aeruginosa to 0.5% Streptococcus pneumoniae not only allows comprehensive evaluation of the detection system's ability to detect pathogens at different concentrations, but also provides a reliable basis for determining the lower limit of detection sensitivity. This gradient design is clearly superior to the simple high/low concentration settings of existing quality control products, and gives clinical laboratories a more comprehensive and accurate reference for evaluating detection system performance.
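One way a laboratory might read such a gradient is to record, for each sequencing run, the lowest preset abundance level at which a target was still reported. The sketch below illustrates that bookkeeping only. The five abundance levels come from the embodiment, while the set of detected organisms is a hypothetical run result and the function name is illustrative.

```python
def lowest_detected_level(gradient, detected):
    """Return the smallest preset abundance (percent) among detected targets.

    Returns None when no target on the gradient was detected.
    """
    hits = [level for name, level in gradient.items() if name in detected]
    return min(hits) if hits else None


# Subset of the preset abundance gradient (percent) from the embodiment.
GRADIENT = {
    "Pseudomonas aeruginosa": 15.0,
    "Klebsiella pneumoniae": 10.0,
    "Staphylococcus aureus": 4.0,
    "Legionella pneumophila": 1.0,
    "Streptococcus pneumoniae": 0.5,
}

# Hypothetical run in which the 0.5% target was missed.
detected = {
    "Pseudomonas aeruginosa",
    "Klebsiella pneumoniae",
    "Staphylococcus aureus",
    "Legionella pneumophila",
}

print(lowest_detected_level(GRADIENT, detected))
```

A single run of this kind only indicates where detection stopped on that occasion; a formal limit of detection would need replicate runs.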

Sixth, the invention provides a DNA pathogen quality control system for DNA pathogen metagenomic sequencing that not only achieves quality control of the whole detection process but also, by simulating the nucleic acid composition of real clinical samples, markedly improves the accuracy and reliability of quality control, meeting the requirements of clinical diagnosis for stable, reliable, and comprehensive quality control products. The composite quality control system thus raises the quality control level of DNA pathogen metagenomic sequencing and provides more reliable and comprehensive quality assurance for clinical diagnosis. The invention is also an economical and practical solution: one set of quality controls meets multiple quality control needs and reduces laboratory purchasing and operating costs; the product has excellent stability, which relaxes storage and transportation requirements; and it is simple to operate, convenient to popularize in clinical laboratories, and of significant practical value.

In addition, the invention has the following beneficial effects:

1. The DNA positive quality control product is nucleic acid extracted from pseudoviruses and inactivated cultures and covers the whole length of the respective genome.

2. The DNA positive quality control product consists of pathogen genomic nucleic acid, so it places low demands on laboratory biosafety and can be used in laboratories of biosafety level P2 or lower. The DNA positive quality control product can be produced in a standardized manner.

3. The raw materials of the invention come from stable sources and can be obtained continuously. The prepared quality control product is stored in TE buffer; because the matrix is simple, detection results are not destabilized by interfering components, production can be standardized, and quality control performance remains stable. The vast majority of pathogens in the quality control product are nucleic acids extracted from inactivated cultures, and a small number of viruses are nucleic acids extracted from pseudoviruses. For the pseudoviruses, plasmids are introduced into Escherichia coli or cells, where a protein shell is packaged around them to form pseudovirus particles. In theory these can be propagated indefinitely, so mass production does not rely directly on chemically synthesized plasmids, which keeps costs under control.

4. Existing quality control products mostly target a single pathogen. Covering fungi, bacteria and viruses with such products means preparing several products in parallel, which multiplies material consumption and labor hours and greatly increases production cost. The invention covers multiple types of pathogens in one tube, reducing material consumption, labor hours and production cost several-fold. Each pathogen raw material is prepared in a large single lot, so a single batch yields enough quality control product for long-term use and batch changes are infrequent. When a batch change is needed, the pathogen concentration of the raw material is measured repeatedly, adjusted step by step to the concentration level of the previous batch, and then dispensed at a fixed input volume, which maximizes uniformity between batches. The coefficient of variation (CV) in Experimental example 1 reflects the uniformity of the quality control product.

5. Experimental examples 1-2 of the invention cover the DNA positive quality control product and the negative quality control product. The positive quality control product contains multiple clinically important pathogens, covering bacteria, fungi and viruses, each represented by its full-length genome, and can therefore meet the needs of DNA pathogen metagenomic sequencing analysis. In theory, DNA pathogen metagenomic next-generation sequencing (mNGS) can detect all DNA pathogens, unlike traditional molecular methods such as PCR, which detect only specific pathogens. Because the invention covers multiple types of pathogens, users do not need to purchase several single-pathogen quality control products or to build, sequence and analyze a separate library for each. Analysis requires only one DNA library and 20M reads of sequencing data, which reduces detection cost, eliminates performance differences between different types of quality control products, achieves quality control of multiple pathogens in one tube, and helps laboratories comprehensively evaluate DNA pathogen mNGS detection performance.

6. Experimental example 1 of the invention comprises a DNA positive quality control product and a negative quality control product. In Experimental example 1, the abundances of the pathogens in the DNA positive quality control product are set as a gradient from high to low that includes medium-positive and weak-positive concentration levels, so the sensitivity and accuracy of the sequencing workflow can be evaluated effectively. Experimental example 2 describes the operating procedure for the negative quality control product, which evaluates the specificity and accuracy of the sequencing workflow. The DPC is a positive quality control product and addresses sensitivity and accuracy: every declared pathogen must be detected and its relative abundance must meet expectations; it is not used to assess specificity. The negative quality control product is used to evaluate specificity.

7. The pathogens in Experimental example 1 all cover their full-length genomes and can effectively simulate a real sample. The DNA positive quality control product of Experimental example 1 is spiked with human cell line genomic DNA at a proportion of more than 99%, which simulates the more-than-99% proportion of host nucleic acid found in clinical samples.

8. Most existing quality control products contain a single pathogen, are prepared from pseudoviruses or nucleic acids, cover only part of the pathogen genome, and are suitable only for PCR-based in vitro diagnostic reagents. The DNA positive quality control product prepared in Experimental example 1 contains multiple clinically important pathogens, covers their full-length genomes, and meets the requirements of DNA pathogen metagenomic sequencing. Some laboratories perform only DNA mNGS testing and need only a DNA positive quality control product and a negative quality control product. In the composite quality control product, the DPC and NC are packaged separately and independently in the kit and can be used together, which meets the needs of such laboratories and shows the flexibility of the invention.

Drawings

FIG. 1 is a box plot showing the relative abundance of pathogens detected by the DNA Positive Control (DPC) of Experimental example 1 of the present invention.

FIG. 2 is a plot of relative abundance of detected pathogens for DNA positive quality control (DPC) of experimental example 1 of the present invention.

Detailed Description

The following describes the present invention in detail with reference to the accompanying drawings, specific examples and experimental examples, but is not intended to limit the scope of the present invention.

Sources of biological materials

1. The bacterial strains used in Experimental example 1 were obtained from the American Type Culture Collection (ATCC): Pseudomonas aeruginosa ATCC 27853™; Mycobacterium avium ATCC 25291™; Klebsiella pneumoniae ATCC 13883™; Ochrobactrum anthropi ATCC 49188™; Bacteroides fragilis ATCC 25285™; Streptococcus pyogenes ATCC 19615™; Haemophilus influenzae ATCC 49766™; Streptococcus oralis ATCC 35037™; Veillonella parvula ATCC 10790™; Staphylococcus aureus ATCC 25923™; Peptostreptococcus anaerobius ATCC 27337™; Legionella pneumophila ATCC 33152™; and Streptococcus pneumoniae ATCC 49619™. The human bocavirus type 1 and adenovirus type 7 DNA pseudoviruses were artificially prepared. BJ5183-AD-1 competent cells, Stbl3 competent cells, 293T cells and the LN229 cell line were commercially available from the American Type Culture Collection.

2. The human cell line used in Experimental example 2 was LN229 (ATCC CRL-2611™) from the American Type Culture Collection. A person skilled in the art may replace the LN229 cell genomic DNA used in the invention with nucleic acid from cell lines of other origins, for example other human cell lines such as GM12878 (human lymphoblastoid cell line), HEK293 (human embryonic kidney cells; healthy human embryonic kidney cells immortalized with adenovirus 5 (Ad5) DNA fragments), MCF-10A (mammary epithelial cells from healthy female mammary tissue; non-tumorigenic immortalized cells) or HUVEC (human umbilical vein endothelial cells; primary cells from healthy neonatal umbilical vein, with limited passage capacity or immortalized); such replacement is routine. These cell lines are all commercially available.

Group 1 examples: preparation method of the non-constant composite quality control product of the invention

This group of examples provides a preparation method for a non-constant composite quality control product for high-throughput sequencing of DNA pathogen metagenomes. All examples in this group share the following features: the preparation method comprises the following steps:

S1, preparing a DNA positive quality control product:

(1) Extracting bacterial genomic nucleic acid and fungal genomic nucleic acid;

(2) Obtaining viral genome nucleic acid;

(3) Extracting human genome nucleic acid;

(1) Calculating the initial input volume V_pD1 of each target pathogen using formula (a):

(a)

Here 3.0E+09 is scientific notation for 3×10⁹. The human haploid genome is about 3.3 Gb in size (about 3 billion base pairs), and the DNA of one human haploid genome has a mass of about 3.3 pg. The copy number concentration of the 18 pathogens in the mixture is 1000 copies/mL. Substituting the genome size of each of the 18 pathogens into formula (a) gives the input volume V_pD1 of each pathogen. The genome size of each target pathogen can be looked up on the ATCC official website and in the NCBI database. The mass-volume concentration of each pathogen nucleic acid is measured with a Qubit 4 fluorometer (Invitrogen, USA) after the nucleic acid has been extracted.

As used herein, the DNA pathogen refers to a pathogen whose genetic material is DNA.

In particular embodiments, the DNA pathogen includes bacteria, fungi, DNA viruses;

In some specific embodiments, the bacteria include Pseudomonas aeruginosa, Mycobacterium avium, Klebsiella pneumoniae, Ochrobactrum anthropi, Bacteroides fragilis, Streptococcus pyogenes, Haemophilus influenzae, Streptococcus oralis, Staphylococcus aureus, Veillonella parvula, Peptostreptococcus anaerobius, Legionella pneumophila and Streptococcus pneumoniae.

In other specific embodiments, the fungi include Candida albicans, Cryptococcus gattii and Aspergillus fumigatus.

In some embodiments, the DNA virus comprises human bocavirus type 1 and adenovirus type 7.

Each of the above-mentioned target pathogens is a common pathogen known to those skilled in the art, having the ordinary technical meaning known to those skilled in the art, such as:

Pseudomonas aeruginosa may have the meaning of the term "Pseudomonas aeruginosa" as used in "Study on the bacteriostasis of Rhizoma Atractylodis based on bioinformatics analysis", or the meaning given for that term in Baidu Baike;

Mycobacterium avium may have the meaning of the term "Mycobacterium avium" as used in "Evaluating the Effects of 60°C Heating for 90 Min on Bacterial Pathogen Viability and IgG Concentration in Bovine Colostrum", or the meaning given for that term in Baidu Baike;

Klebsiella pneumoniae may have the meaning of the term "Klebsiella pneumoniae" as used in "Pathogen distribution and drug resistance analysis in sterile body fluid", or the meaning given for that term in Baidu Baike;

Ochrobactrum anthropi may have the meaning of the term "Ochrobactrum anthropi" as used in "Microbial community structure and carbon sequestration ability in oil-containing solid waste residue", or the meaning given for that term in Baidu Baike;

Bacteroides fragilis may have the meaning of the term "Bacteroides fragilis" as used in "Study of the characteristics of the intestinal and plaque flora of elderly patients with lower-limb arteriosclerosis obliterans", or the meaning given for that term in Baidu Baike;

Streptococcus pyogenes may have the meaning of the term "Streptococcus pyogenes" as used in "Rhamnose polysaccharide-decorated outer membrane vesicles as a vaccine candidate targeting Group A Streptococcus from Streptococcus pyogenes and Streptococcus dysgalactiae subsp. equisimilis", or the meaning given for that term in Baidu Baike;

Haemophilus influenzae may have the meaning of the term "Haemophilus influenzae" as used in "Analysis of the clinical features of 180 hospitalized children with pertussis", or the meaning given for that term in Baidu Baike;

Streptococcus oralis may have the meaning of the term "Streptococcus oralis" as used in "Analysis of structural changes of the intestinal flora in neonatal hyperbilirubinemia caused by early-onset septicemia", or the meaning given for that term in Baidu Baike;

Staphylococcus aureus may have the meaning of the term "Staphylococcus aureus" as used in "Research on the freshness of shrimps by a sodium alginate-based intelligent active film with antibacterial and ammonia-sensitive functions", or the meaning given for that term in Baidu Baike;

Veillonella parvula may have the meaning of the term "Veillonella parvula" as used in "Influence of Veillonella parvula and its acetate-producing gene deletion engineered strain on rumen microbial fermentation";

Peptostreptococcus anaerobius may have the meaning of the term "Peptostreptococcus anaerobius" as used in "Pharmaceutical practice of a clinical pharmacist participating in the treatment of one parturient infected with Peptostreptococcus anaerobius";

Legionella pneumophila may have the meaning of the term "Legionella pneumophila" as used in "Influence of herpesvirus infection on the clinical prognosis and respiratory tract microecology of patients with severe pneumonia", or the meaning given for that term in Baidu Baike;

Streptococcus pneumoniae may have the meaning of the term "Streptococcus pneumoniae" as used in "Preparation of Streptococcus pneumoniae capsular refined polysaccharide and degraded polysaccharide", or the meaning given for that term in Baidu Baike;

Human bocavirus type 1 may have the meaning of the term "human bocavirus type 1" as used in "Research progress in the molecular biology of human bocavirus type 1";

Adenovirus type 7 may have the meaning of the term "adenovirus type 7" as used in "Percentage of monocytes combined with CRP and PCT to identify influenza A virus and adenovirus infection";

Candida albicans may have the meaning of the term "Candida albicans" as used in "Research progress on chromosome aneuploidy in fungal drug resistance mechanisms", or the meaning given for that term in Baidu Baike;

Cryptococcus gattii may have the meaning of the term "Cryptococcus gattii" as used in "Separation and identification of secondary metabolites of the endophytic fungus Stagonosporopsis sp. YZHH-J-1 and research on their antibacterial activity";

Aspergillus fumigatus may have the meaning of the term "Aspergillus fumigatus" as used in "Research progress on chromosome aneuploidy in fungal drug resistance mechanisms", or the meaning given for that term in Baidu Baike.

The genome sizes of the target pathogens are: Pseudomonas aeruginosa 6839777 bp, Mycobacterium avium 4956752 bp, Klebsiella pneumoniae 5548441 bp, Ochrobactrum anthropi 5226429 bp, Bacteroides fragilis 5234583 bp, Streptococcus pyogenes 1844942 bp, Haemophilus influenzae 1850809 bp, Streptococcus oralis 1931995 bp, Staphylococcus aureus 2806340 bp, Veillonella parvula 2132186 bp, Peptostreptococcus anaerobius 2192403 bp, Legionella pneumophila 3407565 bp, Streptococcus pneumoniae 2096425 bp, human bocavirus type 1 5099 bp, adenovirus type 7 35197 bp, Candida albicans 14735515 bp, Cryptococcus gattii 17527853 bp and Aspergillus fumigatus 28825722 bp;
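Formula (a) itself is not reproduced in this text. One plausible reading, consistent with the quantities named above, scales the mass of the human haploid genome (3.3 pg per 3.0E+09 bp) to each pathogen's genome size to get the mass of one genome copy, multiplies by the target copy number, and divides by the Qubit-measured stock concentration. The Python sketch below follows that assumption only. The function name, the choice of units, the reading of 1000 copies/mL as a per-pathogen target, and the stock concentrations are all hypothetical and are not taken from the patent; the genome sizes are those listed above.

```python
# Sketch of one possible form of formula (a). The formula itself is an
# assumption reconstructed from the quantities named in the description.
HUMAN_HAPLOID_BP = 3.0e9   # 3.0E+09 bp, as stated in the description
HUMAN_HAPLOID_PG = 3.3     # approximate mass of one human haploid genome (pg)


def initial_input_volume_ul(genome_bp, stock_ng_per_ul,
                            copies_per_ml=1000.0, total_ml=1.0):
    """Volume (uL) of stock nucleic acid holding the target number of copies."""
    if genome_bp <= 0 or stock_ng_per_ul <= 0:
        raise ValueError("genome size and stock concentration must be positive")
    # Mass of one genome copy, scaled from the human haploid genome.
    pg_per_copy = genome_bp / HUMAN_HAPLOID_BP * HUMAN_HAPLOID_PG
    total_pg = pg_per_copy * copies_per_ml * total_ml
    stock_pg_per_ul = stock_ng_per_ul * 1000.0  # 1 ng = 1000 pg
    return total_pg / stock_pg_per_ul


# Genome sizes from the description; stock concentrations are hypothetical.
genome_sizes_bp = {
    "Pseudomonas aeruginosa": 6839777,
    "Streptococcus pneumoniae": 2096425,
    "Human bocavirus type 1": 5099,
}
hypothetical_stock_ng_per_ul = {
    "Pseudomonas aeruginosa": 25.0,
    "Streptococcus pneumoniae": 12.0,
    "Human bocavirus type 1": 0.5,
}

for name, size in genome_sizes_bp.items():
    volume = initial_input_volume_ul(size, hypothetical_stock_ng_per_ul[name])
    print(f"{name}: {volume:.3e} uL of stock for 1000 copies in 1 mL")
```

Under this reading the computed volumes are far below what can be pipetted directly, so the stocks would presumably be diluted first; that dilution step is likewise an assumption and is not described in the text.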

(2) Mixing the nucleic acid of each target pathogen at the initial input volume V_pD1 calculated in step (1), adding human cell line genomic DNA, and finally adding TE buffer (1X, pH 8.0) to a total volume of 1 mL. In some embodiments, the amount of human cell line genomic DNA added is 100 ng.

(3) Withdrawing 20%-40% of the volume of the sample, constructing a library by metagenomic sequencing technology, sequencing on the instrument, and performing bioinformatic analysis to obtain the effective sequence number r_D1 of each target pathogen;

In some embodiments, the 20%-40% aliquot can be adjusted according to the total volume of the sample; for example, if the total volume is 1 mL, 300 μL can be withdrawn from the 1 mL.

In a specific embodiment, the effective sequence number r_D1 has the conventional technical meaning well known to those skilled in the art, for example the meaning of the term "effective sequence number" as used in "Study of the diversity and antibacterial activity of Silybum marianum".

Constructing a library by metagenomic sequencing technology, sequencing on the instrument, and performing bioinformatic analysis to obtain the effective sequence number r_D1 of each target pathogen are conventional operations well known to those skilled in the art; for the specific procedure, reference may be made to standard metagenomic sequencing methods.

(4) The actual input volume V_pD2 of each target pathogen is calculated according to formula (b):

(b)

(c)

The r_D2 value of one pathogen is chosen within the range 0-1000; because the relative abundance p_pD of each pathogen is preset and fixed, the r_D2 values of the other pathogens can then be calculated from formula (c).

(5) Adding the nucleic acid of each target pathogen at the actual input volume V_pD2 calculated in step (4), mixing, adding 100 ng of human cell line genomic DNA, and finally adding TE buffer (1X, pH 8.0) to obtain the DNA positive quality control product. In some embodiments, the total volume of the DNA positive quality control product is 1 mL.
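Formulas (b) and (c) are not reproduced in this text. The sketch below shows one way steps (4) and (5) could be computed, under two assumptions that are not confirmed by the patent: that formula (c) defines the relative abundance p_pD of a pathogen as its target effective sequence number r_D2 divided by the sum over all target pathogens, and that formula (b) rescales the initial volume linearly, V_pD2 = V_pD1 × r_D2 / r_D1. All function names and the example volumes and read counts are hypothetical.

```python
def target_reads(preset_abundance, anchor, anchor_reads):
    """Assumed formula (c): r_D2 values keep the preset abundance ratios.

    preset_abundance maps pathogen -> preset relative abundance p_pD (%).
    anchor is the pathogen whose r_D2 is chosen by the operator.
    """
    if not 0 < anchor_reads <= 1000:
        raise ValueError("anchor r_D2 is expected within the range 0-1000")
    scale = anchor_reads / preset_abundance[anchor]
    return {name: p * scale for name, p in preset_abundance.items()}


def actual_input_volumes(v_pd1, r_d1, r_d2):
    """Assumed formula (b): V_pD2 = V_pD1 * r_D2 / r_D1 for each pathogen."""
    volumes = {}
    for name, v1 in v_pd1.items():
        if r_d1[name] <= 0:
            raise ValueError(f"{name}: no effective reads in the pilot run")
        volumes[name] = v1 * r_d2[name] / r_d1[name]
    return volumes


# Three of the 18 pathogens, with their preset abundances from the description.
preset = {
    "Pseudomonas aeruginosa": 15.0,
    "Mycobacterium avium": 10.0,
    "Streptococcus pneumoniae": 0.5,
}
# Hypothetical pilot-run values: initial volumes (uL) and observed reads r_D1.
v_pd1 = {"Pseudomonas aeruginosa": 2.0, "Mycobacterium avium": 2.0,
         "Streptococcus pneumoniae": 2.0}
r_d1 = {"Pseudomonas aeruginosa": 300.0, "Mycobacterium avium": 500.0,
        "Streptococcus pneumoniae": 250.0}

r_d2 = target_reads(preset, anchor="Pseudomonas aeruginosa", anchor_reads=600)
v_pd2 = actual_input_volumes(v_pd1, r_d1, r_d2)
for name in preset:
    print(f"{name}: r_D2={r_d2[name]:.1f}, V_pD2={v_pd2[name]:.3f} uL")
```

The linear rescaling assumes that effective reads are proportional to input volume; if a real workflow departs from proportionality, a second pilot run with the adjusted volumes would be needed to confirm the abundances.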

In specific embodiments, the target pathogens are selected from Pseudomonas aeruginosa, Mycobacterium avium, Klebsiella pneumoniae, Ochrobactrum anthropi, Bacteroides fragilis, Streptococcus pyogenes, Haemophilus influenzae, Streptococcus oralis, Staphylococcus aureus, Veillonella parvula, Peptostreptococcus anaerobius, Legionella pneumophila, Streptococcus pneumoniae, human bocavirus type 1, adenovirus type 7, Candida albicans, Cryptococcus gattii and Aspergillus fumigatus;

In some embodiments, in S1, extracting the bacterial genomic nucleic acid and fungal genomic nucleic acid comprises: culturing each bacterium or fungus in a suitable medium, collecting cells in the logarithmic growth phase, extracting genomic DNA, and performing quality testing and quantification;

The above DNA extraction is a routine procedure well known to those skilled in the art. For example, in some embodiments, bacterial nucleic acid is extracted with a bacterial genomic DNA extraction kit (DP302) from Tiangen Biotech (Beijing) Co., Ltd., and fungal nucleic acid is extracted with a fungal genomic DNA extraction kit (D2300-100) from Beijing Solarbio Science & Technology Co., Ltd.; the extraction can be performed according to the instructions for use of these kits.

Obtaining the viral genomic nucleic acid comprises: synthesizing the full-length genome sequence of the target virus in vitro, recombining it with an Ad5 adenovirus backbone via plasmids, amplifying in BJ5183-AD-1 and Stbl3 bacteria, and then transfecting 293T cells for packaging to obtain DNA pseudovirus particles. Nucleic acid is extracted from the collected virus supernatant with a kit, and its quality is confirmed by several detection methods such as digital PCR. These are conventional operations well known to those skilled in the art; specifically, the viral nucleic acid extraction and digital PCR can be performed with reference to the method described in "a lentivirus vector system-based HIV-1 genotype drug resistance detection quality control product, a preparation method and application thereof".

The above step of obtaining viral genomic nucleic acid is a routine technical procedure well known to those skilled in the art and can be performed with reference to the steps described in "Choi, V.W., Asokan, A., Haberman, R.A. and Samulski, R.J. (2007), Production of Recombinant Adeno-Associated Viral Vectors for In Vitro and In Vivo Use. Current Protocols in Molecular Biology, 78: 16.25.1-16.25.24. https://doi.org/10.1002/0471142727.mb1625s78".

Preferably, in the DNA positive quality control, the final concentration of the nucleic acid is 100ng/mL. The final concentration of nucleic acid herein refers to the total concentration of bacterial genomic nucleic acid and fungal genomic nucleic acid, viral genomic nucleic acid, and human genomic nucleic acid added together.

In a further embodiment, the preparation method of the non-constant composite quality control product for high-throughput sequencing of DNA pathogen metagenomes further comprises the following step: S2, preparing a negative quality control product: washing a human cell line 3-5 times to obtain a human cell line pellet free of contamination by viruses or other exogenous microorganisms. The washing is a conventional technical operation well known to those skilled in the art and can be performed with reference to the procedure described in the section "Contamination and Biosafety" of the "ATCC Animal Cell Culture Guide" (revised 2023).

Group 2 examples: non-constant composite quality control product of the invention

The present set of examples provides a non-constant composite quality control for high throughput sequencing of DNA pathogen metagenome. All the embodiments of the group have the common characteristics that the non-constant composite quality control product is prepared by adopting the preparation method of any one of the embodiments of the group 1, wherein the non-constant composite quality control product comprises a DNA positive quality control product;

Pathogenic genomic nucleic acids include DNA viral genomic nucleic acids, bacterial genomic nucleic acids, fungal genomic nucleic acids;

The bacterial genomic nucleic acids are present at the following relative abundances: 15% Pseudomonas aeruginosa, 10% Mycobacterium avium, 10% Klebsiella pneumoniae, 8% Ochrobactrum anthropi, 5% Bacteroides fragilis, 5% Streptococcus pyogenes, 5% Haemophilus influenzae, 4.5% Streptococcus oralis, 4% Staphylococcus aureus, 3% Veillonella parvula, 2% Peptostreptococcus anaerobius, 1% Legionella pneumophila and 0.5% Streptococcus pneumoniae. Here the relative abundance is the number of effective reads from the genome of a specific bacterium as a percentage of the number of effective reads from the genomic nucleic acid of all pathogens contained in the DNA positive quality control product;

The DNA viral genomic nucleic acids are present at the following relative abundances: 10% human bocavirus type 1 and 5% adenovirus type 7. Here the relative abundance is the number of effective reads from the genome of a specific virus (human bocavirus type 1 or adenovirus type 7) as a percentage of the number of effective reads from the genomes of all pathogens contained in the DNA positive quality control product;

The fungal genomic nucleic acids are present at the following relative abundances: 4% Candida albicans, 4% Cryptococcus gattii and 4% Aspergillus fumigatus. Here the relative abundance is the number of effective reads from the genome of a specific fungus as a percentage of the number of effective reads from the genomes of all pathogens contained in the DNA positive quality control product.
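As a quick consistency check on the preset panel, the short sketch below tabulates the 18 relative abundances listed above (using standard species names) and sums them per group and overall. The dictionary layout and variable names are illustrative only.

```python
# Preset relative abundances (%) of the 18 target pathogens, as listed above.
PRESET_ABUNDANCE = {
    "bacteria": {
        "Pseudomonas aeruginosa": 15.0,
        "Mycobacterium avium": 10.0,
        "Klebsiella pneumoniae": 10.0,
        "Ochrobactrum anthropi": 8.0,
        "Bacteroides fragilis": 5.0,
        "Streptococcus pyogenes": 5.0,
        "Haemophilus influenzae": 5.0,
        "Streptococcus oralis": 4.5,
        "Staphylococcus aureus": 4.0,
        "Veillonella parvula": 3.0,
        "Peptostreptococcus anaerobius": 2.0,
        "Legionella pneumophila": 1.0,
        "Streptococcus pneumoniae": 0.5,
    },
    "DNA viruses": {
        "Human bocavirus type 1": 10.0,
        "Adenovirus type 7": 5.0,
    },
    "fungi": {
        "Candida albicans": 4.0,
        "Cryptococcus gattii": 4.0,
        "Aspergillus fumigatus": 4.0,
    },
}

# Sum per group, then across the whole panel.
group_totals = {group: sum(members.values())
                for group, members in PRESET_ABUNDANCE.items()}
panel_total = sum(group_totals.values())
pathogen_count = sum(len(members) for members in PRESET_ABUNDANCE.values())

for group, total in group_totals.items():
    print(f"{group}: {total:.1f}%")
print(f"{pathogen_count} pathogens, panel total {panel_total:.1f}%")
```

The preset values sum to 73% for bacteria, 15% for DNA viruses and 12% for fungi, that is, 100% of the pathogen-derived effective reads.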

These microorganisms are among the pathogenic microorganisms of greatest clinical concern and commonly cause infections of the respiratory tract, bloodstream or reproductive system. The relative abundance of the human genome is more than 99%, and the sum of the relative abundances of the target pathogens spiked into the quality control product is less than 1%; nominally the two add up to 100%. In actual sequencing data the total is not exactly 100%, because the data also contain sequences of non-target microorganisms, which may come from environmental contamination, reagent contamination (such as microbial DNA in reagents), or contamination during sample collection or processing (such as skin-colonizing bacteria or microorganisms introduced by laboratory operations). The percentage values given in the invention are all preset values for the user's reference. The quality control product is a non-constant (non-assigned-value) quality control product and is not a reference standard; the user must accumulate a target value by testing the quality control product in their own detection system on more than 20 consecutive days. The non-constant composite quality control product can be marketed as a final product, and the core innovation of the invention is its preparation method. A non-constant quality control product is one for which the manufacturer does not provide specific values and marks only the expected range or concentration gradient. The term "non-constant value" has the conventional technical meaning commonly understood by those skilled in the art; for example, it may have the meaning used in section "I. Scope of application" of the "Guidelines for registration review of quality control products: value assignment studies of quality control products (No. 36 of 2022)" issued by the National Medical Products Administration, which states that "quality control products may be classified into constant-value quality control products and non-constant-value quality control products according to whether a labeled value of the analyte is given".
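Because the product carries no assigned value, each laboratory derives its own target from repeated runs. The sketch below shows one common way to do this: compute the mean, standard deviation and coefficient of variation of the daily results and derive ±2 SD and ±3 SD limits. The ±2 SD / ±3 SD convention, the reading of "more than 20 consecutive days" as at least 20 daily results, the function name and the example data are assumptions for illustration and are not prescribed by the patent.

```python
from statistics import mean, stdev


def accumulate_target(daily_values, min_runs=20):
    """Derive a laboratory target value and control limits from daily runs.

    daily_values: one measured relative abundance (%) per day for one pathogen.
    min_runs: minimum number of daily results required (assumed to be 20).
    """
    if len(daily_values) < min_runs:
        raise ValueError(f"need at least {min_runs} daily results")
    m = mean(daily_values)
    s = stdev(daily_values)            # sample standard deviation
    return {
        "mean": m,
        "sd": s,
        "cv_percent": s / m * 100.0,
        "warning_limits": (m - 2 * s, m + 2 * s),
        "rejection_limits": (m - 3 * s, m + 3 * s),
    }


# Hypothetical daily relative abundances (%) of one pathogen over 20 days.
hypothetical_runs = [14.8, 15.3, 15.1, 14.6, 15.4, 15.0, 14.9, 15.2, 14.7, 15.5,
                     15.1, 14.9, 15.0, 15.3, 14.8, 15.2, 14.7, 15.1, 15.0, 14.9]

summary = accumulate_target(hypothetical_runs)
print(f"target {summary['mean']:.2f}% (SD {summary['sd']:.2f}, "
      f"CV {summary['cv_percent']:.1f}%)")
print("warning limits:", tuple(round(x, 2) for x in summary["warning_limits"]))
```

The same calculation would be repeated for each of the 18 pathogens, since each has its own preset abundance and its own run-to-run variability.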

The genome nucleic acid in the non-constant composite quality control product is the whole genome nucleic acid sequence.

When the non-constant composite quality control product is used for metagenomic sequencing, the host rate can exceed 99%, and the number of effective reads aligned to pathogen genomes is within 1% of the total number of effective sequencing reads. The host rate is the proportion of human-derived DNA sequences in the sequencing data, that is, (number of effective reads aligned to the human genome / total number of effective sequencing reads) × 100%.
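The host rate definition above translates directly into a small calculation. The sketch below computes the host rate and the pathogen read percentage and compares them with the more-than-99% and within-1% figures stated in the text; the function names and the read counts in the example are hypothetical.

```python
def percent_of_total(reads, total_effective_reads):
    """(reads / total effective sequencing reads) x 100%."""
    if total_effective_reads <= 0:
        raise ValueError("total effective reads must be positive")
    return reads / total_effective_reads * 100.0


def check_dpc_run(human_reads, pathogen_reads, total_effective_reads):
    """Compare one DPC sequencing run with the stated host/pathogen proportions."""
    host_rate = percent_of_total(human_reads, total_effective_reads)
    pathogen_rate = percent_of_total(pathogen_reads, total_effective_reads)
    return {
        "host_rate_percent": host_rate,
        "pathogen_percent": pathogen_rate,
        "host_rate_ok": host_rate > 99.0,      # host rate more than 99%
        "pathogen_ok": pathogen_rate <= 1.0,   # pathogen reads within 1%
    }


# Hypothetical run of 20M effective reads; the remainder would be non-target reads.
result = check_dpc_run(human_reads=19_860_000,
                       pathogen_reads=120_000,
                       total_effective_reads=20_000_000)
print(result)
```

Reads that align to neither the human genome nor a target pathogen are left in the denominator, which is why the two percentages need not sum to exactly 100%.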

In a further embodiment, the non-constant composite quality control product for high-throughput sequencing of DNA pathogen metagenomes further comprises a negative quality control product, which comprises a human cell line pellet that has been washed 3-5 times and is free of contamination by viruses or other exogenous microorganisms.

In a specific embodiment, the human cell line is a cell line that is obtained by immortalizing human cells and can be passaged in vitro;

in a more specific embodiment, the human cell line pellet contains 1×10⁵ cells;

in a preferred embodiment, the cell line is a cell line free of contamination by viruses or other foreign microorganisms.

The invention provides a composite quality control product for DNA pathogen metagenome high-throughput sequencing and a preparation method thereof. The quality control product comprises a DNA positive quality control product (DPC) and a negative quality control product (NC), and can be used for evaluating the detection performance of the whole DNA pathogen metagenome sequencing process.

The technical scheme of the invention is as follows:

1. Quality control product specification and storage conditions

Quality control product name | Specification | Storage conditions

DNA metagenomic positive quality control product | 0.5 mL/vial | -20 °C ± 5 °C

Metagenomic negative cell quality control product | 1×10⁵ cells/vial | -20 °C ± 5 °C

2. Quality control product composition

1. DNA positive quality control product (DPC)

The quality control product is designed to cover a range of pathogenic microorganisms, including DNA viruses, bacteria and fungi. These pathogens can cause infectious diseases of multiple systems and organs, such as the human respiratory system, digestive system, genitourinary system, and skin and soft tissue, so the product has broad clinical application value.

DPC is a nucleic acid quality control product comprising the genomic nucleic acids of DNA virus, bacterial and fungal pathogens together with human genomic nucleic acid.

The components are mixed at preset proportions, and human nucleic acid accounts for more than 99% of the total nucleic acid.

The relative abundances of the bacterial pathogens in DPC are set to 15% Pseudomonas aeruginosa, 10% Mycobacterium avium, 10% Klebsiella pneumoniae, 8% Ochrobactrum anthropi, 5% Bacteroides fragilis, 5% Streptococcus pyogenes, 5% Haemophilus influenzae, 4.5% Streptococcus oralis, 4% Staphylococcus aureus, 3% Veillonella parvula, 2% Peptostreptococcus anaerobius, 1% Legionella pneumophila and 0.5% Streptococcus pneumoniae. The relative abundances of the DNA viruses are set to 10% human bocavirus type 1 and 5% adenovirus type 7. The relative abundances of the fungal pathogens are set to 4% Candida albicans, 4% Cryptococcus gattii and 4% Aspergillus fumigatus.

The nucleic acid content of each pathogen in the quality control product is designed as a decreasing gradient, from a relative abundance of 15% for Pseudomonas aeruginosa down to 0.5% for Streptococcus pneumoniae. The resulting ordered differences in abundance make it straightforward to evaluate the ability of a detection system to detect targets at different concentrations.

2. Negative quality control (NC)

A cell pellet prepared from a human cell line.

It is used to simulate the complex nucleic acid background of human samples.

The cell line is preferably one with no background of respiratory-infection viruses.

The number of cells is preferably 1.0 × 10⁵ cells/tube.

3. Innovative features of the quality control product preparation method

1. The preparation of DPC components employs optimized extraction and purification processes to ensure nucleic acid integrity.

2. The proportion of each component is accurately calculated and verified. An innovative decreasing-gradient design is adopted, in which the nucleic acid content of the different pathogens differs in a regular way (the DNA pathogens decrease from 15% to 0.5%). This ensures the reliability of the detection result and allows the detection system's ability to detect targets at different concentrations to be evaluated comprehensively.

4. The within-batch differences in component proportions are small, and the coefficient of variation (CV) is kept within an acceptable range, indicating good uniformity.

4. Technical effects

1. The detection performance for DNA pathogens can be evaluated simultaneously.

2. The nucleic acid composition simulates that of real clinical samples.

3. Within-batch uniformity is good, and the CV is kept within a reasonable range.

4. Storage conditions are simple and transport is convenient.

The present invention will be described in further detail with reference to the following examples.

Experimental example 1: Preparation and detection of the DNA metagenomic positive quality control product (DPC)

1. Preparation of DPC:

(1) Obtaining bacterial and fungal nucleic acid: a suitable culture medium is selected to culture each pathogenic bacterium or fungus separately, the cells are collected in the logarithmic growth phase, genomic DNA is extracted, and quality testing and quantification are carried out.

(2) Obtaining viral nucleic acid: the whole viral genome sequence is obtained by in vitro segmented synthesis; all fragments of the genome are recombined with an Ad5 adenovirus backbone via plasmids, amplified in BJ5183-AD-1 and Stbl3 bacteria, and then transfected into 293T cells for packaging to obtain DNA pseudovirus particles. Nucleic acid is extracted from the collected virus supernatant using a kit, and its quality is confirmed by several tests, including digital PCR.

(3) Obtaining human nucleic acid: the LN229 cell line is cultured, human genomic DNA is extracted, and quality testing and quantification are carried out.

(4) The nucleic acid concentration is determined using a digital PCR detection system (catalog number: 23053, new manufacturing technologies (Beijing)) in a 30 µL reaction system comprising 15 µL of 2× Probe dPCR SuperMix (no UNG), 2.4 µL each of the forward and reverse primers (10 µM), 0.75 µL of probe (10 µM), and 9.45 µL of DNA template plus ddH₂O. PCR program: 95 °C for 10 min; 40 cycles of 95 °C for 30 s and 60 °C for 1 min; 12 °C for 5 min. The forward and reverse primers and the probe are routinely designed and synthesized by those skilled in the art, as required, based on the genomic sequence of the target pathogen to be detected, with reference to the following:

Edwards RL, Takach JE, McAndrew MJ, Menteer J, Lestz RM, Whitman D, Baxter-Lowe LA. Next generation multiplexing for digital PCR using a novel melt-based hairpin probe design. Front Genet. 2023 Nov 10;14:1272964. doi: 10.3389/fgene.2023.1272964. PMID: 38028620; PMCID: PMC10667681.

Wadle S, Lehnert M, Rubenwolf S, Zengerle R, von Stetten F. Real-time PCR probe optimization using design of experiments approach. Biomol Detect Quantif. 2015 Dec 30;7:1-8. doi: 10.1016/j.bdq.2015.12.002. PMID: 27077046; PMCID: PMC4827641.

QIAGEN digital PCR assay development guide (https://www.qiagen.com/zh-cn/applications/digital-pcr/beginners/dpcr-guide/setup-and-troubleshooting)

In practice, the whole digital PCR process, including the design and synthesis of the forward and reverse primers and the probes, can be carried out by a commercial biotechnology company (for example, new Youyi manufacturing technology (Beijing) technology Co., ltd.). Because this work is routine for those skilled in the art, the primer sequences are not listed here for the sake of brevity.

(5) According to the digital PCR quantification results, the components are mixed in the preset proportions. Host-derived nucleic acid is required to account for more than 99% of the total, and the final nucleic acid concentration of DPC is 100 ng/mL. The relative abundances of the bacterial pathogens are set as follows: Pseudomonas aeruginosa 15%, Mycobacterium avium 10%, Klebsiella pneumoniae 10%, Ochrobactrum anthropi 8%, Bacteroides fragilis 5%, Streptococcus pyogenes 5%, Haemophilus influenzae 5%, Streptococcus oralis 4.5%, Staphylococcus aureus 4%, Veillonella parvula 3%, Peptostreptococcus anaerobius 2%, Legionella pneumophila 1% and Streptococcus pneumoniae 0.5%. The relative abundances of the DNA viruses are set to 10% for human bocavirus type 1 and 5% for adenovirus type 7. The relative abundances of the fungal pathogens are set to 4% each for Candida albicans, Cryptococcus gattii and Aspergillus fumigatus. The ratio of target sequences in DPC is normalized according to the genome length of each pathogen, and the number of target sequences after normalization is set to 100-1000.

(6) The prepared DPC is aliquoted and stored frozen at -20 ± 5 °C.
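The mixing arithmetic of step (5) can be illustrated with a short sketch. This is not the formula used in the patent, which is not reproduced in this section; it rests on several stated assumptions: the pathogen pool is taken to be exactly 1% of the 100 ng/mL total, relative abundance is treated as a mass share of that pool, and one double-stranded base pair is taken to weigh 650 g/mol on average. Genome sizes are those listed in claim 2; function and constant names are hypothetical.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one double-stranded base pair

TOTAL_NG_PER_ML = 100.0    # final DPC nucleic acid concentration (from the text)
PATHOGEN_FRACTION = 0.01   # assumption: pathogen pool is exactly 1% of the total


def pathogen_mass_ng_per_ml(abundance_pct):
    """DNA mass of one pathogen per mL, treating abundance as a mass share."""
    return TOTAL_NG_PER_ML * PATHOGEN_FRACTION * abundance_pct / 100.0


def genome_copies(mass_ng, genome_bp):
    """Convert a DNA mass in ng into a number of genome copies."""
    grams = mass_ng * 1e-9
    return grams * AVOGADRO / (genome_bp * BP_MASS_G_PER_MOL)


# (preset abundance in %, genome size in bp) for two of the listed pathogens
EXAMPLES = {
    "Pseudomonas aeruginosa": (15.0, 6839777),
    "Streptococcus pneumoniae": (0.5, 2096425),
}

for name, (pct, size_bp) in EXAMPLES.items():
    mass = pathogen_mass_ng_per_ml(pct)
    copies = genome_copies(mass, size_bp)
    print(f"{name}: {mass:.3f} ng/mL, about {copies:.0f} genome copies/mL")
```

Under these assumptions, a large genome at high abundance and a small genome at low abundance differ far less in copy number than in mass, which is one reason genome length has to be taken into account when the proportions are set.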

2. DPC procedure:

(1) Thaw the DNA metagenomic positive quality control product at room temperature for 15 minutes, then vortex and centrifuge;

(2) Add 300 µL to a 1.5 mL centrifuge tube containing UMSI;

(3) Prepare the lysis reaction solution, and extract and purify the DNA using the TIANamp Micro DNA Kit (DP316, TIANGEN BIOTECH, Beijing, China);

(4) Measure the DPC nucleic acid concentration using a Qubit 4.0 fluorometer (Invitrogen, USA);

(5) After the concentration measurement, add nuclease-free water to normalize the DPC concentration to 1.2 ng/µL;

(6) After normalization, fragment the DPC nucleic acid using a genome fragmentation kit (VM008-50, microphoton medical instruments inc.);

(7) Add the index and PCR reagents (provided by Guangzhou microphoton medical instruments Co., ltd.) to the fragmented DPC nucleic acid to complete library construction;

(8) Select fragments using magnetic beads, and measure the concentration of the DPC library using the Qubit 4.0 fluorometer;

(9) Pool with other libraries and dilute;

(10) Sequence on a NextSeq 550Dx sequencer (Illumina, USA).

Part 2, "DPC procedure", describes how a laboratory uses the quality control product for quality control of routine testing; it is independent of the preparation of the quality control product.
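Step (5) of the procedure above is a simple C1·V1 = C2·V2 dilution. The sketch below computes how much nuclease-free water to add to reach the 1.2 ng/µL target stated in the text. The function name and the example inputs (a Qubit reading of 3.0 ng/µL on a 20 µL eluate) are hypothetical values chosen only for illustration.

```python
TARGET_NG_PER_UL = 1.2  # normalization target from step (5)


def water_to_add_ul(measured_ng_per_ul, sample_volume_ul, target_ng_per_ul=TARGET_NG_PER_UL):
    """Volume of water (µL) needed to dilute a sample to the target concentration."""
    if measured_ng_per_ul < target_ng_per_ul:
        # Dilution cannot raise the concentration; such a sample needs other handling.
        raise ValueError("sample is already below the target concentration")
    final_volume_ul = measured_ng_per_ul * sample_volume_ul / target_ng_per_ul
    return final_volume_ul - sample_volume_ul


# Hypothetical example: 20 µL of extract measured at 3.0 ng/µL
print(round(water_to_add_ul(3.0, 20.0), 2))
```

A sample that measures below the target raises an error instead of returning a negative volume, since the procedure gives no instruction for that case.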

3. Inter-bottle uniformity assessment of DPC

(1) Six DPC bottles were randomly selected, and one was tested each day for six consecutive days of monitoring.

(2) Based on the results of the continuous monitoring, the inter-bottle uniformity of DPC was evaluated. The sequencing output data volume, the host read rate, the number of sequences detected, the relative abundance and related values of the quality control product were all recorded, and the mean and CV of the sequence number and relative abundance were calculated. The relative abundance of a pathogen is the number of sequences of that single target divided by the total number of target sequences.

(3) The data analysis shows that all pathogenic microorganisms expected to be present in the positive quality control product were detected, that the relative abundances in the sequencing results (the measured values in Table 2) are broadly similar to the preset values, and that the CV (coefficient of variation) of the 6 replicate experiments is within an acceptable range. The inter-bottle uniformity within the batch is therefore acceptable, and the DNA positive quality control product can be used for metagenomic high-throughput sequencing. Fig. 2 plots the mean measured values of the 6 replicate experiments and the preset values as line graphs; the two lines agree closely, indicating that the set relative abundances of the quality control product meet expectations and demonstrating the accuracy of the DPC preparation.
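The two statistics used in this assessment can be sketched as follows. Relative abundance follows the definition given in step (2); the CV is computed here with the sample standard deviation (n - 1 denominator), which is an assumption because the text does not say which form was used. The read counts in the example are invented for illustration and are not results from the patent.

```python
from statistics import mean, stdev


def relative_abundance(reads_by_pathogen):
    """Percent share of each target's reads among all target reads in one run."""
    total = sum(reads_by_pathogen.values())
    return {name: 100.0 * reads / total for name, reads in reads_by_pathogen.items()}


def cv_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


# Invented read counts for one target across six bottles, one bottle per day
reads_across_runs = [980, 1010, 1005, 995, 1020, 990]
print(f"CV across six runs: {cv_percent(reads_across_runs):.2f}%")

# Invented single-run example with two targets
print(relative_abundance({"target A": 150, "target B": 50}))
```

Applied per pathogen across the six bottles, `cv_percent` gives the inter-bottle CV, and the mean of the six `relative_abundance` values is what would be compared with the preset value.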

Experimental example 2: Preparation and detection of the negative quality control product (NC)

1. NC preparation:

(1) The human cell line is a cell line obtained by immortalizing human-derived cells that can be passaged in vitro, preferably one with no background of respiratory-infection viruses.

(2) The cell pellet consists of cryopreserved cells from which culture-medium contamination has been removed. After multiple (3-5) washes, a pellet free of contamination by viruses or other exogenous microorganisms is selected and stored long-term at -80 °C or below.

(3) The number of cells in the cell pellet corresponds to the number of cells obtained from a human respiratory tract sample, preferably 1 × 10⁵ cells. Cells are counted with a cell counting plate, and the count is repeated 3 times so that the cell number of the quality control product is held at this level.
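The cell-count step in (3) can be sketched as a small calculation. The sketch assumes an improved Neubauer counting chamber, in which one large square holds 0.1 µL so that cells/mL equals the mean count per large square times the dilution factor times 10⁴; the text says only that a cell counting plate is used. The three counts in the example are hypothetical.

```python
from statistics import mean

TARGET_CELLS_PER_TUBE = 1e5  # preferred cell number from the text
NEUBAUER_FACTOR = 1e4        # assumption: one large square holds 0.1 µL


def cells_per_ml(counts_per_square, dilution_factor=1.0):
    """Cell concentration from repeated counts of one large square each."""
    return mean(counts_per_square) * dilution_factor * NEUBAUER_FACTOR


def volume_for_target_ul(concentration_cells_per_ml, target_cells=TARGET_CELLS_PER_TUBE):
    """Suspension volume (µL) that contains the target number of cells."""
    return target_cells / concentration_cells_per_ml * 1000.0


# Hypothetical triplicate counts of an undiluted suspension
concentration = cells_per_ml([48, 52, 50])
print(concentration, "cells/mL ->", round(volume_for_target_ul(concentration), 1), "µL per tube")
```

Averaging the three counts before converting to a volume is what keeps the cell number per tube close to the 1 × 10⁵ target.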

2. Metagenomic negative cell quality control (DNC) procedure:

(1) Take one tube of NC, centrifuge at low speed, and add 500 µL of physiological saline for later use;

(2) Take 300 µL of the saline-resuspended NC as the DNC, then extract it and build the library according to the DPC procedure;

(3) The DNC is derived from the NC in (1) above.

Part 2, "Metagenomic negative cell quality control (DNC) procedure", describes how a laboratory uses the quality control product for quality control of routine testing; it is independent of the preparation of the quality control product.

3. Inter-bottle uniformity assessment of NC

The sequencing output data volume, the host read rate, the number of sequences detected, the relative abundance and related values of the quality control product are all recorded, and the mean and CV of the sequence number and relative abundance are calculated. None of the pathogenic microorganisms expected in the positive quality control product are detected in the negative quality control product, so it meets the requirements for use; the absence of detected pathogens demonstrates specificity. Uniformity and accuracy do not need to be evaluated for NC.
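The acceptance criterion for NC amounts to checking that no DPC target appears in the negative run. A minimal sketch follows; the `min_reads=1` threshold is an assumption, since the text says only that the targets are "not detected" and gives no read cutoff, and the names and counts are hypothetical.

```python
def detected_targets(nc_reads, target_names, min_reads=1):
    """Return the DPC targets that reach the read threshold in a negative-control run."""
    return sorted(name for name in target_names if nc_reads.get(name, 0) >= min_reads)


# Hypothetical classification output for one NC run
targets = ["Klebsiella pneumoniae", "Candida albicans", "Adenovirus type 7"]
nc_run = {"Homo sapiens": 1_200_000}

hits = detected_targets(nc_run, targets)
print("NC passes" if not hits else f"NC fails, detected: {hits}")
```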

Claims

1. A method for preparing a non-constant composite quality control product for high-throughput sequencing of a DNA pathogenic metagenome, which is characterized by comprising the following steps:

S1, preparing a DNA positive quality control product:

(1) Extracting bacterial genomic nucleic acid and fungal genomic nucleic acid;

(2) Obtaining viral genome nucleic acid;

(3) Extracting human genome nucleic acid;

(a)

2) Mixing the nucleic acid of each target pathogen according to the initial input volume V_pD1 of each target pathogen calculated in step 1), adding the genomic DNA of the human cell line, and finally adding TE buffer to form a sample;

(b)

(c)

5) Mixing the nucleic acid of each target pathogen according to the adjusted input volume V_pD2 of each target pathogen calculated in step 4), adding the genomic DNA of the human cell line, and finally adding TE buffer to obtain the DNA positive quality control product.

2. The method for preparing the non-constant composite quality control product for high-throughput sequencing of DNA pathogenic metagenome according to claim 1, wherein the target pathogen is selected from the group consisting of Pseudomonas aeruginosa, Mycobacterium avium, Klebsiella pneumoniae, Ochrobactrum anthropi, Bacteroides fragilis, Streptococcus pyogenes, Haemophilus influenzae, Streptococcus oralis, Staphylococcus aureus, Veillonella parvula, Peptostreptococcus anaerobius, Legionella pneumophila, Streptococcus pneumoniae, human bocavirus type 1, adenovirus type 7, Candida albicans, Cryptococcus gattii and Aspergillus fumigatus;

and/or the genome sizes of the target pathogens are: Pseudomonas aeruginosa 6839777 bp, Mycobacterium avium 4956752 bp, Klebsiella pneumoniae 5548441 bp, Ochrobactrum anthropi 5226429 bp, Bacteroides fragilis 5234583 bp, Streptococcus pyogenes 1844942 bp, Haemophilus influenzae 1850809 bp, Streptococcus oralis 1931995 bp, Staphylococcus aureus 2806340 bp, Veillonella parvula 2132186 bp, Peptostreptococcus anaerobius 2192403 bp, Legionella pneumophila 3407565 bp, Streptococcus pneumoniae 2096425 bp, human bocavirus type 1 5099 bp, adenovirus type 7 35197 bp, Candida albicans 14735515 bp, Cryptococcus gattii 17527853 bp, Aspergillus fumigatus 28825722 bp;

3. The method for preparing a non-constant composite quality control product for high-throughput sequencing of a DNA pathogen metagenome according to claim 1, wherein in S1, the extracting of bacterial genomic nucleic acid and fungal genomic nucleic acid comprises selecting a suitable culture medium to culture the bacteria or fungi separately, collecting the cells in the logarithmic growth phase, extracting genomic DNA, and performing quality testing and quantification;

4. The method for preparing a non-constant composite quality control for high-throughput sequencing of a DNA pathogenic metagenome according to claim 1, wherein the final concentration of nucleic acid in the DNA positive quality control is 100ng/mL.

5. The method for preparing a non-constant composite quality control product for high-throughput sequencing of a DNA pathogenic metagenome according to any one of claims 1 to 4, further comprising step S2, preparing a negative quality control product: a human cell line is washed 3 to 5 times to obtain a human cell line pellet free of contamination by viruses or other exogenous microorganisms.

6. The non-constant composite quality control product for high-throughput sequencing of DNA pathogenic metagenome is characterized by being prepared by adopting the preparation method of any one of claims 1-5, wherein the non-constant composite quality control product comprises a DNA positive quality control product;

7. The non-constant composite quality control product for high-throughput sequencing of DNA pathogen metagenome according to claim 6, wherein the bacterial genomic nucleic acid comprises the following components in relative abundance: 15% of Pseudomonas aeruginosa, 10% of Mycobacterium avium, 10% of Klebsiella pneumoniae, 8% of Ochrobactrum anthropi, 5% of Bacteroides fragilis, 5% of Streptococcus pyogenes, 5% of Haemophilus influenzae, 4.5% of Streptococcus oralis, 4% of Staphylococcus aureus, 3% of Veillonella parvula, 2% of Peptostreptococcus anaerobius, 1% of Legionella pneumophila and 0.5% of Streptococcus pneumoniae;

and/or the non-constant composite quality control product further comprises a negative quality control product, wherein the negative quality control product comprises a human cell line pellet that has been washed 3-5 times and is free of contamination by viruses or other exogenous microorganisms.

8. The non-constant composite quality control for high throughput sequencing of DNA pathogenic metagenome according to claim 7, wherein said human cell line is a cell line which can be passaged in vitro obtained by immortalizing human-derived cells.

9. The non-constant composite quality control for high throughput sequencing of a DNA pathogen metagenome according to claim 7, wherein the human cell line pellet contains 1 × 10⁵ cells.

10. A non-constant composite quality control for high throughput sequencing of DNA pathogen metagenome according to any of claims 7-9, wherein said cell line is a cell line free of contamination by viruses or other exogenous microorganisms.