CN114921533A - Methods and adaptors for characterising a target polynucleotide
- Publication number
- CN114921533A CN202210482806.3A
- Authority
- CN
- China
- Prior art keywords
- polynucleotide
- adaptor
- target polynucleotide
- nanopore
- sequencing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- 102000040430 polynucleotide Human genes 0.000 title claims abstract description 209
- 108091033319 polynucleotide Proteins 0.000 title claims abstract description 209
- 239000002157 polynucleotide Substances 0.000 title claims abstract description 209
- 238000000034 method Methods 0.000 title claims abstract description 67
- 238000012163 sequencing technique Methods 0.000 claims abstract description 82
- 230000004048 modification Effects 0.000 claims abstract description 52
- 238000012986 modification Methods 0.000 claims abstract description 52
- 239000011148 porous material Substances 0.000 claims abstract description 43
- 238000005259 measurement Methods 0.000 claims abstract description 19
- 230000003287 optical effect Effects 0.000 claims abstract description 10
- 108060004795 Methyltransferase Proteins 0.000 claims description 67
- 230000000903 blocking effect Effects 0.000 claims description 58
- 125000003729 nucleotide group Chemical group 0.000 claims description 48
- 108090000623 proteins and genes Proteins 0.000 claims description 25
- 102000004169 proteins and genes Human genes 0.000 claims description 23
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 claims description 22
- PFNFFQXMRSDOHW-UHFFFAOYSA-N spermine Chemical compound NCCCNCCCCNCCCN PFNFFQXMRSDOHW-UHFFFAOYSA-N 0.000 claims description 13
- 229960002685 biotin Drugs 0.000 claims description 11
- 235000020958 biotin Nutrition 0.000 claims description 11
- 239000011616 biotin Substances 0.000 claims description 11
- 108010090804 Streptavidin Proteins 0.000 claims description 10
- 239000003446 ligand Substances 0.000 claims description 10
- 238000006243 chemical reaction Methods 0.000 claims description 9
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 9
- 229920001184 polypeptide Polymers 0.000 claims description 8
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 8
- KIDHWZJUCRJVML-UHFFFAOYSA-N putrescine Chemical compound NCCCCN KIDHWZJUCRJVML-UHFFFAOYSA-N 0.000 claims description 8
- ATHGHQPFGPMSJY-UHFFFAOYSA-N spermidine Chemical compound NCCCCNCCCN ATHGHQPFGPMSJY-UHFFFAOYSA-N 0.000 claims description 8
- 125000002091 cationic group Chemical group 0.000 claims description 7
- 229940063675 spermine Drugs 0.000 claims description 6
- 239000005700 Putrescine Substances 0.000 claims description 4
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 claims description 4
- 229940063673 spermidine Drugs 0.000 claims description 4
- 108060002716 Exonuclease Proteins 0.000 claims description 3
- 102100022536 Helicase POLQ-like Human genes 0.000 claims description 3
- 101000899334 Homo sapiens Helicase POLQ-like Proteins 0.000 claims description 3
- 101710183280 Topoisomerase Proteins 0.000 claims description 3
- 230000029936 alkylation Effects 0.000 claims description 3
- 238000005804 alkylation reaction Methods 0.000 claims description 3
- 239000000427 antigen Substances 0.000 claims description 3
- 102000036639 antigens Human genes 0.000 claims description 3
- 108091007433 antigens Proteins 0.000 claims description 3
- 108010051210 beta-Fructofuranosidase Proteins 0.000 claims description 3
- 238000012512 characterization method Methods 0.000 claims description 3
- 102000013165 exonuclease Human genes 0.000 claims description 3
- 239000001573 invertase Substances 0.000 claims description 3
- 235000011073 invertase Nutrition 0.000 claims description 3
- 239000000203 mixture Substances 0.000 claims description 3
- 239000007787 solid Substances 0.000 claims description 2
- 102000035160 transmembrane proteins Human genes 0.000 claims description 2
- 108091005703 transmembrane proteins Proteins 0.000 claims description 2
- 238000007672 fourth generation sequencing Methods 0.000 abstract description 7
- 230000009466 transformation Effects 0.000 abstract description 2
- 239000002773 nucleotide Substances 0.000 description 41
- 239000000523 sample Substances 0.000 description 20
- 108020004414 DNA Proteins 0.000 description 17
- 102000053602 DNA Human genes 0.000 description 17
- 108091093037 Peptide nucleic acid Proteins 0.000 description 14
- 239000000872 buffer Substances 0.000 description 14
- GYOZYWVXFNDGLU-XLPZGREQSA-N dTMP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)C1 GYOZYWVXFNDGLU-XLPZGREQSA-N 0.000 description 13
- 102000039446 nucleic acids Human genes 0.000 description 13
- 108020004707 nucleic acids Proteins 0.000 description 13
- 150000007523 nucleic acids Chemical class 0.000 description 13
- 230000008569 process Effects 0.000 description 11
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 10
- 125000003636 chemical group Chemical group 0.000 description 9
- 239000000126 substance Substances 0.000 description 9
- KHWCHTKSEGGWEX-RRKCRQDMSA-N 2'-deoxyadenosine 5'-monophosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(O)=O)O1 KHWCHTKSEGGWEX-RRKCRQDMSA-N 0.000 description 8
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 8
- 150000001540 azides Chemical class 0.000 description 8
- 230000000295 complement effect Effects 0.000 description 8
- 125000002652 ribonucleotide group Chemical group 0.000 description 8
- ZKHQWZAMYRWXGA-KQYNXXCUSA-J ATP(4-) Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KQYNXXCUSA-J 0.000 description 7
- ZKHQWZAMYRWXGA-UHFFFAOYSA-N Adenosine triphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)C(O)C1O ZKHQWZAMYRWXGA-UHFFFAOYSA-N 0.000 description 7
- 230000008859 change Effects 0.000 description 7
- 239000003153 chemical reaction reagent Substances 0.000 description 7
- LTFMZDNNPPEQNG-KVQBGUIXSA-N 2'-deoxyguanosine 5'-monophosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@H]1C[C@H](O)[C@@H](COP(O)(O)=O)O1 LTFMZDNNPPEQNG-KVQBGUIXSA-N 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 6
- DJJCXFVJDGTHFX-UHFFFAOYSA-N Uridinemonophosphate Natural products OC1C(O)C(COP(O)(O)=O)OC1N1C(=O)NC(=O)C=C1 DJJCXFVJDGTHFX-UHFFFAOYSA-N 0.000 description 6
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 6
- 125000002947 alkylene group Chemical group 0.000 description 6
- HVYWMOMLDIMFJA-DPAQBDIFSA-N cholesterol Chemical compound C1C=C2C[C@@H](O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H]([C@H](C)CCCC(C)C)[C@@]1(C)CC2 HVYWMOMLDIMFJA-DPAQBDIFSA-N 0.000 description 6
- IERHLVCPSMICTF-XVFCMESISA-N cytidine 5'-monophosphate Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(O)=O)O1 IERHLVCPSMICTF-XVFCMESISA-N 0.000 description 6
- IERHLVCPSMICTF-UHFFFAOYSA-N cytidine monophosphate Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(COP(O)(O)=O)O1 IERHLVCPSMICTF-UHFFFAOYSA-N 0.000 description 6
- 238000001514 detection method Methods 0.000 description 6
- 238000010586 diagram Methods 0.000 description 6
- RQFCJASXJCIDSX-UUOKFMHZSA-N guanosine 5'-monophosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@H]1O RQFCJASXJCIDSX-UUOKFMHZSA-N 0.000 description 6
- 235000013928 guanylic acid Nutrition 0.000 description 6
- DJJCXFVJDGTHFX-XVFCMESISA-N uridine 5'-monophosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 DJJCXFVJDGTHFX-XVFCMESISA-N 0.000 description 6
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 5
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 5
- 208000035657 Abasia Diseases 0.000 description 5
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 5
- 108091093094 Glycol nucleic acid Proteins 0.000 description 5
- 239000007995 HEPES buffer Substances 0.000 description 5
- 108091046915 Threose nucleic acid Proteins 0.000 description 5
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 5
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 5
- 229920002477 rna polymer Polymers 0.000 description 5
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 5
- 229940045145 uridine Drugs 0.000 description 5
- IVOMOUWHDPKRLL-KQYNXXCUSA-N Cyclic adenosine monophosphate Chemical compound C([C@H]1O2)OP(O)(=O)O[C@H]1[C@@H](O)[C@@H]2N1C(N=CN=C2N)=C2N=C1 IVOMOUWHDPKRLL-KQYNXXCUSA-N 0.000 description 4
- 102000004190 Enzymes Human genes 0.000 description 4
- 108090000790 Enzymes Proteins 0.000 description 4
- NYHBQMYGNKIUIF-UUOKFMHZSA-N Guanosine Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O NYHBQMYGNKIUIF-UUOKFMHZSA-N 0.000 description 4
- UDMBCSSLTHHNCD-KQYNXXCUSA-N adenosine 5'-monophosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@H]1O UDMBCSSLTHHNCD-KQYNXXCUSA-N 0.000 description 4
- aryl phosphine Chemical compound 0.000 description 4
- 239000012472 biological sample Substances 0.000 description 4
- ZOOGRGPOEVQQDX-KHLHZJAASA-N cyclic guanosine monophosphate Chemical compound C([C@H]1O2)O[P@](O)(=O)O[C@@H]1[C@H](O)[C@H]2N1C(N=C(NC2=O)N)=C2N=C1 ZOOGRGPOEVQQDX-KHLHZJAASA-N 0.000 description 4
- 230000003993 interaction Effects 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- 239000000047 product Substances 0.000 description 4
- 239000011780 sodium chloride Substances 0.000 description 4
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 4
- QRZUPJILJVGUFF-UHFFFAOYSA-N 2,8-dibenzylcyclooctan-1-one Chemical compound C1CCCCC(CC=2C=CC=CC=2)C(=O)C1CC1=CC=CC=C1 QRZUPJILJVGUFF-UHFFFAOYSA-N 0.000 description 3
- NFEXJLMYXXIWPI-JXOAFFINSA-N 5-Hydroxymethylcytidine Chemical compound C1=C(CO)C(N)=NC(=O)N1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 NFEXJLMYXXIWPI-JXOAFFINSA-N 0.000 description 3
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 3
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 3
- 239000002202 Polyethylene glycol Substances 0.000 description 3
- 230000009471 action Effects 0.000 description 3
- 150000001345 alkine derivatives Chemical class 0.000 description 3
- 125000002355 alkine group Chemical group 0.000 description 3
- 150000001413 amino acids Chemical class 0.000 description 3
- 239000012491 analyte Substances 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 3
- 238000000137 annealing Methods 0.000 description 3
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 3
- 230000004888 barrier function Effects 0.000 description 3
- 235000012000 cholesterol Nutrition 0.000 description 3
- 229940107161 cholesterol Drugs 0.000 description 3
- 239000002131 composite material Substances 0.000 description 3
- 150000001875 compounds Chemical class 0.000 description 3
- 238000001816 cooling Methods 0.000 description 3
- 229910052802 copper Inorganic materials 0.000 description 3
- 239000010949 copper Substances 0.000 description 3
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 3
- 239000005547 deoxyribonucleotide Substances 0.000 description 3
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 3
- 230000005684 electric field Effects 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- ACCCMOQWYVYDOT-UHFFFAOYSA-N hexane-1,1-diol Chemical compound CCCCCC(O)O ACCCMOQWYVYDOT-UHFFFAOYSA-N 0.000 description 3
- 239000001301 oxygen Substances 0.000 description 3
- 229910052760 oxygen Inorganic materials 0.000 description 3
- 229920001223 polyethylene glycol Polymers 0.000 description 3
- 235000000346 sugar Nutrition 0.000 description 3
- 150000003573 thiols Chemical class 0.000 description 3
- YKBGVTZYEHREMT-KVQBGUIXSA-N 2'-deoxyguanosine Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](CO)O1 YKBGVTZYEHREMT-KVQBGUIXSA-N 0.000 description 2
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 2
- 150000005019 2-aminopurines Chemical class 0.000 description 2
- HCGYMSSYSAKGPK-UHFFFAOYSA-N 2-nitro-1h-indole Chemical class C1=CC=C2NC([N+](=O)[O-])=CC2=C1 HCGYMSSYSAKGPK-UHFFFAOYSA-N 0.000 description 2
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 2
- XTWYTFMLZFPYCI-KQYNXXCUSA-N 5'-adenylphosphoric acid Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O XTWYTFMLZFPYCI-KQYNXXCUSA-N 0.000 description 2
- ZAYHVCMSTBRABG-UHFFFAOYSA-N 5-Methylcytidine Natural products O=C1N=C(N)C(C)=CN1C1C(O)C(O)C(CO)O1 ZAYHVCMSTBRABG-UHFFFAOYSA-N 0.000 description 2
- ZAYHVCMSTBRABG-JXOAFFINSA-N 5-methylcytidine Chemical compound O=C1N=C(N)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 ZAYHVCMSTBRABG-JXOAFFINSA-N 0.000 description 2
- XTWYTFMLZFPYCI-UHFFFAOYSA-N Adenosine diphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(O)=O)C(O)C1O XTWYTFMLZFPYCI-UHFFFAOYSA-N 0.000 description 2
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 2
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 2
- ZWIADYZPOWUWEW-XVFCMESISA-N CDP Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O1 ZWIADYZPOWUWEW-XVFCMESISA-N 0.000 description 2
- 239000008000 CHES buffer Substances 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- UDMBCSSLTHHNCD-UHFFFAOYSA-N Coenzym Q(11) Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(O)=O)C(O)C1O UDMBCSSLTHHNCD-UHFFFAOYSA-N 0.000 description 2
- MIKUYHXYGGJMLM-GIMIYPNGSA-N Crotonoside Natural products C1=NC2=C(N)NC(=O)N=C2N1[C@H]1O[C@@H](CO)[C@H](O)[C@@H]1O MIKUYHXYGGJMLM-GIMIYPNGSA-N 0.000 description 2
- PCDQPRRSZKQHHS-CCXZUQQUSA-N Cytarabine Triphosphate Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 PCDQPRRSZKQHHS-CCXZUQQUSA-N 0.000 description 2
- NYHBQMYGNKIUIF-UHFFFAOYSA-N D-guanosine Natural products C1=2NC(N)=NC(=O)C=2N=CN1C1OC(CO)C(O)C1O NYHBQMYGNKIUIF-UHFFFAOYSA-N 0.000 description 2
- 101150077975 DDT gene Proteins 0.000 description 2
- 241000196324 Embryophyta Species 0.000 description 2
- QGWNDRXFNXRZMB-UUOKFMHZSA-N GDP Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O QGWNDRXFNXRZMB-UUOKFMHZSA-N 0.000 description 2
- XKMLYUALXHKNFT-UUOKFMHZSA-N Guanosine-5'-triphosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O XKMLYUALXHKNFT-UUOKFMHZSA-N 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 2
- 229910019142 PO4 Inorganic materials 0.000 description 2
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 2
- 108091093078 Pyrimidine dimer Proteins 0.000 description 2
- 108091028664 Ribonucleotide Proteins 0.000 description 2
- RZCIEJXAILMSQK-JXOAFFINSA-N TTP Chemical compound O=C1NC(=O)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 RZCIEJXAILMSQK-JXOAFFINSA-N 0.000 description 2
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical class OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 2
- XCCTYIAWTASOJW-XVFCMESISA-N Uridine-5'-Diphosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 XCCTYIAWTASOJW-XVFCMESISA-N 0.000 description 2
- WREGKURFCTUGRC-POYBYMJQSA-N Zalcitabine Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO)CC1 WREGKURFCTUGRC-POYBYMJQSA-N 0.000 description 2
- BZDVTEPMYMHZCR-JGVFFNPUSA-N [(2s,5r)-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methyl phosphono hydrogen phosphate Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(O)=O)CC1 BZDVTEPMYMHZCR-JGVFFNPUSA-N 0.000 description 2
- 150000001251 acridines Chemical class 0.000 description 2
- 229960005305 adenosine Drugs 0.000 description 2
- LNQVTSROQXJCDD-UHFFFAOYSA-N adenosine monophosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(CO)C(OP(O)(O)=O)C1O LNQVTSROQXJCDD-UHFFFAOYSA-N 0.000 description 2
- 125000000217 alkyl group Chemical group 0.000 description 2
- 239000007864 aqueous solution Substances 0.000 description 2
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- IVRMZWNICZWHMI-UHFFFAOYSA-N azide group Chemical group [N-]=[N+]=[N-] IVRMZWNICZWHMI-UHFFFAOYSA-N 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 2
- 239000006227 byproduct Substances 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 150000001768 cations Chemical class 0.000 description 2
- 238000004587 chromatography analysis Methods 0.000 description 2
- DAEAPNUQQAICNR-RRKCRQDMSA-K dADP(3-) Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP([O-])(=O)OP([O-])([O-])=O)O1 DAEAPNUQQAICNR-RRKCRQDMSA-K 0.000 description 2
- CIKGWCTVFSRMJU-KVQBGUIXSA-N dGDP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O1 CIKGWCTVFSRMJU-KVQBGUIXSA-N 0.000 description 2
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 2
- UJLXYODCHAELLY-XLPZGREQSA-N dTDP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 UJLXYODCHAELLY-XLPZGREQSA-N 0.000 description 2
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 2
- 230000006378 damage Effects 0.000 description 2
- 101150102279 ddc gene Proteins 0.000 description 2
- 238000010828 elution Methods 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- QGWNDRXFNXRZMB-UHFFFAOYSA-N guanidine diphosphate Natural products C1=2NC(N)=NC(=O)C=2N=CN1C1OC(COP(O)(=O)OP(O)(O)=O)C(O)C1O QGWNDRXFNXRZMB-UHFFFAOYSA-N 0.000 description 2
- 229940029575 guanosine Drugs 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 125000005647 linker group Chemical group 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 239000002184 metal Substances 0.000 description 2
- 229910052751 metal Inorganic materials 0.000 description 2
- 230000011987 methylation Effects 0.000 description 2
- 238000007069 methylation reaction Methods 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 238000002156 mixing Methods 0.000 description 2
- 239000000178 monomer Substances 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 239000002777 nucleoside Substances 0.000 description 2
- 125000003835 nucleoside group Chemical group 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 2
- 239000010452 phosphate Substances 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 239000013635 pyrimidine dimer Substances 0.000 description 2
- 239000002336 ribonucleotide Substances 0.000 description 2
- 239000002904 solvent Substances 0.000 description 2
- 238000006467 substitution reaction Methods 0.000 description 2
- 150000008163 sugars Chemical class 0.000 description 2
- 229920001059 synthetic polymer Polymers 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 229940104230 thymidine Drugs 0.000 description 2
- XXYIANZGUOSQHY-XLPZGREQSA-N thymidine 3'-monophosphate Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](OP(O)(O)=O)C1 XXYIANZGUOSQHY-XLPZGREQSA-N 0.000 description 2
- 229960000523 zalcitabine Drugs 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- VLSDXINSOMDCBK-BQYQJAHWSA-N (E)-1,1'-azobis(N,N-dimethylformamide) Chemical compound CN(C)C(=O)\N=N\C(=O)N(C)C VLSDXINSOMDCBK-BQYQJAHWSA-N 0.000 description 1
- XKKCQTLDIPIRQD-JGVFFNPUSA-N 1-[(2r,5s)-5-(hydroxymethyl)oxolan-2-yl]-5-methylpyrimidine-2,4-dione Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)CC1 XKKCQTLDIPIRQD-JGVFFNPUSA-N 0.000 description 1
- CKTSBUTUHBMZGZ-SHYZEUOFSA-N 2'‐deoxycytidine Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 CKTSBUTUHBMZGZ-SHYZEUOFSA-N 0.000 description 1
- SKWCZPYWFRTSDD-UHFFFAOYSA-N 2,3-bis(azaniumyl)propanoate;chloride Chemical compound Cl.NCC(N)C(O)=O SKWCZPYWFRTSDD-UHFFFAOYSA-N 0.000 description 1
- MHKBMNACOMRIAW-UHFFFAOYSA-N 2,3-dinitrophenol Chemical class OC1=CC=CC([N+]([O-])=O)=C1[N+]([O-])=O MHKBMNACOMRIAW-UHFFFAOYSA-N 0.000 description 1
- PIINGYXNCHTJTF-UHFFFAOYSA-N 2-(2-azaniumylethylamino)acetate Chemical group NCCNCC(O)=O PIINGYXNCHTJTF-UHFFFAOYSA-N 0.000 description 1
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 1
- MWBWWFOAEOYUST-UHFFFAOYSA-N 2-aminopurine Chemical compound NC1=NC=C2N=CNC2=N1 MWBWWFOAEOYUST-UHFFFAOYSA-N 0.000 description 1
- HXMVNCMPQGPRLN-UHFFFAOYSA-N 2-hydroxyputrescine Chemical compound NCCC(O)CN HXMVNCMPQGPRLN-UHFFFAOYSA-N 0.000 description 1
- HKVRRPIGVZKBQT-UHFFFAOYSA-N 3,3-diphenylcyclooctyne Chemical group C1CCCCC#CC1(C=1C=CC=CC=1)C1=CC=CC=C1 HKVRRPIGVZKBQT-UHFFFAOYSA-N 0.000 description 1
- RBTBFTRPCNLSDE-UHFFFAOYSA-N 3,7-bis(dimethylamino)phenothiazin-5-ium Chemical compound C1=CC(N(C)C)=CC2=[S+]C3=CC(N(C)C)=CC=C3N=C21 RBTBFTRPCNLSDE-UHFFFAOYSA-N 0.000 description 1
- WOVKYSAHUYNSMH-RRKCRQDMSA-N 5-bromodeoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(Br)=C1 WOVKYSAHUYNSMH-RRKCRQDMSA-N 0.000 description 1
- LRSASMSXMSNRBT-UHFFFAOYSA-N 5-methylcytosine Chemical compound CC1=CNC(=O)N=C1N LRSASMSXMSNRBT-UHFFFAOYSA-N 0.000 description 1
- OZFPSOBLQZPIAV-UHFFFAOYSA-N 5-nitro-1h-indole Chemical class [O-][N+](=O)C1=CC=C2NC=CC2=C1 OZFPSOBLQZPIAV-UHFFFAOYSA-N 0.000 description 1
- MSSXOMSJDRHRMC-UHFFFAOYSA-N 9H-purine-2,6-diamine Chemical compound NC1=NC(N)=C2NC=NC2=N1 MSSXOMSJDRHRMC-UHFFFAOYSA-N 0.000 description 1
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 1
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 1
- RZZMZYZXNJRPOJ-BJDJZHNGSA-N Ala-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C)N RZZMZYZXNJRPOJ-BJDJZHNGSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- CJQAEJMHBAOQHA-DLOVCJGASA-N Ala-Phe-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CJQAEJMHBAOQHA-DLOVCJGASA-N 0.000 description 1
- FFZJHQODAYHGPO-KZVJFYERSA-N Ala-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N FFZJHQODAYHGPO-KZVJFYERSA-N 0.000 description 1
- JNLDTVRGXMSYJC-UVBJJODRSA-N Ala-Pro-Trp Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O JNLDTVRGXMSYJC-UVBJJODRSA-N 0.000 description 1
- GCTANJIJJROSLH-GVARAGBVSA-N Ala-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C)N GCTANJIJJROSLH-GVARAGBVSA-N 0.000 description 1
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 1
- PCQXGEUALSFGIA-WDSOQIARSA-N Arg-His-Trp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O PCQXGEUALSFGIA-WDSOQIARSA-N 0.000 description 1
- FFEUXEAKYRCACT-PEDHHIEDSA-N Arg-Ile-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)CC)C(O)=O FFEUXEAKYRCACT-PEDHHIEDSA-N 0.000 description 1
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 1
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 1
- VUGWHBXPMAHEGZ-SRVKXCTJSA-N Arg-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N VUGWHBXPMAHEGZ-SRVKXCTJSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- NUHQMYUWLUSRJX-BIIVOSGPSA-N Asn-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N NUHQMYUWLUSRJX-BIIVOSGPSA-N 0.000 description 1
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 1
- JEEFEQCRXKPQHC-KKUMJFAQSA-N Asn-Leu-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JEEFEQCRXKPQHC-KKUMJFAQSA-N 0.000 description 1
- QTKYFZCMSQLYHI-UBHSHLNASA-N Asn-Trp-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O QTKYFZCMSQLYHI-UBHSHLNASA-N 0.000 description 1
- QUCCLIXMVPIVOB-BZSNNMDCSA-N Asn-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)N)N QUCCLIXMVPIVOB-BZSNNMDCSA-N 0.000 description 1
- OERMIMJQPQUIPK-FXQIFTODSA-N Asp-Arg-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O OERMIMJQPQUIPK-FXQIFTODSA-N 0.000 description 1
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 1
- GHODABZPVZMWCE-FXQIFTODSA-N Asp-Glu-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GHODABZPVZMWCE-FXQIFTODSA-N 0.000 description 1
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 1
- ZSVJVIOVABDTTL-YUMQZZPRSA-N Asp-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)O)N ZSVJVIOVABDTTL-YUMQZZPRSA-N 0.000 description 1
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 1
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 1
- ORRJQLIATJDMQM-HJGDQZAQSA-N Asp-Leu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O ORRJQLIATJDMQM-HJGDQZAQSA-N 0.000 description 1
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 1
- NONWUQAWAANERO-BZSNNMDCSA-N Asp-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 NONWUQAWAANERO-BZSNNMDCSA-N 0.000 description 1
- HICVMZCGVFKTPM-BQBZGAKWSA-N Asp-Pro-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HICVMZCGVFKTPM-BQBZGAKWSA-N 0.000 description 1
- MNQMTYSEKZHIDF-GCJQMDKQSA-N Asp-Thr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O MNQMTYSEKZHIDF-GCJQMDKQSA-N 0.000 description 1
- 235000007319 Avena orientalis Nutrition 0.000 description 1
- 244000075850 Avena orientalis Species 0.000 description 1
- WOVKYSAHUYNSMH-UHFFFAOYSA-N BROMODEOXYURIDINE Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C(Br)=C1 WOVKYSAHUYNSMH-UHFFFAOYSA-N 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 235000014698 Brassica juncea var multisecta Nutrition 0.000 description 1
- 235000006008 Brassica napus var napus Nutrition 0.000 description 1
- 240000000385 Brassica napus var. napus Species 0.000 description 1
- 235000006618 Brassica rapa subsp oleifera Nutrition 0.000 description 1
- 235000004977 Brassica sinapistrum Nutrition 0.000 description 1
- 241000282472 Canis lupus familiaris Species 0.000 description 1
- 240000006162 Chenopodium quinoa Species 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- HMFHBZSHGGEWLO-SOOFDHNKSA-N D-ribofuranose Chemical compound OC[C@H]1OC(O)[C@H](O)[C@@H]1O HMFHBZSHGGEWLO-SOOFDHNKSA-N 0.000 description 1
- CKTSBUTUHBMZGZ-UHFFFAOYSA-N Deoxycytidine Natural products O=C1N=C(N)C=CN1C1OC(CO)C(O)C1 CKTSBUTUHBMZGZ-UHFFFAOYSA-N 0.000 description 1
- SHIBSTMRCDJXLN-UHFFFAOYSA-N Digoxigenin Natural products C1CC(C2C(C3(C)CCC(O)CC3CC2)CC2O)(O)C2(C)C1C1=CC(=O)OC1 SHIBSTMRCDJXLN-UHFFFAOYSA-N 0.000 description 1
- 241000283086 Equidae Species 0.000 description 1
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical group OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- IPHGBVYWRKCGKG-FXQIFTODSA-N Gln-Cys-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(O)=O IPHGBVYWRKCGKG-FXQIFTODSA-N 0.000 description 1
- IOFDDSNZJDIGPB-GVXVVHGQSA-N Gln-Leu-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IOFDDSNZJDIGPB-GVXVVHGQSA-N 0.000 description 1
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 1
- LKDIBBOKUAASNP-FXQIFTODSA-N Glu-Ala-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LKDIBBOKUAASNP-FXQIFTODSA-N 0.000 description 1
- MXOODARRORARSU-ACZMJKKPSA-N Glu-Ala-Ser Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N MXOODARRORARSU-ACZMJKKPSA-N 0.000 description 1
- FLLRAEJOLZPSMN-CIUDSAMLSA-N Glu-Asn-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FLLRAEJOLZPSMN-CIUDSAMLSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- KLJMRPIBBLTDGE-ACZMJKKPSA-N Glu-Cys-Asn Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(O)=O KLJMRPIBBLTDGE-ACZMJKKPSA-N 0.000 description 1
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 1
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 1
- UUTGYDAKPISJAO-JYJNAYRXSA-N Glu-Tyr-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=C(O)C=C1 UUTGYDAKPISJAO-JYJNAYRXSA-N 0.000 description 1
- QXUPRMQJDWJDFR-NRPADANISA-N Glu-Val-Ser Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXUPRMQJDWJDFR-NRPADANISA-N 0.000 description 1
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 1
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 1
- YKJUITHASJAGHO-HOTGVXAUSA-N Gly-Lys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN YKJUITHASJAGHO-HOTGVXAUSA-N 0.000 description 1
- LLWQVJNHMYBLLK-CDMKHQONSA-N Gly-Thr-Phe Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLWQVJNHMYBLLK-CDMKHQONSA-N 0.000 description 1
- GJHWILMUOANXTG-WPRPVWTQSA-N Gly-Val-Arg Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GJHWILMUOANXTG-WPRPVWTQSA-N 0.000 description 1
- 244000068988 Glycine max Species 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 241000219146 Gossypium Species 0.000 description 1
- ORZGPQXISSXQGW-IHRRRGAJSA-N His-His-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O ORZGPQXISSXQGW-IHRRRGAJSA-N 0.000 description 1
- 240000005979 Hordeum vulgare Species 0.000 description 1
- 235000007340 Hordeum vulgare Nutrition 0.000 description 1
- 238000006736 Huisgen cycloaddition reaction Methods 0.000 description 1
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 1
- UKTUOMWSJPXODT-GUDRVLHUSA-N Ile-Asn-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N UKTUOMWSJPXODT-GUDRVLHUSA-N 0.000 description 1
- CTHAJJYOHOBUDY-GHCJXIJMSA-N Ile-Cys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N CTHAJJYOHOBUDY-GHCJXIJMSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- KEKTTYCXKGBAAL-VGDYDELISA-N Ile-His-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CO)C(=O)O)N KEKTTYCXKGBAAL-VGDYDELISA-N 0.000 description 1
- VNDQNDYEPSXHLU-JUKXBJQTSA-N Ile-His-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N VNDQNDYEPSXHLU-JUKXBJQTSA-N 0.000 description 1
- PWDSHAAAFXISLE-SXTJYALSSA-N Ile-Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O PWDSHAAAFXISLE-SXTJYALSSA-N 0.000 description 1
- SJLVSMMIFYTSGY-GRLWGSQLSA-N Ile-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SJLVSMMIFYTSGY-GRLWGSQLSA-N 0.000 description 1
- YNMQUIVKEFRCPH-QSFUFRPTSA-N Ile-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)O)N YNMQUIVKEFRCPH-QSFUFRPTSA-N 0.000 description 1
- DMSVBUWGDLYNLC-IAVJCBSLSA-N Ile-Ile-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DMSVBUWGDLYNLC-IAVJCBSLSA-N 0.000 description 1
- KLBVGHCGHUNHEA-BJDJZHNGSA-N Ile-Leu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)O)N KLBVGHCGHUNHEA-BJDJZHNGSA-N 0.000 description 1
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 1
- FFAUOCITXBMRBT-YTFOTSKYSA-N Ile-Lys-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FFAUOCITXBMRBT-YTFOTSKYSA-N 0.000 description 1
- FQYQMFCIJNWDQZ-CYDGBPFRSA-N Ile-Pro-Pro Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 FQYQMFCIJNWDQZ-CYDGBPFRSA-N 0.000 description 1
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 1
- OMDWJWGZGMCQND-CFMVVWHZSA-N Ile-Tyr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OMDWJWGZGMCQND-CFMVVWHZSA-N 0.000 description 1
- NJGXXYLPDMMFJB-XUXIUFHCSA-N Ile-Val-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N NJGXXYLPDMMFJB-XUXIUFHCSA-N 0.000 description 1
- ZSESFIFAYQEKRD-CYDGBPFRSA-N Ile-Val-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(=O)O)N ZSESFIFAYQEKRD-CYDGBPFRSA-N 0.000 description 1
- 108010065920 Insulin Lispro Proteins 0.000 description 1
- AHLPHDHHMVZTML-BYPYZUCNSA-N L-Ornithine Chemical compound NCCC[C@H](N)C(O)=O AHLPHDHHMVZTML-BYPYZUCNSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 1
- 240000004322 Lens culinaris Species 0.000 description 1
- 235000014647 Lens culinaris subsp culinaris Nutrition 0.000 description 1
- REPPKAMYTOJTFC-DCAQKATOSA-N Leu-Arg-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O REPPKAMYTOJTFC-DCAQKATOSA-N 0.000 description 1
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 1
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 1
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 1
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 1
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 1
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 239000000232 Lipid Bilayer Substances 0.000 description 1
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 1
- JCFYLFOCALSNLQ-GUBZILKMSA-N Lys-Ala-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JCFYLFOCALSNLQ-GUBZILKMSA-N 0.000 description 1
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 1
- NQCJGQHHYZNUDK-DCAQKATOSA-N Lys-Arg-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCN=C(N)N NQCJGQHHYZNUDK-DCAQKATOSA-N 0.000 description 1
- YVMQJGWLHRWMDF-MNXVOIDGSA-N Lys-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N YVMQJGWLHRWMDF-MNXVOIDGSA-N 0.000 description 1
- ZXFRGTAIIZHNHG-AJNGGQMLSA-N Lys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N ZXFRGTAIIZHNHG-AJNGGQMLSA-N 0.000 description 1
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 1
- DNWBUCHHMRQWCZ-GUBZILKMSA-N Lys-Ser-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DNWBUCHHMRQWCZ-GUBZILKMSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 244000070406 Malus silvestris Species 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- GGXZOTSDJJTDGB-GUBZILKMSA-N Met-Ser-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O GGXZOTSDJJTDGB-GUBZILKMSA-N 0.000 description 1
- OOLVTRHJJBCJKB-IHRRRGAJSA-N Met-Tyr-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OOLVTRHJJBCJKB-IHRRRGAJSA-N 0.000 description 1
- 240000005561 Musa balbisiana Species 0.000 description 1
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 1
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 1
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 1
- 244000061176 Nicotiana tabacum Species 0.000 description 1
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 1
- AHLPHDHHMVZTML-UHFFFAOYSA-N Orn-delta-NH2 Natural products NCCCC(N)C(O)=O AHLPHDHHMVZTML-UHFFFAOYSA-N 0.000 description 1
- UTJLXEIPEHZYQJ-UHFFFAOYSA-N Ornithine Natural products OC(=O)C(C)CCCN UTJLXEIPEHZYQJ-UHFFFAOYSA-N 0.000 description 1
- 240000007594 Oryza sativa Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 244000046052 Phaseolus vulgaris Species 0.000 description 1
- 235000010627 Phaseolus vulgaris Nutrition 0.000 description 1
- BFYHIHGIHGROAT-HTUGSXCWSA-N Phe-Glu-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFYHIHGIHGROAT-HTUGSXCWSA-N 0.000 description 1
- JQLQUPIYYJXZLJ-ZEWNOJEFSA-N Phe-Ile-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 JQLQUPIYYJXZLJ-ZEWNOJEFSA-N 0.000 description 1
- RORUIHAWOLADSH-HJWJTTGWSA-N Phe-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=CC=C1 RORUIHAWOLADSH-HJWJTTGWSA-N 0.000 description 1
- ZUQACJLOHYRVPJ-DKIMLUQUSA-N Phe-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZUQACJLOHYRVPJ-DKIMLUQUSA-N 0.000 description 1
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 1
- OKQQWSNUSQURLI-JYJNAYRXSA-N Phe-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CC=CC=C1)N OKQQWSNUSQURLI-JYJNAYRXSA-N 0.000 description 1
- BSKMOCNNLNDIMU-CDMKHQONSA-N Phe-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O BSKMOCNNLNDIMU-CDMKHQONSA-N 0.000 description 1
- RGMLUHANLDVMPB-ULQDDVLXSA-N Phe-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGMLUHANLDVMPB-ULQDDVLXSA-N 0.000 description 1
- XYFCBTPGUUZFHI-UHFFFAOYSA-N Phosphine Natural products P XYFCBTPGUUZFHI-UHFFFAOYSA-N 0.000 description 1
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 1
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 1
- BUEIYHBJHCDAMI-UFYCRDLUSA-N Pro-Phe-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BUEIYHBJHCDAMI-UFYCRDLUSA-N 0.000 description 1
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 1
- 238000010357 RNA editing Methods 0.000 description 1
- 230000026279 RNA modification Effects 0.000 description 1
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 1
- 240000000111 Saccharum officinarum Species 0.000 description 1
- 235000007201 Saccharum officinarum Nutrition 0.000 description 1
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 1
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 1
- HBTCFCHYALPXME-HTFCKZLJSA-N Ser-Ile-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HBTCFCHYALPXME-HTFCKZLJSA-N 0.000 description 1
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 1
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 1
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 1
- UKKROEYWYIHWBD-ZKWXMUAHSA-N Ser-Val-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UKKROEYWYIHWBD-ZKWXMUAHSA-N 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 240000003768 Solanum lycopersicum Species 0.000 description 1
- 244000061456 Solanum tuberosum Species 0.000 description 1
- 235000002595 Solanum tuberosum Nutrition 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- 235000009470 Theobroma cacao Nutrition 0.000 description 1
- 244000299461 Theobroma cacao Species 0.000 description 1
- JTEICXDKGWKRRV-HJGDQZAQSA-N Thr-Asn-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O JTEICXDKGWKRRV-HJGDQZAQSA-N 0.000 description 1
- DHPPWTOLRWYIDS-XKBZYTNZSA-N Thr-Cys-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(O)=O DHPPWTOLRWYIDS-XKBZYTNZSA-N 0.000 description 1
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 1
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 1
- IMULJHHGAUZZFE-MBLNEYKQSA-N Thr-Gly-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IMULJHHGAUZZFE-MBLNEYKQSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- FKIGTIXHSRNKJU-IXOXFDKPSA-N Thr-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CN=CN1 FKIGTIXHSRNKJU-IXOXFDKPSA-N 0.000 description 1
- PAXANSWUSVPFNK-IUKAMOBKSA-N Thr-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N PAXANSWUSVPFNK-IUKAMOBKSA-N 0.000 description 1
- QNCFWHZVRNXAKW-OEAJRASXSA-N Thr-Lys-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNCFWHZVRNXAKW-OEAJRASXSA-N 0.000 description 1
- GYUUYCIXELGTJS-MEYUZBJRSA-N Thr-Phe-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O GYUUYCIXELGTJS-MEYUZBJRSA-N 0.000 description 1
- DNCUODYZAMHLCV-XGEHTFHBSA-N Thr-Pro-Cys Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N)O DNCUODYZAMHLCV-XGEHTFHBSA-N 0.000 description 1
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 1
- BBPCSGKKPJUYRB-UVOCVTCTSA-N Thr-Thr-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O BBPCSGKKPJUYRB-UVOCVTCTSA-N 0.000 description 1
- CJEHCEOXPLASCK-MEYUZBJRSA-N Thr-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=C(O)C=C1 CJEHCEOXPLASCK-MEYUZBJRSA-N 0.000 description 1
- 240000000359 Triticum dicoccon Species 0.000 description 1
- PEYSVKMXSLPQRU-FJHTZYQYSA-N Trp-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O PEYSVKMXSLPQRU-FJHTZYQYSA-N 0.000 description 1
- PXQPYPMSLBQHJJ-WFBYXXMGSA-N Trp-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N PXQPYPMSLBQHJJ-WFBYXXMGSA-N 0.000 description 1
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 1
- GYKDRHDMGQUZPU-MGHWNKPDSA-N Tyr-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N GYKDRHDMGQUZPU-MGHWNKPDSA-N 0.000 description 1
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 1
- UUYCNAXCCDNULB-QXEWZRGKSA-N Val-Arg-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O UUYCNAXCCDNULB-QXEWZRGKSA-N 0.000 description 1
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 1
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 1
- MBGFDZDWMDLXHQ-GUBZILKMSA-N Val-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N MBGFDZDWMDLXHQ-GUBZILKMSA-N 0.000 description 1
- UEPLNXPLHJUYPT-AVGNSLFASA-N Val-Met-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O UEPLNXPLHJUYPT-AVGNSLFASA-N 0.000 description 1
- SJRUJQFQVLMZFW-WPRPVWTQSA-N Val-Pro-Gly Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SJRUJQFQVLMZFW-WPRPVWTQSA-N 0.000 description 1
- YLBNZCJFSVJDRJ-KJEVXHAQSA-N Val-Thr-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O YLBNZCJFSVJDRJ-KJEVXHAQSA-N 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 241000219094 Vitaceae Species 0.000 description 1
- 240000008042 Zea mays Species 0.000 description 1
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 description 1
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 1
- PGAVKCOVUIYSFO-UHFFFAOYSA-N [[5-(2,4-dioxopyrimidin-1-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate Chemical compound OC1C(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)OC1N1C(=O)NC(=O)C=C1 PGAVKCOVUIYSFO-UHFFFAOYSA-N 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 230000021736 acetylation Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 108010011559 alanylphenylalanine Proteins 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 150000001414 amino alcohols Chemical class 0.000 description 1
- 210000004381 amniotic fluid Anatomy 0.000 description 1
- 235000021016 apples Nutrition 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 108010036533 arginylvaline Proteins 0.000 description 1
- 108010038633 aspartylglutamate Proteins 0.000 description 1
- 125000004069 aziridinyl group Chemical group 0.000 description 1
- 235000021015 bananas Nutrition 0.000 description 1
- 235000021028 berry Nutrition 0.000 description 1
- 108091008324 binding proteins Proteins 0.000 description 1
- 102000023732 binding proteins Human genes 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 229950004398 broxuridine Drugs 0.000 description 1
- 210000004027 cell Anatomy 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 235000013339 cereals Nutrition 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 235000005822 corn Nutrition 0.000 description 1
- 238000002425 crystallisation Methods 0.000 description 1
- 230000008025 crystallization Effects 0.000 description 1
- 208000030381 cutaneous melanoma Diseases 0.000 description 1
- 238000006352 cycloaddition reaction Methods 0.000 description 1
- 125000000640 cyclooctyl group Chemical group [H]C1([H])C([H])([H])C([H])([H])C([H])([H])C([H])(*)C([H])([H])C([H])([H])C1([H])[H] 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- QONQRTHLHBTMGP-UHFFFAOYSA-N digitoxigenin Natural products CC12CCC(C3(CCC(O)CC3CC3)C)C3C11OC1CC2C1=CC(=O)OC1 QONQRTHLHBTMGP-UHFFFAOYSA-N 0.000 description 1
- SHIBSTMRCDJXLN-KCZCNTNESA-N digoxigenin Chemical compound C1([C@@H]2[C@@]3([C@@](CC2)(O)[C@H]2[C@@H]([C@@]4(C)CC[C@H](O)C[C@H]4CC2)C[C@H]3O)C)=CC(=O)OC1 SHIBSTMRCDJXLN-KCZCNTNESA-N 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- 239000001177 diphosphate Substances 0.000 description 1
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 1
- 235000011180 diphosphates Nutrition 0.000 description 1
- 238000004821 distillation Methods 0.000 description 1
- 239000003651 drinking water Substances 0.000 description 1
- 235000020188 drinking water Nutrition 0.000 description 1
- 235000013399 edible fruits Nutrition 0.000 description 1
- 230000005611 electricity Effects 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 238000001976 enzyme digestion Methods 0.000 description 1
- 150000002118 epoxides Chemical class 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 239000012467 final product Substances 0.000 description 1
- 239000012634 fragment Substances 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 239000003292 glue Substances 0.000 description 1
- 108010049041 glutamylalanine Proteins 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- 108010037850 glycylvaline Proteins 0.000 description 1
- 235000021021 grapes Nutrition 0.000 description 1
- 125000004051 hexyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 108010031424 isoleucyl-prolyl-proline Proteins 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 235000021374 legumes Nutrition 0.000 description 1
- 108010057821 leucylproline Proteins 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 125000001921 locked nucleotide group Chemical group 0.000 description 1
- 210000002751 lymph Anatomy 0.000 description 1
- 108010003700 lysyl aspartic acid Proteins 0.000 description 1
- 108010009298 lysylglutamic acid Proteins 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 238000005374 membrane filtration Methods 0.000 description 1
- 229960000907 methylthioninium chloride Drugs 0.000 description 1
- 150000004712 monophosphates Chemical class 0.000 description 1
- 210000003097 mucus Anatomy 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 239000012434 nucleophilic reagent Substances 0.000 description 1
- 229960003104 ornithine Drugs 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 108010012581 phenylalanylglutamate Proteins 0.000 description 1
- 108010051242 phenylalanylserine Proteins 0.000 description 1
- 239000008363 phosphate buffer Substances 0.000 description 1
- 150000003904 phospholipids Chemical class 0.000 description 1
- 229910000073 phosphorus hydride Inorganic materials 0.000 description 1
- 230000004962 physiological condition Effects 0.000 description 1
- 210000002381 plasma Anatomy 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 235000012015 potatoes Nutrition 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000003252 repetitive effect Effects 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 210000003296 saliva Anatomy 0.000 description 1
- 239000013535 sea water Substances 0.000 description 1
- 210000002966 serum Anatomy 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 108010071207 serylmethionine Proteins 0.000 description 1
- 238000010008 shearing Methods 0.000 description 1
- 201000003708 skin melanoma Diseases 0.000 description 1
- 238000010583 slow cooling Methods 0.000 description 1
- 239000000243 solution Substances 0.000 description 1
- 239000007858 starting material Substances 0.000 description 1
- 230000000707 stereoselective effect Effects 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- 239000001226 triphosphate Substances 0.000 description 1
- 235000011178 triphosphate Nutrition 0.000 description 1
- UNXRWKVEANCORM-UHFFFAOYSA-N triphosphoric acid Chemical compound OP(O)(=O)OP(O)(=O)OP(O)(O)=O UNXRWKVEANCORM-UHFFFAOYSA-N 0.000 description 1
- 108010003137 tyrosyltyrosine Proteins 0.000 description 1
- 210000002700 urine Anatomy 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Biotechnology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Analytical Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention provides methods and adaptors for characterising a target polynucleotide. The method comprises: (a) moving a target polynucleotide through a nanopore, wherein a sequencing terminus of the target polynucleotide comprises a modification that occludes the nanopore; and (b) taking one or more electrical and/or optical measurements as the target polynucleotide moves relative to the pore. The invention also provides an adaptor for characterising a target polynucleotide, wherein the adaptor comprises a modifying moiety which binds to the sequencing terminus of the target polynucleotide and causes the nanopore to become blocked. The method prevents large amounts of sequenced library from accumulating on the trans side of the nanopore sequencing device, and thereby increases the flexibility with which the trans side can be engineered.
Description
Technical Field
The present invention is in the field of gene sequencing and relates to a method for characterising a polynucleotide and to adaptors used in the method.
Background
Nanopore sequencing offers long read lengths, direct reading of modification information, and data production and analysis in parallel in real time. Compared with second-generation sequencing and other sequencing platforms, it has clear advantages in detecting variation in long nucleic acid fragments (including but not limited to point mutations, insertions and deletions, inversions and translocations, gene fusions, abnormal RNA splicing, RNA editing and other related nucleic acid variations) and in detecting modification information (including but not limited to methylation, acetylation and the like). Because data production and analysis run in parallel, the platform allows real-time detection and diagnosis of mutations and modifications; it is also portable in design, and therefore has broad application prospects.
When a voltage is applied across the nanopore, the current drops as analytes (e.g., polynucleotides, polypeptides, polysaccharides, and lipids) pass through the nanopore, and the degree of current blockage caused by analytes of different structures varies. The current changes when the analyte temporarily remains in the nanopore barrel (barrel) for a period of time. Nanopore detection of nucleotides gives a change in current of known characteristics and duration.
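The blockade principle described above can be illustrated with a minimal sketch that scans a current trace for runs of samples falling below a fraction of the open-pore current. The function name, the 30% drop threshold and the sample values are assumptions chosen for illustration only; they are not taken from the invention.

```python
from statistics import mean


def find_blockades(current_pa, open_pore_pa, drop_fraction=0.3):
    """Return (start, end, mean_current) for each run of samples below threshold.

    current_pa    : list of current samples in picoamperes (hypothetical data)
    open_pore_pa  : open-pore current with no analyte in the pore
    drop_fraction : assumed fractional drop that counts as a blockade
    """
    threshold = open_pore_pa * (1.0 - drop_fraction)
    events = []
    start = None
    for i, value in enumerate(current_pa):
        if value < threshold and start is None:
            start = i  # blockade begins
        elif value >= threshold and start is not None:
            events.append((start, i, mean(current_pa[start:i])))
            start = None  # blockade ends
    if start is not None:  # trace ended while the pore was still blocked
        events.append((start, len(current_pa), mean(current_pa[start:])))
    return events


# Hypothetical trace: two blockades of different depth
trace = [100, 101, 60, 58, 62, 99, 100, 40, 42, 100]
events = find_blockades(trace, open_pore_pa=100)
```

Each event carries its own mean current and duration (end minus start), which are the two quantities the paragraph above says differ between analytes.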
In current nanopore sequencing, all analytes to be tested are loaded at the Cis end of the nanopore sequencing device at the start of sequencing. During sequencing, the analytes continuously pass through the pores and accumulate in large amounts at the Trans end of the nanopore sequencer.
Disclosure of Invention
The present invention aims to provide a method of characterising a target polynucleotide, and adaptors for use in the method. The adaptors of the invention prevent the sequencing library from accumulating at the Trans end of the sequencing apparatus, thereby avoiding the interference this accumulation may cause and leaving more freedom to modify the Trans end.
The purpose of the invention is realized by the following technical scheme:
in a first aspect, the present invention provides a method of characterising a target polynucleotide comprising:
(a) moving the target polynucleotide through the nanopore,
wherein the sequencing terminus of the target polynucleotide comprises a modification that occludes a nanopore;
(b) taking one or more electrical and/or optical measurements as the target polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the target polynucleotide, and thereby characterising the target polynucleotide.
The method of the invention further comprises the following steps:
(c) moving the target polynucleotide in a reverse direction relative to the nanopore back to the starting side of the nanopore.
The method according to the present invention, wherein,
the nanopore comprises a solid state nanopore and/or a biological nanopore; the biological nanopore comprises a transmembrane pore; the transmembrane pore comprises a transmembrane protein pore.
The method according to the present invention, wherein step (c) does not comprise:
taking one or more electrical and/or optical measurements as the polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the polynucleotide, and thereby characterising the target polynucleotide.
The method according to the present invention, wherein the step (c) comprises:
taking one or more electrical and/or optical measurements as the polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the polynucleotide, and thereby characterising the target polynucleotide.
The method according to the invention, wherein in step (c) the moving the target polynucleotide in reverse direction relative to the pore is effected by at least one means comprising: a reverse voltage is applied.
The means for moving the target polynucleotide in the reverse direction relative to the pore may further comprise pulling by an atomic force microscope (AFM), or pulling by a reverse-acting enzyme that moves the target polynucleotide in the reverse direction relative to the nanopore. For example, if the helicase that moves the target polynucleotide toward the nanopore is a 5'-3' helicase, the reverse-acting enzyme may be a 3'-5' helicase.
The method according to the present invention, wherein the step (a) comprises:
attaching an adaptor comprising a modification moiety to the polynucleotide of interest such that a sequencing terminus of the polynucleotide of interest comprises the modification moiety.
Preferably, the modification moiety comprises a modification moiety with uncharged side chains or with positively charged side chains; and/or
The modification moiety with uncharged side chains comprises any one of, or a combination of two or more of, PNA, a polypeptide, and nucleotides whose phosphate backbone is modified by alkylation; and/or
The modification moiety with positively charged side chains comprises nucleotides whose phosphate backbone is modified with a cationic oligomer; and/or
The cationic oligomer comprises any one of, or a combination of two or more of, spermine, spermidine and putrescine; and/or
The modification moiety comprises a pair of ligands that bind to each other, including streptavidin and biotin, or an antigen and an antibody.
The method according to the invention, wherein the adaptor is a Y-adaptor comprising a double-stranded region and at least one single-stranded region, or an E-adaptor comprising a double-stranded region and no single-stranded region.
Wherein the type E adaptors are adaptors for conventional sequencing and type E adaptors suitable for use herein comprise a modification moiety.
The method according to the invention wherein the adaptor is a Y-adaptor, the modified portion of the Y-adaptor being located in or forming the overhang portion of the Y-adaptor; and/or
The modifying moiety comprises a modifying moiety with uncharged side chains, more preferably the modifying moiety with uncharged side chains is a PNA or a polypeptide.
And/or
The modification moiety is covalently attached to the Y-adaptor or the modification is attached to the Y-adaptor by a click chemistry reaction.
The method according to the invention wherein the adaptor is an E-type adaptor and the modification moiety is located at one end which is not attached to the polynucleotide of interest; and/or
The modified part comprises an affinity substance, preferably the affinity substance is streptavidin and biotin; or
The modified moiety comprises cholesterol.
The method according to the present invention, wherein the adaptor comprises a blocking strand, which has a structure different from that of the polynucleotide and serves to stall a motor protein;
preferably, the blocking strand comprises one or more nitroindoles, one or more inosines, one or more acridines, one or more 2-aminopurines, one or more 2,6-diaminopurines, one or more 5-bromo-deoxyuridines, one or more inverted thymidines, one or more inverted dideoxythymidines, one or more dideoxycytidines, one or more 5-methylcytidines, one or more 5-hydroxymethylcytidines, one or more 2'-alkoxy-modified ribonucleotides (preferably 2'-methoxy-modified ribonucleotides), one or more isodeoxycytidines, one or more isodeoxyguanosines, one or more C3 groups, one or more photocleavable groups, one or more hexanediol groups, one or more iSp9 groups, one or more iSp18 groups, a polymer, or one or more thiol linkages.
In a second aspect, the present invention provides an adaptor for characterising a target polynucleotide, wherein the adaptor comprises a modification moiety for attachment to the sequencing terminus of the target polynucleotide, the modification moiety being capable of causing nanopore blockage.
In a third aspect, the present invention provides a construct for characterising a polynucleotide of interest, the construct comprising an adaptor, and a polynucleotide of interest;
the adaptor comprises a modification moiety that binds to a sequencing terminus of the polynucleotide of interest, the modification moiety causing the nanopore to be blocked;
preferably, the target polynucleotide is a double-stranded polynucleotide.
In a fourth aspect, the present invention provides a complex for characterising a polynucleotide of interest, said complex comprising a construct and a motor protein, wherein,
the construct comprises an adaptor and a polynucleotide of interest, the adaptor comprising a modification moiety, the modification moiety binding to a sequencing terminal of the polynucleotide of interest, the modification moiety causing the nanopore to be blocked;
the motor protein is a protein capable of binding to the target polynucleotide and controlling its movement through the pore;
preferably, the motor protein is selected from one or more of a polymerase, an exonuclease, a helicase and a topoisomerase;
more preferably, the helicase is selected from one or more of Hel308 helicase, RecD helicase, TraI helicase, TrwC helicase, XPD helicase and Dda helicase.
In a fifth aspect, the present invention provides a kit for nanopore characterisation of polynucleotides, the kit comprising:
1) an adaptor comprising a modification moiety that binds to a sequencing terminus of the polynucleotide of interest, the modification moiety causing nanopore blockage; and
2) a motor protein.
In nanopore sequencing, the entire library is loaded at the Cis end of the nanopore sequencing device at the start of sequencing. As the library strands continue to pass through the pores during sequencing, large amounts of nucleic acid accumulate at the Trans end of the nanopore sequencer, and because nucleic acids are charged, this accumulation may interfere with the sequencing result. The inventors provide a method that avoids this accumulation. The technical concept of the invention is described below with reference to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of the principle by which the adaptors of the invention prevent large-scale accumulation of the sequencing library at the Trans end; FIG. 2 is a schematic diagram of a complex comprising an adaptor of the invention, a target polynucleotide and a helicase. In FIG. 1, adaptors are ligated to both ends of the double-stranded target polynucleotide to be characterised. Under the action of a helicase, the duplex is unwound while it moves relative to the pore, and the target polynucleotide is sequenced from the changes in current through the pore, that is, by strand sequencing. Because the modification moiety of the adaptor lies at the sequencing terminus of the strand being sequenced, the modification cannot pass through the nanopore and blocks the pore, so that the strand whose sequencing is complete is ejected from the pore back to the Cis end.
Briefly, the helicase used during sequencing is a 5'-3' helicase. Guided by the complex, the 5' end of the library strand enters the pore, and the helicase translocates in the 5'-3' direction while sequencing proceeds. When the single-stranded library reaches its 3' end, the terminal modification presents an energy barrier that cannot be crossed, so the pore becomes blocked. The system then applies a reverse voltage to clear the pore, and the sequenced single-stranded library is ejected from the Trans end back to the Cis end.
In FIG. 2, the complex comprises the adaptor, the double-stranded target polynucleotide to be characterised, and a motor protein. The Y1 strand comprises a blocking strand S, a polynucleotide strand D' linked to the blocking strand S, and a motor protein stalled on the blocking strand S. The region in which the Y2 strand is complementary to the Y1 strand is the double-stranded portion of polynucleotide strand L. The YB-Dn strand comprises the modification moiety, which in one particular embodiment comprises PNA; because PNA is uncharged, it blocks the pore, and the strand is therefore eventually ejected. The region in which the YB-Up strand is complementary to the Y1 strand is the double-stranded polynucleotide D. The YB-Dn strand and the YB-Up strand are connected by click chemistry, for example between DBCO and an azide (N3).
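The sequence of events in this paragraph can be written out as a small state list. This is only an illustrative sketch: the `Step` names and the `strand_lifecycle` function are hypothetical, and the model assumes an idealised strand that is always captured and always fully read.

```python
from enum import Enum


class Step(Enum):
    # Hypothetical labels for the stages described in the text
    CAPTURE = "5' end of the strand enters the pore from the Cis end"
    SEQUENCE = "helicase moves 5'->3' while the current is recorded"
    BLOCK = "terminal modification cannot pass; the pore is blocked"
    REVERSE_VOLTAGE = "reverse voltage is applied"
    BACK_TO_CIS = "sequenced strand is ejected to the Cis end"
    EXIT_TO_TRANS = "sequenced strand leaves through the Trans end"


def strand_lifecycle(has_blocking_modification: bool) -> list:
    """Return the ordered steps one strand goes through (idealised)."""
    steps = [Step.CAPTURE, Step.SEQUENCE]
    if has_blocking_modification:
        # Strand carrying the modification at its 3' sequencing terminus
        steps += [Step.BLOCK, Step.REVERSE_VOLTAGE, Step.BACK_TO_CIS]
    else:
        # Conventional strand: nothing stops it at the end of the read
        steps.append(Step.EXIT_TO_TRANS)
    return steps


for step in strand_lifecycle(True):
    print(step.name, "-", step.value)
```

The two branches differ only after the read is complete, which reflects that the modification is intended to act at the sequencing terminus rather than during sequencing itself.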
Compared with the prior art, the technical scheme of the invention has the following advantages:
the method provided by the invention prevents large amounts of sequenced library from accumulating at the Trans end of the nanopore sequencing device. Compared with existing sequencing technology, it greatly reduces the interference that may be caused by large amounts of charged nucleic acid accumulating at the Trans end, and leaves more freedom to modify the Trans end.
In addition, methods that use modifications with uncharged or positively charged side chains to block the pore are simpler to prepare and operate, and more versatile, than methods that use, for example, streptavidin and biotin, and they can be carried out with an adaptor such as a Y adaptor during sequencing.
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram showing the principle by which the adaptors of the present invention prevent large-scale accumulation of sequencing libraries at the Trans end;
FIG. 2 shows a schematic diagram of a complex comprising an adaptor of the invention, a polynucleotide of interest and a helicase in one particular embodiment;
wherein the Y1 strand comprises a blocking strand S, a polynucleotide strand D' linked to the blocking strand S, and a motor protein stalled on the blocking strand S; the region in which the Y2 strand is complementary to the Y1 strand is the double-stranded portion of polynucleotide strand L; the YB-Dn strand comprises the modification moiety; the region in which the YB-Up strand is complementary to the Y1 strand is the double-stranded polynucleotide D; and the YB-Dn strand and the YB-Up strand are connected by click chemistry;
FIG. 3 shows a schematic representation of a complex comprising an adaptor of the invention, a polynucleotide of interest and a helicase in another specific embodiment; a biotin-streptavidin complex is introduced at the sequencing terminus of the target polynucleotide, and when sequencing reaches the terminus the electric field force is insufficient to pull the complex apart, so that the pore is blocked and the sequenced strand is ejected.
FIG. 4 shows a diagram of sequencing signals when sequencing is performed using adaptors according to example 1 of the invention;
FIG. 5 shows a graph of sequencing signals when sequencing is performed using adaptors according to example 2 of the invention;
FIG. 6 shows the adaptor according to example 3 of the invention and a diagram of sequencing signals when sequencing is performed using that adaptor.
Detailed Description
It is understood that different applications of the disclosed products and methods may be tailored to specific needs in the art. It is to be understood that the terminology used herein is for the purpose of describing particular embodiments of the invention only and is not intended to be limiting.
In addition, as used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "a polynucleotide" includes two or more polynucleotides, reference to "a polynucleotide binding protein" includes two or more such proteins, reference to "a helicase" includes two or more helicases, reference to "a monomer" includes two or more monomers, reference to "a pore" includes two or more pores, and the like.
All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety.
Method
The present invention provides a method of characterising a target polynucleotide, comprising:
(a) moving the target polynucleotide through the nanopore,
wherein the sequencing terminus of the target polynucleotide comprises a modification that occludes a nanopore;
(b) taking one or more electrical and/or optical measurements as the polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the polynucleotide, and thereby characterising the target polynucleotide.
The method of the invention further comprises the following steps:
(c) optionally applying a reverse voltage to reverse the movement of the target polynucleotide relative to the nanopore back to the starting side of the nanopore.
The method according to the present invention, wherein step (c) does not comprise:
taking one or more electrical and/or optical measurements as the polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the polynucleotide, and thereby characterising the target polynucleotide.
The method according to the present invention, wherein the step (c) may comprise:
taking one or more electrical and/or optical measurements as the polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the polynucleotide, and thereby characterising the target polynucleotide.
The method according to the invention, wherein in step (c) the moving the target polynucleotide in reverse direction relative to the pore is effected by at least one means comprising: a reverse voltage is applied.
The methods of the invention comprise measuring one or more characteristics of the target polynucleotide. The method may comprise measuring 2, 3, 4, 5 or more characteristics of the target polynucleotide. The one or more characteristics are preferably selected from (i) the length of the target polynucleotide, (ii) the identity of the target polynucleotide, (iii) the sequence of the target polynucleotide, (iv) the secondary structure of the target polynucleotide, and (v) whether the target polynucleotide is modified. Any combination of (i) to (v) may be measured according to the present invention.
For (i), the length of the polynucleotide may be determined, for example, by determining the number of interactions of the target polynucleotide with the pore and the duration of time between interactions of the target polynucleotide with the pore.
For (ii), the identity of the polynucleotide may be determined in a variety of ways. The identity of a polynucleotide may be determined with or without determining the sequence of the target polynucleotide. The former is straightforward: the polynucleotide is sequenced and thereby identified. The latter can be done in several ways. For example, the presence of a particular motif in a polynucleotide can be determined (without determining the remaining sequence of the polynucleotide). Alternatively, a particular electrical and/or optical signal measured in the method can identify a target polynucleotide as coming from a particular source.
For (iii), the sequence of the polynucleotide may be determined as described previously. Suitable sequencing methods, particularly those using electrical measurements, are described in Stoddart D et al., Proc Natl Acad Sci, 12; 7702-7; Lieberman KR et al., J Am Chem Soc. 2010; 132(50): 17961-72; and International application WO 2000/28312.
For (iv), the secondary structure can be measured in a variety of ways. For example, if the method involves electrical measurements, the secondary structure may be measured using changes in dwell time or changes in the current through the pore. This allows single-stranded and double-stranded regions of a polynucleotide to be distinguished.
For (v), the presence or absence of any modification can be determined. The method preferably comprises determining whether the target polynucleotide has been modified by methylation, by oxidation, by damage, with one or more proteins, or with one or more labels, tags or blocking strands. Specific modifications result in specific interactions with the pore, which can be determined using the methods described below. For example, cytosine can be distinguished from methylated cytosine based on the current passing through the pore during its interaction with each nucleotide.
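As a rough illustration of point (v), a measured mean current can be assigned to whichever reference level it lies closest to. The reference currents below are invented placeholder values, not measurements from the invention, and the function name is hypothetical; a real base caller would use a calibrated model rather than a nearest-level rule.

```python
# Hypothetical reference levels in picoamperes (placeholders only)
REFERENCE_LEVELS_PA = {
    "cytosine": 52.0,
    "5-methylcytosine": 55.5,
}


def classify_base(mean_current_pa, reference_levels=REFERENCE_LEVELS_PA):
    """Return the label whose reference current is nearest to the measurement."""
    return min(
        reference_levels,
        key=lambda label: abs(reference_levels[label] - mean_current_pa),
    )


print(classify_base(52.8))  # nearest to the cytosine placeholder level
print(classify_base(55.0))  # nearest to the 5-methylcytosine placeholder level
```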
The process is generally carried out in the presence of a buffer. In the exemplary apparatus discussed above, the buffer is present in the aqueous solution of the chamber. Any buffer may be used in the methods of the invention. Typically, the buffer is a phosphate buffer. Other suitable buffers are HEPES and Tris-HCl buffers. The process is typically carried out at a pH of 4.0 to 12.0, 4.5 to 10.0, 5.0 to 9.0, 5.5 to 8.8, 6.0 to 8.7, 7.0 to 8.8, or 7.5 to 8.5. The pH used is preferably about 7.5.
The process can be carried out at 0 to 100 ℃, 15 to 95 ℃, 16 to 90 ℃, 17 to 85 ℃, 18 to 80 ℃, 19 to 70 ℃, or 20 to 60 ℃. The process is typically carried out at room temperature. The process is optionally carried out at a temperature that supports helicase function, for example about 37 ℃.
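The nested pH and temperature ranges listed above can be encoded as data and queried. The sketch below returns the narrowest listed range that contains a given value; the ranges are copied from the text, while the function name and the idea of treating the narrowest range as the tightest applicable condition are assumptions made for illustration.

```python
# Ranges as listed in the text (inclusive bounds)
PH_RANGES = [(4.0, 12.0), (4.5, 10.0), (5.0, 9.0), (5.5, 8.8),
             (6.0, 8.7), (7.0, 8.8), (7.5, 8.5)]
TEMP_RANGES_C = [(0, 100), (15, 95), (16, 90), (17, 85),
                 (18, 80), (19, 70), (20, 60)]


def narrowest_range(value, ranges):
    """Return the narrowest listed range containing value, or None."""
    matches = [r for r in ranges if r[0] <= value <= r[1]]
    if not matches:
        return None
    return min(matches, key=lambda r: r[1] - r[0])


print(narrowest_range(7.5, PH_RANGES))     # preferred pH of about 7.5
print(narrowest_range(37, TEMP_RANGES_C))  # temperature supporting helicase function
print(narrowest_range(13.0, PH_RANGES))    # outside every listed range
```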
The method may be carried out in the presence of free nucleotides or free nucleotide analogues and/or a cofactor that assists the functioning of the helicase. The method may also be carried out in the absence of free nucleotides or free nucleotide analogues and in the absence of a cofactor for the helicase. The free nucleotides can be any one or more of the individual nucleotides as discussed above. Free nucleotides include, but are not limited to, adenosine monophosphate (AMP), adenosine diphosphate (ADP), adenosine triphosphate (ATP), guanosine monophosphate (GMP), guanosine diphosphate (GDP), guanosine triphosphate (GTP), thymidine monophosphate (TMP), thymidine diphosphate (TDP), thymidine triphosphate (TTP), uridine monophosphate (UMP), uridine diphosphate (UDP), uridine triphosphate (UTP), cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyadenosine diphosphate (dADP), deoxyadenosine triphosphate (dATP), deoxyguanosine monophosphate (dGMP), deoxyguanosine diphosphate (dGDP), deoxyguanosine triphosphate (dGTP), deoxythymidine monophosphate (dTMP), deoxythymidine diphosphate (dTDP), deoxythymidine triphosphate (dTTP), deoxyuridine monophosphate (dUMP), deoxyuridine diphosphate (dUDP), deoxyuridine triphosphate (dUTP), deoxycytidine monophosphate (dCMP), deoxycytidine diphosphate (dCDP), and deoxycytidine triphosphate (dCTP). The free nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP or dCMP. The free nucleotide is preferably adenosine triphosphate (ATP). A helicase cofactor is a factor that allows the helicase or construct to function. The helicase cofactor is preferably a divalent metal cation. The divalent metal cation is preferably Mg²⁺, Mn²⁺, Ca²⁺ or Co²⁺. The helicase cofactor is most preferably Mg²⁺.
Adaptor
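The nucleotide abbreviations in the list above follow a regular pattern (optional "d" prefix, base letter, M/D/T for the number of phosphates, then "P"), plus the two cyclic forms. The following sketch regenerates the abbreviations and systematic names from that pattern; the function and variable names are hypothetical and serve only as a cross-check of the list.

```python
BASES = {"A": "adenosine", "G": "guanosine", "T": "thymidine",
         "U": "uridine", "C": "cytidine"}
PHOSPHATES = {"M": "monophosphate", "D": "diphosphate", "T": "triphosphate"}


def free_nucleotides():
    """Map each abbreviation (e.g. 'dATP') to its full name."""
    table = {}
    for prefix, label in (("", ""), ("d", "deoxy")):
        for base, base_name in BASES.items():
            for count, phosphate_name in PHOSPHATES.items():
                abbreviation = f"{prefix}{base}{count}P"
                table[abbreviation] = f"{label}{base_name} {phosphate_name}"
    # The two cyclic nucleotides do not follow the regular pattern
    table["cAMP"] = "cyclic adenosine monophosphate"
    table["cGMP"] = "cyclic guanosine monophosphate"
    return table


nucleotides = free_nucleotides()
print(len(nucleotides), nucleotides["dATP"])
```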
The adaptors of the present invention comprise a modification moiety that binds to the sequencing terminus of the target polynucleotide to cause nanopore blockage.
Wherein the modification moiety comprises a modification moiety with uncharged side chains or with positively charged side chains. Optionally, the modification moiety with uncharged side chains includes any one of, or a combination of any two or more of, PNA, a polypeptide, and nucleotides whose phosphate backbone is modified by alkylation. Alternatively, the modification moiety with positively charged side chains comprises nucleotides whose phosphate backbone is modified with a cationic oligomer; see those described in CN101370817A, incorporated herein by reference in its entirety.
In particular, the cationic oligonucleotide $A_iB_jH$ has an oligonucleotide moiety $A_i$ and an oligocationic moiety $B_j$, wherein $A_i$ is an $i$-mer oligonucleotide residue, $i = 5$ to $50$, having natural or unnatural nucleobases and/or pentofuranosyl groups and/or natural phosphodiester bonds, and $B_j$ is a $j$-mer organic oligocationic moiety, $j = 1$ to $50$, wherein B is selected from the group comprising:
$-\mathrm{HPO_3}-R^1-(X-R^2_n)_{n1}-X-R^3-\mathrm{O}-$, wherein $R^1$, $R^2_n$ and $R^3$ are identical or different lower alkylene, X is NH or $\mathrm{NC(NH_2)_2}$, and $n1 = 2$ to $20$;
$-\mathrm{HPO_3}-R^4-\mathrm{CH}(R^5X^1)-R^6-\mathrm{O}-$, wherein $R^4$ is lower alkylene, $R^5$ and $R^6$ are identical or different lower alkylene, and $X^1$ is a putrescine, spermidine or spermine residue;
$-\mathrm{HPO_3}-R^7-(aa)_{n2}-R^8-\mathrm{O}-$, wherein $R^7$ is lower alkylene, $R^8$ is lower alkylene, serine, or an amino alcohol obtained by reduction of a natural amino acid, $(aa)_{n2}$ is a peptide containing natural amino acids with cationic side chains, such as arginine, lysine, ornithine, histidine or diaminopropionic acid, and $n2 = 2$ to $20$.
As used in the specification and claims, "lower alkyl" and "lower alkylene" preferably mean optionally substituted $C_1$-$C_5$ straight or branched chain alkyl or alkylene.
For example, A is selected from the group comprising deoxyribonucleotides, ribonucleotides, locked nucleotides (LNA), and chemical modifications or substitutions thereof such as phosphorothioates (also known as thiophosphates), 2'-fluoro groups, 2'-O-alkyl groups, or labelling groups such as fluorescent agents. Preferably, the cationic oligomer comprises any one of spermine, spermidine and putrescine, or a combination of any two or more of them.
Optionally, the modification moiety comprises a pair of ligands that bind to each other, the pair comprising streptavidin and biotin, and/or an antigen and an antibody.
The length of the modification moiety varies with its chemical structure. In a specific embodiment, the modification moiety is PNA, and the PNA may be 3-100 nucleotide units in length, preferably 5-50, more preferably 10-30, and even more preferably 13-20.
The adaptor is ligated to both ends of the double-stranded target polynucleotide to be characterised. Under the action of a helicase, the duplex is unwound while it moves relative to the pore, and the sequence of the target polynucleotide is determined from the changes in current through the pore, that is, by strand sequencing. Because the modification moiety of the adaptor lies at the sequencing terminus of the strand being sequenced, the modification cannot pass through the nanopore and blocks the pore, so that the strand whose sequencing is complete is ejected from the pore back to the Cis end.
Wherein the adaptor may be a Y-type adaptor comprising a double-stranded region and at least one single-stranded region, or an E-type adaptor comprising a double-stranded region and no single-stranded region.
In a particular embodiment, the adaptor is a Y-adaptor, the modified portion of the Y-adaptor is located in or forms the overhang portion of the Y-adaptor; and/or
The modifying moiety comprises a modifying moiety with uncharged side chains, more preferably the modifying moiety with uncharged side chains is a PNA or a polypeptide.
And/or
The modification moiety is covalently attached to the Y-adaptor or the modification is attached to the Y-adaptor by a click chemistry reaction.
In particular, the Y-type adaptor comprises $\{S\text{-}D\}_n$ or $\{D\text{-}S\}_n$ in the 5' to 3' direction, wherein D is a double-stranded polynucleotide comprising a modification moiety, S is a blocking strand, and n is a positive integer;
and, the D duplex comprises a polynucleotide strand D ' linked to S and a complementary strand D "of the D ', wherein the motor protein moves in the direction S → D ' during characterization, and the modification is located on the complementary strand D".
For example, in a specific embodiment, the helicase is a 5'-3' helicase. Guided by the complex, the 5' end of the library strand enters the pore, and the helicase translocates in the 5'-3' direction while sequencing proceeds. When the single-stranded library reaches its 3' end, the terminal modification presents an energy barrier that cannot be crossed, so the pore becomes blocked. The system then applies a reverse voltage to clear the pore, and the sequenced single-stranded library is ejected from the Trans end back to the Cis end.
The adaptor of the invention, wherein the adaptor comprises $\{L\text{-}S\text{-}D\}_n$ or $\{D\text{-}S\text{-}L\}_n$ in the 5' to 3' direction, wherein L is a polynucleotide strand; preferably, at least part of L is double-stranded; and/or at least part of L is single-stranded; and/or L comprises one or more blocking molecules; and/or L comprises a leader sequence that preferentially threads into the pore.
It will be appreciated that the L moiety is the moiety that first contacts the sequencing pore.
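The repeat notation used for the adaptor layouts can be expanded mechanically. The sketch below is only a notational aid: `adaptor_layout` is a hypothetical helper that writes out the ordered segments, 5' to 3', for a given unit and repeat number n, using the segment letters L, S and D as defined in the text.

```python
ALLOWED_UNITS = ("S-D", "D-S", "L-S-D", "D-S-L")  # layouts named in the text


def adaptor_layout(unit: str, n: int) -> list:
    """Expand a repeat unit such as 'L-S-D' into its 5'->3' segment list."""
    if unit not in ALLOWED_UNITS:
        raise ValueError(f"unknown unit: {unit}")
    if n < 1:
        raise ValueError("n must be a positive integer")
    return unit.split("-") * n


print(adaptor_layout("L-S-D", 1))  # leader first, then blocking strand, then duplex
print(adaptor_layout("S-D", 2))
```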
In a specific embodiment, the target polynucleotide is modified to carry a Y-type adaptor comprising a leader sequence and an E-type adaptor. The Y-type adaptor containing the leader sequence is ligated to one end of the polynucleotide and the E-type adaptor is ligated to the other end, the modification moiety of the invention being located at the end of the E-type adaptor that is not attached to the target polynucleotide. The modification moiety comprises an affinity pair, namely streptavidin and biotin, or the modification moiety comprises cholesterol.
The leader sequence preferentially enters the nanopore. The double-stranded adaptor without a single-stranded region cannot pass through the nanopore because its modification moiety blocks the pore; the system therefore applies a reverse voltage to clear the pore, and the sequenced single-stranded library is ejected from the Trans end back to the Cis end.
In the present invention, the Cis and Trans ends are understood as follows:
nanopores typically have two openings: a first opening and a second opening. Such openings are commonly referred to as the cis-opening and trans-opening of the nanopore. Typically the first opening is a cis opening and the second opening is a trans opening. The symbols "cis" and "trans" opening in nanopores are conventional in the art. For example, the cis opening of a nanopore typically faces the cis end of the nanopore, and the trans opening typically faces the trans end. It will be appreciated that the cis-terminus is the end from which the target polynucleotide moves into the nanopore and the trans-terminus is the end from which the target polynucleotide moves out of the nanopore.
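To make the cis/trans bookkeeping concrete, the sketch below counts where strands end up after every strand in a library has been read once. It is a deliberately idealised model: the function name is hypothetical, every strand is assumed to be captured exactly once, and ejection by reverse voltage is assumed to succeed every time.

```python
def final_distribution(n_strands: int, blocking_modification: bool) -> dict:
    """Count strands at the Cis and Trans ends after one pass (idealised)."""
    if n_strands < 0:
        raise ValueError("n_strands must be non-negative")
    cis, trans = n_strands, 0  # the whole library starts at the Cis end
    for _ in range(n_strands):
        cis -= 1  # strand is captured from the Cis end and sequenced
        if blocking_modification:
            cis += 1  # pore blocks at the terminus; strand is ejected back
        else:
            trans += 1  # strand passes through and stays at the Trans end
    return {"cis": cis, "trans": trans}


print(final_distribution(1000, blocking_modification=False))
print(final_distribution(1000, blocking_modification=True))
```

Under these assumptions the Trans count stays at zero when the blocking modification is present, which is the behaviour the invention aims for.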
Blocking chain
The one or more blocking strands are included in the target polynucleotide. The one or more blocking strands are preferably part of the target polynucleotide, for example by interrupting the polynucleotide sequence. The one or more blocking strands are preferably not part of one or more blocking molecules, such as speed bumps, that hybridise to the target polynucleotide.
There may be any number of blocking strands in the target polynucleotide, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more blocking strands. Preferably there are 2, 4 or 6 blocked strands in the target polynucleotide. Different regions of the target polynucleotide may have a blocking strand, for example a blocking strand in the leader sequence and a blocking strand in the hairpin loop.
The one or more blocking strands each provide an energy barrier that the one or more helicases cannot overcome even in the active mode. The one or more blocking strands may stall the one or more helicases by reducing the traction of the helicase on the polynucleotide (for example, by removing the bases from nucleotides in the target polynucleotide) or by physically blocking the movement of the one or more helicases (for example, with bulky chemical groups).
The one or more blocking strands may comprise any molecule or combination of molecules that stalls one or more helicases. The one or more blocking strands may comprise any molecule or combination of molecules that prevents the one or more helicases from moving along the target polynucleotide. It is straightforward to determine whether one or more helicases are stalled at one or more blocking strands in the absence of a nanopore and an applied potential. For example, as shown in the examples, the ability of a helicase to move past a blocking strand and displace a complementary DNA strand can be measured by PAGE.
The one or more blocking strands typically comprise a linear molecule such as a polymer. The one or more blocking strands typically have a different structure from the target polynucleotide. For example, if the target polynucleotide is DNA, the one or more blocking strands are typically not DNA. In particular, if the target polynucleotide is deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), the one or more blocking strands preferably comprise peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA) or a synthetic polymer with nucleotide side chains.
The one or more blocking strands preferably comprise one or more nitroindoles, such as one or more 5-nitroindoles, one or more inosines, one or more acridines, one or more 2-aminopurines, one or more 2,6-diaminopurines, one or more 5-bromo-deoxyuridines, one or more inverted thymidines (inverted dTs), one or more inverted dideoxythymidines (ddTs), one or more dideoxycytidines (ddCs), one or more 5-methylcytidines, one or more 5-hydroxymethylcytidines, one or more 2'-alkoxy-modified ribonucleotides (preferably 2'-methoxy-modified ribonucleotides), one or more iso-deoxycytidines (iso-dCs), one or more iso-deoxyguanosines (iso-dGs), one or more iSpC3 groups (i.e., nucleotides lacking a sugar and a base), one or more photocleavable (PC) groups, one or more hexanediol groups, one or more spacer 9 (iSp9) groups, one or more spacer 18 (iSp18) groups, a polymer or one or more thiol linkages. The one or more blocking strands may comprise any combination of these groups. Many of these groups are commercially available from IDT (Integrated DNA Technologies).
The one or more blocking strands may comprise any number of these groups. For example, for 2-aminopurines, 2,6-diaminopurines, 5-bromo-deoxyuridines, inverted dTs, ddTs, ddCs, 5-methylcytidines, 5-hydroxymethylcytidines, 2'-alkoxy-modified ribonucleotides (preferably 2'-methoxy-modified ribonucleotides), iso-dCs, iso-dGs, iSpC3 groups, PC groups, hexanediol groups and thiol linkages, the one or more blocking strands preferably comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more of the group. The one or more blocking strands preferably comprise 2, 3, 4, 5, 6, 7, 8 or more iSp9 groups. The one or more blocking strands preferably comprise 2, 3, 4, 5, 6 or more iSp18 groups. The most preferred blocking strand consists of four iSp18 groups.
The polymer is preferably a polypeptide or polyethylene glycol (PEG). The polypeptide preferably comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more amino acids. The PEG preferably comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more monomeric units.
The one or more blocking strands preferably comprise one or more abasic nucleotides (i.e. nucleotides lacking a nucleobase), for example 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more abasic nucleotides. In abasic nucleotides the nucleobase may be replaced by -H (idSp) or -OH. Abasic blocking strands can be inserted into a target polynucleotide by removing the nucleobases from one or more adjacent nucleotides.
The one or more blocking strands preferably comprise one or more chemical groups that physically cause the one or more helicases to stall. The one or more chemical groups are preferably one or more pendant chemical groups. The one or more chemical groups may be attached to one or more nucleobases in the target polynucleotide. The one or more chemical groups may be attached to the backbone of the target polynucleotide. Any number of these chemical groups may be present, such as 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more. Suitable groups include, but are not limited to, fluorophores, streptavidin and/or biotin, cholesterol, methylene blue, dinitrophenols (DNPs), digoxigenin and/or anti-digoxigenin, and dibenzocyclooctyne groups.
Different blocking strands in a target polynucleotide may comprise different stalling molecules. For example, one blocking strand may comprise a linear molecule as discussed above, and another blocking strand may comprise one or more chemical groups that physically cause the one or more helicases to stall. A blocking strand may also comprise both any linear molecule as discussed above and one or more chemical groups that physically cause the one or more helicases to stall, such as one or more abasic nucleotides together with a fluorophore.
Complex
The invention provides a complex comprising an adaptor according to the invention and a motor protein, wherein the motor protein is stalled at a blocking strand.
Preferably, the motor protein is a protein capable of binding to a polynucleotide and controlling its movement through a pore, and is preferably an enzyme. For example, the enzyme is selected from one or more of a polymerase, an exonuclease, a helicase and a topoisomerase. For example, the helicase is selected from one or more of Hel308 helicase, RecD helicase, TraI helicase, TrwC helicase, XPD helicase and Dda helicase.
Polynucleotide
Polynucleotides, such as nucleic acids, are macromolecules containing two or more nucleotides. The polynucleotide or nucleic acid may include any combination of any nucleotides. Nucleotides may be naturally occurring or synthetic. One or more nucleotides in a polynucleotide may be oxidized or methylated. One or more nucleotides in the polynucleotide may be damaged. For example, the polynucleotide may comprise a pyrimidine dimer. Such dimers are often associated with damage caused by ultraviolet light and are the leading cause of cutaneous melanoma. One or more nucleotides in a polynucleotide may be modified, for example with a label or tag. Suitable labels are described below.
The nucleotides in a polynucleotide are typically ribonucleotides or deoxyribonucleotides. The polynucleotide may comprise the following nucleosides: adenosine, uridine, guanosine and cytidine. The nucleotide is preferably a deoxyribonucleotide. The polynucleotide preferably comprises the following nucleosides: deoxyadenosine (dA), deoxyuridine (dU) and/or thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC).
Nucleotides typically contain a monophosphate, diphosphate or triphosphate. The phosphate may be attached on the 5' or 3' side of the nucleotide.
Suitable nucleotides include, but are not limited to, adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP) and deoxycytidine monophosphate (dCMP). The nucleotide is preferably selected from the group consisting of AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP, dCMP and dUMP. The nucleotide is most preferably selected from dAMP, dTMP, dGMP, dCMP and dUMP. The polynucleotide preferably comprises the following nucleotides: dAMP, dUMP and/or dTMP and dCMP.
The nucleotides in the polynucleotide may be linked to each other in any manner. Nucleotides are typically linked by their sugars and phosphate groups, as in nucleic acids. The nucleotides may be linked by their nucleobases, such as in a pyrimidine dimer.
The polynucleotide may be a nucleic acid. The polynucleotide may be any synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA), or other synthetic polymers having nucleotide side chains. The PNA backbone consists of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The GNA backbone is composed of repeating glycol units linked by phosphodiester bonds. The TNA backbone consists of repeating threose sugars linked together by phosphodiester bonds. LNA is formed from ribonucleotides with an additional bridge linking the 2' oxygen and the 4' carbon in the ribose moiety.
The polynucleotide is most preferably ribonucleic acid (RNA) or deoxyribonucleic acid (DNA).
The polynucleotide may be of any length. For example, a polynucleotide can be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400, or at least 500 nucleotides in length. The polynucleotide may be 1000 or more nucleotides, 5000 or more nucleotides in length or 100000 or more nucleotides in length.
Helicases may move along all or only part of the target polynucleotide in the methods of the invention. All or a portion of the target polynucleotide can be characterized using the methods of the invention.
The target polynucleotide may be single stranded. At least a portion of the target polynucleotide is preferably double stranded. Helicases are typically bound to single stranded polynucleotides. If at least a portion of the target polynucleotide is double-stranded, the target polynucleotide preferably comprises a single-stranded region or a non-hybridizing region. The one or more helicases are capable of binding to one strand of the single-stranded region or the non-hybridizing region. The target polynucleotide preferably comprises one or more single stranded regions or one or more non-hybridising regions.
Sample(s)
The target polynucleotide may be present in any suitable sample. The invention is generally practiced on samples known to contain or suspected of containing the target polynucleotide. Alternatively, the invention may be carried out on a sample to confirm the identity of one or more target polynucleotides whose presence in the sample is known or expected.
The sample may be a biological sample. The invention may be practiced in vitro on samples obtained or extracted from any organism or microorganism. The organism or microorganism is typically archaeal, prokaryotic or eukaryotic, and typically belongs to one of the five kingdoms: plants, animals, fungi, prokaryotes and protists. The invention may also be carried out in vitro on samples obtained or extracted from any virus. The sample is preferably a liquid sample. The sample typically comprises a body fluid of a patient. The sample may be urine, lymph, saliva, mucus or amniotic fluid, but is preferably blood, plasma or serum. Typically, the sample is of human origin, but it may alternatively be from another mammal, such as a commercially farmed animal (for example a horse, cow, sheep or pig) or a pet (for example a cat or dog). Alternatively, a sample of plant origin is typically obtained from a commercial crop, such as a cereal, legume, fruit or vegetable, e.g. wheat, quinoa, barley, oats, canola, corn, soybean, rice, bananas, apples, tomatoes, potatoes, grapes, tobacco, beans, lentils, sugar cane, cocoa or cotton.
The sample may be a non-biological sample. The non-biological sample is preferably a liquid sample. Examples of non-biological samples include surgical fluids, water such as drinking water, seawater or river water, and reagents for laboratory testing.
The sample is typically processed prior to testing, for example by centrifugation or by membrane filtration to remove unwanted molecules or cells, such as red blood cells. The sample may be tested immediately after it is obtained. The sample may also be stored prior to analysis, preferably below -70 °C.
Click chemistry
The polynucleotides of the present application may be covalently linked. For example, copper-free click chemistry or copper-catalysed click chemistry may be used. Click chemistry is used in these applications because of its desirable properties and its scope for creating covalent linkages between diverse building blocks. For example, it is fast, clean and non-toxic, generating only harmless by-products. Click chemistry is a term first introduced by Kolb et al. in 2001 to describe an expanding set of powerful, selective and modular building blocks that work reliably in both small-scale and large-scale applications (Kolb HC, Finn MG, Sharpless KB, Click chemistry: diverse chemical function from a few good reactions, Angew. Chem. Int. Ed. 40 (2001) 2004-2021). They defined a set of stringent criteria for click chemistry as follows: "The reaction must be modular, wide in scope, give very high yields, generate only inoffensive byproducts that can be removed by nonchromatographic methods, and be stereospecific (but not necessarily enantioselective). The required process characteristics include simple reaction conditions (ideally, the process should be insensitive to oxygen and water), readily available starting materials and reagents, the use of no solvent or a solvent that is benign (such as water) or easily removed, and simple product isolation. Purification, if required, must be by nonchromatographic methods, such as crystallization or distillation, and the product must be stable under physiological conditions."
Suitable examples of click chemistry include, but are not limited to, the following:
(a) copper-free variants of the 1,3-dipolar cycloaddition reaction, in which an azide reacts with an alkyne under strain, for example in a cyclooctyne ring;
(b) reaction of an oxygen nucleophile on one linker with an epoxide or aziridine reactive moiety on the other linker; and
(c) Staudinger ligation, in which the alkyne moiety is replaced by an aryl phosphine, resulting in a specific reaction with the azide to give an amide bond.
Preferably, the click chemistry reaction is a Cu(I)-catalysed 1,3-dipolar cycloaddition reaction between an alkyne and an azide. In a preferred embodiment, the first group is an azide group and the second group is an alkyne group. Nucleic acid bases have been synthesized with azide and alkyne groups incorporated at preferred positions (e.g., Kocalka P, El-Sagheer AH, Brown T, Rapid and efficient DNA strand cross-linking by click chemistry, Chembiochem. 2008. 9(8):1280-5). Alkyne groups are commercially available from Berry Associates (Michigan, USA) and azide groups are synthesized by ATDBio or IDT Bio.
In a particular embodiment of the present application, the reactive groups are preferably azide and alkyne groups, such as azide (N3) and DBCO.
Kit
In a further aspect, the invention also provides a kit for characterising a polynucleotide, the kit comprising the adaptor or the complex.
The kit comprises (a) one or more adaptors and (b) one or more helicases. The kit may include any of the helicases and pores discussed above.
The kit may also include components of a membrane, such as the phospholipids required to form a layer of amphiphilic molecules, for example a lipid bilayer.
The kit of the invention may additionally comprise one or more other reagents or instruments enabling any of the embodiments mentioned above to be carried out. Such reagents or instruments include one or more of the following: suitable buffers (aqueous solutions), means for obtaining a sample from a subject (e.g. a vessel or an instrument comprising a needle), means for amplifying and/or expressing a polynucleotide, a membrane as defined above, or voltage clamp or patch clamp apparatus. The reagents may be present in the kit in a dry state, such that a fluid sample resuspends the reagents. The kit may also, optionally, include instructions for using the kit in the methods of the invention, or details of the patients for whom the methods may be used. The kit optionally includes components necessary to facilitate helicase movement (e.g., ATP and Mg2+).
Example 1: preparation and sequencing of the Y adaptor-enzyme complexes of the invention
SEQ ID NO: 1: GCGGAGTCAAACGGTAGAAGTCGTTTTTTTTTT
SEQ ID NO: 2: ACTGCTCATTCGGTCCTGCTGACT
SEQ ID NO: 3: CGACTTCTACCGTTTGACTCCGC
SEQ ID NO: 4: GTCAGCAGGACCGAATGAGCAGT
SEQ ID NO: 5: AGTCCAGCACCGACC, wherein SEQ ID NO: 5 consists of PNA.
The complex is formed by hybridizing 4 different strands together.
The first strand (Y1) comprises, in order, a leader sequence consisting of iSpC3 blocking groups (each denoted 3) linked to the 5' end of SEQ ID NO: 1; the 3' end of SEQ ID NO: 1 is linked to an iSp18 blocking strand (denoted 8888), which is in turn linked to the 5' end of SEQ ID NO: 2.
The second strand (Y2) is as shown in SEQ ID NO: 3.
The third strand (YB-Up) is SEQ ID NO: 4, whose 5' end carries a phosphate (P) for ligation to the polynucleotide to be characterized and whose 3' end carries the click chemistry group DBCO.
The fourth strand (YB-Dn) is N-Lys(azide)-OO-AGTCCAGCACCGACC-RR-C, where O is the O-linker (also known as AEEA or eg1) and R is Lys, both of which are included to increase solubility.
Y1: 5'-333333333333333333333333333333GCGGAGTCAAACGGTAGAAGTCGTTTTTTTTTT-8888-ACTGCTCATTCGGTCCTGCTGACT-3'
Y2:5'-CGACTTCTACCGTTTGACTCCGC-3’
YB-Up:5'-P-GTCAGCAGGACCGAATGAGCAGT-DBCO-3’
YB-Dn:N-Lys(azide)-OO-AGTCCAGCACCGACC-RR-C
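The base pairing implied by the strand designs above can be checked with a short script. The following is only an illustrative sketch: the helper names are hypothetical, and it assumes simple Watson-Crick pairing between the DNA portions of SEQ ID NO: 1 to 4 (the iSpC3, iSp18, DBCO and PNA parts are ignored). Under that assumption, SEQ ID NO: 3 is the reverse complement of the first 23 nucleotides of SEQ ID NO: 1, and SEQ ID NO: 4 is the reverse complement of the first 23 nucleotides of SEQ ID NO: 2, which leaves the 3'-terminal T of SEQ ID NO: 2 unpaired.

```python
# Illustrative check only; function and variable names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


SEQ1 = "GCGGAGTCAAACGGTAGAAGTCGTTTTTTTTTT"  # SEQ ID NO: 1 (part of Y1)
SEQ2 = "ACTGCTCATTCGGTCCTGCTGACT"           # SEQ ID NO: 2 (part of Y1)
SEQ3 = "CGACTTCTACCGTTTGACTCCGC"            # SEQ ID NO: 3 (Y2)
SEQ4 = "GTCAGCAGGACCGAATGAGCAGT"            # SEQ ID NO: 4 (YB-Up)


def paired_prefix_length(top: str, bottom: str) -> int:
    """Return len(bottom) if `bottom` is the reverse complement of the
    5' prefix of `top` of the same length, otherwise 0."""
    n = len(bottom)
    return n if reverse_complement(top[:n]) == bottom else 0


y2_pairs = paired_prefix_length(SEQ1, SEQ3)
yb_up_pairs = paired_prefix_length(SEQ2, SEQ4)
unpaired_tail_seq1 = SEQ1[y2_pairs:]     # bases of SEQ ID NO: 1 not paired with Y2
unpaired_tail_seq2 = SEQ2[yb_up_pairs:]  # bases of SEQ ID NO: 2 not paired with YB-Up

print(y2_pairs, yb_up_pairs, unpaired_tail_seq1, unpaired_tail_seq2)
```

The same helper could be reused for any other strand pair, provided both sequences are written 5' to 3'.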
YB-Up and YB-Dn were mixed thoroughly in equimolar amounts (the concentration may be from 10 μM to 100 μM); the mixture was reacted at 50 °C for 4 hours, and the ligation product was then separated and purified on a urea-PAGE gel to give the YB-PNA strand.
Preparation of modified Y-adaptor complexes: the three synthetic single strands Y1, Y2 and YB-PNA were annealed at a molar ratio of 1:1.1:1.1 (slow cooling from 95 °C to 25 °C at a rate of no more than 0.1 °C/s). The final annealing system contained 160 mM HEPES (pH 7.0) and 200 mM NaCl, with a final Y1 concentration of 4-8 μM, finally forming the Y-adaptor. The Y-adaptor (500 nM) was mixed with a 6-fold molar amount of T4 Dda-M1G/E94C/C109A/C136A/A360C (3 μM) (SEQ ID NO: 6) in buffer (100 mM NaAc (pH 7); 1.5 mM TMAD) and incubated for 30 min at room temperature to give sample 1.
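The molar ratios in this preparation translate directly into working concentrations. The following minimal sketch, with hypothetical function names, assumes only the figures stated above (a 1:1.1:1.1 annealing ratio, 4-8 μM Y1, and a 6-fold molar amount of enzyme over 500 nM Y-adaptor) and uses exact fractions to avoid rounding artefacts.

```python
from fractions import Fraction

# Annealing ratio Y1 : Y2 : YB-PNA = 1 : 1.1 : 1.1, as stated in the example.
ANNEAL_RATIO = {
    "Y1": Fraction(1),
    "Y2": Fraction(11, 10),
    "YB-PNA": Fraction(11, 10),
}


def strand_concentrations_uM(y1_uM):
    """Concentration of each strand (in uM) for a given Y1 concentration."""
    y1 = Fraction(y1_uM)
    return {name: y1 * ratio for name, ratio in ANNEAL_RATIO.items()}


def enzyme_concentration_nM(adaptor_nM, fold_excess=6):
    """Enzyme concentration (in nM) for a given molar excess over the adaptor."""
    return Fraction(adaptor_nM) * fold_excess


low = strand_concentrations_uM(4)    # lower end of the stated Y1 range
high = strand_concentrations_uM(8)   # upper end of the stated Y1 range
enzyme_nM = enzyme_concentration_nM(500)

print(float(low["Y2"]), float(high["Y2"]), float(enzyme_nM) / 1000)
```

With these inputs the 6-fold excess over 500 nM adaptor works out to 3000 nM, which agrees with the 3 μM enzyme concentration given in the example.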
SEQ ID NO:6
GTFDDLTEGQKNAFNIVMKAIKEKKHHVTINGPAGTGKTTLTKFIIEALISTGETGIILAAPTHAAKKILSKLSGKEASTIHSILKINPVTYECNVLFEQKEVPDLAKARVLICDEVSMYDRKLFKILLSTIPPWATIIGIGDNKQIRPVDPGENTAYISPFFTHKDFYQCELTEVKRSNAPIIDVATDVRNGKWIYDKVVDGHGVRGFTGDTALRDFMVNYFSIVKSLDDLFENRVMAFTNKSVDKLNSIIRKKIFETDKDFIVGEIIVMQEPLFKTYKIDGKPVSEIIFNNGQLVRIIEAEYTSTFVKARGVPGEYLIRHWDLTVETYGDDEYYREKIKIISSDEELYKFNLFLGKTCETYKNWNKGGKAPWSDFWDAKSQFSKVKALPASTFHKAQGMSVDRAFIYTPCIHYADVELAQQLLYVGVTRGRYDVFYV
FIG. 2 is a schematic representation of the complex. The Y1 strand comprises a blocking strand S, a polynucleotide strand D' connected to the blocking strand S, and a motor protein stalled at the blocking strand S; the region in which the Y2 strand is complementary to the Y1 strand forms the double-stranded part of polynucleotide strand L; the YB-Dn strand comprises the modification moiety; and the region in which the YB-Up strand is complementary to the Y1 strand forms the double-stranded polynucleotide D. The YB-Dn strand and the YB-Up strand are connected by click chemistry.
Sample 1 was then purified on a DNAPac PA200 column using the following elution buffers (buffer A: 20 mM Na-CHES, 250 mM NaCl, 4% (w/v) glycerol, pH 8.6; buffer B: 20 mM Na-CHES, 1 M NaCl, 4% (w/v) glycerol, pH 8.6). Sample 1 was loaded onto the column, and enzyme not bound to DNA was eluted from the column with buffer A. The enzyme-bound Y-adaptor complex was then eluted with a gradient of 0-100% buffer B over 10 column volumes. The main elution peak was then collected and its concentration measured, giving the adaptor complex of the invention.
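The salt concentration during this elution can be estimated from the buffer compositions. The sketch below assumes a linear gradient from 0% to 100% buffer B over the 10 column volumes; the source gives the range and volume but does not state the gradient shape, so linearity and the function names are assumptions made for illustration.

```python
def percent_b_at(column_volumes, gradient_cv=10.0):
    """Percent buffer B after a given number of column volumes,
    assuming a linear 0-100% gradient over `gradient_cv` column volumes."""
    cv = min(max(column_volumes, 0.0), gradient_cv)
    return 100.0 * cv / gradient_cv


def nacl_mM(percent_b, a_mM=250.0, b_mM=1000.0):
    """NaCl concentration of a buffer A / buffer B mixture.
    Buffer A contains 250 mM NaCl and buffer B contains 1 M NaCl."""
    frac_b = percent_b / 100.0
    return a_mM * (1.0 - frac_b) + b_mM * frac_b


for cv in (0, 2.5, 5, 7.5, 10):
    print(cv, nacl_mM(percent_b_at(cv)))
```

Because both buffers contain the same Na-CHES and glycerol concentrations, only the NaCl concentration changes along the gradient.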
A 2.7 kb polynucleotide library (i.e., the analyte to be detected) was ligated at both ends to the adaptor complex using T4 ligase and then sequenced on a QNome-9604 nanopore sequencer (Qitan Technology). Sequencing buffer (final concentrations): 10 mM HEPES, 100 mM MgCl2, 375 mM KCl, 100 mM ATP, pH 7.1. Sequencing temperature: 30-40 °C. The sequencing results are shown in FIG. 4. During sequencing, the 5' end of the library strand enters the pore, and the helicase translocates in the 5'-3' direction as sequencing proceeds. Passage of the analyte through the nanopore causes changes in current. When the 3' end (namely the YB-Dn strand region) is reached, no change in current occurs as this region passes through the pore, because the region is PNA and PNA is uncharged. The system therefore determines that the pore is blocked and applies a reverse voltage to clear the pore, so that the sequenced single-stranded library molecule is ejected from the trans side back to the cis side.
Example 2: Preparation and sequencing of another adaptor-enzyme complex of the invention, the type E adaptor complex
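The blockage-and-ejection behaviour described in this example can be pictured as a simple control rule. The sketch below is purely hypothetical: it assumes that "no change in current" is recognised as a window of consecutive samples whose spread stays within a tolerance, and every name, threshold and voltage value is an illustrative assumption rather than a parameter of the sequencer used here.

```python
from typing import Optional, Sequence


def find_blockage(current_pA: Sequence[float], window: int,
                  tolerance_pA: float) -> Optional[int]:
    """Return the index of the last sample of the first window of `window`
    consecutive samples whose max-min spread is within `tolerance_pA`,
    or None if no such flat window exists."""
    for end in range(window, len(current_pA) + 1):
        chunk = current_pA[end - window:end]
        if max(chunk) - min(chunk) <= tolerance_pA:
            return end - 1
    return None


def next_voltage_mV(current_pA: Sequence[float], forward_mV: float = 180.0,
                    reverse_mV: float = -180.0, window: int = 5,
                    tolerance_pA: float = 0.5) -> float:
    """Choose the reverse voltage once a flat (blocked) window is detected,
    otherwise keep the forward voltage. All values are hypothetical."""
    blocked = find_blockage(current_pA, window, tolerance_pA) is not None
    return reverse_mV if blocked else forward_mV


# Made-up trace: fluctuating current during sequencing, then a flat plateau.
trace = [80.0, 65.0, 90.0, 70.0, 85.0, 60.0, 40.0, 40.1, 40.0, 40.2, 40.1]
print(find_blockage(trace, window=5, tolerance_pA=0.5), next_voltage_mV(trace))
```

A real instrument would likely use a more robust criterion; the point of the sketch is only that a sustained absence of current change at the modified terminus is what triggers the reverse voltage.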
SEQ ID NO: 7: GGTAGTCAGCAGGACCGAATGAGCAGTTT
SEQ ID NO: 8: ACTGCTCATTCGGTCCTGCTGAC
Type E adaptor sequences
EA-1:5’-P-GGTAGTCAGCAGGACCGAATGAGCAGTTT-biotin-3’
EA-2:5’-ACTGCTCATTCGGTCCTGCTGAC-3’
The 5' end of EA-1 carries a phosphate (P) for ligation to the polynucleotide to be characterized, and its 3' end is labelled with biotin.
Preparation of modified type E adaptor complexes: the two synthetic single strands EA-1 and EA-2 were annealed at a molar ratio of 1:1.1 (slow cooling from 95 °C to 25 °C at a rate of no more than 0.1 °C/s). The final annealing system contained 160 mM HEPES (pH 7.0) and 200 mM NaCl, with a final product concentration of 4-8 μM. Streptavidin was then added at a molar ratio of 1:1 and the mixture incubated at 30 °C for 10 min, so that the streptavidin bound to the biotin at the end of EA-1, giving the type E adaptor. FIG. 3 is a schematic representation of the complex. A biotin-streptavidin complex is thereby introduced at the sequencing terminus of the target polynucleotide; when sequencing reaches this terminus, the electric field force is insufficient to pull the biotin-streptavidin complex apart, so the pore is blocked and the sequenced strand is then ejected.
An asymmetric 2.7 kb polynucleotide (i.e., analyte to be detected) library was prepared by single-enzyme digestion. The Y-adaptor complex prepared in example 1 and the type E adaptor prepared in this example were added to the library for ligation and purification, followed by sequencing on a QNome-9604 nanopore sequencer (Qitan Technology). Sequencing buffer (final concentrations): 10 mM HEPES, 100 mM MgCl2, 375 mM KCl, 100 mM ATP, pH 7.1. Sequencing temperature: 30-40 °C. The sequencing results are shown in FIG. 5. At the start of sequencing, guided by the complex, the 5' end of the analyte enters the pore, and the helicase translocates in the 5'-3' direction as sequencing proceeds. When the 3' end is reached, the biotin-streptavidin at the 3' end cannot be driven through the nanopore by the electric field force, so the pore becomes blocked. The system then applies a reverse voltage to clear the pore, and the sequenced single-stranded library molecule is ejected from the trans side back to the cis side.
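The pairing between EA-1 and EA-2 can be located in the same way as for any two strands written 5' to 3'. The following sketch, with hypothetical names, searches EA-1 for the reverse complement of EA-2 and reports the EA-1 bases on either side of the paired region; it assumes plain Watson-Crick pairing and ignores the phosphate, biotin and streptavidin.

```python
# Illustrative sketch; names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


EA1 = "GGTAGTCAGCAGGACCGAATGAGCAGTTT"  # SEQ ID NO: 7
EA2 = "ACTGCTCATTCGGTCCTGCTGAC"        # SEQ ID NO: 8

paired_site = reverse_complement(EA2)       # sequence on EA-1 that pairs with EA-2
start = EA1.find(paired_site)               # -1 would mean no full-length pairing
five_prime_flank = EA1[:start]              # EA-1 bases 5' of the paired region
three_prime_flank = EA1[start + len(paired_site):]  # EA-1 bases 3' of it

print(start, five_prime_flank, len(paired_site), three_prime_flank)
```

The flanking bases reported here are simply those of EA-1 that have no partner on EA-2 in this simplified model.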
Example 3: Preparation and sequencing of another Y adaptor of the invention
The procedure is as in example 1, except that SEQ ID NO: 5 is a polynucleotide whose 5'-end phosphate is modified with spermine.
The complex is formed by hybridizing 3 different strands together.
The first strand (Y-Top-1-NS) comprises, in order, a leader sequence consisting of iSpC3 blocking groups (each denoted 3) linked to the 5' end of SEQ ID NO: 1; the 3' end of SEQ ID NO: 1 is linked to an iSp18 blocking strand (denoted 8888), which is in turn linked to the 5' end of SEQ ID NO: 2.
The second strand (Y-Top-2-NS) is as shown in SEQ ID NO: 3.
The third strand (Y-Bottom-S) comprises SEQ ID NO: 4, whose 5' end carries a phosphate (P) for ligation to the polynucleotide to be characterized and whose 3' end is linked to the 5' end of SEQ ID NO: 5, the SEQ ID NO: 5 portion being modified with a cationic oligomer, namely spermine.
Y-Top-1-NS: 5'-333333333333333333333333333333GCGGAGTCAAACGGTAGAAGTCGTTTTTTTTTT-8888-ACTGCTCATTCGGTCCTGCTGACT-3'
Y-Top-2-NS:5'-CGACTTCTACCGTTTGACTCCGC-3’
Y-Bottom-S: 5'-P-GTCAGCAGGACCGAATGAGCAGTSSSAGTCCAGCACCGACC-3' (S stands for the cationic oligomer spermine)
The sequencing results are shown in FIG. 6. During sequencing, the 5' end of the library strand enters the pore, and the helicase translocates in the 5'-3' direction as sequencing proceeds. Passage of the analyte through the nanopore causes changes in current. When the 3' end (namely the spermine-modified region) is reached, no change in current occurs as this region passes through the pore, because the region is positively charged. The system therefore determines that the pore is blocked and applies a reverse voltage to clear the pore, so that the sequenced single-stranded library molecule is ejected from the trans side back to the cis side.
In addition, the term "and/or" herein merely describes an association between the associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" herein generally indicates an "or" relationship between the objects before and after it.
It should be understood that in the embodiments of the present invention, "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood that determining B from A does not mean that B is determined from A alone; B may also be determined from A and/or other information.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Sequence listing
<110> Chengdu carbon technology Co., Ltd
<120> method for characterizing a polynucleotide of interest and adaptors
<160> 8
<170> SIPOSequenceListing 1.0
<210> 1
<211> 33
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
gcggagtcaa acggtagaag tcgttttttt ttt 33
<210> 2
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
actgctcatt cggtcctgct gact 24
<210> 3
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
cgacttctac cgtttgactc cgc 23
<210> 4
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
gtcagcagga ccgaatgagc agt 23
<210> 5
<211> 15
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
agtccagcac cgacc 15
<210> 6
<211> 439
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 6
Gly Thr Phe Asp Asp Leu Thr Glu Gly Gln Lys Asn Ala Phe Asn Ile
1 5 10 15
Val Met Lys Ala Ile Lys Glu Lys Lys His His Val Thr Ile Asn Gly
20 25 30
Pro Ala Gly Thr Gly Lys Thr Thr Leu Thr Lys Phe Ile Ile Glu Ala
35 40 45
Leu Ile Ser Thr Gly Glu Thr Gly Ile Ile Leu Ala Ala Pro Thr His
50 55 60
Ala Ala Lys Lys Ile Leu Ser Lys Leu Ser Gly Lys Glu Ala Ser Thr
65 70 75 80
Ile His Ser Ile Leu Lys Ile Asn Pro Val Thr Tyr Glu Cys Asn Val
85 90 95
Leu Phe Glu Gln Lys Glu Val Pro Asp Leu Ala Lys Ala Arg Val Leu
100 105 110
Ile Cys Asp Glu Val Ser Met Tyr Asp Arg Lys Leu Phe Lys Ile Leu
115 120 125
Leu Ser Thr Ile Pro Pro Trp Ala Thr Ile Ile Gly Ile Gly Asp Asn
130 135 140
Lys Gln Ile Arg Pro Val Asp Pro Gly Glu Asn Thr Ala Tyr Ile Ser
145 150 155 160
Pro Phe Phe Thr His Lys Asp Phe Tyr Gln Cys Glu Leu Thr Glu Val
165 170 175
Lys Arg Ser Asn Ala Pro Ile Ile Asp Val Ala Thr Asp Val Arg Asn
180 185 190
Gly Lys Trp Ile Tyr Asp Lys Val Val Asp Gly His Gly Val Arg Gly
195 200 205
Phe Thr Gly Asp Thr Ala Leu Arg Asp Phe Met Val Asn Tyr Phe Ser
210 215 220
Ile Val Lys Ser Leu Asp Asp Leu Phe Glu Asn Arg Val Met Ala Phe
225 230 235 240
Thr Asn Lys Ser Val Asp Lys Leu Asn Ser Ile Ile Arg Lys Lys Ile
245 250 255
Phe Glu Thr Asp Lys Asp Phe Ile Val Gly Glu Ile Ile Val Met Gln
260 265 270
Glu Pro Leu Phe Lys Thr Tyr Lys Ile Asp Gly Lys Pro Val Ser Glu
275 280 285
Ile Ile Phe Asn Asn Gly Gln Leu Val Arg Ile Ile Glu Ala Glu Tyr
290 295 300
Thr Ser Thr Phe Val Lys Ala Arg Gly Val Pro Gly Glu Tyr Leu Ile
305 310 315 320
Arg His Trp Asp Leu Thr Val Glu Thr Tyr Gly Asp Asp Glu Tyr Tyr
325 330 335
Arg Glu Lys Ile Lys Ile Ile Ser Ser Asp Glu Glu Leu Tyr Lys Phe
340 345 350
Asn Leu Phe Leu Gly Lys Thr Cys Glu Thr Tyr Lys Asn Trp Asn Lys
355 360 365
Gly Gly Lys Ala Pro Trp Ser Asp Phe Trp Asp Ala Lys Ser Gln Phe
370 375 380
Ser Lys Val Lys Ala Leu Pro Ala Ser Thr Phe His Lys Ala Gln Gly
385 390 395 400
Met Ser Val Asp Arg Ala Phe Ile Tyr Thr Pro Cys Ile His Tyr Ala
405 410 415
Asp Val Glu Leu Ala Gln Gln Leu Leu Tyr Val Gly Val Thr Arg Gly
420 425 430
Arg Tyr Asp Val Phe Tyr Val
435
<210> 7
<211> 29
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
ggtagtcagc aggaccgaat gagcagttt 29
<210> 8
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
actgctcatt cggtcctgct gac 23
Claims (14)
1. A method of characterizing a target polynucleotide, comprising:
(a) moving the target polynucleotide through the nanopore,
wherein the sequencing terminus of the target polynucleotide comprises a modification that occludes the nanopore;
(b) taking one or more electrical and/or optical measurements as the target polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the target polynucleotide, and thereby characterising the target polynucleotide.
2. The method of claim 1, wherein,
the nanopore comprises a solid state nanopore and/or a biological nanopore; the biological nanopore comprises a transmembrane pore; the transmembrane pore comprises a transmembrane protein pore.
3. The method of claim 1 or 2, further comprising:
(c) moving the target polynucleotide in a reverse direction relative to the nanopore back to the starting side of the nanopore.
4. The method of claim 3, wherein step (c) does not include:
(ii) taking one or more electrical and/or optical measurements as the target polynucleotide moves relative to the pore, wherein the measurements are representative of one or more characteristics of the target polynucleotide, and thereby characterising the target polynucleotide; and/or
In step (c), said moving of said target polynucleotide in the reverse direction relative to said nanopore is achieved by means comprising at least one of: applying a reverse voltage, pulling by an atomic force microscope, and/or pulling by a reverse enzyme that moves the target polynucleotide in the reverse direction relative to the nanopore.
5. The method of any one of claims 1 to 4, wherein step (a) comprises:
attaching an adaptor comprising the modification moiety to the polynucleotide of interest such that the sequencing terminus of the polynucleotide of interest comprises the modification moiety; and/or
The modification moiety comprises a modification moiety having no charge on the side chain or a positive charge on the side chain; and/or
The modification moiety having no charge on the side chain comprises any one of, or a combination of two or more of, PNA, a polypeptide and a nucleotide with an alkylated phosphate backbone; and/or
The modification moiety having a positive charge on the side chain comprises a nucleotide whose phosphate backbone is modified with a cationic oligomer; and/or
The cationic oligomer comprises any one of, or a combination of two or more of, spermine, spermidine and putrescine; and/or
The modification moiety comprises a ligand and a binding partner that bind to each other, the ligand and binding partner comprising streptavidin and biotin, and/or
Antigens and antibodies.
6. The method of claim 5, wherein the adaptor is a Y-adaptor comprising a double-stranded region and at least one single-stranded region, or
An adaptor of type E comprising a double-stranded region and no single-stranded region.
7. The method of claim 6, wherein the adaptor is a Y-adaptor, the modification moiety of the Y-adaptor being located in or forming the overhang portion of the Y-adaptor;
and/or, the modification moiety is covalently attached to the Y-adaptor, or the modification moiety is attached to the Y-adaptor by a click chemistry reaction.
8. A method according to claim 6 wherein the adaptor is an E-type adaptor and the modification moiety is located at the end which is not attached to the polynucleotide of interest.
9. The method of any one of claims 1 to 8, wherein the adaptor comprises a blocking strand having a different structure from the polynucleotide for blocking a motor protein.
10. An adaptor for characterising a target polynucleotide, characterised in that the adaptor comprises a modification moiety for binding to a sequencing terminus of the target polynucleotide, the modification moiety being capable of causing nanopore blockage.
11. A construct for characterising a target polynucleotide, the construct comprising an adaptor and a target polynucleotide, wherein:
the adaptor comprises a modification moiety that binds to a sequencing terminus of the target polynucleotide, the modification moiety being capable of causing nanopore blockage; and
the target polynucleotide is a double-stranded polynucleotide.
12. A complex for characterising a target polynucleotide, the complex comprising a motor protein and either an adaptor or a construct, wherein:
the construct comprises the adaptor and a target polynucleotide;
the adaptor comprises a modification moiety that binds to a sequencing terminus of the target polynucleotide, the modification moiety being capable of causing nanopore blockage; and
the motor protein is a protein capable of binding to the target polynucleotide and controlling its movement through the pore.
13. The complex of claim 12, wherein the motor protein is selected from one or more of a polymerase, an exonuclease, a helicase and a topoisomerase; and
the helicase is selected from one or more of Hel308 helicase, RecD helicase, TraI helicase, TrwC helicase, XPD helicase and Dda helicase.
14. A kit for nanopore characterisation of a target polynucleotide, the kit comprising:
1) an adaptor comprising a modification moiety for binding to a sequencing terminus of the target polynucleotide, the modification moiety being capable of causing nanopore blockage; and
2) a motor protein.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210482806.3A CN114921533A (en) | 2022-05-05 | 2022-05-05 | Methods and adaptors for characterising a target polynucleotide |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210482806.3A CN114921533A (en) | 2022-05-05 | 2022-05-05 | Methods and adaptors for characterising a target polynucleotide |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114921533A (en) | 2022-08-19 |
Family
ID=82806951
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210482806.3A Pending CN114921533A (en) | 2022-05-05 | 2022-05-05 | Methods and adaptors for characterising a target polynucleotide |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114921533A (en) |
2022-05-05: CN application CN202210482806.3A (publication CN114921533A), status: Pending
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US12168799B2 (en) | Hairpin loop method for double strand polynucleotide sequencing using transmembrane pores | |
CN110088299B (en) | Nucleic acid detection method guided by nanopores | |
EP3126516B1 (en) | Method of target molecule characterisation using a molecular pore | |
EP3464624B1 (en) | Method of nanopore sequencing of concatenated nucleic acids | |
EP2895618B1 (en) | Sample preparation method | |
EP3097204B1 (en) | Method for controlling the movement of a polynucleotide through a transmembrane pore | |
EP3126515B1 (en) | Method for characterising a double stranded nucleic acid using a nano-pore and anchor molecules at both ends of said nucleic acid | |
EP4265723A2 (en) | Reagents and methods for molecular barcoding of nucleic acids of single cells | |
CN108350499A (en) | Can transformation marker composition, method and combine its process | |
AU2013220156A1 (en) | Aptamer method | |
CN114854826A (en) | Sequences, linkers comprising sequences and uses thereof | |
CN114262735A (en) | Adaptors for characterising polynucleotides and uses thereof | |
US20220403368A1 (en) | Methods and systems for preparing a nucleic acid construct for single molecule characterisation | |
CN115698331A (en) | Method for selectively characterizing polynucleotides using a detector | |
EP3877544B1 (en) | Liquid sample workflow for nanopore sequencing | |
EP3735471B1 (en) | Method for selecting polynucleotides based on enzyme interaction duration | |
CN114921533A (en) | Methods and adaptors for characterising a target polynucleotide | |
CN112041461A (en) | Methods for attaching adaptors to single-stranded regions of double-stranded polynucleotides | |
US20220025430A1 (en) | Sequence based imaging | |
WO2023194713A1 (en) | Method | |
JP2025511395A (en) | method | |
WO2023179829A1 (en) | Targeted enrichment of large dna molecules for long-read sequencing using facs or microfluidic partitioning | |
CN119753110A (en) | Adapter for characterizing analytes, characterization method and use thereof | |
WO2023116575A1 (en) | Adapter for characterizing target polynucleotide, method, and use thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40076125; Country of ref document: HK |