CN115369098A - A novel CRISPR-associated transposase - Google Patents
A novel CRISPR-associated transposase
- Publication number
- CN115369098A (application CN202110532731.0A)
- Authority
- CN
- China
- Prior art keywords
- homology
- seq
- plasmid
- leu
- gene
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 108091033409 CRISPR Proteins 0.000 title claims abstract description 92
- 238000010354 CRISPR gene editing Methods 0.000 title claims abstract description 91
- 108010020764 Transposases Proteins 0.000 title claims abstract description 68
- 102000008579 Transposases Human genes 0.000 title claims abstract description 67
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 158
- 241000519590 Pseudoalteromonas Species 0.000 claims abstract description 29
- 241000607626 Vibrio cholerae Species 0.000 claims abstract description 25
- 229940118696 vibrio cholerae Drugs 0.000 claims abstract description 25
- 239000013612 plasmid Substances 0.000 claims description 156
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 52
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 52
- 229920001184 polypeptide Polymers 0.000 claims description 51
- 230000008685 targeting Effects 0.000 claims description 34
- 239000002773 nucleotide Substances 0.000 claims description 31
- 125000003729 nucleotide group Chemical group 0.000 claims description 31
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 claims description 24
- 241000894006 Bacteria Species 0.000 claims description 21
- 101100537561 Escherichia coli tnsA gene Proteins 0.000 claims description 21
- 239000012634 fragment Substances 0.000 claims description 21
- 101100260929 Escherichia coli tnsC gene Proteins 0.000 claims description 19
- 101100260928 Escherichia coli tnsB gene Proteins 0.000 claims description 17
- 229930027917 kanamycin Natural products 0.000 claims description 15
- 229960000318 kanamycin Drugs 0.000 claims description 15
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 claims description 15
- 229930182823 kanamycin A Natural products 0.000 claims description 15
- 108091033319 polynucleotide Proteins 0.000 claims description 14
- 102000040430 polynucleotide Human genes 0.000 claims description 14
- 239000002157 polynucleotide Substances 0.000 claims description 14
- 238000010362 genome editing Methods 0.000 claims description 12
- 229960005322 streptomycin Drugs 0.000 claims description 12
- 101710163270 Nuclease Proteins 0.000 claims description 7
- 125000006850 spacer group Chemical group 0.000 claims description 7
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 5
- 125000003275 alpha amino acid group Chemical group 0.000 claims 14
- 101100387128 Myxococcus xanthus (strain DK1622) devR gene Proteins 0.000 claims 3
- 101100273269 Thermus thermophilus (strain ATCC 27634 / DSM 579 / HB8) cse3 gene Proteins 0.000 claims 3
- 101150106467 cas6 gene Proteins 0.000 claims 3
- 101150044165 cas7 gene Proteins 0.000 claims 3
- 101100382541 Escherichia coli (strain K12) casD gene Proteins 0.000 claims 1
- 101100387131 Myxococcus xanthus (strain DK1622) devS gene Proteins 0.000 claims 1
- 101150049463 cas5 gene Proteins 0.000 claims 1
- tniQ Proteins 0.000 claims 1
- 241000588724 Escherichia coli Species 0.000 abstract description 19
- 230000002452 interceptive effect Effects 0.000 abstract description 5
- 229920002401 polyacrylamide Polymers 0.000 abstract description 4
- 241001025106 Pseudoalteromonas translucida KMM 520 Species 0.000 description 32
- 230000017105 transposition Effects 0.000 description 31
- 238000003780 insertion Methods 0.000 description 21
- 230000037431 insertion Effects 0.000 description 21
- 238000010276 construction Methods 0.000 description 18
- 150000001413 amino acids Chemical group 0.000 description 17
- 210000004027 cell Anatomy 0.000 description 16
- 239000007787 solid Substances 0.000 description 15
- 229930101283 tetracycline Natural products 0.000 description 15
- OFVLGDICTFRJMM-WESIUVDSSA-N tetracycline Chemical compound C1=CC=C2[C@](O)(C)[C@H]3C[C@H]4[C@H](N(C)C)C(O)=C(C(N)=O)C(=O)[C@@]4(O)C(O)=C3C(=O)C2=C1O OFVLGDICTFRJMM-WESIUVDSSA-N 0.000 description 15
- 239000005090 green fluorescent protein Substances 0.000 description 13
- 229960000723 ampicillin Drugs 0.000 description 12
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 12
- 101150066555 lacZ gene Proteins 0.000 description 12
- 108090000790 Enzymes Proteins 0.000 description 10
- 238000010586 diagram Methods 0.000 description 10
- 230000000694 effects Effects 0.000 description 10
- 238000001502 gel electrophoresis Methods 0.000 description 10
- 230000014509 gene expression Effects 0.000 description 10
- 102000004190 Enzymes Human genes 0.000 description 9
- 238000006243 chemical reaction Methods 0.000 description 9
- 239000007788 liquid Substances 0.000 description 9
- 230000009466 transformation Effects 0.000 description 9
- 229960005091 chloramphenicol Drugs 0.000 description 8
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 8
- 238000002474 experimental method Methods 0.000 description 8
- 230000006698 induction Effects 0.000 description 8
- 108010034529 leucyl-lysine Proteins 0.000 description 8
- 238000011160 research Methods 0.000 description 8
- 108020004705 Codon Proteins 0.000 description 7
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 7
- 238000001976 enzyme digestion Methods 0.000 description 7
- 101150113191 cmr gene Proteins 0.000 description 6
- 238000012269 metabolic engineering Methods 0.000 description 6
- 238000000034 method Methods 0.000 description 6
- 244000005700 microbiome Species 0.000 description 6
- 238000011144 upstream manufacturing Methods 0.000 description 6
- 241000186226 Corynebacterium glutamicum Species 0.000 description 5
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 5
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 5
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 5
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 5
- 241000589516 Pseudomonas Species 0.000 description 5
- 230000001580 bacterial effect Effects 0.000 description 5
- 230000015572 biosynthetic process Effects 0.000 description 5
- 230000037361 pathway Effects 0.000 description 5
- 108010048818 seryl-histidine Proteins 0.000 description 5
- 238000003786 synthesis reaction Methods 0.000 description 5
- 241001485655 Corynebacterium glutamicum ATCC 13032 Species 0.000 description 4
- 241000196324 Embryophyta Species 0.000 description 4
- 101150066002 GFP gene Proteins 0.000 description 4
- 108010065920 Insulin Lispro Proteins 0.000 description 4
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- 101710137500 T7 RNA polymerase Proteins 0.000 description 4
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 4
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 4
- 230000010354 integration Effects 0.000 description 4
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 4
- 238000005457 optimization Methods 0.000 description 4
- 108010090894 prolylleucine Proteins 0.000 description 4
- 102000004169 proteins and genes Human genes 0.000 description 4
- 238000012163 sequencing technique Methods 0.000 description 4
- 229960000268 spectinomycin Drugs 0.000 description 4
- UNFWWIHTNXNPBV-WXKVUWSESA-N spectinomycin Chemical compound O([C@@H]1[C@@H](NC)[C@@H](O)[C@H]([C@@H]([C@H]1O1)O)NC)[C@]2(O)[C@H]1O[C@H](C)CC2=O UNFWWIHTNXNPBV-WXKVUWSESA-N 0.000 description 4
- 108010051110 tyrosyl-lysine Proteins 0.000 description 4
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 3
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 3
- 241001198387 Escherichia coli BL21(DE3) Species 0.000 description 3
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 3
- 108020005004 Guide RNA Proteins 0.000 description 3
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 3
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 3
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 3
- 108010079364 N-glycylalanine Proteins 0.000 description 3
- 241000231663 Puffinus auricularis Species 0.000 description 3
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 3
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 3
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 3
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 3
- 241000607598 Vibrio Species 0.000 description 3
- 235000001014 amino acid Nutrition 0.000 description 3
- 108010077245 asparaginyl-proline Proteins 0.000 description 3
- 210000000349 chromosome Anatomy 0.000 description 3
- 238000012258 culturing Methods 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 3
- 108010050848 glycylleucine Proteins 0.000 description 3
- 108010025306 histidylleucine Proteins 0.000 description 3
- 230000001939 inductive effect Effects 0.000 description 3
- 108010057821 leucylproline Proteins 0.000 description 3
- 108010054155 lysyllysine Proteins 0.000 description 3
- 101150031079 manX gene Proteins 0.000 description 3
- 101150070589 nagB gene Proteins 0.000 description 3
- 101150027065 nagE gene Proteins 0.000 description 3
- 239000013642 negative control Substances 0.000 description 3
- 108010031719 prolyl-serine Proteins 0.000 description 3
- 235000018102 proteins Nutrition 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 229920001817 Agar Polymers 0.000 description 2
- BUDNAJYVCUHLSV-ZLUOBGJFSA-N Ala-Asp-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O BUDNAJYVCUHLSV-ZLUOBGJFSA-N 0.000 description 2
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 2
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 2
- NINQYGGNRIBFSC-CIUDSAMLSA-N Ala-Lys-Ser Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CO)C(O)=O NINQYGGNRIBFSC-CIUDSAMLSA-N 0.000 description 2
- BSYKSCBTTQKOJG-GUBZILKMSA-N Arg-Pro-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BSYKSCBTTQKOJG-GUBZILKMSA-N 0.000 description 2
- XXAOXVBAWLMTDR-ZLUOBGJFSA-N Asn-Cys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)N)N XXAOXVBAWLMTDR-ZLUOBGJFSA-N 0.000 description 2
- DMLSCRJBWUEALP-LAEOZQHASA-N Asn-Glu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O DMLSCRJBWUEALP-LAEOZQHASA-N 0.000 description 2
- NVWJMQNYLYWVNQ-BYULHYEWSA-N Asn-Ile-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O NVWJMQNYLYWVNQ-BYULHYEWSA-N 0.000 description 2
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 2
- XMHFCUKJRCQXGI-CIUDSAMLSA-N Asn-Pro-Gln Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O XMHFCUKJRCQXGI-CIUDSAMLSA-N 0.000 description 2
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 2
- WSWYMRLTJVKRCE-ZLUOBGJFSA-N Asp-Ala-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O WSWYMRLTJVKRCE-ZLUOBGJFSA-N 0.000 description 2
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 2
- KQBVNNAPIURMPD-PEFMBERDSA-N Asp-Ile-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KQBVNNAPIURMPD-PEFMBERDSA-N 0.000 description 2
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 2
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 2
- QNIACYURSSCLRP-GUBZILKMSA-N Asp-Lys-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O QNIACYURSSCLRP-GUBZILKMSA-N 0.000 description 2
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 2
- 241000192125 Firmicutes Species 0.000 description 2
- RBWKVOSARCFSQQ-FXQIFTODSA-N Gln-Gln-Ser Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O RBWKVOSARCFSQQ-FXQIFTODSA-N 0.000 description 2
- XJKAKYXMFHUIHT-AUTRQRHGSA-N Gln-Glu-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N XJKAKYXMFHUIHT-AUTRQRHGSA-N 0.000 description 2
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 2
- MLSKFHLRFVGNLL-WDCWCFNPSA-N Gln-Leu-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MLSKFHLRFVGNLL-WDCWCFNPSA-N 0.000 description 2
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 2
- UESYBOXFJWJVSB-AVGNSLFASA-N Gln-Phe-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O UESYBOXFJWJVSB-AVGNSLFASA-N 0.000 description 2
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 2
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 2
- PVBBEKPHARMPHX-DCAQKATOSA-N Glu-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O PVBBEKPHARMPHX-DCAQKATOSA-N 0.000 description 2
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 2
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 2
- VNCNWQPIQYAMAK-ACZMJKKPSA-N Glu-Ser-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O VNCNWQPIQYAMAK-ACZMJKKPSA-N 0.000 description 2
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 2
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 2
- NKVZTQVGUNLLQW-JBDRJPRFSA-N Ile-Ala-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)O)N NKVZTQVGUNLLQW-JBDRJPRFSA-N 0.000 description 2
- NCSIQAFSIPHVAN-IUKAMOBKSA-N Ile-Asn-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N NCSIQAFSIPHVAN-IUKAMOBKSA-N 0.000 description 2
- RFMDODRWJZHZCR-BJDJZHNGSA-N Ile-Lys-Cys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(O)=O RFMDODRWJZHZCR-BJDJZHNGSA-N 0.000 description 2
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 2
- 241000880493 Leptailurus serval Species 0.000 description 2
- KWTVLKBOQATPHJ-SRVKXCTJSA-N Leu-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N KWTVLKBOQATPHJ-SRVKXCTJSA-N 0.000 description 2
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 2
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 2
- KAFOIVJDVSZUMD-DCAQKATOSA-N Leu-Gln-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-DCAQKATOSA-N 0.000 description 2
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 2
- ONPJGOIVICHWBW-BZSNNMDCSA-N Leu-Lys-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ONPJGOIVICHWBW-BZSNNMDCSA-N 0.000 description 2
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 2
- GZRABTMNWJXFMH-UVOCVTCTSA-N Leu-Thr-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZRABTMNWJXFMH-UVOCVTCTSA-N 0.000 description 2
- 102000003960 Ligases Human genes 0.000 description 2
- 108090000364 Ligases Proteins 0.000 description 2
- YIBOAHAOAWACDK-QEJZJMRPSA-N Lys-Ala-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YIBOAHAOAWACDK-QEJZJMRPSA-N 0.000 description 2
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 2
- ISHNZELVUVPCHY-ZETCQYMHSA-N Lys-Gly-Gly Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O ISHNZELVUVPCHY-ZETCQYMHSA-N 0.000 description 2
- KKFVKBWCXXLKIK-AVGNSLFASA-N Lys-His-Glu Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCCN)N KKFVKBWCXXLKIK-AVGNSLFASA-N 0.000 description 2
- SLQJJFAVWSZLBL-BJDJZHNGSA-N Lys-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN SLQJJFAVWSZLBL-BJDJZHNGSA-N 0.000 description 2
- ZJSZPXISKMDJKQ-JYJNAYRXSA-N Lys-Phe-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=CC=C1 ZJSZPXISKMDJKQ-JYJNAYRXSA-N 0.000 description 2
- UGCIQUYEJIEHKX-GVXVVHGQSA-N Lys-Val-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O UGCIQUYEJIEHKX-GVXVVHGQSA-N 0.000 description 2
- VKCPHIOZDWUFSW-ONGXEEELSA-N Lys-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN VKCPHIOZDWUFSW-ONGXEEELSA-N 0.000 description 2
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 2
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 2
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 2
- VIIRRNQMMIHYHQ-XHSDSOJGSA-N Phe-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N VIIRRNQMMIHYHQ-XHSDSOJGSA-N 0.000 description 2
- CDGABSWLRMECHC-IHRRRGAJSA-N Pro-Lys-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O CDGABSWLRMECHC-IHRRRGAJSA-N 0.000 description 2
- OHKFXGKHSJKKAL-NRPADANISA-N Ser-Glu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OHKFXGKHSJKKAL-NRPADANISA-N 0.000 description 2
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 2
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 2
- GSCVDSBEYVGMJQ-SRVKXCTJSA-N Ser-Tyr-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)O GSCVDSBEYVGMJQ-SRVKXCTJSA-N 0.000 description 2
- 241001622829 Tatumella Species 0.000 description 2
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 2
- CRZNCABIJLRFKZ-IUKAMOBKSA-N Thr-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N CRZNCABIJLRFKZ-IUKAMOBKSA-N 0.000 description 2
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 2
- VTFWAGGJDRSQFG-MELADBBJSA-N Tyr-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O VTFWAGGJDRSQFG-MELADBBJSA-N 0.000 description 2
- KGSDLCMCDFETHU-YESZJQIVSA-N Tyr-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O KGSDLCMCDFETHU-YESZJQIVSA-N 0.000 description 2
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 2
- VVZDBPBZHLQPPB-XVKPBYJWSA-N Val-Glu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VVZDBPBZHLQPPB-XVKPBYJWSA-N 0.000 description 2
- 239000008272 agar Substances 0.000 description 2
- 108010005233 alanylglutamic acid Proteins 0.000 description 2
- 108010044940 alanylglutamine Proteins 0.000 description 2
- 108010047495 alanylglycine Proteins 0.000 description 2
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 108010068265 aspartyltyrosine Proteins 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- 238000005520 cutting process Methods 0.000 description 2
- 108010069495 cysteinyltyrosine Proteins 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 2
- 108010049041 glutamylalanine Proteins 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- 108010015792 glycyllysine Proteins 0.000 description 2
- 108010092114 histidylphenylalanine Proteins 0.000 description 2
- 108010018006 histidylserine Proteins 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 2
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 2
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 2
- 230000037353 metabolic pathway Effects 0.000 description 2
- 108010005942 methionylglycine Proteins 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 108020004707 nucleic acids Proteins 0.000 description 2
- 102000039446 nucleic acids Human genes 0.000 description 2
- 150000007523 nucleic acids Chemical class 0.000 description 2
- 239000000843 powder Substances 0.000 description 2
- 108700042769 prolyl-leucyl-glycine Proteins 0.000 description 2
- 108010070643 prolylglutamic acid Proteins 0.000 description 2
- 238000011084 recovery Methods 0.000 description 2
- 108091008146 restriction endonucleases Proteins 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 108010080629 tryptophan-leucine Proteins 0.000 description 2
- 108010073969 valyllysine Proteins 0.000 description 2
- CWFMWBHMIMNZLN-NAKRPEOUSA-N (2s)-1-[(2s)-2-[[(2s,3s)-2-amino-3-methylpentanoyl]amino]propanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CWFMWBHMIMNZLN-NAKRPEOUSA-N 0.000 description 1
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 1
- 101150044182 8 gene Proteins 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- 241000607525 Aeromonas salmonicida Species 0.000 description 1
- SSSROGPPPVTHLX-FXQIFTODSA-N Ala-Arg-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SSSROGPPPVTHLX-FXQIFTODSA-N 0.000 description 1
- FSBCNCKIQZZASN-GUBZILKMSA-N Ala-Arg-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O FSBCNCKIQZZASN-GUBZILKMSA-N 0.000 description 1
- JAMAWBXXKFGFGX-KZVJFYERSA-N Ala-Arg-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JAMAWBXXKFGFGX-KZVJFYERSA-N 0.000 description 1
- LBJYAILUMSUTAM-ZLUOBGJFSA-N Ala-Asn-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O LBJYAILUMSUTAM-ZLUOBGJFSA-N 0.000 description 1
- YBPLKDWJFYCZSV-ZLUOBGJFSA-N Ala-Asn-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N YBPLKDWJFYCZSV-ZLUOBGJFSA-N 0.000 description 1
- XCVRVWZTXPCYJT-BIIVOSGPSA-N Ala-Asn-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N XCVRVWZTXPCYJT-BIIVOSGPSA-N 0.000 description 1
- GWFSQQNGMPGBEF-GHCJXIJMSA-N Ala-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N GWFSQQNGMPGBEF-GHCJXIJMSA-N 0.000 description 1
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 1
- MKZCBYZBCINNJN-DLOVCJGASA-N Ala-Asp-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MKZCBYZBCINNJN-DLOVCJGASA-N 0.000 description 1
- WCBVQNZTOKJWJS-ACZMJKKPSA-N Ala-Cys-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(O)=O WCBVQNZTOKJWJS-ACZMJKKPSA-N 0.000 description 1
- IYCZBJXFSZSHPN-DLOVCJGASA-N Ala-Cys-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IYCZBJXFSZSHPN-DLOVCJGASA-N 0.000 description 1
- AWAXZRDKUHOPBO-GUBZILKMSA-N Ala-Gln-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O AWAXZRDKUHOPBO-GUBZILKMSA-N 0.000 description 1
- GGNHBHYDMUDXQB-KBIXCLLPSA-N Ala-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)N GGNHBHYDMUDXQB-KBIXCLLPSA-N 0.000 description 1
- HMRWQTHUDVXMGH-GUBZILKMSA-N Ala-Glu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HMRWQTHUDVXMGH-GUBZILKMSA-N 0.000 description 1
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 1
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 1
- PNALXAODQKTNLV-JBDRJPRFSA-N Ala-Ile-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O PNALXAODQKTNLV-JBDRJPRFSA-N 0.000 description 1
- GSHKMNKPMLXSQW-KBIXCLLPSA-N Ala-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C)N GSHKMNKPMLXSQW-KBIXCLLPSA-N 0.000 description 1
- CKLDHDOIYBVUNP-KBIXCLLPSA-N Ala-Ile-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O CKLDHDOIYBVUNP-KBIXCLLPSA-N 0.000 description 1
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 1
- OKIKVSXTXVVFDV-MMWGEVLESA-N Ala-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N OKIKVSXTXVVFDV-MMWGEVLESA-N 0.000 description 1
- QQACQIHVWCVBBR-GVARAGBVSA-N Ala-Ile-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QQACQIHVWCVBBR-GVARAGBVSA-N 0.000 description 1
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 1
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 1
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 1
- SDZRIBWEVVRDQI-CIUDSAMLSA-N Ala-Lys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O SDZRIBWEVVRDQI-CIUDSAMLSA-N 0.000 description 1
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 1
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 1
- XRUJOVRWNMBAAA-NHCYSSNCSA-N Ala-Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 XRUJOVRWNMBAAA-NHCYSSNCSA-N 0.000 description 1
- DHBKYZYFEXXUAK-ONGXEEELSA-N Ala-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 DHBKYZYFEXXUAK-ONGXEEELSA-N 0.000 description 1
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 1
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 1
- YCRAFFCYWOUEOF-DLOVCJGASA-N Ala-Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 YCRAFFCYWOUEOF-DLOVCJGASA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- ARHJJAAWNWOACN-FXQIFTODSA-N Ala-Ser-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O ARHJJAAWNWOACN-FXQIFTODSA-N 0.000 description 1
- AOAKQKVICDWCLB-UWJYBYFXSA-N Ala-Tyr-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AOAKQKVICDWCLB-UWJYBYFXSA-N 0.000 description 1
- JPOQZCHGOTWRTM-FQPOAREZSA-N Ala-Tyr-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPOQZCHGOTWRTM-FQPOAREZSA-N 0.000 description 1
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 1
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 1
- 241000192542 Anabaena Species 0.000 description 1
- 241000192537 Anabaena cylindrica Species 0.000 description 1
- SGYSTDWPNPKJPP-GUBZILKMSA-N Arg-Ala-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SGYSTDWPNPKJPP-GUBZILKMSA-N 0.000 description 1
- VKKYFICVTYKFIO-CIUDSAMLSA-N Arg-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N VKKYFICVTYKFIO-CIUDSAMLSA-N 0.000 description 1
- VYSRNGOMGHOJCK-GUBZILKMSA-N Arg-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N VYSRNGOMGHOJCK-GUBZILKMSA-N 0.000 description 1
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 1
- CPSHGRGUPZBMOK-CIUDSAMLSA-N Arg-Asn-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O CPSHGRGUPZBMOK-CIUDSAMLSA-N 0.000 description 1
- RVDVDRUZWZIBJQ-CIUDSAMLSA-N Arg-Asn-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O RVDVDRUZWZIBJQ-CIUDSAMLSA-N 0.000 description 1
- XTGGTAWGUFXJSV-NAKRPEOUSA-N Arg-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCN=C(N)N)N XTGGTAWGUFXJSV-NAKRPEOUSA-N 0.000 description 1
- BEXGZLUHRXTZCC-CIUDSAMLSA-N Arg-Gln-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)CN=C(N)N BEXGZLUHRXTZCC-CIUDSAMLSA-N 0.000 description 1
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 1
- FRMQITGHXMUNDF-GMOBBJLQSA-N Arg-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FRMQITGHXMUNDF-GMOBBJLQSA-N 0.000 description 1
- GNYUVVJYGJFKHN-RVMXOQNASA-N Arg-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N GNYUVVJYGJFKHN-RVMXOQNASA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 1
- LCBSSOCDWUTQQV-SDDRHHMPSA-N Arg-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LCBSSOCDWUTQQV-SDDRHHMPSA-N 0.000 description 1
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 1
- MNBHKGYCLBUIBC-UFYCRDLUSA-N Arg-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCCNC(N)=N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 MNBHKGYCLBUIBC-UFYCRDLUSA-N 0.000 description 1
- AOHKLEBWKMKITA-IHRRRGAJSA-N Arg-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AOHKLEBWKMKITA-IHRRRGAJSA-N 0.000 description 1
- UULLJGQFCDXVTQ-CYDGBPFRSA-N Arg-Pro-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UULLJGQFCDXVTQ-CYDGBPFRSA-N 0.000 description 1
- AMIQZQAAYGYKOP-FXQIFTODSA-N Arg-Ser-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O AMIQZQAAYGYKOP-FXQIFTODSA-N 0.000 description 1
- LFAUVOXPCGJKTB-DCAQKATOSA-N Arg-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N LFAUVOXPCGJKTB-DCAQKATOSA-N 0.000 description 1
- KSHJMDSNSKDJPU-QTKMDUPCSA-N Arg-Thr-His Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KSHJMDSNSKDJPU-QTKMDUPCSA-N 0.000 description 1
- VJIQPOJMISSUPO-BVSLBCMMSA-N Arg-Trp-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VJIQPOJMISSUPO-BVSLBCMMSA-N 0.000 description 1
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 1
- VLIJAPRTSXSGFY-STQMWFEESA-N Arg-Tyr-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 VLIJAPRTSXSGFY-STQMWFEESA-N 0.000 description 1
- CGWVCWFQGXOUSJ-ULQDDVLXSA-N Arg-Tyr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O CGWVCWFQGXOUSJ-ULQDDVLXSA-N 0.000 description 1
- QHUOOCKNNURZSL-IHRRRGAJSA-N Arg-Tyr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O QHUOOCKNNURZSL-IHRRRGAJSA-N 0.000 description 1
- ULBHWNVWSCJLCO-NHCYSSNCSA-N Arg-Val-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N ULBHWNVWSCJLCO-NHCYSSNCSA-N 0.000 description 1
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 1
- CPTXATAOUQJQRO-GUBZILKMSA-N Arg-Val-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O CPTXATAOUQJQRO-GUBZILKMSA-N 0.000 description 1
- LEFKSBYHUGUWLP-ACZMJKKPSA-N Asn-Ala-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LEFKSBYHUGUWLP-ACZMJKKPSA-N 0.000 description 1
- SLKLLQWZQHXYSV-CIUDSAMLSA-N Asn-Ala-Lys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O SLKLLQWZQHXYSV-CIUDSAMLSA-N 0.000 description 1
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 1
- MFFOYNGMOYFPBD-DCAQKATOSA-N Asn-Arg-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O MFFOYNGMOYFPBD-DCAQKATOSA-N 0.000 description 1
- ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N Asn-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N 0.000 description 1
- DAPLJWATMAXPPZ-CIUDSAMLSA-N Asn-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O DAPLJWATMAXPPZ-CIUDSAMLSA-N 0.000 description 1
- UGXVKHRDGLYFKR-CIUDSAMLSA-N Asn-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(N)=O UGXVKHRDGLYFKR-CIUDSAMLSA-N 0.000 description 1
- UPALZCBCKAMGIY-PEFMBERDSA-N Asn-Gln-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UPALZCBCKAMGIY-PEFMBERDSA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- MSBDSTRUMZFSEU-PEFMBERDSA-N Asn-Glu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MSBDSTRUMZFSEU-PEFMBERDSA-N 0.000 description 1
- IICZCLFBILYRCU-WHFBIAKZSA-N Asn-Gly-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O IICZCLFBILYRCU-WHFBIAKZSA-N 0.000 description 1
- WONGRTVAMHFGBE-WDSKDSINSA-N Asn-Gly-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N WONGRTVAMHFGBE-WDSKDSINSA-N 0.000 description 1
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 1
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 1
- SGAUXNZEFIEAAI-GARJFASQSA-N Asn-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)C(=O)O SGAUXNZEFIEAAI-GARJFASQSA-N 0.000 description 1
- MYCSPQIARXTUTP-SRVKXCTJSA-N Asn-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N MYCSPQIARXTUTP-SRVKXCTJSA-N 0.000 description 1
- UBGGJTMETLEXJD-DCAQKATOSA-N Asn-Leu-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O UBGGJTMETLEXJD-DCAQKATOSA-N 0.000 description 1
- JWKDQOORUCYUIW-ZPFDUUQYSA-N Asn-Lys-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JWKDQOORUCYUIW-ZPFDUUQYSA-N 0.000 description 1
- RTFWCVDISAMGEQ-SRVKXCTJSA-N Asn-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N RTFWCVDISAMGEQ-SRVKXCTJSA-N 0.000 description 1
- RAUPFUCUDBQYHE-AVGNSLFASA-N Asn-Phe-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RAUPFUCUDBQYHE-AVGNSLFASA-N 0.000 description 1
- XTMZYFMTYJNABC-ZLUOBGJFSA-N Asn-Ser-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N XTMZYFMTYJNABC-ZLUOBGJFSA-N 0.000 description 1
- JWQWPRCDYWNVNM-ACZMJKKPSA-N Asn-Ser-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N JWQWPRCDYWNVNM-ACZMJKKPSA-N 0.000 description 1
- MYTHOBCLNIOFBL-SRVKXCTJSA-N Asn-Ser-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYTHOBCLNIOFBL-SRVKXCTJSA-N 0.000 description 1
- ZUFPUBYQYWCMDB-NUMRIWBASA-N Asn-Thr-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZUFPUBYQYWCMDB-NUMRIWBASA-N 0.000 description 1
- AMGQTNHANMRPOE-LKXGYXEUSA-N Asn-Thr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O AMGQTNHANMRPOE-LKXGYXEUSA-N 0.000 description 1
- YNQMEIJEWSHOEO-SRVKXCTJSA-N Asn-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O YNQMEIJEWSHOEO-SRVKXCTJSA-N 0.000 description 1
- YSYTWUMRHSFODC-QWRGUYRKSA-N Asn-Tyr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O YSYTWUMRHSFODC-QWRGUYRKSA-N 0.000 description 1
- ZAESWDKAMDVHLL-RCOVLWMOSA-N Asn-Val-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O ZAESWDKAMDVHLL-RCOVLWMOSA-N 0.000 description 1
- CASGONAXMZPHCK-FXQIFTODSA-N Asp-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N CASGONAXMZPHCK-FXQIFTODSA-N 0.000 description 1
- BUVNWKQBMZLCDW-UGYAYLCHSA-N Asp-Asn-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BUVNWKQBMZLCDW-UGYAYLCHSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- NAPNAGZWHQHZLG-ZLUOBGJFSA-N Asp-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N NAPNAGZWHQHZLG-ZLUOBGJFSA-N 0.000 description 1
- TVVYVAUGRHNTGT-UGYAYLCHSA-N Asp-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O TVVYVAUGRHNTGT-UGYAYLCHSA-N 0.000 description 1
- LKIYSIYBKYLKPU-BIIVOSGPSA-N Asp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O LKIYSIYBKYLKPU-BIIVOSGPSA-N 0.000 description 1
- VHQOCWWKXIOAQI-WDSKDSINSA-N Asp-Gln-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VHQOCWWKXIOAQI-WDSKDSINSA-N 0.000 description 1
- DXQOQMCLWWADMU-ACZMJKKPSA-N Asp-Gln-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DXQOQMCLWWADMU-ACZMJKKPSA-N 0.000 description 1
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 1
- XAJRHVUUVUPFQL-ACZMJKKPSA-N Asp-Glu-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XAJRHVUUVUPFQL-ACZMJKKPSA-N 0.000 description 1
- HSWYMWGDMPLTTH-FXQIFTODSA-N Asp-Glu-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HSWYMWGDMPLTTH-FXQIFTODSA-N 0.000 description 1
- OVPHVTCDVYYTHN-AVGNSLFASA-N Asp-Glu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OVPHVTCDVYYTHN-AVGNSLFASA-N 0.000 description 1
- ZEDBMCPXPIYJLW-XHNCKOQMSA-N Asp-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O ZEDBMCPXPIYJLW-XHNCKOQMSA-N 0.000 description 1
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 1
- HAFCJCDJGIOYPW-WDSKDSINSA-N Asp-Gly-Gln Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O HAFCJCDJGIOYPW-WDSKDSINSA-N 0.000 description 1
- KHGPWGKPYHPOIK-QWRGUYRKSA-N Asp-Gly-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KHGPWGKPYHPOIK-QWRGUYRKSA-N 0.000 description 1
- SVABRQFIHCSNCI-FOHZUACHSA-N Asp-Gly-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SVABRQFIHCSNCI-FOHZUACHSA-N 0.000 description 1
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 1
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 1
- XLILXFRAKOYEJX-GUBZILKMSA-N Asp-Leu-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLILXFRAKOYEJX-GUBZILKMSA-N 0.000 description 1
- ZRUBWRCKIVDCFS-XPCJQDJLSA-N Asp-Leu-Thr-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ZRUBWRCKIVDCFS-XPCJQDJLSA-N 0.000 description 1
- QNMKWNONJGKJJC-NHCYSSNCSA-N Asp-Leu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O QNMKWNONJGKJJC-NHCYSSNCSA-N 0.000 description 1
- WOPJVEMFXYHZEE-SRVKXCTJSA-N Asp-Phe-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WOPJVEMFXYHZEE-SRVKXCTJSA-N 0.000 description 1
- PCJOFZYFFMBZKC-PCBIJLKTSA-N Asp-Phe-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PCJOFZYFFMBZKC-PCBIJLKTSA-N 0.000 description 1
- NONWUQAWAANERO-BZSNNMDCSA-N Asp-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 NONWUQAWAANERO-BZSNNMDCSA-N 0.000 description 1
- XUVTWGPERWIERB-IHRRRGAJSA-N Asp-Pro-Phe Chemical compound N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O XUVTWGPERWIERB-IHRRRGAJSA-N 0.000 description 1
- DINOVZWPTMGSRF-QXEWZRGKSA-N Asp-Pro-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O DINOVZWPTMGSRF-QXEWZRGKSA-N 0.000 description 1
- XXAMCEGRCZQGEM-ZLUOBGJFSA-N Asp-Ser-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O XXAMCEGRCZQGEM-ZLUOBGJFSA-N 0.000 description 1
- ZVGRHIRJLWBWGJ-ACZMJKKPSA-N Asp-Ser-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZVGRHIRJLWBWGJ-ACZMJKKPSA-N 0.000 description 1
- GHAHOJDCBRXAKC-IHPCNDPISA-N Asp-Trp-Tyr Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N GHAHOJDCBRXAKC-IHPCNDPISA-N 0.000 description 1
- JGLWFWXGOINXEA-YDHLFZDLSA-N Asp-Val-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JGLWFWXGOINXEA-YDHLFZDLSA-N 0.000 description 1
- 241000193830 Bacillus <bacterium> Species 0.000 description 1
- VEXZGXHMUGYJMC-UHFFFAOYSA-M Chloride anion Chemical compound [Cl-] VEXZGXHMUGYJMC-UHFFFAOYSA-M 0.000 description 1
- 241000192700 Cyanobacteria Species 0.000 description 1
- 244000166675 Cymbopogon nardus Species 0.000 description 1
- 235000018791 Cymbopogon nardus Nutrition 0.000 description 1
- TVYMKYUSZSVOAG-ZLUOBGJFSA-N Cys-Ala-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O TVYMKYUSZSVOAG-ZLUOBGJFSA-N 0.000 description 1
- XXDLUZLKHOVPNW-IHRRRGAJSA-N Cys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CS)N)O XXDLUZLKHOVPNW-IHRRRGAJSA-N 0.000 description 1
- VFGADOJXRLWTBU-JBDRJPRFSA-N Cys-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N VFGADOJXRLWTBU-JBDRJPRFSA-N 0.000 description 1
- OHLLDUNVMPPUMD-DCAQKATOSA-N Cys-Leu-Val Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CS)N OHLLDUNVMPPUMD-DCAQKATOSA-N 0.000 description 1
- BNCKELUXXUYRNY-GUBZILKMSA-N Cys-Lys-Glu Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N BNCKELUXXUYRNY-GUBZILKMSA-N 0.000 description 1
- YXPNKXFOBHRUBL-BJDJZHNGSA-N Cys-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N YXPNKXFOBHRUBL-BJDJZHNGSA-N 0.000 description 1
- UDDITVWSXPEAIQ-IHRRRGAJSA-N Cys-Phe-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UDDITVWSXPEAIQ-IHRRRGAJSA-N 0.000 description 1
- JTEGHEWKBCTIAL-IXOXFDKPSA-N Cys-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CS)N)O JTEGHEWKBCTIAL-IXOXFDKPSA-N 0.000 description 1
- LHRCZIRWNFRIRG-SRVKXCTJSA-N Cys-Tyr-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N)O LHRCZIRWNFRIRG-SRVKXCTJSA-N 0.000 description 1
- LPBUBIHAVKXUOT-FXQIFTODSA-N Cys-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N LPBUBIHAVKXUOT-FXQIFTODSA-N 0.000 description 1
- FBPFZTCFMRRESA-FSIIMWSLSA-N D-Glucitol Natural products OC[C@H](O)[C@H](O)[C@@H](O)[C@H](O)CO FBPFZTCFMRRESA-FSIIMWSLSA-N 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- XXLBHPPXDUWYAG-XQXXSGGOSA-N Gln-Ala-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XXLBHPPXDUWYAG-XQXXSGGOSA-N 0.000 description 1
- KWUSGAIFNHQCBY-DCAQKATOSA-N Gln-Arg-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O KWUSGAIFNHQCBY-DCAQKATOSA-N 0.000 description 1
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 1
- SOBBAYVQSNXYPQ-ACZMJKKPSA-N Gln-Asn-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SOBBAYVQSNXYPQ-ACZMJKKPSA-N 0.000 description 1
- TWHDOEYLXXQYOZ-FXQIFTODSA-N Gln-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N TWHDOEYLXXQYOZ-FXQIFTODSA-N 0.000 description 1
- PHZYLYASFWHLHJ-FXQIFTODSA-N Gln-Asn-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PHZYLYASFWHLHJ-FXQIFTODSA-N 0.000 description 1
- WMOMPXKOKASNBK-PEFMBERDSA-N Gln-Asn-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WMOMPXKOKASNBK-PEFMBERDSA-N 0.000 description 1
- ODBLJLZVLAWVMS-GUBZILKMSA-N Gln-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N ODBLJLZVLAWVMS-GUBZILKMSA-N 0.000 description 1
- CYTSBCIIEHUPDU-ACZMJKKPSA-N Gln-Asp-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O CYTSBCIIEHUPDU-ACZMJKKPSA-N 0.000 description 1
- QYTKAVBFRUGYAU-ACZMJKKPSA-N Gln-Asp-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QYTKAVBFRUGYAU-ACZMJKKPSA-N 0.000 description 1
- IKDOHQHEFPPGJG-FXQIFTODSA-N Gln-Asp-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IKDOHQHEFPPGJG-FXQIFTODSA-N 0.000 description 1
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 1
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 1
- XKBASPWPBXNVLQ-WDSKDSINSA-N Gln-Gly-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XKBASPWPBXNVLQ-WDSKDSINSA-N 0.000 description 1
- XSBGUANSZDGULP-IUCAKERBSA-N Gln-Gly-Lys Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O XSBGUANSZDGULP-IUCAKERBSA-N 0.000 description 1
- HXOLDXKNWKLDMM-YVNDNENWSA-N Gln-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HXOLDXKNWKLDMM-YVNDNENWSA-N 0.000 description 1
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 1
- HWEINOMSWQSJDC-SRVKXCTJSA-N Gln-Leu-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O HWEINOMSWQSJDC-SRVKXCTJSA-N 0.000 description 1
- VZRAXPGTUNDIDK-GUBZILKMSA-N Gln-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N VZRAXPGTUNDIDK-GUBZILKMSA-N 0.000 description 1
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 1
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 1
- QDXMSSWCEVYOLZ-SZMVWBNQSA-N Gln-Leu-Trp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCC(=O)N)N QDXMSSWCEVYOLZ-SZMVWBNQSA-N 0.000 description 1
- HPCOBEHVEHWREJ-DCAQKATOSA-N Gln-Lys-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HPCOBEHVEHWREJ-DCAQKATOSA-N 0.000 description 1
- ILKYYKRAULNYMS-JYJNAYRXSA-N Gln-Lys-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ILKYYKRAULNYMS-JYJNAYRXSA-N 0.000 description 1
- ZEEPYMXTJWIMSN-GUBZILKMSA-N Gln-Lys-Ser Chemical compound NCCCC[C@@H](C(=O)N[C@@H](CO)C(O)=O)NC(=O)[C@@H](N)CCC(N)=O ZEEPYMXTJWIMSN-GUBZILKMSA-N 0.000 description 1
- AQPZYBSRDRZBAG-AVGNSLFASA-N Gln-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N AQPZYBSRDRZBAG-AVGNSLFASA-N 0.000 description 1
- FTTHLXOMDMLKKW-FHWLQOOXSA-N Gln-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(N)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FTTHLXOMDMLKKW-FHWLQOOXSA-N 0.000 description 1
- MFHVAWMMKZBSRQ-ACZMJKKPSA-N Gln-Ser-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N MFHVAWMMKZBSRQ-ACZMJKKPSA-N 0.000 description 1
- RWQCWSGOOOEGPB-FXQIFTODSA-N Gln-Ser-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O RWQCWSGOOOEGPB-FXQIFTODSA-N 0.000 description 1
- LPIKVBWNNVFHCQ-GUBZILKMSA-N Gln-Ser-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LPIKVBWNNVFHCQ-GUBZILKMSA-N 0.000 description 1
- PAOHIZNRJNIXQY-XQXXSGGOSA-N Gln-Thr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PAOHIZNRJNIXQY-XQXXSGGOSA-N 0.000 description 1
- IIMZHVKZBGSEKZ-SZMVWBNQSA-N Gln-Trp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O IIMZHVKZBGSEKZ-SZMVWBNQSA-N 0.000 description 1
- CVRUVYDNRPSKBM-QEJZJMRPSA-N Gln-Trp-Ser Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N CVRUVYDNRPSKBM-QEJZJMRPSA-N 0.000 description 1
- WIMVKDYAKRAUCG-IHRRRGAJSA-N Gln-Tyr-Glu Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O WIMVKDYAKRAUCG-IHRRRGAJSA-N 0.000 description 1
- VCUNGPMMPNJSGS-JYJNAYRXSA-N Gln-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O VCUNGPMMPNJSGS-JYJNAYRXSA-N 0.000 description 1
- ZZLDMBMFKZFQMU-NRPADANISA-N Gln-Val-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O ZZLDMBMFKZFQMU-NRPADANISA-N 0.000 description 1
- SZXSSXUNOALWCH-ACZMJKKPSA-N Glu-Ala-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O SZXSSXUNOALWCH-ACZMJKKPSA-N 0.000 description 1
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 1
- DIXKFOPPGWKZLY-CIUDSAMLSA-N Glu-Arg-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O DIXKFOPPGWKZLY-CIUDSAMLSA-N 0.000 description 1
- PBEQPAZRHDVJQI-SRVKXCTJSA-N Glu-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N PBEQPAZRHDVJQI-SRVKXCTJSA-N 0.000 description 1
- OJGLIOXAKGFFDW-SRVKXCTJSA-N Glu-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N OJGLIOXAKGFFDW-SRVKXCTJSA-N 0.000 description 1
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 1
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 1
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 1
- PXHABOCPJVTGEK-BQBZGAKWSA-N Glu-Gln-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O PXHABOCPJVTGEK-BQBZGAKWSA-N 0.000 description 1
- VFZIDQZAEBORGY-GLLZPBPUSA-N Glu-Gln-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VFZIDQZAEBORGY-GLLZPBPUSA-N 0.000 description 1
- HTTSBEBKVNEDFE-AUTRQRHGSA-N Glu-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)O)N HTTSBEBKVNEDFE-AUTRQRHGSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 1
- VXQOONWNIWFOCS-HGNGGELXSA-N Glu-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N VXQOONWNIWFOCS-HGNGGELXSA-N 0.000 description 1
- DRLVXRQFROIYTD-GUBZILKMSA-N Glu-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N DRLVXRQFROIYTD-GUBZILKMSA-N 0.000 description 1
- ZMVCLTGPGWJAEE-JYJNAYRXSA-N Glu-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCC(=O)O)N)O ZMVCLTGPGWJAEE-JYJNAYRXSA-N 0.000 description 1
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 1
- LGYCLOCORAEQSZ-PEFMBERDSA-N Glu-Ile-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O LGYCLOCORAEQSZ-PEFMBERDSA-N 0.000 description 1
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 1
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 1
- ZHNHJYYFCGUZNQ-KBIXCLLPSA-N Glu-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O ZHNHJYYFCGUZNQ-KBIXCLLPSA-N 0.000 description 1
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 1
- NWOUBJNMZDDGDT-AVGNSLFASA-N Glu-Leu-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 NWOUBJNMZDDGDT-AVGNSLFASA-N 0.000 description 1
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 1
- YGLCLCMAYUYZSG-AVGNSLFASA-N Glu-Lys-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 YGLCLCMAYUYZSG-AVGNSLFASA-N 0.000 description 1
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 1
- QDMVXRNLOPTPIE-WDCWCFNPSA-N Glu-Lys-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QDMVXRNLOPTPIE-WDCWCFNPSA-N 0.000 description 1
- UERORLSAFUHDGU-AVGNSLFASA-N Glu-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N UERORLSAFUHDGU-AVGNSLFASA-N 0.000 description 1
- LHIPZASLKPYDPI-AVGNSLFASA-N Glu-Phe-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LHIPZASLKPYDPI-AVGNSLFASA-N 0.000 description 1
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 1
- BIYNPVYAZOUVFQ-CIUDSAMLSA-N Glu-Pro-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O BIYNPVYAZOUVFQ-CIUDSAMLSA-N 0.000 description 1
- NNQDRRUXFJYCCJ-NHCYSSNCSA-N Glu-Pro-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O NNQDRRUXFJYCCJ-NHCYSSNCSA-N 0.000 description 1
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 1
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 1
- VHPVBPCCWVDGJL-IRIUXVKKSA-N Glu-Thr-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VHPVBPCCWVDGJL-IRIUXVKKSA-N 0.000 description 1
- YOTHMZZSJKKEHZ-SZMVWBNQSA-N Glu-Trp-Lys Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CCC(O)=O)=CNC2=C1 YOTHMZZSJKKEHZ-SZMVWBNQSA-N 0.000 description 1
- MIWJDJAMMKHUAR-ZVZYQTTQSA-N Glu-Trp-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCC(=O)O)N MIWJDJAMMKHUAR-ZVZYQTTQSA-N 0.000 description 1
- HAGKYCXGTRUUFI-RYUDHWBXSA-N Glu-Tyr-Gly Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)O HAGKYCXGTRUUFI-RYUDHWBXSA-N 0.000 description 1
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- MFVQGXGQRIXBPK-WDSKDSINSA-N Gly-Ala-Glu Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFVQGXGQRIXBPK-WDSKDSINSA-N 0.000 description 1
- VXKCPBPQEKKERH-IUCAKERBSA-N Gly-Arg-Pro Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N1CCC[C@H]1C(O)=O VXKCPBPQEKKERH-IUCAKERBSA-N 0.000 description 1
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 1
- GNBMOZPQUXTCRW-STQMWFEESA-N Gly-Asn-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)CN)C(O)=O)=CNC2=C1 GNBMOZPQUXTCRW-STQMWFEESA-N 0.000 description 1
- QPDUVFSVVAOUHE-XVKPBYJWSA-N Gly-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CN)C(O)=O QPDUVFSVVAOUHE-XVKPBYJWSA-N 0.000 description 1
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 1
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 1
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 1
- MVORZMQFXBLMHM-QWRGUYRKSA-N Gly-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 MVORZMQFXBLMHM-QWRGUYRKSA-N 0.000 description 1
- HPAIKDPJURGQLN-KBPBESRZSA-N Gly-His-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CNC=N1 HPAIKDPJURGQLN-KBPBESRZSA-N 0.000 description 1
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 1
- ULZCYBYDTUMHNF-IUCAKERBSA-N Gly-Leu-Glu Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ULZCYBYDTUMHNF-IUCAKERBSA-N 0.000 description 1
- TVUWMSBGMVAHSJ-KBPBESRZSA-N Gly-Leu-Phe Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TVUWMSBGMVAHSJ-KBPBESRZSA-N 0.000 description 1
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 1
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 1
- LXTRSHQLGYINON-DTWKUNHWSA-N Gly-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN LXTRSHQLGYINON-DTWKUNHWSA-N 0.000 description 1
- IGOYNRWLWHWAQO-JTQLQIEISA-N Gly-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IGOYNRWLWHWAQO-JTQLQIEISA-N 0.000 description 1
- VNNRLUNBJSWZPF-ZKWXMUAHSA-N Gly-Ser-Ile Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNNRLUNBJSWZPF-ZKWXMUAHSA-N 0.000 description 1
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 1
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 1
- BXDLTKLPPKBVEL-FJXKBIBVSA-N Gly-Thr-Met Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O BXDLTKLPPKBVEL-FJXKBIBVSA-N 0.000 description 1
- UMBDRSMLCUYIRI-DVJZZOLTSA-N Gly-Trp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)CN)O UMBDRSMLCUYIRI-DVJZZOLTSA-N 0.000 description 1
- UVTSZKIATYSKIR-RYUDHWBXSA-N Gly-Tyr-Glu Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O UVTSZKIATYSKIR-RYUDHWBXSA-N 0.000 description 1
- MUGLKCQHTUFLGF-WPRPVWTQSA-N Gly-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)CN MUGLKCQHTUFLGF-WPRPVWTQSA-N 0.000 description 1
- SBVMXEZQJVUARN-XPUUQOCRSA-N Gly-Val-Ser Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O SBVMXEZQJVUARN-XPUUQOCRSA-N 0.000 description 1
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 1
- MWWOPNQSBXEUHO-ULQDDVLXSA-N His-Arg-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 MWWOPNQSBXEUHO-ULQDDVLXSA-N 0.000 description 1
- VOEGKUNRHYKYSU-XVYDVKMFSA-N His-Asp-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O VOEGKUNRHYKYSU-XVYDVKMFSA-N 0.000 description 1
- BQYZXYCEKYJKAM-VGDYDELISA-N His-Cys-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BQYZXYCEKYJKAM-VGDYDELISA-N 0.000 description 1
- VYMGAXSNYUFVCK-GUBZILKMSA-N His-Gln-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N VYMGAXSNYUFVCK-GUBZILKMSA-N 0.000 description 1
- FYVHHKMHFPMBBG-GUBZILKMSA-N His-Gln-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N FYVHHKMHFPMBBG-GUBZILKMSA-N 0.000 description 1
- SWSVTNGMKBDTBM-DCAQKATOSA-N His-Gln-Glu Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SWSVTNGMKBDTBM-DCAQKATOSA-N 0.000 description 1
- NWGXCPUKPVISSJ-AVGNSLFASA-N His-Gln-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N NWGXCPUKPVISSJ-AVGNSLFASA-N 0.000 description 1
- RAVLQPXCMRCLKT-KBPBESRZSA-N His-Gly-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RAVLQPXCMRCLKT-KBPBESRZSA-N 0.000 description 1
- WJGSTIMGSIWHJX-HVTMNAMFSA-N His-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N WJGSTIMGSIWHJX-HVTMNAMFSA-N 0.000 description 1
- DYKZGTLPSNOFHU-DEQVHRJGSA-N His-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N DYKZGTLPSNOFHU-DEQVHRJGSA-N 0.000 description 1
- DEOQGJUXUQGUJN-KKUMJFAQSA-N His-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N DEOQGJUXUQGUJN-KKUMJFAQSA-N 0.000 description 1
- LDFWDDVELNOGII-MXAVVETBSA-N His-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N LDFWDDVELNOGII-MXAVVETBSA-N 0.000 description 1
- TVMNTHXFRSXZGR-IHRRRGAJSA-N His-Lys-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O TVMNTHXFRSXZGR-IHRRRGAJSA-N 0.000 description 1
- YAEKRYQASVCDLK-JYJNAYRXSA-N His-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N YAEKRYQASVCDLK-JYJNAYRXSA-N 0.000 description 1
- LNDVNHOSZQPJGI-AVGNSLFASA-N His-Pro-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNDVNHOSZQPJGI-AVGNSLFASA-N 0.000 description 1
- YEKYGQZUBCRNGH-DCAQKATOSA-N His-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CN=CN2)N)C(=O)N[C@@H](CO)C(=O)O YEKYGQZUBCRNGH-DCAQKATOSA-N 0.000 description 1
- CUEQQFOGARVNHU-VGDYDELISA-N His-Ser-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUEQQFOGARVNHU-VGDYDELISA-N 0.000 description 1
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 1
- WSAILOWUJZEAGC-DCAQKATOSA-N His-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N WSAILOWUJZEAGC-DCAQKATOSA-N 0.000 description 1
- SYPULFZAGBBIOM-GVXVVHGQSA-N His-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N SYPULFZAGBBIOM-GVXVVHGQSA-N 0.000 description 1
- GGXUJBKENKVYNV-ULQDDVLXSA-N His-Val-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N GGXUJBKENKVYNV-ULQDDVLXSA-N 0.000 description 1
- RWIKBYVJQAJYDP-BJDJZHNGSA-N Ile-Ala-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RWIKBYVJQAJYDP-BJDJZHNGSA-N 0.000 description 1
- CYHYBSGMHMHKOA-CIQUZCHMSA-N Ile-Ala-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CYHYBSGMHMHKOA-CIQUZCHMSA-N 0.000 description 1
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 1
- ATXGFMOBVKSOMK-PEDHHIEDSA-N Ile-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N ATXGFMOBVKSOMK-PEDHHIEDSA-N 0.000 description 1
- WECYRWOMWSCWNX-XUXIUFHCSA-N Ile-Arg-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O WECYRWOMWSCWNX-XUXIUFHCSA-N 0.000 description 1
- AZEYWPUCOYXFOE-CYDGBPFRSA-N Ile-Arg-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C(C)C)C(=O)O)N AZEYWPUCOYXFOE-CYDGBPFRSA-N 0.000 description 1
- IIXDMJNYALIKGP-DJFWLOJKSA-N Ile-Asn-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N IIXDMJNYALIKGP-DJFWLOJKSA-N 0.000 description 1
- RPZFUIQVAPZLRH-GHCJXIJMSA-N Ile-Asp-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)O)N RPZFUIQVAPZLRH-GHCJXIJMSA-N 0.000 description 1
- JQLFYZMEXFNRFS-DJFWLOJKSA-N Ile-Asp-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N JQLFYZMEXFNRFS-DJFWLOJKSA-N 0.000 description 1
- QSPLUJGYOPZINY-ZPFDUUQYSA-N Ile-Asp-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QSPLUJGYOPZINY-ZPFDUUQYSA-N 0.000 description 1
- NPROWIBAWYMPAZ-GUDRVLHUSA-N Ile-Asp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N NPROWIBAWYMPAZ-GUDRVLHUSA-N 0.000 description 1
- HOLOYAZCIHDQNS-YVNDNENWSA-N Ile-Gln-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HOLOYAZCIHDQNS-YVNDNENWSA-N 0.000 description 1
- CYHJCEKUMCNDFG-LAEOZQHASA-N Ile-Gln-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N CYHJCEKUMCNDFG-LAEOZQHASA-N 0.000 description 1
- BEWFWZRGBDVXRP-PEFMBERDSA-N Ile-Glu-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O BEWFWZRGBDVXRP-PEFMBERDSA-N 0.000 description 1
- XLCZWMJPVGRWHJ-KQXIARHKSA-N Ile-Glu-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N XLCZWMJPVGRWHJ-KQXIARHKSA-N 0.000 description 1
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 1
- SLQVFYWBGNNOTK-BYULHYEWSA-N Ile-Gly-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N SLQVFYWBGNNOTK-BYULHYEWSA-N 0.000 description 1
- LBRCLQMZAHRTLV-ZKWXMUAHSA-N Ile-Gly-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LBRCLQMZAHRTLV-ZKWXMUAHSA-N 0.000 description 1
- CCYGNFBYUNHFSC-MGHWNKPDSA-N Ile-His-Phe Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O CCYGNFBYUNHFSC-MGHWNKPDSA-N 0.000 description 1
- BBQABUDWDUKJMB-LZXPERKUSA-N Ile-Ile-Ile Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C([O-])=O BBQABUDWDUKJMB-LZXPERKUSA-N 0.000 description 1
- TWYOYAKMLHWMOJ-ZPFDUUQYSA-N Ile-Leu-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O TWYOYAKMLHWMOJ-ZPFDUUQYSA-N 0.000 description 1
- YGDWPQCLFJNMOL-MNXVOIDGSA-N Ile-Leu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YGDWPQCLFJNMOL-MNXVOIDGSA-N 0.000 description 1
- FCWFBHMAJZGWRY-XUXIUFHCSA-N Ile-Leu-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)O)N FCWFBHMAJZGWRY-XUXIUFHCSA-N 0.000 description 1
- PHRWFSFCNJPWRO-PPCPHDFISA-N Ile-Leu-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N PHRWFSFCNJPWRO-PPCPHDFISA-N 0.000 description 1
- OVDKXUDMKXAZIV-ZPFDUUQYSA-N Ile-Lys-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OVDKXUDMKXAZIV-ZPFDUUQYSA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 1
- UDBPXJNOEWDBDF-XUXIUFHCSA-N Ile-Lys-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)O)N UDBPXJNOEWDBDF-XUXIUFHCSA-N 0.000 description 1
- UFRXVQGGPNSJRY-CYDGBPFRSA-N Ile-Met-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N UFRXVQGGPNSJRY-CYDGBPFRSA-N 0.000 description 1
- OTSVBELRDMSPKY-PCBIJLKTSA-N Ile-Phe-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OTSVBELRDMSPKY-PCBIJLKTSA-N 0.000 description 1
- XLXPYSDGMXTTNQ-UHFFFAOYSA-N Ile-Phe-Leu Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(CC(C)C)C(O)=O)CC1=CC=CC=C1 XLXPYSDGMXTTNQ-UHFFFAOYSA-N 0.000 description 1
- IVXJIMGDOYRLQU-XUXIUFHCSA-N Ile-Pro-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O IVXJIMGDOYRLQU-XUXIUFHCSA-N 0.000 description 1
- CAHCWMVNBZJVAW-NAKRPEOUSA-N Ile-Pro-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)O)N CAHCWMVNBZJVAW-NAKRPEOUSA-N 0.000 description 1
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 1
- XMYURPUVJSKTMC-KBIXCLLPSA-N Ile-Ser-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XMYURPUVJSKTMC-KBIXCLLPSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 1
- CNMOKANDJMLAIF-CIQUZCHMSA-N Ile-Thr-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O CNMOKANDJMLAIF-CIQUZCHMSA-N 0.000 description 1
- SAEWJTCJQVZQNZ-IUKAMOBKSA-N Ile-Thr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SAEWJTCJQVZQNZ-IUKAMOBKSA-N 0.000 description 1
- KBDIBHQICWDGDL-PPCPHDFISA-N Ile-Thr-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N KBDIBHQICWDGDL-PPCPHDFISA-N 0.000 description 1
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 1
- UYODHPPSCXBNCS-XUXIUFHCSA-N Ile-Val-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C UYODHPPSCXBNCS-XUXIUFHCSA-N 0.000 description 1
- DGAQECJNVWCQMB-PUAWFVPOSA-M Ilexoside XXIX Chemical compound C[C@@H]1CC[C@@]2(CC[C@@]3(C(=CC[C@H]4[C@]3(CC[C@@H]5[C@@]4(CC[C@@H](C5(C)C)OS(=O)(=O)[O-])C)C)[C@@H]2[C@]1(C)O)C)C(=O)O[C@H]6[C@@H]([C@H]([C@@H]([C@H](O6)CO)O)O)O.[Na+] DGAQECJNVWCQMB-PUAWFVPOSA-M 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 1
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 1
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 1
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 1
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- PBCHMHROGNUXMK-DLOVCJGASA-N Leu-Ala-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 PBCHMHROGNUXMK-DLOVCJGASA-N 0.000 description 1
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 1
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 1
- QUAAUWNLWMLERT-IHRRRGAJSA-N Leu-Arg-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O QUAAUWNLWMLERT-IHRRRGAJSA-N 0.000 description 1
- STAVRDQLZOTNKJ-RHYQMDGZSA-N Leu-Arg-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STAVRDQLZOTNKJ-RHYQMDGZSA-N 0.000 description 1
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 1
- RFUBXQQFJFGJFV-GUBZILKMSA-N Leu-Asn-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RFUBXQQFJFGJFV-GUBZILKMSA-N 0.000 description 1
- KKXDHFKZWKLYGB-GUBZILKMSA-N Leu-Asn-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKXDHFKZWKLYGB-GUBZILKMSA-N 0.000 description 1
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 1
- FIJMQLGQLBLBOL-HJGDQZAQSA-N Leu-Asn-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FIJMQLGQLBLBOL-HJGDQZAQSA-N 0.000 description 1
- WXHFZJFZWNCDNB-KKUMJFAQSA-N Leu-Asn-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXHFZJFZWNCDNB-KKUMJFAQSA-N 0.000 description 1
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- KAFOIVJDVSZUMD-UHFFFAOYSA-N Leu-Gln-Gln Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)NC(CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-UHFFFAOYSA-N 0.000 description 1
- WMTOVWLLDGQGCV-GUBZILKMSA-N Leu-Glu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N WMTOVWLLDGQGCV-GUBZILKMSA-N 0.000 description 1
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 1
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 1
- KXODZBLFVFSLAI-AVGNSLFASA-N Leu-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KXODZBLFVFSLAI-AVGNSLFASA-N 0.000 description 1
- XQXGNBFMAXWIGI-MXAVVETBSA-N Leu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 XQXGNBFMAXWIGI-MXAVVETBSA-N 0.000 description 1
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 1
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 1
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 1
- PDQDCFBVYXEFSD-SRVKXCTJSA-N Leu-Leu-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PDQDCFBVYXEFSD-SRVKXCTJSA-N 0.000 description 1
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 1
- FAELBUXXFQLUAX-AJNGGQMLSA-N Leu-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C FAELBUXXFQLUAX-AJNGGQMLSA-N 0.000 description 1
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 1
- FIICHHJDINDXKG-IHPCNDPISA-N Leu-Lys-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O FIICHHJDINDXKG-IHPCNDPISA-N 0.000 description 1
- WXZOHBVPVKABQN-DCAQKATOSA-N Leu-Met-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WXZOHBVPVKABQN-DCAQKATOSA-N 0.000 description 1
- YESNGRDJQWDYLH-KKUMJFAQSA-N Leu-Phe-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N YESNGRDJQWDYLH-KKUMJFAQSA-N 0.000 description 1
- KTOIECMYZZGVSI-BZSNNMDCSA-N Leu-Phe-His Chemical compound C([C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CC=CC=C1 KTOIECMYZZGVSI-BZSNNMDCSA-N 0.000 description 1
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 1
- XWEVVRRSIOBJOO-SRVKXCTJSA-N Leu-Pro-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O XWEVVRRSIOBJOO-SRVKXCTJSA-N 0.000 description 1
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 1
- KWLWZYMNUZJKMZ-IHRRRGAJSA-N Leu-Pro-Leu Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O KWLWZYMNUZJKMZ-IHRRRGAJSA-N 0.000 description 1
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 1
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 1
- UCXQIIIFOOGYEM-ULQDDVLXSA-N Leu-Pro-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCXQIIIFOOGYEM-ULQDDVLXSA-N 0.000 description 1
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 1
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 1
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 1
- SQUFDMCWMFOEBA-KKUMJFAQSA-N Leu-Ser-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SQUFDMCWMFOEBA-KKUMJFAQSA-N 0.000 description 1
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 1
- IDGRADDMTTWOQC-WDSOQIARSA-N Leu-Trp-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IDGRADDMTTWOQC-WDSOQIARSA-N 0.000 description 1
- RIHIGSWBLHSGLV-CQDKDKBSSA-N Leu-Tyr-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O RIHIGSWBLHSGLV-CQDKDKBSSA-N 0.000 description 1
- MVJRBCJCRYGCKV-GVXVVHGQSA-N Leu-Val-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MVJRBCJCRYGCKV-GVXVVHGQSA-N 0.000 description 1
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 1
- VHFFQUSNFFIZBT-CIUDSAMLSA-N Lys-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N VHFFQUSNFFIZBT-CIUDSAMLSA-N 0.000 description 1
- JCFYLFOCALSNLQ-GUBZILKMSA-N Lys-Ala-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JCFYLFOCALSNLQ-GUBZILKMSA-N 0.000 description 1
- PNPYKQFJGRFYJE-GUBZILKMSA-N Lys-Ala-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PNPYKQFJGRFYJE-GUBZILKMSA-N 0.000 description 1
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 1
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 1
- CLBGMWIYPYAZPR-AVGNSLFASA-N Lys-Arg-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O CLBGMWIYPYAZPR-AVGNSLFASA-N 0.000 description 1
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 1
- WALVCOOOKULCQM-ULQDDVLXSA-N Lys-Arg-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WALVCOOOKULCQM-ULQDDVLXSA-N 0.000 description 1
- NQCJGQHHYZNUDK-DCAQKATOSA-N Lys-Arg-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCN=C(N)N NQCJGQHHYZNUDK-DCAQKATOSA-N 0.000 description 1
- DGAAQRAUOFHBFJ-CIUDSAMLSA-N Lys-Asn-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O DGAAQRAUOFHBFJ-CIUDSAMLSA-N 0.000 description 1
- QYOXSYXPHUHOJR-GUBZILKMSA-N Lys-Asn-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QYOXSYXPHUHOJR-GUBZILKMSA-N 0.000 description 1
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 1
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 1
- YVSHZSUKQHNDHD-KKUMJFAQSA-N Lys-Asn-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N YVSHZSUKQHNDHD-KKUMJFAQSA-N 0.000 description 1
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 1
- SFQPJNQDUUYCLA-BJDJZHNGSA-N Lys-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCCN)N SFQPJNQDUUYCLA-BJDJZHNGSA-N 0.000 description 1
- SSYOBDBNBQBSQE-SRVKXCTJSA-N Lys-Cys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O SSYOBDBNBQBSQE-SRVKXCTJSA-N 0.000 description 1
- DFXQCCBKGUNYGG-GUBZILKMSA-N Lys-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN DFXQCCBKGUNYGG-GUBZILKMSA-N 0.000 description 1
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 1
- PGBPWPTUOSCNLE-JYJNAYRXSA-N Lys-Gln-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N PGBPWPTUOSCNLE-JYJNAYRXSA-N 0.000 description 1
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 1
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 1
- YXTKSLRSRXKXNV-IHRRRGAJSA-N Lys-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCCN)N YXTKSLRSRXKXNV-IHRRRGAJSA-N 0.000 description 1
- GNLJXWBNLAIPEP-MELADBBJSA-N Lys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCCN)N)C(=O)O GNLJXWBNLAIPEP-MELADBBJSA-N 0.000 description 1
- MXMDJEJWERYPMO-XUXIUFHCSA-N Lys-Ile-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MXMDJEJWERYPMO-XUXIUFHCSA-N 0.000 description 1
- OJDFAABAHBPVTH-MNXVOIDGSA-N Lys-Ile-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O OJDFAABAHBPVTH-MNXVOIDGSA-N 0.000 description 1
- QOJDBRUCOXQSSK-AJNGGQMLSA-N Lys-Ile-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O QOJDBRUCOXQSSK-AJNGGQMLSA-N 0.000 description 1
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 1
- AIRZWUMAHCDDHR-KKUMJFAQSA-N Lys-Leu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O AIRZWUMAHCDDHR-KKUMJFAQSA-N 0.000 description 1
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 1
- JQSIGLHQNSZZRL-KKUMJFAQSA-N Lys-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N JQSIGLHQNSZZRL-KKUMJFAQSA-N 0.000 description 1
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 1
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 1
- QQPSCXKFDSORFT-IHRRRGAJSA-N Lys-Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN QQPSCXKFDSORFT-IHRRRGAJSA-N 0.000 description 1
- LMGNWHDWJDIOPK-DKIMLUQUSA-N Lys-Phe-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LMGNWHDWJDIOPK-DKIMLUQUSA-N 0.000 description 1
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 1
- OBZHNHBAAVEWKI-DCAQKATOSA-N Lys-Pro-Asn Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O OBZHNHBAAVEWKI-DCAQKATOSA-N 0.000 description 1
- JCVOHUKUYSYBAD-DCAQKATOSA-N Lys-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCCCN)N)C(=O)N[C@@H](CS)C(=O)O JCVOHUKUYSYBAD-DCAQKATOSA-N 0.000 description 1
- HYSVGEAWTGPMOA-IHRRRGAJSA-N Lys-Pro-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O HYSVGEAWTGPMOA-IHRRRGAJSA-N 0.000 description 1
- LECIJRIRMVOFMH-ULQDDVLXSA-N Lys-Pro-Phe Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 LECIJRIRMVOFMH-ULQDDVLXSA-N 0.000 description 1
- YTJFXEDRUOQGSP-DCAQKATOSA-N Lys-Pro-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YTJFXEDRUOQGSP-DCAQKATOSA-N 0.000 description 1
- MIROMRNASYKZNL-ULQDDVLXSA-N Lys-Pro-Tyr Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MIROMRNASYKZNL-ULQDDVLXSA-N 0.000 description 1
- YSPZCHGIWAQVKQ-AVGNSLFASA-N Lys-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN YSPZCHGIWAQVKQ-AVGNSLFASA-N 0.000 description 1
- DNWBUCHHMRQWCZ-GUBZILKMSA-N Lys-Ser-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DNWBUCHHMRQWCZ-GUBZILKMSA-N 0.000 description 1
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 1
- SQXZLVXQXWILKW-KKUMJFAQSA-N Lys-Ser-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQXZLVXQXWILKW-KKUMJFAQSA-N 0.000 description 1
- MIFFFXHMAHFACR-KATARQTJSA-N Lys-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN MIFFFXHMAHFACR-KATARQTJSA-N 0.000 description 1
- YKBSXQFZWFXFIB-VOAKCMCISA-N Lys-Thr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O YKBSXQFZWFXFIB-VOAKCMCISA-N 0.000 description 1
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 1
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 1
- ZFNYWKHYUMEZDZ-WDSOQIARSA-N Lys-Trp-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCCCN)N ZFNYWKHYUMEZDZ-WDSOQIARSA-N 0.000 description 1
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 1
- QLFAPXUXEBAWEK-NHCYSSNCSA-N Lys-Val-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O QLFAPXUXEBAWEK-NHCYSSNCSA-N 0.000 description 1
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 1
- QEVRUYFHWJJUHZ-DCAQKATOSA-N Met-Ala-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(C)C QEVRUYFHWJJUHZ-DCAQKATOSA-N 0.000 description 1
- YNOVBMBQSQTLFM-DCAQKATOSA-N Met-Asn-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O YNOVBMBQSQTLFM-DCAQKATOSA-N 0.000 description 1
- FVKRQMQQFGBXHV-QXEWZRGKSA-N Met-Asp-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O FVKRQMQQFGBXHV-QXEWZRGKSA-N 0.000 description 1
- VOOINLQYUZOREH-SRVKXCTJSA-N Met-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCSC)N VOOINLQYUZOREH-SRVKXCTJSA-N 0.000 description 1
- XPCLRYNQMZOOFB-ULQDDVLXSA-N Met-His-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N XPCLRYNQMZOOFB-ULQDDVLXSA-N 0.000 description 1
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 1
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 1
- LBNFTWKGISQVEE-AVGNSLFASA-N Met-Leu-Met Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCSC LBNFTWKGISQVEE-AVGNSLFASA-N 0.000 description 1
- XDGFFEZAZHRZFR-RHYQMDGZSA-N Met-Leu-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDGFFEZAZHRZFR-RHYQMDGZSA-N 0.000 description 1
- USBFEVBHEQBWDD-AVGNSLFASA-N Met-Leu-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O USBFEVBHEQBWDD-AVGNSLFASA-N 0.000 description 1
- IRVONVRHHJXWTK-RWMBFGLXSA-N Met-Lys-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N IRVONVRHHJXWTK-RWMBFGLXSA-N 0.000 description 1
- FMYLZGQFKPHXHI-GUBZILKMSA-N Met-Met-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O FMYLZGQFKPHXHI-GUBZILKMSA-N 0.000 description 1
- CNAGWYQWQDMUGC-IHRRRGAJSA-N Met-Phe-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CNAGWYQWQDMUGC-IHRRRGAJSA-N 0.000 description 1
- HUURTRNKPBHHKZ-JYJNAYRXSA-N Met-Phe-Val Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 HUURTRNKPBHHKZ-JYJNAYRXSA-N 0.000 description 1
- WYDFQSJOARJAMM-GUBZILKMSA-N Met-Pro-Asp Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WYDFQSJOARJAMM-GUBZILKMSA-N 0.000 description 1
- MQASRXPTQJJNFM-JYJNAYRXSA-N Met-Pro-Phe Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MQASRXPTQJJNFM-JYJNAYRXSA-N 0.000 description 1
- MIXPUVSPPOWTCR-FXQIFTODSA-N Met-Ser-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MIXPUVSPPOWTCR-FXQIFTODSA-N 0.000 description 1
- VOAKKHOIAFKOQZ-JYJNAYRXSA-N Met-Tyr-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)CC1=CC=C(O)C=C1 VOAKKHOIAFKOQZ-JYJNAYRXSA-N 0.000 description 1
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 1
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 241000335186 Peltigera membranacea Species 0.000 description 1
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 1
- BBDSZDHUCPSYAC-QEJZJMRPSA-N Phe-Ala-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BBDSZDHUCPSYAC-QEJZJMRPSA-N 0.000 description 1
- ULECEJGNDHWSKD-QEJZJMRPSA-N Phe-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 ULECEJGNDHWSKD-QEJZJMRPSA-N 0.000 description 1
- UHRNIXJAGGLKHP-DLOVCJGASA-N Phe-Ala-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O UHRNIXJAGGLKHP-DLOVCJGASA-N 0.000 description 1
- BKWJQWJPZMUWEG-LFSVMHDDSA-N Phe-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BKWJQWJPZMUWEG-LFSVMHDDSA-N 0.000 description 1
- VHWOBXIWBDWZHK-IHRRRGAJSA-N Phe-Arg-Asp Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 VHWOBXIWBDWZHK-IHRRRGAJSA-N 0.000 description 1
- AGYXCMYVTBYGCT-ULQDDVLXSA-N Phe-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O AGYXCMYVTBYGCT-ULQDDVLXSA-N 0.000 description 1
- HTKNPQZCMLBOTQ-XVSYOHENSA-N Phe-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N)O HTKNPQZCMLBOTQ-XVSYOHENSA-N 0.000 description 1
- ZENDEDYRYVHBEG-SRVKXCTJSA-N Phe-Asp-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZENDEDYRYVHBEG-SRVKXCTJSA-N 0.000 description 1
- KOUUGTKGEQZRHV-KKUMJFAQSA-N Phe-Gln-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O KOUUGTKGEQZRHV-KKUMJFAQSA-N 0.000 description 1
- MGBRZXXGQBAULP-DRZSPHRISA-N Phe-Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MGBRZXXGQBAULP-DRZSPHRISA-N 0.000 description 1
- CSDMCMITJLKBAH-SOUVJXGZSA-N Phe-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O CSDMCMITJLKBAH-SOUVJXGZSA-N 0.000 description 1
- APJPXSFJBMMOLW-KBPBESRZSA-N Phe-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 APJPXSFJBMMOLW-KBPBESRZSA-N 0.000 description 1
- QPVFUAUFEBPIPT-CDMKHQONSA-N Phe-Gly-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O QPVFUAUFEBPIPT-CDMKHQONSA-N 0.000 description 1
- WKTSCAXSYITIJJ-PCBIJLKTSA-N Phe-Ile-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O WKTSCAXSYITIJJ-PCBIJLKTSA-N 0.000 description 1
- HTXVATDVCRFORF-MGHWNKPDSA-N Phe-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N HTXVATDVCRFORF-MGHWNKPDSA-N 0.000 description 1
- CWFGECHCRMGPPT-MXAVVETBSA-N Phe-Ile-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O CWFGECHCRMGPPT-MXAVVETBSA-N 0.000 description 1
- SMFGCTXUBWEPKM-KBPBESRZSA-N Phe-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 SMFGCTXUBWEPKM-KBPBESRZSA-N 0.000 description 1
- YTILBRIUASDGBL-BZSNNMDCSA-N Phe-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 YTILBRIUASDGBL-BZSNNMDCSA-N 0.000 description 1
- CMHTUJQZQXFNTQ-OEAJRASXSA-N Phe-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=CC=C1)N)O CMHTUJQZQXFNTQ-OEAJRASXSA-N 0.000 description 1
- ZUQACJLOHYRVPJ-DKIMLUQUSA-N Phe-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZUQACJLOHYRVPJ-DKIMLUQUSA-N 0.000 description 1
- DOXQMJCSSYZSNM-BZSNNMDCSA-N Phe-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O DOXQMJCSSYZSNM-BZSNNMDCSA-N 0.000 description 1
- KLXQWABNAWDRAY-ACRUOGEOSA-N Phe-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 KLXQWABNAWDRAY-ACRUOGEOSA-N 0.000 description 1
- PTLMYJOMJLTMCB-KKUMJFAQSA-N Phe-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N PTLMYJOMJLTMCB-KKUMJFAQSA-N 0.000 description 1
- RYQWALWYQWBUKN-FHWLQOOXSA-N Phe-Phe-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RYQWALWYQWBUKN-FHWLQOOXSA-N 0.000 description 1
- DSXPMZMSJHOKKK-HJOGWXRNSA-N Phe-Phe-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DSXPMZMSJHOKKK-HJOGWXRNSA-N 0.000 description 1
- WWPAHTZOWURIMR-ULQDDVLXSA-N Phe-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 WWPAHTZOWURIMR-ULQDDVLXSA-N 0.000 description 1
- YMIZSYUAZJSOFL-SRVKXCTJSA-N Phe-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O YMIZSYUAZJSOFL-SRVKXCTJSA-N 0.000 description 1
- BONHGTUEEPIMPM-AVGNSLFASA-N Phe-Ser-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O BONHGTUEEPIMPM-AVGNSLFASA-N 0.000 description 1
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 1
- RAGOJJCBGXARPO-XVSYOHENSA-N Phe-Thr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 RAGOJJCBGXARPO-XVSYOHENSA-N 0.000 description 1
- CXMSESHALPOLRE-MEYUZBJRSA-N Phe-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O CXMSESHALPOLRE-MEYUZBJRSA-N 0.000 description 1
- YFXXRYFWJFQAFW-JHYOHUSXSA-N Phe-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O YFXXRYFWJFQAFW-JHYOHUSXSA-N 0.000 description 1
- BTAIJUBAGLVFKQ-BVSLBCMMSA-N Phe-Trp-Val Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](C(C)C)C(O)=O)C1=CC=CC=C1 BTAIJUBAGLVFKQ-BVSLBCMMSA-N 0.000 description 1
- IPVPGAADZXRZSH-RNXOBYDBSA-N Phe-Tyr-Trp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IPVPGAADZXRZSH-RNXOBYDBSA-N 0.000 description 1
- JSGWNFKWZNPDAV-YDHLFZDLSA-N Phe-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JSGWNFKWZNPDAV-YDHLFZDLSA-N 0.000 description 1
- IEIFEYBAYFSRBQ-IHRRRGAJSA-N Phe-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N IEIFEYBAYFSRBQ-IHRRRGAJSA-N 0.000 description 1
- QBFONMUYNSNKIX-AVGNSLFASA-N Pro-Arg-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O QBFONMUYNSNKIX-AVGNSLFASA-N 0.000 description 1
- GRIRJQGZZJVANI-CYDGBPFRSA-N Pro-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H]1CCCN1 GRIRJQGZZJVANI-CYDGBPFRSA-N 0.000 description 1
- XROLYVMNVIKVEM-BQBZGAKWSA-N Pro-Asn-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O XROLYVMNVIKVEM-BQBZGAKWSA-N 0.000 description 1
- AHXPYZRZRMQOAU-QXEWZRGKSA-N Pro-Asn-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1)C(O)=O AHXPYZRZRMQOAU-QXEWZRGKSA-N 0.000 description 1
- GDXZRWYXJSGWIV-GMOBBJLQSA-N Pro-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 GDXZRWYXJSGWIV-GMOBBJLQSA-N 0.000 description 1
- GQLOZEMWEBDEAY-NAKRPEOUSA-N Pro-Cys-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GQLOZEMWEBDEAY-NAKRPEOUSA-N 0.000 description 1
- ODPIUQVTULPQEP-CIUDSAMLSA-N Pro-Gln-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ODPIUQVTULPQEP-CIUDSAMLSA-N 0.000 description 1
- DIFXZGPHVCIVSQ-CIUDSAMLSA-N Pro-Gln-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DIFXZGPHVCIVSQ-CIUDSAMLSA-N 0.000 description 1
- QGOZJLYCGRYYRW-KKUMJFAQSA-N Pro-Glu-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QGOZJLYCGRYYRW-KKUMJFAQSA-N 0.000 description 1
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 1
- IBGCFJDLCYTKPW-NAKRPEOUSA-N Pro-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 IBGCFJDLCYTKPW-NAKRPEOUSA-N 0.000 description 1
- AUQGUYPHJSMAKI-CYDGBPFRSA-N Pro-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 AUQGUYPHJSMAKI-CYDGBPFRSA-N 0.000 description 1
- MRYUJHGPZQNOAD-IHRRRGAJSA-N Pro-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 MRYUJHGPZQNOAD-IHRRRGAJSA-N 0.000 description 1
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 1
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 1
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 1
- YAZNFQUKPUASKB-DCAQKATOSA-N Pro-Lys-Cys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)O YAZNFQUKPUASKB-DCAQKATOSA-N 0.000 description 1
- XQPHBAKJJJZOBX-SRVKXCTJSA-N Pro-Lys-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O XQPHBAKJJJZOBX-SRVKXCTJSA-N 0.000 description 1
- IQAGKQWXVHTPOT-FHWLQOOXSA-N Pro-Lys-Trp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O IQAGKQWXVHTPOT-FHWLQOOXSA-N 0.000 description 1
- VGVCNKSUVSZEIE-IHRRRGAJSA-N Pro-Phe-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O VGVCNKSUVSZEIE-IHRRRGAJSA-N 0.000 description 1
- PRKWBYCXBBSLSK-GUBZILKMSA-N Pro-Ser-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O PRKWBYCXBBSLSK-GUBZILKMSA-N 0.000 description 1
- QUBVFEANYYWBTM-VEVYYDQMSA-N Pro-Thr-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUBVFEANYYWBTM-VEVYYDQMSA-N 0.000 description 1
- YIPFBJGBRCJJJD-FHWLQOOXSA-N Pro-Trp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 YIPFBJGBRCJJJD-FHWLQOOXSA-N 0.000 description 1
- ZAUHSLVPDLNTRZ-QXEWZRGKSA-N Pro-Val-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZAUHSLVPDLNTRZ-QXEWZRGKSA-N 0.000 description 1
- STGVYUTZKGPRCI-GUBZILKMSA-N Pro-Val-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 STGVYUTZKGPRCI-GUBZILKMSA-N 0.000 description 1
- OQSGBXGNAFQGGS-CYDGBPFRSA-N Pro-Val-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OQSGBXGNAFQGGS-CYDGBPFRSA-N 0.000 description 1
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 1
- MTMJNKFZDQEVSY-BZSNNMDCSA-N Pro-Val-Trp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O MTMJNKFZDQEVSY-BZSNNMDCSA-N 0.000 description 1
- FHJQROWZEJFZPO-SRVKXCTJSA-N Pro-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FHJQROWZEJFZPO-SRVKXCTJSA-N 0.000 description 1
- 241000396459 Pseudoalteromonas translucida Species 0.000 description 1
- 108010079005 RDV peptide Proteins 0.000 description 1
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 1
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 1
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 1
- QVOGDCQNGLBNCR-FXQIFTODSA-N Ser-Arg-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O QVOGDCQNGLBNCR-FXQIFTODSA-N 0.000 description 1
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 1
- DKKGAAJTDKHWOD-BIIVOSGPSA-N Ser-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N)C(=O)O DKKGAAJTDKHWOD-BIIVOSGPSA-N 0.000 description 1
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 1
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 1
- SQBLRDDJTUJDMV-ACZMJKKPSA-N Ser-Glu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQBLRDDJTUJDMV-ACZMJKKPSA-N 0.000 description 1
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 1
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 1
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 1
- JFWDJFULOLKQFY-QWRGUYRKSA-N Ser-Gly-Phe Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JFWDJFULOLKQFY-QWRGUYRKSA-N 0.000 description 1
- FYUIFUJFNCLUIX-XVYDVKMFSA-N Ser-His-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O FYUIFUJFNCLUIX-XVYDVKMFSA-N 0.000 description 1
- CAOYHZOWXFFAIR-CIUDSAMLSA-N Ser-His-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O CAOYHZOWXFFAIR-CIUDSAMLSA-N 0.000 description 1
- DJACUBDEDBZKLQ-KBIXCLLPSA-N Ser-Ile-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O DJACUBDEDBZKLQ-KBIXCLLPSA-N 0.000 description 1
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 1
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 1
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 1
- JWOBLHJRDADHLN-KKUMJFAQSA-N Ser-Leu-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JWOBLHJRDADHLN-KKUMJFAQSA-N 0.000 description 1
- GZSZPKSBVAOGIE-CIUDSAMLSA-N Ser-Lys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O GZSZPKSBVAOGIE-CIUDSAMLSA-N 0.000 description 1
- LRWBCWGEUCKDTN-BJDJZHNGSA-N Ser-Lys-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LRWBCWGEUCKDTN-BJDJZHNGSA-N 0.000 description 1
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 1
- GDUZTEQRAOXYJS-SRVKXCTJSA-N Ser-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GDUZTEQRAOXYJS-SRVKXCTJSA-N 0.000 description 1
- XKFJENWJGHMDLI-QWRGUYRKSA-N Ser-Phe-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O XKFJENWJGHMDLI-QWRGUYRKSA-N 0.000 description 1
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 1
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 1
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 1
- KKKVOZNCLALMPV-XKBZYTNZSA-N Ser-Thr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KKKVOZNCLALMPV-XKBZYTNZSA-N 0.000 description 1
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 1
- WMZVVNLPHFSUPA-BPUTZDHNSA-N Ser-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 WMZVVNLPHFSUPA-BPUTZDHNSA-N 0.000 description 1
- UQGAAZXSCGWMFU-UBHSHLNASA-N Ser-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N UQGAAZXSCGWMFU-UBHSHLNASA-N 0.000 description 1
- FVFUOQIYDPAIJR-XIRDDKMYSA-N Ser-Trp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CO)N FVFUOQIYDPAIJR-XIRDDKMYSA-N 0.000 description 1
- UBTNVMGPMYDYIU-HJPIBITLSA-N Ser-Tyr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UBTNVMGPMYDYIU-HJPIBITLSA-N 0.000 description 1
- PLQWGQUNUPMNOD-KKUMJFAQSA-N Ser-Tyr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O PLQWGQUNUPMNOD-KKUMJFAQSA-N 0.000 description 1
- ZVBCMFDJIMUELU-BZSNNMDCSA-N Ser-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N ZVBCMFDJIMUELU-BZSNNMDCSA-N 0.000 description 1
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 1
- KEAYESYHFKHZAL-UHFFFAOYSA-N Sodium Chemical compound [Na] KEAYESYHFKHZAL-UHFFFAOYSA-N 0.000 description 1
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 1
- PXQUBKWZENPDGE-CIQUZCHMSA-N Thr-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)O)N PXQUBKWZENPDGE-CIQUZCHMSA-N 0.000 description 1
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 1
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 1
- CEXFELBFVHLYDZ-XGEHTFHBSA-N Thr-Arg-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CEXFELBFVHLYDZ-XGEHTFHBSA-N 0.000 description 1
- UNURFMVMXLENAZ-KJEVXHAQSA-N Thr-Arg-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UNURFMVMXLENAZ-KJEVXHAQSA-N 0.000 description 1
- SWIKDOUVROTZCW-GCJQMDKQSA-N Thr-Asn-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O SWIKDOUVROTZCW-GCJQMDKQSA-N 0.000 description 1
- VIBXMCZWVUOZLA-OLHMAJIHSA-N Thr-Asn-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O VIBXMCZWVUOZLA-OLHMAJIHSA-N 0.000 description 1
- JTEICXDKGWKRRV-HJGDQZAQSA-N Thr-Asn-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O JTEICXDKGWKRRV-HJGDQZAQSA-N 0.000 description 1
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 1
- VXMHQKHDKCATDV-VEVYYDQMSA-N Thr-Asp-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VXMHQKHDKCATDV-VEVYYDQMSA-N 0.000 description 1
- GNHRVXYZKWSJTF-HJGDQZAQSA-N Thr-Asp-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GNHRVXYZKWSJTF-HJGDQZAQSA-N 0.000 description 1
- LYGKYFKSZTUXGZ-ZDLURKLDSA-N Thr-Cys-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)NCC(O)=O LYGKYFKSZTUXGZ-ZDLURKLDSA-N 0.000 description 1
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- KBBRNEDOYWMIJP-KYNKHSRBSA-N Thr-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KBBRNEDOYWMIJP-KYNKHSRBSA-N 0.000 description 1
- VUSAEKOXGNEYNE-PBCZWWQYSA-N Thr-His-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O VUSAEKOXGNEYNE-PBCZWWQYSA-N 0.000 description 1
- KRGDDWVBBDLPSJ-CUJWVEQBSA-N Thr-His-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O KRGDDWVBBDLPSJ-CUJWVEQBSA-N 0.000 description 1
- SXAGUVRFGJSFKC-ZEILLAHLSA-N Thr-His-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SXAGUVRFGJSFKC-ZEILLAHLSA-N 0.000 description 1
- PAXANSWUSVPFNK-IUKAMOBKSA-N Thr-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N PAXANSWUSVPFNK-IUKAMOBKSA-N 0.000 description 1
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 1
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- ZXIHABSKUITPTN-IXOXFDKPSA-N Thr-Lys-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O ZXIHABSKUITPTN-IXOXFDKPSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/74—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/90—Vectors containing a transposable element
Landscapes
- Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Medicinal Chemistry (AREA)
- Crystallography & Structural Chemistry (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Mycology (AREA)
- Enzymes And Modification Thereof (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
The invention discloses a novel CRISPR-associated transposase derived from Pseudoalteromonas translucida KMM520. It recognizes 16 PAMs and can insert a cargo gene into 8 sites in a single round with 100% efficiency, which is higher than that of the CRISPR-associated transposase from Vibrio cholerae Tn6677. It transposes a 15.4 kb cargo gene to the corresponding target site with 100% efficiency, and it can be used in the same E. coli cell together with the CRISPR-associated transposase from Vibrio cholerae Tn6677, with the two functioning without interfering with each other. It therefore has broad application prospects.
Description
Technical Field
The invention belongs to the technical field of gene editing and relates, in particular, to a novel CRISPR-associated transposase and its use in gene editing.
Background Art
In metabolic engineering, the dosage and ratio of enzyme genes can be adjusted so that different enzymes in a pathway, or different pathways, are coordinated to maximize the overall reaction rate. The most common way to increase the dosage of a gene expression cassette in bacteria is to use plasmids, but plasmids suffer from genetic instability. Integrating expression cassettes into the bacterial chromosome one at a time is time-consuming, which makes it hard to test many construction schemes quickly. Although CRISPR-Cas can cut multi-copy genomic sequences, or target several different sequences simultaneously with a crRNA array, it is limited by the efficiency of double-strand break repair, so inserting more than 3 copies of a gene expression cassette at once is difficult. A better method for obtaining combinations of different gene dosages is therefore needed.
In 2019, Feng Zhang's group at the Broad Institute and Sam Sternberg's group at Columbia University published two similar results: using bacterial transposable elements to insert DNA sequences precisely into the genome without cutting the DNA. Zhang's group isolated a transposase from cyanobacteria and named it CAST (CRISPR-associated transposase). Sternberg's group, after discovering a distinctive transposable element in Vibrio cholerae, developed a gene-editing tool named INTEGRATE (insertion of transposable elements by guide RNA-assisted targeting), which uses the transposable element to insert large DNA fragments into the genome without introducing DNA breaks. Both the CAST system and the INTEGRATE system can integrate DNA fragments by transposition at preset sites on the E. coli chromosome, independently of homologous recombination, leaving no resistance marker and no double-stranded DNA break.
Building on the CAST and INTEGRATE systems, the inventors developed a multicopy gene insertion system, MUCICAT (see patent document CN202010083919.7), which yields strains carrying 10 chromosomal copies of a cargo gene within 5 days and is programmable, fast, marker-free, and site-specific. MUCICAT has been applied successfully to the construction of enzyme-engineering and metabolic-engineering strains (Zhang, Y., Multicopy chromosomal integration using CRISPR-associated transposases. ACS Synthetic Biology, 2020.). However, metabolic engineering and synthetic biology usually involve optimizing the expression levels of multiple enzymes or multiple pathways. In one round of transposition, MUCICAT based on a single CRISPR-associated transposase can only explore the optimal dosage of a single cargo carrying one or more genes; it cannot screen for the optimal ratio of gene dosages among multiple genes or pathways. No technique for amplifying the chromosomal copy number of multiple genes or pathways in parallel has yet been reported. A MUCICAT system containing several mutually orthogonal CRISPR-associated transposases could enable simultaneous, independent amplification of multiple genes or pathways in a single cell, producing a strain library with different gene-dosage ratios from which the optimal ratio can be screened.
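The combinatorial point above can be illustrated with a minimal sketch, which is not part of the disclosed method. It assumes two orthogonal transposases, each independently inserting between 1 and a hypothetical maximum number of copies of its own cargo, and enumerates the copy-number combinations and the distinct dosage ratios a strain library could then cover. The function name `dosage_library` and the copy-number limits are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def dosage_library(max_copies_a: int, max_copies_b: int):
    """Enumerate copy-number pairs for two cargos and the distinct A:B ratios.

    max_copies_a and max_copies_b are hypothetical upper limits on how many
    copies each orthogonal transposase inserts; they are not values from the
    patent.
    """
    # Every (copies of cargo A, copies of cargo B) combination.
    pairs = list(product(range(1, max_copies_a + 1),
                         range(1, max_copies_b + 1)))
    # Distinct dosage ratios, reduced to lowest terms and sorted.
    ratios = sorted({Fraction(a, b) for a, b in pairs})
    return pairs, ratios


pairs, ratios = dosage_library(3, 3)
print(len(pairs), "copy-number combinations")
print([str(r) for r in ratios])
```

With a single transposase only one axis of this grid is accessible per round, which is the limitation the paragraph describes.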
Among CRISPR transposases with known activity, only the type I-F3 CRISPR transposase from Vibrio cholerae Tn6677 shows both high insertion efficiency and a high on-target rate (Klompe, S.E., et al., Transposon-encoded CRISPR–Cas systems direct RNA-guided DNA integration. Nature, 2019. 571(7764): p. 219-225.). The type V-K systems from Scytonema hofmannii and Anabaena cylindrica (Jonathan Strecker et al., RNA-guided DNA insertion with CRISPR-associated transposases. Science, 2019), the type I-F3 system from Aeromonas salmonicida S44 (Michael T. Petassi et al., Guide RNA Categorization Enables Target Site Choice in Tn7-CRISPR-Cas Transposons. Cell, 2020), and the type I-B systems from Anabaena variabilis and Peltigera membranacea cyanobiont 210A (Makoto Saito, A.L. and Jonathan Strecker, Han Altae-Tran, Rhiannon K. Macrae, Feng Zhang, Dual modes of CRISPR-associated transposon homing. Cell, 2021.) all have low efficiency, a high off-target rate, or both.
Summary of the Invention
Given the characteristics of the existing CRISPR transposases, a new and efficient CRISPR-associated transposase would be desirable, since it could be combined with MUCICAT into an orthogonal CRISPR-associated transposition system for complex metabolic-engineering and synthetic-biology designs. To this end, the inventors extensively screened CRISPR transposases from other microorganisms and identified a type I-F3 CRISPR transposition system from Pseudoalteromonas translucida KMM520. In Escherichia coli its insertion efficiency is no lower than, and can exceed, that of Vibrio cholerae Tn6677, and the two systems can insert at the eda-purT intergenic site and the lacZ site, respectively, without interfering with each other. The new CRISPR transposition system recognizes all 16 PAMs and shows no PAM dependence.
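The figure of 16 PAMs matches the number of possible two-nucleotide motifs (4 × 4). The text does not state the PAM length, so treating the PAM as a dinucleotide is an assumption. Under that assumption, the set a PAM screen would need to cover can be enumerated with a short sketch; the function name `all_pams` is illustrative only.

```python
from itertools import product

def all_pams(length: int = 2) -> list[str]:
    """Enumerate every possible PAM of the given length over A, C, G, T.

    The default length of 2 is an assumption inferred from the count of 16.
    """
    return ["".join(bases) for bases in product("ACGT", repeat=length)]

pams = all_pams()
print(len(pams), pams)
```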
Accordingly, a first aspect of the invention provides a CRISPR-associated transposase comprising polypeptides selected from the following group, each derived from a bacterium of the genus Pseudoalteromonas: the transposase proteins tnsA, tnsB, tnsC, and tniQ, and the nuclease proteins Cas5/8, Cas6, and Cas7.
Preferably, the Pseudoalteromonas bacterium is Pseudoalteromonas translucida, more preferably Pseudoalteromonas translucida KMM520.
In a specific embodiment, the CRISPR-associated transposase preferably comprises polypeptides selected from the following group:
tnsA: a polypeptide having the amino acid sequence of SEQ ID NO: 1, or a functionally equivalent polypeptide with at least 95%, preferably at least 98%, more preferably at least 99% homology to SEQ ID NO: 1;
tnsB: a polypeptide having the amino acid sequence of SEQ ID NO: 2, or a functionally equivalent polypeptide with at least 95%, preferably at least 98%, more preferably at least 99% homology to SEQ ID NO: 2;
tnsC: a polypeptide having the amino acid sequence of SEQ ID NO: 3, or a functionally equivalent polypeptide with at least 95%, preferably at least 98%, more preferably at least 99% homology to SEQ ID NO: 3;
tniQ: a polypeptide having the amino acid sequence of SEQ ID NO: 4, or a functionally equivalent polypeptide with at least 95%, preferably at least 98%, more preferably at least 99% homology to SEQ ID NO: 4;
Cas5/8: a polypeptide having the amino acid sequence of SEQ ID NO: 5, or a functionally equivalent polypeptide with at least 95%, preferably at least 98%, more preferably at least 99% homology to SEQ ID NO: 5;
Cas6: a polypeptide having the amino acid sequence of SEQ ID NO: 6, or a functionally equivalent polypeptide with at least 95%, preferably at least 98%, more preferably at least 99% homology to SEQ ID NO: 6;
Cas7: a polypeptide having the amino acid sequence of SEQ ID NO: 7, or a functionally equivalent polypeptide with at least 95%, preferably at least 98%, more preferably at least 99% homology to SEQ ID NO: 7.
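The 95%, 98%, and 99% thresholds above can be checked for a candidate sequence with a sketch like the following. It assumes that "homology" is measured as percent identity over a pairwise alignment, which the text does not specify, and that the two sequences are already aligned (equal length, gaps written as `-`). The function names and tier labels are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

def homology_tier(identity: float) -> str:
    """Map an identity value onto the tiers recited for SEQ ID NOs: 1-7."""
    if identity >= 99.0:
        return "more preferred (>=99%)"
    if identity >= 98.0:
        return "preferred (>=98%)"
    if identity >= 95.0:
        return "within range (>=95%)"
    return "below 95%"
```

Identity alone does not establish the "functionally equivalent" condition, which has to be shown experimentally.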
The nucleases Cas5/8, Cas7, and Cas6 form the Cascade complex of a type I CRISPR system; in association with the transposases tnsABC and tniQ they constitute the CRISPR-associated transposase. All of these polypeptides are derived from Pseudoalteromonas translucida KMM520.
A second aspect of the invention provides genes encoding the above polypeptides.
In a specific embodiment, the gene encoding the polypeptide tnsA having the amino acid sequence of SEQ ID NO: 1 is the nucleotide sequence of SEQ ID NO: 8, or a polynucleotide with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 8;
the gene encoding the polypeptide tnsB having the amino acid sequence of SEQ ID NO: 2 is the nucleotide sequence of SEQ ID NO: 9, or a polynucleotide with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 9;
the gene encoding the polypeptide tnsC having the amino acid sequence of SEQ ID NO: 3 is the nucleotide sequence of SEQ ID NO: 10, or a polynucleotide with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 10;
the gene encoding the polypeptide tniQ having the amino acid sequence of SEQ ID NO: 4 is the nucleotide sequence of SEQ ID NO: 11, or a polynucleotide with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 11;
the gene encoding the polypeptide Cas5/8 having the amino acid sequence of SEQ ID NO: 5 is the nucleotide sequence of SEQ ID NO: 12, or a polynucleotide with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 12;
the gene encoding the polypeptide Cas6 having the amino acid sequence of SEQ ID NO: 6 is the nucleotide sequence of SEQ ID NO: 13, or a polynucleotide with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 13;
the gene encoding the polypeptide Cas7 having the amino acid sequence of SEQ ID NO: 7 is the nucleotide sequence of SEQ ID NO: 14, or a polynucleotide with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 14.
A third aspect of the invention provides a CRISPR transposon system. Like the INTEGRATE and CAST transposition systems, the CRISPR transposon system of the invention comprises the plasmid pQCascade, the helper plasmid pTns, and the helper plasmid pDonor, which carries the cargo gene. Any two of these three plasmids can be merged into one plasmid, or all three can be merged into a single plasmid. Specifically:
A plasmid pQCascade for the CRISPR transposon system comprises gene fragments selected from the following group: the above Cas5/8-encoding gene; the above Cas6-encoding gene; the above Cas7-encoding gene; the above tniQ-encoding gene.
Because these polypeptides are all derived from Pseudoalteromonas translucida KMM520, this pQCascade plasmid is named pQCascadePtr herein. Similarly, the plasmids pTns and pDonor of the invention are referred to below as pTnsPtr and pDonorPtr, respectively.
Preferably, the plasmid pQCascadePtr may further comprise the following elements: a crRNA sequence targeting the genomic target site, the CloDF13 replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a streptomycin resistance gene.
The crRNA of the above CRISPR transposase can function in the form of an array.
In one embodiment, the spacer of the crRNA sequence may target a single site in the genome of the cell to be treated.
In another embodiment, the crRNA sequence may be a crRNA array (also called a CRISPR array, or simply an array) targeting multiple sites in the genome of the cell to be treated.
Preferably, the crRNA sequence is a crRNA array targeting multiple genomic sites, in which the repeat region comprises one or more sequences selected from the following group: repeat1 (SEQ ID NO: 15), repeat2 (SEQ ID NO: 16), repeat3 (SEQ ID NO: 17), and repeat4 (SEQ ID NO: 18). Each of the 32 N positions ([N32]) in SEQ ID NOs: 15-18 may be any base: A, T, G, or C.
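Because SEQ ID NOs: 15-18 each carry a 32-nucleotide [N32] placeholder, designing an array amounts to substituting one 32-nt spacer into each repeat unit. The sketch below illustrates that substitution. The template strings are hypothetical stand-ins, not the actual SEQ ID NO sequences, and joining the filled units by direct concatenation is an assumption about array layout.

```python
SPACER_LEN = 32
PLACEHOLDER = "N" * SPACER_LEN

# Hypothetical stand-ins for SEQ ID NOs: 15-18; the real flanking repeat
# sequences must be taken from the sequence listing.
TEMPLATES = [
    "AAAA" + PLACEHOLDER + "CCCC",
    "GGGG" + PLACEHOLDER + "TTTT",
]

def fill_spacer(template: str, spacer: str) -> str:
    """Replace the [N32] placeholder of one repeat unit with a 32-nt spacer."""
    spacer = spacer.upper()
    if len(spacer) != SPACER_LEN or set(spacer) - set("ATGC"):
        raise ValueError("spacer must be exactly 32 nt of A, T, G or C")
    if template.count(PLACEHOLDER) != 1:
        raise ValueError("template must contain exactly one N32 placeholder")
    return template.replace(PLACEHOLDER, spacer)

def build_array(templates: list[str], spacers: list[str]) -> str:
    """Fill one repeat unit per spacer and join the units (assumed layout)."""
    if len(spacers) > len(templates):
        raise ValueError("more spacers than available repeat units")
    return "".join(fill_spacer(t, s) for t, s in zip(templates, spacers))
```

A single-site design corresponds to calling `build_array` with one spacer; a multi-site array uses up to one spacer per repeat unit.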
A helper plasmid pTnsPtr for the CRISPR transposon system, used together with the plasmid pQCascadePtr, comprises gene fragments selected from the following group: the above tnsA-encoding gene, the above tnsB-encoding gene, and the above tnsC-encoding gene.
When used for CRISPR transposition, the above CRISPR transposase gene sequences must act together with a promoter that functions in the host (Gram-negative bacteria such as Escherichia coli, Vibrio natriegens, and Tatumella citrea; Gram-positive bacteria such as Corynebacterium glutamicum) and with a Left end (LE)-cargo-Right end (RE) element, whether these are carried on a plasmid or integrated.
Preferably, the helper plasmid pTnsPtr further comprises the following elements: the ColA replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a kanamycin resistance gene.
As an embodiment in which two plasmids are merged into one, the invention provides a plasmid pQCasTnsPtr for the CRISPR transposon system, formed by merging the plasmid pQCascadePtr and the helper plasmid pTnsPtr. It comprises: the above Cas5/8, Cas6, Cas7, tniQ, tnsA, tnsB, and tnsC genes; a crRNA sequence targeting the genomic target site; the ColA replicon; a promoter such as an anhydrotetracycline-inducible promoter; and a kanamycin resistance gene.
A helper plasmid pDonorPtr for the CRISPR transposon system, used together with the plasmid pQCascadePtr and the helper plasmid pTnsPtr, comprises gene fragments selected from the following group: a Left end (LE) sequence, whose nucleotide sequence is SEQ ID NO: 19, or a sequence with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 19 that contains the 33 bp at the 3' end of SEQ ID NO: 19; a Right end (RE) sequence, whose nucleotide sequence is SEQ ID NO: 20, or a sequence with at least 80%, preferably at least 85%, more preferably at least 90%, still more preferably at least 95% homology to SEQ ID NO: 20 that contains the 27 bp at the 5' end of SEQ ID NO: 20; and the target cargo gene.
Because the LE of SEQ ID NO: 19 and the RE of SEQ ID NO: 20 are derived from Pseudoalteromonas translucida KMM520, this pDonor plasmid is named pDonorPtr herein.
Preferably, the helper plasmid pDonorPtr may further comprise either the pMB1 replicon, an ampicillin resistance gene, and the target cargo gene; or the p15A replicon, a chloramphenicol resistance gene, and the target cargo gene.
As an embodiment in which all three plasmids are merged into one, the invention provides a plasmid pEffectorPtr for the CRISPR transposon system, formed by merging the plasmid pQCascadePtr, the helper plasmid pTnsPtr, and the helper plasmid pDonorPtr. It comprises: the above Cas5/8, Cas6, Cas7, tniQ, tnsA, tnsB, and tnsC genes; the Left end (LE) and Right end (RE) sequences; the target cargo gene; a crRNA sequence targeting the genomic target site; the ColA replicon; a promoter such as an anhydrotetracycline-inducible promoter; and a kanamycin resistance gene.
Based on the above plasmids, a fourth aspect of the invention provides a CRISPR transposon system comprising: plasmids pQCascadePtr, pTnsPtr, and pDonorPtr; or plasmids pQCasTnsPtr and pDonorPtr; or plasmid pEffectorPtr.
A fifth aspect of the invention provides the combined use of the above CRISPR transposon system derived from Pseudoalteromonas translucida KMM520 with the prior-art CRISPR transposon system derived from Vibrio cholerae Tn6677. Specifically:
A CRISPR transposon system that, in addition to the above CRISPR transposon system (plasmids pQCascadePtr, pTnsPtr, and pDonorPtr; or plasmids pQCasTnsPtr and pDonorPtr; or plasmid pEffectorPtr), further comprises CRISPR transposase-related plasmids derived from Vibrio cholerae Tn6677.
The gene sequence of the type I-F3 CRISPR transposase of Vibrio cholerae Tn6677 is NCBI: NZ_ALED01000027.1.
The CRISPR transposase-related plasmids derived from Vibrio cholerae Tn6677 comprise the plasmid pQCasTnsVch and the helper plasmid pDonorVch, wherein
plasmid pQCasTnsVch comprises: the Cas5/8, Cas6, Cas7, tniQ, tnsA, tnsB, and tnsC genes derived from Vibrio cholerae Tn6677; the CloDF13 replicon; a promoter such as an anhydrotetracycline-inducible promoter; and a streptomycin resistance gene;
plasmid pDonorVch comprises: the CRISPR array, Left end (LE), and Right end (RE) derived from Vibrio cholerae Tn6677; the pMB1 replicon; an ampicillin resistance gene; and the target cargo gene.
In one embodiment, analogous to the merged plasmids pQCasTnsPtr and pEffectorPtr of the invention, the plasmids pQCasTnsVch and pDonorVch can be merged into the plasmid pEffectorVch.
Where the promoter is an anhydrotetracycline-inducible promoter, anhydrotetracycline can be used for induction when the CRISPR transposon system derived from Pseudoalteromonas translucida KMM520 is used to transform strains such as Escherichia coli BL21(DE3), BL21 Star™(DE3), or W3110(DE3). That is, when plasmids pQCasTnsPtr and pDonorPtr are used to transform such strains, induction is performed with anhydrotetracycline.
Similarly, where the promoter is an anhydrotetracycline-inducible promoter, anhydrotetracycline can also be used for induction when the CRISPR transposon system derived from Pseudoalteromonas translucida KMM520 is used in combination with the CRISPR transposon system derived from Vibrio cholerae Tn6677. That is, when plasmids pEffectorPtr and pEffectorVch, or plasmids pQCasTnsVch, pDonorVch, pQCasTnsPtr, and pDonorPtr, are used to transform strains such as Escherichia coli BL21(DE3), BL21 Star™(DE3), or W3110(DE3), induction is performed with anhydrotetracycline.
The CRISPR transposon systems from these two microorganisms can be used in the same Escherichia coli cell and function without interfering with each other; that is, they act orthogonally.
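When four plasmids share one cell, it is common practice to give each a distinct replicon and selection marker. The sketch below tabulates the replicons and resistance genes recited above and reports any element shared within a chosen set. Treating a shared replicon or marker as a conflict is a general assumption rather than a statement from the text, and the text does not say which pDonorPtr variant is used in the four-plasmid combination.

```python
from collections import Counter

# (replicon, resistance marker) as recited in the description.
PLASMIDS = {
    "pQCasTnsPtr": ("ColA", "kanamycin"),
    "pDonorPtr (pMB1)": ("pMB1", "ampicillin"),
    "pDonorPtr (p15A)": ("p15A", "chloramphenicol"),
    "pQCasTnsVch": ("CloDF13", "streptomycin"),
    "pDonorVch": ("pMB1", "ampicillin"),
}

def shared_elements(names: list[str]) -> list[str]:
    """Return replicons or markers carried by more than one plasmid in the set."""
    replicons = Counter(PLASMIDS[n][0] for n in names)
    markers = Counter(PLASMIDS[n][1] for n in names)
    return sorted(element
                  for counts in (replicons, markers)
                  for element, k in counts.items() if k > 1)

base = ["pQCasTnsVch", "pDonorVch", "pQCasTnsPtr"]
print(shared_elements(base + ["pDonorPtr (p15A)"]))  # no shared element
print(shared_elements(base + ["pDonorPtr (pMB1)"]))  # pMB1 and ampicillin shared
```

On this reading, the p15A/chloramphenicol variant of pDonorPtr is the one that avoids overlap with pDonorVch.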
A sixth aspect of the invention provides the use in gene editing of the above CRISPR-associated transposase, the above genes encoding the polypeptides, the above plasmid pQCascadePtr, the above plasmid pTnsPtr, plasmid pQCasTnsPtr, plasmid pDonorPtr, plasmid pEffectorPtr, and the above CRISPR transposon system.
The gene editing may be gene editing in any cell, especially in cells of microorganisms (including fungi and bacteria), and in particular in cells of industrial microorganisms.
Preferably, the bacteria to be edited include, but are not limited to, Gram-negative bacteria such as Escherichia coli, Vibrio natriegens, and Tatumella citrea, and Gram-positive bacteria such as Corynebacterium glutamicum.
Experiments show that the new CRISPR-associated transposase system developed in the invention can insert the cargo gene at 8 sites in a single step with 100% efficiency, which is higher than the efficiency of the CRISPR-associated transposase from Vibrio cholerae Tn6677; it transposes a 15.4 kb cargo gene to the corresponding target with 100% efficiency; and it recognizes 16 PAMs. The invention is the first to use two CRISPR-associated transposase systems in the same Escherichia coli cell, where they function without interfering with each other, and it therefore offers an option for accelerating the construction of metabolically engineered strains.
Brief Description of the Drawings
Figure 1 is a gel electrophoresis image for targeting of the crRNA3 (lacZ) site in Example 1.
Figure 2 is a gel electrophoresis image showing insertion of the cargo gene GFP at 8 sites in the strain genome in Example 2. NC is the negative control.
Figure 3 is a bar chart summarizing insertion at the 8 genomic sites of each clone in Example 2, as verified by colony PCR and nucleic acid gel electrophoresis. The horizontal axis is the copy number of the cargo gene GFP, and the vertical axis is the insertion rate of the cargo gene GFP.
Figure 4 is a gel electrophoresis image showing insertion of the cargo gene GFP carried by Ptr and the cargo "terminator sequence" carried by Vch at 6 sites in the strain genome in Example 3. NC is the negative control.
Figure 5 is a schematic diagram of the structure of plasmid pQCascadePtr.
Figure 6 is a schematic diagram of the structure of plasmid pDonorPtr.
Figure 7 is a schematic diagram of the structure of plasmid pTnsPtr.
Figure 8 is a schematic diagram of the structure of plasmid pQCasTnsPtr.
Figure 9 is a schematic diagram of the structure of plasmid pQCasTnsVch.
Figure 10 is a schematic diagram of the structure of plasmid pEffectorPtr.
Figure 11 is a schematic diagram of the structure of plasmid pEffectorVch.
Figure 12 is a schematic diagram of the structure of plasmid pVnQCasTnsPtr.
Figure 13 is a schematic diagram of the structure of plasmid pCgQCasTnsVch.
Figure 14 is a schematic diagram of the structure of plasmid pCgDonorPtr.
Figure 15 is a gel electrophoresis image for targeting of the crtYf gene locus of Corynebacterium glutamicum strain ATCC13032 in Example 5.
Detailed Description
The new CRISPR-associated transposase system of the invention is derived from Pseudoalteromonas translucida KMM520. It further develops and improves on the gene-editing tools CAST, INTEGRATE, and MUCICAT, particularly in gene copy number and cargo gene insertion efficiency.
For brevity, the term "CRISPR-associated transposase system" is sometimes shortened herein to "CRISPR-associated transposase", "CRISPR transposase", or "CRISPR transposon system". These terms have the same meaning and are used interchangeably.
For brevity, the name of a protein such as Cas6 is sometimes used herein for both the protein and the gene (DNA) that encodes it. Those skilled in the art will understand from the context which is meant. For example, tnsA refers to the protein when the function or class of the transposase is described, and to the gene encoding the tnsA transposase protein when it is described as a gene.
Similarly, the name of an RNA such as crRNA is sometimes used for both the RNA and the gene that encodes it, and those skilled in the art will understand from the context which is meant.
Each of the plasmids provided by the invention, including pQCascadePtr, pTnsPtr, pDonorPtr, pQCasTnsPtr, and pEffectorPtr, contains multiple genetic elements. For example, plasmid pQCascadePtr contains the Cas5/8 gene, the Cas6 gene, the Cas7 gene, the tniQ-encoding gene, the crRNA gene, the CloDF13 replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a streptomycin resistance gene. These elements may be arranged in any order; those skilled in the art can arrange them as they normally would and readily prepare the plasmids.
It should be understood that, for the genes encoding the polypeptides Cas5/8, Cas7, Cas6, tnsABC (that is, tnsA, tnsB, and tnsC), and tniQ of the invention, those skilled in the art can perform codon optimization for the specific type of cell to be treated, such as Escherichia coli, and the genes are not limited to the nucleotide sequences of SEQ ID NOs: 8-14.
The purpose of codon optimization is to allow these polypeptides to be expressed optimally in the cell to be treated. Codon optimization is a technique for maximizing protein expression in an organism by increasing the translation efficiency of the gene of interest. Because of mutational bias and natural selection, different organisms usually show a particular preference for one of the several codons that encode the same amino acid. In fast-growing microorganisms such as Escherichia coli, for example, the preferred codons reflect the composition of the organism's genomic tRNA pool. In such microorganisms, low-frequency codons for an amino acid can therefore be replaced with high-frequency codons for the same amino acid, which improves expression of the optimized DNA sequence.
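The codon replacement described above can be sketched as follows. This is a minimal "most frequent codon" approach, one of several possible strategies and not necessarily the one used for the sequences of the invention. Codon frequencies are counted from reference coding sequences supplied by the user; the reference shown is a toy example, not real Escherichia coli usage data, and the function names are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def split_codons(cds: str) -> list[str]:
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]

def translate(cds: str) -> str:
    return "".join(CODON_TABLE[c] for c in split_codons(cds))

def codon_usage(reference_cds: list[str]) -> Counter:
    """Count codon occurrences across host reference coding sequences."""
    counts = Counter()
    for cds in reference_cds:
        counts.update(split_codons(cds))
    return counts

def optimize(cds: str, usage: Counter) -> str:
    """Swap each codon for the most frequent synonymous codon in `usage`.

    Codons for amino acids absent from the reference are left unchanged.
    """
    best = {}
    for codon, aa in CODON_TABLE.items():
        if usage.get(codon, 0) > usage.get(best.get(aa), 0):
            best[aa] = codon
    return "".join(best.get(CODON_TABLE[c], c) for c in split_codons(cds))

# Toy reference in which CTG is the dominant leucine codon.
usage = codon_usage(["ATGCTGCTGCTGTTA"])
print(optimize("ATGTTATTAGGT", usage))
```

The encoded protein is unchanged by construction, which can be confirmed by comparing `translate` output before and after optimization.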
The invention is further described below with reference to specific examples. These examples are for illustration only and do not limit the invention. Changes or adjustments made by those skilled in the art after reading the concept of the invention fall within its scope of protection, and such equivalents likewise fall within the scope defined by the claims appended to this application.
This document refers to the amounts, contents, and concentrations of various substances. Unless otherwise specified, percentages are percentages by mass.
In the examples herein, where no reaction or operating temperature is specified, the temperature is room temperature (15-30 °C).
Examples
Materials and methods
In the examples, whole-gene synthesis was performed by Nanjing GenScript Biotechnology Co., Ltd., primer synthesis by Boshang Biotechnology (Shanghai) Co., Ltd. (铂尚生物技术(上海)有限公司), and sequencing by Tsingke Biotechnology Co., Ltd.
The molecular biology procedures in the examples, including plasmid construction, restriction digestion, ligation, competent cell preparation, transformation, and medium preparation, mainly follow Molecular Cloning: A Laboratory Manual (3rd edition; J. Sambrook and D.W. Russell; translated by Huang Peitang et al.; Science Press, Beijing, 2002). Where necessary, specific experimental conditions can be determined by simple tests.
PCR amplification was carried out according to the reaction conditions provided by the supplier of the plasmid or DNA template, or according to the kit instructions, and adjusted by simple tests where necessary.
LB medium: 10 g/L tryptone, 5 g/L yeast extract, 10 g/L sodium chloride, pH 7.2, autoclaved at 121 °C for 20 min. For solid medium, add 20 g/L agar powder.
LBv2 medium: LB medium supplemented with v2 salts (204 mmol/L NaCl, 4.2 mmol/L KCl, 23.14 mmol/L MgCl2).
BHIS medium: 37 g/L BHI, 91 g/L sorbitol. For solid medium, add 20 g/L agar powder.
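As a convenience, the g/L recipes above can be scaled to any batch volume. The following sketch assumes only the LB and BHIS compositions stated above; the function name `weigh_out` and the dictionary layout are hypothetical, and LBv2 is left out because its salts are specified in mmol/L.

```python
# Component concentrations in g/L, taken from the recipes above.
RECIPES_G_PER_L = {
    "LB": {"tryptone": 10.0, "yeast extract": 5.0, "NaCl": 10.0},
    "BHIS": {"BHI": 37.0, "sorbitol": 91.0},
}
AGAR_G_PER_L = 20.0  # added only for solid medium


def weigh_out(medium: str, volume_l: float, solid: bool = False) -> dict:
    """Return the mass in grams of each component for the given volume."""
    masses = {name: conc * volume_l for name, conc in RECIPES_G_PER_L[medium].items()}
    if solid:
        masses["agar"] = AGAR_G_PER_L * volume_l
    return masses


# Example: 500 mL of solid LB medium.
for component, grams in weigh_out("LB", 0.5, solid=True).items():
    print(f"{component}: {grams:g} g")
```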
The plasmids pQCascadePtr, pTnsPtr and pCgQCasTnsPtr used in the examples were constructed and synthesized on commission by Nanjing GenScript Biotechnology Co., Ltd. The plasmids pDonorPtr, pQCasTnsPtr, pEffectorPtr, pCgDonorPtr, pQCasTnsVch, pDonorVch and pEffectorVch were constructed by Yang Sheng's research group at the Center for Excellence in Molecular Plant Science, Chinese Academy of Sciences. Any organization or individual may obtain these plasmids to verify the present invention, but without permission from the Center they may not be used for other purposes, including development and exploitation, scientific research and teaching. Specifically:
Plasmid pQCascadePtr contains the genes tniQ (SEQ ID NO: 11), Cas5/8 (SEQ ID NO: 12), Cas7 (SEQ ID NO: 14) and Cas6 (SEQ ID NO: 13) derived from Pseudoalteromonas translucida KMM520, a crRNA sequence targeting the genomic target site, a CloDF13 replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a streptomycin resistance gene. It was synthesized on commission by Nanjing GenScript Biotechnology Co., Ltd.; its nucleotide sequence is SEQ ID NO: 21 and its structure is shown in Figure 5.
Plasmid pDonorPtr contains the LE (SEQ ID NO: 19) and RE (SEQ ID NO: 20) sequences derived from Pseudoalteromonas translucida KMM520, a pMB1 replicon, an ampicillin resistance gene, and the target cargo gene (Cargo), for example a promoterless chloramphenicol resistance (CmR) gene fragment. Its structure is shown in Figure 6.
Plasmid pTnsPtr contains the genes tnsA (SEQ ID NO: 8), tnsB (SEQ ID NO: 9) and tnsC (SEQ ID NO: 10) derived from Pseudoalteromonas translucida KMM520, a ColA replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a kanamycin resistance gene. It was synthesized on commission by Nanjing GenScript Biotechnology Co., Ltd.; its nucleotide sequence is SEQ ID NO: 22 and its structure is shown in Figure 7.
Plasmid pQCasTnsPtr contains the genes tnsA (SEQ ID NO: 8), tnsB (SEQ ID NO: 9), tnsC (SEQ ID NO: 10), tniQ (SEQ ID NO: 11), Cas5/8 (SEQ ID NO: 12), Cas7 (SEQ ID NO: 14) and Cas6 (SEQ ID NO: 13) derived from Pseudoalteromonas translucida KMM520, a crRNA sequence targeting the genomic target site, a ColA replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a kanamycin resistance gene. Its structure is shown in Figure 8.
Plasmid pQCasTnsVch contains the genes Cas5/8, Cas7, Cas6, tniQ, tnsA, tnsB and tnsC derived from Vibrio cholerae Tn6677 together with a CRISPR array, a CloDF13 replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a streptomycin resistance gene. Its structure is shown in Figure 9. It was constructed by Yang Sheng's research group at the Center for Excellence in Molecular Plant Science, Chinese Academy of Sciences.
Plasmid pEffectorPtr contains the genes tnsA (SEQ ID NO: 8), tnsB (SEQ ID NO: 9), tnsC (SEQ ID NO: 10), tniQ (SEQ ID NO: 11), Cas5/8 (SEQ ID NO: 12), Cas7 (SEQ ID NO: 14) and Cas6 (SEQ ID NO: 13) derived from Pseudoalteromonas translucida KMM520, a crRNA sequence targeting the genomic target site, the LE (SEQ ID NO: 19) and RE (SEQ ID NO: 20) sequences, the target cargo gene, a ColA replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a kanamycin resistance gene. Its structure is shown in Figure 10.
Plasmid pEffectorVch contains the genes Cas5/8, Cas7, Cas6, tniQ, tnsA, tnsB and tnsC derived from Vibrio cholerae Tn6677 together with a CRISPR array, the LE and RE sequences, the target cargo gene, a CloDF13 replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a streptomycin resistance gene. Its structure is shown in Figure 11. It was constructed by Yang Sheng's research group at the Center for Excellence in Molecular Plant Science, Chinese Academy of Sciences.
Example 1: Verifying the transposition activity of the CRISPR-associated transposase
By targeting genomic lacZ and the lacZ sequence upstream of the T7 RNA polymerase gene in Escherichia coli BL21 Star™ (DE3), the CRISPR-associated enzyme derived from Pseudoalteromonas translucida KMM520 was shown to have programmable transposition activity.
The primers used to construct and verify plasmid pQCascadePtr-cr3, which targets genomic lacZ and the lacZ sequence upstream of the T7 RNA polymerase gene, are listed in Table 1. The suffix "-cr3" in the plasmid name pQCascadePtr-cr3 indicates that the construct targets genomic lacZ.
Table 1: Primer sequences
Note: in primer names in the table, the suffix F denotes a forward primer and R a reverse primer.
1.1 Restriction digestion to obtain the pQCascadePtr backbone fragment
Plasmid pQCascadePtr was digested with NcoI and BamHI to obtain the pQCascadePtr backbone fragment.
Digestion reaction system (50 μL)
Digestion conditions: 37 °C, 1 h. The digested plasmid fragments were separated by gel electrophoresis and recovered from the gel.
The restriction endonuclease kit was purchased from Thermo Fisher, and the DNA gel recovery kit from Shanghai Tolo Harbor Biotechnology Co., Ltd.
1.2 Primer annealing
Annealing the primer pair PtrDR-F and PtrDR-R from Table 1 yields a DR fragment with 4 bp sticky ends, which are complementary to the sticky ends of the pQCascadePtr plasmid backbone after digestion with NcoI and BamHI.
Annealing system (50 μL)
Hold at 95 °C for 5 min, lower the temperature by 5-10 °C per minute, hold at 16 °C for 10 min, then dilute 20-fold with ddH2O and set aside for use.
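The annealing ramp can be written out as an explicit list of temperature holds, for example when entering it into a thermocycler as discrete steps. The sketch below is one possible reading of the stated conditions, assuming a fixed cooling step of one value between 5 and 10 °C applied once per minute; the function name `annealing_schedule` is hypothetical.

```python
def annealing_schedule(step_c: float = 5.0) -> list:
    """Return (temperature in °C, hold time in min) steps for oligo annealing.

    Assumes a constant cooling step per minute, chosen within the 5-10 °C
    range given above.
    """
    if not 5.0 <= step_c <= 10.0:
        raise ValueError("cooling step should be between 5 and 10 °C per minute")
    steps = [(95.0, 5)]            # initial denaturation hold
    temp = 95.0 - step_c
    while temp > 16.0:             # cool in one-minute steps
        steps.append((temp, 1))
        temp -= step_c
    steps.append((16.0, 10))       # final hold
    return steps


schedule = annealing_schedule(5.0)
total_min = sum(minutes for _, minutes in schedule)
print(f"{len(schedule)} steps, {total_min} min in total")
```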
1.3 Ligation to construct plasmid pQCascadePtr-DR
T4 ligation system (10 μL)
Ligate at 16 °C for 1 h. The T4 ligase kit was purchased from TAKARA.
The entire ligation product was transformed into DH5α competent cells (purchased from Shenzhen Kangti Life Technology Co., Ltd.) and selected on LB solid plates containing streptomycin to obtain plasmid pQCascadePtr-DR, which was verified as correct by sequencing with primer 8.
1.4 Restriction digestion to obtain the pQCascadePtr-DR backbone fragment
Plasmid pQCascadePtr-DR was digested with BsaI to obtain the pQCascadePtr-DR backbone fragment.
Digestion reaction system (50 μL)
Digestion conditions: 37 °C, 1 h. The digested plasmid fragments were separated by gel electrophoresis and recovered from the gel.
The restriction endonuclease kit was purchased from Thermo Fisher, and the DNA gel recovery kit from Tolo Harbor.
1.5 Primer annealing
Annealing the primer pair Ptrcr3-F and Ptrcr3-R from Table 1 yields a DR fragment with 4 bp sticky ends, which are complementary to the sticky ends of the pQCascadePtr-DR plasmid backbone after digestion with BsaI.
Annealing system (50 μL)
Hold at 95 °C for 5 min, lower the temperature by 5-10 °C per minute, hold at 16 °C for 10 min, then dilute 20-fold with ddH2O and set aside for use.
1.6 Ligation to construct plasmid pQCascadePtr-cr3
T4 ligation system (10 μL)
Ligate at 16 °C for 1 h. The T4 ligase kit was purchased from TAKARA.
The entire ligation product was transformed into DH5α competent cells (purchased from Shenzhen Kangti Life Technology Co., Ltd.) and selected on LB solid plates containing streptomycin to obtain plasmid pQCascadePtr-cr3, which was verified as correct by sequencing with primer 8.
1.7 Transformation of the transposition tool plasmids and induction of transposition
pDonorPtr and pTnsPtr were electroporated into Escherichia coli BL21 Star™ (DE3), and transformants were selected at 37 °C on LB solid plates containing ampicillin and kanamycin. Positive clones were picked and made electrocompetent, pQCascadePtr-cr3 was electroporated into them, and selection at 37 °C on LB solid plates containing ampicillin, kanamycin and streptomycin gave an Escherichia coli BL21 Star™ (DE3) strain carrying pDonorPtr, pTnsPtr and pQCascadePtr-cr3. A portion of the clones was scraped from this plate, resuspended in liquid LB medium and re-spread on LB solid plates containing anhydrotetracycline at a final concentration of 100 ng/mL together with ampicillin, kanamycin and streptomycin; the anhydrotetracycline induces expression of the transposition-related enzymes. The plates were incubated at 37 °C for 16 h; a bacterial lawn may form, which is normal. A portion of the clones was then scraped from the 100 ng/mL anhydrotetracycline plate and resuspended in liquid LB medium, the OD600 was adjusted to about 0.5, the suspension was diluted 50-fold with liquid LB medium, and 100 μL was spread on LB solid plates containing anhydrotetracycline at a final concentration of 1000 ng/mL together with ampicillin, kanamycin and streptomycin, followed by incubation at 37 °C for 24 h.
1.8 Colony PCR to determine the efficiency of crRNA3 targeting
The efficiency of insertion at the two sites targeted by crRNA3 was verified by colony PCR using the primer pairs crRNA3-F/crRNA3-R and crRNA3-R/T7lacZcr3-R from Table 1, which lie upstream and downstream of the insertion sites.
Colony PCR reaction system (10 μL):
PCRMix was purchased from Vazyme.
PCR reaction conditions:
The donor insert on plasmid pDonorPtr comprises LE (left end), RE (right end) and the cargo gene CmR fragment (a promoterless chloramphenicol resistance gene fragment), 1433 bp in total. The positive bands are 1601/1759 bp and the negative bands 168/326 bp. All 16 clones counted carried insertions at both sites. The gel electrophoresis image is shown in Figure 1.
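The band sizes above follow from simple addition: the positive band is the negative (no insertion) amplicon plus the 1433 bp donor insert. The sketch below reproduces that arithmetic and adds a small helper for scoring an observed band. The helper names and the 50 bp tolerance are illustrative assumptions, not part of the described method.

```python
INSERT_BP = 1433  # LE + cargo (CmR fragment) + RE, as stated above


def expected_positive_band(negative_bp: int, insert_bp: int = INSERT_BP) -> int:
    """Amplicon size after insertion = amplicon without insertion + insert."""
    return negative_bp + insert_bp


def classify_band(observed_bp: int, negative_bp: int,
                  insert_bp: int = INSERT_BP, tolerance_bp: int = 50) -> str:
    """Score an observed band; the tolerance is an arbitrary illustrative value."""
    if abs(observed_bp - expected_positive_band(negative_bp, insert_bp)) <= tolerance_bp:
        return "insertion"
    if abs(observed_bp - negative_bp) <= tolerance_bp:
        return "no insertion"
    return "ambiguous"


for negative in (168, 326):
    print(negative, "->", expected_positive_band(negative))
```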
Figure 1 clearly shows the bands of the cargo gene CmR fragment. Targeted insertion of this CmR fragment into genomic lacZ and the lacZ sequence upstream of the T7 RNA polymerase gene in Escherichia coli BL21 Star™ (DE3) confirms that the CRISPR-associated transposase derived from Pseudoalteromonas translucida KMM520 has programmable transposition activity.
Example 2: Multi-copy integration by targeting 8 different genomic sites with an array
2.1 Construction of plasmid pQCasTnsPtr-array8
Following the method of Example 2 of patent document CN202010083919.7, crRNAs targeting 8 different sites in the Escherichia coli BL21 Star™ (DE3) genome were combined into an array sequence in which 9 fixed direct repeats alternate with 8 spacers targeting different sites. The array was inserted between the NcoI and BamHI sites of plasmid pQCasTnsPtr to construct plasmid pQCasTnsPtr-array8. Synthesis of the array sequence and construction of the plasmid were carried out by Nanjing GenScript Biotechnology Co., Ltd.
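The layout of such an array, with n spacers flanked and separated by n + 1 direct repeats, can be sketched as follows. The repeat and spacer values here are placeholder labels rather than the actual sequences, which are not reproduced in this section, and the function name `build_array` is hypothetical.

```python
def build_array(repeat: str, spacers: list) -> list:
    """Interleave spacers with direct repeats: DR-sp1-DR-sp2-...-spN-DR."""
    parts = [repeat]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(repeat)
    return parts


# Placeholder labels standing in for real sequences.
spacers = [f"spacer{i}" for i in range(1, 9)]
array_parts = build_array("DR", spacers)

print("-".join(array_parts))
print(array_parts.count("DR"), "repeats,", len(spacers), "spacers")
```

With real sequences, joining the parts with an empty string would give the contiguous array to be synthesized.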
2.2 Transformation of the transposition tool plasmids and induction of transposition
Plasmids pDonorPtr and pQCasTnsPtr-array8 were electroporated into Escherichia coli BL21 Star™ (DE3); the transformation procedure was the same as in step 1.7, as were induction of transposition on plates and passaging. The cargo gene in plasmid pDonorPtr is the green fluorescent protein (GFP) gene (about 1.29 kb).
2.3 Colony PCR to identify insertions at the 8 genomic sites
Forward and reverse primers were designed upstream and downstream of each of the 8 genomic sites for verification; their sequences are given in Table 2.
Table 2: Primer sequences for verifying insertion at the 8 genomic sites
Note: in primer names in the table, the suffix F denotes a forward primer and R a reverse primer.
The PCR system and reaction conditions were the same as in step 1.8.
Insertion at the 8 genomic sites of each clone was verified by colony PCR and nucleic acid gel electrophoresis. As shown in Figure 2, insertion of the cargo gene at all 8 sites can be completed in a single round in the target colonies; the distribution of cargo gene copy number is shown in Figure 3. This indicates that with the novel CRISPR-associated transposase the cargo gene insertion efficiency is 100%, higher than that of the CRISPR-associated transposase derived from Vibrio cholerae Tn6677.
Example 3: Orthogonality experiment with the two CRISPR-associated transposases
The orthogonality of the novel CRISPR-associated transposase derived from Pseudoalteromonas translucida KMM520 and the CRISPR-associated transposase derived from Vibrio cholerae Tn6677 was examined.
3.1 Construction of plasmids pQCasTnsPtr-nagman and pQCasTnsVch-cr3
crRNAs targeting nagB, nagE and manX in the Escherichia coli BL21 Star™ (DE3) genome were combined into an array sequence in which 5 fixed direct repeats alternate with 4 spacers targeting different sites. The array was inserted between the NcoI and BamHI sites of plasmid pQCasTnsPtr to construct plasmid pQCasTnsPtr-nagman. Synthesis of the array sequence and construction of the plasmid were carried out by Nanjing GenScript Biotechnology Co., Ltd.
Plasmid pQCasTnsVch-cr3 was constructed as in steps 1.4, 1.5 and 1.6; plasmid pQCasTnsVch came from our laboratory.
The primers used for plasmid construction and identification are listed in Table 3.
Table 3: Primers used for construction and identification of plasmids pQCasTnsPtr-nagman and pQCasTnsVch-cr3
3.2 Transformation of the transposition tool plasmids and induction of transposition
Plasmids pDonorPtr, pQCasTnsPtr-nagman, pDonorVch and pQCasTnsVch-cr3 were electroporated into Escherichia coli BL21 Star™ (DE3); the transformation procedure was the same as in step 1.7, as were induction of transposition on plates and passaging. The cargo gene carried by pDonorPtr is the green fluorescent protein (GFP) gene (about 1.29 kb), and the cargo carried by pDonorVch is a terminator sequence (about 0.64 kb).
3.3 Colony PCR to identify genomic insertions
The CRISPR-associated transposase derived from Vibrio cholerae Tn6677 was used to target genomic lacZ and the lacZ sequence upstream of the T7 RNA polymerase gene, with a cargo of 0.64 kb. The crRNA array of pQCasTnsPtr was designed to target genomic nagB, nagE and manX, with the green fluorescent protein (GFP) gene (about 1.29 kb) as cargo. After the transposition experiment in Escherichia coli, PCR with primers upstream and downstream of the lacZ target sites should give positive bands of 1.0 kb and 1.17 kb and negative bands of 0.17 kb and 0.33 kb; PCR with primers upstream and downstream of the nagB, nagE and manX target sites should give positive bands of about 2.47 kb, 2.70 kb, 2.49 kb and 2.36 kb and negative bands of 0.43 kb, 0.66 kb, 0.46 kb and 0.33 kb. As shown in Figure 4, the Vibrio cholerae Tn6677-derived CRISPR-associated transposase and pQCasTnsPtr each targeted their corresponding sites and inserted their corresponding cargo (the GFP gene for pDonorPtr, the terminator sequence for pDonorVch), giving strains carrying the 2 and 4 cargo copies, respectively, with an efficiency of about 100%. This result was verified by sequencing.
Figure 4 shows the insertions at 6 sites in the strain genome of the GFP cargo gene carried by Ptr and the terminator sequence cargo carried by Vch. It indicates that the two CRISPR-associated transposases can be used in the same Escherichia coli cell and function without interfering with each other, which offers an option for accelerating the construction of metabolically engineered strains.
Example 4: Verifying the transposition activity of the CRISPR-associated transposase in Vibrio natriegens
By targeting the wbfF gene in the genome of Vibrio natriegens ATCC14048, the CRISPR-associated enzyme derived from Pseudoalteromonas translucida KMM520 was shown to have programmable transposition activity in Vibrio natriegens.
Plasmid pVnQCasTnsPtr contains the genes tnsA (SEQ ID NO: 8), tnsB (SEQ ID NO: 9), tnsC (SEQ ID NO: 10), tniQ (SEQ ID NO: 11), Cas5/8 (SEQ ID NO: 12), Cas7 (SEQ ID NO: 14) and Cas6 (SEQ ID NO: 13) derived from Pseudoalteromonas translucida KMM520, a crRNA sequence targeting the genomic target site, a p15A replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a chloramphenicol resistance gene. The plasmid structure is shown in Figure 12.
Plasmid pDonorPtr-GFP contains the LE (SEQ ID NO: 19) and RE (SEQ ID NO: 20) sequences derived from Pseudoalteromonas translucida KMM520, a pMB1 replicon, an ampicillin resistance gene, and the target cargo gene (Cargo), a promoterless green fluorescent protein gene fragment.
The primers used to construct and verify plasmid pVnQCasTnsPtr-wbfF, which targets genomic wbfF, are listed in Table 4. The suffix "-wbfF" in the plasmid name pVnQCasTnsPtr-wbfF indicates that the construct targets genomic wbfF. Plasmid pVnQCasTnsPtr-wbfF was constructed as in steps 1.4, 1.5 and 1.6.
Table 4: Primer sequences
Note: in primer names in the table, the suffix F denotes a forward primer and R a reverse primer.
4.1 Transformation of the transposition tool plasmids and induction of transposition
pDonorPtr-GFP was electroporated into Vibrio natriegens ATCC14048, and transformants were selected at 30 °C on LBv2 solid plates containing ampicillin. Positive clones were picked and made electrocompetent, pVnQCasTnsPtr-wbfF was electroporated into them, and selection at 30 °C on LBv2 solid plates containing ampicillin and chloramphenicol gave a Vibrio natriegens ATCC14048 strain carrying pDonorPtr-GFP and pVnQCasTnsPtr-wbfF. A portion of the clones was scraped from this plate, resuspended in liquid LBv2 medium and re-spread on LBv2 solid plates containing anhydrotetracycline at a final concentration of 100 ng/mL together with ampicillin and chloramphenicol; the anhydrotetracycline induces expression of the transposition-related enzymes. The plates were incubated at 30 °C for 12 h; a bacterial lawn may form, which is normal. A portion of the clones was then scraped from the 100 ng/mL anhydrotetracycline plate and resuspended in liquid LBv2 medium, the OD600 was adjusted to about 0.5, the suspension was diluted 50-fold with liquid LBv2 medium, and 100 μL was spread on LBv2 solid plates containing anhydrotetracycline at a final concentration of 1000 ng/mL together with ampicillin and chloramphenicol, followed by incubation at 30 °C for 12 h.
4.2 Colony PCR to determine the efficiency of wbfF targeting
The efficiency of insertion at the two sites targeting wbfF was verified by colony PCR using the primer pair wbfF-test-F/wbfF-test-R from Table 4, which lies upstream and downstream of the insertion site. The colony PCR method was the same as in step 1.8.
The donor insert on plasmid pDonorPtr-GFP comprises LE (left end), RE (right end) and the cargo gene GFP fragment (a promoterless green fluorescent protein gene fragment), 720 bp in total. The positive band is 1220 bp and the negative band 500 bp. All 16 clones counted carried the insertion. This confirms that the CRISPR-associated transposase derived from Pseudoalteromonas translucida KMM520 also has programmable transposition activity in Vibrio natriegens.
Example 5: Verifying the transposition activity of the CRISPR-associated transposase in Corynebacterium glutamicum
The plasmid pCgQCasTnsPtr used in this example contains the genes tnsA (SEQ ID NO: 8), tnsB (SEQ ID NO: 9), tnsC (SEQ ID NO: 10), tniQ (SEQ ID NO: 11), Cas5/8 (SEQ ID NO: 12), Cas7 (SEQ ID NO: 14) and Cas6 (SEQ ID NO: 13) derived from Pseudoalteromonas translucida KMM520, all codon-optimized for Corynebacterium glutamicum, together with a crRNA sequence targeting the genomic target site, a ColA replicon, a pBL1ts replicon, a promoter such as an anhydrotetracycline-inducible promoter, and a kanamycin resistance gene. Its structure is shown in Figure 13.
Plasmid pCgDonorPtr contains the LE (SEQ ID NO: 19) and RE (SEQ ID NO: 20) sequences derived from Pseudoalteromonas translucida KMM520, a pMB1 replicon, a pGA1 replicon, a spectinomycin resistance gene, and the target cargo gene (Cargo), for example a promoterless chloramphenicol resistance (CmR) gene fragment. Its structure is shown in Figure 14.
By targeting the crtYf gene in the genome of Corynebacterium glutamicum ATCC13032, the transposition activity of the CRISPR-associated enzyme derived from Pseudoalteromonas translucida KMM520 in Corynebacterium glutamicum was confirmed.
The crRNA sequence constructed to target the genomic crtYf gene is:
AGGCAACCATAGGGCAGGAATCAGAAGTACTG.
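When checking a targeting sequence such as the one above against a genome, it is often handy to have its length, GC fraction and reverse complement at hand, since the target may lie on either strand. The following sketch computes these with the Python standard library only; the helper names are hypothetical, and no claim is made here about which strand or flanking motif the system requires.

```python
SPACER = "AGGCAACCATAGGGCAGGAATCAGAAGTACTG"  # crtYf-targeting sequence given above

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


print("length:", len(SPACER))
print("GC fraction:", gc_fraction(SPACER))
print("reverse complement:", reverse_complement(SPACER))
```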
5.1 Transformation of the transposition tool plasmids and induction of transposition
pCgDonorPtr and pCgQCasTnsPtr were electroporated into Corynebacterium glutamicum ATCC13032, and selection at 30 °C on BHIS solid plates containing spectinomycin and kanamycin gave a Corynebacterium glutamicum ATCC13032 strain carrying pCgDonorPtr and pCgQCasTnsPtr. A portion of the clones was scraped from this plate, resuspended in liquid BHIS medium and re-spread on BHIS solid plates containing anhydrotetracycline at a final concentration of 100 ng/mL together with spectinomycin and kanamycin; the anhydrotetracycline induces expression of the transposition-related enzymes. The plates were incubated at 30 °C for 24 h; a bacterial lawn may form, which is normal. A portion of the clones was then scraped from the 100 ng/mL anhydrotetracycline plate and resuspended in liquid BHIS medium, the OD600 was adjusted to about 0.5, the suspension was diluted 50-fold with liquid BHIS medium, and 100 μL was spread on BHIS solid plates containing anhydrotetracycline at a final concentration of 1000 ng/mL together with spectinomycin and kanamycin, followed by incubation at 30 °C for 48 h.
5.2 Colony PCR to determine the efficiency of crtYf targeting
The efficiency of crtYf targeting was verified by colony PCR using the primer pair F-crtYf (TGCTGTGGGAACTTTTCGGT) and R-crtYf (ACTACCACTCCCGAGGTTGA), which lies upstream and downstream of the insertion site.
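For orientation, basic properties of the two primers can be estimated with the Wallace rule, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). This is only a rough rule of thumb for short oligonucleotides, and it is an added illustration: the source does not state how the primers were designed or which annealing temperature was used. The function names are hypothetical.

```python
PRIMERS = {
    "F-crtYf": "TGCTGTGGGAACTTTTCGGT",
    "R-crtYf": "ACTACCACTCCCGAGGTTGA",
}


def gc_count(seq: str) -> int:
    """Number of G and C bases in the primer."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in °C: 2*(A+T) + 4*(G+C)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_count(seq)}/{len(seq)}, Tm ~{wallace_tm(seq)} °C")
```

Nearest-neighbour models give more reliable estimates; the PCR conditions actually used remain those referred to in step 1.8.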
The donor insert on plasmid pDonorPtr comprises LE (left end), RE (right end) and the cargo gene CmR fragment (a promoterless chloramphenicol resistance gene fragment), 1433 bp in total. The positive band is 2110 bp and the negative band 677 bp. All 6 clones counted carried the insertion. The gel electrophoresis image is shown in Figure 15 and confirms the transposition activity of the CRISPR-associated transposase derived from Pseudoalteromonas translucida KMM520 in Corynebacterium glutamicum.
序列表sequence listing
<110> 中国科学院分子植物科学卓越创新中心<110> Center for Excellence in Molecular Plant Science, Chinese Academy of Sciences
<120> 一种新型CRISPR相关转座酶<120> A novel CRISPR-associated transposase
<130> SHPI2110093<130> SHPI2110093
<160> 22<160> 22
<170> SIPOSequenceListing 1.0<170> SIPOSequenceListing 1.0
<210> 1
<211> 209
<212> PRT
<213> Pseudoalteromonas translucida KMM520
<400> 1
Met Tyr Arg Arg Lys Leu Lys Tyr Ser Arg Val Lys Asn Leu His Lys
1 5 10 15
Phe Ala Ser Gln Lys Asn Lys Ser Thr Cys Leu Val Glu Ser Ser Leu
20 25 30
Glu Phe Asp Ala Cys Phe His Phe Glu Phe Ser Pro Pro Ile Ala Ala
35 40 45
Phe Glu Ala Gln Pro Leu Gly Tyr Glu Tyr Glu Phe Asp Asn Arg Ile
50 55 60
Cys Arg Tyr Thr Pro Asp Phe Leu Leu Thr His Thr Asp Gly Thr Gln
65 70 75 80
Lys Phe Ile Glu Val Lys Pro Gln Ser Lys Ile Ala Asp Glu Asp Phe
85 90 95
Arg Ala Arg Phe Ile Glu Lys Gln Ala Ile Ala Lys Gln Asp Gly Arg
100 105 110
Asp Leu Ile Leu Val Thr Asp Lys Gln Ile Arg Val Tyr Pro Thr Leu
115 120 125
Asn Asn Leu Lys Leu Leu His Arg Tyr Ser Gly Phe Gln Ser Leu Thr
130 135 140
Glu Leu Gln Ala Ser Val Leu Glu Leu Val Lys Gln Tyr Gly Ser Ile
145 150 155 160
Lys Val Gly Gln Leu Ile Arg Tyr Leu Lys Val Thr Ala Gly Glu Leu
165 170 175
Leu Ala Thr Val Leu Arg Leu Leu Ser Leu Gly Gln Leu Phe Ala Asp
180 185 190
Leu Thr Thr Asn Glu Ile Ser Ile Glu Thr Ala Ile Trp Ser Asn Asn
195 200 205
Val
<210> 2
<211> 607
<212> PRT
<213> Pseudoalteromonas translucida KMM520
<400> 2
Met Phe Asn Asn Asp Leu Phe Asp Asp Glu Phe Asn Gln Pro Leu Pro
1 5 10 15
Lys Ala Glu Thr Lys Leu Pro Gln Asn Tyr Thr Lys Asp Leu Gln Ala
20 25 30
Leu Pro Glu Lys Ile Lys Thr Thr Thr Phe Ala Lys Leu Lys Tyr Ile
35 40 45
Gln Trp Leu Glu Ala Asn Ile Gln Gly Gly Trp Thr Gln Lys Asn Leu
50 55 60
Glu Pro Leu Leu Lys Leu Met Pro Asp Val Glu Gly Glu Lys Lys Pro
65 70 75 80
Ser Trp Arg Thr Ala Ala Arg Trp Tyr Ser Ala Tyr Thr Asn Ala Asp
85 90 95
Lys Asn Ile Met Ala Leu Ile Pro Ser His Gln Lys Lys Gly Asn Arg
100 105 110
Glu Arg Asp Thr Thr Thr Asp Lys Phe Phe Glu Lys Ala Leu Glu Arg
115 120 125
Tyr Leu Val Lys Glu Lys Pro Ser Val Ala Ser Ala Tyr Lys Phe Tyr
130 135 140
Lys Asp Leu Val Ile Ile Glu Asn Asp Ser Val Val Asp Ser Val Leu
145 150 155 160
Lys Pro Leu Thr Tyr Lys Ala Phe Lys Asn Arg Ile Asp Asn Leu Pro
165 170 175
Gln Tyr Glu Val Met Ile Ala Arg Tyr Gly Lys Arg Leu Ala Asp Ile
180 185 190
Ala Tyr Asn Lys Val Glu Gly His Lys Arg Pro Ile Arg Val Leu Glu
195 200 205
Lys Val Glu Ile Asp His Thr Pro Leu Asp Leu Ile Leu Leu Asp Asp
210 215 220
Glu Leu His Ile Pro Leu Gly Arg Pro Thr Leu Thr Met Leu Val Asp
225 230 235 240
Val Tyr Ser His Cys Ile Val Gly Tyr Tyr Phe Ser Phe Ser Glu Pro
245 250 255
Ser Tyr Asp Ala Val Arg Arg Ala Met Leu Asn Ala Met Lys Pro Lys
260 265 270
Ser Glu Val Ala Lys Leu Tyr Pro Asp Thr Ile Asn Glu Trp Lys Cys
275 280 285
Ala Gly Lys Ile Glu Thr Leu Val Val Asp Asn Gly Ala Glu Phe Trp
290 295 300
Ser Asn Ser Leu Glu Leu Ala Cys Glu Glu Ile Gly Ile Asn Thr Gln
305 310 315 320
Tyr Asn Pro Val Ala Lys Pro Trp Leu Lys Pro Phe Val Glu Arg Met
325 330 335
Phe Gly Thr Ile Asn Thr Glu Leu Leu Asp Pro Val Pro Gly Lys Thr
340 345 350
Phe Ser Asn Ile Leu Gln Lys His Glu Tyr Asn Pro Lys Lys Asp Ala
355 360 365
Ile Met Arg Phe Thr Thr Phe Met Gln Leu Phe His Lys Trp Val Val
370 375 380
Asp Val Tyr His Gln Asp Ala Asp Ser Arg Phe Lys Tyr Ile Pro Ser
385 390 395 400
Gln Leu Trp Asp Gln Gly Phe Asn Thr Leu Pro Pro Thr Met Leu Ser
405 410 415
Asp Ala Asp Leu Gln Gln Leu Asp Val Val Leu Ser Ile Ser Asn His
420 425 430
Arg Val Leu Arg Lys Gly Gly Ile Arg Leu Glu Asn Leu Ser Tyr Asp
435 440 445
Ser Thr Glu Leu Ala Asn Tyr Arg Lys Gln Phe Ser His Lys Val Ser
450 455 460
Gln Glu Val Leu Ile Lys Leu Asn Pro Asp Asp Ile Ser Tyr Ile Tyr
465 470 475 480
Val Tyr Leu Asp Lys Leu Glu His Tyr Ile Lys Val Pro Cys Ile Asp
485 490 495
Pro Asn Gly Tyr Thr Gln Asn Leu Ser Leu Asn Gln His Lys Ile Asn
500 505 510
Ile Arg Ile His Arg Asp Phe Ile Ser Gly Ser Ile Asp Asn Val Gly
515 520 525
Leu Ala Lys Ala Arg Met Phe Ile His Asn Lys Ile Gln Asn Glu Phe
530 535 540
Glu Glu Leu Lys Asn Ala Pro Lys His Ser Lys Val Lys Gly Gly Lys
545 550 555 560
Ala Leu Ala Lys His Gln Asn Ile Ser Ser Asp Ser Gln Lys Ser Ile
565 570 575
Thr His Ser Lys Pro Val Glu Ala Lys Lys Val Thr Pro Lys Glu Gln
580 585 590
Pro Thr Asp Ser Trp Asp Asp Phe Ile Ser Asp Leu Asp Gly Phe
595 600 605
<210> 3
<211> 333
<212> PRT
<213> Pseudoalteromonas translucida KMM520
<400> 3
Met Leu Thr Asp Lys Gln Lys Glu Lys Leu Asn Glu Phe Arg Asp Val
1 5 10 15
Phe Ile Glu Tyr Pro Ile Ile Thr Thr Ile Phe Asn Asp Phe Asp Arg
20 25 30
Leu Arg Leu Gly Lys Gly Leu Thr Gly Glu Lys Pro Cys Met Leu Leu
35 40 45
Asn Gly Asp Thr Gly Thr Gly Lys Thr Ala Leu Ile Lys Gln Tyr Lys
50 55 60
Glu Arg His Leu Pro Gln Phe Ile Asn Gly Val Met Asn His Pro Val
65 70 75 80
Leu Val Ser Arg Ile Pro Ser Asn Pro Thr Leu Glu Ser Thr Leu Ala
85 90 95
Glu Leu Leu Lys Asp Leu Gly Gln Val Gly Ser Thr Glu Arg Lys Leu
100 105 110
Arg Ile Asn Gly Thr Arg Leu Thr Thr Ser Leu Ile Lys Cys Leu Lys
115 120 125
Thr Cys Gly Thr Glu Leu Ile Ile Ile Asp Glu Phe Gln Glu Leu Ile
130 135 140
Glu His Asn Gln Gly Lys Lys Arg Arg Glu Ile Ala Asn Arg Leu Lys
145 150 155 160
Tyr Ile Asn Asp Glu Ala Gly Val Ser Ile Val Leu Val Gly Met Pro
165 170 175
Trp Ala Glu Lys Ile Ala Asp Glu Pro Gln Trp Ser Ser Arg Leu Leu
180 185 190
Ile Arg Arg Gln Leu Pro Tyr Phe Lys Leu Ser Glu Asn Pro Lys His
195 200 205
Phe Val Gln Leu Ile Ile Gly Leu Ala Asn Arg Met Pro Phe Ala Glu
210 215 220
Lys Pro Asn Leu Ser Glu Gln Ala Thr Val Phe Thr Leu Phe Ser Leu
225 230 235 240
Ser Lys Gly Cys Phe Arg Thr Leu Lys Tyr Phe Leu Asp Asp Ala Val
245 250 255
Leu Tyr Ala Leu Met Asp Asn Ala Lys Thr Leu Thr Thr Lys His Leu
260 265 270
Val Lys Ala Phe Glu Val Leu Phe Pro Asp Val Pro Asn Leu Phe Thr
275 280 285
Leu Pro Val Ala Glu Ile Thr Ala Ser Glu Val Glu Arg Tyr Ser Leu
290 295 300
Tyr Lys Pro Glu Ser Ser Gln Asp Glu Asp Pro Phe Ile Ala Thr Lys
305 310 315 320
Phe Thr Asp Arg Met Pro Ile Ser Gln Leu Leu Arg Lys
325 330
<210> 4
<211> 391
<212> PRT
<213> Pseudoalteromonas translucida KMM520
<400> 4
Met His Phe Leu Val Gln Thr Lys Ser Tyr Pro Asp Glu Ala Leu Glu
1 5 10 15
Ser Tyr Leu Leu Arg Leu Ala Arg Asp Asn Ser Tyr Asn Gly Tyr Ser
20 25 30
Glu Leu Ala Asp Ile Leu Trp Gln Trp Leu Ala Glu Gln Asp Asn Glu
35 40 45
Leu Glu Gly Ala Leu Pro Leu Ala Leu Ser Lys Val Asp Val Tyr His
50 55 60
Ala Arg Gln Ala Ser Ser Phe Arg Ile Arg Ala Leu Lys Leu Val Ala
65 70 75 80
Gln Leu Ala Asp Val Asn Ala Gly Asp Ile Leu Ala Leu Ala Trp Arg
85 90 95
Arg Ser Asn Phe Lys Phe Gly Asn Leu Ala Ala Val Ser Arg Asn Glu
100 105 110
Leu Ala Ile Pro Leu Glu Leu Leu Arg Thr Asp Asn Ile Pro Val Cys
115 120 125
Ile Lys Cys Leu Ser Glu Ser Ser His Ile Pro Phe Tyr Trp His Leu
130 135 140
Lys Pro Tyr Lys Ala Cys His Lys His Lys Ser Gln Leu Ile Thr Arg
145 150 155 160
Cys Lys Glu Cys Tyr Asp Leu Ile Asp Tyr Arg Ala Ser Glu Ala Phe
165 170 175
Leu Glu Cys Val Cys Gly Cys Lys Ile Thr Asn Ser Glu Gln Leu Asn
180 185 190
Asp Ala Asp Phe Lys Ile Ala Ile Ala Leu Ala Ser Ser Asn Ser Gln
195 200 205
Lys Ile Val Gly Leu Ile Ser Trp Phe Ala Lys Val Lys Gln Leu Asp
210 215 220
Val Ser Asp Ala Asp Phe Asn Cys Ala Phe Val Asp Tyr Phe Asn Thr
225 230 235 240
Trp Pro Glu Ser Leu Thr Thr Glu Leu Asp Leu Leu Thr Asn Asn Ala
245 250 255
Arg Leu Lys Gln Leu Asn Pro Phe Asn Lys Thr Lys Phe Ser Ser Val
260 265 270
Tyr Gly Asp Leu Ile Arg Asp Gly Gln Ile Ala Ala Thr Ser Asn Arg
275 280 285
Lys Asn Lys Val Ile Asp Glu Ile Ile Ser Tyr Phe Val Glu Leu Val
290 295 300
Asp Ser Asn Pro Lys Ala Lys His Pro Asn Ile Gly Asp Leu Leu Leu
305 310 315 320
Cys Thr Phe Asp Ala Ala Val Leu Leu Asn Thr Thr Thr Glu Gln Val
325 330 335
Tyr Arg Leu His Gln Glu Ala Phe Leu Asn Cys Ala Tyr Ser Gln Lys
340 345 350
Lys His Glu Gln Leu Arg Ala Asp Ser His Val Phe Tyr Leu Arg Gln
355 360 365
Val Ile Glu Leu Gln Gln Ala Phe Ala Ala Glu Lys Pro Leu Thr Lys
370 375 380
Lys Gln Phe Ile Ala Pro Trp
385 390
<210> 5
<211> 683
<212> PRT
<213> Pseudoalteromonas translucida KMM520
<400> 5
Met Asn Leu Gln Asp Ala Leu Ala Ile Glu Pro Leu Lys Glu Lys Thr
1 5 10 15
Thr Ala Leu Arg Lys Leu Phe Val Pro Tyr Thr Ser His Val Glu Val
20 25 30
Asp Gly Phe Glu Glu Leu Ala Leu Thr Val Leu Ile Asn Leu Val Tyr
35 40 45
Lys Arg Ser Glu Ile Asp Asp Leu Thr Ser Ala Arg Thr Ala Lys Ser
50 55 60
Val Leu Arg Asp Glu Val Leu Leu Ser Lys Cys Ile Asn Glu Val Lys
65 70 75 80
Trp Phe His Thr His Asn Leu Lys Tyr Pro Asp Ile Arg Val Ser His
85 90 95
Gln Arg Leu Ile Ser Glu Val Val Ser Glu Asp Ile Ala Gly Ile Cys
100 105 110
Ser Arg Ser Leu Pro Leu Ser Phe Gly Trp Ser His Asn Ser Ala Glu
115 120 125
Ile Asn His Ala Lys Leu Phe Leu Thr Ser Phe Asn Trp Gln Gly Glu
130 135 140
Val Thr Cys Leu Ala Arg Leu Leu Ile Asn Glu Glu Pro Val Trp Ile
145 150 155 160
Asn Leu Ile Arg Ala Tyr Gly Phe Thr Lys Lys Ala Val Leu Glu Ile
165 170 175
Ser Gly Lys Ile Lys Gln Gln Leu Pro Val Ala Glu Phe Pro Leu Glu
180 185 190
Val Ser Ser Phe Ser Pro Gln Leu Gln Met Pro Phe Gln Gln Ser Tyr
195 200 205
Leu Val Val Thr Pro Val Val Ser His Ala Met Leu Ala Lys Ile Gln
210 215 220
Gln Leu Thr Thr Asp Arg Lys Leu Asn Phe Ala Leu Val Glu His Ser
225 230 235 240
Arg Pro Ala Asn Val Gly Asp Leu Ala Ser Ser Val Gly Gly Asn Ile
245 250 255
Arg Val Leu Arg Tyr Phe Pro Lys Thr Tyr Ser Lys Ala Val Asn Arg
260 265 270
Ser Lys Val Ala Asn Asn Asp Ile Glu Lys Ala Phe Lys Ile Arg Ala
275 280 285
Leu Leu Ser Ser Gln Phe Gln Gln Ala Leu Leu Val Leu Val Gly Ile
290 295 300
Lys Gln Phe Asn Thr Leu Arg Gln Lys Arg Leu Ala Arg Val Ala Ala
305 310 315 320
Ile Arg Gln Val Arg Val Ser Leu Gln Leu Trp Leu Asp Asn Ile Leu
325 330 335
Glu Ala Lys Asn Asn Ala Gln Asn Gln Val Tyr Pro Glu Trp Val Arg
340 345 350
His Tyr Leu Asp Gln Ser Ile Thr Asn Cys Ile Ser Gln Phe Ser Asn
355 360 365
Val Leu Asn Glu Ser Leu Gly Asn Leu Ser Lys Leu Lys Arg Phe Ala
370 375 380
Tyr His Pro Asn Leu Met Gly Leu Phe Lys Ala Gln Leu Asn Tyr Val
385 390 395 400
Phe Thr His Cys Ala Ala Glu Gln Glu Ile Leu Asn Asp Glu Gln Ile
405 410 415
Val Tyr Val His Cys Gln Asp Met Arg Val Phe Asp Ala Glu Ala Met
420 425 430
Ala Asn Pro Tyr Ile Gln Gly Met Pro Ser Leu Thr Ala Leu Asn Gly
435 440 445
Leu Ala His Asn Phe Glu Arg Lys Leu Lys Asn Phe Ile Asp Pro Ser
450 455 460
Ile Lys Cys Ile Gly Ser Ala Ile Tyr Ile Glu Asn Tyr Gln Leu His
465 470 475 480
Thr Gly Lys Pro Leu Pro Glu Pro Ser Lys Leu Lys Gln Val Ala Gly
485 490 495
Arg Ser His Val Ile Arg Ser Gly Ile Ile Asp Lys Pro Lys Cys Asp
500 505 510
Ile Thr Leu Asp Leu Val Phe Arg Leu Phe Val Pro Asn Thr Glu Leu
515 520 525
Leu Asp Lys Leu Asn Ser Gln Leu Ile Lys Pro Ala Leu Pro Ser Ser
530 535 540
Phe Ala Gly Gly Thr Met His Pro Pro Ser Leu Tyr Gln Asn Ile Asp
545 550 555 560
Trp Cys His Val His Thr Lys Pro Ser Glu Leu Phe Lys Lys Leu Lys
565 570 575
Ala Lys Ser Ser Asn Gly Ser Trp Leu Tyr Pro Ser Lys Lys Val Val
580 585 590
Lys Ser Phe Glu Gln Leu Ile Asp Ala Leu Asn Ser Asn Phe Asn Leu
595 600 605
Arg Pro Ala Ala Ile Gly Leu Ala Ala Leu Glu Glu Pro Val Lys Arg
610 615 620
Asp Ala Ala Leu His Glu Tyr His Cys Tyr Ala Glu Pro Val Ile Gly
625 630 635 640
Leu Leu Glu Cys Val Ser Asn Thr Ser Val Lys Tyr Ala Gly Ala Lys
645 650 655
Gln Phe Phe His Asp Ala Phe Trp Val Met Asp Val Gln Lys Glu Ser
660 665 670
Met Leu Met Lys Lys Ser Lys Phe Glu Tyr Glu
675 680
<210> 6
<211> 200
<212> PRT
<213> Pseudoalteromonas translucida KMM520
<400> 6
Leu Lys Arg Tyr Tyr Phe Thr Ile Thr Tyr Leu Pro Gln Ser Cys Asp
1 5 10 15
Val Ser Leu Leu Ala Gly Arg Cys Ile Gly Ile Leu His Gly Phe Met
20 25 30
Ser Ser Arg Glu Ile Ser Asn Ile Gly Val Cys Phe Pro Lys Trp Asn
35 40 45
Glu Gln Thr Ile Gly Asn Glu Leu Ala Phe Val Ser Thr Asn Lys Lys
50 55 60
Gln Leu Thr Asn Leu Ser Gln Gln Ser Tyr Phe Glu Met Met Ala His
65 70 75 80
Asp Lys Leu Phe Gly Leu Ser Lys Ile Leu Glu Val Pro Val Asn Gln
85 90 95
Ser Glu Val Met Phe Val Arg Asn Gln Ser Val Ala Lys Ala Phe Val
100 105 110
Gly Glu Lys Gln Arg Arg Leu Lys Arg Ala Lys Lys Arg Ala Glu Ala
115 120 125
Arg Gly Glu Val Tyr Asn Pro Glu Tyr Lys Phe Glu Ala Lys Asp Ile
130 135 140
Gly His Phe His Ser Ile Pro Val Ser Ser Lys Gly Asn Gly Gln Ser
145 150 155 160
Tyr Val Leu His Ile Gln Lys Asn Glu Asn Ala Glu Ser Ile Lys Asn
165 170 175
Gln Phe Asn Asn Tyr Gly Phe Ala Thr Asn Gln Ile Phe Leu Gly Thr
180 185 190
Val Pro Ser Leu Asn Thr Leu Leu
195 200
<210> 7
<211> 342
<212> PRT
<213> Pseudoalteromonas translucida KMM520
<400> 7
Met Gln Leu Pro Arg His Leu Ser Tyr Thr Arg Ser Leu Ser Pro Ser
1 5 10 15
Lys Ala Val Phe Phe Tyr Lys Thr Pro Glu Ser Asp Phe Glu Pro Leu
20 25 30
Gln Ile Glu Gln Asn Lys Leu Val Gly Gln Lys Ser Gly Phe Gly Asp
35 40 45
Ala Tyr Gln Lys Gln Asn Val Ala Lys Asn Leu Ala Pro Gln Asp Leu
50 55 60
Ala Phe Gly Asn Pro Gln Thr Ile Asp Val Cys Tyr Val Pro Pro Thr
65 70 75 80
Val Asn Glu Leu Phe Cys Arg Phe Ser Leu Arg Val Glu Ala Asn Cys
85 90 95
Ile Glu Pro His Val Cys Asp Asp Pro Lys Val Ile Tyr Trp Leu Lys
100 105 110
Arg Phe Phe Glu Thr Tyr Lys Lys His Asn Gly Leu Asn Glu Val Ala
115 120 125
Thr Arg Tyr Ala Lys Asn Ile Leu Met Gly Asn Trp Leu Trp Arg Asn
130 135 140
Arg Gln Ser Pro Asn Val Asp Ile Glu Ile Leu Thr Glu His Ala Ala
145 150 155 160
Pro Ile Val Val Glu Gly Ala Gln Lys Leu Lys Trp Gln Gly Asn Trp
165 170 175
Gln Asn Asn Gln Thr Ala Leu Leu Thr Leu Ser Glu Ser Ile Gln Glu
180 185 190
Gly Leu Ser Asn Pro Gln Asn Tyr Cys Tyr Leu Asp Ile Thr Ala Lys
195 200 205
Ile Lys Asn Ala Phe Ser Gln Glu Val His Pro Ser Gln Lys Phe Val
210 215 220
Asp Asn Val Glu Gln Gly Met Ser Ser Lys Gln Leu Ala Tyr Thr Gln
225 230 235 240
Val Gly Asp Lys Lys Ala Ala Ser Leu Asn Ser Gln Lys Val Gly Ala
245 250 255
Ala Ile Gln Thr Ile Asp Asp Trp Tyr Glu Glu Gly Tyr Lys Pro Leu
260 265 270
Arg Thr His Glu Tyr Gly Ala Asp Lys Gln Ile Leu Val Ala His Arg
275 280 285
Thr Pro Lys Ser His Ser Asp Phe Tyr Ser Leu Leu Pro Arg Ile Ala
290 295 300
Leu His Ile Lys His Met Glu Lys His Gly Leu Glu Gln Ser Glu Gln
305 310 315 320
Ser Asn Ser Ile His Phe Ile Ala Ala Val Leu Ile Lys Gly Gly Leu
325 330 335
Phe Gln Arg Ser Lys Gly
340
<210> 8
<211> 630
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<400> 8
atgtacagaa gaaaactaaa atactcccgt gtaaaaaatc ttcataaatt tgctagtcaa 60
aaaaataaat ctacttgttt agtcgaatcc tctttagagt ttgatgcgtg tttccatttt 120
gaattttcac caccaatagc cgcatttgaa gcacaacctc taggttacga atatgagttc 180
gataaccgta tttgccgtta cacacctgac tttttactta cccacacaga cggcacgcaa 240
aaatttatag aagtaaaacc gcaaagcaaa attgctgacg aagactttcg tgcacgtttt 300
attgaaaagc aagccatagc taagcaagat ggacgcgact taatactggt tactgataaa 360
caaatccgtg tatacccaac actcaataac ttaaagcttt tgcatcgcta ctctggtttt 420
cagtctttaa cagaattgca agcatcggta ctagaacttg ttaagcagta cggctctatc 480
aaagtgggcc agttaatcag atatttaaaa gtaactgccg gtgagctact tgctacggtg 540
cttcgcttac tatcactagg gcagttattt gccgacttaa ctacaaatga aatatcaata 600
gaaacagcaa tttggtctaa caatgtttaa 630
<210> 9
<211> 1824
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<400> 9
atgtttaata acgatttgtt tgatgatgag tttaaccagc cattaccaaa agctgaaacc 60
aaactacctc aaaattacac taaagactta caagcccttc ctgaaaaaat aaaaacaaca 120
acatttgcta agcttaaata tattcaatgg cttgaggcta atattcaagg tggttggaca 180
caaaaaaatc ttgaaccttt attaaaatta atgcctgatg ttgagggtga aaaaaagcca 240
agttggagaa cagccgcacg atggtatagc gcttacacca atgcggataa aaatattatg 300
gcgctaatac caagccacca aaaaaagggt aatagggagc gcgatacaac cactgataag 360
ttttttgaaa aagcacttga gcgttactta gtaaaagaaa aaccatcagt ggcttcggct 420
tacaagttct ataaagactt agttattatc gaaaacgaca gtgttgttga cagtgtttta 480
aagcctttaa catacaaagc gtttaaaaac agaatagata acttaccgca atacgaagta 540
atgattgctc gttatggtaa gcgccttgct gatattgctt ataataaggt tgaagggcat 600
aaacggccta tccgagtact tgaaaaagtt gaaattgacc atacgccact tgatcttatt 660
ttattagatg atgagctaca tattccacta ggtaggccta cactcaccat gttggtagat 720
gtgtatagcc attgtattgt tggctattac tttagcttca gtgagcctag ctatgatgca 780
gtaaggcgag caatgctaaa tgcgatgaaa cctaaaagtg aagtggcaaa actataccct 840
gatacgatta atgagtggaa gtgtgctggc aaaattgaaa cactcgttgt tgataatggc 900
gctgaatttt ggagcaacag ccttgaactt gcttgtgaag aaataggcat taatactcaa 960
tataacccag tcgcaaagcc ttggttaaaa ccatttgtag aacgtatgtt tggaacaata 1020
aatactgagt tattagatcc tgttcccggt aaaacctttt ctaacatttt acaaaagcat 1080
gaatacaatc caaaaaaaga tgcaatcatg cgctttacga cctttatgca gttatttcat 1140
aaatgggtag tagacgttta tcatcaagat gccgacagtc gctttaagta cataccgagt 1200
caactgtggg atcaaggttt taatacgtta ccaccaacaa tgctaagtga tgctgatctt 1260
caacaactag atgttgtgct cagtatttca aatcatcggg tacttcgtaa aggtgggata 1320
cggctagaaa acttaagcta cgacagtact gaactggcca attatagaaa gcaatttagc 1380
cataaagtat ctcaagaagt tttaattaaa ttaaatcccg atgatatttc ttatatatat 1440
gtttaccttg ataagctaga gcattacata aaagtgccat gcatagatcc aaacggttac 1500
acccaaaatt taagtttgaa tcagcataaa ataaatatac gcatccaccg cgactttatt 1560
tcgggctcta tcgataatgt aggcttagca aaagcgcgca tgtttattca taacaaaatt 1620
caaaacgagt ttgaagagtt aaaaaatgcg ccaaaacact caaaagtaaa gggtggtaaa 1680
gcgttagcta aacatcaaaa tatcagtagt gactcacaaa agtcaataac gcatagcaaa 1740
cccgtagagg ccaaaaaggt tacacctaaa gagcaaccaa ctgatagctg ggatgatttt 1800
atctcagact tagatggatt ttaa 1824
<210> 10<210> 10
<211> 1002<211> 1002
<212> DNA<212>DNA
<213> Pseudoalteromonas translucida KMM520<213> Pseudoalteromonas translucida KMM520
<400> 10<400> 10
atgctgaccg ataaacaaaa agaaaagctg aatgaatttc gtgatgtatt tattgaatac 60
ccaataataa ccaccatatt taacgacttc gatagattaa gacttggtaa agggctaaca 120
ggtgaaaagc cttgcatgct cttaaatggc gatacaggca caggtaaaac agcactgatc 180
aagcaatata aagaacgaca tttaccgcaa tttattaatg gtgttatgaa ccaccctgta 240
ttggtaagcc gcatacctag taacccgaca ttagaatcta ctttagcaga gcttcttaaa 300
gatttagggc aagtaggcag cacagagcgt aagctacgaa taaacggcac tcgcttaacg 360
acatcattaa taaaatgcct aaaaacatgt ggcacagagc ttataattat tgatgagttc 420
caagagctaa ttgagcacaa ccaaggtaaa aagcgccgcg agattgctaa tcgattaaaa 480
tatattaacg acgaagcggg tgtatcaatt gtattggtag gtatgccgtg ggcagaaaaa 540
atagcagacg agccccagtg gtcatctcgt ttattaataa ggcggcagtt gccttatttt 600
aagttgtcag aaaacccaaa gcattttgta caactaataa ttggtctagc caaccgtatg 660
ccatttgccg aaaagccaaa cttaagtgag caagcaacag tgtttacttt gttctcatta 720
tcaaaaggtt gctttagaac attaaaatac tttttagatg atgccgtact ttatgcatta 780
atggacaacg cgaaaactct cacaaccaag catttagtta aagcatttga ggtactcttt 840
ccggatgttc ctaatttatt taccttgcct gtagcagaaa taacagcaag cgaagtcgag 900
cgctattcac tttataagcc tgaaagctct caagatgaag acccgtttat agcgaccaag 960
tttactgacc ggatgccgat tagtcagttg ttaaggaaat aa 1002
<210> 11
<211> 1176
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<400> 11
atgcattttt tagttcaaac aaaatcttac ccagatgagg cgcttgaaag ctatttgctg 60
aggcttgcaa gggataactc atacaatggc tatagtgagc ttgctgatat tttgtggcaa 120
tggcttgcag agcaagataa tgagcttgaa ggtgcgctgc cgttagcgct gagtaaagtt 180
gatgtttatc atgctaggca agcgagcagc tttagaataa gagcgcttaa gttggttgct 240
caattagcag atgtaaacgc tggtgacatt cttgcacttg cttggaggcg cagtaatttt 300
aaatttggca accttgccgc agtaagtcga aatgaactgg ctattcccct tgagctactt 360
cgtactgata acatacctgt ttgcattaaa tgcttgtctg aatcttccca tattcccttt 420
tattggcatt taaagcccta taaggcgtgt cataagcata agtcacaatt aattacacgt 480
tgtaaggagt gctatgactt aattgattac agagcctctg aggcgttttt agagtgtgtt 540
tgcggttgta aaataaccaa tagtgaacag ttaaacgatg cagactttaa aattgcaatt 600
gcgcttgcaa gtagtaacag ccaaaaaata gtagggttga tatcgtggtt tgcgaaggtt 660
aagcaacttg atgtaagtga tgcagacttt aactgcgctt ttgttgatta ctttaatact 720
tggcctgaaa gccttaccac tgaattagat ttactcacaa ataatgcgcg actcaagcaa 780
cttaaccctt ttaataaaac taagttcagc tctgtttatg gcgatttaat ccgtgatggt 840
caaatagctg caacaagtaa ccggaaaaac aaagtaattg atgagattat tagttatttt 900
gtcgaattag ttgatagtaa ccctaaagct aaacatccaa atattggtga cttactgctt 960
tgtacttttg atgccgcagt attgttaaac actactacag agcaagttta caggcttcat 1020
caagaagcct ttttaaactg tgcttattca caaaaaaagc acgaacagct cagagctgat 1080
agccatgtat tttatttacg ccaagtgatt gaactacaac aagcattcgc agctgaaaag 1140
cctctaacaa aaaaacaatt tatagcgccg tggtaa 1176
<210> 12
<211> 2052
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<400> 12
atgaacttac aagatgcact tgcaattgaa ccactaaaag aaaaaaccac agcacttaga 60
aaattgttcg ttccatacac gtctcatgtc gaggtagatg gctttgaaga actagcgctg 120
actgtgctca ttaatcttgt ttataagcga agtgagattg atgatttaac aagtgcaaga 180
actgctaaaa gtgtactacg cgatgaagtg ttactgagta agtgcattaa cgaagtgaaa 240
tggtttcata ctcataattt aaaatacccc gatatacgag taagccatca acgtttaatt 300
agtgaagttg taagtgaaga tattgcgggc atttgcagcc ggtcattacc tttaagtttt 360
ggctggtcgc acaacagtgc tgaaattaat catgcaaagc tatttttaac ctcgtttaat 420
tggcaaggtg aagtgacttg tttagcaagg ctgttaatta atgaagagcc tgtttggatt 480
aatttaataa gagcatacgg gtttactaaa aaggcggttt tagaaatctc gggtaaaata 540
aaacagcagt tgccagtggc agagttccca ttagaagtaa gctctttttc accacaatta 600
caaatgccat ttcagcaaag ctaccttgtg gttacgcctg tagtaagcca cgcaatgctg 660
gctaaaattc agcaattaac aacagatcgt aagttaaatt ttgctttagt tgagcactca 720
agacctgcca atgttggcga tttagcaagc tcagtaggcg gcaatataag agtgctgcgt 780
tactttccta aaacatattc aaaggctgtt aaccgctcta aagtagccaa taatgatatt 840
gagaaagcat ttaaaattcg tgcgctatta agtagtcaat ttcaacaggc gcttttggtg 900
ttggtaggca ttaaacagtt taatacgtta aggcaaaaac gattagcgcg agtagcggct 960
attaggcaag tacgtgttag cttgcagtta tggcttgata atattcttga agctaaaaat 1020
aacgcgcaaa accaagttta ccctgagtgg gtaaggcatt acttagatca gagtattact 1080
aactgtatta gccaatttag taacgtacta aatgagagcc ttggtaattt aagtaagctc 1140
aaacgctttg cgtatcaccc taatttaatg ggactgttta aagcgcagtt aaactatgta 1200
tttactcact gtgcagctga acaagaaata ttaaatgatg agcagatagt gtatgtacat 1260
tgccaagata tgcgagtgtt tgatgctgag gcaatggcta atccgtatat tcaaggcatg 1320
ccgtcactta ctgctttaaa tgggcttgct cataactttg agcgtaagct aaaaaacttt 1380
atagaccctt caattaagtg tattggcagt gctatttaca ttgaaaacta tcaattacat 1440
acaggtaaac cattacctga gccaagcaag ttaaaacaag ttgcagggcg tagtcatgta 1500
ataagatctg gaattatcga taaaccaaaa tgtgacataa cactcgattt agtatttaga 1560
ctttttgtac caaatactga gctgttagat aagttaaata gtcagcttat aaagcccgca 1620
ctaccgtctt catttgcagg cgggactatg catccacctt cgttatatca aaatattgac 1680
tggtgccatg tacataccaa accgagcgag ctgtttaaaa aacttaaagc aaaatcgtca 1740
aatggcagtt ggttatatcc ttcaaaaaaa gtagttaaaa gttttgaaca attaattgat 1800
gcccttaaca gtaactttaa tttaagaccc gctgcaattg gcttggctgc gcttgaagaa 1860
cccgtaaagc gagatgcagc attacatgaa taccattgtt atgcagagcc cgtaattggg 1920
ctgttagagt gtgttagcaa tacatcagta aagtacgcag gggctaagca gttctttcat 1980
gacgcatttt gggttatgga tgttcaaaaa gagtctatgc ttatgaaaaa gtctaagttt 2040
gagtatgaat aa 2052
<210> 13
<211> 603
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<400> 13
ttgaagcgct attattttac cattacttat ttaccccaaa gttgtgatgt aagccttctt 60
gctgggcgtt gtatcggtat tttgcatggg tttatgagct cacgtgaaat aagtaatatt 120
ggtgtgtgct ttcctaaatg gaatgagcaa acaataggta atgaattagc gtttgtatca 180
acaaataaaa agcaattaac caatctatct cagcaaagct attttgagat gatggctcat 240
gacaagttat ttggcttatc aaaaatactt gaagtaccag taaaccaaag cgaagtcatg 300
tttgttcgca accaatcggt agcaaaagca tttgttggcg aaaagcaaag gcgattaaag 360
cgagctaaaa aacgagctga agccagaggc gaagtttaca accctgaata taaatttgag 420
gcaaaggaca taggccattt tcattcaata cccgtatcaa gcaaaggcaa tggtcaaagt 480
tatgttttgc atatacaaaa aaatgaaaat gctgaatcca taaaaaatca gtttaacaat 540
tatggctttg ctacaaatca aatatttcta ggtacggttc cttctttaaa taccctttta 600
taa 603
<210> 14
<211> 1029
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<400> 14
atgcaattac ctcggcactt aagttacacg cgttcgctct cacccagtaa agcggtgttt 60
ttttataaaa caccagagtc tgactttgaa ccgctacaaa tagagcaaaa taaattagtt 120
gggcagaagt cagggtttgg cgatgcgtat caaaagcaaa atgtggctaa aaatttagcg 180
ccacaagatc tcgcgtttgg aaaccctcaa acaattgatg tgtgttatgt acctccaacg 240
gtaaatgagc tattttgtcg tttttcactc agggttgagg ctaattgtat tgagccacat 300
gtatgtgatg accctaaagt tatttattgg ttaaaacggt ttttcgaaac ctataaaaaa 360
cacaatggcc ttaatgaagt tgcaacgcgc tatgctaaaa atatactgat gggcaactgg 420
ctttggcgta accgccaatc accaaatgtt gatattgaaa tccttactga gcacgcagcc 480
ccgattgttg ttgaaggtgc acaaaaacta aaatggcaag gcaactggca aaataatcaa 540
acggcattat taacgttgtc agaatctatt caagaagggc taagcaatcc tcaaaattat 600
tgttatttag atataaccgc aaaaattaaa aatgcattta gccaagaggt tcatcctagt 660
caaaagtttg tagataatgt tgaacaaggt atgtcatcta aacaacttgc atatactcaa 720
gtaggcgata aaaaagcagc aagtttgaat tcacaaaaag taggggctgc tatccaaact 780
attgatgatt ggtatgagga aggttacaaa cctttacgca ctcacgagta tggcgcagat 840
aagcaaatat tagttgcaca cagaacacct aagagccatt cagactttta ttcattactc 900
ccgcgcattg ctttgcatat taaacacatg gaaaagcatg gtttagagca aagtgaacaa 960
tcaaactcaa ttcactttat tgcggcagtg ctgatcaaag gtggcttgtt tcaaaggagt 1020
aaaggttga 1029
<210> 15
<211> 88
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<220>
<221> misc_feature
<222> (29)..(60)
<223> n is a, c, g, or t
<400> 15
gtgaactgcc gagtaggcag ctggaaatnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60
gtgaactgcc gagtaggcag ctgaagtt 88
<210> 16
<211> 88
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<220>
<221> misc_feature
<222> (29)..(60)
<223> n is a, c, g, or t
<400> 16
gtgaactgcc gagtaggcag ctggaaatnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60
gtgaactgcc gagtaggcag ctggaaat 88
<210> 17
<211> 88
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<220>
<221> misc_feature
<222> (29)..(60)
<223> n is a, c, g, or t
<400> 17
gtgaactgcc gagtaggcag ctgaagttnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60
gtgaactgcc gagtaggcag ctgaagtt 88
<210> 18
<211> 88
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<220>
<221> misc_feature
<222> (29)..(60)
<223> n is a, c, g, or t
<400> 18
gtgaactgcc gagtaggcag ctgaagttnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 60
gtgaactgcc gagtaggcag ctggaaat 88
<210> 19
<211> 632
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<400> 19
ttaattttcc ttaattattt ttaaagttag actgatttag acttggaaaa gcttaatgat 60
tggagagcta aattgactaa tagtattcag tcaagtttaa ttagttttaa gcgatccatg 120
cataactatt tctgtacaat gctatatttt caccaattaa taatttttaa tcaagcctac 180
attatgaaat atactatacc catttgaact cttctatttg taccattgtc ggtagcaaaa 240
acttatgggg ttttgaacgt cacttaaatt gtaagcattt gcgatggagg cgcgtttaga 300
gtcaaccttg attctgatat gctccgaatt tttggtaaga atataagtgt gagagtagct 360
aatgtggata cgcctgagtt aagggaaaaa tgtgaaaatg aaataactcg ttatcatgca 420
aagtgactaa ggttataatc ttccgtttat ggcacatagc agccaactaa acttgacagt 480
atttttatgt ggttggcttt ataaaaccag catttggtaa cattatgcca atttttactt 540
caatattatg ccaacataca ctacactaac ggagctgtag cacaataagc tcgtttgtac 600
ttatgccaac ttatacttca aacaacattg gg 632
<210> 20
<211> 113
<212> DNA
<213> Pseudoalteromonas translucida KMM520
<400> 20
ttgggtgttg tttgaagtat aagttgacat atctgtacta aaagatggca taaattggaa 60
gtgtaaggtg gcatagtcta gtatttaacc aaatggttaa atggttgact cac 113
<210> 21
<211> 8138
<212> DNA
<213> Artificial Sequence
<400> 21
cattaattcc taatttttgt tgacactcta tcattgatag agttatttta ccactcccta 60
tcagtgatag agaaaagtga actctagaaa taattttgtt taactttaaa aggagatata 120
ccatgggtga actgccgagt aggcagctgg aaatgagacc tctggtctcg tgaactgccg 180
agtaggcagc tggaaatgga tccgaattcg agctcggcgc gcctgcaggt cgacaagctt 240
gcggccgctc taatctagac atcattaatt cctaattttt gttgacactc tatcattgat 300
agagttattt taccactccc tatcagtgat agagaaaagt gaactctaga aataattttg 360
tttaacttta aaggagatat acatatgttt ttgcaaagac ctaaataacc aacaagaagg 420
agatatacat atgcactttc tggtgcagac caagagctac ccggacgagg cgctggaaag 480
ctatctgctg cgtctggcgc gtgataacag ctacaacggt tatagcgagc tggcggacat 540
cctgtggcag tggctggcgg aacaagataa cgagctggaa ggtgcgctgc cgctggcgct 600
gagcaaggtg gacgtttacc acgcgcgtca ggcgagcagc ttccgtatcc gtgcgctgaa 660
actggtggcg caactggcgg acgttaacgc gggtgatatt ctggcgctgg cgtggcgtcg 720
tagcaacttc aagtttggca acctggcggc ggtgagccgt aacgagctgg cgatcccgct 780
ggaactgctg cgtaccgata acatcccggt ttgcattaaa tgcctgagcg agagcagcca 840
cattccgttt tactggcacc tgaagccgta taaagcgtgc cacaagcaca aaagccagct 900
gatcacccgt tgcaaggagt gctacgacct gattgattat cgtgcgagcg aggcgtttct 960
ggaatgcgtt tgcggttgca aaatcaccaa cagcgaacaa ctgaacgacg cggatttcaa 1020
gatcgcgatt gcgctggcga gcagcaacag ccagaaaatc gtgggcctga ttagctggtt 1080
cgcgaaggtg aaacaactgg acgttagcga cgcggatttc aactgcgcgt ttgttgatta 1140
cttcaacacc tggccggaga gcctgaccac cgaactggac ctgctgacca acaacgcgcg 1200
tctgaagcag ctgaacccgt ttaacaagac caaattcagc agcgtgtacg gtgacctgat 1260
ccgtgatggc caaattgcgg cgaccagcaa ccgtaagaac aaagttatcg acgagatcat 1320
tagctatttt gtggaactgg ttgatagcaa cccgaaggcg aaacacccga acattggtga 1380
cctgctgctg tgcaccttcg atgcggcggt gctgctgaac accaccaccg agcaggttta 1440
ccgtctgcac caagaagcgt ttctgaactg cgcgtatagc cagaagaaac acgaacaact 1500
gcgtgcggat agccacgtgt tctatctgcg tcaggttatc gagctgcagc aagcgtttgc 1560
ggcggaaaaa ccgctgacca agaaacaatt cattgcgccg tggtaactta tgaacctgca 1620
ggatgcgctg gcgattgagc cgctgaagga aaaaaccacc gcgctgcgta agctgttcgt 1680
gccgtacacc agccacgttg aggtggatgg ttttgaggaa ctggcgctga ccgtgctgat 1740
caacctggtt tataagcgta gcgaaattga cgatctgacc agcgcgcgta ccgcgaaaag 1800
cgtgctgcgt gacgaggttc tgctgagcaa gtgcatcaac gaagtgaaat ggttccacac 1860
ccacaacctg aagtacccgg acatccgtgt tagccaccaa cgtctgatta gcgaggtggt 1920
tagcgaagat atcgcgggta tttgcagccg tagcctgccg ctgagctttg gctggagcca 1980
caacagcgcg gagatcaacc acgcgaaact gttcctgacc agctttaact ggcagggtga 2040
agtgacctgc ctggcgcgtc tgctgattaa cgaggaaccg gtttggatca acctgattcg 2100
tgcgtacggt ttcaccaaga aagcggttct ggagatcagc ggcaagatta aacagcaact 2160
gccggtggcg gagttcccgc tggaagttag cagctttagc ccgcagctgc aaatgccgtt 2220
tcagcaaagc tatctggtgg ttaccccggt ggttagccac gcgatgctgg cgaagatcca 2280
gcaactgacc accgaccgta aactgaactt cgcgctggtt gagcacagcc gtccggcgaa 2340
cgttggtgat ctggcgagca gcgtgggtgg caacattcgt gttctgcgtt actttccgaa 2400
gacctatagc aaagcggtga accgtagcaa agttgcgaac aacgatatcg aaaaggcgtt 2460
caaaattcgt gcgctgctga gcagccagtt tcagcaagcg ctgctggtgc tggttggcat 2520
caagcagttc aacaccctgc gtcaaaaacg tctggcgcgt gtggcggcga tccgtcaagt 2580
gcgtgttagc ctgcaactgt ggctggacaa cattctggag gcgaagaaca acgcgcagaa 2640
ccaagtgtac ccggaatggg ttcgtcacta tctggatcaa agcatcacca actgcattag 2700
ccagttcagc aacgttctga acgaaagcct gggtaacctg agcaagctga aacgttttgc 2760
gtaccacccg aacctgatgg gcctgttcaa agcgcaactg aactatgtgt ttacccactg 2820
cgcggcggag caggaaatcc tgaacgacga gcaaattgtg tacgttcact gccaggacat 2880
gcgtgttttc gatgcggaag cgatggcgaa cccgtatatc cagggtatgc cgagcctgac 2940
cgcgctgaac ggcctggcgc acaacttcga gcgtaagctg aaaaacttta ttgatccgag 3000
catcaagtgc attggtagcg cgatctacat tgagaactat caactgcaca ccggcaaacc 3060
gctgccggaa ccgagcaagc tgaaacaggt ggcgggtcgt agccacgtta tccgtagcgg 3120
catcattgac aagccgaaat gcgacattac cctggatctg gtgttccgtc tgtttgttcc 3180
gaacaccgaa ctgctggata agctgaacag ccaactgatt aagccggcgc tgccgagcag 3240
ctttgcgggt ggcaccatgc acccgccgag cctgtaccag aacattgact ggtgccacgt 3300
gcacaccaag ccgagcgagc tgtttaagaa actgaaggcg aaaagcagca acggtagctg 3360
gctgtatccg agcaagaaag tggttaaaag cttcgaacag ctgatcgacg cgctgaacag 3420
caactttaac ctgcgtccgg cggcgattgg cctggcggcg ctggaggaac cggtgaagcg 3480
tgatgcggcg ctgcacgagt accactgcta tgcggaaccg gttatcggtc tgctggagtg 3540
cgtgagcaac accagcgtta agtacgcggg cgcgaaacaa ttctttcacg acgcgttctg 3600
ggtgatggat gttcagaagg aaagcatgct gatgaagaaa agcaaatttg agtatgaata 3660
atgcagctgc cgcgtcacct gagctacacc cgtagcctga gcccgagcaa ggcggtgttc 3720
ttttataaaa ccccggagag cgacttcgaa ccgctgcaga tcgagcaaaa caaactggtg 3780
ggtcagaaga gcggttttgg cgatgcgtac cagaagcaaa acgttgcgaa aaacctggcg 3840
ccgcaggacc tggcgtttgg taacccgcaa accattgatg tgtgctatgt tccgccgacc 3900
gtgaacgaac tgttctgccg ttttagcctg cgtgttgagg cgaactgcat cgaaccgcac 3960
gtgtgcgacg atccgaaggt tatttactgg ctgaaacgtt tctttgaaac ctataagaaa 4020
cacaacggtc tgaacgaagt ggcgacccgt tacgcgaaga acatcctgat gggcaactgg 4080
ctgtggcgta accgtcagag cccgaacgtt gacatcgaga ttctgaccga acacgcggcg 4140
ccgattgtgg ttgagggtgc gcagaagctg aaatggcaag gcaactggca gaacaaccaa 4200
accgcgctgc tgaccctgag cgagagcatc caggaaggtc tgagcaaccc gcaaaactac 4260
tgctatctgg atatcaccgc gaagattaaa aacgcgttca gccaggaagt gcacccgagc 4320
caaaagtttg tggacaacgt tgaacagggt atgagcagca aacagctggc gtatacccaa 4380
gtgggcgata agaaagcggc gagcctgaac agccagaagg ttggcgcggc gatccaaacc 4440
attgacgatt ggtacgagga aggttataaa ccgctgcgta cccatgagta tggtgcggac 4500
aagcaaatcc tggtggcgca ccgtaccccg aaaagccaca gcgattttta tagcctgctg 4560
ccgcgtatcg cgctgcacat taagcacatg gaaaaacacg gtctggagca gagcgaacaa 4620
agcaacagca tccacttcat tgcggcggtt ctgattaagg gtggcctgtt tcagcgtagc 4680
aaaggatgaa gcgttactat ttcaccatca cctacctgcc gcaaagctgc gatgtgagcc 4740
tgctggcggg tcgttgcatc ggcattctgc acggtttcat gagcagccgt gagatcagca 4800
acattggcgt gtgctttccg aaatggaacg agcagaccat cggtaacgaa ctggcgtttg 4860
ttagcaccaa caagaaacaa ctgaccaacc tgagccagca aagctatttc gagatgatgg 4920
cgcacgacaa gctgtttggc ctgagcaaaa ttctggaagt gccggttaac cagagcgaag 4980
tgatgttcgt tcgtaaccaa agcgtggcga aggcgtttgt tggtgaaaag caacgtcgtc 5040
tgaaacgtgc gaagaaacgt gcggaggcgc gtggcgaagt gtacaacccg gagtataagt 5100
tcgaagcgaa agatatcggt cactttcaca gcattccggt gagcagcaag ggtaacggcc 5160
agagctacgt tctgcacatc caaaagaacg agaacgcgga aagcattaaa aaccagttca 5220
acaactatgg ctttgcgacc aaccaaattt tcctgggcac cgtgccgagc ctgaacaccc 5280
tgctgtaagg taccaccctt aatctgacct aggctgctgc caccgctgag caataactag 5340
cataacccct tggggcctct aaacgggtct tgaggggttt tttgctgaaa cctcaggcat 5400cataacccct tggggcctct aaacgggtct tgaggggttt tttgctgaaa cctcaggcat 5400
ttgagaagca cacggtcaca ctgcttccgg tagtcaataa accggtaaac cagcaataga 5460ttgagaagca cacggtcaca ctgcttccgg tagtcaataa accggtaaac cagcaataga 5460
cataagcggc tatttaacga ccctgccctg aaccgacgac cgggtcatcg tggccggatc 5520cataagcggc tattaacga ccctgccctg aaccgacgac cgggtcatcg tggccggatc 5520
ttgcggcccc tcggcttgaa cgaattgtta gacattattt gccgactacc ttggtgatct 5580ttgcggcccc tcggcttgaa cgaattgtta gacattattt gccgactacc ttggtgatct 5580
cgcctttcac gtagtggaca aattcttcca actgatctgc gcgcgaggcc aagcgatctt 5640cgcctttcac gtagtggaca aattcttcca actgatctgc gcgcgaggcc aagcgatctt 5640
cttcttgtcc aagataagcc tgtctagctt caagtatgac gggctgatac tgggccggca 5700cttcttgtcc aagataagcc tgtctagctt caagtatgac gggctgatac tgggccggca 5700
ggcgctccat tgcccagtcg gcagcgacat ccttcggcgc gattttgccg gttactgcgc 5760ggcgctccat tgcccagtcg gcagcgacat ccttcggcgc gattttgccg gttactgcgc 5760
tgtaccaaat gcgggacaac gtaagcacta catttcgctc atcgccagcc cagtcgggcg 5820tgtaccaaat gcgggacaac gtaagcacta catttcgctc atcgccagcc cagtcgggcg 5820
gcgagttcca tagcgttaag gtttcattta gcgcctcaaa tagatcctgt tcaggaaccg 5880gcgagttcca tagcgttaag gtttcattta gcgcctcaaa tagatcctgt tcaggaaccg 5880
gatcaaagag ttcctccgcc gctggaccta ccaaggcaac gctatgttct cttgcttttg 5940gatcaaagag ttcctccgcc gctggaccta ccaaggcaac gctatgttct cttgcttttg 5940
tcagcaagat agccagatca atgtcgatcg tggctggctc gaagatacct gcaagaatgt 6000tcagcaagat agccagatca atgtcgatcg tggctggctc gaagatacct gcaagaatgt 6000
cattgcgctg ccattctcca aattgcagtt cgcgcttagc tggataacgc cacggaatga 6060cattgcgctg ccattctcca aattgcagtt cgcgcttagc tggataacgc cacggaatga 6060
tgtcgtcgtg cacaacaatg gtgacttcta cagcgcggag aatctcgctc tctccagggg 6120tgtcgtcgtg cacaacaatg gtgacttcta cagcgcggag aatctcgctc tctccagggg 6120
aagccgaagt ttccaaaagg tcgttgatca aagctcgccg cgttgtttca tcaagcctta 6180aagccgaagt ttccaaaagg tcgttgatca aagctcgccg cgttgtttca tcaagcctta 6180
cggtcaccgt aaccagcaaa tcaatatcac tgtgtggctt caggccgcca tccactgcgg 6240cggtcaccgt aaccagcaaa tcaatatcac tgtgtggctt caggccgcca tccactgcgg 6240
agccgtacaa atgtacggcc agcaacgtcg gttcgagatg gcgctcgatg acgccaacta 6300agccgtacaa atgtacggcc agcaacgtcg gttcgagatg gcgctcgatg acgccaacta 6300
cctctgatag ttgagtcgat acttcggcga tcaccgcttc cctcatactc ttcctttttc 6360cctctgatag ttgagtcgat acttcggcga tcaccgcttc cctcatactc ttcctttttc 6360
aatattattg aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta 6420
tttagaaaaa taaacaaata gctagctcac tcggtcgcta cgctccgggc gtgagactgc 6480
ggcgggcgct gcggacacat acaaagttac ccacagattc cgtggataag caggggacta 6540
acatgtgagg caaaacagca gggccgcgcc ggtggcgttt ttccataggc tccgccctcc 6600
tgccagagtt cacataaaca gacgcttttc cggtgcatct gtgggagccg tgaggctcaa 6660
ccatgaatct gacagtacgg gcgaaacccg acaggactta aagatcccca ccgtttccgg 6720
cgggtcgctc cctcttgcgc tctcctgttc cgaccctgcc gtttaccgga tacctgttcc 6780
gcctttctcc cttacgggaa gtgtggcgct ttctcatagc tcacacactg gtatctcggc 6840
tcggtgtagg tcgttcgctc caagctgggc tgtaagcaag aactccccgt tcagcccgac 6900
tgctgcgcct tatccggtaa ctgttcactt gagtccaacc cggaaaagca cggtaaaacg 6960
ccactggcag cagccattgg taactgggag ttcgcagagg atttgtttag ctaaacacgc 7020
ggttgctctt gaagtgtgcg ccaaagtccg gctacactgg aaggacagat ttggttgctg 7080
tgctctgcga aagccagtta ccacggttaa gcagttcccc aactgactta accttcgatc 7140
aaaccacctc cccaggtggt tttttcgttt acagggcaaa agattacgcg cagaaaaaaa 7200
ggatctcaag aagatccttt gatcttttct actgaaccgc tctagatttc agtgcaattt 7260
atctcttcaa atgtagcacc tgaagtcagc cccatacgat ataagttgta attctcatgt 7320
tagtcatgcc ccgcgcccac cggaaggagc tgactgggtt gaaggctctc aagggcatcg 7380
gtcgagatcc cggtgcctaa tgagtgagct aacttaccgt tgtaaaacga cggccagtga 7440
attcctgatg aatcccctaa tgatttttat caaaatcatt aaggttacca tcacggaaaa 7500
aggttatgct gcttttaaga cccactttca catttaagtt gtttttctaa tccgcatatg 7560
atcaattcaa ggccgaataa gaaggctggc tctgcacctt ggtgatcaaa taattcgata 7620
gcttgtcgta ataatggcgg catactatca gtagtaggtg tttccctttc ttctttagcg 7680
acttgatgct cttgatcttc caatacgcaa cctaaagtaa aatgccccac agcgctgagt 7740
gcatataatg cattctctag tgaaaaacct tgttggcata aaaaggctaa ttgattttcg 7800
agagtttcat actgtttttc tgtaggccgt gtacctaaat gtacttttgc tccatcgcga 7860
tgacttagta aagcacatct aaaactttta gcgttattac gtaaaaaatc ttgccagctt 7920
tccccttcta aagggcaaaa gtgagtatgg tgcctatcta acatctcaat ggctaaggcg 7980
tcgagcaaag cccgcttatt ttttacatgc caatacaatg taggctgctc tacacctagc 8040
ttctgggcga gtttacgggt tgttaaacct tcgattccga cctcattaag cagctctaat 8100
gcgctgttaa tcactttact tttatctaat ctagacat 8138
<210> 22
<211> 6320
<212> DNA
<213> Artificial sequence
<400> 22
acgatcgtaa aaggatctca agaagatcct ttacggattc ccgacaccat cactctagat 60
ttcagtgcaa tttatctctt caaatgtagc acctgaagtc agccccatac gatataagtt 120
gtaattctca tgttagtcat gccccgcgcc caccggaagg agctgactgg gttgaaggct 180
ctcaagggca tcggtcgaga tcccggtgcc taatgagtga gctaacttac attaattgcg 240
ttgcgctgat gaatccccta atgattttta tcaaaatcat taaggttacc atcacggaaa 300
aaggttatgc tgcttttaag acccactttc acatttaagt tgtttttcta atccgcatat 360
gatcaattca aggccgaata agaaggctgg ctctgcacct tggtgatcaa ataattcgat 420
agcttgtcgt aataatggcg gcatactatc agtagtaggt gtttcccttt cttctttagc 480
gacttgatgc tcttgatctt ccaatacgca acctaaagta aaatgcccca cagcgctgag 540
tgcatataat gcattctcta gtgaaaaacc ttgttggcat aaaaaggcta attgattttc 600
gagagtttca tactgttttt ctgtaggccg tgtacctaaa tgtacttttg ctccatcgcg 660
atgacttagt aaagcacatc taaaactttt agcgttatta cgtaaaaaat cttgccagct 720
ttccccttct aaagggcaaa agtgagtatg gtgcctatct aacatctcaa tggctaaggc 780
gtcgagcaaa gcccgcttat tttttacatg ccaatacaat gtaggctgct ctacacctag 840
cttctgggcg agtttacggg ttgttaaacc ttcgattccg acctcattaa gcagctctaa 900
tgcgctgtta atcactttac ttttatctaa tctagacatc attaattcct aatttttgtt 960
gacactctat cattgataga gttattttac cactccctat cagtgataga gaaaagtgaa 1020
ctctagaaat aattttgttt aactttaaaa ggagatatac catgtaccgt cgtaagctga 1080
aatatagccg tgttaagaac ctgcacaaat ttgcgagcca gaagaacaaa agcacctgcc 1140
tggtggagag cagcctggaa ttcgacgcgt gcttccactt tgagttcagc ccgccgatcg 1200
cggcgtttga agcgcaaccg ctgggttacg agtatgaatt cgataaccgt atttgccgtt 1260
acaccccgga ctttctgctg acccacaccg atggcaccca gaagttcatc gaggttaagc 1320
cgcaaagcaa aattgcggac gaggattttc gtgcgcgttt catcgaaaag caggcgattg 1380
cgaaacaaga cggtcgtgat ctgatcctgg tgaccgacaa gcagattcgt gtttacccga 1440
ccctgaacaa cctgaaactg ctgcaccgtt atagcggctt tcagagcctg accgagctgc 1500
aagcgagcgt gctggaactg gttaagcagt acggtagcat caaagtgggc caactgattc 1560
gttatctgaa agttaccgcg ggtgaactgc tggcgaccgt gctgcgtctg ctgagcctgg 1620
gccaactgtt cgcggatctg accaccaacg agatcagcat tgaaaccgcg atctggagca 1680
acaatgttta ataacgacct gttcgacgat gagtttaacc agccgctgcc gaaggcggaa 1740
accaaactgc cgcagaacta taccaaggat ctgcaagcgc tgccggagaa gatcaaaacc 1800
accaccttcg cgaagctgaa atacattcaa tggctggagg cgaacatcca gggtggctgg 1860
acccaaaaga acctggaacc gctgctgaaa ctgatgccgg acgttgaggg tgaaaagaaa 1920
ccgagctggc gtaccgcggc gcgttggtat agcgcgtaca ccaacgcgga taagaacatt 1980
atggcgctga tcccgagcca ccagaagaaa ggcaaccgtg aacgtgacac caccaccgat 2040
aagttctttg agaaagcgct ggaacgttac ctggtgaagg agaaaccgag cgttgcgagc 2100
gcgtataagt tctacaaaga cctggtgatc attgaaaacg acagcgtggt tgatagcgtt 2160
ctgaaaccgc tgacctataa ggcgtttaaa aaccgtattg acaacctgcc gcagtatgag 2220
gttatgatcg cgcgttacgg caagcgtctg gcggatattg cgtacaacaa ggtggaaggc 2280
cacaaacgtc cgattcgtgt gctggagaaa gttgaaatcg accacacccc gctggatctg 2340
attctgctgg acgatgagct gcacatcccg ctgggtcgtc cgaccctgac catgctggtt 2400
gacgtttata gccactgcat cgtgggctac tatttcagct ttagcgagcc gagctacgat 2460
gcggttcgtc gtgcgatgct gaacgcgatg aagccgaaaa gcgaagtggc gaaactgtac 2520
ccggacacca ttaacgagtg gaagtgcgcg ggtaaaatcg aaaccctggt ggttgataac 2580
ggcgcggagt tctggagcaa cagcctggaa ctggcgtgcg aggaaatcgg tattaacacc 2640
cagtataacc cggtggcgaa gccgtggctg aaaccgttcg ttgagcgtat gtttggcacc 2700
atcaacaccg aactgctgga cccggttccg ggcaagacct tcagcaacat cctgcaaaaa 2760
cacgaataca acccgaagaa agacgcgatt atgcgtttca ccacctttat gcagctgttt 2820
cacaagtggg tggttgatgt gtatcaccaa gacgcggata gccgtttcaa atacattccg 2880
agccagctgt gggaccaagg ctttaacacc ctgccgccga ccatgctgag cgatgcggat 2940
ctgcagcaac tggatgtggt tctgagcatc agcaaccacc gtgtgctgcg taagggtggc 3000
attcgtctgg agaacctgag ctatgacagc accgaactgg cgaactaccg taagcagttc 3060
agccacaaag tgagccaaga ggttctgatc aaactgaacc cggacgatat tagctacatc 3120
tatgtgtacc tggacaagct ggaacactat attaaagttc cgtgcatcga tccgaacggt 3180
tacacccaga acctgagcct gaaccaacac aagatcaaca ttcgtatcca ccgtgacttt 3240
attagcggta gcatcgataa cgttggcctg gcgaaggcgc gtatgttcat tcacaacaaa 3300
atccagaacg agtttgagga actgaagaac gcgccgaaac acagcaaggt gaaaggtggc 3360
aaggcgctgg cgaaacacca gaacattagc agcgacagcc aaaagagcat cacccacagc 3420
aaaccggtgg aggcgaagaa agttaccccg aaagaacaac cgaccgatag ctgggacgat 3480
ttcatcagcg acctggatgg tttttaatta tgctgaccga caagcagaaa gaaaagctga 3540
acgagttccg tgatgttttt attgaatacc cgatcattac caccatcttc aacgactttg 3600
atcgtctgcg tctgggtaaa ggcctgaccg gcgagaagcc gtgcatgctg ctgaacggtg 3660
acaccggcac cggtaaaacc gcgctgatta aacagtataa ggaacgtcac ctgccgcaat 3720
tcatcaacgg tgttatgaac cacccggtgc tggttagccg tattccgagc aacccgaccc 3780
tggaaagcac cctggcggag ctgctgaaag acctgggtca agtgggcagc accgagcgta 3840
agctgcgtat taacggcacc cgtctgacca ccagcctgat caaatgcctg aagacctgcg 3900
gcaccgaact gatcattatc gatgagtttc aggaactgat tgagcacaac caaggcaaga 3960
aacgtcgtga aattgcgaac cgtctgaaat acatcaacga cgaggcgggt gttagcattg 4020
tgctggttgg catgccgtgg gcggaaaaga tcgcggatga gccgcagtgg agcagccgtc 4080
tgctgatccg tcgtcaactg ccgtatttca aactgagcga gaacccgaag cactttgtgc 4140
agctgattat cggtctggcg aaccgtatgc cgttcgcgga aaaaccgaac ctgagcgagc 4200
aagcgaccgt tttcaccctg tttagcctga gcaaaggctg cttccgtacc ctgaagtact 4260
ttctggacga tgcggtgctg tatgcgctga tggacaacgc gaagaccctg accaccaaac 4320
acctggtgaa ggcgttcgaa gttctgtttc cggatgtgcc gaacctgttt accctgccgg 4380
ttgcggagat caccgcgagc gaggtggaac gttacagcct gtataagccg gaaagcagcc 4440
aggacgagga cccgttcatt gcgaccaaat ttaccgatcg tatgccgatc agccaactgc 4500
tgcgtaagta actcgagccg ctgagcaata actagcataa ccccttgggg cctctaaacg 4560
ggtcttgagg ggttttttgc tgaaacctca ggcatttgag aagcacacgg tcacactgct 4620
tccggtagtc aataaaccgg taaaccagca atagacataa gcggctattt aacgaccctg 4680
ccctgaaccg acgacaagct gacgaccggg tctccgcaag tggcactttt cggggaaatg 4740
tgcgcggaac ccctatttgt ttatttttct aaatacattc aaatatgtat ccgctcatga 4800
attaattctt agaaaaactc atcgagcatc aaatgaaact gcaatttatt catatcagga 4860
ttatcaatac catatttttg aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg 4920
cagttccata ggatggcaag atcctggtat cggtctgcga ttccgactcg tccaacatca 4980
atacaaccta ttaatttccc ctcgtcaaaa ataaggttat caagtgagaa atcaccatga 5040
gtgacgactg aatccggtga gaatggcaaa agtttatgca tttctttcca gacttgttca 5100
acaggccagc cattacgctc gtcatcaaaa tcactcgcat caaccaaacc gttattcatt 5160
cgtgattgcg cctgagcgag acgaaatacg cggtcgctgt taaaaggaca attacaaaca 5220
ggaatcgaat gcaaccggcg caggaacact gccagcgcat caacaatatt ttcacctgaa 5280
tcaggatatt cttctaatac ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac 5340
catgcatcat caggagtacg gataaaatgc ttgatggtcg gaagaggcat aaattccgtc 5400
agccagttta gtctgaccat ctcatctgta acatcattgg caacgctacc tttgccatgt 5460
ttcagaaaca actctggcgc atcgggcttc ccatacaatc gatagattgt cgcacctgat 5520
tgcccgacat tatcgcgagc ccatttatac ccatataaat cagcatccat gttggaattt 5580
aatcgcggcc tagagcaaga cgtttcccgt tgaatatggc tcatactctt cctttttcaa 5640
tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt 5700
tagaaaaata aacaaatagg catgctagcg cagaaacgtc ctagaagatg ccaggaggat 5760
acttagcaga gagacaataa ggccggagcg aagccgtttt tccataggct ccgcccccct 5820
gacgaacatc acgaaatctg acgctcaaat cagtggtggc gaaacccgac aggactataa 5880
agataccagg cgtttccccc tgatggctcc ctcttgcgct ctcctgttcc cgtcctgcgg 5940
cgtccgtgtt gtggtggagg ctttacccaa atcaccacgt cccgttccgt gtagacagtt 6000
cgctccaagc tgggctgtgt gcaagaaccc cccgttcagc ccgactgctg cgccttatcc 6060
ggtaactatc atcttgagtc caacccggaa agacacgaca aaacgccact ggcagcagcc 6120
attggtaact gagaattagt ggatttagat atcgagagtc ttgaagtggt ggcctaacag 6180
aggctacact gaaaggacag tatttggtat ctgcgctcca ctaaagccag ttaccaggtt 6240
aagcagttcc ccaactgact taaccttcga tcaaaccgcc tccccaggcg gttttttcgt 6300
ttacagagca ggagattacg 6320
Claims (16)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110532731.0A CN115369098A (en) | 2021-05-17 | 2021-05-17 | A novel CRISPR-associated transposase |
PCT/CN2022/091102 WO2022242464A1 (en) | 2021-05-17 | 2022-05-06 | Novel crispr-associated transposase |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110532731.0A CN115369098A (en) | 2021-05-17 | 2021-05-17 | A novel CRISPR-associated transposase |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115369098A (en) | 2022-11-22 |
Family
ID=84058343
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110532731.0A Pending CN115369098A (en) | 2021-05-17 | 2021-05-17 | A novel CRISPR-associated transposase |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115369098A (en) |
WO (1) | WO2022242464A1 (en) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA3102148A1 (en) * | 2018-06-13 | 2019-12-19 | Caribou Biosciences, Inc. | Engineered cascade components and cascade complexes |
WO2020181264A1 (en) * | 2019-03-07 | 2020-09-10 | The Trustees Of Columbia University In The City Of New York | Rna-guided dna integration using tn7-like transposons |
US20220145298A1 (en) * | 2019-03-14 | 2022-05-12 | Cornell University | Compositions and methods for gene targeting using crispr-cas and transposons |
CA3127684A1 (en) * | 2019-07-19 | 2021-01-28 | Synthego Corporation | Stabilized crispr complexes |
- 2021-05-17: CN202110532731.0A, published as CN115369098A (en), active, Pending
- 2022-05-06: PCT/CN2022/091102, published as WO2022242464A1 (en), active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2022242464A1 (en) | 2022-11-24 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
AU2021231074B2 (en) | Class II, type V CRISPR systems | |
CN110352241A (en) | Heat-staple Cas9 nuclease | |
WO2016205623A1 (en) | Methods and compositions for genome editing in bacteria using crispr-cas9 systems | |
JP2025090617A (en) | Enzymes containing RUVC domains | |
CN104109687A (en) | Construction and application of Zymomonas mobilis CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-association proteins)9 system | |
CN103097537B (en) | From deletion plasmid | |
CN102703424B (en) | A kind of method of genome of E.coli point mutation of recombined engineering mediation | |
US12331326B2 (en) | Genetically engineered Vibrio sp. and uses thereof | |
JP2019500036A (en) | Reconstruction of DNA end repair pathways in prokaryotes | |
WO2020186262A1 (en) | Compositions and methods for gene targeting using crispr-cas and transposons | |
CN102618476B (en) | Method for leading restriction barrier of target bacteria into foreign DNA | |
Bost et al. | Application of the endogenous CRISPR-Cas type I-D system for genetic engineering in the thermoacidophilic archaeon Sulfolobus acidocaldarius | |
CN119452085A (en) | Enzymes with RUVC domain | |
JP2022513642A (en) | A DNA-cutting means based on the Cas9 protein from Clostridium cellulolyticum, a biotechnically important bacterium. | |
JP2022509826A (en) | DNA cutting means based on Cas9 protein derived from the genus Defluviimonas | |
WO2005103229A2 (en) | Transgenomic mitochondria, transmitochondrial cells and organisms, and methods of making and using | |
Dhundale et al. | Mutations that affect production of branched RNA-linked msDNA in Myxococcus xanthus | |
CN115369098A (en) | A novel CRISPR-associated transposase | |
CN116121288A (en) | A group of vectors for cloning large fragment DNA of Pseudomonas putida and its application | |
KR20180128864A (en) | Gene editing composition comprising sgRNAs with matched 5' nucleotide and gene editing method using the same | |
CN114163506A (en) | Application of Pseudomonas stutzeri-derived PsPIWI-RE protein in mediating homologous recombination | |
KR20200098424A (en) | Method for gene editing in microalgae using particle bombardment | |
Tong et al. | Prokaryotic genome editing based on the subtype I-B-Svi CRISPR-Cas system | |
Bost | Optimizing the genetic toolbox of the crenarchaeon Sulfolobus acidocaldarius for use in industrial applications | |
CN118497243A (en) | CRISPR-Cas9-NHEJ gene editing system and its application in Halomonas |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||