CN114835818B - Gene editing fusion protein, adenine base editor constructed by same and application thereof
- Publication number: CN114835818B (application CN202210265179.8A)
- Authority: CN (China)
- Prior art keywords: lys, leu, ile, glu, asn
- Legal status: Active (as listed by Google Patents; the status is an assumption, not a legal conclusion)
- SUPVSFFZWVOEOI-UHFFFAOYSA-N Leu-Ala-Tyr Natural products CC(C)CC(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-UHFFFAOYSA-N 0.000 description 6
- RFUBXQQFJFGJFV-GUBZILKMSA-N Leu-Asn-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RFUBXQQFJFGJFV-GUBZILKMSA-N 0.000 description 6
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 6
- FIJMQLGQLBLBOL-HJGDQZAQSA-N Leu-Asn-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FIJMQLGQLBLBOL-HJGDQZAQSA-N 0.000 description 6
- WXHFZJFZWNCDNB-KKUMJFAQSA-N Leu-Asn-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 WXHFZJFZWNCDNB-KKUMJFAQSA-N 0.000 description 6
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 6
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 6
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 6
- ZYLJULGXQDNXDK-GUBZILKMSA-N Leu-Gln-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ZYLJULGXQDNXDK-GUBZILKMSA-N 0.000 description 6
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 6
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 6
- OMHLATXVNQSALM-FQUUOJAGSA-N Leu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(C)C)N OMHLATXVNQSALM-FQUUOJAGSA-N 0.000 description 6
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 6
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 6
- ZGUMORRUBUCXEH-AVGNSLFASA-N Leu-Lys-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZGUMORRUBUCXEH-AVGNSLFASA-N 0.000 description 6
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 6
- KQFZKDITNUEVFJ-JYJNAYRXSA-N Leu-Phe-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=CC=C1 KQFZKDITNUEVFJ-JYJNAYRXSA-N 0.000 description 6
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 6
- MVVSHHJKJRZVNY-ACRUOGEOSA-N Leu-Phe-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MVVSHHJKJRZVNY-ACRUOGEOSA-N 0.000 description 6
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 6
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 6
- YVSHZSUKQHNDHD-KKUMJFAQSA-N Lys-Asn-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N YVSHZSUKQHNDHD-KKUMJFAQSA-N 0.000 description 6
- LZWNAOIMTLNMDW-NHCYSSNCSA-N Lys-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N LZWNAOIMTLNMDW-NHCYSSNCSA-N 0.000 description 6
- NTBFKPBULZGXQL-KKUMJFAQSA-N Lys-Asp-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTBFKPBULZGXQL-KKUMJFAQSA-N 0.000 description 6
- YVMQJGWLHRWMDF-MNXVOIDGSA-N Lys-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N YVMQJGWLHRWMDF-MNXVOIDGSA-N 0.000 description 6
- GJJQCBVRWDGLMQ-GUBZILKMSA-N Lys-Glu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O GJJQCBVRWDGLMQ-GUBZILKMSA-N 0.000 description 6
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 6
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 6
- KNKJPYAZQUFLQK-IHRRRGAJSA-N Lys-His-Arg Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCCCN)N KNKJPYAZQUFLQK-IHRRRGAJSA-N 0.000 description 6
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 6
- AIRZWUMAHCDDHR-KKUMJFAQSA-N Lys-Leu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O AIRZWUMAHCDDHR-KKUMJFAQSA-N 0.000 description 6
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 6
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 6
- RIJCHEVHFWMDKD-SRVKXCTJSA-N Lys-Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RIJCHEVHFWMDKD-SRVKXCTJSA-N 0.000 description 6
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 6
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 6
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 6
- MIMXMVDLMDMOJD-BZSNNMDCSA-N Lys-Tyr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O MIMXMVDLMDMOJD-BZSNNMDCSA-N 0.000 description 6
- WINFHLHJTRGLCV-BZSNNMDCSA-N Lys-Tyr-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=C(O)C=C1 WINFHLHJTRGLCV-BZSNNMDCSA-N 0.000 description 6
- QEVRUYFHWJJUHZ-DCAQKATOSA-N Met-Ala-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(C)C QEVRUYFHWJJUHZ-DCAQKATOSA-N 0.000 description 6
- BLIPQDLSCFGUFA-GUBZILKMSA-N Met-Arg-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O BLIPQDLSCFGUFA-GUBZILKMSA-N 0.000 description 6
- JACAKCWAOHKQBV-UWVGGRQHSA-N Met-Gly-Lys Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN JACAKCWAOHKQBV-UWVGGRQHSA-N 0.000 description 6
- RMLLCGYYVZKKRT-CIUDSAMLSA-N Met-Ser-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O RMLLCGYYVZKKRT-CIUDSAMLSA-N 0.000 description 6
- GGXZOTSDJJTDGB-GUBZILKMSA-N Met-Ser-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O GGXZOTSDJJTDGB-GUBZILKMSA-N 0.000 description 6
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 6
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 6
- 108010079364 N-glycylalanine Proteins 0.000 description 6
- AYPMIIKUMNADSU-IHRRRGAJSA-N Phe-Arg-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O AYPMIIKUMNADSU-IHRRRGAJSA-N 0.000 description 6
- MRNRMSDVVSKPGM-AVGNSLFASA-N Phe-Asn-Gln Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MRNRMSDVVSKPGM-AVGNSLFASA-N 0.000 description 6
- CSDMCMITJLKBAH-SOUVJXGZSA-N Phe-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O CSDMCMITJLKBAH-SOUVJXGZSA-N 0.000 description 6
- BIYWZVCPZIFGPY-QWRGUYRKSA-N Phe-Gly-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CO)C(O)=O BIYWZVCPZIFGPY-QWRGUYRKSA-N 0.000 description 6
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 6
- CBENHWCORLVGEQ-HJOGWXRNSA-N Phe-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 CBENHWCORLVGEQ-HJOGWXRNSA-N 0.000 description 6
- ZYNBEWGJFXTBDU-ACRUOGEOSA-N Phe-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N ZYNBEWGJFXTBDU-ACRUOGEOSA-N 0.000 description 6
- KUSYCSMTTHSZOA-DZKIICNBSA-N Phe-Val-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N KUSYCSMTTHSZOA-DZKIICNBSA-N 0.000 description 6
- BQMFWUKNOCJDNV-HJWJTTGWSA-N Phe-Val-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BQMFWUKNOCJDNV-HJWJTTGWSA-N 0.000 description 6
- AMBLXEMWFARNNQ-DCAQKATOSA-N Pro-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 AMBLXEMWFARNNQ-DCAQKATOSA-N 0.000 description 6
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 6
- SXMSEHDMNIUTSP-DCAQKATOSA-N Pro-Lys-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SXMSEHDMNIUTSP-DCAQKATOSA-N 0.000 description 6
- SMFQZMGHCODUPQ-ULQDDVLXSA-N Pro-Lys-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SMFQZMGHCODUPQ-ULQDDVLXSA-N 0.000 description 6
- WOIFYRZPIORBRY-AVGNSLFASA-N Pro-Lys-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WOIFYRZPIORBRY-AVGNSLFASA-N 0.000 description 6
- SXJOPONICMGFCR-DCAQKATOSA-N Pro-Ser-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O SXJOPONICMGFCR-DCAQKATOSA-N 0.000 description 6
- VEUACYMXJKXALX-IHRRRGAJSA-N Pro-Tyr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VEUACYMXJKXALX-IHRRRGAJSA-N 0.000 description 6
- FUOGXAQMNJMBFG-WPRPVWTQSA-N Pro-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FUOGXAQMNJMBFG-WPRPVWTQSA-N 0.000 description 6
- JPIDMRXXNMIVKY-VZFHVOOUSA-N Ser-Ala-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPIDMRXXNMIVKY-VZFHVOOUSA-N 0.000 description 6
- QWZIOCFPXMAXET-CIUDSAMLSA-N Ser-Arg-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O QWZIOCFPXMAXET-CIUDSAMLSA-N 0.000 description 6
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 6
- ZOHGLPQGEHSLPD-FXQIFTODSA-N Ser-Gln-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZOHGLPQGEHSLPD-FXQIFTODSA-N 0.000 description 6
- HJEBZBMOTCQYDN-ACZMJKKPSA-N Ser-Glu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HJEBZBMOTCQYDN-ACZMJKKPSA-N 0.000 description 6
- WBINSDOPZHQPPM-AVGNSLFASA-N Ser-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)O WBINSDOPZHQPPM-AVGNSLFASA-N 0.000 description 6
- OHKFXGKHSJKKAL-NRPADANISA-N Ser-Glu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OHKFXGKHSJKKAL-NRPADANISA-N 0.000 description 6
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 6
- HMRAQFJFTOLDKW-GUBZILKMSA-N Ser-His-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMRAQFJFTOLDKW-GUBZILKMSA-N 0.000 description 6
- DJACUBDEDBZKLQ-KBIXCLLPSA-N Ser-Ile-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O DJACUBDEDBZKLQ-KBIXCLLPSA-N 0.000 description 6
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 6
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 6
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 6
- XVWDJUROVRQKAE-KKUMJFAQSA-N Ser-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=CC=C1 XVWDJUROVRQKAE-KKUMJFAQSA-N 0.000 description 6
- WNDUPCKKKGSKIQ-CIUDSAMLSA-N Ser-Pro-Gln Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O WNDUPCKKKGSKIQ-CIUDSAMLSA-N 0.000 description 6
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 6
- MFQMZDPAZRZAPV-NAKRPEOUSA-N Ser-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CO)N MFQMZDPAZRZAPV-NAKRPEOUSA-N 0.000 description 6
- VIBXMCZWVUOZLA-OLHMAJIHSA-N Thr-Asn-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O VIBXMCZWVUOZLA-OLHMAJIHSA-N 0.000 description 6
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 6
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 6
- ZTPXSEUVYNNZRB-CDMKHQONSA-N Thr-Gly-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZTPXSEUVYNNZRB-CDMKHQONSA-N 0.000 description 6
- SXAGUVRFGJSFKC-ZEILLAHLSA-N Thr-His-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SXAGUVRFGJSFKC-ZEILLAHLSA-N 0.000 description 6
- GMXIJHCBTZDAPD-QPHKQPEJSA-N Thr-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N GMXIJHCBTZDAPD-QPHKQPEJSA-N 0.000 description 6
- AHOLTQCAVBSUDP-PPCPHDFISA-N Thr-Ile-Lys Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)[C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O AHOLTQCAVBSUDP-PPCPHDFISA-N 0.000 description 6
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 6
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 6
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 6
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 6
- PJCYRZVSACOYSN-ZJDVBMNYSA-N Thr-Thr-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O PJCYRZVSACOYSN-ZJDVBMNYSA-N 0.000 description 6
- XVHAUVJXBFGUPC-RPTUDFQQSA-N Thr-Tyr-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XVHAUVJXBFGUPC-RPTUDFQQSA-N 0.000 description 6
- HJTYJQVRIQXMHM-XIRDDKMYSA-N Trp-Asp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N HJTYJQVRIQXMHM-XIRDDKMYSA-N 0.000 description 6
- NKUIXQOJUAEIET-AQZXSJQPSA-N Trp-Asp-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@H](O)C)C(O)=O)=CNC2=C1 NKUIXQOJUAEIET-AQZXSJQPSA-N 0.000 description 6
- KRCPXGSWDOGHAM-XIRDDKMYSA-N Trp-Lys-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O KRCPXGSWDOGHAM-XIRDDKMYSA-N 0.000 description 6
- UUIYFDAWNBSWPG-IHPCNDPISA-N Trp-Lys-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N UUIYFDAWNBSWPG-IHPCNDPISA-N 0.000 description 6
- SCCKSNREWHMKOJ-SRVKXCTJSA-N Tyr-Asn-Ser Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O SCCKSNREWHMKOJ-SRVKXCTJSA-N 0.000 description 6
- RCLOWEZASFJFEX-KKUMJFAQSA-N Tyr-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 RCLOWEZASFJFEX-KKUMJFAQSA-N 0.000 description 6
- ZRPLVTZTKPPSBT-AVGNSLFASA-N Tyr-Glu-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZRPLVTZTKPPSBT-AVGNSLFASA-N 0.000 description 6
- IJUTXXAXQODRMW-KBPBESRZSA-N Tyr-Gly-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O IJUTXXAXQODRMW-KBPBESRZSA-N 0.000 description 6
- ZOBLBMGJKVJVEV-BZSNNMDCSA-N Tyr-Lys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O ZOBLBMGJKVJVEV-BZSNNMDCSA-N 0.000 description 6
- LRHBBGDMBLFYGL-FHWLQOOXSA-N Tyr-Phe-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LRHBBGDMBLFYGL-FHWLQOOXSA-N 0.000 description 6
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 6
- MQGGXGKQSVEQHR-KKUMJFAQSA-N Tyr-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 MQGGXGKQSVEQHR-KKUMJFAQSA-N 0.000 description 6
- UMSZZGTXGKHTFJ-SRVKXCTJSA-N Tyr-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UMSZZGTXGKHTFJ-SRVKXCTJSA-N 0.000 description 6
- PWKMJDQXKCENMF-MEYUZBJRSA-N Tyr-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O PWKMJDQXKCENMF-MEYUZBJRSA-N 0.000 description 6
- YOTRXXBHTZHKLU-BVSLBCMMSA-N Tyr-Trp-Met Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCSC)C(O)=O)C1=CC=C(O)C=C1 YOTRXXBHTZHKLU-BVSLBCMMSA-N 0.000 description 6
- FZSPNKUFROZBSG-ZKWXMUAHSA-N Val-Ala-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O FZSPNKUFROZBSG-ZKWXMUAHSA-N 0.000 description 6
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 6
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 6
- FOADDSDHGRFUOC-DZKIICNBSA-N Val-Glu-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N FOADDSDHGRFUOC-DZKIICNBSA-N 0.000 description 6
- SDUBQHUJJWQTEU-XUXIUFHCSA-N Val-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C(C)C)N SDUBQHUJJWQTEU-XUXIUFHCSA-N 0.000 description 6
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 6
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 6
- RWOGENDAOGMHLX-DCAQKATOSA-N Val-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N RWOGENDAOGMHLX-DCAQKATOSA-N 0.000 description 6
- MHHAWNPHDLCPLF-ULQDDVLXSA-N Val-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 MHHAWNPHDLCPLF-ULQDDVLXSA-N 0.000 description 6
- KISFXYYRKKNLOP-IHRRRGAJSA-N Val-Phe-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N KISFXYYRKKNLOP-IHRRRGAJSA-N 0.000 description 6
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 6
- RLVTVHSDKHBFQP-ULQDDVLXSA-N Val-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 RLVTVHSDKHBFQP-ULQDDVLXSA-N 0.000 description 6
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 6
- 108010044940 alanylglutamine Proteins 0.000 description 6
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 6
- 238000004458 analytical method Methods 0.000 description 6
- 108010013835 arginine glutamate Proteins 0.000 description 6
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 6
- 108010077245 asparaginyl-proline Proteins 0.000 description 6
- 108010068265 aspartyltyrosine Proteins 0.000 description 6
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 6
- 108010049041 glutamylalanine Proteins 0.000 description 6
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 6
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 6
- 108010084389 glycyltryptophan Proteins 0.000 description 6
- 108010060857 isoleucyl-valyl-tyrosine Proteins 0.000 description 6
- 108010027338 isoleucylcysteine Proteins 0.000 description 6
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 6
- 108010000761 leucylarginine Proteins 0.000 description 6
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 6
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 6
- 108010009298 lysylglutamic acid Proteins 0.000 description 6
- 238000012163 sequencing technique Methods 0.000 description 6
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 6
- 108010080629 tryptophan-leucine Proteins 0.000 description 6
- 108010078580 tyrosylleucine Proteins 0.000 description 6
- FVSOUJZKYWEFOB-KBIXCLLPSA-N Ala-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)N FVSOUJZKYWEFOB-KBIXCLLPSA-N 0.000 description 5
- AWAXZRDKUHOPBO-GUBZILKMSA-N Ala-Gln-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O AWAXZRDKUHOPBO-GUBZILKMSA-N 0.000 description 5
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 5
- IFKQPMZRDQZSHI-GHCJXIJMSA-N Ala-Ile-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O IFKQPMZRDQZSHI-GHCJXIJMSA-N 0.000 description 5
- OKIKVSXTXVVFDV-MMWGEVLESA-N Ala-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N OKIKVSXTXVVFDV-MMWGEVLESA-N 0.000 description 5
- NINQYGGNRIBFSC-CIUDSAMLSA-N Ala-Lys-Ser Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CO)C(O)=O NINQYGGNRIBFSC-CIUDSAMLSA-N 0.000 description 5
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 5
- NYDIVDKTULRINZ-AVGNSLFASA-N Arg-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NYDIVDKTULRINZ-AVGNSLFASA-N 0.000 description 5
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 5
- AOHKLEBWKMKITA-IHRRRGAJSA-N Arg-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AOHKLEBWKMKITA-IHRRRGAJSA-N 0.000 description 5
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 5
- FBODFHMLALOPHP-GUBZILKMSA-N Asn-Lys-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O FBODFHMLALOPHP-GUBZILKMSA-N 0.000 description 5
- RAUPFUCUDBQYHE-AVGNSLFASA-N Asn-Phe-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RAUPFUCUDBQYHE-AVGNSLFASA-N 0.000 description 5
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 5
- FTNRWCPWDWRPAV-BZSNNMDCSA-N Asn-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CC(N)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FTNRWCPWDWRPAV-BZSNNMDCSA-N 0.000 description 5
- DPWDPEVGACCWTC-SRVKXCTJSA-N Asn-Tyr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O DPWDPEVGACCWTC-SRVKXCTJSA-N 0.000 description 5
- KDFQZBWWPYQBEN-ZLUOBGJFSA-N Asp-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N KDFQZBWWPYQBEN-ZLUOBGJFSA-N 0.000 description 5
- UGIBTKGQVWFTGX-BIIVOSGPSA-N Asp-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O UGIBTKGQVWFTGX-BIIVOSGPSA-N 0.000 description 5
- WCFCYFDBMNFSPA-ACZMJKKPSA-N Asp-Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O WCFCYFDBMNFSPA-ACZMJKKPSA-N 0.000 description 5
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 5
- RXBGWGRSWXOBGK-KKUMJFAQSA-N Asp-Lys-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RXBGWGRSWXOBGK-KKUMJFAQSA-N 0.000 description 5
- JSHWXQIZOCVWIA-ZKWXMUAHSA-N Asp-Ser-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JSHWXQIZOCVWIA-ZKWXMUAHSA-N 0.000 description 5
- QOCFFCUFZGDHTP-NUMRIWBASA-N Asp-Thr-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QOCFFCUFZGDHTP-NUMRIWBASA-N 0.000 description 5
- KBJVTFWQWXCYCQ-IUKAMOBKSA-N Asp-Thr-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KBJVTFWQWXCYCQ-IUKAMOBKSA-N 0.000 description 5
- JGLWFWXGOINXEA-YDHLFZDLSA-N Asp-Val-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JGLWFWXGOINXEA-YDHLFZDLSA-N 0.000 description 5
- 108010053770 Deoxyribonucleases Proteins 0.000 description 5
- 102000016911 Deoxyribonucleases Human genes 0.000 description 5
- CYTSBCIIEHUPDU-ACZMJKKPSA-N Gln-Asp-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O CYTSBCIIEHUPDU-ACZMJKKPSA-N 0.000 description 5
- UFNSPPFJOHNXRE-AUTRQRHGSA-N Gln-Gln-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UFNSPPFJOHNXRE-AUTRQRHGSA-N 0.000 description 5
- RGAOLBZBLOJUTP-GRLWGSQLSA-N Gln-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N RGAOLBZBLOJUTP-GRLWGSQLSA-N 0.000 description 5
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 5
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 5
- KIMXNQXJJWWVIN-AVGNSLFASA-N Glu-Cys-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N)O KIMXNQXJJWWVIN-AVGNSLFASA-N 0.000 description 5
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 5
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 5
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 5
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 5
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 5
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 5
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 5
- OCQUNKSFDYDXBG-QXEWZRGKSA-N Gly-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OCQUNKSFDYDXBG-QXEWZRGKSA-N 0.000 description 5
- HDNXXTBKOJKWNN-WDSKDSINSA-N Gly-Glu-Asn Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O HDNXXTBKOJKWNN-WDSKDSINSA-N 0.000 description 5
- BIRKKBCSAIHDDF-WDSKDSINSA-N Gly-Glu-Cys Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(O)=O BIRKKBCSAIHDDF-WDSKDSINSA-N 0.000 description 5
- YKJUITHASJAGHO-HOTGVXAUSA-N Gly-Lys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN YKJUITHASJAGHO-HOTGVXAUSA-N 0.000 description 5
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 5
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 5
- RIYIFUFFFBIOEU-KBPBESRZSA-N Gly-Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 RIYIFUFFFBIOEU-KBPBESRZSA-N 0.000 description 5
- MUGLKCQHTUFLGF-WPRPVWTQSA-N Gly-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)CN MUGLKCQHTUFLGF-WPRPVWTQSA-N 0.000 description 5
- TVRMJKNELJKNRS-GUBZILKMSA-N His-Glu-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N TVRMJKNELJKNRS-GUBZILKMSA-N 0.000 description 5
- CCUSLCQWVMWTIS-IXOXFDKPSA-N His-Thr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O CCUSLCQWVMWTIS-IXOXFDKPSA-N 0.000 description 5
- NKVZTQVGUNLLQW-JBDRJPRFSA-N Ile-Ala-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)O)N NKVZTQVGUNLLQW-JBDRJPRFSA-N 0.000 description 5
- HZMLFETXHFHGBB-UGYAYLCHSA-N Ile-Asn-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZMLFETXHFHGBB-UGYAYLCHSA-N 0.000 description 5
- XENGULNPUDGALZ-ZPFDUUQYSA-N Ile-Asn-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N XENGULNPUDGALZ-ZPFDUUQYSA-N 0.000 description 5
- UIEZQYNXCYHMQS-BJDJZHNGSA-N Ile-Lys-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)O)N UIEZQYNXCYHMQS-BJDJZHNGSA-N 0.000 description 5
- IDMNOFVUXYYZPF-DKIMLUQUSA-N Ile-Lys-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N IDMNOFVUXYYZPF-DKIMLUQUSA-N 0.000 description 5
- HJDZMPFEXINXLO-QPHKQPEJSA-N Ile-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N HJDZMPFEXINXLO-QPHKQPEJSA-N 0.000 description 5
- FXJLRZFMKGHYJP-CFMVVWHZSA-N Ile-Tyr-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N FXJLRZFMKGHYJP-CFMVVWHZSA-N 0.000 description 5
- DZMWFIRHFFVBHS-ZEWNOJEFSA-N Ile-Tyr-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N DZMWFIRHFFVBHS-ZEWNOJEFSA-N 0.000 description 5
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 5
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 5
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 5
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 5
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 5
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 5
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 5
- ILDSIMPXNFWKLH-KATARQTJSA-N Leu-Thr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 5
- AXVIGSRGTMNSJU-YESZJQIVSA-N Leu-Tyr-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N AXVIGSRGTMNSJU-YESZJQIVSA-N 0.000 description 5
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 5
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 5
- WALVCOOOKULCQM-ULQDDVLXSA-N Lys-Arg-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WALVCOOOKULCQM-ULQDDVLXSA-N 0.000 description 5
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 5
- QIJVAFLRMVBHMU-KKUMJFAQSA-N Lys-Asp-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QIJVAFLRMVBHMU-KKUMJFAQSA-N 0.000 description 5
- GUYHHBZCBQZLFW-GUBZILKMSA-N Lys-Gln-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N GUYHHBZCBQZLFW-GUBZILKMSA-N 0.000 description 5
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 5
- GRADYHMSAUIKPS-DCAQKATOSA-N Lys-Glu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRADYHMSAUIKPS-DCAQKATOSA-N 0.000 description 5
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 5
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 5
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 5
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 5
- PBLLTSKBTAHDNA-KBPBESRZSA-N Lys-Gly-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PBLLTSKBTAHDNA-KBPBESRZSA-N 0.000 description 5
- ZJWIXBZTAAJERF-IHRRRGAJSA-N Lys-Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZJWIXBZTAAJERF-IHRRRGAJSA-N 0.000 description 5
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 5
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 5
- TWPCWKVOZDUYAA-KKUMJFAQSA-N Lys-Phe-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O TWPCWKVOZDUYAA-KKUMJFAQSA-N 0.000 description 5
- VHTOGMKQXXJOHG-RHYQMDGZSA-N Lys-Thr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VHTOGMKQXXJOHG-RHYQMDGZSA-N 0.000 description 5
- ORRNBLTZBBESPN-HJWJTTGWSA-N Met-Ile-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ORRNBLTZBBESPN-HJWJTTGWSA-N 0.000 description 5
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 5
- WPTHAGXMYDRPFD-SRVKXCTJSA-N Met-Lys-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O WPTHAGXMYDRPFD-SRVKXCTJSA-N 0.000 description 5
- PHKBGZKVOJCIMZ-SRVKXCTJSA-N Met-Pro-Arg Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PHKBGZKVOJCIMZ-SRVKXCTJSA-N 0.000 description 5
- 101710163270 Nuclease Proteins 0.000 description 5
- ULECEJGNDHWSKD-QEJZJMRPSA-N Phe-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 ULECEJGNDHWSKD-QEJZJMRPSA-N 0.000 description 5
- IUVYJBMTHARMIP-PCBIJLKTSA-N Phe-Asp-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O IUVYJBMTHARMIP-PCBIJLKTSA-N 0.000 description 5
- OYQBFWWQSVIHBN-FHWLQOOXSA-N Phe-Glu-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O OYQBFWWQSVIHBN-FHWLQOOXSA-N 0.000 description 5
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 5
- CJZTUKSFZUSNCC-FXQIFTODSA-N Pro-Asp-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 CJZTUKSFZUSNCC-FXQIFTODSA-N 0.000 description 5
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 5
- JIWJRKNYLSHONY-KKUMJFAQSA-N Pro-Phe-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O JIWJRKNYLSHONY-KKUMJFAQSA-N 0.000 description 5
- RMJZWERKFFNNNS-XGEHTFHBSA-N Pro-Thr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMJZWERKFFNNNS-XGEHTFHBSA-N 0.000 description 5
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 5
- IXUGADGDCQDLSA-FXQIFTODSA-N Ser-Gln-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N IXUGADGDCQDLSA-FXQIFTODSA-N 0.000 description 5
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 5
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 5
- MOINZPRHJGTCHZ-MMWGEVLESA-N Ser-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N MOINZPRHJGTCHZ-MMWGEVLESA-N 0.000 description 5
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 5
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 5
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 5
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 5
- UBTNVMGPMYDYIU-HJPIBITLSA-N Ser-Tyr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UBTNVMGPMYDYIU-HJPIBITLSA-N 0.000 description 5
- SGZVZUCRAVSPKQ-FXQIFTODSA-N Ser-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N SGZVZUCRAVSPKQ-FXQIFTODSA-N 0.000 description 5
- JVTHIXKSVYEWNI-JRQIVUDYSA-N Thr-Asn-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JVTHIXKSVYEWNI-JRQIVUDYSA-N 0.000 description 5
- XOWKUMFHEZLKLT-CIQUZCHMSA-N Thr-Ile-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O XOWKUMFHEZLKLT-CIQUZCHMSA-N 0.000 description 5
- TZJSEJOXAIWOST-RHYQMDGZSA-N Thr-Lys-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N TZJSEJOXAIWOST-RHYQMDGZSA-N 0.000 description 5
- QNCFWHZVRNXAKW-OEAJRASXSA-N Thr-Lys-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNCFWHZVRNXAKW-OEAJRASXSA-N 0.000 description 5
- VGYVVSQFSSKZRJ-OEAJRASXSA-N Thr-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=CC=C1 VGYVVSQFSSKZRJ-OEAJRASXSA-N 0.000 description 5
- 108091028113 Trans-activating crRNA Proteins 0.000 description 5
- IYHNBRUWVBIVJR-IHRRRGAJSA-N Tyr-Gln-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IYHNBRUWVBIVJR-IHRRRGAJSA-N 0.000 description 5
- BXPOOVDVGWEXDU-WZLNRYEVSA-N Tyr-Ile-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BXPOOVDVGWEXDU-WZLNRYEVSA-N 0.000 description 5
- BYAKMYBZADCNMN-JYJNAYRXSA-N Tyr-Lys-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O BYAKMYBZADCNMN-JYJNAYRXSA-N 0.000 description 5
- LVILBTSHPTWDGE-PMVMPFDFSA-N Tyr-Trp-Lys Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCCCN)C(O)=O)C1=CC=C(O)C=C1 LVILBTSHPTWDGE-PMVMPFDFSA-N 0.000 description 5
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 5
- OBKOPLHSRDATFO-XHSDSOJGSA-N Tyr-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OBKOPLHSRDATFO-XHSDSOJGSA-N 0.000 description 5
- GXAZTLJYINLMJL-LAEOZQHASA-N Val-Asn-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N GXAZTLJYINLMJL-LAEOZQHASA-N 0.000 description 5
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 5
- NZGOVKLVQNOEKP-YDHLFZDLSA-N Val-Phe-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NZGOVKLVQNOEKP-YDHLFZDLSA-N 0.000 description 5
- 108010041407 alanylaspartic acid Proteins 0.000 description 5
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 5
- 230000002779 inactivation Effects 0.000 description 5
- 229930027917 kanamycin Natural products 0.000 description 5
- 229960000318 kanamycin Drugs 0.000 description 5
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 5
- 229930182823 kanamycin A Natural products 0.000 description 5
- 108010091871 leucylmethionine Proteins 0.000 description 5
- 108010074082 phenylalanyl-alanyl-lysine Proteins 0.000 description 5
- 108090000765 processed proteins & peptides Proteins 0.000 description 5
- 108010061238 threonyl-glycine Proteins 0.000 description 5
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 5
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 4
- GIKOVDMXBAFXDF-NHCYSSNCSA-N Asp-Val-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GIKOVDMXBAFXDF-NHCYSSNCSA-N 0.000 description 4
- 102000053602 DNA Human genes 0.000 description 4
- 108091028043 Nucleic acid sequence Proteins 0.000 description 4
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 4
- LUMQYLVYUIRHHU-YJRXYDGGSA-N Tyr-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LUMQYLVYUIRHHU-YJRXYDGGSA-N 0.000 description 4
- 101150009206 aprE gene Proteins 0.000 description 4
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 4
- 238000006243 chemical reaction Methods 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 101150112117 nprE gene Proteins 0.000 description 4
- 239000000243 solution Substances 0.000 description 4
- 101710169336 5'-deoxyadenosine deaminase Proteins 0.000 description 3
- 102000055025 Adenosine deaminases Human genes 0.000 description 3
- DXHINQUXBZNUCF-MELADBBJSA-N Asn-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)N)N)C(=O)O DXHINQUXBZNUCF-MELADBBJSA-N 0.000 description 3
- CBHVAFXKOYAHOY-NHCYSSNCSA-N Asn-Val-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O CBHVAFXKOYAHOY-NHCYSSNCSA-N 0.000 description 3
- 241000894006 Bacteria Species 0.000 description 3
- 238000010453 CRISPR/Cas method Methods 0.000 description 3
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 3
- RRSLQOLASISYTB-CIUDSAMLSA-N Leu-Cys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O RRSLQOLASISYTB-CIUDSAMLSA-N 0.000 description 3
- 108091005804 Peptidases Proteins 0.000 description 3
- OXKJSGGTHFMGDT-UFYCRDLUSA-N Phe-Phe-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)C1=CC=CC=C1 OXKJSGGTHFMGDT-UFYCRDLUSA-N 0.000 description 3
- 239000004365 Protease Substances 0.000 description 3
- 102000006382 Ribonucleases Human genes 0.000 description 3
- 108010083644 Ribonucleases Proteins 0.000 description 3
- ABSXSJZNRAQDDI-KJEVXHAQSA-N Tyr-Val-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ABSXSJZNRAQDDI-KJEVXHAQSA-N 0.000 description 3
- 230000009471 action Effects 0.000 description 3
- 230000000052 comparative effect Effects 0.000 description 3
- 230000000295 complement effect Effects 0.000 description 3
- 239000013604 expression vector Substances 0.000 description 3
- 239000012634 fragment Substances 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 3
- 239000002609 medium Substances 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 108010029020 prolylglycine Proteins 0.000 description 3
- HAJWYALLJIATCX-FXQIFTODSA-N Asn-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N HAJWYALLJIATCX-FXQIFTODSA-N 0.000 description 2
- 241000276408 Bacillus subtilis subsp. subtilis str. 168 Species 0.000 description 2
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 2
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 2
- 102100035102 E3 ubiquitin-protein ligase MYCBP2 Human genes 0.000 description 2
- 102000004190 Enzymes Human genes 0.000 description 2
- 108090000790 Enzymes Proteins 0.000 description 2
- ZMXZGYLINVNTKH-DZKIICNBSA-N Gln-Val-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZMXZGYLINVNTKH-DZKIICNBSA-N 0.000 description 2
- RJONUNZIMUXUOI-GUBZILKMSA-N Glu-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N RJONUNZIMUXUOI-GUBZILKMSA-N 0.000 description 2
- UMIRPYLZFKOEOH-YVNDNENWSA-N Glu-Gln-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UMIRPYLZFKOEOH-YVNDNENWSA-N 0.000 description 2
- GGJOGFJIPPGNRK-JSGCOSHPSA-N Glu-Gly-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)N)C(O)=O)=CNC2=C1 GGJOGFJIPPGNRK-JSGCOSHPSA-N 0.000 description 2
- JBSLJUPMTYLLFH-MELADBBJSA-N His-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC3=CN=CN3)N)C(=O)O JBSLJUPMTYLLFH-MELADBBJSA-N 0.000 description 2
- 229930010555 Inosine Natural products 0.000 description 2
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 2
- SJNZALDHDUYDBU-IHRRRGAJSA-N Lys-Arg-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(O)=O SJNZALDHDUYDBU-IHRRRGAJSA-N 0.000 description 2
- JHNOXVASMSXSNB-WEDXCCLWSA-N Lys-Thr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JHNOXVASMSXSNB-WEDXCCLWSA-N 0.000 description 2
- 240000007594 Oryza sativa Species 0.000 description 2
- 235000007164 Oryza sativa Nutrition 0.000 description 2
- 102000035195 Peptidases Human genes 0.000 description 2
- 102000004389 Ribonucleoproteins Human genes 0.000 description 2
- 108010081734 Ribonucleoproteins Proteins 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 108091081024 Start codon Proteins 0.000 description 2
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 2
- 150000001413 amino acids Chemical group 0.000 description 2
- 239000003242 anti bacterial agent Substances 0.000 description 2
- 229940088710 antibiotic agent Drugs 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 238000003776 cleavage reaction Methods 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 239000012636 effector Substances 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- 229960003786 inosine Drugs 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 230000037361 pathway Effects 0.000 description 2
- 235000009566 rice Nutrition 0.000 description 2
- 238000007480 Sanger sequencing Methods 0.000 description 2
- 230000007017 scission Effects 0.000 description 2
- 125000006850 spacer group Chemical group 0.000 description 2
- 238000006467 substitution reaction Methods 0.000 description 2
- 238000011144 upstream manufacturing Methods 0.000 description 2
- CNKBMTKICGGSCQ-ACRUOGEOSA-N (2S)-2-[[(2S)-2-[[(2S)-2,6-diamino-1-oxohexyl]amino]-1-oxo-3-phenylpropyl]amino]-3-(4-hydroxyphenyl)propanoic acid Chemical compound C([C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CNKBMTKICGGSCQ-ACRUOGEOSA-N 0.000 description 1
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 1
- PAHHYDSPOXDASW-VGWMRTNUSA-N (2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-1-[(2s)-2-amino-3-hydroxypropanoyl]pyrrolidine-2-carbonyl]amino]hexanoyl]amino]hexanoic acid Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO PAHHYDSPOXDASW-VGWMRTNUSA-N 0.000 description 1
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 1
- IPAMZHCXCQLRJR-UHFFFAOYSA-N 2-[[2-[[2-[(2-amino-3-methylbutanoyl)amino]-4-methylpentanoyl]amino]-3-methylbutanoyl]amino]-4-methylpentanoic acid Chemical compound CC(C)CC(C(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(N)C(C)C IPAMZHCXCQLRJR-UHFFFAOYSA-N 0.000 description 1
- 101800000535 3C-like proteinase Proteins 0.000 description 1
- 101800002396 3C-like proteinase nsp5 Proteins 0.000 description 1
- STACJSVFHSEZJV-GHCJXIJMSA-N Ala-Asn-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STACJSVFHSEZJV-GHCJXIJMSA-N 0.000 description 1
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 1
- CKLDHDOIYBVUNP-KBIXCLLPSA-N Ala-Ile-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O CKLDHDOIYBVUNP-KBIXCLLPSA-N 0.000 description 1
- RZZMZYZXNJRPOJ-BJDJZHNGSA-N Ala-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C)N RZZMZYZXNJRPOJ-BJDJZHNGSA-N 0.000 description 1
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 1
- SDZRIBWEVVRDQI-CIUDSAMLSA-N Ala-Lys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O SDZRIBWEVVRDQI-CIUDSAMLSA-N 0.000 description 1
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 1
- KQESEZXHYOUIIM-CQDKDKBSSA-N Ala-Lys-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KQESEZXHYOUIIM-CQDKDKBSSA-N 0.000 description 1
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 1
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 1
- QOIGKCBMXUCDQU-KDXUFGMBSA-N Ala-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N)O QOIGKCBMXUCDQU-KDXUFGMBSA-N 0.000 description 1
- DCGLNNVKIZXQOJ-FXQIFTODSA-N Arg-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N DCGLNNVKIZXQOJ-FXQIFTODSA-N 0.000 description 1
- LKDHUGLXOHYINY-XUXIUFHCSA-N Arg-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LKDHUGLXOHYINY-XUXIUFHCSA-N 0.000 description 1
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 1
- LCBSSOCDWUTQQV-SDDRHHMPSA-N Arg-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LCBSSOCDWUTQQV-SDDRHHMPSA-N 0.000 description 1
- PRLPSDIHSRITSF-UNQGMJICSA-N Arg-Phe-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PRLPSDIHSRITSF-UNQGMJICSA-N 0.000 description 1
- ULBHWNVWSCJLCO-NHCYSSNCSA-N Arg-Val-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N ULBHWNVWSCJLCO-NHCYSSNCSA-N 0.000 description 1
- PDQBXRSOSCTGKY-ACZMJKKPSA-N Asn-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N PDQBXRSOSCTGKY-ACZMJKKPSA-N 0.000 description 1
- CMLGVVWQQHUXOZ-GHCJXIJMSA-N Asn-Ala-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CMLGVVWQQHUXOZ-GHCJXIJMSA-N 0.000 description 1
- VDCIPFYVCICPEC-FXQIFTODSA-N Asn-Arg-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O VDCIPFYVCICPEC-FXQIFTODSA-N 0.000 description 1
- JJGRJMKUOYXZRA-LPEHRKFASA-N Asn-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O JJGRJMKUOYXZRA-LPEHRKFASA-N 0.000 description 1
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 1
- DXVMJJNAOVECBA-WHFBIAKZSA-N Asn-Gly-Asn Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O DXVMJJNAOVECBA-WHFBIAKZSA-N 0.000 description 1
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 1
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 1
- FTSAJSADJCMDHH-CIUDSAMLSA-N Asn-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N FTSAJSADJCMDHH-CIUDSAMLSA-N 0.000 description 1
- AWXDRZJQCVHCIT-DCAQKATOSA-N Asn-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O AWXDRZJQCVHCIT-DCAQKATOSA-N 0.000 description 1
- QIRJQYQOIKBPBZ-IHRRRGAJSA-N Asn-Tyr-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QIRJQYQOIKBPBZ-IHRRRGAJSA-N 0.000 description 1
- KTDWFWNZLLFEFU-KKUMJFAQSA-N Asn-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O KTDWFWNZLLFEFU-KKUMJFAQSA-N 0.000 description 1
- BEHQTVDBCLSCBY-CFMVVWHZSA-N Asn-Tyr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BEHQTVDBCLSCBY-CFMVVWHZSA-N 0.000 description 1
- XACXDSRQIXRMNS-OLHMAJIHSA-N Asp-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)O XACXDSRQIXRMNS-OLHMAJIHSA-N 0.000 description 1
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 1
- VZNOVQKGJQJOCS-SRVKXCTJSA-N Asp-Asp-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VZNOVQKGJQJOCS-SRVKXCTJSA-N 0.000 description 1
- RATOMFTUDRYMKX-ACZMJKKPSA-N Asp-Glu-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N RATOMFTUDRYMKX-ACZMJKKPSA-N 0.000 description 1
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 1
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 1
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 1
- 102220484559 C-type lectin domain family 4 member A_H36L_mutation Human genes 0.000 description 1
- 101150005393 CBF1 gene Proteins 0.000 description 1
- 108091079001 CRISPR RNA Proteins 0.000 description 1
- 238000010356 CRISPR-Cas9 genome editing Methods 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 101100329224 Coprinopsis cinerea (strain Okayama-7 / 130 / ATCC MYA-4618 / FGSC 9003) cpf1 gene Proteins 0.000 description 1
- HPZAJRPYUIHDIN-BZSNNMDCSA-N Cys-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CS)N HPZAJRPYUIHDIN-BZSNNMDCSA-N 0.000 description 1
- 230000007018 DNA scission Effects 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- PCKOTDPDHIBGRW-CIUDSAMLSA-N Gln-Cys-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)N)N)CN=C(N)N PCKOTDPDHIBGRW-CIUDSAMLSA-N 0.000 description 1
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 1
- LFIVHGMKWFGUGK-IHRRRGAJSA-N Gln-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LFIVHGMKWFGUGK-IHRRRGAJSA-N 0.000 description 1
- YXQCLIVLWCKCRS-RYUDHWBXSA-N Gln-Gly-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N)O YXQCLIVLWCKCRS-RYUDHWBXSA-N 0.000 description 1
- HDUDGCZEOZEFOA-KBIXCLLPSA-N Gln-Ile-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HDUDGCZEOZEFOA-KBIXCLLPSA-N 0.000 description 1
- ITZWDGBYBPUZRG-KBIXCLLPSA-N Gln-Ile-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O ITZWDGBYBPUZRG-KBIXCLLPSA-N 0.000 description 1
- FLLRAEJOLZPSMN-CIUDSAMLSA-N Glu-Asn-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FLLRAEJOLZPSMN-CIUDSAMLSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- XKPOCESCRTVRPL-KBIXCLLPSA-N Glu-Cys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XKPOCESCRTVRPL-KBIXCLLPSA-N 0.000 description 1
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 1
- GRHXUHCFENOCOS-ZPFDUUQYSA-N Glu-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)O)N GRHXUHCFENOCOS-ZPFDUUQYSA-N 0.000 description 1
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 1
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 1
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 1
- FGSGPLRPQCZBSQ-AVGNSLFASA-N Glu-Phe-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O FGSGPLRPQCZBSQ-AVGNSLFASA-N 0.000 description 1
- VNCNWQPIQYAMAK-ACZMJKKPSA-N Glu-Ser-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O VNCNWQPIQYAMAK-ACZMJKKPSA-N 0.000 description 1
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 1
- RMWAOBGCZZSJHE-UMNHJUIQSA-N Glu-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N RMWAOBGCZZSJHE-UMNHJUIQSA-N 0.000 description 1
- BRFJMRSRMOMIMU-WHFBIAKZSA-N Gly-Ala-Asn Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O BRFJMRSRMOMIMU-WHFBIAKZSA-N 0.000 description 1
- WKJKBELXHCTHIJ-WPRPVWTQSA-N Gly-Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N WKJKBELXHCTHIJ-WPRPVWTQSA-N 0.000 description 1
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 1
- KTSZUNRRYXPZTK-BQBZGAKWSA-N Gly-Gln-Glu Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KTSZUNRRYXPZTK-BQBZGAKWSA-N 0.000 description 1
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 1
- HKSNHPVETYYJBK-LAEOZQHASA-N Gly-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN HKSNHPVETYYJBK-LAEOZQHASA-N 0.000 description 1
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 1
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 1
- DHNXGWVNLFPOMQ-KBPBESRZSA-N Gly-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)CN DHNXGWVNLFPOMQ-KBPBESRZSA-N 0.000 description 1
- WNZOCXUOGVYYBJ-CDMKHQONSA-N Gly-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)CN)O WNZOCXUOGVYYBJ-CDMKHQONSA-N 0.000 description 1
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 1
- VSLXGYMEHVAJBH-DLOVCJGASA-N His-Ala-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O VSLXGYMEHVAJBH-DLOVCJGASA-N 0.000 description 1
- ISQOVWDWRUONJH-YESZJQIVSA-N His-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CN=CN3)N)C(=O)O ISQOVWDWRUONJH-YESZJQIVSA-N 0.000 description 1
- HDOYNXLPTRQLAD-JBDRJPRFSA-N Ile-Ala-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)O)N HDOYNXLPTRQLAD-JBDRJPRFSA-N 0.000 description 1
- LEDRIAHEWDJRMF-CFMVVWHZSA-N Ile-Asn-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 LEDRIAHEWDJRMF-CFMVVWHZSA-N 0.000 description 1
- QSPLUJGYOPZINY-ZPFDUUQYSA-N Ile-Asp-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QSPLUJGYOPZINY-ZPFDUUQYSA-N 0.000 description 1
- DFJJAVZIHDFOGQ-MNXVOIDGSA-N Ile-Glu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DFJJAVZIHDFOGQ-MNXVOIDGSA-N 0.000 description 1
- JXMSHKFPDIUYGS-SIUGBPQLSA-N Ile-Glu-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N JXMSHKFPDIUYGS-SIUGBPQLSA-N 0.000 description 1
- NHJKZMDIMMTVCK-QXEWZRGKSA-N Ile-Gly-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N NHJKZMDIMMTVCK-QXEWZRGKSA-N 0.000 description 1
- PWDSHAAAFXISLE-SXTJYALSSA-N Ile-Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O PWDSHAAAFXISLE-SXTJYALSSA-N 0.000 description 1
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 1
- FFJQAEYLAQMGDL-MGHWNKPDSA-N Ile-Lys-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FFJQAEYLAQMGDL-MGHWNKPDSA-N 0.000 description 1
- IIWQTXMUALXGOV-PCBIJLKTSA-N Ile-Phe-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IIWQTXMUALXGOV-PCBIJLKTSA-N 0.000 description 1
- NLZVTPYXYXMCIP-XUXIUFHCSA-N Ile-Pro-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O NLZVTPYXYXMCIP-XUXIUFHCSA-N 0.000 description 1
- CZWANIQKACCEKW-CYDGBPFRSA-N Ile-Pro-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)O)N CZWANIQKACCEKW-CYDGBPFRSA-N 0.000 description 1
- VGSPNSSCMOHRRR-BJDJZHNGSA-N Ile-Ser-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N VGSPNSSCMOHRRR-BJDJZHNGSA-N 0.000 description 1
- NAFIFZNBSPWYOO-RWRJDSDZSA-N Ile-Thr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NAFIFZNBSPWYOO-RWRJDSDZSA-N 0.000 description 1
- COWHUQXTSYTKQC-RWRJDSDZSA-N Ile-Thr-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N COWHUQXTSYTKQC-RWRJDSDZSA-N 0.000 description 1
- HQLSBZFLOUHQJK-STECZYCISA-N Ile-Tyr-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HQLSBZFLOUHQJK-STECZYCISA-N 0.000 description 1
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- NFHJQETXTSDZSI-DCAQKATOSA-N Leu-Cys-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NFHJQETXTSDZSI-DCAQKATOSA-N 0.000 description 1
- PIHFVNPEAHFNLN-KKUMJFAQSA-N Leu-Cys-Tyr Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N PIHFVNPEAHFNLN-KKUMJFAQSA-N 0.000 description 1
- DLCXCECTCPKKCD-GUBZILKMSA-N Leu-Gln-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DLCXCECTCPKKCD-GUBZILKMSA-N 0.000 description 1
- OGUUKPXUTHOIAV-SDDRHHMPSA-N Leu-Glu-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N OGUUKPXUTHOIAV-SDDRHHMPSA-N 0.000 description 1
- ZFNLIDNJUWNIJL-WDCWCFNPSA-N Leu-Glu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZFNLIDNJUWNIJL-WDCWCFNPSA-N 0.000 description 1
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 1
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 1
- IFMPDNRWZZEZSL-SRVKXCTJSA-N Leu-Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O IFMPDNRWZZEZSL-SRVKXCTJSA-N 0.000 description 1
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 1
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 1
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 1
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 1
- LMDVGHQPPPLYAR-IHRRRGAJSA-N Leu-Val-His Chemical compound N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O LMDVGHQPPPLYAR-IHRRRGAJSA-N 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- FZIJIFCXUCZHOL-CIUDSAMLSA-N Lys-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN FZIJIFCXUCZHOL-CIUDSAMLSA-N 0.000 description 1
- 108010062166 Lys-Asn-Asp Proteins 0.000 description 1
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 1
- FLCMXEFCTLXBTL-DCAQKATOSA-N Lys-Asp-Arg Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N FLCMXEFCTLXBTL-DCAQKATOSA-N 0.000 description 1
- NNCDAORZCMPZPX-GUBZILKMSA-N Lys-Gln-Ser Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N NNCDAORZCMPZPX-GUBZILKMSA-N 0.000 description 1
- PAMDBWYMLWOELY-SDDRHHMPSA-N Lys-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O PAMDBWYMLWOELY-SDDRHHMPSA-N 0.000 description 1
- MXMDJEJWERYPMO-XUXIUFHCSA-N Lys-Ile-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MXMDJEJWERYPMO-XUXIUFHCSA-N 0.000 description 1
- NCZIQZYZPUPMKY-PPCPHDFISA-N Lys-Ile-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NCZIQZYZPUPMKY-PPCPHDFISA-N 0.000 description 1
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 1
- ODTZHNZPINULEU-KKUMJFAQSA-N Lys-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N ODTZHNZPINULEU-KKUMJFAQSA-N 0.000 description 1
- ZJSZPXISKMDJKQ-JYJNAYRXSA-N Lys-Phe-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=CC=C1 ZJSZPXISKMDJKQ-JYJNAYRXSA-N 0.000 description 1
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 1
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 1
- BIWVMACFGZFIEB-VFAJRCTISA-N Lys-Trp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCCCN)N)O BIWVMACFGZFIEB-VFAJRCTISA-N 0.000 description 1
- PSVAVKGDUAKZKU-BZSNNMDCSA-N Lys-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCCN)N)O PSVAVKGDUAKZKU-BZSNNMDCSA-N 0.000 description 1
- JQEBITVYKUCBMC-SRVKXCTJSA-N Met-Arg-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JQEBITVYKUCBMC-SRVKXCTJSA-N 0.000 description 1
- BQVJARUIXRXDKN-DCAQKATOSA-N Met-Asn-His Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 BQVJARUIXRXDKN-DCAQKATOSA-N 0.000 description 1
- NLHSFJQUHGCWSD-PYJNHQTQSA-N Met-Ile-His Chemical compound N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O NLHSFJQUHGCWSD-PYJNHQTQSA-N 0.000 description 1
- LCPUWQLULVXROY-RHYQMDGZSA-N Met-Lys-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LCPUWQLULVXROY-RHYQMDGZSA-N 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- KIEPQOIQHFKQLK-PCBIJLKTSA-N Phe-Asn-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KIEPQOIQHFKQLK-PCBIJLKTSA-N 0.000 description 1
- CUMXHKAOHNWRFQ-BZSNNMDCSA-N Phe-Asp-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CUMXHKAOHNWRFQ-BZSNNMDCSA-N 0.000 description 1
- MGECUMGTSHYHEJ-QEWYBTABSA-N Phe-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MGECUMGTSHYHEJ-QEWYBTABSA-N 0.000 description 1
- BFYHIHGIHGROAT-HTUGSXCWSA-N Phe-Glu-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFYHIHGIHGROAT-HTUGSXCWSA-N 0.000 description 1
- MJAYDXWQQUOURZ-JYJNAYRXSA-N Phe-Lys-Gln Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O MJAYDXWQQUOURZ-JYJNAYRXSA-N 0.000 description 1
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 1
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 1
- KAJLHCWRWDSROH-BZSNNMDCSA-N Phe-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 KAJLHCWRWDSROH-BZSNNMDCSA-N 0.000 description 1
- XDMMOISUAHXXFD-SRVKXCTJSA-N Phe-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O XDMMOISUAHXXFD-SRVKXCTJSA-N 0.000 description 1
- ILGCZYGFYQLSDZ-KKUMJFAQSA-N Phe-Ser-His Chemical compound N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O ILGCZYGFYQLSDZ-KKUMJFAQSA-N 0.000 description 1
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 1
- CPRLKHJUFAXVTD-ULQDDVLXSA-N Pro-Leu-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CPRLKHJUFAXVTD-ULQDDVLXSA-N 0.000 description 1
- 108010003201 RGH 0205 Proteins 0.000 description 1
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 1
- 108010034634 Repressor Proteins Proteins 0.000 description 1
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- KYKKKSWGEPFUMR-NAKRPEOUSA-N Ser-Arg-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KYKKKSWGEPFUMR-NAKRPEOUSA-N 0.000 description 1
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 1
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 1
- QILPDQCTQZDHFM-HJGDQZAQSA-N Thr-Gln-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QILPDQCTQZDHFM-HJGDQZAQSA-N 0.000 description 1
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 1
- PAXANSWUSVPFNK-IUKAMOBKSA-N Thr-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N PAXANSWUSVPFNK-IUKAMOBKSA-N 0.000 description 1
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 1
- IJVNLNRVDUTWDD-MEYUZBJRSA-N Thr-Leu-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IJVNLNRVDUTWDD-MEYUZBJRSA-N 0.000 description 1
- NQQMWWVVGIXUOX-SVSWQMSJSA-N Thr-Ser-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NQQMWWVVGIXUOX-SVSWQMSJSA-N 0.000 description 1
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 1
- KPMIQCXJDVKWKO-IFFSRLJSSA-N Thr-Val-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KPMIQCXJDVKWKO-IFFSRLJSSA-N 0.000 description 1
- FKAPNDWDLDWZNF-QEJZJMRPSA-N Trp-Asp-Glu Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N FKAPNDWDLDWZNF-QEJZJMRPSA-N 0.000 description 1
- NLLARHRWSFNEMH-NUTKFTJISA-N Trp-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NLLARHRWSFNEMH-NUTKFTJISA-N 0.000 description 1
- SNWIAPVRCNYFNI-SZMVWBNQSA-N Trp-Met-Arg Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N SNWIAPVRCNYFNI-SZMVWBNQSA-N 0.000 description 1
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 1
- KIJLSRYAUGGZIN-CFMVVWHZSA-N Tyr-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KIJLSRYAUGGZIN-CFMVVWHZSA-N 0.000 description 1
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 1
- SINRIKQYQJRGDQ-MEYUZBJRSA-N Tyr-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SINRIKQYQJRGDQ-MEYUZBJRSA-N 0.000 description 1
- PSALWJCUIAQKFW-ACRUOGEOSA-N Tyr-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N PSALWJCUIAQKFW-ACRUOGEOSA-N 0.000 description 1
- JXGUUJMPCRXMSO-HJOGWXRNSA-N Tyr-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 JXGUUJMPCRXMSO-HJOGWXRNSA-N 0.000 description 1
- SZEIFUXUTBBQFQ-STQMWFEESA-N Tyr-Pro-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SZEIFUXUTBBQFQ-STQMWFEESA-N 0.000 description 1
- MNWINJDPGBNOED-ULQDDVLXSA-N Tyr-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=C(O)C=C1 MNWINJDPGBNOED-ULQDDVLXSA-N 0.000 description 1
- GQVZBMROTPEPIF-SRVKXCTJSA-N Tyr-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GQVZBMROTPEPIF-SRVKXCTJSA-N 0.000 description 1
- OGNMURQZFMHFFD-NHCYSSNCSA-N Val-Asn-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N OGNMURQZFMHFFD-NHCYSSNCSA-N 0.000 description 1
- FBVUOEYVGNMRMD-NAKRPEOUSA-N Val-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](C(C)C)N FBVUOEYVGNMRMD-NAKRPEOUSA-N 0.000 description 1
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 1
- CELJCNRXKZPTCX-XPUUQOCRSA-N Val-Gly-Ala Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O CELJCNRXKZPTCX-XPUUQOCRSA-N 0.000 description 1
- SVFRYKBZHUGKLP-QXEWZRGKSA-N Val-Met-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SVFRYKBZHUGKLP-QXEWZRGKSA-N 0.000 description 1
- OFQGGTGZTOTLGH-NHCYSSNCSA-N Val-Met-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N OFQGGTGZTOTLGH-NHCYSSNCSA-N 0.000 description 1
- CKTMJBPRVQWPHU-JSGCOSHPSA-N Val-Phe-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)O)N CKTMJBPRVQWPHU-JSGCOSHPSA-N 0.000 description 1
- XBJKAZATRJBDCU-GUBZILKMSA-N Val-Pro-Ala Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O XBJKAZATRJBDCU-GUBZILKMSA-N 0.000 description 1
- PGBMPFKFKXYROZ-UFYCRDLUSA-N Val-Tyr-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N PGBMPFKFKXYROZ-UFYCRDLUSA-N 0.000 description 1
- 108010047495 alanylglycine Proteins 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 101150059443 cas12a gene Proteins 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 230000002860 competitive effect Effects 0.000 description 1
- 108010069495 cysteinyltyrosine Proteins 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 230000001627 detrimental effect Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- 108010054813 diprotin B Proteins 0.000 description 1
- 230000005782 double-strand break Effects 0.000 description 1
- 230000011559 double-strand break repair via nonhomologous end joining Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 238000003209 gene knockout Methods 0.000 description 1
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 1
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 108010025306 histidylleucine Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 238000011419 induction treatment Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 108010078274 isoleucylvaline Proteins 0.000 description 1
- 101150109249 lacI gene Proteins 0.000 description 1
- 210000002429 large intestine Anatomy 0.000 description 1
- 231100000518 lethal Toxicity 0.000 description 1
- 230000001665 lethal effect Effects 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 108010017391 lysylvaline Proteins 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 230000035800 maturation Effects 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 238000001821 nucleic acid purification Methods 0.000 description 1
- 108010012581 phenylalanylglutamate Proteins 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 108010004914 prolylarginine Proteins 0.000 description 1
- 108010070643 prolylglutamic acid Proteins 0.000 description 1
- 108010053725 prolylvaline Proteins 0.000 description 1
- 238000002818 protein evolution Methods 0.000 description 1
- 230000017854 proteolysis Effects 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 230000008263 repair mechanism Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- 102200012576 rs111033648 Human genes 0.000 description 1
- 102200001270 rs121909081 Human genes 0.000 description 1
- 102200104802 rs13406336 Human genes 0.000 description 1
- 102200042573 rs17116471 Human genes 0.000 description 1
- 102220273513 rs373435521 Human genes 0.000 description 1
- 102200097407 rs6586239 Human genes 0.000 description 1
- 102220138225 rs759718991 Human genes 0.000 description 1
- 102220089709 rs869320709 Human genes 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 108010015840 seryl-prolyl-lysyl-lysine Proteins 0.000 description 1
- 235000020183 skimmed milk Nutrition 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 108091035705 tRNA adenine Proteins 0.000 description 1
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 238000011282 treatment Methods 0.000 description 1
- 239000012137 tryptone Substances 0.000 description 1
- 108010051110 tyrosyl-lysine Proteins 0.000 description 1
- 239000013598 vector Substances 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/78—Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/74—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
- C12N15/75—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora for Bacillus
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
- C12Y305/04004—Adenosine deaminase (3.5.4.4)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Zoology (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Biotechnology (AREA)
- Biochemistry (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Microbiology (AREA)
- Medicinal Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Enzymes And Modification Thereof (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
The invention relates to a gene editing fusion protein, an adenine base editor constructed from it, and applications thereof. The amino acid sequence of the gene editing fusion protein is one of the sequences shown in SEQ ID NO. 2-4. The adenine base editor constructed from this fusion protein further includes a crRNA array insertion region and converts adenine to guanine at multiple sites in the genome. The editor addresses two problems of existing adenine base editors: their limited operable range on the genome, and the low efficiency and complicated operation of multi-site base editing. It provides an adenine base editor with multiple editing sites, a large editing window, and high editing efficiency and editing activity.
Description
Technical field
The invention relates to the field of biotechnology, and in particular to a gene editing fusion protein, an adenine base editor constructed from it, and applications thereof.
Background
Gene editing is a technique that introduces sequence changes at specific sites in DNA to achieve gene knockout, insertion of exogenous DNA fragments, or DNA base mutation. The CRISPR/Cas system is widely used in gene editing. Three types of CRISPR/Cas system have been discovered so far: Type I, Type II and Type III. Of these, the Type II system is currently the most successfully engineered artificial nuclease, because a single Cas9 protein is sufficient to carry out the immune function of the CRISPR system. In this system, crRNA (CRISPR-derived RNA) binds tracrRNA (trans-activating crRNA) through base pairing to form a tracrRNA/crRNA complex, which guides the Cas9 nuclease to cleave double-stranded DNA at the target site whose sequence pairs with the crRNA. The process has three stages: (1) Acquisition of CRISPR spacers: the bacterium integrates a short DNA sequence from an invading phage or plasmid between repeat sequences in its own genome, namely between the two repeats at the 5′ end of the CRISPR locus, thereby acquiring highly variable spacer regions. (2) Expression of the CRISPR locus and maturation of crRNA: the CRISPR locus is first transcribed into precursor CRISPR RNA (pre-crRNA), which is then cleaved into mature crRNA under the action of Cas9 and a nuclease. (3) Cleavage of exogenous genetic material by the CRISPR/Cas system: mature tracrRNA, crRNA and Cas9 form a ribonucleoprotein complex; the crRNA recognizes and binds the matching exogenous DNA sequence, thereby mediating cleavage of the exogenous genetic material by the ribonucleoprotein complex.
To simplify the procedure and improve editing efficiency, crRNA and tracrRNA are often combined into a chimeric sgRNA (small guide RNA), so that gene editing requires expression of only the sgRNA and the Cas9 protein. Besides CRISPR/Cas9, the CRISPR/Cpf1 system is also frequently used for gene editing. Unlike CRISPR/Cas9, CRISPR/Cpf1 requires only a crRNA to function, and Cpf1 itself has RNase activity, so it can cleave and process an immature mRNA sequence that contains multiple crRNAs. A crRNA array containing multiple crRNAs can therefore be designed to produce several independently functioning crRNAs that guide Cpf1 to the corresponding targets in the genome, allowing multiple sites to be edited simultaneously.
Many species lack the NHEJ pathway (or have only weak NHEJ activity), while the HDR pathway requires a homologous template to function, and the two repair mechanisms also compete with each other. As a result, gene editing based on the above approach has high lethality and low editing efficiency. Inactivating the DNase activity of Cas9 yields nCas9, which cuts only one DNA strand, and dCas9, which cannot cut DNA. Guided by sgRNA, nCas9 and dCas9 can still bind specific sites in the genome without producing lethal double-strand breaks. TadA7.10, a mutant of the Escherichia coli tRNA adenosine deaminase TadA, can deaminate adenine (A) to inosine (I); at the DNA level, inosine is read and replicated as guanine (G), ultimately producing an A→G conversion. Therefore, the fusion protein obtained by fusing TadA7.10 to the N-terminus of nCas9 or dCas9 (nCas9-ABE or dCas9-ABE) can, guided by sgRNA, perform A→G conversion at specific sites in the genome; this is known as an adenine base editing system.
Chinese patent CN201811613264.9 discloses a CRISPR/Cas9-mediated adenine base editing system that achieves site-directed single-base substitution (A>G) and uses it to improve broad-spectrum resistance to rice blast. However, when this system is used for multi-site base editing, a separate sgRNA expression cassette must be constructed for each site, which not only complicates construction but also reduces the stability of the DNA sequence because elements such as promoters are used repeatedly. In addition, the site recognized by Cas9 must have an NGG PAM sequence (N = A, T, C or G), which limits the operable range of Cas9-based cytosine base editors on the genome. Chinese patent CN201811563073.6 discloses a plant base editing method involving a base editing fusion protein formed from a nuclease-inactivated CRISPR effector protein and a DNA-dependent adenine deaminase. The CRISPR effector protein is Cas9 nuclease or Cpf1 nuclease, and the DNA-dependent adenine deaminase is a variant of the Escherichia coli tRNA adenine deaminase TadA (ecTadA) that contains, relative to wild-type ecTadA, one or more sets of mutations selected from: 1) A106V and D108N; 2) D147Y and E155V; 3) L84F, H123Y and I156F; 4) A142N; 5) H36L, R51L, S146C and K157N; 6) P48S/T/A; 7) A142N; 8) W23L/R; 9) R152H/P. Chinese patent 201811578853.8 discloses a base editing system based on the Cpf1 protein, in which the base editing fusion protein comprises Cpf1 lacking DNA cleavage activity (D917A mutation) and a deaminase whose mutants are the same as above, and which can substitute one or more C to T or A to G in the target sequence. All of these systems use only the ABE7.9 (TadA7.9) and ABE7.10 (TadA7.10) mutants of ecTadA, which are currently in wide use, and the performance of the base editors still needs improvement. The present invention therefore aims to find an adenine base editor with multiple editing sites, a large editing window, and high editing efficiency and activity.
Summary of the invention
To solve the above technical problems, the present invention provides an adenine base editor based on CRISPR/Cpf1, which overcomes the limited operable range on the genome of existing A>G adenine base editors, as well as their low efficiency and complicated operation in multi-site base editing.
The first object of the present invention is to provide a gene editing fusion protein whose amino acid sequence is one of the sequences shown in SEQ ID NO. 2-4. SEQ ID NO. 2 is the fusion protein sequence containing TadA8.20, SEQ ID NO. 3 is the fusion protein sequence containing TadA8e, and SEQ ID NO. 4 is the fusion protein sequence containing TadA9.
The invention provides a gene editing fusion protein that recognizes TTV as the PAM, where V = A, C or G. When the fusion protein is used for gene editing, the editing window is at the 9th base of the crRNA, between the 7th and 9th bases, or between the 4th and 23rd bases, and adenine within the window is edited to guanine.
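To make the PAM and editing-window definitions concrete, the following is a minimal Python sketch and not part of the patent. It assumes that the protospacer is the 23 nt immediately 3′ of the TTV PAM on the scanned strand, that window positions are counted from the first protospacer base, and that only the given strand is scanned. The function name `find_ttv_sites` and the example sequence are hypothetical; the window assigned to each deaminase follows the results reported in Example 2.

```python
import re

# Editing windows (1-based, inclusive) reported in the source for each deaminase.
EDITING_WINDOWS = {"TadA8.20": (9, 9), "TadA8e": (7, 9), "TadA9": (4, 23)}

# Assumption: the protospacer is the 23 nt directly 3' of the PAM on the scanned strand.
# The lookahead lets overlapping candidate sites all be reported.
SITE_PATTERN = re.compile(r"(?=(TT[ACG])([ACGT]{23}))")


def find_ttv_sites(seq, window):
    """Return TTV PAM sites on one strand with the adenines that fall in the window."""
    lo, hi = window
    sites = []
    for match in SITE_PATTERN.finditer(seq.upper()):
        pam, protospacer = match.group(1), match.group(2)
        editable = [pos for pos, base in enumerate(protospacer, start=1)
                    if base == "A" and lo <= pos <= hi]
        sites.append({"pam_start": match.start(), "pam": pam,
                      "protospacer": protospacer, "editable_A": editable})
    return sites


# Hypothetical example sequence, not taken from the patent.
example = "GG" + "TTC" + "ACGT" * 5 + "ACG" + "CC"
for name, window in EDITING_WINDOWS.items():
    print(name, find_ttv_sites(example, window)[0]["editable_A"])
```

A real design tool would also scan the reverse complement and check the target for off-target matches; both are omitted here to keep the sketch short.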
The second object of the present invention is to provide an adenine base editor comprising the above gene editing fusion protein and a crRNA array insertion region. The crRNA inserted into the crRNA array insertion region matches dCpf1, the DNase-inactivated Cpf1 mutant, and guides it to edit specific targets.
Further, the nucleotide sequence of the crRNA array insertion region is shown in SEQ ID NO. 5. The region is flanked by two direct-repeat crRNA handle sequences, with two inverted Eco31I restriction sites inserted between them; by restriction digestion and ligation, the desired recognition sequence or crRNA array can be placed between the two handle sequences.
Further, expression of the gene editing fusion protein is controlled by the inducible promoter Pgrac100.
Further, expression of the crRNA array insertion region is controlled by the constitutive promoter Pveg.
Further, the adenine base editor is obtained by integrating a gene encoding one of the fusion proteins of SEQ ID NO. 2-4 into an expression vector and inserting the crRNA array insertion region into the expression vector.
Further, when the adenine base editor is used for gene editing, the expression vector is introduced into a eukaryote or a prokaryote.
Further, the adenine base editor comprises a plasmid containing the above adenine base editor and the above crRNA array insertion region.
Further, the nucleotide sequence of the plasmid is one of SEQ ID NO. 9-11. SEQ ID NO. 9 is the plasmid sequence containing TadA8.20, SEQ ID NO. 10 is the plasmid sequence containing TadA8e, and SEQ ID NO. 11 is the plasmid sequence containing TadA9.
The third object of the present invention is to provide the use of the above adenine base editor in gene editing.
Further, the adenine base editor is used to convert adenine to guanine at multiple sites in the Bacillus subtilis genome.
Further, the adenine base editor is used to obtain mutants or to inactivate extracellular proteases.
Further, the Bacillus subtilis is Bacillus subtilis 168.
With the above scheme, the present invention has at least the following advantages:
The invention designs and constructs a CRISPR/Cpf1-based adenine base editor (dCpf1-ABE). With a single crRNA array, base editing (A→G) can be performed at five sites in the genome simultaneously, producing a very rich set of mutant combinations and broadening the operable range of adenine base editors on the genome. Editing efficiency reaches 100% at some sites, with high editing activity and a large editing window.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
To make the content of the invention easier to understand, the invention is described in further detail below on the basis of specific embodiments and with reference to the accompanying drawings.
Figure 1 is a schematic diagram of the CRISPR/Cpf1-based adenine base editor;
Figure 2 is the plasmid map of the Bacillus subtilis adenine base editing system;
Figure 3 shows multi-site base editing mediated by the adenine base editor constructed with TadA9 as the adenosine deaminase;
Figure 4 shows the specific composition of the mutants produced after treatment with the multi-site adenine base editing system;
Figure 5 shows proteolysis after inactivation of extracellular proteases with the multi-site adenine base editing system.
Detailed description
The invention is further described below with reference to the accompanying drawings and specific examples, so that those skilled in the art can better understand and implement it. The examples given do not limit the invention.
Materials and reagents used in the experiments:
DNA polymerase was purchased from Takara; restriction endonucleases and T4 ligase from NEB; the plasmid extraction kit from Sangon Biotech (Shanghai) Co., Ltd.; and the PCR product purification kit from Thermo Scientific.
All cells were cultured in LB medium containing 10 g/L tryptone, 5 g/L yeast extract and 10 g/L NaCl. The final concentration of kanamycin in the medium was 50 μg/mL and that of IPTG was 1 mM.
Example 1: Design and construction of the CRISPR/Cpf1-based adenine base editor
As shown in Figure 1A, a CRISPR/Cpf1-based adenine editing system was constructed. It consists of the following two basic elements:
(1) A fusion protein, the adenine base editor dCpf1-ABE, composed of a mutant of the Escherichia coli tRNA adenosine deaminase TadA and dCpf1, the DNase-inactivated mutant of Cpf1 (D917A). With the help of dCpf1, the fusion protein binds a specific target in the genome, and the adenine deaminase converts the adenine (A) there to guanine (G);
Specifically, the TadA7.10 mutants TadA8.17 (TadA7.10 V82S, Q154R), TadA8.20 (TadA7.10 I76Y, V82S, Y123H, Y147R, Q154R), TadA8e (TadA7.10 A109S, T111R, D119N, H122N, Y147D, F149Y, T166I, D167N) and TadA9 (TadA7.10 V82S, A109S, T111R, D119N, H122N, Y147D, F149Y, Q154R, T166I, D167N) were each fused to the N-terminus of dCpf1 through a short peptide linker (SGGSSGGSSGSETPGTSESATPESSGGSSGGS); the resulting fusion amino acid sequences are shown in SEQ ID NO. 1-4, respectively. In addition, a crRNA array insertion region with the DNA sequence shown in SEQ ID NO. 5 was designed. The region is flanked by two direct-repeat crRNA handle sequences, with two inverted Eco31I restriction sites inserted between them; by restriction digestion and ligation, the desired recognition sequence or crRNA array can be placed between the two handle sequences.
(2) A crRNA that matches Cpf1. The crRNA contains a fixed handle sequence and a recognition sequence complementary to the target. Through the interaction between dCpf1 and the crRNA handle, dCpf1-ABE binds the crRNA to form a complex, which recognizes and targets a specific site in the genome by sequence complementarity and carries out base editing (A→G). Because dCpf1 retains RNase activity, multiple crRNAs can be expressed as an array; once the array has been cleaved and processed by dCpf1, dCpf1-ABE can be directed to multiple sites for base editing (A→G) (Figure 1B).
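The four deaminase variants named in item (1) are all defined as sets of substitutions relative to TadA7.10, so their relationships can be checked mechanically. The sketch below is illustrative only: the mutation lists are copied from the text, while the dictionary and helper names are hypothetical. It shows that, as listed, TadA9 carries exactly the TadA8e substitutions plus the two TadA8.17 substitutions.

```python
# Substitutions relative to TadA7.10, as listed in the text.
TADA_VARIANTS = {
    "TadA8.17": {"V82S", "Q154R"},
    "TadA8.20": {"I76Y", "V82S", "Y123H", "Y147R", "Q154R"},
    "TadA8e": {"A109S", "T111R", "D119N", "H122N",
               "Y147D", "F149Y", "T166I", "D167N"},
    "TadA9": {"V82S", "A109S", "T111R", "D119N", "H122N",
              "Y147D", "F149Y", "Q154R", "T166I", "D167N"},
}


def residue_number(mutation):
    """Extract the residue position from a label such as 'V82S'."""
    return int(mutation[1:-1])


def added_mutations(variant, reference):
    """Substitutions present in `variant` but absent from `reference`, by position."""
    extra = TADA_VARIANTS[variant] - TADA_VARIANTS[reference]
    return sorted(extra, key=residue_number)


print(added_mutations("TadA9", "TadA8e"))      # substitutions TadA9 adds to TadA8e
print(added_mutations("TadA8.20", "TadA8.17"))  # substitutions TadA8.20 adds to TadA8.17
print(TADA_VARIANTS["TadA9"]
      == TADA_VARIANTS["TadA8e"] | TADA_VARIANTS["TadA8.17"])
```

Keeping the variants as sets makes it straightforward to compare any pair when interpreting differences in editing behaviour between constructs.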
Example 2: Application of the CRISPR/Cpf1-based adenine base editor in multi-site base editing
The CRISPR/Cpf1-based adenine base editing system was validated and applied in Bacillus subtilis. As shown in Figure 2, the IPTG-inducible promoter Pgrac100 was used to express the different dCpf1-ABE constructs, and the crRNA array insertion region was placed downstream of the constitutive promoter Pveg to express the crRNA array. Both expression cassettes were placed on a plasmid carrying the temperature-sensitive replicon pE194 for base editing (A→G) in Bacillus subtilis.
Plasmid construction was carried out as follows. The plasmid backbone was derived from pJOE8999 (Altenbuchner, J., 2016. Editing of the Bacillus subtilis genome by the CRISPR-Cas9 system. Applied and Environmental Microbiology 82, 5421–5427). This plasmid confers kanamycin resistance (KanR) in both Escherichia coli and Bacillus subtilis; it carries the multi-copy replicon pBR322 for plasmid construction and maintenance in E. coli, and the temperature-sensitive replicon pE194 for B. subtilis (stable replication at 30 °C; the plasmid is eliminated at 50 °C). The Pgrac100 promoter, its repressor gene lacI and the dCpf1 protein were all taken from plasmid pLCg6-dCpf1 (Wu, Y., Liu, Y., Lv, X., Li, J., Du, G., Liu, L., 2020. CAMERS-B: CRISPR/Cpf1 assisted multiple-genes editing and regulation system for Bacillus subtilis. Biotechnology and Bioengineering 117, 1817–1825). The crRNA array insertion region and the Pveg promoter were taken from plasmid pcra2 (Wu et al., 2020, cited above). The adenine deaminase mutants were all obtained by gene synthesis. These fragments were amplified by PCR, joined with a seamless cloning kit (Beyotime D7010M), and transformed into E. coli for sequencing and storage. The sequences of the plasmids containing the TadA7.10 mutants TadA8.17 (TadA7.10 V82S, Q154R), TadA8.20 (TadA7.10 I76Y, V82S, Y123H, Y147R, Q154R), TadA8e (TadA7.10 A109S, T111R, D119N, H122N, Y147D, F149Y, T166I, D167N) and TadA9 (TadA7.10 V82S, A109S, T111R, D119N, H122N, Y147D, F149Y, Q154R, T166I, D167N) are shown in SEQ ID NO. 8-11, respectively. The plasmids of SEQ ID NO. 9-11 are available at https://benchling.com/s/seq-8cOWQiJn0nzb9DwUWoQo?m=slm-JrgZjL6OUp4Dod4GvNOX, https://benchling.com/s/seq-7MMzFTpA45YTqhfwW7KV?m=slm-PWCavDFpL7MmrQywyplb and https://benchling.com/s/seq-jtNb6AEo2Q5lPsP42jhH?m=slm-HJP8s8AVmOcAoKe0UnJv.
Five sites in the Bacillus subtilis aprE and nprE genes were chosen as targets for multi-site base editing. The five crRNAs shown in Table 1 were designed and assembled into the crRNA array insertion region by the SOMACA (Synthetic Oligos Mediated Assembly of crRNA Array) method (Wu, Y., Liu, Y., Lv, X., Li, J., Du, G., Liu, L., 2020. CAMERS-B: CRISPR/Cpf1 assisted multiple-genes editing and regulation system for Bacillus subtilis. Biotechnology and Bioengineering 117, 1817–1825), giving an editing plasmid capable of base editing (A→G) at all five sites.
Specifically, to express a single crRNA, the crRNA insert is obtained directly by annealing a pair of primers with an overlapping region (primer concentration 10 μM; 10 μL each of the forward and reverse primers in a 20 μL reaction; conditions: 98 °C for 2 min, then cooling to 4 °C at 0.1 °C/s and holding). The annealed product is diluted 10-fold and 1 μL is ligated to the Eco31I-digested vector. To build a crRNA array from multiple crRNAs, several pairs of primers with overlapping regions are first subjected to PCR (primer concentration 10 μM; each 20 μL reaction contains 10 μL DNA polymerase and 5 μL each of the forward and reverse primers; standard PCR program, 5 s extension, 10 cycles). After 10-fold dilution this gives double-stranded DNA fragments with Eco31I sites at both ends; 1 μL of each fragment is used for Golden Gate assembly with the plasmid. After the product is transformed into E. coli, colony PCR is performed with one primer on the plasmid backbone and one on the crRNA, and single colonies that give a band are sequenced to identify positive clones containing the desired crRNA.
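The reaction set-ups above imply a few derived quantities that are easy to check. The short Python sketch below is an illustration rather than part of the protocol; the helper names are hypothetical, and it simply applies C1·V1 = C2·V2 and the stated cooling rate to the volumes and temperatures given in the text.

```python
def final_concentration_um(stock_um, added_ul, total_ul):
    """Final concentration after dilution, from C1*V1 = C2*V2."""
    return stock_um * added_ul / total_ul


def ramp_seconds(start_c, end_c, rate_c_per_s):
    """Time needed to cool from start_c to end_c at a constant rate."""
    return (start_c - end_c) / rate_c_per_s


# Annealing: 10 uL of each 10 uM primer in a 20 uL reaction.
anneal_primer_um = final_concentration_um(10, 10, 20)

# Array PCR: 5 uL of each 10 uM primer in a 20 uL reaction.
pcr_primer_um = final_concentration_um(10, 5, 20)

# Cooling from 98 C to 4 C at 0.1 C/s.
ramp_minutes = ramp_seconds(98, 4, 0.1) / 60

print(f"annealing: {anneal_primer_um} uM per primer")
print(f"array PCR: {pcr_primer_um} uM per primer")
print(f"cooling ramp: about {ramp_minutes:.1f} min")
```

Under these numbers each primer is at 5 μM during annealing and 2.5 μM during the array PCR, and the cooling ramp takes a little under 16 minutes.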
Table 1. crRNAs used to guide multi-site base editing (A→G)
Multi-site base editing (A→G) and analysis were carried out as follows. First, the editing plasmid was transformed into Bacillus subtilis 168, and the cells were spread on plates containing kanamycin and incubated at 30 °C for 12 h. Then a single colony was picked at random, inoculated into a 14 mL culture tube containing 2 mL LB medium (with kanamycin), and cultured with shaking at 30 °C for 12 h. Next, 5 μL of this culture was transferred to a 14 mL culture tube containing 2 mL LB medium (with kanamycin and IPTG) and cultured with shaking at 30 °C for 12 h to induce expression of the base editing system and carry out multi-site base editing (A→G). Finally, 1 μL of the induced culture was added to a 50 μL PCR reaction to amplify the aprE and nprE genes for Sanger sequencing. The primers used to amplify aprE and nprE are listed in Table 2. The purified DNA fragments were Sanger sequenced with the primers listed in Table 2, and the sequencing results were analyzed with the BEAT software (https://hanlab.cc/beat/). Because the plasmid is temperature-sensitive, it can be eliminated after editing by streaking the culture onto an antibiotic-free LB plate and incubating overnight at 50 °C.
Table 2. Primer sequences
Table 3 shows multi-site base editing mediated by adenine base editors (dCpf1-ABE) containing different TadA adenine deaminase mutants. Fusing TadA9 to dCpf1 gave the highest base editing efficiency: A was converted to G at all five sites, the editing window spanned the 4th to 23rd bases of the crRNA, and editing efficiency reached 100% at some sites. With TadA8e, editing efficiency was slightly lower than with TadA9: A-to-G conversion occurred only at sites 2, 3 and 4, and the editing window was narrower, spanning the 7th to 9th bases of the crRNA. With TadA8.20, efficiency was lower than with TadA9 and TadA8e: A-to-G conversion occurred only at sites 2 and 3, and the window was narrower still, with editing only at the 9th base of the crRNA. With TadA8.17, no base editing was achieved at any of the five sites.
Table 3. Multi-site base editing (A→G) mediated by the adenine base editors and its efficiency
Example 3: Effect of induction time on multi-site base editing efficiency
Because the adenine base editor containing TadA9 (a TadA7.10 mutant) was the most efficient and could edit five sites simultaneously, the effect of IPTG induction time on its base editing efficiency was examined.
As shown in Table 4, editing efficiency at every site increased somewhat with longer induction, but only to a limited extent, and the editing window did not change as induction time was extended. These results indicate that 12 h of induction is sufficient for fairly complete editing. The sequencing results after 36 h of induction are shown in Figure 3.
Table 4. Effect of induction time on multi-site base editing efficiency
Example 4: Analysis of the mutants generated by the multi-site adenine base editing system
To analyze the specific composition of the mutants obtained after adenine base editing, a culture carrying the adenine base editor with TadA9 (a TadA7.10 mutant) was induced for 36 h, streaked onto an antibiotic-free LB plate and incubated overnight at 37 °C. Eight single colonies were then picked, and the aprE and nprE loci of each were amplified and sequenced.
As shown in Figure 4, designing and constructing a single specific crRNA array is enough to generate, at the same time, a rich collection of mutants with different combinations of mutations, which is valuable for both protein evolution and strain engineering. We also observed mutations that were not detected by sequencing the mixed template (such as A10G at site 2, and A11, A12G and A15G at site 3). This suggests that some positions are edited at very low efficiency, below what the method in Example 2 can detect, and indicates that the base editor has an even larger editing window. Conversely, because only a few colonies were picked, some of the low-frequency mutations listed in Table 4 may not have appeared among these mutants.
Example 5: Application of the multi-site adenine base editor to inactivation of extracellular proteases
Bacillus subtilis has six major proteases that hydrolyze proteins secreted outside the cell, which is highly detrimental to efficient secretory expression of a target protein. We therefore used the six crRNAs shown in Table 5 to mutate (A→G) the A on the strand complementary to the six start codons (ATG, TTG or GTG) shown in bold in the table. The T in each start codon is thereby changed to C, so that expression can no longer be initiated and the protease is inactivated. A mutant strain in which all six proteases were inactivated was selected by sequencing, and its extracellular protease activity was tested on skim milk plates. Unlike the wild-type strain, the mutant no longer produced a hydrolysis zone, showing that extracellular protease activity had been eliminated (Figure 5).
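The strand logic of this inactivation strategy can be traced with a short Python sketch. It is an illustration under simplifying assumptions, not the patent's procedure: it looks at the codon in isolation, enumerates every combination of A→G edits on the complementary strand, and ignores which adenines actually fall inside the editing window of a given crRNA. The function names are hypothetical.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")
START_CODONS = {"ATG", "TTG", "GTG"}  # the start codons named in the text


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def edited_start_codons(codon):
    """Coding-strand codons produced by A->G edits on the complementary strand.

    Every non-empty combination of adenines on the complementary strand is
    considered, so the result lists all possible outcomes, not observed ones.
    """
    bottom = reverse_complement(codon)
    a_positions = [i for i, base in enumerate(bottom) if base == "A"]
    outcomes = set()
    for count in range(1, len(a_positions) + 1):
        for chosen in combinations(a_positions, count):
            edited = list(bottom)
            for i in chosen:
                edited[i] = "G"  # deamination of A is read as G
            outcomes.add(reverse_complement("".join(edited)))
    return sorted(outcomes)


for codon in sorted(START_CODONS):
    print(codon, "->", edited_start_codons(codon))
```

In each case a T on the coding strand becomes C, consistent with the text, and none of the resulting codons is among the three start codons listed above.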
Table 5. crRNAs used to guide inactivation of extracellular proteases
Comparative Example 1: Base editing mediated by other TadA adenine deaminase mutants
The first TadA adenine deaminase mutant used in CRISPR/Cas9-based adenine base editing systems was TadA7.10, and one wild-type TadA and one TadA7.10 mutant are often fused together to the N-terminus of nCas9. Chinese patent CN201811563073.6 and Chinese patent 201811578853.8 also used this mutant to construct adenine base editing systems and applied them in rice. We therefore also tried fusing TadA and TadA7.10 to the N-terminus of dCpf1 to construct the CRISPR/Cpf1-based adenine base editing system shown in SEQ ID NO. 6.
However, testing showed that, like TadA8.17, TadA7.10 fused to dCpf1 had no detectable base editing (A→G) activity. Either dCpf1 is not compatible with these TadA mutants and cannot complete base editing (A→G), or the mutation efficiency of these adenine base editors is too low to be detected by the analysis method in Example 2.
对比例2不同DNA酶失活的Cpf1突变体介导的碱基编辑情况Comparative Example 2 Base editing mediated by Cpf1 mutants with different DNase inactivations
Besides the commonly used dCpf1 (D917A), the double-site mutant ddCpf1 (D917A, E1006A) likewise has only its DNase activity inactivated while retaining its RNase activity. We therefore constructed the adenine base editor shown in Sequence 7, composed of TadA9 and ddCpf1, and verified it using the five crRNAs from Example 2.
As shown in Table 6, the adenine base editor composed of TadA9 and ddCpf1 could also mutate all five sites simultaneously, but with lower mutation efficiency than the dCpf1-based editor.
Table 6. Base editing mediated by different DNase-inactivated Cpf1 mutants
Comparative Example 3: Effect of the linkage method on the constructed base editing system
As described in Example 1, the adenine deaminase mutants in the present invention are each connected to dCpf1 through a short peptide linker (SGGSSGGSSGSETPGTSESATPESSGGSSGGS). To verify the effect of this linkage on the base editing system, we removed the short peptide between TadA9 and dCpf1 so that the two proteins were joined directly. However, when tested with the five crRNAs from Example 3, the fusion protein without the linker peptide no longer had base editing activity, indicating that the linker peptide is necessary for the base editor to function properly.
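As a quick consistency check, the linker can be located in the three-letter sequence listing by converting residues to one-letter codes and searching for the linker string. The sketch below is illustrative only: the helper names are hypothetical, and the fragment is residues 161 to 208 of Sequence 1 as printed in the sequence listing.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Linker peptide quoted in Comparative Example 3.
LINKER = "SGGSSGGSSGSETPGTSESATPESSGGSSGGS"

# Residues 161-208 of Sequence 1, copied from the sequence listing.
FRAGMENT_START = 161
FRAGMENT_3 = """
Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
Gly Gly Ser Ser Gly Gly Ser Ser Ile Tyr Gln Glu Phe Val Asn Lys
"""


def to_one_letter(three_letter: str) -> str:
    """Convert whitespace-separated three-letter residues to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


def locate_linker(fragment_three: str, first_residue: int, linker: str = LINKER):
    """Return the 1-based (start, end) residue numbers of the linker, or None."""
    idx = to_one_letter(fragment_three).find(linker)
    if idx < 0:
        return None
    start = first_residue + idx
    return start, start + len(linker) - 1


print(locate_linker(FRAGMENT_3, FRAGMENT_START))  # (168, 199)
```

For a direct fusion with the linker removed, as tested in this comparative example, the same search would return `None`.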
Obviously, the above embodiments are merely examples given for clarity of description and do not limit the implementations. Those of ordinary skill in the art may make other changes or modifications in different forms on the basis of the above description. It is neither necessary nor possible to list all implementations exhaustively here. Obvious changes or modifications derived therefrom remain within the scope of protection of the present invention.
Sequence listing
<110> Jiangnan University
<120> A gene editing fusion protein, an adenine base editor constructed from it, and applications thereof
<160> 11
<170> SIPOSequenceListing 1.0
<210> 1
<211> 1509
<212> PRT
<213> (artificial sequence)
<400> 1
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Ser Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly
100 105 110
Ala Ala Gly Ser Leu Met Asp Val Leu His Tyr Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Tyr Phe Phe Arg Met Pro Arg Arg Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Ser Ile Tyr Gln Glu Phe Val Asn Lys
195 200 205
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys
210 215 220
Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys
225 230 235 240
Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile Asp Lys Tyr His
245 250 255
Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser Val Cys Ile Ser Glu Asp
260 265 270
Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp
275 280 285
Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys
290 295 300
Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu
305 310 315 320
Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu
325 330 335
Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys
340 345 350
Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys
355 360 365
Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe His Glu Asn Arg
370 375 380
Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg
385 390 395 400
Ile Val Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr
405 410 415
Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile
420 425 430
Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr
435 440 445
Ser Glu Val Asn Gln Arg Val Phe Ser Leu Asp Glu Val Phe Glu Ile
450 455 460
Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn
465 470 475 480
Thr Ile Ile Gly Gly Lys Phe Val Asn Gly Glu Asn Thr Lys Arg Lys
485 490 495
Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys
500 505 510
Thr Leu Lys Lys Tyr Lys Met Ser Val Leu Phe Lys Gln Ile Leu Ser
515 520 525
Asp Thr Glu Ser Lys Ser Phe Val Ile Asp Lys Leu Glu Asp Asp Ser
530 535 540
Asp Val Val Thr Thr Met Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe
545 550 555 560
Lys Thr Val Glu Glu Lys Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe
565 570 575
Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys
580 585 590
Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr
595 600 605
Ser Val Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala
610 615 620
Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln Glu Leu Ile Ala
625 630 635 640
Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu Thr Ile Lys Leu
645 650 655
Ala Leu Glu Glu Phe Asn Lys His Arg Asp Ile Asp Lys Gln Cys Arg
660 665 670
Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro Met Ile Phe Asp
675 680 685
Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile Ser Ile Lys Tyr
690 695 700
Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser Ala Glu Asp Asp
705 710 715 720
Val Lys Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn Asn Leu Leu His
725 730 735
Lys Leu Lys Ile Phe His Ile Ser Gln Ser Glu Asp Lys Ala Asn Ile
740 745 750
Leu Asp Lys Asp Glu His Phe Tyr Leu Val Phe Glu Glu Cys Tyr Phe
755 760 765
Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile Arg Asn Tyr Ile
770 775 780
Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu Asn Phe Glu Asn
785 790 795 800
Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu Pro Asp Asn Thr
805 810 815
Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu Gly Val Met Asn
820 825 830
Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile Lys Glu Asn Lys
835 840 845
Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu Pro Gly Ala Asn
850 855 860
Lys Met Leu Pro Lys Val Phe Phe Ser Ala Lys Ser Ile Lys Phe Tyr
865 870 875 880
Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg Asn His Ser Thr His Thr
885 890 895
Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe Glu Phe Asn Ile
900 905 910
Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln Ser Ile Ser Lys
915 920 925
His Pro Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser Asp Thr Gln Arg
930 935 940
Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu Asn Gln Gly Tyr
945 950 955 960
Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile Asp Ser Val Val
965 970 975
Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser
980 985 990
Ala Tyr Ser Lys Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala
995 1000 1005
Leu Phe Asp Glu Arg Asn Leu Gln Asp Val Val Tyr Lys Leu Asn Gly
1010 1015 1020
Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro Lys Lys Ile Thr
1025 1030 1035 1040
His Pro Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys Asp Asn Pro Lys
1045 1050 1055
Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp Lys Arg Phe Thr
1060 1065 1070
Glu Asp Lys Phe Phe Phe His Cys Pro Ile Thr Ile Asn Phe Lys Ser
1075 1080 1085
Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu Leu Leu Lys Glu
1090 1095 1100
Lys Ala Asn Asp Val His Ile Leu Ser Ile Ala Arg Gly Glu Arg His
1105 1110 1115 1120
Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn Ile Ile Lys Gln
1125 1130 1135
Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg Met Lys Thr Asn Tyr His
1140 1145 1150
Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg Asp Ser Ala Arg Lys Asp
1155 1160 1165
Trp Lys Lys Ile Asn Asn Ile Lys Glu Met Lys Glu Gly Tyr Leu Ser
1170 1175 1180
Gln Val Val His Glu Ile Ala Lys Leu Val Ile Glu Tyr Asn Ala Ile
1185 1190 1195 1200
Val Val Phe Glu Asp Leu Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys
1205 1210 1215
Val Glu Lys Gln Val Tyr Gln Lys Leu Glu Lys Met Leu Ile Glu Lys
1220 1225 1230
Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly
1235 1240 1245
Val Leu Arg Ala Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys
1250 1255 1260
Met Gly Lys Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Phe Thr
1265 1270 1275 1280
Ser Lys Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys
1285 1290 1295
Tyr Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1300 1305 1310
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp Tyr
1315 1320 1325
Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile Ala Ser
1330 1335 1340
Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys Asn His Asn
1345 1350 1355 1360
Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu Glu Lys Leu Leu
1365 1370 1375
Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu Cys Ile Lys Ala Ala
1380 1385 1390
Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe Ala Lys Leu Thr Ser Val
1395 1400 1405
Leu Asn Thr Ile Leu Gln Met Arg Asn Ser Lys Thr Gly Thr Glu Leu
1410 1415 1420
Asp Tyr Leu Ile Ser Pro Val Ala Asp Val Asn Gly Asn Phe Phe Asp
1425 1430 1435 1440
Ser Arg Gln Ala Pro Lys Asn Met Pro Gln Asp Ala Asp Ala Asn Gly
1445 1450 1455
Ala Tyr His Ile Gly Leu Lys Gly Leu Met Leu Leu Gly Arg Ile Lys
1460 1465 1470
Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu
1475 1480 1485
Tyr Phe Glu Phe Val Gln Asn Arg Asn Asn Ser Gly Gly Ser Pro Lys
1490 1495 1500
Lys Lys Arg Lys Val
1505
<210> 2
<211> 1509
<212> PRT
<213> (artificial sequence)
<400> 2
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Tyr Asp Ala Thr Leu
65 70 75 80
Tyr Ser Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ala Lys Thr Gly
100 105 110
Ala Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Arg Phe Phe Arg Met Pro Arg Arg Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Ser Ile Tyr Gln Glu Phe Val Asn Lys
195 200 205
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys
210 215 220
Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys
225 230 235 240
Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile Asp Lys Tyr His
245 250 255
Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser Val Cys Ile Ser Glu Asp
260 265 270
Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp
275 280 285
Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys
290 295 300
Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu
305 310 315 320
Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu
325 330 335
Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys
340 345 350
Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys
355 360 365
Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe His Glu Asn Arg
370 375 380
Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg
385 390 395 400
Ile Val Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr
405 410 415
Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile
420 425 430
Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr
435 440 445
Ser Glu Val Asn Gln Arg Val Phe Ser Leu Asp Glu Val Phe Glu Ile
450 455 460
Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn
465 470 475 480
Thr Ile Ile Gly Gly Lys Phe Val Asn Gly Glu Asn Thr Lys Arg Lys
485 490 495
Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys
500 505 510
Thr Leu Lys Lys Tyr Lys Met Ser Val Leu Phe Lys Gln Ile Leu Ser
515 520 525
Asp Thr Glu Ser Lys Ser Phe Val Ile Asp Lys Leu Glu Asp Asp Ser
530 535 540
Asp Val Val Thr Thr Met Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe
545 550 555 560
Lys Thr Val Glu Glu Lys Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe
565 570 575
Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys
580 585 590
Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr
595 600 605
Ser Val Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala
610 615 620
Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln Glu Leu Ile Ala
625 630 635 640
Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu Thr Ile Lys Leu
645 650 655
Ala Leu Glu Glu Phe Asn Lys His Arg Asp Ile Asp Lys Gln Cys Arg
660 665 670
Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro Met Ile Phe Asp
675 680 685
Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile Ser Ile Lys Tyr
690 695 700
Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser Ala Glu Asp Asp
705 710 715 720
Val Lys Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn Asn Leu Leu His
725 730 735
Lys Leu Lys Ile Phe His Ile Ser Gln Ser Glu Asp Lys Ala Asn Ile
740 745 750
Leu Asp Lys Asp Glu His Phe Tyr Leu Val Phe Glu Glu Cys Tyr Phe
755 760 765
Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile Arg Asn Tyr Ile
770 775 780
Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu Asn Phe Glu Asn
785 790 795 800
Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu Pro Asp Asn Thr
805 810 815
Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu Gly Val Met Asn
820 825 830
Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile Lys Glu Asn Lys
835 840 845
Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu Pro Gly Ala Asn
850 855 860
Lys Met Leu Pro Lys Val Phe Phe Ser Ala Lys Ser Ile Lys Phe Tyr
865 870 875 880
Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg Asn His Ser Thr His Thr
885 890 895
Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe Glu Phe Asn Ile
900 905 910
Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln Ser Ile Ser Lys
915 920 925
His Pro Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser Asp Thr Gln Arg
930 935 940
Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu Asn Gln Gly Tyr
945 950 955 960
Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile Asp Ser Val Val
965 970 975
Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser
980 985 990
Ala Tyr Ser Lys Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala
995 1000 1005
Leu Phe Asp Glu Arg Asn Leu Gln Asp Val Val Tyr Lys Leu Asn Gly
1010 1015 1020
Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro Lys Lys Ile Thr
1025 1030 1035 1040
His Pro Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys Asp Asn Pro Lys
1045 1050 1055
Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp Lys Arg Phe Thr
1060 1065 1070
Glu Asp Lys Phe Phe Phe His Cys Pro Ile Thr Ile Asn Phe Lys Ser
1075 1080 1085
Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu Leu Leu Lys Glu
1090 1095 1100
Lys Ala Asn Asp Val His Ile Leu Ser Ile Ala Arg Gly Glu Arg His
1105 1110 1115 1120
Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn Ile Ile Lys Gln
1125 1130 1135
Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg Met Lys Thr Asn Tyr His
1140 1145 1150
Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg Asp Ser Ala Arg Lys Asp
1155 1160 1165
Trp Lys Lys Ile Asn Asn Ile Lys Glu Met Lys Glu Gly Tyr Leu Ser
1170 1175 1180
Gln Val Val His Glu Ile Ala Lys Leu Val Ile Glu Tyr Asn Ala Ile
1185 1190 1195 1200
Val Val Phe Glu Asp Leu Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys
1205 1210 1215
Val Glu Lys Gln Val Tyr Gln Lys Leu Glu Lys Met Leu Ile Glu Lys
1220 1225 1230
Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly
1235 1240 1245
Val Leu Arg Ala Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys
1250 1255 1260
Met Gly Lys Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Phe Thr
1265 1270 1275 1280
Ser Lys Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys
1285 1290 1295
Tyr Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1300 1305 1310
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp Tyr
1315 1320 1325
Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile Ala Ser
1330 1335 1340
Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys Asn His Asn
1345 1350 1355 1360
Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu Glu Lys Leu Leu
1365 1370 1375
Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu Cys Ile Lys Ala Ala
1380 1385 1390
Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe Ala Lys Leu Thr Ser Val
1395 1400 1405
Leu Asn Thr Ile Leu Gln Met Arg Asn Ser Lys Thr Gly Thr Glu Leu
1410 1415 1420
Asp Tyr Leu Ile Ser Pro Val Ala Asp Val Asn Gly Asn Phe Phe Asp
1425 1430 1435 1440
Ser Arg Gln Ala Pro Lys Asn Met Pro Gln Asp Ala Asp Ala Asn Gly
1445 1450 1455
Ala Tyr His Ile Gly Leu Lys Gly Leu Met Leu Leu Gly Arg Ile Lys
1460 1465 1470
Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu
1475 1480 1485
Tyr Phe Glu Phe Val Gln Asn Arg Asn Asn Ser Gly Gly Ser Pro Lys
1490 1495 1500
Lys Lys Arg Lys Val
1505
<210> 3<210> 3
<211> 1509<211> 1509
<212> PRT<212> PRT
<213> (人工序列)<213> (artificial sequence)
<400> 3<400> 3
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys Arg Gly
100 105 110
Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Asp Phe Tyr Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Ile Asn Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Ser Ile Tyr Gln Glu Phe Val Asn Lys
195 200 205
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys
210 215 220
Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys
225 230 235 240
Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile Asp Lys Tyr His
245 250 255
Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser Val Cys Ile Ser Glu Asp
260 265 270
Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp
275 280 285
Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys
290 295 300
Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu
305 310 315 320
Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu
325 330 335
Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys
340 345 350
Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys
355 360 365
Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe His Glu Asn Arg
370 375 380
Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg
385 390 395 400
Ile Val Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr
405 410 415
Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile
420 425 430
Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr
435 440 445
Ser Glu Val Asn Gln Arg Val Phe Ser Leu Asp Glu Val Phe Glu Ile
450 455 460
Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn
465 470 475 480
Thr Ile Ile Gly Gly Lys Phe Val Asn Gly Glu Asn Thr Lys Arg Lys
485 490 495
Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys
500 505 510
Thr Leu Lys Lys Tyr Lys Met Ser Val Leu Phe Lys Gln Ile Leu Ser
515 520 525
Asp Thr Glu Ser Lys Ser Phe Val Ile Asp Lys Leu Glu Asp Asp Ser
530 535 540
Asp Val Val Thr Thr Met Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe
545 550 555 560
Lys Thr Val Glu Glu Lys Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe
565 570 575
Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys
580 585 590
Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr
595 600 605
Ser Val Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala
610 615 620
Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln Glu Leu Ile Ala
625 630 635 640
Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu Thr Ile Lys Leu
645 650 655
Ala Leu Glu Glu Phe Asn Lys His Arg Asp Ile Asp Lys Gln Cys Arg
660 665 670
Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro Met Ile Phe Asp
675 680 685
Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile Ser Ile Lys Tyr
690 695 700
Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser Ala Glu Asp Asp
705 710 715 720
Val Lys Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn Asn Leu Leu His
725 730 735
Lys Leu Lys Ile Phe His Ile Ser Gln Ser Glu Asp Lys Ala Asn Ile
740 745 750
Leu Asp Lys Asp Glu His Phe Tyr Leu Val Phe Glu Glu Cys Tyr Phe
755 760 765
Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile Arg Asn Tyr Ile
770 775 780
Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu Asn Phe Glu Asn
785 790 795 800
Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu Pro Asp Asn Thr
805 810 815
Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu Gly Val Met Asn
820 825 830
Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile Lys Glu Asn Lys
835 840 845
Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu Pro Gly Ala Asn
850 855 860
Lys Met Leu Pro Lys Val Phe Phe Ser Ala Lys Ser Ile Lys Phe Tyr
865 870 875 880
Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg Asn His Ser Thr His Thr
885 890 895
Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe Glu Phe Asn Ile
900 905 910
Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln Ser Ile Ser Lys
915 920 925
His Pro Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser Asp Thr Gln Arg
930 935 940
Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu Asn Gln Gly Tyr
945 950 955 960
Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile Asp Ser Val Val
965 970 975
Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser
980 985 990
Ala Tyr Ser Lys Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala
995 1000 1005
Leu Phe Asp Glu Arg Asn Leu Gln Asp Val Val Tyr Lys Leu Asn Gly
1010 1015 1020
Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro Lys Lys Ile Thr
1025 1030 1035 1040
His Pro Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys Asp Asn Pro Lys
1045 1050 1055
Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp Lys Arg Phe Thr
1060 1065 1070
Glu Asp Lys Phe Phe Phe His Cys Pro Ile Thr Ile Asn Phe Lys Ser
1075 1080 1085
Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu Leu Leu Lys Glu
1090 1095 1100
Lys Ala Asn Asp Val His Ile Leu Ser Ile Ala Arg Gly Glu Arg His
1105 1110 1115 1120
Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn Ile Ile Lys Gln
1125 1130 1135
Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg Met Lys Thr Asn Tyr His
1140 1145 1150
Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg Asp Ser Ala Arg Lys Asp
1155 1160 1165
Trp Lys Lys Ile Asn Asn Ile Lys Glu Met Lys Glu Gly Tyr Leu Ser
1170 1175 1180
Gln Val Val His Glu Ile Ala Lys Leu Val Ile Glu Tyr Asn Ala Ile
1185 1190 1195 1200
Val Val Phe Glu Asp Leu Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys
1205 1210 1215
Val Glu Lys Gln Val Tyr Gln Lys Leu Glu Lys Met Leu Ile Glu Lys
1220 1225 1230
Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly
1235 1240 1245
Val Leu Arg Ala Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys
1250 1255 1260
Met Gly Lys Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Phe Thr
1265 1270 1275 1280
Ser Lys Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys
1285 1290 1295
Tyr Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1300 1305 1310
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp Tyr
1315 1320 1325
Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile Ala Ser
1330 1335 1340
Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys Asn His Asn
1345 1350 1355 1360
Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu Glu Lys Leu Leu
1365 1370 1375
Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu Cys Ile Lys Ala Ala
1380 1385 1390
Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe Ala Lys Leu Thr Ser Val
1395 1400 1405
Leu Asn Thr Ile Leu Gln Met Arg Asn Ser Lys Thr Gly Thr Glu Leu
1410 1415 1420
Asp Tyr Leu Ile Ser Pro Val Ala Asp Val Asn Gly Asn Phe Phe Asp
1425 1430 1435 1440
Ser Arg Gln Ala Pro Lys Asn Met Pro Gln Asp Ala Asp Ala Asn Gly
1445 1450 1455
Ala Tyr His Ile Gly Leu Lys Gly Leu Met Leu Leu Gly Arg Ile Lys
1460 1465 1470
Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu
1475 1480 1485
Tyr Phe Glu Phe Val Gln Asn Arg Asn Asn Ser Gly Gly Ser Pro Lys
1490 1495 1500
Lys Lys Arg Lys Val
1505
<210> 4
<211> 1509
<212> PRT
<213> (artificial sequence)
<400> 4
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Ser Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys Arg Gly
100 105 110
Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Asp Phe Tyr Arg Met Pro Arg Arg Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Ile Asn Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Ser Ile Tyr Gln Glu Phe Val Asn Lys
195 200 205
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys
210 215 220
Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys
225 230 235 240
Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile Asp Lys Tyr His
245 250 255
Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser Val Cys Ile Ser Glu Asp
260 265 270
Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp
275 280 285
Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys
290 295 300
Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu
305 310 315 320
Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu
325 330 335
Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys
340 345 350
Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys
355 360 365
Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe His Glu Asn Arg
370 375 380
Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg
385 390 395 400
Ile Val Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr
405 410 415
Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile
420 425 430
Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr
435 440 445
Ser Glu Val Asn Gln Arg Val Phe Ser Leu Asp Glu Val Phe Glu Ile
450 455 460
Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn
465 470 475 480
Thr Ile Ile Gly Gly Lys Phe Val Asn Gly Glu Asn Thr Lys Arg Lys
485 490 495
Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys
500 505 510
Thr Leu Lys Lys Tyr Lys Met Ser Val Leu Phe Lys Gln Ile Leu Ser
515 520 525
Asp Thr Glu Ser Lys Ser Phe Val Ile Asp Lys Leu Glu Asp Asp Ser
530 535 540
Asp Val Val Thr Thr Met Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe
545 550 555 560
Lys Thr Val Glu Glu Lys Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe
565 570 575
Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys
580 585 590
Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr
595 600 605
Ser Val Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala
610 615 620
Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln Glu Leu Ile Ala
625 630 635 640
Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu Thr Ile Lys Leu
645 650 655
Ala Leu Glu Glu Phe Asn Lys His Arg Asp Ile Asp Lys Gln Cys Arg
660 665 670
Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro Met Ile Phe Asp
675 680 685
Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile Ser Ile Lys Tyr
690 695 700
Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser Ala Glu Asp Asp
705 710 715 720
Val Lys Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn Asn Leu Leu His
725 730 735
Lys Leu Lys Ile Phe His Ile Ser Gln Ser Glu Asp Lys Ala Asn Ile
740 745 750
Leu Asp Lys Asp Glu His Phe Tyr Leu Val Phe Glu Glu Cys Tyr Phe
755 760 765
Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile Arg Asn Tyr Ile
770 775 780
Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu Asn Phe Glu Asn
785 790 795 800
Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu Pro Asp Asn Thr
805 810 815
Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu Gly Val Met Asn
820 825 830
Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile Lys Glu Asn Lys
835 840 845
Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu Pro Gly Ala Asn
850 855 860
Lys Met Leu Pro Lys Val Phe Phe Ser Ala Lys Ser Ile Lys Phe Tyr
865 870 875 880
Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg Asn His Ser Thr His Thr
885 890 895
Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe Glu Phe Asn Ile
900 905 910
Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln Ser Ile Ser Lys
915 920 925
His Pro Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser Asp Thr Gln Arg
930 935 940
Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu Asn Gln Gly Tyr
945 950 955 960
Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile Asp Ser Val Val
965 970 975
Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser
980 985 990
Ala Tyr Ser Lys Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala
995 1000 1005
Leu Phe Asp Glu Arg Asn Leu Gln Asp Val Val Tyr Lys Leu Asn Gly
1010 1015 1020
Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro Lys Lys Ile Thr
1025 1030 1035 1040
His Pro Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys Asp Asn Pro Lys
1045 1050 1055
Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp Lys Arg Phe Thr
1060 1065 1070
Glu Asp Lys Phe Phe Phe His Cys Pro Ile Thr Ile Asn Phe Lys Ser
1075 1080 1085
Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu Leu Leu Lys Glu
1090 1095 1100
Lys Ala Asn Asp Val His Ile Leu Ser Ile Ala Arg Gly Glu Arg His
1105 1110 1115 1120
Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn Ile Ile Lys Gln
1125 1130 1135
Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg Met Lys Thr Asn Tyr His
1140 1145 1150
Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg Asp Ser Ala Arg Lys Asp
1155 1160 1165
Trp Lys Lys Ile Asn Asn Ile Lys Glu Met Lys Glu Gly Tyr Leu Ser
1170 1175 1180
Gln Val Val His Glu Ile Ala Lys Leu Val Ile Glu Tyr Asn Ala Ile
1185 1190 1195 1200
Val Val Phe Glu Asp Leu Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys
1205 1210 1215
Val Glu Lys Gln Val Tyr Gln Lys Leu Glu Lys Met Leu Ile Glu Lys
1220 1225 1230
Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly
1235 1240 1245
Val Leu Arg Ala Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys
1250 1255 1260
Met Gly Lys Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Phe Thr
1265 1270 1275 1280
Ser Lys Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys
1285 1290 1295
Tyr Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1300 1305 1310
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp Tyr
1315 1320 1325
Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile Ala Ser
1330 1335 1340
Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys Asn His Asn
1345 1350 1355 1360
Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu Glu Lys Leu Leu
1365 1370 1375
Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu Cys Ile Lys Ala Ala
1380 1385 1390
Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe Ala Lys Leu Thr Ser Val
1395 1400 1405
Leu Asn Thr Ile Leu Gln Met Arg Asn Ser Lys Thr Gly Thr Glu Leu
1410 1415 1420
Asp Tyr Leu Ile Ser Pro Val Ala Asp Val Asn Gly Asn Phe Phe Asp
1425 1430 1435 1440
Ser Arg Gln Ala Pro Lys Asn Met Pro Gln Asp Ala Asp Ala Asn Gly
1445 1450 1455
Ala Tyr His Ile Gly Leu Lys Gly Leu Met Leu Leu Gly Arg Ile Lys
1460 1465 1470
Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu
1475 1480 1485
Tyr Phe Glu Phe Val Gln Asn Arg Asn Asn Ser Gly Gly Ser Pro Lys
1490 1495 1500
Lys Lys Arg Lys Val
1505
<210> 5
<211> 82
<212> DNA
<213> (artificial sequence)
<400> 5
gtctaagaac tttaaataat ttctactgtt gtagatagag accgtgaagt taataaggtc 60
tcaaatttct actgttgtag at 82
<210> 6
<211> 1707
<212> PRT
<213> (artificial sequence)
<400> 6
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Trp Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val His Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Pro
35 40 45
Ile Gly Arg His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Val Thr Leu Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Ala Arg Asp Ala Lys Thr Gly
100 105 110
Ala Ala Gly Ser Leu Met Asp Val Leu His His Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Ser Asp Phe Phe Arg Met Arg Arg Gln Glu Ile Lys Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Ser Glu Val Glu Phe Ser His Glu Tyr
195 200 205
Trp Met Arg His Ala Leu Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg
210 215 220
Glu Val Pro Val Gly Ala Val Leu Val Leu Asn Asn Arg Val Ile Gly
225 230 235 240
Glu Gly Trp Asn Arg Ala Ile Gly Leu His Asp Pro Thr Ala His Ala
245 250 255
Glu Ile Met Ala Leu Arg Gln Gly Gly Leu Val Met Gln Asn Tyr Arg
260 265 270
Leu Ile Asp Ala Thr Leu Tyr Val Thr Phe Glu Pro Cys Val Met Cys
275 280 285
Ala Gly Ala Met Ile His Ser Arg Ile Gly Arg Val Val Phe Gly Val
290 295 300
Arg Asn Ala Lys Thr Gly Ala Ala Gly Ser Leu Met Asp Val Leu His
305 310 315 320
Tyr Pro Gly Met Asn His Arg Val Glu Ile Thr Glu Gly Ile Leu Ala
325 330 335
Asp Glu Cys Ala Ala Leu Leu Cys Tyr Phe Phe Arg Met Pro Arg Gln
340 345 350
Val Phe Asn Ala Gln Lys Lys Ala Gln Ser Ser Thr Asp Ser Gly Gly
355 360 365
Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser
370 375 380
Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Ser Ile TyrAla Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Ser Ile Tyr
385 390 395 400385 390 395 400
Gln Glu Phe Val Asn Lys Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu
405 410 415
Leu Ile Pro Gln Gly Lys Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu
420 425 430
Ile Leu Asp Asp Glu Lys Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln
435 440 445
Ile Ile Asp Lys Tyr His Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser
450 455 460
Val Cys Ile Ser Glu Asp Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe
465 470 475 480
Lys Leu Lys Lys Ser Asp Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser
485 490 495
Ala Lys Asp Thr Ile Lys Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser
500 505 510
Glu Lys Phe Lys Asn Leu Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys
515 520 525
Gly Gln Glu Ser Asp Leu Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn
530 535 540
Gly Ile Glu Leu Phe Lys Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu
545 550 555 560
Ala Leu Glu Ile Ile Lys Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys
565 570 575
Gly Phe His Glu Asn Arg Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro
580 585 590
Thr Ser Ile Ile Tyr Arg Ile Val Asp Asp Asn Leu Pro Lys Phe Leu
595 600 605
Glu Asn Lys Ala Lys Tyr Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala
610 615 620
Ile Asn Tyr Glu Gln Ile Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe
625 630 635 640
Asp Ile Asp Tyr Lys Thr Ser Glu Val Asn Gln Arg Val Phe Ser Leu
645 650 655
Asp Glu Val Phe Glu Ile Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser
660 665 670
Gly Ile Thr Lys Phe Asn Thr Ile Ile Gly Gly Lys Phe Val Asn Gly
675 680 685
Glu Asn Thr Lys Arg Lys Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser
690 695 700
Gln Gln Ile Asn Asp Lys Thr Leu Lys Lys Tyr Lys Met Ser Val Leu
705 710 715 720
Phe Lys Gln Ile Leu Ser Asp Thr Glu Ser Lys Ser Phe Val Ile Asp
725 730 735
Lys Leu Glu Asp Asp Ser Asp Val Val Thr Thr Met Gln Ser Phe Tyr
740 745 750
Glu Gln Ile Ala Ala Phe Lys Thr Val Glu Glu Lys Ser Ile Lys Glu
755 760 765
Thr Leu Ser Leu Leu Phe Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu
770 775 780
Ser Lys Ile Tyr Phe Lys Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln
785 790 795 800
Gln Val Phe Asp Asp Tyr Ser Val Ile Gly Thr Ala Val Leu Glu Tyr
805 810 815
Ile Thr Gln Gln Ile Ala Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys
820 825 830
Glu Gln Glu Leu Ile Ala Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser
835 840 845
Leu Glu Thr Ile Lys Leu Ala Leu Glu Glu Phe Asn Lys His Arg Asp
850 855 860
Ile Asp Lys Gln Cys Arg Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala
865 870 875 880
Ile Pro Met Ile Phe Asp Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala
885 890 895
Gln Ile Ser Ile Lys Tyr Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln
900 905 910
Ala Ser Ala Glu Asp Asp Val Lys Ala Ile Lys Asp Leu Leu Asp Gln
915 920 925
Thr Asn Asn Leu Leu His Lys Leu Lys Ile Phe His Ile Ser Gln Ser
930 935 940
Glu Asp Lys Ala Asn Ile Leu Asp Lys Asp Glu His Phe Tyr Leu Val
945 950 955 960
Phe Glu Glu Cys Tyr Phe Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn
965 970 975
Lys Ile Arg Asn Tyr Ile Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe
980 985 990
Lys Leu Asn Phe Glu Asn Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn
995 1000 1005
Lys Glu Pro Asp Asn Thr Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr
1010 1015 1020
Tyr Leu Gly Val Met Asn Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys
1025 1030 1035 1040
Ala Ile Lys Glu Asn Lys Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys
1045 1050 1055
Leu Leu Pro Gly Ala Asn Lys Met Leu Pro Lys Val Phe Phe Ser Ala
1060 1065 1070
Lys Ser Ile Lys Phe Tyr Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg
1075 1080 1085
Asn His Ser Thr His Thr Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu
1090 1095 1100
Lys Phe Glu Phe Asn Ile Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr
1105 1110 1115 1120
Lys Gln Ser Ile Ser Lys His Pro Glu Trp Lys Asp Phe Gly Phe Arg
1125 1130 1135
Phe Ser Asp Thr Gln Arg Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu
1140 1145 1150
Val Glu Asn Gln Gly Tyr Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser
1155 1160 1165
Tyr Ile Asp Ser Val Val Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile
1170 1175 1180
Tyr Asn Lys Asp Phe Ser Ala Tyr Ser Lys Gly Arg Pro Asn Leu His
1185 1190 1195 1200
Thr Leu Tyr Trp Lys Ala Leu Phe Asp Glu Arg Asn Leu Gln Asp Val
1205 1210 1215
Val Tyr Lys Leu Asn Gly Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser
1220 1225 1230
Ile Pro Lys Lys Ile Thr His Pro Ala Lys Glu Ala Ile Ala Asn Lys
1235 1240 1245
Asn Lys Asp Asn Pro Lys Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile
1250 1255 1260
Lys Asp Lys Arg Phe Thr Glu Asp Lys Phe Phe Phe His Cys Pro Ile
1265 1270 1275 1280
Thr Ile Asn Phe Lys Ser Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile
1285 1290 1295
Asn Leu Leu Leu Lys Glu Lys Ala Asn Asp Val His Ile Leu Ser Ile
1300 1305 1310
Ala Arg Gly Glu Arg His Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys
1315 1320 1325
Gly Asn Ile Ile Lys Gln Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg
1330 1335 1340
Met Lys Thr Asn Tyr His Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg
1345 1350 1355 1360
Asp Ser Ala Arg Lys Asp Trp Lys Lys Ile Asn Asn Ile Lys Glu Met
1365 1370 1375
Lys Glu Gly Tyr Leu Ser Gln Val Val His Glu Ile Ala Lys Leu Val
1380 1385 1390
Ile Glu Tyr Asn Ala Ile Val Val Phe Glu Asp Leu Asn Phe Gly Phe
1395 1400 1405
Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Val Tyr Gln Lys Leu Glu
1410 1415 1420
Lys Met Leu Ile Glu Lys Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu
1425 1430 1435 1440
Phe Asp Lys Thr Gly Gly Val Leu Arg Ala Tyr Gln Leu Thr Ala Pro
1445 1450 1455
Phe Glu Thr Phe Lys Lys Met Gly Lys Gln Thr Gly Ile Ile Tyr Tyr
1460 1465 1470
Val Pro Ala Gly Phe Thr Ser Lys Ile Cys Pro Val Thr Gly Phe Val
1475 1480 1485
Asn Gln Leu Tyr Pro Lys Tyr Glu Ser Val Ser Lys Ser Gln Glu Phe
1490 1495 1500
Phe Ser Lys Phe Asp Lys Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe
1505 1510 1515 1520
Glu Phe Ser Phe Asp Tyr Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly
1525 1530 1535
Lys Trp Thr Ile Ala Ser Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn
1540 1545 1550
Ser Asp Lys Asn His Asn Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys
1555 1560 1565
Glu Leu Glu Lys Leu Leu Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly
1570 1575 1580
Glu Cys Ile Lys Ala Ala Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe
1585 1590 1595 1600
Ala Lys Leu Thr Ser Val Leu Asn Thr Ile Leu Gln Met Arg Asn Ser
1605 1610 1615
Lys Thr Gly Thr Glu Leu Asp Tyr Leu Ile Ser Pro Val Ala Asp Val
1620 1625 1630
Asn Gly Asn Phe Phe Asp Ser Arg Gln Ala Pro Lys Asn Met Pro Gln
1635 1640 1645
Asp Ala Asp Ala Asn Gly Ala Tyr His Ile Gly Leu Lys Gly Leu Met
1650 1655 1660
Leu Leu Gly Arg Ile Lys Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu
1665 1670 1675 1680
Val Ile Lys Asn Glu Glu Tyr Phe Glu Phe Val Gln Asn Arg Asn Asn
1685 1690 1695
Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1700 1705
<210> 7
<211> 1509
<212> PRT
<213> Artificial Sequence
<400> 7
Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu
1 5 10 15
Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala
20 25 30
Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala
35 40 45
Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg
50 55 60
Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu
65 70 75 80
Tyr Ser Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His
85 90 95
Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys Arg Gly
100 105 110
Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met Asn His
115 120 125
Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu
130 135 140
Leu Cys Asp Phe Tyr Arg Met Pro Arg Arg Val Phe Asn Ala Gln Lys
145 150 155 160
Lys Ala Gln Ser Ser Ile Asn Ser Gly Gly Ser Ser Gly Gly Ser Ser
165 170 175
Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser
180 185 190
Gly Gly Ser Ser Gly Gly Ser Ser Ile Tyr Gln Glu Phe Val Asn Lys
195 200 205
Tyr Ser Leu Ser Lys Thr Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys
210 215 220
Thr Leu Glu Asn Ile Lys Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys
225 230 235 240
Arg Ala Lys Asp Tyr Lys Lys Ala Lys Gln Ile Ile Asp Lys Tyr His
245 250 255
Gln Phe Phe Ile Glu Glu Ile Leu Ser Ser Val Cys Ile Ser Glu Asp
260 265 270
Leu Leu Gln Asn Tyr Ser Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp
275 280 285
Asp Asp Asn Leu Gln Lys Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys
290 295 300
Lys Gln Ile Ser Glu Tyr Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu
305 310 315 320
Phe Asn Gln Asn Leu Ile Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu
325 330 335
Ile Leu Trp Leu Lys Gln Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys
340 345 350
Ala Asn Ser Asp Ile Thr Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys
355 360 365
Ser Phe Lys Gly Trp Thr Thr Tyr Phe Lys Gly Phe His Glu Asn Arg
370 375 380
Lys Asn Val Tyr Ser Ser Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg
385 390 395 400
Ile Val Asp Asp Asn Leu Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr
405 410 415
Glu Ser Leu Lys Asp Lys Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile
420 425 430
Lys Lys Asp Leu Ala Glu Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr
435 440 445
Ser Glu Val Asn Gln Arg Val Phe Ser Leu Asp Glu Val Phe Glu Ile
450 455 460
Ala Asn Phe Asn Asn Tyr Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn
465 470 475 480
Thr Ile Ile Gly Gly Lys Phe Val Asn Gly Glu Asn Thr Lys Arg Lys
485 490 495
Gly Ile Asn Glu Tyr Ile Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys
500 505 510
Thr Leu Lys Lys Tyr Lys Met Ser Val Leu Phe Lys Gln Ile Leu Ser
515 520 525
Asp Thr Glu Ser Lys Ser Phe Val Ile Asp Lys Leu Glu Asp Asp Ser
530 535 540
Asp Val Val Thr Thr Met Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe
545 550 555 560
Lys Thr Val Glu Glu Lys Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe
565 570 575
Asp Asp Leu Lys Ala Gln Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys
580 585 590
Asn Asp Lys Ser Leu Thr Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr
595 600 605
Ser Val Ile Gly Thr Ala Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala
610 615 620
Pro Lys Asn Leu Asp Asn Pro Ser Lys Lys Glu Gln Glu Leu Ile Ala
625 630 635 640
Lys Lys Thr Glu Lys Ala Lys Tyr Leu Ser Leu Glu Thr Ile Lys Leu
645 650 655
Ala Leu Glu Glu Phe Asn Lys His Arg Asp Ile Asp Lys Gln Cys Arg
660 665 670
Phe Glu Glu Ile Leu Ala Asn Phe Ala Ala Ile Pro Met Ile Phe Asp
675 680 685
Glu Ile Ala Gln Asn Lys Asp Asn Leu Ala Gln Ile Ser Ile Lys Tyr
690 695 700
Gln Asn Gln Gly Lys Lys Asp Leu Leu Gln Ala Ser Ala Glu Asp Asp
705 710 715 720
Val Lys Ala Ile Lys Asp Leu Leu Asp Gln Thr Asn Asn Leu Leu His
725 730 735
Lys Leu Lys Ile Phe His Ile Ser Gln Ser Glu Asp Lys Ala Asn Ile
740 745 750
Leu Asp Lys Asp Glu His Phe Tyr Leu Val Phe Glu Glu Cys Tyr Phe
755 760 765
Glu Leu Ala Asn Ile Val Pro Leu Tyr Asn Lys Ile Arg Asn Tyr Ile
770 775 780
Thr Gln Lys Pro Tyr Ser Asp Glu Lys Phe Lys Leu Asn Phe Glu Asn
785 790 795 800
Ser Thr Leu Ala Asn Gly Trp Asp Lys Asn Lys Glu Pro Asp Asn Thr
805 810 815
Ala Ile Leu Phe Ile Lys Asp Asp Lys Tyr Tyr Leu Gly Val Met Asn
820 825 830
Lys Lys Asn Asn Lys Ile Phe Asp Asp Lys Ala Ile Lys Glu Asn Lys
835 840 845
Gly Glu Gly Tyr Lys Lys Ile Val Tyr Lys Leu Leu Pro Gly Ala Asn
850 855 860
Lys Met Leu Pro Lys Val Phe Phe Ser Ala Lys Ser Ile Lys Phe Tyr
865 870 875 880
Asn Pro Ser Glu Asp Ile Leu Arg Ile Arg Asn His Ser Thr His Thr
885 890 895
Lys Asn Gly Ser Pro Gln Lys Gly Tyr Glu Lys Phe Glu Phe Asn Ile
900 905 910
Glu Asp Cys Arg Lys Phe Ile Asp Phe Tyr Lys Gln Ser Ile Ser Lys
915 920 925
His Pro Glu Trp Lys Asp Phe Gly Phe Arg Phe Ser Asp Thr Gln Arg
930 935 940
Tyr Asn Ser Ile Asp Glu Phe Tyr Arg Glu Val Glu Asn Gln Gly Tyr
945 950 955 960
Lys Leu Thr Phe Glu Asn Ile Ser Glu Ser Tyr Ile Asp Ser Val Val
965 970 975
Asn Gln Gly Lys Leu Tyr Leu Phe Gln Ile Tyr Asn Lys Asp Phe Ser
980 985 990
Ala Tyr Ser Lys Gly Arg Pro Asn Leu His Thr Leu Tyr Trp Lys Ala
995 1000 1005
Leu Phe Asp Glu Arg Asn Leu Gln Asp Val Val Tyr Lys Leu Asn Gly
1010 1015 1020
Glu Ala Glu Leu Phe Tyr Arg Lys Gln Ser Ile Pro Lys Lys Ile Thr
1025 1030 1035 1040
His Pro Ala Lys Glu Ala Ile Ala Asn Lys Asn Lys Asp Asn Pro Lys
1045 1050 1055
Lys Glu Ser Val Phe Glu Tyr Asp Leu Ile Lys Asp Lys Arg Phe Thr
1060 1065 1070
Glu Asp Lys Phe Phe Phe His Cys Pro Ile Thr Ile Asn Phe Lys Ser
1075 1080 1085
Ser Gly Ala Asn Lys Phe Asn Asp Glu Ile Asn Leu Leu Leu Lys Glu
1090 1095 1100
Lys Ala Asn Asp Val His Ile Leu Ser Ile Ala Arg Gly Glu Arg His
1105 1110 1115 1120
Leu Ala Tyr Tyr Thr Leu Val Asp Gly Lys Gly Asn Ile Ile Lys Gln
1125 1130 1135
Asp Thr Phe Asn Ile Ile Gly Asn Asp Arg Met Lys Thr Asn Tyr His
1140 1145 1150
Asp Lys Leu Ala Ala Ile Glu Lys Asp Arg Asp Ser Ala Arg Lys Asp
1155 1160 1165
Trp Lys Lys Ile Asn Asn Ile Lys Glu Met Lys Glu Gly Tyr Leu Ser
1170 1175 1180
Gln Val Val His Glu Ile Ala Lys Leu Val Ile Glu Tyr Asn Ala Ile
1185 1190 1195 1200
Val Val Phe Ala Asp Leu Asn Phe Gly Phe Lys Arg Gly Arg Phe Lys
1205 1210 1215
Val Glu Lys Gln Val Tyr Gln Lys Leu Glu Lys Met Leu Ile Glu Lys
1220 1225 1230
Leu Asn Tyr Leu Val Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly
1235 1240 1245
Val Leu Arg Ala Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys
1250 1255 1260
Met Gly Lys Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Phe Thr
1265 1270 1275 1280
Ser Lys Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys
1285 1290 1295
Tyr Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1300 1305 1310
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp Tyr
1315 1320 1325
Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile Ala Ser
1330 1335 1340
Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys Asn His Asn
1345 1350 1355 1360
Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu Glu Lys Leu Leu
1365 1370 1375
Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu Cys Ile Lys Ala Ala
1380 1385 1390
Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe Ala Lys Leu Thr Ser Val
1395 1400 1405
Leu Asn Thr Ile Leu Gln Met Arg Asn Ser Lys Thr Gly Thr Glu Leu
1410 1415 1420
Asp Tyr Leu Ile Ser Pro Val Ala Asp Val Asn Gly Asn Phe Phe Asp
1425 1430 1435 1440
Ser Arg Gln Ala Pro Lys Asn Met Pro Gln Asp Ala Asp Ala Asn Gly
1445 1450 1455
Ala Tyr His Ile Gly Leu Lys Gly Leu Met Leu Leu Gly Arg Ile Lys
1460 1465 1470
Asn Asn Gln Glu Gly Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu
1475 1480 1485
Tyr Phe Glu Phe Val Gln Asn Arg Asn Asn Ser Gly Gly Ser Pro Lys
1490 1495 1500
Lys Lys Arg Lys Val
1505
<210> 8
<211> 9445
<212> DNA
<213> Artificial Sequence
<400> 8
atgtcatgac attggtgtac agaaatggcg cagcaatggc aagaacgtcc cgggcggagc 60atgtcatgac attggtgtac agaaatggcg cagcaatggc aagaacgtcc cgggcggagc 60
tcaggcctta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 120tcaggcctta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 120
tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 180tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 180
ggcgccaggg tggtttttct tttcaccagt gagacgggca acagctgatt gcccttcacc 240ggcgccaggg tggtttttct tttcaccagt gagacgggca acagctgatt gcccttcacc 240
gcctggccct gagagagttg cagcaagcgg tccacgctgg tttgccccag caggcgaaaa 300gcctggccct gagagagttg cagcaagcgg tccacgctgg tttgccccag caggcgaaaa 300
tcctgtttga tggtggttaa cggcgggata taacatgagc tgtcttcggt atcgtcgtat 360tcctgtttga tggtggttaa cggcgggata taacatgagc tgtcttcggt atcgtcgtat 360
cccactaccg agatatccgc accaacgcgc agcccggact cggtaatggc gcgcattgcg 420cccactaccg agatatccgc accaacgcgc agcccggact cggtaatggc gcgcattgcg 420
cccagcgcca tctgatcgtt ggcaaccagc atcgcagtgg gaacgatgcc ctcattcagc 480cccagcgcca tctgatcgtt ggcaaccagc atcgcagtgg gaacgatgcc ctcattcagc 480
atttgcatgg tttgttgaaa accggacatg gcactccagt cgccttcccg ttccgctatc 540atttgcatgg tttgttgaaa accggacatg gcactccagt cgccttcccg ttccgctatc 540
ggctgaattt gattgcgagt gagatattta tgccagccag ccagacgcag acgcgccgag 600
acagaactta atgggcccgc taacagcgcg atttgctggt gacccaatgc gaccagatgc 660
tccacgccca gtcgcgtacc gtcttcatgg gagaaaataa tactgttgat gggtgtctgg 720
tcagagacat caagaaataa cgccggaaca ttagtgcagg cagcttccac agcaatggca 780
tcctggtcat ccagcggata gttaatgatc agcccactga cgcgttgcgc gagaagattg 840
tgcaccgccg ttttacaggc ttcgacgccg cttcgttcta ccatcgacac caccacgctg 900
gcacccagtt gatcggcgcg agatttaatc gccgcgacaa tttgcgacgg cgcgtgcagg 960
gccagactgg aggtggcaac gccaatcagc aacgactgtt tgcccgccag ttgttgtgcc 1020
acgcggttgg gaatgtaatt cagctccgcc atcgccgctt ccactttttc ccgcgttttc 1080
gcagaaacgt ggctggcctg gttcaccacg cgggaaacgg tctgataaga gacaccggca 1140
tactctgcga catcgtataa cgttactggt ttcatcaaaa tcgtctccct ccgtttgaat 1200
atttgattga tcgtaaccag atgaagcact ctttccacta tccctacagt gttatggctt 1260
gaacaatcac gaaacaataa ttggtacgta cgatctttca gccgactcaa acatcaaatc 1320
ttacaaatgt agtctttgaa agtattacat atgtaagatt taaatgcaac cgttttttcg 1380
gaaggaaatg atgacctcgt ttccaccgga attagcttgg taccagctat tgtaacataa 1440
tcggtacggg ggtgaaaaag ctaacggaaa agggagcgga aaagaatgat gtaagcgtga 1500
aaaatttttt aaaaaatctc ttgacattgg aagggagata tgttattata agaattgcgg 1560
aattgtgagc ggataacaat tcataattgt gagcggataa caattcaacc ccaaaggagg 1620
tgggatccat gagcgaagtt gaattcagcc acgagtactg gatgcgccat gcccttacac 1680
ttgccaaacg cgctcgcgat gaacgcgaag tcccagttgg cgccgttctt gttcttaaca 1740
accgcgtcat cggagaaggc tggaaccgcg ctatcggact tcatgatccg acagctcatg 1800
ccgaaattat ggctctgcgc caaggcggac ttgtcatgca gaattacaga cttattgatg 1860
ccacgcttta ctcaacgttc gaaccgtgcg tcatgtgcgc cggcgccatg attcatagcc 1920
gcatcggcag agttgtcttc ggcgtccgca atgctaaaac gggcgccgcc ggctccctta 1980
tggacgtcct tcattacccg ggcatgaatc accgcgttga aatcacagaa ggcattctgg 2040
ccgatgagtg cgccgctctg ctgtgctact tcttcagaat gccgagaaga gtcttcaacg 2100
cccagaagaa agcccaaagc agcacagact ctggaggatc atccggaggc agctctggaa 2160
gtgaaacacc aggaacaagc gaatcagcta caccagagtc ctctggaggc tcatctggag 2220
gaagctcaat ttatcaagaa tttgttaata aatatagttt aagtaaaact ctaagatttg 2280
agttaatccc acagggtaaa acacttgaaa acataaaagc aagaggtttg attttagatg 2340
atgagaaaag agctaaagac tacaaaaagg ctaaacaaat aattgataaa tatcatcagt 2400
tttttataga ggagatatta agttcggttt gtattagcga agatttatta caaaactatt 2460
ctgatgttta ttttaaactt aaaaagagtg atgatgataa tctacaaaaa gattttaaaa 2520
gtgcaaaaga tacgataaag aaacaaatat ctgaatatat aaaggactca gagaaattta 2580
agaatttgtt taatcaaaac cttatcgatg ctaaaaaagg gcaagagtca gatttaattc 2640
tatggctaaa gcaatctaag gataatggta tagaactatt taaagccaat agtgatatca 2700
cagatataga tgaggcgtta gaaataatca aatcttttaa aggttggaca acttatttta 2760
agggttttca tgaaaataga aaaaatgttt atagtagcaa tgatattcct acatctatta 2820
tttataggat agtagatgat aatttgccta aatttctaga aaataaagct aagtatgaga 2880
gtttaaaaga caaagctcca gaagctataa actatgaaca aattaaaaaa gatttggcag 2940
aagagctaac ctttgatatt gactacaaaa catctgaagt taatcaaaga gttttttcac 3000
ttgatgaagt ttttgagata gcaaacttta ataattatct aaatcaaagt ggtattacta 3060
aatttaatac tattattggt ggtaaatttg taaatggtga aaatacaaag agaaaaggta 3120
taaatgaata tataaatcta tactcacagc aaataaatga taaaacactc aaaaaatata 3180
aaatgagtgt tttatttaag caaattttaa gtgatacaga atctaaatct tttgtaattg 3240
ataagttaga agatgatagt gatgtagtta caacgatgca aagtttttat gagcaaatag 3300
cagcttttaa aacagtagaa gaaaaatcta ttaaagaaac actatcttta ttatttgatg 3360
atttaaaagc tcaaaaactt gatttgagta aaatttattt taaaaatgat aaatctctta 3420
ctgatctatc acaacaagtt tttgatgatt atagtgttat tggtacagcg gtactagaat 3480
atataactca acaaatagca cctaaaaatc ttgataaccc tagtaagaaa gagcaagaat 3540
taatagccaa aaaaactgaa aaagcaaaat acttatctct agaaactata aagcttgcct 3600
tagaagaatt taataagcat agagatatag ataaacagtg taggtttgaa gaaatacttg 3660
caaactttgc ggctattccg atgatatttg atgaaatagc tcaaaacaaa gacaatttgg 3720
cacagatatc tatcaaatat caaaatcaag gtaaaaaaga cctacttcaa gctagtgcgg 3780
aagatgatgt taaagctatc aaggatcttt tagatcaaac taataatctc ttacataaac 3840
taaaaatatt tcatattagt cagtcagaag ataaggcaaa tattttagac aaggatgagc 3900
atttttatct agtatttgag gagtgctact ttgagctagc gaatatagtg cctctttata 3960
acaaaattag aaactatata actcaaaagc catatagtga tgagaaattt aagctcaatt 4020
ttgagaactc gactttggct aatggttggg ataaaaataa agagcctgac aatacggcaa 4080
ttttatttat caaagatgat aaatattatc tgggtgtgat gaataagaaa aataacaaaa 4140
tatttgatga taaagctatc aaagaaaata aaggcgaggg ttataaaaaa attgtttata 4200
aacttttacc tggcgcaaat aaaatgttac ctaaggtttt cttttctgct aaatctataa 4260
aattttataa tcctagtgaa gatatactta gaataagaaa tcattccaca catacaaaaa 4320
atggtagtcc tcaaaaagga tatgaaaaat ttgagtttaa tattgaagat tgccgaaaat 4380
ttatagattt ttataaacag tctataagta agcatccgga gtggaaagat tttggattta 4440
gattttctga tactcaaaga tataattcta tagatgaatt ttatagagaa gttgaaaatc 4500
aaggctacaa actaactttt gaaaatatat cagagagcta tattgatagc gtagttaatc 4560
agggtaaatt gtacctattc caaatctata ataaagattt ttcagcttat agcaaagggc 4620
gaccaaatct acatacttta tattggaaag cgctgtttga tgagagaaat cttcaagatg 4680
tggtttataa gctaaatggt gaggcagagc ttttttatcg taaacaatca atacctaaaa 4740
aaatcactca cccagctaaa gaggcaatag ctaataaaaa caaagataat cctaaaaaag 4800
agagtgtttt tgaatatgat ttaatcaaag ataaacgctt tactgaagat aagtttttct 4860
ttcactgtcc tattacaatc aattttaaat ctagtggagc taataagttt aatgatgaaa 4920
tcaatttatt gctaaaagaa aaagcaaatg atgttcatat attaagtata gctagaggtg 4980
aaagacattt agcttactat actttggtag atggtaaagg caatatcatc aaacaagata 5040
ctttcaacat cattggtaat gatagaatga aaacaaacta ccatgataag cttgctgcaa 5100
tagagaaaga tagggattca gctaggaaag actggaaaaa gataaataac atcaaagaga 5160
tgaaagaggg ctatctatct caggtagttc atgaaatagc taagctagtt atagagtata 5220
atgctattgt ggtttttgag gatttaaatt ttggatttaa aagagggcgt ttcaaggtag 5280
agaagcaggt ctatcaaaag ttagaaaaaa tgctaattga gaaactaaac tatctagttt 5340
tcaaagataa tgagtttgat aaaactgggg gagtgcttag agcttatcag ctaacagcac 5400
cttttgagac ttttaaaaag atgggtaaac aaacaggtat tatctactat gtaccagctg 5460
gttttacttc aaaaatttgt cctgtaactg gttttgtaaa tcagttatat cctaagtatg 5520
aaagtgtcag caaatctcaa gagttcttta gtaagtttga caagatttgt tataaccttg 5580
ataagggcta ttttgagttt agttttgatt ataaaaactt tggtgacaag gctgccaaag 5640
gcaagtggac tatagctagc tttgggagta gattgattaa ctttagaaat tcagataaaa 5700
atcataattg ggatactcga gaagtttatc caactaaaga gttggagaaa ttgctaaaag 5760
attattctat cgaatatggg catggcgaat gtatcaaagc agctatttgc ggtgagagcg 5820
acaaaaagtt ttttgctaag ctaactagtg tcctaaatac tatcttacaa atgcgtaact 5880
caaaaacagg tactgagtta gattatctaa tttcaccagt agcagatgta aatggcaatt 5940
tctttgattc gcgacaggcg ccaaaaaata tgcctcaaga tgctgatgcc aatggtgctt 6000
atcatattgg gctaaaaggt ctgatgctac taggtaggat caaaaataat caagagggca 6060
aaaaactcaa tttggttatc aaaaatgaag agtattttga gttcgtgcag aataggaata 6120
actctggtgg ttctcccaag aagaagagga aagtctaact gcagtataat cagaaacagc 6180
ccgcggatgt tgatctgcgg gctgtttttt attgatcgaa tggccatgac caaaatccct 6240
taacgtgaag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 6300
ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 6360
agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 6420
cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag gccaccactt 6480
caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 6540
tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 6600
ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 6660
ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg 6720
gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 6780
gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 6840
tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 6900
cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt tctttcctgc 6960
gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg agacaggtca 7020
ttcagactgg ctaatgcacc cagtaaggca gcggtatcat caactcaaaa tggtatgcgt 7080
tttgacacat ccactatata tccgtgtcgt tctgtccact cctgaatccc attccagaaa 7140
ttctctagcg attccagaag tttctcagag tcggaaagtt gaccagacat tacgaactgg 7200
cacagatggt cataacctga aggaagatct gattgcttaa ctgcttcagt taagaccgaa 7260
gcgctcgtcg tataacagat gcgatgatgc agaccaatca acatggcacc tgccattgct 7320
acctgcacag tcaaggatgg tagaaatgtt gtcggtcctt gcacacgaat attacgccat 7380
ttgcctgcat attcaaacag ctcttctacg ataagggcac aaatcgcatc gtggaacgtt 7440
tgggcttcta ccgatttagc agtttgatac actttctcta agtatccacc tgaatcataa 7500
atcggcaaaa tagagaaaaa ttgaccatgt gtaagcggcc aatctgattc cacctgagat 7560
gcataatcta gtagaatctc ttcgctatca aaattcactt ccaccttcca ctcaccggtt 7620
gtccattcat ggctgaactc tgcttcctct gttgacatga cacacatcat ctcaatatcc 7680
gaatagggcc catcagtctg acgaccaaga gagccataaa caccaatagc cttaacatca 7740
tccccatatt tatccaatat tcgttcctta atttcatgaa caatcttcat tctttcttct 7800
ctagtcatta ttattggtcc attcactatt ctcattcccc tttcagataa ttttagattt 7860
gcttttctaa ataagaatat ttggagagca ccgttcttat tcagctatta aacccattat 7920
atcgggtttt tgaggggatt tcaactgcag acacctaaat tcaaaatcta tcggtcagat 7980
ttataccgat ttgattttat atattcttga ataacatacg ccgagttatc acataaaagc 8040
gggaaccaat catcaaattt aaacttcatt gcataatcca ttaaactctt aaattctacg 8100
attccttgtt catcaataaa ctcaatcatt tctttaatta atttatatct atctgttgtt 8160
gttttcttta ataattcatc aacatctaca ccgccataaa ctatcatatc ttctttttga 8220
tatttaaatt tattaggatc gtccatgtga agcatatatc tcacaagacc tttcacactt 8280
cctgcaatct gcggaatagt cgcattcaat tcttctgtaa ttatttttat ctgttcataa 8340
gatttattac cctcatacat cactagaata tgataatgct cttttttcat cctatcttct 8400
gtatcagtat ccctatcatg taatggagac actacaaatt gaatgtgtaa ctcttttaaa 8460
tactctaacc actcggcttt tgctgattct ggatataaaa caaatgtcca attacgtcct 8520
cttgaatttt tcttgttttc agtttctttt attacatttt cgctcatgat ataataacgg 8580
tgctaataca tttaacaaaa tttagtcata gataggcagc atgccagtgc tgtctatctt 8640
tttttgttta aaatgcaccg tattcctcct ttgcatattt ttttattaga ataccggttg 8700
catctgattt gctaatatta tatttttctt tgattctatt taatatctca ttttcttctg 8760
ttgtaagtct taaagtaaca gcaacttttt tctcttcttt tctatctaca accatcactg 8820
tacctcccaa catctgtttt tttcacttta acataaaaaa caacctttta acattaaaaa 8880
cccaatattt atttatttgt ttggacaatg gacaatggac acctaggggg gaggtcgtag 8940
taccccccta tgttttctcc cctaaataac cccaaaaatc taagaaaaaa agacctcaaa 9000
aaggtcttta attaacatct caaatttcgc atttattcca atttcctttt tgcgtgtgat 9060
gcgttattaa cgttgatata atttaaattt tatttgacaa aaatgggctc gtgttgtaca 9120
ataaatgtag aggtagagac gcgaggtcta agaactttaa ataatttcta ctgttgtaga 9180
tagagaccgt gaagttaata aggtctcaaa tttctactgt tgtagatcgt ctctgaactg 9240
attcaagcaa gcttaaaccc agctcaatga gctgggtttt ttgtttgttt tttcaaactt 9300
agttagcttg gccagtgcct ctagagtcaa gtaaagagtc gacctgttac gaacggcaga 9360
tcagaatttt gtaataaaaa aagagcctgc tcattacact gcgggctctt tttcatggtc 9420
agaagacggg taaccaagat aacaa 9445
<210> 9
<211> 9445
<212> DNA
<213> (artificial sequence)
<400> 9
atgtcatgac attggtgtac agaaatggcg cagcaatggc aagaacgtcc cgggcggagc 60
tcaggcctta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 120
tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 180
ggcgccaggg tggtttttct tttcaccagt gagacgggca acagctgatt gcccttcacc 240
gcctggccct gagagagttg cagcaagcgg tccacgctgg tttgccccag caggcgaaaa 300
tcctgtttga tggtggttaa cggcgggata taacatgagc tgtcttcggt atcgtcgtat 360
cccactaccg agatatccgc accaacgcgc agcccggact cggtaatggc gcgcattgcg 420
cccagcgcca tctgatcgtt ggcaaccagc atcgcagtgg gaacgatgcc ctcattcagc 480
atttgcatgg tttgttgaaa accggacatg gcactccagt cgccttcccg ttccgctatc 540
ggctgaattt gattgcgagt gagatattta tgccagccag ccagacgcag acgcgccgag 600
acagaactta atgggcccgc taacagcgcg atttgctggt gacccaatgc gaccagatgc 660
tccacgccca gtcgcgtacc gtcttcatgg gagaaaataa tactgttgat gggtgtctgg 720
tcagagacat caagaaataa cgccggaaca ttagtgcagg cagcttccac agcaatggca 780
tcctggtcat ccagcggata gttaatgatc agcccactga cgcgttgcgc gagaagattg 840
tgcaccgccg ttttacaggc ttcgacgccg cttcgttcta ccatcgacac caccacgctg 900
gcacccagtt gatcggcgcg agatttaatc gccgcgacaa tttgcgacgg cgcgtgcagg 960
gccagactgg aggtggcaac gccaatcagc aacgactgtt tgcccgccag ttgttgtgcc 1020
acgcggttgg gaatgtaatt cagctccgcc atcgccgctt ccactttttc ccgcgttttc 1080
gcagaaacgt ggctggcctg gttcaccacg cgggaaacgg tctgataaga gacaccggca 1140
tactctgcga catcgtataa cgttactggt ttcatcaaaa tcgtctccct ccgtttgaat 1200
atttgattga tcgtaaccag atgaagcact ctttccacta tccctacagt gttatggctt 1260
gaacaatcac gaaacaataa ttggtacgta cgatctttca gccgactcaa acatcaaatc 1320
ttacaaatgt agtctttgaa agtattacat atgtaagatt taaatgcaac cgttttttcg 1380
gaaggaaatg atgacctcgt ttccaccgga attagcttgg taccagctat tgtaacataa 1440
tcggtacggg ggtgaaaaag ctaacggaaa agggagcgga aaagaatgat gtaagcgtga 1500
aaaatttttt aaaaaatctc ttgacattgg aagggagata tgttattata agaattgcgg 1560
aattgtgagc ggataacaat tcataattgt gagcggataa caattcaacc ccaaaggagg 1620
tgggatccat gagcgaagtc gagttctccc acgaatactg gatgcgccat gcccttacac 1680
ttgccaaacg cgctcgcgat gaacgcgaag tcccagttgg cgccgttctt gttcttaaca 1740
accgcgtcat cggagaaggc tggaaccgcg ctatcggact tcatgatccg acagctcatg 1800
ccgaaattat ggctctgcgc caaggcggac ttgtcatgca gaattacaga ctttatgatg 1860
ccacgcttta ctcaacgttc gaaccgtgcg tcatgtgcgc cggcgccatg attcatagcc 1920
gcatcggcag agttgtcttc ggcgtccgca atgctaaaac gggcgccgcc ggctccctta 1980
tggacgtcct tcatcacccg ggcatgaatc accgcgttga aatcacagaa ggcattctgg 2040
ccgatgagtg cgccgctctg ctgtgcagat tcttcagaat gccgagaaga gtcttcaacg 2100
cccagaagaa agcccaaagc agcacagact ctggaggatc atccggaggc agctctggaa 2160
gtgaaacacc aggaacaagc gaatcagcta caccagagtc ctctggaggc tcatctggag 2220
gaagctcaat ttatcaagaa tttgttaata aatatagttt aagtaaaact ctaagatttg 2280
agttaatccc acagggtaaa acacttgaaa acataaaagc aagaggtttg attttagatg 2340
atgagaaaag agctaaagac tacaaaaagg ctaaacaaat aattgataaa tatcatcagt 2400
tttttataga ggagatatta agttcggttt gtattagcga agatttatta caaaactatt 2460
ctgatgttta ttttaaactt aaaaagagtg atgatgataa tctacaaaaa gattttaaaa 2520
gtgcaaaaga tacgataaag aaacaaatat ctgaatatat aaaggactca gagaaattta 2580
agaatttgtt taatcaaaac cttatcgatg ctaaaaaagg gcaagagtca gatttaattc 2640
tatggctaaa gcaatctaag gataatggta tagaactatt taaagccaat agtgatatca 2700
cagatataga tgaggcgtta gaaataatca aatcttttaa aggttggaca acttatttta 2760
agggttttca tgaaaataga aaaaatgttt atagtagcaa tgatattcct acatctatta 2820
tttataggat agtagatgat aatttgccta aatttctaga aaataaagct aagtatgaga 2880
gtttaaaaga caaagctcca gaagctataa actatgaaca aattaaaaaa gatttggcag 2940
aagagctaac ctttgatatt gactacaaaa catctgaagt taatcaaaga gttttttcac 3000
ttgatgaagt ttttgagata gcaaacttta ataattatct aaatcaaagt ggtattacta 3060
aatttaatac tattattggt ggtaaatttg taaatggtga aaatacaaag agaaaaggta 3120
taaatgaata tataaatcta tactcacagc aaataaatga taaaacactc aaaaaatata 3180
aaatgagtgt tttatttaag caaattttaa gtgatacaga atctaaatct tttgtaattg 3240
ataagttaga agatgatagt gatgtagtta caacgatgca aagtttttat gagcaaatag 3300
cagcttttaa aacagtagaa gaaaaatcta ttaaagaaac actatcttta ttatttgatg 3360
atttaaaagc tcaaaaactt gatttgagta aaatttattt taaaaatgat aaatctctta 3420
ctgatctatc acaacaagtt tttgatgatt atagtgttat tggtacagcg gtactagaat 3480
atataactca acaaatagca cctaaaaatc ttgataaccc tagtaagaaa gagcaagaat 3540
taatagccaa aaaaactgaa aaagcaaaat acttatctct agaaactata aagcttgcct 3600
tagaagaatt taataagcat agagatatag ataaacagtg taggtttgaa gaaatacttg 3660
caaactttgc ggctattccg atgatatttg atgaaatagc tcaaaacaaa gacaatttgg 3720
cacagatatc tatcaaatat caaaatcaag gtaaaaaaga cctacttcaa gctagtgcgg 3780
aagatgatgt taaagctatc aaggatcttt tagatcaaac taataatctc ttacataaac 3840
taaaaatatt tcatattagt cagtcagaag ataaggcaaa tattttagac aaggatgagc 3900
atttttatct agtatttgag gagtgctact ttgagctagc gaatatagtg cctctttata 3960
acaaaattag aaactatata actcaaaagc catatagtga tgagaaattt aagctcaatt 4020
ttgagaactc gactttggct aatggttggg ataaaaataa agagcctgac aatacggcaa 4080
ttttatttat caaagatgat aaatattatc tgggtgtgat gaataagaaa aataacaaaa 4140
tatttgatga taaagctatc aaagaaaata aaggcgaggg ttataaaaaa attgtttata 4200
aacttttacc tggcgcaaat aaaatgttac ctaaggtttt cttttctgct aaatctataa 4260
aattttataa tcctagtgaa gatatactta gaataagaaa tcattccaca catacaaaaa 4320
atggtagtcc tcaaaaagga tatgaaaaat ttgagtttaa tattgaagat tgccgaaaat 4380
ttatagattt ttataaacag tctataagta agcatccgga gtggaaagat tttggattta 4440
gattttctga tactcaaaga tataattcta tagatgaatt ttatagagaa gttgaaaatc 4500
aaggctacaa actaactttt gaaaatatat cagagagcta tattgatagc gtagttaatc 4560
agggtaaatt gtacctattc caaatctata ataaagattt ttcagcttat agcaaagggc 4620agggtaaatt gtacctattc caaatctata ataaagattt ttcagcttat agcaaagggc 4620
gaccaaatct acatacttta tattggaaag cgctgtttga tgagagaaat cttcaagatg 4680gaccaaatct acatacttta tattggaaag cgctgtttga tgagagaaat cttcaagatg 4680
tggtttataa gctaaatggt gaggcagagc ttttttatcg taaacaatca atacctaaaa 4740tggtttataa gctaaatggt gaggcagagc ttttttatcg taaacaatca atacctaaaa 4740
aaatcactca cccagctaaa gaggcaatag ctaataaaaa caaagataat cctaaaaaag 4800
agagtgtttt tgaatatgat ttaatcaaag ataaacgctt tactgaagat aagtttttct 4860
ttcactgtcc tattacaatc aattttaaat ctagtggagc taataagttt aatgatgaaa 4920
tcaatttatt gctaaaagaa aaagcaaatg atgttcatat attaagtata gctagaggtg 4980
aaagacattt agcttactat actttggtag atggtaaagg caatatcatc aaacaagata 5040
ctttcaacat cattggtaat gatagaatga aaacaaacta ccatgataag cttgctgcaa 5100
tagagaaaga tagggattca gctaggaaag actggaaaaa gataaataac atcaaagaga 5160
tgaaagaggg ctatctatct caggtagttc atgaaatagc taagctagtt atagagtata 5220
atgctattgt ggtttttgag gatttaaatt ttggatttaa aagagggcgt ttcaaggtag 5280
agaagcaggt ctatcaaaag ttagaaaaaa tgctaattga gaaactaaac tatctagttt 5340
tcaaagataa tgagtttgat aaaactgggg gagtgcttag agcttatcag ctaacagcac 5400
cttttgagac ttttaaaaag atgggtaaac aaacaggtat tatctactat gtaccagctg 5460
gttttacttc aaaaatttgt cctgtaactg gttttgtaaa tcagttatat cctaagtatg 5520
aaagtgtcag caaatctcaa gagttcttta gtaagtttga caagatttgt tataaccttg 5580
ataagggcta ttttgagttt agttttgatt ataaaaactt tggtgacaag gctgccaaag 5640
gcaagtggac tatagctagc tttgggagta gattgattaa ctttagaaat tcagataaaa 5700
atcataattg ggatactcga gaagtttatc caactaaaga gttggagaaa ttgctaaaag 5760
attattctat cgaatatggg catggcgaat gtatcaaagc agctatttgc ggtgagagcg 5820
acaaaaagtt ttttgctaag ctaactagtg tcctaaatac tatcttacaa atgcgtaact 5880
caaaaacagg tactgagtta gattatctaa tttcaccagt agcagatgta aatggcaatt 5940
tctttgattc gcgacaggcg ccaaaaaata tgcctcaaga tgctgatgcc aatggtgctt 6000
atcatattgg gctaaaaggt ctgatgctac taggtaggat caaaaataat caagagggca 6060
aaaaactcaa tttggttatc aaaaatgaag agtattttga gttcgtgcag aataggaata 6120
actctggtgg ttctcccaag aagaagagga aagtctaact gcagtataat cagaaacagc 6180
ccgcggatgt tgatctgcgg gctgtttttt attgatcgaa tggccatgac caaaatccct 6240
taacgtgaag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 6300
ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 6360
agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 6420
cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag gccaccactt 6480
caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 6540
tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 6600
ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 6660
ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg 6720
gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 6780
gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 6840
tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 6900
cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt tctttcctgc 6960
gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg agacaggtca 7020
ttcagactgg ctaatgcacc cagtaaggca gcggtatcat caactcaaaa tggtatgcgt 7080
tttgacacat ccactatata tccgtgtcgt tctgtccact cctgaatccc attccagaaa 7140
ttctctagcg attccagaag tttctcagag tcggaaagtt gaccagacat tacgaactgg 7200
cacagatggt cataacctga aggaagatct gattgcttaa ctgcttcagt taagaccgaa 7260
gcgctcgtcg tataacagat gcgatgatgc agaccaatca acatggcacc tgccattgct 7320
acctgcacag tcaaggatgg tagaaatgtt gtcggtcctt gcacacgaat attacgccat 7380
ttgcctgcat attcaaacag ctcttctacg ataagggcac aaatcgcatc gtggaacgtt 7440
tgggcttcta ccgatttagc agtttgatac actttctcta agtatccacc tgaatcataa 7500
atcggcaaaa tagagaaaaa ttgaccatgt gtaagcggcc aatctgattc cacctgagat 7560
gcataatcta gtagaatctc ttcgctatca aaattcactt ccaccttcca ctcaccggtt 7620
gtccattcat ggctgaactc tgcttcctct gttgacatga cacacatcat ctcaatatcc 7680
gaatagggcc catcagtctg acgaccaaga gagccataaa caccaatagc cttaacatca 7740
tccccatatt tatccaatat tcgttcctta atttcatgaa caatcttcat tctttcttct 7800
ctagtcatta ttattggtcc attcactatt ctcattcccc tttcagataa ttttagattt 7860
gcttttctaa ataagaatat ttggagagca ccgttcttat tcagctatta aacccattat 7920
atcgggtttt tgaggggatt tcaactgcag acacctaaat tcaaaatcta tcggtcagat 7980
ttataccgat ttgattttat atattcttga ataacatacg ccgagttatc acataaaagc 8040
gggaaccaat catcaaattt aaacttcatt gcataatcca ttaaactctt aaattctacg 8100
attccttgtt catcaataaa ctcaatcatt tctttaatta atttatatct atctgttgtt 8160
gttttcttta ataattcatc aacatctaca ccgccataaa ctatcatatc ttctttttga 8220
tatttaaatt tattaggatc gtccatgtga agcatatatc tcacaagacc tttcacactt 8280
cctgcaatct gcggaatagt cgcattcaat tcttctgtaa ttatttttat ctgttcataa 8340
gatttattac cctcatacat cactagaata tgataatgct cttttttcat cctatcttct 8400
gtatcagtat ccctatcatg taatggagac actacaaatt gaatgtgtaa ctcttttaaa 8460
tactctaacc actcggcttt tgctgattct ggatataaaa caaatgtcca attacgtcct 8520
cttgaatttt tcttgttttc agtttctttt attacatttt cgctcatgat ataataacgg 8580
tgctaataca tttaacaaaa tttagtcata gataggcagc atgccagtgc tgtctatctt 8640
tttttgttta aaatgcaccg tattcctcct ttgcatattt ttttattaga ataccggttg 8700
catctgattt gctaatatta tatttttctt tgattctatt taatatctca ttttcttctg 8760
ttgtaagtct taaagtaaca gcaacttttt tctcttcttt tctatctaca accatcactg 8820
tacctcccaa catctgtttt tttcacttta acataaaaaa caacctttta acattaaaaa 8880
cccaatattt atttatttgt ttggacaatg gacaatggac acctaggggg gaggtcgtag 8940
taccccccta tgttttctcc cctaaataac cccaaaaatc taagaaaaaa agacctcaaa 9000
aaggtcttta attaacatct caaatttcgc atttattcca atttcctttt tgcgtgtgat 9060
gcgttattaa cgttgatata atttaaattt tatttgacaa aaatgggctc gtgttgtaca 9120
ataaatgtag aggtagagac gcgaggtcta agaactttaa ataatttcta ctgttgtaga 9180
tagagaccgt gaagttaata aggtctcaaa tttctactgt tgtagatcgt ctctgaactg 9240
attcaagcaa gcttaaaccc agctcaatga gctgggtttt ttgtttgttt tttcaaactt 9300
agttagcttg gccagtgcct ctagagtcaa gtaaagagtc gacctgttac gaacggcaga 9360
tcagaatttt gtaataaaaa aagagcctgc tcattacact gcgggctctt tttcatggtc 9420
agaagacggg taaccaagat aacaa 9445
<210> 10
<211> 9445
<212> DNA
<213> (artificial sequence)
<400> 10
atgtcatgac attggtgtac agaaatggcg cagcaatggc aagaacgtcc cgggcggagc 60
tcaggcctta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 120
tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 180
ggcgccaggg tggtttttct tttcaccagt gagacgggca acagctgatt gcccttcacc 240
gcctggccct gagagagttg cagcaagcgg tccacgctgg tttgccccag caggcgaaaa 300
tcctgtttga tggtggttaa cggcgggata taacatgagc tgtcttcggt atcgtcgtat 360
cccactaccg agatatccgc accaacgcgc agcccggact cggtaatggc gcgcattgcg 420
cccagcgcca tctgatcgtt ggcaaccagc atcgcagtgg gaacgatgcc ctcattcagc 480
atttgcatgg tttgttgaaa accggacatg gcactccagt cgccttcccg ttccgctatc 540
ggctgaattt gattgcgagt gagatattta tgccagccag ccagacgcag acgcgccgag 600
acagaactta atgggcccgc taacagcgcg atttgctggt gacccaatgc gaccagatgc 660
tccacgccca gtcgcgtacc gtcttcatgg gagaaaataa tactgttgat gggtgtctgg 720
tcagagacat caagaaataa cgccggaaca ttagtgcagg cagcttccac agcaatggca 780
tcctggtcat ccagcggata gttaatgatc agcccactga cgcgttgcgc gagaagattg 840
tgcaccgccg ttttacaggc ttcgacgccg cttcgttcta ccatcgacac caccacgctg 900
gcacccagtt gatcggcgcg agatttaatc gccgcgacaa tttgcgacgg cgcgtgcagg 960
gccagactgg aggtggcaac gccaatcagc aacgactgtt tgcccgccag ttgttgtgcc 1020
acgcggttgg gaatgtaatt cagctccgcc atcgccgctt ccactttttc ccgcgttttc 1080
gcagaaacgt ggctggcctg gttcaccacg cgggaaacgg tctgataaga gacaccggca 1140
tactctgcga catcgtataa cgttactggt ttcatcaaaa tcgtctccct ccgtttgaat 1200
atttgattga tcgtaaccag atgaagcact ctttccacta tccctacagt gttatggctt 1260
gaacaatcac gaaacaataa ttggtacgta cgatctttca gccgactcaa acatcaaatc 1320
ttacaaatgt agtctttgaa agtattacat atgtaagatt taaatgcaac cgttttttcg 1380
gaaggaaatg atgacctcgt ttccaccgga attagcttgg taccagctat tgtaacataa 1440
tcggtacggg ggtgaaaaag ctaacggaaa agggagcgga aaagaatgat gtaagcgtga 1500
aaaatttttt aaaaaatctc ttgacattgg aagggagata tgttattata agaattgcgg 1560
aattgtgagc ggataacaat tcataattgt gagcggataa caattcaacc ccaaaggagg 1620
tgggatccat gagcgaagtc gagttctccc acgaatactg gatgcgccat gcccttacac 1680
ttgccaaacg cgctcgcgat gaacgcgaag tcccagttgg cgccgttctt gttcttaaca 1740
accgcgtcat cggagaaggc tggaaccgcg ctatcggact tcatgatccg acagctcatg 1800
ccgaaattat ggctctgcgc caaggcggac ttgtcatgca gaattacaga cttattgatg 1860
ccacgcttta cgtcacgttc gaaccgtgcg tcatgtgcgc cggcgccatg attcatagcc 1920
gcatcggcag agttgtcttc ggcgtccgca attcaaaaag aggcgccgcc ggctccctta 1980
tgaacgtcct taactacccg ggcatgaatc accgcgttga aatcacagaa ggcattctgg 2040
ccgatgagtg cgccgctctg ctgtgcgact tctatagaat gccgagacaa gtcttcaacg 2100
cccagaagaa agcccaaagc agcattaact ctggaggatc atccggaggc agctctggaa 2160
gtgaaacacc aggaacaagc gaatcagcta caccagagtc ctctggaggc tcatctggag 2220
gaagctcaat ttatcaagaa tttgttaata aatatagttt aagtaaaact ctaagatttg 2280
agttaatccc acagggtaaa acacttgaaa acataaaagc aagaggtttg attttagatg 2340
atgagaaaag agctaaagac tacaaaaagg ctaaacaaat aattgataaa tatcatcagt 2400
tttttataga ggagatatta agttcggttt gtattagcga agatttatta caaaactatt 2460
ctgatgttta ttttaaactt aaaaagagtg atgatgataa tctacaaaaa gattttaaaa 2520
gtgcaaaaga tacgataaag aaacaaatat ctgaatatat aaaggactca gagaaattta 2580
agaatttgtt taatcaaaac cttatcgatg ctaaaaaagg gcaagagtca gatttaattc 2640
tatggctaaa gcaatctaag gataatggta tagaactatt taaagccaat agtgatatca 2700
cagatataga tgaggcgtta gaaataatca aatcttttaa aggttggaca acttatttta 2760
agggttttca tgaaaataga aaaaatgttt atagtagcaa tgatattcct acatctatta 2820
tttataggat agtagatgat aatttgccta aatttctaga aaataaagct aagtatgaga 2880
gtttaaaaga caaagctcca gaagctataa actatgaaca aattaaaaaa gatttggcag 2940
aagagctaac ctttgatatt gactacaaaa catctgaagt taatcaaaga gttttttcac 3000
ttgatgaagt ttttgagata gcaaacttta ataattatct aaatcaaagt ggtattacta 3060
aatttaatac tattattggt ggtaaatttg taaatggtga aaatacaaag agaaaaggta 3120
taaatgaata tataaatcta tactcacagc aaataaatga taaaacactc aaaaaatata 3180
aaatgagtgt tttatttaag caaattttaa gtgatacaga atctaaatct tttgtaattg 3240
ataagttaga agatgatagt gatgtagtta caacgatgca aagtttttat gagcaaatag 3300
cagcttttaa aacagtagaa gaaaaatcta ttaaagaaac actatcttta ttatttgatg 3360
atttaaaagc tcaaaaactt gatttgagta aaatttattt taaaaatgat aaatctctta 3420
ctgatctatc acaacaagtt tttgatgatt atagtgttat tggtacagcg gtactagaat 3480
atataactca acaaatagca cctaaaaatc ttgataaccc tagtaagaaa gagcaagaat 3540
taatagccaa aaaaactgaa aaagcaaaat acttatctct agaaactata aagcttgcct 3600
tagaagaatt taataagcat agagatatag ataaacagtg taggtttgaa gaaatacttg 3660
caaactttgc ggctattccg atgatatttg atgaaatagc tcaaaacaaa gacaatttgg 3720
cacagatatc tatcaaatat caaaatcaag gtaaaaaaga cctacttcaa gctagtgcgg 3780
aagatgatgt taaagctatc aaggatcttt tagatcaaac taataatctc ttacataaac 3840
taaaaatatt tcatattagt cagtcagaag ataaggcaaa tattttagac aaggatgagc 3900
atttttatct agtatttgag gagtgctact ttgagctagc gaatatagtg cctctttata 3960
acaaaattag aaactatata actcaaaagc catatagtga tgagaaattt aagctcaatt 4020
ttgagaactc gactttggct aatggttggg ataaaaataa agagcctgac aatacggcaa 4080
ttttatttat caaagatgat aaatattatc tgggtgtgat gaataagaaa aataacaaaa 4140
tatttgatga taaagctatc aaagaaaata aaggcgaggg ttataaaaaa attgtttata 4200
aacttttacc tggcgcaaat aaaatgttac ctaaggtttt cttttctgct aaatctataa 4260
aattttataa tcctagtgaa gatatactta gaataagaaa tcattccaca catacaaaaa 4320
atggtagtcc tcaaaaagga tatgaaaaat ttgagtttaa tattgaagat tgccgaaaat 4380
ttatagattt ttataaacag tctataagta agcatccgga gtggaaagat tttggattta 4440
gattttctga tactcaaaga tataattcta tagatgaatt ttatagagaa gttgaaaatc 4500
aaggctacaa actaactttt gaaaatatat cagagagcta tattgatagc gtagttaatc 4560
agggtaaatt gtacctattc caaatctata ataaagattt ttcagcttat agcaaagggc 4620
gaccaaatct acatacttta tattggaaag cgctgtttga tgagagaaat cttcaagatg 4680
tggtttataa gctaaatggt gaggcagagc ttttttatcg taaacaatca atacctaaaa 4740
aaatcactca cccagctaaa gaggcaatag ctaataaaaa caaagataat cctaaaaaag 4800
agagtgtttt tgaatatgat ttaatcaaag ataaacgctt tactgaagat aagtttttct 4860
ttcactgtcc tattacaatc aattttaaat ctagtggagc taataagttt aatgatgaaa 4920
tcaatttatt gctaaaagaa aaagcaaatg atgttcatat attaagtata gctagaggtg 4980
aaagacattt agcttactat actttggtag atggtaaagg caatatcatc aaacaagata 5040
ctttcaacat cattggtaat gatagaatga aaacaaacta ccatgataag cttgctgcaa 5100
tagagaaaga tagggattca gctaggaaag actggaaaaa gataaataac atcaaagaga 5160
tgaaagaggg ctatctatct caggtagttc atgaaatagc taagctagtt atagagtata 5220
atgctattgt ggtttttgag gatttaaatt ttggatttaa aagagggcgt ttcaaggtag 5280
agaagcaggt ctatcaaaag ttagaaaaaa tgctaattga gaaactaaac tatctagttt 5340
tcaaagataa tgagtttgat aaaactgggg gagtgcttag agcttatcag ctaacagcac 5400
cttttgagac ttttaaaaag atgggtaaac aaacaggtat tatctactat gtaccagctg 5460
gttttacttc aaaaatttgt cctgtaactg gttttgtaaa tcagttatat cctaagtatg 5520
aaagtgtcag caaatctcaa gagttcttta gtaagtttga caagatttgt tataaccttg 5580
ataagggcta ttttgagttt agttttgatt ataaaaactt tggtgacaag gctgccaaag 5640
gcaagtggac tatagctagc tttgggagta gattgattaa ctttagaaat tcagataaaa 5700
atcataattg ggatactcga gaagtttatc caactaaaga gttggagaaa ttgctaaaag 5760
attattctat cgaatatggg catggcgaat gtatcaaagc agctatttgc ggtgagagcg 5820
acaaaaagtt ttttgctaag ctaactagtg tcctaaatac tatcttacaa atgcgtaact 5880
caaaaacagg tactgagtta gattatctaa tttcaccagt agcagatgta aatggcaatt 5940
tctttgattc gcgacaggcg ccaaaaaata tgcctcaaga tgctgatgcc aatggtgctt 6000
atcatattgg gctaaaaggt ctgatgctac taggtaggat caaaaataat caagagggca 6060
aaaaactcaa tttggttatc aaaaatgaag agtattttga gttcgtgcag aataggaata 6120
actctggtgg ttctcccaag aagaagagga aagtctaact gcagtataat cagaaacagc 6180
ccgcggatgt tgatctgcgg gctgtttttt attgatcgaa tggccatgac caaaatccct 6240
taacgtgaag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 6300
ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 6360
agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 6420
cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag gccaccactt 6480
caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 6540
tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 6600
ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 6660
ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg 6720
gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 6780
gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 6840
tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 6900
cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt tctttcctgc 6960
gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg agacaggtca 7020
ttcagactgg ctaatgcacc cagtaaggca gcggtatcat caactcaaaa tggtatgcgt 7080
tttgacacat ccactatata tccgtgtcgt tctgtccact cctgaatccc attccagaaa 7140
ttctctagcg attccagaag tttctcagag tcggaaagtt gaccagacat tacgaactgg 7200
cacagatggt cataacctga aggaagatct gattgcttaa ctgcttcagt taagaccgaa 7260
gcgctcgtcg tataacagat gcgatgatgc agaccaatca acatggcacc tgccattgct 7320
acctgcacag tcaaggatgg tagaaatgtt gtcggtcctt gcacacgaat attacgccat 7380
ttgcctgcat attcaaacag ctcttctacg ataagggcac aaatcgcatc gtggaacgtt 7440
tgggcttcta ccgatttagc agtttgatac actttctcta agtatccacc tgaatcataa 7500
atcggcaaaa tagagaaaaa ttgaccatgt gtaagcggcc aatctgattc cacctgagat 7560
gcataatcta gtagaatctc ttcgctatca aaattcactt ccaccttcca ctcaccggtt 7620
gtccattcat ggctgaactc tgcttcctct gttgacatga cacacatcat ctcaatatcc 7680
gaatagggcc catcagtctg acgaccaaga gagccataaa caccaatagc cttaacatca 7740
tccccatatt tatccaatat tcgttcctta atttcatgaa caatcttcat tctttcttct 7800
ctagtcatta ttattggtcc attcactatt ctcattcccc tttcagataa ttttagattt 7860
gcttttctaa ataagaatat ttggagagca ccgttcttat tcagctatta aacccattat 7920
atcgggtttt tgaggggatt tcaactgcag acacctaaat tcaaaatcta tcggtcagat 7980
ttataccgat ttgattttat atattcttga ataacatacg ccgagttatc acataaaagc 8040
gggaaccaat catcaaattt aaacttcatt gcataatcca ttaaactctt aaattctacg 8100
attccttgtt catcaataaa ctcaatcatt tctttaatta atttatatct atctgttgtt 8160
gttttcttta ataattcatc aacatctaca ccgccataaa ctatcatatc ttctttttga 8220
tatttaaatt tattaggatc gtccatgtga agcatatatc tcacaagacc tttcacactt 8280
cctgcaatct gcggaatagt cgcattcaat tcttctgtaa ttatttttat ctgttcataa 8340
gatttattac cctcatacat cactagaata tgataatgct cttttttcat cctatcttct 8400
gtatcagtat ccctatcatg taatggagac actacaaatt gaatgtgtaa ctcttttaaa 8460
tactctaacc actcggcttt tgctgattct ggatataaaa caaatgtcca attacgtcct 8520
cttgaatttt tcttgttttc agtttctttt attacatttt cgctcatgat ataataacgg 8580
tgctaataca tttaacaaaa tttagtcata gataggcagc atgccagtgc tgtctatctt 8640
tttttgttta aaatgcaccg tattcctcct ttgcatattt ttttattaga ataccggttg 8700
catctgattt gctaatatta tatttttctt tgattctatt taatatctca ttttcttctg 8760
ttgtaagtct taaagtaaca gcaacttttt tctcttcttt tctatctaca accatcactg 8820
tacctcccaa catctgtttt tttcacttta acataaaaaa caacctttta acattaaaaa 8880
cccaatattt atttatttgt ttggacaatg gacaatggac acctaggggg gaggtcgtag 8940
taccccccta tgttttctcc cctaaataac cccaaaaatc taagaaaaaa agacctcaaa 9000
aaggtcttta attaacatct caaatttcgc atttattcca atttcctttt tgcgtgtgat 9060
gcgttattaa cgttgatata atttaaattt tatttgacaa aaatgggctc gtgttgtaca 9120
ataaatgtag aggtagagac gcgaggtcta agaactttaa ataatttcta ctgttgtaga 9180
tagagaccgt gaagttaata aggtctcaaa tttctactgt tgtagatcgt ctctgaactg 9240
attcaagcaa gcttaaaccc agctcaatga gctgggtttt ttgtttgttt tttcaaactt 9300
agttagcttg gccagtgcct ctagagtcaa gtaaagagtc gacctgttac gaacggcaga 9360
tcagaatttt gtaataaaaa aagagcctgc tcattacact gcgggctctt tttcatggtc 9420
agaagacggg taaccaagat aacaa 9445
<210> 11
<211> 9445
<212> DNA
<213> (artificial sequence)
<400> 11
atgtcatgac attggtgtac agaaatggcg cagcaatggc aagaacgtcc cgggcggagc 60
tcaggcctta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 120
tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 180
ggcgccaggg tggtttttct tttcaccagt gagacgggca acagctgatt gcccttcacc 240
gcctggccct gagagagttg cagcaagcgg tccacgctgg tttgccccag caggcgaaaa 300
tcctgtttga tggtggttaa cggcgggata taacatgagc tgtcttcggt atcgtcgtat 360
cccactaccg agatatccgc accaacgcgc agcccggact cggtaatggc gcgcattgcg 420
cccagcgcca tctgatcgtt ggcaaccagc atcgcagtgg gaacgatgcc ctcattcagc 480
atttgcatgg tttgttgaaa accggacatg gcactccagt cgccttcccg ttccgctatc 540
ggctgaattt gattgcgagt gagatattta tgccagccag ccagacgcag acgcgccgag 600
acagaactta atgggcccgc taacagcgcg atttgctggt gacccaatgc gaccagatgc 660
tccacgccca gtcgcgtacc gtcttcatgg gagaaaataa tactgttgat gggtgtctgg 720
tcagagacat caagaaataa cgccggaaca ttagtgcagg cagcttccac agcaatggca 780
tcctggtcat ccagcggata gttaatgatc agcccactga cgcgttgcgc gagaagattg 840
tgcaccgccg ttttacaggc ttcgacgccg cttcgttcta ccatcgacac caccacgctg 900
gcacccagtt gatcggcgcg agatttaatc gccgcgacaa tttgcgacgg cgcgtgcagg 960
gccagactgg aggtggcaac gccaatcagc aacgactgtt tgcccgccag ttgttgtgcc 1020
acgcggttgg gaatgtaatt cagctccgcc atcgccgctt ccactttttc ccgcgttttc 1080
gcagaaacgt ggctggcctg gttcaccacg cgggaaacgg tctgataaga gacaccggca 1140
tactctgcga catcgtataa cgttactggt ttcatcaaaa tcgtctccct ccgtttgaat 1200
atttgattga tcgtaaccag atgaagcact ctttccacta tccctacagt gttatggctt 1260
gaacaatcac gaaacaataa ttggtacgta cgatctttca gccgactcaa acatcaaatc 1320
ttacaaatgt agtctttgaa agtattacat atgtaagatt taaatgcaac cgttttttcg 1380
gaaggaaatg atgacctcgt ttccaccgga attagcttgg taccagctat tgtaacataa 1440
tcggtacggg ggtgaaaaag ctaacggaaa agggagcgga aaagaatgat gtaagcgtga 1500
aaaatttttt aaaaaatctc ttgacattgg aagggagata tgttattata agaattgcgg 1560
aattgtgagc ggataacaat tcataattgt gagcggataa caattcaacc ccaaaggagg 1620
tgggatccat gagcgaagtc gagttctccc acgaatactg gatgcgccat gcccttacac 1680
ttgccaaacg cgctcgcgat gaacgcgaag tcccagttgg cgccgttctt gttcttaaca 1740
accgcgtcat cggagaaggc tggaaccgcg ctatcggact tcatgatccg acagctcatg 1800
ccgaaattat ggctctgcgc caaggcggac ttgtcatgca gaattacaga cttattgatg 1860
ccacgcttta ctcaacgttc gaaccgtgcg tcatgtgcgc cggcgccatg attcatagcc 1920
gcatcggcag agttgtcttc ggcgtccgca attcaaaaag aggcgccgcc ggctccctta 1980
tgaacgtcct taactacccg ggcatgaatc accgcgttga aatcacagaa ggcattctgg 2040
ccgatgagtg cgccgctctg ctgtgcgact tctatagaat gccgagaaga gtcttcaacg 2100
cccagaagaa agcccaaagc agcattaact ctggaggatc atccggaggc agctctggaa 2160
gtgaaacacc aggaacaagc gaatcagcta caccagagtc ctctggaggc tcatctggag 2220
gaagctcaat ttatcaagaa tttgttaata aatatagttt aagtaaaact ctaagatttg 2280
agttaatccc acagggtaaa acacttgaaa acataaaagc aagaggtttg attttagatg 2340
atgagaaaag agctaaagac tacaaaaagg ctaaacaaat aattgataaa tatcatcagt 2400
tttttataga ggagatatta agttcggttt gtattagcga agatttatta caaaactatt 2460
ctgatgttta ttttaaactt aaaaagagtg atgatgataa tctacaaaaa gattttaaaa 2520
gtgcaaaaga tacgataaag aaacaaatat ctgaatatat aaaggactca gagaaattta 2580
agaatttgtt taatcaaaac cttatcgatg ctaaaaaagg gcaagagtca gatttaattc 2640
tatggctaaa gcaatctaag gataatggta tagaactatt taaagccaat agtgatatca 2700
cagatataga tgaggcgtta gaaataatca aatcttttaa aggttggaca acttatttta 2760
agggttttca tgaaaataga aaaaatgttt atagtagcaa tgatattcct acatctatta 2820
tttataggat agtagatgat aatttgccta aatttctaga aaataaagct aagtatgaga 2880
gtttaaaaga caaagctcca gaagctataa actatgaaca aattaaaaaa gatttggcag 2940
aagagctaac ctttgatatt gactacaaaa catctgaagt taatcaaaga gttttttcac 3000
ttgatgaagt ttttgagata gcaaacttta ataattatct aaatcaaagt ggtattacta 3060
aatttaatac tattattggt ggtaaatttg taaatggtga aaatacaaag agaaaaggta 3120
taaatgaata tataaatcta tactcacagc aaataaatga taaaacactc aaaaaatata 3180
aaatgagtgt tttatttaag caaattttaa gtgatacaga atctaaatct tttgtaattg 3240
ataagttaga agatgatagt gatgtagtta caacgatgca aagtttttat gagcaaatag 3300
cagcttttaa aacagtagaa gaaaaatcta ttaaagaaac actatcttta ttatttgatg 3360
atttaaaagc tcaaaaactt gatttgagta aaatttattt taaaaatgat aaatctctta 3420
ctgatctatc acaacaagtt tttgatgatt atagtgttat tggtacagcg gtactagaat 3480
atataactca acaaatagca cctaaaaatc ttgataaccc tagtaagaaa gagcaagaat 3540
taatagccaa aaaaactgaa aaagcaaaat acttatctct agaaactata aagcttgcct 3600
tagaagaatt taataagcat agagatatag ataaacagtg taggtttgaa gaaatacttg 3660
caaactttgc ggctattccg atgatatttg atgaaatagc tcaaaacaaa gacaatttgg 3720
cacagatatc tatcaaatat caaaatcaag gtaaaaaaga cctacttcaa gctagtgcgg 3780
aagatgatgt taaagctatc aaggatcttt tagatcaaac taataatctc ttacataaac 3840
taaaaatatt tcatattagt cagtcagaag ataaggcaaa tattttagac aaggatgagc 3900
atttttatct agtatttgag gagtgctact ttgagctagc gaatatagtg cctctttata 3960
acaaaattag aaactatata actcaaaagc catatagtga tgagaaattt aagctcaatt 4020
ttgagaactc gactttggct aatggttggg ataaaaataa agagcctgac aatacggcaa 4080
ttttatttat caaagatgat aaatattatc tgggtgtgat gaataagaaa aataacaaaa 4140
tatttgatga taaagctatc aaagaaaata aaggcgaggg ttataaaaaa attgtttata 4200
aacttttacc tggcgcaaat aaaatgttac ctaaggtttt cttttctgct aaatctataa 4260
aattttataa tcctagtgaa gatatactta gaataagaaa tcattccaca catacaaaaa 4320
atggtagtcc tcaaaaagga tatgaaaaat ttgagtttaa tattgaagat tgccgaaaat 4380
ttatagattt ttataaacag tctataagta agcatccgga gtggaaagat tttggattta 4440
gattttctga tactcaaaga tataattcta tagatgaatt ttatagagaa gttgaaaatc 4500
aaggctacaa actaactttt gaaaatatat cagagagcta tattgatagc gtagttaatc 4560
agggtaaatt gtacctattc caaatctata ataaagattt ttcagcttat agcaaagggc 4620
gaccaaatct acatacttta tattggaaag cgctgtttga tgagagaaat cttcaagatg 4680
tggtttataa gctaaatggt gaggcagagc ttttttatcg taaacaatca atacctaaaa 4740
aaatcactca cccagctaaa gaggcaatag ctaataaaaa caaagataat cctaaaaaag 4800
agagtgtttt tgaatatgat ttaatcaaag ataaacgctt tactgaagat aagtttttct 4860
ttcactgtcc tattacaatc aattttaaat ctagtggagc taataagttt aatgatgaaa 4920
tcaatttatt gctaaaagaa aaagcaaatg atgttcatat attaagtata gctagaggtg 4980
aaagacattt agcttactat actttggtag atggtaaagg caatatcatc aaacaagata 5040
ctttcaacat cattggtaat gatagaatga aaacaaacta ccatgataag cttgctgcaa 5100
tagagaaaga tagggattca gctaggaaag actggaaaaa gataaataac atcaaagaga 5160
tgaaagaggg ctatctatct caggtagttc atgaaatagc taagctagtt atagagtata 5220
atgctattgt ggtttttgag gatttaaatt ttggatttaa aagagggcgt ttcaaggtag 5280
agaagcaggt ctatcaaaag ttagaaaaaa tgctaattga gaaactaaac tatctagttt 5340
tcaaagataa tgagtttgat aaaactgggg gagtgcttag agcttatcag ctaacagcac 5400
cttttgagac ttttaaaaag atgggtaaac aaacaggtat tatctactat gtaccagctg 5460
gttttacttc aaaaatttgt cctgtaactg gttttgtaaa tcagttatat cctaagtatg 5520
aaagtgtcag caaatctcaa gagttcttta gtaagtttga caagatttgt tataaccttg 5580
ataagggcta ttttgagttt agttttgatt ataaaaactt tggtgacaag gctgccaaag 5640
gcaagtggac tatagctagc tttgggagta gattgattaa ctttagaaat tcagataaaa 5700
atcataattg ggatactcga gaagtttatc caactaaaga gttggagaaa ttgctaaaag 5760
attattctat cgaatatggg catggcgaat gtatcaaagc agctatttgc ggtgagagcg 5820
acaaaaagtt ttttgctaag ctaactagtg tcctaaatac tatcttacaa atgcgtaact 5880
caaaaacagg tactgagtta gattatctaa tttcaccagt agcagatgta aatggcaatt 5940
tctttgattc gcgacaggcg ccaaaaaata tgcctcaaga tgctgatgcc aatggtgctt 6000
atcatattgg gctaaaaggt ctgatgctac taggtaggat caaaaataat caagagggca 6060
aaaaactcaa tttggttatc aaaaatgaag agtattttga gttcgtgcag aataggaata 6120
actctggtgg ttctcccaag aagaagagga aagtctaact gcagtataat cagaaacagc 6180
ccgcggatgt tgatctgcgg gctgtttttt attgatcgaa tggccatgac caaaatccct 6240
taacgtgaag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 6300
ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 6360
agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 6420
cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag gccaccactt 6480
caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 6540
tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 6600
ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 6660
ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg 6720
gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 6780
gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 6840
tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 6900
cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt tctttcctgc 6960
gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg agacaggtca 7020
ttcagactgg ctaatgcacc cagtaaggca gcggtatcat caactcaaaa tggtatgcgt 7080
tttgacacat ccactatata tccgtgtcgt tctgtccact cctgaatccc attccagaaa 7140
ttctctagcg attccagaag tttctcagag tcggaaagtt gaccagacat tacgaactgg 7200
cacagatggt cataacctga aggaagatct gattgcttaa ctgcttcagt taagaccgaa 7260
gcgctcgtcg tataacagat gcgatgatgc agaccaatca acatggcacc tgccattgct 7320
acctgcacag tcaaggatgg tagaaatgtt gtcggtcctt gcacacgaat attacgccat 7380
ttgcctgcat attcaaacag ctcttctacg ataagggcac aaatcgcatc gtggaacgtt 7440
tgggcttcta ccgatttagc agtttgatac actttctcta agtatccacc tgaatcataa 7500
atcggcaaaa tagagaaaaa ttgaccatgt gtaagcggcc aatctgattc cacctgagat 7560
gcataatcta gtagaatctc ttcgctatca aaattcactt ccaccttcca ctcaccggtt 7620
gtccattcat ggctgaactc tgcttcctct gttgacatga cacacatcat ctcaatatcc 7680
gaatagggcc catcagtctg acgaccaaga gagccataaa caccaatagc cttaacatca 7740
tccccatatt tatccaatat tcgttcctta atttcatgaa caatcttcat tctttcttct 7800
ctagtcatta ttattggtcc attcactatt ctcattcccc tttcagataa ttttagattt 7860
gcttttctaa ataagaatat ttggagagca ccgttcttat tcagctatta aacccattat 7920
atcgggtttt tgaggggatt tcaactgcag acacctaaat tcaaaatcta tcggtcagat 7980
ttataccgat ttgattttat atattcttga ataacatacg ccgagttatc acataaaagc 8040
gggaaccaat catcaaattt aaacttcatt gcataatcca ttaaactctt aaattctacg 8100
attccttgtt catcaataaa ctcaatcatt tctttaatta atttatatct atctgttgtt 8160
gttttcttta ataattcatc aacatctaca ccgccataaa ctatcatatc ttctttttga 8220
tatttaaatt tattaggatc gtccatgtga agcatatatc tcacaagacc tttcacactt 8280
cctgcaatct gcggaatagt cgcattcaat tcttctgtaa ttatttttat ctgttcataa 8340
gatttattac cctcatacat cactagaata tgataatgct cttttttcat cctatcttct 8400
gtatcagtat ccctatcatg taatggagac actacaaatt gaatgtgtaa ctcttttaaa 8460
tactctaacc actcggcttt tgctgattct ggatataaaa caaatgtcca attacgtcct 8520
cttgaatttt tcttgttttc agtttctttt attacatttt cgctcatgat ataataacgg 8580
tgctaataca tttaacaaaa tttagtcata gataggcagc atgccagtgc tgtctatctt 8640
tttttgttta aaatgcaccg tattcctcct ttgcatattt ttttattaga ataccggttg 8700
catctgattt gctaatatta tatttttctt tgattctatt taatatctca ttttcttctg 8760
ttgtaagtct taaagtaaca gcaacttttt tctcttcttt tctatctaca accatcactg 8820
tacctcccaa catctgtttt tttcacttta acataaaaaa caacctttta acattaaaaa 8880
cccaatattt atttatttgt ttggacaatg gacaatggac acctaggggg gaggtcgtag 8940
taccccccta tgttttctcc cctaaataac cccaaaaatc taagaaaaaa agacctcaaa 9000
aaggtcttta attaacatct caaatttcgc atttattcca atttcctttt tgcgtgtgat 9060
gcgttattaa cgttgatata atttaaattt tatttgacaa aaatgggctc gtgttgtaca 9120
ataaatgtag aggtagagac gcgaggtcta agaactttaa ataatttcta ctgttgtaga 9180
tagagaccgt gaagttaata aggtctcaaa tttctactgt tgtagatcgt ctctgaactg 9240
attcaagcaa gcttaaaccc agctcaatga gctgggtttt ttgtttgttt tttcaaactt 9300
agttagcttg gccagtgcct ctagagtcaa gtaaagagtc gacctgttac gaacggcaga 9360
tcagaatttt gtaataaaaa aagagcctgc tcattacact gcgggctctt tttcatggtc 9420
agaagacggg taaccaagat aacaa 9445
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210265179.8A CN114835818B (en) | 2022-03-17 | 2022-03-17 | Gene editing fusion protein, adenine base editor constructed by same and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210265179.8A CN114835818B (en) | 2022-03-17 | 2022-03-17 | Gene editing fusion protein, adenine base editor constructed by same and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114835818A (en) | 2022-08-02 |
CN114835818B (en) | 2024-03-22 |
Family
ID=82561443
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210265179.8A Active CN114835818B (en) | 2022-03-17 | 2022-03-17 | Gene editing fusion protein, adenine base editor constructed by same and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114835818B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116751799B (en) * | 2023-06-14 | 2024-01-26 | Jiangnan University | Multi-site double-base editor and application thereof |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109295186A (en) * | 2018-09-30 | 2019-02-01 | Sun Yat-sen University | A method for detecting off-target effects of adenine single base editing system based on whole genome sequencing and its application in gene editing |
KR20190044157A (en) * | 2017-10-20 | 2019-04-30 | Gyeongsang National University Industry-Academic Cooperation Foundation | Composition for single base editing comprising adenine or adenosine deaminase as effective component and uses thereof |
CN109957569A (en) * | 2017-12-22 | 2019-07-02 | Institute of Genetics and Developmental Biology, Chinese Academy of Sciences | Base editing system and method based on CPF1 protein |
CN112080513A (en) * | 2020-09-16 | 2020-12-15 | Institute of Plant Protection, Chinese Academy of Agricultural Sciences | Rice artificial genome editing system with expanded editing range and application thereof |
CN112143753A (en) * | 2020-09-17 | 2020-12-29 | Institute of Plant Protection, Chinese Academy of Agricultural Sciences | Adenine base editor and related biological material and application thereof |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20190044157A (en) * | 2017-10-20 | 2019-04-30 | Gyeongsang National University Industry-Academic Cooperation Foundation | Composition for single base editing comprising adenine or adenosine deaminase as effective component and uses thereof |
CN109957569A (en) * | 2017-12-22 | 2019-07-02 | Institute of Genetics and Developmental Biology, Chinese Academy of Sciences | Base editing system and method based on CPF1 protein |
CN109295186A (en) * | 2018-09-30 | 2019-02-01 | Sun Yat-sen University | A method for detecting off-target effects of adenine single base editing system based on whole genome sequencing and its application in gene editing |
CN112080513A (en) * | 2020-09-16 | 2020-12-15 | Institute of Plant Protection, Chinese Academy of Agricultural Sciences | Rice artificial genome editing system with expanded editing range and application thereof |
CN112143753A (en) * | 2020-09-17 | 2020-12-29 | Institute of Plant Protection, Chinese Academy of Agricultural Sciences | Adenine base editor and related biological material and application thereof |
Non-Patent Citations (4)
Title |
---|
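High-efficiency and multiplex adenine base editing in plants using new TadA variants; Yan et al.; Mol. Plant.; Vol. 14; p. 724, left column, paragraph 2, and Figure 2A * |
Phage-assisted evolution of an adenine base editor with improved Cas domain compatibility and activity; Richter et al.; NATURE BIOTECHNOLOGY; Vol. 38; abstract, p. 886, right column, paragraphs 1-2, and additional information * |
Design, construction and application of CRISPR-based gene editing and expression regulation systems for Bacillus subtilis; Wu Yaokang; China Doctoral Dissertations Full-text Database (Electronic Journal), Basic Sciences; pp. 13-16 * |
Single-base gene editing technology based on the CRISPR/Cas9 system and its application in pharmaceutical research; Zhang Aixia; Zhao Yu; An Jing; Luo Ying; Chen Zhiguo; Chinese Journal of Pharmacology and Toxicology (07); pp. 607-514 * |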
Also Published As
Publication number | Publication date |
---|---|
CN114835818A (en) | 2022-08-02 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108384784A (en) | A method of knocking out Endoglin genes using CRISPR/Cas9 technologies | |
JP4469005B2 (en) | Artificial promoter library for selected organisms and promoters derived from the library | |
KR20210151916A (en) | AAV vector-mediated deletion of large mutant hotspots for the treatment of Duchenne muscular dystrophy. | |
CN113481136B (en) | Recombinant halophilic monad, construction method and application of catalyzing citric acid to prepare itaconic acid | |
CN108135949A (en) | Delivery vehicle | |
WO2018220616A2 (en) | Genetic systems that defend against foreign dna and uses thereof | |
CN110066829A (en) | A kind of CRISPR/Cas9 gene editing system and its application | |
CA2652689C (en) | Method of constructing gene transport support | |
CN114835818B (en) | Gene editing fusion protein, adenine base editor constructed by same and application thereof | |
US6558924B1 (en) | Recombinant expression of insulin C-peptide | |
WO2020169221A1 (en) | Production of plant-based active substances (e.g. cannabinoids) by recombinant microorganisms | |
CN112961832A (en) | Cell strain and preparation method and application thereof | |
CN109971789A (en) | A gene editing system and its application in Mycobacterium neoaureus | |
KR101831121B1 (en) | Nucleic acid structure containing a pyripyropene biosynthesis gene cluster and a marker gene | |
Bhattarai-Kline et al. | Reconstructing transcriptional histories by CRISPR acquisition of retron-based genetic barcodes | |
CN112195190B (en) | Replication element derived from Bacillus velesi plasmid and its application | |
KR102669217B1 (en) | Expression vector for use in methanogens | |
US20030027286A1 (en) | Bacterial promoters and methods of use | |
CN109913484A (en) | A kind of bidirectional expression T vector and its preparation method and application | |
CN115605589A (en) | Improved process for the production of isoprenoids | |
CN113151130A (en) | Genetically engineered bacterium and application thereof in preparation of isobutanol by bioconversion of methane | |
CN113462701B (en) | High-temperature polyphenol oxidase and application thereof in treatment of phenol-containing wastewater | |
CN115678908A (en) | A Bacillus subtilis multigene modular assembly and expression plasmid pBsubPB02 and its construction method | |
CN114507685A (en) | A Bacillus subtilis multigene modular assembly and inducible expression plasmid and its construction method | |
JP2000014388A (en) | Recombinant CRP and method for producing the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||