CN110846336B - Multifunctional fusion enzyme XAET, multifunctional fusion enzyme site-directed integration eukaryotic specific expression vector and construction method thereof
- Publication number
- CN110846336B (application CN201911170220.8A)
- Authority
- CN
- China
- Prior art keywords
- ala
- gly
- ser
- thr
- leu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 108090000790 Enzymes Proteins 0.000 title claims abstract description 102
- 102000004190 Enzymes Human genes 0.000 title claims abstract description 93
- 230000004927 fusion Effects 0.000 title claims abstract description 63
- 239000013604 expression vector Substances 0.000 title claims abstract description 25
- 230000010354 integration Effects 0.000 title claims abstract description 14
- 238000010276 construction Methods 0.000 title claims abstract description 13
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 92
- 229940088598 enzyme Drugs 0.000 claims abstract description 89
- 229940106157 cellulase Drugs 0.000 claims abstract description 39
- 230000014509 gene expression Effects 0.000 claims abstract description 37
- 101710121765 Endo-1,4-beta-xylanase Proteins 0.000 claims abstract description 26
- 229940085127 phytase Drugs 0.000 claims abstract description 26
- 230000009261 transgenic effect Effects 0.000 claims abstract description 24
- 241001465754 Metazoa Species 0.000 claims abstract description 16
- 108010075254 C-Peptide Proteins 0.000 claims abstract description 12
- 238000011144 upstream manufacturing Methods 0.000 claims abstract description 12
- 238000000034 method Methods 0.000 claims abstract description 10
- 102000004169 proteins and genes Human genes 0.000 claims description 24
- 239000013598 vector Substances 0.000 claims description 15
- 101710038256 CEP112 Proteins 0.000 claims description 9
- 102100033129 Centrosomal protein of 112 kDa Human genes 0.000 claims description 9
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 8
- 238000012546 transfer Methods 0.000 claims description 7
- 230000001105 regulatory effect Effects 0.000 claims description 5
- 238000012216 screening Methods 0.000 claims description 3
- 230000002194 synthesizing effect Effects 0.000 claims 1
- 230000000694 effects Effects 0.000 abstract description 36
- 108090000765 processed proteins & peptides Proteins 0.000 abstract description 30
- 108010059892 Cellulase Proteins 0.000 abstract description 26
- 102000004196 processed proteins & peptides Human genes 0.000 abstract description 22
- 230000004186 co-expression Effects 0.000 abstract description 21
- 108010011619 6-Phytase Proteins 0.000 abstract description 17
- 229920002678 cellulose Polymers 0.000 abstract description 10
- 239000001913 cellulose Substances 0.000 abstract description 10
- 229920001184 polypeptide Polymers 0.000 abstract description 9
- 230000000433 anti-nutritional effect Effects 0.000 abstract description 4
- 102000037865 fusion proteins Human genes 0.000 abstract description 4
- 108020001507 fusion proteins Proteins 0.000 abstract description 4
- IMQLKJBTEOYOSI-GPIVLXJGSA-N Inositol-hexakisphosphate Chemical compound OP(O)(=O)O[C@H]1[C@H](OP(O)(O)=O)[C@@H](OP(O)(O)=O)[C@H](OP(O)(O)=O)[C@H](OP(O)(O)=O)[C@@H]1OP(O)(O)=O IMQLKJBTEOYOSI-GPIVLXJGSA-N 0.000 abstract description 3
- IMQLKJBTEOYOSI-UHFFFAOYSA-N Phytic acid Natural products OP(O)(=O)OC1C(OP(O)(O)=O)C(OP(O)(O)=O)C(OP(O)(O)=O)C(OP(O)(O)=O)C1OP(O)(O)=O IMQLKJBTEOYOSI-UHFFFAOYSA-N 0.000 abstract description 3
- 230000007062 hydrolysis Effects 0.000 abstract description 3
- 238000006460 hydrolysis reaction Methods 0.000 abstract description 3
- 229940068041 phytic acid Drugs 0.000 abstract description 3
- 235000002949 phytic acid Nutrition 0.000 abstract description 3
- 239000000467 phytic acid Substances 0.000 abstract description 3
- 229920001221 xylan Polymers 0.000 abstract description 3
- 150000004823 xylans Chemical class 0.000 abstract description 3
- 230000002708 enhancing effect Effects 0.000 abstract description 2
- 229920001503 Glucan Polymers 0.000 abstract 1
- 230000015572 biosynthetic process Effects 0.000 description 15
- 101150050411 appA gene Proteins 0.000 description 14
- 229920001282 polysaccharide Polymers 0.000 description 14
- 239000005017 polysaccharide Substances 0.000 description 14
- 150000004804 polysaccharides Chemical class 0.000 description 14
- 238000003786 synthesis reaction Methods 0.000 description 14
- 101100148259 Actinobacillus pleuropneumoniae apxIIA gene Proteins 0.000 description 13
- 108010047495 alanylglycine Proteins 0.000 description 13
- 101100373342 Botryotinia fuckeliana (strain B05.10) xyn11A gene Proteins 0.000 description 12
- 101100506054 Cellulomonas fimi cex gene Proteins 0.000 description 12
- 241000282887 Suidae Species 0.000 description 12
- 101150021205 xlnB gene Proteins 0.000 description 12
- 101150031048 xynB gene Proteins 0.000 description 12
- 108020004414 DNA Proteins 0.000 description 10
- 241000282898 Sus scrofa Species 0.000 description 10
- 210000004027 cell Anatomy 0.000 description 10
- 229920002472 Starch Polymers 0.000 description 9
- 238000013461 design Methods 0.000 description 9
- 239000008107 starch Substances 0.000 description 9
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 8
- 108010076504 Protein Sorting Signals Proteins 0.000 description 7
- 238000003776 cleavage reaction Methods 0.000 description 7
- 235000019698 starch Nutrition 0.000 description 7
- WIDVAWAQBRAKTI-YUMQZZPRSA-N Asn-Leu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O WIDVAWAQBRAKTI-YUMQZZPRSA-N 0.000 description 6
- 108020004705 Codon Proteins 0.000 description 6
- KWTVLKBOQATPHJ-SRVKXCTJSA-N Leu-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N KWTVLKBOQATPHJ-SRVKXCTJSA-N 0.000 description 6
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 6
- 230000006870 function Effects 0.000 description 6
- 108010089804 glycyl-threonine Proteins 0.000 description 6
- 108010010147 glycylglutamine Proteins 0.000 description 6
- 108010061238 threonyl-glycine Proteins 0.000 description 6
- 108010051110 tyrosyl-lysine Proteins 0.000 description 6
- 108010047290 Multifunctional Enzymes Proteins 0.000 description 5
- 101800001494 Protease 2A Proteins 0.000 description 5
- 101800001066 Protein 2A Proteins 0.000 description 5
- 108010047857 aspartylglycine Proteins 0.000 description 5
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 5
- 108010057821 leucylproline Proteins 0.000 description 5
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 4
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 4
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 4
- PDQBXRSOSCTGKY-ACZMJKKPSA-N Asn-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N PDQBXRSOSCTGKY-ACZMJKKPSA-N 0.000 description 4
- KZYSHAMXEBPJBD-JRQIVUDYSA-N Asn-Thr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZYSHAMXEBPJBD-JRQIVUDYSA-N 0.000 description 4
- 102000004961 Furin Human genes 0.000 description 4
- 108090001126 Furin Proteins 0.000 description 4
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 4
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 4
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 4
- GJHWILMUOANXTG-WPRPVWTQSA-N Gly-Val-Arg Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GJHWILMUOANXTG-WPRPVWTQSA-N 0.000 description 4
- QHUREMVLLMNUAX-OSUNSFLBSA-N Ile-Thr-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)O)N QHUREMVLLMNUAX-OSUNSFLBSA-N 0.000 description 4
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 4
- 241000880493 Leptailurus serval Species 0.000 description 4
- IASQBRJGRVXNJI-YUMQZZPRSA-N Leu-Cys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)NCC(O)=O IASQBRJGRVXNJI-YUMQZZPRSA-N 0.000 description 4
- QJXHMYMRGDOHRU-NHCYSSNCSA-N Leu-Ile-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O QJXHMYMRGDOHRU-NHCYSSNCSA-N 0.000 description 4
- ZGGVHTQAPHVMKM-IHPCNDPISA-N Leu-Trp-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCCCN)C(=O)O)N ZGGVHTQAPHVMKM-IHPCNDPISA-N 0.000 description 4
- XOEDPXDZJHBQIX-ULQDDVLXSA-N Leu-Val-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XOEDPXDZJHBQIX-ULQDDVLXSA-N 0.000 description 4
- KNKHAVVBVXKOGX-JXUBOQSCSA-N Lys-Ala-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KNKHAVVBVXKOGX-JXUBOQSCSA-N 0.000 description 4
- KBTQZYASLSUFJR-KKUMJFAQSA-N Met-Phe-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N KBTQZYASLSUFJR-KKUMJFAQSA-N 0.000 description 4
- 102000006833 Multifunctional Enzymes Human genes 0.000 description 4
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 4
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 4
- SWIKDOUVROTZCW-GCJQMDKQSA-N Thr-Asn-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O SWIKDOUVROTZCW-GCJQMDKQSA-N 0.000 description 4
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 4
- MROIJTGJGIDEEJ-RCWTZXSCSA-N Thr-Pro-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 MROIJTGJGIDEEJ-RCWTZXSCSA-N 0.000 description 4
- FWTFAZKJORVTIR-VZFHVOOUSA-N Thr-Ser-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O FWTFAZKJORVTIR-VZFHVOOUSA-N 0.000 description 4
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 4
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 4
- 108010044940 alanylglutamine Proteins 0.000 description 4
- 108010008355 arginyl-glutamine Proteins 0.000 description 4
- 108010068265 aspartyltyrosine Proteins 0.000 description 4
- 230000001588 bifunctional effect Effects 0.000 description 4
- 239000000835 fiber Substances 0.000 description 4
- 108010078144 glutaminyl-glycine Proteins 0.000 description 4
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 4
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 4
- 108010050848 glycylleucine Proteins 0.000 description 4
- 108010084389 glycyltryptophan Proteins 0.000 description 4
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 4
- 108010034529 leucyl-lysine Proteins 0.000 description 4
- 235000015097 nutrients Nutrition 0.000 description 4
- 239000000047 product Substances 0.000 description 4
- 108010087846 prolyl-prolyl-glycine Proteins 0.000 description 4
- 108010031719 prolyl-serine Proteins 0.000 description 4
- 108010090894 prolylleucine Proteins 0.000 description 4
- 230000007017 scission Effects 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 3
- OTOXOKCIIQLMFH-KZVJFYERSA-N Arg-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N OTOXOKCIIQLMFH-KZVJFYERSA-N 0.000 description 3
- HMHRTKOWRUPPNU-RCOVLWMOSA-N Gly-Ile-Gly Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O HMHRTKOWRUPPNU-RCOVLWMOSA-N 0.000 description 3
- HTDRTKMNJRRYOJ-SIUGBPQLSA-N Ile-Gln-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HTDRTKMNJRRYOJ-SIUGBPQLSA-N 0.000 description 3
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 3
- 101710110284 Nuclear shuttle protein Proteins 0.000 description 3
- OAICVXFJPJFONN-UHFFFAOYSA-N Phosphorus Chemical compound [P] OAICVXFJPJFONN-UHFFFAOYSA-N 0.000 description 3
- UTAUEDINXUMHLG-FXQIFTODSA-N Pro-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 UTAUEDINXUMHLG-FXQIFTODSA-N 0.000 description 3
- 102100022647 Reticulon-1 Human genes 0.000 description 3
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 3
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 3
- 235000019764 Soybean Meal Nutrition 0.000 description 3
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 3
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 3
- 108010077245 asparaginyl-proline Proteins 0.000 description 3
- 210000002421 cell wall Anatomy 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 229940079919 digestives enzyme preparation Drugs 0.000 description 3
- 108010087823 glycyltyrosine Proteins 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 229910052698 phosphorus Inorganic materials 0.000 description 3
- 239000011574 phosphorus Substances 0.000 description 3
- 239000013612 plasmid Substances 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 239000004455 soybean meal Substances 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 238000001890 transfection Methods 0.000 description 3
- 238000012795 verification Methods 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 2
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 2
- TTXMOJWKNRJWQJ-FXQIFTODSA-N Ala-Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N TTXMOJWKNRJWQJ-FXQIFTODSA-N 0.000 description 2
- DWINFPQUSSHSFS-UVBJJODRSA-N Ala-Arg-Trp Chemical compound N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=CC=CC=C12)C(=O)O DWINFPQUSSHSFS-UVBJJODRSA-N 0.000 description 2
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 2
- CXQODNIBUNQWAS-CIUDSAMLSA-N Ala-Gln-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CXQODNIBUNQWAS-CIUDSAMLSA-N 0.000 description 2
- CVHJIWVKTFNGHT-ACZMJKKPSA-N Ala-Gln-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N CVHJIWVKTFNGHT-ACZMJKKPSA-N 0.000 description 2
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 2
- MPLOSMWGDNJSEV-WHFBIAKZSA-N Ala-Gly-Asp Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MPLOSMWGDNJSEV-WHFBIAKZSA-N 0.000 description 2
- PEIBBAXIKUAYGN-UBHSHLNASA-N Ala-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 PEIBBAXIKUAYGN-UBHSHLNASA-N 0.000 description 2
- RUXQNKVQSKOOBS-JURCDPSOSA-N Ala-Phe-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RUXQNKVQSKOOBS-JURCDPSOSA-N 0.000 description 2
- FFZJHQODAYHGPO-KZVJFYERSA-N Ala-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N FFZJHQODAYHGPO-KZVJFYERSA-N 0.000 description 2
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 2
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 2
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 2
- LSMDIAAALJJLRO-XQXXSGGOSA-N Ala-Thr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LSMDIAAALJJLRO-XQXXSGGOSA-N 0.000 description 2
- KUFVXLQLDHJVOG-SHGPDSBTSA-N Ala-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C)N)O KUFVXLQLDHJVOG-SHGPDSBTSA-N 0.000 description 2
- KLKARCOHVHLAJP-UWJYBYFXSA-N Ala-Tyr-Cys Chemical compound C[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CS)C(O)=O KLKARCOHVHLAJP-UWJYBYFXSA-N 0.000 description 2
- ZJLORAAXDAJLDC-CQDKDKBSSA-N Ala-Tyr-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O ZJLORAAXDAJLDC-CQDKDKBSSA-N 0.000 description 2
- ZCUFMRIQCPNOHZ-NRPADANISA-N Ala-Val-Gln Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N ZCUFMRIQCPNOHZ-NRPADANISA-N 0.000 description 2
- BEXGZLUHRXTZCC-CIUDSAMLSA-N Arg-Gln-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)CN=C(N)N BEXGZLUHRXTZCC-CIUDSAMLSA-N 0.000 description 2
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 2
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 2
- WVNFNPGXYADPPO-BQBZGAKWSA-N Arg-Gly-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O WVNFNPGXYADPPO-BQBZGAKWSA-N 0.000 description 2
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 2
- RIQBRKVTFBWEDY-RHYQMDGZSA-N Arg-Lys-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RIQBRKVTFBWEDY-RHYQMDGZSA-N 0.000 description 2
- XSPKAHFVDKRGRL-DCAQKATOSA-N Arg-Pro-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XSPKAHFVDKRGRL-DCAQKATOSA-N 0.000 description 2
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 2
- HAJWYALLJIATCX-FXQIFTODSA-N Asn-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N HAJWYALLJIATCX-FXQIFTODSA-N 0.000 description 2
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 2
- IOTKDTZEEBZNCM-UGYAYLCHSA-N Asn-Asn-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOTKDTZEEBZNCM-UGYAYLCHSA-N 0.000 description 2
- WVCJSDCHTUTONA-FXQIFTODSA-N Asn-Asp-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WVCJSDCHTUTONA-FXQIFTODSA-N 0.000 description 2
- NKLRWRRVYGQNIH-GHCJXIJMSA-N Asn-Ile-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O NKLRWRRVYGQNIH-GHCJXIJMSA-N 0.000 description 2
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 2
- ALHMNHZJBYBYHS-DCAQKATOSA-N Asn-Lys-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ALHMNHZJBYBYHS-DCAQKATOSA-N 0.000 description 2
- BKFXFUPYETWGGA-XVSYOHENSA-N Asn-Phe-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BKFXFUPYETWGGA-XVSYOHENSA-N 0.000 description 2
- GKKUBLFXKRDMFC-BQBZGAKWSA-N Asn-Pro-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O GKKUBLFXKRDMFC-BQBZGAKWSA-N 0.000 description 2
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 2
- VCJCPARXDBEGNE-GUBZILKMSA-N Asn-Pro-Pro Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 VCJCPARXDBEGNE-GUBZILKMSA-N 0.000 description 2
- JWQWPRCDYWNVNM-ACZMJKKPSA-N Asn-Ser-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N JWQWPRCDYWNVNM-ACZMJKKPSA-N 0.000 description 2
- HPBNLFLSSQDFQW-WHFBIAKZSA-N Asn-Ser-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O HPBNLFLSSQDFQW-WHFBIAKZSA-N 0.000 description 2
- XHTUGJCAEYOZOR-UBHSHLNASA-N Asn-Ser-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O XHTUGJCAEYOZOR-UBHSHLNASA-N 0.000 description 2
- XCBKBPRFACFFOO-AQZXSJQPSA-N Asn-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O XCBKBPRFACFFOO-AQZXSJQPSA-N 0.000 description 2
- LTDGPJKGJDIBQD-LAEOZQHASA-N Asn-Val-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LTDGPJKGJDIBQD-LAEOZQHASA-N 0.000 description 2
- CBHVAFXKOYAHOY-NHCYSSNCSA-N Asn-Val-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O CBHVAFXKOYAHOY-NHCYSSNCSA-N 0.000 description 2
- XOQYDFCQPWAMSA-KKHAAJSZSA-N Asn-Val-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOQYDFCQPWAMSA-KKHAAJSZSA-N 0.000 description 2
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 2
- YNQIDCRRTWGHJD-ZLUOBGJFSA-N Asp-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(O)=O YNQIDCRRTWGHJD-ZLUOBGJFSA-N 0.000 description 2
- XACXDSRQIXRMNS-OLHMAJIHSA-N Asp-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)O XACXDSRQIXRMNS-OLHMAJIHSA-N 0.000 description 2
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 2
- AMRANMVXQWXNAH-ZLUOBGJFSA-N Asp-Cys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CC(O)=O AMRANMVXQWXNAH-ZLUOBGJFSA-N 0.000 description 2
- WLKVEEODTPQPLI-ACZMJKKPSA-N Asp-Gln-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O WLKVEEODTPQPLI-ACZMJKKPSA-N 0.000 description 2
- DTNUIAJCPRMNBT-WHFBIAKZSA-N Asp-Gly-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O DTNUIAJCPRMNBT-WHFBIAKZSA-N 0.000 description 2
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 2
- KHGPWGKPYHPOIK-QWRGUYRKSA-N Asp-Gly-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KHGPWGKPYHPOIK-QWRGUYRKSA-N 0.000 description 2
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 2
- ICZWAZVKLACMKR-CIUDSAMLSA-N Asp-His-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 ICZWAZVKLACMKR-CIUDSAMLSA-N 0.000 description 2
- SPWXXPFDTMYTRI-IUKAMOBKSA-N Asp-Ile-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SPWXXPFDTMYTRI-IUKAMOBKSA-N 0.000 description 2
- VMVUDJUXJKDGNR-FXQIFTODSA-N Asp-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N VMVUDJUXJKDGNR-FXQIFTODSA-N 0.000 description 2
- YFGUZQQCSDZRBN-DCAQKATOSA-N Asp-Pro-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O YFGUZQQCSDZRBN-DCAQKATOSA-N 0.000 description 2
- CUQDCPXNZPDYFQ-ZLUOBGJFSA-N Asp-Ser-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O CUQDCPXNZPDYFQ-ZLUOBGJFSA-N 0.000 description 2
- YIDFBWRHIYOYAA-LKXGYXEUSA-N Asp-Ser-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YIDFBWRHIYOYAA-LKXGYXEUSA-N 0.000 description 2
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 2
- LLRJPYJQNBMOOO-QEJZJMRPSA-N Asp-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N LLRJPYJQNBMOOO-QEJZJMRPSA-N 0.000 description 2
- NJLLRXWFPQQPHV-SRVKXCTJSA-N Asp-Tyr-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O NJLLRXWFPQQPHV-SRVKXCTJSA-N 0.000 description 2
- BYLPQJAWXJWUCJ-YDHLFZDLSA-N Asp-Tyr-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O BYLPQJAWXJWUCJ-YDHLFZDLSA-N 0.000 description 2
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 2
- XMKXONRMGJXCJV-LAEOZQHASA-N Asp-Val-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XMKXONRMGJXCJV-LAEOZQHASA-N 0.000 description 2
- XQFLFQWOBXPMHW-NHCYSSNCSA-N Asp-Val-His Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O XQFLFQWOBXPMHW-NHCYSSNCSA-N 0.000 description 2
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 2
- 102100021526 BPI fold-containing family A member 2 Human genes 0.000 description 2
- 101710193979 BPI fold-containing family A member 2 Proteins 0.000 description 2
- 241000283690 Bos taurus Species 0.000 description 2
- WKELHWMCIXSVDT-UBHSHLNASA-N Cys-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N WKELHWMCIXSVDT-UBHSHLNASA-N 0.000 description 2
- BPHKULHWEIUDOB-FXQIFTODSA-N Cys-Gln-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O BPHKULHWEIUDOB-FXQIFTODSA-N 0.000 description 2
- YZKOXEJTLWZOQL-GUBZILKMSA-N Cys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CS)N YZKOXEJTLWZOQL-GUBZILKMSA-N 0.000 description 2
- UDPSLLFHOLGXBY-FXQIFTODSA-N Cys-Glu-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDPSLLFHOLGXBY-FXQIFTODSA-N 0.000 description 2
- HKALUUKHYNEDRS-GUBZILKMSA-N Cys-Leu-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HKALUUKHYNEDRS-GUBZILKMSA-N 0.000 description 2
- UCSXXFRXHGUXCQ-SRVKXCTJSA-N Cys-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CS)N UCSXXFRXHGUXCQ-SRVKXCTJSA-N 0.000 description 2
- GGRDJANMZPGMNS-CIUDSAMLSA-N Cys-Ser-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O GGRDJANMZPGMNS-CIUDSAMLSA-N 0.000 description 2
- NAPULYCVEVVFRB-HEIBUPTGSA-N Cys-Thr-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CS NAPULYCVEVVFRB-HEIBUPTGSA-N 0.000 description 2
- 108010090461 DFG peptide Proteins 0.000 description 2
- 108010001682 Dextranase Proteins 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- LKUWAWGNJYJODH-KBIXCLLPSA-N Gln-Ala-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LKUWAWGNJYJODH-KBIXCLLPSA-N 0.000 description 2
- WOACHWLUOFZLGJ-GUBZILKMSA-N Gln-Arg-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O WOACHWLUOFZLGJ-GUBZILKMSA-N 0.000 description 2
- DTMLKCYOQKZXKZ-HJGDQZAQSA-N Gln-Arg-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DTMLKCYOQKZXKZ-HJGDQZAQSA-N 0.000 description 2
- IKDOHQHEFPPGJG-FXQIFTODSA-N Gln-Asp-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IKDOHQHEFPPGJG-FXQIFTODSA-N 0.000 description 2
- SOIAHPSKKUYREP-CIUDSAMLSA-N Gln-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N SOIAHPSKKUYREP-CIUDSAMLSA-N 0.000 description 2
- IXFVOPOHSRKJNG-LAEOZQHASA-N Gln-Asp-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IXFVOPOHSRKJNG-LAEOZQHASA-N 0.000 description 2
- ALUBSZXSNSPDQV-WDSKDSINSA-N Gln-Cys-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O ALUBSZXSNSPDQV-WDSKDSINSA-N 0.000 description 2
- NPTGGVQJYRSMCM-GLLZPBPUSA-N Gln-Gln-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPTGGVQJYRSMCM-GLLZPBPUSA-N 0.000 description 2
- DWDBJWAXPXXYLP-SRVKXCTJSA-N Gln-His-Arg Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DWDBJWAXPXXYLP-SRVKXCTJSA-N 0.000 description 2
- FFVXLVGUJBCKRX-UKJIMTQDSA-N Gln-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCC(=O)N)N FFVXLVGUJBCKRX-UKJIMTQDSA-N 0.000 description 2
- KHNJVFYHIKLUPD-SRVKXCTJSA-N Gln-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KHNJVFYHIKLUPD-SRVKXCTJSA-N 0.000 description 2
- CELXWPDNIGWCJN-WDCWCFNPSA-N Gln-Lys-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CELXWPDNIGWCJN-WDCWCFNPSA-N 0.000 description 2
- NMYFPKCIGUJMIK-GUBZILKMSA-N Gln-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N NMYFPKCIGUJMIK-GUBZILKMSA-N 0.000 description 2
- DRNMNLKUUKKPIA-HTUGSXCWSA-N Gln-Phe-Thr Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)CCC(N)=O)C(O)=O DRNMNLKUUKKPIA-HTUGSXCWSA-N 0.000 description 2
- RNPGPFAVRLERPP-QEJZJMRPSA-N Gln-Trp-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O RNPGPFAVRLERPP-QEJZJMRPSA-N 0.000 description 2
- UBRQJXFDVZNYJP-AVGNSLFASA-N Gln-Tyr-Ser Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UBRQJXFDVZNYJP-AVGNSLFASA-N 0.000 description 2
- HGBHRZBXOOHRDH-JBACZVJFSA-N Gln-Tyr-Trp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O HGBHRZBXOOHRDH-JBACZVJFSA-N 0.000 description 2
- ZZLDMBMFKZFQMU-NRPADANISA-N Gln-Val-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O ZZLDMBMFKZFQMU-NRPADANISA-N 0.000 description 2
- QGWXAMDECCKGRU-XVKPBYJWSA-N Gln-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(N)=O)C(=O)NCC(O)=O QGWXAMDECCKGRU-XVKPBYJWSA-N 0.000 description 2
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 2
- ATRHMOJQJWPVBQ-DRZSPHRISA-N Glu-Ala-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ATRHMOJQJWPVBQ-DRZSPHRISA-N 0.000 description 2
- SRZLHYPAOXBBSB-HJGDQZAQSA-N Glu-Arg-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SRZLHYPAOXBBSB-HJGDQZAQSA-N 0.000 description 2
- XXCDTYBVGMPIOA-FXQIFTODSA-N Glu-Asp-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XXCDTYBVGMPIOA-FXQIFTODSA-N 0.000 description 2
- VMKCPNBBPGGQBJ-GUBZILKMSA-N Glu-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N VMKCPNBBPGGQBJ-GUBZILKMSA-N 0.000 description 2
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 2
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 2
- WXONSNSSBYQGNN-AVGNSLFASA-N Glu-Ser-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WXONSNSSBYQGNN-AVGNSLFASA-N 0.000 description 2
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 2
- NEDQVOQDDBCRGG-UHFFFAOYSA-N Gly Gly Thr Tyr Chemical compound NCC(=O)NCC(=O)NC(C(O)C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 NEDQVOQDDBCRGG-UHFFFAOYSA-N 0.000 description 2
- YMUFWNJHVPQNQD-ZKWXMUAHSA-N Gly-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN YMUFWNJHVPQNQD-ZKWXMUAHSA-N 0.000 description 2
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 2
- QIZJOTQTCAGKPU-KWQFWETISA-N Gly-Ala-Tyr Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 QIZJOTQTCAGKPU-KWQFWETISA-N 0.000 description 2
- OCQUNKSFDYDXBG-QXEWZRGKSA-N Gly-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OCQUNKSFDYDXBG-QXEWZRGKSA-N 0.000 description 2
- GWCRIHNSVMOBEQ-BQBZGAKWSA-N Gly-Arg-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O GWCRIHNSVMOBEQ-BQBZGAKWSA-N 0.000 description 2
- CIMULJZTTOBOPN-WHFBIAKZSA-N Gly-Asn-Asn Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CIMULJZTTOBOPN-WHFBIAKZSA-N 0.000 description 2
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 2
- XEJTYSCIXKYSHR-WDSKDSINSA-N Gly-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN XEJTYSCIXKYSHR-WDSKDSINSA-N 0.000 description 2
- TZOVVRJYUDETQG-RCOVLWMOSA-N Gly-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN TZOVVRJYUDETQG-RCOVLWMOSA-N 0.000 description 2
- IXKRSKPKSLXIHN-YUMQZZPRSA-N Gly-Cys-Leu Chemical compound [H]NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O IXKRSKPKSLXIHN-YUMQZZPRSA-N 0.000 description 2
- PABFFPWEJMEVEC-JGVFFNPUSA-N Gly-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)CN)C(=O)O PABFFPWEJMEVEC-JGVFFNPUSA-N 0.000 description 2
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 2
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 2
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 2
- QPTNELDXWKRIFX-YFKPBYRVSA-N Gly-Gly-Gln Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O QPTNELDXWKRIFX-YFKPBYRVSA-N 0.000 description 2
- XMPXVJIDADUOQB-RCOVLWMOSA-N Gly-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)CNC(=O)C[NH3+] XMPXVJIDADUOQB-RCOVLWMOSA-N 0.000 description 2
- BUEFQXUHTUZXHR-LURJTMIESA-N Gly-Gly-Pro zwitterion Chemical compound NCC(=O)NCC(=O)N1CCC[C@H]1C(O)=O BUEFQXUHTUZXHR-LURJTMIESA-N 0.000 description 2
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 2
- VAXIVIPMCTYSHI-YUMQZZPRSA-N Gly-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN VAXIVIPMCTYSHI-YUMQZZPRSA-N 0.000 description 2
- YNIMVVJTPWCUJH-KBPBESRZSA-N Gly-His-Tyr Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YNIMVVJTPWCUJH-KBPBESRZSA-N 0.000 description 2
- CVFOYJJOZYYEPE-KBPBESRZSA-N Gly-Lys-Tyr Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CVFOYJJOZYYEPE-KBPBESRZSA-N 0.000 description 2
- QLQDIJBYJZKQPR-BQBZGAKWSA-N Gly-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN QLQDIJBYJZKQPR-BQBZGAKWSA-N 0.000 description 2
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 2
- YYXJFBMCOUSYSF-RYUDHWBXSA-N Gly-Phe-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYXJFBMCOUSYSF-RYUDHWBXSA-N 0.000 description 2
- WNZOCXUOGVYYBJ-CDMKHQONSA-N Gly-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)CN)O WNZOCXUOGVYYBJ-CDMKHQONSA-N 0.000 description 2
- FFJQHWKSGAWSTJ-BFHQHQDPSA-N Gly-Thr-Ala Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O FFJQHWKSGAWSTJ-BFHQHQDPSA-N 0.000 description 2
- NWOSHVVPKDQKKT-RYUDHWBXSA-N Gly-Tyr-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O NWOSHVVPKDQKKT-RYUDHWBXSA-N 0.000 description 2
- RIYIFUFFFBIOEU-KBPBESRZSA-N Gly-Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 RIYIFUFFFBIOEU-KBPBESRZSA-N 0.000 description 2
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 2
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 2
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 2
- 229920002488 Hemicellulose Polymers 0.000 description 2
- HDXNWVLQSQFJOX-SRVKXCTJSA-N His-Arg-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HDXNWVLQSQFJOX-SRVKXCTJSA-N 0.000 description 2
- KYMUEAZVLPRVAE-GUBZILKMSA-N His-Asn-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KYMUEAZVLPRVAE-GUBZILKMSA-N 0.000 description 2
- VOKCBYNCZVSILJ-KKUMJFAQSA-N His-Asn-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CN=CN2)N)O VOKCBYNCZVSILJ-KKUMJFAQSA-N 0.000 description 2
- KYFGGRHWLFZXPU-KKUMJFAQSA-N His-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N KYFGGRHWLFZXPU-KKUMJFAQSA-N 0.000 description 2
- VDHOMPFVSABJKU-ULQDDVLXSA-N His-Phe-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC2=CN=CN2)N VDHOMPFVSABJKU-ULQDDVLXSA-N 0.000 description 2
- VSZALHITQINTGC-GHCJXIJMSA-N Ile-Ala-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VSZALHITQINTGC-GHCJXIJMSA-N 0.000 description 2
- BOTVMTSMOUSDRW-GMOBBJLQSA-N Ile-Arg-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(O)=O BOTVMTSMOUSDRW-GMOBBJLQSA-N 0.000 description 2
- JXMSHKFPDIUYGS-SIUGBPQLSA-N Ile-Glu-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N JXMSHKFPDIUYGS-SIUGBPQLSA-N 0.000 description 2
- IGJWJGIHUFQANP-LAEOZQHASA-N Ile-Gly-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N IGJWJGIHUFQANP-LAEOZQHASA-N 0.000 description 2
- WIZPFZKOFZXDQG-HTFCKZLJSA-N Ile-Ile-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O WIZPFZKOFZXDQG-HTFCKZLJSA-N 0.000 description 2
- RVNOXPZHMUWCLW-GMOBBJLQSA-N Ile-Met-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N RVNOXPZHMUWCLW-GMOBBJLQSA-N 0.000 description 2
- IITVUURPOYGCTD-NAKRPEOUSA-N Ile-Pro-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IITVUURPOYGCTD-NAKRPEOUSA-N 0.000 description 2
- KCTIFOCXAIUQQK-QXEWZRGKSA-N Ile-Pro-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O KCTIFOCXAIUQQK-QXEWZRGKSA-N 0.000 description 2
- JZNVOBUNTWNZPW-GHCJXIJMSA-N Ile-Ser-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N JZNVOBUNTWNZPW-GHCJXIJMSA-N 0.000 description 2
- PELCGFMHLZXWBQ-BJDJZHNGSA-N Ile-Ser-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)O)N PELCGFMHLZXWBQ-BJDJZHNGSA-N 0.000 description 2
- NGKPIPCGMLWHBX-WZLNRYEVSA-N Ile-Tyr-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N NGKPIPCGMLWHBX-WZLNRYEVSA-N 0.000 description 2
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 2
- UGTHTQWIQKEDEH-BQBZGAKWSA-N L-alanyl-L-prolylglycine zwitterion Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UGTHTQWIQKEDEH-BQBZGAKWSA-N 0.000 description 2
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 2
- TYYLDKGBCJGJGW-UHFFFAOYSA-N L-tryptophan-L-tyrosine Natural products C=1NC2=CC=CC=C2C=1CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 TYYLDKGBCJGJGW-UHFFFAOYSA-N 0.000 description 2
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 2
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 2
- WSGXUIQTEZDVHJ-GARJFASQSA-N Leu-Ala-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O WSGXUIQTEZDVHJ-GARJFASQSA-N 0.000 description 2
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 2
- KAFOIVJDVSZUMD-DCAQKATOSA-N Leu-Gln-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-DCAQKATOSA-N 0.000 description 2
- KAFOIVJDVSZUMD-UHFFFAOYSA-N Leu-Gln-Gln Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)NC(CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-UHFFFAOYSA-N 0.000 description 2
- WQWSMEOYXJTFRU-GUBZILKMSA-N Leu-Glu-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O WQWSMEOYXJTFRU-GUBZILKMSA-N 0.000 description 2
- QPXBPQUGXHURGP-UWVGGRQHSA-N Leu-Gly-Met Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CCSC)C(=O)O)N QPXBPQUGXHURGP-UWVGGRQHSA-N 0.000 description 2
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 2
- TVEOVCYCYGKVPP-HSCHXYMDSA-N Leu-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(C)C)N TVEOVCYCYGKVPP-HSCHXYMDSA-N 0.000 description 2
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 2
- ZGUMORRUBUCXEH-AVGNSLFASA-N Leu-Lys-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZGUMORRUBUCXEH-AVGNSLFASA-N 0.000 description 2
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 2
- PWPBLZXWFXJFHE-RHYQMDGZSA-N Leu-Pro-Thr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O PWPBLZXWFXJFHE-RHYQMDGZSA-N 0.000 description 2
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 2
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 2
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 2
- LCNASHSOFMRYFO-WDCWCFNPSA-N Leu-Thr-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O LCNASHSOFMRYFO-WDCWCFNPSA-N 0.000 description 2
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 2
- VDIARPPNADFEAV-WEDXCCLWSA-N Leu-Thr-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O VDIARPPNADFEAV-WEDXCCLWSA-N 0.000 description 2
- WFCKERTZVCQXKH-KBPBESRZSA-N Leu-Tyr-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O WFCKERTZVCQXKH-KBPBESRZSA-N 0.000 description 2
- MVJRBCJCRYGCKV-GVXVVHGQSA-N Leu-Val-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MVJRBCJCRYGCKV-GVXVVHGQSA-N 0.000 description 2
- NTXYXFDMIHXTHE-WDSOQIARSA-N Leu-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 NTXYXFDMIHXTHE-WDSOQIARSA-N 0.000 description 2
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 2
- DFXQCCBKGUNYGG-GUBZILKMSA-N Lys-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN DFXQCCBKGUNYGG-GUBZILKMSA-N 0.000 description 2
- DKTNGXVSCZULPO-YUMQZZPRSA-N Lys-Gly-Cys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CS)C(O)=O DKTNGXVSCZULPO-YUMQZZPRSA-N 0.000 description 2
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 2
- ZASPELYMPSACER-HOCLYGCPSA-N Lys-Gly-Trp Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ZASPELYMPSACER-HOCLYGCPSA-N 0.000 description 2
- VMTYLUGCXIEDMV-QWRGUYRKSA-N Lys-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN VMTYLUGCXIEDMV-QWRGUYRKSA-N 0.000 description 2
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 2
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 2
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 2
- MIMXMVDLMDMOJD-BZSNNMDCSA-N Lys-Tyr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O MIMXMVDLMDMOJD-BZSNNMDCSA-N 0.000 description 2
- BVXXDMUMHMXFER-BPNCWPANSA-N Met-Ala-Tyr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVXXDMUMHMXFER-BPNCWPANSA-N 0.000 description 2
- CWFYZYQMUDWGTI-GUBZILKMSA-N Met-Arg-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O CWFYZYQMUDWGTI-GUBZILKMSA-N 0.000 description 2
- DCHHUGLTVLJYKA-FXQIFTODSA-N Met-Asn-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O DCHHUGLTVLJYKA-FXQIFTODSA-N 0.000 description 2
- SQUTUWHAAWJYES-GUBZILKMSA-N Met-Asp-Arg Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SQUTUWHAAWJYES-GUBZILKMSA-N 0.000 description 2
- DNDVVILEHVMWIS-LPEHRKFASA-N Met-Asp-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N DNDVVILEHVMWIS-LPEHRKFASA-N 0.000 description 2
- BQHLZUMZOXUWNU-DCAQKATOSA-N Met-Pro-Glu Chemical compound CSCC[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BQHLZUMZOXUWNU-DCAQKATOSA-N 0.000 description 2
- WXJLBSXNUHIGSS-OSUNSFLBSA-N Met-Thr-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WXJLBSXNUHIGSS-OSUNSFLBSA-N 0.000 description 2
- 241000699660 Mus musculus Species 0.000 description 2
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 2
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 2
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 2
- 108010066427 N-valyltryptophan Proteins 0.000 description 2
- 241000238814 Orthoptera Species 0.000 description 2
- AGYXCMYVTBYGCT-ULQDDVLXSA-N Phe-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O AGYXCMYVTBYGCT-ULQDDVLXSA-N 0.000 description 2
- NKLDZIPTGKBDBB-HTUGSXCWSA-N Phe-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N)O NKLDZIPTGKBDBB-HTUGSXCWSA-N 0.000 description 2
- MMYUOSCXBJFUNV-QWRGUYRKSA-N Phe-Gly-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N MMYUOSCXBJFUNV-QWRGUYRKSA-N 0.000 description 2
- VZFPYFRVHMSSNA-JURCDPSOSA-N Phe-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=CC=C1 VZFPYFRVHMSSNA-JURCDPSOSA-N 0.000 description 2
- CZQZSMJXFGGBHM-KKUMJFAQSA-N Phe-Pro-Gln Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O CZQZSMJXFGGBHM-KKUMJFAQSA-N 0.000 description 2
- UNBFGVQVQGXXCK-KKUMJFAQSA-N Phe-Ser-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O UNBFGVQVQGXXCK-KKUMJFAQSA-N 0.000 description 2
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 2
- BSKMOCNNLNDIMU-CDMKHQONSA-N Phe-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O BSKMOCNNLNDIMU-CDMKHQONSA-N 0.000 description 2
- SHUFSZDAIPLZLF-BEAPCOKYSA-N Phe-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O SHUFSZDAIPLZLF-BEAPCOKYSA-N 0.000 description 2
- LKRUQZQZMXMKEQ-SFJXLCSZSA-N Phe-Trp-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LKRUQZQZMXMKEQ-SFJXLCSZSA-N 0.000 description 2
- ZYNBEWGJFXTBDU-ACRUOGEOSA-N Phe-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N ZYNBEWGJFXTBDU-ACRUOGEOSA-N 0.000 description 2
- RGMLUHANLDVMPB-ULQDDVLXSA-N Phe-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGMLUHANLDVMPB-ULQDDVLXSA-N 0.000 description 2
- APZNYJFGVAGFCF-JYJNAYRXSA-N Phe-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccccc1)C(C)C)C(O)=O APZNYJFGVAGFCF-JYJNAYRXSA-N 0.000 description 2
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 2
- LCRSGSIRKLXZMZ-BPNCWPANSA-N Pro-Ala-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LCRSGSIRKLXZMZ-BPNCWPANSA-N 0.000 description 2
- DIFXZGPHVCIVSQ-CIUDSAMLSA-N Pro-Gln-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DIFXZGPHVCIVSQ-CIUDSAMLSA-N 0.000 description 2
- VPEVBAUSTBWQHN-NHCYSSNCSA-N Pro-Glu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O VPEVBAUSTBWQHN-NHCYSSNCSA-N 0.000 description 2
- UUHXBJHVTVGSKM-BQBZGAKWSA-N Pro-Gly-Asn Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UUHXBJHVTVGSKM-BQBZGAKWSA-N 0.000 description 2
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 2
- FFSLAIOXRMOFIZ-GJZGRUSLSA-N Pro-Gly-Trp Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)O)C(=O)CNC(=O)[C@@H]1CCCN1 FFSLAIOXRMOFIZ-GJZGRUSLSA-N 0.000 description 2
- XYSXOCIWCPFOCG-IHRRRGAJSA-N Pro-Leu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XYSXOCIWCPFOCG-IHRRRGAJSA-N 0.000 description 2
- MRYUJHGPZQNOAD-IHRRRGAJSA-N Pro-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 MRYUJHGPZQNOAD-IHRRRGAJSA-N 0.000 description 2
- DRKAXLDECUGLFE-ULQDDVLXSA-N Pro-Leu-Phe Chemical compound CC(C)C[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O DRKAXLDECUGLFE-ULQDDVLXSA-N 0.000 description 2
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 2
- ABSSTGUCBCDKMU-UWVGGRQHSA-N Pro-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 ABSSTGUCBCDKMU-UWVGGRQHSA-N 0.000 description 2
- FYKUEXMZYFIZKA-DCAQKATOSA-N Pro-Pro-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O FYKUEXMZYFIZKA-DCAQKATOSA-N 0.000 description 2
- LEIKGVHQTKHOLM-IUCAKERBSA-N Pro-Pro-Gly Chemical compound OC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 LEIKGVHQTKHOLM-IUCAKERBSA-N 0.000 description 2
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 2
- MKGIILKDUGDRRO-FXQIFTODSA-N Pro-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 MKGIILKDUGDRRO-FXQIFTODSA-N 0.000 description 2
- GZNYIXWOIUFLGO-ZJDVBMNYSA-N Pro-Thr-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZNYIXWOIUFLGO-ZJDVBMNYSA-N 0.000 description 2
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 2
- FUOGXAQMNJMBFG-WPRPVWTQSA-N Pro-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FUOGXAQMNJMBFG-WPRPVWTQSA-N 0.000 description 2
- 238000001190 Q-PCR Methods 0.000 description 2
- 108010003201 RGH 0205 Proteins 0.000 description 2
- BKOKTRCZXRIQPX-ZLUOBGJFSA-N Ser-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N BKOKTRCZXRIQPX-ZLUOBGJFSA-N 0.000 description 2
- MMGJPDWSIOAGTH-ACZMJKKPSA-N Ser-Ala-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MMGJPDWSIOAGTH-ACZMJKKPSA-N 0.000 description 2
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 2
- HRNQLKCLPVKZNE-CIUDSAMLSA-N Ser-Ala-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O HRNQLKCLPVKZNE-CIUDSAMLSA-N 0.000 description 2
- GXXTUIUYTWGPMV-FXQIFTODSA-N Ser-Arg-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O GXXTUIUYTWGPMV-FXQIFTODSA-N 0.000 description 2
- HBOABDXGTMMDSE-GUBZILKMSA-N Ser-Arg-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O HBOABDXGTMMDSE-GUBZILKMSA-N 0.000 description 2
- FIDMVVBUOCMMJG-CIUDSAMLSA-N Ser-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO FIDMVVBUOCMMJG-CIUDSAMLSA-N 0.000 description 2
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 2
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 2
- INCNPLPRPOYTJI-JBDRJPRFSA-N Ser-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CO)N INCNPLPRPOYTJI-JBDRJPRFSA-N 0.000 description 2
- MOVJSUIKUNCVMG-ZLUOBGJFSA-N Ser-Cys-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N)O MOVJSUIKUNCVMG-ZLUOBGJFSA-N 0.000 description 2
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 2
- GWMXFEMMBHOKDX-AVGNSLFASA-N Ser-Gln-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 GWMXFEMMBHOKDX-AVGNSLFASA-N 0.000 description 2
- UFKPDBLKLOBMRH-XHNCKOQMSA-N Ser-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)C(=O)O UFKPDBLKLOBMRH-XHNCKOQMSA-N 0.000 description 2
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 2
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 2
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 2
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 2
- HMRAQFJFTOLDKW-GUBZILKMSA-N Ser-His-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMRAQFJFTOLDKW-GUBZILKMSA-N 0.000 description 2
- CJINPXGSKSZQNE-KBIXCLLPSA-N Ser-Ile-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O CJINPXGSKSZQNE-KBIXCLLPSA-N 0.000 description 2
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 2
- FUMGHWDRRFCKEP-CIUDSAMLSA-N Ser-Leu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O FUMGHWDRRFCKEP-CIUDSAMLSA-N 0.000 description 2
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 2
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 2
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 2
- GVIGVIOEYBOTCB-XIRDDKMYSA-N Ser-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC(C)C)C(O)=O)=CNC2=C1 GVIGVIOEYBOTCB-XIRDDKMYSA-N 0.000 description 2
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 description 2
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 2
- HJAXVYLCKDPPDF-SRVKXCTJSA-N Ser-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N HJAXVYLCKDPPDF-SRVKXCTJSA-N 0.000 description 2
- PJIQEIFXZPCWOJ-FXQIFTODSA-N Ser-Pro-Asp Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O PJIQEIFXZPCWOJ-FXQIFTODSA-N 0.000 description 2
- VFWQQZMRKFOGLE-ZLUOBGJFSA-N Ser-Ser-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)O VFWQQZMRKFOGLE-ZLUOBGJFSA-N 0.000 description 2
- JCLAFVNDBJMLBC-JBDRJPRFSA-N Ser-Ser-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JCLAFVNDBJMLBC-JBDRJPRFSA-N 0.000 description 2
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 2
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 2
- ZKOKTQPHFMRSJP-YJRXYDGGSA-N Ser-Thr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKOKTQPHFMRSJP-YJRXYDGGSA-N 0.000 description 2
- GSCVDSBEYVGMJQ-SRVKXCTJSA-N Ser-Tyr-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)O GSCVDSBEYVGMJQ-SRVKXCTJSA-N 0.000 description 2
- IAOHCSQDQDWRQU-GUBZILKMSA-N Ser-Val-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IAOHCSQDQDWRQU-GUBZILKMSA-N 0.000 description 2
- YEDSOSIKVUMIJE-DCAQKATOSA-N Ser-Val-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O YEDSOSIKVUMIJE-DCAQKATOSA-N 0.000 description 2
- ODRUTDLAONAVDV-IHRRRGAJSA-N Ser-Val-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ODRUTDLAONAVDV-IHRRRGAJSA-N 0.000 description 2
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 2
- GZYNMZQXFRWDFH-YTWAJWBKSA-N Thr-Arg-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N)O GZYNMZQXFRWDFH-YTWAJWBKSA-N 0.000 description 2
- VASYSJHSMSBTDU-LKXGYXEUSA-N Thr-Asn-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N)O VASYSJHSMSBTDU-LKXGYXEUSA-N 0.000 description 2
- OHAJHDJOCKKJLV-LKXGYXEUSA-N Thr-Asp-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OHAJHDJOCKKJLV-LKXGYXEUSA-N 0.000 description 2
- XDARBNMYXKUFOJ-GSSVUCPTSA-N Thr-Asp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDARBNMYXKUFOJ-GSSVUCPTSA-N 0.000 description 2
- OYTNZCBFDXGQGE-XQXXSGGOSA-N Thr-Gln-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O OYTNZCBFDXGQGE-XQXXSGGOSA-N 0.000 description 2
- JKGGPMOUIAAJAA-YEPSODPASA-N Thr-Gly-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O JKGGPMOUIAAJAA-YEPSODPASA-N 0.000 description 2
- VUSAEKOXGNEYNE-PBCZWWQYSA-N Thr-His-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O VUSAEKOXGNEYNE-PBCZWWQYSA-N 0.000 description 2
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 2
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 2
- MXNAOGFNFNKUPD-JHYOHUSXSA-N Thr-Phe-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MXNAOGFNFNKUPD-JHYOHUSXSA-N 0.000 description 2
- JAJOFWABAUKAEJ-QTKMDUPCSA-N Thr-Pro-His Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O JAJOFWABAUKAEJ-QTKMDUPCSA-N 0.000 description 2
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 2
- IQPWNQRRAJHOKV-KATARQTJSA-N Thr-Ser-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN IQPWNQRRAJHOKV-KATARQTJSA-N 0.000 description 2
- YRJOLUDFVAUXLI-GSSVUCPTSA-N Thr-Thr-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O YRJOLUDFVAUXLI-GSSVUCPTSA-N 0.000 description 2
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 2
- LVRFMARKDGGZMX-IZPVPAKOSA-N Thr-Tyr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=C(O)C=C1 LVRFMARKDGGZMX-IZPVPAKOSA-N 0.000 description 2
- KPMIQCXJDVKWKO-IFFSRLJSSA-N Thr-Val-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KPMIQCXJDVKWKO-IFFSRLJSSA-N 0.000 description 2
- KZTLZZQTJMCGIP-ZJDVBMNYSA-N Thr-Val-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KZTLZZQTJMCGIP-ZJDVBMNYSA-N 0.000 description 2
- HYVLNORXQGKONN-NUTKFTJISA-N Trp-Ala-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 HYVLNORXQGKONN-NUTKFTJISA-N 0.000 description 2
- QNMIVTOQXUSGLN-SZMVWBNQSA-N Trp-Arg-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 QNMIVTOQXUSGLN-SZMVWBNQSA-N 0.000 description 2
- ZJKZLNAECPIUTL-JBACZVJFSA-N Trp-Gln-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=C(O)C=C1 ZJKZLNAECPIUTL-JBACZVJFSA-N 0.000 description 2
- WSGPBCAGEGHKQJ-BBRMVZONSA-N Trp-Gly-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CNC2=CC=CC=C21)N WSGPBCAGEGHKQJ-BBRMVZONSA-N 0.000 description 2
- OCCYDHCUKXRPSJ-SXNHZJKMSA-N Trp-Ile-Gln Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O OCCYDHCUKXRPSJ-SXNHZJKMSA-N 0.000 description 2
- UKWSFUSPGPBJGU-VFAJRCTISA-N Trp-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O UKWSFUSPGPBJGU-VFAJRCTISA-N 0.000 description 2
- ACGIVBXINJFALS-HKUYNNGSSA-N Trp-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N ACGIVBXINJFALS-HKUYNNGSSA-N 0.000 description 2
- IKUMWSDCGQVGHC-UMPQAUOISA-N Trp-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CNC3=CC=CC=C32)N)O IKUMWSDCGQVGHC-UMPQAUOISA-N 0.000 description 2
- JEYRCNVVYHTZMY-SZMVWBNQSA-N Trp-Pro-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JEYRCNVVYHTZMY-SZMVWBNQSA-N 0.000 description 2
- KBKTUNYBNJWFRL-UBHSHLNASA-N Trp-Ser-Asn Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O)=CNC2=C1 KBKTUNYBNJWFRL-UBHSHLNASA-N 0.000 description 2
- HTGJDTPQYFMKNC-VFAJRCTISA-N Trp-Thr-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)[C@@H](C)O)=CNC2=C1 HTGJDTPQYFMKNC-VFAJRCTISA-N 0.000 description 2
- UPUNWAXSLPBMRK-XTWBLICNSA-N Trp-Thr-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UPUNWAXSLPBMRK-XTWBLICNSA-N 0.000 description 2
- WNGMGTMSUBARLB-RXVVDRJESA-N Trp-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)N)C(=O)NCC(O)=O)=CNC2=C1 WNGMGTMSUBARLB-RXVVDRJESA-N 0.000 description 2
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 2
- TVOGEPLDNYTAHD-CQDKDKBSSA-N Tyr-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 TVOGEPLDNYTAHD-CQDKDKBSSA-N 0.000 description 2
- IIJWXEUNETVJPV-IHRRRGAJSA-N Tyr-Arg-Ser Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N)O IIJWXEUNETVJPV-IHRRRGAJSA-N 0.000 description 2
- DKKHULUSOSWGHS-UWJYBYFXSA-N Tyr-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N DKKHULUSOSWGHS-UWJYBYFXSA-N 0.000 description 2
- OEVJGIHPQOXYFE-SRVKXCTJSA-N Tyr-Asn-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O OEVJGIHPQOXYFE-SRVKXCTJSA-N 0.000 description 2
- SCCKSNREWHMKOJ-SRVKXCTJSA-N Tyr-Asn-Ser Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O SCCKSNREWHMKOJ-SRVKXCTJSA-N 0.000 description 2
- UXUFNBVCPAWACG-SIUGBPQLSA-N Tyr-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N UXUFNBVCPAWACG-SIUGBPQLSA-N 0.000 description 2
- FJBCEFPCVPHPPM-STECZYCISA-N Tyr-Ile-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O FJBCEFPCVPHPPM-STECZYCISA-N 0.000 description 2
- MVFQLSPDMMFCMW-KKUMJFAQSA-N Tyr-Leu-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O MVFQLSPDMMFCMW-KKUMJFAQSA-N 0.000 description 2
- PGEFRHBWGOJPJT-KKUMJFAQSA-N Tyr-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O PGEFRHBWGOJPJT-KKUMJFAQSA-N 0.000 description 2
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 2
- GQVZBMROTPEPIF-SRVKXCTJSA-N Tyr-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GQVZBMROTPEPIF-SRVKXCTJSA-N 0.000 description 2
- MDXLPNRXCFOBTL-BZSNNMDCSA-N Tyr-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MDXLPNRXCFOBTL-BZSNNMDCSA-N 0.000 description 2
- GZWPQZDVTBZVEP-BZSNNMDCSA-N Tyr-Tyr-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O GZWPQZDVTBZVEP-BZSNNMDCSA-N 0.000 description 2
- OJCISMMNNUNNJA-BZSNNMDCSA-N Tyr-Tyr-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=C(O)C=C1 OJCISMMNNUNNJA-BZSNNMDCSA-N 0.000 description 2
- AGDDLOQMXUQPDY-BZSNNMDCSA-N Tyr-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O AGDDLOQMXUQPDY-BZSNNMDCSA-N 0.000 description 2
- VMRFIKXKOFNMHW-GUBZILKMSA-N Val-Arg-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N VMRFIKXKOFNMHW-GUBZILKMSA-N 0.000 description 2
- LIQJSDDOULTANC-QSFUFRPTSA-N Val-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LIQJSDDOULTANC-QSFUFRPTSA-N 0.000 description 2
- QHFQQRKNGCXTHL-AUTRQRHGSA-N Val-Gln-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QHFQQRKNGCXTHL-AUTRQRHGSA-N 0.000 description 2
- NXRAUQGGHPCJIB-RCOVLWMOSA-N Val-Gly-Asn Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O NXRAUQGGHPCJIB-RCOVLWMOSA-N 0.000 description 2
- SDUBQHUJJWQTEU-XUXIUFHCSA-N Val-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C(C)C)N SDUBQHUJJWQTEU-XUXIUFHCSA-N 0.000 description 2
- FEXILLGKGGTLRI-NHCYSSNCSA-N Val-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N FEXILLGKGGTLRI-NHCYSSNCSA-N 0.000 description 2
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 2
- YLRAFVVWZRSZQC-DZKIICNBSA-N Val-Phe-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YLRAFVVWZRSZQC-DZKIICNBSA-N 0.000 description 2
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 2
- AJNUKMZFHXUBMK-GUBZILKMSA-N Val-Ser-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N AJNUKMZFHXUBMK-GUBZILKMSA-N 0.000 description 2
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 2
- QHSSPPHOHJSTML-HOCLYGCPSA-N Val-Trp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)NCC(=O)O)N QHSSPPHOHJSTML-HOCLYGCPSA-N 0.000 description 2
- KJFBXCFOPAKPTM-BZSNNMDCSA-N Val-Trp-Val Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 KJFBXCFOPAKPTM-BZSNNMDCSA-N 0.000 description 2
- VTIAEOKFUJJBTC-YDHLFZDLSA-N Val-Tyr-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VTIAEOKFUJJBTC-YDHLFZDLSA-N 0.000 description 2
- PMKQKNBISAOSRI-XHSDSOJGSA-N Val-Tyr-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N PMKQKNBISAOSRI-XHSDSOJGSA-N 0.000 description 2
- WBPFYNYTYASCQP-CYDGBPFRSA-N Val-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)N WBPFYNYTYASCQP-CYDGBPFRSA-N 0.000 description 2
- 238000010521 absorption reaction Methods 0.000 description 2
- 108010039538 alanyl-glycyl-aspartyl-valine Proteins 0.000 description 2
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 2
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 2
- 108010041407 alanylaspartic acid Proteins 0.000 description 2
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 2
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 2
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 2
- 108010029539 arginyl-prolyl-proline Proteins 0.000 description 2
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 2
- 108010060035 arginylproline Proteins 0.000 description 2
- 108010066988 asparaginyl-alanyl-glycyl-alanine Proteins 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 108010093581 aspartyl-proline Proteins 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- 210000004899 c-terminal region Anatomy 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 238000005520 cutting process Methods 0.000 description 2
- 108010060199 cysteinylproline Proteins 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 238000001976 enzyme digestion Methods 0.000 description 2
- 238000010195 expression analysis Methods 0.000 description 2
- 238000000855 fermentation Methods 0.000 description 2
- 230000004151 fermentation Effects 0.000 description 2
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 2
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 2
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 2
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 2
- 108010015792 glycyllysine Proteins 0.000 description 2
- 108010081551 glycylphenylalanine Proteins 0.000 description 2
- 108010077515 glycylproline Proteins 0.000 description 2
- 210000002288 golgi apparatus Anatomy 0.000 description 2
- 108010085325 histidylproline Proteins 0.000 description 2
- 230000006872 improvement Effects 0.000 description 2
- 238000009776 industrial production Methods 0.000 description 2
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 2
- 108010054155 lysyllysine Proteins 0.000 description 2
- 239000006174 pH buffer Substances 0.000 description 2
- 210000003681 parotid gland Anatomy 0.000 description 2
- 108010024607 phenylalanylalanine Proteins 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 2
- 108010070643 prolylglutamic acid Proteins 0.000 description 2
- 230000009257 reactivity Effects 0.000 description 2
- 108010048818 seryl-histidine Proteins 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 108010071207 serylmethionine Proteins 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 2
- 238000011830 transgenic mouse model Methods 0.000 description 2
- 108010038745 tryptophylglycine Proteins 0.000 description 2
- 108010044292 tryptophyltyrosine Proteins 0.000 description 2
- 108010003137 tyrosyltyrosine Proteins 0.000 description 2
- FYGDTMLNYKFZSV-URKRLVJHSA-N (2s,3r,4s,5s,6r)-2-[(2r,4r,5r,6s)-4,5-dihydroxy-2-(hydroxymethyl)-6-[(2r,4r,5r,6s)-4,5,6-trihydroxy-2-(hydroxymethyl)oxan-3-yl]oxyoxan-3-yl]oxy-6-(hydroxymethyl)oxane-3,4,5-triol Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@H]1OC1[C@@H](CO)O[C@@H](OC2[C@H](O[C@H](O)[C@H](O)[C@H]2O)CO)[C@H](O)[C@H]1O FYGDTMLNYKFZSV-URKRLVJHSA-N 0.000 description 1
- GOJUJUVQIVIZAV-UHFFFAOYSA-N 2-amino-4,6-dichloropyrimidine-5-carbaldehyde Chemical group NC1=NC(Cl)=C(C=O)C(Cl)=N1 GOJUJUVQIVIZAV-UHFFFAOYSA-N 0.000 description 1
- DKJPOZOEBONHFS-ZLUOBGJFSA-N Ala-Ala-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O DKJPOZOEBONHFS-ZLUOBGJFSA-N 0.000 description 1
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 1
- PIPTUBPKYFRLCP-NHCYSSNCSA-N Ala-Ala-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PIPTUBPKYFRLCP-NHCYSSNCSA-N 0.000 description 1
- PXKLCFFSVLKOJM-ACZMJKKPSA-N Ala-Asn-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PXKLCFFSVLKOJM-ACZMJKKPSA-N 0.000 description 1
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 1
- GORKKVHIBWAQHM-GCJQMDKQSA-N Ala-Asn-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GORKKVHIBWAQHM-GCJQMDKQSA-N 0.000 description 1
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 1
- DAEFQZCYZKRTLR-ZLUOBGJFSA-N Ala-Cys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O DAEFQZCYZKRTLR-ZLUOBGJFSA-N 0.000 description 1
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 1
- HJGZVLLLBJLXFC-LSJOCFKGSA-N Ala-His-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O HJGZVLLLBJLXFC-LSJOCFKGSA-N 0.000 description 1
- RZZMZYZXNJRPOJ-BJDJZHNGSA-N Ala-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C)N RZZMZYZXNJRPOJ-BJDJZHNGSA-N 0.000 description 1
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 1
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 1
- XUCHENWTTBFODJ-FXQIFTODSA-N Ala-Met-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O XUCHENWTTBFODJ-FXQIFTODSA-N 0.000 description 1
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 1
- LTTLSZVJTDSACD-OWLDWWDNSA-N Ala-Thr-Trp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O LTTLSZVJTDSACD-OWLDWWDNSA-N 0.000 description 1
- 229920000310 Alpha glucan Polymers 0.000 description 1
- OOBVTWHLKYJFJH-FXQIFTODSA-N Arg-Ala-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O OOBVTWHLKYJFJH-FXQIFTODSA-N 0.000 description 1
- PAXHINASXXXILC-SRVKXCTJSA-N Asn-Asp-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N)O PAXHINASXXXILC-SRVKXCTJSA-N 0.000 description 1
- AYKKKGFJXIDYLX-ACZMJKKPSA-N Asn-Gln-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O AYKKKGFJXIDYLX-ACZMJKKPSA-N 0.000 description 1
- HCAUEJAQCXVQQM-ACZMJKKPSA-N Asn-Glu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HCAUEJAQCXVQQM-ACZMJKKPSA-N 0.000 description 1
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 1
- CDGHMJJJHYKMPA-DLOVCJGASA-N Asn-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC(=O)N)N CDGHMJJJHYKMPA-DLOVCJGASA-N 0.000 description 1
- YRTOMUMWSTUQAX-FXQIFTODSA-N Asn-Pro-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O YRTOMUMWSTUQAX-FXQIFTODSA-N 0.000 description 1
- QIRJQYQOIKBPBZ-IHRRRGAJSA-N Asn-Tyr-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QIRJQYQOIKBPBZ-IHRRRGAJSA-N 0.000 description 1
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 1
- XXAMCEGRCZQGEM-ZLUOBGJFSA-N Asp-Ser-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O XXAMCEGRCZQGEM-ZLUOBGJFSA-N 0.000 description 1
- KNDCWFXCFKSEBM-AVGNSLFASA-N Asp-Tyr-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O KNDCWFXCFKSEBM-AVGNSLFASA-N 0.000 description 1
- SQIARYGNVQWOSB-BZSNNMDCSA-N Asp-Tyr-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQIARYGNVQWOSB-BZSNNMDCSA-N 0.000 description 1
- 241000228245 Aspergillus niger Species 0.000 description 1
- 229920002498 Beta-glucan Polymers 0.000 description 1
- 102100032487 Beta-mannosidase Human genes 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- KJJASVYBTKRYSN-FXQIFTODSA-N Cys-Pro-Asp Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CS)N)C(=O)N[C@@H](CC(=O)O)C(=O)O KJJASVYBTKRYSN-FXQIFTODSA-N 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 108010082495 Dietary Plant Proteins Proteins 0.000 description 1
- 108091029865 Exogenous DNA Proteins 0.000 description 1
- QBLMTCRYYTVUQY-GUBZILKMSA-N Gln-Leu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O QBLMTCRYYTVUQY-GUBZILKMSA-N 0.000 description 1
- HPBKQFJXDUVNQV-FHWLQOOXSA-N Gln-Tyr-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O HPBKQFJXDUVNQV-FHWLQOOXSA-N 0.000 description 1
- HNAUFGBKJLTWQE-IFFSRLJSSA-N Gln-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCC(=O)N)N)O HNAUFGBKJLTWQE-IFFSRLJSSA-N 0.000 description 1
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 1
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 1
- IOUQWHIEQYQVFD-JYJNAYRXSA-N Glu-Leu-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IOUQWHIEQYQVFD-JYJNAYRXSA-N 0.000 description 1
- PAZQYODKOZHXGA-SRVKXCTJSA-N Glu-Pro-His Chemical compound N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O PAZQYODKOZHXGA-SRVKXCTJSA-N 0.000 description 1
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 1
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 1
- FIQQRCFQXGLOSZ-WDSKDSINSA-N Gly-Glu-Asp Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FIQQRCFQXGLOSZ-WDSKDSINSA-N 0.000 description 1
- IGOYNRWLWHWAQO-JTQLQIEISA-N Gly-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IGOYNRWLWHWAQO-JTQLQIEISA-N 0.000 description 1
- IEGFSKKANYKBDU-QWHCGFSZSA-N Gly-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)CN)C(=O)O IEGFSKKANYKBDU-QWHCGFSZSA-N 0.000 description 1
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 1
- OHUKZZYSJBKFRR-WHFBIAKZSA-N Gly-Ser-Asp Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O OHUKZZYSJBKFRR-WHFBIAKZSA-N 0.000 description 1
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 1
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 1
- ZKJZBRHRWKLVSJ-ZDLURKLDSA-N Gly-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)O ZKJZBRHRWKLVSJ-ZDLURKLDSA-N 0.000 description 1
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 1
- AFPFGFUGETYOSY-HGNGGELXSA-N His-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AFPFGFUGETYOSY-HGNGGELXSA-N 0.000 description 1
- VCDNHBNNPCDBKV-DLOVCJGASA-N His-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N VCDNHBNNPCDBKV-DLOVCJGASA-N 0.000 description 1
- XGBVLRJLHUVCNK-DCAQKATOSA-N His-Val-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O XGBVLRJLHUVCNK-DCAQKATOSA-N 0.000 description 1
- GYAFMRQGWHXMII-IUKAMOBKSA-N Ile-Asp-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N GYAFMRQGWHXMII-IUKAMOBKSA-N 0.000 description 1
- AQTWDZDISVGCAC-CFMVVWHZSA-N Ile-Asp-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N AQTWDZDISVGCAC-CFMVVWHZSA-N 0.000 description 1
- RWYCOSAAAJBJQL-KCTSRDHCSA-N Ile-Gly-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N RWYCOSAAAJBJQL-KCTSRDHCSA-N 0.000 description 1
- UIEZQYNXCYHMQS-BJDJZHNGSA-N Ile-Lys-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)O)N UIEZQYNXCYHMQS-BJDJZHNGSA-N 0.000 description 1
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 1
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 1
- WIYDLTIBHZSPKY-HJWJTTGWSA-N Ile-Val-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WIYDLTIBHZSPKY-HJWJTTGWSA-N 0.000 description 1
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 1
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 1
- XBBKIIGCUMBKCO-JXUBOQSCSA-N Leu-Ala-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XBBKIIGCUMBKCO-JXUBOQSCSA-N 0.000 description 1
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 1
- QLDHBYRUNQZIJQ-DKIMLUQUSA-N Leu-Ile-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QLDHBYRUNQZIJQ-DKIMLUQUSA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- MVVSHHJKJRZVNY-ACRUOGEOSA-N Leu-Phe-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MVVSHHJKJRZVNY-ACRUOGEOSA-N 0.000 description 1
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 1
- IDGRADDMTTWOQC-WDSOQIARSA-N Leu-Trp-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IDGRADDMTTWOQC-WDSOQIARSA-N 0.000 description 1
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- MPOHDJKRBLVGCT-CIUDSAMLSA-N Lys-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N MPOHDJKRBLVGCT-CIUDSAMLSA-N 0.000 description 1
- DGAAQRAUOFHBFJ-CIUDSAMLSA-N Lys-Asn-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O DGAAQRAUOFHBFJ-CIUDSAMLSA-N 0.000 description 1
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 1
- IUWMQCZOTYRXPL-ZPFDUUQYSA-N Lys-Ile-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O IUWMQCZOTYRXPL-ZPFDUUQYSA-N 0.000 description 1
- CFOLERIRBUAYAD-HOCLYGCPSA-N Lys-Trp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O CFOLERIRBUAYAD-HOCLYGCPSA-N 0.000 description 1
- RMKJOQSYLQQRFN-KKUMJFAQSA-N Lys-Tyr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O RMKJOQSYLQQRFN-KKUMJFAQSA-N 0.000 description 1
- 229920000057 Mannan Polymers 0.000 description 1
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 1
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 1
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 1
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 1
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 1
- QUUCAHIYARMNBL-FHWLQOOXSA-N Phe-Tyr-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N QUUCAHIYARMNBL-FHWLQOOXSA-N 0.000 description 1
- SGCZFWSQERRKBD-BQBZGAKWSA-N Pro-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 SGCZFWSQERRKBD-BQBZGAKWSA-N 0.000 description 1
- GBRUQFBAJOKCTF-DCAQKATOSA-N Pro-His-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O GBRUQFBAJOKCTF-DCAQKATOSA-N 0.000 description 1
- MWMKFWJYRRGXOR-ZLUOBGJFSA-N Ser-Ala-Asn Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC(N)=O)C)CO MWMKFWJYRRGXOR-ZLUOBGJFSA-N 0.000 description 1
- NRCJWSGXMAPYQX-LPEHRKFASA-N Ser-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N)C(=O)O NRCJWSGXMAPYQX-LPEHRKFASA-N 0.000 description 1
- OOKCGAYXSNJBGQ-ZLUOBGJFSA-N Ser-Asn-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OOKCGAYXSNJBGQ-ZLUOBGJFSA-N 0.000 description 1
- ICHZYBVODUVUKN-SRVKXCTJSA-N Ser-Asn-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ICHZYBVODUVUKN-SRVKXCTJSA-N 0.000 description 1
- SWSRFJZZMNLMLY-ZKWXMUAHSA-N Ser-Asp-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O SWSRFJZZMNLMLY-ZKWXMUAHSA-N 0.000 description 1
- QJKPECIAWNNKIT-KKUMJFAQSA-N Ser-Lys-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QJKPECIAWNNKIT-KKUMJFAQSA-N 0.000 description 1
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 1
- VLMIUSLQONKLDV-HEIBUPTGSA-N Ser-Thr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VLMIUSLQONKLDV-HEIBUPTGSA-N 0.000 description 1
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 1
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 1
- DSLHSTIUAPKERR-XGEHTFHBSA-N Thr-Cys-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O DSLHSTIUAPKERR-XGEHTFHBSA-N 0.000 description 1
- SIMKLINEDYOTKL-MBLNEYKQSA-N Thr-His-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C)C(=O)O)N)O SIMKLINEDYOTKL-MBLNEYKQSA-N 0.000 description 1
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 1
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 1
- ZMYCLHFLHRVOEA-HEIBUPTGSA-N Thr-Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ZMYCLHFLHRVOEA-HEIBUPTGSA-N 0.000 description 1
- 241000223259 Trichoderma Species 0.000 description 1
- 241000499912 Trichoderma reesei Species 0.000 description 1
- 241000209140 Triticum Species 0.000 description 1
- 235000021307 Triticum Nutrition 0.000 description 1
- VEYXZZGMIBKXCN-UBHSHLNASA-N Trp-Asp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VEYXZZGMIBKXCN-UBHSHLNASA-N 0.000 description 1
- FEZASNVQLJQBHW-CABZTGNLSA-N Trp-Gly-Ala Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O)=CNC2=C1 FEZASNVQLJQBHW-CABZTGNLSA-N 0.000 description 1
- NXQAOORHSYJRGH-AAEUAGOBSA-N Trp-Gly-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 NXQAOORHSYJRGH-AAEUAGOBSA-N 0.000 description 1
- OGXQLUCMJZSJPW-LYSGOOTNSA-N Trp-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O OGXQLUCMJZSJPW-LYSGOOTNSA-N 0.000 description 1
- YTZYHKOSHOXTHA-TUSQITKMSA-N Trp-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC=3C4=CC=CC=C4NC=3)CC(C)C)C(O)=O)=CNC2=C1 YTZYHKOSHOXTHA-TUSQITKMSA-N 0.000 description 1
- MBFJIHUHHCJBSN-AVGNSLFASA-N Tyr-Asn-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MBFJIHUHHCJBSN-AVGNSLFASA-N 0.000 description 1
- IXTQGBGHWQEEDE-AVGNSLFASA-N Tyr-Asp-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IXTQGBGHWQEEDE-AVGNSLFASA-N 0.000 description 1
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 1
- NWEGIYMHTZXVBP-JSGCOSHPSA-N Tyr-Val-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O NWEGIYMHTZXVBP-JSGCOSHPSA-N 0.000 description 1
- WOCYUGQDXPTQPY-FXQIFTODSA-N Val-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N WOCYUGQDXPTQPY-FXQIFTODSA-N 0.000 description 1
- BYOHPUZJVXWHAE-BYULHYEWSA-N Val-Asn-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N BYOHPUZJVXWHAE-BYULHYEWSA-N 0.000 description 1
- XLDYBRXERHITNH-QSFUFRPTSA-N Val-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)C(C)C XLDYBRXERHITNH-QSFUFRPTSA-N 0.000 description 1
- SDHZOOIGIUEPDY-JYJNAYRXSA-N Val-Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 SDHZOOIGIUEPDY-JYJNAYRXSA-N 0.000 description 1
- CEKSLIVSNNGOKH-KZVJFYERSA-N Val-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](C(C)C)N)O CEKSLIVSNNGOKH-KZVJFYERSA-N 0.000 description 1
- PQSNETRGCRUOGP-KKHAAJSZSA-N Val-Thr-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O PQSNETRGCRUOGP-KKHAAJSZSA-N 0.000 description 1
- JVGDAEKKZKKZFO-RCWTZXSCSA-N Val-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)N)O JVGDAEKKZKKZFO-RCWTZXSCSA-N 0.000 description 1
- 240000008042 Zea mays Species 0.000 description 1
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 description 1
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 108010070944 alanylhistidine Proteins 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- 239000003513 alkali Substances 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 239000002585 base Substances 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 108010055059 beta-Mannosidase Proteins 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 238000009395 breeding Methods 0.000 description 1
- 230000001488 breeding effect Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 235000005822 corn Nutrition 0.000 description 1
- 239000012531 culture fluid Substances 0.000 description 1
- 235000013325 dietary fiber Nutrition 0.000 description 1
- 235000019621 digestibility Nutrition 0.000 description 1
- 230000001079 digestive effect Effects 0.000 description 1
- 102000038379 digestive enzymes Human genes 0.000 description 1
- 108091007734 digestive enzymes Proteins 0.000 description 1
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 230000007071 enzymatic hydrolysis Effects 0.000 description 1
- 238000006047 enzymatic hydrolysis reaction Methods 0.000 description 1
- 230000002550 fecal effect Effects 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 235000013305 food Nutrition 0.000 description 1
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 1
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 1
- 108010077435 glycyl-phenylalanyl-glycine Proteins 0.000 description 1
- 108010074027 glycyl-seryl-phenylalanine Proteins 0.000 description 1
- 230000003301 hydrolyzing effect Effects 0.000 description 1
- 230000001771 impaired effect Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- FZWBNHMXJMCXLU-BLAUPYHCSA-N isomaltotriose Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1OC[C@@H]1[C@@H](O)[C@H](O)[C@@H](O)[C@@H](OC[C@@H](O)[C@@H](O)[C@H](O)[C@@H](O)C=O)O1 FZWBNHMXJMCXLU-BLAUPYHCSA-N 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 229920005610 lignin Polymers 0.000 description 1
- 239000002502 liposome Substances 0.000 description 1
- 108010009298 lysylglutamic acid Proteins 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 230000002503 metabolic effect Effects 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 150000002772 monosaccharides Chemical class 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- 229910052757 nitrogen Inorganic materials 0.000 description 1
- 235000003170 nutritional factors Nutrition 0.000 description 1
- 210000003463 organelle Anatomy 0.000 description 1
- 229920001277 pectin Polymers 0.000 description 1
- 239000001814 pectin Substances 0.000 description 1
- 235000010987 pectin Nutrition 0.000 description 1
- 230000012743 protein tagging Effects 0.000 description 1
- 230000003252 repetitive effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 210000003296 saliva Anatomy 0.000 description 1
- 210000003079 salivary gland Anatomy 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- 108010080629 tryptophan-leucine Proteins 0.000 description 1
- 108010078580 tyrosylleucine Proteins 0.000 description 1
- 108010009962 valyltyrosine Proteins 0.000 description 1
- 230000035899 viability Effects 0.000 description 1
- 239000011345 viscous material Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/24—Hydrolases (3) acting on glycosyl compounds (3.2)
- C12N9/2402—Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S- glycosyl compounds (3.2.1)
- C12N9/2477—Hemicellulases not provided in a preceding group
- C12N9/248—Xylanases
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K67/00—Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
- A01K67/027—New or modified breeds of vertebrates
- A01K67/0275—Genetically modified vertebrates, e.g. transgenic
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/8509—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/24—Hydrolases (3) acting on glycosyl compounds (3.2)
- C12N9/2402—Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S- glycosyl compounds (3.2.1)
- C12N9/2405—Glucanases
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/24—Hydrolases (3) acting on glycosyl compounds (3.2)
- C12N9/2402—Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S- glycosyl compounds (3.2.1)
- C12N9/2405—Glucanases
- C12N9/2434—Glucanases acting on beta-1,4-glucosidic bonds
- C12N9/2437—Cellulases (3.2.1.4; 3.2.1.74; 3.2.1.91; 3.2.1.150)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y301/00—Hydrolases acting on ester bonds (3.1)
- C12Y301/03—Phosphoric monoester hydrolases (3.1.3)
- C12Y301/03008—3-Phytase (3.1.3.8)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y301/00—Hydrolases acting on ester bonds (3.1)
- C12Y301/03—Phosphoric monoester hydrolases (3.1.3)
- C12Y301/03026—4-Phytase (3.1.3.26), i.e. 6-phytase
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2207/00—Modified animals
- A01K2207/05—Animals modified by non-integrating nucleic acids, e.g. antisense, RNAi, morpholino, episomal vector, for non-therapeutic purpose
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2227/00—Animals characterised by species
- A01K2227/10—Mammal
- A01K2227/108—Swine
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2267/00—Animals characterised by purpose
- A01K2267/02—Animal zootechnically ameliorated
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/02—Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/106—Plasmid DNA for vertebrates
- C12N2800/107—Plasmid DNA for vertebrates for mammalian
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/22—Vectors comprising a coding region that has been codon optimised for expression in a respective host
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2840/00—Vectors comprising a special translation-regulating system
- C12N2840/20—Vectors comprising a special translation-regulating system translation of more than one cistron
- C12N2840/203—Vectors comprising a special translation-regulating system translation of more than one cistron having an IRES
- C12N2840/206—Vectors comprising a special translation-regulating system translation of more than one cistron having an IRES having multiple IRES
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P60/00—Technologies relating to agriculture, livestock or agroalimentary industries
- Y02P60/80—Food processing, e.g. use of renewable energies or variable speed drives in handling, conveying or stacking
- Y02P60/87—Re-use of by-products of food processing for fodder production
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Biotechnology (AREA)
- Biomedical Technology (AREA)
- Microbiology (AREA)
- Molecular Biology (AREA)
- Medicinal Chemistry (AREA)
- Veterinary Medicine (AREA)
- Environmental Sciences (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Animal Behavior & Ethology (AREA)
- Animal Husbandry (AREA)
- Biodiversity & Conservation Biology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
Description
Technical Field
The invention relates to the field of biotechnology, and in particular to a multifunctional fusion enzyme XAET, a site-directed integration vector for parotid-gland-specific expression of the multifunctional fusion enzyme, and a construction method thereof.
Background Art
Non-starch polysaccharides (NSPs) are polymers of monosaccharides joined by glycosidic bonds and include most polysaccharides other than α-glucans. NSPs were originally classified by the methods used to extract and isolate them: the insoluble residue left after a series of alkaline extractions of the cell wall was called cellulose, and the material dissolved in the alkaline solution was called hemicellulose. Given the chemical structures and biological functions of NSPs, classification by solubility alone proved imprecise. NSPs are now generally divided into three classes: cellulose, non-cellulosic polysaccharides (hemicellulosic polymers) and pectic polysaccharides. The non-cellulosic polysaccharides include xylan, β-glucan, mannan, galactan and others. By water solubility, NSPs can also be divided into soluble NSPs (SNSP) and insoluble NSPs (INSP): in cereal cell walls, some NSPs are only loosely bound to cellulose, lignin and protein by hydrogen bonds, so they dissolve in water and are called soluble NSPs.
NSPs are the main component of the dietary fiber in feed. These fibers enclose feed nutrients within the cell wall, and some of them dissolve in water to form viscous material that impairs normal digestion and hinders nutrient absorption. If the NSPs are removed, the nutrients are released from the cell wall, which improves the utilization of metabolizable energy and protein. Corn and wheat both contain large amounts of NSPs, and many plant protein sources, such as soybean meal, also contain NSPs. Adding enzyme preparations to feed can remove these NSPs; for example, the starch and protein enclosed by cell structures in soybean meal are released, which improves the utilization of the metabolizable energy and protein of soybean meal.
Adding enzyme preparations to feed can address the digestion and absorption of NSPs, but the enzymes are easily inactivated by the high temperatures of feed pelleting and extrusion, they raise the cost of feed, and they impose stricter storage conditions, which increases transport and feeding costs. Another solution is therefore urgently needed. In recent years, studies have shown that genes encoding enzymes that degrade cellulose or non-cellulosic polysaccharides, such as xylanase and glucanase, can be transferred into animals; the resulting transgenic animals secrete the corresponding enzymes themselves and digest the NSPs in feed, improving the digestion and absorption of feed nutrients. Using transgenic breeding to develop new animal breeds that express the digestive enzymes pigs lack is thus a new way to address anti-nutritional factors in feed. In 2001, the Canadian scientists Golovan et al. (2001) first integrated the acid-tolerant Escherichia coli phytase gene appA into pigs, which improved the digestibility of feed phosphorus and reduced fecal phosphorus output by 75%. In 2013, our group likewise used the upstream regulatory region of the porcine parotid secretory protein gene to produce cellulase-transgenic mice (Huang et al., 2013) and mannanase-transgenic mice (Li Zicong et al., 2013). In 2018, Zhang et al. used 2A peptides to link xylanase, glucanase and phytase and introduced the construct into pigs, improving production performance and reducing phosphorus and nitrogen emissions. Patent publication CN106086068A also discloses fusion expression of pectinase, xylanase, phytase and cellulase using 2A linker peptides, but with low expression efficiency. These prior approaches have the following problems. Single-gene transgenes give high expression but few enzyme types; eliminating only individual anti-nutritional factors improves animal performance to a limited extent and has little economic value. 2A-mediated multi-gene transgenes still suffer from low co-expression efficiency and inconsistent expression of the first and last genes of the multi-gene fusion enzyme; suitable gene positions are hard to determine; and some gene sequences interfere with 2A cleavage, causing loss of function of the expression product.
Summary of the Invention
The present disclosure provides a multifunctional fusion enzyme XAET, a site-directed integration eukaryotic specific expression vector for the multifunctional fusion enzyme, and a construction method thereof. The multifunctional fusion enzyme provides xylanase, phytase, dual glucanase and dual cellulase activities. It can be used to produce transgenic animals that secrete these enzymes themselves and digest anti-nutritional factors in feed such as xylan, phytic acid, glucan and cellulose, thereby improving feed utilization and reducing pollutant emissions. The disclosure also addresses the mutual interference that arises when a rigid peptide mediates co-expression of more than two fused proteins; the low efficiency of multi-gene co-expression with 2A linker peptides and the difficulty of determining gene positions; and the interference of residual 2A peptide residues with the upstream enzyme, so that multi-gene co-expression becomes more efficient. It further addresses the poor expression and weak cellulose hydrolysis obtained with a cellulase gene from a single source.
According to one aspect of the present disclosure, a multifunctional fusion enzyme is provided that simultaneously expresses xylanase, phytase, dual glucanase and dual cellulase.
In some embodiments, the multifunctional fusion enzyme gene consists of xylanase gene-A3-phytase gene-furin-P2A-cellulase gene-A3'-cellulase gene. When more than two proteins are co-expressed, it is difficult to express all of the genes using flexible or rigid peptides alone. By combining A3 and 2A, the invention addresses the mutual interference that arises when rigid peptides mediate co-expression of more than two fused proteins, the low efficiency of multi-gene co-expression with 2A linker peptides, and the interference of 2A peptide residues with the upstream enzyme, achieving efficient multi-gene co-expression. Compared with 2A linker peptides alone, constructing XAET with A3 and furin-P2A significantly increases the activity of the four enzymes under different pH conditions, eliminates the effect of the 2A peptide on the upstream gene, avoids inhibition of the 2A reaction by the spatial structure of some proteins, and avoids the tedious testing needed to optimize the order of the proteins; enzyme expression is also higher than with 2A. Rigid peptides alone can only be used to build fusions of two enzymes; combining 2A with A3 retains the advantages of A3 while enhancing the multi-gene co-expression capacity of the fusion enzyme. P2A, the 2A peptide with the highest self-cleavage efficiency, is placed at the junction between the second and third genes, and the furin recognition motif RVKR is added at its N-terminus. RVKR is cleaved efficiently in the Golgi apparatus, leaving only 2 amino acid residues. Compared with a conventional 2A sequence, this design has higher cleavage efficiency and minimal effect on the expression products. The XAET gene sequence designed in this disclosure therefore expresses xylanase, phytase, dual glucanase and dual cellulase efficiently, solves the problem of co-expressing the four genes, and provides a basis for later use in transgenic animals and industrial enzyme fermentation to improve gene transfer and enzyme production efficiency. Patent publication CN106086068A discloses that when two gene sequences are joined by 2A, their order affects gene expression and function, and the activity of each enzyme is lower than that of the same enzyme expressed from a single gene; when multiple genes are joined by 2A linker peptides only, this position effect becomes more pronounced and the effect on each enzyme's activity is greater. The multifunctional enzyme gene of this disclosure, built with a combination of A3 and 2A linker peptides, overcomes the position effect: the xylanase and phytase activities are comparable to those of the single genes. Because two cellulase genes are co-expressed, the cellulase activity is higher than that of a single-gene enzyme.
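The construct layout described above can be sketched as an ordered list of parts. The Python below is only an illustrative model: the part names are placeholders rather than the actual sequences (the SEQ ID entries are not reproduced here), residues left at the cleavage sites are ignored, and the names `XAET_PARTS` and `predicted_products` are hypothetical.

```python
# Illustrative model of the XAET layout; names are placeholders, not sequences.
XAET_PARTS = [
    ("xylanase", "enzyme"),
    ("A3", "rigid_linker"),
    ("phytase", "enzyme"),
    ("furin_RVKR", "protease_site"),
    ("P2A", "self_cleaving"),
    ("cellulase_1", "enzyme"),
    ("A3'", "rigid_linker"),
    ("cellulase_2", "enzyme"),
]


def predicted_products(parts):
    """Group parts into the polypeptides expected after 2A self-cleavage.

    Assumption: each self-cleaving element ends one polypeptide and starts
    the next, and the furin site only trims the 2A tail from the upstream
    product, so neither element appears in the returned groups.
    """
    products, current = [], []
    for name, kind in parts:
        if kind == "self_cleaving":
            products.append(current)
            current = []
        elif kind != "protease_site":
            current.append(name)
    products.append(current)
    return products


if __name__ == "__main__":
    for i, product in enumerate(predicted_products(XAET_PARTS), start=1):
        print(f"polypeptide {i}: " + "-".join(product))
```

Under these assumptions the model yields two products, xylanase-A3-phytase and cellulase-A3'-cellulase, each held together by a rigid linker.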
In some embodiments, the gene sequence of A3 in the multifunctional fusion enzyme gene is shown in SEQ ID No: 5.
In some embodiments, the gene sequence of A3' in the multifunctional fusion enzyme gene is shown in SEQ ID No: 6.
In some embodiments, the sequence of the multifunctional fusion enzyme gene XAET is shown in SEQ ID No: 11. Because the 2A peptide self-cleaves efficiently, it co-expresses the proteins on either side of it well and is regarded as the most reliable linker for fusion expression. However, the function of some enzyme proteins is significantly impaired after fusion with the C-terminal peptide of 2A, and expression of the downstream protein is reduced. In multi-gene co-expression the order in which the proteins are combined is critical, and tedious testing of permutations is required to achieve even low-level co-expression (as shown in patent publication CN106086068A), which hinders the construction of multifunctional fusion enzymes. The cleavage efficiency of a 2A sequence depends on the upstream peptide motif. Some amino acid motifs severely reduce 2A cleavage activity, so the two enzymes flanking 2A are not fully separated; this leads to abnormal secondary-structure folding, and the uncleaved product may be directed by the upstream signal peptide to the wrong organelle, where it cannot be processed and secreted correctly, resulting in loss of function and failed expression. In this disclosure, A3 and 2A linker peptides are used in combination to construct a multifunctional fusion enzyme expressing four genes. This overcomes the respective shortcomings of the 2A and A3 linkers, allows the four enzyme activities to be expressed efficiently, and lays the foundation for obtaining transgenic animals expressing the four enzymes.
In some embodiments, the multifunctional fusion enzyme has the amino acid sequence shown in SEQ ID No: 12. The most important parameter of a linker peptide is the length of its amino acid chain. GGGGS linkers of different lengths affect the expression level and activity of a fusion protein differently, and a longer linker does not necessarily give higher expression. Fusion enzymes are demanding of flexible linkers: one that is too long or too short reduces fusion enzyme activity. Proteins with a large molecular mass and complex structure need more space to fold, so the linker should be longer, but an overly long peptide chain may increase antigenicity and is more easily cleaved by proteases. An overly short linker causes steric hindrance that interferes with correct folding and increases the chance of aggregate formation. Rigid linkers, whose stable secondary structure cannot stretch or bend, are commonly used to fix the distance between the functional proteins at either end and preserve the integrity of the functional domains. Accordingly, the designed rigid linkers A3/A3' act together with the self-cleaving P2A to join the four genes and achieve their co-expression. The A3/A3' and P2A linker sequences were optimized so that xylanase, phytase, glucanase and cellulase are all expressed efficiently.
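To make the length trade-off concrete, the sketch below builds flexible linkers from repeats of the GGGGS unit named above and reports their lengths. The repeat counts are arbitrary illustrations, not values taken from this disclosure, and `flexible_linker` is a hypothetical helper.

```python
def flexible_linker(repeats: int, unit: str = "GGGGS") -> str:
    """Return a flexible linker made of `repeats` copies of `unit`."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


if __name__ == "__main__":
    # Repeat counts chosen only for illustration.
    for n in (1, 2, 3, 4):
        linker = flexible_linker(n)
        print(f"(GGGGS)x{n}: {len(linker)} residues  {linker}")
```

Each added repeat lengthens the linker by five residues, which is the quantity the paragraph above weighs against folding space, antigenicity and protease sensitivity.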
According to another aspect of the present disclosure, a multifunctional fusion enzyme site-directed integration eukaryotic specific expression vector is provided, the expression vector comprising the multifunctional fusion enzyme according to claim 5 or 6.
In some embodiments, the gene sequence of the multifunctional fusion enzyme site-directed integration eukaryotic specific expression vector is shown in SEQ ID No:13.
In some embodiments, a method for constructing the multifunctional fusion enzyme site-directed integration eukaryotic specific expression vector is provided, comprising the following steps:
screening and optimization of candidate target genes;
design of the linker peptides;
ligation of the target genes with the linker peptides;
synthesis of the multifunctional fusion enzyme gene XAET;
construction of the XAET gene expression vector for site-directed integration at the CEP112 locus.
In some embodiments, constructing the XAET gene expression vector for site-directed integration at the CEP112 locus comprises the following steps:
replacing the BEXA cistron in the CEP112-LA340RA3219 vector with the XAET polycistron to construct a new vector, Cep112-mPSP-XAET;
linearizing Cep112-mPSP-XAET with PacI and SexAI, then amplifying the npsp upstream regulatory region with the inf-npsp primers and using it to replace the existing mpsp sequence;
obtaining the XAET gene expression vector Cep112-npsp-XAET for site-directed integration at the CEP112 locus.
According to another aspect of the present disclosure, a use of the multifunctional fusion enzyme XAET is provided: XAET can be used to produce grain-saving, environmentally friendly transgenic animals, such as transgenic pigs, transgenic cattle and transgenic sheep.
Beneficial effects of the present disclosure:
1) The present invention designs and constructs the fusion enzyme pxynB-A3-APPA-furin-P2A-pegⅡ-A3'-TeEGI (XAET), which co-expresses four enzyme activities: xylanase, phytase, dual glucanase and dual cellulase. Compared with a single cellulase gene, it shows better glucanase and cellulase activity. It covers the hydrolases needed to eliminate the main anti-nutritional factors in feed, which is of great value for improving the conversion rate of high-fiber feed. The design improves gene expression efficiency; if used in transgenic animals and in industrial enzyme fermentation, it can significantly improve the efficiency of gene transfer and enzyme production and has important economic value;
2) Compared with a 2A linker alone, the XAET obtained by the present invention significantly improves the activity of the four enzymes under different pH conditions, eliminates the effect of the 2A polypeptide on the upstream gene, and avoids the problem that the spatial structure of some proteins inhibits the 2A reaction, while also sparing the tedious experiments needed to optimize the order of the different proteins;
3) A rigid peptide alone can only be used to construct a fusion of two enzymes. The present invention combines 2A and A3, which retains the advantages of A3 while enhancing the multi-gene co-expression capacity of the fusion enzyme;
4) P2A, which has the highest self-cleavage efficiency, is used at the junction between the second and third genes of the fusion enzyme, and the furin recognition motif RVKR is added at its N-terminus. RVKR is cleaved efficiently in the Golgi apparatus, leaving only 4 amino acid residues. Compared with an ordinary 2A sequence, this design has higher cleavage efficiency and minimal effect on the expression products;
5) The XAET gene is carried by a high-efficiency site-directed integration vector targeting the porcine CEP112 locus, which allows transgenic pigs to be produced efficiently, transgenic lines with a consistent integration site to be obtained quickly, and new transgenic pig breeds to be developed.
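The processing at the furin-P2A junction described in item 4) can be sketched in a few lines of Python. This is an illustration only: the P2A peptide below is the commonly cited 19-residue form and is an assumption (the sequence actually used is the one in SEQ ID No:11 and SEQ ID No:12), the flanking strings are made-up placeholders, and the function name `process_junction` is hypothetical. The sketch models ribosomal skipping between the final glycine and proline of P2A and furin cleavage directly after RVKR.

```python
FURIN_MOTIF = "RVKR"          # furin recognition motif named in the text
P2A = "ATNFSLLKQAGDVEENPGP"   # assumed P2A peptide, not taken from the patent


def process_junction(polyprotein: str) -> tuple[str, str, str]:
    """Split a furin-P2A junction into the upstream product, the released
    2A peptide and the downstream product."""
    start = polyprotein.find(FURIN_MOTIF + P2A)
    if start == -1:
        raise ValueError("no furin-P2A junction found")
    furin_cut = start + len(FURIN_MOTIF)   # furin cuts right after RVKR
    skip = furin_cut + len(P2A) - 1        # skipping occurs before the last P
    upstream = polyprotein[:furin_cut]
    released_2a = polyprotein[furin_cut:skip]
    downstream = polyprotein[skip:]
    return upstream, released_2a, downstream


# Placeholder flanking sequences, for illustration only.
up, tail, down = process_junction("MKTLLA" + FURIN_MOTIF + P2A + "QQTVW")
print(up)    # upstream product, ending in the 4-residue RVKR remnant
print(tail)  # 2A peptide removed by furin
print(down)  # downstream product, starting with the P2A proline
```

Under these assumptions the upstream enzyme keeps only the four RVKR residues, which matches the statement in item 4), while the downstream enzyme gains a single N-terminal proline.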
Description of the Drawings
Figure 1 is a schematic diagram of the structure of pxynB-A3-APPA-furin-P2A-pegII-A3'-teEGI (XAET);
Figure 2 shows the optimized combination and expression results of the xylanase (xynB)-phytase (appA) bicistron: A. schematic design of the optimized xylanase-phytase bicistron; B. pH range of xynB; C. pH range of appA; D. pH stability of xynB (39 °C, 2 h); E. pH stability of appA (39 °C, 2 h);
Figure 3 shows the optimized combination and expression results of the glucanase (egⅡ)-cellulase (TeEGⅠ) bicistron: A. schematic design of the optimized egⅡ-TeEGⅠ bicistron; B. pH range of glucanase; C. pH range of cellulase; D. pH stability of glucanase; E. pH stability of appA cellulase;
Figure 4 compares the enzyme activity of the fusion enzyme with that of the single-gene expression products: A. cellulase activity under different pH conditions; B. tolerance of cellulase to different pH environments (39 °C, 2 h treatment); C. Q-PCR expression levels of the fusion enzyme and the single gene (TeEGI gene); D. Q-PCR expression levels of the fusion enzyme and the single gene (egII);
Figure 5 shows the detection of XAET expression in pK15 cells;
Figure 6 is the plasmid map of Cep112-npsp-XAET.
Detailed Description
I. Screening of target genes
The xylanase gene xynB from Aspergillus niger (Guo et al., 2013), the phytase gene appA from Escherichia coli, the cellulase gene egⅡ from Trichoderma reesei (Akbarzadeh et al., 2014) and the cellulase gene TeEGⅠ from cricket (Kim et al., 2008) were each analyzed with SignalP 4.1 Server to predict the signal peptide, and the native signal peptide of each was removed. The genes were then optimized according to porcine codon preference, and the signal peptide (sp) sequence of the parotid secretory protein (PSP) of porcine or bovine origin was added to the N-terminus of the amino acid sequence of each codon-optimized candidate gene, giving for example pigPSP-SP-xynB, pigPSP-SP-appA, pigPSP-SP-egⅡ and pigPSP-SP-teEGⅠ, abbreviated as pSPxyn, pSPappA, pSPeg2 and pSPTeEG respectively. The codon-optimized mature peptide genes were named pxyn (gene sequence shown in SEQ ID No:1), pappA (gene sequence shown in SEQ ID No:2), pegII (gene sequence shown in SEQ ID No:3) and pTEGI (gene sequence shown in SEQ ID No:4).
II. Construction of the xylanase (xynB)-phytase (appA) and cellulase (egⅡ)-cellulase (TeEGⅠ) polycistronic gene sequences
Using the rigid A3 peptide, xylanase (xynB) was linked to phytase (appA), and cellulase (egⅡ) to cellulase (TeEGⅠ), to construct bifunctional fusion enzymes. The stop codon of the gene upstream of A3 was removed, as was the signal peptide of the gene downstream of the A3 C-terminus.
The A3 sequences after optimization and mutation are as follows:
A3 (SEQ ID No:5): GAGGCTGCCGCCAAAGAAGCTGCCGCCAAGGAGGCTGCCGCCAAG
A3' (SEQ ID No:6): GAGGCCGCCGCCAAGGAGGCCGCCGCCAAGGAGGCCGCCGCCAAG
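As a quick check on the two linker sequences, the sketch below translates A3 and A3' with the standard genetic code and counts the nucleotide positions at which they differ. The codon table is deliberately limited to the six codons that occur in these two sequences, and the helper names `translate` and `mismatches` are introduced here for illustration only.

```python
# Standard genetic code, restricted to the codons present in A3 and A3'.
CODONS = {"GAG": "E", "GAA": "E", "GCT": "A", "GCC": "A", "AAA": "K", "AAG": "K"}

A3 = "GAGGCTGCCGCCAAAGAAGCTGCCGCCAAGGAGGCTGCCGCCAAG"        # SEQ ID No:5
A3_PRIME = "GAGGCCGCCGCCAAGGAGGCCGCCGCCAAGGAGGCCGCCGCCAAG"  # SEQ ID No:6


def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def mismatches(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


print(translate(A3))             # peptide encoded by A3
print(translate(A3_PRIME))       # peptide encoded by A3'
print(mismatches(A3, A3_PRIME))  # number of differing nucleotides
```

Both sequences encode the same rigid (EAAAK)3 peptide and differ only at synonymous positions, so the two linkers in the polycistron share the same protein sequence without being identical DNA repeats.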
The optimized A3-linked bifunctional enzyme gene sequences are xynB-A3-APPA and egII-A3'-TeEGI. The cistrons of these fusion-designed multifunctional enzymes were optimized for porcine codon usage: rare codons were removed and codons frequently used in porcine cells were selected.
The A3 repeats within the polycistron and the porcine parotid protein signal peptide were then re-optimized to reduce the effect of repeated sequences on the structural stability of the polycistron, and the optimized sequences were synthesized. The optimized xynB-A3-APPA gene sequence is shown in SEQ ID No:7 and its amino acid sequence in SEQ ID No:8; the optimized egII-A3'-TeEGI gene sequence is shown in SEQ ID No:9 and its amino acid sequence in SEQ ID No:10.
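The junction rule described in this section (drop the stop codon of the upstream gene, drop the signal peptide of the downstream gene, place the linker between them) can be sketched as follows. The function names, the toy coding sequences and the 9-nt signal length are hypothetical and chosen only to keep the example short; the A3 sequence is the one given in SEQ ID No:5.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
A3 = "GAGGCTGCCGCCAAAGAAGCTGCCGCCAAGGAGGCTGCCGCCAAG"  # SEQ ID No:5


def strip_stop(cds: str) -> str:
    """Remove a terminal stop codon, if present."""
    cds = cds.upper()
    return cds[:-3] if cds[-3:] in STOP_CODONS else cds


def fuse(upstream: str, linker: str, downstream: str, signal_nt: int) -> str:
    """Join two coding sequences through a linker, in frame.

    signal_nt is the length (in nucleotides) of the downstream signal
    peptide coding region to remove; it must be a multiple of 3.
    """
    if signal_nt % 3:
        raise ValueError("signal peptide length must be a multiple of 3")
    fused = strip_stop(upstream) + linker.upper() + downstream.upper()[signal_nt:]
    if len(fused) % 3:
        raise ValueError("fusion is not in frame")
    return fused


# Toy sequences (hypothetical): 'ATGGCTTCTTGA' = M-A-S-stop,
# 'ATGTTCCAG' stands in for a 9-nt signal peptide region.
fused = fuse("ATGGCTTCTTGA", A3, "ATGTTCCAGCAGTCCTGA", signal_nt=9)
print(fused)
```

In a real construct the signal peptide boundary would come from a prediction such as the SignalP analysis mentioned in section I, not from a fixed number.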
Using the furin recognition sequence and the highly efficient self-cleaving P2A sequence, pxynB-A3-APPA and pegII-A3'-TeEGI were joined to construct the multifunctional cistron pxynB-A3-APPA-furin-P2A-pegII-A3'-TeEGI (XAET) (gene sequence shown in SEQ ID No:11, amino acid sequence shown in SEQ ID No:12, gene structure shown in Figure 1), which was cloned into the BamHI/EcoRI sites of the multiple cloning site of the eukaryotic expression vector pcDNA3.1(+).
As controls, P2A was used to link xylanase (xynB) to phytase (appA) and cellulase (egⅡ) to cellulase (TeEGⅠ), giving the bifunctional enzymes pxynB-p2A-pAPPA and pegII-p2A-pTeEG, which were cloned into the multiple cloning site of the eukaryotic expression vector pcDNA3.1(+).
III. Construction of the xylanase (xynB)-phytase (appA)-cellulase (egⅡ)-cellulase (TeEGⅠ) eukaryotic expression vector
The optimized multifunctional enzyme cistron XAET was inserted into the BamHI/EcoRI sites of the multiple cloning site of the eukaryotic expression vector pcDNA3.1. Restriction digestion and sequencing confirmed that the XAET eukaryotic expression vector pCD-XAET had been constructed successfully.
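Directional cloning into BamHI/EcoRI is simplest when the insert carries no internal copy of either site. The sketch below scans a sequence for the standard recognition sites GGATCC (BamHI) and GAATTC (EcoRI); both are palindromic, so scanning one strand is sufficient. The helper name `internal_sites` is hypothetical, and the two test fragments are nucleotides 181-240 of SEQ ID No:1 and of SEQ ID No:7 as printed in the sequence listing. The patent does not describe such a check, so this is an illustration rather than the authors' procedure.

```python
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# Nucleotides 181-240 of SEQ ID No:1 and SEQ ID No:7 (from the sequence listing).
SEQ1_FRAGMENT = "gtggggaact tcgtgggagg aaagggatgg aacccaggat ccgcccagga tatcacctac"
SEQ7_FRAGMENT = "gtgggcaact tcgtgggagg caagggatgg aacccaggct ccgcccagga catcacctac"


def internal_sites(insert: str, sites: dict[str, str] = SITES) -> dict[str, list[int]]:
    """Return 1-based start positions of each recognition site in the insert."""
    seq = insert.upper().replace(" ", "")
    hits: dict[str, list[int]] = {}
    for name, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + 1)       # report 1-based coordinates
            start = seq.find(site, start + 1)
        hits[name] = positions
    return hits


print(internal_sites(SEQ1_FRAGMENT))
print(internal_sites(SEQ7_FRAGMENT))
```

Positions are relative to the fragment passed in, so scanning a full-length insert gives coordinates that can be compared directly with the sequence listing.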
IV. In vitro expression and functional verification of the xylanase (xynB)-phytase (appA)-cellulase (egⅡ)-cellulase (TeEGⅠ) multifunctional enzyme
The pCD-XAET eukaryotic expression vector was transiently transfected into the porcine kidney pK15 cell line according to the instructions of the Lipofectamine™ LTX + PLUS™ Reagent transfection kit (Invitrogen). After 48-72 h, the cell supernatant was collected as crude enzyme solution and assayed for enzyme activity and pH tolerance. The assay methods and activity definitions follow the standards for cellulase (NY/T 912-2004), glucanase (NY/T 911-2004), xylanase (GB/T 23874-2009) and phytase (GB/T 18634-2009).
1. Expression analysis of the xylanase (xynB) and phytase (appA) fusion enzyme
Xylanase (xynB) and phytase (appA) were fused with furin-P2A and with A3 respectively and expressed in porcine pK15 cells. The results show that the A3-linked fusion enzyme successfully expressed both xylanase and phytase, and both enzymes had high biological activity over pH 2.0-6.5. The xylanase activity of pxynB-A3-appA was higher than that of pxynB-p2A-pappA at pH 2.0-5.0 and slightly lower above pH 5.0, but the xylanase expressed by pxynB-A3-appA tolerated pH 2.0-7.0 markedly better than that of pxynB-p2A-pappA. The phytase expressed by pxynB-A3-appA was significantly better than that of pxynB-p2A-pappA in both activity and tolerance across the pH buffers tested. The results are shown in Figure 2.
2. Expression analysis of the cellulase (egⅡ)-cellulase (TeEGⅠ) fusion enzyme
Cellulose has a complex structure and low solubility, which are the main factors limiting its hydrolysis, and the cellulase genes identified so far generally have low activity. To improve cellulase activity, this study fused the egⅡ gene from Trichoderma reesei with the TeEGⅠ gene from cricket using A3 and 2A. The results show that the glucanase and cellulase expressed from the pegⅡ-A3'-TeEGⅠ bicistron were better than those from pegⅡ-p2A-pTeEGⅠ in both activity and tolerance across the pH buffers tested, with significantly higher enzyme activity. The results are shown in Figure 3.
Compared with single-enzyme gene expression, the cellulase activity expressed by the fusion enzyme pegⅡ-A3'-TeEGⅠ was increased. The quantitative results show that single-gene expression was generally more than twice that of the fusion enzyme (transfection efficiency is inversely related to plasmid size), yet the cellulase activity of the fusion enzyme of this design was still higher than that of the single gene. The results are shown in Figure 4.
3. Detection of polycistronic XAET expression in pK15 cells
The polycistronic XAET eukaryotic expression vector pCD-XAET was introduced into PK15 cells by electroporation and by liposome-mediated chemical transfection. The culture supernatant was collected after 48 h and assayed for expression. With both methods, polycistronic XAET successfully expressed xylanase, phytase and cellulase. The results are shown in Figure 5.
V. Construction of the xylanase (xynB)-phytase (appA)-cellulase (egⅡ)-cellulase (TeEGⅠ) (XAET) gene expression vector for site-directed integration at the CEP112 locus
First, the BEXA cistron in the previously developed vector CEP112-LA340RA3219 (from patent application No. 201711477805.5, "A construction method for transgenic pigs with site-directed integration of exogenous DNA", publication No. 108285906A) was replaced with the XAET polycistron to construct the new vector Cep112-mPSP-XAET. On this basis, Cep112-mPSP-XAET was linearized with PacI and SexAI, and the npsp upstream regulatory region was amplified with the inf-npsp primers in Table 1 and used to replace the existing sequence. npsp (-11.5 kb to -5.7 kb) extends the regulatory region of the original mpsp (-11.1 kb to -5.7 kb) by 395 bp. This gave the Cep112-npsp-XAET vector (sequence shown in SEQ ID No:13). Restriction digestion produced bands of the expected sizes, and sequencing confirmed that a transgenic vector capable of expressing the four functional enzymes of XAET specifically in porcine salivary glands had been obtained. The plasmid map is shown in Figure 6.
The resulting Cep112-npsp-XAET vector is a high-efficiency site-directed integration vector for the porcine CEP112 locus. It allows transgenic pigs to be produced efficiently and transgenic lines with a consistent integration site to be obtained quickly, laying the groundwork for the rapid breeding of new transgenic pig breeds with a uniform genetic background.
VI. Generation of pigs transgenic for the multifunctional fusion enzyme XAET
The successfully constructed Cep112-npsp-XAET vector was transfected into a porcine fibroblast cell line to obtain positive cell lines expressing the XAET polycistron. The positive cells were used as nuclear donors for nuclear transfer, and XAET transgenic pigs were obtained by somatic cell cloning.
The XAET transgenic pigs were identified at the gene and sequencing level, and saliva from positive pigs was collected and assayed. The XAET transgenic pigs all efficiently expressed phytase, xylanase, glucanase and cellulase. The xylanase and phytase activities were comparable to those of the corresponding single-gene transgenic pigs, and the cellulase activity was higher than that of single-gene transgenic pigs.
The above describes only some embodiments of the present invention. A person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and all of these fall within the scope of protection of the invention.
Sequence Listing
<110> Wen's Food Group Co., Ltd.; South China Agricultural University
<120> A multifunctional fusion enzyme XAET, a multifunctional fusion enzyme site-directed integration eukaryotic specific expression vector, and a construction method thereof
<130> 2019.11.21
<160> 13
<170> SIPOSequenceListing 1.0
<210> 1
<211> 627
<212> DNA
<213> Artificial synthesis
<400> 1
atgtttcaac tttggaaact tgttttcttg tgcggtctgc tcattgggac ctcagcatct 60
agcacacctt caagcacagg cgaaaacaat gggttctatt actccttctg gaccgacggg 120
ggcggcgatg tcacctacac aaacggagac gccggagcct acaccgtgga gtggagcaac 180
gtggggaact tcgtgggagg aaagggatgg aacccaggat ccgcccagga tatcacctac 240
tccggcacct ttacaccaag cggcaacgga tacctgtccg tgtacggatg gaccacagac 300
cctctgatcg agtactacat cgtggaaagc tacggcgatt acaaccccgg atccgggggc 360
acctacaaag ggaccgtgac atccgacggc agcgtgtacg atatctacac cgctacaagg 420
accaacgctg ccagcatcca gggcacagcc accttcacac agtactggtc cgtgcgccag 480
aacaagcggg tgggagggac cgtgaccaca agcaaccact ttaacgcctg ggccaaactg 540
ggaatgaacc tggggacaca caactaccag attgtcgcca ccgaaggcta ccagtcctca 600
ggctcatcct ccattacagt ccagtga 627
<210> 2
<211> 1293
<212> DNA
<213> Artificial synthesis
<400> 2
atgttccaac tgtggaagct ggtcttcctg tgtggtctgc tgattggcac ctctgcttcc 60
cagagcgaac ccgaactgaa actggaaagc gtcgtcatcg tctcccgcca cggagtccgc 120
gcccctacaa aagccaccca gctcatgcag gacgtgaccc ccgatgcctg gcctacatgg 180
ccagtcaagc tgggatggct cacccctagg ggaggagagc tgatcgccta cctcggacac 240
tatcagaggc agagactggt ggctgacgga ctgctcgcta agaaaggatg cccacagtcc 300
ggacaggtgg ctatcattgc tgacgtggat gagcgcaccc ggaagacagg agaagccttc 360
gccgctggac tggctccaga ttgcgctatc accgtgcaca cacaggccga caccagctcc 420
cccgatcctc tgtttaaccc cctcaaaacc ggcgtgtgcc agctggacaa cgccaatgtc 480
accgatgcta tcctgtctag ggccggaggc agcattgctg acttcaccgg ccatagacag 540
acagcctttc gcgagctgga acgggtgctc aacttccctc agagcaatct gtgcctcaag 600
cgcgagaaac aggacgaatc ttgtagcctg acccaggccc tcccatccga gctgaaggtg 660
tctgctgata acgtcagcct gaccggagcc gtgtccctcg cttctatgct gacagagatc 720
ttcctgctcc agcaggctca gggaatgcca gaaccaggat ggggccgcat taccgactcc 780
caccagtgga acacactgct ctctctgcat aatgcccagt tttacctgct ccagaggacc 840
ccagaggtgg ctaggtctag agctacaccc ctgctcgacc tcatcaagac cgccctgaca 900
cctcaccccc ctcagaaaca ggcttatggg gtgaccctgc caacaagcgt cctgttcatt 960
gccggacatg ataccaacct ggccaatctc gggggagctc tggaactcaa ctggaccctg 1020
cccggccagc ctgacaatac accacccggc ggggagctgg tgttcgaaag gtggcgccgg 1080
ctgagcgata actcccagtg gatccaggtg agcctggtct ttcagaccct gcagcagatg 1140
agagacaaga cccccctgtc cctcaacaca cctccaggag aggtcaaact gaccctcgcc 1200
ggctgcgagg aacgcaatgc tcaggggatg tgctctctcg ccggattcac ccagattgtc 1260
aacgaagccc gcattccagc ctgctccctg tga 1293
<210> 3
<211> 1251
<212> DNA
<213> Artificial synthesis
<400> 3
atgttccagc tgtggaagct ggtgttcctg tgcggcctgc tgatcggcac cagcgcccag 60
cagaccgtgt ggggccagtg tggcggaatt ggatggtccg ggcccacaaa ttgcgctcca 120
ggatccgcct gtagcaccct gaacccttac tatgctcagt gcatccccgg agccaccaca 180
attaccacat ccaccaggcc accaagcgga ccaaccacaa ccacaagagc cacctccaca 240
tccagctcta cacctcccac ctcatccgga gtgagattcg ctggggtcaa catcgccggg 300
ttcgactttg gctgcaccac agatgggaca tgtgtgacct ctaaggtcta cccacctctg 360
aaaaatttta ccggatccaa caattatcca gacgggatcg gccagatgca gcacttcgtg 420
aacgaagatg gaatgaccat ttttcggctc cctgtggggt ggcagtacct ggtcaacaat 480
aacctcgggg gcaacctgga ctcaacctcc atcagcaagt atgatcagct ggtgcaggga 540
tgcctgagcc tcggagctta ctgtatcgtc gacattcaca attatgccag gtggaacgga 600
gggatcattg gccagggcgg acctacaaat gcccagttca cctctctctg gtcacagctg 660
gcttccaaat acgcctctca gtcacgagtg tggtttggca tcatgaacga gccccatgac 720
gtgaatatta acacatgggc cgctaccgtc caggaagtgg tcacagccat ccgcaatgct 780
ggcgccacct ctcagttcat ttcactgcca ggaaacgact ggcagtccgc tggagccttt 840
atctccgatg gaagcgctgc tgctctgtcc caggtgacca atccagacgg cagcaccaca 900
aacctcatct tcgatgtcca caagtacctg gactctgata actcagggac ccatgccgag 960
tgcaccacaa ataacatcga cggcgctttt agcccactcg ccacctggct gcgccagaat 1020
aaccggcagg ccatcctgac cgaaacaggg ggcggaaacg tgcagtcctg catccaggac 1080
atgtgtcagc agattcagta cctcaatcag aacagcgatg tgtacctggg atatgtcgga 1140
tggggagctg gatccttcga cagcacctac gtgctgaccg agacacccac cagctctggc 1200
aactcttgga cagatacctc actcgtgtca tcctgtctgg cccgaaaatg a 1251
<210> 4
<211> 1365
<212> DNA
<213> Artificial synthesis
<400> 4
atggttcagc tttggaaact tgttctcttg tgcggcctgc tcgccgggac ctcagcgtct 60
ggcagctacg actacgccga cgtgatcaag aagtccctgc tgttctacca ggctcagcgc 120
agcggccggc tgagcggcat ggaccccctg gtgagctgga ggaaggactc cgccctgaac 180
gacagaggaa acaacggaga ggacctgacc ggaggatact acgacgctgg cgacttcgtg 240
aagttcggct tccccatggc ctacaccatc accctgctga gctggggcgt gatcgactac 300
gagaacacct acagctccat cggcgccctg tccgccgccc gcgccgccat caagtggggc 360
accgactact tcatcaaggc ccacgtgagc gccaacgagc tgtacggaca ggtcggaaac 420
ggaggagctg accactcctg gtggggcagg cccgaggaca tgaacatgga ccggcccgcc 480
tacaagatcg acacctcccg gccaggcagc gacctggccg ccgagaccgc cgccgccatg 540
gccgccgcca gcatcgtgtt caagaacgcc gactccaact acgccaacac cctgctgagg 600
cacgccaagg agctgtacaa cttcgccgac aactacaggg gcaagtacag cgactccatc 660
agcgacgccg ccgccttcta caactcctac agctacgagg acgagctggt gtggggagct 720
atctggctgt ggagggctac caacgaccag aactacctga acaaggccac ccagtactac 780
aaccagtaca gcatccagta caagaactcc cccctgagct gggacgacaa gtccaccgga 840
gctagcgccc tgctggctaa gctgaccgga ggcgaccagt acaagtccgc cgtgcagagc 900
ttctgcgacg gcttctacta caaccagcag aagaccccca agggcctgat ctggtactcc 960
gactggggca gcctgaggca gtccatgaac gccgtgtggg tgtgcctcca ggccgccgac 1020
gctggagtga agaccggaga gtaccgcagc ctggccaaga agcagctgga ctacgctctg 1080
ggcgacgccg gccggtcctt cgtggtgggc ttcggcaaca acccccccag ccacgagcag 1140
cacagggctg cttcctgccc agacgctcct gccgcctgcg actggaacac ctacaacggc 1200
ggccagtcca actaccacgt gctgtacggc gccctggtgg gaggaccaga cgccaacgac 1260
tactacaacg acgtgagaag cgactacgtg cacaacgagg tggcctgcga ctacaacgcc 1320
ggcttccaga acgtgctggt gtccctgaag gccaacggct actga 1365
<210> 5
<211> 45
<212> DNA
<213> Artificial synthesis
<400> 5
gaggctgccg ccaaagaagc tgccgccaag gaggctgccg ccaag 45
<210> 6
<211> 45
<212> DNA
<213> Artificial synthesis
<400> 6
gaggccgccg ccaaggaggc cgccgccaag gaggccgccg ccaag 45
<210> 7
<211> 1980
<212> DNA
<213> Artificial synthesis
<400> 7
atgttccagc tgtggaagct ggtgttcctg tgcggactgc tgatcggcac cagcgcctcc 60
agcaccccct ccagcaccgg agagaacaac ggcttctact actccttctg gaccgacgga 120
ggaggcgacg tgacctacac caacggcgac gccggagctt acaccgtgga gtggagcaac 180
gtgggcaact tcgtgggagg caagggatgg aacccaggct ccgcccagga catcacctac 240
tccggcacct tcaccccaag cggcaacggc tacctgtccg tgtacggctg gaccaccgac 300
cccctgatcg agtactacat cgtggagagc tacggcgact acaacccagg ctccggaggc 360
acctacaagg gcaccgtgac cagcgacggc tccgtgtacg acatctacac cgctaccagg 420
accaacgctg ccagcatcca gggcaccgcc accttcaccc agtactggtc cgtgaggcag 480
aacaagagag tgggcggcac cgtgaccacc agcaaccact tcaacgcctg ggccaagctg 540
ggcatgaacc tgggcaccca caactaccag atcgtggcta ccgagggcta ccagtccagc 600
ggctccagct ccatcaccgt gcaggaggct gccgccaaag aagctgccgc caaggaggct 660
gccgccaagc agtccgagcc agagctgaag ctggagagcg tggtcatcgt gtcccgccac 720
ggcgtgcgcg ctccaaccaa ggccacccag ctgatgcagg acgtgacccc agacgcttgg 780
ccaacctggc cagtgaagct gggatggctg acccccaggg gcggagagct gatcgcctac 840
ctgggccact accagaggca gagactggtg gctgacggac tgctggccaa gaagggatgc 900
ccacagagcg gacaggtggc tatcatcgct gacgtggacg agcgcacccg gaagaccgga 960
gaggccttcg ccgccggcct ggccccagac tgcgctatca ccgtgcacac ccaggctgac 1020
accagctccc ccgacccact gttcaaccca ctgaagaccg gcgtgtgcca gctggacaac 1080
gccaacgtga ccgacgctat cctgagccgc gccggaggct ccatcgctga cttcaccgga 1140
cacaggcaga ccgccttcag ggagctggag agagtgctga acttccccca gtccaacctg 1200
tgcctgaagc gggagaagca ggacgagagc tgctccctga cccaggccct gccaagcgag 1260
ctgaaggtgt ccgccgacaa cgtgagcctg accggagccg tgagcctggc ctccatgctg 1320
accgagatct tcctgctcca gcaggctcag ggaatgccag agccaggatg gggaaggatc 1380
accgacagcc accagtggaa caccctgctg tccctgcaca acgcccagtt ctacctgctc 1440
cagcggaccc cagaggtggc taggagcaga gccaccccac tgctggacct gatcaagacc 1500
gccctgaccc cacacccacc acagaagcag gcctacggcg tgaccctgcc aacctccgtg 1560
ctgttcatcg ccggccacga caccaacctg gctaacctgg gaggcgccct ggagctgaac 1620
tggaccctgc caggacagcc agacaacacc ccaccaggag gagagctggt gttcgagagg 1680
tggcgccggc tgagcgacaa ctcccagtgg attcaggtgt ccctggtgtt ccagaccctc 1740
cagcagatga gagacaagac cccactgtcc ctgaacaccc caccaggaga ggtgaagctg 1800
accctggccg gatgcgagga gaggaacgct cagggaatgt gcagcctggc cggcttcacc 1860
cagatcgtga acgaggctag aatccccgcc tgctccctga gggtgaagag gggcagcgga 1920
gctaccaact tctccctgct gaagcaggct ggcgacgtgg aggagaaccc aggaccatga 1980
<210> 8
<211> 659
<212> PRT
<213> Artificial synthesis
<400> 8
Met Phe Gln Leu Trp Lys Leu Val Phe Leu Cys Gly Leu Leu Ile Gly
1 5 10 15
Thr Ser Ala Ser Ser Thr Pro Ser Ser Thr Gly Glu Asn Asn Gly Phe
20 25 30
Tyr Tyr Ser Phe Trp Thr Asp Gly Gly Gly Asp Val Thr Tyr Thr Asn
35 40 45
Gly Asp Ala Gly Ala Tyr Thr Val Glu Trp Ser Asn Val Gly Asn Phe
50 55 60
Val Gly Gly Lys Gly Trp Asn Pro Gly Ser Ala Gln Asp Ile Thr Tyr
65 70 75 80
Ser Gly Thr Phe Thr Pro Ser Gly Asn Gly Tyr Leu Ser Val Tyr Gly
85 90 95
Trp Thr Thr Asp Pro Leu Ile Glu Tyr Tyr Ile Val Glu Ser Tyr Gly
100 105 110
Asp Tyr Asn Pro Gly Ser Gly Gly Thr Tyr Lys Gly Thr Val Thr Ser
115 120 125
Asp Gly Ser Val Tyr Asp Ile Tyr Thr Ala Thr Arg Thr Asn Ala Ala
130 135 140
Ser Ile Gln Gly Thr Ala Thr Phe Thr Gln Tyr Trp Ser Val Arg Gln
145 150 155 160
Asn Lys Arg Val Gly Gly Thr Val Thr Thr Ser Asn His Phe Asn Ala
165 170 175
Trp Ala Lys Leu Gly Met Asn Leu Gly Thr His Asn Tyr Gln Ile Val
180 185 190
Ala Thr Glu Gly Tyr Gln Ser Ser Gly Ser Ser Ser Ile Thr Val Gln
195 200 205
Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Gln
210 215 220
Ser Glu Pro Glu Leu Lys Leu Glu Ser Val Val Ile Val Ser Arg His
225 230 235 240
Gly Val Arg Ala Pro Thr Lys Ala Thr Gln Leu Met Gln Asp Val Thr
245 250 255
Pro Asp Ala Trp Pro Thr Trp Pro Val Lys Leu Gly Trp Leu Thr Pro
260 265 270
Arg Gly Gly Glu Leu Ile Ala Tyr Leu Gly His Tyr Gln Arg Gln Arg
275 280 285
Leu Val Ala Asp Gly Leu Leu Ala Lys Lys Gly Cys Pro Gln Ser Gly
290 295 300
Gln Val Ala Ile Ile Ala Asp Val Asp Glu Arg Thr Arg Lys Thr Gly
305 310 315 320
Glu Ala Phe Ala Ala Gly Leu Ala Pro Asp Cys Ala Ile Thr Val His
325 330 335
Thr Gln Ala Asp Thr Ser Ser Pro Asp Pro Leu Phe Asn Pro Leu Lys
340 345 350
Thr Gly Val Cys Gln Leu Asp Asn Ala Asn Val Thr Asp Ala Ile Leu
355 360 365
Ser Arg Ala Gly Gly Ser Ile Ala Asp Phe Thr Gly His Arg Gln Thr
370 375 380
Ala Phe Arg Glu Leu Glu Arg Val Leu Asn Phe Pro Gln Ser Asn Leu
385 390 395 400
Cys Leu Lys Arg Glu Lys Gln Asp Glu Ser Cys Ser Leu Thr Gln Ala
405 410 415
Leu Pro Ser Glu Leu Lys Val Ser Ala Asp Asn Val Ser Leu Thr Gly
420 425 430
Ala Val Ser Leu Ala Ser Met Leu Thr Glu Ile Phe Leu Leu Gln Gln
435 440 445
Ala Gln Gly Met Pro Glu Pro Gly Trp Gly Arg Ile Thr Asp Ser His
450 455 460
Gln Trp Asn Thr Leu Leu Ser Leu His Asn Ala Gln Phe Tyr Leu Leu
465 470 475 480
Gln Arg Thr Pro Glu Val Ala Arg Ser Arg Ala Thr Pro Leu Leu Asp
485 490 495
Leu Ile Lys Thr Ala Leu Thr Pro His Pro Pro Gln Lys Gln Ala Tyr
500 505 510
Gly Val Thr Leu Pro Thr Ser Val Leu Phe Ile Ala Gly His Asp Thr
515 520 525
Asn Leu Ala Asn Leu Gly Gly Ala Leu Glu Leu Asn Trp Thr Leu Pro
530 535 540
Gly Gln Pro Asp Asn Thr Pro Pro Gly Gly Glu Leu Val Phe Glu Arg
545 550 555 560
Trp Arg Arg Leu Ser Asp Asn Ser Gln Trp Ile Gln Val Ser Leu Val
565 570 575
Phe Gln Thr Leu Gln Gln Met Arg Asp Lys Thr Pro Leu Ser Leu Asn
580 585 590
Thr Pro Pro Gly Glu Val Lys Leu Thr Leu Ala Gly Cys Glu Glu Arg
595 600 605
Asn Ala Gln Gly Met Cys Ser Leu Ala Gly Phe Thr Gln Ile Val Asn
610 615 620
Glu Ala Arg Ile Pro Ala Cys Ser Leu Arg Val Lys Arg Gly Ser Gly
625 630 635 640
Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val Glu Glu Asn
645 650 655
Pro Gly Pro
<210> 9
<211> 2598
<212> DNA
<213> Artificial synthesis
<400> 9
atgtttcagc tctggaagct cgtgtttctc tgcggactcc tcatcgggac ctcagcccag 60
cagaccgtgt ggggacagtg tggcggaatc ggctggtccg gcccaaccaa ctgcgcccca 120
ggcagcgcct gctccaccct gaacccctac tacgcccagt gcatcccagg cgccaccacc 180
atcaccacca gcaccaggcc cccatccggc cccaccacca ccaccagagc cacctccacc 240
agcagctcca ccccacccac ctccagcggc gtgagattcg ccggcgtgaa catcgccggc 300
ttcgacttcg gctgcaccac cgacggcacc tgcgtgacca gcaaggtgta ccccccactg 360
aagaacttca ccggcagcaa caactaccca gacggcatcg gccagatgca gcacttcgtg 420
aacgaggacg gcatgaccat cttccggctg cccgtgggct ggcagtacct ggtgaacaac 480
aacctgggcg gcaacctgga cagcacctcc atcagcaagt acgaccagct ggtgcagggc 540
tgcctgagcc tgggcgccta ctgcatcgtg gacatccaca actacgccag atggaacggc 600
ggcatcatcg gccagggcgg ccccaccaac gcccagttca ccagcctgtg gtcccagctg 660
gcctccaagt acgccagcca gtccagagtg tggttcggca tcatgaacga gccacacgac 720
gtgaacatca acacctgggc cgccaccgtg caggaggtgg tgaccgccat cagaaacgcc 780
ggcgccacct cccagttcat ctccctgcca ggcaacgact ggcagagcgc cggcgccttc 840
atctccgacg gcagcgccgc cgccctgagc caggtgacca accccgacgg cagcaccacc 900
aacctgatct tcgacgtgca caagtacctg gactccgaca acagcggcac ccacgccgag 960
tgcaccacca acaacatcga cggcgccttc tccccactgg ccacctggct gagacagaac 1020
aacagacagg ccatcctgac cgagaccggc ggcggcaacg tgcagtcctg catccaggac 1080
atgtgccagc agatccagta cctgaaccag aacagcgacg tgtacctggg ctacgtgggc 1140
tggggcgccg gctccttcga cagcacctac gtgctgaccg agacccccac ctcctccggc 1200
aacagctgga ccgacacctc cctggtgtcc agctgcctgg ccagaaagga ggccgccgcc 1260
aaggaggccg ccgccaagga ggccgccgcc aagggcagct acgactacgc cgacgtgatc 1320
aagaagagcc tgctgttcta ccaggcccag cggtccggca ggctgagcgg catggaccca 1380
ctggtgtcct ggagaaagga cagcgccctg aacgacaggg gcaacaacgg cgaggacctg 1440
accggcggct actacgacgc cggcgacttc gtgaagttcg gcttcccaat ggcctacacc 1500
atcaccctgc tgagctgggg cgtgatcgac tacgagaaca cctactccag catcggcgcc 1560
ctgagcgccg ccagagccgc catcaagtgg ggcaccgact acttcatcaa ggcccacgtg 1620
tccgccaacg agctgtacgg ccaggtgggc aacggcggcg ccgaccacag ctggtggggc 1680
agacccgagg acatgaacat ggacaggcca gcctacaaga tcgacacctc cagaccaggc 1740
tccgacctgg ccgccgagac cgccgccgcc atggccgccg cctccatcgt gttcaagaac 1800
gccgactcca actacgccaa caccctgctg agacacgcca aggagctgta caacttcgcc 1860
gacaactaca ggggcaagta ctccgactcc atcagcgacg ccgccgcctt ctacaacagc 1920
tacagctacg aggacgagct ggtgtggggc gccatctggc tgtggagagc caccaacgac 1980
cagaactacc tgaacaaggc cacccagtac tacaaccagt actccatcca gtacaagaac 2040
agcccactgt cctgggacga caagagcacc ggcgcctccg ccctgctggc caagctgacc 2100
ggcggcgacc agtacaagag cgccgtgcag agcttctgcg acggcttcta ctacaaccag 2160
cagaagaccc caaagggcct gatctggtac tccgactggg gctccctgag acagtccatg 2220
aacgccgtgt gggtgtgcct gcaagccgcc gacgccggcg tgaagaccgg cgagtacaga 2280
tccctggcca agaagcagct ggactacgcc ctgggcgacg ccggccggag cttcgtggtg 2340
ggcttcggca acaacccacc ctcccacgag cagcacaggg ccgccagctg cccagacgcc 2400
cccgccgcct gcgactggaa cacctacaac ggcggccagt ccaactacca cgtgctgtac 2460
ggcgccctgg tgggcggccc cgacgccaac gactactaca acgacgtgag atccgactac 2520
gtgcacaacg aggtggcctg cgactacaac gctggatttc agaatgtcct cgtgtcactc 2580
aaggctaatg gctactga 2598
<210> 10
<211> 865
<212> PRT
<213> Artificial synthesis
<400> 10
Met Phe Gln Leu Trp Lys Leu Val Phe Leu Cys Gly Leu Leu Ile Gly
1 5 10 15
Thr Ser Ala Gln Gln Thr Val Trp Gly Gln Cys Gly Gly Ile Gly Trp
20 25 30
Ser Gly Pro Thr Asn Cys Ala Pro Gly Ser Ala Cys Ser Thr Leu Asn
35 40 45
Pro Tyr Tyr Ala Gln Cys Ile Pro Gly Ala Thr Thr Ile Thr Thr Ser
50 55 60
Thr Arg Pro Pro Ser Gly Pro Thr Thr Thr Thr Arg Ala Thr Ser Thr
65 70 75 80
Ser Ser Ser Thr Pro Pro Thr Ser Ser Gly Val Arg Phe Ala Gly Val
85 90 95
Asn Ile Ala Gly Phe Asp Phe Gly Cys Thr Thr Asp Gly Thr Cys Val
100 105 110
Thr Ser Lys Val Tyr Pro Pro Leu Lys Asn Phe Thr Gly Ser Asn Asn
115 120 125
Tyr Pro Asp Gly Ile Gly Gln Met Gln His Phe Val Asn Glu Asp Gly
130 135 140
Met Thr Ile Phe Arg Leu Pro Val Gly Trp Gln Tyr Leu Val Asn Asn
145 150 155 160
Asn Leu Gly Gly Asn Leu Asp Ser Thr Ser Ile Ser Lys Tyr Asp Gln
165 170 175
Leu Val Gln Gly Cys Leu Ser Leu Gly Ala Tyr Cys Ile Val Asp Ile
180 185 190
His Asn Tyr Ala Arg Trp Asn Gly Gly Ile Ile Gly Gln Gly Gly Pro
195 200 205
Thr Asn Ala Gln Phe Thr Ser Leu Trp Ser Gln Leu Ala Ser Lys Tyr
210 215 220
Ala Ser Gln Ser Arg Val Trp Phe Gly Ile Met Asn Glu Pro His Asp
225 230 235 240
Val Asn Ile Asn Thr Trp Ala Ala Thr Val Gln Glu Val Val Thr Ala
245 250 255
Ile Arg Asn Ala Gly Ala Thr Ser Gln Phe Ile Ser Leu Pro Gly Asn
260 265 270
Asp Trp Gln Ser Ala Gly Ala Phe Ile Ser Asp Gly Ser Ala Ala Ala
275 280 285
Leu Ser Gln Val Thr Asn Pro Asp Gly Ser Thr Thr Asn Leu Ile Phe
290 295 300
Asp Val His Lys Tyr Leu Asp Ser Asp Asn Ser Gly Thr His Ala Glu
305 310 315 320
Cys Thr Thr Asn Asn Ile Asp Gly Ala Phe Ser Pro Leu Ala Thr Trp
325 330 335
Leu Arg Gln Asn Asn Arg Gln Ala Ile Leu Thr Glu Thr Gly Gly Gly
340 345 350
Asn Val Gln Ser Cys Ile Gln Asp Met Cys Gln Gln Ile Gln Tyr Leu
355 360 365
Asn Gln Asn Ser Asp Val Tyr Leu Gly Tyr Val Gly Trp Gly Ala Gly
370 375 380
Ser Phe Asp Ser Thr Tyr Val Leu Thr Glu Thr Pro Thr Ser Ser Gly
385 390 395 400
Asn Ser Trp Thr Asp Thr Ser Leu Val Ser Ser Cys Leu Ala Arg Lys
405 410 415
Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Gly
420 425 430
Ser Tyr Asp Tyr Ala Asp Val Ile Lys Lys Ser Leu Leu Phe Tyr Gln
435 440 445
Ala Gln Arg Ser Gly Arg Leu Ser Gly Met Asp Pro Leu Val Ser Trp
450 455 460
Arg Lys Asp Ser Ala Leu Asn Asp Arg Gly Asn Asn Gly Glu Asp Leu
465 470 475 480
Thr Gly Gly Tyr Tyr Asp Ala Gly Asp Phe Val Lys Phe Gly Phe Pro
485 490 495
Met Ala Tyr Thr Ile Thr Leu Leu Ser Trp Gly Val Ile Asp Tyr Glu
500 505 510
Asn Thr Tyr Ser Ser Ile Gly Ala Leu Ser Ala Ala Arg Ala Ala Ile
515 520 525
Lys Trp Gly Thr Asp Tyr Phe Ile Lys Ala His Val Ser Ala Asn Glu
530 535 540
Leu Tyr Gly Gln Val Gly Asn Gly Gly Ala Asp His Ser Trp Trp Gly
545 550 555 560
Arg Pro Glu Asp Met Asn Met Asp Arg Pro Ala Tyr Lys Ile Asp Thr
565 570 575
Ser Arg Pro Gly Ser Asp Leu Ala Ala Glu Thr Ala Ala Ala Met Ala
580 585 590
Ala Ala Ser Ile Val Phe Lys Asn Ala Asp Ser Asn Tyr Ala Asn Thr
595 600 605
Leu Leu Arg His Ala Lys Glu Leu Tyr Asn Phe Ala Asp Asn Tyr Arg
610 615 620
Gly Lys Tyr Ser Asp Ser Ile Ser Asp Ala Ala Ala Phe Tyr Asn Ser
625 630 635 640
Tyr Ser Tyr Glu Asp Glu Leu Val Trp Gly Ala Ile Trp Leu Trp Arg
645 650 655
Ala Thr Asn Asp Gln Asn Tyr Leu Asn Lys Ala Thr Gln Tyr Tyr Asn
660 665 670
Gln Tyr Ser Ile Gln Tyr Lys Asn Ser Pro Leu Ser Trp Asp Asp Lys
675 680 685
Ser Thr Gly Ala Ser Ala Leu Leu Ala Lys Leu Thr Gly Gly Asp Gln
690 695 700
Tyr Lys Ser Ala Val Gln Ser Phe Cys Asp Gly Phe Tyr Tyr Asn Gln
705 710 715 720
Gln Lys Thr Pro Lys Gly Leu Ile Trp Tyr Ser Asp Trp Gly Ser Leu
725 730 735
Arg Gln Ser Met Asn Ala Val Trp Val Cys Leu Gln Ala Ala Asp Ala
740 745 750
Gly Val Lys Thr Gly Glu Tyr Arg Ser Leu Ala Lys Lys Gln Leu Asp
755 760 765
Tyr Ala Leu Gly Asp Ala Gly Arg Ser Phe Val Val Gly Phe Gly Asn
770 775 780
Asn Pro Pro Ser His Glu Gln His Arg Ala Ala Ser Cys Pro Asp Ala
785 790 795 800
Pro Ala Ala Cys Asp Trp Asn Thr Tyr Asn Gly Gly Gln Ser Asn Tyr
805 810 815
His Val Leu Tyr Gly Ala Leu Val Gly Gly Pro Asp Ala Asn Asp Tyr
820 825 830
Tyr Asn Asp Val Arg Ser Asp Tyr Val His Asn Glu Val Ala Cys Asp
835 840 845
Tyr Asn Ala Gly Phe Gln Asn Val Leu Val Ser Leu Lys Ala Asn Gly
850 855 860
Tyr
865
<210> 11
<211> 4581
<212> DNA
<213> Artificial synthesis
<400> 11
gccaccatgt tccagctgtg gaagctggtg ttcctgtgcg gactgctgat cggcaccagc 60
gcctccagca ccccctccag caccggagag aacaacggct tctactactc cttctggacc 120
gacggaggag gcgacgtgac ctacaccaac ggcgacgccg gagcttacac cgtggagtgg 180
agcaacgtgg gcaacttcgt gggaggcaag ggatggaacc caggctccgc ccaggacatc 240
acctactccg gcaccttcac cccaagcggc aacggctacc tgtccgtgta cggctggacc 300
accgaccccc tgatcgagta ctacatcgtg gagagctacg gcgactacaa cccaggctcc 360
ggaggcacct acaagggcac cgtgaccagc gacggctccg tgtacgacat ctacaccgct 420
accaggacca acgctgccag catccagggc accgccacct tcacccagta ctggtccgtg 480
aggcagaaca agagagtggg cggcaccgtg accaccagca accacttcaa cgcctgggcc 540
aagctgggca tgaacctggg cacccacaac taccagatcg tggctaccga gggctaccag 600
tccagcggct ccagctccat caccgtgcag gaggctgccg ccaaagaagc tgccgccaag 660
gaggctgccg ccaagcagtc cgagccagag ctgaagctgg agagcgtggt catcgtgtcc 720
cgccacggcg tgcgcgctcc aaccaaggcc acccagctga tgcaggacgt gaccccagac 780
gcttggccaa cctggccagt gaagctggga tggctgaccc ccaggggcgg agagctgatc 840
gcctacctgg gccactacca gaggcagaga ctggtggctg acggactgct ggccaagaag 900
ggatgcccac agagcggaca ggtggctatc atcgctgacg tggacgagcg cacccggaag 960
accggagagg ccttcgccgc cggcctggcc ccagactgcg ctatcaccgt gcacacccag 1020
gctgacacca gctcccccga cccactgttc aacccactga agaccggcgt gtgccagctg 1080
gacaacgcca acgtgaccga cgctatcctg agccgcgccg gaggctccat cgctgacttc 1140
accggacaca ggcagaccgc cttcagggag ctggagagag tgctgaactt cccccagtcc 1200
aacctgtgcc tgaagcggga gaagcaggac gagagctgct ccctgaccca ggccctgcca 1260
agcgagctga aggtgtccgc cgacaacgtg agcctgaccg gagccgtgag cctggcctcc 1320
atgctgaccg agatcttcct gctccagcag gctcagggaa tgccagagcc aggatgggga 1380
aggatcaccg acagccacca gtggaacacc ctgctgtccc tgcacaacgc ccagttctac 1440
ctgctccagc ggaccccaga ggtggctagg agcagagcca ccccactgct ggacctgatc 1500
aagaccgccc tgaccccaca cccaccacag aagcaggcct acggcgtgac cctgccaacc 1560
tccgtgctgt tcatcgccgg ccacgacacc aacctggcta acctgggagg cgccctggag 1620
ctgaactgga ccctgccagg acagccagac aacaccccac caggaggaga gctggtgttc 1680
gagaggtggc gccggctgag cgacaactcc cagtggattc aggtgtccct ggtgttccag 1740
accctccagc agatgagaga caagacccca ctgtccctga acaccccacc aggagaggtg 1800
aagctgaccc tggccggatg cgaggagagg aacgctcagg gaatgtgcag cctggccggc 1860
ttcacccaga tcgtgaacga ggctagaatc cccgcctgct ccctgagggt gaagaggggc 1920
agcggagcta ccaacttctc cctgctgaag caggctggcg acgtggagga gaacccagga 1980
ccaatgtttc agctctggaa gctcgtgttt ctctgcggac tcctcatcgg gacctcagcc 2040ccaatgtttc agctctggaa gctcgtgttt ctctgcggac tcctcatcgg gacctcagcc 2040
cagcagaccg tgtggggaca gtgtggcgga atcggctggt ccggcccaac caactgcgcc 2100cagcagaccg tgtggggaca gtgtggcgga atcggctggt ccggcccaac caactgcgcc 2100
ccaggcagcg cctgctccac cctgaacccc tactacgccc agtgcatccc aggcgccacc 2160ccaggcagcg cctgctccac cctgaaccccc tactacgccc agtgcatccc aggcgccacc 2160
accatcacca ccagcaccag gcccccatcc ggccccacca ccaccaccag agccacctcc 2220
accagcagct ccaccccacc cacctccagc ggcgtgagat tcgccggcgt gaacatcgcc 2280
ggcttcgact tcggctgcac caccgacggc acctgcgtga ccagcaaggt gtacccccca 2340
ctgaagaact tcaccggcag caacaactac ccagacggca tcggccagat gcagcacttc 2400
gtgaacgagg acggcatgac catcttccgg ctgcccgtgg gctggcagta cctggtgaac 2460
aacaacctgg gcggcaacct ggacagcacc tccatcagca agtacgacca gctggtgcag 2520
ggctgcctga gcctgggcgc ctactgcatc gtggacatcc acaactacgc cagatggaac 2580
ggcggcatca tcggccaggg cggccccacc aacgcccagt tcaccagcct gtggtcccag 2640
ctggcctcca agtacgccag ccagtccaga gtgtggttcg gcatcatgaa cgagccacac 2700
gacgtgaaca tcaacacctg ggccgccacc gtgcaggagg tggtgaccgc catcagaaac 2760
gccggcgcca cctcccagtt catctccctg ccaggcaacg actggcagag cgccggcgcc 2820
ttcatctccg acggcagcgc cgccgccctg agccaggtga ccaaccccga cggcagcacc 2880
accaacctga tcttcgacgt gcacaagtac ctggactccg acaacagcgg cacccacgcc 2940
gagtgcacca ccaacaacat cgacggcgcc ttctccccac tggccacctg gctgagacag 3000
aacaacagac aggccatcct gaccgagacc ggcggcggca acgtgcagtc ctgcatccag 3060
gacatgtgcc agcagatcca gtacctgaac cagaacagcg acgtgtacct gggctacgtg 3120
ggctggggcg ccggctcctt cgacagcacc tacgtgctga ccgagacccc cacctcctcc 3180
ggcaacagct ggaccgacac ctccctggtg tccagctgcc tggccagaaa ggaggccgcc 3240
gccaaggagg ccgccgccaa ggaggccgcc gccaagggca gctacgacta cgccgacgtg 3300
atcaagaaga gcctgctgtt ctaccaggcc cagcggtccg gcaggctgag cggcatggac 3360
ccactggtgt cctggagaaa ggacagcgcc ctgaacgaca ggggcaacaa cggcgaggac 3420
ctgaccggcg gctactacga cgccggcgac ttcgtgaagt tcggcttccc aatggcctac 3480
accatcaccc tgctgagctg gggcgtgatc gactacgaga acacctactc cagcatcggc 3540
gccctgagcg ccgccagagc cgccatcaag tggggcaccg actacttcat caaggcccac 3600
gtgtccgcca acgagctgta cggccaggtg ggcaacggcg gcgccgacca cagctggtgg 3660
ggcagacccg aggacatgaa catggacagg ccagcctaca agatcgacac ctccagacca 3720
ggctccgacc tggccgccga gaccgccgcc gccatggccg ccgcctccat cgtgttcaag 3780
aacgccgact ccaactacgc caacaccctg ctgagacacg ccaaggagct gtacaacttc 3840
gccgacaact acaggggcaa gtactccgac tccatcagcg acgccgccgc cttctacaac 3900
agctacagct acgaggacga gctggtgtgg ggcgccatct ggctgtggag agccaccaac 3960
gaccagaact acctgaacaa ggccacccag tactacaacc agtactccat ccagtacaag 4020
aacagcccac tgtcctggga cgacaagagc accggcgcct ccgccctgct ggccaagctg 4080
accggcggcg accagtacaa gagcgccgtg cagagcttct gcgacggctt ctactacaac 4140
cagcagaaga ccccaaaggg cctgatctgg tactccgact ggggctccct gagacagtcc 4200
atgaacgccg tgtgggtgtg cctgcaagcc gccgacgccg gcgtgaagac cggcgagtac 4260
agatccctgg ccaagaagca gctggactac gccctgggcg acgccggccg gagcttcgtg 4320
gtgggcttcg gcaacaaccc accctcccac gagcagcaca gggccgccag ctgcccagac 4380
gcccccgccg cctgcgactg gaacacctac aacggcggcc agtccaacta ccacgtgctg 4440
tacggcgccc tggtgggcgg ccccgacgcc aacgactact acaacgacgt gagatccgac 4500
tacgtgcaca acgaggtggc ctgcgactac aacgctggat ttcagaatgt cctcgtgtca 4560
ctcaaggcta atggctactg a 4581
<210> 12
<211> 1524
<212> PRT
<213> Artificial synthesis
<400> 12
Met Phe Gln Leu Trp Lys Leu Val Phe Leu Cys Gly Leu Leu Ile Gly
1 5 10 15
Thr Ser Ala Ser Ser Thr Pro Ser Ser Thr Gly Glu Asn Asn Gly Phe
20 25 30
Tyr Tyr Ser Phe Trp Thr Asp Gly Gly Gly Asp Val Thr Tyr Thr Asn
35 40 45
Gly Asp Ala Gly Ala Tyr Thr Val Glu Trp Ser Asn Val Gly Asn Phe
50 55 60
Val Gly Gly Lys Gly Trp Asn Pro Gly Ser Ala Gln Asp Ile Thr Tyr
65 70 75 80
Ser Gly Thr Phe Thr Pro Ser Gly Asn Gly Tyr Leu Ser Val Tyr Gly
85 90 95
Trp Thr Thr Asp Pro Leu Ile Glu Tyr Tyr Ile Val Glu Ser Tyr Gly
100 105 110
Asp Tyr Asn Pro Gly Ser Gly Gly Thr Tyr Lys Gly Thr Val Thr Ser
115 120 125
Asp Gly Ser Val Tyr Asp Ile Tyr Thr Ala Thr Arg Thr Asn Ala Ala
130 135 140
Ser Ile Gln Gly Thr Ala Thr Phe Thr Gln Tyr Trp Ser Val Arg Gln
145 150 155 160
Asn Lys Arg Val Gly Gly Thr Val Thr Thr Ser Asn His Phe Asn Ala
165 170 175
Trp Ala Lys Leu Gly Met Asn Leu Gly Thr His Asn Tyr Gln Ile Val
180 185 190
Ala Thr Glu Gly Tyr Gln Ser Ser Gly Ser Ser Ser Ile Thr Val Gln
195 200 205
Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Gln
210 215 220
Ser Glu Pro Glu Leu Lys Leu Glu Ser Val Val Ile Val Ser Arg His
225 230 235 240
Gly Val Arg Ala Pro Thr Lys Ala Thr Gln Leu Met Gln Asp Val Thr
245 250 255
Pro Asp Ala Trp Pro Thr Trp Pro Val Lys Leu Gly Trp Leu Thr Pro
260 265 270
Arg Gly Gly Glu Leu Ile Ala Tyr Leu Gly His Tyr Gln Arg Gln Arg
275 280 285
Leu Val Ala Asp Gly Leu Leu Ala Lys Lys Gly Cys Pro Gln Ser Gly
290 295 300
Gln Val Ala Ile Ile Ala Asp Val Asp Glu Arg Thr Arg Lys Thr Gly
305 310 315 320
Glu Ala Phe Ala Ala Gly Leu Ala Pro Asp Cys Ala Ile Thr Val His
325 330 335
Thr Gln Ala Asp Thr Ser Ser Pro Asp Pro Leu Phe Asn Pro Leu Lys
340 345 350
Thr Gly Val Cys Gln Leu Asp Asn Ala Asn Val Thr Asp Ala Ile Leu
355 360 365
Ser Arg Ala Gly Gly Ser Ile Ala Asp Phe Thr Gly His Arg Gln Thr
370 375 380
Ala Phe Arg Glu Leu Glu Arg Val Leu Asn Phe Pro Gln Ser Asn Leu
385 390 395 400
Cys Leu Lys Arg Glu Lys Gln Asp Glu Ser Cys Ser Leu Thr Gln Ala
405 410 415
Leu Pro Ser Glu Leu Lys Val Ser Ala Asp Asn Val Ser Leu Thr Gly
420 425 430
Ala Val Ser Leu Ala Ser Met Leu Thr Glu Ile Phe Leu Leu Gln Gln
435 440 445
Ala Gln Gly Met Pro Glu Pro Gly Trp Gly Arg Ile Thr Asp Ser His
450 455 460
Gln Trp Asn Thr Leu Leu Ser Leu His Asn Ala Gln Phe Tyr Leu Leu
465 470 475 480
Gln Arg Thr Pro Glu Val Ala Arg Ser Arg Ala Thr Pro Leu Leu Asp
485 490 495
Leu Ile Lys Thr Ala Leu Thr Pro His Pro Pro Gln Lys Gln Ala Tyr
500 505 510
Gly Val Thr Leu Pro Thr Ser Val Leu Phe Ile Ala Gly His Asp Thr
515 520 525
Asn Leu Ala Asn Leu Gly Gly Ala Leu Glu Leu Asn Trp Thr Leu Pro
530 535 540
Gly Gln Pro Asp Asn Thr Pro Pro Gly Gly Glu Leu Val Phe Glu Arg
545 550 555 560
Trp Arg Arg Leu Ser Asp Asn Ser Gln Trp Ile Gln Val Ser Leu Val
565 570 575
Phe Gln Thr Leu Gln Gln Met Arg Asp Lys Thr Pro Leu Ser Leu Asn
580 585 590
Thr Pro Pro Gly Glu Val Lys Leu Thr Leu Ala Gly Cys Glu Glu Arg
595 600 605
Asn Ala Gln Gly Met Cys Ser Leu Ala Gly Phe Thr Gln Ile Val Asn
610 615 620
Glu Ala Arg Ile Pro Ala Cys Ser Leu Arg Val Lys Arg Gly Ser Gly
625 630 635 640
Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val Glu Glu Asn
645 650 655
Pro Gly Pro Met Phe Gln Leu Trp Lys Leu Val Phe Leu Cys Gly Leu
660 665 670
Leu Ile Gly Thr Ser Ala Gln Gln Thr Val Trp Gly Gln Cys Gly Gly
675 680 685
Ile Gly Trp Ser Gly Pro Thr Asn Cys Ala Pro Gly Ser Ala Cys Ser
690 695 700
Thr Leu Asn Pro Tyr Tyr Ala Gln Cys Ile Pro Gly Ala Thr Thr Ile
705 710 715 720
Thr Thr Ser Thr Arg Pro Pro Ser Gly Pro Thr Thr Thr Thr Arg Ala
725 730 735
Thr Ser Thr Ser Ser Ser Thr Pro Pro Thr Ser Ser Gly Val Arg Phe
740 745 750
Ala Gly Val Asn Ile Ala Gly Phe Asp Phe Gly Cys Thr Thr Asp Gly
755 760 765
Thr Cys Val Thr Ser Lys Val Tyr Pro Pro Leu Lys Asn Phe Thr Gly
770 775 780
Ser Asn Asn Tyr Pro Asp Gly Ile Gly Gln Met Gln His Phe Val Asn
785 790 795 800
Glu Asp Gly Met Thr Ile Phe Arg Leu Pro Val Gly Trp Gln Tyr Leu
805 810 815
Val Asn Asn Asn Leu Gly Gly Asn Leu Asp Ser Thr Ser Ile Ser Lys
820 825 830
Tyr Asp Gln Leu Val Gln Gly Cys Leu Ser Leu Gly Ala Tyr Cys Ile
835 840 845
Val Asp Ile His Asn Tyr Ala Arg Trp Asn Gly Gly Ile Ile Gly Gln
850 855 860
Gly Gly Pro Thr Asn Ala Gln Phe Thr Ser Leu Trp Ser Gln Leu Ala
865 870 875 880
Ser Lys Tyr Ala Ser Gln Ser Arg Val Trp Phe Gly Ile Met Asn Glu
885 890 895
Pro His Asp Val Asn Ile Asn Thr Trp Ala Ala Thr Val Gln Glu Val
900 905 910
Val Thr Ala Ile Arg Asn Ala Gly Ala Thr Ser Gln Phe Ile Ser Leu
915 920 925
Pro Gly Asn Asp Trp Gln Ser Ala Gly Ala Phe Ile Ser Asp Gly Ser
930 935 940
Ala Ala Ala Leu Ser Gln Val Thr Asn Pro Asp Gly Ser Thr Thr Asn
945 950 955 960
Leu Ile Phe Asp Val His Lys Tyr Leu Asp Ser Asp Asn Ser Gly Thr
965 970 975
His Ala Glu Cys Thr Thr Asn Asn Ile Asp Gly Ala Phe Ser Pro Leu
980 985 990
Ala Thr Trp Leu Arg Gln Asn Asn Arg Gln Ala Ile Leu Thr Glu Thr
995 1000 1005
Gly Gly Gly Asn Val Gln Ser Cys Ile Gln Asp Met Cys Gln Gln Ile
1010 1015 1020
Gln Tyr Leu Asn Gln Asn Ser Asp Val Tyr Leu Gly Tyr Val Gly Trp
1025 1030 1035 1040
Gly Ala Gly Ser Phe Asp Ser Thr Tyr Val Leu Thr Glu Thr Pro Thr
1045 1050 1055
Ser Ser Gly Asn Ser Trp Thr Asp Thr Ser Leu Val Ser Ser Cys Leu
1060 1065 1070
Ala Arg Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala
1075 1080 1085
Ala Lys Gly Ser Tyr Asp Tyr Ala Asp Val Ile Lys Lys Ser Leu Leu
1090 1095 1100
Phe Tyr Gln Ala Gln Arg Ser Gly Arg Leu Ser Gly Met Asp Pro Leu
1105 1110 1115 1120
Val Ser Trp Arg Lys Asp Ser Ala Leu Asn Asp Arg Gly Asn Asn Gly
1125 1130 1135
Glu Asp Leu Thr Gly Gly Tyr Tyr Asp Ala Gly Asp Phe Val Lys Phe
1140 1145 1150
Gly Phe Pro Met Ala Tyr Thr Ile Thr Leu Leu Ser Trp Gly Val Ile
1155 1160 1165
Asp Tyr Glu Asn Thr Tyr Ser Ser Ile Gly Ala Leu Ser Ala Ala Arg
1170 1175 1180
Ala Ala Ile Lys Trp Gly Thr Asp Tyr Phe Ile Lys Ala His Val Ser
1185 1190 1195 1200
Ala Asn Glu Leu Tyr Gly Gln Val Gly Asn Gly Gly Ala Asp His Ser
1205 1210 1215
Trp Trp Gly Arg Pro Glu Asp Met Asn Met Asp Arg Pro Ala Tyr Lys
1220 1225 1230
Ile Asp Thr Ser Arg Pro Gly Ser Asp Leu Ala Ala Glu Thr Ala Ala
1235 1240 1245
Ala Met Ala Ala Ala Ser Ile Val Phe Lys Asn Ala Asp Ser Asn Tyr
1250 1255 1260
Ala Asn Thr Leu Leu Arg His Ala Lys Glu Leu Tyr Asn Phe Ala Asp
1265 1270 1275 1280
Asn Tyr Arg Gly Lys Tyr Ser Asp Ser Ile Ser Asp Ala Ala Ala Phe
1285 1290 1295
Tyr Asn Ser Tyr Ser Tyr Glu Asp Glu Leu Val Trp Gly Ala Ile Trp
1300 1305 1310
Leu Trp Arg Ala Thr Asn Asp Gln Asn Tyr Leu Asn Lys Ala Thr Gln
1315 1320 1325
Tyr Tyr Asn Gln Tyr Ser Ile Gln Tyr Lys Asn Ser Pro Leu Ser Trp
1330 1335 1340
Asp Asp Lys Ser Thr Gly Ala Ser Ala Leu Leu Ala Lys Leu Thr Gly
1345 1350 1355 1360
Gly Asp Gln Tyr Lys Ser Ala Val Gln Ser Phe Cys Asp Gly Phe Tyr
1365 1370 1375
Tyr Asn Gln Gln Lys Thr Pro Lys Gly Leu Ile Trp Tyr Ser Asp Trp
1380 1385 1390
Gly Ser Leu Arg Gln Ser Met Asn Ala Val Trp Val Cys Leu Gln Ala
1395 1400 1405
Ala Asp Ala Gly Val Lys Thr Gly Glu Tyr Arg Ser Leu Ala Lys Lys
1410 1415 1420
Gln Leu Asp Tyr Ala Leu Gly Asp Ala Gly Arg Ser Phe Val Val Gly
1425 1430 1435 1440
Phe Gly Asn Asn Pro Pro Ser His Glu Gln His Arg Ala Ala Ser Cys
1445 1450 1455
Pro Asp Ala Pro Ala Ala Cys Asp Trp Asn Thr Tyr Asn Gly Gly Gln
1460 1465 1470
Ser Asn Tyr His Val Leu Tyr Gly Ala Leu Val Gly Gly Pro Asp Ala
1475 1480 1485
Asn Asp Tyr Tyr Asn Asp Val Arg Ser Asp Tyr Val His Asn Glu Val
1490 1495 1500
Ala Cys Asp Tyr Asn Ala Gly Phe Gln Asn Val Leu Val Ser Leu Lys
1505 1510 1515 1520
Ala Asn Gly Tyr
<210> 13
<211> 26848
<212> DNA
<213> Artificial synthesis
<400> 13
gcggccgccc atccatagtg tgtccttcac cctctgaagt tcatgtgcga agttggctgc 60
gtctcttctc ataaaaatga cacaaaggaa aaagtacatc agttgtaatg aagtagcatt 120
gttttatgct ccagagggcc tttgacttcc tagacctatt ttttgttttt accataatca 180
taaactttct catctgaggt gaagagtgtg gaattaacac attttgttcc tttgttaggc 240
aaagactctg aggctgaaca atcgggaggt tctatcgctt aaataagaaa agttaagata 300
attaactggc attgagcact tgtccacatt cttgtgctgt ggtcagaggt aggacacagt 360
ctcccatccc cgggttaatt aagtgcctcc aacaaagggg tactgttgcc cacatagaaa 420
gatctaaact aattaattaa tccctcaccc gcaaatcttt cagtcactaa gttagcacga 480
ttgttgaaca agttctccaa aggagagata cagatgagtg cgtatagggt ggacctggct 540
gctgaggaga cacctgcatc tgactaagaa gagccacggt gttagttgaa tggtgtggag 600
tagggtggtt ctgtgggaca gtagaaaatc gagaggcatg tgccgtttag tgaactgatg 660
gaagctaccc caaacgacag agattgtcag tcaggccaat ccgtttcgag tttgatgggc 720
agccggacag tgagacagac acacctactc agttggagga aggatgagaa caatggccag 780
cagggattga gagaccctga caggcgcaag gccctaacac acacacctac cacctcactt 840
gacaaagctg ccaaagacca aagacttgtt ctccattaga aatgacagct ggcttgaccc 900
gacagcataa taagcagagt gtactctgat tggagaactt taatgtgttt cattcagtat 960
tataaaagga cagtattaca gattttgttg tacactgctg ttacatgtgg ggcagtgtgt 1020
ctttaagtag ggtaaagtac tctttaaaaa tgggtcctag atattttttc ctttaactca 1080
agtctcttac tgtttaaatg atttttattt tgtttaatat ggaggaaaaa gaagcgtaaa 1140
tggacaatat atatttagag aaagatggtt agctgtcaga aaaatatgca aatcaaaatc 1200
acaccaagac tgcagcacac ccctgtcaga tggctgtgat caagaaaata aatgacaatg 1260
agtggtggtg aagatgtact aaagggaaac acacacacac acacacacac acacacacac 1320
acacactgga gcaaccactg tggaaatcag tatgaatggt cctcaaaaac ctgaagatag 1380
agcggggcgt ggtggcatac acttttattc ccagcactgg ggaggcagag gcaggtggat 1440
ctctgagttc caggccagcc tggtctatag cacaggttct aggacagcca gggctacaca 1500
gaaaaaccct gccttgatta aaccaaacca aaccaaacca aaccaaacca aaccaaacca 1560
aaccaaacca aaccaaacca gaccaaacca aaacactgaa gatagaactt cagtattcca 1620
ttcctagata tatacccaat ggagactaag tcagcaagac acctgcacag ccatgttcac 1680
tactacactg ttcaccacag ccaggctgtg gaaccagcct gagtgtccat gataaatgaa 1740
tggataggta actttcaagg taaatggact ctgctgtgta catgcctcac attctgttta 1800
ttcatttttc tttatgaggt gtccattcag gagtcacatg gtagttctat tttcagtctt 1860
ctgaagatac tacactggtc cccacagttt acacttttat cagcagtgaa taagggttcc 1920
tctatcctta ccatcatttg ttgtaatttt tcttgatgac cctctttctg acagggatag 1980
gatgtaatat cagtgtgagg aagtacaact tgttttctaa gtatttattg gccccttgca 2040
tttcttcttt tgaaaactgt cggttcctga catctgctca ggtattcatt ggatgttgtt 2100
tctttggtgt ttgagttctt atgaattcta gatgttaaat ccctgcctgt ggttctctcc 2160
cattctgtag gctgcctcct caccctggca attgttgtcc ttgttttgca gaaacttttg 2220
acttcatgga atctcatttg tcagttttcc ctcctctgct atagcctgag ctaatgcact 2280
ggtttttaca gagccctggt ctatgccttt atcctcctct ggcagcttcg gagtttcatt 2340
tcttacattt agatctttga tccactttga acaagttttg gagcagggtg agagatacga 2400
atctagttcc attcttccat atgtgatcct agtttacata gcatcgttgg ttgaagaggt 2460
tttattttat ttttaaataa tgtgtcataa aaaacgaggt ggttgtagca gtgtggattt 2520
gtttctttgt cctttgatct acaggtcttg ttttgtgtca gtctcatgat gttttattgc 2580
tatggctctg tcatacagtc tgaggtcagg tattgtgata taccttcagt attgctccct 2640
cagactcagg tttgctttgg ccaggagtca tcttactcag tgctcttaga gctcccccag 2700
catgtagctg ctactattct tagttgataa atcaggaaac tggggctcag agagattaac 2760
tgtcttgaac tacttctggg gaggtgaaac gtggagacac taaactgtgt ttaccctgta 2820
ctgctccagt agctgtcggg tgctgggcta cagcaaagca cctatactat atattactca 2880
ggaggtggaa aaactcagcc tcccttgggg ttcccaagct cccaggtgtc cagtcactgc 2940
tggaaacctc atggagtctg aaaggaaggg ttgagggtac atggggcagc gatgaggagc 3000
ctggggctgg gatctcccaa acacctggat atccagatgc cactgggtca gggggagttg 3060
ggaacagagt tgggatgtcc atggacctgt gacaaggcca gggccagggg gaggataact 3120
ctggctttac taatttgcga aagtccttag cttagcagca gttgtctggg agcacagagg 3180
ggccttctgt aagaggctca ggcagtgccg ctctgtaggc gaaggtcttc tccatgttcc 3240
ccatggtggt tcttgatgaa agagacagtc cttggctcca aactggttta ttgattgttc 3300
attgtggaaa atgggtgcac accaccttct cagggtggac cagagatcaa ataccttttg 3360
cagggaggaa tatctgggaa gggacgctta ctggctaaac cctcagggcc tctagataca 3420
tcattagcat ggagaactct gttctgggct acatgaccac aggccacatt tccacaagcc 3480
acatgtggga agtgtggcac atgttctagg ccaggaatct ggtagggagc gtggagccac 3540
ctaccatccc aggtgggtgc ctgggtgcca gggaccctga acccgctcaa ccttaccaag 3600
tttcctggca gggtccactg tcctacacag aagctggagg aggtgtgagg gttgtgtctt 3660
tgtggaatgt cccatgctgc ttggggctca gtttctccac ctgtacctca ttggtttggg 3720
tataaaaagt ggggatactt tattattctc tgactcggtc ctgaggaaaa agcatcgtgg 3780
cagtccagga accacaccct gaggttcctg cactgaaggg actccctaag tctctggagt 3840
ctctcccctt cacagagctg ccaaagtcta ggttcttttg aggataacag agccatgctt 3900
ggtaagcaga caacagcatt tgtttactca accttctttt gtcagctccc tcttcataaa 3960
caagttgaga caccatgctg gcttgaggaa gacttctaaa gccagacaac tgtgcaagga 4020
agaagaagaa ggggcaagtg gagttagcct ggatgtagcc ctcaaagtct ccagagacca 4080
gccatgaagg ctcaagtgga gggcaagacc tgcagcagcc aagcatctgg caggagagga 4140
tcctgggaac ccctctacca tgacacacat tcttcctgca ggtcacactt aataggccat 4200
ttcttatttg gatctatcat ggtgttctgt gcgagattaa tgaggtgtta tgctgcgaac 4260
agaaagttat ataaaaacaa gtcccccccc cttgtcactg ctgctaagaa tgtagcagaa 4320
attgtctcaa gtgtctctct aatcagaaac aataaaggtc tccttggatt caagccctcc 4380
agtttcctcc ttccttgctg agccttggac acccatacaa acctcctgga tgctacagct 4440
ctgggcagag actccaaggt ggggagagac tgatggtaca aaagcaaaat acttgtttgg 4500
gggtacaccc actcctctgc ctgtgtggtt cctgcagtca gtcctgcaga caggccctca 4560
gtgggtcttc catgggcaac acgcagaggg aggcaatgga tgggaatacc cacaccctgg 4620
ttagtttacc ccggccatgc tctctgctct tcatccctcc tctgccctct gccacggctt 4680
tctctgcagg aatcatatct tcatattggc ccacaggtgt tctcctcacc ctagctatga 4740
tgtttacttt agagtgacct tagcagggct ggtgggaatg agttctagaa ggctcacgga 4800
gatgctaggg aagaaacgtc ttctaactac tgaggttact aagttcctgg tggttgtctc 4860
tgcctttccc ttgttaaagt caccttgaag ttagtgcaga agaaatcaga gcccagtcac 4920
agagtaaata tggtcctgaa gatttccttt gagtgcccag aatccatgac atttcaagag 4980
ccctctttgt accttaagtc atttggggtt gtatcttctg cttgatgtat gtgtgtgtgt 5040
ttatcaaaga gtgagatggt tacataagag gtgctctaaa ggacagagag gatttgcaat 5100
tgtggcatgt gacatcctca ggccttgctc tggtgccagg aggaactgat gcagaaaaga 5160
gtaagaggtc atttcctgga ggctgtcact atagaggaga tcttacagtg cattccctcc 5220
tccaggccct gcctgaggat agacatgtgc tgactgcaac tgaaacagag gcttgggatg 5280
gagagttagg ttcacagaag ggagggtggg agatggatgc ttgctgggtt ctgggtctca 5340
tcaccagctc ctgaccaccc ggtcagccca tgtgcttatt ccatagcttt cttttgctat 5400
gtttactcag tgtggtgttt gttgggaccc agcagaagcc agtcccaggc tgacagctgt 5460gtttactcag tgtggtgttt gttgggaccc agcagaagcc agtcccaggc tgacagctgt 5460
ggatacacag ggcagcatga gggtcctcag cctgaagcag tcaggctggc agaagagaaa 5520ggatacacag ggcagcatga gggtcctcag cctgaagcag tcaggctggc agaagagaaa 5520
gaccagcaca cattccttca accaactatg tcttgaaaaa caaacatatt atatcacata 5580gaccagcaca cattccttca accaactatg tcttgaaaaa caaacatatt atatcacata 5580
tattgcattt atgagacagc taaaatgtac tcgggtagca tgactccagg tggggatatc 5640tattgcattt atgagacagc taaaatgtac tcgggtagca tgactccagg tggggatatc 5640
tgcaagtgcc atgagtggca gagggacagc caatgtgagg caagaaggaa ttctggctca 5700tgcaagtgcc atgagtggca gagggacagc caatgtgagg caagaaggaa ttctggctca 5700
acacagctta gctccctggt gttggttcaa actttgagag tttgaccaca agcactttat 5760acacagctta gctccctggt gttggttcaa actttgagag tttgaccaca agcactttat 5760
ttttgacata tttaaacaga gcacaacttt gggaaaaagt tttcttatga aaattatcac 5820ttttgacata tttaaacaga gcacaacttt gggaaaaagt tttcttatga aaattatcac 5820
aataaagctt aaggcatgac tacattaaaa tgcctttgca aagtatatgt gccctcttcc 5880aataaagctt aaggcatgac tacattaaaa tgcctttgca aagtatatgt gccctcttcc 5880
acaagaatgg ttctattgac tgagaaataa tgttcaggat aaagatccag gaagaaaaga 5940acaagaatgg ttctattgac tgagaaataa tgttcaggat aaagatccag gaagaaaaga 5940
tcagggataa gtaaaatact aaactctttt gcaaagtaca tagaccctct ttcataacaa 6000tcagggataa gtaaaatact aaactctttt gcaaagtaca tagacccctct ttcataacaa 6000
tgggttctat tgactgacaa gcactgctca ggagttggga aagagtctag cataagcacg 6060tgggttctat tgactgacaa gcactgctca ggagttggga aagagtctag cataagcacg 6060
atagcctgga gactctagtg aggtctagtc ttacagacag caaaaatcac caggttacaa 6120
actacattca tttccagttt tctgatcagg cacaggtatg aatcccttct gttgaagaga 6180
aaagtccatg tgtttaaaat atctggtttc tccagtgcta ttagcgagaa gacttgagcc 6240
ctatacaact cccacctgga gtgacatcct gtcttcatgg tatattacat acctagacac 6300
gctcatctca cagacttagg actttgtctt ctgatctcca tttctgatcc cacttccacc 6360
tttgccttga tagtgtcatt ttcttcactg ccttggtgac aaccatgtta tcctctgtgt 6420
atttgagtgt taccattttc agattttacc tgtatgcaag atcacacagt ctttgtcttt 6480
ctgtctggat gcatgctaat ctctacacaa caacccttcc ccgtcactca gatcttcctc 6540
cattaacaca tacatggtgc tgaagaggct agggagcttc ccttcagtgg ggagctagct 6600
ggctattggg cctttttgac tgtccaggaa ggcccccaat tgctgagaca agaacttaga 6660
ttcttcatta ttgactctaa ctcatgtatc aagcagaagc taatgaatag ttatcaacag 6720
gatcagaggt tccagtgtaa gacactttga catgaaagaa cggaggaagg acagatggat 6780
gcataaaagc aggaccactg ccccaggaag gtcctggaaa ctgatgcagg gcaaaggaca 6840
ggttataaac caaatcttag ggagtcagga agagcacaga ggagctcaac caactgacca 6900
ctgcttaggg gctaccaacc caatcctccc tgtgggaaca gctaagctat cagccaaggg 6960
taataaacag gcaggacctg tggatgacat ggagagcata gggaccctgg gtccagcctt 7020
tagcacctgc actctcagga tactccacca ttgtgtctta gagagcctag ggatactggg 7080
tccagccttt ggtaccttca ctctcagggt accccatcac tgtgtcttgg agagcctagg 7140
caccctgggt ccagccttca gtacctgcgc tctcaggaca ccccaccatt gtctcttgcc 7200
ccgtctcttc ttcctcttcc tccctttcat tgtctcttct ctgtttcttt cttgactctc 7260
ctttcccctc acaccctcac tctagttctc cccttccctc tctgcatcac cctattctct 7320
ctgtggtccc tccactttcc tttatctctc atgcttctct cctccctcaa atacttgtca 7380
cccactatac ttcaggggcc agctctagtg acaaagctgt taatagcaag actctcagat 7440
ctccaacggc tcagaggagc cagacccacc aagaactctc tccaggtcca atttcaggtt 7500
ccttcgaaag ctttcagcaa atgctcaggg aacatgccac taacaagaag atgcaaattc 7560
cagttgagag tgggaaaggc ccttgcgtag gtcccatctt ccaggccaag gtcagagggg 7620
ctctgtgtaa tccggattga cagggctcag aacaatgttt tgtttttaag gtttatttat 7680
tttaggtgtt agtgtctttg cttgcatgac cttatgtgca tcatgtgtgt gcaggttcct 7740
gatgacagta gaggagggct ttgaatccct ggggatagga agttacagga aattataagc 7800
tgctttgtgg gtcttctagc tttcccaaca gaagtgaatg ctcttcacca ctgagccatc 7860
tctctaggcc caagagacat tgctttatgg atataattgt gtgtgtgtgt caacattgag 7920
gaaagggaaa taaaaaaaaa acttcagccg ctaaggttgt acagtttcac taattgctac 7980
ttttagttgt gataaaatgg caggtgcttc aacatttata tatacaaaaa cttccctgct 8040
ggtggttcaa ctgtgagaac tggggtaagt gggtgagttc tctttttctg tctctgtctc 8100
tgtctctctc cttccattct ttcttaaagg aaataaacat tgcagctggg ttatagctca 8160
tcaatatgga agttacagaa gtgaaaaaag gcattgcctt ggtgggtggt gttaccagct 8220
gatttttggt tgtcctgcaa ggaggtctgg ggactggctg ctctgtctct gtctgtatga 8280
gtgagggaag tctggggagc agattcccta accttcagcc tggcctggtt cctgagtgaa 8340
cccagcctct ctggtcctag tagctttttc caaacaggaa tctgagtggt gacagggaac 8400
aagtaccagc ccattgctta agtgccaggg ttagtgaggg caggaagctg ccatagctgg 8460
gattagtagt tgtattggat gtaggaagtc ctatcctggg acagctaatc cttaatgctt 8520
cactggagat tttcaatgag aaatttatcc cacggcccat atggccccat ccttttgtct 8580
ccaacagcca agtattttcc attagaggag acttcctgta cacttgatgg atgctcattt 8640
caaggtgact tggggcagtc agtacagact tgggatgacc tctgacagcc taacctctcc 8700
ccaacaaggg ccctctatgt ttgctatgta atgtaatgtc agacattgtc aggagtgtcc 8760
gcagcacagc ctgcccagtg tgagggctct cataggtttc ccactgtctt atctacacag 8820
ggataacgag gaggtaagct gcagttccca gtctcacttc acagaggaag agataacccc 8880
atcccaggtc atgtagccag cagtggaaag aatgaggatt tgaactcagg tcttccaagt 8940
cccattgata gcatctcctc acaagtccct tgccaccctc acgatgcctt agacacttgc 9000
ctgcccttta tactaaggag atgcaggtac aaggggttta cccatgtagc agctgaggca 9060
gctggggata gataccagca gcaggcctga tgtcaccact ctaactccag catccccagt 9120
ctgtgttcct ggagtgtgaa aatccctact taacaagatt gtgcaacagt ccttggctct 9180
gtgacccata gctggaaaca ggattctcat tgatttgtgg aacatggtgg cagccagcca 9240
aaaagagggt ctgcatacag aagacagctg tggcaaggcc acagcagact ctgactacct 9300
tagcttacag aattacaagg tcataatgtc ctctgctttg gtcacctcat gttaaggaca 9360
ggccctaatg aagatggggc agaagactga aggaatggcc aaccaataac tggcccaact 9420
tgagacccat cctacaggca agcatcaatt cctgacacta ctaatgatac tctgttatgc 9480
ttgcagacag aagcctagca taactatcct ccgagaggtc cacccagcaa ctgactgaaa 9540
cagaaaaaga tatccacagg caaacagtgg atggaggtca gggactatta tgggagagct 9600
gtgggaagga ttaaaaaccc tgaaggggat aggaacccca caggaagacc aacagagtca 9660
actaagagac ctgtgggagc tctcagagac tgagccacca accaaagagc atacacaggc 9720
cggtccgagg cacctggcac gtgtgaagca gacatgcagc tcagtctcca tgtaggtcct 9780
ccaataagcg gtagcctgac tgcagtatcc aatccctaac agggctgcac agtctggcct 9840
cagtggggga gggtgcccct aatcctgcag agacttgatg agtggagagc tatccagggg 9900
gaacccaccc tctctgagaa gggaatgggg atgggggagg gactctgtga agaggggaca 9960
aggacaaaca agaacctcaa ataggtcagg ccctaaaggc ttgctaagta gcagtggccc 10020
agctctgtcc tgttcctcag cccaaggctc agctcccacc tgtttctgtg tttttctggc 10080
ttttcatggg cctaggactt ggtggccagt tcaaacaatg gggcctgtgg aagacacaat 10140
atacaagact agggacattc ctgttctgct gactatccac agcctgatgt aggtggaagg 10200
acccaatcac tggatttcta cccttgcgca accttgacag ctgagggcct ctcagaaacc 10260
tatttcttcc actgaaaaat gagactctca aatgaacgtc ctgacaatca tcaggcttat 10320
taaagaggtg tatctaacct gaatggcaag cagacagcag gcaaatgtct gtatcaacct 10380
ctaggaagga caagaactgc tcactgctgc cccccaggag gccatttgct gaaacagctg 10440
ctctcctgct ggtgcacagg ccctgccttc tcattgcagc tacagcccct tcctgtctga 10500
acctcctgtc aggtcactgg gaaacagatc aagatggaac aggacagctc ctgatggtaa 10560
ataaaaaaca gtggtcatgg ctattcatag gggtttatgc ttcttcagtc cacactgtga 10620
agagctgtgg gcatgaacca cagtgttcga ggtagagttg gggttctgaa attcacagtg 10680
gggtgagctc agtaaatgtg agctggaggt cactcgtgag acacacagtc ctgctgcttc 10740
tgttcccaat atcctgagga gacgacacat ctactttgtt cagaggccac agtctagttg 10800
acctgagagt taccagtttc ttatttgtgt gtgtgtgtgt gtgtgtgtgt gtgtgtgtgt 10860
gtgttgttcg tgtgtgagtg caggtgcaca tatgatagcg tacacgttga ggtcagagga 10920
taactatcag gcgttgtccc ctcctacttt tcctcggact ctggagaaca aacatgggtc 10980
cttattccag gggagcaagt gcctgttggc tgacacatct tgctcacata cattttacct 11040
agacaatgga gcctccatca gagtattact ttagctcctc accgatggca atgcaccacc 11100
tctctaccca cataggagtt gggtctccac acacccccac acccccttca ccaaaacgtt 11160
ttcagttact ttatctggta aagttcatca gagaatgaag ccagtattaa gaacatggaa 11220
tcatttggga acctggatct agcaataccc caccctagat ggagttgctg agttttcacc 11280
tcagattata attcccccct agcttctatg gtttattctg aaaccagggg aactcgattc 11340
ctccctttgg accacagaca tcctggcttg tgaattcaca tgtcatctac tgctaatcca 11400
ttggtagtat gtggctcaca gagacacact acagtcatgg ccaatgtcaa ggtaggacag 11460
atgtgaatca ttcccccagt cctgctgttt tcatgactaa ccctcctcag cacagtgacc 11520
atgaacctac ttttcccctc cttttatttt tagaattgct ggaattttct attttgagaa 11580
ataatagcct tggggcagca ttaaacaaaa tcatctagaa agctggttta aaatacagat 11640
ggttgagtca gtgaaagagt gaggaatgtc attattggcc cctcacagag gctggctcac 11700
tccagcagag gtggttgaag ctcttggaca cgggtcaggt gcataggaag ggtggtctgg 11760
gacacctgag aaccacaatt gaacaaacag aagctgctgg cttttttttt tttaaatgag 11820
ttctcaaaaa atgactgggc tagcttaggc aaatacttcg agccaaccca acagaacatt 11880
cttccattga ttcattctgg atcttctttc tagacaatac tgaactgacc ccttgttggc 11940
agtctcaagt ttgacaacat agggctttga acttggcaca aggtccatca ctgtcaccca 12000
agcatcctgg gtgacctttg ggttggaata tcttggctaa ccttagatat tttctttgga 12060
gtatctttag aacatccagg aaatagggct tgattctcat cctgggacca caatataagt 12120
caccctagaa tcccaggaga tcgtgcagag aaacaaggat ctctctcgtg tgcatccttc 12180
ttcaaagcag tgagtagtga ctccactaaa ctgagttccc atctgagagt ccacaggagg 12240
ctttggggca agaagcagag ggaaggcact gtttgtgttg gtaaagtttt gactctaaca 12300
aatttgaaga catagatgac attgtgtcag actaacaaca acctagactc atgtgggttc 12360
tgtttaggga tcagatttta ttcatcaatg acttgtctta gtgtatagag aaaggcttcc 12420
tactggagtg taggctcaat aatgacagaa gagatagcta tttcccctag ggactgtgct 12480
gctccaagtt tggtggagaa aggcagtggg gaacctagat gtgctctctg gggagggggt 12540
ctgaagctgg cttcatagaa ggtgtgaagt tttgctgaaa catctaaaca gaattatagc 12600
ttaggaaagt gagcaggcaa ggcagggaat gtgttgcata tgtatatgta catgaatata 12660
ttatgttata gatacacaca catttgaacc tcatttgcag atgacagaaa ataggttatt 12720
ttgcctctct taactgctaa gcacaatgac ttccagttcc atccatttcc tgaaatgcca 12780
caatttcatt tttcattgtg gctgaataaa attccattgc agactgggcc ctacttcatc 12840
cactcctgag ggcaggcata tcccctggct ccatttctta cctattgtga agagaagtgc 12900
aactgtcttg ttgaaaggca agcgtgagag aggcaggcac taattgtggg tttttgtttc 12960
ttcttcctgc tatgactctc catttgtcag ggcgcgccgc caccatgttc cagctgtgga 13020
agctggtgtt cctgtgcgga ctgctgatcg gcaccagcgc ctccagcacc ccctccagca 13080
ccggagagaa caacggcttc tactactcct tctggaccga cggaggaggc gacgtgacct 13140
acaccaacgg cgacgccgga gcttacaccg tggagtggag caacgtgggc aacttcgtgg 13200
gaggcaaggg atggaaccca ggctccgccc aggacatcac ctactccggc accttcaccc 13260
caagcggcaa cggctacctg tccgtgtacg gctggaccac cgaccccctg atcgagtact 13320
acatcgtgga gagctacggc gactacaacc caggctccgg aggcacctac aagggcaccg 13380
tgaccagcga cggctccgtg tacgacatct acaccgctac caggaccaac gctgccagca 13440
tccagggcac cgccaccttc acccagtact ggtccgtgag gcagaacaag agagtgggcg 13500
gcaccgtgac caccagcaac cacttcaacg cctgggccaa gctgggcatg aacctgggca 13560
cccacaacta ccagatcgtg gctaccgagg gctaccagtc cagcggctcc agctccatca 13620
ccgtgcagga ggctgccgcc aaagaagctg ccgccaagga ggctgccgcc aagcagtccg 13680
agccagagct gaagctggag agcgtggtca tcgtgtcccg ccacggcgtg cgcgctccaa 13740
ccaaggccac ccagctgatg caggacgtga ccccagacgc ttggccaacc tggccagtga 13800
agctgggatg gctgaccccc aggggcggag agctgatcgc ctacctgggc cactaccaga 13860
ggcagagact ggtggctgac ggactgctgg ccaagaaggg atgcccacag agcggacagg 13920
tggctatcat cgctgacgtg gacgagcgca cccggaagac cggagaggcc ttcgccgccg 13980
gcctggcccc agactgcgct atcaccgtgc acacccaggc tgacaccagc tcccccgacc 14040
cactgttcaa cccactgaag accggcgtgt gccagctgga caacgccaac gtgaccgacg 14100
ctatcctgag ccgcgccgga ggctccatcg ctgacttcac cggacacagg cagaccgcct 14160
tcagggagct ggagagagtg ctgaacttcc cccagtccaa cctgtgcctg aagcgggaga 14220
agcaggacga gagctgctcc ctgacccagg ccctgccaag cgagctgaag gtgtccgccg 14280
acaacgtgag cctgaccgga gccgtgagcc tggcctccat gctgaccgag atcttcctgc 14340
tccagcaggc tcagggaatg ccagagccag gatggggaag gatcaccgac agccaccagt 14400
ggaacaccct gctgtccctg cacaacgccc agttctacct gctccagcgg accccagagg 14460
tggctaggag cagagccacc ccactgctgg acctgatcaa gaccgccctg accccacacc 14520
caccacagaa gcaggcctac ggcgtgaccc tgccaacctc cgtgctgttc atcgccggcc 14580
acgacaccaa cctggctaac ctgggaggcg ccctggagct gaactggacc ctgccaggac 14640
agccagacaa caccccacca ggaggagagc tggtgttcga gaggtggcgc cggctgagcg 14700
acaactccca gtggattcag gtgtccctgg tgttccagac cctccagcag atgagagaca 14760
agaccccact gtccctgaac accccaccag gagaggtgaa gctgaccctg gccggatgcg 14820
aggagaggaa cgctcaggga atgtgcagcc tggccggctt cacccagatc gtgaacgagg 14880
ctagaatccc cgcctgctcc ctgagggtga agaggggcag cggagctacc aacttctccc 14940
tgctgaagca ggctggcgac gtggaggaga acccaggacc aatgtttcag ctctggaagc 15000
tcgtgtttct ctgcggactc ctcatcggga cctcagccca gcagaccgtg tggggacagt 15060
gtggcggaat cggctggtcc ggcccaacca actgcgcccc aggcagcgcc tgctccaccc 15120
tgaaccccta ctacgcccag tgcatcccag gcgccaccac catcaccacc agcaccaggc 15180
ccccatccgg ccccaccacc accaccagag ccacctccac cagcagctcc accccaccca 15240
cctccagcgg cgtgagattc gccggcgtga acatcgccgg cttcgacttc ggctgcacca 15300
ccgacggcac ctgcgtgacc agcaaggtgt accccccact gaagaacttc accggcagca 15360
acaactaccc agacggcatc ggccagatgc agcacttcgt gaacgaggac ggcatgacca 15420
tcttccggct gcccgtgggc tggcagtacc tggtgaacaa caacctgggc ggcaacctgg 15480
acagcacctc catcagcaag tacgaccagc tggtgcaggg ctgcctgagc ctgggcgcct 15540
actgcatcgt ggacatccac aactacgcca gatggaacgg cggcatcatc ggccagggcg 15600
gccccaccaa cgcccagttc accagcctgt ggtcccagct ggcctccaag tacgccagcc 15660
agtccagagt gtggttcggc atcatgaacg agccacacga cgtgaacatc aacacctggg 15720
ccgccaccgt gcaggaggtg gtgaccgcca tcagaaacgc cggcgccacc tcccagttca 15780
tctccctgcc aggcaacgac tggcagagcg ccggcgcctt catctccgac ggcagcgccg 15840
ccgccctgag ccaggtgacc aaccccgacg gcagcaccac caacctgatc ttcgacgtgc 15900
acaagtacct ggactccgac aacagcggca cccacgccga gtgcaccacc aacaacatcg 15960
acggcgcctt ctccccactg gccacctggc tgagacagaa caacagacag gccatcctga 16020
ccgagaccgg cggcggcaac gtgcagtcct gcatccagga catgtgccag cagatccagt 16080
acctgaacca gaacagcgac gtgtacctgg gctacgtggg ctggggcgcc ggctccttcg 16140
acagcaccta cgtgctgacc gagaccccca cctcctccgg caacagctgg accgacacct 16200
ccctggtgtc cagctgcctg gccagaaagg aggccgccgc caaggaggcc gccgccaagg 16260
aggccgccgc caagggcagc tacgactacg ccgacgtgat caagaagagc ctgctgttct 16320
accaggccca gcggtccggc aggctgagcg gcatggaccc actggtgtcc tggagaaagg 16380
acagcgccct gaacgacagg ggcaacaacg gcgaggacct gaccggcggc tactacgacg 16440
ccggcgactt cgtgaagttc ggcttcccaa tggcctacac catcaccctg ctgagctggg 16500
gcgtgatcga ctacgagaac acctactcca gcatcggcgc cctgagcgcc gccagagccg 16560
ccatcaagtg gggcaccgac tacttcatca aggcccacgt gtccgccaac gagctgtacg 16620
gccaggtggg caacggcggc gccgaccaca gctggtgggg cagacccgag gacatgaaca 16680
tggacaggcc agcctacaag atcgacacct ccagaccagg ctccgacctg gccgccgaga 16740
ccgccgccgc catggccgcc gcctccatcg tgttcaagaa cgccgactcc aactacgcca 16800
acaccctgct gagacacgcc aaggagctgt acaacttcgc cgacaactac aggggcaagt 16860
actccgactc catcagcgac gccgccgcct tctacaacag ctacagctac gaggacgagc 16920
tggtgtgggg cgccatctgg ctgtggagag ccaccaacga ccagaactac ctgaacaagg 16980
ccacccagta ctacaaccag tactccatcc agtacaagaa cagcccactg tcctgggacg 17040
acaagagcac cggcgcctcc gccctgctgg ccaagctgac cggcggcgac cagtacaaga 17100
gcgccgtgca gagcttctgc gacggcttct actacaacca gcagaagacc ccaaagggcc 17160
tgatctggta ctccgactgg ggctccctga gacagtccat gaacgccgtg tgggtgtgcc 17220
tgcaagccgc cgacgccggc gtgaagaccg gcgagtacag atccctggcc aagaagcagc 17280
tggactacgc cctgggcgac gccggccgga gcttcgtggt gggcttcggc aacaacccac 17340
cctcccacga gcagcacagg gccgccagct gcccagacgc ccccgccgcc tgcgactgga 17400
acacctacaa cggcggccag tccaactacc acgtgctgta cggcgccctg gtgggcggcc 17460
ccgacgccaa cgactactac aacgacgtga gatccgacta cgtgcacaac gaggtggcct 17520
gcgactacaa cgctggattt cagaatgtcc tcgtgtcact caaggctaat ggctactgag 17580
ggcgcgccga tcaattctct agagctcgct gatcagcctc gactgtgcct tctagttgcc 17640
agccatctgt tgtttgcccc tcccccgtgc cttccttgac cctggaaggt gccactccca 17700
ctgtcctttc ctaataaaat gaggaaattg catcgcattg tctgagtagg tgtcattcta 17760
ttctgggggg tggggtgggg caggacagca agggggagga ttgggaagac aatagcaggc 17820
atgctgggga tgcggtgggc tctatggctt ctgaggcgga aagaaccagc tggtttaaac 17880
tcgattataa cttcgtatag catacattat acgaagttat gatcgatatg aagaatctgc 17940
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 18000
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 18060
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 18120
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 18180
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 18240
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 18300
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 18360
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 18420
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 18480
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 18540
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 18600
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 18660
gtttaaactt aagcttggta ccgagctcgg atccactagt ccagtgtggt ggaattcgcc 18720
accatgggat cggccattga acaagatgga ttgcacgcag gttctccggc cgcttgggtg 18780
gagaggctat tcggctatga ctgggcacaa cagacaatcg gctgctctga tgccgccgtg 18840
ttccggctgt cagcgcaggg gcgcccggtt ctttttgtca agaccgacct gtccggtgcc 18900
ctgaatgaac tgcaggacga ggcagcgcgg ctatcgtggc tggccacgac gggcgttcct 18960
tgcgcagctg tgctcgacgt tgtcactgaa gcgggaaggg actggctgct attgggcgaa 19020
gtgccggggc aggatctcct gtcatctcac cttgctcctg ccgagaaagt atccatcatg 19080
gctgatgcaa tgcggcggct gcatacgctt gatccggcta cctgcccatt cgaccaccaa 19140
gcgaaacatc gcatcgagcg agcacgtact cggatggaag ccggtcttgt cgatcaggat 19200
gatctggacg aagagcatca ggggctcgcg ccagccgaac tgttcgccag gctcaaggcg 19260
cgcatgcccg acggcgatga tctcgtcgtg acccatggcg atgcctgctt gccgaatatc 19320cgcatgcccg acggcgatga tctcgtcgtg acccatggcg atgcctgctt gccgaatatc 19320
atggtggaaa atggccgctt ttctggattc atcgactgtg gccggctggg tgtggcggac 19380atggtggaaa atggccgctt ttctggattc atcgactgtg gccggctggg tgtggcggac 19380
cgctatcagg acatagcgtt ggctacccgt gatattgctg aagagcttgg cggcgaatgg 19440cgctatcagg acatagcgtt ggctacccgt gatattgctg aagagcttgg cggcgaatgg 19440
gctgaccgct tcctcgtgct ttacggtatc gccgctcccg attcgcagcg catcgccttc 19500gctgaccgct tcctcgtgct ttacggtatc gccgctcccg attcgcagcg catcgccttc 19500
tatcgccttc ttgacgagtt cttcgagggc agaggaagtc tgctaacatg cggtgacgtc 19560tatcgccttc ttgacgagtt cttcgagggc agaggaagtc tgctaacatg cggtgacgtc 19560
gaggagaatc ctggcccaat ggtgagcaag ggcgaggagc tgttcaccgg ggtggtgccc 19620gaggagaatc ctggcccaat ggtgagcaag ggcgaggagc tgttcaccgg ggtggtgccc 19620
atcctggtcg agctggacgg cgacgtaaac ggccacaagt tcagcgtgtc cggcgagggc 19680atcctggtcg agctggacgg cgacgtaaac ggccacaagt tcagcgtgtc cggcgagggc 19680
gagggcgatg ccacctacgg caagctgacc ctgaagttca tctgcaccac cggcaagctg 19740gagggcgatg ccacctacgg caagctgacc ctgaagttca tctgcaccac cggcaagctg 19740
cccgtgccct ggcccaccct cgtgaccacc ctgacctacg gcgtgcagtg cttcagccgc 19800cccgtgccct ggcccaccct cgtgaccacc ctgacctacg gcgtgcagtg cttcagccgc 19800
taccccgacc acatgaagca gcacgacttc ttcaagtccg ccatgcccga aggctacgtc 19860taccccgacc acatgaagca gcacgacttc ttcaagtccg ccatgcccga aggctacgtc 19860
caggagcgca ccatcttctt caaggacgac ggcaactaca agacccgcgc cgaggtgaag 19920caggagcgca ccatcttctt caaggacgac ggcaactaca agacccgcgc cgaggtgaag 19920
ttcgagggcg acaccctggt gaaccgcatc gagctgaagg gcatcgactt caaggaggac 19980ttcgagggcg acaccctggt gaaccgcatc gagctgaagg gcatcgactt caaggaggac 19980
ggcaacatcc tggggcacaa gctggagtac aactacaaca gccacaacgt ctatatcatg 20040ggcaacatcc tggggcacaa gctggagtac aactacaaca gccacaacgt ctatatcatg 20040
gccgacaagc agaagaacgg catcaaggtg aacttcaaga tccgccacaa catcgaggac 20100gccgacaagc agaagaacgg catcaaggtg aacttcaaga tccgccacaa catcgaggac 20100
ggcagcgtgc agctcgccga ccactaccag cagaacaccc ccatcggcga cggccccgtg 20160ggcagcgtgc agctcgccga ccactaccag cagaacaccc ccatcggcga cggccccgtg 20160
ctgctgcccg acaaccacta cctgagcacc cagtccgccc tgagcaaaga ccccaacgag 20220ctgctgcccg acaaccacta cctgagcacc cagtccgccc tgagcaaaga ccccaacgag 20220
aagcgcgatc acatggtcct gctggagttc gtgaccgccg ccgggatcac tctcggcatg 20280aagcgcgatc acatggtcct gctggagttc gtgaccgccg ccgggatcac tctcggcatg 20280
gacgagctgt acaagtaatc tagagggccc gtttaaaccc gctgatcagc ctcgactgtg 20340gacgagctgt acaagtaatc tagagggccc gtttaaaccc gctgatcagc ctcgactgtg 20340
ccttctagtt gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa 20400ccttctagtt gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa 20400
ggtgccactc ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt 20460ggtgccactc ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt 20460
aggtgtcatt ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa 20520aggtgtcatt ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa 20520
gacaatagca ggcatgctgg ggatgcggtg ggctctatgg cttctgaaga tctataactt 20580gacaatagca ggcatgctgg ggatgcggtg ggctctatgg cttctgaaga tctataactt 20580
cgtatagcat acattatacg aagttatgga tctgtcgacc atagtgtgtc cttcacacat 20640
cacggttaca attaggcagt gctgactcta aatcaagaga cctcattaca tgttcctgac 20700
tctttttttt ctcacttttt tttccatttt tttattactc aaatgaattt atcacatctg 20760
tagttgtgca atgatcaaaa caatctgatt tcacaggatt tccacatttt ttattttgtc 20820
ttttcactca agtgtctttg caggcttcca tccctttatc ctcaagaatc attttcgggc 20880
tctaagaaat cttttttgta ctttttctat ttattttcag agcctcttgt tttcctttta 20940
tagatgtaat aaaacttcct ctcttcaagt ttatgaatta gaatttttta gtgccattcc 21000
ccaggttatc tcatttcttc aggttttttt cagttatctt ttaatttctc ttttatgttt 21060
ctcactcttg gcttttgtgc ttgatgattc tttctcttta ttcatattga caaaagatga 21120
atattatatt agttgcctgt tgctattata acaaatcaca acaaattttg tggcttaaaa 21180
caacagaaat ttattatctc agagttgtga aggtgaccat tccaaaatta gtcttagaat 21240
actaaaatca ataacatgac tctgtacgtc aactatactc caatacatat ttttttaatt 21300
gagaaaaaaa aaaaagaaaa tcaaaaccaa cttgtcagca gggatttcct tccagaggct 21360
ccagggaagg atctgcttgc ttgcctttcc cagcttccag aagccactta acattcctgc 21420
tcatggccct gcacgcatcg ctcttcctcg tgctgctttc atcctgaccc tggcccttct 21480
cctctcttac aaggaccttt gaggttgcac cagccccaca ggtgcaccca ggatgctctc 21540
ccatctcagg atccgttatc acacccgtgt tgccacgtaa ggtaacatgt tcaccggtgc 21600
cgagggttag gatgtgggca tctttgggca gaggggcgtt cttcaactta ccgcagacag 21660
ggtttctaga cagctgattt gctgtttaaa tagagtttct ttcctaatat gtctctccct 21720
gaggagaaag tcaaaatgag ttctggggtg ggaatgacac acggcctgga ggcgatatta 21780
gaatccaggc cccttaaata tgacagtgag gagggcttta ttctgggttt gtagaatcca 21840
cacttgaagt cttggtttcc cctggattgt tcattcacat ttatttttag gggtgagcct 21900
tctgtttttc atctcgtttg cccacagtct gcaggttggg atggggctcg cctcctctat 21960
acacagacct ttaaggagct tgtttttatc ttcacttctc accaggctct ttgccatctt 22020
tgcggtatgt gaggccagaa cttctctaga gggctggata gacagaggcg ccagctctgc 22080
gaacgcctcc ctattagcta ggctttcaca gcttgtttta tccccaaaca tctcagtatt 22140
ctcttacctc ctttccatcc tgcagaagtg gactgagagc tgttctcagt tggtgaatgt 22200
ctcccaaaaa tgactattcc tgttctacac ctgactgtgt tatacagcaa tccgttttta 22260
ttttactcaa gattgtttct tagctgttca gtattggtta aaggtcacta aaaagcagaa 22320
ttcttaatgt attgtaataa tcacttaaga tataaaaata tttgtgcata taatgactaa 22380
atgctgcatt caaggaatga atcttggtta aaactttttg ccaatctgta tctgataaca 22440
aaataatttg aaacatatta cattttaaac gaatggccct taaaatttga atgaaggata 22500
actagacatt ttaatagaag tgcagcatga tactttcttt gcaatttcac attataaaat 22560
aatgcaatta cgaagcatat cattaggaac ttaattgtgc tcagtgttgt tgtggctcag 22620
gttattctgg aaagagagcc tgataacata tgagtactta ttggggaggg aattccagga 22680
atttgaggta agtgaatgga gataaggaaa cagagaaggg aaaagccaat aaggggagcc 22740
ttattgatgg agttactgct gagagtgaag ggggtctcca tcccactgag gaccctgaat 22800
gatccttcag gacataatca tggaatcgtc ccatcagaga atggtagcct ggagtattta 22860
gccacacaag tccagcccct tttattgagg gtgctcctaa aggacatctg accctgctct 22920
tcctgctcct gcacttcctg tctgcccctg cacttcctgc ctgctcctgc acttcctgtc 22980
ttcccctgca cttcctgccc ctgaacttcc tgcctgctcc tgcacttcct gtcttcccct 23040
gcacttcctg cccctgcact tccttgctcc tacacttccc gccttctttt gtacttcttg 23100
ctgctcctgc acttccttcc tgactctgca cttcctgcct gctcctgtac ttccctcctt 23160
cgtttgcact tcctgtctgc tcctgcactt ccttctccta cactccctgc ctgctcctgc 23220
tcttcctgct cctgcatttc ctgctttttc cttttcctgc tcctgcatct cctacctgct 23280
cctggagttc cttcctgctc ctacacttcc tgtctgctcc tgcacttctg gcttcacctg 23340
ctcttggact gaatgacctt ccctagcttt aaagaaagct tgaggtggaa aaactaagcc 23400
gtcccacagc ccagttgagg gggaatcagg tatgagttgc ctgtctcagc tgggttgcaa 23460
tcagatggat caaaaagatg tggcaggatg ccagaagcat ctagaattga atggaaacag 23520
tgaaagtgga tcagaaatag agatgcatct ttctctacac agtagtcttc cctccataac 23580
tgcattaaaa cagcgttcaa agatttgaat catgtttata taaaacatac ccaaagaagc 23640
cccccaaaat tagactacat gagtttactt tttcatctct tacaaggccc ccttaaaatg 23700
acaaaaatca atcccaaagg cgtaagtctg ccacaacaaa ggaaacacaa aggtggccgt 23760
cagcaggcaa gacttggaca tcttcctgga agcaatgctg agatggccag cttctcgcta 23820
cacaggaagc cagttatagc cttagaaaga gctcgaggat ctgcaatccc gcggccatgg 23880
cggccgggag catgcgacgt cgggcccaat tcgccctata gtgagtcgta ttacaattca 23940
ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca acttaatcgc 24000
cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg caccgatcgc 24060
ccttcccaac agttgcgcag cctgaatggc gaatggacgc gccctgtagc ggcgcattaa 24120
gcgcggcggg tgtggtggtt acgcgcagcg tgaccgctac acttgccagc gccctagcgc 24180
ccgctccttt cgctttcttc ccttcctttc tcgccacgtt cgccggcttt ccccgtcaag 24240
ctctaaatcg ggggctccct ttagggttcc gatttagtgc tttacggcac ctcgacccca 24300
aaaaacttga ttagggtgat ggctcacgta gtgggccatc gccctgatag acggattttc 24360
gccctttgac gttggagtcc acgttcttta atagtggact cttgttccaa actggaacaa 24420
cactcaaccc tatctcggtc tattcttttg atttataagg gattttgccg atttcggcct 24480
attggttaaa aaatgagctg atttaacaaa aatttaacgc gaattttaac aaaatattaa 24540
cgcttacaat ttcctgatgc ggtattttct ccttacgcat ctgtgcggta tttcacaccg 24600
catcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta tttttctaaa 24660
tacattcaaa tatgtatccg ctcatgagac aataaccctg ataaatgctt caataatatt 24720
gaaaaaggaa gagtatgagt attcaacatt tccgtgtcgc ccttattccc ttttttgcgg 24780
cattttgcct tcctgttttt gctcacccag aaacgctggt gaaagtaaaa gatgctgaag 24840
atcagttggg tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg 24900
agagttttcg ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg 24960
gcgcggtatt atcccgtatt gacgccgggc aagagcaact cggtcgccgc atacactatt 25020
ctcagaatga cttggttgag tactcaccag tcacagaaaa gcatcttacg gatggcatga 25080
cagtaagaga attatgcagt gctgccataa ccatgagtga taacactgcg gccaacttac 25140
ttctgacaac gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc 25200
atgtaactcg ccttgatcgt tgggaaccgg agctgaatga agccatacca aacgacgagc 25260
gtgacaccac gatgcctgta gcaatggcaa caacgttgcg caaactatta actggcgaac 25320
tacttactct agcttcccgg caacaattaa tagactggat ggaggcggat aaagttgcag 25380
gaccacttct gcgctcggcc cttccggctg gctggtttat tgctgataaa tctggagccg 25440
gtgagcgtgg gtctcgcggt atcattgcag cactggggcc agatggtaag ccctcccgta 25500
tcgtagttat ctacacgacg gggagtcagg caactatgga tgaacgaaat agacagatcg 25560
ctgagatagg tgcctcactg attaagcatt ggtaactgtc agaccaagtt tactcatata 25620
tactttagat tgatttaaaa cttcattttt aatttaaaag gatctaggtg aagatccttt 25680
ttgataatct catgaccaaa atcccttaac gtgagttttc gttccactga gcatcagacc 25740
ccgtagaaaa gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 25800
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa 25860
ctctttttcc gaaggtaact ggcttcagca gagcgcagat accaaatact gttcttctag 25920
tgtagccgta gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc 25980
tgctaatcct gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg 26040
actcaagacg atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 26100
cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat 26160
gagaaagcgc cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg 26220
tcggaacagg agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc 26280
ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc 26340
ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 26400
cttttgctca catgttcttt cctgcgttat cccctgattc tgtggataac cgtattaccg 26460
cctttgagtg agctgatacc gctcgccgca gccgaacgac cgagcgcagc gagtcagtga 26520
gcgaggaagc ggaagagcgc ccaatacgca aaccgcctct ccccgcgcgt tggccgattc 26580
attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag cgcaacgcaa 26640
ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg cttccggctc 26700
gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc tatgaccatg 26760
attacgccaa gctatttagg tgacactata gaatactcaa gctatgcatc caacgcgttg 26820
ggagctctcc catatggtcg acctgcag 26848
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911170220.8A CN110846336B (en) | 2019-11-26 | 2019-11-26 | Multifunctional fusion enzyme XAET, multifunctional fusion enzyme site-directed integration eukaryotic specific expression vector and construction method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911170220.8A CN110846336B (en) | 2019-11-26 | 2019-11-26 | Multifunctional fusion enzyme XAET, multifunctional fusion enzyme site-directed integration eukaryotic specific expression vector and construction method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110846336A (en) | 2020-02-28 |
CN110846336B (en) | 2022-12-20 |
Family
ID=69604355
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911170220.8A Active CN110846336B (en) | 2019-11-26 | 2019-11-26 | Multifunctional fusion enzyme XAET, multifunctional fusion enzyme site-directed integration eukaryotic specific expression vector and construction method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110846336B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106086068A (en) * | 2016-06-06 | 2016-11-09 | Guangdong Wens Foodstuff Group Co., Ltd. | Polycistron, salivary gland-specific polycistronic expression vector and construction method thereof |
CN108026145A (en) * | 2015-09-18 | 2018-05-11 | Agrivida, Inc. | Engineered phytases and methods of using the same |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108026145A (en) * | 2015-09-18 | 2018-05-11 | Agrivida, Inc. | Engineered phytases and methods of using the same |
CN106086068A (en) * | 2016-06-06 | 2016-11-09 | Guangdong Wens Foodstuff Group Co., Ltd. | Polycistron, salivary gland-specific polycistronic expression vector and construction method thereof |
Non-Patent Citations (3)
Title |
---|
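Sun Yue. Screening of non-starch polysaccharide enzyme genes and construction of a co-expression vector with a phytase gene. China Master's Theses Full-text Database, Agricultural Science and Technology. 2017. * |
Linker optimization of a xylanase-mannanase fusion enzyme gene and its co-expression in porcine kidney PK15 cells; Zhang Xianwei et al.; Scientia Agricultura Sinica; 2013-11-16 (No. 22); abstract, Section 1.2.2 on p. 4776, and Table 3 on p. 4779 * |
Screening of non-starch polysaccharide enzyme genes and construction of a co-expression vector with a phytase gene; Sun Yue; China Master's Theses Full-text Database, Agricultural Science and Technology; 2017-03-15; abstract, pp. 14, 23, 47 * |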
Also Published As
Publication number | Publication date |
---|---|
CN110846336A (en) | 2020-02-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||