WO2024059813A2 - Biosynthesis of salidroside - Google Patents
Biosynthesis of salidroside
- Publication number
- WO2024059813A2 (PCT/US2023/074336)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- amino acid
- acid sequence
- seq
- identity
- set forth
- ILRCGYURZSFMEG-RQICVUQASA-N salidroside Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)OC1OCCC1=CC=C(O)C=C1 ILRCGYURZSFMEG-RQICVUQASA-N 0.000 title claims abstract description 73
- ILRCGYURZSFMEG-UHFFFAOYSA-N Salidroside Natural products OC1C(O)C(O)C(CO)OC1OCCC1=CC=C(O)C=C1 ILRCGYURZSFMEG-UHFFFAOYSA-N 0.000 title claims abstract description 72
- 230000015572 biosynthetic process Effects 0.000 title description 8
- 125000003275 alpha amino acid group Chemical group 0.000 claims abstract description 186
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 179
- 102000004196 processed proteins & peptides Human genes 0.000 claims abstract description 171
- 229920001184 polypeptide Polymers 0.000 claims abstract description 169
- YCCILVSKPBXVIP-UHFFFAOYSA-N 2-(4-hydroxyphenyl)ethanol Chemical compound OCCC1=CC=C(O)C=C1 YCCILVSKPBXVIP-UHFFFAOYSA-N 0.000 claims abstract description 80
- 238000000034 method Methods 0.000 claims abstract description 54
- DBLDQZASZZMNSL-QMMMGPOBSA-N L-tyrosinol Natural products OC[C@@H](N)CC1=CC=C(O)C=C1 DBLDQZASZZMNSL-QMMMGPOBSA-N 0.000 claims abstract description 40
- 235000004330 tyrosol Nutrition 0.000 claims abstract description 40
- HSCJRCZFDFQWRP-JZMIEXBBSA-N UDP-alpha-D-glucose Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1OP(O)(=O)OP(O)(=O)OC[C@@H]1[C@@H](O)[C@@H](O)[C@H](N2C(NC(=O)C=C2)=O)O1 HSCJRCZFDFQWRP-JZMIEXBBSA-N 0.000 claims abstract description 36
- HSCJRCZFDFQWRP-UHFFFAOYSA-N Uridindiphosphoglukose Natural products OC1C(O)C(O)C(CO)OC1OP(O)(=O)OP(O)(=O)OCC1C(O)C(O)C(N2C(NC(=O)C=C2)=O)O1 HSCJRCZFDFQWRP-UHFFFAOYSA-N 0.000 claims abstract description 33
- XCCTYIAWTASOJW-XVFCMESISA-N Uridine-5'-Diphosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 XCCTYIAWTASOJW-XVFCMESISA-N 0.000 claims abstract description 26
- 239000011541 reaction mixture Substances 0.000 claims abstract description 19
- 239000008103 glucose Substances 0.000 claims abstract description 6
- 230000002194 synthesizing effect Effects 0.000 claims abstract description 6
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 claims abstract description 5
- 102000004190 Enzymes Human genes 0.000 claims description 55
- 108090000790 Enzymes Proteins 0.000 claims description 55
- 108091033319 polynucleotide Proteins 0.000 claims description 37
- 102000040430 polynucleotide Human genes 0.000 claims description 37
- 239000002157 polynucleotide Substances 0.000 claims description 37
- 108010043934 Sucrose synthase Proteins 0.000 claims description 35
- 239000002773 nucleotide Substances 0.000 claims description 27
- 125000003729 nucleotide group Chemical group 0.000 claims description 27
- 239000000047 product Substances 0.000 claims description 17
- 229930006000 Sucrose Natural products 0.000 claims description 12
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 claims description 12
- 230000000813 microbial effect Effects 0.000 claims description 12
- 239000005720 sucrose Substances 0.000 claims description 12
- 241000235058 Komagataella pastoris Species 0.000 claims description 11
- 239000000203 mixture Substances 0.000 claims description 11
- 241000588724 Escherichia coli Species 0.000 claims description 6
- 108010009534 Arabidopsis sucrose synthase-1 Proteins 0.000 claims description 3
- 235000013361 beverage Nutrition 0.000 claims description 2
- 239000002537 cosmetic Substances 0.000 claims description 2
- 238000012258 culturing Methods 0.000 claims description 2
- 235000015872 dietary supplement Nutrition 0.000 claims description 2
- 235000013305 food Nutrition 0.000 claims description 2
- 239000002417 nutraceutical Substances 0.000 claims description 2
- 235000021436 nutraceutical agent Nutrition 0.000 claims description 2
- 241000219977 Vigna Species 0.000 claims 1
- 108700023372 Glycosyltransferases Proteins 0.000 abstract description 8
- 102000051366 Glycosyltransferases Human genes 0.000 abstract description 8
- 210000004027 cell Anatomy 0.000 description 103
- 229940088598 enzyme Drugs 0.000 description 49
- 108090000623 proteins and genes Proteins 0.000 description 46
- 235000001014 amino acid Nutrition 0.000 description 45
- 150000001413 amino acids Chemical class 0.000 description 42
- 229940024606 amino acid Drugs 0.000 description 41
- 230000014509 gene expression Effects 0.000 description 40
- 150000007523 nucleic acids Chemical group 0.000 description 38
- 238000006243 chemical reaction Methods 0.000 description 32
- 108020004414 DNA Proteins 0.000 description 30
- 241000235648 Pichia Species 0.000 description 29
- 239000013604 expression vector Substances 0.000 description 24
- 102000039446 nucleic acids Human genes 0.000 description 23
- 108020004707 nucleic acids Proteins 0.000 description 23
- 239000013598 vector Substances 0.000 description 22
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 21
- 238000004519 manufacturing process Methods 0.000 description 21
- 241000196324 Embryophyta Species 0.000 description 17
- 108091028043 Nucleic acid sequence Proteins 0.000 description 17
- 238000005516 engineering process Methods 0.000 description 17
- 102000004169 proteins and genes Human genes 0.000 description 16
- 108091026890 Coding region Proteins 0.000 description 15
- 230000000694 effects Effects 0.000 description 15
- 235000018102 proteins Nutrition 0.000 description 15
- 238000006467 substitution reaction Methods 0.000 description 15
- 108020004705 Codon Proteins 0.000 description 14
- 238000005859 coupling reaction Methods 0.000 description 14
- 238000004128 high performance liquid chromatography Methods 0.000 description 13
- 239000012634 fragment Substances 0.000 description 12
- 230000002255 enzymatic effect Effects 0.000 description 10
- 230000009261 transgenic effect Effects 0.000 description 10
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 9
- 230000009466 transformation Effects 0.000 description 9
- 125000000539 amino acid group Chemical group 0.000 description 8
- 230000000295 complement effect Effects 0.000 description 8
- 230000008878 coupling Effects 0.000 description 8
- 238000010168 coupling process Methods 0.000 description 8
- 238000000338 in vitro Methods 0.000 description 8
- 101000630755 Arabidopsis thaliana Sucrose synthase 1 Proteins 0.000 description 7
- 241000894006 Bacteria Species 0.000 description 7
- 238000006206 glycosylation reaction Methods 0.000 description 7
- 230000001105 regulatory effect Effects 0.000 description 7
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- 238000000527 sonication Methods 0.000 description 6
- 239000000758 substrate Substances 0.000 description 6
- 238000013518 transcription Methods 0.000 description 6
- 230000035897 transcription Effects 0.000 description 6
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 5
- 238000007792 addition Methods 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 5
- 239000013613 expression plasmid Substances 0.000 description 5
- 239000003550 marker Substances 0.000 description 5
- 239000013612 plasmid Substances 0.000 description 5
- 229920000642 polymer Polymers 0.000 description 5
- 108700010070 Codon Usage Proteins 0.000 description 4
- 108020004511 Recombinant DNA Proteins 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- 240000004922 Vigna radiata Species 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 4
- 230000001851 biosynthetic effect Effects 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 108020004999 messenger RNA Proteins 0.000 description 4
- 238000003752 polymerase chain reaction Methods 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- 238000012216 screening Methods 0.000 description 4
- 241000894007 species Species 0.000 description 4
- 239000006228 supernatant Substances 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- 238000005918 transglycosylation reaction Methods 0.000 description 4
- 238000013519 translation Methods 0.000 description 4
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 3
- -1 2-(4-hydroxyphenyl)ethyl- Chemical group 0.000 description 3
- 102100036826 Aldehyde oxidase Human genes 0.000 description 3
- 240000002791 Brassica napus Species 0.000 description 3
- 102000053602 DNA Human genes 0.000 description 3
- 244000020551 Helianthus annuus Species 0.000 description 3
- 101000928314 Homo sapiens Aldehyde oxidase Proteins 0.000 description 3
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 3
- 244000061176 Nicotiana tabacum Species 0.000 description 3
- 241001165494 Rhodiola Species 0.000 description 3
- 241000235070 Saccharomyces Species 0.000 description 3
- 235000006582 Vigna radiata Nutrition 0.000 description 3
- 230000008901 benefit Effects 0.000 description 3
- 238000005119 centrifugation Methods 0.000 description 3
- 230000002759 chromosomal effect Effects 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 150000001875 compounds Chemical class 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 230000029087 digestion Effects 0.000 description 3
- 239000006167 equilibration buffer Substances 0.000 description 3
- 125000002791 glucosyl group Chemical group C1([C@H](O)[C@@H](O)[C@H](O)[C@H](O1)CO)* 0.000 description 3
- 230000013595 glycosylation Effects 0.000 description 3
- 229930027917 kanamycin Natural products 0.000 description 3
- 229960000318 kanamycin Drugs 0.000 description 3
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 3
- 229930182823 kanamycin A Natural products 0.000 description 3
- 239000002609 medium Substances 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- 230000008929 regeneration Effects 0.000 description 3
- 238000011069 regeneration method Methods 0.000 description 3
- 108091008146 restriction endonucleases Proteins 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 210000005253 yeast cell Anatomy 0.000 description 3
- 241000219194 Arabidopsis Species 0.000 description 2
- 244000105624 Arachis hypogaea Species 0.000 description 2
- 239000004475 Arginine Substances 0.000 description 2
- 244000075850 Avena orientalis Species 0.000 description 2
- 235000007319 Avena orientalis Nutrition 0.000 description 2
- 235000011299 Brassica oleracea var botrytis Nutrition 0.000 description 2
- 240000003259 Brassica oleracea var. botrytis Species 0.000 description 2
- 241001301148 Brassica rapa subsp. oleifera Species 0.000 description 2
- 241000222120 Candida <Saccharomycetales> Species 0.000 description 2
- 241000195493 Cryptophyta Species 0.000 description 2
- 241000192700 Cyanobacteria Species 0.000 description 2
- 241000588722 Escherichia Species 0.000 description 2
- 241000701533 Escherichia virus T4 Species 0.000 description 2
- 241000233866 Fungi Species 0.000 description 2
- 244000068988 Glycine max Species 0.000 description 2
- 244000299507 Gossypium hirsutum Species 0.000 description 2
- 235000003222 Helianthus annuus Nutrition 0.000 description 2
- 240000005979 Hordeum vulgare Species 0.000 description 2
- 235000007340 Hordeum vulgare Nutrition 0.000 description 2
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 2
- 240000006240 Linum usitatissimum Species 0.000 description 2
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 2
- 239000004472 Lysine Substances 0.000 description 2
- 240000004658 Medicago sativa Species 0.000 description 2
- 240000007594 Oryza sativa Species 0.000 description 2
- 235000007164 Oryza sativa Nutrition 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 2
- 240000006394 Sorghum bicolor Species 0.000 description 2
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 2
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 244000038559 crop plants Species 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 239000003814 drug Substances 0.000 description 2
- 238000006911 enzymatic reaction Methods 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 229920001519 homopolymer Polymers 0.000 description 2
- 238000010348 incorporation Methods 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 230000000977 initiatory effect Effects 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 229930182817 methionine Natural products 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 238000001823 molecular biology technique Methods 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 239000008188 pellet Substances 0.000 description 2
- 229920002704 polyhistidine Polymers 0.000 description 2
- 239000008057 potassium phosphate buffer Substances 0.000 description 2
- 239000013615 primer Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N puromycin Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 2
- 230000035484 reaction time Effects 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 238000001890 transfection Methods 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropansäure Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- PJVXUVWGSCCGHT-ZPYZYFCMSA-N (2r,3s,4r,5r)-2,3,4,5,6-pentahydroxyhexanal;(3s,4r,5r)-1,3,4,5,6-pentahydroxyhexan-2-one Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@@H](O)C=O.OC[C@@H](O)[C@@H](O)[C@H](O)C(=O)CO PJVXUVWGSCCGHT-ZPYZYFCMSA-N 0.000 description 1
- FQVLRGLGWNWPSS-BXBUPLCLSA-N (4r,7s,10s,13s,16r)-16-acetamido-13-(1h-imidazol-5-ylmethyl)-10-methyl-6,9,12,15-tetraoxo-7-propan-2-yl-1,2-dithia-5,8,11,14-tetrazacycloheptadecane-4-carboxamide Chemical compound N1C(=O)[C@@H](NC(C)=O)CSSC[C@@H](C(N)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)NC(=O)[C@@H]1CC1=CN=CN1 FQVLRGLGWNWPSS-BXBUPLCLSA-N 0.000 description 1
- HNSDLXPSAYFUHK-UHFFFAOYSA-N 1,4-bis(2-ethylhexyl) sulfosuccinate Chemical compound CCCCC(CC)COC(=O)CC(S(O)(=O)=O)C(=O)OCC(CC)CCCC HNSDLXPSAYFUHK-UHFFFAOYSA-N 0.000 description 1
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical group C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 1
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 1
- 101710163881 5,6-dihydroxyindole-2-carboxylic acid oxidase Proteins 0.000 description 1
- 241000589291 Acinetobacter Species 0.000 description 1
- 241001133760 Acoelorraphe Species 0.000 description 1
- 241000186361 Actinobacteria <class> Species 0.000 description 1
- 102100034035 Alcohol dehydrogenase 1A Human genes 0.000 description 1
- 108020005544 Antisense RNA Proteins 0.000 description 1
- 240000007087 Apium graveolens Species 0.000 description 1
- 235000015849 Apium graveolens Dulce Group Nutrition 0.000 description 1
- 235000010591 Appio Nutrition 0.000 description 1
- 241000219195 Arabidopsis thaliana Species 0.000 description 1
- 101000879463 Arabidopsis thaliana Sucrose synthase 3 Proteins 0.000 description 1
- 241000722808 Arthrobotrys Species 0.000 description 1
- 235000016425 Arthrospira platensis Nutrition 0.000 description 1
- 240000002900 Arthrospira platensis Species 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 241000228212 Aspergillus Species 0.000 description 1
- 241000193830 Bacillus <bacterium> Species 0.000 description 1
- 241000219310 Beta vulgaris subsp. vulgaris Species 0.000 description 1
- 239000002028 Biomass Substances 0.000 description 1
- 108010006654 Bleomycin Proteins 0.000 description 1
- 241000743774 Brachypodium Species 0.000 description 1
- 235000011293 Brassica napus Nutrition 0.000 description 1
- 240000007124 Brassica oleracea Species 0.000 description 1
- 235000003899 Brassica oleracea var acephala Nutrition 0.000 description 1
- 235000011301 Brassica oleracea var capitata Nutrition 0.000 description 1
- 235000017647 Brassica oleracea var italica Nutrition 0.000 description 1
- 235000001169 Brassica oleracea var oleracea Nutrition 0.000 description 1
- 235000004977 Brassica sinapistrum Nutrition 0.000 description 1
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 244000025254 Cannabis sativa Species 0.000 description 1
- 241000193403 Clostridium Species 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 241000186216 Corynebacterium Species 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- 235000011309 Crambe hispanica subsp abyssinica Nutrition 0.000 description 1
- 241000220247 Crambe hispanica subsp. abyssinica Species 0.000 description 1
- 244000241257 Cucumis melo Species 0.000 description 1
- 235000015510 Cucumis melo subsp melo Nutrition 0.000 description 1
- RFSUNEUAIZKAJO-VRPWFDPXSA-N D-Fructose Natural products OC[C@H]1OC(O)(CO)[C@@H](O)[C@@H]1O RFSUNEUAIZKAJO-VRPWFDPXSA-N 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 108010061982 DNA Ligases Proteins 0.000 description 1
- 102000004594 DNA Polymerase I Human genes 0.000 description 1
- 108010017826 DNA Polymerase I Proteins 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 235000002767 Daucus carota Nutrition 0.000 description 1
- 244000000626 Daucus carota Species 0.000 description 1
- 241000235035 Debaryomyces Species 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 1
- 206010012335 Dependence Diseases 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 241001057636 Dracaena deremensis Species 0.000 description 1
- 241000195634 Dunaliella Species 0.000 description 1
- 241000701832 Enterobacteria phage T3 Species 0.000 description 1
- YQYJSBFKSSDGFO-UHFFFAOYSA-N Epihygromycin Natural products OC1C(O)C(C(=O)C)OC1OC(C(=C1)O)=CC=C1C=C(C)C(=O)NC1C(O)C(O)C2OCOC2C1O YQYJSBFKSSDGFO-UHFFFAOYSA-N 0.000 description 1
- 241001198387 Escherichia coli BL21(DE3) Species 0.000 description 1
- 240000009088 Fragaria x ananassa Species 0.000 description 1
- RFSUNEUAIZKAJO-ARQDHWQXSA-N Fructose Chemical compound OC[C@H]1O[C@](O)(CO)[C@@H](O)[C@@H]1O RFSUNEUAIZKAJO-ARQDHWQXSA-N 0.000 description 1
- 101150094690 GAL1 gene Proteins 0.000 description 1
- 101150038242 GAL10 gene Proteins 0.000 description 1
- 102100028501 Galanin peptides Human genes 0.000 description 1
- 102100024637 Galectin-10 Human genes 0.000 description 1
- 101000892220 Geobacillus thermodenitrificans (strain NG80-2) Long-chain-alcohol dehydrogenase 1 Proteins 0.000 description 1
- 108010055629 Glucosyltransferases Proteins 0.000 description 1
- 102000000340 Glucosyltransferases Human genes 0.000 description 1
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 1
- 102100031181 Glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 235000004341 Gossypium herbaceum Nutrition 0.000 description 1
- 240000002024 Gossypium herbaceum Species 0.000 description 1
- 235000009432 Gossypium hirsutum Nutrition 0.000 description 1
- 101150009006 HIS3 gene Proteins 0.000 description 1
- 101100246753 Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) pyrF gene Proteins 0.000 description 1
- 101000780443 Homo sapiens Alcohol dehydrogenase 1A Proteins 0.000 description 1
- 101100121078 Homo sapiens GAL gene Proteins 0.000 description 1
- 101000878605 Homo sapiens Low affinity immunoglobulin epsilon Fc receptor Proteins 0.000 description 1
- 101001046426 Homo sapiens cGMP-dependent protein kinase 1 Proteins 0.000 description 1
- 208000023105 Huntington disease Diseases 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 241000588748 Klebsiella Species 0.000 description 1
- 241000235649 Kluyveromyces Species 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 241000209510 Liliopsida Species 0.000 description 1
- 235000004431 Linum usitatissimum Nutrition 0.000 description 1
- 102100038007 Low affinity immunoglobulin epsilon Fc receptor Human genes 0.000 description 1
- 239000006137 Luria-Bertani broth Substances 0.000 description 1
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 1
- 108010038049 Mating Factor Proteins 0.000 description 1
- 235000010624 Medicago sativa Nutrition 0.000 description 1
- 235000017587 Medicago sativa ssp. sativa Nutrition 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 241000589344 Methylomonas Species 0.000 description 1
- 241000589354 Methylosinus Species 0.000 description 1
- 241000235395 Mucor Species 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- 229930193140 Neomycin Natural products 0.000 description 1
- 101150012394 PHO5 gene Proteins 0.000 description 1
- 241000209117 Panicum Species 0.000 description 1
- 235000006443 Panicum miliaceum subsp. miliaceum Nutrition 0.000 description 1
- 235000009037 Panicum miliaceum subsp. ruderale Nutrition 0.000 description 1
- 241001520808 Panicum virgatum Species 0.000 description 1
- 241000520272 Pantoea Species 0.000 description 1
- 208000018737 Parkinson disease Diseases 0.000 description 1
- UOZODPSAJZTQNH-UHFFFAOYSA-N Paromomycin II Natural products NC1C(O)C(O)C(CN)OC1OC1C(O)C(OC2C(C(N)CC(N)C2O)OC2C(C(O)C(O)C(CO)O2)N)OC1CO UOZODPSAJZTQNH-UHFFFAOYSA-N 0.000 description 1
- 240000004370 Pastinaca sativa Species 0.000 description 1
- 235000017769 Pastinaca sativa subsp sativa Nutrition 0.000 description 1
- 108010033276 Peptide Fragments Proteins 0.000 description 1
- 102000007079 Peptide Fragments Human genes 0.000 description 1
- 244000062780 Petroselinum sativum Species 0.000 description 1
- 240000007377 Petunia x hybrida Species 0.000 description 1
- 235000010627 Phaseolus vulgaris Nutrition 0.000 description 1
- 244000046052 Phaseolus vulgaris Species 0.000 description 1
- 235000010582 Pisum sativum Nutrition 0.000 description 1
- 240000004713 Pisum sativum Species 0.000 description 1
- 241000209504 Poaceae Species 0.000 description 1
- 241000589516 Pseudomonas Species 0.000 description 1
- 241000191025 Rhodobacter Species 0.000 description 1
- 241000316848 Rhodococcus <scale insect> Species 0.000 description 1
- 101100394989 Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009) hisI gene Proteins 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 101100434411 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) ADH1 gene Proteins 0.000 description 1
- 240000000111 Saccharum officinarum Species 0.000 description 1
- 235000007201 Saccharum officinarum Nutrition 0.000 description 1
- 241000607142 Salmonella Species 0.000 description 1
- 101001000154 Schistosoma mansoni Phosphoglycerate kinase Proteins 0.000 description 1
- 241000209056 Secale Species 0.000 description 1
- 235000007238 Secale cereale Nutrition 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 235000003434 Sesamum indicum Nutrition 0.000 description 1
- 244000040738 Sesamum orientale Species 0.000 description 1
- 108020004682 Single-Stranded DNA Proteins 0.000 description 1
- 240000003768 Solanum lycopersicum Species 0.000 description 1
- 235000002595 Solanum tuberosum Nutrition 0.000 description 1
- 244000061456 Solanum tuberosum Species 0.000 description 1
- 235000007230 Sorghum bicolor Nutrition 0.000 description 1
- 235000011684 Sorghum saccharatum Nutrition 0.000 description 1
- 241000187747 Streptomyces Species 0.000 description 1
- 235000021536 Sugar beet Nutrition 0.000 description 1
- 241000192584 Synechocystis Species 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 208000030886 Traumatic Brain injury Diseases 0.000 description 1
- 241000209140 Triticum Species 0.000 description 1
- 235000021307 Triticum Nutrition 0.000 description 1
- 241000209146 Triticum sp. Species 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- 101150050575 URA3 gene Proteins 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 235000010721 Vigna radiata var radiata Nutrition 0.000 description 1
- 235000011469 Vigna radiata var sublobata Nutrition 0.000 description 1
- 241001464837 Viridiplantae Species 0.000 description 1
- 241000219094 Vitaceae Species 0.000 description 1
- 240000008042 Zea mays Species 0.000 description 1
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 description 1
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 1
- 241000235017 Zygosaccharomyces Species 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 101150102866 adc1 gene Proteins 0.000 description 1
- 229960005305 adenosine Drugs 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 230000007131 anti Alzheimer effect Effects 0.000 description 1
- 230000001430 anti-depressive effect Effects 0.000 description 1
- 230000000648 anti-parkinson Effects 0.000 description 1
- 230000000692 anti-sense effect Effects 0.000 description 1
- 230000000320 anti-stroke effect Effects 0.000 description 1
- 239000000939 antiparkinson agent Substances 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 125000003118 aryl group Chemical group 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 230000000975 bioactive effect Effects 0.000 description 1
- 239000002551 biofuel Substances 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 229960001561 bleomycin Drugs 0.000 description 1
- OYVAGSVQBOHSSS-UAPAGMARSA-O bleomycin A2 Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC=C(N=1)C=1SC=C(N=1)C(=O)NCCC[S+](C)C)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1N=CNC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C OYVAGSVQBOHSSS-UAPAGMARSA-O 0.000 description 1
- 210000004899 c-terminal region Anatomy 0.000 description 1
- 102100022422 cGMP-dependent protein kinase 1 Human genes 0.000 description 1
- 239000001110 calcium chloride Substances 0.000 description 1
- 229910001628 calcium chloride Inorganic materials 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 238000006555 catalytic reaction Methods 0.000 description 1
- 210000004671 cell-free system Anatomy 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000007795 chemical reaction product Substances 0.000 description 1
- 239000013611 chromosomal DNA Substances 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 230000004186 co-expression Effects 0.000 description 1
- 238000000975 co-precipitation Methods 0.000 description 1
- 230000003920 cognitive function Effects 0.000 description 1
- 239000003184 complementary RNA Substances 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 239000000356 contaminant Substances 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 235000005822 corn Nutrition 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 230000022811 deglycosylation Effects 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 210000002257 embryonic structure Anatomy 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 206010015037 epilepsy Diseases 0.000 description 1
- 241001233957 eudicotyledons Species 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 239000004459 forage Substances 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- BRZYSWJRSDMWLG-CAXSIQPQSA-N geneticin Natural products O1C[C@@](O)(C)[C@H](NC)[C@@H](O)[C@H]1O[C@@H]1[C@@H](O)[C@H](O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](C(C)O)O2)N)[C@@H](N)C[C@H]1N BRZYSWJRSDMWLG-CAXSIQPQSA-N 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 description 1
- 235000021021 grapes Nutrition 0.000 description 1
- 239000011121 hardwood Substances 0.000 description 1
- 235000019534 high fructose corn syrup Nutrition 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 238000001638 lipofection Methods 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 239000012139 lysis buffer Substances 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- SKEFKEOTNIPLCQ-LWIQTABASA-N mating hormone Chemical group C([C@@H](C(=O)NC(CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N1[C@@H](CCC1)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCS(C)=O)C(=O)NC(CC=1C=CC(O)=CC=1)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CN=CN1 SKEFKEOTNIPLCQ-LWIQTABASA-N 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 238000003808 methanol extraction Methods 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 229930014626 natural product Natural products 0.000 description 1
- 229960004927 neomycin Drugs 0.000 description 1
- 230000000324 neuroprotective effect Effects 0.000 description 1
- 108091027963 non-coding RNA Proteins 0.000 description 1
- 102000042567 non-coding RNA Human genes 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 229960001914 paromomycin Drugs 0.000 description 1
- UOZODPSAJZTQNH-LSWIJEOBSA-N paromomycin Chemical compound N[C@@H]1[C@@H](O)[C@H](O)[C@H](CN)O[C@@H]1O[C@H]1[C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](N)C[C@@H](N)[C@@H]2O)O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](CO)O2)N)O[C@@H]1CO UOZODPSAJZTQNH-LSWIJEOBSA-N 0.000 description 1
- 230000037361 pathway Effects 0.000 description 1
- 235000020232 peanut Nutrition 0.000 description 1
- 235000011197 perejil Nutrition 0.000 description 1
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N phenol group Chemical group C1(=CC=CC=C1)O ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 230000029553 photosynthesis Effects 0.000 description 1
- 238000010672 photosynthesis Methods 0.000 description 1
- 230000000704 physical effect Effects 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 230000000379 polymerizing effect Effects 0.000 description 1
- 235000012015 potatoes Nutrition 0.000 description 1
- 229940124606 potential therapeutic agent Drugs 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 230000019525 primary metabolic process Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 229950010131 puromycin Drugs 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 238000009394 selective breeding Methods 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 239000011122 softwood Substances 0.000 description 1
- 229940082787 spirulina Drugs 0.000 description 1
- 235000021012 strawberries Nutrition 0.000 description 1
- 230000009897 systematic effect Effects 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 238000004704 ultra performance liquid chromatography Methods 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/12—Disaccharides
-
- A—HUMAN NECESSITIES
- A23—FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
- A23L—FOODS, FOODSTUFFS OR NON-ALCOHOLIC BEVERAGES, NOT OTHERWISE PROVIDED FOR; PREPARATION OR TREATMENT THEREOF
- A23L33/00—Modifying nutritive qualities of foods; Dietetic products; Preparation or treatment thereof
- A23L33/10—Modifying nutritive qualities of foods; Dietetic products; Preparation or treatment thereof using additives
- A23L33/125—Modifying nutritive qualities of foods; Dietetic products; Preparation or treatment thereof using additives containing carbohydrate syrups; containing sugars; containing sugar alcohols; containing starch hydrolysates
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1048—Glycosyltransferases (2.4)
- C12N9/1051—Hexosyltransferases (2.4.1)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/18—Preparation of compounds containing saccharide radicals produced by the action of a glycosyl transferase, e.g. alpha-, beta- or gamma-cyclodextrins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/44—Preparation of O-glycosides, e.g. glucosides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y204/00—Glycosyltransferases (2.4)
Definitions
- BIOSYNTHESIS OF SALIDROSIDE RELATED APPLICATION This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 63/375,734 filed on September 15, 2022, entitled “BIOSYNTHESIS OF SALIDROSIDE,” the entire contents of which are incorporated herein by reference.
- REFERENCE TO AN ELECTRONIC SEQUENCE LISTING The contents of the electronic sequence listing (C149770092WO00-SEQ-VLJ.xml; Size: 52,129 bytes; and Date of Creation: September 13, 2023) are herein incorporated by reference in their entirety.
- Salidroside, a glycosylated form of tyrosol also known as tyrosol 8-O-glucoside and as 2-(4-hydroxyphenyl)ethyl-β-D-glucopyranoside, is naturally produced by plants within the Rhodiola genus.
- Salidroside is of particular interest and value because a number of studies have revealed that it exhibits neuroprotective activities, including anti-Alzheimer’s disease, anti-Parkinson’s disease, anti-Huntington’s disease, anti-stroke, anti-depressive effects, and anti-traumatic brain injury; it is also useful for improving cognitive function, treating addiction, and preventing epilepsy.
- The present disclosure relates to the synthesis of salidroside. More particularly, the present disclosure relates to biosynthetic methods for producing salidroside. The present disclosure also relates to enzymes that can be used to prepare salidroside and to recombinant cells for producing the enzymes.
- a method for synthesizing salidroside comprising: (i) preparing a reaction mixture comprising: tyrosol, uridine diphosphate-glucose (UDP-glucose), and a uridine diphosphate (UDP)-glycosyltransferase, and (ii) incubating the reaction mixture to produce salidroside, wherein a glucose is covalently coupled to the tyrosol to produce salidroside.
- the UDP-glycosyltransferase is selected from the group consisting of a first polypeptide, a second polypeptide, a third polypeptide, and combinations thereof.
- the first polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5
- the second polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11
- the third polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the reaction mixture may further comprise sucrose and a sucrose synthase.
- the sucrose synthase may be selected from the group consisting of an Arabidopsis sucrose synthase 1; an Arabidopsis sucrose synthase 3 and a Vigna radiata sucrose synthase.
- the first polypeptide may comprise an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 5. In a representative example, the first polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 5. In a further example, the first polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 5.
- the second polypeptide may comprise an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 11. In a representative example, the second polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
- the second polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 11.
- the third polypeptide may comprise an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the third polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the third polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 15.
- the present disclosure provides a recombinant cell comprising a heterologous polynucleotide encoding a polypeptide selected from the group consisting of a first polypeptide, a second polypeptide, a third polypeptide, and combinations thereof.
- the first polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5
- the second polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11
- the third polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the heterologous polynucleotide comprises a nucleotide sequence having at least 95%, 99%, or 100% identity to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 6, 12, and 16.
- the first polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5.
- the second polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
- the third polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the recombinant cell is a microbial cell, such as Escherichia coli or Pichia pastoris.
- a method for producing a polypeptide includes culturing the recombinant cells of the above second aspect, and expressing a polypeptide in the recombinant cells, wherein the polypeptide is selected from the group consisting of the first polypeptide, the second polypeptide, the third polypeptide, and combinations thereof.
- the first polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5
- the second polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11
- the third polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the present disclosure provides isolated UDP-glycosyltransferase enzymes.
- a first such enzyme comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5.
- the first enzyme comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5.
- a second enzyme comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
- the second enzyme comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
- a third enzyme comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the third enzyme comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- FIG. 1 illustrates the bioconversion pathway of salidroside.
- UGT catalyzes a transglycosylation reaction for salidroside production from tyrosol.
- Sucrose synthase (SUS) can couple with the UGT-catalyzed reaction for UDPG regeneration.
- FIG. 2 is an HPLC profile of enzymatic screening of UGT candidates for salidroside production.
- Panel A tyrosol (TY) and salidroside (SA) standard
- Panel B tyrosol (TY) and salidroside (SA) standard
- FIG. 3 illustrates results confirming the enzymatic activity of TS29, TS34 and TS39. Reaction samples were collected at 1.5 hr, 4 hr and 24 hr. Tyrosol and salidroside were detected by HPLC analysis and concentrations were calculated based on standard curves.
- FIG. 4 is an HPLC profile of in vitro production of salidroside (“SA”) from tyrosol (“TY”) catalyzed by a recombinant TS29 or TS34 and a recombinant AtSUS1 in a UGT-SUS coupling reaction system as described herein. All samples were collected at 24 hr and analyzed by HPLC.
- Panel A shows the standards of salidroside (“SA”) and tyrosol (“TY”).
- Panel B: reaction mixture; Panel C: TS29 reaction with UDPG; Panel D: TS34 reaction with UDPG; Panel E: TS29 reaction with UDP; Panel F: TS34 reaction with UDP; Panel G.
- FIG. 5 illustrates the in vitro production of salidroside from tyrosol as catalyzed by recombinant TS29 and TS34 with or without AtSUS1 in a UGT-SUS coupling reaction system as described herein. In these reactions, the recombinant polypeptide (10 µg aliquots) was tested in a 200 µL in vitro reaction system.
- the reaction system contained 50 mM Tris-HCl, pH 7, 2.5 mg/ml tyrosol, 3 mM UDPG (Panel A) or 3 mM UDP without (Panel B) or with 250 mM sucrose, and 84 µg/ml AtSUS1 (Panel C).
- the reaction was performed at 37 °C and samples were collected at 1 hr, 4 hr, and 24 hr.
- the concentrations of tyrosol and salidroside were calculated from the HPLC data based on standard curves.
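- The reaction composition given for FIG. 5 can be put on a molar basis to see why UDPG regeneration matters. The following is a minimal sketch, not part of the disclosure: the molar masses are computed from the molecular formulas of tyrosol (C8H10O2) and salidroside (C14H20O7) and are assumptions of this example, as are all function and variable names.

```python
# Molar masses in g/mol, computed from molecular formulas.
# These values are NOT stated in the source; they are assumptions of this sketch.
TYROSOL_MW = 138.16      # C8H10O2
SALIDROSIDE_MW = 300.30  # C14H20O7


def mg_per_ml_to_mM(conc_mg_per_ml: float, mw_g_per_mol: float) -> float:
    """Convert a mass concentration (mg/ml) to a molar concentration (mM)."""
    return conc_mg_per_ml / mw_g_per_mol * 1000.0


def mM_to_mg_per_ml(conc_mM: float, mw_g_per_mol: float) -> float:
    """Convert a molar concentration (mM) to a mass concentration (mg/ml)."""
    return conc_mM * mw_g_per_mol / 1000.0


# Values taken from the reaction system described for FIG. 5.
tyrosol_mM = mg_per_ml_to_mM(2.5, TYROSOL_MW)   # 2.5 mg/ml tyrosol
udp_sugar_mM = 3.0                              # 3 mM UDPG or 3 mM UDP
enzyme_ug_per_ml = 10.0 / 0.200                 # 10 µg polypeptide in 200 µL

# One UDP-glucose is consumed per tyrosol glucosylated, so without
# regeneration the UDPG pool caps the amount of salidroside formed.
cap_without_regeneration = mM_to_mg_per_ml(udp_sugar_mM, SALIDROSIDE_MW)
cap_with_full_conversion = mM_to_mg_per_ml(tyrosol_mM, SALIDROSIDE_MW)

# Average number of times each UDP must be recharged to UDPG (by sucrose
# synthase) to convert all of the tyrosol.
udp_turnovers = tyrosol_mM / udp_sugar_mM

print(f"tyrosol: {tyrosol_mM:.1f} mM, UDP-sugar: {udp_sugar_mM:.1f} mM")
print(f"UGT loading: {enzyme_ug_per_ml:.0f} µg/ml")
print(f"salidroside cap without regeneration: {cap_without_regeneration:.2f} mg/ml")
print(f"salidroside at full tyrosol conversion: {cap_with_full_conversion:.2f} mg/ml")
print(f"UDP turnovers needed for full conversion: {udp_turnovers:.1f}")
```

Under these assumptions tyrosol is in roughly six-fold molar excess over the UDP-sugar, while the 250 mM sucrose is in large excess over both, which is consistent with coupling the UGT reaction to sucrose synthase.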
- FIG. 6 is a graph summarizing the HPLC detection of salidroside enzymatically produced by the crude enzyme from induced Pichia cells.
- A Pichia expression plasmid harboring a single expression cassette was linearized by BspEI digestion and transformed into GS115 Pichia cells. Crude enzyme was prepared from induced Pichia cells by sonication. Reaction samples were collected 5 hours after crude enzyme addition.
- Panel A Tyrosol (TY) and salidroside (SA) standard
- Panel B TS29 crude enzyme from induced Pichia cells
- Panel C TS34 crude enzyme from induced Pichia cells
- Panel D TS39 crude enzyme from induced Pichia cells.
- FIG. 7 is a graph summarizing the HPLC detection of the salidroside enzymatically produced by the crude enzyme from induced Pichia cells.
- A Pichia expression plasmid harboring four TS34 expression cassettes and two mbSUS1 expression cassettes was linearized by BspEI and transformed into GS115 Pichia cells. Crude enzyme was prepared from induced Pichia cells by sonication. Reaction samples were collected at 5 and 18 hours after crude enzyme addition. Panel A: Tyrosol (TY) and salidroside (SA) standard; Panel B: enzymatic bioconversion at 5 hours; Panel C: enzymatic bioconversion at 18 hours.
- While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are herein described below in detail.
- “Nucleic acid” and “nucleotide” are used according to their respective ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form.
- The term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally-occurring nucleotides.
- a particular nucleic acid sequence also implicitly encompasses conservatively modified or degenerate variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated.
- “Isolated” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and when used in the context of an isolated nucleic acid or an isolated polypeptide, is used without limitation to refer to a nucleic acid or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature.
- An isolated nucleic acid or polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, in a transgenic host cell.
- “Incubating” and “incubation” as used herein refer to a process of mixing two or more chemical or biological entities (such as a chemical compound and an enzyme) and allowing them to interact under conditions favorable for producing a salidroside composition.
- “Degenerate variant” refers to a nucleic acid sequence having a residue sequence that differs from a reference nucleic acid sequence by one or more degenerate codon substitutions. Degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed base and/or deoxyinosine residues. A nucleic acid sequence and all of its degenerate variants will express the same amino acid or polypeptide.
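- The statement that a sequence and its degenerate variants express the same polypeptide can be illustrated with a short translation check. This is a minimal sketch assuming the standard genetic code; the two DNA sequences are hypothetical and are not taken from the sequence listing, and the function name is an assumption of this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon (no stop handling)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical reference and a variant that differs only at third codon positions.
reference = "ATGGCTCTGAAA"  # ATG GCT CTG AAA
variant = "ATGGCCCTCAAG"    # ATG GCC CTC AAG

differences = sum(1 for a, b in zip(reference, variant) if a != b)
print(differences, translate(reference), translate(variant))
```

The two nucleotide sequences differ at three positions yet translate to the same four-residue peptide, which is the property the definition relies on.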
- “Polypeptide,” “protein,” and “peptide” are used according to their respective ordinary and customary meanings as understood by a person of ordinary skill in the art; the three terms are sometimes used interchangeably, and are used without limitation to refer to a polymer of amino acids, or amino acid analogs, regardless of its size or function.
- amino acid residues in a polymer of amino acids may be referred to as the “amino acids of the polymer” with the understanding that peptidic bonds have formed among the amino acids or precursors thereof during the formation of the polymer chain.
- While “protein” is often used in reference to relatively large polypeptides, and “peptide” is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies.
- The term “polypeptide” as used herein refers to peptides, polypeptides, and proteins, unless otherwise noted.
- The terms “protein,” “polypeptide,” and “peptide” are used interchangeably herein when referring to a polynucleotide product.
- exemplary polypeptides include polynucleotide products, naturally occurring proteins, homologs, orthologs, paralogs, fragments and other equivalents, variants, and analogs of the foregoing.
- “Polypeptide fragment” and “fragment,” when used in reference to a reference polypeptide, are used according to their ordinary and customary meanings to a person of ordinary skill in the art, and are used without limitation to refer to a polypeptide in which amino acid residues are deleted as compared to the reference polypeptide itself, but where the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions can occur at the amino-terminus or carboxy-terminus of the reference polypeptide, or alternatively both.
- the term “functional fragment” of a polypeptide or protein refers to a peptide fragment that is a portion of the full length polypeptide or protein, and has substantially the same biological activity, or carries out substantially the same function as the full length polypeptide or protein (e.g., carrying out the same enzymatic reaction).
- a variant is a “functional variant” which retains some or all of the ability of the reference polypeptide.
- the term “functional variant” further includes conservatively substituted variants.
- the term “conservatively substituted variant” refers to a peptide having an amino acid sequence that differs from a reference peptide by one or more conservative amino acid substitutions, and maintains some or all of the activity of the reference peptide.
- a “conservative amino acid substitution” is a substitution of an amino acid residue with a functionally similar residue.
- conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine for another; the substitution of one charged or polar (hydrophilic) residue for another such as between arginine and lysine, between glutamine and asparagine, between threonine and serine; the substitution of one basic residue such as lysine or arginine for another; or the substitution of one acidic residue, such as aspartic acid or glutamic acid for another; or the substitution of one aromatic residue, such as phenylalanine, tyrosine, or tryptophan for another.
- Such substitutions are expected to have little or no effect on the apparent molecular weight or isoelectric point of the protein or polypeptide.
- the phrase “conservatively substituted variant” also includes peptides wherein a residue is replaced with a chemically-derivatized residue, provided that the resulting peptide maintains some or all of the activity of the reference peptide as described herein.
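- The residue classes listed above as examples of conservative substitutions can be written as a simple lookup. This is a minimal sketch covering only the residues named in the text; the group labels "amide" and "hydroxyl" and the function name are assumptions of this example, and the disclosure does not limit conservative substitutions to these groups.

```python
# Residue groups named in the text as examples of conservative substitutions,
# in one-letter amino acid codes. Group labels are illustrative only.
CONSERVATIVE_GROUPS = {
    "non-polar": set("IVLM"),  # isoleucine, valine, leucine, methionine
    "basic": set("KR"),        # lysine, arginine
    "amide": set("QN"),        # glutamine, asparagine
    "hydroxyl": set("TS"),     # threonine, serine
    "acidic": set("DE"),       # aspartic acid, glutamic acid
    "aromatic": set("FYW"),    # phenylalanine, tyrosine, tryptophan
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if swapping `original` for `replacement` stays in one group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS.values())


print(is_conservative("I", "V"))  # both non-polar
print(is_conservative("D", "E"))  # both acidic
print(is_conservative("K", "D"))  # basic to acidic
```

A residue outside every listed group (glycine, for example) is simply reported as non-conservative by this sketch, which reflects the limits of the example rather than a statement about the chemistry.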
- “Variant,” in connection with the polypeptides of the subject technology, further includes a functionally active polypeptide having an amino acid sequence at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical to the amino acid sequence of a reference polypeptide.
- “Homologous” in all its grammatical forms and spelling variations refers to the relationship between polynucleotides or polypeptides that possess a “common evolutionary origin,” including polynucleotides or polypeptides from superfamilies and homologous polynucleotides or proteins from different species (Reeck et al., Cell 50:667, 1987). Such polynucleotides or polypeptides have sequence homology, as reflected by their sequence similarity, whether in terms of percent identity or the presence of specific amino acids or motifs at conserved positions.
- two homologous polypeptides can have amino acid sequences that are at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical.
- Percent (%) amino acid sequence identity refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues of a reference polypeptide (such as, for example, SEQ ID NO: 5), after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software.
- the % amino acid sequence identity may be determined using the sequence comparison program NCBI-BLAST2.
- NCBI-BLAST2 sequence comparison program may be downloaded from ncbi.nlm.nih.gov.
- the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B is calculated as follows: 100 times the fraction X/Y where X is the number of amino acid residues scored as identical matches by the sequence alignment program NCBI-BLAST2 in that program’s alignment of A and B, and where Y is the total number of amino acid residues in B.
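- The 100 × X/Y calculation above can be sketched as follows. This example does not run NCBI-BLAST2; it assumes the alignment has already been produced and is supplied as two gapped strings of equal length that together cover all of sequence B. The function name and the two short sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Compute 100 * X / Y from a pre-computed pairwise alignment.

    X = number of alignment columns where A and B carry the same residue.
    Y = total number of residues in B (gap characters excluded).
    Assumes the alignment spans the full length of B.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    x = sum(1 for ra, rb in zip(aligned_a, aligned_b) if ra == rb and ra != gap)
    y = sum(1 for rb in aligned_b if rb != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical alignment: A has a one-residue gap and one mismatch relative to B.
aligned_a = "MKT-LLVA"
aligned_b = "MKTALIVA"
print(percent_identity(aligned_a, aligned_b))  # 6 matches / 8 residues in B -> 75.0
```

Because Y is the length of B rather than of the aligned region, the result is not symmetric: swapping A and B can change the value when the two sequences differ in length.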
- Techniques for determining nucleic acid and amino acid sequence identity also are well known in the art and include determining the nucleotide sequence of the mRNA for that gene (usually via a cDNA intermediate) and determining the amino acid sequence encoded therein, and comparing this to a second amino acid sequence.
- identity refers to an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two polynucleotides or polypeptide sequences, respectively. Two or more polynucleotide sequences can be compared by determining their “percent identity”, as can two or more amino acid sequences.
- the programs available in the Wisconsin Sequence Analysis Package, Version 8 (available from Genetics Computer Group, Madison, Wis.), for example, the GAP program, are capable of calculating both the identity between two polynucleotides and the identity and similarity between two polypeptide sequences, respectively.
- Other programs for calculating identity or similarity between sequences are known by those skilled in the art.
- An amino acid position “corresponding to” a reference position refers to a position that aligns with a reference sequence, as identified by aligning the amino acid sequences. Such alignments can be done by hand or by using well-known sequence alignment programs such as ClustalW2, Blast 2, etc.
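- Finding the position that "corresponds to" a reference position amounts to walking the alignment columns while counting residues in each sequence. A minimal sketch, assuming the alignment (for example from ClustalW2 or Blast 2) is already available as two gapped strings of equal length; the function name and the sequences are hypothetical.

```python
from typing import Optional


def corresponding_position(
    aligned_ref: str, aligned_query: str, ref_pos: int, gap: str = "-"
) -> Optional[int]:
    """Map a 1-based reference position to the aligned 1-based query position.

    Returns None when the reference residue is aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_count = 0
    query_count = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res != gap:
            ref_count += 1
        if query_res != gap:
            query_count += 1
        if ref_res != gap and ref_count == ref_pos:
            return query_count if query_res != gap else None
    raise ValueError("ref_pos lies outside the reference sequence")


# Hypothetical alignment with a deletion and an insertion in the query.
aligned_ref = "MKT-LLVA"
aligned_query = "M-TALIVA"
print(corresponding_position(aligned_ref, aligned_query, 3))  # T3 of the reference
print(corresponding_position(aligned_ref, aligned_query, 2))  # K2 faces a gap
```

Position numbering shifts wherever either sequence carries a gap, which is why corresponding positions are read from the alignment rather than from raw indices.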
- the percent identity of two polypeptide or polynucleotide sequences refers to the percentage of identical amino acid residues or nucleotides across the entire length of the shorter of the two sequences. “Coding sequence” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to a DNA sequence that encodes for a specific amino acid sequence.
- Suitable regulatory sequences is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to nucleotide sequences located upstream (5’ non-coding sequences), within, or downstream (3’ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences. “Promoter” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA.
- a coding sequence is located 3’ to a promoter sequence.
- Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different cell types, or at different stages of development, or in response to different environmental conditions. Promoters, which cause a gene to be expressed in most cell types at most times, are commonly referred to as “constitutive promoters.” It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.
- operably linked refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other.
- a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter).
- Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.
- expression as used herein, is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the subject technology.
- “Over-expression” refers to the production of a gene product in transgenic or recombinant organisms that exceeds levels of production in normal or non-transformed organisms.
- “Transformation” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to the transfer of a polynucleotide into a target cell.
- the transferred polynucleotide can be incorporated into the genome or chromosomal DNA of a target cell, resulting in genetically stable inheritance, or it can replicate independently of the host chromosome.
- Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” or “recombinant” or “transformed” organisms.
- transformed when used herein in connection with host cells, are used according to their ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to a cell of a host organism, such as a plant or microbial cell, into which a heterologous nucleic acid molecule has been introduced.
- the nucleic acid molecule can be stably integrated into the genome of the host cell, or the nucleic acid molecule can be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating.
- Transformed cells, tissues, or subjects are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.
- heterologous when used herein in connection with polynucleotides, are used according to their ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to a polynucleotide (e.g., a DNA sequence or a gene) that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form.
- a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of site-directed mutagenesis or other recombinant techniques.
- the terms also include non-naturally occurring multiple copies of a naturally occurring DNA sequence.
- the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position or form within the host cell in which the element is not ordinarily found.
- the terms “recombinant,” “heterologous,” and “exogenous,” when used herein in connection with a polypeptide or amino acid sequence means a polypeptide or amino acid sequence that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form.
- recombinant DNA segments can be expressed in a host cell to produce a recombinant polypeptide.
- Plasmid refers to any extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules.
- Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3’ untranslated sequence into a cell.
- Transformation cassette refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell.
- “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.
- Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described, for example, by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed.; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., 1989 (hereinafter “Maniatis”); and by Silhavy, T. J., Bennan, M. L. and Enquist, L. W.
- biosynthetic methods for synthesizing salidroside are disclosed. Also in accordance with the present disclosure, nucleic acid constructs and recombinant bacterial and yeast cells for producing enzymes which find use in the biosynthetic methods are provided.
Methods of Producing Salidroside
- the present disclosure is directed to a biosynthetic method whereby tyrosol is converted to salidroside.
- the method includes preparing a reaction mixture including tyrosol, uridine diphosphate-glucose (UDP-glucose), and a uridine diphosphate (UDP)-glycosyltransferase (UGT) enzyme.
- the reaction mixture is incubated and a glucosyl moiety is covalently coupled to the tyrosol, to form salidroside.
- the reaction mixture can be, for example, an in vitro cell-free system.
- the (UDP)-glycosyltransferase enzyme is believed to catalyze a transglycosylation reaction whereby a glucosyl moiety is transferred from the UDP-glucose to be covalently coupled to a tyrosol molecule, thereby forming salidroside.
- UDP-glycosyltransferases include those listed in Table 1, among which particularly suitable are those marked as TS29 (amino acid SEQ ID NO: 5), TS34 (amino acid SEQ ID NO: 11), and TS39 (amino acid SEQ ID NO: 15), respectively. It has been discovered that such polypeptides can be used to couple glucose with tyrosol in unexpectedly high yields.
- the UDP-glycosyltransferase has a percent amino acid sequence identity to the polypeptide of SEQ ID NO: 5 of at least 80%, e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%.
- the UDP-glycosyltransferase differs by no more than 50 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, or 49 amino acid(s), from the polypeptide of SEQ ID NO: 5.
- the UGT differs by no more than 10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids, from the polypeptide of SEQ ID NO: 5.
- the UDP-glycosyltransferase comprises, consists essentially of, or consists of the amino acid sequence of SEQ ID NO: 5 or an allelic variant thereof; or is a fragment thereof having UDP-glycosyltransferase activity.
- the UDP-glycosyltransferase has a percent amino acid sequence identity to the polypeptide of SEQ ID NO: 11 of at least 80%, e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%.
- the UDP-glycosyltransferase differs by no more than 50 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, or 49 amino acid(s), from the polypeptide of SEQ ID NO: 11.
- the UGT differs by no more than 10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids, from the polypeptide of SEQ ID NO: 11.
- the UDP-glycosyltransferase comprises, consists essentially of, or consists of the amino acid sequence of SEQ ID NO: 11 or an allelic variant thereof; or is a fragment thereof having UDP-glycosyltransferase activity.
- the UDP-glycosyltransferase has a percent amino acid sequence identity to the polypeptide of SEQ ID NO: 15 of at least 80%, e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%.
- the UDP-glycosyltransferase differs by no more than 50 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, or 49 amino acid(s), from the polypeptide of SEQ ID NO: 15.
- the polypeptides differ by no more than 10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids, from the polypeptide of SEQ ID NO: 15.
- the UDP-glycosyltransferase comprises, consists essentially of, or consists of the amino acid sequence of SEQ ID NO: 15 or an allelic variant thereof; or is a fragment thereof having UDP-glycosyltransferase activity.
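- The "differs by no more than N amino acids" criterion used above for SEQ ID NOs: 5, 11 and 15 can be sketched as a simple count over a gapped alignment. This is an illustration under a stated assumption: every alignment column in which the two rows differ (substitution, insertion or deletion) is counted as one difference, a counting rule the disclosure does not itself specify. The function names and toy rows are hypothetical.

```python
def count_differences(aligned_ref: str, aligned_variant: str) -> int:
    """Count alignment columns in which the two rows differ.

    Assumption: a gapped column counts as a single amino acid difference.
    """
    if len(aligned_ref) != len(aligned_variant):
        raise ValueError("alignment rows must have equal length")
    return sum(1 for r, v in zip(aligned_ref, aligned_variant) if r != v)


def within_limit(aligned_ref: str, aligned_variant: str,
                 max_differences: int = 50) -> bool:
    """True when the variant differs from the reference by at most the limit."""
    return count_differences(aligned_ref, aligned_variant) <= max_differences


# Hypothetical toy alignment (illustration only).
ref_row = "MKTALLIAG"
variant_row = "MKT-LLVAG"
n_diff = count_differences(ref_row, variant_row)
ok_at_10 = within_limit(ref_row, variant_row, max_differences=10)
```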
- the method can further include adding sucrose and a sucrose synthase enzyme (SUS) to the reaction mixture.
- Sucrose synthase is a glycosyltransferase.
- the systematic name of this enzyme class is NDP-glucose:D-fructose 2-alpha-D-glucosyltransferase.
- UDP-glucose can be regenerated from UDP and sucrose, which allows for omitting the addition of extra UDP-glucose to replenish the reaction mixture; instead, the UDP-glucose is regenerated by glycosylation of UDP that is already present in the mixture either from the outset or as a product of the deglycosylation of the initial UDP-glucose.
- This approach also allows for UDP-glucose to be generated completely in situ, although aliquots of UDP-glucose may be externally added to the mixture, for example in the early stages of the biosynthesis.
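- The regeneration cycle can be illustrated with a toy stoichiometric model in Python. It is a sketch only: it assumes 1:1 stoichiometry for both reactions and lets each step run to completion, so it gives an upper bound on product rather than a kinetic prediction. The function name and the input amounts are hypothetical.

```python
def run_coupled_reaction(tyrosol: float, sucrose: float,
                         udp_glucose: float, udp: float = 0.0,
                         max_cycles: int = 1000) -> dict:
    """Alternate the two reactions until neither can proceed.

    UGT: tyrosol + UDP-glucose -> salidroside + UDP
    SUS: UDP + sucrose         -> UDP-glucose + fructose
    Amounts are in arbitrary molar units (for example mM).
    """
    salidroside = 0.0
    fructose = 0.0
    for _ in range(max_cycles):
        # UGT step: limited by whichever substrate is scarcer.
        n = min(tyrosol, udp_glucose)
        tyrosol -= n
        udp_glucose -= n
        salidroside += n
        udp += n
        # SUS step: recycle the UDP released by the UGT step.
        m = min(udp, sucrose)
        udp -= m
        sucrose -= m
        udp_glucose += m
        fructose += m
        if n == 0 and m == 0:
            break
    return {"salidroside": salidroside, "tyrosol": tyrosol,
            "sucrose": sucrose, "udp_glucose": udp_glucose,
            "udp": udp, "fructose": fructose}


# Hypothetical amounts: a small UDP-glucose pool with excess sucrose.
with_sus = run_coupled_reaction(tyrosol=18.0, sucrose=50.0, udp_glucose=1.0)
# Same pool without sucrose, so no regeneration is possible.
without_sus = run_coupled_reaction(tyrosol=18.0, sucrose=0.0, udp_glucose=1.0)
```

- In this toy model a catalytic amount of UDP-glucose suffices to convert all of the tyrosol when sucrose is in excess, whereas without sucrose the product is capped by the initial UDP-glucose.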
- Suitable sucrose synthases can be, for example, an Arabidopsis sucrose synthase 1 (AtSUS1, SEQ ID NO: 19); an Arabidopsis sucrose synthase 3 (AtSUS3) and a Vigna radiata sucrose synthase (mbSUS1, SEQ ID NO: 21).
- a particularly suitable sucrose synthase can be, for example, the Vigna radiata sucrose synthase mbSUS1 having the amino acid sequence of SEQ ID NO: 21.
- Standard recombinant DNA methodologies may be used to obtain a nucleic acid construct that encodes a recombinant polypeptide described herein, incorporate the nucleic acid into an expression vector, and introduce the vector into a host cell, such as those described in Sambrook, et al. (eds), Molecular Cloning; A Laboratory Manual, Third Edition, Cold Spring Harbor, (2001); and Ausubel, F. M. et al. (eds.) Current Protocols in Molecular Biology, John Wiley & Sons (1995).
- a nucleic acid encoding a polypeptide may be inserted into an expression vector or vectors such that the nucleic acids are operably linked to transcriptional and translational control sequences (such as a promoter sequence, a transcription termination sequence, etc.).
- the expression vector and expression control sequences are generally chosen to be compatible with the expression host cell used. Accordingly, in one aspect, the present disclosure provides nucleic acid constructs comprising a nucleic acid sequence that encodes at least a UDP-glycosyltransferase as described herein, as well as recombinant host cells comprising said nucleic acid construct(s).
- Said host cell e.g., a bacterium or a yeast
- the first polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 5;
- the second polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 11,
- the third polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the first polypeptide may have at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 5.
- the second polypeptide may have at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 11.
- the third polypeptide may have at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- a nucleic acid construct may include a polynucleotide comprising a nucleotide sequence selected from the group consisting of a first sequence, a second sequence, and a third sequence.
- the first sequence has at least 90% identity to the nucleotide sequence as set forth in SEQ ID NO: 6; the second sequence has at least 90% identity to the nucleotide sequence as set forth in SEQ ID NO: 12, and the third sequence has at least 90% identity to the nucleotide sequence as set forth in SEQ ID NO: 16.
- the first sequence may have at least 95%, 99%, or 100% identity to the nucleotide sequence as set forth in SEQ ID NO: 6.
- the second sequence may have at least 95%, 99%, or 100% identity to the nucleotide sequence as set forth in SEQ ID NO: 12.
- the third sequence may have at least 95%, 99%, or 100% identity to the nucleotide sequence as set forth in SEQ ID NO: 16.
- the polynucleotide sequence encoding at least a UDP-glycosyltransferase is operably linked to a promoter.
- the promoter is a constitutive promoter (e.g., a constitutive promoter in a bacterium or yeast).
- the promoter comprises a mutated promoter (e.g., a bacterial lacUV5 promoter).
- the polynucleotide sequence encoding at least a UDP-glycosyltransferase is operably linked to a transcription terminator.
- the transcription terminator may be the bacteriophage T7 terminator.
- the nucleic acid sequence that encodes the one or more UDP-glycosyltransferases can include a polyhistidine tag.
- the most common polyhistidine tags are formed of six histidine residues (6xHis tag), which are added at the N-terminus, preceded by methionine, or at the C-terminus, before a stop codon, in the coding sequence of the protein of interest.
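- The two tag placements can be sketched at the level of the coding sequence. The Python below is an illustration only: it assumes that an N-terminal tag is inserted directly after the existing ATG start codon and that a C-terminal tag is inserted directly before the existing stop codon, and it alternates the two histidine codons (CAT and CAC) as a design choice. The function names and the toy coding sequence are hypothetical.

```python
HIS_CODONS = ("CAT", "CAC")           # the two standard histidine codons
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard stop codons


def his_tag(n: int = 6) -> str:
    """Return a DNA sequence encoding n histidines, alternating codons."""
    return "".join(HIS_CODONS[i % len(HIS_CODONS)] for i in range(n))


def add_n_terminal_tag(cds: str, n: int = 6) -> str:
    """Insert the tag after the start codon, so methionine precedes it."""
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start with ATG")
    return "ATG" + his_tag(n) + cds[3:]


def add_c_terminal_tag(cds: str, n: int = 6) -> str:
    """Insert the tag immediately before the stop codon."""
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    return cds[:-3] + his_tag(n) + cds[-3:]


# Hypothetical toy coding sequence (Met-Ala-Lys-stop), illustration only.
toy_cds = "ATGGCTAAATAA"
n_tagged = add_n_terminal_tag(toy_cds)
c_tagged = add_c_terminal_tag(toy_cds)
```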
Expression Vectors
- As stated above, a nucleic acid molecule encoding at least a UDP-glycosyltransferase as described herein may be inserted into a host species, e.g., a bacterium or yeast cell, for example in the form of an expression vector.
- a recombinant cell comprising a transgenic polynucleotide encoding a polypeptide selected from the group consisting of a first polypeptide, a second polypeptide, and a third polypeptide.
- the first polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 5
- the second polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 11
- the third polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the first polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 5.
- the second polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 11.
- the third polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 15.
- the transgenic polynucleotide encoding the UGT may include a nucleotide sequence having at least 95%, 99%, or 100% identity to the nucleotide sequence of SEQ ID NO: 6.
- the transgenic polynucleotide encoding the UGT may include a nucleotide sequence having at least 95%, 99%, or 100% identity to the nucleotide sequence of SEQ ID NO: 12.
- the transgenic polynucleotide encoding the UGT may include a nucleotide sequence having at least 95%, 99%, or 100% identity to the nucleotide sequence of SEQ ID NO: 16.
- Vectors or cassettes useful for the transformation of suitable hosts, e.g., microbial cells, are well known in the art.
- the vector or cassette contains sequences directing transcription and translation of the relevant polynucleotide, a selectable marker, and sequences allowing autonomous replication or chromosomal integration.
- Suitable vectors comprise a region 5' of the polynucleotide which harbors transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. It is preferred for both control regions to be derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a host. Initiation control regions or promoters, which are useful to drive expression of the recombinant polypeptide in the desired microbial host cell are numerous and familiar to those skilled in the art.
- Virtually any promoter capable of driving these genes is suitable for the subject technology including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1 (useful for expression in Pichia); and lac, trp, λPL, λPR, T7, tac, and trc (useful for expression in Escherichia coli). Termination control regions may also be derived from various genes native to the hosts. A termination site optionally may be included for the microbial hosts described hereinbelow.
- the nucleic acid molecule (e.g., vector) inserted in the host cell further comprises a polynucleotide encoding a selection marker.
- a “selection marker” is a gene introduced into a cell, especially cells in culture, that confers a trait suitable for artificial selection.
- a selectable marker is a gene that confers resistance to a drug to eukaryotic cells, including but not limited to kanamycin, paromomycin, puromycin, hygromycin, G418, neomycin, or bleomycin.
- a selectable marker is a gene that confers resistance to kanamycin.
- the polynucleotide used for incorporation into the expression vector of the subject technology can be prepared by routine techniques such as polymerase chain reaction (PCR).
- a UGT-encoding sequence is inserted into a pETite plasmid vector (Lucigen, WI), to construct a pETite expression vector.
- Several molecular biology techniques can be developed to operably link DNA to vectors via complementary cohesive termini.
- complementary homopolymer tracts can be added to the nucleic acid molecule to be inserted into the vector DNA.
- the vector and nucleic acid molecule are then joined by hydrogen bonding between the complementary homopolymeric tails to form recombinant DNA molecules.
- synthetic linkers containing one or more restriction sites are used to operably link the polynucleotide(s) of the subject technology to the expression vector.
- the polynucleotide is generated by restriction endonuclease digestion.
- the nucleic acid molecule is treated with bacteriophage T4 DNA polymerase or E.
- blunt-ended segments are then incubated with a large molar excess of linker molecules in the presence of an enzyme that can catalyze the ligation of blunt-ended DNA molecules, such as bacteriophage T4 DNA ligase.
- the product of the reaction is a polynucleotide carrying polymeric linker sequences at its ends.
- polynucleotides are then cleaved with the appropriate restriction enzyme and ligated to an expression vector that has been cleaved with an enzyme that produces termini compatible with those of the polynucleotide.
- a vector having ligation-independent cloning (LIC) sites can be employed.
- the required PCR amplified polynucleotide can then be cloned into the LIC vector without restriction digest or ligation (Aslanidis and de Jong, NUCL. ACID. RES. 18 6069-74, (1990), Haun, et al, BIOTECHNIQUES 13, 515-18 (1992), which is incorporated herein by reference to the extent it is consistent herewith).
- a polynucleotide for incorporation into an expression vector of the subject technology may be prepared using PCR using appropriate oligonucleotide primers.
- the coding region is amplified, while the primers themselves become incorporated into the amplified sequence product.
- the amplification primers contain restriction endonuclease recognition sites, which allow the amplified sequence product to be cloned into an appropriate vector.
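- A minimal sketch of primers carrying restriction sites is given below. It assumes that each primer is simply the recognition site followed by the annealing region; the choice of EcoRI (GAATTC) and XhoI (CTCGAG) is illustrative only, since the disclosure does not name specific enzymes here, and the function names, annealing length and toy coding region are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(coding_region: str, fwd_site: str, rev_site: str,
                   anneal_len: int = 18) -> tuple:
    """Build forward and reverse primers with 5' restriction sites.

    The forward primer anneals to the start of the coding region; the
    reverse primer is the reverse complement of its end. Both primers, and
    therefore their restriction sites, end up in the amplified product.
    """
    coding_region = coding_region.upper()
    if len(coding_region) < anneal_len:
        raise ValueError("coding region is shorter than the annealing length")
    forward = fwd_site + coding_region[:anneal_len]
    reverse = rev_site + reverse_complement(coding_region[-anneal_len:])
    return forward, reverse


# Illustrative sites (EcoRI, XhoI) and a hypothetical toy coding region.
ECORI = "GAATTC"
XHOI = "CTCGAG"
toy_region = "ATGGCTAAAGGTCTGTAA"
fwd_primer, rev_primer = design_primers(toy_region, ECORI, XHOI, anneal_len=6)
```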
Recombinant Host Cells
- The expression vectors can be introduced into host cells by suitable transformation or transfection techniques.
- the present disclosure further relates to host cells that have been transformed with one or more of the expression vectors described herein.
- Suitable hosts of the subject technology typically include microbial hosts or plant hosts.
- the host cell of the subject technology is selected from the group consisting of bacteria, yeast, filamentous fungi, cyanobacteria, algae and plant cells.
- Microbial host cell expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct vectors for expression of the recombinant polypeptide of the subject technology in a microbial host cell. These vectors could then be introduced into appropriate microorganisms via transformation to allow for high level expression of the recombinant polypeptide of the subject technology.
- the present disclosure includes transgenic host cells or hosts that have been transformed with one or more of the vectors disclosed herein.
- Transformation of appropriate cells with an expression vector of the subject technology is accomplished by methods known in the art and typically depends on both the type of vector and cell. Suitable techniques include calcium phosphate or calcium chloride co-precipitation, DEAE-dextran mediated transfection, lipofection, chemoporation or electroporation.
- Successfully transformed host cells, that is, those cells containing the expression vector, can be identified by techniques well known in the art. For example, bacterial or yeast cells (e.g., E. coli, S. cerevisiae, or Pichia pastoris) transfected with an expression vector of the subject technology can be cultured to produce polypeptides described herein. Cells can be examined for the presence of the expression vector DNA by techniques well known in the art.
- the host cells can contain a single copy of the expression vector described previously, or alternatively, multiple copies of the expression vector.
- the expression of polypeptide in a host described herein can be further improved by codon optimization. For example, modifying a less-common codon with a more common codon may affect the half-life of the mRNA or alter its structure by introducing a secondary structure that interferes with translation of the message. All or a portion of a coding region can be optimized. In some cases the desired modulation of expression is achieved by optimizing essentially the entire gene. In other cases, the desired modulation will be achieved by optimizing part of, but not the entire, sequence of the gene.
- the codon usage of any coding sequence can be adjusted to achieve a desired property, for example high levels of expression in a specific cell type.
- the starting point for such an optimization may be a coding sequence with 100% common codons, or a coding sequence which contains a mixture of common and non-common codons.
- Two or more candidate sequences that differ in their codon usage can be generated and tested to determine if they possess the desired property.
- Candidate sequences can be evaluated by using a computer to search for the presence of regulatory elements, such as silencers or enhancers, and to search for the presence of regions of coding sequence which could be converted into such regulatory elements by an alteration in codon usage.
- Additional criteria may include enrichment for particular nucleotides, e.g., A, C, G or U, codon bias for a particular amino acid, or the presence or absence of particular mRNA secondary or tertiary structure. Adjustment to the candidate sequence can be made based on a number of such criteria.
- the codon optimized nucleic acid sequence can express its protein, at a level which is about 110%, about 150%, about 200%, about 250%, about 300%, about 350%, about 400%, about 450%, or about 500%, of that expressed by a nucleic acid sequence that has not been codon optimized.
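- The simplest form of codon adjustment, replacing each codon with a preferred synonymous codon, can be sketched as follows. This is an illustration under stated assumptions: it uses the standard genetic code, and the `PREFERRED` table is a hypothetical placeholder rather than measured codon usage for any host. It does not address the regulatory-element or mRNA-structure criteria discussed above.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical preferred codons (placeholder values, not host data).
PREFERRED = {"L": "CTG", "R": "CGT", "A": "GCG"}


def translate(cds: str) -> str:
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


# Hypothetical toy coding sequence (Met-Leu-Arg-Ala-stop), illustration only.
toy_cds = "ATGTTAAGAGCTTAA"
recoded = recode(toy_cds, PREFERRED)
```

- Checking that `translate(recoded)` equals `translate(toy_cds)` is a quick way to confirm that a candidate sequence still encodes the same polypeptide before it is tested for expression.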
- Suitable hosts can include any organism capable of expressing a polynucleotide encoding the recombinant UGT polypeptide described herein (such as the polypeptide of SEQ ID NO: 5, 11 or 15).
- Host cells can be unmodified cells or cell lines, or cell lines that have been genetically modified. In some embodiments, the host cell is a cell line that has been modified to allow for growth under desired conditions, such as at a lower temperature.
- Microorganisms useful as hosts in the subject technology include bacteria, such as the enteric bacteria (Escherichia and Salmonella, for example) as well as Bacillus, Acinetobacter, Klebsiella, Pantoea, Clostridium, Actinomycetes such as Streptomyces, Corynebacterium, Methanotrophs such as Methylosinus, Methylomonas, Rhodococcus and Pseudomonas; Cyanobacteria, such as Rhodobacter and Synechocystis; yeasts, such as Saccharomyces, Zygosaccharomyces, Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Pichia and Torulopsis; and filamentous fungi such as Aspergillus and Arthrobotrys, and algae.
- the microbial host is a bacterium (such as Escherichia) or a yeast (such as Saccharomyces or Pichia).
- the expression vectors can be incorporated into these and other microbial hosts to prepare large, commercially useful amounts of salidroside.
- the recombinant polypeptide can be expressed in a host cell that is a plant cell.
- plant cell is understood to mean any cell derived from a monocotyledonous or a dicotyledonous plant and capable of constituting undifferentiated tissues such as calli, differentiated tissues such as embryos, portions of monocotyledonous plants, monocotyledonous plants or seeds.
- the term “plant” is understood to mean any differentiated multi-cellular organism capable of photosynthesis, including monocotyledons and dicotyledons.
- the plant cell can be an Arabidopsis plant cell, a tobacco plant cell, a soybean plant cell, a petunia plant cell, or a cell from another oilseed crop including, but not limited to, a canola plant cell, a rapeseed plant cell, a palm plant cell, a sunflower plant cell, a cotton plant cell, a corn plant cell, a peanut plant cell, a flax plant cell, and a sesame plant cell.
- Useful plant hosts can include any plant that supports the production of the recombinant polypeptides of the subject technology.
- Suitable green plants for use as hosts include, but are not limited to, Rhodiola, soybean, rapeseed (Brassica napus, B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn, tobacco (Nicotiana tabacum), alfalfa (Medicago sativa), wheat (Triticum sp.), barley (Hordeum vulgare), oats (Avena sativa), sorghum (Sorghum bicolor), rice (Oryza sativa), Arabidopsis, cruciferous vegetables (broccoli, cauliflower, cabbage, parsnips, etc.), melons, carrots, celery, parsley, tomatoes, potatoes, strawberries, peanuts, grapes, grass seed crops, sugar beets, sugar cane, beans, peas, rye, flax, hardwood trees, softwood trees, and forage grasses.
- Algal species include, but are not limited to, commercially significant hosts such as Spirulina, Haematococcus, and Dunaliella.
- Suitable plants for the method of the subject technology also include biofuel, biomass, and bioenergy crop plants.
- Exemplary plants also include switchgrass (Panicum virgatum), Brachypodium sp., and Crambe abyssinica.
- Salidroside produced in the methods disclosed herein can find use as a neuroprotectant. To this end, it may be formulated into compositions for administration to humans and animals.
- Such compositions may take the form of consumable products, including but not limited to food products, beverage products, nutraceuticals, pharmaceuticals, dietary supplements, dental hygienic compositions, edible gel compositions, cosmetic products and tabletop flavorings.
- UDP-glycosyltransferases are a very divergent group of enzymes that transfer a glucose residue from UDP-glucose to a core molecule.
- UGTs catalyze transglycosylation reactions using uridine 5’-diphosphoglucose (UDPG) as a donor of the sugar.
- Salidroside (tyrosol 8-O-β-D-glucoside) is a bioactive tyrosine-derived phenolic natural product found in medicinal plants of the Rhodiola genus. Salidroside is synthesized by the UGT-catalyzed glycosylation of tyrosol. In order to identify UGT enzymes with high activity and specificity for salidroside production, a number of UGT enzyme candidates were selected from different UGT subfamilies based on the activity of reported UGTs and phylogenetic similarity. After several screening rounds, a subset of the candidates, as listed in Table 1, was found to exhibit enzymatic activity for salidroside production.
- Cells were harvested by centrifugation (3,000 x g; 10 min; 4 °C). The cell pellets were collected and were either used immediately or stored at -80 °C. The cell pellets typically were re-suspended in lysis buffer (50 mM potassium phosphate buffer, pH 7.2, 25 μg/ml lysozyme, 5 μg/ml DNase I, 20 mM imidazole, 500 mM NaCl, 10% glycerol, and 0.4% Triton X-100). The cells were disrupted by sonication at 4 °C, and the lysate was clarified by centrifugation (18,000 x g; 30 min).
- The recombinant polypeptide (5 µg aliquots) was tested in a 100 µl in vitro reaction system.
- The reaction system contained 50 mM Tris-HCl, pH 7, 2.5 mg/ml tyrosol and 6 mM UDP-glucose.
- The reaction was performed at 37 °C, and 50 µL of reaction mixture was terminated by adding 50 µL methanol at 21 hours.
- The samples were vortexed and centrifuged for 5 min at 20,000 x g in preparation for high-performance liquid chromatography (HPLC) analysis.
- HPLC analysis was then performed using a Dionex UPLC UltiMate 3000 system (Sunnyvale, CA), including a quaternary pump, a temperature-controlled column compartment, an autosampler and a UV absorbance detector.
- A Luna C18 column (Phenomenex, Torrance, CA) with a guard column was used for the characterization of tyrosol and salidroside in the samples, using water and 100% methanol as the mobile phase.
- The detection wavelength used in the HPLC analysis was 275 nm. After screening, at least nine candidates having enzymatic activity for bioconversion of tyrosol to salidroside were found (Table 1). As shown in FIG.
- Example 2 Combination of UDP-glycosyltransferase with UDPG regeneration system for salidroside production Sucrose synthase (SUS) is an enzyme catalyzing the conversion of UDP to UDP-glucose in the presence of sucrose.
- The method described herein may include a coupling reaction system in which the UGT enzymes described herein are allowed to function in combination with one or more additional enzymes to improve the efficiency or modify the outcome of the overall biosynthesis of salidroside compounds.
- The additional enzyme may be an SUS that regenerates the UDP-glucose needed for the glycosylation reaction by converting the UDP produced from the glycosylation reaction back to UDP-glucose (using, for example, sucrose as a donor of the glucose residue), thus improving the efficiency of the glycosylation reaction.
- SUS1 SEQ ID NO: 19
- mbSUS1 SEQ ID NO: 21
- The recombinant UGT polypeptide (10 µg aliquots) was tested in a 200 µL in vitro reaction system.
- The reaction system contained 50 mM Tris-HCl, pH 7, 2.5 mg/ml tyrosol, 3 mM UDPG or UDP with or without 250 mM sucrose and 84 µg/ml sucrose synthase (SUS).
- Enzymes TS29 and TS34 can catalyze the transglycosylation reaction to produce salidroside either in the presence of UDPG substrate (FIG. 4, panels C and D) or in a UGT-SUS coupling system with UDP substrate (FIG. 4, panels G and H). Neither enzyme is able to produce salidroside with UDP substrate in the absence of UGT-SUS coupling (FIG. 4, panels E and F). Relatively lower amounts of tyrosol were converted to salidroside after 24 hours of reaction time by the recombinant TS29 and TS34 enzymes in the absence of coupling to an SUS (FIG. 4, panels C and D).
- The codon-optimized nucleotide sequence encoding TS29 (SEQ ID NO: 23), TS34 (SEQ ID NO: 24), or TS39 (SEQ ID NO: 25) was inserted in frame after a nucleotide sequence encoding the α-factor signal peptide.
- The synthesized genes were cloned into the pHKA vector.
- Each expression cassette contained an AOX1 promoter, an α mating factor signal peptide, a UGT gene and an AOX1 transcription terminator.
- The codon-optimized sucrose synthase cDNA mbSUS1 (SEQ ID NO: 22) was cloned into the Pichia expression vector pHKA following the same strategy.
- The above constructs were digested with BamHI and BglII.
- The expression cassette was collected and inserted into BamHI-digested plasmids. After digestion analysis, plasmids with 4 copies of UGT expression cassettes, plasmids with 2 copies of mbSUS1 expression cassettes, and plasmids with 4 copies of UGT expression cassettes and 2 copies of mbSUS1 expression cassettes were identified.
- The linearized expression plasmid was transformed into Pichia pastoris (GS115) cells using traditional methods and the expression cassette was integrated into the Pichia genome. After screening, the positive strains were characterized for enzymes and salidroside production.
- EXAMPLE 4 Production of salidroside using produced enzyme from engineered Pichia strains A single colony of each of the above engineered Pichia pastoris strains was inoculated in BMGY medium in a baffled flask and grown at 28–30 °C in a shaking incubator (250–300 rpm) until the culture reached an OD600 of 2–6 (log-phase growth). The Pichia cells were harvested by centrifugation and resuspended to an OD600 of 1.0 in BMMY medium to induce expression. Methanol (100%) was added to the BMMY medium to a final concentration of 1% methanol every 24 hours so as to maintain induction of expression.
- The Pichia cells were harvested by centrifugation and subjected to glycosylation activity analysis as described herein below.
- The induced Pichia cells were suspended in Tris-HCl buffer (pH 7.0) and lysed by sonication. After centrifugation, the supernatant was collected for salidroside production.
- The enzymatic bioconversion was tested in a 200 µl in vitro reaction system containing 50 mM Tris-HCl buffer, pH 7.0, 100-150 µl supernatant, 2-5 mg/ml tyrosol, 1-3 mM UDP or UDP-glucose (UDPG), and with or without 250 mM sucrose.
- A Pichia expression plasmid harboring four TS34 expression cassettes and two mbSUS1 expression cassettes was linearized by BspEI and transformed into GS115 Pichia cells. Crude enzyme was prepared from induced Pichia cells by sonication. UDP and sucrose were added to the reaction to establish a UDPG regeneration system. As shown in FIG. 7, TS34 and mbSUS1 crude enzyme can establish a UGT-SUS coupling reaction for salidroside production and almost fully convert tyrosol substrate to salidroside in a reaction time of 18 hours.
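The role of UDPG regeneration in the coupling reactions above can be made concrete with a rough stoichiometric sketch. The example below is illustrative only: it assumes a tyrosol molar mass of about 138.16 g/mol (C8H10O2), a value not given in the source, and the function names are hypothetical. It converts the stated tyrosol loading to millimolar and compares it with the UDP or UDPG supplied, to estimate how many times each UDP would have to be recycled for near-complete conversion.

```python
# Rough stoichiometry for a UGT-SUS coupling reaction (illustrative sketch).
# Assumption, not from the source: tyrosol molar mass ~138.16 g/mol (C8H10O2).
TYROSOL_MW_G_PER_MOL = 138.16


def mg_per_ml_to_mM(conc_mg_per_ml: float, mw_g_per_mol: float) -> float:
    """Convert a mass concentration (mg/ml, i.e. g/L) to millimolar."""
    return conc_mg_per_ml / mw_g_per_mol * 1000.0


def max_conversion_without_regeneration(tyrosol_mM: float, udpg_mM: float) -> float:
    """Upper bound on the fraction of tyrosol glucosylated when each UDPG
    molecule can donate its glucose only once (no sucrose synthase)."""
    return min(1.0, udpg_mM / tyrosol_mM)


def udp_turnovers_for_full_conversion(tyrosol_mM: float, udp_mM: float) -> float:
    """Average number of UDP -> UDPG -> UDP cycles each UDP must complete
    for all tyrosol to be converted in a coupled UGT-SUS system."""
    return tyrosol_mM / udp_mM


# Conditions quoted in the text: 2.5 mg/ml tyrosol, 3 mM UDP or UDPG, 250 mM sucrose.
tyrosol_mM = mg_per_ml_to_mM(2.5, TYROSOL_MW_G_PER_MOL)
ceiling = max_conversion_without_regeneration(tyrosol_mM, 3.0)
turnovers = udp_turnovers_for_full_conversion(tyrosol_mM, 3.0)
sucrose_excess = 250.0 / tyrosol_mM

print(f"tyrosol: {tyrosol_mM:.1f} mM")
print(f"max conversion with 3 mM UDPG and no regeneration: {ceiling:.0%}")
print(f"UDP turnovers needed with 3 mM UDP: {turnovers:.1f}")
print(f"sucrose molar excess over tyrosol: {sucrose_excess:.1f}x")
```

Under these assumptions the tyrosol loading is roughly 18 mM, so 3 mM of UDPG alone cannot glucosylate more than about one sixth of the substrate. This is consistent with the lower conversion reported in the absence of coupling to an SUS, while sucrose at 250 mM is in large molar excess as the glucose donor for regeneration.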
Abstract
A method for synthesizing salidroside, the method comprising: (i) preparing a reaction mixture comprising: tyrosol, uridine diphosphate-glucose (UDP-glucose), and a uridine diphosphate (UDP)-glycosyltransferase, and (ii) incubating the reaction mixture to produce salidroside, wherein a glucose is covalently coupled to the tyrosol to produce salidroside. The UDP-glycosyltransferase is selected from the group consisting of a first polypeptide, a second polypeptide, a third polypeptide, and combinations thereof. The first polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5, the second polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11, and the third polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
Description
BIOSYNTHESIS OF SALIDROSIDE
RELATED APPLICATION
This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application No. 63/375,734 filed on September 15, 2022, entitled “BIOSYNTHESIS OF SALIDROSIDE,” the entire contents of which are incorporated herein by reference.
REFERENCE TO AN ELECTRONIC SEQUENCE LISTING
The contents of the electronic sequence listing (C149770092WO00-SEQ-VLJ.xml; Size: 52,129 bytes; and Date of Creation: September 13, 2023) are herein incorporated by reference in their entirety.
BACKGROUND OF THE DISCLOSURE
Salidroside, a glycosylated form of tyrosol also known as tyrosol 8-O-glucoside and as 2-(4-hydroxyphenyl)ethyl-β-D-glucopyranoside, is naturally produced by plants within the Rhodiola genus. Salidroside is of particular interest and value because a number of studies have revealed that it exhibits neuroprotective activities, including anti-Alzheimer’s disease, anti-Parkinson’s disease, anti-Huntington’s disease, anti-stroke, anti-depressive effects, and anti-traumatic brain injury; it is also useful for improving cognitive function, treating addiction, and preventing epilepsy. However, commercially available salidroside in its pure form is typically obtained through a lengthy purification process from its native plant host, which poses a significant bottleneck hindering further clinical development of salidroside as a potential therapeutic agent. Accordingly, improved methods of making salidroside are needed.
SUMMARY OF THE DISCLOSURE
The present disclosure relates to the synthesis of salidroside. More particularly, the present disclosure relates to biosynthetic methods for producing salidroside. The present disclosure also relates to enzymes that can be used to prepare salidroside and recombinant cells for producing the enzymes. In a first aspect, provided herein is a method for synthesizing salidroside, the method comprising: (i) preparing a reaction mixture comprising: tyrosol, uridine diphosphate-glucose
(UDP-glucose), and a uridine diphosphate (UDP)-glycosyltransferase, and (ii) incubating the reaction mixture to produce salidroside, wherein a glucose is covalently coupled to the tyrosol to produce salidroside. The UDP-glycosyltransferase is selected from the group consisting of a first polypeptide, a second polypeptide, a third polypeptide, and combinations thereof. The first polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5, the second polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11, and the third polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15. The reaction mixture may further comprise sucrose and a sucrose synthase. The sucrose synthase may be selected from the group consisting of an Arabidopsis sucrose synthase 1, an Arabidopsis sucrose synthase 3, and a Vigna radiata sucrose synthase. The first polypeptide may comprise an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 5. In a representative example, the first polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 5. In a further example, the first polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 5. The second polypeptide may comprise an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 11. In a representative example, the second polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 11. In a further example, the second polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 11. 
In an additional example, the second polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 11. The third polypeptide may comprise an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 15. In a representative example, the third polypeptide comprises an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 15. In a further example, the third polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 15. In an additional example, the third polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 15. In a second aspect, the present disclosure provides a recombinant cell comprising a heterologous polynucleotide encoding a polypeptide selected from the group consisting of a first polypeptide, a second polypeptide, a third polypeptide, and combinations thereof. The
first polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5, the second polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11, and the third polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the heterologous polynucleotide comprises a nucleotide sequence having at least 95%, 99%, or 100% identity to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 6, 12, and 16. In representative examples, the first polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5. In further examples, the second polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11. In additional examples, the third polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the recombinant cell is a microbial cell, such as Escherichia coli or Pichia pastoris. In a third aspect, there is provided a method for producing a polypeptide. The method includes culturing the recombinant cells of the above second aspect, and expressing a polypeptide in the recombinant cells, wherein the polypeptide is selected from the group consisting of the first polypeptide, the second polypeptide, the third polypeptide, and combinations thereof. 
In representative examples, the first polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5, the second polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11, and the third polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15. In a fourth aspect, the present disclosure provides isolated UDP-glycosyltransferase enzymes. A first such enzyme comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5. In some embodiments, the first enzyme comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5. A second enzyme comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11. In some embodiments, the second enzyme comprises an amino acid
sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11. A third enzyme comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the third enzyme comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
BRIEF DESCRIPTION OF THE DRAWINGS
The disclosure will be better understood, and features, aspects and advantages other than those set forth above will become apparent when consideration is given to the following detailed description thereof. Such detailed description makes reference to the following drawings, wherein:
FIG. 1 illustrates the bioconversion pathway of salidroside. UGTs catalyze a transglycosylation reaction for salidroside production from tyrosol. Sucrose synthase (SUS) can couple with the UGT-catalyzed reaction for UDPG regeneration.
FIG. 2 is an HPLC profile of enzymatic screening of UGT candidates for salidroside production. Panel A: tyrosol (TY) and salidroside (SA) standard; Panel B: no enzyme reaction mixture; Panels C-K: reaction samples with different enzyme candidates at 21 hr. Panel C: TS19; Panel D: TS27; Panel E: TS29; Panel F: TS31; Panel G: TS32; Panel H: TS34; Panel I: TS38; Panel J: TS39; Panel K: TS41.
FIG. 3 illustrates results confirming the enzymatic activity of TS29, TS34 and TS39. Reaction samples were collected at 1.5 hr, 4 hr and 24 hr. Tyrosol and salidroside were detected by HPLC analysis and concentrations were calculated based on standard curves.
FIG. 4 is an HPLC profile of in vitro production of salidroside (“SA”) from tyrosol (“TY”) catalyzed by a recombinant TS29 or TS34 and a recombinant AtSUS1 in a UGT-SUS coupling reaction system as described herein. All samples were collected at 24 hr and analyzed by HPLC. Panel A. shows the standards of salidroside (“SA”) and tyrosol (“TY”). Panel B. 
reaction mixture; Panel C. TS29 reaction with UDPG; Panel D. TS34 reaction with UDPG; Panel E. TS29 reaction with UDP; Panel F. TS34 reaction with UDP; Panel G. TS29-SUS coupling reaction with UDP and sucrose; Panel H. TS34-SUS coupling reaction with UDP and sucrose.
FIG. 5 illustrates the in vitro production of salidroside from tyrosol as catalyzed by recombinant TS29 and TS34 with or without AtSUS1 in a UGT-SUS coupling reaction system as described herein. In these reactions, the recombinant polypeptide (10 µg aliquots) was tested in a 200 µL in vitro reaction system. The reaction system contained 50 mM Tris-HCl, pH 7, 2.5 mg/ml tyrosol, and either 3 mM UDPG (Panel A), or 3 mM UDP without (Panel B) or with (Panel C) 250 mM sucrose and 84 µg/ml AtSUS1. The reaction was performed at 37 °C and samples were collected at 1 hr, 4 hr, and 24 hr. The concentrations of tyrosol and salidroside were calculated by HPLC based on a standard curve.
FIG. 6 is a graph summarizing the HPLC detection of salidroside enzymatically produced by the crude enzyme from induced Pichia cells. A Pichia expression plasmid harboring a single expression cassette was linearized by BspEI digestion and transformed into GS115 Pichia cells. Crude enzyme was prepared from induced Pichia cells by sonication. Reaction samples were collected at 5 hours after crude enzyme addition. Panel A: Tyrosol (TY) and salidroside (SA) standard; Panel B: TS29 crude enzyme from induced Pichia cells; Panel C: TS34 crude enzyme from induced Pichia cells; Panel D: TS39 crude enzyme from induced Pichia cells.
FIG. 7 is a graph summarizing the HPLC detection of the salidroside enzymatically produced by the crude enzyme from induced Pichia cells. A Pichia expression plasmid harboring four TS34 expression cassettes and two mbSUS1 expression cassettes was linearized by BspEI and transformed into GS115 Pichia cells. Crude enzyme was prepared from induced Pichia cells by sonication. Reaction samples were collected at 5 and 18 hours after crude enzyme addition. Panel A: Tyrosol (TY) and salidroside (SA) standard; Panel B: enzymatic bioconversion at 5 hours; Panel C: enzymatic bioconversion at 18 hours. 
While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are herein described below in detail. It should be understood, however, that the description of specific embodiments is not intended to limit the disclosure; rather, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the disclosure as defined by the appended claims.
DETAILED DESCRIPTION
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure belongs. Although any methods and materials similar to or equivalent to those described herein may be used in the practice or testing of the present disclosure, the preferred materials and methods are described below. The term “complementary” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the subject technology also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences. The terms “nucleic acid” and “nucleotide” are used according to their respective ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally-occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified or degenerate variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. 
The term “isolated” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and when used in the context of an isolated nucleic acid or an isolated polypeptide, is used without limitation to refer to a nucleic acid or polypeptide that, by the hand of man, exists apart from its native environment and is therefore not a product of nature. An isolated nucleic acid or polypeptide can exist in a purified form or can exist in a non-native environment such as, for example, in a transgenic host cell.
The terms “incubating” and “incubation” as used herein refer to a process of mixing two or more chemical or biological entities (such as a chemical compound and an enzyme) and allowing them to interact under conditions favorable for producing a salidroside composition. The term “degenerate variant” refers to a nucleic acid sequence having a residue sequence that differs from a reference nucleic acid sequence by one or more degenerate codon substitutions. Degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed base and/or deoxyinosine residues. A nucleic acid sequence and all of its degenerate variants will express the same amino acid or polypeptide. The terms “polypeptide,” “protein,” and “peptide” are used according to their respective ordinary and customary meanings as understood by a person of ordinary skill in the art; the three terms are sometimes used interchangeably, and are used without limitation to refer to a polymer of amino acids, or amino acid analogs, regardless of its size or function. For the sake of simplicity, the amino acid residues in a polymer of amino acids may be referred to as the “amino acids of the polymer” with the understanding that peptidic bonds have formed among the amino acids or precursors thereof during the formation of the polymer chain. Although “protein” is often used in reference to relatively large polypeptides, and “peptide” is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. The term “polypeptide” as used herein refers to peptides, polypeptides, and proteins, unless otherwise noted. The terms “protein,” “polypeptide,” and “peptide” are used interchangeably herein when referring to a polynucleotide product. 
Thus, exemplary polypeptides include polynucleotide products, naturally occurring proteins, homologs, orthologs, paralogs, fragments and other equivalents, variants, and analogs of the foregoing. The terms “polypeptide fragment” and “fragment,” when used in reference to a reference polypeptide, are used according to their ordinary and customary meanings to a person of ordinary skill in the art, and are used without limitation to refer to a polypeptide in which amino acid residues are deleted as compared to the reference polypeptide itself, but where the remaining amino acid sequence is usually identical to the corresponding positions in the reference polypeptide. Such deletions can occur at the amino-terminus or carboxy- terminus of the reference polypeptide, or alternatively both.
The term “functional fragment” of a polypeptide or protein refers to a peptide fragment that is a portion of the full length polypeptide or protein, and has substantially the same biological activity, or carries out substantially the same function as the full length polypeptide or protein (e.g., carrying out the same enzymatic reaction). The terms “variant polypeptide,” “modified amino acid sequence” or “modified polypeptide,” which are used interchangeably, refer to an amino acid sequence that is different from the reference polypeptide by one or more amino acids, e.g., by one or more amino acid substitutions, deletions, and/or additions. In an aspect, a variant is a “functional variant” which retains some or all of the ability of the reference polypeptide. The term “functional variant” further includes conservatively substituted variants. The term “conservatively substituted variant” refers to a peptide having an amino acid sequence that differs from a reference peptide by one or more conservative amino acid substitutions, and maintains some or all of the activity of the reference peptide. A “conservative amino acid substitution” is a substitution of an amino acid residue with a functionally similar residue. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine for another; the substitution of one charged or polar (hydrophilic) residue for another such as between arginine and lysine, between glutamine and asparagine, between threonine and serine; the substitution of one basic residue such as lysine or arginine for another; or the substitution of one acidic residue, such as aspartic acid or glutamic acid for another; or the substitution of one aromatic residue, such as phenylalanine, tyrosine, or tryptophan for another. Such substitutions are expected to have little or no effect on the apparent molecular weight or isoelectric point of the protein or polypeptide. 
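As an illustration of the conservative-substitution examples just listed, the following minimal sketch groups residues (one-letter codes) exactly as in those examples and flags whether each difference between two equal-length sequences falls within one group. The grouping is not an exhaustive or authoritative definition of conservative substitution, the function names are hypothetical, and the four-residue sequences in the usage lines are invented for demonstration.

```python
# Residue groups taken from the examples in the text (one-letter amino acid codes).
# This is an illustrative grouping, not an exhaustive definition.
CONSERVATIVE_GROUPS = [
    frozenset("IVLM"),  # non-polar (hydrophobic): Ile, Val, Leu, Met
    frozenset("RK"),    # Arg / Lys (charged; also the basic pair)
    frozenset("QN"),    # Gln / Asn
    frozenset("TS"),    # Thr / Ser
    frozenset("DE"),    # acidic: Asp, Glu
    frozenset("FYW"),   # aromatic: Phe, Tyr, Trp
]


def is_conservative(ref: str, alt: str) -> bool:
    """Return True if replacing residue `ref` with `alt` stays within one group."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return False  # identical residues are not a substitution
    return any(ref in group and alt in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """List (position, ref, alt, conservative?) for each differing position.

    Assumes the two sequences have equal length, i.e. substitutions only,
    with no insertions or deletions. Positions are 1-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i + 1, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]


# Hypothetical toy sequences, not taken from the Sequence Listing.
for position, ref, alt, conservative in classify_substitutions("MKTL", "MRTD"):
    label = "conservative" if conservative else "non-conservative"
    print(f"{ref}{position}{alt}: {label}")
```

Whether a conservatively substituted variant actually retains activity still has to be shown experimentally; the grouping only encodes the expectation stated in the definition.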
The phrase “conservatively substituted variant” also includes peptides wherein a residue is replaced with a chemically- derivatized residue, provided that the resulting peptide maintains some or all of the activity of the reference peptide as described herein. The term “variant,” in connection with the polypeptides of the subject technology, further includes a functionally active polypeptide having an amino acid sequence at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least
96%, at least 97%, at least 98%, at least 99%, and even 100% identical to the amino acid sequence of a reference polypeptide. The term “homologous” in all its grammatical forms and spelling variations refers to the relationship between polynucleotides or polypeptides that possess a “common evolutionary origin,” including polynucleotides or polypeptides from superfamilies and homologous polynucleotides or proteins from different species (Reeck et al., Cell 50:667, 1987). Such polynucleotides or polypeptides have sequence homology, as reflected by their sequence similarity, whether in terms of percent identity or the presence of specific amino acids or motifs at conserved positions. For example, two homologous polypeptides can have amino acid sequences that are at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and even 100% identical. “Percent (%) amino acid sequence identity” with respect to the variant polypeptide sequences of the subject technology refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues of a reference polypeptide (such as, for example, SEQ ID NO: 5), after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. 
Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared. For example, the % amino acid sequence identity may be determined using the sequence comparison program NCBI-BLAST2. The NCBI-BLAST2 sequence comparison program may be downloaded from ncbi.nlm.nih.gov. NCBI BLAST2 uses several search parameters, wherein all of those search parameters are set to default values including, for example, unmask yes, strand=all, expected occurrences 10, minimum low complexity length=15/5, multi-pass e-value=0.01, constant for multi-pass=25, dropoff for final gapped alignment=25 and scoring matrix=BLOSUM62. In situations where
NCBI-BLAST2 is employed for amino acid sequence comparisons, the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % amino acid sequence identity to, with, or against a given amino acid sequence B) is calculated as follows: 100 times the fraction X/Y where X is the number of amino acid residues scored as identical matches by the sequence alignment program NCBI-BLAST2 in that program’s alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that where the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence identity of A to B will not equal the % amino acid sequence identity of B to A. In this sense, techniques for determining amino acid sequence “similarity” are well known in the art. In general, “similarity” refers to the exact amino acid to amino acid comparison of two or more polypeptides at the appropriate place, where amino acids are identical or possess similar chemical and/or physical properties such as charge or hydrophobicity. A so-termed “percent similarity” may then be determined between the compared polypeptide sequences. Techniques for determining nucleic acid and amino acid sequence identity also are well known in the art and include determining the nucleotide sequence of the mRNA for that gene (usually via a cDNA intermediate) and determining the amino acid sequence encoded therein, and comparing this to a second amino acid sequence. In general, “identity” refers to an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two polynucleotides or polypeptide sequences, respectively. Two or more polynucleotide sequences can be compared by determining their “percent identity”, as can two or more amino acid sequences. 
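The X/Y calculation described above can be sketched in a few lines. The example assumes the alignment has already been produced by an alignment program and is supplied as two equal-length gapped strings; it does not perform the alignment itself and does not reproduce NCBI-BLAST2. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of A to B, computed as 100 * X / Y.

    X = number of aligned positions where A and B carry the same residue.
    Y = total number of residues in B (gap characters excluded).
    Both inputs are rows of an existing pairwise alignment, so they must
    have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical toy alignment: A has 5 residues, B has 8, and 5 positions match.
row_a = "MKTLV---"
row_b = "MKTLVAGS"

print(percent_identity(row_a, row_b))  # identity of A to B (Y = 8 residues in B)
print(percent_identity(row_b, row_a))  # identity of B to A (Y = 5 residues in A)
```

Because Y is the length of the second sequence, swapping the arguments changes the result whenever the two sequences differ in length, which is the asymmetry noted in the text.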
The programs available in the Wisconsin Sequence Analysis Package, Version 8 (available from Genetics Computer Group, Madison, Wis.), for example, the GAP program, are capable of calculating both the identity between two polynucleotides and the identity and similarity between two polypeptide sequences, respectively. Other programs for calculating identity or similarity between sequences are known by those skilled in the art. An amino acid position “corresponding to” a reference position refers to a position that aligns with a reference sequence, as identified by aligning the amino acid sequences. Such alignments can be done by hand or by using well-known sequence alignment programs such as ClustalW2, Blast 2, etc.
Unless specified otherwise, the percent identity of two polypeptide or polynucleotide sequences refers to the percentage of identical amino acid residues or nucleotides across the entire length of the shorter of the two sequences. “Coding sequence” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to a DNA sequence that encodes for a specific amino acid sequence. “Suitable regulatory sequences” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to nucleotide sequences located upstream (5’ non-coding sequences), within, or downstream (3’ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, and polyadenylation recognition sequences. “Promoter” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. Typically, a coding sequence is located 3’ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different cell types, or at different stages of development, or in response to different environmental conditions. 
Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters.” It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity. The term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional
control of the promoter). Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation. The term “expression,” as used herein, is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the subject technology. “Over-expression” refers to the production of a gene product in transgenic or recombinant organisms that exceeds levels of production in normal or non-transformed organisms. “Transformation” is used according to its ordinary and customary meaning as understood by a person of ordinary skill in the art, and is used without limitation to refer to the transfer of a polynucleotide into a target cell. The transferred polynucleotide can be incorporated into the genome or chromosomal DNA of a target cell, resulting in genetically stable inheritance, or it can replicate independently of the host chromosome. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” or “recombinant” or “transformed” organisms. The terms “transformed,” “transgenic,” and “recombinant,” when used herein in connection with host cells, are used according to their ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to a cell of a host organism, such as a plant or microbial cell, into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host cell, or the nucleic acid molecule can be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or subjects are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof.
The terms “recombinant,” “heterologous,” and “exogenous,” when used herein in connection with polynucleotides, are used according to their ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to a polynucleotide (e.g., a DNA sequence or a gene) that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, a heterologous gene in a host cell includes a gene that is endogenous to the particular host cell but has been modified through, for example, the use of site-directed
mutagenesis or other recombinant techniques. The terms also include non-naturally occurring multiple copies of a naturally occurring DNA sequence. Thus, the terms refer to a DNA segment that is foreign or heterologous to the cell, or homologous to the cell but in a position or form within the host cell in which the element is not ordinarily found. Similarly, the terms “recombinant,” “heterologous,” and “exogenous,” when used herein in connection with a polypeptide or amino acid sequence, mean a polypeptide or amino acid sequence that originates from a source foreign to the particular host cell or, if from the same source, is modified from its original form. Thus, recombinant DNA segments can be expressed in a host cell to produce a recombinant polypeptide. The terms “plasmid,” “vector,” and “cassette” are used according to their ordinary and customary meanings as understood by a person of ordinary skill in the art, and are used without limitation to refer to an extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3’ untranslated sequence into a cell. “Transformation cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.
Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described, for example, by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed.; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., 1989 (hereinafter “Maniatis”); by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., 1984; and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing and Wiley-Interscience, 1987; the entireties of each of which are hereby incorporated herein by reference to the extent they are consistent herewith.
As used herein, “synthetic” or “organically synthesized” or “chemically synthesized” or “organically synthesizing” or “chemically synthesizing” or “organic synthesis” or “chemical synthesis” are used to refer to preparing the compounds through a series of chemical reactions; this does not include extracting the compound, for example, from a natural source. In accordance with the present disclosure, biosynthetic methods for synthesizing salidroside are disclosed. Also in accordance with the present disclosure, nucleic acid constructs and recombinant bacterial and yeast cells for producing enzymes which find use in the biosynthetic methods are provided.
Methods of Producing Salidroside
In one aspect, the present disclosure is directed to a biosynthetic method whereby tyrosol is converted to salidroside. In one configuration, the method includes preparing a reaction mixture including tyrosol, uridine diphosphate-glucose (UDP-glucose), and a uridine diphosphate (UDP)-glycosyltransferase (UGT) enzyme. The reaction mixture is incubated and a glucosyl moiety is covalently coupled to the tyrosol to form salidroside. The reaction mixture can be, for example, an in vitro cell-free system. Without being bound to any theory, the (UDP)-glycosyltransferase enzyme is believed to catalyze a transglycosylation reaction whereby a glucosyl moiety is transferred from the UDP-glucose to be covalently coupled to a tyrosol molecule, thereby forming salidroside. Representative (UDP)-glycosyltransferases include those listed in Table 1, among which particularly suitable are those marked as TS29 (amino acid SEQ ID NO: 5), TS34 (amino acid SEQ ID NO: 11), and TS39 (amino acid SEQ ID NO: 15), respectively. It has been discovered that such polypeptides can be used to couple glucose with tyrosol in unexpectedly high yields.
In a first set of exemplary instances, the UDP-glycosyltransferase has a percent amino acid sequence identity to the polypeptide of SEQ ID NO: 5 of at least 80%, e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%. Also contemplated are instances where the UDP-glycosyltransferase differs by no more than 50 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, or 49 amino acid(s), from the polypeptide of SEQ ID NO: 5. In preferred embodiments, the UGT differs by no more than 10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids, from the polypeptide of SEQ ID NO: 5. In particularly preferred embodiments, the UDP-glycosyltransferase comprises, consists essentially of, or consists of the amino acid sequence of SEQ ID NO: 5 or an allelic variant thereof; or is a fragment thereof having UDP-glycosyltransferase activity.
In a second set of exemplary instances, the UDP-glycosyltransferase has a percent amino acid sequence identity to the polypeptide of SEQ ID NO: 11 of at least 80%, e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%. Also contemplated are instances where the UDP-glycosyltransferase differs by no more than 50 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, or 49 amino acid(s), from the polypeptide of SEQ ID NO: 11. In preferred embodiments, the UGT differs by no more than 10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids, from the polypeptide of SEQ ID NO: 11. In particularly preferred embodiments, the UDP-glycosyltransferase comprises, consists essentially of, or consists of the amino acid sequence of SEQ ID NO: 11 or an allelic variant thereof; or is a fragment thereof having UDP-glycosyltransferase activity.
In an additional set of exemplary instances, the UDP-glycosyltransferase has a percent amino acid sequence identity to the polypeptide of SEQ ID NO: 15 of at least 80%, e.g., at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100%. Also contemplated are instances where the UDP-glycosyltransferase differs by no more than 50 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, or 49 amino acid(s), from the polypeptide of SEQ ID NO: 15. In preferred embodiments, the polypeptides differ by no more than 10 amino acids, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acids, from the polypeptide of SEQ ID NO: 15. In particularly preferred embodiments, the UDP-glycosyltransferase comprises, consists essentially of, or consists of the amino acid sequence of SEQ ID NO: 15 or an allelic variant thereof; or is a fragment thereof having UDP-glycosyltransferase activity.
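The "differs by no more than N amino acids" criterion used in the instances above can be sketched in Python as follows. The text does not state how differences are to be counted, so this sketch assumes that the variant and the reference have already been aligned (gaps written as "-") and that each substituted, inserted, or deleted position counts as one difference. The function names and the short sequences are hypothetical and are not taken from SEQ ID NO: 5, 11, or 15.

```python
def count_differences(aligned_variant: str, aligned_reference: str) -> int:
    """Count alignment columns in which the variant and the reference differ.

    Assumption: a substitution, an insertion, or a deletion each counts as one
    difference per affected column; gaps are written as "-".
    """
    if len(aligned_variant) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for v, r in zip(aligned_variant, aligned_reference) if v != r)


def within_limit(aligned_variant: str, aligned_reference: str,
                 max_differences: int) -> bool:
    """Return True when the variant differs at no more than max_differences positions."""
    return count_differences(aligned_variant, aligned_reference) <= max_differences


# Hypothetical 10-residue fragments used only for illustration.
reference = "MSTPLHVAVF"
variant = "MSTALHV-VF"  # one substitution (P to A) and one deleted residue (A)

print(count_differences(variant, reference))  # 2
print(within_limit(variant, reference, 10))   # True
print(within_limit(variant, reference, 1))    # False
```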
The method can further include adding sucrose and a sucrose synthase enzyme (SUS) to the reaction mixture. Sucrose synthase is a glycosyltransferase. The systematic name of this enzyme class is NDP-glucose:D-fructose 2-alpha-D-glucosyltransferase. Other names in common use include UDP glucose-fructose glucosyltransferase, sucrose synthetase, sucrose-UDP glucosyltransferase, sucrose-uridine diphosphate glucosyltransferase, and uridine diphosphoglucose-fructose glucosyltransferase. As illustrated in the diagram of FIG. 1, this creates a “UGT-SUS coupling system”. In the UGT-SUS coupling system, UDP-glucose can be regenerated from UDP and sucrose, which allows for omitting the addition of extra UDP-glucose to replenish the reaction mixture; instead, the UDP-glucose is regenerated by glycosylation of UDP that is already present in the mixture either from the outset or as a product of the deglycosylation of the initial UDP-glucose. This approach also allows for UDP-glucose to be generated completely in situ, although aliquots of UDP-glucose may be externally added to the mixture, for example in the early stages of the biosynthesis. Suitable sucrose synthases can be, for example, an Arabidopsis sucrose synthase 1 (AtSUS1, SEQ ID NO: 19), an Arabidopsis sucrose synthase 3 (AtSUS3), and a Vigna radiata sucrose synthase (mbSUS1, SEQ ID NO: 21). A particularly suitable sucrose synthase can be, for example, the Vigna radiata sucrose synthase mbSUS1 having the amino acid sequence of SEQ ID NO: 21.
Coding Nucleic Acid Sequences
Standard recombinant DNA methodologies may be used to obtain a nucleic acid construct that encodes a recombinant polypeptide described herein, incorporate the nucleic acid into an expression vector, and introduce the vector into a host cell, such as those described in Sambrook, et al. (eds), Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor, (2001); and Ausubel, F. M. et al. (eds.)
Current Protocols in Molecular Biology, John Wiley & Sons (1995). A nucleic acid encoding a polypeptide may be inserted into an expression vector or vectors such that the nucleic acids are operably linked to transcriptional and translational control sequences (such as a promoter sequence, a transcription termination sequence, etc.). The expression vector and expression control sequences are generally chosen to be compatible with the expression host cell used. Accordingly, in one aspect, the present disclosure provides nucleic acid constructs comprising a nucleic acid sequence that encodes at least a UDP-glycosyltransferase as described herein, as well as recombinant host cells comprising said nucleic acid construct(s). Said host cell, e.g., a bacterium or a yeast, can be induced to express the UDP-glycosyltransferase by the inclusion of a heterologous gene for producing a polypeptide selected from the group consisting of a first polypeptide, a second polypeptide, and a third polypeptide. The first polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 5; the second polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 11; and the third polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 15. In exemplary embodiments, the first polypeptide may have at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 5. The second polypeptide may have at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 11. The third polypeptide may have at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 15.
A nucleic acid construct may include a polynucleotide comprising a nucleotide sequence selected from the group consisting of a first sequence, a second sequence, and a third sequence. The first sequence has at least 90% identity to the nucleotide sequence as set forth in SEQ ID NO: 6; the second sequence has at least 90% identity to the nucleotide sequence as set forth in SEQ ID NO: 12; and the third sequence has at least 90% identity to the nucleotide sequence as set forth in SEQ ID NO: 16. The first sequence may have at least 95%, 99%, or 100% identity to the nucleotide sequence as set forth in SEQ ID NO: 6. The second sequence may have at least 95%, 99%, or 100% identity to the nucleotide sequence as set forth in SEQ ID NO: 12.
The third sequence may have at least 95%, 99%, or 100% identity to the nucleotide sequence as set forth in SEQ ID NO: 16. Typically, the polynucleotide sequence encoding at least a UDP-glycosyltransferase is operably linked to a promoter. In some embodiments, the promoter is a constitutive promoter (e.g., a constitutive promoter in a bacterium or yeast). In further embodiments, the promoter comprises a mutated promoter (e.g., a bacterial lacUV5 promoter). Usually, the polynucleotide sequence encoding at least a UDP-glycosyltransferase is operably linked to a transcription terminator. For example, the transcription terminator may be the bacteriophage T7 terminator.
To facilitate protein purification after expression, the nucleic acid sequence that encodes the one or more UDP-glycosyltransferases can include a polyhistidine tag. The most common polyhistidine tags consist of six histidine residues (6xHis tag), which are added in the coding sequence of the protein of interest either at the N-terminus, preceded by methionine, or at the C-terminus, before a stop codon.
Expression Vectors
As stated above, a nucleic acid molecule encoding at least a UDP-glycosyltransferase as described herein may be inserted into a host species, e.g., a bacterium or yeast cell, for example in the form of an expression vector. Hence, provided herein is a recombinant cell comprising a transgenic polynucleotide encoding a polypeptide selected from the group consisting of a first polypeptide, a second polypeptide, and a third polypeptide. The first polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 5, the second polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 11, and the third polypeptide comprises an amino acid sequence having at least 90% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 15. In some embodiments, the first polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 5. In further embodiments, the second polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 11. In additional embodiments, the third polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% amino acid sequence identity to the amino acid sequence as set forth in SEQ ID NO: 15.
In representative examples, the transgenic polynucleotide encoding the UGT may include a nucleotide sequence having at least 95%, 99%, or 100% identity to the nucleotide sequence of SEQ ID NO: 6. In further examples, the transgenic polynucleotide encoding the UGT may include a nucleotide sequence having at least 95%, 99%, or 100% identity to the nucleotide sequence of SEQ ID NO: 12. In additional, non-exclusive examples, the transgenic polynucleotide encoding the UGT may include a nucleotide sequence having at least 95%, 99%, or 100% identity to the nucleotide sequence of SEQ ID NO: 16.
Vectors or cassettes useful for the transformation of suitable hosts, e.g., microbial cells, are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant polynucleotide, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the polynucleotide which harbors transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. It is preferred for both control regions to be derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a host. Initiation control regions or promoters, which are useful to drive expression of the recombinant polypeptide in the desired microbial host cell are numerous and familiar to those skilled in the art. Virtually any promoter capable of driving these genes is suitable for the subject technology including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1 (useful for expression in Pichia); and lac, trp, IPL, IPR, T7, tac, and trc (useful for expression in Escherichia coli). Termination control regions may also be derived from various genes native to the hosts. A termination site optionally may be included for the microbial hosts described hereinbelow. In some embodiments, the nucleic acid molecule (e.g., vector) inserted in the host cell further comprises a polynucleotide encoding a selection marker. A “selection marker” is a gene introduced into a cell, especially cells in culture, that confers a trait suitable for artificial selection. 
In some embodiments, a selectable marker is a gene that confers resistance to a drug to eukaryotic cells, including but not limited to kanamycin, paromomycin, puromycin, hygromycin, G418, neomycin, or bleomycin. In some embodiments, a selectable marker is a gene that confers resistance to kanamycin. A person of ordinary skill in the art will be aware of the molecular biology techniques available for the preparation of expression vectors. The polynucleotide used for incorporation into the expression vector of the subject technology, as described above, can be prepared by routine techniques such as polymerase chain reaction (PCR). In a representative
embodiment, a UGT-encoding sequence is inserted into a pETite plasmid vector (Lucigen, WI) to construct a pETite expression vector. Several molecular biology techniques have been developed to operably link DNA to vectors via complementary cohesive termini. In one embodiment, complementary homopolymer tracts can be added to the nucleic acid molecule to be inserted into the vector DNA. The vector and nucleic acid molecule are then joined by hydrogen bonding between the complementary homopolymeric tails to form recombinant DNA molecules. In some embodiments, synthetic linkers containing one or more restriction sites are used to operably link the polynucleotide(s) of the subject technology to the expression vector. In a non-exclusive embodiment, the polynucleotide is generated by restriction endonuclease digestion. In one technique, the nucleic acid molecule is treated with bacteriophage T4 DNA polymerase or E. coli DNA polymerase I, enzymes that remove protruding 3'-single-stranded termini with their 3'-5'-exonucleolytic activities and fill in recessed 3'-ends with their polymerizing activities, thereby generating blunt-ended DNA segments. The blunt-ended segments are then incubated with a large molar excess of linker molecules in the presence of an enzyme that can catalyze the ligation of blunt-ended DNA molecules, such as bacteriophage T4 DNA ligase. Thus, the product of the reaction is a polynucleotide carrying polymeric linker sequences at its ends. These polynucleotides are then cleaved with the appropriate restriction enzyme and ligated to an expression vector that has been cleaved with an enzyme that produces termini compatible with those of the polynucleotide. Alternatively, a vector having ligation-independent cloning (LIC) sites can be employed. The required PCR amplified polynucleotide can then be cloned into the LIC vector without restriction digest or ligation (Aslanidis and de Jong, NUCL. ACID. RES. 18, 6069-74 (1990); Haun, et al., BIOTECHNIQUES 13, 515-18 (1992), each of which is incorporated herein by reference to the extent it is consistent herewith). A polynucleotide for incorporation into an expression vector of the subject technology may be prepared by PCR using appropriate oligonucleotide primers. The coding region is amplified, while the primers themselves become incorporated into the amplified sequence product. In some embodiments, the amplification primers contain restriction endonuclease recognition sites, which allow the amplified sequence product to be cloned into an appropriate vector.
Recombinant Host Cells
The expression vectors can be introduced into host cells by suitable transformation or transfection techniques. Hence, the present disclosure further relates to host cells that have been transformed with one or more of the expression vectors described herein. Suitable hosts of the subject technology typically include microbial hosts or plant hosts. In one non-limiting example, the host cell of the subject technology is selected from the group consisting of bacteria, yeast, filamentous fungi, cyanobacteria, algae, and plant cells. Microbial host cell expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct vectors for expression of the recombinant polypeptide of the subject technology in a microbial host cell. These vectors could then be introduced into appropriate microorganisms via transformation to allow for high level expression of the recombinant polypeptide of the subject technology. In some embodiments, the present disclosure includes transgenic host cells or hosts that have been transformed with one or more of the vectors disclosed herein. Transformation of appropriate cells with an expression vector of the subject technology is accomplished by methods known in the art and typically depends on both the type of vector and cell. Suitable techniques include calcium phosphate or calcium chloride co-precipitation, DEAE-dextran mediated transfection, lipofection, chemoporation or electroporation. Successfully transformed host cells, that is, those cells containing the expression vector, can be identified by techniques well known in the art. For example, bacterial or yeast cells (e.g., E. coli, S. cerevisiae, or Pichia pastoris) transfected with an expression vector of the subject technology can be cultured to produce polypeptides described herein.
Cells can be examined for the presence of the expression vector DNA by techniques well known in the art. The host cells can contain a single copy of the expression vector described previously, or alternatively, multiple copies of the expression vector. The expression of a polypeptide in a host described herein can be further improved by codon optimization. For example, replacing a less common codon with a more common codon may affect the half-life of the mRNA or alter its structure by introducing a secondary structure that interferes with translation of the message. All or a portion of a coding region can be optimized. In some cases the desired modulation of expression is achieved by optimizing essentially the entire gene. In other cases, the desired modulation will be achieved by optimizing part of, but not the entire, sequence of the gene. The codon usage of any coding sequence can be adjusted to achieve a desired property, for example high levels of expression in a specific cell type. The starting point for such an optimization may be a coding sequence with 100% common codons, or a coding sequence which contains a mixture of common and non-common codons. Two or more candidate sequences that differ in their codon usage can be generated and tested to determine if they possess the desired property. Candidate sequences can be evaluated by using a computer to search for the presence of regulatory elements, such as silencers or enhancers, and to search for the presence of regions of coding sequence which could be converted into such regulatory elements by an alteration in codon usage. Additional criteria may include enrichment for particular nucleotides, e.g., A, C, G or U, codon bias for a particular amino acid, or the presence or absence of particular mRNA secondary or tertiary structure. Adjustment to the candidate sequence can be made based on a number of such criteria. In certain embodiments, the codon optimized nucleic acid sequence can express its protein at a level which is about 110%, about 150%, about 200%, about 250%, about 300%, about 350%, about 400%, about 450%, or about 500% of that expressed by a nucleic acid sequence that has not been codon optimized. Suitable hosts can include any organism capable of expressing a polynucleotide encoding a polypeptide (such as that of SEQ ID NO: 5, 11 or 15) to produce the recombinant UGT polypeptide described herein. Host cells can be unmodified cells or cell lines, or cell lines that have been genetically modified.
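The codon-usage adjustment discussed above can be sketched in Python as a synonymous codon replacement. This is a minimal illustration, not the optimization procedure used for the disclosed sequences: the `PREFERRED` table is a hypothetical placeholder covering only three amino acids rather than a measured host codon-usage table, and the function names and the example coding sequence are likewise assumptions. Only the standard genetic code is taken as given.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order for each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical preferred codons (illustrative only, not a measured usage table).
PREFERRED = {"L": "CTG", "R": "CGT", "P": "CCG"}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acids ('*' = stop)."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def optimize_codons(cds: str) -> str:
    """Swap each codon for the preferred synonymous codon, where one is listed."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return "".join(PREFERRED.get(CODON_TABLE[codon], codon) for codon in codons)


original = "ATGCTACGACCCTAA"          # hypothetical CDS encoding M-L-R-P-stop
optimized = optimize_codons(original)

print(optimized)                       # ATGCTGCGTCCGTAA
print(translate(original) == translate(optimized))  # True: same polypeptide
```

Checking that the translation is unchanged guards against a preferred-codon table that accidentally alters the encoded polypeptide. Candidate sequences produced this way would still need to be evaluated against the other criteria named above, such as mRNA secondary structure.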
In some embodiments, the host cell is a cell line that has been modified to allow for growth under desired conditions, such as at a lower temperature. Microorganisms useful as hosts in the subject technology include bacteria, such as the enteric bacteria (Escherichia and Salmonella, for example) as well as Bacillus, Acinetobacter, Klebsiella, Pantoea, Clostridium, Actinomycetes such as Streptomyces, Corynebacterium, Methanotrophs such as Methylosinus, Methylomonas, Rhodococcus and Pseudomonas; Cyanobacteria, such as Rhodobacter and Synechocystis; yeasts, such as Saccharomyces,
Zygosaccharomyces, Kluyveromyces, Candida, Hansenula, Debaryomyces, Mucor, Pichia and Torulopsis; and filamentous fungi such as Aspergillus and Arthrobotrys, and algae. Preferably, the microbial host is a bacterium (such as Escherichia) or a yeast (such as Saccharomyces or Pichia). The expression vectors can be incorporated into these and other microbial hosts to prepare large, commercially useful amounts of salidroside. In an embodiment, the recombinant polypeptide can be expressed in a host cell that is a plant cell. As used herein, the term "plant cell" is understood to mean any cell derived from a monocotyledonous or a dicotyledonous plant and capable of constituting undifferentiated tissues such as calli, differentiated tissues such as embryos, portions of monocotyledonous plants, monocotyledonous plants or seeds. The term "plant" is understood to mean any differentiated multi-cellular organism capable of photosynthesis, including monocotyledons and dicotyledons. In some embodiments, the plant cell can be an Arabidopsis plant cell, a tobacco plant cell, a soybean plant cell, a petunia plant cell, or a cell from another oilseed crop including, but not limited to, a canola plant cell, a rapeseed plant cell, a palm plant cell, a sunflower plant cell, a cotton plant cell, a corn plant cell, a peanut plant cell, a flax plant cell, and a sesame plant cell. Useful plant hosts can include any plant that supports the production of the recombinant polypeptides of the subject technology. Suitable green plants for use as hosts include, but are not limited to, Rhodiola, soybean, rapeseed (Brassica napus, B. 
campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn, tobacco (Nicotiana tabacum), alfalfa (Medicago sativa), wheat (Triticum sp.), barley (Hordeum vulgare), oats (Avena sativa), sorghum (Sorghum bicolor), rice (Oryza sativa), Arabidopsis, cruciferous vegetables (broccoli, cauliflower, cabbage, parsnips, etc.), melons, carrots, celery, parsley, tomatoes, potatoes, strawberries, peanuts, grapes, grass seed crops, sugar beets, sugar cane, beans, peas, rye, flax, hardwood trees, softwood trees, and forage grasses. Algal species include, but are not limited to, commercially significant hosts such as Spirulina, Haematococcus, and Dunaliella. Suitable plants for the method of the subject technology also include biofuel, biomass, and bioenergy crop plants. Exemplary plants also include switchgrass (Panicum virgatum), Brachypodium sp., and Crambe abyssinica.
Salidroside produced in the methods disclosed herein can find use as a neuroprotectant. To this end, it may be formulated into compositions for administration to humans and animals. Such compositions may take the form of consumable products,
including but not limited to food products, beverage products, nutraceuticals, pharmaceuticals, dietary supplements, dental hygienic compositions, edible gel compositions, cosmetic products and tabletop flavorings.
EXAMPLES
Example 1. Identification of novel UDP-glycosyltransferase for salidroside production
UDP-glycosyltransferases (UGTs) are a very divergent group of enzymes that transfer a glucose residue from UDP-glucose to a core molecule. Typically, UGTs catalyze transglycosylation reactions using uridine 5’-diphosphoglucose (UDPG) as a donor of the sugar. Salidroside (tyrosol 8-O-β-D-glucoside) is a bioactive tyrosine-derived phenolic natural product found in medicinal plants of the Rhodiola genus. Salidroside is synthesized by the UGT-catalyzed glycosylation of tyrosol.
To identify UGT enzymes with high activity and specificity for salidroside production, a number of UGT enzyme candidates were selected from different UGT subfamilies based on the activity of reported UGTs and phylogenetic similarity. After several screening rounds, a subset of the candidates, listed in Table 1, was found to exhibit enzymatic activity for salidroside production.
Full-length DNA fragments of all candidate UGT genes were commercially synthesized. Almost all codons of the cDNA were changed to those preferred for E. coli. The synthesized DNA was cloned into the bacterial expression vector pETite N-His SUMO Kan Vector (Lucigen). Each expression construct was transformed into E. coli BL21 (DE3), which was subsequently grown in LB media containing 50 µg/mL kanamycin at 37 °C until reaching an OD600 of 0.8-1.0. Protein expression was induced by addition of 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and the culture was further grown at 16 °C for 22 hr. Cells were harvested by centrifugation (3,000 x g; 10 min; 4 °C). The cell pellets were collected and were either used immediately or stored at -80 °C.
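Codon substitution of this kind changes the nucleotide sequence but should leave the encoded polypeptide unchanged, so a translation check is a natural sanity test on a synthesized construct. The following is a minimal sketch, not the procedure used in this disclosure: it builds the standard genetic code and confirms that the first 18 codons of SEQ ID NO: 6 (TS29 DNA), as listed in the SEQUENCES section, encode the first 18 residues of SEQ ID NO: 5. The function and variable names are illustrative assumptions.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; stop codons appear as '*'."""
    dna = "".join(dna.split()).upper()  # tolerate whitespace from line wrapping
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 54 nt of SEQ ID NO: 6 and first 18 residues of SEQ ID NO: 5,
# copied from the sequence listing of this document.
ts29_dna_start = "ATGGAAAGCCATGCGGTGAGCCCGGCGCGCAAACAGCATGTGGTGTGCGTGCCG"
ts29_protein_start = "MESHAVSPARKQHVVCVP"

matches = translate(ts29_dna_start) == ts29_protein_start
print("DNA encodes the expected N-terminus:", matches)
```

The same check can be run on a full-length sequence pair. Whitespace introduced by line wrapping in the listing is stripped before translation.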
The cell pellets typically were re-suspended in lysis buffer (50 mM potassium phosphate buffer, pH 7.2, 25 µg/ml lysozyme, 5 µg/ml DNase I, 20 mM imidazole, 500 mM NaCl, 10% glycerol, and 0.4% Triton X-100). The cells were disrupted by sonication at 4 °C, and the cell debris was removed by centrifugation (18,000 x g; 30 min). The supernatant was loaded onto a Ni-NTA (Qiagen) affinity column equilibrated with equilibration buffer (50 mM potassium phosphate buffer, pH 7.2, 20 mM imidazole, 500 mM NaCl, 10% glycerol). After loading of the protein sample, the column was washed with equilibration buffer to remove unbound contaminant proteins. The His-tagged UGT recombinant polypeptides were eluted with equilibration buffer containing 250 mM imidazole.
The purified candidate UGT recombinant polypeptides were assayed for glycosyltransferase activity using tyrosol as substrate. Typically, the recombinant polypeptide (5 µg aliquots) was tested in a 100 µL in vitro reaction system. The reaction system contained 50 mM Tris-HCl, pH 7, 2.5 mg/ml tyrosol and 6 mM UDP-glucose. The reaction was performed at 37 °C, and at 21 hours a 50 µL aliquot of the reaction mixture was terminated by adding 50 µL methanol. The samples were vortexed and centrifuged for 5 min at 20,000 x g in preparation for high-performance liquid chromatography (HPLC) analysis.
HPLC analysis was then performed using a Dionex UPLC UltiMate 3000 system (Sunnyvale, CA), including a quaternary pump, a temperature-controlled column compartment, an autosampler and a UV absorbance detector. A Luna C18 column (Phenomenex, Torrance, CA) with a guard column was used for the characterization of tyrosol and salidroside in the samples, with water and 100% methanol as the mobile phase. The detection wavelength used in the HPLC analysis was 275 nm.
After screening, at least nine candidates having enzymatic activity for bioconversion of tyrosol to salidroside were found (Table 1). As shown in FIG. 2, the candidates differed in enzymatic activity for salidroside synthesis, with TS29, TS34 and TS39 showing higher enzymatic activity than the other candidates.
Table 1: Identified UDP-glycosyltransferases for salidroside production
In order to confirm the high activity enzyme candidates, enzymes TS29, TS34 and TS39 were subjected to further testing. In the reaction, each recombinant polypeptide (10 µg aliquot) was tested in a 200 µL in vitro reaction system. The reaction system contained 50 mM Tris-HCl, pH 7, 2.5 mg/ml tyrosol and 6 mM UDP-glucose. The reaction was performed at 37 °C and was terminated by adding 50 µL methanol to 50 µL aliquots of reaction mixture at various time points. The samples were vortexed and centrifuged for 5 min at 20,000 x g in preparation for high-performance liquid chromatography (HPLC) analysis. As shown in FIG. 3, all three enzymes can convert tyrosol to salidroside efficiently and TS29 and TS39 are characterized by higher activity than TS39.
Example 2: Combination of UDP-glycosyltransferase with UDPG regeneration system for salidroside production
Sucrose synthase (SUS) is an enzyme catalyzing the conversion of UDP to UDP-glucose in the presence of sucrose. Thus, for a glycosylation reaction utilizing UDP-glucose as substrate (such as those catalyzed by the UGTs), SUS can be used to regenerate UDP-glucose from UDP, thereby enhancing the efficiency of such a reaction. Hence, the method described herein may include a coupling reaction system in which the UGT enzymes described herein are allowed to function in combination with one or more additional enzymes to improve the efficiency or modify the outcome of the overall biosynthesis of salidroside compounds. For instance, the additional enzyme may be an SUS that regenerates the UDP-glucose needed for the glycosylation reaction by converting the UDP produced from the glycosylation reaction back to UDP-glucose (using, for example, sucrose as a donor of the glucose residue), thus improving the efficiency of the glycosylation reaction. This hypothesis was tested with two exemplary sucrose synthases, i.e., AtSUS1 (SEQ ID NO: 19) from Arabidopsis thaliana and mbSUS1 (SEQ ID NO: 21) from mung bean.
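A short calculation suggests why donor regeneration could matter under the Example 1 assay conditions. The sketch below converts the stated tyrosol concentration (2.5 mg/ml) to millimolar and compares it with the 6 mM UDP-glucose supplied. The molecular weight of tyrosol (C8H10O2, about 138.16 g/mol) is not given in this document and is taken here as an assumption from the molecular formula. The helper names are illustrative.

```python
TYROSOL_MW = 138.16  # g/mol for C8H10O2; assumed, not stated in the source


def mg_per_ml_to_mM(conc_mg_per_ml: float, mw_g_per_mol: float) -> float:
    """Convert a mass concentration (mg/ml == g/L) to millimolar."""
    return conc_mg_per_ml / mw_g_per_mol * 1000.0


def max_conversion_without_regeneration(acceptor_mM: float, donor_mM: float) -> float:
    """Upper bound on the fraction of acceptor glycosylated when each donor
    molecule can be used only once (1:1 stoichiometry, no recycling)."""
    return min(1.0, donor_mM / acceptor_mM)


tyrosol_mM = mg_per_ml_to_mM(2.5, TYROSOL_MW)
ceiling = max_conversion_without_regeneration(tyrosol_mM, donor_mM=6.0)
print(f"tyrosol: {tyrosol_mM:.1f} mM; stoichiometric ceiling: {ceiling:.0%}")
```

Under these assumptions the sugar donor is present at roughly one third of the acceptor concentration, so a single-use donor pool cannot glycosylate all of the tyrosol regardless of enzyme activity. This is a stoichiometric bound only and says nothing about reaction rates.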
Specifically, the activity of recombinant UGT polypeptides (TS29 and TS34) was assessed either alone or in the context of a UGT-SUS coupling system. In these reactions, the recombinant UGT polypeptide (10 µg aliquots) was tested in a 200 µL in vitro reaction system. The reaction system contained 50 mM Tris-HCl, pH 7, 2.5 mg/ml tyrosol, 3 mM UDPG or UDP, with or without 250 mM sucrose and 84 µg/ml sucrose synthase (SUS). The reaction was performed at 37 °C and was terminated by adding 50 µL methanol to 50 µL reaction mixture aliquots at various time points. The samples were vortexed and centrifuged for 5 min at 20,000 x g in preparation for high-performance liquid chromatography (HPLC) analysis.
As illustrated in FIGs. 4 and 5, enzymes TS29 and TS34 can catalyze the transglycosylation reaction to produce salidroside either in the presence of UDPG substrate (FIG. 4, panels C and D) or in a UGT-SUS coupling system with UDP substrate (FIG. 4, panels G and H). Neither enzyme is able to produce salidroside with UDP substrate in the absence of UGT-SUS coupling (FIG. 4, panels E and F). Relatively low amounts of tyrosol were converted to salidroside after 24 hours of reaction time by the recombinant TS29 and TS34 enzymes in the absence of coupling to an SUS (FIG. 4, panels C and D). However, a 4- to 6-fold increase in salidroside production was observed in the UGT-SUS coupling systems (FIG. 5, panel A) as compared to reaction mixtures lacking the SUS (FIG. 5, panel C). These results indicate that SUS enhances the conversion efficiency in the UGT-SUS coupling system (FIG. 5).
EXAMPLE 3: Production of UGTs in Pichia pastoris
Full-length DNA fragments of candidate UGT genes were synthesized for use in the transformation of the Pichia pastoris cells. Specifically, the cDNAs were codon optimized for Pichia pastoris expression to produce UDP-glycosyltransferase for salidroside production. The codon-optimized nucleotide sequences encoding TS29 (SEQ ID NO: 23), TS34 (SEQ ID NO: 24), or TS39 (SEQ ID NO: 25) were inserted in frame after a nucleotide sequence encoding the α factor signal peptide. The synthesized genes were cloned into the pHKA vector. In the vector, each expression cassette contained an AOX1 promoter, α mating factor signal peptide, UGT gene and an AOX1 transcription terminator.
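The UGT-SUS coupling used in Example 2 can be pictured as a two-step cycle: SUS charges UDP with glucose from sucrose, and the UGT transfers that glucose to tyrosol and releases UDP again. The following is an idealized bookkeeping sketch of that cycle, offered as an illustration rather than a model of the experiments: it assumes each step runs to completion, ignores kinetics, equilibria and enzyme stability, and uses an assumed tyrosol molecular weight of 138.16 g/mol to convert 2.5 mg/ml to millimolar. The function name and return values are hypothetical.

```python
def simulate_coupled_cycles(tyrosol_mM: float, udp_mM: float, sucrose_mM: float):
    """Idealized UGT-SUS cycling; returns (salidroside_mM, rounds, sucrose_left_mM).

    Each round:
      SUS: sucrose + UDP         -> UDP-glucose + fructose
      UGT: tyrosol + UDP-glucose -> salidroside + UDP
    Both steps are assumed to go to completion (a simplification).
    """
    udpg = 0.0
    salidroside = 0.0
    rounds = 0
    eps = 1e-9
    while tyrosol_mM > eps and sucrose_mM > eps and (udp_mM + udpg) > eps:
        # SUS step: charge as much UDP as the remaining sucrose allows.
        charged = min(udp_mM, sucrose_mM)
        udp_mM -= charged
        sucrose_mM -= charged
        udpg += charged
        # UGT step: transfer glucose to tyrosol, releasing UDP for reuse.
        transferred = min(udpg, tyrosol_mM)
        if transferred <= 0:
            break
        udpg -= transferred
        tyrosol_mM -= transferred
        salidroside += transferred
        udp_mM += transferred
        rounds += 1
    return salidroside, rounds, sucrose_mM


tyrosol_start = 2.5 / 138.16 * 1000.0  # mM, from 2.5 mg/ml (assumed MW)
with_sus = simulate_coupled_cycles(tyrosol_start, udp_mM=3.0, sucrose_mM=250.0)
without_sucrose = simulate_coupled_cycles(tyrosol_start, udp_mM=3.0, sucrose_mM=0.0)
print("coupled:", with_sus)
print("UDP only, no sucrose:", without_sucrose)
```

In this idealization the 3 mM UDP pool has to turn over about six times to glycosylate all of the tyrosol, and sucrose at 250 mM is in large excess. With no sucrose the cycle never starts, which mirrors the qualitative observation that UDP alone does not support salidroside formation. The measured fold increases reported above come from the experiments, not from this sketch.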
In addition, the codon-optimized sucrose synthase cDNA (mbSUS1, SEQ ID NO: 22) was cloned into the Pichia expression vector pHKA following the same strategy.
To generate multiple copies of the expression cassette, the above constructs were digested with BamHI and BglII. The expression cassette was collected and inserted into BamHI-digested plasmids. After digestion analysis, plasmids with 4 copies of UGT expression cassettes, plasmids with 2 copies of mbSUS1 expression cassettes, and plasmids with 4 copies of UGT expression cassettes and 2 copies of mbSUS1 expression cassettes were identified.
The linearized expression plasmid was transformed into Pichia pastoris (GS115) cells using traditional methods and the expression cassette was integrated into the Pichia genome. After screening, the positive strains were characterized for enzyme and salidroside production.
EXAMPLE 4: Production of salidroside using enzyme produced from engineered Pichia strains
A single colony of each of the above engineered Pichia pastoris strains was inoculated in BMGY medium in a baffled flask and grown at 28–30 °C in a shaking incubator (250–300 rpm) until the culture reached an OD600 of 2–6 (log-phase growth). The Pichia cells were harvested by centrifugation and resuspended to an OD600 of 1.0 in BMMY medium to induce expression. Methanol (100%) was added to the BMMY medium to a final concentration of 1% methanol every 24 hours so as to maintain induction of expression. Seventy-two hours after induction, the Pichia cells were harvested by centrifugation and subjected to glycosylation activity analysis as described herein below.
The induced Pichia cells were suspended in Tris-HCl buffer (pH 7.0) and lysed by sonication. After centrifugation, the supernatant was collected for salidroside production. The enzymatic bioconversion was tested in a 200 µL in vitro reaction system containing 50 mM Tris-HCl buffer, pH 7.0, 100-150 µL supernatant, 2-5 mg/ml tyrosol, 1-3 mM UDP or UDP-glucose (UDPG), and with or without 250 mM sucrose. The bioconversions were carried out at 37 °C. Samples were collected at various time points and the reactions were terminated by adding an equal volume of methanol. After methanol extraction, the supernatants were analyzed by HPLC as described above.
Salidroside can be produced by recombinant UGTs from induced Pichia cells. As shown in FIG.
6, a Pichia expression plasmid harboring a single expression cassette was linearized by BspEI and transformed into GS115 Pichia cells. Crude enzyme was prepared from induced Pichia cells by sonication. UDP-glucose was added in the reaction as sugar donor. All of the TS29, TS34 and TS39 crude enzymes exhibited enzymatic activity for
salidroside production. In order to enhance UGT enzyme expression and co-expression of mbSUS1 enzyme in the same Pichia strain, Pichia expression plasmid harboring four TS34 expression cassettes and two mbSUS1 expression cassettes was linearized by BspEI and transformed into GS115 Pichia cells. Crude enzyme was prepared from induced Pichia cells by sonication. UDP and sucrose was added in the reaction to establish UDPG regeneration system. As shown in FIG. 7, TS34 and mbSUS1 crude enzyme can establish a UGT-SUS coupling reaction for salidroside production and almost fully convert tyrosol substrate to salidroside in a reaction time of 18 hours. SEQUENCES SEQ ID NO: 1 TS19 Amino acid MESHAVSPARKQHVVCVPYPAQGHINPMMKVAKLLHAKGFYVTFVNTIYNHKRLL RSRGSNALDGLPSFRFESIPDGLPETDVDVTQDIPALCESTVKNSLAPFKELLRRINAQ DESPPVSCIVSDGCMSFTLDAAEELGVPEVLFWTTSACGLLAYLHYHKFIEKGLSPLK DESYLTKEHLDTIIDWIPSMKNLRLKDIPSFIRTTNPNDIMLNFLVRETERAKRASAIIL NTFDDLELDVIQSMQSIVPPVYSIGPLHLQVKQQISEDSELGRMGSNLWKEEAECMD WLNTRAPNSVVYVNFGSITVMTAKQLVEFAWGLAATGKEFLWVIRPDLVAGEVSM VPPEFLTETADRSMLASWCPQEEVLSHPAVGGFLTHCGWNSTLESICGGVPMVCWPF FAEQQTNCKFCCDEWEIGVEIGGDVRREEVEAVVRELMDGEKGKKMREKAEEWRS LAEKATECKRGSSVVNFDKVVKVLLGE SEQ ID NO: 2 TS19 DNA ATGGGTTCTCAGGCAATCCCGTCTGTCCAGAAACCCCATGTTGTTTGTGTGCCTTA TCCCGCGCAAGGTCATATAAATCCAATGATGAAAGTCGCCAAACTGTTATATGCC AAAGGATTTCATGTTACGTTCGTAAACACGATCTATAACCATAACCGCCTCCTGA AATCCCGCGGCCCAAACGCCGTGGATGGTCTGCCGAGCTTTCGCTTTGAAAGTAT TCCCGATGGCCTCCCGGAAACTGATGTCGATGTGACACAGGATATTCCGAGCCTT TGTGAATCTACGGTTAAACACAGCCTGGCCCCATTCAAGAAACTGTTACGAGAA ATCAATGCCAAAGATGACGTTCCACCCGTTAGCTGTATTGTAAGTGATGGGTGTA TGTCTTTTACCTTGGATGCGGCAGAAGAATTAGGCGTTCCCGAGGTGCTGTTCTG GACCACTTCCGCATGTGGGTTTCTGGCGTACCTCTACTTTTATCGTTTCATCGAGA
AAGGCCTCTCTCCGGTGAAAGATGAATCTTATCTGACGAAGGAATATCTTGATAC GGAGATTGATTGGATTCCGATGATGAAGAATCTCAAATTAAAAGACATTCCGTCA TTTATTCGTACCGCTAATCCAGACGACATCATGTTGAACTTCTTGGTGCGTGAAA CAGAGCGCACTAAACGCGCGAGCGCTATTATACTTAACACCTTTGATGACTTGGA GCACGATGTGATACAGTCAATGCAATCTATCATTCCGCCGGTTTATTCCATCGGC CCGCTGCACCTGCTGGAAAAGCAAGAGATCGGCGAAGATTCCGAGATCGGGCGG ATGGGTTCCAACCTGTGGAAAGAGGAAACCGAATGCCTCGATTGGCTGGATACC AAAGCCCAAAATTCTGTAGTGTATGTAAATTTCGGAAGTATCACAGTACTGAATG CTAAACAGCTTGCCGAATTTGCATGGGGCCTCGCCGCCACGGGTAAAGAGTTCTT ATGGGTAATCCGCCCAGACCTTGTTGCAGGTGATGATGCGATGGTGCCGCAAGA GTTCTTGACTGAAACGGAAGATCGCCGGATGCTGGCCTCCTGGTGTCCGCAAGAA CAGGTGCTTTCCCATCCGGCAATTGGTGGCTTCCTGACGCATTGTGGGTGGAACT CGACCTTAGAAAGTTTATGCGGCGGCGTCCCGATGGTCTGTTGGCCGTTCTTTGC CGAGCAACAGACGAACTGCAAATTCAGCTGCGATGAATGGGAAGTGGGGATTGA AACGGGTGGCGACGTTAACCGCGAAGAGGTGGAAGCAGTGGTGCGGGAACTTAT GGATGGAGAAAAGGGCAAGAAGATGCGCGAGAAAGCGGAAGAATGGCGCCGTC TTGCCAAAGAGGCGACTGATCATAAACTGGGAAGTTCCATAGTCAACCTGGAGA CAGTGGTGCGTAAGATTCTGCTGCGCGAA SEQ ID NO: 3 TS27 Amino acid MESHAVSPARKQHVVCVPYPAQGHINPMMKVAKLLHAKGFYVTFVNTIYNHKRLL RSRGSNALDGLPSFRFESIPDGLPETDVDVTQDIPALCESTVKNSLAPFKELLRRINAQ DESPPVSCIVSDGCMSFTLDAAEELGVPEVLFWTTSACGLLAYLHYHKFIEKGLSPLK DESYLTKEHLDTIIDWIPSMKNLRLKDIPSFIRTTNPNDIMLNFLVRETERAKRASAIIL NTFDDLELDVIQSMQSIVPPVYSIGPLHLQVKQQISEDSELGRMGSNLWKEETECMD WLNTRAPNSVVYVNFGSITVMTAKQLVDFAWGLAATGKEFLWVIRPDLVAGEVSM VPPEFLTETADRSMLASWCPQEEVLSHPAVGGFLTHCGWNSTLESICGGVPMVCWPF FAEQQTNCKFCCDEWEIGVEIGGDVRREEVEAVVRELMDGEKGKKVREKAEEWRS LAEKATECKRGSSVVNFDKVVKVLLGE SEQ ID NO: 4 TS27 DNA
ATGGAAAGCCATGCGGTGAGCCCGGCGCGCAAACAGCATGTGGTGTGCGTGCCG TATCCGGCGCAAGGCCATATTAACCCGATGATGAAAGTGGCGAAACTGCTGCAT GCGAAAGGCTTTTATGTGACCTTTGTGAACACCATTTATAACCATAAACGCCTGC TGCGCAGCCGCGGCAGCAACGCGCTGGACGGACTGCCGAGCTTTCGCTTTGAAA GCATTCCGGACGGCCTGCCGGAAACCGATGTGGATGTGACCCAAGATATTCCGG CGCTGTGCGAAAGCACCGTGAAAAACAGCCTGGCGCCGTTTAAAGAACTGCTGC GCCGCATTAACGCGCAAGATGAAAGCCCGCCGGTGAGCTGCATTGTGAGCGATG GCTGCATGAGCTTTACCCTGGATGCGGCGGAAGAACTGGGCGTGCCGGAAGTGC TGTTTTGGACCACGAGCGCGTGCGGCCTGCTGGCGTATCTGCATTATCATAAATT TATTGAAAAAGGCCTGAGCCCGCTGAAAGATGAAAGCTATCTGACCAAAGAACA TCTGGATACCATTATTGATTGGATTCCGAGCATGAAAAACCTGCGCCTGAAAGAT ATTCCGAGCTTTATTCGCACCACCAACCCGAACGATATTATGCTGAACTTTCTGG TGCGCGAAACCGAACGCGCGAAACGCGCGAGCGCGATTATTCTGAACACCTTTG ATGATCTGGAACTGGATGTGATTCAGAGCATGCAGAGCATTGTGCCGCCGGTGTA TAGCATTGGCCCGCTGCATCTGCAAGTGAAACAGCAGATTAGCGAAGATAGCGA ACTGGGCCGCATGGGCAGCAACCTGTGGAAAGAAGAAACCGAATGCATGGATTG GCTGAACACCCGCGCGCCGAACAGCGTGGTGTATGTGAACTTTGGCAGCATTACC GTGATGACCGCGAAACAGCTGGTGGATTTTGCGTGGGGCCTGGCGGCGACCGGC AAAGAATTTCTGTGGGTGATTCGCCCGGATCTGGTGGCGGGCGAAGTGAGCATG GTGCCGCCGGAATTTCTGACCGAAACCGCGGATCGCAGCATGCTGGCGAGCTGG TGCCCGCAAGAAGAAGTGCTGAGCCATCCGGCGGTGGGCGGCTTTCTGACCCATT GCGGCTGGAACAGCACCCTGGAAAGCATTTGCGGCGGCGTGCCGATGGTGTGCT GGCCGTTTTTTGCGGAACAGCAGACCAACTGCAAATTTTGCTGCGATGAATGGGA AATTGGCGTGGAAATTGGCGGCGATGTGCGCCGCGAAGAAGTGGAAGCGGTGGT GCGCGAACTGATGGATGGCGAAAAAGGCAAAAAAGTGCGCGAAAAAGCGGAAG AATGGCGCAGCCTGGCGGAAAAAGCGACCGAATGCAAACGCGGCAGCAGCGTG GTGAACTTTGATAAAGTGGTGAAAGTGCTGCTGGGCGAA SEQ ID NO: 5 TS29 Amino acid MESHAVSPARKQHVVCVPYPAQGHINPMMKVAKLLHAKGFYVTFVNTIYNHKRLL RSRGSNALDGLPSFRFESIPDGLPETDVDVTQDIPALCESTVKNSLAPFKELLRRINAQ DESPPVSCIVSDGCMSFTLDAAEELGVPEVLFWTTSACGLLAYLHYHKFIEKGLSPLK
DESYLTKEHLDTIIDWIPSMKNLRLKDIPSFIRTTNPNDIMLNFLVRETERAKRASAIIL NTFDDLELDVIQSMQSIVPPVYSIGPLHLQVKQQISEDSELGRMGSNLWKEEAECMD WLNTRAPNSVVYVNFGSITVMTAKQLVEFAWGLAATGKEFLWVIRPDLVAGEVSM VPPEFLTETADRSMLASWCPQEEVLSHPAVGGFLTHCGWNSTLESICGGVPMVCWPF FAEQQTNCKFCCDEWEIGVEIGGDVRREEVEAVVRELMDGEKGKKMREKAEEWRS LAEKATECKRGSSVVNFDKVVKVLLGE SEQ ID NO: 6 TS29 DNA ATGGAAAGCCATGCGGTGAGCCCGGCGCGCAAACAGCATGTGGTGTGCGTGCCG TATCCGGCGCAAGGCCATATTAACCCGATGATGAAAGTGGCGAAACTGCTGCAT GCGAAAGGCTTTTATGTGACCTTTGTGAACACCATTTATAACCATAAACGCCTGC TGCGCAGCCGCGGCAGCAACGCGCTGGACGGACTGCCGAGCTTTCGCTTTGAAA GCATTCCGGACGGCCTGCCGGAAACCGATGTGGATGTGACCCAAGATATTCCGG CGCTGTGCGAAAGCACCGTGAAAAACAGCCTGGCGCCGTTTAAAGAACTGCTGC GCCGCATTAACGCGCAAGATGAAAGCCCGCCGGTGAGCTGCATTGTGAGCGATG GCTGCATGAGCTTTACCCTGGATGCGGCGGAAGAACTGGGCGTGCCGGAAGTGC TGTTTTGGACCACGAGCGCGTGCGGCCTGCTGGCGTATCTGCATTATCATAAATT TATTGAAAAAGGCCTGAGCCCGCTGAAAGATGAAAGCTATCTGACCAAAGAACA TCTGGATACCATTATTGATTGGATTCCGAGCATGAAAAACCTGCGCCTGAAAGAT ATTCCGAGCTTTATTCGCACCACCAACCCGAACGATATTATGCTGAACTTTCTGG TGCGCGAAACCGAACGCGCGAAACGCGCGAGCGCGATTATTCTGAACACCTTTG ATGATCTGGAACTGGATGTGATTCAGAGCATGCAGAGCATTGTGCCGCCGGTGTA TAGCATTGGCCCGCTGCATCTGCAAGTGAAACAGCAGATTAGCGAAGATAGCGA ACTGGGCCGCATGGGCAGCAACCTGTGGAAAGAAGAAGCGGAATGCATGGATTG GCTGAACACCCGCGCGCCGAACAGCGTGGTGTATGTGAACTTTGGCAGCATTACC GTGATGACCGCGAAACAGCTGGTGGAATTTGCGTGGGGCCTGGCGGCGACCGGC AAAGAATTTCTGTGGGTGATTCGCCCGGATCTGGTGGCGGGCGAAGTGAGCATG GTGCCGCCGGAATTTCTGACCGAAACCGCGGATCGCAGCATGCTGGCGAGCTGG TGCCCGCAAGAAGAAGTGCTGAGCCATCCGGCGGTGGGCGGCTTTCTGACCCATT GCGGCTGGAACAGCACCCTGGAAAGCATTTGCGGCGGCGTGCCGATGGTGTGCT GGCCGTTTTTTGCGGAACAGCAGACCAACTGCAAATTTTGCTGCGATGAATGGGA AATTGGCGTGGAAATTGGCGGCGATGTGCGCCGCGAAGAAGTGGAAGCGGTGGT
GCGCGAACTGATGGATGGCGAAAAAGGCAAAAAAATGCGCGAAAAAGCGGAAG AATGGCGCAGCCTGGCGGAAAAAGCGACCGAATGCAAACGCGGCAGCAGCGTG GTGAACTTTGATAAAGTGGTGAAAGTGCTGCTGGGCGAA SEQ ID NO: 7 TS31 Amino acid MESHAVSPARKQHVVCVPYPAQGHINPMMKVAKLLHAKGFYVTFVNTIYNHKRLL RSRGSNALDGLPSFRFESIPDGLPETDVDVTQDIPSLCESTPKYSLAPFKELLRRINAQD EVPPVNCIVSDGCMSFTLDAAEELGVPEVLFWTTSACGFLAYLHYHKFIEKGLSPLK DESYLTKEHLDTIIDWIPSMKNLRLKDIPSFVRTTNPNDIMLNFLVRETERAKRASAIIL NTFDDLEHDVIQSMQSIVPPVYSIGPLHLQVKQQISEDSELGRMGSNLWKEETECMD WLNTKAPNSVVYVNFGSITVMTAKQLVEFAWGLAATGKEFLWVIRPDLVAGEVSM VPPEFLTETADRSMLASWCPQEEVLSHPAVGGFLTHCGWNSTLESICGGVPMVCWPF FAEQQTNCKFCCDEWEIGVEIGGDVRREEVEAVVRELMDGEKGKKMREKAAEWRS LAEKATECKRGSSVVNFDNVVKVLLGE SEQ ID NO: 8 TS31 DNA ATGGAAAGCCATGCGGTGAGCCCGGCGCGCAAACAGCATGTGGTGTGCGTGCCG TATCCGGCGCAAGGCCATATTAACCCGATGATGAAAGTGGCGAAACTGCTGCAT GCGAAAGGCTTTTATGTGACCTTTGTGAACACCATTTATAACCATAAACGCCTGC TGCGCAGCCGCGGCAGCAACGCGCTGGACGGCCTGCCTAGCTTCCGCTTTGAAA GCATTCCGGATGGCCTGCCGGAAACCGATGTGGATGTGACCCAAGATATACCGA GCCTGTGCGAAAGCACCCCGAAATATAGCCTGGCGCCGTTTAAAGAACTGCTGC GCCGCATTAACGCGCAAGATGAAGTCCCGCCCGTGAATTGTATTGTGAGCGATG GCTGCATGAGCTTTACCCTGGATGCGGCGGAAGAACTGGGCGTGCCGGAAGTGC TGTTTTGGACCACGAGCGCGTGCGGCTTTCTGGCGTATCTGCATTATCATAAATTT ATTGAAAAAGGCCTGAGCCCGCTGAAAGATGAAAGCTATCTGACCAAAGAACAT CTGGATACCATTATTGATTGGATTCCGAGCATGAAAAACCTGCGCCTGAAAGATA
TTCCGAGCTTTGTGCGCACCACCAACCCGAACGATATTATGCTGAACTTTCTGGT GCGCGAAACCGAACGCGCGAAACGCGCGAGCGCGATTATTCTGAACACCTTTGA TGATCTGGAACATGATGTGATTCAGAGCATGCAGAGCATTGTGCCGCCGGTGTAC TCGATTGGCCCGCTGCATCTGCAAGTGAAACAGCAGATTAGCGAAGATAGCGAA CTGGGCCGCATGGGCAGCAACCTGTGGAAAGAAGAAACCGAATGCATGGATTGG CTGAACACCAAAGCGCCGAACAGCGTGGTGTATGTGAACTTTGGCAGCATTACC GTGATGACCGCGAAACAGCTGGTGGAATTTGCGTGGGGCCTGGCGGCGACCGGC AAAGAATTTCTGTGGGTGATTCGCCCGGATCTGGTGGCGGGCGAAGTGAGCATG GTGCCGCCGGAATTTCTGACCGAAACCGCGGATCGCAGCATGCTGGCGAGCTGG TGCCCGCAAGAAGAAGTGCTGAGCCATCCGGCGGTGGGCGGCTTTCTGACCCATT GCGGCTGGAACAGCACCCTGGAAAGCATTTGCGGCGGCGTGCCGATGGTGTGCT GGCCGTTTTTTGCGGAACAGCAGACCAACTGCAAATTTTGCTGCGATGAATGGGA AATTGGCGTGGAAATTGGCGGCGATGTGCGCCGCGAAGAAGTGGAAGCGGTGGT GCGCGAACTGATGGATGGCGAAAAAGGCAAAAAAATGCGCGAAAAAGCGGCGG AATGGCGCAGCCTGGCGGAAAAAGCGACCGAATGCAAACGCGGCAGCAGCGTG GTGAACTTTGATAACGTGGTGAAAGTGCTGCTGGGCGAA SEQ ID NO: 9 TS32 Amino acid MESRAVSPARKQHVVCVPYPAQGHINPMMKVAKLLHAKGFYVTFVNTIYNHKRLL RSRGSNALDGLPSFRFESIPDGLPETDVDVTQDIPSLCESTPKYSLAPFKELLRRINAQD EVPPVNCIVSDGCMSFTLDAAEELGVPEVLFWTTSACGFLAYLHYHKFIEKGLSPLK ADESYLTKEHLDTIIDWIPSMKNLRLKDIPSFVRTTNPNDIMLNFLVRETERAKRASAI ILNTFDDLEHDVIQSMQSIVPPVYSIGPLHLQVKQQISEDSELGRMGSNLWKEETECM DWLNTKAPNSVVYVNFGSITVMTAKQLVEFAWGLAATGKEFLWVIRPDLVAGEVS MVPPEFLTETADRSMLASWCPQEEVLSHPAVGGFLTHCGWNSTLESICGGVPMVCW PFFAEQQTNCKFCCDEWEIGVEIGGDVRREEVEAVVRELMDGEKGKKMREKAAEW RSLAEKATECKRGSSVVNFDNVVKVLLGE SEQ ID NO: 10 TS32 DNA
ATGGAAAGCCGCGCGGTGAGCCCGGCGCGCAAACAGCATGTGGTGTGCGTGCCG TATCCGGCGCAAGGCCATATTAACCCGATGATGAAAGTGGCGAAACTGCTGCAT GCGAAAGGCTTTTATGTGACCTTTGTGAACACCATTTATAACCATAAACGCCTGC TGCGCAGCCGCGGCAGCAACGCGCTGGACGGCCTGCCTAGCTTCCGCTTTGAAA GCATTCCGGATGGCCTGCCGGAAACCGATGTGGATGTGACCCAAGATATACCGA GCCTGTGCGAAAGCACCCCGAAATATAGCCTGGCGCCGTTTAAAGAACTGCTGC GCCGCATTAACGCGCAAGATGAAGTCCCGCCCGTGAATTGTATTGTGAGCGATG GCTGCATGAGCTTTACCCTGGATGCGGCGGAAGAACTGGGCGTGCCGGAAGTGC TGTTTTGGACCACGAGCGCGTGCGGCTTTCTGGCGTATCTGCATTATCATAAATTT ATTGAAAAAGGCCTGAGCCCGCTGAAAGCGGATGAAAGCTATCTGACCAAAGAA CATCTGGATACCATTATTGATTGGATTCCGAGCATGAAAAACCTGCGCCTGAAAG ATATTCCGAGCTTTGTGCGCACCACCAACCCGAACGATATTATGCTGAACTTTCT GGTGCGCGAAACCGAACGCGCGAAACGCGCGAGCGCGATTATTCTGAACACCTT TGATGATCTGGAACATGATGTGATTCAGAGCATGCAGAGCATTGTGCCGCCGGTG TACTCGATTGGCCCGCTGCATCTGCAAGTGAAACAGCAGATTAGCGAAGATAGC GAACTGGGCCGCATGGGCAGCAACCTGTGGAAAGAAGAAACCGAATGCATGGAT TGGCTGAACACCAAAGCGCCGAACAGCGTGGTGTATGTGAACTTTGGCAGCATT ACCGTGATGACCGCGAAACAGCTGGTGGAATTTGCGTGGGGCCTGGCGGCGACC GGCAAAGAATTTCTGTGGGTGATTCGCCCGGATCTGGTGGCGGGCGAAGTGAGC ATGGTGCCGCCGGAATTTCTGACCGAAACCGCGGATCGCAGCATGCTGGCGAGC TGGTGCCCGCAAGAAGAAGTGCTGAGCCATCCGGCGGTGGGCGGCTTTCTGACC CATTGCGGCTGGAACAGCACCCTGGAAAGCATTTGCGGCGGCGTGCCGATGGTG TGCTGGCCGTTTTTTGCGGAACAGCAGACCAACTGCAAATTTTGCTGCGATGAAT GGGAAATTGGCGTGGAAATTGGCGGCGATGTGCGCCGCGAAGAAGTGGAAGCG GTGGTGCGCGAACTGATGGATGGCGAAAAAGGCAAAAAAATGCGCGAAAAAGC GGCGGAATGGCGCAGCCTGGCGGAAAAAGCGACCGAATGCAAACGCGGCAGCA GCGTGGTGAACTTTGATAACGTGGTGAAAGTGCTGCTGGGCGAA SEQ ID NO: 11 TS34 Amino acid MGSHAVSPARKQHVVCVPYPAQGHINPMMKVAKLLHAKGFYVTFVNTIYNHNRLL RSRGSNALDGLPSFQFESIPDGLPETDVDVTQDIPSLCESTPKNSLAPFKELLRRINAQ DEVPPVSCIVSDGCMSFTLDAAEELGVPEVLFWTTSACGLLAYLHYHRFVEKGLSPL
KDESYLTKEHLDTIIDWIPSMKNLRLKDIPSFIRTTNPNDIMLNFLIRETDRAKRASAII LNTFDDLEHDVIQSMQSIVPPVYSIGPLHLQVKQQISEDSELGRVGSNLWKEETACID WLNTKAPNSVVYVNFGSITVMTAKQLVEFAWGLAATGKEFLWVIRPDLVAGDVAM VPPEFLTETADRRMLASWCPQEEVLSHPAVGGFLTHCGWNSTLESICGGVPMVCWP FFAEQQTNCKFCCDEWEIGVEIGGDVKREEVEAVVRELMDGEKGKKMREKAEEWR SLAEKATECKRGSSVVNFDKVVKVLLGE SEQ ID NO: 12 TS34 DNA ATGGGCAGCCATGCGGTGAGCCCGGCGCGCAAACAGCATGTGGTGTGCGTGCCG TATCCGGCGCAAGGCCATATTAACCCGATGATGAAAGTGGCGAAACTGCTGCAT GCGAAAGGCTTTTATGTGACCTTTGTGAACACCATTTATAACCATAACCGCCTGC TGCGCAGCCGCGGCAGCAACGCGCTGGACGGCCTGCCTAGCTTTCAGTTTGAAA GCATTCCGGATGGCCTGCCGGAAACCGATGTGGATGTGACCCAAGATATACCGA GCCTGTGCGAAAGCACCCCGAAAAACAGCCTGGCGCCGTTTAAAGAACTGCTGC GCCGCATTAACGCGCAAGATGAAGTCCCGCCCGTGTCGTGTATTGTGAGCGATGG CTGCATGAGCTTTACCCTGGATGCGGCGGAAGAACTGGGCGTGCCGGAAGTGCT GTTTTGGACCACGAGCGCGTGCGGCCTGCTGGCGTATCTGCATTATCATCGCTTT GTGGAAAAAGGCCTGAGCCCGCTGAAAGATGAAAGCTATCTGACCAAAGAACAT CTGGATACCATTATTGATTGGATTCCGAGCATGAAAAACCTGCGCCTGAAAGATA TTCCGAGCTTTATTCGCACCACCAACCCGAACGATATTATGCTGAACTTTCTGATT CGCGAAACCGATCGCGCGAAACGCGCGAGCGCGATTATTCTGAACACCTTTGAT GATCTGGAACATGATGTGATTCAGAGCATGCAGAGCATTGTGCCGCCGGTGTACT CGATTGGCCCGCTGCATCTGCAAGTGAAACAGCAGATTAGCGAAGATAGCGAAC TGGGCCGCGTGGGCAGCAACCTGTGGAAAGAAGAAACCGCGTGCATTGATTGGC TGAACACCAAAGCGCCGAACAGCGTGGTGTATGTGAACTTTGGCAGCATTACCG TGATGACCGCGAAACAGCTGGTGGAATTTGCGTGGGGCCTGGCGGCGACCGGCA AAGAATTTCTGTGGGTGATTCGCCCGGATCTGGTGGCGGGCGATGTGGCGATGGT GCCGCCGGAATTTCTGACCGAAACCGCGGATCGCCGCATGCTGGCGAGCTGGTG CCCGCAAGAAGAAGTGCTGAGCCATCCGGCGGTGGGCGGCTTTCTGACCCATTG CGGCTGGAACAGCACCCTGGAAAGCATTTGCGGCGGCGTGCCGATGGTGTGCTG
GCCGTTTTTTGCGGAACAGCAGACCAACTGCAAATTTTGCTGCGATGAATGGGAA ATTGGCGTGGAAATTGGCGGCGATGTGAAACGCGAAGAAGTGGAAGCGGTGGTG CGCGAACTGATGGATGGCGAAAAAGGCAAAAAAATGCGCGAAAAAGCGGAAGA ATGGCGCAGCCTGGCGGAAAAAGCGACCGAATGCAAACGCGGCAGCAGCGTGG TGAACTTTGATAAAGTGGTGAAAGTGCTGCTGGGCGAA SEQ ID NO: 13 TS38 Amino acid MGSHAGQKPHVVCVPYPAQGHITPMLKVAKLLHARGFHVTFVNTVYNNNRLLRSR GPNALEGIHSFRFESIPDGLPETDVDVTQDIISLCDSTMKHCLTPFKELLRKINAGGDV PPVSCIVSDGCMSFTLDAAEELGVPDVFFWTTSACAFMAYFHFYLFVEKGIAPFKDES YLTNEHLNTVIDWIPSMKNLKLKDIPSFIRTTNPDDLMLNFIIRETDRAKRASAIFLNTF DDLDHDIIQSMQSILPPVYSIGPLHLLANRGMQESSEIGRLGSNLWKEEPECLDWLDT KARNSVVYVNFGSITVLSAKQLLEFAWGLAGCGKDFLWVIRPDLVAGEEAVVSPEF LKETADRSMLASWCPQEKVLSHPAIGGFLTHCGWNSMLESIAGGVPMVCWPFFADQ QTNCKFCCDEWEVGMEIGGDVRREEIETVIRELMDGEKGKKMRAKAEDWGRLAVE ATGHEHGSSVVNFEEVSKILLAKRSED SEQ ID NO: 14 TS38 DNA ATGGGCAGCCATGCGGGTCAGAAACCGCATGTGGTGTGCGTGCCGTATCCGGCG CAAGGCCATATTACCCCGATGCTGAAAGTGGCGAAACTGCTGCATGCGCGCGGC TTTCATGTGACCTTTGTGAACACCGTGTATAACAACAACCGCCTGCTGCGCAGCC GCGGCCCGAACGCGCTGGAAGGCATTCATAGCTTTCGCTTTGAAAGCATTCCGGA TGGCCTGCCGGAAACCGATGTGGATGTGACCCAAGATATTATTAGCCTGTGCGAT AGCACCATGAAACATTGCCTGACCCCGTTTAAAGAACTGCTGCGCAAAATTAAC GCGGGAGGCGATGTGCCGCCGGTGAGCTGCATTGTGAGCGATGGCTGCATGAGC TTTACCCTGGATGCGGCGGAAGAACTGGGCGTGCCGGATGTGTTTTTTTGGACCA CGAGCGCGTGCGCGTTTATGGCGTATTTTCATTTTTATCTGTTTGTGGAAAAAGGC
ATTGCGCCGTTTAAAGATGAAAGCTATCTGACCAACGAACATCTGAACACCGTG ATTGATTGGATTCCGAGCATGAAAAACCTGAAACTGAAAGATATTCCGAGCTTTA TTCGCACCACCAACCCGGATGATCTGATGCTGAACTTTATTATTCGCGAAACCGA TCGCGCGAAACGCGCGAGCGCGATTTTTCTGAACACCTTTGATGATCTGGATCAT GATATTATTCAGAGCATGCAGAGCATTCTGCCGCCGGTGTATAGCATTGGCCCGC TGCATCTGTTAGCGAACCGCGGCATGCAAGAAAGCAGCGAAATTGGCCGCCTGG GCAGCAACCTGTGGAAAGAAGAACCGGAATGCCTGGATTGGCTGGATACCAAAG CGCGCAACAGCGTGGTGTATGTGAACTTTGGCAGCATTACCGTGCTGAGCGCGA AACAGCTGCTGGAATTTGCGTGGGGCCTGGCGGGCTGCGGCAAAGATTTTCTGTG GGTGATTCGCCCGGATCTGGTGGCGGGCGAAGAAGCGGTGGTGAGCCCGGAATT TCTGAAAGAAACCGCGGATCGCAGCATGCTGGCGAGCTGGTGCCCGCAAGAAAA AGTGCTGAGCCATCCGGCGATTGGCGGCTTTCTGACCCATTGCGGCTGGAACAGC ATGCTGGAAAGCATTGCGGGCGGCGTGCCGATGGTGTGCTGGCCGTTTTTTGCGG ATCAGCAGACCAACTGCAAATTTTGCTGCGATGAATGGGAAGTGGGCATGGAAA TTGGTGGCGATGTGCGCCGCGAAGAAATTGAAACCGTGATTCGCGAACTGATGG ATGGCGAAAAAGGCAAAAAAATGCGCGCGAAAGCGGAAGATTGGGGCCGCCTG GCGGTGGAAGCGACCGGCCATGAACATGGCAGCAGCGTGGTGAACTTTGAAGAA GTGAGCAAAATTCTGCTGGCGAAACGCAGCGAAGAT SEQ ID NO: 15 TS39 Amino acid MGSHVAQKQHVVCVPYPAQGHINPMMKVAKLLYAKGFHITFVNTVYNHNRLLRSR GPNAVDGLPSFRFESIPDGLPETDVDVTQDIPTLCESTMKHCLAPFKELLRQINARDD VPPVSCIVSDGCMSFTLDAAEELGVPEVLFWTTSACGFLAYLYYYRFIEKGLSPIKDE SYLTKEHLDTKIDWIPSMKNLRLKDIPSFIRTTNPDDIMLNFIIREADRAKRASAIILNT FDDLEHDVIQSMKSIVPPVYSIGPLHLLEKQESGEDSEIGRTGSNLWREETECLDWLN TKARNSVVYVNFGSITVLSAKQLVEFAWGLAATGKEFLWVIRPDLVAGDEAMVPPE FLTATADRRMLASWCPQEKVLSHPAIGGFLTHCGWNSTLESLCGGVPMVCWPFFAE QQTNCKFSRDEWEVGIEIGGDVKREEVEAVVRELMDGEKGKKMREKAEEWRRLAN EATQHKHGSSKLNFEMLVNKVLLGE SEQ ID NO: 16 TS39 DNA
ATGGGCAGCCATGTGGCGCAGAAACAGCATGTGGTGTGCGTGCCGTATCCGGCG CAAGGCCATATTAACCCGATGATGAAAGTGGCGAAACTGCTGTATGCGAAAGGC TTTCATATTACCTTTGTGAACACCGTGTATAACCATAACCGCCTGCTGCGCAGCC GCGGCCCGAACGCGGTGGATGGCTTACCGTCGTTCCGCTTTGAAAGCATTCCGGA CGGCCTGCCGGAGACAGATGTGGATGTGACCCAAGATATTCCGACCCTGTGCGA AAGCACCATGAAACATTGCCTGGCGCCGTTTAAAGAACTGCTGCGTCAGATTAAC GCGCGCGATGATGTGCCCCCGGTGAGCTGCATTGTGAGCGATGGCTGCATGAGCT TTACCCTGGATGCGGCGGAAGAACTGGGCGTGCCGGAAGTGCTGTTTTGGACCA CGAGCGCGTGCGGCTTTCTGGCGTATCTGTATTATTATCGCTTTATTGAAAAAGG CCTGAGCCCGATTAAAGATGAAAGCTATCTGACCAAAGAACATCTGGATACCAA AATTGATTGGATTCCGAGCATGAAAAACCTGCGCCTGAAAGATATTCCGAGCTTT ATTCGCACCACCAACCCGGATGATATTATGCTGAACTTTATTATTCGCGAAGCGG ATCGCGCGAAACGCGCGAGCGCGATTATTCTGAACACCTTTGATGATCTGGAACA TGATGTGATTCAGAGCATGAAAAGCATTGTGCCGCCGGTGTATAGCATTGGCCCG CTGCATCTGCTGGAAAAACAAGAAAGCGGCGAAGATAGCGAAATTGGCCGCACC GGCAGCAACCTGTGGCGCGAAGAAACCGAATGCCTGGATTGGCTGAACACCAAA GCGCGCAACAGCGTGGTGTATGTGAACTTTGGCAGCATTACCGTGCTGAGCGCG AAACAGCTGGTGGAATTTGCGTGGGGCCTGGCGGCGACCGGCAAAGAATTTCTG TGGGTGATTCGCCCGGATCTGGTGGCGGGCGATGAAGCGATGGTGCCGCCGGAA TTTCTGACCGCGACCGCGGATCGCCGCATGCTGGCGAGCTGGTGCCCGCAAGAA AAAGTGCTGAGCCATCCGGCGATTGGCGGCTTTCTGACCCATTGCGGCTGGAACA GCACCCTGGAAAGCCTGTGCGGCGGCGTGCCGATGGTGTGCTGGCCGTTTTTTGC GGAACAGCAGACCAACTGCAAATTTAGCCGCGATGAATGGGAAGTGGGCATTGA AATTGGCGGCGATGTGAAACGCGAAGAAGTGGAAGCGGTGGTGCGCGAACTGAT GGATGGCGAAAAAGGCAAAAAAATGCGCGAAAAAGCGGAAGAATGGCGCCGCC TGGCGAACGAAGCGACGCAGCATAAACATGGCAGCAGCAAACTGAACTTTGAAA TGCTGGTGAACAAAGTGCTGCTGGGCGAA SEQ ID NO: 17 TS41 Amino acid MGSHVAQKQHVVCVPYPAQGHINPMMKVAKLLYAKGFHITFVNTVYNHNRLLRSR GPNAVDGLPSFRFKSIPDGLPETDVDVTQDIPTLCESTMKHCLAPFKELLRQINARDD VPPVSCIVSDGCMSFTLDAAEELGVPEVLFWTTSACGFLAYLYYYRFIEKGLSPIKDE
SYLTKEHLDTKIDWIPSMKNLRLKDIPSFIRTTNPDDIMLNFIIREADRAKRASAIILNT FDDLEHDVIQSMKSIVPPVYSIGPLHLLEKQESGEDSEIGRTGSNLWREETECLDWLN TKARNSVVYVNFGSITVLSAKQLVEFAWGLAATGKEFLWVIRPDLVAGDEAMVPPE FLTATADRRMLASWCPQEKVLSHPAIGGFLTHCGWNSTLESLCGGVPMVCWPFFAE QQTNCKFSRDEWEVGIEIGGDVKREEVEAVVRELMDGEKGKNMREKAEEWRRLAN EATEHKHGSSKLNFEMLVNKVLLGE SEQ ID NO: 18 TS41 DNA ATGGGCAGCCATGTGGCGCAGAAACAGCATGTGGTGTGCGTGCCGTATCCGGCG CAAGGCCATATTAACCCGATGATGAAAGTGGCGAAACTGCTGTATGCGAAAGGC TTTCATATTACCTTTGTGAACACCGTGTATAACCATAACCGCCTGCTGCGCAGCC GCGGCCCGAACGCGGTGGATGGCTTACCGTCGTTCCGCTTTAAAAGCATTCCGGA CGGCCTGCCGGAGACAGATGTGGATGTGACCCAAGATATTCCGACCCTGTGCGA AAGCACCATGAAACATTGCCTGGCGCCGTTTAAAGAACTGCTGCGTCAGATTAAC GCGCGCGATGATGTGCCCCCGGTGAGCTGCATTGTGAGCGATGGCTGCATGAGCT TTACCCTGGATGCGGCGGAAGAACTGGGCGTGCCGGAAGTGCTGTTTTGGACCA CGAGCGCGTGCGGCTTTCTGGCGTATCTGTATTATTATCGCTTTATTGAAAAAGG CCTGAGCCCGATTAAAGATGAAAGCTATCTGACCAAAGAACATCTGGATACCAA AATTGATTGGATTCCGAGCATGAAAAACCTGCGCCTGAAAGATATTCCGAGCTTT ATTCGCACCACCAACCCGGATGATATTATGCTGAACTTTATTATTCGCGAAGCGG ATCGCGCGAAACGCGCGAGCGCGATTATTCTGAACACCTTTGATGATCTGGAACA TGATGTGATTCAGAGCATGAAAAGCATTGTGCCGCCGGTGTATAGCATTGGCCCG CTGCATCTGCTGGAAAAACAAGAAAGCGGCGAAGATAGCGAAATTGGCCGCACC GGCAGCAACCTGTGGCGCGAAGAAACCGAATGCCTGGATTGGCTGAACACCAAA GCGCGCAACAGCGTGGTGTATGTGAACTTTGGCAGCATTACCGTGCTGAGCGCG AAACAGCTGGTGGAATTTGCGTGGGGCCTGGCGGCGACCGGCAAAGAATTTCTG TGGGTGATTCGCCCGGATCTGGTGGCGGGCGATGAAGCGATGGTGCCGCCGGAA TTTCTGACCGCGACCGCGGATCGCCGCATGCTGGCGAGCTGGTGCCCGCAAGAA AAAGTGCTGAGCCATCCGGCGATTGGCGGCTTTCTGACCCATTGCGGCTGGAACA GCACCCTGGAAAGCCTGTGCGGCGGCGTGCCGATGGTGTGCTGGCCGTTTTTTGC
GGAACAGCAGACCAACTGCAAATTTAGCCGCGATGAATGGGAAGTGGGCATTGA AATTGGCGGCGATGTGAAACGCGAAGAAGTGGAAGCGGTGGTGCGCGAACTGAT GGATGGCGAAAAAGGCAAAAACATGCGCGAAAAAGCGGAAGAATGGCGCCGCC TGGCGAACGAAGCGACCGAACATAAACATGGCAGCAGCAAACTGAACTTTGAAA TGCTGGTGAACAAAGTGCTGCTGGGCGAA SEQ ID NO: 19 AtSUS1 Amino acid MANAERMITRVHSQRERLNETLVSERNEVLALLSRVEAKGKGILQQNQIIAEFEALPE QTRKKLEGGPFFDLLKSTQEAIVLPPWVALAVRPRPGVWEYLRVNLHALVVEELQP AEFLHFKEELVDGVKNGNFTLELDFEPFNASIPRPTLHKYIGNGVDFLNRHLSAKLFH DKESLLPLLKFLRLHSHQGKNLMLSEKIQNLNTLQHTLRKAEEYLAELKSETLYEEFE AKFEEIGLERGWGDNAERVLDMIRLLLDLLEAPDPCTLETFLGRVPMVFNVVILSPH GYFAQDNVLGYPDTGGQVVYILDQVRALEIEMLQRIKQQGLNIKPRILILTRLLPDAV GTTCGERLERVYDSEYCDILRVPFRTEKGIVRKWISRFEVWPYLETYTEDAAVELSKE LNGKPDLIIGNYSDGNLVASLLAHKLGVTQCTIAHALEKTKYPDSDIYWKKLDDKYH FSCQFTADIFAMNHTDFIITSTFQEIAGSKETVGQYESHTAFTLPGLYRVVHGIDVFDP KFNIVSPGADMSIYFPYTEEKRRLTKFHSEIEELLYSDVENKEHLCVLKDKKKPILFTM ARLDRVKNLSGLVEWYGKNTRLRELANLVVVGGDRRKESKDNEEKAEMKKMYDL IEEYKLNGQFRWISSQMDRVRNGELYRYICDTKGAFVQPALYEAFGLTVVEAMTCG LPTFATCKGGPAEIIVHGKSGFHIDPYHGDQAADTLADFFTKCKEDPSHWDEISKGGL QRIEEKYTWQIYSQRLLTLTGVYGFWKHVSNLDRLEARRYLEMFYALKYRPLAQAV PLAQDD SEQ ID NO: 20 AtSUS1 DNA ATGGCAAACGCTGAACGTATGATTACCCGTGTCCACTCCCAACGCGAACGCCTGA ACGAAACCCTGGTGTCGGAACGCAACGAAGTTCTGGCACTGCTGAGCCGTGTGG AAGCTAAGGGCAAAGGTATTCTGCAGCAAAACCAGATTATCGCGGAATTTGAAG
CCCTGCCGGAACAAACCCGCAAAAAGCTGGAAGGCGGTCCGTTTTTCGATCTGCT GAAATCTACGCAGGAAGCGATCGTTCTGCCGCCGTGGGTCGCACTGGCAGTGCG TCCGCGTCCGGGCGTTTGGGAATATCTGCGTGTCAACCTGCATGCACTGGTGGTT GAAGAACTGCAGCCGGCTGAATTTCTGCACTTCAAGGAAGAACTGGTTGACGGC GTCAAAAACGGTAATTTTACCCTGGAACTGGATTTTGAACCGTTCAATGCCAGTA TCCCGCGTCCGACGCTGCATAAATATATTGGCAACGGTGTGGACTTTCTGAATCG CCATCTGAGCGCAAAGCTGTTCCACGATAAAGAATCTCTGCTGCCGCTGCTGAAA TTCCTGCGTCTGCATAGTCACCAGGGCAAGAACCTGATGCTGTCCGAAAAAATTC AGAACCTGAATACCCTGCAACACACGCTGCGCAAGGCGGAAGAATACCTGGCCG AACTGAAAAGTGAAACCCTGTACGAAGAATTCGAAGCAAAGTTCGAAGAAATTG GCCTGGAACGTGGCTGGGGTGACAATGCTGAACGTGTTCTGGATATGATCCGTCT GCTGCTGGACCTGCTGGAAGCACCGGACCCGTGCACCCTGGAAACGTTTCTGGGT CGCGTGCCGATGGTTTTCAACGTCGTGATTCTGTCCCCGCATGGCTATTTTGCACA GGACAATGTGCTGGGTTACCCGGATACCGGCGGTCAGGTTGTCTATATTCTGGAT CAAGTTCGTGCGCTGGAAATTGAAATGCTGCAGCGCATCAAGCAGCAAGGCCTG AACATCAAACCGCGTATTCTGATCCTGACCCGTCTGCTGCCGGATGCAGTTGGTA CCACGTGCGGTGAACGTCTGGAACGCGTCTATGACAGCGAATACTGTGATATTCT GCGTGTCCCGTTTCGCACCGAAAAGGGTATTGTGCGTAAATGGATCAGTCGCTTC GAAGTTTGGCCGTATCTGGAAACCTACACGGAAGATGCGGCCGTGGAACTGTCC AAGGAACTGAATGGCAAACCGGACCTGATTATCGGCAACTATAGCGATGGTAAT CTGGTCGCATCTCTGCTGGCTCATAAACTGGGTGTGACCCAGTGCACGATTGCAC ACGCTCTGGAAAAGACCAAATATCCGGATTCAGACATCTACTGGAAAAAGCTGG ATGACAAATATCATTTTTCGTGTCAGTTCACCGCGGACATTTTTGCCATGAACCA CACGGATTTTATTATCACCAGTACGTTCCAGGAAATCGCGGGCTCCAAAGAAACC GTGGGTCAATACGAATCACATACCGCCTTCACGCTGCCGGGCCTGTATCGTGTGG TTCACGGTATCGATGTTTTTGACCCGAAATTCAATATTGTCAGTCCGGGCGCGGA TATGTCCATCTATTTTCCGTACACCGAAGAAAAGCGTCGCCTGACGAAATTCCAT TCAGAAATTGAAGAACTGCTGTACTCGGACGTGGAAAACAAGGAACACCTGTGT GTTCTGAAAGATAAAAAGAAACCGATCCTGTTTACCATGGCCCGTCTGGATCGCG TGAAGAATCTGTCAGGCCTGGTTGAATGGTATGGTAAAAACACGCGTCTGCGCG AACTGGCAAATCTGGTCGTGGTTGGCGGTGACCGTCGCAAGGAATCGAAAGATA ACGAAGAAAAGGCTGAAATGAAGAAAATGTACGATCTGATCGAAGAATACAAG CTGAACGGCCAGTTTCGTTGGATCAGCTCTCAAATGGACCGTGTGCGCAATGGCG AACTGTATCGCTACATTTGCGATACCAAGGGTGCGTTTGTTCAGCCGGCACTGTA
CGAAGCTTTCGGCCTGACCGTCGTGGAAGCCATGACGTGCGGTCTGCCGACCTTT GCGACGTGTAAAGGCGGTCCGGCCGAAATTATCGTGCATGGCAAATCTGGTTTCC ATATCGATCCGTATCACGGTGATCAGGCAGCTGACACCCTGGCGGATTTCTTTAC GAAGTGTAAAGAAGACCCGTCACACTGGGATGAAATTTCGAAGGGCGGTCTGCA ACGTATCGAAGAAAAATATACCTGGCAGATTTACAGCCAACGCCTGCTGACCCT GACGGGCGTCTACGGTTTTTGGAAACATGTGTCTAATCTGGATCGCCTGGAAGCC CGTCGCTATCTGGAAATGTTTTACGCACTGAAGTATCGCCCGCTGGCACAAGCCG TTCCGCTGGCACAGGACGACTAA SEQ ID NO: 21 mbSUS1 Amino acid MATDRLTRVHELRERLDETLSANRNEILALLSRIEGKGKGILQHHQVIAEFEEIPEESR QKLTDGAFGEVLRSTQEAIVLPPWVALAVRPRPGVWEYLRVNVHALVVEVLQPAEY LRFKEELVDGSSNGNFVLELDFEPFTASFPRPTLNKSIGNGVQFLNRHLSAKLFHDKE SLHPLLEFLRLHSVKGKTLMLNDRIQNPDALQHVLRKAEEYLGTVPPETPYSAFEHK FQEIGLERGWGDNAERVLESIQLLLDLLEAPDPCTLETFLGRIPMVFNVVILSPHGYFA QDNVLGYPDTGGQVVYILDQVRALENEMLHRIKQQGLDIVPRILIITRLLPDAVGTTC GQRLEKVFGTEHSHILRVPFRTENGIVRKWISRFEVWPYLETYTEDVAHELAKELQG KPDLIVGNYSDGNIVASLLAHKLGVTQCTIAHALEKTKYPESDIYWKKLEERYHFSC QFTADLFAMNHTDFIITSTFQEIAGSKDTVGQYESHTAFTLPGLYRVVHGIDVFDPKF NIVSPGADQTIYFPHTETSRRLTSFHTEIEELLYSSVENEEHICVLKDRSKPIIFTMARLD RVKNITGLVEWYGKNAKLRELVNLVVVAGDRRKESKDLEEKAEMKKMYSLIETYK LNGQFRWISSQMNRVRNGELYRVIADTKGAFVQPAVYEAFGLTVVEAMTCGLPTFA TCNGGPAEIIVHGKSGFHIDPYHGDRAADLLVEFFEKVKVDPSHWDKISQAGLQRIEE KYTWQIYSQRLLTLTGVYGFWKHVSNLDRRESRRYLEMFYALKYRKLAESVPLAVE SEQ ID NO: 22 mbSUS1 DNA ATGGCTACTGACAGATTGACTAGAGTTCACGAATTGAGAGAAAGATTGGACGAA ACCTTGTCTGCTAATAGAAACGAGATTTTGGCCTTGTTGTCTAGAATTGAAGGAA
AGGGTAAGGGTATTTTGCAACACCATCAGGTTATTGCAGAGTTCGAGGAAATCCC AGAAGAGTCTAGACAGAAGTTGACTGATGGAGCTTTCGGAGAAGTCTTGAGATC CACTCAAGAGGCTATTGTCTTGCCACCATGGGTTGCCTTGGCAGTCAGACCAAGA CCTGGTGTCTGGGAATACTTGAGAGTTAATGTCCATGCTTTGGTTGTTGAGGTTTT GCAGCCTGCCGAGTATTTGAGATTTAAGGAAGAGTTGGTTGATGGATCTTCTAAC GGTAATTTCGTCTTGGAGTTGGATTTCGAGCCTTTTACTGCCTCTTTTCCTAGACC AACATTGAATAAGTCTATCGGTAATGGTGTCCAGTTTTTGAACAGACATTTGTCT GCCAAATTGTTTCATGATAAGGAATCTTTGCATCCATTGTTGGAGTTCTTGAGATT GCATTCTGTTAAGGGAAAAACTTTGATGTTGAACGATAGAATCCAAAACCCAGA CGCATTGCAGCATGTTTTGAGAAAGGCTGAAGAGTACTTGGGAACAGTTCCACC AGAAACACCTTACTCTGCATTCGAGCATAAGTTCCAGGAAATCGGATTGGAGAG AGGTTGGGGTGATAACGCTGAGAGAGTTTTGGAATCTATTCAGTTGTTGTTGGAC TTGTTGGAGGCCCCAGACCCATGTACTTTGGAGACTTTCTTGGGTAGAATCCCTA TGGTTTTCAACGTCGTTATCTTGTCTCCACATGGTTACTTTGCTCAGGATAACGTT TTGGGTTACCCTGACACTGGAGGTCAAGTCGTTTACATTTTGGATCAAGTTAGAG CCTTGGAGAACGAAATGTTGCACAGAATTAAACAACAGGGTTTGGATATTGTTCC AAGAATCTTGATTATTACTAGATTGTTGCCTGACGCCGTTGGAACTACTTGTGGT CAGAGATTGGAAAAAGTCTTCGGTACAGAACACTCTCATATTTTGAGAGTCCCAT TTAGAACTGAAAACGGTATTGTTAGAAAGTGGATCTCTAGATTCGAGGTTTGGCC ATACTTGGAAACTTATACAGAGGATGTTGCTCATGAATTGGCTAAGGAGTTGCAG GGAAAGCCAGATTTGATCGTTGGTAACTACTCTGACGGAAATATCGTCGCTTCTT TGTTGGCCCACAAATTGGGTGTTACTCAATGTACTATTGCTCACGCATTGGAAAA GACAAAGTACCCAGAATCTGATATTTACTGGAAAAAGTTGGAAGAGAGATACCA CTTCTCTTGTCAGTTTACAGCTGATTTGTTTGCTATGAACCATACTGATTTCATTA TTACTTCTACTTTTCAGGAAATCGCAGGTTCTAAGGATACTGTTGGTCAATACGA ATCTCACACAGCATTCACTTTGCCAGGTTTGTATAGAGTTGTTCACGGAATCGAT GTTTTTGATCCAAAGTTTAACATTGTTTCTCCAGGAGCTGATCAAACTATCTATTT CCCACATACCGAGACCTCTAGAAGATTGACTTCTTTCCACACAGAGATTGAGGAA TTGTTGTATTCTTCTGTTGAAAACGAGGAACACATTTGTGTTTTGAAAGACAGAT CCAAGCCTATCATTTTCACTATGGCTAGATTGGATAGAGTCAAGAACATCACTGG TTTGGTCGAATGGTACGGTAAGAATGCTAAGTTGAGAGAATTGGTTAACTTGGTC GTTGTTGCCGGTGATAGAAGAAAGGAATCTAAAGACTTGGAGGAAAAGGCTGAA ATGAAGAAGATGTACTCTTTGATTGAAACTTACAAATTGAACGGTCAATTCAGAT GGATCTCTTCTCAGATGAACAGAGTCAGAAACGGTGAATTGTACAGAGTTATTGC
TGATACTAAGGGTGCATTTGTTCAACCAGCAGTCTACGAAGCTTTCGGTTTGACT GTTGTTGAAGCTATGACTTGTGGTTTGCCTACATTTGCAACTTGTAATGGTGGACC AGCTGAGATCATCGTTCATGGAAAGTCTGGTTTTCATATTGATCCTTACCATGGA GATAGAGCTGCAGACTTGTTGGTTGAGTTCTTCGAGAAGGTTAAGGTTGACCCAT CTCATTGGGATAAGATTTCTCAAGCTGGATTGCAAAGAATTGAAGAAAAATACA CTTGGCAAATTTACTCTCAAAGATTGTTGACATTGACTGGAGTTTATGGTTTCTGG AAGCATGTTTCTAATTTGGACAGAAGAGAATCTAGAAGATACTTGGAAATGTTTT ACGCTTTGAAATATAGAAAATTGGCCGAGTCTGTTCCATTGGCTGTTGAGTAA SEQ NO: 23 TS29 DNA codon-optimized for Pichia pastoris ATGGAATCTCATGCTGTTTCTCCTGCTAGAAAACAACATGTTGTTTGTGTTCCATA TCCTGCTCAAGGTCATATTAATCCAATGATGAAGGTTGCTAAACTGTTGCACGCC AAAGGTTTTTATGTTACTTTTGTGAATACTATCTACAATCACAAAAGATTGTTGA GGTCTAGAGGTTCTAATGCTTTGGATGGCTTGCCATCTTTTAGATTTGAATCTATT CCTGATGGTTTGCCCGAAACTGATGTTGATGTTACTCAAGATATTCCTGCTTTGTG TGAATCTACTGTTAAAAATTCTTTGGCTCCATTTAAAGAATTGTTGAGAAGAATT AATGCTCAAGATGAATCTCCACCTGTTTCTTGTATTGTTTCTGATGGTTGTATGTC TTTTACTTTGGATGCTGCTGAAGAATTGGGTGTTCCTGAAGTTTTGTTTTGGACTA CTTCTGCTTGCGGATTATTAGCCTATTTGCATTACCACAAATTTATTGAAAAAGG GTTGTCCCCATTGAAAGATGAATCTTATTTAACCAAAGAACACCTGGACACAATT ATTGATTGGATTCCCTCTATGAAAAATCTGAGATTGAAAGATATACCATCCTTTA TTCGGACTACTAATCCAAATGATATTATGTTGAATTTTTTGGTTAGAGAAACTGA AAGAGCAAAAAGAGCTTCGGCTATTATCTTGAATACGTTTGATGATTTGGAATTG GATGTTATTCAATCTATGCAATCTATTGTTCCACCGGTCTATTCTATTGGTCCATT GCATTTGCAAGTTAAACAACAAATTTCTGAAGATTCTGAATTGGGTAGAATGGGT TCTAATTTGTGGAAAGAAGAAGCTGAATGTATGGATTGGTTGAATACTAGGGCTC CAAATTCTGTTGTTTATGTTAATTTTGGTTCTATTACTGTTATGACTGCTAAACAA TTGGTTGAATTTGCTTGGGGTTTGGCTGCTACTGGTAAAGAATTTTTGTGGGTTAT TAGACCTGATTTGGTTGCTGGTGAAGTTTCTATGGTTCCACCTGAATTTTTGACTG AAACTGCTGATAGGTCTATGTTGGCTTCTTGGTGTCCACAAGAAGAAGTTTTGTC TCATCCTGCTGTTGGTGGTTTTTTGACTCATTGTGGTTGGAACTCTACTTTGGAAT CTATTTGTGGTGGTGTTCCAATGGTTTGTTGGCCATTTTTTGCTGAACAACAAACT
AATTGTAAATTTTGTTGTGATGAATGGGAAATTGGTGTTGAAATTGGTGGTGATG TTAGAAGAGAAGAAGTTGAAGCTGTTGTTAGAGAATTGATGGATGGTGAAAAAG GTAAAAAAATGAGAGAAAAAGCTGAAGAATGGAGGTCTTTGGCTGAAAAAGCT ACTGAATGTAAAAGAGGTTCTTCTGTTGTTAATTTTGATAAAGTTGTTAAAGTTTT GTTGGGTGAATAA SEQ NO: 24 TS34 DNA codon-optimized for Pichia pastoris ATGGGTTCTCATGCTGTTTCTCCTGCTAGAAAACAACATGTTGTTTGTGTTCCATA TCCTGCTCAAGGTCATATTAATCCAATGATGAAGGTCGCTAAACTCCTCCATGCT AAAGGTTTCTACGTTACTTTTGTGAACACTATTTATAACCATAATAGATTGTTGAG GTCTAGAGGTTCTAATGCTTTGGACGGTCTGCCTTCTTTTCAATTTGAATCTATTC CTGATGGTCTGCCGGAAACTGATGTTGATGTTACTCAAGATATACCAAGCCTATG CGAATCTACTCCAAAAAATTCTTTGGCTCCATTTAAAGAATTGTTGAGAAGAATT AATGCTCAAGATGAAGTTCCACCTGTTTCTTGTATTGTTTCTGATGGTTGTATGTC TTTTACTTTGGATGCTGCTGAAGAATTGGGTGTTCCTGAAGTTTTGTTTTGGACTA CTTCTGCTTGTGGTCTGTTGGCGTACTTGCATTATCATAGATTTGTTGAAAAAGGG TTATCACCATTGAAGGACGAGTCATACTTGACTAAGGAACACTTGGACACTATAA TTGATTGGATTCCCTCTATGAAAAATCTACGGTTGAAAGATATTCCATCTTTCATA CGAACTACTAACCCTAATGATATTATGTTGAATTTTCTCATTAGAGAAACTGATA GAGCGAAAAGAGCTTCTGCTATCATCTTGAATACTTTTGACGATTTGGAGCATGA TGTTATTCAATCTATGCAATCTATTGTTCCACCGGTTTATTCTATCGGACCATTGC ATTTGCAAGTTAAACAACAAATTTCTGAAGATTCTGAATTGGGTAGAGTTGGTTC TAATTTGTGGAAAGAAGAAACTGCTTGTATTGATTGGCTGAATACTAAAGCTCCA AATAGCGTAGTTTATGTTAACTTTGGTTCTATTACTGTTATGACTGCTAAACAATT GGTTGAATTTGCTTGGGGTTTGGCTGCTACTGGTAAAGAATTTTTGTGGGTTATTA GACCTGATTTGGTTGCTGGTGATGTTGCTATGGTTCCACCTGAATTTTTGACTGAA ACTGCTGATAGAAGAATGTTGGCTTCTTGGTGTCCACAAGAAGAAGTTTTGTCTC ATCCTGCTGTTGGTGGTTTTTTGACTCATTGTGGTTGGAACTCTACTTTGGAATCT ATTTGTGGTGGTGTTCCAATGGTTTGTTGGCCATTTTTTGCTGAACAACAAACTAA TTGTAAATTTTGTTGTGATGAATGGGAAATTGGTGTTGAAATTGGTGGTGATGTT
AAAAGAGAAGAAGTTGAAGCTGTTGTTAGAGAATTGATGGATGGTGAAAAAGGT AAAAAAATGAGAGAAAAAGCTGAAGAATGGAGGTCTTTGGCTGAAAAAGCTACT GAATGTAAAAGAGGTTCTTCTGTTGTTAATTTTGATAAAGTTGTTAAAGTTTTGTT GGGTGAATAA SEQ NO: 25 TS39 DNA codon-optimized for Pichia pastoris ATGGGTTCTCATGTTGCTCAAAAACAACATGTTGTTTGTGTTCCATATCCTGCTCA AGGTCATATTAATCCGATGATGAAAGTGGCTAAGTTATTGTATGCTAAAGGTTTT CATATCACTTTTGTAAACACTGTCTATAATCATAATAGATTGTTGAGGTCTAGAG GTCCAAATGCTGTTGATGGGTTGCCATCTTTTAGATTTGAATCTATTCCTGATGGT TTGCCTGAAACTGATGTTGATGTTACTCAAGATATTCCAACTTTGTGTGAATCTAC TATGAAACATTGTTTGGCTCCATTTAAAGAATTGTTGAGACAAATTAATGCTAGA GATGATGTACCACCTGTTTCTTGTATTGTTTCTGATGGTTGTATGTCTTTTACTTTG GATGCTGCTGAAGAATTGGGTGTTCCTGAAGTTTTGTTTTGGACTACAAGTGCTT GCGGCTTTTTGGCTTACCTCTATTATTATAGATTTATCGAAAAAGGTCTTTCTCCA ATTAAAGATGAAAGTTACTTGACTAAGGAGCACCTAGACACTAAAATTGATTGG ATTCCATCAATGAAGAACTTGAGATTGAAGGATATACCATCATTTATAAGAACTA CGAATCCTGATGATATTATGCTGAATTTCATTATTAGAGAAGCTGATAGAGCCAA AAGAGCCTCTGCTATCATTTTGAATACTTTTGACGACTTGGAACATGATGTTATTC AATCTATGAAATCTATTGTCCCACCTGTTTATTCTATTGGACCATTGCATTTGTTG GAAAAACAAGAATCTGGTGAAGATTCTGAAATTGGTAGAACTGGTTCTAATTTGT GGAGAGAAGAAACTGAATGTCTTGATTGGTTAAATACTAAGGCTAGAAATTCTG TTGTTTACGTAAATTTTGGTTCTATTACTGTTTTGTCTGCTAAACAATTGGTTGAA TTTGCTTGGGGTTTGGCTGCTACTGGTAAAGAATTTTTGTGGGTTATTAGACCTGA TTTGGTTGCTGGTGATGAAGCTATGGTTCCACCTGAATTTTTGACTGCTACTGCTG ATAGAAGAATGTTGGCTTCTTGGTGTCCACAAGAAAAAGTTTTGTCTCATCCTGC TATTGGTGGTTTTTTGACTCATTGTGGTTGGAACTCTACTTTGGAATCTTTGTGTG GTGGTGTTCCAATGGTTTGTTGGCCATTTTTTGCTGAACAACAAACTAATTGTAA ATTTTCTAGAGATGAATGGGAAGTTGGTATTGAAATTGGTGGTGATGTTAAAAGA GAAGAAGTTGAAGCTGTTGTTAGAGAATTGATGGATGGTGAAAAAGGTAAAAAA ATGAGAGAAAAAGCTGAAGAATGGAGAAGATTGGCAAATGAAGCTACCCAACA
CAAGCATGGTTCTTCTAAATTGAATTTTGAAATGTTGGTTAATAAAGTTTTGTTGG GTGAATAA
In view of the above, it will be seen that the several advantages of the disclosure are achieved and other advantageous results attained.
As various changes could be made in the above methods and systems without departing from the scope of the disclosure, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.
When introducing elements of the present disclosure or the various versions, embodiment(s) or aspects thereof, the articles “a”, “an”, “the” and “said” are intended to mean that there are one or more of the elements. The terms “comprising”, “including” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements.
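The sequence listing above gives each enzyme as an amino acid sequence paired with a coding DNA sequence. As a rough consistency check that is not part of the disclosure, the sketch below translates the final 39 nucleotides of the TS41 DNA listing (SEQ ID NO: 18) with the standard genetic code and compares the result with the last 13 residues of the TS41 amino acid listing. The helper name `translate` is hypothetical, and the code assumes an in-frame, uppercase DNA string without ambiguity codes.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acids ('*' = stop)."""
    if len(dna) % 3 != 0:
        raise ValueError("DNA length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Last 39 nucleotides of the TS41 DNA listing (whitespace from the listing removed).
ts41_dna_tail = "AACTTTGAAATGCTGGTGAACAAAGTGCTGCTGGGCGAA"
# Last 13 residues of the TS41 amino acid listing.
ts41_protein_tail = "NFEMLVNKVLLGE"

print(translate(ts41_dna_tail) == ts41_protein_tail)
```

The same check could be run over a full listing after stripping the spaces introduced by line wrapping; a dedicated library such as Biopython would be the more usual choice for anything beyond a spot check.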
Claims
What is claimed is:
1. A method for synthesizing salidroside, the method comprising: (i) preparing a reaction mixture comprising: tyrosol, uridine diphosphate-glucose (UDP-glucose), and a uridine diphosphate (UDP)-glycosyltransferase, and (ii) incubating the reaction mixture to produce salidroside, wherein a glucose is covalently coupled to the tyrosol to produce salidroside, wherein the UDP-glycosyltransferase is selected from the group consisting of a first polypeptide, a second polypeptide, a third polypeptide, and combinations thereof, wherein the first polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5, the second polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11, and the third polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
2. The method according to claim 1, wherein the reaction mixture further comprises sucrose and a sucrose synthase.
3. The method according to claim 2, wherein the sucrose synthase is selected from the group consisting of an Arabidopsis sucrose synthase 1, an Arabidopsis sucrose synthase 3, and a Vigna radiata sucrose synthase.
4. The method according to any one of claims 1 to 3, wherein the first polypeptide comprises an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 5.
5. The method according to any one of claims 1 to 4, wherein the first polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 5.
6. The method according to any one of claims 1 to 5, wherein the first polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 5.
7. The method according to any one of claims 1 to 3, wherein the second polypeptide comprises an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
8. The method according to claim 7, wherein the second polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
9. The method according to claims 7 or 8, wherein the second polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
10. The method according to any one of claims 7 to 9, wherein the second polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 11.
11. The method according to any one of claims 1 to 3, wherein the third polypeptide comprises an amino acid sequence having at least 95% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
12. The method according to claim 11, wherein the third polypeptide comprises an amino acid sequence having at least 99% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
13. The method according to claims 11 or 12, wherein the third polypeptide comprises the amino acid sequence as set forth in SEQ ID NO: 15.
14. A recombinant cell comprising a heterologous polynucleotide encoding a polypeptide selected from the group consisting of a first polypeptide, a second polypeptide, a third polypeptide, and combinations thereof, wherein the first polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5, the second polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11, and the third polypeptide comprises an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
15. The recombinant cell according to claim 14, wherein the heterologous polynucleotide comprises a nucleotide sequence having at least 95%, 99%, or 100% identity to a nucleotide sequence selected from the group consisting of SEQ ID NOs: 6, 12, and 16.
16. The recombinant cell according to claims 14 or 15, wherein the first polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5.
17. The recombinant cell according to claims 14 or 15, wherein the second polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
18. The recombinant cell according to claims 14 or 15, wherein the third polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
19. The recombinant cell according to any of claims 14 to 18, wherein the recombinant cell is a microbial cell.
20. The recombinant microbial cell according to claim 19, wherein the cell is Escherichia coli or Pichia pastoris.
21. A method of producing a polypeptide, the method comprising: culturing recombinant cells according to any of claims 14 to 20; and expressing a polypeptide in the recombinant cells, wherein the polypeptide is selected from the group consisting of the first polypeptide, the second polypeptide, the third polypeptide, and combinations thereof.
22. The method of producing a polypeptide according to claim 21, wherein the first polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5, the second polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11, and the third polypeptide comprises an amino acid sequence having at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
23. A polypeptide produced according to the method of claims 21 or 22.
24. An isolated UDP-glycosyltransferase enzyme comprising an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 5.
25. The isolated UDP-glycosyltransferase enzyme according to claim 24, wherein the amino acid sequence has at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 5.
26. An isolated UDP-glycosyltransferase enzyme comprising an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
27. The isolated UDP-glycosyltransferase enzyme according to claim 26, wherein the amino acid sequence has at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 11.
28. An isolated UDP-glycosyltransferase enzyme comprising an amino acid sequence having at least 90% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
29. The isolated UDP-glycosyltransferase enzyme according to claim 28, wherein the amino acid sequence has at least 95%, 99%, or 100% identity to the amino acid sequence as set forth in SEQ ID NO: 15.
30. Salidroside produced according to the method of any of claims 1 to 13, for use as a neuroprotectant.
31. A composition comprising the salidroside produced according to the method of any of claims 1 to 13.
32. A consumable product comprising the salidroside produced using the method of any one of claims 1 to 13.
33. The consumable product according to claim 32, wherein the consumable product is selected from: a food product, a beverage product, a nutraceutical, a pharmaceutical, a dietary supplement, a dental hygienic composition, an edible gel composition, a cosmetic product and a tabletop flavoring.
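The claims above repeatedly recite identity thresholds (at least 90%, 95%, 99%, or 100% identity to a reference SEQ ID NO). The following is a minimal sketch of how such a percentage could be computed, offered only as an illustration. It assumes the two sequences have already been aligned to equal length, with `-` marking gaps, and it counts identical residues over all alignment columns. The function name `percent_identity` and the 20-residue variant are hypothetical; the reference fragment is the first 20 residues of SEQ ID NO: 19. The definition of identity and the alignment procedure given in the specification govern the claims, and they may count gaps or alignment length differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned to the same length; '-' marks a gap.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# First 20 residues of SEQ ID NO: 19 (AtSUS1), used here only as a toy reference.
reference = "MANAERMITRVHSQRERLNE"
# Hypothetical variant with two substitutions (N3D and Q14E).
variant = "MADAERMITRVHSERERLNE"

identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity; meets 90% threshold: {identity >= 90.0}")
```

With two substitutions in 20 aligned positions, 18 of 20 columns match, so the toy variant sits exactly at the 90% boundary. A real comparison would run over the full-length sequences after alignment with an established tool.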
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263375734P | 2022-09-15 | 2022-09-15 | |
US63/375,734 | 2022-09-15 | | |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2024059813A2 (en) | 2024-03-21 |
WO2024059813A3 (en) | 2024-05-16 |
Family
ID=89321894
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2023/074336 WO2024059813A2 (en) | 2022-09-15 | 2023-09-15 | Biosynthesis of salidroside |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024059813A2 (en) |
- 2023-09-15: WO application PCT/US2023/074336 (WO2024059813A2, en), status unknown
Non-Patent Citations (8)
Title |
---|
"Current Protocols in Molecular Biology", 1995, JOHN WILEY & SONS |
ASLANIDIS; DE JONG, NUCL. ACIDS RES., vol. 18, 1990, pages 6069 - 74 |
AUSUBEL, F. M. ET AL.: "Current Protocols in Molecular Biology", 1987, WILEY-INTERSCIENCE |
HAUN ET AL., BIOTECHNIQUES, vol. 13, 1992, pages 515 - 18 |
REECK ET AL., CELL, vol. 50, 1987, pages 667 |
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 2001 |
SAMBROOK, J.; FRITSCH, E. F.; MANIATIS, T.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY: COLD SPRING HARBOR |
SILHAVY, T. J.; BERMAN, M. L.; ENQUIST, L. W.: "Experiments with Gene Fusions", 1984, COLD SPRING HARBOR LABORATORY |
Also Published As
Publication number | Publication date |
---|---|
WO2024059813A3 (en) | 2024-05-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10253344B2 (en) | Recombinant production of steviol glycosides | |
KR101983115B1 (en) | Methods and materials for recombinant production of saffron compounds | |
JP2018511335A (en) | Generation of non-caloric sweeteners using modified whole-cell catalysts | |
JP7183254B2 (en) | A terpene synthase producing patchoulol and eremoll and preferably also pogostol | |
JP7510187B2 (en) | Biosynthesis of vanillin from isoeugenol | |
JP7525185B2 (en) | Biosynthesis of Alpha-Ionone and Beta-Ionone | |
US20220112525A1 (en) | Biosynthesis of vanillin from isoeugenol | |
JP2021515555A (en) | Production by biosynthesis of steviol glycosides rebaugioside J and rebaugioside N | |
WO2024059813A2 (en) | Biosynthesis of salidroside | |
US20240052380A1 (en) | Biosynthesis of vanillin from isoeugenol | |
GB2416769A (en) | Biosynthesis of raspberry ketone | |
US20240052381A1 (en) | Biosynthesis of vanillin from isoeugenol | |
CN112877349A (en) | Recombinant expression vector, genetic engineering bacterium containing recombinant expression vector and application of recombinant expression vector | |
JP4655648B2 (en) | Method for producing ubiquinone | |
WO2024108175A2 (en) | Constructs and methods for biosynthesis of gastrodin | |
HK40058263A (en) | Biosynthesis of vanillin from isoeugenol | |
HK40061253A (en) | Biosynthesis of vanillin from isoeugenol | |
HK1230659A1 (en) | Recombinant production of steviol glycosides |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23828540; Country of ref document: EP; Kind code of ref document: A2 |
| NENP | Non-entry into the national phase | Ref country code: DE |