CN118531030A - Expression cassette, recombinant vector, recombinant protein and application thereof - Google Patents
- Publication number
- CN118531030A
- Authority
- CN
- China
- Prior art keywords
- seq
- expression cassette
- glp
- cpb
- kex2
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
- C07K14/4701—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
- C07K14/4702—Regulators; Modulating activity
- C07K14/4705—Regulators; Modulating activity stimulating, promoting or activating activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/67—General methods for enhancing the expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12R—INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
- C12R2001/00—Microorganisms ; Processes using microorganisms
- C12R2001/01—Bacteria or Actinomycetales ; using bacteria or Actinomycetales
- C12R2001/185—Escherichia
- C12R2001/19—Escherichia coli
Abstract
The present invention relates to the field of bioengineering, and in particular to an expression cassette, a recombinant vector, a recombinant protein and applications thereof. The invention provides an expression cassette comprising, but not limited to: a leader peptide, an enzyme cleavage site, a linker and a GLP-1 receptor agonist backbone, wherein the GLP-1 receptor agonist backbone has the amino acid sequence shown in SEQ ID NO: 1. Building on earlier work, the invention designs a multi-tandem repeat sequence and demonstrates experimentally that the scheme is feasible. The multi-tandem repeats increase the amount of semaglutide backbone that is biosynthesized, and a satisfactory amount of protein is obtained after the subsequent multi-step purification, providing more raw material for subsequent experiments.
Description
Technical Field
The present invention relates to the field of bioengineering, and in particular to an expression cassette, a recombinant vector, a recombinant protein and applications thereof.
Background Art
Diabetes mellitus (DM) is a common metabolic disease characterized by organ dysfunction caused by the direct or indirect effects of chronic hyperglycemia. It is divided into two main types, type 1 diabetes and type 2 diabetes. According to the International Diabetes Federation, the global prevalence of diabetes in 2019 was estimated at 9.3%, or approximately 463 million people. The prevalence is projected to rise to 10.2% (approximately 578 million people) by 2030 and to 10.9% (approximately 700 million people) by 2045. Diabetes is one of the leading causes of death and disability worldwide. It threatens patients' quality of life and life expectancy and imposes a heavy economic burden on individuals and on health systems globally.
Obesity is a complex chronic metabolic disease characterized by the accumulation of body fat beyond normal levels, with adverse effects on health. In recent years, with changes in lifestyle and the influence of environmental factors, the incidence of obesity has continued to rise, and it has become a global health challenge. The global prevalence of obesity is estimated at about 14%, and exceeds 40% in some regions, which corresponds to about 1 billion people. This number is expected to reach 1.5 billion by 2030.
In recent years, significant progress has been made in peptide drug research and development, and the design and development of peptide drugs against specific disease targets have become more precise. This progress is attributed to improved targeting, the introduction of new formulation technologies, continuous improvement of synthesis and modification technologies, and the emergence of novel treatment strategies. These improvements can effectively increase the stability, solubility and biodistribution of peptide drugs, optimize their pharmacokinetic properties and prolong their plasma half-life, thereby improving efficacy and reducing adverse reactions. Against this background, semaglutide, an emerging peptide drug, has shown good efficacy in the treatment of diabetes and adult obesity.
Semaglutide is a synthetic analog developed by the Danish pharmaceutical company Novo Nordisk. It is a GLP-1 receptor agonist used to treat type 2 diabetes. GLP-1 is an insulin-releasing hormone produced by intestinal cells that promotes insulin secretion, inhibits glucagon secretion and reduces appetite. In 2017, the FDA approved an injectable formulation of semaglutide under the brand name Wegovy for the treatment of adult obesity. In January 2024, the China National Medical Products Administration approved the first oral semaglutide for the treatment of type 2 diabetes, which is also the first oral GLP-1 receptor agonist approved for marketing in China.
As shown in Figure 1, semaglutide consists of 31 amino acids and carries one long side-chain modification. Because the half-life of the GLP-1 backbone is extremely short, only about 1-2 minutes, the polypeptide must be modified to prevent its degradation, extend the half-life of the drug and reduce its toxic side effects. Accordingly, the amino acid at position 8 (alanine) is replaced with the non-natural amino acid α-aminoisobutyric acid, which prevents the rapid degradation of GLP-1 by the DPP4 enzyme and thereby extends the half-life. The lysine at position 26 carries a long side-chain modification (PEG and a C18 fatty diacid chain attached through glutamic acid), which increases the hydrophilicity and stability of the polypeptide, protects it from degradation and effectively extends the half-life of semaglutide.
Biosynthesis is carried out mainly in Escherichia coli (E. coli) and Pichia pastoris: the polypeptide sequence is rationally designed and then cut with specific tool enzymes to finally obtain the semaglutide backbone.
Summary of the Invention
In view of this, the present invention provides an expression cassette, a recombinant vector, a recombinant protein and applications thereof. Building on earlier work, the invention designs a multi-tandem repeat sequence and demonstrates experimentally that the scheme is feasible. The multi-tandem repeats increase the amount of semaglutide backbone that is biosynthesized, and a satisfactory amount of protein is obtained after subsequent purification. This provides a feasible scheme for the multi-tandem expression of a single structural protein and solves the problem that multi-tandem repeats of a single structural protein cannot be expressed and assembled normally. In addition, by introducing specific protease cleavage sites, the complete target protein sequence can be obtained by enzymatic cleavage at a later stage.
To achieve the above object, the present invention provides the following technical solutions:
The present invention provides an expression cassette comprising, but not limited to: a leader peptide, an enzyme cleavage site, a linker and a GLP-1 receptor agonist backbone;
the GLP-1 receptor agonist backbone has:
(1) the amino acid sequence shown in SEQ ID NO: 1; or
(2) a sequence obtained by substituting, deleting, adding and/or replacing one or more amino acids in the amino acid sequence of (1); or
(3) a sequence having at least 90% homology with the amino acid sequence of (1).
In some embodiments of the present invention, in the above expression cassette, the sequence of SEQ ID NO: 1 is: EGTFTSDVSSYLEGQAAKEFIAWLVRGRG.
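The relationship between SEQ ID NO: 1 and the residue numbering used in the Background can be sketched as follows. This is only an illustration: the offset `FIRST_POSITION = 9` is an assumption (the source does not state which position the first residue of SEQ ID NO: 1 occupies), and the helper `numbered` is a hypothetical name. Under that assumption the 29 residues span positions 9 to 37, and the only lysine falls at position 26, the position the Background names as the side-chain modification site.

```python
# SEQ ID NO: 1 as given in the text.
BACKBONE = "EGTFTSDVSSYLEGQAAKEFIAWLVRGRG"

# Assumption (not stated in the source): the first residue of SEQ ID NO: 1
# sits at position 9 in the numbering used for semaglutide in the Background.
FIRST_POSITION = 9


def numbered(seq, start=FIRST_POSITION):
    """Return (position, residue) pairs for a peptide sequence."""
    return [(start + i, aa) for i, aa in enumerate(seq)]


positions = dict(numbered(BACKBONE))
lysines = [pos for pos, aa in positions.items() if aa == "K"]
arginines = [pos for pos, aa in positions.items() if aa == "R"]

print(len(BACKBONE), "residues, positions", min(positions), "to", max(positions))
print("Lys positions:", lysines)
print("Arg positions:", arginines)
```

Listing the basic residues is useful here because the cleavage enzymes named later in the text act at lysine or arginine residues.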
In some embodiments of the present invention, in the above expression cassette, the nucleic acid molecule encoding the GLP-1 receptor agonist backbone has a sequence shown in any one of SEQ ID NO: 4 to SEQ ID NO: 9:
SEQ ID NO: 4: GAAGGCACGTTTACCTCTGATGTGAGCTCTTATTTAGAAGGCCAGGCGGCTAAAGAATTTATTGCGTGGCTTGTGCGCGGCCGCGGC;
SEQ ID NO: 5: GAAGGCACCTTTACCAGCGATGTGAGCAGCTATCTGGAAGGCCAGGCGGCGAAAGAGTTTATTGCGTGGTTAGTGCGCGGTCGCGGT;
SEQ ID NO: 6: GAAGGCACCTTTACGAGCGATGTGAGCAGCTATTTAGAAGGTCAGGCGGCGAAAGAATTCATTGCGTGGTTAGTGCGTGGTCGTGGT;
SEQ ID NO: 7: GAGGGCACCTTTACCTCCGATGTGAGCAGCTATTTGGAAGGCCAGGCGGCGAAGGAATTTATTGCGTGGCTGGTTCGTGGTCGCGGT;
SEQ ID NO: 8: GAAGGCACCTTTACCTCTGATGTGAGCAGCTATCTCGAAGGCCAGGCGGCCAAAGAATTTATTGCATGGCTGGTTCGTGGCCGAGGT;
SEQ ID NO: 9: GAAGGCACGTTTACCAGCGATGTGAGCTCTTACCTGGAAGGCCAGGCGGCAAAAGAATTTATCGCGTGGTTGGTGCGTGGTCGTGGC.
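A quick consistency check of the six coding sequences against SEQ ID NO: 1 can be sketched as below. The code uses the standard genetic code. The names `CODON_TABLE`, `translate` and `CODING_SEQUENCES` are illustrative and not part of the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

BACKBONE = "EGTFTSDVSSYLEGQAAKEFIAWLVRGRG"  # SEQ ID NO: 1

CODING_SEQUENCES = {
    4: "GAAGGCACGTTTACCTCTGATGTGAGCTCTTATTTAGAAGGCCAGGCGGCTAAAGAATTTATTGCGTGGCTTGTGCGCGGCCGCGGC",
    5: "GAAGGCACCTTTACCAGCGATGTGAGCAGCTATCTGGAAGGCCAGGCGGCGAAAGAGTTTATTGCGTGGTTAGTGCGCGGTCGCGGT",
    6: "GAAGGCACCTTTACGAGCGATGTGAGCAGCTATTTAGAAGGTCAGGCGGCGAAAGAATTCATTGCGTGGTTAGTGCGTGGTCGTGGT",
    7: "GAGGGCACCTTTACCTCCGATGTGAGCAGCTATTTGGAAGGCCAGGCGGCGAAGGAATTTATTGCGTGGCTGGTTCGTGGTCGCGGT",
    8: "GAAGGCACCTTTACCTCTGATGTGAGCAGCTATCTCGAAGGCCAGGCGGCCAAAGAATTTATTGCATGGCTGGTTCGTGGCCGAGGT",
    9: "GAAGGCACGTTTACCAGCGATGTGAGCTCTTACCTGGAAGGCCAGGCGGCAAAAGAATTTATCGCGTGGTTGGTGCGTGGTCGTGGC",
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


for seq_id, dna in CODING_SEQUENCES.items():
    matches = translate(dna) == BACKBONE
    print(f"SEQ ID NO: {seq_id}: {len(dna)} nt, encodes SEQ ID NO: 1 -> {matches}")
```

Each of the six sequences is a distinct codon variant of the same 29-residue peptide, so the tandem repeats can differ at the nucleotide level while encoding identical backbones.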
In some embodiments of the present invention, in the above expression cassette, the number of GLP-1 receptor agonists is no less than 6.
In some embodiments of the present invention, in the above expression cassette, the number of GLP-1 receptor agonists is 6, 7, 8 or 9.
In some embodiments of the present invention, in the above expression cassette, the enzyme cleavage site comprises one or more of an EK (enterokinase) cleavage site, a KEX2 cleavage site and a CPB cleavage site.
In some embodiments of the present invention, in the above expression cassette, the EK cleavage site is located at the N-terminus of the fourth expression cassette.
In some embodiments of the present invention, in the above expression cassette, the KEX2 cleavage site and the CPB cleavage site are located at the N-terminus of the GLP-1 receptor agonist backbone.
In some embodiments of the present invention, in the above expression cassette, the amino acid sequence of the EK cleavage site is: DDDDK.
In some embodiments of the present invention, in the above expression cassette, the sequence of the nucleic acid molecule encoding the EK cleavage site is as shown in SEQ ID NO: 10 or SEQ ID NO: 11: GATGATGATGACAAG (SEQ ID NO: 10) or GACGACGACGACAAG (SEQ ID NO: 11).
In some embodiments of the present invention, in the above expression cassette, the amino acid sequence of the KEX2 cleavage site is: RR.
In some embodiments of the present invention, in the above expression cassette, the sequence of the nucleic acid molecule encoding the KEX2 cleavage site is as shown in any one of SEQ ID NO: 12 to SEQ ID NO: 15: CGCCGT (SEQ ID NO: 12), AGACGT (SEQ ID NO: 13), GCCGCT (SEQ ID NO: 14) or CGTCGC (SEQ ID NO: 15).
In some embodiments of the present invention, in the above expression cassette, the amino acid sequence of the CPB cleavage site is: R.
In some embodiments of the present invention, in the above expression cassette, the sequence of the nucleic acid molecule encoding the CPB cleavage site is as shown in any one of SEQ ID NO: 16 to SEQ ID NO: 18: CGC (SEQ ID NO: 16), CGT (SEQ ID NO: 17) or AGA (SEQ ID NO: 18).
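The relationship between the coding sequences and the cleavage-site peptides listed above can be checked by translating each codon string. The following is a minimal sketch, assuming the standard genetic code (NCBI translation table 1); the names `translate` and `CLEAVAGE_SITE_CODING` are illustrative and do not come from the patent.

```python
# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Coding sequences as printed in the text, keyed by SEQ ID NO.
# SEQ ID NO: 14 (GCCGCT) is left out: as printed it would translate to
# Ala-Ala under the standard code, so it is not checked here.
CLEAVAGE_SITE_CODING = {
    10: "GATGATGATGACAAG",  # EK site
    11: "GACGACGACGACAAG",  # EK site
    12: "CGCCGT",           # KEX2 site
    13: "AGACGT",           # KEX2 site
    15: "CGTCGC",           # KEX2 site
    16: "CGC",              # CPB site
    17: "CGT",              # CPB site
    18: "AGA",              # CPB site
}

if __name__ == "__main__":
    for seq_id, dna in CLEAVAGE_SITE_CODING.items():
        print(f"SEQ ID NO: {seq_id}: {dna} -> {translate(dna)}")
```

The same helper can be applied to the longer coding sequences in this section (for example the leader peptide or linker) to confirm that a synthesized construct stays in frame.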
In some embodiments of the present invention, in the above expression cassette, the leader peptide has:
(4) the amino acid sequence shown in SEQ ID NO: 2; or
(5) a sequence in which one or more amino acids are substituted, deleted, added and/or replaced relative to the amino acid sequence shown in (4); or
(6) a sequence having a homology of more than 90% with the amino acid sequence shown in (4).
In some embodiments of the present invention, in the above expression cassette, the sequence of SEQ ID NO: 2 is: LVPRGSGMKETAAAKFERQHMDSPDLGTDDDDKAMADIGSMRLNSA.
In some embodiments of the present invention, in the above expression cassette, the nucleic acid molecule encoding the leader peptide has the sequence shown in SEQ ID NO: 22: CTGGTGCCACGCGGTTCTGGTATGAAAGAAACCGCTGCTGCTAAATTCGAACGCCAGCACATGGACAGCCCAGATCTGGGTACCGACGACGACGACAAGGCCATGGCTGATATCGGATCCATGCGCCTGAACAGCGCG.
In some embodiments of the present invention, in the above expression cassette, the linker has:
(7) the amino acid sequence shown in SEQ ID NO: 3; or
(8) a sequence in which one or more amino acids are substituted, deleted, added and/or replaced relative to the amino acid sequence shown in (7); or
(9) a sequence having a homology of more than 90% with the amino acid sequence shown in (7).
In some embodiments of the present invention, in the above expression cassette, the sequence of SEQ ID NO: 3 is: GSGSEEGSGS.
In some embodiments of the present invention, in the above expression cassette, the nucleotide sequence of the nucleic acid molecule encoding the linker is as shown in SEQ ID NO: 19: GGTAGCGGTTCTGAGGAAGGTTCTGGAAGC.
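The homology thresholds recited for the leader peptide and the linker can be illustrated with a simple identity calculation. The sketch below is a simplification: it compares two sequences of equal length position by position, whereas homology is normally computed from a sequence alignment. The function name `percent_identity` and the variant sequence are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two equal-length sequences.

    Simplification: no alignment is performed, so insertions and deletions
    are not handled; both inputs must have the same non-zero length.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


LINKER = "GSGSEEGSGS"             # SEQ ID NO: 3 as given in the text
HYPOTHETICAL_VARIANT = "GSGSEDGSGS"  # illustrative single substitution (E -> D)

if __name__ == "__main__":
    print(percent_identity(LINKER, HYPOTHETICAL_VARIANT))
```

For a 10-residue linker, a single substitution already changes the identity by ten percentage points, so the threshold is far more permissive for the 46-residue leader peptide than for the linker.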
In some embodiments of the present invention, the above expression cassette further comprises a Trx-6x His tag.
In some embodiments of the present invention, in the above expression cassette, the amino acid sequence of the Trx-6x His tag is as shown in SEQ ID NO: 20: MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLAGSGSGHMHHHHHHSSG.
In some embodiments of the present invention, in the above expression cassette, the nucleotide sequence encoding the Trx-6x His tag is as shown in SEQ ID NO: 21: ATGAGCGATAAAATTATTCACCTGACTGACGACAGTTTTGACACGGATGTACTCAAAGCGGACGGGGCGATCCTCGTCGATTTCTGGGCAGAGTGGTGCGGTCCGTGCAAAATGATCGCCCCGATTCTGGATGAAATCGCTGACGAATATCAGGGCAAACTGACCGTTGCAAAACTGAACATCGATCAAAACCCTGGCACTGCGCCGAAATATGGCATCCGTGGTATCCCGACTCTGCTGCTGTTCAAAAACGGTGAAGTGGCGGCAACCAAAGTGGGTGCACTGTCTAAAGGTCAGTTGAAAGAGTTCCTCGACGCTAACCTGGCCGGTTCTGGTTCTGGCCATATGCACCATCATCATCATCATTCTTCTGGT.
In some embodiments of the present invention, in the above expression cassette, the Trx-6x His tag is located at the N-terminus of the leader peptide.
The present invention also provides a recombinant vector comprising the above expression cassette.
In some embodiments of the present invention, in the above recombinant vector, the number of expression cassettes is greater than or equal to 1.
In some embodiments of the present invention, in the above recombinant vector, the number of expression cassettes is 1, 2 or 3.
The present invention also provides a host transformed and/or transfected with the above recombinant vector.
The present invention also provides a recombinant protein obtained by expression in the above host followed by denaturation and renaturation.
The present invention also provides the use of the above expression cassette, the above recombinant vector, the above host and/or the above protein in increasing the yield and/or expression level of a GLP-1 receptor agonist.
The present invention provides an expression cassette, including but not limited to: a leader peptide, an enzyme cleavage site, a linker and a GLP-1 receptor agonist backbone;
the GLP-1 receptor agonist backbone has:
(1) the amino acid sequence shown in SEQ ID NO: 1; or
(2) a sequence in which one or more amino acids are substituted, deleted, added and/or replaced relative to the amino acid sequence shown in (1); or
(3) a sequence having a homology of more than 90% with the amino acid sequence shown in (1).
The present invention optimizes the design of multi-tandem repeat sequences and, building on existing patents, greatly increases the number of tandem repeats, up to 24. It provides a feasible scheme for the multi-tandem expression of a single structural protein, solves the problem that multi-tandem repeats of a single structural protein cannot be expressed and assembled normally, and, by introducing specific protease cleavage sites, allows the complete target protein sequence to be obtained by subsequent enzymatic cleavage.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below.
Figure 1 shows a schematic diagram of the structure of the drug semaglutide;
Figure 2 shows the design principle of the multi-tandem repeat sequences;
Figure 3 shows the numbers of tandem repeats designed in the present invention;
Figure 4 shows the SDS-PAGE image of protein purification;
Figure 5 shows the SDS-PAGE image of protein purification for the multiple-expression-cassette constructs.
DETAILED DESCRIPTION
The present invention discloses an expression cassette, a recombinant vector, a recombinant protein and applications thereof.
It should be understood that the expression "one or more of..." includes each of the objects recited after the expression individually, as well as the various combinations of two or more of the recited objects, unless otherwise understood from the context and usage. The expression "and/or" in connection with three or more recited objects should be understood to have the same meaning, unless otherwise understood from the context.
The use of the terms "comprising", "having" or "containing", including their grammatical equivalents, should generally be understood as open-ended and non-limiting, for example not excluding other unrecited elements or steps, unless otherwise specifically stated or otherwise understood from the context.
It should be understood that the order of steps, or the order in which certain actions are performed, is immaterial as long as the present invention remains operable. In addition, two or more steps or actions may be performed simultaneously.
The use of any and all examples or exemplary language herein, such as "for example" or "including", is intended only to better illustrate the invention and does not limit the scope of the invention unless claimed. No language in this specification should be construed as indicating that any non-claimed element is essential to the practice of the invention.
In addition, the numerical ranges and parameters used to define the present invention are approximate, and the relevant values in the specific embodiments are presented as precisely as possible. However, any numerical value inherently contains a standard deviation arising from the individual test method. Therefore, unless expressly stated otherwise, all ranges, quantities, values and percentages used in this disclosure should be understood as modified by "about". Here, "about" generally means that the actual value lies within plus or minus 10%, 5%, 1% or 0.5% of a specific value or range.
In Examples 1 to 6 and the Effect Example of the present invention, the culture media and buffers used in the experiments have the following formulations:
LB medium (1 L): 10 g peptone, 10 g sodium chloride, 5 g yeast extract;
TB medium/fermentation medium (1 L): 18 g peptone, 12 g yeast extract, 10 g sodium chloride;
Feed medium (1 L): 25-65% glucose, 20 g peptone, 10 g yeast extract;
Buffer A: 50 mM Tris·HCl pH 6.5-8.5, 300 mM sodium chloride;
Buffer B: 50 mM Tris·HCl pH 6.5-8.5, 300 mM sodium chloride, 2 M urea;
Buffer C: 50 mM Tris·HCl pH 6.5-8.5, 300 mM sodium chloride, 8 M urea;
Buffer D: 50 mM Tris·HCl pH 6.5-8.5, 300 mM sodium chloride, urea step gradient (4, 2, 1, 0 M).
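When preparing the buffers listed above, the molar concentrations have to be converted into weighed masses. The following is a minimal sketch of that conversion (mass = molarity × molar mass × volume); the molar masses of sodium chloride and urea are standard values supplied here as assumptions, and the function name `grams_needed` is illustrative only.

```python
# Molar masses in g/mol (standard values, not taken from the patent text).
MOLAR_MASS_G_PER_MOL = {
    "NaCl": 58.44,
    "urea": 60.06,
}


def grams_needed(molarity_mol_per_l, molar_mass_g_per_mol, volume_l=1.0):
    """Mass of solute (g) required for a given molarity and final volume."""
    return molarity_mol_per_l * molar_mass_g_per_mol * volume_l


# Urea concentrations (M) of the Buffer D dialysis steps as listed in the text.
BUFFER_D_UREA_STEPS_M = [4, 2, 1, 0]

if __name__ == "__main__":
    nacl = grams_needed(0.300, MOLAR_MASS_G_PER_MOL["NaCl"])
    print(f"NaCl for 1 L at 300 mM: {nacl:.2f} g")
    for urea_molar in BUFFER_D_UREA_STEPS_M:
        urea = grams_needed(urea_molar, MOLAR_MASS_G_PER_MOL["urea"])
        print(f"Urea for 1 L at {urea_molar} M: {urea:.2f} g")
```

At 8 M, urea occupies a substantial part of the final volume, so in practice the solution is made up to volume after the urea has dissolved rather than by adding the solid to a full litre of buffer.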
In Examples 1 to 6 and the Effect Example of the present invention, all raw materials and reagents used are commercially available.
The present invention is further described below with reference to the examples:
Example 1 Design principle and route of the tandem repeat sequences
The design principle of this patent is shown in Figure 2. pQLinkN/pET-32a is used as the expression vector. The sequences are designed as multiple tandem repeats of the semaglutide backbone. As shown in Figure 2, a Trx tag is added at the N-terminus to promote protein translation; an EK (enterokinase) cleavage site (DDDDK-, Asp-Asp-Asp-Asp-Lys-) is then added; multiple copies of the sequence are then repeated in tandem, with every two adjacent semaglutide backbone repeats connected by a linker of sequence "-RR-GSGSEEGSGS-RR-" or "-RR-GSGSEEGSGS-DDDDK-".
Here, the sequence "DDDDK-" is specifically recognized by EK (enterokinase) and cleaved on its C-terminal side; the sequence "RR-" is cleaved by KEX2 (a serine protease), because KEX2 specifically recognizes the dibasic residues RR or KR and cleaves on the C-terminal side of the pair; "-R-R" is removed by CPB (carboxypeptidase B), because CPB has exopeptidase activity that specifically recognizes the basic amino acids K or R and removes these two surplus residues one at a time from the C-terminus, finally yielding the complete semaglutide backbone. Figure 2 shows the design principle of the multi-tandem repeat sequences, and Figure 3 shows the multi-tandem sequences of the present invention.
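The two-step processing described above (KEX2 cleavage after each dibasic site, then CPB trimming of the trailing basic residues) can be sketched as an idealized in-silico digest. This is a simplified model under stated assumptions: EK sites are ignored, cleavage is assumed complete, and `BACKBONE` is the translation of SEQ ID NO: 9 under the standard genetic code. All function names are hypothetical.

```python
LEADER = "LVPRGSGMKETAAAKFERQHMDSPDLGTDDDDKAMADIGSMRLNSA"  # SEQ ID NO: 2
LINKER = "GSGSEEGSGS"                                      # SEQ ID NO: 3
# Assumption: backbone obtained by translating SEQ ID NO: 9 (standard code).
BACKBONE = "EGTFTSDVSSYLEGQAAKEFIAWLVRGRG"


def build_construct(n_repeats):
    """Leader followed by n backbone copies joined by -RR-linker-RR-."""
    units = ["RR" + BACKBONE + "RR"] * n_repeats
    return LEADER + LINKER.join(units)


def kex2_cleave(seq):
    """Idealized KEX2: cut on the C-terminal side of every RR or KR pair."""
    fragments, start = [], 0
    for i in range(1, len(seq)):
        if seq[i - 1:i + 1] in ("RR", "KR"):
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


def cpb_trim(fragment):
    """Idealized CPB: remove basic residues (K/R) from the C-terminus."""
    return fragment.rstrip("KR")


def process(seq):
    """KEX2 digestion followed by CPB trimming of each fragment."""
    return [cpb_trim(f) for f in kex2_cleave(seq)]


if __name__ == "__main__":
    products = process(build_construct(8))
    print("backbone copies released:", products.count(BACKBONE))
```

In this model the leader and linker fragments are released alongside the backbone copies and would have to be separated from them in a later purification step, which the sketch does not attempt to represent.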
The design convention is as follows: the reading frame of the expression cassette begins at the start codon (ATG), which is the N-terminus, and ends at the stop codon (TGA), which is the C-terminus; unless otherwise specified, constructs in this patent are described from the N-terminus to the C-terminus.
Example 2 Vector selection and construction
pQLinkN was selected as the expression vector. The feature of this vector is that multiple independently operating multi-copy expression cassettes can be connected in series on a single vector by recombination. In theory the vector can carry N expression cassettes; previous studies have confirmed that no fewer than 5 cassettes can each independently complete protein expression, that is, one vector can express 5 proteins.
Therefore, in addition to constructing multi-tandem repeats of the semaglutide backbone, the present invention uses the tandem sequence as one copy unit to construct multiple gene expression cassettes. The theoretical expression of the backbone then increases as a multiple of the tandem repeat number. Taking 5 tandem repeats as an example: 5 repeats form one copy unit; with 2 gene expression cassettes the number of backbones is doubled to 10 repeats (5×2); with 3 gene expression cassettes it is tripled to 15 repeats (5×3); and so on.
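The multiplication described above can be written out directly. The short sketch below only restates the arithmetic from the text (repeats per cassette times number of cassettes); the function name is illustrative.

```python
def total_backbone_copies(repeats_per_cassette, cassettes):
    """Theoretical number of backbone copies encoded by one vector."""
    if repeats_per_cassette < 1 or cassettes < 1:
        raise ValueError("both counts must be at least 1")
    return repeats_per_cassette * cassettes


if __name__ == "__main__":
    # The worked example from the text (5 repeats per cassette) and the
    # 8-repeat unit used for the 16 X and 24 X constructs.
    for repeats, cassettes in [(5, 2), (5, 3), (8, 2), (8, 3)]:
        copies = total_backbone_copies(repeats, cassettes)
        print(f"{repeats} x {cassettes} = {copies}")
```

This is a theoretical count of encoded copies only; the protein yield actually obtained need not scale by the same factor.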
The vector used in the present invention is the commercial vector pET-32a, and the NcoI (CCATGG) and XhoI (CTCGAG) sites were selected for sequence insertion. First, single-expression-cassette constructs with 6 to 9 tandem repeats were designed, hereinafter denoted 6 X to 9 X; then multiple-expression-cassette constructs were designed to obtain more than 10 tandem repeats. The present invention designed up to 9 tandem repeats in a single expression cassette and up to 3 expression cassettes, i.e. 24 tandem repeats.
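Before an insert is cloned between NcoI and XhoI, it is common to check that the insert itself does not contain either recognition sequence. The sketch below shows one simple way to scan for them; the function name `find_sites` is hypothetical, and only the forward strand is scanned, which is sufficient here because both recognition sequences are palindromic.

```python
# Recognition sequences of the two enzymes (both are palindromic,
# so scanning one strand finds sites on both strands).
RECOGNITION_SITES = {
    "NcoI": "CCATGG",
    "XhoI": "CTCGAG",
}


def find_sites(dna, site):
    """Return 0-based start positions of every occurrence of site in dna."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


# Linker coding sequence (SEQ ID NO: 19) used as a small example input.
LINKER_DNA = "GGTAGCGGTTCTGAGGAAGGTTCTGGAAGC"

if __name__ == "__main__":
    for enzyme, site in RECOGNITION_SITES.items():
        print(enzyme, site, find_sites(LINKER_DNA, site))
```

An internal site found by such a scan would typically be removed by choosing a synonymous codon at that position during codon optimization.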
The present invention relates to the following combinations:
6 X: N-leading peptide-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-EK-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB; (order of the CPB-GLP-1 backbone sequences: SEQ ID NO: 4-SEQ ID NO: 5-SEQ ID NO: 6-SEQ ID NO: 7-SEQ ID NO: 8-SEQ ID NO: 9)
7 X: N-leading peptide-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-EK-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB; (order of the CPB-GLP-1 backbone sequences: SEQ ID NO: 5-SEQ ID NO: 6-SEQ ID NO: 7-SEQ ID NO: 4-SEQ ID NO: 6-SEQ ID NO: 9-SEQ ID NO: 5)
8 X: N-leading peptide-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-EK-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-EK-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB; (order of the CPB-GLP-1 backbone sequences: SEQ ID NO: 5-SEQ ID NO: 6-SEQ ID NO: 7-SEQ ID NO: 4-SEQ ID NO: 6-SEQ ID NO: 9-SEQ ID NO: 5-SEQ ID NO: 6)
9 X: N-leading peptide-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-EK-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-EK-KEX2/CPB-linker-EK-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB-linker-KEX2/CPB-GLP-1 backbone-KEX2/CPB; (order of the CPB-GLP-1 backbone sequences: SEQ ID NO: 5-SEQ ID NO: 6-SEQ ID NO: 7-SEQ ID NO: 4-SEQ ID NO: 5-SEQ ID NO: 6-SEQ ID NO: 7-SEQ ID NO: 8-SEQ ID NO: 9)
16 X: one 8 X is used as the repeat unit, and 2 repeat units are connected in series to obtain 16 X;
24 X: one 8 X is used as the repeat unit, and 3 repeat units are connected in series to obtain 24 X;
After the sequence design was completed, the codon-optimized sequences of the tandem proteins were produced by gene synthesis and recombined into the expression vector pET-32a; after the constructs were confirmed by sequencing, the proteins were expressed.
In the end, we designed constructs with 6 to 9 tandem repeats, raising the expression level and yield of the semaglutide backbone protein by increasing the number of tandem repeats expressed.
Example 3 Biosynthesis of the recombinant protein
Unless otherwise specified, E. coli transformation, standard media (LB/TB, fermentation medium), feed medium, the fermentation protocol and the protein purification protocol all followed conventional laboratory practice.
The frozen plasmid was transformed into BL21(DE3) and plated by the conventional method, then cultured overnight at 37°C. Several single colonies were picked and each cultured overnight in LB medium; this is the primary seed culture. The overnight primary seed culture was added to an appropriate amount of TB medium and grown to an OD600 of 2-7; this is the secondary seed culture, which can be used to inoculate the fermenter.
The secondary seed culture was inoculated at an inoculum of 0.1-1%. Fermentation was carried out at 37°C, with the stirring speed set to 300-1200 r/min, aeration set to 500 mL/min and pH 6.5-7.5. Feeding began when the initial sugar in the broth was exhausted and was adjusted according to the fermentation parameters (such as dissolved oxygen and pH) until the end of fermentation. When the OD600 of the broth reached 100, 0.1-1.5 mM IPTG was added for induction and the fermentation temperature was lowered as appropriate for the protein; the induction temperature was set within 15-35°C and can be determined in preliminary small-scale trials. OD600 was measured every two hours during fermentation. Once the OD reached a plateau, fermentation was terminated and the culture was harvested; the cells were collected with a high-speed centrifuge and either disrupted immediately or stored at -80°C until use, depending on the experimental schedule.
The semaglutide backbone multi-tandem repeat proteins of the present invention are all present as inclusion bodies, so inclusion body denaturation and renaturation is required. An appropriate amount of cells was resuspended in Buffer A at 1:5-20 (W/V) and disrupted by high-pressure homogenization (large volumes) or sonication (small volumes), then centrifuged in a pre-cooled high-speed centrifuge at 18000 rpm for 40-60 min; the pellet was collected as the crude inclusion bodies. The crude inclusion bodies were resuspended in Buffer B at 1:10 (W/V) and centrifuged at 18000 rpm for 40-60 min, and the pellet was collected; this wash was repeated 3 times to give washed inclusion bodies, which can be used for the subsequent denaturation and renaturation.
An appropriate amount of washed inclusion bodies was resuspended in Buffer C at 1:20 (W/V) and incubated at room temperature until the inclusion bodies were fully dissolved (denatured), then centrifuged in a pre-cooled high-speed centrifuge at 18000 rpm for 40-60 min; the pellet was discarded and the supernatant collected as the denatured inclusion body protein. Renaturation of the inclusion body protein can be adapted to the specific experiment, choosing among dilution renaturation, dialysis renaturation and on-column renaturation on an affinity column; the present invention uses dialysis renaturation. A dialysis bag with a molecular weight cut-off suited to the protein was selected, and an appropriate amount of denatured protein was loaded. Renaturation was carried out in Buffer D, with the buffer changed every 6-8 h, so that the denaturant (urea or guanidine hydrochloride) was removed slowly to give the renatured protein, which is the semaglutide backbone multi-tandem repeat protein.
Example 4
The 8 X single-expression-cassette vector was used for protein expression and purification; the experimental details are as described in "Example 3 Biosynthesis of the recombinant protein". After inclusion body denaturation and renaturation, protein of good purity was obtained.
Example 5
The vector carrying 2 expression cassettes of 8 X (8 X 2) was used for protein expression and purification; details are as in Examples 1 to 3.
Example 6
The vector carrying 3 expression cassettes of 8 X (8 X 3) was used for protein expression and purification; details are as in Examples 1 to 3.
Effect Example
Through the design and optimization of the protein expression and purification method, the target protein was obtained from all of the designed multi-tandem repeat polypeptide sequences. All of the designed multi-tandem repeat proteins were present as inclusion bodies in the prokaryotic host (E. coli), and soluble target protein was obtained by inclusion body denaturation and renaturation. Protein yield rose as the number of expression cassettes increased: compared with 8 X, the protein yield of the 8 X 3 expression cassette construct was 64% higher. The results are shown in Table 1:
Table 1
The above is only a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art can make a number of improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410653658.6A CN118531030A (en) | 2024-05-24 | 2024-05-24 | Expression cassette, recombinant vector, recombinant protein and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410653658.6A CN118531030A (en) | 2024-05-24 | 2024-05-24 | Expression cassette, recombinant vector, recombinant protein and application thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118531030A true CN118531030A (en) | 2024-08-23 |
Family
ID=92387339
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410653658.6A Pending CN118531030A (en) | 2024-05-24 | 2024-05-24 | Expression cassette, recombinant vector, recombinant protein and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118531030A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118702828A (en) * | 2024-08-27 | 2024-09-27 | 深圳佳肽生物科技有限公司 | Systems and methods for producing semaglutide precursor polypeptides |
-
2024
- 2024-05-24 CN CN202410653658.6A patent/CN118531030A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||