CN109022478A

CN109022478A - A DNA element for high-throughput in vitro protein synthesis

Info

Publication number: CN109022478A
Application number: CN201710430876.3A
Authority: CN
Inventors: 郭敏; 王海鹏; 柴智; 刘帅龙; 于雪
Original assignee: Kang Code (shanghai) Biological Technology Co Ltd
Current assignee: Kang Code (shanghai) Biological Technology Co Ltd
Priority date: 2017-06-09
Filing date: 2017-06-09
Publication date: 2018-12-18

Abstract

The present invention provides a DNA element for high-throughput in vitro protein synthesis. Specifically, the inventors have unexpectedly discovered, for the first time, a novel DNA element that greatly enhances protein translation efficiency. When the DNA element of the invention is used in a yeast in vitro protein synthesis system, the relative light unit value of the synthesized luciferase activity is enhanced about 22.3-fold relative to a DNA element containing only the Ω sequence.

Description

A DNA element for high-throughput in vitro protein synthesis

Technical field

The present invention relates to the field of biotechnology, and more particularly to a DNA element for high-throughput in vitro protein synthesis.

Background art

In eukaryotic cells, the vast majority of protein-coding mRNA molecules carry a "cap" structure (m⁷GpppN) at the 5' end and a polyadenosine chain [poly(A)] at the 3' end. These two structures not only enhance the stability of the mRNA molecule but are also necessary for protein translation^[1]. The 5' "cap" structure is recognized by the translation initiation factor eIF4E, which recruits downstream translation initiation factors and ribosomes and thereby initiates protein translation^[1]. In eukaryotic cells, capping of mRNA involves a three-step reaction catalyzed by three enzymes, and this process proceeds simultaneously with transcription catalyzed by RNA polymerase II^[2].

In in vitro protein synthesis systems with coupled transcription and translation, transcription is usually carried out by an RNA polymerase from a bacteriophage or virus (such as T7 RNA polymerase), which makes it difficult to add a "cap" to the mRNA. In an in vitro protein synthesis system based on eukaryotic cells, in order to use the eukaryotic translation machinery (including translation initiation factors, ribosomes and the like), mRNA lacking a "cap" structure must contain other elements that recruit the relevant factors and thereby initiate protein translation^[3].

However, the elements currently used to improve protein synthesis are mainly the translational enhancer Ω sequence of tobacco mosaic virus (TMV) and the internal ribosome entry sites (IRESs) of certain other viruses, such as the IRES of encephalomyocarditis virus (EMCV). These elements can initiate "cap"-independent protein translation both in cells and in in vitro protein synthesis systems. However, the efficiency with which these elements initiate protein translation is often low, which lengthens the time required for protein synthesis and limits the final protein yield^[4].

Therefore, there is an urgent need in the art to develop a new DNA element capable of enhancing protein translation efficiency.

Summary of the invention

The object of the present invention is to provide a new DNA element capable of enhancing protein translation efficiency.

A first aspect of the present invention provides a nucleic acid construct having the structure of Formula I from 5' to 3':

Z1-Z2-Z3-Z4 (I)

In the formula,

Z1, Z2, Z3 and Z4 are each an element constituting the construct;

Each "-" is independently a bond or a nucleotide linker sequence;

Z1 is the 5' leader sequence of tobacco mosaic virus (TMV), i.e. the Ω sequence;

Z2 is an oligomeric chain of adenine deoxyribonucleotides, [oligo(A)]_n;

Z3 is a translation initiation codon;

Z4 is a serine codon;

and Z2, Z3 and Z4 together constitute a Kozak sequence, the Kozak sequence being derived from a yeast of the genus Kluyveromyces.

In another preferred example, the Kluyveromyces yeast is selected from the group: Kluyveromyces lactis, Kluyveromyces marxianus, Kluyveromyces dobzhanskii, or a combination thereof.

In another preferred example, the translation initiation codon is selected from the group: ATG, ATA, ATT, GTG, TTG, or a combination thereof.

In another preferred example, the translation initiation codon is ATG.

In another preferred example, the serine codon is selected from the group: TCT, TCC, TCA, TCG, AGT, AGC, or a combination thereof.

In another preferred example, the serine codon is TCT.

In another preferred example, n is 6-12, preferably 8-10.

In another preferred example, the Ω sequence comprises a direct repeat module (ACAATTAC)_m and a (CAA)_p module.

In another preferred example, m is 1-6, preferably 2-4.

In another preferred example, p is 6-12, preferably 8-10.

In another preferred example, the number of (CAA)_p modules is 1-5, preferably 1-3.

In another preferred example, the (CAA)_p module also includes an optimized (CAA)_p module.

In another preferred example, the sequence of the nucleic acid construct is as shown in SEQ ID NO.: 1-3.

In another preferred example, the Kozak sequence is as shown in SEQ ID NO.: 5.
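The combinations recited above for Z2-Z3-Z4 can be enumerated programmatically. The following is a minimal sketch that assumes only the ranges and codon lists stated in this aspect; the function name `kozak_variants` is hypothetical, and the output is not a reproduction of SEQ ID NO.: 5 or of any other listed sequence.

```python
from itertools import product

# Values recited in the first aspect of the text.
START_CODONS = ("ATG", "ATA", "ATT", "GTG", "TTG")
SERINE_CODONS = ("TCT", "TCC", "TCA", "TCG", "AGT", "AGC")
N_RANGE = range(6, 13)  # n = 6-12 inclusive


def kozak_variants(n_values=N_RANGE, starts=START_CODONS, serines=SERINE_CODONS):
    """Yield (n, start, serine, sequence) for every Z2-Z3-Z4 combination."""
    for n, start, ser in product(n_values, starts, serines):
        # Z2 = oligo(A)_n, Z3 = initiation codon, Z4 = serine codon
        yield n, start, ser, "A" * n + start + ser


if __name__ == "__main__":
    variants = list(kozak_variants())
    print(len(variants))  # number of n x start x serine combinations
    # The preferred choices named in the text: ATG, TCT and n = 8-10.
    for n, start, ser, seq in variants:
        if start == "ATG" and ser == "TCT" and 8 <= n <= 10:
            print(n, seq)
```

Keeping the ranges as module-level constants makes it easy to narrow the enumeration to the preferred sub-ranges without touching the generator.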

A second aspect of the present invention provides a nucleic acid construct having the structure of Formula II from 5' to 3':

Z1-Z2-Z3-Z4-Z5 (II)

In the formula,

Z1, Z2, Z3, Z4 and Z5 are each an element constituting the construct;

Each "-" is independently a bond or a nucleotide linker sequence;

Z2 is an oligomeric chain of adenine deoxyribonucleotides, [oligo(A)]_n;

Z3 is a translation initiation codon;

Z4 is a serine codon;

Z5 is the coding sequence of a foreign protein.

In another preferred example, the coding sequence of the foreign protein is from a prokaryote or a eukaryote.

In another preferred example, the coding sequence of the foreign protein is from an animal, a plant or a pathogen.

In another preferred example, the coding sequence of the foreign protein is from a mammal, preferably a primate or a rodent, including human, mouse and rat.

In another preferred example, the coding sequence of the foreign protein is selected from the group: exogenous DNA encoding a luciferin protein or a luciferase (such as firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl-tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, or a variable region of an antibody; DNA of a luciferase mutant; or a combination thereof.

In another preferred example, the foreign protein is selected from the group: α-amylase, enterocin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon αA, interleukin-1β, lysozyme, serum albumin, single-chain antibody fragment (scFv), transthyretin, tyrosinase, xylanase, or a combination thereof.

In another preferred example, the nucleotide sequence of the coding sequence of the foreign protein is as shown in SEQ ID NO.: 6-12.

A third aspect of the present invention provides a vector or a vector combination containing the nucleic acid construct of the first or second aspect of the invention.

A fourth aspect of the present invention provides a genetically engineered cell, wherein the construct of the first or second aspect of the invention is integrated at one or more sites in the genome of the genetically engineered cell, or the genetically engineered cell contains the vector or vector combination of the third aspect of the invention.

In another preferred example, the genetically engineered cell is a yeast cell.

In another preferred example, the yeast cell is from a Kluyveromyces yeast.

A fifth aspect of the present invention provides a kit, the reagents included in the kit being one or more selected from the group:

(a) the construct of the first or second aspect of the invention;

(b) the vector or vector combination of the third aspect of the invention; and

(c) the genetically engineered cell of the fourth aspect of the invention.

In another preferred example, the kit further includes (d) a yeast in vitro protein synthesis system.

In another preferred example, the yeast in vitro protein synthesis system is a Kluyveromyces in vitro protein synthesis system (preferably a Kluyveromyces lactis in vitro protein synthesis system).

A sixth aspect of the present invention provides use of the construct of the first or second aspect of the invention, the vector or vector combination of the third aspect, the genetically engineered cell of the fourth aspect, or the kit of the fifth aspect, for high-throughput in vitro protein synthesis.

A seventh aspect of the present invention provides a method for high-throughput in vitro synthesis of a foreign protein, comprising the steps of:

(i) providing the nucleic acid construct of the second aspect of the invention in the presence of a yeast in vitro protein synthesis system; and

(ii) incubating the yeast in vitro protein synthesis system of step (i) under suitable conditions for a period of time T1, thereby synthesizing the foreign protein.

In another preferred example, the method further comprises (iii) optionally isolating or detecting the foreign protein from the yeast in vitro protein synthesis system.

In another preferred example, in step (ii), the reaction temperature is 20-37 °C, preferably 22-35 °C.

In another preferred example, in step (ii), the reaction time is 1-10 h, preferably 2-8 h.
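The temperature and time ranges given for step (ii) can be checked mechanically when planning an incubation. The sketch below is illustrative only: the class name `IncubationConditions` and the labels "preferred", "broad" and "outside" are hypothetical, while the numeric limits are the ones stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncubationConditions:
    """Conditions for step (ii); T1 is expressed in hours."""
    temperature_c: float
    time_h: float

    def classify(self) -> str:
        """Return 'preferred', 'broad' or 'outside' relative to the stated ranges."""
        in_broad = 20 <= self.temperature_c <= 37 and 1 <= self.time_h <= 10
        in_preferred = 22 <= self.temperature_c <= 35 and 2 <= self.time_h <= 8
        if in_preferred:
            return "preferred"
        if in_broad:
            return "broad"
        return "outside"


if __name__ == "__main__":
    print(IncubationConditions(temperature_c=30, time_h=4).classify())
```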

It should be understood that, within the scope of the present invention, the technical features described above and those specifically described below (e.g., in the examples) may be combined with one another to form new or preferred technical solutions. For reasons of space, these are not enumerated here.

Description of the drawings

Fig. 1 shows the relative light unit (RLU) values of the Fluc protein synthesized in the yeast in vitro protein synthesis system from the three DNA fragments Ω-10A/8A/6A-Fluc, which contain the Kluyveromyces lactis-specific Kozak sequence, and from the Ω-Fluc DNA fragment, which lacks the Kozak sequence, together with the RLU value of the yeast in vitro protein synthesis system to which no DNA fragment was added.

Specific embodiments

After extensive and in-depth research, and through extensive screening and exploration, the inventors have unexpectedly discovered, for the first time, a novel DNA element that significantly enhances protein translation efficiency. The DNA element of the invention comprises an Ω sequence, an adenine deoxyribonucleotide oligomeric chain, a translation initiation codon and a serine codon, the latter three elements constituting a Kozak sequence specific to Kluyveromyces (such as Kluyveromyces lactis). When the DNA element of the invention is used in a yeast in vitro protein synthesis system, the relative light unit value of the synthesized luciferase activity is enhanced about 22.3-fold relative to a DNA element containing only the Ω sequence. The inventors completed the present invention on this basis.

Yeast in vitro protein synthesis system

Yeast combines the advantages of simple cultivation with efficient protein folding and post-translational modification. Among yeasts, Saccharomyces cerevisiae and Pichia pastoris are model organisms for expressing complex eukaryotic proteins and membrane proteins; yeast can also serve as the raw material for preparing in vitro translation systems.

Kluyveromyces is a genus of ascosporogenous yeasts, of which Kluyveromyces marxianus and Kluyveromyces lactis are widely used in industry^[5]. Compared with other yeasts, Kluyveromyces lactis has many advantages, such as a very high secretion capacity, good large-scale fermentation characteristics, food-grade safety, and the ability to carry out post-translational modification of proteins^[5].

In the present invention, the yeast in vitro protein synthesis system is not particularly limited; a preferred yeast in vitro protein synthesis system is a Kluyveromyces expression system (more preferably, a Kluyveromyces lactis expression system).

In the present invention, the yeast in vitro protein synthesis system comprises:

(a) yeast cell extract;

(b) polyethylene glycol;

(c) optionally, exogenous sucrose; and

(d) optionally, a solvent, the solvent being water or an aqueous solvent.

In a particularly preferred embodiment, the in vitro protein synthesis system provided by the invention comprises: yeast cell extract, 4-hydroxyethylpiperazine ethanesulfonic acid, potassium acetate, magnesium acetate, adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP), thymidine triphosphate (TTP), an amino acid mixture, phosphocreatine, dithiothreitol (DTT), creatine phosphokinase, RNase inhibitor, luciferin, luciferase DNA, and RNA polymerase.

In the present invention, the RNA polymerase is not particularly limited and may be one or more RNA polymerases; a typical RNA polymerase is T7 RNA polymerase.

In the present invention, the proportion of yeast cell extract in the in vitro protein synthesis system is not particularly limited; usually the yeast cell extract accounts for 20-70% of the in vitro protein synthesis system, preferably 30-60%, more preferably 40-50%.

In the present invention, the yeast cell extract contains no intact cells. A typical yeast cell extract includes ribosomes for protein translation, transfer RNAs, aminoacyl-tRNA synthetases, the initiation and elongation factors required for protein synthesis, and termination release factors. In addition, the yeast extract also contains other proteins, in particular soluble proteins, from the cytoplasm of the yeast cells.

In the present invention, the protein content of the yeast cell extract is 20-100 mg/ml, preferably 50-100 mg/ml. The protein content is measured by the Coomassie brilliant blue assay.

In the present invention, the method of preparing the yeast cell extract is not limited; a preferred preparation method comprises the following steps:

(i) providing yeast cells;

(ii) washing the yeast cells to obtain washed yeast cells;

(iii) disrupting the washed yeast cells to obtain a crude yeast extract; and

(iv) subjecting the crude yeast extract to solid-liquid separation and collecting the liquid fraction, which is the yeast cell extract.

In the present invention, the solid-liquid separation method is not particularly limited; a preferred method is centrifugation.

In a preferred embodiment, the centrifugation is carried out in the liquid state.

In the present invention, the centrifugation conditions are not particularly limited; a preferred centrifugation condition is 5000-100000 × g, preferably 8000-30000 × g.
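Because the centrifugation condition is given as relative centrifugal force (× g) while many centrifuges are set in rpm, a conversion can be useful. The sketch below uses the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r the rotor radius in centimetres; the rotor radius is not given in the text and is an assumption that must be supplied for the rotor actually used.

```python
import math

# Standard conversion factor for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be non-negative and radius_cm positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 8.0  # hypothetical rotor radius; not taken from the text
    for rcf in (8000, 30000):  # preferred range stated in the text
        print(rcf, round(rpm_from_rcf(rcf, radius_cm)))
```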

In the present invention, the centrifugation time is not particularly limited; a preferred centrifugation time is 0.5 min-2 h, preferably 20 min-50 min.

In the present invention, the centrifugation temperature is not particularly limited; preferably, the centrifugation is carried out at 1-10 °C, more preferably at 2-6 °C.

In the present invention, the washing treatment is not particularly limited; a preferred washing treatment uses a washing solution at pH 7-8 (preferably 7.4). The washing solution is not particularly limited; a typical washing solution is selected from the group: potassium 4-hydroxyethylpiperazine ethanesulfonate, potassium acetate, magnesium acetate, or a combination thereof.

In the present invention, the cell disruption treatment is not particularly limited; a preferred cell disruption treatment includes high-pressure disruption and freeze-thaw (e.g., liquid nitrogen low-temperature) disruption.

The nucleoside triphosphate mixture in the in vitro protein synthesis system consists of adenosine triphosphate, guanosine triphosphate, cytidine triphosphate and uridine triphosphate. In the present invention, the concentration of each mononucleotide is not particularly limited; usually the concentration of each mononucleotide is 0.5-5 mM, preferably 1.0-2.0 mM.

The amino acid mixture in the in vitro protein synthesis system may include natural or unnatural amino acids, and may include D-type or L-type amino acids. Representative amino acids include (but are not limited to) the 20 natural amino acids: glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine and histidine. The concentration of each amino acid is usually 0.01-0.5 mM, preferably 0.02-0.2 mM, such as 0.05, 0.06, 0.07 or 0.08 mM.

In a preferred example, the in vitro protein synthesis system further contains polyethylene glycol or an analog thereof. The concentration of polyethylene glycol or its analog is not particularly limited; in general, the concentration (w/v) is 0.1-8%, preferably 0.5-4%, more preferably 1-2%, based on the total weight of the protein synthesis system. Representative examples of PEG include (but are not limited to): PEG3000, PEG8000, PEG6000 and PEG3350. It should be understood that the system of the invention may also include polyethylene glycols of various other molecular weights (such as PEG200, 400, 1500, 2000, 4000, 6000, 8000, 10000).

In a preferred example, the in vitro protein synthesis system further contains sucrose. The concentration of sucrose is not particularly limited; in general, the concentration of sucrose is 0.03-40 wt%, preferably 0.08-10 wt%, more preferably 0.1-5 wt%, based on the total weight of the protein synthesis system.

A particularly preferred in vitro protein synthesis system contains, in addition to the yeast extract, the following components: 22 mM 4-hydroxyethylpiperazine ethanesulfonic acid (pH 7.4), 30-150 mM potassium acetate, 1.0-5.0 mM magnesium acetate, 1.5-4 mM nucleoside triphosphate mixture, 0.08-0.24 mM amino acid mixture, 25 mM phosphocreatine, 1.7 mM dithiothreitol, 0.27 mg/mL creatine phosphokinase, 1%-4% polyethylene glycol, 0.5%-2% sucrose, 8-20 ng/μl firefly luciferase DNA, and 0.027-0.054 mg/mL T7 RNA polymerase.
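Setting up such a reaction requires converting final concentrations into pipetting volumes. The following sketch applies the dilution relation C1·V1 = C2·V2 to the three components whose final concentration is given above as a single value. The stock concentrations and the 30 μl reaction volume are hypothetical assumptions for illustration; they do not come from the text.

```python
def stock_volume_ul(final_conc: float, stock_conc: float, reaction_ul: float) -> float:
    """Volume of stock to add, from C1*V1 = C2*V2 (both concentrations in the same unit)."""
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * reaction_ul / stock_conc


# Final concentrations (mM) stated in the text as single values.
FINAL_MM = {
    "4-hydroxyethylpiperazine ethanesulfonic acid, pH 7.4": 22.0,
    "phosphocreatine": 25.0,
    "dithiothreitol": 1.7,
}

# Stock concentrations (mM): hypothetical, chosen only for illustration.
STOCK_MM = {
    "4-hydroxyethylpiperazine ethanesulfonic acid, pH 7.4": 1000.0,
    "phosphocreatine": 500.0,
    "dithiothreitol": 100.0,
}


def pipetting_plan(reaction_ul: float) -> dict:
    """Map each component to the stock volume (ul) needed for one reaction."""
    return {
        name: stock_volume_ul(FINAL_MM[name], STOCK_MM[name], reaction_ul)
        for name in FINAL_MM
    }


if __name__ == "__main__":
    for name, vol in pipetting_plan(30.0).items():  # 30 ul is an assumed volume
        print(f"{name}: {vol:.2f} ul")
```

Components given as ranges (for example potassium acetate or magnesium acetate) would need a chosen working value before they can be added to the table.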

Coding sequence of the foreign protein (exogenous DNA)

As used herein, the terms "coding sequence of a foreign protein" and "exogenous DNA" are used interchangeably and both refer to an exogenous DNA molecule used to direct protein synthesis. In general, the DNA molecule is linear or circular. The DNA molecule contains a sequence encoding the foreign protein.

In the present invention, examples of the sequence encoding the foreign protein include (but are not limited to): a genomic sequence and a cDNA sequence. The sequence encoding the foreign protein also contains a promoter sequence, a 5' untranslated sequence and a 3' untranslated sequence.

In the present invention, the choice of exogenous DNA is not particularly limited; in general, the exogenous DNA is selected from the group: exogenous DNA encoding a luciferin protein or a luciferase (such as firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl-tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, or a variable region of an antibody; DNA of a luciferase mutant; or a combination thereof.

The exogenous DNA may also be selected from the group: exogenous DNA encoding α-amylase, enterocin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon αA, interleukin-1β, lysozyme, serum albumin, single-chain antibody fragment (scFv), transthyretin, tyrosinase or xylanase, or a combination thereof.

A representative exogenous DNA has a sequence selected from: SEQ ID NO.: 6-12.

In a preferred embodiment, the exogenous DNA encodes a protein selected from the group: green fluorescent protein (enhanced GFP, eGFP), yellow fluorescent protein (YFP), Escherichia coli β-galactosidase (LacZ), human lysine-tRNA synthetase, human leucine-tRNA synthetase, Arabidopsis glyceraldehyde-3-phosphate dehydrogenase, mouse catalase, or a combination thereof.

In a preferred embodiment, the nucleotide sequence of the green fluorescent protein is as shown in SEQ ID NO.: 6; the nucleotide sequence of the yellow fluorescent protein is as shown in SEQ ID NO.: 7; the nucleotide sequence of the Escherichia coli β-galactosidase is as shown in SEQ ID NO.: 8; the nucleotide sequence of the human lysine-tRNA synthetase is as shown in SEQ ID NO.: 9; the nucleotide sequence of the human leucine-tRNA synthetase is as shown in SEQ ID NO.: 10; the nucleotide sequence of the Arabidopsis glyceraldehyde-3-phosphate dehydrogenase is as shown in SEQ ID NO.: 11; and the nucleotide sequence of the mouse catalase is as shown in SEQ ID NO.: 12.

Ω sequence

As used herein, the term "Ω sequence" refers to the 5' leader sequence of the tobacco mosaic virus genome, which is a translational enhancer of this virus. The DNA sequence of Ω contains 68 base pairs and is composed of 1-6 (preferably 2-4, more preferably 3) 8-base-pair direct repeat modules (ACAATTAC) and 1-5 (preferably 1-3, more preferably 1) (CAA)_p modules, where p is 6-12, preferably 8-10. These two modules are critical to the translation-enhancing function of the Ω sequence. In the yeast in vitro protein synthesis system of the invention, the Ω sequence can initiate "cap"-independent protein translation, a function that may be achieved by recruiting the translation initiation factor eIF4G. However, the efficiency with which the Ω sequence initiates protein translation is relatively low, so its composition needs to be optimized and combined with other DNA elements or proteins to enhance protein translation efficiency.
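The two module types described above can be located in a candidate sequence with simple pattern matching. The sketch below counts (ACAATTAC) direct repeats and (CAA)_p runs with p = 6-12; the function name `count_omega_modules` is hypothetical, and the test string is a synthetic toy built from the module definitions, not the actual 68-bp Ω sequence or any SEQ ID NO.

```python
import re

DIRECT_REPEAT = "ACAATTAC"
# A (CAA)_p module with p = 6-12, as stated in the text.
CAA_RUN = re.compile(r"(?:CAA){6,12}")


def count_omega_modules(seq: str):
    """Return (number of ACAATTAC repeats, list of p values for each (CAA)_p run)."""
    seq = seq.upper()
    n_direct = len(re.findall(DIRECT_REPEAT, seq))
    caa_runs = [len(m.group(0)) // 3 for m in CAA_RUN.finditer(seq)]
    return n_direct, caa_runs


if __name__ == "__main__":
    # Synthetic toy: three direct repeats followed by one (CAA)_8 run.
    toy = DIRECT_REPEAT * 3 + "CAA" * 8
    print(count_omega_modules(toy))
```

Such a check only reports whether a sequence falls within the stated m and p ranges; it says nothing about translation-enhancing activity.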

Kozak sequence

The consensus sequence identified by analyzing the sequences upstream and downstream of the translation initiation codon (AUG) in known eukaryotic mRNA molecules is called the Kozak sequence. The Kozak sequence has been shown to enhance the translation initiation efficiency of mRNA. Kozak sequences often differ between species; for example, the Kozak sequences of Saccharomyces cerevisiae cells and of mammalian cells differ significantly.

In the present invention, the Kozak sequence used comprises an oligomeric chain of 6-12 (preferably 8-10) adenine deoxyribonucleotides, a translation initiation codon (such as ATG, ATA, ATT, GTG or TTG, preferably ATG) and a serine codon (such as TCT, TCC, TCA, TCG, AGT or AGC, preferably TCT), and is derived from Kluyveromyces (preferably Kluyveromyces lactis).
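Whether a given DNA string has the composition just described can be tested with a regular expression. This is a minimal sketch assuming the three parts are directly adjacent with no linker; `parse_kozak` is a hypothetical helper name.

```python
import re

# oligo(A)_n with n = 6-12, then an initiation codon, then a serine codon.
KOZAK_RE = re.compile(
    r"(A{6,12})"
    r"(ATG|ATA|ATT|GTG|TTG)"
    r"(TCT|TCC|TCA|TCG|AGT|AGC)"
)


def parse_kozak(seq: str):
    """Return (n, initiation codon, serine codon), or None if seq does not fit."""
    m = KOZAK_RE.fullmatch(seq.upper())
    if m is None:
        return None
    return len(m.group(1)), m.group(2), m.group(3)


if __name__ == "__main__":
    print(parse_kozak("A" * 10 + "ATG" + "TCT"))
    print(parse_kozak("A" * 4 + "ATG" + "TCT"))
```

Because `fullmatch` is used and both codons are three bases long, n is fixed by the length of the input, so initiation codons that begin with A do not make the split ambiguous.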

DNA construct

The present invention provides a DNA construct containing a nucleic acid sequence having the structure of Formula I:

Z1-Z2-Z3-Z4 (I)

In the formula,

Each "-" is independently a bond or a nucleotide linker sequence;

Z2 is an oligomeric chain of adenine deoxyribonucleotides, [oligo(A)]_n;

Z3 is a translation initiation codon;

Z4 is a serine codon.

The present invention also provides a DNA construct having the structure of Formula II from 5' to 3':

Z1-Z2-Z3-Z4-Z5 (II)

In the formula,

Each "-" is independently a bond or a nucleotide linker sequence;

Z2 is an oligomeric chain of adenine deoxyribonucleotides, [oligo(A)]_n;

Z3 is a translation initiation codon;

Z4 is a serine codon;

Z5 is the coding sequence of a foreign protein.
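The 5' to 3' arrangement of Formulas I and II amounts to ordered concatenation of the elements. The sketch below illustrates this under simplifying assumptions: a single optional linker string is reused at every "-" (the text allows each "-" to be chosen independently), and the Z1 and Z5 arguments in the usage lines are placeholders rather than the real Ω sequence or a real coding sequence. The function name `assemble_construct` is hypothetical.

```python
def assemble_construct(z1: str, n: int, start: str = "ATG", serine: str = "TCT",
                       z5: str = "", linker: str = "") -> str:
    """Join the elements 5'->3' as Z1-Z2-Z3-Z4 (Formula I) or Z1-Z2-Z3-Z4-Z5 (Formula II)."""
    if not 6 <= n <= 12:
        raise ValueError("n is expected to be 6-12")
    parts = [z1, "A" * n, start, serine]  # Z1, Z2 = oligo(A)_n, Z3, Z4
    if z5:
        parts.append(z5)  # Formula II adds the foreign-protein coding sequence
    return linker.join(parts)


if __name__ == "__main__":
    omega_placeholder = "N" * 68  # stands in for Z1; not the real sequence
    formula_i = assemble_construct(omega_placeholder, n=10)
    formula_ii = assemble_construct(omega_placeholder, n=10, z5="GGTGGT")  # toy Z5
    print(len(formula_i), len(formula_ii))
```

With the default empty linker every "-" is a direct bond, which keeps Z3, Z4 and Z5 in one reading frame; a non-empty linker between those elements would need a length that preserves the frame.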

In the present invention, the choice of the coding sequence of the foreign protein is not particularly limited; in general, the coding sequence of the foreign protein is selected from the group: exogenous DNA encoding a luciferin protein or a luciferase (such as firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl-tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, or a variable region of an antibody; DNA of a luciferase mutant; or a combination thereof.

The foreign protein may also be selected from the group: α-amylase, enterocin A, hepatitis C virus E2 glycoprotein, insulin precursor, interferon αA, interleukin-1β, lysozyme, serum albumin, single-chain antibody fragment (scFv), transthyretin, tyrosinase, xylanase, or a combination thereof.

In addition, the nucleic acid construct of the invention may be linear or circular, may be single-stranded or double-stranded, and may be DNA, RNA or a DNA/RNA hybrid.

In another preferred example, the construct further includes an element selected from the following group, or a combination thereof: promoter, terminator, poly(A) element, transport element, gene targeting element, selection marker gene, enhancer, resistance gene, transposase-encoding gene.

A variety of selection marker genes can be used in the present invention, including but not limited to: auxotrophic markers, resistance markers and reporter gene markers. Selection markers facilitate the screening of recombinant cells (recombinants), allowing recipient cells to be clearly distinguished from untransformed cells. An auxotrophic marker works by complementation between the introduced marker gene and a mutated gene of the recipient cell, so that the recipient cell shows wild-type growth. A resistance marker refers to a resistance gene introduced into the recipient cell, which confers drug resistance on the recipient cell at a given drug concentration. In a preferred embodiment of the invention, a resistance marker is used to allow convenient screening of recombinant cells.

In the present invention, using the DNA construct of the invention in the yeast in vitro protein synthesis system of the invention significantly improves the efficiency of protein translation. Specifically, the relative light unit value of the luciferase activity synthesized using the DNA construct of the invention is enhanced about 22.3-fold relative to a DNA element containing only the Ω sequence.
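A fold enhancement of this kind is a ratio of relative light unit readings. The sketch below shows the arithmetic; the optional subtraction of a no-DNA blank is an assumption that the text does not state, and the readings in the usage line are made-up placeholders, not data from the invention.

```python
def fold_enhancement(rlu_test: float, rlu_reference: float, rlu_blank: float = 0.0) -> float:
    """Ratio of test to reference RLU, optionally after subtracting a blank reading."""
    denominator = rlu_reference - rlu_blank
    if denominator <= 0:
        raise ValueError("reference RLU must exceed the blank RLU")
    return (rlu_test - rlu_blank) / denominator


if __name__ == "__main__":
    # Placeholder readings for illustration only.
    print(fold_enhancement(rlu_test=500.0, rlu_reference=100.0))
```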

Vectors and genetically engineered cells

The present invention also provides a vector or a vector combination containing the DNA construct of the invention. Preferably, the vector is selected from: a bacterial plasmid, a bacteriophage, a yeast plasmid, an animal cell vector, or a shuttle vector; the vector may be a transposon vector. Methods for preparing recombinant vectors are well known to those of ordinary skill in the art. Any plasmid or vector may be used as long as it can replicate and remain stable in the host.

Those of ordinary skill in the art can use well-known methods to construct expression vectors containing the promoter and/or target gene sequence of the invention. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombination techniques and the like.

The present invention also provides a genetically engineered cell that contains the construct, the vector or the vector combination, or that has the construct or vector integrated into its chromosome. In another preferred example, the genetically engineered cell further contains a vector carrying a transposase-encoding gene, or has a transposase gene integrated into its chromosome.

Preferably, the genetically engineered cell is a eukaryotic cell.

In another preferred example, the eukaryotic cell includes, but is not limited to: a yeast cell (preferably a Kluyveromyces cell, more preferably a Kluyveromyces lactis cell).

The construct or vector of the invention can be used to transform an appropriate genetically engineered cell. The genetically engineered cell may be a prokaryotic cell, such as Escherichia coli, Streptomyces or Agrobacterium; a lower eukaryotic cell, such as a yeast cell; or a higher animal cell, such as an insect cell. Those of ordinary skill in the art know how to select appropriate vectors and genetically engineered cells. Transformation of the genetically engineered cell with recombinant DNA can be carried out by conventional techniques well known to those skilled in the art. When the host is a prokaryote (such as Escherichia coli), the CaCl₂ method or electroporation can be used. When the host is a eukaryote, the following DNA transfection methods can be used: calcium phosphate coprecipitation, or conventional mechanical methods (such as microinjection, electroporation or liposome packaging). Plants can be transformed by methods such as Agrobacterium-mediated transformation or particle bombardment, for example the leaf disc method, the immature embryo transformation method or the flower bud soaking method.

High-throughput in vitro protein synthesis method

The present invention provides a high-throughput in vitro protein synthesis method, comprising the steps of:

(i) in the presence of a yeast in vitro protein synthesis system, providing the nucleic acid construct of the first aspect of the invention and adding an exogenous DNA molecule for directing protein synthesis; or providing the nucleic acid construct of the second aspect of the invention; and

(ii) incubating the yeast in vitro protein synthesis system of step (i) under suitable conditions for a period of time T1, thereby synthesizing the protein encoded by the exogenous DNA or the foreign protein.

Main advantages of the present invention include:

(1) The present invention discloses for the first time that using a nucleic acid construct comprising the Ω sequence and a Kozak sequence derived from Kluyveromyces (preferably Kluyveromyces lactis) in the yeast in vitro protein synthesis system of the invention significantly improves the efficiency of protein translation.

(2) The present invention discloses for the first time that, compared with a DNA element containing only the Ω sequence, the relative light unit value of the luciferase activity synthesized using the DNA construct of the invention is enhanced about 22.3-fold.

(3) The present invention discloses for the first time that, in promoting protein synthesis, Ω-10A is superior to Ω-8A and Ω-6A.

(4) The novel DNA element Ω-10A of the invention, which incorporates a Kozak sequence specific to Kluyveromyces lactis, significantly increases the efficiency of protein synthesis by the yeast in vitro synthesis system. Compared with the Ω-Fluc DNA fragment lacking the Kozak sequence, Ω-10A-Fluc increases the amount of protein synthesized 22.3-fold.

(5) By analyzing the sequences upstream and downstream of the translation initiation codon AUG in Kluyveromyces lactis mRNAs, the inventors found that the sequence on the 5' side is enriched in A and U, and that a DNA fragment containing an oligomeric chain of ten adenine nucleotides (10A) enhances protein synthesis more efficiently than DNA fragments containing 8A or 6A. The above analysis and experimental results show that the novel DNA element of the invention has potential for further optimization.

(6) The Ω sequence in the novel DNA element Ω-10A of the invention is commonly used in in vitro protein synthesis systems to initiate protein synthesis. Through an understanding of the components of the Ω sequence, the inventors have identified the modules that may confer the ability to initiate protein translation. The sequence and number of these modules, and the order in which different modules are arranged, can subsequently be further optimized, with the expectation of further strengthening the ability of the element to initiate protein synthesis.

The present invention is further described below with reference to specific examples. It should be understood that these examples are intended only to illustrate the invention and not to limit its scope. Experimental methods in the following examples for which specific conditions are not indicated generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are percentages by weight and parts by weight.

Unless otherwise specified, the materials and reagents used in the examples of the present invention are commercially available products.

Example 1: Analysis of the Kluyveromyces lactis-specific Kozak sequence

1.1 Determination of the Kluyveromyces lactis-specific Kozak sequence: Following the method used to analyze the Saccharomyces cerevisiae-specific Kozak sequence, the mRNA sequences of 107 Kluyveromyces lactis genes were retrieved and collected. For these mRNAs, the 60 bases on the 5' side of the translation initiation codon AUG and the one codon immediately downstream on its 3' side were analyzed for A, U, C and G content. Unexpectedly, adenine (A) was found to predominate at the positions 5' of the AUG, and the serine codon UCU was found to predominate on the 3' side of the AUG. The Kluyveromyces lactis-specific Kozak sequence was thus preliminarily determined and named the kl-Kozak sequence.
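The per-position base-content analysis described in 1.1 can be illustrated with a minimal Python sketch. This is not the inventors' actual analysis pipeline: the function names and the four short toy windows below are hypothetical, and a real analysis would use the 60-base upstream windows of the 107 collected mRNAs.

```python
from collections import Counter

def position_frequencies(sequences):
    """Return, for each aligned position, the fraction of A, U, C and G."""
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    table = []
    for column in zip(*sequences):
        counts = Counter(column)
        table.append({b: counts.get(b, 0) / len(sequences) for b in "AUCG"})
    return table

def consensus(sequences):
    """Most frequent base per position (ties resolved in A, U, C, G order)."""
    return "".join(
        max("AUCG", key=lambda b: col[b])
        for col in position_frequencies(sequences)
    )

# Hypothetical toy windows: 6 nt upstream of AUG, the AUG, and the next codon.
toy_windows = [
    "AAAAAAAUGUCU",
    "AAUAAAAUGUCU",
    "UAAAAAAUGGCU",
    "AAAACAAUGUCA",
]

if __name__ == "__main__":
    print(consensus(toy_windows))
```

With real data, the table returned by position_frequencies would show at which upstream positions A predominates and which codon is most common after the AUG.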

1.2 Construction of yeast in vitro protein synthesis system plasmids containing the Kluyveromyces lactis-specific kl-Kozak sequence: In the original yeast in vitro protein synthesis system plasmid (a modified pET21a plasmid obtained from Kang Code (Shanghai) Biological Technology Co., Ltd.), the Kluyveromyces lactis-specific kl-Kozak sequence was added between the Ω sequence and the sequence of the reporter gene firefly luciferase (Fluc). Specifically, an oligomer chain containing a varying number of adenine deoxynucleotides (10A, 8A or 6A) was inserted at the 5' end of the Fluc initiation codon, and the serine codon TCT was inserted between the Fluc initiation codon and the second codon. This ultimately yielded the plasmids pKL-Ω-10A-Fluc, pKL-Ω-8A-Fluc and pKL-Ω-6A-Fluc, which contain the Kluyveromyces lactis-specific kl-Kozak sequence.
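As a small illustration of the element described in 1.2, the following Python sketch assembles the poly(A) run, the initiation codon and the serine codon for the three variants. The function names are hypothetical; the only fact taken from the document is that the 10A variant, written as RNA, corresponds to the 16-nt kl-Kozak sequence of SEQ ID NO.:5.

```python
def kl_kozak_dna(n_adenines: int) -> str:
    """DNA form of the element: poly(A) run, start codon ATG, serine codon TCT."""
    if n_adenines < 0:
        raise ValueError("n_adenines must be non-negative")
    return "A" * n_adenines + "ATG" + "TCT"

def to_rna(dna: str) -> str:
    """Write a DNA coding-strand sequence as the corresponding RNA."""
    return dna.upper().replace("T", "U")

if __name__ == "__main__":
    for n in (10, 8, 6):
        dna = kl_kozak_dna(n)
        print(f"Omega-{n}A element: DNA {dna}, RNA {to_rna(dna)}")
```

The sketch only models the inserted element itself; it does not reproduce the flanking Ω or Fluc sequences of the plasmids.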

The specific procedure was as follows. Three primer pairs were used:

O6A-605_f (CAACAATTACCAACAACAACAAACAACAAACAACATTACAATTACTATTTACAATTACAAAAAAAATGTCTGAAGACGCCAAAAACATAAAGAAAGGCC) (SEQ ID NO.:13) and OK-60_b (TTTTTTTTGTAATTGTAAATAGTAATTGTAATGTTGTTTGTTGTTTGTTGTTG) (SEQ ID NO.:14);

O8A-605_f (CAATTACCAACAACAACAAACAACAAACAACATTACAATTACTATTTACAATTACAAAAAAAAAATGTCTGAAGACGCCAAAAACATAAAGAAAGGCC) (SEQ ID NO.:15) and OK-60_b;

O10A-605_f (AATTACCAACAACAACAAACAACAAACAACATTACAATTACTATTTACAATTACAAAAAAAAAAAATGTCTGAAGACGCCAAAAACATAAAGAAAGGCC) (SEQ ID NO.:16) and OK-60_b.

Plasmid pKL-Ω-Fluc (SEQ ID NO.:4) was amplified by PCR with each primer pair. 1 μL of DpnI was added to 20 μL of amplification product, and the mixture was incubated at 37 °C for 6 h. 4 μL of the DpnI-treated product was added to 50 μL of DH5α competent cells, which were kept on ice for 30 min, heat-shocked at 42 °C for 45 s, and returned to ice for 3 min. 200 μL of LB liquid medium was then added, and the cells were cultured with shaking at 37 °C for 4 h, spread on LB solid medium containing Amp antibiotic, and cultured overnight. Six single colonies were picked and expanded in culture. After sequencing confirmed that the constructs were correct, the plasmids were extracted and stored, and were named pKL-Ω-10A-Fluc (SEQ ID NO.:1), pKL-Ω-8A-Fluc (SEQ ID NO.:2) and pKL-Ω-6A-Fluc (SEQ ID NO.:3).

Example 2: Application of the kl-Kozak sequence in a yeast in vitro protein synthesis system

2.1 Using PCR with primers T7_pET21a_F: CGCGAAATTAATACGACTCACTATAGG (SEQ ID NO.:17) and T7ter_pET21a_R: TCCGGATATAGTTCCTCCTTTCAG (SEQ ID NO.:18), the Ω-10A/8A/6A-Fluc fragments and the Ω-Fluc fragment located between the T7 transcription initiation sequence and the termination sequence in plasmids pKL-Ω-10A/8A/6A-Fluc and pKL-Ω-Fluc were amplified.
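The fragment that such a primer pair is expected to produce can be predicted with a simple in silico PCR. The Python sketch below is an illustration only: it assumes exact primer matches on a linear template, the helper names are hypothetical, and the toy template is invented for the demonstration (it is not one of the pKL plasmids). Only the two primer sequences are taken from the text.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def in_silico_pcr(template: str, forward: str, reverse: str) -> str:
    """Return the amplicon bounded by the two primers (exact matches only)."""
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start < 0:
        raise ValueError("forward primer site not found")
    # The reverse primer anneals to the top strand's reverse complement,
    # so its site on the top strand is the primer's reverse complement.
    rev_site = reverse_complement(reverse.upper())
    end = template.find(rev_site, start + len(forward))
    if end < 0:
        raise ValueError("reverse primer site not found downstream")
    return template[start:end + len(rev_site)]

T7_F = "CGCGAAATTAATACGACTCACTATAGG"  # T7_pET21a_F (SEQ ID NO.:17)
T7_R = "TCCGGATATAGTTCCTCCTTTCAG"     # T7ter_pET21a_R (SEQ ID NO.:18)

# Hypothetical toy template: flank + forward site + insert + reverse site + flank.
toy_insert = "GGTATTTTTACAACAATTACC"
toy_template = "ttgacc" + T7_F + toy_insert + reverse_complement(T7_R) + "ggatcc"

if __name__ == "__main__":
    amplicon = in_silico_pcr(toy_template, T7_F, T7_R)
    print(len(amplicon), amplicon)
```

Replacing toy_template with a full plasmid sequence would return the predicted linear fragment, provided both primer sites are present exactly once in the expected orientation.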

The amplified DNA fragments were purified and concentrated by ethanol precipitation: 1/10 volume of 3 M sodium acetate (pH 5.2) was added to the PCR product, followed by 2.5-3 volumes of 95% ethanol (calculated on the volume after addition of sodium acetate), and the mixture was incubated on ice for 15 min. It was then centrifuged at room temperature at more than 14,000 g for 30 min, and the supernatant was discarded. The pellet was washed with 70% ethanol and centrifuged again for 15 min, the supernatant was discarded, the pellet was dissolved in ultrapure water, and the DNA concentration was measured.
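The reagent volumes for the precipitation step follow directly from the stated ratios. The Python sketch below works them out for a given PCR product volume; the function name and the 100 μL example volume are assumptions made for illustration.

```python
def ethanol_precipitation_volumes(pcr_volume_ul: float, ethanol_fold: float = 2.5) -> dict:
    """Volumes (in uL) of 3 M sodium acetate and 95% ethanol to add.

    Sodium acetate is 1/10 of the PCR product volume; ethanol is
    ethanol_fold (2.5 to 3) times the volume after adding sodium acetate.
    """
    if pcr_volume_ul <= 0:
        raise ValueError("pcr_volume_ul must be positive")
    if not 2.5 <= ethanol_fold <= 3.0:
        raise ValueError("ethanol_fold must be between 2.5 and 3")
    sodium_acetate_ul = pcr_volume_ul / 10
    ethanol_ul = ethanol_fold * (pcr_volume_ul + sodium_acetate_ul)
    return {"sodium_acetate_ul": sodium_acetate_ul, "ethanol_ul": ethanol_ul}

if __name__ == "__main__":
    # Hypothetical example: 100 uL of PCR product.
    for fold in (2.5, 3.0):
        print(fold, ethanol_precipitation_volumes(100.0, fold))
```

For 100 μL of PCR product this gives 10 μL of sodium acetate and between 275 μL and 330 μL of 95% ethanol.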

2.2 The purified DNA fragments were added, according to the instructions for use, to the self-made Kluyveromyces lactis in vitro protein synthesis system. The reaction system was placed at 25-30 °C and incubated without shaking for about 2-6 h. After the reaction, an equal volume of the Fluc substrate luciferin was added in a 96-well or 384-well white plate, and the plate was immediately placed in an Envision 2120 multifunctional microplate reader (Perkin Elmer) and read to detect Fluc activity, with the relative light unit (RLU) value serving as the unit of activity, as shown in Figure 1.

2.3 The DNA fragment without the kl-Kozak sequence (Ω-Fluc) was used as the control, three independent experiments were set up for each sample, and a reaction containing no DNA fragment was set up as the negative control.

Experimental results

1. The Kluyveromyces lactis-specific kl-Kozak sequence

At the positions upstream (5') of the initiation codon AUG in the mRNAs of the 107 Kluyveromyces lactis genes, adenine nucleotide (A) predominated; at several positions the proportion of uridine nucleotide (U) was close to, but slightly lower than, that of A. Thus, as in Saccharomyces cerevisiae, the sequence 5' of the initiation codon in Kluyveromyces lactis mRNAs is enriched in A and U. At the three positions of the codon following the initiation codon AUG, the predominant bases were U, C and U in turn, which together form a serine codon. The finally determined Kluyveromyces lactis-specific kl-Kozak sequence is shown in SEQ ID NO.:5.

2. Application of the kl-Kozak sequence in the yeast in vitro protein synthesis system

As shown in Figure 1, among the three DNA fragments Ω-10A/8A/6A-Fluc containing the Kluyveromyces lactis-specific kl-Kozak sequence, the highest relative light unit value emitted by the Fluc protein synthesized in the yeast in vitro protein synthesis system reached 2.72 × 10⁷, whereas the relative light unit value of the Fluc protein synthesized from the Ω-Fluc DNA fragment lacking the kl-Kozak sequence was only 1.22 × 10⁶. This shows that insertion of the kl-Kozak sequence effectively enhances the efficiency with which the yeast in vitro protein synthesis system synthesizes protein. Because the readings lie within the linear range of the instrument's relationship between RLU and protein concentration, and the relative light unit value of the most active fragment, Ω-10A-Fluc, is 22.3 times that of the Ω-Fluc fragment, Ω-10A enhances protein synthesis about 22.3-fold.

The DNA fragments containing adenosine oligomer chains of different lengths also differed in how efficiently they enhanced protein synthesis: the relative light unit values of the Fluc protein synthesized by the yeast in vitro protein synthesis system supplied with the Ω-10A-Fluc, Ω-8A-Fluc and Ω-6A-Fluc fragments were 2.72 × 10⁷, 1.52 × 10⁷ and 1.01 × 10⁷, respectively. These results show that, in promoting protein synthesis, Ω-10A is better than Ω-8A and Ω-6A. The yeast in vitro protein synthesis system without any added DNA fragment, used as the negative control, gave a relative light unit value of only 190.33.
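The fold-enhancement figures can be recomputed from the reported relative light unit values. The Python sketch below uses only the RLU values stated in the text; the dictionary keys and the function name are illustrative choices.

```python
# Relative light unit (RLU) values as reported in the text.
RLU = {
    "Omega-10A-Fluc": 2.72e7,
    "Omega-8A-Fluc": 1.52e7,
    "Omega-6A-Fluc": 1.01e7,
    "Omega-Fluc": 1.22e6,          # control without the kl-Kozak sequence
}
NEGATIVE_CONTROL_RLU = 190.33      # reaction without any DNA fragment

def fold_over_control(values: dict, control: str = "Omega-Fluc") -> dict:
    """Ratio of each sample's RLU to the RLU of the control fragment."""
    baseline = values[control]
    return {name: v / baseline for name, v in values.items() if name != control}

if __name__ == "__main__":
    for name, fold in fold_over_control(RLU).items():
        print(f"{name}: {fold:.1f}-fold over Omega-Fluc")
    # Signal relative to the no-DNA background, for orientation only.
    print(f"Omega-Fluc / negative control: {RLU['Omega-Fluc'] / NEGATIVE_CONTROL_RLU:.0f}")
```

Rounded to one decimal place, the ratios are about 22.3 for Ω-10A-Fluc, 12.5 for Ω-8A-Fluc and 8.3 for Ω-6A-Fluc, which agrees with the 22.3-fold figure given for Ω-10A.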

The results of the present invention show that the Kluyveromyces lactis-specific kl-Kozak sequence can significantly enhance the efficiency with which the yeast in vitro protein synthesis system produces protein: the relative light unit value of the Ω-10A-Fluc DNA fragment is 22.3 times that of Ω-Fluc, reaching 2.72 × 10⁷, showing that Ω-10A can enhance protein synthesis 22.3-fold.

All documents mentioned in the present invention are incorporated herein by reference, as if each document were individually incorporated by reference. In addition, it should be understood that, after reading the above teachings of the present invention, those skilled in the art may make various changes or modifications to the invention, and such equivalent forms likewise fall within the scope defined by the claims appended to this application.

References:

1. Dever, T.E., et al. (2016) Mechanism and Regulation of Protein Synthesis in Saccharomyces cerevisiae. Genetics 203, 65-107.

2. Kyrieleis, O.J., et al. (2014) Crystal structure of vaccinia virus mRNA capping enzyme provides insights into the mechanism and evolution of the capping apparatus. Structure 22, 452-465.

3. Katzen, F., et al. (2005) The past, present and future of cell-free protein synthesis. Trends in Biotechnology 23, 150-156.

4. Anastasina, M., et al. (2014) A technique to increase protein yield in a rabbit reticulocyte lysate translation system. BioTechniques 56, 36-39.

5. Spohner, S.C., et al. (2016) Kluyveromyces lactis: An emerging tool in biotechnology. Journal of Biotechnology 222, 104-116.

Sequence Listing

<110> Kang Code (Shanghai) Biological Technology Co., Ltd.

<120> A DNA element for high-throughput in vitro protein synthesis

<130> P2017-1191

<160> 18

<170> PatentIn version 3.5

<210> 1

<211> 1993

<212> DNA

<213> Artificial Sequence

<400> 1

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaaaaaatgt ctgaagacgc 120

caaaaacata aagaaaggcc cggcgccatt ctatcctcta gaggatggaa ccgctggaga 180

gcaactgcat aaggctatga agagatacgc cctggttcct ggaacaattg cttttacaga 240

tgcacatatc gaggtgaaca tcacgtacgc ggaatacttc gaaatgtccg ttcggttggc 300

agaagctatg aaacgatatg ggctgaatac aaatcacaga atcgtcgtat gcagtgaaaa 360

ctctcttcaa ttctttatgc cggtgttggg cgcgttattt atcggagttg cagttgcgcc 420

cgcgaacgac atttataatg aacgtgaatt gctcaacagt atgaacattt cgcagcctac 480

cgtagtgttt gtttccaaaa aggggttgca aaaaattttg aacgtgcaaa aaaaattacc 540

aataatccag aaaattatta tcatggattc taaaacggat taccagggat ttcagtcgat 600

gtacacgttc gtcacatctc atctacctcc cggttttaat gaatacgatt ttgtaccaga 660

gtcctttgat cgtgacaaaa caattgcact gataatgaat tcctctggat ctactgggtt 720

acctaagggt gtggcccttc cgcatagaac tgcctgcgtc agattctcgc atgccagaga 780

tcctattttt ggcaatcaaa tcattccgga tactgcgatt ttaagtgttg ttccattcca 840

tcacggtttt ggaatgttta ctacactcgg atatttgata tgtggatttc gagtcgtctt 900

aatgtataga tttgaagaag agctgttttt acgatccctt caggattaca aaattcaaag 960

tgcgttgcta gtaccaaccc tattttcatt cttcgccaaa agcactctga ttgacaaata 1020

cgatttatct aatttacacg aaattgcttc tgggggcgca cctctttcga aagaagtcgg 1080

ggaagcggtt gcaaaacgct tccatcttcc agggatacga caaggatatg ggctcactga 1140

gactacatca gctattctga ttacacccga gggggatgat aaaccgggcg cggtcggtaa 1200

agttgttcca ttttttgaag cgaaggttgt ggatctggat accgggaaaa cgctgggcgt 1260

taatcagaga ggcgaattat gtgtcagagg acctatgatt atgtccggtt atgtaaacaa 1320

tccggaagcg accaacgcct tgattgacaa ggatggatgg ctacattctg gagacatagc 1380

ttactgggac gaagacgaac acttcttcat agttgaccgc ttgaagtctt taattaaata 1440

caaaggatat caggtggccc ccgctgaatt ggaatcgata ttgttacaac accccaacat 1500

cttcgacgcg ggcgtggcag gtcttcccga cgatgacgcc ggtgaacttc ccgccgccgt 1560

tgttgttttg gagcacggaa agacgatgac ggaaaaagag atcgtggatt acgtcgccag 1620

tcaagtaaca accgcgaaaa agttgcgcgg aggagttgtg tttgtggacg aagtaccgaa 1680

aggtcttacc ggaaaactcg acgcaagaaa aatcagagag atcctcataa aggccaagaa 1740

gggcggaaag tccaaattgg tttaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1800

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaactcga 1860

gcaccaccac caccaccact gagatccggc tgctaacaaa gcccgaaagg aagctgagtt 1920

ggctgctgcc accgctgagc aataactagc ataacccctt ggggcctcta aacgggtctt 1980

gaggggtttt ttg 1993

<210> 2

<211> 1991

<212> DNA

<213> Artificial Sequence

<400> 2

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaaaatgtct gaagacgcca 120

aaaacataaa gaaaggcccg gcgccattct atcctctaga ggatggaacc gctggagagc 180

aactgcataa ggctatgaag agatacgccc tggttcctgg aacaattgct tttacagatg 240

cacatatcga ggtgaacatc acgtacgcgg aatacttcga aatgtccgtt cggttggcag 300

aagctatgaa acgatatggg ctgaatacaa atcacagaat cgtcgtatgc agtgaaaact 360

ctcttcaatt ctttatgccg gtgttgggcg cgttatttat cggagttgca gttgcgcccg 420

cgaacgacat ttataatgaa cgtgaattgc tcaacagtat gaacatttcg cagcctaccg 480

tagtgtttgt ttccaaaaag gggttgcaaa aaattttgaa cgtgcaaaaa aaattaccaa 540

taatccagaa aattattatc atggattcta aaacggatta ccagggattt cagtcgatgt 600

acacgttcgt cacatctcat ctacctcccg gttttaatga atacgatttt gtaccagagt 660

cctttgatcg tgacaaaaca attgcactga taatgaattc ctctggatct actgggttac 720

ctaagggtgt ggcccttccg catagaactg cctgcgtcag attctcgcat gccagagatc 780

ctatttttgg caatcaaatc attccggata ctgcgatttt aagtgttgtt ccattccatc 840

acggttttgg aatgtttact acactcggat atttgatatg tggatttcga gtcgtcttaa 900

tgtatagatt tgaagaagag ctgtttttac gatcccttca ggattacaaa attcaaagtg 960

cgttgctagt accaacccta ttttcattct tcgccaaaag cactctgatt gacaaatacg 1020

atttatctaa tttacacgaa attgcttctg ggggcgcacc tctttcgaaa gaagtcgggg 1080

aagcggttgc aaaacgcttc catcttccag ggatacgaca aggatatggg ctcactgaga 1140

ctacatcagc tattctgatt acacccgagg gggatgataa accgggcgcg gtcggtaaag 1200

ttgttccatt ttttgaagcg aaggttgtgg atctggatac cgggaaaacg ctgggcgtta 1260

atcagagagg cgaattatgt gtcagaggac ctatgattat gtccggttat gtaaacaatc 1320

cggaagcgac caacgccttg attgacaagg atggatggct acattctgga gacatagctt 1380

actgggacga agacgaacac ttcttcatag ttgaccgctt gaagtcttta attaaataca 1440

aaggatatca ggtggccccc gctgaattgg aatcgatatt gttacaacac cccaacatct 1500

tcgacgcggg cgtggcaggt cttcccgacg atgacgccgg tgaacttccc gccgccgttg 1560

ttgttttgga gcacggaaag acgatgacgg aaaaagagat cgtggattac gtcgccagtc 1620

aagtaacaac cgcgaaaaag ttgcgcggag gagttgtgtt tgtggacgaa gtaccgaaag 1680

gtcttaccgg aaaactcgac gcaagaaaaa tcagagagat cctcataaag gccaagaagg 1740

gcggaaagtc caaattggtt taaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1800

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaactcgagc 1860

accaccacca ccaccactga gatccggctg ctaacaaagc ccgaaaggaa gctgagttgg 1920

ctgctgccac cgctgagcaa taactagcat aaccccttgg ggcctctaaa cgggtcttga 1980

ggggtttttt g 1991

<210> 3

<211> 1989

<212> DNA

<213> Artificial Sequence

<400> 3

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaaaaa aaatgtctga agacgccaaa 120

aacataaaga aaggcccggc gccattctat cctctagagg atggaaccgc tggagagcaa 180

ctgcataagg ctatgaagag atacgccctg gttcctggaa caattgcttt tacagatgca 240

catatcgagg tgaacatcac gtacgcggaa tacttcgaaa tgtccgttcg gttggcagaa 300

gctatgaaac gatatgggct gaatacaaat cacagaatcg tcgtatgcag tgaaaactct 360

cttcaattct ttatgccggt gttgggcgcg ttatttatcg gagttgcagt tgcgcccgcg 420

aacgacattt ataatgaacg tgaattgctc aacagtatga acatttcgca gcctaccgta 480

gtgtttgttt ccaaaaaggg gttgcaaaaa attttgaacg tgcaaaaaaa attaccaata 540

atccagaaaa ttattatcat ggattctaaa acggattacc agggatttca gtcgatgtac 600

acgttcgtca catctcatct acctcccggt tttaatgaat acgattttgt accagagtcc 660

tttgatcgtg acaaaacaat tgcactgata atgaattcct ctggatctac tgggttacct 720

aagggtgtgg cccttccgca tagaactgcc tgcgtcagat tctcgcatgc cagagatcct 780

atttttggca atcaaatcat tccggatact gcgattttaa gtgttgttcc attccatcac 840

ggttttggaa tgtttactac actcggatat ttgatatgtg gatttcgagt cgtcttaatg 900

tatagatttg aagaagagct gtttttacga tcccttcagg attacaaaat tcaaagtgcg 960

ttgctagtac caaccctatt ttcattcttc gccaaaagca ctctgattga caaatacgat 1020

ttatctaatt tacacgaaat tgcttctggg ggcgcacctc tttcgaaaga agtcggggaa 1080

gcggttgcaa aacgcttcca tcttccaggg atacgacaag gatatgggct cactgagact 1140

acatcagcta ttctgattac acccgagggg gatgataaac cgggcgcggt cggtaaagtt 1200

gttccatttt ttgaagcgaa ggttgtggat ctggataccg ggaaaacgct gggcgttaat 1260

cagagaggcg aattatgtgt cagaggacct atgattatgt ccggttatgt aaacaatccg 1320

gaagcgacca acgccttgat tgacaaggat ggatggctac attctggaga catagcttac 1380

tgggacgaag acgaacactt cttcatagtt gaccgcttga agtctttaat taaatacaaa 1440

ggatatcagg tggcccccgc tgaattggaa tcgatattgt tacaacaccc caacatcttc 1500

gacgcgggcg tggcaggtct tcccgacgat gacgccggtg aacttcccgc cgccgttgtt 1560

gttttggagc acggaaagac gatgacggaa aaagagatcg tggattacgt cgccagtcaa 1620

gtaacaaccg cgaaaaagtt gcgcggagga gttgtgtttg tggacgaagt accgaaaggt 1680

cttaccggaa aactcgacgc aagaaaaatc agagagatcc tcataaaggc caagaagggc 1740

ggaaagtcca aattggttta aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1800

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa actcgagcac 1860

caccaccacc accactgaga tccggctgct aacaaagccc gaaaggaagc tgagttggct 1920

gctgccaccg ctgagcaata actagcataa ccccttgggg cctctaaacg ggtcttgagg 1980

ggttttttg 1989

<210> 4

<211> 1983

<212> DNA

<213> Artificial Sequence

<400> 4

cgcgaaatta atacgactca ctataggggt atttttacaa caattaccaa caacaacaaa 60

caacaaacaa cattacaatt actatttaca attacaatgt ctgaagacgc caaaaacata 120

aagaaaggcc cggcgccatt ctatcctcta gaggatggaa ccgctggaga gcaactgcat 180

aaggctatga agagatacgc cctggttcct ggaacaattg cttttacaga tgcacatatc 240

gaggtgaaca tcacgtacgc ggaatacttc gaaatgtccg ttcggttggc agaagctatg 300

aaacgatatg ggctgaatac aaatcacaga atcgtcgtat gcagtgaaaa ctctcttcaa 360

ttctttatgc cggtgttggg cgcgttattt atcggagttg cagttgcgcc cgcgaacgac 420

atttataatg aacgtgaatt gctcaacagt atgaacattt cgcagcctac cgtagtgttt 480

gtttccaaaa aggggttgca aaaaattttg aacgtgcaaa aaaaattacc aataatccag 540

aaaattatta tcatggattc taaaacggat taccagggat ttcagtcgat gtacacgttc 600

gtcacatctc atctacctcc cggttttaat gaatacgatt ttgtaccaga gtcctttgat 660

cgtgacaaaa caattgcact gataatgaat tcctctggat ctactgggtt acctaagggt 720

gtggcccttc cgcatagaac tgcctgcgtc agattctcgc atgccagaga tcctattttt 780

ggcaatcaaa tcattccgga tactgcgatt ttaagtgttg ttccattcca tcacggtttt 840

ggaatgttta ctacactcgg atatttgata tgtggatttc gagtcgtctt aatgtataga 900

tttgaagaag agctgttttt acgatccctt caggattaca aaattcaaag tgcgttgcta 960

gtaccaaccc tattttcatt cttcgccaaa agcactctga ttgacaaata cgatttatct 1020

aatttacacg aaattgcttc tgggggcgca cctctttcga aagaagtcgg ggaagcggtt 1080

gcaaaacgct tccatcttcc agggatacga caaggatatg ggctcactga gactacatca 1140

gctattctga ttacacccga gggggatgat aaaccgggcg cggtcggtaa agttgttcca 1200

ttttttgaag cgaaggttgt ggatctggat accgggaaaa cgctgggcgt taatcagaga 1260

ggcgaattat gtgtcagagg acctatgatt atgtccggtt atgtaaacaa tccggaagcg 1320

accaacgcct tgattgacaa ggatggatgg ctacattctg gagacatagc ttactgggac 1380

gaagacgaac acttcttcat agttgaccgc ttgaagtctt taattaaata caaaggatat 1440

caggtggccc ccgctgaatt ggaatcgata ttgttacaac accccaacat cttcgacgcg 1500

ggcgtggcag gtcttcccga cgatgacgcc ggtgaacttc ccgccgccgt tgttgttttg 1560

gagcacggaa agacgatgac ggaaaaagag atcgtggatt acgtcgccag tcaagtaaca 1620

accgcgaaaa agttgcgcgg aggagttgtg tttgtggacg aagtaccgaa aggtcttacc 1680

ggaaaactcg acgcaagaaa aatcagagag atcctcataa aggccaagaa gggcggaaag 1740

tccaaattgg tttaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1800

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaactcga gcaccaccac 1860

caccaccact gagatccggc tgctaacaaa gcccgaaagg aagctgagtt ggctgctgcc 1920

accgctgagc aataactagc ataacccctt ggggcctcta aacgggtctt gaggggtttt 1980

ttg 1983

<210> 5

<211> 16

<212> RNA

<213> Artificial Sequence

<400> 5

aaaaaaaaaa augucu 16

<210> 6

<211> 717

<212> DNA

<213> Artificial Sequence

<400> 6

atggtgagca agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac 60

ggcgacgtaa acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac 120

ggcaagctga ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc 180

ctcgtgacca ccctgaccta cggcgtgcag tgcttcagcc gctaccccga ccacatgaag 240

cagcacgact tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc 300

ttcaaggacg acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg 360

gtgaaccgca tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac 420

aagctggagt acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac 480

ggcatcaagg tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc 540

gaccactacc agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac 600

tacctgagca cccagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc 660

ctgctggagt tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaag 717

<210> 7

<211> 717

<212> DNA

<213> Artificial Sequence

<400> 7

atggtgagca agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac 60

ggcgacgtaa acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac 120

ggcaagctga ccctgaagct gatctgcacc accggcaagc tgcccgtgcc ctggcccacc 180

ctcgtgacca ccctgggcta cggcctgcag tgcttcgccc gctaccccga ccacatgaag 240

cagcacgact tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc 300

ttcaaggacg acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg 360

gtgaaccgca tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac 420

aagctggagt acaactacaa cagccacaac gtctatatca ccgccgacaa gcagaagaac 480

ggcatcaagg ccaacttcaa gatccgccac aacatcgagg acggcggcgt gcagctcgcc 540

gaccactacc agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac 600

tacctgagct accagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc 660

ctgctggagt tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaag 717

<210> 8

<211> 3048

<212> DNA

<213> Artificial Sequence

<400> 8

atggtcgttt tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt 60

gcagcacatc cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct 120

tcccaacagt tgcgcagcct gaatggcgaa tggcgctttg cctggtttcc ggcaccagaa 180

gcggtgccgg aaagctggct ggagtgcgat cttcctgagg ccgatactgt cgtcgtcccc 240

tcaaactggc agatgcacgg ttacgatgcg cccatctaca ccaacgtgac ctatcccatt 300

acggtcaatc cgccgtttgt tcccacggag aatccgacgg gttgttactc gctcacattt 360

aatgttgatg aaagctggct acaggaaggc cagacgcgaa ttatttttga tggcgttaac 420

tcggcgtttc atctgtggtg caacgggcgc tgggtcggtt acggccagga cagtcgtttg 480

ccgtctgaat ttgacctgag cgcattttta cgcgccggag aaaaccgcct cgcggtgatg 540

gtgctgcgct ggagtgacgg cagttatctg gaagatcagg atatgtggcg gatgagcggc 600

attttccgtg acgtctcgtt gctgcataaa ccgactacac aaatcagcga tttccatgtt 660

gccactcgct ttaatgatga tttcagccgc gctgtactgg aggctgaagt tcagatgtgc 720

ggcgagttgc gtgactacct acgggtaaca gtttctttat ggcagggtga aacgcaggtc 780

gccagcggca ccgcgccttt cggcggtgaa attatcgatg agcgtggtgg ttatgccgat 840

cgcgtcacac tacgtctgaa cgtcgaaaac ccgaaactgt ggagcgccga aatcccgaat 900

ctctatcgtg cggtggttga actgcacacc gccgacggca cgctgattga agcagaagcc 960

tgcgatgtcg gtttccgcga ggtgcggatt gaaaatggtc tgctgctgct gaacggcaag 1020

ccgttgctga ttcgaggcgt taaccgtcac gagcatcatc ctctgcatgg tcaggtcatg 1080

gatgagcaga cgatggtgca ggatatcctg ctgatgaagc agaacaactt taacgccgtg 1140

cgctgttcgc attatccgaa ccatccgctg tggtacacgc tgtgcgaccg ctacggcctg 1200

tatgtggtgg atgaagccaa tattgaaacc cacggcatgg tgccaatgaa tcgtctgacc 1260

gatgatccgc gctggctacc ggcgatgagc gaacgcgtaa cgcgaatggt gcagcgcgat 1320

cgtaatcacc cgagtgtgat catctggtcg ctggggaatg aatcaggcca cggcgctaat 1380

cacgacgcgc tgtatcgctg gatcaaatct gtcgatcctt cccgcccggt gcagtatgaa 1440

ggcggcggag ccgacaccac ggccaccgat attatttgcc cgatgtacgc gcgcgtggat 1500

gaagaccagc ccttcccggc tgtgccgaaa tggtccatca aaaaatggct ttcgctacct 1560

ggagagacgc gcccgctgat cctttgcgaa tacgcccacg cgatgggtaa cagtcttggc 1620

ggtttcgcta aatactggca ggcgtttcgt cagtatcccc gtttacaggg cggcttcgtc 1680

tgggactggg tggatcagtc gctgattaaa tatgatgaaa acggcaaccc gtggtcggct 1740

tacggcggtg attttggcga tacgccgaac gatcgccagt tctgtatgaa cggtctggtc 1800

tttgccgacc gcacgccgca tccagcgctg acggaagcaa aacaccagca gcagtttttc 1860

cagttccgtt tatccgggca aaccatcgaa gtgaccagcg aatacctgtt ccgtcatagc 1920

gataacgagc tcctgcactg gatggtggcg ctggatggta agccgctggc aagcggtgaa 1980

gtgcctctgg atgtcgctcc acaaggtaaa cagttgattg aactgcctga actaccgcag 2040

ccggagagcg ccgggcaact ctggctcaca gtacgcgtag tgcaaccgaa cgcgaccgca 2100

tggtcagaag ccggacacat cagcgcctgg cagcagtggc gtctggctga aaacctcagc 2160

gtgacactcc ccgccgcgtc ccacgccatc ccgcatctga ccaccagcga aatggatttt 2220

tgcatcgagc tgggtaataa gcgttggcaa tttaaccgcc agtcaggctt tctttcacag 2280

atgtggattg gcgataaaaa acaactgctg acgccgctgc gcgatcagtt cacccgtgca 2340

ccgctggata acgacattgg cgtaagtgaa gcgacccgca ttgaccctaa cgcctgggtc 2400

gaacgctgga aggcggcggg ccattaccag gccgaagcag cgttgttgca gtgcacggca 2460

gatacacttg ctgatgcggt gctgattacg accgctcacg cgtggcagca tcaggggaaa 2520

accttattta tcagccggaa aacctaccgg attgatggta gtggtcaaat ggcgattacc 2580

gttgatgttg aagtggcgag cgatacaccg catccggcgc ggattggcct gaactgccag 2640

ctggcgcagg tagcagagcg ggtaaactgg ctcggattag ggccgcaaga aaactatccc 2700

gaccgcctta ctgccgcctg ttttgaccgc tgggatctgc cattgtcaga catgtatacc 2760

ccgtacgtct tcccgagcga aaacggtctg cgctgcggga cgcgcgaatt gaattatggc 2820

ccacaccagt ggcgcggcga cttccagttc aacatcagcc gctacagtca acagcaactg 2880

atggaaacca gccatcgcca tctgctgcac gcggaagaag gcacatggct gaatatcgac 2940

ggtttccata tggggattgg tggcgacgac tcctggagcc cgtcagtatc ggcggaattc 3000

cagctgagcg ccggtcgcta ccattaccag ttggtctggt gtcaaaaa 3048

<210> 9

<211> 1935

<212> DNA

<213> Artificial Sequence

<400> 9

atggcggccg tgcaggcggc cgaggtgaaa gtggatggca gcgagccgaa actgagcaag 60

aatgagctga agagacgcct gaaagctgag aagaaagtag cagagaagga ggccaaacag 120

aaagagctca gtgagaaaca gctaagccaa gccactgctg ctgccaccaa ccacaccact 180

gataatggtg tgggtcctga ggaagagagc gtggacccaa atcaatacta caaaatccgc 240

agtcaagcaa ttcatcagct gaaggtcaat ggggaagacc catacccaca caagttccat 300

gtagacatct cactcactga cttcatccaa aaatatagtc acctgcagcc tggggatcac 360

ctgactgaca tcaccttaaa ggtggcaggt aggatccatg ccaaaagagc ttctggggga 420

aagctcatct tctatgatct tcgaggagag ggggtgaagt tgcaagtcat ggccaattcc 480

agaaattata aatcagaaga agaatttatt catattaata acaaactgcg tcggggagac 540

ataattggag ttcaggggaa tcctggtaaa accaagaagg gtgagctgag catcattccg 600

tatgagatca cactgctgtc tccctgtttg catatgttac ctcatcttca ctttggcctc 660

aaagacaagg aaacaaggta tcgccagaga tacttggact tgatcctgaa tgactttgtg 720

aggcagaaat ttatcatccg ctctaagatc atcacatata taagaagttt cttagatgag 780

ctgggattcc tagagattga aactcccatg atgaacatca tcccaggggg agccgtggcc 840

aagcctttca tcacttatca caacgagctg gacatgaact tatatatgag aattgctcca 900

gaactctatc ataagatgct tgtggttggt ggcatcgacc gggtttatga aattggacgc 960

cagttccgga atgaggggat tgatttgacg cacaatcctg agttcaccac ctgtgagttc 1020

tacatggcct atgcagacta tcacgatctc atggaaatca cggagaagat ggtttcaggg 1080

atggtgaagc atattacagg cagttacaag gtcacctacc acccagatgg cccagagggc 1140

caagcctacg atgttgactt caccccaccc ttccggcgaa tcaacatggt agaagagctt 1200

gagaaagccc tggggatgaa gctgccagaa acgaacctct ttgaaactga agaaactcgc 1260

aaaattcttg atgatatctg tgtggcaaaa gctgttgaat gccctccacc tcggaccaca 1320

gccaggctcc ttgacaagct tgttggggag ttcctggaag tgacttgcat caatcctaca 1380

ttcatctgtg atcacccaca gataatgagc cctttggcta aatggcaccg ctctaaagag 1440

ggtctgactg agcgctttga gctgtttgtc atgaagaaag agatatgcaa tgcgtatact 1500

gagctgaatg atcccatgcg gcagcggcag ctttttgaag aacaggccaa ggccaaggct 1560

gcaggtgatg atgaggccat gttcatagat gaaaacttct gtactgccct ggaatatggg 1620

ctgcccccca cagctggctg gggcatgggc attgatcgag tcgccatgtt tctcacggac 1680

tccaacaaca tcaaggaagt acttctgttt cctgccatga aacccgaaga caagaaggag 1740

aatgtagcaa ccactgatac actggaaagc acaacagttg gcacttctgt ctagaaaata 1800

ataattgcaa gttgtataac tcaggcgtct ttgcatttct gcgaaagatc aaggtctgca 1860

agggaattct tgtgtgctgc tttccatttg acaccgcagt tctgttcagc catcagaaga 1920

gagacaagga attaa 1935

<210> 10

<211> 3531

<212> DNA

<213> Artificial Sequence

<400> 10

atggcggaaa gaaaaggaac agccaaagtg gactttttga agaagattga gaaagaaatc 60

caacagaaat gggatactga gagagtgttt gaggtcaatg catctaattt agagaaacag 120

accagcaagg gcaagtattt tgtaaccttc ccatatccat atatgaatgg acgccttcat 180

ttgggacaca cgttttcttt atccaaatgt gagtttgctg tagggtacca gcgattgaaa 240

ggaaaatgtt gtctgtttcc ctttggcctg cactgtactg gaatgcctat taaggcatgt 300

gctgataagt tgaaaagaga aatagagctg tatggttgcc cccctgattt tccagatgaa 360

gaagaggaag aggaagaaac cagtgttaaa acagaagata taataattaa ggataaagct 420

aaaggaaaaa agagtaaagc tgctgctaaa gctggatctt ctaaatacca gtggggcatt 480

atgaaatccc ttggcctgtc tgatgaagag atagtaaaat tttctgaagc agaacattgg 540

cttgattatt tcccgccact ggctattcag gatttaaaaa gaatgggttt gaaggtagac 600

tggcgtcgtt ccttcatcac cactgatgtt aatccttact atgattcatt tgtcagatgg 660

caatttttaa cattaagaga aagaaacaaa attaaatttg ggaagcggta tacaatttac 720

tctccgaaag atggacagcc ttgcatggat catgatagac aaactggaga gggtgttgga 780

cctcaggaat atactttact caaattgaag gtgcttgagc catacccatc taaattaagt 840

ggcctgaaag gtaaaaatat tttcttggtg gctgctactc tcagacctga gaccatgttt 900

gggcagacaa attgttgggt tcgtcctgat atgaagtaca ttggatttga gacggtgaat 960

ggtgatatat tcatctgtac ccaaaaagca gccaggaata tgtcatacca gggctttacc 1020

aaagacaatg gcgtggtgcc tgttgttaag gaattaatgg gggaggaaat tcttggtgca 1080

tcactttctg cacctttaac atcatacaag gtgatctatg ttctcccaat gctaactatt 1140

aaggaggata aaggcactgg tgtggttaca agtgttcctt ccgactcccc tgatgatatt 1200

gctgccctca gagacttgaa gaaaaagcaa gccttacgag caaaatatgg aattagagat 1260

gacatggtct tgccatttga gccggtgcca gtcattgaaa tcccaggttt tggaaatctt 1320

tctgctgtaa ccatttgtga tgagttgaaa attcagagcc agaatgaccg ggaaaaactt 1380

gcagaagcaa aggagaagat atatctaaaa ggattttatg agggtatcat gttggtggat 1440

ggatttaaag gacagaaggt tcaagatgta aagaagacta ttcagaaaaa gatgattgac 1500

gctggagatg cacttattta catggaacca gagaaacaag tgatgtccag gtcgtcagat 1560

gaatgtgttg tggctctgtg tgaccagtgg tacttggatt atggagaaga gaattggaag 1620

aaacagacat ctcagtgctt gaagaacctg gaaacattct gtgaggagac caggaggaat 1680

tttgaagcca ccttaggttg gctacaagaa catgcttgct caagaactta tggtctaggc 1740

actcacctgc cttgggatga gcagtggctg attgaatcac tttctgactc cactatttac 1800

atggcatttt acacagttgc acacctattg caggggggta acttgcatgg acaggcagag 1860

tctccgctgg gcattagacc gcaacagatg accaaggaag tttgggatta tgttttcttc 1920

aaggaggctc catttcctaa gactcagatt gcaaaggaaa aattagatca gttaaagcag 1980

gagtttgaat tctggtatcc tgttgatctt cgcgtctctg gcaaggatct tgttccaaat 2040

catctttcat attaccttta taatcatgtg gctatgtggc cggaacaaag tgacaaatgg 2100

cctacagctg tgagagcaaa tggacatctc ctcctgaact ctgagaagat gtcaaaatcc 2160

acaggcaact tcctcacttt gacccaagct attgacaaat tttcagcaga tggaatgcgt 2220

ttggctctgg ctgatgctgg tgacactgta gaagatgcca actttgtgga agccatggca 2280

gatgcaggta ttctccgtct gtacacctgg gtagagtggg tgaaagaaat ggttgccaac 2340

tgggacagcc taagaagtgg tcctgccagc actttcaatg atagagtttt tgccagtgaa 2400

ttgaatgcag gaattataaa aacagatcaa aactatgaaa agatgatgtt taaagaagct 2460

ttgaaaacag ggttttttga gtttcaggcc gcaaaagata agtaccgtga attggctgtg 2520

gaagggatgc acagagaact tgtgttccgg tttattgaag ttcagacact tctcctcgct 2580

ccattctgtc cacatttgtg tgagcacatc tggacactcc tgggaaagcc tgactcaatt 2640

atgaatgctt catggcctgt ggcaggtcct gttaatgaag ttttaataca ctcctcacag 2700

tatcttatgg aagtaacaca tgaccttaga ctacgactca agaactatat gatgccagct 2760

aaagggaaga agactgacaa acaacccctg cagaagccct cacattgcac catctatgtg 2820

gcaaagaact atccaccttg gcaacatacc accctgtctg ttctacgtaa acactttgag 2880

gccaataacg gaaaactgcc tgacaacaaa gtcattgcta gtgaactagg cagtatgcca 2940

gaactgaaga aatacatgaa gaaagtcatg ccatttgttg ccatgattaa ggaaaatctg 3000

gagaagatgg ggcctcgtat tctggatttg caattagaat ttgatgaaaa ggctgtgctt 3060

atggagaata tagtctatct gactaattcg cttgagctag aacacataga agtcaagttt 3120

gcctccgaag cagaagataa aatcagggaa gactgctgtc ctgggaaacc acttaatgtt 3180

tttagaatag aacctggtgt gtccgtttct ctggtgaatc cccagccatc caatggccac 3240

ttctcaacca aaattgaaat caggcaagga gataactgtg attccataat caggcgttta 3300

atgaaaatga atcgaggaat taaagacctt tccaaagtga aactgatgag atttgatgat 3360

ccactgttgg ggcctcgacg agttcctgtc ctgggaaagg agtacaccga gaagaccccc 3420

atttctgagc atgctgtttt caatgtggac ctcatgagca agaaaattca tctgactgag 3480

aatgggataa gggtggatat tggcgataca ataatctatc tggttcatta a 3531

<210> 11

<211> 1017

<212> DNA

<213> artificial sequence

<400> 11

atggctgaca agaagattag gatcggaatc aacggattcg gaagaattgg tcgtttggtt 60

gctagagttg ttctccagag ggacgatgtt gagctcgtcg ctgtcaacga ccccttcatc 120

actactgagt acatgaccta catgttcaag tacgacagtg ttcacggtca atggaaacac 180

aatgaactca agatcaagga tgagaagacc cttctcttcg gtgagaagcc agtcactgtt 240

ttcggcatca ggaaccctga ggatatccca tgggccgagg ctggagctga ctacgttgtt 300

gagtctactg gtgtcttcac tgacaaagac aaggctgcag ctcacttgaa gggtggtgcc 360

aagaaggttg ttatctctga acccagcaaa gacgctccaa tgtttgttgt tggtgtcaac 420

gagcacgaat acaagtccga ccttgacatt gtctccaacg ctagctgcac cactaactgc 480

cttgctcccc ttgccaaggt tatcaatgac agatttggaa ttgttgaggg tcttatgact 540

acagtccact caatcactgc tactcagaag actgttgatg ggccttcaat gaaggactgg 600

agaggtggaa gagctgcttc attcaacatt attcccagca gcactggagc tgccaaggct 660

gtcggaaagg tgcttccagc tcttaacgga aagttgactg gaatgtcttt ccgtgtccca 720

accgttgatg tctcagttgt tgaccttact gtcagactcg agaaagctgc tacctacgaa 780

gaaatcaaaa aggctatcaa ggaggaatcc gaaggcaaac tcaagggaat ccttggatac 840

accgaggatg atgttgtctc aactgacttc gttggcgaca acaggtcgag catttttgac 900

gccaaggctg gaattgcatt gagcgacaag tttgtgaaat tggtgtcatg gtacgacaac 960

gaatggggtt acagttcccg tgtggtcgac ttgatcgtcc acatgtcaaa ggcctaa 1017

<210> 12

<211> 1584

<212> DNA

<213> artificial sequence

<400> 12

atgtcggaca gtcgggaccc agccagcgac cagatgaagc agtggaagga gcagcgggcc 60

tcgcagagac cagatgtcct gaccaccgga ggcgggaacc caataggaga taaacttaat 120

atcatgaccg cggggtcccg agggcccctc ctcgttcagg atgtggtttt cactgacgag 180

atggcacact ttgacagaga gcggattcct gagagagtgg tacacgcaaa aggagcaggt 240

gcttttggat actttgaggt cacccacgat atcaccagat actccaaggc aaaggtgttt 300

gagcatattg gaaagaggac ccctattgcc gttcgattct ccacagtcgc tggagagtca 360

ggctcagctg acacagttcg tgaccctcgg gggtttgcag tgaaatttta cactgaagat 420

ggtaactggg atcttgtggg aaacaacacc cctattttct tcatcaggga tgccatattg 480

tttccatcct ttatccatag ccagaagaga aacccacaga ctcacctgaa ggatcctgac 540

atggtctggg acttctggag tcttcgtccc gagtctctcc atcaggtttc tttcttgttc 600

agtgaccgag ggattcccga tggtcaccgg cacatgaatg gctatggatc acacaccttc 660

aagttggtta atgcagatgg agaggcagtc tattgcaagt tccattacaa gaccgaccag 720

ggcatcaaaa acttgcctgt tggagaggca ggaaggcttg ctcaggaaga tccggattat 780

ggcctccgag atcttttcaa tgccatcgcc aatggcaatt acccgtcctg gacgttttac 840

atccaggtca tgacttttaa ggaggcagaa actttcccat ttaatccatt tgatctgacc 900

aaggtttggc ctcacaagga ctaccctctt ataccagttg gcaaactggt tttaaacaaa 960

aatccagtta attactttgc tgaagttgaa cagatggctt ttgacccaag caatatgccc 1020

cctggcatcg agcccagccc tgacaaaatg cttcagggcc gcctttttgc ctacccggac 1080

actcaccgcc accgcctggg acccaactat ctgcagatac ctgtgaactg tccctaccgc 1140

gctcgagtgg ccaactacca gcgtgatggc cccatgtgca tgcatgacaa ccagggtggt 1200

gcccccaact attaccccaa cagcttcagc gcaccagagc agcagcgctc agccctggag 1260

cacagcgtcc agtgcgctgt agatgtgaaa cgcttcaaca gtgctaatga agacaatgtc 1320

actcaggtgc ggacattcta cacaaaggtg ttgaatgagg aggagaggaa acgcctgtgt 1380

gagaacattg ctggccacct gaaggacgct cagcttttca ttcagaagaa agcggtcaag 1440

aatttcactg acgtccaccc tgactatggg gcccgcatcc aggctcttct ggacaagtac 1500

aacgctgaga agcctaagaa cgcaattcac acctacacgc aggccggctc tcacatggct 1560

gcgaagggaa aagctaacct gtaa 1584

<210> 13

<211> 99

<212> DNA

<213> artificial sequence

<400> 13

caacaattac caacaacaac aaacaacaaa caacattaca attactattt acaattacaa 60

aaaaaatgtc tgaagacgcc aaaaacataa agaaaggcc 99

<210> 14

<211> 53

<212> DNA

<213> artificial sequence

<400> 14

ttttttttgt aattgtaaat agtaattgta atgttgtttg ttgtttgttg ttg 53
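SEQ ID NO: 14 reads as the reverse strand of a stretch of SEQ ID NO: 13. This excerpt does not state the role of either sequence, so the short Python sketch below only checks that relationship and draws no conclusion about how the sequences are used. The helper name `reverse_complement` is hypothetical, and the code is an illustration rather than part of the patent.

```python
# Illustrative sketch only; not part of the patent text.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Copied from the sequence listing; the spaces are the listing's
# 10-base grouping and are removed before use.
SEQ_13 = (
    "caacaattac caacaacaac aaacaacaaa caacattaca attactattt acaattacaa"
    "aaaaaatgtc tgaagacgcc aaaaacataa agaaaggcc"
).replace(" ", "")
SEQ_14 = (
    "ttttttttgt aattgtaaat agtaattgta atgttgtttg ttgtttgttg ttg"
).replace(" ", "")

rc14 = reverse_complement(SEQ_14)
offset = SEQ_13.find(rc14)  # 0-based start of the match, or -1 if absent
print(rc14)
print(offset)
```

A non-negative `offset` means the reverse complement of SEQ ID NO: 14 occurs in SEQ ID NO: 13 starting at that 0-based position.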

<210> 15

<211> 98

<212> DNA

<213> artificial sequence

<400> 15

caattaccaa caacaacaaa caacaaacaa cattacaatt actatttaca attacaaaaa 60

aaaaatgtct gaagacgcca aaaacataaa gaaaggcc 98

<210> 16

<211> 99

<212> DNA

<213> artificial sequence

<400> 16

aattaccaac aacaacaaac aacaaacaac attacaatta ctatttacaa ttacaaaaaa 60

aaaaaatgtc tgaagacgcc aaaaacataa agaaaggcc 99

<210> 17

<211> 27

<212> DNA

<213> artificial sequence

<400> 17

cgcgaaatta atacgactca ctatagg 27

<210> 18

<211> 24

<212> DNA

<213> artificial sequence

<400> 18

tccggatata gttcctcctt tcag 24
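SEQ ID NO: 13, 15 and 16 each contain a run of adenines directly upstream of their first ATG. The following minimal Python sketch reports, for each of them, the length of that run, the position of the ATG, and whether the codon after the ATG encodes serine. It is an illustration, not part of the patent: the function name `find_start_context` is hypothetical, excluding the A of the ATG from the run length is an assumption, and reading the result as the oligo(A), initiation codon and serine codon arrangement of the claims is likewise an assumption that this excerpt does not confirm.

```python
# Illustrative sketch only; not part of the patent text.
SERINE_CODONS = {"tct", "tcc", "tca", "tcg", "agt", "agc"}  # standard genetic code

# Copied from the sequence listing; the spaces are the listing's
# 10-base grouping and are removed before use.
SEQS = {
    13: "caacaattac caacaacaac aaacaacaaa caacattaca attactattt acaattacaa"
        "aaaaaatgtc tgaagacgcc aaaaacataa agaaaggcc",
    15: "caattaccaa caacaacaaa caacaaacaa cattacaatt actatttaca attacaaaaa"
        "aaaaatgtct gaagacgcca aaaacataaa gaaaggcc",
    16: "aattaccaac aacaacaaac aacaaacaac attacaatta ctatttacaa ttacaaaaaa"
        "aaaaaatgtc tgaagacgcc aaaaacataa agaaaggcc",
}
SEQS = {seq_id: seq.replace(" ", "") for seq_id, seq in SEQS.items()}


def find_start_context(seq):
    """Return (a_run_length, atg_index, next_codon) for the first ATG, or None.

    a_run_length counts the adenines immediately 5' of the ATG and
    excludes the A of the ATG itself (an assumption of this sketch).
    atg_index is 0-based.
    """
    seq = seq.lower()
    atg = seq.find("atg")
    if atg == -1:
        return None
    run = 0
    while atg - run - 1 >= 0 and seq[atg - run - 1] == "a":
        run += 1
    return run, atg, seq[atg + 3:atg + 6]


for seq_id, seq in SEQS.items():
    run, atg, codon = find_start_context(seq)
    print(seq_id, len(seq), run, atg, codon, codon in SERINE_CODONS)
```

Each printed row gives the SEQ ID number, the sequence length, the adenine run length, the 0-based ATG position, the following codon, and whether that codon is a serine codon.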

Claims

1. A nucleic acid construct, characterized in that the construct has a structure of Formula I from 5' to 3':

Z1-Z2-Z3-Z4 (I)

In the formula,

Each "-" is independently a bond or a nucleotide linker sequence;

Z2 is an oligomeric chain of adenine deoxyribonucleotides, [oligo(A)]_n;

Z3 is a translation initiation codon;

Z4 is a serine codon;

2. The nucleic acid construct of claim 1, characterized in that the Kluyveromyces yeast is selected from the group consisting of: Kluyveromyces lactis, Kluyveromyces marxianus, Kluyveromyces dobzhanskii, or a combination thereof.

3. A nucleic acid construct, characterized in that the construct has a structure of Formula II from 5' to 3':

Z1-Z2-Z3-Z4-Z5 (II)

In the formula,

Each "-" is independently a bond or a nucleotide linker sequence;

Z2 is an oligomeric chain of adenine deoxyribonucleotides, [oligo(A)]_n;

Z3 is a translation initiation codon;

Z4 is a serine codon;

Z5 is a coding sequence of a foreign protein;
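The 5'-to-3' element order of Formula II can be sketched as a simple string assembly. The Python example below is a hedged illustration, not the applicant's method: the function `assemble_formula_ii` and all of its parameters are hypothetical, Z1 is passed in as an arbitrary placeholder because it is not defined in this excerpt, ATG and TCT are assumed defaults for the initiation and serine codons, a single shared linker stands in for the independently chosen "-" connections, and the lower bound of one adenine is an assumption.

```python
# Illustrative sketch only; not part of the patent text.
SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}  # standard genetic code


def assemble_formula_ii(z1, n_adenines, z5, start_codon="ATG",
                        serine_codon="TCT", linker=""):
    """Join Z1-Z2-Z3-Z4-Z5 in 5'-to-3' order.

    z1          placeholder for the Z1 element (not defined in this excerpt)
    n_adenines  length n of the oligo(A) chain Z2 (assumed >= 1)
    z5          coding sequence of the foreign protein
    linker      one sequence used for every "-"; "" means a direct bond
    """
    if n_adenines < 1:
        raise ValueError("oligo(A) needs at least one adenine")
    if serine_codon.upper() not in SERINE_CODONS:
        raise ValueError(f"{serine_codon!r} is not a serine codon")
    parts = [z1, "A" * n_adenines, start_codon, serine_codon, z5]
    parts = [part.upper() for part in parts]
    for part in parts + [linker.upper()]:
        if set(part) - set("ACGT"):
            raise ValueError(f"non-DNA character in {part!r}")
    return linker.upper().join(parts)


# Toy inputs chosen only to show the element order.
print(assemble_formula_ii("cc", 3, "gaagac"))
```

Keeping each element as a separate argument makes it easy to vary one of them, for example the oligo(A) length, while holding the others fixed.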

4. The nucleic acid construct of claim 3, characterized in that the coding sequence of the foreign protein is selected from the group consisting of: exogenous DNA encoding a luciferin protein or a luciferase (such as firefly luciferase), green fluorescent protein, yellow fluorescent protein, aminoacyl-tRNA synthetase, glyceraldehyde-3-phosphate dehydrogenase, catalase, actin, or a variable region of an antibody, DNA of a luciferase mutant, or a combination thereof.

5. A vector or vector combination, characterized in that the vector or vector combination contains the nucleic acid construct of claim 1 or 3.

6. A genetically engineered cell, characterized in that the nucleic acid construct of claim 1 or 3 is integrated at one or more sites of the genome of the genetically engineered cell, or the genetically engineered cell contains the vector or vector combination of claim 5.

7. A kit, characterized in that the reagents contained in the kit are one or more selected from the group consisting of:

(a) the construct of claim 1 or 3;

(b) the vector or vector combination of claim 5; and

(c) the genetically engineered cell of claim 6.

8. Use of the construct of claim 1 or 3, the vector or vector combination of claim 5, the genetically engineered cell of claim 6, or the kit of claim 7, characterized in that it is used for high-throughput in vitro protein synthesis.

9. A method for high-throughput in vitro protein synthesis, characterized by comprising the steps of:

(i) providing the nucleic acid construct of claim 3 in the presence of a yeast in vitro protein synthesis system;

(ii) under suitable conditions, incubating the yeast in vitro protein synthesis system of step (i) for a period of time T1, thereby synthesizing the foreign protein.

10. The method of claim 9, characterized in that the method further comprises: (iii) optionally, isolating or detecting the foreign protein from the yeast in vitro protein synthesis system.