CN113754739B - Preparation method and application of coronavirus S protein RBD glycoprotein - Google Patents

Info

Publication number
CN113754739B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010493748.5A
Other languages
Chinese (zh)
Other versions
CN113754739A (en)
Inventor
刘波
吴军
孙鹏
王甜甜
巩新
侯旭宸
徐俊杰
殷瑛
张军
罗士强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Junshi Biosciences Co Ltd
Original Assignee
Shanghai Junshi Biosciences Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Junshi Biosciences Co Ltd
Priority to CN202010493748.5A (CN113754739B)
Priority to PCT/CN2021/093757 (WO2021244255A1)
Publication of CN113754739A
Application granted
Publication of CN113754739B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K16/00 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C07K16/08 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from viruses
    • C07K16/10 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from viruses from RNA viruses
    • C07K16/1002 Coronaviridae
    • C07K16/1003 Severe acute respiratory syndrome coronavirus 2 [SARS‐CoV‐2 or Covid-19]
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/12 Viral antigens
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/12 Viral antigens
    • A61K39/215 Coronaviridae, e.g. avian infectious bronchitis virus
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/385 Haptens or antigens, bound to carriers
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P11/00 Drugs for disorders of the respiratory system
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • A61P31/14 Antivirals for RNA viruses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • C07K14/08 RNA viruses
    • C07K14/165 Coronaviridae, e.g. avian infectious bronchitis virus
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K16/00 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C07K16/08 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from viruses
    • C07K16/10 Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from viruses from RNA viruses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K19/00 Hybrid peptides, i.e. peptides covalently bound to nucleic acids, or non-covalently bound protein-protein complexes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C12N15/815 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts for yeasts other than Saccharomyces
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K2039/555 Medicinal preparations containing antigens or antibodies characterised by a specific combination antigen/adjuvant
    • A61K2039/55505 Inorganic adjuvants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011 Details
    • C12N2770/20011 Coronaviridae
    • C12N2770/20022 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011 Details
    • C12N2770/20011 Coronaviridae
    • C12N2770/20034 Use of virus or viral component as vaccine, e.g. live-attenuated or inactivated virus, VLP, viral protein

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Virology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Public Health (AREA)
  • Mycology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Veterinary Medicine (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Immunology (AREA)
  • Pulmonology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Communicable Diseases (AREA)
  • Physics & Mathematics (AREA)
  • Epidemiology (AREA)
  • Plant Pathology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Oncology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention discloses a preparation method and application of a coronavirus S protein RBD glycoprotein. The invention provides a method for preparing a coronavirus S protein RBD modified with N-glycans of mammalian glycoform structure, comprising: expressing the coronavirus S protein RBD in Pichia pastoris whose glycosylation pathway has been genetically engineered (a Pichia pastoris cell mutant in which the mannosylation pathway is defective and a mammalian-cell N-glycosylation pathway has been reconstructed), thereby obtaining recombinant yeast cells; and culturing the recombinant yeast cells and purifying the target protein from the culture supernatant. The invention successfully expresses a coronavirus S protein RBD bearing N-glycans of mammalian glycoform structure. Mice immunized with this RBD produce high-titer antibodies that can neutralize SARS-CoV-2. The invention facilitates the efficient development and large-scale production of novel coronavirus vaccines.

Description

Preparation method and application of coronavirus S protein RBD glycoprotein
Technical Field
The invention relates to the field of biomedicine, in particular to a preparation method and application of a coronavirus S protein RBD glycoprotein.
Background
Over the past 20 years, several coronaviruses have crossed the species barrier into humans, accounting for about 30% of respiratory infections (Coronaviruses: drug discovery and therapeutic options) and causing significant losses. This shows that coronaviruses pose a growing threat to human health and that research on related vaccines is an urgent task.
The S protein is the principal surface protein of coronaviruses. It is a type I membrane protein modified by N-glycosylation; each monomer comprises about 1300 amino acids, and the folded monomers assemble into homotrimers. The S protein monomer consists of an N-terminal S1 subunit and a C-terminal S2 subunit. The S1 subunit is responsible for binding the host cell receptor. After the virus is taken up by the host cell, the S2 subunit is cleaved by a host protease at the S2' site adjacent to the fusion peptide, triggering a conformational change in the S protein through which the S2 subunit mediates membrane fusion (reference: Glycan shield and epitope masking of a coronavirus spike protein observed by cryo-electron microscopy).
Vaccine research based on the full-length SARS-CoV S protein shows that the S protein is highly immunogenic and can induce neutralizing antibodies against SARS-CoV. However, the full-length S protein as an antigen can induce adverse reactions such as eosinophilic immunopathology or antibody-dependent enhancement (ADE), and its safety has been widely questioned.
The S protein of the novel coronavirus SARS-CoV-2 is highly homologous to the SARS-CoV S protein, the higher-order structures of the two proteins are similar, and SARS-CoV-2 also invades human cells through the receptor ACE2 (Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation). This suggests that experience from SARS-CoV vaccine development can guide the development of SARS-CoV-2 vaccines.
The RBD of the SARS-CoV S protein is an independent structural domain that can fold into the correct conformation and contains several conformation-dependent epitopes. It is one of the main antigens considered for subunit vaccines, alongside the S1 region, the S2 region, the full-length S protein and the nucleoprotein. However, the SARS-CoV-2 S protein RBD has two potential N-glycosylation sites, and the correct glycoform structure plays an important role in maintaining the natural conformation and immunogenicity of the RBD. Yeast has long been used as a microorganism for industrial mass production and has been used successfully to produce non-glycoprotein subunit vaccines such as hepatitis B virus (HBV) vaccines and human papillomavirus (HPV) vaccines. It offers high safety, short construction times for engineered strains, fast growth and easy scale-up, which makes it well suited as an expression system for efficient, large-scale vaccine production in emergencies such as sudden outbreaks of infectious disease. However, yeast tends to hyperglycosylate proteins (excessive mannosylation), and the shielding of important epitopes by excess glycans reduces the protective effect of the vaccine. Engineering the yeast glycosylation system can make its glycosylation pathway approach, or even improve on, the natural glycan structure of the antigen, so that yeast with a genetically engineered glycosylation pathway may replace chicken embryos and mammalian cells for the rapid and efficient production of genetically engineered subunit vaccines.
Disclosure of Invention
The invention aims to provide a method for preparing a coronavirus S protein RBD modified with N-glycans of mammalian glycoform structure, using Pichia pastoris whose glycosylation pathway has been genetically engineered, and an application thereof.
In a first aspect, the invention claims a method of preparing a coronavirus S protein receptor binding domain (RBD) modified with N-glycans of mammalian glycoform structure.
The method for preparing a coronavirus S protein receptor binding domain (RBD) modified with N-glycans of mammalian glycoform structure comprises the following steps:
(1) Expressing the coronavirus S protein receptor binding domain (RBD) in Pichia pastoris whose glycosylation pathway has been genetically engineered, to obtain recombinant yeast cells;
The Pichia pastoris with a genetically engineered glycosylation pathway is a Pichia pastoris cell mutant in which the mannosylation pathway is defective and a mammalian-cell N-glycosylation pathway has been reconstructed.
(2) Culturing the recombinant yeast cells and purifying, from the culture supernatant, the coronavirus S protein receptor binding domain modified with N-glycans of mammalian glycoform structure.
The Pichia pastoris with a genetically engineered glycosylation pathway can be prepared by a method comprising the following steps:
(A1) Inactivating the endogenous α-1,6-mannosyltransferase, phosphomannosyltransferase, β-mannosyltransferase I, β-mannosyltransferase II, β-mannosyltransferase III and β-mannosyltransferase IV of the recipient Pichia pastoris to obtain recombinant yeast 1;
(A2) Expressing in recombinant yeast 1 at least one of the following exogenous proteins: exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous mannosidase II, exogenous N-acetylglucosamine transferase II, exogenous galactose isomerase and exogenous galactosyltransferase, to obtain recombinant yeast 2. Recombinant yeast 2 is the Pichia pastoris with a genetically engineered glycosylation pathway.
After inactivation of α-1,6-mannosyltransferase, phosphomannosyltransferase and β-mannosyltransferases I-IV, N-glycosylation is significantly reduced and the glycosylation background becomes relatively "clean". This raises a new problem: how can O-glycosylation be reduced? The O-glycosyltransferase family has many members; inactivating which of these enzymes would suit the present invention and achieve the desired effect? N-glycosylation is known to occur at a conserved site (N-X-S/T), whereas O-glycosylation has no conserved site and is generally believed to occur in serine- or threonine-rich regions. Whether O-glycosylation occurs, at which amino acid, and to what degree all vary between proteins. Any serine or threonine of a protein is a potential O-glycosylation site, but not every serine or threonine is O-glycosylated, not every protein containing serine or threonine is O-glycosylated, and the glycosylation of different proteins differs between expression systems. When O-glycosylation does occur in yeast, the sugars on the chains are mannose. The chains are relatively short, but because they are numerous, the surface of a yeast-expressed protein may carry a large amount of exposed mannose. Mannosylated glycoproteins have a short half-life in the human body, are highly immunogenic and are easily cleared. This deficiency has limited the use of Pichia pastoris in producing most protein drugs.
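The conserved N-glycosylation site (N-X-S/T) mentioned above lends itself to a simple sequence scan. The following is a minimal sketch, not part of the patent: it assumes the common convention that X is any residue except proline (the text itself does not state this), and the function name `find_sequons` and the toy sequence are hypothetical illustrations, not the RBD sequence.

```python
import re

# N-X-S/T sequon. Excluding proline at position X is a widely used
# convention and an assumption here; the text only gives "N-X-S/T".
# The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of the Asn residue of each N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MNGSAANPTQNVTNNSS"  # made-up sequence for illustration only
    print(find_sequons(toy))
```

A match only marks a potential site; as the text notes for glycosylation generally, whether a site is actually modified depends on the protein and the expression system.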
Members of the O-glycosyltransferase family are classified into three subfamilies according to homology: PMT1, PMT2 and PMT4. The number of members of the PMT1 and PMT2 subfamilies may differ between species; in total there are seven family members: PMT1, PMT2, PMT3, PMT4, PMT5, PMT6 and PMT7. In Saccharomyces cerevisiae, the PMT1 subfamily includes PMT1, PMT5 and PMT7, and the PMT2 subfamily includes PMT2, PMT3 and PMT6. Members of the PMT1 subfamily (Pmt1p, Pmt5p) and the PMT2 subfamily (Pmt2p, Pmt3p) form heterodimers with each other, Pmt4p forms homodimers, and Pmt6p forms neither heterodimers with other Pmtp family members nor homodimers. In wild-type yeast, the complexes formed between members of the PMT1 and PMT2 subfamilies are mainly Pmt1p-Pmt2p and Pmt5p-Pmt3p, with very small amounts of Pmt1p-Pmt3p and Pmt2p-Pmt5p. In the present invention, we found that, on the basis of inactivating α-1,6-mannosyltransferase, phosphomannosyltransferase and β-mannosyltransferases I-IV, further inactivating O-mannosyltransferase I while expressing exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous mannosidase II, exogenous N-acetylglucosamine transferase II, exogenous galactose isomerase GalE and exogenous galactosyltransferase GalT of specific types and from specific sources significantly reduces O-glycosylation of the proteins expressed by the engineered yeast and yields a specific mammalian-cell glycoform.
Accordingly, the following step (A3) may be included after step (A2):
(A3) Inactivating the endogenous O-mannosyltransferase I of recombinant yeast 2 to obtain recombinant yeast 3. Recombinant yeast 3 is also a Pichia pastoris with a genetically engineered glycosylation pathway.
Step (A3) further reduces O-glycosylation in the yeast.
The N-glycan of mammalian glycoform structure may be Gal2GlcNAc2Man3GlcNAc2, GalGlcNAcMan5GlcNAc2 or Man5GlcNAc2 (Gal: galactose; GlcNAc: N-acetylglucosamine; Man: mannose).
When the N-glycan is Gal2GlcNAc2Man3GlcNAc2, GalGlcNAcMan5GlcNAc2 or Man5GlcNAc2, the exogenous proteins expressed in recombinant yeast 1 in step (A2) may be exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous galactose isomerase, exogenous galactosyltransferase, exogenous mannosidase II and exogenous N-acetylglucosamine transferase II.
When the N-glycan is GalGlcNAcMan5GlcNAc2, the exogenous proteins expressed in recombinant yeast 1 in step (A2) may alternatively be exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous galactose isomerase and exogenous galactosyltransferase.
When the N-glycan is Man5GlcNAc2, the exogenous protein expressed in recombinant yeast 1 in step (A2) may alternatively be exogenous mannosidase I alone.
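The enzyme combinations listed above for each target glycoform can be tabulated as a lookup. The sketch below only restates those combinations; the abbreviations (`MDSI`, `GnTI`, `MDSII`, `GnTII`, `GalE`, `GalT`) and the function names are labels assumed for illustration, and the code makes no prediction about combinations the text does not list.

```python
# Hypothetical short labels for the six exogenous proteins named in step (A2).
ALL_SIX = frozenset({"MDSI", "GnTI", "MDSII", "GnTII", "GalE", "GalT"})

# Enzyme sets the text lists for each target N-glycan.
LISTED_SETS = {
    "Gal2GlcNAc2Man3GlcNAc2": [ALL_SIX],
    "GalGlcNAcMan5GlcNAc2": [ALL_SIX, frozenset({"MDSI", "GnTI", "GalE", "GalT"})],
    "Man5GlcNAc2": [ALL_SIX, frozenset({"MDSI"})],
}


def listed_enzyme_sets(glycoform: str) -> list[frozenset]:
    """Return the enzyme combinations listed for a glycoform (empty if none)."""
    return LISTED_SETS.get(glycoform, [])


def is_listed_combination(glycoform: str, enzymes) -> bool:
    """True only if this exact enzyme set is listed for the glycoform."""
    return frozenset(enzymes) in listed_enzyme_sets(glycoform)


if __name__ == "__main__":
    for name in LISTED_SETS:
        smallest = min(listed_enzyme_sets(name), key=len)
        print(name, "smallest listed set:", sorted(smallest))
```

Exact-set matching is used deliberately: the text does not say what intermediate or extra enzyme combinations would produce, so the sketch does not infer it.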
In steps (A1) and (A3), the glycosyl-modifying enzymes may be inactivated by mutating one or more nucleotides of the gene, by deleting part or all of the gene sequence, by inserting nucleotides to disrupt the original reading frame and terminate protein synthesis prematurely, or by similar means. The mutation, deletion or insertional inactivation may be obtained by conventional mutagenesis, knockout and similar methods. These methods are reported in many documents, such as J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Science Press, 1995. Other methods known in the art can also be used to construct yeast strains with inactivated genes. The preferred strain is obtained by knocking out a partial sequence of the mannosyltransferase gene. The deleted sequence is longer than three bases, preferably longer than 100 bases, and more preferably comprises more than 50% of the coding sequence. A strain obtained by knocking out a partial sequence of a glycosyl-modifying enzyme gene is unlikely to revert, is more stable than one constructed by methods such as point mutation, and is better suited to medical and industrial applications.
One method of knocking out a partial sequence of a glycosyl-modifying enzyme gene is as follows. First, a knockout plasmid is constructed. The plasmid contains homology arm sequences from both sides of the gene to be knocked out; each homology arm is longer than 200 bp, optimally 500 bp to 2000 bp. Alternatively, insertional inactivation can be used to obtain a nucleotide sequence whose encoded protein is functionally inactive through substitution, deletion and/or addition of one or more amino acid residues, and this sequence is built into a plasmid. The plasmid also carries a selection marker such as the URA3 (orotidine-5'-phosphate decarboxylase) gene or a resistance gene for bleomycin, hygromycin B, blasticidin or G418. The nucleotide sequences of the flanking homology arm fragments and of the gene encoding the protein to be disrupted can be obtained from the National Center for Biotechnology Information (NCBI). Using PCR with the Pichia pastoris host genome as template, flanking homologous regions of suitable length are amplified, comprising the regions upstream and downstream of the coding region of the target gene (whose sequence is published at NCBI), with suitable restriction sites added in the primers. The polynucleotides can be obtained by methods well known in the art, such as PCR (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Science Press, 1995), RT-PCR, chemical synthesis, or construction and screening of genomic DNA and cDNA libraries.
If desired, the polynucleotide may be mutated, deleted, inserted or ligated to other polynucleotides using methods well known in the art. The upstream (5') and downstream (3') flanking homology arm fragments can be fused by various methods known in the art, for example by overlap PCR, using standard molecular cloning procedures (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Science Press, 1995). Nucleic acids containing the fused homology arms of the gene to be inactivated can be cloned into various vectors suitable for yeast using methods well known in the art. Alternatively, the homology arms can be inserted separately into specific regions of the vector via the restriction sites they carry. To construct the recombinant knockout plasmid, the starting plasmid may be chosen from vectors suitable for yeast, such as expression vectors and shuttle vectors, carrying replication origins, selection markers and auxotrophic markers (URA3, HIS, ADE1, LEU2, ARG4) and the like. The construction of such vectors is described in many documents (e.g., J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Science Press, 1995), and they are also commercially available from various companies (e.g., Invitrogen Life Technologies, Carlsbad, California 92008, USA). The preferred vectors are the yeast expression vectors pPICZαA and pYES2. The inactivation vector is a shuttle plasmid that is first replicated and amplified in Escherichia coli and then introduced into the host yeast cell. The vector should carry a resistance marker gene or an auxotrophic marker gene to facilitate subsequent screening of transformants.
The homologous regions on the two sides of the gene to be inactivated (the upstream region is called the 5' arm and the downstream region the 3' arm) are each built into a yeast vector to form a recombinant knockout vector. The knockout vector is then linearized at a site within the homology arms and transformed by electroporation into Pichia pastoris or one of the engineered Pichia pastoris strains, which is then cultured. Transformation of the desired nucleic acid into the host cell can be achieved by conventional methods, such as preparation of competent cells, electroporation or the lithium acetate method (A. Adams et al., Methods in Yeast Genetics, Science Press, 2000). Successfully transformed cells, i.e., cells containing the homologous region of the gene to be knocked out, can be identified by well-known techniques, such as collecting and lysing cells, extracting DNA and genotyping by PCR. Before that, preliminary selection for the correct phenotype can be made using auxotrophic or resistance markers. Transformants with the correct first recombination event are cultured in basic yeast medium and then plated on second-recombination selection plates, such as 5-fluoroorotic acid plates containing uracil; the clones that grow are further genotyped by PCR. Transformants with the expected deletion of the gene's coding region are selected.
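The length criteria stated above for the deleted region (more than three bases, preferably more than 100 bases, more preferably more than 50% of the coding sequence) and for the homology arms (more than 200 bp, optimally 500 bp to 2000 bp) can be expressed as a small design check. This is a sketch under assumptions: the function names and grade labels are hypothetical, and the "more preferred" grade is read as nested inside the "preferred" one.

```python
def grade_deletion(deleted_bp: int, cds_bp: int) -> str:
    """Grade a planned partial knockout by the length of the deleted region."""
    if deleted_bp <= 3:
        return "insufficient"  # the text requires more than three bases
    if deleted_bp > 100 and deleted_bp > 0.5 * cds_bp:
        return "more preferred"  # >100 bases and >50% of the coding sequence
    if deleted_bp > 100:
        return "preferred"
    return "acceptable"


def grade_homology_arm(arm_bp: int) -> str:
    """Grade a homology arm by length in base pairs."""
    if arm_bp <= 200:
        return "too short"  # the text requires more than 200 bp
    if 500 <= arm_bp <= 2000:
        return "optimal"
    return "acceptable"


if __name__ == "__main__":
    # Illustrative numbers only; not taken from the patent.
    print(grade_deletion(deleted_bp=1500, cds_bp=2000))
    print(grade_homology_arm(1000))
```

A check like this only captures the stated length ranges; it says nothing about arm sequence, restriction sites or marker choice, which the text treats separately.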
In a specific embodiment of the present invention, in step (A1), the endogenous α-1,6-mannosyltransferase, phosphomannosyltransferase, β-mannosyltransferase I, β-mannosyltransferase II, β-mannosyltransferase III and β-mannosyltransferase IV of the recipient Pichia pastoris are each inactivated by knockout through homologous recombination.
In a specific embodiment of the present invention, in step (A2), the exogenous protein is expressed in recombinant yeast 1 by introducing a gene encoding the exogenous protein into recombinant yeast 1.
Further, the gene encoding the exogenous protein is introduced into recombinant yeast 1 in the form of a recombinant vector.
Further, the gene encoding exogenous mannosidase I and the gene encoding exogenous mannosidase II are each introduced into recombinant yeast 1 twice.
In a specific embodiment of the present invention, in step (A3), the endogenous O-mannosyltransferase I of recombinant yeast 2 is inactivated not by conventional gene knockout but by insertional inactivation of the O-mannosyltransferase I gene in the genomic DNA of recombinant yeast 2 (the insertion disrupts the corresponding nucleotide sequence).
Specifically, in the present invention, different combinations of stop codons are placed at the front end and the tail end of the target fragment of the O-mannosyltransferase I gene in the genomic DNA of recombinant yeast 2, and a terminator (such as the CYC1TT terminator) is placed after the stop codons at the tail end. The target fragment carrying different combinations of stop codons at its front and tail ends is obtained by PCR amplification using genomic DNA of Pichia pastoris JC308 as template and the primers PMT1-IN-5 and PMT1-IN-3.
PMT1-IN-5: 5’-tctatgcattaatgatagttaatgactaatagagtaaaacaagtcctcaagaggt-3’;
PMT1-IN-3: 5’-tgacataactaattacatgatctattagtcattaactatcattagatcagagtggggacgactaagaaagc-3’.
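Because the two primers are described as carrying combinations of stop codons, it can be useful to list the stop codons in each reading frame of a primer or of its reverse complement. The sketch below is an illustration, not the patent's procedure: the helper names are hypothetical, and the PMT1-IN-3 string assumes that the space inside the sequence as extracted is a line-wrap artifact.

```python
STOP_CODONS = {"taa", "tag", "tga"}
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA sequence."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def stop_codons_by_frame(seq: str) -> dict[int, list[tuple[int, str]]]:
    """Map each forward frame (0, 1, 2) to (0-based position, codon) of its stops."""
    seq = seq.lower()
    hits: dict[int, list[tuple[int, str]]] = {0: [], 1: [], 2: []}
    for frame in range(3):
        # Step through complete codons only; a trailing partial codon is skipped.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                hits[frame].append((i, codon))
    return hits


if __name__ == "__main__":
    pmt1_in_5 = "tctatgcattaatgatagttaatgactaatagagtaaaacaagtcctcaagaggt"
    # Assumption: the internal space in the source sequence is an artifact.
    pmt1_in_3 = ("tgacataactaattacatgatctattagtcattaactatcattagatcagagtgg"
                 "ggacgactaagaaagc")
    print("PMT1-IN-5:", stop_codons_by_frame(pmt1_in_5))
    print("PMT1-IN-3 (reverse complement):",
          stop_codons_by_frame(reverse_complement(pmt1_in_3)))
```

Scanning the reverse complement of the 3' primer reports stops as they would appear on the sense strand of the amplified fragment; which frame matters depends on the reading frame of the target gene, which the sketch does not attempt to determine.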
The next technical problem is to construct engineered pichia pastoris strains with mammalian cell glycoform modification capability in yeast chassis cells, which are complex and involved in glycosyl modification enzymes of mammalian cell glycoform modification, which enzyme modification will obtain which glycoform? And the ratio combinations of glycoforms obtained are not known until not studied. The invention is realized by the following technical methods:
The exogenous mannosidase I is derived from Trichoderma viride and the C-terminal fusion endoplasmic reticulum retention signal HDEL.
The exogenous N-acetylglucosamine transferase I may be N-acetylglucosamine transferase I derived from mammals and the like, such as human N-acetylglucosamine transferase I (GenBank NO NM 002406), candida albicans N-acetylglucosamine transferase I (GenBank NO NW_ 139513.1), pelargonium gracilis N-acetylglucosamine transferase I (GenBank NO NC_ 007088.5) and the like, and endoplasmic reticulum or inner Golgi localization signals such as ScGLS, scMNS1, ppSEC12, scMNN9 and the like may be fused at the N-terminus or the C-terminus; preferably of human origin and containing an mnn9 localization signal;
The exogenous mannosidase II may be mannosidase II derived from filamentous fungi, plants, insects, java, mammals, etc., such as fly mannosidase II (GenBank NO X77652), nematode mannosidase II (GenBank NO NM 0735941), human mannosidase II (GenBank NO U31520), etc.; the expressed mannosidase II may be fused at the N- or C-terminus to an endoplasmic reticulum or inner Golgi localization signal, such as ScGLS, scMNS1, ppSEC, scMNN, etc.; it is preferably derived from nematodes and contains an mnn2 localization signal;
The exogenous N-acetylglucosamine transferase II may be N-acetylglucosamine transferase II derived from mammals, such as human N-acetylglucosamine transferase II (GenBank NO Q10469), murine N-acetylglucosamine transferase II (GenBank NO Q09326), etc.; the expressed N-acetylglucosamine transferase II may be fused at the N- or C-terminus to an endoplasmic reticulum or inner Golgi localization signal, such as ScGLS, scMNS1, ppSEC, scMNN9, etc.; it is preferably of human origin and contains an mnn2 localization signal;
The mannosidase II and the N-acetylglucosamine transferase II both contain an mnn2 localization signal;
The galactose isomerase and the galactose transferase form a fusion protein, are both of human origin, and share a kre2 localization signal.
The galactosyltransferase may be a galactosyltransferase derived from mammals or the like, such as human beta-1,4-galactosyltransferase (GenBank NO gi: 13929461), murine beta-1,4-galactosyltransferase (GenBank NO NC-000081.6), or the like. The expressed galactosyltransferase may be fused at the N- or C-terminus to an endoplasmic reticulum or inner Golgi localization signal such as ScKRE2, scGLS, scMNS1, ppSEC, scMNN, etc.; the galactosyltransferase of the present invention is of human origin and shares a kre2 localization signal.
The alpha-1,6-mannosyltransferase may be B1) or B2) as follows:
B1) A protein having the amino acid sequence of SEQ ID No.1;
B2) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.1 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.1 and having the same function.
The phosphomannose transferase may be B3) or B4) as follows:
B3) A protein having the amino acid sequence of SEQ ID No.2;
B4) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.2 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.2 and having the same function.
The phosphomannose synthetase may be B5) or B6) as follows:
B5) A protein having the amino acid sequence of SEQ ID No.3;
B6) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.3 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.3 and having the same function.
The beta-mannosyltransferase I may be B7) or B8) as follows:
B7) A protein having the amino acid sequence of SEQ ID No.4;
B8) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.4 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.4 and having the same function.
The beta-mannosyltransferase II may be B9) or B10) as follows:
B9) A protein having the amino acid sequence of SEQ ID No.5;
B10) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.5 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.5 and having the same function.
The beta-mannosyltransferase III may be B11) or B12) as follows:
B11) A protein having the amino acid sequence of SEQ ID No.6;
B12) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.6 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.6 and having the same function.
The beta-mannosyltransferase IV may be B13) or B14) as follows:
B13) A protein having the amino acid sequence of SEQ ID No.7;
B14) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.7 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.7 and having the same function.
The O-mannosyltransferase I may be B15) or B16) as follows:
B15) A protein having the amino acid sequence of SEQ ID No.8;
B16) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.8 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.8 and having the same function.
The exogenous mannosidase I may be B17) or B18) as follows:
B17) A protein having the amino acid sequence of SEQ ID No.9;
B18) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.9 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.9 and having the same function.
The exogenous N-acetylglucosamine transferase I may be B19) or B20) as follows:
B19) A protein having the amino acid sequence of SEQ ID No.10;
B20) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.10 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.10 and having the same function.
The fusion protein consisting of the galactose isomerase and the galactose transferase may be B21) or B22) as follows:
B21) A protein having the amino acid sequence of SEQ ID No.11;
B22) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.11 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.11 and having the same function.
The mannosidase II may be B23) or B24) as follows:
B23) A protein having the amino acid sequence of SEQ ID No.12;
B24) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.12 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.12 and having the same function.
The N-acetylglucosamine transferase II may be B25) or B26) as follows:
B25) A protein having the amino acid sequence of SEQ ID No.13;
B26) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence shown in SEQ ID No.13 and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence shown in SEQ ID No.13 and having the same function.
The coding gene of the exogenous mannosidase I may be C1) or C2) as follows:
C1) A DNA molecule having the nucleotide sequence of SEQ ID No.14;
C2) A DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the nucleotide sequence shown in SEQ ID No.14 and encoding the exogenous mannosidase I, or a DNA molecule hybridizing under stringent conditions with the DNA molecule defined in C1) and encoding the exogenous mannosidase I.
The coding gene of the exogenous N-acetylglucosamine transferase I may be C3) or C4) as follows:
C3) A DNA molecule having the nucleotide sequence of SEQ ID No.15;
C4) A DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the nucleotide sequence shown in SEQ ID No.15 and encoding the exogenous N-acetylglucosamine transferase I, or a DNA molecule hybridizing under stringent conditions with the DNA molecule defined in C3) and encoding the exogenous N-acetylglucosamine transferase I.
The gene encoding the fusion protein consisting of the galactose isomerase and the galactose transferase may be C5) or C6) as follows:
C5) A DNA molecule having the nucleotide sequence of SEQ ID No.16;
C6) A DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the nucleotide sequence shown in SEQ ID No.16 and encoding the fusion protein, or a DNA molecule hybridizing under stringent conditions with the DNA molecule defined in C5) and encoding the fusion protein.
The coding gene of the mannosidase II may be C7) or C8) as follows:
C7) A DNA molecule having the nucleotide sequence of SEQ ID No.17;
C8) A DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the nucleotide sequence shown in SEQ ID No.17 and encoding the mannosidase II, or a DNA molecule hybridizing under stringent conditions with the DNA molecule defined in C7) and encoding the mannosidase II.
The coding gene of the N-acetylglucosamine transferase II may be C9) or C10) as follows:
C9) A DNA molecule having the nucleotide sequence of SEQ ID No.18;
C10) A DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the nucleotide sequence shown in SEQ ID No.18 and encoding the N-acetylglucosamine transferase II, or a DNA molecule hybridizing under stringent conditions with the DNA molecule defined in C9) and encoding the N-acetylglucosamine transferase II.
Information on all glycosyl modification enzymes of the invention is available from the National Center for Biotechnology Information (NCBI) or from published literature, and the functions and definitions of these enzymes can be found in the literature. Even within the same strain or species, the amino acid sequences of the respective enzymes may differ slightly depending on their origin, but their functions are substantially the same; the enzymes of the present invention therefore include such variants.
Further, the Pichia pastoris genetically engineered in its glycosylation modification pathway is the strain deposited in the China General Microbiological Culture Collection Center under accession number CGMCC No.19488.
In step (1), the recombinant yeast cell can be obtained by introducing a gene encoding the coronavirus S protein receptor binding domain (RBD) into the Pichia pastoris genetically engineered in its glycosylation modification pathway.
Further, the gene encoding the coronavirus S protein receptor binding domain (RBD) is introduced into the Pichia pastoris genetically engineered in its glycosylation modification pathway in the form of a recombinant vector.
In the present invention, the recombinant vector is specifically a recombinant vector obtained by cloning the gene encoding the coronavirus S protein receptor binding domain (RBD) between cleavage sites (e.g., XhoI and NotI) of the pPICZ alpha A vector.
Wherein the coronavirus S protein Receptor Binding Domain (RBD) may be any of the following:
(a1) The protein shown in SEQ ID No.21 (corresponding to RBD223) or a truncated form thereof;
(a2) The protein shown in SEQ ID No.22 (corresponding to RBD219) or a truncated form thereof;
(a3) The protein shown in SEQ ID No.23 (corresponding to RBD216) or a truncated form thereof;
(a4) The protein shown in SEQ ID No.24 (corresponding to RBD210) or a truncated form thereof;
(a5) A protein obtained by substitution and/or deletion and/or addition of one or more amino acid residues in the amino acid sequence defined in any one of (a1) to (a4) and having the same function, or a protein having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the amino acid sequence defined in any one of (a1) to (a4) and having the same function.
Accordingly, the gene encoding the coronavirus S protein receptor binding domain (RBD) may be a DNA molecule encoding the coronavirus S protein receptor binding domain shown in any one of (a1) to (a5).
Further, the encoding gene of the coronavirus S protein Receptor Binding Domain (RBD) may be any of the following:
(b1) The DNA molecule shown in SEQ ID No.25 (corresponding to RBD223);
(b2) The DNA molecule shown in SEQ ID No.26 (corresponding to RBD219);
(b3) The DNA molecule shown in SEQ ID No.27 (corresponding to RBD216);
(b4) The DNA molecule shown in SEQ ID No.28 (corresponding to RBD210);
(b5) A DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more homology with the nucleotide sequence shown in any one of SEQ ID No.25 to SEQ ID No.28 and encoding the coronavirus S protein receptor binding domain (RBD), or a DNA molecule hybridizing under stringent conditions with the DNA molecule shown in any one of SEQ ID No.25 to SEQ ID No.28 and encoding the coronavirus S protein receptor binding domain (RBD).
The four amino acid sequences shown in SEQ ID No.21 to SEQ ID No.24 are all parts of the S protein of the SARS-CoV-2 "Wuhan-Hu-1" isolate (GenBank No. MN908947.3). Specifically, SEQ ID No.21 is the R319-F541 region of the S protein (RBD223); SEQ ID No.22 is the R319-K537 region (RBD219); SEQ ID No.23 is the R319-V534 region (RBD216); SEQ ID No.24 is the R319-K528 region (RBD210).
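The numeric suffix of each construct name matches the number of residues in the stated region when both end residues are counted. A minimal sketch of that arithmetic follows; the dictionary and function names are illustrative and do not come from the original text.

```python
# Minimal sketch: check that each RBD construct name matches the length of
# its residue range on the S protein (both end residues inclusive).

RBD_REGIONS = {
    "RBD223": (319, 541),  # R319-F541, SEQ ID No.21
    "RBD219": (319, 537),  # R319-K537, SEQ ID No.22
    "RBD216": (319, 534),  # R319-V534, SEQ ID No.23
    "RBD210": (319, 528),  # R319-K528, SEQ ID No.24
}


def region_length(start, end):
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


if __name__ == "__main__":
    for name, (start, end) in RBD_REGIONS.items():
        print(name, region_length(start, end))
```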
The nucleotide sequences of SEQ ID No.25 to SEQ ID No.28 are obtained by codon optimization according to the amino acid sequences of SEQ ID No.21 to SEQ ID No.24, respectively, and DNA fragments of the corresponding sequences are obtained by total gene synthesis.
In the above proteins, homology refers to the identity of amino acid sequences. The identity of amino acid sequences can be determined using homology search sites on the internet, such as BLAST web pages of the NCBI homepage website. For example, in advanced BLAST2.1, by using blastp as a program, expect values are set to 10, all filters are set to OFF, BLOSUM62 is used as Matrix, gap existence cost, per residue gap cost and Lambda ratio are set to 11,1 and 0.85 (default values), respectively, and identity of a pair of amino acid sequences is searched for and calculated, and then the value (%) of identity can be obtained.
In the above genes, homology refers to nucleotide sequence identity. The identity of nucleotide sequences can be determined using homology search sites on the internet, such as BLAST web pages of the NCBI homepage website. For example, in advanced BLAST2.1, by using blastp as a program, expect values are set to 10, all filters are set to OFF, BLOSUM62 is used as Matrix, gap existence cost, per residue gap cost and Lambda ratio are set to 11,1 and 0.85 (default values), respectively, and identity of a pair of nucleotide sequences is searched for and calculated, and then the value (%) of identity can be obtained.
In the above proteins and genes, homology of 95% or more may be at least 96%, 97% or 98% identity. Homology of 90% or more may be at least 91%, 92%, 93% or 94% identity. Homology of 85% or more may be at least 86%, 87%, 88% or 89% identity. Homology of 80% or more may be at least 81%, 82%, 83% or 84% identity.
In the above genes, the stringent conditions may be any of the following: hybridization at 50°C in a mixed solution of 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4 and 1 mM EDTA, followed by rinsing in 2×SSC, 0.1% SDS at 50°C; the same hybridization followed by rinsing in 1×SSC, 0.1% SDS at 50°C; the same hybridization followed by rinsing in 0.5×SSC, 0.1% SDS at 50°C; the same hybridization followed by rinsing in 0.1×SSC, 0.1% SDS at 50°C; the same hybridization followed by rinsing in 0.1×SSC, 0.1% SDS at 65°C; or hybridization at 65°C in a solution of 6×SSC, 0.5% SDS, followed by one wash each in 2×SSC, 0.1% SDS and in 1×SSC, 0.1% SDS.
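As an illustration of how an identity percentage relates to the homology thresholds above, the following minimal sketch computes identity over a pair of sequences that have already been aligned (gaps written as "-"). It is an assumption-laden simplification: BLAST derives its own local alignment and statistics, which this sketch does not reproduce, and the function names are hypothetical.

```python
# Minimal sketch (not BLAST): percent identity of two pre-aligned sequences,
# counted as identical columns divided by alignment length.


def percent_identity(aligned_a, aligned_b):
    """Percent identity of two equal-length aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_homology(identity, threshold=80.0):
    """True if the identity reaches the chosen threshold (80, 85, 90, 95 or 99)."""
    return identity >= threshold


if __name__ == "__main__":
    identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(identity, meets_homology(identity, 90.0))
```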
Further, in step (2), the coronavirus S protein receptor binding domain having mammalian glycoform N-sugar chain modification can be purified from the culture supernatant by a method comprising the following steps: subjecting the culture supernatant sequentially to cation exchange chromatography, hydrophobic chromatography, G25 desalting and anion exchange chromatography to obtain the coronavirus S protein receptor binding domain having mammalian glycoform N-sugar chain modification.
More specifically, in step (2), the coronavirus S protein receptor binding domain having mammalian glycoform N-sugar chain modification is purified from the culture supernatant by a method comprising the following steps: the culture supernatant is passed through a Capto MMC chromatography column to capture the target protein, which is eluted with a buffer containing 1 M NaCl to give a crude sample containing the target protein; the crude sample is purified on a Phenyl HP hydrophobic chromatography column; the elution peak sample containing the target protein is desalted on a G25 chromatography column; contaminating proteins are then adsorbed on a Source 30Q anion exchange chromatography column, and the target protein is collected in the flow-through. The target protein is the coronavirus S protein receptor binding domain having mammalian glycoform N-sugar chain modification.
In a second aspect, the invention claims any one of the following products:
D1) The coronavirus S protein receptor binding domain having mammalian glycoform N-sugar chain modification, prepared by the method of the first aspect.
D2) The recombinant yeast cell prepared in step (1) of the method of the first aspect.
D3) A drug for preventing and/or treating diseases caused by coronavirus infection, the active ingredient of which is the coronavirus S protein receptor binding domain of D1).
D4) A drug capable of inhibiting coronavirus, the active ingredient of which is the coronavirus S protein receptor binding domain of D1).
D5) A reagent or kit for diagnosing coronavirus infection, comprising the coronavirus S protein receptor binding domain of D1).
D6) A coronavirus vaccine comprising an antigen and an adjuvant; the antigen is the coronavirus S protein receptor binding domain of D1); the adjuvant may be an aluminum adjuvant.
In a specific embodiment of the invention, the adjuvant is specifically aluminum hydroxide. The coronavirus vaccine is prepared by mixing the coronavirus S protein receptor binding domain of D1) with aluminum hydroxide at a mass ratio of 1:10.
D7) A product capable of inducing an animal to produce antibodies specific for the coronavirus S protein receptor binding domain, the active ingredient of which is the coronavirus S protein receptor binding domain of D1).
In a third aspect, the invention claims any of the following applications:
D8) Use of the recombinant yeast cell of D2) in the preparation of the coronavirus S protein receptor binding domain of D1).
D9) Use of the coronavirus S protein receptor binding domain of D1) in the preparation of the drug of D3) or D4), the reagent or kit of D5), the coronavirus vaccine of D6) or the product of D7).
In each of the above aspects, the coronavirus is SARS-CoV-2.
In a specific embodiment of the invention, the coronavirus is specifically the SARS-CoV-2 "Wuhan-Hu-1" isolate.
Experiments show that the coronavirus S protein RBD expressed by the Pichia pastoris genetically engineered in its glycosylation modification pathway carries N-sugar chain modification of mammalian glycoform structure, which avoids problems such as the allergy that fungal-type glycosylation may cause. After immunization of mice, the coronavirus S protein RBD expressed according to the invention induces high-titer anti-RBD antibodies that can neutralize SARS-CoV-2 virus. In addition, the engineered Pichia pastoris strain has a short construction period, grows quickly, is easy to produce at scale and is highly safe, which favors the efficient development and large-scale production of novel coronavirus vaccines in emergencies such as a sudden outbreak of novel coronavirus infection.
Preservation description
Latin name of the strain: Pichia pastoris
Reference biological material (strain): GJK30
Proposed taxonomic designation: Pichia pastoris
Depositary institution: China General Microbiological Culture Collection Center
Abbreviation of the depositary institution: CGMCC
Address: No. 3, Courtyard 1, Beichen West Road, Chaoyang District, Beijing
Date of deposit: March 18, 2020
Accession number at the depositary institution: CGMCC No.19488
Drawings
FIG. 1 shows the och1 gene identification and glycoform analysis results for strain GJK01. A: och1 gene identification. M: marker; 1: strain GJK01 (och1 knocked out); 2: strain X33 (och1 not knocked out). B: DSA-FACE glycoform analysis of an antibody expressed by strain GJK01 (och1 knocked out).
FIG. 2 shows the pno1 gene identification results. M: marker; 1: strain GJK02 (pno1 knocked out); 2: strain X33 (pno1 not knocked out).
FIG. 3 shows the mnn4b gene identification results. M: marker; 1: strain GJK03 (mnn4b knocked out); 2: strain X33 (mnn4b not knocked out).
FIG. 4 shows the DSA-FACE glycoform analysis results for strains GJK01, GJK02 and GJK03 (with och1, pno1 and mnn4b knocked out).
FIG. 5 shows the ARM2 gene identification results. M: marker; 1: strain GJK04 (ARM2 knocked out); 2: strain X33 (ARM2 not knocked out).
FIG. 6 shows the ARM1 gene identification results. M: marker; 1: strain GJK05 (ARM1 knocked out); 2: strain X33 (ARM1 not knocked out).
FIG. 7 shows the ARM3 gene identification results. M: marker; 1: strain GJK07 (ARM3 knocked out); 2: strain X33 (ARM3 not knocked out).
FIG. 8 shows the ARM4 gene identification results. M: marker; 1: strain GJK18 (ARM4 knocked out); 2: strain X33 (ARM4 not knocked out).
FIG. 9 shows the DSA-FACE glycoform analysis result for strain GJK18.
FIG. 10 shows the TrmdsI gene identification and DSA-FACE glycoform analysis results for strain W10. A: TrmdsI gene identification. M: marker; 1: strain W10 (TrmdsI introduced); 2: strain X33 (no TrmdsI). B: DSA-FACE glycoform analysis of strain W10.
FIG. 11 shows the GnTI gene identification and DSA-FACE glycoform analysis results for strain 1-8. A: GnTI gene identification. M: marker; 1: strain 1-8 (GnTI introduced); 2: strain X33 (no GnTI). B: DSA-FACE glycoform analysis of strain 1-8.
FIG. 12 shows the GalE-GalT gene identification and DSA-FACE glycoform analysis results for strain 1-8-4. A: GalE-GalT gene identification. M: marker; 1: strain 1-8-4 (GalE-GalT introduced); 2: strain X33 (no GalE-GalT). B: DSA-FACE glycoform analysis of strain 1-8-4.
FIG. 13 shows the MdsII and GnTII gene identification results and the DSA-FACE glycoform analysis result for strains 52-60 and 150L2. A: MdsII gene identification. M: marker; 1: strain 52-60 (MdsII introduced); 2: strain X33 (no MdsII). B: GnTII gene identification. M: marker; 1: strain 150L2 (GnTII introduced); 2: strain X33 (no GnTII). C: DSA-FACE glycoform analysis of strain 52-60.
FIG. 14 shows the identification of the insertionally inactivated PMT1 gene. M: marker; 1: strain X33 (PMT1 not inactivated); 2: strain GJK30 (PMT1 inactivated).
FIG. 15 shows the glycoform structure analysis of the engineered strain GJK30. A: the Gal2GlcNAc2Man3GlcNAc2 structure obtained at an earlier stage, at a proportion of less than 50%; B: the Gal2GlcNAc2Man3GlcNAc2 structure obtained with the engineered strain GJK30, at a glycoform proportion of more than 60%; C: digestion analysis of this glycoform with glycosidases (New England Biolabs, Beijing).
FIG. 16 shows Western blot (WB) verification of the screening of CGMCC19488/pPICZ alpha-SARS2S-RBD (RBD223) positive clones. Upper half: SDS-PAGE analysis; lower half: Western blot analysis; lanes 1-7: different expression clones.
FIG. 17 shows electrophoretic detection of CGMCC19488/pPICZ alpha-S-RBD (RBD223) expression at different induction times.
FIG. 18 shows SDS-PAGE of purified SARS-CoV-2 S-RBD (RBD223) samples.
FIG. 19 shows a WB comparison of the RBD (RBD223) glycoprotein expressed by CGMCC19488/pPICZ alpha-S-RBD (RBD223) and by X33/pPICZ alpha-S-RBD (RBD223). Upper half: SDS-PAGE analysis; lower half: Western blot analysis; lanes 1-3: three different X33/pPICZ alpha-S-RBD clones.
FIG. 20 shows electrophoresis of SARS-CoV-2 S-RBD (RBD223) digested with PNGase F and Endo H.
FIG. 21 shows the DSA-FACE sugar chain analysis result for the SARS-CoV-2 S-RBD (RBD223) glycoprotein expressed by CGMCC19488.
FIG. 22 shows the anti-RBD antibody titers of mice 14 days after the second immunization (RBD223 glycoprotein).
FIG. 23 shows the virus neutralization assay results (RBD223 glycoprotein).
FIG. 24 shows SDS-PAGE analysis of SARS-CoV-2 S-RBD expressed by CGMCC19488 (RBD210, RBD216, RBD219 and RBD223).
FIG. 25 shows the glycoform analysis results for the SARS-CoV-2 S-RBD glycoproteins (RBD210, RBD216 and RBD219) expressed by CGMCC19488.
FIG. 26 shows the anti-RBD antibody titers of mice 14 days after priming (RBD210, RBD216, RBD219 and RBD223 glycoproteins).
FIG. 27 shows the virus neutralization assay results (RBD210, RBD216, RBD219 and RBD223 glycoproteins).
Detailed Description
The experimental methods used in the following examples are conventional methods unless otherwise specified.
Materials, reagents and the like used in the examples described below are commercially available unless otherwise specified.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Exemplary methods and materials are described below, although methods and materials similar or equivalent to those described herein can also be used in the practice of the present invention, as will be apparent to those skilled in the art. All publications and other references mentioned herein are incorporated by reference in their entirety. In the event of inconsistencies, the present description, including the definitions, controls. The materials, methods, and examples are illustrative only and not intended to be limiting.
The pPICZ alpha A and pYES2 vectors and the Pichia pastoris strains X33 and GS115 are products of Invitrogen.
Pichia pastoris GJK01, CGMCC No.1853 (described in patent ZL200610164912.8, publication number CN101195809; a Pichia pastoris strain in which alpha-1,6-mannosyltransferase is inactivated).
The Pyrobest enzyme, LA Taq enzyme, dNTPs, restriction enzymes, T4 ligase and the like used in the experiments were purchased from TaKaRa Biotechnology (Dalian) Co., Ltd.; the pfu enzyme, kits and DH5 alpha competent cells were from Beijing TransGen Biotech Co., Ltd. Total gene synthesis, nucleotide synthesis, primer synthesis, sequencing and the like were provided by Sangon Biotech (Shanghai) Co., Ltd.
SARS-CoV-2 (2019-nCoV) Spike RBD-Fc Recombinant Protein (40592-V02H) is a product of Sino Biological Inc. (Beijing); the goat anti-rabbit IgG secondary antibody (SAB3700885) is a product of Sigma; the goat anti-mouse IgG secondary antibody (ab205719) is a product of Abcam; the BglII restriction enzyme is a product of NEB; PNGase F (P0708) and Endo H (P0702) are products of NEB.
The Capto MMC chromatography medium, Phenyl HP, G25 and Source 30Q used in the experiments are all available from GE Healthcare.
The sequence information of the related modified enzymes in the following examples is shown in Table 1.
TABLE 1 related modified enzymes according to the invention
EXAMPLE 1 construction of Pichia pastoris genetically engineered with glycosylation modification pathways
1. Construction of a yeast strain with an inactivated phosphomannose transferase gene
The base strain used in the invention is the previously constructed strain GJK01, deposited under accession number CGMCC No.1853 and covered by granted patent ZL200610164912.8. It is a Pichia pastoris strain in which alpha-1,6-mannosyltransferase is inactivated. The amino acid sequence of the alpha-1,6-mannosyltransferase (OCH1) is shown in SEQ ID No.1.
The yeast strain GJK02 with an inactivated phosphomannose transferase gene is a recombinant yeast obtained by knocking out, in Pichia pastoris GJK01, part of the DNA molecule encoding the phosphomannose transferase shown in SEQ ID No.2, that is, by knocking out the phosphomannose transferase gene in the GJK01 genome.
1. Construction of the phosphomannose transferase gene inactivation vector
The knockout plasmid pYES2-PNO1 for knocking out the phosphomannose transferase (PNO1) gene is a vector obtained by inserting the knockout fragment for the phosphomannose transferase (PNO1) gene (SEQ ID No.20) between the KpnI and XbaI cleavage sites of the vector pYES2. Nucleotides 7-1006 of SEQ ID No.20, counted from the 5' end, are the upstream homology arm of the PNO1 knockout fragment; nucleotides 1015-2017 of SEQ ID No.20, counted from the 5' end, are the downstream homology arm of the PNO1 knockout fragment.
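The stated coordinates can be turned into arm lengths and a spacer interval with simple inclusive arithmetic. The sketch below is illustrative only: it uses the 1-based positions given above, the helper names are hypothetical, and SEQ ID No.20 itself is not reproduced here.

```python
# Minimal sketch: homology-arm lengths and the spacer between them, using
# the 1-based inclusive coordinates quoted for SEQ ID No.20.

UPSTREAM_ARM = (7, 1006)
DOWNSTREAM_ARM = (1015, 2017)


def span_length(start, end):
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def spacer_between(first, second):
    """Inclusive 1-based interval lying strictly between two intervals."""
    return (first[1] + 1, second[0] - 1)


def slice_1based(seq, start, end):
    """Extract an inclusive 1-based interval from a Python string."""
    return seq[start - 1:end]


if __name__ == "__main__":
    spacer = spacer_between(UPSTREAM_ARM, DOWNSTREAM_ARM)
    print("upstream arm:", span_length(*UPSTREAM_ARM), "nt")
    print("downstream arm:", span_length(*DOWNSTREAM_ARM), "nt")
    print("spacer:", spacer, span_length(*spacer), "nt")
```

The arms come out at 1000 nt and 1003 nt, in line with the "about 1 kb" figure, and the 8 nt between them would be consistent with an 8 bp NotI recognition site (GCGGCCGC) joining the two arms, although the sequence listing would be needed to confirm this.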
The method comprises the following steps:
Genomic DNA of Pichia pastoris X33 was extracted by the glass bead method (A. Adams et al., Methods in Yeast Genetics: A Laboratory Course Manual, Science Press, 2000) and used as a template to amplify the homology arms flanking the phosphomannose transferase (PNO1) gene. Each flanking homology arm is about 1 kb, and the roughly 1.4 kb coding region between them is deleted.
Primers used for amplifying homologous arm of upstream flanking region of PNO1 (homologous arm of PNO1 5') are PNO-5-5 and PNO-5-3, and the primer sequences are respectively:
5'-AGTGGTACCGCAGTTTAATCATAGCCCACTGC-3' (Kpn I recognition site in underlined part);
5'-ATTCCAATACCAAGAAAGTAAAGTgcggccgcAAGTGGAACTGGCGCACCGGT-3' (NotI recognition site in the underlined part).
The primers used to amplify the homology arm of the downstream flanking region of PNO1 (PNO1 3' homology arm) are PNO-3-5 and PNO-3-3, with the following sequences:
5'-ACCGGTGCGCCAGTTCCACTTgcggccgcACTTTACTTTCTTGGTATTGGAAT-3' (underlined is Not I recognition site);
5'-TGTTCTAGATCCGAGATTTTGCGCTATGGAGC-3' (Xba I recognition site in the underlined section).
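Because the underlining that marked the recognition sites has not survived in this text, the sites can be located programmatically instead. The sketch below is an illustration only (the function names are hypothetical): it searches each PNO1 primer for the Kpn I, Not I and Xba I recognition sequences, and checks whether the two inner primers PNO-5-3 and PNO-3-5 are reverse complements of each other, which is what overlap extension PCR needs in order to fuse the two arms.

```python
# Primer sequences copied from the text (5' -> 3').
PRIMERS = {
    "PNO-5-5": "AGTGGTACCGCAGTTTAATCATAGCCCACTGC",
    "PNO-5-3": "ATTCCAATACCAAGAAAGTAAAGTgcggccgcAAGTGGAACTGGCGCACCGGT",
    "PNO-3-5": "ACCGGTGCGCCAGTTCCACTTgcggccgcACTTTACTTTCTTGGTATTGGAAT",
    "PNO-3-3": "TGTTCTAGATCCGAGATTTTGCGCTATGGAGC",
}

# Recognition sequences of the enzymes named in the text.
SITES = {"Kpn I": "GGTACC", "Not I": "GCGGCCGC", "Xba I": "TCTAGA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case-insensitive input)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def sites_in(seq):
    """Names of the enzymes whose recognition sequence occurs in seq."""
    seq = seq.upper()
    return [name for name, site in SITES.items() if site in seq]


for name, seq in PRIMERS.items():
    print(name, sites_in(seq))

# True when the two inner (fusion) primers are exact reverse complements.
fusion_primers_complementary = (
    reverse_complement(PRIMERS["PNO-3-5"]) == PRIMERS["PNO-5-3"].upper()
)
print(fusion_primers_complementary)
```

The same check applies unchanged to the other primer sets in this example by swapping in their sequences and recognition sites.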
The PCR amplification conditions for the two homology arms were as follows: initial denaturation at 94℃ for 5 min; 30 cycles of denaturation at 94℃ for 30 sec, annealing at 55℃ for 30 sec and extension at 72℃ for 1 min 30 sec; and a final extension at 72℃ for 10 min. The target fragments were about 1 kb. The PCR products were purified and recovered with a PCR product recovery and purification kit (Dingguo Biotechnology Co., Beijing). The PNO1 5' and 3' homology arms were then fused by overlap extension PCR (see J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Science Press, 1995), with the PCR products of the two arms as templates and PNO-5-5/PNO-3-3 as primers, under the following conditions: initial denaturation at 94℃ for 5 min; 30 cycles of denaturation at 94℃ for 1 min, annealing at 55℃ for 1 min and extension at 72℃ for 3 min 30 sec; and a final extension at 72℃ for 10 min. The target fragment was about 2 kb. The PCR product was purified and recovered with the same kit.
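The two cycling programs can be written down as data, which makes the step structure explicit and lets the programmed hold time be added up. This is a sketch under stated assumptions: the class and field names are invented for illustration, "1min for 30sec" is read as 1 min 30 sec and "3min for 30sec" as 3 min 30 sec, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class PcrProgram:
    initial: Step
    cycle: tuple
    n_cycles: int
    final: Step

    def hold_seconds(self):
        """Total programmed hold time in seconds, excluding ramping."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.n_cycles * per_cycle + self.final.seconds


# Amplification of each ~1 kb homology arm.
ARM_PCR = PcrProgram(
    initial=Step(94, 5 * 60),
    cycle=(Step(94, 30), Step(55, 30), Step(72, 90)),
    n_cycles=30,
    final=Step(72, 10 * 60),
)

# Overlap extension PCR fusing the two arms (~2 kb product).
FUSION_PCR = PcrProgram(
    initial=Step(94, 5 * 60),
    cycle=(Step(94, 60), Step(55, 60), Step(72, 210)),
    n_cycles=30,
    final=Step(72, 10 * 60),
)

print(ARM_PCR.hold_seconds() / 60, "min")
print(FUSION_PCR.hold_seconds() / 60, "min")
```

Under these assumptions the arm amplification holds for 90 minutes and the fusion reaction for 180 minutes.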
The PCR product was double-digested with Kpn I/Xba I (all restriction enzymes used in this experiment were from Takara Bio Inc., Dalian) and inserted into the vector pYES2 (Invitrogen Corp., USA) digested with the same two enzymes. Ligation was carried out with T4 ligase overnight at 16℃, the ligation product was transformed into E. coli DH5α, and positive clones were selected on LB plates containing ampicillin (100 μg/ml). Plasmids from positive clones were identified by Kpn I/Xba I double digestion; the recombinant vector yielding fragments of about 4200 bp and about 2000 bp was named pYES2-PNO1, the knockout plasmid for the phosphomannose transferase (PNO1) gene. Sequencing finally confirmed that the upstream and downstream homology arms of the PNO1 gene were correct.
2. Transformation of Pichia pastoris with knockout plasmid
The knockout plasmid pYES2-PNO1 was transformed into Pichia pastoris GJK01 (described in patent ZL200610164912.8, publication No. CN 101195809) by an electrotransformation method well known in the art (e.g., A. Adams et al., Guide to Methods in Yeast Genetics, Science Press, 2000). Before electrotransformation, the knockout plasmid was linearized at the BamH I site upstream of the 5' homology arm; it was then electrotransformed into prepared competent cells, which were plated on MD medium containing arginine and histidine (YNB 1.34 g/100 mL, biotin 4×10⁻⁵ g/100 mL, glucose 2 g/100 mL, agar 1.5 g/100 mL, arginine 100 mg/mL, histidine 100 mg/mL). After clones had grown on the medium, several were picked at random, their genomic DNA was extracted, and PCR was used to identify whether the knockout plasmid had integrated correctly at the target site on the chromosome. The two primers used in the PCR reaction were PNO-5-5OUT, located outside the 5' homology arm of the PNO1 gene (5'-GCAGTTTAATCATAGCCCACTGCTA-3'), and inner01, located on the vector (5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'). The enzyme used was rTaq (Takara Bio-engineering Co., Ltd.), and the amplification conditions were as follows: initial denaturation at 94℃ for 5 min; 30 cycles of denaturation at 94℃ for 30 sec, annealing at 55℃ for 30 sec and extension at 72℃ for 3 min; and a final extension at 72℃ for 10 min. The PCR products were analyzed by gel electrophoresis; clones giving a band of about 2.3 kb with this primer pair were positive clones.
3. PCR identification of positive engineering strains
One of the positive clones was inoculated into YPD medium (10 g/L yeast extract, 20 g/L peptone, 20 g/L glucose) and cultured with shaking at 25℃ for 12 hours; the culture was then spread on adenine-deficient 5-FOA medium (YNB 1.34 g/100 mL, biotin 4×10⁻⁵ g/100 mL, glucose 2 g/100 mL, agar 1.5 g/100 mL, arginine 100 mg/mL, histidine 100 mg/mL, uracil 100 mg/mL, 5-FOA 0.1%) and cultured at 25℃. Here YNB is yeast nitrogen base without amino acids (Beijing-Xin Kogyo Biotechnology Co., Ltd.), and 5-FOA is 5-fluoroorotic acid (Sigma-Aldrich, P.O. Box 14508, St. Louis, MO 63178, USA).
After clones had grown on the 5-FOA medium, their genomic DNA was extracted and identified by PCR with the genome as template. The identification primers, PNO1-ORF01 and PNO1-ORF02, are sequences outside the homology arms of the PNO1 gene on the chromosome:
PNO1-ORF01:5'-GGGAAAGAAAACCTTCAATTT-3';
PNO1-ORF02:5'-TACAAGCCAGTTTCGCAATAA-3'.
A PCR reaction with genomic DNA of the wild-type X33 strain (Invitrogen) as template was run in parallel as a control. The enzyme used was LA Taq (Takara Bio-engineering Co., Ltd.), and the amplification conditions were as follows: initial denaturation at 94℃ for 5 min; 30 cycles of denaturation at 94℃ for 30 sec, annealing at 55℃ for 30 sec and extension at 72℃ for 3 min; and a final extension at 72℃ for 10 min.
To identify whether α-1,6-mannosyltransferase had been knocked out, a reporter protein was introduced after the GJK01 engineering strain was obtained. An anti-Her2 antibody was used as the reporter protein; the construction and transformation methods for the anti-Her2 antibody expression vector are disclosed in the patent application with publication number CN 101748145A. Using that method, the anti-Her2 antibody expression vector was transferred into the GJK01 host strain to obtain the engineering strain GJK01-HL, which expresses the anti-Her2 antibody. The DSA-FACE method for analyzing oligosaccharide chains is disclosed in Liu Bo et al., "A method for analyzing oligosaccharide chains by using DSA-FACE", Biotechnology Communication, 2008, 19(6), 885-888.
The PCR products were subjected to agarose gel electrophoresis. FIG. 1A shows the identification result for the GJK01 host strain, and FIG. 1B shows the DSA-FACE glycoform analysis of strain GJK01-HL (och1 knocked out). In FIG. 2, lane 1 is the PNO1-deficient strain and lane 2 is the wild type: the PCR product from the wild-type X33 genome is about 490 bp, whereas the PNO1-deficient engineering strain gives no amplified band. This confirms that the PNO1 gene has been deleted and that the phosphomannose transferase knockout strain was constructed correctly. The strain was named GJK02 and is a recombinant Pichia pastoris with phosphomannose transferase knocked out.
2. Construction of a yeast strain with an inactivated phosphomannose synthase gene
The yeast strain GJK03, in which the phosphomannose synthase gene is inactivated, is a recombinant yeast obtained by knocking out part of the DNA molecule in Pichia pastoris GJK02 that encodes the phosphomannose synthase shown in SEQ ID No.3, that is, by knocking out the phosphomannose synthase gene in the GJK02 genome. In this yeast, α-1,6-mannosyltransferase, phosphomannose transferase and phosphomannose synthase are all inactivated.
The vector was constructed by the same method as in step one.
1. Construction of the phosphomannose synthase gene inactivation vector
The knockout plasmid pYES2-MNN4B, used to knock out the phosphomannose synthase gene, is a vector obtained by inserting the upstream and downstream homology arms of the phosphomannose synthase gene fragment to be knocked out between the Stu I and Spe I cleavage sites of the vector pYES2.
Genomic DNA of Pichia pastoris X33 was extracted by the glass bead method described above and used as the template to amplify the homology arms on both sides of the phosphomannose synthase (MNN4B) gene. Each arm is about 1 kb, and about 1 kb of coding sequence between them is deleted.
The primers used to amplify the homology arm of the upstream flanking region of MNN4B (MNN4B 5' homology arm) are MNN4B-5-5 and MNN4B-5-3, with the following sequences:
5'-AGTAGGCCTTTCAACGAGTGACCAATGTAGA-3' (Stu I recognition site in the underlined part);
5'-TATCTCCATAGTTTCTAAGCAGGGCGGCCGCAATATGTGCGGTGTAGGGAGAAA-3' (NotI recognition site in the underlined part).
The primers used to amplify the homology arm of the downstream flanking region of MNN4B (MNN4B 3' homology arm) are MNN4B-3-5 and MNN4B-3-3, with the following sequences:
5'-TTTCTCCCTACACCGCACATATTGCGGCCGCCCTGCTTAGAAACTATGGAGATA-3' (underlined is Not I recognition site);
5'-TGTACTAGTTGAAGACGTCCCCTTTGAACA-3' (SpeI recognition site in underlined section).
The PCR amplification conditions, recovery method and digestion method for the two homology arms are the same as in step one. The resulting knockout vector pYES2-MNN4B was verified as correct by sequencing.
2. Transformation of Pichia pastoris with knockout plasmid
The knockout plasmid was transformed into the Pichia pastoris engineering strain GJK02 constructed above by electrotransformation; the electrotransformation and identification methods are the same as in step one.
The two primers used in the PCR reaction were MNN4B-5-5OUT, located outside the 5' homology arm of the MNN4B gene (5'-TAGTCCAAGTACGAAACGACACTA-3'), and inner01, located on the vector (5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'). Clones giving a band of about 2 kb with this primer pair were positive clones.
3. PCR identification of positive engineering strains
One of the positive clones was inoculated on 5-FOA medium (same formulation as above). After clones had grown, their genomic DNA was extracted and identified by PCR with the genome as template. The identification primers, MNN4B-ORF01 and MNN4B-ORF02, are sequences outside the homology arms of the MNN4B gene on the chromosome:
MNN4B-ORF01:5'-AAAACTATCCAATGAGGGTCTC-3';
MNN4B-ORF02:5'-TCTTCAATGTCTTTAACGGTGT-3'.
PCR amplification was performed with genomic DNA of the positive clone as template and MNN4B-ORF01 and MNN4B-ORF02 as primers. The results are shown in FIG. 3, where lane 1 is the MNN4B-deficient strain and lane 2 is the wild type: the PCR product from the wild-type X33 genome is about 912 bp, whereas the MNN4B-deficient engineering strain gives no amplified band. This confirms that phosphomannose synthase has been knocked out. The strain was named GJK03 and is a recombinant Pichia pastoris with phosphomannose transferase and phosphomannose synthase knocked out.
The DSA-FACE glycoform analysis of strains GJK02 and GJK03 (och1, pno1 and mnn4b knocked out) is shown in FIG. 4; the phosphomannose moiety in the glycoforms is removed after the pno1 and mnn4b knockouts.
3. Construction of a yeast strain with an inactivated β-mannosyltransferase ARM2 gene
The yeast strain GJK04, in which the phosphomannose transferase, phosphomannose synthase and β-mannosyltransferase ARM2 (that is, β-mannosyltransferase II) genes are inactivated, is a recombinant yeast obtained by knocking out part of the DNA molecule in Pichia pastoris GJK03 that encodes the β-mannosyltransferase ARM2 shown in SEQ ID No.5, that is, by knocking out the β-mannosyltransferase ARM2 gene in the GJK03 genome. In this yeast, the α-1,6-mannosyltransferase, phosphomannose transferase, phosphomannose synthase and β-mannosyltransferase ARM2 genes have all been inactivated.
1. Construction of the β-mannosyltransferase ARM2 gene inactivation vector
The vector was constructed as follows:
Genomic DNA of Pichia pastoris X33 was extracted by the glass bead method described above and used as the template to amplify the homology arms on both sides of the β-mannosyltransferase (ARM2) gene. Each arm is about 0.6 kb, and about 0.6 kb of coding sequence between them is deleted.
The primers used to amplify the homology arm of the upstream flanking region of ARM2 (ARM2 5' homology arm) are ARM2-5-5 and ARM2-5-3, with the following sequences:
5'-ActTGGTACCACACGACTCAACTTCCTGCTGCTC-3' (Kpn I recognition site in underlined part);
5'-actGCGGCCGCCACGAAACTTCTTACCTTTGACAA-3' (NotI recognition site in the underlined part).
The primers used to amplify the homology arm of the downstream flanking region of ARM2 (ARM2 3' homology arm) are ARM2-3-5 and ARM2-3-3, with the following sequences:
5'-TTGTCAAAGGTAAGAAGTTTCGTGGCGGCCGCTATCTTGACATTGTCATTCAGTGA-3' (underlined is Not I recognition site);
5'-caaTCTAGAGCCTCCTTCTTTTCCGCCT-3' (Xba I recognition site in the underlined section).
2. Transformation of Pichia pastoris with knockout plasmid
The knockout plasmid was transformed into the Pichia pastoris engineering strain GJK03 constructed above by electrotransformation; the electrotransformation and identification methods are the same as in step one.
The two primers used in the PCR reaction were ARM2-5-5OUT, located outside the 5' homology arm of the ARM2 gene (5'-TTTTCCTCAAGCCTTCAAAGACAG-3'), and inner01, located on the vector (5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'). Clones giving a band of about 0.8 kb with this primer pair were positive clones.
3. PCR identification of positive engineering strains
One of the positive clones was inoculated on 5-FOA medium (same formulation as above). After clones had grown, their genomic DNA was extracted and identified by PCR with the genome as template. The identification primers, Arm2-ORF-09 and Arm2-ORF-10, are sequences outside the homology arms of the ARM2 gene on the chromosome:
Arm2-ORF-09:5'-gggcagaagatcctagag-3';
Arm2-ORF-10:5'-tcgtctccattgctatctacgact-3'.
PCR amplification was performed with genomic DNA of the positive clone as template and Arm2-ORF-09 and Arm2-ORF-10 as primers. The results are shown in FIG. 5, where lane 1 is the ARM2-deficient strain and lane 2 is the wild type: the PCR product from the wild-type X33 genome is about 600 bp, whereas the ARM2-deficient engineering strain gives no amplified band. This confirms that β-mannosyltransferase (ARM2) has been knocked out. The strain was named GJK04 and is a recombinant Pichia pastoris with the phosphomannose transferase, phosphomannose synthase and β-mannosyltransferase II (ARM2) genes knocked out.
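The identification logic used for PNO1, MNN4B and ARM2 is the same in each case: the wild-type genome gives a band of a known approximate size, and the knockout gives none. A minimal sketch of that decision rule follows; the function name and the 10% size tolerance are hypothetical choices made for illustration, while the band sizes are the approximate wild-type values reported in this example.

```python
# Approximate wild-type amplicon sizes (bp) reported in the text for each
# pair of identification primers.
WILD_TYPE_BAND_BP = {"PNO1": 490, "MNN4B": 912, "ARM2": 600}


def call_genotype(gene, observed_bands_bp, tolerance=0.10):
    """Classify a clone from the band sizes seen on the gel.

    Returns "wild type" if any observed band lies within `tolerance`
    (a hypothetical relative margin) of the expected wild-type size,
    otherwise "knockout".
    """
    expected = WILD_TYPE_BAND_BP[gene]
    for band in observed_bands_bp:
        if abs(band - expected) <= tolerance * expected:
            return "wild type"
    return "knockout"


print(call_genotype("PNO1", [490]))
print(call_genotype("ARM2", []))
```

An absent band can also result from a failed reaction, which is why the wild-type X33 control is run alongside the clones.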
4. Construction of yeast strains with inactivated β-mannosyltransferase ARM1, ARM3 and ARM4 genes
Following the design method and construction process used in steps one to three (see the construction of the yeast strain with the inactivated β-mannosyltransferase ARM2 gene), β-mannosyltransferases ARM1, ARM3 and ARM4 (that is, β-mannosyltransferases I, III and IV, with the amino acid sequences shown in SEQ ID No.4, SEQ ID No.6 and SEQ ID No.7) were knocked out successively starting from the GJK04 engineering strain, giving the engineering strains GJK05, GJK07 and GJK18, respectively.
1. Construction of the β-mannosyltransferase ARM1, ARM3 and ARM4 gene inactivation vectors
The vector construction method is the same as in step three, except for the following:
The primers used to amplify the homology arm of the upstream flanking region of ARM1 (ARM1 5' homology arm) are ARM1-5-5 and ARM1-5-3, with the following sequences:
ARM1-5-5:5'-TCAACGCGTTGGCTCTGGATCGTTCTAATA-3' (the underlined part is the MluI recognition site);
ARM1-5-3:5'-ttctccgttctcctttctccgtGCGGCCGCcagcagcaaggaagataccaa-3' (underlined is NotI recognition site).
The primers used to amplify the homology arm of the downstream flanking region of ARM1 (ARM1 3' homology arm) are ARM1-3-5 and ARM1-3-3, with the following sequences:
ARM1-3-5:5'-ttggtatcttccttgctgctgGCGGCCGCacggagaaaggagaacggagaa-3' (underlined is NotI recognition site);
ARM1-3-3:5'-TCAACGCGTTGGCTGGAGGTGACAGAGGAA-3' (the underlined part is the MluI recognition site).
The primers used to amplify the homology arm of the upstream flanking region of ARM3 (ARM3 5' homology arm) are ARM3-5-5 and ARM3-5-3, with the following sequences:
ARM3-5-5:5'-TCAACGCGTTAGTAGTGCCGTGCCAAGTAGCG-3' (the underlined part is the MluI recognition site);
ARM3-5-3:5'-tcctactttgcttatcatctgccGCGGCCGCggtcaggccctcttatggttgtg-3' (underlined is NotI recognition site).
The primers used to amplify the homology arm of the downstream flanking region of ARM3 (ARM3 3' homology arm) are ARM3-3-5 and ARM3-3-3, with the following sequences:
ARM3-3-5:5'-CACAACCATAAGAGGGCCTGACCGCGGCCGCGGCAGATGATAAGCAAAGTAGGA-3' (underlined is the NotI recognition site);
ARM3-3-3:5'-TCAACGCGTCATAGGTAATGGCACAGGGATAG-3' (the underlined part is the MluI recognition site).
The primers used to amplify the homology arm of the upstream flanking region of ARM4 (ARM4 5' homology arm) are ARM4-5-5 and ARM4-5-3, with the following sequences:
ARM4-5-5:5'-TCAACGCGTGCAGCGTTTACGAATAGTGTCC-3' (the underlined part is the MluI recognition site);
ARM4-5-3:5'-gcatagggctgaagcatactgtGCGGCCGCaatgatatgtacgttcccaaga-3' (underlined is NotI recognition site).
The primers used to amplify the homology arm of the downstream flanking region of ARM4 (ARM4 3' homology arm) are ARM4-3-5 and ARM4-3-3, with the following sequences:
ARM4-3-5:5'-tcttgggaacgtacatatcattGCGGCCGCacagtatgcttcagccctatgc-3' (underlined is NotI recognition site);
ARM4-3-3:5'-TCAACGCGTGAGGTGGACAAGAGTTCAACAAAG-3' (the underlined part is the MluI recognition site).
2. Transformation of Pichia pastoris with knockout plasmid
The transformation and identification methods are the same as in step three, except that the primers used in the PCR reactions are:
Primer ARM1-5-5OUT, outside the 5' homology arm of the ARM1 gene (5'-GTTCTGGTATGCGTTCTATTCTTC-3'), and primer inner01 on the vector (5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'); clones giving a band of about 3.5 kb with this primer pair were positive clones.
Primer ARM3-5-5OUT, outside the 5' homology arm of the ARM3 gene (5'-TATTTGCCTTCTTCACCGTTAT-3'), and primer inner01 on the vector (5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'); clones giving a band of about 3.7 kb with this primer pair were positive clones.
Primer ARM4-5-5OUT, outside the 5' homology arm of the ARM4 gene (5'-TCCGTTGAGGGTGCTAATGGTA-3'), and primer inner01 on the vector (5'-AGCGTCGATTTTTGTGATGCTCGTCA-3'); clones giving a band of about 3.7 kb with this primer pair were positive clones.
3. PCR identification of positive engineering strains
The identification method is the same as in step three, except that the engineering strains were identified with the following primers, which showed that the genes had been knocked out (FIGS. 6, 7 and 8):
Arm1-ORF-09:5'-TAGTCTGGTTTGCGGTAGTGT-3';
Arm1-ORF-10:5'-AGATTGAGCATAGGAGTGGC-3'.
Arm3-ORF-09:5'-AAACGGAGTCCAGTTCTTCT-3';
Arm3-ORF-10:5'-CAACTTTGCCTGTCATTTCC-3'.
Arm4-ORF-09:5'-CGCTTCAGTTCACGGACATA-3';
Arm4-ORF-10:5'-GCAACCCAGACCTCCTTACC-3'.
The results of the DSA-FACE glycoform analysis of GJK18 are shown in FIG. 9. Because β-mannose modification is added only at individual mannose termini, the glycoform analysis results are essentially unchanged. However, β-mannose is a potentially immunogenic sugar and therefore poses a potential risk for pharmaceuticals intended for human use. The present invention inactivates all of the β-mannosyltransferases, which fundamentally eliminates β-mannose without changing the glycoform structure.
5. Construction of a glycoengineered yeast strain with the mammalian Man5GlcNAc2 structure and without fucosylation
First, to identify whether exogenous mannosidase I (MDSI) functions correctly, a reporter protein was introduced into the GJK18 engineering strain in advance. An anti-Her2 antibody was used as the reporter protein, and an expression vector for it was constructed; the vector construction and transformation methods are disclosed in the patent application with publication number CN 101748145A. Using that method, the anti-Her2 antibody expression vector was transferred into the GJK18 host strain to obtain the W2 engineering strain, which expresses the anti-Her2 antibody.
Second, the glycoengineered yeast strain W10, with the mammalian Man5GlcNAc2 structure and without fucosylation, is an engineering strain obtained by inserting into the genome of host strain W2 an MDSI gene with a C-terminally fused HDEL sequence (TrmdsI, whose nucleotide sequence is shown in SEQ ID No.14 and which encodes the MDSI protein shown in SEQ ID No.9).
1. Construction of exogenous mannosidase I (MDSI) expression vector
The recombinant vector pPIC9-TrmdsI for expressing exogenous mannosidase I is a recombinant vector obtained by inserting a DNA molecule shown in SEQ ID No.14 between Xho I and EcoR I cleavage sites of the pPIC9 vector.
Nucleotides 1-1524 of SEQ ID No.14, counted from the 5' end, are the optimized mannosidase I coding gene, and nucleotides 1525-1536 encode the endoplasmic reticulum retention signal HDEL.
(1) Mannosidase I (MDSI) genes
The exogenous mannosidase I may be a mannosidase I from filamentous fungi, plants, insects, java, mammals and the like. This example uses the mannosidase I of Trichoderma viride (Zhan Jie, Cloning, expression and activity identification of Trichoderma viride α-1,2-mannosidase in Pichia pastoris [dissertation]), with the endoplasmic reticulum retention signal HDEL fused at its C-terminus.
Based on the Trichoderma viride mannosidase I sequence published in that dissertation, the coding gene was optimized according to yeast codon preference and the principles of high-level gene expression, and the HDEL sequence was fused at the C-terminus to obtain the gene fragment (SEQ ID No.14).
(2) The following primers were designed and synthesized:
TrmdsI-5:5'-TCTCTCGAGAAAAGAGAGGCTGAAGCTTATCCAAAGCCGGGCGCCAC-3'; the underlined sequence is the Xho I cleavage recognition site.
TrmdsI-3:5'-AGGGAATTCTTACAACTCGTCGTGAGCAAGGTGGCCGCCCCGTCGTGATG-3'; the underlined sequence is EcoRI cleavage recognition site.
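Since TrmdsI-3 is the reverse primer, the C-terminal residues it encodes only become visible after reverse-complementing it. The sketch below is illustrative (the helper names are not from the patent): it takes the 15 nucleotides that follow the EcoR I site in TrmdsI-3, reverse-complements them, and translates them with the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case-insensitive input)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons of seq in frame 1; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


TRMDSI_3 = "AGGGAATTCTTACAACTCGTCGTGAGCAAGGTGGCCGCCCCGTCGTGATG"
ECORI = "GAATTC"

# The 15 nt immediately 3' of the EcoR I site in the reverse primer.
site_end = TRMDSI_3.index(ECORI) + len(ECORI)
tail = TRMDSI_3[site_end:site_end + 15]
encoded = translate(reverse_complement(tail))
print(tail, "->", encoded)
```

On the coding strand these 15 nucleotides read CAC GAC GAG TTG TAA, that is, His-Asp-Glu-Leu followed by a stop codon, which matches the HDEL retention signal described above.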
(3) PCR amplification was carried out with the gene fragment obtained in step (1) as template and TrmdsI-5 and TrmdsI-3 as primers; the PCR amplification product, named TrmdsI, contains SEQ ID No.14.
(4) The PCR product obtained in step (3) was double-digested with Xho I and EcoR I to obtain the gene fragment; the pPIC9 vector was double-digested with Xho I and EcoR I to obtain the large vector fragment; the gene fragment was ligated to the large vector fragment to give a recombinant plasmid, named pPIC9-TrmdsI. pPIC9-TrmdsI was sequenced and found to be correct.
2. Construction of recombinant Yeast expressing exogenous mannosidase I
About 10 μg of pPIC9-TrmdsI plasmid was linearized with Sal I, and the linearized plasmid was precipitated with 1/10 volume of 3 M sodium acetate and 3 volumes of absolute ethanol. The precipitate was washed twice with 70% (by volume) aqueous ethanol to remove salts, air-dried, and resuspended in about 30 μL of water to obtain the linearized pPIC9-TrmdsI plasmid for transformation.
Yeast electrotransformation-competent cells were prepared in the following steps with reference to the relevant Invitrogen manual and "Molecular Cloning: A Laboratory Manual (Fourth Edition)", 2012, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. The host strain was the W2 engineering strain constructed above.
The method comprises the following steps:
Pichia pastoris W2 was streaked on a YPD plate (yeast extract 10 g/L, tryptone 20 g/L, glucose 20 g/L, agar 15 g/L) and incubated at 28℃ for 2 days. A single clone was inoculated into a 50 mL Erlenmeyer flask containing 10 mL of YPD liquid medium (yeast extract 10 g/L, tryptone 20 g/L, glucose 20 g/L) and cultured overnight at 28℃ until the OD600 was about 2. Then 0.1-0.5 mL of this culture was inoculated into a 3.5 L shake flask containing 500 mL of YPD liquid medium and cultured overnight until the OD600 was between 1.3 and 1.5. The culture was transferred to a sterile centrifuge bottle and centrifuged at 1500 g for 10 minutes at 4℃. The cells were resuspended in 500 mL of pre-chilled sterile water, centrifuged at 1500 g for 10 minutes at 4℃, harvested, and washed once more with 250 mL of pre-chilled sterile water. The cells were then resuspended in 20 mL of pre-chilled sterile 1 M sorbitol, centrifuged at 4℃ for 10 minutes, harvested, and resuspended in pre-chilled 1 M sorbitol to a final volume of 1.5 mL to give the cell suspension.
80 μL of the cell suspension was mixed in a microcentrifuge tube with 10 μL of the linearized pPIC9-TrmdsI plasmid for transformation, and the mixture was placed on ice for 5 min and transferred to an ice-cold 0.2 cm electroporation cuvette. The cells were electroporated (Bio-Rad Gene Pulser, 2000 V, 25 μF, 200 Ω), 1 mL of ice-cold 1 M sorbitol was immediately added to the cuvette, and the mixture (transformed cells) was carefully transferred to a 15 mL culture tube.
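The electroporation settings translate into two quantities that are often quoted when comparing protocols: the field strength across the cuvette gap and the nominal time constant of the pulse. The short calculation below is a sketch, not part of the patent; it assumes the nominal time constant is simply the product of the set resistance and capacitance, which ignores the contribution of the sample's own resistance.

```python
# Settings stated in the text.
voltage_v = 2000.0        # V
gap_cm = 0.2              # cuvette gap in cm
capacitance_f = 25e-6     # 25 uF
resistance_ohm = 200.0    # ohm

# Field strength E = V / d, reported in kV/cm.
field_kv_per_cm = voltage_v / gap_cm / 1000.0

# Nominal time constant tau = R * C, reported in ms (assumption: the set
# resistance dominates; the measured value depends on the sample).
time_constant_ms = resistance_ohm * capacitance_f * 1000.0

print(f"field strength: {field_kv_per_cm:.1f} kV/cm")
print(f"nominal time constant: {time_constant_ms:.1f} ms")
```

With these settings the field strength is 10 kV/cm and the nominal time constant is 5 ms.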
The culture tube was incubated in a 28℃ incubator for 1 h without shaking. Then 1 mL of YPD liquid medium was added and the tube was incubated for 3 hours at 28℃ on a shaker at 250 rpm. μL of the transformed cells were plated on MD plates (1.34 g/100 ml YNB, 4×10⁻⁵ g/100 ml biotin, 2 g/100 ml glucose). The plates were incubated at 28℃ for 2-5 days until single clones formed; the resulting clone, W2-Tr, was designated W10.
Genomic DNA of W10 was extracted by the glass bead method and used as the template for PCR amplification with TrMDSI-1.3kb-01 and TrMDSI-1.3kb-02 as primers. A PCR product of about 1.3 kb was obtained, confirming that MDSI had been inserted into the genome and that the strain was a positive engineering strain (FIG. 10A).
TrMDSI-1.3kb-01:5'-GAACACGATCCTTCAGTATGTA-3';
TrMDSI-1.3kb-02:5'-TGATGATGAACGGATGCTAAAG-3'.
FIG. 10B shows the DSA-FACE glycoform analysis of strain W10 (method as described in Example 1): after TrmdsI is transferred, the glycoforms of the protein expressed by strain W10 are Man5GlcNAc2 and Man6GlcNAc2, with Man5GlcNAc2 predominant.
6. Construction of a glycoengineered yeast strain with the mammalian GlcNAcMan5GlcNAc2 structure and without fucosylation
The glycoengineered yeast strain 1-8, with the mammalian GlcNAcMan5GlcNAc2 structure and without fucosylation, is an engineering strain obtained by inserting into the genome of host strain W10 a DNA fragment of N-acetylglucosamine transferase I (GnTI) containing the mnn9 localization signal (nucleotide sequence shown in SEQ ID No.15, encoding the protein shown in SEQ ID No.10).
Nucleotides 1-114 of SEQ ID No.15, counted from the 5' end, are the mnn9 localization signal, and nucleotides 115-1335 are the N-acetylglucosamine transferase I coding gene.
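For a fusion of a localization signal and a coding region, it is worth confirming that both segments are a whole number of codons, so the signal stays in frame with the enzyme. The following sketch (the names `SEGMENTS` and `describe` are illustrative) applies that check to the SEQ ID No.15 coordinates given above.

```python
# Inclusive, 1-based coordinates within SEQ ID No.15 as stated in the text.
SEGMENTS = {
    "mnn9 localization signal": (1, 114),
    "GnTI coding region": (115, 1335),
}


def describe(span):
    """Return (length in nt, whole codons, leftover nt) for an inclusive span."""
    start, end = span
    length = end - start + 1
    codons, remainder = divmod(length, 3)
    return length, codons, remainder


for name, span in SEGMENTS.items():
    length, codons, remainder = describe(span)
    print(f"{name}: {length} nt, {codons} codons, remainder {remainder}")
```

Both segments divide evenly by three (38 and 407 codons), so the coordinates are consistent with an in-frame fusion.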
1. Construction of N-acetylglucosamine transferase I (GnTI) expression vector containing mnn9 localization Signal
(1) Obtaining the human gnt1 gene
Using the human gnt1 upstream primer mnn9-GnTI-01 (5'-TCAGTCAGCGCTCTCGATGGCGACCCCG-3') and downstream primer GnTI-02 (5'-GCGAATTCTTAGTGCTAATTCCAGCTAGGATCATAG-3'; the EcoR I cleavage site is underlined), the full-length fragment of the human gnt1 gene was obtained by PCR from a human fetal liver cDNA library (Clontech Laboratories Inc., 1290 Terra Bella Ave., Mountain View, CA 94043, USA) under the following conditions: pre-denaturation at 94℃ for 5 min; 30 cycles of denaturation at 94℃ for 30 sec, annealing at 52℃ for 30 sec and extension at 72℃ for 1 min 30 sec; and a final extension at 72℃ for 10 min. The PCR amplification product was separated by 0.8% agarose gel electrophoresis and recovered with a DNA recovery kit.
(2) GnTI DNA fragment containing the mnn9 localization signal
Primer ScMNN9-03, carrying the S. cerevisiae MNN9 Golgi localization signal: tatAATattATGTCACTTTCTCTTGTATCGTACCGCCTAAGAAAGAACCCGTGGGTTAACATTTTTCTACCTGTTTTGGCCATATTTCTAATATATATAATTTTTTTCCAGAGAGATCAATCTtcagtcagcgctctcgatggcgaccccg
The recovered and purified 1.2 kb GnTI fragment was joined to the S. cerevisiae MNN9 Golgi localization signal coding sequence by PCR: the mnn9-gnt1 gene fragment (SEQ ID No.15) was amplified with Pyrobest DNA polymerase using the upstream primer ScMNN9-03 (sequence given above; it contains the S. cerevisiae MNN9 Golgi localization signal coding sequence, and the Ssp I cleavage site is underlined) and GnTI-02, the downstream primer of the GnTI catalytic domain coding region.
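The long primer ScMNN9-03 can only extend the GnTI fragment if its 3' end matches the 5' end of that fragment, which was set by primer mnn9-GnTI-01 in step (1). The sketch below, whose variable names are illustrative, checks this, confirms that the Ssp I recognition sequence (AATATT) is present, and measures the stretch from the ATG to the start of the annealing region so that it can be compared with the 114-nt localization signal stated for SEQ ID No.15.

```python
# Sequences copied from the text (5' -> 3'); the long primer is split
# across lines only for readability.
SCMNN9_03 = (
    "tatAATattATGTCACTTTCTCTTGTATCGTACCGCCTAAGAAAGAACCCGTGGGTTAAC"
    "ATTTTTCTACCTGTTTTGGCCATATTTCTAATATATATAATTTTTTTCCAGAGAGATCAATCT"
    "tcagtcagcgctctcgatggcgaccccg"
)
MNN9_GNTI_01 = "TCAGTCAGCGCTCTCGATGGCGACCCCG"
SSP_I = "AATATT"

primer = SCMNN9_03.upper()

# Ssp I recognition sequence present in the primer?
has_ssp_site = SSP_I in primer

# Does the 3' end of the long primer equal the step (1) upstream primer,
# so that it can anneal to the first-round GnTI product?
anneals_to_gnt1 = primer.endswith(MNN9_GNTI_01)

# Stretch from the first ATG up to the start of the annealing region.
signal = primer[primer.index("ATG"):len(primer) - len(MNN9_GNTI_01)]

print(has_ssp_site, anneals_to_gnt1, len(signal))
```

The last value printed is the length of the candidate signal-coding stretch as transcribed here.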
PCR reaction conditions: one round of denaturation at 94℃ for 2 min, annealing at 52℃ for 30 sec and extension at 72℃ for 5 min; then 30 cycles of denaturation at 94℃ for 30 sec, annealing at 52℃ for 30 sec and extension at 72℃ for 1 min 30 sec; and a final extension at 72℃ for 10 min.
The PCR amplified product was separated by 0.8% agarose gel electrophoresis (8V/cm, 15 minutes), and a 1.3kb target band was excised by a clean blade under an ultraviolet lamp, and recovered by using a DNA recovery kit as described above.
(3) Construction of PGE-URA3-GAP1-mnn9-GnTI expression vector
The PCR product of the mnn9-gnt1 gene fragment obtained in step (2) was double-digested with Ssp I and EcoR I to obtain the gene fragment; the PGE-URA3-GAP1 vector (Yang Xiaopeng, Liu Bo, Song Miao, new, reed-Solomon, xue Kuijing, wu Jun. Man5GlcNAc2 mammalian mannosyn glycoprotein Pichia pastoris expression System construction. Bioengineering. 2011; 27:108-17.) was double-digested with Ssp I and EcoR I to obtain the large vector fragment; the gene fragment was ligated to the large vector fragment to give a recombinant plasmid, named PGE-URA3-GAP1-mnn9-GnTI. Sequencing showed the result to be correct.
PGE-URA3-GAP1-mnn9-GnTI is thus a recombinant vector obtained by inserting the DNA molecule shown in SEQ ID No.15 between the Ssp I and EcoR I cleavage sites of the PGE-URA3-GAP1 vector.
2. Construction of recombinant yeast expressing N-acetylglucosamine transferase I (GnTI)
Yeast electrotransformation-competent cells were prepared as in step five above. About 10 μg of PGE-URA3-GAP1-mnn9-GnTI plasmid was linearized with Nhe I to obtain the linearized PGE-URA3-GAP1-mnn9-GnTI plasmid for transformation.
The host strain was the W10 engineering strain constructed in step five. The single clone formed on the MD plate after transformation was designated 1-8.
Genomic DNA of 1-8 was extracted by the glass bead method and used as the template for PCR amplification with HuGnTI-0.9k-01 and HuGnTI-0.9k-02 as primers. A PCR product of about 0.9 kb was obtained, confirming that GnTI had been inserted into the genome and that the strain was a positive engineering strain (FIG. 11A).
HuGnTI-0.9k-01: 5'-TGGACAAGCTGCTGCATTATC-3';
HuGnTI-0.9k-02: 5'-CGGAACTGGAAGGTGACAATA-3'.
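A quick sanity check of the two verification primers (length and G+C count) can be sketched as follows. The sequences are copied from the listing above; the helper names are illustrative and not part of the original method.

```python
# Verification primers for the GnTI insertion, copied from the text (5' to 3').
PRIMERS = {
    "HuGnTI-0.9k-01": "TGGACAAGCTGCTGCATTATC",
    "HuGnTI-0.9k-02": "CGGAACTGGAAGGTGACAATA",
}


def gc_count(seq):
    """Number of G and C bases in a primer (case-insensitive)."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_percent(seq):
    """G+C content as a percentage of primer length."""
    return 100.0 * gc_count(seq) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_percent(seq):.1f}%")
```

Primers with similar length and G+C content generally anneal under similar conditions, which is why such a check is commonly run before pairing primers.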
As shown in FIG. 11B, DSA-FACE glycoform analysis of strain 1-8 (method as described in Example 1) showed that the major glycoform of the proteins expressed by the host after introduction of GnTI was GlcNAcMan5GlcNAc2.
7. Construction of glycosyl engineering yeasts having mammalian GalGlcNAcMan5GlcNAc2 and being free of fucosylation structures
The glycosyl engineering yeast strain 1-8-4, which has the mammalian GalGlcNAcMan5GlcNAc2 structure and no fucosylation, is the engineered strain obtained by inserting the kre2-GalE-GalT gene fragment (nucleotide sequence shown in SEQ ID No.16, encoding the protein shown in SEQ ID No.11) into the genome of host strain 1-8.
Within SEQ ID No.16, counting from the 5' end, nucleotides 1 to 294 are the kre2 localization signal, nucleotides 295 to 1317 are the gene encoding galactose isomerase GalE, and nucleotides 1325 to 2394 are the gene encoding galactose transferase GalT.
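The coordinates above are 1-based and inclusive. A minimal sketch of how such coordinates map to segment lengths and to string slices is given below; the dictionary keys and function names are illustrative, and the placeholder sequence stands in for SEQ ID No.16, which is not reproduced here.

```python
# 1-based, inclusive coordinates within SEQ ID No.16, as stated in the text.
FEATURES = {
    "kre2 localization signal": (1, 294),
    "GalE": (295, 1317),
    "GalT": (1325, 2394),
}


def feature_length(start, end):
    """Length of a segment given 1-based inclusive coordinates."""
    return end - start + 1


def slice_feature(sequence, start, end):
    """Extract a segment; Python slices are 0-based and end-exclusive."""
    return sequence[start - 1:end]


for name, (start, end) in FEATURES.items():
    print(f"{name}: {feature_length(start, end)} nt")

# Nucleotides lying between the GalE and GalT segments (positions 1318 to 1324).
spacer_nt = FEATURES["GalT"][0] - FEATURES["GalE"][1] - 1

# Placeholder sequence of the stated total length (NOT the real SEQ ID No.16).
placeholder = "N" * 2394
gale_segment = slice_feature(placeholder, *FEATURES["GalE"])
```

With the stated coordinates the three segments are 294, 1023 and 1070 nt long, with 7 nt between the GalE and GalT segments.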
1. Construction of the galactose transferase (GalE+T) expression vector containing the kre2 localization signal
(1) Retrieval of the human GalE and GalT genes
Full-length fragments of the human GalE and GalT genes were amplified by PCR from a human fetal liver cDNA library (purchased from Clontech Laboratories Inc., 1290 Terra Bella Ave., Mountain View, CA 94043, USA), using upstream primer GalE5' and downstream primer GalE3' for GalE, and upstream primer GalT5' and downstream primer GalT3' for GalT. PCR reaction conditions: pre-denaturation at 94 °C for 5 min; 30 cycles of denaturation at 94 °C for 30 sec, annealing at 52 °C for 30 sec, and extension at 72 °C for 1 min 30 sec; final extension at 72 °C for 10 min. The PCR products were separated by 0.8% agarose gel electrophoresis and recovered with a DNA recovery kit.
GalE5': 5'-ATGAGAGTTCTGGTTACCGGTGGTA-3';
GalE3': 5'-AGGGTACCATCGGGATATCCCTGTGGATGGC-3' (KpnI);
GalT5': 5'-ATGGTACCGGTGGTGGACGTGACCTTTCTCGTCTGCCA-3' (KpnI);
GalT3': 5'-GCatttaaatttaGCTCGGTGTCCCGATGTCCACTGTGAT-3' (SwaI).
(2) GalE-GalT DNA fragment containing localization signal kre2
Kre2 5': 5'-ATAATattAAACGATGGCCCTCTTTCTCAGTAAGAG-3' (contains an SspI site);
Kre2 3'+GalE5': 5'-CACCGGtAACCAGaACTctCatGATCGGGGCAtctgccttttcagcggcagctttcagagccttggattc-3'.
The kre2 localization signal fragment was amplified by PCR from S. cerevisiae genomic DNA. PCR conditions were as above.
The recovered and purified GalE and GalT fragments and the S. cerevisiae Kre2 Golgi localization signal coding sequence were joined by PCR, using the upstream primer Kre2 5', which contains the S. cerevisiae Kre2 Golgi localization signal coding sequence, and GalT3', the downstream primer of the GalE+GalT catalytic domain coding region; the kre2-GalE-GalT gene fragment was amplified with Pyrobest DNA polymerase.
PCR reaction conditions: denaturation at 94 °C for 2 min, annealing at 52 °C for 30 sec, and extension at 72 °C for 5 min; then 30 cycles of denaturation at 94 °C for 30 sec, annealing at 52 °C for 30 sec, and extension at 72 °C for 4 min 30 sec; and a final extension at 72 °C for 10 min.
The PCR product was separated by 0.8% agarose gel electrophoresis (8 V/cm, 15 minutes); the 2.4 kb target band was excised with a clean blade under an ultraviolet lamp and recovered with a DNA recovery kit as described above.
(3) Construction of the PGE-URA3-GAP1-kre2-GalE-GalT vector
The kre2-GalE-GalT DNA molecule was first digested with SwaI, and the gene fragment was then phosphorylated with T4 PNK (Takara Biotechnology (Dalian) Co., Ltd.); the PGE-URA3-GAP1 vector was double-digested with Ssp I and SwaI to obtain the large vector fragment; the gene fragment was ligated to the large vector fragment to give a recombinant plasmid, designated PGE-URA3-GAP1-kre2-GalE-GalT. Sequencing confirmed that the construct was correct.
PGE-URA3-GAP1-kre2-GalE-GalT is the recombinant vector obtained by inserting the kre2-GalE-GalT DNA molecule shown in SEQ ID No.16 between the Ssp I and SwaI cleavage sites of the PGE-URA3-GAP1 vector.
2. Construction of recombinant yeast expressing exogenous galactose isomerase (GalE) and galactose transferase (GalT)
About 10 μg of PGE-URA3-GAP1-kre2-GalE-GalT plasmid was linearized with Nhe I to obtain the linearized plasmid for transformation; yeast electrotransformation-competent cells were prepared as in step five above.
The host strain was the 1-8 engineered strain constructed in step six. A single clone formed on the MD plate after transformation was designated 1-8-4.
Genomic DNA of 1-8-4 was extracted by the glass bead method and used as the template for PCR amplification with primers GalE-T(1.5k)-01 (5'-TGATAACCTCTGTAACAGTAAGCGC-3') and GalE-T(1.5k)-02 (5'-GGAGCTTAGCACGATTGAATATAGT-3'). A PCR product of 1.5 kb was obtained, showing that GalE-T had been inserted into the genome, i.e., the strain was a positive engineered strain (FIG. 12A).
As shown in FIG. 12B, DSA-FACE glycoform analysis of strain 1-8-4 (method as described in Example 1) showed that the major glycoform of the proteins expressed by the host after introduction of galactose isomerase and galactose transferase was GalGlcNAcMan5GlcNAc2.
8. Construction of a glycosyl engineering Yeast Strain with mammalian GalGlcNAcMan3GlcNAc2 and free of fucosylation Structure
The glycosyl engineering yeast strain 52-60, which has the mammalian GalGlcNAcMan3GlcNAc2 structure and no fucosylation, is the engineered strain obtained by inserting the MDSII DNA molecule (nucleotide sequence shown in SEQ ID No.17, encoding the protein shown in SEQ ID No.12) into the genome of host strain 1-8-4.
Within SEQ ID No.17, counting from the 5' end, nucleotides 1 to 108 are the mnn2 localization signal and nucleotides 109 to 3303 are the mannosidase II coding gene.
1. Construction of the mannosidase II (MDSII) expression vector containing the mnn2 localization signal
(1) Whole-gene synthesis of the MDSII gene containing the mnn2 localization signal
The MDSII gene containing mnn2 (SEQ ID No.17) was produced by whole-gene synthesis according to the sequence; it was synthesized by GenScript (Nanjing) and cloned into the pUC57 cloning vector to give pUC57-MDSII.
An upstream primer (mnn2-MDSII-01: 5'-ATAATATTAAACCATGCTGCTTACCAAAAGGTTTTCAAAGCTGTTC-3', containing an SspI cleavage site) and a downstream primer (MDSII-02: 5'-GCTATTTAAATCTATTACCTCAACTGGATTCGGAATGTGCTGATTTCCATTG-3', containing a SwaI cleavage site) were designed, and the full-length human MDSII gene fragment was amplified by PCR from pUC57-MDSII. PCR reaction conditions: pre-denaturation at 94 °C for 5 min; 30 cycles of denaturation at 94 °C for 30 sec, annealing at 52 °C for 30 sec, and extension at 72 °C for 4 min 30 sec; final extension at 72 °C for 10 min. The PCR product (SEQ ID No.17) was separated by 0.8% agarose gel electrophoresis and recovered with a DNA recovery kit.
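Whether each primer really carries the intended restriction site can be checked with a simple string search, sketched below. The recognition sequences used are the standard ones for SspI (AATATT) and SwaI (ATTTAAAT); the primer sequences are copied from the text, and the function name is illustrative.

```python
# Recognition sequences of the two blunt-cutting enzymes used for cloning.
SITES = {"SspI": "AATATT", "SwaI": "ATTTAAAT"}

# Primers copied from the text (5' to 3').
UPSTREAM = "ATAATATTAAACCATGCTGCTTACCAAAAGGTTTTCAAAGCTGTTC"
DOWNSTREAM = "GCTATTTAAATCTATTACCTCAACTGGATTCGGAATGTGCTGATTTCCATTG"


def find_site(primer, enzyme):
    """Return the 0-based index of the first recognition site, or -1 if absent."""
    return primer.upper().find(SITES[enzyme])


ssp_pos = find_site(UPSTREAM, "SspI")
swa_pos = find_site(DOWNSTREAM, "SwaI")
print(f"SspI site in upstream primer at index {ssp_pos}")
print(f"SwaI site in downstream primer at index {swa_pos}")
```

Both sites sit near the 5' end of their primers, preceded by a few extra bases, which is the usual layout for adding cloning sites by PCR.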
(2) Construction of PGE-URA3-arm3-GAP-mnn2-MDSII expression vector
The PCR product was first digested with SwaI, and the gene fragment was then phosphorylated with T4 PNK (Takara Biotechnology (Dalian) Co., Ltd.); the PGE-URA3-GAP1 vector was double-digested with Ssp I and SwaI to obtain the large vector fragment; the gene fragment was ligated to the large vector fragment to give a recombinant plasmid, designated PGE-URA3-arm3-GAP-mnn2-MDSII. Sequencing confirmed that the construct was correct.
PGE-URA3-arm3-GAP-mnn2-MDSII is the recombinant vector obtained by inserting the DNA molecule shown in SEQ ID No.17 between the Ssp I and Swa I cleavage sites of the PGE-URA3-GAP1 vector.
2. Construction of recombinant Yeast expressing exogenous mannosidase II
About 10 μg of PGE-URA3-arm3-GAP-mnn2-MDSII plasmid was linearized with Msc I to obtain the linearized plasmid for transformation; yeast electrotransformation-competent cells were prepared as in step five above.
The host strain was the 1-8-4 engineered strain constructed in step seven. A single clone formed on the MD plate after transformation was designated 52-60.
Genomic DNA of 52-60 was extracted by the glass bead method and used as the template for PCR amplification with primers CeMNSII-1.2k-01 and CeMNSII-1.2k-02. A PCR product of 1.2 kb was obtained, showing that MDSII had been inserted into the genome, i.e., the strain was a positive engineered strain (FIG. 13A).
CeMNSII-1.2k-01: 5'-CAGATGGATGAGCATAGAGTTA-3';
CeMNSII-1.2k-02: 5'-GACAAGAGGATAATGAAGAGAC-3'.
The results of DSA-FACE glycoform analysis of strain 52-60 are shown in FIG. 13C. The major glycoform of the proteins expressed by the host after introduction of exogenous mannosidase II was GalGlcNAcMan3GlcNAc2.
9. Construction of glycosyl engineering yeast strains with mammalian Gal2GlcNAc2Man3GlcNAc2 and free of fucosylation structures
The glycosyl engineering yeast strain 150L2, which has the mammalian Gal2GlcNAc2Man3GlcNAc2 structure and no fucosylation, is the engineered strain obtained by inserting the GnTII DNA molecule (nucleotide sequence shown in SEQ ID No.18, encoding the protein shown in SEQ ID No.13) into the genome of host strain 52-60.
Within SEQ ID No.18, counting from the 5' end, nucleotides 1 to 108 are the mnn2 localization signal and nucleotides 109 to 1185 are the N-acetylglucosamine transferase II coding gene.
1. Construction of the N-acetylglucosamine transferase II (GnTII) expression vector containing the mnn2 localization signal
(1) Synthesis of GnTII Gene by Total Gene Synthesis
The GnTII gene containing mnn2 (SEQ ID No.18) was produced by whole-gene synthesis according to the sequence; it was synthesized by GenScript (Nanjing) and cloned into the pUC57 cloning vector to give pUC57-GnTII.
An upstream primer (mnn2-GnTII-01: 5'-ATAATATTAAACCATGCTGCTTACCAAAAGGTTTTCAAAGCTGTTC-3', containing an SspI cleavage site) and a downstream primer (GnTII-02: 5'-GCTATTTAAATTTATCACTGCAGTCTTCTATAACTTTTAC-3', containing a SwaI cleavage site) were designed, and the N-acetylglucosamine transferase II (GnTII) DNA molecule containing the mnn2 localization signal was amplified by PCR from pUC57-GnTII. PCR reaction conditions: pre-denaturation at 94 °C for 5 min; 30 cycles of denaturation at 94 °C for 30 sec, annealing at 52 °C for 30 sec, and extension at 72 °C for 2 min 30 sec; final extension at 72 °C for 10 min. The PCR product was separated by 0.8% agarose gel electrophoresis and recovered with a DNA recovery kit.
(2) Construction of PGE-URA3-arm3-GAP-mnn2-GnTII expression vector
Digestion and construction followed the same procedure as for PGE-URA3-arm3-GAP-mnn2-MDSII; the resulting recombinant plasmid was designated PGE-URA3-arm3-GAP-mnn2-GnTII. Sequencing confirmed that the construct was correct.
PGE-URA3-arm3-GAP-mnn2-GnTII is the recombinant vector obtained by inserting the DNA molecule shown in SEQ ID No.18 between the Ssp I and Swa I cleavage sites of the PGE-URA3-GAP1 vector.
2. Construction of recombinant Yeast expressing exogenous N-acetylglucosamine transferase II
About 10 μg of PGE-URA3-arm3-GAP-mnn2-GnTII plasmid was linearized with Msc I to obtain the linearized plasmid for transformation; yeast electrotransformation-competent cells were prepared as in step five above.
The host strain was the 52-60 engineered strain constructed in step eight. A single clone formed on the MD plate after transformation was designated 150L2.
Genomic DNA of 150L2 was extracted by the glass bead method and used as the template for PCR amplification with primers RnGnTII-0.8k-01 and RnGnTII-0.8k-02. A PCR product of 0.8 kb was obtained, showing that GnTII had been inserted into the genome, i.e., the strain was a positive engineered strain (FIG. 13B).
RnGnTII-0.8k-01: 5'-ATCAACAGTCTGATCTCTAGTG-3';
RnGnTII-0.8k-02: 5'-AGTTCATGGTCCCTAATATCTC-3'.
10. Knockout of the anti-Her2 antibody genes in the engineered strain
The yeast strain 3-5-11, in which the anti-Her2 antibody genes are inactivated, is the recombinant yeast obtained by introducing the DNA molecule shown in SEQ ID No.19 (the knockout sequence for the anti-Her2 antibody light- and heavy-chain genes) into Pichia pastoris 150L2, where it undergoes homologous recombination with the homologous sequence in the 150L2 genome and knocks out the anti-Her2 antibody light- and heavy-chain genes in the yeast genome.
An inactivation vector for the anti-Her2 antibody light- and heavy-chain genes was constructed, the knockout plasmid was transformed into Pichia pastoris, and positive engineered strains were identified by PCR; the yeast strain with the inactivated anti-Her2 antibody genes was designated 3-5-11.
11. Inactivation of O-mannosyltransferase I Gene in engineered Strain
Because the host strain was found to be unstable and prone to losing the MDSI and MDSII genes, before the O-mannosyltransferase I gene was inactivated, SEQ ID No.17 (MDSII) and SEQ ID No.14 (MDSI) were transformed into 3-5-11 again by the same methods as in steps eight and five of this example, so that the engineered strain carries two copies of each gene; the resulting host strain was designated 670.
The yeast strain 7b, in which the O-mannosyltransferase I gene is inactivated, is the yeast obtained by insertional inactivation of the DNA molecule encoding O-mannosyltransferase I (shown in SEQ ID No.8) in Pichia pastoris 670; it is designated 7b, i.e., GJK30. GJK30 was deposited at the China General Microbiological Culture Collection Center (CGMCC No. 19488) on 18 March 2020.
1. Construction of O-mannose transferase Gene inactivation vector
The AOX1TT terminator sequence was obtained by PCR using plasmid pPIC9 (Invitrogen) as the template. The terminator was amplified with primers AOX1TT-5 and AOX1TT-3 (AOX1TT-5: 5'-tctacgcgtccttagacatgactgttcctcagt-3'; AOX1TT-3: 5'-tctacgcgtaagcttgcacaaacgaacttc-3'). The PCR product was purified and recovered with a PCR product recovery and purification kit (Beijing Dingguo Biotechnology Co.) to give the AOX1TT terminator fragment.
The vector pYES2 (Invitrogen) used in the invention carries the yeast URA3 selection marker, which can be used for subsequent screening. To prevent the promoter of the URA3 gene from affecting other genes on the vector, an AOX1TT terminator was added at the end of the URA3 gene. Specifically, the recovered AOX1TT terminator fragment was digested with MluI; the digested fragment was ligated with vector pYES2 that had likewise been digested with MluI; the ligation product was transformed into E. coli competent cells Trans5α (Beijing TransGen Biotech Co., Ltd., catalog number CD201) and amplified; the clone with the correct sequence was designated Trans5α-pYES2-URA3-AOX1TT; and the plasmid was extracted to give the recombinant vector with an AOX1TT terminator added at the end of the URA3 gene, designated pYES2-URA3-AOX1TT.
To enable site-directed integration of the constructed vector into the Pichia pastoris PMT1 gene, a fragment of the PMT1 ORF was amplified by PCR to serve as the homologous recombination fragment. To ensure that integration of the inactivation vector into the PMT1 gene inactivates it, different combinations of stop codons were added to the primers at both ends, and a CYC1TT terminator was added at the 3' end of the amplified PMT1 fragment.
Genomic DNA of Pichia pastoris JC308 (Invitrogen) was extracted by the glass bead method (A. Adam et al., Laboratory Guidelines for Yeast Genetic Methods, Science Press, 2000) and used as the template to amplify the PMT1 gene fragment by PCR with primers PMT1-IN-5 and PMT1-IN-3.
PMT1-IN-5: 5'-tctatgcattaatgatagttaatgactaatagagtaaaacaagtcctcaagaggt-3';
PMT1-IN-3: 5'-tgacataactaattacatgatctattagtcattaactatcattagatcagagtggggacgactaagaaagc-3'.
The amplified PMT1 gene fragment, which carries different combinations of stop codons at its two ends, was designated PMT1-IN.
PCR conditions for amplifying the PMT1 gene fragment: pre-denaturation at 94 °C for 5 min; 25 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s, and extension at 72 °C for 1 min 40 s; final extension at 72 °C for 10 min. The PCR product was recovered to give the PMT1 gene fragment.
The CYC1TT terminator fragment was amplified by PCR using plasmid pYES2, which contains the CYC1TT terminator, as the template and primers CYC1TT-5 and CYC1TT-3 (CYC1TT-5: 5'-gctttcttagtcgtccccactctgatctaatgatagttaatgactaatagatcatgtaattagttatgtca-3'; CYC1TT-3: 5'-gcaaattaaagccttcgagcgtc-3'). PCR conditions: pre-denaturation at 94 °C for 5 min; 25 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s, and extension at 72 °C for 1 min; final extension at 72 °C for 10 min. The PCR product was recovered to give the CYC1TT terminator fragment.
The recovered CYC1TT terminator fragment and the PMT1-IN fragment (the amplified PMT1 gene fragment) were then used together as templates for PCR with primers PMT1-IN-5 and CYC1TT-3, which joins PMT1-IN and CYC1TT into the PMT1-IN-CYC1TT fusion fragment. PCR conditions: pre-denaturation at 94 °C for 5 min; 25 cycles of denaturation at 94 °C for 30 s, annealing at 55 °C for 30 s, and extension at 72 °C for 2.4 min; final extension at 72 °C for 10 min. The PCR product was recovered; it is the PMT1-IN-CYC1TT fusion fragment, in which PMT1-IN is joined to the CYC1TT terminator. The recovered product was digested with NsiI and then phosphorylated, and ligated to the vector backbone obtained by digesting pYES2-URA3-AOX1TT with NsiI and StuI; the resulting recombinant vector with the correct sequence is the PMT1 insertional inactivation vector PMT1-IN-pYES2.
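Joining two PCR fragments in a second PCR relies on the fragments sharing a complementary end. The sketch below checks this for the two inner primers given in the text, PMT1-IN-3 and CYC1TT-5, by comparing one with the reverse complement of the other. The sequences are typed in 10-nt chunks for readability, any whitespace is stripped, and the helper names are illustrative.

```python
def clean(seq):
    """Lower-case a sequence and remove any whitespace."""
    return "".join(seq.split()).lower()


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written in a, c, g, t."""
    return clean(seq).translate(str.maketrans("acgt", "tgca"))[::-1]


# Inner primers copied from the text (5' to 3').
PMT1_IN_3 = clean(
    "tgacataact aattacatga tctattagtc attaactatc "
    "attagatcag agtggggacg actaagaaag c"
)
CYC1TT_5 = clean(
    "gctttcttag tcgtccccac tctgatctaa tgatagttaa "
    "tgactaatag atcatgtaat tagttatgtc a"
)

primers_overlap = reverse_complement(PMT1_IN_3) == CYC1TT_5
print(f"PMT1-IN-3 is the reverse complement of CYC1TT-5: {primers_overlap}")
print(f"Shared region: {len(CYC1TT_5)} nt")
```

Because the two primers are exact reverse complements, the 3' end of the PMT1-IN fragment and the 5' end of the CYC1TT fragment carry the same sequence, which is what lets the outer primers PMT1-IN-5 and CYC1TT-3 amplify a single fused product.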
The amplified PMT1 gene fragment carries different combinations of stop codons at its 5' and 3' ends, and the CYC1TT terminator follows the stop codons at the 3' end, so the PMT1 gene cannot be expressed once the vector is correctly integrated into the genome. The pYES2 vector contains the URA3 gene of Pichia pastoris, and an AOX1TT terminator was inserted after the URA3 gene to prevent the URA3 promoter from driving the PMT1 gene. With the designed primers, a CYC1TT terminator fragment (272 bp) and a PMT1 fragment (907 bp) were obtained, consistent with the theoretical sizes. The fusion of the PMT1-IN fragment and the CYC1TT fragment is 1135 bp, and PCR identification and sequencing confirmed that the vector PMT1-IN-pYES2 was constructed successfully.
2. Construction of PMT1 Gene-inactivated Strain
Yeast 670 competent cells were prepared as follows:
A single colony of 670 was picked and inoculated into 2 mL of YPD+U medium (YPD medium supplemented with uracil to 100 μg/mL) and cultured on a shaker at 25 °C and 170 r/min for 48 h; 500 μL of the culture was then inoculated into 100 mL of YPD+U medium and cultured at 25 °C and 170 r/min for 24 h until the OD600 reached 1.0; the culture was centrifuged at 6000 r/min for 6 min at 4 °C and the cells were resuspended in 15 mL of cold sterile water; the cells were centrifuged again under the same conditions and resuspended in 15 mL of cold sterile water; they were centrifuged at 6000 r/min for 6 min at 4 °C and resuspended in 15 mL of cold 1 mol/L sorbitol; they were centrifuged once more under the same conditions; the supernatant was decanted and the cells were resuspended in 1 mL of cold 1 mol/L sorbitol to a volume of approximately 1.5 mL, giving yeast 670 competent cells, which were kept on ice until use.
Electroporation of the PMT1 insertional inactivation vector PMT1-IN-pYES2: the vector PMT1-IN-pYES2 was digested with EcoRV and recovered, and the final product was dissolved in 20 μL ddH2O to give the linearized plasmid; 85 μL of 670 competent cells was mixed with the linearized plasmid in an electroporation cuvette, kept on ice for 5 min, and electroporated (2 kV) according to the conditions in the Pichia electroporation manual; immediately after the pulse, 700 μL of 1 M sorbitol was added, and the cells were transferred to a 1.5 mL centrifuge tube, left at 25 °C for 1 h, spread on an MD+RH plate (solid MD medium supplemented with histidine and arginine, each at 100 μg/mL), and cultured at 25 °C; genomic DNA was extracted from clones grown on the plate and identified by PCR with the primers PMT1-ORF-OUT-5 and PMT1-ORF-OUT-3, which flank the PMT1 gene in the genome; the correctly identified clone was designated 7b, i.e., GJK30.
PMT1-ORF-OUT-5: 5'-aagacccatgccgaacacgac-3';
PMT1-ORF-OUT-3: 5'-gctctgaggcaccttgggtaa-3'.
The insertional inactivation vector integrates into the Pichia pastoris chromosome by insertion. Because the vector contains a PMT1 gene homologous fragment, its integration is in theory site-directed, i.e., the vector is inserted into the PMT1 gene, and integrants can be identified and screened with specifically designed primers. Clones grown on MD+RH plates under selection for the Pichia pastoris URA3 marker were identified by PCR with the primers PMT1-ORF-OUT-5 and PMT1-ORF-OUT-3, which flank the PMT1 gene. If the PMT1-IN-pYES2 vector is correctly integrated into the PMT1 gene, these primers give an 8.6 kb fragment, whereas the control (i.e., yeast X33) gives a 3 kb fragment (FIG. 14). The results show that the PMT1-IN-pYES2 vector was correctly integrated into the PMT1 gene; the strain was designated 7b, i.e., GJK30. Because different stop codons and terminators are built into the insertion vector, correct integration prevents expression of the PMT1 gene.
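The PCR read-out described above amounts to comparing an observed band with two expected sizes. A minimal sketch of that decision is shown below; the 8.6 kb and 3 kb values come from the text, while the 15% tolerance and the function name are assumptions made only for illustration.

```python
# Expected amplicon sizes (kb) with the flanking PMT1 primers, from the text.
EXPECTED_KB = {"integrated": 8.6, "control": 3.0}


def call_band(band_kb, tolerance=0.15):
    """Match an observed band to an expected size within a relative tolerance.

    The tolerance is an assumed value; gel-based size estimates are approximate.
    """
    for label, expected in EXPECTED_KB.items():
        if abs(band_kb - expected) <= tolerance * expected:
            return label
    return "ambiguous"


# Size difference between the two products, i.e. the implied size of the insertion.
implied_insert_kb = round(EXPECTED_KB["integrated"] - EXPECTED_KB["control"], 1)

print(call_band(8.5), call_band(3.1), call_band(5.0))
print(f"Implied insertion: {implied_insert_kb} kb")
```

A band that matches neither expected size is reported as ambiguous rather than forced into one of the two classes.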
12. Sugar type structure analysis of GJK30 engineering bacteria
To check whether the glycoform structure of the final GJK30 strain is correct, a reporter protein was introduced after the GJK30 engineered strain was obtained. An anti-Her2 antibody was used as the reporter protein; the construction of the anti-Her2 antibody expression vector and the vector transformation method are disclosed in the patent application (see Example 1). By this method the anti-Her2 antibody expression vector was transformed into the GJK30 host to give the GJK30-HL engineered strain expressing the anti-Her2 antibody. Its glycoforms differ from those of the control recombinant engineered strain obtained earlier (the Her2 antibody expression vector transformed into the GJK08 strain constructed in Example 1 of Chinese patent application 201410668305.X). The GJK30-HL engineered strain of the invention differs from the control in three respects: β-mannosyltransferases I to IV are knocked out, whereas only β-mannosyltransferase II is knocked out in the control; O-mannosyltransferase I is also inactivated, whereas it is not in the control; and the exogenous MDSI and MDSII genes were introduced twice, whereas they were introduced once in the control. Although both strains produce the Gal2GlcNAc2Man3GlcNAc2 structure, its proportion differs markedly between them: the proportion of the Gal2GlcNAc2Man3GlcNAc2 structure obtained by the GJK30 engineering bacteria is lower than 50% (FIG. 15A), the proportion of the glycoforms occupied by the Gal2GlcNAc2Man3 structure is more than 60%, and the glycoform is more uniform and simpler overall (FIG. 15B).
Many publications report that the Gal2GlcNAc2Man3GlcNAc2 glycoform affects the biological activity of proteins, such as the ADCC and CDC activities of antibodies, so its proportion directly affects many properties of a protein. The glycoform was analyzed by digestion with commercially available glycosidases (New England Biolabs, Beijing). As shown in FIG. 15C, because Gal2GlcNAc2Man3GlcNAc2 (G2) has no terminal N-acetylglucosamine, its structure is not changed by β-N-acetylglucosaminidase, but the exoglycosidase β1,4-galactosidase cleaves off the two galactose residues to give GlcNAc2Man3GlcNAc2 (G0); when the two exoglycosidases act together, galactose (Gal) and N-acetylglucosamine (GlcNAc) are removed in turn and the glycan becomes Man3GlcNAc2, which shows that the expressed glycoform is correct.
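The logic of the exoglycosidase test can be captured in a small model: an exoglycosidase removes a residue only when that residue is at the non-reducing terminus of an antenna. The sketch below is a deliberately simplified toy model under that assumption; the data layout and function names are illustrative, and linkage specificity is not modeled.

```python
# Each antenna is listed from the core outward; the last item is the terminal residue.
G2_ANTENNAE = [["GlcNAc", "Gal"], ["GlcNAc", "Gal"]]
CORE = "Man3GlcNAc2"


def digest(antennae, removable):
    """Strip terminal residues that the supplied exoglycosidases can remove.

    `removable` is the set of residue names the enzyme mix acts on.
    """
    arms = [list(arm) for arm in antennae]
    changed = True
    while changed:
        changed = False
        for arm in arms:
            if arm and arm[-1] in removable:
                arm.pop()
                changed = True
    return arms


def glycan_name(antennae):
    """Compose a name such as Gal2GlcNAc2Man3GlcNAc2 from the antennae."""
    def part(residue):
        n = sum(arm.count(residue) for arm in antennae)
        return "" if n == 0 else residue if n == 1 else f"{residue}{n}"
    return part("Gal") + part("GlcNAc") + CORE


hexosaminidase_only = glycan_name(digest(G2_ANTENNAE, {"GlcNAc"}))
galactosidase_only = glycan_name(digest(G2_ANTENNAE, {"Gal"}))
both_enzymes = glycan_name(digest(G2_ANTENNAE, {"Gal", "GlcNAc"}))
print(hexosaminidase_only, galactosidase_only, both_enzymes)
```

In this model the three digests reproduce the pattern described in the text: no change with β-N-acetylglucosaminidase alone, G0 with β1,4-galactosidase alone, and Man3GlcNAc2 with both.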
EXAMPLE 2 Construction of the SARS-CoV-2 S-RBD (RBD223) recombinant yeast strain
1. Acquisition of the SARS-CoV-2 S protein RBD gene and construction of the yeast expression vector
Based on the published sequence of the 'Wuhan-Hu-1' isolate (GenBank: MN908947.3), amino acids 319 to 541 of the S protein (R319-F541) were selected; the DNA sequence was optimized for the preferred codons of Pichia pastoris and synthesized by Beijing Nuoxer Genome Research Center Co., Ltd., and inserted between the XhoI and NotI restriction sites of the pPICZαA vector to give the recombinant expression vector pPICZα-S-RBD, i.e., the RBD223 expression vector.
The recombinant expression vector pPICZα-S-RBD is the recombinant plasmid obtained by inserting the DNA fragment shown in SEQ ID No.25 between the XhoI and NotI cleavage sites of the pPICZαA vector. SEQ ID No.25 is the coding sequence obtained by codon optimization of amino acids 319 to 541 (R319-F541) of the S protein of the SARS-CoV-2 'Wuhan-Hu-1' isolate, and encodes the SARS-CoV-2 S-RBD (RBD223) protein shown in SEQ ID No.21.
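The residue range R319-F541 fixes the size of the expressed domain. A short sketch of the arithmetic, together with the 1-based slicing that would be used on the full S protein sequence, is shown below; the function names are illustrative, and the demonstration string is a placeholder rather than the real sequence.

```python
# Residue range of the S protein selected in the text (1-based, inclusive).
RBD_START, RBD_END = 319, 541

n_residues = RBD_END - RBD_START + 1   # number of amino acids in the domain
coding_nt = 3 * n_residues             # coding nucleotides, stop codon not counted


def extract_region(protein, start, end):
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    return protein[start - 1:end]


# Placeholder string of arbitrary residues, only to exercise the slicing.
placeholder_protein = "A" * 1000
region = extract_region(placeholder_protein, RBD_START, RBD_END)

print(f"{n_residues} residues, {coding_nt} coding nucleotides")
```

The residue count of 223 matches the designation RBD223 used in the text.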
2. Transformation of yeast CGMCC No.19488 with the recombinant expression vector pPICZα-S-RBD
The yeast was revived by streaking on a YPD plate and single colonies were isolated. A revived single colony was inoculated into YPD liquid medium and cultured in a test tube to logarithmic phase; 1 mL was transferred into a shake flask containing 100 mL of YPD and cultured with shaking at 25 °C and 200 rpm until the OD600 was 1.3-1.5; the culture was centrifuged at 1500 g for 5 min at 4 °C and the supernatant discarded; the cells were resuspended in an equal volume of pre-cooled distilled water, centrifuged at 1500 g for 5 min at 4 °C, and the supernatant discarded, and this wash was repeated 3 times; the cells were then resuspended in an equal volume of pre-cooled 1 M sorbitol, centrifuged at 1500 g for 5 min at 4 °C, and the supernatant discarded, and this wash was repeated 3 times. The cell pellet, washed 3 times each with distilled water and sorbitol, was resuspended in 1 mL of 1 M sorbitol, dispensed into sterile centrifuge tubes at 100 μL per tube, and stored at -80 °C.
About 10 μg of the constructed expression plasmid pPICZα-S-RBD was linearized at a single site with the restriction enzyme BglII. The digestion system (50 μL) was: 43 μL expression plasmid pPICZα-S-RBD, 2 μL BglII, and 5 μL 10× NEB 3.1 buffer. After digestion at 37 °C for 1 h, a sample was separated by 1% agarose gel electrophoresis to check whether the plasmid was completely linearized. Digestion products shown to be completely linearized were recovered with a spin-column DNA fragment recovery kit, and the linearized plasmid was eluted with 30 μL of pure water.
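The digestion recipe can be checked for internal consistency with a few lines of arithmetic, sketched below. The volumes and the roughly 10 μg of plasmid come from the text; the variable names are illustrative.

```python
# Component volumes of the BglII digestion, in microlitres, as given in the text.
REACTION_UL = {"plasmid": 43, "BglII": 2, "10x buffer": 5}
PLASMID_UG = 10  # approximate amount of plasmid stated in the text

total_ul = sum(REACTION_UL.values())

# A 10x buffer diluted into the final volume should end up at 1x.
buffer_final_x = 10 * REACTION_UL["10x buffer"] / total_ul

# Approximate DNA concentration in the assembled reaction.
dna_ug_per_ul = PLASMID_UG / total_ul

print(f"Total volume: {total_ul} uL")
print(f"Final buffer strength: {buffer_final_x:.1f}x")
print(f"DNA in reaction: about {dna_ug_per_ul:.2f} ug/uL")
```

The stated volumes add up to the 50 μL system and bring the buffer to 1×, so the recipe is self-consistent.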
For electroporation, 15 μL of the linearized expression plasmid pPICZα-S-RBD was added to 100 μL of competent cells of the glycoengineered Pichia pastoris obtained in Example 1 (deposit number CGMCC No. 19488), mixed gently, transferred into a pre-cooled 0.2 cm electroporation cuvette, and kept on ice for 5 min. Following the yeast electroporation manual, the cells were pulsed at 2 kV, 900 μL of pre-cooled 1 M sorbitol was added immediately, and the mixture was transferred into a clean test tube and left to stand in a 25 °C incubator for 2 h. Then 1 mL of YPD liquid medium without antibiotic was added and the culture was shaken at 25 °C and 200 rpm for 3 to 4 h. 300 μL of the resulting culture was spread on a YPD plate containing Zeocin for selection and incubated inverted at 25 °C for 60-72 h.
3. Screening of recombinant expression strains
After single colonies had grown on the plate, 8 colonies were picked at random, inoculated onto a new YPD/Zeocin plate, and incubated inverted at 25 °C. Once colonies had grown, they were inoculated into 3 mL of YPD/Zeocin liquid medium and cultured with shaking at 25 °C and 200 rpm; when the culture had grown dense, it was transferred into 3 mL of BMGY medium at an inoculum of 5% (volume percentage) and cultured with shaking at 25 °C and 200 rpm; after 48 h, 0.5% (V/V) methanol was added every 12 h for induction. After 48 h of induction, the culture supernatant was collected by centrifugation at 12000 rpm for 3 min.
Culture supernatants collected after 48 h of methanol induction were screened by Western blot (WB), with the procedure roughly as follows: (1) the samples were separated on a 12% SDS-PAGE gel; (2) the samples on the SDS-PAGE gel were transferred to a PVDF membrane; (3) the PVDF membrane carrying the transferred protein was blocked with 5% milk blocking solution at room temperature for 1 hour; (4) the membrane was incubated for 2 hours with primary antibody (Anti-CoV spike Antibody, 40150-T62, san. Sedge) diluted 1:1000 in 5% milk; (5) the membrane was washed 5 times with PBST, 5 min each; (6) the membrane was incubated for 1 hour with secondary antibody (Sigma SAB3700885) diluted 1:4000 in 5% milk; (7) the membrane was washed 5 times with PBST, 5 min each; (8) the blot was developed with Pro-light HRP chemiluminescent detection reagent (Tiangen Biochemical, PA112-02).
The results are shown in FIG. 16. Western blot analysis shows that clones 1-7 express the protein at different levels; clone 7 had the higher expression level and was selected as the strain for the next experiments, designated CGMCC19488/pPICZα-S-RBD.
EXAMPLE 3 Expression and purification of recombinant SARS-CoV-2 S-RBD glycoprotein
1. Culture of recombinant strain CGMCC19488/pPICZα-S-RBD
The positive clone identified in Example 2 (i.e., recombinant strain CGMCC19488/pPICZα-S-RBD) was inoculated into YPD/Zeocin liquid medium and cultured at 25 °C and 200 rpm until the OD600 was 15-20; it was then inoculated into BMGY medium at 5% (V/V) and cultured at 25 °C and 200 rpm for 24 h, after which methanol was added to 0.5% (volume percentage) to induce expression of S-RBD; methanol was added every 12 h, samples were taken to monitor expression, and after 48 h of induction the culture supernatant was collected by centrifugation.
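The induction regime (0.5% v/v methanol every 12 h, harvest after 48 h) can be laid out as a simple schedule, as sketched below. The text does not state whether methanol is added at the harvest time point, so the sketch assumes additions at 0, 12, 24 and 36 h only; the 100 mL culture volume is likewise an assumed example value, and the function names are illustrative.

```python
def induction_times_h(total_h=48, interval_h=12):
    """Hours after the start of induction at which methanol is added.

    Assumes no addition at the harvest time point itself.
    """
    return list(range(0, total_h, interval_h))


def methanol_volume_ml(culture_ml, percent_v_v=0.5):
    """Methanol volume per addition for a given culture volume and % (v/v)."""
    return culture_ml * percent_v_v / 100


CULTURE_ML = 100  # assumed example volume, not given in the text

schedule = induction_times_h()
per_addition_ml = methanol_volume_ml(CULTURE_ML)

for t in schedule:
    print(f"t = {t:2d} h: add {per_addition_ml:.2f} mL methanol")
print("t = 48 h: harvest supernatant by centrifugation")
```

For a different culture volume, only `CULTURE_ML` needs to change; the per-addition volume scales linearly.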
SDS-PAGE analysis at different induction times is shown in FIG. 17. As can be seen, the expression level of the target protein increased with induction time.
2. Purification of SARS-CoV-2 S-RBD
1. Cation exchange chromatography
The culture supernatant obtained in step 1 after 48 hours of induction was diluted 2-fold with water, adjusted to pH 6.5, and purified on Capto MMC chromatography medium. The mobile phases were:
A: 20 mM PB (phosphate buffer), pH 6.5;
B: 100 mM Tris-HCl, pH 8.5 + 1 M NaCl.
After loading, the column was equilibrated with A and then eluted with B.
2. Hydrophobic chromatography
The sample purified on Capto MMC was further purified on Phenyl HP: impurity proteins were eluted with 40% (v/v) B, and the target protein was then eluted with 20% (v/v) B. The mobile phases were:
A: 20 mM Tris-HCl, pH 7.5 + 1 M AS (ammonium sulfate);
B: 20 mM Tris-HCl, pH 7.5.
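To make the step percentages easier to relate to salt concentration, the following minimal sketch converts a %B value into the salt concentration of a two-buffer mix, assuming ideal linear mixing by volume. The function name is hypothetical; the buffer compositions are those listed for the Phenyl HP step.

```python
def salt_at_percent_b(percent_b, salt_a_molar, salt_b_molar):
    """Salt concentration (M) of a mix containing percent_b % (v/v) of buffer B."""
    if not 0 <= percent_b <= 100:
        raise ValueError("percent_b must be between 0 and 100")
    frac_b = percent_b / 100.0
    # Assumption: volumes are additive, so concentration mixes linearly.
    return salt_a_molar * (1.0 - frac_b) + salt_b_molar * frac_b


# Phenyl HP step: A contains 1 M ammonium sulfate, B contains none.
for pct in (40, 20):
    conc = salt_at_percent_b(pct, 1.0, 0.0)
    print(f"{pct}% B -> {conc:.2f} M ammonium sulfate")
```

The same helper applies to the ion-exchange steps, where B rather than A carries the salt (for example `salt_at_percent_b(50, 0.0, 1.0)` for a half-and-half mix with 1 M NaCl in B).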
3. G25 desalting
The Phenyl HP-purified sample was desalted on G25 Fine chromatography medium, and the protein fraction was collected. Mobile phase: 20 mM Tris-HCl, pH 8.5.
4. Anion exchange chromatography
The desalted sample was purified on SOURCE 30Q chromatography medium. The mobile phases were:
A: 20 mM Tris-HCl, pH 8.5;
B: 20 mM Tris-HCl, pH 8.5 + 1 M NaCl.
After loading, the column was equilibrated with A and then eluted with B.
SDS-PAGE was performed and the results are shown in FIG. 18.
SDS-PAGE shows that the SARS-CoV-2 S-RBD protein was captured by Capto MMC. When the desalted sample was purified on SOURCE 30Q, the target protein flowed through, while almost all impurity proteins were adsorbed on the SOURCE 30Q medium.
Example 4 Glycoform analysis of SARS-CoV-2 S-RBD glycoprotein
1. Expression of SARS-CoV-2 S-RBD in wild-type Pichia pastoris
The expression plasmid pPICZα-S-RBD was transformed into Pichia pastoris X33 by electroporation; clones were screened and then analyzed by SDS-PAGE and WB (see Example 2 for the method). The results are shown in FIG. 19.
The N-glycans of wild-type yeast are hypermannosylated. As the figure shows, the SARS-CoV-2 S-RBD expressed by X33 runs as a diffuse smear on electrophoresis, whereas the SARS-CoV-2 S-RBD expressed by CGMCC19488, whose glycosylation pathway has been genetically engineered, runs as a single band.
2. PNGase F and Endo H digestion analysis of the SARS-CoV-2 S-RBD glycoprotein expressed by CGMCC19488
PNGase F cleaves high-mannose, hybrid and complex N-glycans at the amide linkage between the innermost N-acetylglucosamine (GlcNAc) of the sugar chain and asparagine. Endo H cleaves only high-mannose and hybrid N-glycans, and its cleavage site is the glycosidic bond between the first and second innermost N-acetylglucosamine (GlcNAc) residues of the sugar chain. The glycoform of the SARS-CoV-2 S-RBD glycoprotein can therefore be determined preliminarily by PNGase F and Endo H digestion.
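The logic of using the two enzymes together can be sketched as a small lookup. This is an illustrative sketch only: the names `CLEAVES`, `is_cleaved` and `consistent_classes` are hypothetical, and the function assumes a single homogeneous glycan population, which real samples often are not.

```python
# Glycan classes each enzyme cleaves, as stated in the text above.
CLEAVES = {
    "PNGase F": {"high-mannose", "hybrid", "complex"},
    "Endo H": {"high-mannose", "hybrid"},
}


def is_cleaved(enzyme, glycan_class):
    """True if the enzyme removes an N-glycan of the given class."""
    return glycan_class in CLEAVES[enzyme]


def consistent_classes(pngase_f_shift, endo_h_shift):
    """Classes consistent with whether each digest produced a band shift."""
    return sorted(
        c for c in CLEAVES["PNGase F"]
        if is_cleaved("PNGase F", c) == pngase_f_shift
        and is_cleaved("Endo H", c) == endo_h_shift
    )


print(consistent_classes(True, False))  # shift with PNGase F only
print(consistent_classes(True, True))   # shift with both enzymes
```

A sample carrying a mixture of classes would show a partial shift with Endo H and a complete shift with PNGase F, which this single-population sketch does not model.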
The purified SARS-CoV-2 S-RBD glycoprotein expressed by CGMCC19488 was digested according to the method described in the product instructions, and the digests were analyzed by SDS-PAGE; the results are shown in FIG. 20.
The Endo H digestion results show that the SARS-CoV-2 S-RBD glycoprotein expressed by CGMCC19488 carries complex-type and hybrid-type N-glycans.
3. Analysis of the glycoform structure of SARS-CoV-2 S-RBD glycoprotein by DSA-FACE
1. Preparation of SARS-CoV-2 S-RBD glycoprotein N-glycan sample
Following a published method (Letters in Biotechnology, 2008, 19(6): 885-888), the sugar chain sample released by PNGase F digestion was purified on a Carbograph column. The column was activated with mobile phase A (80% acetonitrile, 0.1% TFA; percentages by volume) and washed with water; the sample was loaded in water and washed with water, then eluted with mobile phase B (25% acetonitrile, 0.05% TFA; percentages by volume). The elution peak was collected and freeze-dried, and the residue was stored at -20 ℃ until use.
2. APTS labeling of N-sugar chain samples
To the dried sugar chain sample were added 1 μL of 20 mM APTS solution and 1 μL of 1 M NaBH3CN solution (dissolved in DMSO); the mixture was mixed, sealed, and reacted in a 37 ℃ water bath for 18 hours.
3. Purification of APTS-labeled sugar chains by Sephadex G10
This method has been reported to retain more than 70% of the labeled complex, remove 90% of the free APTS, and remove some salts. The labeled sample was purified twice on Sephadex G10, eluting with 30 μL of ddH2O each time, and then vacuum freeze-dried. The sugar chain structure of the SARS-CoV-2 S-RBD glycoprotein expressed by CGMCC19488 was analyzed by DNA sequencer-assisted fluorophore-assisted carbohydrate electrophoresis (DSA-FACE) on a 3100 DNA sequencer, using the five N-glycan structures Man5-9GlcNAc2 of commercial bovine ribonuclease B (RNase B) as standards. The results are shown in FIG. 21.
As can be seen from the figure, the RBD glycoforms are Gal2GlcNAc2Man3GlcNAc2, GalGlcNAcMan5GlcNAc2 or Man5GlcNAc2.
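The three glycoform names can be compared more easily as monosaccharide compositions. The sketch below parses the condensed names used above and counts hexose (Gal, Man) and N-acetylhexosamine (GlcNAc) residues; the parser and its names are illustrative assumptions, not part of the described method.

```python
import re
from collections import Counter

# Gal and Man are hexoses (Hex); GlcNAc is an N-acetylhexosamine (HexNAc).
CLASS_OF = {"Gal": "Hex", "Man": "Hex", "GlcNAc": "HexNAc"}


def composition(name):
    """Count Hex and HexNAc residues in a condensed name such as 'Man5GlcNAc2'."""
    counts = Counter()
    for sugar, n in re.findall(r"(GlcNAc|Gal|Man)(\d*)", name):
        # A residue written without a number occurs once.
        counts[CLASS_OF[sugar]] += int(n) if n else 1
    return dict(counts)


for glycan in ("Gal2GlcNAc2Man3GlcNAc2", "GalGlcNAcMan5GlcNAc2", "Man5GlcNAc2"):
    print(glycan, composition(glycan))
```

The complex, hybrid and high-mannose forms named in the text come out as Hex5HexNAc4, Hex6HexNAc3 and Hex5HexNAc2 respectively.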
Example 5 Mouse immunization experiment
Immunization methods have been disclosed in a number of documents, such as Replication of Animal Models of Human Diseases, edited by Li Cai, People's Medical Publishing House. The method was as follows: 20 female Balb/c mice aged 6-8 weeks were randomly divided into 2 groups, a saline group and an immune group, where the immune group received 10 μg RBD + 100 μg Al(OH)3. Here RBD is the SARS-CoV-2 S-RBD glycoprotein expressed by CGMCC19488 and prepared as described above, and each 100 μl dose contains 10 μg RBD and 100 μg Al(OH)3. Each group was immunized intramuscularly with 100 μl on days 0 and 14 and bled on day 28.
The anti-RBD antibody titer in the serum of each group of mice was measured by indirect ELISA. Plates were coated with the SARS-CoV-2 S-RBD expressed by CGMCC19488 and prepared as described above; the remaining steps followed Short Protocols in Molecular Biology [M], Science Press, 2008.
The results are shown in FIG. 22. As can be seen from the figure, the antibody titer of the immune group reached 1:10000, whereas that of the control group was only 1:10.
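The dose composition above translates directly into working concentrations and total material. The following sketch does the arithmetic; the function name and the equal split of 10 mice per group are assumptions (the text only says the 20 mice were randomly divided into 2 groups), and no overage for dead volume is included.

```python
def formulation(n_mice, n_doses, dose_ul=100.0, rbd_ug=10.0, alum_ug=100.0):
    """Concentrations and totals for an RBD + Al(OH)3 formulation."""
    total_doses = n_mice * n_doses
    return {
        # ug per ul is numerically equal to mg per ml.
        "rbd_mg_per_ml": rbd_ug / dose_ul,
        "alum_mg_per_ml": alum_ug / dose_ul,
        "total_volume_ul": total_doses * dose_ul,
        "total_rbd_ug": total_doses * rbd_ug,
        "total_alum_ug": total_doses * alum_ug,
    }


# Assumption: 10 mice in the immune group, two immunizations (days 0 and 14).
plan = formulation(n_mice=10, n_doses=2)
print(plan)
```

Under these assumptions the immune group needs 2 ml of a formulation at 0.1 mg/ml RBD and 1 mg/ml Al(OH)3.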
Example 6 Virus neutralization assay
In Example 5, sera were collected from the two groups of mice 14 days after the second immunization, incubated at 56 ℃ for 30 min, and diluted with physiological saline at a certain dilution. Virus neutralization assays were performed according to conventional methods (reference: Feng-Cai Zhu, et al. Safety, tolerability, and immunogenicity of a recombinant adenovirus type-5 vectored COVID-19 vaccine: a dose-escalation, open-label, non-randomised, first-in-human trial. Lancet. 2020 May 22; S0140-6736(20)31208-3. doi:10.1016/S0140-6736(20)31208-3). The steps are as follows:
1. Cell preparation: 293T-ACE2 cells (Sino Biological, Beijing, Cat# OEC001) were digested, diluted to 3 × 10^4/mL, and plated in 96-well plates at 100 μl/well.
2. Serum dilution: serum was diluted with physiological saline at a certain dilution, with 3-5 replicate wells per dilution.
3. Virus dilution: virus (strain SARS-CoV-2/human/CHN/Wuhan-IME-BJ01/2020, described in the above reference) was diluted to 1 × 10^4 TCID50/ml.
4. Neutralization: each serum dilution was mixed with an equal volume of virus dilution and incubated for 1 h at 37 ℃ in a 5% CO2 incubator.
5. Infection: the incubated mixture was added to the cells at 100 μl/well.
6. Detection: the plates were cultured at 37 ℃ in a 5% CO2 incubator for 60 h, and the dilution corresponding to 50% protection was calculated.
As shown in FIG. 23, the neutralizing antibody titer of the 10 μg RBD + 100 μg Al(OH)3 group was 1:25, significantly higher than that of the negative control group.
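The text does not say how the dilution giving 50% protection is calculated. One common approach, shown here purely as an assumed sketch, is linear interpolation on a log10 dilution scale between the two dilutions that bracket 50%. The protection values in the example are invented for illustration and are not data from this document.

```python
import math


def cells_per_well(cells_per_ml=3e4, volume_ul=100.0):
    """Cells seeded per well for a given density and plating volume."""
    return cells_per_ml * volume_ul / 1000.0


def titer_50(dilutions, protection):
    """Reciprocal dilution giving 50% protection, or None if not bracketed.

    dilutions: reciprocal serum dilutions in increasing order (e.g. 10, 20, 40)
    protection: fraction of wells protected at each dilution
    """
    pairs = list(zip(dilutions, protection))
    for (d_lo, p_lo), (d_hi, p_hi) in zip(pairs, pairs[1:]):
        if p_lo >= 0.5 >= p_hi and p_lo != p_hi:
            frac = (p_lo - 0.5) / (p_lo - p_hi)
            log_d = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_d
    return None


# Hypothetical readout: fraction of replicate wells protected per dilution.
example = titer_50([10, 20, 40, 80], [1.0, 0.75, 0.25, 0.0])
print(f"cells per well: {cells_per_well():.0f}")
print(f"50% protection at about 1:{example:.0f}")
```

Other estimators, such as Reed-Muench or Spearman-Kärber, are also widely used for this kind of endpoint; which one was applied here is not stated.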
Example 7 Expression and purification of RBD210, RBD216 and RBD219
Following the construction of the SARS-CoV-2 S-RBD (RBD223) recombinant yeast strain in Example 2, the expression and purification of recombinant SARS-CoV-2 S-RBD glycoprotein in Example 3, the glycoform analysis in Example 4, the mouse immunization experiment in Example 5 and the virus neutralization test in Example 6, expression vectors and yeast expression strains for RBD210, RBD216 and RBD219 were constructed, the RBD210, RBD216 and RBD219 glycoproteins were expressed and purified, and glycoform analysis, mouse immunization experiments and virus neutralization tests were carried out on each.
The amino acid sequence of RBD219 is shown as SEQ ID No.22 and is the R319-K537 region of the S protein of the SARS-CoV-2 "Wuhan-Hu-1" isolate (the corresponding coding gene sequence used in this example is shown as SEQ ID No.26). The amino acid sequence of RBD216 is shown as SEQ ID No.23 and is the R319-V534 region of the S protein of the SARS-CoV-2 "Wuhan-Hu-1" isolate (the corresponding coding gene sequence used in this example is shown as SEQ ID No.27). The amino acid sequence of RBD210 is shown as SEQ ID No.24 and is the R319-K528 region of the S protein of the SARS-CoV-2 "Wuhan-Hu-1" isolate (the corresponding coding gene sequence used in this example is shown as SEQ ID No.28). SEQ ID No.26 to SEQ ID No.28 are also codon-optimized nucleotide sequences.
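The construct names follow from the residue ranges: an inclusive span from residue `start` to residue `end` contains `end - start + 1` residues. The short check below, with a hypothetical `CONSTRUCTS` table built from the ranges given above, confirms that each name matches its length.

```python
# Inclusive residue ranges on the S protein, as given in the text.
CONSTRUCTS = {
    "RBD219": (319, 537),
    "RBD216": (319, 534),
    "RBD210": (319, 528),
}


def span_length(start, end):
    """Number of residues in an inclusive range start..end."""
    return end - start + 1


lengths = {name: span_length(*span) for name, span in CONSTRUCTS.items()}
for name, n in lengths.items():
    # The numeric suffix of each construct name should equal its length.
    print(name, n, n == int(name[3:]))
```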
SDS-PAGE analysis of the RBD210, RBD216, RBD219 and RBD223 glycoproteins is shown in FIG. 24, which shows the purified RBD210, RBD216, RBD219 and RBD223 proteins.
The results of the glycoform analysis of the RBD210, RBD216, RBD219 and RBD223 glycoproteins are shown in FIG. 25. As can be seen, the main glycoforms are Gal2GlcNAc2Man3GlcNAc2, GalGlcNAcMan5GlcNAc2 or Man5GlcNAc2.
The mouse immunization experiments show that the antibody titers induced in mice by the RBD210, RBD216, RBD219 and RBD223 proteins all reached about 1:10000, with no difference between the groups, and were significantly higher than that of the negative control group (FIG. 26).
The neutralization experiments show that the neutralizing antibody titers induced in mice by the RBD210, RBD216, RBD219 and RBD223 proteins were all about 1:25, with no difference between the groups, and were significantly higher than that of the negative control group (FIG. 27).
The present application is described in detail above. It will be apparent to those skilled in the art that the present application can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the application and without undue experimentation. While the application has been described with respect to specific embodiments, it will be appreciated that the application may be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. The application of some of the basic features may be done in accordance with the scope of the claims that follow.
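The sequence listing that follows gives protein sequences as three-letter residue codes with position numbers on alternate lines. For readers who want one-letter sequences, the sketch below shows one way to convert such a listing. The parser is an illustrative assumption: it handles only the protein entries and the field tags visible here, and the sample text is the beginning of SEQ ID No.1.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_listing(text):
    """Map each SEQ ID number to its one-letter protein sequence."""
    seqs, current, in_residues = {}, None, False
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "<210>":          # start of a new sequence record
            current, in_residues = int(tokens[1]), False
            seqs[current] = []
        elif tokens[0] == "<400>":        # residues begin after this tag
            in_residues = True
        elif tokens[0].startswith("<"):   # any other field tag
            in_residues = False
        elif in_residues and all(t in THREE_TO_ONE for t in tokens):
            seqs[current].extend(THREE_TO_ONE[t] for t in tokens)
        # Lines of position numbers fall through and are ignored.
    return {k: "".join(v) for k, v in seqs.items()}


sample = """<210> 1
<211> 404
<212> PRT
<213> Artificial sequence
<400> 1
Met Ala Lys Ala Asp Gly Ser Leu Leu Tyr Tyr Asn Pro His Asn Pro
1 5 10 15
Pro Arg Arg Tyr Tyr Phe Tyr Met Ala Ile Phe Ala Val Ser Val Ile
20 25 30
"""
parsed = parse_listing(sample)
print(parsed[1])
```

When a whole record is parsed, comparing the length of the result with the `<211>` value is a simple consistency check.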
<110> Academy of Military Medical Sciences, Academy of Military Sciences of the Chinese People's Liberation Army
<120> Preparation method and application of coronavirus S protein RBD glycoprotein
<130> GNCLN201449
<160> 28
<170> PatentIn version 3.5
<210> 1
<211> 404
<212> PRT
<213> Artificial sequence
<400> 1
Met Ala Lys Ala Asp Gly Ser Leu Leu Tyr Tyr Asn Pro His Asn Pro
1 5 10 15
Pro Arg Arg Tyr Tyr Phe Tyr Met Ala Ile Phe Ala Val Ser Val Ile
20 25 30
Cys Val Leu Tyr Gly Pro Ser Gln Gln Leu Ser Ser Pro Lys Ile Asp
35 40 45
Tyr Asp Pro Leu Thr Leu Arg Ser Leu Asp Leu Lys Thr Leu Glu Ala
50 55 60
Pro Ser Gln Leu Ser Pro Gly Thr Val Glu Asp Asn Leu Arg Arg Gln
65 70 75 80
Leu Glu Phe His Phe Pro Tyr Arg Ser Tyr Glu Pro Phe Pro Gln His
85 90 95
Ile Trp Gln Thr Trp Lys Val Ser Pro Ser Asp Ser Ser Phe Pro Lys
100 105 110
Asn Phe Lys Asp Leu Gly Glu Ser Trp Leu Gln Arg Ser Pro Asn Tyr
115 120 125
Asp His Phe Val Ile Pro Asp Asp Ala Ala Trp Glu Leu Ile His His
130 135 140
Glu Tyr Glu Arg Val Pro Glu Val Leu Glu Ala Phe His Leu Leu Pro
145 150 155 160
Glu Pro Ile Leu Lys Ala Asp Phe Phe Arg Tyr Leu Ile Leu Phe Ala
165 170 175
Arg Gly Gly Leu Tyr Ala Asp Met Asp Thr Met Leu Leu Lys Pro Ile
180 185 190
Glu Ser Trp Leu Thr Phe Asn Glu Thr Ile Gly Gly Val Lys Asn Asn
195 200 205
Ala Gly Leu Val Ile Gly Ile Glu Ala Asp Pro Asp Arg Pro Asp Trp
210 215 220
His Asp Trp Tyr Ala Arg Arg Ile Gln Phe Cys Gln Trp Ala Ile Gln
225 230 235 240
Ser Lys Arg Gly His Pro Ala Leu Arg Glu Leu Ile Val Arg Val Val
245 250 255
Ser Thr Thr Leu Arg Lys Glu Lys Ser Gly Tyr Leu Asn Met Val Glu
260 265 270
Gly Lys Asp Arg Gly Ser Asp Val Met Asp Trp Thr Gly Pro Gly Ile
275 280 285
Phe Thr Asp Thr Leu Phe Asp Tyr Met Thr Asn Val Asn Thr Thr Gly
290 295 300
His Ser Gly Gln Gly Ile Gly Ala Gly Ser Ala Tyr Tyr Asn Ala Leu
305 310 315 320
Ser Leu Glu Glu Arg Asp Ala Leu Ser Ala Arg Pro Asn Gly Glu Met
325 330 335
Leu Lys Glu Lys Val Pro Gly Lys Tyr Ala Gln Gln Val Val Leu Trp
340 345 350
Glu Gln Phe Thr Asn Leu Arg Ser Pro Lys Leu Ile Asp Asp Ile Leu
355 360 365
Ile Leu Pro Ile Thr Ser Phe Ser Pro Gly Ile Gly His Ser Gly Ala
370 375 380
Gly Asp Leu Asn His His Leu Ala Tyr Ile Arg His Thr Phe Glu Gly
385 390 395 400
Ser Trp Lys Asp
<210> 2
<211> 462
<212> PRT
<213> Artificial sequence
<400> 2
Met Ser Thr Asp Ser Asn Leu Gly Tyr Gly Ile Ser Ile Ser Gly Gly
1 5 10 15
Ser Arg Ser Thr Gln Ser Leu Gly Thr Ser Arg Val Thr Pro Ser Arg
20 25 30
Ser Ala Asn His Glu Gly Lys Glu Asn Lys Ala Phe Ser Met Ile Ser
35 40 45
Pro Lys Lys Leu Ile Asn Lys Leu Ser Lys Ser Ser Val Ser Ser Asn
50 55 60
Asn Thr Ser Ser Ser Asn His Asp Ser Phe Val Asp Arg Lys Tyr Lys
65 70 75 80
Ile Glu Ile Glu Asn Ser Phe Ser Asp Arg Ser Val Ser Glu Val Asp
85 90 95
Leu Leu Glu Asp Ser Leu Asp Thr Thr Glu Gly Asp Ser Gly Glu Asn
100 105 110
Leu Val Ser Thr Pro Thr Gln Val Thr Leu Arg Pro Lys Arg Gly Asn
115 120 125
Ser Gln Asp Arg Asn Glu Asn Arg Val Leu Lys Glu Lys Glu Thr Ala
130 135 140
Val Arg Glu Ser Gln Arg His Thr Gly Phe Phe Thr Glu Ser Met Leu
145 150 155 160
Ser Pro Ser Asp Gly Ser Arg Gln Asp Thr Ser Asp Ser Pro Gly Ser
165 170 175
Ile Ser Ile Pro Thr Ala Glu Leu Ser Lys Lys Asn Leu Ser Asp Val
180 185 190
Ser Lys Ser Thr Ser Glu Asn Ser His Asn Arg Lys Trp Glu Ala Arg
195 200 205
Ser Ser Leu Leu Pro Glu Asn Leu Ser Ser Ile His Leu Asp Asp Ser
210 215 220
Pro Ile Glu Ile Tyr Glu Asp Ala Glu Glu Ile Ile Asp Glu Thr Val
225 230 235 240
Glu Glu Pro Arg Ser Ser Ile Pro Leu Gln Asn Glu Trp Glu Met Glu
245 250 255
Asp Thr Ile Leu Glu Gly Arg Leu Val Gln Ser Ala Ser Asp Pro Val
260 265 270
Ile Thr Ser Asn Asp Ile Ser Lys Glu Leu Arg Lys Ser Ile Ser Thr
275 280 285
Pro Ala Leu Thr His Ser Asp Leu Val Asp Phe Arg Lys Val Ile Pro
290 295 300
Gly Ser Ser His Tyr His Val Phe Thr Asp Pro Lys Ser Pro Phe Thr
305 310 315 320
Glu Asp Pro Ser Gln Leu Ala Tyr His Lys Ile Arg Asp Arg Asn Phe
325 330 335
Asp Ala His Tyr Ser Thr Asp Pro Ile Arg Leu Ser Ser Gly Ser Ser
340 345 350
Ser Glu Gly Ser Asp Glu Lys Asn Leu Leu Leu Gly Ser Arg Lys Pro
355 360 365
Ser Asp Pro Tyr Arg Leu Pro Tyr Glu Asp Glu Asp Gly Tyr Arg Phe
370 375 380
Trp Thr Lys Thr Pro Leu Asn Arg Glu Cys Pro Lys Arg Val Ala Leu
385 390 395 400
Trp Leu Leu Val Gly Ala Ile Leu Ala Pro Pro Val Trp Ile Met Met
405 410 415
Tyr Val Gly Phe Leu Asp Ser Ser Val Gly Arg Leu Pro Pro Lys Tyr
420 425 430
Arg Val Ile Ser Gly Val Leu Ala Leu Ser Met Ile Ile Leu Thr Ala
435 440 445
Met Gly Ile Ala Val Gly Phe Ala Tyr Gly Leu Asn Asn Arg
450 455 460
<210> 3
<211> 652
<212> PRT
<213> Artificial sequence
<400> 3
Met Phe Lys Glu Thr Ser Lys Asn Leu Phe Gly Ser Ile Asn Thr Phe
1 5 10 15
Asn Thr Val Glu Tyr Val Met Tyr Met Met Leu Leu Leu Thr Ala Tyr
20 25 30
Phe Leu Asn His Leu Leu His Ser Leu Asp Asn Ile Asn His Leu Val
35 40 45
Glu Ser Asp Val Asn Tyr Gln Leu Leu Gln Arg Val Thr Asn Lys Val
50 55 60
Lys Leu Phe Asp Glu Glu Ala Val Leu Pro Phe Ala Lys Asn Leu Asn
65 70 75 80
Arg Arg Thr Glu Arg Phe Asp Pro Arg Leu Pro Val Ala Ala Tyr Leu
85 90 95
Arg Ser Leu Gln Asp Gln Tyr Ser Glu Leu Pro Gln Gly Thr Asp Leu
100 105 110
Asn Asp Ile Pro Pro Leu Glu Val Ser Phe His Trp Asp Asp Trp Leu
115 120 125
Ser Leu Gly Ile Ala Ser Thr Phe Trp Asp Ala Phe Asp Asn Tyr Asn
130 135 140
Lys Arg Gln Gly Glu Asn Ala Ile Ser Tyr Glu Gln Leu Gln Ala Ile
145 150 155 160
Leu Val Asn Asp Leu Glu Asp Phe Ser Pro Tyr Thr Ala His Ile Leu
165 170 175
His Ser Asn Val Glu Val Tyr Lys Tyr Arg Thr Ile Pro Gln Lys Ile
180 185 190
Val Tyr Met Ser Asn Lys Gly Tyr Phe Glu Leu Leu Val Thr Glu Lys
195 200 205
Glu Lys Leu Ser Asn Glu Gly Leu Trp Ser Ile Phe His Gln Lys Gln
210 215 220
Gly Gly Leu Asn Glu Phe Ser Ser Leu Asn Leu Ile Glu Glu Val Asp
225 230 235 240
Ala Leu Asp Glu Ile Tyr Asp Ser Lys Gly Leu Pro Ala Trp Asp Pro
245 250 255
Pro Phe Pro Glu Glu Leu Asp Ala Ser Asp Glu Asp Leu Pro Phe Asn
260 265 270
Ala Thr Glu Glu Leu Ala Lys Val Glu Gln Ile Lys Glu Pro Lys Leu
275 280 285
Glu Asp Ile Phe Tyr Gln Glu Gly Leu Gln His Gly Ile Gln Thr Leu
290 295 300
Pro Ser Asp Ala Ser Val Tyr Phe Pro Val Asn Tyr Val Glu Asn Asp
305 310 315 320
Pro Gly Leu Gln Ser His His Leu His Phe Pro Phe Phe Ser Gly Met
325 330 335
Val Leu Pro Arg Glu Ile His Ser Ser Val His His Met Asn Lys Ala
340 345 350
Phe Phe Leu Phe Ala Arg Gln His Gly Tyr Val Val Trp Phe Phe Tyr
355 360 365
Gly Asn Leu Ile Gly Trp Tyr Tyr Asn Gly Asn Asn His Pro Trp Asp
370 375 380
Ser Asp Ile Asp Ala Ile Met Pro Met Ala Glu Met Ala Arg Met Ala
385 390 395 400
His His His Asn Asn Thr Leu Ile Ile Glu Asn Pro His Asp Gly Tyr
405 410 415
Gly Thr Tyr Leu Leu Thr Ile Ser Pro Trp Phe Thr Lys Lys Thr Arg
420 425 430
Gly Gly Asn His Ile Asp Gly Arg Phe Val Asp Val Lys Arg Gly Thr
435 440 445
Tyr Ile Asp Leu Ser Ala Ile Ser Ala Met His Gly Ile Tyr Pro Asp
450 455 460
Trp Val Arg Asp Gly Val Lys Glu Asn Pro Lys Asn Leu Ala Leu Ala
465 470 475 480
Asp Lys Asn Gly Asn Trp Tyr Leu Thr Arg Asp Ile Leu Pro Leu Arg
485 490 495
Arg Thr Ile Phe Glu Gly Ser Arg Ser Tyr Thr Val Lys Asp Ile Glu
500 505 510
Asp Thr Leu Leu Arg Asn Tyr Gly Asp Lys Val Leu Ile Asn Thr Glu
515 520 525
Leu Ala Asp His Glu Trp His Asp Asp Trp Lys Met Trp Val Gln Lys
530 535 540
Lys Lys Tyr Cys Thr Tyr Glu Glu Phe Glu Asp Tyr Leu Ser Ala His
545 550 555 560
Gly Gly Val Glu Tyr Asp Glu Asp Gly Val Leu Thr Leu Glu Gly Ala
565 570 575
Cys Gly Phe Glu Glu Val Arg Gln Asp Trp Ile Ile Thr Arg Glu Ser
580 585 590
Val Asn Leu His Met Lys Glu Trp Glu Ala Ile Gln Arg Asn Glu Ser
595 600 605
Thr Thr Glu Tyr Thr Ala Lys Asp Leu Pro Arg Tyr Arg Pro Asp Ser
610 615 620
Phe Lys Asn Leu Leu Asp Gly Val Ser Asn His Gly Asn Gly Asn Val
625 630 635 640
Gly Lys Ile Glu His Val Lys Leu Glu His Asn Asp
645 650
<210> 4
<211> 594
<212> PRT
<213> Artificial sequence
<400> 4
Met Arg Ile Arg Ser Asn Val Leu Leu Leu Ser Thr Ala Gly Ala Leu
1 5 10 15
Ala Leu Val Trp Phe Ala Val Val Phe Ser Trp Asp Asp Lys Ser Ile
20 25 30
Phe Gly Ile Pro Thr Pro Gly His Ala Val Ala Ser Ala Tyr Asp Ser
35 40 45
Ser Val Thr Leu Gly Thr Phe Asn Asp Met Glu Val Asp Ser Tyr Val
50 55 60
Thr Asn Ile Tyr Asp Asn Ala Pro Val Leu Gly Cys Tyr Asp Leu Ser
65 70 75 80
Tyr His Gly Leu Leu Lys Val Ser Pro Lys His Glu Ile Leu Cys Asp
85 90 95
Met Lys Phe Ile Arg Ala Arg Val Leu Glu Thr Glu Ala Tyr Ala Ala
100 105 110
Leu Lys Asp Leu Glu His Lys Lys Leu Thr Glu Glu Glu Lys Ile Glu
115 120 125
Lys His Trp Phe Thr Phe Tyr Gly Ser Ser Val Phe Leu Pro Asp His
130 135 140
Asp Val His Tyr Leu Val Arg Arg Val Val Phe Ser Gly Glu Gly Lys
145 150 155 160
Ala Asn Arg Pro Ile Thr Ser Ile Leu Val Ala Gln Ile Tyr Asp Lys
165 170 175
Asn Trp Asn Glu Leu Asn Gly His Phe Leu Asn Val Leu Asn Pro Asn
180 185 190
Thr Gly Lys Leu Gln His His Ala Phe Pro Gln Val Leu Pro Ile Ala
195 200 205
Val Asn Trp Asp Arg Asn Ser Lys Tyr Arg Gly Gln Glu Asp Pro Arg
210 215 220
Val Val Leu Arg Arg Gly Arg Phe Gly Pro Asp Pro Leu Val Met Phe
225 230 235 240
Asn Thr Leu Thr Gln Asn Asn Lys Leu Arg Arg Leu Phe Thr Ile Ser
245 250 255
Pro Phe Asp Gln Tyr Lys Thr Val Met Tyr Arg Thr Asn Ala Phe Lys
260 265 270
Met Gln Thr Thr Glu Lys Asn Trp Val Pro Phe Phe Leu Lys Asp Asp
275 280 285
Gln Glu Ser Val His Phe Val Tyr Ser Phe Asn Pro Leu Arg Val Leu
290 295 300
Asn Cys Ser Leu Asp Asn Gly Ala Cys Asp Val Leu Phe Glu Leu Pro
305 310 315 320
His Asp Phe Gly Met Ser Ser Glu Leu Arg Gly Ala Thr Pro Met Leu
325 330 335
Asn Leu Pro Gln Ala Ile Pro Met Ala Asp Asp Lys Glu Ile Trp Val
340 345 350
Ser Phe Pro Arg Thr Arg Ile Ser Asp Cys Gly Cys Ser Glu Thr Met
355 360 365
Tyr Arg Pro Met Leu Met Leu Phe Val Arg Glu Gly Thr Asn Phe Phe
370 375 380
Ala Glu Leu Leu Ser Ser Ser Ile Asp Phe Gly Leu Glu Val Ile Pro
385 390 395 400
Tyr Thr Gly Asp Gly Leu Pro Cys Ser Ser Gly Gln Ser Val Leu Ile
405 410 415
Pro Asn Ser Ile Asp Asn Trp Glu Val Thr Gly Ser Asn Gly Glu Asp
420 425 430
Ile Leu Ser Leu Thr Phe Ser Glu Ala Asp Lys Ser Thr Ser Val Val
435 440 445
His Ile Arg Gly Leu Tyr Lys Tyr Leu Ser Glu Leu Asp Gly Tyr Gly
450 455 460
Gly Pro Glu Ala Glu Asp Glu His Asn Phe Gln Arg Ile Leu Ser Asp
465 470 475 480
Leu His Phe Asp Gly Lys Lys Thr Ile Glu Asn Phe Lys Lys Val Gln
485 490 495
Ser Cys Ala Leu Asp Ala Ala Lys Ala Tyr Cys Lys Glu Tyr Gly Val
500 505 510
Thr Arg Gly Glu Glu Asp Arg Leu Lys Asn Lys Glu Lys Glu Arg Lys
515 520 525
Ile Glu Glu Lys Arg Lys Lys Glu Glu Glu Arg Lys Lys Lys Glu Glu
530 535 540
Glu Lys Lys Lys Lys Glu Glu Glu Glu Lys Lys Lys Lys Glu Glu Glu
545 550 555 560
Glu Glu Glu Glu Lys Arg Leu Lys Glu Leu Lys Lys Lys Leu Lys Glu
565 570 575
Leu Gln Glu Glu Leu Glu Lys Gln Lys Asp Glu Val Lys Asp Thr Lys
580 585 590
Ala Lys
<210> 5
<211> 644
<212> PRT
<213> Artificial sequence
<400> 5
Met Arg Thr Arg Leu Asn Phe Leu Leu Leu Cys Ile Ala Ser Val Leu
1 5 10 15
Ser Val Ile Trp Ile Gly Val Leu Leu Thr Trp Asn Asp Asn Asn Leu
20 25 30
Gly Gly Ile Ser Leu Asn Gly Gly Lys Asp Ser Ala Tyr Asp Asp Leu
35 40 45
Leu Ser Leu Gly Ser Phe Asn Asp Met Glu Val Asp Ser Tyr Val Thr
50 55 60
Asn Ile Tyr Asp Asn Ala Pro Val Leu Gly Cys Thr Asp Leu Ser Tyr
65 70 75 80
His Gly Leu Leu Lys Val Thr Pro Lys His Asp Leu Ala Cys Asp Leu
85 90 95
Glu Phe Ile Arg Ala Gln Ile Leu Asp Ile Asp Val Tyr Ser Ala Ile
100 105 110
Lys Asp Leu Glu Asp Lys Ala Leu Thr Val Lys Gln Lys Val Glu Lys
115 120 125
His Trp Phe Thr Phe Tyr Gly Ser Ser Val Phe Leu Pro Glu His Asp
130 135 140
Val His Tyr Leu Val Arg Arg Val Ile Phe Ser Ala Glu Gly Lys Ala
145 150 155 160
Asn Ser Pro Val Thr Ser Ile Ile Val Ala Gln Ile Tyr Asp Lys Asn
165 170 175
Trp Asn Glu Leu Asn Gly His Phe Leu Asp Ile Leu Asn Pro Asn Thr
180 185 190
Gly Lys Val Gln His Asn Thr Phe Pro Gln Val Leu Pro Ile Ala Thr
195 200 205
Asn Phe Val Lys Gly Lys Lys Phe Arg Gly Ala Glu Asp Pro Arg Val
210 215 220
Val Leu Arg Lys Gly Arg Phe Gly Pro Asp Pro Leu Val Met Phe Asn
225 230 235 240
Ser Leu Thr Gln Asp Asn Lys Arg Arg Arg Ile Phe Thr Ile Ser Pro
245 250 255
Phe Asp Gln Phe Lys Thr Val Met Tyr Asp Ile Lys Asp Tyr Glu Met
260 265 270
Pro Arg Tyr Glu Lys Asn Trp Val Pro Phe Phe Leu Lys Asp Asn Gln
275 280 285
Glu Ala Val His Phe Val Tyr Ser Phe Asn Pro Leu Arg Val Leu Lys
290 295 300
Cys Ser Leu Asp Asp Gly Ser Cys Asp Ile Val Phe Glu Ile Pro Lys
305 310 315 320
Val Asp Ser Met Ser Ser Glu Leu Arg Gly Ala Thr Pro Met Ile Asn
325 330 335
Leu Pro Gln Ala Ile Pro Met Ala Lys Asp Lys Glu Ile Trp Val Ser
340 345 350
Phe Pro Arg Thr Arg Ile Ala Asn Cys Gly Cys Ser Arg Thr Thr Tyr
355 360 365
Arg Pro Met Leu Met Leu Phe Val Arg Glu Gly Ser Asn Phe Phe Val
370 375 380
Glu Leu Leu Ser Thr Ser Leu Asp Phe Gly Leu Glu Val Leu Pro Tyr
385 390 395 400
Ser Gly Asn Gly Leu Pro Cys Ser Ala Asp His Ser Val Leu Ile Pro
405 410 415
Asn Ser Ile Asp Asn Trp Glu Val Val Asp Ser Asn Gly Asp Asp Ile
420 425 430
Leu Thr Leu Ser Phe Ser Glu Ala Asp Lys Ser Thr Ser Val Ile His
435 440 445
Ile Arg Gly Leu Tyr Asn Tyr Leu Ser Glu Leu Asp Gly Tyr Gln Gly
450 455 460
Pro Glu Ala Glu Asp Glu His Asn Phe Gln Arg Ile Leu Ser Asp Leu
465 470 475 480
His Phe Asp Asn Lys Thr Thr Val Asn Asn Phe Ile Lys Val Gln Ser
485 490 495
Cys Ala Leu Asp Ala Ala Lys Gly Tyr Cys Lys Glu Tyr Gly Leu Thr
500 505 510
Arg Gly Glu Ala Glu Arg Arg Arg Arg Val Ala Glu Glu Arg Lys Lys
515 520 525
Lys Glu Lys Glu Glu Glu Glu Lys Lys Lys Lys Lys Glu Lys Glu Glu
530 535 540
Glu Glu Lys Lys Arg Ile Glu Glu Glu Lys Lys Lys Ile Glu Glu Lys
545 550 555 560
Glu Arg Lys Glu Lys Glu Lys Glu Glu Ala Glu Arg Lys Lys Leu Gln
565 570 575
Glu Met Lys Lys Lys Leu Glu Glu Ile Thr Glu Lys Leu Glu Lys Gly
580 585 590
Gln Arg Asn Lys Glu Ile Asp Pro Lys Glu Lys Gln Arg Glu Glu Glu
595 600 605
Glu Arg Lys Glu Arg Val Arg Lys Ile Ala Glu Lys Gln Arg Lys Glu
610 615 620
Ala Glu Lys Lys Glu Ala Glu Lys Lys Ala Asn Asp Lys Lys Asp Leu
625 630 635 640
Lys Ile Arg Gln
<210> 6
<211> 488
<212> PRT
<213> Artificial sequence
<400> 6
Met Tyr His Leu Ala Pro Arg Lys Lys Leu Leu Ile Trp Gly Gly Ser
1 5 10 15
Leu Gly Phe Val Leu Leu Leu Leu Ile Val Ala Ser Ser His Gln Arg
20 25 30
Ile Arg Ser Thr Ile Leu His Arg Thr Pro Ile Ser Thr Leu Pro Val
35 40 45
Ile Ser Gln Glu Val Ile Thr Ala Asp Tyr His Pro Thr Leu Leu Thr
50 55 60
Gly Phe Ile Pro Thr Asp Ser Asp Asp Ser Asp Cys Ala Asp Phe Ser
65 70 75 80
Pro Ser Gly Val Ile Tyr Ser Thr Asp Lys Leu Val Leu His Asp Ser
85 90 95
Leu Lys Asp Ile Arg Asp Ser Leu Leu Lys Thr Gln Tyr Lys Asp Leu
100 105 110
Val Thr Leu Glu Asp Glu Glu Lys Met Asn Ile Asp Asp Ile Leu Lys
115 120 125
Arg Trp Tyr Thr Leu Ser Gly Ser Ser Val Trp Ile Pro Gly Met Lys
130 135 140
Ala His Leu Val Val Ser Arg Val Met Tyr Leu Gly Thr Asn Gly Arg
145 150 155 160
Ser Asp Pro Leu Val Ser Phe Val Arg Val Gln Leu Phe Asp Pro Asp
165 170 175
Phe Asn Glu Leu Lys Asp Ile Ala Leu Lys Phe Ser Asp Lys Pro Asp
180 185 190
Gly Thr Val Ile Phe Pro Tyr Ile Leu Pro Val Asp Ile Pro Arg Glu
195 200 205
Gly Ser Arg Trp Leu Gly Pro Glu Asp Ala Lys Ile Ala Val Asn Pro
210 215 220
Glu Thr Pro Asp Asp Pro Ile Val Ile Phe Asn Met Gln Asn Ser Val
225 230 235 240
Asn Arg Ala Met Tyr Gly Phe Tyr Pro Phe Arg Pro Glu Asn Lys Gln
245 250 255
Val Leu Phe Ser Ile Lys Asp Glu Glu Pro Arg Lys Lys Glu Lys Asn
260 265 270
Trp Thr Pro Phe Phe Val Pro Gly Ser Pro Thr Thr Val Asn Phe Val
275 280 285
Tyr Asp Leu Gln Lys Leu Thr Ile Leu Lys Cys Ser Ile Ile Thr Gly
290 295 300
Ile Cys Glu Lys Glu Phe Val Ser Gly Asp Asp Gly Gln Asn His Gly
305 310 315 320
Ile Gly Ile Phe Arg Gly Gly Ser Asn Leu Val Pro Phe Pro Thr Ser
325 330 335
Phe Thr Asp Lys Asp Val Trp Val Gly Phe Pro Lys Thr His Met Glu
340 345 350
Ser Cys Gly Cys Ser Ser His Ile Tyr Arg Pro Tyr Leu Met Val Leu
355 360 365
Val Arg Lys Gly Asp Phe Tyr Tyr Lys Ala Phe Val Ser Thr Pro Leu
370 375 380
Asp Phe Gly Ile Asp Val Arg Ser Trp Glu Ser Ala Glu Ser Thr Ser
385 390 395 400
Cys Gln Thr Ala Lys Asn Val Leu Ala Val Asn Ser Ile Ser Asn Trp
405 410 415
Asp Leu Leu Asp Asp Gly Leu Asp Lys Asp Tyr Met Thr Ile Thr Leu
420 425 430
Ser Glu Ala Asp Val Val Asn Ser Val Leu Arg Val Arg Gly Ile Ala
435 440 445
Lys Phe Val Asp Asn Leu Thr Met Asp Asp Gly Ser Thr Thr Leu Ser
450 455 460
Thr Ser Asn Lys Ile Asp Glu Cys Ala Thr Thr Gly Ser Lys Gln Tyr
465 470 475 480
Cys Gln Arg Tyr Gly Glu Leu His
485
<210> 7
<211> 652
<212> PRT
<213> Artificial sequence
<400> 7
Met Val Asp Leu Phe Gln Trp Leu Lys Phe Tyr Ser Met Arg Arg Leu
1 5 10 15
Gly Gln Val Ala Ile Thr Leu Val Leu Leu Asn Leu Phe Val Phe Leu
20 25 30
Gly Tyr Lys Phe Thr Pro Ser Thr Val Ile Gly Ser Pro Ser Trp Glu
35 40 45
Pro Ala Val Val Pro Thr Val Phe Asn Glu Ser Tyr Leu Asp Ser Leu
50 55 60
Gln Phe Thr Asp Ile Asn Val Asp Ser Phe Leu Ser Asp Thr Asn Gly
65 70 75 80
Arg Ile Ser Val Thr Cys Asp Ser Leu Ala Tyr Lys Gly Leu Val Lys
85 90 95
Thr Ser Lys Lys Lys Glu Leu Asp Cys Asp Met Ala Tyr Ile Arg Arg
100 105 110
Lys Ile Phe Ser Ser Glu Glu Tyr Gly Val Leu Ala Asp Leu Glu Ala
115 120 125
Gln Asp Ile Thr Glu Glu Gln Arg Ile Lys Lys His Trp Phe Thr Phe
130 135 140
Tyr Gly Ser Ser Val Tyr Leu Pro Glu His Glu Val His Tyr Leu Val
145 150 155 160
Arg Arg Val Leu Phe Ser Lys Val Gly Arg Ala Asp Thr Pro Val Ile
165 170 175
Ser Leu Leu Val Ala Gln Leu Tyr Asp Lys Asp Trp Asn Glu Leu Thr
180 185 190
Pro His Thr Leu Glu Ile Val Asn Pro Ala Thr Gly Asn Val Thr Pro
195 200 205
Gln Thr Phe Pro Gln Leu Ile His Val Pro Ile Glu Trp Ser Val Asp
210 215 220
Asp Lys Trp Lys Gly Thr Glu Asp Pro Arg Val Phe Leu Lys Pro Ser
225 230 235 240
Lys Thr Gly Val Ser Glu Pro Ile Val Leu Phe Asn Leu Gln Ser Ser
245 250 255
Leu Cys Asp Gly Lys Arg Gly Met Phe Val Thr Ser Pro Phe Arg Ser
260 265 270
Asp Lys Val Asn Leu Leu Asp Ile Glu Asp Lys Glu Arg Pro Asn Ser
275 280 285
Glu Lys Asn Trp Ser Pro Phe Phe Leu Asp Asp Val Glu Val Ser Lys
290 295 300
Tyr Ser Thr Gly Tyr Val His Phe Val Tyr Ser Phe Asn Pro Leu Lys
305 310 315 320
Val Ile Lys Cys Ser Leu Asp Thr Gly Ala Cys Arg Met Ile Tyr Glu
325 330 335
Ser Pro Glu Glu Gly Arg Phe Gly Ser Glu Leu Arg Gly Ala Thr Pro
340 345 350
Met Val Lys Leu Pro Val His Leu Ser Leu Pro Lys Gly Lys Glu Val
355 360 365
Trp Val Ala Phe Pro Arg Thr Arg Leu Arg Asp Cys Gly Cys Ser Arg
370 375 380
Thr Thr Tyr Arg Pro Val Leu Thr Leu Phe Val Lys Glu Gly Asn Lys
385 390 395 400
Phe Tyr Thr Glu Leu Ile Ser Ser Ser Ile Asp Phe His Ile Asp Val
405 410 415
Leu Ser Tyr Asp Ala Lys Gly Glu Ser Cys Ser Gly Ser Ile Ser Val
420 425 430
Leu Ile Pro Asn Gly Ile Asp Ser Trp Asp Val Ser Lys Lys Gln Gly
435 440 445
Gly Lys Ser Asp Ile Leu Thr Leu Thr Leu Ser Glu Ala Asp Arg Asn
450 455 460
Thr Val Val Val His Val Lys Gly Leu Leu Asp Tyr Leu Leu Val Leu
465 470 475 480
Asn Gly Glu Gly Pro Ile His Asp Ser His Ser Phe Lys Asn Val Leu
485 490 495
Ser Thr Asn His Phe Lys Ser Asp Thr Thr Leu Leu Asn Ser Val Lys
500 505 510
Ala Ala Glu Cys Ala Ile Phe Ser Ser Arg Asp Tyr Cys Lys Lys Tyr
515 520 525
Gly Glu Thr Arg Gly Glu Pro Ala Arg Tyr Ala Lys Gln Met Glu Asn
530 535 540
Glu Arg Lys Glu Lys Glu Lys Lys Glu Lys Glu Ala Lys Glu Lys Leu
545 550 555 560
Glu Ala Glu Lys Ala Glu Met Glu Glu Ala Val Arg Lys Ala Gln Glu
565 570 575
Ala Ile Ala Gln Lys Glu Arg Glu Lys Glu Glu Ala Glu Gln Glu Lys
580 585 590
Lys Ala Gln Gln Glu Ala Lys Glu Lys Glu Ala Glu Glu Lys Ala Ala
595 600 605
Lys Glu Lys Glu Ala Lys Glu Asn Glu Ala Lys Lys Lys Ile Ile Val
610 615 620
Glu Lys Leu Ala Lys Glu Gln Glu Glu Ala Glu Lys Leu Glu Ala Lys
625 630 635 640
Lys Lys Leu Tyr Gln Leu Gln Glu Glu Glu Arg Ser
645 650
<210> 8
<211> 789
<212> PRT
<213> Artificial sequence
<400> 8
Met Cys Gln Ile Phe Leu Pro Gln Asn Val Thr Arg Cys Ser Val Ser
1 5 10 15
Leu Leu Thr Met Ser Lys Thr Ser Pro Gln Glu Val Pro Glu Asn Thr
20 25 30
Thr Glu Leu Lys Ile Ser Lys Gly Glu Leu Arg Pro Phe Ile Val Thr
35 40 45
Ser Pro Ser Pro Gln Leu Ser Lys Ser Arg Ser Val Thr Ser Thr Lys
50 55 60
Glu Lys Leu Ile Leu Ala Ser Leu Phe Ile Phe Ala Met Val Ile Arg
65 70 75 80
Phe His Asn Val Ala His Pro Asp Ser Val Val Phe Asp Glu Val His
85 90 95
Phe Gly Gly Phe Ala Arg Lys Tyr Ile Leu Gly Thr Phe Phe Met Asp
100 105 110
Val His Pro Pro Leu Ala Lys Leu Leu Phe Ala Gly Val Gly Ser Leu
115 120 125
Gly Gly Tyr Asp Gly Glu Phe Glu Phe Lys Lys Ile Gly Asp Glu Phe
130 135 140
Pro Glu Asn Val Pro Tyr Val Leu Met Arg Tyr Leu Pro Ser Gly Met
145 150 155 160
Gly Val Gly Thr Cys Ile Met Leu Tyr Leu Thr Leu Arg Ala Ser Gly
165 170 175
Cys Gln Pro Ile Val Cys Cys Ser Asp Asn Arg Ser Leu Ile Ile Glu
180 185 190
Asn Ala Asn Val Thr Ile Ser Arg Phe Ile Leu Leu Asp Ser Pro Met
195 200 205
Leu Phe Phe Ile Ala Ser Thr Val Tyr Ser Phe Lys Lys Phe Gln Ile
210 215 220
Gln Glu Pro Phe Thr Phe Gln Trp Tyr Lys Thr Leu Ile Ala Thr Gly
225 230 235 240
Val Ser Leu Gly Leu Ala Ala Ser Ser Lys Trp Val Gly Leu Phe Thr
245 250 255
Val Ala Trp Ile Gly Leu Ile Thr Ile Trp Asp Leu Trp Phe Ile Ile
260 265 270
Gly Asp Leu Thr Val Ser Val Lys Lys Ile Phe Gly His Phe Ile Thr
275 280 285
Arg Ala Val Ala Phe Leu Val Val Pro Thr Leu Ile Tyr Leu Thr Phe
290 295 300
Phe Ala Ile His Leu Gln Val Leu Thr Lys Glu Gly Asp Gly Gly Ala
305 310 315 320
Phe Met Ser Ser Val Phe Arg Ser Thr Leu Glu Gly Asn Ala Val Pro
325 330 335
Lys Gln Ser Leu Ala Asn Val Gly Leu Gly Ser Leu Val Thr Ile Arg
340 345 350
His Leu Asn Thr Arg Gly Gly Tyr Leu His Ser His Asn His Leu Tyr
355 360 365
Glu Gly Gly Ser Gly Gln Gln Gln Val Thr Leu Tyr Pro His Ile Asp
370 375 380
Ser Asn Asn Gln Trp Ile Val Gln Asp Tyr Asn Ala Thr Glu Glu Pro
385 390 395 400
Thr Glu Phe Val Pro Leu Lys Asp Gly Val Lys Ile Arg Leu Asn His
405 410 415
Lys Leu Thr Ser Arg Arg Leu His Ser His Asn Leu Arg Pro Pro Val
420 425 430
Thr Glu Gln Asp Trp Gln Asn Glu Val Ser Ala Tyr Gly His Glu Gly
435 440 445
Phe Gly Gly Asp Ala Asn Asp Asp Phe Val Val Glu Ile Ala Lys Asp
450 455 460
Leu Ser Thr Thr Glu Glu Ala Lys Glu Asn Val Arg Ala Ile Gln Thr
465 470 475 480
Val Phe Arg Leu Arg His Ala Met Thr Gly Cys Tyr Leu Phe Ser His
485 490 495
Glu Val Lys Leu Pro Lys Trp Ala Tyr Glu Gln Gln Glu Val Thr Cys
500 505 510
Ala Thr Gln Gly Ile Lys Pro Leu Ser Tyr Trp Tyr Val Glu Thr Asn
515 520 525
Glu Asn Pro Phe Leu Asp Lys Glu Val Asp Glu Ile Val Ser Tyr Pro
530 535 540
Val Pro Thr Phe Phe Gln Lys Val Ala Glu Leu His Ala Arg Met Trp
545 550 555 560
Lys Ile Asn Lys Gly Leu Thr Asp His His Val Tyr Glu Ser Ser Pro
565 570 575
Asp Ser Trp Pro Phe Leu Leu Arg Gly Ile Ser Tyr Trp Ser Lys Asn
580 585 590
His Ser Gln Ile Tyr Phe Ile Gly Asn Ala Val Thr Trp Trp Thr Val
595 600 605
Thr Ala Ser Ile Ala Leu Phe Ser Val Phe Leu Val Phe Ser Ile Leu
610 615 620
Arg Trp Gln Arg Gly Phe Gly Phe Ser Val Asp Pro Thr Val Phe Asn
625 630 635 640
Phe Asn Val Gln Met Leu His Tyr Ile Leu Gly Trp Val Leu His Tyr
645 650 655
Leu Pro Ser Phe Leu Met Ala Arg Gln Leu Phe Leu His His Tyr Leu
660 665 670
Pro Ser Leu Tyr Phe Gly Ile Leu Ala Leu Gly His Val Phe Glu Ile
675 680 685
Ile His Ser Tyr Val Phe Lys Asn Lys Gln Val Val Ser Tyr Ser Ile
690 695 700
Phe Val Leu Phe Phe Ala Val Ala Leu Ser Phe Phe Gln Arg Tyr Ser
705 710 715 720
Pro Leu Ile Tyr Ala Gly Arg Trp Thr Lys Asp Gln Cys Asn Glu Ser
725 730 735
Lys Ile Leu Lys Trp Asp Phe Asp Cys Asn Thr Phe Pro Ser His Thr
740 745 750
Ser Gln Tyr Glu Ile Trp Ala Ser Pro Val Gln Thr Ser Thr Pro Lys
755 760 765
Glu Gly Thr His Ser Glu Ser Thr Val Gly Glu Pro Asp Val Glu Lys
770 775 780
Leu Gly Glu Thr Val
785
<210> 9
<211> 512
<212> PRT
<213> Artificial sequence
<400> 9
Glu Ala Glu Ala Tyr Pro Lys Pro Gly Ala Thr Lys Arg Gly Ser Pro
1 5 10 15
Asn Pro Thr Arg Ala Ala Ala Val Lys Ala Ala Phe Gln Thr Ser Trp
20 25 30
Asn Ala Tyr His His Phe Ala Phe Pro His Asp Asp Leu His Pro Val
35 40 45
Ser Asn Ser Phe Asp Asp Glu Arg Asn Gly Trp Gly Ser Ser Ala Ile
50 55 60
Asp Gly Leu Asp Thr Ala Ile Leu Met Gly Asp Ala Asp Ile Val Asn
65 70 75 80
Thr Ile Leu Gln Tyr Val Pro Gln Ile Asn Phe Thr Thr Thr Ala Val
85 90 95
Ala Asn Gln Gly Ile Ser Val Phe Glu Thr Asn Ile Arg Tyr Leu Gly
100 105 110
Gly Leu Leu Ser Ala Tyr Asp Leu Leu Arg Gly Pro Phe Ser Ser Leu
115 120 125
Ala Thr Asn Gln Thr Leu Val Asn Ser Leu Leu Arg Gln Ala Gln Thr
130 135 140
Leu Ala Asn Gly Leu Lys Val Ala Phe Thr Thr Pro Ser Gly Val Pro
145 150 155 160
Asp Pro Thr Val Phe Phe Asn Pro Thr Val Arg Arg Ser Gly Ala Ser
165 170 175
Ser Asn Asn Val Ala Glu Ile Gly Ser Leu Val Leu Glu Trp Thr Arg
180 185 190
Leu Ser Asp Leu Thr Gly Asn Pro Gln Tyr Ala Gln Leu Ala Gln Lys
195 200 205
Gly Glu Ser Tyr Leu Leu Asn Pro Lys Gly Ser Pro Glu Ala Trp Pro
210 215 220
Gly Leu Ile Gly Thr Phe Val Ser Thr Ser Asn Gly Thr Phe Gln Asp
225 230 235 240
Ser Ser Gly Ser Trp Ser Gly Leu Met Asp Ser Phe Tyr Glu Tyr Leu
245 250 255
Ile Lys Met Tyr Leu Tyr Asp Pro Val Ala Phe Ala His Tyr Lys Asp
260 265 270
Arg Trp Val Leu Ala Ala Asp Ser Thr Ile Ala His Leu Ala Ser His
275 280 285
Pro Ser Thr Arg Lys Asp Leu Thr Phe Leu Ser Ser Tyr Asn Gly Gln
290 295 300
Ser Thr Ser Pro Asn Ser Gly His Leu Ala Ser Phe Ala Gly Gly Asn
305 310 315 320
Phe Ile Leu Gly Gly Ile Leu Leu Asn Glu Gln Lys Tyr Ile Asp Phe
325 330 335
Gly Ile Lys Leu Ala Ser Ser Tyr Phe Ala Thr Tyr Asn Gln Thr Ala
340 345 350
Ser Gly Ile Gly Pro Glu Gly Phe Ala Trp Val Asp Ser Val Thr Gly
355 360 365
Ala Gly Gly Ser Pro Pro Ser Ser Gln Ser Gly Phe Tyr Ser Ser Ala
370 375 380
Gly Phe Trp Val Thr Ala Pro Tyr Tyr Ile Leu Arg Pro Glu Thr Leu
385 390 395 400
Glu Ser Leu Tyr Tyr Ala Tyr Arg Val Thr Gly Asp Ser Lys Trp Gln
405 410 415
Asp Leu Ala Trp Glu Ala Phe Ser Ala Ile Glu Asp Ala Cys Arg Ala
420 425 430
Gly Ser Ala Tyr Ser Ser Ile Asn Asp Val Thr Gln Ala Asn Gly Gly
435 440 445
Gly Ala Ser Asp Asp Met Glu Ser Phe Trp Phe Ala Glu Ala Leu Lys
450 455 460
Tyr Ala Tyr Leu Ile Phe Ala Glu Glu Ser Asp Val Gln Val Gln Ala
465 470 475 480
Asn Gly Gly Asn Lys Phe Val Phe Asn Thr Glu Ala His Pro Phe Ser
485 490 495
Ile Arg Ser Ser Ser Arg Arg Gly Gly His Leu Ala His Asp Glu Leu
500 505 510
<210> 10
<211> 445
<212> PRT
<213> Artificial sequence
<400> 10
Met Ser Leu Ser Leu Val Ser Tyr Arg Leu Arg Lys Asn Pro Trp Val
1 5 10 15
Asn Ile Phe Leu Pro Val Leu Ala Ile Phe Leu Ile Tyr Ile Ile Phe
20 25 30
Phe Gln Arg Asp Gln Ser Ser Val Ser Ala Leu Asp Gly Asp Pro Ala
35 40 45
Ser Leu Thr Arg Glu Val Ile Arg Leu Ala Gln Asp Ala Glu Val Glu
50 55 60
Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln Ile Gly Asp Ala Leu Ser
65 70 75 80
Ser Gln Arg Gly Arg Val Pro Thr Ala Ala Pro Pro Ala Gln Pro Arg
85 90 95
Val Pro Val Thr Pro Ala Pro Ala Val Ile Pro Ile Leu Val Ile Ala
100 105 110
Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu His Tyr
115 120 125
Arg Pro Ser Ala Glu Leu Phe Pro Ile Ile Val Ser Gln Asp Cys Gly
130 135 140
His Glu Glu Thr Ala Gln Ala Ile Ala Ser Tyr Gly Ser Ala Val Thr
145 150 155 160
His Ile Arg Gln Pro Asp Leu Ser Ser Ile Ala Val Pro Pro Asp His
165 170 175
Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg Trp Ala
180 185 190
Leu Gly Gln Val Phe Arg Gln Phe Arg Phe Pro Ala Ala Val Val Val
195 200 205
Glu Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe Arg Ala
210 215 220
Thr Tyr Pro Leu Leu Lys Ala Asp Pro Ser Leu Trp Cys Val Ser Ala
225 230 235 240
Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ala Ser Arg Pro Glu
245 250 255
Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu Leu Leu
260 265 270
Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala Phe Trp
275 280 285
Asp Asp Trp Met Arg Arg Pro Glu Gln Arg Gln Gly Arg Ala Cys Ile
290 295 300
Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Lys Gly Val Ser
305 310 315 320
His Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu Asn Gln
325 330 335
Gln Phe Val His Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln Arg Glu
340 345 350
Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro Gln Leu
355 360 365
Gln Val Glu Lys Val Arg Thr Asn Asp Arg Lys Glu Leu Gly Glu Val
370 375 380
Arg Val Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala Phe Ala Lys Ala
385 390 395 400
Leu Gly Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala Gly Tyr
405 410 415
Arg Gly Ile Val Thr Phe Gln Phe Arg Gly Arg Arg Val His Leu Ala
420 425 430
Pro Pro Leu Thr Trp Glu Gly Tyr Asp Pro Ser Trp Asn
435 440 445
<210> 11
<211> 804
<212> PRT
<213> Artificial sequence
<400> 11
Met Ala Leu Phe Leu Ser Lys Arg Leu Leu Arg Phe Thr Val Ile Ala
1 5 10 15
Gly Ala Val Ile Val Leu Leu Leu Thr Leu Asn Ser Asn Ser Arg Thr
20 25 30
Gln Gln Tyr Ile Pro Ser Ser Ile Ser Ala Ala Phe Asp Phe Thr Ser
35 40 45
Gly Ser Ile Ser Pro Glu Gln Gln Val Ile Ser Glu Glu Asn Asp Ala
50 55 60
Lys Lys Leu Glu Gln Ser Ala Leu Asn Ser Glu Ala Ser Glu Asp Ser
65 70 75 80
Glu Ala Met Asp Glu Glu Ser Lys Ala Leu Lys Ala Ala Ala Glu Lys
85 90 95
Ala Asp Ala Pro Ile Gly Gly Gly Pro Ala Gly Met Arg Val Leu Val
100 105 110
Thr Gly Gly Ser Gly Tyr Ile Gly Ser His Thr Cys Val Gln Leu Leu
115 120 125
Gln Asn Gly His Asp Val Ile Ile Leu Asp Asn Leu Cys Asn Ser Lys
130 135 140
Arg Ser Val Leu Pro Val Ile Glu Arg Leu Gly Gly Lys His Pro Thr
145 150 155 160
Phe Val Glu Gly Asp Ile Arg Asn Glu Ala Leu Met Thr Glu Ile Leu
165 170 175
His Asp His Ala Ile Asp Thr Val Ile His Phe Ala Gly Leu Lys Ala
180 185 190
Val Gly Glu Ser Val Gln Lys Pro Leu Glu Tyr Tyr Asp Asn Asn Val
195 200 205
Asn Gly Thr Leu Arg Leu Ile Ser Ala Met Arg Ala Ala Asn Val Lys
210 215 220
Asn Phe Ile Phe Ser Ser Ser Ala Thr Val Tyr Gly Asp Gln Pro Lys
225 230 235 240
Ile Pro Tyr Val Glu Ser Phe Pro Thr Gly Thr Pro Gln Ser Pro Tyr
245 250 255
Gly Lys Ser Lys Leu Met Val Glu Gln Ile Leu Thr Asp Leu Gln Lys
260 265 270
Ala Gln Pro Asp Trp Ser Ile Ala Leu Leu Arg Tyr Phe Asn Pro Val
275 280 285
Gly Ala His Pro Ser Gly Asp Met Gly Glu Asp Pro Gln Gly Ile Pro
290 295 300
Asn Asn Leu Met Pro Tyr Ile Ala Gln Val Ala Val Gly Arg Arg Asp
305 310 315 320
Ser Leu Ala Ile Phe Gly Asn Asp Tyr Pro Thr Glu Asp Gly Thr Gly
325 330 335
Val Arg Asp Tyr Ile His Val Met Asp Leu Ala Asp Gly His Val Val
340 345 350
Ala Met Glu Lys Leu Ala Asn Lys Pro Gly Val His Ile Tyr Asn Leu
355 360 365
Gly Ala Gly Val Gly Asn Ser Val Leu Asp Val Val Asn Ala Phe Ser
370 375 380
Lys Ala Cys Gly Lys Pro Val Asn Tyr His Phe Ala Pro Arg Arg Glu
385 390 395 400
Gly Asp Leu Pro Ala Tyr Trp Ala Asp Ala Ser Lys Ala Asp Arg Glu
405 410 415
Leu Asn Trp Arg Val Thr Arg Thr Leu Asp Glu Met Ala Gln Asp Thr
420 425 430
Trp His Trp Gln Ser Arg His Pro Gln Gly Tyr Pro Asp Gly Thr Gly
435 440 445
Gly Gly Arg Asp Leu Ser Arg Leu Pro Gln Leu Val Gly Val Ser Thr
450 455 460
Pro Leu Gln Gly Gly Ser Asn Ser Ala Ala Ala Ile Gly Gln Ser Ser
465 470 475 480
Gly Glu Leu Arg Thr Gly Gly Ala Arg Pro Pro Pro Pro Leu Gly Ala
485 490 495
Ser Ser Gln Pro Arg Pro Gly Gly Asp Ser Ser Pro Val Val Asp Ser
500 505 510
Gly Pro Gly Pro Ala Ser Asn Leu Thr Ser Val Pro Val Pro His Thr
515 520 525
Thr Ala Leu Ser Leu Pro Ala Cys Pro Glu Glu Ser Pro Leu Leu Val
530 535 540
Gly Pro Met Leu Ile Glu Phe Asn Met Pro Val Asp Leu Glu Leu Val
545 550 555 560
Ala Lys Gln Asn Pro Asn Val Lys Met Gly Gly Arg Tyr Ala Pro Arg
565 570 575
Asp Cys Val Ser Pro His Lys Val Ala Ile Ile Ile Pro Phe Arg Asn
580 585 590
Arg Gln Glu His Leu Lys Tyr Trp Leu Tyr Tyr Leu His Pro Val Leu
595 600 605
Gln Arg Gln Gln Leu Asp Tyr Gly Ile Tyr Val Ile Asn Gln Ala Gly
610 615 620
Asp Thr Ile Phe Asn Arg Ala Lys Leu Leu Asn Val Gly Phe Gln Glu
625 630 635 640
Ala Leu Lys Asp Tyr Asp Tyr Thr Cys Phe Val Phe Ser Asp Val Asp
645 650 655
Leu Ile Pro Met Asn Asp His Asn Ala Tyr Arg Cys Phe Ser Gln Pro
660 665 670
Arg His Ile Ser Val Ala Met Asp Lys Phe Gly Phe Ser Leu Pro Tyr
675 680 685
Val Gln Tyr Phe Gly Gly Val Ser Ala Leu Ser Lys Gln Gln Phe Leu
690 695 700
Thr Ile Asn Gly Phe Pro Asn Asn Tyr Trp Gly Trp Gly Gly Glu Asp
705 710 715 720
Asp Asp Ile Phe Asn Arg Leu Val Phe Arg Gly Met Ser Ile Ser Arg
725 730 735
Pro Asn Ala Val Val Gly Arg Cys Arg Met Ile Arg His Ser Arg Asp
740 745 750
Lys Lys Asn Glu Pro Asn Pro Gln Arg Phe Asp Arg Ile Ala His Thr
755 760 765
Lys Glu Thr Met Leu Ser Asp Gly Leu Asn Ser Leu Thr Tyr Gln Val
770 775 780
Leu Asp Val Gln Arg Tyr Pro Leu Tyr Thr Gln Ile Thr Val Asp Ile
785 790 795 800
Gly Thr Pro Ser
<210> 12
<211> 1101
<212> PRT
<213> Artificial sequence
<400> 12
Met Leu Leu Thr Lys Arg Phe Ser Lys Leu Phe Lys Leu Thr Phe Ile
1 5 10 15
Val Leu Ile Leu Cys Gly Leu Phe Val Ile Thr Asn Lys Tyr Met Asp
20 25 30
Glu Asn Thr Ser Pro Ala Gly Val Glu Asp Gly Pro Lys Ser Ser Gln
35 40 45
Ser Asn Phe Ser Gln Gly Ala Gly Ser His Leu Leu Pro Ser Gln Leu
50 55 60
Ser Leu Ser Val Asp Thr Ala Asp Cys Leu Phe Ala Ser Gln Ser Gly
65 70 75 80
Ser His Asn Ser Asp Val Gln Met Leu Asp Val Tyr Ser Leu Ile Ser
85 90 95
Phe Asp Asn Pro Asp Gly Gly Val Trp Lys Gln Gly Phe Asp Ile Thr
100 105 110
Tyr Glu Ser Asn Glu Trp Asp Thr Glu Pro Leu Gln Val Phe Val Val
115 120 125
Pro His Ser His Asn Asp Pro Gly Trp Leu Lys Thr Phe Asn Asp Tyr
130 135 140
Phe Arg Asp Lys Thr Gln Tyr Ile Phe Asn Asn Met Val Leu Lys Leu
145 150 155 160
Lys Glu Asp Ser Arg Arg Lys Phe Ile Trp Ser Glu Ile Ser Tyr Leu
165 170 175
Ser Lys Trp Trp Asp Ile Ile Asp Ile Gln Lys Lys Asp Ala Val Lys
180 185 190
Ser Leu Ile Glu Asn Gly Gln Leu Glu Ile Val Thr Gly Gly Trp Val
195 200 205
Met Pro Asp Glu Ala Thr Pro His Tyr Phe Ala Leu Ile Asp Gln Leu
210 215 220
Ile Glu Gly His Gln Trp Leu Glu Asn Asn Ile Gly Val Lys Pro Arg
225 230 235 240
Ser Gly Trp Ala Ile Asp Pro Phe Gly His Ser Pro Thr Met Ala Tyr
245 250 255
Leu Leu Asn Arg Ala Gly Leu Ser His Met Leu Ile Gln Arg Val His
260 265 270
Tyr Ala Val Lys Lys His Phe Ala Leu His Lys Thr Leu Glu Phe Phe
275 280 285
Trp Arg Gln Asn Trp Asp Leu Gly Ser Val Thr Asp Ile Leu Cys His
290 295 300
Met Met Pro Phe Tyr Ser Tyr Asp Ile Pro His Thr Cys Gly Pro Asp
305 310 315 320
Pro Lys Ile Cys Cys Gln Phe Asp Phe Lys Arg Leu Pro Gly Gly Arg
325 330 335
Phe Gly Cys Pro Trp Gly Val Pro Pro Glu Thr Ile His Pro Gly Asn
340 345 350
Val Gln Ser Arg Ala Arg Met Leu Leu Asp Gln Tyr Arg Lys Lys Ser
355 360 365
Lys Leu Phe Arg Thr Lys Val Leu Leu Ala Pro Leu Gly Asp Asp Phe
370 375 380
Arg Tyr Cys Glu Tyr Thr Glu Trp Asp Leu Gln Phe Lys Asn Tyr Gln
385 390 395 400
Gln Leu Phe Asp Tyr Met Asn Ser Gln Ser Lys Phe Lys Val Lys Ile
405 410 415
Gln Phe Gly Thr Leu Ser Asp Phe Phe Asp Ala Leu Asp Lys Ala Asp
420 425 430
Glu Thr Gln Arg Asp Lys Gly Gln Ser Met Phe Pro Val Leu Ser Gly
435 440 445
Asp Phe Phe Thr Tyr Ala Asp Arg Asp Asp His Tyr Trp Ser Gly Tyr
450 455 460
Phe Thr Ser Arg Pro Phe Tyr Lys Arg Met Asp Arg Ile Met Glu Ser
465 470 475 480
His Leu Arg Ala Ala Glu Ile Leu Tyr Tyr Phe Ala Leu Arg Gln Ala
485 490 495
His Lys Tyr Lys Ile Asn Lys Phe Leu Ser Ser Ser Leu Tyr Thr Ala
500 505 510
Leu Thr Glu Ala Arg Arg Asn Leu Gly Leu Phe Gln His His Asp Ala
515 520 525
Ile Thr Gly Thr Ala Lys Asp Trp Val Val Val Asp Tyr Gly Thr Arg
530 535 540
Leu Phe His Ser Leu Met Val Leu Glu Lys Ile Ile Gly Asn Ser Ala
545 550 555 560
Phe Leu Leu Ile Gly Lys Asp Lys Leu Thr Tyr Asp Ser Tyr Ser Pro
565 570 575
Asp Thr Phe Leu Glu Met Asp Leu Lys Gln Lys Ser Gln Asp Ser Leu
580 585 590
Pro Gln Lys Asn Ile Ile Arg Leu Ser Ala Glu Pro Arg Tyr Leu Val
595 600 605
Val Tyr Asn Pro Leu Glu Gln Asp Arg Ile Ser Leu Val Ser Val Tyr
610 615 620
Val Ser Ser Pro Thr Val Gln Val Phe Ser Ala Ser Gly Lys Pro Val
625 630 635 640
Glu Val Gln Val Ser Ala Val Trp Asp Thr Ala Asn Thr Ile Ser Glu
645 650 655
Thr Ala Tyr Glu Ile Ser Phe Arg Ala His Ile Pro Pro Leu Gly Leu
660 665 670
Lys Val Tyr Lys Ile Leu Glu Ser Ala Ser Ser Asn Ser His Leu Ala
675 680 685
Asp Tyr Val Leu Tyr Lys Asn Lys Val Glu Asp Ser Gly Ile Phe Thr
690 695 700
Ile Lys Asn Met Ile Asn Thr Glu Glu Gly Ile Thr Leu Glu Asn Ser
705 710 715 720
Phe Val Leu Leu Arg Phe Asp Gln Thr Gly Leu Met Lys Gln Met Met
725 730 735
Thr Lys Glu Asp Gly Lys His His Glu Val Asn Val Gln Phe Ser Trp
740 745 750
Tyr Gly Thr Thr Ile Lys Arg Asp Lys Ser Gly Ala Tyr Leu Phe Leu
755 760 765
Pro Asp Gly Asn Ala Lys Pro Tyr Val Tyr Thr Thr Pro Pro Phe Val
770 775 780
Arg Val Thr His Gly Arg Ile Tyr Ser Glu Val Thr Cys Phe Phe Asp
785 790 795 800
His Val Thr His Arg Val Arg Leu Tyr His Ile Gln Gly Ile Glu Gly
805 810 815
Gln Ser Val Glu Val Ser Asn Ile Val Asp Ile Arg Lys Val Tyr Asn
820 825 830
Arg Glu Ile Ala Met Lys Ile Ser Ser Asp Ile Lys Ser Gln Asn Arg
835 840 845
Phe Tyr Thr Asp Leu Asn Gly Tyr Gln Ile Gln Pro Arg Met Thr Leu
850 855 860
Ser Lys Leu Pro Leu Gln Ala Asn Val Tyr Pro Met Thr Thr Met Ala
865 870 875 880
Tyr Ile Gln Asp Ala Lys His Arg Leu Thr Leu Leu Ser Ala Gln Ser
885 890 895
Leu Gly Val Ser Ser Leu Asn Ser Gly Gln Ile Glu Val Ile Met Asp
900 905 910
Arg Arg Leu Met Gln Asp Asp Asn Arg Gly Leu Glu Gln Gly Ile Gln
915 920 925
Asp Asn Lys Ile Thr Ala Asn Leu Phe Arg Ile Leu Leu Glu Lys Arg
930 935 940
Ser Ala Val Asn Thr Glu Glu Glu Lys Lys Ser Val Ser Tyr Pro Ser
945 950 955 960
Leu Leu Ser His Ile Thr Ser Ser Leu Met Asn His Pro Val Ile Pro
965 970 975
Met Ala Asn Lys Phe Ser Ser Pro Thr Leu Glu Leu Gln Gly Glu Phe
980 985 990
Ser Pro Leu Gln Ser Ser Leu Pro Cys Asp Ile His Leu Val Asn Leu
995 1000 1005
Arg Thr Ile Gln Ser Lys Val Gly Asn Gly His Ser Asn Glu Ala
1010 1015 1020
Ala Leu Ile Leu His Arg Lys Gly Phe Asp Cys Arg Phe Ser Ser
1025 1030 1035
Lys Gly Thr Gly Leu Phe Cys Ser Thr Thr Gln Gly Lys Ile Leu
1040 1045 1050
Val Gln Lys Leu Leu Asn Lys Phe Ile Val Glu Ser Leu Thr Pro
1055 1060 1065
Ser Ser Leu Ser Leu Met His Ser Pro Pro Gly Thr Gln Asn Ile
1070 1075 1080
Ser Glu Ile Asn Leu Ser Pro Met Glu Ile Ser Thr Phe Arg Ile
1085 1090 1095
Gln Leu Arg
1100
<210> 13
<211> 394
<212> PRT
<213> Artificial sequence
<400> 13
Met Leu Leu Thr Lys Arg Phe Ser Lys Leu Phe Lys Leu Thr Phe Ile
1 5 10 15
Val Leu Ile Leu Cys Gly Leu Phe Val Ile Thr Asn Lys Tyr Met Asp
20 25 30
Glu Asn Thr Ser Pro Ala Gly Ser Leu Val Tyr Gln Leu Asn Phe Asp
35 40 45
Gln Thr Leu Arg Asn Val Asp Lys Ala Gly Thr Trp Ala Pro Arg Glu
50 55 60
Leu Val Leu Val Val Gln Val His Asn Arg Pro Glu Tyr Leu Arg Leu
65 70 75 80
Leu Leu Asp Ser Leu Arg Lys Ala Gln Gly Ile Asp Asn Val Leu Val
85 90 95
Ile Phe Ser His Asp Phe Trp Ser Thr Glu Ile Asn Gln Leu Ile Ala
100 105 110
Gly Val Asn Phe Cys Pro Val Leu Gln Val Phe Phe Pro Phe Ser Ile
115 120 125
Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp Pro Arg Asp Cys Pro
130 135 140
Arg Asp Leu Pro Lys Asn Ala Ala Leu Lys Leu Gly Cys Ile Asn Ala
145 150 155 160
Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu Ala Lys Phe Ser Gln
165 170 175
Thr Lys His His Trp Trp Trp Lys Leu His Phe Val Trp Glu Arg Val
180 185 190
Lys Ile Leu Arg Asp Tyr Ala Gly Leu Ile Leu Phe Leu Glu Glu Asp
195 200 205
His Tyr Leu Ala Pro Asp Phe Tyr His Val Phe Lys Lys Met Trp Lys
210 215 220
Leu Lys Gln Gln Glu Cys Pro Glu Cys Asp Val Leu Ser Leu Gly Thr
225 230 235 240
Tyr Ser Ala Ser Arg Ser Phe Tyr Gly Met Ala Asp Lys Val Asp Val
245 250 255
Lys Thr Trp Lys Ser Thr Glu His Asn Met Gly Leu Ala Leu Thr Arg
260 265 270
Asn Ala Tyr Gln Lys Leu Ile Glu Cys Thr Asp Thr Phe Cys Thr Tyr
275 280 285
Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr Leu Thr Val Ser Cys
290 295 300
Leu Pro Lys Phe Trp Lys Val Leu Val Pro Gln Ile Pro Arg Ile Phe
305 310 315 320
His Ala Gly Asp Cys Gly Met His His Lys Lys Thr Cys Arg Pro Ser
325 330 335
Thr Gln Ser Ala Gln Ile Glu Ser Leu Leu Asn Asn Asn Lys Gln Tyr
340 345 350
Met Phe Pro Glu Thr Leu Thr Ile Ser Glu Lys Phe Thr Val Val Ala
355 360 365
Ile Ser Pro Pro Arg Lys Asn Gly Gly Trp Gly Asp Ile Arg Asp His
370 375 380
Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln
385 390
<210> 14
<211> 1539
<212> DNA
<213> Artificial sequence
<400> 14
gaggctgaag cttatccaaa gccgggcgcc acaaaacgtg gatctcccaa ccctacgagg 60
gcggcagcag tcaaggccgc attccagacg tcgtggaacg cttaccacca ttttgccttt 120
ccccatgacg acctccaccc ggtcagcaac agctttgatg atgagagaaa cggctggggc 180
tcgtcggcaa tcgatggctt ggacacggct atcctcatgg gggatgccga cattgtgaac 240
acgatccttc agtatgtacc gcagatcaac ttcaccacga ctgcggttgc caaccaaggc 300
atctccgtgt tcgagaccaa cattcggtac ctcggtggcc tgctttctgc ctatgacctg 360
ttgcgaggtc ctttcagctc cttggcgaca aaccagaccc tggtaaacag ccttctgagg 420
caggctcaaa cactggccaa cggcctcaag gttgcgttca ccactcccag cggtgtcccg 480
gaccctaccg tcttcttcaa ccctaccgtc cggagaagtg gtgcatctag caacaacgtc 540
gctgaaattg gaagcctggt gctcgaatgg acacggttga gcgacctgac gggaaacccg 600
cagtatgccc agcttgcgca gaagggcgag tcgtatctcc tgaatccaaa gggaagcccg 660
gaggcatggc ctggcctgat tggaacgttt gtcagcacga gcaacggtac ctttcaggat 720
agcagcggca gctggtccgg cctcatggac agcttctacg agtacctgat caagatgtac 780
ctgtacgacc cggttgcgtt tgcacactac aaggatcgct gggtccttgc tgccgactcg 840
accattgcgc atctcgcctc tcacccgtcg acgcgcaagg acttgacctt tttgtcttcg 900
tacaacggac agtctacgtc gccaaactca ggacatttgg ccagttttgc cggtggcaac 960
ttcatcttgg gaggcattct cctgaacgag caaaagtaca ttgactttgg aatcaagctt 1020
gccagctcgt actttgccac gtacaaccag acggcttctg gaatcggccc cgaaggcttc 1080
gcgtgggtgg acagcgtgac gggcgccggc ggctcgccgc cctcgtccca gtccgggttc 1140
tactcgtcgg caggattctg ggtgacggca ccgtattaca tcctgcggcc ggagacgctg 1200
gagagcttgt actacgcata ccgcgtcacg ggcgactcca agtggcagga cctggcgtgg 1260
gaagcgttca gtgccattga ggacgcatgc cgcgccggca gcgcgtactc gtccatcaac 1320
gacgtgacgc aggccaacgg cgggggtgcc tctgacgata tggagagctt ctggtttgcc 1380
gaggcgctca agtatgcgta cctgatcttt gcggaggagt cggatgtgca ggtgcaggcc 1440
aacggcggga acaaatttgt ctttaacacg gaggcgcacc cctttagcat ccgttcatca 1500
tcacgacggg gcggccacct tgctcacgac gagttgtaa 1539
<210> 15
<211> 1338
<212> DNA
<213> Artificial sequence
<400> 15
atgtcacttt ctcttgtatc gtaccgccta agaaagaacc cgtgggttaa catttttcta 60
cctgttttgg ccatatttct aatatatata atttttttcc agagagatca atcttcagtc 120
agcgctctcg atggcgaccc cgccagcctc acccgggaag tgattcgcct ggcccaagac 180
gccgaggtgg agctggagcg gcagcgtggg ctgctgcagc agatcgggga tgccctgtcg 240
agccagcggg ggagggtgcc caccgcggcc cctcccgccc agccgcgtgt gcctgtgacc 300
cccgcgccgg cggtgattcc catcctggtc atcgcctgtg accgcagcac tgttcggcgc 360
tgcctggaca agctgctgca ttatcggccc tcggctgagc tcttccccat catcgttagc 420
caggactgcg ggcacgagga gacggcccag gccatcgcct cctacggcag cgcggtcacg 480
cacatccggc agcccgacct gagcagcatt gcggtgccgc cggaccaccg caagttccag 540
ggctactaca agatcgcgcg ccactaccgc tgggcgctgg gccaggtctt ccggcagttt 600
cgcttccccg cggccgtggt ggtggaggat gacctggagg tggccccgga cttcttcgag 660
tactttcggg ccacctatcc gctgctgaag gccgacccct ccctgtggtg cgtctcggcc 720
tggaatgaca acggcaagga gcagatggtg gacgccagca ggcctgagct gctctaccgc 780
accgactttt tccctggcct gggctggctg ctgttggccg agctctgggc tgagctggag 840
cccaagtggc caaaggcctt ctgggacgac tggatgcggc ggccggagca gcggcagggg 900
cgggcctgca tacgccctga gatctcaaga acgatgacct ttggccgcaa gggtgtgagc 960
cacgggcagt tctttgacca gcacctcaag tttatcaagc tgaaccagca gtttgtgcac 1020
ttcacccagc tggacctgtc ttacctgcag cgggaggcct atgaccgaga tttcctcgcc 1080
cgcgtctacg gtgctcccca gctgcaggtg gagaaagtga ggaccaatga ccggaaggag 1140
ctgggggagg tgcgggtgca gtatacgggc agggacagct tcaaggcttt cgccaaggct 1200
ctgggtgtca tggatgacct taagtcgggg gttccgagag ctggctaccg gggtattgtc 1260
accttccagt tccggggccg ccgtgtccac ctggcgcccc cactgacgtg ggagggctat 1320
gatcctagct ggaattag 1338
<210> 16
<211> 2397
<212> DNA
<213> Artificial sequence
<400> 16
atggccctct ttctcagtaa gagactgttg agatttaccg tcattgcagg tgcggttatt 60
gttctcctcc taacattgaa ttccaacagt agaactcagc aatatattcc gagttccatc 120
tccgctgcat ttgattttac ctcaggatct atatcccctg aacaacaagt catctctgag 180
gaaaatgatg ctaaaaaatt agagcaaagt gctctgaatt cagaggcaag cgaagactcc 240
gaagccatgg atgaagaatc caaggctctg aaagctgccg ctgaaaaggc agatgccccg 300
atcatgagag ttctggttac cggtggtagc ggttacattg gaagtcatac ctgtgtgcaa 360
ttactgcaaa acggtcatga tgtcatcatt cttgataacc tctgtaacag taagcgcagc 420
gtactgcctg ttatcgagcg tttaggcggc aaacatccaa cgtttgttga aggcgatatt 480
cgtaacgaag cgttgatgac cgagatcctg cacgatcacg ctatcgacac cgtgatccac 540
ttcgccgggc tgaaagccgt gggcgaatcg gtacaaaaac cgctggaata ttacgacaac 600
aatgtcaacg gcactctgcg cctgattagc gccatgcgcg ccgctaacgt caaaaacttt 660
atttttagct cctccgccac cgtttatggc gatcagccca aaattccata cgttgaaagc 720
ttcccgaccg gcacaccgca aagcccttac ggcaaaagca agctgatggt ggaacagatc 780
ctcaccgatc tgcaaaaagc ccagccggac tggagcattg ccctgctgcg ctacttcaac 840
ccggttggcg cgcatccgtc gggcgatatg ggcgaagatc cgcaaggcat tccgaataac 900
ctgatgccat acatcgccca ggttgctgta ggccgtcgcg actcgctggc gatttttggt 960
aacgattatc cgaccgaaga tggtactggc gtacgcgatt acatccacgt aatggatctg 1020
gcggacggtc acgtcgtggc gatggaaaaa ctggcgaaca agccaggcgt acacatctac 1080
aacctcggcg ctggcgtagg caacagcgtg ctggacgtgg ttaatgcctt cagcaaagcc 1140
tgcggcaaac cggttaatta tcattttgca ccgcgtcgcg agggcgacct tccggcctac 1200
tgggcggacg ccagcaaagc cgaccgtgaa ctgaactggc gcgtaacgcg cacactcgat 1260
gaaatggcgc aggacacctg gcactggcag tcacgccatc cacagggata tcccgatggt 1320
accggtggtg gacgtgacct ttctcgtctg ccacaactgg ttggagtttc tactccactg 1380
caaggtggat ctaactctgc tgctgcaatt ggtcaatcat ctggtgagct tcgtactgga 1440
ggtgctcgtc cccctccacc acttggtgct tcttcccagc cccgtccagg tggcgactcc 1500
agcccagtcg tggattctgg ccctggcccc gctagcaact tgacctcggt cccagtgccc 1560
cacaccaccg cactgtcgct gcccgcctgc cctgaggagt ccccgctgct tgtgggcccc 1620
atgctgattg agtttaacat gcctgtggac ctggagctcg tggcaaagca gaacccaaat 1680
gtgaagatgg gcggccgcta tgcccccagg gactgcgtct ctcctcacaa ggtggccatc 1740
atcattccat tccgcaaccg gcaggagcac ctcaagtact ggctatatta tttgcaccca 1800
gtcctgcagc gccagcagct ggactatggc atctatgtta tcaaccaggc gggagacact 1860
atattcaatc gtgctaagct cctcaatgtt ggctttcaag aagccttgaa ggactatgac 1920
tacacctgct ttgtgtttag tgacgtggac ctcattccaa tgaatgacca taatgcgtac 1980
aggtgttttt cacagccacg gcacatttcc gttgcaatgg ataagtttgg attcagccta 2040
ccttatgttc agtattttgg aggtgtctct gctctaagta aacaacagtt tctaaccatc 2100
aatggatttc ctaataatta ttggggttgg ggaggagaag atgacgacat ttttaacaga 2160
ttagttttta gaggcatgtc tatatctcgc ccaaatgctg tggtcgggag gtgtcgcatg 2220
atccgccact caagagacaa gaaaaatgaa cccaatcctc agaggtttga ccgaattgca 2280
cacacaaagg agacaatgct ctctgatggt ttgaactcac tcacctacca ggtgctggat 2340
gtacagagat acccattgta tacccaaatc acagtggaca tcgggacacc gagctaa 2397
<210> 17
<211> 3306
<212> DNA
<213> Artificial sequence
<400> 17
atgctgctta ccaaaaggtt ttcaaagctg ttcaagctga cgttcatagt tttgatattg 60
tgcgggctgt tcgtcattac aaacaaatac atggatgaga acacgtcgcc tgcaggcgtg 120
gaggatggtc cgaaaagttc acaaagcaat ttcagccaag gtgctggctc acatcttctg 180
ccctcacaat tatccctctc agttgacact gcagactgtc tgtttgcttc acaaagtgga 240
agtcacaatt cagatgtgca gatgttggat gtttacagtc taatttcttt tgacaatcca 300
gatggtggag tttggaagca aggatttgac attacttatg aatctaatga atgggacact 360
gaaccccttc aagtctttgt ggtgcctcat tcccataacg acccaggttg gttgaagact 420
ttcaatgact actttagaga caagactcag tatattttta ataacatggt cctaaagctg 480
aaagaagact cacggaggaa gtttatttgg tctgagatct cttacctttc aaagtggtgg 540
gatattatag atattcagaa gaaggatgct gttaaaagtt taatagaaaa tggtcagctt 600
gaaattgtga caggtggctg ggttatgcct gatgaagcta ctccacatta ttttgcctta 660
attgatcaac taattgaagg acatcagtgg ctggaaaata atataggagt gaaacctcgg 720
tccggctggg ctattgatcc ctttggacac tcaccaacaa tggcttatct tctaaaccgt 780
gctggacttt ctcacatgct tatccagaga gttcattatg cagttaaaaa acactttgca 840
ctgcataaaa cattggagtt tttttggaga cagaattggg atctgggatc tgtcacagat 900
attttatgcc acatgatgcc cttctacagc tatgacatcc ctcacacttg tggacctgat 960
cctaaaatat gctgccagtt tgattttaaa cgtcttcctg gaggcagatt tggttgtccc 1020
tggggagtcc ccccagaaac aatacatcct ggaaatgtcc aaagcagggc tcggatgcta 1080
ctagatcagt accgaaagaa gtcaaagctt tttcgaacca aagttctcct ggctccacta 1140
ggagatgatt tccgctactg tgaatacacg gaatgggatt tacagtttaa gaattatcag 1200
cagctttttg attatatgaa ttctcagtcc aagtttaaag ttaagataca gtttggaact 1260
ttatcagatt tttttgatgc gctggataaa gcagatgaaa ctcagagaga caagggccaa 1320
tcgatgttcc ctgttttaag tggagatttt ttcacttatg ccgatcgaga tgatcattac 1380
tggagtggct attttacatc cagacccttt tacaaacgaa tggacagaat catggaatct 1440
catttaaggg ctgctgaaat tctttactat ttcgccctga gacaagctca caaatacaag 1500
ataaataaat ttctctcatc atcactttac acggcactga cagaagccag aaggaatttg 1560
ggactgtttc aacatcatga tgctatcaca ggaactgcaa aagactgggt ggttgtggat 1620
tatggtacca gactttttca ttcgttaatg gttttggaga agataattgg aaattctgca 1680
tttcttctta ttgggaagga caaactcaca tacgactctt actctcctga taccttcctg 1740
gagatggatt tgaaacaaaa atcacaagat tctctgccac aaaaaaatat aataaggctg 1800
agtgcggagc caaggtacct tgtggtctat aatcctttag aacaagaccg aatctcgttg 1860
gtctcagtct atgtgagttc cccgacagtg caagtgttct ctgcttcagg aaaacctgtg 1920
gaagttcaag tcagcgcagt ttgggataca gcaaatacta tttcagaaac agcctatgag 1980
atctcttttc gagcacatat accgccattg ggactgaaag tgtataagat tttggaatca 2040
gcaagttcaa attcacattt agctgattat gtcttgtata agaataaagt agaagatagc 2100
ggaattttca ccataaagaa tatgataaat actgaagaag gtataacact agagaactcc 2160
tttgttttac ttcggtttga tcaaactgga cttatgaagc aaatgatgac taaagaagat 2220
ggtaaacacc atgaagtaaa tgtgcaattt tcatggtatg gaaccacaat taaaagagac 2280
aaaagtggtg cctacctctt cttacctgat ggtaatgcca agccttatgt ttacacaaca 2340
ccgccctttg tcagagtgac acatggaagg atttattcgg aagtgacttg cttttttgac 2400
catgttactc atagagtccg actataccac atacagggaa tagaaggaca gtctgtggaa 2460
gtttccaata ttgtggacat ccgaaaagta tataaccgtg agattgcaat gaaaatttct 2520
tctgatataa aaagccaaaa tagattttat actgacctaa atgggtacca gattcaacct 2580
agaatgacac tgagcaaatt gcctcttcaa gcaaatgtct atcccatgac cacaatggcc 2640
tatatccagg atgccaaaca tcgtttgaca ctgctctctg ctcagtcatt aggggtttcg 2700
agtttgaata gtggtcagat tgaagttatc atggatcgaa gactcatgca agatgataat 2760
cgtggccttg agcaaggtat ccaggataac aagattacag ctaatctatt tcgaatacta 2820
ctagaaaaaa gaagtgctgt taatacggaa gaagaaaaga agtcggtcag ttatccttct 2880
ctccttagcc acataacttc ttctctcatg aatcatccag tcattccaat ggcaaataag 2940
ttctcctcac ctacccttga gctgcaaggt gaattctctc cattacagtc atctttgcct 3000
tgtgacattc atctggttaa tttgagaaca atacagtcaa aggtgggcaa tgggcactcc 3060
aatgaggcag ccttgatcct ccacagaaaa gggtttgatt gtcggttctc tagcaaaggc 3120
acagggctgt tttgttctac tactcaggga aagatattgg tacagaaact tttaaacaag 3180
tttattgtcg aaagtctcac accttcatca ctatccttga tgcattcacc tcccggcact 3240
cagaatataa gtgagatcaa cttgagtcca atggaaatca gcacattccg aatccagttg 3300
aggtga 3306
<210> 18
<211> 1188
<212> DNA
<213> Artificial sequence
<400> 18
atgctgctta ccaaaaggtt ttcaaagctg ttcaagctga cgttcatagt tttgatattg 60
tgcgggctgt tcgtcattac aaacaaatac atggatgaga acacgtcgcc tgcaggctcc 120
ctggtgtacc agctgaactt tgatcagacc ctgaggaatg tagataaggc tggcacctgg 180
gccccccggg agctggtgct ggtggtccag gtgcataacc ggcccgaata cctcagactg 240
ctgctggact cacttcgaaa agcccaggga attgacaacg tcctcgtcat ctttagccat 300
gacttctggt cgaccgagat caatcagctg atcgccgggg tgaatttctg tccggttctg 360
caggtgttct ttcctttcag cattcagttg taccctaacg agtttccagg tagtgaccct 420
agagattgtc ccagagacct gccgaagaat gccgctttga aattggggtg catcaatgct 480
gagtatcccg actccttcgg ccattataga gaggccaaat tctcccagac caaacatcac 540
tggtggtgga agctgcattt tgtgtgggaa agagtgaaaa ttcttcgaga ttatgctggc 600
cttatacttt tcctagaaga ggatcactac ttagccccag acttttacca tgtcttcaaa 660
aagatgtgga aactgaagca gcaagagtgc cctgaatgtg atgttctctc cctggggacc 720
tatagtgcca gtcgcagttt ctatggcatg gctgacaagg tagatgtgaa aacttggaaa 780
tccacagagc acaatatggg tctagccttg acccggaatg cctatcagaa gctgatcgag 840
tgcacagaca ctttctgtac ttatgatgat tataactggg actggactct tcaatacttg 900
actgtatctt gtcttccaaa attctggaaa gtgctggttc ctcaaattcc taggatcttt 960
catgctggag actgtggtat gcatcacaag aaaacctgta gaccatccac tcagagtgcc 1020
caaattgagt cactcttaaa taataacaaa caatacatgt ttccagaaac tctaactatc 1080
agtgaaaagt ttactgtggt agccatttcc ccacctagaa aaaatggagg gtggggagat 1140
attagggacc atgaactctg taaaagttat agaagactgc agtgataa 1188
<210> 19
<211> 4921
<212> DNA
<213> Artificial sequence
<400> 19
ggcatacact attatcttat ctatattagt cgtcgccgtt gcttttggat cctcgtgtat 60
ctctggagca ttattcactg tggaagataa ttataatgtt tcattggaag ttgccatttt 120
gacagtttca ttgatggtct tgggtttctc cttgggtcca ttgttgtggt ctcctttatc 180
tgagcagatt ggaaggagat gggtttattt tatatccttg ggtctctaca caatttttaa 240
cattccttgc gctctatccc ctaatatcgg tggtctctta gtttgtcgat ttttgtgtgg 300
tgtttttagt tccagcgcac tttgtctggt tggtggttct atagctgaca tgcatccttc 360
tgaaacaaga ggtaaagcaa tcgcctattt tgcagcagct ccttatggtg gaccagttat 420
tggaccttta gtatgtggtt ggatcggtgt taaaaccaac agaatggatc ttatcttttg 480
ggtaaatatg ggatttgcag gatttatgtg gttactagtt gcctgcattc cagaaaccta 540
tcaaccagta attttaaaga accgagcaaa gaaattaaga atggagttga acaatcctaa 600
catcatgaca gagcaagaag ctaatccact aactttcaag gaattagtag ttacctgcct 660
ttataggcct cttatgtttg ttttcactga gcctgttttg gacatgatgt gtgtttacgt 720
ttgtcttatt tactcattgc tttatgcatt tttctttgca tacccagtta tatttaatga 780
gctttatggc tatgaagatg atttcatcgg cctgatgttg attccaatat tgataggagc 840
ctttttggcc ttagttacaa ctccaatttt ggaatccatg tacgtgaaaa tgtgtcaacg 900
aagaaaacca actcctgaag acagattggt aggagccatg attgggtctc ctttccctgc 960
aattgcccta tttattttgg gagcaacgtc ctacaagcat atcatttggg tcggtccagc 1020
atcttccggt atcgccttcg gttatggaat ggtactaatt tactactctt tgaataatta 1080
catcatcgac acctacgcca agtatgcagc tagtgctctg gcaacaaagg ttttcctgag 1140
gagtgctgga ggtgctgctt tcccactatt tactacacag atgtaccata aactagggct 1200
acagtgggcc agttggttgt tggcattcat ttcattagca atgattctca tcccattcgt 1260
tttctacatt tatggtgctc gtttgagggc caaaatgtgt aaagagaact acagtgagat 1320
gtgatgcatt aagaacaatc attcattaat ccttttcagc atatattatt tctaattaat 1380
tcatacttaa taacgaaaat atggtacctg ccctcacggt ggttacggtc taggaacgga 1440
acgtatctta gcatggttgt gcgacagatt cactgtgaaa gactgttcat tatacccacg 1500
tttcactggg agatgtaagc cttaggtgtt ttaccctgat tagataatac aataaccaac 1560
agaaatacga gaatctagac taatttcgat gattcatttt tctttttacc gcgctgcctc 1620
ttttggcaat tctttcacct atattctacc ttctctttcc ttttgttcta aacttattac 1680
cagctatcta tgtcgaatca agaagaaaga cttaaactgt ggggtggcag gtttactggg 1740
gctactgacc ccttgatgga tttgtataac gcttccttac cttacgacaa gaaaatgtac 1800
aaggtggatt tagaaggaac aaaagtttac actgagggcc tggagaaaat taatttgcta 1860
actaaagacg aactaagtga gattcatcgt ggtctcaaat tgattgaagc agagtgggca 1920
gaagggaagt ttgttgagaa gccaggggat gaggatattc acactgctaa tgaacgtcgc 1980
ttgggtgagt tgattggtcg tggaatctct ggtaaggttc ataccggaag gtctagaaat 2040
gatcaagttg ccactgatat gcggttgtat gtcagagaca atctaactca gttggctgac 2100
tatctgaagc agttcattca agtaatcatc aagagagctg aacaggaaat agacgtcttg 2160
atgcccggtt atactcactt gcaaagagct caaccaatca gatggtctca ctggttgagc 2220
atgtatgcta cctatttcac tgaagattat gagagactga atcaaatcgt taaaaggttg 2280
aacaaatccc cattgggagc tggagctttg gctggtcatc cttatggaat tgatcgtgaa 2340
tacattgctg agagattagg gtttgattct gttattggta attctttggc cgctgtttca 2400
gacagagatt ttgtagtcga aaccatgttc tggtcttcgt tgtttatgaa tcatatttct 2460
cgattctcag aagatttgat catttactcc actggagagt ttggatttat caagttggca 2520
gatgcttatt ctactggatc ttctctgatg cctcaaaaaa aaaacccaga ctctttggag 2580
ttattgaggg gtaaatctgg tagatgtttt ggggccttgg ctggtttcct catgtctatt 2640
aagtccattc cgtcaaccta taacaaagat atgcaagagg ataaggagcc tttatttgat 2700
actctaatca ctgtagagca ctcgattttg atagcatccg gtgtagtttc taccttgaac 2760
attgatgccg aacgaatgaa gaatgctcta actatggata tgctggctac agatcttgcc 2820
gactatttag ttagaagggg agttccattc agagaaactc accacatttc tggtgaatgt 2880
gtcagacaag ccgaggagtt gaacctttct ggtattgatc agttgtccct cgaacaattg 2940
aaatccattg actcccgttt tgaggctgat gtggcttcaa cgtttgactt tgaagccagt 3000
gttgaaaaaa gaactgccac cggaggaact tctaagactg ctgttttaaa gcaattggat 3060
gcactgaatg aaaagctaga gtcttgaagg ttttatactg agtttgttaa tgatacaata 3120
aactgttata gtacatacaa ttgaaactct cttatctata ctgggggacc ttctcgcaga 3180
atggtataaa tatctactaa ctgactgtcg tacggcctag gggtctcttc ttcgattatt 3240
tgcaggtcgg aacatccttc gtctgatgcg gatctcctga gacaaagttc acgggtatct 3300
agtattctat cagcataaat ggaggacctt tctaaactaa actttgaatc gtctccagca 3360
gcatcctcgc ataatccttt tgtcatttcc tctatgtcta ttgtcactgt ggttggcgca 3420
tcaagagtcg tccttctgta aaccggtaca gaattcctac cactagaagc ttgaaatggg 3480
gagggtttca gctttgtatc ccgatactgt gctttaaaaa gggagtccaa actgaaatct 3540
ttttcggaat cattggatga tacctctgta ttagatctcc tatgtatcgg tttcctcggg 3600
tagatagaac ttcactcatc aacattatga tctttgtcga aaagtatcaa ttgaaacatt 3660
gccgctctgg ctctttcctt ggtgtccgtg ttgtcgcttt caaaactcaa tttcttgata 3720
acatcataaa atccatcttt aattagcttc aacgctcttg atctaggtgc tcgcatcttc 3780
ttgaaatgtt catcggaagt tagctcattc aagtacccaa catttatttc ttcttcaata 3840
gtttccatat ccatttcaac atctgaatct tccagatctg aagatgtatc gtccttccat 3900
gttaagttgg taactatcca aatacatgat atcatcagat ctttatggaa agcggcccat 3960
tcggaggaga ccccttctat ttcttgtact aaaggagtct ccaataacat ataaatgaag 4020
tcgagcaatt cttgattaca aataatcatt gatctgttat cttcattaga ggccgcaaaa 4080
tggaccagga tataagtgat agcaagaata acctcataag tttctgattc ctttctttta 4140
ctaatgtcat cctcctttaa tgtggatgat aaactcttca aattttttaa tagaaaattc 4200
aaaaaatctt tatcatcgtg agcttttgct gtcgggtcgg aacagaatga ttgaatgatt 4260
ttgttcgaat agttaagagg accacaggac aagtttcgga taatattgaa tgctttttct 4320
tgaatttgca gatttgaaga ataacaaagt tcaaaaattc ttgataaagg aactttgtcc 4380
aaaaataatt ttttatcaat gatatcatcc ccgtaaaggt aatttctaag aattgataag 4440
gcattgttct ttaagaattc aaactcgttc tcttccgaaa caaaataaga caataccttc 4500
aggaagtctt cattgaaaac attttttttc aaagaactat attccaccac acaattggaa 4560
atgattccca aaactagcgc cttgagtgta atttcgtcat ccaataagga cgcacaagat 4620
tcattgtctc ttgaaaccaa aggcagtttc accagatccg tcaacgagtt tacaatgttt 4680
aaatctacta ggtaggttcg tagcaatggc gctgatcgtg acagtgagcg gaccaagtac 4740
agagcagacg tcgtgattat gtttgaaagc ttcaatatag taatgacatg atccaactga 4800
gcagcgtctc cgtgaaactc tttgtaaata ttaatatgac gagagacaag atccaccaca 4860
cattctgaaa caatgtcatg cttgataatt tcatctctgt attcctcgtt gttggacgtt 4920
a 4921
<210> 20
<211> 2023
<212> DNA
<213> Artificial sequence
<400> 20
ggtaccgcag tttaatcata gcccactgct aagccagaat tctaatatgt aactacgtac 60
ctttcctttt aataaatgat ctgtattttc cacctagtag cagatcaaat tgttcaactt 120
taagtctttg gtccctcaag cgagagaact tgcgatgaca ctcaggagtg ccataaaagc 180
cagaacctca aaaggactga tcggagctgt tattatagcc tcaataatat ttttcaccac 240
agtaaccttc tacgatgaaa gcaaaattgt cggcataata agagtttctg atacttatac 300
aggccatagc gctgtatctt caactttcaa tgcttcttcc gttgttagtg acaacaagat 360
caacggatat ggacttcctt tgattgacac ggaatcaaat agccgttatg aggatccaga 420
cgatatttcc attgaaaacg aattgcgcta tagaattgcc caatctacca aagaggaaga 480
aaacatgtgg aaactcgata ccactctcac ggaagcaagc ttgaaaatcc ccaacataca 540
gtcgtttgag ctgcagccgt tcaaagaaag acttgataat tcactttaca attctaagaa 600
cataggaaac ttttacttct atgacccaag gcttacattc tcagtttact tgaagtatat 660
caaggataaa ttggcctctg gaagcacaac aaatcttaca atacccttca actgggcaca 720
ttttagagat ttatcgtcac tgaatcctta tttggacata aaacaagaag ataaggtcgc 780
atgtgattac ttttatgaat caagtaataa agacaaacga aaacccacgg gtaactgtat 840
tgagtttaaa gatgttcgtg atgagcacct gatacagtat gggatttcat caaaagacca 900
tctacctggt ccttttattt taaagtcact tggaattccc atgcagcata cagccaagcg 960
actggaatca aatctttatc tattaaccgg tgcgccagtt ccacttgcgg ccgcacttta 1020
ctttcttggt attggaattc attgatgttc ccttgggatt atgatattga tgtgcaaatg 1080
ccaatcaaga gtttgaacaa tctatgtgct aacttcaacc aatcattaat aattgaggat 1140
cttactgaag gatattcttc ttttttcttg gattgcggat caagtatcac gcatagaaca 1200
aaaggcaaag gattaaactt cattgatgca agattcataa atgttgaaac aggcctttat 1260
atcgatatca ctggattaag taccagtcag tcagctcgac cgccaaggtt tagtaacgct 1320
tcgaagaaag atcctattta caattgcagg aataatcatt tctactctca taacaatata 1380
gcacctctca aatacacgtt gatggagggg gttcccagtt tcattcctca acagtatgaa 1440
gaaatattga gagaggagta tacaactggt ttgacttcga aacactacaa cggcaacttt 1500
tttatgactc aattgaattt gtggcttgaa agagatccaa tgctagcact tgtgccttca 1560
tccaaatacg aaattgaagg tggaggggtg gaccataaca agattatcaa gtctattctt 1620
gaactttcca acatcaaaaa attggaattg ttggatgata atcccgatat attagaggag 1680
gtgatcagga catacgaact gacttccatt caccataaag agatgcagta tctttccagt 1740
gtcaaaccag atggggacag gtccatgcag tcaaatgaca taaccagttc ttaccaggag 1800
tttctagcaa gtctgaagaa attccagcct ttacgcaaag atttgttcca atttgagcgg 1860
atagaccttt ctaagcatag aaaacagtga gcagccgttt tgcctaaaat gttccagaaa 1920
ctataggata aatatataca gtaatgaatt aggtgatgtt agcatttagt ccccaaaaat 1980
acctcgaatc tccagctcca tagcgcaaaa tctcggatct aga 2023
<210> 21
<211> 223
<212> PRT
<213> Artificial sequence
<400> 21
Arg Val Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn
1 5 10 15
Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val
20 25 30
Tyr Ala Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser
35 40 45
Val Leu Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val
50 55 60
Ser Pro Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp
65 70 75 80
Ser Phe Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln
85 90 95
Thr Gly Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr
100 105 110
Gly Cys Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly
115 120 125
Gly Asn Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys
130 135 140
Pro Phe Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr
145 150 155 160
Pro Cys Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser
165 170 175
Tyr Gly Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val
180 185 190
Val Val Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly
195 200 205
Pro Lys Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe
210 215 220
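The three-letter listing above can be read more easily in one-letter code. The following is a minimal sketch, not part of the patent: it converts the first 32 residues of SEQ ID No. 21 (transcribed from the two rows above) and scans them for the canonical N-glycosylation sequon N-X-S/T with X not proline. The sequon rule is general glycobiology background assumed here, not something the listing states, and the names `to_one_letter` and `find_sequons` are hypothetical helpers.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(three_letter_text):
    """Convert whitespace-separated three-letter residues to one-letter code."""
    return "".join(THREE_TO_ONE[token] for token in three_letter_text.split())

def find_sequons(protein):
    """Return 1-based positions of Asn in N-X-S/T motifs where X is not Pro.

    A lookahead is used so that overlapping motifs are all reported.
    Assumption: this is the textbook sequon rule, not a rule from the patent.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein)]

# Residues 1-32 of SEQ ID No. 21, transcribed from the listing.
SEQ21_PREFIX = (
    "Arg Val Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn "
    "Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val"
)

prefix = to_one_letter(SEQ21_PREFIX)
sequon_positions = find_sequons(prefix)
print(prefix)
print(sequon_positions)
```

Under these assumptions the scan reports Asn 13 (Asn-Ile-Thr) and Asn 25 (Asn-Ala-Thr) within the first 32 residues; whether these sites are actually glycosylated in the expressed protein is an experimental question that the sketch does not answer.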
<210> 22
<211> 219
<212> PRT
<213> Artificial sequence
<400> 22
Arg Val Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn
1 5 10 15
Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val
20 25 30
Tyr Ala Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser
35 40 45
Val Leu Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val
50 55 60
Ser Pro Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp
65 70 75 80
Ser Phe Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln
85 90 95
Thr Gly Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr
100 105 110
Gly Cys Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly
115 120 125
Gly Asn Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys
130 135 140
Pro Phe Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr
145 150 155 160
Pro Cys Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser
165 170 175
Tyr Gly Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val
180 185 190
Val Val Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly
195 200 205
Pro Lys Lys Ser Thr Asn Leu Val Lys Asn Lys
210 215
<210> 23
<211> 216
<212> PRT
<213> Artificial sequence
<400> 23
Arg Val Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn
1 5 10 15
Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val
20 25 30
Tyr Ala Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser
35 40 45
Val Leu Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val
50 55 60
Ser Pro Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp
65 70 75 80
Ser Phe Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln
85 90 95
Thr Gly Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr
100 105 110
Gly Cys Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly
115 120 125
Gly Asn Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys
130 135 140
Pro Phe Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr
145 150 155 160
Pro Cys Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser
165 170 175
Tyr Gly Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val
180 185 190
Val Val Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly
195 200 205
Pro Lys Lys Ser Thr Asn Leu Val
210 215
<210> 24
<211> 210
<212> PRT
<213> Artificial sequence
<400> 24
Arg Val Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn
1 5 10 15
Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val
20 25 30
Tyr Ala Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser
35 40 45
Val Leu Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val
50 55 60
Ser Pro Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp
65 70 75 80
Ser Phe Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln
85 90 95
Thr Gly Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr
100 105 110
Gly Cys Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly
115 120 125
Gly Asn Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys
130 135 140
Pro Phe Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr
145 150 155 160
Pro Cys Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser
165 170 175
Tyr Gly Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val
180 185 190
Val Val Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly
195 200 205
Pro Lys
210
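As printed, SEQ ID Nos. 22, 23 and 24 share residues 1-208 with SEQ ID No. 21 and differ only in how far the C-terminal tail extends. The sketch below, which is an illustration rather than part of the patent, checks that reading against the declared `<211>` lengths. The tails are transcribed by hand from the listing into one-letter code, and `truncation_report` is a hypothetical helper name.

```python
# Residues 1-208 are identical in SEQ ID Nos. 21-24 as printed; only the
# tail starting at residue 209 differs between them.
SHARED_PREFIX_LEN = 208

# C-terminal tails (residue 209 onward), transcribed from the listing.
TAILS = {
    21: "PKKSTNLVKNKCVNF",
    22: "PKKSTNLVKNK",
    23: "PKKSTNLV",
    24: "PK",
}

# Lengths declared in the <211> fields of the listing.
DECLARED_LENGTHS = {21: 223, 22: 219, 23: 216, 24: 210}

def truncation_report(tails, declared, prefix_len):
    """Compare each tail with the SEQ ID No. 21 tail and the declared length."""
    full_tail = tails[21]
    report = {}
    for seq_id, tail in tails.items():
        report[seq_id] = {
            # True if this tail is a leading fragment of the full tail.
            "is_truncation": full_tail.startswith(tail),
            # True if shared prefix plus tail matches the <211> length.
            "length_ok": prefix_len + len(tail) == declared[seq_id],
            # Number of C-terminal residues missing relative to SEQ ID No. 21.
            "residues_removed": len(full_tail) - len(tail),
        }
    return report

report = truncation_report(TAILS, DECLARED_LENGTHS, SHARED_PREFIX_LEN)
for seq_id, row in sorted(report.items()):
    print(seq_id, row)
```

Read this way, SEQ ID Nos. 22, 23 and 24 lack the last 4, 7 and 13 residues of SEQ ID No. 21 respectively.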
<210> 25
<211> 669
<212> DNA
<213> Artificial sequence
<400> 25
agagtacaac caactgaatc cattgttaga tttcctaata tcactaacct gtgcccattt 60
ggtgaagttt ttaacgctac tagatttgct tctgtttacg cctggaacag aaagagaatt 120
tctaactgtg ttgctgatta ctctgttctt tacaactctg cctctttttc tacttttaag 180
tgttatggtg tctctccaac caagttgaac gatttgtgtt ttaccaacgt ttacgctgat 240
tcttttgtta ttagaggtga tgaggttaga caaattgctc ctggtcaaac tggtaagatt 300
gctgattata actacaagtt gcctgatgat tttactggtt gcgtcattgc ttggaactct 360
aataatttgg attctaaggt tggtggaaat tacaactact tgtacagatt gtttagaaag 420
agtaacttga agccatttga aagagatatt tctactgaaa tctaccaagc tggatctact 480
ccttgtaacg gtgtcgaagg ttttaactgc tactttcctt tgcagtctta cggttttcaa 540
cccactaacg gtgttggtta ccagccctac agagttgttg ttttgtcttt tgagttgctt 600
catgctccag ctactgtttg tggtcctaag aagtctacta acttggttaa gaacaagtgt 660
gttaatttc 669
<210> 26
<211> 657
<212> DNA
<213> Artificial sequence
<400> 26
agagtacaac caactgaatc cattgttaga tttcctaata tcactaacct gtgcccattt 60
ggtgaagttt ttaacgctac tagatttgct tctgtttacg cctggaacag aaagagaatt 120
tctaactgtg ttgctgatta ctctgttctt tacaactctg cctctttttc tacttttaag 180
tgttatggtg tctctccaac caagttgaac gatttgtgtt ttaccaacgt ttacgctgat 240
tcttttgtta ttagaggtga tgaggttaga caaattgctc ctggtcaaac tggtaagatt 300
gctgattata actacaagtt gcctgatgat tttactggtt gcgtcattgc ttggaactct 360
aataatttgg attctaaggt tggtggaaat tacaactact tgtacagatt gtttagaaag 420
agtaacttga agccatttga aagagatatt tctactgaaa tctaccaagc tggatctact 480
ccttgtaacg gtgtcgaagg ttttaactgc tactttcctt tgcagtctta cggttttcaa 540
cccactaacg gtgttggtta ccagccctac agagttgttg ttttgtcttt tgagttgctt 600
catgctccag ctactgtttg tggtcctaag aagtctacta acttggttaa gaacaag 657
<210> 27
<211> 648
<212> DNA
<213> Artificial sequence
<400> 27
agagtacaac caactgaatc cattgttaga tttcctaata tcactaacct gtgcccattt 60
ggtgaagttt ttaacgctac tagatttgct tctgtttacg cctggaacag aaagagaatt 120
tctaactgtg ttgctgatta ctctgttctt tacaactctg cctctttttc tacttttaag 180
tgttatggtg tctctccaac caagttgaac gatttgtgtt ttaccaacgt ttacgctgat 240
tcttttgtta ttagaggtga tgaggttaga caaattgctc ctggtcaaac tggtaagatt 300
gctgattata actacaagtt gcctgatgat tttactggtt gcgtcattgc ttggaactct 360
aataatttgg attctaaggt tggtggaaat tacaactact tgtacagatt gtttagaaag 420
agtaacttga agccatttga aagagatatt tctactgaaa tctaccaagc tggatctact 480
ccttgtaacg gtgtcgaagg ttttaactgc tactttcctt tgcagtctta cggttttcaa 540
cccactaacg gtgttggtta ccagccctac agagttgttg ttttgtcttt tgagttgctt 600
catgctccag ctactgtttg tggtcctaag aagtctacta acttggtt 648
<210> 28
<211> 630
<212> DNA
<213> Artificial sequence
<400> 28
agagtacaac caactgaatc cattgttaga tttcctaata tcactaacct gtgcccattt 60
ggtgaagttt ttaacgctac tagatttgct tctgtttacg cctggaacag aaagagaatt 120
tctaactgtg ttgctgatta ctctgttctt tacaactctg cctctttttc tacttttaag 180
tgttatggtg tctctccaac caagttgaac gatttgtgtt ttaccaacgt ttacgctgat 240
tcttttgtta ttagaggtga tgaggttaga caaattgctc ctggtcaaac tggtaagatt 300
gctgattata actacaagtt gcctgatgat tttactggtt gcgtcattgc ttggaactct 360
aataatttgg attctaaggt tggtggaaat tacaactact tgtacagatt gtttagaaag 420
agtaacttga agccatttga aagagatatt tctactgaaa tctaccaagc tggatctact 480
ccttgtaacg gtgtcgaagg ttttaactgc tactttcctt tgcagtctta cggttttcaa 540
cccactaacg gtgttggtta ccagccctac agagttgttg ttttgtcttt tgagttgctt 600
catgctccag ctactgtttg tggtcctaag 630
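A quick consistency check between the DNA entries (SEQ ID Nos. 25-28) and the protein entries (SEQ ID Nos. 21-24) can be sketched as follows. This is an illustration under stated assumptions, not the applicants' procedure: the standard genetic code is assumed, the pairing 25/21, 26/22, 27/23 and 28/24 is inferred from the declared lengths, and only the first 60 nucleotides of SEQ ID No. 25 are translated. The helper name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (first base, then second, then third).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA string in frame 1, ignoring whitespace and case."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# First row (nucleotides 1-60) of SEQ ID No. 25, transcribed from the listing.
SEQ25_FIRST_LINE = (
    "agagtacaac caactgaatc cattgttaga tttcctaata tcactaacct gtgcccattt"
)

# Declared <211> lengths; the DNA/protein pairing is an inference from length.
LENGTHS = {
    (25, 21): (669, 223),
    (26, 22): (657, 219),
    (27, 23): (648, 216),
    (28, 24): (630, 210),
}

print(translate(SEQ25_FIRST_LINE))
for (dna_id, prot_id), (nt, aa) in LENGTHS.items():
    print(dna_id, prot_id, nt == 3 * aa)
```

The translated fragment matches residues 1-20 of SEQ ID No. 21, and each DNA length is exactly three times the paired protein length, which suggests the DNA entries as listed contain no stop codon.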

Claims (21)

1. A method for preparing a coronavirus S protein receptor binding region modified with N-glycans of mammalian glycoform structure, comprising the following steps:
(1) expressing a coronavirus S protein receptor binding region in Pichia pastoris with a genetically modified glycosylation modification pathway, to obtain a recombinant yeast cell;
wherein the Pichia pastoris with a genetically modified glycosylation modification pathway is a Pichia pastoris cell mutant that is defective in the mannosylation modification pathway and in which the mammalian cell N-glycosylation modification pathway has been reconstructed;
(2) culturing the recombinant yeast cell, and purifying from the culture supernatant the coronavirus S protein receptor binding region modified with N-glycans of mammalian glycoform structure;
wherein the Pichia pastoris with a genetically modified glycosylation modification pathway is prepared by a method comprising the following steps:
(A1) inactivating the endogenous alpha-1,6-mannosyltransferase, phosphomannosyltransferase, beta-mannosyltransferase I, beta-mannosyltransferase II, beta-mannosyltransferase III and beta-mannosyltransferase IV of a recipient Pichia pastoris, to obtain recombinant yeast 1;
(A2) expressing the following exogenous proteins in the recombinant yeast 1: exogenous mannosidase I, exogenous N-acetylglucosamine transferase I, exogenous mannosidase II, exogenous N-acetylglucosamine transferase II, exogenous galactose isomerase and exogenous galactose transferase, to obtain recombinant yeast 2; and
(A3) inactivating the endogenous O-mannosyltransferase I of the recombinant yeast 2, to obtain recombinant yeast 3; the recombinant yeast 3 is the Pichia pastoris with a genetically modified glycosylation modification pathway;
wherein the exogenous mannosidase I is derived from Trichoderma viride and carries the endoplasmic reticulum retention signal HDEL fused at its C-terminus; the exogenous N-acetylglucosamine transferase I is of human origin and comprises an mnn9 localization signal; the exogenous mannosidase II is derived from a nematode; the exogenous N-acetylglucosamine transferase II is of human origin and comprises an mnn2 localization signal; the exogenous galactose isomerase and the exogenous galactose transferase form a fusion protein, are both of human origin, and share one kre localization signal;
wherein the gene encoding the exogenous mannosidase I and the gene encoding the exogenous mannosidase II are each introduced into the recombinant yeast 1 twice;
wherein the coronavirus S protein receptor binding region is any one of the following:
(a1) a protein shown in SEQ ID No. 21;
(a2) a protein shown in SEQ ID No. 22;
(a3) a protein shown in SEQ ID No. 23;
(a4) a protein shown in SEQ ID No. 24.
2. The method according to claim 1, characterized in that:
in step (A1), the endogenous alpha-1,6-mannosyltransferase, phosphomannose synthetase, beta-mannosyltransferase I, beta-mannosyltransferase II, beta-mannosyltransferase III and beta-mannosyltransferase IV of the recipient Pichia pastoris are inactivated by gene knockout through homologous recombination; and/or
in step (A2), the exogenous proteins are expressed in the recombinant yeast 1 by introducing the genes encoding the exogenous proteins into the recombinant yeast 1.
3. The method according to claim 2, wherein the genes encoding the exogenous proteins are introduced into the recombinant yeast 1 in the form of recombinant vectors.
4. A method according to any one of claims 1 to 3, wherein in step (A3) the inactivation of the O-mannosyltransferase I endogenous to the recombinant yeast 2 is achieved by insertional inactivation of the O-mannosyltransferase I encoding gene in the genomic DNA of the recombinant yeast 2.
5. A method according to any one of claims 1-3, characterized in that:
the amino acid sequence of the alpha-1,6-mannosyltransferase is shown in SEQ ID No. 1; and/or
the amino acid sequence of the phosphomannosyltransferase is shown in SEQ ID No. 2; and/or
the amino acid sequence of the phosphomannose synthetase is shown in SEQ ID No. 3; and/or
the amino acid sequence of the beta-mannosyltransferase I is shown in SEQ ID No. 4; and/or
the amino acid sequence of the beta-mannosyltransferase II is shown in SEQ ID No. 5; and/or
the amino acid sequence of the beta-mannosyltransferase III is shown in SEQ ID No. 6; and/or
the amino acid sequence of the beta-mannosyltransferase IV is shown in SEQ ID No. 7; and/or
the amino acid sequence of the O-mannosyltransferase I is shown in SEQ ID No. 8; and/or
the amino acid sequence of the exogenous mannosidase I is shown in SEQ ID No. 9; and/or
the amino acid sequence of the exogenous N-acetylglucosamine transferase I is shown in SEQ ID No. 10; and/or
the amino acid sequence of the fusion protein consisting of the galactose isomerase and the galactose transferase is shown in SEQ ID No. 11; and/or
the amino acid sequence of the exogenous mannosidase II is shown in SEQ ID No. 12; and/or
the amino acid sequence of the exogenous N-acetylglucosamine transferase II is shown in SEQ ID No. 13.
6. A method according to any one of claims 1-3, characterized in that:
the nucleotide sequence of the gene encoding the exogenous mannosidase I is shown in SEQ ID No. 14; and/or
the nucleotide sequence of the gene encoding the exogenous N-acetylglucosamine transferase I is shown in SEQ ID No. 15; and/or
the nucleotide sequence of the gene encoding the fusion protein consisting of the galactose isomerase and the galactose transferase is shown in SEQ ID No. 16; and/or
the nucleotide sequence of the gene encoding the exogenous mannosidase II is shown in SEQ ID No. 17; and/or
the nucleotide sequence of the gene encoding the exogenous N-acetylglucosamine transferase II is shown in SEQ ID No. 18.
7. A method according to any one of claims 1-3, characterized in that: the Pichia pastoris with a genetically modified glycosylation modification pathway is the strain deposited at the China General Microbiological Culture Collection Center under accession number CGMCC No.19488.
8. A method according to any one of claims 1-3, characterized in that: in step (1), the recombinant yeast cell is obtained by introducing a gene encoding the coronavirus S protein receptor binding region into the Pichia pastoris with a genetically modified glycosylation modification pathway.
9. The method of claim 1, wherein the gene encoding the coronavirus S protein receptor binding region is any one of the following:
(b1) A DNA molecule shown in SEQ ID No. 25;
(b2) A DNA molecule shown in SEQ ID No. 26;
(b3) A DNA molecule shown in SEQ ID No. 27;
(b4) A DNA molecule shown in SEQ ID No. 28.
10. A method according to any one of claims 1-3, characterized in that: in step (2), the coronavirus S protein receptor binding region modified with N-glycans of mammalian glycoform structure is purified from the culture supernatant by a method comprising the following steps: subjecting the culture supernatant sequentially to cation exchange chromatography, hydrophobic chromatography, G25 desalting and anion exchange chromatography, to obtain the coronavirus S protein receptor binding region modified with N-glycans of mammalian glycoform structure.
11. The method according to claim 10, wherein in step (2), the coronavirus S protein receptor binding region modified with N-glycans of mammalian glycoform structure is purified from the culture supernatant by a method comprising the following steps: passing the culture supernatant through a Capto MMC chromatography column to capture the target protein, and eluting with a buffer containing 1 M NaCl to obtain a crude sample containing the target protein; purifying the crude sample on a Phenyl HP hydrophobic chromatography column; desalting the elution peak sample containing the target protein on a G25 chromatography column; and then adsorbing impurity proteins on a Source 30Q anion exchange chromatography column, the target protein being collected in the flow-through; the target protein is the coronavirus S protein receptor binding region modified with N-glycans of mammalian glycoform structure.
12. The method of claim 1, wherein the coronavirus is SARS-CoV-2.
13. A coronavirus S protein receptor binding region having an amino acid sequence as shown in SEQ ID No.22, SEQ ID No.23 or SEQ ID No.24, prepared by the method of any one of claims 1 to 12.
14. A medicament for preventing diseases caused by coronavirus infection, which comprises the coronavirus S protein receptor binding region of claim 13 as an active ingredient.
15. A medicament capable of inhibiting coronavirus, the active ingredient of which is the coronavirus S protein receptor binding region of claim 13.
16. A reagent or kit for diagnosing coronavirus infection comprising the coronavirus S protein receptor binding region of claim 13.
17. A coronavirus vaccine comprising an antigen and an adjuvant; wherein the antigen is the coronavirus S protein receptor binding region of claim 13.
18. The coronavirus vaccine of claim 17, wherein the adjuvant is an aluminum adjuvant.
19. A product capable of inducing an animal to produce antibodies specific for the coronavirus S protein receptor binding region, the active ingredient of which is the coronavirus S protein receptor binding region according to claim 13.
20. Use of the recombinant yeast 3 of claim 1 or the strain with accession number CGMCC No.19488 of claim 7 for the preparation of the coronavirus S protein receptor binding region of claim 1.
21. Use of a coronavirus S protein receptor binding region according to claim 13 for the preparation of a medicament according to claim 14 or 15, a reagent or kit according to claim 16, a coronavirus vaccine according to claim 17 or a product according to claim 19.
CN202010493748.5A 2020-06-03 2020-06-03 Preparation method and application of coronavirus S protein RBD glycoprotein Active CN113754739B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010493748.5A CN113754739B (en) 2020-06-03 2020-06-03 Preparation method and application of coronavirus S protein RBD glycoprotein
PCT/CN2021/093757 WO2021244255A1 (en) 2020-06-03 2021-05-14 Method for preparing rbd glycoprotein of coronavirus spike protein, and use thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010493748.5A CN113754739B (en) 2020-06-03 2020-06-03 Preparation method and application of coronavirus S protein RBD glycoprotein

Publications (2)

Publication Number Publication Date
CN113754739A CN113754739A (en) 2021-12-07
CN113754739B true CN113754739B (en) 2024-07-05

Family

ID=78783238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010493748.5A Active CN113754739B (en) 2020-06-03 2020-06-03 Preparation method and application of coronavirus S protein RBD glycoprotein

Country Status (2)

Country Link
CN (1) CN113754739B (en)
WO (1) WO2021244255A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113549560B (en) * 2020-04-24 2024-02-13 中国人民解放军军事科学院军事医学研究院 An engineered yeast construction method and strain for glycoprotein preparation
CN114380896A (en) * 2022-02-22 2022-04-22 河北佑仁生物科技有限公司 Expression method of novel coronavirus S protein
CN114717205A (en) * 2022-03-29 2022-07-08 中国人民解放军军事科学院军事医学研究院 Coronavirus RBDdm variant and application thereof
CN114478757B (en) * 2022-03-31 2022-07-05 深圳市人民医院 Nano antibody targeting new coronavirus, and preparation method and application thereof
CN114989308B (en) * 2022-05-12 2023-04-04 中国科学院微生物研究所 Novel coronavirus chimeric nucleic acid vaccine and use thereof
CN114957410A (en) * 2022-06-08 2022-08-30 华兰基因工程有限公司 Preparation method of surface protein receptor binding region of kappa strain 2019-nCoV
CN118909811A (en) * 2023-05-08 2024-11-08 中国科学院微生物研究所 Pichia pastoris engineering bacteria for producing novel coronavirus recombinant antigen, preparation method and application thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102648286A (en) * 2009-10-16 2012-08-22 默沙东公司 Method for producing a protein in Pichia pastoris that lacks detectable cross-binding activity to an antibody directed against a host cell antigen
CN105671109A (en) * 2014-11-20 2016-06-15 中国人民解放军军事医学科学院生物工程研究所 Method for preparing animal cell galactosylated modification influenza HA (Hemagglutinin) glycoprotein from glycosyl engineering yeast

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101195809B (en) * 2006-12-07 2011-07-27 中国人民解放军军事医学科学院生物工程研究所 Pichia pastoris strain with deletion of alpha-1,6-mannose transferase and uses

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Brian D. Quinlan et al. The SARS-CoV-2 receptor-binding domain elicits a potent neutralizing response without antibody-dependent enhancement. bioRxiv. 2020, 1-24. *
Chi-Pang Chuck et al. Expression of SARS-coronavirus spike glycoprotein in Pichia pastoris. Virus Genes. 2008, Vol. 38, 1-9. *
Pieter P Jacobs. Engineering complex-type N-glycosylation in Pichia pastoris using GlycoSwitch technology. Nature Protocols. 2009, Vol. 4 (No. 1), 58-70. *
Nagaraj G et al. PMT1 gene plays a major role in O-mannosylation of insulin precursor in Pichia pastoris. Protein Expression and Purification. Vol. 88 (No. 1), 164-171. *
Yan, R. et al. Chain E, SARS-coV-2 Receptor Binding Domain. PDB: 6M17_E. 2020, 1-2. *
Yang Xiaopeng (杨晓鹏) et al. Construction of a Pichia pastoris expression system for Man5GlcNAc2 mammalian mannose-type glycoproteins. Chinese Journal of Biotechnology (生物工程学报). 2011, Vol. 27 (No. 1), 108-117. *

Also Published As

Publication number Publication date
WO2021244255A1 (en) 2021-12-09
CN113754739A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN113754739B (en) Preparation method and application of coronavirus S protein RBD glycoprotein
KR101930961B1 (en) Method for increasing n-glycosylation site occupancy on therapeutic glycoproteins produced in pichia pastoris
US10513724B2 (en) Production of glycoproteins with mammalian-like N-glycans in filamentous fungi
JP6017439B2 (en) Fusion enzyme having N-acetylglucosaminyltransferase activity
CN101679991A (en) Genetically modified yeasts for the production of homogeneous glycoproteins
CA2916905A1 (en) Multiple proteases deficient filamentous fungal cells and methods of use thereof
KR20140015137A (en) Methods for the production of recombinant proteins with improved secretion efficiencies
KR20120084734A (en) Method for producing proteins in pichia pastoris that lack detectable cross binding activity to antibodies against host cell antigens
WO2013066765A1 (en) Mutation of tup1 in glycoengineered yeast
KR20140091017A (en) Methods for increasing n-glycan occupancy and reducing production of hybrid n-glycans in pichia pastoris strains lacking alg3 expression
CN113797326B (en) Vaccine for preventing diseases caused by coronaviruses
WO2015001049A1 (en) O-mannosyltransferase deficient filamentous fungal cells and methods of use thereof
US20230279337A1 (en) Method for constructing engineered yeast for glycoprotein preparation and strain thereof
KR20140114278A (en) Yeast strain for the production of proteins with modified o-glycosylation
CN113817040A (en) A kind of Echinococcus granulosus recombinant protein and preparation method thereof
US20210189002A1 (en) Recombinant organisms and methods for producing glycomolecules with high glycan occupancy
CN105671109B (en) Method for preparing influenza hemagglutinin glycoprotein with animal-cell glycosylation modification using glycoengineered yeast
KR100915670B1 (en) A novel YlMPO1 gene derived from Yarrowia lipolytica and a process for preparing a glycoprotein not being mannosylphosphorylated by using a mutated Yarrowia lipolytica in which YlMPO1 gene is disrupted
EP3788163A1 (en) Recombinant organisms and methods for producing glycomolecules with low sulfation
CN110904115B (en) Canine recombinant interferon alpha 7, preparation method and application thereof, expression vector containing canine recombinant interferon alpha 7 and host cell
KR20140134272A (en) Methods for reducing mannosyltransferase activity in yeast
WO2019089077A1 (en) Expression of modified glycoproteins and glycopeptides
JP2007517519A (en) Ultrafast selection method of protein fusion factors for recombinant protein production and protein fusion factors selected thereby
CN101402961A (en) Method for producing glucoprotein vaccine
KR101783677B1 (en) Method for mass producing dengue virus vaccine using plant viral expression system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20220228
Address after: 201210 13th floor, building 2, No. 36 and 58, Haiqu Road, pilot Free Trade Zone, Pudong New Area, Shanghai
Applicant after: SHANGHAI JUNSHI BIOSCIENCES Co.,Ltd.
Address before: 100071 Taiping Road, Haidian District, Beijing 27
Applicant before: ACADEMY OF MILITARY MEDICAL SCIENCES
GR01 Patent grant