HK40027079A - Modified cas9 protein and use thereof - Google Patents

Publication number
HK40027079A
Authority
HK
Hong Kong
Application number
HK62020016607.7A
Other languages
Chinese (zh)
Inventor
Osamu Nureki
Hiroshi NISHIMASU
Hisato Hirano
Original Assignee
The University Of Tokyo
Application filed by The University Of Tokyo

Description

Modified Cas9 protein and uses thereof
Technical Field
The present invention relates to modified Cas9 proteins with further expansion of the targetable region and uses thereof.
Background
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and Cas (CRISPR-associated) genes are known to together constitute an adaptive immune system that provides acquired resistance to invading foreign nucleic acids in bacteria and archaea. CRISPRs are often derived from phage or plasmid DNA and consist of short conserved repeats of 24-48 bp, between which unique variable DNA sequences of similar size, called spacers, are inserted. In addition, a group of genes encoding the Cas protein family is present in the vicinity of the repeat and spacer sequences.
In the CRISPR-Cas system, exogenous DNA is cleaved by the Cas protein family into fragments of about 30 bp and inserted into the CRISPR. The Cas1 and Cas2 proteins, members of the Cas protein family, recognize a nucleotide sequence called the proto-spacer adjacent motif (PAM) in the exogenous DNA, cut out the region upstream of it and insert that region into the CRISPR sequence of the host, which becomes the immunological memory of the bacterium. An RNA transcribed from the CRISPR sequence containing the immunological memory (referred to as pre-crRNA) pairs with a partially complementary RNA (trans-activating crRNA: tracrRNA) and is taken up into the Cas9 protein, another member of the Cas protein family. The pre-crRNA and tracrRNA taken up into Cas9 are cleaved by RNase III into small RNA fragments (CRISPR-RNAs: crRNAs) containing foreign sequences (guide sequences), forming a Cas9-crRNA-tracrRNA complex. The Cas9-crRNA-tracrRNA complex binds to foreign invasive DNA complementary to the crRNA, and the Cas9 protein, which is a DNA-cleaving enzyme (nuclease), cleaves the foreign invasive DNA, thereby suppressing its function and eliminating it.
The Cas9 protein recognizes the PAM sequence in foreign invasive DNA and cleaves the double-stranded DNA upstream thereof to form blunt ends. The length and nucleotide sequence of the PAM sequence vary depending on the bacterial species: Streptococcus pyogenes (S. pyogenes) recognizes the 3 bases "NGG". Streptococcus thermophilus (S. thermophilus) has 2 Cas9 proteins, which recognize the 5 to 6 bases "NGGNG" or "NNAGAA" as PAM sequences. Francisella novicida (F. novicida) recognizes the 3 bases "NGR". The number of bp upstream of the PAM sequence at which cleavage occurs also varies depending on the bacterial species, but most Cas9 orthologs, including that of S. pyogenes, cleave 3 bases upstream of the PAM sequence.
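The PAM rules described above can be illustrated with a minimal Python sketch. The function name `find_pam_sites`, the example sequence and the symbol table are illustrative assumptions and not part of the disclosure. The sketch scans only the strand it is given and places the cut 3 bases upstream of each PAM, as stated above for most Cas9 orthologs.

```python
import re

# Degenerate base symbols needed for the PAM patterns named in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def find_pam_sites(seq, pam):
    """Return (pam_start, cut_index) for each PAM match on the given strand.

    pam_start is the 0-based index of the first PAM base. cut_index is the
    index of the first base downstream of a blunt cut made 3 bases upstream
    of the PAM. Matches with fewer than 3 upstream bases are skipped.
    """
    pattern = "".join(IUPAC[b] for b in pam.upper())
    sites = []
    # A lookahead is used so that overlapping PAM occurrences are all found.
    for m in re.finditer(f"(?=({pattern}))", seq.upper()):
        start = m.start()
        if start >= 3:
            sites.append((start, start - 3))
    return sites


example = "ACGTACGTAGGTTCAGA"  # hypothetical sequence
print(find_pam_sites(example, "NGG"))
print(find_pam_sites(example, "NGR"))
```

Because "NGR" is less restrictive than "NGG", every site found for "NGG" is also found for "NGR".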
In recent years, techniques that apply the bacterial CRISPR-Cas system to genome editing have been actively developed. crRNA and tracrRNA are fused and expressed as a tracrRNA-crRNA chimera (hereinafter referred to as guide RNA (gRNA)). The gRNA recruits the nuclease (RNA-guided nuclease: RGN), which cuts genomic DNA at the target site.
The CRISPR-Cas system includes types I, II and III, but the type II CRISPR-Cas system is used almost exclusively in genome editing, and Cas9 protein is used as the RGN in type II. Since the Cas9 protein derived from S. pyogenes recognizes the 3 bases NGG as a PAM sequence, it can cleave upstream of any site where 2 guanines are adjacent.
With the method using the CRISPR-Cas system, it suffices to synthesize a short gRNA homologous to the target DNA sequence, and genome editing can be performed using the Cas9 protein as a single protein. Therefore, genome editing can be performed easily and quickly without synthesizing a different large protein for each DNA sequence, as is required for the conventionally used zinc finger nucleases (ZFNs) or transcription activator-like effector nucleases (TALENs).
Patent document 1 discloses a genome editing technique utilizing the CRISPR-Cas system derived from S. pyogenes.
Patent document 2 discloses a genome editing technique utilizing the CRISPR-Cas system derived from S. thermophilus. Patent document 2 further discloses that the D31A or N891A mutant of Cas9 protein functions as a nickase, that is, a DNA-cleaving enzyme that nicks only one DNA strand. Furthermore, the homologous recombination efficiency is comparable to that of the wild-type Cas9 protein, while the incidence of non-homologous end joining, in which mutations such as insertions and deletions are likely to arise during repair after DNA cleavage, is low.
Non-patent document 1 discloses a double nickase system, based on the CRISPR-Cas system derived from S. pyogenes, that uses 2 D10A mutants of Cas9 protein and a pair of target-specific guide RNAs, each forming a complex with a D10A mutant. Each complex of a D10A mutant of Cas9 protein and a target-specific guide RNA makes only 1 nick, on the DNA strand complementary to the guide RNA. The pair of guide RNAs is offset by about 20 bases and recognizes only target sequences located on opposite strands of the target DNA. The 2 nicks made by the complexes of the D10A mutants and the target-specific guide RNAs mimic a DNA double-strand break (DSB), and using the pair of guide RNAs improves the specificity of Cas9 protein-mediated gene editing while maintaining a high level of efficiency.
Patent document 3 discloses various mutants of Cas9 protein derived from S. pyogenes, and patent document 4 discloses various mutants of Cas9 protein derived from F. novicida.
Documents of the prior art
Patent documents
Patent document 1: WO 2014/093661;
Patent document 2: Japanese laid-open patent publication No. 2015-510778;
Patent document 3: WO 2016/141224;
Patent document 4: WO 2017/010543.
Non-patent documents
Non-patent document 1: Ran, F. A. et al., Double Nicking by RNA-Guided CRISPR Cas9 for Enhanced Genome Editing Specificity, Cell, vol. 154, p. 1380-1389, 2013.
Disclosure of Invention
Problems to be solved by the invention
The PAM sequence recognizable by the S. pyogenes-derived Cas9 (also referred to as SpCas9 in the present specification) protein of patent document 1 is the 2 bases of "NGG (N is an arbitrary base)". In addition, since the SpCas9 protein is used in the double-nickase system disclosed in non-patent document 1, 2 recognizable PAM sequences, 1 on each of the sense strand and the antisense strand, are required within the target sequence, which further limits the target sequences that can be edited.
Thus, in the existing Cas9 protein, since there is a limit to recognizable PAM sequences, there is a problem in that editable target sequences are limited.
An object of the present invention is to provide a modified Cas9 protein in which the restriction on target sequences is relaxed while the ability to bind to guide RNA is maintained, and uses thereof.
Means for solving the problems
The present inventors focused on the SpCas9 protein as the Cas9 protein and conducted intensive studies to solve the above problems. As a result, by substituting amino acids at predetermined positions of the SpCas9 protein with specific amino acids (introducing mutations), the present inventors succeeded in converting the conventional 2-base PAM sequence NGG (N is an arbitrary base) into the 1-base sequence NG while maintaining the ability to bind to guide RNA, and thus completed the present invention.
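The practical effect of relaxing the PAM from NGG to NG can be illustrated by counting candidate PAM positions on both strands of a sequence. The following is a minimal sketch; the names `reverse_complement` and `count_pam_sites` and the `min_upstream` parameter are assumptions introduced for illustration only, and real target selection involves further criteria that are not modeled here.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def count_pam_sites(seq, fixed_bases, min_upstream=20):
    """Count PAMs of the form N + fixed_bases on both strands.

    fixed_bases is "GG" for the NGG PAM and "G" for the NG PAM.
    min_upstream is the number of bases required upstream of N so that a
    guide-complementary region of that length fits.
    """
    total = 0
    for strand in (seq.upper(), reverse_complement(seq)):
        # i is the index of the arbitrary base N; the fixed bases follow it.
        for i in range(min_upstream, len(strand) - len(fixed_bases)):
            if strand[i + 1 : i + 1 + len(fixed_bases)] == fixed_bases:
                total += 1
    return total


# Hypothetical short sequence, with min_upstream=0 to keep the example small.
print(count_pam_sites("AGGTCG", "GG", min_upstream=0))
print(count_pam_sites("AGGTCG", "G", min_upstream=0))
```

Every NGG site is also an NG site, so the NG count is never smaller than the NGG count for the same sequence.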
In the present specification, the Cas9 protein before introduction of mutation is sometimes referred to as wild-type Cas9 protein, and the Cas9 protein after introduction of mutation is sometimes referred to as modified Cas9 protein or mutant Cas9 protein.
Namely, the present invention is as follows.
[1] A protein which comprises a sequence having an amino acid sequence in which 1 amino acid selected from the group consisting of alanine, glycine, cysteine, isoleucine, leucine, methionine, phenylalanine, proline, valine, threonine, asparagine and aspartic acid is substituted for arginine at position 1335 in the amino acid sequence represented by SEQ ID NO. 1, and which has a binding ability to a guide RNA.
[2] The protein according to [1] above, wherein the amino acid sequence represented by SEQ ID NO. 1 further has a mutation at position 1219.
[3] The protein according to [1] or [2], wherein the amino acid sequence represented by SEQ ID NO. 1 further has a mutation at position 1322.
[4] A protein consisting of a sequence comprising an amino acid sequence in which arginine at position 1335 in the amino acid sequence represented by SEQ ID NO. 1 is substituted with 1 amino acid selected from the group consisting of alanine, glycine, cysteine, isoleucine, leucine, methionine, phenylalanine, proline, valine, threonine, asparagine and aspartic acid, and further having a mutation at position 1219, and having a binding ability to a guide RNA.
[5] A protein consisting of a sequence comprising an amino acid sequence in which arginine at position 1335 in the amino acid sequence represented by SEQ ID NO. 1 is substituted with 1 amino acid selected from the group consisting of alanine, glycine, cysteine, isoleucine, leucine, methionine, phenylalanine, proline, valine, threonine, asparagine and aspartic acid, and further having a mutation at position 1322, and having a binding ability to a guide RNA.
[6] The protein according to any one of the above [1] to [5], wherein the substitution of arginine at position 1335 is a substitution to alanine.
[7] The protein according to any one of the above [1] to [5], wherein the substitution of arginine at position 1335 is isoleucine, methionine, threonine or valine.
[8] The protein according to the above [2] or [4], wherein the mutation at position 1219 is a substitution of glutamic acid with phenylalanine.
[9] The protein according to [3] or [5], wherein the mutation at position 1322 is a substitution of alanine with arginine, histidine or lysine.
[10] The protein according to [9], wherein the mutation at position 1322 is a substitution of alanine with arginine.
[11] The protein according to any one of the above [1] to [10], wherein the amino acid sequence represented by SEQ ID NO. 1 further has a mutation at at least 1 position selected from the group consisting of positions 1111, 1135, 1218 and 1337.
[12] The protein according to [11] above, wherein the amino acid sequence represented by SEQ ID NO. 1 further has mutations at at least 2 positions selected from the group consisting of positions 1111, 1135, 1218 and 1337.
[13] The protein according to [11] above, wherein the amino acid sequence represented by SEQ ID NO. 1 further has mutations at at least 3 positions selected from the group consisting of positions 1111, 1135, 1218 and 1337.
[14] The protein according to [11] above, wherein the amino acid sequence represented by SEQ ID NO. 1 further has mutations at positions 1111, 1135, 1218, and 1337.
[15] The protein according to any one of the above [11] to [14], wherein
the mutation at position 1111 is a substitution of leucine with arginine, histidine or lysine;
the mutation at position 1135 is a substitution of aspartic acid with valine;
the mutation at position 1218 is a substitution of glycine with arginine, histidine or lysine; and
the mutation at position 1337 is a substitution of threonine with arginine, histidine or lysine.
[16] The protein according to any one of the above [1] to [15], wherein the protein has a homology of 80% or more at a site other than a position at which mutation is performed in SEQ ID NO. 1.
[17] The protein according to any one of the above [1] to [15], wherein 1 to several amino acids are substituted, deleted, inserted and/or added at positions other than the positions of SEQ ID NO. 1 at which mutation is performed.
[18] The protein according to any one of the above [1] to [17], which has an RNA-inducible DNA endonuclease activity.
[19] The protein according to any one of the above [1] to [16], further comprising a mutation that deletes a part or all of the nuclease activity in the amino acid sequence represented by SEQ ID NO. 1.
[20] The protein according to [19], wherein the mutation that lacks a part or all of the nuclease activity is a mutation at (i) at least 1 position selected from the group consisting of positions 10, 762, 839, 983 and 986 or a position corresponding thereto, and/or (ii) a position selected from the group consisting of positions 840 and 863 or a position corresponding thereto, in the amino acid sequence represented by SEQ ID NO. 1.
[21] The protein according to [20] above, wherein the aspartic acid at position 10 is substituted with alanine or asparagine; or
Histidine at position 840 is substituted with alanine, asparagine or tyrosine.
[22] The protein according to any one of [19] to [21], which is linked to a transcription regulator protein or domain.
[23] The protein according to [22], wherein the transcription regulatory factor is a transcription activator.
[24] The protein according to [22], wherein the transcription regulatory factor is a transcription silencer or a transcription repressor.
[25] A nucleic acid encoding the protein according to any one of [1] to [24].
[26] A protein-RNA complex comprising the protein according to any one of the above [1] to [24] and a guide RNA comprising a polynucleotide consisting of a nucleotide sequence complementary to the nucleotide sequence from 1 base upstream to 20 bases or more and 24 bases or less upstream of the PAM (Proto-spacer Adjacent Motif) sequence in a target double-stranded polynucleotide.
[27] A method for site-specifically modifying a target double-stranded polynucleotide, the method comprising the steps of:
a step of mixing and culturing a target double-stranded polynucleotide, a protein and a guide RNA; and
a step of modifying the target double-stranded polynucleotide at a binding site located upstream of the PAM sequence by the protein,
the target double-stranded polynucleotide has a PAM sequence consisting of NG (N is an arbitrary base, G is guanine),
the protein is the protein according to any one of the above [1] to [24], and
the guide RNA includes a polynucleotide composed of a nucleotide sequence complementary to the nucleotide sequence from 1 base upstream to 20 bases or more and 24 bases or less upstream of the PAM sequence in the target double-stranded polynucleotide.
[28] The method according to [27], wherein the modification is site-specific cleavage of the target double-stranded polynucleotide.
[29] The method according to [27], wherein the modification is substitution, deletion and/or addition of 1 or more site-specific nucleotides in the target double-stranded polynucleotide.
[30] A method of increasing expression of a target gene in a cell, the method comprising: expressing the protein of [23] and 1 or more guide RNAs against the target gene in the cell.
[31] A method of reducing expression of a target gene in a cell, the method comprising: expressing the protein of [24] and 1 or more guide RNAs against the target gene in the cell.
[32] The method according to [30] or [31], wherein the cell is a eukaryotic cell.
[33] The method according to [30] or [31], wherein the cell is a yeast cell, a plant cell or an animal cell.
Effects of the invention
According to the present invention, a Cas9 protein that recognizes a broader range of PAM sequences while maintaining its ability to bind to guide RNA can be obtained. In addition, a simple and rapid technique for site-specific genome editing of a target sequence using this Cas9 protein can be provided.
Drawings
FIG. 1A is a view showing the results of agarose gel electrophoresis in the DNA cleavage activity assay in example 1. "TGT" was used as PAM sequence and EcoRI was used as restriction enzyme.
FIG. 1B is a view showing the results of agarose gel electrophoresis in the DNA cleavage activity assay in example 1. "TGG" was used as PAM sequence and HindIII was used as restriction enzyme.
FIG. 1C is a view showing the results of agarose gel electrophoresis in the DNA cleavage activity assay in example 1. "TGNA" was used as PAM sequence and BamHI was used as restriction enzyme.
FIG. 1D is a view showing the results of agarose gel electrophoresis in the DNA cleavage activity assay in example 1. "TGN" was used as PAM sequence and BamHI was used as restriction enzyme.
FIG. 2 is a view showing the results of agarose gel electrophoresis in the DNA cleavage activity assay of example 2.
FIG. 3 is a graph showing the results of the DNA cleavage activity assay in example 3. "TGA" was used as PAM sequence and BamHI was used as restriction enzyme.
FIG. 4 is a graph showing the results of the DNA cleavage activity assay in example 4.
FIG. 5 is a graph showing the results of the DNA cleavage activity assay in example 5.
Detailed Description
The present invention will be explained below. Terms used in the present specification have meanings commonly used in the field unless otherwise specified.
< Cas9 protein with widespread recognition of PAM sequence >
The protein of the present embodiment is a Cas9 protein in which the PAM sequence is recognized widely while maintaining the binding force to the guide RNA. The protein according to the present embodiment can provide a technique for performing site-specific genome editing on a target sequence in a simple and rapid manner.
In the present specification, the "guide RNA" refers to an RNA that mimics the hairpin structure of a tracrRNA-crRNA, and comprises, in its 5' -terminal region, a polynucleotide composed of a nucleotide sequence complementary to a nucleotide sequence from 1 base upstream to preferably 20 bases or more and 24 bases or less, more preferably 22 bases or more and 24 bases or less, of the PAM sequence in the target double-stranded polynucleotide. The guide RNA may further comprise 1 or more polynucleotides each having a nucleotide sequence capable of forming a hairpin structure, the nucleotide sequence being a nucleotide sequence that is not complementary to the target double-stranded polynucleotide and being arranged so as to be symmetrically complementary to each other about a single point.
The guide RNA has the function of binding to the mutant Cas9 protein of the invention to guide the protein to the target DNA. The guide RNA has a sequence complementary to the target DNA at its 5' end, and binds to the target DNA via the complementary sequence, thereby guiding the mutant Cas9 protein of the present invention to the target DNA. In case the mutant Cas9 protein functions as a DNA endonuclease, the DNA can be cleaved at the site where the target DNA is present, e.g., the function of the target DNA can be specifically lost.
The guide RNA is designed and prepared based on the sequence information of the target DNA to be cleaved or modified. Specifically, sequences as used in examples can be enumerated.
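As an illustration of how a guide sequence might be derived from target sequence information, the following minimal Python sketch lists candidate 5'-terminal guide sequences for every 5'-NG-3' PAM on one strand. It relies on the assumption that the input is the PAM-containing strand written 5' to 3', so that the region immediately upstream of the PAM, written as RNA, is complementary to the opposite strand of the target double-stranded polynucleotide. The function name `guide_spacers` is hypothetical, and the hairpin-forming portion of the guide RNA is not modeled.

```python
def guide_spacers(pam_strand, length=20):
    """List (pam_index, spacer_rna) for each 5'-NG-3' PAM on the given strand.

    pam_index is the 0-based index of the arbitrary base N of the PAM.
    spacer_rna covers the `length` bases immediately upstream of the PAM,
    written as RNA (T replaced by U).
    """
    if not 20 <= length <= 24:
        raise ValueError("length must be 20 bases or more and 24 bases or less")
    s = pam_strand.upper()
    spacers = []
    for i in range(length, len(s) - 1):
        if s[i + 1] == "G":  # s[i] is N, s[i + 1] is the required guanine
            protospacer = s[i - length : i]
            spacers.append((i, protospacer.replace("T", "U")))
    return spacers


# Hypothetical 23-base target: 20 bases followed by the PAM "CG" and one base.
print(guide_spacers("ACGT" * 5 + "CGT"))
```

Passing `length=22`, for example, corresponds to the more preferred range of 22 to 24 bases mentioned above.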
In the present specification, "endonuclease" refers to an enzyme that cleaves the middle position of a nucleotide chain. Therefore, the Cas9 protein having endonuclease activity that broadens recognition of the PAM sequence of the present embodiment has an enzymatic activity that cleaves the middle position of the DNA strand by guide RNA induction.
In the present specification, "polypeptide", "peptide" and "protein" refer to a polymer of amino acid residues and are used interchangeably. In addition, the following amino acid polymers are also referred to: 1 or more amino acids are chemical analogs or modified derivatives of the naturally occurring corresponding amino acids.
In the present specification, the term "sequence" refers to a nucleotide sequence of any length, which is a deoxyribonucleotide or a ribonucleotide, and which is linear, cyclic or branched and is single-stranded or double-stranded.
In the present specification, the "PAM sequence" refers to a sequence that is present in a target double-stranded polynucleotide and is recognized by a Cas9 protein, and the length or nucleotide sequence of the PAM sequence differs depending on the bacterial species. The PAM sequence recognizable by the Cas9 protein of the present embodiment, which has broadened PAM recognition, can be represented by "5'-NG-3'".
In the present specification, "N" refers to any one of bases selected from adenine, cytosine, thymine and guanine, "A" refers to adenine, "G" refers to guanine, "C" refers to cytosine, "T" refers to thymine, "R" refers to a base having a purine skeleton (adenine or guanine), and "Y" refers to a base having a pyrimidine skeleton (cytosine or thymine).
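The base symbols defined above can be expanded mechanically into the concrete sequences they stand for. The following minimal Python sketch does this; the table name `SYMBOLS` and the function name `expand` are illustrative assumptions.

```python
from itertools import product

# Base symbols as defined in the text: N is any base, R is a purine,
# Y is a pyrimidine.
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def expand(pattern):
    """Return every concrete base sequence represented by a symbol pattern."""
    choices = [SYMBOLS[symbol] for symbol in pattern.upper()]
    return ["".join(bases) for bases in product(*choices)]


print(expand("NG"))        # the 4 sequences covered by 5'-NG-3'
print(len(expand("NGR")))  # number of sequences covered by 5'-NGR-3'
```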
In the present specification, "polynucleotide" refers to a deoxyribonucleotide or ribonucleotide polymer in either linear or circular conformation and in either single-or double-stranded form, with no limiting explanation as to the length of the polymer. In addition, known analogs of natural nucleotides, as well as nucleotides modified in at least one of the base moiety, sugar moiety, and phosphate moiety (e.g., phosphorothioate backbones) are also included. Typically, analogs of a particular nucleotide have the same base pairing specificity, e.g., an analog of A base pairs with T.
In one embodiment, the present invention provides a protein (scheme 1) consisting of an amino acid sequence having a mutation at position 1335 in the amino acid sequence represented by SEQ ID NO:1, and having a binding ability to a guide RNA. In addition, the protein of scheme 1 also has RNA-inducible DNA endonuclease activity.
SEQ ID NO: 1 is the full-length amino acid sequence of the SpCas9 protein. The sequence of the PAM sequence recognition site in the SpCas9 protein is an amino acid sequence consisting of 271 residues from position 1097 to position 1368 of SEQ ID NO: 1.
Specifically, the mutation at position 1335 of SEQ ID NO. 1 means that arginine at position 1335 is substituted with 1 amino acid selected from the group consisting of alanine, glycine, cysteine, isoleucine, leucine, methionine, phenylalanine, proline, threonine, valine, asparagine and aspartic acid. Substitution to alanine is preferred. In addition, another preferred mutation at position 1335 is a substitution to isoleucine, methionine, threonine or valine.
The mutation at position 1335 eliminates the hydrogen bond with the guanine at position 3 of the PAM sequence (5'-NG"G"-3'), so that the protein can recognize a broader range of PAM sequences.
In another embodiment of the present invention, the present invention provides a protein (scheme 2) which further has a mutation at position 1219 in addition to the mutation of scheme 1 above and has a binding ability to a guide RNA. In addition, the proteins of scheme 2 also have RNA-inducible DNA endonuclease activity.
Specifically, the mutation at position 1219 means that glutamic acid at position 1219 is substituted with phenylalanine.
The mutation at position 1219 can contribute to an increase (maintenance) of the expression rate of the RNA-inducible DNA endonuclease activity.
In yet another embodiment of the present invention, the present invention provides a protein (scheme 3) further having a mutation at position 1322 in addition to the mutation of scheme 1 or 2 described above, and having a binding ability to a guide RNA. In addition, the protein of scheme 3 also has RNA-inducible DNA endonuclease activity.
Specifically, the mutation at position 1322 means that alanine at position 1322 is substituted with arginine, histidine or lysine. Preferably to arginine.
The mutation at position 1322 may contribute to the activity improvement (activity maintenance) of the RNA-induced DNA endonuclease activity.
In a further embodiment of the present invention, the present invention provides a protein (scheme 4) which, in addition to the mutations of scheme 1, 2 or 3 described above, further has mutations at least 1, preferably 2, more preferably 3, particularly preferably all 4 positions selected from among positions 1111, 1135, 1218 and 1337 and also has a binding ability to a guide RNA. The protein of scheme 4 has RNA-inducible DNA endonuclease activity.
Specifically, the 1111 mutation means that the 1111 leucine is substituted with arginine, histidine or lysine. Preferably to arginine.
Specifically, the mutation at position 1135 means that aspartic acid at position 1135 is substituted with valine.
Specifically, the 1218 mutation is a substitution of the 1218 glycine with arginine, histidine, or lysine. Preferably to arginine.
Specifically, the 1337 mutation is a substitution of threonine to arginine, histidine or lysine at position 1337. Preferably to arginine.
In yet another embodiment of the present invention, the present invention provides a protein (scheme 5) which further has a mutation at (i) at least 1 position selected from the group consisting of positions 10, 762, 839, 983, 986 and/or (ii) at a position selected from the group consisting of positions 840 and 863 and has a binding ability to a guide RNA, in addition to the mutation of scheme 1, 2, 3 or 4 described above.
Specifically, the mutation at position 10 means that aspartic acid at position 10 is substituted with alanine or asparagine.
Specifically, the 762 mutation means that glutamic acid at 762 is substituted with glutamine.
Specifically, the mutation at position 839 means that aspartic acid at position 839 is substituted with alanine or asparagine.
Specifically, the mutation at position 983 means that histidine at position 983 is substituted with asparagine or tyrosine.
Specifically, the mutation at position 986 means that aspartic acid at position 986 is substituted with asparagine.
Specifically, the mutation at position 840 is a substitution of histidine at position 840 to alanine, asparagine or tyrosine.
Specifically, the mutation at position 863 means that asparagine at position 863 is substituted with aspartic acid, serine or histidine.
As scheme 5, a protein in which aspartic acid at position 10 is substituted with alanine or asparagine, or histidine at position 840 is substituted with alanine, asparagine or tyrosine, is preferable.
The protein of scheme 5 having the mutation of (i) or the mutation of (ii) has a nickase activity.
The protein of scheme 5 having the mutation of (i) and the mutation of (ii) is bound to guide RNA and transported to the target DNA, but loses endonuclease activity.
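The relationship between the mutated positions of scheme 5 and the resulting activity, as stated above, can be summarized in a minimal Python sketch. The names `GROUP_I`, `GROUP_II` and `predicted_activity` and the returned labels are illustrative assumptions. The sketch checks positions only and does not verify that the substituted amino acid is one of those listed above.

```python
GROUP_I = {10, 762, 839, 983, 986}  # positions of group (i) in the text
GROUP_II = {840, 863}               # positions of group (ii) in the text


def predicted_activity(mutated_positions):
    """Classify a variant by which scheme 5 position groups carry a mutation."""
    positions = set(mutated_positions)
    has_i = bool(positions & GROUP_I)
    has_ii = bool(positions & GROUP_II)
    if has_i and has_ii:
        # Still guided to the target DNA, but endonuclease activity is lost.
        return "no endonuclease activity"
    if has_i or has_ii:
        return "nickase"
    return "endonuclease"


print(predicted_activity([1335]))           # PAM-related mutation only
print(predicted_activity([10, 1335]))       # group (i) only
print(predicted_activity([10, 840, 1335]))  # groups (i) and (ii)
```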
In still another embodiment of the present invention, the present invention provides a protein functionally equivalent to the proteins according to schemes 1 to 5 (scheme 6). In order to be functionally equivalent to the proteins of schemes 1 to 5, the amino acid sequence represented by SEQ ID NO. 1 has a sequence homology of 80% or more at a site other than the site mutated in schemes 1 to 5, and has a binding ability to a guide RNA. In the case where the amino acid is increased or decreased by mutation, the "site other than the position at which mutation is performed" may be understood as "a site other than the position corresponding to the position at which mutation is performed". The homology is preferably 80% or more, more preferably 85% or more, still more preferably 90% or more, particularly preferably 95% or more, and most preferably 99% or more. Amino acid sequence homology can be determined by methods known per se. For example, the amino acid sequence homology (%) can be determined by using a program (for example, BLAST, FASTA, etc.) commonly used in the art according to the initial setting. On the other hand, the homology (%) can be determined using any algorithm known in the art, for example, the algorithms of Needleman et al (1970) (J. mol. biol. 48: 444-. The algorithm of Needleman et al is incorporated into the GAP program of the GCG package (available through www.gcg.com), and homology (%) can be determined, for example, by using BLOSUM 62 matrix or PAM250 matrix, and GAP weights (GAP weight) of 16, 14, 12, 10, 8, 6, or 4 and length weights (length weight) of any of 1, 2, 3, 4, 5, or 6. In addition, the algorithms of Myers and Miller are incorporated into the ALIGN program as part of the GCG sequence alignment software package. When the ALIGN program is used for comparison of amino acid sequences, for example, PAM120 weight residue table (weight residual table), gap length penalty (gap length penalty) 12, and gap penalty (gap penalty) 4 can be used.
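To make the phrase "homology at a site other than the mutated site" concrete, the following minimal Python sketch computes a percentage over two sequences while skipping the mutated positions. It is a deliberately simplified stand-in: it computes plain percent identity over sequences assumed to be already aligned and of equal length, and it is not the BLAST, FASTA, GAP or ALIGN calculation referred to above. The function name and the short example sequences are hypothetical.

```python
def identity_outside_mutations(reference, candidate, mutated_positions):
    """Percent identity over all positions except the mutated ones.

    Positions are 1-based, matching the residue numbering used in the text.
    Both sequences must already be aligned and have the same length.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be aligned to the same length")
    skip = set(mutated_positions)
    compared = 0
    matched = 0
    for pos, (ref_aa, cand_aa) in enumerate(zip(reference, candidate), start=1):
        if pos in skip:
            continue  # mutated positions are excluded from the comparison
        compared += 1
        if ref_aa == cand_aa:
            matched += 1
    if compared == 0:
        raise ValueError("no positions left to compare")
    return 100.0 * matched / compared


# Hypothetical 10-residue sequences differing at positions 4 and 10.
print(identity_outside_mutations("MDKKYSIGLD", "MDKRYSIGLA", {10}))
```

Under this simplified measure, a candidate would satisfy the 80% criterion of scheme 6 if the returned value is 80 or more.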
The following proteins (scheme 7) are provided as proteins functionally equivalent to the proteins of schemes 1 to 5: the protein has 1 to a plurality of amino acids substituted, deleted, inserted and/or added at positions other than the positions mutated in the above schemes 1 to 5 in the amino acid sequence represented by SEQ ID NO. 1, and has a binding ability to a guide RNA. When the amino acid is increased or decreased by mutation, the "site other than the position at which mutation is performed" is understood to mean "a site other than the position corresponding to the position at which mutation is performed".
Examples of the method for artificially carrying out "substitution, deletion, insertion and/or addition of amino acid" include: a DNA encoding a predetermined amino acid sequence is subjected to a conventional method of introducing a site-specific mutation and then expressing the DNA by a conventional method. Here, examples of the site-specific mutagenesis method include: a method using amber mutation (gapped duplex, Nucleic Acids Res., 12, 9441-9456 (1984)), a method using PCR using a primer for mutation introduction, and the like.
The number of amino acids modified as described above is at least 1 residue, specifically 1 or several, or more than that. Among the above substitutions, deletions, insertions and additions, amino acid substitutions are particularly preferred. The substitution is more preferably a substitution with an amino acid having similar properties in terms of hydrophobicity, charge, pK, steric characteristics and the like. Examples of such substitutions include those within the following groups: i) glycine, alanine; ii) valine, isoleucine, leucine; iii) aspartic acid, glutamic acid, asparagine, glutamine; iv) serine, threonine; v) lysine, arginine; vi) phenylalanine, tyrosine.
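The substitution groups i) to vi) listed above can be encoded directly, which allows a simple check of whether a given substitution stays within one group. The following is a minimal Python sketch using one-letter amino acid symbols; the names `SIMILAR_GROUPS` and `is_within_group` are illustrative assumptions.

```python
# Groups i) to vi) from the text, written with one-letter amino acid symbols.
SIMILAR_GROUPS = [
    set("GA"),    # i) glycine, alanine
    set("VIL"),   # ii) valine, isoleucine, leucine
    set("DENQ"),  # iii) aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # iv) serine, threonine
    set("KR"),    # v) lysine, arginine
    set("FY"),    # vi) phenylalanine, tyrosine
]


def is_within_group(before, after):
    """Return True if the substitution stays within one of the listed groups."""
    if before == after:
        return False  # not a substitution
    return any(before in group and after in group for group in SIMILAR_GROUPS)


print(is_within_group("K", "R"))  # lysine to arginine
print(is_within_group("R", "A"))  # arginine to alanine
```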
As the Cas9 protein in which recognition of the PAM sequence of the present invention becomes broad, proteins comprising the following amino acid sequences are preferably exemplified: in SEQ ID NO:1, the amino acid sequence (SEQ ID NO:18) was obtained by mutating arginine at position 1335 to alanine (R1335A), leucine at position 1111 to arginine (L1111R), aspartic acid at position 1135 to valine (D1135V), glycine at position 1218 to arginine (G1218R), glutamic acid at position 1219 to phenylalanine (E1219F), alanine at position 1322 to arginine (A1322R), and threonine at position 1337 to arginine (T1337R).
In addition, as the Cas9 protein in which recognition of the PAM sequence of the present invention becomes broad, a protein comprising the following amino acid sequence is also preferable: an amino acid sequence obtained by, in SEQ ID NO: 1, mutating arginine at position 1335 to isoleucine (R1335I), methionine (R1335M), threonine (R1335T) or valine (R1335V) (more preferably R1335M or R1335V), leucine at position 1111 to arginine (L1111R), aspartic acid at position 1135 to valine (D1135V), glycine at position 1218 to arginine (G1218R), glutamic acid at position 1219 to phenylalanine (E1219F), alanine at position 1322 to arginine (A1322R), and threonine at position 1337 to arginine (T1337R). This protein corresponds to a protein comprising an amino acid sequence in which alanine at position 1335 of SEQ ID NO: 18 is mutated to isoleucine, methionine, threonine or valine, respectively.
In the present specification, in the notation for a substitution, the letter shown to the left of the number indicating the position of the amino acid residue at the substitution site is the one-letter symbol of the amino acid before substitution, and the letter shown to the right is the one-letter symbol of the amino acid after substitution.
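The notation can be illustrated with a short parser. This is a minimal sketch, assuming that labels use uppercase one-letter symbols and that multiple substitutions are joined with "/" as in the mutant names used in this specification; `Substitution`, `parse_substitution`, and `parse_variant` are hypothetical names chosen for illustration.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    before: str    # one-letter symbol of the amino acid before substitution
    position: int  # residue number of the substitution site
    after: str     # one-letter symbol of the amino acid after substitution


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a single label such as 'R1335A'."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    before, position, after = match.groups()
    return Substitution(before, int(position), after)


def parse_variant(labels: str) -> List[Substitution]:
    """Parse several labels joined with '/', such as 'R1335A/G1218R'."""
    return [parse_substitution(part) for part in labels.split("/")]
```

With this sketch, `parse_substitution("R1335A")` yields arginine (R) as the residue before substitution, 1335 as the position, and alanine (A) as the residue after substitution.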
The Cas9 protein with broad PAM recognition in this embodiment can be produced, for example, by the following method. First, a host is transformed with a vector comprising a nucleic acid encoding the above Cas9 protein with broad PAM recognition. Then, the host is cultured to express the protein. Conditions such as the composition of the medium, the temperature and time of culture, and the addition of an inducer can be determined by those skilled in the art according to known methods so that the transformant grows and produces the protein with high efficiency. In addition, for example, when an antibiotic resistance gene is integrated into the expression vector as a selection marker, a transformant can be selected by adding the antibiotic to the medium. Then, by purifying the protein expressed by the host by an appropriate method known per se, the Cas9 protein with broad PAM recognition is obtained.
The host is not particularly limited, and examples thereof include: animal cells, plant cells, insect cells, or microorganisms such as Escherichia coli, Bacillus subtilis, and yeast.
< Cas9 protein-guide RNA Complex in which recognition of PAM sequence becomes extensive >
In one embodiment, the present invention provides a protein-RNA complex comprising: the protein described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive >, and a guide RNA comprising a polynucleotide composed of a nucleotide sequence complementary to the nucleotide sequence extending from 1 base upstream to 20 bases or more and 24 bases or less upstream of a PAM (Proto-spacer Adjacent Motif) sequence in a target double-stranded polynucleotide.
According to the protein-RNA complex of the present embodiment, recognition of the PAM sequence is broadened, and the target double-stranded polynucleotide can be edited easily, rapidly, and site-specifically with respect to the target sequence.
The protein and the guide RNA can be mixed under mild conditions, in vitro or in vivo, to form a protein-RNA complex. Mild conditions mean a temperature and a pH at which the protein is neither decomposed nor denatured; the temperature is preferably 4 ℃ or higher and 40 ℃ or lower, and the pH is preferably 4 or higher and 10 or lower.
The time for mixing and incubating the protein and the guide RNA is preferably 0.5 hours or more and 1 hour or less. The complex formed by the protein and the guide RNA is stable and remains stable even when left standing at room temperature for several hours.
< CRISPR-Cas vector System >
In one embodiment, the present invention provides a CRISPR-Cas vector system comprising a 1st vector comprising a gene encoding the protein described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive >, and a 2nd vector comprising a guide RNA comprising a polynucleotide composed of a nucleotide sequence complementary to the nucleotide sequence extending from 1 base upstream to 20 bases or more and 24 bases or less upstream of the PAM sequence in a target double-stranded polynucleotide.
According to the CRISPR-Cas vector system of the present embodiment, recognition of the PAM sequence is broadened, and the target double-stranded polynucleotide can be edited easily, rapidly, and site-specifically with respect to the target sequence.
The guide RNA may be appropriately designed so long as it contains, in its 5'-terminal region, a polynucleotide composed of a nucleotide sequence complementary to the nucleotide sequence extending from 1 base upstream of the PAM sequence in the target double-stranded polynucleotide to preferably 20 bases or more and 24 bases or less, more preferably 22 bases or more and 24 bases or less, upstream. The guide RNA may further comprise 1 or more polynucleotides that are composed of a nucleotide sequence not complementary to the target double-stranded polynucleotide, are arranged so as to be symmetrically complementary about a single point, and are capable of forming a hairpin structure.
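The design rule for the 5'-terminal region of the guide RNA can be sketched as follows. This is an illustrative sketch only, under two stated assumptions: the input is the 5'→3' sequence of the strand that carries the PAM, and the guide is taken to have the same sequence as the stretch immediately upstream of the PAM on that strand (with U in place of T), which makes it complementary to the opposite strand. The function name `design_guide_spacer` and its parameters are hypothetical.

```python
def design_guide_spacer(pam_strand: str, pam_start: int, length: int = 20) -> str:
    """Return the RNA sequence for the 5'-terminal region of a guide RNA.

    pam_strand: 5'->3' DNA sequence of the strand carrying the PAM (assumption).
    pam_start:  0-based index of the N in the NG PAM.
    length:     number of bases taken upstream of the PAM (20 to 24 per the text).
    """
    seq = pam_strand.upper()
    if not 20 <= length <= 24:
        raise ValueError("length must be 20 or more and 24 or less")
    if pam_start < length:
        raise ValueError("not enough sequence upstream of the PAM")
    if pam_start + 1 >= len(seq) or seq[pam_start + 1] != "G":
        raise ValueError("no NG PAM at the given position")
    # Bases from `length` bases upstream to 1 base upstream of the PAM.
    protospacer = seq[pam_start - length:pam_start]
    # Same sequence as the PAM strand with U for T, hence complementary
    # to the opposite strand of the target double-stranded polynucleotide.
    return protospacer.replace("T", "U")
```

For a target such as `ACGTACGTACGTACGTACGT` followed by the PAM `TG`, calling the function with `pam_start=20` returns the 20-base RNA sequence corresponding to the stretch directly upstream of the PAM.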
The vector of the present embodiment is preferably an expression vector. The expression vector is not particularly limited, and for example, there can be used: plasmids derived from Escherichia coli such as pBR322, pBR325, pUC12, and pUC 13; plasmids derived from Bacillus subtilis such as pUB110, pTP5, pC 194; yeast-derived plasmids such as pSH19 and pSH 15; bacteriophage such as lambda phage; viruses such as adenovirus, adeno-associated virus, lentivirus, vaccinia virus, baculovirus and the like; and vectors obtained by modifying these vectors.
In the above expression vector, the promoters for expressing the Cas9 protein and the guide RNA are not particularly limited. Examples include promoters for expression in animal cells, such as the EF1α promoter, SRα promoter, SV40 promoter, LTR promoter, CMV (cytomegalovirus) promoter, and HSV-tk promoter; promoters for expression in plant cells, such as the 35S promoter of cauliflower mosaic virus (CaMV) and the REF (rubber elongation factor) promoter; and promoters for expression in insect cells, such as the polyhedrin promoter and the p10 promoter. These promoters can be appropriately selected depending on the Cas9 protein and the guide RNA, or on the type of cell in which the Cas9 protein and the guide RNA are expressed.
The expression vector may further have a multiple cloning site, an enhancer, a splicing signal, a poly(A) addition signal, a selection marker, an origin of replication, and the like.
< method for site-specifically modifying target double-stranded polynucleotide >
[ embodiment 1]
In one embodiment, the present invention provides a method for site-specifically modifying a target double-stranded polynucleotide, the method comprising the steps of:
a step of mixing and incubating the target double-stranded polynucleotide, the protein, and the guide RNA; and a step of modifying, by the protein, the target double-stranded polynucleotide at a binding site located upstream of the PAM sequence,
the target double-stranded polynucleotide has a PAM sequence consisting of NG (N is an arbitrary base, G is guanine),
the above-mentioned protein is the protein described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive >,
the guide RNA includes a polynucleotide composed of a nucleotide sequence complementary to the nucleotide sequence extending from 1 base upstream to 20 bases or more and 24 bases or less upstream of the PAM sequence in the target double-stranded polynucleotide.
According to the method of the present embodiment, by using the mutant Cas9 protein in which the PAM sequence is broadened, the target double-stranded polynucleotide can be modified easily and rapidly, and site-specifically with respect to the target sequence.
In the present embodiment, the target double-stranded polynucleotide is not particularly limited as long as it has a PAM sequence composed of NG (N is an arbitrary base, and G is guanine).
In the present embodiment, the protein and the guide RNA are as described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive >.
The details of the method for site-specifically modifying a target double-stranded polynucleotide will be described below.
First, the protein and the guide RNA are mixed under mild conditions and incubated. Mild conditions are as described above. The incubation time is preferably 0.5 hours or more and 1 hour or less. The complex formed by the protein and the guide RNA is stable and remains stable even when left standing at room temperature for several hours.
Then, the protein and the guide RNA form a complex on the target double-stranded polynucleotide. The protein recognizes a PAM sequence consisting of "5'-NG-3'" and binds to the target double-stranded polynucleotide at a binding site located upstream of the PAM sequence. When the protein has endonuclease activity, the polynucleotide is cleaved at this site. Specifically, the Cas9 protein recognizes the PAM sequence and opens the double helix of the target double-stranded polynucleotide starting from the PAM sequence, and the nucleotide sequence in the guide RNA that is complementary to the target double-stranded polynucleotide anneals to it, so that part of the double helix is unwound. The Cas9 protein then cleaves the phosphodiester bonds of the target double-stranded polynucleotide at a cleavage site located upstream of the PAM sequence and at a cleavage site located upstream of the sequence complementary to the PAM sequence.
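To illustrate what recognition of a "5'-NG-3'" PAM means for the choice of binding sites, the following sketch scans both strands of a DNA sequence for PAM matches that have enough upstream sequence for a guide. It is a hedged illustration, not the method of the specification: the function names, the `upstream=20` default, and the use of N as a wildcard in the `pam` pattern are assumptions made here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_pam_sites(seq: str, pam: str = "NG", upstream: int = 20):
    """List (strand, index) pairs for each PAM match on either strand.

    index is the 0-based position of the first PAM base on the named strand
    read 5'->3'; only matches with at least `upstream` bases 5' of the PAM
    are reported, since the binding site lies upstream of the PAM.
    """
    sites = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for i in range(upstream, len(s) - len(pam) + 1):
            window = s[i:i + len(pam)]
            if all(p == "N" or p == base for p, base in zip(pam, window)):
                sites.append((strand, i))
    return sites
```

Calling `find_pam_sites(seq)` with the default `pam="NG"` and again with `pam="NGG"` (the PAM generally associated with wild-type SpCas9) gives a rough sense of how many additional sites become addressable when only a single G is required.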
[2 nd embodiment ]
In the present embodiment, the method may further comprise, before the incubation step, the following expression step: expressing the protein described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive > and the guide RNA using the CRISPR-Cas vector system described above.
In the expression step of this embodiment, first, the Cas9 protein and the guide RNA are expressed using the CRISPR-Cas vector system described above. As a specific method of expression, a host is transformed with an expression vector containing a gene encoding the Cas9 protein and with an expression vector containing the guide RNA, respectively. The host is then cultured to express the Cas9 protein and the guide RNA. Conditions such as the composition of the medium, the temperature and time of culture, and the addition of an inducer can be determined by those skilled in the art according to known methods so that the transformant grows and produces the protein with high efficiency. In addition, for example, when an antibiotic resistance gene is integrated into the expression vector as a selection marker, a transformant can be selected by adding the antibiotic to the medium. Then, by purifying the Cas9 protein and the guide RNA expressed by the host using an appropriate method, the Cas9 protein and the guide RNA are obtained.
< method for site-specifically modifying target double-stranded nucleotide >
[ embodiment 1]
In one embodiment, the present invention provides a method for site-specifically modifying a target double-stranded polynucleotide, the method comprising the steps of:
a step of mixing and incubating the target double-stranded polynucleotide, the protein, and the guide RNA, and binding the protein to the target double-stranded polynucleotide at a binding site located upstream of the PAM sequence; and a step of obtaining the modified target double-stranded polynucleotide in a region defined by complementary binding between the guide RNA and the target double-stranded polynucleotide,
the target double-stranded polynucleotide has a PAM sequence consisting of NG (N is an arbitrary base, G is guanine),
the above-mentioned protein is the protein described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive >,
the guide RNA includes a polynucleotide composed of a nucleotide sequence complementary to the nucleotide sequence extending from 1 base upstream to 20 bases or more and 24 bases or less upstream of the PAM sequence in the target double-stranded polynucleotide.
According to the method of the present embodiment, by using an RNA-guided DNA endonuclease with broadened PAM recognition, the target double-stranded polynucleotide can be modified easily, rapidly, and site-specifically with respect to the target sequence.
In the present embodiment, the target double-stranded polynucleotide, the protein, and the guide RNA are as described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive > and < method for site-specifically modifying target double-stranded polynucleotide >.
The details of the method for site-specifically modifying a target double-stranded polynucleotide will be described below. The steps up to the site-specific binding to the target double-stranded polynucleotide are the same as those shown in the above < method for site-specifically cleaving a target double-stranded polynucleotide >. Then, a target double-stranded polynucleotide modified according to the purpose can be obtained in a region determined by complementary binding between the guide RNA and the double-stranded polynucleotide.
In the present specification, "modification" means that the nucleotide sequence of a target double-stranded polynucleotide is changed. For example, the nucleotide sequence of the target double-stranded polynucleotide may be changed by cleavage of the target double-stranded polynucleotide, insertion of an exogenous sequence after cleavage (physical insertion or insertion by replication through homology-directed repair), non-homologous end joining after cleavage (NHEJ: DNA ends generated by cleavage are joined to each other again), or the nucleotide sequence of the target double-stranded polynucleotide may be changed by addition of a functional protein or nucleotide sequence.
The modification of the target double-stranded polynucleotide in the present embodiment can introduce a mutation into the target double-stranded polynucleotide, or can disrupt or alter the function of the target double-stranded polynucleotide.
[2 nd embodiment ]
In the present embodiment, the method may further comprise, before the incubation step, the following expression step: expressing the protein described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive > and the guide RNA using the CRISPR-Cas vector system described above.
In the expression step of this embodiment, first, the Cas9 protein and the guide RNA are expressed using the CRISPR-Cas vector system described above. Specific methods of expression are the same as those exemplified in [2 nd embodiment ] of the above < method for site-specifically modifying target double-stranded polynucleotide >.
< method for site-specifically modifying a target double-stranded polynucleotide in a cell >
In one embodiment, the present invention provides a method for site-specifically modifying a target double-stranded polynucleotide in a cell, the method comprising the steps of:
an expression step of introducing the above CRISPR-Cas vector system into a cell to express the protein described in the above < Cas9 protein in which recognition of PAM sequence becomes extensive > and a guide RNA;
a step of binding the protein to the target double-stranded polynucleotide at a binding site located upstream of the PAM sequence; and
a step of obtaining the modified target double-stranded polynucleotide in a region defined by complementary binding between the guide RNA and the target double-stranded polynucleotide,
the target double-stranded polynucleotide has a PAM sequence consisting of NG (N is an arbitrary base, G is guanine),
the guide RNA includes a polynucleotide composed of a nucleotide sequence complementary to the nucleotide sequence extending from 1 base upstream to 20 bases or more and 24 bases or less upstream of the PAM sequence in the target double-stranded polynucleotide.
In the expression step of the present embodiment, first, the Cas9 protein and the guide RNA are expressed in the cell using the CRISPR-Cas vector system described above.
Examples of organisms from which cells to be applied in the method of the present embodiment are derived include: prokaryotes, yeast, animals, plants, insects, and the like. The animal is not particularly limited, and examples thereof include: human, monkey, dog, cat, rabbit, pig, cow, mouse, rat, etc., but are not limited thereto. The type of organism from which the cell is derived can be arbitrarily selected depending on the type, purpose, and the like of the desired target double-stranded polynucleotide.
Examples of the animal-derived cells to which the method of the present embodiment is applied include: germ cells (sperm, ovum, etc.), somatic cells constituting a living body, stem cells, precursor cells, cancer cells isolated from a living body, cells isolated from a living body and stably maintained in vitro (cell line) by acquiring immortalizing ability, cells isolated from a living body and artificially genetically modified, cells isolated from a living body and artificially nucleus-exchanged, and the like, but are not limited thereto.
Examples of somatic cells constituting a living body include, but are not limited to, cells collected from any tissue such as skin, kidney, spleen, adrenal gland, liver, lung, ovary, pancreas, uterus, stomach, colon, small intestine, large intestine, bladder, prostate, testis, thymus, muscle, connective tissue, bone, cartilage, vascular tissue, blood, heart, eye, brain, and nerve tissue. More specifically, examples of somatic cells include, but are not limited to, fibroblasts, osteocytes, immune cells (e.g., B lymphocytes, T lymphocytes, neutrophils, macrophages, monocytes, and the like), erythrocytes, platelets, bone cells, pericytes, dendritic cells, keratinocytes, adipocytes, mesenchymal cells, epithelial cells, epidermal cells, endothelial cells, vascular endothelial cells, lymphatic endothelial cells, hepatocytes, pancreatic islet cells (e.g., α cells, β cells, δ cells, ε cells, PP cells, and the like), chondrocytes, neural cells, oligodendrocytes, astrocytes, cardiac muscle cells, smooth muscle cells, and the like.
Stem cells are cells that have both the ability to self-replicate and the ability to differentiate into cells of multiple other lineages. Examples of the stem cells include: embryonic stem cells (ES cells), embryonal carcinoma cells, embryonic germ stem cells, induced pluripotent stem cells (iPS cells), neural stem cells, hematopoietic stem cells, mesenchymal stem cells, hepatic stem cells, pancreatic stem cells, muscle stem cells, germ stem cells, intestinal stem cells, cancer stem cells, hair follicle stem cells, and the like, but are not limited thereto.
Cancer cells are cells that are derived from somatic cells and have acquired unlimited proliferative capacity. Examples of the cancer from which the cancer cells are derived include: breast cancer (e.g., invasive ductal carcinoma, non-invasive ductal carcinoma, inflammatory breast cancer, etc.), prostate cancer (e.g., hormone-dependent prostate cancer, hormone-independent prostate cancer, etc.), pancreatic cancer (e.g., pancreatic ductal carcinoma, etc.), gastric cancer (e.g., papillary adenocarcinoma, mucinous adenocarcinoma, adenosquamous carcinoma, etc.), lung cancer (e.g., non-small cell lung cancer, malignant mesothelioma, etc.), colon cancer (e.g., gastrointestinal stromal tumor, etc.), rectal cancer (e.g., gastrointestinal stromal tumor, etc.), colorectal cancer (e.g., familial colorectal cancer, hereditary non-polyposis colorectal cancer, gastrointestinal stromal tumor, etc.), small intestine cancer (e.g., non-Hodgkin's lymphoma, gastrointestinal stromal tumor, etc.), esophageal cancer, duodenal cancer, tongue cancer, pharyngeal cancer (e.g., nasopharyngeal cancer, oropharyngeal cancer, hypopharyngeal cancer, etc.), head and neck cancer, salivary gland cancer, brain tumor (e.g., pineal astrocytoma, pilocytic astrocytoma, diffuse astrocytoma, anaplastic astrocytoma, etc.), schwannoma, liver cancer (e.g., primary liver cancer, extrahepatic bile duct cancer, etc.), kidney cancer (e.g., renal cell carcinoma, transitional epithelial carcinoma of the renal pelvis and ureter, etc.), gallbladder cancer, bile duct cancer, endometrial cancer, cervical cancer, ovarian cancer (e.g., epithelial ovarian cancer, extragonadal germ cell tumor, ovarian tumor of low malignant potential, etc.), bladder cancer, urinary tract cancer, skin cancer (e.g., intraocular (eye) melanoma, Merkel cell carcinoma, etc.), hemangioma, malignant lymphoma (e.g., reticulosarcoma, lymphosarcoma, Hodgkin's disease, etc.), melanoma (malignant melanoma), thyroid cancer (e.g., medullary thyroid cancer, etc.), parathyroid cancer, nasal cavity cancer, paranasal sinus cancer, bone tumor (e.g., osteosarcoma, Ewing's tumor (Ewing's sarcoma), uterine sarcoma, soft tissue sarcoma, etc.), metastatic medulloblastoma, angiofibroma, dermatofibrosarcoma protuberans, retinal sarcoma, penile cancer, testicular tumor, pediatric solid tumor (e.g., Wilms' tumor, pediatric kidney tumor, etc.), Kaposi's sarcoma, AIDS-related Kaposi's sarcoma, maxillary sinus tumor, fibrosarcoma, leiomyosarcoma, rhabdomyosarcoma, chronic myeloproliferative disease, leukemia (e.g., acute myeloid leukemia, acute lymphoblastic leukemia, etc.), and the like, but are not limited thereto.
A cell line refers to cells that have acquired unlimited proliferative capacity in vitro through artificial manipulation. Examples of the cell line include: HCT116, Huh7, HEK293 (human embryonic kidney cells), HeLa (human cervical cancer cell line), HepG2 (human liver cancer cell line), UT7/TPO (human leukemia cell line), CHO (Chinese hamster ovary cell line), MDCK, MDBK, BHK, C-33A, HT-29, AE-1, 3D9, Ns0/1, Jurkat, NIH3T3, PC12, S2, Sf9, Sf21, High Five, Vero, etc., but are not limited thereto.
The CRISPR-Cas vector system can be introduced into a cell by a method suitable for the living cell to be used. Examples include electroporation, the heat shock method, the calcium phosphate method, lipofection, the DEAE-dextran method, microinjection, the particle gun method, methods using a virus, and methods using commercially available transfection reagents such as FuGENE (registered trademark) 6 Transfection Reagent (manufactured by Roche), Lipofectamine 2000 Reagent (manufactured by Invitrogen), Lipofectamine LTX Reagent (manufactured by Invitrogen), and Lipofectamine 3000 Reagent (manufactured by Invitrogen).
Then, the modification step is the same as that described in [ embodiment 1] of the above < method for site-specifically modifying a target double-stranded nucleotide ].
By modifying the target double-stranded polynucleotide in this embodiment, a cell in which a mutation is introduced into the target double-stranded polynucleotide or the function of the target double-stranded polynucleotide is disrupted or changed can be obtained.
When a protocol without endonuclease activity (e.g., protocol 5) is employed as the mutant Cas9 protein of the present invention, the protein binds to the target double-stranded polynucleotide at a binding site located upstream of the PAM sequence but remains there without cleaving it. Thus, for example, if the protein is fused to a marker protein such as a fluorescent protein (e.g., GFP), the marker protein can be bound to the target double-stranded polynucleotide via the guide RNA-mutant Cas9 protein. By appropriately selecting the substance bound to the mutant Cas9 protein, a wide variety of functions can be imparted to the target double-stranded polynucleotide.
A transcriptional regulator protein or a domain thereof may further be linked to the N-terminus or C-terminus of the mutant Cas9 protein, or of a mutant Cas9 protein from which part or all of the cleavage enzyme activity has been deleted. Examples of the transcriptional regulator or the domain thereof include: transcriptional activators or domains thereof (e.g., VP64, NF-κB p65), transcriptional silencers or domains thereof (e.g., heterochromatin protein 1 (HP1)), and transcriptional repressors or domains thereof (e.g., Krüppel-associated box (KRAB), ERF repressor domain (ERD), mSin3A interaction domain (SID)).
Enzymes that modify the methylation state of DNA (e.g., DNA methyltransferases (DNMT), TET) or enzymes that modify histone subunits (e.g., Histone Acetyltransferases (HAT), Histone Deacetylases (HDAC), histone methyltransferases, histone demethylases) may also be linked.
< Gene therapy >
In one embodiment, the invention provides methods and compositions for performing genome editing, gene therapy. The method of the present embodiment is effective and inexpensive to carry out as compared with the previously known targeted gene recombination method, and can be applied to any cell or organism. Any fragment of the double-stranded nucleic acid of the cell or organism can be modified by the gene therapy method of the present embodiment. The gene therapy method of the present embodiment employs both a homologous recombination process and a non-homologous recombination process that are inherent in all cells.
In the present specification, "genome editing" refers to a novel gene modification technique such as specific gene disruption or reporter gene knock-in by performing targeted gene recombination or targeted mutation using a CRISPR/Cas9 system or Transcription Activator-Like Effector Nucleases (TALENs).
In one embodiment, the present invention provides: a gene therapy method for performing targeted DNA insertions or targeted DNA deletions. The gene therapy method comprises the following steps: a step of transforming a cell with a nucleic acid construct comprising donor DNA. As for the schemes relating to DNA insertion and DNA deletion after cleavage of a target gene, those skilled in the art can determine the same according to known methods.
In another embodiment, the present invention provides a gene therapy method that is used in both somatic cells and germ cells to perform genetic manipulation at a specific locus.
In one embodiment, the present invention provides: a gene therapy method for disrupting a gene in a somatic cell. Here, the gene is overexpressed and expresses a product harmful to the cell or organism. Such a gene may be overexpressed in 1 or more cell types that arise in the disease. Disruption of the overexpressed gene by the gene therapy method of the present embodiment can improve the health of an individual suffering from a disease caused by the overexpressed gene. That is, disruption of the gene in even a small proportion of cells lowers the expression level and produces a therapeutic effect.
In one embodiment, the present invention provides: a gene therapy method for disrupting a gene in a germ cell. Cells in which a specific gene is disrupted can be effectively used for producing an organism that does not have the function of the specific gene. In the above-mentioned gene-disrupted cells, the gene may be completely knocked out. Loss of function in this particular cell can have a therapeutic effect.
In one embodiment, the present invention provides: a gene therapy method for inserting donor DNA encoding a gene product. The gene product has a therapeutic effect when expressed constitutively. For example, the following method can be mentioned: donor DNA comprising an active promoter and encoding an insulin gene is inserted into a population of pancreatic cells of an individual (patient) suffering from diabetes. The population of pancreatic cells containing the donor DNA can then produce insulin to treat the diabetic patient. Furthermore, the donor DNA can be inserted into a plant to produce a drug-related gene product. Genes for protein products (e.g., insulin, lipase, or hemoglobin) can be inserted into plants together with regulatory elements (constitutively active promoters or inducible promoters) to produce large amounts of drugs in the plants. Such protein products can then be isolated from the plants.
Transgenic plants or animals can be produced by methods using the nuclear transfer technique (McCreath, K. J. et al. (2000) Nature 405: 1066-1069; Polejaeva, I. A. et al. (2000) Nature 407: 86-90). Tissue-type-specific vectors or cell-type-specific vectors may be used to provide gene expression only in selected cells.
In addition, when the above method is applied to germ cells, donor DNA may be inserted into a target gene, and cells having a designed genetic modification may be generated by all cell divisions thereafter.
Examples of the target to which the gene therapy method of the present embodiment is applied include: any organism, cultured cell, cultured tissue, cultured nucleus (including cells, tissues or nuclei useful for regenerating an organism in the whole cultured cell, cultured tissue or cultured nucleus), gamete (for example, egg or sperm at various stages of development), and the like, without being limited thereto.
The source of the cells to be applied in the gene therapy method of the present embodiment includes, but is not limited to, any living organism (for example, insects, fungi, rodents, cows, sheep, goats, chickens, other agriculturally important animals, and other mammals (for example, dogs, cats, and humans, but not limited thereto), and the like).
The gene therapy method of the present embodiment can be further used in plants. The plant to which the gene therapy method of the present embodiment is applied is not particularly limited, and can be applied to any of various plant species (for example, monocotyledons, dicotyledons, and the like).
The present invention will be described in more detail with reference to the following examples, which are not intended to limit the scope of the present invention.
Examples
Example 1
1. Preparation of wild-type and mutant SpCas9
(1) Design of construct
The codon-optimized wild-type or mutant SpCas9 gene, prepared by gene synthesis, was integrated into the pET vector (Novagen). A TEV recognition sequence was further added between the His tag and the SpCas9 gene. The construct was thus designed so that Cas9 expressed from it carries six consecutive histidine residues (His tag) at its N-terminus, followed by a TEV protease recognition site.
The nucleotide sequence of the SpCas9 gene used is as follows.
WT: nucleotide sequence of wild-type SpCas9: SEQ ID NO: 2;
m0: nucleotide sequence of mutant SpCas9 gene (R1335A): SEQ ID NO: 3;
m4: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R): SEQ ID NO: 4;
m18: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R): SEQ ID NO: 5;
m19: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R): SEQ ID NO: 6;
m20: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/D1332R): SEQ ID NO: 7;
m21: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/D1332R/A1322R): SEQ ID NO: 8;
m22: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/D1332R/A1322R/D1284R/A1285R): SEQ ID NO: 9;
m23: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/L1111R/D1332R/A1322R): SEQ ID NO: 10;
m24: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/L1111R/D1332R/A1322R/D1284R/A1285R): SEQ ID NO: 11;
m25: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/A1322R): SEQ ID NO: 12;
m26: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/L1111R/A1322R): SEQ ID NO: 13;
m29: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/L1111R): SEQ ID NO: 14;
m32: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/A1322R/E1219M): SEQ ID NO: 15;
m33: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/A1322R/E1219F): SEQ ID NO: 16;
m34: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/A1322R/E1219W): SEQ ID NO: 17;
m43: nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/A1322R/E1219F/D1135V): SEQ ID NO: 18;
m61: nucleotide sequence of mutant SpCas9 gene (R1335I/G1218R/T1337R/L1111R/A1322R/E1219F/D1135V): a nucleotide sequence obtained by converting gcc at position 4003-4005 into atc with respect to the nucleotide sequence of m43 (SEQ ID NO: 18).
m62: nucleotide sequence of mutant SpCas9 gene (R1335L/G1218R/T1337R/L1111R/A1322R/E1219F/D1135V): a nucleotide sequence obtained by converting gcc at position 4003-4005 into ctg with respect to the nucleotide sequence of m43 (SEQ ID NO: 18).
m63: nucleotide sequence of mutant SpCas9 gene (R1335M/G1218R/T1337R/L1111R/A1322R/E1219F/D1135V): a nucleotide sequence obtained by converting gcc at position 4003-4005 into atg with respect to the nucleotide sequence of m43 (SEQ ID NO: 18).
m64: nucleotide sequence of mutant SpCas9 gene (R1335F/G1218R/T1337R/L1111R/A1322R/E1219F/D1135V): a nucleotide sequence obtained by converting gcc at position 4003-4005 into ttt with respect to the nucleotide sequence of m43 (SEQ ID NO: 18).
m65: nucleotide sequence of mutant SpCas9 gene (R1335T/G1218R/T1337R/L1111R/A1322R/E1219F/D1135V): a nucleotide sequence obtained by converting gcc at position 4003-4005 into acc with respect to the nucleotide sequence of m43 (SEQ ID NO: 18).
m66: nucleotide sequence of mutant SpCas9 gene (R1335V/G1218R/T1337R/L1111R/A1322R/E1219F/D1135V): a nucleotide sequence obtained by converting gcc at position 4003-4005 into gtg with respect to the nucleotide sequence of m43 (SEQ ID NO: 18).
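The nucleotide positions quoted for m61 to m66 follow from the residue number: when a coding sequence is numbered from the first base of the start codon, residue n occupies bases 3(n-1)+1 to 3n, so residue 1335 corresponds to positions 4003-4005. The following Python sketch illustrates this arithmetic and the codon replacement. The function names and the short toy sequence are hypothetical and are not part of the sequence listing.

```python
from __future__ import annotations


def codon_span(residue: int) -> tuple[int, int]:
    """Return the 1-based (start, end) nucleotide positions of a residue's codon."""
    if residue < 1:
        raise ValueError("residue numbering starts at 1")
    start = 3 * (residue - 1) + 1
    return start, start + 2


def replace_codon(cds: str, residue: int, new_codon: str) -> str:
    """Return a copy of cds in which the codon of the given residue is replaced."""
    if len(new_codon) != 3:
        raise ValueError("a codon has exactly three bases")
    start, end = codon_span(residue)
    if end > len(cds):
        raise ValueError("residue lies outside the coding sequence")
    # Convert 1-based inclusive positions to Python slice indices.
    return cds[: start - 1] + new_codon + cds[end:]


if __name__ == "__main__":
    print(codon_span(1335))  # codon of residue 1335
    print(codon_span(1135))  # codon of residue 1135
    # Toy coding sequence (hypothetical): Met-Ala-Ala-stop.
    print(replace_codon("atggccgcctaa", 3, "atc"))
```

Applied to the full nucleotide sequence of m43, `replace_codon(m43_cds, 1335, "atc")` would give the kind of sequence described for m61, assuming `m43_cds` holds SEQ ID NO: 18 as a lowercase string.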
(2) Expression in E. coli
The prepared vector was transformed into the Escherichia coli Rosetta 2 (DE3) strain. The cells were then cultured in LB medium containing 20 μg/mL kanamycin and 20 μg/mL chloramphenicol. When the culture reached OD = 0.8, isopropyl β-D-1-thiogalactopyranoside (IPTG) was added as an expression inducer (final concentration: 1 mM), and the cells were cultured at 37 ℃ for 4 hours, after which the E. coli cells were recovered by centrifugation (5,000 g, 10 minutes).
(3) Purification of wild type and mutant SpCas9
The bacterial cells recovered in (2) were suspended in buffer A and disrupted by ultrasonication. The supernatant was recovered by centrifugation (25,000 g, 30 minutes), combined with Ni-NTA Superflow resin (QIAGEN) equilibrated with buffer A, and mixed by gentle inversion for 1 hour. After the flow-through fraction was collected, the resin was washed with 4 column volumes of buffer A and then with 2 column volumes of the high-salt buffer B.
The resin was then washed again with 2 column volumes of buffer A, and the target protein was eluted with 5 column volumes of the high-imidazole buffer C.
The crude sample was then loaded onto a HiTrap SP column (GE Healthcare). The column was washed with 5 column volumes of a mixture (by volume fraction) of 92.5% buffer D (0 M NaCl) and 7.5% buffer E (2 M NaCl), and the target protein was then eluted with a linear gradient of buffer E from 10% to 50% (NaCl concentration from 200 mM to 1 M).
The compositions of the buffers A to E are shown below.
Buffer A: 20 mM Tris-HCl, pH 8.0, 300 mM NaCl, 20 mM imidazole;
Buffer B: 20 mM Tris-HCl, pH 8.0, 1000 mM NaCl, 20 mM imidazole;
Buffer C: 20 mM Tris-HCl, pH 8.0, 300 mM NaCl, 300 mM imidazole;
Buffer D: 20 mM Tris-HCl, pH 8.0;
Buffer E: 20 mM Tris-HCl, pH 8.0, 2000 mM NaCl.
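The NaCl concentrations quoted for the ion-exchange step can be checked from the mixing percentages. The following Python sketch assumes that a salt-free buffer (0 M NaCl) is mixed with a 2 M NaCl buffer and that volumes are simply additive. The function name and its parameters are hypothetical.

```python
def mixed_nacl_mM(percent_high_salt: float,
                  high_salt_mM: float = 2000.0,
                  low_salt_mM: float = 0.0) -> float:
    """NaCl concentration (mM) of a mixture containing the given volume
    percentage of high-salt buffer, the rest being low-salt buffer."""
    if not 0.0 <= percent_high_salt <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    low_percent = 100.0 - percent_high_salt
    return (percent_high_salt * high_salt_mM + low_percent * low_salt_mM) / 100.0


if __name__ == "__main__":
    # Wash step (7.5%) and the two ends of the linear gradient (10%, 50%).
    for percent in (7.5, 10.0, 50.0):
        print(percent, mixed_nacl_mM(percent))
```

Under these assumptions the 7.5% wash corresponds to 150 mM NaCl, and the 10% and 50% ends of the gradient correspond to 200 mM and 1 M NaCl, in agreement with the values given for the gradient.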
2. Preparation of guide RNA
A vector into which the target guide RNA sequence (ggaaauuaggugcgcuuggcguuuuagagcuagaaauagcaaguuaaaauaaggcuaguccguuaucaacuugaaaaagug; SEQ ID NO: 19) had been inserted was prepared. The first 20 bases (ggaaauuaggugcgcuuggc; underlined in the original) are the guide sequence, and the remainder corresponds to the scaffold part (neck-loop 2, stem-loop 2). The T7 promoter sequence was added upstream of the guide RNA sequence, and the whole was integrated into a linearized pUC119 vector (TaKaRa). From the prepared vector, a template DNA for the in vitro transcription reaction was prepared by PCR. Using this template DNA, an in vitro transcription reaction with T7 RNA polymerase was carried out at 37 ℃ for 4 hours. An equal amount of phenol and chloroform was added to the reaction solution containing the transcription product, followed by mixing and centrifugation (10,000 g, 2 minutes) at 20 ℃, and the supernatant was recovered. To the supernatant were added 1/10 volume of 3 M sodium acetate and 2.5 volumes of 100% ethanol, and the mixture was centrifuged (10,000 g, 3 minutes) at 4 ℃ to precipitate the transcript. The supernatant was discarded, 70% ethanol was added, the mixture was centrifuged (10,000 g, 3 minutes) at 4 ℃, and the supernatant was discarded again. The pellet was air-dried, resuspended in TBE buffer and purified by 10% denaturing PAGE containing 7 M urea. The band at the molecular weight of the target RNA was excised, and the RNA was extracted using the Elutrap electroelution system (GE Healthcare). Thereafter, the extracted RNA was passed through a PD-10 column (GE Healthcare) to exchange the buffer for buffer H (10 mM Tris-HCl (pH 8.0), 150 mM NaCl).
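The relationship between the guide RNA of SEQ ID NO: 19 and the 20-base target DNA of SEQ ID NO: 20 can be checked with a short script. The Python sketch below assumes that the guide sequence is the first 20 bases of the transcript and that it corresponds to the target DNA with T written as U. The constant and function names are hypothetical.

```python
# Guide RNA transcript as given for SEQ ID NO: 19 (guide followed by scaffold).
SGRNA = (
    "ggaaauuaggugcgcuuggc"
    "guuuuagagcuagaaauagcaaguuaaaauaaggcuaguccguuaucaacuugaaaaagug"
)

# Target DNA as given for SEQ ID NO: 20.
TARGET_DNA = "GGAAATTAGGTGCGCTTGGC"


def dna_to_rna(dna: str) -> str:
    """Write a DNA sequence in RNA letters (T becomes U), in upper case."""
    return dna.upper().replace("T", "U")


def guide_of(sgrna: str, length: int = 20) -> str:
    """Return the leading guide segment of a guide RNA transcript, in upper case."""
    if len(sgrna) < length:
        raise ValueError("transcript is shorter than the guide length")
    return sgrna[:length].upper()


if __name__ == "__main__":
    print(guide_of(SGRNA))
    print(dna_to_rna(TARGET_DNA))
    print(guide_of(SGRNA) == dna_to_rna(TARGET_DNA))
```

The comparison only confirms that the two sequences listed in this specification agree with each other. It says nothing about transcription yield or RNA purity.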
3. Plasmid DNA cleavage Activity assay
A vector having a target DNA sequence and a PAM sequence inserted therein was prepared and used in a test for measuring DNA cleavage activity. PAM sequences 1 to 4 were added to the target DNA sequence, respectively, and the resulting mixture was integrated into a linear pUC119 vector. The target sequences and PAM sequences 1-4 are shown in Table 1.
[Table 1]
Name              Nucleotide sequence             SEQ ID NO
Target DNA        5'-GGAAATTAGGTGCGCTTGGC-3'      SEQ ID NO: 20
PAM sequence 1    5'-TGT-3'
PAM sequence 2    5'-TGG-3'
PAM sequence 3    5'-TGNA-3'
PAM sequence 4    5'-TGN-3'
Escherichia coli Mach1 strain (Life Technologies) was transformed with the prepared vector and cultured at 37 ℃ in LB medium containing 20 μg/mL ampicillin.
After the culture, the cells were collected by centrifugation (8,000g, 1 minute), and the plasmid DNA was purified using QIAprep Spin Miniprep kit (QIAGEN).
Cleavage experiments were performed using the purified target plasmid DNA to which the PAM sequences had been added. The plasmid DNA was first linearized with a restriction enzyme. If the wild-type or mutant SpCas9 cleaves the target DNA sequence in the linearized DNA, cleavage products of about 1,000 bp and about 2,000 bp are generated. As the buffer for the cleavage reaction, cleavage buffer B having the following composition was used.
Composition of buffer B (×10)
200 mM HEPES, pH 7.5;
1000 mM KCl;
50% glycerol;
10 mM DTT;
5 mM EDTA;
20 mM MgCl2
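Because the composition above is given for a ×10 stock, the working concentrations can be obtained by dividing each value by ten. The Python sketch below shows this calculation, assuming that the stock is diluted to ×1 in the final reaction (the dilution actually used is not stated here). The dictionary and function names are hypothetical.

```python
# Composition of the x10 stock as listed above (glycerol in percent, others in mM).
STOCK_10X = {
    "HEPES pH 7.5 (mM)": 200.0,
    "KCl (mM)": 1000.0,
    "glycerol (%)": 50.0,
    "DTT (mM)": 10.0,
    "EDTA (mM)": 5.0,
    "MgCl2 (mM)": 20.0,
}


def working_concentrations(stock: dict, fold: float = 10.0) -> dict:
    """Divide every stock concentration by the dilution factor."""
    if fold <= 0:
        raise ValueError("dilution factor must be positive")
    return {name: value / fold for name, value in stock.items()}


if __name__ == "__main__":
    for name, value in working_concentrations(STOCK_10X).items():
        print(name, value)
```

Under this assumption the ×1 reaction would contain 20 mM HEPES, 100 mM KCl, 5% glycerol, 1 mM DTT, 0.5 mM EDTA and 2 mM MgCl2.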
The bands of the cleavage products were confirmed by electrophoresis of the reacted samples on a 1% agarose gel. The results are shown in FIGS. 1A-D. In the figures, "Substrate" indicates the substrate and "Product" indicates the cleavage product. The PAM sequences and reaction conditions are shown in the figures.
Whereas the wild-type SpCas9 cleaved the target plasmid DNA only when the base at position 3 of the PAM sequence was G, the mutant SpCas9 also cleaved the target plasmid DNA when the base at position 3 of the PAM sequence was other than G.
Thus, it was confirmed that the wild-type SpCas9 recognizes the PAM sequence "NGG", whereas the mutant SpCas9 recognizes the PAM sequence "NG".
From the above results it is clear that, in the mutant SpCas9, the range of recognized PAM sequences is broadened, and site-specific cleavage of a target double-stranded polynucleotide can be performed easily and quickly with respect to the target sequence.
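The difference between the "NGG" and "NG" recognition rules can be expressed as a simple pattern match. The Python sketch below is only an illustration of these two rules as stated above: it does not model cleavage efficiency, and the function and table names are hypothetical.

```python
# Bases allowed by each pattern letter; N stands for any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


def matches_pam(pam: str, pattern: str) -> bool:
    """Return True if the leading bases of pam satisfy the PAM pattern."""
    pam = pam.upper()
    pattern = pattern.upper()
    if len(pam) < len(pattern):
        return False
    # zip stops at the end of the pattern, so only the leading bases are checked.
    return all(base in IUPAC[code] for base, code in zip(pam, pattern))


if __name__ == "__main__":
    for pam in ("TGA", "TGC", "TGG", "TGT"):
        print(pam, matches_pam(pam, "NGG"), matches_pam(pam, "NG"))
```

With these rules, only TGG satisfies "NGG", whereas TGA, TGC, TGG and TGT all satisfy "NG".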
Example 2
A plasmid DNA cleavage activity assay was performed in the same manner as in Example 1 using the mutant SpCas9 (m43) prepared in Example 1. The results are shown in FIG. 2.
Whereas the wild-type SpCas9 cleaved the target plasmid DNA only when the base at position 3 of the PAM sequence was G, the mutant SpCas9 also cleaved the target plasmid DNA when the base at position 3 of the PAM sequence was other than G.
Thus, it was confirmed that the wild-type SpCas9 recognizes the PAM sequence "NGG", whereas the mutant SpCas9 recognizes the PAM sequence "NG".
Example 3
A plasmid DNA cleavage activity assay was performed in the same manner as in Example 1 using the mutant SpCas9 proteins (m43, m61 to m66) prepared in Example 1. In addition, a MultiNA capillary electrophoresis apparatus (Shimadzu Corporation) was used to detect the cleavage products. As the PAM sequence, 5'-TGC-3', which is one form of PAM sequence 4, was used. The cleavage reactions were carried out for 0.5 minutes (0.5m) and 2 minutes (2m). The results are shown in FIG. 3. Excellent DNA cleavage activity was confirmed in m61, m63, m65 and m66.
Example 4
A plasmid DNA cleavage activity assay was performed in the same manner as in Example 1 using the mutant SpCas9 proteins (m43, m61, m63, and m66) prepared in Example 1. The cleavage reactions were carried out for 0.5 minutes (0.5m) and 2 minutes (2m). The results are shown in FIG. 4.
Whereas the wild-type SpCas9 cleaved the target plasmid DNA only when the base at position 3 of the PAM sequence was G, the mutant SpCas9 also cleaved the target plasmid DNA when the base at position 3 of the PAM sequence was other than G. It was also confirmed that m61, m63 and m66, particularly m63 and m66, cleaved DNA with high efficiency even with the PAM sequences TGA and TGC, for which the cleavage efficiency of m43 was low.
Example 5
A plasmid DNA cleavage activity assay was performed in the same manner as in example 1, using wild-type SpCas9 and mutant SpCas9(WT, m43) prepared in example 1 and the following mutant SpCas9 prepared in the same manner as in example 1. The cleavage experiments were performed over time (0, 0.5, 1, 2, 5 minutes). The results are shown in FIG. 5. In m43, a comparable improvement in cleavage activity was observed as compared with WT.
Nucleotide sequence of mutant SpCas9 gene (R1335A/G1218R/T1337R/L1111R/A1322R/D1135V): a nucleotide sequence obtained by converting gac at position 3403-3405 into gtt with respect to the nucleotide sequence of m25 (SEQ ID NO: 12).
Industrial applicability
According to the present invention, a Cas9 protein can be obtained in which recognition of a PAM sequence becomes broad while maintaining binding force to a target double-stranded polynucleotide and further maintaining endonuclease activity. In addition, a technology for easy and rapid site-specific genome editing against a target sequence can be provided, which utilizes the Cas9 protein described above.
The present application is based on Japanese Patent Application No. 2017-108556 (filing date: May 31, 2017), the entire contents of which are incorporated in the present specification.
<110> The University of Tokyo
<120> modified Cas9 protein and uses thereof
<130>092761
<150>JP2017-108556
<151>2017-05-31
<160>20
<170>PatentIn version 3.5
<210>1
<211>1368
<212>PRT
<213> Streptococcus pyogenes
<400>1
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210>2
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>2
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc ctg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc ggc 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct gcc gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag agg tac acc agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>3
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>3
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc ctg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc ggc 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct gcc gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac acc agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Thr Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>4
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>4
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc ctg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct gcc gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac acc agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Thr Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>5
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>5
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc ctg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct gcc gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>6
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>6
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct gcc gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>7
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>7
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct gcc gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc cgg cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Arg Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>8
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>8
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc cgg cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Arg Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>9
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>9
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc cgg cgg aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Arg Arg Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc cgg cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Arg Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>10
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>10
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc cgg cgg aag gcc tac acc agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Arg Arg Lys Ala Tyr Thr Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>11
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>11
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc cgg cgg aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Arg Arg Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc cgg cgg aag gcc tac acc agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Arg Arg Lys Ala Tyr Thr Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>12
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>12
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>13
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>13
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac acc agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Thr Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>14
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>14
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
gaa ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct gcc gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac acc agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Thr Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>15
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>15
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc gga aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gtt agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Val Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
atg ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Met Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>16
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>16
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc gga aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gtt agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Val Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
ttt ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>17
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>17
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc ggc aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gac agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
tgg ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Trp Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>18
<211>4107
<212>DNA
<213> Streptococcus pyogenes
<220>
<221>CDS
<222>(1)..(4107)
<400>18
atg gac aag aag tac agc atc ggc ctg gac atc ggc acc aac tct gtg 48
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
1 5 10 15
ggc tgg gcc gtg atc acc gac gag tac aag gtg ccc agc aag aaa ttc 96
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
aag gtg ctg ggc aac acc gac cgg cac agc atc aag aag aac ctg atc 144
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
gga gcc ctg ctg ttc gac agc ggc gaa aca gcc gag gcc acc cgg ctg 192
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
aag aga acc gcc aga aga aga tac acc aga cgg aag aac cgg atc tgc 240
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
tat ctg caa gag atc ttc agc aac gag atg gcc aag gtg gac gac agc 288
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
ttc ttc cac aga ctg gaa gag tcc ttc ctg gtg gaa gag gat aag aag 336
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
cac gag cgg cac ccc atc ttc ggc aac atc gtg gac gag gtg gcc tac 384
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
cac gag aag tac ccc acc atc tac cac ctg aga aag aaa ctg gtg gac 432
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
agc acc gac aag gcc gac ctg cgg ctg atc tat ctg gcc ctg gcc cac 480
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
atg atc aag ttc cgg ggc cac ttc ctg atc gag ggc gac ctg aac ccc 528
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
gac aac agc gac gtg gac aag ctg ttc atc cag ctg gtg cag acc tac 576
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
aac cag ctg ttc gag gaa aac ccc atc aac gcc agc ggc gtg gac gcc 624
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
aag gcc atc ctg tct gcc aga ctg agc aag agc aga cgg ctg gaa aat 672
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
ctg atc gcc cag ctg ccc ggc gag aag aag aat ggc ctg ttc gga aac 720
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
ctg att gcc ctg agc ctg ggc ctg acc ccc aac ttc aag agc aac ttc 768
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
gac ctg gcc gag gat gcc aaa ctg cag ctg agc aag gac acc tac gac 816
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
gac gac ctg gac aac ctg ctg gcc cag atc ggc gac cag tac gcc gac 864
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
ctg ttt ctg gcc gcc aag aac ctg tcc gac gcc atc ctg ctg agc gac 912
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
atc ctg aga gtg aac acc gag atc acc aag gcc ccc ctg agc gcc tct 960
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
atg atc aag aga tac gac gag cac cac cag gac ctg acc ctg ctg aaa 1008
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
gct ctc gtg cgg cag cag ctg cct gag aag tac aaa gag att ttc ttc 1056
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
gac cag agc aag aac ggc tac gcc ggc tac att gac ggc gga gcc agc 1104
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
cag gaa gag ttc tac aag ttc atc aag ccc atc ctg gaa aag atg gac 1152
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
ggc acc gag gaa ctg ctc gtg aag ctg aac aga gag gac ctg ctg cgg 1200
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
aag cag cgg acc ttc gac aac ggc agc atc ccc cac cag atc cac ctg 1248
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
gga gag ctg cac gcc att ctg cgg cgg cag gaa gat ttt tac cca ttc 1296
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
ctg aag gac aac cgg gaa aag atc gag aag atc ctg acc ttc cgc atc 1344
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
ccc tac tac gtg ggc cct ctg gcc agg gga aac agc aga ttc gcc tgg 1392
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
atg acc aga aag agc gag gaa acc atc acc ccc tgg aac ttc gag gaa 1440
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
gtg gtg gac aag ggc gct tcc gcc cag agc ttc atc gag cgg atg acc 1488
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
aac ttc gat aag aac ctg ccc aac gag aag gtg ctg ccc aag cac agc 1536
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
ctg ctg tac gag tac ttc acc gtg tat aac gag ctg acc aaa gtg aaa 1584
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
tac gtg acc gag gga atg aga aag ccc gcc ttc ctg agc ggc gag cag 1632
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
aaa aag gcc atc gtg gac ctg ctg ttc aag acc aac cgg aaa gtg acc 1680
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
gtg aag cag ctg aaa gag gac tac ttc aag aaa atc gag tgc ttc gac 1728
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
tcc gtg gaa atc tcc ggc gtg gaa gat cgg ttc aac gcc tcc ctg ggc 1776
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
aca tac cac gat ctg ctg aaa att atc aag gac aag gac ttc ctg gac 1824
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
aat gag gaa aac gag gac att ctg gaa gat atc gtg ctg acc ctg aca 1872
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
ctg ttt gag gac aga gag atg atc gag gaa cgg ctg aaa acc tat gcc 1920
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
cac ctg ttc gac gac aaa gtg atg aag cag ctg aag cgg cgg aga tac 1968
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
acc ggc tgg ggc agg ctg agc cgg aag ctg atc aac ggc atc cgg gac 2016
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
aag cag tcc ggc aag aca atc ctg gat ttc ctg aag tcc gac ggc ttc 2064
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
gcc aac aga aac ttc atg cag ctg atc cac gac gac agc ctg acc ttt 2112
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
aaa gag gac atc cag aaa gcc cag gtg tcc ggc cag ggc gat agc ctg 2160
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
cac gag cac att gcc aat ctg gcc ggc agc ccc gcc att aag aag ggc 2208
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
atc ctg cag aca gtg aag gtg gtg gac gag ctc gtg aaa gtg atg ggc 2256
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
cgg cac aag ccc gag aac atc gtg atc gaa atg gcc aga gag aac cag 2304
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
acc acc cag aag gga cag aag aac agc cgc gag aga atg aag cgg atc 2352
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
gaa gag ggc atc aaa gag ctg ggc agc cag atc ctg aaa gaa cac ccc 2400
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
gtg gaa aac acc cag ctg cag aac gag aag ctg tac ctg tac tac ctg 2448
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
cag aat ggg cgg gat atg tac gtg gac cag gaa ctg gac atc aac cgg 2496
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
ctg tcc gac tac gat gtg gac cat atc gtg cct cag agc ttt ctg aag 2544
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
gac gac tcc atc gac aac aag gtg ctg acc aga agc gac aag aac cgg 2592
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
ggc aag agc gac aac gtg ccc tcc gaa gag gtc gtg aag aag atg aag 2640
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
aac tac tgg cgg cag ctg ctg aac gcc aag ctg att acc cag aga aag 2688
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
ttc gac aat ctg acc aag gcc gag aga ggc ggc ctg agc gaa ctg gat 2736
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
aag gcc ggc ttc atc aag aga cag ctg gtg gaa acc cgg cag atc aca 2784
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
aag cac gtg gca cag atc ctg gac tcc cgg atg aac act aag tac gac 2832
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
gag aat gac aag ctg atc cgg gaa gtg aaa gtg atc acc ctg aag tcc 2880
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
aag ctg gtg tcc gat ttc cgg aag gat ttc cag ttt tac aaa gtg cgc 2928
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
gag atc aac aac tac cac cac gcc cac gac gcc tac ctg aac gcc gtc 2976
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
gtg gga acc gcc ctg atc aaa aag tac cct aag ctg gaa agc gag ttc 3024
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
gtg tac ggc gac tac aag gtg tac gac gtg cgg aag atg atc gcc 3069
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
aag agc gag cag gaa atc ggc aag gct acc gcc aag tac ttc ttc 3114
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
tac agc aac atc atg aac ttt ttc aag acc gag att acc ctg gcc 3159
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
aac ggc gag atc cgg aag cgg cct ctg atc gag aca aac ggc gaa 3204
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu
1055 1060 1065
acc ggg gag atc gtg tgg gat aag ggc cgg gat ttt gcc acc gtg 3249
Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
1070 1075 1080
cgg aaa gtg ctg agc atg ccc caa gtg aat atc gtg aaa aag acc 3294
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr
1085 1090 1095
gag gtg cag aca ggc ggc ttc agc aaa gag tct atc cgg ccc aag 3339
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
agg aac agc gat aag ctg atc gcc aga aag aag gac tgg gac cct 3384
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
aag aag tac ggc ggc ttc gtt agc ccc acc gtg gcc tat tct gtg 3429
Lys Lys Tyr Gly Gly Phe Val Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
ctg gtg gtg gcc aaa gtg gaa aag ggc aag tcc aag aaa ctg aag 3474
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
agt gtg aaa gag ctg ctg ggg atc acc atc atg gaa aga agc agc 3519
Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
ttc gag aag aat ccc atc gac ttt ctg gaa gcc aag ggc tac aaa 3564
Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
gaa gtg aaa aag gac ctg atc atc aag ctg cct aag tac tcc ctg 3609
Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
ttc gag ctg gaa aac ggc cgg aag aga atg ctg gcc tct gcc cgg 3654
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
ttc ctg cag aag gga aac gaa ctg gcc ctg ccc tcc aaa tat gtg 3699
Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
aac ttc ctg tac ctg gcc agc cac tat gag aag ctg aag ggc tcc 3744
Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
ccc gag gat aat gag cag aaa cag ctg ttt gtg gaa cag cac aag 3789
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
cac tac ctg gac gag atc atc gag cag atc agc gag ttc tcc aag 3834
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
aga gtg atc ctg gcc gac gct aat ctg gac aaa gtg ctg tcc gcc 3879
Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
tac aac aag cac cgg gat aag ccc atc aga gag cag gcc gag aat 3924
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
atc atc cac ctg ttt acc ctg acc aat ctg gga gcc cct cgg gcc 3969
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac cgg agc 4014
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Ala Tyr Arg Ser
1325 1330 1335
acc aaa gag gtg ctg gac gcc acc ctg atc cac cag agc atc acc 4059
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
ggc ctg tac gag aca cgg atc gac ctg tct cag ctg gga ggc gac 4104
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
taa 4107
<210>19
<211>81
<212>RNA
<213> Artificial sequence
<220>
<223> guide RNA
<400>19
ggaaauuagg ugcgcuuggc guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60
cguuaucaac uugaaaaagu g 81
<210>20
<211>23
<212>DNA
<213> Artificial sequence
<220>
<223> target DNA
<400>20
ggaaattagg tgcgcttggc tgg 23
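
As an illustrative check of SEQ ID NOs. 19 and 20 (not part of the original listing), the following Python sketch compares the first 20 bases of the guide RNA with the first 20 bases of the target DNA and inspects the bases that follow. Treating the 5' 20 bases of the guide RNA as the guide sequence, and the last 3 bases of the target DNA as the PAM-side flank, are assumptions made for this example; the function names are hypothetical.

```python
# SEQ ID NO. 19 (guide RNA) and SEQ ID NO. 20 (target DNA), copied from the listing.
GUIDE_RNA = (
    "ggaaauuaggugcgcuuggcguuuuagagcuagaaauagcaaguuaaaauaaggcuaguc"
    "cguuaucaacuugaaaaagug"
)
TARGET_DNA = "ggaaattaggtgcgcttggctgg"

# Assumption: the 5' 20 bases of the guide RNA form the guide sequence.
SPACER_LEN = 20


def rna_to_dna(rna: str) -> str:
    """Rewrite an RNA string in the DNA alphabet (u -> t)."""
    return rna.lower().replace("u", "t")


def split_target(target: str, spacer_len: int = SPACER_LEN):
    """Split a target into the guide-matching part and the bases after it."""
    return target[:spacer_len], target[spacer_len:]


def starts_with_ng(flank: str) -> bool:
    """True if the flank begins with any base followed by g (an NG motif)."""
    return len(flank) >= 2 and flank[0] in "acgt" and flank[1] == "g"


protospacer, flank = split_target(TARGET_DNA)
guide_matches = rna_to_dna(GUIDE_RNA[:SPACER_LEN]) == protospacer

print(len(GUIDE_RNA), len(TARGET_DNA))
print(guide_matches, flank, starts_with_ng(flank))
```

Because the guide sequence reads the same as the listed DNA strand (with u in place of t), it is complementary to the opposite strand of the target double-stranded polynucleotide.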

Claims (33)

1. A protein which comprises a sequence having an amino acid sequence in which 1 amino acid selected from the group consisting of alanine, glycine, cysteine, isoleucine, leucine, methionine, phenylalanine, proline, valine, threonine, asparagine and aspartic acid is substituted for arginine at position 1335 in the amino acid sequence represented by SEQ ID NO. 1, and which has a binding ability to a guide RNA.
2. The protein according to claim 1, wherein the amino acid sequence represented by SEQ ID NO. 1 further has a mutation at position 1219.
3. The protein according to claim 1 or 2, wherein the amino acid sequence represented by SEQ ID NO. 1 further has a mutation at position 1322.
4. A protein consisting of a sequence comprising an amino acid sequence in which arginine at position 1335 in the amino acid sequence represented by SEQ ID NO. 1 is substituted with 1 amino acid selected from the group consisting of alanine, glycine, cysteine, isoleucine, leucine, methionine, phenylalanine, proline, valine, threonine, asparagine and aspartic acid, and further having a mutation at position 1219, and having a binding ability to a guide RNA.
5. A protein consisting of a sequence comprising an amino acid sequence in which arginine at position 1335 in the amino acid sequence represented by SEQ ID NO. 1 is substituted with 1 amino acid selected from the group consisting of alanine, glycine, cysteine, isoleucine, leucine, methionine, phenylalanine, proline, valine, threonine, asparagine and aspartic acid, and further having a mutation at position 1322, and having a binding ability to a guide RNA.
6. The protein according to any one of claims 1 to 5, wherein the substitution of arginine at position 1335 is a substitution to alanine.
7. The protein of any one of claims 1 to 5, wherein the substitution of arginine at position 1335 is a substitution to isoleucine, methionine, threonine or valine.
8. A protein according to claim 2 or 4, wherein the mutation at position 1219 is a substitution of glutamic acid to phenylalanine.
9. A protein according to claim 3 or 5, wherein the mutation at position 1322 is a substitution of alanine to arginine, histidine or lysine.
10. The protein of claim 9, wherein the mutation at position 1322 is a substitution of alanine to arginine.
11. The protein according to any one of claims 1 to 10, wherein the amino acid sequence represented by SEQ ID NO. 1 further has a mutation at at least 1 position selected from the group consisting of positions 1111, 1135, 1218 and 1337.
12. The protein according to claim 11, wherein the amino acid sequence represented by SEQ ID NO. 1 further has mutations at at least 2 positions selected from the group consisting of positions 1111, 1135, 1218 and 1337.
13. The protein according to claim 11, wherein the amino acid sequence represented by SEQ ID NO. 1 further has mutations at at least 3 positions selected from the group consisting of positions 1111, 1135, 1218 and 1337.
14. The protein according to claim 11, wherein the amino acid sequence represented by SEQ ID NO. 1 further has mutations at positions 1111, 1135, 1218, and 1337.
15. The protein of any one of claims 11 to 14, wherein,
the mutation at position 1111 is a substitution of leucine to arginine, histidine or lysine;
the mutation at position 1135 is a substitution of aspartic acid to valine;
the mutation at position 1218 is a substitution of glycine to arginine, histidine or lysine; and
the mutation at position 1337 is a substitution of threonine to arginine, histidine or lysine.
16. The protein according to any one of claims 1 to 15, wherein the sites other than the mutated positions have 80% or more homology to the corresponding sites of SEQ ID NO. 1.
17. The protein according to any one of claims 1 to 15, wherein one to several amino acids are substituted, deleted, inserted and/or added at positions of SEQ ID NO. 1 other than the mutated positions.
18. The protein according to any one of claims 1 to 17, which has an RNA-guided DNA endonuclease activity.
19. The protein according to any one of claims 1 to 16, which further comprises, in the amino acid sequence represented by SEQ ID NO. 1, a mutation that abolishes part or all of the nuclease activity.
20. The protein according to claim 19, wherein the mutation that abolishes part or all of the nuclease activity is a mutation at (i) at least 1 position selected from the group consisting of positions 10, 762, 839, 983 and 986 or a position corresponding thereto and/or (ii) a position selected from the group consisting of positions 840 and 863 or a position corresponding thereto in the amino acid sequence represented by SEQ ID NO. 1.
21. The protein of claim 20, wherein the aspartic acid at position 10 is substituted with alanine or asparagine; or
histidine at position 840 is substituted with alanine, asparagine or tyrosine.
22. The protein of any one of claims 19 to 21 linked to a transcriptional regulator protein or domain.
23. The protein of claim 22, wherein the transcriptional regulator is a transcriptional activator.
24. The protein of claim 22, wherein the transcriptional regulator is a transcriptional silencer or transcriptional repressor.
25. A nucleic acid encoding a protein according to any one of claims 1 to 24.
26. A protein-RNA complex comprising the protein according to any one of claims 1 to 24 and a guide RNA comprising a polynucleotide consisting of a nucleotide sequence complementary to the nucleotide sequence extending from 1 base upstream to 20 to 24 bases upstream of a PAM sequence in a target double-stranded polynucleotide, wherein PAM denotes a protospacer adjacent motif.
27. A method for site-specifically modifying a target double-stranded polynucleotide, the method comprising the steps of:
a step of mixing and incubating a target double-stranded polynucleotide, a protein and a guide RNA; and
a step in which the protein modifies the target double-stranded polynucleotide at a binding site located upstream of the PAM sequence, wherein
the target double-stranded polynucleotide has a PAM sequence consisting of NG, wherein N is an arbitrary base and G is guanine,
the protein is the protein according to any one of claims 1 to 24, and
the guide RNA comprises a polynucleotide consisting of a nucleotide sequence complementary to the nucleotide sequence extending from 1 base upstream to 20 to 24 bases upstream of the PAM sequence in the target double-stranded polynucleotide.
28. The method of claim 27, wherein the modification is site-specific cleavage of the target double-stranded polynucleotide.
29. The method of claim 27, wherein the modification is a site-specific substitution, deletion, and/or addition of 1 or more nucleotides in the target double-stranded polynucleotide.
30. A method of increasing expression of a target gene in a cell, the method comprising: expressing the protein of claim 23 and 1 or more guide RNAs against the target gene in the cell.
31. A method of reducing expression of a target gene in a cell, the method comprising: expressing the protein of claim 24 and 1 or more guide RNAs against the target gene in the cell.
32. The method of claim 30 or 31, wherein the cell is a eukaryotic cell.
33. The method of claim 30 or 31, wherein the cell is a yeast cell, a plant cell, or an animal cell.
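
The positions named in the claims can be located in a row of the sequence listing from the cumulative nucleotide count printed at the end of that row. The Python sketch below is an illustration only, with hypothetical helper names: it translates the codon row of the listing that ends at nucleotide 4014 using the standard genetic code (an assumption that agrees with the three-letter residues printed beneath that row) and reports the residues at positions 1335 and 1337.

```python
from itertools import product

# Standard genetic code in t, c, a, g order (assumed to apply to the listing).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Codon row copied from the sequence listing; 4014 is its printed end count.
ROW = "ttc aag tac ttt gac acc acc atc gac cgg aag gcc tac cgg agc"
ROW_END_NT = 4014


def translate_row(row: str) -> str:
    """Translate space-separated codons into one-letter residues."""
    return "".join(CODON_TABLE[codon] for codon in row.split())


def residue_positions(row: str, end_nt: int) -> dict:
    """Map each residue position in the row to its one-letter residue."""
    if end_nt % 3 != 0:
        raise ValueError("end count must be a multiple of 3")
    residues = translate_row(row)
    last = end_nt // 3                 # position of the last residue in the row
    first = last - len(residues) + 1   # position of the first residue in the row
    return {first + i: aa for i, aa in enumerate(residues)}


positions = residue_positions(ROW, ROW_END_NT)
print(translate_row(ROW))
print(min(positions), max(positions))
print(positions[1335], positions[1337])
```

In this row, position 1335 is alanine and position 1337 is arginine, which is consistent with the substitutions recited in claims 6 and 15.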
HK62020016607.7A 2017-05-31 2018-05-31 Modified cas9 protein and use thereof HK40027079A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2017-108556 2017-05-31

Publications (1)

Publication Number Publication Date
HK40027079A (en) 2021-01-15
