NL2028346B1 - gRAMP protein for modulating a target mRNA - Google Patents
- Publication number
- NL2028346B1
- Authority
- NL
- Netherlands
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
Abstract
The invention provides a control method for modulating a target mRNA (50) using a gRAMP protein (60), wherein a gRAMP amino acid sequence of the gRAMP protein (60) has at least 28% sequence identity to SEQ ID NO:1 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:1, wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:1, and wherein the gRAMP protein (60) is functionally associated with a guide RNA (30), wherein the guide RNA (30) is configured to hybridize with the target RNA (50), wherein the control method comprises exposing the target RNA (50) to the gRAMP protein (60).
Description
gRAMP protein for modulating a target mRNA
FIELD OF THE INVENTION
The invention relates to an isolated or recombinant polynucleotide. The invention further relates to an isolated or recombinant gRAMP protein. The invention further relates to a modification method. The invention further relates to a recombinant cell. The invention further relates to a control method. The invention further relates to a use of the polynucleotide or the recombinant gRAMP protein.
BACKGROUND OF THE INVENTION
RNA-targeting CRISPR-Cas systems are known in the art. For instance, Terns, “CRISPR-Based Technologies: Impact of RNA-Targeting Systems”, Molecular Cell, 2018, summarizes information on RNA-targeting CRISPR-Cas systems and describes advances in converting them into programmable RNA-binding and cleavage tools with biotechnological and biomedical applications.
SUMMARY OF THE INVENTION
In vitro and in vivo modulation of polynucleotides and polypeptides, such as to steer cellular function, may require flexible and precise tools for practical applications.
In recent years, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-systems have been identified, and further engineered, to provide a set of tools for the modulation of DNA and RNA, including cleaving, editing, and binding of DNA and RNA. However, the CRISPR systems described in the prior art may suffer from one or more drawbacks.
In particular, CRISPR systems described in the prior art may require a plurality of different proteins in order to provide the modulating effect, which may complicate their use in practice.
Further, CRISPR systems described in the prior art may be unspecific with regard to target recognition and/or may have a high degree of off-target cleavage, such as bystander RNA cleavage, which may for some applications be undesirable as it may hamper pinpointed modulations, and may be unacceptable for other applications, such as for medical applications. Similarly, prior art CRISPR systems may produce second messengers, such as cyclic oligoadenylates, which may inadvertently influence other cellular processes.
Yet further, prior art CRISPR systems may be inexact regarding a cleavage position, which may also hamper pinpointed modulations.
In addition, prior art CRISPR systems may require specific sequences, such as Protospacer Adjacent Motif (PAM) sequences, to be present directly adjacent to a target site in a target nucleotide in order to modulate the target site.
The prior art may primarily describe CRISPR systems in the context of modulating a target polynucleotide, which may optionally affect the activity of a downstream protein, such as by preventing the protein from being synthesized. The prior art may not, however, describe a CRISPR system that may directly activate an associated protein, such as a protease, upon recognition of a target RNA complementary to a guide RNA.
Hence, it is an aspect of the invention to provide an alternative protein, and a polynucleotide encoding such protein, which preferably further at least partly obviates one or more of the above-described drawbacks. The present invention may have as its object to overcome or ameliorate at least one of the disadvantages of the prior art, or to provide a useful alternative.
Hence, in a first aspect, the invention may provide an isolated or recombinant polynucleotide (“polynucleotide”) comprising a coding sequence encoding a gRAMP protein, especially wherein the gRAMP protein is an RNA-guided RNA endonuclease. In embodiments, a gRAMP amino acid sequence of the gRAMP protein may have at least 28% (pairwise) sequence identity to SEQ ID NO:1 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:1, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:1.
Hence, the polynucleotide of the invention may encode a gRAMP protein, which may provide several benefits. In particular, the gRAMP protein of the invention may functionally associate to a guide RNA (“gRNA”), such as a CRISPR RNA (“crRNA”), and be able to recognize a target RNA — at least partially complementary to the gRNA — and cleave (or “cut”) the target RNA in one or more well-defined positions in a programmable manner, especially without off-target effects. In addition the gRAMP protein may functionally associate to a protease, especially TPR-CHAT, the activity of which may especially be controllable via the gRAMP protein, the guide RNA and/or the presence of a target RNA.
Further, in specific embodiments, the gRAMP protein may cut the target RNA in two well-defined positions. In further embodiments, the gRAMP protein may cut the target RNA in a single well-defined position.
In embodiments, the gRAMP protein may be applied for, for example, RNA knockdown, gene regulation, both in prokaryotic and eukaryotic cells, sequence specific RNA editing, precursor crRNA cleavage. In further embodiments, the gRAMP protein may together with a TPR-CHAT protein be applied as an RNA-activated protease system.
Hence, compared to prior art solutions, the gRAMP protein of the invention may: (i) be simpler in use, as only a single protein is needed, (ii) be more precise (or “accurate”), as it may cleave a target RNA in well-defined positions with limited to no off-target cleavage, and (iii) provide a higher degree of modularity, as it may functionally associate to a protease, especially TPR-CHAT.
In specific embodiments, the invention may provide an isolated or recombinant polynucleotide comprising a coding sequence encoding a gRAMP protein, wherein a gRAMP amino acid sequence of the gRAMP protein has at least 28% sequence identity to SEQ ID NO:1 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:1, wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:1, and wherein the gRAMP protein is an RNA-guided RNA endonuclease.
In particular, SEQ ID NO:1 may be an RNA-guided RNA endonuclease, i.e., a protein corresponding to SEQ ID NO:1 may be an RNA-guided RNA endonuclease.
Hence, the invention may provide an isolated or recombinant polynucleotide (“polynucleotide”), especially an isolated or recombinant RNA molecule, or especially an isolated or recombinant DNA molecule.
The term “isolated” with regards to the polynucleotide (and similarly with regards to the gRAMP protein; see below) may herein refer to a polynucleotide isolated from its natural environment. In embodiments, the polynucleotide may especially be an isolated polynucleotide.
The term “recombinant” with regards to the polynucleotide (and similarly with regards to the gRAMP protein; see below) may herein refer to a polynucleotide that relates to or contains genetically engineered DNA, such as a polynucleotide produced via genetic engineering. The polynucleotide may be (artificially) synthesized, or may be engineered to deviate from a naturally occurring polynucleotide, especially with regards to its nucleotide sequence. In embodiments, the polynucleotide may be a recombinant polynucleotide.
In further embodiments, the polynucleotide, especially the coding sequence, may have a non-naturally occurring sequence. In further embodiments, the polynucleotide may encode a non-naturally occurring amino acid sequence.
The polynucleotide may comprise a coding sequence encoding a gRAMP protein. The term “gRAMP protein” may herein refer to a “giant Repeat Associated Mysterious Protein”. In particular, the gRAMP protein may be an effector protein of a class 1, type III-E CRISPR-Cas system, classified according to Makarova et al. “Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants”, which is hereby incorporated herein by reference.
Class 1 CRISPR-Cas systems may generally comprise effector modules consisting of a plurality of Cas (CRISPR-associated) proteins that together may, for instance, mediate pre-crRNA processing and RNA modulation.
Relatively recently, the new CRISPR-Cas subtype III-E was bioinformatically predicted and was determined to occur in several species. Type III-E loci typically encode ancillary proteins, generally including a TPR-CHAT protein. Various type III-E loci appear to encode adaptation machinery fused to a reverse transcriptase. The gRAMP protein may contain protein domains that are structurally homologous to subunits found in type III effectors, which led to the subtype III-E classification.
Surprisingly, the inventors discovered that the gRAMP protein of a type III-E CRISPR-Cas system, particularly the gRAMP protein of Candidatus “Scalindua brodae” (or “Candidatus Scalindua brodae”), may functionally associate to a guide RNA, such as a crRNA, may subsequently associate with a target RNA complementary to the guide RNA, and may cleave the target RNA in two well-defined target sites. Hence, the gRAMP protein may be an RNA-guided RNA endonuclease, i.e., a guide RNA, especially a crRNA, may guide the gRAMP protein to a target RNA, and the gRAMP protein may be configured to subject the target RNA to an endonuclease activity, that is, it may cleave the target RNA. In particular, the gRAMP protein may not require further functionally coupled proteins for these functions, which is convenient for practical applications. In addition, in embodiments, the gRAMP protein may catalyze the maturation of pre-crRNA into crRNAs, further simplifying practical applications as no separate protein may be required.
In embodiments, a gRAMP amino acid sequence of the gRAMP protein may have at least 28% sequence identity to SEQ ID NO:1 with respect to a (gRAMP) sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:1, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:1. In further embodiments, the gRAMP amino acid sequence may have at least 30% sequence identity with the gRAMP amino acid sequence of SEQ ID NO:1, such as > 40% sequence identity, especially > 50%, such as > 60%, especially > 70%, such as > 75%, especially > 80%, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:1, wherein the sequence alignment has a length > 50% of the sequence length of SEQ ID NO:1, such as > 60%, especially > 70%, such as > 80%, especially > 90%, such as > 99%, including 100%, wherein the gRAMP protein is an RNA-guided RNA endonuclease, especially with properties similar to SEQ ID NO:1.
In general, if two proteins consist of (highly) similar amino acid sequences, these two proteins may be likely to perform the same biological function.
This relation between amino acid sequence and protein function may, for example, be used to predict the function of a protein based on its sequence identity with proteins of known function (annotation by sequence homology based inference). The term “sequence identity” herein refers to the percentage of the characters (such as amino acids in an amino acid sequence) in the shorter of two sequences matching an identical character in the longer of the two sequences in a sequence alignment (also see below). The higher the sequence identity between two proteins, the higher the chance may be that these two proteins have the same or a similar function.
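The definition of sequence identity given above can be illustrated with a minimal Python sketch. It assumes the two sequences are supplied already aligned, as equal-length strings with `-` marking gaps; the function name `percent_identity` and the toy sequences in the usage note are hypothetical and are not taken from the sequence listing. Note that alignment tools such as BLASTp may report identity relative to the alignment length instead, so values can differ from this definition.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity as defined in the text: identical aligned characters
    divided by the length of the shorter (ungapped) sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count alignment columns where both sequences carry the same residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Ungapped lengths of the two sequences.
    len_a = len(aligned_a.replace(gap, ""))
    len_b = len(aligned_b.replace(gap, ""))
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("sequences must not be empty")
    return 100.0 * matches / shorter
```

For the toy alignment `MKLE-GI` against `MKIEAGI`, five columns match and the shorter sequence has six residues, so the function returns about 83.3.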
Although there may not be a hard rule for inferring functional identity or similarity based on a threshold value for sequence identity, especially as the threshold value may need to be adjusted based on (relative) sequence length and/or protein function, proteins may have been successfully annotated based on a common rule-of-thumb threshold of at least 30-40% sequence identity.
Hence, proteins similar to SEQ ID NO:1 in both length (including both shorter and longer) and amino acid sequence may have a similar activity as SEQ ID NO:1. Amino acid sequence alignments may especially be obtained using BLASTp at the website of the National Center for Biotechnology Information (NCBI). Two sequences may be aligned via BLASTp, especially using default algorithm parameters, such as using a BLOSUM62 matrix with a gap cost of 11:1 (existence:extension). Hence, in embodiments, the sequence alignment of the gRAMP amino acid sequence and SEQ ID NO:1 may be a BLASTp pairwise sequence alignment obtained with a BLOSUM62 matrix with an existence gap cost of 11 and an extension gap cost of 1. In embodiments, the gRAMP amino acid sequence may be shorter or longer than SEQ ID NO:1. Hence, in embodiments, the gRAMP amino acid sequence of the gRAMP protein may have a sequence length > 50% of the sequence length of SEQ ID NO:1, such as > 60%, especially > 70%, such as > 80%, especially > 90%, such as > 100%, especially > 120%, wherein the gRAMP protein is an RNA-guided RNA endonuclease.
Similarly, in further embodiments, the gRAMP amino acid sequence of the gRAMP protein may have a sequence length < 200% of the sequence length of SEQ ID NO:1, such as < 180%, especially < 160%, such as < 140%, especially < 130%, such as < 120%, especially < 110%, such as < 100%, wherein the gRAMP protein is an RNA-guided RNA endonuclease.
Specifically, SEQ ID NO:1 corresponds to the amino acid sequence of the gRAMP protein of Candidatus Scalindua brodae. A one-on-one comparison was made with 13 amino acid sequences corresponding to gRAMP proteins of other annotated type III-E CRISPR systems, specifically with SEQ ID NO:2-14, which may likely have a high functional similarity to SEQ ID NO:1. In the comparison, the sequence alignment had a length varying between 82% and 99% of SEQ ID NO:1, and had a sequence identity varying between 29% and 43% along the sequence alignments. Hence, in embodiments, the gRAMP amino acid sequence of the gRAMP protein may have at least 28% sequence identity to SEQ ID NO:1 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:1, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:1.
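The two-part criterion (at least 28% identity over an alignment spanning at least 80% of the reference length) may be sketched as a simple check. The function name and parameters below are hypothetical; the identity value and alignment length are assumed to come from a pairwise alignment obtained elsewhere, and the reference length of 1000 used in the usage note is a placeholder, not the actual length of SEQ ID NO:1.

```python
def meets_gramp_criterion(
    identity_pct: float,
    alignment_length: int,
    reference_length: int,
    min_identity_pct: float = 28.0,
    min_coverage: float = 0.80,
) -> bool:
    """Return True if the alignment satisfies both thresholds from the text:
    identity >= min_identity_pct and alignment length >= min_coverage of the
    reference sequence length."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    coverage = alignment_length / reference_length
    return identity_pct >= min_identity_pct and coverage >= min_coverage
```

With the placeholder reference length of 1000, an alignment of length 820 at 29% identity passes, whereas an alignment of length 790 at 43% identity fails on the length requirement.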
In further embodiments, the gRAMP amino acid sequence of the gRAMP protein may be SEQ ID NO:1.
In further embodiments, the gRAMP amino acid sequence may have at least 50% sequence identity, such as > 60%, especially > 70%, such as > 75%, especially > 80%, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a sequence alignment between the gRAMP amino acid sequence and a (gRAMP) reference sequence, wherein the sequence alignment has a length > 60% of the sequence length of the reference sequence, especially > 70%, such as > 80%, especially > 90%, such as > 99%, including 100%, wherein the gRAMP protein is an RNA-guided RNA endonuclease, and wherein the reference sequence is selected from the group comprising SEQ ID NO:1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, especially SEQ ID NO:2, or especially SEQ ID NO:3, or especially SEQ ID NO:4, or especially SEQ ID NO:5, or especially SEQ ID NO:6, or especially SEQ ID NO:7, or especially SEQ ID NO:8, or especially SEQ ID NO:9, or especially SEQ ID NO:10, or especially SEQ ID NO:11, or especially SEQ ID NO:12, or especially SEQ ID NO:13, or especially SEQ ID NO:14.
In further embodiments, the gRAMP amino acid sequence may comprise at least one difference to the reference sequence, especially to SEQ ID NO:1, or especially to (each of) the reference sequences. Hence, in embodiments, the gRAMP protein may be a recombinant gRAMP protein, wherein the gRAMP amino acid sequence of the recombinant gRAMP protein is engineered to have at least one difference with respect to (each of) the reference sequences. The difference may especially be an amino acid deletion, addition, and/or substitution. Hence, in embodiments, the gRAMP protein may be a non-naturally occurring protein.
In embodiments, the gRAMP protein may comprise 1400-2300 amino acids (AA), such as 1550-2100 AA, especially 1600-2000 AA.
Based on bioinformatics analysis, it appears that for some organisms the gRAMP protein of type III-E CRISPR systems may be encoded in two open reading frames, i.e., two ORFs each encode part of the gRAMP protein for these organisms. Hence, in embodiments, the coding sequence may code for a first gRAMP protein part and for a second gRAMP protein part, wherein the first gRAMP protein part has at least 30% sequence identity, such as > 40% sequence identity, especially > 50% sequence identity, such as > 60%, especially > 70%, such as > 75%, especially > 80%, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a first part sequence alignment between (the AA sequence of) the first gRAMP protein part and SEQ ID NO:58, wherein the first part sequence alignment has a length > 60% of the sequence length of SEQ ID NO:58, especially > 70%, such as > 80%, especially > 90%, such as > 99%, including 100%, and wherein the second gRAMP protein part has at least 30% sequence identity, such as > 40% sequence identity, especially > 50% sequence identity, such as > 60%, especially > 70%, such as > 75%, especially > 80%, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a second part sequence alignment between (the AA sequence of) the second gRAMP protein part and SEQ ID NO:59, wherein the second part sequence alignment has a length > 60% of the sequence length of SEQ ID NO:59, especially > 70%, such as > 80%, especially > 90%, such as > 99%, including 100%.
As indicated above, the gRAMP protein may comprise a plurality of protein domains, especially one or more of a first domain, a second domain, a third domain, and a fourth domain. The plurality of protein domains may have structural similarity to Csm3 and/or Cmr4, and may herein be referred to as Csm3/Cmr4-like domains. In embodiments, the gRAMP protein may comprise at least three domains of the first domain, the second domain, the third domain, and the fourth domain. The term “protein domain” may herein refer to a distinct functional and/or structural part of a protein, which may typically contribute to, especially be responsible for, a particular function or interaction, which may contribute to the overall function of the protein. Due to the relationship between protein domains and specific functions (of the protein), protein domains may also be indicative of the function of the encompassing protein and may thus be used for protein annotation.
In embodiments, the gRAMP amino acid sequence may comprise a first domain sequence, wherein the first domain sequence may have at least 25% sequence identity, such as at least 30% sequence identity, such as at least 50%, especially at least 70%, such as at least 80%, especially at least 90%, to SEQ ID NO:23 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:23, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:23.
In further embodiments, the gRAMP amino acid sequence may comprise a second domain sequence, wherein the second domain sequence may have at least 30% sequence identity, such as at least 50%, especially at least 55%, such as at least 60%, especially at least 70%, such as at least 80%, especially at least 90%, to SEQ ID NO:24 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:24, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:24.
In further embodiments, the gRAMP amino acid sequence may comprise a third domain sequence, wherein the third domain sequence may have at least 30% sequence identity, such as at least 35%, especially at least 40%, such as at least 50%, especially at least 70%, such as at least 80%, especially at least 90%, to SEQ ID NO:25 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:25, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:25.
In further embodiments, the gRAMP amino acid sequence may comprise a fourth domain sequence, wherein the fourth domain sequence may have at least 30% sequence identity, such as at least 35%, especially at least 40%, such as at least 50%, especially at least 70%, such as at least 80%, especially at least 90%, to SEQ ID NO:26 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:26, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:26.
In embodiments, in the gRAMP amino acid sequence, the first domain sequence may be arranged upstream of the second domain sequence, or especially of the third domain sequence, or especially of the fourth domain sequence. In further embodiments, the second domain sequence may be arranged upstream of the third domain sequence, or especially of the fourth domain sequence. In further embodiments, the third domain sequence may be arranged upstream of the fourth domain sequence.
The terms “upstream” and “downstream” may relate to an arrangement of genetic features, especially relative to the reading direction of a coding sequence, or especially to the thereby encoded amino acid sequence. Hence, an element arranged at the 5’ end relative to another genetic feature (especially with respect to the reading direction of a gene) may be referred to as “upstream”, such as a promoter being upstream of the respective gene, whereas an element arranged at the 3’ end relative to another genetic feature may be referred to as “downstream”. As translation may occur in the 5’-3’ direction and may extend the encoded protein from its N-terminus towards its C-terminus, an element, such as a protein domain, arranged at the N-terminal side of a protein relative to a second element may herein be referred to as “upstream” of the second element.
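The upstream/downstream arrangement of the domain sequences can be sketched as an ordering check on domain start positions within the amino acid sequence (counted from the N-terminus). The domain names, the function `domains_in_order`, and the example positions in the usage note are hypothetical illustrations, not values from the sequence listing.

```python
from typing import Mapping

# Expected N-terminal to C-terminal order of the domain sequences.
DOMAIN_ORDER = ("first", "second", "third", "fourth")


def domains_in_order(domain_starts: Mapping[str, int]) -> bool:
    """Return True if every domain that is present starts upstream
    (closer to the N-terminus) of all later-numbered domains present.

    domain_starts maps a domain name to its start position in the
    amino acid sequence; missing domains are simply skipped.
    """
    present = [domain_starts[name] for name in DOMAIN_ORDER if name in domain_starts]
    return all(a < b for a, b in zip(present, present[1:]))
```

For example, `domains_in_order({"first": 10, "second": 300, "fourth": 900})` returns `True`, because the three domains present appear in the expected order even though the third domain is absent.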
In embodiments, the gRAMP protein may be configured to functionally associate with a guide RNA. The term “guide RNA” may herein refer to an RNA molecule configured to functionally associate with the gRAMP protein, and especially configured to hybridize with a target RNA. In particular, the guide RNA, especially a spacer region of the guide RNA (see below), may be configured to hybridize with the target RNA along a complementary region. In embodiments, the guide RNA may comprise a natural crRNA, or may comprise an engineered version thereof. In further embodiments, the guide RNA may comprise a single guide RNA (sgRNA).
In further embodiments, the target RNA may comprise a target messenger RNA (mRNA).
In further embodiments, the guide RNA may have a length of at least 20 nucleotides, such as at least 25 nucleotides, especially at least 26 nucleotides. In further embodiments, the guide RNA may have a length of at least 30 nucleotides, such as at least 35 nucleotides, especially at least 40 nucleotides. In further embodiments, the guide RNA may have a length of at least 42 nucleotides, such as at least 45 nucleotides, especially at least 50 nucleotides. In further embodiments, the guide RNA may have a length of at most 200 nucleotides, such as at most 100 nucleotides, especially at most 80 nucleotides, such as at most 65 nucleotides.
In further embodiments, the complementary region may have a length of at least n nucleotides, especially wherein n is at least 12, such as at least 13, especially at least 15. In further embodiments, n may be at least 18, especially at least 20. In further embodiments, the complementary region may have a length of at least n nucleotides, especially wherein n is at least 25, such as at least 30, especially at least 35. In further embodiments, the complementary region may have a length of at least 50% of the (length of the) guide RNA, such as at least 60%, especially at least 70%. In further embodiments, the complementary region may have a length of at least 80% of the guide RNA, such as at least 85%, especially at least 90%. In further embodiments, the complementary region may have a length of at least 80% of the spacer region of the guide RNA, such as at least 85%, especially at least 90%, including 100%.
In embodiments, nucleotides of the guide RNA may be complementary to nucleotides of the target RNA along the (entire) complementary region. However, in further embodiments, the guide RNA and the target RNA may only be partially complementary along the complementary region, i.e., there may be some mismatches between the nucleotides of the guide RNA and the target RNA. For instance, in embodiments, the complementarity of the nucleotides directly following the target sites may not impact the target RNA recognition, i.e., nucleotides 5 and 11 of the spacer region of the guide RNA may be non-complementary with the aligned nucleotides of the target RNA without (substantial) effect on the activity of the gRAMP protein with respect to the target RNA.
In embodiments, the guide RNA and the target RNA may comprise at least 0.5*n complementary nucleotides along the complementary region, such as at least 0.6*n, especially at least 0.7*n. In further embodiments, the guide RNA and the target RNA may comprise at least 0.8*n complementary nucleotides along the complementary region, such as at least 0.85*n, especially at least 0.9*n, including n. In further embodiments, the guide RNA and the target RNA may have at least n-10 complementary nucleotides in the complementary region, such as at least n-7 complementary nucleotides, such as at least n-5 complementary nucleotides. In further embodiments, the guide RNA and the target RNA may have at least n-3 complementary nucleotides in the complementary region, such as at least n-2 complementary nucleotides, such as at least n-1 complementary nucleotides, including n complementary nucleotides.
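The complementarity thresholds above may be illustrated by counting complementary nucleotides between the guide RNA and the target RNA along a complementary region of length n. The sketch below rests on two assumptions that the text does not spell out: only Watson-Crick pairs (A-U, G-C) are counted as complementary, and both strands are written 5’ to 3’ and pair in antiparallel orientation. The function name and the toy sequences in the usage note are hypothetical.

```python
# Assumption: only canonical Watson-Crick RNA base pairs count as complementary.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_complementary(guide_region: str, target_region: str) -> int:
    """Count complementary nucleotides between two equal-length RNA regions.

    Both regions are written 5'->3'. The strands are assumed to pair in
    antiparallel orientation, so the target region is reversed before
    position-by-position comparison.
    """
    if len(guide_region) != len(target_region):
        raise ValueError("regions must have the same length n")
    guide = guide_region.upper()
    target_reversed = target_region.upper()[::-1]
    return sum((g, t) in WATSON_CRICK for g, t in zip(guide, target_reversed))
```

For a 12-nucleotide guide region `AUGGCUAGCUAG`, the fully complementary target region `CUAGCUAGCCAU` yields n = 12 complementary nucleotides, while `AUAGCUAGCCAU` (one mismatch) yields 11, which still satisfies both the 0.9*n and the n-1 thresholds.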
The gRAMP protein may especially be configured to cut the target RNA at a target site, especially upon (or “following”) hybridization of the guide RNA with the target RNA. The term “target site” may also refer to a plurality of (different) target sites.
In embodiments, the (first) target site may be arranged in the target RNA between nucleotides aligned with nucleotides 4 and 5 of the spacer region of the guide RNA in the complementary region. In particular, the guide RNA may comprise a spacer region and a repeat region. Conventionally, the nucleotides in the guide RNA may be numbered according to their position relative to the border between the spacer region and the repeat region, wherein nucleotides are numbered positively for the spacer region and negatively for the repeat region. Hence, nucleotides 4-5 in the guide RNA may correspond to the fourth and fifth nucleotide counted from the repeat region of the guide RNA. Hence, in embodiments, the gRAMP protein may be configured to cut the target RNA between nucleotides (of the target RNA) aligned with nucleotides 4 and 5 of the guide RNA.
In further embodiments, the (second) target site may be arranged in the target RNA between nucleotides aligned with nucleotides 10 and 11 of the spacer region of the guide RNA in the complementary region, especially counted from the 5’ end of the target RNA. Hence, in embodiments, the gRAMP protein may be configured to cut the target RNA between nucleotides (of the target RNA) aligned with nucleotides 10 and 11 of the guide RNA.
In specific embodiments, the gRAMP protein may be configured to cut the target RNA at (exactly) two target sites, especially wherein a first target site is arranged in the target RNA between nucleotides aligned with nucleotides 4 and 5 of the guide RNA in the complementary region, and especially wherein a second target site is arranged in the target RNA between nucleotides aligned with nucleotides 10 and 11 of the guide RNA in the complementary region. In further embodiments, the gRAMP protein may be configured to cut the target RNA at a single target site. In particular, a recombinant gRAMP protein, comprising an amino acid sequence with one or more differences relative to SEQ ID NO:1, may be configured to cut the target RNA at a single target site. In particular, the gRAMP protein having a sequence according to SEQ ID NO: 1 may cut at two target sites, whereas a recombinant gRAMP protein with a different residue at position 698 of SEQ ID NO: 1 may cut at a single target site. In such embodiments, the gRAMP protein may (thus) be configured to cut a target RNA at a single target site, especially at the first target site.
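The mapping from spacer positions to cut positions in the target RNA can be sketched as follows. This is an illustration under stated assumptions, not a description of the enzyme's mechanism: the target is written 5’ to 3’, the spacer is numbered from the repeat-proximal end, the strands pair antiparallel over the whole complementary region, and therefore spacer nucleotide k is taken to pair with target index `region_start + region_length - k` (0-based). All names and the example coordinates are hypothetical.

```python
from typing import List, Sequence


def cleavage_fragments(
    target: str,
    region_start: int,
    region_length: int,
    cut_after_spacer_positions: Sequence[int] = (4, 10),
) -> List[str]:
    """Split a target RNA at sites aligned with given spacer positions.

    A value k in cut_after_spacer_positions means: cut the target between
    the nucleotides aligned with spacer nucleotides k and k+1.
    region_start is the 0-based index of the 5'-most target nucleotide of
    the complementary region (assumed antiparallel pairing, see lead-in).
    """
    if region_start < 0 or region_start + region_length > len(target):
        raise ValueError("complementary region lies outside the target")
    boundaries = set()
    for k in cut_after_spacer_positions:
        if not 1 <= k < region_length:
            raise ValueError("spacer position outside complementary region")
        # Spacer nt k pairs with index region_start + region_length - k and
        # spacer nt k+1 with the index one lower; the cut lies between them.
        boundaries.add(region_start + region_length - k)
    fragments = []
    previous = 0
    for boundary in sorted(boundaries):
        fragments.append(target[previous:boundary])
        previous = boundary
    fragments.append(target[previous:])
    return fragments
```

Under these assumptions, the default two sites (4/5 and 10/11) release a 6-nucleotide internal fragment, while passing `cut_after_spacer_positions=(4,)` models a single-site variant that yields two fragments.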
Hence, in embodiments, the amino acid sequence may in the sequence alignment have a first amino acid aligned with the aspartate of SEQ ID NO:1 at position 698, wherein the first amino acid differs from aspartate. In further embodiments, the first amino acid may be selected from the group comprising alanine, arginine, asparagine, cysteine, glutamine, glutamate, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine, especially from the group comprising alanine, arginine, asparagine, cysteine, glutamine, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine, more especially alanine.
In further embodiments, the gRAMP protein may be configured to cut the target RNA at a single target site, especially at the second target site. In such embodiments, a second amino acid of the amino acid sequence of the gRAMP protein may be aligned with a native second amino acid in a sequence alignment with SEQ ID NO: 1, wherein the second amino acid and the native second amino acid differ.
In specific embodiments, the amino acid sequence of the gRAMP protein may be SEQ ID NO:15, which corresponds to the D698A mutant of SEQ ID NO:1. In such embodiments, the gRAMP protein may be configured to cut at a single target site (in a target RNA).
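A point substitution such as D698A may be expressed as a simple string operation on the amino acid sequence. The sketch below uses a short toy sequence rather than SEQ ID NO:1, and the helper name `substitute` is hypothetical; it only illustrates the notation (native residue, 1-based position, new residue).

```python
def substitute(sequence: str, position: int, native: str, replacement: str) -> str:
    """Return the sequence with the residue at a 1-based position replaced.

    Raises ValueError if the residue found at that position is not the
    expected native residue, which guards against off-by-one errors.
    """
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[index] != native:
        raise ValueError(
            f"expected {native} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy example: a D3A substitution in a five-residue sequence.
toy_mutant = substitute("MKDLE", 3, "D", "A")
```

For the mutant described above, the same call would take the full reference sequence with position 698, native residue D, and replacement A.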
In further embodiments, the gRAMP protein may be configured to bind the target RNA, especially via the guide RNA, without cutting the target RNA. In particular, the gRAMP protein may comprise a plurality of amino acid differences with respect to SEQ ID NO:1, wherein the plurality of amino acid differences result in a loss of the endonuclease activity of the gRAMP protein.
The polynucleotide may, in embodiments, comprise DNA or RNA, especially DNA, or especially RNA.
In further embodiments, the polynucleotide may comprise DNA, especially wherein the polynucleotide further comprises one or more regulatory elements configured to control the expression of the coding sequence. For instance, the one or more regulatory elements may comprise one or more of a promoter, a response element, an enhancer and a silencer. In further embodiments, the one or more regulatory elements may comprise a promoter, especially wherein the promoter is configured to facilitate transcription initiation leading to transcription of the coding sequence. In further embodiments, the promoter may comprise an inducible or repressible promoter, especially an inducible promoter, which may facilitate controlling the expression of the coding sequence.
In further embodiments, the polynucleotide may comprise a (recombinant) construct, especially a plasmid (or “vector”). The construct may especially be configured for integration into the genome of a target cell (see below), especially such as to enable the target cell to transcribe the coding sequence.
In further embodiments, the polynucleotide may comprise a second coding sequence, especially wherein the second coding sequence encodes a TPR-CHAT protein. In embodiments, a TPR-CHAT amino acid sequence of the TPR-CHAT protein may have at least 25% sequence identity to SEQ ID NO:27 with respect to a (TPR-CHAT) sequence alignment between the TPR-CHAT amino acid sequence and SEQ ID NO:27, wherein the TPR-CHAT sequence alignment has a length of at least 60% of a sequence length of SEQ ID NO:27, and wherein the TPR-CHAT protein is a protease.
As indicated above, the gRAMP protein may functionally associate with the TPR-CHAT protein, especially to provide a gRAMP-TPR-CHAT complex, which complex may provide RNA-activated protease activity. For practical applications, it may be preferable for a single polynucleotide, such as a plasmid, to encode both the gRAMP protein and the TPR-CHAT protein. In particular, such embodiments may facilitate a more convenient integration of the coding sequences into a genome of a target cell, and may reduce a metabolic burden on the target cell, particularly if the genome integration occurs via introduction of a plasmid in the target cell (as otherwise two separate plasmids may be required, both of which may impose a metabolic burden).
In further embodiments, a TPR-CHAT amino acid sequence of the TPR-CHAT protein may have at least 25% sequence identity to SEQ ID NO:27 with respect to a TPR-CHAT sequence alignment between the TPR-CHAT amino acid sequence and SEQ ID NO:27, especially wherein the TPR-CHAT sequence alignment has a length of at least 60% of a sequence length of SEQ ID NO:27. In further embodiments, the TPR-CHAT amino acid sequence may have at least 30% sequence identity with the TPR-CHAT amino acid sequence of SEQ ID NO:27, such as > 40% sequence identity, especially > 50%, such as > 60%, especially > 70%, such as > 75%, especially > 80%, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a TPR-CHAT sequence alignment between the amino acid sequence and SEQ ID NO:27, wherein the TPR-CHAT sequence alignment has a length > 50% of the sequence length of SEQ ID NO:27, such as > 60%, especially > 70%, such as > 80%, especially > 90%, such as > 99%, including 100%, wherein the TPR-CHAT protein is a protease, especially with properties similar to SEQ ID NO:27. Specifically, SEQ ID NO:27 corresponds to the amino acid sequence of the TPR-CHAT protein of Candidatus Scalindua brodae. A one-on-one comparison was made with amino acid sequences corresponding to TPR-CHAT proteins of other annotated type III-E CRISPR systems, specifically with SEQ ID NO:28-47, which likely have a high functional similarity to SEQ ID NO:27. In the comparison, the sequence alignment had a length varying between 61 and 99% of SEQ ID NO:27, and a sequence identity varying between 25 and 44% along the sequence alignments.
Hence, in embodiments, the TPR-CHAT amino acid sequence of the TPR-CHAT protein may have at least 25% sequence identity to SEQ ID NO:27 with respect to a sequence alignment between the TPR-CHAT amino acid sequence and SEQ ID NO:27, especially wherein the sequence alignment has a length of at least 60% of a sequence length of SEQ ID NO:27. In further embodiments, the TPR-CHAT amino acid sequence of the TPR-CHAT protein may be SEQ ID NO:27.
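The two criteria used above, sequence identity along an alignment and alignment length relative to the reference, may be checked as in the following sketch. It assumes that an alignment is already available as two equally long gapped strings, that identity is counted over all alignment columns, and that the alignment length is counted as the number of reference residues covered. These conventions, the function names, and the toy sequences are assumptions for illustration; the source does not fix a particular alignment tool or counting rule in this passage.

```python
from typing import Tuple


def identity_and_coverage(
    aligned_query: str, aligned_reference: str, reference_length: int
) -> Tuple[float, float]:
    """Return (percent identity, percent of the reference covered)."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_query)
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(
        q == r and q != "-" for q, r in zip(aligned_query, aligned_reference)
    )
    covered = sum(r != "-" for r in aligned_reference)
    return 100.0 * matches / columns, 100.0 * covered / reference_length


def meets_thresholds(
    aligned_query: str,
    aligned_reference: str,
    reference_length: int,
    min_identity: float = 25.0,
    min_coverage: float = 60.0,
) -> bool:
    """Check the 'at least 25% identity over at least 60% length' criterion."""
    identity, coverage = identity_and_coverage(
        aligned_query, aligned_reference, reference_length
    )
    return identity >= min_identity and coverage >= min_coverage


# Toy alignment of five columns against a reference of ten residues.
toy_identity, toy_coverage = identity_and_coverage("MK-LE", "MKALD", 10)
```

In the toy case the identity threshold is met but the alignment covers only half of the reference, so the combined criterion fails.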
In further embodiments, the TPR-CHAT amino acid sequence may have at least 50% sequence identity, such as > 60%, especially > 70%, such as > 75%, especially > 80%, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a TPR-CHAT sequence alignment between the TPR-CHAT amino acid sequence and a TPR-CHAT reference sequence, wherein the TPR-CHAT sequence alignment has a length > 60% of the sequence length of the TPR-CHAT reference sequence, especially > 70%, such as > 80%, especially > 90%, such as > 99%, including 100%, wherein the TPR-CHAT protein is a protease, and wherein the reference sequence is selected from the group comprising SEQ ID NO:27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47, especially SEQ ID NO:28, or especially SEQ ID NO:29, or especially SEQ ID NO:30, or especially SEQ ID NO:31, or especially SEQ ID NO:32, or especially SEQ ID NO:33, or especially SEQ ID NO:34, or especially SEQ ID NO:35, or especially SEQ ID NO:36, or especially SEQ ID NO:37, or especially SEQ ID NO:38, or especially SEQ ID NO:39, or especially SEQ ID NO:40, or especially SEQ ID NO:41, or especially SEQ ID NO:42, or especially SEQ ID NO:43, or especially SEQ ID NO:44, or especially SEQ ID NO:45, or especially SEQ ID NO:46, or especially SEQ ID NO:47. In further embodiments, the reference sequence may be SEQ ID NO:27.
In further embodiments, the TPR-CHAT amino acid sequence may comprise at least one difference to the TPR-CHAT reference sequence, especially to SEQ ID NO:27, or especially to (each of) the TPR-CHAT reference sequences. Hence, in embodiments, the TPR-CHAT protein may be a recombinant TPR-CHAT protein, wherein the TPR-CHAT amino acid sequence of the recombinant TPR-CHAT protein is engineered to have at least one difference with respect to (each of) the TPR-CHAT reference sequences. The difference may especially be an amino acid deletion, addition, and/or substitution. Hence, in embodiments, the TPR-CHAT protein may be a non-naturally occurring protein.
In embodiments, the second coding sequence may be a non-naturally occurring sequence. In further embodiments, the second coding sequence may encode a non-naturally occurring amino acid sequence.
In particular, the gRAMP protein may functionally associate to a TPR domain of the TPR-CHAT protein. Hence, a second protein (domain) may be functionally associated to the gRAMP protein by a (translational) fusion of the TPR domain with a second protein (domain), such as a GFP domain, or a ligase (domain). Thereby, the second protein (domain) may essentially be guided to the target RNA (via the gRAMP protein and the guide RNA), and, in further embodiments, the gRAMP protein may be configured to control an activity of the second protein (domain) in dependence of a target RNA recognition. In particular, this may facilitate providing functional and modular (translational) fusion proteins allowing for functional association and/or dissociation, especially with the gRAMP protein.
Hence, in embodiments, the second coding sequence may encode a second protein, wherein the second protein comprises a TPR domain, wherein a TPR amino acid sequence of the second protein has at least 25% sequence identity to SEQ ID NO:88 with respect to a TPR sequence alignment between the TPR amino acid sequence and SEQ ID NO:88, especially wherein the TPR sequence alignment has a length of at least 60% of a sequence length of SEQ ID NO:88. In further embodiments, the TPR amino acid sequence may have at least 30% sequence identity with SEQ ID NO:88, such as > 40% sequence identity, especially > 50%, such as > 60%, especially > 70%, such as > 75%, especially > 80%, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a TPR sequence alignment between the amino acid sequence and SEQ ID NO:88, wherein the TPR sequence alignment has a length > 50% of the sequence length of SEQ ID NO:88, such as > 60%, especially > 70%, such as > 80%, especially > 90%, such as > 99%, including 100%, especially wherein the second protein is configured to functionally associate with the gRAMP protein.
In embodiments, the polynucleotide may comprise a CRISPR array. In particular, it may be practically convenient for the polynucleotide to comprise both the coding sequence and the CRISPR array.
The term “CRISPR array” may herein refer to a sequence comprising identical sequences (“repeats”) that are separated by variable sequences (“spacers”). In particular, the CRISPR array may be transcribed to a pre-crRNA (pre-CRISPR RNA), which may be processed, especially by gRAMP, into a plurality of guide RNAs. In particular, each guide RNA from a pre-crRNA may comprise a repeat region corresponding to (at least part of) a repeat of the CRISPR array and a spacer region corresponding to a spacer of the CRISPR array, wherein the spacer region is configured for hybridizing with a (corresponding) target RNA. Hence, the target RNA may be defined through the spacers included in the CRISPR array.
In further embodiments, the CRISPR array may comprise a spacer corresponding to the target RNA, i.e., a spacer encoding a guide RNA (as defined above). In further embodiments, the CRISPR array may comprise a plurality of spacers corresponding to a plurality of target RNAs.
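The relation between a CRISPR array and the guide RNAs derived from it may be sketched as follows. The sketch assumes a transcript that consists only of identical repeats and the spacers between them, and that each guide RNA retains a chosen number of nucleotides from the 3’-end of a repeat followed by the downstream spacer. The repeat, the spacers, the parameter `repeat_tail`, and the function name are hypothetical toy values and do not correspond to any SEQ ID NO.

```python
from typing import List


def guides_from_array(transcript: str, repeat: str, repeat_tail: int) -> List[str]:
    """Derive guide RNAs (repeat tail + spacer) from an array transcript.

    Assumes the transcript contains only identical repeats and spacers,
    and that every spacer directly follows a repeat.
    """
    if not 1 <= repeat_tail <= len(repeat):
        raise ValueError("repeat_tail must be between 1 and the repeat length")
    parts = transcript.split(repeat)
    # parts[0] precedes the first repeat; empty strings arise at the array ends.
    spacers = [part for part in parts[1:] if part]
    tail = repeat[-repeat_tail:]
    return [tail + spacer for spacer in spacers]


toy_repeat = "GGCCAA"
toy_transcript = toy_repeat + "UUUU" + toy_repeat + "ACGU" + toy_repeat
toy_guides = guides_from_array(toy_transcript, toy_repeat, repeat_tail=3)
```

Each entry of `toy_guides` then carries a repeat-derived region at its 5’ side and a spacer region at its 3’ side, so that adding a spacer to the array adds a further target.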
In further embodiments, the CRISPR array may comprise a plurality of repeats, especially wherein each repeat has at least 50% sequence identity, such as > 60%, especially > 70%, such as > 75%, especially > 80%, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a repeat sequence alignment between the (nucleotide sequence of the) repeat and a reference repeat sequence, especially wherein the repeat sequence alignment has a length > 60% of the sequence length of the reference repeat sequence, especially > 70%, such as > 80%, especially > 90%, such as > 99%, including 100%. In further embodiments, the reference repeat sequence may especially comprise a reference sequence selected from the group comprising SEQ ID NO:49 60 61 62, especially SEQ ID NO:49, or especially SEQ ID NO:60, or especially SEQ ID NO:61, or especially SEQ ID NO:62. A comparison of the repeat sequences SEQ ID NO:49 60 61 62 reveals a particularly high conservation at the 3’-end of the repeat sequences, especially at the (last) 15 nucleotides at the 3’-end of the repeat sequences. The high conservation may suggest that the 3’-end of the repeat sequence may contribute to the functional association with the gRAMP protein, to the cleavage activity of the (functionally associated) gRAMP protein, and/or to the activation of a TPR-CHAT protein (functionally associated with the gRAMP protein). Hence, in further embodiments, the CRISPR array may comprise a plurality of repeats, especially wherein each repeat has at least 80% sequence identity, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a repeat sequence alignment between the (nucleotide sequence of the) repeat and (a 3’-end of) a reference repeat sequence, especially wherein the repeat sequence alignment has a length > 13 nucleotides, especially at least 14 nucleotides, such as at least 15 nucleotides, and especially wherein the repeat sequence alignment comprises an alignment of the 3’-end of the reference repeat sequence. In further embodiments, the reference repeat sequence may especially comprise a reference sequence selected from the group comprising SEQ ID NO:49 60 61 62, especially SEQ ID NO:49, or especially SEQ ID NO:60, or especially SEQ ID NO:61, or especially SEQ ID NO:62.
In further embodiments, the CRISPR array may comprise a plurality of repeats, especially wherein each repeat has at least 80% sequence identity, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a repeat sequence alignment between the (nucleotide sequence of the) repeat and a reference repeat sequence, especially wherein the repeat sequence alignment has a length > 13 nucleotides, especially at least 14 nucleotides, such as at least 15 nucleotides. In further embodiments, the reference repeat sequence may especially comprise a reference sequence selected from the group comprising SEQ ID NO:79 80 81 82, especially SEQ ID NO:79, or especially SEQ ID NO:80, or especially SEQ ID NO:81, or especially SEQ ID NO:82.
In particular, the guide RNA, especially the crRNA, may comprise nucleotides corresponding to the conserved 3’-end of the repeat region. Hence, in embodiments, the guide RNA may comprise a repeat region and a spacer region. In further embodiments, the repeat region may have at least 80% sequence identity, such as > 85%, especially > 90%, such as > 93%, especially > 95%, such as > 97%, including 100% sequence identity, with respect to a repeat sequence alignment between the (nucleotide sequence of the) repeat region and a reference repeat sequence, especially wherein the repeat sequence alignment has a length > 13 nucleotides, especially at least 14 nucleotides, such as at least 15 nucleotides. In further embodiments, the reference repeat sequence may especially comprise a reference sequence selected from the group comprising SEQ ID NO:83 84 85 86, especially SEQ ID NO:83, or especially SEQ ID NO:84, or especially SEQ ID NO:85, or especially SEQ ID NO:86.
The spacer region may especially be configured to hybridize with a target RNA. In particular, the spacer region may be configured to hybridize with the target RNA along a complementary region. In further embodiments, the complementary region may have a length of at least n nucleotides, especially wherein n is at least 12, such as at least 13, especially at least 14, such as at least 15. In further embodiments, the complementary region may have a length of at least 80% of the spacer region, such as at least 85%, especially at least 90%, including 100%.
In further embodiments, the spacer region and the target RNA may comprise at least 0.7*n complementary nucleotides along the complementary region. In further embodiments, the spacer region and the target RNA may comprise at least 0.8*n complementary nucleotides along the complementary region, such as at least 0.85*n, especially at least 0.9*n, including n. In further embodiments, the spacer region and the target RNA may have at least n-5 complementary nucleotides in the complementary region, such as at least n-4 complementary nucleotides. In further embodiments, the spacer region and the target RNA may have at least n-2 complementary nucleotides in the complementary region, such as at least n-1 complementary nucleotides, including n complementary nucleotides.
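The complementarity criteria above (for instance at least 0.7*n or at least n-2 complementary nucleotides over a complementary region of length n) may be evaluated as in this sketch. It assumes both sequences are written 5’ to 3’, pair in antiparallel orientation, and that only Watson-Crick pairs (A-U, G-C) count as complementary; whether wobble pairs count is not stated in this passage. The function names and sequences are hypothetical.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_complementary(spacer_region: str, target_region: str) -> int:
    """Count Watson-Crick pairs between two equally long RNA regions.

    Both regions are given 5' to 3'; the target is read in reverse so
    that the pairing is antiparallel.
    """
    if len(spacer_region) != len(target_region):
        raise ValueError("regions must have the same length n")
    return sum(
        WATSON_CRICK.get(s) == t
        for s, t in zip(spacer_region, reversed(target_region))
    )


def meets_fraction(spacer_region: str, target_region: str, fraction: float) -> bool:
    """True if at least fraction * n nucleotides are complementary."""
    n = len(spacer_region)
    return count_complementary(spacer_region, target_region) >= fraction * n


toy_spacer = "AAGGCCUUAGCA"          # n = 12
perfect_target = "UGCUAAGGCCUU"      # reverse complement of the spacer
one_mismatch = "AGCUAAGGCCUU"        # first target nucleotide changed

perfect_count = count_complementary(toy_spacer, perfect_target)
mismatch_count = count_complementary(toy_spacer, one_mismatch)
```

With one mismatch in twelve positions, the 0.9*n criterion and the n-1 criterion are still met, whereas the criterion of n complementary nucleotides is not.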
In further embodiments, the spacer region may be arranged (directly) at the 3’ side of the repeat region. In particular, in embodiments, the nucleotides of the guide RNA arranged at the 3’-end of the repeat sequence alignment may form the spacer region.
With respect to guide RNAs provided from a CRISPR array, the term “spacer region” refers to a part of the guide RNA originating from a spacer in the CRISPR array. However, as will be clear to the person skilled in the art, a guide RNA may also, for instance, be synthesized directly. Hence, more generally, the term “spacer region” may herein refer to a region of the guide RNA that is (at least partially) complementary to the target RNA.
In particular, in embodiments, the CRISPR array may encode a CRISPR RNA precursor, especially wherein the gRAMP protein is configured to process the CRISPR RNA precursor into a plurality of guide RNAs, especially crRNAs, and wherein the gRAMP protein is configured to functionally associate with (one of) the plurality of guide RNAs.
In embodiments, the polynucleotide may further comprise one or more second regulatory elements configured to control expression of the second coding sequence.
In further embodiments, the polynucleotide may further comprise one or more third regulatory elements configured to control expression of the CRISPR array.
In a further aspect, the invention may provide an isolated or recombinant gRAMP protein (as defined above). The gRAMP protein may especially be encoded by the polynucleotide according to the invention.
In embodiments, the gRAMP protein may be functionally associated with a guide RNA. In particular, the gRAMP protein together with the guide RNA may herein be referred to as an “effector complex”. In particular, the effector complex may be more stable than a gRAMP protein without the presence of an associated guide RNA.
In further embodiments, the gRAMP protein may be functionally associated with the TPR-CHAT protein. As indicated above, the functional association of the gRAMP protein and the TPR-CHAT protein may provide RNA-based control of the proteolytic activity of TPR-CHAT. In embodiments, the gRAMP protein may be more stable when functionally associated to the TPR-CHAT protein.
In particular, in embodiments, the gRAMP protein may be functionally associated to the TPR-CHAT protein in the absence of a guide RNA. Such embodiments may be particularly flexible with regards to (biotech) applications as the gRAMP protein may be (conveniently) functionally associated to a guide RNA of choice.
In a further aspect, the invention may provide a modification method for providing a recombinant cell. The modification method may comprise introducing the polynucleotide according to the invention into the genome of a parent cell to provide the recombinant cell. Suitable methods for genetically modifying a (specific) parent cell to provide a recombinant cell will be known to the person skilled in the art.
In embodiments, the method may comprise inserting the polynucleotide into a chromosome of the parent cell.
In further embodiments, the method may comprise introducing a recombinant construct, such as a plasmid, into the parent cell. In particular, the recombinant construct may be configured to be maintained in the recombinant cell. In further embodiments, the recombinant construct may comprise a selection marker, such as an antibiotic marker, for maintenance of the recombinant construct. The recombinant construct may especially be configured not to be integrated into chromosomal DNA.
The parent cell may essentially be any type of cell.
In embodiments, the parent cell may be a prokaryotic cell, such as a bacterial cell, or such as an archaeal cell.
In further embodiments, the parent cell may be a eukaryotic cell, such as an animal cell, especially a human cell, or especially a non-human animal cell, or such as a plant cell, or especially a fungal cell, or especially an algal cell.
In embodiments, the modification method may especially be a non-medical method.
In a further aspect, the invention may provide a recombinant cell.
The recombinant cell may especially comprise a genome modification relative to a parent cell, wherein the genome modification comprises (an introduction of) the polynucleotide according to the invention.
The recombinant cell may especially be obtainable using the modification method of the invention.
In a further aspect, the invention may provide a control method for modulating, especially controlling, a target mRNA using the gRAMP protein of the invention.
In particular, a gRAMP amino acid sequence of the gRAMP protein may have at least 28% sequence identity to SEQ ID NO:1 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:1, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:1. In embodiments, the gRAMP protein may be functionally associated with a guide RNA, especially a crRNA, wherein the guide RNA is configured to hybridize with the target RNA.
The control method may especially comprise exposing the target RNA to the gRAMP protein.
In embodiments, exposure of the target RNA to the gRAMP protein may result in the gRAMP protein cutting the target RNA at one or more target sites (see above), thereby effectively degrading the target RNA.
Hence, the method may comprise exposing the target RNA to the gRAMP protein to degrade the target RNA.
The degradation of the target RNA may have further downstream effects.
For instance, the target RNA may encode for a protein, and the degradation of the target RNA may thus result in a lower production of the protein, in turn leading to a reduced abundance of the corresponding protein after the exposure of the target RNA to the gRAMP protein, which may affect cellular behavior.
In particular, the control method may comprise knocking down the target RNA, i.e., knocking down a target gene encoding for the target RNA.
Similarly, in embodiments, the target RNA may have a regulatory function, and the degradation of the target RNA may thus (effectively) have a regulatory effect on cellular behavior.
Hence, the control method may comprise controlling gene regulation in a target cell by exposing the target cell to the gRAMP protein and one or more guide RNAs.
In further embodiments, the gRAMP protein may be configured to cut the target RNA in two or more target RNA fragments, wherein one (or more) of the two or more target RNA fragments have a regulatory function. Thereby, exposure of the target RNA to the gRAMP protein may (also) effectively have a regulatory effect on cellular behavior.
Hence, in embodiments, the method may comprise modulating, especially controlling, cellular behavior of a target cell, especially by exposing the target cell to the gRAMP protein. In particular, the control method may comprise modulating, especially controlling, the target RNA in a target cell. In embodiments, the target cell may (have a genome that) encode the target RNA. In further embodiments, the control method may comprise providing the gRAMP protein and the guide RNA inside the target cell.
In embodiments, the control method may comprise providing the polynucleotide to the target cell. In further embodiments, the control method may comprise expressing the polynucleotide (in the target cell). In further embodiments, the control method may comprise providing the gRAMP protein to the target cell, especially providing an effector complex to the target cell, wherein the effector complex comprises the gRAMP protein and a guide RNA.
In further embodiments the control method may comprise providing a CRISPR array to the target cell. In such embodiments, the CRISPR array may especially comprise a spacer configured to provide the guide RNA, i.e., the CRISPR array may comprise a spacer having a sequence suitable for targeting the target RNA. The control method may further, in embodiments, comprise expressing the CRISPR array (in the target cell). Thereby, the target cell may provide a pre-crRNA, which may, especially by the gRAMP protein, be cleaved into a plurality of guide RNAs, especially crRNAs.
In further embodiments, the control method may comprise providing a pre-crRNA or a guide RNA to the target cell, especially a pre-crRNA, or especially a guide RNA.
The phrase “providing a protein to the target cell” and similar phrases may herein refer to providing the protein or a precursor thereof to the target cell, wherein the precursor is configured such that the target cell produces the protein. For instance, to provide the gRAMP protein to the target cell, the control method may comprise one or more of: (i) injecting the gRAMP protein into the target cell, (ii) injecting an mRNA encoding the protein into the target cell, (iii) transforming the target cell with a plasmid encoding the protein, (iv) providing a capsid comprising the gRAMP protein to the target cell, especially wherein the capsid comprises a liposome, or especially wherein the capsid comprises a viral capsid. It will be clear to the person skilled in the art that many possibilities are available for providing a protein to, especially inside, a target cell. In particular, the phrase “providing a protein to the target cell” and similar phrases may herein especially refer to providing the protein inside the target cell.
In specific embodiments, the target cell may comprise the recombinant cell according to the invention.
In further embodiments, the control method may comprise providing a TPR-CHAT protein inside the target cell, especially wherein a TPR-CHAT amino acid sequence of the TPR-CHAT protein has at least 25% sequence identity to SEQ ID NO:27 with respect to a sequence alignment between the amino acid sequence and SEQ ID NO:27, wherein the sequence alignment has a length of at least 60% of a sequence length of SEQ ID NO:27. The TPR-CHAT protein may especially be configured to functionally associate with the gRAMP protein. In further embodiments, the control method may comprise providing a gRAMP-TPR-CHAT complex into the target cell, i.e., a complex comprising a functionally associated gRAMP protein and TPR-CHAT protein.
In embodiments, the control method may comprise providing the gRAMP protein to the target cell, such as by injecting the gRAMP protein into the target cell. In further embodiments, the gRAMP protein may be functionally associated to a guide RNA, i.e., the control method may comprise providing an effector complex to the target cell, such as by injecting the effector complex into the target cell, wherein the effector complex comprises the gRAMP protein and the guide RNA. Such embodiments may be beneficial as the effector complex may be more stable than the gRAMP protein.
In further embodiments, the control method may comprise separately providing the gRAMP protein and the guide RNA to the target cell, such as by injecting the gRAMP protein and separately injecting the guide RNA into the target cell. Such embodiments may be beneficial as the gRAMP protein may be dynamically functionally associated to a desired guide RNA, which may increase the flexibility of the method.
In further embodiments, the control method may comprise providing a gRAMP-TPR-CHAT complex to the target cell, such as by injecting the gRAMP-TPR-CHAT complex into the target cell. In further embodiments, the gRAMP-TPR-CHAT complex may be functionally associated to the guide RNA.
In further embodiments, the gRAMP-TPR-CHAT complex and the guide RNA may be provided to the target cell separately.
In further embodiments, the control method may comprise providing the gRAMP protein and the TPR-CHAT protein to the target cell separately.
The control method of the invention may, as indicated above, in embodiments, be an in vivo method.
However, in further embodiments, the control method may comprise exposing the target RNA to the gRAMP protein in vitro.
In particular, the control method may comprise providing a first mixture, wherein the first mixture comprises the target RNA, the gRAMP protein, and the guide RNA.
Optionally, the control method may comprise providing an effector complex comprising the functionally associated gRAMP protein and guide RNA.
The gRAMP protein may require the presence of a bivalent cation in order to cleave the target RNA, especially a bivalent cation selected from the group comprising Mg²⁺, Ca²⁺ and Mn²⁺. However, if the concentration of the bivalent cation is too high, the activity of the gRAMP protein may also be reduced, especially inhibited.
In general, the concentration of the bivalent cation inside a target cell may (naturally) be suitable (or “sufficient”) in order to facilitate the endonuclease activity.
However, the dependence on the bivalent cation may, especially in vitro, provide additional control over the activity of the gRAMP protein.
In particular, in the absence or excess of the bivalent cation, the gRAMP protein may bind to the target RNA, especially via hybridization of the (corresponding) guide RNA and the target RNA, but may (essentially) not cleave the target RNA.
Thereby, the gRAMP protein may, for instance, inhibit translation of the target RNA.
Hence, in embodiments, the first mixture may thus comprise < 0.01 mM of a bivalent cation, such as < 0.001 mM, especially < 0.0001 mM, especially wherein the bivalent cation is selected from the group comprising Mg²⁺, Ca²⁺, and Mn²⁺. In further embodiments, the first mixture may comprise > 105 mM of the bivalent cation, such as > 110 mM, especially > 125 mM.
Hence, the control method may comprise inhibiting translation of the target RNA, especially by cleaving the target RNA with the gRAMP protein, or especially by binding the target RNA with the gRAMP protein.
In addition, in embodiments, the gRAMP protein and/or the guide RNA may be configured to acquire an auxiliary effector, which auxiliary effector may interact with the target RNA once bound to the gRAMP protein.
Hence, in embodiments, the method may comprise providing an auxiliary effector to the first mixture. Especially, the first mixture may comprise an auxiliary effector.
In further embodiments, a concentration of the bivalent cation may be selected to facilitate cleavage by the gRAMP protein. Hence, in embodiments, the first mixture may comprise 0.05 - 105 mM of the bivalent cation, such as 0.1 - 100 mM of the bivalent cation, especially 0.2 - 100 mM of the bivalent cation, especially wherein the bivalent cation is selected from the group comprising Mg²⁺, Ca²⁺, and Mn²⁺. In further embodiments, the first mixture may comprise at least 0.1 mM of the bivalent cation, such as at least 0.2 mM, especially at least 0.5 mM. In further embodiments, the first mixture may comprise at most 102 mM of the bivalent cation, such as at most 100 mM, especially at most 95 mM.
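The concentration ranges given for the bivalent cation may be summarized in a small lookup, sketched below. The boundaries are the broadest ones stated above (below 0.01 mM or above 105 mM for binding without cleavage, 0.05 to 105 mM for cleavage); the interval between 0.01 and 0.05 mM is not characterized in the text and is therefore reported as unspecified. The function name and the returned labels are hypothetical.

```python
def cation_regime(concentration_mM: float) -> str:
    """Classify a bivalent cation concentration (in mM) by expected behavior."""
    if concentration_mM < 0:
        raise ValueError("concentration cannot be negative")
    # Absence or excess of the cation: binding is expected without cleavage.
    if concentration_mM < 0.01 or concentration_mM > 105:
        return "binding without cleavage"
    # Broadest range stated as facilitating cleavage.
    if 0.05 <= concentration_mM <= 105:
        return "cleavage"
    # Between 0.01 and 0.05 mM the text gives no indication.
    return "unspecified"


regimes = {c: cation_regime(c) for c in (0.001, 0.02, 10.0, 200.0)}
```

Such a lookup may be convenient when planning an in vitro mixture in which the cation concentration is used to switch between binding and cleavage.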
In embodiments, the gRAMP protein may be configured to cut the target RNA (upon hybridization with the guide RNA) into two or more target (m)RNA fragments. In such embodiments, the control method may further comprise providing an RNA ligase and a second RNA fragment, especially wherein the RNA ligase is configured to ligate one (or more) of the two or more target RNA fragments to the second RNA fragment, especially thereby providing a ligated RNA product. As the gRAMP protein cuts at very specific sites in a programmable manner, the obtained target RNA fragments may be well-defined, facilitating the provision of a well-defined ligated RNA product. For instance, such embodiments may be used for RNA library modulation. For example, a plurality, especially all, of the RNAs in an RNA library may be cleaved to remove a 3’- or 5’-terminal part and subsequently ligated to an RNA part of interest. Thereby, the method may provide a convenient way to modify an existing library as an alternative to providing a fully new library, which may involve substantial work and/or costs.
As indicated above, in embodiments, the gRAMP protein may be configured to cut a target RNA at two target sites.
In further embodiments, the gRAMP protein may be configured to cut the target RNA at a single target site, especially at the first target site, or especially at the second target site. Especially, in a sequence alignment of the amino acid sequence of the gRAMP protein and SEQ ID NO:1 a first amino acid of the amino acid sequence may be aligned with the aspartate of SEQ ID NO:1 at position 698, wherein the first amino acid differs from aspartate.
In further embodiments, the gRAMP protein may be configured to bind the target RNA without cleaving the target RNA.
In further embodiments, the control method may comprise converting (at least part) of (a sequence of) the target RNA to a complementary sequence, and providing a guide RNA comprising a spacer region, wherein the spacer region comprises the complementary sequence. Hence, the control method may comprise designing and providing a guide RNA (having a spacer region) configured to hybridize with the target RNA. In embodiments, the control method may especially be a non-medical method. In a further aspect, the invention may provide a use of the polynucleotide or the gRAMP protein, especially the polynucleotide, or especially the gRAMP protein, for modulating, especially controlling, a target mRNA.
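By way of illustration only, the step of converting (part of) the target RNA to a complementary sequence for use as a spacer region may be sketched as follows. The sketch assumes standard antiparallel Watson-Crick pairing, so that the spacer is the reverse complement of the selected target region. The function name `spacer_for_target` and the toy sequence are hypothetical and do not correspond to any sequence of the sequence listing.

```python
# Translation table for RNA complementation (A<->U, C<->G).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def spacer_for_target(target_rna: str, start: int, length: int) -> str:
    """Return a spacer (5'->3') complementary to target_rna[start:start+length].

    target_rna is given 5'->3'; start is 0-based. Assumes antiparallel pairing,
    so the spacer is the reverse complement of the selected region.
    """
    region = target_rna.upper()[start:start + length]
    if start < 0 or len(region) != length:
        raise ValueError("selected region falls outside the target RNA")
    if set(region) - set("ACGU"):
        raise ValueError("target RNA may only contain A, C, G and U")
    return region.translate(COMPLEMENT)[::-1]


# Toy example (hypothetical sequence): region GGCCA gives spacer UGGCC.
print(spacer_for_target("AUGGCCAAGU", 2, 5))
```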
In a further aspect, the invention may provide a use of the polynucleotide or the gRAMP protein, especially the polynucleotide, or especially the gRAMP protein, for modulating, especially controlling, a protease, especially a TPR-CHAT protein.
The embodiments described herein are not limited to a single aspect of the invention. For example, an embodiment describing the gRAMP protein may, for example, further relate to the polynucleotide, especially to the amino acid sequence encoded by the polynucleotide. Similarly, an embodiment of the gRAMP protein may further relate to embodiments of the method. Specifically, the method may comprise exposing a target RNA to any embodiments of the polynucleotide and the gRAMP protein described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying schematic drawings in which corresponding reference symbols indicate corresponding parts, and in which: Fig. 1 schematically depicts embodiments of the polynucleotide, the gRAMP protein, and the control method of the invention. Fig. 2A schematically depicts embodiments of the modification method and the recombinant cell. Fig. 2B schematically depicts experimental measurements. Fig. 3A-10 schematically depict further aspects. The schematic drawings are not necessarily to scale.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Fig. 1 schematically depicts an embodiment of an isolated or recombinant polynucleotide 10. The polynucleotide comprises a coding sequence 11 encoding a gRAMP protein 60. Especially, a gRAMP amino acid sequence of the gRAMP protein 60 may have at least 30% sequence identity to SEQ ID NO:1 with respect to a sequence alignment between the gRAMP amino acid sequence and SEQ ID NO:1, especially wherein the sequence alignment has a length of at least 80% of a sequence length of SEQ ID NO:1. The gRAMP protein 60 encoded by the coding sequence 11 of the polynucleotide 10 may be an RNA-guided RNA endonuclease.
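By way of illustration only, the two criteria mentioned above (at least 30% sequence identity over an alignment spanning at least 80% of the length of SEQ ID NO:1) may be checked on a pre-computed pairwise alignment as sketched below. The sketch assumes that identity is counted as identical residues divided by the number of alignment columns, which is one of several common conventions and is not specified in the text. The function names and the toy sequences are hypothetical.

```python
def identity_and_coverage(aligned_query: str, aligned_ref: str, ref_length: int):
    """Return (identity, coverage) for two equal-length, gap-padded aligned sequences.

    identity = identical, non-gap columns / number of alignment columns (assumption)
    coverage = number of alignment columns / full length of the reference sequence
    """
    if len(aligned_query) != len(aligned_ref) or not aligned_ref:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    if ref_length <= 0:
        raise ValueError("reference length must be positive")
    columns = len(aligned_ref)
    matches = sum(q == r and q != "-" for q, r in zip(aligned_query, aligned_ref))
    return matches / columns, columns / ref_length


def meets_thresholds(aligned_query: str, aligned_ref: str, ref_length: int,
                     min_identity: float = 0.30, min_coverage: float = 0.80) -> bool:
    """Apply the 30% identity and 80% alignment-length criteria."""
    identity, coverage = identity_and_coverage(aligned_query, aligned_ref, ref_length)
    return identity >= min_identity and coverage >= min_coverage


# Toy alignment (hypothetical residues, not taken from SEQ ID NO:1).
print(identity_and_coverage("MKV-LE", "MKIALE", 6))
print(meets_thresholds("MKV-LE", "MKIALE", 6))
```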
In the depicted embodiment, the polynucleotide 10 comprises DNA, and the polynucleotide 10 comprises one or more regulatory elements 13 configured to control the expression of the coding sequence 11. In particular, the depicted regulatory element 13 may be a promoter.
Further, in the depicted embodiment, the polynucleotide 10 comprises a CRISPR array 15. The CRISPR array comprises a plurality of repeats 16 and a plurality of spacers 17. In particular, the CRISPR array 15 may encode a CRISPR RNA precursor. The gRAMP protein 60 may especially be configured to process the CRISPR RNA precursor into guide RNAs 30, especially crRNAs 31, especially wherein the gRAMP protein 60 is configured to functionally associate with (one of) the guide RNAs 30. The depicted guide RNA 30, especially the crRNA 31, may comprise a repeat region 36 and a spacer region 37, especially wherein the repeat region 36 is encoded by (at least part of) a repeat 16 of the CRISPR array 15, and wherein the spacer region 37 is (at least partially) encoded by a spacer 17 of the CRISPR array 15.
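By way of illustration only, the organization of a CRISPR array into repeats and spacers may be sketched as follows. The sketch assumes that all repeats are identical and that the repeat sequence does not occur inside a spacer, which need not hold for a native array. The function name `split_crispr_array` and the toy sequences are hypothetical.

```python
def split_crispr_array(array: str, repeat: str) -> list[str]:
    """Return the spacers of a CRISPR array, i.e. the segments between consecutive repeats.

    Whatever precedes the first repeat (e.g. a leader) and whatever follows the
    last repeat is discarded. Assumes identical repeats (simplification).
    """
    if not repeat:
        raise ValueError("repeat sequence must not be empty")
    parts = array.split(repeat)
    if len(parts) < 3:
        raise ValueError("at least two repeats are needed to delimit a spacer")
    return parts[1:-1]


# Toy array (hypothetical): leader AA, repeat GGGTC, spacers ATAT and CACA, trailer T.
toy_array = "AA" + "GGGTC" + "ATAT" + "GGGTC" + "CACA" + "GGGTC" + "T"
print(split_crispr_array(toy_array, "GGGTC"))
```

With n repeats the helper returns n - 1 spacers, in line with an array of six repeats interspaced by five spacers.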
Fig. 1 further schematically depicts an isolated or recombinant gRAMP protein (“gRAMP protein”) 60. In the depicted embodiments, the gRAMP protein is encoded by the depicted polynucleotide 10.
The gRAMP protein 60 may especially be configured to functionally associate with a guide RNA 30, such as a crRNA 31. In the depicted embodiments, the gRAMP protein 60 is functionally associated with a guide RNA 30. In particular, the guide RNA 30 may be configured to hybridize with a target (m)RNA 50 along a complementary region 40. Upon hybridization of (part of) the guide RNA with the target RNA, the gRAMP protein 60 may be configured to cut the target RNA 50 at a target site 45, especially at two target sites 45. In embodiments, the target site 45 may be arranged in the target RNA 50 between nucleotides aligned with nucleotides 4 and 5 of the guide RNA 30 in the complementary region 40. In further embodiments, the target site 45 may be arranged in the target RNA 50 between nucleotides aligned with nucleotides 10 and 11 of the guide RNA in the complementary region 40 (also see Fig. 4A).
Fig. 1 further schematically depicts a control method for modulating a target RNA 50 using the gRAMP protein 60, especially wherein the gRAMP protein 60 is functionally associated with a guide RNA 30 configured to hybridize with the target RNA 50. The control method may especially comprise exposing the target RNA 50 to the gRAMP protein 60 (functionally associated to the guide RNA 30). Exposing the target RNA 50 to the gRAMP protein 60 may especially result in the gRAMP protein cleaving the target RNA 50 at one or more target sites 45, thereby resulting in degradation of the target RNA 50.
In embodiments, the control method may especially comprise exposing the target RNA to the gRAMP protein, wherein the gRAMP protein is configured to (only) cut at a first target site 45, wherein the first target site 45 is arranged in the target RNA 50 between nucleotides in the complementary region 40 that are aligned with nucleotides 4 and 5 of the guide RNA.
In further embodiments, the control method may especially comprise exposing the target RNA to the gRAMP protein, wherein the gRAMP protein is configured to (only) cut at a second target site 45, wherein the second target site 45 is arranged in the target RNA 50 between nucleotides in the complementary region 40 that are aligned with nucleotides 10 and 11 of the guide RNA, especially counted from the 5’ end of the target RNA 50.
In further embodiments, the gRAMP protein may be configured to cut at the first and second target sites 45.
Fig. 1 further schematically depicts a use of the polynucleotide 10 or the gRAMP protein 60 for modulating a target mRNA 50.
Fig. 2A schematically depicts an embodiment of the modification method of the invention. The modification method may comprise introducing the polynucleotide 10 into the genome of a parent cell 100 to provide the recombinant cell 110. In the depicted embodiment, the polynucleotide 10 may comprise a recombinant construct 14, especially a plasmid, and the introduction of the polynucleotide 10 into the genome comprises the introduction of the recombinant construct 14 into the parent cell 100 to provide the recombinant cell 110.
In further embodiments, the introduction of the polynucleotide 10 into the genome may comprise an insertion of the polynucleotide 10 into a chromosome.
In the depicted embodiment, the parent cell 100, and thus the recombinant cell 110, may especially comprise a prokaryotic cell, specifically a bacterial cell corresponding to Escherichia coli BL21. In further embodiments, the parent cell 100 and the recombinant cell 110 may be a prokaryotic cell or a eukaryotic cell, especially an animal cell, such as a human cell.
Fig. 2A further schematically depicts an embodiment of the recombinant cell 110, especially obtainable using the modification method of the invention. The recombinant cell 110 may comprise a genome modification relative to a parent cell 100, wherein the genome modification comprises (an introduction of) the polynucleotide 10.
Experiments
Unless specified otherwise, the experiments described herein are performed using the materials and methods described directly hereinafter.
Materials and methods Providing the polynucleotide encoding the gRAMP protein - For the expression of Candidatus Scalindua brodae gRAMP, a two-plasmid expression system was used: a plasmid encoding the gRAMP protein and a plasmid containing the CRISPR RNA.
To construct the gRAMP plasmid, a coding sequence corresponding to an E. coli codon-optimized gRAMP protein containing an N-terminal Twin-Strep-Tag II and a SUMO was designed.
The SUMO is an N-terminal piece that may improve expression yields and/or improve protein solubility.
The codon-optimized coding sequence corresponds to SEQ ID NO:16. The coding sequence was ordered and cloned in front of the lacI-repressed T7 promoter via amplification of plasmid 13S-S (encoding spectinomycin resistance for selection), which has a sequence according to SEQ ID NO:63. The resulting plasmid is hereinafter referred to as pGRAMP (or “pgRAMP”).
Providing a (second) polynucleotide encoding the CRISPR array - To construct a CRISPR RNA plasmid, a CRISPR array starting with the native Candidatus S. brodae leader sequence, followed by six native repeats (SEQ ID NO:49) interspaced by five copies of the first spacer in the native CRISPR array, was designed.
The CRISPR array corresponds to SEQ ID NO:48, and was ordered and cloned in front of the lacI-repressed T7 promoter on the plasmid pACYC Duet-1 (encoding chloramphenicol resistance for selection), which has a sequence according to SEQ ID NO:64. The resulting plasmid is hereinafter referred to as pCRISPR-1. In addition, a second CRISPR RNA plasmid was constructed to contain the whole native CRISPR array from Candidatus S. brodae by PCR amplifying the CRISPR array, corresponding to an amplicon having a sequence according to SEQ ID NO:65, from Candidatus S. brodae genomic DNA using a forward primer according to SEQ ID NO:66 and a reverse primer according to SEQ ID NO:67, and cloning the amplicon in front of the lacI-repressed T7 promoter on pACYC Duet-1. The resulting plasmid is hereinafter referred to as pCRISPR-2.
Providing the polynucleotide encoding the gRAMP protein and the CRISPR array - In order to get both the gRAMP protein and the CRISPR-1 array encoded on the same plasmid, the CRISPR array was PCR amplified from pCRISPR-1 (using primers with sequences SEQ ID NO:50 and SEQ ID NO:51) and assembled with PCR amplified pGRAMP (using primers with sequences SEQ ID NO:52 and SEQ ID NO:52) using Gibson assembly.
The resulting plasmid is hereinafter referred to as pGRAMP-CRISPR1.
Transformation with the polynucleotide and the second polynucleotide encoding the CRISPR array - Both plasmids were transformed into electrocompetent Escherichia coli BL21(AI) cells and grown overnight on selection media (50 µg/mL spectinomycin, 25 µg/mL chloramphenicol). Colonies were streaked from the plate and grown in 8 L LB medium containing 50 µg/mL spectinomycin and 25 µg/mL chloramphenicol in baffled flasks until an OD600 of 0.45 was reached.
The cells were cold-shocked by putting the flasks in an ice-bath for 1 hour.
Subsequently, the cells were induced with 0.5 mM IPTG and 0.2% L-arabinose overnight at 20°C with 150 rpm rotation.
The overnight cultures were pelleted at 6000 rpm at room temperature, whereupon the pellets were washed in 1x PBS, pelleted again and stored at -80°C until further use.
Purification of the gRAMP protein - The bacterial cell pellet of each 1 L culture was resuspended in 50 mL of ice-cold lysis buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM DTT, 5% glycerol, pH 7.5). 1 tablet of cOmplete™ EDTA-free Protease Inhibitor Cocktail was added per 50 mL resuspended pellet.
The cells were lysed with 3 runs at 1000 bar in a cooled French press.
The lysed cells were spun down at 16000 rpm at 4°C for 30 min.
The resulting lysate was filtered through a 0.45 µm syringe filter.
To prepare the purification column, a 20 mL gravity column was loaded with 3 mL of a 50% suspension of Strep-Tactin®XT Affinity Resin (corresponding to 1.5 mL column bed volume). The Strep-Tactin®XT Affinity Resin was washed with 5 column volumes of ice-cold wash buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM DTT, 5% glycerol, pH 7.5). The filtered sample lysate was loaded on the washed Strep-Tactin®XT Affinity Resin and subsequently washed with 5 column volumes of ice-cold wash buffer. 10 mL of elution buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM DTT, 5% glycerol, 50 mM biotin, pH 7.5) was used to elute the gRAMP protein.
The pooled fractions were gradually diluted 1.5 times with a wash buffer devoid of NaCl (100 mM Tris-HCl, 1 mM DTT, 5% glycerol, pH 7.5). The sample was spun down at 13,200 rpm at 4°C for 10 min whereupon the supernatant was transferred to a new tube.
Heparin Chromatography - For the Heparin chromatography, a 5 mL HiTrap Heparin HP column (Cytiva) was washed with 2 column volumes of degassed MilliQ at 1 mL/min and equilibrated with degassed low salt buffer (100 mM Tris-HCl, 100 mM NaCl, 1 mM DTT, 5% glycerol, pH 7.5) at 1 mL/min.
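By way of illustration only, the effect of the 1.5-fold dilution with NaCl-free buffer on the NaCl concentration of the eluate may be worked out as sketched below. The sketch assumes ideal mixing with additive volumes, and the function names are hypothetical. Under these assumptions, a 150 mM NaCl eluate diluted 1.5 times ends up at 100 mM NaCl, which equals the NaCl concentration of the low salt buffer used for the subsequent Heparin chromatography.

```python
def diluted_concentration(stock_mM: float, dilution_factor: float) -> float:
    """Concentration after diluting dilution_factor-fold with a buffer lacking the solute."""
    if dilution_factor < 1:
        raise ValueError("dilution factor must be at least 1")
    return stock_mM / dilution_factor


def mixed_concentration(components: list[tuple[float, float]]) -> float:
    """Concentration after mixing (volume_mL, concentration_mM) pairs; volumes assumed additive."""
    total_volume = sum(volume for volume, _ in components)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(volume * conc for volume, conc in components) / total_volume


# 150 mM NaCl eluate diluted 1.5-fold with NaCl-free buffer.
print(diluted_concentration(150.0, 1.5))
# Equivalent mixing view: 20 mL of eluate at 150 mM plus 10 mL of buffer at 0 mM NaCl.
print(mixed_concentration([(20.0, 150.0), (10.0, 0.0)]))
```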
The sample was loaded onto the column and washed with 10 column volumes of ice cold degassed low salt buffer.
The proteins were eluted using a NaCl gradient (0-100% in 25 mL) from low salt buffer to high salt buffer (100 mM Tris-HCl, 1 M NaCl, 1 mM DTT, 5% glycerol, pH 7.5), collecting 1 mL fractions. Pooled fractions were concentrated (Amicon Ultra-4 Centrifugal Filter Unit with Ultracel-100 membrane), snap frozen in liquid nitrogen and stored at -80°C until further use.
Providing a (second) polynucleotide encoding TPR-CHAT - To construct the TPR-CHAT plasmid, a coding sequence corresponding to an E. coli codon-optimized TPR-CHAT protein variant containing a C-terminal His-tag was designed having a sequence according to SEQ ID NO:18, ordered (from IDT) and cloned in front of the lacI-repressed T7 promoter on the plasmid 2AT, which has a sequence according to SEQ ID NO:68. The resulting plasmid is hereinafter referred to as pTPR-CHAT.
Transformation with the second polynucleotide encoding the TPR-CHAT protein - pTPR-CHAT, the second polynucleotide encoding TPR-CHAT, was transformed into electrocompetent E. coli BL21 AI cells and plated on LB-agar plates containing 25 µg/ml of chloramphenicol. Plates were incubated overnight at 37°C. One Erlenmeyer flask (250 ml) containing 50 ml of LB with 25 µg/ml of chloramphenicol was inoculated with all the colonies from the plate and grown at 37°C, 180 rpm for 3-4h until the cells reached stationary phase (OD600 > 1). Two Erlenmeyer flasks (5 L) containing 2 L of LB with 25 µg/ml of chloramphenicol each were inoculated with 25 mL of the overgrown culture and incubated at 37°C and 150 rpm until the cultures reached exponential phase (OD600 0.3-0.5). The cultures were incubated on ice for 1h and protein expression was induced with a final concentration of 2% L-arabinose and 1 mM IPTG, which was followed by overnight incubation at 20°C and 150 rpm. The cultures were centrifuged for 30 minutes at 6000 rpm and 4°C, and the supernatant was discarded. The pellets were resuspended in PBS (50 ml PBS/initial 1 L culture) and centrifuged 30 minutes at 3900 rpm. The supernatant was discarded, and the pellets stored at -80°C until further use.
Purification of the TPR-CHAT protein - Cell pellets were thawed and resuspended in 200 ml of ice-cooled Lysis/Wash buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM DTT, 5% glycerol, 25 mM imidazole, pH 7.5). Cell lysis was performed three times in a cooled French press (1 kbar). The lysate was centrifuged at 16000 rpm for 30 minutes at 4°C (JA-17 rotor, Avanti). The supernatant was filtered through a 0.45 µm syringe filter, and the pellets discarded. A disposable 20 ml column (Biorad) was used for gravity-flow affinity chromatography. 2 ml of HIS-Select Nickel Affinity Gel (Sigma-Aldrich) were added into the column (500 µl/50 ml lysate) and washed with 20 ml of ice-cold Lysis/wash buffer. The filtered lysate was added into the column and the flowthrough was collected. The loaded column was washed with 15 ml of ice-cold Lysis/wash buffer and the wash was collected. The TPR-CHAT protein was eluted in 6 elution fractions of 1 ml by addition of ice-cold Elution buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM DTT, 5% glycerol, 250 mM imidazole, pH 7.5).
Expression and purification of a complex of the gRAMP protein and the TPR-CHAT protein - pTPR-CHAT and pgRAMP-CRISPR1 were transformed into electrocompetent E. coli BL21 AI cells and plated on LB-agar plates containing 25 µg/ml of chloramphenicol and 50 µg/ml of spectinomycin.
Plates were incubated overnight at 37°C.
One Erlenmeyer flask (250 ml) containing 50 ml of LB with 25 µg/ml of chloramphenicol and 50 µg/ml of spectinomycin was inoculated with all the colonies from the plate and grown at 37°C, 180 rpm for 3-4h until the cells reached stationary phase (OD600 > 1). Four Erlenmeyer flasks (5 L) containing 2 L of LB with 25 µg/ml of chloramphenicol and 50 µg/ml of spectinomycin each were inoculated with 12 mL of the overgrown culture and incubated at 37°C and 150 rpm until the cultures reached exponential phase (OD600 0.3-0.5). The cultures were incubated on ice for 1h and protein expression was induced with a final concentration of 2% L-arabinose and 1 mM IPTG, which was followed by overnight incubation at 20°C and 150 rpm.
The cultures were centrifuged for 30 minutes at 6000 rpm and 4 °C, and the supernatant was discarded.
The pellets were resuspended in PBS (50 ml PBS/initial 1 L culture) and centrifuged 30 minutes at 3900 rpm.
The supernatant was discarded, and the pellets stored at -80°C until further use.
Cell pellets were thawed and resuspended in 400 ml of ice-cooled Lysis/Wash buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM DTT, 5% glycerol). Cell lysis was performed three times in a cooled French press (1 kbar). The lysate was centrifuged at 16000 rpm for 30 minutes at 4°C (JA-17 rotor, Avanti). The supernatant was filtered through a 0.45 µm syringe filter, and the pellets discarded.
Two disposable 20 ml columns (Biorad) were used for gravity-flow affinity chromatography. 3 ml of Strep-Tactin®XT 4Flow® high-capacity resin (IBA life sciences) were added into each column and washed with 30 ml/column of ice-cold Lysis/wash buffer.
The filtered lysate was added into both columns and the flowthroughs were collected.
The loaded columns were washed with 30 ml of ice-cold Lysis/wash buffer and the washes were collected.
The proteins were eluted in 10 ml/column of ice-cold Elution buffer (100 mM Tris-HCl, 150 mM NaCl, 1 mM DTT, 5% glycerol, 50 mM biotin). Both elution samples were pooled and 10 ml of Buffer 1 (100 mM Tris-HCl, 1 mM DTT, 5% glycerol) were added while mixing the sample.
The sample was then centrifuged 10 minutes at maximum speed and the supernatant was recovered.
A 5 ml HiTrap Heparin HP column (Cytiva) was used, and the chromatography was run at 0.3 ml/min.
The sample was loaded onto the column equilibrated with Buffer 2 (100 mM Tris-HCl, 100 mM NaCl, 1 mM DTT, 5% glycerol) and washed until UV 1_280 was almost 0 mAU.
The proteins were eluted by using a NaCl gradient (0-100% in 25 ml) from Buffer 2 to Buffer 3 (100 mM Tris-HCl, 1 M NaCl, 1 mM DTT, 5% glycerol), collecting 1 ml fractions. Fractions from the chromatogram were pooled and concentrated to a volume of 600 µl. The concentrated sample was loaded onto a Superdex 200 Increase 10/300 GL (Cytiva) equilibrated with Lysis/wash buffer, and 0.5 ml samples were taken.
Extraction of crRNA from purified gRAMP - Samples with purified gRAMP protein were thawed and incubated with 20 mg/mL proteinase K (NEB) at 37°C for 1 hour to digest the protein part of the complex comprising gRAMP functionally associated with a guide RNA, followed by heat inactivation at 95°C for 5 minutes. To separate the RNA from the protein, acidic phenol (pH 4.5, phenol:chloroform = 5:1, Invitrogen) was added to the sample in a 1:1 ratio, vortexed for 1 minute and centrifuged for 10 minutes at 13,200 rpm at room temperature. The aqueous phase was collected and subjected to RNA precipitation (20 µL 3 M NaAc and 500 µL 100% ethanol per 200 µL of sample) for 1 hour at -20°C. Samples were spun down at 13,200 rpm at 4°C for 2 hours, washed twice with ice-cold 70% ethanol, and spun down at 13,200 rpm at 4°C for 10 minutes. The pellet was dried in a SpeedVac centrifugal evaporator for 30 minutes at 60°C and resuspended in RNA grade water. Samples were stored at -80°C until further use.
Visualization of the guide RNA - The guide RNA, specifically the crRNA, was visualized by mixing 10 µL of 50 ng/µL crRNA with 2x RNA loading dye (95% formamide, 0.025% SDS and 0.5 mM EDTA) and running the mixture on a 10% PAGE 8M Urea gel (pre-run at 350V for 1 hour, sample run at 333V for 2 hours). The gel was stained with Sybr Gold and imaged on a Typhoon laser-scanner platform (Cytiva).
RNA sequencing - RNA sequencing was performed on the guide RNA, specifically the crRNA, extracted from the gRAMP protein that was expressed in the presence of the whole native CRISPR array from Candidatus S. brodae, using pCRISPR-2. The RNA library was prepared using the Illumina TruSeq RNA Library Prep Kit with 20% PhiX and 10% miRNA control added. A HiSeq X lane (2x150bp) sequencing run was performed. The obtained reads (2,649,538 in total) were trimmed using cutadapt to remove the adapters, as described in M. Martin, “Cutadapt removes adapter sequences from high-throughput sequencing reads”, EMBnet.journal, 2011, which is hereby herein incorporated by reference.
The single reads of size 40 to 60 (954,096 in total) were filtered and mapped on the CRISPR array sequence using minimap2, as described in H. Li, “Minimap2: pairwise alignment for nucleotide sequences”, Bioinformatics, 2018, which is hereby herein incorporated by reference. Visualisation was done with Jalview, as described in Clamp et al., “The Jalview Java alignment editor”, Bioinformatics, 2004, which is hereby herein incorporated by reference.
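By way of illustration only, the size selection of trimmed reads prior to mapping may be sketched as follows. The sketch assumes that the range of 40 to 60 nucleotides is inclusive at both ends and that the reads are available as plain sequence strings. It does not reproduce the cutadapt or minimap2 steps, and the function name `filter_reads_by_length` and the toy reads are hypothetical.

```python
def filter_reads_by_length(reads, min_len: int = 40, max_len: int = 60) -> list[str]:
    """Keep reads whose length lies within [min_len, max_len] (inclusive bounds assumed)."""
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    return [read for read in reads if min_len <= len(read) <= max_len]


# Toy reads (hypothetical) of 39, 40, 50, 60 and 61 nucleotides.
toy_reads = ["A" * n for n in (39, 40, 50, 60, 61)]
kept = filter_reads_by_length(toy_reads)
print([len(read) for read in kept])

# Fraction of reads retained according to the read counts reported in the text.
print(954_096 / 2_649_538)
```

According to the reported counts, roughly 36% of the trimmed reads fall in the selected size window.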
gRAMP protein RNA cleavage assay - RNA cleavage reactions were performed in a 10 µL reaction volume, containing 200 nM purified gRAMP protein, 100 nM Cy5-labelled RNA oligo, 25 mM Tris, 150 mM NaCl, 10 mM DTT and 2 mM MgCl2. Reactions were incubated at 20°C for 2 hours, after which 0.5 µL proteinase K (NEB) was added to stop the reactions (1 hour at 37°C, 95°C for 5 minutes). 5 µL of the reactions was mixed with 5 µL of 2x RNA loading dye (95% formamide, 0.025% SDS and 0.5 mM EDTA) and loaded on a 10% PAGE 8M Urea gel (pre-run at 350V for 1 hour, sample run at 333V for 2 hours). Gels were imaged on a Typhoon laser-scanner platform (Cytiva).
GFP mRNA knockdown assay - (Non-)Target GFP plasmids were generated based on Gibson assembly of Gibson assembly fragments corresponding to SEQ ID NO:69 (encoding tetR and Tet promoter), SEQ ID NO:70 (encoding AmpR, amplified from p2AT using a primer according to SEQ ID NO:71 and a primer according to SEQ ID NO:72), SEQ ID NO:73 (encoding RepA), together with either a Gibson assembly fragment corresponding to SEQ ID NO:74 (encoding GFP with the cognate target of CRISPR1 after the start codon) or a Gibson assembly fragment corresponding to SEQ ID NO:75 (encoding GFP with a non-cognate target after the start codon). The resulting plasmids are hereinafter referred to as pGFPuv-Target, corresponding to SEQ ID NO:76, and pGFPuv-Non-Target, corresponding to SEQ ID NO:77. The plasmids pGRAMP, pCRISPR-1 and a pGFPuv-(Non)-Target variant were transformed into electrocompetent BL21(AI) cells and plated on LB-agar plates containing 25 µg/ml of chloramphenicol, 50 µg/ml of spectinomycin and 100 µg/ml of ampicillin.
Six colonies per condition were grown overnight and stored at -80°C in 20% glycerol.
Cultures from the cryo-stocks were grown overnight in 5 mL LB containing the antibiotics.
Overnight cultures were diluted to OD600=0.05, of which 200 µL was put in a 96-well microplate (Sigma Aldrich). Cells were grown in a plate reader (Synergy H1 microplate reader, Biotek) at 37°C with continuous shaking (282 cpm, double orbital) until early exponential phase.
The cells were then induced for gRAMP and crRNA expression with 1 mM IPTG and 0.2% L-arabinose.
After growth at 20°C and 450 rpm for 3 hours, the cells were induced for GFP expression with 0.04 ng/µL anhydrotetracycline and grown at 23°C with continuous shaking (282 cpm, double orbital) in the plate reader for 22 hours.
OD600 and fluorescence (excitation: 394 nm, emission: 509 nm) were measured every 10 minutes.
Results
To determine the molecular composition of the gRAMP ribonucleoprotein, an E. coli expression system composed of two plasmids was designed: plasmid pGRAMP containing the codon-optimized Candidatus S. brodae gRAMP sequence with an N-terminal dual-strep tag, and the crRNA plasmid containing five copies of the first spacer from the native S. brodae CRISPR array interspaced with the native repeat sequence. The gRAMP protein was purified to homogeneity via three column chromatography steps, and was subjected to size-exclusion chromatography.
Fig. 2B schematically depicts the measurements from the size-exclusion chromatography, wherein UV-absorbance (in mAU) is depicted for different retention volumes Ve (in mL). Specifically, line L1 corresponds to absorbance at 260 nm, and line L2 corresponds to absorbance at 280 nm, which correspond to the presence of nucleic acids and proteins, respectively. Site S1 corresponds to a void volume, whereas site S2 corresponds to the gRAMP protein. In view of the absorbance at 260 nm and 280 nm, Fig. 2B is indicative of the presence of ribonucleoproteins, i.e., the size-exclusion chromatography measurements depicted in Fig. 2B indicate that the gRAMP protein 60 associates with a guide RNA 30.
Subsequent SDS-PAGE analysis of those retention volumes revealed a single protein that indeed was gRAMP, as confirmed by mass spectrometry. SEC-MALS analysis also indicates a homogeneous, monomeric protein of 219 kDa and a nucleic acid content of 16 kDa, corresponding to the expected size for a gRAMP monomer bound to an RNA oligo of about 50 nucleotides (M.w. of ssRNA = (#nucleotides x 320.5) + 159.0). Hence, these results demonstrate that the gRAMP protein together with a guide RNA forms a single, giant ribonucleoprotein.
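By way of illustration only, the molecular weight formula for ssRNA quoted above may be evaluated as sketched below, showing that an RNA of 50 nucleotides corresponds to about 16.2 kDa and, conversely, that a nucleic acid content of 16 kDa corresponds to about 49 nucleotides. The function names are hypothetical; the constants are those of the formula in the text.

```python
def ssrna_mw_da(n_nucleotides: int) -> float:
    """Approximate molecular weight (Da) of ssRNA: (#nucleotides x 320.5) + 159.0."""
    if n_nucleotides < 1:
        raise ValueError("number of nucleotides must be positive")
    return n_nucleotides * 320.5 + 159.0


def ssrna_length_from_mw(mw_da: float) -> float:
    """Invert the formula to estimate the number of nucleotides from a molecular weight (Da)."""
    return (mw_da - 159.0) / 320.5


print(ssrna_mw_da(50))                 # 16184.0 Da, i.e. about 16 kDa
print(ssrna_length_from_mw(16_000.0))  # approximately 49 nucleotides
```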
Fig. 2C schematically depicts an SDS-PAGE gel corresponding to the gRAMP protein 60 purified without (lane L4) and with (lane L5) the presence of the crRNA plasmid. Strikingly, a lower yield of the gRAMP protein 60 was obtained without the provision of the guide RNA 30, indicative of a stabilizing or chaperoning role for the guide RNA 30 in complex formation.
In order to further investigate the characteristics of the gRAMP-bound guide RNA 30, the gRAMP protein 60 was loaded with guide RNAs 30 corresponding to its native spacers. The Candidatus S. brodae type III-E locus contains a CRISPR array 15 comprising 11 spacers ranging from 34 to 41 nucleotides in size, interspaced by a 36 nt long repeat sequence. This region was PCR amplified from the Candidatus S. brodae genomic DNA and inserted into a crRNA plasmid. After gRAMP expression, the purified protein was degraded and the RNA content was separated using phenol-chloroform extraction.
Fig. 3 schematically depicts the experimental setup, as well as results from a corresponding denaturing PAGE analysis, wherein lane L3 indicates RNA extracted from gRAMP, and wherein the indicated numbers correspond to a number of ssRNA nucleotides.
Intriguingly, besides a population corresponding to an RNA species comprising the spacer and part of the repeat (~43-50 nt), the denaturing PAGE analysis reveals two other populations (~60 nt and ~90 nt). In particular, the presence of a spacer-sized RNA bound by the gRAMP protein 60 suggests that the gRAMP protein 60 may be capable of processing its own pre-crRNA. In other type III systems, generally dedicated proteins (e.g. Csm6) are required for the processing of the pre-crRNA into mature crRNAs.
Fig. 4A-C schematically depict results of an in vitro embodiment of the control method of the invention. In the depicted embodiment, the control method comprises providing a first mixture, wherein the first mixture comprises the target RNA 50, the gRAMP protein 60, and the guide RNA 30. In particular, in the depicted embodiment, purified gRAMP protein 60 was incubated together with three different target RNAs (50, 50a, 50b, 50c), specifically three different Cy5-labelled ssRNA substrates, complementary to the guide RNA 30 corresponding to a spacer in the CRISPR array 15, as depicted in Fig. 4A. Specifically, a first target RNA 50,50a corresponds to SEQ ID NO:20, a second target RNA 50,50b corresponds to SEQ ID NO:21, and a third target RNA 50,50c corresponds to SEQ ID NO:2. The first and second target RNAs are labelled with a Cy5-label 51 at their 5’-ends, whereas the third target RNA is labelled with a Cy5-label 51 at its 3’-end. The guide RNA 30 used for targeting the first, second and third target RNAs 50 is produced from the CRISPR array 15. In the depicted embodiment, the guide RNA corresponds to SEQ ID NO:19. In further embodiments, the guide RNA may, for instance, correspond to SEQ ID NO:78.
In the depicted embodiments, the target sites 45 of the gRAMP protein 60 in the target RNAs 50,50a,50b,50c are schematically indicated. Specifically, nucleotides in the guide RNA 30 are numbered according to their sequential order from the transition between a repeat region (or “repeat portion”) and a spacer region (or “spacer portion”) of the guide RNA, wherein nucleotides in the spacer region are numbered positively, and wherein nucleotides in the repeat region are numbered negatively. As indicated above, the gRAMP protein may, in embodiments, be configured to cleave the target RNA 50 at a first target site 45 between the nucleotides aligned with nucleotides 4 and 5 of the guide RNA 30. In further embodiments, the gRAMP protein may be configured to cleave the target RNA 50 at a second target site between the nucleotides aligned with nucleotides 10 and 11 of the guide RNA 30. The first target site 45 and the second target site 45 may be separated by a distance of d1 nucleotides, especially wherein d1=6.
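By way of illustration only, the positions of the two target sites within a target RNA may be computed from the guide RNA numbering as sketched below. The sketch assumes antiparallel base pairing in which nucleotide 1 of the spacer region pairs with the 3’-most nucleotide of the complementary region of the target RNA; this orientation, the function name `five_prime_fragment_lengths` and the example position of the complementary region are assumptions made for the sake of the example. Irrespective of the assumed orientation, the two cut positions are spaced 6 nucleotides apart.

```python
def five_prime_fragment_lengths(region_start: int, region_len: int,
                                cut_after_guide_nt=(4, 10)) -> list[int]:
    """Return the lengths of the 5' target fragments produced by each cut.

    region_start: 0-based index (5'->3') of the first target nucleotide in the
                  complementary region; region_len: length n of that region.
    Assumption: guide spacer nucleotide k (1-based) pairs with target index
    region_start + region_len - k (antiparallel pairing). A cut between the
    target nucleotides aligned with guide nucleotides k and k+1 then leaves a
    5' fragment of region_start + region_len - k nucleotides.
    """
    lengths = []
    for k in cut_after_guide_nt:
        if not 1 <= k < region_len:
            raise ValueError("cut site must lie within the complementary region")
        lengths.append(region_start + region_len - k)
    return sorted(lengths)


# Hypothetical example: complementary region of n = 27 nt starting at target index 10.
sites = five_prime_fragment_lengths(10, 27)
print(sites)                # 5' fragment lengths for the cuts after guide nt 10 and nt 4
print(sites[1] - sites[0])  # spacing d1 between the two target sites
```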
In the depicted embodiment, the complementary region 40 has a length of at least n nucleotides, wherein n is at least 12, such as at least 13, especially at least 15, such as at least 18. In further embodiments, n may be at least 20, especially at least 25, such as at least 30, especially at least 35. Specifically, in the depicted embodiment n is 27. In addition, in the depicted embodiment, the first mixture comprised 2 mM of the bivalent cation Mg²⁺.
Fig. 4B schematically depicts an SDS-page gel, wherein lane L6 corresponds to a marker lane comprising Cy5-labelled ssRNA having sizes of 10, 20, 30, 40, 50, and 60 nucleotides, lane L7 corresponds to the first target RNA 50, 50a, lane L8 corresponds to the first target RNA 50, 50a together with the gRAMP protein 60, lane L9 corresponds to the second target RNA 50, 50b, lane L10 corresponds to the second target RNA 50, 50b together with the gRAMP protein 60, lane L11 corresponds to the third target RNA 50, 50c, lane L12 corresponds to the third target RNA 50, 50c together with the gRAMP protein 60, lane L13 corresponds to Cy5-labelled non-target RNA corresponding to SEQ ID NO:54, lane L14 corresponds to the Cy5-labelled non-target RNA together with the gRAMP protein 60, lane L15 corresponds to Cy5-labelled ssDNA corresponding to SEQ ID NO:55, which is (at least partially) complementary to the guide RNA 30, and lane L16 corresponds to the Cy5-labelled ssDNA together with the gRAMP protein 60. The arrows indicate Cy5-labelled RNA fragments, indicating that the gRAMP protein successfully cut the target RNAs 50. In particular, Fig. 4B, lanes L7-L12 schematically depict two RNA fragments, irrespective of whether the Cy5 label is on the 3’-end or the 5’-end, indicating that gRAMP cuts the target RNA 50 at two target sites 45, separated by a distance d1 of 6 nucleotides, as depicted in Fig. 4A. The incubation of gRAMP with a non-complementary RNA target (L14) or a complementary fluorescently labelled ssDNA target (L16) did not yield cleavage products, indicating that gRAMP is a specific RNA-guided RNA cleaver.
To verify that the observed degradation pattern was due to guide RNA complementarity, additional 5’ Cy5-labelled target RNAs 50 with a 3’ and 5’ extension were designed. The exact same cleavage pattern was observed with varying 3’ sizes, whereas cleavage products shifted correspondingly with varying 5’ sizes. Collectively, these data demonstrate that the gRAMP protein 60 cleaves the target RNA 50 in positions well-defined by complementarity to the guide RNA 30.
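The reasoning behind the extension experiment can be sketched in Python as follows. This is a hedged illustration, not the procedure used to analyse the gels: the function names, the flank lengths, and the geometry are assumptions. In particular, the sketch assumes antiparallel pairing, so that spacer position k of the guide RNA pairs with the target nucleotide at 0-based index `five_prime_flank + n - k`, where n is the length of the complementary region.

```python
# Guide positions flanking each cut, as given in the description.
CUT_SITES = ((4, 5), (10, 11))


def cut_offsets(five_prime_flank: int, n: int) -> list[int]:
    """Number of target nucleotides lying 5' of each cut.

    Assumes antiparallel pairing of the spacer with the target, so the
    cut between guide positions (low, low + 1) lies after
    five_prime_flank + n - low target nucleotides.
    """
    return sorted(five_prime_flank + n - low for low, _ in CUT_SITES)


def labelled_fragments(five_prime_flank: int, n: int,
                       three_prime_flank: int, label_end: str) -> list[int]:
    """Sizes the labelled fragment can take when one or both cuts occur."""
    total = five_prime_flank + n + three_prime_flank
    offsets = cut_offsets(five_prime_flank, n)
    if label_end == "5'":
        return offsets
    if label_end == "3'":
        return sorted(total - offset for offset in offsets)
    raise ValueError("label_end must be \"5'\" or \"3'\"")


# Hypothetical flank lengths; n = 27 as in the depicted embodiment.
base = labelled_fragments(5, 27, 5, "5'")             # [22, 28]
longer_3_prime = labelled_fragments(5, 27, 15, "5'")  # [22, 28]
longer_5_prime = labelled_fragments(15, 27, 5, "5'")  # [32, 38]
```

Under these assumptions a 3’ extension leaves the 5’-labelled fragment sizes unchanged, whereas a 5’ extension shifts both fragments by the length of the extension, and the two fragments always differ by 6 nucleotides. This matches the qualitative pattern described above.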
As indicated above, the guide RNA 30 obtained from a CRISPR array may comprise a repeat region 16 and a spacer region 17. CRISPR systems may rely on non-complementarity of the repeat region 16 with a target in order to distinguish self from non-self. In particular, some CRISPR cleavers may only cleave a target if the repeat region 16 (or “repeat-derived crRNA 5’-handle”) in the guide RNA, especially the crRNA, is non-complementary to a protospacer 3’-flank (also known as the protospacer flanking sequence, or PFS) in the target, which may reduce the freedom in target selection.
Fig. 5 schematically depicts SDS-page results, wherein lane L19 indicates a target RNA with a PFS matching the guide RNA 30, lane L20 indicates the target RNA with matching PFS together with the gRAMP protein 60 (and the guide RNA 30), and lane L21 indicates a target RNA with a PFS not matching the guide RNA 30 together with the gRAMP protein 60. Specifically, the target RNA with matching PFS corresponds to SEQ ID NO:56, the target RNA with non-matching PFS corresponds to SEQ ID NO:57, and the guide RNA corresponds to SEQ ID NO:19. The results depicted in Fig. 5 indicate that there is no substantial difference in cleavage between a target RNA having 8 nucleotides of additional complementarity extending beyond the spacer sequence and a target RNA having an 8-nucleotide non-complementary extension.
Hence, these results suggest that target cleavage by the gRAMP protein is not affected, especially not inhibited, by a matching PFS.
Fig. 6A schematically depicts SDS-page results, wherein both lanes L22 and L23 correspond to a target RNA 50 exposed to the gRAMP protein 60 and a complementary guide RNA 30, wherein lane L22 corresponds to a first mixture comprising < 0.001 mM of a bivalent cation selected from the group comprising Mg²⁺, Ca²⁺, and Mn²⁺, especially corresponding to (essentially) an absence of the bivalent cation(s), and wherein lane L23 corresponds to the presence of Mg²⁺. The cleavage products clearly indicate a substantial decrease, especially abolishment, of cleavage by the gRAMP protein in the absence of the bivalent cation.
As indicated above, the native gRAMP protein of Candidatus S. brodae may cut a target RNA 50 at two well-defined sites.
As, for specific applications, it may be preferable for the gRAMP protein to cut at a single well-defined site, engineered gRAMP proteins were designed and evaluated.
These experiments were performed with the same target RNA 50 and guide RNA 30, corresponding to SEQ ID NO:57 and SEQ ID NO: 19, respectively.
Fig. 6B schematically depicts SDS-page results, wherein for each of lanes L24-L34 the target RNA is present, wherein L25 corresponds to the gRAMP protein with SEQ ID NO:1, L26 corresponds to a D437A D516A mutant of SEQ ID NO:1, L27 corresponds to a D448A mutant of SEQ ID NO:1, L28 corresponds to a D448A D516A mutant of SEQ ID NO:1, L29 corresponds to a S457A D516A mutant of SEQ ID NO:1, L30 corresponds to a D698A mutant of SEQ ID NO:1, L31 corresponds to a D698A D771A mutant of SEQ ID NO:1, L32 corresponds to a D771A mutant of SEQ ID NO:1, L33 corresponds to a D968A mutant of SEQ ID NO:1, and L34 corresponds to a D971A mutant of SEQ ID NO:1. The mutational analysis indeed revealed that the gRAMP protein lost one of its active sites upon changing a single aspartic acid residue to an alanine, i.e., the D698A mutant, as can be seen in lanes L30 and L31. The single amino acid change renders the gRAMP protein capable of sequence-specifically cutting RNA in exactly one spot, making this gRAMP protein mutant potentially of value for precise nucleic acid-editing technologies where guided single cleavages are desired (such as sequence-specific editing of RNA and the processing of precursor mRNAs).
Hence, in embodiments, the (gRAMP) amino acid sequence may have a first amino acid aligned with the aspartate of SEQ ID NO:1 at position 698 in the sequence alignment, wherein the first amino acid differs from aspartate. In further embodiments, the first amino acid may especially be alanine. In such embodiments, the gRAMP protein 60 may be configured to cut a target RNA 50 at a single target site 45.
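The mutant notation used above (for instance D698A, meaning that the aspartate at position 698 is replaced by alanine) may be illustrated with a minimal Python sketch. The helper `apply_mutations` and the five-residue toy sequence are hypothetical and serve only to show how the notation is read; the sketch does not reproduce how the mutants of SEQ ID NO:1 were actually constructed.

```python
import re

# One-letter original residue, 1-based position, one-letter replacement.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence: str, mutations: str) -> str:
    """Apply space-separated point mutations such as "D698A D771A"."""
    residues = list(sequence)
    for token in mutations.split():
        match = _MUTATION.match(token)
        if match is None:
            raise ValueError(f"unrecognised mutation: {token!r}")
        original, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # notation is 1-based
        if not 0 <= index < len(residues):
            raise ValueError(f"position out of range in {token!r}")
        # Guard against applying the mutation to the wrong residue.
        if residues[index] != original:
            raise ValueError(
                f"{token!r} expects {original} but found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Toy sequence for illustration only; this is NOT SEQ ID NO:1.
toy = "MKDSD"
double_mutant = apply_mutations(toy, "D3A D5A")
```

Checking the original residue before substituting is a deliberate design choice in this sketch: it catches off-by-one errors that arise when a position is counted against a differently numbered sequence, such as a homologue that is only aligned with SEQ ID NO:1.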
As the gRAMP protein 60 may cleave a target RNA 50 in a programmable manner, the gRAMP protein 60 may be used for in vivo target RNA modulation, such as for in vivo mRNA knockdown. To evaluate this potential, E. coli was co-transformed with pgRAMP, pCRISPR-1 and either pGFPuv-Target or pGFPuv-Non-Target.
Fig. 7 schematically depicts GFP measurements of the recombinant E. coli cells in fluorescence F (in a.u.) versus time T (in hh:mm:ss), wherein line L35 corresponds to the recombinant E. coli transformed with pgRAMP, pCRISPR-1 and pGFPuv-Non-Target, whereas line L36 corresponds to recombinant E. coli transformed with pgRAMP, pCRISPR-1 and pGFPuv-Target. Indeed, a reduction in fluorescence was observed from cells with pGFPuv-Target compared to cells with pGFPuv-Non-Target, indicating that gRAMP can interfere with protein expression at the transcript level.
Hence, in embodiments, the control method may comprise modulating the target RNA 50 in a target cell 120, wherein the target cell 120 (has a genome that) encodes the target RNA 50, and wherein the control method comprises providing the gRAMP protein 60 and the guide RNA 30 inside the target cell 120. Especially, the control method may comprise providing the polynucleotide 10 to the target cell 120, and controlling the expression of the polynucleotide 10, especially of the coding sequence 11, in the target cell 120.
Type III-E CRISPR loci may often comprise a plurality of coding sequences corresponding to different proteins, among which the gRAMP protein, as well as the TPR-CHAT protein. To test whether and to what extent the gRAMP protein interacts with the TPR-CHAT protein, pulling experiments were performed. In particular, an expression system was designed in which gRAMP, CRISPR RNA and TPR-CHAT are produced, as schematically depicted in Fig. 8A. Each of the gRAMP protein and the TPR-CHAT protein was designed to carry a respective (different) tag (Dual-Strep tag and 6x His-tag, respectively) that may be used to purify the protein using respective columns (streptavidin-column and his-column). Column purification with either column results in purification of both the gRAMP protein and the TPR-CHAT protein, indicating that they are interacting. The interaction was not lost after subsequent heparin chromatography, and size-exclusion chromatography revealed a single peak for the complex.
In the embodiment depicted in Fig. 8A, the second coding sequence 12 coding for the TPR-CHAT protein 70 is arranged on a second polynucleotide 20, whereas the coding sequence 11 coding for the gRAMP protein 60 is arranged on the polynucleotide 10. In further embodiments, the polynucleotide 10 may comprise the second coding sequence 12, wherein the second coding sequence 12 encodes a TPR-CHAT protein 70, wherein a TPR-CHAT amino acid sequence of the TPR-CHAT protein 70 has at least 25% sequence identity to SEQ ID NO:27 with respect to a sequence alignment between the amino acid sequence and SEQ ID NO:27, wherein the sequence alignment has a length of at least 60% of a sequence length of SEQ ID NO:27, and wherein the TPR-CHAT protein 70 is a protease.
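The two thresholds named above (at least 25% sequence identity, over an alignment spanning at least 60% of the length of SEQ ID NO:27) can be sketched as a simple check in Python. This is one possible reading, stated as an assumption: identity is computed here as identical columns divided by the number of alignment columns, and alignment length is taken as the number of alignment columns. The function name, its parameters, and the toy alignment are hypothetical, and the alignment itself is assumed to come from a separate alignment tool.

```python
def meets_identity_criteria(aligned_query: str, aligned_ref: str,
                            ref_length: int,
                            min_identity: float = 0.25,
                            min_coverage: float = 0.60) -> bool:
    """Check identity and alignment-length thresholds for a gapped alignment.

    aligned_query and aligned_ref are equal-length strings in which '-'
    marks a gap; ref_length is the full length of the reference sequence.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    if ref_length <= 0 or not aligned_ref:
        raise ValueError("reference and alignment must be non-empty")

    alignment_length = len(aligned_ref)
    # A column counts as identical only if both residues match and neither is a gap.
    identical = sum(
        1 for q, r in zip(aligned_query, aligned_ref) if q == r and q != "-"
    )
    identity = identical / alignment_length
    coverage = alignment_length / ref_length
    return identity >= min_identity and coverage >= min_coverage


# Toy alignment for illustration only; unrelated to SEQ ID NO:27.
passes = meets_identity_criteria("MK-LVE", "MKALIE", ref_length=8)
too_short = meets_identity_criteria("MK-LVE", "MKALIE", ref_length=12)
```

In the toy case four of six columns are identical (about 67% identity). The alignment covers 75% of an 8-residue reference, so the first call returns `True`, but only 50% of a 12-residue reference, so the second call returns `False`.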
Fig. 8B schematically depicts SDS-page results of fractions of size exclusion chromatography, wherein lane L40 corresponds to a ladder with reference sizes of about 15, 25, 35, 55, 70, 100, 130, and 250 kDa, and wherein lanes L41, L42, L43, L44, and L45 correspond to different elution fractions of the size exclusion chromatography. Hence, a complex comprising the gRAMP protein 60 and the TPR-CHAT protein 70 is observed across several fractions obtained from the size exclusion chromatography, further supporting a strong complex formation between these proteins. In particular, the complex appears to contain the gRAMP protein and the TPR-CHAT protein in a ratio of about 1:1.
Fig. 9A schematically depicts SDS-page results wherein lanes L46 and L49 correspond to an ssRNA-Cy5 ladder with reference sizes of 10, 20, 30, 40, 50, and 60 nucleotides, wherein lane L47 corresponds to a target RNA 50 corresponding to SEQ ID NO:57, and wherein lane L48 corresponds to a target RNA 50 corresponding to SEQ ID NO:57 together with a complex comprising the gRAMP protein 60 and the TPR-CHAT protein as well as a guide RNA 30 (corresponding to SEQ ID NO:19). Hence, the purified complex was still able to cleave the target RNA 50, in (essentially) the same way as observed for the gRAMP protein without functionally associated TPR-CHAT protein.
The complex purification was repeated in the presence of an RNA target 50. Fig. 9C schematically depicts the experimental setup, as well as gel results, wherein L50 indicates a ladder with reference sizes of about 15, 25, 35, 55, 70, 100, 130, and 250 kDa, wherein L51-L54 correspond to a Ni-NTA His-tag purification column, and wherein L55-L58 correspond to a Streptavidin purification column, and wherein lanes L51 and L55 correspond to flow-through, L52 and L56 to washing, L53 and L57 to elution, and L54 and L58 to 20 times elution. In particular, Fig. 9B schematically depicts that also in the presence of a target RNA 50 the gRAMP protein 60 and the TPR-CHAT protein 70 are pulled without their respective tags. Further, purified complex was incubated with an excess of target RNA 50 and without Mg²⁺ ions in the first mixture, which may lead to target binding but may not lead to cleavage.
The functional association between the gRAMP protein 60 and the TPR-CHAT protein 70 may strongly suggest that the (protease) activity of the TPR-CHAT protein 70 is (at least partially) controlled by the interaction with the gRAMP protein 60, especially by recognition or cleavage of a target RNA 50 by a guide RNA 30 functionally associated with the gRAMP protein 60.
Hence, in embodiments, the polynucleotide 10 or the gRAMP protein 60 may be used for modulating a protease, especially a TPR-CHAT protein 70.
Fig. 10 schematically depicts results of the sequencing of the guide RNA 30, specifically the crRNAs, extracted from the gRAMP protein 60 that was expressed in the presence of pCRISPR-2, i.e., of the whole native CRISPR array 15 from Candidatus S. brodae. Specifically, Fig. 10 schematically depicts a consensus mapping of the guide RNA 30 to the CRISPR array, indicating that the guide RNA 30 comprises a repeat portion (or “repeat region”) corresponding to (at least part of) a repeat 16 of the CRISPR array 15 and a spacer portion (or “spacer region”) corresponding to (at least part of) a spacer 17 of the CRISPR array 15. The depicted consensus sequence corresponds to SEQ ID NO:87, and the consensus sequence of the guide RNA corresponds to SEQ ID NO:19. As schematically depicted, the sequenced guide RNAs (corresponding to this spacer) may generally comprise about 27-30 nucleotides in the repeat region and about 15-27 nucleotides in the spacer region. Specifically, essentially all sequenced guide RNAs (corresponding to this spacer) share about 27 nucleotides of the repeat region and about 15 nucleotides of the spacer region.
Hence, in embodiments, the repeat region may comprise 20-35 nucleotides, especially 25-32 nucleotides, such as 27-30 nucleotides.
In further embodiments, the spacer region may comprise at least 13 nucleotides, such as at least 14 nucleotides, especially at least 15 nucleotides. In further embodiments, the spacer region may comprise at least 16 nucleotides, such as at least 17 nucleotides, especially at least 18 nucleotides. In further embodiments, the spacer region may comprise at most 35 nucleotides, such as at most 30 nucleotides, especially at most 27 nucleotides, such as at most 21 nucleotides.
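As a small illustration, the broadest length ranges given above for the repeat region (20-35 nucleotides) and the spacer region (13-35 nucleotides) can be expressed as a Python check. The function name and its default arguments are hypothetical and merely restate those outermost ranges; narrower embodiments can be tested by passing tighter ranges.

```python
def within_described_ranges(repeat_len: int, spacer_len: int,
                            repeat_range: tuple[int, int] = (20, 35),
                            spacer_range: tuple[int, int] = (13, 35)) -> bool:
    """Return True if both region lengths fall inside the given inclusive ranges."""
    repeat_ok = repeat_range[0] <= repeat_len <= repeat_range[1]
    spacer_ok = spacer_range[0] <= spacer_len <= spacer_range[1]
    return repeat_ok and spacer_ok


# The shared core reported for Fig. 10: about 27 repeat and 15 spacer nucleotides.
core_ok = within_described_ranges(27, 15)
# The same core tested against the narrower 27-30 / 15-27 ranges.
core_ok_narrow = within_described_ranges(27, 15, (27, 30), (15, 27))
```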
The term “plurality” refers to two or more. Furthermore, the terms “a plurality of” and “a number of” may be used interchangeably.
The terms “substantially” or “essentially” herein, and similar terms, will be understood by the person skilled in the art. The terms “substantially” or “essentially” may also include embodiments with “entirely”, “completely”, “all”, etc. Hence, in embodiments the adjective substantially or essentially may also be removed. Where applicable, the term “substantially” or the term “essentially” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%. Moreover, the terms “about” and “approximately” may also relate to 90% or higher, such as 95% or higher, especially 99% or higher, even more especially 99.5% or higher, including 100%. For numerical values it is to be understood that the terms “substantially”, “essentially”, “about”, and “approximately” may also relate to the range of 90%-110%, such as 95%-105%, especially 99%-101% of the value(s) they refer to.
The term “comprise” also includes embodiments wherein the term “comprises” means “consists of”.
The term “and/or” especially relates to one or more of the items mentioned before and after “and/or”. For instance, a phrase “item 1 and/or item 2” and similar phrases may relate to one or more of item 1 and item 2. The term "comprising" may in an embodiment refer to "consisting of" but may in another embodiment also refer to "containing at least the defined species and optionally one or more other species".
Furthermore, the terms first, second, third and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
The devices, apparatus, or systems may herein amongst others be described during operation. As will be clear to the person skilled in the art, the invention is not limited to methods of operation, or devices, apparatus, or systems in operation.
The term “further embodiment” and similar terms may refer to an embodiment comprising the features of the previously discussed embodiment, but may also refer to an alternative embodiment.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.
Use of the verb "to comprise" and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise”, “comprising”, “include”, “including”, “contain”, “containing” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”.
The article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements.
The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim, or an apparatus claim, or a system claim, enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The invention also provides a control system that may control the device, apparatus, or system, or that may execute the herein described method or process. Yet further, the invention also provides a computer program product that, when running on a computer which is functionally coupled to or comprised by the device, apparatus, or system, controls one or more controllable elements of such device, apparatus, or system.
The invention further applies to a device, apparatus, or system comprising one or more of the characterizing features described in the description and/or shown in the attached drawings. The invention further pertains to a method or process comprising one or more of the characterizing features described in the description and/or shown in the attached drawings. Moreover, if a method or an embodiment of the method is described being executed in a device, apparatus, or system, it will be understood that the device, apparatus, or system is suitable for or configured for (executing) the method or the embodiment of the method, respectively.
The various aspects discussed in this patent can be combined in order to provide additional advantages. Further, the person skilled in the art will understand that embodiments can be combined, and that also more than two embodiments can be combined. Furthermore, some of the features can form the basis for one or more divisional applications.
20210531 P1600222NL00 SequenceListing.txt
SEQUENCE LISTING <110> Technische Universiteit Delft <120> gRAMP protein for modulating a target RNA <130> P1600222NL00 <160> 88 <170> BiSSAP 1.3.6 <210> 1 <211> 1722 <212> PRT <213> Candidatus Scalindua brodae <220> <223> Candidatus Scalindua brodae gRAMP <400> 1 Met Lys Ser Asn Asp Met Asn Ile Thr Val Glu Leu Thr Phe Phe Glu 1 5 10 15 Pro Tyr Arg Leu Val Glu Trp Phe Asp Trp Asp Ala Arg Lys Lys Ser His Ser Ala Met Arg Gly Gln Ala Phe Ala Gln Trp Thr Trp Lys Gly 40 45 Lys Gly Arg Thr Ala Gly Lys Ser Phe Ile Thr Gly Thr Leu Val Arg 50 55 60 Ser Ala Val Ile Lys Ala Val Glu Glu Leu Leu Ser Leu Asn Asn Gly 65 70 75 80 Lys Trp Glu Gly Val Pro Cys Cys Asn Gly Ser Phe Gln Thr Asp Glu 85 90 95 Ser Lys Gly Lys Lys Pro Ser Phe Leu Arg Lys Arg His Thr Leu Gln 100 105 110 Trp Gln Ala Asn Asn Lys Asn Ile Cys Asp Lys Glu Glu Ala Cys Pro 115 120 125 Phe Cys Ile Leu Leu Gly Arg Phe Asp Asn Ala Gly Lys Val His Glu 130 135 140 Arg Asn Lys Asp Tyr Asp Ile His Phe Ser Asn Phe Asp Leu Asp His 145 150 155 160 Lys Gln Glu Lys Asn Asp Leu Arg Leu Val Asp Ile Ala Ser Gly Arg 165 170 175 Ile Leu Asn Arg Val Asp Phe Asp Thr Gly Lys Ala Lys Asp Tyr Phe 180 185 190 Arg Thr Trp Glu Ala Asp Tyr Glu Thr Tyr Gly Thr Tyr Thr Gly Arg 195 200 205 Ile Thr Leu Arg Asn Glu His Ala Lys Lys Leu Leu Leu Ala Ser Leu Page 1
Gly Phe Val Asp Lys Leu Cys Gly Ala Leu Cys Arg Ile Glu Val Ile 225 230 235 240 Lys Lys Ser Glu Ser Pro Leu Pro Ser Asp Thr Lys Glu Gln Ser Tyr 245 250 255 Thr Lys Asp Asp Thr Val Glu Val Leu Ser Glu Asp His Asn Asp Glu 260 265 270 Leu Arg Lys Gln Ala Glu Val Ile Val Glu Ala Phe Lys Gln Asn Asp 275 280 285 Lys Leu Glu Lys Ile Arg Ile Leu Ala Asp Ala Ile Arg Thr Leu Arg 290 295 300 Leu His Gly Glu Gly Val Ile Glu Lys Asp Glu Leu Pro Asp Gly Lys 305 310 315 320 Glu Glu Arg Asp Lys Gly His His Leu Trp Asp Ile Lys Val Gln Gly 325 330 335 Thr Ala Leu Arg Thr Lys Leu Lys Glu Leu Trp Gln Ser Asn Lys Asp 340 345 350 Ile Gly Trp Arg Lys Phe Thr Glu Met Leu Gly Ser Asn Leu Tyr Leu 355 360 365 Ile Tyr Lys Lys Glu Thr Gly Gly Val Ser Thr Arg Phe Arg Ile Leu 370 375 380 Gly Asp Thr Glu Tyr Tyr Ser Lys Ala His Asp Ser Glu Gly Ser Asp 385 390 395 400 Leu Phe Ile Pro Val Thr Pro Pro Glu Gly Ile Glu Thr Lys Glu Trp 405 410 415 Ile Ile Val Gly Arg Leu Lys Ala Ala Thr Pro Phe Tyr Phe Gly Val 420 425 430 Gln Gln Pro Ser Asp Ser Ile Pro Gly Lys Glu Lys Lys Ser Glu Asp 435 440 445 Ser Leu Val Ile Asn Glu His Thr Ser Phe Asn Ile Leu Leu Asp Lys 450 455 460 Glu Asn Arg Tyr Arg Ile Pro Arg Ser Ala Leu Arg Gly Ala Leu Arg 465 470 475 480 Arg Asp Leu Arg Thr Ala Phe Gly Ser Gly Cys Asn Val Ser Leu Gly 485 490 495 Gly Gln Ile Leu Cys Asn Cys Lys Val Cys Ile Glu Met Arg Arg Ile 500 505 510 Thr Leu Lys Asp Ser Val Ser Asp Phe Ser Glu Pro Pro Glu Ile Arg 515 520 525 Tyr Arg Ile Ala Lys Asn Pro Gly Thr Ala Thr Val Glu Asp Gly Ser 530 535 540 Leu Phe Asp Ile Glu Val Gly Pro Glu Gly Leu Thr Phe Pro Phe Val 545 550 555 560 Leu Arg Tyr Arg Gly His Lys Phe Pro Glu Gln Leu Ser Ser Val Ile 565 570 575 Arg Tyr Trp Glu Glu Asn Asp Gly Lys Asn Gly Met Ala Trp Leu Gly 580 585 590 Gly Leu Asp Ser Thr Gly Lys Gly Arg Phe Ala Leu Lys Asp Ile Lys 595 600 605 Ile Phe Glu Trp Asp Leu Asn Gln Lys Ile Asn Glu Tyr Ile Lys Glu
Arg Gly Met Arg Gly Lys Glu Lys Glu Leu Leu Glu Met Gly Glu Ser 625 630 635 640 Ser Leu Pro Asp Gly Leu Ile Pro Tyr Lys Phe Phe Glu Glu Arg Glu 645 650 655 Cys Leu Phe Pro Tyr Lys Glu Asn Leu Lys Pro Gln Trp Ser Glu Val 660 665 670 Gln Tyr Thr Ile Glu Val Gly Ser Pro Leu Leu Thr Ala Asp Thr Ile 675 680 685 Ser Ala Leu Thr Glu Pro Gly Asn Arg Asp Ala Ile Ala Tyr Lys Lys 690 695 700 Arg Val Tyr Asn Asp Gly Asn Asn Ala Ile Glu Pro Glu Pro Arg Phe 705 710 715 720 Ala Val Lys Ser Glu Thr His Arg Gly Ile Phe Arg Thr Ala Val Gly 725 730 735 Arg Arg Thr Gly Asp Leu Gly Lys Glu Asp His Glu Asp Cys Thr Cys 740 745 750 Asp Met Cys Ile Ile Phe Gly Asn Glu His Glu Ser Ser Lys Ile Arg 755 760 765 Phe Glu Asp Leu Glu Leu Ile Asn Gly Asn Glu Phe Glu Lys Leu Glu 770 775 780 Lys His Ile Asp His Val Ala Ile Asp Arg Phe Thr Gly Gly Ala Leu 785 790 795 800 Asp Lys Ala Lys Phe Asp Thr Tyr Pro Leu Ala Gly Ser Pro Lys Lys 805 810 815 Pro Leu Lys Leu Lys Gly Arg Phe Trp Ile Lys Lys Gly Phe Ser Gly 820 825 830 Asp His Lys Leu Leu Ile Thr Thr Ala Leu Ser Asp Ile Arg Asp Gly 835 840 845 Leu Tyr Pro Leu Gly Ser Lys Gly Gly Val Gly Tyr Gly Trp Val Ala 850 855 860 Gly Ile Ser Ile Asp Asp Asn Val Pro Asp Asp Phe Lys Glu Met Ile 865 870 875 880 Asn Lys Thr Glu Met Pro Leu Pro Glu Glu Val Glu Glu Ser Asn Asn 885 890 895 Gly Pro Ile Asn Asn Asp Tyr Val His Pro Gly His Gln Ser Pro Lys 900 905 910 Gln Asp His Lys Asn Lys Asn Ile Tyr Tyr Pro His Tyr Phe Leu Asp 915 920 925 Ser Gly Ser Lys Val Tyr Arg Glu Lys Asp Ile Ile Thr His Glu Glu 930 935 940 Phe Thr Glu Glu Leu Leu Ser Gly Lys Ile Asn Cys Lys Leu Glu Thr 945 950 955 960 Leu Thr Pro Leu Ile Ile Pro Asp Thr Ser Asp Glu Asn Gly Leu Lys 965 970 975 Leu Gln Gly Asn Lys Pro Gly His Lys Asn Tyr Lys Phe Phe Asn Ile 980 985 990 Asn Gly Glu Leu Met Ile Pro Gly Ser Glu Leu Arg Gly Met Leu Arg 995 1000 1005 Thr His Phe Glu Ala Leu Thr Lys Ser Cys Phe Ala Ile Phe Gly Glu Page 3
Asp Ser Thr Leu Ser Trp Arg Met Asn Ala Asp Glu Lys Asp Tyr Lys 1025 1030 1035 1040 Ile Asp Ser Asn Ser Ile Arg Lys Met Glu Ser Gln Arg Asn Pro Lys 1045 1050 1055 Tyr Arg Ile Pro Asp Glu Leu Gln Lys Glu Leu Arg Asn Ser Gly Asn 1060 1065 1070 Gly Leu Phe Asn Arg Leu Tyr Thr Ser Glu Arg Arg Phe Trp Ser Asp 1075 1080 1085 Val Ser Asn Lys Phe Glu Asn Ser Ile Asp Tyr Lys Arg Glu Ile Leu 1090 1095 1100 Arg Cys Ala Gly Arg Pro Lys Asn Tyr Lys Gly Gly Ile Ile Arg Gln 1105 1110 1115 1120 Arg Lys Asp Ser Leu Met Ala Glu Glu Leu Lys Val His Arg Leu Pro 1125 1130 1135 Leu Tyr Asp Asn Phe Asp Ile Pro Asp Ser Ala Tyr Lys Ala Asn Asp 1140 1145 1150 His Cys Arg Lys Ser Ala Thr Cys Ser Thr Ser Arg Gly Cys Arg Glu 1155 1160 1165 Arg Phe Thr Cys Gly Ile Lys Val Arg Asp Lys Asn Arg Val Phe Leu 1170 1175 1180 Asn Ala Ala Asn Asn Asn Arg Gln Tyr Leu Asn Asn Ile Lys Lys Ser 1185 1190 1195 1200 Asn His Asp Leu Tyr Leu Gln Tyr Leu Lys Gly Glu Lys Lys Ile Arg 1205 1210 1215 Phe Asn Ser Lys Val Ile Thr Gly Ser Glu Arg Ser Pro Ile Asp Val 1220 1225 1230 Ile Ala Glu Leu Asn Glu Arg Gly Arg Gln Thr Gly Phe Ile Lys Leu 1235 1240 1245 Ser Gly Leu Asn Asn Ser Asn Lys Ser Gln Gly Asn Thr Gly Thr Thr 1250 1255 1260 Phe Asn Ser Gly Trp Asp Arg Phe Glu Leu Asn Ile Leu Leu Asp Asp 1265 1270 1275 1280 Leu Glu Thr Arg Pro Ser Lys Ser Asp Tyr Pro Arg Pro Arg Leu Leu 1285 1290 1295 Phe Thr Lys Asp Gln Tyr Glu Tyr Asn Ile Thr Lys Arg Cys Glu Arg 1300 1305 1310 Val Phe Glu Ile Asp Lys Gly Asn Lys Thr Gly Tyr Pro Val Asp Asp 1315 1320 1325 Gln Ile Lys Lys Asn Tyr Glu Asp Ile Leu Asp Ser Tyr Asp Gly Ile 1330 1335 1340 Lys Asp Gln Glu Val Ala Glu Arg Phe Asp Thr Phe Thr Arg Gly Ser 1345 1350 1355 1360 Lys Leu Lys Val Gly Asp Leu Val Tyr Phe His Ile Asp Gly Asp Asn 1365 1370 1375 Lys Ile Asp Ser Leu Ile Pro Val Arg Ile Ser Arg Lys Cys Ala Ser 1380 1385 1390 Lys Thr Leu Gly Gly Lys Leu Asp Lys Ala Leu His Pro Cys Thr Gly 1395 1400 1405 Leu Ser Asp Gly Leu Cys Pro Gly Cys His Leu Phe Gly Thr Thr Asp Page 4
Tyr Lys Gly Arg Val Lys Phe Gly Phe Ala Lys Tyr Glu Asn Gly Pro 1425 1430 1435 1440 Glu Trp Leu Ile Thr Arg Gly Asn Asn Pro Glu Arg Ser Leu Thr Leu 1445 1450 1455 Gly Val Leu Glu Ser Pro Arg Pro Ala Phe Ser Ile Pro Asp Asp Glu 1460 1465 1470 Ser Glu Ile Pro Gly Arg Lys Phe Tyr Leu His His Asn Gly Trp Arg 1475 1480 1485 Ile Ile Arg Gln Lys Gln Leu Glu Ile Arg Glu Thr Val Gln Pro Glu 1490 1495 1500 Arg Asn Val Thr Thr Glu Val Met Asp Lys Gly Asn Val Phe Ser Phe 1505 1510 1515 1520 Asp Val Arg Phe Glu Asn Leu Arg Glu Trp Glu Leu Gly Leu Leu Leu 1525 1530 1535 Gln Ser Leu Asp Pro Gly Lys Asn Ile Ala His Lys Leu Gly Lys Gly 1540 1545 1550 Lys Pro Tyr Gly Phe Gly Ser Val Lys Ile Lys Ile Asp Ser Leu His 1555 1560 1565 Thr Phe Lys Ile Asn Ser Asn Asn Asp Lys Ile Lys Arg Val Pro Gln 1570 1575 1580 Ser Asp Ile Arg Glu Tyr Ile Asn Lys Gly Tyr Gln Lys Leu Ile Glu 1585 1590 1595 1600 Trp Ser Gly Asn Asn Ser Ile Gln Lys Gly Asn Val Leu Pro Gln Trp 1605 1610 1615 His Val Ile Pro His Ile Asp Lys Leu Tyr Lys Leu Leu Trp Val Pro 1620 1625 1630 Phe Leu Asn Asp Ser Lys Leu Glu Pro Asp Val Arg Tyr Pro Val Leu 1635 1640 1645 Asn Glu Glu Ser Lys Gly Tyr Ile Glu Gly Ser Asp Tyr Thr Tyr Lys 1650 1655 1660 Lys Leu Gly Asp Lys Asp Asn Leu Pro Tyr Lys Thr Arg Val Lys Gly 1665 1670 1675 1680 Leu Thr Thr Pro Trp Ser Pro Trp Asn Pro Phe Gln Val Ile Ala Glu 1685 1690 1695 His Glu Glu Gln Glu Val Asn Val Thr Gly Ser Arg Pro Ser Val Thr 1700 1705 1710 Asp Lys Ile Glu Arg Asp Gly Lys Met Val 1715 1720 <210> 2 <211> 1717 <212> PRT <213> Candidatus Scalindua brodae <220> <223> JRY001000185.1 cds KHE91659.1 <400> 2
Met Asn Ile Thr Val Glu Leu Thr Phe Phe Glu Pro Tyr Arg Leu Val 1 5 10 15 Glu Trp Phe Asp Trp Asp Ala Arg Lys Lys Ser His Ser Ala Met Arg
Gly Gln Ala Phe Ala Gln Trp Thr Trp Lys Gly Lys Gly Arg Thr Ala 40 45 Gly Lys Ser Phe Ile Thr Gly Thr Leu Val Arg Ser Ala Val Ile Lys 50 55 60 Ala Val Glu Glu Leu Leu Ser Leu Asn Asn Gly Lys Trp Glu Gly Val 65 70 75 80 Pro Cys Cys Asn Gly Ser Phe Gln Thr Asp Glu Ser Lys Gly Lys Lys 85 90 95 Pro Ser Phe Leu Arg Lys Arg His Thr Leu Gln Trp Gln Ala Asn Asn 100 105 110 Lys Asn Ile Cys Asp Lys Glu Glu Ala Cys Pro Phe Cys Ile Leu Leu 115 120 125 Gly Arg Phe Asp Asn Ala Gly Lys Val His Glu Arg Asn Lys Asp Tyr 130 135 140 Asp Ile His Phe Ser Asn Phe Asp Leu Asp His Lys Gln Glu Lys Asn 145 150 155 160 Asp Leu Arg Leu Val Asp Ile Ala Ser Gly Arg Ile Leu Asn Arg Val 165 170 175 Asp Phe Asp Thr Gly Lys Ala Lys Asp Tyr Phe Arg Thr Trp Glu Ala 180 185 190 Asp Tyr Glu Thr Tyr Gly Thr Tyr Thr Gly Arg Ile Thr Leu Arg Asn 195 200 205 Glu His Ala Lys Lys Leu Leu Leu Ala Ser Leu Gly Phe Val Asp Lys 210 215 220 Leu Cys Gly Ala Leu Cys Arg Ile Glu Val Ile Lys Lys Ser Glu Ser 225 230 235 240 Pro Leu Pro Ser Asp Thr Lys Glu Gln Ser Tyr Thr Lys Asp Asp Thr 245 250 255 Val Glu Val Leu Ser Glu Asp His Asn Asp Glu Leu Arg Lys Gln Ala 260 265 270 Glu Val Ile Val Glu Ala Phe Lys Gln Asn Asp Lys Leu Glu Lys Ile 275 280 285 Arg Ile Leu Ala Asp Ala Ile Arg Thr Leu Arg Leu His Gly Glu Gly 290 295 300 Val Ile Glu Lys Asp Glu Leu Pro Asp Gly Lys Glu Glu Arg Asp Lys 305 310 315 320 Gly His His Leu Trp Asp Ile Lys Val Gln Gly Thr Ala Leu Arg Thr 325 330 335 Lys Leu Lys Glu Leu Trp Gln Ser Asn Lys Asp Ile Gly Trp Arg Lys 340 345 350 Phe Thr Glu Met Leu Gly Ser Asn Leu Tyr Leu Ile Tyr Lys Lys Glu 355 360 365 Thr Gly Gly Val Ser Thr Arg Phe Arg Ile Leu Gly Asp Thr Glu Tyr 370 375 380 Tyr Ser Lys Ala His Asp Ser Glu Gly Ser Asp Leu Phe Ile Pro Val 385 390 395 400 Page ©
Thr Pro Pro Glu Gly Ile Glu Thr Lys Glu Trp Ile Ile Val Gly Arg 405 410 415 Leu Lys Ala Ala Thr Pro Phe Tyr Phe Gly Val Gln Gln Pro Ser Asp 420 425 430 Ser Ile Pro Gly Lys Glu Lys Lys Ser Glu Asp Ser Leu Val Ile Asn 435 440 445 Glu His Thr Ser Phe Asn Ile Leu Leu Asp Lys Glu Asn Arg Tyr Arg 450 455 460 Ile Pro Arg Ser Ala Leu Arg Gly Ala Leu Arg Arg Asp Leu Arg Thr 465 470 475 480 Ala Phe Gly Ser Gly Cys Asn Val Ser Leu Gly Gly Gln Ile Leu Cys 485 490 495 Asn Cys Lys Val Cys Ile Glu Met Arg Arg Ile Thr Leu Lys Asp Ser 500 505 510 Val Ser Asp Phe Ser Glu Pro Pro Glu Ile Arg Tyr Arg Ile Ala Lys 515 520 525 Asn Pro Gly Thr Ala Thr Val Glu Asp Gly Ser Leu Phe Asp Ile Glu 530 535 540 Val Gly Pro Glu Gly Leu Thr Phe Pro Phe Val Leu Arg Tyr Arg Gly 545 550 555 560 His Lys Phe Pro Glu Gln Leu Ser Ser Val Ile Arg Tyr Trp Glu Glu 565 570 575 Asn Asp Gly Lys Asn Gly Met Ala Trp Leu Gly Gly Leu Asp Ser Thr 580 585 590 Gly Lys Gly Arg Phe Ala Leu Lys Asp Ile Lys Ile Phe Glu Trp Asp 595 600 605 Leu Asn Gln Lys Ile Asn Glu Tyr Ile Lys Glu Arg Gly Met Arg Gly 610 615 620 Lys Glu Lys Glu Leu Leu Glu Met Gly Glu Ser Ser Leu Pro Asp Gly 625 630 635 640 Leu Ile Pro Tyr Lys Phe Phe Glu Glu Arg Glu Cys Leu Phe Pro Tyr 645 650 655 Lys Glu Asn Leu Lys Pro Gln Trp Ser Glu Val Gln Tyr Thr Ile Glu 660 665 670 Val Gly Ser Pro Leu Leu Thr Ala Asp Thr Ile Ser Ala Leu Thr Glu 675 680 685 Pro Gly Asn Arg Asp Ala Ile Ala Tyr Lys Lys Arg Val Tyr Asn Asp 690 695 700 Gly Asn Asn Ala Ile Glu Pro Glu Pro Arg Phe Ala Val Lys Ser Glu 705 710 715 720 Thr His Arg Gly Ile Phe Arg Thr Ala Val Gly Arg Arg Thr Gly Asp 725 730 735 Leu Gly Lys Glu Asp His Glu Asp Cys Thr Cys Asp Met Cys Ile Ile 740 745 750 Phe Gly Asn Glu His Glu Ser Ser Lys Ile Arg Phe Glu Asp Leu Glu 755 760 765 Leu Ile Asn Gly Asn Glu Phe Glu Lys Leu Glu Lys His Ile Asp His 770 775 780 Val Ala Ile Asp Arg Phe Thr Gly Gly Ala Leu Asp Lys Ala Lys Phe 785 790 795 800 Page 7
Asp Thr Tyr Pro Leu Ala Gly Ser Pro Lys Lys Pro Leu Lys Leu Lys 805 810 815 Gly Arg Phe Trp Ile Lys Lys Gly Phe Ser Gly Asp His Lys Leu Leu 820 825 830 Ile Thr Thr Ala Leu Ser Asp Ile Arg Asp Gly Leu Tyr Pro Leu Gly 835 840 845 Ser Lys Gly Gly Val Gly Tyr Gly Trp Val Ala Gly Ile Ser Ile Asp 850 855 860 Asp Asn Val Pro Asp Asp Phe Lys Glu Met Ile Asn Lys Thr Glu Met 865 870 875 880 Pro Leu Pro Glu Glu Val Glu Glu Ser Asn Asn Gly Pro Ile Asn Asn 885 890 895 Asp Tyr Val His Pro Gly His Gln Ser Pro Lys Gln Asp His Lys Asn 900 905 910 Lys Asn Ile Tyr Tyr Pro His Tyr Phe Leu Asp Ser Gly Ser Lys Val 915 920 925 Tyr Arg Glu Lys Asp Ile Ile Thr His Glu Glu Phe Thr Glu Glu Leu 930 935 940 Leu Ser Gly Lys Ile Asn Cys Lys Leu Glu Thr Leu Thr Pro Leu Ile 945 950 955 960 Ile Pro Asp Thr Ser Asp Glu Asn Gly Leu Lys Leu Gln Gly Asn Lys 965 970 975 Pro Gly His Lys Asn Tyr Lys Phe Phe Asn Ile Asn Gly Glu Leu Met 980 985 990 Ile Pro Gly Ser Glu Leu Arg Gly Met Leu Arg Thr His Phe Glu Ala 995 1000 1005 Leu Thr Lys Ser Cys Phe Ala Ile Phe Gly Glu Asp Ser Thr Leu Ser 1010 1015 1020 Trp Arg Met Asn Ala Asp Glu Lys Asp Tyr Lys Ile Asp Ser Asn Ser 1025 1030 1035 1040 Ile Arg Lys Met Glu Ser Gln Arg Asn Pro Lys Tyr Arg Ile Pro Asp 1045 1050 1055 Glu Leu Gln Lys Glu Leu Arg Asn Ser Gly Asn Gly Leu Phe Asn Arg 1060 1065 1070 Leu Tyr Thr Ser Glu Arg Arg Phe Trp Ser Asp Val Ser Asn Lys Phe 1075 1080 1085 Glu Asn Ser Ile Asp Tyr Lys Arg Glu Ile Leu Arg Cys Ala Gly Arg 1090 1095 1100 Pro Lys Asn Tyr Lys Gly Gly Ile Ile Arg Gln Arg Lys Asp Ser Leu 1105 1110 1115 1120 Met Ala Glu Glu Leu Lys Val His Arg Leu Pro Leu Tyr Asp Asn Phe 1125 1130 1135 Asp Ile Pro Asp Ser Ala Tyr Lys Ala Asn Asp His Cys Arg Lys Ser 1140 1145 1150 Ala Thr Cys Ser Thr Ser Arg Gly Cys Arg Glu Arg Phe Thr Cys Gly 1155 1160 1165 Ile Lys Val Arg Asp Lys Asn Arg Val Phe Leu Asn Ala Ala Asn Asn 1170 1175 1180 Asn Arg Gln Tyr Leu Asn Asn Ile Lys Lys Ser Asn His Asp Leu Tyr 1185 1190 1195 1200 Page 8
Leu Gln Tyr Leu Lys Gly Glu Lys Lys Ile Arg Phe Asn Ser Lys Val 1205 1210 1215 Ile Thr Gly Ser Glu Arg Ser Pro Ile Asp Val Ile Ala Glu Leu Asn 1220 1225 1230 Glu Arg Gly Arg Gln Thr Gly Phe Ile Lys Leu Ser Gly Leu Asn Asn 1235 1240 1245 Ser Asn Lys Ser Gln Gly Asn Thr Gly Thr Thr Phe Asn Ser Gly Trp 1250 1255 1260 Asp Arg Phe Glu Leu Asn Ile Leu Leu Asp Asp Leu Glu Thr Arg Pro 1265 1270 1275 1280 Ser Lys Ser Asp Tyr Pro Arg Pro Arg Leu Leu Phe Thr Lys Asp Gln 1285 1290 1295 Tyr Glu Tyr Asn Ile Thr Lys Arg Cys Glu Arg Val Phe Glu Ile Asp 1300 1305 1310 Lys Gly Asn Lys Thr Gly Tyr Pro Val Asp Asp Gln Ile Lys Lys Asn 1315 1320 1325 Tyr Glu Asp Ile Leu Asp Ser Tyr Asp Gly Ile Lys Asp Gln Glu Val 1330 1335 1340 Ala Glu Arg Phe Asp Thr Phe Thr Arg Gly Ser Lys Leu Lys Val Gly 1345 1350 1355 1360 Asp Leu Val Tyr Phe His Ile Asp Gly Asp Asn Lys Ile Asp Ser Leu 1365 1370 1375 Ile Pro Val Arg Ile Ser Arg Lys Cys Ala Ser Lys Thr Leu Gly Gly 1380 1385 1390 Lys Leu Asp Lys Ala Leu His Pro Cys Thr Gly Leu Ser Asp Gly Leu 1395 1400 1405 Cys Pro Gly Cys His Leu Phe Gly Thr Thr Asp Tyr Lys Gly Arg Val 1410 1415 1420 Lys Phe Gly Phe Ala Lys Tyr Glu Asn Gly Pro Glu Trp Leu Ile Thr 1425 1430 1435 1440 Arg Gly Asn Asn Pro Glu Arg Ser Leu Thr Leu Gly Val Leu Glu Ser 1445 1450 1455 Pro Arg Pro Ala Phe Ser Ile Pro Asp Asp Glu Ser Glu Ile Pro Gly 1460 1465 1470 Arg Lys Phe Tyr Leu His His Asn Gly Trp Arg Ile Ile Arg Gln Lys 1475 1480 1485 Gln Leu Glu Ile Arg Glu Thr Val Gln Pro Glu Arg Asn Val Thr Thr 1490 1495 1500 Glu Val Met Asp Lys Gly Asn Val Phe Ser Phe Asp Val Arg Phe Glu 1505 1510 1515 1520 Asn Leu Arg Glu Trp Glu Leu Gly Leu Leu Leu Gln Ser Leu Asp Pro 1525 1530 1535 Gly Lys Asn Ile Ala His Lys Leu Gly Lys Gly Lys Pro Tyr Gly Phe 1540 1545 1550 Gly Ser Val Lys Ile Lys Ile Asp Ser Leu His Thr Phe Lys Ile Asn 1555 1560 1565 Ser Asn Asn Asp Lys Ile Lys Arg Val Pro Gln Ser Asp Ile Arg Glu 1570 1575 1580 Tyr Ile Asn Lys Gly Tyr Gln Lys Leu Ile Glu Trp Ser Gly Asn Asn 1585 1590 1595 1600 
Ser Ile Gln Lys Gly Asn Val Leu Pro Gln Trp His Val Ile Pro His 1605 1610 1615 Ile Asp Lys Leu Tyr Lys Leu Leu Trp Val Pro Phe Leu Asn Asp Ser 1620 1625 1630 Lys Leu Glu Pro Asp Val Arg Tyr Pro Val Leu Asn Glu Glu Ser Lys 1635 1640 1645 Gly Tyr Ile Glu Gly Ser Asp Tyr Thr Tyr Lys Lys Leu Gly Asp Lys 1650 1655 1660 Asp Asn Leu Pro Tyr Lys Thr Arg Val Lys Gly Leu Thr Thr Pro Trp 1665 1670 1675 1680 Ser Pro Trp Asn Pro Phe Gln Val Ile Ala Glu His Glu Glu Gln Glu 1685 1690 1695 Val Asn Val Thr Gly Ser Arg Pro Ser Val Thr Asp Lys Ile Glu Arg 1700 1705 1710 Asp Gly Lys Met Val 1715 <210> 3 <211> 1657 <212> PRT <213> Candidatus Magnetomorum <220> <223> JPDT01001326.1 cds KPA14974.1 <400> 3 Met Leu Lys Leu Lys Val Lys Ile Thr Tyr Phe Gln Pro Phe Arg Val 1 5 10 15 Ile Pro Trp Ile Lys Glu Asp Asp Arg Asn Ser Asp Arg Asn Tyr Leu
Arg Gly Gly Thr Phe Ala Arg Trp His Lys Asp Lys Lys Asp Asp Ile 40 45 His Gly Lys Pro Tyr Ile Thr Gly Thr Leu Leu Arg Ser Ala Leu Phe 50 55 60 Thr Glu Ile Glu Lys Ile Lys Ile His His Ser Asp Phe Ile His Cys 65 70 75 80 Cys Asn Ala Ile Asp Arg Thr Glu Gly Lys His Gln Pro Ser Phe Leu 85 90 95 Arg Lys Arg Pro Val Tyr Thr Glu Asn Lys Asn Ile Gln Ala Cys Asn 100 105 110 Lys Cys Pro Leu Cys Leu Ile Met Gly Arg Gly Asp Asp Arg Gly Glu 115 120 125 Asp Leu Lys Lys Lys Lys His Tyr Asn Gly Lys His Tyr Gln Asn Trp 130 135 140 Thr Val His Phe Ser Asn Phe Asp Thr Gln Ala Thr Phe Tyr Trp Lys 145 150 155 160 Asp Ile Val Gln Lys Arg Ile Leu Asn Arg Val Asp Gln Thr Cys Gly 165 170 175 Lys Ala Lys Asp Phe Phe Lys Val Cys Glu Val Asp His Ile Ala Cys Page 10
Pro Thr Leu Asn Gly Ile Ile Arg Ile Asn Asp Glu Lys Leu Ser Gln 195 200 205 Glu Glu Ile Ser Lys Ile Lys Gln Leu Ile Ala Val Gly Leu Ala Gln 210 215 220 Ile Glu Ser Leu Ala Gly Gly Ile Cys Arg Ile Asp Ile Thr Asn Gln 225 230 235 240 Asn His Asp Asp Leu Ile Lys Ser Phe Phe Glu Thr Lys Pro Ser Lys 245 250 255 Ile Leu Gln Pro Asn Leu Lys Glu Ser Gly Glu Glu Arg Phe Glu Leu 260 265 270 Ala Lys Leu Glu Leu Leu Ala Glu Tyr Leu Thr Gln Ser Phe Asp Ala 275 280 285 Asn Gln Lys Glu Gln Gln Leu Arg Arg Leu Ala Asp Ala Ile Arg Asp 290 295 300 Leu Arg Lys Tyr Ser Pro Asp Tyr Leu Lys Asp Leu Pro Lys Gly Lys 305 310 315 320 Lys Gly Gly Arg Thr Ser Ile Trp Asn Lys Lys Val Ala Asp Asp Phe 325 330 335 Thr Leu Arg Asp Cys Leu Lys Asn Gln Lys Ile Pro Asn Glu Leu Trp 340 345 350 Arg Gln Phe Cys Glu Gly Leu Gly Arg Glu Val Tyr Lys Ile Ser Lys 355 360 365 Asn Ile Ser Asn Arg Ser Asp Ala Lys Pro Arg Leu Leu Gly Glu Thr 370 375 380 Glu Tyr Ala Gly Leu Pro Leu Arg Lys Glu Asp Glu Lys Glu Tyr Ser 385 390 395 400 Pro Thr Tyr Gln Asn Gln Glu Ser Leu Pro Lys Thr Lys Trp Ile Ile 405 410 415 Ser Gly Glu Leu Gln Ala Ile Thr Pro Phe Tyr Ile Gly His Val Asn 420 425 430 Lys Thr Ser His Thr Arg Ser Thr Ile Phe Leu Asn Met Asn Gly Gln 435 440 445 Phe Cys Ile Pro Arg Ser Thr Leu Arg Gly Ala Leu Arg Arg Asp Leu 450 455 460 Arg Leu Val Phe Gly Asp Ser Cys Asn Thr Pro Val Gly Ser Arg Val 465 470 475 480 Cys Tyr Cys Gln Val Cys Gln Ile Met Arg Cys Ile Lys Phe Glu Asp 485 490 495 Ala Leu Ser Asp Val Asp Ser Pro Pro Glu Val Arg His Arg Ile Arg 500 505 510 Leu Asn Cys His Thr Gly Val Val Glu Glu Gly Ala Leu Phe Asp Met 515 520 525 Glu Thr Gly Phe Gln Gly Met Ile Phe Pro Phe Arg Leu Tyr Tyr Glu 530 535 540 Ser Lys Asn Glu Ile Met Ser Gln His Leu Tyr Glu Val Leu Asn Asn 545 550 555 560 Trp Thr Asn Gly Gln Ala Phe Phe Gly Gly Glu Ala Gly Thr Gly Phe 565 570 575 Gly Arg Phe Lys Leu Leu Asn Asn Glu Val Phe Leu Trp Glu Ile Asp Page 11
Gly Glu Glu Glu Asp Tyr Leu Gln Tyr Leu Phe Ser Arg Gly Tyr Lys 595 600 605 Gly Ile Glu Thr Asp Glu Ile Lys Lys Val Ala Asp Pro Ile Lys Trp 610 615 620 Lys Thr Leu Phe Thr Lys Leu Glu Ile Pro Pro Glu Lys Ile Pro Leu 625 630 635 640 Thr Gln Leu Asn Tyr Thr Leu Thr Ile Asp Ser Pro Leu Ile Ser Arg 645 650 655 Asp Pro Ile Ala Ala Met Leu Asp Asn Arg Asn Pro Asp Ala Val Met 660 665 670 Val Lys Lys Thr Ile Leu Val Tyr Glu Gln Asp Ser Ser Thr His Lys 675 680 685 Asn Val Pro Lys Glu Val Pro Lys Tyr Phe Ile Lys Ser Glu Thr Ile 690 695 700 Arg Gly Leu Leu Arg Ser Ile Ile Ser Arg Thr Glu Ile Lys Leu Glu 705 710 715 720 Asp Gly Lys Lys Glu Arg Ile Phe Asn Leu Asp His Glu Asp Cys Asp 725 730 735 Cys Leu Gln Cys Arg Leu Phe Gly Asn Val His Gln Gln Gly Ile Leu 740 745 750 Arg Phe Glu Asp Ala Glu Ile Thr Asn Lys Asn Val Ser Asp Cys Cys 755 760 765 Ile Asp His Val Ala Ile Asp Arg Phe Thr Gly Gly Gly Val Glu Lys 770 775 780 Met Lys Phe Asn Asp Tyr Pro Leu Ser Ala Ser Pro Lys Asn Cys Leu 785 790 795 800 Asn Leu Lys Gly Ser Ile Trp Ile Thr Ser Ala Leu Lys Asp Ser Glu 805 810 815 Lys Glu Ala Leu Ser Lys Ala Leu Ser Glu Leu Lys Tyr Gly Tyr Ala 820 825 830 Ser Leu Gly Gly Leu Ser Ala Ile Gly Tyr Gly Arg Val Lys Glu Leu 835 840 845 Thr Leu Glu Glu Asn Asp Ile Ile Gln Leu Thr Glu Ile Thr Glu Ser 850 855 860 Asn Leu Asn Ser Gln Ser Arg Leu Ser Leu Lys Pro Asp Val Lys Lys 865 870 875 880 Glu Leu Ser Asn Asn His Phe Tyr Tyr Pro His Tyr Phe Ile Lys Pro 885 890 895 Ala Pro Lys Glu Val Val Arg Glu Ser Arg Leu Ile Ser His Val Gln 900 905 910 Gly His Asp Thr Glu Gly Glu Phe Leu Leu Thr Gly Lys Ile Lys Cys 915 920 925 Arg Leu Gln Thr Leu Gly Pro Leu Phe Ile Ala Asn Asn Asp Lys Gly 930 935 940 Asp Asp Tyr Phe Glu Leu Gln His Asn Asn Pro Gly His Leu Asn Tyr 945 950 955 960 Ala Phe Phe Arg Ile Asn Asp His Ile Ala Ile Pro Gly Ala Ser Ile 965 970 975 Arg Gly Met Ile Ser Ser Val Phe Glu Thr Leu Thr His Ser Cys Phe Page 12
Arg Val Met Asp Asp Lys Lys Tyr Leu Thr Arg Arg Val Ile Pro Glu 995 1000 1005 Ser Glu Thr Thr Gln Lys Arg Lys Ser Gly Arg Tyr Gln Val Glu Glu 1010 1015 1020 Ser Asp Pro Asp Leu Phe Pro Gly Arg Val Gln Lys Lys Gly Asn Lys 1025 1030 1035 1040 Tyr Lys Ile Glu Lys Met Asp Glu Ile Val Arg Leu Pro Ile Tyr Asp 1045 1050 1055 Asn Phe Ser Leu Val Glu Arg Ile Arg Glu Tyr His Tyr Ser Glu Glu 1060 1065 1070 Cys Ala Ser Tyr Val Pro Ser Val Lys Lys Ala Ile Asp Tyr Asn Arg 1075 1080 1085 Met Leu Ala Gln Ala Ala Asp Ser Asn Arg Glu Phe Leu Tyr Asn His 1090 1095 1100 Pro Glu Ala Lys Ser Ile Leu Gln Gly Lys Lys Glu Val Tyr Tyr Ile 1105 1110 1115 1120 Leu His Lys Gln Glu Ser Lys Asn Arg Gly Lys Thr Lys Glu Ile Asn 1125 1130 1135 Pro Asn Ala Arg Tyr Ala Cys Leu Thr Asp Glu Asn Thr Pro Gly Ser 1140 1145 1150 Arg Lys Gly Phe Ile Lys Phe Thr Gly Pro Asp Met Val Thr Val Asn 1155 1160 1165 Lys Glu Leu Lys Ser Lys Ile Ala Pro Ile Tyr Asp Pro Glu Trp Glu 1170 1175 1180 Lys Asp Ile Pro Asp Trp Glu Arg Ser Asn Gln Glu Ser Asn His Lys 1185 1190 1195 1200 Tyr Ser Phe Ile Leu His Asn Glu Ile Glu Met Arg Ser Ser Gln Lys 1205 1210 1215 Lys Lys Tyr Pro Arg Pro Val Phe Ile Cys Lys Lys Asn Gly Val Glu 1220 1225 1230 Tyr Arg Met Gln Lys Arg Cys Glu Arg Ile Phe Asp Phe Thr Lys Glu 1235 1240 1245 Glu Glu Lys Asp Lys Glu Ile Val Ile Pro Gln Lys Val Val Ser Gln 1250 1255 1260 Tyr Asn Ala Ile Leu Lys Asp Asn Lys Glu Asn Thr Glu Thr Ile Pro 1265 1270 1275 1280 Gly Leu Phe Asn Ser Lys Met Val Asn Lys Glu Leu Glu Asp Gly Asp 1285 1290 1295 Leu Val Tyr Phe Lys Tyr Lys Glu Gly Lys Val Thr Glu Leu Thr Pro 1300 1305 1310 Val Ala Ile Ser Arg Lys Thr Asp Asn Lys Pro Met Gly Lys Arg Phe 1315 1320 1325 Pro Lys Ile Ser Ile Asn Gly Lys Met Lys Pro Asn Asp Ser Leu Arg 1330 1335 1340 Ser Cys Ser His Thr Cys Thr Glu Asp Cys Asp Asp Cys Pro Asn Leu 1345 1350 1355 1360 Cys Glu Ser Val Lys Asp Tyr Phe Lys Pro His Pro Asp Gly Leu Cys 1365 1370 1375 Pro Ala Cys His Leu Phe Gly Thr Thr Phe Tyr Lys Ser Arg Leu Ser Page 13
Phe Gly Leu Ala Trp Leu Glu Asn Asn Ala Lys Trp Tyr Ile Ser Asn 1395 1400 1405 Asp Phe Gln Gln Lys Asp Ser Lys Lys Glu Lys Gly Gly Lys Leu Thr 1410 1415 1420 Leu Pro Leu Leu Glu Arg Pro Arg Pro Thr Trp Ser Met Pro Asn Asn 1425 1430 1435 1440 Asn Ala Glu Val Pro Gly Arg Lys Phe Tyr Val His His Pro Trp Ser 1445 1450 1455 Val Glu Asn Ile Lys Asn Asn Gln Gly Asn Gln Lys Asp Ile Ser Leu 1460 1465 1470 Lys Pro Asp Ser Asp Ala Ile Lys Ile Lys Glu Asn Asn Arg Thr Ile 1475 1480 1485 Glu Pro Leu Gly Lys Asp Asn Val Phe Asn Phe Glu Ile Ser Phe Asn 1490 1495 1500 Asn Leu Arg Asp Trp Glu Leu Gly Leu Leu Leu Tyr Ala Ile Glu Leu 1505 1510 1515 1520 Glu Asp His Leu Ala His Lys Leu Gly Met Ala Lys Ala Phe Gly Met 1525 1530 1535 Gly Ser Val Lys Ile Glu Ile Lys Asn Leu Leu Ile Lys Gly Ser Ile 1540 1545 1550 Asn Asp Ile Ser Lys Ala Glu Leu Ile Lys Lys Gly Phe Lys Lys Leu 1555 1560 1565 Gly Ile Asp Ser Leu Glu Lys Asp Asp Leu Ser Glu Tyr Leu His Ile 1570 1575 1580 Lys Gln Leu Arg Glu Ile Leu Trp Phe Ser Asp Lys Pro Val Gly Thr 1585 1590 1595 1600 Ile Glu Tyr Pro Lys Leu Glu Asn Lys Thr Asn Ser Arg Ile Pro Ser 1605 1610 1615 Tyr Thr Asp Phe Val Gln Glu Lys Asp His Glu Thr Gly Phe Lys Asn 1620 1625 1630 Pro Lys Tyr Gln Asn Leu Lys Ser Arg Leu His Ile Leu Gln Asn Pro 1635 1640 1645 Trp Asn Ala Trp Trp Lys Asn Glu Glu 1650 1655 <210> 4 <211> 1812 <212> PRT <213> Candidatus Jettenia <220> <223> BAFH01000003.1 cds GAB61731.1 <400> 4 Met His Thr Ile Leu Pro Ile His Leu Thr Phe Leu Glu Pro Tyr Arg 1 5 10 15 Leu Ala Glu Trp His Ala Lys Ala Asp Arg Lys Lys Asn Lys Arg Tyr
Leu Arg Gly Met Ser Phe Ala Gln Trp His Lys Asp Lys Asp Gly Ile 40 45 Gly Lys Pro Tyr Ile Thr Gly Thr Leu Leu Arg Ser Ala Val Leu Asn 50 55 60 Ala Ala Glu Glu Leu Ile Ser Leu Asn Gln Gly Met Trp Ala Lys Glu 65 70 75 80 Pro Cys Cys Asn Gly Lys Phe Glu Thr Glu Lys Asp Lys Pro Ala Val 85 90 95 Leu Arg Lys Arg Pro Thr Ile Gln Trp Lys Thr Gly Arg Pro Ala Ile 100 105 110 Cys Asp Pro Glu Lys Gln Glu Lys Lys Asp Ala Cys Pro Leu Cys Met 115 120 125 Leu Leu Gly Arg Phe Asp Lys Ala Gly Lys Arg His Arg Asp Asn Lys 130 135 140 Tyr Asp Lys His Asp Tyr Asp Ile His Phe Asp Asn Leu Asn Leu Ile 145 150 155 160 Thr Asp Lys Lys Phe Ser His Pro Asp Asp Ile Ala Ser Glu Arg Ile 165 170 175 Leu Asn Arg Val Asp Tyr Thr Thr Gly Lys Ala His Asp Tyr Phe Lys 180 185 190 Val Trp Glu Val Asp Asp Asp Gln Trp Trp Gln Phe Thr Gly Thr Ile 195 200 205 Thr Met His Asp Asp Cys Ser Lys Ala Lys Gly Leu Leu Leu Ala Ser 210 215 220 Leu Cys Phe Val Asp Lys Leu Cys Gly Ala Leu Cys Arg Ile Glu Val 225 230 235 240 Thr Gly Asn Asn Ser Gln Asp Glu Asn Lys Glu Tyr Ala His Pro Asp 245 250 255 Thr Gly Ile Ile Thr Ser Leu Asn Leu Lys Tyr Gln Asn Asn Ser Thr 260 265 270 Ile His Gln Asp Ala Val Pro Leu Ser Gly Ser Ala His Asp Asn Asp 275 280 285 Glu Pro Pro Val His Asp Asn Asp Ser Ser Leu Asp Asn Asp Thr Ile 290 295 300 Thr Leu Leu Ser Met Lys Ala Lys Glu Ile Val Gly Ala Phe Arg Glu 305 310 315 320 Ser Gly Lys Ile Glu Lys Ala Arg Thr Leu Ala Asp Val Ile Arg Ala 325 330 335 Met Arg Leu Gln Lys Pro Asp Ile Trp Glu Lys Leu Pro Lys Gly Ile 340 345 350 Asn Asp Lys His His Leu Trp Asp Arg Glu Val Asn Gly Lys Lys Leu 355 360 365 Arg Asn Ile Leu Glu Glu Leu Trp Arg Leu Met Asn Lys Arg Asn Ala 370 375 380 Trp Arg Thr Phe Cys Glu Val Leu Gly Asn Glu Leu Tyr Arg Cys Tyr 385 390 395 400 Lys Glu Lys Thr Gly Gly Ile Val Leu Arg Phe Arg Thr Leu Gly Glu 405 410 415 Thr Glu Tyr Tyr Pro Glu Pro Glu Lys Thr Glu Pro Cys Leu Ile Ser 420 425 430 Page 15
Asp Asn Ser Ile Pro Ile Thr Pro Leu Gly Gly Val Lys Glu Trp Ile 435 440 445 Ile Ile Gly Arg Leu Lys Ala Glu Thr Pro Phe Tyr Phe Gly Val Gln 450 455 460 Ser Ser Phe Asp Ser Thr Gln Asp Asp Leu Asp Leu Val Pro Asp Ile 465 470 475 480 Val Asn Thr Asp Glu Lys Leu Glu Ala Asn Glu Gln Thr Ser Phe Arg 485 490 495 Ile Leu Met Asp Lys Lys Gly Arg Tyr Arg Ile Pro Arg Ser Leu Ile 500 505 510 Arg Gly Val Leu Arg Arg Asp Leu Arg Thr Ala Phe Gly Gly Ser Gly 515 520 525 Cys Ile Val Glu Leu Gly Arg Met Ile Pro Cys Asp Cys Lys Val Cys 530 535 540 Ala Ile Met Arg Lys Ile Thr Val Met Asp Ser Arg Ser Glu Asn Ile 545 550 555 560 Glu Leu Pro Asp Ile Arg Tyr Arg Ile Arg Leu Asn Pro Tyr Thr Ala 565 570 575 Thr Val Asp Glu Gly Ala Leu Phe Asp Met Glu Ile Gly Pro Glu Gly 580 585 590 Ile Thr Phe Pro Phe Val Phe Arg Tyr Arg Gly Glu Asp Ala Leu Pro 595 600 605 Arg Glu Leu Trp Ser Val Ile Arg Tyr Trp Met Asp Gly Met Ala Trp 610 615 620 Leu Gly Gly Ser Gly Ser Thr Gly Lys Gly Arg Phe Ala Leu Ile Asp 625 630 635 640 Ile Lys Val Phe Glu Trp Asp Leu Cys Asn Glu Glu Gly Leu Lys Ala 645 650 655 Tyr Ile Cys Ser Arg Gly Leu Arg Gly Ile Glu Lys Glu Val Leu Leu 660 665 670 Glu Asn Lys Thr Ile Ala Glu Ile Thr Asn Leu Phe Lys Thr Glu Glu 675 680 685 Val Lys Phe Phe Glu Ser Tyr Ser Lys His Ile Lys Gln Leu Cys His 690 695 700 Glu Cys Ile Ile Asn Gln Ile Ser Phe Leu Trp Gly Leu Arg Ser Tyr 705 710 715 720 Tyr Glu Tyr Leu Gly Pro Leu Trp Thr Glu Val Lys Tyr Glu Ile Lys 725 730 735 Ile Ala Ser Pro Leu Leu Ser Ser Asp Thr Ile Ser Ala Leu Leu Asn 740 745 750 Lys Asp Asn Ile Asp Cys Ile Ala Tyr Glu Lys Arg Lys Trp Glu Asn 755 760 765 Gly Gly Ile Lys Phe Val Pro Thr Ile Lys Gly Glu Thr Ile Arg Gly 770 775 780 Ile Val Arg Met Ala Val Gly Lys Arg Ser Gly Asp Leu Gly Met Asp 785 790 795 800 Asp His Glu Asp Cys Ser Cys Thr Leu Cys Thr Ile Phe Gly Asn Glu 805 810 815 His Glu Ala Gly Lys Leu Arg Phe Glu Asp Leu Glu Val Val Glu Glu 820 825 830
Lys Leu Pro Ser Glu Gln Asn Ser Asp Ser Asn Lys Ile Pro Phe Gly 835 840 845 Pro Val Gln Asp Gly Asp Gly Asn Arg Glu Lys Glu Cys Val Thr Ala 850 855 860 Val Lys Ser Tyr Lys Lys Lys Leu Ile Asp His Val Ala Ile Asp Arg 865 870 875 880 Phe His Gly Gly Ala Glu Asp Lys Met Lys Phe Asn Thr Leu Pro Leu 885 890 895 Ala Gly Ser Phe Glu Lys Pro Ile Ile Leu Lys Gly Arg Phe Trp Ile 900 905 910 Lys Lys Asp Ile Val Lys Asp Tyr Lys Lys Lys Ile Glu Asp Ala Met 915 920 925 Val Asp Ile Arg Asp Gly Leu Tyr Pro Ile Gly Gly Lys Thr Gly Ile 930 935 940 Gly Tyr Gly Trp Val Thr Asp Leu Thr Ile Leu Asn Pro Gln Ser Gly 945 950 955 960 Phe Gln Ile Pro Val Lys Lys Asp Ile Ser Pro Glu Pro Gly Thr Tyr 965 970 975 Ser Thr Tyr Pro Ser His Ser Thr Pro Ser Leu Asn Lys Gly His Ile 980 985 990 Tyr Tyr Pro His Tyr Phe Leu Ala Pro Ala Asn Thr Val His Arg Glu 995 1000 1005 Gln Glu Met Ile Gly His Glu Gln Phe His Lys Glu Gln Lys Gly Glu 1010 1015 1020 Leu Leu Val Ser Gly Lys Ile Val Cys Thr Leu Lys Thr Val Thr Pro 1025 1030 1035 1040 Leu Ile Ile Pro Asp Thr Glu Asn Glu Asp Ala Phe Gly Leu Gln Asn 1045 1050 1055 Thr Tyr Ser Gly His Lys Asn Tyr Gln Phe Phe His Ile Asn Asp Glu 1060 1065 1070 Ile Met Val Pro Gly Ser Glu Ile Arg Gly Met Ile Ser Ser Val Tyr 1075 1080 1085 Glu Ala Ile Thr Asn Ser Cys Phe Arg Val Tyr Asp Glu Thr Lys Tyr 1090 1095 1100 Ile Thr Arg Arg Leu Ser Pro Glu Lys Lys Asp Glu Ser Asn Asp Lys 1105 1110 1115 1120 Asn Lys Ser Gln Asp Asp Ala Ser Gln Lys Ile Arg Lys Gly Leu Val 1125 1130 1135 Lys Lys Thr Asp Glu Gly Phe Ser Ile Ile Glu Val Glu Arg Tyr Ser 1140 1145 1150 Met Lys Thr Lys Gly Gly Thr Lys Leu Val Asp Lys Val Tyr Arg Leu 1155 1160 1165 Pro Leu Tyr Asp Ser Glu Ala Val Ile Ala Ser Ile Gln Phe Glu Gln 1170 1175 1180 Tyr Gly Glu Lys Asn Glu Lys Arg Asn Ala Lys Ile Arg Ala Ala Ile 1185 1190 1195 1200 Lys Arg Asn Glu Val Ile Ala Glu Val Ala Arg Lys Asn Leu Ile Phe 1205 1210 1215 Leu Arg Ser Leu Thr Pro Glu Glu Leu Lys Lys Val Leu Gln Gly Glu 1220 1225 1230 Page 17
Ile Leu Val Lys Phe Ser Leu Lys Ser Gly Lys Asn Pro Asn Asp Tyr 1235 1240 1245 Leu Ala Glu Leu His Glu Asn Gly Thr Glu Arg Gly Leu Ile Lys Phe 1250 1255 1260 Thr Gly Leu Asn Met Val Asn Ile Lys Asn Val Asn Glu Glu Asp Lys 1265 1270 1275 1280 Asp Phe Asn Asp Thr Trp Asp Trp Glu Lys Leu Asn Ile Phe His Asn 1285 1290 1295 Ala His Glu Lys Arg Asn Ser Leu Lys Gln Gly Tyr Pro Arg Pro Val 1300 1305 1310 Leu Lys Phe Ile Lys Asp Arg Val Glu Tyr Thr Ile Pro Lys Arg Cys 1315 1320 1325 Glu Arg Ile Phe Cys Ile Pro Val Lys Asn Thr Ile Glu Tyr Lys Val 1330 1335 1340 Ser Ser Lys Val Cys Lys Gln Tyr Lys Asp Val Leu Ser Asp Tyr Glu 1345 1350 1355 1360 Lys Asn Phe Gly His Ile Asn Lys Ile Phe Thr Thr Lys Ile Gln Lys 1365 1370 1375 Arg Glu Leu Thr Asp Gly Asp Leu Val Tyr Phe Ile Pro Asn Glu Gly 1380 1385 1390 Ala Asp Lys Thr Val Gln Ala Ile Met Pro Val Pro Leu Ser Arg Ile 1395 1400 1405 Thr Asp Ser Arg Thr Leu Gly Glu Arg Leu Pro His Lys Asn Leu Leu 1410 1415 1420 Pro Cys Val His Glu Val Asn Glu Gly Leu Leu Ser Gly Ile Leu Asp 1425 1430 1435 1440 Ser Leu Asp Lys Lys Leu Leu Ser Ile His Pro Glu Gly Leu Cys Pro 1445 1450 1455 Thr Cys Arg Leu Phe Gly Thr Thr Tyr Tyr Lys Gly Arg Val Arg Phe 1460 1465 1470 Gly Phe Ala Asn Leu Met Asn Lys Pro Lys Trp Leu Thr Glu Arg Glu 1475 1480 1485 Asn Gly Cys Gly Gly Tyr Val Thr Leu Pro Leu Leu Glu Arg Pro Arg 1490 1495 1500 Leu Thr Trp Ser Val Pro Ser Asp Lys Cys Asp Val Pro Gly Arg Lys 1505 1510 1515 1520 Phe Tyr Ile His His Asn Gly Trp Gln Glu Val Leu Arg Asn Asn Asp 1525 1530 1535 Ile Thr Pro Lys Thr Glu Asn Asn Arg Thr Val Glu Pro Leu Ala Ala 1540 1545 1550 Asp Asn Arg Phe Thr Phe Asp Val Tyr Phe Glu Asn Leu Arg Glu Trp 1555 1560 1565 Glu Leu Gly Leu Leu Cys Tyr Cys Leu Glu Leu Glu Pro Gly Met Gly 1570 1575 1580 His Lys Leu Gly Met Gly Lys Pro Met Gly Phe Gly Ser Val Lys Ile 1585 1590 1595 1600 Ala Ile Glu Arg Leu Gln Thr Phe Thr Val His Gln Asp Gly Ile Asn 1605 1610 1615 Trp Lys Pro Ser Glu Asn Glu Ile Gly Val Tyr Val Gln Lys Gly Arg 1620 1625 1630 
Glu Lys Leu Val Glu Trp Phe Thr Pro Ser Ala Pro His Lys Asn Met 1635 1640 1645 Glu Trp Asn Gly Val Lys His Ile Lys Asp Leu Arg Ser Leu Leu Ser 1650 1655 1660 Ile Pro Gly Asp Lys Pro Thr Val Lys Tyr Pro Thr Leu Asn Lys Asp 1665 1670 1675 1680 Ala Glu Gly Ala Ile Ser Asp Tyr Thr Tyr Glu Arg Leu Ser Asp Thr 1685 1690 1695 Lys Leu Leu Pro His Asp Lys Arg Val Glu Tyr Leu Arg Thr Pro Trp 1700 1705 1710 Ser Pro Trp Asn Ala Phe Val Lys Glu Ala Glu Tyr Ser Pro Ser Glu 1715 1720 1725 Lys Ser Asp Glu Lys Gly Arg Glu Thr Ile Arg Thr Lys Pro Lys Ser 1730 1735 1740 Leu Pro Ser Val Lys Ser Ile Gly Lys Val Lys Trp Phe Asp Glu Gly 1745 1750 1755 1760 Lys Gly Phe Gly Ile Leu Ile Met Asp Asp Gly Lys Glu Val Ser Ile 1765 1770 1775 Ser Lys Asn Ser Ile Arg Gly Asn Ile Leu Leu Lys Lys Gly Gln Lys 1780 1785 1790 Val Thr Phe His Ile Val Gln Gly Leu Ile Pro Lys Ala Glu Asp Ile 1795 1800 1805 Glu Ile Ala Lys 1810 <210> 5 <211> 1812 <212> PRT <213> Candidatus Jettenia <220> <223> JABWAR010000005.1 cds NUN21993.1 <400> 5 Met Ser Lys Lys His Phe Ile His Leu Thr Phe Leu Glu Pro Tyr Arg 1 5 10 15 Leu Ala Glu Trp His Ala Lys Ala Asp Arg Lys Lys Asn Lys Arg Tyr
Leu Arg Gly Met Ser Phe Ala Gln Trp His Lys Asp Lys Asp Gly Ile 40 45 Gly Lys Pro Tyr Ile Thr Gly Thr Leu Leu Arg Ser Ala Val Leu Asn 50 55 60 Ala Ala Glu Glu Leu Ile Ser Leu Asn Gln Gly Met Trp Ala Lys Glu 65 70 75 80 Pro Cys Cys Asn Gly Lys Phe Glu Thr Glu Lys Asp Lys Pro Ala Val 85 90 95 Leu Arg Lys Arg Pro Thr Ile Gln Trp Lys Thr Gly Arg Pro Ala Ile 100 105 110 Cys Asp Pro Glu Lys Gln Glu Lys Lys Asp Ala Cys Pro Leu Cys Met Page 19
Leu Leu Gly Arg Phe Asp Lys Ala Gly Lys Arg His Arg Asp Asn Lys 130 135 140 Tyr Asp Lys His Asp Tyr Asp Ile His Phe Asp Asn Leu Asn Leu Ile 145 150 155 160 Thr Asp Lys Lys Phe Ser His Pro Asp Asp Ile Ala Ser Glu Arg Ile 165 170 175 Leu Asn Arg Val Asp Tyr Thr Thr Gly Lys Ala His Asp Tyr Phe Lys 180 185 190 Val Trp Glu Val Asp Asp Asp Gln Trp Trp Gln Phe Thr Gly Thr Ile 195 200 205 Thr Met His Asp Asp Cys Ser Lys Ala Lys Gly Leu Leu Leu Ala Ser 210 215 220 Leu Cys Phe Val Asp Lys Leu Cys Gly Ala Leu Cys Arg Ile Glu Val 225 230 235 240 Thr Gly Asn Asn Ser Gln Asp Glu Asn Lys Glu Tyr Ala His Pro Asp 245 250 255 Thr Gly Ile Ile Thr Ser Leu Asn Leu Lys Tyr Gln Asn Asn Ser Thr 260 265 270 Ile His Gln Asp Ala Val Pro Leu Ser Gly Ser Ala His Asp Asn Asp 275 280 285 Glu Pro Pro Val His Asp Asn Asp Ser Ser Leu Asp Asn Asp Thr Ile 290 295 300 Thr Leu Leu Ser Met Lys Ala Lys Glu Ile Val Gly Ala Phe His Glu 305 310 315 320 Ser Gly Lys Ile Glu Lys Ala Arg Thr Leu Ala Asp Val Ile Arg Ala 325 330 335 Met Arg Leu Gln Lys Pro Asp Ile Trp Glu Lys Leu Pro Lys Gly Ile 340 345 350 Asn Asp Lys His His Leu Trp Asp Arg Glu Val Asn Gly Lys Lys Leu 355 360 365 Arg Asn Ile Leu Glu Glu Leu Trp Arg Leu Met Ser Lys Arg Asn Ala 370 375 380 Trp Arg Thr Phe Cys Glu Val Leu Gly Asn Glu Leu Tyr Arg Cys Tyr 385 390 395 400 Lys Glu Lys Thr Gly Gly Ile Val Leu Arg Phe Arg Thr Leu Gly Glu 405 410 415 Thr Glu Tyr Tyr Pro Glu Pro Glu Lys Thr Glu Pro Cys Leu Ile Ser 420 425 430 Asp Asn Ser Ile Pro Ile Thr Pro Leu Gly Gly Val Lys Glu Trp Ile 435 440 445 Ile Ile Gly Arg Leu Lys Ala Glu Thr Pro Phe Tyr Phe Gly Ala Gln 450 455 460 Ser Ser Phe Asp Ser Thr Gln Asp Asp Leu Asp Leu Val Pro Asp Ile 465 470 475 480 Val Asn Thr Asp Glu Lys Leu Glu Ala Asn Glu Gln Thr Ser Phe Arg 485 490 495 Ile Leu Met Asp Lys Lys Gly Arg Tyr Arg Ile Pro Arg Ser Leu Ile 500 505 510 Arg Gly Val Leu Arg Arg Asp Leu Arg Thr Ala Phe Gly Gly Ser Gly
Cys Ile Val Glu Leu Gly Arg Met Ile Pro Cys Asp Cys Lys Val Cys 530 535 540 Ala Ile Met Arg Lys Ile Thr Val Met Asp Ser Arg Ser Glu Asn Ile 545 550 555 560 Glu Leu Pro Asp Ile Arg Tyr Arg Ile Arg Leu Asn Pro Tyr Thr Ala 565 570 575 Thr Val Asp Glu Gly Ala Leu Phe Asp Met Glu Ile Gly Pro Glu Gly 580 585 590 Ile Thr Phe Pro Phe Val Phe Arg Tyr Arg Gly Glu Asp Ala Leu Pro 595 600 605 Arg Glu Leu Trp Ser Val Ile Arg Tyr Trp Met Asp Gly Met Ala Trp 610 615 620 Leu Gly Gly Ser Gly Ser Thr Gly Lys Gly Arg Phe Ala Leu Ile Asp 625 630 635 640 Ile Lys Val Phe Glu Trp Asp Leu Cys Asn Glu Glu Gly Leu Lys Ala 645 650 655 Tyr Ile Cys Ser Arg Gly Leu Arg Gly Ile Glu Lys Glu Val Leu Leu 660 665 670 Glu Asn Lys Thr Ile Thr Glu Ile Thr Asn Leu Phe Lys Thr Glu Glu 675 680 685 Val Lys Phe Phe Glu Ser Tyr Ser Lys His Ile Lys Gln Leu Cys His 690 695 700 Glu Gly Ile Ile Asn Gln Met Ser Phe Ser Gly Gly Leu Arg Ser Tyr 705 710 715 720 His Glu Tyr Leu Ser Pro Leu Trp Thr Glu Val Lys Tyr Glu Ile Lys 725 730 735 Ile Ala Ser Pro Leu Leu Ser Ser Asp Thr Ile Ser Ala Leu Leu Asn 740 745 750 Lys Asp Asn Ile Asp Cys Ile Ala Tyr Glu Lys Arg Lys Trp Glu Asn 755 760 765 Gly Gly Ile Lys Phe Val Pro Thr Ile Lys Gly Glu Thr Ile Arg Gly 770 775 780 Ile Val Arg Met Ala Val Gly Lys Arg Ser Gly Asp Leu Gly Met Asp 785 790 795 800 Asp His Glu Asp Cys Ser Cys Thr Leu Cys Thr Ile Phe Gly Asn Glu 805 810 815 His Glu Ala Gly Lys Leu Arg Phe Glu Asp Leu Glu Val Val Glu Glu 820 825 830 Lys Leu Pro Ser Glu Gln Asn Ser Asp Ser Asn Lys Ile Pro Phe Gly 835 840 845 Pro Val Gln Asp Gly Asp Gly Asn Arg Glu Lys Glu Cys Val Ala Glu 850 855 860 Val Lys Ile Tyr Lys Lys Lys Leu Ile Asp His Val Ala Ile Asp Arg 865 870 875 880 Phe His Gly Gly Ala Glu Asp Lys Met Lys Phe Asn Thr Leu Pro Leu 885 890 895 Val Gly Ser Pro Glu Arg Pro Ile Ile Leu Lys Gly Arg Phe Trp Ile 900 905 910 Lys Lys Asp Met Val Lys Asp Tyr Arg Lys Lys Ile Glu Asp Ala Met Page 21
Val Asp Ile Arg Asp Gly Leu Tyr Pro Ile Gly Gly Lys Thr Gly Ile 930 935 940 Gly Tyr Gly Trp Val Thr Asp Leu Thr Ile Leu Asn Pro Gln Ser Gly 945 950 955 960 Phe Gln Ile Pro Val Lys Lys Asp Ile Ser Pro Glu Pro Gly Thr Tyr 965 970 975 Leu Thr Tyr Pro Ser Tyr Ser Ala Pro Ser Leu Asn Arg Gly His Ile 980 985 990 Tyr Tyr Pro His Tyr Phe Leu Ala Pro Ala Asn Thr Val His Arg Glu 995 1000 1005 Gln Glu Met Ile Gly His Glu Gln Phe His Lys Glu Gln Lys Gly Glu 1010 1015 1020 Leu Leu Val Ser Gly Lys Ile Val Cys Thr Leu Lys Thr Val Thr Pro 1025 1030 1035 1040 Leu Ile Ile Pro Asp Thr Glu Asn Glu Asp Ala Phe Gly Leu Gln Asn 1045 1050 1055 Thr Tyr Ser Gly His Lys Asn Tyr Gln Phe Phe His Ile Asn Asp Glu 1060 1065 1070 Ile Met Val Pro Gly Ser Glu Ile Arg Gly Met Ile Ser Ser Val Tyr 1075 1080 1085 Glu Ala Ile Thr Asn Ser Cys Phe Arg Val Tyr Asp Glu Thr Lys Tyr 1090 1095 1100 Ile Thr Arg Arg Leu Ser Ser Glu Lys Lys Asp Glu Ser Asn Asp Lys 1105 1110 1115 1120 Asn Lys Ser Gln Asp Asp Ala Ser Gln Lys Ile Arg Lys Gly Leu Val 1125 1130 1135 Lys Lys Thr Asp Glu Gly Phe Ser Ile Ile Glu Val Glu Arg Tyr Ser 1140 1145 1150 Met Lys Thr Lys Gly Arg Thr Lys Leu Val Asp Lys Val Tyr Arg Leu 1155 1160 1165 Pro Leu Tyr Asp Ser Glu Ala Val Ile Ala Ser Ile Lys Phe Glu Gln 1170 1175 1180 Tyr Gly Glu Lys Asn Glu Lys Arg Asn Ala Lys Ile Leu Ala Ala Ile 1185 1190 1195 1200 Lys Arg Asn Asn Val Ile Ala Glu Val Ala Arg Lys Asn Leu Ile Phe 1205 1210 1215 Leu Arg Ser Leu Thr Pro Glu Glu Leu Lys Lys Val Leu Gln Gly Glu 1220 1225 1230 Ile Leu Val Lys Phe Ser Leu Lys Ser Gly Glu Asn Pro Asn Asp Tyr 1235 1240 1245 Leu Ala Glu Leu His Glu Asn Gly Thr Glu Arg Gly Leu Ile Lys Phe 1250 1255 1260 Thr Gly Leu Asn Met Val Asn Ile Lys Asn Val Asn Glu Glu Asp Lys 1265 1270 1275 1280 Asp Phe Asn Asp Thr Trp Asp Trp Glu Lys Leu Asn Ile Phe His Asn 1285 1290 1295 Ala His Glu Lys Arg Asn Ser Leu Lys Gln Gly Tyr Pro Arg Pro Val 1300 1305 1310 Leu Lys Phe Ile Lys Asp Arg Val Glu Tyr Thr Ile Pro Lys Arg Cys Page 22
Glu Arg Ile Phe Cys Ile Pro Val Lys Asn Thr Ile Glu Tyr Lys Val 1330 1335 1340 Ser Ser Lys Val Cys Lys Gln Tyr Lys Asp Val Leu Ser Asp Tyr Glu 1345 1350 1355 1360 Lys Asn Phe Gly His Ile Asn Lys Ile Phe Thr Thr Lys Ile Gln Lys 1365 1370 1375 Arg Glu Leu Thr Asp Gly Asp Leu Val Tyr Phe Ile Pro Asn Glu Gly 1380 1385 1390 Ala Asp Lys Thr Val Gln Ala Ile Met Pro Val Pro Leu Ser Arg Ile 1395 1400 1405 Thr Asp Ser Arg Thr Leu Gly Glu Arg Leu Pro His Lys Asn Leu Leu 1410 1415 1420 Pro Cys Val His Glu Val Asn Glu Gly Leu Leu Ser Gly Ile Leu Asp 1425 1430 1435 1440 Ser Leu Asp Lys Lys Leu Leu Ser Ile His Pro Glu Gly Leu Cys Pro 1445 1450 1455 Thr Cys Arg Leu Phe Gly Thr Thr Tyr Tyr Lys Gly Arg Val Arg Phe 1460 1465 1470 Gly Phe Ala Asn Leu Ile Asn Lys Pro Lys Trp Leu Thr Glu Arg Glu 1475 1480 1485 Asn Gly Cys Gly Gly Tyr Val Thr Leu Pro Leu Leu Glu Arg Pro Arg 1490 1495 1500 Leu Thr Trp Ser Val Pro Ser Asp Lys Cys Asp Val Pro Gly Arg Lys 1505 1510 1515 1520 Phe Tyr Val His His Asn Gly Trp Gln Glu Val Leu Arg Asn Asn Asp 1525 1530 1535 Ile Thr Pro Lys Thr Glu Asn Asn Arg Thr Val Glu Pro Leu Ala Ala 1540 1545 1550 Asp Asn Arg Phe Thr Phe Asp Val Tyr Phe Glu Asn Leu Arg Glu Trp 1555 1560 1565 Glu Leu Gly Leu Leu Cys Tyr Cys Leu Glu Leu Glu Pro Gly Met Gly 1570 1575 1580 His Lys Leu Gly Met Gly Lys Pro Leu Gly Phe Gly Ser Val Lys Ile 1585 1590 1595 1600 Ala Ile Glu Arg Leu Gln Thr Phe Thr Val His Gln Asp Asp Ile Asn 1605 1610 1615 Trp Lys Pro Ser Glu Asn Glu Ile Gly Val Tyr Val Gln Arg Gly Arg 1620 1625 1630 Glu Lys Leu Val Glu Trp Phe Thr Pro Ser Asp Ser His Lys Asn Met 1635 1640 1645 Glu Trp Asn Glu Val Lys His Ile Lys Asp Leu Arg Ser Leu Leu Ser 1650 1655 1660 Ile Pro Asp Asp Lys Pro Thr Val Lys Tyr Pro Ala Leu Asn Lys Gly 1665 1670 1675 1680 Ala Glu Gly Ala Ile Ser Asp Tyr Thr Tyr Glu Arg Leu Ser Asp Thr 1685 1690 1695 Lys Leu Leu Pro His Asp Lys Arg Val Glu Tyr Leu Arg Thr Pro Trp 1700 1705 1710 Gly Pro Trp Asn Ala Phe Val Lys Glu Ala Glu Tyr Ser Thr Ser Glu Page 23
Asn Ser Asp Glu Lys Gly Arg Glu Thr Ile Arg Thr Lys Pro Lys Ser 1730 1735 1740 Leu Pro Ser Val Lys Ser Ile Gly Lys Val Lys Trp Phe Asp Glu Gly 1745 1750 1755 1760 Lys Gly Phe Gly Ile Leu Ile Met Asp Asp Gly Lys Glu Val Ser Ile 1765 1770 1775 Ser Lys Asn Ser Ile Arg Gly Asn Asn Leu Leu Lys Lys Asp Gln Lys 1780 1785 1790 Val Thr Phe His Ile Val Gln Gly Leu Ile Pro Lys Ala Glu Asp Ile 1795 1800 1805 Glu Ile Ala Lys 1810 <210> 6 <211> 1599 <212> PRT <213> Desulfobacterales <220> <223> JADGCY010000041.1 cds MBF0120744.1 <400> 6 Met Lys Ile Thr Val Lys Phe Leu Glu Pro Phe Arg Leu Leu Glu Trp 1 5 10 15 Ile Lys His Glu Asn Arg Asn Arg Glu Asn Lys Pro Tyr Leu Arg Gly
Gln Ser Phe Ala Arg Trp His Lys Asn Lys Asp Gly Lys Gly Gly Arg 40 45 Pro Tyr Ile Thr Gly Ser Leu Leu Arg Ser Ala Val Ile Gln Ser Ala 50 55 60 Glu Lys Leu Leu Val Leu Ser Gly Gly Lys Ile Asn Asn Lys Ser Cys 65 70 75 80 Cys Pro Gly Glu Phe Ser Thr Lys Asn Asn Asn Ser Val Leu Leu Leu 85 90 95 Arg Gln Arg Ala Thr Phe Lys Trp Thr Asp Asp Lys Leu Cys Asn Ser 100 105 110 Ser Ser Pro Cys Pro Phe Cys Glu Leu Leu Gly Arg His Asp Gln Ala 115 120 125 Gly Lys Asn Ala Lys Lys Glu Asn Gly Val Gln Phe His Ile His Phe 130 135 140 Gly Asn Leu Asn Leu Pro Tyr Asp Lys Leu Tyr Ser Asp Ile Glu Glu 145 150 155 160 Ile Ala Phe Lys Arg Thr Leu Asn Arg Ile Asp Gln Asp Ser Gly Lys 165 170 175 Ala Phe Asp Phe Leu Arg Val Trp Glu Ile Asp Asn Leu Glu Val Pro 180 185 190 Leu Phe Thr Gly Glu Ile Ser Ile Ser Asn Ile Ile Ser Pro Glu Ser 195 200 205 Page 24
Ile Arg Leu Leu Lys Asp Ser Leu Ser Phe Val Asp Lys Leu Cys Gly 210 215 220 Ser Leu Cys Ile Ile Lys Gln Asp Asp Asn Asp Leu Thr Leu Ser Ile 225 230 235 240 Pro Ser Ile Ser Lys Ser Asp Ile Ser Glu Arg Ala Lys Ile Leu Val 245 250 255 Asp Ala Ile Gly Lys Tyr Asn Glu Ala Asp Lys Ile Arg Ile Met Ala 260 265 270 Asp Ala Met Leu Ala Leu Arg Arg Asp Lys Asn Leu Val Ser Ser Leu 275 280 285 Pro Lys Asp His Asp Asn Lys Glu Asn His Tyr Leu Trp Asp Ile Lys 290 295 300 Glu Gly Asn Ser Lys Ser Ile Arg Leu Ile Leu Lys Glu Gln Ala Asn 305 310 315 320 Ala Leu Asp Asn Gln Gly Trp Arg Asn Leu Cys Glu Gln Ala Gly Gln 325 330 335 Leu Ile Phe Glu Lys Ala Lys Gln Leu Thr Gly Gly Ile Ser Val Ser 340 345 350 Gln Arg Ile Leu Gly Asp Ile Glu Tyr Leu Ser Glu Pro Glu Leu Ile 355 360 365 Ser Asp Thr Val Phe Ile Ser Ser Val Pro Gln Tyr Glu Thr Ile Ile 370 375 380 Gln Gly Lys Leu Ile Ala Lys Thr Pro Phe Phe Phe Gly Leu Glu Asn 385 390 395 400 Asp Glu Thr Lys Gln Ser Ser Tyr Lys Leu Leu Leu Asp Asn Lys Asn 405 410 415 Ser Tyr Arg Ile Pro Arg Ser Ala Ile Arg Gly Ile Leu Arg Arg Asp 420 425 430 Leu Lys Asn Ile Leu Gly Thr Gly Cys Asn Val Glu Leu Gly Gly Val 435 440 445 Pro Cys Pro Cys Lys Val Cys Ser Ile Met Arg Asn Ile Thr Ile Met 450 455 460 Asp Ser Arg Ser Asn Tyr Ser Glu Pro Pro Glu Ile Arg Asn Arg Ile 465 470 475 480 Arg Ile Asn Thr Tyr Thr Gly Thr Val Asp Glu Gly Ala Leu Phe Asp 485 490 495 Met Glu Val Gly Pro Glu Gly Leu Glu Phe Pro Phe Thr Leu Arg Tyr 500 505 510 Arg Gly Arg Tyr Lys Asn Pro Asp Ser Glu Lys Ile Pro Asp Ser Leu 515 520 525 Glu Lys Val Leu Thr Leu Trp Thr Glu Gly Gln Ala Phe Leu Ser Gly 530 535 540 Ser Ala Ser Thr Gly Lys Gly Arg Phe Lys Ile Glu Asp Ile Lys Tyr 545 550 555 560 Cys Arg Leu Asp Leu Lys Asp Ala Ser Lys Arg Asp Glu Tyr Leu Leu 565 570 575 Asn His Gly Trp Arg Asp Asn Leu Asp Lys Leu Lys Phe Asp Asn Leu 580 585 590 Pro Leu Lys Ile Gln Asn Leu Ile Ala Arg Trp Lys Lys Val Glu Ile 595 600 605 Page 25
Glu Ile Lys Leu Ala Ser Pro Phe Leu Asn Gly Asp Pro Ile Arg Ala 610 615 620 Leu Leu Glu Ser Asn Ser Gly Asp Ile Val Ser Phe Arg Lys Phe Ile 625 630 635 640 Asn Gly Gly Thr Gln Glu Val Tyr Ala Tyr Lys Ser Glu Ser Phe Lys 645 650 655 Gly Val Val Arg Ala Ala Val Ser Lys Phe Glu Gly Ile Asp Ser Ile 660 665 670 Thr Glu Lys Thr Gly Pro Leu Gly Thr Leu Thr His Gln Asp Cys Ser 675 680 685 Cys Leu Leu Cys Ser Leu Phe Gly Ser Glu Tyr Glu Thr Gly Lys Met 690 695 700 Arg Phe Glu Asp Leu Ile Phe Asp Pro Gln Pro Val Ser Lys Ile Phe 705 710 715 720 Asp His Val Ala Ile Asp Arg Phe Thr Gly Gly Ala Val Asp Lys Lys 725 730 735 Lys Phe Asp Asp Asn Ser Ile Val Gly Ser His Ser Asn Gln Leu Thr 740 745 750 Leu Lys Gly Leu Phe Trp Ile Arg Asn Asp Ile Thr Asp Glu Glu Tyr 755 760 765 Asn Ala Leu Ser Arg Ala Phe Thr Asp Ile Lys Asn Asn Ile Tyr Pro 770 775 780 Leu Gly Ala Lys Gly Ser Ile Gly Tyr Gly Cys Val Gln Asp Leu Thr 785 790 795 800 Thr Asp Asn Ser Asn Ile Asn Leu Lys Thr Ile Asn Leu Asn Tyr Lys 805 810 815 Pro Ile Pro Gln Lys Ile Asn Ser Asn Ile Lys Ile Asp Phe Asn Asp 820 825 830 Asn Glu Ile Tyr Tyr Pro His Tyr Phe Leu Glu Pro Ser Lys Thr Val 835 840 845 Asn Arg Ile Pro Val Pro Ile Gly His Glu Lys Phe Asp Glu Asn Leu 850 855 860 Leu Thr Gly Lys Ile Thr Cys Thr Leu Asn Thr Leu Ser Pro Leu Ile 865 870 875 880 Val Pro Asp Thr Thr Asn Asp Asn Phe Phe Lys Leu Ala Asp Glu Lys 885 890 895 Glu Lys Ser Glu Gly Lys Pro Tyr His Lys Ser Tyr Asn Phe Phe Ser 900 905 910 Val Asn Arg Asp Ile Ser Ile Pro Gly Ser Glu Leu Arg Gly Met Ile 915 920 925 Ser Ser Val Tyr Glu Ala Val Thr Asn Ser Cys Phe Arg Ile Phe Asp 930 935 940 Glu Lys Tyr Arg Leu Ser Trp Arg Met Asp Val Ser Pro Ala Val Leu 945 950 955 960 Arg Glu Phe Lys Pro Gly Met Val Ile Lys Asp His Asn Asp Leu Lys 965 970 975 Ile Ile Glu Met Glu Glu Phe Arg Tyr Pro Phe Tyr Asp Gln Asn Ile 980 985 990 Gln Asp Ile Glu Ala Gln Asn Lys Tyr Phe Glu Trp Glu Tyr Gly Thr 995 1000 1005
Ile Lys Ile Thr Lys Lys Ser Ile Tyr Glu Leu Glu Lys Leu Ile Glu 1010 1015 1020 Lys Lys Asn Ile Leu Asn Lys Ile Lys Glu Leu Gln Asp Ile Glu Tyr 1025 1030 1035 1040 Lys Ser Glu Tyr Ala Leu Ile Asn Ala Leu Glu Lys Leu Ile Gly Arg 1045 1050 1055 Asn Ser Leu Ala Lys Cys Lys Ser Asn Ile Leu Lys His Ala Glu Arg 1060 1065 1070 Lys Gly Glu Phe Pro Arg Tyr Asp His Pro Thr Asp Thr Asp Arg Met 1075 1080 1085 Met Leu Ser Leu Ser Gly Lys Asn Arg Asn Leu Lys Asn Lys Lys Glu 1090 1095 1100 Lys Ile Glu Tyr Ile Ile Ile Lys Pro Asn Ser Lys Ser Lys Ala Thr 1105 1110 1115 1120 Phe Met Tyr Leu Ala Thr Pro Leu Asn Asn Ile Asp Glu Tyr Glu Asn 1125 1130 1135 Glu Ser Val Ala Arg Lys Ala His Gly Tyr Leu Lys Ile Thr Gly Pro 1140 1145 1150 Asn Lys Ile Glu Lys Glu Asn Val Asp Ser Ile Asp Ser Asn Phe Lys 1155 1160 1165 Pro Val Pro Gln Met Asp Asp Gln Ile Ile Leu Glu Lys Val Trp Leu 1170 1175 1180 Arg Lys Val Phe Val Leu Ser Ala Lys Lys Arg Thr Ser Tyr Arg Asp 1185 1190 1195 1200 Arg Leu Ile Pro Glu Phe Ile Cys Tyr Asp Lys Ile Lys Gly Ile Lys 1205 1210 1215 Tyr Thr Met Asn Lys Arg Ser Glu Arg Ile Phe Val Glu Lys Lys Glu 1220 1225 1230 Arg Ile Lys Lys Glu Ile Thr Gln Gln Ala Ile Glu Lys Phe Glu Ile 1235 1240 1245 Leu Ile Gln Glu Tyr His Lys Asn Ala Glu Gln Gln Gln Thr Pro Glu 1250 1255 1260 Val Phe Arg Thr Ile Leu Pro Gln Asn Gly Thr Ile Asn Asp Gly Asp 1265 1270 1275 1280 Leu Val Tyr Phe Arg Glu Glu Asn Asn Gln Val Val Glu Ile Ile Pro 1285 1290 1295 Val Arg Ile Ser Arg Lys Val Asp Asp Asn Tyr Ile Gly Lys Arg Ile 1300 1305 1310 Asp Glu Gln Leu Arg Pro Cys His Gly Asp Trp Ile Glu Glu Asp Asp 1315 1320 1325 Ile Ser Lys Leu Asn Ala Tyr Pro Glu Lys Arg Leu Phe Thr Arg Asn 1330 1335 1340 Glu Lys Gly Leu Cys Pro Ala Cys Arg Leu Phe Gly Thr Gly Ser Tyr 1345 1350 1355 1360 Lys Gly Arg Val Arg Phe Gly Leu Ala Lys Leu Asp Asn Glu Pro Lys 1365 1370 1375 Trp Leu Met Ser Asn Asp Gly Arg Leu Thr Leu Pro Leu Leu Glu Arg 1380 1385 1390 Pro Arg Pro Thr Trp Ser Ile Pro Asp Asp Lys Lys Glu Asn Lys Val 1395 1400 1405 
Leu Gly Arg Lys Phe Tyr Val His His Asp Gly Trp Lys Thr Val Phe 1410 1415 1420 Glu Gly Lys Asn Pro Ser Asn Gly Glu Thr Ile Gln Ser Asn Pro Asn 1425 1430 1435 1440 Asn Arg Thr Val Lys Pro Leu Gly Cys Asp Asn Lys Phe Thr Phe Asp 1445 1450 1455 Ile Tyr Phe Glu Asn Leu Glu Asp Tyr Glu Leu Gly Leu Leu Phe Tyr 1460 1465 1470 Thr Leu Gln Leu Glu Lys Gly Leu Ser His Lys Leu Gly Met Ala Lys 1475 1480 1485 Ser Met Gly Phe Gly Ser Val Glu Ile Asp Ile Lys Asn Ile Ser Leu 1490 1495 1500 Arg Lys Asp Pro Glu Asn Trp Glu Asp Gly Asn Ser Lys Ile Ala Asp 1505 1510 1515 1520 Trp Ile Lys Glu Gly Glu Lys Met Leu Thr Lys Trp Phe Gln Thr Asp 1525 1530 1535 Phe Asn Thr Ile Glu His Leu Asn Asn Leu Lys Lys Leu Leu Tyr Phe 1540 1545 1550 Ser Gly Asn Lys Asn Leu Lys Val Phe Tyr Pro Thr Leu Lys Lys Glu 1555 1560 1565 Gly Lys Ile Pro Gly Tyr Glu Glu Leu Lys Lys Asp Ile Lys Asp Arg 1570 1575 1580 Lys Lys Met Leu Thr Thr Pro Trp Met Pro Trp His Ser Glu Glu 1585 1590 1595 <210> 7 <211> 1900 <212> PRT <213> Candidatus Magnetomorum <220> <223> JADFYV010000175.1 cds MBF0452212.1 <400> 7 Met Asn Ile Thr Ile Thr Phe Phe Glu Pro Ile Arg Met Met Glu Trp 1 5 10 15 Ile Asp Pro Ser Glu Arg Lys Arg Lys Ser Lys Asn Leu Arg Ala Gln
Ser Phe Ala Arg Trp His Lys Cys Lys Ile Asn Lys Asp Leu Gly Lys 40 45 Pro Phe Ile Thr Gly Thr Leu Leu Arg Ser Ala Val Ile Arg Ala Ala 50 55 60 Glu His Leu Leu Val Leu Gln Glu Gly Lys Ser Asp Asn Ile Gly Cys 65 70 75 80 Cys Pro Gly Lys Phe Ser Thr Asn Asp Pro Ala Lys Asp Lys Ile Ile 85 90 95 Tyr Leu Arg Gln Arg Thr Thr Pro Val Trp Thr Asp Asp Ala Leu Cys 100 105 110 Asp Asp Asn His Leu Cys Pro Phe Cys Glu Leu Ile Gly Arg Lys Val Page 28
Asn Pro Glu Lys Gly Ile Lys Ser Tyr Lys Lys Lys Asp Gln Asn Ile 130 135 140 Ser Asn Ile Lys Phe Asp Asn Leu His Leu Pro Gln Lys Thr Glu Val 145 150 155 160 Pro Glu Lys Ile Ala Gln Lys Arg Thr Leu Asn Arg Val Asp Tyr Ala 165 170 175 Thr Gly Lys Ala His Asp Phe Phe Lys Ile Leu Glu Ile Asp His Arg 180 185 190 Gln Phe Pro Val Phe Glu Gly Lys Ile Val Ile Ala Glu Tyr Val Ser 195 200 205 Pro Ala Ala Arg Ser Leu Leu Ile Lys Ser Leu Lys Phe Ile Asp Arg 210 215 220 Leu Cys Gly Ser Leu Cys Lys Ile Thr Phe Asp Glu Asn Asn Asp Asp 225 230 235 240 His Ile Thr Asn Pro Lys Ile Leu Asp Thr Lys Asn Ala Gln Gln Thr 245 250 255 Ala Asp Gln Leu Ile Thr Ile Leu Glu Lys Asn Ala Lys Thr Gln Tyr 260 265 270 Leu Arg Ile Leu Ser Asp Ala Val Arg Glu Leu Gly Arg Asp Arg Lys 275 280 285 Lys Ile Asn Glu Leu Pro Lys Asp His Asn Gly Lys Phe Glu His Tyr 290 295 300 Ile Trp Asp Leu Ser Asn Gly Asn Leu Ser Ile Arg Ser Leu Leu Gln 305 310 315 320 Gln Lys Ala Asp Lys Ile Tyr Asp Asn Asp Gln Phe Ala Ala Phe Phe 325 330 335 Lys Ser Leu Gly Thr Trp Leu Tyr His His Glu Lys Glu Ser Ser Gly 340 345 350 Gly Leu Lys Lys Ser Gln Arg Ile Leu Gly Asn Arg Ala Tyr Tyr Gly 355 360 365 Lys Asn Gln Phe Thr Asp Arg Pro Pro Asp Ile Gln Ile Val Pro Asp 370 375 380 Arg Glu Phe Leu Phe Tyr Gly Thr Leu Thr Ser Glu Thr Pro Phe Phe 385 390 395 400 Phe Gly Leu Glu Ser Glu Glu Thr Gln Gln Thr Asp Phe Thr Ile Leu 405 410 415 Leu Asp Arg Asn Asn His Tyr Arg Leu Pro Arg Ser Ala Leu Arg Gly 420 425 430 Val Leu Arg Arg Asp Ile Arg Ile Val Met Asp Asp Ile Gly Cys Asp 435 440 445 Val Arg Leu Gly Gly Tyr Gln Cys Met Cys Pro Ile Cys Gln Ile Met 450 455 460 Arg Asn Ile Thr Ile Met Asp Val Arg Asn Glu His Tyr Thr Glu Asn 465 470 475 480 Pro Glu Val Arg Gln Arg Ile Arg Leu Asn Pro Tyr Thr Gly Thr Val 485 490 495 Ala Glu Gly Ala Leu Phe Ser Met Glu Leu Gly Pro Gln Gly Met Glu 500 505 510 Phe Pro Phe Ile Leu Arg Tyr Arg Gly Asn Asp Thr Gly Pro Pro Glu Page 29
Ser Leu Ile Lys Val Leu Leu Asn Trp Val Ala Gly Lys Ala Phe Leu 530 535 540 Ser Gly Ala Ser Ser Thr Gly Lys Gly Arg Phe Lys Leu His His Leu 545 550 555 560 Lys Gln Lys Glu Phe Leu Leu Lys Asp Asn Thr Tyr Leu Asp Glu Arg 565 570 575 Gly Trp Arg Asn Arg Glu Asn Glu Ile Pro Asp Leu Met Pro Leu Lys 580 585 590 Leu Phe Glu Thr Lys Thr Ser Leu Trp Glu Lys Lys Ser Ile Glu Ile 595 600 605 Tyr Val Lys Ser Pro Leu Leu Asn Gly Asp Pro Val Arg Ala Leu Val 610 615 620 Ser Gly Asn Gly Met Asp Ile Val Ser Phe Lys Lys Tyr Thr Ser Val 625 630 635 640 Gly Phe Gln Gln Val Tyr Ala Tyr Lys Ser Glu Ser Leu Lys Gly Ile 645 650 655 Phe Arg Thr Ala Leu Gly Arg Lys Phe Gln His Lys Asp Ser Val Ser 660 665 670 Asp Lys Val Leu Pro Leu Leu Ala Leu Asn His Lys Asp Cys Asp Cys 675 680 685 Pro Leu Cys Arg Leu Phe Gly Ser Glu Tyr Glu Ser Gly Lys Ile Lys 690 695 700 Phe Glu Asp Leu Leu Phe Ser Thr Pro Pro Glu Glu Lys Lys Phe Asp 705 710 715 720 His Val Ala Ile Asp Arg Phe Thr Gly Gly Ala Val Asn Gln Lys Lys 725 730 735 Phe Asp Asp Tyr Ser Leu Val Gly Thr Pro Lys Lys Pro Leu Lys Leu 740 745 750 Glu Gly Ala Leu Trp Ile Arg Lys Asp Leu Ser Glu Asn Asp Arg Asn 755 760 765 Asn Leu Asn Ala Ala Phe Trp Asp Ile Lys Lys Gly Leu Tyr Pro Leu 770 775 780 Gly Ala Lys Ala Gly Ile Gly Tyr Gly Gln Val Gln Asp Ile Val Leu 785 790 795 800 Thr Pro Pro Ile Ile Lys Glu Ser Lys Ala Asp Ser Phe Tyr Asn Asn 805 810 815 Gln Asn Thr Lys Leu Pro Asp Lys Lys Ile Thr Asp Ala Asn Ile Ser 820 825 830 Ile Ser Glu Glu Ala Val Tyr Phe Pro His Tyr Phe Leu Lys Pro His 835 840 845 Gln Lys Val His Arg Lys Thr Ile Pro Leu Asp His Leu Ser Leu His 850 855 860 His Asn Asp Cys Cys Thr Gly Lys Ile Thr Leu Thr Leu Thr Thr Lys 865 870 875 880 Thr Pro Leu Ile Val Pro Asp Thr Glu Asn Asp Asp Ala Phe His Leu 885 890 895 Lys Ser Lys Thr Met Thr Asn Asp Gly Arg Tyr His Lys Ser Tyr Ala 900 905 910 Phe Phe Ser Ile Asn Asp Glu Ile Met Ile Pro Gly Ser Glu Ile Arg Page 30
Gly Met Ile Ser Ser Val Phe Glu Ala Leu Thr Asn Ser Cys Phe Arg 930 935 940 Ile Phe Glu Glu Lys His Arg Leu Ser Trp Arg Met Glu Ala Asp Pro 945 950 955 960 Asp Ile Leu Gly Lys Phe Lys Pro Gly Arg Val Ile Gln Cys Glu Asp 965 970 975 Gly Leu Arg Met Val Lys Met Glu Glu Tyr Arg Tyr Pro Phe Tyr Asp 980 985 990 Asn Lys Asp Leu Asp Tyr Ser Ser Trp Glu Gly Glu Glu Lys Pro Val 995 1000 1005 Tyr Asp His Pro Thr Pro Ser Asp Lys Met Ile Ser Thr Leu Ser Glu 1010 1015 1020 Tyr Asn Arg Asn His Arg Pro Pro Asp Asn Thr Lys Ala Ser Phe Lys 1025 1030 1035 1040 Ile Ile Lys Pro Glu Ser Ser Ser Lys Ala Ser Phe Met Tyr Thr Ala 1045 1050 1055 Thr Pro Ala Asp Asn Glu Gln Ile His Asp Thr Asn Cys Val Leu Lys 1060 1065 1070 Lys Lys Val Asn Gly Tyr Leu Lys Ile Ser Gly Pro Asn Lys Val Glu 1075 1080 1085 Lys Glu Lys Ser Asp Lys Lys Gly Asp Asp Gln Leu Pro Asn Pro Ile 1090 1095 1100 Gln His Asn Arg Ile Tyr Ser Gln Thr Ile Phe Val Gln Asn Ala Gln 1105 1110 1115 1120 Arg Lys Lys Asp Arg Thr Arg Leu Ile Pro Glu Phe Ile Phe Leu Gly 1125 1130 1135 Thr Asn Val Lys Tyr Thr Met Asn Lys Arg Cys Glu Arg Val Phe Val 1140 1145 1150 Glu Pro Glu Asn Ile Asn His Lys Gly Ile Pro Ile Ser Gln Gln Ala 1155 1160 1165 Lys Glu Leu Phe Lys Gln Leu Val Asn Asp Tyr Arg Asn Asn Ala Glu 1170 1175 1180 Gln Gln Glu Thr Pro Asp Val Phe Arg Thr Ile Leu Pro Glu Lys Gly 1185 1190 1195 1200 Lys Ile Glu Glu Gly Leu Leu Val Tyr Phe Arg Glu Glu Gln Asp Glu 1205 1210 1215 Val Val Glu Ile Ile Pro Val Lys Ile Ser Arg Lys Val Asp Asp Arg 1220 1225 1230 Phe Ile Gly Lys Arg Leu Ser Lys Ala Leu Arg Pro Cys His Gly Glu 1235 1240 1245 Trp Leu Glu Thr Asp Asp Leu Ser Gly Ile Asn Gln Tyr Pro Glu Lys 1250 1255 1260 Lys Leu Phe Thr Arg His Pro Lys Gly Leu Cys Pro Ala Cys Gln Leu 1265 1270 1275 1280 Phe Gly Thr Gly Ala Tyr Lys Gly Lys Leu Arg Phe Gly Phe Ala Thr 1285 1290 1295 Leu Thr Asn Asp Pro Val Trp Leu Asn Ser Gly Asp His Ser Gln Ile 1300 1305 1310 Leu Pro Leu Leu Glu Arg Pro Arg Pro Thr Trp Ala Val Pro Asn Ala
Thr Gln Ser Ser Lys Val Pro Gly Arg Lys Phe Tyr Ile His His His 1330 1335 1340 Ala Trp Lys His Ile Gln Ala Lys Asn His Pro Ser Thr Gly Gln Ser 1345 1350 1355 1360 Ile Asp Ile Asp Ile Asn Asn Arg Thr Val Gln Pro Leu Gly Cys Asn 1365 1370 1375 Asn Thr Phe Gln Phe Asp Ile His Phe Glu Asn Leu Gln Ile His Glu 1380 1385 1390 Leu Gly Leu Leu Leu Tyr Thr Leu Gln Leu Glu Glu Gly Leu Ser His 1395 1400 1405 Lys Leu Gly Met Gly Lys Ala Phe Gly Phe Gly Ser Ile Asp Leu Asn 1410 1415 1420 Met Lys Thr Leu Leu Leu Leu Asp Pro Lys Asp Asn Gln Trp Ile Asn 1425 1430 1435 1440 Lys Thr Asp Gln Thr Asp Ile Phe Ile Asp Lys Gly Lys Glu His Leu 1445 1450 1455 Glu Lys Leu Phe Glu Lys Lys Trp Asn Ser Ile Asp His Ile Asn Asp 1460 1465 1470 Leu Lys Ser Leu Leu Cys Tyr Thr Glu Asn Glu Ile Ile Ser Val Phe 1475 1480 1485 Tyr Pro Leu Leu Arg Gln Lys Asp Tyr Pro Asp Gln Asp Leu Pro Gly 1490 1495 1500 Tyr Glu Glu Leu Lys Gln Asn Phe Gln Gln Gly Ile Gln Ile Arg Gln 1505 1510 1515 1520 His Leu Leu Thr Thr Pro Trp Thr Pro Trp Ala Tyr Arg Glu Lys Lys 1525 1530 1535 Leu Asn Thr Leu Gly Asp Ile Ile Thr Glu Pro Pro Ser Glu Thr Glu 1540 1545 1550 Ile Ile Thr Glu Leu Pro Ser Glu Thr Glu Asn Ile Thr Gly Pro Leu 1555 1560 1565 Ser Lys Ala Glu Val Met Ile Glu Pro Phe Ser Lys Ala Asp Lys Ile 1570 1575 1580 Val Tyr Pro Val Ser Lys Val Glu Ile Ile Lys Tyr Leu Gln Ile Arg 1585 1590 1595 1600 Ser Ala Thr Met Glu Ile Pro His Asp Ala Gln Trp Val Phe Leu Thr 1605 1610 1615 Gly Asn Asn Ala Tyr Gly Lys Thr Thr Leu Leu Gln Ala Ile Val Thr 1620 1625 1630 Gly Leu Tyr Gly Lys Ile Tyr Asn Asp Thr Arg Asp Glu Pro Phe Tyr 1635 1640 1645 Ser Glu Cys Asp Ile Arg Ile Lys Ile Asp Asp Arg Trp Thr Asn Asp 1650 1655 1660 Leu Lys Asn Lys Leu Phe Lys Ser Tyr Lys Asn Phe Val Ala Tyr Gly 1665 1670 1675 1680 Pro Ser Arg Leu Asn Ile Ser Ser Gly Asp Ala Arg Asn Ile Lys Arg 1685 1690 1695 Tyr His Ser Leu Phe Glu Thr Asp Lys Val Tyr Tyr His Asp Ile Glu 1700 1705 1710 Asp Glu Leu Cys His Phe Lys Asp Arg Asn Ser Gln Arg Val Asp Leu Page 32
Ile Lys Ser Ile Leu Glu Asp Leu Met Pro Ser Ile Asp Ser Ile Asp 1730 1735 1740 Ile Lys Glu Asp Lys Lys Thr Gln Arg Phe Tyr Val Arg Tyr Ile Glu 1745 1750 1755 1760 Lys Asn Thr Pro Asp Ile Phe Lys Leu Ser Glu Leu Ser Ser Gly Asn 1765 1770 1775 Lys Ser Ile Leu Ala Met Ile Gly Asp Leu Ile Ile Arg Phe Thr Asn 1780 1785 1790 Asp Gln Glu Gly Ala Ile Lys Asp Lys Lys Asp Phe Lys Gly Leu Val 1795 1800 1805 Ile Ile Asp Glu Leu Asp Leu His Leu His Pro Ile Trp Gln Arg Lys 1810 1815 1820 Leu Pro Glu Leu Leu Ser Glu Ala Phe Pro Lys Val Arg Phe Ile Val 1825 1830 1835 1840 Ser Thr His Ser Leu Ile Pro Phe Leu Gly Ala Pro Lys Asn Ser Ala 1845 1850 1855 Phe Phe Lys Val Asn Arg Asn His Asp His Tyr Ile Asn Val Glu Arg 1860 1865 1870 Ile Asp Ile Asp Ile Ser Asn Leu Leu Pro Asn Thr Ile Leu Thr Ser 1875 1880 1885 Pro Leu Phe Asp Met Gly Glu Asn Gly Val Thr Ser 1890 1895 1900 <210> 8 <211> 1674 <212> PRT <213> Candidatus Magnetomorum <220> <223> JADFYV010000127.1 cds MBF0451763.1 <400> 8 Met Asp Asp Ala Leu Asn Cys Arg Ile Lys Phe Phe Gly His Tyr Arg 1 5 10 15 Leu Val Glu Trp Val Glu Gln His Lys Arg Lys Thr Asp Gln Arg Tyr
Gln Arg Phe Ile Asn Tyr Leu Asn Trp His Arg Pro Ile Asp Pro Asn 40 45 Thr Leu Gln Asn Pro Tyr Ile Arg Gly Ser Leu Val Arg Ser Tyr Met 50 55 60 Ile Trp Asn Leu Glu Glu Met Phe Ala Phe Phe Lys Asp Gln Phe Gln 65 70 75 80 Glu Cys Cys Pro Gly His Met Lys Tyr His Ala Asp Leu Arg Ser Gly 85 90 95 Ile Ser Glu Lys Lys Gln Ser Asp Phe Leu Arg Arg Arg Asn Arg Tyr 100 105 110 Gln Thr Gly Lys Ser Val Cys Thr Asn Lys Glu Asp Ala Cys Ile Phe 115 120 125 Page 33
Cys Gln Ile Leu Gly Ala Phe Asp Ala Gln His Phe Ser Gln Arg His 130 135 140 Ser Ser Lys Asn Lys Leu Asn Ala Gln Gly Asn His Leu Lys Tyr Asn 145 150 155 160 Pro Gln Lys Asn Arg Lys Ser Val Arg Phe Leu Asn Phe Ile Pro Asp 165 170 175 Asp Arg Phe Ser Cys Ile Thr Asp Leu Ala Ser Val Arg Tyr Lys Asn 180 185 190 Arg Tyr Asp Arg Asn Ser Gln Lys Ala Arg Asp Tyr Phe Gln Ile Trp 195 200 205 Glu Ala Asn His Phe Leu Cys Ser Glu Phe Phe Gly Lys Ile Val Ile 210 215 220 Asn Thr Asp Leu Val Lys Asp Val Glu Lys Ala Lys Cys Leu Ile Ala 225 230 235 240 Ala Gly Leu Ser Arg Ile Asn Tyr Leu Ala Gly Ala Pro Cys Arg Ile 245 250 255 Asp Ile Gly Glu Lys Lys Asp Gly Gln Trp Asp Met Ser Ile His Gln 260 265 270 Gln Leu Leu Asp Ser Phe His Gln Asn Phe Leu Met Lys Asp Cys Asn 275 280 285 Ser Asp Val Lys Thr Asn Pro Leu Ser Phe Ile Glu Thr Thr Asn Glu 290 295 300 Pro Thr Lys Asn Ile Asn Glu Gln Asn Ile Pro Thr Glu Ala Glu Ala 305 310 315 320 Ile Ser Glu Asp Ile Ile Asn Thr Met Gln Lys Ser Gly Lys Thr Ile 325 330 335 His Ile Arg Ser Phe Ala Asp Ala Ile Tyr Glu Leu Arg Arg Ile Ser 340 345 350 Val Pro Gln Ile Glu His Ile Pro Glu Ile Asn Thr Ala Thr Ala Lys 355 360 365 Gln Thr Phe Trp Gly Leu Leu Cys Gln Asn Asn Lys Ser Ile Lys Gln 370 375 380 Arg Ile Val Glu His Ile Lys Asn Lys Asn Thr Ile Gln Lys Gln His 385 390 395 400 Phe Phe Glu Cys Leu Arg Asp Asn Leu Tyr Gln Glu Ala Lys Ala Leu 405 410 415 Gln Ile Val Gln Gln Pro Thr Gly Arg Ile Ile Gly Glu Asn Glu Tyr 420 425 430 Tyr Ala Arg Gln Ala Pro Ala Leu Ser Asp Arg Tyr Glu Ala Ile Val 435 440 445 Thr Lys Asn Ser Phe Trp Ile Ile Thr Gly Phe Leu Glu Ser Gln Thr 450 455 460 Pro Phe Phe Phe Gly Ser Gly Leu Thr Gly Gly Asn Thr Asp Ile Pro 465 470 475 480 Ile Val Thr Asp Ala Asn Gly His Leu Arg Leu Pro Phe Asp Val Leu 485 490 495 Arg Gly Val Phe Arg Arg Asp Leu Ser Ser Met Val Ser Asn Cys Asn 500 505 510 Leu Val Asp Ile Gly Ile Ser Arg Pro Cys Asn Cys Pro Val Cys Gln 515 520 525 Page 34
Thr Leu Ser Gln Cys Lys Phe Glu Asp Ser Ile Ala Cys Asp Tyr Ser 530 535 540 Ile Pro Pro Glu Ile Arg Gln Arg Ile Lys Ile Ser Pro His Thr Gly 545 550 555 560 Val Val Glu His Gly Ala Leu Phe Asn Ile Glu Ile Gly Pro Gln Gly 565 570 575 Leu Arg Phe Pro Ile Met Ile Lys His Tyr Pro Gly Asp Ile Asn Ile 580 585 590 Asn Glu Asp Leu Lys Lys Val Leu Gly Leu Trp Ser Asn Asn Gln Cys 595 600 605 Phe Ile Gly Gly Gln Leu Gly Thr Gly Lys Gly Arg Phe Asn Leu Val 610 615 620 Asp Leu Lys Trp Tyr Asn Leu Gln Phe Asn Leu Gln Ser Pro Arg Gln 625 630 635 640 Lys Ile Ile Ala Ile Lys Asp Tyr Asn Cys Leu Leu Lys His Arg Gly 645 650 655 Tyr Ile Gly Leu Lys Lys Ser Glu Val Asp His Leu Ile Asp Asp Lys 660 665 670 Lys Lys Asn Ile Asp Leu Thr Met Thr Asp Asn Val Ser Tyr Ser Lys 675 680 685 Val Ser Tyr Thr Leu Asn Phe Glu Ser Pro Val Ile Ser Asn Asp Pro 690 695 700 Ile Ala Ala Leu Ile His Glu Asp Thr Pro Asp Ala Ile Met Phe Lys 705 710 715 720 Lys Thr Leu Val Ser Tyr Asp Asp Asn Gly Met Ser Ser Thr Ser Asp 725 730 735 Ile Tyr Ser Leu Lys Gly Glu Gly Ile Arg Gly Val Leu Arg Tyr Leu 740 745 750 Val Gly Lys Asn Glu Asn Cys His Asp Leu Ile His Glu Asp Cys Asn 755 760 765 Cys Met Val Cys His Ile Phe Gly Asn Asn Gln Thr Gly Gly Tyr Val 770 775 780 Arg Phe Glu Asp Ala Asn Leu Val Asn His Val Lys Pro Val Arg Phe 785 790 795 800 Asp His Val Ala Ile Asp Trp Ile Gly Ala Lys Glu His Ala Lys Phe 805 810 815 Asp Val His Ala Leu Pro Gly Asn Pro His Gln Ser Leu Ala Phe Asn 820 825 830 Gly Ile Phe Trp Ile Ser Lys Asp Leu Glu Asp Ala Cys Gln Asn Ala 835 840 845 Ile Lys Lys Ala Phe Ile Asp Leu Gln Asn Lys Met Ala Thr Leu Gly 850 855 860 Ala Asn Gly Ala Ser Gly Tyr Gly Trp Val Ser Lys Ile Gln Phe Met 865 870 875 880 Asp Gly Pro Asp Trp Leu Ile Asn Gln Phe Pro Glu Pro Val Pro Glu 885 890 895 Ile Gln Thr Asn Arg Gln Asp Val Thr Asp Gln Asn Lys Lys Pro Glu 900 905 910 Ile Lys Leu Asn Leu Ser Lys Asp Gln Trp Tyr His Pro Tyr Tyr Phe 915 920 925 Page 35
Leu Glu Pro Ser Pro Thr Val Lys Arg Ser Asn Asp Leu Ile Thr His 930 935 940 Glu Cys Tyr His Lys Glu Lys Leu Ser Gly Met Ile Thr Cys Asp Leu 945 950 955 960 His Thr Leu Ser Pro Phe Phe Ile Pro Asp Thr Thr Gln Asp Asn Ser 965 970 975 Leu Asp Leu Ile Leu Asn Thr Asp Ser Phe Pro Glu Asn His Lys Arg 980 985 990 Phe Arg Phe Phe Arg Ile Asn Asp Gln Pro Met Ile Pro Ala Ser Ser 995 1000 1005 Ile Arg Gly Met Ile Gly Tyr Ile Tyr Gln Leu Leu Thr Asn Ser Cys 1010 1015 1020 Phe Arg Asn Leu Asp Glu Ser Ala Tyr Ile Thr Arg Arg Met Asp Ala 1025 1030 1035 1040 Ser Lys Ala Gly Asp Leu Lys Cys Gly Ile Val Arg Lys Gly Asn Asp 1045 1050 1055 Gly Asn Leu Tyr Ile Thr Glu Cys Asn Arg Tyr Arg Leu Pro Leu Tyr 1060 1065 1070 Asp Asp Met Thr Ile Thr Asn Ala Ile Gly Ala Lys Asp Glu Tyr Leu 1075 1080 1085 Asp Arg Met Lys Ala Lys Asn Lys Asp Gln Arg Leu Arg Lys Ala Ile 1090 1095 1100 Asn Asn Asn Lys Ala Leu Ala Lys Phe Ala Glu Gln Asn Arg Ala Tyr 1105 1110 1115 1120 Leu Leu Lys Lys Asp Glu Asn Thr Arg Ile Gln Leu Leu Ser Gly Lys 1125 1130 1135 Gln Ala Ile Arg Phe Asn Ile Val Gln Lys Pro Glu Trp Lys Gln Gly 1140 1145 1150 Gly Val Asp Lys Ile Val Val Leu Thr Asp Leu Gly Ser Arg Ser Gly 1155 1160 1165 Tyr Ile Lys Phe Thr Gly Thr Asn Asn Ala Asn Lys Lys Leu Asp Asp 1170 1175 1180 Asp Gln Gln Thr Ser Cys Gln Asp Ile Thr Phe Ile Ser Cys Asp Gln 1185 1190 1195 1200 Ser Trp Asp Pro Trp Lys Leu Asn Ile Leu Leu Lys Ser Lys Ser Pro 1205 1210 1215 Glu Leu Arg Pro Glu Thr Gly Lys Asn Asn Tyr Tyr Pro Arg Pro Val 1220 1225 1230 Leu Tyr Cys Lys Thr Lys Asp Asn Thr Glu Tyr Thr Ile Ser Lys Arg 1235 1240 1245 Cys Glu Arg Val Phe Glu Ile Pro Tyr Asp Lys Ser Glu Glu Phe Met 1250 1255 1260 Val Thr Ser Gln Ser Lys Lys Gln Tyr Lys Glu Ile Ile Lys Ser Tyr 1265 1270 1275 1280 Asp Asp Asn Thr Gly Lys Ile Asp Ser Leu Phe Arg Thr Gln Phe Gln 1285 1290 1295 Asn Asp Glu Leu Thr Val Gly Asp Leu Val Tyr Phe Glu Pro Lys Ile 1300 1305 1310 Glu Glu Lys Lys Cys Val Ala Lys Asn Ile Thr Pro Val Asn Ile Ser 1315 1320 1325 Page 36
Arg Glu Ser Asp Asp Lys Pro Met Arg Gln Arg Phe Lys Glu Gly Phe 1330 1335 1340 Glu Ser Leu Arg Pro Cys Ile Lys Glu Cys Ser Glu Asn Cys Ala Thr 1345 1350 1355 1360 Cys Asn Asp Ile Ser Cys Leu Glu Ser Leu Leu Thr Asp Tyr Ser Lys 1365 1370 1375 Gly Leu Cys Pro Thr Cys Arg Leu Phe Gly Thr Thr Thr Ile Lys Ser 1380 1385 1390 Arg Leu Arg Phe Gly Phe Gly Lys Leu Thr Glu Ala Asp His Thr Asn 1395 1400 1405 His Ala Leu Trp Tyr Ala Asn Asn Glu Gln Gly Phe Val Asn Asn Thr 1410 1415 1420 Lys Gly Thr Pro Leu Thr Leu Pro Leu Leu Glu Arg Pro Arg Ala Thr 1425 1430 1435 1440 Trp Ala Met Pro Asp Lys Lys Ser Cys Ile Pro Gly Arg Lys Leu Tyr 1445 1450 1455 Val Asn His Pro Lys Gly Val Lys Ile Gln Ser Ala Thr Pro Gly Glu 1460 1465 1470 Asn Asn Arg Thr Ile Glu Pro Met Ala Ala Gly Asn Thr Phe Arg Phe 1475 1480 1485 Gln Ile Gly Phe Asp Asn Leu Ser Glu Trp Glu Leu Gly Leu Leu Leu 1490 1495 1500 Tyr Ala Leu Glu Leu Glu Asn Asn Met Ala His Arg Leu Gly Met Gly 1505 1510 1515 1520 Lys Pro Leu Gly Met Gly Ser Val Asn Ile Gln Val Arg Asp Ile Leu 1525 1530 1535 Ile Arg Lys Ser Pro Asp His Phe Glu Ser Arg Phe Asp Ser Lys Ala 1540 1545 1550 Phe Tyr Ile Asp Lys Gly Lys Lys Glu Leu Ser Ser Trp Phe Asn Asn 1555 1560 1565 Ser Asn Glu Ile Glu Asn Ile Ser Glu Asn Thr His Phe Asp His Ile 1570 1575 1580 Pro His Ile Gln Asp Met Lys Arg Met Leu Ile Ile Pro Asp Glu Thr 1585 1590 1595 1600 Ile Asp Ile Arg Tyr Pro Glu Leu Lys Ser Glu Lys Glu Ile Asp Tyr 1605 1610 1615 Glu Lys Leu Lys Lys Lys Glu Ala Asp Asp Ile Gln Lys Val Leu Tyr 1620 1625 1630 Thr Pro Trp Thr Met Trp Asp Leu Asp Gln Met Lys Asp Arg Lys Glu 1635 1640 1645 Lys Lys Ile Pro Lys Asn Lys Lys Lys Gln His Tyr Asn His Lys Lys 1650 1655 1660 Lys Asn Lys Phe Tyr Arg Ser Lys Arg Tyr 1665 1670 <210> 9 <211> 1558 <212> PRT <213> Deferribacteres <phylum> Page 37
<220> <223> JAADEW010000104.1 cds NPA15996.1 <400> 9 Met Val Val Gly Phe Thr Leu Lys Phe Leu Glu Pro Tyr Arg Ile Gln 1 5 10 15 Pro Trp Ile Glu Pro Glu Lys Arg Arg Leu Ser Asn Ala Glu Tyr Ala
Arg Thr Leu Ser Tyr Ala Arg Thr His Arg Ser Ala Asp Asp Arg Tyr 40 45 Ser Ile Tyr Ile Pro Gly Thr Leu Leu Arg Ser Ala Phe Ile Asp Val 50 55 60 Leu Pro Trp Leu Leu Gly Val Lys Gly Glu Lys Phe Cys Thr Ser Ile 65 70 75 80 Phe Arg Gly Phe Thr Glu Lys Gly Ala Lys Pro Lys Met Gly Leu Leu 85 90 95 Arg Lys Lys Trp Val Phe Asp Gly Gly Lys Ser Ser Cys Arg Ser Lys 100 105 110 Glu Asp Ala Cys Pro Leu Cys Leu Ile Ala Gly Arg Phe Asp Arg Ala 115 120 125 Asp Asp Pro Arg Asp Lys Ser Val His Phe Arg Asn Leu Phe Leu Glu 130 135 140 Pro Gln Val Ser Phe Thr Glu Ala Glu Tyr Glu Arg Ile Ala Arg Gln 145 150 155 160 Arg Thr Val Asn Arg Val Gln Gln Phe Thr Glu Lys Ala His Asp Tyr 165 170 175 Phe Lys Val Arg Glu Val Val Ala Pro Ser Leu Trp Arg Phe Arg Gly 180 185 190 Glu Ile His Val Lys Asp Glu Leu Ala Ser Phe Val Pro Leu Ile Lys 195 200 205 Glu Thr Leu Ser Leu Val Asp Arg Leu Ser Gly Ala Leu Cys Val Val 210 215 220 Glu Phe Glu Lys Thr Pro Ser Glu Ser Ile Ser Val Arg Glu Ala Phe 225 230 235 240 Ser Glu Leu Pro Arg Glu Leu Val Glu Ser Ala Gly Lys Val Leu Glu 245 250 255 Glu Leu Gly Glu Asn Leu Asp Val Arg Arg Ile Arg Arg Leu Ser Asp 260 265 270 Ala Phe Arg Leu Ile Ala Asp Thr Cys Lys Asp Glu Leu Lys Leu Pro 275 280 285 Glu Gly Met Glu Asp Asp His Phe Leu Trp Asp Glu Ile Lys Val Ser 290 295 300 Gly Lys Ser Leu Arg Ser Trp Leu Ala Glu Ile His Glu Arg Tyr Lys 305 310 315 320 Pro Tyr Trp Thr Asp Phe Cys Leu Phe Val Ala Gln Glu Leu Tyr Glu 325 330 335 Leu Tyr Gly Arg Lys Arg Thr Lys Ser Gln Ile Glu Glu Ser Phe Ala 340 345 350 Val Ser Leu Tyr Gln Ala Asp Gln Lys Glu Ala Pro Ser Arg Pro Val Page 38
Ser Leu Pro Tyr Lys Arg Arg Ser Phe Tyr Glu Trp Ile Val Val Gly 370 375 380 Arg Leu Lys Ala Leu Thr Pro Phe His Phe Gly Asp Ala Ser Glu Arg 385 390 395 400 Glu Gly Gly Ile Leu Leu Thr Ser Asp Gly Arg Phe Arg Ile Pro His 405 410 415 Ser Ala Leu Arg Gly Val Leu Arg Arg Asp Leu Arg Trp Ala Gly Ala 420 425 430 Val Ala Cys Glu Val Lys Val Gly Arg Lys Asn Leu Cys Thr Cys Asp 435 440 445 Val Cys Arg Ile Met Arg Arg Leu Arg Ile Lys Asp Cys Phe Ser Asp 450 455 460 Gln Gln Thr Gly Leu Pro Glu Leu Arg Lys Arg Asn Arg Met Asn Pro 465 470 475 480 Tyr Ser Gly Thr Val Ala Glu Glu Ala Leu Phe Asp Thr Glu Val Gly 485 490 495 Thr Glu Gly Ala Thr Phe Pro Phe Val Leu Arg Tyr Arg Ser Glu Asp 500 505 510 Arg Glu Phe Pro Glu Glu Leu Ala Val Val Leu Lys Trp Trp Ser Glu 515 520 525 Gly Arg Ala Phe Leu Ser Gly Gln Ala Ala Thr Gly Lys Gly Arg Phe 530 535 540 Arg Leu Asp Leu Asp Gly Val Tyr Lys Trp Glu Leu Asp Lys Glu Asn 545 550 555 560 Val Asp Leu Tyr Val Arg Glu Lys Gly Phe Arg Gly Arg Glu Asp Glu 565 570 575 Leu Ser Ser Arg Tyr Ala Glu Cys Lys Ile Glu Lys Val Ser Leu Glu 580 585 590 Glu Tyr Val Lys Glu Ala Pro Pro His Pro Tyr Ile Pro Val Glu Phe 595 600 605 Glu Val Glu Val Ala Ser Pro Leu Leu Leu Gly Asp Pro Leu Arg Ala 610 615 620 Leu Thr Val Gly Glu Gly Thr Glu Lys Ala Pro Asp Thr Val Phe Phe 625 630 635 640 Arg Lys Lys Thr Val Asn Gly Glu Lys Leu Cys Phe Lys Ala Glu Ser 645 650 655 Phe Arg Gly Val Leu Arg Thr Ala Ala Gly Arg Ala Lys Gly Leu Leu 660 665 670 Thr Glu Gly His Glu Asp Cys Thr Cys Glu Leu Cys Arg Val Phe Gly 675 680 685 Asn Val His Glu Lys Gly Ala Val Val Val Glu Asp Leu Asp Val Glu 690 695 700 Ser Gly Glu Glu Lys Thr Phe His His Val Ser Ile Asp Arg Phe Leu 705 710 715 720 Ser Gly Ala Lys Glu Lys His Lys Phe Asp Asp Arg Pro Val Val Pro 725 730 735 Asp Phe Asn Thr Pro Ile Val Leu Lys Gly Arg Leu Trp Leu Lys Arg 740 745 750 Glu Val Phe Glu Gly Gly Lys Ser Ala Gly Arg Phe Arg Asp Phe Leu Page 39
Arg Thr Ala Phe Ser His Ile Asn Leu Gly Leu Tyr Pro Leu Gly Ala 770 775 780 Asn Arg Ser Thr Gly Tyr Gly Glu Val Ala Ser Val Arg Ile Ile Ser 785 790 795 800 Gly Pro Asp Trp Leu Lys Pro Lys Asp Tyr Asp Val Pro Glu Gly Thr 805 810 815 Thr His Leu Asn Leu Asp Lys His Ile Lys Arg Tyr Leu Glu Gly Leu 820 825 830 Glu Leu Pro Glu Lys Gly Lys Thr Tyr Leu Pro Tyr Gly Phe Val Pro 835 840 845 Leu Ser Glu Asn Pro Pro Val Arg Thr Ser Ser Pro Pro Thr His Glu 850 855 860 Arg Phe His Arg Gly Lys Phe Thr Gly Trp Met Asp Val Glu Leu Glu 865 870 875 880 Leu Leu Thr Pro Leu Ile Met Pro Asp Ala Glu Arg Ala Ser Asp Asp 885 890 895 Gly Gly His Lys Thr Tyr Pro Phe Leu Arg Leu Gly Asp Val Pro Val 900 905 910 Ile Pro Gly Ser Glu Leu Lys Gly Val Phe Ser Ser Leu Tyr Glu Thr 915 920 925 Leu Thr Asn Ser Cys Met Arg Val Phe Asn Glu Lys Lys Leu Ile Thr 930 935 940 Trp Arg Leu Gly Ala Asp Glu Ala Lys Glu Gly Lys Asp Gly Lys Asp 945 950 955 960 Gly Val Phe Pro Gly Arg Val Val Lys Lys Glu Asp Gly Leu Tyr Ile 965 970 975 Gln Arg Met Lys Glu Leu Arg Leu Pro Val Tyr Asp Asn Lys Lys Ala 980 985 990 Phe Leu Asn Ser Leu Leu Gly Lys Cys Phe Lys Glu His Ser Asp Tyr 995 1000 1005 Asn His Pro Met Arg Pro Glu Val Ile Phe Leu Gly Ala Ala Lys Gly 1010 1015 1020 Val Arg Asp Tyr Leu Glu Asn Leu Glu Lys Glu Asn Pro Lys Ala Phe 1025 1030 1035 1040 Arg Gly Glu Val Cys Leu Ser Trp Arg Lys Ile Pro Met Ser Ser Gly 1045 1050 1055 Glu Phe Asp Ala Leu Ala Ile Leu Lys Asp Gln Ser Val Lys Lys Ile 1060 1065 1070 Val Ala Asp Val Val Gly Phe Val Lys Lys Ala Lys Ala Asn Gly Lys 1075 1080 1085 Ile Lys Glu Lys Ile Leu Glu Cys Ile Asn Lys Asn Met Thr Thr Asn 1090 1095 1100 Lys Lys Trp Leu Lys Glu Ser Tyr Cys Glu Gly Tyr Leu Lys Phe Ser 1105 1110 1115 1120 Gly Pro Asn Val Val Glu Val Lys Asn Leu Asp Lys Arg Gly Asn Glu 1125 1130 1135 Pro Pro Val Pro Gln Phe Trp Arg Lys Ala Tyr His Asn Lys Leu Gln 1140 1145 1150 Cys Gly Lys Glu Ile Glu Val Phe Ser Ser Arg His Arg Gly Asn Leu Page 40
Lys Lys Lys Arg His Ile Pro Val Lys Val Cys Trp Glu Asn Gly Lys 1170 1175 1180 Glu Tyr Arg Met Arg Lys Arg Cys Glu Arg Val Phe Val Arg Asp Gly 1185 1190 1195 1200 Ser Gly Glu Val Leu Lys Ile Pro Leu Glu Val Leu Lys Lys Tyr Glu 1205 1210 1215 Thr Val Leu Glu Ile Tyr Arg Glu Asn Arg Glu Arg Tyr Glu Val Pro 1220 1225 1230 Glu Val Phe Trp Thr Ile Leu Pro Gly Asp Gly Lys Thr Leu Arg Gly 1235 1240 1245 Gly Glu Leu Val Tyr Tyr Met Lys Arg Gly Gly Glu Ile Thr Asp Ile 1250 1255 1260 Met Pro Val Arg Ile Ser Arg Val Ala Asp Asp Lys Pro Met Ile Tyr 1265 1270 1275 1280 Arg Ile Pro Glu Lys His Arg Pro Cys Val Ile Ser Glu Cys Pro Glu 1285 1290 1295 Leu Gly Phe Ser Cys Glu Lys Cys Arg Leu Glu Gly Phe Thr Glu Lys 1300 1305 1310 Arg Trp Phe Met Val Ser Pro Glu Gly Leu Cys Pro Ala Cys Ser Leu 1315 1320 1325 Phe Gly Thr Gln Ile Tyr Arg Ser Arg Val Arg Phe Ser Phe Ala Arg 1330 1335 1340 Ala Asp Glu Tyr Glu Met Leu Gly Asn Val Thr Leu Pro Arg Leu Glu 1345 1350 1355 1360 Ser Pro Arg Pro Asn Trp Leu Ile Lys Arg Asp Asp Ala Phe Gly Val 1365 1370 1375 Phe Gly Arg Lys Phe Tyr Leu His Ser Gly Arg Trp Arg Lys Val Val 1380 1385 1390 Asp Asp Ser Glu Asn Gly Arg Val Ser Arg Thr Glu Asn Asn Ala Thr 1395 1400 1405 Phe Glu Val Met Glu Arg Gly Arg Phe Ser Phe Arg Val Trp Phe Glu 1410 1415 1420 Asn Leu Glu Ser Trp Glu Leu Gly Ala Leu Met Leu Thr Val Ser Gly 1425 1430 1435 1440 Leu Gly Lys Val Val Lys Ile Gly His Ala Lys Pro Leu Gly Phe Gly 1445 1450 1455 Ser Val Lys Ser Arg Val Asn Arg Val Val Leu Phe Arg Ala Asp Glu 1460 1465 1470 Asp Arg Leu Glu Leu Val Lys Asp Gly Gly Lys Phe Ile Asp Glu Ala 1475 1480 1485 Leu Lys Tyr Leu Val Lys Leu Trp Gly Ser Glu Lys Asp Met Lys Arg 1490 1495 1500 Cys Leu Gly Arg Val Val Ser Tyr Leu Asn Pro Asp Phe Glu Val Asp 1505 1510 1515 1520 Val Lys Tyr Pro Asp Phe Lys Gly Tyr Glu Lys Ile Lys Lys Lys Asn 1525 1530 1535 Pro Arg Lys Leu Phe Glu Lys Glu Ala Phe Val Glu Val Pro Arg Val 1540 1545 1550 Glu Gln Lys Pro Asp Ala Page 41
<210> 10 <211> 1812 <212> PRT <213> Candidatus Jettenia <220> <223> RHLA01000020.1 cds KAA0249751.1 <400> 10 Met His Thr Ile Leu Pro Ile His Leu Thr Phe Leu Glu Pro Tyr Arg 1 5 10 15 Leu Ala Glu Trp His Ala Lys Ala Asp Arg Lys Lys Asn Lys Arg Tyr
Leu Arg Gly Met Ser Phe Ala Gln Trp His Lys Asp Lys Asp Gly Ile 40 45 Gly Lys Pro Tyr Ile Thr Gly Thr Leu Leu Arg Ser Ala Val Leu Asn 50 55 60 Ala Ala Glu Glu Leu Ile Ser Leu Asn Gln Gly Met Trp Ala Lys Glu 65 70 75 80 Pro Cys Cys Asn Gly Lys Phe Glu Thr Glu Lys Asp Lys Pro Ala Val 85 90 95 Leu Arg Lys Arg Pro Thr Ile Gln Trp Lys Thr Gly Arg Pro Ala Ile 100 105 110 Cys Asp Pro Glu Lys Gln Glu Lys Lys Asp Ala Cys Pro Leu Cys Met 115 120 125 Leu Leu Gly Arg Phe Asp Lys Ala Gly Lys Arg His Arg Asp Asn Lys 130 135 140 Tyr Asp Lys His Asp Tyr Asp Ile His Phe Asp Asn Leu Asn Leu Ile 145 150 155 160 Thr Asp Lys Lys Phe Ser His Pro Asp Asp Ile Ala Ser Glu Arg Ile 165 170 175 Leu Asn Arg Val Asp Tyr Thr Thr Gly Lys Ala His Asp Tyr Phe Lys 180 185 190 Val Trp Glu Val Asp Asp Asp Gln Trp Trp Gln Phe Thr Gly Thr Ile 195 200 205 Thr Met His Asp Asp Cys Ser Lys Ala Lys Gly Leu Leu Leu Ala Ser 210 215 220 Leu Cys Phe Val Asp Lys Leu Cys Gly Ala Leu Cys Arg Ile Glu Val 225 230 235 240 Thr Gly Asn Asn Ser Gln Asp Glu Asn Lys Glu Tyr Ala His Pro Asp 245 250 255 Thr Gly Ile Ile Thr Ser Leu Asn Leu Lys Tyr Gln Asn Asn Ser Thr 260 265 270 Ile His Gln Asp Ala Val Pro Leu Ser Gly Ser Ala His Asp Asn Asp 275 280 285 Glu Pro Pro Val His Asp Asn Asp Ser Ser Leu Asp Asn Asp Thr Ile 290 295 300 Page 42
Thr Leu Leu Ser Met Lys Ala Lys Glu Ile Val Gly Ala Phe Arg Glu 305 310 315 320 Ser Gly Lys Ile Glu Lys Ala Arg Thr Leu Ala Asp Val Ile Arg Ala 325 330 335 Met Arg Leu Gln Lys Pro Asp Ile Trp Glu Lys Leu Pro Lys Gly Ile 340 345 350 Asn Asp Lys His His Leu Trp Asp Arg Glu Val Asn Gly Lys Lys Leu 355 360 365 Arg Asn Ile Leu Glu Glu Leu Trp Arg Leu Met Asn Lys Arg Asn Ala 370 375 380 Trp Arg Thr Phe Cys Glu Val Leu Gly Asn Glu Leu Tyr Arg Cys Tyr 385 390 395 400 Lys Glu Lys Thr Gly Gly Ile Val Leu Arg Phe Arg Thr Leu Gly Glu 405 410 415 Thr Glu Tyr Tyr Pro Glu Pro Glu Lys Thr Glu Pro Cys Leu Ile Ser 420 425 430 Asp Asn Ser Ile Pro Ile Thr Pro Leu Gly Gly Val Lys Glu Trp Ile 435 440 445 Ile Ile Gly Arg Leu Lys Ala Glu Thr Pro Phe Tyr Phe Gly Val Gln 450 455 460 Ser Ser Phe Asp Ser Thr Gln Asp Asp Leu Asp Leu Val Pro Asp Ile 465 470 475 480 Val Asn Thr Asp Glu Lys Leu Glu Ala Asn Glu Gln Thr Ser Phe Arg 485 490 495 Ile Leu Met Asp Lys Lys Gly Arg Tyr Arg Ile Pro Arg Ser Leu Ile 500 505 510 Arg Gly Val Leu Arg Arg Asp Leu Arg Thr Ala Phe Gly Gly Ser Gly 515 520 525 Cys Ile Val Glu Leu Gly Arg Met Ile Pro Cys Asp Cys Lys Val Cys 530 535 540 Ala Ile Met Arg Lys Ile Thr Val Met Asp Ser Arg Ser Glu Asn Ile 545 550 555 560 Glu Leu Pro Asp Ile Arg Tyr Arg Ile Arg Leu Asn Pro Tyr Thr Ala 565 570 575 Thr Val Asp Glu Gly Ala Leu Phe Asp Met Glu Ile Gly Pro Glu Gly 580 585 590 Ile Thr Phe Pro Phe Val Phe Arg Tyr Arg Gly Glu Asp Ala Leu Pro 595 600 605 Arg Glu Leu Trp Ser Val Ile Arg Tyr Trp Met Asp Gly Met Ala Trp 610 615 620 Leu Gly Gly Ser Gly Ser Thr Gly Lys Gly Arg Phe Ala Leu Ile Asp 625 630 635 640 Ile Lys Val Phe Glu Trp Asp Leu Cys Asn Glu Glu Gly Leu Lys Ala 645 650 655 Tyr Ile Cys Ser Arg Gly Leu Arg Gly Ile Glu Lys Glu Val Leu Leu 660 665 670 Glu Asn Lys Thr Ile Ala Glu Ile Thr Asn Leu Phe Lys Thr Glu Glu 675 680 685 Val Lys Phe Phe Glu Ser Tyr Ser Lys His Ile Lys Gln Leu Cys His 690 695 700
Glu Cys Ile Ile Asn Gln Ile Ser Phe Leu Trp Gly Leu Arg Ser Tyr 705 710 715 720 Tyr Glu Tyr Leu Gly Pro Leu Trp Thr Glu Val Lys Tyr Glu Ile Lys 725 730 735 Ile Ala Ser Pro Leu Leu Ser Ser Asp Thr Ile Ser Ala Leu Leu Asn 740 745 750 Lys Asp Asn Ile Asp Cys Ile Ala Tyr Glu Lys Arg Lys Trp Glu Asn 755 760 765 Gly Gly Ile Lys Phe Val Pro Thr Ile Lys Gly Glu Thr Ile Arg Gly 770 775 780 Ile Val Arg Met Ala Val Gly Lys Arg Ser Gly Asp Leu Gly Met Asp 785 790 795 800 Asp His Glu Asp Cys Ser Cys Thr Leu Cys Thr Ile Phe Gly Asn Glu 805 810 815 His Glu Ala Gly Lys Leu Arg Phe Glu Asp Leu Glu Val Val Glu Glu 820 825 830 Lys Leu Pro Ser Glu Gln Asn Ser Asp Ser Asn Lys Ile Pro Phe Gly 835 840 845 Pro Val Gln Asp Gly Asp Gly Asn Arg Glu Lys Glu Cys Val Thr Ala 850 855 860 Val Lys Ser Tyr Lys Lys Lys Leu Ile Asp His Val Ala Ile Asp Arg 865 870 875 880 Phe His Gly Gly Ala Glu Asp Lys Met Lys Phe Asn Thr Leu Pro Leu 885 890 895 Ala Gly Ser Phe Glu Lys Pro Ile Ile Leu Lys Gly Arg Phe Trp Ile 900 905 910 Lys Lys Asp Ile Val Lys Asp Tyr Lys Lys Lys Ile Glu Asp Ala Met 915 920 925 Val Asp Ile Arg Asp Gly Leu Tyr Pro Ile Gly Gly Lys Thr Gly Ile 930 935 940 Gly Tyr Gly Trp Val Thr Asp Leu Thr Ile Leu Asn Pro Gln Ser Gly 945 950 955 960 Phe Gln Ile Pro Val Lys Lys Asp Ile Ser Pro Glu Pro Gly Thr Tyr 965 970 975 Ser Thr Tyr Pro Ser His Ser Thr Pro Ser Leu Asn Lys Gly His Ile 980 985 990 Tyr Tyr Pro His Tyr Phe Leu Ala Pro Ala Asn Thr Val His Arg Glu 995 1000 1005 Gln Glu Met Ile Gly His Glu Gln Phe His Lys Glu Gln Lys Gly Glu 1010 1015 1020 Leu Leu Val Ser Gly Lys Ile Val Cys Thr Leu Lys Thr Val Thr Pro 1025 1030 1035 1040 Leu Ile Ile Pro Asp Thr Glu Asn Glu Asp Ala Phe Gly Leu Gln Asn 1045 1050 1055 Thr Tyr Ser Gly His Lys Asn Tyr Gln Phe Phe His Ile Asn Asp Glu 1060 1065 1070 Ile Met Val Pro Gly Ser Glu Ile Arg Gly Met Ile Ser Ser Val Tyr 1075 1080 1085 Glu Ala Ile Thr Asn Ser Cys Phe Arg Val Tyr Asp Glu Thr Lys Tyr 1090 1095 1100 Page 44
Ile Thr Arg Arg Leu Ser Pro Glu Lys Lys Asp Glu Ser Asn Asp Lys 1105 1110 1115 1120 Asn Lys Ser Gln Asp Asp Ala Ser Gln Lys Ile Arg Lys Gly Leu Val 1125 1130 1135 Lys Lys Thr Asp Glu Gly Phe Ser Ile Ile Glu Val Glu Arg Tyr Ser 1140 1145 1150 Met Lys Thr Lys Gly Gly Thr Lys Leu Val Asp Lys Val Tyr Arg Leu 1155 1160 1165 Pro Leu Tyr Asp Ser Glu Ala Val Ile Ala Ser Ile Gln Phe Glu Gln 1170 1175 1180 Tyr Gly Glu Lys Asn Glu Lys Arg Asn Ala Lys Ile Arg Ala Ala Ile 1185 1190 1195 1200 Lys Arg Asn Glu Val Ile Ala Glu Val Ala Arg Lys Asn Leu Ile Phe 1205 1210 1215 Leu Arg Ser Leu Thr Pro Glu Glu Leu Lys Lys Val Leu Gln Gly Glu 1220 1225 1230 Ile Leu Val Lys Phe Ser Leu Lys Ser Gly Lys Asn Pro Asn Asp Tyr 1235 1240 1245 Leu Ala Glu Leu His Glu Asn Gly Thr Glu Arg Gly Leu Ile Lys Phe 1250 1255 1260 Thr Gly Leu Asn Met Val Asn Ile Lys Asn Val Asn Glu Glu Asp Lys 1265 1270 1275 1280 Asp Phe Asn Asp Thr Trp Asp Trp Glu Lys Leu Asn Ile Phe His Asn 1285 1290 1295 Ala His Glu Lys Arg Asn Ser Leu Lys Gln Gly Tyr Pro Arg Pro Val 1300 1305 1310 Leu Lys Phe Ile Lys Asp Arg Val Glu Tyr Thr Ile Pro Lys Arg Cys 1315 1320 1325 Glu Arg Ile Phe Cys Ile Pro Val Lys Asn Thr Ile Glu Tyr Lys Val 1330 1335 1340 Ser Ser Lys Val Cys Lys Gln Tyr Lys Asp Val Leu Ser Asp Tyr Glu 1345 1350 1355 1360 Lys Asn Phe Gly His Ile Asn Lys Ile Phe Thr Thr Lys Ile Gln Lys 1365 1370 1375 Arg Glu Leu Thr Asp Gly Asp Leu Val Tyr Phe Ile Pro Asn Glu Gly 1380 1385 1390 Ala Asp Lys Thr Val Gln Ala Ile Met Pro Val Pro Leu Ser Arg Ile 1395 1400 1405 Thr Asp Ser Arg Thr Leu Gly Glu Arg Leu Pro His Lys Asn Leu Leu 1410 1415 1420 Pro Cys Val His Glu Val Asn Glu Gly Leu Leu Ser Gly Ile Leu Asp 1425 1430 1435 1440 Ser Leu Asp Lys Lys Leu Leu Ser Ile His Pro Glu Gly Leu Cys Pro 1445 1450 1455 Thr Cys Arg Leu Phe Gly Thr Thr Tyr Tyr Lys Gly Arg Val Arg Phe 1460 1465 1470 Gly Phe Ala Asn Leu Met Asn Lys Pro Lys Trp Leu Thr Glu Arg Glu 1475 1480 1485 Asn Gly Cys Gly Gly Tyr Val Thr Leu Pro Leu Leu Glu Arg Pro Arg 1490 1495 1500 
Leu Thr Trp Ser Val Pro Ser Asp Lys Cys Asp Val Pro Gly Arg Lys 1505 1510 1515 1520 Phe Tyr Ile His His Asn Gly Trp Gln Glu Val Leu Arg Asn Asn Asp 1525 1530 1535 Ile Thr Pro Lys Thr Glu Asn Asn Arg Thr Val Glu Pro Leu Ala Ala 1540 1545 1550 Asp Asn Arg Phe Thr Phe Asp Val Tyr Phe Glu Asn Leu Arg Glu Trp 1555 1560 1565 Glu Leu Gly Leu Leu Cys Tyr Cys Leu Glu Leu Glu Pro Gly Met Gly 1570 1575 1580 His Lys Leu Gly Met Gly Lys Pro Met Gly Phe Gly Ser Val Lys Ile 1585 1590 1595 1600 Ala Ile Glu Arg Leu Gln Thr Phe Thr Val His Gln Asp Gly Ile Asn 1605 1610 1615 Trp Lys Pro Ser Glu Asn Glu Ile Gly Val Tyr Val Gln Lys Gly Arg 1620 1625 1630 Glu Lys Leu Val Glu Trp Phe Thr Pro Ser Ala Pro His Lys Asn Met 1635 1640 1645 Glu Trp Asn Gly Val Lys His Ile Lys Asp Leu Arg Ser Leu Leu Ser 1650 1655 1660 Ile Pro Gly Asp Lys Pro Thr Val Lys Tyr Pro Thr Leu Asn Lys Asp 1665 1670 1675 1680 Ala Glu Gly Ala Ile Ser Asp Tyr Thr Tyr Glu Arg Leu Ser Asp Thr 1685 1690 1695 Lys Leu Leu Pro His Asp Lys Arg Val Glu Tyr Leu Arg Thr Pro Trp 1700 1705 1710 Ser Pro Trp Asn Ala Phe Val Lys Glu Ala Glu Tyr Ser Pro Ser Glu 1715 1720 1725 Lys Ser Asp Glu Lys Gly Arg Glu Thr Ile Arg Thr Lys Pro Lys Ser 1730 1735 1740 Leu Pro Ser Val Lys Ser Ile Gly Lys Val Lys Trp Phe Asp Glu Gly 1745 1750 1755 1760 Lys Gly Phe Gly Ile Leu Ile Met Asp Asp Gly Lys Glu Val Ser Ile 1765 1770 1775 Ser Lys Asn Ser Ile Arg Gly Asn Ile Leu Leu Lys Lys Gly Gln Lys 1780 1785 1790 Val Thr Phe His Ile Val Gln Gly Leu Ile Pro Lys Ala Glu Asp Ile 1795 1800 1805 Glu Ile Ala Lys 1810 <210> 11 <211> 1812 <212> PRT <213> Candidatus Jettenia <220> <223> QUCE01000002.1 cds MBC6927464.1 Page 46
<400> 11 Met His Thr Ile Leu Pro Ile His Leu Thr Phe Leu Glu Pro Tyr Arg 1 5 10 15 Leu Ala Glu Trp His Ala Lys Ala Asp Arg Lys Lys Asn Lys Arg Tyr
Leu Arg Gly Met Ser Phe Ala Gln Trp His Lys Asp Lys Asp Gly Ile 40 45 Gly Lys Pro Tyr Ile Thr Gly Thr Leu Leu Arg Ser Ala Val Leu Asn 50 55 60 Ala Ala Glu Glu Leu Ile Ser Leu Asn Gln Gly Met Trp Ala Lys Glu 65 70 75 80 Pro Cys Cys Asn Gly Lys Phe Glu Thr Glu Lys Asp Lys Pro Ala Val 85 90 95 Leu Arg Lys Arg Pro Thr Ile Gln Trp Lys Thr Gly Arg Pro Ala Ile 100 105 110 Cys Asp Pro Glu Lys Gln Glu Lys Lys Asp Ala Cys Pro Leu Cys Met 115 120 125 Leu Leu Gly Arg Phe Asp Lys Ala Gly Lys Arg His Arg Asp Asn Lys 130 135 140 Tyr Asp Lys His Asp Tyr Asp Ile His Phe Asp Asn Leu Asn Leu Ile 145 150 155 160 Thr Asp Lys Lys Phe Ser His Pro Asp Asp Ile Ala Ser Glu Arg Ile 165 170 175 Leu Asn Arg Val Asp Tyr Thr Thr Gly Lys Ala His Asp Tyr Phe Lys 180 185 190 Val Trp Glu Val Asp Asp Asp Gln Trp Trp Gln Phe Thr Gly Thr Ile 195 200 205 Thr Met His Asp Asp Cys Ser Lys Ala Lys Gly Leu Leu Leu Ala Ser 210 215 220 Leu Cys Phe Val Asp Lys Leu Cys Gly Ala Leu Cys Arg Ile Glu Val 225 230 235 240 Thr Gly Asn Asn Ser Gln Asp Glu Asn Lys Glu Tyr Ala His Pro Asp 245 250 255 Thr Gly Ile Ile Thr Ser Leu Asn Leu Lys Tyr Gln Asn Asn Ser Thr 260 265 270 Ile His Gln Asp Ala Val Pro Leu Ser Gly Ser Ala His Asp Asn Asp 275 280 285 Glu Pro Pro Val His Asp Asn Asp Ser Ser Leu Asp Asn Asp Thr Ile 290 295 300 Thr Leu Leu Ser Met Lys Ala Lys Glu Ile Val Gly Ala Phe Arg Glu 305 310 315 320 Ser Gly Lys Ile Glu Lys Ala Arg Thr Leu Ala Asp Val Ile Arg Ala 325 330 335 Met Arg Leu Gln Lys Pro Asp Ile Trp Glu Lys Leu Pro Lys Gly Ile 340 345 350 Asn Asp Lys His His Leu Trp Asp Arg Glu Val Asn Gly Lys Lys Leu 355 360 365 Arg Asn Ile Leu Glu Glu Leu Trp Arg Leu Met Asn Lys Arg Asn Ala 370 375 380 Trp Arg Thr Phe Cys Glu Val Leu Gly Asn Glu Leu Tyr Arg Cys Tyr Page 47
Lys Glu Lys Thr Gly Gly Ile Val Leu Arg Phe Arg Thr Leu Gly Glu 405 410 415 Thr Glu Tyr Tyr Pro Glu Pro Glu Lys Thr Glu Pro Cys Leu Ile Ser 420 425 430 Asp Asn Ser Ile Pro Ile Thr Pro Leu Gly Gly Val Lys Glu Trp Ile 435 440 445 Ile Ile Gly Arg Leu Lys Ala Glu Thr Pro Phe Tyr Phe Gly Val Gln 450 455 460 Ser Ser Phe Asp Ser Thr Gln Asp Asp Leu Asp Leu Val Pro Asp Ile 465 470 475 480 Val Asn Thr Asp Glu Lys Leu Glu Ala Asn Glu Gln Thr Ser Phe Arg 485 490 495 Ile Leu Met Asp Lys Lys Gly Arg Tyr Arg Ile Pro Arg Ser Leu Ile 500 505 510 Arg Gly Val Leu Arg Arg Asp Leu Arg Thr Ala Phe Gly Gly Ser Gly 515 520 525 Cys Ile Val Glu Leu Gly Arg Met Ile Pro Cys Asp Cys Lys Val Cys 530 535 540 Ala Ile Met Arg Lys Ile Thr Val Met Asp Ser Arg Ser Glu Asn Ile 545 550 555 560 Glu Leu Pro Asp Ile Arg Tyr Arg Ile Arg Leu Asn Pro Tyr Thr Ala 565 570 575 Thr Val Asp Glu Gly Ala Leu Phe Asp Met Glu Ile Gly Pro Glu Gly 580 585 590 Ile Thr Phe Pro Phe Val Phe Arg Tyr Arg Gly Glu Asp Ala Leu Pro 595 600 605 Arg Glu Leu Trp Ser Val Ile Arg Tyr Trp Met Asp Gly Met Ala Trp 610 615 620 Leu Gly Gly Ser Gly Ser Thr Gly Lys Gly Arg Phe Ala Leu Ile Asp 625 630 635 640 Ile Lys Val Phe Glu Trp Asp Leu Cys Asn Glu Glu Gly Leu Lys Ala 645 650 655 Tyr Ile Cys Ser Arg Gly Leu Arg Gly Ile Glu Lys Glu Val Leu Leu 660 665 670 Glu Asn Lys Thr Ile Ala Glu Ile Thr Asn Leu Phe Lys Thr Glu Glu 675 680 685 Val Lys Phe Phe Glu Ser Tyr Ser Lys His Ile Lys Gln Leu Cys His 690 695 700 Glu Cys Ile Ile Asn Gln Ile Ser Phe Leu Trp Gly Leu Arg Ser Tyr 705 710 715 720 Tyr Glu Tyr Leu Gly Pro Leu Trp Thr Glu Val Lys Tyr Glu Ile Lys 725 730 735 Ile Ala Ser Pro Leu Leu Ser Ser Asp Thr Ile Ser Ala Leu Leu Asn 740 745 750 Lys Asp Asn Ile Asp Cys Ile Ala Tyr Glu Lys Arg Lys Trp Glu Asn 755 760 765 Gly Gly Ile Lys Phe Val Pro Thr Ile Lys Gly Glu Thr Ile Arg Gly 770 775 780 Ile Val Arg Met Ala Val Gly Lys Arg Ser Gly Asp Leu Gly Met Asp
Asp His Glu Asp Cys Ser Cys Thr Leu Cys Thr Ile Phe Gly Asn Glu 805 810 815 His Glu Ala Gly Lys Leu Arg Phe Glu Asp Leu Glu Val Val Glu Glu 820 825 830 Lys Leu Pro Ser Glu Gln Asn Ser Asp Ser Asn Lys Ile Pro Phe Gly 835 840 845 Pro Val Gln Asp Gly Asp Gly Asn Arg Glu Lys Glu Cys Val Thr Ala 850 855 860 Val Lys Ser Tyr Lys Lys Lys Leu Ile Asp His Val Ala Ile Asp Arg 865 870 875 880 Phe His Gly Gly Ala Glu Asp Lys Met Lys Phe Asn Thr Leu Pro Leu 885 890 895 Ala Gly Ser Phe Glu Lys Pro Ile Ile Leu Lys Gly Arg Phe Trp Ile 900 905 910 Lys Lys Asp Ile Val Lys Asp Tyr Lys Lys Lys Ile Glu Asp Ala Met 915 920 925 Val Asp Ile Arg Asp Gly Leu Tyr Pro Ile Gly Gly Lys Thr Gly Ile 930 935 940 Gly Tyr Gly Trp Val Thr Asp Leu Thr Ile Leu Asn Pro Gln Ser Gly 945 950 955 960 Phe Gln Ile Pro Val Lys Lys Asp Ile Ser Pro Glu Pro Gly Thr Tyr 965 970 975 Ser Thr Tyr Pro Ser His Ser Thr Pro Ser Leu Asn Lys Gly His Ile 980 985 990 Tyr Tyr Pro His Tyr Phe Leu Ala Pro Ala Asn Thr Val His Arg Glu 995 1000 1005 Gln Glu Met Ile Gly His Glu Gln Phe His Lys Glu Gln Lys Gly Glu 1010 1015 1020 Leu Leu Val Ser Gly Lys Ile Val Cys Thr Leu Lys Thr Val Thr Pro 1025 1030 1035 1040 Leu Ile Ile Pro Asp Thr Glu Asn Glu Asp Ala Phe Gly Leu Gln Asn 1045 1050 1055 Thr Tyr Ser Gly His Lys Asn Tyr Gln Phe Phe His Ile Asn Asp Glu 1060 1065 1070 Ile Met Val Pro Gly Ser Glu Ile Arg Gly Met Ile Ser Ser Val Tyr 1075 1080 1085 Glu Ala Ile Thr Asn Ser Cys Phe Arg Val Tyr Asp Glu Thr Lys Tyr 1090 1095 1100 Ile Thr Arg Arg Leu Ser Pro Glu Lys Lys Asp Glu Ser Asn Asp Lys 1105 1110 1115 1120 Asn Lys Ser Gln Asp Asp Ala Ser Gln Lys Ile Arg Lys Gly Leu Val 1125 1130 1135 Lys Lys Thr Asp Glu Gly Phe Ser Ile Ile Glu Val Glu Arg Tyr Ser 1140 1145 1150 Met Lys Thr Lys Gly Gly Thr Lys Leu Val Asp Lys Val Tyr Arg Leu 1155 1160 1165 Pro Leu Tyr Asp Ser Glu Ala Val Ile Ala Ser Ile Gln Phe Glu Gln 1170 1175 1180 Tyr Gly Glu Lys Asn Glu Lys Arg Asn Ala Lys Ile Arg Ala Ala Ile Page 49
Lys Arg Asn Glu Val Ile Ala Glu Val Ala Arg Lys Asn Leu Ile Phe 1205 1210 1215 Leu Arg Ser Leu Thr Pro Glu Glu Leu Lys Lys Val Leu Gln Gly Glu 1220 1225 1230 Ile Leu Val Lys Phe Ser Leu Lys Ser Gly Lys Asn Pro Asn Asp Tyr 1235 1240 1245 Leu Ala Glu Leu His Glu Asn Gly Thr Glu Arg Gly Leu Ile Lys Phe 1250 1255 1260 Thr Gly Leu Asn Met Val Asn Ile Lys Asn Val Asn Glu Glu Asp Lys 1265 1270 1275 1280 Asp Phe Asn Asp Thr Trp Asp Trp Glu Lys Leu Asn Ile Phe His Asn 1285 1290 1295 Ala His Glu Lys Arg Asn Ser Leu Lys Gln Gly Tyr Pro Arg Pro Val 1300 1305 1310 Leu Lys Phe Ile Lys Asp Arg Val Glu Tyr Thr Ile Pro Lys Arg Cys 1315 1320 1325 Glu Arg Ile Phe Cys Ile Pro Val Lys Asn Thr Ile Glu Tyr Lys Val 1330 1335 1340 Ser Ser Lys Val Cys Lys Gln Tyr Lys Asp Val Leu Ser Asp Tyr Glu 1345 1350 1355 1360 Lys Asn Phe Gly His Ile Asn Lys Ile Phe Thr Thr Lys Ile Gln Lys 1365 1370 1375 Arg Glu Leu Thr Asp Gly Asp Leu Val Tyr Phe Ile Pro Asn Glu Gly 1380 1385 1390 Ala Asp Lys Thr Val Gln Ala Ile Met Pro Val Pro Leu Ser Arg Ile 1395 1400 1405 Thr Asp Ser Arg Thr Leu Gly Glu Arg Leu Pro His Lys Asn Leu Leu 1410 1415 1420 Pro Cys Val His Glu Val Asn Glu Gly Leu Leu Ser Gly Ile Leu Asp 1425 1430 1435 1440 Ser Leu Asp Lys Lys Leu Leu Ser Ile His Pro Glu Gly Leu Cys Pro 1445 1450 1455 Thr Cys Arg Leu Phe Gly Thr Thr Tyr Tyr Lys Gly Arg Val Arg Phe 1460 1465 1470 Gly Phe Ala Asn Leu Met Asn Lys Pro Lys Trp Leu Thr Glu Arg Glu 1475 1480 1485 Asn Gly Cys Gly Gly Tyr Val Thr Leu Pro Leu Leu Glu Arg Pro Arg 1490 1495 1500 Leu Thr Trp Ser Val Pro Ser Asp Lys Cys Asp Val Pro Gly Arg Lys 1505 1510 1515 1520 Phe Tyr Ile His His Asn Gly Trp Gln Glu Val Leu Arg Asn Asn Asp 1525 1530 1535 Ile Thr Pro Lys Thr Glu Asn Asn Arg Thr Val Glu Pro Leu Ala Ala 1540 1545 1550 Asp Asn Arg Phe Thr Phe Asp Val Tyr Phe Glu Asn Leu Arg Glu Trp 1555 1560 1565 Glu Leu Gly Leu Leu Cys Tyr Cys Leu Glu Leu Glu Pro Gly Met Gly 1570 1575 1580 His Lys Leu Gly Met Gly Lys Pro Met Gly Phe Gly Ser Val Lys Ile Page 50
Ala Ile Glu Arg Leu Gln Thr Phe Thr Val His Gln Asp Gly Ile Asn 1605 1610 1615 Trp Lys Pro Ser Glu Asn Glu Ile Gly Val Tyr Val Gln Lys Gly Arg 1620 1625 1630 Glu Lys Leu Val Glu Trp Phe Thr Pro Ser Ala Pro His Lys Asn Met 1635 1640 1645 Glu Trp Asn Gly Val Lys His Ile Lys Asp Leu Arg Ser Leu Leu Ser 1650 1655 1660 Ile Pro Gly Asp Lys Pro Thr Val Lys Tyr Pro Thr Leu Asn Lys Asp 1665 1670 1675 1680 Ala Glu Gly Ala Ile Ser Asp Tyr Thr Tyr Glu Arg Leu Ser Asp Thr 1685 1690 1695 Lys Leu Leu Pro His Asp Lys Arg Val Glu Tyr Leu Arg Thr Pro Trp 1700 1705 1710 Ser Pro Trp Asn Ala Phe Val Lys Glu Ala Glu Tyr Ser Pro Ser Glu 1715 1720 1725 Lys Ser Asp Glu Lys Gly Arg Glu Thr Ile Arg Thr Lys Pro Lys Ser 1730 1735 1740 Leu Pro Ser Val Lys Ser Ile Gly Lys Val Lys Trp Phe Asp Glu Gly 1745 1750 1755 1760 Lys Gly Phe Gly Ile Leu Ile Met Asp Asp Gly Lys Glu Val Ser Ile 1765 1770 1775 Ser Lys Asn Ser Ile Arg Gly Asn Ile Leu Leu Lys Lys Gly Gln Lys 1780 1785 1790 Val Thr Phe His Ile Val Gln Gly Leu Ile Pro Lys Ala Glu Asp Ile 1795 1800 1805 Glu Ile Ala Lys 1810 <210> 12 <211> 1659 <212> PRT <213> Desulfobacteraceae <220> <223> NBMKO01000156.1 cds 0QY58162.1 <400> 12 Met Lys Ile Thr Leu Arg Phe Leu Glu Pro Phe Arg Met Leu Asp Trp 1 5 10 15 Ile Arg Pro Glu Glu Arg Ile Ser Gly Asn Lys Ala Phe Gln Arg Gly
Leu Thr Phe Ala Arg Trp His Lys Ser Lys Ala Asp Asp Lys Gly Lys 40 45 Pro Phe Ile Thr Gly Thr Leu Leu Arg Ser Ala Val Ile Arg Ala Ala 50 55 60 Glu His Leu Leu Val Leu Ser Lys Gly Lys Val Gly Glu Lys Ala Cys 65 70 75 80
Cys Pro Gly Lys Phe Leu Thr Glu Thr Asp Thr Glu Thr Asn Lys Ala 85 90 95 Pro Thr Met Phe Leu Arg Lys Arg Pro Thr Leu Lys Trp Thr Asp Arg 100 105 110 Lys Gly Cys Asp Pro Asp Phe Pro Cys Pro Leu Cys Glu Leu Leu Gly 115 120 125 Pro Gly Ala Val Gly Lys Lys Glu Gly Glu Ala Gly Ile Asn Ser Tyr 130 135 140 Val Asn Phe Gly Asn Leu Ser Phe Pro Gly Asp Thr Gly Tyr Ser Asn 145 150 155 160 Ala Arg Glu Ile Ala Val Arg Arg Val Val Asn Arg Val Asp Tyr Ala 165 170 175 Ser Gly Lys Ala His Asp Phe Phe Arg Ile Phe Glu Val Asp His Ile 180 185 190 Ala Phe Pro Cys Phe His Gly Glu Ile Ala Phe Gly Glu Asn Val Ser 195 200 205 Ser Gln Ala Arg Asn Leu Leu Gln Asp Ser Leu Arg Phe Thr Asp Arg 210 215 220 Leu Cys Gly Ala Leu Cys Val Ile Arg Tyr Asp Gly Asp Ile Pro Lys 225 230 235 240 Cys Gly Lys Thr Ala Pro Leu Pro Glu Thr Glu Ser Ile Gln Asn Ala 245 250 255 Ala Glu Glu Thr Ala Arg Ala Ile Val Arg Val Phe His Gly Gly Arg 260 265 270 Lys Asp Pro Glu Gln Ala Gln Ile Asp Lys Ala Glu Gln Ile Gln Leu 275 280 285 Leu Ser Ala Ala Val Arg Glu Leu Gly Arg Asp Lys Lys Lys Val Ser 290 295 300 Ala Leu Pro Leu Asn His Glu Gly Lys Glu Asp His Tyr Leu Trp Asp 305 310 315 320 Lys Lys Ala Gly Gly Glu Thr Ile Arg Thr Ile Leu Lys Ala Ala Ala 325 330 335 Glu Lys Glu Ala Val Ala Asn Gln Trp Arg Gln Phe Cys Ile Glu Leu 340 345 350 Ser Glu Glu Leu Tyr Lys Glu Ala Lys Lys Ala His Gly Gly Leu Glu 355 360 365 Pro Ala Arg Arg Ile Met Gly Asp Ala Glu Phe Ser Asp Lys Ser Val 370 375 380 Pro Asp Thr Val Ser His Ser Ile Gly Ile Ser Val Glu Lys Glu Thr 385 390 395 400 Ile Ile Met Gly Thr Leu Lys Ala Glu Thr Pro Phe Phe Phe Gly Ile 405 410 415 Glu Ser Lys Glu Lys Lys Gln Thr Asp Leu Met Leu Leu Leu Asp Gly 420 425 430 Gln Asn His Tyr Arg Ile Pro Arg Ser Ala Leu Arg Gly Ile Leu Arg 435 440 445 Arg Asp Ile Arg Ser Val Leu Gly Thr Gly Cys Asn Ala Glu Val Gly 450 455 460 Gly Arg Pro Cys Leu Cys Pro Val Cys Arg Ile Met Lys Asn Ile Thr 465 470 475 480 Page 52
Val Met Asp Thr Arg Ser Ser Thr Asp Thr Leu Pro Glu Val Arg Pro 485 490 495 Arg Ile Arg Leu Asn Pro Phe Thr Gly Ser Val Gln Glu Lys Ala Leu 500 505 510 Phe Asn Met Glu Met Gly Thr Glu Gly Ile Glu Phe Pro Phe Val Leu 515 520 525 Ser Tyr Arg Gly Lys Lys Thr Leu Pro Lys Glu Leu Arg Asn Val Leu 530 535 540 Asn Trp Trp Thr Glu Gly Lys Ala Phe Leu Gly Gly Ala Ala Ser Thr 545 550 555 560 Gly Lys Ser Ile Phe Gln Leu Ser Asp Ile His Ala Phe Ser Ser Asp 565 570 575 Leu Ser Asp Glu Thr Ala Arg Glu Ser Tyr Leu Ser Asn His Gly Trp 580 585 590 Arg Gly Ile Met Glu Asn Ser Ile Val His Glu Ser Pro Leu Glu Gly 595 600 605 Gly Ala Gly Gly Cys Ser Phe Gly Leu Ser Asp Leu Pro Lys Leu Gly 610 615 620 Trp His Ala Glu Asp Leu Lys Leu Ser Asp Ile Glu Lys Tyr Lys Pro 625 630 635 640 Phe His Arg Gln Lys Ile Ser Val Lys Ile Thr Leu Asn Ser Pro Phe 645 650 655 Leu Asn Gly Asp Pro Val Arg Ala Leu Thr Glu Asp Val Ala Asp Ile 660 665 670 Val Ser Phe Lys Lys Tyr Thr Gln Gly Gly Glu Lys Ile Ile Tyr Ala 675 680 685 Tyr Lys Ser Glu Ser Phe Arg Gly Val Val Arg Thr Ala Leu Gly Leu 690 695 700 Arg Asn Gln Gly Asn Asp Asp Ile Thr Gly Lys Lys Asn Val Pro Leu 705 710 715 720 Ile Ala Leu Thr His Gln Asp Cys Glu Cys Met Leu Cys Arg Phe Phe 725 730 735 Gly Ser Glu Tyr Glu Ala Gly Arg Leu Tyr Phe Glu Asp Leu Thr Phe 740 745 750 Glu Ser Glu Pro Glu Pro Arg Arg Phe Asp His Val Ala Ile Asp Arg 755 760 765 Phe Thr Gly Gly Ala Val Asn Gln Lys Lys Phe Asp Asp Arg Ser Leu 770 775 780 Val Pro Gly Lys Glu Gly Phe Met Thr Leu Ile Gly Cys Phe Trp Met 785 790 795 800 Arg Lys Asp Lys Glu Leu Ser Arg Asn Glu Ile Glu Glu Leu Gly Lys 805 810 815 Ala Phe Ala Asp Ile Arg Asp Gly Leu Tyr Pro Leu Gly Ala Lys Gly 820 825 830 Ser Met Gly Tyr Gly Gln Val Ala Glu Leu Ser Ile Val Asp Asp Glu 835 840 845 Asp Ser Asp Asp Glu Asn Asn Pro Ala Lys Leu Leu Ala Glu Ser Met 850 855 860 Lys Asn Ala Ser Pro Ser Leu Gly Thr Pro Thr Ser Leu Lys Lys Lys 865 870 875 880 Page 53
Asp Ala Gly Leu Ser Leu Arg Phe Asp Glu Asn Ala Asp Tyr Tyr Pro 885 890 895 Tyr Tyr Phe Leu Glu Pro Glu Lys Ser Val His Arg Asp Pro Val Pro 900 905 910 Pro Gly His Glu Glu Ala Phe Arg Gly Gly Leu Leu Thr Gly Arg Ile 915 920 925 Thr Cys Arg Leu Thr Val Arg Thr Pro Leu Ile Val Pro Asn Thr Glu 930 935 940 Thr Asp Asp Ala Phe Asn Met Lys Glu Lys Ala Gly Lys Lys Lys Asp 945 950 955 960 Ala Tyr His Lys Ser Tyr Arg Phe Phe Thr Leu Asn Arg Val Pro Met 965 970 975 Ile Pro Gly Ser Glu Ile Arg Gly Met Ile Ser Ser Val Phe Glu Ala 980 985 990 Leu Ser Asn Ser Cys Phe Arg Ile Phe Asp Glu Lys Tyr Arg Leu Ser 995 1000 1005 Trp Arg Met Asp Ala Asp Val Lys Glu Leu Glu Gln Phe Lys Pro Gly 1010 1015 1020 Arg Val Ala Asp Asp Gly Lys Arg Ile Glu Glu Met Lys Glu Ile Arg 1025 1030 1035 1040 Tyr Pro Phe Tyr Asp Arg Thr Tyr Pro Glu Arg Asn Ala Gln Asn Gly 1045 1050 1055 Tyr Phe Arg Trp Asp Ala Arg Ile Ser Leu Thr Asp Asn Ser Met Arg 1060 1065 1070 Lys Met Glu Lys Asp Gly Val Pro Arg Asn Val Ile Tyr Lys Leu Asn 1075 1080 1085 Thr Leu Lys Asn Lys Ala Tyr Lys Ser Glu Lys Ser Phe Leu Phe Asp 1090 1095 1100 Leu Lys Asn Lys Ala Gly Gly Val Gly Arg Tyr Lys Lys Leu Val Leu 1105 1110 1115 1120 Lys His Ala Glu Val Arg Gly Gly Glu Ile Pro Tyr Tyr Ser His Pro 1125 1130 1135 Thr Pro Thr Asp Cys Lys Leu Leu Ser Leu Val Gly Pro Asn Arg Gln 1140 1145 1150 Leu Cys Arg Gln Asp Thr Leu Val Gln Tyr Arg Ile Ile Lys His Arg 1155 1160 1165 Arg Gly Ala Lys Pro Glu Glu Asp Phe Met Phe Val Gly Thr Pro Ser 1170 1175 1180 Glu Asn Gln Lys Gly His Lys Glu Asn Asn Asp His Gly Gly Gly Tyr 1185 1190 1195 1200 Leu Lys Ile Ser Gly Pro Asn Lys Ile Glu Lys Glu Asn Val Leu Thr 1205 1210 1215 Ser Gly Val Pro Ser Val Pro Glu Asn Met Gly Ala Val Val His Asn 1220 1225 1230 Cys Pro Pro Arg Leu Val Glu Val Thr Val Arg Cys Gly Arg Lys Gln 1235 1240 1245 Glu Glu Glu Cys Lys Arg Lys Arg Leu Val Pro Glu Tyr Val Cys Ala 1250 1255 1260 Asp Pro Glu Lys Lys Val Thr Tyr Thr Met Thr Lys Arg Cys Glu Arg 1265 1270 1275 1280 Page 54
Ile Phe Leu Glu Lys Ser Arg Arg Ile Ile Pro Phe Thr Asn Asp Ala 1285 1290 1295 Val Asp Lys Phe Glu Ile Leu Val Lys Glu Tyr Arg Arg Asn Ala Glu 1300 1305 1310 Gln Gln Asp Thr Pro Glu Ala Phe Gln Thr Ile Leu Pro Glu Asn Gly 1315 1320 1325 Thr Val Asn Pro Gly Asp Leu Leu Tyr Phe Arg Glu Glu Lys Gly Lys 1330 1335 1340 Ala Ala Glu Ile Val Pro Val Arg Ile Ser Arg Lys Val Asp Asp Arg 1345 1350 1355 1360 His Ile Gly Lys Arg Ile Asp Pro Glu Leu Arg Pro Cys His Gly Glu 1365 1370 1375 Trp Ile Glu Asp Gly Asp Leu Ser Lys Leu Asp Ala Tyr Pro Ala Glu 1380 1385 1390 Lys Lys Leu Leu Thr Arg His Pro Lys Gly Leu Cys Pro Ala Cys Arg 1395 1400 1405 Val Phe Gly Thr Gly Ser Tyr Lys Ser Arg Val Arg Phe Gly Phe Ala 1410 1415 1420 Ala Leu Lys Gly Thr Pro Lys Trp Leu Lys Glu Asp Pro Ala Glu Pro 1425 1430 1435 1440 Ser Gln Gly Lys Gly Ile Thr Leu Pro Leu Leu Glu Arg Pro Arg Pro 1445 1450 1455 Thr Trp Ala Val Leu His Asn Asp Lys Glu Asn Ser Glu Ile Pro Gly 1460 1465 1470 Arg Lys Phe Tyr Val His His Asn Gly Trp Lys Gly Ile Ser Glu Gly 1475 1480 1485 Ile His Pro Ile Ser Gly Glu Asn Ile Glu Pro Asp Glu Asn Asn Arg 1490 1495 1500 Thr Val Glu Val Leu Asp Lys Gly Asn Arg Phe Val Phe Glu Leu Ser 1505 1510 1515 1520 Phe Glu Asn Leu Glu Pro Arg Glu Leu Gly Leu Leu Ile His Ser Leu 1525 1530 1535 Gln Leu Glu Lys Gly Leu Ala His Lys Leu Gly Met Ala Lys Ser Met 1540 1545 1550 Gly Phe Gly Ser Val Glu Ile Asp Val Glu Ser Val Arg Val Lys His 1555 1560 1565 Arg Ser Gly Glu Trp Asp Tyr Lys Asp Gly Glu Thr Val Asp Gly Trp 1570 1575 1580 Ile Glu Glu Gly Lys Arg Gly Val Ala Ala Lys Gly Lys Ala Asn Asp 1585 1590 1595 1600 Leu Arg Lys Leu Leu Tyr Leu Pro Gly Glu Lys Gln Asn Pro His Val 1605 1610 1615 His Tyr Pro Thr Leu Lys Lys Glu Lys Lys Gly Asp Pro Pro Gly Tyr 1620 1625 1630 Glu Asp Leu Lys Lys Ser Phe Arg Glu Lys Lys Leu Asn Arg Arg Lys 1635 1640 1645 Met Leu Thr Thr Leu Trp Glu Pro Trp His Lys 1650 1655 <210> 13 Page 55
<211> 1599 <212> PRT <213> Desulfobacterales <220> <223> MBF0120744.1 hypothetical protein HQK79 18095 <400> 13 Met Lys Ile Thr Val Lys Phe Leu Glu Pro Phe Arg Leu Leu Glu Trp 1 5 10 15 Ile Lys His Glu Asn Arg Asn Arg Glu Asn Lys Pro Tyr Leu Arg Gly
Gln Ser Phe Ala Arg Trp His Lys Asn Lys Asp Gly Lys Gly Gly Arg 40 45 Pro Tyr Ile Thr Gly Ser Leu Leu Arg Ser Ala Val Ile Gln Ser Ala 50 55 60 Glu Lys Leu Leu Val Leu Ser Gly Gly Lys Ile Asn Asn Lys Ser Cys 65 70 75 80 Cys Pro Gly Glu Phe Ser Thr Lys Asn Asn Asn Ser Val Leu Leu Leu 85 90 95 Arg Gln Arg Ala Thr Phe Lys Trp Thr Asp Asp Lys Leu Cys Asn Ser 100 105 110 Ser Ser Pro Cys Pro Phe Cys Glu Leu Leu Gly Arg His Asp Gln Ala 115 120 125 Gly Lys Asn Ala Lys Lys Glu Asn Gly Val Gln Phe His Ile His Phe 130 135 140 Gly Asn Leu Asn Leu Pro Tyr Asp Lys Leu Tyr Ser Asp Ile Glu Glu 145 150 155 160 Ile Ala Phe Lys Arg Thr Leu Asn Arg Ile Asp Gln Asp Ser Gly Lys 165 170 175 Ala Phe Asp Phe Leu Arg Val Trp Glu Ile Asp Asn Leu Glu Val Pro 180 185 190 Leu Phe Thr Gly Glu Ile Ser Ile Ser Asn Ile Ile Ser Pro Glu Ser 195 200 205 Ile Arg Leu Leu Lys Asp Ser Leu Ser Phe Val Asp Lys Leu Cys Gly 210 215 220 Ser Leu Cys Ile Ile Lys Gln Asp Asp Asn Asp Leu Thr Leu Ser Ile 225 230 235 240 Pro Ser Ile Ser Lys Ser Asp Ile Ser Glu Arg Ala Lys Ile Leu Val 245 250 255 Asp Ala Ile Gly Lys Tyr Asn Glu Ala Asp Lys Ile Arg Ile Met Ala 260 265 270 Asp Ala Met Leu Ala Leu Arg Arg Asp Lys Asn Leu Val Ser Ser Leu 275 280 285 Pro Lys Asp His Asp Asn Lys Glu Asn His Tyr Leu Trp Asp Ile Lys 290 295 300 Glu Gly Asn Ser Lys Ser Ile Arg Leu Ile Leu Lys Glu Gln Ala Asn 305 310 315 320 Ala Leu Asp Asn Gln Gly Trp Arg Asn Leu Cys Glu Gln Ala Gly Gln Page 56
Leu Ile Phe Glu Lys Ala Lys Gln Leu Thr Gly Gly Ile Ser Val Ser 340 345 350 Gln Arg Ile Leu Gly Asp Ile Glu Tyr Leu Ser Glu Pro Glu Leu Ile 355 360 365 Ser Asp Thr Val Phe Ile Ser Ser Val Pro Gln Tyr Glu Thr Ile Ile 370 375 380 Gln Gly Lys Leu Ile Ala Lys Thr Pro Phe Phe Phe Gly Leu Glu Asn 385 390 395 400 Asp Glu Thr Lys Gln Ser Ser Tyr Lys Leu Leu Leu Asp Asn Lys Asn 405 410 415 Ser Tyr Arg Ile Pro Arg Ser Ala Ile Arg Gly Ile Leu Arg Arg Asp 420 425 430 Leu Lys Asn Ile Leu Gly Thr Gly Cys Asn Val Glu Leu Gly Gly Val 435 440 445 Pro Cys Pro Cys Lys Val Cys Ser Ile Met Arg Asn Ile Thr Ile Met 450 455 460 Asp Ser Arg Ser Asn Tyr Ser Glu Pro Pro Glu Ile Arg Asn Arg Ile 465 470 475 480 Arg Ile Asn Thr Tyr Thr Gly Thr Val Asp Glu Gly Ala Leu Phe Asp 485 490 495 Met Glu Val Gly Pro Glu Gly Leu Glu Phe Pro Phe Thr Leu Arg Tyr 500 505 510 Arg Gly Arg Tyr Lys Asn Pro Asp Ser Glu Lys Ile Pro Asp Ser Leu 515 520 525 Glu Lys Val Leu Thr Leu Trp Thr Glu Gly Gln Ala Phe Leu Ser Gly 530 535 540 Ser Ala Ser Thr Gly Lys Gly Arg Phe Lys Ile Glu Asp Ile Lys Tyr 545 550 555 560 Cys Arg Leu Asp Leu Lys Asp Ala Ser Lys Arg Asp Glu Tyr Leu Leu 565 570 575 Asn His Gly Trp Arg Asp Asn Leu Asp Lys Leu Lys Phe Asp Asn Leu 580 585 590 Pro Leu Lys Ile Gln Asn Leu Ile Ala Arg Trp Lys Lys Val Glu Ile 595 600 605 Glu Ile Lys Leu Ala Ser Pro Phe Leu Asn Gly Asp Pro Ile Arg Ala 610 615 620 Leu Leu Glu Ser Asn Ser Gly Asp Ile Val Ser Phe Arg Lys Phe Ile 625 630 635 640 Asn Gly Gly Thr Gln Glu Val Tyr Ala Tyr Lys Ser Glu Ser Phe Lys 645 650 655 Gly Val Val Arg Ala Ala Val Ser Lys Phe Glu Gly Ile Asp Ser Ile 660 665 670 Thr Glu Lys Thr Gly Pro Leu Gly Thr Leu Thr His Gln Asp Cys Ser 675 680 685 Cys Leu Leu Cys Ser Leu Phe Gly Ser Glu Tyr Glu Thr Gly Lys Met 690 695 700 Arg Phe Glu Asp Leu Ile Phe Asp Pro Gln Pro Val Ser Lys Ile Phe 705 710 715 720 Asp His Val Ala Ile Asp Arg Phe Thr Gly Gly Ala Val Asp Lys Lys Page 57
Lys Phe Asp Asp Asn Ser Ile Val Gly Ser His Ser Asn Gln Leu Thr 740 745 750 Leu Lys Gly Leu Phe Trp Ile Arg Asn Asp Ile Thr Asp Glu Glu Tyr 755 760 765 Asn Ala Leu Ser Arg Ala Phe Thr Asp Ile Lys Asn Asn Ile Tyr Pro 770 775 780 Leu Gly Ala Lys Gly Ser Ile Gly Tyr Gly Cys Val Gln Asp Leu Thr 785 790 795 800 Thr Asp Asn Ser Asn Ile Asn Leu Lys Thr Ile Asn Leu Asn Tyr Lys 805 810 815 Pro Ile Pro Gln Lys Ile Asn Ser Asn Ile Lys Ile Asp Phe Asn Asp 820 825 830 Asn Glu Ile Tyr Tyr Pro His Tyr Phe Leu Glu Pro Ser Lys Thr Val 835 840 845 Asn Arg Ile Pro Val Pro Ile Gly His Glu Lys Phe Asp Glu Asn Leu 850 855 860 Leu Thr Gly Lys Ile Thr Cys Thr Leu Asn Thr Leu Ser Pro Leu Ile 865 870 875 880 Val Pro Asp Thr Thr Asn Asp Asn Phe Phe Lys Leu Ala Asp Glu Lys 885 890 895 Glu Lys Ser Glu Gly Lys Pro Tyr His Lys Ser Tyr Asn Phe Phe Ser 900 905 910 Val Asn Arg Asp Ile Ser Ile Pro Gly Ser Glu Leu Arg Gly Met Ile 915 920 925 Ser Ser Val Tyr Glu Ala Val Thr Asn Ser Cys Phe Arg Ile Phe Asp 930 935 940 Glu Lys Tyr Arg Leu Ser Trp Arg Met Asp Val Ser Pro Ala Val Leu 945 950 955 960 Arg Glu Phe Lys Pro Gly Met Val Ile Lys Asp His Asn Asp Leu Lys 965 970 975 Ile Ile Glu Met Glu Glu Phe Arg Tyr Pro Phe Tyr Asp Gln Asn Ile 980 985 990 Gln Asp Ile Glu Ala Gln Asn Lys Tyr Phe Glu Trp Glu Tyr Gly Thr 995 1000 1005 Ile Lys Ile Thr Lys Lys Ser Ile Tyr Glu Leu Glu Lys Leu Ile Glu 1010 1015 1020 Lys Lys Asn Ile Leu Asn Lys Ile Lys Glu Leu Gln Asp Ile Glu Tyr 1025 1030 1035 1040 Lys Ser Glu Tyr Ala Leu Ile Asn Ala Leu Glu Lys Leu Ile Gly Arg 1045 1050 1055 Asn Ser Leu Ala Lys Cys Lys Ser Asn Ile Leu Lys His Ala Glu Arg 1060 1065 1070 Lys Gly Glu Phe Pro Arg Tyr Asp His Pro Thr Asp Thr Asp Arg Met 1075 1080 1085 Met Leu Ser Leu Ser Gly Lys Asn Arg Asn Leu Lys Asn Lys Lys Glu 1090 1095 1100 Lys Ile Glu Tyr Ile Ile Ile Lys Pro Asn Ser Lys Ser Lys Ala Thr 1105 1110 1115 1120 Phe Met Tyr Leu Ala Thr Pro Leu Asn Asn Ile Asp Glu Tyr Glu Asn
Glu Ser Val Ala Arg Lys Ala His Gly Tyr Leu Lys Ile Thr Gly Pro 1140 1145 1150 Asn Lys Ile Glu Lys Glu Asn Val Asp Ser Ile Asp Ser Asn Phe Lys 1155 1160 1165 Pro Val Pro Gln Met Asp Asp Gln Ile Ile Leu Glu Lys Val Trp Leu 1170 1175 1180 Arg Lys Val Phe Val Leu Ser Ala Lys Lys Arg Thr Ser Tyr Arg Asp 1185 1190 1195 1200 Arg Leu Ile Pro Glu Phe Ile Cys Tyr Asp Lys Ile Lys Gly Ile Lys 1205 1210 1215 Tyr Thr Met Asn Lys Arg Ser Glu Arg Ile Phe Val Glu Lys Lys Glu 1220 1225 1230 Arg Ile Lys Lys Glu Ile Thr Gln Gln Ala Ile Glu Lys Phe Glu Ile 1235 1240 1245 Leu Ile Gln Glu Tyr His Lys Asn Ala Glu Gln Gln Gln Thr Pro Glu 1250 1255 1260 Val Phe Arg Thr Ile Leu Pro Gln Asn Gly Thr Ile Asn Asp Gly Asp 1265 1270 1275 1280 Leu Val Tyr Phe Arg Glu Glu Asn Asn Gln Val Val Glu Ile Ile Pro 1285 1290 1295 Val Arg Ile Ser Arg Lys Val Asp Asp Asn Tyr Ile Gly Lys Arg Ile 1300 1305 1310 Asp Glu Gln Leu Arg Pro Cys His Gly Asp Trp Ile Glu Glu Asp Asp 1315 1320 1325 Ile Ser Lys Leu Asn Ala Tyr Pro Glu Lys Arg Leu Phe Thr Arg Asn 1330 1335 1340 Glu Lys Gly Leu Cys Pro Ala Cys Arg Leu Phe Gly Thr Gly Ser Tyr 1345 1350 1355 1360 Lys Gly Arg Val Arg Phe Gly Leu Ala Lys Leu Asp Asn Glu Pro Lys 1365 1370 1375 Trp Leu Met Ser Asn Asp Gly Arg Leu Thr Leu Pro Leu Leu Glu Arg 1380 1385 1390 Pro Arg Pro Thr Trp Ser Ile Pro Asp Asp Lys Lys Glu Asn Lys Val 1395 1400 1405 Leu Gly Arg Lys Phe Tyr Val His His Asp Gly Trp Lys Thr Val Phe 1410 1415 1420 Glu Gly Lys Asn Pro Ser Asn Gly Glu Thr Ile Gln Ser Asn Pro Asn 1425 1430 1435 1440 Asn Arg Thr Val Lys Pro Leu Gly Cys Asp Asn Lys Phe Thr Phe Asp 1445 1450 1455 Ile Tyr Phe Glu Asn Leu Glu Asp Tyr Glu Leu Gly Leu Leu Phe Tyr 1460 1465 1470 Thr Leu Gln Leu Glu Lys Gly Leu Ser His Lys Leu Gly Met Ala Lys 1475 1480 1485 Ser Met Gly Phe Gly Ser Val Glu Ile Asp Ile Lys Asn Ile Ser Leu 1490 1495 1500 Arg Lys Asp Pro Glu Asn Trp Glu Asp Gly Asn Ser Lys Ile Ala Asp 1505 1510 1515 1520 Trp Ile Lys Glu Gly Glu Lys Met Leu Thr Lys Trp Phe Gln Thr Asp Page 59
Phe Asn Thr Ile Glu His Leu Asn Asn Leu Lys Lys Leu Leu Tyr Phe 1540 1545 1550 Ser Gly Asn Lys Asn Leu Lys Val Phe Tyr Pro Thr Leu Lys Lys Glu 1555 1560 1565 Gly Lys Ile Pro Gly Tyr Glu Glu Leu Lys Lys Asp Ile Lys Asp Arg 1570 1575 1580 Lys Lys Met Leu Thr Thr Pro Trp Met Pro Trp His Ser Glu Glu 1585 1590 1595 <210> 14 <211> 1601 <212> PRT <213> Desulfonema ishimotonii <220> <223> WP 124327589.1 hypothetical protein <400> 14 Met Thr Thr Thr Met Lys Ile Ser Ile Glu Phe Leu Glu Pro Phe Arg 1 5 10 15 Met Thr Lys Trp Gln Glu Ser Thr Arg Arg Asn Lys Asn Asn Lys Glu
Phe Val Arg Gly Gln Ala Phe Ala Arg Trp His Arg Asn Lys Lys Asp 40 45 Asn Thr Lys Gly Arg Pro Tyr Ile Thr Gly Thr Leu Leu Arg Ser Ala 50 55 60 Val Ile Arg Ser Ala Glu Asn Leu Leu Thr Leu Ser Asp Gly Lys Ile 65 70 75 80 Ser Glu Lys Thr Cys Cys Pro Gly Lys Phe Asp Thr Glu Asp Lys Asp 85 90 95 Arg Leu Leu Gln Leu Arg Gln Arg Ser Thr Leu Arg Trp Thr Asp Lys 100 105 110 Asn Pro Cys Pro Asp Asn Ala Glu Thr Tyr Cys Pro Phe Cys Glu Leu 115 120 125 Leu Gly Arg Ser Gly Asn Asp Gly Lys Lys Ala Glu Lys Lys Asp Trp 130 135 140 Arg Phe Arg Ile His Phe Gly Asn Leu Ser Leu Pro Gly Lys Pro Asp 145 150 155 160 Phe Asp Gly Pro Lys Ala Ile Gly Ser Gln Arg Val Leu Asn Arg Val 165 170 175 Asp Phe Lys Ser Gly Lys Ala His Asp Phe Phe Lys Ala Tyr Glu Val 180 185 190 Asp His Thr Arg Phe Pro Arg Phe Glu Gly Glu Ile Thr Ile Asp Asn 195 200 205 Lys Val Ser Ala Glu Ala Arg Lys Leu Leu Cys Asp Ser Leu Lys Phe 210 215 220 Thr Asp Arg Leu Cys Gly Ala Leu Cys Val Ile Arg Phe Asp Glu Tyr 225 230 235 240 Page 60
Thr Pro Ala Ala Asp Ser Gly Lys Gln Thr Glu Asn Val Gln Ala Glu 245 250 255 Pro Asn Ala Asn Leu Ala Glu Lys Thr Ala Glu Gln Ile Ile Ser Ile 260 265 270 Leu Asp Asp Asn Lys Lys Thr Glu Tyr Thr Arg Leu Leu Ala Asp Ala 275 280 285 Ile Arg Ser Leu Arg Arg Ser Ser Lys Leu Val Ala Gly Leu Pro Lys 290 295 300 Asp His Asp Gly Lys Asp Asp His Tyr Leu Trp Asp Ile Gly Lys Lys 305 310 315 320 Lys Lys Asp Glu Asn Ser Val Thr Ile Arg Gln Ile Leu Thr Thr Ser 325 330 335 Ala Asp Thr Lys Glu Leu Lys Asn Ala Gly Lys Trp Arg Glu Phe Cys 340 345 350 Glu Lys Leu Gly Glu Ala Leu Tyr Leu Lys Ser Lys Asp Met Ser Gly 355 360 365 Gly Leu Lys Ile Thr Arg Arg Ile Leu Gly Asp Ala Glu Phe His Gly 370 375 380 Lys Pro Asp Arg Leu Glu Lys Ser Arg Ser Val Ser Ile Gly Ser Val 385 390 395 400 Leu Lys Glu Thr Val Val Cys Gly Glu Leu Val Ala Lys Thr Pro Phe 405 410 415 Phe Phe Gly Ala Ile Asp Glu Asp Ala Lys Gln Thr Asp Leu Gln Val 420 425 430 Leu Leu Thr Pro Asp Asn Lys Tyr Arg Leu Pro Arg Ser Ala Val Arg 435 440 445 Gly Ile Leu Arg Arg Asp Leu Gln Thr Tyr Phe Asp Ser Pro Cys Asn 450 455 460 Ala Glu Leu Gly Gly Arg Pro Cys Met Cys Lys Thr Cys Arg Ile Met 465 470 475 480 Arg Gly Ile Thr Val Met Asp Ala Arg Ser Glu Tyr Asn Ala Pro Pro 485 490 495 Glu Ile Arg His Arg Thr Arg Ile Asn Pro Phe Thr Gly Thr Val Ala 500 505 510 Glu Gly Ala Leu Phe Asn Met Glu Val Ala Pro Glu Gly Ile Val Phe 515 520 525 Pro Phe Gln Leu Arg Tyr Arg Gly Ser Glu Asp Gly Leu Pro Asp Ala 530 535 540 Leu Lys Thr Val Leu Lys Trp Trp Ala Glu Gly Gln Ala Phe Met Ser 545 550 555 560 Gly Ala Ala Ser Thr Gly Lys Gly Arg Phe Arg Met Glu Asn Ala Lys 565 570 575 Tyr Glu Thr Leu Asp Leu Ser Asp Glu Asn Gln Arg Asn Asp Tyr Leu 580 585 590 Lys Asn Trp Gly Trp Arg Asp Glu Lys Gly Leu Glu Glu Leu Lys Lys 595 600 605 Arg Leu Asn Ser Gly Leu Pro Glu Pro Gly Asn Tyr Arg Asp Pro Lys 610 615 620 Trp His Glu Ile Asn Val Ser Ile Glu Met Ala Ser Pro Phe Ile Asn 625 630 635 640 Page 61
Gly Asp Pro Ile Arg Ala Ala Val Asp Lys Arg Gly Thr Asp Val Val 645 650 655 Thr Phe Val Lys Tyr Lys Ala Glu Gly Glu Glu Ala Lys Pro Val Cys 660 665 670 Ala Tyr Lys Ala Glu Ser Phe Arg Gly Val Ile Arg Ser Ala Val Ala 675 680 685 Arg Ile His Met Glu Asp Gly Val Pro Leu Thr Glu Leu Thr His Ser 690 695 700 Asp Cys Glu Cys Leu Leu Cys Gln Ile Phe Gly Ser Glu Tyr Glu Ala 705 710 715 720 Gly Lys Ile Arg Phe Glu Asp Leu Val Phe Glu Ser Asp Pro Glu Pro 725 730 735 Val Thr Phe Asp His Val Ala Ile Asp Arg Phe Thr Gly Gly Ala Ala 740 745 750 Asp Lys Lys Lys Phe Asp Asp Ser Pro Leu Pro Gly Ser Pro Ala Arg 755 760 765 Pro Leu Met Leu Lys Gly Ser Phe Trp Ile Arg Arg Asp Val Leu Glu 770 775 780 Asp Glu Glu Tyr Cys Lys Ala Leu Gly Lys Ala Leu Ala Asp Val Asn 785 790 795 800 Asn Gly Leu Tyr Pro Leu Gly Gly Lys Ser Ala Ile Gly Tyr Gly Gln 805 810 815 Val Lys Ser Leu Gly Ile Lys Gly Asp Asp Lys Arg Ile Ser Arg Leu 820 825 830 Met Asn Pro Ala Phe Asp Glu Thr Asp Val Ala Val Pro Glu Lys Pro 835 840 845 Lys Thr Asp Ala Glu Val Arg Ile Glu Ala Glu Lys Val Tyr Tyr Pro 850 855 860 His Tyr Phe Val Glu Pro His Lys Lys Val Glu Arg Glu Glu Lys Pro 865 870 875 880 Cys Gly His Gln Lys Phe His Glu Gly Arg Leu Thr Gly Lys Ile Arg 885 890 895 Cys Lys Leu Ile Thr Lys Thr Pro Leu Ile Val Pro Asp Thr Ser Asn 900 905 910 Asp Asp Phe Phe Arg Pro Ala Asp Lys Glu Ala Arg Lys Glu Lys Asp 915 920 925 Glu Tyr His Lys Ser Tyr Ala Phe Phe Arg Leu His Lys Gln Ile Met 930 935 940 Ile Pro Gly Ser Glu Leu Arg Gly Met Val Ser Ser Val Tyr Glu Thr 945 950 955 960 Val Thr Asn Ser Cys Phe Arg Ile Phe Asp Glu Thr Lys Arg Leu Ser 965 970 975 Trp Arg Met Asp Ala Asp His Gln Asn Val Leu Gln Asp Phe Leu Pro 980 985 990 Gly Arg Val Thr Ala Asp Gly Lys His Ile Gln Lys Phe Ser Glu Thr 995 1000 1005 Ala Arg Val Pro Phe Tyr Asp Lys Thr Gln Lys His Phe Asp Ile Leu 1010 1015 1020 Asp Glu Gln Glu Ile Ala Gly Glu Lys Pro Val Arg Met Trp Val Lys 1025 1030 1035 1040 Page 62
Arg Phe Ile Lys Arg Leu Ser Leu Val Asp Pro Ala Lys His Pro Gln 1045 1050 1055 Lys Lys Gln Asp Asn Lys Trp Lys Arg Arg Lys Glu Gly Ile Ala Thr 1060 1065 1070 Phe Ile Glu Gln Lys Asn Gly Ser Tyr Tyr Phe Asn Val Val Thr Asn 1075 1080 1085 Asn Gly Cys Thr Ser Phe His Leu Trp His Lys Pro Asp Asn Phe Asp 1090 1095 1100 Gln Glu Lys Leu Glu Gly Ile Gln Asn Gly Glu Lys Leu Asp Cys Trp 1105 1110 1115 1120 Val Arg Asp Ser Arg Tyr Gln Lys Ala Phe Gln Glu Ile Pro Glu Asn 1125 1130 1135 Asp Pro Asp Gly Trp Glu Cys Lys Glu Gly Tyr Leu His Val Val Gly 1140 1145 1150 Pro Ser Lys Val Glu Phe Ser Asp Lys Lys Gly Asp Val Ile Asn Asn 1155 1160 1165 Phe Gln Gly Thr Leu Pro Ser Val Pro Asn Asp Trp Lys Thr Ile Arg 1170 1175 1180 Thr Asn Asp Phe Lys Asn Arg Lys Arg Lys Asn Glu Pro Val Phe Cys 1185 1190 1195 1200 Cys Glu Asp Asp Lys Gly Asn Tyr Tyr Thr Met Ala Lys Tyr Cys Glu 1205 1210 1215 Thr Phe Phe Phe Asp Leu Lys Glu Asn Glu Glu Tyr Glu Ile Pro Glu 1220 1225 1230 Lys Ala Arg Ile Lys Tyr Lys Glu Leu Leu Arg Val Tyr Asn Asn Asn 1235 1240 1245 Pro Gln Ala Val Pro Glu Ser Val Phe Gln Ser Arg Val Ala Arg Glu 1250 1255 1260 Asn Val Glu Lys Leu Lys Ser Gly Asp Leu Val Tyr Phe Lys His Asn 1265 1270 1275 1280 Glu Lys Tyr Val Glu Asp Ile Val Pro Val Arg Ile Ser Arg Thr Val 1285 1290 1295 Asp Asp Arg Met Ile Gly Lys Arg Met Ser Ala Asp Leu Arg Pro Cys 1300 1305 1310 His Gly Asp Trp Val Glu Asp Gly Asp Leu Ser Ala Leu Asn Ala Tyr 1315 1320 1325 Pro Glu Lys Arg Leu Leu Leu Arg His Pro Lys Gly Leu Cys Pro Ala 1330 1335 1340 Cys Arg Leu Phe Gly Thr Gly Ser Tyr Lys Gly Arg Val Arg Phe Gly 1345 1350 1355 1360 Phe Ala Ser Leu Glu Asn Asp Pro Glu Trp Leu Ile Pro Gly Lys Asn 1365 1370 1375 Pro Gly Asp Pro Phe His Gly Gly Pro Val Met Leu Ser Leu Leu Glu 1380 1385 1390 Arg Pro Arg Pro Thr Trp Ser Ile Pro Gly Ser Asp Asn Lys Phe Lys 1395 1400 1405 Val Pro Gly Arg Lys Phe Tyr Val His His His Ala Trp Lys Thr Ile 1410 1415 1420 Lys Asp Gly Asn His Pro Thr Thr Gly Lys Ala Ile Glu Gln Ser Pro 1425 1430 1435 1440
Asn Asn Arg Thr Val Glu Ala Leu Ala Gly Gly Asn Ser Phe Ser Phe 1445 1450 1455 Glu Ile Ala Phe Glu Asn Leu Lys Glu Trp Glu Leu Gly Leu Leu Ile 1460 1465 1470 His Ser Leu Gln Leu Glu Lys Gly Leu Ala His Lys Leu Gly Met Ala 1475 1480 1485 Lys Ser Met Gly Phe Gly Ser Val Glu Ile Asp Val Glu Ser Val Arg 1490 1495 1500 Leu Arg Lys Asp Trp Lys Gln Trp Arg Asn Gly Asn Ser Glu Ile Pro 1505 1510 1515 1520 Asn Trp Leu Gly Lys Gly Phe Ala Lys Leu Lys Glu Trp Phe Arg Asp 1525 1530 1535 Glu Leu Asp Phe Ile Glu Asn Leu Lys Lys Leu Leu Trp Phe Pro Glu 1540 1545 1550 Gly Asp Gln Ala Pro Arg Val Cys Tyr Pro Met Leu Arg Lys Lys Asp 1555 1560 1565 Asp Pro Asn Gly Asn Ser Gly Tyr Glu Glu Leu Lys Asp Gly Glu Phe 1570 1575 1580 Lys Lys Glu Asp Arg Gln Lys Lys Leu Thr Thr Pro Trp Thr Pro Trp 1585 1590 1595 1600 Ala <210> 15 <211> 1722 <212> PRT <213> Artificial Sequence <220> <223> D698A mutant of SEQ ID NO:1 <400> 15 Met Lys Ser Asn Asp Met Asn Ile Thr Val Glu Leu Thr Phe Phe Glu 1 5 10 15 Pro Tyr Arg Leu Val Glu Trp Phe Asp Trp Asp Ala Arg Lys Lys Ser
His Ser Ala Met Arg Gly Gln Ala Phe Ala Gln Trp Thr Trp Lys Gly 40 45 Lys Gly Arg Thr Ala Gly Lys Ser Phe Ile Thr Gly Thr Leu Val Arg 50 55 60 Ser Ala Val Ile Lys Ala Val Glu Glu Leu Leu Ser Leu Asn Asn Gly 65 70 75 80 Lys Trp Glu Gly Val Pro Cys Cys Asn Gly Ser Phe Gln Thr Asp Glu 85 90 95 Ser Lys Gly Lys Lys Pro Ser Phe Leu Arg Lys Arg His Thr Leu Gln 100 105 110 Trp Gln Ala Asn Asn Lys Asn Ile Cys Asp Lys Glu Glu Ala Cys Pro 115 120 125
Phe Cys Ile Leu Leu Gly Arg Phe Asp Asn Ala Gly Lys Val His Glu 130 135 140 Arg Asn Lys Asp Tyr Asp Ile His Phe Ser Asn Phe Asp Leu Asp His 145 150 155 160 Lys Gln Glu Lys Asn Asp Leu Arg Leu Val Asp Ile Ala Ser Gly Arg 165 170 175 Ile Leu Asn Arg Val Asp Phe Asp Thr Gly Lys Ala Lys Asp Tyr Phe 180 185 190 Arg Thr Trp Glu Ala Asp Tyr Glu Thr Tyr Gly Thr Tyr Thr Gly Arg 195 200 205 Ile Thr Leu Arg Asn Glu His Ala Lys Lys Leu Leu Leu Ala Ser Leu 210 215 220 Gly Phe Val Asp Lys Leu Cys Gly Ala Leu Cys Arg Ile Glu Val Ile 225 230 235 240 Lys Lys Ser Glu Ser Pro Leu Pro Ser Asp Thr Lys Glu Gln Ser Tyr 245 250 255 Thr Lys Asp Asp Thr Val Glu Val Leu Ser Glu Asp His Asn Asp Glu 260 265 270 Leu Arg Lys Gln Ala Glu Val Ile Val Glu Ala Phe Lys Gln Asn Asp 275 280 285 Lys Leu Glu Lys Ile Arg Ile Leu Ala Asp Ala Ile Arg Thr Leu Arg 290 295 300 Leu His Gly Glu Gly Val Ile Glu Lys Asp Glu Leu Pro Asp Gly Lys 305 310 315 320 Glu Glu Arg Asp Lys Gly His His Leu Trp Asp Ile Lys Val Gln Gly 325 330 335 Thr Ala Leu Arg Thr Lys Leu Lys Glu Leu Trp Gln Ser Asn Lys Asp 340 345 350 Ile Gly Trp Arg Lys Phe Thr Glu Met Leu Gly Ser Asn Leu Tyr Leu 355 360 365 Ile Tyr Lys Lys Glu Thr Gly Gly Val Ser Thr Arg Phe Arg Ile Leu 370 375 380 Gly Asp Thr Glu Tyr Tyr Ser Lys Ala His Asp Ser Glu Gly Ser Asp 385 390 395 400 Leu Phe Ile Pro Val Thr Pro Pro Glu Gly Ile Glu Thr Lys Glu Trp 405 410 415 Ile Ile Val Gly Arg Leu Lys Ala Ala Thr Pro Phe Tyr Phe Gly Val 420 425 430 Gln Gln Pro Ser Asp Ser Ile Pro Gly Lys Glu Lys Lys Ser Glu Asp 435 440 445 Ser Leu Val Ile Asn Glu His Thr Ser Phe Asn Ile Leu Leu Asp Lys 450 455 460 Glu Asn Arg Tyr Arg Ile Pro Arg Ser Ala Leu Arg Gly Ala Leu Arg 465 470 475 480 Arg Asp Leu Arg Thr Ala Phe Gly Ser Gly Cys Asn Val Ser Leu Gly 485 490 495 Gly Gln Ile Leu Cys Asn Cys Lys Val Cys Ile Glu Met Arg Arg Ile 500 505 510 Thr Leu Lys Asp Ser Val Ser Asp Phe Ser Glu Pro Pro Glu Ile Arg 515 520 525
Tyr Arg Ile Ala Lys Asn Pro Gly Thr Ala Thr Val Glu Asp Gly Ser 530 535 540 Leu Phe Asp Ile Glu Val Gly Pro Glu Gly Leu Thr Phe Pro Phe Val 545 550 555 560 Leu Arg Tyr Arg Gly His Lys Phe Pro Glu Gln Leu Ser Ser Val Ile 565 570 575 Arg Tyr Trp Glu Glu Asn Asp Gly Lys Asn Gly Met Ala Trp Leu Gly 580 585 590 Gly Leu Asp Ser Thr Gly Lys Gly Arg Phe Ala Leu Lys Asp Ile Lys 595 600 605 Ile Phe Glu Trp Asp Leu Asn Gln Lys Ile Asn Glu Tyr Ile Lys Glu 610 615 620 Arg Gly Met Arg Gly Lys Glu Lys Glu Leu Leu Glu Met Gly Glu Ser 625 630 635 640 Ser Leu Pro Asp Gly Leu Ile Pro Tyr Lys Phe Phe Glu Glu Arg Glu 645 650 655 Cys Leu Phe Pro Tyr Lys Glu Asn Leu Lys Pro Gln Trp Ser Glu Val 660 665 670 Gln Tyr Thr Ile Glu Val Gly Ser Pro Leu Leu Thr Ala Asp Thr Ile 675 680 685 Ser Ala Leu Thr Glu Pro Gly Asn Arg Ala Ala Ile Ala Tyr Lys Lys 690 695 700 Arg Val Tyr Asn Asp Gly Asn Asn Ala Ile Glu Pro Glu Pro Arg Phe 705 710 715 720 Ala Val Lys Ser Glu Thr His Arg Gly Ile Phe Arg Thr Ala Val Gly 725 730 735 Arg Arg Thr Gly Asp Leu Gly Lys Glu Asp His Glu Asp Cys Thr Cys 740 745 750 Asp Met Cys Ile Ile Phe Gly Asn Glu His Glu Ser Ser Lys Ile Arg 755 760 765 Phe Glu Asp Leu Glu Leu Ile Asn Gly Asn Glu Phe Glu Lys Leu Glu 770 775 780 Lys His Ile Asp His Val Ala Ile Asp Arg Phe Thr Gly Gly Ala Leu 785 790 795 800 Asp Lys Ala Lys Phe Asp Thr Tyr Pro Leu Ala Gly Ser Pro Lys Lys 805 810 815 Pro Leu Lys Leu Lys Gly Arg Phe Trp Ile Lys Lys Gly Phe Ser Gly 820 825 830 Asp His Lys Leu Leu Ile Thr Thr Ala Leu Ser Asp Ile Arg Asp Gly 835 840 845 Leu Tyr Pro Leu Gly Ser Lys Gly Gly Val Gly Tyr Gly Trp Val Ala 850 855 860 Gly Ile Ser Ile Asp Asp Asn Val Pro Asp Asp Phe Lys Glu Met Ile 865 870 875 880 Asn Lys Thr Glu Met Pro Leu Pro Glu Glu Val Glu Glu Ser Asn Asn 885 890 895 Gly Pro Ile Asn Asn Asp Tyr Val His Pro Gly His Gln Ser Pro Lys 900 905 910 Gln Asp His Lys Asn Lys Asn Ile Tyr Tyr Pro His Tyr Phe Leu Asp 915 920 925 Page 66
Ser Gly Ser Lys Val Tyr Arg Glu Lys Asp Ile Ile Thr His Glu Glu 930 935 940 Phe Thr Glu Glu Leu Leu Ser Gly Lys Ile Asn Cys Lys Leu Glu Thr 945 950 955 960 Leu Thr Pro Leu Ile Ile Pro Asp Thr Ser Asp Glu Asn Gly Leu Lys 965 970 975 Leu Gln Gly Asn Lys Pro Gly His Lys Asn Tyr Lys Phe Phe Asn Ile 980 985 990 Asn Gly Glu Leu Met Ile Pro Gly Ser Glu Leu Arg Gly Met Leu Arg 995 1000 1005 Thr His Phe Glu Ala Leu Thr Lys Ser Cys Phe Ala Ile Phe Gly Glu 1010 1015 1020 Asp Ser Thr Leu Ser Trp Arg Met Asn Ala Asp Glu Lys Asp Tyr Lys 1025 1030 1035 1040 Ile Asp Ser Asn Ser Ile Arg Lys Met Glu Ser Gln Arg Asn Pro Lys 1045 1050 1055 Tyr Arg Ile Pro Asp Glu Leu Gln Lys Glu Leu Arg Asn Ser Gly Asn 1060 1065 1070 Gly Leu Phe Asn Arg Leu Tyr Thr Ser Glu Arg Arg Phe Trp Ser Asp 1075 1080 1085 Val Ser Asn Lys Phe Glu Asn Ser Ile Asp Tyr Lys Arg Glu Ile Leu 1090 1095 1100 Arg Cys Ala Gly Arg Pro Lys Asn Tyr Lys Gly Gly Ile Ile Arg Gln 1105 1110 1115 1120 Arg Lys Asp Ser Leu Met Ala Glu Glu Leu Lys Val His Arg Leu Pro 1125 1130 1135 Leu Tyr Asp Asn Phe Asp Ile Pro Asp Ser Ala Tyr Lys Ala Asn Asp 1140 1145 1150 His Cys Arg Lys Ser Ala Thr Cys Ser Thr Ser Arg Gly Cys Arg Glu 1155 1160 1165 Arg Phe Thr Cys Gly Ile Lys Val Arg Asp Lys Asn Arg Val Phe Leu 1170 1175 1180 Asn Ala Ala Asn Asn Asn Arg Gln Tyr Leu Asn Asn Ile Lys Lys Ser 1185 1190 1195 1200 Asn His Asp Leu Tyr Leu Gln Tyr Leu Lys Gly Glu Lys Lys Ile Arg 1205 1210 1215 Phe Asn Ser Lys Val Ile Thr Gly Ser Glu Arg Ser Pro Ile Asp Val 1220 1225 1230 Ile Ala Glu Leu Asn Glu Arg Gly Arg Gln Thr Gly Phe Ile Lys Leu 1235 1240 1245 Ser Gly Leu Asn Asn Ser Asn Lys Ser Gln Gly Asn Thr Gly Thr Thr 1250 1255 1260 Phe Asn Ser Gly Trp Asp Arg Phe Glu Leu Asn Ile Leu Leu Asp Asp 1265 1270 1275 1280 Leu Glu Thr Arg Pro Ser Lys Ser Asp Tyr Pro Arg Pro Arg Leu Leu 1285 1290 1295 Phe Thr Lys Asp Gln Tyr Glu Tyr Asn Ile Thr Lys Arg Cys Glu Arg 1300 1305 1310 Val Phe Glu Ile Asp Lys Gly Asn Lys Thr Gly Tyr Pro Val Asp Asp 1315 1320 1325 Page 67
Gln Ile Lys Lys Asn Tyr Glu Asp Ile Leu Asp Ser Tyr Asp Gly Ile 1330 1335 1340 Lys Asp Gln Glu Val Ala Glu Arg Phe Asp Thr Phe Thr Arg Gly Ser 1345 1350 1355 1360 Lys Leu Lys Val Gly Asp Leu Val Tyr Phe His Ile Asp Gly Asp Asn 1365 1370 1375 Lys Ile Asp Ser Leu Ile Pro Val Arg Ile Ser Arg Lys Cys Ala Ser 1380 1385 1390 Lys Thr Leu Gly Gly Lys Leu Asp Lys Ala Leu His Pro Cys Thr Gly 1395 1400 1405 Leu Ser Asp Gly Leu Cys Pro Gly Cys His Leu Phe Gly Thr Thr Asp 1410 1415 1420 Tyr Lys Gly Arg Val Lys Phe Gly Phe Ala Lys Tyr Glu Asn Gly Pro 1425 1430 1435 1440 Glu Trp Leu Ile Thr Arg Gly Asn Asn Pro Glu Arg Ser Leu Thr Leu 1445 1450 1455 Gly Val Leu Glu Ser Pro Arg Pro Ala Phe Ser Ile Pro Asp Asp Glu 1460 1465 1470 Ser Glu Ile Pro Gly Arg Lys Phe Tyr Leu His His Asn Gly Trp Arg 1475 1480 1485 Ile Ile Arg Gln Lys Gln Leu Glu Ile Arg Glu Thr Val Gln Pro Glu 1490 1495 1500 Arg Asn Val Thr Thr Glu Val Met Asp Lys Gly Asn Val Phe Ser Phe 1505 1510 1515 1520 Asp Val Arg Phe Glu Asn Leu Arg Glu Trp Glu Leu Gly Leu Leu Leu 1525 1530 1535 Gln Ser Leu Asp Pro Gly Lys Asn Ile Ala His Lys Leu Gly Lys Gly 1540 1545 1550 Lys Pro Tyr Gly Phe Gly Ser Val Lys Ile Lys Ile Asp Ser Leu His 1555 1560 1565 Thr Phe Lys Ile Asn Ser Asn Asn Asp Lys Ile Lys Arg Val Pro Gln 1570 1575 1580 Ser Asp Ile Arg Glu Tyr Ile Asn Lys Gly Tyr Gln Lys Leu Ile Glu 1585 1590 1595 1600 Trp Ser Gly Asn Asn Ser Ile Gln Lys Gly Asn Val Leu Pro Gln Trp 1605 1610 1615 His Val Ile Pro His Ile Asp Lys Leu Tyr Lys Leu Leu Trp Val Pro 1620 1625 1630 Phe Leu Asn Asp Ser Lys Leu Glu Pro Asp Val Arg Tyr Pro Val Leu 1635 1640 1645 Asn Glu Glu Ser Lys Gly Tyr Ile Glu Gly Ser Asp Tyr Thr Tyr Lys 1650 1655 1660 Lys Leu Gly Asp Lys Asp Asn Leu Pro Tyr Lys Thr Arg Val Lys Gly 1665 1670 1675 1680 Leu Thr Thr Pro Trp Ser Pro Trp Asn Pro Phe Gln Val Ile Ala Glu 1685 1690 1695 His Glu Glu Gln Glu Val Asn Val Thr Gly Ser Arg Pro Ser Val Thr 1700 1705 1710 Asp Lys Ile Glu Arg Asp Gly Lys Met Val 1715 1720
<210> 16 <211> 5169 <212> DNA <213> Artificial Sequence <220> <223> gRAMP codon-optimized for E. coli <400> 16 atgaaaagca acgacatgaa cattaccgtg gaactgacct tttttgaacc gtatcgtctg 60 gttgaatggt ttgattggga tgcacgtaaa aaaagccata gcgcaatgcg tggtcaggca 120 tttgcacagt ggacctggaa aggtaaaggt cgtaccgcag gtaaaagctt tattaccggt 180 acactggttc gtagcgcagt tattaaagca gttgaagaac tgctgagcct gaataatggt 240 aaatgggaag gtgttccgtg ttgcaatggt agctttcaga ccgatgaaag caaaggtaaa 300 aaaccgagct ttctgcgtaa acgtcatacc ctgcagtggc aggcaaataa caaaaacatt 360 tgcgataaag aagaggcctg tccgttttgt attctgctgg gtcgttttga taatgccggt 420 aaagtgcatg aacgcaacaa agattatgat atccacttca gcaacttcga cctggatcac 480 aaacaagaaa aaaatgatct gcgcctggtt gatattgcaa gcggtcgtat tctgaatcgt 540 gttgattttg ataccggcaa agccaaagat tactttcgta cctgggaagc agattatgaa 600 acctatggca cctataccgg tcgcattacc ctgcgtaatg aacatgcaaa aaaactgctg 660 ctggcaagcc tgggttttgt tgataaactg tgtggtgcac tgtgtcgtat tgaggttatc 720 aaaaaaagcg aaagtccgct gccgagcgat accaaagaac agagctatac aaaagatgat 780 accgttgaag ttctgagcga agatcataat gatgaactgc gcaaacaggc cgaagttatt 840 gttgaagcat ttaagcagaa cgataaactg gaaaaaattc gcattctggc agatgcaatt 900 cgtaccctgc gcctgcatgg tgaaggtgtg attgaaaaag atgagctgcc ggatggtaaa 960 gaagaacgcg ataaaggtca tcatctgtgg gatattaaag ttcagggcac cgcactgcgt 1020 accaaactga aagaactgtg gcagagcaat aaagatattg gctggcgcaa atttaccgaa 1080 atgctgggta gcaatctgta cctgatctat aagaaagaaa ccggtggtgt tagcacccgt 1140 tttegcatcc tgggtgatac cgagtattat agcaaagcac atgatagcga aggtagcgac 1200 Page 69 ctgtttattc cggttacacc gcctgaaggt attgaaacca aagaatggat tattgtgggt 1260 cgcctgaaag cagcaacccc gttttatttc ggtgttcagc agccgagtga tagcattccg 1320 ggtaaagaga aaaaatcaga agatagcctg gtcatcaatg aacacaccag ctttaacatc 1380 ctgctggata aagaaaatcg ttatcgtatt ccgcgtagtg cactgcgtgg tgccctgegt 1440 cgcgatctgc gtaccgcatt tggtagcggt tgtaatgtta gcttaggtgg tcagattctg 1500 tgcaattgta aagtgtgtat tgaaatgcgt cgcatcacac tgaaagatag cgttagcgat 1560 ttttcagaac ctccggaaat tcgctatcgc attgcaaaaa 
atccgggtac agcaaccgtg 1620 gaagatggta gtctgtttga tattgaagtt ggtccggaag gcctgacctt tcegtttgtt 1680 ctgcgttatc gtggtcataa atttccagaa cagctgagca gcgttattcg ttattgggaa 1740 gaaaatgatg gcaaaaatgg tatggcatgg ttaggtggcc tggatagcac cggtaaaggc 1800 cgttttgccc tgaaagacat taaaatcttt gagtgggatc tgaaccagaa aatcaacgaa 1860 tatatcaaag aacgcggtat gcgtggcaaa gaaaaagaat tactggaaat gggtgaaagc 1920 agtctgcctg atggtctgat tccgtataaa ttctttgaag aacgtgaatg cctgtttccg 1980 tacaaagaaa acctgaaacc gcagtggtca gaagttcagt ataccattga agtgggttca 2040 ccgctgctga ccgcagatac cattagcgca ctgaccgaac cgggtaatcg tgatgcaatt 2100 gcctacaaaa aacgcgtgta taacgatggc aataatgcca ttgaaccgga accgcgtttt 2160 gcagttaaaa gtgaaaccca tcgtggtatt tttcgcaccg cagttggtcg tcgtaccggt 2220 gatctgggca aagaagatca cgaagattgt acctgtgata tgtgcattat ctttggcaat 2280 gagcatgaga gcagcaaaat tcgttttgaa gatctggaac tgatcaacgg caacgaattt 2340 gaaaagctgg aaaaacatat tgaccacgtg gccattgatc gttttacagg tggcgcactg 2400 gacaaagcaa aatttgatac ctatccgctg gcaggtagcc cgaaaaaacc gctgaaactg 2460 aagggtcgct tttggattaa aaagggtttt agcggtgatc acaagctgct gattaccaca 2520 gcactgagcg atattcgtga tggcctgtat cctctgggta gtaaaggtgg tgttggttat 2580 ggttgggttg caggtattag cattgatgat aatgtgccgg atgactttaa agagatgatc 2640 aacaagacag aaatgccgct gccggaagaa gtggaagaaa gcaataatgg tccgatcaat 2700 Page 70 aacgattatg ttcatccggg tcatcagagc ccgaaacagg atcataaaaa caagaacatc 2760 tattatccgc attattttct ggacagcggc agcaaagtgt atcgcgaaaa agatattatc 2820 acccacgaag aattcaccga ggaactgctg tcaggcaaaa ttaactgtaa acttgaaacc 2880 ctgacaccgc tgattattcc ggataccagt gatgaaaatg gtctgaaact tcagggtaat 2940 aaaccgggtc ataagaacta caaattcttc aacattaatg gcgaactgat gattccgggt 3000 tcagaactgc gtggcatgct gcgcacccat tttgaagcac tgaccaaaag ctgttttgcc 3060 atttttggtg aagatagcac cctgagctgg cgtatgaatg cagatgagaa agattacaaa 3120 atcgatagca acagcatccg caaaatggaa agccagcgta atccgaaata tcgcattccg 3180 gacgaactgc agaaagagct gcgtaatagc ggtaatggtc tgtttaatcg tctgtatacc 3240 agcgaacgtc gtttttggag tgatgtgagt aacaaatttg 
agaacagcat cgattacaaa 3300 cgcgaaattc tgcgttgtgc aggtcgtccg aaaaactata aaggcggtat tattcgtcag 3360 cgtaaagata gtctgatggc cgaagaactg aaagttcatc gtctgcctct gtatgataac 3420 tttgatattc ctgatagcgc ctacaaagcc aacgatcatt gtcgtaaaag cgcaacctgt 3480 agcaccagcc gtggttgtcg tgaacgtttt acctgtggca ttaaagtgcg tgataaaaat 3540 cgcgtttttc tgaatgcagc caataataat cgccagtacc tgaacaacat caaaaagtcc 3600 aatcacgatc tgtatctgca gtatctgaaa ggcgaaaaaa agatccgctt caacagcaaa 3660 gttattacag gtagcgaacg tagcccgatt gatgttattg cagaactgaa tgaacgtggt 3720 cgtcagaccg gttttatcaa actgagcggt ctgaataaca gcaataaaag ccagggcaat 3780 accggcacca catttaatag tggttgggat cgctttgaac tgaatatact gctggatgat 3840 ctggaaaccc gtccgagcaa aagcgattat ccegcgtccgc gtctgctgtt taccaaagat 3900 cagtatgaat acaacatcac caaacgttgc gaacgcgtgt ttgaaattga taaaggcaac 3960 aaaacaggct atccggtgga tgatcagatc aaaaagaact acgaagatat cctggacagc 4020 tatgatggca tcaaagatca agaagttgcc gaacgctttg atacatttac ccgtggtagc 4080 aagctgaaag ttggcgatct ggtttatttt catatcgacg gcgataacaa aattgacagc 4140 ctgattccgg ttcgtattag ccgtaaatgt gcaagcaaaa ccttaggtgg caaattagat 4200 Page 71 aaagcactgc atccgtgcac cggtctgtca gatggtctgt gtccgggttg tcacctgttt 4260 ggcaccaccg attataaagg tcgcgttaaa tttggcttecg ccaaatatga aaacggtcct 4320 gaatggctga ttacgcgtgg taataatccg gaacgtagtc tgaccctggg tgtgctggaa 4380 tcaccgcgtc cggcattttc aattccggat gatgaaagtg aaattccggg tcgtaaattc 4440 tatctgcatc acaatggttg gcgcattatt cgccagaaac aactggaaat tcgtgaaacc 4500 gttcagccgg aacgcaatgt taccaccgaa gtgatggata aaggtaacgt gtttagcttt 4560 gatgtgcgct ttgaaaatct gcgtgaatgg gaactgggtc tgctgctgca gagtctggat 4620 cctggtaaaa acattgcaca taaacttggt aaaggcaaac cgtatggttt tggcagcgtg 4680 aaaatcaaga ttgatagcct gcataccttc aagattaaca gcaacaacga caaaatcaaa 4740 cgtgttccgc agagtgatat ccgcgagtat attaacaaag gctaccagaa actgattgaa 4800 tggtcaggta ataatagcat ccagaaaggt aatgtgctgc cgcagtggca tgttattccg 4860 catattgaca aactgtacaa actgctgtgg gttccgtttc tgaacgatag caaactggaa 4920 ccggatgttc gttatccggt tctgaatgaa gaatccaaag 
gttatattga gggcagcgat 4980 tacacctata aaaagctggg agataaagat aacctgccgt ataaaacccg tgttaaaggt 5040 ctgaccacac cgtggtcacc gtggaatccg tttcaggtga ttgccgaaca tgaagaacaa 5100 gaagtgaacg ttaccggtag ccgtccgagt gttaccgata aaattgaacg tgatggtaaa 5160 atggtgtaa 5169 <210> 17 <211> 5169 <212> DNA <213> Artificial Sequence <220> <223> gRAMP - single cutter - DNA <400> 17 atgaaaagca acgacatgaa cattaccgtg gaactgacct tttttgaacc gtatcgtctg 60 gttgaatggt ttgattggga tgcacgtaaa aaaagccata gcgcaatgcg tggtcaggca 120 Page 72 tttgcacagt ggacctggaa aggtaaaggt cgtaccgcag gtaaaagctt tattaccggt 180 acactggttc gtagcgcagt tattaaagca gttgaagaac tgctgagcct gaataatggt 240 aaatgggaag gtgttccgtg ttgcaatggt agctttcaga ccgatgaaag caaaggtaaa 300 aaaccgagct ttctgcgtaa acgtcatacc ctgcagtggc aggcaaataa caaaaacatt 360 tgcgataaag aagaggcctg tccgttttgt attctgctgg gtcgttttga taatgccggt 420 aaagtgcatg aacgcaacaa agattatgat atccacttca gcaacttcga cctggatcac 480 aaacaagaaa aaaatgatct gcgcctggtt gatattgcaa gcggtcgtat tctgaatcgt 540 gttgattttg ataccggcaa agccaaagat tactttcgta cctgggaagc agattatgaa 600 acctatggca cctataccgg tcgcattacc ctgcgtaatg aacatgcaaa aaaactgctg 660 ctggcaagcc tgggttttgt tgataaactg tgtggtgcac tgtgtcgtat tgaggttatc 720 aaaaaaagcg aaagtccgct gccgagcgat accaaagaac agagctatac aaaagatgat 780 accgttgaag ttctgagcga agatcataat gatgaactgc gcaaacaggc cgaagttatt 840 gttgaagcat ttaagcagaa cgataaactg gaaaaaattc gcattctggc agatgcaatt 900 cgtaccctgc gcctgcatgg tgaaggtgtg attgaaaaag atgagctgcc ggatggtaaa 960 gaagaacgcg ataaaggtca tcatctgtgg gatattaaag ttcagggcac cgcactgcgt 1020 accaaactga aagaactgtg gcagagcaat aaagatattg gctggcgcaa atttaccgaa 1080 atgctgggta gcaatctgta cctgatctat aagaaagaaa ccggtggtgt tagcacccgt 1140 tttegcatcc tgggtgatac cgagtattat agcaaagcac atgatagcga aggtagcgac 1200 ctgtttattc cggttacacc gcctgaaggt attgaaacca aagaatggat tattgtgggt 1260 cgcctgaaag cagcaacccc gttttatttc ggtgttcagc agccgagtga tagcattccg 1320 ggtaaagaga aaaaatcaga agatagcctg gtcatcaatg aacacaccag ctttaacatc 1380 ctgctggata 
aagaaaatcg ttatcgtatt ccgcgtagtg cactgcgtgg tgccctgegt 1440 cgcgatctgc gtaccgcatt tggtagcggt tgtaatgtta gcttaggtgg tcagattctg 1500 tgcaattgta aagtgtgtat tgaaatgcgt cgcatcacac tgaaagatag cgttagcgat 1560 ttttcagaac ctccggaaat tcgctatcgc attgcaaaaa atccgggtac agcaaccgtg 1620 Page 73 gaagatggta gtctgtttga tattgaagtt ggtccggaag gcctgacctt tcegtttgtt 1680 ctgcgttatc gtggtcataa atttccagaa cagctgagca gcgttattcg ttattgggaa 1740 gaaaatgatg gcaaaaatgg tatggcatgg ttaggtggcc tggatagcac cggtaaaggc 1800 cgttttgccc tgaaagacat taaaatcttt gagtgggatc tgaaccagaa aatcaacgaa 1860 tatatcaaag aacgcggtat gcgtggcaaa gaaaaagaat tactggaaat gggtgaaagc 1920 agtctgcctg atggtctgat tccgtataaa ttctttgaag aacgtgaatg cctgtttccg 1980 tacaaagaaa acctgaaacc gcagtggtca gaagttcagt ataccattga agtgggttca 2040 ccgctgctga ccgcagatac cattagcgca ctgaccgaac cgggtaatcg tgctgcaatt 2100 gcctacaaaa aacgcgtgta taacgatggc aataatgcca ttgaaccgga accgcgtttt 2160 gcagttaaaa gtgaaaccca tcgtggtatt tttcgcaccg cagttggtcg tcgtaccggt 2220 gatctgggca aagaagatca cgaagattgt acctgtgata tgtgcattat ctttggcaat 2280 gagcatgaga gcagcaaaat tcgttttgaa gatctggaac tgatcaacgg caacgaattt 2340 gaaaagctgg aaaaacatat tgaccacgtg gccattgatc gttttacagg tggcgcactg 2400 gacaaagcaa aatttgatac ctatccgctg gcaggtagcc cgaaaaaacc gctgaaactg 2460 aagggtcgct tttggattaa aaagggtttt agcggtgatc acaagctgct gattaccaca 2520 gcactgagcg atattcgtga tggcctgtat cctctgggta gtaaaggtgg tgttggttat 2580 ggttgggttg caggtattag cattgatgat aatgtgccgg atgactttaa agagatgatc 2640 aacaagacag aaatgccgct gccggaagaa gtggaagaaa gcaataatgg tccgatcaat 2700 aacgattatg ttcatccggg tcatcagagc ccgaaacagg atcataaaaa caagaacatc 2760 tattatccgc attattttct ggacagcggc agcaaagtgt atcgcgaaaa agatattatc 2820 acccacgaag aattcaccga ggaactgctg tcaggcaaaa ttaactgtaa acttgaaacc 2880 ctgacaccgc tgattattcc ggataccagt gatgaaaatg gtctgaaact tcagggtaat 2940 aaaccgggtc ataagaacta caaattcttc aacattaatg gcgaactgat gattccgggt 3000 tcagaactgc gtggcatgct gcgcacccat tttgaagcac tgaccaaaag ctgttttgcc 3060 atttttggtg 
aagatagcac cctgagctgg cgtatgaatg cagatgagaa agattacaaa 3120 Page 74 atcgatagca acagcatccg caaaatggaa agccagcgta atccgaaata tcgcattccg 3180 gacgaactgc agaaagagct gcgtaatagc ggtaatggtc tgtttaatcg tctgtatacc 3240 agcgaacgtc gtttttggag tgatgtgagt aacaaatttg agaacagcat cgattacaaa 3300 cgcgaaattc tgcgttgtgc aggtcgtccg aaaaactata aaggcggtat tattcgtcag 3360 cgtaaagata gtctgatggc cgaagaactg aaagttcatc gtctgcctct gtatgataac 3420 tttgatattc ctgatagcgc ctacaaagcc aacgatcatt gtcgtaaaag cgcaacctgt 3480 agcaccagcc gtggttgtcg tgaacgtttt acctgtggca ttaaagtgcg tgataaaaat 3540 cgcgtttttc tgaatgcagc caataataat cgccagtacc tgaacaacat caaaaagtcc 3600 aatcacgatc tgtatctgca gtatctgaaa ggcgaaaaaa agatccgctt caacagcaaa 3660 gttattacag gtagcgaacg tagcccgatt gatgttattg cagaactgaa tgaacgtggt 3720 cgtcagaccg gttttatcaa actgagcggt ctgaataaca gcaataaaag ccagggcaat 3780 accggcacca catttaatag tggttgggat cgctttgaac tgaatatact gctggatgat 3840 ctggaaaccc gtccgagcaa aagcgattat ccegcgtccgc gtctgctgtt taccaaagat 3900 cagtatgaat acaacatcac caaacgttgc gaacgcgtgt ttgaaattga taaaggcaac 3960 aaaacaggct atccggtgga tgatcagatc aaaaagaact acgaagatat cctggacagc 4020 tatgatggca tcaaagatca agaagttgcc gaacgctttg atacatttac ccgtggtagc 4080 aagctgaaag ttggcgatct ggtttatttt catatcgacg gcgataacaa aattgacagc 4140 ctgattccgg ttcgtattag ccgtaaatgt gcaagcaaaa ccttaggtgg caaattagat 4200 aaagcactgc atccgtgcac cggtctgtca gatggtctgt gtccgggttg tcacctgttt 4260 ggcaccaccg attataaagg tcgcgttaaa tttggcttecg ccaaatatga aaacggtcct 4320 gaatggctga ttacgcgtgg taataatccg gaacgtagtc tgaccctggg tgtgctggaa 4380 tcaccgcgtc cggcattttc aattccggat gatgaaagtg aaattccggg tcgtaaattc 4440 tatctgcatc acaatggttg gcgcattatt cgccagaaac aactggaaat tcgtgaaacc 4500 gttcagccgg aacgcaatgt taccaccgaa gtgatggata aaggtaacgt gtttagcttt 4560 gatgtgcgct ttgaaaatct gcgtgaatgg gaactgggtc tgctgctgca gagtctggat 4620 Page 75 cctggtaaaa acattgcaca taaacttggt aaaggcaaac cgtatggttt tggcagcgtg 4680 aaaatcaaga ttgatagcct gcataccttc aagattaaca gcaacaacga caaaatcaaa 4740 
cgtgttccgc agagtgatat ccgcgagtat attaacaaag gctaccagaa actgattgaa 4800 tggtcaggta ataatagcat ccagaaaggt aatgtgctgc cgcagtggca tgttattccg 4860 catattgaca aactgtacaa actgctgtgg gttccgtttc tgaacgatag caaactggaa 4920 ccggatgttc gttatccggt tctgaatgaa gaatccaaag gttatattga gggcagcgat 4980 tacacctata aaaagctggg agataaagat aacctgccgt ataaaacccg tgttaaaggt 5040 ctgaccacac cgtggtcacc gtggaatccg tttcaggtga ttgccgaaca tgaagaacaa 5100 gaagtgaacg ttaccggtag ccgtccgagt gttaccgata aaattgaacg tgatggtaaa 5160 atggtgtaa 5169 <210> 18 <211> 2187 <212> DNA <213> Artificial Sequence <220> <223> TPR-CHAT codon-optimized for E. coli <400> 18 atgaacaaca ccgaagaaaa catcgatcgt attcaagaac cgacgcgtga agatattgat 60 cgtaaagaag cagaacgtct gctggatgaa gcatttaatc cgcgtaccaa accggtggat 120 cgcaaaaaaa tcattaatag cgcactgaaa attctgatcg gcctgtacaa agagaaaaaa 180 gacgatctga ccagcgcaag ctttattagc attgcacgtg cctattatct ggtgagcatt 240 accattctgc cgaaaggcac caccattccg gaaaaaaaga aagaagcact gcgcaaaggc 300 atcgaattta ttgatcgcgc aatcaacaag tttaacggca gcattctgga tagccagcgt 360 gcatttcgta ttaaaagcgt tctgagcatt gagttcaatc gtatcgatcg tgaaaaatgc 420 gacaacatca aactgaaaaa cctgctgaac gaagccgttg ataaaggttg taccgatttt 480 gatacctatg agtgggatat tcagattgcc attcgtctgt gtgaactggg tgttgatatg 540 gaaggtcatt ttgacaacct gatcaaaagc aacaaagcca acgatctgca gaaagccaaa 600 Page 76 gcctattact tcatcaaaaa ggatgaccat aaggccaaag aacacatgga taaatgtacc 660 gcaagcctga aatatacccc gtgtagtcat cgtctgtggg atgaaaccgt tggttttatt 720 gaacgtctga aaggtgatag cagcaccctg tggcgtgatt ttgcaattaa aacctatcgt 780 agctgccgtg tgcaagaaaa agaaaccggt acactgcgtc tgcgttggta ttggagccgt 840 catcgtgttc tgtatgatat ggcatttctg gcagttaaag aacaggcaga tgatgaagaa 900 ccggatgtta atgttaaaca ggccaaaatc aaaaagctgg ccgaaattag cgatagcctg 960 aaaagccgtt ttagcctgcg tctgagcgat atggaaaaaa tgccgaaaag tgatgatgaa 1020 agcaaccacg agttcaaaaa gtttctggac aaatgtgtta ccgcctatca ggatggttat 1080 gtgattaatc gtagcgagga taaagaaggt cagggcgaaa acaaaagcac caccagtaaa 1140 cagccggaac cgcgtccgca ggcaaaactg 
ctggaactga cccaggttcc ggaaggttgg 1200 gttgttgtte acttttatct gaataaactg gaaggtatgg gcaacgccat tgtgtttgat 1260 aaatgtgcaa atagctggca gtacaaagaa tttcagtata aagaactgtt tgaagtgttc 1320 ctgacctggc aggcaaacta taatctttac aaagaaaacg cagccgaaca tctggttacc 1380 ctgtgtaaaa agattggtga aaccatgccg tttctgttet gcgataactt tattccgaat 1440 ggtaaagatg ttctgtttgt gccgcatgat tttctgcatc gtctgccgct gcatggtagc 1500 attgagaata aaacaaatgg caagctgttc ctggaaaatc atagctgttg ttatctgcct 1560 gcatggtcat ttgcaagcga aaaagaagca agcaccagcg acgaatatgt tctgctgaaa 1620 aatttcgatc agggccattt tgaaaccctg cagaataatc agatttgggg cacccagagc 1680 gttaaagatg gtgcaagcag tgatgatctg gaaaacattc gtaacaatcc gcgtctgetg 1740 accattctgt gtcatggtga agcaaatatg agcaatccgt ttcgtagcat gctgaaactg 1800 gcaaatggtg gtattaccta tctggaaatt ctgaatagcg tgaaaggcct gaaaggtagc 1860 caggttattc tgggtgcatg tgaaaccgat ctggttccgc ctctgagtga tgttatggat 1920 gaacattata gcgttgcaac cgcactgctg ctgattggtg cagcgggtgt tgttggcacc 1980 atgtggaaag ttcgtagcaa taaaacgaaa agcctgatcg agtggaagct ggaaaatatc 2040 gaatataaac tgaacgagtg gcagaaagaa acaggtggtg cagcatataa agatcatccg 2100 Page 77 cctacctttt atcgtagcat tgcctttegt agtattggtt ttccgttagg tggtagcggt 2160 catcaccatc accaccatca tcattaa 2187 <210> 19 <211> 57 <212> RNA <213> Artificial Sequence <220> <223> Guide rna 1 for experiments <400> 19 aaacaagaga aggacuuaau gucacgguac ccaauuuucu gccccggacu ccacggc 57 <210> 20 <211> 38 <212> RNA <213> Artificial Sequence <220> <223> Target RNA 1 <400> 20 cucuaguaac agccguggag uccggggcag aaaauugg 38 <210> 21 <211> 46 <212> RNA <213> Artificial Sequence <220> <223> Target RNA 2 <400> 21 cucuaguaac agccguggag uccggggcag aaaauuggac gauuaa 46 <210> 22 <211> 48 <212> RNA <213> Artificial Sequence Page 78
<220> <223> Target RNA 3 <400> 22 cucuaguaac agccguggag uccggggcag aaaauuggac gauuaacu 48 <210> 23 <211> 233 <212> PRT <213> Candidatus Scalindua brodae <220> <223> gRAMP domain 1 <400> 23 Met Lys Ser Asn Asp Met Asn Ile Thr Val Glu Leu Thr Phe Phe Glu 1 5 10 15 Pro Tyr Arg Leu Val Glu Trp Phe Asp Trp Asp Ala Arg Lys Lys Ser
His Ser Ala Met Arg Gly Gln Ala Phe Ala Gln Trp Thr Trp Lys Gly 40 45 Lys Gly Arg Thr Ala Gly Lys Ser Phe Ile Thr Gly Thr Leu Val Arg 50 55 60 Ser Ala Val Ile Lys Ala Val Glu Glu Leu Leu Ser Leu Asn Asn Gly 65 70 75 80 Lys Trp Glu Gly Val Pro Cys Cys Asn Gly Ser Phe Gln Thr Asp Glu 85 90 95 Ser Lys Gly Lys Lys Pro Ser Phe Leu Arg Lys Arg His Thr Leu Gln 100 105 110 Trp Gln Ala Asn Asn Lys Asn Ile Cys Asp Lys Glu Glu Ala Cys Pro 115 120 125 Phe Cys Ile Leu Leu Gly Arg Phe Asp Asn Ala Gly Lys Val His Glu 130 135 140 Arg Asn Lys Asp Tyr Asp Ile His Phe Ser Asn Phe Asp Leu Asp His 145 150 155 160 Lys Gln Glu Lys Asn Asp Leu Arg Leu Val Asp Ile Ala Ser Gly Arg 165 170 175 Ile Leu Asn Arg Val Asp Phe Asp Thr Gly Lys Ala Lys Asp Tyr Phe 180 185 190 Arg Thr Trp Glu Ala Asp Tyr Glu Thr Tyr Gly Thr Tyr Thr Gly Arg 195 200 205 Ile Thr Leu Arg Asn Glu His Ala Lys Lys Leu Leu Leu Ala Ser Leu 210 215 220 Gly Phe Val Asp Lys Leu Cys Gly Ala 225 230 <210> 24
<211> 202 <212> PRT <213> Candidatus Scalindua brodae <220> <223> gRAMP domain 2 <400> 24 Thr Lys Glu Trp Ile Ile Val Gly Arg Leu Lys Ala Ala Thr Pro Phe 1 5 10 15 Tyr Phe Gly Val Gln Gln Pro Ser Asp Ser Ile Pro Gly Lys Glu Lys
Lys Ser Glu Asp Ser Leu Val Ile Asn Glu His Thr Ser Phe Asn Ile 40 45 Leu Leu Asp Lys Glu Asn Arg Tyr Arg Ile Pro Arg Ser Ala Leu Arg 50 55 60 Gly Ala Leu Arg Arg Asp Leu Arg Thr Ala Phe Gly Ser Gly Cys Asn 65 70 75 80 Val Ser Leu Gly Gly Gln Ile Leu Cys Asn Cys Lys Val Cys Ile Glu 85 90 95 Met Arg Arg Ile Thr Leu Lys Asp Ser Val Ser Asp Phe Ser Glu Pro 100 105 110 Pro Glu Ile Arg Tyr Arg Ile Ala Lys Asn Pro Gly Thr Ala Thr Val 115 120 125 Glu Asp Gly Ser Leu Phe Asp Ile Glu Val Gly Pro Glu Gly Leu Thr 130 135 140 Phe Pro Phe Val Leu Arg Tyr Arg Gly His Lys Phe Pro Glu Gln Leu 145 150 155 160 Ser Ser Val Ile Arg Tyr Trp Glu Glu Asn Asp Gly Lys Asn Gly Met 165 170 175 Ala Trp Leu Gly Gly Leu Asp Ser Thr Gly Lys Gly Arg Phe Ala Leu 180 185 190 Lys Asp Ile Lys Ile Phe Glu Trp Asp Leu 195 200 <210> 25 <211> 198 <212> PRT <213> Candidatus Scalindua brodae <220> <223> gRAMP domain 3 <400> 25 Glu Val Gln Tyr Thr Ile Glu Val Gly Ser Pro Leu Leu Thr Ala Asp 1 5 10 15 Thr Ile Ser Ala Leu Thr Glu Pro Gly Asn Arg Asp Ala Ile Ala Tyr 20 25 30
Lys Lys Arg Val Tyr Asn Asp Gly Asn Asn Ala Ile Glu Pro Glu Pro 35 40 45 Arg Phe Ala Val Lys Ser Glu Thr His Arg Gly Ile Phe Arg Thr Ala 50 55 60 Val Gly Arg Arg Thr Gly Asp Leu Gly Lys Glu Asp His Glu Asp Cys 65 70 75 80 Thr Cys Asp Met Cys Ile Ile Phe Gly Asn Glu His Glu Ser Ser Lys 85 90 95 Ile Arg Phe Glu Asp Leu Glu Leu Ile Asn Gly Asn Glu Phe Glu Lys 100 105 110 Leu Glu Lys His Ile Asp His Val Ala Ile Asp Arg Phe Thr Gly Gly 115 120 125 Ala Leu Asp Lys Ala Lys Phe Asp Thr Tyr Pro Leu Ala Gly Ser Pro 130 135 140 Lys Lys Pro Leu Lys Leu Lys Gly Arg Phe Trp Ile Lys Lys Gly Phe 145 150 155 160 Ser Gly Asp His Lys Leu Leu Ile Thr Thr Ala Leu Ser Asp Ile Arg 165 170 175 Asp Gly Leu Tyr Pro Leu Gly Ser Lys Gly Gly Val Gly Tyr Gly Trp 180 185 190 Val Ala Gly Ile Ser Ile 195 <210> 26 <211> 690 <212> PRT <213> Candidatus Scalindua brodae <220> <223> gRAMP domain 4 <400> 26 His Glu Glu Phe Thr Glu Glu Leu Leu Ser Gly Lys Ile Asn Cys Lys 1 5 10 15 Leu Glu Thr Leu Thr Pro Leu Ile Ile Pro Asp Thr Ser Asp Glu Asn
Gly Leu Lys Leu Gln Gly Asn Lys Pro Gly His Lys Asn Tyr Lys Phe 40 45 Phe Asn Ile Asn Gly Glu Leu Met Ile Pro Gly Ser Glu Leu Arg Gly 50 55 60 Met Leu Arg Thr His Phe Glu Ala Leu Thr Lys Ser Cys Phe Ala Ile 65 70 75 80 Phe Gly Glu Asp Ser Thr Leu Ser Trp Arg Met Asn Ala Asp Glu Lys 85 90 95 Asp Tyr Lys Ile Asp Ser Asn Ser Ile Arg Lys Met Glu Ser Gln Arg 100 105 110 Asn Pro Lys Tyr Arg Ile Pro Asp Glu Leu Gln Lys Glu Leu Arg Asn 115 120 125 Ser Gly Asn Gly Leu Phe Asn Arg Leu Tyr Thr Ser Glu Arg Arg Phe
Trp Ser Asp Val Ser Asn Lys Phe Glu Asn Ser Ile Asp Tyr Lys Arg 145 150 155 160 Glu Ile Leu Arg Cys Ala Gly Arg Pro Lys Asn Tyr Lys Gly Gly Ile 165 170 175 Ile Arg Gln Arg Lys Asp Ser Leu Met Ala Glu Glu Leu Lys Val His 180 185 190 Arg Leu Pro Leu Tyr Asp Asn Phe Asp Ile Pro Asp Ser Ala Tyr Lys 195 200 205 Ala Asn Asp His Cys Arg Lys Ser Ala Thr Cys Ser Thr Ser Arg Gly 210 215 220 Cys Arg Glu Arg Phe Thr Cys Gly Ile Lys Val Arg Asp Lys Asn Arg 225 230 235 240 Val Phe Leu Asn Ala Ala Asn Asn Asn Arg Gln Tyr Leu Asn Asn Ile 245 250 255 Lys Lys Ser Asn His Asp Leu Tyr Leu Gln Tyr Leu Lys Gly Glu Lys 260 265 270 Lys Ile Arg Phe Asn Ser Lys Val Ile Thr Gly Ser Glu Arg Ser Pro 275 280 285 Ile Asp Val Ile Ala Glu Leu Asn Glu Arg Gly Arg Gln Thr Gly Phe 290 295 300 Ile Lys Leu Ser Gly Leu Asn Asn Ser Asn Lys Ser Gln Gly Asn Thr 305 310 315 320 Gly Thr Thr Phe Asn Ser Gly Trp Asp Arg Phe Glu Leu Asn Ile Leu 325 330 335 Leu Asp Asp Leu Glu Thr Arg Pro Ser Lys Ser Asp Tyr Pro Arg Pro 340 345 350 Arg Leu Leu Phe Thr Lys Asp Gln Tyr Glu Tyr Asn Ile Thr Lys Arg 355 360 365 Cys Glu Arg Val Phe Glu Ile Asp Lys Gly Asn Lys Thr Gly Tyr Pro 370 375 380 Val Asp Asp Gln Ile Lys Lys Asn Tyr Glu Asp Ile Leu Asp Ser Tyr 385 390 395 400 Asp Gly Ile Lys Asp Gln Glu Val Ala Glu Arg Phe Asp Thr Phe Thr 405 410 415 Arg Gly Ser Lys Leu Lys Val Gly Asp Leu Val Tyr Phe His Ile Asp 420 425 430 Gly Asp Asn Lys Ile Asp Ser Leu Ile Pro Val Arg Ile Ser Arg Lys 435 440 445 Cys Ala Ser Lys Thr Leu Gly Gly Lys Leu Asp Lys Ala Leu His Pro 450 455 460 Cys Thr Gly Leu Ser Asp Gly Leu Cys Pro Gly Cys His Leu Phe Gly 465 470 475 480 Thr Thr Asp Tyr Lys Gly Arg Val Lys Phe Gly Phe Ala Lys Tyr Glu 485 490 495 Asn Gly Pro Glu Trp Leu Ile Thr Arg Gly Asn Asn Pro Glu Arg Ser 500 505 510 Leu Thr Leu Gly Val Leu Glu Ser Pro Arg Pro Ala Phe Ser Ile Pro 515 520 525 Asp Asp Glu Ser Glu Ile Pro Gly Arg Lys Phe Tyr Leu His His Asn Page 82
Gly Trp Arg Ile Ile Arg Gln Lys Gln Leu Glu Ile Arg Glu Thr Val 545 550 555 560 Gln Pro Glu Arg Asn Val Thr Thr Glu Val Met Asp Lys Gly Asn Val 565 570 575 Phe Ser Phe Asp Val Arg Phe Glu Asn Leu Arg Glu Trp Glu Leu Gly 580 585 590 Leu Leu Leu Gln Ser Leu Asp Pro Gly Lys Asn Ile Ala His Lys Leu 595 600 605 Gly Lys Gly Lys Pro Tyr Gly Phe Gly Ser Val Lys Ile Lys Ile Asp 610 615 620 Ser Leu His Thr Phe Lys Ile Asn Ser Asn Asn Asp Lys Ile Lys Arg 625 630 635 640 Val Pro Gln Ser Asp Ile Arg Glu Tyr Ile Asn Lys Gly Tyr Gln Lys 645 650 655 Leu Ile Glu Trp Ser Gly Asn Asn Ser Ile Gln Lys Gly Asn Val Leu 660 665 670 Pro Gln Trp His Val Ile Pro His Ile Asp Lys Leu Tyr Lys Leu Leu 675 680 685 Trp Val 690 <210> 27 <211> 716 <212> PRT <213> Candidatus Scalindua brodae <220> <223> Candidatus Scalindua brodae TPR-CHAT AA <400> 27 Met Asn Asn Thr Glu Glu Asn Ile Asp Arg Ile Gln Glu Pro Thr Arg 1 5 10 15 Glu Asp Ile Asp Arg Lys Glu Ala Glu Arg Leu Leu Asp Glu Ala Phe
Asn Pro Arg Thr Lys Pro Val Asp Arg Lys Lys Ile Ile Asn Ser Ala 40 45 Leu Lys Ile Leu Ile Gly Leu Tyr Lys Glu Lys Lys Asp Asp Leu Thr 50 55 60 Ser Ala Ser Phe Ile Ser Ile Ala Arg Ala Tyr Tyr Leu Val Ser Ile 65 70 75 80 Thr Ile Leu Pro Lys Gly Thr Thr Ile Pro Glu Lys Lys Lys Glu Ala 85 90 95 Leu Arg Lys Gly Ile Glu Phe Ile Asp Arg Ala Ile Asn Lys Phe Asn 100 105 110 Gly Ser Ile Leu Asp Ser Gln Arg Ala Phe Arg Ile Lys Ser Val Leu 115 120 125 Ser Ile Glu Phe Asn Arg Ile Asp Arg Glu Lys Cys Asp Asn Ile Lys 130 135 140
Leu Lys Asn Leu Leu Asn Glu Ala Val Asp Lys Gly Cys Thr Asp Phe 145 150 155 160 Asp Thr Tyr Glu Trp Asp Ile Gln Ile Ala Ile Arg Leu Cys Glu Leu 165 170 175 Gly Val Asp Met Glu Gly His Phe Asp Asn Leu Ile Lys Ser Asn Lys 180 185 190 Ala Asn Asp Leu Gln Lys Ala Lys Ala Tyr Tyr Phe Ile Lys Lys Asp 195 200 205 Asp His Lys Ala Lys Glu His Met Asp Lys Cys Thr Ala Ser Leu Lys 210 215 220 Tyr Thr Pro Cys Ser His Arg Leu Trp Asp Glu Thr Val Gly Phe Ile 225 230 235 240 Glu Arg Leu Lys Gly Asp Ser Ser Thr Leu Trp Arg Asp Phe Ala Ile 245 250 255 Lys Thr Tyr Arg Ser Cys Arg Val Gln Glu Lys Glu Thr Gly Thr Leu 260 265 270 Arg Leu Arg Trp Tyr Trp Ser Arg His Arg Val Leu Tyr Asp Met Ala 275 280 285 Phe Leu Ala Val Lys Glu Gln Ala Asp Asp Glu Glu Pro Asp Val Asn 290 295 300 Val Lys Gln Ala Lys Ile Lys Lys Leu Ala Glu Ile Ser Asp Ser Leu 305 310 315 320 Lys Ser Arg Phe Ser Leu Arg Leu Ser Asp Met Glu Lys Met Pro Lys 325 330 335 Ser Asp Asp Glu Ser Asn His Glu Phe Lys Lys Phe Leu Asp Lys Cys 340 345 350 Val Thr Ala Tyr Gln Asp Gly Tyr Val Ile Asn Arg Ser Glu Asp Lys 355 360 365 Glu Gly Gln Gly Glu Asn Lys Ser Thr Thr Ser Lys Gln Pro Glu Pro 370 375 380 Arg Pro Gln Ala Lys Leu Leu Glu Leu Thr Gln Val Pro Glu Gly Trp 385 390 395 400 Val Val Val His Phe Tyr Leu Asn Lys Leu Glu Gly Met Gly Asn Ala 405 410 415 Ile Val Phe Asp Lys Cys Ala Asn Ser Trp Gln Tyr Lys Glu Phe Gln 420 425 430 Tyr Lys Glu Leu Phe Glu Val Phe Leu Thr Trp Gln Ala Asn Tyr Asn 435 440 445 Leu Tyr Lys Glu Asn Ala Ala Glu His Leu Val Thr Leu Cys Lys Lys 450 455 460 Ile Gly Glu Thr Met Pro Phe Leu Phe Cys Asp Asn Phe Ile Pro Asn 465 470 475 480 Gly Lys Asp Val Leu Phe Val Pro His Asp Phe Leu His Arg Leu Pro 485 490 495 Leu His Gly Ser Ile Glu Asn Lys Thr Asn Gly Lys Leu Phe Leu Glu 500 505 510 Asn His Ser Cys Cys Tyr Leu Pro Ala Trp Ser Phe Ala Ser Glu Lys 515 520 525 Glu Ala Ser Thr Ser Asp Glu Tyr Val Leu Leu Lys Asn Phe Asp Gln 530 535 540 Page 84
Gly His Phe Glu Thr Leu Gln Asn Asn Gln Ile Trp Gly Thr Gln Ser 545 550 555 560 Val Lys Asp Gly Ala Ser Ser Asp Asp Leu Glu Asn Ile Arg Asn Asn 565 570 575 Pro Arg Leu Leu Thr Ile Leu Cys His Gly Glu Ala Asn Met Ser Asn 580 585 590 Pro Phe Arg Ser Met Leu Lys Leu Ala Asn Gly Gly Ile Thr Tyr Leu 595 600 605 Glu Ile Leu Asn Ser Val Lys Gly Leu Lys Gly Ser Gln Val Ile Leu 610 615 620 Gly Ala Cys Glu Thr Asp Leu Val Pro Pro Leu Ser Asp Val Met Asp 625 630 635 640 Glu His Tyr Ser Val Ala Thr Ala Leu Leu Leu Ile Gly Ala Ala Gly 645 650 655 Val Val Gly Thr Met Trp Lys Val Arg Ser Asn Lys Thr Lys Ser Leu 660 665 670 Ile Glu Trp Lys Leu Glu Asn Ile Glu Tyr Lys Leu Asn Glu Trp Gln 675 680 685 Lys Glu Thr Gly Gly Ala Ala Tyr Lys Asp His Pro Pro Thr Phe Tyr 690 695 700 Arg Ser Ile Ala Phe Arg Ser Ile Gly Phe Pro Leu 705 710 715 <210> 28 <211> 760 <212> PRT <213> Candidatus Jettenia <220> <223> Candidatus Jettenia caeni TPR-CHAT AA <400> 28 Met Lys Asn Arg Val Gln Ile Glu Ala Ile Ile Arg Asn Leu Gln Gly 1 5 10 15 Ala Ala Arg Asp Ser Lys Thr Asn Lys Leu Ser Glu Asn Ile Ile Ala
Tyr Asp Glu Tyr Arg Lys Ile His Lys Ser Ala Ser Leu Tyr Gln Phe 40 45 Gly Ile Ile Pro Ala Lys Glu Ser Ser Ser Val Leu Ala Glu Asn Glu 50 55 60 Thr Asn His Val Ala Tyr Glu Asn Ala Ile Phe Glu Met Ala Glu Lys 65 70 75 80 Asn Ile Glu Asn Phe Ser Ser Glu Asp Ile His Lys Lys Arg Lys Glu 85 90 95 Met Ile Glu Ser Ala Leu Arg Leu Leu Met Gly Leu Tyr Lys Asp Arg 100 105 110 His Glu Lys Leu Gln Pro Arg Thr Phe Val Leu Ile Ala Lys Ala Tyr 115 120 125 Leu Leu Arg Ser Leu Ile Thr Arg Pro Lys Gly Ile Thr Ile Pro Glu
Lys Lys Lys Glu Ala Leu Lys Lys Gly Ile Gly Phe Val Glu Ser Ala 145 150 155 160 Ile Lys Lys Ile Gln Ser Ser Glu Asn Ile Leu Ser His Ser Ser Asp 165 170 175 Ile Asp Leu Leu Glu Lys Ala Trp Arg Ile Lys Ser Gln Leu Tyr Leu 180 185 190 Glu Tyr Tyr Arg Val Asn Lys Asp Glu Cys Asp Lys Asn Thr Leu Lys 195 200 205 Glu Val Leu Glu Asn Ser Leu Ile Ser Gly Cys Asp Lys Phe Asp Lys 210 215 220 Asn Ile Glu Asp Val Gln Ile Ala Ile Arg Tyr Cys Glu Leu Glu Ser 225 230 235 240 Ser Arg Glu Tyr Leu Glu Gln Ile Ile Ser Ser His Leu Glu Gly Ile 245 250 255 Glu Phe Glu Lys Ala Arg Ala Tyr Lys Leu Leu Glu Leu Glu Asn Glu 260 265 270 Asn Glu Asp Glu Ile Arg Lys Ser Met Lys Val Val Ile Glu Glu Tyr 275 280 285 Leu Ser Gly Phe Ser Asp Pro Leu Trp Glu Asp Ala Val Glu Phe Ile 290 295 300 Asn Lys Leu Lys Ser Asp Asn Lys Asn Cys Trp Lys Glu Leu Ser Leu 305 310 315 320 Asp Met Tyr Lys Val Cys Arg Glu Gln Glu Ala Glu Thr Ala Ser Leu 325 330 335 His Leu Arg Trp Tyr Trp Ser Arg Gln Arg Arg Leu Tyr Asp Leu Ala 340 345 350 Phe Ile Ala Ala Asp Lys Glu Glu Glu Lys Ala Lys Ile Ala Asp Ser 355 360 365 Leu Lys Ser Arg Leu Ser Leu Arg Trp Ser Ala Leu Glu Glu Thr Gly 370 375 380 Lys Lys Ser Lys Asn Lys Arg Glu Lys Glu Glu Ile Ser Arg Ile Leu 385 390 395 400 Glu Ala Glu Ala Val Ala Met Leu Gly Gly Tyr Ile Lys Gly Ala Arg 405 410 415 Lys Ile Leu Lys Lys Arg Arg Arg Pro Leu Pro Asp Glu Gln Arg Ser 420 425 430 Ile Pro Lys Asp Trp Ile Val Ile His Phe Tyr Val Asn Gln Leu Glu 435 440 445 Asn Lys Cys Tyr Ala Leu Ile Tyr Asn Lys Asp Glu Asn Thr Trp Lys 450 455 460 Cys Glu Phe Val Lys Glu Tyr Gln Arg Leu Phe His Val Phe Leu Thr 465 470 475 480 Trp Gln Thr Asn Tyr Asn Arg Cys Lys Glu Arg Ala Ala Asp Ser Leu 485 490 495 Val Gln Leu Cys Lys Glu Ile Gly Asn Ala Met Pro Phe Leu Phe Asp 500 505 510 Glu Cys Ile Ile Pro Gln Asp Lys Asn Val Leu Phe Ile Pro His Asp 515 520 525 Phe Leu His Arg Leu Pro Leu His Gly Ala Ile His Glu Lys Asn Asn
Gly Val Phe Leu Glu Asn His Pro Cys Cys Tyr Leu Pro Ala Trp Ser 545 550 555 560 Phe Thr Ala Lys Glu Asn Asn Ala Val Val Gln Gly Ser Ile Leu Leu 565 570 575 Lys Asn Phe Pro Glu Tyr Ser Tyr Glu Glu Leu Val Ser Asn Ser Thr 580 585 590 Leu Trp Thr Ser Pro Val Lys Asp Pro Ala Ser Pro Asp Asp Leu Lys 595 600 605 Thr Ile Ile Ala Ser Pro Glu Met Leu Val Ile Leu Cys His Gly Glu 610 615 620 Ala Asp Ala Val Asn Pro Phe Asn Ala Arg Leu Lys Leu Thr Gly Asn 625 630 635 640 Gly Ile Ser His Leu Glu Ile Leu Gln Ser Thr Lys Met Ile Leu Lys 645 650 655 Gly Ser Lys Ile Ile Leu Gly Ala Cys Glu Thr Asp Leu Val Pro Pro 660 665 670 Leu Ser Asp Ile Met Asp Glu His Leu Ser Ile Ala Thr Ala Phe Leu 675 680 685 Thr Asn Gly Thr His Glu Ile Leu Gly Thr Met Trp Gln Ser Arg Pro 690 695 700 Glu Asp Ile Glu Asp Ile Ile Arg Leu Leu Cys Asp Lys Lys Thr Ser 705 710 715 720 Asp Thr Lys Ala Arg Gly Asp Leu Trp Asn Trp Gln Lys Glu Arg Ile 725 730 735 Arg Asp Tyr Trp Ala Gly Glu Asp Ala Met Phe Tyr Arg Ser Val Ala 740 745 750 Phe Arg Ile Ile Gly Leu Thr Ile 755 760 <210> 29 <211> 760 <212> PRT <213> Candidatus Jettenia <220> <223> Candidatus Jettenia sp.
AMX1 TPR-CHAT AA <400> 29 Met Lys Asn Arg Val Gln Ile Glu Ala Ile Ile Arg Asn Leu Gln Gly 1 5 10 15 Ala Ala Arg Asp Ser Lys Thr Asn Lys Leu Ser Glu Asn Ile Ile Ala
Tyr Asp Glu Tyr Arg Lys Ile His Lys Ser Ala Ser Leu Tyr Gln Phe 40 45 Gly Ile Ile Pro Ala Lys Glu Ser Ser Ser Val Leu Ala Glu Asn Glu 50 55 60 Thr Asn His Val Ala Cys Glu Asn Ala Ile Phe Glu Met Ala Glu Lys 65 70 75 80
Asn Ile Glu Asn Phe Ser Ser Glu Asp Ile His Lys Lys Arg Lys Glu 85 90 95 Thr Ile Glu Ser Ala Leu Arg Leu Leu Met Gly Leu Tyr Lys Asp Arg 100 105 110 His Glu Lys Leu Gln Pro Arg Thr Phe Val Leu Ile Ala Lys Ala Tyr 115 120 125 Leu Leu Arg Ser Leu Ile Thr Arg Pro Lys Gly Ile Thr Ile Pro Glu 130 135 140 Lys Lys Lys Glu Ala Leu Lys Lys Gly Ile Gly Phe Val Glu Ser Ala 145 150 155 160 Ile Lys Lys Ile Gln Ser Ser Glu Asn Ile Leu Ser His Ser Ser Asp 165 170 175 Ile Asp Leu Leu Glu Lys Ala Trp Arg Ile Lys Ser Gln Leu Tyr Leu 180 185 190 Glu Tyr Tyr Arg Val Asn Lys Asp Glu Cys Asp Lys Asn Thr Leu Lys 195 200 205 Glu Val Leu Glu Asn Ser Leu Ile Ser Gly Cys Asp Lys Phe Asp Lys 210 215 220 Asn Ile Glu Asp Val Gln Ile Ala Ile Arg Tyr Cys Glu Leu Glu Ser 225 230 235 240 Ser Arg Glu Tyr Leu Glu Gln Ile Ile Ser Ser His Leu Glu Gly Ile 245 250 255 Glu Phe Glu Lys Ala Arg Ala Tyr Lys Leu Leu Glu Leu Glu Asn Glu 260 265 270 Asn Glu Asp Glu Ile Arg Lys Ser Met Lys Val Val Ile Glu Glu Tyr 275 280 285 Leu Ser Gly Phe Ser Asp Pro Leu Trp Glu Asp Ala Val Glu Phe Ile 290 295 300 Asn Lys Leu Lys Ser Asp Asn Lys Asn Cys Trp Lys Glu Leu Ser Leu 305 310 315 320 Asp Met Tyr Lys Val Cys Arg Glu Gln Glu Ala Glu Thr Ala Ser Leu 325 330 335 His Leu Arg Trp Tyr Trp Ser Arg Gln Arg Arg Leu Tyr Asp Leu Ala 340 345 350 Phe Ile Ala Ala Asp Lys Glu Glu Glu Lys Ala Lys Ile Ala Asp Ser 355 360 365 Leu Lys Ser Arg Leu Ser Leu Arg Trp Ser Ala Leu Glu Glu Thr Gly 370 375 380 Lys Lys Ser Lys Asn Lys Arg Glu Lys Glu Glu Ile Ser Arg Ile Leu 385 390 395 400 Glu Ala Glu Ala Val Ala Met Leu Gly Gly Tyr Ile Lys Gly Ala Arg 405 410 415 Lys Ile Leu Lys Lys Arg Arg Arg Pro Leu Pro Asp Glu Gln Arg Ser 420 425 430 Ile Pro Lys Asp Trp Ile Val Ile His Phe Tyr Val Asn Gln Leu Glu 435 440 445 Asn Lys Cys Tyr Ala Leu Ile Tyr Asn Lys Asp Glu Asn Thr Trp Lys 450 455 460 Cys Glu Phe Val Lys Glu Tyr Gln Arg Leu Phe His Val Phe Leu Thr 465 470 475 480
Trp Gln Thr Asn Tyr Asn Arg Cys Lys Glu Arg Ala Ala Asp Ser Leu 485 490 495 Val Gln Leu Cys Lys Glu Ile Gly Asn Ala Met Pro Phe Leu Phe Asp 500 505 510 Glu Cys Ile Ile Pro Gln Asp Lys Asn Val Leu Phe Ile Pro His Asp 515 520 525 Phe Leu His Arg Leu Pro Leu His Gly Ala Ile His Glu Lys Asn Asn 530 535 540 Gly Val Phe Leu Glu Asn His Pro Cys Cys Tyr Leu Pro Ala Trp Ser 545 550 555 560 Phe Ala Ala Lys Glu Asn Asn Ala Val Val Gln Gly Ser Ile Leu Leu 565 570 575 Lys Asn Phe Pro Glu Tyr Ser Tyr Glu Glu Leu Val Ser Asn Ser Thr 580 585 590 Leu Trp Thr Ser Pro Val Lys Asp Pro Ala Ser Pro Asp Asp Leu Lys 595 600 605 Thr Ile Ile Ala Ser Pro Glu Met Leu Val Ile Leu Cys His Gly Glu 610 615 620 Ala Asp Ala Val Asn Pro Phe Asn Ala Arg Leu Lys Leu Thr Gly Asn 625 630 635 640 Gly Ile Ser His Leu Glu Ile Leu Gln Ser Thr Lys Met Ile Leu Lys 645 650 655 Gly Ser Lys Ile Ile Leu Gly Ala Cys Glu Thr Asp Leu Val Pro Pro 660 665 670 Leu Ser Asp Ile Met Asp Glu His Leu Ser Ile Thr Thr Ala Phe Leu 675 680 685 Thr Asn Asp Ala Arg Glu Ile Leu Gly Thr Met Tyr Glu Ala Leu Asp 690 695 700 Val Arg Ile Ser Ser Ile Ile Gln Lys Ile Tyr Arg Gln Glu His Tyr 705 710 715 720 Ser Ser Met Met Lys Gln Leu Trp Glu Trp Gln Lys Val Gly Val Glu 725 730 735 Asn Tyr Arg Glu Asn Gly Asp Thr Pro Ala Phe Tyr Asn Thr Val Val 740 745 750 Phe Arg Val Ile Gly Leu Ser Ile 755 760 <210> 30 <211> 773 <212> PRT <213> Candidatus Brocadia fulgida <220> <223> Candidatus Brocadia fulgida TPR-CHAT AA <400> 30 Met Asn Asp Thr Leu Leu Arg His Leu Gly Leu Asp Ile Glu Lys Ile 1 5 10 15 Ala Glu Glu Met Gln Leu Leu Ser Ala Asp Ile Glu Gly Asn Lys Glu Page 89
Ala Leu Val Lys Thr Leu Val Arg Tyr Asp Glu Ala Lys Arg Ile Ala 40 45 Lys Asn Ala Ala Leu Trp Gln Phe Gly Leu Arg Pro Asn Gln Ile Leu 50 55 60 Phe Ser Val Ile Asp Gln Thr Arg Gln Asn Gln Thr Met Lys Glu Gln 65 70 75 80 Ala Val Arg Ala Val Ala Thr Gln Tyr Leu Glu Thr Phe Lys Gln Ser 85 90 95 Arg Glu Asp Gly Arg Asp Lys Cys Leu Thr His Asn Asp Gln Arg Glu 100 105 110 Leu Leu Glu Ser Ala Leu Lys Ile Leu Val Asn Phe Glu Lys Glu Met 115 120 125 Asp Gly Lys Ile Glu Pro Ala Thr Cys Ala Leu Ile Ala Arg Thr Tyr 130 135 140 Leu Leu Arg Ser Ala Ile Met Leu Pro Lys Gly Phe Thr Val Pro Glu 145 150 155 160 Lys Lys Lys Glu Ala Leu Arg Lys Gly Ser Glu Tyr Ile Arg Thr Ile 165 170 175 Asp Asp Leu Thr Glu Glu Ala Leu Arg Val Arg Gly Ser Leu Leu Leu 180 185 190 Glu Gln Arg His Ile Asp Ile Leu Glu Lys Asn Arg Glu Ser Asn Gly 195 200 205 Asp Asn Gln Thr Leu Ile Lys Glu Leu Arg Glu Ala Leu Glu Asn Gly 210 215 220 Cys Asp Lys Phe Asn Asn Thr Ile Glu Asp Val Arg Ile Ala Leu Cys 225 230 235 240 Tyr Ile Glu Leu Thr Asp Asp Lys Thr Asp Leu Leu Gln Lys Ile Ile 245 250 255 Asp Ser Gln Leu Asp Phe Pro Gly Ile Glu Leu Tyr Arg Leu Lys Ala 260 265 270 Tyr Phe Leu Lys Gly Asp Tyr Ala Ala Ile Ser Asp Glu Ala Leu Lys 275 280 285 Glu Glu Leu Ser Gly Ile Arg Phe Asn His Pro Val Trp Asn Glu Ala 290 295 300 Met Ile Phe Ile Lys Gln Leu Lys Asp Ala Gln Ala Asp Cys Trp Arg 305 310 315 320 Lys Leu Ala Leu Ala Ala Tyr Gln Val Cys Arg Thr Arg Glu Ser Glu 325 330 335 Thr Ser Ser Leu His Leu Arg Trp Tyr Trp Ser Gly Tyr Arg Leu Leu 340 345 350 Tyr Asp Leu Ala Phe Ile Ala Glu Asp Asp Leu His Arg Lys Ala Glu 355 360 365 Ile Ala Asp Ser Leu Lys Ser Arg Val Ser Leu His Ala Lys Ala Leu 370 375 380 Asp Glu Ile Ile Lys Asn Asp Lys Glu Arg Glu Glu Tyr Tyr Asn Ala 385 390 395 400 His Ala Val Ala Tyr Ala Gly Gly Tyr Val Lys Gly Ala Gly Arg Ile 405 410 415 His Thr Gly Arg Lys Glu Lys Asp Cys Asp Thr Asn Asn Val Phe Lys Page 90
Ala Leu Pro Lys Asp Val Ala Ile Val Ala Phe Tyr Leu Asn Tyr Cys 435 440 445 Glu Lys Asn Lys Asp Ser Arg Gly Arg Gly Tyr Ala Leu Ile Ala Glu 450 455 460 Asn Gly Thr Trp Asn Ile Lys Glu Phe Pro Phe Asp Ser Leu Tyr Lys 465 470 475 480 Ala Tyr Leu Thr Trp Gln Thr Asn Tyr Ala Arg His Lys Glu Ser Ala 485 490 495 Ser Pro Ser Leu Val Glu Leu Cys Glu Glu Ile Gly Arg Ala Met Pro 500 505 510 Phe Leu Phe Glu Ile Thr Lys Lys Arg Ile Val Phe Val Pro His Asp 515 520 525 Phe Leu His Arg Leu Pro Leu His Gly Ala Ile Lys Arg Glu Trp Pro 530 535 540 Lys Val Leu Leu Glu Glu Tyr Ser Cys Leu Tyr Leu Pro Ala Trp Ser 545 550 555 560 Leu Leu His Ala Asp Thr Thr Lys Ser Ser Gln Thr Ala Arg Lys Arg 565 570 575 Met Leu Ile Glu Cys Phe His Glu Tyr Asp Tyr His Glu Leu Gln Thr 580 585 590 Lys Ile Asn Ala Gln Ile Lys Glu Ser Lys Gly Val Val Trp Glu Lys 595 600 605 Arg Glu Lys Ala Lys Pro Lys Asp Leu Leu Gln Ile Pro Glu Ala Pro 610 615 620 Glu Ile Leu Met Ile Leu Ser His Gly Arg Ala Asp Met Thr Asn Pro 625 630 635 640 Tyr Tyr Ala Arg Leu Lys Leu Glu Gly Gly Asp Val Ser Ala Leu Glu 645 650 655 Ile Met Lys Ala Lys Thr Gly Thr Met Ser Ile Lys Gly Ser Asn Val 660 665 670 Ile Met Gly Cys Cys Glu Thr Asp Leu Leu Pro Val Leu Ser Thr Pro 675 680 685 Ile Asp Glu His Val Ser Pro Ala Thr Ala Leu Tyr Thr Arg Gly Ala 690 695 700 Asn Phe Val Val Gly Thr Met Trp Glu Ile Asn Pro Ile Asp Ile Glu 705 710 715 720 Arg His Phe Ile Glu Leu Leu Thr Lys Asn Asp Asn Ser Met Leu Glu 725 730 735 Gly Val Gly Asn Trp Gln Arg Glu Gly Leu Ser Asp Asp Lys Trp Lys 740 745 750 Lys His Lys Glu Ser Arg Phe Phe Tyr Ala Ile Ile Gly Phe Arg Val 755 760 765 Leu Gly Ile Phe Thr 770 <210> 31 <211> 751 <212> PRT <213> Desulfonema ishimotonii Page 91
<220> <223> Desulfonema ishimotonii TPR-CHAT AA <400> 31 Met Ser Asn Pro Ile Arg Asp Ile Gln Asp Arg Leu Lys Thr Ala Lys 1 5 10 15 Phe Asp Asn Lys Asp Asp Met Met Asn Leu Ala Ser Ser Leu Tyr Lys
Tyr Glu Lys Gln Leu Met Asp Ser Ser Glu Ala Thr Leu Cys Gln Gln 40 45 Gly Leu Ser Asn Arg Pro Asn Ser Phe Ser Gln Leu Ser Gln Phe Arg 50 55 60 Asp Ser Asp Ile Gln Ser Lys Ala Gly Gly Gln Thr Gly Lys Phe Trp 65 70 75 80 Gln Asn Glu Tyr Glu Ala Cys Lys Asn Phe Gln Thr His Lys Glu Arg 85 90 95 Arg Glu Thr Leu Glu Gln Ile Ile Arg Phe Leu Gln Asn Gly Ala Glu 100 105 110 Glu Lys Asp Ala Asp Asp Leu Leu Leu Lys Thr Leu Ala Arg Ala Tyr 115 120 125 Phe His Arg Gly Leu Leu Tyr Arg Pro Lys Gly Phe Ser Val Pro Ala 130 135 140 Arg Lys Val Glu Ala Met Lys Lys Ala Ile Ala Tyr Cys Glu Ile Ile 145 150 155 160 Leu Asp Lys Asn Glu Glu Glu Ser Glu Ala Leu Arg Ile Trp Leu Tyr 165 170 175 Ala Ala Met Glu Leu Arg Arg Cys Gly Glu Glu Tyr Pro Glu Asn Phe 180 185 190 Ala Glu Lys Leu Phe Tyr Leu Ala Asn Asp Gly Phe Ile Ser Glu Leu 195 200 205 Tyr Asp Ile Arg Leu Phe Leu Glu Tyr Thr Glu Arg Glu Glu Asp Asn 210 215 220 Asn Phe Leu Asp Met Ile Leu Gln Glu Asn Gln Asp Arg Glu Arg Leu 225 230 235 240 Phe Glu Leu Cys Leu Tyr Lys Ala Arg Ala Cys Phe His Leu Asn Gln 245 250 255 Leu Asn Asp Val Arg Ile Tyr Gly Glu Ser Ala Ile Asp Asn Ala Pro 260 265 270 Gly Ala Phe Ala Asp Pro Phe Trp Asp Glu Leu Val Glu Phe Ile Arg 275 280 285 Met Leu Arg Asn Lys Lys Ser Glu Leu Trp Lys Glu Ile Ala Ile Lys 290 295 300 Ala Trp Asp Lys Cys Arg Glu Lys Glu Met Lys Val Gly Asn Asn Ile 305 310 315 320 Tyr Leu Ser Trp Tyr Trp Ala Arg Gln Arg Glu Leu Tyr Asp Leu Ala 325 330 335 Phe Met Ala Gln Asp Gly Ile Glu Lys Lys Thr Arg Ile Ala Asp Ser 340 345 350 Page 92
Leu Lys Ser Arg Thr Thr Leu Arg Ile Gln Glu Leu Asn Glu Leu Arg 355 360 365 Lys Asp Ala His Arg Lys Gln Asn Arg Arg Leu Glu Asp Lys Leu Asp 370 375 380 Arg Ile Ile Glu Gln Glu Asn Glu Ala Arg Asp Gly Ala Tyr Leu Arg 385 390 395 400 Arg Asn Pro Pro Cys Phe Thr Gly Gly Lys Arg Glu Glu Ile Pro Phe 405 410 415 Ala Arg Leu Pro Gln Asn Trp Ile Ala Val His Phe Tyr Leu Asn Glu 420 425 430 Leu Glu Ser His Glu Gly Gly Lys Gly Gly His Ala Leu Ile Tyr Asp 435 440 445 Pro Gln Lys Ala Glu Lys Asp Gln Trp Gln Asp Lys Ser Phe Asp Tyr 450 455 460 Lys Glu Leu His Arg Lys Phe Leu Glu Trp Gln Glu Asn Tyr Ile Leu 465 470 475 480 Asn Glu Glu Gly Ser Ala Asp Phe Leu Val Thr Leu Cys Arg Glu Ile 485 490 495 Glu Lys Ala Met Pro Phe Leu Phe Lys Ser Glu Val Ile Pro Glu Asp 500 505 510 Arg Pro Val Leu Trp Ile Pro His Gly Phe Leu His Arg Leu Pro Leu 515 520 525 His Ala Ala Met Lys Ser Gly Asn Asn Ser Asn Ile Glu Ile Phe Trp 530 535 540 Glu Arg His Ala Ser Arg Tyr Leu Pro Ala Trp His Leu Phe Asp Pro 545 550 555 560 Ala Pro Tyr Ser Arg Glu Glu Ser Ser Thr Leu Leu Lys Asn Phe Glu 565 570 575 Glu Tyr Asp Phe Gln Asn Leu Glu Asn Gly Glu Ile Glu Val Tyr Ala 580 585 590 Pro Ser Ser Pro Lys Lys Val Lys Glu Ala Ile Arg Glu Asn Pro Ala 595 600 605 Ile Leu Leu Leu Leu Cys His Gly Glu Ala Asp Met Thr Asn Pro Phe 610 615 620 Arg Ser Cys Leu Lys Leu Lys Asn Lys Asp Met Thr Ile Phe Asp Leu 625 630 635 640 Leu Thr Val Glu Asp Val Arg Leu Ser Gly Ser Arg Ile Leu Leu Gly 645 650 655 Ala Cys Glu Ser Asp Met Val Pro Pro Leu Glu Phe Ser Val Asp Glu 660 665 670 His Leu Ser Val Ser Gly Ala Phe Leu Ser His Lys Ala Gly Glu Ile 675 680 685 Val Ala Gly Leu Trp Thr Val Asp Ser Glu Lys Val Asp Glu Cys Tyr 690 695 700 Ser Tyr Leu Val Glu Glu Lys Asp Phe Leu Arg Asn Leu Gln Glu Trp 705 710 715 720 Gln Met Ala Glu Thr Glu Asn Phe Arg Ser Glu Asn Asp Ser Ser Leu 725 730 735 Phe Tyr Lys Ile Ala Pro Phe Arg Ile Ile Gly Phe Pro Ala Glu 740 745 750 Page 93
<210> 32 <211> 669 <212> PRT <213> Bacteria <prokaryote> <220> <223> bacterium BMS3Abin06 TPR-CHAT AA <400> 32 Met Pro Ala Thr Pro Lys Ile Glu Asp Ile Ala Asn Asn Leu Arg Lys 1 5 10 15 Ala Ala Lys Ser Pro Asp Gly Lys Lys Leu Ala Gly Ala Leu Leu Asp
Tyr Ile Glu Ala Lys Lys Thr Asp Pro Asp Ala Pro Leu Tyr Gln Phe 40 45 Gly Ile Thr Pro Thr Lys Glu Leu Ala Glu Ile Leu Ser Glu Leu Lys 50 55 60 Asn Cys Pro Glu Trp Glu Glu Ala Val Cys Glu Lys Ala Gly Thr Tyr 65 70 75 80 Leu Gln Ile Ala Lys Lys Glu Thr Leu Lys His Lys Ile Leu Lys Lys 85 90 95 His Ile Glu Asp Ala Ile Gly Leu Leu Ile Glu Asn Cys Asp Val Ser 100 105 110 Glu Lys Gly Glu Gly Tyr Arg Pro Phe Ala Leu Leu Ala Lys Ala Tyr 115 120 125 Leu Met Arg Ser Asn Ile Ile Arg Pro Lys Gly Ile Thr Ile Pro Glu 130 135 140 Arg Lys Lys Glu Ala Ile Asp Lys Gly Ile Lys Tyr Ala Asp Val Ala 145 150 155 160 Thr Asn Leu Ala Lys Glu Lys Lys Glu Ile Lys Asp Ala Trp Arg Ile 165 170 175 Lys Ala Leu Leu Tyr Leu Glu Leu Gln Arg Ile Gln Arg Gly Thr Cys 180 185 190 Lys Ala Asp Glu Lys Glu Ser Tyr Thr Asp Lys Ile Thr Asp Ala Leu 195 200 205 Tyr Ser Ala Val Asp Lys Gly Cys Asn Asp Ile Asn Asn Phe Val Glu 210 215 220 Asp Leu Lys Ile Ile Val His Tyr Ser Arg Ala Asn Ser Asn Asp Ala 225 230 235 240 Tyr Leu Lys Asn Ile Asp Leu His Ser Leu Ser His Glu Lys Ile Glu 245 250 255 Leu Glu Lys Ala Met Ala Tyr Lys Ile Leu Lys Asp Thr Asp Asn Leu 260 265 270 Tyr Glu Glu Met Gln Lys Leu Thr Val Arg Leu Glu Asn Ile Tyr Leu 275 280 285 Thr Ser Pro Val Trp Asp Asp Thr Val Asn Phe Ile Asn Glu Leu Arg 290 295 300 Val Asp Asn Ile Glu Gly Trp Lys Asp Leu Ala Ile Leu Thr Trp Glu Page 94
Ala Cys Gly Lys Val Leu Asn Lys Thr Met Ser Asn Leu His Ile Arg 325 330 335 Trp Tyr Trp Ser Arg Gln Arg Leu Leu Tyr Asp Leu Ala Phe Met Ala 340 345 350 Ala Asp Lys Phe Ser Asp Lys Gly Glu Lys Tyr His Lys Lys Ala Asp 355 360 365 Ile Ala Asp Ser Leu Lys Ser Arg Pro Ala Leu Arg Trp Asn Ala Leu 370 375 380 Asn Glu Ser Ala Lys Asn Tyr Glu Ile Leu Lys Lys Ser Leu Glu Ala 385 390 395 400 Glu Ala Asp Ser Gly Tyr Leu Lys Asn Ile Lys Val Thr Tyr Ala Lys 405 410 415 Thr Lys Pro Arg Val Phe Asn Phe Asn Lys Asp Ser Val Pro Glu Gly 420 425 430 Trp Ile Val Ile His Phe Tyr Ile Asn Gln Leu Glu Lys Lys Gly Tyr 435 440 445 Ala Leu Ile Tyr Asp Asn Thr Asp Pro Gly Lys Glu Glu Ser Lys Lys 450 455 460 Trp Lys Glu Glu Pro Phe Glu Tyr Asn Glu Leu Phe Asn Ser Tyr Ile 465 470 475 480 Glu Trp Gln Gly Asn Tyr Asn Arg Leu Pro His Gly Glu Lys His Lys 485 490 495 Ser Ala Glu Ser Leu Val Ala Leu Cys Lys Glu Ile Gly Lys Ala Leu 500 505 510 Pro Phe Leu Phe Lys Leu Pro Asn Asn Glu Gln Glu Asn Thr Ser Val 515 520 525 Leu Phe Ile Pro His Asp Phe Leu His Arg Leu Pro Leu His Ala Ala 530 535 540 Ile Asn Gly Glu Asn Asn Val Val Phe Leu Ile Asn His Pro Asn Cys 545 550 555 560 Tyr Leu Pro Ala Trp Ser Phe Tyr Arg Asp Lys Ile Asp Lys Asn Val 565 570 575 Glu Asn Met Leu Leu Phe Lys Asn Val Glu Asn Ala Val Asp Leu Arg 580 585 590 Glu Leu Lys Gly Ile Ser Lys Leu Asn Cys Trp Glu Ile Asn Thr Asn 595 600 605 Glu Val Ser Thr Lys Glu Phe Val Thr Thr Leu Glu Lys Leu Pro Lys 610 615 620 Pro Pro Lys Val Leu Ser Ile Phe Cys His Gly Lys Ala Asn Ala Val 625 630 635 640 Asn Pro Phe Asn Ala Lys Leu Lys Leu Arg Val His Glu Lys Ile Thr 645 650 655 Leu Gln Ser Leu Val Leu Asn Ser Gln Gln Leu Ile Asp 660 665 <210> 33 <211> 769 <212> PRT <213> Desulfobacteraceae Page 95
<220> <223> Desulfobacteraceae bacterium 4572 88 TPR-CHAT AA <400> 33 Met Thr Ser Ser Arg Thr Asn Cys Ser Phe Ile Asp Arg Ile Glu Lys 1 5 10 15 Ala Leu Gln Lys Glu Asp Leu Glu Ser Thr Leu Pro Glu Leu Ala Leu
Arg Leu Ile Glu Phe Glu Thr Ala Asn Ala Glu Pro Glu Asn Ala Leu 40 45 Cys Gln Arg Gly Ile Ser Asn Ala Asn Asn Ala Ala Val Arg Ile Ala 50 55 60 Lys Ala Leu Gly Glu Lys Ser Ala Leu Ala Asp Met Ala Glu Val Arg 65 70 75 80 Ile Lys Asp Tyr Glu Val Arg Lys Pro Arg Leu Thr His Arg Gln Arg 85 90 95 Arg Gln Tyr Leu Glu Asp Thr Ile Arg Ile Leu Gln Pro Glu Glu Glu 100 105 110 Lys Ser Lys Glu Ser Gly Met Leu Ala Ser Leu Ala Arg Val Tyr Leu 115 120 125 Tyr Arg Gly Val Leu Tyr Arg Pro Lys Gly Arg Ile Thr Pro Ala Arg 130 135 140 Lys Thr Glu Ala Val Arg Lys Ala Val Arg Leu Ser Glu Lys Ala Ile 145 150 155 160 Gln Asn Leu Ser Asp Lys Ser Gly Lys Ala Val Phe Val Trp Arg Thr 165 170 175 Trp Ala Glu Ala Ala Leu Glu Leu Glu Arg Ala Gly Asp Tyr Ser Ala 180 185 190 Pro Leu Glu Thr Leu Glu Ala Ala Ala Leu Gln Ile Asn Ala Asp Gly 195 200 205 Ile Thr Ser Leu Thr Asp Ile Leu Ile Leu Leu Arg Tyr Ala Glu Arg 210 215 220 Ser Lys Lys Asn Ala Phe Lys Gly Lys Leu Thr Asp Leu Leu Asp Lys 225 230 235 240 Lys Glu His Trp Trp Gly His Thr Ser Asp Ile Tyr Leu Leu Lys Ala 245 250 255 Arg Ile Ala Phe Leu Phe Gly His Ser Asp Lys Glu Val Trp Lys Tyr 260 265 270 Leu Lys Asn Ala Leu Asp His Val Pro Asp Ala Phe Ser Asn Pro Phe 275 280 285 Trp Asp Asp Leu Val Asp Phe Val Lys Lys Leu Arg Asp Glu Glu Ser 290 295 300 Asp Met Trp Lys Lys Thr Ala Ile Arg Ala His Gly Glu Cys Arg Lys 305 310 315 320 Lys Glu Ala Glu Ile Ala Ser Gly Val Val Leu Arg Trp Tyr Trp Ser 325 330 335 Arg Gln Lys Asp Leu Tyr Asp Leu Ala Phe Leu Ala Ala Asp His Ala 340 345 350 Page 96
Glu Lys Lys Ala Glu Ile Ala Asp Ser Leu Lys Ser Arg Pro Val Leu 355 360 365 Arg Tyr Gln Thr Leu Arg Glu Leu Lys Asp Ile Gly Thr Ile Gly Glu 370 375 380 Ile Leu Asp Arg Glu Asp Glu Ala Arg Asp Gly Arg Tyr Leu Lys Thr 385 390 395 400 Lys Pro Glu Pro Lys Glu Lys Glu Ile Val Lys Glu Ile Lys Lys Lys 405 410 415 Gln Ala Val Pro Phe Lys Asp Met Pro Glu Pro Trp Ile Ala Ile His 420 425 430 Phe Tyr Leu Asn Asp Phe Glu Glu Lys Gly Tyr Ala Leu Ile Phe Asp 435 440 445 Ala Thr Ser Arg Asp Asp Asp Gly Trp Lys Glu Cys Arg Phe Asp Tyr 450 455 460 Arg Glu Leu His Arg Lys Phe Met Ala Trp Gln Glu Leu Tyr Phe Ser 465 470 475 480 Gly Ser Glu Asp Ser Ala Ala Asp Ala Leu Val Leu Leu Cys Arg Glu 485 490 495 Ile Gly Arg Ala Met Pro Phe Leu Phe Asp Gly Thr Leu Pro Glu Asn 500 505 510 Ser Arg Val Leu Trp Ile Pro His Gly Phe Leu His Arg Leu Pro Leu 515 520 525 His Ala Ala Ile Arg Ala Asp Glu Asn Asp Thr Leu Phe Leu Glu Lys 530 535 540 His Ile Ser Arg Tyr Leu Pro Ala Trp Asn Met Leu Thr Ser Asp Ser 545 550 555 560 Val Lys Asp Asn Glu Ala Ser Glu Asp Lys Gly Gly Phe His Met Ile 565 570 575 Lys Arg Leu Arg Pro Glu Asp Ser Asp Asn Tyr Phe Lys Leu Asn Lys 580 585 590 Arg Lys Trp Lys Asn Lys Glu Asp Glu Gly Ile Tyr Arg Ala Arg Glu 595 600 605 Glu Asp Leu Lys Ala Ser Met Glu Lys Asn Pro Gln Ala Leu Thr Leu 610 615 620 Ile Cys His Gly His Gly Asp Ile Leu Asn Pro Leu Lys Ser Trp Leu 625 630 635 640 Glu Leu Glu Asp Ser Gly Met Thr Val Leu Asp Ile Leu Lys Ser Glu 645 650 655 Ala Lys Leu Ser Gly Thr Arg Val Leu Leu Gly Ala Cys Glu Ser Asp 660 665 670 Met Ala Pro Pro Thr Glu His Thr Ile Asp Glu His Leu Ser Leu Cys 675 680 685 Thr Val Phe Leu Ser His Asn Ala Arg Glu Ile Val Ala Gly Leu Trp 690 695 700 Glu Ile Gln Thr Asn Met Val Asp Gly Cys Tyr Asn Gln Ile Leu Asp 705 710 715 720 Ser Asn Asp Ile Ser Glu Ala Leu Lys Gln Trp Gln Glu Asp Gln Met 725 730 735 Lys Lys Arg Trp Lys Lys Lys Gln Asp His Thr Ile Phe Tyr Leu Ile 740 745 750 Page 97
Ala Pro Phe Arg Val Met Gly Phe Pro Lys Arg Val Ser Ser Glu Ala 755 760 765 Asn <210> 34 <211> 778 <212> PRT <213> Deltaproteobacteria <220> <223> Deltaproteobacteria bacterium TPR-CHAT AA <400> 34 Met Arg Tyr Ser Ser Arg Thr Asn Cys Glu Ala Ile Asp Asn Leu Ala 1 5 10 15 Glu Ala Leu Gln Asp Gln Glu Asn Met Pro Glu Ile Ala Arg Arg Val
Leu Glu Phe Glu Ala Glu Asn Ala Lys Pro Glu Asn Ala Leu Cys Gln 40 45 His Gly Leu Pro His Thr Lys Lys Ala Ala Ser Gln Ile Ala Gly Val 50 55 60 Arg Asp Lys His Ser Glu Phe Tyr Asp Asn Ala Leu Leu Asp Leu Val 65 70 75 80 Glu Glu Trp Leu Lys Thr Tyr Glu Glu Ala Lys Lys Leu Thr His Arg 85 90 95 Glu Arg Arg Gln Glu Met Glu Asp Lys Ile Arg Val Leu Gln Pro Val 100 105 110 Leu Gln Ala Lys Gly Lys Asp Ala Asp Pro Arg Phe Leu Ser Leu Leu 115 120 125 Ala Arg Ile Tyr Leu Tyr Arg Gly Met Leu Phe Arg Pro Lys Gly Phe 130 135 140 Thr Thr Pro Ala Arg Lys Ile Glu Ala Leu Lys Lys Ala Val Gln Leu 145 150 155 160 Ser Glu Lys Ala Val Glu Lys Glu Lys Asp Asn Pro Asn Phe Leu Arg 165 170 175 Thr Trp Ala Gln Ala Ala Leu Glu Leu Glu Ala Ile Pro Glu Thr Ser 180 185 190 Phe Lys Val Ser Ser Gly Leu Leu Lys Asp Ala Ala Val Cys Ile Asn 195 200 205 Arg Asp Gly Ile His Ser Leu Asn Asp Leu Gln Val Ile Leu Glu Tyr 210 215 220 Ala Glu Ser Glu Gly Lys Thr Ser Phe Leu Gln His Val Leu Val Glu 225 230 235 240 Lys Arg Tyr Trp Lys Arg Pro Phe Asp Leu Phe Leu Leu Lys Ala Arg 245 250 255 Ala Ala Phe Ala Leu Asn Arg Met Asp Asp Val Arg Tyr Phe Leu Lys 260 265 270 Page 98
Ser Ala Met Asp Lys Thr Pro Lys Ala Leu Ser Ser Pro Phe Trp Asp 275 280 285 His Leu Val Asp Phe Leu Lys Lys Leu Arg Thr Lys Glu Gly Ser Asp 290 295 300 Leu Trp Lys Glu Met Ala Val Ala Ala His Arg Leu Cys Arg Glu Lys 305 310 315 320 Glu Val Lys Ile Ala Asn Asn Ile Tyr Leu Tyr Arg His Trp Ala Arg 325 330 335 Gln Lys Ser Leu Tyr Asn Met Ala Phe Leu Ala Gln Asn Asp Leu Lys 340 345 350 Glu Lys Ala Lys Ile Ala Asp Ser Leu Lys Ser Arg Pro Val Leu Arg 355 360 365 Tyr Gln Ala Leu Arg Glu Met Lys Glu His Gln Asn Ile Ala Lys Leu 370 375 380 Leu Glu Gln Asp Asp Gln Glu Arg Asp Gly Gly Tyr His Lys Gln Gln 385 390 395 400 Val Glu Met Asp Glu Arg Thr Gly Lys Arg Leu Ser Glu Lys Met Glu 405 410 415 Lys Ala Gly Val Ser Tyr Glu Asn Leu Pro Val Pro Trp Ile Ser Val 420 425 430 His Phe Tyr Leu Asn Glu Ser Glu Asn Ser Glu Asp Glu Gly Ser Lys 435 440 445 Gly Tyr Ala Leu Ile Phe Asp Ala Leu Thr Gln Ser Trp Lys Glu Arg 450 455 460 Arg Phe Asp Tyr Ala Lys Leu His Arg Lys Phe Met Thr Trp Gln Glu 465 470 475 480 Ala Tyr Ile Ser Ala Lys Lys Ser Ser Phe Ala Lys Asp Ser Leu Val 485 490 495 Glu Leu Cys Arg Glu Ile Gly Asn Thr Met Pro Phe Leu Phe Asp Thr 500 505 510 Ala Cys Ile Arg Asp Gly Ala Pro Val Leu Trp Ile Pro His Gly Phe 515 520 525 Leu His Arg Leu Pro Leu His Ala Ala Ile Arg Asp Glu Ala Thr Asn 530 535 540 Glu Ile Phe Leu Glu Asn His Ala Ser Arg Tyr Leu Pro Ala Trp Ser 545 550 555 560 Ile Leu Asn Ser Ala Ser Ala Arg Arg Gly Lys Asp Ser Tyr Met Ile 565 570 575 Lys Arg Phe Arg Ala Glu Asp Tyr Glu Lys Glu Pro Phe Ser Glu Leu 580 585 590 Glu Asp Met Glu Trp Asp Asn Glu Glu His Glu Lys Leu Ala Thr Pro 595 600 605 Asp Asp Leu Lys His Phe Met Ala Lys Asn Pro Gly Val Phe Ala Val 610 615 620 Leu Cys His Gly His Gly Asp Ile Leu Asn Pro Leu Lys Ser Trp Leu 625 630 635 640 Glu Leu Glu Gly Gly Gly Val Ser Val Leu Asp Ile Leu Arg Tyr Glu 645 650 655 Lys Ala Asn Leu Ser Gly Thr Arg Val Leu Leu Gly Ala Cys Glu Ala 660 665 670 Page 99
Asp Met Ala Pro Pro Val Glu Tyr Ala Ile Asp Glu His Val Ser Leu 675 680 685 Ser Ala Ala Phe Leu Ser His Lys Ala Gln Glu Val Ile Ala Gly Leu 690 695 700 Trp Glu Ile Asn Ile Gly Glu Ala Asp Glu Cys Tyr Ala Glu Ile Leu 705 710 715 720 Asp Cys Ser Asp Leu Ser Thr Glu Leu Lys Asp Trp Gln Cys Asp Trp 725 730 735 Val Glu Lys Trp Arg Asp Asp Val Glu Ala Ser Gly Asp Asn Ser Thr 740 745 750 Phe Tyr His Ile Thr Pro Phe Arg Ile Met Gly Phe Pro Leu Lys Leu 755 760 765 Lys Glu Asn Asn Glu Ser Glu Ala Lys Gln 770 775 <210> 35 <211> 812 <212> PRT <213> Deltaproteobacteria <220> <223> Deltaproteobacteria bacterium TPR-CHAT AA 2 <400> 35 Met Asp Leu Leu Ser Gly Asn Ser Gln Arg Gly Leu Glu Met Ser Lys 1 5 10 15 Lys Ser Ala Gln Lys Ser Val His Ser Lys Tyr Lys Glu Ala Leu Glu
Ser Glu Thr Ala Phe Ser Leu Leu Pro Pro Ser His Ser Glu Trp Pro 40 45 Glu Lys Leu Lys Ser Arg Asn Lys Ala Phe Leu Ser Thr Leu Gln Ala 50 55 60 Lys Thr Ser Thr Asp Glu Gly Leu Leu Tyr Arg Gly Ala Ala Leu Arg 65 70 75 80 Phe Leu Asn Asp Ser Asp Ala Asn Ala Gly Ala Phe Tyr Gly Met Pro 85 90 95 Pro Thr Gln Asn Ala Cys Glu Asn Leu Leu Lys Val His Glu Lys Ile 100 105 110 Ala Val Ile His Ala Arg Ser Val Phe Glu Val Ala Glu Asn Val Leu 115 120 125 Lys Thr Val Cys Glu Ser Lys Met Glu Arg Arg Thr Arg His His Val 130 135 140 Leu Glu Thr Ala Ile His Arg Ile Ser Ala Leu Met Asp Lys Asp Lys 145 150 155 160 Glu His Trp Pro Phe Pro Glu Pro Glu Glu Pro Lys Val Trp Ala Phe 165 170 175 Leu Ala Gln Ala Tyr Phe Glu Arg Ser Arg Thr Ile Leu Pro Lys Gly 180 185 190 Ala Asp Phe Pro Gln Lys Lys Val Glu Gly Leu Arg Lys Ala Arg Ile
Trp Ala Glu Lys Ile Lys Asp Gly Asn Lys Asn Val Leu Leu Leu Leu 210 215 220 Ser Gln Ile Tyr Leu Glu Thr Gln Arg Val Ala Gly Asp Glu Ile Ser 225 230 235 240 Glu Lys Val Ile Lys Asp Ile Leu Phe Glu Cys Val Cys Glu Leu Ser 245 250 255 Ala Asp Ser Met Thr Arg Asp Glu Met Asp Ile Cys Leu His Leu Phe 260 265 270 Glu Lys Asp Gln Asp Ser Arg Asp Ile His Tyr Leu Glu Thr Ile Met 275 280 285 Ala Ser Asp Ser Asp Ala Pro Ser Phe Phe Lys Ala Arg Ala Ala Phe 290 295 300 Leu Lys Gly Asp Lys Gln Lys Val Gly Ser Glu Leu Arg Ala Thr Leu 305 310 315 320 Lys Phe Leu Lys Ser Thr Pro Phe Ser His Pro Val Trp Glu Trp Cys 325 330 335 Ala Asp Phe Leu Val Asn Leu Ser Lys Glu Glu Phe Pro Gly Trp Gln 340 345 350 Asp Leu Ala Tyr Glu Thr Trp Lys Val Cys Arg Asn Leu Glu Lys Lys 355 360 365 Ser Phe Lys Gly His Ile Leu Arg Trp Tyr Trp Ser Arg Leu Gly Gln 370 375 380 Leu Tyr Asp Thr Ala Phe Thr Ala Val Ile Glu Lys Ala Gly Ala Ser 385 390 395 400 Gln Asn Asp Lys Asp Arg Met Tyr Trp Leu Trp Thr Ala Ala Glu Ile 405 410 415 Ala Asp Ser Leu Lys Ser Leu Pro Thr Val Arg Trp Met Ala Val Glu 420 425 430 Asn Asp Glu Ala Leu Phe Gln Gln Glu Lys Glu Asp Lys Glu Gly Lys 435 440 445 Glu Glu Trp Phe Glu Asn Glu Thr Arg Ser Ile Glu Asp Arg Tyr Leu 450 455 460 Ile Asn Leu Asn Gln Gly Tyr Ser Asn Pro Val Pro Val Pro Pro Pro 465 470 475 480 Lys Pro Phe Ser Glu Leu Pro Glu Pro Trp Leu Ala Val His Phe Tyr 485 490 495 Val Glu Glu Asn Gly Gln Gly His Ala Leu Ile Tyr Asp Ser Leu Thr 500 505 510 Lys Cys Trp Glu Lys Glu Glu Ala Phe Asp Ala Ala Pro Leu Trp Glu 515 520 525 Ala Tyr Val Ser Trp Arg Thr Ser Tyr Ser Ala Gly Pro Gln Thr Glu 530 535 540 Arg Pro Asp Phe Ser Glu Thr Gln Met Ala Glu Leu Cys Lys Lys Leu 545 550 555 560 Gly Glu Asn Met Gly Phe Leu Phe Asp Leu Pro Glu Asp Asp Lys Pro 565 570 575 Arg Pro Val Leu Phe Val Pro His Arg Phe Leu His Met Met Pro Ile 580 585 590 His Ala Ala Cys Arg Asn Gly Thr Tyr Phe Phe Thr Gln Arg Pro Ser Page 101
Leu Tyr Leu Pro Ala Trp Ser Leu Thr Gly Phe Gln Pro Asp Glu Pro 610 615 620 Asn Thr Ser Asn Ile Gln Met Leu Tyr Lys Ser Phe Asp Asp Asp Asp 625 630 635 640 Tyr Ala Phe Asp Lys Leu Lys Ile Gln Gly Asn Trp Ser Lys Ile Lys 645 650 655 Asp Pro Ala Ser Pro Ser Asp Val Leu Thr Leu Asp Asp Ser Pro Glu 660 665 670 Leu Met Val Ile Leu Ser His Gly Ala Ser Asp Pro Thr Asn Pro Phe 675 680 685 Asn Cys Arg Leu Arg Leu Lys Asn Gly Asp Leu Arg Tyr Leu Asp Ile 690 695 700 Phe Gln Asp Gly Pro Met Leu Asn Gly Thr Arg Val Ile Leu Gly Ala 705 710 715 720 Cys Glu Thr Asp Met Ala Ser Pro Pro Glu Ser Ser Leu Asp Glu His 725 730 735 Leu Ser Leu Ser Thr Ala Phe Leu Gln Lys Gly Thr Gln Ser Val Met 740 745 750 Gly Thr Leu Trp Glu Val Arg Ser Gly Asp Val Glu Glu Met Val Leu 755 760 765 Arg Ile Arg Lys Ser Lys Thr Ala Ser Leu His Glu Ile Val Trp Glu 770 775 780 Lys Gln Lys Ser Trp Ile Gln Asn Pro Leu Glu Tyr Gln Lys Phe Tyr 785 790 795 800 Asp Ile Ile Pro Phe Arg Val Ile Gly Tyr Pro Ser 805 810 <210> 36 <211> 747 <212> PRT <213> Desulfobacterales <220> <223> Desulfobacterales bacterium TPR-CHAT AA <400> 36 Met Thr Ser Ile Cys Glu His Ile Gly Asn Ile Glu Lys Tyr Ile Glu 1 5 10 15 Lys Lys Asp Thr Ala Lys Leu Ala Asn Ser Ile Ile Asp Leu Glu Lys
Ala Phe Leu Lys Ser Lys Phe Gly Leu Cys Gln Arg Gly Ile Lys Asn 40 45 Lys Ala Gly Ser Val Glu Glu Ile Ile Lys Ala Arg Lys Asp Gly Val 50 55 60 Tyr Lys Lys Gln Met Glu Glu Thr Thr Lys Asn Trp Ile Ala Asp Tyr 65 70 75 80 Glu Lys Lys Lys Asp Lys Met Asn His Lys Glu Arg Lys His Lys Leu 85 90 95
Glu Glu Ile Ile Arg Leu Ile Lys Pro Ile Val Lys Asp Asn Tyr Pro 100 105 110 Glu Ser Tyr Ile Ile Leu Ala Lys Ala Tyr Leu Tyr Arg Gly Ile Ile 115 120 125 Phe Arg Pro Lys Gly Phe Arg Val Pro Ala Arg Lys Ile Glu Ala Leu 130 135 140 Lys Glu Ala Glu Lys Phe Ser Lys Lys Ala Cys Tyr Leu Leu Pro Asn 145 150 155 160 Asn Ile Asp Ala Leu Arg Val Trp Ala Tyr Ser Val Leu Glu Ile Glu 165 170 175 Phe Ile Phe Lys Glu Ser Ile Lys Leu Asp Asn Asp Leu Phe Asn Lys 180 185 190 Ala Ala Gln Cys Ile Ala Asn Asp Arg Ile Cys Asp Leu Lys Asp Met 195 200 205 Ala Val Ile Leu Arg Tyr Ser Glu Lys Glu Lys Asn Gln Thr Tyr Leu 210 215 220 Tyr Lys Ile Leu Asn Glu Asn Asn Asp Tyr Lys Arg Ala Tyr Asp Leu 225 230 235 240 Leu Leu Tyr Lys Ala Lys Ala Cys Phe Leu Leu Asn Lys Ala Asp Glu 245 250 255 Ile Lys Ser Tyr Ile Glu Arg Ala Ile Glu Gln Ala Pro Lys Ala Phe 260 265 270 Ser Asp Pro Phe Trp Asp Asp Ile Val Glu Phe Leu Leu Lys Leu Lys 275 280 285 Glu Asn Asp Ser Lys Leu Gly Glu Ile Trp Lys Asn Leu Ser Leu Lys 290 295 300 Thr Ile Glu Ala Cys Met Asn Asn Glu Leu Lys Ile Ser Ser Gly Ile 305 310 315 320 Asn Leu Ile Arg His Trp Ala Arg Gln Lys Pro Leu Tyr Asp Leu Ala 325 330 335 Phe Leu Ala Ala Ser Thr Pro Val Lys Lys Val Glu Ile Ala Asp Ser 340 345 350 Leu Lys Ser Arg Pro Ala Leu Arg Tyr Ala Val Leu Asn Asn Leu Lys 355 360 365 Asn Glu Ile Ser Glu Val Asp Glu Ile Leu Lys Ile Glu Asp Glu Ser 370 375 380 Arg Asp Gly Arg Tyr Ile Lys Lys Glu Ile Asp Leu Asp Ala Lys Lys 385 390 395 400 Ile Ala Asp Glu Ile Lys Ser Lys Asn Ile Ser Pro Glu Glu Leu Thr 405 410 415 Glu Pro Trp Ile Ala Val His Phe Tyr Leu Asn Glu Ile Asp Gly Asn 420 425 430 Gly Tyr Ala Ile Asn Phe Asp Ser Lys Thr Lys Lys Trp Glu Ile His 435 440 445 Ser Phe Lys Tyr Lys Glu Leu Tyr Glu Ser Phe Lys Ala Trp Tyr Val 450 455 460 Pro Tyr Ser His Lys Asp Glu Asp Asp Glu Phe Leu Ser Glu Met Ile 465 470 475 480 Leu Thr Leu Cys Lys Lys Ile Gly Glu Thr Met Pro Phe Leu His Ser 485 490 495
Phe Pro Ile Asn Ser Asn Ile Leu Trp Ile Pro His Gly Phe Leu His 500 505 510 His Leu Pro Leu His Ala Ala Ile Leu Asn Asn Gly Lys Ile Leu Phe 515 520 525 Glu Thr Cys Leu Ser Arg Tyr Leu Pro Ala Trp His Leu Gly Asn Phe 530 535 540 Ser Glu Lys Gln Ala Val Asn Gly Lys Gly Arg Phe Leu Leu Lys Arg 545 550 555 560 Phe Gly Asn Tyr His Phe Glu Glu Ile Cys Asp Leu Asp Trp Asp Lys 565 570 575 Lys Glu Ile Glu Glu Ala Thr Lys Glu Met Leu Ile Asn Tyr Leu Lys 580 585 590 Thr Asn Pro Glu Ile Leu Met Leu Leu Cys His Gly Glu Ala Asp Ile 595 600 605 Leu Asn Pro Phe Arg Ser Lys Leu Lys Leu Lys Glu Pro Gly Cys Ser 610 615 620 Ile Leu Asp Ile Leu Lys Ser Thr Asn Ser Asn Ile Arg Ser Thr Lys 625 630 635 640 Val Phe Leu Gly Ala Cys Glu Ser Asp Met Ala Gln Phe Ser Gly Asn 645 650 655 Thr Ile Asp Glu His Leu Ser Leu Ser Ser Val Leu Leu Ser Ile Gly 660 665 670 Cys Asn Glu Ile Ile Ala Gly Leu Trp Glu Val His Pro Tyr Gln Val 675 680 685 Glu Glu Cys Tyr Lys Asp Ile Ile Pro Ile Gln Ser Ser Val Lys Gln 690 695 700 Glu Ile Asp Tyr Thr Leu Lys Glu Trp Gln Ile Lys Ala Tyr Arg Glu 705 710 715 720 Gly Ile Arg Ile Tyr Arg Ile Ala Pro Phe Arg Ile Met Gly Leu Pro 725 730 735 Ser Lys Lys Ser Ser Asn Asn Glu Val Gln Ser 740 745 <210> 37 <211> 822 <212> PRT <213> Desulfonema magnum <220> <223> Desulfonema magnum TPR-CHAT AA <400> 37 Met Ser Ser Ala Phe Ser Gly Leu Lys Ile Pro Glu Leu Ser Val Asp 1 5 10 15 Pro Ala Glu Val Phe Lys Ser Asp Asn Pro Gln Leu Val Ser Val Leu
Leu Asp Glu Phe Glu Leu Gln Glu Gln Arg Pro Phe Phe Ser Gly Leu 40 45 Ile Pro Glu Lys Gln Ile Asn Ile Ala Leu Lys Lys Ser Pro Gln Leu
Lys Lys Leu Ala Cys His Leu Leu Glu Ala Tyr Glu Ile Asn Gly Arg 65 70 75 80 Arg Trp Lys His Ala Asp Arg Arg Arg Val Leu Glu Lys Ala Ile Arg 85 90 95 Leu Leu Glu Lys Val Ser Asn Glu Leu Lys Gly Asp Ile Gln Lys Leu 100 105 110 Glu Asn Asn Val Lys Glu Ser Gly Lys Asp Ser Glu Glu Leu Asn Lys 115 120 125 Thr Arg Glu Lys His Gly Glu Ile Leu Ala Asp Met Gly Arg Ala Tyr 130 135 140 Leu His Arg Ala Lys Ile Ile Arg Pro Lys Gly Phe Thr Ile Pro Ala 145 150 155 160 Lys Lys Lys Glu Ser Leu Arg Lys Ala Leu Asp Phe Cys Lys Glu Ala 165 170 175 Lys Ala Arg His Leu Ala Gly Leu Val Gly Leu Glu Met Asp Arg Cys 180 185 190 Asp Met Ser Asp Phe Asp Ser Glu Asn Ser Leu Glu Lys Leu Leu Arg 195 200 205 Asp Ala Thr Thr Gly Val Ser Leu Ser Lys Asp Gln Tyr Tyr Ser Asn 210 215 220 Glu Tyr Arg Lys Ala Glu Tyr Tyr Gln Met Cys Ile Arg Leu Ala Glu 225 230 235 240 Ile Glu Glu Asp Lys His Gly Ala Thr Asp Glu Gly Ser Glu Asn Asn 245 250 255 Arg Tyr Lys Arg Leu Glu Gly Ile Leu Gly Pro Lys Lys Ser Gln Gly 260 265 270 Lys Lys Lys Gly Lys Lys Asn Lys Lys Lys Lys Gly Thr Asn Ser Lys 275 280 285 Ser Ala Leu Glu Lys Ile Met Ala Phe Glu Lys Phe Lys Val Ala Val 290 295 300 Tyr Leu Gly Glu Gly Lys Glu Lys Thr His Trp Asn Asp Phe Ser Glu 305 310 315 320 Phe Leu Met Arg Tyr Pro Phe Ser His Pro Ser Trp Glu Asp Ser Val 325 330 335 Arg Phe Leu Arg Arg Leu Tyr Lys Lys Gly Asn Asp Lys Trp Arg Glu 340 345 350 Leu Ser Val Ala Leu Trp Glu Ile Ala Lys Lys Asn Ser Ala Lys Thr 355 360 365 Ser Ser Ile His Leu Arg Trp Tyr Trp Ser Arg Gln Arg Asp Leu Tyr 370 375 380 Asp Leu Ala Phe Leu Ser Ala Leu Glu Gln Ala Asp Ile Ala Glu Asp 385 390 395 400 Glu Ser Val Arg Gln Glu Lys Leu Arg Leu Ala Ala Lys Val Ala Asp 405 410 415 Ser Ala Lys Asn Arg Pro Ala Leu Thr Trp Gln Ala Met Glu Gln Met 420 425 430 Ala Asp Lys Asp Glu Ala Leu Lys Lys Glu Ile Glu Asn Tyr Ala Gln 435 440 445 Ala Leu Gly Gly Gly Tyr Ile Glu Gln Phe Glu Lys Thr Glu Leu Pro Page 105
Lys Asn Thr Asn Pro Gln Asp Asp Ile Pro Asp Asp Leu Lys Thr Ser 465 470 475 480 Gly Leu Val Val Gln Phe Tyr Leu Val His Leu Lys Asp Tyr Glu Asn 485 490 495 Gly Tyr Ala Leu Ile Tyr Asp Gly Lys Thr Lys Lys Trp Ser Cys Glu 500 505 510 Met Phe Asn Phe Phe Pro Ile Trp Glu Lys Tyr Leu Gln Trp Gln Ser 515 520 525 Val Tyr Phe Asp Leu Pro Ala Asn Gln Lys Glu Glu Ser Ala Asp Gln 530 535 540 Leu Lys Ala Leu Cys Leu Glu Leu Gly Arg His Leu Tyr Phe Leu Phe 545 550 555 560 Asp Phe Glu Glu Asn Arg Lys Ala His Glu Lys Asn Gln Asp Thr Ala 565 570 575 Lys Glu Asn Trp Asp Met Leu Phe Ile Pro His Asp Phe Leu His Arg 580 585 590 Val Pro Ile His Gly Ala Ile Lys Lys Asn Gly Thr Asn Ile Leu Leu 595 600 605 Lys Lys Phe Asn Cys Thr Tyr Phe Pro Thr Leu Pro Asn Ala Ser Ile 610 615 620 Thr Pro Glu Ser Pro Lys Ser Asp Asn Val Ala Glu Leu Ile Glu Tyr 625 630 635 640 Phe Ser Glu His Glu Lys Asp Tyr Ser Glu Tyr Phe Asp Lys Ile Glu 645 650 655 His Ile Phe Asp Lys Lys Asn Arg Ala Ala Thr Ser Gln Asp Leu Leu 660 665 670 Asp Ala Ala Ile Ser Pro Pro Arg Thr Leu Thr Ile Tyr Cys His Gly 675 680 685 Gln Ser Asp Val Asn Asn Pro Phe Tyr Ser Arg Leu Leu Leu Lys Asp 690 695 700 Lys Leu Glu Leu Ile Lys Leu Ala Thr Leu Gln Glu His Tyr Ser Gly 705 710 715 720 Thr His Ile Phe Leu Gly Ala Cys Glu Thr Asp Leu Met Pro Pro Leu 725 730 735 Ser Ala Pro Leu Asp Glu Gln Leu Ser Met Ala Ala Met Phe Leu Gln 740 745 750 Lys Gly Val Gly Ser Val Leu Gly Thr Leu Trp Glu Ala Tyr Pro Arg 755 760 765 Ala Val Lys Asp Ile Val Lys Asn Ile Leu Ser Ala Asp Asp Thr Gln 770 775 780 Tyr Phe Glu Lys Leu Phe Ser Leu Lys Lys Lys Ile Ser Val Asn Gln 785 790 795 800 Lys Lys Ser Leu Tyr Tyr Tyr Leu Cys Phe Lys Leu Tyr Arg Ser Phe 805 810 815 Asp Gln Ile Gly Lys Glu 820 <210> 38 <211> 754 Page 106
<212> PRT <213> Candidatus Magnetomorum <220> <223> Candidatus Magnetomorum sp. TPR-CHAT AA <400> 38 Met Met Asp Ser Thr Phe Gln Thr Ala Cys Gln Ser Ile Glu Glu Ile 1 5 10 15 Ser Asn Ala Ile Lys Asn Lys Lys Asp Asn Leu Pro Gln Ala Leu Ile
Lys Tyr Ala Gln Glu Asn Leu Asn Pro Leu Thr Ala Leu Cys Gln Arg 40 45 Gly Val Ser Asn Glu Lys Asn Ala Ile Gln Asp Ile Thr Asn Ala His 50 55 60 Asp Asp Glu Ser Tyr His Glu Ala Leu Phe Glu Leu Asn Asn Ile Lys 65 70 75 80 Leu Lys Ser Tyr Glu Ala Cys Lys Thr Ser Met Asn His Ser Gln Arg 85 90 95 Arg Met Phe Leu Glu Asp Met Ile Gln Val Leu Lys Asn Lys Glu Asn 100 105 110 Glu Phe Lys Thr Asn Thr Asp Phe Leu Leu Leu Met Ala Lys Ile Tyr 115 120 125 Phe Phe Arg Ser Leu Leu Phe Arg Pro Lys Gly Arg Thr Val Pro Ala 130 135 140 Arg Lys Ile Glu Ala Leu Lys Arg Ser Glu Asp Leu Ile Gln Gln Ile 145 150 155 160 Lys Asn Lys Ser Thr Asp Ala Trp Arg Leu Ser Gly Gln Ile Phe Leu 165 170 175 Ser Leu Ile Ala Ile Asn Glu Pro Tyr Asp Glu Glu Ile Phe Glu Glu 180 185 190 Val Val Phe Asn Ile Glu Glu Gln Phe Asp Met Glu Asn Asp Pro Leu 195 200 205 Thr Val Ile Asn Asp Ile Arg Val Leu Leu Thr Gly Ser Glu Gln Lys 210 215 220 Asn Phe Pro Asp Phe Leu Glu Lys Ile Ser Leu Lys Ala Leu Asn Asn 225 230 235 240 Tyr Thr Glu His Leu Asn Asp Val Phe Leu Leu Met Ala Arg Thr Ala 245 250 255 Phe Gln Lys Lys Gln Ile Glu Asp Thr Glu Leu Tyr Leu Thr Lys Ser 260 265 270 Met Asp Glu Ala Pro Ala Ala Phe Ala Asp Pro Tyr Trp Asp Asp Leu 275 280 285 Val Asp Phe Ile Asp Leu Leu Lys Thr Asn Asn Cys Phe Ile Trp Lys 290 295 300 Lys Ala Ala Leu Lys Ala His Ala Val Cys Cys Glu Lys Glu Thr Glu 305 310 315 320 Ile Gly Asn Ile Tyr Leu Arg Trp Tyr Trp Ser Arg Gln Asn Lys Leu 325 330 335 Page 107
Tyr Asp Leu Ala Phe Ile Ala Thr Glu Ser Leu Glu Asp Lys Val Met 340 345 350 Ile Ala Asp Ser Leu Lys Ser Arg Pro Cys Leu Arg Phe Lys Gln Leu 355 360 365 Arg Glu Met Ser Pro Tyr Ile Asn Asn Phe Asp His Ile Leu Asn Gln 370 375 380 Glu Asp Glu Ala Arg Asp Asn Arg Tyr Leu Lys Lys Lys Pro Ser Lys 385 390 395 400 Pro Arg Lys Lys Leu Pro Lys Lys Lys Phe Thr Asp Cys Gln Met Leu 405 410 415 Asp Asn Gln Trp Ile Val Val His Phe Tyr Ile Asn Glu Phe Glu Lys 420 425 430 Lys Ala Tyr Ala Leu Ile Phe Tyr Cys Glu Thr Gly Asn Ser Ala Ile 435 440 445 Glu Ser Phe Gln Tyr Thr Glu Leu Phe Arg Thr Phe Ile Ser Trp Gln 450 455 460 Glu Met Glu Leu Pro Glu Tyr His Tyr Glu Asn Asn Asn Glu Ala Leu 465 470 475 480 Lys Ala Ile Asn Cys Lys Lys Gly Lys Tyr Leu Tyr Gln Leu Cys Arg 485 490 495 Glu Ile Thr Arg Thr Met Pro Phe Ile Phe Glu Phe Pro Glu Asn Lys 500 505 510 Ser Ile Leu Trp Val Pro His Gly Phe Ile His Arg Leu Pro Leu His 515 520 525 Ala Ala Ile Lys Glu Glu Thr Asn Gly Ala Cys Ser Lys Glu Ile Phe 530 535 540 Leu Phe Glu Lys His Glu Ser Arg Tyr Leu Pro Ala Trp His Leu Leu 545 550 555 560 Asp Leu Lys Asp Arg Gln Asp Gly Asn Gly His Thr Phe Leu Lys Arg 565 570 575 Tyr Thr Thr Thr Leu Asp Leu Leu Thr Lys Glu Asn Leu Lys Tyr Tyr 580 585 590 His Tyr Lys Lys Arg Ser Asn Arg Asp Tyr Phe Phe Glu Ser Leu Lys 595 600 605 Lys Asn Leu Gln Thr Leu Ile Ile Phe Cys His Gly Lys Ala Asn Val 610 615 620 Thr Asn Ser Phe Gln Ser Arg Leu Lys Phe Asp Pro Pro Ile Thr Ile 625 630 635 640 Leu Asp Ile Leu Lys Ala Lys Val Thr Ile Arg Gly Cys Arg Ile Phe 645 650 655 Leu Gly Ala Cys Glu Ser Asp Met Ala Gln Pro Ile Glu Phe His Val 660 665 670 Asp Glu His Leu Ser Leu Ser Thr Ala Leu Leu Leu Ile Gly Ala Lys 675 680 685 Glu Val Ile Ala Gly Leu Cys Lys Leu Trp Val Pro Thr Ile Glu Lys 690 695 700 Cys Tyr Phe Asp Leu Leu Asp Ser Asn Asn Leu Ser Lys Ser Leu Ser 705 710 715 720 Ile Trp Gln Arg Lys Lys Phe Lys Asn Trp Asn Ser Met Ser Lys Glu 725 730 735 Page 108
Glu Gln Tyr Leu Val Ile Tyr Ser Ser Ala Pro Ile Arg Val Met Gly 740 745 750 Phe Ile <210> 39 <211> 783 <212> PRT <213> Deltaproteobacteria <220> <223> Deltaproteobacteria bacterium TPR-CHAT AA 3 <400> 39 Met Asn Gln Pro Gln Arg Leu Pro Asp Gln Phe Glu Arg Leu Lys Lys 1 5 10 15 Ile Ile Leu Glu Ala Asp Ala Val Cys Asp Ser Glu Lys Arg Ser Glu
Phe Phe Ala Asn Lys Ala Asp Ala Ile Asn Asp Ala Leu Ile Ala Leu 40 45 Glu Thr Ala Asp Ser Val Gln Leu Gly Ile Met Leu Arg Val Leu Ser 50 55 60 Arg Glu Leu Glu Glu Ser Pro Val Ser Leu Ile Ser Leu Lys Pro Ser 65 70 75 80 Lys Asp Val Cys Leu Ser Leu Lys Glu Asn Ser Ile Glu Ile Ala Glu 85 90 95 Lys Val Ala Ile Phe Asn Phe Glu Arg Ala Lys Tyr Leu Lys Glu Arg 100 105 110 Ala Lys Lys Gly Lys Asp Ile Glu Lys Tyr Thr Asp Arg His Arg Leu 115 120 125 Ile Glu Cys Ala Ile Ser Leu Val Trp Arg Glu Phe Asp Ser Asp Asp 130 135 140 Lys Trp Pro His Thr Ser Val Ile Ser Glu Asn Glu Ala Cys Ser Phe 145 150 155 160 Ile Ala Asn Cys Tyr Leu Leu Arg Ser Arg Leu Ala Leu Ser Lys Gly 165 170 175 Ser Asp Ile Pro Glu Lys Lys Leu Glu Ala Leu Thr Lys Ala Trp Asn 180 185 190 Trp Ala Glu Arg Gly Ser Asp Lys Met Asp Ser Leu Lys Met Arg Ile 195 200 205 Ala Leu Glu Lys Asp Arg Trp Asp His Thr Leu Gly Asp Glu Trp Ile 210 215 220 Lys Thr Cys Leu Thr Asp Phe Ile Asn Ala Asn His Tyr Lys Phe Asn 225 230 235 240 Val Cys Ser Pro Leu His Trp Ala Val Asn Asp Lys Cys Arg Ala Leu 245 250 255 Gly Ile Ser Glu Lys Asp Asn Asp Lys Glu Leu Val Lys His Lys Pro 260 265 270 Page 109
Pro Asp Gly Lys Asp Asn Phe Leu Pro Leu Tyr Gln Ala Lys Ala Ala 275 280 285 Phe Arg Leu Lys Leu Asp Ile Ser Glu Arg Leu Glu Glu Ala Val Glu 290 295 300 Arg Leu Ser Asn Phe Pro Leu Ser Ala Pro Leu Trp His Asp Thr Val 305 310 315 320 Glu Gln Ile Lys Asp Val Ser Glu Thr Asp Gln Tyr Ala Asp Gln Trp 325 330 335 Glu Lys Ser Ala Ile Arg Ala Trp Gln Lys Cys Lys Glu Ala Glu Glu 340 345 350 Gly Leu Arg Leu Ser Ile Gln Val Arg Trp Tyr Trp Ser Gly Tyr Arg 355 360 365 Lys Leu Tyr Asp Leu Ala Phe Gln Ala Val Leu Asn Ser Ala Asp Thr 370 375 380 Glu Thr Val Pro Ser Lys Asp Ser Val Thr Leu Lys Ser Ala Ile Glu 385 390 395 400 Ile Thr Asp Ser Leu Lys Ser Arg Pro Thr Ile Lys Met Gln Asp Leu 405 410 415 Glu Lys Ser Leu Lys Gly Asp Asp Arg Glu Ile Tyr Lys Lys Val Leu 420 425 430 Glu Ala Glu Val Arg Ser Phe Glu Gly Tyr Ile His Ser Leu Pro Lys 435 440 445 Pro Glu Lys Asp Lys Val Ser Glu Thr Ala Lys Glu Val Arg Asp Phe 450 455 460 Leu Asp Ile Pro Glu Gly Trp Ala Ala Val His Phe Asn Ile Ser Asp 465 470 475 480 Lys Asp Lys Gly His Ala Leu Val Val Glu Asn Gly Glu Ile His His 485 490 495 Val Gly Ile Asp Ile Ser Gly Val Trp Glu Ala Phe Arg Arg Trp Gly 500 505 510 Ser Asn Leu Arg His Phe Gly Ile Glu Gly Ser Glu Gln Ser Leu Lys 515 520 525 Ala Leu Cys Gln Glu Ser Gly Glu Met Leu Arg Pro Ile Met Glu Ile 530 535 540 Ile Arg Ser Glu Asn Ile Leu Phe Ile Pro His Gly Phe Leu His Leu 545 550 555 560 Val Pro Ile His Ala Ser Glu Leu Asp Asp Gly Asn Tyr Leu Phe Gln 565 570 575 Glu Lys Ala Cys Met Phe Leu Pro Phe Trp Ser Ala Ala Pro Leu Arg 580 585 590 Arg Glu Asp Met Ala Arg Glu Gly Asn Val Leu Leu Thr Asn Trp Asp 595 600 605 Glu Cys His Arg Ile Lys Asp Leu Leu Ile Glu Asn Asn Trp Arg Asn 610 615 620 Ser Glu Ile Thr Glu Asn Ala Asp Asp Asp Ile Phe Ala Glu Leu Glu 625 630 635 640 Lys Gly Val Pro Arg Leu Met Thr Leu Phe Cys His Gly Gln Ser Asp 645 650 655 Met Leu Asn Pro Tyr Gln Ser Lys Phe Leu Met Trp Glu Ser Ser Leu 660 665 670 Page 110
Thr His Tyr Asp Leu Ala Glu Glu Leu Pro Thr Leu Asn Gly Ala Arg 675 680 685 Val Ile Leu Thr Ala Cys Glu Ser Asp Leu Val Ser Gly Ala Pro Gly 690 695 700 Pro Val Asp Glu His Leu Ser Leu Ala Ser Val Phe Leu Gly Lys Gly 705 710 715 720 Ala Asp Glu Val Ile Gly Ser Leu Tyr Leu Cys Phe Thr Asp Val Ser 725 730 735 Gln Glu Leu Val Leu Ala Ala Lys Gly Ser Pro Gln Lys Pro Leu Tyr 740 745 750 Lys Ile Leu Gln Gln Lys Gln Thr Glu Trp Leu Lys Lys Tyr Gly Ser 755 760 765 Gln Phe Tyr Lys Ile Pro Val Phe Arg Val Met Gly Phe Pro Gln 770 775 780 <210> 40 <211> 659 <212> PRT <213> Deltaproteobacteria <220> <223> Deltaproteobacteria bacterium TPR-CHAT 4 <400> 40 Ile Glu Ser Ala Ile Arg Leu Val Trp Asn Asn Phe Asp Ala Gly Lys 1 5 10 15 Lys Trp Thr His Thr Asp Val Val Ser Glu Lys Asp Ala Cys Tyr Phe
Ile Ala Lys Cys Tyr Leu Leu Arg Ser Arg Leu Thr Leu Pro Lys Gly 40 45 Ala Asp Ile Pro Ala Lys Lys Leu Glu Ala Leu Asp Lys Ala Trp Glu 50 55 60 Trp Ala Glu Lys Gly Ser Glu Glu Thr Asp Ala Leu Lys Met Gly Ile 65 70 75 80 Val Leu Glu Arg Lys Gln Trp Asp Ser Glu Leu Lys Glu Glu Trp Ile 85 90 95 Glu Ser Cys Met Val Cys Phe Leu Asn Pro Arg His Tyr Lys Phe Arg 100 105 110 Ile Lys Asn Ser Leu His Trp Ala Ile Asn Asp Met Ala Arg Glu Ser 115 120 125 Gly Met Leu Asp Asp Asp Ser Asp Arg Ala Ile Ala Asp Tyr Asp Ile 130 135 140 Lys Ser His Asn Lys Phe Leu Pro Phe Tyr Gln Ala Arg Ala Ala Leu 145 150 155 160 Arg Leu Asn Ala Asn Asp Ile Ser Gly Lys Leu Glu Asn Ala Val Glu 165 170 175 Cys Leu Arg Arg Ile Pro Leu Ser His Pro Leu Trp Lys Asn Thr Ser 180 185 190 Glu Leu Ile Arg Asn Val Ser Glu Lys Asp Glu Tyr Lys Gly Gln Trp Page 111
Glu Asp Ala Ala Ile Leu Ala Trp Lys Ile Cys Gln Lys Ala Glu Glu 210 215 220 Asp Ile Cys Leu Ser Ile Gln Val Arg Trp Tyr Trp Gly Gly Tyr Ser 225 230 235 240 Gly Leu Tyr Asp Leu Ala Phe Gln Ala Ala Ile Ser Lys Gly Glu Leu 245 250 255 Pro Thr Ala Val Arg Ile Ala Asp Ser Leu Lys Ser Arg Pro Thr Ile 260 265 270 Lys Met Gln Asn Ala Glu Arg Cys Leu Gly Asp Asp Asp Ala Glu Ile 275 280 285 Phe Lys Lys Leu Ala Glu Ile Glu Ala Leu Ser Leu Ala Asp Thr Tyr 290 295 300 Ile Pro His Val Glu Ala Val Lys Lys Glu Phe Lys Thr Lys Val Arg 305 310 315 320 Lys Asp Lys Ser Asp Asp Lys His Glu Ser Arg Lys Asp Arg Asp Ile 325 330 335 Leu Asp Val Pro Lys Gly Trp Ala Ala Val His Phe Tyr Ile Thr Gly 340 345 350 Lys Lys Glu Gly His Ala Leu Ile Val Glu Asn Gly Asn Ala Pro Arg 355 360 365 His Val Leu Leu Asp Ile Ser Glu Val Trp Asn Ala Phe Cys Glu Trp 370 375 380 Asp Glu Ala Arg Arg Arg Val Gly Ile Gly Glu Asp Ser Lys Gly Glu 385 390 395 400 Leu Glu Met Leu Cys Asp Glu Ala Gly Glu Met Leu Thr Ser Val Leu 405 410 415 Glu Glu Ile Ser Ala Glu Asn Ile Leu Phe Ile Pro His Gly Phe Leu 420 425 430 His Leu Val Pro Leu His Ala Ser Tyr Ile Asn Asp Asp Glu Thr Cys 435 440 445 Leu Phe Glu Glu Lys Thr Cys Leu Phe Leu Pro Ser Trp Ser Leu Ala 450 455 460 Pro Ser Arg Asp Glu Ser Asp Thr Pro Thr Arg His Asp Asp Ile Leu 465 470 475 480 Leu Thr Asn Trp Asp Cys Cys Asp Asp Ile Arg Glu Ile Ile Thr Gln 485 490 495 Asp Gly Trp Lys Asn Lys Glu Ile Ile Asn Ser Thr Ala Asn Asp Val 500 505 510 Phe Asp Ala Ile Glu Thr Ser Pro Arg Leu Leu Val Leu Phe Ser His 515 520 525 Gly Gln Gly Asp Ser Val Asn Pro Tyr Gln Ala Lys Phe Leu Met Asn 530 535 540 Gly Pro Pro Leu Thr His Gln Glu Leu Ile Gly Lys Leu Pro Glu Leu 545 550 555 560 His Gly Thr Arg Val Ile Leu Thr Ala Cys Glu Ser Asp Leu Val Ser 565 570 575 Gly Asn Phe Glu Leu Thr Asp Glu His Leu Ser Leu Ala Thr Ala Phe 580 585 590 Leu Arg Arg Arg Ala Asn Glu Val Ile Gly Ala Leu Phe Ser Cys Tyr Page 112
Pro Ser Val Ser Gln Glu Leu Ile Leu Ser Ala Lys Asp Lys Pro Lys 610 615 620 Lys Pro Leu Tyr Gln Ile Leu Gln Asn Lys Gln Thr Gly Trp Lys Lys 625 630 635 640 Ser Lys Asp Val Tyr Lys Val Pro Ala Phe Arg Val Met Gly Phe Pro 645 650 655 Ala Asn Asp <210> 41 <211> 809 <212> PRT <213> Deltaproteobacteria <220> <223> Deltaproteobacteria bacterium TPR-CHAT AA 5 <400> 41 Met Asn Glu Glu Gln Thr Asn Arg Trp Pro Asp Leu Phe Asp Lys Phe 1 5 10 15 Glu Asp Ile Ile Met Glu Val Asp Lys Lys Cys Asp Ile Glu Glu Arg
Thr Gln Phe Leu Arg Glu Arg Lys Glu Phe Asn Ala Glu Thr Leu Thr 40 45 Ser Leu Glu Lys Ala Asp Asp Trp Ile Arg Phe Gly Val Ile Leu Lys 50 55 60 Val Leu Ser Arg Glu Ser Glu Asn Ser Pro Val Pro Val Phe Thr Phe 65 70 75 80 Arg Pro Ser Gln Asn His Cys Glu Asn Leu Lys Lys Asn Ser Asp Lys 85 90 95 Ile Asp Lys Ala Ile Ala Glu Leu Asn Cys Lys Arg Ala Glu Lys Leu 100 105 110 Leu Gly His Ala Arg Thr Gly Lys Val Arg Lys Tyr Thr Asp Arg His 115 120 125 Arg Leu Val Glu Ser Ala Ile Thr Leu Val Trp Glu Gln Phe Glu Lys 130 135 140 Arg Asp Asn Gly Gln Phe Lys Trp Leu His Ile His Val Lys Thr Glu 145 150 155 160 Lys Gln Ala Cys Asn Phe Ile Ala Lys Cys Tyr Leu Leu Arg Ser Lys 165 170 175 Leu Ala Leu Pro Lys Gly Ser Ser Ile Pro Glu Lys Lys Leu Glu Ala 180 185 190 Leu Asp Asn Ala Trp Glu Trp Ala Lys Arg Gly Ser Pro Glu Thr Asp 195 200 205 Asp Leu Lys Met Glu Val Ala Leu Gln Lys His Arg Trp Asp Pro Asn 210 215 220 Leu Gly Lys Arg Trp Phe Gln Lys Gln Leu Asn Ala Phe Leu Asp Ser Page 113
Asn Lys Leu Asp Leu Ser Asn Pro Leu His Trp Ala Val Asn Asp Ile 245 250 255 Val Gly Asp Lys Ala Leu Val Ser Glu Glu Tyr Asp Leu Glu Met Leu 260 265 270 Asn Asp Ala Ser Val Arg Asn Leu Thr Lys Asn Trp Glu Trp Lys Asp 275 280 285 Lys Ser Gly Ile Pro Leu Tyr Gln Ala Arg Ala Ala Phe Arg Thr His 290 295 300 Ala Ser Asp Leu Asp Arg Arg Leu Ile Asn Ala Val Lys Lys Leu Lys 305 310 315 320 Trp Leu Pro Leu Ser His His Leu Trp Glu Asp Thr Val Ala Leu Ile 325 330 335 Lys Asn Val Ser Glu Asp Asp Ser Phe Asn Gly Lys Trp Glu Met Ala 340 345 350 Ala Ile Leu Ala Trp Ala Ile Cys Gln Asn Ala Glu Ala Arg Ile Lys 355 360 365 Leu Ser Val Gln Leu Arg Trp Tyr Trp Ser Arg Ala Arg Glu Leu Tyr 370 375 380 Asp Leu Ala Phe Gln Ala Ala Leu Lys Arg Lys Arg Pro Phe Leu Leu 385 390 395 400 Val Arg Ile Thr Asp Ser Glu Lys Ser Arg Pro Thr Ile Lys Met Gln 405 410 415 Ala Ala Glu Lys Ser Phe Ala Asn Ala Ala Ala Phe Gln Thr Tyr Leu 420 425 430 Glu Ala Glu Thr Leu Phe Ala Thr Gly Asn Phe Asn Ala Gly Leu Lys 435 440 445 Glu Leu Asn Ser Val Pro Ile Glu Lys Met Arg Thr Arg Asn Ile Arg 450 455 460 Ala Val Pro Glu Gly Trp Ala Ala Val His Phe Asn Ile Ile Asp Lys 465 470 475 480 Asn Glu Ser His Ala Leu Ile Val Glu Asn Arg Glu Cys His Ser Ile 485 490 495 Arg Ile Asp Leu Pro Asp Val Trp Asp Ala Phe Gln Lys Trp Asn Thr 500 505 510 Glu Arg Arg Asp Leu Lys Leu Ile Lys Lys Ser Glu Thr Ser Leu Glu 515 520 525 Ile Leu Cys Glu Lys Ser Gly Ile Met Leu Glu Pro Ile Leu Asn Gln 530 535 540 Ile Lys Ser Glu Asn Ile Leu Phe Ile Pro Tyr Gly Phe Leu His Leu 545 550 555 560 Val Pro Leu His Ala Ser Lys Ile Lys Lys Pro Asp Glu Thr Tyr Thr 565 570 575 Tyr Leu Phe Gln Glu Lys Gln Cys Leu Phe Leu Pro Ser Trp Ser Leu 580 585 590 Ala Pro Val Glu Lys Glu Asn Ile His Thr Gly Glu His Asp Leu Leu 595 600 605 Leu Leu Ala Lys Met Arg Gly Lys Asp Ile Gln Asn Ile Met Asp Arg 610 615 620 Glu Asp Trp Cys Asn Glu Lys Asn Ala Glu Asn Ile Glu Asn Thr Ala Page 114
Asp Asp Phe Phe Asn Cys Leu Phe Asp Thr Leu Glu Arg Phe Arg Lys 645 650 655 Pro Pro His Leu Leu Val Leu Tyr Cys His Gly Gln Gly Asp Phe Val 660 665 670 Asn Pro Tyr Cys Ser Lys Phe Ile Met Glu Gly Arg Pro Leu Thr His 675 680 685 Gln Asp Ile Val Gln Asp Leu Gln Asp Leu Pro Val Leu Gln Gly Thr 690 695 700 Lys Val Ile Leu Thr Ala Cys Glu Thr Asp Leu Val Ser Arg His Phe 705 710 715 720 Gly Leu Ile Asp Glu His Leu Ser Leu Ala Thr Ala Phe Leu Cys Lys 725 730 735 Gly Ala Ser Gln Val Ile Ala Ser Leu Phe Thr Cys Thr Thr Asp Ile 740 745 750 Ser Cys Glu Ile Ile Val His Ala Lys Asp Asn Pro Glu Lys Ser Leu 755 760 765 Gly Gln Ile Leu Gln Glu Lys Gln Asn Gln Trp Ala Ala Asn Glu Ala 770 775 780 Leu Tyr Arg Leu Ser Val Phe Arg Val Met Gly Phe Pro Gly Ser Ala 785 790 795 800 Arg Ala Met Glu Glu Glu Ile Ser Pro 805 <210> 42 <211> 860 <212> PRT <213> Desulfobulbaceae <220> <223> Desulfobulbaceae bacterium TPR-CHAT AA <400> 42 Met Asp Leu Ser Ala Asp Leu Leu Ala Ala Leu Arg Tyr Ile Val Ala 1 5 10 15 Thr His Thr Ala Ala Val Leu Thr Ala Asp Ser Pro Pro Lys Arg Val
Thr Gln Arg Glu Glu Leu Leu Arg Gly Ser Leu Ala Gly Ile Asp Lys 40 45 Ile Leu Ser Leu Leu Ala Gln Asp Gln Lys Arg Ala Ala Glu Leu Thr 50 55 60 Ala Leu Leu His Phe Leu Asn Asp Thr Gly Leu Ala Val His Phe Gly 65 70 75 80 Ala Arg Ile Asp Cys Pro Gly Gln Pro Ile Ser Lys Asp Glu Val Leu 85 90 95 Pro His Tyr Val Gly Leu Arg Asn Gly Leu Ser Lys Arg Leu Leu Ala 100 105 110 Ala Ala Leu Ala Thr Arg Lys Thr Pro Trp Arg Arg Arg Ser Glu Trp 115 120 125 Page 115
Ile Glu Leu Ala Leu Arg His Leu Leu Ala Leu Phe Glu Pro Asp Ser 130 135 140 Arg Gly Gln Gln Val Trp Pro His Gln Ala Ile Val Pro Leu Pro Glu 145 150 155 160 Ala Val Val Leu Ala Gly Arg Cys Tyr Leu His Arg Gly Leu Ala Phe 165 170 175 Leu Pro Gln Gly Arg Asn Pro Ala Glu Leu Gly Asp Arg Arg Asp Phe 180 185 190 Leu Glu Ala Gly Met Arg Leu Val Ala Thr Ala Val Asp Ser Ala Gly 195 200 205 Asn Gly Pro Asp Tyr Asn Gln Pro Asp Gln Leu Pro Arg Ala Val Leu 210 215 220 His Ala Leu Leu Leu His Glu Leu Gln Arg Leu Ser Pro Ser Gly Val 225 230 235 240 Tyr Ser Gln Glu Leu Ser Asp Leu Leu Glu Thr Ile Gln Ile Val Leu 245 250 255 Ser Gly Lys Ala Thr Leu Asp Pro Glu Glu Ile Phe Leu Leu Trp Leu 260 265 270 Gln Gly Arg Val Asn Gln Ala Ser Thr Thr Ala Asn Glu Trp Arg Gln 275 280 285 Ile Gly Glu Ala Val Ala Ala Cys Pro Trp Ser Gly Gly Arg Asp Ala 290 295 300 Arg Tyr Arg Leu His Leu Pro Leu Ile Gly Ala Trp Ala Asn Trp Arg 305 310 315 320 Ala Ser Gly Thr Leu Glu Asn Ala Asn Arg Gln Ala Val Leu Ala Ala 325 330 335 Leu Gln Lys His Gln Leu Tyr Ser Pro Leu Trp Glu Glu Ala Val Ala 340 345 350 Phe Leu Arg Glu Leu Lys Asp Lys Ser Cys Thr Asp Trp Ser Asp Phe 355 360 365 Ala Tyr Gln Val Trp Arg Leu Cys Arg Asp Arg Glu Lys Glu Phe Gly 370 375 380 Phe Gly Phe Gln Ile Arg Gln Tyr Trp Ala Arg Leu Asp Glu Leu Tyr 385 390 395 400 Arg Leu Val Ile Gly Ala Met Val Glu Gln Gly Glu Val Glu Lys Ala 405 410 415 Ala Glu Val Val Asp Ser Leu Lys Ala Arg Thr Thr Leu Thr Trp Arg 420 425 430 Asp Met Glu Gly Leu Leu Ala Gly Arg Ser Gly Thr Ala Glu Lys Cys 435 440 445 Arg Glu Ala Tyr Tyr Ala Met Glu Ala Gln Ala Ala Met Gly Val Tyr 450 455 460 Ala Glu Arg Leu Leu Asp Asp Asp Cys Arg Arg Leu Met Gln Thr Ala 465 470 475 480 Val Ala Pro Thr Gly Arg Pro Leu Ala Lys Val Pro Thr Gly Trp Val 485 490 495 Ala Val His Phe His Leu Asp Glu Thr Arg Gly Gln Ala Phe Leu Ile 500 505 510 Gly Ala Pro Ala Ser Val Thr Glu Lys Ala Pro Ser Phe Asp Pro Arg 515 520 525 Page 116
Pro Leu Leu Ala Ala Trp Gln Ile Trp Arg Arg His Tyr Gly Gln Met 530 535 540 Pro Thr Glu Ile Glu Phe Asp Arg Gln Gln Leu Asp Val Ser Leu Pro 545 550 555 560 Lys Trp Glu Ser Glu Asn Gly Gln His Leu Leu Asp Leu Cys His Arg 565 570 575 Met Gly Glu Gln Phe Ser Phe Leu Phe Arg Leu Ala Val Asn Glu Glu 580 585 590 Ile Lys Gly Leu Leu Phe Ile Pro His Gly Phe Leu His Gln Leu Pro 595 600 605 Leu His Ala Ala His Lys Asp Ser Glu Tyr Leu Phe Gly Arg Leu Pro 610 615 620 Ser Val Phe Leu Pro Ala Trp Glu Met Leu Pro Ala Thr Leu Pro Ser 625 630 635 640 Thr Pro Gly Arg Asn Asp Ala Asp Ser Phe Ala Ser Leu Arg Asp Tyr 645 650 655 Glu His Gly Leu Val Ala Ala Ile Lys Ser Ala His Ser Trp Asn Tyr 660 665 670 Phe His Gly Asp Ile His Gly Pro Lys Ser Pro Leu Trp Ala Asp Ile 675 680 685 Val Ala Arg Tyr Gly Asn Gly Gly Asn Ala Ile Pro Arg Trp Leu Ala 690 695 700 Ile Trp Cys His Gly Glu Ala Asp Pro Val Asn Phe Asn His Ser Arg 705 710 715 720 Leu Leu Leu Gly Asn Pro Gly Val Ser Phe Phe Asp Leu Gln Ala Ala 725 730 735 Asp Leu Gln Leu Arg Gly Ala Ser Val Leu Leu Ser Val Cys Glu Ser 740 745 750 Asp Leu Ser Pro Pro Asn Leu Ser Gly Arg Leu Asp Glu His Leu Ser 755 760 765 Leu Ala Ala Pro Phe Leu His Lys Gly Ala Gly Gln Val Leu Thr Gly 770 775 780 Leu Trp Ala Val Arg Lys Asp Leu Met Phe Arg Ile Val Thr Gly Ser 785 790 795 800 Leu Thr Thr Gln Ser Pro Leu Trp Gln Ile Val Gln Glu Gln Gln Ile 805 810 815 Gly Ile Met Lys Asp Thr Gly Ile Ser Glu Ala Val Arg Leu His Arg 820 825 830 Ala Ala Pro Val Arg Val Val Gly Trp Pro Glu Gly Val Asn Glu Val 835 840 845 Ser Thr Ala Ala Gln Pro Ile Ala Glu Glu Lys Ala 850 855 860 <210> 43 <211> 835 <212> PRT <213> Candidatus Magnetomorum <220> Page 117
<223> Candidatus Magnetomorum sp. TPR-CHAT AA 2 <400> 43 Met Thr Leu Glu Ile Asn Asp Leu Leu Lys Glu Ala Ile Gln Arg Asp 1 5 10 15 Leu Glu Leu Asp Thr Ile Val Ser Gln Thr Pro Glu Lys Tyr Val Arg
Gln Lys Lys Gln Lys Glu Lys Arg Lys Lys Asp Ala Phe Gln Lys Ala 40 45 Ile Asp Glu Phe Ile Lys Gly Lys Asp Tyr Glu His Leu Leu Leu Ala 50 55 60 Met Lys Leu Ser Gly Asp Lys Ser Ile Lys Cys Pro Leu Ile Asp Gly 65 70 75 80 Asn Lys Asp Leu Val Ser Asp Thr Leu Lys Gln Tyr His Gln Lys Ala 85 90 95 Glu Gln Lys Glu Ile Ser Ile Leu Val Asn Lys Ala Asp Lys Ile Tyr 100 105 110 Gln Lys Ala Val Ser Glu Asp Arg Glu Ile Pro Phe Lys Cys Lys Asn 115 120 125 Asn Trp Leu Glu Leu Val Ile Arg Ile Leu Met Glu Leu Leu Lys Arg 130 135 140 Lys Ser Asp Ser Arg Trp Leu Tyr Asn Asp Ile Val Ser Ser Ser Asn 145 150 155 160 Val Tyr Tyr Leu Leu Ala Asp Ala Tyr Phe Lys Arg Gly Lys Cys Ile 165 170 175 Leu Pro Lys Gly Lys Gly Val Ser Ala Pro Glu Lys Lys Leu Glu Ala 180 185 190 Met Lys Lys Ser Leu Gly Trp Cys Cys Ser Tyr Phe Asp Glu Ala Ala 195 200 205 Glu Gly Asp Gly Ser Tyr Arg Leu Lys Phe Thr Glu Leu Tyr Val Gln 210 215 220 Val Leu His Glu Leu Tyr Arg Met Asp Leu Ile Thr Tyr Gln Asn Asp 225 230 235 240 Phe His Asp Ala Leu Lys Lys Phe Met Lys Met Ser Ile Thr Pro Gln 245 250 255 Thr Pro Leu Gln Phe Tyr Ile Leu Ser Leu Tyr Cys Ser Glu Leu Thr 260 265 270 Pro Asn Asn Glu Asp Leu Asp Arg Thr Ile Leu Ser Met Pro Ile Asp 275 280 285 Phe Glu Met Asp Arg Leu Thr Ala Asp Tyr Pro Gly Leu Met Leu Ala 290 295 300 Lys Leu Gln Ser Val Phe Arg Leu Val Lys Thr Gly Ser Met Asp Asn 305 310 315 320 Asp Asp Phe Asn Asn Glu Ile Lys Val Phe Ile Ser Ile Phe Lys Asp 325 330 335 Gln Lys His Thr Glu Leu Phe Ser Pro Ile Trp Asp Asp Leu Ile Leu 340 345 350 Phe Ile Lys Met Leu Cys Thr Glu Lys Tyr Pro Gln Trp Lys Glu Leu 355 360 365 Ala Leu Glu Ala Trp Asn Ile Cys Asn Asn Lys Glu Gln Leu Met Ser
Phe Gly Leu Gln Ile Arg Gln Tyr Trp Ser Arg Leu Asp Asp Leu Tyr 385 390 395 400 His Leu Ala Ile Lys Ala Ala Ile Ser Thr Glu Asn Leu Gln Lys Ala 405 410 415 Ala Glu Ile Ile Asp Ser Leu Lys Gly Arg Ser Gln Ile Thr Trp Ala 420 425 430 Asp Met Asp Ser Phe Leu Leu Lys Arg Lys Asp Thr Lys Ser Glu Ala 435 440 445 Ile Lys Thr Leu Arg Glu His Tyr Tyr Gln Met Glu Ala His Ala Ala 450 455 460 Met Gly Ile Tyr Asn Pro Glu Tyr Thr Ile Leu Arg Lys Gln Leu Pro 465 470 475 480 Ser Lys Ile Lys Gly Lys Ser Pro Val Ala Ile Asp Gln Ile Pro Ala 485 490 495 Ala Phe Thr Ser Ile His Leu Tyr Ile Asp Glu Thr Met Asn Gly His 500 505 510 Ala Ile Cys Gly His Gln Pro Glu Asn Ser Asn Asp Val Lys Trp Lys 515 520 525 Lys Tyr Glu Phe Asn Ala Lys Pro Val Trp Gln Ser Tyr Lys His Trp 530 535 540 Lys Asp Thr Cys Thr Ser Lys Lys Glu Thr Trp Asp Ala Leu Asn Asp 545 550 555 560 Leu Cys Lys Thr Leu Gly Lys Ser Met Ser Phe Leu Phe Glu Ile Ser 565 570 575 Lys Ser Ala Lys Gly Ile Ile Phe Ile Pro His Gly Phe Met His Gln 580 585 590 Leu Pro Leu His Ala Ala Phe Asp Ser Thr Glu Pro Pro Leu Phe Cys 595 600 605 Leu Thr Cys Cys Thr Tyr Leu Pro Ala Trp Ser Leu Ala Phe Glu Asn 610 615 620 Met Ile His Gln Asp Leu Lys Gly Lys Tyr Ile Leu Arg His Phe Asp 625 630 635 640 Asp Arg Lys Tyr Ile Glu Glu Thr Gln Gln Thr Tyr Asp His Glu Lys 645 650 655 Trp Thr Asn Arg Lys Asp Asp Ile Ser Arg Asp Asp Met Ile Ser Gln 660 665 670 Asp Asp Trp Ile Asn Asn Lys Gln Ile Pro Pro Asp Ile Leu Cys Ile 675 680 685 Leu Cys His Gly Lys Ala Asp Lys Val Asn Pro Phe Asp Ser Lys Leu 690 695 700 Lys Ile Lys Asn Gly Gly Leu Ser Cys Leu Asp Leu Gln Met Thr Ser 705 710 715 720 Leu Asn Ile Glu Gly Thr Thr Ile Ile Leu Gly Ala Cys Glu Thr Glu 725 730 735 Leu Ser Pro Ala Tyr Asn Ser Met Ile Asp Glu His Ile Ser Ile Ala 740 745 750 Gly Ile Phe Leu Thr Lys Lys Ala Lys Trp Val Val Gly Ser Phe Trp 755 760 765 Glu Cys Ser Ala Ala Leu Thr Ser Glu Met Ile Val Gln Ile Ile Ser Page 119
Asp Glu Ser Lys Asn Ser Leu Ser Leu Leu Glu Lys Phe Gln Thr Ile 785 790 795 800 Gln His Ser Trp Phe Asn Thr Arg Lys Ser Phe Cys Asp Gln Asn Gly 805 810 815 Glu Pro Thr Asp Arg Leu Tyr Tyr Phe Ala Pro Phe Lys Ile Ile Gly 820 825 830 Tyr Pro Gly 835
<210> 44
<211> 857
<212> PRT
<213> Desulfosarcina
<220>
<223> Desulfosarcina sp. TPR-CHAT AA
<400> 44
Glu Leu Asn Leu Pro Ser Lys Thr Lys Val Val Leu Ile Ala Val Gly 1 5 10 15 Pro Ala Asp Glu Leu Asn Lys Val Lys Asn Leu Val Glu Ser Gly Thr
Lys Gln Pro Ala Asp Asn Ser Glu Ser Ser Ile Cys Trp Val Ile Tyr 40 45 Gly Pro Glu Ile Lys Asp Met Phe Lys Glu Phe Lys Lys Asn Ile Ile 50 55 60 Gln Lys Ala Asp Thr Ile Leu Asp Phe Glu Glu Lys Ala Asn Phe Phe 65 70 75 80 Lys Ile His Lys Ser Phe Thr Lys Lys Ile Leu Glu Ser Leu Glu Lys 85 90 95 Asp Asp Trp Ile His Leu Gly Thr Val Leu Lys Ile Met Ala Gln Glu 100 105 110 Ala Met Gln Ala Pro Val Pro Val Phe Gly Leu Gln Pro Ser Glu Lys 115 120 125 Leu Cys Lys Gly Leu Lys Val Asn Glu Asp Ala Ile Ile Glu Ser Ile 130 135 140 Ala Glu Phe Asn Leu Trp Arg Ala Lys Lys Leu Gln Gln Ile Ala Met 145 150 155 160 Gln Lys Lys Tyr Ala Gly Lys Tyr Thr Asp Arg His Arg Leu Leu Glu 165 170 175 Ser Ala Ile Arg Leu Thr Trp Glu Gln Phe Asp Ser Asn Glu Ser Trp 180 185 190 His His Ala Val Ile Thr Glu Lys Glu Ala Cys Asp Phe Ile Ala Gln 195 200 205 Cys Tyr Leu Phe Arg Ser Arg Leu Ala Leu Ala Lys Gly Ser Thr Ile 210 215 220 Pro Glu Lys Lys Leu Glu Ala Leu Ala Arg Ala Trp His Trp Ala Ala 225 230 235 240 Page 120
Glu Lys Gly Tyr Asp Asp Met Asn Asp Leu Lys Met Lys Ile Val Leu 245 250 255 Glu Lys Asp Lys Trp Glu Lys Asn Leu Ser Glu Glu Trp Ile Lys Asp 260 265 270 Gln Ile Asn Ser Phe Leu Lys Lys Glu Asn Cys Ser Leu Asp Phe Ser 275 280 285 Lys Pro Leu His Trp Ala Val Asn Asp Arg Ala Gln Asn Leu Glu Leu 290 295 300 Ile Ser Asp Lys Ser Asn Asp Arg Arg Ile Thr Asp Asp Cys Lys Ile 305 310 315 320 Ser His Ile Thr Lys Ser Phe Thr Asp Glu Glu Lys Ala Leu Val Phe 325 330 335 Leu Tyr Gln Ala Arg Ala Ala Leu Arg Met Gln Ser Asp Asp Ile Ser 340 345 350 Glu Arg Leu Lys Lys Ala Val Asp Ala Leu Glu Leu Ser Gln Arg Gly 355 360 365 Met Pro Ser Lys Ala Asn Ser Met Arg Tyr His Val Pro Leu Ser His 370 375 380 Tyr Leu Trp Gly Asp Ile Val Asp Leu Ile Glu Lys Ala Ala Glu Gln 385 390 395 400 Glu Glu Leu Trp Glu Asp Ser Ala Val Asp Ile Trp Arg Arg Cys Met 405 410 415 Glu Glu Glu Ser Arg Val Lys Val Gly Ile Gln Ile Arg Trp Tyr Trp 420 425 430 Ser Lys Tyr Arg Gln Leu Tyr Gln Leu Ala Phe Gln Ala Ala Leu Asn 435 440 445 Pro Ala Tyr His Tyr Pro Glu Asn Phe Lys Leu Ala Ala Glu Ile Thr 450 455 460 Asp Ser Gln Lys Ser Arg Pro Thr Ile Lys Ala Leu Ala Leu Glu Lys 465 470 475 480 Ser Leu Ser Lys Lys Asn Ala Glu Gly Tyr Lys Lys His Val Glu Ala 485 490 495 Asp Ala Leu Phe Ala Ala Gly Asn Phe Val Ala Gly Phe Glu Ala Leu 500 505 510 Lys Lys Ile Pro Val Pro Glu Lys Glu Lys Pro Arg His Ile Met Asp 515 520 525 Val Pro Ala Asp Trp Ala Ala Val His Phe Asn Ile Ile Asp Arg Asn 530 535 540 Asp Ala His Ala Leu Ile Val Glu Asp Glu Lys Cys Arg His Val Ser 545 550 555 560 Ile Glu Ile Ser Asp Leu Trp Asp Ala Phe Asn Lys Trp Val Ala Ser 565 570 575 Asp Arg Asp Ser Gly Leu Glu Lys Val Cys Glu Gln Ser Gly Glu Met 580 585 590 Leu Leu Pro Ile Leu Lys Glu Ile Asn Ser Lys Lys Ile Leu Phe Ile 595 600 605 Pro His Gly Phe Leu His Leu Val Pro Leu His Ala Ala Glu Leu Ala 610 615 620 Lys Ala Lys Asp Lys Ala Arg Tyr Leu Phe Gln Glu Lys Ser Cys Leu 625 630 635 640 Page 121
Phe Leu Pro Ser Trp Ser Leu Ala Pro Leu Gln Asp Glu Ala Phe Ala 645 650 655 Thr Asn Gly Asp Val Leu Leu Thr Lys Trp Asn Tyr Val Ala Ile Lys 660 665 670 Asn Ile Val Glu Arg Asn Asp Trp Ser Asn Ser Lys Arg Glu Glu Asn 675 680 685 Thr Pro Asp Asp Phe Phe Ala Ala Leu Lys Glu Leu Pro Asn Pro Pro 690 695 700 Asp Leu Leu Val Leu Tyr Cys His Gly Gln Ser Asp Phe Val Asn Pro 705 710 715 720 Tyr Asn Ser Arg Phe Val Leu Asn Gly Asp Leu Thr His Gln Arg Leu 725 730 735 Ala Gln Asp Leu Ser Val Ser Asp Leu Lys Lys Ser Lys Val Ile Leu 740 745 750 Thr Ala Cys Glu Ser Asp Leu Val Ser Gly Ile Phe Gly Leu Ile Asp 755 760 765 Glu His Leu Ser Leu Ala Asn Val Phe Leu Ser Lys Gly Ala Ser Glu 770 775 780 Val Leu Gly Ala Leu Phe Glu Cys Ser Ser Asn Ile Ala Leu Asp Leu 785 790 795 800 Ile Leu Glu Ala Lys Asn Asn Ser Lys Thr Pro Leu Tyr Glu Ile Leu 805 810 815 Gln Asn Lys Gln Lys Glu Trp Ala Glu Lys Glu Glu Ser Ile Asp Glu 820 825 830 Ile Ala Val Phe Arg Val Met Gly Phe Pro Gln Ala Gly Val Lys Glu 835 840 845 Ile Leu Lys Thr Gln Glu Asp Leu Leu 850 855 <210> 45 <211> 1119 <212> PRT <213> Candidatus Magnetomorum <220> <223> Candidatus Magnetomorum sp.
HK-1 TPR-CHAT AA <400> 45 Met Thr Lys Ile Phe Ile Ser Tyr Ser Arg Glu Asp Glu Lys Phe Ala 1 5 10 15 Gln Lys Leu Phe Asn Asn Leu Lys Gln Asp Lys Met Asp Pro Trp Leu
Asp Asp Gln Lys Ile Ile Thr Gly Asp Lys Trp Glu Glu Lys Ile Asp 40 45 Lys Glu Ile Asn Thr Thr His Cys Phe Leu Pro Ile Leu Ser Lys Ala 50 55 60 Ser Ile Asn Asn Asn Arg Tyr Phe Gln Thr Glu Val Asp Ile Ala Ile 65 70 75 80 Gln Val Ser Glu Ser Arg Asp Asn Asn Phe Ile Met Pro Val Arg Asn
Glu Glu Cys Asp Pro Gly His Thr Lys Leu Lys Asp Tyr Asn Ile Leu 100 105 110 Asp Leu Phe Pro Ser Phe Glu His Gln Tyr Lys Lys Leu Leu Lys His 115 120 125 Ile Lys Glu His Asn Arg Lys Phe Asp Phe Lys Lys Ile Thr Asp Ile 130 135 140 Ser Leu Lys Thr Val Asp Phe Leu Ser Lys Phe Ala Thr Ile Ser Tyr 145 150 155 160 Val Ala Ser Gln Phe Met Ser Asp Asp Val Gln Pro Pro Lys Tyr Asn 165 170 175 Met Asp Phe Ala Ser Tyr Gln Phe Pro Lys Glu Gly Gly Lys Val Gln 180 185 190 Leu Leu Gly Pro Asp Val Ala Asp Gln Leu Arg Thr Ile Arg Leu Tyr 195 200 205 Ala Arg Glu Asn Asn Tyr Arg Tyr Thr Ala Phe Phe Ile Asp Gln Lys 210 215 220 Gly His Val Ile Gly His Thr Ala Asn Asn Lys Lys Leu Lys Tyr Lys 225 230 235 240 Ala Ile Gln Leu Asp Ile Pro Glu Asn Thr Ser Met Val Phe Ile Ala 245 250 255 Leu Gly Tyr Gly Asp Asp Leu Glu Tyr Ala Lys Glu Ser Phe Gly Lys 260 265 270 Val Pro Ile Thr His Asp Gln Leu Leu Trp Val Val Tyr Gly Glu Lys 275 280 285 Lys Thr Ala Ile Pro Asp Glu Lys Lys Asp Phe Leu Lys Ile Phe Lys 290 295 300 Ser Ile Ile Phe Asp Ala Asp Lys Ile Ile Asn Val Glu Glu Arg Tyr 305 310 315 320 Arg Tyr Leu Glu Lys Lys Arg Gln Thr Ile Met Gln Ala Phe Asp Ala 325 330 335 Leu His Gln Val Lys Asp Trp Gly Asn Ile Gly Met Met Leu Asn Ile 340 345 350 Met Met Thr Glu Thr Lys Asp Ser Pro Val Ala Leu Phe Ser Glu Arg 355 360 365 Gln Ser His Glu Leu Cys Lys Glu Leu Glu Glu Asn Glu Asn Asn Ile 370 375 380 Lys Arg Cys Phe Ile Glu Phe Asn Phe Leu Arg Ala Lys Glu Leu Ser 385 390 395 400 Asn Asn Ala Ile Asn Glu Gln Pro Gly Leu Lys Tyr Thr Asp Arg His 405 410 415 Arg Phe Ile Glu Ser Ala Ile Arg Leu Ile Trp Asp Gln Cys Asp Lys 420 425 430 Asp Asn Lys Trp Leu His Asn Asp Ile Ile Ser Asp Glu Ile Ala Tyr 435 440 445 Gln Leu Ile Ala Lys Cys Tyr Leu Tyr Arg Ser Lys Leu Ala Leu Thr 450 455 460 Lys Gly Lys Ser Ile Pro Glu Lys Lys Leu Glu Ala Leu Ser Lys Ala 465 470 475 480 Met Glu Trp Ala Lys Asp Asp Val Asn Glu Ile Asn Lys Lys Asn Lys Page 123
Ile Asn Asp Leu Ile Val Tyr Ile Leu Leu Glu Gln Tyr Thr Trp Asp 500 505 510 Lys His Phe Ser Lys Lys Glu Leu Glu Asn Gly Leu Lys Asn Tyr Phe 515 520 525 Cys Gln Tyr Lys Lys Lys Val Leu Asp Leu Ser Asn Ile Ile Gln Phe 530 535 540 Ala Val Ile Asp Lys Leu Ile Asn Ile Leu Thr Asp Lys Lys Glu Arg 545 550 555 560 Leu Gly Asp Glu Phe Ser Asp Ser Asp Glu Gln Leu Leu Lys Gln Leu 565 570 575 Thr Glu Tyr Asp Glu Gln Ile Leu Ser Leu Pro Lys Thr His Lys Leu 580 585 590 Thr Pro Phe Phe Gly Lys Gly Cys Phe Leu Lys Glu Ala Glu Asp Ile 595 600 605 Ser Phe Tyr Gln Leu Val Ser Ala Asn Arg Leu Gly Arg Val Asp Asp 610 615 620 Ile Ser Lys Phe Leu Asn Ala Ala Ile Ser Lys Leu Ser Lys His Phe 625 630 635 640 Gln Thr His Ile Ile Trp Glu Lys Thr Ile Ser Ile Leu Asn Gln Val 645 650 655 Ala Glu Lys Lys Ile Phe Glu Gly Lys Trp Glu Asp Ser Ala Ile Ser 660 665 670 Ala Trp Glu Lys Cys Ile Glu Val Glu Asn Leu Ile Lys Leu Pro Ile 675 680 685 Gln Leu Arg Trp Tyr Trp Ser Ser Tyr Asn Leu Leu Tyr Asp Leu Ala 690 695 700 Phe Gln Ala Ala Leu Asn Lys Lys Gln Tyr Met Leu Ala Ala Arg Ile 705 710 715 720 Ala Asp Ala Val Lys Ser Arg Pro Thr Ile Lys Ile Gln Asn Val Glu 725 730 735 Gln Thr Leu Lys Ile Asn Asp Tyr Phe Lys Gln Phe Ile Glu Thr Asp 740 745 750 Thr Leu Ser Phe Thr Glu Ala Tyr Met Val Arg Phe Glu Asn Leu Lys 755 760 765 Gln Ser Ser Ser Ala Lys Ile Lys Leu Thr Leu Lys Ile Glu Asp Ile 770 775 780 Pro Glu Asp Trp Thr Ala Ile His Phe Tyr Ile Gln Lys Asp Ile Asp 785 790 795 800 Asn Glu Ala Phe Ala Leu Ile Ile Ser Ser Glu Thr Lys Lys Cys Ile 805 810 815 Pro Val Lys Leu Asp Ile Ser Lys Leu Trp Lys Thr Phe Asn Lys Trp 820 825 830 Gln Ala Asp Lys Arg Ala Phe Lys Ala Pro Pro Ala Asn Thr Leu Lys 835 840 845 Gln Met Cys Ile Asp Ala Gly Glu Met Leu Lys Pro Val Leu Asp Gln 850 855 860 Ile Thr Thr Gln Lys Ile Ile Phe Ile Pro His Gly Phe Leu His Leu 865 870 875 880 Val Pro Leu His Ala Ala Ile Ile Asp Gln Glu Asp Thr Tyr Leu Leu Page 124
Ser Lys Lys Leu Cys Val Tyr Leu Pro Ser Trp Ser Ile Ile Ser Gly 900 905 910 Lys Ile Lys Pro Val Glu Asn Ile Asn Asp Tyr Phe Phe Ser Asn Trp 915 920 925 Lys Asn Glu Glu Glu Leu Asn Lys Leu Glu Trp Lys Asn Glu Glu Ile 930 935 940 Gly Lys Asp Thr Pro Asp Gly Val Ile Asn Leu Leu Lys Thr Asn Phe 945 950 955 960 Pro Leu Thr Pro Lys Asn Pro Gln Asn Thr Leu Lys Ala Pro Asn Leu 965 970 975 Leu Val Phe Phe Cys His Gly Lys Gly Asp Tyr Met Asn Pro Tyr Gln 980 985 990 Ser Lys Phe Ile Leu Asn Lys Gly Asn Leu Thr His Asn Ala Ile Val 995 1000 1005 Ala Gly Lys Phe Leu Leu Asn Gly Thr Lys Val Leu Leu Thr Ala Cys 1010 1015 1020 Glu Thr Asp Leu Val Ser Ser Asp Leu Gly Leu Val Asp Glu His Leu 1025 1030 1035 1040 Ser Leu Ser Asn Ala Phe Leu Thr Gln Asn Ala Ser Glu Val Leu Gly 1045 1050 1055 Ala Ile Tyr Glu Cys Arg Pro Asp Asp Ala Lys Lys Ile Ile Lys Tyr 1060 1065 1070 Ile Lys Ala Asn Pro Lys Lys Thr Leu Tyr Glu Ser Leu Gln Asp Ile 1075 1080 1085 Gln Lys Gln Trp Val Asp Glu Lys Lys Ser Ile Thr Asp Ile Ala Val 1090 1095 1100 Phe Arg Val Met Ala Leu Pro Lys Thr Gly Glu Asn Lys Ile Ala 1105 1110 1115 <210> 46 <211> 849 <212> PRT <213> Deltaproteobacteria <220> <223> Deltaproteobacteria bacterium RIFOXYD12 FULL 50 9 TPR-CHAT AA <400> 46 Met Asn Gln Asn Ile Asp Arg Ala Val Gly Ala Ile Leu Ala Ile Glu 1 5 10 15 Thr Ala Thr Pro Leu Thr Glu Ser Ser Thr Leu Ala Gln Arg Glu Arg
His Gln Lys Leu Leu His Asp Glu Thr Lys Lys Ile Glu Gln Ala Phe 40 45 Ile Ala Leu Ala Gln Pro Pro Gln Cys Arg Ala Val Glu Ile Ala Ala 50 55 60 Leu Ser Arg Phe Leu Gln Met Thr Pro Leu Ala Val Gly Pro Leu Arg 65 70 75 80
Lys Arg Val Ile Cys Arg Ala Glu Pro Leu Lys Asp Asp Ala His Glu 85 90 95 Gln Glu Ile Ala Ser His Phe Asn Gly Leu Leu Leu Arg Leu Ala Lys 100 105 110 Gly Leu Leu Ala Ser Ala Leu Asn Pro Ala Gly Ile Pro Trp Arg Arg 115 120 125 Arg Val Leu Trp Leu Glu Lys Ala Ala His Ile Ala His Arg Phe Asp 130 135 140 Lys Glu Pro Leu Ala Asp Asp Lys Glu Arg Thr Glu Ala Ala Gly Val 145 150 155 160 Leu Ala Arg Cys Cys Leu His Leu Ala Leu Ala His Leu Pro Lys Gly 165 170 175 Lys Asp Lys Ser Ala Met Ala Glu Arg Gln Glu Asp Leu Leu Gln Ser 180 185 190 Leu Met Trp Ala Gln Lys Ala Ile Val Leu Ala Gly Gln Asp Lys Leu 195 200 205 Ser Gly Glu Glu Tyr Lys Leu Leu Lys Ala Leu Val Leu Ile Glu Leu 210 215 220 Asp Asn Leu Ser Pro Gly Arg Phe Gln Gln Gln Leu Asn Tyr Val Leu 225 230 235 240 Tyr Asp Leu Ala Val Ile Trp Leu Glu Arg Asp Thr Ala Thr Lys Pro 245 250 255 Phe His Pro Gln Glu Leu Phe Val Leu Trp Arg Tyr Leu Ala Thr Asp 260 265 270 Phe Glu Pro Asp Leu Asn Met Leu Leu Phe Lys Gly Ser Asn Thr Ser 275 280 285 Glu Arg Thr Ala Ala Val Gln Gln Ala Ser Pro Glu Ala Glu Arg Phe 290 295 300 Arg Pro Leu Leu Pro Leu Ile His Ala Trp Ser Ala Trp Lys Leu Asp 305 310 315 320 Pro Pro Asn Asn Lys Ile Ala Glu Val Ile Leu Gln Ala Val Asn Asn 325 330 335 Leu Asp Glu His Gln Val Tyr Glu Gln Val Trp Lys Trp Thr Val Asp 340 345 350 Phe Leu Gln Glu Leu Arg Asn Thr Gly Ala Val Asp Trp Gln Leu Pro 355 360 365 Ala Ile Ala Ala Trp Glu Leu Cys Asn Lys Lys Glu Lys Glu Leu Pro 370 375 380 Phe Gly Phe Gln Ile Arg Gln Tyr Trp Ser Arg Leu Asp Ser Leu Tyr 385 390 395 400 Arg Leu Ala Phe Asp Gly Ala Leu Glu Leu Lys Asp Cys Met Thr Ala 405 410 415 Ala Arg Ile Val Asp Ser Leu Lys Ser Arg Thr Pro Leu Thr Trp Arg 420 425 430 Asp Met Asp Thr Leu Phe Ala Lys Leu Pro Lys Glu Lys Ala Asp Gln 435 440 445 Leu Arg Glu Ala Phe Tyr Ser Met Glu Val Gln Ala Arg Met Gly Phe 450 455 460 Tyr Ala Glu Ala Lys Glu Asp Ala Asn Lys Leu Lys Lys Leu Leu Ala 465 470 475 480 Page 126
Ala Gln Val Arg Lys Ile Arg Asp Ile Glu Ser Val Pro Ala Gly Trp 485 490 495 Thr Val Val His Phe His Leu Arg Glu Asp Gln Asp Leu Gly Tyr Ala 500 505 510 Leu Ala Cys Arg Leu Thr Ala Asp Gly Met Ser Tyr Trp Thr Asn His 515 520 525 Ile Phe Pro Val Ala Gly Ile Arg Arg Ala Tyr Asp Cys Trp Leu Glu 530 535 540 Ala Tyr His Gly Met Glu Pro Gly Ala Arg Glu Lys Ser Gly Tyr Gln 545 550 555 560 Leu Val Glu Leu Ser Glu Ile Met Gly Lys Asp Leu Asp Phe Leu Phe 565 570 575 Glu Leu Ala Gly Glu Asp Gly Ala Arg Gly Leu Leu Phe Val Pro His 580 585 590 Gly Phe Ser His Leu Leu Pro Leu His Ala Ala Lys Lys Asp Gly Ser 595 600 605 Tyr Leu Phe Glu Lys Ile Pro Ser Leu Thr Leu Pro Ala Trp Glu Phe 610 615 620 Ala Pro Asp Val Asp Gln Ile Pro Val Ser Asp Gly Gln Asp Phe Cys 625 630 635 640 Phe Ile Ser Gln Arg Ala Asn Glu Gln Asp Leu Val Gly Asn Ile Glu 645 650 655 Arg Ser His Thr Trp Asn Gly Val Cys Asn Lys Asn Ala Ala Trp Thr 660 665 670 Asn Val Leu Asn Thr Asn Lys Glu Trp Ser Lys Ala Pro Pro Arg Trp 675 680 685 Leu Val Phe Trp Cys His Gly Gln Ala Asp Pro His Val Ala Phe Arg 690 695 700 Ser Lys Leu Leu Leu Gly Thr Leu Gly Val Ser Leu Phe Glu Ile Gln 705 710 715 720 Glu Ala Ala Leu Ser Leu Thr Gly Thr Lys Val Val Leu Ala Val Cys 725 730 735 Glu Ser Asp Leu Ala Pro Pro Glu Glu Tyr Glu Lys Thr Asp Asp His 740 745 750 Leu Ser Leu Ala Ala Pro Phe Leu Leu Lys Gly Ala Arg Gln Val Leu 755 760 765 Ala Ala Ile Trp Glu Gly Ala Gln Leu Asp Leu Leu Lys Ala Met Lys 770 775 780 Glu Met Leu Ser Asn Gln Asp Lys His Ser Trp Glu Ile Leu Arg Glu 785 790 795 800 Leu Gln Ser Cys Trp Met Arg Gln Pro Gly Ala Ile Phe Asn Asp Glu 805 810 815 Tyr Ile Arg Leu Tyr Tyr Ala Ala Ser Phe Arg Ile Leu Gly Phe Pro 820 825 830 Glu Val Ala Thr Thr Asn Met Ala Thr Ala Thr Ala Gln Glu Glu Ile 835 840 845 Ala Page 127
<210> 47
<211> 723
<212> PRT
<213> Deferribacteres <phylum>
<220>
<223> Deferribacteres bacterium TPR-CHAT AA
<400> 47
Met Lys Ala Leu Asp Gln Ser Leu Val Asn Glu Ile Phe Asp Arg Met 1 5 10 15 Phe Glu Lys Arg Leu Asn His Gly Cys Phe Ser Arg Phe Gln Gly Ile
Ser Leu Asn Leu Lys Glu Asp Leu Cys Arg Arg Ile Arg Ala Phe Ile 40 45 Glu Lys Pro Lys Val Glu Lys Leu Asp Ser Trp Lys Asp Ser Val Leu 50 55 60 Ser Arg Tyr Phe Asp Asp Thr Arg Lys Gly Glu Arg Tyr Leu Lys Asp 65 70 75 80 Pro Lys Gly Tyr Ala Arg Ala Val Glu Ala Leu Ile Arg Leu Leu Lys 85 90 95 Thr Val Glu Lys Thr Pro Glu Leu Cys Val Val Leu Ala Glu Leu Leu 100 105 110 Ile Gln Arg Ala Lys Leu Leu Asn Pro Lys Gly Phe Gly Ala Asn Pro 115 120 125 Lys Lys Ala Lys Met Leu Arg Glu Ala Val Ala Leu Leu Asp Glu Ala 130 135 140 Ile Lys Asn Lys Glu Lys Leu Ser Tyr Ala Tyr Arg Leu Lys Val Ser 145 150 155 160 Ala Tyr Tyr Glu Leu Lys Arg Glu Pro Ser Asp Ala Gly Val Pro Asp 165 170 175 Asn Tyr Tyr Glu Val Leu Glu Glu Ala Leu Arg Arg Asn Lys Glu Arg 180 185 190 Ile Asn Glu Pro Glu Phe Glu Leu Ala Ala Ile Glu Leu Ala Glu Ser 195 200 205 Glu Arg Thr Asp Val Gln Glu Leu Leu Val Lys Ile Gly Ser Asn Ser 210 215 220 Lys Asp Asn Thr His Arg Ser Met Ala Phe Ser Leu Leu Asn Gln Arg 225 230 235 240 Glu Lys Ala Lys Glu His Phe Asp Lys Ala Leu Asn Glu Thr Ala Glu 245 250 255 Leu Ala Arg Lys Gly Ile Leu Ser Phe Thr His Pro Val Trp Glu Lys 260 265 270 Leu His His Ala Ala Glu Leu Leu Asp Gly Asp Arg Asp Ala Ala Val 275 280 285 Ser Leu Trp Lys Thr Leu Lys Gly Phe Glu Asn Thr Arg Arg Tyr His 290 295 300 Gly Leu His Leu Leu Trp Tyr Trp Ser Arg Leu Thr Asp Ile Tyr Ser 305 310 315 320 Page 128
Leu Ala Phe Arg Arg Ala Tyr Asp Asp Gly Asp Tyr Leu Thr Ala Phe 325 330 335 Trp Val Ala Asp Gly Leu Lys Ala Arg Pro Leu Ile Gln Trp Gln Val 340 345 350 Leu Glu His Val Phe Arg Lys Glu Gly Phe Glu Asp Tyr Val Glu Ala 355 360 365 Glu Val Tyr Gly Arg Leu Gly Tyr Tyr Val Arg Lys Ser Gly Gly Leu 370 375 380 Lys Arg Arg Gly Thr Asp Ser Glu Phe Ala Pro Phe Glu Pro Pro Pro 385 390 395 400 Phe Lys Leu Lys Asp Gly Ile Ala Val Val Ser Leu Tyr Leu Ser Glu 405 410 415 Lys Glu Gly Tyr Gly Leu Val Leu Glu Gly Ser Gly Val Lys Ser Thr 420 425 430 Phe Arg Phe Asp Ala Asn Pro Leu Trp Lys Ile Tyr Leu Ser Tyr Ser 435 440 445 Glu Ala Leu Ser Arg Gly Ala Asp Val Leu Lys His Asn Ile Gly Thr 450 455 460 Val Cys Leu Glu Met Gly Lys Ala Phe Gly Glu Leu Phe Glu Leu Asp 465 470 475 480 Ser Asp Lys Ile Lys Arg Leu Val Ile Ile Pro His Gly Phe Leu His 485 490 495 Leu Phe Pro Phe His Gly Ala Tyr Ser Glu Glu Lys Gly Met Tyr Leu 500 505 510 Leu Glu Lys Phe Asp Val Ser Tyr Leu Pro Ser Phe Glu Leu Ala Leu 515 520 525 Lys Cys Phe Ser Asp Asp Thr Thr Phe Ser Asp Glu Lys Val Ala Leu 530 535 540 Ile Asp Glu Glu Ser Arg Asp Phe Ser Asp Phe Val Pro Glu Leu Glu 545 550 555 560 Glu Lys Gly Phe His Arg Ser Asp Val Arg Ile Phe Glu Lys Pro Glu 565 570 575 Lys Leu His Thr Leu Ala Ile Val Cys His Gly Lys Ala Asn Pro Ala 580 585 590 Asn Pro Phe Glu Ser Val Leu Lys Val Gly Glu Gly Ile Thr Val Lys 595 600 605 Asp Val Leu Ser Ala Gly Leu Lys Ala Lys Glu Val Tyr Leu Val Ala 610 615 620 Cys Glu Ser Asp Leu Ala Phe Pro Thr Ala Glu Gln Val Asp Glu His 625 630 635 640 Leu Ser Leu Gly Thr Val Phe Leu Ser Lys Lys Ser Arg Phe Val Phe 645 650 655 Ser Asn Leu Trp Glu Ala Lys Ile Glu Lys Leu Lys Asn Ile Val Pro 660 665 670 Glu Leu Val Asp Ala Asp Asp Lys Val Ser Val Leu Met Glu Glu Met 675 680 685 Lys Ser Lys Ile Pro Glu Gly Leu Pro Ser Glu Lys Gly Val Gly Tyr 690 695 700 Arg Leu Ala Asp Ala Leu Pro Phe Arg Ile Tyr Asp Val Asn Ile Phe 705 710 715 720 Page 129
Ala Gly Gly
<210> 48
<211> 456
<212> DNA
<213> Artificial Sequence
<220>
<223> CRISPR-1 array
<400> 48
tacaaaatgg ccccttctcg ccatatacgt aacctcagag ttgttggagg gttatgaaac 60
aagagaagga cttaatgtca cggtacccaa ttttctgccc cggactccac ggctgttact 120
agaggttatg aaacaagaga aggacttaat gtcacggtac ccaattttct gccccggact 180
ccacggctgt tactagaggt tatgaaacaa gagaaggact taatgtcacg gtacccaatt 240
ttctgccccg gactccacgg ctgttactag aggttatgaa acaagagaag gacttaatgt 300
cacggtaccc aattttctgc cccggactcc acggctgtta ctagaggtta tgaaacaaga 360
gaaggactta atgtcacggt acccaatttt ctgccccgga ctccacggct gttactagag 420
gttatgaaac aagagaagga cttaatgtca cggtac 456
<210> 49
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> Candidatus Scalindua brodae CRISPR repeat
<400> 49
gttatgaaac aagagaagga cttaatgtca cggtac 36
<210> 50
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer 1 - BN3227
<400> 50
gtcgtattaa tttccaccgt gtgcttctca aatgc 35
<210> 51
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer 2 - BN3228
<400> 51
caataaaccg gtaaataatc gtattgtaca cggccg 36
<210> 52
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer 3 - BN3225
<400> 52
ttttgctgaa acctcccagc aatagacata agcggc 36
<210> 53
<211> 41
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer 4 - BN3226
<400> 53
tgtacaatac gattatttac cggtttattg actaccggaa g 41
<210> 54
<211> 47
<212> RNA
<213> Artificial Sequence
<220>
<223> non-target RNA 1
<400> 54
gaccucgaaa uaaugaggga agcuguccaa augauaauua cgauuaa 47
<210> 55
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> ssDNA 1
<400> 55
ctctagtaac agccgtggag tccggggcag aaaattggac gattaa 46
<210> 56
<211> 46
<212> RNA
<213> Artificial Sequence
<220>
<223> target RNA with matching PFS
<400> 56
cucuaguaac agccguggag uccggggcag aaaauugggu accgug 46
<210> 57
<211> 46
<212> RNA
<213> Artificial Sequence
<220>
<223> Target RNA with non-matching PFS
<400> 57
cucuaguaac agccguggag uccggggcag aaaauuggac gauuaa 46
<210> 58
<211> 350
<212> PRT
<213> Syntrophorhabdaceae
<220>
<223> Syntrophorhabdaceae bacterium PtaU1.Bin034 gRAMP part 1 AA
<400> 58
Met Glu Ser Ile Pro Val Thr Leu Thr Phe Leu Glu Pro Tyr Arg Val 1 5 10 15 Val Glu Trp Tyr Ala Asn Glu Asp Arg Arg Ser Ala Glu Arg Tyr Leu
Arg Gly Gln Ser Phe Ala Arg Trp His Arg Lys Lys Asn Asp Lys Lys 40 45 Gly Arg Pro Tyr Ile Thr Gly Thr Leu Leu Arg Ser Ala Ala Ile Arg 50 55 60 Ala Ala Glu Glu Leu Leu Ser Leu Ser Gly Gly Val Trp Asp Gly Gln 65 70 75 80 His Cys Cys Lys Gly Gln Phe Leu Ser Gly Gly Val Lys Pro Glu Tyr 85 90 95 Met Arg Lys Arg Pro Thr Tyr Ile Trp Ala Glu Lys Glu Gly Ala Cys 100 105 110 Ser Ala Pro Asp Tyr Cys Pro Phe Cys Ile Phe Leu Gly Asp Arg Asp 115 120 125 Gln Ala Glu Lys Lys Ala Glu Ser Gln Asn Gly Tyr Pro Asp Lys Ser 130 135 140 Tyr His Ile Arg Phe Gly Asn Leu Ser Leu Pro Asp Pro Pro Pro Leu 145 150 155 160 Leu Asp Leu Lys Glu Val Ala Val Glu Arg Thr Leu Asn Arg Val Asp 165 170 175 Phe Gln Thr Ala Lys Ala His Asp Tyr Phe Lys Val Trp Glu Ile Ser 180 185 190 His Glu Asp Leu Gly Val Tyr Thr Gly Gln Ile Val Ile His Tyr Thr 195 200 205 Gly Pro Trp Gln Glu Lys Val Lys Ser Leu Leu Glu Gly Ser Leu Arg 210 215 220 Phe Val Asp Arg Leu Cys Gly Ala Leu Cys Lys Ile Glu Met Ala Pro 225 230 235 240 Lys Pro Ala Arg Pro Leu Pro Lys Ser Leu Ser Val Asp Met Thr Glu 245 250 255 His Ala Lys Ile Ile Val Thr Ala Phe Asp Asp Ala Lys Lys Ala Glu 260 265 270 Lys Val Arg Gly Leu Ala Asp Ala Met Arg Ser Met Gly Ser Lys Gly 275 280 285 Pro Thr Ile Leu Asp Lys Leu Pro Ala Gly His Asp Asp Arg Asp His 290 295 300 His Thr Trp Asp Val Thr Ile Val Asp Lys Thr Pro Leu Arg Thr Tyr 305 310 315 320 Leu Lys Gly Val Leu Arg Ala Asp Asp Ala Ala Ser Trp Pro Ala Leu 325 330 335 Page 133
Cys Lys Ala Leu Gly Asn Ala Leu Tyr Asp Val Ser Gln Gly 340 345 350
<210> 59
<211> 1322
<212> PRT
<213> Syntrophorhabdaceae
<220>
<223> Syntrophorhabdaceae bacterium PtaU1.Bin034 gRAMP part 2 AA
<400> 59
Met Arg Arg Gln Arg Leu Leu Gly Asp Ala Glu Tyr Tyr Gly Gly Thr 1 5 10 15 Gly Arg Glu Gln Pro Ala Ser Ile Val Ile Ser Thr Asp Ser Asp Pro
Asp His Lys Val Tyr Glu Trp Ile Ile Thr Gly Gln Leu Lys Ala Glu 40 45 Thr Gly Phe Phe Phe Gly Thr Lys Ala Gly Ala Gly Gly His Thr Asp 50 55 60 Leu Ser Ile Leu Leu Gly Lys Asp Gly His Tyr Arg Val Pro Arg Ser 65 70 75 80 Val Phe Arg Gly Ala Leu Arg Arg Asp Leu Arg Val Ala Phe Gly Ala 85 90 95 Gly Cys Arg Val Glu Val Gly Arg Glu Arg Pro Cys Glu Cys Pro Val 100 105 110 Cys Lys Val Met Arg Gln Ile Thr Val Met Asp Thr Ile Ser Ser Tyr 115 120 125 Arg Glu Ala Pro Glu Ile Arg Gln Arg Ile Arg Leu Asn Pro Tyr Thr 130 135 140 Gly Thr Val Asp Lys Gly Ala Leu Phe Asp Met Glu Val Gly Pro Glu 145 150 155 160 Gly Ile Glu Phe Pro Phe Val Leu Arg Phe Arg Gly Ser Lys Ser Phe 165 170 175 Pro Ser Glu Leu Ala Ala Val Ile Gly Ser Trp Thr Lys Gly Thr Ala 180 185 190 Trp Leu Gly Gly Ala Ala Ala Thr Gly Lys Gly Arg Phe Ser Leu Leu 195 200 205 Gly Leu Ser Ile His Lys Trp Asn Leu Ser Thr Ala Glu Gly Arg Lys 210 215 220 Ser Tyr Leu Ala Ala Tyr Gly Leu Arg Asp Ala Ala Asp Lys Thr Val 225 230 235 240 Lys Arg Leu Ser Ile Asp Lys Gly Gly Lys Gly Asp Val Gly Leu Pro 245 250 255 Ala Gly Leu Glu Arg Asp Ala Leu Pro Ser Ser Val Arg Glu Pro Leu 260 265 270 Trp Lys Lys Leu Val Cys Thr Val Asp Phe Ser Ser Pro Leu Leu Leu 275 280 285 Ala Asp Pro Ile Ala Ala Leu Leu Gly Val Glu Gly Asp Glu Arg Ile Page 134
Gly Phe Asp Asn Ile Ala Tyr Glu Lys Arg Arg Tyr Asn Gly Glu Thr 305 310 315 320 Asn Thr Thr Glu Ser Ile Pro Ala Val Lys Gly Glu Thr Phe Arg Gly 325 330 335 Ile Val Arg Thr Ala Leu Gly Lys Arg His Gly Asn Leu Thr Arg Asp 340 345 350 His Glu Asp Cys Arg Cys Arg Leu Cys Ala Val Phe Gly Lys Glu Gln 355 360 365 Glu Ala Gly Lys Ile Arg Phe Glu Asp Leu Met Pro Val Gly Ala Trp 370 375 380 Thr Arg Lys His Leu Asp His Val Ala Ile Asp Arg Phe His Gly Gly 385 390 395 400 Ala Glu Glu Asn Met Lys Phe Asp Thr Tyr Ala Leu Ala Ala Ser Pro 405 410 415 Thr Asn Pro Leu Arg Met Lys Gly Leu Ile Trp Val Arg Ser Asp Leu 420 425 430 Phe Glu Thr Gly His Asp Gly Pro Thr Pro Pro Tyr Val Lys Asp Ile 435 440 445 Ile Asp Ala Leu Ala Asp Val Lys Arg Gly Leu Tyr Pro Val Gly Gly 450 455 460 Lys Thr Gly Ser Gly Tyr Gly Trp Ile Lys Asp Val Thr Ile Asp Gly 465 470 475 480 Leu Pro Gln Gly Leu Ser Leu Pro Pro Ala Glu Glu Arg Val Asp Gly 485 490 495 Val Asn Glu Val Pro Pro Tyr Asn Tyr Ser Ala Pro Pro Asp Leu Pro 500 505 510 Ser Ala Ala Glu Gly Glu Tyr Phe Phe Pro His Val Phe Ile Lys Pro 515 520 525 Tyr Asp Lys Val Asp Arg Val Ser Arg Leu Thr Gly His Asp Arg Phe 530 535 540 Arg Gln Gly Arg Ile Thr Gly Arg Ile Thr Cys Thr Leu Lys Thr Leu 545 550 555 560 Thr Pro Leu Ile Ile Pro Asp Ser Glu Gly Ile Gln Thr Asp Ala Thr 565 570 575 Gly His Lys Met Cys Lys Phe Phe Ser Val Ala Gly Lys Pro Met Ile 580 585 590 Pro Gly Ser Glu Ile Arg Gly Met Ile Ser Ser Val Tyr Glu Ala Leu 595 600 605 Thr Asn Ser Cys Phe Arg Val Phe Asp Glu Glu Lys Tyr Leu Thr Arg 610 615 620 Arg Val Gln Pro Lys Lys Gly Ala Lys Ser Ser Glu Leu Val Pro Gly 625 630 635 640 Ile Ile Val Trp Gly Gln Asn Gly Gly Leu Ala Val Gln Gln Val Lys 645 650 655 Asn Ala Tyr Arg Val Pro Leu Tyr Asp Asp Pro Ala Val Thr Ser Ala 660 665 670 Ile Pro Thr Glu Ala Gln Lys Asn Lys Glu Arg Trp Glu Ser Val Pro 675 680 685 Ser Val Asn Leu Gln Gly Ala Leu Asp Trp Asn Leu Thr Thr Ala Asn Page 135
Ile Ala Arg Asp Asn Arg Thr Phe Leu Asn Ser Arg Pro Glu Glu Lys 705 710 715 720 Asp Ala Ile Leu Ser Gly Thr Lys Pro Ile Ser Phe Glu Leu Glu Gly 725 730 735 Thr Asn Pro Asn Asp Met Leu Val Arg Leu Val Pro Asp Gly Val Asp 740 745 750 Gly Ala His Ser Gly Tyr Leu Lys Phe Thr Gly Leu Asn Met Val Leu 755 760 765 Lys Ala Asn Lys Lys Thr Ser Arg Lys Leu Ala Pro Ser Glu Glu Asp 770 775 780 Val Arg Thr Leu Ala Ile Leu His Asn Asp Phe Asp Ser Arg Arg Asp 785 790 795 800 Trp Arg Arg Pro Pro Asn Ser Gln Arg Tyr Phe Pro Arg Ser Val Leu 805 810 815 Arg Phe Ser Leu Glu Arg Ser Thr Tyr Thr Ile Pro Lys Arg Cys Glu 820 825 830 Arg Val Phe Glu Gly Thr Cys Gly Glu Pro Tyr Ser Val Pro Ser Asp 835 840 845 Val Glu Arg Gln Tyr Asn Ser Ile Ile Asp Asp Ile Ser Lys Asn Tyr 850 855 860 Gly Arg Ile Ser Glu Thr Tyr Leu Thr Lys Thr Ala Asn Arg Lys Leu 865 870 875 880 Thr Val Gly Asp Leu Val Tyr Phe Ile Ala Asp Leu Asp Lys Asn Met 885 890 895 Ala Thr His Ile Leu Pro Val Phe Ile Ser Arg Ile Ser Asp Glu Lys 900 905 910 Pro Leu Gly Glu Leu Leu Pro Phe Ser Gly Lys Leu Ile Pro Cys Glu 915 920 925 Gly Glu Pro Pro Thr Ile Leu Lys Lys Met Ala Pro Ser Leu Leu Thr 930 935 940 Glu Ala Trp Arg Thr Leu Ile Ser Thr His Leu Glu Gly Phe Cys Pro 945 950 955 960 Ala Cys Arg Leu Phe Gly Thr Thr Ser Tyr Lys Gly Arg Ile Arg Phe 965 970 975 Gly Phe Ala Glu His Thr Gly Thr Pro Lys Trp Leu Arg Glu Glu Leu 980 985 990 Asp Trp Ala Arg Pro Phe Leu Thr Leu Pro Ile Gln Glu Arg Pro Arg 995 1000 1005 Pro Thr Trp Ser Val Pro Asp Asp Lys Ser Glu Val Pro Gly Arg Lys 1010 1015 1020 Phe Tyr Leu His His His Gly Gly Asn Arg Ile Val Glu Ser Asn Leu 1025 1030 1035 1040 Arg Asn Arg Pro Glu Val Asn Gln Thr Lys Asn Asn Ser Ser Val Glu 1045 1050 1055 Pro Ile Ser Ala Gly Asn Thr Phe Thr Phe Asp Val Cys Phe Glu Asn 1060 1065 1070 Leu Glu Ala Trp Glu Leu Gly Leu Leu Leu Tyr Cys Leu Glu Leu Ser 1075 1080 1085 Pro Lys Leu Ala His Lys Leu Gly Arg Ala Lys Ala Phe Gly Phe Gly Page 136
Ser Val Lys Ile His Val Glu Arg Ile Glu Glu Arg Thr Thr Asp Gly 1105 1110 1115 1120 Ala Tyr Gln Asp Val Thr Ala Val Lys Lys Asn Gly Trp Ile Thr Thr 1125 1130 1135 Gly His Asp Lys Leu Arg Glu Trp Phe His Arg Asp Asp Trp Glu Asp 1140 1145 1150 Val Asp His Ile Arg Asn Leu Arg Thr Val Leu Arg Phe Pro Asp Ala 1155 1160 1165 Asp Gln Glu His Asp Val Arg Tyr Pro Glu Leu Lys Ala Asn Asn Gly 1170 1175 1180 Val Ser Gly Tyr Val Glu Leu Arg Asp Lys Met Thr Ala Ser Glu Arg 1185 1190 1195 1200 Gln Glu Ser Leu Arg Thr Pro Trp Tyr Arg Trp Phe Pro Gln Asn Gly 1205 1210 1215 Thr Gly Gly Ser Gly Arg His Glu Gln Ala Ala Thr Ser Gln Glu Gln 1220 1225 1230 Asp Thr Ala Lys Asp Glu Ser Val Leu Ser Ala Thr Gln Arg Arg Gln 1235 1240 1245 Ala Val Ile Asp Val Ser Asp Pro Asp Glu Arg Leu Ser Gly Thr Val 1250 1255 1260 Glu Ser Phe Asp Arg Gln Lys Gly Asp Gly Tyr Ile Gly Cys Gly Val 1265 1270 1275 1280 Arg Gln Phe Tyr Val Arg Leu Glu Asp Ile Arg Ser Arg Thr Ala Leu 1285 1290 1295 Cys Glu Gly Gln Val Val Thr Phe Arg Ala Arg Lys Glu Trp Glu Gly 1300 1305 1310 His Glu Ala Tyr Asp Val Glu Ile Asp Gln 1315 1320 <210> 60 <211> 36 <212> DNA <213> Candidatus Jettenia <220> <223> Candidatus Jettenia caeni CRISPR repeat <400> 60 cttgaagact aaaggaagga attaatgtca cggtac 36 <210> 61 <211> 37 <212> DNA <213> Deferribacteres <phylum> <220> Page 137
<223> Deferribacteres CRISPR repeat
<400> 61 gttggtgcat cagcccggaa ttatgatgtt ttggtac 37
<210> 62
<211> 35
<212> DNA
<213> Desulfonema ishimotonii
<220>
<223> Desulfonema ishimotonii CRISPR repeat
<400> 62 ggttggaaag ccggttttct ttgatgtcac ggaac 35
<210> 63
<211> 3939
<212> DNA
<213> Artificial Sequence
<220>
<223> Plasmid 135-3
<400> 63 gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc 60 aacgcgcggg gagaggcggt ttgcgtattg ggcgccaggg tggtttttct tttcaccagt 120 gagacgggca acagctgatt gcccttcacc gcctggccct gagagagttg cagcaagcgg 180 tccacgctgg tttgccccag caggcgaaaa tcctgtttga tggtggttaa cggcgggata 240 taacatgagc tgtcttcggt atcgtcgtat cccactaccg agatgtccgc accaacgcgc 300 agcccggact cggtaatggc gcgcattgcg cccagcgcca tctgatcgtt ggcaaccagc 360 atcgcagtgg gaacgatgcc ctcattcagc atttgcatgg tttgttgaaa accggacatg 420 gcactccagt cgccttcceg ttccgctatc ggctgaattt gattgcgagt gagatattta 480 tgccagccag ccagacgcag acgcgccgag acagaactta atgggcccgc taacagcgcg 540 atttgctggt gacccaatgc gaccagatgc tccacgccca gtcgcgtacc gtcttcatgg 600 gagaaaataa tactgttgat gggtgtctgg tcagagacat caagaaataa cgccggaaca 660 Page 138 ttagtgcagg cagcttccac agcaatggca tcctggtcat ccagcggata gttaatgatc 720 agcccactga cgcgttgcgc gagaagattg tgcaccgcecg ctttacaggc ttcgacgccg 780 cttegttcta ccatcgacac caccacgctg gcacccagtt gatcggcgcg agatttaatc 840 gccgcgacaa tttgcgacgg cgcgtgcagg gccagactgg aggtggcaac gccaatcagc 900 aacgactgtt tgcccgccag ttgttgtgcc acgcggttgg gaatgtaatt cagctccgcc 960 atcgcegctt ccactttttc cegcgttttec gcagaaacgt ggctggcctg gttcaccacg 1020 cgggaaacgg tctgataaga gacaccggca tactctgcga catcgtataa cgttactggt 1080 ttcacattca ccaccctgaa ttgactctct tccgggcgct atcatgccat accgcgaaag 1140 gttttgcgcec attcgatggt gtcegggatc tcgacgctct cccttatgcg actcctgcat 1200 taggaaatta atacgactca ctatagggag agaattgtga gcggataaca attcccctgt 1260 agattaatta agcggccgcc ctgcaggact cgagttctag aaataatttt gtttaacttt 1320 aagaaggaga tatacatatg aaatcttctc accatcacca tcaccatggt tcttctatgg 1380 ctagcatgtc ggactcagaa gtcaatcaag aagctaagcc agaggtcaag ccagaagtca 1440 agcctgagac tcacatcaat ttaaaggtgt ccgatggatc ttcagagatc ttcttcaaga 1500 tcaaaaagac cactccttta agaaggctga tggaagcgtt cgctaaaaga cagggtaagg 1560 aaatggactc cttaagattc ttgtacgacg gtattagaat tcaagctgat cagacccctg 1620 aagatttgga catggaggat aacgatatta ttgaggctca cagagaacag attggtggga 1680 
tcgaggaaaa cctgtacttc caatccaata ttggaagtgg ataacggatc cgcgatcgcg 1740 gcgcgccacc tggtggccgg ccggtaccac gcgtgcgcegc tgatccggct gctaacaaag 1800 cccgaaagga agctgagttg gctgctgcca ccgctgagca ataactagca taaccccttg 1860 gggcctctaa acgggtcttg aggggttttt tgctgaaacc tcaggcattt gagaagcaca 1920 cggtcacact gcttccggta gtcaataaac cggtaaacca gcaatagaca taagcggcta 1980 tttaacgacc ctgccctgaa ccgacgaccg ggtcatcgtg gccggatctt gceggccccte 2040 ggcttgaacg aattgttaga cattatttgc cgactacctt ggtgatcteg cctttcacgt 2100 aatggacaaa ttcttccaac tgatctgcgc gcgaggccaa gcgatcttct tcttgtccaa 2160 Page 139 gataagcctg tctagcttca agtatgacgg gctgatactg ggccggcagg cgctccattg 2220 cccagtcggc agcgacatcc ttcggcgcga ttttgcceggt tactgcgctg taccaaatgc 2280 gggacaacgt aagcactaca tttegctcat cgccagccca gtcgggcggc gagttccata 2340 gcgttaaggt ttcatttagc gcctcaaata gatcctgttc aggaaccgga tcaaagagtt 2400 cctcegccgc tggacctacc aaggcaacgc tatgttctet tgcttttgtc agcaagatag 2460 ccagatcaat gtcgatcgtg gctggctcga agatacctgc aagaatgtca ttgcgctgcec 2520 attctccaaa ttgcagttcg cgcttagctg gataacgcca cggaatgatg tcgtcgtgca 2580 caacaatggt gacttctaca gcgcggagaa tctcgctctc tccaggggaa gccgaagttt 2640 ccaaaaggtc gttgatcaaa gctegccgcg ttgtttcatc aagccttacg gtcaccgtaa 2700 ccagcaaatc aatatcactg tgcggcttca ggccgccatc cactgcggag ccgtacaaat 2760 gtacggccag caacgtcggt tcgagatggc gctcgatgac gccaactacc tctgatagtt 2820 gagtcgatac ttcggcgatc accgcttccc tcatactctt cctttttcat tattattgaa 2880 gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 2940 aacaaatagc tagctcactc ggtcgctacg ctccgggcgt gagactgcgg cgggcgctgec 3000 ggacacatac aaagttaccc acagattccg tggataagca ggggactaac atgtgaggca 3060 aaacagcagg gccgcgcecgg tggegttttt ccataggctc cgccctcctg ccagagttca 3120 cataaacaga cgcttttccg gtgcatctgt gggagccgtg aggctcaacc atgaatctga 3180 cagtacgggc gaaacccgac aggacttaaa gatccccacc gtttceggcg ggtegctcce 3240 tcttgegete tcctgttecg accctgccgt ttaccggata cctgttcegc ctttetccct 3300 tacgggaagt gtggcgcttt ctcatagctc acacactggt atctcggctc ggtgtaggtc 
3360 gttegctcca agctgggctg taagcaagaa ctccccgttc agcccgactg ctgcgcctta 3420 tceggtaact gttcacttga gtccaacccg gaaaagcacg gtaaaacgcc actggcagca 3480 gccattggta actgggagtt cgcagaggat ttgtttagct aaacacgcgg ttgctcttga 3540 agtgtgcgcc aaagtccggc tacactggaa ggacagattt ggttgctgtg ctctgcgaaa 3600 gccagttacc acggttaagc agttccccaa ctgacttaac cttcgatcaa accacctccc 3660 Page 140 caggtggttt tttcgtttac agggcaaaag attacgcgca gaaaaaaagg atctcaagaa 3720 gatcctttga tcttttctac tgaaccgctc ttgatttcag tgcaatttat ctcttcaaat 3780 gtagcacctg aagtcagccc catacgatat aagttgtaat tctcatgtta gtcatgcccc 3840 gcgcccaccg gaaggagctg actgggttga aggctctcaa gggcatcggt cgagatcccg 3900 gtgcctaatg agtgagctaa cttacattaa ttgcgttgc 3939 <210> 64 <211> 4008 <212> DNA <213> Artificial Sequence <220> <223> Plasmid pACYC Duet-1 <400> 64 ggggaattgt gagcggataa caattcccct gtagaaataa ttttgtttaa ctttaataag 60 gagatatacc atgggcagca gccatcacca tcatcaccac agccaggatc cgaattcgag 120 ctcggegege ctgcaggtcg acaagcttgc ggccgcataa tgcttaagtc gaacagaaag 180 taatcgtatt gtacacggcc gcataatcga aattaatacg actcactata ggggaattgt 240 gagcggataa caattcccca tcttagtata ttagttaagt ataagaagga gatatacata 300 tggcagatct caattggata tcggccggcc acgcgatcgc tgacgtcggt accctcgagt 360 ctggtaaaga aaccgctgct gcgaaatttg aacgccagca catggactcg tctactagcg 420 cagcttaatt aacctaggct gctgccaccg ctgagcaata actagcataa ccccttgggg 480 cctctaaacg ggtcttgagg ggttttttgc tgaaacctca ggcatttgag aagcacacgg 540 tcacactgct tccggtagtc aataaaccgg taaaccagca atagacataa gcggctattt 600 aacgaccctg ccctgaaccg acgaccgggt cgaatttgct ttcgaatttc tgccattcat 660 ccgcttatta tcacttattc aggcgtagca ccaggcgttt aagggcacca ataactgcct 720 taaaaaaatt acgccccgcc ctgccactca tcgcagtact gttgtaattc attaagcatt 780 ctgccgacat ggaagccatc acagacggca tgatgaacct gaatcgccag cggcatcagc 840 Page 141 accttgtegc cttgcgtata atatttgccc atagtgaaaa cgggggcgaa gaagttgtcc 900 atattggcca cgtttaaatc aaaactggtg aaactcaccc agggattggc tgagacgaaa 960 aacatattct caataaaccc tttagggaaa taggccaggt tttcaccgta acacgccaca 1020 
tcttgcgaat atatgtgtag aaactgccgg aaatcgtcgt ggtattcact ccagagcgat 1080 gaaaacgttt cagtttgctc atggaaaacg gtgtaacaag ggtgaacact atcccatatc 1140 accagctcac cgtctttcat tgccatacgg aactccggat gagcattcat caggcgggca 1200 agaatgtgaa taaaggccgg ataaaacttg tgcttatttt tctttacggt ctttaaaaag 1260 gccgtaatat ccagctgaac ggtctggtta taggtacatt gagcaactga ctgaaatgcc 1320 tcaaaatgtt ctttacgatg ccattgggat atatcaacgg tggtatatcc agtgattttt 1380 ttetccattt tagcttcett agctcctgaa aatctcgata actcaaaaaa tacgcccggt 1440 agtgatctta tttcattatg gtgaaagttg gaacctctta cgtgccgatc aacgtctcat 1500 tttcgccaaa agttggccca gggcttcceg gtatcaacag ggacaccagg atttatttat 1560 tctgcgaagt gatcttcegt cacaggtatt tattcggcgc aaagtgcgtc gggtgatgct 1620 gccaacttac tgatttagtg tatgatggtg tttttgaggt gctccagtgg cttctgttte 1680 tatcagctgt ccctcctgtt cagctactga cggggtggtg cgtaacggca aaagcaccgc 1740 cggacatcag cgctagcgga gtgtatactg gcttactatg ttggcactga tgagggtgtce 1800 agtgaagtgc ttcatgtggc aggagaaaaa aggctgcacc ggtgcgtcag cagaatatgt 1860 gatacaggat atattccegct tcctegctca ctgactcgct acgctecggtc gttcgactgc 1920 ggcgagcgga aatggcttac gaacggggcg gagatttcct ggaagatgcc aggaagatac 1980 ttaacaggga agtgagaggg ccgcggcaaa gccgtttttc cataggctcc gcccccctga 2040 caagcatcac gaaatctgac gctcaaatca gtggtggcga aacccgacag gactataaag 2100 ataccaggcg tttcccctgg cggctccctc gtgegctcte ctgttcetgec cttteggttt 2160 accggtgtca ttccgctgtt atggccgcgt ttgtctcatt ccacgcctga cactcagttc 2220 cgggtaggca gttcgctcca agctggactg tatgcacgaa ccccccegttc agtccgaccg 2280 ctgcgcctta tccggtaact atcgtcttga gtccaacccg gaaagacatg caaaagcacc 2340 Page 142 actggcagca gccactggta attgatttag aggagttagt cttgaagtca tgcgccggtt 2400 aaggctaaac tgaaaggaca agttttggtg actgcgctcc tccaagccag ttacctcggt 2460 tcaaagagtt ggtagctcag agaaccttcg aaaaaccgcc ctgcaaggcg gttttttegt 2520 tttcagagca agagattacg cgcagaccaa aacgatctca agaagatcat cttattaatc 2580 agataaaata tttctagatt tcagtgcaat ttatctcttc aaatgtagca cctgaagtca 2640 gccccatacg atataagttg taattctcat gttagtcatg ccccgcgccc accggaagga 
2700 gctgactggg ttgaaggctc tcaagggcat cggtcgagat cccggtgcct aatgagtgag 2760 ctaacttaca ttaattgcgt tgcgctcact gccegctttc cagtcgggaa acctgtcgtg 2820 ccagctgcat taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgcca 2880 gggtggtttt tettttcacc agtgagacgg gcaacagctg attgcccttc accgcctggc 2940 cctgagagag ttgcagcaag cggtccacgc tggtttgcecc cagcaggcga aaatcctgtt 3000 tgatggtggt taacggcggg atataacatg agctgtcttc ggtatcgtcg tatcccacta 3060 ccgagatgtc cgcaccaacg cgcagcccgg actcggtaat ggcgcgcatt gcgcccagcg 3120 ccatctgatc gttggcaacc agcatcgcag tgggaacgat gccctcattc agcatttgca 3180 tggtttgttg aaaaccggac atggcactcc agtcgcettc ccgttcegct atcggctgaa 3240 tttgattgcg agtgagatat ttatgccagc cagccagacg cagacgcgcc gagacagaac 3300 ttaatgggcc cgctaacagc gcgatttgct ggtgacccaa tgcgaccaga tgctccacgc 3360 ccagtcgcgt accgtcttca tgggagaaaa taatactgtt gatgggtgtc tggtcagaga 3420 catcaagaaa taacgccgga acattagtgc aggcagcttc cacagcaatg gcatcctggt 3480 catccagcgg atagttaatg atcagcccac tgacgcgttg cgcgagaaga ttgtgcaccg 3540 ccgctttaca ggcttcgacg ccgcttecgtt ctaccatcga caccaccacg ctggcaccca 3600 gttgatcggc gcgagattta atcgccgcga caatttgcga cggcgcgtgc agggccagac 3660 tggaggtggc aacgccaatc agcaacgact gtttgcccgc cagttgttgt gccacgcggt 3720 tgggaatgta attcagctcc gccatcgceg cttccacttt ttecccecgegtt ttcgcagaaa 3780 cgtggctggc ctggttcacc acgcgggaaa cggtctgata agagacaccg gcatactctg 3840 Page 143 cgacatcgta taacgttact ggtttcacat tcaccaccct gaattgactc tcttccgggc 3900 gctatcatgc cataccgcga aaggttttgc gccattcgat ggtgtccggg atctegacgc 3960 tectcccttat gcgactcctg cattaggaaa ttaatacgac tcactata 4008 <210> 65 <211> 1371 <212> DNA <213> Candidatus Scalindua brodae <220> <223> Candidatus Scalindua broadae CRISPR array amplicon <400> 65 cagacaaacg gtttgcgaag aaatacgcga cagggtgatt ggaccgtaac ctcatgatta 60 tatgattgat acacgattta accctgactt gccggttttt gaaaaagttc gcaaaccctg 120 ttttgcttca tgaagtgagt tgggtttgcg aaaaaaggtt attacagcct gatatctaag 180 tagaagagta ccggtattga agaccaaagt tgctgcgtat ggcggtcecgg ttgtccttgec 240 tttcgcaagg 
attccaatac tggaatcctc ccgaaaggga ggtcgcaaaa ggccgttttt 300 cgaaaaccat agtttcatac aaaccggcga tgaggtttgc gaactttttg attgtagtaa 360 gtattattaa aataatggct taatattttt ggtatataca attctcaact ttttcacctt 420 gccggaaatg aggtttgcga aattttagag agccgcatat ctatattatt tacaatcagt 480 tacaaaatgg ccccttcteg ccatatacgt aacctcagag ttgttggagg gttatgaaac 540 aagagaagga cttaatgtca cggtacccaa ttttectgccc cggactccac ggctgttact 600 agaggttatg aaacaagaga aggacttaat gtcacggtac agtttcctgt tttttttgct 660 ccctaacgct actttgaatg ttatgaaaca agagaaggac ttaatgtcac ggtacaatta 720 tcatttggac agcttccctc attatttcga ggtcgttatg aaacaagaga aggacttaat 780 gtcacggtac gaaaaaaaaa aagtaaagtt caggggcaag tgccaaagtt atgaaacaag 840 agaaggactt aatgtcacgg tacccctttg cttcttctct agtgtttcta tccatgtttg 900 tgttatgaaa caagagaagg acttaatgtc acggtactta cgaagtatct ccgtacgaac 960 cttttcactg tgttatgaaa caagagaagg acttaatgtc acggtacaga attggtatta 1020 Page 144 tttttcccag tgtaataata ccgttatgaa acaagagaag gacttaatgt cacggtacac 1080 catttttgtc attatttatt gtcatgttag aaagttatga aacaagagaa ggacttaatg 1140 tcacggtact cttcagcaat tacttcttta cgaagagata actttgttat gaaacaagag 1200 aaggacttaa tgtcacggta caaaatctca agcctcaagc atatactcaa aatcattagt 1260 tatgaaacaa gagaaggact taatgtcacg gtacatcatt accatccata tgttctgatg 1320 actgtctctg ctgtagttat gaaacaagag aagtacttag gtttgattga a 1371 <210> 66 <211> 52 <212> DNA <213> Artificial Sequence <220> <223> Primer 5 <400> 66 catatggcag atctcaattg gatatccaga caaacggttt gcgaagaaat ac 52 <210> 67 <211> 49 <212> DNA <213> Artificial Sequence <220> <223> Primer 6 <400> 67 gagggtaccg acgtcagcga tcttcaatca aacctaagta cttctcttg 49 <210> 68 <211> 4731 <212> DNA <213> Artificial Sequence <220> <223> Plasmid 2AT <400> 68 Page 145 ttcttgaaga cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat 60 aatggtttct tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 120 tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 180 gcttcaataa cattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 240 
toccctttttt gceggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 300 aaaagatgct gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 360 cggtaagatc cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 420 agttctgcta tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg 480 ccgcatacac tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 540 tacggatggc atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 600 tgcggccaac ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 660 caacatgggg gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 720 accaaacgac gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact 780 attaactggc gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 840 ggataaagtt gcaggaccac ttctgcgctc ggcccttceg gctggctggt ttattgctga 900 taaatctgga gccggtgagc gtgggtctecg cggtatcatt gcagcactgg ggccagatgg 960 taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1020 aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1080 agtttactca tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1140 ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1200 ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1260 cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1320 tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1380 tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1440 tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1500 Page 146 tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 1560 ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 1620 acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 1680 ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 1740 gtatctttat agtcctgteg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 1800 ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttect 1860 ggccttttge tggccttttg ctcacatgtt ctttcctgeg ttatcccctg attctgtgga 1920 taaccgtatt 
accgcctttg agtgagctga taccgctegc cgcagccgaa cgaccgagcg 1980 cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg cggtattttec tccttacgca 2040 tctgtgcggt atttcacacc gcatatatgg tgcactctca gtacaatctg ctctgatgcc 2100 gcatagttaa gccagtatac actccgctat cgctacgtga ctgggtcatg gctgegcccec 2160 gacacccgcc aacacccgct gacgcgccct gacgggettg tctgctcceg gcatccegctt 2220 acagacaagc tgtgaccgtc tccgggagct gcatgtgtca gaggttttca ccgtcatcac 2280 cgaaacgcgc gaggcagctg cggtaaagct catcagcgtg gtcgtgaagc gattcacaga 2340 tgtctgcctg ttcatcecgeg tccagctcgt tgagtttectc cagaagcgtt aatgtctggc 2400 ttctgataaa gcgggccatg ttaagggcgg ttttttcetg tttggtcact gatgcctccg 2460 tgtaaggggg atttctgttc atgggggtaa tgataccgat gaaacgagag aggatgctca 2520 cgatacgggt tactgatgat gaacatgccc ggttactgga acgttgtgag ggtaaacaac 2580 tggcggtatg gatgcggcgg gaccagagaa aaatcactca gggtcaatgc cagcgcttcg 2640 ttaatacaga tgtaggtgtt ccacagggta gccagcagca tcctgcgatg cagatccgga 2700 acataatggt gcagggcgct gacttccgeg tttccagact ttacgaaaca cggaaaccga 2760 agaccattca tgttgttgct caggtcgcag acgttttgca gcagcagteg cttcacgttec 2820 gctcgegtat cggtgattca ttctgctaac cagtaaggca accccgccag cctagccggg 2880 tcctcaacga caggagcacg atcatgcgca cccgtggcca ggacccaacg ctgcccgaga 2940 tgegccgcgt gcggctgctg gagatggcgg acgcgatgga tatgttctgc caagggttgg 3000 Page 147 tttgcgcatt cacagttctc cgcaagaatt gattggctcc aattcttgga gtggtgaatc 3060 cgttagcgag gtgccgcegg cttccattca ggtcgaggtg gcccggctcc atgcaccgcg 3120 acgcaacgcg gggaggcaga caaggtatag ggcggcgcct acaatccatg ccaacccgtt 3180 ccatgtgctc gccgaggcgg cataaatcgc cgtgacgatc agcggtccag tgatcgaagt 3240 taggctggta agagccgcga gcgatccttg aagctgtccc tgatggtcgt catctacctg 3300 cctggacagc atggcctgca acgcgggcat cccgatgccg ccggaagcga gaagaatcat 3360 aatggggaag gccatccagc ctcgcgtcgc gaacgccagc aagacgtagc ccagcgcgtc 3420 ggccgccatg ccggcgataa tggcctgctt ctcgccgaaa cgtttggtgg cgggaccagt 3480 gacgaaggct tgagcgaggg cgtgcaagat tccgaatacc gcaagcgaca ggccgatcat 3540 cgtegcgctc cagcgaaagc ggtcctegcc gaaaatgacc cagagcgctg ccggcacctg 3600 
tcctacgagt tgcatgataa agaagacagt cataagtgcg gcgacgatag tcatgccccg 3660 cgcccaccgg aaggagctga ctgggttgaa ggctctcaag ggcatcggtc gacgctctce 3720 cttatgcgac tcctgcatta ggaagcagcc cagtagtagg ttgaggccgt tgagcaccgc 3780 cgccgcaagg aatggtgcat gcaaggagat ggcgcccaac agtcccccgg ccacggggcc 3840 tgccaccata cccacgccga aacaagcgct catgagcccg aagtggcgag cccgatcttc 3900 cccatcggtg atgtcggcga tataggcgcc agcaaccgca cctgtggcgc cggtgatgcc 3960 ggccacgatg cgtccggcgt agaggatcga gatctcgatc ccgcgaaatt aatacgactc 4020 actataggga gaccacaacg gtttccctct agtgccggct ccggagagct ctttaattaa 4080 gcggccgccc tgcaggactc gagttctaga aataattttg tttaacttta agaaggagat 4140 atagatatcc caactccata aggatccgcg atcgcggcgc gccacctggt ggccggccgg 4200 taccacgcgt gcgcgctgat ccggctgcta acaaagcccg aaaggaagct gagttggctg 4260 ctgccaccgc tgagcaataa ctagcataac cccttggggc ctctaaacgg gtcttgaggg 4320 gttttttgct gaaaggagga actatatccg gacatccaca ggacgggtgt ggtcgccatg 4380 atcgcgtagt cgatagtggc tccaagtagc gaagcgagca ggactgggcg gcggccaaag 4440 cggtcggaca gtgctccgag aacgggtgcg catagaaatt gcatcaacgc atatagcgct 4500 Page 148 agcagcacgc catagtgact ggcgatgctg tcggaatgga cgacatcccg caagaggccc 4560 ggcagtaccg gcataaccaa gcctatgcct acagcatcca gggtgacggt gccgaggatg 4620 acgatgagcg cattgttaga tttcatacac ggtgcctgac tgcgttagca atttaactgt 4680 gataaactac cgcattaaag cttatcgatg ataagctgtc aaacatgaga a 4731 <210> 69 <211> 827 <212> DNA <213> Artificial Sequence <220> <223> Gibson assembly sequence 1 <400> 69 cgtctaagaa accatagcca tccagtttac tttgcagggc ttcccaacct taccagaggg 60 cgccccagct ggcaattccg acgtcttaag acccactttc acatttaagt tgtttttcta 120 atccgcatat gatcaattca aggccgaata agaaggctgg ctctgcacct tggtgatcaa 180 ataattcgat agcttgtcgt aataatggcg gcatactatc agtagtaggt gtttccettt 240 cttctttagc gacttgatgc tcttgatctt ccaatacgca acctaaagta aaatgcccca 300 cagcgctgag tgcatataat gcattctcta gtgaaaaacc ttgttggcat aaaaaggcta 360 attgattttc gagagtttca tactgttttt ctgtaggccg tgtacctaaa tgtacttttg 420 ctccatcgcg atgacttagt aaagcacatc taaaactttt agcgttatta 
cgtaaaaaat 480 cttgccagct ttccccttct aaagggcaaa agtgagtatg gtgcctatct aacatctcaa 540 tggctaaggc gtcgagcaaa gcccgcttat tttttacatg ccaatacaat gtaggctgct 600 ctacacctag cttctgggcg agtttacggg ttgttaaacc ttcgattceg acctcattaa 660 gcagctctaa tgcgctgtta atcactttac ttttatctaa tctagacatc attaattcct 720 aatttttgtt gacactctat cgttgataga gttattttac cactccctat cagtgataga 780 gaaaagaatt caaaagatct aggaggaaaa aaatggctct agtaaca 827 <210> 70 <211> 1122
<212> DNA <213> Artificial Sequence <220> <223> Gibson assembly sequence 2 <400> 70 aacaggagtc caagcaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa 60 atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga 120 ggcacctatc tcagcgatct gtctatttcg ttcatccata gttgcctgac tcccegtegt 180 gtagataact acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg 240 agacccacgc tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga 300 gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga 360 agctagagta agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctgcagg 420 catcgtggtg tcacgctegt cgtttggtat ggcttcattc agctceggtt cccaacgatc 480 aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcec 540 gatcgttgtc agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca 600 taattctctt actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac 660 caagtcattc tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaacacg 720 ggataatacc gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc 780 ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg 840 tgcacccaac tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac 900 aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat 960 actcttcctt tttcaatgtt attgaagcat ttatcagggt tattgtctca tgagcggata 1020 catatttgaa tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa 1080 agtgccacct gacgtctaag aaaccatagc catccagttt ac 1122 <210> 71 <211> 35 <212> DNA
<213> Artificial Sequence
<220>
<223> Primer 7
<400> 71 aacaggagtc caagcaagga tcttcaccta gatcc 35
<210> 72
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer 8
<400> 72 gtaaactgga tggctatggt ttcttagacg tcagg 35
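Primer 7 and Primer 8 (SEQ ID NOs: 71 and 72) appear to correspond to the two ends of Gibson assembly sequence 2 (SEQ ID NO: 70): Primer 7 reads the same as the first 35 nucleotides, and Primer 8 reads as the reverse complement of the last 35 nucleotides. The following Python sketch checks that relationship. The end fragments are copied from the listing above; the function and variable names are illustrative assumptions and are not part of the listing.

```python
# Minimal sketch: check how Primer 7 and Primer 8 relate to the ends of
# Gibson assembly sequence 2 (SEQ ID NO: 70). Names are hypothetical.

COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# SEQ ID NO: 71 and SEQ ID NO: 72, with the spacing of the listing removed.
primer_7 = "aacaggagtccaagcaaggatcttcacctagatcc"
primer_8 = "gtaaactggatggctatggtttcttagacgtcagg"

# First 40 and last 42 nucleotides of SEQ ID NO: 70 as printed in the listing.
seq70_head = "aacaggagtccaagcaaggatcttcacctagatcctttta"
seq70_tail = "agtgccacctgacgtctaagaaaccatagccatccagtttac"

# Forward primer should match the 5' end of the top strand.
forward_matches = seq70_head.startswith(primer_7)

# Reverse primer should match the 3' end once reverse-complemented.
reverse_matches = seq70_tail.endswith(reverse_complement(primer_8))

print("Primer 7 matches 5' end:", forward_matches)
print("Primer 8 matches 3' end (reverse complement):", reverse_matches)
```

The same check could be applied to any primer pair and amplicon in the listing by substituting the relevant sequence fragments.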
<210> 73
<211> 2495
<212> DNA
<213> Artificial Sequence
<220>
<223> Gibson assembly sequence 3
<400> 73 caaaaataat aggccccagg catcaaataa aacgaaaggc tcagtcgaaa gactgggcct 60 ttegttttat ctgttgtttg tcggtgaacg ctctctacta gagtcacact ggctcacctt 120 cgggtgggcc tttctgcegtt tatacctagg gtacgggttt tgctgcccgc aaacgggctg 180 ttctggtgtt gctagtttgt tatcagaatc gcagatccgg cttcagccgg tttgccggct 240 gaaagcgcta tttcttccag aattgccatg attttttccc cacgggaggc gtcactggct 300 cccgtgttgt cggcagcttt gattcgataa gcagcatcgc ctgtttcagg ctgtctatgt 360 gtgactgttg agctgtaaca agttgtctca ggtgttcaat ttcatgttct agttgctttg 420 ttttactggt ttcacctgtt ctattaggtg ttacatgctg ttcatctgtt acattgtcga 480 tctgttcatg gtgaacagct ttgaatgcac caaaaactcg taaaagctct gatgtatcta 540 Page 151 tcttttttac accgttttca tctgtgcata tggacagttt tccctttgat atgtaacggt 600 gaacagttgt tctacttttg tttgttagtc ttgatgcttc actgatagat acaagagcca 660 taagaacctc agatccttcc gtatttagcc agtatgttct ctagtgtggt tcgttgtttt 720 tgcgtgagcc atgagaacga accattgaga tcatacttac tttgcatgtc actcaaaaat 780 tttgcctcaa aactggtgag ctgaattttt gcagttaaag catcgtgtag tgtttttett 840 agtccgttat gtaggtagga atctgatgta atggttgttg gtattttgtc accattcatt 900 tttatctggt tgttctcaag ttcggttacg agatccattt gtctatctag ttcaacttgg 960 aaaatcaacg tatcagtcgg gcggcctcge ttatcaacca ccaatttcat attgctgtaa 1020 gtgtttaaat ctttacttat tggtttcaaa acccattggt taagcctttt aaactcatgg 1080 tagttatttt caagcattaa catgaactta aattcatcaa ggctaatctc tatatttgcc 1140 ttgtgagttt tcttttgtgt tagttctttt aataaccact cataaatcct catagagtat 1200 ttgttttcaa aagacttaac atgttccaga ttatatttta tgaatttttt taactggaaa 1260 agataaggca atatctcttc actaaaaact aattctaatt tttcgcttga gaacttggca 1320 tagtttgtcc actggaaaat ctcaaagcct ttaaccaaag gattcctgat ttccacagtt 1380 ctcgtcatca gctctctggt tgctttagct aatacaccat aagcattttc cctactgatg 1440 ttcatcatct gagcgtattg gttataagtg aacgataccg tccgttcttt ccttgtaggg 1500 ttttcaatcg tggggttgag tagtgccaca cagcataaaa ttagcttggt ttcatgctcec 1560 gttaagtcat agcgactaat cgctagttca tttgctttga aaacaactaa ttcagacata 1620 catctcaatt ggtctaggtg attttaatca ctataccaat tgagatgggc tagtcaatga 1680 
taattactag tcctttteccc gggtgatctg ggtatctgta aattctgcta gacctttgct 1740 ggaaaacttg taaattctgc tagaccctct gtaaattccg ctagaccttt gtgtgttttt 1800 tttgtttata ttcaagtggt tataatttat agaataaaga aagaataaaa aaagataaaa 1860 agaatagatc ccagccctgt gtataactca ctactttagt cagttccgca gtattacaaa 1920 aggatgtcgc aaacgctgtt tgctcctcta caaaacagac cttaaaaccc taaaggctta 1980 agtagcaccc tcgcaagctc gggcaaatcg ctgaatattc cttttgtctc cgaccatcag 2040 Page 152 gcacctgagt cgctgtcttt ttcgtgacat tcagttegct gcgctcacgg ctctggcagt 2100 gaatgggggt aaatggcact acaggcgcct tttatggatt catgcaagga aactacccat 2160 aatacaagaa aagcccgtca cgggcttctc agggcgtttt atggcgggtc tgctatgtgg 2220 tgctatctga ctttttgetg ttcagcagtt cctgccectct gattttccag tctgaccact 2280 tcggattatc ccgtgacagg tcattcagac tggctaatgc acccagtaag gcagcggtat 2340 catcaacagg cttacccgtce ttactgtccc tagtgcttgg attctcacca ataaaaaacg 2400 cccggcggca accgagcgtt ctgaacaaat ccagatggag ttctgaggtc attactggat 2460 ctatcaacag gagtccaagc aaggatcttc accta 2495 <210> 74 <211> 856 <212> DNA <213> Artificial Sequence <220> <223> Gibson assembly sequence 4 - GFP with target <400> 74 tctaggagga aaaaaatggc tctagtaaca gccgtggagt ccggggcaga aaattggaaa 60 ttaagtaaag gagaagaact tttcactgga gttgtcccaa ttcttgttga attagatggt 120 gatgttaatg ggcacaaatt ttctgtcagt ggagagggtg aaggtgatgc aacatacgga 180 aaacttaccc ttaaatttat ttgcactact ggaaaactac ctgttccatg gccaacactt 240 gtcactactt tctcttatgg tgttcaatgc ttttcecgtt atccggatca tatgaaacgg 300 catgactttt tcaagagtgc catgcccgaa ggttatgtac aggaacgcac tatatctttec 360 aaagatgacg ggaactacaa gacgcgtgct gaagtcaagt ttgaaggtga tacccttgtt 420 aatcgtatcg agttaaaagg tattgatttt aaagaagatg gaaacattct cggacacaaa 480 ctcgagtaca actataactc acacaatgta tacatcacgg cagacaaaca aaagaatgga 540 atcaaagcta acttcaaaat tcgccacaac attgaagatg gatccgttca actagcagac 600 cattatcaac aaaatactcc aattggcgat ggccctgtcc ttttaccaga caaccattac 660 Page 153 ctgtcgacac aatctgccct ttcgaaagat cccaacgaaa agcgtgacca catggtcctt 720 cttgagtttg taactgctgc tgggattaca catggcatgg 
atgagctcta caaataatga 780 attccaactg agcgccggtc gctaccatta ccaacttgtc tggtgtcaaa aataataggc 840 cccaggcatc aaataa 856 <210> 75 <211> 856 <212> DNA <213> Artificial Sequence <220> <223> Gibson assembly sequence 4 - GFP without target <400> 75 tctaggagga aaaaaatggg acctcgaaat aatgagggaa gctgtccaaa tgataattac 60 attagtaaag gagaagaact tttcactgga gttgtcccaa ttcttgttga attagatggt 120 gatgttaatg ggcacaaatt ttctgtcagt ggagagggtg aaggtgatgc aacatacgga 180 aaacttaccc ttaaatttat ttgcactact ggaaaactac ctgttccatg gccaacactt 240 gtcactactt tctcttatgg tgttcaatgc ttttcecgtt atccggatca tatgaaacgg 300 catgactttt tcaagagtgc catgcccgaa ggttatgtac aggaacgcac tatatctttec 360 aaagatgacg ggaactacaa gacgcgtgct gaagtcaagt ttgaaggtga tacccttgtt 420 aatcgtatcg agttaaaagg tattgatttt aaagaagatg gaaacattct cggacacaaa 480 ctcgagtaca actataactc acacaatgta tacatcacgg cagacaaaca aaagaatgga 540 atcaaagcta acttcaaaat tcgccacaac attgaagatg gatccgttca actagcagac 600 cattatcaac aaaatactcc aattggcgat ggccctgtcc ttttaccaga caaccattac 660 ctgtcgacac aatctgccct ttcgaaagat cccaacgaaa agcgtgacca catggtcctt 720 cttgagtttg taactgctgc tgggattaca catggcatgg atgagctcta caaataatga 780 attccaactg agcgccggtc gctaccatta ccaacttgtc tggtgtcaaa aataataggc 840 cccaggcatc aaataa 856 Page 154
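The two versions of Gibson assembly sequence 4 (SEQ ID NOs: 74 and 75) have the same length and appear to differ only near the 5' end, where the "with target" version carries a stretch complementary to the 3' portion of Guide rna 2 (SEQ ID NO: 78). The Python sketch below illustrates that comparison using the first 60 nucleotides of each sequence as printed in the listing. Splitting the guide at the conserved repeat region (SEQ ID NO: 79) is an assumption made for this illustration, and all function and variable names are hypothetical.

```python
# Minimal sketch: locate the site complementary to the 3' portion of
# Guide rna 2 (SEQ ID NO: 78) in the GFP constructs. The split of the guide
# at the conserved repeat (SEQ ID NO: 79) is an assumption for illustration.

RNA_TO_DNA_COMPLEMENT = {"a": "t", "c": "g", "g": "c", "u": "a"}


def dna_reverse_complement_of_rna(rna: str) -> str:
    """Return the DNA strand that is reverse complementary to an RNA string."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


guide_rna_2 = "caagagaaggacuuaaugucacgguacccaauuuucugcccc"  # SEQ ID NO: 78
repeat_conserved_dna = "ttaatgtcacggtac"                     # SEQ ID NO: 79

# First 60 nucleotides of SEQ ID NO: 74 (with target) and 75 (without target).
seq74_head = "tctaggaggaaaaaaatggctctagtaacagccgtggagtccggggcagaaaattggaaa"
seq75_head = "tctaggaggaaaaaaatgggacctcgaaataatgagggaagctgtccaaatgataattac"

# Find the repeat-derived part of the guide and take what follows it.
repeat_rna = repeat_conserved_dna.replace("t", "u")
repeat_start = guide_rna_2.find(repeat_rna)
guide_tail = guide_rna_2[repeat_start + len(repeat_rna):]

# DNA sequence expected in a construct that the guide tail can pair with.
target_site = dna_reverse_complement_of_rna(guide_tail)

print("Guide tail:", guide_tail)
print("Expected target site:", target_site)
print("Present in SEQ ID NO: 74 head:", target_site in seq74_head)
print("Present in SEQ ID NO: 75 head:", target_site in seq75_head)
```

A check of this kind only confirms sequence complementarity in the printed listing; it says nothing about binding or cleavage activity.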
<210> 76 <211> 5180 <212> DNA <213> Artificial Sequence <220> <223> pGFPuv-Target <400> 76 ccaggcatca aataaaacga aaggctcagt cgaaagactg ggcctttegt tttatctgtt 60 gtttgtcggt gaacgctctc tactagagtc acactggctc accttegggt gggcectttet 120 gcgtttatac ctagggtacg ggttttgctg cccgcaaacg ggctgttctg gtgttgctag 180 tttgttatca gaatcgcaga tccggcttca gccggtttgc cggctgaaag cgctatttcet 240 tccagaattg ccatgatttt ttccccacgg gaggcgtcac tggctccegt gttgtcggca 300 gctttgattc gataagcagc atcgcctgtt tcaggctgtc tatgtgtgac tgttgagctg 360 taacaagttg tctcaggtgt tcaatttcat gttctagttg ctttgtttta ctggtttcac 420 ctgttctatt aggtgttaca tgctgttcat ctgttacatt gtcgatctgt tcatggtgaa 480 cagctttgaa tgcaccaaaa actcgtaaaa gctctgatgt atctatcttt tttacaccgt 540 tttcatctgt gcatatggac agttttccct ttgatatgta acggtgaaca gttgttctac 600 ttttgtttgt tagtcttgat gcttcactga tagatacaag agccataaga acctcagatc 660 cttcegtatt tagccagtat gttctctagt gtggttegtt gtttttgcgt gagccatgag 720 aacgaaccat tgagatcata cttactttgc atgtcactca aaaattttgc ctcaaaactg 780 gtgagctgaa tttttgcagt taaagcatcg tgtagtgttt ttcttagtcc gttatgtagg 840 taggaatctg atgtaatggt tgttggtatt ttgtcaccat tcatttttat ctggttgttec 900 tcaagttcgg ttacgagatc catttgtcta tctagttcaa cttggaaaat caacgtatca 960 gtegggcggc ctcgcttatc aaccaccaat ttcatattgc tgtaagtgtt taaatcttta 1020 cttattggtt tcaaaaccca ttggttaagc cttttaaact catggtagtt attttcaagc 1080 attaacatga acttaaattc atcaaggcta atctctatat ttgccttgtg agttttcttt 1140 tgtgttagtt cttttaataa ccactcataa atcctcatag agtatttgtt ttcaaaagac 1200 Page 155 ttaacatgtt ccagattata ttttatgaat ttttttaact ggaaaagata aggcaatatc 1260 tcttcactaa aaactaattc taatttttcg cttgagaact tggcatagtt tgtccactgg 1320 aaaatctcaa agcctttaac caaaggattc ctgatttcca cagttctcgt catcagctct 1380 ctggttgctt tagctaatac accataagca ttttccctac tgatgttcat catctgagcg 1440 tattggttat aagtgaacga taccgtccgt tctttccttg tagggttttc aatcgtgggg 1500 ttgagtagtg ccacacagca taaaattagc ttggtttcat gctccgttaa gtcatagcga 1560 ctaatcgcta gttcatttgc tttgaaaaca actaattcag acatacatct caattggtct 
1620 aggtgatttt aatcactata ccaattgaga tgggctagtc aatgataatt actagtcctt 1680 tteccgggtg atctgggtat ctgtaaattc tgctagacct ttgctggaaa acttgtaaat 1740 tctgctagac cctctgtaaa ttcegctaga cctttgtgtg ttttttttgt ttatattcaa 1800 gtggttataa tttatagaat aaagaaagaa taaaaaaaga taaaaagaat agatcccagc 1860 cctgtgtata actcactact ttagtcagtt ccgcagtatt acaaaaggat gtcgcaaacg 1920 ctgtttgctc ctctacaaaa cagaccttaa aaccctaaag gcttaagtag caccctcgca 1980 agctcgggca aatcgctgaa tattcctttt gtctcegacc atcaggcacc tgagtcgctg 2040 tetttttegt gacattcagt tcgctgcegct cacggctctg gcagtgaatg ggggtaaatg 2100 gcactacagg cgccttttat ggattcatgc aaggaaacta cccataatac aagaaaagcc 2160 cgtcacgggc ttctcagggc gttttatggc gggtctgcta tgtggtgcta tctgactttt 2220 tgctgttcag cagttcctgc cctctgattt tccagtctga ccacttcgga ttatcccegtg 2280 acaggtcatt cagactggct aatgcaccca gtaaggcagc ggtatcatca acaggcttac 2340 ccgtcttact gtccctagtg cttggattct caccaataaa aaacgcccgg cggcaaccga 2400 gcgttctgaa caaatccaga tggagttctg aggtcattac tggatctatc aacaggagtc 2460 caagcaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa 2520 agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc 2580 tcagcgatct gtctatttcg ttcatccata gttgcctgac tcccegtcgt gtagataact 2640 acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg agacccacgc 2700 Page 156 tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga gcgcagaagt 2760 ggtcctgcaa ctttatcegc ctccatccag tctattaatt gttgccggga agctagagta 2820 agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctgcagg catcgtggtg 2880 tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc aaggcgagtt 2940 acatgatccc ccatgttgtg caaaaaagcg gttagctecct teggtcctec gatcgttgte 3000 agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca taattctctt 3060 actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac caagtcattc 3120 tgagaatagt gtatgcggcg accgagttgc tcttgccegg cgtcaacacg ggataatacc 3180 gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa 3240 ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg tgcacccaac 
3300 tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa 3360 aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat actcttcctt 3420 tttcaatgtt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa 3480 tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct 3540 gacgtctaag aaaccatagc catccagttt actttgcagg gcttcccaac cttaccagag 3600 ggcgccccag ctggcaattc cgacgtctta agacccactt tcacatttaa gttgttttte 3660 taatccgcat atgatcaatt caaggccgaa taagaaggct ggctctgcac cttggtgatc 3720 aaataattcg atagcttgtc gtaataatgg cggcatacta tcagtagtag gtgtttccct 3780 ttettettta gcgacttgat gctcttgatc ttccaatacg caacctaaag taaaatgccc 3840 cacagcgctg agtgcatata atgcattctc tagtgaaaaa ccttgttggc ataaaaaggc 3900 taattgattt tcgagagttt catactgttt ttctgtaggc cgtgtaccta aatgtacttt 3960 tgctccatcg cgatgactta gtaaagcaca tctaaaactt ttagcgttat tacgtaaaaa 4020 atcttgccag ctttcccctt ctaaagggca aaagtgagta tggtgcctat ctaacatctc 4080 aatggctaag gcgtcgagca aagcccgctt attttttaca tgccaataca atgtaggctg 4140 ctctacacct agcttctggg cgagtttacg ggttgttaaa ccttegattc cgacctcatt 4200 Page 157 aagcagctct aatgcgctgt taatcacttt acttttatct aatctagaca tcattaattc 4260 ctaatttttg ttgacactct atcgttgata gagttatttt accactccct atcagtgata 4320 gagaaaagaa ttcaaaagat ctaggaggaa aaaaatggct ctagtaacag ccgtggagtc 4380 cggggcagaa aattggaaat taagtaaagg agaagaactt ttcactggag ttgtcccaat 4440 tcttgttgaa ttagatggtg atgttaatgg gcacaaattt tctgtcagtg gagagggtga 4500 aggtgatgca acatacggaa aacttaccct taaatttatt tgcactactg gaaaactacc 4560 tgttccatgg ccaacacttg tcactacttt ctcttatggt gttcaatgct tttcccgtta 4620 tccggatcat atgaaacggc atgacttttt caagagtgcc atgcccgaag gttatgtaca 4680 ggaacgcact atatctttca aagatgacgg gaactacaag acgcgtgctg aagtcaagtt 4740 tgaaggtgat acccttgtta atcgtatcga gttaaaaggt attgatttta aagaagatgg 4800 aaacattctc ggacacaaac tcgagtacaa ctataactca cacaatgtat acatcacggc 4860 agacaaacaa aagaatggaa tcaaagctaa cttcaaaatt cgccacaaca ttgaagatgg 4920 atccgttcaa ctagcagacc attatcaaca aaatactcca attggcgatg gccctgtcct 
4980 tttaccagac aaccattacc tgtcgacaca atctgccctt tcgaaagatc ccaacgaaaa 5040 gcgtgaccac atggtccttec ttgagtttgt aactgctgct gggattacac atggcatgga 5100 tgagctctac aaataatgaa ttccaactga gcgccggteg ctaccattac caacttgtct 5160 ggtgtcaaaa ataataggcc 5180 <210> 77 <211> 5180 <212> DNA <213> Artificial Sequence <220> <223> pGFPuv-Non-Target <400> 77 ccaggcatca aataaaacga aaggctcagt cgaaagactg ggcctttegt tttatctgtt 60 gtttgtcggt gaacgctctc tactagagtc acactggctc accttegggt gggcectttet 120 gcgtttatac ctagggtacg ggttttgctg cccgcaaacg ggctgttctg gtgttgctag 180 Page 158 tttgttatca gaatcgcaga tccggcttca gccggtttgc cggctgaaag cgctatttcet 240 tccagaattg ccatgatttt ttccccacgg gaggcgtcac tggctccegt gttgtcggca 300 gctttgattc gataagcagc atcgcctgtt tcaggctgtc tatgtgtgac tgttgagctg 360 taacaagttg tctcaggtgt tcaatttcat gttctagttg ctttgtttta ctggtttcac 420 ctgttctatt aggtgttaca tgctgttcat ctgttacatt gtcgatctgt tcatggtgaa 480 cagctttgaa tgcaccaaaa actcgtaaaa gctctgatgt atctatcttt tttacaccgt 540 tttcatctgt gcatatggac agttttccct ttgatatgta acggtgaaca gttgttctac 600 ttttgtttgt tagtcttgat gcttcactga tagatacaag agccataaga acctcagatc 660 cttcegtatt tagccagtat gttctctagt gtggttegtt gtttttgcgt gagccatgag 720 aacgaaccat tgagatcata cttactttgc atgtcactca aaaattttgc ctcaaaactg 780 gtgagctgaa tttttgcagt taaagcatcg tgtagtgttt ttcttagtcc gttatgtagg 840 taggaatctg atgtaatggt tgttggtatt ttgtcaccat tcatttttat ctggttgttec 900 tcaagttcgg ttacgagatc catttgtcta tctagttcaa cttggaaaat caacgtatca 960 gtegggcggc ctcgcttatc aaccaccaat ttcatattgc tgtaagtgtt taaatcttta 1020 cttattggtt tcaaaaccca ttggttaagc cttttaaact catggtagtt attttcaagc 1080 attaacatga acttaaattc atcaaggcta atctctatat ttgccttgtg agttttcttt 1140 tgtgttagtt cttttaataa ccactcataa atcctcatag agtatttgtt ttcaaaagac 1200 ttaacatgtt ccagattata ttttatgaat ttttttaact ggaaaagata aggcaatatc 1260 tcttcactaa aaactaattc taatttttcg cttgagaact tggcatagtt tgtccactgg 1320 aaaatctcaa agcctttaac caaaggattc ctgatttcca cagttctcgt catcagctct 1380 ctggttgctt tagctaatac 
accataagca ttttccctac tgatgttcat catctgagcg 1440 tattggttat aagtgaacga taccgtccgt tctttccttg tagggttttc aatcgtgggg 1500 ttgagtagtg ccacacagca taaaattagc ttggtttcat gctccgttaa gtcatagcga 1560 ctaatcgcta gttcatttgc tttgaaaaca actaattcag acatacatct caattggtct 1620 aggtgatttt aatcactata ccaattgaga tgggctagtc aatgataatt actagtcctt 1680 Page 159 tteccgggtg atctgggtat ctgtaaattc tgctagacct ttgctggaaa acttgtaaat 1740 tctgctagac cctctgtaaa ttcegctaga cctttgtgtg ttttttttgt ttatattcaa 1800 gtggttataa tttatagaat aaagaaagaa taaaaaaaga taaaaagaat agatcccagc 1860 cctgtgtata actcactact ttagtcagtt ccgcagtatt acaaaaggat gtcgcaaacg 1920 ctgtttgctc ctctacaaaa cagaccttaa aaccctaaag gcttaagtag caccctcgca 1980 agctcgggca aatcgctgaa tattcctttt gtctcegacc atcaggcacc tgagtcgctg 2040 tetttttegt gacattcagt tcgctgcegct cacggctctg gcagtgaatg ggggtaaatg 2100 gcactacagg cgccttttat ggattcatgc aaggaaacta cccataatac aagaaaagcc 2160 cgtcacgggc ttctcagggc gttttatggc gggtctgcta tgtggtgcta tctgactttt 2220 tgctgttcag cagttcctgc cctctgattt tccagtctga ccacttcgga ttatcccegtg 2280 acaggtcatt cagactggct aatgcaccca gtaaggcagc ggtatcatca acaggcttac 2340 ccgtcttact gtccctagtg cttggattct caccaataaa aaacgcccgg cggcaaccga 2400 gcgttctgaa caaatccaga tggagttctg aggtcattac tggatctatc aacaggagtc 2460 caagcaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa 2520 agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc 2580 tcagcgatct gtctatttcg ttcatccata gttgcctgac tcccegtcgt gtagataact 2640 acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg agacccacgc 2700 tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga gcgcagaagt 2760 ggtcctgcaa ctttatcegc ctccatccag tctattaatt gttgccggga agctagagta 2820 agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctgcagg catcgtggtg 2880 tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc aaggcgagtt 2940 acatgatccc ccatgttgtg caaaaaagcg gttagctecct teggtcctec gatcgttgte 3000 agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca taattctctt 3060 actgtcatgc catccgtaag 
atgcttttct gtgactggtg agtactcaac caagtcattc 3120 tgagaatagt gtatgcggcg accgagttgc tcttgccegg cgtcaacacg ggataatacc 3180 Page 160 gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa 3240 ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg tgcacccaac 3300 tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa 3360 aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat actcttcctt 3420 tttcaatgtt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa 3480 tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct 3540 gacgtctaag aaaccatagc catccagttt actttgcagg gcttcccaac cttaccagag 3600 ggcgccccag ctggcaattc cgacgtctta agacccactt tcacatttaa gttgttttte 3660 taatccgcat atgatcaatt caaggccgaa taagaaggct ggctctgcac cttggtgatc 3720 aaataattcg atagcttgtc gtaataatgg cggcatacta tcagtagtag gtgtttccct 3780 ttettettta gcgacttgat gctcttgatc ttccaatacg caacctaaag taaaatgccc 3840 cacagcgctg agtgcatata atgcattctc tagtgaaaaa ccttgttggc ataaaaaggc 3900 taattgattt tcgagagttt catactgttt ttctgtaggc cgtgtaccta aatgtacttt 3960 tgctccatcg cgatgactta gtaaagcaca tctaaaactt ttagcgttat tacgtaaaaa 4020 atcttgccag ctttcccctt ctaaagggca aaagtgagta tggtgcctat ctaacatctc 4080 aatggctaag gcgtcgagca aagcccgctt attttttaca tgccaataca atgtaggctg 4140 ctctacacct agcttctggg cgagtttacg ggttgttaaa ccttegattc cgacctcatt 4200 aagcagctct aatgcgctgt taatcacttt acttttatct aatctagaca tcattaattc 4260 ctaatttttg ttgacactct atcgttgata gagttatttt accactccct atcagtgata 4320 gagaaaagaa ttcaaaagat ctaggaggaa aaaaatggga cctcgaaata atgagggaag 4380 ctgtccaaat gataattaca ttagtaaagg agaagaactt ttcactggag ttgtcccaat 4440 tcttgttgaa ttagatggtg atgttaatgg gcacaaattt tctgtcagtg gagagggtga 4500 aggtgatgca acatacggaa aacttaccct taaatttatt tgcactactg gaaaactacc 4560 tgttccatgg ccaacacttg tcactacttt ctcttatggt gttcaatgct tttcccgtta 4620 tccggatcat atgaaacggc atgacttttt caagagtgcc atgcccgaag gttatgtaca 4680 Page 161 ggaacgcact atatctttca aagatgacgg gaactacaag acgcgtgctg aagtcaagtt 4740 tgaaggtgat 
acccttgtta atcgtatcga gttaaaaggt attgatttta aagaagatgg 4800 aaacattctc ggacacaaac tcgagtacaa ctataactca cacaatgtat acatcacggc 4860 agacaaacaa aagaatggaa tcaaagctaa cttcaaaatt cgccacaaca ttgaagatgg 4920 atccgttcaa ctagcagacc attatcaaca aaatactcca attggcgatg gccctgtcct 4980 tttaccagac aaccattacc tgtcgacaca atctgccctt tcgaaagatc ccaacgaaaa 5040 gcgtgaccac atggtccttc ttgagtttgt aactgctgct gggattacac atggcatgga 5100 tgagctctac aaataatgaa ttccaactga gcgccggtcg ctaccattac caacttgtct 5160 ggtgtcaaaa ataataggcc 5180
<210> 78
<211> 42
<212> RNA
<213> Artificial Sequence
<220>
<223> Guide RNA 2 for experiments
<400> 78 caagagaagg acuuaauguc acgguaccca auuuucugcc cc 42
<210> 79
<211> 15
<212> DNA
<213> Candidatus Scalindua brodae
<220>
<223> Candidatus Scalindua brodae CRISPR repeat - conserved region
<400> 79 ttaatgtcac ggtac 15
<210> 80
<211> 15
<212> DNA
<213> Candidatus Jettenia
<220>
<223> Candidatus Jettenia caeni CRISPR repeat - conserved region
<400> 80 ttaatgtcac ggtac 15
<210> 81
<211> 15
<212> DNA
<213> Deferribacteres <phylum>
<220>
<223> Deferribacteres CRISPR repeat - conserved region
<400> 81 atgatgtttt ggtac 15
<210> 82
<211> 15
<212> DNA
<213> Desulfonema ishimotonii
<220>
<223> Desulfonema ishimotonii CRISPR repeat - conserved region
<400> 82 ttgatgtcac ggaac 15
<210> 83
<211> 15
<212> RNA
<213> Candidatus Scalindua brodae
<220>
<223> Candidatus Scalindua brodae CRISPR repeat - conserved region - RNA
<400> 83 uuaaugucac gguac 15
<210> 84
<211> 15
<212> RNA
<213> Candidatus Jettenia
<220>
<223> Candidatus Jettenia caeni CRISPR repeat - conserved region - RNA
<400> 84 uuaaugucac gguac 15
<210> 85
<211> 15
<212> RNA
<213> Deferribacteres <phylum>
<220>
<223> Deferribacteres CRISPR repeat - conserved region - RNA
<400> 85 augauguuuu gguac 15
<210> 86
<211> 15
<212> RNA
<213> Desulfonema ishimotonii
<220>
<223> Desulfonema ishimotonii CRISPR repeat - conserved region - RNA
<400> 86 uugaugucac ggaac 15
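The RNA entries SEQ ID NO: 83-86 appear to be the direct RNA counterparts (T replaced by U) of the DNA repeat regions SEQ ID NO: 79-82, and guide RNA 2 (SEQ ID NO: 78) appears to contain the Candidatus Scalindua brodae repeat region. The following is a minimal Python sketch of how that correspondence could be checked. The sequences are copied from the listing above; the names `dna_to_rna`, `check_pairs`, and the assumed pairing of each DNA entry with the entry four numbers later are illustrative choices, not part of the patent.

```python
# Sequences copied from the listing (SEQ ID NO: 78-86), spaces removed.
DNA_REPEATS = {
    79: "ttaatgtcacggtac",  # Candidatus Scalindua brodae
    80: "ttaatgtcacggtac",  # Candidatus Jettenia caeni
    81: "atgatgttttggtac",  # Deferribacteres
    82: "ttgatgtcacggaac",  # Desulfonema ishimotonii
}
RNA_REPEATS = {
    83: "uuaaugucacgguac",
    84: "uuaaugucacgguac",
    85: "augauguuuugguac",
    86: "uugaugucacggaac",
}
GUIDE_RNA_2 = "caagagaaggacuuaaugucacgguacccaauuuucugcccc"  # SEQ ID NO: 78


def dna_to_rna(dna: str) -> str:
    """Return the RNA spelling of a DNA sequence (t replaced by u)."""
    return dna.lower().replace("t", "u")


def check_pairs(offset: int = 4) -> dict:
    """Compare each DNA entry with the RNA entry `offset` numbers later.

    The offset of 4 (79 -> 83, 80 -> 84, ...) is an assumption read off
    the order of the listing, not something the listing states.
    """
    return {
        seq_id: dna_to_rna(seq) == RNA_REPEATS[seq_id + offset]
        for seq_id, seq in DNA_REPEATS.items()
    }


if __name__ == "__main__":
    print(check_pairs())
    # Position (0-based) of the S. brodae repeat region inside guide RNA 2,
    # or -1 if it is absent.
    print(GUIDE_RNA_2.find(RNA_REPEATS[83]))
```

Running the script prints one boolean per DNA entry and the offset at which the repeat region occurs in the guide RNA.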
<210> 87
<211> 57
<212> DNA
<213> Candidatus Scalindua brodae
<220>
<223> Consensus sequence Fig. 10
<400> 87 aaacaagaga aggacttaat gtcacggtac ccaattttct gccccggact ccacggc 57
<210> 88
<211> 321
<212> PRT
<213> Candidatus Scalindua brodae
<220>
<223> Candidatus Scalindua brodae - TPR domain
<400> 88
Asn Asn Thr Glu Glu Asn Ile Asp Arg Ile Gln Glu Pro Thr Arg Glu
1 5 10 15
Asp Ile Asp Arg Lys Glu Ala Glu Arg Leu Leu Asp Glu Ala Phe Asn
20 25 30
Pro Arg Thr Lys Pro Val Asp Arg Lys Lys Ile Ile Asn Ser Ala Leu
35 40 45
Lys Ile Leu Ile Gly Leu Tyr Lys Glu Lys Lys Asp Asp Leu Thr Ser
50 55 60
Ala Ser Phe Ile Ser Ile Ala Arg Ala Tyr Tyr Leu Val Ser Ile Thr
65 70 75 80
Ile Leu Pro Lys Gly Thr Thr Ile Pro Glu Lys Lys Lys Glu Ala Leu
85 90 95
Arg Lys Gly Ile Glu Phe Ile Asp Arg Ala Ile Asn Lys Phe Asn Gly
100 105 110
Ser Ile Leu Asp Ser Gln Arg Ala Phe Arg Ile Lys Ser Val Leu Ser
115 120 125
Ile Glu Phe Asn Arg Ile Asp Arg Glu Lys Cys Asp Asn Ile Lys Leu
130 135 140
Lys Asn Leu Leu Asn Glu Ala Val Asp Lys Gly Cys Thr Asp Phe Asp
145 150 155 160
Thr Tyr Glu Trp Asp Ile Gln Ile Ala Ile Arg Leu Cys Glu Leu Gly
165 170 175
Val Asp Met Glu Gly His Phe Asp Asn Leu Ile Lys Ser Asn Lys Ala
180 185 190
Asn Asp Leu Gln Lys Ala Lys Ala Tyr Tyr Phe Ile Lys Lys Asp Asp
195 200 205
His Lys Ala Lys Glu His Met Asp Lys Cys Thr Ala Ser Leu Lys Tyr
210 215 220
Thr Pro Cys Ser His Arg Leu Trp Asp Glu Thr Val Gly Phe Ile Glu
225 230 235 240
Arg Leu Lys Gly Asp Ser Ser Thr Leu Trp Arg Asp Phe Ala Ile Lys
245 250 255
Thr Tyr Arg Ser Cys Arg Val Gln Glu Lys Glu Thr Gly Thr Leu Arg
260 265 270
Leu Arg Trp Tyr Trp Ser Arg His Arg Val Leu Tyr Asp Met Ala Phe
275 280 285
Leu Ala Val Lys Glu Gln Ala Asp Asp Glu Glu Pro Asp Val Asn Val
290 295 300
Lys Gln Ala Lys Ile Lys Lys Leu Ala Glu Ile Ser Asp Ser Leu Lys
305 310 315 320
Ser
Claims (19)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NL2028346A NL2028346B1 (en) | 2021-05-31 | 2021-05-31 | gRAMP protein for modulating a target mRNA |
PCT/NL2022/050296 WO2022255865A1 (en) | 2021-05-31 | 2022-05-31 | Gramp protein and tpr-chat protein for modulating a target mrna or target protein |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NL2028346A NL2028346B1 (en) | 2021-05-31 | 2021-05-31 | gRAMP protein for modulating a target mRNA |
Publications (1)
Publication Number | Publication Date |
---|---|
NL2028346B1 (en) | 2022-12-12 |
Family
ID=77627473
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
NL2028346A NL2028346B1 (en) | 2021-05-31 | 2021-05-31 | gRAMP protein for modulating a target mRNA |
Country Status (1)
Country | Link |
---|---|
NL (1) | NL2028346B1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016035044A1 (en) * | 2014-09-05 | 2016-03-10 | Vilnius University | Programmable rna shredding by the type iii-a crispr-cas system of streptococcus thermophilus |
WO2019222555A1 (en) * | 2018-05-16 | 2019-11-21 | Arbor Biotechnologies, Inc. | Novel crispr-associated systems and components |
- 2021-05-31 NL NL2028346A patent/NL2028346B1/en active
Non-Patent Citations (9)
Title |
---|
CARYN R. HALE ET AL: "RNA-Guided RNA Cleavage by a CRISPR RNA-Cas Protein Complex", CELL, vol. 139, no. 5, 25 November 2009 (2009-11-25), Amsterdam NL, pages 945 - 956, XP055038712, ISSN: 0092-8674, DOI: 10.1016/j.cell.2009.07.040 * |
CARYN R. HALE ET AL: "Target RNA capture and cleavage by the Cmr type III-B CRISPR–Cas effector complex", GENES & DEVELOPMENT, vol. 28, no. 21, 1 November 2014 (2014-11-01), US, pages 2432 - 2443, XP055247605, ISSN: 0890-9369, DOI: 10.1101/gad.250712.114 * |
CLAMP ET AL.: "The Jalview Java alignment editor", BIOINFORMATICS, 2004 |
H. LI: "Minimap2: pairwise alignment for nucleotide sequences", BIOINFORMATICS, 2018 |
MAKAROVA ET AL., EVOLUTIONARY CLASSIFICATION OF CRISPR-CAS SYSTEMS: A BURST OF CLASS 2 AND DERIVED VARIANTS |
ÖZCAN AHSEN ET AL: "Programmable RNA targeting with the single-protein CRISPR effector Cas7-11", NATURE, NATURE PUBLISHING GROUP UK, LONDON, vol. 597, no. 7878, 6 September 2021 (2021-09-06), pages 720 - 725, XP037576049, ISSN: 0028-0836, [retrieved on 20210906], DOI: 10.1038/S41586-021-03886-5 * |
ÖZCAN AHSEN ET AL: "Suppl. Information - Programmable RNA targeting with the single-protein CRISPR effector Cas7-11", NATURE, 6 September 2021 (2021-09-06), pages 1 - 32, XP055886761, Retrieved from the Internet <URL:https://static-content.springer.com/esm/art%3A10.1038%2Fs41586-021-03886-5/MediaObjects/41586_2021_3886_MOESM1_ESM.pdf> [retrieved on 20220202], DOI: 10.1038/s41586-021-03886-5 * |
VAN BELJOUW SAM P. B. ET AL: "Suppl. Material - The gRAMP CRISPR-Cas effector is an RNA endonuclease complexed with a caspase-like peptidase", SCIENCE, 26 August 2021 (2021-08-26), pages 1 - 66, XP055886492, Retrieved from the Internet <URL:https://www.science.org/doi/suppl/10.1126/science.abk2718/suppl_file/science.abk2718_SM.pdf> [retrieved on 20220202], DOI: 10.1126/science.abk2718 * |
VAN BELJOUW SAM P. B. ET AL: "The gRAMP CRISPR-Cas effector is an RNA endonuclease complexed with a caspase-like peptidase", SCIENCE, vol. 373, no. 6561, 26 August 2021 (2021-08-26), US, pages 1349 - 1353, XP055886391, ISSN: 0036-8075, DOI: 10.1126/science.abk2718 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107250363B (en) | Compositions and methods for efficient gene editing in E.coli | |
DK3087178T3 (en) | PROCEDURE FOR PREPARING OXALATE OXIDASES WITH ACTIVITY OPTIMUM NEAR PHYSIOLOGICAL PH AND APPLICATION OF SUCH RECOMBINANT OXALATE OXIDASES IN THE TREATMENT OF OXALATE-RELATED DISEASES | |
AU636537B2 (en) | Improvement of the yield when disulfide-bonded proteins are secreted | |
KR20220020826A (en) | Production of Fucosylated Oligosaccharides in Bacillus | |
CN116083398A (en) | Isolated Cas13 proteins and uses thereof | |
CN106011133B (en) | A kind of small DNA molecular amount reference substance, reference substance plasmid and preparation method thereof | |
CN114222763A (en) | Supermodular IgG3 spacer domain and multifunctional sites achieved in chimeric antigen receptor design | |
CN102719471B (en) | Integrative plasmid pOPHI and resistance screening marker-free self-luminescent mycobacterium | |
NL2028346B1 (en) | gRAMP protein for modulating a target mRNA | |
CN112553176A (en) | Glutamine transaminase with improved thermal stability | |
CN115247173A (en) | Gene editing system for constructing TMPRSS6 gene mutant iron deficiency anemia pig nuclear transplantation donor cells and application thereof | |
CN115247153A (en) | Gene editing system for constructing diabetes model pig nuclear transplantation donor cells with HNF1A gene mutation and application thereof | |
CN115232817A (en) | Gene editing system for constructing three-gene combined mutant miniature pig nuclear transplantation donor cells and application thereof | |
CN115247175A (en) | Gene editing system for constructing epigenetic dysregulation model pig nuclear transplantation donor cell of SETDB1 gene mutation and application thereof | |
CN108949690B (en) | A method of prepare can real-time detection mescenchymal stem cell bone differentiation cell model | |
CN112437684A (en) | Recombinant adenovirus vector expressing Zika antigen with improved productivity | |
CN113755512B (en) | A method and application for preparing tandem repeat proteins | |
CN115247191B (en) | Gene editing system and its application in constructing pig nuclear transplant donor cells with double gene mutations in nevoid basal cell carcinoma syndrome | |
CN115161335B (en) | Gene editing system for constructing ALS model pig nuclear transfer donor cells with TARDBP gene mutation and application of gene editing system | |
KR20230150998A (en) | How to make Cas3 protein | |
RU2810729C2 (en) | Production of fucosylated oligosaccharides in bacillus | |
CN113234746B (en) | Method for pesticide induced protein interaction and induced gene expression | |
CN114317473B (en) | A transglutaminase variant with improved catalytic activity and thermostability | |
CN115232813A (en) | Gene editing system for constructing von willebrand model pig nuclear transplantation donor cells with vWF gene mutation and application of gene editing system | |
CN115247163A (en) | Gene editing system for constructing stomach cancer model pig nuclear transplantation donor cell with GP130 gene mutation and application thereof |