WO2018164948A1 - Vectors with self-directed cpf1-dependent switches - Google Patents
Vectors with self-directed cpf1-dependent switches
- Publication number
- WO2018164948A1 PCT/US2018/020619
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- vector
- sequence
- cpf1
- transgene
- protospacer
- Prior art date
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/111—General methods applicable to biologically active non-coding nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
Definitions
- CRISPR based genome editing is guided by a short CRISPR RNA (crRNA) which targets the Cas9 DNase activity to genomic sequences complementary to the crRNA and preceded by a short protospacer-adjacent motif (PAM).
- crRNA CRISPR RNA
- PAM protospacer-adjacent motif
- Cpf1 crRNA a T-rich PAM distal from a staggered DNase cut site.
- the mature Cpf1 crRNA is composed of a 5' scaffold region (also described as a 5' handle or a direct repeat), and a 3' guide region.
- Cas9 relies on RNase III to excise crRNAs from a CRISPR array.
- FnCpf1 Francisella novicida U112
- FnCpf1 has its own RNase activity that can excise crRNA from a bacterial CRISPR array.
- FnCpf1 does not efficiently edit mammalian genomes, and it is not known whether the RNase activity of any Cpf1 is functional in mammalian cells.
- the invention provides vectors or expression constructs that contain a transgene and a polynucleotide switch.
- the polynucleotide switch in the vectors harbors (a) a first sequence segment that contains a Cpf1 DNase target sequence having (1) a protospacer that is approximately 23 ± 2 nucleotides in length and (2) a protospacer-adjacent motif (PAM) that is located 5' to the protospacer, and (b) a second sequence segment that encodes an RNA transcript that contains (1) a scaffold RNA containing a sequence 4-10 nucleotides in length and its reverse complement and (2) a guide RNA (gRNA) sequence that is substantially identical to the protospacer and is approximately 23 ± 2 nucleotides in length.
- the polynucleotide switch is capable of undergoing a self-directed DNA modification that switches on or switches off the transgene in the vectors.
- the scaffold RNA in the RNA transcript is capable of being cleaved by a Type V CRISPR effector protein (e.g., a Cpf1 enzyme such as LbCpf1 or AsCpf1) in a mammalian cell, resulting in the generation of a mature crRNA of the CRISPR effector protein.
- the generated crRNA is capable of targeting the CRISPR effector protein to the Cpf1 DNase target sequence to cleave the vector within the protospacer.
- the encoded RNA transcript contains two or more scaffold RNAs.
- Some vectors of the invention can additionally contain an RNA polymerase II (Pol II) promoter.
- the second sequence segment is operably linked to the promoter for the transcript to be expressed from the RNA polymerase II (Pol II) promoter.
- the transgene contains at least one open reading frame (ORF).
- the encoded scaffold RNA contains one or more structural elements selected from (a) a sequence 4-10 nucleotides in length comprising UCUAC and a reverse complement comprising GUAGA, (b) a U nucleotide at the first unpaired position 5' of the sequence 4-10 nucleotides in length, (c) a U nucleotide at the first unpaired position 3' of the sequence 4-10 nucleotides in length, (d) a U nucleotide at the first unpaired position 5' of the reverse complement, (e) a U nucleotide at the first unpaired position 3' of the reverse complement, and (f) a trinucleotide AAU at a position fewer than 5 nucleotides 5' of said sequence 4-10 nucleotides in length.
- the encoded scaffold RNA can further contain a CU dinucleotide, an AU dinucleotide or an AAG trinucleotide between the sequence 4-10 nucleotides in length and its reverse complement.
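The stem requirement described above (a sequence 4-10 nucleotides in length followed by its reverse complement, for example UCUAC and GUAGA) can be checked mechanically. The following is a minimal sketch, not a method from the patent: the function names and the example scaffold string are hypothetical and were assembled only from the elements listed above; the example is not one of the SEQ ID NOs.

```python
# Illustrative sketch only: names and the example sequence are assumptions.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def rna_reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A-U, C-G pairing)."""
    return "".join(RNA_PAIRS[base] for base in reversed(rna.upper()))


def find_stems(rna: str, min_len: int = 4, max_len: int = 10):
    """List (segment, segment_start, reverse_complement_start) for every
    segment of min_len..max_len nucleotides whose reverse complement occurs
    downstream of it in the same RNA."""
    rna = rna.upper()
    hits = []
    for k in range(min_len, max_len + 1):
        for i in range(len(rna) - k + 1):
            segment = rna[i:i + k]
            j = rna.find(rna_reverse_complement(segment), i + k)
            if j != -1:
                hits.append((segment, i, j))
    return hits


# Hypothetical scaffold built from the claimed UCUAC / GUAGA elements.
example_scaffold = "AAUUUCUACUCUUGUAGAU"
stems = find_stems(example_scaffold)
has_claimed_stem = any(segment == "UCUAC" for segment, _, _ in stems)
print(stems, has_claimed_stem)
```

A check of this kind only confirms that a paired segment exists; it says nothing about whether a given Cpf1 enzyme would actually cleave the scaffold.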
- the PAM is a tetranucleotide containing at least two thymidine (T) nucleotides.
- the sequence of the guide RNA is identical to that of the protospacer. For some vectors, cleavage of the protospacer in a cell by a Cpf1 present therein will switch off expression of the transgene.
- cleavage of the protospacer in a cell by a Cpf1 present therein will switch on expression of the transgene.
- concentration of the transcript expressed from the vector will be reduced when the corresponding Cpf1 enzyme is present.
- Some vectors of the invention contain a promoter (e.g., a Pol II promoter) that harbors the protospacer.
- the protospacer comprises or partially overlaps a TFIIB recognition element (BRE), a TATA box, an Initiator (Inr), a downstream promoter element (DPE), a splice acceptor AG dinucleotide, a splice donor GU dinucleotide, an ATG start codon trinucleotide, or an internal ribosomal entry site (IRES).
- Some vectors of the invention contain two or more Cpf1 DNase target sequences.
- each Cpf1 DNase target sequence can contain (1) a protospacer that is approximately 23 ± 2 nucleotides in length and (2) a protospacer-adjacent motif (PAM) that is located 5' to the protospacer.
- the protospacers of the two or more Cpf1 DNase target sequences are not identical to each other.
- the ORF in the transgene encodes an amino acid sequence that is substantially identical to the amino acid sequence of at least a portion of a human protein. In some embodiments, the ORF encodes an amino acid sequence that is substantially identical to the Fc region of an antibody. In some embodiments, the ORF encodes an amino acid sequence other than an antibody Fc region that is substantially identical to one or more immunoadhesins or antibodies known in the art. In some embodiments, the ORF encodes a sequence encoding at least a portion of one or more known cellular proteins such as cellular receptors, other cell surface molecules, enzymes, cytokines, chemokines, costimulatory molecules, interleukins, and physiologically active polypeptide factors.
- the ORF encodes at least a portion of a chimeric antigen receptor (CAR).
- Some preferred vectors of the invention are based on or derived from an adeno-associated virus (AAV) vector or a retroviral vector.
- Some vectors of the invention can additionally contain an inducible expression cassette that encodes a Type V CRISPR effector protein (e.g., a Cpf1 enzyme).
- the invention provides mammalian cells into which one or more vectors of the present invention have been introduced.
- the invention provides pharmaceutical compositions that contain at least one vector of the present invention.
- the invention provides methods of switching on or switching off expression of a transgene. These methods typically entail (a) administering a vector of the invention to a subject, and (b) administering either a Cpf1 enzyme or a polynucleotide capable of expressing a Cpf1 enzyme to the subject.
- FIG. 1 shows that LbCpf1 and AsCpf1 have RNase activities in mammalian cells
- the crRNA recognized by LbCpf1 (SEQ ID NO:20) and AsCpf1 (SEQ ID NO:21) are represented.
- a 19-20 nucleotide scaffold RNA region in the crRNAs (SEQ ID NOs: 18 and 19, respectively) is followed by a 23-base guide RNA (gRNA) complementary to the Cpf1 DNase target sequence
- gRNA 23-base guide RNA
- When Cpf1 recognizes an appropriate scaffold RNA region present in the 3' UTR of an mRNA encoding GLuc, the message is cleaved and GLuc expression is halted.
- Plasmids encoding LbCpf1, AsCpf1 or vector alone were cotransfected with GLuc-expressing plasmids bearing the indicated scaffold variants, and GLuc activity was measured.
- a small ' ⁇ ' preceding the scaffold RNA indicates replacement of the initial AAUU sequence with UUAA.
- a large ' x ' indicates that the scaffold sequence has been randomized.
- the first three have LbCpf1 (Lb) scaffold, and the other two have AsCpf1 (As) scaffold,
- (e-f) LbCpf1 variants were assessed for their ability to edit an integrated gene when co-expressed with U6 promoter-driven crRNA, using (e) a T7E1 mismatch cleavage assay or (f) double-strand break (DSB)-induced gain-of-expression assay, depicted in Figure 3a.
- H759 and K785 are proximal to the phosphate at 5' end of the Lb scaffold.
- the first and second bases of the scaffold RNA, A(-20), A(-19), are also indicated.
- Experiments shown are representative of two (panel c), three (panel e), or four (panels d and f) performed with nearly identical results.
- Data points in panels c, d, and f represent mean ± s.d. of three biological replicates.
- FIG. 2 shows that crRNA excised from a Pol II-expressed RNA transcript can efficiently edit a mammalian genome
- (a) An mRNA encoding GLuc with two Lb scaffold regions (sR) separated by a 23-base guide (gR1) can be cleaved by LbCpf1. If both scaffold RNA regions are cleaved, the GLuc message is degraded and the resulting crRNA can be loaded into LbCpf1 to edit a reporter transgene.
- U6 promoter-driven guide RNAs used in the subsequent panel are represented
- the indicated U6-expressed crRNAs were co-expressed with LbCpf1 in the presence of a DSB-induced FLuc (left) or GLuc (right) reporter gene, as depicted in Figure 5d. When expressed in tandem, both crRNAs are active. (g) A direct comparison of the efficiencies of Pol II- and Pol III-expressed crRNA using a T7E1 mismatch cleavage assay.
- Figure 3 shows assays and additional results for Figure 1.
- a gene encoding EGFP and GLuc separated by a foot-and-mouth disease 2a protease (F2A) was integrated into the genome of 293T cells.
- EGFP is initially encoded in the +1 frame, whereas F2A and GLuc are encoded in the +3 frame and the GLuc start methionine has been eliminated, so that only EGFP is expressed.
- a frameshift is induced by nonhomologous end joining (NHEJ), inactivating EGFP expression.
- NHEJ nonhomologous end joining
- Figure 4 shows additional results for Figure 2 (panels a-c).
- Figure 5 shows assays and additional results for Figure 2 (panels d-f).
- Tandem crRNAs were expressed from a Pol-III (U6) promoter.
- These crRNAs were assayed individually and in both tandem orders in Figure 2f. Experiments shown are representative of two with similar results. Data points in panels b and c represent mean ± s.d. of three biological replicates.
- FIG. 6 shows that the self-directed polynucleotide switch is capable of inactivating the vector containing it at the RNA level.
- a first plasmid vector containing a Firefly luciferase transgene was co-transfected into 293T cells with an empty plasmid vector or a plasmid vector expressing either wild-type LbCpf1, RNase domain mutant LbCpf1 H759A, or DNase domain mutant LbCpf1 D832A.
- the Firefly luciferase transgene was expressed as a single Pol II transcript, which either did not include a scaffold RNA and guide RNA (gRNA) (a), or did include a scaffold RNA and gRNA (b). Firefly luciferase (FLuc) expression is indicated in relative light units (RLU).
- RLU relative light units
- Figure 7 shows various strategies for a self-directed polynucleotide switch to inactivate the vector containing it at the DNA level.
- the vectors shown here express Firefly luciferase from a Pol II promoter and express a second transcript containing a scaffold RNA and guide RNA (gRNA) from a U6 Pol III promoter.
- the plasmid vector expressing both Firefly luciferase and the transcript containing the scaffold RNA and gRNA was co-transfected into 293T cells with an empty plasmid vector or a plasmid vector expressing either wild-type LbCpf1, RNase domain mutant LbCpf1 H759A, or DNase domain mutant LbCpf1 D832A.
- the protospacer overlaps with the splice acceptor AG dinucleotide (SA) (a).
- SA splice acceptor AG dinucleotide
- the protospacer overlaps with the ATG start codon trinucleotide (b).
- the Firefly luciferase expression vector contained two protospacers, the first protospacer overlapping with the SA site, and the second protospacer overlapping with the ATG start codon trinucleotide (c).
- the protospacer resided in the coding region of the Firefly luciferase transgene (d). Multiple protospacer sequences within the Firefly luciferase transgene were evaluated (Fluc-gR1 through Fluc-gR19). Firefly luciferase (FLuc) expression is indicated in relative light units (RLU).
- Figure 8 shows the inactivation of a transgene in vivo.
- Mice were inoculated in the left gastrocnemius muscle with 10^10 copies of an AAV vector encoding Firefly luciferase.
- the vector contained a U6 promoter that expresses a transcript containing a scaffold RNA and a guide RNA (gRNA).
- An otherwise-identical negative control vector was constructed that encoded Firefly luciferase but not a scaffold RNA and gRNA. The gRNA recognizes a protospacer sequence in the coding region of the Firefly luciferase transgene.
- luciferase activity was measured using a Xenogen imager on days 8, 14, 21, and 28 post-injection of the animals that received the AAV vector lacking (a) or containing (b) the scaffold RNA and gRNA.
- the luciferase signal from each mouse was quantified using the Xenoimager software for both groups of mice, which received the AAV vector lacking (c) or containing (d) the scaffold RNA and gRNA.
- Cpf1 is a Type V CRISPR-effector protein with greater specificity than Cas9 in genome-editing applications.
- the present invention is predicated in part on the discoveries by the present inventors that some Cpf1 proteins have RNase activities that can excise CRISPR RNAs (crRNAs) from a single Pol II-driven RNA transcript expressed in mammalian cells. Specifically, the inventors observed and assessed the utility of RNase activity of LbCpf1 and AsCpf1 for genome-editing applications. As detailed herein, it was found that AsCpf1 and LbCpf1 can excise multiple crRNAs from a single RNA transcript expressed from either a Pol-II or a Pol-III promoter.
- A Pol-II promoter allows regulated and tissue-specific control of crRNA expression, and more efficient expression of long transcripts. It was also found that Pol II-expressed crRNAs were consistently more efficient at mediating genome editing than those expressed from a Pol-III promoter. This observation may reflect in part the fact that only Pol-II transcripts are efficiently exported to the cytoplasm where they might more easily interact with recently translated Cpf1.
- polynucleotide switches that are dependent on Cpf1 enzymatic activities and expression vectors coexpressing such a polynucleotide switch and a target polypeptide of interest.
- the Cpf1-based expression controlling systems of the invention depend on the RNase activity of Cpf1 proteins in excising their own guide RNAs from mRNA transcripts made in mammalian cells.
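The excision of crRNAs from a longer transcript can be pictured with a deliberately simplified string model. The sketch below assumes that each crRNA runs from the start of one scaffold region to the start of the next; the exact RNase cleavage position relative to the scaffold is not modeled, and the function name, scaffold string, guide, and flanking sequences are all hypothetical values chosen for illustration.

```python
# Simplified illustration, not the patent's method: a crRNA is taken to be
# a scaffold plus everything up to the next scaffold occurrence.
def excise_crrnas(transcript: str, scaffold: str) -> list[str]:
    """Return the fragments lying between consecutive scaffold occurrences,
    each fragment starting with the scaffold itself."""
    starts = []
    position = transcript.find(scaffold)
    while position != -1:
        starts.append(position)
        # Continue searching after this scaffold (non-overlapping matches).
        position = transcript.find(scaffold, position + len(scaffold))
    return [transcript[a:b] for a, b in zip(starts, starts[1:])]


# Hypothetical inputs: a 20-nt scaffold and a 23-nt guide.
scaffold = "AAUUUCUACUAAGUGUAGAU"
guide = "GGCAUCGAUCGGAUCCGAUUACG"
transcript = "AUGGCC" + scaffold + guide + scaffold + "AAAA"

crrnas = excise_crrnas(transcript, scaffold)
print(crrnas)
```

With two scaffold regions flanking one guide, this model releases a single scaffold-plus-guide fragment, mirroring the two-scaffold arrangement described for the GLuc transcript in FIG. 2.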
- Polynucleotide switches derived from such activities can be used as permanent off-switches for AAV delivered transgenes.
- the polynucleotide switches and expression vectors of the invention have a number of useful properties that are advantageous over any polynucleotide switches that are currently employed in the art. First, they depend on a nuclease with limited off-target activity.
- transgenes immediately halt transgene expression through one mechanism (by degrading the transgene mRNA), while permanently halting expression through another (by degrading the DNA transgene itself).
- they can be engineered so that they are only active in cells carrying the exogenously introduced transgene. Additionally, once an off-switch of the invention has eliminated the transgene, it automatically turns itself off.
- this invention is not limited to the particular methodology, protocols, and reagents described as these may vary. Unless otherwise indicated, the practice of the present invention employs conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. For example, exemplary methods are described in the following references, Sambrook et al, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (3rd ed., 2001); Brent et al, Current Protocols in Molecular Biology, John Wiley & Sons, Inc.
- Protospacers are spacer sequences in CRISPR loci in a bacterium that were inserted into a CRISPR locus by invading viral or plasmid DNA.
- Cas9 nuclease attaches to tracrRNA:crRNA which guides Cas9 to the invading protospacer sequence. But Cas9 will not cleave the protospacer sequence unless there is an adjacent PAM sequence.
- the spacer in the bacterial CRISPR loci will not contain a PAM sequence, and will thus not be cut by the nuclease. But the protospacer in the invading virus or plasmid will contain the PAM sequence, and will thus be cleaved by the Cas9 nuclease.
- Guide RNAs (gRNAs) are synthesized to perform the function of the tracrRNA:crRNA complex in recognizing gene sequences having a PAM sequence at the 3'-end (5'-end for Cpf1).
- Protospacer adjacent motif is a 2-6 base pair DNA sequence immediately following the DNA sequence targeted by the Cas9 nuclease in the CRISPR bacterial adaptive immune system.
- PAM is a component of the invading virus or plasmid, but is not a component of the bacterial CRISPR locus. Cas9 will not successfully bind to or cleave the target DNA sequence if it is not followed by the PAM sequence.
- PAM is an essential targeting component (not found in bacterial genome) which distinguishes bacterial self from non-self DNA, thereby preventing the CRISPR locus from being targeted and destroyed by nuclease.
- the canonical PAM is the sequence 5'-NGG-3' where "N” is any nucleobase followed by two guanine (“G”) nucleobases.
- the canonical PAM is associated with the Cas9 nuclease of Streptococcus pyogenes (designated SpCas9), whereas different PAMs are associated with the Cas9 proteins of the bacteria Neisseria meningitidis, Treponema denticola, and Streptococcus thermophilus.
- 5'-NGA-3' can be a highly efficient non-canonical PAM for human cells, but efficiency varies with genome location. Attempts have been made to engineer Cas9s to recognize different PAMs to improve ability of CRISPR-Cas9 to do gene editing at any desired genome location.
- Cas9 of Francisella novicida recognizes the canonical PAM sequence 5'-NGG-3', but has been engineered to recognize the PAM 5'-YG-3' (where "Y” is a pyrimidine), thus adding to the range of possible Cas9 targets.
- the Cpf1 nuclease of Francisella novicida recognizes the PAM 5'-TTN-3' or 5'-YTN-3'.
- Cpf1 refers to AsCpf1, LbCpf1, their functional derivatives or variants (e.g., the divergent LbCpf1 exemplified herein), or any other Type V CRISPR effector protein.
- AsCpf1 from Acidaminococcus
- LbCpf1 from Lachnospiraceae
- Cpf1 proteins are RNA-guided nucleases, similar to Cas9. They recognize a T-rich protospacer-adjacent motif (PAM), TTTN, but on the 5' side of the guide. This makes Cpf1 distinct from Cas9, which uses an NGG PAM on the 3' side.
- The cut that Cpf1 makes is staggered. In AsCpf1 and LbCpf1, it occurs 23 bp after the PAM on the targeted (+) strand and 19 bp on the other strand. Cpf1 requires only a crRNA for activity and does not need a tracrRNA to also be present. Unless otherwise noted, Cpf1 as described herein for the present invention also broadly encompasses any other Type V CRISPR effector proteins beyond the specifically exemplified AsCpf1 and LbCpf1 enzymes.
- a Type V CRISPR effector protein refers to a CRISPR effector protein or enzyme that does not require a multiple-protein complex formation for its catalytic function and also does not require a tracrRNA (but instead is itself sufficient) for the maturation of its crRNA.
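The targeting rule described above (a TTTN PAM on the 5' side of a protospacer of about 23 nucleotides) can be turned into a simple candidate-site scan. This is a minimal sketch under stated assumptions: the function names and the test sequence are hypothetical, only the exact TTTN motif is searched, and no cut positions or off-target scoring are modeled.

```python
import re

# Illustrative sketch only; names and the example DNA are assumptions.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(DNA_PAIRS[base] for base in reversed(dna.upper()))


def find_cpf1_sites(dna: str, protospacer_len: int = 23):
    """Scan both strands for a TTTN PAM immediately 5' of a protospacer.

    Returns (strand, pam_start, pam, protospacer) tuples. For the '-'
    strand, pam_start is an index into the reverse-complemented sequence.
    """
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(r"(?=(TTT[ACGT])([ACGT]{%d}))" % protospacer_len)
    sites = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


# Hypothetical 31-nt test sequence with one PAM followed by 23 nucleotides.
example_dna = "GG" + "TTTC" + "ACGTACGTACGTACGTACGTACG" + "CC"
print(find_cpf1_sites(example_dna))
```

The `protospacer_len` parameter can be varied to explore the length tolerances recited for the protospacer (for example, 21 to 25 nucleotides for 23 ± 2).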
- a "host cell” or “target cell” refers to a living cell into which a heterologous polynucleotide sequence is to be or has been introduced.
- the living cell includes both a cultured cell and a cell within a living organism.
- Means for introducing the heterologous polynucleotide sequence into the cell are well known, e.g., transfection, electroporation, calcium phosphate precipitation, microinjection, transformation, viral infection, and/or the like.
- the heterologous polynucleotide sequence to be introduced into the cell is a replicable expression vector or cloning vector.
- host cells can be engineered to incorporate a desired gene on their chromosome or in their genome.
- host cells that can be employed in the practice of the present invention (e.g., CHO cells) are well known in the art. See, e.g., Sambrook et al, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (3rd ed., 2001); and Brent et al, Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (ringbound ed., 2003).
- the host cell is a mammalian cell.
- operably linked refers to functional linkage between genetic elements that are joined in a manner that enables them to carry out their normal functions.
- a gene is operably linked to a promoter when its
- a Cpf1-dependent polynucleotide switch sequence is operably linked to a transgene if its insertion into the 5'-UTR or 3'-UTR of the gene, as described herein, allows control of the transgene expression by Cpf1 RNase digestion of mRNA transcript of the transgene.
- a "substantially identical" nucleic acid or amino acid sequence refers to a polynucleotide or amino acid sequence which comprises a sequence that has at least 75%, 80% or 90% sequence identity to a reference sequence as measured by one of the well known programs described herein (e.g., BLAST) using standard parameters.
- the sequence identity is preferably at least 95%, more preferably at least 98%, and most preferably at least 99%.
- the subject sequence is of about the same length as compared to the reference sequence, i.e., consisting of about the same number of contiguous amino acid residues (for polypeptide sequences) or nucleotide residues (for polynucleotide sequences).
- Polynucleotide sequences are no less substantially identical if they are composed of RNA or DNA, despite the chemical differences between RNA and DNA, and the presence of uracil in RNA instead of thymidine in DNA.
- Sequence identity can be readily determined with various methods known in the art.
- the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915 (1992)).
- Percentage of sequence identity is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
- the percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
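The percentage calculation described above can be written out directly. The sketch below assumes the two sequences have already been optimally aligned by some other tool and are passed in as equal-length strings with `-` marking gaps; the function name and the example alignments are hypothetical.

```python
# Illustrative sketch only: alignment itself is assumed to be done elsewhere.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matched positions divided by total positions in the comparison
    window, multiplied by 100. Gap characters ('-') never count as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matched / len(aligned_a)


# Nine of ten positions match in this hypothetical window.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
# A gap column counts toward the window length but not toward the matches.
print(percent_identity("ACG-TA", "ACGTTA"))
```

Here the window length includes gap columns, which is one common convention; programs such as BLAST report identity over their own alignment length, so values may differ slightly between tools.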
- complementary refers to a nucleotide or nucleotide sequence that hybridizes to a given nucleotide or nucleotide sequence.
- nucleotide A is complementary to T and vice versa
- nucleotide C is complementary to G and vice versa.
- nucleotide A is complementary to the nucleotide U and vice versa
- nucleotide C is complementary to the nucleotide G and vice versa.
- Reverse complement means a sequence that is complementary to another sequence, but in reverse order.
- For DNA, the reverse complement of AATTGG is CCAATT.
- For RNA, the reverse complement of AAUUGG is CCAAUU.
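The pairing rules and the two examples above can be reproduced with a few lines of code. This is a minimal sketch; the function name and the `rna` flag are illustrative choices rather than anything defined in the specification.

```python
# Illustrative sketch only; the function name and flag are assumptions.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Complement every base, then read the result in reverse order."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))


print(reverse_complement("AATTGG"))            # CCAATT
print(reverse_complement("AAUUGG", rna=True))  # CCAAUU
```

Applying the function twice returns the original sequence, which is a quick sanity check when preparing guide and protospacer sequences by hand.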
- a cell has been "transformed” or “transfected” by exogenous or heterologous polynucleotide (or "a transgene” or “a target gene” as used interchangeably herein) when such polynucleotide has been introduced inside the cell.
- the transforming polynucleotide may or may not be integrated (covalently linked) into the genome of the cell.
- the transforming polynucleotide may be maintained on an episomal element such as a plasmid.
- a stably transformed cell is one in which the transforming polynucleotide has become integrated into a chromosome so that it is inherited by daughter cells through chromosome replication. This stability is demonstrated by the ability of the eukaryotic cell to establish cell lines or clones comprised of a population of daughter cells containing the transforming polynucleotide.
- a "clone” is a population of cells derived from a single cell or common ancestor by mitosis.
- a "cell line” is a clone of a primary cell that is capable of stable growth in vitro for many generations.
- a "vector” or “construct” is a non-naturally occurring nucleic acid with or without a carrier that can be introduced into a cell, or has been introduced into a cell.
- Vectors that have been introduced into a cell include transfected plasmids and integrated DNA molecules, including those resulting from retroviral integration, integration of an AAV vector, and integration by homologous recombination.
- Vectors capable of directing the expression of heterologous polynucleotide or transgene sequences encoding for one or more polypeptides are referred to as "expression vectors" or "expression constructs".
- the cloned transgene sequence or open reading frame (ORF) is usually placed under the control of (i.e., operably linked to) certain regulatory sequences such as promoters, enhancers and polynucleotide switch sequences.
- a transgene is any transcription unit contained within a vector.
- the transgene may or may not encode a protein or proteins.
- the transgene may encode one or more shRNA, miRNA, ribozyme, or protein.
- AAV is adeno-associated virus, and may be used to refer to the naturally occurring wild-type virus itself or derivatives thereof. The term covers all subtypes, serotypes and pseudotypes, and both naturally occurring and recombinant forms, except where required otherwise.
- Pseudotyped AAV refers to an AAV that contains capsid proteins from one serotype and a viral genome including 5'-3' ITRs of a second serotype.
- rAAV refers to recombinant adeno-associated viral particle or a recombinant AAV vector (or "rAAV vector”).
- AAV virus or “AAV viral particle” refers to a viral particle composed of at least one AAV capsid protein (preferably by all of the capsid proteins of a wild-type AAV) and an encapsidated polynucleotide. If the particle comprises a heterologous polynucleotide (i.e., a polynucleotide other than a wild-type AAV genome such as a transgene to be delivered to a mammalian cell), it is typically referred to as "rAAV”.
- a retrovirus (e.g., a lentivirus) based vector or retroviral vector means that genome of the vector comprises components from the virus as a backbone.
- the viral particle generated from the vector as a whole contains essential vector components compatible with the RNA genome, including reverse transcription and integration systems. Usually these will include the gag and pol proteins derived from the virus. If the vector is derived from a lentivirus, the viral particles are capable of infecting and transducing non-dividing cells. Recombinant retroviral particles are able to deliver a selected exogenous gene or polynucleotide sequence such as therapeutically active genes, to the genome of a target cell.
- the present invention provides polynucleotide switches that are dependent on the RNase and DNase activities of a Type V CRISPR effector protein (e.g., a Cpfl enzyme). These switches and related expression vectors of the invention provide exceptional tools for controlling transgene expression. For example, there is currently no effective way to turn off expression of a transgene delivered by adeno-associated viral vectors (AAVs).
- the off-switches of the invention allow immediate and permanent termination of AAV-mediated transgene expression. As exemplified herein, the off-switches can be used for terminating expression of an AAV-delivered transgene that expresses any antibody or other protein therapeutic, such that its safety more closely matches that of the protein therapeutic not expressed by a gene delivery vector.
- the polynucleotide switches of the present invention can be used for switching on the expression of a transgene.
- the present invention includes polynucleotide switches for turning on or turning off the expression of a transgene from a vector.
- the polynucleotide switches of the invention contain a Cpf1 DNase target sequence that can be targeted by a Type V CRISPR effector protein, e.g., Cpf1 enzymes from two bacterial species, Lachnospiraceae bacterium (Lb) and Acidaminococcus sp. (As) as exemplified herein.
- the switches also contain a second sequence segment or polynucleotide motif that encodes a transcript that is capable of becoming the crRNA for the Cpfl enzymes.
- polynucleotide switches of the invention contain a protospacer motif and a protospacer-adjacent motif (PAM) that is typically located 5' to the protospacer.
- the protospacer motif can contain about 15 to about 30 nucleotides that are specifically targeted by the crRNA of a Cpfl enzyme disclosed herein.
- the protospacer is about 23 ± 5 nucleotides, 23 ± 4 nucleotides, 23 ± 3 nucleotides, 23 ± 2 nucleotides, 23 ± 1 nucleotides or 23 nucleotides in length.
- the protospacer used in the polynucleotide switches of the invention can contain a sequence identical to Guide 1 (SEQ ID NO: 9) or Guide 3 (SEQ ID NO: 11) exemplified herein.
- the protospacer-adjacent motif (PAM) in the Cpf1 DNase target sequence is a thymidine (T)-rich sequence motif. It is typically 2-6 nucleotide residues in length. Unless otherwise specified, the PAM is located 5' of the protospacer in the Cpf1 DNase target sequence. In various embodiments, the PAM can contain 2, 3, 4, 5, or 6 T nucleotides. In some embodiments, the employed PAM is a trinucleotide comprising two T residues, e.g., TTN. In some other embodiments, the employed PAM is a tetranucleotide comprising three T residues, e.g., TTTN.
- the Cpf1 DNase target sequence in some preferred polynucleotide switches of the invention can contain a PAM sequence of TTTA or TTTG as exemplified herein for sR-gR1 or sR-gR3.
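The target-site layout described above (a T-rich PAM immediately 5' of a protospacer of about 23 nucleotides) can be illustrated with a simple sequence scan. The following is a minimal sketch and not a method taken from the specification: the function name `find_cpf1_targets`, the default TTTN pattern, the fixed 23-nucleotide protospacer length, and the example sequence are all assumptions made for illustration, and only the strand as given is scanned.

```python
import re

def find_cpf1_targets(dna, pam_regex="TTT[ACGT]", protospacer_len=23):
    """Return (position, PAM, protospacer) for each PAM followed 3' by a protospacer.

    Hypothetical helper: scans only the given strand and assumes a fixed
    protospacer length; overlapping sites are reported via a lookahead.
    """
    dna = dna.upper()
    pattern = re.compile(
        f"(?=({pam_regex})([ACGT]{{{protospacer_len}}}))"
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

# Made-up example: 2 nt of flank, a TTTA PAM, a 23-nt protospacer, 2 nt of flank.
example_protospacer = "ACGT" * 5 + "ACG"
example_dna = "GG" + "TTTA" + example_protospacer + "CC"
hits = find_cpf1_targets(example_dna)
```

Restricting `pam_regex` to `TTT[AG]` would limit the scan to the TTTA and TTTG PAMs mentioned above.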
- the Cpf1 orthologs exemplified herein have RNase activity that can be used to excise their guide RNA from an mRNA transcript produced in mammalian cells.
- the second sequence segment in the polynucleotide switches encodes an RNA transcript that harbors a scaffold RNA (or "Cpfl RNase target sequence") that mediates the cleavage of the transcript by a Type V CRISPR effector protein (e.g., Cpfl) RNase activity, as well as a gRNA that mediates the cleavage of its DNA target by Cpfl DNase activity.
- the gRNA is flanked on either side by a scaffold RNA.
- cleavage of both of the two scaffold RNA sequences located on either side of the gRNA by Cpfl generates a mature crRNA.
- a transcript expressed by a vector can be processed into a mature crRNA by Cpf1.
- the scaffold RNA sequence is generally 8-30 nucleotides in length.
- Such scaffold RNAs contain a hairpin RNA structure that is formed from a sequence that is typically 4-10 (or 5-10) nucleotides in length and a reverse complement of that sequence that is also 4-10 (or 5-10) nucleotides in length.
- the sequence 4-10 (or 5-10) nucleotides in length can also be termed "hairpin motif sequence" herein.
- Within the scaffold RNA, at a position fewer than 5 nucleotides 5' of the sequence 4-10 nucleotides in length, there is typically an AAU trinucleotide motif.
- the scaffold RNA can be from about 15 to about 25 nucleotides in length.
- the scaffold RNA can contain about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides. In some preferred embodiments, the scaffold RNA contains about 19 or 20 nucleotides. In some embodiments, the 5' end of the scaffold RNA, including the sequence 4-10 nucleotides in length, contains the sequence: AAUUUCUACU (SEQ ID NO: 17).
- functional scaffold RNAs for LbCpf1 and AsCpf1 are AAUUUCUACUAAGUGUAGAU (SEQ ID NO: 18) and AAUUUCUACUCUUGUAGAU (SEQ ID NO: 19), respectively, as exemplified herein.
- the scaffold RNA for a variant LbCpfl exemplified herein is
- the scaffold RNAs of the switches can be substantially identical (e.g., at least 90%, 95%, or 99% identical) to the scaffold RNAs shown in SEQ ID NO: 18, SEQ ID NO: 19 or SEQ ID NO:22.
- the scaffold RNA of the polynucleotide switch contains a sequence that is identical to SEQ ID NO: 18, SEQ ID NO: 19 or SEQ ID NO:22.
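The hairpin arrangement described above (a hairpin motif sequence of 4-10 nucleotides followed downstream by its reverse complement) can be checked computationally. The sketch below is illustrative only: the function names, the minimum loop length of 3 nucleotides, and the restriction to Watson-Crick pairs (no G-U wobble) are assumptions, not features stated in the specification. The two input sequences are the LbCpf1 and AsCpf1 scaffold RNAs given above.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]

def find_hairpin(rna, min_stem=4, max_stem=10, min_loop=3):
    """Return (stem, loop, paired_stem) for the longest stem found, else None.

    Hypothetical helper: looks for a stem of min_stem..max_stem nucleotides
    whose reverse complement occurs at least min_loop nucleotides downstream.
    """
    rna = rna.upper()
    for stem_len in range(max_stem, min_stem - 1, -1):
        for i in range(len(rna) - 2 * stem_len - min_loop + 1):
            stem = rna[i:i + stem_len]
            j = rna.find(reverse_complement(stem), i + stem_len + min_loop)
            if j != -1:
                return stem, rna[i + stem_len:j], rna[j:j + stem_len]
    return None

lb_scaffold = "AAUUUCUACUAAGUGUAGAU"  # SEQ ID NO: 18
as_scaffold = "AAUUUCUACUCUUGUAGAU"   # SEQ ID NO: 19

lb_hairpin = find_hairpin(lb_scaffold)  # ("UCUAC", "UAAGU", "GUAGA")
as_hairpin = find_hairpin(as_scaffold)  # ("UCUAC", "UCUU", "GUAGA")
```

In both scaffolds the stem found this way begins one nucleotide after the 5' AAU trinucleotide, consistent with the positioning described above.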
- the gRNA can contain about 23 ± 5 nucleotides.
- the gRNA encoded by the second polynucleotide motif in the polynucleotide switches of the present invention can be 23 ± 4 nucleotides, 23 ± 3 nucleotides, 23 ± 2 nucleotides, 23 ± 1 nucleotides or 23 nucleotides in length.
- the sequence of the gRNA is at least 75% identical to the protospacer sequence.
- the gRNA has a sequence that is at least 80%, at least 85%, at least 90% or at least 95% identical to the sequence of the protospacer sequence in the same polynucleotide switch.
- the gRNA encoded by the second sequence segment of the polynucleotide switches of the invention can have a sequence that is identical to any of the protospacer sequences exemplified in Figure 3, panel b (SEQ ID NOs:9-16). It is to be noted that the PAM motif is located 5' of the protospacer, and the sequence of the protospacer is read from 5' to 3'. Consistent with this orientation, while the gRNA actually binds to the complementary strand of the protospacer, its sequence is denoted herein as being "identical" and not "complementary" to the sequence of the protospacer.
- RNA and DNA sequences are said to be “identical,” despite chemical difference of ribose versus deoxyribose, and the inclusion of uracil (U) versus thymidine (T).
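The identity convention above (RNA and DNA compared base for base, with U treated as equivalent to T) can be expressed as a short calculation. This is a minimal sketch under stated assumptions: the function name `percent_identity` is hypothetical, the two sequences are assumed to be of equal length and already aligned position by position (no gaps), and the example sequences are made up for illustration rather than taken from the sequence listing.

```python
def percent_identity(grna, protospacer):
    """Percent of positions at which a gRNA and a protospacer carry the same base.

    U and T are treated as equivalent, so an RNA and a DNA sequence can be
    "identical". Assumes equal-length, ungapped sequences.
    """
    g = grna.upper().replace("U", "T")
    p = protospacer.upper().replace("U", "T")
    if len(g) != len(p) or not g:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(g, p))
    return 100.0 * matches / len(g)

# Made-up 20-nt example with two mismatches (90% identical).
example_protospacer = "ACGTACGTACGTACGTACGT"
example_grna = "ACGUACGUACGUACGUACAA"
identity = percent_identity(example_grna, example_protospacer)
```

A result of at least 75 would satisfy the lowest identity threshold recited above.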
- the switches of the present invention can generate more than one crRNA, and the crRNAs can target more than one protospacer within the vector.
- multiple crRNAs can be generated by the RNase activity of Cpfl from a single transcript, in which each of the crRNAs is flanked on both sides by a scaffold RNA, i.e., one scaffold RNA on the 5' side and one scaffold RNA on the 3' side.
- a scaffold RNA that is located between two gRNAs can be shared.
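The arrangement of several gRNAs on one transcript, each flanked by scaffold RNAs with a shared scaffold between neighboring gRNAs, can be sketched as simple string assembly. The code below is illustrative only: the function names and the guide sequences are hypothetical, and splitting the transcript at whole scaffold sequences is a simplification that does not model where within the scaffold the Cpf1 RNase activity actually cleaves.

```python
def build_crrna_array(scaffold, guides):
    """Assemble scaffold-gRNA-scaffold-gRNA-...-scaffold with shared scaffolds."""
    return scaffold + "".join(guide + scaffold for guide in guides)

def extract_guides(transcript, scaffold):
    """Return the segments that are flanked on both sides by a scaffold.

    Simplified model: the transcript is split at every full scaffold
    occurrence; sequence before the first or after the last scaffold
    (e.g., the rest of an mRNA) is not treated as a gRNA.
    """
    parts = transcript.split(scaffold)
    return [part for part in parts[1:-1] if part]

lb_scaffold = "AAUUUCUACUAAGUGUAGAU"        # SEQ ID NO: 18
guide_a = "GGCC" * 5 + "GGC"                # made-up 23-nt guide
guide_b = "GC" * 11 + "G"                   # made-up 23-nt guide

array = build_crrna_array(lb_scaffold, [guide_a, guide_b])
recovered = extract_guides("AUGGCCUAA" + array, lb_scaffold)
```

With two gRNAs the array contains three scaffold copies rather than four, because the middle scaffold serves both gRNAs.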
- some polynucleotide switches of the invention can contain more than one Cpfl DNase target sequence.
- each Cpf1 DNase target sequence harbors a specific protospacer motif that is preceded at the 5' end by a PAM.
- the more than one Cpfl DNase target sequences in the polynucleotide switch can contain different protospacer sequences. These different Cpfl DNase target sequences can be used for switching on or switching off the expression of one or more transgenes or ORFs in the same vector.
- the second sequence segment can encode a transcript containing one or more pre-crRNAs that respectively can contain distinct gRNAs, each targeting distinct protospacer sequences.
- each of the encoded pre-crRNAs can contain a distinct gRNA, each flanked by a scaffold RNA sequence.
- Such polynucleotide switches can be employed to redundantly target a vector, thus switching on or switching off a transgene.
- the Cpf1-dependent polynucleotide switches of the invention are suitable for controlling expression of transgenes or ORFs in various applications. Some embodiments of the invention are directed to expression vectors that contain the polynucleotide switch that is operably fused to a transgene sequence.
- the polynucleotide switch typically employs one or more guide RNA sequences that recognize rationally designed Cpfl DNase target sequences placed in or adjacent to the transgene.
- the switch can be operably placed at the 3'-untranslated region of a transgene on a viral vector (e.g., an AAV vector).
- transgene is used herein to refer to a transcription unit of a vector, including a region that is transcribed into RNA from the vector template, and any operably linked enhancer and promoter sequences.
- the transgene can contain an open reading frame (ORF), which can be translated into a protein, or alternatively, can be a non-coding RNA, including, e.g., an antisense RNA, an shRNA, an miRNA, an aptamer, or a ribozyme.
- switching off of transgene expression can be achieved by both an immediate degradation of its messenger RNA transcript and a permanent modification at one or more protospacer sites in the transgene DNA.
- the control of transgene expression by the vectors of the present invention has several additional advantages.
- Cpf1, already shown to be more specific than the better known CRISPR effector Cas9, would only be active in the presence of the transgene, due to the absence of a scaffold RNA in cells that have not been transduced by the vector, so there would be no off-target effects in non-transduced cells. This is because the guide RNA, necessary for activation of the Cpf1 DNase activity, is generated from the transgene, which is expressed by the vector.
- the system is self-limiting, because after the transgene is eliminated, it no longer can be used to generate the guide RNA, thereby halting the generation of active Cpfl/crRNA complexes.
- the off-target effects of this system are at an absolute minimum.
- the system can also be self-limiting when Cpfl is provided transiently, either as protein, or as a polynucleotide that transiently expresses a Cpfl protein.
- the transgene regulated by the polynucleotide switch of the present invention typically contains an ORF.
- the ORF can encode any polypeptide of interest.
- expression of the transgene can be switched off immediately (via Cpfl RNase cleavage of the mRNA transcript of the transgene) and permanently (via Cpfl DNase cleavage of the vector itself) as exemplified herein.
- the transgene can be expressed in the same transcript that encodes the gRNA.
- the transgene is expressed separately, e.g., as a different transcript from the one that encodes the gRNA.
- expression of the gRNA and/or the transgene can be placed under the control of either a Pol II (e.g., CMV) or a Pol III (e.g., U6) promoter sequence.
- the protospacer of the polynucleotide switch contains a sequence element that regulates the expression of a transgene.
- the protospacer can be a region of a vector or a transgene that is not transcribed, the protospacer can be within a region of a transgene that is transcribed but not translated, or the protospacer can be within a region of a transgene that is translated.
- the polynucleotide switch switches on or switches off transgene expression at the level of transcription. In some other embodiments, the polynucleotide switch switches on or switches off transgene expression at the level of RNA stability or transport. In some other embodiments, the polynucleotide switch switches on or switches off transgene expression at the level of translation.
- both the gRNA and the transgene are expressed in a single transcript under the control of the same promoter, e.g., a Pol II promoter.
- cleavage of the expression vector in or near the protospacer by the DNase activity of a Cpf1 will lead to permanent termination of the expression of the transgene.
- cleavage of the expression vector in or near the protospacer by the DNase activity of Cpfl switches on the expression of the transgene.
- the protospacer of the polynucleotide switch is located in or near the coding region of the transgene.
- the protospacer is located in an expression control sequence of the transgene, e.g., the promoter region.
- the guide RNA in the second sequence segment of the polynucleotide switch is designed to be substantially identical (e.g., at least 80%, 90%, or 95% identical) to a chosen coding region or an expression control region of the transgene that is preceded by a T-rich PAM.
- some expression vectors of the invention contain a promoter sequence (e.g., an RNA Pol II promoter sequence) that harbors the protospacer of the Cpf1 DNase target sequence.
- the protospacer contains promoter elements completely. In some other embodiments, the protospacer partially overlaps with promoter elements.
- Such promoter elements, which can be contained within or partially overlapped by the protospacer, include, e.g., (i) a TFIIB recognition element (BRE), typically located at positions -37 to -32 5' of the first nucleotide of a transcript and often resembling the consensus sequence
- the protospacer contains or overlaps transcription factor binding sites.
- some off-switch embodiments can include a protospacer that contains or overlaps the binding site of a transcriptional activator, whereas some on-switch embodiments can include a protospacer that contains or overlaps the binding site of a transcriptional repressor.
- the protospacer contains or partially overlaps a splice acceptor site.
- Splice acceptor sites typically contain the AG dinucleotide.
- Canonical splice acceptor sites, but not all splice acceptor sites, contain a polypyrimidine tract 5' of the AG dinucleotide.
- a subset of splice acceptor sites contain the AG dinucleotide within a CAG trinucleotide.
- the protospacer contains or partially overlaps an AG dinucleotide, CAG trinucleotide, or polypyrimidine tract.
- the PAM overlaps the polypyrimidine tract.
- the protospacer contains or partially overlaps a splice donor GU dinucleotide.
- the protospacer sequence exists within the transgene at a position where Cpf1 cleavage, and the repair of the cleaved protospacer, affects translation.
- the protospacer includes or overlaps an ATG start codon.
- the protospacer exists within an open reading frame (ORF).
- cleavage of the protospacer by Cpfl can switch on or switch off the expression of a protein by changing the reading frame of the encoded protein. Changing the reading frame can, e.g., result in a premature stop codon, or restore a reading frame without a premature stop codon.
- the protospacer can be at a position that lies in any of the untranslated regions that are known to be involved in controlling mRNA translation, degradation, and/or localization. These include, e.g., stem-loop structures, alternative start codons and open reading frames, internal ribosome entry sites (IRESes), RNA instability elements, RNA stability elements, the woodchuck hepatitis virus post-transcriptional regulatory element (WPRE), and various other cis-acting elements that are bound by RNA-binding proteins.
- the vectors can additionally harbor sequences corresponding to the 5'-ETS and ITS elements of the precursor RNA sequence.
- the transgene is typically operably fused with the switch polynucleotide sequence on the expression construct.
- the expression constructs can be recombinantly produced with many vectors well known in the art. These include viral vectors such as recombinant adenovirus, retrovirus, lentivirus, herpesvirus, poxvirus, papilloma virus, or adeno-associated virus (AAV).
- the vectors can be present in liposomes, e.g., neutral or cationic liposomes, such as DOSPA/DOPE, DOGS/DOPE or DMRIE/DOPE liposomes, and/or associated with other molecules such as DNA-anti-DNA antibody-cationic lipid (DOTMA/DOPE) complexes.
- the expression constructs are based on retroviral vectors.
- Some preferred embodiments of the invention can employ AAV vectors or adenoviral vectors for introducing into host cells the transgene that is operably linked to a polynucleotide switch of the invention.
- reporter gene e.g., FLuc or GLuc
- AAV vectors gR1T(+1)-FLuc(+3) and gR3T(+1)-GLuc(+3) have a start codon in the functional +1 frame followed by a Cpf1 DNase target sequence or guide RNA target sequence (gRT) and a reporter gene in the +3 reading frame.
- When Cpf1 makes a double-strand cleavage at the gRT region, the cellular non-homologous end joining system will repair the DNA double-strand break by introducing an insertion or deletion of random length.
- For those repair outcomes that shift the reading frame appropriately, the reporter gene will be placed in frame with the +1 frame translational start codon and the reporter expression is activated ("switched on").
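The frame arithmetic behind this on-switch can be worked through explicitly. The sketch below is a simplified illustration and not part of the specification: it assumes that the +1, +2 and +3 frames correspond to offsets of 0, 1 and 2 nucleotides relative to the start codon, that the repair introduces no stop codon, and that `indel` is the net number of nucleotides inserted (positive) or deleted (negative) between the start codon and the reporter. The function name is hypothetical.

```python
def reporter_in_frame(reporter_frame, indel):
    """Return True if the reporter is in frame with the start codon after repair.

    reporter_frame: 1, 2 or 3 (assumed to mean an offset of 0, 1 or 2 nt).
    indel: net nucleotides inserted (+) or deleted (-) upstream of the reporter.
    """
    if reporter_frame not in (1, 2, 3):
        raise ValueError("reporter_frame must be 1, 2 or 3")
    offset = reporter_frame - 1
    return (offset + indel) % 3 == 0

# Net indel sizes between -6 and +6 that would switch on a +3 frame reporter.
restoring_indels = [d for d in range(-6, 7) if reporter_in_frame(3, d)]
# restoring_indels == [-5, -2, 1, 4]
```

Under these assumptions, a reporter placed in the +3 frame is switched on by a net change of +1, -2, +4, -5 nucleotides and so on, whereas an unedited vector (net change of 0) leaves the reporter out of frame.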
- Adeno-associated virus is a small, nonenveloped virus that was adapted for use as a gene transfer vehicle.
- Adeno-associated virus vectors refer to recombinant adeno-associated viruses that are derived from nonpathogenic parvoviruses. They evoke essentially no cellular immune response, and produce transgene expression lasting months in most systems. Like adenovirus, adeno-associated virus vectors also have the capability to infect replicating and nonreplicating cells and are believed to be nonpathogenic to humans. Delivery of heterologous polynucleotide sequences via recombinant AAV can provide for safe, unobtrusive and sustained expression (> 2 years) of high levels of protein therapeutics.
- In order to construct an adenoviral or retroviral based expression vector of the invention, the transgene and an operably linked polynucleotide switch sequence are often inserted into the viral genome in the place of certain viral sequences to produce a viral construct that is replication-defective.
- Methods for producing adenoviral and retroviral vectors are well-known in the art.
- Suitable host or producer cells for producing recombinant retroviruses or retroviral vectors according to the invention are also well known in the art (e.g., 293T cells exemplified herein).
- expression vectors of the invention that harbor a transgene sequence and an operably linked polynucleotide switch sequence can be readily constructed in accordance with methodologies known in the art of molecular biology in view of the exemplifications provided herein. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (3rd ed., 2001); Brent et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (ringbound ed., 2003); and Freshney, Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, Inc. (4th ed., 2000).
- the expression vectors are assembled by inserting into a suitable vector backbone the transgene harboring a heterologous polynucleotide or transgene of interest and a polynucleotide switch described herein, as well as sequences encoding, e.g., selection markers, and other optional elements.
- Many virus based expression vector systems well known in the art can be used in the invention. Widely used retroviral vectors include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations thereof (see, e.g., Buchscher et al , J.
- Adeno-associated viral vectors have also been used in many reported studies for gene therapy in research and clinical environment. See, e.g., Kaplitt et al, Lancet 369: 2097-105, 2007; Daya et al, Clin Microbiol Rev.
- AAV based expression vectors for practicing the invention can be based on the pAAV-MCS construct that is available from Agilent Technologies (Santa Clara, CA). Similarly, a number of retroviral vectors and compatible packaging cell lines are available from Clontech (Mountain View, CA).
- examples of lentiviral based vectors include, e.g., pLVX-Puro, pLVX-IRES-Neo, pLVX-IRES-Hyg, and pLVX-IRES-Puro.
- Corresponding packaging cell lines are also available, e.g., Lenti-X 293T cell line.
- other retroviral based vectors and packaging systems are also commercially available.
- These include MMLV based vectors pQCXIP, pQCXIN, pQCXIQ and pQCXIH, and compatible producer cell lines such as HEK 293 based packaging cell lines GP2-293, EcoPack 2-293 and AmphoPack 293, as well as NIH/3T3-based packaging cell line RetroPack PT67.
- Any of these and other retroviral vectors and producer cell lines may be employed in the practice of the present invention.
- Vectors with Cpf1-based polynucleotide switches can be readily employed in many clinical or therapeutic settings. For example, they can be included in various gene transfer vectors, thus preparing the gene transfer vector to subsequently be switched on or switched off in a subject upon the addition of Cpf1.
- the polynucleotide switches of the invention can be used to terminate expression of a transgene that is introduced into a subject for gene therapies.
- the polynucleotide switches can be used to switch on the expression of a transgene encoded by a gene therapy vector administered to a subject. In certain embodiments, it is preferable to eliminate vector-transduced cells within a subject, rather than merely switch off the transgene.
- a polynucleotide switch of the present invention can be employed to switch on the expression of a suicide gene, e.g., a caspase, in a subject who received the vector.
- the vectors of the invention have a number of advantages over currently known systems for controlling transgene expression in mammalian settings.
- the vectors can contain a scaffold RNA for Cpf1 RNase activity in the 3'-untranslated region of the transgene.
- expression of the transgene can be inactivated immediately (by degrading its messenger RNA transcript) and permanently (by cleaving at one or more sites of the transgene DNA).
- Certain off-switch embodiments of the present invention in which the scaffold RNA and gRNA components of the polynucleotide switch are included in the transgene, have at least two notable safety advantages that are relevant in human subjects and clinical settings.
- the crRNA-dependent DNase activity of Cpfl would only be active in the presence of the transgene. Although there might be cells that receive Cpfl that were not previously transduced with the vector expressing the transgene, there would be no off-target effects in these cells without the transgene that received only Cpfl. This is because the crRNA, which is necessary for activation of the DNase activity of Cpfl, is generated from the transgene.
- the system is self-limiting, because after Cpfl cleaves a protospacer resulting in the switching off of transgene expression, the transgene can no longer be processed to form a mature crRNA, precluding the DNase activity of Cpfl.
- the off-target effects of this polynucleotide switch in a subject are at an absolute minimum.
- Retroviral vectors or recombinant retroviruses are widely employed in gene transfer in various therapeutic or industrial applications. For example, gene therapy procedures have been used to correct acquired and inherited genetic defects, and to treat cancer or viral infection in a number of contexts. The ability to express artificial genes in humans facilitates the prevention and/or cure of many important human diseases, including many diseases which are not amenable to treatment by other therapies.
- the invention accordingly provides methods or uses of the Cpf1-dependent switches or expression vectors in various clinical or industrial bioengineering contexts.
- the vectors expressing a transgene can be transduced into host cells in various gene therapy and other clinical applications.
- the transgene harbored by the vector can encode a therapeutic agent.
- These constructs can be transferred, for example to treat cancer cells, to express immunomodulatory genes to fight viral infections, or to replace a gene's function as a result of a genetic defect.
- the polynucleotide switches of the present invention can be used to optimize dosing of a therapeutic protein encoded by a transgene.
- Cpfl can be provided in an amount that is sufficient to switch off transgene expression in some but not all of the vectors in a subject.
- Cpfl can be provided in an amount that is sufficient to switch off some but not all copies of a vector present within a cell.
- a transgene that is initially administered to a subject at a level that is too high can be switched off in some but not all transduced cells, in order to decrease the dose of the transgene to the subject.
- Transgene expression by the vectors of the present invention can be switched on or switched off, in some or all transduced cells in a subject, by administering to the subject a Cpf1 protein (or a Cpf1-expressing polynucleotide construct) that can cleave both the expression vector and the mRNA transcript.
- expression of the therapeutic agent can be temporarily terminated by degrading the mRNA transcript but not the expression vector itself.
- Temporary shut-off at the RNA level can be achieved with engineered Cpf1 variants maintaining only the RNase activity as exemplified herein.
- a Cpfl enzyme needs to be present to mediate cleavage of the protospacer in the vector administered to a subject undergoing treatment.
- the Cpf1 protein can be delivered to the patient, e.g., in case of adverse events, via one of several approaches. These include, e.g., (1) delivery by AAV with the original AAV transgene and activated when necessary by a morpholino or small molecules, (2) delivery by AAV to the site of transgene expression, for example the liver or a specific set of muscle cells, and (3) delivery as an mRNA message to the site of transgene expression.
- the Cpfl protein or variant can be administered to the subject via a separate expression vector as exemplified herein.
- an ORF present in the transgene in the expression vectors of the invention can encode any polypeptide of interest.
- the ORF or transgene encodes a polypeptide that is at least 90% identical to one or more human proteins.
- the ORF can encode a constant region of an antibody, e.g., the Fc of IgG1, IgG2, IgG3, or IgG4, or other constant regions such as CH1, the constant region of a kappa light chain, or the constant region of a lambda light chain.
- the transgene operably inserted into the polynucleotide switch containing expression vectors of the invention encodes a portion or a fragment (e.g., an antigen-binding fragment) derived from one or more immunoadhesins or antibodies.
- antibody-related molecules that are well characterized in the art, e.g., CD4-Ig, eCD4-Ig, PG9, PG16, PGT121, PGT128, 10-1074, PGT145, PGT151, CAP256, 2F5, 4E10, 10E8, 3BNC117, VRC01, VRC07, VRC13, PGDM1400, PGV04, 2G12, b12, N6, TR66, etanercept, abatacept, rilonacept, aflibercept, belatacept, romiplostim, efmoroctocog, eftrenonacog, asfotase alfa, muromonab-CD3, edrecolomab, capromab, ibritumomab, blinatumomab, abciximab, rituximab, basiliximab, infliximab
- the transgene in the expression vectors of the invention can encode at least a chain or functional fragment derived from any of the other known cellular proteins such as cellular receptors, other cell surface molecules, enzymes, cytokines, chemokines, costimulatory molecules, interleukins, and physiologically active polypeptide factors.
- these known cellular proteins include, e.g., CD4, TPST1, TPST2, TNFR II, CD28, CTLA-4, PD-1, PD-L1, PD-L2, 4-1BBL, 4-1BB, EPO, Factor VIII, Factor IX, alkaline phosphatase, hemoglobin, fetal hemoglobin, and RPE65.
- the polypeptide expressed from the ORF in the expression vectors of the invention is at least part of a chimeric antigen receptor (CAR).
- the expression vectors of the invention can be used in gene therapies for expressing many therapeutic agents known in the art. These include factor VIII, factor IX, β-globin, low-density lipoprotein receptor, adenosine deaminase, purine nucleoside phosphorylase, sphingomyelinase, glucocerebrosidase, cystic fibrosis transmembrane conductance regulator, α-antitrypsin, CD-18, ornithine transcarbamylase, argininosuccinate synthetase, phenylalanine hydroxylase, branched-chain α-ketoacid dehydrogenase, fumarylacetoacetate hydrolase, glucose 6-phosphatase, α-L-fucosidase, β-glucuronidase, α-L-iduronidase, galactose 1-phosphate uridyltransferase, interleukins, cytokines, small peptides, and the like.
- therapeutic agents or proteins of interest include, but are not limited to, insulin, erythropoietin, tissue plasminogen activator (tPA), urokinase, streptokinase, neutropoiesis stimulating protein (also known as filgrastim or granulocyte colony stimulating factor (G-CSF)), thrombopoietin (TPO), growth hormone, hemoglobin, insulinotropin, imiglucerase, sargramostim, endothelin, soluble CD4, and antibodies and/or antigen-binding fragments (e.g., Fabs) thereof (e.g., Orthoclone OKT3 (anti-CD3), GPIIb/IIIa monoclonal antibody), ciliary neurite transforming factor (CNTF), granulocyte macrophage colony stimulating factor (GM-CSF), brain-derived neurite factor (BDNF), parathyroid hormone(PTH)-like hormone, insulinotrophic hormone, insulin-
- expression vectors or polynucleotide switches of the present invention can be used to control expression of a transgene for regulating cell growth, differentiation or viability in cells transplanted into a subject.
- the expression constructs used in these methods express a transgene operably linked to a polynucleotide switch of the invention.
- the transgene can contain an ORF encoding a polypeptide that regulates the growth or other cellular processes of the cell.
- the level of expression of the polypeptide can be readily switched on or off via the targeted digestion of the vector and also the
- these methods can be used to prevent the growth of hyperplastic or tumor cells, or even the unwanted proliferation of normal cells.
- the methods can also be used to induce the death of fat cells, to regulate growth and differentiation of stem cells, or to regulate an immune response.
- the transgene to be expressed under the control of Cpf1-dependent polynucleotide switches of the present invention can encode a therapeutic polypeptide or agent noted above.
- transfection of tumor suppressor gene p53 into human breast cancer cell lines has led to restored growth suppression in the cells (Casey et al, Oncogene 6: 1791-7, 1991).
- the Rb protein can be employed similarly.
- the transgene operably linked to a polynucleotide switch of the present invention can encode an enzyme.
- the gene can encode a cyclin-dependent kinase (CDK).
- Additional embodiments of the invention encompass expression in target cells of cell adhesion molecules, other tumor suppressors such as p21 and BRCA2, inducers of apoptosis such as Bax and Bak, other enzymes such as cytosine deaminases and thymidine kinases, hormones such as growth hormone and insulin, and interleukins and cytokines.
- target cells are mammalian cells, e.g., cells of both human and non-human animals including vertebrates and mammals.
- the target cells are cancer or tumor cells.
- the target cells are stem cells.
- the vector introduced into the cells can express a transgene that regulates differentiation, proliferation, or death (e.g., by apoptosis) of stem cells.
- a stem cell therapy can include stem cells modified to include a vector containing a polynucleotide on-switch that expresses a transgene such as a suicide gene that promotes apoptosis, e.g., a caspase.
- Stem cells suitable for practicing the invention include and are not limited to hematopoietic stem cells (HSC), embryonic stem cells, pluripotent stem cells, or mesenchymal stem cells.
- the invention provides engineered mammalian cells which express a transgene that is operably fused to a polynucleotide switch described herein.
- Cpf1-dependent polynucleotide switch of the present invention, various mammalian cells can be employed, either by introducing a vector of the invention or by stably integrating the rDNA described herein into the host genome.
- Vectors encoding Cpf1-dependent polynucleotide switches can be introduced into an appropriate host cell (e.g., a mammalian cell such as a 293T cell, N2a cell, or CHO cell) by any means known in the art.
- the cells can transiently or stably express the introduced transgene.
- mammalian cells are used in these embodiments of the invention.
- Mammalian expression systems allow for proper post-translational modifications of expressed mammalian proteins to occur, e.g., proper processing of the primary transcript, glycosylation, phosphorylation and, advantageously, secretion of the expressed product.
- Suitable cells include cells from rodent, cow, goat, rabbit, sheep, non-human primate, human, and the like.
- Specific examples of cell lines include CHO, BHK, HEK293, N2a, VERO, HeLa, COS, MDCK, and WI38.
- any convenient protocol may be employed for in vitro or in vivo introduction of the vector into the host cell, depending on the location of the host cell.
- the expression vector may be introduced directly into the cell under cell culture conditions permissive of viability of the host cell, e.g., by using standard transduction techniques.
- the targeting vector may be administered to the organism or host in a manner such that the vector is able to enter the host cell(s), e.g., via an in vivo or ex vivo protocol.
- by in vivo it is meant that the target construct is administered to a living body of an animal.
- by ex vivo it is meant that cells or organs are modified outside of the body. Such cells or organs are typically returned to a living body. Techniques well known in the art for the transfection of cells can be used for the ex vivo administration of vectors. The exact formulation, route of administration and dosage can be chosen empirically. See e.g.
- DNA and RNA vectors can be delivered with cationic lipids (Goddard et al., Gene Therapy 4:1231-1236, 1997; Gorman et al., Gene Therapy 4:983-992, 1997; Chadwick et al., Gene Therapy 4:937-942, 1997; Gokhale et al., Gene Therapy 4:1289-1299, 1997; Gao and Huang, Gene Therapy 2:710-722, 1995), using viral vectors (Monahan et al., Gene Therapy 4:40-49, 1997; Onodera et al., Blood 91:30-36, 1998), by uptake of "naked DNA", and the like.
- the vectors or expression constructs of the invention can be introduced into the target cells via a liposome.
- the physical properties of liposomes depend on pH, ionic strength and the existence of divalent cations.
- Pharmaceutical preparations or compositions are typically employed in the practice of the various therapeutic embodiments of the invention.
- the pharmaceutical preparations contain a vector harboring the polynucleotide switch.
- a transgene sequence is operably linked to the polynucleotide switch in the vector as described herein.
- the pharmaceutical compositions of the invention can also contain a pharmaceutically acceptable carrier suitable for administration to a human or non-human subject.
- the pharmaceutically acceptable carrier can be selected from pharmaceutically acceptable salts, esters, and salts of such esters.
- the pharmaceutical compositions may be administered to a subject via any route including, but not limited to, intramuscular, buccal, rectal, intravenous or intracoronary routes.
- LbCpf1 DNase mutants inactivated GLuc as efficiently as wild-type LbCpf1.
- RNase LbCpf1 mutants retained their DNase activity when an appropriate crRNA was expressed from a U6 (Pol III) promoter.
- H759A, K768A, and K785A LbCpf1 variants, but not the Cpf1 DNase domain mutants D832A and E925A, mediated efficient cleavage in the EGFP gene of an integrated DNA double-strand break (DSB) reporter construct (Fig. 3a), as indicated by a T7E1 mismatch cleavage assay (Fig. 1e).
- GLuc expression was turned on by a polynucleotide switch of the present invention by cleavage of a protospacer in the EGFP coding region and its repair in a manner that changed the reading frame, thereby switching on the expression of GLuc.
- Wild-type LbCpf1 efficiently introduced mutations in an integrated transgene encoding a green-fluorescent protein variant engineered for a short half-life (EGFPd4), as indicated by a T7E1 assay (Fig. 2b) and by loss of GFP fluorescence (Fig. 2c, Fig. 4b). Nearly identical results were obtained when gR1 in the 3' UTR was replaced by a different guide RNA, gR3 (Fig. 4c and 4d). In both cases, LbCpf1 inactivated more than half of all GFP expression, consistent with efficient editing of the integrated EGFPd4 gene.
- wild-type LbCpf1 coexpressed with the GLuc transcript also efficiently induced expression of firefly luciferase (FLuc) from a DSB-induced gain-of-expression reporter plasmid (Certo et al., Nat. Methods 8:671-676, 2011) (Fig. 2d; assay depicted in Fig. 5a).
- This gain-of-expression was caused by the Cpf1 DNase activity cleaving a protospacer, the repair of which resulted in a change of reading frame, thus switching on FLuc expression.
- This experiment shows that our approach can be used to switch on the expression of a transgene.
- RNA transcripts (Fig. 2e) in a DSB-induced gain-of-expression assay (depicted in Fig. 5d).
- Two DSB reporter plasmids were used in this assay.
- One reporter, gR1T(+1)-FLuc(+3), carries the target sequence for gR1 and a DSB-responsive FLuc gene.
- the other reporter, gR3T(+1)-GLuc(+3), carries the target sequence for gR3 and a DSB-responsive GLuc gene.
- Example 3 Switching off transgene expression at RNA level
- the same Pol II transcript can contain a transgene, a scaffold RNA and a guide RNA (gRNA). In such embodiments where all of these elements reside on the same transcript, transgene expression can be shut off at the RNA level.
- the plasmid vector expressing both Firefly luciferase and the transcript containing the scaffold RNA and gRNA was co-transfected into 293T cells with an empty plasmid vector or a plasmid vector expressing either wild-type LbCpf1, RNase domain mutant LbCpf1 H759A, or DNase domain mutant LbCpf1 D832A.
- the protospacer overlaps with the splice acceptor AG dinucleotide (SA) site (Fig. 7a).
- wild-type LbCpf1 greatly diminished the expression of Firefly luciferase through a mechanism that includes cleavage of a protospacer overlapping the SA site.
- the switching off of luciferase expression was abolished in the context of the DNase domain mutant, indicating that the effect is mediated by the cleavage of vector DNA.
- the protospacer overlaps with the ATG start codon trinucleotide (Fig. 7b).
- the Firefly luciferase expression vector contained two protospacers, with the first protospacer overlapping with the splice acceptor AG dinucleotide site and the second protospacer overlapping with the ATG start codon trinucleotide (Fig. 7c).
- the scaffold RNA and gRNA were in the same Pol II transcript that expresses Firefly luciferase.
- the wild-type LbCpf1 efficiently inactivated the transgene.
- the DNase domain mutant (D832A) LbCpf1 also efficiently shut off expression, due to the RNase domain of LbCpf1 cleaving the Firefly luciferase mRNA at a site proximal to the scaffold RNA.
- the RNase domain mutant did not inactivate Firefly luciferase, since it could not process the scaffold and gRNA.
- the protospacer resided in the coding region of the Firefly luciferase transgene (Fig. 7d).
- Nineteen of nineteen protospacer sequences within the Firefly luciferase coding region were functional as the protospacer for the self-directed inactivation of a transgene in the present invention.
- mice were inoculated with 10^10 copies of an AAV vector encoding Firefly luciferase as a model transgene.
- the same AAV vector that encoded Firefly luciferase also included a U6 promoter that expresses a transcript containing a scaffold RNA and a guide RNA (gRNA).
- An otherwise-identical negative control vector was constructed that contained Firefly luciferase, but did not express a transcript containing a scaffold RNA and gRNA.
- the gRNA was engineered to match a protospacer sequence in the coding region of the Firefly luciferase transgene.
- a separate AAV vector encoding LbCpf1 or a control AAV vector was administered to the same site. Luciferase activity was quantified using a Xenogen imager on days 8, 14, 21, and 28 post-injection (Fig. 8).
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Biomedical Technology (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Medicinal Chemistry (AREA)
- Crystallography & Structural Chemistry (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
The present invention provides self-directed polynucleotide switches capable of turning on or turning off the expression of a transgene in mammalian cells from a vector in the presence of Cpf1. Also provided in the invention are methods for employing the Cpf1-dependent switches and expression vectors of the invention in controlling transgene expression in various clinical or industrial applications.
Description
Vectors With Self-Directed Cpf1-Dependent Switches
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] The subject patent application claims the benefit of priority to U.S. Provisional Patent Application Number 62/469,069 (filed March 9, 2017). The full disclosure of the priority application is incorporated herein by reference in its entirety and for all purposes.
STATEMENT CONCERNING GOVERNMENT SUPPORT
[0002] This invention was made with government support under AI091476-06 awarded by the National Institutes of Health. The government has certain rights in the invention.
BACKGROUND OF THE INVENTION
[0003] CRISPR-based genome editing is guided by a short CRISPR RNA (crRNA), which targets the Cas9 DNase activity to genomic sequences complementary to the crRNA and preceded by a short protospacer-adjacent motif (PAM). Recently, two effectors of type V-A CRISPR systems, the Cpf1 proteins of Lachnospiraceae bacterium ND2006 (LbCpf1) and Acidaminococcus sp. BV3L6 (AsCpf1), have been shown to edit mammalian cell genomes with fewer off-target cleavage sites than Streptococcus pyogenes Cas9 (SpCas9). Unlike Cas9 proteins, these enzymes do not require an additional trans-activating CRISPR RNA (tracrRNA), and they recognize a T-rich PAM distal from a staggered DNase cut site. The mature Cpf1 crRNA is composed of a 5' scaffold region (also described as a 5' handle or a direct repeat) and a 3' guide region. In bacteria, Cas9 relies on RNase III to excise crRNAs from a CRISPR array. In contrast, at least one member of the Cpf1 family, that of Francisella novicida U112 (FnCpf1), has its own RNase activity that can excise crRNA from a bacterial CRISPR array. However, FnCpf1 does not efficiently edit mammalian genomes, and it is not known whether the RNase activity of any Cpf1 is functional in mammalian cells.
[0004] A major hurdle preventing widespread use of gene-therapy vectors for delivery of biologics is the inability to turn off these therapies in case of adverse events. There is a great need in the art for more efficient genetic tools for controlling gene expression in various mammalian settings. The present invention addresses this and other needs.
SUMMARY OF THE INVENTION
[0005] In one aspect, the invention provides vectors or expression constructs that contain a transgene and a polynucleotide switch. The polynucleotide switch in the vectors harbors (a) a first sequence segment that contains a Cpf1 DNase target sequence having (1) a protospacer that is approximately 23±2 nucleotides in length and (2) a protospacer-adjacent motif (PAM) that is located 5' to the protospacer, and (b) a second sequence segment that encodes an RNA transcript that contains (1) a scaffold RNA containing a sequence 4-10 nucleotides in length and its reverse complement and (2) a guide RNA (gRNA) sequence that is substantially identical to the protospacer and is approximately 23±2 nucleotides in length. In addition, the polynucleotide switch is capable of undergoing a self-directed DNA modification that switches on or switches off the transgene in the vectors.
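The length and identity constraints in the preceding paragraph can be illustrated with a minimal Python sketch. It is not part of the disclosure: the class name `SwitchDesign`, the 0.9 threshold standing in for "substantially identical", and the example sequences are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List

PROTOSPACER_LEN = 23   # nominal length stated in the text
LEN_TOLERANCE = 2      # the "±2" stated in the text


@dataclass
class SwitchDesign:
    pam: str          # DNA; sits immediately 5' of the protospacer
    protospacer: str  # DNA target within the vector
    guide: str        # gRNA portion of the encoded transcript (RNA)

    @property
    def target_sequence(self) -> str:
        """Target written 5'->3': PAM first, then protospacer."""
        return self.pam + self.protospacer


def identity(a: str, b: str) -> float:
    """Fraction of matching positions; 0.0 if lengths differ."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def check_design(d: SwitchDesign, min_identity: float = 0.9) -> List[str]:
    """Return a list of problems; an empty list means all checks passed.

    min_identity is an assumed stand-in for "substantially identical".
    """
    problems = []
    lo, hi = PROTOSPACER_LEN - LEN_TOLERANCE, PROTOSPACER_LEN + LEN_TOLERANCE
    if not lo <= len(d.protospacer) <= hi:
        problems.append("protospacer length outside 23±2 nt")
    if not lo <= len(d.guide) <= hi:
        problems.append("guide length outside 23±2 nt")
    # Compare the RNA guide with the DNA protospacer in one alphabet.
    guide_as_dna = d.guide.upper().replace("U", "T")
    if identity(guide_as_dna, d.protospacer.upper()) < min_identity:
        problems.append("guide is not substantially identical to protospacer")
    return problems


# Made-up 23-nt protospacer, used only to exercise the checks.
ps = "GACCTACGGCAAGCTGACCCTGA"
good = SwitchDesign(pam="TTTC", protospacer=ps, guide=ps.replace("T", "U"))
bad = SwitchDesign(pam="TTTC", protospacer=ps, guide="GACCUACGGCAAGCUGAC")
```

Here `check_design(good)` returns an empty list, while the 18-nt guide in `bad` fails both the length and the identity check.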
[0006] In some vectors of the invention, the scaffold RNA in the RNA transcript is capable of being cleaved by a Type V CRISPR effector protein (e.g., a Cpf1 enzyme such as LbCpf1 or AsCpf1) in a mammalian cell, resulting in the generation of a mature crRNA of the CRISPR effector protein. In some of these vectors, the generated crRNA is capable of targeting the CRISPR effector protein to the Cpf1 DNase target sequence to cleave the vector within the protospacer. In some vectors, the encoded RNA transcript contains two or more scaffold RNAs. Some vectors of the invention can additionally contain an RNA polymerase II (Pol II) promoter. In some of these vectors, the second sequence segment is operably linked to the promoter for the transcript to be expressed from the RNA polymerase II (Pol II) promoter. In some vectors of the invention, the transgene contains at least one open reading frame (ORF).
[0007] In some vectors of the invention, the encoded scaffold RNA contains one or more structural elements selected from (a) a sequence 4-10 nucleotides in length comprising UCUAC and a reverse complement comprising GUAGA, (b) a U nucleotide at the first unpaired position 5' of the sequence 4-10 nucleotides in length, (c) a U nucleotide at the first unpaired position 3' of the sequence 4-10 nucleotides in length, (d) a U nucleotide at the first unpaired position 5' of the reverse complement, (e) a U nucleotide at the first unpaired position 3' of the reverse complement, and (f) a trinucleotide AAU at a position fewer than 5 nucleotides 5' of said sequence 4-10 nucleotides in length. In some of these vectors, the encoded scaffold RNA can further contain a CU dinucleotide, an AU dinucleotide or an AAG trinucleotide between the sequence 4-10 nucleotides in length and its reverse complement.
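As an illustration of structural elements (a) through (f), the following Python sketch locates a stem of 4-10 nucleotides with a downstream reverse complement and reports which elements are present. The 20-nt and 19-nt example scaffolds are assumptions chosen to be consistent with the elements listed above; the SEQ ID NOs themselves are not reproduced in this section. The 7-nt window used for element (f) is one possible reading of "fewer than 5 nucleotides 5' of" the stem sequence.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    return rna.translate(COMPLEMENT)[::-1]


def find_stem(rna: str, min_len: int = 4, max_len: int = 10):
    """Return (stem_start, stem_len, rc_start) for the longest stem whose
    reverse complement occurs downstream of it, or None if there is none."""
    for n in range(max_len, min_len - 1, -1):
        for i in range(len(rna) - n + 1):
            j = rna.find(reverse_complement(rna[i:i + n]), i + n)
            if j != -1:
                return i, n, j
    return None


def scaffold_features(rna: str) -> dict:
    """Report elements (a)-(f) plus the loop motif for a candidate scaffold."""
    rna = rna.upper().replace("T", "U")
    found = find_stem(rna)
    if found is None:
        return {}
    i, n, j = found
    stem, rc, loop = rna[i:i + n], rna[j:j + n], rna[i + n:j]
    has_loop = j > i + n

    def base(k: int) -> str:
        return rna[k] if 0 <= k < len(rna) else ""

    return {
        "a_stem_has_UCUAC": "UCUAC" in stem and "GUAGA" in rc,
        "b_U_5prime_of_stem": base(i - 1) == "U",
        "c_U_3prime_of_stem": has_loop and base(i + n) == "U",
        "d_U_5prime_of_rc": has_loop and base(j - 1) == "U",
        "e_U_3prime_of_rc": base(j + n) == "U",
        # Assumed reading: AAU ends with fewer than 5 nt before the stem.
        "f_AAU_upstream": "AAU" in rna[max(0, i - 7):i],
        "loop_motif": next((m for m in ("AAG", "CU", "AU") if m in loop), None),
    }


LB_SCAFFOLD = "AAUUUCUACUAAGUGUAGAU"  # assumed Lb-type example (20 nt)
AS_SCAFFOLD = "AAUUUCUACUCUUGUAGAU"   # assumed As-type example (19 nt)
```

With these assumed sequences, the Lb-type example satisfies all six elements with an AAG loop motif, and the As-type example differs only in carrying a CU loop motif.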
[0008] In some vectors of the invention, the PAM is a tetranucleotide containing at least two thymidine (T) nucleotides. In some vectors, the sequence of the guide RNA is identical to that of the protospacer. For some vectors, cleavage of the protospacer in a cell by a Cpf1 present therein will switch off expression of the transgene. For some other vectors, cleavage of the protospacer in a cell by a Cpf1 present therein will switch on expression of the transgene. For some vectors introduced into a cell (e.g., a mammalian cell), the concentration of the transcript expressed from the vector will be reduced when the corresponding Cpf1 enzyme is present.
[0009] Some vectors of the invention contain a promoter (e.g., a Pol II promoter) that harbors the protospacer. In some embodiments, the protospacer comprises or partially overlaps a TFIIB recognition element (BRE), a TATA box, an Initiator (Inr), a downstream promoter element (DPE), a splice acceptor AG dinucleotide, a splice donor GU dinucleotide, an ATG start codon trinucleotide, or an internal ribosomal entry site (IRES). Some vectors of the invention contain two or more Cpf1 DNase target sequences. In these embodiments, each Cpf1 DNase target sequence can contain (1) a protospacer that is approximately 23±2 nucleotides in length and (2) a protospacer-adjacent motif (PAM) that is located 5' to the protospacer. In some of these vectors, the protospacers of the two or more Cpf1 DNase target sequences are not identical to each other.
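One way to picture the selection of a protospacer that overlaps a functional element is the Python sketch below. It scans one strand for a tetranucleotide PAM with at least two T nucleotides followed by a 23-nt protospacer, then keeps the sites that overlap a feature such as an ATG start codon. The function names, the forward-strand-only scan, and the toy sequence are assumptions for illustration; a practical design tool would also scan the reverse complement and apply the PAM preferences of the specific Cpf1 enzyme.

```python
def is_pam(tetra: str, min_t: int = 2) -> bool:
    """PAM rule as stated in the text: 4 nt with at least two T."""
    return len(tetra) == 4 and tetra.count("T") >= min_t


def candidate_sites(dna: str, protospacer_len: int = 23):
    """Yield (pam, protospacer, start, end) for the given strand.

    start/end are 0-based protospacer coordinates, end exclusive.
    The PAM occupies the 4 nt immediately 5' of the protospacer.
    """
    dna = dna.upper()
    for p in range(len(dna) - 4 - protospacer_len + 1):
        pam = dna[p:p + 4]
        if is_pam(pam):
            start = p + 4
            end = start + protospacer_len
            yield pam, dna[start:end], start, end


def overlapping(sites, feat_start: int, feat_end: int):
    """Keep sites whose protospacer overlaps [feat_start, feat_end)."""
    return [s for s in sites if s[2] < feat_end and feat_start < s[3]]


# Toy sequence: an ATG at positions 20-22, with a TTTC PAM upstream.
dna = "GGGGGG" + "TTTC" + "C" * 10 + "ATG" + "C" * 10 + "GGGG"
sites = list(candidate_sites(dna))
hits = overlapping(sites, 20, 23)
```

In this toy sequence four overlapping PAM windows (GGTT, GTTT, TTTC, TTCC) each define a protospacer that spans the ATG, which shows how permissive the stated two-T rule is on its own.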
[0010] In some vectors of the invention, the ORF in the transgene encodes an amino acid sequence that is substantially identical to the amino acid sequence of at least a portion of a human protein. In some embodiments, the ORF encodes an amino acid sequence that is substantially identical to the Fc region of an antibody. In some embodiments, the ORF encodes an amino acid sequence other than an antibody Fc region that is substantially identical to one or more immunoadhesins or antibodies known in the art. In some embodiments, the ORF encodes at least a portion of one or more known cellular proteins such as cellular receptors, other cell surface molecules, enzymes, cytokines, chemokines, costimulatory molecules, interleukins, and physiologically active polypeptide factors. In some embodiments, the ORF encodes at least a portion of a chimeric antigen receptor (CAR). Some preferred vectors of the invention are based on or derived from an adeno-associated virus (AAV) vector or a retroviral vector. Some vectors of the invention can additionally contain an inducible expression cassette that encodes a Type V CRISPR effector protein (e.g., a Cpf1 enzyme).
[0011] In a related aspect, the invention provides mammalian cells into which one or more vectors of the present invention have been introduced. In another related aspect, the invention provides pharmaceutical compositions that contain at least one vector of the present invention. In still another aspect, the invention provides methods of switching on or switching off expression of a transgene. These methods typically entail (a) administering a vector of the invention to a subject, and (b) administering either a Cpf1 enzyme or a polynucleotide capable of expressing a Cpf1 enzyme to the subject.
[0012] A further understanding of the nature and advantages of the present invention may be realized by reference to the remaining portions of the specification and claims.
DESCRIPTION OF THE DRAWINGS
[0013] Figure 1 shows that LbCpf1 and AsCpf1 have RNase activities in mammalian cells. (a) The crRNAs recognized by LbCpf1 (SEQ ID NO:20) and AsCpf1 (SEQ ID NO:21) are represented. A 19-20 nucleotide scaffold RNA region in the crRNAs (SEQ ID NOs:18 and 19, respectively) is followed by a 23-base guide RNA (gRNA) complementary to the Cpf1 DNase target sequence. (b) When Cpf1 recognizes an appropriate scaffold RNA region present in the 3' UTR of an mRNA encoding GLuc, the message is cleaved and GLuc expression is halted. (c) Plasmids encoding LbCpf1, AsCpf1 or vector alone were cotransfected with GLuc-expressing plasmids bearing the indicated scaffold variants, and GLuc activity was measured. A small 'x' preceding the scaffold RNA indicates replacement of the initial AAUU sequence with UUAA. A large 'X' indicates that the scaffold sequence has been randomized. Among the 5 plasmids with a scaffold inserted, the first three have the LbCpf1 (Lb) scaffold, and the other two have the AsCpf1 (As) scaffold. (d) An assay similar to that in (c) except that LbCpf1 variants with mutations in their putative RNase domains or in their RuvC DNase domains were evaluated for their ability to inactivate GLuc expression. (e-f) LbCpf1 variants were assessed for their ability to edit an integrated gene when co-expressed with U6 promoter-driven crRNA, using (e) a T7E1 mismatch cleavage assay or (f) a double-strand break (DSB)-induced gain-of-expression assay, depicted in Figure 3a. (g) H759 and K785 are proximal to the phosphate at the 5' end of the Lb scaffold. The first and second bases of the scaffold RNA, A(-20), A(-19), are also indicated. Experiments shown are representative of two (panel c), three (panel e), or four (panels d and f) performed with nearly identical results. Data points in panels c, d, and f represent mean ± s.d. of three biological replicates.
[0014] Figure 2 shows that crRNA excised from a Pol II-expressed RNA transcript can efficiently edit a mammalian genome. (a) An mRNA encoding GLuc with two Lb scaffold regions (sR) separated by a 23-base guide (gR1) can be cleaved by LbCpf1. If both scaffold RNA regions are cleaved, the GLuc message is degraded and the resulting crRNA can be loaded into LbCpf1 to edit a reporter transgene. (b-c) Wild-type LbCpf1 or the indicated LbCpf1 variants were co-expressed with GLuc mRNA with two Lb scaffold RNAs separated by a gRNA (sR-gR1-sR) in its 3' UTR, as represented in (a). gR1 is complementary to an integrated GFP gene. Genome editing is efficient and dependent on both RNase and DNase activity, as indicated with a T7E1 mismatch cleavage assay (b), or loss of GFP expression as measured by flow cytometry (c). (d) The same GLuc-sR-gR1-sR transcript used in (b) and (c) was used in a DSB-induced gain-of-expression assay, as depicted in Figure 5a. (e) U6 promoter-driven guide RNAs used in the subsequent panel are represented. (f) The indicated U6-expressed crRNAs were co-expressed with LbCpf1 in the presence of a DSB-induced FLuc (left) or GLuc (right) reporter gene, as depicted in Figure 5d. When expressed in tandem, both crRNAs are active. (g) A direct comparison of the efficiencies of Pol II- and Pol III-expressed crRNA using a T7E1 mismatch cleavage assay. Cells stably expressing EGFP (left) or EGFPd4 (right) were transfected with a plasmid expressing LbCpf1 together with a plasmid encoding the Lb sR-gR1 crRNA expressed from a Pol III (U6) or Pol II (CMV) promoter, as indicated. Experiments shown are representative of at least two performed with nearly identical results. Data points in panels c, d, and f represent mean ± s.d. of three biological replicates.
[0015] Figure 3 shows assays and additional results for Figure 1. (a) Depiction of the assay used in Figure 1f. A gene encoding EGFP and GLuc separated by a foot-and-mouth disease 2a protease (F2A) was integrated into the genome of 293T cells. EGFP is initially encoded in the +1 frame, whereas F2A and GLuc are encoded in the +3 frame and the GLuc start methionine has been eliminated, so that only EGFP is expressed. In the presence of Cpf1 and an appropriate crRNA targeting the EGFP gene, a frameshift is induced by nonhomologous end joining (NHEJ), inactivating EGFP expression. In approximately one-third of editing events, the F2A and GLuc genes are placed in the same frame as the EGFP start methionine, and GLuc expression is detected. (b) A diagram indicating the EGFP locations and sequence of the guides tested in the subsequent panel. Shown in the figure are 8 sR-gR sequences (also referred to as "Cpf1 DNase recognition sequences" or "Cpf1 DNase target sequences" herein) in the target, which contain a 5'-terminal PAM motif and a protospacer sequence (SEQ ID NOs:1-8, respectively). The protospacer sequences themselves (SEQ ID NOs:9-16, respectively), which are denoted "Guide" in the figure, are identical to the guide RNA sequences expressed from the U6 promoter-driven vector described in Figure 1 (e-f). (c) crRNAs with the indicated guides were expressed from a Pol III (U6) promoter and assayed for their abilities to induce a DSB and frameshift necessary for GLuc expression in the presence of LbCpf1. Guide 1 (gR1) and Guide 3 (gR3) (SEQ ID NOs:9 and 11, respectively) were found to be most efficient and were used in subsequent assays. (d) The abilities of LbCpf1 and AsCpf1 to induce GLuc expression were compared in the same assay. (e, f) Experiments similar to those in Figure 1f except that guide 3 (e) and guide 8 (f) were used. Experiments shown are representative of at least two with similar results. Data points in panels c-f represent mean ± s.d. of three biological replicates.
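The "approximately one-third" figure in panel (a) follows from reading-frame arithmetic, which the short Python sketch below works through. The numbering convention is an assumption made for illustration: frame +1 is taken as codon offset 0 relative to the EGFP start methionine, frame +3 as offset 2, and an insertion of k nucleotides at the repair site shifts everything downstream by +k.

```python
def restores_frame(indel: int, downstream_frame: int, upstream_frame: int = 1) -> bool:
    """True if a net indel at the repair site puts the downstream ORF
    in frame with the upstream start codon.

    indel: net length change (+ for insertion, - for deletion).
    Frames are numbered 1..3; frame f is treated as codon offset f - 1
    (an assumed convention, see the lead-in).
    """
    return (downstream_frame - 1 + indel) % 3 == (upstream_frame - 1) % 3


# Net indels from -6 to +6 (excluding 0) that would bring a +3-frame
# GLuc into the +1 frame of the EGFP start methionine.
rescuing = [k for k in range(-6, 7) if k != 0 and restores_frame(k, downstream_frame=3)]
```

Under this convention `rescuing` contains -5, -2, 1 and 4, that is, one residue class modulo 3. If NHEJ outcomes were spread evenly over the three residue classes, which is an idealization, one third of editing events would switch the downstream gene on.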
[0016] Figure 4 shows additional results for Figure 2 (panels a-c). (a) An assay identical to that depicted in Figure 1a demonstrating that inactivation of the GLuc message represented in Figure 2a is dependent on the RNase but not the RuvC DNase activities of LbCpf1. (b) Key flow cytometry experiments used to generate Figure 2c. Inactivation of EGFP activity by LbCpf1 depends on H759 in its RNase domain. (c) An experiment identical to that in Figure 2c, except that Guide 3 was used to inactivate EGFP. (d) Key flow cytometry experiments used to generate the figure in (c). Again, inactivation of EGFP depends on H759. Experiments shown are representative of at least two with similar results. Data points in panels a and c represent mean ± s.d. of three biological replicates.
[0017] Figure 5 shows assays and additional results for Figure 2 (panels d-f). (a) Depiction of the assay used in Figure 2d and in Figure 4 (b-c). The RNase activity of LbCpf1 excises sR-gR1 crRNA from the GLuc message, inactivating GLuc. The LbCpf1/crRNA complex introduces a double-strand break (DSB), which, when repaired by NHEJ, can place the FLuc gene in-frame with a start methionine preceding the edited region. (b) An assay similar to that in Figure 2d except that no scaffold RNA or gRNA regions were present in the 3' UTR of the GLuc message. (c) An assay similar to that in Figure 2d and (b) except that the second scaffold RNA region was not present in the 3' UTR of the GLuc message. (d) A depiction of the assay used in Figure 2f. Tandem crRNAs were expressed from a Pol III (U6) promoter. One crRNA, bearing Guide 1 (gR1), recognized and could induce a frameshift necessary for FLuc expression. Another, bearing Guide 3 (gR3), recognized and could induce a frameshift necessary for GLuc expression. These crRNAs were assayed individually and in both tandem orders in Figure 2f. Experiments shown are representative of two with similar results. Data points in panels b and c represent mean ± s.d. of three biological replicates.
[0018] Figure 6 shows that the self-directed polynucleotide switch is capable of inactivating the vector containing it at the RNA level. A first plasmid vector containing a Firefly luciferase transgene was co-transfected into 293T cells with an empty plasmid vector or a plasmid vector expressing either wild-type LbCpf1, RNase domain mutant LbCpf1 H759A, or DNase domain mutant LbCpf1 D832A. The Firefly luciferase transgene was expressed as a single Pol II transcript, which either did not include a scaffold RNA and guide RNA (gRNA) (a), or did include a scaffold RNA and gRNA (b). Firefly luciferase (FLuc) expression is indicated in relative light units (RLU).
[0019] Figure 7 shows various strategies for a self-directed polynucleotide switch to inactivate the vector containing it at the DNA level. The vectors shown here express Firefly luciferase from a Pol II promoter and express a second transcript containing a scaffold RNA and guide RNA (gRNA) from a U6 Pol III promoter. The plasmid vector expressing both Firefly luciferase and the transcript containing the scaffold RNA and gRNA was co-transfected into 293T cells with an empty plasmid vector or a plasmid vector expressing either wild-type LbCpf1, RNase domain mutant LbCpf1 H759A, or DNase domain mutant LbCpf1 D832A. In a first strategy for inactivating transgene expression, the protospacer overlaps with the splice acceptor AG dinucleotide (SA) (a). In a second strategy for inactivating transgene expression, the protospacer overlaps with the ATG start codon trinucleotide (b). In a third strategy for inactivating transgene expression, the Firefly luciferase expression vector contained two protospacers, the first protospacer overlapping with the SA site, and the second protospacer overlapping with the ATG start codon trinucleotide (c). In a fourth strategy for inactivating transgene expression, the protospacer resided in the coding region of the Firefly luciferase transgene (d). Multiple protospacer sequences within the Firefly luciferase transgene were evaluated (Fluc-gR1 through Fluc-gR19). Firefly luciferase (FLuc) expression is indicated in relative light units (RLU).
[0020] Figure 8 shows the inactivation of a transgene in vivo. Mice were inoculated in the left gastrocnemius muscle with 10^10 copies of an AAV vector encoding Firefly luciferase. In addition to the Firefly luciferase transgene, the vector contained a U6 promoter that expresses a transcript containing a scaffold RNA and a guide RNA (gRNA). An otherwise-identical negative control vector was constructed that encoded Firefly luciferase but not a scaffold RNA and gRNA. The gRNA recognizes a protospacer sequence in the coding region of the Firefly luciferase transgene. At the same time that these luciferase-encoding AAV vectors were administered to the mice by intramuscular injection, a separate AAV vector encoding LbCpf1 or a control AAV vector (CNTL) was administered to the same site. Luciferase activity was measured using a Xenogen imager on days 8, 14, 21, and 28 post-injection of the animals that received the AAV vector lacking (a) or containing (b) the scaffold RNA and gRNA. The luciferase signal from each mouse was quantified using the Xenoimager software for both groups of mice, which received the AAV vector lacking (c) or containing (d) the scaffold RNA and gRNA.
DETAILED DESCRIPTION OF THE INVENTION
I. Overview
[0021] Cpf1 is a Type V CRISPR-effector protein with greater specificity than Cas9 in genome-editing applications. The present invention is predicated in part on the discoveries by the present inventors that some Cpf1 proteins have RNase activities that can excise CRISPR RNAs (crRNAs) from a single Pol II-driven RNA transcript expressed in mammalian cells. Specifically, the inventors observed and assessed the utility of RNase activity of LbCpf1 and AsCpf1 for genome-editing applications. As detailed herein, it was found that AsCpf1 and LbCpf1 can excise multiple crRNAs from a single RNA transcript expressed from either a Pol II or a Pol III promoter. Use of a Pol II promoter allows regulated and tissue-specific control of crRNA expression, and more efficient expression of long transcripts. It was also found that Pol II-expressed crRNAs were consistently more efficient at mediating genome editing than those expressed from a Pol III promoter. This observation may reflect in part the fact that only Pol II transcripts are efficiently exported to the cytoplasm where they might more easily interact with recently translated Cpf1.
[0022] The ability to excise multiple crRNAs simplifies the targeting of a single gene or exon at multiple sites, or the targeting of multiple genes in a cell. Such targeting can increase the efficiency of a gene knockout, or facilitate the editing of many proteins in parallel for combinatorial screens. It also facilitates targeting of integrated viral genomes, such as that of HIV-1, whose sequence may not be wholly predictable. Finally, it enables in vivo genome editing of multiple targets with the commonly used adeno-associated virus vector, which can only accommodate Cpf1 or Cas9 and a single promoter for crRNA expression. Thus the RNase activity of AsCpf1 and LbCpf1 in mammalian cells expands the utility of these key CRISPR effector molecules.
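The excision of several crRNAs from one transcript can be pictured with the simplified Python model below. It is a sketch under stated assumptions, not a description of the enzyme's actual cleavage chemistry: the transcript is cut at the 5' edge of every scaffold occurrence, and a piece counts as a crRNA only if the scaffold is followed by 23±2 nucleotides before the next scaffold or the transcript end. The scaffold sequence, the two guide sequences, and the flanking sequences are made up for illustration.

```python
from typing import List

LB_SCAFFOLD = "AAUUUCUACUAAGUGUAGAU"  # assumed Lb-type scaffold (20 nt)


def excise_crrnas(transcript: str, scaffold: str = LB_SCAFFOLD,
                  guide_len: int = 23, tolerance: int = 2) -> List[str]:
    """Return scaffold+guide pieces found in a transcript (simplified model)."""
    rna = transcript.upper().replace("T", "U")
    # Collect non-overlapping scaffold start positions.
    starts = []
    i = rna.find(scaffold)
    while i != -1:
        starts.append(i)
        i = rna.find(scaffold, i + len(scaffold))
    crrnas = []
    for k, s in enumerate(starts):
        guide_start = s + len(scaffold)
        # The guide runs to the next scaffold, or to the transcript end.
        guide_end = starts[k + 1] if k + 1 < len(starts) else len(rna)
        guide = rna[guide_start:guide_end]
        if guide_len - tolerance <= len(guide) <= guide_len + tolerance:
            crrnas.append(scaffold + guide)
    return crrnas


# Hypothetical array: sR-g1-sR-g2-sR followed by a 3' tail.
g1 = "GACCUACGGCAAGCUGACCCUGA"
g2 = "CAUCUUCUUCAAGGACGACGGCA"
transcript = "GGGAGACC" + LB_SCAFFOLD + g1 + LB_SCAFFOLD + g2 + LB_SCAFFOLD + "A" * 40
pieces = excise_crrnas(transcript)
```

In this model the array yields two crRNAs, one per guide; the third scaffold only marks the 3' boundary of the second guide, because the 40-nt tail that follows it is too long to count as a guide.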
[0023] In accordance with these observations, the present invention provides polynucleotide switches that are dependent on Cpf1 enzymatic activities and expression vectors coexpressing such a polynucleotide switch and a target polypeptide of interest. Related clinical applications of the expression vectors harboring Cpf1-dependent polynucleotide switches as described herein, e.g., gene therapies, are also provided by the invention. The Cpf1-based expression controlling systems of the invention depend on the RNase activity of Cpf1 proteins in excising their own guide RNAs from mRNA transcripts made in mammalian cells. Polynucleotide switches derived from such activities can be used as permanent off-switches for AAV-delivered transgenes. As detailed herein, the polynucleotide switches and expression vectors of the invention have a number of useful properties that are advantageous over any polynucleotide switches that are currently employed in the art. First, they depend on a nuclease with limited off-target activity. Second, they immediately halt transgene expression through one mechanism (by degrading the transgene mRNA), while permanently halting expression through another (by degrading the DNA transgene itself). Third, they can be engineered so that they are only active in cells carrying the exogenously introduced transgene. Additionally, once an off-switch of the invention has eliminated the transgene, it automatically turns itself off.
[0024] It is noted that, unless otherwise specified, this invention is not limited to the particular methodology, protocols, and reagents described, as these may vary. Unless otherwise indicated, the practice of the present invention employs conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. For example, exemplary methods are described in the following references: Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (3rd ed., 2001); Brent et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (ringbound ed., 2003); Freshney, Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, Inc. (4th ed., 2000); and Weissbach & Weissbach, Methods for Plant Molecular Biology, Academic Press, NY, Section VIII, pp. 421-463, 1988. In addition, the following sections provide more detailed guidance for practicing the invention.
II. Definitions
[0025] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which this
invention pertains. The following references provide one of skill with a general definition of many of the terms used in this invention: Academic Press Dictionary of Science and
Technology, Morris (Ed.), Academic Press (1st ed., 1992); Oxford Dictionary of
Biochemistry and Molecular Biology, Smith et al. (Eds.), Oxford University Press (revised ed., 2000); Encyclopaedic Dictionary of Chemistry, Kumar (Ed.), Anmol Publications Pvt. Ltd. (2002); Dictionary of Microbiology and Molecular Biology, Singleton et al. (Eds.), John Wiley & Sons (3rd ed., 2002); Dictionary of Chemistry, Hunt (Ed.), Routledge (1st ed., 1999); Dictionary of Pharmaceutical Medicine, Nahler (Ed.), Springer-Verlag Telos (1994);
Dictionary of Organic Chemistry, Kumar and Anandand (Eds.), Anmol Publications Pvt. Ltd. (2002); and A Dictionary of Biology (Oxford Paperback Reference), Martin and Hine (Eds.), Oxford University Press (4th ed., 2000). Further clarifications of some of these terms as they apply specifically to this invention are provided herein.
[0026] As used herein, the singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to "a cell" includes a plurality of such cells, reference to "a protein" includes one or more proteins and equivalents thereof known to those skilled in the art, and so forth. Protospacers are spacer sequences in CRISPR loci in a bacterium that were inserted into a CRISPR locus from invading viral or plasmid DNA. On subsequent invasion, Cas9 nuclease attaches to tracrRNA:crRNA, which guides Cas9 to the invading protospacer sequence. But Cas9 will not cleave the protospacer sequence unless there is an adjacent PAM sequence. The spacer in the bacterial CRISPR loci will not contain a PAM sequence, and will thus not be cut by the nuclease. But the protospacer in the invading virus or plasmid will contain the PAM sequence, and will thus be cleaved by the Cas9 nuclease. For editing genes, guide RNAs (gRNAs) are synthesized to perform the function of the tracrRNA:crRNA complex in recognizing gene sequences having a PAM sequence at the 3'-end (5'-end for Cpf1).
[0027] Protospacer adjacent motif (PAM) is a 2-6 base pair DNA sequence immediately following the DNA sequence targeted by the Cas9 nuclease in the CRISPR bacterial adaptive immune system. PAM is a component of the invading virus or plasmid, but is not a component of the bacterial CRISPR locus. Cas9 will not successfully bind to or cleave the target DNA sequence if it is not followed by the PAM sequence. PAM is an essential targeting component (not found in the bacterial genome) which distinguishes bacterial self from non-self DNA, thereby preventing the CRISPR locus from being targeted and destroyed by the nuclease. The canonical PAM is the sequence 5'-NGG-3', where "N" is any nucleobase followed by two guanine ("G") nucleobases. Guide RNAs (gRNAs) can transport Cas9 to anywhere in the genome for gene editing, but no editing can occur at any site other than one at which Cas9 recognizes PAM.
[0028] The canonical PAM is associated with the Cas9 nuclease of Streptococcus pyogenes (designated SpCas9), whereas different PAMs are associated with the Cas9 proteins of the bacteria Neisseria meningitidis, Treponema denticola, and Streptococcus thermophilus. 5'-NGA-3' can be a highly efficient non-canonical PAM for human cells, but efficiency varies with genome location. Attempts have been made to engineer Cas9s to recognize different PAMs to improve the ability of CRISPR-Cas9 to edit genes at any desired genome location. Cas9 of Francisella novicida recognizes the canonical PAM sequence 5'-NGG-3', but has been engineered to recognize the PAM 5'-YG-3' (where "Y" is a pyrimidine), thus adding to the range of possible Cas9 targets. The Cpf1 nuclease of Francisella novicida recognizes the PAM 5'-TTN-3' or 5'-YTN-3'.
[0029] Unless otherwise noted, Cpf1 refers to AsCpf1, LbCpf1, their functional derivatives or variants (e.g., the divergent LbCpf1 exemplified herein), or any other Type V CRISPR effector protein. Two Cpf1-family proteins, AsCpf1 (from Acidaminococcus) and LbCpf1 (from Lachnospiraceae), have been shown to perform efficient genome editing in human cells. Cpf1 proteins are RNA-guided nucleases, similar to Cas9. They recognize a T-rich protospacer-adjacent motif (PAM), TTTN, but on the 5' side of the guide. This makes Cpf1 distinct from Cas9, which uses an NGG PAM on the 3' side. The cut Cpf1 makes is staggered. In AsCpf1 and LbCpf1, it occurs 23 bp after the PAM on the targeted (+) strand and 19 bp on the other strand. Cpf1 requires only a crRNA for activity and does not need a tracrRNA to also be present. Unless otherwise noted, Cpf1 as described herein for the present invention also broadly encompasses any other Type V CRISPR effector proteins beyond the specifically exemplified AsCpf1 and LbCpf1 enzymes. A Type V CRISPR effector protein refers to a CRISPR effector protein or enzyme that does not require a multiple-protein complex formation for its catalytic function and also does not require a tracrRNA (but instead is itself sufficient) for the maturation of its crRNA.
[0030] A "host cell" or "target cell" refers to a living cell into which a heterologous polynucleotide sequence is to be or has been introduced. The living cell includes both a cultured cell and a cell within a living organism. Means for introducing the heterologous polynucleotide sequence into the cell are well known, e.g., transfection, electroporation, calcium phosphate precipitation, microinjection, transformation, viral infection, and/or the like. Often, the heterologous polynucleotide sequence to be introduced into the cell is a replicable expression vector or cloning vector. In some embodiments, host cells can be engineered to incorporate a desired gene on their chromosome or in their genome. Many host cells that can be employed in the practice of the present invention (e.g., CHO cells) are well known in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (3rd ed., 2001); and Brent et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (ringbound ed., 2003). In some preferred embodiments, the host cell is a mammalian cell.
[0031] The term "operably linked" or "operably associated" refers to functional linkage between genetic elements that are joined in a manner that enables them to carry out their normal functions. For example, a gene is operably linked to a promoter when its transcription is under the control of the promoter and the transcript produced is correctly translated into the protein normally encoded by the gene. Similarly, a Cpf1-dependent polynucleotide switch sequence is operably linked to a transgene if its insertion into the 5'-UTR or 3'-UTR of the gene, as described herein, allows control of the transgene expression by Cpf1 RNase digestion of the mRNA transcript of the transgene.
[0032] A "substantially identical" nucleic acid or amino acid sequence refers to a polynucleotide or amino acid sequence which comprises a sequence that has at least 75%, 80% or 90% sequence identity to a reference sequence as measured by one of the well known programs described herein (e.g., BLAST) using standard parameters. The sequence identity is preferably at least 95%, more preferably at least 98%, and most preferably at least 99%. In some embodiments, the subject sequence is of about the same length as compared to the reference sequence, i.e., consisting of about the same number of contiguous amino acid residues (for polypeptide sequences) or nucleotide residues (for polynucleotide sequences). Polynucleotide sequences are no less substantially identical if they are composed of RNA or DNA, despite the chemical differences between RNA and DNA, and the presence of uracil in RNA instead of thymidine in DNA.
[0033] Sequence identity can be readily determined with various methods known in the art. For example, the BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915 (1989)). Percentage of sequence identity is determined
by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
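The percentage calculation described above can be illustrated with a minimal sketch. It assumes the two sequences have already been optimally aligned by an external program and are supplied as equal-length strings with "-" marking gaps; the function name percent_identity and the option to treat U and T as equivalent are illustrative assumptions and are not taken from BLAST or any other program.

```python
def percent_identity(aligned_ref: str, aligned_query: str,
                     rna_equals_dna: bool = True) -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Both inputs must have the same length; '-' marks a gap position.
    Illustrative sketch only; the alignment itself is assumed to be given.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("comparison window is empty")
    a, b = aligned_ref.upper(), aligned_query.upper()
    if rna_equals_dna:
        # Treat uracil and thymidine as the same residue for comparison.
        a, b = a.replace("U", "T"), b.replace("U", "T")
    # A matched position carries the identical residue in both sequences;
    # a gap never counts as a match.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    # Divide by the total number of positions in the window, times 100.
    return 100.0 * matches / len(a)
```

For instance, a nine-position window with one gap and eight matched positions yields 8/9 x 100, or roughly 88.9% identity.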
[0034] As used herein, "complementary" or "complement" refers to a nucleotide or nucleotide sequence that hybridizes to a given nucleotide or nucleotide sequence. For instance, for DNA, the nucleotide A is complementary to T and vice versa, and the nucleotide C is complementary to G and vice versa. For instance, in RNA, the nucleotide A is complementary to the nucleotide U and vice versa, and the nucleotide C is complementary to the nucleotide G and vice versa.
[0035] Reverse complement means a sequence that is complementary to another sequence, but in reverse order. For instance, in the context of DNA, the reverse complement of AATTGG is CCAATT. Likewise, in the context of RNA, the reverse complement of AAUUGG is CCAAUU.
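As an illustration of this definition, the following minimal sketch computes a reverse complement for DNA or RNA. The function name reverse_complement and its rna flag are assumptions made for this example only.

```python
# Complement tables for DNA (A<->T, C<->G) and RNA (A<->U, C<->G).
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the complement of seq, written in reverse order."""
    table = _RNA_COMPLEMENT if rna else _DNA_COMPLEMENT
    return seq.upper().translate(table)[::-1]


print(reverse_complement("AATTGG"))            # DNA example from the text
print(reverse_complement("AAUUGG", rna=True))  # RNA example from the text
```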
[0036] "Paired" and "unpaired" refer to Watson-Crick base pairs, in either the context of DNA, RNA or a DNA-RNA hybrid. A sequence is said to be paired with its reverse complement.
[0037] A cell has been "transformed" or "transfected" by exogenous or heterologous polynucleotide (or "a transgene" or "a target gene" as used interchangeably herein) when such polynucleotide has been introduced inside the cell. The transforming polynucleotide may or may not be integrated (covalently linked) into the genome of the cell. In prokaryotes, yeast, and mammalian cells for example, the transforming polynucleotide may be maintained on an episomal element such as a plasmid. With respect to eukaryotic cells, a stably transformed cell is one in which the transforming polynucleotide has become integrated into a chromosome so that it is inherited by daughter cells through chromosome replication. This stability is demonstrated by the ability of the eukaryotic cell to establish cell lines or clones comprised of a population of daughter cells containing the transforming polynucleotide. A "clone" is a population of cells derived from a single cell or common ancestor by mitosis. A
"cell line" is a clone of a primary cell that is capable of stable growth in vitro for many generations.
[0038] A "vector" or "construct" is a non-naturally occurring nucleic acid with or without a carrier that can be introduced into a cell, or has been introduced into a cell. Vectors that have been introduced into a cell include transfected plasmids and integrated DNA molecules, including those resulting from retroviral integration, integration of an AAV vector, and integration by homologous recombination. Vectors capable of directing the expression of heterologous polynucleotide or transgene sequences encoding for one or more polypeptides are referred to as "expression vectors" or "expression constructs". The cloned transgene sequence or open reading frame (ORF) is usually placed under the control of (i.e., operably linked to) certain regulatory sequences such as promoters, enhancers and polynucleotide switch sequences.
[0039] A transgene is any transcription unit contained within a vector. The transgene may or may not encode a protein or proteins. For instance, the transgene may encode one or more shRNA, miRNA, ribozyme, or protein.
[0040] As used herein, "AAV" is adeno-associated virus, and may be used to refer to the naturally occurring wild-type virus itself or derivatives thereof. The term covers all subtypes, serotypes and pseudotypes, and both naturally occurring and recombinant forms, except where required otherwise. Pseudotyped AAV refers to an AAV that contains capsid proteins from one serotype and a viral genome including 5'-3' ITRs of a second serotype. The abbreviation "rAAV" refers to recombinant adeno-associated viral particle or a recombinant AAV vector (or "rAAV vector"). An "AAV virus" or "AAV viral particle" refers to a viral particle composed of at least one AAV capsid protein (preferably by all of the capsid proteins of a wild-type AAV) and an encapsidated polynucleotide. If the particle comprises a heterologous polynucleotide (i.e., a polynucleotide other than a wild-type AAV genome such as a transgene to be delivered to a mammalian cell), it is typically referred to as "rAAV".
[0041] A retrovirus (e.g., a lentivirus) based vector or retroviral vector means that genome of the vector comprises components from the virus as a backbone. The viral particle generated from the vector as a whole contains essential vector components compatible with the RNA genome, including reverse transcription and integration systems. Usually these will include the gag and pol proteins derived from the virus. If the vector is derived from a lentivirus, the viral particles are capable of infecting and transducing non-dividing cells.
Recombinant retroviral particles are able to deliver a selected exogenous gene or polynucleotide sequence such as therapeutically active genes, to the genome of a target cell.
III. Cpf1-based polynucleotide switches for controlling transgene expression
[0042] The present invention provides polynucleotide switches that are dependent on the RNase and DNase activities of a Type V CRISPR effector protein (e.g., a Cpf1 enzyme). These switches and related expression vectors of the invention provide exceptional tools for controlling transgene expression. For example, there is currently no effective way to turn off expression of a transgene delivered by adeno-associated viral vectors (AAVs). The off-switches of the invention allow immediate and permanent termination of AAV-mediated transgene expression. As exemplified herein, the off-switches can be used for terminating expression of an AAV-delivered transgene that expresses any antibody or other protein therapeutic, such that its safety more closely matches that of the protein therapeutic not expressed by a gene delivery vector. Likewise, it may be desirable to turn on a transgene. As exemplified herein, the polynucleotide switches of the present invention can be used for switching on the expression of a transgene. For example, it may be desirable to eliminate a cell containing a vector by switching on a suicide gene such as a caspase. Thus, the present invention includes polynucleotide switches for turning on or turning off the expression of a transgene from a vector.
[0043] Typically, the polynucleotide switches of the invention contain a Cpf1 DNase target sequence that can be targeted by a Type V CRISPR effector protein, e.g., Cpf1 enzymes from two bacterial species, Lachnospiraceae bacterium (Lb) and Acidaminococcus sp. (As) as exemplified herein. The switches also contain a second sequence segment or polynucleotide motif that encodes a transcript that is capable of becoming the crRNA for the Cpf1 enzymes. In various embodiments, the Cpf1 DNase target sequence in the polynucleotide switches of the invention contains a protospacer motif and a protospacer-adjacent motif (PAM) that is typically located 5' to the protospacer. The protospacer motif can contain about 15 to about 30 nucleotides that are specifically targeted by the crRNA of a Cpf1 enzyme disclosed herein. In some embodiments, the protospacer is about 23±5 nucleotides, 23±4 nucleotides, 23±3 nucleotides, 23±2 nucleotides, 23±1 nucleotides or 23 nucleotides in length. Some specific Cpf1 DNase target sequences that are suitable for the invention are described herein in Figure 3 (panel b) as SEQ ID NOs:1-8. Any of the protospacers (SEQ ID NOs:9-16, respectively) present in these Cpf1 DNase target sequences can be readily employed in the practice of the invention. In some preferred embodiments, the protospacer used in the polynucleotide switches of the invention can contain a sequence identical to Guide 1 (SEQ ID NO:9) or Guide 3 (SEQ ID NO:11) exemplified herein.
[0044] The protospacer-adjacent motif (PAM) in the Cpf1 DNase target sequence is a thymidine (T)-rich sequence motif. It is typically 2-6 nucleotide residues in length. Unless otherwise specified, the PAM is located at the 5' end of the protospacer in the Cpf1 DNase target sequence. In various embodiments, the PAM can contain 2, 3, 4, 5, or 6 T nucleotides. In some embodiments, the employed PAM is a trinucleotide comprising two T residues, e.g., TTN. In some other embodiments, the employed PAM is a tetranucleotide comprising three T residues, e.g., TTTN. Some specific PAM sequences suitable for the invention are exemplified in Figure 3 (panel b) herein. Thus, the Cpf1 DNase target sequence in some preferred polynucleotide switches of the invention can contain a PAM sequence of TTTA or TTTG as exemplified herein for sR-gR1 or sR-gR3.
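The arrangement of a T-rich PAM located 5' of a protospacer can be illustrated by a minimal sketch that scans one strand of a DNA sequence for candidate target sites. The function name find_cpf1_sites, the default TTTN PAM, the default 23-nucleotide protospacer length, and the test sequence are assumptions chosen for illustration; they are not the sequences of Figure 3, and the sketch does not search the opposite strand.

```python
import re


def find_cpf1_sites(dna: str, pam: str = "TTTN", protospacer_len: int = 23):
    """List (pam_start, pam, protospacer) for each PAM followed by a protospacer.

    'N' in the PAM matches any base. A lookahead is used so that
    overlapping candidate sites are all reported.
    """
    pam_re = pam.upper().replace("N", "[ACGT]")
    pattern = re.compile(rf"(?=({pam_re})([ACGT]{{{protospacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(dna.upper())]


# Hypothetical sequence: 4 nt of flank, a TTTA PAM, a 23-nt protospacer, 2 nt of flank.
example_protospacer = "ACGT" * 5 + "ACG"
example = "GGCC" + "TTTA" + example_protospacer + "CC"
for start, pam_seq, protospacer in find_cpf1_sites(example):
    print(start, pam_seq, protospacer)
```

Passing pam="TTN" would instead scan for the trinucleotide PAM mentioned above.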
[0045] As described herein, it was found that the Cpf1 orthologs exemplified herein (LbCpf1 and AsCpf1) have RNase activity that can be used to excise their guide RNAs from an mRNA transcript produced in mammalian cells. Accordingly, the second sequence segment in the polynucleotide switches encodes an RNA transcript that harbors a scaffold RNA (or "Cpf1 RNase target sequence") that mediates the cleavage of the transcript by a Type V CRISPR effector protein (e.g., Cpf1) RNase activity, as well as a gRNA that mediates the cleavage of its DNA target by Cpf1 DNase activity. In some embodiments, the gRNA is flanked on either side by a scaffold RNA. In these embodiments, where there are two scaffold RNA sequences, cleavage of both of the two scaffold RNA sequences located on either side of the gRNA by Cpf1 generates a mature crRNA. Thus, a transcript expressed by a vector can be processed into a mature crRNA by Cpf1.
[0046] The scaffold RNA sequence is generally 8-30 nucleotides in length. Such scaffold RNAs contain a hairpin RNA structure that is formed from a sequence that is typically 4-10 (or 5-10) nucleotides in length and a reverse complement of that sequence that is also 4-10 (or 5-10) nucleotides in length. Thus, the sequence 4-10 (or 5-10) nucleotides in length can also be termed "hairpin motif sequence" herein. Within the scaffold RNA, at a position fewer than 5 nucleotides 5' of the hairpin motif sequence, there is typically an AAU trinucleotide motif. Between the hairpin motif sequence and its reverse complement, there is an unpaired sequence that can contain, e.g., the dinucleotide CU, the dinucleotide AU, or the trinucleotide AAG. Other unpaired nucleotides typically include a uracil (U) nucleotide at the first unpaired position 5' of the hairpin motif sequence, at the first unpaired position 3' of the hairpin motif sequence, at the first unpaired position 5' of the reverse complement, and at the first unpaired position 3' of the reverse complement. In some embodiments, the scaffold RNA can be from about 15 to about 25 nucleotides in length. In some embodiments, the scaffold RNA can contain about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides. In some preferred embodiments, the scaffold RNA contains about 19 or 20 nucleotides. In some embodiments, the 5' end of the scaffold RNA, including the hairpin motif sequence, contains the sequence AAUUUCUACU (SEQ ID NO:17). For example, functional scaffold RNAs for LbCpf1 and AsCpf1 are AAUUUCUACUAAGUGUAGAU (SEQ ID NO:18) and AAUUUCUACUCUUGUAGAU (SEQ ID NO:19), respectively, as exemplified herein. Additionally, the scaffold RNA for a variant LbCpf1 exemplified herein is AAUUUCUACUAUUGUAGAU (SEQ ID NO:22). In various embodiments, the scaffold RNAs of the switches can be substantially identical (e.g., at least 90%, 95%, or 99% identical) to the scaffold RNAs shown in SEQ ID NO:18, SEQ ID NO:19 or SEQ ID NO:22. In some embodiments, the scaffold RNA of the polynucleotide switch contains a sequence that is identical to SEQ ID NO:18, SEQ ID NO:19 or SEQ ID NO:22.
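The hairpin organization described in this paragraph can be checked against the three exemplified scaffold RNAs with a minimal sketch. The function find_hairpin is a hypothetical helper written for this illustration: it looks for the longest sequence of 4-10 nucleotides whose exact Watson-Crick reverse complement occurs downstream, and it does not model wobble pairs or RNA folding energetics.

```python
_RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def rna_reverse_complement(seq: str) -> str:
    """Watson-Crick reverse complement of an RNA sequence."""
    return "".join(_RNA_PAIR[base] for base in reversed(seq))


def find_hairpin(scaffold: str, min_stem: int = 4, max_stem: int = 10):
    """Return (hairpin_motif, unpaired_loop, reverse_complement) or None.

    Tries the longest stem first; among equal lengths, the 5'-most wins.
    """
    for k in range(max_stem, min_stem - 1, -1):
        for i in range(len(scaffold) - 2 * k + 1):
            stem = scaffold[i:i + k]
            # The partner must start at or after the end of the stem.
            j = scaffold.find(rna_reverse_complement(stem), i + k)
            if j != -1:
                return stem, scaffold[i + k:j], scaffold[j:j + k]
    return None


scaffolds = {
    "SEQ ID NO:18 (LbCpf1)": "AAUUUCUACUAAGUGUAGAU",
    "SEQ ID NO:19 (AsCpf1)": "AAUUUCUACUCUUGUAGAU",
    "SEQ ID NO:22 (variant LbCpf1)": "AAUUUCUACUAUUGUAGAU",
}
for name, seq in scaffolds.items():
    print(name, find_hairpin(seq))
```

In each of the three scaffolds the sketch recovers the five-nucleotide motif UCUAC paired with GUAGA, with the unpaired region between them carrying AAG, CU, or AU flanked by U residues, which is consistent with the description above.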
[0047] Similar to the protospacer sequence, the gRNA can contain about 23±5 nucleotides. In various embodiments, the gRNA encoded by the second polynucleotide motif in the polynucleotide switches of the present invention can be 23±4 nucleotides, 23±3 nucleotides, 23±2 nucleotides, 23±1 nucleotides or 23 nucleotides in length. Typically, the sequence of the gRNA is at least 75% identical to the protospacer sequence. In various embodiments, the gRNA has a sequence that is at least 80%, at least 85%, at least 90% or at least 95% identical to the protospacer sequence in the same polynucleotide switch. Thus, in some embodiments, the gRNA encoded by the second sequence segment of the polynucleotide switches of the invention can have a sequence that is identical to any of the protospacer sequences exemplified in Figure 3, panel b (SEQ ID NOs:9-16). It is to be noted that the PAM motif is located 5' of the protospacer, and the sequence of the protospacer is read from 5' to 3'. Consistent with this orientation, while the gRNA actually binds to the complementary strand of the protospacer, its sequence is denoted herein as being "identical" and not "complementary" to the sequence of the protospacer. It is also to be noted that RNA and DNA sequences are said to be "identical," despite the chemical difference of ribose versus deoxyribose, and the inclusion of uracil (U) versus thymidine (T).
[0048] In certain embodiments designed to maximize the efficiency of a polynucleotide switch, the switches of the present invention can generate more than one crRNA, and the crRNAs can target more than one protospacer within the vector. As exemplified herein with LbCpf1, multiple crRNAs can be generated by the RNase activity of Cpf1 from a single transcript, in which each of the crRNAs is flanked on both sides by a scaffold RNA, i.e., one scaffold RNA on the 5' side and one scaffold RNA on the 3' side. Notably, a scaffold RNA that is located between two gRNAs can be shared. After processing by the RNase activity of Cpf1, the different crRNAs thus generated were shown to be able to separately edit multiple integrated transgenes that respectively harbor the target sequences of the crRNAs. Accordingly, some polynucleotide switches of the invention can contain more than one Cpf1 DNase target sequence. In these embodiments, each Cpf1 DNase target sequence harbors a specific protospacer motif that is preceded at the 5' end by a PAM. In some embodiments, the more than one Cpf1 DNase target sequences in the polynucleotide switch can contain different protospacer sequences. These different Cpf1 DNase target sequences can be used for switching on or switching off the expression of one or more transgenes or ORFs in the same vector. In some of these polynucleotide switches, the second sequence segment can encode a transcript containing one or more pre-crRNAs that respectively can contain distinct gRNAs, each targeting distinct protospacer sequences. In these embodiments, each of the encoded pre-crRNAs can contain a distinct gRNA, each flanked by a scaffold RNA sequence. Such polynucleotide switches can be employed to redundantly target a vector, thus switching on or switching off a transgene.
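The layout of a transcript carrying several gRNAs with shared scaffold RNAs can be sketched as simple string assembly. In the minimal illustration below, build_array and excise_guides are hypothetical helper names, the two guide sequences are invented placeholders rather than any of SEQ ID NOs:9-16, and the "excision" merely splits the transcript at each scaffold occurrence; it does not model the exact position at which Cpf1 cleaves within the scaffold.

```python
LB_SCAFFOLD = "AAUUUCUACUAAGUGUAGAU"  # SEQ ID NO:18, as given in the text


def build_array(scaffold: str, guides: list) -> str:
    """Lay out scaffold-guide-scaffold-guide-...-scaffold.

    A scaffold located between two guides is shared by both.
    """
    parts = [scaffold]
    for guide in guides:
        parts.append(guide)
        parts.append(scaffold)
    return "".join(parts)


def excise_guides(transcript: str, scaffold: str) -> list:
    """Return the segments that are flanked by a scaffold on both sides."""
    pieces = transcript.split(scaffold)
    # The first and last pieces lie outside the outermost scaffolds.
    return [piece for piece in pieces[1:-1] if piece]


# Hypothetical 23-nt guides, for illustration only.
guides = ["GACCUGAAGCUGAUCUGCACCAC", "CUGGACGGCGACGUAAACGGCCA"]
array = build_array(LB_SCAFFOLD, guides)
transcript = "GGG" + array + "AAAA"  # arbitrary flanking transcript sequence
print(excise_guides(transcript, LB_SCAFFOLD))
```

With n guides the array carries n + 1 scaffold copies rather than 2n, which reflects the sharing of scaffolds between adjacent gRNAs noted above.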
IV. Expression vectors harboring Cpf1-dependent polynucleotide switches
[0049] The Cpf1-dependent polynucleotide switches of the invention are suitable for controlling expression of transgenes or ORFs in various applications. Some embodiments of the invention are directed to expression vectors that contain the polynucleotide switch that is operably fused to a transgene sequence. In the expression vectors of the invention, the polynucleotide switch typically employs one or more guide RNA sequences that recognize rationally designed Cpf1 DNase target sequences placed in or adjacent to the transgene. For example, the switch can be operably placed at the 3'-untranslated region of a transgene on a viral vector (e.g., an AAV vector). Once the switch is activated, e.g., via the addition of the corresponding Cpf1 enzyme, the transgene is switched on or switched off. Unless otherwise noted, the term "transgene" is used herein to refer to a transcription unit of a vector, including a region that is transcribed into RNA from the vector template, and any operably-linked enhancer and promoter sequences. The transgene can contain an open reading frame (ORF), which can be translated into a protein, or alternatively, can be a non-coding RNA, including, e.g., an antisense RNA, an shRNA, an miRNA, an aptamer, or a ribozyme. Disruption of transgene expression can be achieved by both an immediate degradation of its messenger RNA transcript and a permanent modification at one or more protospacer sites in the transgene DNA. The control of transgene expression by the vectors of the present invention has several additional advantages. First, Cpf1, already shown to be more specific than the better known CRISPR effector Cas9, would only be active in the presence of the transgene, due to the absence of a scaffold RNA in cells that have not been transduced by the vector, so there would be no off-target effects in non-transduced cells. This is because the guide RNA, necessary for activation of the Cpf1 DNase activity, is generated from the transgene, which is expressed by the vector. Second, the system is self-limiting, because after the transgene is eliminated, it no longer can be used to generate the guide RNA, thereby halting the generation of active Cpf1/crRNA complexes. Thus the off-target effects of this system are at an absolute minimum. The system can also be self-limiting when Cpf1 is provided transiently, either as protein, or as a polynucleotide that transiently expresses a Cpf1 protein.
[0050] The transgene regulated by the polynucleotide switch of the present invention typically contains an ORF. The ORF can encode any polypeptide of interest. By operably linking the transgene sequence to the polynucleotide switch, expression of the transgene can be switched off immediately (via Cpf1 RNase cleavage of the mRNA transcript of the transgene) and permanently (via Cpf1 DNase cleavage of the vector itself) as exemplified herein. In some embodiments, the transgene can be expressed in the same transcript that encodes the gRNA. Alternatively, the transgene is expressed separately, e.g., as a different transcript from the one that encodes the gRNA. As exemplified herein, expression of the gRNA and/or the transgene can be placed under the control of either a Pol II (e.g., CMV) or a Pol III (e.g., U6) promoter sequence.
[0051] In some embodiments, the protospacer of the polynucleotide switch contains a sequence element that regulates the expression of a transgene. The protospacer can be a region of a vector or a transgene that is not transcribed, the protospacer can be within a region of a transgene that is transcribed but not translated, or the protospacer can be within a region of a transgene that is translated. In some embodiments, the polynucleotide switch switches on or switches off transgene expression at the level of transcription. In some other embodiments, the polynucleotide switch switches on or switches off transgene expression at the level of RNA stability or transport. In some other embodiments, the polynucleotide switch switches on or switches off transgene expression at the level of translation.
[0052] In some preferred embodiments, both the gRNA and the transgene are expressed in a single transcript under the control of the same promoter, e.g., a Pol II promoter.
Typically, cleavage of the expression vector in or near the protospacer by the DNase activity of a Cpf1 will lead to permanent termination of the expression of the transgene. However, in some other embodiments, cleavage of the expression vector in or near the protospacer by the DNase activity of Cpf1 switches on the expression of the transgene. In some embodiments, the protospacer of the polynucleotide switch is located in or near the coding region of the transgene. In some other embodiments, the protospacer is located in an expression control sequence of the transgene, e.g., the promoter region. In these embodiments, the guide RNA in the second sequence segment of the polynucleotide switch is designed to be substantially identical (e.g., at least 80%, 90%, or 95% identical) to a chosen coding region or an expression control region of the transgene that is preceded by a T-rich PAM. Thus, some expression vectors of the invention contain a promoter sequence (e.g., an RNA Pol II promoter sequence) that harbors the protospacer of the Cpf1 DNase target sequence. In some embodiments, the protospacer completely contains one or more promoter elements. In some other embodiments, the protospacer partially overlaps with promoter elements. Such promoter elements, which can be contained within or partially overlapped by the protospacer, include, e.g., (i) a TFIIB recognition element (BRE), typically located at positions -37 to -32 5' of the first nucleotide of a transcript and often resembling the consensus sequence (G/C)(G/C)(G/A)CGCC, (ii) a TATA box containing the tetranucleotide motif TATA, (iii) an Initiator sequence (Inr) typically resembling the consensus sequence (C/T)(C/T)AN(T/A)(C/T)(C/T), or (iv) a downstream promoter element (DPE) typically located +28 to +32 nucleotides 3' of the first nucleotide of a transcript and often resembling the consensus sequence (A/G)G(A/T)(C/T)(G/A/C). In certain embodiments, the protospacer contains or overlaps transcription factor binding sites. Thus, some off-switch embodiments can include a protospacer that contains or overlaps the binding site of a transcriptional activator, whereas some on-switch embodiments can include a protospacer that contains or overlaps the binding site of a transcriptional repressor. In some embodiments, the protospacer contains or partially overlaps a splice acceptor site. Splice acceptor sites typically contain the AG dinucleotide. Canonical splice acceptor sites, but not all splice acceptor sites, contain a polypyrimidine tract 5' of the AG dinucleotide. A subset of splice acceptor sites contain the AG dinucleotide within a CAG trinucleotide. Thus, in some embodiments, the protospacer contains or partially overlaps an AG dinucleotide, CAG trinucleotide, or polypyrimidine tract. In certain embodiments, the PAM overlaps the polypyrimidine tract. In some other embodiments, the protospacer contains or partially overlaps a splice donor GU dinucleotide.
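By way of illustration only, the placement of a protospacer relative to promoter elements can be checked with a simple sequence scan. The Python sketch below is not part of the exemplified methods. It writes the consensus sequences quoted above as regular expressions, and it assumes a TTTV PAM (V = A, C, or G) located immediately 5' of a 23-nucleotide protospacer on the scanned strand. Both values are assumptions chosen for the example rather than values specified in this passage, and the function names are hypothetical.

```python
import re

# Consensus motifs quoted in the text, written as regular expressions.
PROMOTER_ELEMENTS = {
    "BRE": "[GC][GC][GA]CGCC",
    "TATA": "TATA",
    "Inr": "[CT][CT]A[ACGT][TA][CT][CT]",
    "DPE": "[AG]G[AT][CT][GAC]",
}


def find_protospacers(seq, pam="TTT[ACG]", length=23):
    """Return (start, end) spans of candidate protospacers on the given strand.

    Assumption: the PAM lies immediately 5' of the protospacer, and both the
    PAM pattern and the protospacer length are illustrative defaults.
    """
    seq = seq.upper()
    pattern = f"(?=({pam})([ACGT]{{{length}}}))"  # lookahead reports overlapping hits
    hits = []
    for match in re.finditer(pattern, seq):
        start = match.start(2)
        hits.append((start, start + length))
    return hits


def overlapping_elements(seq, protospacer):
    """List (name, start, end) for promoter elements that overlap a protospacer span."""
    seq = seq.upper()
    p_start, p_end = protospacer
    found = []
    for name, motif in PROMOTER_ELEMENTS.items():
        for match in re.finditer(f"(?=({motif}))", seq):
            e_start, e_end = match.start(1), match.end(1)
            if e_start < p_end and p_start < e_end:  # half-open interval overlap
                found.append((name, e_start, e_end))
    return found


if __name__ == "__main__":
    # Hypothetical promoter fragment: PAM, then a protospacer spanning a BRE and a TATA box.
    fragment = "CCTTTGGGGCGCCTATAAAGCAGTCCAGCAA"
    for span in find_protospacers(fragment):
        print(span, overlapping_elements(fragment, span))
```

Short motifs such as TATA match frequently by chance, so a hit from this scan only flags a candidate position; it does not show that the element is functional.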
[0053] In some embodiments, the protospacer sequence exists within the transgene at a position where Cpf1 cleavage, and the repair of the cleaved protospacer, affects translation. In some embodiments, the protospacer includes or overlaps an ATG start codon. In some embodiments, the protospacer exists within an open reading frame (ORF). In a subset of such embodiments, cleavage of the protospacer by Cpf1 can switch on or switch off the expression of a protein by changing the reading frame of the encoded protein. Changing the reading frame can, e.g., result in a premature stop codon, or restore a reading frame without a premature stop codon. In some other embodiments, the protospacer can be at a position that lies in any of the untranslated regions that are known to be involved in controlling mRNA translation, degradation, and/or localization. These include, e.g., stem-loop structures, alternative start codons and open reading frames, internal ribosome entry sites (IRESes), RNA instability elements, RNA stability elements, the woodchuck hepatitis virus post-transcriptional regulatory element (WPRE), and various other cis-acting elements that are bound by RNA-binding proteins. In some other embodiments, the vectors can additionally harbor sequences corresponding to the 5'-ETS and ITS elements of the precursor RNA sequence.
[0054] To control the expression of a specific transgene, the transgene is typically operably fused with the switch polynucleotide sequence on the expression construct. For controlling gene expression in mammalian cells, the expression constructs can be recombinantly produced with many vectors well known in the art. These include viral vectors such as recombinant adenovirus, retrovirus, lentivirus, herpesvirus, poxvirus, papilloma virus, or adeno-associated virus (AAV). The vectors can be present in liposomes, e.g., neutral or cationic liposomes, such as DOSPA/DOPE, DOGS/DOPE or DMRIE/DOPE liposomes, and/or associated with other molecules such as DNA-anti-DNA antibody-cationic lipid (DOTMA/DOPE) complexes. In some embodiments, the expression constructs are based on retroviral vectors. Some preferred embodiments of the invention can employ AAV vectors or adenoviral vectors for introducing into host cells the transgene that is operably linked to a polynucleotide switch of the invention. This is exemplified herein using a reporter gene (e.g., FLuc or GLuc) contained in the AAV vectors gR1T(+1)-FLuc(+3) and gR3T(+1)-GLuc(+3), which have a start codon in the functional +1 frame followed by a Cpf1 DNase target sequence or guide RNA target sequence (gRT) and a reporter gene in the +3 reading frame. Once Cpf1 makes a double-strand cleavage in the gRT region, the cellular non-homologous end joining system repairs the DNA double-strand break, introducing insertions or deletions of random length. When the resulting insertion or deletion shifts the reading frame appropriately, the reporter gene is placed in frame with the +1 frame translational start codon and reporter expression is activated ("switched on").
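The frame arithmetic behind this gain-of-expression design can be sketched in a few lines of Python. This is an illustration under a stated assumption, not a description of the exemplified constructs: a reporter "in the +3 frame" is taken to sit two nucleotides out of register with the +1 start codon, so that a net insertion or deletion restores expression only when it brings that offset back to a multiple of three. If the opposite convention applies, the specific indel sizes change, but still only one indel size in three restores the frame. The function names are hypothetical.

```python
def restores_frame(reporter_frame, indel):
    """Return True if a net indel places the reporter in the +1 frame.

    reporter_frame: 1, 2, or 3 (assumed convention: frame n is offset by
                    n - 1 nucleotides from the start codon's frame).
    indel: net length change at the repaired break
           (positive = insertion, negative = deletion).
    """
    offset = reporter_frame - 1
    return (offset + indel) % 3 == 0


def restoring_indels(reporter_frame, max_size):
    """List the nonzero indel sizes within +/- max_size that restore the frame."""
    return [
        size
        for size in range(-max_size, max_size + 1)
        if size != 0 and restores_frame(reporter_frame, size)
    ]


if __name__ == "__main__":
    # For a +3 reporter under the assumed convention.
    print(restoring_indels(3, 5))
```

Because only about one third of random-length repair outcomes can restore the frame, a reporter of this kind is expected to under-report the total cleavage frequency.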
[0055] Adeno-associated virus (AAV) is a small, nonenveloped virus that was adapted for use as a gene transfer vehicle. Adeno-associated virus vectors refer to recombinant adeno-associated viruses that are derived from nonpathogenic parvoviruses. They evoke essentially no cellular immune response, and produce transgene expression lasting months in most systems. Like adenovirus, adeno-associated virus vectors also have the capability to infect replicating and nonreplicating cells and are believed to be nonpathogenic to humans. Delivery of heterologous polynucleotide sequences via recombinant AAV can provide for safe, unobtrusive and sustained expression (> 2 years) of high levels of protein therapeutics.
[0056] In order to construct an adenoviral or retroviral based expression vector of the invention, the transgene and an operably linked polynucleotide switch sequence are often inserted into the viral genome in the place of certain viral sequences to produce a viral construct that is replication-defective. Methods for producing adenoviral and retroviral vectors are well-known in the art. Suitable host or producer cells for producing recombinant retroviruses or retroviral vectors according to the invention are also well known in the art (e.g., 293T cells exemplified herein). Thus, expression vectors of the invention that harbor a transgene sequence and an operably linked polynucleotide switch sequence can be readily constructed in accordance with methodologies known in the art of molecular biology in view of the exemplifications provided herein. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (3rd ed., 2001); Brent et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (ringbound ed., 2003); and Freshney, Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, Inc. (4th ed., 2000). Typically, the expression vectors are assembled by inserting into a suitable vector backbone the heterologous polynucleotide or transgene of interest and a polynucleotide switch described herein, as well as sequences encoding, e.g., selection markers, and other optional elements. Many virus based expression vector systems well known in the art can be used in the invention. Widely used retroviral vectors include those based upon murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations thereof (see, e.g., Buchscher et al., J. Virol. 66:2731-2739, 1992; Johann et al., J. Virol. 66:1635-1640, 1992; Sommerfelt et al., Virol. 176:58-59, 1990; Wilson et al., J. Virol. 63:2374-2378, 1989; Miller et al., J. Virol. 65:2220-2224, 1991; and PCT/US94/05700). Adeno-associated viral vectors have also been used in many reported gene therapy studies in research and clinical environments. See, e.g., Kaplitt et al., Lancet 369:2097-105, 2007; Daya et al., Clin Microbiol Rev. 21(4):583-593, 2008; Strobel et al., Am. J. Resp. Cell Mol. Biol. 53:291-302, 2015; and Kotterman et al., Nat. Rev. Genet. 15:445-451, 2014. Many viral vectors and related reagents (e.g., packaging cell lines) suitable for the invention can be obtained commercially. For example, AAV based expression vectors for practicing the invention can be based on the pAAV-MCS construct that is available from Agilent Technologies (Santa Clara, CA). Similarly, a number of retroviral vectors and compatible packaging cell lines are available from Clontech (Mountain View, CA). Examples of lentiviral based vectors include, e.g., pLVX-Puro, pLVX-IRES-Neo, pLVX-IRES-Hyg, and pLVX-IRES-Puro. Corresponding packaging cell lines are also available, e.g., the Lenti-X 293T cell line. In addition to lentiviral based vectors and packaging systems, other retroviral based vectors and packaging systems are also commercially available. These include the MMLV based vectors pQCXIP, pQCXIN, pQCXIQ and pQCXIH, and compatible producer cell lines such as the HEK 293 based packaging cell lines GP2-293, EcoPack 2-293 and AmphoPack 293, as well as the NIH/3T3-based packaging cell line RetroPack PT67. Any of these and other retroviral vectors and producer cell lines may be employed in the practice of the present invention.
V. Clinical applications of vectors containing Cpf1-dependent polynucleotide switches
[0057] Vectors with Cpf1-based polynucleotide switches can be readily employed in many clinical or therapeutic settings. For example, they can be included in various gene transfer vectors, thus preparing the gene transfer vector to subsequently be switched on or switched off in a subject upon the addition of Cpf1. In some preferred embodiments, the polynucleotide switches of the invention can be used to terminate expression of a transgene that is introduced into a subject for gene therapies. In other embodiments, the polynucleotide switches can be used to switch on the expression of a transgene encoded by a gene therapy vector administered to a subject. In certain embodiments, it is preferable to eliminate vector-transduced cells within a subject, rather than merely switch off the transgene. In such embodiments, a polynucleotide switch of the present invention can be employed to switch on the expression of a suicide gene, e.g., a caspase, in a subject who received the vector. The vectors of the invention have a number of advantages over currently known systems for controlling transgene expression in mammalian settings. As noted above, the vectors can contain a scaffold RNA for Cpf1 RNase activity in the 3'-untranslated region of the transgene. As a result, expression of the transgene can be inactivated immediately (by degrading its messenger RNA transcript) and permanently (by cleaving at one or more sites of the transgene DNA). Certain off-switch embodiments of the present invention, in which the scaffold RNA and gRNA components of the polynucleotide switch are included in the transgene, have at least two notable safety advantages that are relevant in human subjects and clinical settings. First, the crRNA-dependent DNase activity of Cpf1 would only be active in the presence of the transgene. Although there might be cells that receive Cpf1 without having previously been transduced with the vector expressing the transgene, there would be no off-target effects in these cells, which received only Cpf1 and lack the transgene. This is because the crRNA, which is necessary for activation of the DNase activity of Cpf1, is generated from the transgene. Second, the system is self-limiting, because after Cpf1 cleaves a protospacer resulting in the switching off of transgene expression, the transgene can no longer be processed to form a mature crRNA, precluding the DNase activity of Cpf1. Thus, the off-target effects of this polynucleotide switch in a subject are at an absolute minimum.
[0058] Retroviral vectors or recombinant retroviruses are widely employed in gene transfer in various therapeutic or industrial applications. For example, gene therapy procedures have been used to correct acquired and inherited genetic defects, and to treat cancer or viral infection in a number of contexts. The ability to express artificial genes in humans facilitates the prevention and/or cure of many important human diseases, including many diseases which are not amenable to treatment by other therapies. For a review of gene therapy procedures, see Anderson, Science 256:808-813, 1992; Nabel & Felgner, TIBTECH 11:211-217, 1993; Mitani & Caskey, TIBTECH 11:162-166, 1993; Mulligan, Science 926-932, 1993; Dillon, TIBTECH 11:167-175, 1993; Miller, Nature 357:455-460, 1992; Van Brunt, Biotechnology 6:1149-1154, 1998; Vigne, Restorative Neurology and Neuroscience 8:35-36, 1995; Kremer & Perricaudet, British Medical Bulletin 51:31-44, 1995; Haddada et al., in Current Topics in Microbiology and Immunology (Doerfler & Bohm eds., 1995); and Yu et al., Gene Therapy 1:13-26, 1994.
[0059] The invention accordingly provides methods or uses of the Cpf1-dependent switches or expression vectors in various clinical or industrial bioengineering contexts. In some embodiments, the vectors expressing a transgene can be transduced into host cells in various gene therapy and other clinical applications. For example, the transgene harbored by the vector can encode a therapeutic agent. These constructs can be transferred, for example, to treat cancer cells, to express immunomodulatory genes to fight viral infections, or to replace a gene's function lost as a result of a genetic defect. Depending on the subject's response to the treatment (e.g., adverse responses), progress of treatment (e.g., completion of treatment), or specificity of treatment (e.g., effects on non-target cells or tissue), it is often desired or necessary to terminate expression of the therapeutic agent introduced by the vector in the subject in an expeditious or permanent manner. Moreover, the polynucleotide switches of the present invention can be used to optimize dosing of a therapeutic protein encoded by a transgene. For example, Cpf1 can be provided in an amount that is sufficient to switch off transgene expression in some but not all of the vectors in a subject. Likewise, Cpf1 can be provided in an amount that is sufficient to switch off some but not all copies of a vector present within a cell. Thus, a transgene that is initially administered to a subject at a level that is too high can be switched off in some but not all transduced cells, in order to decrease the dose of the transgene to the subject. Transgene expression by the vectors of the present invention can be switched on or switched off, in some or all transduced cells in a subject, by administering to the subject a Cpf1 protein (or a Cpf1-expressing polynucleotide construct) that can cleave both the expression vector and the mRNA transcript.
Alternatively, expression of the therapeutic agent can be temporarily terminated by degrading the mRNA transcript but not the expression vector itself. Temporary shut-off at the RNA level can be achieved with engineered Cpf1 variants maintaining only the RNase activity as exemplified herein.
[0060] To utilize the Cpf1-based polynucleotide switches of the invention in gene therapy settings, a Cpf1 enzyme needs to be present to mediate cleavage of the protospacer in the vector administered to a subject undergoing treatment. In various embodiments, the Cpf1 protein can be delivered to the patient, e.g., in case of adverse events, via one of several approaches. These include, e.g., (1) delivery by AAV together with the original AAV transgene, with activation when necessary by a morpholino or small molecules, (2) delivery by AAV to the site of transgene expression, for example the liver or a specific set of muscle cells, and (3) delivery as an mRNA to the site of transgene expression. In some preferred embodiments, the Cpf1 protein or variant can be administered to the subject via a separate expression vector as exemplified herein.
[0061] As noted above, an ORF present in the transgene in the expression vectors of the invention can encode any polypeptide of interest. In some embodiments, the ORF or transgene encodes a polypeptide that is at least 90% identical to one or more human proteins. In some embodiments, the ORF can encode a constant region of an antibody, e.g., the Fc of IgG1, IgG2, IgG3, or IgG4, or other constant regions such as CH1, the constant region of a kappa light chain, or the constant region of a lambda light chain. In some embodiments, the transgene operably inserted into the polynucleotide switch containing expression vectors of the invention encodes a portion or a fragment (e.g., an antigen-binding fragment) derived from one or more immunoadhesins or antibodies. These include many known antibody-related molecules that are well characterized in the art, e.g., CD4-Ig, eCD4-Ig, PG9, PG16, PGT121, PGT128, 10-1074, PGT145, PGT151, CAP256, 2F5, 4E10, 10E8, 3BNC117, VRC01, VRC07, VRC13, PGDM1400, PGV04, 2G12, b12, N6, TR66, etanercept, abatacept, rilonacept, aflibercept, belatacept, romiplostim, efmoroctocog, eftrenonacog, asfotase alfa, muromonab-CD3, edrecolomab, capromab, ibritumomab, blinatumomab, abciximab, rituximab, basiliximab, infliximab, cetuximab, brentuximab, siltuximab, palivizumab, trastuzumab, alemtuzumab, omalizumab, bevacizumab, natalizumab, ranibizumab, eculizumab, certolizumab, pertuzumab, obinutuzumab, pembrolizumab, vedolizumab, elotuzumab, idarucizumab, mepolizumab, adalimumab, panitumumab, canakinumab, golimumab, ofatumumab, ustekinumab, denosumab, belimumab, ipilimumab, raxibacumab, nivolumab, ramucirumab, alirocumab, daratumumab, evolocumab, necitumumab, and secukinumab.
In some other embodiments, the transgene in the expression vectors of the invention can encode at least a chain or functional fragment derived from any of the other known cellular proteins such as cellular receptors, other cell surface molecules, enzymes, cytokines, chemokines, costimulatory molecules, interleukins, and physiologically active polypeptide factors. Examples of these known cellular proteins include, e.g., CD4, TPST1, TPST2, TNFR II, CD28, CTLA-4, PD-1, PD-L1, PD-L2, 4-1BBL, 4-1BB, EPO, Factor VIII, Factor IX, alkaline phosphatase, hemoglobin, fetal hemoglobin, and RPE65. In some of these embodiments, the polypeptide expressed from the ORF in the expression vectors of the invention is at least part of a chimeric antigen receptor (CAR).
[0062] In various embodiments, the expression vectors of the invention can be used in gene therapies for expressing many therapeutic agents known in the art. These include factor VIII, factor IX, β-globin, low-density lipoprotein receptor, adenosine deaminase, purine nucleoside phosphorylase, sphingomyelinase, glucocerebrosidase, cystic fibrosis transmembrane conductance regulator, α-antitrypsin, CD-18, ornithine transcarbamylase, argininosuccinate synthetase, phenylalanine hydroxylase, branched-chain α-ketoacid dehydrogenase, fumarylacetoacetate hydrolase, glucose 6-phosphatase, α-L-fucosidase, β-glucuronidase, α-L-iduronidase, galactose 1-phosphate uridyltransferase, interleukins, cytokines, small peptides, and the like. Other therapeutic proteins that can be expressed from an integrated target polynucleotide in the engineered host cell of the invention include, e.g., Herceptin®, polypeptide antigens from various pathogens such as disease causing bacteria or viruses (e.g., E. coli, P. aeruginosa, S. aureus, malaria, HIV, rabies virus, HBV, and cytomegalovirus), and other proteins such as lactoferrin, thioredoxin and beta-casein.
[0063] Additional examples of therapeutic agents or proteins of interest include, but are not limited to, insulin, erythropoietin, tissue plasminogen activator (tPA), urokinase, streptokinase, neutropoiesis stimulating protein (also known as filgrastim or granulocyte colony stimulating factor (G-CSF)), thrombopoietin (TPO), growth hormone, hemoglobin, insulinotropin, imiglucerase, sargramostim, endothelin, soluble CD4, and antibodies and/or antigen-binding fragments (e.g., Fabs) thereof (e.g., Orthoclone OKT3 (anti-CD3), GPIIb/IIIa monoclonal antibody), ciliary neurite transforming factor (CNTF), granulocyte macrophage colony stimulating factor (GM-CSF), brain-derived neurite factor (BDNF), parathyroid hormone (PTH)-like hormone, insulinotrophic hormone, insulin-like growth factor-1 (IGF-1), platelet-derived growth factor (PDGF), epidermal growth factor (EGF), acidic fibroblast growth factor, basic fibroblast growth factor, transforming growth factor β, neurite growth factor (NGF), interferons (IFN) (e.g., IFN-α2b, IFN-α2a, IFN-αN1, IFN-β1b, IFN-γ), interleukins (e.g., IL-1, IL-2, IL-8), tumor necrosis factor (TNF) (e.g., TNF-α, TNF-β), transforming growth factor-α and -β, catalase, calcitonin, arginase, phenylalanine ammonia lyase, L-asparaginase, pepsin, uricase, trypsin, chymotrypsin, elastase, carboxypeptidase, lactase, sucrase, intrinsic factor, vasoactive intestinal peptide (VIP), calcitonin, Ob gene product, cholecystokinin (CCK), and glucagon.
[0064] In some embodiments, expression vectors or polynucleotide switches of the present invention can be used to control expression of a transgene for regulating cell growth, differentiation or viability in cells transplanted into a subject. The expression constructs used in these methods express a transgene operably linked to a polynucleotide switch of the invention. The transgene can contain an ORF encoding a polypeptide that regulates the growth or other cellular processes of the cell. Expression of the polypeptide can be readily switched on or off via the targeted digestion of the vector and also the corresponding mRNA transcript by the Cpf1 DNase and RNase activities. By way of exemplification, these methods can be used to prevent the growth of hyperplastic or tumor cells, or even the unwanted proliferation of normal cells. The methods can also be used to induce the death of fat cells, to regulate growth and differentiation of stem cells, or to regulate an immune response.
[0065] In some embodiments, the transgene to be expressed under the control of Cpf1-dependent polynucleotide switches of the present invention can encode a therapeutic polypeptide or agent noted above. For example, transfection of tumor suppressor gene p53 into human breast cancer cell lines has led to restored growth suppression in the cells (Casey et al., Oncogene 6:1791-7, 1991). The Rb protein can be employed similarly. In some other embodiments, the transgene operably linked to a polynucleotide switch of the present invention can encode an enzyme. For example, the gene can encode a cyclin-dependent kinase (CDK). It was shown that restoration of the function of a wild-type cyclin-dependent kinase, p16INK4, by transfection with a p16INK4-expressing vector reduced colony formation by some human cancer cell lines (Okamoto, Proc. Natl. Acad. Sci. U.S.A. 91:11045-9, 1994). Additional embodiments of the invention encompass expression in target cells of cell adhesion molecules, other tumor suppressors such as p21 and BRCA2, inducers of apoptosis such as Bax and Bak, other enzymes such as cytosine deaminases and thymidine kinases, hormones such as growth hormone and insulin, and interleukins and cytokines. As exemplified herein, preferred target cells for the present invention are mammalian cells, e.g., cells of both human and non-human animals including vertebrates and mammals. In some embodiments, the target cells are cancer or tumor cells. In some other embodiments, the target cells are stem cells. The vector introduced into the cells can express a transgene that regulates differentiation, proliferation, or death (e.g., by apoptosis) of stem cells. For example, a stem cell therapy can include stem cells modified to include a vector containing a polynucleotide on-switch that expresses a transgene such as a suicide gene that promotes apoptosis, e.g., a caspase. Stem cells suitable for practicing the invention include, but are not limited to, hematopoietic stem cells (HSC), embryonic stem cells, pluripotent stem cells, or mesenchymal stem cells.
[0066] The invention provides engineered mammalian cells which express a transgene that is operably fused to a polynucleotide switch described herein. Using a Cpf1-dependent polynucleotide switch of the present invention, various mammalian cells can be employed, either by introducing a vector of the invention or by stably integrating the rDNA described herein into the host genome. Vectors encoding Cpf1-dependent polynucleotide switches can be introduced into an appropriate host cell (e.g., a mammalian cell such as a 293T cell, N2a cell or CHO cell) by any means known in the art. The cells can transiently or stably express the introduced transgene. Preferably, mammalian cells are used in these embodiments of the invention. Mammalian expression systems allow for proper post-translational modifications of expressed mammalian proteins to occur, e.g., proper processing of the primary transcript, glycosylation, phosphorylation and, advantageously, secretion of the expressed product. Suitable cells include mammalian cells (e.g., rodent, cow, goat, rabbit, sheep, non-human primate, human, and the like). Specific examples of cell lines include CHO, BHK, HEK293, N2a, VERO, HeLa, COS, MDCK, and WI38. As exemplified herein, any convenient protocol may be employed for in vitro or in vivo introduction of the vector into the host cell, depending on the location of the host cell. In some embodiments, where the host cell is an isolated cell, the expression vector may be introduced directly into the cell under cell culture conditions permissive of viability of the host cell, e.g., by using standard transduction techniques.
[0067] Alternatively, where the host cell or cells are part of a multicellular organism, the targeting vector may be administered to the organism or host in a manner such that the vector is able to enter the host cell(s), e.g., via an in vivo or ex vivo protocol. By "in vivo," it is meant that the target construct is administered to a living body of an animal. By "ex vivo" it is meant that cells or organs are modified outside of the body. Such cells or organs are typically returned to a living body. Techniques well known in the art for the transfection of cells can be used for the ex vivo administration of vectors. The exact formulation, route of administration and dosage can be chosen empirically (see, e.g., Fingl et al., 1975, in The Pharmacological Basis of Therapeutics, Ch. 1 p 1). For example, DNA and RNA vectors can be delivered with cationic lipids (Goddard et al., Gene Therapy 4:1231-1236, 1997; Gorman et al., Gene Therapy 4:983-992, 1997; Chadwick et al., Gene Therapy 4:937-942, 1997; Gokhale et al., Gene Therapy 4:1289-1299, 1997; Gao and Huang, Gene Therapy 2:710-722, 1995), using viral vectors (Monahan et al., Gene Therapy 4:40-49, 1997; Onodera et al., Blood 91:30-36, 1998), by uptake of "naked DNA", and the like. In some other embodiments, the vectors or expression constructs of the invention can be introduced into the target cells via a liposome. The physical properties of liposomes depend on pH, ionic strength and the presence of divalent cations.
[0068] Pharmaceutical preparations or compositions are typically employed in the practice of the various therapeutic embodiments of the invention. The pharmaceutical preparations contain a vector harboring the polynucleotide switch. In some embodiments, a transgene sequence is operably linked to the polynucleotide switch in the vector as described herein. In addition, the pharmaceutical compositions of the invention can also contain a pharmaceutically acceptable carrier suitable for administration to a human or non-human subject. The pharmaceutically acceptable carrier can be selected from pharmaceutically acceptable salts, esters, and salts of such esters. The pharmaceutical compositions may be administered to a subject via any route including, but not limited to, intramuscular, buccal, rectal, intravenous or intracoronary routes.
EXAMPLES
[0069] The following examples are provided to further illustrate the invention but not to limit its scope. Other variants of the invention will be readily apparent to one of ordinary skill in the art and are encompassed by the appended claims.
Example 1. RNase activities of Cpf1 enzymes in mammalian cells
[0070] We investigated the ability of LbCpf1 and AsCpf1 to excise crRNAs from mRNA transcripts expressed from an RNA polymerase II (Pol II) promoter in mammalian cells, and to use these crRNAs to switch on or switch off transgene expression from vectors. To assess Cpf1 RNase activity in mammalian cells, we first inserted the LbCpf1 or AsCpf1 crRNA-scaffold region (Fig. 1a) into the 3' untranslated region (3' UTR) of a gene encoding Gaussia luciferase (GLuc). Thus, when the GLuc mRNA transcript is cleaved in its 3' UTR, luciferase expression is halted (Fig. 1b). We cotransfected 293T cells with one of six such vectors and a plasmid expressing either LbCpf1 or AsCpf1 (Fig. 1c). Neither Cpf1 enzyme inactivated a GLuc transcript lacking a scaffold RNA, whereas both LbCpf1 and AsCpf1 inactivated a GLuc transcript bearing the Lb scaffold RNA. When that scaffold RNA sequence was modified in its initiating AAUU sequence to UUAA (indicated with a small 'χ'), or was completely randomized (indicated with a large 'χ'), Cpf1 activity was abrogated. Interestingly, when the As scaffold RNA sequence replaced the Lb scaffold RNA sequence, only AsCpf1, but not LbCpf1, could halt luciferase expression. We then tested a panel of LbCpf1 variants (Fig. 1d), with mutations either in its putative RNase region (H759A, K768A, K785A, HKK/AAA bearing all three of these mutations, or F789A), or in its RuvC DNase domain (D832A, E925A). As before, wild-type LbCpf1 abrogated GLuc expression, whereas putative RNase LbCpf1 mutants did so less efficiently (K768A, K785A, F789A) or not at all (H759A, HKK/AAA). In contrast, LbCpf1 DNase mutants inactivated GLuc as efficiently as wild-type LbCpf1. Several of these putative RNase LbCpf1 mutants retained their DNase activity when an appropriate crRNA was expressed from a U6 (Pol III) promoter. Specifically, the H759A, K768A, and K785A LbCpf1 variants, but not the Cpf1 DNase domain mutants D832A and E925A, mediated efficient cleavage in the EGFP gene of an integrated DNA double-strand break (DSB) reporter construct (Fig. 3a), as indicated by a T7E1 mismatch cleavage assay (Fig. 1e). These variants also mediated a gain-of-expression frameshift in the same retrovirally integrated DSB reporter construct, EGFP(+1)-GLuc(+3) (Fig. 1f; Fig. 3). Thus, GLuc expression was turned on by a polynucleotide switch of the present invention through cleavage of a protospacer in the EGFP coding region and its repair in a manner that changed the reading frame. Note that the Cpf1 and scaffold RNA co-crystal structure reveals that H759 and K785 are proximal to the phosphate group of the first adenosine, A(-20), of the Lb scaffold RNA and are thus well positioned to mediate LbCpf1's RNase activity (Fig. 1g). We conclude that both LbCpf1 and AsCpf1 have RNase activity that can cleave an mRNA transcript bearing an appropriate scaffold sequence in its 3' UTR, and that the RNase and DNase activities of LbCpf1 can be segregated with appropriate mutations. In addition to these two Cpf1 enzymes, a divergent LbCpf1 protein (which has the scaffold RNA sequence shown in SEQ ID NO:22) was also tested in separate studies and found to work similarly.
Example 2. Functioning Cpf1 crRNA excised from Pol II-expressed transcript
[0071] To determine if crRNA excised from a Pol II-driven mRNA transcript could be used to edit mammalian cell genomes, we inserted a functional guide RNA, gR1, between two Lb scaffold RNAs (sR) in the 3' UTR of a GLuc transcript (GLuc-sR-gR1-sR; Fig. 2a). As expected, expression of this GLuc transcript was halted in the presence of wild-type LbCpf1 or its DNase mutants (D832A, E925A), but not LbCpf1 variants bearing mutations in their RNase domains (Fig. 4a). We then assessed the efficiency of genome editing with crRNA excised from this GLuc transcript. Wild-type LbCpf1 efficiently introduced mutations in an integrated transgene encoding a green-fluorescent protein variant engineered for a short half-life (EGFPd4), as indicated by a T7E1 assay (Fig. 2b) and by loss of GFP fluorescence (Fig. 2c, Fig. 4b). Nearly identical results were obtained when gR1 in the 3' UTR was replaced by a different guide RNA, gR3 (Fig. 4c and 4d). In both cases, LbCpf1 inactivated more than half of all GFP expression, consistent with efficient editing of the integrated EGFPd4 gene. In addition, wild-type LbCpf1 coexpressed with the GLuc transcript also efficiently induced expression of firefly luciferase (FLuc) from a DSB-induced gain-of-expression reporter plasmid (Certo et al. Nat. Methods 8, 671-676, 2011) (Fig. 2d; assay depicted in Fig. 5a). This gain-of-expression was caused by the Cpf1 DNase activity cleaving a protospacer, the repair of which resulted in a change of reading frame, thus switching on FLuc expression. This experiment shows that our approach can be used to switch on the expression of a transgene. In Figures 2b-d, genome editing was dependent on both the DNase and RNase activities of Cpf1, as indicated by loss of editing with RNase mutants H759A and HKK/AAA and with DNase mutants D832A and E925A. Editing was also dependent on the presence of the sR-gR1-sR sequence in the 3' UTR of the GLuc expressor and on the second sR domain (Fig. 5b and 5c), indicating that two RNase cleavage events were necessary to produce a functional crRNA. We conclude that LbCpf1 can excise and utilize crRNAs embedded in the 3' UTR of a Pol II-driven RNA transcript.
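The requirement for two cleavage events can be pictured with a toy string model, offered here only as an illustration. The Python sketch below assumes that the Cpf1 RNase cuts the transcript immediately 5' of each scaffold copy, so that a scaffold-guide unit is released only when a second scaffold lies 3' of the guide. The cut position is a modeling assumption, the bracketed tokens stand in for real sequences, and the function name is hypothetical.

```python
def excise_crrnas(transcript, scaffold):
    """Return scaffold-guide fragments released from a transcript.

    Toy model (assumption): one cut is made immediately 5' of every scaffold
    copy, and a fragment counts as a crRNA only if it is bounded by two cuts.
    """
    cuts = []
    position = transcript.find(scaffold)
    while position != -1:
        cuts.append(position)
        position = transcript.find(scaffold, position + len(scaffold))
    # Each pair of neighbouring cuts releases one scaffold + guide fragment.
    return [transcript[start:end] for start, end in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    SR = "[sR]"  # placeholder token, not a real scaffold sequence
    with_second_sr = "GLuc" + SR + "[gR1]" + SR + "AAAA"
    without_second_sr = "GLuc" + SR + "[gR1]" + "AAAA"
    print(excise_crrnas(with_second_sr, SR))
    print(excise_crrnas(without_second_sr, SR))
```

In this model a transcript lacking the second scaffold releases no bounded fragment, which mirrors the observed dependence of editing on the second sR domain in a Pol II transcript.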
[0072] To assess the ability of LbCpf1 to utilize multiple crRNAs expressed from a single transcript, we tested four Pol III (U6)-expressed RNA transcripts (Fig. 2e) in a DSB-induced gain-of-expression assay (depicted in Fig. 5d). Two DSB reporter plasmids were used in this assay. One reporter, gR1T(+1)-FLuc(+3), carries the target sequence for gR1 and a DSB-responsive FLuc gene. The other reporter, gR3T(+1)-GLuc(+3), carries the target sequence for gR3 and a DSB-responsive GLuc gene. As expected, sR-gR1 could induce expression of FLuc, but not GLuc, whereas the reverse was true for sR-gR3 (Fig. 2f). In contrast, tandem constructs could switch on expression of both genes, indicating that LbCpf1 can efficiently utilize two crRNAs expressed from a single promoter. Finally, we observed that crRNAs driven by a Pol II (CMV) promoter were consistently more efficient than those expressed from a Pol III (U6) promoter, as indicated in two cell lines stably transfected with different GFP variants, using the T7E1 assay (Fig. 2g). These data indicate that genome editing with Cpf1 is more efficient when its crRNA is expressed from a Pol II promoter. Thus, the ability to generate crRNA from a Pol II promoter represents an additional advantage of Cpf1 relative to Cas9.
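The excision of several guides from one transcript can be sketched as a simple string operation. This is a toy model under stated assumptions: the scaffold sequence below is illustrative (it is built from the scaffold features recited in the claims), the two 23-nt guide sequences are hypothetical placeholders rather than gR1 and gR3, and the exact RNase cleavage positions within the scaffold are not modeled. The sketch reports only guides flanked by a scaffold on both sides, reflecting the observation above that a second sR domain was required.

```python
# Illustrative scaffold RNA; an assumption, not a sequence taken
# from the specification.
SCAFFOLD = "AAUUUCUACUAAGUGUAGAU"


def excise_guides(transcript: str, scaffold: str = SCAFFOLD) -> list:
    """Return guide sequences that lie between two scaffold copies.

    Only the guide portion is returned; the scaffold nucleotides that
    remain attached to a mature crRNA are omitted for simplicity.
    """
    parts = transcript.split(scaffold)
    # parts[0] is upstream of the first scaffold and parts[-1] is
    # downstream of the last one, so neither is flanked on both sides.
    return [p for p in parts[1:-1] if p]


# Hypothetical 23-nt guides standing in for two different crRNAs.
GUIDE_A = "ACGUACGUACGUACGUACGUACG"
GUIDE_B = "GGCAUCGAUCCGAUAGCUAGGCA"

tandem = "GGG" + SCAFFOLD + GUIDE_A + SCAFFOLD + GUIDE_B + SCAFFOLD + "AAAA"
single_scaffold = "GGG" + SCAFFOLD + GUIDE_A + "AAAA"
```

Here `excise_guides(tandem)` yields both guides, while `excise_guides(single_scaffold)` yields none because the guide lacks a downstream scaffold.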
Example 3. Switching off transgene expression at RNA level
[0073] Additional studies were performed to show that the self-directed polynucleotide switch can switch off transgene expression at the RNA level. In one embodiment, the same Pol II transcript can contain a transgene, a scaffold RNA and a guide RNA (gRNA). In such embodiments where all of these elements reside on the same transcript, transgene expression can be shut off at the RNA level. Here, we co-transfected 293T cells with a first plasmid expressing a single Pol II transcript containing Firefly luciferase, a scaffold RNA, and gRNA, alongside a second plasmid, which was an empty vector or a plasmid vector expressing wild-type LbCpf1, an RNase domain mutant LbCpf1 (H759A), or a DNase domain mutant LbCpf1 (D832A). These point mutations in the RNase and DNase domains of LbCpf1 allow us to discriminate between the RNase- and DNase-mediated effects of the polynucleotide switch.
[0074] The results are shown in Figure 6. In the absence of the scaffold RNA and gRNA, expression of the Firefly luciferase transgene from a plasmid vector was unaffected by the presence of wild-type or mutant Cpf1 (Fig. 6a). However, when the Firefly luciferase transcript also contained a scaffold RNA and gRNA, Firefly luciferase expression was greatly reduced by wild-type LbCpf1 and by DNase domain mutant (D832A) LbCpf1, but not by RNase domain mutant (H759A) LbCpf1 (Fig. 6b). Therefore, when the transgene, scaffold RNA, and gRNA are contained within the same transcript, the polynucleotide switch is capable of switching off transgene expression at the RNA level.
Example 4. Different structural organizations of the polynucleotide switch components
[0075] We also performed studies to assess the structural requirements of the different components of the polynucleotide switch. Results from these studies are shown in Figure 7. We found that the protospacer can occupy various sites, including the start codon, a splice acceptor site, a coding region, or any combination thereof. Plasmid vectors were constructed that express Firefly luciferase from a Pol II promoter and express a second transcript containing a scaffold RNA and guide RNA (gRNA) from a U6 Pol III promoter. Because the scaffold RNA and the gRNA are not on the same transcript as the transgene, any transgene inactivation occurs at the DNA level and is not due to cleavage of the RNA transcript by the RNase domain of LbCpf1. The plasmid vector expressing both Firefly luciferase and the transcript containing the scaffold RNA and gRNA was co-transfected into 293T cells with an empty plasmid vector or a plasmid vector expressing either wild-type LbCpf1, RNase domain mutant LbCpf1 H759A, or DNase domain mutant LbCpf1 D832A. In a first strategy for inactivating transgene expression, the protospacer overlaps with the splice acceptor AG dinucleotide (SA) site (Fig. 7a). In this context, wild-type LbCpf1 greatly diminished the expression of Firefly luciferase through a mechanism that includes cleavage of a protospacer overlapping the SA site. The switching off of luciferase expression was abolished in the context of the DNase domain mutant, indicating that the effect is mediated by the cleavage of vector DNA. In a second strategy for inactivating transgene expression, the protospacer overlaps with the ATG start codon trinucleotide (Fig. 7b). This strategy also resulted in a substantial loss of luciferase activity. In a third strategy for inactivating transgene expression, the Firefly luciferase expression vector contained two protospacers, with the first protospacer overlapping with the splice acceptor AG dinucleotide site and the second protospacer overlapping with the ATG start codon trinucleotide (Fig. 7c). However, in contrast to the plasmid vectors utilized to generate Figs. 7a and 7b, in this double-protospacer construct the scaffold RNA and gRNA were in the same Pol II transcript that expresses Firefly luciferase. In this case, wild-type LbCpf1 efficiently inactivated the transgene. The DNase domain mutant (D832A) LbCpf1 also efficiently shut off expression, due to the RNase domain of LbCpf1 cleaving the Firefly luciferase mRNA at a site proximal to the scaffold RNA. However, the RNase domain mutant did not inactivate Firefly luciferase, since it could not process the scaffold and gRNA. In a fourth strategy for inactivating transgene expression, the protospacer resided in the coding region of the Firefly luciferase transgene (Fig. 7d). Nineteen of nineteen protospacer sequences within the Firefly luciferase coding region were functional as the protospacer for the self-directed inactivation of a transgene in the present invention. Thus, we demonstrated that various divergent embodiments of the polynucleotide switch of the present invention are functional, including where the scaffold and gRNAs are on the same or different transcripts from the transgene, and where the protospacer may overlap the splice acceptor AG dinucleotide, ATG start codon trinucleotide, or coding region.
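The outcomes reported in Examples 3 and 4 can be summarized as a small decision function. This is a simplified sketch, not a description of any implementation in the specification: the class, constant, and function names are hypothetical, and the rule that DNA-level cleavage requires prior RNase processing of the crRNA is an assumption drawn from the statement that the RNase mutant could not process the scaffold and gRNA. The outcome for an RNase mutant paired with a separately expressed (U6) guide is not reported in the text, so the model's answer for that case is a prediction only.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Cpf1:
    rnase_active: bool
    dnase_active: bool


WILD_TYPE = Cpf1(rnase_active=True, dnase_active=True)
RNASE_MUTANT = Cpf1(rnase_active=False, dnase_active=True)  # e.g. H759A
DNASE_MUTANT = Cpf1(rnase_active=True, dnase_active=False)  # e.g. D832A


def inactivation_levels(cpf1: Optional[Cpf1],
                        guide: Optional[str],
                        has_protospacer: bool) -> Set[str]:
    """Return the levels ("RNA", "DNA") at which the transgene is
    expected to be switched off.

    cpf1: the enzyme variant, or None for the empty-vector control.
    guide: "same" if scaffold RNA and gRNA are on the transgene
        transcript, "separate" if on another transcript, None if absent.
    has_protospacer: whether the vector carries a matching protospacer.
    """
    if guide not in (None, "same", "separate"):
        raise ValueError("guide must be 'same', 'separate', or None")
    levels: Set[str] = set()
    if cpf1 is None or guide is None:
        return levels
    # Assumption: without RNase activity no crRNA is produced, so
    # neither RNA-level nor DNA-level inactivation occurs.
    if not cpf1.rnase_active:
        return levels
    if guide == "same":
        levels.add("RNA")  # scaffold excision cuts the transgene mRNA
    if cpf1.dnase_active and has_protospacer:
        levels.add("DNA")  # crRNA-guided cleavage of the vector
    return levels
```

For example, `inactivation_levels(DNASE_MUTANT, "same", True)` returns `{"RNA"}`, matching the double-protospacer construct in which the DNase mutant still shut off expression through mRNA cleavage.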
Example 5. Self-directed inactivation of a transgene in vivo
[0076] We further examined the ability of the polynucleotide switch to mediate self-directed inactivation of a transgene in vivo. Results from these studies are shown in Figure 8. Here, we successfully demonstrated the inactivation of an AAV transgene containing a polynucleotide switch in vivo. Specifically, mice were inoculated with 10^10 copies of an AAV vector encoding Firefly luciferase as a model transgene. The same AAV vector that encoded Firefly luciferase also included a U6 promoter that expresses a transcript containing a scaffold RNA and a guide RNA (gRNA). An otherwise-identical negative control vector was constructed that contained Firefly luciferase, but did not express a transcript containing a scaffold RNA and gRNA. The gRNA was engineered to match a protospacer sequence in the coding region of the Firefly luciferase transgene. At the same time that these luciferase-encoding AAV vectors were administered to the mice by intramuscular injection, a separate AAV vector encoding LbCpf1 or a control AAV vector was administered to the same site. Luciferase activity was quantified using a Xenogen imager on days 8, 14, 21, and 28 post-injection (Fig. 8). Among the mice that received the negative control vector, which expressed Firefly luciferase but not the scaffold RNA and gRNA, there was no difference in luciferase activity between those that also received the vector encoding LbCpf1 and those that received the control (Figs. 8a and 8c). However, among the mice that received an AAV vector that expressed Firefly luciferase plus a transcript containing a scaffold RNA and gRNA, the activity of the Firefly luciferase transgene was greatly diminished in the mice that received the second AAV vector containing LbCpf1 but not in the mice that received a second AAV vector expressing a control protein (Figs. 8b and 8d). In the mice that received the AAV vector expressing Firefly luciferase, a scaffold RNA, and gRNA, the reduction in gene expression due to LbCpf1 was statistically significant at day 28 (P=0.01). Since the scaffold RNA and gRNA were expressed on a separate transcript from Firefly luciferase, the inactivation of the vector observed here occurred at the DNA level, not at the RNA level. These data indicate that in vivo a vector containing a polynucleotide switch of the invention could undergo a self-directed DNA modification to inactivate the vector-encoded transgene.
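Selecting a protospacer within a transgene coding region, as was done for the gRNA above, amounts to scanning the sequence for a PAM followed by a protospacer of suitable length. The sketch below is illustrative only. The function names are hypothetical, the default PAM rule (a tetranucleotide containing at least two T nucleotides, located 5' of the protospacer) follows the definition used in the claims, and only the given strand is scanned. A stricter rule such as TTTV, the PAM commonly reported for LbCpf1, can be supplied through the `pam_ok` argument.

```python
from typing import Callable, List, Optional, Tuple


def find_target_sites(dna: str,
                      protospacer_len: int = 23,
                      pam_ok: Optional[Callable[[str], bool]] = None
                      ) -> List[Tuple[int, str, str]]:
    """Return (pam_start, pam, protospacer) for each candidate site.

    The PAM is the 4 nt immediately 5' of the protospacer on the
    strand given; the opposite strand is not scanned in this sketch.
    """
    dna = dna.upper()
    if pam_ok is None:
        # Default rule: tetranucleotide with at least two T's.
        pam_ok = lambda pam: pam.count("T") >= 2
    sites = []
    for i in range(len(dna) - 4 - protospacer_len + 1):
        pam = dna[i:i + 4]
        if pam_ok(pam):
            sites.append((i, pam, dna[i + 4:i + 4 + protospacer_len]))
    return sites


def guide_for(protospacer: str) -> str:
    """Return a gRNA identical in sequence to the protospacer (T -> U)."""
    return protospacer.upper().replace("T", "U")


def is_tttv(pam: str) -> bool:
    """Stricter PAM rule: TTT followed by A, C, or G."""
    return pam[:3] == "TTT" and pam[3] in "ACG"


# Hypothetical 23-nt protospacer embedded in a short test sequence.
PROTO = "GACGCCAAGAACATCAAGAAGGG"
example = "GG" + "TTTA" + PROTO + "CC"
sites = find_target_sites(example, pam_ok=is_tttv)
```

With the TTTV rule the example sequence yields a single site whose protospacer is `PROTO`; the looser default rule would return additional overlapping candidates.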
***
[0077] Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to one of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.
[0078] All publications, databases, GenBank sequences, patents, and patent applications cited in this specification are herein incorporated by reference as if each was specifically and individually indicated to be incorporated by reference.
Claims
1. A vector comprising a transgene and a polynucleotide switch, said polynucleotide switch comprising:
(a) a first sequence segment comprising a Cpf1 DNase target sequence comprising (1) a protospacer that is approximately 23±2 nucleotides in length and (2) a protospacer-adjacent motif (PAM) that is located 5' to the protospacer; and
(b) a second sequence segment encoding an RNA transcript comprising (1) a scaffold RNA comprising a sequence 4-10 nucleotides in length and its reverse complement, and (2) a guide RNA (gRNA) sequence that is substantially identical to said protospacer and is approximately 23±2 nucleotides in length;
wherein said polynucleotide switch is capable of undergoing a self-directed DNA modification that switches on or switches off said transgene.
2. The vector of claim 1, wherein the scaffold RNA in the RNA transcript is capable of being cleaved by a Cpf1 enzyme to generate a mature crRNA of the Cpf1 enzyme in a mammalian cell.
3. The vector of claim 2, wherein the crRNA is capable of targeting the Cpf1 enzyme to the Cpf1 DNase target sequence to cleave the vector within the protospacer.
4. The vector of claim 1, wherein the RNA transcript comprises two or more scaffold RNAs.
5. The vector of claim 1, further comprising an RNA polymerase II (Pol II) promoter.
6. The vector of claim 5, wherein the transcript is expressed from the RNA polymerase II (Pol II) promoter.
7. The vector of claim 1, wherein said transgene comprises at least one open reading frame (ORF).
8. The vector of claim 1, wherein said scaffold RNA comprises one or more features selected from the group consisting of (a) a sequence 4-10 nucleotides in length comprising UCUAC and a reverse complement comprising GUAGA, (b) a U nucleotide at the first unpaired position 5' of the sequence 4-10 nucleotides in length, (c) a U nucleotide at the first unpaired position 3' of the sequence 4-10 nucleotides in length, (d) a U nucleotide at the first unpaired position 5' of the reverse complement, (e) a U nucleotide at the first unpaired position 3' of the reverse complement, and (f) a trinucleotide AAU at a position fewer than 5 nucleotides 5' of said sequence 4-10 nucleotides in length.
9. The vector of claim 8, wherein the scaffold RNA further comprises a CU dinucleotide, an AU dinucleotide, or an AAG trinucleotide between the sequence 4-10 nucleotides in length and its reverse complement.
10. The vector of claim 1, wherein the PAM is a tetranucleotide comprising at least two thymidine (T) nucleotides.
11. The vector of claim 1, wherein the sequence of the guide RNA is identical to that of the protospacer.
12. The vector of claim 1, wherein cleavage of the protospacer, in a cell comprising Cpf1, switches off the expression of said transgene.
13. The vector of claim 1, wherein cleavage of the protospacer, in a cell comprising Cpf1, switches on the expression of said transgene.
14. The vector of claim 1, wherein the concentration of the transcript is reduced in the presence of Cpf1.
15. The vector of claim 5, wherein the promoter comprises the protospacer.
16. The vector of claim 5, wherein the protospacer comprises or partially overlaps a TFIIB recognition element (BRE), a TATA box, an Initiator (Inr), a downstream promoter element (DPE), a splice acceptor AG dinucleotide, a splice donor GU dinucleotide, an ATG start codon trinucleotide, or an internal ribosomal entry site (IRES).
17. The vector of claim 1, comprising two or more Cpf1 DNase target sequences, wherein each Cpf1 DNase target sequence comprises (1) a protospacer that is approximately 23±2 nucleotides in length and (2) a protospacer-adjacent motif (PAM) that is located 5' to the protospacer.
18. The vector of claim 17, wherein the protospacers of the two or more Cpf1 DNase target sequences are not identical to each other.
19. The vector of claim 7, wherein said ORF encodes an amino acid sequence that is substantially identical to the amino acid sequence of at least a portion of a human protein.
20. The vector of claim 7, wherein the ORF encodes an amino acid sequence that is substantially identical to the Fc region of an antibody.
21. The vector of claim 7, wherein the ORF encodes an amino acid sequence other than an antibody constant region that is substantially identical to one or more immunoadhesins or antibodies selected from the group consisting of CD4-Ig, eCD4-Ig, PG9, PG16, PGT121, PGT128, 10-1074, PGT145, PGT151, CAP256, 2F5, 4E10, 10E8, 3BNC117, VRC01, VRC07, VRC13, PGDM1400, PGV04, 2G12, b12, N6, TR66, etanercept, abatacept, rilonacept, aflibercept, belatacept, romiplostim, efmoroctocog, eftrenonacog, asfotase alpha, muromonab-CD3, edrecolomab, capromab, ibritumomab, blinatumomab, abciximab, rituximab, basiliximab, infliximab, cetuximab, brentuximab, siltuximab, palivizumab, trastuzumab, alemtuzumab, omalizumab, bevacizumab, natalizumab, ranibizumab, eculizumab, certolizumab, pertuzumab, obinutuzumab, pembrolizumab, vedolizumab, elotuzumab, idarucizumab, mepolizumab, adalimumab, panitumumab, canakinumab, golimumab, ofatumumab, ustekinumab, denosumab, belimumab, ipilimumab, raxibacumab, nivolumab, ramucirumab, alirocumab, daratumumab, evolocumab, necitumumab, and secukinumab.
22. The vector of claim 7, wherein the ORF encodes a sequence encoding at least a portion of one or more proteins selected from the group consisting of serum albumin, CD2, CD3γ, CD3δ, CD3ε, CD3ζ, FcεRγ, CD4, CD8α, CD8β, CD28, DAP10, DAP12, TPST1, TPST2, TNFR II, CTLA-4, PD-1, PD-L1, PD-L2, 4-1BBL, 4-1BB, alkaline phosphatase, hemoglobin, fetal hemoglobin, erythropoietin (EPO), RPE65, caspase 2, caspase 3, caspase 6, caspase 7, caspase 8, caspase 9, caspase 10, decay-accelerating factor (DAF), C1r, C1s, CNGA3, AADC, LDLR, dystrophin, XLRS1, Gigaxonin, CHM, GDNF, LEKTI, insulin, glucagon, growth hormone (GH), parathyroid hormone (PTH), growth hormone releasing factor (GRF), follicle stimulating hormone (FSH), luteinizing hormone (LH), human chorionic gonadotropin (hCG), vascular endothelial growth factor (VEGF), angiopoietin, angiostatin, TGF-β, factor VIII, factor IX, β-globin, low-density lipoprotein receptor, adenosine deaminase, purine nucleoside phosphorylase, sphingomyelinase, glucocerebrosidase, cystic fibrosis transmembrane conductance regulator (CFTR), α-antitrypsin, CD-18, ornithine transcarbamylase, argininosuccinate synthetase, phenylalanine hydroxylase, branched-chain α-ketoacid dehydrogenase, fumarylacetoacetate hydrolase, glucose 6-phosphatase, α-L-fucosidase, β-glucuronidase, α-L-iduronidase, galactose 1-phosphate uridyltransferase, tissue plasminogen activator (tPA), urokinase, streptokinase, granulocyte colony stimulating factor (G-CSF), thrombopoietin (TPO), growth hormone, hemoglobin, insulinotropin, imiglucerase, sargramostim, endothelian, ciliary neurite transforming factor (CNTF), granulocyte macrophage colony stimulating factor (GM-CSF), brain-derived neurite factor (BDNF), parathyroid hormone (PTH)-like hormone, insulinotrophic hormone, insulin-like growth factor-1 (IGF-1), platelet-derived growth factor (PDGF), epidermal growth factor (EGF), acidic fibroblast growth factor, basic fibroblast growth factor, transforming growth factor β, neurite growth factor (NGF), IFN-α2b, IFN-α2a, IFN-αN1, IFN-β1b, IFN-γ, IL-1, IL-2, IL-4, IL-8, IL-10, TNF-α, TNF-β, catalase, calcitonin, arginase, phenylalanine ammonia lyase, L-asparaginase, pepsin, uricase, trypsin, chymotrypsin, elastase, carboxypeptidase, lactase, sucrase, intrinsic factor, vasoactive intestinal peptide (VIP), calcitonin, Ob gene product, cholecystokinin (CCK), Rb, and p53.
23. The vector of claim 7, wherein the ORF encodes at least a portion of a chimeric antigen receptor (CAR).
24. The vector of claim 1, which is an adeno-associated virus (AAV) vector or a retroviral vector.
25. The vector of claim 1, further comprising an inducible expression cassette that encodes a Cpf1 enzyme.
26. A mammalian cell comprising the vector of claim 1.
27. A pharmaceutical composition comprising the vector of claim 1.
28. A method of switching on or switching off expression of a transgene, said method comprising (a) administering the vector of claim 1 to a subject, and (b) administering either a Cpf1 enzyme or a polynucleotide capable of expressing a Cpf1 enzyme to the subject; thereby switching on or switching off expression of the transgene in the subject.
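The structural requirement recited for the scaffold RNA, namely a sequence 4-10 nucleotides in length together with its reverse complement, can be checked with a short sketch. The function names and the example scaffold sequence are assumptions made for illustration; the example was assembled from the features listed in claims 8 and 9 (UCUAC, GUAGA, flanking U nucleotides, an AAG trinucleotide between the paired segments, and an upstream AAU) and is not quoted from the specification.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_stems(scaffold: str, min_len: int = 4, max_len: int = 10):
    """Return (stem, stem_start, revcomp_start) for every segment of
    min_len..max_len nt whose reverse complement occurs downstream of
    it without overlapping it."""
    hits = []
    for n in range(min_len, max_len + 1):
        for i in range(len(scaffold) - n + 1):
            stem = scaffold[i:i + n]
            j = scaffold.find(reverse_complement(stem), i + n)
            if j != -1:
                hits.append((stem, i, j))
    return hits


# Illustrative scaffold assembled from the recited features.
EXAMPLE_SCAFFOLD = "AAUUUCUACUAAGUGUAGAU"
stems = find_stems(EXAMPLE_SCAFFOLD)
longest = max(stems, key=lambda hit: len(hit[0]))
```

For the example scaffold the longest paired segment found is UCUAC at position 4, with its reverse complement GUAGA at position 14 (0-based), and the three nucleotides AAG lie between the flanking U nucleotides that separate them.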
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762469069P | 2017-03-09 | 2017-03-09 | |
US62/469,069 | 2017-03-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018164948A1 true WO2018164948A1 (en) | 2018-09-13 |
Family
ID=63448678
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2018/020619 WO2018164948A1 (en) | 2017-03-09 | 2018-03-02 | Vectors with self-directed cpf1-dependent switches |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2018164948A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021112136A1 (en) * | 2019-12-02 | 2021-06-10 | 国立大学法人京都大学 | Mrna switch, and method for regulating expression of protein using same |
CN113015798A (en) * | 2018-11-15 | 2021-06-22 | 中国农业大学 | CRISPR-Cas12a enzymes and systems |
CN113897397A (en) * | 2021-09-30 | 2022-01-07 | 中南大学 | DNAzyme based gene editing regulation method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016055551A1 (en) * | 2014-10-07 | 2016-04-14 | Cellectis | Method for modulating car-induced immune cells activity |
WO2016094874A1 (en) * | 2014-12-12 | 2016-06-16 | The Broad Institute Inc. | Escorted and functionalized guides for crispr-cas systems |
WO2016154299A1 (en) * | 2015-03-24 | 2016-09-29 | The Trustees Of Columbia University In The City Of New York | Genetic modification of pigs for xenotransplantation |
US20160289637A1 (en) * | 2015-04-03 | 2016-10-06 | Dana-Farber Cancer Institute, Inc. | Composition and methods of genome editing of b-cells |
WO2016205759A1 (en) * | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Engineering and optimization of systems, methods, enzymes and guide scaffolds of cas9 orthologs and variants for sequence manipulation |
WO2017015015A1 (en) * | 2015-07-17 | 2017-01-26 | Emory University | Crispr-associated protein from francisella and uses related thereto |
-
2018
- 2018-03-02 WO PCT/US2018/020619 patent/WO2018164948A1/en active Application Filing
Non-Patent Citations (7)
Title |
---|
JUSIAK ET AL.: "Engineering Synthetic Gene Circuits in Living Cells with CRISPR Technology", TRENDS IN BIOTECHNOLOGY, vol. 34, no. 7, 1 July 2016 (2016-07-01), pages 535 - 547, XP029607920 * |
NISSIM ET AL.: "Multiplexed and Programmable Regulation of Gene Networks with an Integrated RNA and CRISPR/Cas Toolkit in Human Cells", MOLECULAR CELL, vol. 54, no. 4, 22 May 2014 (2014-05-22), pages 698 - 710, XP029028594 * |
TANG ET AL.: "A CRISPR-Cpf1 system for efficient genome editing and transcriptional repression in plants", NATURE PLANTS, vol. 3, no. 3, 17 February 2017 (2017-02-17), pages 1 - 5, XP055538473 * |
TOTH ET AL.: "Cpf1 Nucleases Demonstrate Robust Activity to Induce DNA Modification by Exploiting Homology Directed Repair Pathways in Mammalian Cells", BIOLOGY DIRECT, vol. 11, no. 46, 14 September 2016 (2016-09-14), pages 1 - 14, XP055534917 * |
ZALATAN ET AL.: "Engineering Complex Synthetic Transcriptional Programs with CRISPR RNA Scaffolds", CELL, vol. 160, 15 January 2015 (2015-01-15), pages 339 - 350, XP055278878 * |
ZETSCHE ET AL.: "Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System", CELL, vol. 163, 22 October 2015 (2015-10-22), pages 759 - 771, XP055267511 * |
ZHONG ET AL.: "Cpf1 proteins excise CRISPR RNAs from mRNA transcripts in mammalian cells", NATURE CHEMICAL BIOLOGY, vol. 13, no. 8, 19 June 2017 (2017-06-19), pages 839 - 841, XP055538474 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113015798A (en) * | 2018-11-15 | 2021-06-22 | 中国农业大学 | CRISPR-Cas12a enzymes and systems |
CN113015798B (en) * | 2018-11-15 | 2023-01-10 | 中国农业大学 | CRISPR-Cas12a enzymes and systems |
WO2021112136A1 (en) * | 2019-12-02 | 2021-06-10 | 国立大学法人京都大学 | Mrna switch, and method for regulating expression of protein using same |
JP7630835B2 (en) | 2019-12-02 | 2025-02-18 | 国立大学法人京都大学 | mRNA switch and method for controlling protein expression using the same |
CN113897397A (en) * | 2021-09-30 | 2022-01-07 | 中南大学 | DNAzyme based gene editing regulation method |
CN113897397B (en) * | 2021-09-30 | 2024-04-02 | 中南大学 | A method for regulating gene editing based on DNAzyme |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18763439 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18763439 Country of ref document: EP Kind code of ref document: A1 |