The present application claims priority to and the benefit of PCT Application No. PCT/CN2021/133681, filed on November 26, 2021, the contents of which are incorporated herein by reference in their entirety.
The Sequence Listing XML associated with the present application is provided electronically in XML file format and is incorporated by reference into the present specification. The name of the XML file containing the Sequence Listing XML is "EPIG-001_001WO_sequencer Listing_ST26". The XML file is 220,215 bytes in size and was created on November 1, 2022.
Detailed Description
The present disclosure overcomes problems associated with current technology by providing genetically engineered fusion molecules (e.g., DNMT3A-DNMT3L (3A3L)-dCas9-KRAB fusion molecules) for targeted reduction or elimination of gene products (e.g., PCSK9) in cells for use in in vivo gene therapy. The genetically engineered fusion molecules of the present disclosure are useful for the treatment of genetic diseases, including, for example, liver diseases, diseases associated with high cholesterol, and diseases associated with dysregulation of cholesterol (e.g., low-density lipoprotein (LDL) cholesterol). Accordingly, methods of making the genetically engineered fusion molecules and pharmaceutical formulations thereof for in vivo delivery (e.g., lipid nanoparticle formulations) are also provided.
I. Definitions
The term "coding sequence" or "coding nucleic acid" as used herein refers to a nucleic acid (RNA or DNA molecule) comprising a nucleotide sequence encoding a protein. The coding sequence may further comprise initiation and termination signals operably linked to regulatory elements, including promoters and polyadenylation signals capable of directing expression in the cells of the individual or mammal to which the nucleic acid is administered. The coding sequence may be codon optimized.
As used herein, the term "complement" or "complementary" with respect to nucleic acids can refer to Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of a nucleic acid molecule. "Complementarity" refers to the property shared between two nucleic acid sequences such that when they are aligned antiparallel to each other, the nucleotide bases at each position will be complementary.
The terms "correction", "genome editing" and "restoring" refer to altering a mutant gene encoding a mutant protein, a truncated protein, or no protein at all, such that expression of a full-length functional or partially full-length functional protein is obtained. Correcting or restoring a mutated gene may include replacing a region of the gene having the mutation with a copy of the gene not having the mutation, or replacing the entire mutated gene, using a repair mechanism such as homology-directed repair (HDR). Correcting or restoring a mutant gene may also include repairing a frameshift mutation that causes a premature stop codon, an aberrant splice acceptor site, or an aberrant splice donor site by creating a double-strand break in the gene, followed by repair using non-homologous end joining (NHEJ). NHEJ can add or delete at least one base pair during repair, which can restore the correct reading frame and eliminate the premature stop codon. Correction or restoration of a mutant gene may also include disruption of an aberrant splice acceptor site or splice donor sequence. Correction or restoration of a mutated gene may also include deletion of a non-essential gene segment by the simultaneous action of two nucleases on the same DNA strand, in order to restore the correct reading frame by removing the DNA between the two nuclease target sites and repairing the DNA break by NHEJ.
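By way of illustration only, the antiparallel Watson-Crick relationship described above can be sketched in Python. The names `reverse_complement` and `are_complementary` are hypothetical helpers introduced for this example; the sketch assumes unmodified A/C/G/T/U bases and does not model Hoogsteen pairing or nucleotide analogs.

```python
# Illustrative sketch only: Watson-Crick complementarity of two strands
# read antiparallel to each other. Hoogsteen pairing is not modeled.
_PAIR = {"A": "T", "T": "A", "U": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of seq (U in the input is read as T)."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if every position pairs when the strands are aligned antiparallel."""
    # T and U are treated as equivalent so that DNA/RNA duplexes compare equal.
    return reverse_complement(strand_a) == strand_b.upper().replace("U", "T")
```

For example, under these assumptions `are_complementary("ATGC", "GCAU")` evaluates to true, because the second strand read in reverse pairs base for base with the first.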
The terms "donor DNA," "donor template," and "repair template" as used herein refer to a double-stranded DNA fragment or molecule that includes at least a portion of a gene of interest. The donor DNA may encode a fully functional protein or a partially functional protein.
The terms "frameshift" or "frameshift mutation" are used interchangeably herein and refer to a type of genetic mutation in which the addition or deletion of one or more nucleotides results in a shift in the reading frame of codons in the mRNA. The shift in reading frame may result in a change in amino acid sequence during translation of the protein, such as a missense mutation or premature stop codon.
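As a non-limiting illustration of the definition above, the following Python sketch shows why an insertion or deletion shifts the reading frame only when the net change in length is not a multiple of three. The function names `causes_frameshift` and `codons` are hypothetical and are used here only for the example.

```python
# Illustrative sketch only: an indel shifts the reading frame when the net
# number of nucleotides added or removed is not a multiple of three.
def causes_frameshift(inserted: int, deleted: int) -> bool:
    """Return True if the net length change disrupts the triplet reading frame."""
    return (inserted - deleted) % 3 != 0


def codons(seq: str) -> list[str]:
    """Split seq into complete triplets starting at position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


reference = "ATGGCCTAA"                  # ATG GCC TAA
mutant = reference[:3] + reference[4:]   # one nucleotide deleted after ATG
```

In this example `codons(reference)` yields ATG, GCC, TAA, whereas `codons(mutant)` yields ATG, CCT, so every codon downstream of the deletion is read differently and the original stop codon is no longer in frame.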
The terms "functional" and "fully functional" as used herein describe proteins having biological activity. A "functional gene" refers to a gene that is transcribed into mRNA, which is in turn translated into a functional protein.
The term "fusion protein" as used herein refers to a chimeric protein produced directly or indirectly by covalent or non-covalent linkage of two or more genes, which initially encode separate proteins. In certain embodiments, translation of the fusion gene results in a single polypeptide having functional properties derived from each of the original proteins.
The term "genetic construct" as used herein refers to a DNA or RNA molecule comprising a nucleotide sequence encoding a protein. The coding sequence includes initiation and termination signals operably linked to regulatory elements including promoters and polyadenylation signals capable of directing expression in the cell.
The term "homology-directed repair" or "HDR" as used interchangeably herein refers to the mechanism by which double-stranded DNA damage is repaired in a cell when a homologous DNA fragment is present in the cell nucleus (primarily in the G2 and S phases of the cell cycle). HDR uses a donor DNA template to direct repair and can be used to create specific sequence changes in the genome, including targeted insertion of an entire gene. If a donor template is provided together with a site-specific nuclease, for example with a CRISPR/Cas9-based system, the cellular machinery will repair the break by homologous recombination, which is enhanced by several orders of magnitude in the presence of DNA cleavage. When a homologous DNA fragment is not present, non-homologous end joining may occur instead.
The term "genome editing" as used herein refers to altering a gene. Genome editing may include correction or restoration of mutant genes. Genome editing may include knocking out genes, such as mutant genes or normal genes. Genome editing can be used to treat diseases by altering genes of interest.
In the context of two or more nucleic acid or polypeptide sequences, the term "identical" or "identity" as used herein refers to sequences having a specified percentage of identical residues within a specified region. The percentage can be calculated as follows: optimally aligning the two sequences, comparing the two sequences within the designated region, determining the number of positions at which identical residues occur in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the designated region, and multiplying the result by 100 to yield the percentage of sequence identity. Where the two sequences differ in length or the alignment produces one or more staggered ends and the designated comparison region includes only a single sequence, the residues of the single sequence are included in the denominator of the calculation but not in the numerator. Thymine (T) and uracil (U) can be considered equivalent when comparing DNA and RNA. Identity may be determined manually or by using a computer sequence algorithm such as BLAST or BLAST 2.0. Identity of related peptides can be readily calculated by known methods. Such methods include, but are not limited to, those described in: Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heinje, G., Academic Press, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M. Stockton Press, New York, 1991; and Carillo et al., SIAM J. Applied Math. 48, 1073 (1988), each of which is incorporated herein by reference in its entirety.
The term "mutant gene" or "mutated gene" as used interchangeably herein refers to a gene that has undergone a detectable mutation. A mutant gene has undergone a change, such as a loss, gain, or exchange of genetic material, that affects the normal transmission and expression of the gene. As used herein, a "disrupted gene" refers to a mutant gene having a mutation that causes a premature stop codon. The product of the disrupted gene is truncated relative to the full-length product of the non-disrupted gene.
The term "epigenetic modification modulator" as used herein refers to an agent that targets gene expression through epigenetic modification (e.g., by histone acetylation or methylation, or by DNA methylation at regulatory elements of the target gene, such as promoters, enhancers, or transcription start sites). Chromatin remodeling and DNA methylation are two major mechanisms that regulate gene transcription. Specific epigenetic marks (e.g., DNA methylation) structurally or biochemically direct gene transcription or gene silencing/repression. For example, DNA methylation of regions that regulate transcriptional activity alters gene expression without altering the underlying DNA sequence. Transcriptional regulation using epigenetic modifications (e.g., DNA methylation) allows targeted regulation of gene expression without affecting expression of other gene products.
The term "non-homologous end joining (NHEJ) pathway" as used herein refers to a pathway that repairs double-strand breaks in DNA by directly joining the broken ends without the need for a homologous template. Template-independent re-ligation of DNA ends by NHEJ is a stochastic, error-prone repair process that introduces random micro-insertions and micro-deletions (indels) at the DNA breakpoint. This process can be used to deliberately disrupt, delete, or alter the reading frame of a target gene sequence. NHEJ typically uses short homologous DNA sequences called microhomologies to direct repair. These microhomologies are typically found in single-stranded overhangs at the ends of the double-strand break. NHEJ usually repairs the break accurately when the overhangs are fully compatible; imprecise repair leading to loss of nucleotides may also occur, but is more common when the overhangs are incompatible.
The term "normal gene" as used herein refers to a gene that has not undergone a change, such as a loss, gain, or exchange of genetic material. Normal genes undergo normal gene transmission and gene expression.
The term "nuclease-mediated NHEJ" as used herein refers to NHEJ that is initiated after cleavage of double-stranded DNA by a nuclease such as Cas9.
The term "nucleic acid" or "oligonucleotide" or "polynucleotide" as used herein refers to at least two nucleotides that are covalently linked together. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a nucleic acid also encompasses the complementary strand of a depicted single strand. Many variants of a nucleic acid can be used for the same purpose as a given nucleic acid. Thus, a nucleic acid also encompasses substantially identical nucleic acids and complements thereof. A single strand provides a probe that can hybridize to a target sequence under stringent hybridization conditions. Thus, a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions. The nucleic acid may be single-stranded or double-stranded, or may contain portions of both double-stranded and single-stranded sequence. The nucleic acid may be DNA (both genomic DNA and cDNA), RNA, or a hybrid, wherein the nucleic acid may contain combinations of deoxyribonucleotides and ribonucleotides, as well as combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, and isoguanine. The nucleic acid may be obtained by chemical synthesis methods or by recombinant methods.
The term "operably linked" as used herein means that the expression of a gene is under the control of a promoter to which it is spatially linked. A promoter may be located 5' (upstream) or 3' (downstream) of a gene under its control. The distance between the promoter and the gene may be about the same as the distance between that promoter and the gene it controls in the gene from which the promoter is derived. As is known in the art, variation in this distance can be accommodated without loss of promoter function.
The term "partially functional" as used herein describes a protein encoded by a mutant gene that has a lower biological activity than a functional protein but a higher biological activity than a nonfunctional protein. In one embodiment, the partially functional protein exhibits a biological activity that is less than 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35% or 30% of the biological activity of the corresponding functional protein.
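The percentage calculation described above can be sketched in Python as follows. This is a minimal illustration, not a replacement for BLAST or any other alignment tool: it assumes the two nucleic acid sequences have already been optimally aligned, that gaps and staggered ends are written as "-", and that the designated region is the full alignment. The function name `percent_identity` is hypothetical.

```python
# Illustrative sketch only: percent identity over a pre-computed alignment of
# two nucleic acid sequences. The alignment step itself is assumed to be done.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return 100 * matched positions / total positions in the aligned region."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Treat thymine (T) and uracil (U) as equivalent when comparing DNA and RNA.
    a = aligned_a.upper().replace("U", "T")
    b = aligned_b.upper().replace("U", "T")
    # Ignore columns in which both sequences carry a gap.
    columns = [(x, y) for x, y in zip(a, b) if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no comparable positions in the aligned region")
    # A position facing a gap counts in the denominator but never in the numerator.
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)
```

For instance, aligning "ACGTAC" against "ACGT--" gives four matched positions out of six, so the two unpaired residues lower the result to about 66.7% rather than being ignored.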
The term "premature stop codon" or "out of frame stop codon" as used herein refers to a nonsense mutation in a DNA sequence that produces a stop codon at a position that is not normally found in a wild-type gene. Premature stop codons may result in the protein being truncated or shorter than the full length version of the protein.
The term "promoter" or "core promoter" as used herein refers to a synthetically or naturally derived molecule capable of conferring, activating, or enhancing expression of a nucleic acid in a cell. A promoter may contain one or more specific transcriptional regulatory sequences to further enhance expression of a nucleic acid and/or to alter its spatial and/or temporal expression. A promoter may also include distal enhancer or repressor elements, which may be located up to several thousand base pairs from the transcription start site. Promoters may be derived from sources including viruses, bacteria, fungi, plants, insects, and animals. A promoter may regulate expression of a gene component constitutively, or differentially with respect to the cell, tissue, or organ in which expression occurs, the developmental stage at which expression occurs, or in response to an external stimulus such as a physiological stress, pathogen, metal ion, or inducing agent. Representative examples of promoters include the phage T7 promoter, phage T3 promoter, SP6 promoter, lac operator-promoter, tac promoter, SV40 late promoter, SV40 early promoter, RSV-LTR promoter, and CMV IE promoter.
The term "target gene" as used herein refers to any nucleotide sequence encoding a known or putative gene product. The target gene may be a mutant gene involved in a genetic disease or disorder.
The term "target region" as used herein refers to the region of the target gene to which the site-specific nuclease is designed to bind.
The term "transgene" as used herein refers to a gene, or genetic material containing a gene sequence, that has been isolated from one organism and introduced into a different organism. Alternatively, the term "transgene" also refers to a gene or genetic material that is chemically synthesized and introduced into an organism. Such a non-native DNA segment may retain the ability to produce RNA or protein in the transgenic organism, or it may alter the normal function of the genetic code of the transgenic organism. The introduction of a transgene has the potential to alter the phenotype of an organism.
The term "variant" as used herein when applied to nucleic acids refers to (i) a portion or fragment of a reference nucleotide sequence; (ii) a complement of a reference nucleotide sequence or portion thereof; (iii) a nucleic acid substantially identical to a reference nucleic acid or a complementary sequence thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to a reference nucleic acid, a sequence complementary thereto, or a sequence substantially identical thereto. For peptides or polypeptides, the amino acid sequence of a "variant" differs by amino acid insertions, deletions, or conservative substitutions, but retains at least one biological activity.
A variant may also refer to a protein having an amino acid sequence that is substantially identical to the amino acid sequence of a reference protein and that retains at least one biological activity. A conservative substitution of an amino acid, i.e., the replacement of an amino acid with a different amino acid having similar properties (e.g., hydrophilicity, degree and distribution of charged regions), is recognized in the art as typically involving a minor change. As understood in the art, these minor changes can be identified in part by considering the hydropathic index of amino acids. Kyte et al., J. Mol. Biol. 157:105-132 (1982), incorporated herein by reference in its entirety. The hydropathic index of an amino acid is based on a consideration of its hydrophobicity and charge. It is known in the art that amino acids having similar hydropathic indices may be substituted and still retain protein function. In one aspect, amino acids having hydropathic indices within ±2 of each other are substituted. The hydrophilicity of amino acids may also be used to reveal substitutions that will result in proteins that retain biological function. Considering the hydrophilicity of amino acids in the context of a peptide permits calculation of the greatest local average hydrophilicity of that peptide. Amino acids having hydrophilicity values within ±2 of each other may be substituted. Both the hydrophobicity index and the hydrophilicity value of an amino acid are affected by the particular side chain of that amino acid. Consistent with this observation, amino acid substitutions that are compatible with biological function are understood to depend on the relative similarity of the amino acids, particularly the side chains of those amino acids, as revealed by hydrophobicity, hydrophilicity, charge, size, and other properties.
The term "vector" as used herein refers to a nucleic acid sequence that contains an origin of replication. The vector may be a viral vector, phage, bacterial artificial chromosome, or yeast artificial chromosome. The vector may be a DNA or RNA vector. The vector may be a self-replicating extrachromosomal vector, such as a DNA plasmid.
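For illustration, the ±2 hydropathic index criterion can be sketched in Python using the hydropathy scale published by Kyte et al. (1982). The function name `within_hydropathy_window` and the reading of "±2" as an absolute difference of at most 2 between the two indices are assumptions made for this example; the check is a screening heuristic and does not by itself establish that a substitution preserves biological activity.

```python
# Illustrative sketch only: screen a candidate substitution by comparing
# Kyte-Doolittle hydropathy indices (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def within_hydropathy_window(original: str, substitute: str,
                             window: float = 2.0) -> bool:
    """True if the two residues' hydropathy indices differ by at most window."""
    difference = KYTE_DOOLITTLE[original.upper()] - KYTE_DOOLITTLE[substitute.upper()]
    return abs(difference) <= window
```

Under these assumptions, isoleucine to valine (4.5 versus 4.2) falls inside the window, whereas isoleucine to aspartate (4.5 versus -3.5) does not.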
The terms "gene transfer", "gene delivery" and "gene transduction" as used herein refer to a method or system for the reliable insertion of a particular nucleotide sequence (e.g., DNA or RNA), fusion protein, polypeptide, etc., into a target cell.
The terms "adeno-associated virus (AAV) vector," "AAV gene therapy vector," and "gene therapy vector" as used herein refer to vectors having functional or partially functional ITR sequences and a transgene. The term "ITR" as used herein refers to an inverted terminal repeat. ITR sequences may be derived from adeno-associated virus serotypes including, but not limited to, AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, and AAV-6. However, the ITR need not be a wild-type nucleotide sequence and can be altered (e.g., by insertion, deletion, or substitution of nucleotides) so long as the sequence retains the functionality to provide for rescue, replication, and packaging. One or more AAV wild-type genes of an AAV vector, preferably the rep and/or cap genes, may be deleted in whole or in part, provided that functional flanking ITR sequences are retained. The function of a functional ITR sequence is, for example, to permit rescue, replication, and packaging of AAV virions or particles. Thus, "AAV vector" is defined herein to include at least those sequences required for insertion of a transgene into a cell of a subject, optionally including those sequences required in cis for replication and packaging of the virus (e.g., functional ITRs).
The term "gene therapy" as used herein refers to a method of treating a patient in which a polypeptide or nucleic acid sequence is transferred into cells of the patient in order to modulate the activity and/or expression of a particular gene. In certain embodiments, expression of the gene is inhibited. In certain embodiments, expression of the gene is enhanced. In certain embodiments, the temporal or spatial pattern of gene expression is modulated.
The "transgene" may contain a transgenic sequence or a native or wild-type DNA sequence. The transgene may become part of the genome of the primate subject. The transgenic sequence may be partially or completely heterologous in species, i.e., the transgenic sequence or a portion thereof may be from a species different from that of the cell into which it is introduced.
The term "stably maintained" as used herein refers to the characteristic of a transgenic subject (e.g., a human or non-human primate) of maintaining at least one of its transgenic elements (i.e., the desired element) through multiple generations of cells. For example, the term is intended to encompass many cell division cycles of an initially transfected cell. The term "stable transfection" or "stably transfected" refers to the introduction and integration of exogenous DNA into the genome of a cell. The term "stable transfectant" refers to a cell that has stably integrated exogenous DNA into its genomic DNA.
The terms "transgene encoding … …", "nucleic acid molecule encoding … …", "DNA sequence encoding … …" and "DNA encoding … …" as used herein refer to the order or sequence of deoxyribonucleotides along the strand of deoxyribonucleic acid. For example, the order of these deoxyribonucleotides can determine the order of amino acids along the polypeptide (protein) chain. Thus, the DNA sequence may encode the amino acid sequence.
The term "wild-type" (wt) as used herein refers to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is the gene most commonly observed in a population and is therefore arbitrarily designated the "normal" or "wild-type" form of the gene. In contrast, the term "modified" or "mutant" refers to a gene or gene product that exhibits a modification in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. Notably, naturally occurring mutants can be isolated; these are identified by the fact that they have altered characteristics as compared to the wild-type gene or gene product.
The term "transfection" as used herein refers to the uptake of a foreign nucleic acid (e.g., DNA or RNA) by a cell. A cell has been "transfected" when exogenous nucleic acid (DNA or RNA) has been introduced inside the cell membrane. Numerous transfection techniques are known in the art (see, e.g., Graham et al., Virol., 52:456 (1973); Sambrook et al., Molecular Cloning, a Laboratory Manual, Cold Spring Harbor Laboratories, New York (1989); Davis et al., Basic Methods in Molecular Biology, Elsevier (1986); and Chu et al., Gene 13:197 (1981), each of which is incorporated herein by reference in its entirety). Such techniques may be used to introduce one or more exogenous DNA moieties, such as gene transfer vectors and other nucleic acid molecules, into suitable recipient cells.
The terms "stable transfection" and "stably transfected" as used herein refer to the introduction and integration of foreign DNA into the genome of a transfected cell. The term "stable transfectant" refers to a cell that has stably integrated foreign DNA into its genomic DNA.
The term "transient transfection" or "transiently transfected" as used herein refers to the introduction of foreign DNA into a cell, wherein the foreign DNA fails to integrate into the genome of the transfected cell and is maintained as an episome. During this time, the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes. The term "transient transfectant" refers to a cell that has taken up foreign DNA but has failed to integrate that DNA. The term "transduction" as used herein means the delivery of a DNA molecule to a recipient cell in vivo or in vitro by a replication-defective viral vector, e.g., by recombinant AAV viral particles.
The term "recipient cell" as used herein refers to a cell that has been transfected or transduced with, or is capable of being transfected or transduced by, a nucleic acid construct or vector carrying a selected nucleotide sequence of interest. The term includes progeny of a parent cell, whether or not the progeny are identical in morphology or genetic constitution to the original parent, as long as the selected nucleotide sequence is present. The recipient cell may be a cell of the subject to which the gene therapy particle and/or gene therapy vector has been administered.
The term "recombinant DNA molecule" as used herein refers to a DNA molecule consisting of DNA fragments joined together by molecular biological techniques.
The term "regulatory element" as used herein refers to a genetic element that can control expression of a nucleic acid sequence. For example, a promoter is a regulatory element that helps to initiate transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, and the like.
The term DNA "control sequences" refers collectively to regulatory elements, such as promoter sequences, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites ("IRES"), enhancers, and the like, which together provide for the replication, transcription, and translation of a coding sequence in a recipient cell. Not all of these control sequences need be present.
The term "enhancer" as used herein refers to a non-coding DNA sequence comprising a plurality of activator and repressor binding sites. Enhancers range in length from 50 bp to 1500 bp and may be located proximal to and 5' upstream of the promoter, within any intron of the regulated gene, or distally, in an intron of a neighboring gene, in an intergenic region far from the locus, or in a region on a different chromosome. More than one enhancer may interact with a promoter. Likewise, an enhancer can regulate more than one gene without linkage restriction, and can "skip" adjacent genes to regulate more distant genes. Transcriptional regulation may involve elements located on a chromosome different from the chromosome on which the promoter is located. Proximal enhancers or promoters adjacent to a gene may serve as a platform for recruiting more distal elements. Enhancers and promoters may be "endogenous," "exogenous," or "heterologous" with respect to the gene to which they are operably linked. An "endogenous" enhancer/promoter is one that is naturally associated with a given gene in the genome. An "exogenous" or "heterologous" enhancer or promoter is one that is placed in juxtaposition to a gene by genetic manipulation (i.e., molecular biology techniques) such that transcription of the gene is directed by the linked enhancer/promoter.
The term "insulator" or "insulator element" as used herein refers to a genetic boundary element that blocks interactions between enhancers and promoters. By being located between an enhancer and a promoter, an insulator can inhibit their subsequent interaction. An insulator may thereby determine the set of genes that an enhancer can influence. Insulators are required when two adjacent genes on a chromosome have very different transcription patterns and the mechanism that induces or represses one of the genes must not interfere with the adjacent gene. Insulators are also found to cluster at the boundaries of topologically associating domains (TADs) and may play a role in dividing the genome into "chromosome neighborhoods," i.e., regions of the genome within which regulation occurs. Insulator activity is thought to be achieved primarily through the 3D structure of DNA mediated by proteins including CTCF. Insulators may function through a variety of mechanisms. Many enhancers form DNA loops that bring them into physical proximity with the promoter region during transcriptional activation. An insulator may promote the formation of a DNA loop that prevents the formation of a promoter-enhancer loop. A barrier insulator may prevent the spread of heterochromatin from a silenced gene to an actively transcribed gene.
The term "locus control region (LCR)" as used herein refers to a long-range cis-regulatory element capable of enhancing expression of a linked gene at a distal chromatin site. It functions in a copy number-dependent manner and is tissue-specific, as in the selective expression of the β-globin gene in erythroid cells. The level of expression of a gene can be altered by the LCR and by gene-proximal elements (e.g., promoters, enhancers, and silencers). The LCR functions by recruiting chromatin-modifying, coactivator, and transcription complexes. Its sequence is conserved in many vertebrates, and conservation of a particular site may indicate the importance of its function.
The terms "silencer" or "repressor" as used herein are used interchangeably to refer to a DNA sequence capable of binding a transcriptional regulator and preventing the expression of a gene as a protein. Silencers are sequence-specific elements that can negatively affect transcription of their particular gene. A silencer element may be located at a number of positions in the DNA. The most common location is upstream of the gene of interest, where it can help to inhibit transcription of the gene. The distance may vary considerably, from about -20 bp to -2000 bp upstream of the gene. Some silencers are located downstream of the promoter, within an intron or exon of the gene itself. Silencers are also found in the 3' untranslated region (3' UTR) of mRNA. There are two main types of silencers in DNA: classical silencer elements and non-classical negative regulatory elements (NREs). In classical silencers, the gene is actively repressed by the silencer element, mainly by interference with general transcription factor (GTF) assembly. NREs passively repress the gene, usually by inhibiting other elements upstream of the gene.
The term "tissue-specific" as used herein refers to regulatory elements or control sequences, such as promoters, enhancers, and the like, wherein the expression of a nucleic acid sequence is significantly higher in a particular cell type or tissue.
The presence of a "splicing signal" on an expression vector generally results in higher expression levels of the recombinant transcript. The splicing signal mediates the removal of introns from the primary RNA transcript and consists of a splice donor and acceptor site (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989), pp. 16.7-16.8, the entire contents of which are incorporated herein by reference). A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.
Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length. The term "poly A site" or "poly A sequence" as used herein refers to a DNA sequence that directs both the termination and the polyadenylation of a nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable because transcripts lacking a poly A tail are unstable and are rapidly degraded. The poly A signal used in an expression vector may be "heterologous" or "endogenous". An endogenous poly A signal is one that is found naturally at the 3' end of the coding region of a given gene in the genome. A heterologous poly A signal is one that is isolated from one gene and operably linked to the 3' end of another gene. A commonly used heterologous poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp BamHI/BclI restriction fragment and directs both termination and polyadenylation (Sambrook et al., supra, 16.6-16.7, which is incorporated herein by reference in its entirety).
The terms "subject" and "patient" are used interchangeably herein and refer to humans and non-human animals. The term "non-human animal" of the present disclosure includes all vertebrates, e.g., mammals and non-mammals, such as non-human primates, sheep, dogs, cats, horses, cattle, chickens, amphibians, reptiles, and the like.
As defined herein, a "therapeutically effective amount" or "therapeutically effective dose" is an amount or dose of a fusion protein, polypeptide, nucleic acid, lipid nanoparticle, liposome, AAV particle, or viral particle that is capable of producing a sufficient amount of the desired protein to modulate the activity of the protein in a desired manner, thereby providing a means of alleviation for clinical intervention. In certain embodiments, a therapeutically effective amount or dose of a transfected fusion protein, polypeptide, nucleic acid, AAV particle, or viral particle as described herein is sufficient to inhibit a gene targeted by the fusion protein/gene therapy construct.
The term "treating" as used herein means, for example, that a subject (e.g., a human) suffering from a disease, at risk of suffering from a disease, and/or experiencing symptoms of a disease will, in one embodiment, suffer from less severe symptoms and/or recover more rapidly when administered, for example, a fusion molecule described herein or a nucleic acid encoding the fusion molecule and/or a gRNA or a nucleic acid encoding the gRNA, as compared to when not so administered.
DNA binding proteins
In certain embodiments of the methods and compositions as defined herein according to the present disclosure, the DNA binding protein (e.g., DNA targeting agent) comprises a (DNA) nuclease, e.g., a nuclease that can target DNA in a sequence-specific manner or can be guided or directed to target DNA in a sequence-specific manner, e.g., a CRISPR-Cas system, a zinc finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN), or a meganuclease. In certain embodiments, the DNA binding protein is a DNA nuclease derived from a CRISPR-Cas system.
Transcription activator-like effector nuclease (TALEN) system
In certain embodiments, the nucleic acid binding protein is a (modified) transcription activator-like effector nuclease (TALEN) system. Transcription activator-like effectors (TALEs) can be engineered to bind virtually any desired DNA sequence. Exemplary methods of genome editing using TALEN systems can be found, for example, in: Cermak T., Doyle E.L., Christian M., Wang L., Zhang Y., Schmidt C., et al. Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Res. 2011;39:e82; Zhang F., Cong L., Lodato S., Kosuri S., Church G.M., Arlotta P. Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription. Nat Biotechnol. 2011;29:149-153; and U.S. Patent Nos. 8,450,471, 8,440,431, and 8,440,432, each of which is incorporated herein by reference in its entirety.
As a further guidance, but not limited thereto, naturally occurring TALEs or "wild-type TALEs" are nucleic acid binding proteins secreted by numerous species of proteobacteria. TALE polypeptides contain a nucleic acid binding domain composed of tandem repeats of highly conserved monomeric polypeptides that are predominantly 33, 34 or 35 amino acids in length and that differ from each other mainly at amino acid positions 12 and 13. In certain embodiments, the nucleic acid is DNA.
The term "polypeptide monomer" or "TALE monomer" as used herein refers to a highly conserved repeat polypeptide sequence within a TALE nucleic acid binding domain, and the term "repeat variable di-residue" or "RVD" refers to the highly variable amino acids at positions 12 and 13 of a polypeptide monomer.
As provided throughout this disclosure, amino acid residues of RVDs are described using the IUPAC single-letter codes for amino acids. A TALE monomer contained within the DNA binding domain is generally represented by X1-11-(X12X13)-X14-33 or 34 or 35, wherein the subscript indicates the amino acid position and X indicates any amino acid. X12X13 represents the RVD. In certain polypeptide monomers, the variable amino acid at position 13 is deleted or absent, and in such polypeptide monomers the RVD consists of a single amino acid. In such cases, the RVD may alternatively be represented as X*, where X represents X12 and (*) indicates that X13 is absent. The DNA binding domain comprises several repeats of TALE monomers, which may be expressed as (X1-11-(X12X13)-X14-33 or 34 or 35)z, wherein in an advantageous embodiment z is at least 5 to 40. In another advantageous embodiment, z is at least 10 to 26. TALE monomers have nucleotide binding affinities that are determined by the identity of the amino acids in the RVD. For example, a polypeptide monomer with RVD NI preferentially binds to adenine (A), a polypeptide monomer with RVD NG preferentially binds to thymine (T), a polypeptide monomer with RVD HD preferentially binds to cytosine (C), and a polypeptide monomer with RVD NN preferentially binds to both adenine (A) and guanine (G). In yet another embodiment of the present disclosure, a polypeptide monomer with RVD IG preferentially binds to T. Thus, the number and order of polypeptide monomer repeats in the nucleic acid binding domain of a TALE determine its nucleic acid target specificity. In still further embodiments of the present disclosure, a polypeptide monomer with RVD NS recognizes all four base pairs and can bind to A, T, G or C.
The structure and function of TALEs are further described in, for example, the following documents: Moscou et al., Science 326:1501 (2009); Boch et al., Science 326:1509-1512 (2009); and Zhang et al., Nature Biotechnology 29:149-153 (2011), each of which is incorporated herein by reference in its entirety. In certain embodiments, targeting is achieved by a polynucleic acid-binding TALEN fragment. In certain embodiments, the targeting domain comprises or consists of a catalytically inactive TALEN or a nucleic acid binding fragment thereof.
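By way of non-limiting illustration only, the RVD preferences recited above can be captured in a simple lookup table. The following Python sketch is not part of any claimed method; the function names are hypothetical, and the sketch assumes that each monomer contributes independently to target recognition, which is a simplification.

```python
# RVD-to-base preferences, limited to those recited in the paragraph above.
RVD_PREFERENCE = {
    "NI": {"A"},
    "NG": {"T"},
    "HD": {"C"},
    "NN": {"A", "G"},
    "IG": {"T"},
    "NS": {"A", "C", "G", "T"},
}


def predicted_bases(rvds):
    """Return, for each TALE monomer in order, the set of bases its RVD prefers."""
    return [RVD_PREFERENCE[rvd] for rvd in rvds]


def array_matches(rvds, target):
    """Return True if every base of `target` is preferred by the monomer
    at the same position (hypothetical helper; one monomer per base)."""
    target = target.upper()
    if len(rvds) != len(target):
        return False
    return all(base in RVD_PREFERENCE[rvd] for rvd, base in zip(rvds, target))


if __name__ == "__main__":
    repeats = ["NI", "NG", "HD", "NN"]
    print(predicted_bases(repeats))
    print(array_matches(repeats, "ATCG"))
```

In this sketch, changing the number or order of entries in `repeats` changes the set of target sequences accepted by `array_matches`, mirroring the statement that the number and order of monomer repeats determine target specificity.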
Zinc Finger Nuclease (ZFN) system
In certain embodiments, the nucleic acid binding protein (e.g., DNA binding protein) comprises or consists of a (modified) zinc finger nuclease (ZFN) system. The ZFN system uses artificial restriction enzymes created by fusing a zinc finger DNA binding domain to a DNA cleavage domain, and the zinc finger domain can be engineered to target a desired DNA sequence. Exemplary methods of genome editing using ZFNs can be found, for example, in U.S. Patent Nos. 6,534,261, 6,607,882, 6,746,838, 6,794,136, 6,824,978, 6,866,997, 6,933,113, 6,979,539, 7,013,219, 7,030,215, 7,220,719, 7,241,573, 7,241,574, 7,585,849, 7,595,376, 6,903,185 and 6,479,626, each of which is incorporated herein by reference in its entirety. As a further guidance, but not limited thereto, artificial zinc finger (ZF) technology involves arrays of ZF modules that target new DNA binding sites in the genome. Each finger module in a ZF array targets three DNA bases. Customized arrays of individual zinc finger domains are assembled into ZF proteins (ZFPs). A ZFP may comprise a functional domain. The first synthetic zinc finger nucleases (ZFNs) were developed by fusing a ZF protein to the catalytic domain of the type IIS restriction enzyme FokI (Kim, Y.G. et al., 1994, Chimeric restriction endonuclease, Proc. Natl. Acad. Sci. U.S.A. 91, 883-887; Kim, Y.G. et al., 1996, Hybrid restriction enzymes: zinc finger fusions to FokI cleavage domain, Proc. Natl. Acad. Sci. U.S.A. 93, 1156-1160). Increased cleavage specificity with reduced off-target activity can be obtained by using pairs of ZFN heterodimers, each targeting a different nucleotide sequence separated by a short spacer (Doyon, Y. et al., 2011, Enhancing zinc-finger-nuclease activity with improved obligate heterodimeric architectures, Nat. Methods 8, 74-79). ZFPs can also be designed as transcriptional activators and repressors and have been used to target many genes in a variety of organisms.
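By way of non-limiting illustration only, the statement that each finger module targets three DNA bases can be sketched as a division of a target site into consecutive three-base subsites. The Python function below is hypothetical and deliberately ignores the orientation in which fingers contact the DNA strand as well as any context effects between neighboring fingers.

```python
def split_into_triplets(target):
    """Split a DNA target site into consecutive 3-base subsites,
    one subsite per zinc finger module (illustrative simplification)."""
    target = target.upper()
    if len(target) == 0 or len(target) % 3 != 0:
        raise ValueError("target length must be a positive multiple of 3")
    return [target[i:i + 3] for i in range(0, len(target), 3)]


if __name__ == "__main__":
    subsites = split_into_triplets("GCGTGGGCG")
    # A 9-base site corresponds to an array of three finger modules.
    print(len(subsites), subsites)
```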
In certain embodiments, the targeting domain comprises or consists of a nucleic acid-binding zinc finger nuclease or a nucleic acid-binding fragment thereof. In certain embodiments, the nucleic acid-binding zinc finger nucleases (fragments thereof) are catalytically inactive.
Meganucleases
In certain embodiments, the nucleic acid binding protein (e.g., a DNA binding protein) comprises a (modified) meganuclease, which is an endodeoxyribonuclease characterized by a large recognition site (a double-stranded DNA sequence of 12 to 40 base pairs). Exemplary methods of using meganucleases can be found in U.S. Patent Nos. 8,163,514, 8,133,697, 8,021,867, 8,119,361, 8,119,381, 8,124,369, and 8,129,134, each of which is incorporated herein by reference in its entirety. In certain embodiments, targeting is achieved by a polynucleic acid-binding meganuclease fragment. In certain embodiments, targeting is achieved by a polynucleic acid-binding, catalytically inactive meganuclease (fragment). Thus, in a particular embodiment, the targeting domain comprises or consists of a nucleic acid-binding meganuclease or a nucleic acid binding fragment thereof.
CRISPR-Cas system
In certain embodiments, the nucleic acid binding protein (e.g., DNA binding protein) and the single guide RNA sequence are derived from a CRISPR-Cas system. The present disclosure provides CRISPR/Cas9-based engineered systems for genome editing and treatment of genetic diseases. The CRISPR/Cas9-based engineered system can be designed to target any gene (e.g., PCSK9), including genes associated with genetic diseases, liver disease, and dysregulation of cholesterol such as LDL. The present disclosure provides a CRISPR-Cas system comprising a genetically engineered Cas protein and/or guide RNA with a desired specificity and activity (e.g., reduced or eliminated expression of a PCSK9 gene product). The CRISPR/Cas9-based system can include a Cas9 protein, a mutated Cas9 protein, or a Cas9 fusion protein (e.g., a DNMT3A-DNMT3L (3A3L)-dCas9-KRAB fusion molecule), and at least one sgRNA (e.g., a PCSK9 sgRNA). The Cas9 fusion protein may, for example, include a domain having an activity different from that of the endogenous domains of Cas9 (e.g., DNMT3A, DNMT3L or KRAB).
In general, a Cas protein (used interchangeably herein with CRISPR protein, CRISPR enzyme, CRISPR-Cas protein, CRISPR-Cas enzyme, Cas, CRISPR effector, or Cas effector protein) and/or a guide sequence is a component of a CRISPR-Cas system. A CRISPR-Cas system or CRISPR system refers collectively to transcripts and other elements involved in the expression of, or directing the activity of, CRISPR-associated ("Cas") genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a "direct repeat" and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a "spacer" in the context of an endogenous CRISPR system), or "RNA" as that term is used herein (e.g., RNA such as CRISPR RNA (crRNA) and trans-activating (tracr) RNA, or a single guide RNA (sgRNA; chimeric RNA)), or other sequences and transcripts from a CRISPR locus.
In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). In the engineered systems of the disclosure, the direct repeat (DR) sequence can include naturally occurring sequences or non-naturally occurring sequences. The direct repeat sequences of the present disclosure are not limited to naturally occurring lengths and sequences. Furthermore, the direct repeat sequences of the present disclosure may include inserted nucleotides, such as an aptamer or an adaptor protein binding sequence (for binding to a functional domain). In certain embodiments, a direct repeat containing such an insertion has, for example, one end that is approximately the first half of a short DR and another end that is approximately the second half of the short DR.
In the case of CRISPR complex formation, "target sequence" or "target polynucleotide" refers to a sequence to which a guide sequence is designed to have complementarity, wherein hybridization between the target sequence and the guide sequence facilitates CRISPR complex formation. The target sequence may comprise any polynucleotide, such as a DNA or RNA polynucleotide. In certain embodiments, the target sequence is located in the nucleus or cytoplasm of the cell.
In general, a guide sequence (or spacer sequence) can be any polynucleotide sequence that has sufficient complementarity to a target polynucleotide sequence to hybridize to the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. In certain embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is equal to or greater than about 50%, 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more.
In certain embodiments, cleavage efficiency can be modulated by introducing mismatches, e.g., 1 or more mismatches, such as 1 or 2 mismatches, between the spacer sequence and the target sequence, and by choosing the position of each mismatch along the spacer/target. For example, the closer a double mismatch is to the center (i.e., not at the 3' or 5' end), the greater its impact on cleavage efficiency. Thus, by selecting the location of the mismatch along the spacer, the cleavage efficiency can be adjusted. For example, if cleavage of less than 100% of the targets is desired (e.g., in a cell population), 1 or more, preferably 2, mismatches between the spacer and the target sequence may be introduced in the spacer sequence. The closer the mismatch position is to the center of the spacer, the lower the percentage of cleavage.
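By way of non-limiting illustration only, a degree of complementarity can be computed as the percentage of guide positions that form Watson-Crick pairs with the target. The Python sketch below is a simplification under stated assumptions: the two sequences are of equal length and already aligned, both are written 5' to 3', no gaps are allowed, and no particular alignment algorithm is implied. The function name is hypothetical.

```python
# Watson-Crick partner on a DNA target for each guide base (RNA or DNA letters).
PAIRS_WITH = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(guide, target):
    """Percentage of guide positions that pair with the target strand.

    Both sequences are given 5'->3'; the target is read in reverse so that
    the two strands are compared in antiparallel orientation.
    """
    guide = guide.upper()
    target = target.upper().replace("U", "T")
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1 for g, t in zip(guide, reversed(target)) if PAIRS_WITH.get(g) == t
    )
    return 100.0 * paired / len(guide)


if __name__ == "__main__":
    print(percent_complementarity("GACU", "AGTC"))
    print(percent_complementarity("GACU", "AGTT"))
```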
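By way of non-limiting illustration only, candidate mismatch positions can be ordered by their distance from the center of the spacer. The Python sketch below computes only that distance; it does not model or predict cleavage efficiency, and the function names and the use of 0-based positions are assumptions made for the example.

```python
def distance_from_center(spacer_length, position):
    """Distance, in nucleotides, of a 0-based position from the spacer midpoint."""
    if not 0 <= position < spacer_length:
        raise ValueError("position lies outside the spacer")
    center = (spacer_length - 1) / 2
    return abs(position - center)


def rank_by_centrality(spacer_length, candidate_positions):
    """Order candidate mismatch positions from most central to most terminal."""
    return sorted(
        candidate_positions,
        key=lambda p: distance_from_center(spacer_length, p),
    )


if __name__ == "__main__":
    # For a 20-nucleotide spacer, position 10 is nearest the center.
    print(rank_by_centrality(20, [0, 10, 19, 5]))
```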
The CRISPR-Cas system or components thereof may be used to introduce one or more mutations in a target locus or nucleic acid sequence. The mutation may comprise the introduction, deletion or substitution of one or more nucleotides at each target sequence of the cell by a guide RNA or sgRNA. The mutation may comprise the introduction, deletion or substitution of 1-75 nucleotides at each target sequence of the cell by a guide RNA.
Typically, in the context of an endogenous CRISPR-Cas system, formation of a CRISPR complex (comprising a guide sequence hybridized to a target sequence and complexed with one or more Cas proteins) results in cleavage in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs therefrom), although this may depend on, for example, secondary structure, particularly in the case of RNA targets. In certain instances, in the context of an endogenous CRISPR system, formation of a CRISPR complex (comprising a guide sequence hybridized to a target sequence and complexed with one or more Cas proteins) results in cleavage of one or both strands (if applicable) in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs therefrom).
In certain embodiments, the guide RNA (capable of guiding Cas to a target locus) can comprise (1) a guide sequence capable of hybridizing to a target locus (a polynucleotide target locus, e.g., an RNA target locus) in a eukaryotic cell; and (2) a direct repeat (DR) sequence, both of which are present in a single RNA, i.e., an sgRNA (arranged in the 5' to 3' direction), or in a crRNA.
For general information about CRISPR-Cas systems, components thereof, and delivery of such components, including methods, materials, delivery vehicles, vectors, particles, AAV, and their preparation and use, including amounts and formulations, all useful in the practice of the present disclosure, reference is made to the following documents: U.S. Patent Nos. 8,999,641, 8,993,233, 8,945,839, 8,932,814, 8,906,616, 8,895,308, 8,889,418, 8,889,356, 8,871,445, 8,865,406, 8,795,965, 8,771,945 and 8,697,359; U.S. Patent Publications US 2014-0310830, US 2014-0287938 A1, US 2014-0273234 A1, US 2014-0273232 A1, US 2014-0273231 A1, US 2014-0256046 A1, US 2014-0248702 A1, US 2014-0242700 A1, US 2014-0242699 A1, US 2014-0242664 A1, US 2014-0234972 A1, US 2014-0227787 A1, US 2014-0189896 A1, US 2014-0186958, US 2014-0186919 A1, US 2014-0186843 A1, US 2014-0179770 A1, US 2014-0179006 A1 and US 2014-0170753; European Patents EP 2784162 B1 and EP 2771468 B1; European Patent Applications EP 2771468, EP 2764103 and EP 2784162; and PCT Patent Publications WO 2021/183807A1 (PCT/US2021/021973), WO 2014/093661 (PCT/US2013/074743), WO 2014/093694 (PCT/US2013/074790), WO 2014/093595 (PCT/US2013/074611), WO 2014/093718 (PCT/US2013/074825), WO 2014/093709 (PCT/US2013/074812), WO 2014/093622 (PCT/US2013/074667), WO 2014/093635 (PCT/US2013/074691), WO 2014/093655 (PCT/US2013/074736), WO 2014/093712 (PCT/US2013/074819), WO 2014/093701 (PCT/US2013/074800), WO 2014/018423 (PCT/US2013/051418), WO 2014/204723 (PCT/US2014/041790), WO 2014/204724 (PCT/US2014/041800), WO 2014/204725 (PCT/US2014/041803), WO 2014/204726 (PCT/US2014/041804), WO 2014/204727 (PCT/US2014/041806), WO 2014/204728 (PCT/US2014/041808) and WO 2014/204729 (PCT/US2014/041809), each of which is incorporated by reference herein in its entirety.
Cas proteins
The Cas protein (e.g., an engineered Cas protein) may have substantially the same (e.g., between 80% and 100%, between 90% and 100%, between 95% and 100%, between 98% and 100%, between 99% and 100%, between 99.9% and 100%, or about 100%) nuclease activity as the corresponding wild-type Cas protein. In certain instances, the engineered Cas protein has a nuclease activity that is higher (e.g., at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90%) than the corresponding wild-type Cas protein.
Alternatively or additionally, the Cas protein (e.g., an engineered Cas protein) may have a specificity that is at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% higher than that of the corresponding wild-type Cas protein. In particular examples, the Cas protein (e.g., an engineered Cas protein) can have a specificity that is at least 30% higher than that of the corresponding wild-type Cas protein. The term "specificity" of a Cas as used herein may correspond to the number or percentage of on-target polynucleotide cleavage events relative to the number or percentage of all polynucleotide cleavage events, including on-target and off-target events. The activity and specificity of Cas proteins may be as described in: Hsu P.D. et al., DNA targeting specificity of RNA-guided Cas9 nucleases, Nat Biotechnol. 2013 Sep;31(9):827-832; and Slaymaker I.M. et al., Rationally engineered Cas9 nucleases with improved specificity, Science. 2016 Jan 1;351(6268):84-88, which also provide examples of methods for detecting the activity and specificity of Cas proteins, and which are incorporated by reference herein in their entirety and described in detail elsewhere herein.
In certain embodiments, the Cas protein (e.g., its RuvC domain) can slide one base upstream (relative to the PAM) and create staggered nicks that can be filled in, resulting in duplication of a single base (i.e., a +1 insertion). An example of +1 insertion positions is described in Zuo, Z., and Liu, J. (2016). Cas9-catalyzed DNA Cleavage Generates Staggered Ends: Evidence from Molecular Dynamics Simulations. Scientific Reports 6, 37584. In certain embodiments, the engineered Cas protein has a +1 insertion frequency that is different from that of the corresponding wild-type Cas protein. For example, when guanine is present at the -2 position relative to the PAM, the +1 insertion frequency is higher than when thymidine, cytidine, or adenine is present at the -2 position relative to the PAM. In some cases, +1 insertion depends on host mechanisms in human cells. In certain examples, the Cas protein may create staggered nicks. The staggered nicks may be 1 bp or 5' overhang of 1 nucleotide. The staggered nicks may be 1 bp or 3' overhang of 1 nucleotide.
The nucleic acid molecule encoding Cas may be codon optimized. In this case, one example of a codon optimized sequence is a sequence optimized for expression in a eukaryotic organism, such as a human (i.e., a sequence optimized for expression in a human), or a sequence optimized for expression in another eukaryotic organism, animal, or mammal discussed herein; see, e.g., the SaCas9 human codon optimized sequence in WO 2014/093622 (PCT/US2013/074667). While this is preferred, it will be appreciated that other examples are possible and that codon optimization for host species other than humans, or for specific organs, is known. In certain embodiments, the enzyme coding sequence encoding Cas is codon optimized for expression in a particular cell, e.g., a eukaryotic cell. The eukaryotic cell may be a cell of or derived from a particular organism, such as a mammal, including but not limited to a human or a non-human eukaryote, animal or mammal as discussed herein, such as a mouse, rat, rabbit, dog, livestock, or non-human mammal or primate. In certain embodiments, processes for modifying the germ line genetic identity of humans, and/or processes for modifying the genetic identity of animals that are likely to cause them suffering without any substantial medical benefit to humans or animals, as well as animals resulting from such processes, may be excluded. Generally, codon optimization refers to the process of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) of the native sequence with a codon that is more frequently or most frequently used in the genes of the host cell, while maintaining the native amino acid sequence. Various species exhibit particular preferences for certain codons of a particular amino acid.
Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which in turn is believed to depend on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example in the "Codon Usage Database" at www.kazusa.or.jp/codon/, and these tables can be adapted in a number of ways. See Nakamura, Y., et al. "Codon usage tabulated from the international DNA sequence databases: status for the year 2000" Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell are also available, such as Gene Forge (Aptagen; Jacobus, PA). In certain embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more, or all codons) in the Cas-encoding sequence correspond to the most frequently used codon for a particular amino acid.
In certain embodiments, the Cas protein may have nucleic acid cleavage activity. The Cas protein may have RNA binding and DNA cleavage functions. In certain embodiments, Cas may direct cleavage of one or both nucleic acid strands at or near the location of the target sequence, e.g., within the target sequence and/or within the complement of the target sequence, or at a sequence associated with the target sequence, e.g., within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500 or more base pairs from the first or last nucleotide of the target sequence. In certain embodiments, the Cas protein may direct more than one cleavage (e.g., 1, 2, 3, 4, 5 or more cleavages) of one or both strands within the target sequence and/or within the complement of the target sequence, or at a sequence associated with the target sequence, and/or within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500 or more base pairs from the first or last nucleotide of the target sequence. In certain embodiments, the cleavage may be blunt, i.e., produce blunt ends. In some embodiments, the cleavage may be staggered, i.e., produce sticky ends.
In certain embodiments, the vector encodes a nucleic acid-targeting Cas protein that can be mutated relative to the corresponding wild-type enzyme such that the mutated nucleic acid-targeting Cas protein lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence, e.g., by a change or mutation in the HNH domain, to produce a mutated Cas that lacks substantially all DNA cleavage activity, e.g., a mutant enzyme having a DNA cleavage activity of no more than about 25%, 10%, 5%, 1%, 0.1%, 0.01% or less of the nucleic acid cleavage activity of the non-mutated form of the enzyme; an example is a mutant form with zero or negligible nucleic acid cleavage activity compared to the non-mutated form. As used herein, the term "derived" with respect to an enzyme means that the derived enzyme is largely based on the wild-type enzyme (in the sense of having a high degree of sequence homology with the wild-type enzyme), but has been mutated (modified) in some manner known in the art or described herein.
Typically, in the context of an endogenous nucleic acid-targeting system, formation of a nucleic acid-targeting complex (comprising a guide RNA or crRNA hybridized to a target sequence and complexed with one or more nucleic acid-targeting effector proteins) results in cleavage of a DNA strand in or near the target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs). The term "sequence associated with a target locus of interest" as used herein refers to a sequence in the vicinity of a target sequence (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50 or more base pairs from the target sequence, wherein the target sequence is contained within the target locus of interest).
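By way of non-limiting illustration only, specificity in the sense used above can be computed as the fraction of all cleavage events that are on-target, and two proteins can then be compared. In the Python sketch below, the function names and the example counts are hypothetical, and a "percent higher" comparison is assumed to be a relative increase over the wild-type value; the disclosure does not prescribe this particular calculation.

```python
def specificity(on_target_events, off_target_events):
    """Fraction of all cleavage events (on-target plus off-target)
    that occurred at the intended target."""
    total = on_target_events + off_target_events
    if total <= 0:
        raise ValueError("at least one cleavage event is required")
    return on_target_events / total


def percent_higher(engineered, wild_type):
    """Relative increase, in percent, of the engineered value over wild type."""
    if wild_type <= 0:
        raise ValueError("wild-type value must be positive")
    return 100.0 * (engineered - wild_type) / wild_type


if __name__ == "__main__":
    # Hypothetical event counts, for illustration only.
    wt = specificity(on_target_events=60, off_target_events=40)
    eng = specificity(on_target_events=90, off_target_events=10)
    print(wt, eng, percent_higher(eng, wt))
```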
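By way of non-limiting illustration only, the codon replacement described above can be sketched as follows: each codon of the native sequence is replaced by the most frequently used synonymous codon in a host usage table, so that the encoded amino acid sequence is unchanged. The Python sketch uses the standard genetic code; the function names are hypothetical, and the usage values in `TOY_USAGE` are placeholders invented for the example rather than values taken from any codon usage database. Codons whose amino acid does not appear in the usage table are left unchanged.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Placeholder usage frequencies (hypothetical values, illustration only).
TOY_USAGE = {"CTG": 40.0, "TTA": 7.0, "GCC": 28.0, "GCT": 18.0, "ATG": 22.0}


def translate(seq):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def preferred_codons(usage):
    """Map each amino acid in the usage table to its most frequent codon."""
    best = {}
    for codon, freq in usage.items():
        aa = CODON_TABLE[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return best


def codon_optimize(seq, usage):
    """Replace each codon with the host-preferred synonymous codon."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    best = preferred_codons(usage)
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return "".join(best.get(CODON_TABLE[c], c) for c in codons)


if __name__ == "__main__":
    native = "ATGTTAGCTTAA"
    optimized = codon_optimize(native, TOY_USAGE)
    print(optimized, translate(native) == translate(optimized))
```

A practical implementation would typically also account for factors such as GC content, repeats, and restriction sites, which this sketch does not address.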
It will be appreciated that effector proteins are enzyme-based or derived from enzymes, and thus in certain embodiments the term "effector protein" includes, of course, "enzymes". However, it should also be understood that in certain embodiments, effector proteins may have DNA or RNA binding activity, as desired, but not necessarily cleavage or nicking activity, including dead Cas protein function.
In certain embodiments, the Cas protein may form a component of an inducible system. The inducible nature of the system allows spatiotemporal control of gene editing or gene expression using a form of energy. The form of energy may include, but is not limited to, electromagnetic radiation, acoustic energy, chemical energy, and thermal energy. Examples of inducible systems include tetracycline-inducible promoters (Tet-On or Tet-Off), small molecule two-hybrid transcriptional activation systems (FKBP, ABA, etc.), or light-inducible systems (phytochrome, LOV domains, or cryptochrome). In one embodiment, the CRISPR effector protein may be part of a Light Inducible Transcriptional Effector (LITE) that directs changes in transcriptional activity in a sequence-specific manner. Components of the light-inducible transcriptional effector may include a CRISPR effector protein, a light-responsive cytochrome heterodimer (e.g., from Arabidopsis thaliana), and a transcriptional activation/repression domain. Further examples of inducible DNA binding proteins and methods of their use are provided in US 61/736,465, US 61/721,283 and WO 2014/018423 A2, which are incorporated herein by reference in their entirety.
In certain embodiments, a mutated Cas may have one or more mutations that result in reduced off-target effects, e.g., a modified CRISPR enzyme that, when complexed with a guide RNA, modifies a target locus with reduced or eliminated off-target activity, and/or a modified CRISPR enzyme that, when complexed with a guide RNA, has increased activity at the target locus. It should be understood that the mutant enzymes described below may be used in any of the methods according to the present disclosure described elsewhere herein. Any of the methods, products, compositions and uses described elsewhere herein are equally applicable to the mutated CRISPR enzymes described in further detail below.
Methods and mutations that can be used in various combinations to increase or decrease targeting activity and/or specificity relative to off-target activity, or to increase or decrease targeting binding and/or specificity relative to off-target binding, can be used to compensate or enhance mutations or modifications made to promote other effects. Such mutations or modifications to promote other effects include mutations or modifications to Cas and/or mutations or modifications to guide RNAs. The methods and mutations of the present disclosure are useful for modulating Cas nuclease activity and/or binding to chemically modified guide RNAs.
In certain embodiments, the catalytic activity of the Cas protein of the present disclosure is altered or modified. It will be appreciated that a mutated Cas has altered or modified catalytic activity if the catalytic activity is different from the catalytic activity of the corresponding wild-type Cas protein (e.g., an unmutated Cas protein). The catalytic activity may be determined by means known in the art. For example, but not limited to, catalytic activity can be determined in vitro or in vivo by determining the percent indels (e.g., after a given time or at a given dose). In certain embodiments, catalytic activity is increased. In certain embodiments, the catalytic activity is increased by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In certain embodiments, the catalytic activity is reduced. In certain embodiments, the catalytic activity is reduced by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or (substantially) 100%. One or more mutations herein may inactivate the catalytic activity, which may significantly reduce all catalytic activity, reduce activity below a detectable level, or reduce to no measurable catalytic activity.
One or more characteristics of the engineered Cas protein may be different from those of the corresponding wild-type Cas protein. Examples of such characteristics include catalytic activity, gRNA binding, specificity of the Cas protein (e.g., specificity of editing a defined target), stability of the Cas protein, off-target binding, protease activity, nickase activity, and PFS recognition. In certain examples, the engineered Cas protein may comprise one or more mutations relative to the corresponding wild-type Cas protein. In certain embodiments, the engineered Cas protein has increased catalytic activity compared to the corresponding wild-type Cas protein. In certain embodiments, the engineered Cas protein has reduced catalytic activity compared to the corresponding wild-type Cas protein. In certain embodiments, the gRNA binding of the engineered Cas protein is increased compared to the corresponding wild-type Cas protein. In certain embodiments, the gRNA binding of the engineered Cas protein is reduced compared to the corresponding wild-type Cas protein. In certain embodiments, the Cas protein has increased specificity compared to the corresponding wild-type Cas protein. In certain embodiments, the Cas protein has reduced specificity compared to the corresponding wild-type Cas protein. In certain embodiments, the Cas protein has increased stability compared to the corresponding wild-type Cas protein. In certain embodiments, the Cas protein has reduced stability compared to the corresponding wild-type Cas protein. In certain embodiments, the engineered Cas protein further comprises one or more mutations that inactivate catalytic activity. In certain embodiments, off-target binding of the Cas protein is increased compared to the corresponding wild-type Cas protein. In certain embodiments, off-target binding of the Cas protein is reduced compared to the corresponding wild-type Cas protein.
In certain embodiments, the Cas protein has increased target binding compared to the corresponding wild-type Cas protein. In certain embodiments, the Cas protein has reduced target binding compared to the corresponding wild-type Cas protein. In certain embodiments, the engineered Cas protein has a higher protease activity or polynucleotide binding capacity as compared to the corresponding wild-type Cas protein. In certain embodiments, PFS recognition is altered compared to the corresponding wild-type Cas protein.
Examples of Cas proteins
Examples of Cas proteins include class 1 (e.g., type I, type III, and type IV) and class 2 (e.g., type II, type V, and type VI) Cas proteins, such as Cas9, Cas12 (e.g., Cas12a, Cas12b, Cas12c, Cas12d), Cas13 (e.g., Cas13a, Cas13b, Cas13c, Cas13d), CasX, CasY, Cas14, variants thereof (e.g., mutant forms, truncated forms), homologs thereof, and orthologs thereof. The terms "ortholog" and "homolog" are well known in the art. By way of further guidance, a "homolog" of a protein as used herein is a protein of the same species that performs the same or similar function as the protein of which it is a homolog. Homologous proteins may be, but need not be, structurally related, or may be only partially structurally related. An "ortholog" of a protein, as used herein, is a protein of a different species that performs the same or similar function as the protein of which it is an ortholog. Orthologous proteins may be, but need not be, structurally related, or may be only partially structurally related.
Class 2 Cas proteins
In certain embodiments, the Cas protein is a class 2 Cas protein, i.e., a Cas protein of a class 2 CRISPR-Cas system. The class 2 CRISPR-Cas system may be of a subtype, e.g., type II-A, type II-B, type II-C, type V-A, type V-B, type V-C or type V-U. In certain embodiments, the Cas protein is Cas9, Cas12a, Cas12b, Cas12c, or Cas12d. In certain embodiments, Cas9 may be SpCas9, SaCas9, StCas9, or another Cas9 ortholog. Cas12 may be Cas12a, Cas12b, or Cas12c, including FnCas12a, or a homolog or ortholog thereof. Definitions and exemplary members of CRISPR-Cas systems include those described in: Kira S. Makarova and Eugene V. Koonin, Annotation and Classification of CRISPR-Cas Systems, Methods Mol Biol. 2015;1311:47-75; and Sergey Shmakov et al., Diversity and evolution of class 2 CRISPR-Cas systems, Nat Rev Microbiol. 2017 Mar;15(3):169-182.
Cas protein linker
In certain examples, the Cas protein comprises at least one RuvC domain and at least one HNH domain. The Cas protein may further comprise first and second linker domains connecting the RuvC domain and the HNH domain. The first (L1) and second (L2) linkers in Cas9 that connect the HNH and RuvC domains are described in fig. 1 of Nishimasu, H. et al., "Crystal structure of Cas9 in complex with guide RNA and target DNA," Cell 156 (Feb. 27, 2014): 935-949, and in Ribeiro, L. et al. (2018), "Protein engineering strategies to expand CRISPR-Cas9 applications," International Journal of Genomics, Volume 2018, Article ID 1652567 (doi.org/10.1155/2018/1652567), both of which are specifically incorporated herein by reference and illustrate the overall organization, structure, and function of Cas9. Specifically, fig. 1A of Ribeiro shows a schematic representation of the domain organization of SpCas9, indicating the structure of the HNH and RuvC domains described herein, including the linkers L1 (spanning amino acids 765-780) and L2 (spanning amino acids 906-918).
Similarly, when referencing the first and second linker domains, the domain organization of Staphylococcus aureus Cas9 (SaCas9) can be utilized. In one aspect, the linker 1 region spans residues 481-519 and connects the RuvC-II domain to the HNH domain in SaCas9. In certain embodiments, the linker 2 region spans residues 629-649 and connects the RuvC-III domain and the HNH domain of SaCas9. Thus, the first and/or second linker domain may be mutated in a Cas9 ortholog, and the amino acid residue corresponding to the amino acid of wild-type SaCas9 may be referenced. See Nishimasu et al., Cell. 2015 Aug 27;162(5):1113-1126; doi: 10.1016/j.cell.2015.08.007, incorporated herein by reference. In particular, figs. 1 and S1-S3 of Nishimasu describe in detail the domain organization of Cas9 proteins and are specifically incorporated herein by reference.
The first and second linkers may each comprise about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45 or more amino acids. The first and second linkers may correspond to wild-type linkers. In certain aspects, the first and/or second linkers may comprise one or more mutations. In one aspect, the first and/or second linker comprises one or more mutations that increase the specificity of the Cas9 protein.
In certain embodiments, the linkers L1 and L2 that connect the HNH and RuvC domains of Cas9 contain wild-type amino acid sequences. In certain embodiments, a linker connecting the HNH and RuvC domains contains a mutation of one or more amino acids. In one exemplary embodiment, the first linker (L1) contains a mutation corresponding to amino acid T769I of SpCas9 and/or the second linker (L2) contains a mutation corresponding to amino acid G915M of SpCas9. In one exemplary embodiment, one or more linker mutations, such as T769I and G915M, confer improved specificity on the Cas9 protein.
In one embodiment, one or more mutations in the first and second linkers can be combined with one or more mutations in other portions of the Cas9 protein to further improve specificity and/or to maintain activity substantially equivalent to that of a wild-type Cas9 protein, as described herein. In one embodiment, mutations in the linkers and/or additional mutations within the Cas protein that enhance specificity and substantially preserve the activity of wild-type Cas9 can be identified using the methods detailed herein.
Class 2 type II Cas protein (e.g., Cas9)
In certain embodiments, the Cas protein can be a Cas protein of a class 2 type II CRISPR-Cas system (a type II Cas protein). In certain embodiments, the Cas protein may be a class 2 type II Cas protein, such as Cas9. In certain embodiments, the CRISPR/Cas9-based system can include a Cas9 protein or a fragment thereof, a Cas9 fusion protein, a nucleic acid encoding a Cas9 protein or a fragment thereof, or a nucleic acid encoding a Cas9 fusion protein. By "Cas9 (CRISPR-associated protein 9)" is meant a polypeptide or fragment thereof having at least about 85% amino acid identity to NCBI accession No. NP_269215 and having RNA binding activity, DNA binding activity and/or DNA cleavage activity (e.g., endonuclease or nickase activity). "Cas9 function" may be defined by any of a variety of assays, including, but not limited to, fluorescence polarization-based nucleic acid binding assays, fluorescence polarization-based strand invasion assays, transcription assays, EGFP disruption assays, DNA cleavage assays, and/or survival assays, e.g., as described herein. By "Cas9 nucleic acid molecule" is meant a polynucleotide encoding a Cas9 polypeptide or a fragment thereof. An exemplary Cas9 nucleic acid molecule sequence is provided at genomic sequence No. NC_002737. In certain embodiments, disclosed herein are inhibitors of Cas9, e.g., naturally occurring Cas9 of Streptococcus pyogenes (SpCas9) or Staphylococcus aureus (SaCas9), or variants thereof. Cas9 recognizes foreign DNA using the protospacer adjacent motif (PAM) sequence and base pairing of the guide RNA (gRNA) with the target DNA. The relative ease with which Cas9 induces targeted strand breaks at any genomic locus enables efficient genome editing in a variety of cell types and organisms. Cas9 derivatives may also be used as transcriptional activators/repressors.
In certain instances, the CRISPR-Cas protein is Cas9 or a variant thereof. In certain examples, Cas9 can be a wild-type Cas9, including any naturally occurring bacterial Cas9. Cas9 orthologs typically share a general organization of 3-4 RuvC domains and one HNH domain. The 5'-most RuvC domain cleaves the non-complementary strand, and the HNH domain cleaves the complementary strand. All notations are in reference to the guide sequence. The catalytic residue in the 5' RuvC domain is identified by homology comparison of the Cas9 of interest to other Cas9 orthologs (from the Streptococcus pyogenes type II CRISPR locus, Streptococcus thermophilus CRISPR locus 1, Streptococcus thermophilus CRISPR locus 3, and the Francisella novicida type II CRISPR locus), and the conserved Asp residue (D10) is mutated to alanine to convert Cas9 into a complementary-strand nicking enzyme. Thus, the Cas enzyme may be a wild-type Cas9, including any naturally occurring bacterial Cas9. The CRISPR, Cas or Cas9 enzyme may be codon optimized, or may be a modified version, including any chimera, mutant, homolog or ortholog. In another aspect of the disclosure, the Cas9 enzyme may comprise one or more mutations and may be used as a universal DNA binding protein, with or without fusion to a functional domain.
The mutation may be an artificially introduced mutation or a gain-of-function or loss-of-function mutation. In certain embodiments, the transcriptional activation domain may be VP64. In certain embodiments, the transcriptional repressor domain may be KRAB or SID4X. Other aspects of the disclosure relate to mutated Cas9 enzymes fused to domains including, but not limited to, nucleases, transcriptional activators, repressors, recombinases, transposases, histone remodelers, demethylases, DNA methyltransferases, cryptochromes, light-inducible/controllable domains, or chemically inducible/controllable domains. The present disclosure may relate to sgRNAs, tracrRNAs, guide sequences, or chimeric guide sequences that enhance the performance of these RNAs in cells. Such a type II CRISPR enzyme can be any Cas enzyme. In certain instances, the Cas9 enzyme is from or derived from SpCas9 or SaCas9. The term "derived" as used herein with respect to an enzyme means that the derived enzyme is largely based on the wild-type enzyme (in the sense of having a high degree of sequence homology with the wild-type enzyme), but it has been mutated (modified) in some way known in the art or as described herein. In one example, the mutation may include one or more mutations in the first linker domain, the second linker domain, and/or other portions of the protein. High sequence homology may include at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence homology relative to the wild-type enzyme.
The Cas enzyme may be identified as Cas9, as this may refer to the general class of enzymes that share homology with the largest nuclease, having multiple nuclease domains, of the type II CRISPR system. In certain instances, the Cas9 enzyme is from or derived from SpCas9 (S. pyogenes Cas9) or SaCas9 (S. aureus Cas9). "StCas9" refers to wild-type Cas9 (UniProt ID: G3ECR1) from Streptococcus thermophilus (S. thermophilus). Similarly, "SpCas9" refers to wild-type Cas9 (UniProt ID: Q99ZW2) from Streptococcus pyogenes (S. pyogenes). The term "derived" as used herein with respect to an enzyme means that the derived enzyme is largely based on the wild-type enzyme (in the sense of having a high degree of sequence homology with the wild-type enzyme), but it has been mutated (modified) in some way known in the art or as described herein. It is to be understood that the terms Cas and CRISPR enzyme are generally used interchangeably herein unless specifically indicated otherwise. As described above, many of the residue numbers used herein refer to the Cas9 enzyme from the type II CRISPR locus in Streptococcus pyogenes.
In particular embodiments, the effector protein is a Cas9 effector protein from or derived from an organism of a genus selected from: Streptococcus, Campylobacter, Nitratifractor, Staphylococcus, Parvibaculum, Roseburia, Neisseria, Gluconobacter, Gluconacetobacter, Azospirillum, Sphaerochaeta, Lactobacillus, Eubacterium, Corynebacter, Carnobacterium, Rhodobacter, Listeria, Paludibacter, Clostridium, Lachnospiraceae, Clostridiaridium, Leptotrichia, Francisella, Legionella, Alicyclobacillus, Methanomethyophilus, Porphyromonas, Prevotella, Bacteroidetes, Helcococcus, Leptospira, Desulfovibrio, Desulfonatronum, Opitutaceae, Tuberibacillus, Bacillus, Brevibacillus, Methylobacterium, Acidaminococcus, Sutterella, Treponema, Filifactor, Mycoplasma, Bacteroides, Flaviivola, or Flavobacterium.
In certain embodiments, the Cas9 protein is from or derived from an organism selected from the group consisting of: S. mutans, S. agalactiae, S. equisimilis, S. sanguinis, S. pneumoniae, C. jejuni, C. coli, N. salsuginis, N. tergarcus, S. auricularis, S. carnosus, N. meningitidis, N. gonorrhoeae, L. monocytogenes, L. ivanovii, C. botulinum, C. difficile, C. tetani, C. sordellii, Francisella tularensis 1, Francisella tularensis subsp. novicida, Prevotella albensis, Lachnospiraceae bacterium MC2017 1, Butyrivibrio proteoclasticus, Peregrinibacteria bacterium GW2011_GWA2_33_10, Parcubacteria bacterium GW2011_GWC2_44_17, Smithella sp. SCADC, Acidaminococcus sp. BV3L6, Lachnospiraceae bacterium MA2020, Candidatus Methanoplasma termitum, Eubacterium eligens, Moraxella bovoculi 237, Leptospira inadai, Lachnospiraceae bacterium ND2006, Porphyromonas crevioricanis 3, Prevotella disiens, and Porphyromonas macacae.
In certain embodiments, the Cas9 protein is a Cas9 from or derived from Streptococcus pyogenes, Staphylococcus aureus, or Streptococcus thermophilus.
In a more preferred embodiment, the Cas9 protein is derived from a bacterial species selected from Streptococcus pyogenes, Staphylococcus aureus, or Streptococcus thermophilus. In certain embodiments, the Cas9 is derived from a bacterial species selected from the group consisting of: Francisella tularensis 1, Prevotella albensis, Lachnospiraceae bacterium MC2017 1, Butyrivibrio proteoclasticus, Peregrinibacteria bacterium GW2011_GWA2_33_10, Parcubacteria bacterium GW2011_GWC2_44_17, Smithella sp. SCADC, Acidaminococcus sp. BV3L6, Lachnospiraceae bacterium MA2020, Candidatus Methanoplasma termitum, Eubacterium eligens, Moraxella bovoculi 237, Leptospira inadai, Lachnospiraceae bacterium ND2006, Porphyromonas crevioricanis 3, Prevotella disiens, and Porphyromonas macacae. In certain embodiments, the Cas9 protein is derived from a bacterial species selected from Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium MA2020. In certain embodiments, the effector protein is derived from a subspecies of Francisella tularensis 1, including, but not limited to, Francisella tularensis subsp. novicida.
Cas9 enzymes include, but are not limited to, Streptococcus pyogenes serotype M1 Cas9 (UniProt ID: Q99ZW2), Staphylococcus aureus Cas9 (UniProt ID: J7RUA5), Eubacterium ventriosum Cas9 (UniProt ID: A5Z395), Azospirillum (strain B510) Cas9 (UniProt ID: D3NT09), Gluconacetobacter diazotrophicus (strain ATCC 49037) Cas9 (UniProt ID: A9HKP2), Neisseria cinerea Cas9 (UniProt ID: D0W2Z9), Roseburia intestinalis Cas9 (UniProt ID: C7G697), Parvibaculum lavamentivorans (strain DS-1) Cas9 (UniProt ID: A7HP89), Nitratifractor salsuginis (strain DSM 16511) Cas9, and Campylobacter lari (strain CF89-12) Cas9.
The enzymatic action of Cas9 derived from Streptococcus pyogenes, or of any closely related Cas9, creates a double strand break at a target site sequence that hybridizes to 20 nucleotides of the guide sequence and has a protospacer adjacent motif (PAM) sequence following the 20 nucleotides of the target sequence (examples include NGG/NRG or a PAM that can be determined as described herein). CRISPR activity for site-specific DNA recognition and cleavage by Cas9 is defined by the guide sequence, the tracr sequence that hybridizes in part to the guide sequence, and the PAM sequence. Further aspects of CRISPR systems are described in Karginov and Hannon, The CRISPR system: small RNA-guided defense in bacteria and archaea, Mol Cell 2010 Jan 15;37(1):7. The type II CRISPR locus from Streptococcus pyogenes SF370 contains a cluster of four genes, Cas9, Cas1, Cas2 and Csn1, as well as two non-coding RNA elements: tracrRNA and a characteristic array of repeat sequences (direct repeats) separated by short non-repeat sequences (spacers, each about 30 bp). In this system, targeted DNA double strand breaks (DSBs) are generated in four consecutive steps. First, two non-coding RNAs, namely a pre-crRNA array and a tracrRNA, are transcribed from the CRISPR locus. Second, the tracrRNA hybridizes to the direct repeats of the pre-crRNA, which is then processed into mature crRNAs containing the individual spacer sequences. Third, the mature crRNA-tracrRNA complex directs Cas9 to a DNA target consisting of the protospacer and the corresponding PAM through heteroduplex formation between the spacer of the crRNA and the protospacer DNA. Finally, Cas9 mediates cleavage of the target DNA upstream of the PAM, generating a DSB within the protospacer. A pre-crRNA array consisting of a single spacer flanked by two direct repeats (DRs) is also encompassed by the term "tracr-mate sequence".
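The target-site rule described above (a 20-nucleotide protospacer immediately followed by a PAM) can be sketched in a few lines. The following is a minimal illustration, assuming an NGG PAM and scanning only the strand supplied; it does not search the reverse complement, score guides, or predict off-target sites, and the function name is hypothetical.

```python
def find_ngg_target_sites(dna: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start index, protospacer, PAM) for each protospacer followed by an NGG PAM.

    Assumptions: forward strand only; the PAM is exactly 3 nt of the form NGG;
    indices are 0-based positions in the supplied sequence.
    """
    dna = dna.upper()
    pam_len = 3
    sites = []
    for start in range(len(dna) - protospacer_len - pam_len + 1):
        pam_start = start + protospacer_len
        pam = dna[pam_start:pam_start + pam_len]
        if pam[1:] == "GG":  # "N" may be any base, so only the last two bases are checked
            sites.append((start, dna[start:pam_start], pam))
    return sites


# Toy sequence (not a real genomic locus): 20 A's, then the PAM "TGG", then padding.
toy = "A" * 20 + "TGG" + "CCC"
for start, protospacer, pam in find_ngg_target_sites(toy):
    print(start, protospacer, pam)
```

A practical guide-design workflow would also scan the reverse complement and evaluate candidate guides for specificity; those steps are outside the scope of this sketch.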
In certain embodiments, Cas9 may be constitutively, inducibly, or conditionally present, administered, or delivered. Cas9 optimization may be used to enhance function or to develop new functions. Chimeric Cas9 proteins can be produced, and Cas9 can be used as a universal DNA-binding protein. The structural information provided for Cas9 can be used to further engineer and optimize the CRISPR-Cas system, and this may be extrapolated to interrogate structure-function relationships in other CRISPR enzyme systems, in particular in other type II CRISPR enzymes or Cas9 orthologs. Crystal structure information (described in U.S. provisional applications 61/915,251 filed December 12, 2013, 61/930,214 filed January 22, 2014, and 61/980,012 filed April 15, 2014; and in Nishimasu et al., "Crystal Structure of Cas9 in Complex with Guide RNA and Target DNA," Cell 156(5):935-949, DOI: http://dx.doi.org/10.1016/j.cell.2014.02.001 (2014), each of which is incorporated herein by reference in its entirety) provides structural information for truncating the enzyme and generating a modular or multipart CRISPR enzyme that may be incorporated into an inducible CRISPR-Cas system. In particular, structural information is provided for Streptococcus pyogenes Cas9 (SpCas9), and this can be extrapolated to other Cas9 orthologs or other type II CRISPR enzymes. Cas9 genes are present in several different bacterial genomes, typically in the same locus as the Cas1, Cas2 and Cas4 genes and a CRISPR cassette. In addition, the Cas9 protein contains an easily identifiable C-terminal region that is homologous to the transposon ORF-B and includes an active RuvC-like nuclease and an arginine-rich region.
dCas9
The Cas9 protein may be mutated so as to inactivate its nuclease activity. An inactivated Cas9 protein (iCas9, also known as "dCas9") from Streptococcus pyogenes that has no endonuclease activity has recently been targeted by gRNA to genes in bacteria, yeast and human cells to silence gene expression by steric hindrance. As used herein, "dCas molecule" may refer to a dCas protein or a fragment thereof. As used herein, "dCas9 molecule" may refer to a dCas9 protein or a fragment thereof. The terms "iCas" and "dCas" are used interchangeably herein and refer to a catalytically inactive CRISPR-associated protein. In one embodiment, the dCas molecule comprises one or more mutations in the DNA cleavage domain. In one embodiment, the dCas molecule comprises one or more mutations in the RuvC or HNH domain. In one embodiment, the dCas molecule comprises one or more mutations in both the RuvC and HNH domains. In one embodiment, the dCas molecule is a fragment of a wild-type Cas molecule. In one embodiment, the dCas molecule comprises a functional domain from a wild-type Cas molecule, wherein the functional domain is selected from a REC1 domain, a bridge helix domain, or a PAM-interacting domain. In one embodiment, the nuclease activity of the dCas molecule is reduced by at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 99% as compared to the corresponding wild-type Cas molecule.
Suitable dCas molecules can be derived from wild-type Cas molecules. The Cas molecule may be from a type I, type II or type III CRISPR-Cas system. In one embodiment, a suitable dCas molecule can be derived from a Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, or Cas10 molecule. In one embodiment, the dCas molecule is derived from a Cas9 molecule. The dCas9 molecule can be obtained by introducing point mutations (e.g., substitutions, deletions, or additions) into, for example, a DNA cleavage domain of the Cas9 molecule, e.g., a nuclease domain, e.g., the RuvC and/or HNH domain. See, e.g., Jinek et al., Science (2012) 337:816-21, which is incorporated herein by reference in its entirety. For example, the introduction of two point mutations in the RuvC and HNH domains reduces Cas9 nuclease activity while retaining the sgRNA and DNA binding activity of Cas9. In one embodiment, the two point mutations within the RuvC and HNH active sites are the D10A and H840A mutations of a Streptococcus pyogenes Cas9 molecule. Alternatively, D10 and H840 of the Streptococcus pyogenes Cas9 molecule may be deleted to inactivate Cas9 nuclease activity while retaining its sgRNA and DNA binding activity. In one embodiment, the two point mutations within the RuvC and HNH active sites are the D10A and N580A mutations of a Staphylococcus aureus Cas9 molecule.
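The point-mutation notation used above (for example, D10A means that the aspartate at position 10 is replaced by alanine) can be applied programmatically. The following is a minimal sketch, assuming 1-based residue numbering and single-letter amino acid codes; the helper name is hypothetical, and the toy sequence is a placeholder rather than any Cas9 sequence of the present disclosure.

```python
import re


def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply substitutions written as <wild-type residue><position><new residue>, e.g. "D10A".

    Positions are 1-based. A ValueError is raised if the notation is malformed,
    the position is out of range, or the stated wild-type residue is not found.
    """
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if position < 1 or position > len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}, found {residues[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence with an aspartate (D) at position 10; not a real Cas9 sequence.
toy_protein = "M" + "G" * 8 + "D" + "G" * 5
print(apply_substitutions(toy_protein, ["D10A"]))
```

Checking the stated wild-type residue before substituting guards against numbering mismatches, which matters here because residue numbers differ between Cas9 orthologs.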
In one embodiment, the dCas molecule is an S. aureus dCas9 molecule comprising mutations at D10 and/or N580, numbered according to SEQ ID NO: 1. In one embodiment, the dCas molecule is an S. aureus dCas9 molecule comprising the D10A and/or N580A mutations, numbered according to SEQ ID NO: 1.
Staphylococcus aureus (S. aureus) dCas9
In one embodiment, the dCas9 molecule is a Staphylococcus aureus (S. aureus) dCas9 molecule comprising the amino acid sequence of SEQ ID NO: 2 or 3, an amino acid sequence substantially identical to SEQ ID NO: 2 or 3 (e.g., having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity), a sequence having 1, 2, 3, 4, 5 or more changes (e.g., amino acid substitutions, insertions or deletions) relative to SEQ ID NO: 2 or 3, or any fragment thereof.
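The identity thresholds and change counts recited above can be checked with a simple calculation once two sequences have been aligned. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with "-" marking gaps, and assuming identity is expressed as identical columns divided by alignment length; other conventions for the denominator exist, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: inputs are pre-aligned, of equal length, with "-" for gaps.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def count_changes(aligned_a: str, aligned_b: str) -> int:
    """Number of columns that differ (substitutions, insertions, or deletions)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)


# Toy peptides, not sequences from the sequence listing.
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"
print(percent_identity(reference, candidate), count_changes(reference, candidate))
```

Producing the alignment itself is a separate step that would ordinarily be handled by an established alignment tool.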
Similar mutations can also be applied to any other naturally occurring Cas9 (e.g., Cas9 from other species) or engineered Cas9 molecule. In certain embodiments, the dCas9 molecule is a Streptococcus pyogenes dCas9 molecule, a Staphylococcus aureus dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (strain DSM 16511) dCas9 molecule, a Campylobacter lari (strain CF89-12) dCas9 molecule, a Streptococcus thermophilus dCas9 molecule, or a fragment thereof.
In certain embodiments, the present disclosure provides a vector comprising a nucleotide sequence encoding a Streptococcus pyogenes dCas9 molecule, a Staphylococcus aureus dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (strain DSM 16511) dCas9 molecule, a Campylobacter lari (strain CF89-12) dCas9 molecule, a Streptococcus thermophilus dCas9 molecule, or a fragment thereof.
Exemplary dCas9 proteins include, but are not limited to, those listed in table 1.
Table 1. Exemplary dCas9 proteins
Cas9 fusion proteins
CRISPR/Cas9-based systems may include fusion molecules (e.g., DNMT3A-DNMT3L (3A3L)-dCas9-KRAB). The fusion molecule may comprise at least one DNA binding protein (e.g., dCas9) and at least one gene expression modulator (e.g., KRAB, DNMT3A, DNMT3L, or a DNMT3A-DNMT3L fusion peptide). In certain embodiments, the gene expression modulator is selected from a repressor of gene expression (e.g., KRAB), a gene expression activator, or an epigenetic modification modulator (e.g., DNMT3A, DNMT3L, or a DNMT3A-DNMT3L fusion peptide), or a combination thereof. Different modulators of gene expression are known in the art; see, for example, Thakore et al., Nat Methods. 2016;13:127-37, which is incorporated herein by reference in its entirety.
Repressor of gene expression
In certain embodiments, the gene expression modulator comprises a repressor of gene expression. The repressor may be any known gene expression repressor, for example, a repressor selected from the group consisting of a Kruppel-associated box (KRAB) domain, mSin3 interaction domain (SID), MAX-interacting protein 1 (MXI1), chromo shadow domain, EAR-repression domain (SRDX), eukaryotic release factor 1 (ERF1), eukaryotic release factor 3 (ERF3), tetracycline repressor, lacI repressor, Catharanthus roseus G-box binding factors 1 and 2, Drosophila Groucho, tripartite motif-containing 28 (TRIM28), nuclear receptor co-repressor 1, nuclear receptor co-repressor 2, or fragments or fusions thereof.
Kruppel-associated box (KRAB)
The KRAB domain is a transcriptional repression domain that is present in the N-terminal portion of many zinc finger protein-based transcription factors. The KRAB domain acts as a transcriptional repressor when tethered to target DNA through a DNA binding domain. The KRAB domain is rich in charged amino acids and can be divided into subdomains A and B. The KRAB A and B subdomains may be separated by a variable spacer segment, and many KRAB proteins contain only the A subdomain. A 45 amino acid sequence in the KRAB A subdomain has been shown to be important for transcriptional repression. The B subdomain does not repress transcription by itself, but enhances the repression exerted by the KRAB A subdomain. The KRAB domain recruits the co-repressor KAP1 (KRAB-associated protein-1, also known as transcription intermediary factor 1 beta, KRAB-A interacting protein, and tripartite motif protein 28) and heterochromatin protein 1 (HP1), as well as other chromatin regulatory proteins, causing transcriptional repression through heterochromatin formation. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to a KRAB domain or fragment thereof. In one embodiment, the KRAB domain or fragment thereof is fused to the N-terminus of the dCas9 molecule. In one embodiment, the KRAB domain or fragment thereof is fused to the C-terminus of the dCas9 molecule. In one embodiment, the KRAB domain or fragment thereof is fused to both the N-terminus and the C-terminus of the dCas9 molecule. In one embodiment, the fusion molecule comprises a KRAB domain comprising the amino acid sequence of SEQ ID NO: 22, a sequence substantially identical to SEQ ID NO: 22 (e.g., having at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or more identity), a sequence having 1, 2, 3, 4, 5 or more changes (e.g., amino acid substitutions, insertions, or deletions) relative to SEQ ID NO: 22, or any fragment thereof.
Exemplary KRAB domain sequences
mSin3 interaction domain (SID)
The mSin3 interaction domain (SID) is an interaction domain that is present on several transcriptional repressor proteins. It interacts with the paired amphipathic alpha-helix 2 (PAH2) domain of mSin3, a transcriptional repressor domain that is attached to transcriptional repressor proteins such as the mSin3A co-repressor. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to an mSin3 interaction domain or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to four tandem mSin3 interaction domains (SID4X). In one embodiment, the four tandem mSin3 interaction domains (SID4X) are fused to the C-terminus of the dCas9 molecule.
MAX interactor 1 (MXI1)
Mxi1 is a repressor of MYC. Mxi1 may antagonize MYC transcriptional activity by competing for MYC-associated factor X (MAX), which binds to MYC and is essential for MYC function. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to Mxi1 or a fragment thereof. In one embodiment, Mxi1 is fused to the C-terminus of the dCas9 molecule.
Gene expression activators
In certain embodiments, the gene expression modulator comprises a gene expression activator. The activator may be any known gene expression activator, such as a VP16 activation domain, a VP64 activation domain, a p65 activation domain, an Epstein-Barr virus R transactivator (Rta) molecule, or a fragment thereof. Activators that can be used with dCas9 molecules are known in the art. See, e.g., Chavez et al., Nat Methods (2016) 13:563-67, incorporated herein by reference in its entirety.
VP16, VP64, VP160
VP16 is a 16 amino acid viral protein sequence that recruits transcriptional activators to promoters and enhancers. VP64 is a transcriptional activator comprising four copies of VP16, e.g., a molecule comprising four tandem copies of VP16 linked by Gly-Ser linkers. VP160 is a transcriptional activator comprising 10 copies of VP16. In one embodiment, the methods and compositions disclosed herein comprise fusion molecules comprising dCas9 molecules fused to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more copies of VP16. In one embodiment, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to VP64. In one embodiment, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to VP160. In one embodiment, VP64 is fused to the C-terminus, the N-terminus, or both the N- and C-termini of the dCas9 molecule.
p65 activation domain (p65AD)
p65AD is the principal transactivation domain of the 65 kDa polypeptide of the nuclear form of the NF-κB transcription factor. An exemplary sequence for human transcription factor p65 is available in the UniProt database under accession number Q04206. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to p65 or a fragment thereof, e.g., p65AD.
Epstein-Barr virus (EBV) R transactivator (Rta)
Rta, an immediate-early protein of EBV, is a transcriptional activator that induces lytic gene expression and triggers viral reactivation. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to Rta or a fragment thereof.
VP64, p65, Rta fusion
In one embodiment, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to VP64, p65, Rta, or any combination thereof. The tripartite activator VP64-p65-Rta (also known as VPR), in which the three transcriptional activation domains are fused using short amino acid linkers, can efficiently up-regulate target gene expression when fused to a dCas9 molecule. In one embodiment, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to VPR.
Synergistic Activation Mediator (SAM)
In one embodiment, the methods and compositions disclosed herein comprise a CRISPR-Cas system comprising three components: (1) a dCas9-VP64 fusion, (2) a gRNA incorporating two MS2 RNA aptamers at the tetraloop and stem loop 2, and (3) the MS2-p65-HSF1 activation helper protein. This system, named the Synergistic Activation Mediator (SAM), brings together three activation domains (VP64, p65 and HSF1) and has been described in Konermann et al., Nature. 2015; 517:583-8, which is incorporated herein by reference in its entirety.
Ldb1 self-association domain
In one embodiment, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to an Ldb1 self-association domain. The Ldb1 self-association domain recruits enhancer-associated endogenous Ldb1.
Epigenetic modification modulators
In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to a gene expression modulator. In certain embodiments, the gene expression modulator comprises an epigenetic modification modulator. In one embodiment, the fusion molecule modulates target gene expression by epigenetic modification, such as by histone acetylation or methylation or DNA methylation at regulatory elements of the target gene, such as promoters, enhancers, or transcription start sites. The modulator may be any known epigenetic modification modulator, such as a histone acetyltransferase (e.g., the p300 catalytic domain), a histone deacetylase, a histone methyltransferase (e.g., SUV39H1 or G9a (EHMT2)), a histone demethylase (e.g., LSD1), a DNA methyltransferase (e.g., DNMT3A or DNMT3A-DNMT3L), a DNA demethylase (e.g., the TET1 catalytic domain or TDG), or a fragment thereof.
Histone modification activity
In certain embodiments, the epigenetic modification modulator can have histone modification activity. Histone modification activities may include, but are not limited to, histone deacetylase, histone acetyltransferase, histone demethylase, or histone methyltransferase activity.
In certain embodiments, the epigenetic modification modulator may have histone acetyltransferase activity. The histone acetyltransferase can be a p300 or CREB Binding Protein (CBP) protein or fragment thereof. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to acetyltransferase p300 or a fragment thereof, e.g., the catalytic core of p300. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to a CREB Binding Protein (CBP) protein or a fragment thereof.
In certain embodiments, the epigenetic modification modulator can have histone demethylase activity. For example, the epigenetic modification modulator can include an enzyme that removes a methyl (CH3-) group from a nucleic acid or protein (e.g., a histone). In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to Lys-specific histone demethylase 1 (LSD1) or a fragment thereof.
In certain embodiments, the epigenetic modification modulator may have histone methyltransferase activity. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to SUV39H1 or a fragment thereof. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to G9a (EHMT2) or a fragment thereof.
DNA demethylase activity
In certain embodiments, the epigenetic modification modulator can have DNA demethylase activity. For example, the epigenetic modification modulator may convert methylcytosine to hydroxymethylcytosine as a mechanism for DNA demethylation. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to ten-eleven translocation methylcytosine dioxygenase 1 (TET1) or a fragment thereof. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to thymine DNA glycosylase (TDG) or a fragment thereof.
DNA methylase activity
In certain embodiments, the epigenetic modification modulator can have DNA methylase activity. For example, the epigenetic modification modulator may have a methylase activity, which involves the transfer of a methyl group to DNA, RNA, a protein, a small molecule, cytosine, or adenine. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to DNMT3A or a fragment thereof. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to DNMT3L or a fragment thereof. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to DNMT3A and DNMT3L or a fragment thereof. In certain embodiments, the methods and compositions disclosed herein comprise a fusion molecule comprising a dCas9 molecule fused to a DNMT3A-DNMT3L fusion peptide.
DNMT3A
DNMT3L
DNMT3A-DNMT3L fusion peptides
In one embodiment, the Cas9 fusion protein further comprises a Nuclear Localization Sequence (NLS), such as an NLS fused to the N-terminus and/or C-terminus of Cas9.
Nuclear localization sequences are known in the art. In one embodiment, the NLS comprises the amino acid sequence of SEQ ID NO:25 or 26, a sequence substantially identical to SEQ ID NO:25 or 26 (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or more identical), a sequence having 1, 2, 3, 4, 5 or more changes (e.g., amino acid substitutions, insertions, or deletions) relative to SEQ ID NO:25 or 26, or any fragment thereof.
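The identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch, assuming an ungapped, position-by-position comparison of two sequences of equal length; in practice, percent identity is usually computed from a sequence alignment that also accounts for insertions and deletions. The function name and the single-substitution variant are hypothetical and serve only as an illustration.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percent of positions at which two equal-length sequences match.

    Simplifying assumption: no gaps, so insertions and deletions are not
    handled; an alignment-based method would be needed for those cases.
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference, candidate))
    return 100.0 * matches / len(reference)


# SEQ ID NO:25 as given in this disclosure (16 amino acids).
NLS_SEQ_ID_25 = "APKKKRKVGIHGVPAA"
# Hypothetical variant carrying one substitution at the last position.
HYPOTHETICAL_VARIANT = "APKKKRKVGIHGVPAG"

identity = percent_identity(NLS_SEQ_ID_25, HYPOTHETICAL_VARIANT)
meets_80_percent = identity >= 80.0
```

With one change in a 16 amino acid sequence, 15 of 16 positions match, so such a variant would satisfy each recited threshold up to 92% but not the 95% threshold.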
SEQ ID NO:25 (exemplary Nuclear localization sequence)
APKKKRKVGIHGVPAA
SEQ ID NO:26 (exemplary Nuclear localization sequence)
KRPAATKKAGQAKKKK
In certain embodiments, the CRISPR/Cas9-based system can include a dCas9 molecule and a gene expression modulator, or a nucleic acid encoding a dCas9 molecule and a gene expression modulator. In one embodiment, the dCas9 molecule and the gene expression modulator are covalently linked. In one embodiment, the gene expression modulator is directly covalently fused to the dCas9 molecule. In one embodiment, the gene expression modulator is indirectly covalently fused to the dCas9 molecule, for example, through a non-modulator or linker or through a second modulator. In one embodiment, the gene expression modulator is located at the N- and/or C-terminus of the dCas9 molecule. In one embodiment, the dCas9 molecule and the gene expression modulator are non-covalently linked. Exemplary linker sequences include, but are not limited to, those listed in Table 2. In certain embodiments, the linker between the dCas9 and at least one gene expression modulator comprises an amino acid sequence corresponding to a linker listed in Table 2.
Table 2. Exemplary linker sequences
In one embodiment, the dCas9 molecule is fused to a first tag, such as a first peptide tag. In one embodiment, the gene expression modulator is fused to a second tag, such as a second peptide tag. In one embodiment, the first and second tags, e.g., the first peptide tag and the second peptide tag, interact non-covalently with each other, thereby bringing the dCas9 molecule and the gene expression modulator into close proximity.
In one embodiment, the CRISPR/Cas9-based system comprises a fusion molecule or a nucleic acid encoding a fusion molecule. In one embodiment, the fusion molecule comprises a sequence comprising dCas9 fused to a gene expression modulator. In one embodiment, the dCas9 molecule includes a Streptococcus pyogenes dCas9 molecule, a Staphylococcus aureus dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (strain DSM 16511) dCas9 molecule, a Campylobacter lari (strain CF89-12) dCas9 molecule, a Streptococcus thermophilus (strain LMD-9) dCas9 molecule, or a fragment thereof.
In one embodiment, the fusion molecule is a DNMT3A-DNMT3L (3A3L)-dCas9-KRAB fusion molecule comprising, from N-terminus to C-terminus: a DNMT3A-DNMT3L fusion peptide (3A3L), a dCas9 peptide, and a KRAB peptide domain, fused directly or indirectly (e.g., via a linker).
In one embodiment, the fusion molecule comprises the amino acid sequence of SEQ ID NO:97, a sequence substantially identical to SEQ ID NO:97 (e.g., a sequence having a sequence identity of at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more), a sequence having 1, 2, 3, 4, 5 or more changes (e.g., substitutions, insertions, or deletions) relative to SEQ ID NO:97, or any fragment thereof.
DNMT3A-DNMT3L(3A3L)-dCas9-KRAB
gRNA
In the context of a CRISPR-Cas system, the term "guide sequence" as used herein includes any polynucleotide sequence that has sufficient complementarity to a target nucleic acid sequence to hybridize to the target nucleic acid sequence and direct the specific binding of a nucleic acid targeting complex to the sequence of the target nucleic acid sequence. The guide sequence may form a duplex with the target sequence. The duplex may be a DNA duplex, an RNA duplex or an RNA/DNA duplex. The terms "guide molecule," "guide RNA," and "single guide RNA," are used interchangeably herein to refer to an RNA-based molecule that is capable of forming a complex with a CRISPR-Cas protein and comprises a guide sequence that has sufficient complementarity to a target nucleic acid sequence to hybridize to the target nucleic acid sequence and direct sequence-specific binding of the complex to the target nucleic acid sequence. As described herein, a guide molecule or guide RNA specifically encompasses an RNA-based molecule having one or more chemical modifications (e.g., by chemically linking two ribonucleotides or by replacing one or more ribonucleotides with one or more deoxyribonucleotides).
The guide molecule or guide RNA of the CRISPR-Cas protein may comprise a tracr-mate sequence (encompassing the "direct repeat" in the context of an endogenous CRISPR system) and a guide sequence (also referred to as a "spacer" in the context of an endogenous CRISPR system). In certain embodiments, a CRISPR-Cas system or complex described herein does not comprise a tracr sequence and/or does not depend on the presence of a tracr sequence. In certain embodiments, the guide molecule may comprise, consist essentially of, or consist of a direct repeat sequence fused or linked to a guide sequence or a spacer sequence.
In general, CRISPR-Cas systems are characterized by elements that promote the formation of CRISPR complexes at target sequence sites. In the context of CRISPR complex formation, "target sequence" refers to a sequence to which a guide sequence is designed to have complementarity, wherein hybridization between the target DNA sequence and the guide sequence facilitates CRISPR complex formation.
In certain embodiments, the guide sequence or spacer of the guide molecule is 15 to 50 nucleotides in length. In certain embodiments, the spacer of the guide RNA is at least 15 nucleotides in length. In certain embodiments, the spacer is 15 to 17 nucleotides, 17 to 20 nucleotides, 20 to 24 nucleotides, 23 to 25 nucleotides, 24 to 27 nucleotides, 27 to 30 nucleotides, 30 to 35 nucleotides, or greater than 35 nucleotides in length.
In certain embodiments, the guide sequence is 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or 100 nucleotides in length.
In certain embodiments, the sequence (direct repeat and/or spacer) of the guide molecule is selected to reduce the extent of secondary structure within the guide molecule. In certain embodiments, about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1% or fewer of the nucleotides of the nucleic acid-targeting guide RNA participate in self-complementary base pairing when optimally folded. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimum Gibbs free energy. An example of one such algorithm is mFold, as described in Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another exemplary folding algorithm is the online web server RNAfold, which was developed at the Institute for Theoretical Chemistry at the University of Vienna and uses the centroid structure prediction algorithm (see, e.g., A.R. Gruber et al., 2008, Cell 106(1): 23-24; and PA Carr and GM Church, 2009, Nature Biotechnology 27(12): 1151-62).
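The fraction of nucleotides participating in self-complementary base pairing can be read directly from a predicted secondary structure. The following is a minimal sketch, assuming the structure has already been predicted by a folding program and is supplied in dot-bracket notation, in which "(" and ")" mark paired nucleotides and "." marks unpaired nucleotides. The function name and the example structure are hypothetical; the folding step itself is not performed here.

```python
def paired_fraction(dot_bracket: str) -> float:
    """Fraction of nucleotides that are base-paired in a dot-bracket string."""
    if not dot_bracket:
        raise ValueError("structure must not be empty")
    depth = 0   # number of currently open, unmatched '(' characters
    paired = 0  # number of nucleotides involved in a base pair
    for symbol in dot_bracket:
        if symbol == "(":
            depth += 1
            paired += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced structure: unmatched ')'")
            paired += 1
        elif symbol != ".":
            raise ValueError(f"unexpected symbol: {symbol!r}")
    if depth != 0:
        raise ValueError("unbalanced structure: unmatched '('")
    return paired / len(dot_bracket)


# Hypothetical predicted structure for a 20-nucleotide guide sequence:
# a four base-pair hairpin (8 paired nucleotides) within 20 nucleotides.
example_structure = "....((((....)))) ...".replace(" ", ".")
fraction = paired_fraction(example_structure)
passes_50_percent_limit = fraction <= 0.50
```

A guide sequence whose predicted structure yields a fraction at or below the chosen limit would satisfy the corresponding embodiment above.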
As described above, the CRISPR/Cas9-based system utilizes gRNAs that provide the targeting of the system. The gRNA is a fusion of two non-coding RNAs: a crRNA and a tracrRNA. The sgRNA can target any desired DNA sequence by exchanging the sequence encoding the 20bp protospacer, which confers targeting specificity through complementary base pairing with the desired DNA target. The gRNA mimics the naturally occurring crRNA:tracrRNA duplex involved in the type II effector system. Such a duplex may include, for example, a 42 nucleotide crRNA and a 75 nucleotide tracrRNA, and acts as a guide for Cas9 cleavage of the target nucleic acid.
The terms "target region," "target sequence," and "protospacer," as used interchangeably herein, refer to a region of a target gene targeted by a CRISPR/Cas9-based system. The CRISPR/Cas9-based system may include at least one gRNA, wherein the gRNAs target different DNA sequences. The target DNA sequences may be overlapping. The target sequence or protospacer is followed by a PAM sequence at its 3' end. Different type II systems have different PAM requirements. For example, the Streptococcus pyogenes type II system uses an "NGG" sequence, where "N" can be any nucleotide.
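The relationship between a target sequence and its PAM can be illustrated by scanning a DNA sequence for 20-nucleotide windows that are immediately followed, at the 3' end, by an "NGG" motif. The following is a minimal sketch that searches the given strand only; a fuller search would also examine the reverse complement. The function name is hypothetical, and the flanking "AGG" and trailing nucleotides in the example are invented for illustration and do not represent the genomic context of any gene.

```python
from typing import List, Tuple


def find_ngg_targets(sequence: str, spacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """Return (start, target, pam) for each window followed by an NGG PAM.

    Only the strand provided is scanned. 'start' is a 0-based index.
    """
    seq = sequence.upper()
    hits = []
    # The last valid start leaves room for the target plus a 3-nt PAM.
    for start in range(len(seq) - spacer_len - 2):
        pam = seq[start + spacer_len:start + spacer_len + 3]
        if pam[1:] == "GG":  # 'N' at the first PAM position may be any base
            hits.append((start, seq[start:start + spacer_len], pam))
    return hits


# A 20-nt sequence followed by a hypothetical PAM ("AGG") and filler ("TTT").
example = "TGGACGCGCAGGCTGCCGGT" + "AGG" + "TTT"
targets = find_ngg_targets(example)
```

Each returned tuple gives the position of a candidate target sequence, the sequence itself, and the PAM that follows it.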
In certain embodiments, the number of gRNAs administered to the cell can be at least 1 gRNA, or at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, or 50 different gRNAs.
In certain embodiments, the number of gRNAs administered to the cell can be between at least 1 gRNA and at least 50, 45, 40, 35, 30, 25, 20, 16, 12, 8, or 4 different gRNAs; between at least 4 different gRNAs and at least 50, 45, 40, 35, 30, 25, 20, 16, 12, or 8 different gRNAs; or between at least 8 different gRNAs and at least 50, 45, 40, 35, 30, 25, 20, 16, or 12 different gRNAs.
In certain embodiments, the gRNA is selected to increase or decrease transcription of the target gene. In certain embodiments, the gRNA targets a region upstream of the transcription start site (TSS) of a target gene (e.g., PCSK9), e.g., a region between 0-1000bp upstream of the transcription start site of the target gene. In certain embodiments, the gRNA targets a region between 0-50bp, 0-100bp, 0-150bp, 0-200bp, 0-250bp, 0-300bp, 0-350bp, 0-400bp, 0-450bp, 0-500bp, 0-550bp, 0-600bp, 0-650bp, 0-700bp, 0-750bp, 0-800bp, 0-850bp, 0-900bp, 0-950bp or 0-1000bp upstream of the transcription start site of the target gene. In certain embodiments, the gRNA targets a region within about 100bp, about 200bp, about 300bp, about 400bp, about 500bp, about 600bp, about 700bp, about 800bp, about 900bp, about 1000bp, about 1100bp, about 1200bp, about 1300bp, about 1400bp, or about 1500bp upstream of the transcription start site of the target gene. In one embodiment, the gRNA targets a region 0-300bp upstream of the TSS of the target gene.
In certain embodiments, the gRNA targets a region downstream of the transcription start site of the target gene, e.g., a region between 0-1000bp downstream of the transcription start site of the target gene. In certain embodiments, the gRNA targets a region between 0-50bp, 0-100bp, 0-150bp, 0-200bp, 0-250bp, 0-300bp, 0-350bp, 0-400bp, 0-450bp, 0-500bp, 0-550bp, 0-600bp, 0-650bp, 0-700bp, 0-750bp, 0-800bp, 0-850bp, 0-900bp, 0-950bp or 0-1000bp downstream of the transcription start site of the target gene. In certain embodiments, the gRNA targets a region within about 100bp, about 200bp, about 300bp, about 400bp, about 500bp, about 600bp, about 700bp, about 800bp, about 900bp, about 1000bp, about 1100bp, about 1200bp, about 1300bp, about 1400bp, or about 1500bp downstream of the transcription start site of the target gene. In one embodiment, the gRNA targets a region 0-300bp downstream of the TSS of the target gene.
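Whether a given gRNA target falls within one of the upstream or downstream windows described above can be determined from its signed distance to the TSS. The following is a minimal sketch, assuming that the target position and the TSS are given as coordinates on the same chromosome and that the strand of the target gene is known. The function names and the coordinates used in the example are hypothetical and do not correspond to any actual gene.

```python
def tss_offset(position: int, tss: int, strand: str = "+") -> int:
    """Signed distance of a position from the TSS in the direction of transcription.

    Negative values are upstream of the TSS, positive values are downstream.
    """
    if strand == "+":
        return position - tss
    if strand == "-":
        return tss - position
    raise ValueError("strand must be '+' or '-'")


def in_tss_window(position: int, tss: int, strand: str = "+",
                  upstream: int = 300, downstream: int = 0) -> bool:
    """True if the position lies within [-upstream, +downstream] bp of the TSS."""
    offset = tss_offset(position, tss, strand)
    return -upstream <= offset <= downstream


# Hypothetical coordinates: a TSS at position 1000 on the forward strand.
is_within_300_upstream = in_tss_window(900, 1000, "+", upstream=300)
is_within_300_downstream = in_tss_window(1100, 1000, "+", upstream=0, downstream=300)
```

For a gene on the reverse strand, a larger coordinate is upstream of the TSS, which is why the strand is taken into account when the sign of the offset is computed.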
Proprotein convertase subtilisin/kexin type 9 (PCSK9) may also be referred to as subtilisin/kexin-like protease PC9. Human PCSK9 has a cytogenetic position of 1p32.3 and genomic coordinates 55,039,548-55,064,852 on the forward strand of chromosome 1. BSND is the gene upstream of PCSK9 on the forward strand. Human PCSK9 has NCBI gene ID 255738, RefSeq accession Nos. NM_174936.4 and NP_777596.2, and Ensembl gene ID ENSG00000169174.
The genomic position of mouse PCSK9 is 4, 4C7, with the genomic sequence on chromosome 4 at NC_000070.07. BSND is the gene upstream of mouse PCSK9 on the forward strand. Mouse PCSK9 has NCBI gene ID 100102, RefSeq accession Nos. NM_153565.2 and NP_705793.1, and Ensembl gene ID ENSMUSG00000044254.
The genomic position of rhesus monkey (Macaca mulatta) PCSK9 is NC_041754.1. ENSMMUG0000005740 is the gene upstream of monkey PCSK9 on the forward strand. Monkey PCSK9 has RefSeq accession Nos. NM_001112660.1 and NP_001106130.1, and Ensembl gene ID ENSMMUG00000005736.
The present disclosure provides sgRNA sequences that target the mouse PCSK9 target gene. Exemplary sgRNAs include, but are not limited to, those listed in Table 3. The disclosure also provides sgRNA sequences that target human PCSK9. Exemplary sgRNAs include, but are not limited to, those listed in Table 4. The disclosure also provides sgRNA sequences that target monkey PCSK9. Exemplary sgRNAs include, but are not limited to, those listed in Table 5.
Table 3. Exemplary mouse PCSK9 sgRNA sequences
| Description | Sequence | SEQ ID NO: |
| --- | --- | --- |
| Mouse PCSK9 sgRNA1 | TGGACGCGCAGGCTGCCGGT | SEQ ID NO:27 |
| Mouse PCSK9 sgRNA2 | CCACCTTCACGTGGACGCGC | SEQ ID NO:28 |
| Mouse PCSK9 sgRNA3 | GTGGACGCGCAGGCTGCCGG | SEQ ID NO:29 |
| Mouse PCSK9 sgRNA4 | CTCTCTCTTTCTGAGGCTAG | SEQ ID NO:30 |
| Mouse PCSK9 sgRNA5 | CACGTGGACGCGCAGGCTGC | SEQ ID NO:31 |
| Mouse PCSK9 sgRNA6 | TTAAGAGGGGGGAATGTAAC | SEQ ID NO:32 |
| Mouse PCSK9 sgRNA7 | AACCTGATCCTTTAGTACCG | SEQ ID NO:33 |
| Mouse PCSK9 sgRNA8 | TCAGAGAGGATCTTCCGATG | SEQ ID NO:34 |
| Mouse PCSK9 sgRNA9 | GGATCTTCCGATGGGGCTCG | SEQ ID NO:35 |
| Mouse PCSK9 sgRNA10 | GCGTCATTTGACGCTGTCTG | SEQ ID NO:36 |
| Mouse PCSK9 sgRNA11 | TCATTTGACGCTGTCTGGGG | SEQ ID NO:37 |
| Mouse PCSK9 sgRNA12 | GATCCTTTAGTACCGGGGCC | SEQ ID NO:38 |
| Mouse PCSK9 sgRNA13 | TGCAGCCCAATTAGGATTTG | SEQ ID NO:39 |
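The sequences of Table 3 lend themselves to simple consistency checks, such as confirming that each target sequence is 20 nucleotides long and contains only the four DNA bases, and computing its GC content. The following is a minimal sketch; the function names are hypothetical, the sequences are copied from Table 3, and no particular GC content is stated in this disclosure to be required.

```python
MOUSE_PCSK9_SGRNAS = {
    "sgRNA1": "TGGACGCGCAGGCTGCCGGT",
    "sgRNA2": "CCACCTTCACGTGGACGCGC",
    "sgRNA3": "GTGGACGCGCAGGCTGCCGG",
    "sgRNA4": "CTCTCTCTTTCTGAGGCTAG",
    "sgRNA5": "CACGTGGACGCGCAGGCTGC",
    "sgRNA6": "TTAAGAGGGGGGAATGTAAC",
    "sgRNA7": "AACCTGATCCTTTAGTACCG",
    "sgRNA8": "TCAGAGAGGATCTTCCGATG",
    "sgRNA9": "GGATCTTCCGATGGGGCTCG",
    "sgRNA10": "GCGTCATTTGACGCTGTCTG",
    "sgRNA11": "TCATTTGACGCTGTCTGGGG",
    "sgRNA12": "GATCCTTTAGTACCGGGGCC",
    "sgRNA13": "TGCAGCCCAATTAGGATTTG",
}


def is_valid_target(sequence: str, expected_len: int = 20) -> bool:
    """True if the sequence has the expected length and only A, C, G, T."""
    return len(sequence) == expected_len and set(sequence) <= set("ACGT")


def gc_content(sequence: str) -> float:
    """Fraction of G and C nucleotides in a non-empty sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in sequence) / len(sequence)


all_valid = all(is_valid_target(seq) for seq in MOUSE_PCSK9_SGRNAS.values())
gc_by_name = {name: gc_content(seq) for name, seq in MOUSE_PCSK9_SGRNAS.items()}
```

Checks of this kind can help detect transcription errors when sequences are copied between a sequence listing and a table.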
Table 4. Exemplary human PCSK9 sgRNA sequences
Table 5. Exemplary monkey PCSK9 sgRNA sequences
In one embodiment, the gRNA targets the promoter region of a target gene. In one embodiment, the gRNA targets an enhancer region of a target gene. A gRNA can be divided into a target binding region, a Cas9 binding region, and a transcription termination region. The target binding region hybridizes to a target region in a target gene. Methods of designing such target binding regions are known in the art; see, e.g., Doench et al., Nat Biotechnol (2014) 32:1262-7 and Doench et al., Nat Biotechnol (2016) 34:184-91, which are incorporated herein by reference in their entirety. Design tools are available, for example, Target Finder from the Feng Zhang laboratory, Target Finder (E-CRISP) from the Michael Boutros laboratory, RGEN Tools (Cas-OFFinder), CasFinder, and CRISPR Optimal Target Finder. In certain embodiments, the target binding region can be between about 15 and about 50 nucleotides in length (about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49 or about 50 nucleotides in length). In certain embodiments, the target binding region can be between about 19 and about 21 nucleotides in length. In one embodiment, the target binding region is 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length.
In one embodiment, the target binding region is complementary, e.g., fully complementary, to a target region in a target gene. In one embodiment, the target binding region is substantially complementary to a target region in a target gene. In one embodiment, the target binding region comprises no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides that are not complementary to a target region in a target gene.
In one embodiment, the target binding region is engineered to increase stability or half-life, for example by incorporating non-natural or modified nucleotides in the target binding region, by removing or modifying RNA destabilizing sequence elements, by adding RNA stabilizing sequence elements, or by increasing the stability of the Cas9/gRNA complex. In one embodiment, the target binding region is engineered to enhance transcription thereof. In one embodiment, the target binding region is engineered to reduce the formation of secondary structures. In one embodiment, the Cas9-binding region of the gRNA is modified to enhance transcription of the gRNA. In one embodiment, the Cas9-binding region of the gRNA is modified to improve the stability or assembly of the Cas9/gRNA complex.
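The number of non-complementary nucleotides between a target binding region and its target region can be counted by pairing the two strands in antiparallel orientation. The following is a minimal sketch, assuming Watson-Crick pairing only, an RNA target binding region and a DNA target strand of equal length, both written 5' to 3'. The function name and the short example sequences are hypothetical.

```python
# Watson-Crick partners of each RNA base on the DNA target strand.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def count_noncomplementary(guide_rna: str, target_dna: str) -> int:
    """Count positions where the guide does not pair with the target strand.

    Both sequences are written 5'->3'; because the strands hybridize in
    antiparallel orientation, guide position i pairs with the target
    position counted from the opposite end.
    """
    guide = guide_rna.upper()
    target = target_dna.upper()
    if len(guide) != len(target):
        raise ValueError("sequences must have the same length")
    mismatches = 0
    for i, base in enumerate(guide):
        partner = target[len(target) - 1 - i]
        if RNA_TO_DNA_PARTNER.get(base) != partner:
            mismatches += 1
    return mismatches


# Hypothetical 5-nt example: 5'-GGAUC-3' pairs fully with 5'-GATCC-3'.
fully_complementary = count_noncomplementary("GGAUC", "GATCC") == 0
# Changing the last target nucleotide introduces one non-complementary position.
within_one_mismatch = count_noncomplementary("GGAUC", "GATCA") <= 1
```

A target binding region satisfies one of the embodiments above when the returned count does not exceed the recited number of non-complementary nucleotides.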
Delivery system
The present disclosure also provides delivery systems for introducing components of the systems and compositions herein into cells, tissues, organs or organisms. The delivery system may include one or more delivery vehicles and/or carriers.
Cargo
The delivery system may include one or more cargos. The cargo may include one or more components of the systems and compositions herein. The cargo may include one or more of the following: i) a plasmid encoding one or more Cas proteins; ii) a plasmid encoding one or more guide RNAs; iii) mRNA of one or more Cas proteins; iv) one or more guide RNAs; v) one or more Cas proteins; vi) any combination thereof. In certain examples, the cargo can comprise a plasmid encoding one or more Cas proteins and one or more (e.g., multiple) guide RNAs. In certain embodiments, the cargo may include mRNA encoding one or more Cas proteins and one or more guide RNAs.
In certain examples, the cargo can include one or more Cas proteins and one or more guide RNAs, e.g., in the form of ribonucleoprotein complexes (RNPs). The ribonucleoprotein complex may be delivered by the methods and systems herein. In some cases, the ribonucleoprotein may be delivered using a polypeptide-based shuttle agent. In one example, the ribonucleoprotein may be delivered using a synthetic peptide comprising an Endosomal Leakage Domain (ELD) operably linked to a Cell Penetrating Domain (CPD), a histidine-rich domain, and a CPD, for example as described in WO 2016161516.
Physical delivery
In certain embodiments, the cargo may be introduced into the cells by physical delivery methods. Examples of physical methods include microinjection, electroporation, and hydrodynamic delivery.
Microinjection
Microinjection of the cargo directly into the cells may achieve high efficiency, e.g., greater than 90% or about 100%. In certain embodiments, microinjection can be performed using a microscope and needle (e.g., 0.5-5.0 μm in diameter) to pierce the cell membrane and deliver the cargo directly to the target site within the cell. Microinjection can be used for in vitro and ex vivo delivery.
Plasmids comprising sequences encoding Cas proteins and/or guide RNAs, mRNAs, and/or guide RNAs can be microinjected. In some cases, microinjection can be used to i) deliver DNA directly to the nucleus, and/or ii) deliver mRNA (e.g., transcribed in vitro) to the nucleus or cytoplasm. In certain examples, microinjection can be used to deliver sgRNAs directly to the nucleus and mRNA encoding Cas to the cytoplasm, e.g., to facilitate translation of Cas and its shuttling to the nucleus.
Microinjection can be used to produce genetically modified animals. For example, gene editing cargos may be injected into the zygote to allow for efficient germ line modification. Such methods can produce normal embryos and full-term mouse pups with the desired modifications. Microinjection can also be used to provide transient up- or down-regulation of specific genes within the genome of a cell, for example using CRISPRa and CRISPRi.
Electroporation
In certain embodiments, the cargo and/or delivery vehicle may be delivered by electroporation. Electroporation can use pulsed high voltage current to transiently open nano-sized pores in the cell membrane of cells suspended in a buffer, allowing components with hydrodynamic diameters of tens of nanometers to flow into the cells. In some cases, electroporation may be used for a variety of cell types and to transfer cargo into cells with high efficiency. Electroporation may be used for in vitro and ex vivo delivery.
Electroporation may also be used to deliver cargo into the nucleus of mammalian cells by applying specific voltages and reagents, for example, by nucleofection. Such methods include those described in the following documents: Wu Y, et al. (2015). Cell Res 25:67-79; Ye L, et al. (2014). Proc Natl Acad Sci USA 111:9591-6; Choi PS, Meyerson M. (2014). Nat Commun 5:3728; Wang J, Quake SR. (2014). Proc Natl Acad Sci 111:13157-62. Electroporation may also be used to deliver cargo in vivo, for example using the method described in Zuckermann M, et al. (2015). Nat Commun 6:7391.
Hydrodynamic delivery
Hydrodynamic delivery may also be used to deliver cargo, for example for in vivo delivery. In certain examples, hydrodynamic delivery may be performed by rapidly pushing a large volume (8-10% of body weight) of solution containing the gene editing cargo into the blood stream of a subject (e.g., an animal or human), such as through the tail vein for a mouse. Since blood is incompressible, the large volume of liquid may cause an increase in hydrodynamic pressure, temporarily enhancing the permeability of endothelial cells and parenchymal cells and allowing cargo that normally cannot pass through the cell membrane to enter the cell. This method can be used to deliver naked DNA plasmids and proteins. The delivered cargo may be enriched in the liver, kidneys, lungs, muscles and/or heart.
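The injection volume corresponding to 8-10% of body weight can be estimated with a short calculation. The following is a minimal sketch, assuming that 1 g of body weight corresponds to approximately 1 mL of solution, which is a simplifying assumption not stated in this disclosure. The function name and the 20 g body weight in the example are hypothetical.

```python
from typing import Tuple


def hydrodynamic_volume_range_ml(body_weight_g: float,
                                 low_fraction: float = 0.08,
                                 high_fraction: float = 0.10) -> Tuple[float, float]:
    """Estimated injection volume range in mL for a given body weight in g.

    Assumes 1 g of body weight corresponds to about 1 mL of solution.
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    if not 0 < low_fraction <= high_fraction:
        raise ValueError("fractions must satisfy 0 < low <= high")
    return body_weight_g * low_fraction, body_weight_g * high_fraction


# Hypothetical example: a mouse with a body weight of 20 g.
low_ml, high_ml = hydrodynamic_volume_range_ml(20.0)
```

Under the stated assumption, a 20 g mouse would correspond to an injection volume of roughly 1.6 to 2.0 mL.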
Transfection
The cargo, e.g., nucleic acid, may be introduced into the cell by a transfection method. Examples of transfection methods include calcium phosphate mediated transfection, cationic transfection, liposome transfection, dendrimer transfection, heat shock transfection, magnetofection, lipofection, impalefection, optical transfection, and uptake of nucleic acid enhanced by proprietary reagents.
Delivery vehicle
The delivery system may include one or more delivery vehicles. The delivery vehicle may deliver the cargo into a cell, tissue, organ or organism (e.g., an animal or plant). The cargo may be packaged, carried, or otherwise associated with a delivery vehicle. The delivery vehicle may be selected according to the type of cargo to be delivered and/or whether the delivery is in vitro and/or in vivo. Examples of delivery vehicles include vectors, viruses, non-viral vehicles, and other delivery agents described herein.
A delivery vehicle according to the present disclosure may have a maximum dimension (e.g., diameter) of less than 100 micrometers (μm). In certain embodiments, the maximum dimension of the delivery vehicle is less than 10 μm. In certain embodiments, the delivery vehicle may have a maximum dimension of less than 2000 nanometers (nm). In certain embodiments, the delivery vehicle may have a maximum dimension of less than 1000 nm. In certain embodiments, the delivery vehicle may have a maximum dimension (e.g., diameter) of less than 900 nm, less than 800 nm, less than 700 nm, less than 600 nm, less than 500 nm, less than 400 nm, less than 300 nm, less than 200 nm, less than 150 nm, less than 100 nm, or less than 50 nm. In certain embodiments, the delivery vehicle may have a maximum dimension in the range between 25 nm and 200 nm.
In certain embodiments, the delivery vehicle may be or comprise particles. For example, the delivery vehicle may be or comprise nanoparticles (e.g., particles having a maximum dimension (e.g., diameter) of no more than 1000 nm). The particles may be provided in different forms, for example as solid particles (e.g., metals such as silver, gold, iron, or titanium; non-metals; lipid-based solids; polymers), suspensions of particles, or combinations thereof. Metal, dielectric and semiconductor particles and hybrid structures (e.g., core-shell particles) may be prepared.
Vectors
The system, composition and/or delivery system may comprise one or more vectors. The present disclosure also includes a vector system. The vector system may comprise one or more vectors. In certain embodiments, a vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it is linked. Vectors include single-stranded, double-stranded, or partially double-stranded nucleic acid molecules, nucleic acid molecules comprising one or more free ends or no free ends (e.g., circular), nucleic acid molecules comprising DNA, RNA, or both, and various other polynucleotides known in the art. The vector may be a plasmid, e.g., a circular double-stranded DNA loop into which additional DNA segments may be inserted, e.g., by standard molecular cloning techniques. Some vectors may be capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Certain vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. In certain examples, the vector may be an expression vector, e.g., a vector capable of directing expression of genes to which it is operably linked. Expression vectors may be used for expression in eukaryotic cells in some cases. Common expression vectors useful in recombinant DNA technology typically take the form of plasmids.
Examples of vectors include pGEX, pMAL, pRIT, E. coli expression vectors (e.g., pTrc, pET11d), yeast expression vectors (e.g., pYepSec1, pMFa, pJRY88, pYES2 and picZ), baculovirus vectors (e.g., for expression in insect cells such as Sf9 cells) (e.g., pAc series and pVL series), and mammalian expression vectors (e.g., pCDM8 and pMT2PC).
The vector may comprise i) a Cas coding sequence, and/or ii) a single guide RNA coding sequence or at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 32, at least 48, or at least 50 guide RNA coding sequences. In a single vector, each RNA coding sequence may have its own promoter. Alternatively or in addition, in a single vector, there may be promoters that control (e.g., drive transcription and/or expression of) multiple RNA coding sequences.
Regulatory element
The vector may comprise one or more regulatory elements. The regulatory element may be operably linked to the coding sequence of a Cas protein, a helper protein, a guide RNA (e.g., a single guide RNA, a crRNA, and/or a tracrRNA), or a combination thereof. The term "operably linked" means that the nucleotide sequence of interest is linked to the regulatory element in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). In certain examples, the vector may comprise: a first regulatory element operably linked to a nucleotide sequence encoding a Cas protein, and a second regulatory element operably linked to a nucleotide sequence encoding a guide RNA.
Examples of regulatory elements include promoters, enhancers, internal ribosome entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly U sequences). Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cells, as well as those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). Tissue-specific promoters may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, a specific organ (e.g., liver, pancreas), or a specific cell type (e.g., lymphocyte). Regulatory elements may also direct expression in a time-dependent manner, such as in a cell cycle-dependent or developmental stage-dependent manner, which may or may not be tissue or cell type specific.
Examples of promoters include one or more pol III promoters (e.g., 1, 2, 3, 4, 5 or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5 or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5 or more pol I promoters), or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous Sarcoma Virus (RSV) LTR promoter (optionally with an RSV enhancer), the Cytomegalovirus (CMV) promoter (optionally with a CMV enhancer), the SV40 promoter, the dihydrofolate reductase promoter, the actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1α promoter.
Viral vectors
The cargo may be delivered by a virus. In certain embodiments, viral vectors are used. Viral vectors may include virus-derived DNA or RNA sequences for packaging into viruses (e.g., retroviruses, replication-defective retroviruses, adenoviruses, replication-defective adenoviruses, and adeno-associated viruses). Viral vectors also comprise polynucleotides carried by the virus for transfection into a host cell. Viruses and viral vectors may be used for in vitro, ex vivo, and/or in vivo delivery.
Adeno-associated virus (AAV)
The systems and compositions herein may be delivered by adeno-associated virus (AAV). AAV vectors may be used for such delivery. AAV belongs to the genus Dependovirus of the family Parvoviridae and is a single-stranded DNA virus. In certain embodiments, AAV may provide a persistent source of the provided DNA, because the genomic material delivered by AAV may be present in the cell indefinitely, for example as exogenous DNA or, with some modification, integrated directly into the host DNA. In certain embodiments, the AAV does not cause and is not associated with any disease in a human. The virus itself is able to infect cells with high efficiency, while rarely eliciting an innate or adaptive immune response or associated toxicity.
Examples of AAV useful herein include AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, AAV-6, AAV-8, and AAV-9. The type of AAV may be selected according to the cell to be targeted; for example, AAV serotypes 1, 2, 5 or hybrid capsid AAV1, AAV2, AAV5, or any combination thereof, can be selected for targeting brain or neuronal cells; and AAV4 may be selected for targeting heart tissue. AAV8 may be used for delivery to the liver. AAV-2 based vectors were originally proposed for CFTR delivery to the CF airways, and other serotypes such as AAV-1, AAV-5, AAV-6 and AAV-9 showed improved gene transfer efficiency in various lung epithelial models. Examples of AAV-targeted cell types are described in Grimm, D. et al., J. Virol. 82:5887-5911 (2008) and WO 2021/183807A1, which are incorporated herein by reference in their entirety.
CRISPR-Cas AAV particles can be produced in HEK 293T cells. Once particles with a specific tropism are produced, they can be used to infect target cell lines in essentially the same way as native viral particles. This may allow the CRISPR-Cas component to persist in the infected cell type, which is why this mode of delivery is particularly suitable for situations where long-term expression is required. Examples of dosages and formulations of AAV that may be used include those described in U.S. patent nos. 8,454,972 and 8,404,658.
A variety of strategies are available for delivering the systems and compositions herein using AAV. In certain examples, the coding sequences for Cas and gRNA can be packaged directly onto one DNA plasmid vector and delivered through one AAV particle. In certain examples, AAV can be used to deliver gRNA into cells that have been previously engineered to express Cas. In certain examples, the coding sequences for Cas and gRNA can be manufactured as two separate AAV particles for co-transfection of target cells. In certain examples, the markers, tags, and other sequences can be packaged in the same AAV particle as the coding sequence for Cas and/or gRNA.
Lentivirus
The systems and compositions herein may be delivered by lentiviruses. Lentiviral vectors may be used for such delivery. Lentiviruses are complex retroviruses with the ability to infect and express their genes in mitotic and postmitotic cells.
Examples of lentiviruses include Human Immunodeficiency Virus (HIV), which can target a broad range of cell types using the envelope glycoproteins of other viruses, and minimal non-primate lentiviral vectors based on Equine Infectious Anemia Virus (EIAV), which are useful for ocular therapy. In certain embodiments, self-inactivating lentiviral vectors having an siRNA targeting a common exon shared by HIV tat/rev, a nucleolar-localizing TAR decoy, and an anti-CCR5-specific hammerhead ribozyme (see, e.g., DiGiusto et al. (2010) Sci Transl Med 2:36ra43) are useful and/or adaptable for use in the nucleic acid targeting systems herein.
Lentiviruses can be pseudotyped with other viral proteins such as the G protein of vesicular stomatitis virus. In so doing, the cell tropism of the lentivirus can be altered as desired to be either broad or narrow. In some cases, to increase safety, second and third generation lentiviral systems may split essential genes into three plasmids, which may reduce the likelihood of accidental reconstitution of live viral particles within the cell.
In certain examples, using integration capability, lentiviruses can be used to create libraries comprising various genetically modified cells, for example, for screening and/or studying genes and signaling pathways.
Adenovirus
The systems and compositions herein may be delivered by adenovirus. Adenovirus vectors may be used for such delivery. Adenoviruses include non-enveloped viruses with an icosahedral nucleocapsid containing a double-stranded DNA genome. Adenovirus can infect dividing cells and non-dividing cells. In certain embodiments, the adenovirus does not integrate into the genome of the host cell, which can be used to limit off-target effects of the CRISPR-Cas system in gene editing applications.
Non-viral vehicle
The delivery vehicle may comprise a non-viral vehicle. In general, methods and vehicles capable of delivering nucleic acids and/or proteins can be used to deliver the systems and compositions herein. Examples of non-viral vehicles include lipid nanoparticles, cell-penetrating peptides (CPPs), DNA nanoclews, gold nanoparticles, streptolysin O, multifunctional encapsulated nanodevices (MENDs), lipid-coated mesoporous silica particles, and other inorganic nanoparticles.
Lipid particles
The delivery vehicle may include lipid particles, such as Lipid Nanoparticles (LNPs) and liposomes.
Lipid Nanoparticles (LNP)
LNP can encapsulate nucleic acids within cationic lipid particles (e.g., liposomes) and can be delivered to cells relatively easily. In certain examples, the lipid nanoparticle is free of any viral components, which helps to minimize safety and immunogenicity concerns. Lipid particles can be used for in vitro, ex vivo, and in vivo delivery. Lipid particles can be used in cell populations of various sizes.
In certain examples, LNP can be used to deliver DNA molecules (e.g., those comprising coding sequences for Cas and/or gRNAs) and/or RNA molecules (e.g., mRNA encoding Cas, gRNA). In certain instances, LNP can be used to deliver RNP complexes of Cas/gRNA.
In certain embodiments, the LNP is used to deliver mRNA and gRNA, e.g., an mRNA encoding a DNMT3A-DNMT3L (3A3L)-dCas9-KRAB fusion molecule and at least one PCSK9-targeted sgRNA.
The components of the LNP may include cationic lipids 1,2-dilineoyl-3-dimethylammonium-propane (DLinDAP), 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinoleyloxyketo-N,N-dimethyl-3-aminopropane (DLinK-DMA), 1,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLinKC2-DMA), 3-o-[2-(methoxypolyethyleneglycol 2000)succinoyl]-1,2-dimyristoyl-sn-glycol (PEG-S-DMG), R-3-[(ω-methoxy-poly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxypropyl-3-amine (PEG-C-DOMG), and any combination thereof. The preparation and encapsulation of LNP can be adapted from Conway et al., Molecular Therapy, vol. 27, no. 4, pages 866-877, Apr. 2019 and Rosin et al., Molecular Therapy, vol. 19, no. 12, pages 1286-2200, Dec. 201.
In certain embodiments, the LNP may comprise an ionizable lipid. In certain embodiments, the ionizable lipids include, but are not limited to, pH-responsive ionizable lipids, thermally-responsive ionizable lipids, and photo-responsive ionizable lipids. In certain embodiments, the ionizable lipids include cationic lipids and anionic lipids that ionize under certain conditions such as, but not limited to, pH, temperature, or light. In certain embodiments, the molar ratio of the ionizable lipid of the LNP is 20% to about 70% (e.g., about 20% to about 70%, about 20% to about 65%, about 20% to about 60%, about 20% to about 55%, about 20% to about 50%, about 20% to about 45%, about 20% to about 40%, about 20% to about 35%, about 20% to about 30%, about 20% to about 25%, about 30% to about 70%, about 30% to about 65%, about 30% to about 60%, about 30% to about 55%, about 30% to about 50%, about 30% to about 45%, about 30% to about 40%, about 30% to about 35%, about 40% to about 70%, about 40% to about 65%, about 40% to about 60%, about 40% to about 55%, about 40% to about 50%, about 40% to about 45%, about 50% to about 70%, about 50% to about 65%, about 50% to about 60%, about 50% to about 55%, about 60% to about 70%, or about 60% to about 65%). In certain embodiments, the molar ratio of the ionizable lipids of the LNP is about 45% to about 50%.
In certain embodiments, the LNP may comprise a pegylated lipid. In certain embodiments, the molar ratio of pegylated lipid of the LNP is from 0% to about 30% (e.g., from about 0% to about 30%, from about 0% to about 25%, from about 0% to about 20%, from about 0% to about 15%, from about 0% to about 10%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 10% to about 15%, from about 20% to about 30%, or from about 20% to about 25%). In certain embodiments, the molar ratio of pegylated lipid of the LNP is about 1%.
In certain embodiments, the LNP may comprise a supporting lipid. In certain embodiments, the molar ratio of the supporting lipid of the LNP is from 5% to about 50% (e.g., from about 5% to about 50%, from about 5% to about 45%, from about 5% to about 40%, from about 5% to about 35%, from about 5% to about 30%, from about 5% to about 25%, from about 5% to about 20%, from about 5% to about 15%, from about 5% to about 10%, from about 10% to about 50%, from about 10% to about 45%, from about 10% to about 40%, from about 10% to about 35%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 10% to about 15%, from about 20% to about 50%, from about 20% to about 45%, from about 20% to about 40%, from about 20% to about 30%, from about 20% to about 25%, from about 30% to about 50%, from about 30% to about 40%, from about 30% to about 35%, from about 40% to about 50%, or from about 40% to about 45%). In certain embodiments, the molar ratio of the supporting lipid of the LNP is about 9%.
In certain embodiments, the LNP may comprise cholesterol. In certain embodiments, the molar ratio of cholesterol of the LNP is from 10% to about 50% (e.g., from about 10% to about 50%, from about 10% to about 45%, from about 10% to about 40%, from about 10% to about 35%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 10% to about 15%, from about 20% to about 50%, from about 20% to about 45%, from about 20% to about 40%, from about 20% to about 35%, from about 20% to about 30%, from about 20% to about 25%, from about 30% to about 50%, from about 30% to about 45%, from about 30% to about 40%, from about 30% to about 35%, from about 40% to about 50%, or from about 40% to about 45%). In certain embodiments, the molar ratio of cholesterol of the LNP is about 40% to about 45%.
In certain embodiments, the LNP may comprise a mixture of ionizable lipids (20%-70%, molar ratio), pegylated lipids (0%-30%, molar ratio), supporting lipids (30%-50%, molar ratio), and cholesterol (10%-50%, molar ratio). In certain embodiments, the LNP may comprise a mixture of ionizable lipids (45%-50%, molar ratio), pegylated lipids (1%, molar ratio), supporting lipids (9%, molar ratio), and cholesterol (40%-50%, molar ratio).
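The molar-ratio ranges recited above can be checked mechanically for a candidate formulation. The following Python sketch is an illustration only: the dictionary keys, the function name, and the example formulation are hypothetical, the bounds are the broadest per-component ranges recited above for the ionizable lipid, pegylated lipid, supporting lipid, and cholesterol, and the word "about" in the text is treated as an exact bound for simplicity.

```python
# Broadest per-component ranges (mol%) recited in the text above.
LNP_RANGES = {
    "ionizable_lipid": (20.0, 70.0),
    "pegylated_lipid": (0.0, 30.0),
    "supporting_lipid": (5.0, 50.0),
    "cholesterol": (10.0, 50.0),
}


def check_lnp_composition(composition, ranges=LNP_RANGES, tolerance=1e-6):
    """Return a list of problems; an empty list means the composition fits.

    `composition` maps component names to molar percentages.
    """
    problems = []
    for name, (low, high) in ranges.items():
        value = composition.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif not (low <= value <= high):
            problems.append(f"{name}: {value:g} mol% outside {low:g}-{high:g}")
    # The four components are assumed here to make up the whole lipid mixture.
    total = sum(composition.get(name, 0.0) for name in ranges)
    if abs(total - 100.0) > tolerance:
        problems.append(f"total is {total:g} mol%, expected 100")
    return problems


if __name__ == "__main__":
    # Hypothetical formulation consistent with the narrower embodiment above.
    example = {
        "ionizable_lipid": 48.0,
        "pegylated_lipid": 1.0,
        "supporting_lipid": 9.0,
        "cholesterol": 42.0,
    }
    print(check_lnp_composition(example))
```

Requiring the percentages to sum to 100 reflects the assumption that no other lipid is present; a formulation with additional components would need that check relaxed.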
Liposome
In certain embodiments, the lipid particle may be a liposome. Liposomes are spherical vesicle structures composed of a unilamellar or multilamellar lipid bilayer surrounding an internal aqueous compartment and a relatively impermeable outer lipophilic phospholipid bilayer. In certain embodiments, liposomes are biocompatible and nontoxic, can deliver both hydrophilic and lipophilic drug molecules, protect their cargo from degradation by plasma enzymes, and transport their load across biological membranes and the blood-brain barrier (BBB).
Liposomes can be made from several different types of lipids, such as phospholipids. Liposomes can comprise natural phospholipids and lipids, such as 1,2-distearoyl-sn-glycero-3-phosphatidylcholine (DSPC), sphingomyelin, lecithin, monosialoganglioside, or any combination thereof.
Several other additives may be added to the liposomes in order to modify their structure and properties. For example, the liposome may further comprise cholesterol, sphingomyelin, and/or 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), for example, to improve stability and/or prevent leakage of cargo from inside the liposome.
Stabilized Nucleic Acid Lipid Particles (SNALP)
In certain embodiments, the lipid particle may be a Stabilized Nucleic Acid Lipid Particle (SNALP). SNALP may comprise an ionizable lipid (DLinDMA) (e.g., cationic at low pH), a neutral helper lipid, cholesterol, a diffusible polyethylene glycol (PEG) lipid, or any combination thereof. In certain examples, SNALP may comprise synthetic cholesterol, dipalmitoylphosphatidylcholine, 3-N-[(ω-methoxy poly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxypropylamine, and cationic 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane. In certain examples, SNALP may comprise synthetic cholesterol, 1,2-distearoyl-sn-glycero-3-phosphocholine, PEG-cDMA, and 1,2-dilinoleyloxy-3-(N,N-dimethyl)aminopropane (DLinDMA).
Other lipids
The lipid particles may also comprise one or more other types of lipids, for example cationic lipids such as the amino lipids 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane (DLin-KC2-DMA), DLin-KC2-DMA4, and C12-200, and the helper lipids distearoylphosphatidylcholine, cholesterol and PEG-DMG.
Lipoplexes and/or polyplexes
In certain embodiments, the delivery vehicle comprises a lipid complex (lipoplex) and/or a polymeric complex (polyplex). Lipoplexes can bind to negatively charged cell membranes and induce endocytosis. Examples of lipoplexes may be complexes comprising lipid and non-lipid components. Examples of lipoplexes and polyplexes include FuGENE-6 reagent (a non-liposomal solution containing lipids and other components), zwitterionic amino lipids (ZALs), Ca2+ (e.g., forming DNA/Ca2+ microcomplexes), polyethylenimine (PEI) (e.g., branched PEI), and poly(L-lysine) (PLL).
Cell penetrating peptides
In certain embodiments, the delivery vehicle comprises a cell-penetrating peptide (CPP). CPPs are short peptides that promote cellular uptake of various molecular cargo (e.g., ranging from nanosized particles to small chemical molecules and large fragments of DNA).
CPPs can have different sizes, amino acid sequences, and charges. In certain examples, the CPP can translocate the plasma membrane and facilitate delivery of various molecular cargo to the cytoplasm or organelle. CPPs can be introduced into cells by different mechanisms, such as direct penetration into the membrane, endocytosis-mediated entry, and translocation through formation of temporary structures.
CPPs may have an amino acid composition comprising a relatively high abundance of positively charged amino acids, such as lysine or arginine, or have a sequence comprising an alternating pattern of polar/charged amino acids and nonpolar hydrophobic amino acids. These two types of structures are referred to as polycationic and amphipathic, respectively. A third class of CPPs are hydrophobic peptides containing only nonpolar residues with low net charge or hydrophobic amino acid groups critical for cellular uptake. Another type of CPP is the trans-activating transcriptional activator (Tat) from human immunodeficiency virus 1 (HIV-1). Examples of CPPs include Penetratin, Tat (48-60), Transportan, and (R-AhX-R4) (AhX refers to aminocaproyl). Examples of CPPs and related applications also include those described in U.S. patent 8,372,951.
CPPs can be used quite easily for in vitro and ex vivo work, although they generally require extensive optimization for each cargo and cell type. In certain examples, the CPP can be directly covalently linked to the Cas protein, which is then complexed with the gRNA and delivered to the cell. In certain examples, the CPP-Cas and CPP-gRNA can be delivered separately to multiple cells. CPPs may also be used to deliver RNPs.
DNA nanoclew
In certain embodiments, the delivery vehicle comprises a DNA nanoclew. A DNA nanoclew refers to a spherical structure of DNA (e.g., having the shape of a ball of yarn). The nanoclew can be synthesized by rolling circle amplification using palindromic sequences that facilitate self-assembly of the structure. The nanoclew may then be loaded with a payload. Examples of DNA nanoclews are described in Sun W et al., J Am Chem Soc. 2014 Oct 22;136:14722-5; and Sun W et al., Angew Chem Int Ed Engl. 2015 Oct 5;54(41):12029-33. The DNA nanoclew may have a palindromic sequence partially complementary to the gRNA within the Cas:gRNA ribonucleoprotein complex. The DNA nanoclew may be coated, for example with PEI, to induce endosomal escape.
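In the nucleic acid sense, a palindromic sequence is one that equals its own reverse complement, which is what allows such a sequence to base-pair with another copy of itself. The Python sketch below illustrates that test only; the function names and the example sequences are illustrative assumptions and are not sequences from the present disclosure or from the cited references.

```python
# Watson-Crick complements for the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence is non-empty and equals its reverse complement."""
    return len(seq) > 0 and reverse_complement(seq) == seq.upper()


if __name__ == "__main__":
    # GAATTC (the EcoRI recognition site) is a classic palindromic example.
    for candidate in ("GAATTC", "GATTACA"):
        print(candidate, is_palindromic(candidate))
```

A sequence of odd length cannot satisfy this test, because its central base would have to be its own complement.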
Gold nanoparticles
In certain embodiments, the delivery vehicle comprises gold nanoparticles (also known as AuNP or colloidal gold). Gold nanoparticles can form complexes with a cargo such as Cas:gRNA RNP. Gold nanoparticles may be coated, for example, with a silicate and an endosomal disruptive polymer, PAsp(DET). Examples of gold nanoparticles include the AuraSense Therapeutics spherical nucleic acid (SNA™) construct and those described in Mout R, et al. (2017). ACS Nano 11:2452-8; Lee K, et al. (2017). Nat Biomed Eng 1:889-901.
iTOP
In certain embodiments, the delivery vehicle comprises iTOP. iTOP refers to a combination of small molecules that drives efficient intracellular delivery of native proteins independently of any transduction peptide. iTOP stands for induced transduction by osmocytosis and propanebetaine; it uses NaCl-mediated hypertonicity together with a transduction compound (propanebetaine) to trigger macropinocytotic uptake of extracellular macromolecules into cells. Examples of iTOP methods and reagents include those described in D'Astolfo DS, Pagliero RJ, Pras A, et al. (2015). Cell 161:674-690.
Polymer-based particles
In certain embodiments, the delivery vehicle may include polymer-based particles (e.g., nanoparticles). In certain embodiments, the polymer-based particles may mimic the membrane fusion mechanism of a virus. The polymer-based particles may be a synthetic copy of the influenza virus machinery and form transfection complexes with various types of nucleic acids (siRNA, miRNA, plasmid DNA or shRNA, mRNA) that are taken up by the cell via the endocytic pathway, a process involving the formation of acidic compartments. The low pH in late endosomes acts as a chemical switch, rendering the surface of the particles hydrophobic and facilitating membrane penetration. Once in the cytosol, the particles release their payload for cellular action. This Active Endosome Escape technique is safe and maximizes transfection efficiency because it uses the natural uptake pathway. In certain embodiments, the polymer-based particles may comprise alkylated and carboxyalkylated branched polyethylenimine. In certain examples, the polymer-based particles are VIROMER, e.g., VIROMER RNAi, VIROMER RED, VIROMER mRNA, VIROMER CRISPR. Examples of methods of delivering the systems and compositions herein include those described in the following documents: Bawage SS et al., Synthetic mRNA expressed Cas13a mitigates RNA virus infections, www.biorxiv.org/content/10.1101/370460v1.full, doi:10.1101/370460; RED, a powerful tool for transfection of keratinocytes, doi:10.13140/RG.2.2.16993.61281; Transfection-Factbook 2018: technology, product overview, users' data, doi:10.13140/RG.2.2.23912.16642.
Streptolysin O (SLO)
The delivery vehicle may be streptolysin O (SLO). SLO is a toxin produced by Group A streptococci that acts by creating pores in mammalian cell membranes. SLO can function in a reversible manner, which allows the delivery of proteins (e.g., up to 100 kDa) to the cytosol of the cell without compromising overall viability. Examples of SLO include those described in the following documents: Sierig G, et al. (2003). Infect Immun 71:446-55; Walev I, et al. (2001). Proc Natl Acad Sci USA 98:3185-90; Teng KW, et al. (2017). Elife 6:e25460.
Multifunctional encapsulated nanodevices (MEND)
The delivery vehicle may include a Multifunctional Encapsulated Nanodevice (MEND). The MEND may include condensed plasmid DNA, a PLL core and a lipid membrane shell. The MEND may further comprise a cell-penetrating peptide (e.g., stearyl octaarginine). The cell-penetrating peptide may be in the lipid shell. The lipid shell may be modified with one or more functional components, such as one or more of the following: polyethylene glycol (e.g., to increase vascular circulation time), ligands for targeting specific tissues/cells, other cell-penetrating peptides (e.g., for greater cellular delivery), lipids that enhance endosomal escape, and nuclear delivery tags. In certain examples, the MEND may be a tetra-lamellar MEND (T-MEND) that may target nuclei and mitochondria. Examples of MENDs include those described in the following documents: Kogure K, et al. (2004). J Control Release 98:317-23; Nakamura T, et al. (2012). Acc Chem Res 45:1113-21.
Lipid coated mesoporous silica particles
The delivery vehicle may comprise lipid-coated mesoporous silica particles. The lipid-coated mesoporous silica particles may comprise a mesoporous silica nanoparticle core and a lipid membrane shell. The silica core may have a large internal surface area resulting in high cargo loading capacity. In certain embodiments, pore size, pore chemistry, and overall particle size may be modified to load different types of cargo. The lipid coating of the particles may also be modified to maximize cargo loading, increase circulation time, and provide accurate targeting and cargo release. Examples of lipid-coated mesoporous silica particles include those described in the following documents: du X, et al (2014) Biomaterials 35:5580-90; durfee PN, et al (2016) ACS Nano 10:8325-45.
Inorganic nanoparticles
The delivery vehicle may include inorganic nanoparticles. Examples of inorganic nanoparticles include carbon nanotubes (CNTs) (e.g., as described in Bates K and Kostarelos K. (2013) Adv Drug Deliv Rev 65:2023-33), bare mesoporous silica nanoparticles (MSNP) (e.g., as described in Luo GF, et al. (2014) Sci Rep 4:6064), and dense silica nanoparticles (SiNPs) (e.g., as described in Luo D and Saltzman WM. (2000) Nat Biotechnol 18:893-5).
Application method
The compositions and systems herein are useful in a variety of applications, including the modification of non-animal organisms such as plants and fungi, the modification of animals, and the treatment and diagnosis of diseases in plants, animals and humans. In general, the compositions and systems can be introduced into cells, tissues, organs, or organisms, where they modify the expression and/or activity of one or more genes (e.g., PCSK9).
In certain embodiments, the expression of the PCSK9 gene product is reduced in cells into which the compositions and systems described herein are introduced. In certain embodiments, the reduction in PCSK9 gene product expression is transient. In certain embodiments, reduced expression of the PCSK9 gene product is stable. In certain embodiments, the reduction in PCSK9 gene product expression is heritable.
In certain embodiments, the plurality of cells modified by the compositions and systems herein comprise a reduction in the expression of PCSK9 gene product of at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% relative to cells not incorporating the compositions and systems described herein.
In certain embodiments, cells expanded or derived from a plurality of cells modified by the compositions and systems described herein also comprise at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% reduced expression of PCSK9 gene product relative to cells expanded or derived from cells not incorporating the compositions and systems described herein.
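The percentage reductions recited above can be read as a comparison between an expression level measured in modified cells and the level measured in control cells. The Python sketch below shows one way to compute that figure and the highest recited threshold it meets. The function names are hypothetical, and the measurement units (for example normalized PCSK9 mRNA or protein level) and the example values are assumptions, not data from the present disclosure.

```python
def percent_reduction(modified_level, control_level):
    """Percent reduction of expression in modified cells relative to control.

    Both levels must be in the same (arbitrary) units; control must be > 0.
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    if modified_level < 0:
        raise ValueError("modified level must be non-negative")
    return (1.0 - modified_level / control_level) * 100.0


def highest_threshold_met(reduction_pct,
                          thresholds=(10, 20, 30, 40, 50, 60, 70, 80, 90)):
    """Return the largest 'at least N%' threshold satisfied, or None."""
    met = [t for t in thresholds if reduction_pct >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    reduction = percent_reduction(modified_level=25.0, control_level=100.0)
    print(reduction, highest_threshold_met(reduction))
```

With these hypothetical values the reduction is 75%, which meets the "at least 70%" threshold but not the "at least 80%" threshold.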
Cells and organisms
The present disclosure provides cells, tissues, and organisms comprising the engineered Cas protein, the CRISPR-Cas system, polynucleotides encoding one or more components of the CRISPR-Cas system, and/or vectors comprising the polynucleotides. The present disclosure also provides nucleotide sequences encoding effector proteins that are codon optimized for expression in eukaryotes or eukaryotic cells in any of the methods or compositions described herein. In one embodiment of the disclosure, the codon-optimized effector protein is any Cas protein discussed herein, and is codon-optimized for operability in eukaryotic cells or organisms, such as the cells or organisms mentioned elsewhere herein, for example, but not limited to, yeast cells or mammalian cells or organisms, including mouse cells, rat cells, and human cells, or non-human eukaryotes such as plants.
In certain embodiments, the modification of the target locus of interest may result in: eukaryotic cells in which the expression of at least one gene product is altered; a eukaryotic cell in which expression of at least one gene product is altered, wherein expression of the at least one gene product is increased; a eukaryotic cell in which the expression of at least one gene product is altered, wherein the expression of the at least one gene product is reduced; or eukaryotic cells containing an edited genome.
In certain embodiments, the eukaryotic cell may be a mammalian cell or a human cell.
In other embodiments, the non-naturally occurring or engineered compositions, carrier systems, or delivery systems described in this specification can be used to: site-specific gene knockout; site-specific genome editing; RNA sequence specific interference; or multiple genome engineering.
Also provided herein is a gene product from a cell, cell line or organism described herein. In certain embodiments, the amount of expressed gene product may be greater or less than the amount of gene product from a cell that does not have an altered expressed or edited genome. In certain embodiments, the gene product may be altered as compared to a gene product from a cell that does not have an altered expressed or edited genome.
Exemplary therapies
The present disclosure provides the use of the CRISPR-Cas system in the treatment of various diseases and disorders. In certain embodiments, the disclosure described herein relates to a method of treatment, wherein cells are edited by a CRISPR system or base editor to modulate at least one gene, and then the edited cells are administered to a patient in need thereof. In certain embodiments, the editing involves knocking in, knocking out or knocking down the expression of at least one target gene in the cell. In particular embodiments, the editing inserts an exogenous gene, minigene, or sequence, which may include one or more exons, into the locus of the target gene, a hotspot locus, a safe harbor locus in the genomic position of the gene (where new genes or genetic elements may be introduced without disrupting expression or regulation of neighboring genes), in trans or in a natural or synthetic trans manner, or corrects by inserting or deleting one or more mutations in the DNA sequence encoding the regulatory element of the target gene. In certain embodiments, the editing comprises introducing one or more point mutations in a nucleic acid (e.g., genomic DNA) in the target cell.
In certain embodiments, the treatment is directed to diseases/disorders of organs, including liver disease, ocular disease, muscle disease, heart disease, blood disease, brain disease, kidney disease, or may include treatment directed to autoimmune disease, central nervous system disease, cancer and other proliferative diseases, neurodegenerative diseases, inflammatory disease, metabolic disorders, musculoskeletal disorders, and the like.
In certain embodiments, the disease is associated with high cholesterol, and modulation of cholesterol (e.g., LDL) is provided. In certain embodiments, modulation is effected by modification of the target gene PCSK9. PCSK9 is associated with diseases and disorders such as, but not limited to, the following: abetalipoproteinemia, adenoma, arteriosclerosis, atherosclerosis, cardiovascular diseases, gallstones, coronary arteriosclerosis, coronary heart disease, non-insulin dependent diabetes mellitus, hypercholesterolemia, familial hypercholesterolemia, hyperinsulinemia, hyperlipidemia, familial combined hyperlipidemia, hypobetalipoproteinemia, chronic renal failure, liver disease, liver tumor, melanoma, myocardial infarction, narcolepsy, tumor metastasis, Wilms' tumor, obesity, peritonitis, pseudoxanthoma elasticum, cerebrovascular accident, vascular disease, xanthomatosis, peripheral vascular disease, myocardial ischemia, dyslipidemia, impaired glucose tolerance, xanthoma, polygenic hypercholesterolemia, secondary hepatic malignancy, dementia, overweight, chronic hepatitis C, carotid atherosclerosis, type IIa hyperlipoproteinemia, intracranial atherosclerosis, ischemic stroke, acute coronary syndrome, aortic calcification, cardiovascular morbidity, type IIb hyperlipoproteinemia, peripheral arterial disease, familial hyperaldosteronism type II, familial hypobetalipoproteinemia, autosomal recessive hypercholesterolemia, autosomal dominant hypercholesterolemia 3, coronary artery disease, liver cancer, ischemic cerebrovascular accident, and arteriosclerotic cardiovascular disease NOS. Epigenetic modifications of the PCSK9 gene using any of the methods described herein may be used to treat, prevent, and/or alleviate symptoms of the diseases and disorders described herein.
Dyslipidemia is a genetic disease characterized by elevated levels of lipids in the blood, leading to the occurrence of arterial blockages (atherosclerosis). These lipids include plasma cholesterol, triglycerides, high density lipoproteins or low density lipoproteins. Dyslipidemia increases the risk of heart attacks, strokes, or other circulatory system problems. Current treatments include lifestyle changes such as exercise and diet adjustment, and the use of lipid-lowering drugs such as statins. Non-statin lipid-lowering drugs include bile acid sequestrants, cholesterol absorption inhibitors, homozygous familial hypercholesterolemia drugs, fibrates, niacin, omega-3 fatty acids and/or combination products. The treatment regimen will generally depend on the particular lipid abnormality, although different lipid abnormalities tend to coexist. Treatment of children is more challenging because dietary changes can be difficult to implement and lipid-lowering therapies have not proven effective. Epigenetic modifications of the PCSK9 gene using any of the methods described herein may be used to treat, prevent, and/or alleviate symptoms of dyslipidemia (e.g., LDL dysregulation).
The activity of PCSK9 is mainly limited to the liver, and PCSK9 is associated with dyslipidemia, familial hypercholesterolemia, gastric papillary adenocarcinoma, homozygous familial hypercholesterolemia, and nasopharyngitis. PCSK9-associated familial hypercholesterolemia is an autosomal dominant genetic disease in which the body develops dangerous blood cholesterol levels due to a lack of low density lipoprotein cholesterol receptors. PCSK9-related familial hypercholesterolemia affects approximately 1 in 500 individuals in heterozygous form and 1 in 1,000,000 individuals in homozygous form in the world population, and is more common among South African Afrikaners, French Canadians, Lebanese and Finns. Common symptoms of PCSK9-related familial hypercholesterolemia include elevated circulating cholesterol in low density lipoproteins alone or also in very low density lipoproteins. Current treatment of PCSK9-related familial hypercholesterolemia involves administration of statin drugs to inhibit hydroxymethylglutaryl-CoA reductase (HMG-CoA reductase) in the liver. Another option for the treatment of PCSK9-related familial hypercholesterolemia is ezetimibe, which inhibits the absorption of cholesterol in the gut.
In certain embodiments, the epigenetic modification of the PCSK9 gene of any of the methods described herein can target the liver, i.e., the primary site of PCSK9 activity.
Examples
Example 1: fusion molecule plasmid construction and knockdown efficiency
Two plasmids were constructed to form an "EPICAS" system (which can be used interchangeably with the "CRISPRoff" system) (fig. 1A). The "fusion molecule" or "catalytic protein" plasmid encodes dCas9, DNMT3A, DNMT3L, and KRAB peptides. The fused DNMT3A and DNMT3L (3A3L) peptides are located at the N-terminus of dCas9, and KRAB is located at the C-terminus of dCas9. Thus, the fusion molecule has the arrangement 3A3L-dCas9-KRAB from the N-terminus to the C-terminus. The "sgRNA" plasmid encodes an sgRNA sequence that targets the Pcsk9 gene. Multiple sgRNAs were designed to target regions within 250 bp upstream and downstream of the transcription start site (TSS) of the mouse Pcsk9 gene.
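The sgRNA design step described above (candidate targets within 250 bp on either side of the TSS) can be illustrated with a minimal sketch. It assumes a 20-nt spacer and an NGG protospacer-adjacent motif, as is typical for Streptococcus pyogenes Cas9; the specification does not state which PAM or spacer length was used, and the function names and toy sequence are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_protospacers(seq, tss_index, window=250, spacer_len=20):
    """List candidate spacers lying within +/- window bp of the TSS.

    Assumption (not from the specification): an NGG PAM immediately 3' of
    the spacer. Returns tuples (strand, start, spacer, pam), where start is
    the 0-based index of the site on the forward strand.
    """
    seq = seq.upper()
    lo = max(0, tss_index - window)
    hi = min(len(seq), tss_index + window)
    site_len = spacer_len + 3
    hits = []
    for i in range(lo, hi - site_len + 1):
        site = seq[i:i + site_len]
        # Forward strand: spacer followed by NGG.
        if site[-2:] == "GG":
            hits.append(("+", i, site[:spacer_len], site[spacer_len:]))
        # Reverse strand: CCN followed by the spacer on the forward strand.
        if site[:2] == "CC":
            rc = reverse_complement(site)
            hits.append(("-", i, rc[:spacer_len], rc[spacer_len:]))
    return hits


# Toy sequence for illustration only; not a real Pcsk9 promoter sequence.
toy = "A" * 10 + "GATTACAGATTACAGATTAC" + "TGG" + "A" * 10
for hit in find_protospacers(toy, tss_index=20):
    print(hit)
```

In practice, candidates found this way would still need to be ranked for predicted off-target similarity before testing, which this sketch does not attempt.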
Each sgRNA plasmid was co-transfected with the catalytic protein plasmid into the mouse AML12 cell line. After 72 hours, the top 10% of GFP+ and mCherry+ cells were sorted by FACS. RT-qPCR experiments were performed to assess mRNA expression levels of Pcsk9. Twelve of the 13 sgRNAs tested showed significant down-regulation of Pcsk9 expression in AML12 cells. Cells transfected with sgRNA9 showed knockdown of up to about 82% (fig. 1B).
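As an illustration of how a knockdown percentage can be derived from RT-qPCR data, the sketch below uses the comparative Ct (2^-ΔΔCt) calculation. This is an assumption: the specification does not state which quantification method or reference gene was used, the calculation presumes roughly 100% amplification efficiency, and the Ct values shown are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative target expression (treated vs. control) by 2^-ddCt.

    Assumes ~100% amplification efficiency for both amplicons.
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    return 2.0 ** -(d_ct_treated - d_ct_control)


def knockdown_percent(rel_expr):
    """Percent reduction in expression relative to the control."""
    return (1.0 - rel_expr) * 100.0


# Hypothetical Ct values chosen for illustration only.
rel = relative_expression(26.5, 18.0, 24.0, 18.0)
print(f"relative expression: {rel:.3f}, knockdown: {knockdown_percent(rel):.1f}%")
```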
Next, combinations of sgRNA9 with other individual sgRNAs were tested to determine whether combining more than one sgRNA could further reduce the gene expression level of Pcsk9 in AML12 cells (fig. 1C). Among the combinations tested, sgRNA7 and sgRNA9 together showed the highest level of inhibition. Various combinations of sgRNAs were also tested to determine whether combinations of more than one sgRNA could reduce the gene expression level of Pcsk9 in Ai9 primary hepatocytes (fig. 1D). All combinations significantly knocked down the expression level of Pcsk9. The smallest reduction was observed in cells co-transfected with sgRNAs 7, 8 and 9. Pcsk9 silencing persisted in primary hepatocytes for at least two weeks, and the combination of sgRNA7 and sgRNA9 showed the highest inhibition efficiency (up to 81%) of Pcsk9 gene expression. Taken together, these results suggest that the EPICAS system can be used to induce efficient and durable silencing of the Pcsk9 gene in mouse hepatocytes.
Example 2: in vitro transcription of mRNA encoding fusion molecules
In vitro transcription and purification were used to produce mRNA corresponding to the fusion molecule or catalytic protein of the EPICAS system. First, a plasmid containing all fusion molecule elements, including the 5'UTR-DNMT3A-DNMT3L-dCas9-KRAB-3'UTR-polyA expression cassette, was constructed. The plasmid was linearized by XbaI and BpiI restriction endonuclease digestion (FIG. 2A). An in vitro transcription reaction containing the linearized DNA template, T7 RNA polymerase, NTPs and cap analogue was performed to produce an mRNA containing N1-methylpseudouridine. After digestion of the DNA template with DNase I, the mRNA product was purified and buffer exchanged, and the purity of the final mRNA product was assessed by capillary gel electrophoresis (fig. 2B). 100-mer sgRNAs were chemically synthesized by a commercial supplier under solid-phase synthesis conditions with minimal end modifications. To test the function of the in vitro transcribed mRNA, a Snrpn-GFP reporter system was constructed in HEK293T cells (FIG. 2C). The reporter system uses a synthetic methylation-sensitive promoter (conserved sequence elements from the promoter of the imprinted gene Snrpn) to control GFP expression. Insertion of the reporter construct in the genomic locus reveals the methylation status of the adjacent sequences. The in vitro transcribed mRNA was co-transfected into reporter cells with sgRNA targeting the Snrpn gene. Eight days post-transfection, 25.3% of the reporter cells were GFP negative, significantly higher than in the control group transfected with non-targeting sgRNAs (fig. 2D). GFP negative cells were sorted by FACS and cultured for 30 days. At 30 days post-transfection, 93.2% of the cells in the reporter system were GFP negative, whereas few GFP negative cells were found in the control group (fig. 2D, 2E). At 70 days and 90 days post-transfection, 86.1% and 87.3% of the cells in the reporter system were GFP negative, respectively (fig. 2I).
At 150 days and 400 days post-transfection (i.e., up to 400 cell divisions), 92.7% and 88.1% of cells in the reporter system were GFP negative, respectively. This demonstrates the persistence of epigenomic editing using CRISPRoff system (fig. 2D). In addition, the DNA methylation level at the Snrpn locus was analyzed by a bisulfite PCR assay. The methylation level of the reporter cells (GFP-OFF group) was significantly higher than that of the control cells (GFP-ON group) (FIG. 2F). This result was accompanied by the high CpG methylation observed at the Snrpn locus (FIGS. 2F, 2G, 2I). Taken together, these results indicate that transient expression and mRNA scanning of the EPICAS system silences the expression of the target gene over a long period of time.
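The bisulfite PCR readout mentioned above can be summarized per CpG site as the fraction of reads that retain a cytosine, since unmethylated cytosines are converted and read as thymine. The sketch below is a simplified illustration under stated assumptions: reads are already aligned to the forward strand and have the same length as the reference amplicon; the function name and toy sequences are hypothetical.

```python
def cpg_methylation(reference, reads):
    """Per-CpG methylation fraction from aligned bisulfite reads.

    reference: unconverted forward-strand amplicon sequence.
    reads: bisulfite-converted reads, same length and orientation as
           the reference (a simplifying assumption of this sketch).
    Returns {cpg_position: fraction methylated, or None if no C/T calls}.
    """
    reference = reference.upper()
    cpg_positions = [i for i in range(len(reference) - 1)
                     if reference[i:i + 2] == "CG"]
    levels = {}
    for pos in cpg_positions:
        calls = [read[pos].upper() for read in reads
                 if read[pos].upper() in "CT"]
        # A retained C means the cytosine was protected, i.e. methylated.
        levels[pos] = (sum(c == "C" for c in calls) / len(calls)
                       if calls else None)
    return levels


# Toy amplicon and reads for illustration only.
print(cpg_methylation("ACGTTCGA", ["ACGTTTGA", "ATGTTTGA"]))
```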
Example 3: Formulation and characterization of lipid nanoparticles (LNPs) encapsulating mRNA encoding the fusion molecule and sgRNA
LNPs were formulated using standard methods known in the art for delivery of fusion molecule mRNA and sgRNA to human hepatocytes. For the mouse study, LNPs were formulated as described previously, with some modifications (1). Briefly, an ethanol solution of 1,2-distearoyl-sn-glycero-3-phosphorylcholine, cholesterol, PEG lipid, and ionizable cationic lipid was rapidly mixed with an aqueous solution (pH 4) containing mRNA and sgRNA (1:1 weight ratio) at a flow ratio (ethanol:aqueous phase) of 1:3 using an in-line mixer. The N:P ratio between the ionizable lipid and the nucleic acid was maintained between 4 and 6 throughout the study.
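The N:P ratio and the 1:3 flow ratio above lend themselves to a short worked calculation. The sketch below is illustrative only: it assumes one ionizable amine per ionizable lipid molecule and an average mass of about 330 g/mol per RNA nucleotide (one phosphate each). Neither value is given in the specification, and the function names are hypothetical.

```python
AVG_NT_MASS_G_PER_MOL = 330.0  # assumed average RNA nucleotide mass


def phosphate_nmol(rna_ug):
    """Approximate nmol of phosphate in a given mass of RNA (ug)."""
    return rna_ug * 1000.0 / AVG_NT_MASS_G_PER_MOL


def np_ratio(lipid_nmol, amines_per_lipid, rna_ug):
    """Molar ratio of ionizable amines (N) to RNA phosphates (P)."""
    return lipid_nmol * amines_per_lipid / phosphate_nmol(rna_ug)


def lipid_nmol_for_target(rna_ug, target_np, amines_per_lipid=1):
    """nmol of ionizable lipid needed to reach a target N:P ratio."""
    return target_np * phosphate_nmol(rna_ug) / amines_per_lipid


def split_flow(total_ml_per_min, ethanol_parts=1, aqueous_parts=3):
    """Split a total flow rate into (ethanol, aqueous) at the given ratio."""
    parts = ethanol_parts + aqueous_parts
    return (total_ml_per_min * ethanol_parts / parts,
            total_ml_per_min * aqueous_parts / parts)


# Example: 33 ug total RNA (mRNA + sgRNA), target N:P of 6, 12 mL/min total.
print(lipid_nmol_for_target(33.0, 6.0))  # nmol of ionizable lipid
print(split_flow(12.0))                  # (ethanol, aqueous) in mL/min
```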
The resulting LNP formulation was dialyzed against 1× PBS overnight, sterile-filtered through a 0.2 μm filter, and stored at 4 °C until use. The particle size was in the range of 70-90 nm (Z-Ave, hydrodynamic diameter) and the polydispersity index was <0.2, as determined by dynamic light scattering (Malvern NanoZS Zetasizer). Encapsulation efficiency of RNA in LNP was measured by the Quant-iT RiboGreen Assay (Life Technologies).
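The quality attributes described above can be expressed as a simple calculation. The sketch below assumes the common way of running a RiboGreen-type assay, in which free RNA is measured on intact particles and total RNA is measured after the particles are disrupted with detergent; the specification does not describe the assay procedure, and the function names are hypothetical. The size and polydispersity limits are the ranges stated above.

```python
def encapsulation_efficiency(free_rna, total_rna):
    """Percent of RNA encapsulated: (total - free) / total * 100.

    free_rna:  RNA signal measured on intact LNPs (unencapsulated RNA).
    total_rna: RNA signal measured after LNP disruption (assumed protocol).
    Both values must be in the same units.
    """
    if total_rna <= 0:
        raise ValueError("total_rna must be positive")
    return (total_rna - free_rna) / total_rna * 100.0


def passes_size_qc(z_ave_nm, pdi, size_range=(70.0, 90.0), max_pdi=0.2):
    """Check a batch against the 70-90 nm and PDI < 0.2 ranges in the text."""
    return size_range[0] <= z_ave_nm <= size_range[1] and pdi < max_pdi


print(encapsulation_efficiency(free_rna=5.0, total_rna=100.0))
print(passes_size_qc(z_ave_nm=78.2, pdi=0.10))
```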
Cryo-TEM sample preparation and imaging
LNP samples (3-5 μl) were dispensed onto a plasma-cleaned grid (Quantifoil, R1.2/1.3, 300 or 400 mesh copper) in an FEI Vitrobot chamber at 95% relative humidity and allowed to stand for 30-60 s. The grid was then blotted with filter paper for 3 s and plunged into liquid ethane cooled by liquid nitrogen. Cryo-EM imaging was performed on an FEI Talos F200C operated at an acceleration voltage of 200 kV.
The LNPs contain a 1:1 weight ratio of fusion molecule mRNA and sgRNA targeting the Pcsk9 gene (FIG. 3A). Lipid nanoparticles (LNPs) were formulated using an impinging stream reactor or microfluidic device from a mixture of ionizable lipids (20%-70%, molar ratio), PEGylated lipids (0%-30%, molar ratio), supportive lipids (30%-50%, molar ratio) and cholesterol (10%-50%, molar ratio). By varying the proportion of ionizable lipids, the release kinetics of sgRNAs and mRNAs can be altered. With a higher proportion of ionizable lipids (molar ratio higher than 55%), the release of sgRNAs is much faster than that of mRNA. Transmission electron microscopy (TEM) images showed that the LNPs were spherical, nano-sized particles (fig. 3B). As measured by dynamic light scattering (NanoZS, Malvern), the LNPs had a uniform size (78.2±5.2 nm, PDI < 0.10) (fig. 3C).
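The molar ranges listed above can be captured as a small validation helper for a candidate lipid composition. This is only a sketch: the dictionary keys and function name are hypothetical, and the example composition is illustrative rather than a formulation disclosed in the specification.

```python
# Molar-percent ranges taken from the text above.
RANGES = {
    "ionizable": (20.0, 70.0),
    "peg": (0.0, 30.0),
    "supportive": (30.0, 50.0),
    "cholesterol": (10.0, 50.0),
}


def check_composition(mol_percent, tol=0.01):
    """Return a list of problems; an empty list means the composition fits."""
    problems = []
    for lipid, (low, high) in RANGES.items():
        value = mol_percent.get(lipid)
        if value is None:
            problems.append(f"{lipid}: missing")
        elif not low <= value <= high:
            problems.append(f"{lipid}: {value}% outside {low}-{high}%")
    total = sum(mol_percent.values())
    if abs(total - 100.0) > tol:
        problems.append(f"components sum to {total}%, expected 100%")
    return problems


# Illustrative composition only.
print(check_composition(
    {"ionizable": 40.0, "peg": 2.0, "supportive": 30.0, "cholesterol": 28.0}
))
```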
Example 4: PCSK9 gene silencing in mice using LNP delivery of mRNA and sgRNA encoding fusion molecules
Next, the use of the EPICAS system (also referred to as the "CRISPRoff" system) was tested for silencing Pcsk9 expression in vivo. LNP was administered to C57BL/6J mice by tail vein injection (FIG. 3E). Five days after injection, mice were euthanized, and liver samples were obtained and processed to purify mRNA. RT-qPCR experiments were performed to assess the knockdown efficiency of the Pcsk9 gene in mice. The expression level of Pcsk9 in LNP-injected mice was significantly lower than in the control group (fig. 3F), indicating the efficacy of the EPICAS system in silencing Pcsk9 gene expression in vivo.
To test whether LNP can successfully deliver mRNA to mouse hepatocytes in vivo, LNP containing luciferase mRNA was produced and injected into wild-type mice by intramuscular injection.
In vivo Luc mRNA delivery
To examine the in vivo distribution of Luc mRNA-LNP, 6-8 week old female BALB/c mice (n=5) were injected with 5 μg Luc mRNA. At the prescribed test time points, mice were injected with 0.2 mL of D-luciferin (15 mg/mL in DPBS) and imaged using the IVIS Lumina system (PerkinElmer).
For ex vivo imaging, female BALB/c mice (n=2) of 6-8 weeks of age were injected with 5 μg Luc mRNA. After 12 hours, animals were injected intraperitoneally (i.p.) with 0.2 mL D-luciferin (15 mg/mL in DPBS), which was allowed to react for 5 minutes. Tissues including heart, liver, spleen, lung and kidney were immediately collected, and the bioluminescence signal of each tissue was monitored with the IVIS Lumina system.
LNP was shown to efficiently deliver luciferase mRNA into mouse hepatocytes by in vivo bioluminescence imaging (fig. 3D). The results indicate that LNP can also deliver relatively large mRNA into mouse liver efficiently by tail vein injection; as shown, almost all hepatocytes exhibited tdTomato fluorescence (fig. 6A). In addition, LNP-delivered luciferase mRNA accumulated in the mouse liver 6 hr post injection (fig. 6B), gradually decreased in the mouse liver at 12 hr and 24 hr post injection, and was absent from the liver 48 hr post injection (fig. 6C).
LNP treatment in mice
Mouse studies were approved by Beijing Vital River Laboratory Animal Technology Co., Ltd. and SLAC Laboratory, and the mice were used for experiments. Ai9 (C57BL/6J genetic background) and C57BL/6J wild type mice were housed under a 12 h light/12 h dark cycle in specific-pathogen-free conditions. The use and care of animals conformed to the guidelines of the national institutes of sciences of life sciences of Shanghai, China. Male C57BL/6J mice were used for experiments at 8-10 weeks of age, the mice were randomly assigned to each experimental group, and data collection and analysis were performed in a blinded manner. LNP was administered to mice by lateral tail vein injection in 200 μl PBS. Mice were sacrificed at the indicated time points, and liver samples and serum were obtained at necropsy for RNA extraction or serum biochemical analysis.
For CRISPRoff delivery in C57BL/6J adult wild type mice, CRISPRoff mRNA and sgRNA (sgRNA 7 (SEQ ID NO: 33) and sgRNA 9 (SEQ ID NO: 35)) were delivered in a 1:1 weight ratio by intravenous injection of LNP (FIGS. 9A, 9B). Liver tissue was collected 7 days after injection for qPCR analysis. The expression of Pcsk9 was significantly reduced compared to PBS-injected control mice (fig. 6D). Further studies showed that CRISPRoff-containing LNP at doses of 1.5, 3.0, 6.0 and 10 mg/kg resulted in 76%, 93%, 97% and 98% inhibition of Pcsk9 expression, respectively, with a near-plateau effect reached at 3.0 mg/kg (FIG. 6D). In CRISPRoff-treated mice, blood Pcsk9 protein levels were also significantly reduced, with a dose dependence similar to that of Pcsk9 expression in liver tissue (fig. 6E). High levels of DNA methylation at the Pcsk9 gene promoter were observed at 3.0 and 6.0 mg/kg, whereas no abnormalities were seen in blood chemistry (AST, ALT, ALP and ALB) (fig. 9C, 9D).
Stable PCSK9 methylation and reduced expression following liver excision and regeneration
Partial hepatectomy (PHx) induced liver regeneration mouse model
Mice were partially hepatectomized (PHx) as described previously (2). Briefly, we used general anesthesia, a small upper midline incision, silk suture to bind the lobe to be resected, a warm pad and light, and subcutaneous saline injection to ensure minimal morbidity.
High fat diet induced hypercholesterolemia murine model
High fat diet mice were obtained from Jiangsu Ji-kang Biotechnology Co., Ltd. (Nanjing, China). Mice were kept under specific-pathogen-free conditions and a 12 h light/12 h dark cycle and fed a high fat diet (HFD), i.e., a 60 kcal% saturated (lard) fat diet obtained from Research Diets, Inc. (New Brunswick, NJ), for 24 weeks. Mice with blood LDL-C levels greater than 25 mg/dL were selected for the experiment.
Next, the persistence of the CRISPRoff-induced reduction in Pcsk9 at the 6 mg/kg dose was assessed. Protein expression levels of Pcsk9 in the blood of CRISPRoff-treated mice were reduced by 88%, 81%, 82% and 77% at 2, 4, 6 and 8 weeks after LNP injection, respectively, indicating sustained silencing of the target gene in vivo by CRISPRoff (fig. 6F). To further demonstrate the heritability of CRISPRoff-mediated epigenomic editing, liver resection experiments were performed to measure gene silencing effects after liver regeneration. Specifically, mice were administered LNP containing CRISPRoff on day 0, and either a partial hepatectomy (PHx) (14) or a sham procedure was performed on day 7 (fig. 6G). Liver tissue samples were collected on day 14 after the PHx experiments, at which time liver regeneration was almost complete (fig. 6G). Expression of Pcsk9 mRNA in the liver after PHx showed a reduction similar to that in the sham surgery group (93% ± 11% and 92% ± 9%, P < 0.0001, t-test) (fig. 6H). Furthermore, CpG methylation at the Pcsk9 promoter remained high after PHx (fig. 6I). Since a large number of hepatocytes regenerate through cell division after PHx, these results indicate that CRISPRoff-mediated epigenomic editing is heritable during in vivo cell division, which is advantageous for therapeutic design.
Sustained reduction of blood LDL in high fat diet fed mice
Serum LDL-C levels have been shown to increase with a high fat diet (HFD) (15). The effect of epigenome editing on the reduction of Pcsk9 levels in HFD-fed mice was evaluated. Specifically, 6 week old male C57BL/6J mice were fed an HFD for 6 months and then treated by tail vein injection with either LNP encapsulating CRISPRoff mRNA and Pcsk9-targeting sgRNAs or PBS (fig. 7A). At 7 and 14 days post injection, the level of Pcsk9 in blood was significantly reduced at doses of 4 mg/kg and 6 mg/kg compared to the PBS group (fig. 7B, 7C). In addition, at 4 mg/kg and 6 mg/kg, serum LDL-C levels decreased by about 44% and about 58%, respectively, at 14 days post-injection and by about 43% and about 51%, respectively, at 21 days post-injection (fig. 7D). These results indicate that lowering Pcsk9 by epigenomic editing can efficiently and durably lower serum LDL-C levels in HFD mice, which is advantageous in therapeutic design.
Accurate silencing of PCSK9 by EPICAS systems without off-target effects
To investigate the potential off-target effects of epigenomic editing, the specificity of CRISPRoff editing at both the transcriptome and genome levels was assessed. RNA-seq was performed on mouse liver tissue 7 days after LNP delivery of CRISPRoff mRNA and Pcsk9-targeting sgRNAs. The whole-transcriptome gene expression levels in CRISPRoff-treated mice were not significantly different from those in PBS-treated mice, except that expression of the target gene Pcsk9 was silenced (fig. 8A and 10A). The expression levels of neighboring genes within a 1-Mb window of Pcsk9 also did not show significant differences between the two groups of mice (fig. 8B). Several non-target transcripts with more than 2-fold change (FDR < 0.05) were observed in the CRISPRoff-treated group (fig. 10B), but there was no significant difference in DNA methylation levels at these non-target transcripts between CRISPRoff-treated and PBS-treated mice (fig. 10B). In addition, high throughput whole genome bisulfite sequencing (WGBS) was performed to examine off-target CpG methylation in liver tissue. Except for Pcsk9, there was no significant difference in methylation levels at CpG sites between the two groups of mice, indicating a specific increase in DNA methylation at the Pcsk9 promoter in CRISPRoff-treated mice (fig. 8C). Detailed examination revealed that DNA methylation was up-regulated only in the promoter region targeted by the sgRNAs and did not spread to the Pcsk9 gene body or adjacent genes (fig. 8D and 10C). Examination of genes at or near the non-target differentially methylated regions revealed no transcriptional differences (fig. 10D), indicating that the non-specific methylation differences have little effect on gene expression. We also compared the methylation and gene expression levels of potential sgRNA-dependent off-target sites (with high sequence similarity to the target locus) and found no significant differences between CRISPRoff-treated and PBS-treated mice (fig. 8E).
Together, these results demonstrate that epigenetic mediated gene silencing induces little sgRNA-independent and sgRNA-dependent off-target effects in vivo.
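The transcript-level criterion used in this analysis (more than 2-fold change with FDR < 0.05) can be sketched as a simple filter. The example below assumes that the false discovery rate is controlled by the Benjamini-Hochberg procedure, which the specification does not state; the function names, gene names other than Pcsk9, and all numeric values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def changed_transcripts(results, min_abs_log2fc=1.0, max_fdr=0.05):
    """Genes with |log2 fold change| > 1 (i.e. > 2-fold) and FDR < 0.05.

    results: list of (gene, log2_fold_change, raw_p_value) tuples.
    """
    fdr = benjamini_hochberg([p for _, _, p in results])
    return [gene for (gene, lfc, _), q in zip(results, fdr)
            if abs(lfc) > min_abs_log2fc and q < max_fdr]


# Hypothetical values for illustration only.
example = [("Pcsk9", -5.6, 1e-8), ("GeneA", 0.2, 0.5), ("GeneB", 1.5, 0.4)]
print(changed_transcripts(example))
```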
In summary, the results demonstrate that the EPICAS (CRISPRoff) system can efficiently inhibit the expression of the target gene Pcsk9 in mouse liver by up to 98%, with efficacy higher than that of existing drugs (statins, antibodies and siRNA) and current gene editing techniques using CRISPR/Cas9 or base editors (8-12). Because epigenomic editing with the EPICAS (CRISPRoff) system does not cleave DNA, it reduces the potential risk of unwanted DNA repair-mediated editing at the target locus, which is beneficial for human therapy design. It also did not induce detectable off-target DNA methylation or changes in gene expression. This CRISPRoff-induced methylation can be reversed by CRISPR-mediated demethylation tools (4). Notably, CRISPRoff-dependent down-regulation of the target gene persisted after multiple rounds of cell division following transient delivery of CRISPRoff mRNA. In vivo delivery of gene editing tools using LNP may be preferable to AAV, as the long-term expression of editing tools that results from AAV delivery is neither necessary nor desirable. Transient LNP-mediated mRNA delivery avoids off-target effects caused by the prolonged presence of editors and other side effects of AAV, such as immune responses and genome integration of editing tools (16). Finally, epigenomic editing-induced Pcsk9 silencing and blood LDL-C reduction are robust and durable, providing a potential therapeutic strategy for treatment of FH. Such methods may also be applied to the treatment of other chronic diseases.
Example 5: monkey PCSK9 gene silencing in monkey cells
A variety of sgRNAs were designed to target the 250bp regions upstream and downstream of the Transcription Start Site (TSS) of the monkey Pcsk9 gene (FIG. 4A).
Each sgRNA plasmid was co-transfected into monkey cells with the catalytic protein (DNMT3A-DNMT3L-dCas9-KRAB) plasmid. RT-qPCR experiments were performed to assess mRNA expression levels of monkey Pcsk9. Three of the 5 sgRNAs tested showed significant down-regulation of Pcsk9 expression in monkey cells (fig. 4B). Transfection with the S2, S8 or S9 sgRNA down-regulated monkey Pcsk9 by about 90%.
Example 6: human PCSK9 gene silencing in human cell lines
A reporter cell line was constructed to test the efficiency of PCSK9 gene silencing in a human cell line. A plasmid with a CMV promoter-driven cassette was constructed, wherein the cassette has the following elements in the 5' to 3' direction: 5'-pCMV-(-300bp)-TSS-(+300bp)-PCSK9 exon 1-2A-GFP-3'. In this reporter system, the CMV promoter drives the expression of PCSK9 and GFP fluorescence. If PCSK9 is silenced, transcription of GFP is terminated. The reporter plasmid was transfected into HEK293T cells along with the PiggyBac transposase (PBase) plasmid. Cells that had successfully integrated the reporter cassette were sorted by FACS based on GFP fluorescence (fig. 5A).
A total of 109 sgRNAs were designed to target the region from 300 bp upstream to 300 bp downstream of the PCSK9 TSS. Plasmids were constructed to encode each sgRNA. Each sgRNA plasmid was co-transfected into the human reporter cell line with a plasmid encoding the fusion molecule. The decrease in average GFP intensity rates at 72 h and 120 h post-transfection was analyzed. The overall decrease in GFP intensity rate indicates the sensitivity of the reporter system (fig. 5B). The reduced GFP intensity rate was maintained at 120 h after transfection. Many sgRNAs showed much lower GFP intensity rates at 120 h than at 72 h post-transfection. These results indicate the efficacy and persistence of the EPICAS system in human cell lines.
Next, experiments were repeated with each sgRNA and the average GFP intensity rates were measured 72h post-transfection for comparison (fig. 5C). More than half of the designed sgrnas showed a significant decrease in the average fluorescence intensity rate, indicating that EPICAS system could induce targeted knockdown of PCSK9 expression in human cells.
Next, the use of the EPICAS system in silencing endogenous PCSK9 expression was tested in the human Hep3B cell line. Plasmids encoding the fusion molecule were co-transfected with various sgRNA plasmids into Hep3B cells. At 48 h after transfection, the mRNA expression level of human PCSK9 was measured by RT-PCR. Six sgRNAs resulted in the greatest decrease in PCSK9 expression levels (approximately >65%). These sgRNAs also resulted in a decrease in fluorescence intensity rate of approximately >50% in the human reporter cell line (fig. 5D). These results indicate that the EPICAS system can efficiently and durably silence the expression of endogenous PCSK9 in human cells.
Taken together, these results demonstrate that the EPICAS system successfully silences PCSK9 expression in both mouse and human cells and supports efficient and durable silencing of PCSK9 gene expression by epigenetic editing. LNP has been successfully used to deliver the EPICAS system in vivo. Thus, LNP formulations of the EPICAS system can be used to treat PCSK9-related diseases such as atherosclerotic cardiovascular disease by reducing the expression of PCSK9 and thereby lowering low density lipoprotein (LDL) cholesterol.
Other embodiments of the present disclosure include the following:
Embodiment 1. A method for reducing or eliminating expression of a proprotein convertase subtilisin/Kexin type 9 (PCSK 9) gene product in a cell, the method comprising the step of introducing into the cell:
A fusion molecule comprising at least one DNA binding protein and at least one gene expression modulator, or a nucleic acid sequence encoding said fusion molecule,
Wherein the gene expression modulator provides modification of at least one nucleotide in the vicinity of the PCSK9 gene and/or within a PCSK9 regulatory element,
Thereby reducing or eliminating expression of the PCSK9 gene product in the cell.
Embodiment 2. An in vivo method of reducing or eliminating expression of a PCSK9 gene product in a subject, the method comprising the step of introducing into cells of the subject:
A fusion molecule comprising at least one DNA binding protein and at least one gene expression modulator, or a nucleic acid sequence encoding said fusion molecule,
Wherein the gene expression modulator provides modification of at least one nucleotide in the vicinity of the PCSK9 gene and/or within a PCSK9 regulatory element,
Thereby reducing or eliminating expression of the PCSK9 gene product in the subject.
Embodiment 3. A method of reducing Low Density Lipoprotein (LDL) cholesterol in a subject, the method comprising the step of introducing into cells of the subject:
A fusion molecule comprising at least one DNA binding protein and at least one gene expression modulator, or a nucleic acid sequence encoding said fusion molecule,
Wherein the gene expression modulator provides modification of at least one nucleotide in the vicinity of the PCSK9 gene and/or within a PCSK9 regulatory element,
Thereby reducing LDL cholesterol in the subject.
Embodiment 4. A method of treating or alleviating a symptom of a PCSK 9-related disease in a subject, the method comprising the step of introducing into cells of the subject:
A fusion molecule comprising at least one DNA binding protein and at least one gene expression modulator, or a nucleic acid sequence encoding said fusion molecule,
Wherein the gene expression modulator provides modification of at least one nucleotide in the vicinity of the PCSK9 gene and/or within a PCSK9 regulatory element,
Thereby treating or alleviating a symptom of a PCSK9-related disease in the subject.
Embodiment 5. A method of expanding a population of cells with reduced expression of a PCSK9 gene product, the method comprising the steps of:
i) Introducing a fusion molecule comprising at least one DNA binding protein and at least one gene expression modulator or a nucleic acid sequence encoding said fusion molecule into a plurality of cells,
Wherein the gene expression modulator provides modification of at least one nucleotide in the vicinity of the PCSK9 gene and/or within a PCSK9 regulatory element;
ii) expanding the plurality of cells to produce a plurality of modified cells having reduced expression of the PCSK9 gene product,
Wherein the PCSK9 gene product expression of the plurality of modified cells is reduced by at least 50%, at least 60%, at least 70%, at least 80% or at least 90%, relative to a cell into which the fusion molecule or the nucleic acid sequence has not been introduced, and
Wherein the cells are hepatocytes.
Embodiment 6. The method of embodiment 5, wherein the decrease in expression of the PCSK9 gene product is a transient decrease.
Embodiment 7. The method of embodiment 5, wherein the decrease in expression of the PCSK9 gene product is a stable decrease.
Embodiment 8. The method of any one of embodiments 1-7, wherein the PCSK9 regulatory element is a core promoter, a proximal promoter, a distal enhancer, a silencer, an insulator element, a boundary element, or a locus control region.
Embodiment 9. The method of any one of embodiments 1-8, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within about 100bp, about 200bp, about 300bp, about 400bp, about 500bp, about 600bp, about 700bp, about 800bp, about 900bp, about 1000bp, about 1100bp, about 1200bp, about 1300bp, about 1400bp, or about 1500bp upstream of the transcription start site of the PCSK9 gene.
Embodiment 10. The method of embodiment 9, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within the PCSK9 regulatory element is located within 1000bp upstream of the transcription start site of the PCSK9 gene.
Embodiment 11. The method of embodiment 9, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within 300bp upstream of the transcription start site of the PCSK9 gene.
Embodiment 12. The method of any of embodiments 1-11, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within about 100bp, about 200bp, about 300bp, about 400bp, about 500bp, about 600bp, about 700bp, about 800bp, about 900bp, about 1000bp, about 1100bp, about 1200bp, about 1300bp, about 1400bp, or about 1500bp downstream of the transcription start site of the PCSK9 gene.
Embodiment 13. The method of embodiment 12, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within about 300bp downstream of the transcription start site of the PCSK9 gene.
Embodiment 14. The method of any one of embodiments 1-8, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is within 1000bp upstream and 300bp downstream of the transcription start site of the PCSK9 gene.
Embodiment 15. The method of any one of embodiments 1-14, wherein the modification of the at least one nucleotide is DNA methylation.
Embodiment 16. The method of any one of embodiments 1-15, wherein the at least one gene expression modulator comprises a DNA methyltransferase (DNMT), a DNA demethylase, a histone methyltransferase, a histone demethylase, or a portion thereof.
Embodiment 17. The method of embodiment 16, wherein the at least one gene expression modulator comprises a DNA methyltransferase (DNMT) or a portion thereof.
Embodiment 18. The method of embodiment 17, wherein the DNA methyltransferase is DNMT3A, DNMT3B, DNMT3L, DNMT1, or DNMT2.
Embodiment 19. The method of embodiment 18, wherein the DNMT3A comprises the amino acid sequence of SEQ ID NO: 23.
Embodiment 20. The method of embodiment 18, wherein the DNMT3L comprises the amino acid sequence of SEQ ID NO: 24.
Embodiment 21. The method of any one of embodiments 1-20, wherein the at least one gene expression modulator comprises a zinc finger protein-based transcription factor or portion thereof.
Embodiment 22. The method of embodiment 21, wherein the zinc finger protein-based transcription factor is a Krüppel-associated box (KRAB).
Embodiment 23. The method of embodiment 22, wherein the KRAB comprises SEQ ID NO: 22.
Embodiment 24. The method of any one of embodiments 1-23, wherein the at least one gene expression modulator comprises a DNA methyltransferase or a portion thereof and a zinc finger protein-based transcription factor or a portion thereof.
Embodiment 25. The method of embodiment 24, wherein the DNA methyltransferase is selected from the group consisting of DNMT3A and DNMT3L, and combinations thereof, and the zinc finger protein-based transcription factor is KRAB.
Embodiment 26. The method of any one of embodiments 1-25, wherein the at least one DNA binding protein is Cas9, dCas9, Cpf1, zinc finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN), homing endonuclease, dCas9-FokI nuclease, or MegaTAL nuclease.
Embodiment 27. The method of embodiment 26, wherein the at least one DNA binding protein is dCas9.
Embodiment 28. The method of embodiment 27, wherein the dCas9 comprises Staphylococcus aureus dCas9, Streptococcus pyogenes dCas9, Campylobacter jejuni dCas9, Corynebacterium diphtheriae dCas9, Eubacterium ventriosum dCas9, Streptococcus pasteurianus dCas9, Lactobacillus farciminis dCas9, Sphaerochaeta globus dCas9, Azospirillum (e.g., strain B510) dCas9, Gluconacetobacter diazotrophicus dCas9, Neisseria cinerea dCas9, Roseburia intestinalis dCas9, Parvibaculum lavamentivorans dCas9, Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9, Campylobacter lari (e.g., strain CF89-12) dCas9, or Streptococcus thermophilus (e.g., strain LMD-9) dCas9.
Embodiment 29. The method of embodiment 27, wherein the dCas9 comprises the amino acid sequence of SEQ ID NO: 1.
Embodiment 30. The method of any one of embodiments 1-29, wherein the fusion molecule comprises the at least one gene expression modulator fused to the C-terminus, the N-terminus, or both, of the at least one DNA binding protein.
Embodiment 31. The method of embodiment 30, wherein the at least one gene expression modulator is directly fused to the at least one DNA binding protein.
Embodiment 32. The method of embodiment 30, wherein the at least one gene expression modulator is indirectly fused to the at least one DNA binding protein via a non-modulator, a second modulator, or a linker.
Embodiment 33. The method of any of embodiments 30-32, wherein the fusion molecule comprises dCas9 fused to KRAB at the C-terminal end and DNMT3A and DNMT3L at the N-terminal end.
Embodiment 34. The method of embodiment 33, wherein the fusion molecule comprises the amino acid sequence of SEQ ID NO: 97.
Embodiment 35. The method of any one of embodiments 1-34, wherein the fusion molecule further comprises at least one nuclear localization sequence.
Embodiment 36. The method of embodiment 35, wherein the at least one nuclear localization sequence is directly fused to the C-terminus, the N-terminus, or both, of the at least one DNA binding protein.
Embodiment 37. The method of embodiment 35, wherein the at least one nuclear localization sequence is indirectly fused to the C-terminus, the N-terminus, or both, of the at least one DNA binding protein via a linker.
Embodiment 38. The method of any of embodiments 1-37, wherein the nucleic acid sequence encoding the fusion molecule is deoxyribonucleic acid (DNA).
Embodiment 39. The method of any one of embodiments 1-37, wherein the nucleic acid sequence encoding the fusion molecule is a messenger ribonucleic acid (mRNA).
Embodiment 40. The method of any of embodiments 1-39, further comprising the step of introducing at least one single guide RNA (sgRNA) or DNA encoding the sgRNA that is complementary to a DNA sequence in the vicinity of the PCSK9 gene and/or within a PCSK9 regulatory element, thereby targeting the fusion molecule to the PCSK9 gene or PCSK9 regulatory element.
Embodiment 41. The method of embodiment 40, wherein the sgRNA comprises the sequence of any one of SEQ ID NOs: 27-95 or 98-108.
Embodiment 42. The method of any of embodiments 1-41, wherein the fusion molecule is formulated in a liposome or lipid nanoparticle.
Embodiment 43. The method of any of embodiments 40-41, wherein the fusion molecule and the sgRNA are formulated in a liposome or a lipid nanoparticle.
Embodiment 44. The method of embodiment 43, wherein the fusion molecule and the sgRNA are formulated in the same liposome or lipid nanoparticle.
Embodiment 45. The method of embodiment 43, wherein the fusion molecule and the sgRNA are formulated in different liposomes or lipid nanoparticles.
Embodiment 46. The method of any of embodiments 42-45, wherein the liposome or lipid nanoparticle comprises an ionizable lipid (20%-70%, molar ratio), a pegylated lipid (0%-30%, molar ratio), a supportive lipid (5%-50%, molar ratio), and cholesterol (10%-50%, molar ratio).
Embodiment 47. The method of embodiment 46, wherein the ionizable lipid is selected from the group consisting of a pH-responsive ionizable lipid, a thermally-responsive ionizable lipid, and a photo-responsive ionizable lipid.
Embodiment 48. The method of any one of embodiments 1-41, wherein the fusion molecule is formulated in an AAV vector.
Embodiment 49. The method of any one of embodiments 40-41, wherein the fusion molecule and the sgRNA are formulated in an AAV vector.
Embodiment 50. The method of embodiment 49, wherein the fusion molecule and the sgRNA are formulated in the same AAV vector.
Embodiment 51. The method of embodiment 49, wherein the fusion molecule and the sgRNA are formulated in different AAV vectors.
Embodiment 52. The method of any of embodiments 1-51, wherein the fusion molecule is delivered to the cell by local injection, systemic infusion, or a combination thereof.
Embodiment 53. The method of any one of embodiments 2-4 and 8-52, wherein the subject is a human.
Embodiment 54. The method of any one of embodiments 4 and 8-53, wherein the PCSK9-related disease is atherosclerotic cardiovascular disease.
Embodiment 55. The method of any one of embodiments 4 and 8-53, wherein the PCSK9-related disease is hypercholesterolemia.
Embodiment 56. The method of any of embodiments 1-55, wherein the cell is a hepatocyte.
Embodiment 57. An sgRNA comprising the sequence of any one of SEQ ID NOs: 27-95 or 98-108.
Embodiment 58. A DNA sequence encoding the sgRNA according to embodiment 57.
Embodiment 59. A pharmaceutical composition comprising: a fusion molecule comprising at least one DNA binding protein and at least one gene expression modulator, or a nucleic acid sequence encoding said fusion molecule,
Wherein the fusion molecule targets a genomic region near the PCSK9 gene and/or within a PCSK9 regulatory element,
Wherein the at least one gene expression modulator provides modification of at least one nucleotide in the vicinity of the PCSK9 gene and/or within a PCSK9 regulatory element,
Wherein the at least one gene expression modulator comprises a DNA methyltransferase (DNMT), a DNA demethylase, a histone methyltransferase, a histone demethylase or portion thereof, or a zinc finger protein-based transcription factor or portion thereof, or a combination thereof, and
Wherein the at least one DNA binding protein is Cas9, dCas9, Cpf1, zinc finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN), homing endonuclease, dCas9-FokI nuclease, or MegaTAL nuclease.
Embodiment 60. The pharmaceutical composition of embodiment 59, wherein the PCSK9 regulatory element is a transcription start site, a core promoter, a proximal promoter, a distal enhancer, a silencer, an insulator element, a boundary element, or a locus control region.
Embodiment 61. The pharmaceutical composition of any one of embodiments 59-60, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within about 100bp, about 200bp, about 300bp, about 400bp, about 500bp, about 600bp, about 700bp, about 800bp, about 900bp, about 1000bp, about 1100bp, about 1200bp, about 1300bp, about 1400bp, or about 1500bp upstream of the transcription start site of the PCSK9 gene.
Embodiment 62. The pharmaceutical composition of embodiment 61, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within 1000bp upstream of the transcription start site of the PCSK9 gene.
Embodiment 63. The pharmaceutical composition of embodiment 61, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within 300bp upstream of the transcription start site of the PCSK9 gene.
Embodiment 64. The pharmaceutical composition of any one of embodiments 59-63, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within about 100bp, about 200bp, about 300bp, about 400bp, about 500bp, about 600bp, about 700bp, about 800bp, about 900bp, about 1000bp, about 1100bp, about 1200bp, about 1300bp, about 1400bp, or about 1500bp downstream of the transcription start site of the PCSK9 gene.
Embodiment 65. The pharmaceutical composition of embodiment 64, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is located within about 300bp downstream of the transcription start site of the PCSK9 gene.
Embodiment 66. The pharmaceutical composition of any one of embodiments 59-65, wherein the modification of at least one nucleotide near the PCSK9 gene and/or within a PCSK9 regulatory element is within 1000bp upstream and 300bp downstream of the transcription start site of the PCSK9 gene.
Embodiment 67. The pharmaceutical composition of any of embodiments 59-66, wherein the modification of at least one nucleotide is DNA methylation.
Embodiment 68. The pharmaceutical composition of any one of embodiments 59-67, wherein the at least one gene expression modulator comprises a DNA methyltransferase (DNMT) or a portion thereof.
Embodiment 69. The pharmaceutical composition of embodiment 68, wherein the DNA methyltransferase is DNMT3A, DNMT3B, DNMT3L, DNMT1, or DNMT2.
Embodiment 70. The pharmaceutical composition of embodiment 69, wherein the DNMT3A comprises the amino acid sequence of SEQ ID NO: 23.
Embodiment 71. The pharmaceutical composition of embodiment 69, wherein the DNMT3L comprises the amino acid sequence of SEQ ID NO: 24.
Embodiment 72. The pharmaceutical composition of any one of embodiments 59-71, wherein the at least one gene expression modulator comprises a zinc finger protein-based transcription factor or portion thereof.
Embodiment 73. The pharmaceutical composition of embodiment 72, wherein the zinc finger protein-based transcription factor is a Krüppel-associated box (KRAB).
Embodiment 74. The pharmaceutical composition of embodiment 73, wherein the KRAB comprises SEQ ID NO: 22.
Embodiment 75. The pharmaceutical composition of any one of embodiments 59-74, wherein the at least one gene expression modulator comprises a DNA methyltransferase or a portion thereof and a zinc finger protein-based transcription factor or a portion thereof.
Embodiment 76. The pharmaceutical composition of embodiment 75, wherein the DNA methyltransferase is selected from the group consisting of DNMT3A and DNMT3L, and a combination thereof, and the zinc finger protein-based transcription factor is KRAB.
Embodiment 77. The pharmaceutical composition of any one of embodiments 59-76, wherein said at least one DNA binding protein is Cas9, dCas9, Cpf1, zinc finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN), homing endonuclease, dCas9-FokI nuclease, or MegaTAL nuclease.
Embodiment 78. The pharmaceutical composition of embodiment 77, wherein the at least one DNA binding protein is dCas9.
Embodiment 79. The pharmaceutical composition according to embodiment 78, wherein the dCas9 comprises Staphylococcus aureus dCas9, Streptococcus pyogenes dCas9, Campylobacter jejuni dCas9, Corynebacterium diphtheriae dCas9, Eubacterium ventriosum dCas9, Streptococcus pasteurianus dCas9, Lactobacillus farciminis dCas9, Sphaerochaeta globus dCas9, Azospirillum (e.g., strain B510) dCas9, Gluconacetobacter diazotrophicus dCas9, Neisseria cinerea dCas9, Roseburia intestinalis dCas9, Parvibaculum lavamentivorans dCas9, Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9, Campylobacter lari (e.g., strain CF89-12) dCas9, or Streptococcus thermophilus (e.g., strain LMD-9) dCas9.
Embodiment 80. The pharmaceutical composition of embodiment 78, wherein the dCas9 comprises the amino acid sequence of SEQ ID NO: 1.
Embodiment 81. The pharmaceutical composition of any one of embodiments 59-80, wherein the fusion molecule comprises the at least one gene expression modulator fused to the C-terminus, the N-terminus, or both, of the at least one DNA binding protein.
Embodiment 82. The pharmaceutical composition of embodiment 81, wherein the at least one gene expression modulator is directly fused to the at least one DNA binding protein.
Embodiment 83. The pharmaceutical composition of embodiment 81, wherein the at least one gene expression modulator is indirectly fused to the at least one DNA binding protein via a non-modulator, a second modulator, or a linker.
Embodiment 84. The pharmaceutical composition of any one of embodiments 81-83, wherein the fusion molecule comprises dCas9 with KRAB fused at the C-terminal end and DNMT3A and DNMT3L fused at the N-terminal end.
Embodiment 85. The pharmaceutical composition of embodiment 84, wherein the fusion molecule comprises the amino acid sequence of SEQ ID NO: 97.
Embodiment 86. The pharmaceutical composition of any one of embodiments 59-85, wherein the fusion molecule further comprises at least one nuclear localization sequence.
Embodiment 87. The pharmaceutical composition of embodiment 86, wherein the at least one nuclear localization sequence is directly fused to the C-terminus, the N-terminus, or both, of the at least one DNA binding protein.
Embodiment 88. The pharmaceutical composition of embodiment 86, wherein said at least one nuclear localization sequence is indirectly fused to the C-terminus, the N-terminus, or both, of said at least one DNA binding protein via a linker.
Embodiment 89. The pharmaceutical composition of any of embodiments 59-88, wherein the nucleic acid sequence encoding the fusion molecule is deoxyribonucleic acid (DNA).
Embodiment 90. The pharmaceutical composition of any one of embodiments 59-88, wherein the nucleic acid sequence encoding the fusion molecule is a messenger ribonucleic acid (mRNA).
Embodiment 91. The pharmaceutical composition of any of embodiments 59-90, further comprising at least one single guide RNA (sgRNA) complementary to a DNA sequence in the vicinity of the PCSK9 gene and/or within a PCSK9 regulatory element.
Embodiment 92. The pharmaceutical composition of embodiment 91, wherein the sgRNA comprises the sequence of any one of SEQ ID NOs: 27-95 or 98-108.
Embodiment 93. The pharmaceutical composition of any of embodiments 59-92, wherein the fusion molecule is packaged in a liposome or a lipid nanoparticle.
Embodiment 94. The pharmaceutical composition of any one of embodiments 91-92, wherein said fusion molecule and said sgRNA are packaged in a liposome or a lipid nanoparticle.
Embodiment 95. The pharmaceutical composition of embodiment 94, wherein the fusion molecule and the sgRNA are packaged in the same liposome or lipid nanoparticle.
Embodiment 96. The pharmaceutical composition of embodiment 94, wherein the fusion molecule and the sgRNA are packaged in different liposomes or lipid nanoparticles.
Embodiment 97. The pharmaceutical composition of any of embodiments 93-96, wherein the liposome or lipid nanoparticle comprises an ionizable lipid (20%-70%, molar ratio), a pegylated lipid (0%-30%, molar ratio), a supportive lipid (5%-50%, molar ratio), and cholesterol (10%-50%, molar ratio).
Embodiment 98. The pharmaceutical composition of any of embodiments 93-97, wherein the ionizable lipid is selected from the group consisting of a pH-responsive ionizable lipid, a thermally-responsive ionizable lipid, and a photo-responsive ionizable lipid.
Embodiment 99. The pharmaceutical composition of any one of embodiments 59-92, wherein the fusion molecule is packaged in an AAV vector.
Embodiment 100. The pharmaceutical composition of any one of embodiments 91-92, wherein the fusion molecule and the sgRNA are packaged in an AAV vector.
Embodiment 101. The pharmaceutical composition of embodiment 100, wherein the fusion molecule and the sgRNA are packaged in the same AAV vector.
Embodiment 102. The pharmaceutical composition of embodiment 100, wherein the fusion molecule and the sgRNA are packaged in different AAV vectors.