TW202521691A - Engineered Type V RNA-Programmable Endonucleases and Their Uses

Info

Publication number: TW202521691A
Application number: TW113137996A
Authority: TW
Inventors: 提姆西拉奧; 喬史考特蒙瑟; 穆罕默德南努拉尼
Original assignee: 美商藍岩醫療公司
Priority date: 2023-10-06
Filing date: 2024-10-04
Publication date: 2025-06-01
Also published as: WO2025076291A1

Abstract

The present disclosure provides engineered Type V endonucleases suitable for editing eukaryotic genomic DNA, methods of producing the nucleases, systems comprising the nucleases, and methods of using the nucleases and systems to edit eukaryotic genomic DNA.

Description

Engineered Type V RNA-Programmable Endonucleases and Their Uses

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) genes (collectively referred to as CRISPR-Cas or CRISPR/Cas systems) are currently understood to provide bacteria and archaea with immunity against phage infection. The CRISPR-Cas systems of prokaryotic adaptive immunity are an extremely diverse group of protein effectors, non-coding elements, and locus architectures, some examples of which have been engineered and adapted to produce important biotechnologies.

The components of the system involved in host defense include one or more effector proteins capable of modifying DNA or RNA and an RNA guide element responsible for targeting the activity of these proteins to a specific sequence on the phage DNA or RNA. The RNA guide consists of a CRISPR RNA (crRNA) and may require an additional trans-acting RNA (tracrRNA) to enable targeted nucleic acid manipulation by the effector protein. The crRNA consists of a segment called the "direct repeat", which is responsible for binding the crRNA to the effector protein, and a segment called the "spacer sequence", which is complementary to the desired nucleic acid target sequence. A CRISPR system can be reprogrammed to target alternative DNA or RNA targets by modifying the spacer sequence of the crRNA.
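The reprogramming step described above can be sketched in a few lines of Python. This is a minimal illustration and is not taken from the disclosure: the class name `CrRNA`, the helper functions, the placeholder direct repeat, and the repeat-then-spacer ordering are all assumptions made for the example.

```python
from dataclasses import dataclass, replace

# DNA base -> complementary RNA base
DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def spacer_for_target(target_strand_dna: str) -> str:
    """Return the RNA spacer (5'->3') complementary to a DNA target strand (5'->3')."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(target_strand_dna.upper()))


@dataclass(frozen=True)
class CrRNA:
    direct_repeat: str  # binds the effector protein; unchanged when retargeting
    spacer: str         # complementary to the target; the programmable part

    def sequence(self) -> str:
        # Assumption: the direct repeat precedes the spacer; the order varies between systems.
        return self.direct_repeat + self.spacer


def retarget(crrna: CrRNA, new_target_strand_dna: str) -> CrRNA:
    """Reprogram a crRNA by swapping only its spacer sequence."""
    return replace(crrna, spacer=spacer_for_target(new_target_strand_dna))


# Hypothetical placeholder repeat, not a sequence from the disclosure.
original = CrRNA(direct_repeat="GUUUCAGAGC", spacer="AAAA")
retargeted = retarget(original, "ATGC")
print(retargeted.sequence())  # GUUUCAGAGCGCAU
```

Only the spacer changes between `original` and `retargeted`; the direct repeat, and therefore the protein-binding portion, is left untouched.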

CRISPR-Cas systems can be broadly divided into two classes: Class 1 systems consist of multiple effector proteins that together form a complex around the crRNA, and Class 2 systems consist of a single effector protein complexed with a crRNA guide for targeting DNA or RNA substrates. The single-subunit effector composition of Class 2 systems provides a simpler set of components for engineering and application, and these systems have so far been an important source of programmable effectors. The discovery, engineering, and optimization of novel Class 2 systems can therefore lead to broad and powerful programmable technologies for genome engineering and other fields.

In recent years, the RNA-guided DNA-targeting principle of CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated proteins) has been widely developed for genome editing. Five types of CRISPR-Cas systems have been described (type I, type II and IIb, type III, type V, and type VI). Most uses of CRISPR-Cas for genome editing employ the type II system. The main advantage offered by the bacterial type II CRISPR-Cas system is the minimal requirement for programmable DNA interference: an endonuclease, Cas9, guided by a customizable dual-RNA structure. As initially demonstrated in the original type II system of Streptococcus pyogenes, trans-activating CRISPR RNA (tracrRNA) binds to the invariant repeats of precursor CRISPR RNA (pre-crRNA), forming a dual-RNA that is essential both for RNase III-mediated co-maturation of the crRNA in the presence of Cas9 and for Cas9-mediated cleavage of invading DNA. As demonstrated in Streptococcus pyogenes, Cas9 guided by the duplex formed between the mature activating tracrRNA and the targeting crRNA introduces site-specific double-stranded DNA (dsDNA) breaks in the invading cognate DNA. Cas9 is a multi-domain enzyme that uses an HNH nuclease domain to cleave the target strand (defined as the strand complementary to the spacer sequence of the crRNA) and a RuvC-like domain to cleave the non-target strand.

In addition to the type II CRISPR Cas9 nuclease, a variety of different Type V endonucleases have been described, such as Cas12a, Cas12b, Cas12e, Cas12f, Cas13a, and Cas13b (Koonin et al., Curr Opin Microbiol. 2017 Jun; 37: 67–78, and Makarova et al., Nat Rev Microbiol. 2020 Feb; 18(2): 67–83). Some of these systems do not require a tracrRNA (Cas12a, Cas13a, Cas13b), whereas Cas12b nucleases generally do require a tracrRNA (Koonin et al., Curr Opin Microbiol. 2017 Jun; 37: 67–78).

WO2022258753A1 discloses novel Type V endonuclease polypeptides, designated B-GEn.1, B-GEn.1.2, and B-GEn.2, which have particularly advantageous characteristics for eukaryotic genome editing compared with other endonucleases. The present invention provides engineered Type V endonuclease polypeptides with improved gene editing efficiency.

The present invention relates to engineered Type V endonucleases with improved gene editing efficiency. Without being bound by theory, it is believed that the engineered Type V endonucleases have improved gene editing efficiency because of better target interaction through their oligonucleotide binding domain (OBD) (e.g., OBD-II).

The present invention provides engineered Type V endonucleases comprising an amino acid other than aspartic acid at the position corresponding to D504 of the Type V endonuclease of SEQ ID NO: 1 (B-GEn.1) or to D501 of the Type V endonuclease of SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2). In some embodiments, the amino acid at the position corresponding to D504 of the Type V endonuclease of SEQ ID NO: 1 (B-GEn.1) or to D501 of the Type V endonuclease of SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2) is arginine.
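As a rough illustration of how a point substitution written in the usual shorthand (wild-type residue, 1-based position, new residue, e.g. D504R) might be applied to a protein sequence, consider the Python sketch below. The function name `apply_substitution` and the five-residue toy sequence are hypothetical; the actual sequences are those of the Sequence Listing and are not reproduced here.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a substitution such as 'D504R' to a protein sequence (1-based positions)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    # Guard against numbering mistakes: the stated wild-type residue must be present.
    if sequence[index] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}, found {sequence[index]}")
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy sequence only; 'D3R' stands in for a substitution such as D504R.
print(apply_substitution("MKDLE", "D3R"))  # MKRLE
```

Checking the wild-type residue before substituting is a deliberate choice here: it catches the off-by-a-few numbering differences that arise when the same position is numbered differently in related sequences.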

The present invention also provides engineered Type V endonucleases containing an OBD (e.g., OBD-II) comprising the target interaction sequence motif GX₁X₂X₃X₄NX₅X₆X₇DX₈ (SEQ ID NO: 204), wherein each of X₁ to X₈ is any amino acid. In exemplary embodiments, the target interaction sequence motif is any one of SEQ ID NO: 201, SEQ ID NO: 202, and SEQ ID NO: 3.
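A motif of this form can be searched for with a regular expression. The sketch below is an illustration only: it assumes that "any amino acid" means one of the 20 standard residues, and the function name `find_motif` and the test sequence are hypothetical rather than taken from the Sequence Listing.

```python
import re

STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# G X1 X2 X3 X4 N X5 X6 X7 D X8, wrapped in a lookahead so overlapping hits are reported.
MOTIF = re.compile(
    rf"(?=(G[{STANDARD_AMINO_ACIDS}]{{4}}N[{STANDARD_AMINO_ACIDS}]{{3}}D[{STANDARD_AMINO_ACIDS}]))"
)


def find_motif(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched 11-residue segment) for every motif hit."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(protein.upper())]


# Toy sequence, not a B-GEn sequence.
print(find_motif("MAGAAAANAAADKL"))  # [(3, 'GAAAANAAADK')]
```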

Exemplary engineered Type V endonucleases include B-GEn polypeptides. Engineered Type V endonucleases and B-GEn polypeptides include those described in Section 6.2 (optionally comprising a nuclear localization signal as described in Section 6.3 and/or a linker sequence as described in Section 6.4) and in numbered embodiments 1 to 52.

The present invention further provides engineered Type V endonuclease systems, for example engineered B-GEn Type V endonuclease systems, comprising an engineered B-GEn polypeptide and a suitable guide RNA and/or nucleic acids encoding them. Exemplary engineered Type V endonuclease systems are disclosed in Section 6.5, and exemplary guide RNAs are disclosed in Section 6.6. In some embodiments, the engineered Type V endonuclease system is a ribonucleoprotein (RNP) complex comprising an engineered Type V endonuclease or B-GEn polypeptide and a guide RNA. Ribonucleoprotein complexes are described in Section 6.7 and in numbered embodiments 72 to 77.

The present invention further provides nucleic acids encoding the engineered Type V endonucleases and B-GEn polypeptides, for example expression vectors for the engineered Type V endonucleases and B-GEn polypeptides, and recombinant cells engineered to express the engineered Type V endonucleases and B-GEn polypeptides. Exemplary nucleic acids are disclosed in Section 6.8 and in numbered embodiments 53 to 62; exemplary recombinant cells and their use for producing engineered Type V endonucleases and B-GEn polypeptides are set forth in Section 6.10 and in numbered embodiments 63 to 69; and exemplary vectors are disclosed in Section 6.9.

In certain aspects, provided herein is a method of targeting, editing, modifying, or manipulating a target DNA at one or more locations in a cell or in vitro. The methods generally entail introducing an engineered Type V endonuclease or B-GEn polypeptide system into the cell or the in vitro environment under conditions suitable for the engineered Type V endonuclease or B-GEn polypeptide to make one or more nicks or cleavages, or to perform base editing, in the target DNA, wherein the engineered Type V endonuclease or B-GEn polypeptide is directed to the target DNA by a guide RNA in processed or unprocessed form. As used herein, the term "Type V endonuclease or B-GEn polypeptide system" is intended to refer to any combination of nucleic acid and polypeptide components that can be delivered to a cell such that an RNP comprising the Type V endonuclease or B-GEn polypeptide is operably constituted in the cell and editing can take place. Accordingly, a Type V endonuclease or B-GEn polypeptide system can include any combination of: (a) (i) a Type V endonuclease or B-GEn polypeptide and/or (ii) one or more nucleic acids comprising a nucleotide sequence encoding a Type V endonuclease or B-GEn polypeptide; and (b) (i) a guide RNA and/or (ii) a nucleic acid comprising a nucleotide sequence encoding a guide RNA.

In some embodiments, an RNP of the present invention (comprising an engineered Type V endonuclease or B-GEn polypeptide and a guide RNA) is used to edit the genome of a cell. In some embodiments, a method of editing genomic DNA using an RNP comprises nucleofecting a target cell comprising the genomic DNA with the RNP and exposing the target cell to conditions under which gene editing occurs, for example by culturing the target cell under conditions suitable for genome editing by the engineered Type V endonuclease or B-GEn polypeptide.

In some embodiments, one or more viruses (e.g., one or more adeno-associated viruses (AAVs)) are used to deliver an engineered Type V endonuclease or B-GEn polypeptide system (e.g., one or more viruses comprising one or more nucleic acids encoding an engineered Type V endonuclease or B-GEn polypeptide and one or more nucleic acids encoding a guide RNA) to a cell so that its genome can be edited. In some embodiments, a method of editing genomic DNA using one or more viruses comprises contacting a target cell comprising the genomic DNA with the one or more viruses and exposing the target cell to conditions under which gene editing occurs, for example by culturing the target cell under conditions suitable for expression of the engineered Type V endonuclease or B-GEn polypeptide and the guide RNA and for genome editing by the engineered Type V endonuclease or B-GEn polypeptide as directed by the guide RNA.

In some embodiments, lipid nanoparticles (LNPs) are used to deliver an engineered Type V endonuclease or B-GEn polypeptide system (e.g., LNPs comprising one or more nucleic acids encoding an engineered Type V endonuclease or B-GEn polypeptide and a guide RNA or a nucleic acid encoding a guide RNA) to a cell so that its genome can be edited. In some embodiments, a method of editing genomic DNA using one or more viruses comprises contacting a target cell comprising the genomic DNA with the lipid nanoparticles and exposing the target cell to conditions under which gene editing occurs, for example by culturing the target cell under conditions suitable for expression of the engineered Type V endonuclease or B-GEn polypeptide and, optionally, the guide RNA, and for genome editing by the engineered Type V endonuclease or B-GEn polypeptide as directed by the guide RNA.

Exemplary methods of editing the genome of a cell using engineered Type V endonucleases and B-GEn polypeptides are set forth in Section 6.13 and in numbered embodiments 78 to 97.

Editing of a cell's genome can be performed in vitro (e.g., in cell culture), ex vivo, or in vivo (e.g., for the purpose of gene therapy, by administering an RNP, AAV, or LNP to a subject).

Exemplary cells comprising engineered Type V endonucleases and B-GEn polypeptides and nucleic acids (e.g., cells contacted with an RNP, AAV, or LNP as disclosed herein) are set forth in Section 6.11 and in numbered embodiments 98 to 110.

Additional features, advantages, and applications of the engineered Type V endonucleases and B-GEn polypeptides of the present invention are described in more detail below.

1. Cross-Reference to Related Applications

This application claims priority to U.S. Provisional Application No. 63/588,636, filed on October 6, 2023, the contents of which are incorporated herein by reference in their entirety.

2. Sequence Listing

This application contains a Sequence Listing, which has been submitted electronically in XML format and is incorporated herein by reference in its entirety. The XML Sequence Listing, created on September 26, 2024, is named BRT-008TW_SL.xml and is 451,061 bytes in size.

6.1. Definitions

Unless otherwise defined herein, scientific and technical terms used in connection with the present invention shall have the meanings commonly understood by those of ordinary skill in the art. Exemplary methods and materials are described below, although methods and materials similar or equivalent to those described herein may also be used to practice or test the present invention. In the event of a conflict, the present specification, including definitions, shall prevail. In general, the nomenclature used in connection with, and the techniques of, cell and tissue culture, molecular biology, immunology, microbiology, genetics, analytical chemistry, synthetic organic chemistry, medicinal and pharmaceutical chemistry, and protein and nucleic acid chemistry and hybridization described herein are those well known and commonly used in the art. Enzymatic reactions and purification techniques are performed according to the manufacturer's instructions, as commonly accomplished in the art, or as described herein. In addition, unless the context otherwise requires, singular terms shall include the plural and plural terms shall include the singular. Throughout this specification and the embodiments, the words "have" and "comprise", or variations such as "has", "having", "comprises", or "comprising", should be understood to mean the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.
All publications and other references mentioned herein are incorporated by reference in their entirety. Although many documents are cited herein, this citation does not constitute an admission that any of these documents constitutes part of the common general knowledge in the art.

B-GEn polypeptide: As used herein, the term "B-GEn polypeptide" refers to a polypeptide comprising a nuclease domain having an amino acid sequence related to or derived from a Brevibacillus Type V endonuclease, such as B-GEn.1 (SEQ ID NO: 1), B-GEn.1.2 (SEQ ID NO: 2), or B-GEn.2 (SEQ ID NO: 3). B-GEn.1 (SEQ ID NO: 1) has a nuclease domain comprising RuvC I, RuvC II, and RuvC III subdomains corresponding to SEQ ID NOS: 8-10. B-GEn.1.2 (SEQ ID NO: 2) has a nuclease domain comprising RuvC I, RuvC II, and RuvC III subdomains corresponding to SEQ ID NOS: 11-13. B-GEn.2 (SEQ ID NO: 3) has a nuclease domain comprising RuvC I, RuvC II, and RuvC III subdomains corresponding to SEQ ID NOS: 11-13.
The term "B-GEn polypeptide" encompasses polypeptides comprising an amino acid sequence having at least 40% sequence identity to the RuvC I, RuvC II, and RuvC III domains (individually or collectively) of any one of B-GEn.1, B-GEn.1.2, and B-GEn.2. "B-GEn polypeptide" also encompasses variants of any one of B-GEn.1 (SEQ ID NO: 1), B-GEn.1.2 (SEQ ID NO: 2), and B-GEn.2 (SEQ ID NO: 3), such as (1) variants comprising an amino acid sequence having at least 50% sequence identity to SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and/or (2) variants comprising an amino acid sequence that differs by up to 25 amino acids from SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6. In some embodiments, the B-GEn polypeptide has nuclease activity. The term "B-GEn polypeptide" encompasses engineered fusion polypeptides comprising the amino acid sequence of B-GEn.1 (SEQ ID NO: 1), B-GEn.1.2 (SEQ ID NO: 2), B-GEn.2 (SEQ ID NO: 3), or any variant thereof as described in Section 6.2, for example an amino acid sequence that (1) has at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to the nuclease domain or the entire length of any one of SEQ ID NOS: 1 to 3 and/or (2) differs from any one of SEQ ID NOS: 1 to 3 by up to 25 amino acids, up to 20 amino acids, up to 15 amino acids, up to 14 amino acids, up to 13 amino acids, up to 12 amino acids, up to 11 amino acids, up to 10 amino acids, up to 9 amino acids, up to 8 amino acids, up to 7 amino acids, up to 6 amino acids, or up to 5 amino acids, together with additional sequences (e.g., one or more nuclear localization and/or linker sequences as described in Section 6.3 and/or Section 6.4).
The term "B-GEn polypeptide" encompasses variants in which the amino acid at the position corresponding to D504 of B-GEn.1 (SEQ ID NO: 1), D501 of B-GEn.1.2 (SEQ ID NO: 2), or D501 of B-GEn.2 (SEQ ID NO: 3) is not aspartic acid and is, for example, arginine.
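The two kinds of threshold used in this definition, percent sequence identity and a maximum number of differing amino acids, can both be computed from a pairwise alignment. The Python sketch below is one possible reading, not the method of the disclosure: it assumes the two sequences are already aligned with "-" as the gap character, it takes the number of alignment columns as the denominator for identity (other conventions exist), and the function names and toy sequences are hypothetical.

```python
def _check_alignment(aligned_a: str, aligned_b: str) -> None:
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    _check_alignment(aligned_a, aligned_b)
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Number of columns that differ (substitutions, insertions, and deletions)."""
    _check_alignment(aligned_a, aligned_b)
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)


# Toy sequences only.
print(percent_identity("MKDLE", "MKRLE"))   # 80.0
print(count_differences("MKDLE", "MKRLE"))  # 1
```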

Binding: As used herein, the term "binding" (e.g., with reference to an RNA-binding domain of a polypeptide) refers to a non-covalent interaction between macromolecules (e.g., between a protein and a nucleic acid). While in a state of non-covalent interaction, the macromolecules are said to be "associated" or "interacting" or "binding" (e.g., when a molecule X is said to interact with a molecule Y, it means that molecule X binds to molecule Y in a non-covalent manner). Not all components of a binding interaction need be sequence-specific (e.g., contacts with phosphate residues in a DNA backbone), but some portions of a binding interaction may be sequence-specific. Binding interactions are generally characterized by a dissociation constant (Kd) of less than 10^-6 M, less than 10^-7 M, less than 10^-8 M, less than 10^-9 M, less than 10^-10 M, less than 10^-11 M, less than 10^-12 M, less than 10^-13 M, less than 10^-14 M, or less than 10^-15 M. "Affinity" refers to the strength of binding; increased binding affinity correlates with a lower Kd.
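To make the link between Kd and affinity concrete, the sketch below uses the standard one-site equilibrium binding model, in which the fraction of a macromolecule that is bound equals [L] / (Kd + [L]) for free ligand concentration [L]. The model and the function name `fraction_bound` are illustrative assumptions; the disclosure does not prescribe a particular binding model.

```python
def fraction_bound(free_ligand_molar: float, kd_molar: float) -> float:
    """Fraction of binding sites occupied at equilibrium in a simple one-site model."""
    if free_ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_ligand_molar / (kd_molar + free_ligand_molar)


# At a ligand concentration equal to Kd, half of the sites are occupied.
print(fraction_bound(1e-9, 1e-9))  # 0.5

# At the same ligand concentration, a lower Kd (higher affinity) gives higher occupancy.
print(fraction_bound(1e-9, 1e-12) > fraction_bound(1e-9, 1e-6))  # True
```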

Cell therapy: As used herein, the term "cell therapy" refers to a therapy in which cellular material is administered to a patient. The cellular material may be intact, living cells. For example, T cells capable of fighting cancer cells via cell-mediated immunity may be administered in the course of immunotherapy. Cell therapy is also known as cellular therapy or cytotherapy.

Coding sequence: As used herein, the term "coding sequence" or "encoding nucleic acid" refers to a sequence within a nucleic acid (RNA or DNA) molecule that encodes a protein or an RNA molecule. The coding sequence may further include initiation and termination signals operably linked to regulatory elements, including a promoter and a polyadenylation signal, capable of directing expression in the cells of the individual or mammal to which the nucleic acid is introduced or administered. The coding sequence may be codon-optimized for expression in the cell of interest.

Complement: As used herein, the terms "complement" and "complementarity", in the context of nucleic acid molecules, refer to the ability of nucleotides or nucleotide analogs of nucleic acid molecules to form Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairs. "Complementarity" refers to a property shared between two nucleic acid sequences such that, when they are aligned antiparallel to each other, the nucleotide bases at each position are complementary.
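The antiparallel test in this definition can be written directly in code. The sketch below is a simplified illustration: it checks Watson-Crick pairs only (A-T/U and C-G), does not model Hoogsteen pairing or nucleotide analogs, and the function name `is_complementary` is hypothetical.

```python
# Watson-Crick pairs, allowing DNA (T) and RNA (U) on either strand.
WATSON_CRICK_PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("C", "G"), ("G", "C"),
}


def is_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two strands, both written 5'->3', pair at every position when aligned antiparallel."""
    if len(strand_a) != len(strand_b):
        return False
    # Reversing strand_b aligns its 3' end with the 5' end of strand_a.
    return all(
        (a, b) in WATSON_CRICK_PAIRS
        for a, b in zip(strand_a.upper(), reversed(strand_b.upper()))
    )


print(is_complementary("ATGC", "GCAT"))  # True
print(is_complementary("AUGC", "GCAT"))  # True  (RNA strand against DNA strand)
print(is_complementary("ATGC", "ATGC"))  # False
```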

Corresponding to/Corresponds to: The term "corresponding to" or "corresponds to", as used with respect to a position in a reference sequence (e.g., the amino acid sequence of the B-GEn polypeptide of any one of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3, or the RuvC I, RuvC II, and RuvC III subdomains of its nuclease domain as in any one of SEQ ID NO: 8 or SEQ ID NO: 11 (RuvC I), SEQ ID NO: 9 or SEQ ID NO: 12 (RuvC II), and SEQ ID NO: 10 or SEQ ID NO: 13 (RuvC III)), refers to the sequence position in a query sequence that appears at the same position in an alignment of the reference and query sequences, for example as shown in Figure 1. A sequence alignment algorithm such as Clustal Omega (ClustalW; available at www.ebi.ac.uk/Tools/msa/clustalo/) can be used to align the reference and query sequences, using the software's default parameters as of October 1, 2023.
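Once the reference and query sequences have been aligned, finding the corresponding position is a matter of walking the alignment columns and counting residues in each sequence. The Python sketch below assumes the aligned sequences are supplied as equal-length strings with "-" for gaps (for example, copied from the output of an alignment tool); the function name and the toy alignment are hypothetical and do not represent the alignment of Figure 1.

```python
from typing import Optional


def corresponding_position(aligned_ref: str, aligned_query: str, ref_position: int) -> Optional[int]:
    """Map a 1-based residue number in the reference to the query residue in the same column.

    Returns None when the reference residue is aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_count += 1
        if query_char != "-":
            query_count += 1
        if ref_char != "-" and ref_count == ref_position:
            return query_count if query_char != "-" else None
    raise ValueError("ref_position is outside the reference sequence")


# Toy alignment: the query lacks the first three reference residues,
# so reference position 5 (D) corresponds to query position 2 (D).
print(corresponding_position("MKTADLE", "---ADLE", 5))  # 2
print(corresponding_position("MKTADLE", "---ADLE", 2))  # None
```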

Electroporation: As used herein, the term "electroporation" refers to a transfection technique in which electrical pulses are used to create temporary pores in the cell membrane, allowing ribonucleoproteins or nucleic acid molecules, such as DNA or RNA (e.g., mRNA) molecules, to be introduced into the cell.

Encoding: The term "encoding", with respect to a nucleic acid (DNA or RNA), means that the nucleic acid comprises a nucleotide sequence that codes for the amino acids of a polypeptide or the nucleotides of an RNA.

Engineered B-GEn polypeptide: As used herein, the term "engineered B-GEn polypeptide" refers to a variant polypeptide comprising at least one mutation (e.g., an amino acid insertion, deletion, or substitution) relative to the amino acid sequence of B-GEn.1 (SEQ ID NO: 1), B-GEn.1.2 (SEQ ID NO: 2), and B-GEn.2 (SEQ ID NO: 3). The engineered B-GEn polypeptides of the present invention encompass Type V endonucleases comprising an amino acid other than aspartic acid at the position corresponding to D504 of B-GEn.1 (SEQ ID NO: 1), B-GEn.1.2 (SEQ ID NO: 2), and B-GEn.2 (SEQ ID NO: 3). In some embodiments, the amino acid at position 501 is arginine.

Expression cassette: As used herein, the term "expression cassette" refers to a DNA coding sequence operably linked to a promoter.

Guide RNA: As used herein, the term "guide RNA" refers to a ribonucleic acid having a DNA-targeting sequence (also referred to as a "spacer" or "DNA-targeting segment") and a protein-binding sequence (also referred to as a "protein-binding segment"). The DNA-targeting sequence has sufficient complementarity with a target DNA (e.g., genomic DNA) sequence to hybridize with the target DNA sequence and to direct sequence-specific binding of a nucleic acid-targeting complex to the target DNA sequence. The DNA-targeting sequence generally includes a "protospacer-like" sequence as described herein. The protein-binding sequence interacts with a site-specific modifying enzyme (e.g., a B-GEn polypeptide as described in Section 6.2 below). Site-specific cleavage of the target DNA occurs at a location determined by (i) base-pairing complementarity between the guide RNA and the target DNA and (ii) a short motif in the target DNA, referred to as the protospacer adjacent motif (PAM). The protein-binding segment of the guide RNA includes, in part, two complementary stretches of nucleotides that hybridize with each other to form a double-stranded RNA duplex (dsRNA duplex). In some embodiments, the guide RNA is a single guide RNA (sgRNA).
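The two determinants of the cleavage site, a PAM and an adjacent sequence that the spacer can pair with, suggest a simple way to enumerate candidate target sites. The sketch below is illustrative only. The PAM pattern "TTN", its placement immediately 5' of the protospacer, the 20-nucleotide length, and the function name `find_target_sites` are assumptions chosen for the example and are not stated in this section; the sketch also scans only the strand it is given, not the reverse complement.

```python
import re

# A small subset of IUPAC nucleotide codes, expressed as regex fragments.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "V": "[ACG]", "R": "[AG]", "Y": "[CT]"}


def find_target_sites(dna: str, pam: str = "TTN", spacer_length: int = 20) -> list[tuple[int, str, str]]:
    """Return (0-based PAM start, PAM, protospacer) for every PAM followed by a full-length protospacer.

    Assumption: the PAM lies immediately 5' of the protospacer on the scanned strand.
    """
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    # The lookahead reports overlapping candidate sites as well.
    pattern = re.compile(rf"(?=({pam_regex})([ACGT]{{{spacer_length}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]


# Toy example with a shortened protospacer so the result is easy to read.
print(find_target_sites("GGTTACATGCATGG", pam="TTN", spacer_length=4))
# [(2, 'TTA', 'CATG')]
```

Each returned protospacer is a candidate against which a spacer could be designed, subject to the complementarity requirement described above.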

A guide RNA and a site-specific modifying enzyme (such as a B-GEn polypeptide) can form a ribonucleoprotein complex (e.g., associate via non-covalent interactions). The guide RNA provides target specificity to the complex by comprising a nucleotide sequence that is complementary to a sequence of the target DNA. The site-specific modifying enzyme of the complex provides the endonuclease activity. In other words, the site-specific modifying enzyme is guided to a target DNA sequence (e.g., a target sequence in a chromosomal nucleic acid; a target sequence in an extrachromosomal nucleic acid, such as an episomal nucleic acid or a minicircle; a target sequence in a mitochondrial nucleic acid; a target sequence in a chloroplast nucleic acid; a target sequence in a plasmid; etc.) by virtue of its association with the protein-binding segment of the guide RNA.

Heterologous: As used herein, the term "heterologous" refers to a nucleotide or peptide that is not found in the native nucleic acid or polypeptide, respectively. A B-GEn.1, B-GEn.1.2, or B-GEn.2 fusion protein described herein may, in some embodiments, comprise a fusion of the RNA binding domain of a B-GEn.1, B-GEn.1.2, or B-GEn.2 polypeptide (or a variant thereof) with a heterologous polypeptide sequence (e.g., a polypeptide sequence from a protein other than B-GEn.1 or B-GEn.2). The heterologous polypeptide may exhibit an activity (e.g., an enzymatic activity) that will also be exhibited by the B-GEn.1, B-GEn.1.2, or B-GEn.2 fusion protein (e.g., methyltransferase activity, acetyltransferase activity, kinase activity, ubiquitination activity, etc.). A heterologous nucleic acid can be linked to a naturally occurring nucleic acid (or a variant thereof) (e.g., by genetic engineering) to generate a fusion nucleic acid encoding a fusion polypeptide. As another example, in a fusion variant B-GEn.1, B-GEn.1.2, or B-GEn.2 polypeptide, the variant polypeptide can be fused to a heterologous polypeptide (e.g., a polypeptide other than B-GEn.1 or B-GEn.2) that exhibits an activity that will also be exhibited by the fusion variant polypeptide. A heterologous nucleic acid can be linked to a nucleic acid encoding a variant B-GEn.1, B-GEn.1.2, or B-GEn.2 polypeptide (e.g., by genetic engineering) to generate a nucleic acid encoding a fusion variant B-GEn.1, B-GEn.1.2, or B-GEn.2 polypeptide. "Heterologous" as used herein additionally means a nucleotide or polypeptide in a cell that is not its native cell.

Host cell: As used herein, the terms "host cell" and "recombinant host cell" refer to a cell that has been genetically engineered, for example, through the introduction of a heterologous polypeptide or nucleic acid (such as a vector or system of the present invention). It should be understood that such terms are intended to refer not only to the particular subject cell but also to the progeny of such a cell. In some embodiments, the host cell carries a vector of the present invention as an extrachromosomal heterologous expression vector. In some embodiments, the host cell comprises any of the engineered B-GEn polypeptides disclosed herein, for example, as introduced as an RNP complex. In other embodiments, the host cell has undergone gene editing by an engineered B-GEn polypeptide of the present invention.

iPSC: As used herein, the terms "induced pluripotent stem cell" and "iPSC" refer to a type of pluripotent stem cell artificially prepared from a non-pluripotent cell, such as an adult cell, a partially differentiated cell, or a terminally differentiated cell (e.g., a fibroblast, a cell of a hematopoietic lineage, a myocyte, a neuron, an epidermal cell, or the like), by introducing into the cell, or contacting the cell with, one or more reprogramming factors. iPSCs can be derived from a variety of different cell types, including terminally differentiated cells. iPSCs have an embryonic stem (ES) cell-like morphology, growing as flat colonies with large nucleo-cytoplasmic ratios, defined borders, and prominent nuclei. In addition, iPSCs express one or more key pluripotency markers known to those of ordinary skill in the art, including but not limited to alkaline phosphatase, SSEA3, SSEA4, Sox2, Oct3/4, Nanog, TRA160, TRA181, TDGF 1, Dnmt3b, FoxD3, GDF3, Cyp26a1, TERT, and zfp42.

Examples of methods for generating and characterizing iPSCs can be found, for example, in U.S. Patent Publication Nos. US20090047263, US20090068742, US20090191159, US20090227032, US20090246875, and US20090304646 and PCT Patent Publications WO2013177133 and WO2022204567, the disclosures of each of which are incorporated herein by reference. Generally, to generate iPSCs, somatic cells are provided with reprogramming factors known in the art (e.g., Oct4, SOX2, KLF4, MYC, Nanog, Lin28, etc.) to reprogram the somatic cells to become pluripotent stem cells.

Nucleic acid: As used herein, the terms "nucleic acid" and "oligonucleotide" refer to at least two nucleotides covalently linked together. A nucleic acid may be single-stranded or double-stranded, or may contain portions of both double-stranded and single-stranded sequence. The nucleic acid may be DNA (both genomic and cDNA), RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a reference herein to a single-stranded nucleic acid also encompasses the complementary strand of the depicted single strand.

Nuclear localization signal: As used herein, the terms "nuclear localization signal" and "NLS" refer to an amino acid sequence that promotes the localization of a polypeptide to the nucleus of a eukaryotic cell.

Nuclease: As used herein, the terms "nuclease" and "endonuclease" are used interchangeably to mean an enzyme which possesses endonucleolytic catalytic activity for nucleic acid cleavage, as well as nuclease-inactivated variants thereof.

Nuclease domain: As used herein, the terms "nuclease domain", "cleavage domain" and "active domain" of a nuclease refer to the amino acid sequence or domain within the nuclease which possesses the catalytic activity for DNA cleavage. A cleavage domain can be contained in a single polypeptide chain, or cleavage activity can result from the association of two (or more) polypeptides. A single nuclease domain may consist of more than one isolated stretch of amino acids within a given polypeptide. In some embodiments, the boundaries of the nuclease domain of a B-GEn polypeptide are determined by aligning the B-GEn polypeptide with BthCas12b (Wu et al., 2017, Cell Research 27:705-708) and identifying the amino acids that align with the BthCas12b RuvC nuclease domain, which comprises the RuvC I, RuvC II, and RuvC III subdomains. The RuvC I domain of B-GEn.1 is set forth in SEQ ID NO: 8, and the RuvC I domain of B-GEn.1.2 and B-GEn.2 is set forth in SEQ ID NO: 11. The RuvC II domain of B-GEn.1 is set forth in SEQ ID NO: 9, and the RuvC II domain of B-GEn.1.2 and B-GEn.2 is set forth in SEQ ID NO: 12. The RuvC III domain of B-GEn.1 is set forth in SEQ ID NO: 10, and the RuvC III domain of B-GEn.1.2 and B-GEn.2 is set forth in SEQ ID NO: 13.

Nucleofection: As used herein, the term "nucleofection" refers to an electroporation-based transfection method that uses a combination of electrical parameters and cell-type specific reagents to transfer nucleic acids (such as DNA or RNA) and RNPs directly into the nucleus of a target cell.

Operably linked: As used herein, the term "operably linked" refers to a functional relationship between two or more peptide or polypeptide domains or nucleic acid (e.g., DNA) segments. In the context of transcriptional regulation, the term refers to the functional relationship of a transcriptional regulatory sequence to a transcribed sequence. For example, a promoter or enhancer sequence is operably linked to a coding sequence if it stimulates or modulates the transcription of the coding sequence in an appropriate host cell or other expression system.

Percent sequence identity (%): As used herein, the terms "percent sequence identity", "% sequence identity" and the like with respect to two amino acid sequences refer to the percent sequence identity as determined using the BLASTP algorithm (Tatusova and Madden, 1999, FEMS Microbiol. Lett. 174:247-250), which is available from the National Center for Biotechnology Information (NCBI) website (www.ncbi.nlm.nih.gov), using the following settings: matrix = BLOSUM62; gap open penalty = 11; gap extension penalty = 1; gap x_dropoff = 50; expect value = 10; word size = 3; otherwise default values. The BLAST algorithm operates in two steps: it first aligns a reference sequence (e.g., a B-GEn polypeptide of any one of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3, or a RuvC I, RuvC II, or RuvC III subdomain of its nuclease domain as set forth in any one of SEQ ID NO: 8 or SEQ ID NO: 11 (RuvC I), SEQ ID NO: 9 or SEQ ID NO: 12 (RuvC II), and SEQ ID NO: 10 or SEQ ID NO: 13 (RuvC III)) with the query sequence, and then determines the % sequence identity over the range of overlap between the two aligned sequences. In addition to % sequence identity, BLASTP also determines the % sequence similarity based on these settings. To characterize the identity, the subject sequences are aligned so that the highest order homology (match) is obtained.
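The second step of the definition above, computing % identity over the aligned overlap, can be illustrated with a minimal Python sketch. It is not BLASTP and does not perform an alignment; it assumes the two sequences have already been aligned (for example by BLASTP with the stated settings) and are supplied as equal-length strings in which `-` marks a gap. The function name `percent_identity` and the example strings are hypothetical and are not part of the disclosure.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Assumption: both inputs come from an existing pairwise alignment,
    have equal length, and use '-' for gap positions.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences (they carry no information).
    columns = [(a, b) for a, b in zip(aligned_ref, aligned_query)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column counts as identical only when both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments: 4 identical columns out of 6 aligned columns.
print(round(percent_identity("MKV-LA", "MKVQLS"), 1))
```

Gap columns are kept in the denominator here, which mirrors how BLAST reports identities as matches divided by alignment length; a different convention would give a different percentage.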

Polypeptide, peptide and protein: As used herein, the terms "polypeptide", "peptide" and "protein" refer to polymers of amino acids of any length. The polymer may, in various embodiments, be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids.

Pluripotent: As used herein, the term "pluripotent" or "pluripotency" refers to the ability of a cell to self-renew and to differentiate into cells of any of the three germ layers: endoderm, mesoderm, or ectoderm. "Pluripotent stem cells" or "PSCs" include, for example, embryonic stem cells derived from the inner cell mass of a blastocyst or by somatic cell nuclear transfer, and iPSCs derived from non-pluripotent cells.

Promoter: As used herein, the term "promoter" refers to a nucleotide sequence that is recognized by the synthetic machinery of a cell, or by introduced synthetic machinery, required to initiate the specific transcription of a nucleic acid sequence. A promoter can be a constitutively active promoter (e.g., a promoter that is constitutively in an active "on" state); it can be an inducible promoter (e.g., a promoter whose state, active/"on" or inactive/"off", is controlled by an external stimulus, such as the presence of a particular temperature, compound, or protein); it can be a spatially restricted promoter (e.g., a transcriptional control element, an enhancer, etc.) (e.g., a tissue-specific promoter, a cell type-specific promoter, etc.); and it can be a temporally restricted promoter (e.g., the promoter is in the "on" state or the "off" state during specific stages of embryonic development or during specific stages of a biological process (e.g., the hair follicle cycle in mice)).

Protospacer adjacent motif: As used herein, the term "protospacer adjacent motif" or "PAM" refers to a DNA sequence downstream (e.g., immediately downstream) of a target sequence on the non-target strand that is recognized by a Cas protein. The PAM sequence is located 3' of the target sequence on the non-target strand.

Recombinant: As used herein, the term "recombinant" with respect to a nucleic acid, polypeptide, or cell refers to a nucleic acid (DNA or RNA), polypeptide, or cell that is the product (directly or indirectly) of genetic engineering (e.g., the progeny or a copy of a nucleic acid, polypeptide, or cell produced by genetic engineering methods). For example, a recombinant vector can be the product of various combinations of cloning, restriction, polymerase chain reaction (PCR), and/or ligation steps, resulting in a construct having a structural coding or non-coding sequence distinguishable from endogenous nucleic acids found in natural systems. A DNA sequence encoding a polypeptide can be assembled from cDNA fragments or from a series of synthetic oligonucleotides to provide a synthetic nucleic acid which is capable of being expressed from a recombinant transcriptional unit contained in a cell or in a cell-free transcription and translation system. Genomic DNA comprising the relevant sequences can also be used in the formation of a recombinant gene or transcriptional unit. Sequences of non-translated DNA may be present 5' or 3' of the open reading frame, where such sequences do not interfere with manipulation or expression of the coding regions, and may indeed act to modulate production of a desired product by various mechanisms (see "DNA regulatory sequences" below). Additionally or alternatively, DNA sequences encoding RNA that is not translated (e.g., guide RNA) may also be considered recombinant. Thus, for example, the term "recombinant" nucleic acid refers to one which is not naturally occurring, for example, one that is made by the artificial combination of two otherwise separated segments of sequence through human intervention. This artificial combination is often accomplished by either chemical synthesis means or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. Such is generally done to replace a codon with a codon encoding the same amino acid, a conservative amino acid, or a non-conservative amino acid. Additionally or alternatively, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions. When a recombinant nucleic acid encodes a polypeptide, the sequence of the encoded polypeptide can be naturally occurring ("wild type") or can be a variant (e.g., a mutant) of the naturally occurring sequence. Thus, the term "recombinant" polypeptide does not necessarily refer to a polypeptide whose sequence does not naturally occur. Instead, a "recombinant" polypeptide is encoded by a recombinant DNA sequence, but the sequence of the polypeptide can be naturally occurring ("wild type") or non-naturally occurring (e.g., a variant, a mutant, etc.). Thus, a "recombinant" polypeptide is the result of human intervention but may have a naturally occurring amino acid sequence. The term "non-naturally occurring" includes molecules that are markedly different from their naturally occurring counterparts, including chemically modified or mutated molecules.

Regulatory sequence: As used herein, the term "regulatory sequence" refers to a nucleic acid sequence that is required for the expression of an operably linked sequence of interest (e.g., a guide RNA or an engineered B-GEn polypeptide sequence). In some cases, the regulatory sequence may be a promoter sequence, and in other cases, the regulatory sequence may include promoter and enhancer sequences and/or other regulatory elements required for the expression of pol. The regulatory sequence may, for example, be a sequence that drives the expression of the operably linked sequence constitutively or in a tissue-specific manner.

Ribonucleoprotein (RNP) complex, ribonucleoprotein (RNP) particle: As used herein, the terms "ribonucleoprotein complex" and "ribonucleoprotein particle" refer to a complex or particle comprising a nucleoprotein and a ribonucleic acid. A "nucleoprotein" as provided herein refers to a protein capable of binding a nucleic acid (e.g., RNA, DNA). Where the nucleoprotein binds a ribonucleic acid, it is referred to as a "ribonucleoprotein". The interaction between the ribonucleoprotein and the ribonucleic acid can be direct, e.g., by covalent bond, or indirect, e.g., by non-covalent bond (e.g., electrostatic interactions (e.g., ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g., dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions, and the like). In embodiments, the ribonucleoprotein includes an RNA-binding motif non-covalently bound to the ribonucleic acid. For example, positively charged amino acid residues (e.g., lysine residues) in the RNA-binding motif can form electrostatic interactions with the negatively charged phosphate backbone of the RNA, thereby forming a ribonucleoprotein complex. In some embodiments, any of the engineered B-GEn polypeptides disclosed herein is in an RNP with a guide RNA.

Spacer: As used herein, the term "spacer" refers to a region of a gRNA molecule that is partially or fully complementary to a target sequence found in the + or - strand of genomic DNA. When complexed with a Cas protein, the gRNA directs the Cas protein to the target sequence in the genomic DNA. A spacer is typically 15 to 30 nucleotides in length (e.g., 20 to 25 nucleotides). The nucleotide sequence of a spacer can be, but is not necessarily, fully complementary to the target sequence. For example, in some embodiments, a spacer can contain one or more mismatches with a target sequence, e.g., the spacer can comprise one, two, or three mismatches with the target sequence.
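As an illustration of the length range and mismatch tolerance in this definition, the following Python sketch checks a candidate spacer against the spacer that would be fully complementary to the intended target. It is a hedged example only: the function name `check_spacer`, the default limit of three mismatches (taken from the "one, two, or three mismatches" example), and the sequences are assumptions for illustration, not a method from the disclosure.

```python
def check_spacer(spacer: str, perfect_match: str, max_mismatches: int = 3) -> dict:
    """Compare a candidate spacer with the fully complementary spacer.

    Assumptions: both sequences are RNA written 5'->3', have the same length,
    and `perfect_match` is the spacer that pairs with the target without mismatch.
    """
    if len(spacer) != len(perfect_match):
        raise ValueError("spacer and perfect-match sequence must have equal length")
    # Count positions where the candidate deviates from perfect complementarity.
    mismatches = sum(1 for a, b in zip(spacer.upper(), perfect_match.upper()) if a != b)
    return {
        "length_ok": 15 <= len(spacer) <= 30,   # typical spacer length per the definition
        "mismatches": mismatches,
        "within_limit": mismatches <= max_mismatches,
    }


# Hypothetical 20-nt spacer with a single mismatch at its 3' end.
print(check_spacer("GAUAUGCUCGAUAUGCUCGU", "GAUAUGCUCGAUAUGCUCGA"))
```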

Stem-loop structure: As used herein, the term "stem-loop structure" refers to a nucleic acid having a secondary structure that includes a region of nucleotides known or predicted to form a double strand (stem portion) that is linked on one side by a region of predominantly single-stranded nucleotides (loop portion). The terms "hairpin" and "fold-back" structures are also used herein to refer to stem-loop structures. Such structures are well known in the art, and these terms are used consistently with their known meanings in the art. As is known in the art, a stem-loop structure does not require exact base pairing. Thus, the stem may include one or more base mismatches. Alternatively, the base pairing may be exact, i.e., not include any mismatches.

Target cell: As used herein, the term "target cell" refers to a cell into which a nuclease (e.g., a B-GEn system of the present invention) is introduced, for example, a cell comprising a target DNA in its genome. It should be understood that this term is intended to refer not only to the particular subject cell but also to the progeny of such a cell. Because gene editing can occur in the cell as a result of the nuclease system, such progeny need not be identical to the parent cell into which the system was originally introduced, but include the gene-edited counterparts of that cell. Such gene-edited progeny are still included within the scope of the term "target cell" as used herein.

Target DNA: As used herein, the term "target DNA" refers to a polydeoxyribonucleotide that comprises a "target site" or "target sequence". The terms "target site", "target sequence", "target protospacer DNA" and "protospacer-like sequence" are used interchangeably herein to refer to a nucleic acid sequence present in a target DNA to which the DNA targeting segment (also referred to as a "spacer") of a guide RNA will bind, provided sufficient conditions for binding exist. For example, the target site (or target sequence) 5'-GAGCATATC-3' within a target DNA is targeted by (or is bound by, or hybridizes with, or is complementary to) the RNA sequence 5'-GAUAUGCUC-3'. Suitable DNA/RNA binding conditions include physiological conditions normally present in a cell. Other suitable DNA/RNA binding conditions (e.g., conditions in a cell-free system) are known in the art; see, e.g., Sambrook, J. and Russell, W., 2001. Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press. The strand of the target DNA that is complementary to and hybridizes with the guide RNA is referred to as the "complementary strand", and the strand of the target DNA that is complementary to the "complementary strand" (and is therefore not complementary to the guide RNA) is referred to as the "non-complementary strand". In some embodiments, the target DNA is genomic DNA.
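The relationship between the example target site and the RNA that targets it is a reverse complement with U in place of T. A short Python sketch can confirm this for the 9-nucleotide example in the definition; the helper name `complementary_rna` is hypothetical, and the sketch assumes an unambiguous DNA sequence (A, C, G, T only) written 5' to 3'.

```python
# Watson-Crick pairing of each DNA base with its RNA partner.
_DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def complementary_rna(target_dna: str) -> str:
    """Return the RNA, written 5'->3', that is fully complementary to a DNA
    sequence written 5'->3' (i.e., the reverse complement, with U for T)."""
    return "".join(_DNA_TO_RNA_COMPLEMENT[base] for base in reversed(target_dna.upper()))


# Target site from the definition: 5'-GAGCATATC-3'
print(complementary_rna("GAGCATATC"))  # GAUAUGCUC
```

The result matches the RNA sequence 5'-GAUAUGCUC-3' given in the definition.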

Transfection: As used herein, the term "transfection" refers to the introduction of nucleic acid molecules, such as DNA or RNA (e.g., mRNA) molecules, into a cell, for example, into the nucleus of a target cell or a producer cell. In the context of the present invention, the term "transfection" encompasses any method known to the skilled person for introducing nucleic acid molecules into cells, e.g., into eukaryotic cells, such as mammalian cells. Such methods encompass, for example, electroporation, lipofection (e.g., based on cationic lipids and/or liposomes), calcium phosphate precipitation, nanoparticle-based transfection, virus-based transfection, or transfection based on cationic polymers (such as DEAE-dextran or polyethyleneimine). In some embodiments, the nucleic acid molecules are associated or complexed with a polypeptide, for example in the form of a ribonucleoprotein.

Vector: As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid", which refers to a circular double-stranded DNA loop into which additional DNA segments can be incorporated. Another type of vector is a viral vector, in which additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in the host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) can be integrated into the genome of a host cell upon introduction into the host cell, and are thereby replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of nucleotide sequences to which they are operably linked. Such vectors are referred to herein as "expression vectors". In some embodiments, the vector is a viral vector, such as an adenoviral vector or an adeno-associated virus (AAV) vector.

6.2. Engineered Type V Endonucleases and B-GEn Polypeptides

The present invention provides engineered Type V endonucleases, such as engineered Bacillales Type V endonucleases. The engineered Type V endonucleases of the present invention generally comprise an amino acid other than aspartic acid at the position corresponding to D504 of SEQ ID NO: 1 (B-GEn.1) or D501 of SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2).

In certain aspects, the present invention also provides engineered Type V endonucleases containing an OBD (e.g., OBD-II) comprising the target interaction sequence motif GX₁X₂X₃X₄NX₅X₆X₇DX₈ (SEQ ID NO: 204), wherein each of X₁ to X₈ is any amino acid. In some embodiments, (a) X₁ is selected from D, E, S, P, K, and R; (b) X₂ is independently selected from V, I, and A; (c) X₃ is independently selected from Y and F; (d) X₄ is independently selected from L and F; (e) X₅ is independently selected from I, L, F, V, and M; (f) X₆ is independently selected from S, V, T, and A; (g) X₇ is independently selected from V, L, and I; and (h) X₈ is independently selected from V, F, L, and I. In exemplary embodiments, the target interaction sequence motif is any one of SEQ ID NO: 201, SEQ ID NO: 202, and SEQ ID NO: 203.
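The motif definition maps naturally onto a regular expression, which may help when screening candidate sequences. The Python sketch below encodes both the generic motif (any amino acid at X₁ to X₈) and the narrower residue choices listed in (a) to (h). The names `GENERIC_MOTIF`, `RESTRICTED_MOTIF` and `find_motifs`, and the test sequence, are hypothetical; a match only shows that the 11-residue pattern occurs, not that it lies within an OBD or is functional.

```python
import re

# G X1 X2 X3 X4 N X5 X6 X7 D X8, with any residue at each X position.
GENERIC_MOTIF = re.compile(r"G.{4}N.{3}D.")

# Same motif with the residue choices of items (a) to (h):
# X1 [DESPKR], X2 [VIA], X3 [YF], X4 [LF], X5 [ILFVM], X6 [SVTA], X7 [VLI], X8 [VFLI]
RESTRICTED_MOTIF = re.compile(r"G[DESPKR][VIA][YF][LF]N[ILFVM][SVTA][VLI]D[VFLI]")


def find_motifs(protein: str, restricted: bool = True) -> list[tuple[int, str]]:
    """Return (1-based start position, matched 11-mer) for each non-overlapping hit."""
    pattern = RESTRICTED_MOTIF if restricted else GENERIC_MOTIF
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


# Hypothetical sequence fragment, not taken from any SEQ ID NO.
print(find_motifs("MAAGDVYLNISVDVKK"))
```

Note that `finditer` reports non-overlapping matches only; overlapping occurrences would need a lookahead-based pattern.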

In certain aspects, the present invention also provides engineered Type V endonucleases, for example engineered Type V endonucleases that comprise an OBD including a target interaction sequence motif as described above and/or that comprise an amino acid other than aspartic acid at the position corresponding to D504 of the Type V endonuclease of SEQ ID NO: 1 (B-GEn.1) and D501 of the Type V endonucleases of SEQ ID NO: 2 (B-GEn.1.2) and SEQ ID NO: 3 (B-GEn.2). In some embodiments, the amino acid at the position corresponding to D504 or D501 is arginine.

The engineered Type V endonucleases of the present invention generally comprise an amino acid sequence having at least 50% identity to the amino acid sequence of a Bacillales Type V endonuclease, for example, any amino acid sequence having at least 50%, at least 60%, at least 70% or at least 80% identity to the amino acid sequence of any one of SEQ ID NOS: 1, 2, 3, and 179 to 199, and contain an OBD (e.g., OBD-II) comprising the target interaction sequence motif GX₁X₂X₃X₄NX₅X₆X₇DX₈ (SEQ ID NO: 204), wherein each of X₁ to X₈ is any amino acid. In some embodiments, (a) X₁ is selected from D, E, S, P, K, and R; (b) X₂ is independently selected from V, I, and A; (c) X₃ is independently selected from Y and F; (d) X₄ is independently selected from L and F; (e) X₅ is independently selected from I, L, F, V, and M; (f) X₆ is independently selected from S, V, T, and A; (g) X₇ is independently selected from V, L, and I; and (h) X₈ is independently selected from V, F, L, and I. In exemplary embodiments, the target interaction sequence motif is any one of SEQ ID NO: 201, SEQ ID NO: 202, and SEQ ID NO: 203.

In certain aspects, the engineered Type V endonucleases are engineered B-GEn polypeptides. In some embodiments, the engineered B-GEn polypeptides comprise an amino acid sequence that has at least 50% identity to the entire length of B-GEn.1 (SEQ ID NO: 1), B-GEn.1.2 (SEQ ID NO: 2), or B-GEn.2 (SEQ ID NO: 3) and/or that differs from any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 by up to 25 amino acids. The amino acid sequence preferably comprises: (a) the target interaction sequence motif of any one of SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO: 203, and SEQ ID NO: 204; (b) a RuvC I domain comprising an amino acid sequence having at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% sequence identity to the RuvC I domain of SEQ ID NO: 8 or SEQ ID NO: 11; (c) a RuvC II domain comprising an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% sequence identity to the RuvC II domain of SEQ ID NO: 9 or SEQ ID NO: 12; (d) a RuvC III domain comprising an amino acid sequence having at least 80%, at least 85%, or at least 90% sequence identity to the RuvC III domain of SEQ ID NO: 10 or SEQ ID NO: 13; or any combination of two, three, or all four of (a), (b), (c), and (d).

In some embodiments, the engineered B-GEn polypeptides of the present disclosure comprise an amino acid sequence having at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identity to the entire length of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3. In some embodiments, the engineered B-GEn polypeptides comprise an amino acid sequence having at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identity to the entire length of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3. The sequence preferably comprises: (a) the target interaction sequence motif of any one of SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO: 203, and SEQ ID NO: 204; (b) a RuvC I domain comprising an amino acid sequence having at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% sequence identity to the RuvC I domain of SEQ ID NO: 8 or SEQ ID NO: 11; (c) a RuvC II domain comprising an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% sequence identity to the RuvC II domain of SEQ ID NO: 9 or SEQ ID NO: 12; (d) a RuvC III domain comprising an amino acid sequence having at least 80%, at least 85%, or at least 90% sequence identity to the RuvC III domain of SEQ ID NO: 10 or SEQ ID NO: 13; or any combination of two, three, or all four of (a), (b), (c), and (d).
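To make the identity thresholds above concrete, the following sketch computes percent identity from two sequences that have already been aligned to each other. It is an illustration under stated assumptions, not a method defined in this document: gaps are written as `-`, the denominator is the ungapped length of the reference (one plausible reading of identity "to the entire length" of a reference sequence), and the function names and toy sequences are hypothetical.

```python
# Identity thresholds recited in the paragraph above, in percent.
THRESHOLDS = (55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 99.5)


def percent_identity(aligned_query, aligned_reference):
    """Percent identity relative to the full (ungapped) reference length.

    Both strings must come from the same pairwise alignment, so they have
    equal length and use '-' for gap positions. Using the reference length
    as the denominator is an assumption made for this sketch.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for r in aligned_reference if r != "-")
    if reference_length == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if r != "-" and q == r
    )
    return 100.0 * matches / reference_length


def thresholds_met(identity):
    """Return every recited threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Hypothetical toy alignment: one mismatch in ten reference residues.
reference = "MKTLAGDVYL"
query = "MKTLAGDVYF"
identity = percent_identity(query, reference)
print(identity, thresholds_met(identity))
```

Other conventions (for example, dividing by the alignment length or by the shorter sequence) give different values for the same alignment, so the convention should be fixed before comparing a sequence against a threshold.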

In some aspects, the engineered B-GEn polypeptides of the present disclosure comprise (a) the target interaction sequence motif of any one of SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO: 203, and SEQ ID NO: 204 and (b) RuvC I, RuvC II, and RuvC III amino acid sequences that, across the three nuclease domains (SEQ ID NOS: 8 to 10 for B-GEn.1 and SEQ ID NOS: 11 to 13 for B-GEn.1.2 and B-GEn.2), differ from the corresponding sequences of B-GEn.1, B-GEn.1.2, or B-GEn.2 by up to 25 amino acids in total, optionally wherein the RuvC III domain contains no more than 3 amino acid differences. In some embodiments, the engineered B-GEn polypeptide comprises an amino acid sequence having, overall, at least 70%, at least 80%, or at least 90% sequence identity to the amino acid sequence of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3.

In yet further aspects, the engineered B-GEn polypeptides of the present disclosure may comprise an amino acid sequence that differs from the entire sequence of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 by up to 25 amino acids, including the amino acid substitution D504R with respect to SEQ ID NO: 1 or D501R with respect to SEQ ID NO: 2 or SEQ ID NO: 3. In some embodiments, the engineered B-GEn polypeptide comprises an amino acid sequence that differs from the entire length of any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 by up to 25 amino acids, up to 20 amino acids, up to 15 amino acids, up to 14 amino acids, up to 13 amino acids, up to 11 amino acids, up to 10 amino acids, up to 9 amino acids, up to 8 amino acids, up to 7 amino acids, up to 6 amino acids, or up to 5 amino acids, in each case including the amino acid substitution D504R with respect to SEQ ID NO: 1 or D501R with respect to SEQ ID NO: 2 or SEQ ID NO: 3.
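The substitution notation and the "differs by up to N amino acids" limits above can be sketched in code. The example below is a simplified illustration that assumes the variant and the reference have the same length, so that only substitutions are counted (insertions and deletions would require an alignment first). Positions are 1-based, as in D504R. The function names and the seven-residue toy sequence are hypothetical.

```python
import re


def apply_substitution(sequence, substitution):
    """Apply a substitution written like 'D504R' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution: {substitution!r}")
    original, position, replacement = (
        match.group(1),
        int(match.group(2)),
        match.group(3),
    )
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[position - 1] != original:
        raise ValueError(
            f"expected {original} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


def count_differences(variant, reference):
    """Count mismatched positions between two equal-length sequences."""
    if len(variant) != len(reference):
        raise ValueError("this sketch only handles equal-length sequences")
    return sum(1 for v, r in zip(variant, reference) if v != r)


def within_limit(variant, reference, required, max_differences=25):
    """True if the variant carries the required substitution and differs
    from the reference at no more than max_differences positions."""
    differences = count_differences(variant, reference)
    position = int(required[1:-1])
    has_required = (
        position <= len(variant) and variant[position - 1] == required[-1]
    )
    return has_required and differences <= max_differences


# Hypothetical toy reference with aspartic acid (D) at position 5.
toy_reference = "MKTADGL"
toy_variant = apply_substitution(toy_reference, "D5R")
print(toy_variant, count_differences(toy_variant, toy_reference))
print(within_limit(toy_variant, toy_reference, "D5R"))
```

Note that the required substitution itself counts as one of the permitted differences in this sketch.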

The present disclosure provides polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 80% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 80% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Some embodiments according to the present disclosure are polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 85% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 85% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Other embodiments according to the present disclosure are polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 90% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 90% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Further embodiments according to the present disclosure are polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 95% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 95% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Further embodiments according to the present disclosure are polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 96% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 96% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Additional embodiments according to the present disclosure are polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 97% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 97% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Still additional embodiments according to the present disclosure are polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 98% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 98% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Still other embodiments according to the present disclosure are polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 99% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 99% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Still other embodiments according to the present disclosure are polypeptides, for example polypeptides having one or more of the features set forth in numbered embodiments 1 to 26, that have at least 99.5% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case include the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3, or nucleic acids comprising a nucleotide sequence encoding a polypeptide that comprises a nuclease sequence having at least 99.5% sequence identity to any one of SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 3 and in each case includes the amino acid arginine at the position corresponding to D504 of SEQ ID NO: 1 or at the position corresponding to D501 of SEQ ID NO: 2 or SEQ ID NO: 3.

Exemplary B-GEn consensus sequences, B-GEn consensus I and B-GEn consensus II, are provided herein as SEQ ID NO: 7 and SEQ ID NO: 205, respectively.

Exemplary engineered B-GEn sequences, referred to herein as B-GEn.1 D504R (SEQ ID NO: 4), B-GEn.1.2 D501R (SEQ ID NO: 5), and B-GEn.2 D501R (SEQ ID NO: 6), are set forth in Table 1 below.

Table 1
Name | Sequence | SEQ ID NO
B-GEn.1 D504R | aa sequence | 4
B-GEn.1.2 D501R | aa sequence | 5
B-GEn.2 D501R | aa sequence | 6

In some embodiments, the engineered B-GEn polypeptide comprises an amino acid sequence identical to that of the exemplary engineered B-GEn polypeptide of SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6.

In some embodiments, the polypeptides further comprise a nuclear localization signal, for example as described in Section 6.3, and/or a linker sequence, for example as described in Section 6.4.

6.3. Nuclear localization signal

The engineered Type V endonucleases and B-GEn polypeptides of the present disclosure may further comprise one or more nuclear localization signals (NLSs). In some embodiments, the engineered Type V endonucleases and B-GEn polypeptides of the present disclosure comprise one or more NLSs at their N-terminus. In some embodiments, the engineered Type V endonucleases and B-GEn polypeptides of the present disclosure comprise one or more NLSs at their C-terminus. In some embodiments, the engineered Type V endonucleases and B-GEn polypeptides of the present disclosure comprise one or more NLSs at both their N-terminus and their C-terminus. The NLSs can be separated from the endonuclease sequence, and from each other, by linker sequences.

Thus, the present disclosure provides engineered Type V endonucleases and B-GEn polypeptides comprising:
(i) an engineered Type V endonuclease or B-GEn polypeptide sequence, as described in Section 6.2;
(ii) one or more NLS sequences as described herein (e.g., in this Section 6.3); and
(iii) optionally, one or more linker sequences, e.g., as described in Section 6.4.

In certain aspects, the engineered Type V endonuclease or B-GEn polypeptide comprises a nuclease sequence and a first NLS sequence C-terminal to the nuclease sequence. An engineered Type V endonuclease or B-GEn polypeptide comprising a nuclease sequence and a first NLS sequence may further comprise a first linker sequence between the nuclease sequence and the first NLS sequence.

In certain aspects, the engineered Type V endonuclease or B-GEn polypeptide comprises more than one NLS sequence (e.g., more than one NLS sequence C-terminal to the nuclease sequence).

In some embodiments, the engineered Type V endonuclease or B-GEn polypeptide comprises a second NLS sequence C-terminal to the first NLS sequence. An engineered Type V endonuclease or B-GEn polypeptide comprising a second NLS sequence may further comprise a linker sequence between the first NLS sequence and the second NLS sequence.

In further embodiments, the engineered Type V endonuclease or B-GEn polypeptide comprises a third NLS sequence C-terminal to the second NLS sequence. An engineered Type V endonuclease or B-GEn polypeptide sequence comprising a third NLS sequence may further comprise a linker sequence between the second NLS sequence and the third NLS sequence.

In additional embodiments, the engineered Type V endonuclease or B-GEn polypeptide comprises a fourth NLS sequence C-terminal to the third NLS sequence. An engineered Type V endonuclease or B-GEn polypeptide comprising a fourth NLS sequence may further comprise a linker sequence between the third NLS sequence and the fourth NLS sequence.

In some embodiments, in addition to the one or more NLS sequences C-terminal to the nuclease sequence, the engineered Type V endonuclease or B-GEn polypeptide also comprises an N-terminal NLS sequence. Thus, in certain aspects, the engineered Type V endonuclease or B-GEn polypeptide comprises an NLS sequence N-terminal to the nuclease sequence. An engineered Type V endonuclease or B-GEn polypeptide comprising an NLS sequence N-terminal to the nuclease sequence may further comprise a linker sequence between the NLS sequence and the nuclease sequence. In certain embodiments, the engineered Type V endonuclease or B-GEn polypeptide comprises more than one N-terminal NLS sequence (e.g., more than one NLS sequence N-terminal to the nuclease sequence, which may be connected via one or more linkers).
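The arrangements described in this section (a nuclease sequence, optional N-terminal NLS sequences, one or more C-terminal NLS sequences, and optional linkers between adjacent elements) can be sketched as simple string assembly. In the example below, the NLS `PKKKRKV` is the wild-type SV40 large T NLS listed in Table 2 as SEQ ID NO: 14; the linker `GGS`, the function name `build_construct`, and the short toy nuclease sequence are hypothetical placeholders chosen only for illustration.

```python
def build_construct(nuclease, c_terminal_nls=(), n_terminal_nls=(), linker=""):
    """Concatenate NLS, linker, and nuclease sequences from N- to C-terminus.

    A linker (possibly empty) is placed between each pair of adjacent
    elements: between consecutive NLS sequences and between the nuclease
    sequence and the NLS sequence next to it.
    """
    parts = []
    for nls in n_terminal_nls:
        parts.extend([nls, linker])      # NLS, then linker toward the nuclease
    parts.append(nuclease)
    for nls in c_terminal_nls:
        parts.extend([linker, nls])      # linker, then the next C-terminal NLS
    return "".join(parts)


SV40_NLS = "PKKKRKV"   # Table 2, SEQ ID NO: 14
TOY_NUCLEASE = "MKT"   # hypothetical stand-in for a nuclease sequence
TOY_LINKER = "GGS"     # hypothetical linker, not taken from Section 6.4

# Nuclease followed by a first and a second C-terminal NLS, each preceded
# by a linker.
print(build_construct(TOY_NUCLEASE, [SV40_NLS, SV40_NLS], linker=TOY_LINKER))

# The same construct with an additional N-terminal NLS.
print(build_construct(TOY_NUCLEASE, [SV40_NLS], [SV40_NLS], TOY_LINKER))
```

Passing an empty `linker` yields directly fused elements, which corresponds to the embodiments in which the linker sequences are omitted.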

Non-limiting examples of nuclear localization signals are listed in Table 2.

Table 2
NLS sequence | NLS source | SEQ ID NO
PKKKRKV | Wild-type SV40 large T protein | 14
KRPAATKKAGQAKKKK | Nucleoplasmin | 15
PAAKRVKLD | c-myc | 16
MSRRRKANPTKLSENAKKLAKEVEN | EGL-13 | 17
KLKIKRPVK | TUS | 18
(KR)-X(0,2)-(KR)-(KR)-x(3,10)-(RHK)-X(1,5)-PY | Basic PY-NLS | 19
PKKKRMV | SV40 large T protein, variant 1 | 20
PKKKRKWEDP | SV40 large T protein, variant 2 | 21
CGYGPKKKRKVGG | SV40 large T protein, variant 3 | 22
CGYGPKKKRKV | SV40 large T protein, variant 4 | 23
CYDDEATADSQHSTPPKKKRKWEDPKDFESELLS | SV40 large T protein long NLS | 24
CGGPKKKRKWG | SV40 large T protein, variant 5 | 25
PKKKIKW | SV40 large T protein, variant 6 | 26
KRTADGSEFESPKKKRKV | SV40 large T protein, variant 7 | 27
TKKAGQAKKK | Nucleoplasmin min-NLS | 28
TKKAGQAKKKKLD | Nucleoplasmin NLS variant 1 | 29
CGQAKKKKLD | Nucleoplasmin NLS variant 2 | 30
RQRRNELKRSP | c-myc NLS2 | 31
PKKARED | Polyoma large T protein | 32
CGYGWSRKRPRPG | Polyomavirus large T protein | 33
APTKRKGS | SV40 VP1 capsid polypeptide | 34
APKRKSGVSKC | Polyomavirus major capsid protein VP1 (11 N-terminal aa) | 35
PNKKKRK | SV40 VP2 capsid protein (39 kD) | 36
EEDGPQKKKRRL | Polyomavirus capsid protein VP2 | 37
GKKRSKA | Yeast histone H2B | 38
KRPRP | Adenovirus E1a | 39
CGGLSSKRPRP | Adenovirus type 2/5 E1a | 40
LVRKKRKTE3SP | Xenopus N1, NLS1 | 41
LKDKDAKKSKQE | Xenopus N1, NLS2 | 42
GNKAKRQRST | V-Rel | 43
PFLDRLRRDQK | NS1 protein of influenza A virus | 44
SVTKKRKLE | Human lamin A | 45
SASKRRRLE | Xenopus lamin A | 46
ACIDKRVKLD | Human c-myc | 47
SALIKKKKKMAP | Murine c-abl | 48
PPKKRMRRRIE | Adenovirus 5 DBP | 49
YRKCLQAGMNLEARKTKKKIKGIQQATA | Rat glucocorticoid receptor | 50
CGYGARKTKKKIK | Human glucocorticoid receptor | 51
RKCLQAGMNLEARKTKK | Human glucocorticoid receptor NLS variant | 52
RKFKKFNK | Rabbit progesterone receptor | 53
CGYGIRKDRRGGR | Human estrogen receptor | 54
CGYGARKLKKLGN | Human androgen receptor | 55
GKRKNKPK | Chicken Ets1 nuclear NLS | 56
PLLKKIKQ | c-myb | 57
PPQKKIKS | N-myc | 58
PQPKKKP | p53 | 59
PQPKKKPL | p53 NLS variant | 60
SKRWAKRKL | c-erb-A | 61
MTGSKTRKHRGSGA | Yeast ribosomal protein L29 NLS | 62
MTGSKHRKHPGSGA | Yeast ribosomal protein L29 NLS, variant 1 | 63
RHRKHP | Yeast ribosomal protein L29 NLS, variant 2 | 64
KRRKHP | Yeast ribosomal protein L29 NLS, variant 3 | 65
KYRKHP | Yeast ribosomal protein L29 NLS, variant 4 | 66
KHRRHP | Yeast ribosomal protein L29 NLS, variant 5 | 67
KHKKHP | Yeast ribosomal protein L29 NLS, variant 6 | 68
RHLKHP | Yeast ribosomal protein L29 NLS, variant 7 | 69
KHRKYP | Yeast ribosomal protein L29 NLS, variant 8 | 70
KHRQHP | Yeast ribosomal protein L29 NLS, variant 9 | 71
LVRKKRKTE3SP | Xenopus N1 NLS1 | 72
LKDKDAKKSKQE | Xenopus N1 NLS2 | 73
ASKSRKRKL | Viral Jun | 74
GGLCSARLHRHALLAT | Human T-cell leukemia virus Tax transactivator protein | 75
DTREKKKFLKRRLLRLDE | Mouse nuclear MX1 protein (72 kD) | 76
REKKKFLKRR | Mouse nuclear MX1 protein NLS variant | 77
CGYGDRNKKKKE | Human retinoic acid receptor | 78
RKRQRALMLRQAR | Human XPAC | 79
EYLSRKGKLEL | T-DNA-linked VirD2 endonuclease | 80
KKSKKKRC | Putative core NLS of yeast TRM1 | 206
QPQRYGGGRGRRW | Gag protein of human foamy retrovirus | 81
NKKKRKLSRGSSQKTKGTSASAKARHKRRNRSSRS | SV40 Vp3 structural protein | 82
RVTIRTWRWRRPPKGKHRK | Simian sarcoma virus v-sis gene product | 83
KRKIEEPEPEPKKAK | Putative bipartite NLS of Xenopus protein factor Xnf7 | 84
KKYENVVIKRSPRKRGRPRKD | Yeast SWI5 gene product | 85
GRKRAFHGDDPFGEGPPDKKGD | Herpes simplex virus ICP8 protein | 86
KRPREDDDGEPSERKRARDDR | Bipartite NLS of VirD2 endonuclease | 87
RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV | IBB domain from importin-α | 88
NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY | hRNPA1 M9 NLS | 89
KRKGDEVDGVDEVAKKKSKK | Human poly(ADP-ribose) polymerase | 90
RKLKKKIKKL | Hepatitis virus delta antigen | 91
PKQKKRK | Influenza virus NLS | 92
SALIKKKKKMAP | Mouse c-abl IV | 93
VSRKRPRP | Myoma T protein NLS1 | 94
PPKKARED | Myoma T protein NLS2 | 95
GPAAKRVKLD | myc proto-oncogene protein [Homo sapiens] | 96
KKRRIKQD | Heat shock factor protein HSF8 [tomato] | 97
PKKKRKVEDPKKKRKVD | 2x SV40, LrgT | 98
PKKKRKVDPKKKRKVDPKKKRKV | 3x SV40, LrgT | 99
KKGKKKGK | Monopartite NLS consensus cluster 1 (Dissertation Tatyana Goldberg 2016) | 100
PKRRRGVVL | Monopartite NLS consensus cluster 2 (Dissertation Tatyana Goldberg 2016) | 101
EQLFKRRNV | Monopartite NLS consensus cluster 3 (Dissertation Tatyana Goldberg 2016) | 102
KRRRR | Monopartite NLS consensus cluster 4 (Dissertation Tatyana Goldberg 2016) | 103
KKRRR | Monopartite NLS consensus cluster 5 (Dissertation Tatyana Goldberg 2016) | 104
EGAPPAKRPR | Monopartite NLS consensus cluster 6 (Dissertation Tatyana Goldberg 2016) | 105
MLRRRRRKRAR | Monopartite NLS consensus cluster 7 (Dissertation Tatyana Goldberg 2016) | 106
RRKRR | Monopartite NLS consensus cluster 8 (Dissertation Tatyana Goldberg 2016) | 107
RKRK | Monopartite NLS consensus cluster 9 (Dissertation Tatyana Goldberg 2016) | 108
FKAVLEDILGEL | Monopartite NLS minor cluster (Dissertation Tatyana Goldberg 2016) | 109
KNRRL | NLSdb, protein source P10152 | 110
RKRHW | NLSdb, protein source Q09353 | 111
RRKKRR | NLSdb, protein sources Q0VD86, Q58DS6, Q5R6G2, Q9ERI5, Q6AYK2, Q6NYC1 | 112
RRKRSR | NLSdb, protein sources Q99PU7, D3ZHS6, Q92560, A2VDM8 | 113
KRGRKP | NLSdb, protein sources Q14781, P30658 | 114
KKRKLE | NLSdb, protein sources P02545, P48678, P48679, Q3ZD69 | 115
PKKKSRK | NLSdb, protein sources O35914, Q01954 | 116
PKRGRGR | NLSdb, protein sources Q9FYS5, Q43386 | 117
KEKRKKR | NLSdb, protein source E5RQA1 | 118
KKKKRKR | NLSdb, protein sources Q9Z1J1, Q9HCS4, Q924A0 | 119
RRGDGRRR | NLSdb, protein sources Q80WE1, Q5R9B4, Q06787, P35922 | 120
LSPSLSPL | NLSdb, protein sources Q9Y261, P32182, P35583 | 121
VNFSEFSK | NLSdb, protein source P07156 | 122
IVINILSE | NLSdb, protein source Q96EB6 | 123
PPAKRKCIF | NLSdb, protein sources Q6AZ28, O75928, Q8C5D8 | 124
QRPGPYDRP | NLSdb, SeqNLS prediction | 125
KRKRGRPRK | NLSdb, protein sources Q8L7L5, A1L4X7, O80834, Q8LPN5 | 126
KIKELYRRR | NLSdb, protein sources O88907, O75925 | 127
MVQLRPRASR | NLSdb, SeqNLS prediction | 128
KKRREKQRRR | NLSdb, protein source Q5VK71 | 129
EGAPPAKRAR | NLSdb, protein sources P0C6L6, P25880, Q81835, P29833, P25882, P06934, P0C6L3, P29997, P0C6M5, P0C6M9, P29996, P0C6M1, P0C6L8, P0C6M2, P0C6L7, P25881 | 130
PKKGDKYDKTD | NLSdb, protein source Q45FA5 | 131
KKKKSKDKKRK | NLSdb, protein sources P97376, Q14331 | 132
AHRAKKMSKTHA NLSdb, protein source P21827 
133 KKGPSVQKRKKT NLSdb, protein source Q6ZN17 134 KGVKRKADTTTP NLSdb, protein source Q4R8Y1 135 KGVKRRADTTTP NLSdb, protein source Q91Y44, D4A7T3 136 KKPKWDDFKKKKK NLSdb, protein source Q15397, Q8BKS9, Q562C7 137 KRRRRRRREKRKR NLSdb, protein source Q96GM8 138 DVRKRVQDLEQKM NLSdb, protein source P61635, Q6DV79, Q19S50, P52631 139 KKGKDEWFSRGKK NLSdb, protein source O60716 140 KKGKDEWFSRGKKP NLSdb, protein source P30999 141 ASPEYVNLPINGNG NLSdb, SeqNLS prediction 142 YLRPVKKPKIRRKK NLSdb, protein source Q7Z7C8, Q5ZMS1, Q9EQH4, A7MAZ4 143 KRKGKLKNKGSKRKK NLSdb, protein source O15381 144 RRRGKNKVAAQNCRK NLSdb, SeqNLS prediction 145 DKAKRVSRNKSEKKRR NLSdb, protein source O15516, Q5RAK8, Q91YB2, Q91YB0, Q8QGQ6, O08785, Q9WVS9, Q6YGZ4 146 EEQLRRRKNSRLNNTG NLSdb, protein source G5EFF5 147 HKKKHPDASVNFSEFSK NLSdb, protein source P10103, Q4R844, P12682, B0CM99, A9RA84, Q6YKA4, P09429, P63159, Q08IE6, P63158, Q9YH06, B1MTB0 148 KKTGKNRKLKSKRVKTR NLSdb, protein source Q9Z301, O54943, Q8K3T2 149 KRSCRRRLAGHNERRRK NLSdb, protein sources Q38740, Q38741, Q700W2, Q9S7A9, Q6Z461, P93015, Q94JW8, Q9S758 150 KRQRRKQSNRESARRSR NLSdb, protein source Q501B2 151 RGKGGKGLGKGGAKRHRK NLSdb, SeqNLS prediction 152 RRRGFERFGPDNMGRKRK NLSdb, protein source Q63014, Q9DBR0 153 RRHQQGQGDDSSHKKERK NLSdb, protein source Q0IJ08, Q2TAE3, Q63470, Q13627, Q61214 154 KKKTGVIAPKRFVQRLKK NLSdb, protein source Q8LAM0, O24454 155 KRAMKDDSHGNSTSPKRRK NLSdb, protein source Q0E671 156 KVNFLDMSLDDIIIYKELE NLSdb, protein source Q9P127 157 KKYENVVIKRSPRKRGRPRK NLSdb, SeqNLS prediction 158 KRGNSSIGPNDLSKRKQRKK NLSdb, SeqNLS prediction 159 KRASEDTTSGSPPKKSSAGPKR NLSdb, protein source Q9BZZ5, Q5R644 160 KRIHSVSLSQSQIDPSKKVKRAK NLSdb, SeqNLS prediction 161 EVLKVIRTGKRKKKAWKRMVTKVC NLSdb, SeqNLS prediction 162 IINGRKLKLKKSRRRSSQTSNNSFTSRRS NLSdb, SeqNLS prediction 163 AHFKISGEKRPSTDPGKKAKNPKKKKKKDP NLSdb, protein source Q76IQ7 164

In some embodiments, the engineered B-GEn polypeptide comprises one or more NLS sequences located only at the N-terminus of the B-GEn.2 protein sequence. In other embodiments, the fusion protein of the present invention comprises one or more NLS sequences located only at the C-terminus of the B-GEn.2 protein sequence. An exemplary engineered B-GEn polypeptide with a C-terminal NLS is represented by SEQ ID NO: 200.

In some embodiments, the engineered B-GEn polypeptide comprises multiple identical NLS sequences. In other embodiments, the fusion protein of the present invention comprises multiple different NLS sequences.

In some embodiments, the engineered B-GEn polypeptide comprises a nucleoplasmin NLS (SEQ ID NO: 15) at the N-terminus of the B-GEn polypeptide sequence described in Section 6.2 and an SV40 large T protein NLS (SEQ ID NO: 14) at the C-terminus of the B-GEn polypeptide sequence described in Section 6.2.

In some embodiments, the engineered B-GEn polypeptide comprises an SV40 large T protein NLS (SEQ ID NO: 14) at the C-terminus of the B-GEn polypeptide sequence described in Section 6.2.

In some embodiments, the engineered B-GEn polypeptide comprises a nucleoplasmin NLS (SEQ ID NO: 15) at the N-terminus of the B-GEn polypeptide sequence described in Section 6.2.

6.4. Linker Sequences

The present invention provides engineered B-GEn polypeptides in the form of fusion proteins in which a B-GEn polypeptide sequence (e.g., having an amino acid sequence as described in Section 6.2) is fused to one or more NLS sequences, optionally via a peptide linker.

In some embodiments, the B-GEn polypeptide sequence is linked via a peptide linker to an NLS sequence at its N-terminus and/or C-terminus. In other embodiments, the B-GEn polypeptide sequence is linked to one or more NLS sequences via peptide linkers that each join a pair of individual polypeptide sequences, for example two NLS sequences, or an NLS sequence and the B-GEn polypeptide sequence.

In some embodiments, the B-GEn polypeptide of the present invention comprises multiple NLS sequences connected by the same linker. In other embodiments, the B-GEn polypeptide of the present invention comprises multiple NLS sequences connected by different linkers.
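The N-terminal NLS, linker, B-GEn and C-terminal NLS arrangement described above can be sketched as simple string concatenation. This is only an illustration under stated assumptions: the function name `build_fusion` and the core sequence are hypothetical, and the two NLS strings are the commonly cited SV40 and nucleoplasmin motifs used as stand-ins, since the sequences of SEQ ID NOs: 14 and 15 are not reproduced in this section. The linker is the flexible (GGGGS)₃ linker of Table 3.

```python
# Stand-in sequences (assumptions, not taken from the sequence listing).
SV40_NLS = "PKKKRKV"                     # commonly cited SV40 large T motif
NUCLEOPLASMIN_NLS = "KRPAATKKAGQAKKKK"   # commonly cited nucleoplasmin motif
FLEXIBLE_LINKER = "GGGGS" * 3            # (GGGGS)3, flexible linker in Table 3


def build_fusion(core, n_term=(), c_term=(), linker=""):
    """Join N-terminal NLSs, the core polypeptide and C-terminal NLSs.

    The same `linker` is placed between every adjacent pair of parts,
    which corresponds to the "same linker" embodiment; an empty linker
    gives a direct fusion.
    """
    parts = [*n_term, core, *c_term]
    return linker.join(parts)


if __name__ == "__main__":
    core = "MCORE"  # hypothetical placeholder for a B-GEn polypeptide sequence
    fusion = build_fusion(core, [NUCLEOPLASMIN_NLS], [SV40_NLS], FLEXIBLE_LINKER)
    print(fusion)
```

Using different linkers at each junction would need a list of linkers rather than a single string; the single-linker form is kept here for brevity.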

Peptide linkers suitable for use in the B-GEn polypeptides of the present invention include those disclosed in Chen et al., 2013, Adv Drug Deliv Rev. 65(10):1357-1369. Non-limiting examples of such linkers are reproduced in Table 3 below.

Table 3
Linker sequence | Linker type | SEQ ID NO
(GGGGS)₃ | Flexible | 165
GGG(EAAAK)₃ | Mixed | 166
A(EAAAK)₄ALEA(EAAAK)₄A | Rigid | 167
GGGGSLVPRGSGGGGS | Flexible | 168
GAAPAAAPAKQEAAAPAPAAKAEAPAAAPAAKA | Proline-rich, rigid, interdomain | 169
TRHRQPRGWE | Cleavable | 170

6.5. B-GEn Type V CRISPR-Cas Systems

The present invention provides engineered Type V endonuclease and B-GEn polypeptide systems that incorporate the engineered Type V endonucleases or B-GEn polypeptides of the present invention, or nucleic acids encoding them.

In some embodiments, the engineered Type V endonuclease or B-GEn polypeptide system comprises the following components:
(a) an engineered Type V endonuclease or B-GEn polypeptide, e.g., as described in Section 6.2, or a nucleic acid encoding such an engineered B-GEn polypeptide, e.g., as described in Section 6.8, and
(b) a heterologous guide RNA (gRNA), e.g., as described in Section 6.6, or a nucleic acid that allows in situ production of such a gRNA (e.g., a vector as described in Section 6.9), wherein the gRNA comprises:
i. an engineered DNA targeting segment that consists of RNA and is capable of hybridizing to a target sequence in a nucleic acid locus,
ii. a tracr mate sequence consisting of RNA, and
iii. a tracr RNA sequence consisting of RNA,
wherein the tracr mate sequence hybridizes to the tracr sequence, and wherein (i), (ii) and (iii) are arranged in a 5' to 3' orientation.

The gRNA may be a single guide RNA (sgRNA). In an sgRNA, the tracr mate sequence and the tracr sequence are generally connected by a suitable loop sequence and form a stem-loop structure.

In particular embodiments, the engineered Type V endonuclease or B-GEn polypeptide system comprises the following components:
(a) an engineered Type V endonuclease or B-GEn polypeptide, e.g., as described in Section 6.2, and
(b) a heterologous guide RNA (gRNA), e.g., as described in Section 6.6, comprising:
i. an engineered DNA targeting segment that consists of RNA and is capable of hybridizing to a target sequence in a nucleic acid locus,
ii. a tracr mate sequence consisting of RNA, and
iii. a tracr RNA sequence consisting of RNA,
wherein the tracr mate sequence hybridizes to the tracr sequence, and wherein (i), (ii) and (iii) are arranged in a 5' to 3' orientation.

The gRNA may be a single guide RNA (sgRNA). In an sgRNA, the tracr mate sequence and the tracr sequence are generally connected by a suitable loop sequence and form a stem-loop structure.

In some embodiments, such engineered Type V endonucleases or B-GEn polypeptides are delivered to target cells as a composition referred to as a ribonucleoprotein (RNP) complex, as described in Section 6.7.

In particular embodiments, the engineered Type V endonuclease or B-GEn polypeptide system comprises the following components:
(a) a nucleic acid, e.g., as described in Section 6.8, encoding an engineered Type V endonuclease or B-GEn polypeptide, e.g., as described in Section 6.2, or
(b) a nucleic acid (e.g., a vector as described in Section 6.9) that allows production of a heterologous guide RNA (gRNA), e.g., as described in Section 6.6, wherein the gRNA comprises:
i. an engineered DNA targeting segment that consists of RNA and is capable of hybridizing to a target sequence in a nucleic acid locus,
ii. a tracr mate sequence consisting of RNA, and
iii. a tracr RNA sequence consisting of RNA,
wherein the tracr mate sequence hybridizes to the tracr sequence, and wherein (i), (ii) and (iii) are arranged in a 5' to 3' orientation.

The gRNA may be a single guide RNA (sgRNA). In an sgRNA, the tracr mate sequence and the tracr sequence are generally connected by a suitable loop sequence and form a stem-loop structure.

Any reference to "RNA" or "guide RNA" encompasses RNA molecules containing non-natural as well as natural nucleobases, for example one or more of the nucleic acid modifications described in Section 6.8.3.

In some embodiments, the nucleic acid encoding the engineered Type V endonuclease or B-GEn polypeptide and/or the sgRNAs contains a suitable promoter for expression in a cell or in an in vitro environment.

In some embodiments, the nucleic acid encoding the engineered Type V endonuclease or B-GEn polypeptide and/or the sgRNAs is in the form of a viral vector, e.g., as described in Section 6.9.3.

6.6. Guide RNA (gRNA) and Single Guide RNA (sgRNA)

In some embodiments, the systems, compositions and methods described herein employ a genome-targeting nucleic acid that can direct the activity of an engineered B-GEn polypeptide to a specific target sequence within a target nucleic acid. In some embodiments, the genome-targeting nucleic acid is an RNA. A genome-targeting RNA is referred to herein as a "guide RNA" or "gRNA". A guide RNA has at least a spacer sequence that can hybridize to a target nucleic acid sequence of interest and a CRISPR repeat sequence (such a CRISPR repeat sequence is also referred to as a "tracr mate sequence"). In Type II systems, the gRNA also has a second RNA referred to as a tracrRNA sequence. In a Type II guide RNA (gRNA), the CRISPR repeat sequence and the tracrRNA sequence hybridize to each other to form a duplex. In a Type V guide RNA (gRNA), the crRNA forms a duplex. In both systems, the duplex binds a site-directed polypeptide such that the guide RNA and the site-directed polypeptide form a complex. The genome-targeting nucleic acid provides target specificity to the complex by virtue of its association with the site-directed polypeptide. The genome-targeting nucleic acid thus directs the activity of the site-directed polypeptide.

In some embodiments, the genome-targeting nucleic acid is a double-molecule guide RNA. In some embodiments, the genome-targeting nucleic acid is a single-molecule guide RNA, or single guide RNA (sgRNA). A double-molecule guide RNA has two strands of RNA. The first strand has, in the 5' to 3' direction, an optional spacer extension sequence, a spacer sequence and a minimum CRISPR repeat sequence. The second strand has a minimum tracrRNA sequence (complementary to the minimum CRISPR repeat sequence), a 3' tracrRNA sequence and an optional tracrRNA extension sequence. A single-molecule guide RNA (sgRNA) in a Type II system has, in the 5' to 3' direction, an optional spacer extension sequence, a spacer sequence, a minimum CRISPR repeat sequence, a single-molecule guide linker, a minimum tracrRNA sequence, a 3' tracrRNA sequence and an optional tracrRNA extension sequence. The optional tracrRNA extension may have elements that contribute additional functionality (e.g., stability) to the guide RNA. The single-molecule guide linker links the minimum CRISPR repeat and the minimum tracrRNA sequence to form a hairpin structure. The optional tracrRNA extension may have one or more hairpins.

A single-molecule guide RNA (sgRNA) in a Type V system has, in the 5' to 3' direction, a minimum CRISPR repeat sequence and a spacer sequence. Alternatively, a single-molecule guide RNA (sgRNA) in a Type V system has, in the 5' to 3' direction, an optional tracr extension sequence, a tracr RNA sequence, a single-molecule guide linker, a minimum CRISPR repeat sequence, a spacer sequence and an optional spacer extension sequence.

Alternatively, a single-molecule guide RNA (sgRNA) in a Type V system has, in the 5' to 3' direction, an optional extension sequence, a minimum CRISPR repeat sequence, a spacer sequence and an optional spacer extension sequence.

In some embodiments, an sgRNA in a Type V system comprises, in the 5' to 3' direction, an optional extension sequence, an artificial nuclease-binding RNA sequence, a spacer sequence and an optional spacer extension sequence.
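The 5' to 3' element order of the simpler Type V sgRNA layouts above can be sketched as a small assembly helper. The function name `assemble_type_v_sgrna` and the example sequences are hypothetical placeholders, not the B-GEn sequences of the sequence listing; the sketch only concatenates the elements in order and checks that each is written in the RNA alphabet.

```python
RNA_ALPHABET = set("ACGU")


def assemble_type_v_sgrna(repeat, spacer, extension="", spacer_extension=""):
    """Concatenate sgRNA elements in 5'->3' order.

    Order: optional extension, minimum CRISPR repeat, spacer,
    optional spacer extension. Raises ValueError if an element
    contains anything other than A, C, G or U.
    """
    elements = [
        ("extension", extension),
        ("repeat", repeat),
        ("spacer", spacer),
        ("spacer_extension", spacer_extension),
    ]
    for name, seq in elements:
        if not set(seq.upper()) <= RNA_ALPHABET:
            raise ValueError(f"{name} contains non-RNA characters: {seq!r}")
    return "".join(seq.upper() for _, seq in elements)


if __name__ == "__main__":
    # Hypothetical placeholder sequences for illustration only.
    repeat = "GGAAUUUCUACUGAAGUAGAU"
    spacer = "GACCUUAGCAUCGGAUACCU"  # 20 nt
    print(assemble_type_v_sgrna(repeat, spacer))
```

The layout with a tracr RNA sequence and single-molecule guide linker upstream of the repeat would add two more leading elements in the same manner.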

Particularly useful sgRNAs for use with the B-GEn.2 CRISPR Cas nuclease according to the present invention, and potentially with other Type V endonucleases, are disclosed in Table 4.

Table 4
sgRNA sequence | SEQ ID NO
B-GEn.2-sgRNA_v4 | 171
B-GEn.2-sgRNA_v4.2 | 172
B-GEn.2-sgRNA_v4.3 | 173
B-GEn.2-sgRNA_v4.4 | 174
B-GEn.2-sgRNA_v4.5 | 175

Exemplary genome-targeting nucleic acids are described, for example, in WO2018002719. In general, a CRISPR repeat sequence includes any sequence that has sufficient complementarity with a tracr sequence to promote one or more of: (1) excision of a DNA targeting segment flanked by CRISPR repeat sequences in a cell containing the corresponding tracr sequence; and (2) formation of a CRISPR complex at a target sequence, wherein the CRISPR complex includes the CRISPR repeat sequence hybridized to the tracr sequence. In general, the degree of complementarity is determined with reference to the optimal alignment of the CRISPR repeat sequence and the tracr sequence, along the length of the shorter of the two sequences. Optimal alignment can be determined by any suitable alignment algorithm and can further account for secondary structure, such as self-complementarity within the tracr sequence or the CRISPR repeat sequence. In some embodiments, when optimally aligned, the degree of complementarity between the tracr sequence and the CRISPR repeat sequence along the 30-nucleotide length of the shorter of the two is about or more than 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99% or higher. In some embodiments, the tracr sequence is about or more than 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50 or more nucleotides in length. In some embodiments, the tracr sequence and the CRISPR repeat sequence are contained within a single transcript, such that hybridization between the two produces a transcript having a secondary structure, such as a hairpin. In some embodiments, the transcript or transcribed nucleic acid sequence has at least two or more hairpins.

Suitable tracr sequences for use with B-GEn.2 or B-GEn.1 in a CRISPR Cas system are listed in Table 5. Alternatively, variants of these sequences may be used. Variants may include partial or truncated forms of such sequences and/or sequences having base modifications at one or more positions. The respective RNA sequences are disclosed in SEQ ID NOs: 176 and 177.

Table 5
Name | SEQ ID NO (RNA sequence)
tracr_B-GEn.2 | 176
tracr_B-GEn.1 | 177

The spacer of a guide RNA includes a nucleotide sequence that is complementary to a sequence in the target DNA. In other words, the spacer of the guide RNA interacts with the target DNA in a sequence-specific manner via hybridization (i.e., base pairing). The nucleotide sequence of the spacer can therefore vary, and it determines the location within the target DNA at which the guide RNA and the target DNA will interact. The DNA targeting segment of a guide RNA can be modified (e.g., by genetic engineering) to hybridize to any desired sequence within the target DNA.

In some embodiments, the spacer has a length of 10 to 30 nucleotides. In some embodiments, the spacer has a length of 13 to 25 nucleotides. In some embodiments, the spacer has a length of 15 to 23 nucleotides. In some embodiments, the spacer has a length of 18 to 22 nucleotides, for example 20 to 22 nucleotides.

In some embodiments, the percent complementarity between the DNA targeting sequence of the spacer and the protospacer of the target DNA, over 20 to 22 nucleotides, is at least 60% (e.g., at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100%).
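The percent complementarity referred to above can be illustrated with a position-by-position count of Watson-Crick pairs between an RNA spacer and the DNA strand it hybridizes to. This is a minimal sketch under stated assumptions: the function name is hypothetical, both sequences are written 5' to 3' and have equal length, pairing is antiparallel, no gaps or bulges are considered, and G-U wobble pairs are counted as mismatches.

```python
# RNA base -> the DNA base it pairs with (Watson-Crick only).
RNA_DNA_PAIRS = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(spacer_rna, target_strand_dna):
    """Percent of spacer positions that base-pair with the target strand.

    Both arguments are 5'->3'. Because hybridization is antiparallel,
    spacer position i is compared with the target strand read 3'->5'.
    """
    spacer = spacer_rna.upper()
    target = target_strand_dna.upper()
    if len(spacer) != len(target) or not spacer:
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        RNA_DNA_PAIRS.get(r) == d for r, d in zip(spacer, reversed(target))
    )
    return 100.0 * paired / len(spacer)


if __name__ == "__main__":
    print(percent_complementarity("ACGU", "ACGT"))  # fully complementary
    print(percent_complementarity("ACGU", "ACGA"))  # one mismatch in four
```

A result of at least 60.0 over a 20 to 22 nucleotide window would correspond to the lowest threshold recited in the text.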

In some embodiments, the protospacer is directly adjacent to a suitable PAM sequence at its 3' end, or the PAM sequence is part of the DNA targeting sequence in its 3' portion.

Suitable PAM sequences are listed in Table 6, wherein the engineered DNA targeting segment is directly adjacent at its 3' end to the PAM sequence on the targeted DNA segment, or the PAM sequence is part of the targeted DNA sequence in its 5' portion.

Table 6
B-GEn | PAM sequence
B-GEn.1 | "DTTN", where "D" denotes "A", "T" or "G"; "ATT" is preferred
B-GEn.1.2 | "DTTN", where "D" denotes "A", "T" or "G"; "ATT" is preferred
B-GEn.2 | "DTTN", where "D" denotes "A", "T" or "G"; "ATT" is preferred
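As an illustration of how the degenerate "DTTN" PAM of Table 6 can be matched against a DNA sequence, the sketch below expands the IUPAC codes "D" (A, G or T) and "N" (any base) into a regular expression and reports every start position, including overlapping ones. The function names are hypothetical, and only the strand supplied is scanned; scanning the reverse complement and deciding on which side of the PAM the protospacer lies are left out.

```python
import re

# IUPAC nucleotide codes needed for the PAMs in Table 6.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "D": "[AGT]", "N": "[ACGT]"}


def pam_regex(pam):
    """Compile a lookahead regex so overlapping PAM matches are all found."""
    body = "".join(IUPAC[code] for code in pam.upper())
    return re.compile(f"(?=({body}))")


def find_pam_sites(dna, pam="DTTN"):
    """Return 0-based start positions of every PAM match on the given strand."""
    return [m.start() for m in pam_regex(pam).finditer(dna.upper())]


if __name__ == "__main__":
    print(find_pam_sites("CCATTGCGTTTA"))
```

The lookahead form `(?=(...))` matters here: a plain pattern would skip the match at position 8 because it overlaps the match starting at position 7.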

Modifications of guide RNAs can be used to enhance the formation or stability of the CRISPR-Cas genome editing complex comprising the guide RNA and a Cas endonuclease (such as B-GEn.1 or B-GEn.2). Modifications of guide RNAs can also or alternatively be used to enhance the initiation, stability or kinetics of interactions between the genome editing complex and the target sequence in the genome, which can be used, for example, to enhance on-target activity. Modifications of guide RNAs can also or alternatively be used to enhance specificity, e.g., the relative rate of genome editing at the on-target site compared to effects at other (off-target) sites.

Modifications can also or alternatively be used to increase the stability of a guide RNA, e.g., by increasing its resistance to degradation by ribonucleases (RNases) present in the cell, thereby increasing its half-life in the cell. Modifications that enhance guide RNA half-life can be particularly useful in embodiments in which a Cas endonuclease (such as B-GEn.1, B-GEn.1.2 or B-GEn.2) is introduced into the cell to be edited via an RNA that must be translated to produce the B-GEn.1, B-GEn.1.2 or B-GEn.2 endonuclease, because increasing the half-life of guide RNAs introduced at the same time as the RNA encoding the endonuclease can increase the time during which the guide RNAs and the encoded Cas endonuclease co-exist in the cell.

6.6.1. Additional Sequences

In some embodiments, a guide RNA comprises at least one additional segment at the 5' or 3' end. For example, a suitable additional segment can comprise a 5' cap (e.g., a 7-methylguanylate cap (m7G)); a 3' polyadenylated tail (e.g., a 3' poly(A) tail); a riboswitch sequence (e.g., to allow regulated stability and/or regulated accessibility by proteins and protein complexes); a sequence that forms a dsRNA duplex (e.g., a hairpin); a sequence that targets the RNA to a subcellular location (e.g., nucleus, mitochondria, chloroplasts, and the like); a modification or sequence that provides for tracking (e.g., direct conjugation to a fluorescent molecule, conjugation to a moiety that facilitates fluorescent detection, a sequence that allows fluorescent detection, etc.); a modification or sequence that provides a binding site for proteins (e.g., proteins that act on DNA, including transcriptional activators, transcriptional repressors, DNA methyltransferases, DNA demethylases, histone acetyltransferases, histone deacetylases, and the like); a modification or sequence that provides increased, decreased and/or controllable stability; and combinations thereof.

6.6.1.1. Stability Control Sequences

A stability control sequence influences the stability of an RNA (e.g., a guide RNA). A non-limiting example of a suitable stability control sequence is a transcriptional terminator segment (i.e., a transcription termination sequence). A transcriptional terminator segment of a guide RNA can have a total length of 10 nucleotides to 100 nucleotides, e.g., 10 nucleotides (nt) to 20 nt, 20 nt to 30 nt, 30 nt to 40 nt, 40 nt to 50 nt, 50 nt to 60 nt, 60 nt to 70 nt, 70 nt to 80 nt, 80 nt to 90 nt, or 90 nt to 100 nt. For example, the transcriptional terminator segment can have a length of 15 nt to 80 nt, 15 nt to 50 nt, 15 nt to 40 nt, 15 nt to 30 nt, or 15 nt to 25 nt.

In some embodiments, the transcription termination sequence is one that is functional in a eukaryotic cell. In some embodiments, the transcription termination sequence is one that is functional in a prokaryotic cell.

Nucleotide sequences that can be included in a stability control sequence (e.g., a transcriptional termination segment, or any segment of the guide RNA, to provide increased stability) include, for example, a Rho-independent trp termination site.

6.7. Ribonucleoprotein (RNP) Complexes

In some embodiments, the engineered Type V endonucleases and B-GEn polypeptides are delivered as a composition referred to as a ribonucleoprotein, or RNP, complex. An RNP complex is assembled by combining a Cas endonuclease (such as an engineered Type V or B-GEn endonuclease) with a ribonucleic acid (e.g., a guide RNA (gRNA)).

In some embodiments, the ribonucleoprotein complex comprises an engineered B-GEn endonuclease, e.g., as described in Section 6.2, complexed with a suitable ribonucleic acid. In some embodiments, the ribonucleic acid is a gRNA or an sgRNA, as further described in Section 6.6. In some embodiments, the RNP complex comprises an engineered B-GEn polypeptide and an sgRNA listed in Table 4 or another suitable sgRNA.

In some embodiments, the engineered B-GEn polypeptide and the sgRNA are present at a molar ratio of about 1:1 to about 1:4. In some embodiments, the molar ratio is about 1:1 to about 1:3. In some embodiments, the molar ratio is about 1:1 to about 1:2.5. In some embodiments, the molar ratio is about 1:1 to about 1:2. In some embodiments, the molar ratio is about 1:1 to about 1:1.5. In some embodiments, the molar ratio of polypeptide to sgRNA is 1:1.

One of the most common techniques for delivering RNPs is electroporation, which creates pores in the cell membrane and allows the RNP to enter the cytoplasm. In addition, electroporation can be combined with cell type-specific reagents, a technique referred to as nucleofection, which creates pores in the nuclear membrane and gives the RNP access to the DNA template.

In some embodiments, the engineered Type V endonuclease or B-GEn polypeptide in an RNP complex is delivered into the target cell via nucleofection.

6.8. Nucleic Acids

The present invention provides nucleic acids (e.g., DNA or RNA) encoding B-GEn Type V CRISPR-Cas proteins (e.g., engineered B-GEn polypeptides), nucleic acids encoding the gRNAs or sgRNAs of the present invention, nucleic acids encoding both an engineered B-GEn polypeptide and a gRNA or sgRNA, and pluralities of nucleic acids, e.g., comprising a nucleic acid encoding an engineered B-GEn polypeptide and a nucleic acid encoding a gRNA or sgRNA.

A nucleic acid encoding an engineered B-GEn polypeptide can be codon-optimized, e.g., such that at least one uncommon or less common codon has been replaced by a codon that is common in the host cell or target cell. For example, a codon-optimized nucleic acid can direct the synthesis of an optimized messenger RNA (mRNA), e.g., one optimized for expression in a mammalian expression system.

6.8.1. B-GEn Coding Sequences

In some embodiments, the nucleic acids described herein comprise one or more modifications that can be used, for example, to enhance activity, stability or specificity, alter delivery, reduce innate immune responses in host cells, further reduce protein size, or for other enhancements, as further described herein and known in the art. In some embodiments, such modifications will produce an engineered B-GEn polypeptide whose nuclease sequence component has at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% amino acid sequence identity to the sequence of SEQ ID NO: 4, 5, or 6.

6.8.2. Codon Optimization

In certain embodiments, modified nucleic acids are used in the CRISPR-B-GEn.1, B-GEn.1.2, or B-GEn.2 systems described herein, wherein the guide RNA and/or a DNA or RNA comprising a nucleic acid sequence encoding an engineered B-GEn polypeptide may be modified, as described below. Such modified nucleic acids may be used in the CRISPR-B-GEn.1, B-GEn.1.2, or B-GEn.2 systems to edit any one or more genomic loci. In some embodiments, such modifications in the nucleic acids of the disclosure are achieved by codon optimization (e.g., codon optimization based on the specific host cell in which the encoded polypeptide is expressed). A skilled artisan will appreciate that any nucleotide sequence and/or recombinant nucleic acid of the disclosure may be codon-optimized for expression in any species of interest. Codon optimization is well known in the art and involves modifying a nucleotide sequence for codon usage bias using species-specific codon usage tables. The codon usage tables are generated based on sequence analysis of the most highly expressed genes of the species of interest. In a non-limiting example, when the nucleotide sequence is intended to be expressed in the nucleus, the codon usage tables are generated based on sequence analysis of highly expressed nuclear genes of the species of interest. The modifications to the nucleotide sequence are determined by comparing the species-specific codon usage table with the codons present in the native nucleic acid sequence.
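The comparison of a codon usage table against the codons of a native sequence can be sketched as follows. This is a deliberately simplified illustration, not the optimization procedure used in this disclosure: the `PREFERRED` table is a toy placeholder covering four amino acids (a real workflow would use a full species-specific usage table), and the function names are hypothetical. Each codon is swapped for the "preferred" synonymous codon when one is listed, so the encoded protein is unchanged.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop codon).
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Toy "preferred codon" table: placeholder values, NOT a measured usage table.
PREFERRED = {"L": "CTG", "K": "AAG", "E": "GAG", "A": "GCC"}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence into a protein string."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> str:
    """Replace each codon with the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TO_AA[codon]
        # Amino acids missing from the table keep their native codon.
        out.append(preferred.get(amino_acid, codon))
    return "".join(out)


native = "ATGTTAAAAGAAGCTTAA"          # encodes M-L-K-E-A-stop
optimized = recode(native, PREFERRED)
print(optimized, translate(native) == translate(optimized))
```

Picking the single most frequent codon everywhere is the simplest possible strategy; commercial algorithms typically weigh additional constraints, such as the sequence motifs discussed later in this section.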

In some embodiments, the engineered B-GEn polypeptides described herein are expressed from codon-optimized nucleic acid sequences. For example, if the intended host cell or target cell is a human cell, a human codon-optimized nucleic acid sequence encoding an engineered B-GEn polypeptide comprising the amino acid sequence of B-GEn.1, B-GEn.1.2, or B-GEn.2 (or of a B-GEn.1, B-GEn.1.2, or B-GEn.2 variant, such as an enzymatically inactive variant) would be suitable. As another non-limiting example, if the intended host cell or target cell is a mouse cell, a mouse codon-optimized nucleic acid sequence encoding an engineered B-GEn polypeptide comprising the amino acid sequence of B-GEn.1, B-GEn.1.2, or B-GEn.2 (or of a B-GEn.1, B-GEn.1.2, or B-GEn.2 variant, such as an enzymatically inactive variant) would be suitable.

Strategies and methods for codon optimization are known in the art and have been described for various systems, including, but not limited to, yeast (Outchkourov et al., Protein Expr Purif, 24(1):18-24 (2002)) and E. coli (Feng et al., Biochemistry, 39(50):15399-15409 (2000)). In some embodiments, codon optimization is performed using GeneGPS® Expression Optimization Technology (ATUM) with the manufacturer's recommended expression optimization algorithm. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in human cells. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in E. coli cells. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in insect cells. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in Sf9 insect cells. In some embodiments, the expression optimization algorithm used in the codon optimization procedure is defined to avoid putative poly-A signals (e.g., AATAAA and ATTAAA) as well as long (greater than 4) stretches of A, which can lead to polymerase slippage.
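The motif constraints mentioned above (putative poly-A signals and runs of more than four A residues) can be checked with a short scan of a candidate sequence. The sketch below is only an illustrative screen with a hypothetical function name; it is not the GeneGPS® algorithm, whose internals are not described here.

```python
import re

# Putative polyadenylation signals named in the text.
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_flagged_motifs(seq: str, max_a_run: int = 4) -> list:
    """Return sorted (position, motif) pairs for poly-A signals and long A runs.

    Positions are 0-based. A run is flagged only if it is longer than max_a_run.
    """
    seq = seq.upper()
    hits = []
    for motif in POLYA_SIGNALS:
        # Lookahead so that overlapping occurrences are all reported.
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), motif))
    for match in re.finditer(r"A{%d,}" % (max_a_run + 1), seq):
        hits.append((match.start(), match.group()))
    return sorted(hits)


print(find_flagged_motifs("GCCAATAAAGGC"))   # contains a poly-A signal
print(find_flagged_motifs("GCAAAAAGC"))      # contains a run of five A residues
print(find_flagged_motifs("GCCGAGCTG"))      # nothing flagged
```

A flagged position would then be resolved by choosing a different synonymous codon at that site.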

As is well known in the art, codon optimization of a nucleotide sequence results in a nucleotide sequence that has less than 100% identity (e.g., less than 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) to the native nucleotide sequence but that still encodes a polypeptide having the same function as the polypeptide encoded by the original native nucleotide sequence. Thus, in representative embodiments of the disclosure, the nucleotide sequences and/or recombinant nucleic acids of the disclosure can be codon-optimized for expression in a particular species of interest.
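To make the identity figures concrete, the sketch below computes percent identity between two coding sequences of equal length, compared position by position. This is a simplifying assumption: real identity calculations are normally run on an alignment produced by a dedicated tool, and the two short sequences here are invented examples that encode the same peptide with different codons.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Invented example: both sequences encode M-L-K-E-A-stop.
native = "ATGTTAAAAGAAGCTTAA"
recoded = "ATGCTGAAGGAGGCCTAA"
print(round(percent_identity(native, recoded), 1))
```

Here 13 of 18 positions match, so the recoded sequence is roughly 72% identical to the native one while the encoded peptide is unchanged.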

In some embodiments, the codon-optimized nucleic acid sequence has at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.2%, 99.5%, 99.8%, 99.9% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5 or SEQ ID NO: 6. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression of the encoded engineered B-GEn polypeptide in a target cell or host cell. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in human cells. In general, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in any human cell. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in E. coli cells. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in insect cells. In general, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in any insect cell. In some embodiments, the nucleic acids of the disclosure are codon-optimized to achieve increased expression in the Sf9 insect cell expression system.

Polyadenylation signals can also be selected to optimize expression in the intended host.

6.8.3. Nucleic Acid Modifications

In some embodiments, a nucleic acid (e.g., a guide RNA; a nucleic acid comprising a nucleotide sequence encoding a guide RNA; a nucleic acid encoding a site-specific modifying enzyme such as an engineered B-GEn polypeptide of the disclosure; etc.) includes a modification or sequence that provides an additional desired feature (e.g., modified or regulated stability; subcellular targeting; tracking, such as a fluorescent label; a binding site for a protein or protein complex; etc.). Non-limiting examples include: a 5' cap (e.g., a 7-methylguanylate cap (m7G)); a 3' polyadenylated tail (e.g., a 3' poly(A) tail); a riboswitch sequence (e.g., to allow for regulated stability and/or regulated accessibility by proteins and/or protein complexes); a stability control sequence; a sequence that forms a dsRNA duplex (e.g., a hairpin); a modification or sequence that targets the RNA to a subcellular location (e.g., nucleus, mitochondria, chloroplasts, and the like); a modification or sequence that provides for tracking (e.g., direct conjugation to a fluorescent molecule, conjugation to a moiety that facilitates fluorescent detection, a sequence that allows for fluorescent detection, etc.); a modification or sequence that provides a binding site for proteins (e.g., proteins that act on DNA, including transcriptional activators, transcriptional repressors, DNA methyltransferases, DNA demethylases, histone acetyltransferases, histone deacetylases, and the like); and combinations thereof.

In some embodiments, the guide RNA includes an additional segment at the 5' or 3' end that provides any of the features described above. For example, a suitable third segment may include a 5' cap (e.g., a 7-methylguanylate cap (m7G)); a 3' polyadenylated tail (e.g., a 3' poly(A) tail); a riboswitch sequence (e.g., to allow for regulated stability and/or regulated accessibility by proteins and protein complexes); a stability control sequence; a sequence that forms a dsRNA duplex (e.g., a hairpin); a sequence that targets the RNA to a subcellular location (e.g., nucleus, mitochondria, chloroplasts, and the like); a modification or sequence that provides for tracking (e.g., direct conjugation to a fluorescent molecule, conjugation to a moiety that facilitates fluorescent detection, a sequence that allows for fluorescent detection, etc.); a modification or sequence that provides a binding site for proteins (e.g., proteins that act on DNA, including transcriptional activators, transcriptional repressors, DNA methyltransferases, DNA demethylases, histone acetyltransferases, histone deacetylases, and the like); and combinations thereof.

Modifications may also or alternatively be used to reduce the likelihood or degree to which an RNA introduced into a cell elicits an innate immune response. Such responses, which have been well characterized in the context of RNA interference (RNAi), including small interfering RNAs (siRNAs), as described below and in the art, tend to be associated with a reduced half-life of the RNA and/or the elicitation of cytokines or other factors associated with immune responses.

One or more types of modifications may also be made to an RNA encoding an engineered B-GEn polypeptide that is introduced into a cell, including, but not limited to, modifications that enhance the stability of the RNA (such as by reducing its degradation by RNases present in the cell), modifications that enhance translation of the resulting product (e.g., the endonuclease), and/or modifications that reduce the likelihood or degree to which the RNA introduced into the cell elicits an innate immune response. Combinations of modifications, such as the foregoing and others, may likewise be used. In the case of an engineered B-GEn polypeptide, for example, one or more types of modifications may be made to the guide RNA (including those listed above), and/or one or more types of modifications may be made to the RNA encoding the engineered B-GEn polypeptide (including those listed above).

By way of illustration, guide RNAs or other smaller RNAs used in the CRISPR-B-GEn system can be readily synthesized by chemical means, which allows many modifications to be readily incorporated, as illustrated below and described in the art. Although chemical synthesis procedures are continually expanding, purification of such RNAs by procedures such as high performance liquid chromatography (HPLC, which avoids the use of gels such as PAGE) tends to become more challenging as nucleic acid lengths increase significantly beyond a hundred or so nucleotides. One approach for generating chemically modified RNAs of greater length is to produce two or more molecules that are ligated together. Much longer RNAs, such as those encoding a B-GEn.1, B-GEn.1.2, or B-GEn.2 endonuclease, are more readily generated enzymatically. Although fewer types of modifications are generally available for use in enzymatically produced RNAs, there are still modifications that can be used, for example, to enhance stability, reduce the likelihood or degree of an innate immune response, and/or enhance other attributes, as described further below and in the art; and new types of modifications are regularly being developed.

By way of illustration of the various types of modifications, especially those frequently used with smaller chemically synthesized RNAs, modifications may include one or more nucleotides modified at the 2' position of the sugar, in some embodiments a 2'-O-alkyl, 2'-O-alkyl-O-alkyl, or 2'-fluoro-modified nucleotide. In some embodiments, RNA modifications include 2'-fluoro, 2'-amino and 2'-O-methyl modifications on the ribose of pyrimidines, abasic residues, or an inverted base at the 3' end of the RNA. Such modifications are routinely incorporated into oligonucleotides, and these oligonucleotides have been shown to have a higher Tm (i.e., higher target binding affinity) than 2'-deoxy oligonucleotides against a given target.

Many nucleotide and nucleoside modifications have been shown to make the oligonucleotide into which they are incorporated more resistant to nuclease digestion than the native oligonucleotide; these modified oligonucleotides survive intact for longer than unmodified oligonucleotides. Specific examples of modified oligonucleotides include those comprising modified backbones, for example phosphorothioates, phosphotriesters, methyl phosphonates, short-chain alkyl or cycloalkyl intersugar linkages, or short-chain heteroatomic or heterocyclic intersugar linkages. Some oligonucleotides are oligonucleotides with phosphorothioate backbones and those with heteroatom backbones, particularly CH2-NH-O-CH2, CH2-N(CH3)-O-CH2 (known as a methylene(methylimino) or MMI backbone), CH2-O-N(CH3)-CH2, CH2-N(CH3)-N(CH3)-CH2 and O-N(CH3)-CH2-CH2 backbones; amide backbones (see De Mesmaeker et al., 1995, Acc. Chem. Res., 28:366-374); morpholino backbone structures (see Summerton and Weller, U.S. Patent No. 5,034,506); and peptide nucleic acid (PNA) backbones (in which the phosphodiester backbone of the oligonucleotide is replaced with a polyamide backbone, the nucleotides being bound directly or indirectly to the aza nitrogen atoms of the polyamide backbone; see Nielsen et al., 1991, Science 254:1497). Phosphorus-containing linkages include, but are not limited to, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates, including 3'-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates, including 3'-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs of these, and those having inverted polarity, wherein the adjacent pairs of nucleoside units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'; see U.S. Patent Nos. 3,687,808; 4,469,863; 4,476,301; 5,023,243; 5,177,196; 5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131; 5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677; 5,476,925; 5,519,126; 5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799; 5,587,361; and 5,625,050.

Morpholino-based oligomeric compounds are described in Braasch and Corey, Biochemistry, 41(14): 4503-4510 (2002); Genesis, Volume 30, Issue 3 (2001); Heasman, Dev. Biol., 243: 209-214 (2002); Nasevicius et al., Nat. Genet., 26: 216-220 (2000); Lacerra et al., Proc. Natl. Acad. Sci., 97: 9591-9596 (2000); and U.S. Patent No. 5,034,506, issued July 23, 1991. Cyclohexenyl nucleic acid oligonucleotide mimetics are described in Wang et al., J. Am. Chem. Soc., 122: 8595-8602 (2000).

Modified oligonucleotide backbones that do not include a phosphorus atom have backbones formed by short-chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short-chain heteroatomic or heterocyclic internucleoside linkages. These include those having morpholino linkages (formed in part from the sugar portion of a nucleoside); siloxane backbones; sulfide, sulfoxide and sulfone backbones; fumaryl and thiofumaryl backbones; methylenefumaryl and thiofumaryl backbones; alkene-containing backbones; sulfamate backbones; methyleneimino and methylenenitrile backbones; sulfonate and sulfonamide backbones; amide backbones; and others having mixed N, O, S and CH2 component parts; see U.S. Patent Nos. 5,034,506; 5,166,315; 5,185,444; 5,214,134; 5,216,141; 5,235,033; 5,264,562; 5,264,564; 5,405,938; 5,434,257; 5,466,677; 5,470,967; 5,489,677; 5,541,307; 5,561,225; 5,596,086; 5,602,240; 5,610,289; 5,608,046; 5,618,704; 5,623,070; 5,663,312; 5,633,360; 5,677,437; and 5,677,439, each of which is incorporated herein by reference.

One or more substituted sugar moieties may also be included, e.g., one of the following at the 2' position: OH, SH, SCH3, F, OCN, OCH3, OCH3O(CH2)nCH3, O(CH2)nNH2, or O(CH2)nCH3, where n is from 1 to 10; C1 to C10 lower alkyl, alkoxyalkoxy, substituted lower alkyl, alkaryl or aralkyl; Cl; Br; CN; CF3; OCF3; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; SOCH3; SO2CH3; ONO2; NO2; N3; NH2; heterocycloalkyl; heterocycloalkaryl; aminoalkylamino; polyalkylamino; substituted silyl; an RNA cleaving group; a reporter group; an intercalator; a group for improving the pharmacokinetic properties of an oligonucleotide; or a group for improving the pharmacokinetic properties of an oligonucleotide and other substituents having similar properties. In some embodiments, a modification includes 2'-methoxyethoxy (2'-O-CH2CH2OCH3, also known as 2'-O-(2-methoxyethyl)) (Martin et al., Helv. Chim. Acta, 1995, 78, 486). Other modifications include 2'-methoxy (2'-O-CH3), 2'-propoxy (2'-OCH2CH2CH3) and 2'-fluoro (2'-F). Similar modifications may also be made at other positions on the oligonucleotide, particularly the 3' position of the sugar on the 3' terminal nucleotide and the 5' position of the 5' terminal nucleotide. Oligonucleotides may also have sugar mimetics, such as cyclobutyls, in place of the pentofuranosyl group. In some embodiments, both the sugar and the internucleoside linkage (i.e., the backbone) of the nucleotide units are replaced with novel groups. The base units are maintained for hybridization with an appropriate nucleic acid target compound. One such oligomeric compound, an oligonucleotide mimetic that has been shown to have excellent hybridization properties, is referred to as a peptide nucleic acid (PNA). In PNA compounds, the sugar backbone of an oligonucleotide is replaced with an amide-containing backbone, for example an aminoethylglycine backbone. The nucleobases are retained and are bound directly or indirectly to the aza nitrogen atoms of the amide portion of the backbone. Representative U.S. patents that teach the preparation of PNA compounds include, but are not limited to, U.S. Patent Nos. 5,539,082; 5,714,331; and 5,719,262. Further teaching of PNA compounds can be found in Nielsen et al., Science, 254: 1497-1500 (1991).

Guide RNAs may also include, additionally or alternatively, nucleobase (often referred to in the art simply as "base") modifications or substitutions. As used herein, "unmodified" or "natural" nucleobases include adenine (A), guanine (G), thymine (T), cytosine (C), and uracil (U). Modified nucleobases include nucleobases found only infrequently or transiently in natural nucleic acids, e.g., hypoxanthine, 6-methyladenine, 5-Me pyrimidines, particularly 5-methylcytosine (also referred to as 5-methyl-2'-deoxycytosine and often referred to in the art as 5-Me-C), 5-hydroxymethylcytosine (HMC), glycosyl HMC and gentiobiosyl HMC, as well as synthetic nucleobases, e.g., 2-aminoadenine, 2-(methylamino)adenine, 2-(imidazolylalkyl)adenine, 2-(aminoalkylamino)adenine or other heterosubstituted alkyladenines, 2-thiouracil, 2-thiothymine, 5-bromouracil, 5-hydroxymethyluracil, 8-azaguanine, 7-deazaguanine, N6-(6-aminohexyl)adenine, and 2,6-diaminopurine. See Kornberg, A., DNA Replication, W. H. Freeman & Co., San Francisco, pp. 75-77 (1980); Gebeyehu et al., Nucl. Acids Res. 15:4513 (1997). A "universal" base known in the art, e.g., inosine, may also be included. 5-Me-C substitutions have been shown to increase nucleic acid duplex stability by 0.6 to 1.2 °C (Sanghvi, Y. S., Crooke, S. T., and Lebleu, B., eds., Antisense Research and Applications, CRC Press, Boca Raton, 1993, pp. 276-278) and are embodiments of base substitutions.

Modified nucleobases include other synthetic and natural nucleobases, such as 5-methylcytosine (5-me-C), 5-hydroxymethylcytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo (particularly 5-bromo), 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine and 7-methyladenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine and 7-deazaadenine, and 3-deazaguanine and 3-deazaadenine.

Other useful nucleobases include those disclosed in U.S. Patent No. 3,687,808; those disclosed in "The Concise Encyclopedia of Polymer Science And Engineering", pages 858-859, Kroschwitz, J.I., ed., John Wiley & Sons, 1990; those disclosed by Englisch et al., Angewandte Chemie, International Edition, 1991, 30, page 613; and those disclosed by Sanghvi, Y. S., Chapter 15, Antisense Research and Applications, pages 289-302, Crooke, S.T. and Lebleu, B., eds., CRC Press, 1993. Certain of these nucleobases are particularly useful for increasing the binding affinity of the oligomeric compounds of the disclosure. These include 5-substituted pyrimidines, 6-azapyrimidines, and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine. 5-Methylcytosine substitutions have been shown to increase nucleic acid duplex stability by 0.6 to 1.2 °C (Sanghvi, Y.S., Crooke, S.T., and Lebleu, B., eds., "Antisense Research and Applications", CRC Press, Boca Raton, 1993, pp. 276-278) and are embodiments of base substitutions, even more particularly when combined with 2'-O-methoxyethyl sugar modifications. Modified nucleobases are described in U.S. Patent Nos. 3,687,808; 4,845,205; 5,130,302; 5,134,066; 5,175,273; 5,367,066; 5,432,272; 5,457,187; 5,459,255; 5,484,908; 5,502,177; 5,525,711; 5,552,540; 5,587,469; 5,596,091; 5,614,617; 5,681,941; 5,750,692; 5,763,588; 5,830,653; and 6,005,096; and in U.S. Patent Application Publication No. 20030158403.

It is not necessary for all positions in a given oligonucleotide to be uniformly modified, and in fact more than one of the aforementioned modifications may be incorporated in a single oligonucleotide, or even in a single nucleoside within an oligonucleotide.

In some embodiments, the guide RNA and/or the mRNA encoding an endonuclease of the disclosure (such as B-GEn.1, B-GEn.1.2, or B-GEn.2) is capped using any of the current capping methods (such as mCAP, ARCA, or enzymatic capping methods) to create a viable mRNA construct that retains biological activity and avoids self/non-self intracellular responses. In some embodiments, the guide RNA and/or the mRNA encoding an endonuclease of the disclosure (such as B-GEn.1, B-GEn.1.2, or B-GEn.2) is capped using the CleanCap™ (TriLink) co-transcriptional capping method.

In some embodiments, the guide RNA and/or the mRNA encoding an endonuclease of the disclosure includes one or more modifications selected from the group consisting of pseudouridine, N1-methylpseudouridine, and 5-methoxyuridine. In some embodiments, one or more N1-methylpseudouridines are incorporated into the guide RNA and/or the mRNA encoding an endonuclease of the disclosure to provide enhanced RNA stability and/or protein expression and reduced immunogenicity in animal cells, such as mammalian cells (e.g., human and mouse). In some embodiments, the N1-methylpseudouridine modifications are incorporated in combination with one or more 5-methylcytidines.

In some embodiments, the guide RNA and/or the mRNA (or DNA) encoding an endonuclease (such as B-GEn.1, B-GEn.1.2, or B-GEn.2) is chemically linked to one or more moieties or conjugates that enhance the activity, cellular distribution, or cellular uptake of the oligonucleotide. Such moieties include, but are not limited to, lipid moieties such as cholesterol moieties (Letsinger et al., 1989, Proc. Natl. Acad. Sci. USA 86:6553-6556); cholic acid (Manoharan et al., 1994, Bioorg. Med. Chem. Let. 4:1053-1060); thioethers, for example, hexyl-S-tritylthiol (Manoharan et al., 1992, Ann. N.Y. Acad. Sci. 660:306-309 and Manoharan et al., 1993, Bioorg. Med. Chem. Let. 3:2765-2770); thiocholesterol (Oberhauser et al., 1992, Nucl. Acids Res. 20:533-538); aliphatic chains, for example, dodecanediol or undecyl residues (Kabanov et al., 1990, FEBS Lett. 259:327-330 and Svinarchuk et al., 1993, Biochimie 75:49-54); phospholipids, for example, di-hexadecyl-rac-glycerol or triethylammonium 1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate (Manoharan et al., 1995, Tetrahedron Lett. 36:3651-3654 and Shea et al., 1990, Nucl. Acids Res. 18:3777-3783); polyamine or polyethylene glycol chains (Manoharan et al., 1995, Nucleosides & Nucleotides 14:969-973); adamantane acetic acid (Manoharan et al., 1995, Tetrahedron Lett. 36:3651-3654); a palmityl moiety (Mishra et al., 1995, Biochim. Biophys. Acta 1264:229-237); or an octadecylamine or hexylamino-carbonyl-oxycholesterol moiety (Crooke et al., 1996, J. Pharmacol. Exp. Ther. 277:923-937). See also U.S. Patent Nos. 4,828,979; 4,948,882; 5,218,105; 5,525,465; 5,541,313; 5,545,730; 5,552,538; 5,578,717; 5,580,731; 5,591,584; 5,109,124; 5,118,802; 5,138,045; 5,414,077; 5,486,603; 5,512,439; 5,578,718; 5,608,046; 4,587,044; 4,605,735; 4,667,025; 4,762,779; 4,789,737; 4,824,941; 4,835,263; 4,876,335; 4,904,582; 4,958,013; 5,082,830; 5,112,963; 5,214,136; 5,245,022; 5,254,469; 5,258,506; 5,262,536; 5,272,250; 5,292,873; 5,317,098; 5,371,241; 5,391,723; 5,416,203; 5,451,463; 5,510,475; 5,512,667; 5,514,785; 5,565,552; 5,567,810; 5,574,142; 5,585,481; 5,587,371; 5,595,726; 5,597,696; 5,599,923; 5,599,928; and 5,688,941.

Sugars and other moieties can be used to target proteins and complexes (including nucleotides, such as cationic polysomes and liposomes) to specific sites. For example, hepatocyte-directed transfer can be mediated via the asialoglycoprotein receptor (ASGPR); see, e.g., Hu et al., 2014, Protein Pept Lett. 21(10):1025-30. Other systems known in the art and regularly developed can be used to target the biomolecules used in the present case, and/or complexes thereof, to specific target cells of interest.

Such targeting moieties or conjugates may include conjugate groups covalently bound to functional groups such as primary or secondary hydroxyl groups. Suitable conjugate groups include intercalators, reporter molecules, polyamines, polyamides, polyethylene glycols, polyethers, groups that enhance the pharmacodynamic properties of oligomers, and groups that enhance the pharmacokinetic properties of oligomers. Typical conjugate groups include cholesterols, lipids, phospholipids, biotin, phenazine, folate, phenanthridine, anthraquinone, acridine, fluoresceins, rhodamines, coumarins, and dyes. Groups that enhance pharmacodynamic properties include groups that improve uptake, enhance resistance to degradation, and/or strengthen sequence-specific hybridization with the target nucleic acid. Groups that enhance pharmacokinetic properties include groups that improve the uptake, distribution, metabolism, or excretion of the compounds of the present invention. Representative conjugate groups are disclosed in International Patent Application No. PCT/US92/09196, filed October 23, 1992, and U.S. Patent No. 6,287,860, which are incorporated herein by reference. Conjugate moieties include, but are not limited to, lipid moieties such as a cholesterol moiety, cholic acid, a thioether (e.g., hexyl-S-tritylthiol), a thiocholesterol, an aliphatic chain (e.g., dodecanediol or undecyl residues), a phospholipid (e.g., di-hexadecyl-rac-glycerol or triethylammonium 1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate), a polyamine or polyethylene glycol chain, adamantane acetic acid, a palmityl moiety, or an octadecylamine or hexylamino-carbonyl-oxycholesterol moiety. See, e.g., U.S. Patent Nos. 4,828,979; 4,948,882; 5,218,105; 5,525,465; 5,541,313; 5,545,730; 5,552,538; 5,578,717; 5,580,731; 5,591,584; 5,109,124; 5,118,802; 5,138,045; 5,414,077; 5,486,603; 5,512,439; 5,578,718; 5,608,046; 4,587,044; 4,605,735; 4,667,025; 4,762,779; 4,789,737; 4,824,941; 4,835,263; 4,876,335; 4,904,582; 4,958,013; 5,082,830; 5,112,963; 5,214,136; 5,245,022; 5,254,469; 5,258,506; 5,262,536; 5,272,250; 5,292,873; 5,317,098; 5,371,241; 5,391,723; 5,416,203; 5,451,463; 5,510,475; 5,512,667; 5,514,785; 5,565,552; 5,567,810; 5,574,142; 5,585,481; 5,587,371; 5,595,726; 5,597,696; 5,599,923; 5,599,928; and 5,688,941.

Longer nucleic acids, which are less amenable to chemical synthesis and are generally produced by enzymatic synthesis, can also be modified by various means. Such modifications may include, for example, the introduction of certain nucleotide analogs, the incorporation of particular sequences or other moieties at the 5' or 3' end of the molecule, and other modifications. By way of illustration, the mRNA encoding B-GEn.1, B-GEn.1.2, or B-GEn.2 is approximately 4 kb in length and can be synthesized by in vitro transcription. Modifications to the mRNA can be applied, for example, to increase its translation or stability (such as by increasing its resistance to degradation within a cell), or to reduce the tendency of the RNA to elicit an innate immune response, which is often observed in cells after the introduction of exogenous RNA (particularly longer RNA, such as that encoding B-GEn.1 or B-GEn.2).

Many such modifications have been described in the art, such as poly(A) tails, 5' cap analogs (e.g., Anti-Reverse Cap Analog (ARCA) or m7G(5')ppp(5')G (mCAP)), modified 5' or 3' untranslated regions (UTRs), the use of modified bases (such as pseudo-UTP, 2-thio-UTP, 5-methylcytidine-5'-triphosphate (5-methyl-CTP), or N6-methyl-ATP), or treatment with phosphatase to remove the 5' terminal phosphate. These and other modifications are known in the art, and novel modifications of RNA are regularly developed.

There are many commercial suppliers of modified RNA, including, for example, TriLink Biotech, Axolabs, Bio-Synthesis Inc., Dharmacon, and many others. As described by TriLink, for example, 5-methyl-CTP can be used to impart desirable characteristics, such as increased nuclease stability, increased translation, or reduced interaction of innate immune receptors with in vitro transcribed RNA. 5-Methylcytidine-5'-triphosphate (5-methyl-CTP), N6-methyl-ATP, pseudo-UTP, and 2-thio-UTP have also been shown to reduce innate immune stimulation in culture and in vivo while enhancing translation, as illustrated in the publications by Kormann et al. and Warren et al. referred to below.

It has been shown that chemically modified mRNA delivered in vivo can be used to achieve improved therapeutic effects; see, e.g., Kormann et al., Nature Biotechnology 29, 154-157 (2011). Such modifications can be used, for example, to increase the stability of the RNA molecule and/or reduce its immunogenicity. Using chemical modifications such as pseudo-U, N6-methyl-A, 2-thio-U, and 5-methyl-C, it was found that substituting just one quarter of the uridine and cytidine residues with 2-thio-U and 5-methyl-C, respectively, resulted in a significant decrease in Toll-like receptor (TLR)-mediated recognition of the mRNA in mice. By reducing activation of the innate immune system, these modifications can therefore be used to effectively increase the stability and longevity of the mRNA in vivo; see, e.g., Kormann et al., supra.
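The one-quarter substitution described above can be illustrated with a short calculation. The following Python sketch splits a nucleotide triphosphate pool for in vitro transcription between unmodified and modified nucleotides. The function name `split_ntp`, the 8.0 mM total per nucleotide, and the assumption that incorporation simply follows the feed ratio are all illustrative assumptions; none of them comes from this disclosure or from Kormann et al.

```python
def split_ntp(total_mm: float, modified_fraction: float) -> tuple[float, float]:
    """Split one NTP pool into (unmodified_mM, modified_mM).

    Assumption: the polymerase incorporates modified and unmodified
    nucleotides in proportion to their feed ratio. Real reactions may differ.
    """
    if not 0.0 <= modified_fraction <= 1.0:
        raise ValueError("modified_fraction must be between 0 and 1")
    modified = total_mm * modified_fraction
    return total_mm - modified, modified


TOTAL_MM = 8.0   # hypothetical total concentration per nucleotide
QUARTER = 0.25   # "one quarter" of uridine and cytidine residues

ntp_mix = {
    "UTP (modified form: 2-thio-UTP)": split_ntp(TOTAL_MM, QUARTER),
    "CTP (modified form: 5-methyl-CTP)": split_ntp(TOTAL_MM, QUARTER),
    "ATP": split_ntp(TOTAL_MM, 0.0),  # left unmodified in this sketch
    "GTP": split_ntp(TOTAL_MM, 0.0),  # left unmodified in this sketch
}

if __name__ == "__main__":
    for name, (plain, modified) in ntp_mix.items():
        print(f"{name}: {plain:.1f} mM unmodified, {modified:.1f} mM modified")
```

The sketch only makes the 25% ratio concrete; actual reaction conditions would need to be determined empirically.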

It has also been shown that repeated administration of synthetic messenger RNAs incorporating modifications designed to bypass innate antiviral responses can reprogram differentiated human cells to pluripotency. See, e.g., Warren et al., Cell Stem Cell, 7(5):618-30 (2010). Such modified mRNAs, which act as primary reprogramming proteins, can be an efficient means of reprogramming multiple human cell types. Such cells are referred to as induced pluripotent stem cells (iPSCs), and it was found that enzymatically synthesized RNA incorporating 5-methyl-CTP, pseudo-UTP, and an Anti-Reverse Cap Analog (ARCA) could be used to effectively evade the cell's antiviral response; see, e.g., Warren et al., supra. Other modifications of nucleic acids described in the art include, for example, the use of poly(A) tails, the addition of 5' cap analogs (such as m7G(5')ppp(5')G (mCAP)), modification of the 5' or 3' untranslated regions (UTRs), or treatment with phosphatase to remove the 5' terminal phosphate, and new approaches are regularly developed.

A number of compositions and techniques applicable to generating modified RNAs for use herein have been developed in connection with the modification of RNA interference (RNAi), including small interfering RNAs (siRNAs). siRNAs present particular challenges in vivo because their effects on gene silencing via mRNA interference are generally transient, which may require repeated administration. In addition, siRNAs are double-stranded RNAs (dsRNA), and mammalian cells have immune responses that have evolved to detect and neutralize dsRNA, which is often a byproduct of viral infection. Thus, there are mammalian enzymes such as PKR (dsRNA-responsive kinase), and potentially retinoic acid-inducible gene I (RIG-I), that can mediate cellular responses to dsRNA, as well as Toll-like receptors (such as TLR3, TLR7, and TLR8) that can trigger the induction of cytokines in response to such molecules; see, e.g., the reviews by Angart et al., Pharmaceuticals (Basel) 6(4):440-468 (2013); Kanasty et al., Molecular Therapy 20(3):513-524 (2012); Burnett et al., Biotechnol J. 6(9):1130-46 (2011); Judge and Maclachlan, Hum Gene Ther 19(2):111-24 (2008); and references cited therein.

A variety of modifications have been developed and applied to enhance RNA stability, reduce innate immune responses, and/or achieve other benefits that can be useful in connection with the introduction of nucleic acids into human cells as described herein; see, e.g., the reviews by Whitehead KA et al., Annual Review of Chemical and Biomolecular Engineering, 2:77-96 (2011); Gaglione and Messere, Mini Rev Med Chem, 10(7):578-95 (2010); Chernolovskaya et al., Curr Opin Mol Ther., 12(2):158-67 (2010); Deleavey et al., Curr Protoc Nucleic Acid Chem, Chapter 16:Unit 16.3 (2009); Behlke, Oligonucleotides 18(4):305-19 (2008); Fucini et al., Nucleic Acid Ther 22(3):205-210 (2012); and Bremsen et al., Front Genet 3:154 (2012).

As noted above, there are many commercial suppliers of modified RNA, many of which have specialized in modifications designed to improve the effectiveness of siRNAs. A variety of approaches are offered based on various findings reported in the literature. For example, Dharmacon notes that replacement of a non-bridging oxygen with sulfur (phosphorothioate, PS) has been used extensively to improve the nuclease resistance of siRNAs, as reported by Kale, Nature Reviews Drug Discovery 11:125-140 (2012). Modifications of the 2' position of the ribose have been reported to improve the nuclease resistance of the internucleotide phosphate bond while increasing duplex stability (Tm), which has also been shown to provide protection from immune activation. A combination of moderate PS backbone modifications with small, well-tolerated 2'-substitutions (2'-O-methyl, 2'-fluoro, 2'-hydro) has been associated with highly stable siRNAs for in vivo applications, as reported by Soutschek et al., Nature 432:173-178 (2004); and 2'-O-methyl modifications have been reported to be effective in improving stability, as reported by Volkov, Oligonucleotides 19:191-202 (2009). With respect to reducing the induction of innate immune responses, modifying specific sequences with 2'-O-methyl, 2'-fluoro, or 2'-hydro has been reported to reduce TLR7/TLR8 interaction while generally preserving silencing activity; see, e.g., Judge et al., Mol. Ther. 13:494-505 (2006); and Cekaite et al., J. Mol. Biol. 365:90-108 (2007). Additional modifications (such as 2-thiouracil, pseudouracil, 5-methylcytosine, 5-methyluracil, and N6-methyladenosine) have also been shown to minimize the immune effects mediated by TLR3, TLR7, and TLR8; see, e.g., Kariko et al., Immunity 23:165-175 (2005).

As is also known in the art, and commercially available, a number of conjugates can be applied to nucleic acids, such as the RNA used herein, to enhance their delivery and/or cellular uptake, including, for example, cholesterol, tocopherol and folic acid, lipids, peptides, polymers, linkers, and aptamers; see, e.g., the review by Winkler, Ther. Deliv. 4:791-809 (2013), and references cited therein.

6.9. Vectors

The present invention provides vectors comprising a nucleic acid of the present invention (e.g., as described in Section 6.8). In some embodiments, the nucleic acid comprises a nucleic acid encoding an engineered B-GEn polypeptide as described in Section 6.2. In some embodiments, the engineered B-GEn polypeptide coding sequence is codon-optimized, at least with respect to the portion encoding the nuclease component of the engineered B-GEn polypeptide.

The vector (or nucleotide sequence) may further encode a gRNA.

In some embodiments, the vector comprising the nucleotide sequence may be an expression vector.

In some embodiments, the expression vector is a production vector for an engineered B-GEn polypeptide; for example, it can be used for expression/production of the engineered B-GEn polypeptide in a host cell. Following expression/production in the host cell, the engineered B-GEn polypeptide can be incorporated into an RNP for nucleofection of a target cell.

Alternatively, the expression vector comprising the nucleotide sequence can be a delivery vector for an engineered B-GEn polypeptide; for example, it can be used to introduce the engineered B-GEn polypeptide coding sequence into a target cell intended for gene editing. Following expression/production of the engineered B-GEn polypeptide in the target cell, the engineered B-GEn polypeptide together with a guide RNA molecule is able to edit the target cell. In some embodiments, the delivery vector further includes a coding sequence for the gRNA. In other embodiments, a separate nucleic acid encoding the gRNA is introduced into the target cell.

Contemplated expression vectors include, but are not limited to, viral vectors based on vaccinia virus, poliovirus, adenovirus, adeno-associated virus, SV40, herpes simplex virus, human immunodeficiency virus, or retroviruses (e.g., murine leukemia virus, spleen necrosis virus, and vectors derived from retroviruses such as Rous sarcoma virus, Harvey sarcoma virus, avian leukosis virus, a lentivirus, human immunodeficiency virus, myeloproliferative sarcoma virus, and mammary tumor virus), and other recombinant vectors. Other vectors contemplated for eukaryotic target cells include, but are not limited to, the vectors pXT1, pSG5, pSVK3, pBPV, pMSG, and pSVLSV40 (Pharmacia). Additional vectors contemplated for eukaryotic cells include, but are not limited to, the vectors pCTx-1, pCTx-2, and pCTx-3. Other vectors may be used as long as they are compatible with the intended host or target cell.

In some embodiments, the expression vector has one or more transcriptional and/or translational control elements. Depending on the expression cell/vector system used, any of a number of suitable transcriptional and translational control elements (including constitutive and inducible promoters, transcriptional enhancer elements, transcriptional terminators, etc.) can be used in the vector. The vector may also contain a ribosome binding site for translation initiation and a transcriptional terminator.

Non-limiting examples of suitable eukaryotic promoters (i.e., promoters functional in a eukaryotic cell) include those from cytomegalovirus (CMV) immediate early, herpes simplex virus (HSV) thymidine kinase, early and late SV40, long terminal repeats (LTRs) from retroviruses, the human elongation factor-1 promoter (EF1), a hybrid construct comprising the cytomegalovirus (CMV) enhancer fused to the chicken β-actin promoter (CAG), the murine stem cell virus promoter (MSCV), the phosphoglycerate kinase-1 locus promoter (PGK), and mouse metallothionein-I.

In some embodiments, the promoter is an inducible promoter (e.g., a heat shock promoter, a tetracycline-regulated promoter, a steroid-regulated promoter, a metal-regulated promoter, an estrogen receptor-regulated promoter, etc.). In some embodiments, the promoter is a constitutive promoter (e.g., a CMV promoter or a UBC promoter). In some embodiments, the promoter is a spatially restricted and/or temporally restricted promoter (e.g., a tissue-specific promoter, a cell type-specific promoter, etc.). In some embodiments, the vector does not have a promoter for at least one gene intended to be expressed in the host cell, if that gene is to be expressed, after its insertion into the genome, under an endogenous promoter present in the genome.

For expressing small RNAs, including guide RNAs, various promoters, such as RNA polymerase III promoters (including, for example, U6 and H1), can be advantageous. Such promoters can therefore be advantageously incorporated into a delivery vector. Descriptions of, and parameters for, enhancing the use of such promoters are known in the art, and additional information and approaches are regularly described; see, e.g., Ma, H. et al., Molecular Therapy - Nucleic Acids 3, e161 (2014) doi:10.1038/mtna.2014.12.
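The promoter guidance above can be summarized as a small lookup. The Python sketch below pairs each payload type with promoters named in this section: RNA polymerase III promoters (U6, H1) for guide RNAs, and a subset of the eukaryotic promoters listed earlier for the polypeptide. The `Cassette` class, the `build_cassette` helper, the payload labels, and the choice of the first listed promoter as a default are hypothetical conveniences for illustration; they are not part of the disclosure and do not imply that other promoters are unsuitable.

```python
from dataclasses import dataclass

# Promoter names are taken from the surrounding text; grouping them into a
# lookup table keyed by payload kind is an illustrative assumption.
PROMOTER_SUGGESTIONS = {
    "guide_rna": ("U6", "H1"),                        # RNA polymerase III promoters
    "protein": ("CMV", "EF1", "CAG", "MSCV", "PGK"),  # eukaryotic promoters listed above
}


@dataclass(frozen=True)
class Cassette:
    payload_name: str
    payload_kind: str
    promoter: str


def build_cassette(payload_name, payload_kind, promoter=None):
    """Pair a payload with a promoter, defaulting to the first suggestion."""
    try:
        candidates = PROMOTER_SUGGESTIONS[payload_kind]
    except KeyError:
        raise ValueError(f"unknown payload kind: {payload_kind!r}") from None
    return Cassette(payload_name, payload_kind, promoter or candidates[0])


# A hypothetical delivery vector carrying both the nuclease and the gRNA.
delivery_vector = [
    build_cassette("engineered B-GEn polypeptide", "protein"),
    build_cassette("gRNA", "guide_rna"),
]

if __name__ == "__main__":
    for cassette in delivery_vector:
        print(f"{cassette.promoter} -> {cassette.payload_name}")
```

Passing an explicit `promoter` argument overrides the default, which leaves room for inducible or tissue-specific promoters of the kind described above.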

In some embodiments, the vector is a self-inactivating vector that inactivates viral sequences, components of the CRISPR machinery, or other elements. Self-inactivating vectors are particularly useful as delivery vectors, for selecting cells that retain the engineered B-GEn polypeptide coding sequence after gene editing is complete.

In some embodiments, the expression vectors are RNA vectors. In other embodiments, the expression vectors are DNA vectors.

6.9.1. RNA vectors

The expression vector of the present invention may be an RNA vector.

Particularly suitable vectors are viral replicons based on RNA viruses, such as alphaviruses and paramyxoviruses. Alphavirus and paramyxovirus replicons do not involve a DNA intermediate for replication and therefore provide a safer alternative to several other commonly used viral vectors, including lentiviral and retroviral vectors (Yoshioka et al., 2013, Cell Stem Cell. 13(2):246-54; Yoshioka and Dowdy, 2017, PLOS ONE 12:e0182018). Alphaviruses are lipid-enveloped, positive-sense RNA viruses that make up a genus of more than 30 viruses in the Togaviridae family, including Eastern, Western, and Venezuelan equine encephalitis viruses (EEEV, WEEV, and VEEV, respectively), chikungunya (CHIK), Sindbis, Ross River, and O'nyong-nyong viruses, among others. Sendai viruses (SeV) are enveloped, single-stranded, negative-sense paramyxoviruses that replicate episomally in the host cell cytoplasm.

Thus, in some embodiments, the RNA vector is derived from an RNA virus, such as an alphavirus, a paramyxovirus, a flavivirus, a rhabdovirus, a measles virus, or a picornavirus.

In some embodiments, the RNA vector is a single-stranded RNA replicon. In some embodiments, the single-stranded RNA replicon is positive-stranded. In some other embodiments, the single-stranded RNA replicon is negative-stranded. In some embodiments, the RNA vector comprises one or more coding sequences for one or more engineered B-GEn polypeptides, and a self-replicating element.

The RNA replicons of the present invention typically include a regulatory element, namely a subgenomic (SG) promoter, operably linked to the engineered B-GEn polypeptide coding sequence. The sequence comprising the engineered B-GEn polypeptide coding sequence is typically flanked by 5' and 3' UTR sequences, and the 3' UTR sequence is typically followed by a polyadenylation signal.
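The arrangement just described can be expressed as a set of ordering constraints. The Python sketch below checks a proposed 5'-to-3' element order against them: the coding sequence lies between the 5' and 3' UTRs, the SG promoter precedes the coding sequence, and the polyadenylation signal directly follows the 3' UTR. The element labels, the function name `check_replicon_layout`, and the reading of "operably linked" as "positioned upstream" are illustrative assumptions rather than definitions from the disclosure.

```python
REQUIRED_ELEMENTS = ("5' UTR", "SG promoter", "CDS", "3' UTR", "poly(A) signal")


def check_replicon_layout(elements):
    """Return True if the 5'-to-3' element order satisfies the constraints.

    `elements` is an ordered list of element labels. Extra elements (for
    example a self-replicating element) are allowed and ignored.
    """
    position = {name: index for index, name in enumerate(elements)}
    if any(name not in position for name in REQUIRED_ELEMENTS):
        return False
    return (
        # coding sequence flanked by the 5' and 3' UTRs
        position["5' UTR"] < position["CDS"] < position["3' UTR"]
        # SG promoter assumed to sit upstream of the coding sequence
        and position["SG promoter"] < position["CDS"]
        # polyadenylation signal immediately after the 3' UTR
        and position["poly(A) signal"] == position["3' UTR"] + 1
    )


# One hypothetical layout that satisfies the constraints.
example_layout = [
    "5' UTR",
    "self-replicating element",
    "SG promoter",
    "CDS",
    "3' UTR",
    "poly(A) signal",
]

if __name__ == "__main__":
    print(check_replicon_layout(example_layout))
```

The checker deliberately leaves the position of the self-replicating element unconstrained, since the text above does not specify it.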

The RNA vector construct can be generated from a DNA template (a DNA plasmid construct). By way of example, the RNA construct can be transcribed from the DNA template using an SP6 or T7 in vitro transcription kit.

RNA vectors are particularly useful as delivery vectors.

6.9.2. DNA vectors

In some embodiments, the expression vector of the present invention is a DNA vector. The present invention provides two types of DNA vectors: (1) DNA vectors that are production vectors or delivery vectors, and (2) DNA vectors from which the RNA vectors of the present invention (as described in Section 6.9.1) can be transcribed, as described in Section 6.9.2.2. DNA vectors of the present invention from which RNA replicons can be transcribed are sometimes referred to herein as "template vectors."

In some embodiments, the DNA vector of the present invention is a non-integrating DNA vector. For example, the vector can be an episomal vector. For example, a number of plasmids based on DNA viruses, such as adenovirus, simian vacuolating virus 40 (SV40), or bovine papilloma virus (BPV), or containing a budding yeast ARS (autonomously replicating sequence), can be used without genomic integration.

In some embodiments, the DNA vector of the present invention includes an origin of replication. Examples of origins of replication that can be incorporated into the DNA vectors of the present invention include the origins of replication of a lymphotropic herpes virus, a gamma herpesvirus, an adenovirus, a bovine papilloma virus, or a yeast. In some embodiments, the origin of replication is from a lymphotropic herpes virus or a gamma herpesvirus and corresponds to oriP of EBV, serving as a self-replicating element. In some embodiments, the lymphotropic herpes virus is Epstein-Barr virus (EBV), Kaposi's sarcoma herpes virus (KSHV), Herpes virus saimiri (HS), or Marek's disease virus (MDV). Epstein-Barr virus (EBV) and Kaposi's sarcoma herpes virus (KSHV) are also examples of gamma herpesviruses.

In certain embodiments, the vector of the present invention comprises oriP, the origin of replication of EBV. OriP is the site at or near which DNA replication initiates and consists of two cis-acting sequences, approximately 1 kilobase pair apart, known as the family of repeats (FR) and the dyad symmetry (DS) element. FR is composed of 21 imperfect copies of a 30 bp repeat and contains 20 high-affinity EBNA-1 binding sites. When FR is bound by EBNA-1, it acts as a transcriptional enhancer of promoters in cis up to 10 kb away. DS is sufficient to initiate DNA synthesis in the presence of EBNA-1, and initiation occurs at or near DS.

One or more of the expression cassettes in a replicating DNA vector may further comprise a nucleotide sequence encoding a trans-acting factor that binds to the origin of replication to replicate the extrachromosomal template. Alternatively or additionally, the somatic cell may express such a trans-acting factor.

In other embodiments, the DNA vector of the present invention lacks an origin of replication.

The DNA vectors of the present invention typically comprise one or more promoters SP6 or T7, either to drive expression of the engineered B-GEn polypeptide, in the case of a DNA vector intended to be a production vector, or to drive expression of an RNA replicon, in the case of a DNA vector intended to be a template vector.

6.9.2.1. Expression vectors

In some embodiments, the expression vector is a DNA vector comprising an expression cassette for expressing one or more proteins of interest, which can be operably linked to a regulatory element comprising a promoter suitable for driving expression of the engineered B-GEn polypeptide in a cell type of interest. Examples of promoters suitable for driving protein expression in mammalian cells include the cytomegalovirus (CMV) promoter, the EF1a promoter, the SV40 promoter, the Ubc promoter, the human beta-actin promoter, the PGK1 promoter, and the CAG promoter.

DNA vectors used to express engineered B-GEn polypeptides directly (rather than serving as templates for expression of RNA vectors as described in Section 6.9.2.2) need not include RNA replicon self-replicating sequences, such as the nsP1-nsP4 proteins of VEEV or the NP, P, and L proteins of Sendai virus.

In some embodiments, the DNA expression vector is a non-replicating DNA vector. In some embodiments, the DNA expression vector is a replicating DNA vector.

6.9.2.2. Template vectors

The DNA vectors of the present invention can also serve as templates for transcription of RNA replicons as described herein. Thus, the "expression cassette" included in a template vector is intended to be expressed from the RNA replicon produced by transcription of the template vector.

Thus, the template vector of the present invention comprises a nucleotide sequence encoding an RNA replicon as described herein under the control of a regulatory element (e.g., an SP6 or T7 promoter).

In some embodiments, the template DNA vector is a non-replicating DNA vector. In some embodiments, the template DNA vector is a replicating DNA vector.

In some embodiments, the template vectors are used for in vitro transcription of RNA replicons, which are then introduced into cells to drive expression of the engineered B-GEn polypeptide.

6.9.3. Viral vectors

Recombinant adeno-associated virus (rAAV) vectors may be used for delivery. Techniques known in the art for producing rAAV particles provide a cell with the polynucleotide to be delivered between two AAV inverted terminal repeats (ITRs), the AAV rep and cap genes, and helper virus functions. Production of rAAV requires that the following components be present within a single cell (referred to herein as a packaging cell): the polynucleotide of interest between two ITRs, AAV rep and cap genes that are separate from (i.e., not in) the AAV genome, and helper virus functions. The AAV rep and cap genes can be from any AAV serotype from which recombinant virus can be derived, and can be from an AAV serotype different from that of the ITRs on the packaged polynucleotide, including but not limited to AAV serotypes AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, AAV-6, AAV-7, AAV-8, AAV-9, AAV-10, AAV-11, AAV-12, AAV-13, and AAV rh.74. Production of pseudotyped rAAV is disclosed, for example, in WO 01/83692.

AAV serotype    GenBank accession number
AAV-1           NC_002077.1
AAV-2           NC_001401.2
AAV-3           NC_001729.1
AAV-3B          AF028705.1
AAV-4           NC_001829.1
AAV-5           NC_006152.1
AAV-6           AF028704.1
AAV-7           NC_006260.1
AAV-8           NC_006261.1
AAV-9           AX753250.1
AAV-10          AY631965.1
AAV-11          AY631966.1
AAV-12          DQ813647.1
AAV-13          EU285562.1

One method of generating a packaging cell is to create a cell line that stably expresses all of the components necessary for AAV particle production. For example, a plasmid (or multiple plasmids) comprising the polynucleotide of interest between AAV ITRs, AAV rep and cap genes separate from the AAV genome, and a selectable marker (such as a neomycin resistance gene) is integrated into the genome of a cell. AAV genomes have been introduced into bacterial plasmids by procedures such as GC tailing (Samulski et al., 1982, Proc. Natl. Acad. Sci. USA, 79:2077-2081), addition of synthetic linkers containing restriction endonuclease cleavage sites (Laughlin et al., 1983, Gene, 23:65-73), or direct, blunt-end ligation (Senapathy & Carter, 1984, J. Biol. Chem., 259:4661-4666). The packaging cell line is then infected with a helper virus such as adenovirus. The advantages of this method are that the cells are selectable and are suitable for large-scale production of rAAV. Other examples of suitable methods employ adenovirus or baculovirus, rather than plasmids, to introduce the rAAV genome and/or the rep and cap genes into packaging cells.

General principles of rAAV production are reviewed in, for example, Carter, 1992, Current Opinions in Biotechnology, 1533-539; and Muzyczka, 1992, Curr. Topics in Microbiol. and Immunol., 158:97-129. Various approaches are described in Tratschin et al., Mol. Cell. Biol. 4:2072 (1984); Hermonat et al., Proc. Natl. Acad. Sci. USA, 81:6466 (1984); Tratschin et al., Mol. Cell. Biol. 5:3251 (1985); McLaughlin et al., J. Virol., 62:1963 (1988); Lebkowski et al., 1988 Mol. Cell. Biol., 7:349 (1988); Samulski et al. (1989, J. Virol., 63:3822-3828); U.S. Patent No. 5,173,414; WO 95/13365 and corresponding U.S. Patent No. 5,658,776; WO 95/13392; WO 96/17947; PCT/US98/18600; WO 97/09441 (PCT/US96/14423); WO 97/08298 (PCT/US96/13872); WO 97/21825 (PCT/US96/20777); WO 97/06243 (PCT/FR96/01064); WO 99/11764; Perrin et al. (1995) Vaccine 13:1244-1250; Paul et al. (1993) Human Gene Therapy 4:609-615; Clark et al. (1996) Gene Therapy 3:1124-1132; U.S. Patent No. 5,786,211; U.S. Patent No. 5,871,982; and U.S. Patent No. 6,258,595.

The AAV vector serotype used for transduction depends on the target cell type. For example, the following exemplary cell types are known to be transduced by the indicated AAV serotypes, among others.

Tissue / cell type        Serotype
Liver                     AAV8, AAV9
Skeletal muscle           AAV1, AAV7, AAV6, AAV8, AAV9
Central nervous system    AAV5, AAV1, AAV4
RPE                       AAV5, AAV4
Photoreceptor cells       AAV5
Lung                      AAV9
Heart                     AAV8
Pancreas                  AAV8
Kidney                    AAV2
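The tissue-to-serotype table above can be treated as a simple lookup. The following is a minimal sketch in Python that encodes only the pairings listed in that table; the names `SEROTYPES_BY_TISSUE`, `candidate_serotypes`, and `tissues_for` are hypothetical and chosen for illustration, and the mapping is not a complete or authoritative tropism reference.

```python
# Illustrative lookup built only from the tissue/serotype table in this section.
# All identifiers here are hypothetical helpers, not part of the disclosure.
SEROTYPES_BY_TISSUE = {
    "liver": ["AAV8", "AAV9"],
    "skeletal muscle": ["AAV1", "AAV7", "AAV6", "AAV8", "AAV9"],
    "central nervous system": ["AAV5", "AAV1", "AAV4"],
    "rpe": ["AAV5", "AAV4"],
    "photoreceptor cells": ["AAV5"],
    "lung": ["AAV9"],
    "heart": ["AAV8"],
    "pancreas": ["AAV8"],
    "kidney": ["AAV2"],
}


def candidate_serotypes(tissue):
    """Return the serotypes listed for a tissue, or an empty list if unlisted."""
    return list(SEROTYPES_BY_TISSUE.get(tissue.strip().lower(), []))


def tissues_for(serotype):
    """Return, in alphabetical order, the tissues for which a serotype is listed."""
    wanted = serotype.strip().upper()
    return sorted(t for t, s in SEROTYPES_BY_TISSUE.items() if wanted in s)


if __name__ == "__main__":
    print(candidate_serotypes("Liver"))
    print(tissues_for("AAV8"))
```

A tissue that is absent from the table yields an empty list rather than an error, which reflects that the table is exemplary rather than exhaustive.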

Many suitable expression vectors are known to those skilled in the art, and many are commercially available. The following vectors are provided by way of example for eukaryotic host cells: pXT1, pSG5 (Stratagene), pSVK3, pBPV, pMSG, and pSVLSV40 (Pharmacia). However, any other vector may be used so long as it is compatible with the host cell.

6.10. Host cells and recombinant expression

In some embodiments, host cells can be used to express a gRNA, an sgRNA, or the engineered B-GEn polypeptide of the present invention. Suitable host cells include naturally occurring cells, genetically modified cells (e.g., cells genetically modified in a laboratory), and cells manipulated in vitro in any way. In some embodiments, the host cell is isolated.

The host cell can be eukaryotic or prokaryotic and includes, for example, yeast (such as Pichia pastoris or Saccharomyces cerevisiae), bacteria (such as Escherichia coli or Bacillus subtilis), insect cells (such as baculovirus-infected Sf9 cells), or mammalian cells (such as human embryonic kidney (HEK) cells, Chinese hamster ovary cells, HeLa cells, human 293 cells, and monkey COS-7 cells).

The host cell may be from an established cell line, or it may be a primary cell, where "primary cell," "primary cell line," and "primary culture" are used interchangeably herein to refer to cells and cell cultures that have been derived from an individual and allowed to grow in vitro for a limited number of passages (e.g., splittings) of the culture. For example, primary cultures include cultures that may have been passaged 0, 1, 2, 4, 5, 10, or 15 times, but not enough times to go through the crisis stage. A primary cell line may be maintained in vitro for fewer than 10 passages. In some embodiments, the host cell is a PSC (e.g., an iPSC or an ESC) or a PSC-derived cell (e.g., a PSC-derived neuron, a PSC-derived microglial cell, a PSC-derived cardiomyocyte, or a PSC-derived cell of the eye).

If the cells are primary cells, they may be harvested from an individual by any suitable method. A suitable solution may be used to disperse or suspend the harvested cells. The harvested cells may be used immediately, or they may be frozen and stored for long periods, then thawed and reused. In such cases, the cells are generally frozen in 10% dimethyl sulfoxide (DMSO), 50% serum, 40% buffered medium, or some other such solution commonly used in the art to preserve cells at such freezing temperatures, and thawed in a manner commonly known in the art for thawing frozen cultured cells.

6.11. Target cells

In some embodiments, the engineered Type V endonuclease or B-GEn CRISPR-Cas system is introduced into a target cell or a population of target cells. Methods for introducing proteins and nucleic acids into target cells are further described in Section 6.12.

The target cells and target cell populations of the present invention may be cells in which gene editing by the system of the present invention has occurred, cells in which components of the system of the present invention have been introduced or expressed but gene editing has not yet occurred, or a combination thereof. In various embodiments, the cell population may be, for example, a population in which at least 1%, at least 5%, at least 10%, at least 15%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, or at least 70% of the cells have been gene edited by the system of the present invention.
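The "at least N%" language above amounts to comparing an observed edited fraction against a ladder of thresholds. The sketch below, in Python, shows that arithmetic under the assumption that edited and total cell counts are already available from some assay; the function names and the idea of counting cells directly are illustrative assumptions, not a method described in this disclosure.

```python
# Thresholds (in percent) recited in the paragraph above.
THRESHOLDS = (1, 5, 10, 15, 20, 30, 40, 50, 60, 70)


def percent_edited(edited, total):
    """Percentage of cells in a population that carry an edit."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= edited <= total:
        raise ValueError("edited must be between 0 and total")
    return 100.0 * edited / total


def thresholds_met(edited, total):
    """Return every recited 'at least N%' threshold the population satisfies."""
    pct = percent_edited(edited, total)
    return [t for t in THRESHOLDS if pct >= t]


if __name__ == "__main__":
    # Hypothetical counts: 23 edited cells out of 100 assayed.
    print(thresholds_met(23, 100))
```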

In some embodiments, the methods of the present invention can be used to induce transcriptional regulation in mitotic or post-mitotic cells in vivo and/or ex vivo and/or in vitro. In some embodiments, the methods of the present invention can be used to induce DNA cleavage, DNA modification, and/or transcriptional regulation in mitotic or post-mitotic cells in vivo and/or ex vivo and/or in vitro (e.g., to produce genetically modified cells that can be reintroduced into an individual).

Because the guide RNA provides specificity by hybridizing to target DNA, the mitotic and/or post-mitotic cell can be any of a variety of target cells and can be modified in vivo or in vitro. Suitable target cells include, but are not limited to, bacterial cells; archaeal cells; unicellular eukaryotic organisms; plant cells; algal cells, e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens, C. agardh, and the like; fungal cells; animal cells; cells from invertebrates (e.g., insects, cnidarians, echinoderms, nematodes, etc.); eukaryotic parasites (e.g., malarial parasites such as Plasmodium falciparum; helminths, etc.); cells from vertebrates (e.g., fish, amphibians, reptiles, birds, mammals); and mammalian cells, e.g., rodent cells, human cells, non-human primate cells, etc. In some embodiments, the target cell can be any human cell. Suitable target cells include naturally occurring cells, genetically modified cells (e.g., cells genetically modified in a laboratory, e.g., by the "hand of man"), and cells manipulated in vitro in any way. In some embodiments, the target cell is isolated.

Any type of cell may be of interest as a host cell or target cell for modification in vivo or in vitro. In various embodiments, the host cell or target cell is a stem cell (e.g., a PSC, such as an embryonic stem (ES) cell or an induced pluripotent stem cell (iPSC)); a germ cell; a somatic cell, e.g., a fibroblast, a hematopoietic cell, an immune cell (e.g., a T lymphocyte, a B lymphocyte, a dendritic cell, or a macrophage), a neuron, a muscle cell, a bone cell, a hepatocyte, or a pancreatic cell; or an embryonic cell of an embryo at any stage (e.g., a 1-cell, 2-cell, 4-cell, 8-cell, etc. stage zebrafish embryo). The cell may be from an established cell line, or it may be a primary cell, where "primary cell," "primary cell line," and "primary culture" are used interchangeably herein to refer to cells and cell cultures that have been derived from an individual and allowed to grow in vitro for a limited number of passages (e.g., splittings) of the culture. For example, primary cultures include cultures that may have been passaged 0, 1, 2, 4, 5, 10, or 15 times, but not enough times to go through the crisis stage. A primary cell line may be maintained in vitro for fewer than 10 passages. In some embodiments, the target cell is a unicellular organism or is grown in culture. In some embodiments, the host cell is the same as the target cell. In some embodiments, the target cell is modified to become another cell type, so that the resulting host cell differs from the target cell. As an example, the target cell can be a PSC (e.g., an iPSC) that is then differentiated into a PSC-derived cell (such as a PSC-derived neuron), so that the host cell is a neuron. Alternatively, the target cell is not a PSC but is subsequently dedifferentiated or reprogrammed into a PSC.

If the cells are primary cells, they may be harvested from an individual by any suitable method. For example, leukocytes may be conveniently harvested by apheresis, leukapheresis, density gradient separation, etc., while cells from tissues (such as skin, muscle, bone marrow, spleen, liver, pancreas, lung, intestine, stomach, etc.) are most conveniently harvested by biopsy. A suitable solution may be used to disperse or suspend the harvested cells. Such a solution is generally a balanced salt solution, e.g., normal saline, phosphate-buffered saline (PBS), Hank's balanced salt solution, etc., conveniently supplemented with fetal bovine serum or other naturally occurring factors, in conjunction with an acceptable buffer at low concentration (e.g., 5 to 25 mM). Suitable buffers include HEPES, phosphate buffers, lactate buffers, etc. The cells may be used immediately, or they may be frozen and stored for long periods, then thawed and reused. In such cases, the cells are generally frozen in 10% dimethyl sulfoxide (DMSO), 50% serum, 40% buffered medium, or some other such solution commonly used in the art to preserve cells at such freezing temperatures, and thawed in a manner commonly known in the art for thawing frozen cultured cells.
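The 10% DMSO / 50% serum / 40% buffered medium composition mentioned above reduces to simple volume arithmetic when preparing a batch of freezing solution. The following Python sketch works through that calculation; the function name, the default fractions as keyword arguments, and the use of milliliters are assumptions made for illustration and do not represent a validated laboratory protocol.

```python
def freezing_medium_volumes(total_ml, dmso=0.10, serum=0.50, medium=0.40):
    """Split a total volume into component volumes by volume fraction.

    Defaults mirror the 10% DMSO / 50% serum / 40% buffered medium
    composition given in the text; other fractions may be supplied.
    """
    if total_ml <= 0:
        raise ValueError("total_ml must be positive")
    if abs((dmso + serum + medium) - 1.0) > 1e-9:
        raise ValueError("component fractions must sum to 1")
    return {
        "DMSO": total_ml * dmso,
        "serum": total_ml * serum,
        "buffered medium": total_ml * medium,
    }


if __name__ == "__main__":
    # Hypothetical batch size of 10 mL.
    for component, volume in freezing_medium_volumes(10.0).items():
        print(f"{component}: {volume:.2f} mL")
```

For a 10 mL batch this works out to 1 mL DMSO, 5 mL serum, and 4 mL buffered medium.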

In some embodiments, the target cell is located in an individual, and the methods of the present invention include administering a B-GEn CRISPR-Cas system (e.g., a ribonucleoprotein comprising an engineered B-GEn polypeptide of the present invention) to the individual (e.g., a mammalian individual, such as a human or a livestock animal). Exemplary in vivo cells to which the B-GEn CRISPR-Cas system can be delivered include, but are not limited to, fibroblasts, hematopoietic cells, immune cells (e.g., T lymphocytes, B lymphocytes, dendritic cells, or macrophages), neurons, glia, muscle cells (e.g., cardiomyocytes), bone cells, hepatocytes, and pancreatic cells.

6.11.1. Pluripotent stem cells (PSCs)

In some embodiments, the target cells are stem cells, in particular pluripotent stem cells (PSCs), such as induced pluripotent stem cells (iPSCs) or human embryonic stem cells (hESCs), which can be differentiated and used to produce large quantities of a specific cell type that can be delivered for regenerative medicine to patients with many different diseases. In the case of PSCs, differentiation is a process of lineage specification and can be achieved using cell-type-specific protocols.

After a PSC (e.g., an hESC or iPSC) has been modified by introduction of an engineered Type V endonuclease or B-GEn polypeptide of the present invention, the PSC can be differentiated into a cell type of interest for cell therapy. In some embodiments, the genome of the PSC is edited by an engineered B-GEn polypeptide of the present invention prior to differentiation.

PSCs (e.g., PSCs comprising an engineered Type V endonuclease or B-GEn polypeptide of the present invention, or whose genomes have been edited by an engineered Type V endonuclease or B-GEn polypeptide of the present invention) can be differentiated into cells suitable for therapy, including cells of the endoderm (e.g., lung, thyroid, or pancreatic cells or their precursors), ectoderm (e.g., skin cells, neurons, or pigment cells or their precursors), and mesoderm (e.g., cardiac cells, skeletal muscle cells, red blood cells, smooth muscle cells, or their precursors or progenitors) lineages.

In some embodiments, the PSCs of the present invention are differentiated into cardiac cells. In various embodiments, the cardiac cells are cardiac progenitor cells or mature or immature (atrial or ventricular) cardiomyocytes.

In other embodiments, the PSCs of the present invention are differentiated into oligodendrocyte precursor cells or oligodendrocytes.

In other embodiments, the PSCs of the present invention are differentiated into neural lineage cells, such as neural crest cells, astrocytes, dopaminergic neuron precursor cells, dopaminergic neurons, midbrain dopaminergic neuron precursor cells, midbrain dopaminergic neurons, authentic midbrain dopamine (DA) neurons, floor plate midbrain precursor cells, or floor plate midbrain DA neurons.

In other embodiments, the PSCs of the present invention are differentiated into photoreceptor cells, photoreceptor progenitor cells, retinal pigment epithelial cells, neural retinal cells, or neural retinal progenitor cells.

In some embodiments, the PSCs of the present invention are differentiated into microglia or microglial precursor cells.

In some embodiments, the PSCs of the present invention are differentiated into macrophages.

In some embodiments, the PSCs of the present invention are differentiated into intestinal progenitor cells or intestinal cells.

In some embodiments, the PSCs of the present invention are differentiated into immune cells, such as T lymphocytes, B lymphocytes, dendritic cells, or macrophages.

In some embodiments, the PSCs can be genetically engineered prior to differentiation into the cell type of interest (e.g., to produce a functional protein that is deficient in the patient, to produce a therapeutic protein, to include an off switch, or to evade immune detection and thereby support allogeneic applications).

6.12. Methods of introducing nucleic acids into host and target cells

In some embodiments, the methods of the present invention include introducing into a host or target cell (or a population of host or target cells) one or more nucleic acids comprising a nucleotide sequence encoding a guide RNA and/or a nucleotide sequence (e.g., a codon-optimized nucleotide sequence) encoding an engineered B-GEn polypeptide. In some embodiments, the methods of the present invention include introducing into a host or target cell (or a population of host or target cells) a guide RNA and/or a nucleotide sequence (e.g., a codon-optimized nucleotide sequence) encoding an engineered B-GEn polypeptide.

In some embodiments, the target cell (e.g., a cell comprising DNA targeted by a guide RNA for editing by an engineered B-GEn polypeptide) is an in vitro cell, e.g., a cell in culture. In some embodiments, the target cell is an in vivo cell, e.g., a cell in an individual (e.g., a mammal, such as a human) for whom gene therapy is intended.

In some embodiments, the nucleotide sequence encoding the guide RNA and/or the engineered B-GEn polypeptide is operably linked to an inducible promoter. In some embodiments, the nucleotide sequence encoding the guide RNA and/or the engineered B-GEn polypeptide is operably linked to a constitutive promoter.

A guide RNA, or a nucleic acid comprising a nucleotide sequence encoding it, can be introduced into a host or target cell by any of a variety of well-known methods. Similarly, where a method involves introducing into a host or target cell a nucleic acid comprising a nucleotide sequence (e.g., a codon-optimized nucleotide sequence) encoding an engineered B-GEn polypeptide, such a nucleic acid can be introduced by any of a variety of well-known methods. Guide nucleic acids (RNA or DNA; e.g., a guide RNA or one or more DNA molecules encoding a guide RNA) and/or nucleic acids (RNA or DNA) encoding an engineered B-GEn polypeptide can be delivered by viral or non-viral delivery vehicles known in the art.

Methods of introducing a nucleic acid into a host or target cell are known in the art, and any known method can be used to introduce a nucleic acid (e.g., an expression construct) into a stem cell or progenitor cell. Suitable methods include, for example, viral or bacteriophage infection, transfection, conjugation, protoplast fusion, lipofection, electroporation, nucleofection, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran-mediated transfection, liposome-mediated transfection, particle gun technology, direct microinjection, nanoparticle-mediated nucleic acid delivery (see, e.g., Panyam et al., Adv Drug Deliv Rev. 2012 Sep 13. pii: S0169-409X(12)00283-9. doi: 10.1016/j.addr.2012.09.023), and the like, including but not limited to exosome delivery.

Polynucleotides can be delivered by non-viral delivery vehicles including, but not limited to, nanoparticles, liposomes, ribonucleoproteins, positively charged peptides, small-molecule RNA conjugates, aptamer-RNA chimeras, and RNA-fusion protein complexes. Some exemplary non-viral delivery vehicles are described in Peer and Lieberman, Gene Therapy, 18: 1127-1133 (2011), which focuses on non-viral delivery vehicles for siRNA that are also useful for delivery of other nucleic acids.

Suitable systems and techniques for delivering nucleic acids of the present invention (e.g., mRNA and sgRNA) for gene editing include lipid nanoparticles (LNPs). As used herein, the term "lipid nanoparticle" includes liposomes, irrespective of their lamellarity, shape, or structure, and lipoplexes as described for the introduction of nucleic acids and/or polypeptides into cells. These lipid nanoparticles can be complexed with biologically active compounds (e.g., nucleic acids and/or polypeptides) and are useful as in vivo delivery vehicles. In general, any method known in the art can be applied to prepare lipid nanoparticles comprising one or more nucleic acids of the present invention and to prepare complexes of biologically active compounds with such lipid nanoparticles. Examples of such methods are widely disclosed in, for example, Biochim Biophys Acta 1979, 557:9; Biochim et Biophys Acta 1980, 601:559; Liposomes: A practical approach (Oxford University Press, 1990); Pharmaceutica Acta Helvetiae 1995, 70:95; Current Science 1995, 68:715; Pakistan Journal of Pharmaceutical Sciences 1996, 19:65; and Methods in Enzymology 2009, 464:343. Particularly suitable systems and techniques for preparing LNP formulations comprising one or more nucleic acids and/or polypeptides of the present invention include, but are not limited to, those developed by Intellia (see, e.g., WO2017173054A1), Alnylam (see, e.g., WO2014008334A1), Modernatx (see, e.g., WO2017070622A1 and WO2017099823A1), TranslateBio, Acuitas (see, e.g., WO2018081480A1), Genevant Sciences, Arbutus Biopharma, Tekmira, Arcturus, Merck (see, e.g., WO2015130584A2), Novartis (see, e.g., WO2015095340A1), and Dicerna; each of which is incorporated herein by reference in its entirety.

Suitable nucleic acids comprising nucleotide sequences encoding engineered B-GEn polypeptides and/or guide RNAs include expression vectors. In some embodiments, the expression vector is a viral construct, such as a recombinant adeno-associated virus construct (see, e.g., U.S. Patent No. 7,078,387), a recombinant adenovirus construct, a recombinant lentivirus construct, a recombinant retrovirus construct, etc. Suitable expression vectors include, but are not limited to, viral vectors, e.g., viral vectors based on: vaccinia virus; poliovirus; adenovirus (see, e.g., Li et al., Invest Ophthalmol Vis Sci 35:2543-2549, 1994; Borras et al., Gene Ther 6:515-524, 1999; Li and Davidson, PNAS 92:7700-7704, 1995; Sakamoto et al., H Gene Ther 5:1088-1097, 1999; WO 94/12649; WO 93/03769; WO 93/19191; WO 94/28938; WO 95/11984; and WO 95/00655); adeno-associated virus (see, e.g., Ali et al., Hum Gene Ther 9:81-86, 1998; Flannery et al., PNAS 94:6916-6921, 1997; Bennett et al., Invest Ophthalmol Vis Sci 38:2857-2863, 1997; Jomary et al., Gene Ther 4:683-690, 1997; Rolling et al., Hum Gene Ther 10:641-648, 1999; Ali et al., Hum Mol Genet 5:591-594, 1996; Srivastava in WO 93/09239; Samulski et al., J. Vir. (1989) 63:3822-3828; Mendelson et al., Virol. (1988) 166:154-165; and Flotte et al., PNAS (1993) 90:10613-10617); SV40; herpes simplex virus; human immunodeficiency virus (see, e.g., Miyoshi et al., PNAS 94:10319-23, 1997; Takahashi et al., J Virol 73:7812-7816, 1999); retroviral vectors (e.g., murine leukemia virus, spleen necrosis virus, and vectors derived from retroviruses such as Rous sarcoma virus, Harvey sarcoma virus, avian leukosis virus, lentivirus, human immunodeficiency virus, myeloproliferative sarcoma virus, and mammary tumor virus); and the like.

In some embodiments, a B-GEn ribonucleoprotein comprising a B-GEn endonuclease and an sgRNA is delivered to target cells by nucleofection, a method that delivers nucleic acids into cells by using cell-specific reagents and electrical parameters to create transient pores in the cell membrane.

The engineered B-GEn Type V CRISPR-Cas system can be delivered into target cells by delivery vectors such as viral vectors. The engineered B-GEn Type V CRISPR-Cas system can also be delivered into target cells by non-viral delivery vehicles including, but not limited to, nanoparticles, liposomes, ribonucleoproteins, positively charged peptides, small molecule RNA conjugates, aptamer-RNA chimeras, and RNA-fusion protein complexes. Some exemplary non-viral delivery vehicles are described in Peer and Lieberman, Gene Therapy, 18: 1127-1133 (2011).

In some embodiments, the engineered B-GEn Type V CRISPR-Cas system is delivered into a target cell using a delivery vector as described in Section 6.9.

6.13. Gene Editing Methods

The present invention also provides methods of gene editing using an engineered Type V endonuclease or B-GEn polypeptide system. The gene editing methods can include methods for targeting, editing, modifying, or manipulating a target DNA at one or more locations in the genome of a host or target cell (or a plurality of host or target cells), whether in vitro, ex vivo, or in vivo, or a target DNA in a cell-free environment. Generally, the gene editing methods comprise introducing an engineered Type V endonuclease or B-GEn polypeptide system into a host or target cell (or a population of host or target cells), or to a target DNA sequence in a cell-free environment, under conditions suitable for the engineered Type V endonuclease or B-GEn polypeptide to make one or more modifications (e.g., nicks, cuts, or base edits) in the target DNA, wherein the engineered Type V endonuclease or B-GEn polypeptide is directed to the target DNA by a guide RNA in processed or unprocessed form.

The gene editing methods can include introducing an engineered Type V endonuclease or B-GEn polypeptide system of the present invention into a host or target cell (or a population of host or target cells) as an RNP complex, and/or introducing into the host or target cell (or population of host or target cells), e.g., via a virus (such as an AAV) or via an LNP, one or more nucleic acids comprising a guide RNA or a nucleotide sequence encoding a guide RNA and a nucleotide sequence encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., a codon-optimized nucleotide sequence). The engineered Type V endonuclease or B-GEn polypeptide system of the present invention (e.g., as an RNP complex; as one or more nucleic acid molecules encoding a guide RNA and one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., one or more codon-optimized nucleotide sequences); or as a guide RNA and one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., one or more codon-optimized nucleotide sequences)) can be introduced into a host or target cell (or a population of host or target cells) by any of a variety of well-known viral or non-viral delivery methods. The gene editing methods can be applied to a host or target cell (or a population of host or target cells) in vivo, ex vivo, or in vitro.

In some embodiments, the gene editing method comprises introducing an engineered Type V endonuclease or B-GEn polypeptide system of the present invention (e.g., as an RNP complex; as one or more nucleic acid molecules encoding a guide RNA and one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., one or more codon-optimized nucleotide sequences); or as a guide RNA and one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., one or more codon-optimized nucleotide sequences)) into a subject (e.g., a mammalian subject, such as a human subject) in whom gene editing is desired (e.g., for gene therapy purposes) to make one or more modifications (e.g., nicks, cuts, or base edits) in a target DNA in vivo.

In some embodiments, the gene editing method comprises introducing an engineered Type V endonuclease or B-GEn polypeptide system of the present invention (e.g., as an RNP complex; as one or more nucleic acid molecules encoding a guide RNA and one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., one or more codon-optimized nucleotide sequences); or as a guide RNA and one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., one or more codon-optimized nucleotide sequences)) to make one or more modifications (e.g., nicks, cuts, or base edits) in a target DNA ex vivo.

In some embodiments, the gene editing method comprises introducing an engineered Type V endonuclease or B-GEn polypeptide system of the present invention (e.g., as an RNP complex; as one or more nucleic acid molecules encoding a guide RNA and one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., one or more codon-optimized nucleotide sequences); or as a guide RNA and one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide (e.g., one or more codon-optimized nucleotide sequences)) to make one or more modifications (e.g., nicks, cuts, or base edits) in a target DNA in vitro.
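
The system formats and settings recited in the preceding paragraphs can be summarized as a small data model. The Python sketch below is illustrative only: the class names, field names, and the `enumerate_plans` helper are hypothetical and are not part of the disclosure. It simply enumerates the three recited system formats (RNP complex; nucleic acids encoding both components; guide RNA plus nucleic acid encoding the polypeptide) across the in vivo, ex vivo, and in vitro settings for a chosen type of modification.

```python
from dataclasses import dataclass
from enum import Enum


class SystemFormat(Enum):
    """Hypothetical labels for the three system formats recited in the text."""
    RNP = "pre-formed ribonucleoprotein complex"
    ENCODED_BOTH = "nucleic acid(s) encoding the guide RNA and the polypeptide"
    GUIDE_PLUS_ENCODED_POLYPEPTIDE = "guide RNA plus nucleic acid(s) encoding the polypeptide"


class Setting(Enum):
    """Where the target DNA is modified."""
    IN_VIVO = "in vivo"
    EX_VIVO = "ex vivo"
    IN_VITRO = "in vitro"


# Modification types named in the text (nick, cut, base edit); labels are illustrative.
ALLOWED_MODIFICATIONS = frozenset({"nick", "cut", "base edit"})


@dataclass(frozen=True)
class EditingPlan:
    """One combination of system format, setting, and modification type."""
    system_format: SystemFormat
    setting: Setting
    modification: str


def enumerate_plans(modification: str) -> list[EditingPlan]:
    """Return every format/setting combination for one modification type."""
    if modification not in ALLOWED_MODIFICATIONS:
        raise ValueError(f"unsupported modification: {modification!r}")
    return [
        EditingPlan(fmt, setting, modification)
        for fmt in SystemFormat
        for setting in Setting
    ]


if __name__ == "__main__":
    for plan in enumerate_plans("nick"):
        print(f"{plan.setting.value}: {plan.system_format.value}")
```

The model deliberately says nothing about delivery route (viral, LNP, electroporation); those choices are described separately in the text and would be an additional, independent field.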

In some embodiments, the gene editing method comprises contacting a cell with the engineered Type V endonuclease or B-GEn polypeptide system in the form of an RNP complex comprising an engineered Type V or B-GEn polypeptide and a guide RNA. Exemplary details of the preparation, composition, and delivery of RNP complexes are described in Section 6.7. In some embodiments, the RNP complex can also be delivered into the cell using a delivery method that increases the porosity of the cell's plasma membrane, for example electroporation or nucleofection.

In some embodiments, the gene editing method comprises contacting a cell with an engineered Type V endonuclease or B-GEn polypeptide system in the form of an LNP (e.g., an LNP comprising an RNP (e.g., an LNP as described in the previous paragraph), or an LNP comprising a nucleic acid encoding an engineered Type V endonuclease or B-GEn polypeptide and a guide RNA or a nucleic acid encoding the guide RNA). Any method known in the art can be applied to prepare the LNP, for example as described in Section 6.12.

In some embodiments, the gene editing method comprises contacting a cell with the engineered Type V endonuclease or B-GEn polypeptide system in the form of one or more viruses (e.g., one or more AAVs whose genomes comprise one or more nucleotide sequences encoding an engineered Type V endonuclease or B-GEn polypeptide and a nucleic acid encoding a guide RNA). The use of viruses to introduce transgenes (e.g., a nucleic acid encoding an engineered Type V endonuclease or B-GEn polypeptide and a guide RNA coding sequence) is known in the art and is described in Section 6.9.3. Other methods and delivery vehicles for introducing nucleic acids encoding engineered Type V endonucleases or B-GEn polypeptides or guide RNAs are described in Sections 6.9 and 6.12.

6.14. Pharmaceutical Compositions

Also disclosed herein are pharmaceutical formulations and medicaments comprising a B-GEn protein, gRNA, nucleic acid or plurality of nucleic acids, system, or particle or plurality of particles of the present invention, together with a pharmaceutically acceptable excipient.

Suitable excipients include, but are not limited to, salts, diluents (e.g., Tris-HCl, acetate, phosphate), preservatives (e.g., thimerosal, benzyl alcohol, parabens), binders, fillers, solubilizers, disintegrants, sorbents, solvents, pH-modifying agents, antioxidants, anti-infective agents, suspending agents, wetting agents, viscosity modifiers, tonicity agents, stabilizers, and other components and combinations thereof. Suitable pharmaceutically acceptable excipients can be selected from materials that are generally recognized as safe (GRAS) and can be administered to an individual without causing undesirable biological side effects or unwanted interactions. Suitable excipients and their formulations are described in Remington's Pharmaceutical Sciences, 16th edition, 1980, Mack Publishing Co. In addition, such compositions can be complexed with polyethylene glycol (PEG) or metal ions, incorporated into polymeric compounds (such as polyacetic acid, polyglycolic acid, hydrogels, etc.), or incorporated into liposomes, microemulsions, micelles, unilamellar or multilamellar vesicles, erythrocyte ghosts, or spheroblasts. Suitable dosage forms for administration (e.g., parenteral administration) include solutions, suspensions, and emulsions.

The components of the pharmaceutical formulation can be dissolved or suspended in a suitable solvent such as, for example, water, Ringer's solution, phosphate buffered saline (PBS), or isotonic sodium chloride. The formulation can also be a sterile solution, suspension, or emulsion in a nontoxic, parenterally acceptable diluent or solvent such as 1,3-butanediol.

In some cases, the formulation can include one or more tonicity agents to adjust the isotonic range of the formulation. Suitable tonicity agents are well known in the art and include glycerin, mannitol, sorbitol, sodium chloride, and other electrolytes. In some cases, the formulations can be buffered with an effective amount of buffer necessary to maintain a pH suitable for parenteral administration. Suitable buffers are well known to those skilled in the art; some examples of useful buffers are acetate, borate, carbonate, citrate, and phosphate buffers.

In some embodiments, the formulation can be distributed or packaged in liquid form or, alternatively, as a solid obtained, for example, by lyophilizing a suitable liquid formulation, which can be reconstituted with a suitable carrier or diluent prior to administration. In some embodiments, the formulations can comprise a guide RNA and a Type II Cas protein in a pharmaceutically effective amount sufficient to edit a gene in a cell. The pharmaceutical compositions can be formulated for medical and/or veterinary use.

In some embodiments, the B-GEn endonuclease complexes can be introduced into host cells (e.g., iPSCs) to produce genetically modified cells that can be reintroduced into an individual. The iPSC-derived cells described herein can be provided in a pharmaceutical composition containing the cells and a pharmaceutically acceptable carrier. The pharmaceutically acceptable carrier can be a cell culture medium that optionally does not contain any animal-derived components. For storage and transportation, the cells can be cryopreserved at < -70°C (e.g., on dry ice or in liquid nitrogen). Prior to use, the cells can be thawed and diluted in a sterile cell culture medium that supports the cell type of interest.

The cells can be administered to the patient systemically (e.g., by intravenous injection or infusion) or locally (e.g., by direct injection into a local tissue, such as the heart, the brain, or a site of damaged tissue). Various methods are known in the art for administering cells into a tissue or organ of a patient, including, but not limited to, intracoronary administration, intramyocardial administration, endocardial administration, and intracranial administration.

A therapeutically effective number of iPSC-derived cells is administered to a patient. As used herein, the term "therapeutically effective" refers to a number of cells or an amount of a pharmaceutical composition that, when administered to a human subject suffering from or susceptible to a disease, disorder, and/or condition, is sufficient to treat, prevent, and/or delay the onset or progression of the symptom(s) of the disease, disorder, and/or condition. A person of ordinary skill will appreciate that a therapeutically effective amount is typically administered via a dosing regimen comprising at least one unit dose. In some embodiments, at least 10³ (e.g., at least 10⁴, at least 10⁵, at least 10⁶, at least 10⁷, at least 10⁸, at least 10⁹, at least 10¹⁰, at least 10¹¹, or at least 10¹²) cells are administered to a subject at one time in one or more sites. In some embodiments, 10³ to 10¹⁸ (e.g., 10³ to 10⁴, 10³ to 10⁵, 10³ to 10⁶, 10³ to 10⁷, 10³ to 10⁸, 10³ to 10⁹, 10³ to 10¹⁰, 10³ to 10¹¹, 10³ to 10¹², 10⁶ to 10⁷, 10⁶ to 10⁸, 10⁶ to 10⁹, 10⁶ to 10¹⁰, 10⁶ to 10¹¹, 10⁶ to 10¹², 10⁹ to 10¹⁰, 10⁹ to 10¹¹, or 10⁹ to 10¹²) cells are administered to a subject at one time in one or more sites. In some embodiments, more than 10¹² (e.g., more than 10¹², more than 10¹³, more than 10¹⁴, more than 10¹⁵, more than 10¹⁶, more than 10¹⁷, more than 10¹⁸, or more) cells are administered to a subject at one time at one or more sites.

7. Numbered Embodiments

雖然已繪示及描述各種特定實施例，但應明瞭，在不脫離本發明之精神及範疇下，可進行各種改變。本發明由下文闡明的編號實施例例示。除非另作指明，否則上文詳細描述中描述之任何概念、態樣及/或實施例之特徵可比照適用於任何以下編號實施例。 1. 一種包含在對應於SEQ ID NO: 1 (B-GEn.1)之D504或SEQ ID NO: 2 (B-GEn.1.2)或SEQ ID NO: 3 (B-GEn.2)之D501之位置處之除天冬胺酸之外之胺基酸之多肽。 2. 如實施例1之多肽，其為經工程化之V型核酸內切酶多肽。 3. 如實施例2之多肽，其為經工程化之芽孢桿菌目V型核酸內切酶多肽。 4. 如實施例1至3中任一實施例之多肽，其為B-GEn多肽。 5. 如實施例1至4中任一實施例之多肽，其具有在對應於SEQ ID NO: 1 (B-GEn.1)之D504或SEQ ID NO: 2 (B-GEn.1.2)或SEQ ID NO: 3 (B-GEn.2)之D501之位置處之精胺酸。 6. 如實施例1至5中任一實施例之多肽，其包含SEQ ID NO: 201、SEQ ID NO: 202、SEQ ID NO: 203；及SEQ ID NO: 204中之任一者之靶相互作用序列基序。 7. 如實施例1至6中任一實施例之多肽，其含有包含與SEQ ID NO: 8或SEQ ID NO: 11之RuvC I域具有至少40%、至少45%、至少50%、至少60%、至少70%、至少80%或至少90%序列一致性之胺基酸序列之RuvC I域。 8. 如實施例7之多肽，其中該RuvC I域包含與SEQ ID NO: 8或SEQ ID NO: 11之RuvC I域具有至少40%序列一致性之胺基酸序列。 9. 如實施例7之多肽，其中該RuvC I域包含與SEQ ID NO: 8或SEQ ID NO: 11之RuvC I域具有至少70%序列一致性之胺基酸序列。 10. 如實施例7之多肽，其中該RuvC I域包含與SEQ ID NO: 8或SEQ ID NO: 11之RuvC I域具有至少80%序列一致性之胺基酸序列。 11. 如實施例7之多肽，其中該RuvC I域包含與SEQ ID NO: 8或SEQ ID NO: 11之RuvC I域具有至少90%序列一致性之胺基酸序列。 12. 如實施例1至11中任一實施例之多肽，其含有包含與SEQ ID NO: 9或SEQ ID NO: 12之RuvC II域具有至少50%、至少60%、至少70%、至少80%或至少90%序列一致性之胺基酸序列之RuvC II域。 13. 如實施例12之多肽，其中該RuvC II域包含與SEQ ID NO: 9或SEQ ID NO: 12之RuvC II域具有至少60%序列一致性之胺基酸序列。 14. 如實施例12之多肽，其中該RuvC II域包含與SEQ ID NO: 9或SEQ ID NO: 12之RuvC II域具有至少70%序列一致性之胺基酸序列。 15. 如實施例12之多肽，其中該RuvC II域包含與SEQ ID NO: 9或SEQ ID NO: 12之RuvC II域具有至少80%序列一致性之胺基酸序列。 16. 如實施例12之多肽，其中該RuvC II域包含與SEQ ID NO: 9或SEQ ID NO: 12之RuvC II域具有至少90%序列一致性之胺基酸序列。 17. 如實施例1至16中任一實施例之多肽，其含有包含與SEQ ID NO: 10或SEQ ID NO: 13之RuvC III域具有至少80%、至少85%或至少90%序列一致性之胺基酸序列之RuvC III域。 18. 如實施例17之多肽，其中該RuvC III域包含與SEQ ID NO: 10或SEQ ID NO: 13之RuvC III域具有至少80%序列一致性之胺基酸序列。 19. 如實施例17之多肽，其中該RuvC III域包含與SEQ ID NO: 10或SEQ ID NO: 13之RuvC III域具有至少85%序列一致性之胺基酸序列。 20. 如實施例17之多肽，其中該RuvC III域包含與SEQ ID NO: 10或SEQ ID NO: 13之RuvC III域具有至少90%序列一致性之胺基酸序列。 21. 如實施例1至20中任一實施例之多肽，其包含SEQ ID NO: 8之胺基酸序列。 22. 如實施例1至21中任一實施例之多肽，其包含SEQ ID NO: 9之胺基酸序列。 23. 如實施例1至22中任一實施例之多肽，其包含SEQ ID NO: 10之胺基酸序列。 24. 如實施例1至20中任一實施例之多肽，其包含SEQ ID NO: 11之胺基酸序列。 25. 
如實施例1至20及24中任一實施例之多肽，其包含SEQ ID NO: 12之胺基酸序列。 26. 如實施例1至20及24至25中任一實施例之多肽，其包含SEQ ID NO: 13之胺基酸序列。 27. 一種多肽，其視需要為如實施例1至26中任一實施例之多肽，其包含如下胺基酸序列： (a) 具有在對應於SEQ ID NO: 3之D501之位置處之取代，其中該取代： (i)相較於SEQ ID NO: 3之胺基酸序列，為取代D501R；或 (ii)相較於不具有在位置D501處之取代之對應胺基酸序列，增加基因編輯效率活性；及 (b) 具有： (i) 與SEQ ID NO: 1 (B-GEn.1)、SEQ ID NO: 2 (B-GEn.1.2)或SEQ ID NO: 3 (B-GEn.2)具有至少80%、至少85%、至少90%或至少95%序列一致性； (ii) (A) SEQ ID NO: 201、SEQ ID NO: 202、SEQ ID NO: 203；及SEQ ID NO: 204中之任一者之靶相互作用序列基序；(B)包含與SEQ ID NO: 8或SEQ ID NO: 11之RuvC I域具有至少40%、至少45%、至少50%、至少60%、至少70%、至少80%或至少90%序列一致性之胺基酸序列之RuvC I域；(C)包含與SEQ ID NO: 9或SEQ ID NO: 12之RuvC II域具有至少50%、至少60%、至少70%、至少80%或至少90%序列一致性之胺基酸序列之RuvC II域；(D)包含與SEQ ID NO: 10或SEQ ID NO: 13之RuvC III域具有至少80%、至少85%或至少90%序列一致性之胺基酸序列之RuvC III域；或(E) (A)、(B)、(C)及(D)中之二者、三者或全部四者之任何組合； (iii) 多至25個胺基酸插入、取代及/或缺失，相較於SEQ ID NO: 1 (B-GEn.1)、SEQ ID NO: 2 (B-GEn.1.2)或SEQ ID NO: 3 (B-GEn.2)之胺基酸序列； (iv) (b)(i)及(b)(ii)； (v) (b)(i)及(b)(iii)； (vi) (b)(ii)及(b)(iii)；或 (vii) (b)(i)、(b)(ii)及(b)(iii)。 28. 如實施例1至27中任一實施例之多肽，其具有比缺乏在對應於SEQ ID NO: 1 (B-GEn.1)之D504或SEQ ID NO: 2 (B-GEn.1.2)或SEQ ID NO: 3 (B-GEn.2)之D501之位置處之胺基酸取代之對應多肽之基因編輯效率高至少50%之基因編輯效率。 29. 如實施例1至27中任一實施例之多肽，其具有比缺乏在對應於SEQ ID NO: 1 (B-GEn.1)之D504或SEQ ID NO: 2 (B-GEn.1.2)或SEQ ID NO: 3 (B-GEn.2)之D501之位置處之胺基酸取代之對應多肽之基因編輯效率高至少70%之基因編輯效率。 30. 如實施例1至27中任一實施例之多肽，其具有比缺乏在對應於SEQ ID NO: 1 (B-GEn.1)之D504或SEQ ID NO: 2 (B-GEn.1.2)或SEQ ID NO: 3 (B-GEn.2)之D501之位置處之胺基酸取代之對應多肽之基因編輯效率高至少90%之基因編輯效率。 31. 如實施例28至30中任一實施例之多肽，其中該基因編輯效率係經由細胞內基因編輯檢定來評估，視需要，其中該基因編輯檢定如在實例1中所述。 32. 如實施例27至31中任一實施例之多肽，其中該序列一致性係相對SEQ ID NO: 1的且該等胺基酸插入、取代及/或缺失係相對於SEQ ID NO: 1。 33. 如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 1之胺基酸序列具有至少98%序列一致性。 34. 如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 1之胺基酸序列具有至少99%序列一致性。 35. 如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 1之胺基酸序列具有至少99.5%序列一致性。 36. 如實施例27至31中任一實施例之多肽，其中該序列一致性係相對SEQ ID NO: 2的且該等胺基酸插入、取代及/或缺失係相對於SEQ ID NO: 2。 37. 如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 2之胺基酸序列具有至少98%序列一致性。 38. 如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 2之胺基酸序列具有至少99%序列一致性。 39. 
如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 2之胺基酸序列具有至少99.5%序列一致性。 40. 如實施例27至31中任一實施例之多肽，其中該序列一致性係相對SEQ ID NO: 3的且該等胺基酸插入、取代及/或缺失係相對於SEQ ID NO: 3。 41. 如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 3之胺基酸序列具有至少98%序列一致性。 42. 如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 3之胺基酸序列具有至少99%序列一致性。 43. 如實施例32之多肽，其中該胺基酸序列與SEQ ID NO: 3之胺基酸序列具有至少99.5%序列一致性。 44. 一種多肽，其包含SEQ ID NO: 4之胺基酸序列。 45. 一種多肽，其包含SEQ ID NO: 5之胺基酸序列。 46. 一種多肽，其包含SEQ ID NO: 6之胺基酸序列。 47. 如實施例1至46中任一實施例之多肽，其進一步包含至少一個核定位信號(「NLS」)。 48. 如實施例47之多肽，其包含位於該胺基酸序列的C端的至少一個NLS定位，視需要，其中： (a) 該多肽缺乏位於該胺基酸序列的N端的任何NLS；或 (b) 該多肽包含位於該胺基酸序列的N端的至少一個NLS。 49. 如實施例47或48之多肽，其缺乏位於該胺基酸序列的N端的任何NLS。 50. 如實施例47至49中任一實施例之多肽，其包含介於該胺基酸序列與每個NLS之間之連接子序列。 51. 如實施例47至50中任一實施例之多肽，其中每個NLS包含獨立地選自闡明於表2中之NLS序列之胺基酸序列。 52. 如實施例47至51中任一實施例之多肽，其中每個連接子序列獨立地選自闡明於表3中之連接子序列。 53. 一種核酸，其包含編碼如實施例1至52中任一實施例之多肽之核苷酸序列。 54. 如實施例53之核酸，其中編碼如實施例1至52中任一實施例之多肽之核苷酸序列可以操作方式連接至啟動子。 55. 如實施例53或54之核酸，其中該核酸進一步編碼引導RNA。 56. 如實施例53至55中任一實施例之核酸，其呈載體之形式。 57. 如實施例56之核酸，其中該載體為表現載體。 58. 如實施例57之核酸，其中該表現載體為產生載體。 59. 如實施例57之核酸，其中該表現載體為遞送載體。 60. 如實施例56至59中任一實施例之核酸，其中該載體為RNA載體。 61. 如實施例56至59中任一實施例之核酸，其中該載體為DNA載體。 62. 如實施例61之核酸，其中該DNA載體為質體。 63. 一種細胞，其包含如實施例53至62中任一實施例之核酸。 64. 一種細胞，其經工程化以表現編碼如實施例1至52中任一實施例之多肽之核苷酸序列。 65. 如實施例63或64之細胞，其為真核細胞。 66. 如實施例65之細胞，其為昆蟲細胞。 67. 如實施例65之細胞，其為植物細胞。 68. 如實施例65之細胞，其為哺乳動物細胞。 69. 如實施例68之細胞，其為人類細胞。 70. 一種產生如實施例1至52中任一實施例之多肽之方法，其包括在其中產生該多肽之條件下培養如實施例63至69中任一實施例之細胞。 71. 如實施例70之方法，其進一步包括分離及/或純化該多肽。 72. 一種組合物，其包含： (a) 如實施例1至52中任一實施例之多肽；及 (b) 引導RNA。 73. 如實施例72之組合物，其為核糖核蛋白複合物。 74. 如實施例72或73之組合物，其中該多肽:引導RNA莫耳比之範圍為1:1至1:4。 75. 如實施例72或73之組合物，其中該多肽:引導RNA莫耳比之範圍為1:1至1:3。 76. 如實施例72或73之組合物，其中該多肽:引導RNA莫耳比之範圍為1:1.5至1:2.5。 77. 如實施例72或73之組合物，其中該多肽:引導RNA莫耳比為1:2。 78. 一種編輯細胞之基因組之方法，其包括將下列引入至該細胞中： (a) 如實施例1至52中任一實施例之多肽；及 (b) 引導RNA。 79.`一種編輯細胞之基因組之方法，其包括將編碼下列之一或多種核酸引入至該細胞中： (a) 如實施例1至52中任一實施例之多肽；及 (b) 引導RNA，視需要，其中該一或多種核酸中之至少一者為如實施例53至62中任一實施例之核酸。 80. 一種編輯細胞之基因組之方法，其包括將下列引入至該細胞中： (a) 一或多種編碼如實施例1至52中任一實施例之多肽之核酸；及 (b) 引導RNA。 81. 如實施例80之方法，其包括使該細胞接觸包含該一或多種核酸及該引導RNA之脂質奈米粒子。 82. 
一種編輯細胞之基因組之方法，其包括將編碼下列之一或多種核酸引入至該細胞中： (a) 如實施例1至52中任一實施例之多肽；及 (b) 引導RNA，視需要，其中該一或多種核酸中之至少一者為如實施例53至62中任一實施例之核酸。 83. 如實施例82之方法，其包括使該細胞與一或多個包含該一或多種核酸之重組AAV粒子接觸。 84. 如實施例82之方法，其包括使該細胞與一或多個包含該一或多種核酸之脂質奈米粒子接觸。 85. 一種編輯細胞之基因組之方法，其包括將如實施例72至77中任一實施例之組合物引入至該細胞中。 86. 如實施例78至80中任一實施例之方法，其中該細胞為哺乳動物細胞。 87. 如實施例86之方法，其中該哺乳動物細胞為人類細胞。 88. 如實施例86或87之方法，其中該哺乳動物細胞為免疫細胞，其視需要選自T細胞、表現嵌合抗原受體(CAR)或重組TCR之T細胞、調節T細胞、骨髓細胞、樹突細胞及免疫抑制巨噬細胞。 89. 如實施例78至87中任一實施例之方法，其中該細胞為造血幹細胞、紅血球先驅細胞、淋巴樣先驅細胞、周邊血液單核細胞、T淋巴細胞、B淋巴細胞、巨噬細胞、單核細胞、嗜中性球、嗜酸性球、樹突細胞或自其再程式化之細胞。 90. 如實施例78至89中任一實施例之方法，其中該細胞為幹細胞或自其分化之細胞。 91. 如實施例90之方法，其中該幹細胞為多能幹細胞(PSC)或自其分化之細胞。 92. 如實施例90或91之方法，其中該自其分化之細胞為人類免疫細胞，其視需要選自T細胞、表現嵌合抗原受體(CAR)或重組TCR之T細胞、調節T細胞、骨髓細胞、樹突細胞及免疫抑制巨噬細胞。 93. 如實施例90或91之方法，其中該自其分化之細胞為人類神經系統中之細胞，其視需要選自多巴胺性神經元、微膠質細胞、寡樹突膠質細胞、星形神經膠質細胞、皮質神經元、脊髓或動眼神經元、腸神經元、基板(Placode)衍生之細胞、雪旺氏細胞(Schwann cell)及三叉或感覺神經元。 94. 如實施例90或91之方法，其中該自其分化之細胞為人類心血管系統中之細胞，其視需要選自心肌細胞、內皮細胞及結節細胞。 95. 如實施例90或91之方法，其中該自其分化之細胞為人類代謝系統中之細胞，其視需要選自肝細胞、膽管細胞及胰臟β細胞。 96. 如實施例90或91之方法，其中該自其分化之細胞為人類眼部系統中之細胞，其視需要選自視網膜色素上皮細胞、光受體錐細胞、光受體桿細胞、雙極細胞或神經節細胞。 97. 如實施例78至80中任一實施例之方法，其中該細胞為植物細胞。 98. 一種細胞，其包含： (a) 如實施例72至77中任一實施例之組合物；或 (b) 如實施例53至62中任一實施例之核酸。 99. 如實施例98之細胞，其為哺乳動物細胞。 100. 如實施例99之細胞，其中該哺乳動物細胞為人類細胞。 101. 如實施例99或100之細胞，其中該哺乳動物細胞為免疫細胞，其視需要選自T細胞、表現嵌合抗原受體(CAR)或重組TCR之T細胞、調節T細胞、骨髓細胞、樹突細胞及免疫抑制巨噬細胞。 102. 如實施例98至100中任一實施例之細胞，其為造血幹細胞、紅血球先驅細胞、淋巴樣先驅細胞、周邊血液單核細胞、T淋巴細胞、B淋巴細胞、巨噬細胞、單核細胞、嗜中性球、嗜酸性球、樹突細胞或自其再程式化之細胞。 103. 如實施例98至102中任一實施例之細胞，其為幹細胞或自其分化之細胞。 104. 如實施例103之細胞，其中該幹細胞為多能幹細胞(PSC)或自其分化之細胞。 105. 如實施例103或104之細胞，其中該自其分化之細胞為免疫細胞，其視需要選自T細胞、表現嵌合抗原受體(CAR)或重組TCR之T細胞、調節T細胞、骨髓細胞、樹突細胞及免疫抑制巨噬細胞。 106. 如實施例103或104之細胞，其中該自其分化之細胞為人類神經系統中之細胞，其視需要選自多巴胺性神經元、微膠質細胞、寡樹突膠質細胞、星形神經膠質細胞、皮質神經元、脊髓或動眼神經元、腸神經元、基板衍生之細胞、雪旺氏細胞及三叉或感覺神經元。 107. 如實施例103或104之細胞，其中該自其分化之細胞為人類心血管系統中之細胞，其視需要選自心肌細胞、內皮細胞及結節細胞。 108. 如實施例103或104之細胞，其中該自其分化之細胞為人類代謝系統中之細胞，其視需要選自肝細胞、膽管細胞及胰臟β細胞。 109. 如實施例103或104之方法，其中該自其分化之細胞為人類眼部系統中之細胞，其視需要選自視網膜色素上皮細胞、光受體錐細胞、光受體桿細胞、雙極細胞或神經節細胞。 110. 如實施例98之細胞，其為植物細胞。 8. 實例 8.1. 材料及方法 8.1.1. 
B-GEn.2之結構模型化 Although various specific embodiments have been shown and described, it should be understood that various changes can be made without departing from the spirit and scope of the invention. The present invention is illustrated by the numbered embodiments described below. Unless otherwise specified, the features of any concept, aspect and/or embodiment described in the above detailed description can be applied to any of the following numbered embodiments. 1. A polypeptide comprising an amino acid other than aspartic acid at a position corresponding to D504 of SEQ ID NO: 1 (B-GEn.1) or D501 of SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2). 2. A polypeptide as in Example 1, which is an engineered V-type nuclease polypeptide. 3. The polypeptide of embodiment 2, which is an engineered Bacillusales type V endonuclease polypeptide. 4. The polypeptide of any one of embodiments 1 to 3, which is a B-GEn polypeptide. 5. The polypeptide of any one of embodiments 1 to 4, which has an arginine at a position corresponding to D504 of SEQ ID NO: 1 (B-GEn.1) or D501 of SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2). 6. The polypeptide of any one of embodiments 1 to 5, which comprises a target interaction sequence motif of any one of SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO: 203; and SEQ ID NO: 204. 7. The polypeptide of any one of embodiments 1 to 6, comprising a RuvC I domain comprising an amino acid sequence having at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80% or at least 90% sequence identity to the RuvC I domain of SEQ ID NO: 8 or SEQ ID NO: 11. 8. The polypeptide of embodiment 7, wherein the RuvC I domain comprises an amino acid sequence having at least 40% sequence identity to the RuvC I domain of SEQ ID NO: 8 or SEQ ID NO: 11. 9. The polypeptide of embodiment 7, wherein the RuvC I domain comprises an amino acid sequence having at least 70% sequence identity to the RuvC I domain of SEQ ID NO: 8 or SEQ ID NO: 11. 10. 
The polypeptide of embodiment 7, wherein the RuvC I domain comprises an amino acid sequence having at least 80% sequence identity to the RuvC I domain of SEQ ID NO: 8 or SEQ ID NO: 11. 11. The polypeptide of embodiment 7, wherein the RuvC I domain comprises an amino acid sequence having at least 90% sequence identity with the RuvC I domain of SEQ ID NO: 8 or SEQ ID NO: 11. 12. The polypeptide of any one of embodiments 1 to 11, comprising a RuvC II domain comprising an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80% or at least 90% sequence identity with the RuvC II domain of SEQ ID NO: 9 or SEQ ID NO: 12. 13. The polypeptide of embodiment 12, wherein the RuvC II domain comprises an amino acid sequence having at least 60% sequence identity with the RuvC II domain of SEQ ID NO: 9 or SEQ ID NO: 12. 14. The polypeptide of embodiment 12, wherein the RuvC II domain comprises an amino acid sequence having at least 70% sequence identity with the RuvC II domain of SEQ ID NO: 9 or SEQ ID NO: 12. 15. The polypeptide of embodiment 12, wherein the RuvC II domain comprises an amino acid sequence having at least 80% sequence identity with the RuvC II domain of SEQ ID NO: 9 or SEQ ID NO: 12. 16. The polypeptide of embodiment 12, wherein the RuvC II domain comprises an amino acid sequence having at least 90% sequence identity with the RuvC II domain of SEQ ID NO: 9 or SEQ ID NO: 12. 17. The polypeptide of any one of embodiments 1 to 16, comprising a RuvC III domain comprising an amino acid sequence having at least 80%, at least 85% or at least 90% sequence identity with the RuvC III domain of SEQ ID NO: 10 or SEQ ID NO: 13. 18. The polypeptide of embodiment 17, wherein the RuvC III domain comprises an amino acid sequence having at least 80% sequence identity with the RuvC III domain of SEQ ID NO: 10 or SEQ ID NO: 13. 19. 
The polypeptide of embodiment 17, wherein the RuvC III domain comprises an amino acid sequence having at least 85% sequence identity with the RuvC III domain of SEQ ID NO: 10 or SEQ ID NO: 13. 20. The polypeptide of embodiment 17, wherein the RuvC III domain comprises an amino acid sequence having at least 90% sequence identity with the RuvC III domain of SEQ ID NO: 10 or SEQ ID NO: 13. 21. The polypeptide of any one of embodiments 1 to 20, comprising the amino acid sequence of SEQ ID NO: 8. 22. The polypeptide of any one of embodiments 1 to 21, comprising the amino acid sequence of SEQ ID NO: 9. 23. The polypeptide of any one of embodiments 1 to 22, comprising the amino acid sequence of SEQ ID NO: 10. 24. The polypeptide of any one of embodiments 1 to 20, comprising the amino acid sequence of SEQ ID NO: 11. 25. The polypeptide of any one of embodiments 1 to 20 and 24, comprising the amino acid sequence of SEQ ID NO: 12. 26. The polypeptide of any one of embodiments 1 to 20 and 24 to 25, comprising the amino acid sequence of SEQ ID NO: 13. 27. 
A polypeptide, which is optionally a polypeptide as in any one of embodiments 1 to 26, comprising an amino acid sequence: (a) having a substitution at a position corresponding to D501 of SEQ ID NO: 3, wherein the substitution: (i) is a substitution D501R compared to the amino acid sequence of SEQ ID NO: 3; or (ii) increases gene editing efficiency compared to the corresponding amino acid sequence without the substitution at position D501; and (b) having: (i) at least 80%, at least 85%, at least 90% or at least 95% sequence identity with SEQ ID NO: 1 (B-GEn.1), SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2); (ii) (A) a target interaction sequence motif of any one of SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO: 203 and SEQ ID NO: 204; (B) a RuvC I domain comprising an amino acid sequence having at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80% or at least 90% sequence identity to the RuvC I domain of SEQ ID NO: 8 or SEQ ID NO: 11; (C) a RuvC II domain comprising an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80% or at least 90% sequence identity to the RuvC II domain of SEQ ID NO: 9 or SEQ ID NO: 12; (D) a RuvC III domain comprising an amino acid sequence having at least 80%, at least 85% or at least 90% sequence identity to the RuvC III domain of SEQ ID NO: 10 or SEQ ID NO: 13; or (E) any combination of two, three or all four of (A), (B), (C) and (D); (iii) up to 25 amino acid insertions, substitutions and/or deletions compared to the amino acid sequence of SEQ ID NO: 1 (B-GEn.1), SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2); (iv) (b)(i) and (b)(ii); (v) (b)(i) and (b)(iii); (vi) (b)(ii) and (b)(iii); or (vii) (b)(i), (b)(ii) and (b)(iii). 28. The polypeptide of any one of embodiments 1 to 27, which has a gene editing efficiency that is at least 50% higher than the gene editing efficiency of a corresponding polypeptide lacking an amino acid substitution at a position corresponding to D504 of SEQ ID NO: 1 (B-GEn. 
1) or SEQ ID NO: 2 (B-GEn. 1.2) or SEQ ID NO: 3 (B-GEn. 2). 29. The polypeptide of any one of embodiments 1 to 27, which has a gene editing efficiency that is at least 70% higher than the gene editing efficiency of a corresponding polypeptide lacking an amino acid substitution at a position corresponding to D504 of SEQ ID NO: 1 (B-GEn. 1) or SEQ ID NO: 2 (B-GEn. 1.2) or SEQ ID NO: 3 (B-GEn. 2). 30. The polypeptide of any one of embodiments 1 to 27, which has a gene editing efficiency that is at least 90% higher than the gene editing efficiency of the corresponding polypeptide lacking the amino acid substitution at the position corresponding to D504 of SEQ ID NO: 1 (B-GEn.1) or SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2). 31. The polypeptide of any one of embodiments 28 to 30, wherein the gene editing efficiency is assessed by an in-cell gene editing assay, optionally, wherein the gene editing assay is as described in Example 1. 32. The polypeptide of any one of embodiments 27 to 31, wherein the sequence identity is relative to SEQ ID NO: 1 and the amino acid insertions, substitutions and/or deletions are relative to SEQ ID NO: 1. 33. The polypeptide of embodiment 32, wherein the amino acid sequence has at least 98% sequence identity to the amino acid sequence of SEQ ID NO: 1. 34. The polypeptide of embodiment 32, wherein the amino acid sequence has at least 99% sequence identity to the amino acid sequence of SEQ ID NO: 1. 35. The polypeptide of embodiment 32, wherein the amino acid sequence has at least 99.5% sequence identity to the amino acid sequence of SEQ ID NO: 1. 36. The polypeptide of any one of embodiments 27 to 31, wherein the sequence identity is relative to SEQ ID NO: 2 and the amino acid insertions, substitutions and/or deletions are relative to SEQ ID NO: 2. 37. The polypeptide of embodiment 32, wherein the amino acid sequence has at least 98% sequence identity to the amino acid sequence of SEQ ID NO: 2. 38. 
The polypeptide of embodiment 32, wherein the amino acid sequence has at least 99% sequence identity to the amino acid sequence of SEQ ID NO: 2. 39. The polypeptide of embodiment 32, wherein the amino acid sequence has at least 99.5% sequence identity to the amino acid sequence of SEQ ID NO: 2. 40. The polypeptide of any one of embodiments 27 to 31, wherein the sequence identity is relative to SEQ ID NO: 3 and the amino acid insertions, substitutions and/or deletions are relative to SEQ ID NO: 3. 41. The polypeptide of embodiment 32, wherein the amino acid sequence has at least 98% sequence identity to the amino acid sequence of SEQ ID NO: 3. 42. The polypeptide of embodiment 32, wherein the amino acid sequence has at least 99% sequence identity to the amino acid sequence of SEQ ID NO: 3. 43. The polypeptide of embodiment 32, wherein the amino acid sequence has at least 99.5% sequence identity to the amino acid sequence of SEQ ID NO: 3. 44. A polypeptide comprising the amino acid sequence of SEQ ID NO: 4. 45. A polypeptide comprising the amino acid sequence of SEQ ID NO: 5. 46. A polypeptide comprising the amino acid sequence of SEQ ID NO: 6. 47. The polypeptide of any one of embodiments 1 to 46, further comprising at least one nuclear localization signal ("NLS"). 48. The polypeptide of embodiment 47, comprising at least one NLS located at the C-terminus of the amino acid sequence, optionally wherein: (a) the polypeptide lacks any NLS located at the N-terminus of the amino acid sequence; or (b) the polypeptide comprises at least one NLS located at the N-terminus of the amino acid sequence. 49. The polypeptide of embodiment 47 or 48, lacks any NLS located at the N-terminus of the amino acid sequence. 50. The polypeptide of any one of embodiments 47 to 49, comprising a linker sequence between the amino acid sequence and each NLS. 51. 
The polypeptide of any one of embodiments 47 to 50, wherein each NLS comprises an amino acid sequence independently selected from the NLS sequences described in Table 2. 52. The polypeptide of any one of embodiments 47 to 51, wherein each linker sequence is independently selected from the linker sequences described in Table 3. 53. A nucleic acid comprising a nucleotide sequence encoding a polypeptide of any one of embodiments 1 to 52. 54. The nucleic acid of embodiment 53, wherein the nucleotide sequence encoding a polypeptide of any one of embodiments 1 to 52 is operably linked to a promoter. 55. The nucleic acid of embodiment 53 or 54, wherein the nucleic acid further encodes a guide RNA. 56. The nucleic acid of any one of embodiments 53 to 55, which is in the form of a vector. 57. The nucleic acid of embodiment 56, wherein the vector is an expression vector. 58. The nucleic acid of embodiment 57, wherein the expression vector is a production vector. 59. The nucleic acid of embodiment 57, wherein the expression vector is a delivery vector. 60. The nucleic acid of any one of embodiments 56 to 59, wherein the vector is an RNA vector. 61. The nucleic acid of any one of embodiments 56 to 59, wherein the vector is a DNA vector. 62. The nucleic acid of embodiment 61, wherein the DNA vector is a plasmid. 63. A cell comprising the nucleic acid of any one of embodiments 53 to 62. 64. A cell engineered to express a nucleotide sequence encoding a polypeptide of any one of embodiments 1 to 52. 65. The cell of embodiment 63 or 64, which is a eukaryotic cell. 66. The cell of embodiment 65, which is an insect cell. 67. The cell of embodiment 65, which is a plant cell. 68. The cell of embodiment 65, which is a mammalian cell. 69. The cell of embodiment 68, which is a human cell. 70. A method of producing the polypeptide of any one of embodiments 1 to 52, comprising culturing the cell of any one of embodiments 63 to 69 under conditions wherein the polypeptide is produced. 71. 
The method of embodiment 70, further comprising isolating and/or purifying the polypeptide. 72. A composition comprising: (a) the polypeptide of any one of embodiments 1 to 52; and (b) a guide RNA. 73. The composition of embodiment 72, which is a ribonucleoprotein complex. 74. The composition of embodiment 72 or 73, wherein the molar ratio of polypeptide:guide RNA ranges from 1:1 to 1:4. 75. The composition of embodiment 72 or 73, wherein the molar ratio of polypeptide:guide RNA ranges from 1:1 to 1:3. 76. The composition of embodiment 72 or 73, wherein the molar ratio of polypeptide:guide RNA ranges from 1:1.5 to 1:2.5. 77. The composition of embodiment 72 or 73, wherein the molar ratio of polypeptide:guide RNA is 1:2. 78. A method of editing the genome of a cell, comprising introducing into the cell: (a) a polypeptide according to any one of embodiments 1 to 52; and (b) a guide RNA. 79. A method of editing the genome of a cell, comprising introducing into the cell one or more nucleic acids encoding: (a) a polypeptide according to any one of embodiments 1 to 52; and (b) a guide RNA, optionally, wherein at least one of the one or more nucleic acids is a nucleic acid according to any one of embodiments 53 to 62. 80. A method of editing the genome of a cell, comprising introducing into the cell: (a) one or more nucleic acids encoding a polypeptide according to any one of embodiments 1 to 52; and (b) a guide RNA. 81. The method according to embodiment 80, comprising contacting the cell with a lipid nanoparticle comprising the one or more nucleic acids and the guide RNA. 82. A method of editing the genome of a cell, comprising introducing into the cell a nucleic acid encoding one or more of: (a) a polypeptide according to any one of embodiments 1 to 52; and (b) a guide RNA, optionally, wherein at least one of the one or more nucleic acids is a nucleic acid according to any one of embodiments 53 to 62. 83. 
The method according to embodiment 82, comprising contacting the cell with one or more recombinant AAV particles comprising the one or more nucleic acids. 84. The method according to embodiment 82, comprising contacting the cell with one or more lipid nanoparticles comprising the one or more nucleic acids. 85. A method for editing the genome of a cell, comprising introducing into the cell a composition according to any one of embodiments 72 to 77. 86. The method of any one of embodiments 78 to 80, wherein the cell is a mammalian cell. 87. The method of embodiment 86, wherein the mammalian cell is a human cell. 88. The method of embodiment 86 or 87, wherein the mammalian cell is an immune cell, optionally selected from T cells, T cells expressing a chimeric antigen receptor (CAR) or a recombinant TCR, regulatory T cells, bone marrow cells, dendritic cells, and immunosuppressive macrophages. 89. The method of any one of embodiments 78 to 87, wherein the cell is a hematopoietic stem cell, an erythroid progenitor cell, a lymphoid progenitor cell, a peripheral blood mononuclear cell, a T lymphocyte, a B lymphocyte, a macrophage, a monocyte, a neutrophil, an eosinophil, a dendritic cell, or a cell reprogrammed therefrom. 90. The method of any one of embodiments 78 to 89, wherein the cell is a stem cell or a cell differentiated therefrom. 91. The method of embodiment 90, wherein the stem cell is a pluripotent stem cell (PSC) or a cell differentiated therefrom. 92. The method of embodiment 90 or 91, wherein the cell differentiated therefrom is a human immune cell, which is optionally selected from T cells, T cells expressing chimeric antigen receptor (CAR) or recombinant TCR, regulatory T cells, bone marrow cells, dendritic cells and immunosuppressive macrophages. 93. 
The method of embodiment 90 or 91, wherein the cell differentiated therefrom is a cell of the human nervous system, which is optionally selected from dopaminergic neurons, microglia, oligodendrocytes, astroglial cells, cortical neurons, spinal or oculomotor neurons, intestinal neurons, placode-derived cells, Schwann cells, and trigeminal or sensory neurons. 94. The method of embodiment 90 or 91, wherein the cell differentiated therefrom is a cell of the human cardiovascular system, which is optionally selected from cardiac myocytes, endothelial cells, and nodal cells. 95. The method of embodiment 90 or 91, wherein the cell differentiated therefrom is a cell in the human metabolic system, which is optionally selected from liver cells, bile duct cells and pancreatic β cells. 96. The method of embodiment 90 or 91, wherein the cell differentiated therefrom is a cell in the human eye system, which is optionally selected from retinal pigment epithelial cells, photoreceptor cone cells, photoreceptor rod cells, bipolar cells or ganglion cells. 97. The method of any one of embodiments 78 to 80, wherein the cell is a plant cell. 98. A cell comprising: (a) the composition of any one of embodiments 72 to 77; or (b) the nucleic acid of any one of embodiments 53 to 62. 99. The cell of embodiment 98, which is a mammalian cell. 100. The cell of embodiment 99, wherein the mammalian cell is a human cell. 101. The cell of embodiment 99 or 100, wherein the mammalian cell is an immune cell, which is optionally selected from T cells, T cells expressing chimeric antigen receptors (CAR) or recombinant TCRs, regulatory T cells, bone marrow cells, dendritic cells, and immunosuppressive macrophages. 102. 
The cell according to any one of embodiments 98 to 100, which is a hematopoietic stem cell, an erythroid progenitor cell, a lymphoid progenitor cell, a peripheral blood mononuclear cell, a T lymphocyte, a B lymphocyte, a macrophage, a monocyte, a neutrophil, an eosinophil, a dendritic cell, or a cell reprogrammed therefrom. 103. The cell according to any one of embodiments 98 to 102, which is a stem cell or a cell differentiated therefrom. 104. The cell according to embodiment 103, wherein the stem cell is a pluripotent stem cell (PSC) or a cell differentiated therefrom. 105. The cell of embodiment 103 or 104, wherein the cell differentiated therefrom is an immune cell, which is optionally selected from T cells, T cells expressing chimeric antigen receptor (CAR) or recombinant TCR, regulatory T cells, bone marrow cells, dendritic cells and immunosuppressive macrophages. 106. The cell of embodiment 103 or 104, wherein the cell differentiated therefrom is a cell of the human nervous system, which is optionally selected from dopaminergic neurons, microglia, oligodendrocytes, astroglial cells, cortical neurons, spinal or oculomotor neurons, intestinal neurons, placode-derived cells, Schwann cells, and trigeminal or sensory neurons. 107. The cell of embodiment 103 or 104, wherein the cell differentiated therefrom is a cell of the human cardiovascular system, which is optionally selected from cardiac myocytes, endothelial cells, and nodal cells. 108. The cell of embodiment 103 or 104, wherein the cell differentiated therefrom is a cell in the human metabolic system, which is optionally selected from liver cells, bile duct cells and pancreatic β cells. 109. The cell of embodiment 103 or 104, wherein the cell differentiated therefrom is a cell in the human eye system, which is optionally selected from retinal pigment epithelial cells, photoreceptor cone cells, photoreceptor rod cells, bipolar cells or ganglion cells. 110. The cell of embodiment 98, which is a plant cell.

8. Examples

8.1. Materials and Methods

8.1.1. Structural Modeling of B-GEn.2

The structure of B-GEn.2 has not been characterized to date. The structure of B-GEn.2 was predicted using AlphaFold2 software and compared against a database of proteins with known crystal structures. Based on the 14 top crystal structure hits, a set of candidate structures for B-GEn.2 was generated. These data indicated that the predicted structure of B-GEn.2 agreed well with several crystal structures from the bacterial order Bacillales.

Based on these alignments, NCBI BLASTp searches within the Bacillales were performed using the OBD II and RuvC I domains of B-GEn.2. Approximately 20 Cas nucleases from related source organisms, with approximately 40 to 70% sequence identity, were selected for multiple sequence alignment using the MUSCLE alignment algorithm. The phylogenetic relationships of these nucleases are depicted in the tree of Figure 2, and the alignment itself is shown in Figure 4.

8.1.2. Design of B-GEn.2 variants with amino acid substitutions

Two parallel approaches were used to identify amino acid substitutions that could lead to enhanced indel activity of B-GEn.2. In the first approach, DNAStar Lasergene 17 software was used to identify amino acid residues in the AacC2c1 crystal structure located within 3 angstroms of any of the target DNA substrate, the guide spacer RNA, or the guide tracr RNA. The same procedure was performed on the predicted structure of B-GEn.2, with particular attention to cases where the selected amino acid residue differed from the corresponding residue of AacC2c1.
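
The distance screen in the first approach can be illustrated with a short sketch. The disclosure states only that DNAStar Lasergene 17 was used; the `Atom` record, the `contact_residues` helper and the toy coordinates below are hypothetical and do not reproduce that software's method. Treating the 3 angstrom limit as inclusive is also an assumption.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    """Hypothetical minimal atom record (coordinates in angstroms)."""
    chain: str
    residue: str
    resnum: int
    x: float
    y: float
    z: float


def contact_residues(protein_atoms, nucleic_atoms, cutoff=3.0):
    """Return sorted (resnum, residue) pairs for protein residues that have
    at least one atom within `cutoff` angstroms of any nucleic acid atom."""
    hits = set()
    for p in protein_atoms:
        for n in nucleic_atoms:
            if math.dist((p.x, p.y, p.z), (n.x, n.y, n.z)) <= cutoff:
                hits.add((p.resnum, p.residue))
                break  # one close atom is enough for this protein atom
    return sorted(hits)


# Toy example: one Arg atom 2.5 angstroms from a DNA atom, one Asp atom 7.5 away.
protein = [Atom("A", "ARG", 10, 0.0, 0.0, 0.0), Atom("A", "ASP", 11, 10.0, 0.0, 0.0)]
nucleic = [Atom("B", "DG", 1, 2.5, 0.0, 0.0)]
print(contact_residues(protein, nucleic))
```

Running the same screen on a reference structure and on a predicted structure, then comparing the two residue lists position by position, would mirror the comparison described above.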

The second approach used Swiss-PdbViewer (also known as DeepView; https://spdbv.unil.ch/) to identify amino acids within the crystal structure of AacC2c1 (RCSB entry 5U31) that interact, via predicted hydrogen bonding, with any of the target DNA substrate, the guide spacer RNA, or the guide tracr RNA. Then, by aligning the sequence of AacC2c1 with that of B-GEn.2, a search was performed for cases in which the predicted protein-nucleic acid contact residues of AacC2c1 differed in B-GEn.2. To narrow the focus from potentially dozens of candidate mutants to a more manageable number for final protein production and purification, further evaluation focused on Arg or Lys residues that form close contacts with the target or non-target DNA strand in the AacC2c1 crystal but do not correspond to Arg or Lys residues in the predicted B-GEn.2 structure.

8.1.3. Expression and purification of B-GEn.2 variants

Plasmids were constructed for expression of selected point mutants of B-GEn.2. All constructs were expressed in BL21(DE3) cells by first chemically transforming the encoding plasmids into these E. coli cells and plating on antibiotic-selection LB plates. Colonies were scraped and transferred to 250 mL of MagicMedia (Thermo Scientific). Cultures were grown at 37°C for 4 hours and then switched to 16°C for 40 hours for protein expression. Cell pellets were then harvested by centrifugation at 5000 x g for 15 minutes at 4°C and frozen for later processing. For thawing, pellets were resuspended in lysis buffer (500 mM NaCl, 50 mM Tris (pH 8.0), 5% glycerol, 5 mM EDTA, 0.5 mM TCEP) and sonicated, and the resulting lysate was cleared by centrifugation at 50,000 x g for 30 minutes at 4°C. The cleared lysate was loaded onto a Heparin Sepharose 6 FF column with loading buffer (100 mM NaCl, 50 mM Tris (pH 8.0), 5% glycerol, 5 mM EDTA, 0.5 mM TCEP), followed by a wash step with the same buffer. The protein was eluted in steps of 500 mM, 700 mM and 1000 mM NaCl in the same buffer. The protein in the main fraction was concentrated to approximately 20 mg/mL. Protein purity was estimated by band densitometry of Coomassie-stained SDS-PAGE gels, and aliquots of the purified protein were frozen at -80°C for subsequent use.

8.1.4. RNP production

Purified and concentrated variant B-GEn.2 nucleases were formed into RNPs with a guide RNA of the following sequence targeting a site within the B2M gene, with the spacer underlined:

*mU*mA*GCUAUAGGCUAAUAAGAUAGUUGUGUCAAGUGCUUCGGAGACCUAACACGUCUCCAGUCACAACGGCUAAAAAUAGCCAGCAC AGUGUAGUACAAGAGAUAGA*mA*mA*mG (SEQ ID NO: 178)

RNPs were assembled by mixing the B2M-targeting guide RNA and nuclease at a molar ratio of 2:1 in a buffer containing 225 mM NaCl and incubating at room temperature for approximately 30 minutes. The complexation efficiency of the RNPs was assessed by cation exchange UPLC on an Agilent Bio SCX (NP1.7, SS) column using a linear NaCl elution gradient. Buffer A consisted of 10 mM sodium phosphate, 100 mM NaCl, pH 6.5, and buffer B of 10 mM sodium phosphate, 1 M NaCl, pH 6.5. The percent complexation efficiency was obtained by dividing the peak area of the RNP by the peak area of the total nuclease added (equivalent sample without guide RNA) and multiplying by 100. Values ranged from 70 to 90% complexation, depending on the nuclease variant tested.

The ability of the RNPs to cleave DNA in vitro was assessed using an in vitro plasmid cleavage assay. Increasing amounts of RNP were mixed in CutSmart buffer (New England Biolabs) with a fixed amount of a plasmid containing the B2M target site that had been linearized with the XhoI enzyme. After incubation at 37°C for 30 minutes, the reaction was quenched in the presence of proteinase K at 50°C for 10 minutes. The samples were applied to D5000 ScreenTape and analyzed on an Agilent 4200 TapeStation instrument. The percent cleavage for each reaction was calculated as the peak area of the product relative to the peak area of the total substrate added (equivalent sample without RNP), multiplied by 100. EC₅₀ values for each variant protein (or Cpf1 control) series, which ranged from 0.07 to 0.21 nM, were determined using Prism software from GraphPad (version 9.3.1).

8.1.5. In-cell gene editing

iPSCs were cultured in substrate-coated T75 flasks with Essential 8 (E8) growth medium and maintained at 37°C and 5% CO₂ between passages. Cells were passaged at 75 to 80% confluency. On the day of nucleofection, iPSCs were detached from the flask using Accutase™ Cell Dissociation Solution (Stem Cell Technologies) by incubation at 37°C for 10 minutes, followed by quenching with an equal volume of E8 medium. Cell pellets were harvested by centrifugation at 115 x g for 3 minutes and resuspended in Lonza P3 Primary Cell Nucleofection Buffer. For each nuclease construct, ribonucleoproteins (RNPs) were assembled with sgRNA (IDT) at a 1:2 protein:sgRNA ratio. The complexed RNPs were nucleofected into the P3-resuspended iPSCs using the Lonza 4D nucleofection instrument. Nucleofected cells were then seeded at 150,000 cells per well in substrate-coated 24-well Falcon flat-bottom plates (Corning) and grown for 72 to 96 hours in E8 growth medium with ROCK inhibitor Y-27632 (Tocris). After harvest, some of the cells were stained for flow cytometry using an APC-conjugated anti-B2M antibody from BioLegend (for B2M targeting experiments). The remaining cells were resuspended in 30 µL of lysis buffer from Bio-Rad's SingleShot Cell Lysis Kit, and crude gDNA was extracted by incubation at room temperature for 10 minutes, then at 37°C for 5 minutes, followed by proteinase K inactivation at 75°C for 5 minutes. 1 µL of crude gDNA extract was used directly in each 25 µL PCR reaction for the first step of amplicon sequencing, followed by end-prep, indexing, and sequencing on an Illumina MiSeq sequencer.

8.2. Example 1: Design and expression of variant B-GEn.2 with amino acid substitutions

The type V endonuclease B-GEn.2 (SEQ ID NO: 3) from a Brevibacterium species was previously evaluated for gene editing activity in several proprietary iPSC lines. To improve indel formation, a plausible structure of wild-type B-GEn.2 was identified in silico as described in Section 8.1.1. Variant B-GEn.2 sequences with single amino acid point mutations were designed and generated as described in Section 8.1.2. Selected B-GEn.2 variants were expressed in BL21(DE3) cells and purified as described in Section 8.1.3.

The lead plausible structure of B-GEn.2 was found to be fully consistent with the previously published crystal structures of the Cas nuclease AacC2c1 from Alicyclobacillus acidoterrestris (Figures 3A to 3C) and of the Cas nuclease BthC2C1 from the organism Geobacillus thermoleovorans (not shown).

Next, the amino acid sequences of AacC2c1 and B-GEn.2 were aligned to determine amino acid differences at key positions. The sequence alignment revealed approximately 37% amino acid sequence identity between the two nucleases. Using the differences between the AacC2c1 and B-GEn.2 amino acid sequences, non-Arg and non-Lys residues of the predicted B-GEn.2 structure that correspond to Arg or Lys residues forming close contacts with target and non-target DNA in the AacC2c1 crystal structure were identified as mutation targets. An initial list of 19 mutation targets was identified using DNAStar and DeepView, as described in Section 8.1.2. From this list, 8 mutation targets were selected for further evaluation and were produced and purified as described in Section 8.1.3.
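
The identity calculation and the target-selection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the aligned strings are toy data, `percent_identity` counts only alignment columns without gaps (one of several common conventions, and the disclosure does not say which was applied), and `candidate_positions` and its `ref_contacts` argument are hypothetical names.

```python
def percent_identity(aln_a, aln_b):
    """Percent identical residues over alignment columns that contain no gap.
    The choice of denominator is an assumption made for this sketch."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def candidate_positions(aln_ref, aln_query, ref_contacts):
    """List (ref_pos, ref_aa, query_pos, query_aa) where the reference has a
    DNA-contacting Arg/Lys and the query has a non-Arg, non-Lys residue.
    `ref_contacts` holds 1-based reference residue numbers."""
    ref_pos = query_pos = 0
    out = []
    for r, q in zip(aln_ref, aln_query):
        if r != "-":
            ref_pos += 1
        if q != "-":
            query_pos += 1
        if r in ("R", "K") and q not in ("R", "K", "-") and ref_pos in ref_contacts:
            out.append((ref_pos, r, query_pos, q))
    return out


# Toy alignment: reference on top, query below; '-' marks a gap.
ref = "MKR-AD"
query = "MDRQ-E"
print(percent_identity(ref, query))
print(candidate_positions(ref, query, {2, 3}))
```

In this toy case only reference position 2 (Lys aligned to Asp) qualifies; position 3 is skipped because the query already carries an Arg there.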

The expression profiles of all eight B-GEn.2 variants were comparable to that of wild-type B-GEn.2 (Table 7). A representative Coomassie-stained SDS-PAGE image of the single-step heparin sulfate-based purification of one of the variants (B-GEn.2 D501R) is shown in Figure 5.

Table 7
B-GEn.2 | Measured concentration (mg/mL) | Estimated purity (%) | Adjusted concentration (mg/mL)
WT | 9.0 | 65% | 5.8
Variant 1 | 6.0 | 61% | 3.7
Variant 2 | 8.9 | 74% | 6.6
Variant 3 | 6.4 | 74% | 4.7
Variant 4 | 10.0 | 59% | 5.9
Variant 5 | 9.0 | 76% | 6.8
Variant 6 | 17.0 | 72% | 12.2
Variant 7 | 12.3 | 59% | 7.3
Variant 8 | 9.0 | 65% | 5.9

8.3. Example 2: Characterization of RNPs with B-GEn.2 variants

Purified B-GEn.2 variants were formed into RNPs with a guide RNA targeting a site within the B2M gene, as described in Section 8.1.4. The efficiency of RNP formation and the efficacy of in vitro cleavage of the target B2M site were assessed.

The results demonstrated that the B-GEn.2 variants were able to complex efficiently with the targeting guide RNA to form RNPs, with complexation efficiencies of the B-GEn.2 variants in the range of 71 to 92%.
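
As a worked illustration of how such a percentage follows from the peak-area ratio defined in Section 8.1.4, a minimal sketch is given here. The function name and the peak areas are hypothetical and are not measured values from this disclosure.

```python
def percent_complexation(rnp_peak_area, nuclease_only_peak_area):
    """Percent complexation = RNP peak area divided by the peak area of the
    total nuclease added (equivalent sample without guide RNA), times 100."""
    if nuclease_only_peak_area <= 0:
        raise ValueError("reference peak area must be positive")
    return 100.0 * rnp_peak_area / nuclease_only_peak_area


# Hypothetical peak areas in arbitrary units, for illustration only.
print(percent_complexation(71.0, 100.0))
```

The same ratio-times-100 form applies to the percent cleavage in the plasmid cleavage assay, with the product peak area in the numerator and the total substrate peak area (sample without RNP) in the denominator.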

The in vitro activity assay indicated no detectable significant difference between the B-GEn.2 variants and WT B-GEn.2. The target plasmid cleavage efficiency of the B-GEn.2 variants, as assessed by EC₅₀, was found to be in the range of 0.07 to 0.17 nM. Figure 6 shows the cleavage efficiency of one of the B-GEn.2 variants, B-GEn.2 D501R, relative to that of WT B-GEn.2 and Cpf1 at different nuclease:linearized plasmid ratios.

8.4. Example 3: Efficient in-cell gene editing of the B2M locus in iPSCs with B-GEn.2 D501R

As described in Section 8.1.4, in-cell gene editing by B-GEn.2 D501R was assessed by examining indel generation at the B2M target locus in iPSCs and was compared with the gene editing achieved by WT B-GEn.2 and AsCpf1 Ultra (IDT).

An indel formation efficiency of greater than 80% was achieved with B-GEn.2 D501R, which was greater than that achieved with Cpf1 (Figure 7). The indel formation efficiency achieved with B-GEn.2 D501R was 2 to 3 times higher than that achieved with WT B-GEn.2 (Figure 7). This result indicates that replacing Asp at D501 with Arg, a residue that at the corresponding position in the crystal structure of AacC2c1 is shown to hydrogen bond to the +1 phosphate backbone group of the target DNA strand, enhances the ability of the B-GEn.2 variant to cleave its target substrate.

9. Sequence Listing

Exemplary sequences of the present invention are provided in Table 8 below (wherein "SEQ" refers to SEQ ID NO).

Table 8

SEQ | Description | Sequence
1 | WT B-GEn.1 aa sequence |
2 | WT B-GEn.1.2 aa sequence |
3 | WT B-GEn.2 aa sequence |
4 | B-GEn.1 D504R aa sequence |
5 | B-GEn.1.2 D501R aa sequence |
6 | B-GEn.2 D501R aa sequence |
7 | B-GEn consensus I aa sequence |
8 | B-GEn.1 RuvC I domain aa sequence |
9 | B-GEn.1 RuvC II domain aa sequence |
10 | B-GEn.1 RuvC III domain aa sequence | ADINAAQNLQKRFWL
11 | B-GEn.1.2 and B-GEn.2 RuvC I domain aa sequence | KELSVLMENTQIGNENGVSTIEAGMRIMSIDLGQRTAAAVSIFEVISKKPDEKETKLFYPIADTDLYAVHRRSLLLRLPGEEISS
12 | B-GEn.1.2 and B-GEn.2 RuvC II domain aa sequence | CQVILFEDLSRYRFALDRPRRENNRLMKWAHRSIPRLTYMQAELFGIQVGDV
13 | B-GEn.1.2 and B-GEn.2 RuvC III domain aa sequence | ADINAAQNLQKRFWQ
14 | Wild-type SV40 large T protein | PKKKRKV
15 | Nucleoplasmin | KRPAATKKAGQAKKKK
16 | c-myc | PAAKRVKLD
17 | EGL-13 | MSRRRKANPTKLSENAKKLAKEVEN
18 | TUS | KLKIKRPVK
19 | Basic PY-NLS | (KR)-X(0,2)-(KR)-(KR)-x(3,10)-(RHK)-X(1,5)-PY
20 | SV40 large T protein, variant 1 | PKKKRMV
21 | SV40 large T protein, variant 2 | PKKKRKWEDP
22 | SV40 large T protein, variant 3 | CGYGPKKKRKVGG
23 | SV40 large T protein, variant 4 | CGYGPKKKRKV
24 | SV40 large T protein long NLS | CYDDEATADSQHSTPPKKKRKWEDPKDFESELLS
25 | SV40 large T protein, variant 5 | CGGPKKKRKWG
26 | SV40 large T protein, variant 6 | PKKKIKW
27 | SV40 large T protein, variant 7 | KRTADGSEFESPKKKRKV
28 | Nucleoplasmin min-NLS | TKKAGQAKKK
29 | Nucleoplasmin NLS variant 1 | TKKAGQAKKKKLD
30 | Nucleoplasmin NLS variant 2 | CGQAKKKKLD
31 | c-myc NLS2 | RQRRNELKRSP
32 | Polyoma large T protein | PKKARED
33 | Polyomavirus large T protein | CGYGWSRKRPRPG
34 | SV40 VP1 capsid polypeptide | APTKRKGS
35 | Polyomavirus major capsid protein VP1 (11 N-terminal aa) | APKRKSGVSKC
36 | SV40 VP2 capsid protein (39 kD) | PNKKKRK
37 | Polyomavirus capsid protein VP2 | EEDGPQKKKRRL
38 | Yeast histone H2B | GKKRSKA
39 | Adenovirus E1a | KRPRP
40 | Adenovirus type 2/5 E1a | CGGLSSKRPRP
41 | Xenopus N1, NLS1 | LVRKKRKTE3SP
42 | Xenopus N1, NLS2 | LKDKDAKKSKQE
43 | V-Rel | GNKAKRQRST
44 | NS1 protein of influenza A virus | PFLDRLRRDQK
45 | Human laminin A | SVTKKRKLE
46 | Xenopus laminin A | SASKRRRLE
47 | Human c-myc | ACIDKRVKLD
48 | Murine c-abl | SALIKKKKKMAP
49 | Adenovirus 5 DBP | PPKKRMRRRIE
50 | Rat glucocorticoid receptor | YRKCLQAGMNLEARKTKKKIKGIQQATA
51 | Human glucocorticoid receptor | CGYGARKTKKKIK
52 | Human glucocorticoid receptor NLS variant | RKCLQAGMNLEARKTKK
53 | Rabbit progesterone receptor | RKFKKFNK
54 | Human estrogen receptor | CGYGIRKDRRGGR
55 | Human androgen receptor | CGYGARKLKKLGN
56 | Chicken Ets1 core NLS | GKRKNKPK
57 | c-myb | PLLKKIKQ
58 | N-myc | PPQKKIKS
59 | p53 | PQPKKKP
60 | p53 NLS variant | PQPKKKPL
61 | c-erb-A | SKRWAKRKL
62 | Yeast ribosomal protein L29 NLS | MTGSKTRKHRGSGA
63 | Yeast ribosomal protein L29 NLS, variant 1 | MTGSKHRKHPGSGA
64 | Yeast ribosomal protein L29 NLS, variant 2 | RHRKHP
65 | Yeast ribosomal protein L29 NLS, variant 3 | KRRKHP
66 | Yeast ribosomal protein L29 NLS, variant 4 | KYRKHP
67 | Yeast ribosomal protein L29 NLS, variant 5 | KHRRHP
68 | Yeast ribosomal protein L29 NLS, variant 6 | KHKKHP
69 | Yeast ribosomal protein L29 NLS, variant 7 | RHLKHP
70 | Yeast ribosomal protein L29 NLS, variant 8 | KHRKYP
71 | Yeast ribosomal protein L29 NLS, variant 9 | KHRQHP
72 | Xenopus N1 NLS1 | LVRKKRKTE3SP
73 | Xenopus N1 NLS2 | LKDKDAKKSKQE
74 | Viral Jun | ASKSRKRKL
75 | Human T-cell leukemia virus Tax transactivator protein | GGLCSARLHRHALLAT
76 | Mouse nuclear MX1 protein (72 kD) | DTREKKKFLKRRLLRLDE
77 | Mouse nuclear MX1 protein NLS variant | REKKKFLKRR
78 | Human retinoic acid receptor | CGYGDRNKKKKE
79 | Human XPAC | RKRQRALMLRQAR
80 | T-DNA-linked VirD2 endonuclease | EYLSRKGKLEL
206 | Putative core NLS of yeast TRM1 | KKSKKKRC
81 | Gag protein of human foamy retrovirus | QPQRYGGGRGRRW
82 | SV40 Vp3 structural protein | NKKKRKLSRGSSQKTKGTSASAKARHKRRNRSSRS
83 | Simian sarcoma virus v-sis gene product | RVTIRTWRWRRPPKGKHRK
84 | Putative bipartite NLS of Xenopus protein factor Xnf7 | KRKIEEPEPEPKKAK
85 | Yeast SWI5 gene product | KKYENVVIKRSPRKRGRPRKD
86 | Herpes simplex virus ICP8 protein | GRKRAFHGDDPFGEGPPDKKGD
87 | Bipartite NLS of VirD2 endonuclease | KRPREDDDGEPSERKRARDDR
88 | IBB domain from importin-α | RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV
89 | hRNPA1 M9 NLS | NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY
90 | Human poly(ADP-ribose) polymerase | KRKGDEVDGVDEVAKKKSKK
91 | Hepatitis virus delta antigen | RKLKKKIKKL
92 | Influenza virus NLS | PKQKKRK
93 | Mouse c-abl IV | SALIKKKKKMAP
94 | Myoma T protein NLS1 | VSRKRPRP
95 | Myoma T protein NLS2 | PPKKARED
96 | myc proto-oncogene protein [Homo sapiens] | GPAAKRVKLD
97 | Heat shock factor protein HSF8 [tomato] | KKRRIKQD
98 | 2x SV40, LrgT | PKKKRKVEDPKKKRKVD
99 | 3x SV40, LrgT | PKKKRKVDPKKKRKVDPKKKRKV
100 | Monopartite NLS consensus cluster 1 (Dissertation Tatyana Goldberg 2016) | KKGKKKGK
101 | Monopartite NLS consensus cluster 2 (Dissertation Tatyana Goldberg 2016) | PKRRRGVVL
102 | Monopartite NLS consensus cluster 3 (Dissertation Tatyana Goldberg 2016) | EQLFKRRNV
103 | Monopartite NLS consensus cluster 4 (Dissertation Tatyana Goldberg 2016) | KRRRR
104 | Monopartite NLS consensus cluster 5 (Dissertation Tatyana Goldberg 2016) | KKRRR
105 | Monopartite NLS consensus cluster 6 (Dissertation Tatyana Goldberg 2016) | EGAPPAKRPR
106 | Monopartite NLS consensus cluster 7 (Dissertation Tatyana Goldberg 2016) | MLRRRRRKRAR
107 | Monopartite NLS consensus cluster 8 (Dissertation Tatyana Goldberg 2016) | RRKRR
108 | Monopartite NLS consensus cluster 9 (Dissertation Tatyana Goldberg 2016) | RKRK
109 | Monopartite NLS minor cluster (Dissertation Tatyana Goldberg 2016) | FKAVLEDILGEL
110 | NLSdb, protein source P10152 | KNRRL
111 | NLSdb, protein source Q09353 | RKRHW
112 | NLSdb, protein sources Q0VD86, Q58DS6, Q5R6G2, Q9ERI5, Q6AYK2, Q6NYC1 | RRKKRR
113 | NLSdb, protein sources Q99PU7, D3ZHS6, Q92560, A2VDM8 | RRKRSR
114 | NLSdb, protein sources Q14781, P30658 | KRGRKP
115 | NLSdb, protein sources P02545, P48678, P48679, Q3ZD69 | KKRKLE
116 | NLSdb, protein sources O35914, Q01954 | PKKKSRK
117 | NLSdb, protein sources Q9FYS5, Q43386 | PKRGRGR
118 | NLSdb, protein source E5RQA1 | KEKRKKR
119 | NLSdb, protein sources Q9Z1J1, Q9HCS4, Q924A0 | KKKKRKR
120 | NLSdb, protein sources Q80WE1, Q5R9B4, Q06787, P35922 | RRGDGRRR
121 | NLSdb, protein sources Q9Y261, P32182, P35583 | LSPSLSPL
122 | NLSdb, protein source P07156 | VNFSEFSK
123 | NLSdb, protein source Q96EB6 | IVINILSE
124 | NLSdb, protein sources Q6AZ28, O75928, Q8C5D8 | PPAKRKCIF
125 | NLSdb, SeqNLS prediction | QRPGPYDRP
126 | NLSdb, protein sources Q8L7L5, A1L4X7, O80834, Q8LPN5 | KRKRGRPRK
127 | NLSdb, protein sources O88907, O75925 | KIKELYRRR
128 | NLSdb, SeqNLS prediction | MVQLRPRASR
129 | NLSdb, protein source Q5VK71 | KKRREKQRRR
130 | NLSdb, protein sources P0C6L6, P25880, Q81835, P29833, P25882, P06934, P0C6L3, P29997, P0C6M5, P0C6M9, P29996, P0C6M1, P0C6L8, P0C6M2, P0C6L7, P25881 | EGAPPAKRAR
131 | NLSdb, protein source Q45FA5 | PKKGDKYDKTD
132 | NLSdb, protein sources P97376, Q14331 | KKKKSKDKKRK
133 | NLSdb, protein source P21827 | AHRAKKMSKTHA
134 | NLSdb, protein source Q6ZN17 | KKGPSVQKRKKT
135 | NLSdb, protein source Q4R8Y1 | KGVKRKADTTTP
136 | NLSdb, protein sources Q91Y44, D4A7T3 | KGVKRRADTTTP
137 | NLSdb, protein sources Q15397, Q8BKS9, Q562C7 | KKPKWDDFKKKKK
138 | NLSdb, protein source Q96GM8 | KRRRRRRREKRKR
139 | NLSdb, protein sources P61635, Q6DV79, Q19S50, P52631 | DVRKRVQDLEQKM
140 | NLSdb, protein source O60716 | KKGKDEWFSRGKK
141 | NLSdb, protein source P30999 | KKGKDEWFSRGKKP
142 | NLSdb, SeqNLS prediction | ASPEYVNLPINGNG
143 | NLSdb, protein sources Q7Z7C8, Q5ZMS1, Q9EQH4, A7MAZ4 | YLRPVKKPKIRRKK
144 | NLSdb, protein source O15381 | KRKGKLKNKGSKRKK
145 | NLSdb, SeqNLS prediction | RRRGKNKVAAQNCRK
146 | NLSdb, protein sources O15516, Q5RAK8, Q91YB2, Q91YB0, Q8QGQ6, O08785, Q9WVS9, Q6YGZ4 | DKAKRVSRNKSEKKRR
147 | NLSdb, protein source G5EFF5 | EEQLRRRKNSRLNNTG
148 | NLSdb, protein sources P10103, Q4R844, P12682, B0CM99, A9RA84, Q6YKA4, P09429, P63159, Q08IE6, P63158, Q9YH06, B1MTB0 | HKKKHPDASVNFSEFSK
149 | NLSdb, protein sources Q9Z301, O54943, Q8K3T2 | KKTGKNRKLKSKRVKTR
150 | NLSdb, protein sources Q38740, Q38741, Q700W2, Q9S7A9, Q6Z461, P93015, Q94JW8, Q9S758 | KRSCRRRLAGHNERRRK
151 | NLSdb, protein source Q501B2 | KRQRRKQSNRESARRSR
152 | NLSdb, SeqNLS prediction | RGKGGKGLGKGGAKRHRK
153 | NLSdb, protein sources Q63014, Q9DBR0 | RRRGFERFGPDNMGRKRK
154 | NLSdb, protein sources Q0IJ08, Q2TAE3, Q63470, Q13627, Q61214 | RRHQQGQGDDSSHKKERK
155 | NLSdb, protein sources Q8LAM0, O24454 | KKKTGVIAPKRFVQRLKK
156 | NLSdb, protein source Q0E671 | KRAMKDDSHGNSTSPKRRK
157 | NLSdb, protein source Q9P127 | KVNFLDMSLDDIIIYKELE
158 | NLSdb, SeqNLS prediction | KKYENVVIKRSPRKRGRPRK
159 | NLSdb, SeqNLS prediction | KRGNSSIGPNDLSKRKQRKK
160 | NLSdb, protein sources Q9BZZ5, Q5R644 | KRASEDTTSGSPPKKSSAGPKR
161 | NLSdb, SeqNLS prediction | KRIHSVSLSQSQIDPSKKVKRAK
162 | NLSdb, SeqNLS prediction | EVLKVIRTGKRKKKAWKRMVTKVC
163 | NLSdb, SeqNLS prediction | IINGRKLKLKKSRRRSSQTSNNSFTSRRS
164 | NLSdb, protein source Q76IQ7 | AHFKISGEKRPSTDPGKKAKNPKKKKKKDP
165 | Flexible linker | (GGGGS)₃
166 | Hybrid linker | GGG(EAAAK)₃
167 | Rigid linker | A(EAAAK)₄ALEA(EAAAK)₄A
168 | Flexible linker 2 | GGGGSLVPRGSGGGGS
169 | Proline-rich, rigid, interdomain linker | GAAPAAAPAKQEAAAPAPAAKAEAPAAAPAAKA
170 | Cleavable linker | TRHRQPRGWE
171 | B-GEn.2-sgRNA_v4 |
172 | B-GEn.2-sgRNA_v4.2 |
173 | B-GEn.2-sgRNA_v4.3 |
174 | B-GEn.2-sgRNA_v4.4 |
175 | B-GEn.2-sgRNA_v4.5 |
176 | Tracr B-GEn.2 |
177 | Tracr B-GEn.1 |
178 | Guide RNA targeting B2M |
179 | BkaCas12b aa sequence |
180 | BcyCas12b aa sequence |
181 | BacCas12b aa sequence |
182 | BmaCas12b aa sequence |
183 | BshCas12b aa sequence |
184 | BagCas12b aa sequence |
185 | AacC2c1 aa sequence |
186 | BsyCas12b aa sequence |
187 | Bv3Cas12b aa sequence |
188 | PteCas12b aa sequence |
189 | BcoCas12b aa sequence |
190 | LseCas12b aa sequence |
191 | BheCas12b aa sequence |
192 | BmeCas12b aa sequence |
193 | BcaCas12b aa sequence |
194 | BpaCas12b aa sequence |
195 | BpuCas12b aa sequence |
196 | BhiCas12b aa sequence |
197 | BgeCas12b aa sequence |
198 | BkaCas12b aa sequence |
199 | BthCas12b aa sequence |
200 | B-GEn.2(D501R)-NLS aa sequence |
201 | B-GEn.1 (D504R) target interaction strand aa sequence | GPAFLNVVLRL
202 | B-GEn.1.2 (D501R) target interaction strand aa sequence | GPIFLNVVVRV
203 | B-GEn.2 (D501R) target interaction strand aa sequence | GPIFLNVVVRV
204 | Consensus target interaction strand aa sequence | GX₁X₂X₃X₄NX₅X₆X₇DX₈, where: X₁ is any amino acid (in some embodiments, X₁ is D, E, S, P, K, or R); X₂ is any amino acid (in some embodiments, X₂ is V, I, or A); X₃ is any amino acid (in some embodiments, X₃ is Y or F); X₄ is any amino acid (in some embodiments, X₄ is L or F); X₅ is any amino acid (in some embodiments, X₅ is I, L, F, V, or M); X₆ is any amino acid (in some embodiments, X₆ is S, V, T, or A); X₇ is any amino acid (in some embodiments, X₇ is V, L, or I); X₈ is any amino acid (in some embodiments, X₈ is V, F, L, or I)
205 | B-GEn consensus II aa sequence |
207 | Bacillales Cas nuclease consensus aa sequence |

10. Incorporation by Reference

All publications, patents, patent applications, and other documents cited in this application are incorporated herein by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, or other document were individually indicated to be incorporated by reference for all purposes. If there is any inconsistency between the teachings of one or more of the references incorporated herein and those of this specification, the teachings of this specification shall prevail.

Figure 1 shows an amino acid sequence alignment of three B-GEn enzymes, B-GEn.1 (SEQ ID NO: 1), B-GEn.1.2 (SEQ ID NO: 2), and B-GEn.2 (SEQ ID NO: 3), together with an exemplary consensus sequence (SEQ ID NO: 205). Residues of each B-GEn enzyme that differ from the corresponding amino acids of the other two B-GEn enzymes are shown in dark boxes; matching amino acids are unmarked.

Figures 2A to 2C show a phylogenetic evaluation of endonucleases derived from various species in the order Bacillales. Figure 2A is a phylogenetic tree showing genera derived from the genus Bacillus, adapted from Suzuki, 2018, Appl Microbiol Biotechnol. 102:10425-10437. Figure 2B shows the phylogenetic origins of several endonucleases. Figure 2C is a phylogenetic tree, generated with the tree-building algorithm of Geneious Prime software from the NCBI BLAST output for the B-GEn.2 OBD II domain and RuvC I domain sequences, showing endonucleases from genera derived from the genus Bacillus in the order Bacillales.

Figures 3A to 3C show the AacC2c1 and B-GEn.2 enzyme structures. Figure 3A shows the crystal structure of AacC2c1 (5U33) visualized using DNAStar. Figure 3B shows the predicted structure of B-GEn.2, generated using AlphaFold2 and visualized using DNAStar. Figure 3C shows an alignment of the crystal structure of AacC2c1 (5U33) with the predicted structure of B-GEn.2.

Figure 4 shows an amino acid sequence alignment of B-GEn.1 and B-GEn.2 with 20 different Cas nucleases from the order Bacillales (created using the MUSCLE alignment algorithm in Geneious Prime software). Domains are marked on the Bth C2C1 sequence as follows: bold amino acids indicate the oligonucleotide binding domains (OBD-I and OBD-II); singly underlined amino acids indicate the recognition (REC) domains (REC1-I, REC1-II, and REC2); italicized amino acids indicate the PAM-interacting (PI) domain; and doubly underlined amino acids indicate the RuvC domains (RuvC-I, RuvC-II, and RuvC-III), which together form the nuclease domain. A plus sign above the consensus sequence marks the amino acid corresponding to residue D501 of B-GEn.2. Asterisks above residues mark the active-site catalytic residues within the RuvC domains.
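The notion of a residue "corresponding to" D501 of B-GEn.2 depends on reading positions off an alignment such as the one in Figure 4. The following Python sketch illustrates one way such a lookup could be done for two gapped rows of an alignment. The function name `corresponding_position` and the short toy sequences are hypothetical and chosen only for illustration; they are not taken from the sequence listing, and the sketch is not presented as the method actually used to prepare the figure.

```python
from typing import Optional, Tuple


def corresponding_position(
    ref_aligned: str, query_aligned: str, ref_pos: int, gap: str = "-"
) -> Optional[Tuple[int, str]]:
    """Map a 1-based residue number in the reference to the aligned query residue.

    Both inputs are rows of the same alignment (equal length, gaps as `gap`).
    Returns (query residue number, query residue), or None when the reference
    residue is aligned to a gap in the query.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0    # ungapped residues seen so far in the reference
    query_count = 0  # ungapped residues seen so far in the query
    for ref_res, query_res in zip(ref_aligned, query_aligned):
        if ref_res != gap:
            ref_count += 1
        if query_res != gap:
            query_count += 1
        if ref_res != gap and ref_count == ref_pos:
            return (query_count, query_res) if query_res != gap else None
    raise IndexError("ref_pos is beyond the end of the reference sequence")


# Toy alignment (hypothetical): the query carries a two-residue insertion,
# so reference residue D4 corresponds to query residue D6.
print(corresponding_position("MKT--DLV", "MKTAADLV", 4))  # (6, 'D')
```

An offset of this kind between aligned sequences would explain how the same aspartate can be numbered D504 in B-GEn.1 and D501 in B-GEn.1.2 and B-GEn.2.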

Figure 5 is a Coomassie-stained image of an SDS-PAGE gel showing the result of the single-step, heparin sulfate-based purification of B-GEn.2 D501R using the method described in Section 8.1.3.

Figure 6 is a graph showing the B2M target site cleavage efficiency of B-GEn.2 D501R relative to B-GEn.2 WT and Cpf1 in an in vitro plasmid cleavage assay at different nuclease:linearized plasmid ratios.

Figure 7 is a bar graph showing the results of two evaluations of B2M gene editing in iPSCs using 50 pmol of RNP containing B-GEn.2 WT, B-GEn.2 D501R, or Cpf1. The percentage of indel formation on the Y-axis represents the percentage of gene editing achieved with the RNP formed with the indicated enzyme.

Figure 8 shows the superimposed structures of AacC2c1 (labeled 5U31 in the image) and B-GEn.2 generated using AlphaFold2. AacC2c1 (black residues) and B-GEn.2 (white residues) are both shown adjacent to the P-1 nucleotide of the target DNA, displaying the aligned amino acid backbones of the target interaction strands of the superimposed endonucleases. The target interaction strand of each endonuclease lies near the PAM recognition region of the enzyme and includes the amino acid corresponding to R507 of AacC2c1 (whose side chain is shown hydrogen-bonded to a phosphate group of one of the DNA strands) and D501 of B-GEn.2 (grey, away from the target DNA strand). The amino acid sequence of the target interaction strand of B-GEn.1 with a D504R substitution corresponds to SEQ ID NO: 201, that of B-GEn.1.2 with a D501R substitution corresponds to SEQ ID NO: 202, and that of B-GEn.2 with a D501R substitution corresponds to SEQ ID NO: 203. The consensus amino acid sequence of the target interaction strand among the endonucleases illustrated in Figure 4 (in which the amino acid arginine is at the position corresponding to D501 of SEQ ID NO: 3) is provided as SEQ ID NO: 204.
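The target interaction strand motif can be illustrated as a simple pattern match. The Python sketch below encodes the residues listed for X₁ to X₈ "in some embodiments" under SEQ ID NO: 204 in Table 8 and checks the strands of SEQ ID NOs: 201 to 203 against them, with either arginine or aspartic acid at the tenth position. The helper names (`strand_pattern`, `classify`) are hypothetical, and the aspartate-containing strand in the usage line is an assumption obtained simply by reverting R to D in SEQ ID NO: 203; it is not a sequence given in the text.

```python
import re

# Residues allowed at X1..X8 "in some embodiments" (SEQ ID NO: 204, Table 8).
X = {
    1: "DESPKR",
    2: "VIA",
    3: "YF",
    4: "LF",
    5: "ILFVM",
    6: "SVTA",
    7: "VLI",
    8: "VFLI",
}


def strand_pattern(position_10: str) -> re.Pattern:
    """Build G-X1-X2-X3-X4-N-X5-X6-X7-<position_10>-X8 as a compiled regex."""
    c = {i: f"[{residues}]" for i, residues in X.items()}
    return re.compile(
        f"G{c[1]}{c[2]}{c[3]}{c[4]}N{c[5]}{c[6]}{c[7]}{position_10}{c[8]}"
    )


ARG_PATTERN = strand_pattern("R")  # arginine at the tenth position
ASP_PATTERN = strand_pattern("D")  # aspartic acid at the tenth position

# Target interaction strands as listed in Table 8.
STRANDS = {
    "SEQ ID NO: 201": "GPAFLNVVLRL",
    "SEQ ID NO: 202": "GPIFLNVVVRV",
    "SEQ ID NO: 203": "GPIFLNVVVRV",
}


def classify(strand: str) -> str:
    """Report which residue, if any, occupies the tenth position of the motif."""
    if ARG_PATTERN.fullmatch(strand):
        return "Arg"
    if ASP_PATTERN.fullmatch(strand):
        return "Asp"
    return "no match"


for seq_id, strand in STRANDS.items():
    print(seq_id, strand, classify(strand))

# Hypothetical aspartate-containing strand (SEQ ID NO: 203 with R reverted to D).
print(classify("GPIFLNVVVDV"))
```

Treating the tenth position as a parameter keeps the eight variable positions in one place, so the same table of allowed residues serves both the substituted and unsubstituted forms.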

TW202521691A_113137996_SEQL.xml

Claims

An engineered V-type endonuclease comprising an amino acid other than aspartic acid at a position corresponding to D504 of SEQ ID NO: 1 (B-GEn.1) or D501 of SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2).

The engineered V-type endonuclease of claim 1, which is an engineered B-GEn polypeptide.

The engineered V-type nuclease of claim 1 or claim 2, which has arginine at the position corresponding to D504 of SEQ ID NO: 1 (B-GEn.1) or D501 of SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2).

An engineered V-type nuclease, which is optionally an engineered V-type nuclease as in any one of claims 1 to 3, comprising the following amino acid sequence: (a) having a substitution at a position corresponding to D501 of SEQ ID NO: 3, wherein the substitution: (i) is a substitution D501R compared to the amino acid sequence of SEQ ID NO: 3; or (ii) increases gene editing efficiency activity compared to the corresponding amino acid sequence without the substitution at position D501; and (b) has at least 90% sequence identity with SEQ ID NO: 1 (B-GEn.1), SEQ ID NO: 2 (B-GEn.1.2) or SEQ ID NO: 3 (B-GEn.2).

An engineered V-type endonuclease comprising the amino acid sequence of SEQ ID NO: 4.

An engineered V-type endonuclease comprising the amino acid sequence of SEQ ID NO: 5.

An engineered V-type endonuclease comprising the amino acid sequence of SEQ ID NO: 6.

An engineered V-type endonuclease as claimed in any one of claims 1 to 7, comprising at least one NLS, optionally located at the C-terminus of the amino acid sequence, and optionally wherein: (a) the engineered V-type endonuclease lacks any NLS located at the N-terminus of the amino acid sequence; or (b) the engineered V-type endonuclease comprises at least one NLS located at the N-terminus of the amino acid sequence.

A nucleic acid comprising a nucleotide sequence encoding the engineered V-type endonuclease of any one of claims 1 to 8.

The nucleic acid of claim 9, wherein the nucleotide sequence encoding the engineered V-type endonuclease of any one of claims 1 to 8 can be operably linked to a promoter.

The nucleic acid of claim 9 or claim 10, wherein the nucleic acid further encodes a guide RNA.

The nucleic acid of any one of claims 9 to 11, which is in the form of a vector.

The nucleic acid of claim 12, wherein the vector is a recombinant adeno-associated virus (AAV) vector.

A cell comprising the nucleic acid of any one of claims 9 to 13.

A cell engineered to express a nucleotide sequence encoding the engineered V-type endonuclease of any one of claims 1 to 8.

The cell of claim 14 or claim 15, which is a eukaryotic cell, and optionally, the eukaryotic cell is a human cell.

A method for producing an engineered V-type nuclease as in any one of claims 1 to 8, the method comprising culturing a cell as in any one of claims 14 to 16 under conditions in which the engineered V-type nuclease is produced, the method further comprising isolating and/or purifying the engineered V-type nuclease as desired.

A composition comprising: (a) an engineered V-type nuclease as in any one of claims 1 to 8; and (b) a guide RNA.

The composition of claim 18, which is a ribonucleoprotein complex.

A composition as claimed in claim 18 or claim 19, wherein the molar ratio of the engineered V-type nuclease:guide RNA ranges from 1:1 to 1:4.

A method for editing the genome of a cell, the method comprising introducing into the cell: (a) an engineered V-type nuclease as in any one of claims 1 to 8; and (b) a guide RNA.

A method for editing the genome of a cell, the method comprising introducing into the cell: (a) one or more nucleic acids encoding an engineered V-type endonuclease as described in any one of claims 1 to 8; and (b) guide RNA.

The method of claim 22, comprising contacting the cell with a lipid nanoparticle comprising the one or more nucleic acids and the guide RNA.

A method for editing the genome of a cell, the method comprising introducing into the cell a nucleic acid encoding one or more of the following: (a) an engineered V-type endonuclease as in any one of claims 1 to 8; and (b) a guide RNA, optionally, wherein at least one of the one or more nucleic acids is a nucleic acid as in any one of claims 9 to 13.

The method of claim 24, comprising contacting the cell with one or more recombinant AAV particles comprising the one or more nucleic acids.

The method of claim 24, comprising contacting the cell with one or more lipid nanoparticles comprising the one or more nucleic acids.

A method for editing the genome of a cell, the method comprising introducing the composition of any one of claims 18 to 20 into the cell.

The method of claim 24, wherein the composition is a ribonucleoprotein complex.

The method of claim 26 or claim 27, wherein the composition is introduced into the cell via lipid nanoparticles.

The method of any one of claims 21 to 29, wherein the cell is a mammalian cell, optionally a human cell.

The method of claim 30, wherein the cell is a stem cell or a cell differentiated therefrom, and optionally, wherein the stem cell is a pluripotent stem cell (PSC) or a cell differentiated therefrom.

A cell comprising: (a) a composition as described in any one of claims 18 to 20; or (b) a nucleic acid as described in any one of claims 9 to 13.

The cell of claim 32, which is a mammalian cell, and optionally a human cell.