CN118291459A

CN118291459A - 3'UTR sequence for promoting mRNA translation and its use

Info

Publication number: CN118291459A
Application number: CN202410326083.7A
Authority: CN
Inventors: 杨雪瑞; 吕瑶; 张小驹; 刘安东; 毛沁心; 吴鼎; 叶子惠; 于奕凡
Original assignee: Hangzhou Yetai Pharmaceutical Technology Co ltd; Beijing Yitai Pharmaceutical Technology Co ltd; Tsinghua University
Current assignee: Hangzhou Yetai Pharmaceutical Technology Co ltd; Beijing Yitai Pharmaceutical Technology Co ltd; Tsinghua University
Priority date: 2024-03-21
Filing date: 2024-03-21
Publication date: 2024-07-05

Abstract

The present disclosure relates to the field of biotechnology, and in particular to a 3'UTR sequence for promoting mRNA translation and uses thereof. The present disclosure provides a 3'UTR sequence comprising one of the following: (1) the nucleotide sequence shown in any one of SEQ ID NOs: 1-3; (2) a complementary sequence of the nucleotide sequence shown in any one of SEQ ID NOs: 1-3; (3) a nucleotide sequence having at least 80% homology with the nucleotide sequence shown in any one of SEQ ID NOs: 1-3; or (4) a nucleotide sequence having at least 80% homology with the complementary sequence of the nucleotide sequence shown in any one of SEQ ID NOs: 1-3. The invention provides a novel 3'UTR sequence that improves translation through rational optimization of the nucleic acid sequence of a functional region, so that the nucleic acid molecule has broad application prospects in fields such as gene therapy and vaccine research and development.

Description

3'UTR sequence for promoting mRNA translation and its use

Technical Field

The present disclosure relates to the field of biotechnology, and in particular to a 3'UTR sequence for promoting mRNA translation and uses thereof.

Background

In vitro transcribed (IVT) mRNA is messenger RNA (mRNA) obtained by transcription from a DNA template. It is structurally similar to natural mRNA and can convey genetic information in cells in vivo. mRNA drugs offer several advantages: transcription is cell-free, the sequence is easy to adjust, there is theoretically no risk of genomic integration, and production is low-cost, efficient, and modular. mRNA drugs therefore have broad application prospects and unique advantages in therapeutics and vaccines. To achieve effective therapeutic and vaccine outcomes, the key is to ensure that the mRNA is fully expressed. Regulation of mRNA expression is a complex, multi-level process in which regulation at the transcriptional and post-transcriptional levels is particularly critical. Translational regulation often involves interactions between multiple protein factors and the mRNA, and the 3'UTR, located at the 3' end of the mRNA, regulates mRNA stability, localization, and expression. The 3'UTR is therefore a key node affecting translation efficiency.

3'UTRs screened from naturally occurring genomes have been reported to improve mRNA translation efficiency. However, 3'UTRs are usually tied to mRNA stability, and the microRNA binding sites they contain may inhibit translation or trigger RNA degradation. Naturally occurring 3'UTR sequences may therefore be limited in their regulatory mechanisms and translation efficiency, and may not meet the demand for high-level gene expression. Besides using 3'UTRs from naturally occurring genomes, 3'UTRs can also be designed de novo, following design principles such as reducing secondary structure and reducing miRNA targeting effects. Although many effective 3'UTR sequences have already been identified in the art using different design principles and methods, developing new, broadly applicable 3'UTR sequences that promote mRNA translation, and thereby further lower the unit dose of administered mRNA and reduce cost for better use in mRNA therapeutics and mRNA vaccines, remains a research focus in the field.

In addition, in nature the 3'UTRs of many mRNA molecules contain microRNA binding sites, which allow microRNAs to fine-tune the mRNA by inhibiting its translation or accelerating its degradation. Because microRNAs are differentially expressed across tissue types and stages of disease progression, they are regarded as highly specific biomarkers. Introducing microRNA binding sites into the 3'UTR has accordingly become a widely used way to regulate mRNA function. In theory, the interaction between a miRNA and its target mRNA rests mainly on complementarity between the miRNA seed sequence (generally nucleotides 2 to 8 of the miRNA) and the corresponding position in the 3'UTR of the mRNA. This pairing is sufficient to direct the RNA-induced silencing complex (RISC) to a specific mRNA, thereby blocking its translation or accelerating its degradation. Numerous studies have shown that, beyond seed matching, additional complementary pairing between the non-seed portion of the miRNA and the mRNA can also improve the efficiency of miRNA-mediated gene regulation. This additional pairing outside the seed region increases the binding affinity between the miRNA and its target and thus strengthens repression of the mRNA. For example, one strategy adopted by Moderna is to insert the reverse complement of the full-length miRNA into the 3'UTR. This design is not limited to seed pairing: it gives the miRNA more extensive complementarity with its target mRNA and so strengthens the miRNA's ability to regulate that mRNA.

Summary of the Invention

To solve the above technical problems, the present disclosure provides a 3'UTR sequence for promoting mRNA translation and uses thereof.

In one aspect, the present disclosure provides a 3'UTR sequence, wherein the 3'UTR sequence comprises one of the following (1)-(4):

(1) the nucleotide sequence shown in any one of SEQ ID NOs: 1-3;

(2) a complementary sequence of the nucleotide sequence shown in any one of SEQ ID NOs: 1-3;

(3) a nucleotide sequence having at least 80% homology with the nucleotide sequence shown in any one of SEQ ID NOs: 1-3; or

(4) a nucleotide sequence having at least 80% homology with the complementary sequence of the nucleotide sequence shown in any one of SEQ ID NOs: 1-3.
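The four alternatives above can be read as a simple decision procedure. The following is a minimal sketch of that reading, not the applicant's method. It makes several assumptions: `REFERENCE` is a made-up placeholder (SEQ ID NOs: 1-3 are not reproduced in this section), "complementary sequence" is taken to mean the reverse complement, and "homology" is approximated by ungapped identity between equal-length sequences.

```python
from typing import Optional

# Hypothetical placeholder; NOT one of SEQ ID NOs: 1-3.
REFERENCE = "ACGTTGCAAGGCTTAACCGG"

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity (in percent) between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def classify(candidate: str, reference: str, threshold: float = 80.0) -> Optional[int]:
    """Return which alternative (1-4) the candidate satisfies, or None."""
    cand = candidate.upper()
    ref = reference.upper()
    rc = reverse_complement(ref)
    if cand == ref:
        return 1  # identical to the reference
    if cand == rc:
        return 2  # identical to its (reverse) complement
    if percent_identity(cand, ref) >= threshold:
        return 3  # at least 80% identical to the reference
    if percent_identity(cand, rc) >= threshold:
        return 4  # at least 80% identical to the complement
    return None
```

A real comparison would normally use a gapped alignment over sequences of different lengths; the equal-length restriction only keeps the sketch short.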

In some embodiments, the aforementioned 3'UTR sequence further comprises a 3'UTR sequence, or its complementary sequence, into which an additional sequence has been inserted. Preferably, the additional sequence is a miRNA binding site; more preferably, the additional sequence is selected from the reverse complement of a full-length microRNA or the reverse complement of its seed sequence. Preferably, the reverse complement of the full-length microRNA is 19-25 nt in length; preferably, the reverse complement of the seed sequence is 7-8 nt in length.
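A minimal sketch of how the two kinds of binding site described above could be derived from a miRNA and placed into a 3'UTR. The miRNA in `MIRNA` is a made-up 22-nt placeholder, the function names are illustrative, and taking the seed as starting at miRNA position 2 is an assumption of this sketch rather than something the disclosure specifies.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

# Hypothetical 22-nt miRNA written 5'->3'; not a real miRNA from the disclosure.
MIRNA = "UGACCUAGGCAUUCGAUACGGA"


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return seq.upper().translate(_RNA_COMPLEMENT)[::-1]


def full_length_site(mirna: str) -> str:
    """Binding site complementary to the whole miRNA (19-25 nt per the text)."""
    site = reverse_complement_rna(mirna)
    if not 19 <= len(site) <= 25:
        raise ValueError("full-length site is expected to be 19-25 nt")
    return site


def seed_site(mirna: str, length: int = 7) -> str:
    """Binding site complementary to the seed (7-8 nt per the text).

    Assumption: the seed starts at miRNA position 2 (index 1).
    """
    if length not in (7, 8):
        raise ValueError("seed site is expected to be 7-8 nt")
    return reverse_complement_rna(mirna[1:1 + length])


def insert_site(utr: str, site: str, position: int) -> str:
    """Insert a binding site into a 3'UTR at a 0-based position."""
    if not 0 <= position <= len(utr):
        raise ValueError("position lies outside the 3'UTR")
    return utr[:position] + site + utr[position:]
```

For example, `insert_site(utr, seed_site(MIRNA), 10)` would place a 7-nt seed-match site ten nucleotides into a 3'UTR held in `utr`.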

In another aspect, the present disclosure provides a DNA molecule comprising the aforementioned 3'UTR sequence.

In some embodiments, the DNA molecule further comprises an open reading frame encoding a polypeptide or protein of interest.

In some embodiments, the DNA molecule is circular DNA.

In another aspect, the present disclosure provides an mRNA comprising the aforementioned 3'UTR sequence.

In some embodiments, the aforementioned mRNA comprises, in the 5'-3' direction:

(1) an open reading frame encoding a polypeptide or protein of interest; and

(2) the aforementioned 3'UTR sequence.
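The 5'-to-3' ordering above can be made concrete with a small data structure. This is a sketch under stated assumptions: the class name `MrnaDesign`, the optional 5'UTR and poly(A) fields, and all example sequences are hypothetical and only illustrate the element order.

```python
from dataclasses import dataclass

STOP_CODONS = ("UAA", "UAG", "UGA")


@dataclass(frozen=True)
class MrnaDesign:
    """Elements of an mRNA, all written 5'->3' in the RNA alphabet."""
    orf: str                 # open reading frame, start codon through stop codon
    utr3: str                # 3'UTR placed directly after the ORF
    utr5: str = ""           # optional 5'UTR (assumption: placed before the ORF)
    poly_a_length: int = 0   # optional poly(A) tail length

    def __post_init__(self) -> None:
        # Basic sanity checks on the ORF before it is used in a construct.
        if not self.orf.startswith("AUG"):
            raise ValueError("ORF must begin with the AUG start codon")
        if len(self.orf) % 3 != 0 or self.orf[-3:] not in STOP_CODONS:
            raise ValueError("ORF must be in frame and end with a stop codon")

    def sequence(self) -> str:
        """Concatenate the elements in 5'->3' order."""
        return self.utr5 + self.orf + self.utr3 + "A" * self.poly_a_length


# Toy example with placeholder sequences.
design = MrnaDesign(orf="AUGGCCUAA", utr3="GCUAGC", poly_a_length=5)
```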

In another aspect, the present disclosure provides an expression vector comprising the aforementioned DNA molecule.

In another aspect, the present disclosure provides a host cell comprising the 3'UTR sequence, the aforementioned DNA molecule, the aforementioned mRNA, and/or the aforementioned expression vector.

In another aspect, the present disclosure provides a pharmaceutical composition comprising one of the following components (1)-(5) and a pharmaceutically acceptable carrier:

(1) the aforementioned 3'UTR sequence;

(2) the aforementioned DNA molecule;

(3) the aforementioned mRNA;

(4) the aforementioned expression vector; or

(5) the aforementioned cell.

In another aspect, the present disclosure provides a method for promoting expression of a polypeptide or protein of interest, the method comprising introducing the 3'UTR sequence, the aforementioned DNA molecule, the aforementioned mRNA, and/or the aforementioned vector into a host cell.

In another aspect, the present disclosure provides use of the 3'UTR sequence, the aforementioned DNA molecule, the aforementioned mRNA, the aforementioned vector, the aforementioned cell, the aforementioned pharmaceutical composition, and/or the aforementioned method in promoting expression of a polypeptide or protein of interest.

The beneficial effects of the present disclosure include at least the following:

1. Optimized translation efficiency: in the 3'UTR of the coding transcript, the nucleic acid molecule of the present invention uses specific sequence optimization that helps improve ribosome scanning and formation of the translation initiation complex, thereby enhancing protein translation efficiency.

2. Broad applicability: the nucleic acid molecule of the present invention is flexible in design and suitable for different genes and application scenarios. The sequence of its functional region can be customized as needed to meet the requirements of different experiments and applications.

In summary, the present invention provides a novel nucleic acid molecule design that improves translation through rational optimization of the nucleic acid sequence of the functional region, giving the nucleic acid molecule broad application prospects in fields such as gene therapy and vaccine development.

Brief Description of the Drawings

FIG. 1 is a bar graph of the relative light units of firefly luciferase in cell lysates, measured with a multi-well plate reader (Agilent), indicating the luciferase expression level.

FIG. 2 shows quantitative fluorescence analysis of the relative intensity of green fluorescent protein in 293T cells transfected with mRNA carrying the 3'UTR of the present invention containing three microRNA binding sites, together with its trend over 24 hours.

Detailed Description

Definitions and Explanations

To make the present invention easier to understand, certain technical and scientific terms are specifically defined below. Unless otherwise expressly defined herein, all other technical and scientific terms used herein have the meanings commonly understood by those of ordinary skill in the art to which the present invention belongs. It should be understood that the present invention is not limited to specific methods, reagents, compounds, compositions, or biological systems, which may of course vary. It should also be understood that the terminology used in this application is only for describing specific embodiments and is not intended to be limiting.

As used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise.

As used herein, the terms "include" and "have," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, apparatus, product, or device comprising a series of steps is not limited to the listed steps or modules, but may optionally include steps that are not listed, or other steps inherent to such processes, methods, products, or devices. "Plurality" in this disclosure means two or more. "And/or" describes a relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it.

As used herein, the terms "nucleic acid," "nucleotide," and "polynucleotide" are used interchangeably and refer to deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and polymers thereof in single-stranded, double-stranded, or multi-stranded form. The terms include, but are not limited to, single-stranded, double-stranded, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, and polymers comprising purine and/or pyrimidine bases or other natural, chemically modified, biochemically modified, non-natural, synthetic, or derivatized nucleotide bases. In some embodiments, a nucleic acid may comprise a mixture of DNA, RNA, and analogs thereof. The terms also encompass nucleic acids containing known analogs of natural nucleotides that have binding properties similar to those of the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. "Nucleic acid" may be used interchangeably with "gene," "DNA," and the "mRNA" encoded by a gene.

As used herein, the term "homology" refers to a region (site) in which two nucleic acids share at least partial complementarity. A region of homology may span only part of a sequence. For example, only part of a polynucleotide may be homologous to one site in a genome; different parts of a polynucleotide may be homologous to several different sites in the genome, while the complete polynucleotide may be homologous to yet another site. As with any partially complementary nucleic acid sequences, a region of homology may contain one or more mismatches and gaps when the two sequences are aligned. A shorter nucleic acid strand (e.g., an oligonucleotide) may be homologous to a region (site) in a larger nucleic acid (e.g., a gene or genome). The "degree of homology" between two sequences refers to the degree of identity between them. The degree of identity is generally expressed as a percentage, namely the proportion of nucleotides in the homologous region that are not mismatched, relative to the total number of nucleotides in that region. For example, a 20-base oligonucleotide that hybridizes with two mismatches to a homologous region (site) in a target genome is said to have 90% identity with that region. As used herein, homology is at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%.
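The worked example in this definition can be written out as a formula. The symbols N (nucleotides in the homologous region) and m (mismatched nucleotides) are notation introduced here for illustration only, and gaps are ignored in this simplified form:

```latex
\[
\text{identity} = \frac{N - m}{N} \times 100\%,
\qquad
N = 20,\; m = 2
\;\Rightarrow\;
\frac{20 - 2}{20} \times 100\% = 90\%.
\]
```

Under the same formula, the 80% threshold used throughout the disclosure would correspond to at most 4 mismatches in a 20-nucleotide region.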

As used herein, the term "complementary sequence" refers to a nucleic acid that can form hydrogen bonds with another nucleic acid sequence through traditional Watson-Crick base pairing or other non-traditional types of pairing. Percent complementarity is the percentage of residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairs) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, or 10 out of 10 correspond to 50%, 60%, 70%, 80%, 90%, and 100% complementarity, respectively). "Fully complementary" means that all contiguous residues of a nucleic acid sequence hydrogen-bond with the same number of contiguous residues in a second nucleic acid sequence. "Substantially complementary" means a degree of complementarity of at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides, or that two nucleic acids hybridize under stringent conditions.
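The "out of 10" arithmetic in this definition can be sketched in a few lines. This is an illustrative simplification: it counts only Watson-Crick pairs, assumes two equal-length DNA strands aligned antiparallel without gaps, and the function name and example strands are hypothetical.

```python
# Watson-Crick pairs for DNA; non-traditional pairs are deliberately ignored.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percentage of positions in strand_a that pair with strand_b.

    Both strands are written 5'->3', so strand_b is read in reverse
    to align the two strands antiparallel.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch only handles equal-length strands")
    paired = sum(
        (x, y) in WATSON_CRICK
        for x, y in zip(strand_a.upper(), reversed(strand_b.upper()))
    )
    return 100.0 * paired / len(strand_a)


# Toy 10-nt example: 8 of 10 positions pair.
STRAND_A = "ACGTACGTAC"
STRAND_B = "CAACGTACGT"  # reverse complement of STRAND_A with 2 substitutions
```

Here `percent_complementarity(STRAND_A, STRAND_B)` gives 80.0, matching the "8 out of 10" case in the text.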

As used herein, the term "5'UTR" generally refers to the sequence of an mRNA molecule between its 5' end and the translation start codon, which can recruit the ribosome complex and initiate translation of the mRNA. The term covers the 5'UTR region of the mRNA as well as the corresponding sequence on the DNA template. By interacting with transcription factors, ribosomes, and other transcription regulatory proteins, the 5'UTR regulates processes such as post-transcriptional modification and the formation and stability of the translation initiation complex. Sequence design and optimization of this region are essential for improving the efficiency of post-transcriptional modification and protein expression. As used herein, the terms "5'UTR structure," "5'UTR," "5'UTR sequence," and "5'UTR element" are used interchangeably. The 5'UTR has a length of 3-500 nucleotides, 5-150 nucleotides, 10-100 nucleotides, 15-90 nucleotides, or 20-70 nucleotides.

The 5'UTR element of the present disclosure comprises or consists of a nucleic acid sequence derived from the 5'UTR of a eukaryotic protein-coding gene, preferably a vertebrate protein-coding gene, more preferably a mammalian protein-coding gene, even more preferably a primate protein-coding gene, and in particular a human or mouse protein-coding gene. The 5'UTR element of the present disclosure is derived from the 5'UTR of a gene selected from the group consisting of an albumin gene, an α-globin gene, a β-globin gene, a tyrosine hydroxylase gene, a lipoxygenase gene, and a collagen α gene such as the collagen α1(I) gene, or from a variant of the 5'UTR of a gene selected from that same group. The 5'UTR of the present disclosure comprises the 5'UTR of a TOP gene, preferably the 5'UTR of human ribosomal protein Large 32 lacking the 5'-terminal oligopyrimidine tract. The sequence of the present disclosure comprises one of the following (1)-(4): (1) the nucleotide sequence GGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACC (SEQ ID NO: 4); (2) the complementary sequence of the nucleotide sequence shown in SEQ ID NO: 4; (3) a nucleotide sequence having at least 80% homology with the nucleotide sequence shown in SEQ ID NO: 4; or (4) a nucleotide sequence having at least 80% homology with the complementary sequence of the nucleotide sequence shown in SEQ ID NO: 4.
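As a quick consistency check, SEQ ID NO: 4 can be compared against the 5'UTR length ranges listed in this definition. The sketch below is illustrative only; the helper names are hypothetical, and the observation that the sequence ends in GCCACC (a motif commonly used as a Kozak-like context immediately upstream of a start codon) is an editorial note rather than a statement from the disclosure.

```python
# SEQ ID NO: 4 as given in the text (DNA alphabet, 5'->3').
SEQ_ID_NO_4 = "GGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACC"

# 5'UTR length ranges listed in the definition, in nucleotides.
DISCLOSED_LENGTH_RANGES = [(3, 500), (5, 150), (10, 100), (15, 90), (20, 70)]


def to_rna(dna: str) -> str:
    """Convert a DNA-alphabet sequence to the RNA alphabet (T -> U)."""
    return dna.upper().replace("T", "U")


def matching_ranges(seq: str) -> list:
    """Return every disclosed length range that contains len(seq)."""
    return [(lo, hi) for lo, hi in DISCLOSED_LENGTH_RANGES if lo <= len(seq) <= hi]


def ends_with_motif(seq: str, motif: str = "GCCACC") -> bool:
    """Check whether the sequence ends with the given motif."""
    return seq.upper().endswith(motif.upper())
```

Calling `matching_ranges(SEQ_ID_NO_4)` shows which of the listed ranges the sequence falls into, and `to_rna(SEQ_ID_NO_4)` gives the form that would appear in the transcript.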

As used herein, the term "3'UTR" refers to the sequence in an mRNA between the stop codon of the polypeptide coding sequence and the poly(A) sequence. The 3'UTR can regulate translation of the mRNA by interacting with mRNA-binding proteins, miRNAs, and the like. The term covers the 3'UTR region of the mRNA as well as the corresponding sequence on the DNA template. The 3'UTR is closely related to post-transcriptional modification and mRNA stability. Its sequence and structural features can affect mRNA stability, ribosome scanning, and formation of the translation termination complex, and thereby affect the level of protein expression. As used herein, the terms "3'UTR structure," "3'UTR," "3'UTR sequence," and "3'UTR element" are used interchangeably, and all refer to the 3'UTR elements, obtained by the inventors through extensive screening, that can enhance expression of a target gene. The 3'UTR sequence comprises one of the following (1)-(4): (1) the nucleotide sequence shown in any one of SEQ ID NOs: 1-3; (2) the complementary sequence of the nucleotide sequence shown in any one of SEQ ID NOs: 1-3; (3) a nucleotide sequence having at least 80% homology with the nucleotide sequence shown in any one of SEQ ID NOs: 1-3; or (4) a nucleotide sequence having at least 80% homology with the complementary sequence of the nucleotide sequence shown in any one of SEQ ID NOs: 1-3. The 3'UTR element of the present invention can be used in designing mRNA molecular structures and DNA templates for mRNA therapeutics, mRNA vaccines, and personalized immunotherapy, in order to improve translation efficiency and increase expression of the target gene.

As used herein, the term "polyA tail" means a poly(A) sequence, and covers the poly(A) tail region of the mRNA as well as the corresponding sequence on the DNA template. Addition of a poly(A) sequence contributes to mRNA stability and transport, protects the mRNA from degradation, and plays an important role in post-transcriptional modification. The poly(A) sequence may be a continuous stretch of adenine nucleotides only, or it may contain non-adenine nucleotides. In any form, a sequence is regarded as a poly(A) sequence as long as it is functionally equivalent to a conventional poly(A) sequence, that is, it provides similar biological functions such as affecting mRNA stability, translation efficiency, or ribosome binding. This includes, but is not limited to, known variants such as the human growth hormone (hGH) poly(A) sequence and the simian virus 40 (SV40) poly(A) sequence, which may differ in nucleotide composition but are regarded as functionally equivalent to a conventional poly(A) sequence. In the present disclosure, the poly(A) sequence has a length of 20-500 adenine nucleotides, for example 25, 50, 100, 150, 175, 200, 300, 400, or 500 adenine nucleotides.

As used herein, the terms "5' cap element" and "5' cap structure" are used interchangeably and include the 5' cap elements found on natural mRNA as well as their analogs. The 5' cap element on natural mRNA is a methylated guanosine joined to the 5'-terminal nucleotide of the RNA through a 5',5'-triphosphate linkage. There are generally three types of 5' cap element (m7G5'ppp5'Np, m7G5'ppp5'NmpNp, and m7G5'ppp5'NmpNmpNp), referred to as Cap0, Cap1, and Cap2, respectively. In Cap0 the ribose of the terminal nucleotide is unmethylated, in Cap1 the ribose of the first terminal nucleotide is methylated, and in Cap2 the riboses of the first two terminal nucleotides are both methylated. Methods for capping mRNA molecules are known in the art. The 5' cap structure of the aforementioned mRNA molecule can be added enzymatically after the mRNA molecule has been obtained by chemical synthesis or in vitro transcription (for example, using a commercial kit containing vaccinia capping enzyme and mRNA cap 2'-O-methyltransferase). Alternatively, capped mRNA can be produced by directly incorporating a cap-bearing nucleotide analog into the transcript as the first nucleotide during in vitro transcription.

As used herein, the terms "promoter" and "promoter element" are used interchangeably and refer to a specific nucleic acid sequence that an RNA polymerase recognizes and binds in order to begin transcription. The promoter is located near the 5' end of the nucleic acid molecule and provides the regulatory sequence required for subsequent transcription and translation. The promoter of the present invention is a T7 RNA polymerase promoter, a T6 viral RNA polymerase promoter, an SP6 viral RNA polymerase promoter, a T3 viral RNA polymerase promoter, or a T4 viral RNA polymerase promoter.

As used herein, the terms "polypeptide," "peptide," and "protein" are used interchangeably and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. The terms "polypeptide," "peptide," "amino acid sequence," and "protein" may also include modified forms, including but not limited to glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation, and ADP-ribosylation. The polypeptide may be of eukaryotic, prokaryotic, or viral origin. In certain embodiments, the polypeptide may be any polypeptide for therapeutic, prophylactic, or diagnostic use. For example, the polypeptide may be an antigen, an antibody, or a gene-editing enzyme such as a CRISPR nuclease. The polypeptide may also be a chimeric antigen receptor, an immunomodulatory protein, a transcription factor, and the like. Examples of polypeptides include, but are not limited to, luciferase, red or green fluorescent protein, human erythropoietin, and β-galactosidase.

As used herein, the term "open reading frame" (ORF) refers to the normal nucleotide sequence of a structural gene that has the potential to encode a protein or polypeptide. It begins at a start codon and ends at a stop codon, with no intervening stop codon that would interrupt translation. On an mRNA strand, the ribosome begins translating at the start codon, synthesizes and extends the polypeptide chain along the RNA sequence, and terminates elongation when it encounters a stop codon.
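The definition above (start codon, in-frame codons, first stop codon) can be sketched as a short scan. This is a simplified illustration, not a gene-prediction method: it assumes the standard stop codons UAA, UAG, and UGA, reads only the given strand, and the function name is hypothetical.

```python
from typing import Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_first_orf(mrna: str) -> Optional[str]:
    """Return the first AUG-to-stop open reading frame, or None if absent.

    The sequence is read 5'->3'; DNA-alphabet input is converted to RNA.
    """
    seq = mrna.upper().replace("T", "U")
    start = seq.find("AUG")
    while start != -1:
        # Walk codon by codon in the frame set by this start codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]  # include the stop codon
        # No in-frame stop codon found; try the next AUG downstream.
        start = seq.find("AUG", start + 1)
    return None
```

For instance, `find_first_orf("GGAUGGCCUAAUU")` returns `"AUGGCCUAA"`: the scan starts at the AUG, reads GCC, and stops at UAA.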

As used herein, the term "vector" refers to a segment of DNA derived from a virus, a plasmid, or a cell of a higher organism, into which a foreign DNA fragment can be or has been inserted for cloning and/or expression purposes. In certain embodiments, the vector can be stably maintained in an organism. The vector may comprise, for example, an origin of replication, a selectable marker or reporter gene such as antibiotic resistance or GFP, and/or a multiple cloning site (MCS). The term includes linear DNA fragments (e.g., PCR products, linearized plasmid fragments), plasmid vectors, viral vectors, cosmids, bacterial artificial chromosomes (BACs), yeast artificial chromosomes (YACs), and the like.

如本文所用，术语“细胞”和“宿主细胞”在本发明中可互换使用，是指表达或者能够表达待表达的序列的细胞。本发明的宿主细胞表达编码具有多种用途(包括生物技术、分子生物学和临床应用)的多肽或RNA的多核苷酸。宿主细胞包括原核细胞或真核细胞，本发明中合适的宿主细胞的实例包括但不限于细菌、酵母细胞、昆虫细胞、动物细胞和哺乳动物细胞。As used herein, the terms "cell" and "host cell" are used interchangeably in the present invention and refer to cells that express or are capable of expressing a sequence to be expressed. The host cells of the present invention express polynucleotides encoding polypeptides or RNAs with a variety of uses, including biotechnology, molecular biology, and clinical applications. Host cells include prokaryotic cells or eukaryotic cells, and examples of suitable host cells in the present invention include, but are not limited to, bacteria, yeast cells, insect cells, animal cells, and mammalian cells.

如本文所用，术语“药学上可接受的载体”是指一种或多种相容性固体、半固体、液体或凝胶填料，它们适合于人体或动物使用，而且必须有足够的纯度和足够低的毒性。“相容性”是指药物组合物中的各组分和药物的活性成分以及它们之间相互掺和，而不明显降低药效。在本发明中，前述药学上可接受的载体包括但不限于缓冲剂、赋形剂、稳定剂或防腐剂。药学上可接受的载体的实例是生理上相容的溶剂、分散介质、包衣、抗细菌和抗真菌剂、等渗剂和吸收延迟剂等，如盐、缓冲液、糖类、抗氧化剂、水性或非水性载体、防腐剂、润湿剂、表面活性剂或乳化剂或其组合。可以基于载体的活性和制剂的所需特性，如稳定性和/或最小氧化，通过实验确定药物组合物中的药学上可接受的载体的量。As used herein, the term "pharmaceutically acceptable carrier" refers to one or more compatible solid, semisolid, liquid or gel fillers that are suitable for human or animal use and must have sufficient purity and sufficiently low toxicity. "Compatibility" refers to the components in the pharmaceutical composition and the active ingredients of the drug and their mutual blending without significantly reducing the efficacy. In the present invention, the aforementioned pharmaceutically acceptable carrier includes but is not limited to a buffer, an excipient, a stabilizer or a preservative. Examples of pharmaceutically acceptable carriers are physiologically compatible solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic agents and absorption delaying agents, such as salts, buffers, sugars, antioxidants, aqueous or non-aqueous carriers, preservatives, wetting agents, surfactants or emulsifiers or combinations thereof. The amount of a pharmaceutically acceptable carrier in a pharmaceutical composition can be determined experimentally based on the activity of the carrier and the desired characteristics of the formulation, such as stability and/or minimal oxidation.

具体实施方式详述Detailed Description of Specific Embodiments

在一些实施方案中，前述3’UTR序列包括与SEQ ID NO:1-3任一项所示核苷酸序列或其互补序列具有至少85％、至少90％、至少95％、至少96％、至少97％、至少98％或至少99％同源性的核苷酸序列。In some embodiments, the aforementioned 3'UTR sequence includes a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% homology with the nucleotide sequence shown in any one of SEQ ID NOs: 1-3 or its complementary sequence.
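The disclosure does not state how the percentage is computed. The following is a minimal sketch, assuming it is taken as percent identity over a global pairwise alignment (Needleman-Wunsch). The scores of +1 for a match, -1 for a mismatch and -1 for a gap, and the function names, are illustrative assumptions and are not taken from the disclosure.

```python
def percent_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity = identical columns / alignment columns, global alignment."""
    n, m = len(a), len(b)
    # Dynamic-programming score matrix for Needleman-Wunsch.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back one optimal alignment, counting columns and identical columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                matches += a[i - 1] == b[j - 1]
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * matches / columns if columns else 100.0


def meets_threshold(candidate: str, reference: str, threshold: float = 80.0) -> bool:
    """True if the candidate reaches the stated percentage against the reference."""
    return percent_identity(candidate, reference) >= threshold
```

Under these assumptions, a candidate would be compared against each of SEQ ID NOs: 1-3 (or their complements) with the threshold set to 80, 85, 90 and so on. A different alignment tool or scoring scheme may give different percentages.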

在一些实施方案中，前述3’UTR序列还包含插入至少一个额外序列的3’UTR序列或其互补序列；优选前述3’UTR序列还包含插入1、2、3、4、5、6、7、8、9或10个额外序列的3’UTR序列或其互补序列。In some embodiments, the aforementioned 3’UTR sequence further comprises a 3’UTR sequence or its complementary sequence into which at least one additional sequence is inserted; preferably, the aforementioned 3’UTR sequence further comprises a 3’UTR sequence or its complementary sequence into which 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 additional sequences are inserted.

在一些实施方案中，前述3’UTR序列包含具有插入额外序列的SEQ ID NO:1-3任一项所示的核苷酸序列或其互补序列；优选额外序列为miRNA结合位点；更优选额外序列选自全长的microRNA反向互补序列或其种子序列的反向互补序列；优选全长的microRNA反向互补序列长度为19-25nt；种子序列的反向互补序列长度为7-8nt。向候选3’UTR中插入microRNA结合位点，不影响3’UTR的翻译调控功能，也不影响3’UTR对mRNA稳定性的调控功能。因此，向候选3’UTR中插入microRNA结合位点不影响其功能。In some embodiments, the aforementioned 3'UTR sequence comprises a nucleotide sequence shown in any one of SEQ ID NO: 1-3 or its complementary sequence with an inserted additional sequence; preferably, the additional sequence is a miRNA binding site; more preferably, the additional sequence is selected from the full-length microRNA reverse complementary sequence or the reverse complementary sequence of its seed sequence; preferably, the full-length microRNA reverse complementary sequence is 19-25nt in length; and the reverse complementary sequence of the seed sequence is 7-8nt in length. Inserting a microRNA binding site into a candidate 3'UTR does not affect the translational regulatory function of the 3'UTR, nor does it affect the regulatory function of the 3'UTR on the stability of mRNA. Therefore, inserting a microRNA binding site into a candidate 3'UTR does not affect its function.

在一些实施方案中，前述3’UTR序列包含具有插入至少一个额外序列的SEQ IDNO:1-3任一项所示的核苷酸序列或其互补序列；优选前述3’UTR序列还包含插入1、2、3、4、5、6、7、8、9或10个额外序列的SEQ ID NO:1-3任一项所示的核苷酸序列或其互补序列。In some embodiments, the aforementioned 3'UTR sequence comprises a nucleotide sequence shown in any one of SEQ ID NOs: 1-3 or a complementary sequence thereof with at least one additional sequence inserted therein; preferably, the aforementioned 3'UTR sequence further comprises a nucleotide sequence shown in any one of SEQ ID NOs: 1-3 or a complementary sequence thereof with 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 additional sequences inserted therein.

在一些实施方案中，全长的microRNA的反向互补序列包括SEQ ID NO:8所示核苷酸序列。In some embodiments, the reverse complementary sequence of the full-length microRNA includes the nucleotide sequence shown in SEQ ID NO:8.

在一些实施方式中，前述DNA分子还包含编码感兴趣多肽或蛋白的开放阅读框。In some embodiments, the aforementioned DNA molecule further comprises an open reading frame encoding a polypeptide or protein of interest.

(2)前述3’UTR序列。(2) The aforementioned 3’UTR sequence.

在一些优选的实施方案中，mRNA在5’-3’方向包含：In some preferred embodiments, the mRNA comprises in the 5'-3' direction:

(1)无或5’帽子元件；(1) a 5' cap element, or none;

(2)无或5’UTR序列或内部核糖体进入位点(IRES)序列；(2) a 5'UTR sequence or an internal ribosome entry site (IRES) sequence, or none;

(3)编码目的多肽或目的蛋白的开放阅读框；(3) an open reading frame encoding a target polypeptide or target protein;

(4)前述3’UTR序列；和(4) the aforementioned 3'UTR sequence; and

(5)无或多聚(A)序列。(5) a poly(A) sequence, or none.
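The 5'-to-3' element order listed above can be sketched as a simple concatenation. This is an illustrative sketch only: the class and function names are hypothetical, the sequences in the usage example are short placeholders rather than sequences of the disclosure, and the DNA alphabet (T instead of U) is used because the disclosure writes its sequences that way. The 5' cap is a chemical structure and is therefore recorded as a label rather than as nucleotides. The ORF check follows the definition of an open reading frame given earlier (start codon, terminal stop codon, no in-frame stop codon in between).

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons written in the DNA alphabet


def is_valid_orf(orf: str) -> bool:
    """Start codon first, stop codon last, no in-frame stop codon in between."""
    if len(orf) < 6 or len(orf) % 3 != 0 or not orf.startswith("ATG"):
        return False
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[1:-1])


@dataclass
class MrnaDesign:
    orf: str                 # element (3): required
    three_utr: str           # element (4): required
    five_utr: str = ""       # element (2): 5'UTR or IRES, may be absent
    poly_a_length: int = 0   # element (5): 0 means no poly(A) sequence
    cap: str = ""            # element (1): e.g. "Cap1"; not part of the nucleotide string

    def sequence(self) -> str:
        """Concatenate the nucleotide elements in 5'-to-3' order."""
        if not is_valid_orf(self.orf):
            raise ValueError("ORF must run from ATG to a single terminal stop codon")
        if not self.three_utr:
            raise ValueError("a 3'UTR sequence is required")
        return self.five_utr + self.orf + self.three_utr + "A" * self.poly_a_length


# Placeholder sequences, for illustration only.
design = MrnaDesign(orf="ATGGCCTAA", three_utr="GATTACA", poly_a_length=5, cap="Cap1")
print(design.sequence())
```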

在一些优选的实施方案中，前述5’帽子元件选自Cap0帽结构、Cap1帽结构或Cap2帽结构。在一些更优选的实施方案中，前述5’帽子元件为Cap1帽结构。In some preferred embodiments, the aforementioned 5' cap element is selected from Cap0 cap structure, Cap1 cap structure or Cap2 cap structure. In some more preferred embodiments, the aforementioned 5' cap element is Cap1 cap structure.

在一些实施方式中，前述5’UTR元件的序列包含下组(1)-(4)中的一种：In some embodiments, the sequence of the aforementioned 5'UTR element comprises one of the following groups (1)-(4):

(1)SEQ ID NO:4所示核苷酸序列；(1) the nucleotide sequence shown in SEQ ID NO:4;

(2)SEQ ID NO:4所示核苷酸序列的互补序列；(2) a complementary sequence to the nucleotide sequence shown in SEQ ID NO: 4;

(3)与SEQ ID NO:4所示核苷酸序列具有至少80％同源性的核苷酸序列；或(3) a nucleotide sequence having at least 80% homology to the nucleotide sequence shown in SEQ ID NO: 4; or

(4)与SEQ ID NO:4所示核苷酸序列的互补序列具有至少80％同源性的核苷酸序列。(4) A nucleotide sequence having at least 80% homology with the complementary sequence of the nucleotide sequence shown in SEQ ID NO:4.

在一些实施方式中，前述多聚A序列具有20-500个腺嘌呤核苷酸的长度。在一些更优选的实施方案中，前述多聚A序列具有25个、50个、100个、150个、175个、200个、300个、400个或500个腺嘌呤核苷酸的长度。In some embodiments, the aforementioned poly A sequence has a length of 20-500 adenine nucleotides. In some more preferred embodiments, the aforementioned poly A sequence has a length of 25, 50, 100, 150, 175, 200, 300, 400 or 500 adenine nucleotides.

在一些优选的实施方式中，前述DNA分子还包含至少一种核苷酸修饰。至少一种核苷酸修饰包括但不限于胞苷修饰、尿苷修饰或腺苷修饰。在一些更优选的实施方案中，至少一种核苷修饰包括但不限于5-甲基胞嘧啶(m5C)、N6-甲基腺苷(m6A)、假尿苷(ψ)、N1-甲基假尿苷(m1ψ)和5-甲氧基尿苷(5moU)。In some preferred embodiments, the aforementioned DNA molecule further comprises at least one nucleotide modification. At least one nucleotide modification includes but is not limited to cytidine modification, uridine modification or adenosine modification. In some more preferred embodiments, at least one nucleoside modification includes but is not limited to 5-methylcytosine (m5C), N6-methyladenosine (m6A), pseudouridine (ψ), N1-methylpseudouridine (m1ψ) and 5-methoxyuridine (5moU).

在一些实施方案中，前述载体还包含与DNA分子的编码序列可操作地连接的RNA聚合酶启动子序列。可操作地连接的启动子允许DNA分子的体内和/或体外转录。前述启动子是T7RNA聚合酶启动子、T6病毒RNA聚合酶启动子、SP6病毒RNA聚合酶启动子、T3病毒RNA聚合酶启动子或T4病毒RNA聚合酶启动子。In some embodiments, the aforementioned vector also comprises an RNA polymerase promoter sequence operably connected to the coding sequence of the DNA molecule. The operably connected promoter allows in vivo and/or in vitro transcription of the DNA molecule. The aforementioned promoter is a T7 RNA polymerase promoter, a T6 viral RNA polymerase promoter, an SP6 viral RNA polymerase promoter, a T3 viral RNA polymerase promoter, or a T4 viral RNA polymerase promoter.

在一些实施方案中，前述载体包括但不限于线性DNA片段(例如，PCR产物、线性的质粒片段)、质粒载体、病毒载体、粘粒、细菌人工染色体(BAC)或酵母人工染色体(YAC)等。在一些优选的实施方案中，前述载体包括质粒载体或病毒载体。在一些更优选的实施方案中，前述载体在前述mRNA分子的编码序列的3’侧翼包含限制性核酸内切酶位点，例如IIS型限制性核酸内切酶位点。合适的限制性核酸内切酶包括但不限于BsmBI、BsaI或SapI等。前述限制性核酸内切酶位点可以用于对载体进行线性化以进行体外转录。In some embodiments, the aforementioned vector includes but is not limited to linear DNA fragments (e.g., PCR products, linear plasmid fragments), plasmid vectors, viral vectors, cosmids, bacterial artificial chromosomes (BAC) or yeast artificial chromosomes (YAC) etc. In some preferred embodiments, the aforementioned vector includes a plasmid vector or a viral vector. In some more preferred embodiments, the aforementioned vector comprises a restriction endonuclease site, such as an IIS type restriction endonuclease site, at the 3' flank of the coding sequence of the aforementioned mRNA molecule. Suitable restriction endonucleases include but are not limited to BsmBI, BsaI or SapI etc. The aforementioned restriction endonuclease site can be used to linearize the vector for in vitro transcription.

另一方面，本公开提供了一种宿主细胞，宿主细胞包含3’UTR序列、前述DNA分子和/或前述表达载体。In another aspect, the present disclosure provides a host cell comprising the 3'UTR sequence, the aforementioned DNA molecule and/or the aforementioned expression vector.

在一些实施方案中，前述细胞包括原核细胞或真核细胞。在一些优选的实施方案中，前述细胞选自下组：大肠杆菌、酵母细胞或哺乳动物细胞。在一些更优选的实施方案中，细胞为哺乳动物细胞，包括但不限于啮齿类动物细胞如小鼠细胞或大鼠细胞等；灵长类动物细胞如猴细胞或人细胞等。在一些更优选的实施方案中，前述细胞为人体细胞。更进一步，前述细胞为293T细胞。In some embodiments, the aforementioned cells include prokaryotic cells or eukaryotic cells. In some preferred embodiments, the aforementioned cells are selected from the group consisting of Escherichia coli, yeast cells, or mammalian cells. In some more preferred embodiments, the cells are mammalian cells, including but not limited to rodent cells such as mouse cells or rat cells, etc.; primate cells such as monkey cells or human cells, etc. In some more preferred embodiments, the aforementioned cells are human cells. Furthermore, the aforementioned cells are 293T cells.

(1)前述3’UTR序列；(1) the aforementioned 3’UTR sequence;

(2)前述DNA分子；(2) the aforementioned DNA molecule;

(3)前述mRNA；(3) the aforementioned mRNA;

(4)前述表达载体；或(4) the aforementioned expression vector; or

(5)前述细胞。(5) The aforementioned cells.

在一些实施方案中，前述药物组合物中mRNA本身还可以充当佐剂。In some embodiments, the mRNA itself in the aforementioned pharmaceutical composition can also serve as an adjuvant.

在一些实施方案中，前述药物组合物的剂型选自注射剂或冻干剂。In some embodiments, the dosage form of the aforementioned pharmaceutical composition is selected from an injection or a lyophilized preparation.

另一方面，本公开提供了一种促进感兴趣多肽或蛋白表达的方法，方法包括向宿主细胞中导入3’UTR序列、前述DNA分子、前述mRNA分子和/或前述载体。本发明方法可以使前述宿主细胞含有前述核酸分子或前述载体，或其基因组中整合有前述3'UTR序列或核酸分子，从而提高宿主细胞中mRNA或载体翻译。In another aspect, the present disclosure provides a method for promoting the expression of a polypeptide or protein of interest, the method comprising introducing the 3'UTR sequence, the aforementioned DNA molecule, the aforementioned mRNA molecule and/or the aforementioned vector into a host cell. The method of the present invention allows the host cell to contain the aforementioned nucleic acid molecule or vector, or to have the aforementioned 3'UTR sequence or nucleic acid molecule integrated into its genome, thereby enhancing translation from the mRNA or vector in the host cell.

在一些实施方案中，导入可通过本领域已知的方法进行，例如显微注射或脂质体介导的转染或电转法。In some embodiments, the introduction can be performed by methods known in the art, such as microinjection, liposome-mediated transfection or electroporation.

在一些实施方案中，将前述mRNA依次按照“头对尾关系”顺序偶联并克隆至载体的多克隆位点，并在poly A尾后加入IIS型限制性核酸内切酶位点，用于引导IIS型限制性核酸内切酶对DNA模板进行切割。载体的多克隆位点是指包含限制性内切酶的核酸区域，其中任何一个都可以用于切割载体并插入序列。限制性内切酶可以识别双链DNA分子上的特定序列(结合位点)，并切割磷酸二酯键。载体在IIS型限制性核酸内切酶的作用下会被切割，切割位点位于距DNA结合位点确定的距离处。进行产物回收，得到线性化的质粒。本公开中所涉及的IIS型限制性核酸内切酶包括但不限于，BsmBI，BsaI或SapI。In some embodiments, the elements of the aforementioned mRNA are joined sequentially in head-to-tail order and cloned into the multiple cloning site of the vector, and a type IIS restriction endonuclease site is added after the poly(A) tail to direct the type IIS restriction endonuclease to cut the DNA template. The multiple cloning site of the vector is a nucleic acid region containing restriction endonuclease sites, any of which can be used to cut the vector and insert a sequence. A restriction endonuclease recognizes a specific sequence (the binding site) on a double-stranded DNA molecule and cleaves the phosphodiester bond. The vector is cut by the type IIS restriction endonuclease at a defined distance from the DNA binding site. The product is then recovered to obtain a linearized plasmid. The type IIS restriction endonucleases involved in the present disclosure include, but are not limited to, BsmBI, BsaI or SapI.
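The statement that a type IIS enzyme cuts at a defined distance from its binding site can be illustrated with a short sketch. The recognition sequences and cut offsets below (BsaI GGTCTC N1/N5, BsmBI CGTCTC N1/N5, SapI GCTCTTC N1/N4) are the commonly published specifications for these enzymes and are not taken from the disclosure. The function name and the template sequences are hypothetical, and only sites in the forward orientation of the given strand are considered.

```python
# Recognition site, top-strand cut offset, bottom-strand cut offset
# (offsets counted in nucleotides downstream of the recognition site).
TYPE_IIS_ENZYMES = {
    "BsaI": ("GGTCTC", 1, 5),
    "BsmBI": ("CGTCTC", 1, 5),
    "SapI": ("GCTCTTC", 1, 4),
}


def find_cut_positions(template: str, enzyme: str) -> list[tuple[int, int]]:
    """Return (top_cut, bottom_cut) indices for each forward-oriented site.

    An index k means the strand is cut between template[k-1] and template[k].
    Sites too close to the end of the template to be cut are skipped.
    """
    site, top_offset, bottom_offset = TYPE_IIS_ENZYMES[enzyme]
    cuts = []
    start = template.find(site)
    while start != -1:
        site_end = start + len(site)
        top_cut = site_end + top_offset
        bottom_cut = site_end + bottom_offset
        if bottom_cut <= len(template):
            cuts.append((top_cut, bottom_cut))
        start = template.find(site, start + 1)
    return cuts


# Placeholder template, for illustration only.
print(find_cut_positions("AAAGGTCTCATTTTGG", "BsaI"))
```

Because the cut falls outside the recognition sequence, the orientation and placement of the site relative to the poly(A) tail determine which nucleotides remain at the 3' end of the linearized template. A practical design would also have to scan the reverse strand, which this sketch does not do.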

通过将线性化的质粒模板使用T7 RNA聚合酶进行体外转录，可以生产mRNA分子，并同时添加5’帽结构。5’帽结构可以通过共转录加帽的方式，在体外转录过程中将帽类似物作为第一核苷酸掺入转录本，直接生产带有cap1结构的mRNA。Cap1结构也可以通过转录后加帽的方式，在体外转录结束后使用牛痘加帽酶添加cap0结构，进一步使用mRNA帽结构2’-O-甲基转移酶添加cap1结构。将这样得到的mRNA纯化并重悬在水中。并进行后续的转染和检测工作。mRNA molecules can be produced by in vitro transcription of the linearized plasmid template with T7 RNA polymerase, with a 5' cap structure added at the same time. The 5' cap structure can be added by co-transcriptional capping, in which a cap analog is incorporated into the transcript as the first nucleotide during in vitro transcription, directly producing mRNA with a Cap1 structure. The Cap1 structure can also be added by post-transcriptional capping, in which a Cap0 structure is added with vaccinia capping enzyme after in vitro transcription and is then converted to a Cap1 structure with mRNA cap 2'-O-methyltransferase. The mRNA thus obtained is purified and resuspended in water for subsequent transfection and detection.

将包含生成的5’UTR序列元件，ORF为萤火虫荧光素酶序列(firefly luciferase)的mRNA分子转染至哺乳动物细胞293T细胞内。转染24小时后，收集细胞样本并进行化学发光强度分析，比较不同优化的5’UTR对于firefly luciferase蛋白表达水平的影响。通过该实验来证明经过优化的5’UTR对蛋白的翻译效率产生的显著影响。The mRNA molecule containing the generated 5'UTR sequence element, whose ORF is the firefly luciferase sequence, is transfected into mammalian 293T cells. Twenty-four hours after transfection, cell samples are collected and analyzed for chemiluminescence intensity to compare the effects of the different optimized 5'UTRs on the expression level of the firefly luciferase protein. This experiment is used to demonstrate the significant effect of the optimized 5'UTR on protein translation efficiency.

另一方面，本公开提供了一种3’UTR序列、前述DNA分子、前述mRNA分子、前述载体、前述细胞、前述药物组合物和/或前述方法在促进感兴趣多肽或蛋白表达中的用途。In another aspect, the present disclosure provides a use of the 3'UTR sequence, the aforementioned DNA molecule, the aforementioned mRNA molecule, the aforementioned vector, the aforementioned cell, the aforementioned pharmaceutical composition and/or the aforementioned method in promoting the expression of a polypeptide or protein of interest.

为了达到清楚和简洁描述的目的，本文中作为相同的或分开的一些实施方案的一部分来描述特征，然而，将要理解的是，本公开的范围可包括具有所描述的所有或一些特征的组合的一些实施方案。For purposes of clarity and concise description, features are described herein as part of the same or separate embodiments, however, it will be understood that the scope of the present disclosure may include embodiments having a combination of all or some of the described features.

实施例Examples

实施例1：包含3’UTR序列的载体构建Example 1: Construction of a vector containing a 3'UTR sequence

使用公共数据库中已公开发表的人类细胞Ribo-seq数据和RNA-seq数据，训练了深度学习预测模型，该模型能够基于mRNA 3’UTR序列预测蛋白质翻译效率。随后，采用人工智能分析策略，包括对抗生成网络(GAN)和长短期记忆递归神经网络(LSTM)等技术，使用人类基因组中的3’UTR序列进行训练，构建生成式学习模型，从中产生非天然、新的3’UTR序列。将生成模型与预测模型结合，对生成的非天然3’UTR序列进行评估，预测由这些3’UTR及蛋白编码ORF组成的RNA翻译效率，从中筛选出一批可能实现较高蛋白翻译效率的3’UTR序列。然后通过湿试验排除实际表达中蛋白翻译效率不高的序列，最终得到的候选3’UTR序列见表1。A deep learning prediction model was trained on published human cell Ribo-seq and RNA-seq data from public databases; the model predicts protein translation efficiency from the mRNA 3'UTR sequence. Subsequently, artificial intelligence techniques, including generative adversarial networks (GANs) and long short-term memory recurrent neural networks (LSTMs), were trained on 3'UTR sequences from the human genome to build a generative model, which was used to produce new, non-natural 3'UTR sequences. The generative model was combined with the prediction model to evaluate the generated non-natural 3'UTR sequences: the translation efficiency of RNAs composed of these 3'UTRs and a protein-coding ORF was predicted, and a set of 3'UTR sequences likely to achieve higher protein translation efficiency was selected. Sequences that showed low protein translation efficiency in actual expression were then excluded by wet-lab experiments, and the candidate 3'UTR sequences finally obtained are shown in Table 1.

表1候选3’UTR序列Table 1 Candidate 3'UTR sequences

为了测试候选3’UTR，体外合成包含T7启动子、3’UTR(表1)、编码萤火虫荧光素酶的序列(表2)、5’UTR(SEQ ID NO:4)、包含120个A核苷酸残基的多聚(A)序列、和IIS型限制性核酸内切酶切割位点的核酸片段，并克隆到体外转录载体中(pIVTRup,Addgeneplasmid#101362)。To test candidate 3'UTRs, a nucleic acid fragment containing a T7 promoter, a 3'UTR (Table 1), a sequence encoding firefly luciferase (Table 2), a 5'UTR (SEQ ID NO:4), a poly(A) sequence containing 120 A nucleotide residues, and a type IIS restriction endonuclease cleavage site was synthesized in vitro and cloned into an in vitro transcription vector (pIVTRup, Addgene plasmid #101362).

表2序列信息Table 2 Sequence information

实施例2：mRNA分子长度和完整度检测Example 2: mRNA molecule length and integrity detection

将实施例1获得的载体进行线性化处理，并使用T7-RNA聚合酶进行体外转录生产mRNA分子，并同时添加5’帽结构。5’帽结构通过共转录加帽的方式，在体外转录过程中将帽类似物作为第一核苷酸掺入转录本，直接生产带有Cap1结构的mRNA分子。The vector obtained in Example 1 was linearized, and T7-RNA polymerase was used to perform in vitro transcription to produce mRNA molecules, and a 5' cap structure was added at the same time. The 5' cap structure was added by co-transcriptional capping, and the cap analog was incorporated into the transcript as the first nucleotide during in vitro transcription, directly producing mRNA molecules with Cap1 structure.

将这样得到的mRNA分子纯化并重悬在水中。使用5200片段分析仪(Agilent)对mRNA分子进行质检来检测mRNA分子长度和完整度(数值来源于预期长度片段的曲线下面积占比)符合要求，即可用于后续对不同候选3’UTR序列元件的测试。The mRNA molecules thus obtained were purified and resuspended in water. The mRNA molecules were quality checked using a 5200 fragment analyzer (Agilent) to detect the length and integrity of the mRNA molecules (the values were derived from the area under the curve of the expected length fragments). If they met the requirements, they could be used for subsequent testing of different candidate 3'UTR sequence elements.
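The integrity value described above, the share of the area under the curve that belongs to the fragment of expected length, can be sketched as follows. This is not the algorithm of the instrument software. The trace values, the size window around the expected length and the function names are assumptions made for illustration.

```python
def trapezoid_area(x: list[float], y: list[float]) -> float:
    """Area under a sampled curve by the trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def integrity_percent(sizes_nt: list[float], signal: list[float],
                      expected_nt: float, half_window_nt: float) -> float:
    """Percent of total trace area lying within expected_nt +/- half_window_nt.

    Only trace segments whose two end points both fall inside the window
    are counted toward the expected-length peak.
    """
    if len(sizes_nt) != len(signal) or len(sizes_nt) < 2:
        raise ValueError("need two equally long lists with at least two points")
    total = trapezoid_area(sizes_nt, signal)
    if total <= 0:
        raise ValueError("trace has no positive area")
    lo, hi = expected_nt - half_window_nt, expected_nt + half_window_nt
    in_window = sum(
        (sizes_nt[i + 1] - sizes_nt[i]) * (signal[i] + signal[i + 1]) / 2
        for i in range(len(sizes_nt) - 1)
        if lo <= sizes_nt[i] and sizes_nt[i + 1] <= hi
    )
    return 100.0 * in_window / total


# Placeholder trace: one peak at the expected size and one shorter fragment.
sizes = [700, 800, 900, 1000, 1100]
signal = [0, 10, 0, 10, 0]
print(integrity_percent(sizes, signal, expected_nt=1000, half_window_nt=100))
```

The acceptance threshold applied to this percentage is not stated in the disclosure and is left to the user.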

实施例3：3’UTR序列元件的翻译效率验证Example 3: Verification of translation efficiency of 3'UTR sequence elements

本实施例中通过荧光素酶报告基因系统鉴定3’UTR序列元件的翻译效率调控作用。In this example, the luciferase reporter gene system was used to identify the translation efficiency regulation effect of 3'UTR sequence elements.

实验测试的3’UTR序列元件是否对翻译具有正向的调控作用，通过如下方法进行：Whether the tested 3'UTR sequence elements have a positive regulatory effect on translation was determined by the following method:

用含有一个3’UTR序列元件的RNA分子(编码蛋白序列为萤火虫荧光素酶)转染哺乳动物细胞293T。在转染后的特定时间点(24小时)检测荧光素酶的化学发光光吸收值，代表其蛋白表达量。用包含3’UTR序列元件的光吸收值读数减去背景组的光吸收值，得到该组的相对光单位。每个3’UTR序列元件在本实验中经检测处理得到对应的相对光单位，即可指征该3’UTR序列元件的翻译调控能力。Mammalian 293T cells were transfected with an RNA molecule containing one 3'UTR sequence element (the encoded protein being firefly luciferase). The chemiluminescent light absorption value of luciferase was measured at a specific time point after transfection (24 hours) as a measure of protein expression. The light absorption value of the background group was subtracted from the reading of the group containing the 3'UTR sequence element to obtain the relative light units of that group. The relative light units obtained for each 3'UTR sequence element in this experiment indicate the translation regulation ability of that element.

用于确定翻译效率的荧光素酶化学发光光吸收值的具体的实验过程如下：The specific experimental procedure for measuring the luciferase chemiluminescent light absorption values used to determine translation efficiency is as follows:

将293T人胚肾细胞以5×10⁴个细胞/孔的密度接种在48孔板中。次日，将细胞在Opti-MEM中洗涤，并随后用200ng/孔的在Opti-MEM中的Lipofectamine2000-复合的包含3’UTR序列元件的编码萤火虫荧光素酶的mRNA转染。没有加入任何RNA分子的细胞用作为背景组。转染6小时后，吸去混合培养基，替换为完全培养基。转染24小时后，吸去培养基，加入100μL的裂解缓冲液(Promega)，室温裂解5分钟。293T human embryonic kidney cells were seeded in 48-well plates at a density of 5×10⁴ cells/well. The next day, the cells were washed in Opti-MEM and then transfected with 200 ng/well of Lipofectamine 2000-complexed mRNA in Opti-MEM, the mRNA encoding firefly luciferase and containing a 3'UTR sequence element. Cells to which no RNA molecule was added served as the background group. Six hours after transfection, the transfection medium was aspirated and replaced with complete medium. Twenty-four hours after transfection, the medium was aspirated, 100 μL of lysis buffer (Promega) was added, and the cells were lysed at room temperature for 5 minutes.

在多孔板读数器(Agilent)中以相对光单位(RLU)测量荧光素酶活性。在荧光素酶测定中从单个样品顺序测量firefly luciferase的活性。吸取20μL裂解液，加入50μL含有萤火虫荧光素酶底物的缓冲液，震板混匀，检测光吸收值，见表3。Luciferase activity was measured in relative light units (RLU) in a multiwell plate reader (Agilent). The activity of firefly luciferase was measured sequentially from individual samples in the luciferase assay. 20 μL of lysate was pipetted, 50 μL of buffer containing firefly luciferase substrate was added, the plate was shaken to mix, and the light absorbance was measured, see Table 3.

表3包含3’UTR序列元件原始光吸收值读取数据Table 3 contains the raw absorbance readings of 3'UTR sequence elements

用包含3’UTR序列元件的光吸收值读数减去背景孔的光吸收值，得到各组的相对光单位，见表4。The light absorption value of the background wells was subtracted from the readings of the wells containing the 3'UTR sequence elements to obtain the relative light units of each group (see Table 4).
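The background subtraction described above amounts to the following calculation. This is a minimal sketch in which the readings are placeholder numbers, not the values of Table 3. Averaging the background wells before subtraction is an assumption, since the disclosure does not state how multiple background wells are combined.

```python
def relative_light_units(sample_readings: list[float],
                         background_readings: list[float]) -> list[float]:
    """Subtract the mean background reading from each sample reading."""
    if not background_readings:
        raise ValueError("at least one background reading is required")
    background = sum(background_readings) / len(background_readings)
    return [reading - background for reading in sample_readings]


# Placeholder raw readings for one 3'UTR construct and for untransfected cells.
construct_wells = [1500.0, 1700.0]
background_wells = [100.0, 300.0]
print(relative_light_units(construct_wells, background_wells))
```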

表4包含3’UTR序列元件的相对光单位Table 4 Relative light units containing 3'UTR sequence elements

统计表4的数据做成柱状图1，由图1可知，各组3’UTR对于荧光素酶报告基因的激活强弱有所不同，但均能有效激活，说明本公开提供的非天然3’UTR均具备类似天然3’UTR的引导蛋白表达的能力。The data in Table 4 are plotted as a bar chart in Figure 1. As Figure 1 shows, the 3'UTRs differ in how strongly they activate the luciferase reporter gene, but all activate it effectively, indicating that the non-natural 3'UTRs provided by the present disclosure all have a capacity to direct protein expression similar to that of natural 3'UTRs.

实施例4：向候选3’UTR中插入microRNA结合位点不影响其功能Example 4: Insertion of a microRNA binding site into a candidate 3'UTR does not affect its function

本实施例利用在细胞内快速降解的荧光蛋白(d1EGFP)测试候选3’UTR序列元件在改造后功能是否发生改变，确认其对翻译具有正向调控功能，以及mRNA分子稳定性的调控作用。In this example, a fluorescent protein (d1EGFP) that is rapidly degraded in cells is used to test whether the function of the candidate 3'UTR sequence element is changed after modification, confirming that it has a positive regulatory function on translation and a regulatory effect on the stability of mRNA molecules.

方法如下：The method is as follows:

1)向候选3’UTR中的3个随机位点插入全长的hsa-microRNA-1-3p反向互补序列，得到新的3’UTR。1) Insert the full-length hsa-microRNA-1-3p reverse complementary sequence into three random sites in the candidate 3’UTR to obtain a new 3’UTR.

2)用含有新的3’UTR及原始3’UTR序列元件的RNA分子(均编码1小时降解的绿色荧光蛋白，d1EGFP)，转染哺乳动物细胞系，如293T、Hela、raw264.7等。2) Transfect mammalian cell lines such as 293T, Hela, raw264.7, etc. with RNA molecules containing the new 3’UTR and the original 3’UTR sequence elements (both encoding 1-hour degraded green fluorescent protein, d1EGFP).

3)在转染后的24小时内，每间隔一小时对细胞进行拍摄，检测荧光蛋白光强，代表其蛋白表达量；3) Within 24 hours after transfection, the cells were photographed every hour to detect the intensity of the fluorescent protein, which represents the protein expression level;

4)用绿色荧光蛋白的光强的除以内参(红色荧光蛋白光强)，得到实验组或对照组的相对光强。相对光强可指征该3’UTR序列的翻译调控能力，且在24小时内的荧光强度变化可指征该3’UTR序列对于mRNA分子稳定性的影响。4) Divide the intensity of green fluorescent protein by the internal reference (intensity of red fluorescent protein) to obtain the relative intensity of the experimental group or control group. The relative intensity can indicate the translational regulation ability of the 3'UTR sequence, and the change in fluorescence intensity within 24 hours can indicate the effect of the 3'UTR sequence on the stability of the mRNA molecule.
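Step 4) can be sketched as a per-time-point ratio of green to red fluorescence, computed for the modified and the original 3'UTR and then compared. The intensities below are placeholders and the function names are hypothetical. The largest absolute difference between the two curves is used here only as a simple descriptive comparison, because the disclosure does not state which statistic was used to judge significance.

```python
def relative_intensity(green: list[float], red: list[float]) -> list[float]:
    """Green fluorescence divided by the red internal reference, per time point."""
    if len(green) != len(red):
        raise ValueError("green and red series must have the same length")
    return [g / r for g, r in zip(green, red)]


def max_abs_difference(curve_a: list[float], curve_b: list[float]) -> float:
    """Largest absolute gap between two equally sampled time courses."""
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same length")
    return max(abs(a - b) for a, b in zip(curve_a, curve_b))


# Placeholder hourly intensities (three time points shown for brevity).
original_utr = relative_intensity([10.0, 30.0, 20.0], [5.0, 10.0, 10.0])
modified_utr = relative_intensity([11.0, 30.0, 19.0], [5.0, 10.0, 10.0])
print(original_utr, modified_utr, max_abs_difference(original_utr, modified_utr))
```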

全长的hsa-microRNA-1-3p的序列如下所示(SEQ ID NO:7)：The sequence of the full-length hsa-microRNA-1-3p is shown below (SEQ ID NO: 7):

TGGAATGTAAAGAAGTATGTAT。TGGAATGTAAAGAAGTATGTAT.

全长的hsa-microRNA-1-3p的反向互补序列如下所示(SEQ ID NO:8)：The reverse complementary sequence of the full-length hsa-microRNA-1-3p is shown below (SEQ ID NO: 8):

ATACATACTTCTTTACATTCCA。ATACATACTTCTTTACATTCCA.
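The relationship between SEQ ID NO: 7 and SEQ ID NO: 8 can be checked with a short reverse-complement routine. The seed portion is included as an illustration of the 7-8 nt option mentioned earlier. Taking the seed as nucleotides 2-8 of the microRNA is a common convention and an assumption here, not a definition given in the disclosure.

```python
# Complement table for sequences written in the DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


MIR_1_3P = "TGGAATGTAAAGAAGTATGTAT"           # SEQ ID NO: 7
FULL_SITE = reverse_complement(MIR_1_3P)      # should equal SEQ ID NO: 8

SEED = MIR_1_3P[1:8]                          # nucleotides 2-8 (assumed seed definition)
SEED_SITE = reverse_complement(SEED)          # 7 nt reverse complement of the seed

print(FULL_SITE)
print(SEED, SEED_SITE)
```

The full-length site is 22 nt, within the 19-25 nt range stated earlier, and the seed site computed this way is 7 nt.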

向3’UTR-seq1中随机三个位点插入全长的hsa-microRNA-1-3p的反向互补序列后得到的新的3’UTR的序列如下所示(SEQ ID NO:9)：The sequence of the new 3'UTR obtained by inserting the reverse complementary sequence of the full-length hsa-microRNA-1-3p into three random sites in 3'UTR-seq1 is as follows (SEQ ID NO: 9):

ATACATACTTCTTTACATTCCAAGACGCTGGCTGGACTGTTACCAGTAGGGATGTATGTAGCATACATACTTCTTTACATTCCATCGTGCAGACACTTCTTAGAAAATATGACATACATACTTCTTTACATTCCAAGTTATGTAATGTATATTCCCTTTTCATAGC.
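Inserting a binding site at several random positions, and checking a finished sequence for the number of sites it carries, can be sketched as follows. The function name, the placeholder 3'UTR and the fixed random seed are assumptions made for illustration. The disclosure does not state how the three positions were chosen beyond being random.

```python
import random

SITE = "ATACATACTTCTTTACATTCCA"  # SEQ ID NO: 8

# SEQ ID NO: 9 as given in the text, written without line breaks.
SEQ_ID_NO_9 = (
    "ATACATACTTCTTTACATTCCAAGACGCTGGCTGGACTGTTACCAGTAGGGATGTAT"
    "GTAGCATACATACTTCTTTACATTCCATCGTGCAGACACTTCTTAGAAAATATGACATAC"
    "ATACTTCTTTACATTCCAAGTTATGTAATGTATATTCCCTTTTCATAGC"
)


def insert_sites(utr: str, site: str, n_sites: int, rng: random.Random) -> str:
    """Insert `site` at n_sites distinct random positions of `utr`."""
    positions = rng.sample(range(len(utr) + 1), n_sites)
    # Insert from the highest position down so earlier indices stay valid.
    for pos in sorted(positions, reverse=True):
        utr = utr[:pos] + site + utr[pos:]
    return utr


# Count the sites present in SEQ ID NO: 9.
print(SEQ_ID_NO_9.count(SITE))

# Illustrative insertion into a placeholder 3'UTR (not a sequence of the disclosure).
modified = insert_sites("GATTACAGATTACA", SITE, 3, random.Random(0))
print(modified)
```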

1小时降解的绿色荧光蛋白(d1EGFP)的氨基酸序列如下所示(SEQ ID NO:10)：The amino acid sequence of 1 hour degraded green fluorescent protein (d1EGFP) is as follows (SEQ ID NO: 10):

MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKKLSHGFPPAVAAQDDGTLPMSCAQESGMDRHPAACASARINV*。MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKKLSHGFPPAVAAQDDGTLPMSCAQESGMDRHPAACASARINV*.

其中，“*”表示终止。Among them, "*" means termination.

包含1小时降解的绿色荧光蛋白，和向3’UTR-seq1中随机三个位点插入全长的hsa-microRNA-1-3p的反向互补序列后得到的新的3’UTR的序列的mRNA的序列如下所示(SEQ ID NO:11)：The sequence of the mRNA containing the green fluorescent protein degraded in 1 hour and the new 3'UTR sequence obtained by inserting the reverse complementary sequence of the full-length hsa-microRNA-1-3p into three random sites in 3'UTR-seq1 is as follows (SEQ ID NO: 11):

GGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGAAGCTTAGCCATGGCTTCCCGCCGGCGGTGGCGGCGCAGGATGATGGCACGCTGCCCATGTCTTGTGCCCAGGAGAGCGGGATGGACCGTCACCCTGCAGCCTGTGCTTCTGCTAGGATCAATGTGTAGATACATACTTCTTTACATTCCAAGACGCTGGCTGGACTGTTACCAGTAGGGATGTATGTAGCATACATACTTCTTTACATTCCATCGTGCAGACACTTCTTAGAAAATATGACATACATACTTCTTTACATTCCAAGTTATGTAATGTATATTCCCTTTTCATAGC.

实验结果表明，相比与未插入microRNA结合位点的对照组，向候选3’UTR中插入microRNA结合位点，相对光强差异不显著，即不影响3’UTR的翻译调控功能；24小时内相对光强的曲线走势差异不显著，即不影响3’UTR对mRNA稳定性的调控功能。综上，向候选3’UTR中插入microRNA结合位点不影响其功能(图2)。The experimental results show that compared with the control group without microRNA binding site, the relative light intensity difference of the microRNA binding site inserted into the candidate 3'UTR is not significant, that is, it does not affect the translation regulation function of the 3'UTR; the curve trend of the relative light intensity within 24 hours is not significantly different, that is, it does not affect the regulation function of the 3'UTR on the stability of mRNA. In summary, the insertion of microRNA binding sites into the candidate 3'UTR does not affect its function (Figure 2).

Claims

1. A 3'UTR sequence, wherein the 3'UTR sequence comprises one of the following groups (1)-(4):

(1) the nucleotide sequence shown in any one of SEQ ID NOs: 1-3;

(2) a complementary sequence of the nucleotide sequence shown in any one of SEQ ID NOs: 1-3;

(3) a nucleotide sequence having at least 80% homology to the nucleotide sequence shown in any one of SEQ ID NOs: 1-3; or

(4) A nucleotide sequence having at least 80% homology with the complementary sequence of the nucleotide sequence shown in any one of SEQ ID NOs: 1-3.

2. A 3'UTR sequence according to claim 1, wherein the 3'UTR sequence further comprises a 3'UTR sequence or its complementary sequence inserted with an additional sequence; preferably, the additional sequence is a miRNA binding site; more preferably, the additional sequence is selected from a full-length microRNA reverse complementary sequence or a reverse complementary sequence of its seed sequence; preferably, the full-length microRNA reverse complementary sequence is 19-25 nt in length; preferably, the reverse complementary sequence of the seed sequence is 7-8 nt in length.

3. A DNA molecule comprising the 3'UTR sequence of claim 1 or 2.

4. A DNA molecule according to claim 3, further comprising an open reading frame encoding a polypeptide or protein of interest.

5. The DNA molecule according to claim 3, wherein the DNA molecule is a circular DNA.

6. An mRNA molecule, the mRNA comprising the 3'UTR sequence of claim 1 or 2, preferably the mRNA comprises in the 5'-3' direction:

(1) an open reading frame encoding a polypeptide or protein of interest; and

(2) The 3’UTR sequence of claim 1.

7. An expression vector comprising the DNA molecule according to any one of claims 3 to 5.

8. A host cell, comprising the 3'UTR sequence of claim 1 or 2, the DNA molecule of any one of claims 3-5, the mRNA molecule of claim 6 and/or the expression vector of claim 7.

9. A pharmaceutical composition comprising one of the following components (1) to (5), and a pharmaceutically acceptable carrier;

(1) The 3'UTR sequence of claim 1 or 2;

(2) The DNA molecule according to any one of claims 3 to 5;

(3) the mRNA according to claim 6;

(4) the expression vector according to claim 7; or

(5) The host cell according to claim 8.

10. A method for promoting the expression of a polypeptide or protein of interest, the method comprising introducing into a host cell the 3'UTR sequence of claim 1 or 2, the DNA molecule of any one of claims 3-5, the mRNA molecule of claim 6 and/or the vector of claim 7.

11. Use of the 3'UTR sequence of claim 1 or 2, the DNA molecule of any one of claims 3-5, the mRNA molecule of claim 6, the vector of claim 7, the cell of claim 8, the pharmaceutical composition of claim 9 and/or the method of claim 10 in promoting the expression of a polypeptide or protein of interest.