Disclosure of Invention
The main purpose of the invention is to provide a method for preparing CEMDISIRAN, so as to solve the problem of the low purity of CEMDISIRAN in the prior art.
In order to achieve this aim, according to a first aspect of the invention, there is provided a method for preparing CEMDISIRAN. CEMDISIRAN is an siRNA consisting of a sense strand and an antisense strand joined by complementary pairing. The method comprises mixing a sense strand substrate, an antisense strand substrate and an RNA ligase, wherein the sense strand substrates can form the sense strand and the antisense strand substrates can form the antisense strand. The sense strand substrate and the antisense strand substrate are held together by hydrogen bonds formed by base complementation, while the head base and the tail base of adjacent substrates of the same strand are not connected to each other, so that a double-stranded nucleotide structure containing a nick is formed. The bases at the two ends of the nick are the 5' end and the 3' end of different substrates, the 5' end being a phosphate group and the 3' end being a hydroxyl group; the RNA ligase joins the 3' hydroxyl group on the upstream side of the nick to the 5' phosphate group on the downstream side through a phosphodiester bond, and CEMDISIRAN is obtained. The RNA ligase is selected from RNA ligase family 1 or RNA ligase family 2, and comprises one or more of the RNA ligases shown in SEQ ID NO. 1, SEQ ID NO. 2, SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5 or SEQ ID NO. 6, or an enzyme having more than 70% identity to any one of the RNA ligases shown in SEQ ID NO. 1-SEQ ID NO. 6 and having activity in catalyzing phosphodiester bond formation.
Further, the sense strand substrate and the antisense strand substrate are obtained by a solid phase synthesis method or a liquid phase synthesis method, preferably the sense strand is a nucleic acid sequence shown as SEQ ID NO. 14, the antisense strand is a nucleic acid sequence shown as SEQ ID NO. 15, preferably the sense strand consists of 2 or more sense strand substrates, the length of the sense strand substrate is 3-18nt, more preferably 4-16nt, preferably the antisense strand consists of 2 or more antisense strand substrates, and the length of the antisense strand substrate is 3-22nt, more preferably 8-13nt.
The double-stranded RNA formed by annealing the sense strand substrate and the antisense strand substrate has more than 3 base combinations capable of complementary pairing, preferably the tail end of the double-stranded RNA is a sticky end, and preferably the sticky end is 2-8nt long.
Further, the sense strand substrate and the antisense strand substrate comprise 2 substrates, the sense strand substrate comprises a first sense strand substrate and a second sense strand substrate, the antisense strand substrate comprises a first antisense strand substrate and a second antisense strand substrate, the preparation method comprises the steps of mixing the first sense strand substrate, the second sense strand substrate, the first antisense strand substrate and the second antisense strand substrate, catalyzing the connection of the first sense strand substrate and the second sense strand substrate to form a sense strand by using RNA ligase, catalyzing the connection of the first antisense strand substrate and the second antisense strand substrate to form an antisense strand, and complementarily pairing the sense strand and the antisense strand to form CEMDISIRAN.
Further, the 3 'end of the first sense strand substrate is connected with the 5' end of the second sense strand substrate under the catalysis of RNA ligase to form a sense strand, the 3 'end of the first antisense strand substrate is connected with the 5' end of the second antisense strand substrate under the catalysis of RNA ligase to form an antisense strand, preferably, the 5 'end of the first sense strand substrate is a hydroxyl group, the 3' end is a hydroxyl group, the 5 'end of the second sense strand substrate is a phosphoric acid group, the 3' end is an L96 group, preferably, the 5 'end of the first antisense strand substrate is a hydroxyl group, the 3' end is a hydroxyl group, and the 5 'end of the second antisense strand substrate is a phosphoric acid group, and the 3' end is a hydroxyl group.
Further, the first sense strand substrate is a nucleic acid sequence shown as SEQ ID NO. 10 or 16, and the second sense strand substrate is a nucleic acid sequence shown as SEQ ID NO. 11 or 17.
Further, the first antisense strand substrate is a nucleic acid sequence shown as SEQ ID NO. 13 or 19, and the second antisense strand substrate is a nucleic acid sequence shown as SEQ ID NO. 12 or 18.
Further, the sense strand substrate and the antisense strand substrate each comprise 3 substrates: the sense strand substrates comprise a first sense strand substrate, a second sense strand substrate and a third sense strand substrate, and the antisense strand substrates comprise a first antisense strand substrate, a second antisense strand substrate and a third antisense strand substrate. Preferably, the first sense strand substrate is the nucleic acid sequence shown as SEQ ID NO. 20, the second sense strand substrate is the nucleic acid sequence shown as SEQ ID NO. 21, and the third sense strand substrate is the nucleic acid sequence shown as SEQ ID NO. 22; preferably, the first antisense strand substrate is the nucleic acid sequence shown as SEQ ID NO. 25, the second antisense strand substrate is the nucleic acid sequence shown as SEQ ID NO. 24, and the third antisense strand substrate is the nucleic acid sequence shown as SEQ ID NO. 23.
Further, the concentration of the sense strand substrate and the antisense strand substrate is 0.1-4.5 mM; preferably, the concentration of the RNA ligase is 0.05-0.6 mg/mL, more preferably 0.2 mg/mL; preferably, the reaction system formed by mixing the sense strand substrate, the antisense strand substrate and the RNA ligase further comprises ATP, Tris-HCl, MgCl2 and DTT; preferably, the temperature of the enzyme-catalyzed reaction is 0-60 ℃, more preferably 4-37 ℃; preferably, the time of the enzyme-catalyzed reaction is 0.5-24 h, more preferably 16-24 h; preferably, the pH of the enzyme-catalyzed reaction is 6.0-8.5; preferably, the product is purified after the enzyme-catalyzed reaction and freeze-dried to obtain CEMDISIRAN.
By applying the technical scheme of the application, the sense strand of CEMDISIRAN is formed by RNA ligase-catalyzed ligation of the sense strand substrates, and the antisense strand of CEMDISIRAN is formed by ligation of the antisense strand substrates, so that an siRNA with a complex structure and multiple modifications is prepared by biosynthesis. Compared with CEMDISIRAN prepared by chemical synthesis, the product obtained by the preparation method of the application has higher purity and fewer impurities, the reaction conditions are mild, and industrial scale-up is easy to realize.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The present application will be described in detail with reference to examples.
Explanation of terms:
N+1 impurities: nucleic acid impurities having one additional nucleotide attached compared with the target synthetic sequence.
N-1 impurities: nucleic acid impurities having a single nucleotide deletion compared with the target synthetic sequence.
As mentioned in the background art, CEMDISIRAN is prepared in the prior art by chemical synthesis, which uses a large amount of organic solvents, does not conform to the development concept of green chemistry, and is very costly to scale up. In addition, the prepared product has low purity and a high impurity content, in particular N+1 and N-1 impurities that are difficult to remove, which affects subsequent purification of the product. In the present application, the inventors sought to develop a method for preparing CEMDISIRAN, an siRNA for treating immunoglobulin A nephropathy, by enzyme-catalyzed synthesis, and thus propose the series of protection schemes of the present application.
In a first exemplary embodiment of the present application, there is provided a method of preparing CEMDISIRAN, CEMDISIRAN being an siRNA consisting of a sense strand and an antisense strand joined by complementary pairing. The method comprises mixing a sense strand substrate, an antisense strand substrate and an RNA ligase, wherein the sense strand substrates can form the sense strand and the antisense strand substrates can form the antisense strand. The sense strand substrate and the antisense strand substrate are held together by hydrogen bonds formed by base complementation, while the head base and the tail base of adjacent substrates of the same strand are not connected to each other, so that a double-stranded nucleotide structure containing a nick is formed. The bases at the two ends of the nick are the 5' end and the 3' end of different substrates, the 5' end being a phosphate group and the 3' end being a hydroxyl group; the RNA ligase joins the 3' hydroxyl group on the upstream side of the nick to the 5' phosphate group on the downstream side through a phosphodiester bond, and CEMDISIRAN is obtained. The RNA ligase is selected from RNA ligase family 1 or RNA ligase family 2, and comprises one or more of the RNA ligases shown in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 or SEQ ID NO:6, or an enzyme having more than 70% identity to any one of the RNA ligases shown in SEQ ID NO:1 to SEQ ID NO:6 and having activity in catalyzing phosphodiester bond formation. The nicked double-stranded nucleotide structure formed by the sense strand substrate and the antisense strand substrate has a sticky end.
The sense strand substrate and the antisense strand substrate can be chemically synthesized by a solid phase method or a liquid phase method.
In the above preparation method, the sense strand substrate is 2 or more nucleotide sequences capable of constituting the sense strand, i.e., a plurality of nucleotide sequences of the sense strand substrate can be spliced to form the same sequence as the sense strand sequence, and is different from the sense strand in that there is a break between the sense strand substrates and is not linked by a phosphodiester bond. Similarly, 2 or more antisense strand substrates were linked by phosphodiester bonds using RNA ligase to obtain CEMDISIRAN antisense strands.
In the preparation method, the sense strand substrate, the antisense strand substrate and the RNA ligase are mixed together, and CEMDISIRAN is directly prepared by a one-pot method. In one-pot ligation, base complementary pairing can be performed between the sense strand substrate and the antisense strand substrate to form a double-stranded structure, and the RNA ligase recognizes the double-stranded structure and then joins the nicks existing in the double-stranded structure, thereby preparing the target product CEMDISIRAN.
Any method capable of performing RNA synthesis is suitable for use in the present application, and in a preferred embodiment, the sense strand substrate and the antisense strand substrate are obtained by a solid phase synthesis method or a liquid phase synthesis method. In a preferred embodiment, the sense strand is the nucleic acid sequence shown in SEQ ID NO. 14 and the antisense strand is the nucleic acid sequence shown in SEQ ID NO. 15.
SEQ ID NO:14:
AmsAmsGfCmAfAmGfAmUfAfUfUmUfUmUmAfUmAfAmUmAm.
SEQ ID NO:15:
UmsAfsUfUmAfUmAmAfAmAfAmUmAmUfCmUfUmGfCmUmUmsUmsUmdTdT.
In the present application, m after A, C, G or U indicates that the ribonucleotide carries a 2'-methoxy modification; f indicates a 2'-fluoro modification of the ribonucleotide; s, as written in "sAm", "sGf" and the like, indicates a thio modification of the 5' phosphate of the ribonucleotide; and d before A, C, G or T indicates that the nucleotide is a deoxyribonucleotide.
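By way of illustration only, the notation described above can be read mechanically. The following Python sketch assumes that each nucleotide is written as an optional leading "s", an optional "d", one base letter, and an optional trailing "m" or "f"; the names TOKEN and parse are hypothetical helpers introduced for this example and are not part of the application.

```python
import re

# Assumed grammar of one nucleotide token, following the notation above:
#   s     (optional) thio modification of the 5' phosphate
#   d     (optional) deoxyribonucleotide
#   A/C/G/U/T        base
#   m / f (optional) 2'-methoxy / 2'-fluoro modification
TOKEN = re.compile(r"(s?)(d?)([ACGUT])([mf]?)")

def parse(seq):
    """Return a list of (thio, deoxy, base, mod) tuples, 5' to 3'."""
    tokens, pos = [], 0
    for m in TOKEN.finditer(seq):
        if m.start() != pos:  # unrecognised characters between tokens
            raise ValueError(f"cannot parse {seq!r} at index {pos}")
        thio, deoxy, base, mod = m.groups()
        tokens.append((thio == "s", deoxy == "d", base, mod))
        pos = m.end()
    if pos != len(seq):
        raise ValueError(f"cannot parse {seq!r} at index {pos}")
    return tokens

SENSE = "AmsAmsGfCmAfAmGfAmUfAfUfUmUfUmUmAfUmAfAmUmAm"               # SEQ ID NO: 14
ANTISENSE = "UmsAfsUfUmAfUmAmAfAmAfAmUmAmUfCmUfUmGfCmUmUmsUmsUmdTdT"  # SEQ ID NO: 15

sense_tokens = parse(SENSE)
antisense_tokens = parse(ANTISENSE)
# 1-based positions whose 5' phosphate carries a thio modification
sense_thio = [i for i, t in enumerate(sense_tokens, 1) if t[0]]
antisense_thio = [i for i, t in enumerate(antisense_tokens, 1) if t[0]]
print(len(sense_tokens), len(antisense_tokens), sense_thio, antisense_thio)
```

Under this reading, SEQ ID NO: 14 parses to 21 nucleotides and SEQ ID NO: 15 to 25 nucleotides.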
The sense strand of CEMDISIRAN is 21 nt long and the antisense strand is 25 nt long. The sense strand and the antisense strand are each divided, according to ligation efficiency, into sense strand substrates and antisense strand substrates of different lengths, and these segments are joined by the enzyme-catalyzed reaction, so that a product of higher purity can be obtained. In a preferred embodiment, the sense strand consists of 2 or more sense strand substrates, the sense strand substrates having a length of 3 to 18 nt, more preferably 4 to 16 nt, and the antisense strand consists of 2 or more antisense strand substrates, the antisense strand substrates having a length of 3 to 22 nt, more preferably 8 to 13 nt.
When CEMDISIRAN synthesis is carried out by a one-pot method, as shown in fig. 1, base complementary pairing can be carried out between a sense strand substrate and an antisense strand substrate to form a double-stranded structure, and RNA ligase recognizes the double-stranded structure with the nicks formed by the complementary pairing of the substrates, so that the sense strand substrate and the antisense strand substrate are connected to obtain CEMDISIRAN. In a preferred embodiment, the double-stranded RNA formed by annealing the sense strand substrate and the antisense strand substrate has 3 or more base combinations capable of complementary pairing. In a preferred embodiment, the double stranded RNA ends are cohesive ends having a length of 2-8nt.
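The base complementarity relied on here can be checked directly on the full-length strands of SEQ ID NO: 14 and SEQ ID NO: 15. The Python sketch below is illustrative only; it assumes a pairing register in which the 5' end of the antisense strand faces the 3' end of the sense strand, and the helper names bases and PAIR are hypothetical.

```python
import re

SENSE = "AmsAmsGfCmAfAmGfAmUfAfUfUmUfUmUmAfUmAfAmUmAm"               # SEQ ID NO: 14
ANTISENSE = "UmsAfsUfUmAfUmAmAfAmAfAmUmAmUfCmUfUmGfCmUmUmsUmsUmdTdT"  # SEQ ID NO: 15

def bases(seq):
    """Strip the modification marks and keep only the base letters (5' to 3')."""
    return re.findall(r"[ACGUT]", seq)

# Watson-Crick partners for the ribonucleotides of the sense strand
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

sense = bases(SENSE)
antisense = bases(ANTISENSE)

# Assumed register: sense position i (5' to 3') faces antisense position
# len(sense) - 1 - i, i.e. the antisense 5' end pairs with the sense 3' end.
fully_paired = all(
    PAIR[s] == antisense[len(sense) - 1 - i] for i, s in enumerate(sense)
)
overhang = antisense[len(sense):]  # unpaired 3' tail of the antisense strand
print(fully_paired, "".join(overhang))
```

Under the assumed register, all 21 sense positions pair with the antisense strand, and the remaining antisense bases form a 4 nt 3' overhang.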
Any RNA ligase that can recognize the nicked double-stranded structure formed by complementary pairing of the substrates and catalyze the formation of a phosphodiester bond between a phosphate group and a hydroxyl group is suitable for use in the present application. In a preferred embodiment, the RNA ligase is selected from RNA ligase family 1 or RNA ligase family 2.
In a preferred embodiment, the RNA ligase comprises one or more of the RNA ligases shown in SEQ ID NO. 1, SEQ ID NO. 2, SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, or SEQ ID NO. 6, or an enzyme having more than 70% identity to any one of the RNA ligases shown in SEQ ID NO. 1-SEQ ID NO. 6, including but not limited to, more than 75%, 80%, 85%, 90%, 95%, 99% (such as 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, 99.6%, 99.7%, 99.8% or more, or even more than 99.9%) and having activity in catalyzing phosphodiester bond formation. Wherein the RNA ligase of the RNA ligase family 1 comprises SEQ ID NO. 4 or SEQ ID NO. 6, and the RNA ligase of the RNA ligase family 2 comprises SEQ ID NO. 1, SEQ ID NO. 2, SEQ ID NO. 3 or SEQ ID NO. 5.
SEQ ID NO. 1 (ligase 48,Escherichia phage JN02)
MFKKYSSLENHYNSKFIEKLYTNGLTTGVWVAREKIHGTNFSLIIERDNVTCAKRTGPILPAEDFYGYEIVLKKYDKAIKAVQEVMESISTSVPVSYQVFGEFAGGGIQKGVDYGEKDFYVFDIIINTESDDTYYMSDYEMQDFCNTFGFKMAPMLGRGTFDSLIMIPNDLDSVLAAYNSTASEDLVEANNCVFDANVIGDNTAEGYVLKPCFPKWLSNGTRVAIKCKNSKFSEKKKSDKPVKTQVPLTEIDKNLLDVLACYVTLNRVNNVISKIGTVTPKDFGKVMGLTVQDILEETSREGIVLTSSDNPNLVKKELVRMVQDVLRPAWIELVS.
SEQ ID NO. 2 (ligase 25,Vibrio phage NT-1)
MSFVKYTSLENSYRQAFVDKCDMLGVRDWVALEKIHGANFSFIVEFDGGYTVTPAKRTSIIGATATGDYDFYGCTSVVEAHKEKVELVANFLWLNEYINLYEPIIIYGELAGKGIQKEVNYGDKDFWAFDIFLPQREEFVDWDTCVAAFTNAEIKYTKELARGTLDELLRIDPLFKSLHTPAEHEGDNVAEGFVVKQLHSEKRLQSGSRAILKVKNEKFKEKKKKEGKTPTKLVLTPEQEKLHAEFSCYLTENRLKNVLSKLGTVNQKQFGMISGLFVKDAKDEFERDELNEVAIDRDDWNAIRRSLTNIANEILRKNWLNILDGNF.
SEQ ID NO. 3 (ligase 26,Escherichia phage AR1)
MQELFNNLMELCKDSQRKFFYSDDVSASGRTYRIFSYNYASYSDWLLPDALECRGIMFEMDGEKPVRIASRPMEKFFNLNENPFTMNIDLNDVDYILTKEDGSLVSTYLDGDEILFKSKGSIKSEQALMANGILMNINHHQLRDRLKELAEDGFTANFEFVAPTNRIVLAYQEMKIILLNIRENETGEYISYDDIYKDAALRPYLVERYEIDSPKWVEEAKNAENIEGYVAVMKDGSHFKIKSDWYVSLHSTKSSLDNPEKLFKTIIDGASDDLKAMYADDEYSYRKIEAFETTYLKYLDRALFLVLDCHNKHCGKDRKTYAMEAQGVAKGAGMDHLFGIIMSLYQGYDSQEKVMCEIEQNFLKNYKKFIPEGY.
SEQ ID NO. 4 (ligase 31,Vibrio phage VH12019)
MTTQELYNHLMTLTDDAEGKFFFADHISPLGEKLRVFSYHIASYSDWLLPGALEARGIMFQLDEQDKMVRIVSRPMEKFFNLNENPFTMDLDLTTTVQLMDKADGSLISTYLTGENFALKSKTSIFSEQAVAANRYIKLPENRDLWEFCDDLTQAGCTVNMEWCAPNNRIVLEYPEAKLVILNIRDNETGDYVSFDDIPLPALMRVKKWLVDEYDPETAHADDFVEKLRATKGIEGMILRLANGQSVKIKTQWYVDLHSQKDSVNVPKKLVTTILNNNHDDLYALFADDKPTIDRIREFDSHVSKTVSASFHAVSQFYVKNRHMSRKDYAIAGQKTLKPWEFGVAMIAYQNQTVEGVYEALVGAYLKRPELLIPEKYLNEA.
SEQ ID NO. 5 (ligase 41,Vibrio phage VH7D)
MNVQELYKNLMSLADDAEGKFFFADHLSPLGEKFRVFSYHIASYSDWLLPGALEARGIMFQLDDNDEMIRIVSRPMEKFFNLNENPFTMELDLTTTVQLMDKADGSLISTYLSGENFALKSKTSIFSEQAVAANRYIKKPENRDLWEFCDDCTQAGLTVNMEWCAPNNRIVLEYPEAKLVILNIRDNETGDYVSFDDIPQSALMRVKQWLVDEYDPATAHEPDFVEKLRDTKGIEGMILRLANGQSVKIKTQWYVDLHSQKDSVNVPKKLVTTILNGNHDDLYALFADDKPTIERIREFDSHVTKTLTNSFNAVRQFYARNRHLARKDYAIAGQKVLKPWEFGVAMIAYQKQTVEGVYESLVTAYLKRPELAIPEKYLNGV.
SEQ ID NO. 6 (ligase 42,Escherichia phage JN02)
MEKLYYNLLSLCKSSSDRKFFYSDDVSPIGKKYRIFSYNFASYSDWLLPDALECRGIMFEMDGETPVRIASRPMEKFFNLNENPFTLSINLDDVKYLMTKEDGSLVSTYLDGGTVRFKSKGSIKSDQAVSATSILLDIDHKNLADRLLELCNDGFTANFEYVAPTNKIVLTYPEKRLILLNIRDNNTGEYIEYDDIYLDPVFRKYLVDRFEVPEGDWTSDVKSSTNIEGYVAVMKDGSHFKLKTDWYVALHTTRDSISSPEKLFLAIVNGASDDLKAMYADDEFSFKKVELFEKAYLDFLDRSFYICLDTYDKHKGKDRKTYAIEAQAVCKGAQTPWLFGIIMNLYQGGSKEQMMTALESVFIKNHKNFIPEGY.
SEQ ID NO. 7 (ligase 11, thermococcus)
MVSSYFRNLLLKLGLPEERLEVLEGKGALAEDEFEGIRYVRFRDSARNFRRGTVVFETGEAVLGFPHIKRVVQLENGIRRVFKNKPFYVEEKVDGYNVRVVKVKDKILAITRGGFVCPFTTERIEDFVNFDFFKDYPNLVLVGEMAGPESPYLVEGPPYVKEDIEFFLFDIQEKGTGRSLPAEERYRLAEEYGIPQVERFGLYDSSKVGELKELIEWLSEEKREGIVMKSPDMRRIAKYVTPYANINDIKIGSHIFFDLPHGYFMGRIKRLAFYLAENHVRGEEFENYAKALGTALLRPFVESIHEVANGGEVDETFTVRVKNITTAHKMVTHFERLGVKIHIEDIEDLGNGYWRITFKRVYPDATREIRELWNGLAFVD.
SEQ ID NO. 8 (ligase 20, archaea)
MVVPLKRIDKIRWEIPKFDKRMRVPGRVYADEVLLEKMKNDRTLEQATNVAMLPGIYKYSIVMPDGHQGYGFPIGGVAAFDVKEGVISPGGIGYDINCGVRLIRTNLTEKEVRPRIKQLVDTLFKNVPSGVGSQGRIKLHWTQIDDVLVDGAKWAVDNGYGWERDLERLEEGGRMEGADPEAVSQRAKQRGAPQLGSLGSGNHFLEVQVVDKIFDPEVAKAYGLFEGQVVVMVHTGSRGLGHQVASDYLRIMERAIRKYRIPWPDRELVSVPFQSEEGQRYFSAMKAAANFAWANRQMITHWVRESFQEVFKQDPEGDLGMDIVYDVAHNIGKVEEHEVDGKRVKVIVHRKGATRAFPPGHEAVPRLYRDVGQPVLIPGSMGTASYILAGTEGAMKETFGSTCHGAGRVLSRKAATRQYRGDRIRQELLNRGIYVRAASMRVVAEEAPGAYKNVDNVVKVVSEAGIAKLVARMRPIGVAKGAAALEH.
SEQ ID NO. 9 (ligase 32, bacterium)
MVSLHFKHILLKLGLDKERIEILEMKGGIVEDEFEGLRYLRFKDSAKGLRRGTVVFNESDIILGFPHIKRVVHLRNGVKRIFKSKPFYVEEKVDGYNVRVAKVGEKILALTRGGFVCPFTTERIGDFINEQFFKDHPNLILCGEMAGPESPYLVEGPPYVEEDIQFFLFDIQEKRTGRSIPVEERIKLAEEYGIQSVEIFGLYSYEKIDELYELIERLSKEGREGVVMKSPDMKKIVKYVTPYANVNDIKIGSRIFFDLPHGYFMQRIKRLAFYIAEKRIRREDFDEYAKALGKALLQPFVESIWDVAAGEMIAEIFTVRVKKIETAYKMVSHFERMGLNIHIDDIEELGNGYWKITFKRVYDDATKEIRELWNGHAFVD.
Identity in the present application refers to the identity between amino acid sequences or between nucleic acid sequences, i.e. the proportion of positions at which the amino acid residues or nucleotides are the same in the sequences being compared. The identity of amino acid sequences or nucleic acid sequences can be determined using alignment programs such as BLAST (Basic Local Alignment Search Tool) and FASTA.
Proteins that have 70%, 75%, 80%, 85%, 90%, 95%, 99% or more identity (e.g., 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 98.5%, 99%, 99.5%, 99.6%, 99.7%, 99.8% or more, or even 99.9% or more) and the same function are highly likely to have the same active site, active pocket, mechanism of action, protein structure, etc. as the proteins of the sequences provided above.
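As a simplified illustration of the identity measure discussed above, the following Python sketch counts identical columns in two sequences that have already been aligned. It is not a substitute for BLAST or FASTA: it performs no alignment itself, and the choice of the alignment length as the denominator is an assumption (alignment programs may report identity over other lengths). The function name percent_identity and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned to equal length; '-' marks a gap and
    never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)

# Hypothetical 10-residue toy fragments, not taken from SEQ ID NO: 1-9
print(percent_identity("ACDEFGHIKL", "ACDEYGHIKL"))  # 9 of 10 columns match
```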
As used herein, amino acid residues are abbreviated as alanine (Ala; A), asparagine (Asn; N), aspartic acid (Asp; D), arginine (Arg; R), cysteine (Cys; C), glutamic acid (Glu; E), glutamine (Gln; Q), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y) and valine (Val; V).
Amino acids with similar properties can generally be substituted for one another with similar effect. For example, conservative amino acid substitutions may occur in the above homologous proteins. "Conservative amino acid substitutions" include, but are not limited to:
a hydrophobic amino acid (Ala, Cys, Gly, Pro, Met, Val, Ile, Leu) is replaced by another hydrophobic amino acid;
a hydrophobic amino acid with a bulky side chain (Phe, Tyr, Trp) is replaced by another hydrophobic amino acid with a bulky side chain;
an amino acid with a positively charged side chain (Arg, His, Lys) is replaced by another amino acid with a positively charged side chain;
an amino acid with a polar uncharged side chain (Ser, Thr, Asn, Gln) is replaced by another amino acid with a polar uncharged side chain.
Those skilled in the art may also make conservative amino acid substitutions according to well-known amino acid substitution rules, such as the BLOSUM62 scoring matrix.
In the present application, phosphodiester bonds are formed between substrates only under the catalysis of the RNA ligases shown in SEQ ID NO. 1-SEQ ID NO. 6, or of an enzyme having more than 70% identity to any one of the RNA ligases shown in SEQ ID NO. 1-SEQ ID NO. 6, to give the product CEMDISIRAN. In the experiments related to the present application, the inventors screened a large number of enzymes (50) to obtain the RNA ligases shown in SEQ ID NO:1 to SEQ ID NO:6, which are capable of synthesizing CEMDISIRAN. The very large proportion of negative results in these experiments (up to 70%) shows that most RNA ligases can hardly catalyze the synthesis of CEMDISIRAN, including but not limited to the RNA ligases shown in SEQ ID NO. 7-SEQ ID NO. 9; in this specification, SEQ ID NO. 7-SEQ ID NO. 9 are given only as examples of RNA ligases without activity in catalyzing CEMDISIRAN synthesis.
In a preferred embodiment, the sense strand substrate and the antisense strand substrate each comprise 2 substrates, the sense strand substrate comprises a first sense strand substrate and a second sense strand substrate, the antisense strand substrate comprises a first antisense strand substrate and a second antisense strand substrate, the preparation method comprises mixing the first sense strand substrate, the second sense strand substrate, the first antisense strand substrate and the second antisense strand substrate, catalyzing the ligation of the first sense strand substrate and the second sense strand substrate to form a sense strand using an RNA ligase, catalyzing the ligation of the first antisense strand substrate and the second antisense strand substrate to form an antisense strand, and complementarily pairing the sense strand and the antisense strand to form CEMDISIRAN.
In a preferred embodiment, the 3 'end of the first sense strand substrate is linked to the 5' end of the second sense strand substrate under the catalysis of an RNA ligase to form a sense strand, the 3 'end of the first antisense strand substrate is linked to the 5' end of the second antisense strand substrate under the catalysis of an RNA ligase to form an antisense strand, preferably the 5 'end of the first sense strand substrate is a hydroxyl group, the 3' end is a hydroxyl group, the 5 'end of the second sense strand substrate is a phosphate group, the 3' end is an L96 group, preferably the 5 'end of the first antisense strand substrate is a hydroxyl group, the 3' end is a hydroxyl group, and the 5 'end of the second antisense strand substrate is a phosphate group, and the 3' end is a hydroxyl group.
In a preferred embodiment, the first sense strand substrate is the nucleic acid sequence shown in SEQ ID NO. 10 or 16 and the second sense strand substrate is the nucleic acid sequence shown in SEQ ID NO. 11 or 17.
In a preferred embodiment, the first antisense strand substrate is the nucleic acid sequence shown in SEQ ID NO. 13 or 19 and the second antisense strand substrate is the nucleic acid sequence shown in SEQ ID NO. 12 or 18.
CEMDISIRAN can be prepared using the preparation method described above and the substrates shown in SEQ ID NOs: 10-13 or SEQ ID NOs: 16-19. However, the selection of substrates is not limited to those shown in SEQ ID NOs: 10-13 or SEQ ID NOs: 16-19; any substrates that can be combined to form the sense strand and the antisense strand can be used in the preparation method. The method is not restricted to particular ligation positions of the substrates and gives a good ligation effect for the sense strand sequence and the antisense strand sequence of CEMDISIRAN. The number of sense strand substrates or antisense strand substrates includes, but is not limited to, 2, 3, 4, or even more.
SEQ ID NO:10:
AmsAmsGfCmAfAmGfAmUfAfUfUmUfUmUmAf.
SEQ ID NO:11:
UmAfAmUmAm.
SEQ ID NO:12:
UfCmUfUmGfCmUmUmsUmsUmdTdT.
SEQ ID NO:13:
UmsAfsUfUmAfUmAmAfAmAfAmUmAm.
SEQ ID NO:16:
AmsAmsGfCmAfAmGfAmUfAfUfUmUf.
SEQ ID NO:17:
UmUmAfUmAfAmUmAm.
SEQ ID NO:18:
UmAmUfCmUfUmGfCmUmUmsUmsUmdTdT.
SEQ ID NO:19:
UmsAfsUfUmAfUmAmAfAmAfAm.
In a preferred embodiment, the sense strand substrate and the antisense strand substrate each comprise 3 substrates: the sense strand substrates comprise a first sense strand substrate, a second sense strand substrate and a third sense strand substrate, and the antisense strand substrates comprise a first antisense strand substrate, a second antisense strand substrate and a third antisense strand substrate. Preferably, the first sense strand substrate is the nucleic acid sequence shown in SEQ ID NO:20, the second sense strand substrate is the nucleic acid sequence shown in SEQ ID NO:21, and the third sense strand substrate is the nucleic acid sequence shown in SEQ ID NO:22; preferably, the first antisense strand substrate is the nucleic acid sequence shown in SEQ ID NO:25, the second antisense strand substrate is the nucleic acid sequence shown in SEQ ID NO:24, and the third antisense strand substrate is the nucleic acid sequence shown in SEQ ID NO:23.
SEQ ID NO:20:
AmsAmsGfCmAfAmGf.
SEQ ID NO:21:
AmUfAfUfUmUfUmUmAfUm.
SEQ ID NO:22:
AfAmUmAm.
SEQ ID NO:23:
GfCmUmUmsUmsUmdTdT.
SEQ ID NO:24:
AmAfAmUmAmUfCmUfUm.
SEQ ID NO:25:
UmsAfsUfUmAfUmAmAf.
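As a consistency check on the substrate sets listed in this description (SEQ ID NOs: 10-13, 16-19 and 20-25), the following Python sketch joins the substrates of each set in ligation order and compares the result with the full-length strands of SEQ ID NO: 14 and SEQ ID NO: 15. It is an illustrative sketch only; the dictionary layout and the helper length_nt are hypothetical, and the nucleotide count simply assumes one uppercase base letter per nucleotide.

```python
import re

SENSE = "AmsAmsGfCmAfAmGfAmUfAfUfUmUfUmUmAfUmAfAmUmAm"               # SEQ ID NO: 14
ANTISENSE = "UmsAfsUfUmAfUmAmAfAmAfAmUmAmUfCmUfUmGfCmUmUmsUmsUmdTdT"  # SEQ ID NO: 15

# (sense substrates, antisense substrates), each listed 5' to 3' in ligation order
SUBSTRATE_SETS = {
    "SEQ ID NO: 10-13": (
        ["AmsAmsGfCmAfAmGfAmUfAfUfUmUfUmUmAf", "UmAfAmUmAm"],            # 10, 11
        ["UmsAfsUfUmAfUmAmAfAmAfAmUmAm", "UfCmUfUmGfCmUmUmsUmsUmdTdT"],  # 13, 12
    ),
    "SEQ ID NO: 16-19": (
        ["AmsAmsGfCmAfAmGfAmUfAfUfUmUf", "UmUmAfUmAfAmUmAm"],            # 16, 17
        ["UmsAfsUfUmAfUmAmAfAmAfAm", "UmAmUfCmUfUmGfCmUmUmsUmsUmdTdT"],  # 19, 18
    ),
    "SEQ ID NO: 20-25": (
        ["AmsAmsGfCmAfAmGf", "AmUfAfUfUmUfUmUmAfUm", "AfAmUmAm"],           # 20, 21, 22
        ["UmsAfsUfUmAfUmAmAf", "AmAfAmUmAmUfCmUfUm", "GfCmUmUmsUmsUmdTdT"],  # 25, 24, 23
    ),
}

def length_nt(seq):
    """Number of nucleotides, counted as base letters in the notation."""
    return len(re.findall(r"[ACGUT]", seq))

report = {}
for name, (sense_parts, antisense_parts) in SUBSTRATE_SETS.items():
    report[name] = {
        "sense_ok": "".join(sense_parts) == SENSE,
        "antisense_ok": "".join(antisense_parts) == ANTISENSE,
        "sense_nt": [length_nt(p) for p in sense_parts],
        "antisense_nt": [length_nt(p) for p in antisense_parts],
    }
    print(name, report[name])
```

Each of the three sets reassembles both full-length strands, and every substrate length falls within the 3-18 nt (sense) and 3-22 nt (antisense) ranges stated for the preferred embodiments.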
In a preferred embodiment, the concentration of the sense strand substrate and the antisense strand substrate is 0.1-4.5 mM; preferably, the concentration of the RNA ligase is 0.05-0.6 mg/mL, more preferably 0.2 mg/mL; preferably, the reaction system formed by mixing the sense strand substrate, the antisense strand substrate and the RNA ligase further comprises ATP, Tris-HCl, MgCl2 and DTT; preferably, the temperature of the enzyme-catalyzed reaction is 0-60 ℃, more preferably 4-37 ℃; preferably, the time of the enzyme-catalyzed reaction is 0.5-24 h, more preferably 16-24 h; preferably, the pH of the enzyme-catalyzed reaction is 6.0-8.5; preferably, the product is purified after the enzyme-catalyzed reaction and freeze-dried to obtain CEMDISIRAN.
The concentration of the sense strand substrate fragment and the antisense strand substrate fragment may be, but is not limited to, 0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0 or 4.5 mM; the reaction temperature of the preparation method may be, but is not limited to, 10, 15, 16, 20, 25, 30, 35 or 40 ℃; and the reaction time of the preparation method may be, but is not limited to, 2, 5, 10, 15, 16, 20, 24, 25, 30, 35, 40, 45 or 48 hours.
The application is described in further detail below in connection with specific examples which are not to be construed as limiting the scope of the application as claimed.
Example 1
Based on CEMDISIRAN sequences, 4 single-stranded RNA fragments were designed as follows:
The above 4 single-stranded RNA fragments were prepared using a solid phase synthesis method. Here, m after A, C, G or U indicates a 2'-methoxy modification of the ribonucleotide; f indicates a 2'-fluoro modification of the ribonucleotide; s, as written in "sAm", "sGf" and the like, indicates a thio modification of the 5' phosphate of the ribonucleotide; d before A, C, G or T indicates that the nucleotide is a deoxyribonucleotide; L96 is an N-acetylgalactosamine (GalNAc) group, and the leftmost wavy line of the L96 group indicates that the group is connected to the last nucleotide of the sense strand.
CEMDISIRAN-1 has 2'-methoxy modifications at ribonucleotides 1, 2, 4, 6, 8, 12, 14 and 15, 2'-fluoro modifications at ribonucleotides 3, 5, 7, 9, 10, 11, 13 and 16, and thio modifications at the 5' phosphate of ribonucleotides 2 and 3.
CEMDISIRAN-2 has 2'-methoxy modifications at ribonucleotides 1, 3, 4 and 5 and a 2'-fluoro modification at ribonucleotide 2.
CEMDISIRAN-3 has 2'-methoxy modifications at ribonucleotides 2, 4, 6, 7, 8, 9 and 10, 2'-fluoro modifications at ribonucleotides 1, 3 and 5, and thio modifications at the 5' phosphate of ribonucleotides 9 and 10; positions 11 and 12 are thymine (T) deoxyribonucleotides.
CEMDISIRAN-4 has 2'-methoxy modifications at ribonucleotides 1, 4, 6, 7, 9, 11, 12 and 13, 2'-fluoro modifications at ribonucleotides 2, 3, 5, 8 and 10, and thio modifications at the 5' phosphate of ribonucleotides 2 and 3.
The sense strand of CEMDISIRAN obtained was 5'-AmsAmsGfCmAfAmGfAmUfAfUfUmUfUmUmAfUmAfAmUmAm (SEQ ID NO: 14) -L96-3' and the antisense strand was 5'-UmsAfsUfUmAfUmAmAfAmAfAmUmAmUfCmUfUmGfCmUmUmsUmsUmdTdT-3' (SEQ ID NO: 15).
The 4 single-stranded RNA fragments were mixed uniformly in an equimolar ratio to obtain a substrate mixture, which was annealed to obtain an annealed RNA fragment mixed solution. The annealed RNA fragment mixture was subjected to enzyme-catalyzed ligation under the following conditions: 100 µM substrate fragment, ATP 10 eq, MgCl2 100 eq, DTT 10 eq, RNA ligase at a final concentration of 0.2 mg/mL and 1911V 50 mM Tris-HCl pH 7.5 were added sequentially to a 10 µL reaction system and reacted at 16 ℃ for 16 h. After completion of the reaction, the protein was inactivated by heating at 80 ℃ for 5 min, the supernatant was collected by centrifugation, and the result of Urea-PAGE detection is shown in FIG. 2. In FIG. 2, lane M represents the RNA molecular standard (marker), lane 1 represents the reaction system of ligase 31, lane 2 represents the reaction system of ligase 48, and lane 3 represents the reaction system of ligase 11. The yields were estimated by grey scale analysis of the product bands in the Urea-PAGE results, and the final yield results are shown in the following table:
In Example 1, the yield was estimated by grey scale analysis of the target band in the Urea-PAGE result: "none" means that the target band was not detected, "++" indicates a yield of 25 to 50% (excluding the 50% endpoint), "+++" indicates a yield of 50 to 90%, and "++++" indicates a yield >90%.
The yield in the examples is calculated as: yield = product grey value / (product grey value + substrate grey value).
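The yield formula above can be written as a small function. This Python sketch is illustrative only: the function name ligation_yield is hypothetical, and the band intensities in the usage line are made-up values in arbitrary units, not measurements from the examples.

```python
def ligation_yield(product_grey, substrate_grey):
    """yield = product grey value / (product grey value + substrate grey value)."""
    total = product_grey + substrate_grey
    if total <= 0:
        raise ValueError("no signal in either band")
    return product_grey / total

# Hypothetical band intensities in arbitrary units, for illustration only
print(f"{ligation_yield(900.0, 100.0):.0%}")
```

Because the formula normalizes by the summed signal of the product and substrate bands, it is independent of the absolute amount loaded in the lane.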
The activity screening of 6 RNA ligases shows that the ligases 48, 25 and 31 have better ligation effect and can convert most substrates into CEMDISIRAN.
Example 2
Ligase 48, which showed better reactivity, was selected, and the annealed substrate fragments were subjected to enzyme-catalyzed ligation under the following conditions: 800 µM substrate fragment, ATP 4 eq, MgCl2 12.5 eq, DTT 1.25 eq, the above RNA ligase at a final concentration of 0.2 mg/mL and 239V 50 mM Tris-HCl pH 7.5 were added sequentially to a 50 µL reaction system and reacted at 16 ℃ for 16 h. After completion of the reaction, the protein was inactivated by heating at 80 ℃ for 5 min, and the supernatant was collected by centrifugation. The HPLC result for the ligase 48 reaction is shown in FIG. 3.
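The reagent amounts in Examples 1 and 2 are stated in equivalents (eq). The Python sketch below converts them to concentrations under the assumption, which is not stated explicitly in the text, that "eq" means molar equivalents relative to the substrate fragment concentration. The function name reagent_mM and the dictionary layout are hypothetical.

```python
def reagent_mM(substrate_uM, equivalents):
    """Concentration in mM of a reagent dosed as molar equivalents of substrate."""
    return substrate_uM * equivalents / 1000.0

# (substrate concentration in µM, {reagent: equivalents}) as stated in the examples
EXAMPLES = {
    "Example 1": (100.0, {"ATP": 10.0, "MgCl2": 100.0, "DTT": 10.0}),
    "Example 2": (800.0, {"ATP": 4.0, "MgCl2": 12.5, "DTT": 1.25}),
}

concentrations = {
    name: {reagent: reagent_mM(substrate, eq) for reagent, eq in reagents.items()}
    for name, (substrate, reagents) in EXAMPLES.items()
}
for name, values in concentrations.items():
    print(name, values)
```

Under this assumption, both examples correspond to 10 mM MgCl2 and 1 mM DTT, with ATP at 1 mM in Example 1 and 3.2 mM in Example 2.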
The yield was measured as the roughly estimated ratio of product peaks in the HPLC data of the reaction system samples, and the results are shown in the following table:
In Example 2, the yield was calculated from the target peak in the HPLC result; "++++" indicates a yield >90%.
LC-MS identified a molecular weight of 8677.95 for the sense strand product (theoretical value 8677.94 ± 8) and 8096.11 for the antisense strand product (theoretical value 8096.11 ± 8), indicating that ligation by ligase 48 produced CEMDISIRAN.
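The comparison of measured and theoretical molecular weights can be expressed as a simple tolerance check. The Python sketch below is illustrative; it reads the tolerance quoted with the theoretical values as ± 8, and the function name within_tolerance is hypothetical.

```python
TOLERANCE = 8.0  # window quoted with the theoretical values, read as ± 8

def within_tolerance(measured, theoretical, tolerance=TOLERANCE):
    """True if the measured mass lies inside theoretical ± tolerance."""
    return abs(measured - theoretical) <= tolerance

results = {
    "sense": within_tolerance(8677.95, 8677.94),
    "antisense": within_tolerance(8096.11, 8096.11),
}
print(results)
```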
Comparative example 1
The average yield of the full-length CEMDISIRAN product obtained by solid phase synthesis was 32.6%, and the total proportion of N+1 and N-1 impurities was 1.47%.
For CEMDISIRAN produced by the enzymatic ligation of the invention, the yield of the enzymatic synthesis step was 87.48% and the average solid phase synthesis yield of the substrates used was 39.2%; multiplying the two gives a whole-process yield of 34.3%, which is higher than the average yield of full-length CEMDISIRAN obtained by solid phase synthesis. The total proportion of N+1 and N-1 impurities in CEMDISIRAN synthesized by enzymatic ligation was 0.4%, which is lower than that of full-length CEMDISIRAN obtained by solid phase synthesis.
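The whole-process yield quoted above follows from multiplying the two step yields. The short Python sketch below reproduces that arithmetic using only the figures stated in this comparative example; the variable names are illustrative.

```python
enzymatic_step_yield = 0.8748      # enzymatic ligation step
substrate_synthesis_yield = 0.392  # average solid phase yield of the substrates
solid_phase_full_length = 0.326    # full-length solid phase synthesis (comparison)

overall = enzymatic_step_yield * substrate_synthesis_yield
print(f"overall yield: {overall:.1%}")
print("higher than full-length solid phase synthesis:",
      overall > solid_phase_full_length)
```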
From the above description, it can be seen that the above-described embodiments of the present application achieve the technical effect that, in the preparation method of the present application, the sense strand of CEMDISIRAN is formed by catalyzing the ligation between sense strand substrates through RNA ligase, and the antisense strand of CEMDISIRAN is formed by catalyzing the ligation between antisense strand substrates, thereby achieving the preparation of siRNA with various modifications having complex structures by using a biosynthetic manner. Compared with CEMDISIRAN prepared by chemical synthesis, the preparation method provided by the application has the advantages that the purity of the product is higher, the impurities are few (the content of N+1 and N-1 impurities is less than 0.5%), the subsequent final product purification pressure is small, the reaction condition is mild, a large amount of organic reagents are not used, the production cost can be reduced, and the industrialized amplified production is conveniently realized.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.