Main

Prime editors are advanced CRISPR tools that enable replacement of targeted DNA with programmed sequences3. A prime editor comprises a Cas9 nickase (Cas9n) fused to a reverse transcriptase and paired with an extended guide RNA (pegRNA) that encodes both the genomic target sequence and the intended edit1. Editing initiates with the prime editor binding its genomic target and forming a single-strand DNA break (nick; Fig. 1a). The nicked 3′ DNA end is released to anneal to the pegRNA template, priming the reverse transcriptase to write the template sequence into an extension of the 3′ DNA end. This resulting edited 3′ new strand can displace the competing 5′ strand to install the intended edit. This process can be adapted to a wide variety of edit types, including substitutions, insertions and deletions1,4,5,6,7,8,9. Substantial efforts have been made to increase prime-editing efficiency, including inhibition of cellular mismatch repair (MMR)2,10, stabilization of pegRNAs11,12,13 and engineering the reverse transcriptase domain2,14,15,16. These advances have led to highly effective and versatile prime-editing systems.

Fig. 1: Mutations that relax prime editor nick positioning suppress error generation.

a, Schematic of the prime-editing process, depicting two competing structures that can produce edits or indel errors. PAM, protospacer-adjacent motif; RT, reverse transcriptase. b, Schematic of mechanisms through which indel errors are produced by end joining of edited 3′ new strands. c, Model depicting how Cas9 variants with different break stabilities are proposed to either protect non-target strand 5′ ends (left) or promote their degradation (right). d, Positions of most frequently shifted non-target strand nicks for Cas9 variants screened. e, Screen of different Cas9 variants to quantitate the frequency of shifted nicks (top) compared with the relative frequency (middle) and extent (bottom) of cut DNA end degradation. f, Screen of engineered PE variants to detect the relative frequency of nicked-end flap degradation (top) determined using a sensor assay (bottom). g, Screen of engineered PE variants to suppress indel errors with quantification of edit and indel frequencies (bottom), indel classes (middle) and edit:indel ratios (top). h–j, Correlations comparing PE nicked-end flap degradation with Cas9 nick shift frequencies (h), PE edit:indel ratios with Cas9 nick shift frequencies (i) and PE edit:indel ratios with PE nicked-end flap degradation (j) for variants in the screen with the same mutations. *P < 0.05 by two-tailed unpaired Student’s t-test for comparisons with Cas9 (e) or to PE (f,g). Correlations were determined by Pearson coefficients (h–j). All data were analysed by deep sequencing and represent means of n = 6 (e) or n = 3 (f,g) independent replicates with standard errors.

A key remaining challenge is the elimination of errors that occur as byproducts of prime editing. These errors are insertion and deletion (indel) mutations generated in lieu of the intended edit within a fraction of targeted cells, resulting in DNA sequences that are unpredictable and possibly deleterious2,17. Previous work has identified major drivers of indel error formation, although the mechanisms have not been fully elucidated. First, the prime editor may extend the edited 3′ new strand past the pegRNA template and into the scaffold1,2. This can be addressed by recoding the pegRNA scaffold to limit its homology with the genomic sequence2. Second, errant double-strand breaks (DSBs) can be generated, sometimes as a consequence of MMR converting nicks into DSBs, and can induce indels consistent with DSB repair2,18. These indels can be addressed through inhibition or avoidance of MMR2,10. Third, the edited 3′ new strand generated by the prime editor can end join at unintended positions2. This often produces large deletions or tandem duplication-like insertions, and there are presently no strategies for addressing these errors.

In the prime-editing process, the edited 3′ new strand is disfavoured in displacing the competing 5′ strand due to the former being mismatched to the complementary strand1 (Fig. 1a). We reasoned that this bias against annealing of the edited 3′ new strand can limit editing efficiency and promote errors (Fig. 1b). It is known that the 3′ end of a nicked DNA substrate can escape from bound Cas9 complex, whereas the 5′ end remains stably bound19,20. We hypothesized that destabilizing positioning of the 5′ end might enable its degradation (Fig. 1c). We recently discovered that mutations in the Cas9–DNA interface can relax nick positioning21, which encouraged our exploration here of whether competing 5′ strands formed by prime editors can be destabilized. In this study, we examined mutations that relax Cas9 nick positioning for induction of nicked end degradation. Using these mutations, we engineered prime editors to discover an unexpectedly large influence of nick relaxation on indel error generation. Through rational design, we created efficient prime editors that rarely produce indel errors.

Relaxing nick positioning reduces errors

To determine whether the DNA ends at Cas9-induced nicks can be destabilized, we characterized nick positioning and DNA end degradation for engineered Cas9 variants. We assessed nick location through inference from indirect measurements of paired DSB junctions in cells21,22. In this assay, paired gRNAs produce deletion junctions between their DSBs, and retained sequences within these junctions indicate shifted nicks (Extended Data Fig. 1a). We screened Streptococcus pyogenes Cas9 alanine substitutions in the DNA-binding clefts, combining sequencing data for paired gRNA junctions at the CXCR4 locus with previously generated data at the EMX1 locus21 (Extended Data Fig. 1b,c). Several Cas9 variants promoted retention of sequences within these junctions (Extended Data Fig. 1c), which we interpreted as evidence of relaxed nick positioning (Fig. 1d and Extended Data Fig. 1d). As these nick positions are inexact due to extensive processing of DNA ends, we reasoned that nick relaxation could be quantitated using the frequency of nicks shifted from the canonical position, which we termed ‘nick shift frequency’ (Fig. 1e). We further assayed DNA end perturbations using analysis of paired DSB junctions, reasoning that additional deletions on the PAM and non-PAM sides of these junctions would indicate degradation of their respective DNA ends (Extended Data Fig. 2a). We quantitated the length and frequency of deletions as metrics of degradation for these Cas9 variants (Fig. 1e and Extended Data Fig. 2b). Our analysis indicated that wild-type Cas9 produced fewer deletions on the PAM side versus the non-PAM side, mostly 1 bp but rarely larger, which we interpreted to indicate minimal end degradation with stable PAM-side ends (Extended Data Fig. 2c). We similarly observed minimal PAM-side deletions for several Cas9 variants without relaxed nicks. 
By contrast, mutations that relaxed nick positioning (R780A, K810A, K848A, K855A, R976A and H982A) increased PAM-side deletions without affecting non-PAM-side ends. In particular, R976A and H982A generated dominant deletions several base pairs in length, which we interpreted as evidence of significant PAM-side end degradation (Extended Data Fig. 2c). We postulated that this bias towards removal of PAM-side ends indicates destabilization of the non-target strand nicked 5′ end promoted by relaxed nick positioning.
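
To make the ‘nick shift frequency’ metric concrete, the sketch below computes it as the fraction of junction reads whose inferred nick lies away from the canonical position. This is a minimal illustration only: the function name, the encoding of each read as an integer offset from the canonical nick and the example counts are assumptions made for this sketch and are not taken from the analysis pipeline used in this study.

```python
from collections import Counter


def nick_shift_frequency(nick_offsets):
    """Fraction of reads whose inferred nick is shifted from the canonical position.

    nick_offsets: iterable of ints, one per junction read. An offset of 0 means
    the canonical nick position; any non-zero offset means a shifted nick
    (hypothetical encoding chosen for this sketch).
    """
    counts = Counter(nick_offsets)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    shifted = total - counts[0]
    return shifted / total


# Hypothetical example: 8 reads at the canonical position and 2 shifted reads.
example_offsets = [0] * 8 + [1, 2]
example_frequency = nick_shift_frequency(example_offsets)
```

With these made-up counts, two of ten reads carry a shifted nick, so the frequency is 0.2.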

As our findings suggested Cas9 mutations that relaxed nick positioning also promoted DNA end degradation, we explored whether prime editors with these mutations might show enhanced editing. We began with PEmax, composed of Cas9n(H840A) with R221K–N394K mutations2. For direct comparison to the screened Cas9 variants, we reverted these to R221–N394 and named this prime editor PE. To probe how nick-relaxing mutations affect editing outcomes, we introduced the 14 mutations from our Cas9 screen into PE. We developed an assay for prime editor nicked end degradation at the AAVS1 locus, where paired nicks produce homology deletions by annealing of nicked non-target strand flaps, with an included activity marker edit (Extended Data Fig. 2d,e). In this assay, stable nicked ends enable flap homology deletions while degraded nicked ends inhibit deletions. We quantitated the ratio of the activity marker edit to the flap homology deletion, which we termed ‘flap degradation’, to infer nicked end degradation for these 15 PE variants in HEK293T cells (Fig. 1f and Extended Data Fig. 2f). We also screened these PE variants targeting edits at the TGFB1 and KRAS loci in HEK293T cells with the pegRNA + nicking gRNA (ngRNA) mode (Fig. 1g and Extended Data Fig. 3a). We further screened these variants for editing in a negative position at the TGFB1 locus (Extended Data Fig. 3b), indicating relaxation of the nick position to shift the +1 start of the editing window, and quantitated nick relaxation using the ratio of negative versus positive edit efficiencies (Extended Data Fig. 3c). Several PE variants (R780A, K810A, K848A, K855A, R976A and H982A) increased negative:positive edit ratios and flap degradation up to 22-fold, which we interpreted as supporting non-target strand nick relaxation and nicked end degradation. These same PE variants decreased indel errors up to 20-fold, with improved edit:indel ratios up to 10-fold. 
We observed similar suppression of four different indel classes (deletions and insertions, without or with the edit). Of note, there was a strong correlation between nick shift frequency, flap degradation and edit:indel ratios for Cas9 and PE variants bearing the same mutations (Fig. 1h–j). This supported a connection between nicked strand degradation and indel error suppression, suggesting this as a new mechanism for enhancing prime-editing fidelity.
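
The metrics used in this screen are simple ratios, and the following sketch shows one way they could be computed. It assumes that frequencies are supplied as percentages of sequenced reads; the function names and all numbers are hypothetical and serve only to illustrate the definitions of the edit:indel ratio, flap degradation (activity marker edit divided by flap homology deletion), fold change versus a reference editor and the Pearson coefficient.

```python
import math


def edit_indel_ratio(edit_pct, indel_pct):
    """Intended-edit frequency divided by indel frequency (both in percent)."""
    if indel_pct <= 0:
        raise ValueError("indel frequency must be positive to form a ratio")
    return edit_pct / indel_pct


def flap_degradation(marker_edit_pct, flap_deletion_pct):
    """Ratio of the activity marker edit to the flap homology deletion."""
    if flap_deletion_pct <= 0:
        raise ValueError("flap deletion frequency must be positive")
    return marker_edit_pct / flap_deletion_pct


def fold_change(variant_value, reference_value):
    """Value for a variant relative to the reference editor."""
    return variant_value / reference_value


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical frequencies (percent of reads), not data from this study.
reference_ratio = edit_indel_ratio(30.0, 3.0)    # reference editor
variant_ratio = edit_indel_ratio(28.0, 0.14)     # error-suppressing variant
ratio_improvement = fold_change(variant_ratio, reference_ratio)
perfect_r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In this made-up case the variant loses little efficiency but its edit:indel ratio rises about 20-fold, which is the pattern the screen was designed to detect.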

We next combined these mutations to create double-mutant PE variants and similarly tested them (Fig. 1f,g and Extended Data Fig. 3a–c). These variants demonstrated greatly increased negative:positive edit ratios, up to 42-fold, and dramatically reduced indel errors, up to 118-fold lower than PE. One variant, K848A–H982A, nearly eliminated errors, reducing them 36-fold versus PE and improving the edit:indel ratio 28-fold. We named this variant precise prime editor (pPE). Comparing pPE with PEmax across six loci (CXCR4, EMX1, GFP, MYC, STAT1 and TGFB1) in HEK293T cells revealed consistent indel error suppression. For pegRNA-only editing versus PEmax, pPE reduced indels 7.6-fold (range of 1.1–13-fold) and increased the edit:indel ratio 6.3-fold (range of 0.4–10-fold; Extended Data Fig. 4a–d). These improvements versus PEmax were more dramatic for pegRNA + ngRNA editing, in which pPE decreased indels 26-fold (range of 7.7–36-fold) and improved the edit:indel ratio 20-fold (range of 6.6–39-fold; Extended Data Fig. 4e–h). Examination of different indel classes demonstrated significant reductions of each (Extended Data Fig. 4c,g). For pegRNA-only editing, these reductions were 2.5–25-fold, and for pegRNA + ngRNA editing, they were 7.8–28-fold. We observed similar improvements in the presence of MMR inhibition (Extended Data Fig. 4i–p). These gains for pPE enabled edit:indel ratios of up to 361:1 (Extended Data Fig. 4d,h). Therefore, the effects of nick-relaxing mutations on error generation extended to diverse edit types and prime-editing modes.

Further design improves efficiency

Although pPE suppressed errors versus PEmax, we observed a moderate reduction in editing efficiency (Extended Data Fig. 4). Similar reductions in Cas9 activity were associated with efforts to improve on-target specificity, expand targeting space and alter repair outcomes21,23,24,25,26,27. Accordingly, we reasoned that decreased efficiency might be addressed by incorporating mutations previously found to enhance Cas9 activity into pPE21,26,27,28. We tested eight mutations that introduce charged residues near the nuclease positions in Cas9 to possibly rescue reduced activity. We screened these pPE variants using edits at the MYC and STAT1 loci in HEK293T cells in the pegRNA-only mode (Fig. 2a and Extended Data Fig. 5a). Several mutants demonstrated increased efficiency up to 1.2-fold with modestly improved edit:indel ratios. Of note, two mutations (R221K and L244Q) are known to non-specifically increase activity for Cas9 (ref. 28), whereas the other two (G1104K and N1317R) were near A982 and increase Cas9 activity in the context of nearby mutations21,27. We then assessed combinations of these mutations to identify variants with maximized efficiency (Fig. 2a and Extended Data Fig. 5a). We observed increased efficiency up to 1.7-fold, again with improved edit:indel ratios. Comparison with mutations that reduce potential DSB formation18, N854A and N863A, showed superior editing for our variants, and inclusion of N863A did not improve edit:indel ratios, suggesting that our variants caused minimal DSBs (Extended Data Fig. 5b–g). The most efficient variant, R221K–K848A–H982A–N1317R, also improved edit:indel ratios from 276:1 for pPE to 354:1. We named this variant extra-precise prime editor (xPE).

Fig. 2: Error suppression synergizes with efficiency-enhancing designs.

a, Screen of engineered pPE variants to enhance efficiency with quantification of edit and indel frequencies (bottom), indel classes (middle) and edit:indel ratios (top). b, Architectures of different prime editors, highlighting the components of the standard editors PEmax and PE7 versus the engineered editors xPE and vPE. MMLV, Moloney murine leukaemia virus; NLS, nuclear localization signal. c, Edit and indel frequencies comparing vPE with PE7 using pegRNA-only editing, with fold reductions in indel rates marked. d, Means of edit and indel frequencies comparing PEmax, xPE, PE7 and vPE using pegRNA-only editing (from panel c and Extended Data Fig. 6a), with each point representing an individual edit and replicate. e, Means of edit:indel ratios comparing PEmax, xPE, PE7 and vPE using pegRNA-only editing (from panel c and Extended Data Fig. 6a), with each point representing an individual edit and replicate. *P < 0.05 by two-tailed unpaired Student’s t-test for comparisons with PE7 or PEmax. All data were analysed by deep sequencing and represent means of n = 3 independent replicates with standard errors.

We investigated the capabilities of these engineered prime editors to understand the mechanisms by which they improved editing. We compared xPE with PEmax across six loci (CXCR4, EMX1, GFP, MYC, STAT1 and TGFB1) in HEK293T cells. For pegRNA-only editing, we observed a decrease in indel rates from 0.59% (range of 0.095–1.2%) for PEmax to 0.11% (range of 0.072–0.14%) for xPE with slightly lower rates of intended edits (Extended Data Fig. 6a,b). Subclassifying indels showed similar reductions for deletions without or with the intended edit, whereas the decrease in insertions was less for xPE versus PEmax (Extended Data Fig. 6c). This corresponded to broad improvements in edit:indel ratios (Extended Data Fig. 6d). Summarizing our findings for pegRNA-only editing, xPE decreased indels 5.0-fold (range of 1.3–9.2-fold) and increased the edit:indel ratio 4.2-fold (range of 0.7–8.2-fold) versus PEmax. These effects of xPE were more dramatic for pegRNA + ngRNA editing, in which we observed a reduction in indels from 16% (range of 14–19%) for PEmax to 1.8% (range of 0.52–3.1%) for xPE (Extended Data Fig. 6e,f). Analysing indel subclasses showed that xPE decreased all indel types compared with PEmax, most dramatically for deletions (Extended Data Fig. 6g). These effects for xPE led to large increases in the edit:indel ratio versus PEmax (Extended Data Fig. 6h). For pegRNA + ngRNA editing, xPE significantly reduced indels 12.7-fold (range of 5.0–27.5-fold) and improved the edit:indel ratio 9.4-fold (range of 4.0–14.1-fold) versus PEmax. These improvements for xPE led to edit:indel ratios of up to 199:1 (Extended Data Fig. 6d,h).
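
A note on the arithmetic may help when comparing the summary values above. The mean indel rates for pegRNA-only editing (0.59% and 0.11%) give a ratio of about 5.4, whereas the reported decrease is 5.0-fold. One possible explanation, which is an assumption of this sketch and is not stated in the text, is that fold changes were computed per edit and then averaged. The two summaries generally differ, as the hypothetical example below shows.

```python
from statistics import mean


def ratio_of_means(reference, variant):
    """Mean reference rate divided by mean variant rate."""
    return mean(reference) / mean(variant)


def mean_of_ratios(reference, variant):
    """Average of per-edit fold reductions (reference / variant for each edit)."""
    if len(reference) != len(variant):
        raise ValueError("inputs must be paired per edit")
    return mean([r / v for r, v in zip(reference, variant)])


# Ratio of the reported mean indel rates for pegRNA-only editing.
reported_means_fold = 0.59 / 0.11

# Hypothetical per-edit indel rates (percent) for two edits.
reference_rates = [1.2, 0.1]
variant_rates = [0.2, 0.1]
fold_from_means = ratio_of_means(reference_rates, variant_rates)
fold_from_ratios = mean_of_ratios(reference_rates, variant_rates)
```

Here the ratio of means is about 4.3 while the mean of per-edit ratios is 3.5, so the two summaries should not be expected to agree exactly.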

RNA protection boosts low-error editing

Although xPE gained efficiency over pPE while maintaining low error rates, we still observed slightly lower efficiency versus PEmax. Because the mutations in xPE are not known to reduce Cas9 activity21,24,28, we reasoned that this reduced efficiency might be related to nick repositioning. Previous work showed that pegRNA 3′ ends are unstable, decreasing efficiency by eliminating binding to nicked DNA 3′ ends11. We hypothesized that prime editors with repositioned nicks might be particularly susceptible to pegRNA instability due to reduced overlap between their shortened nicked DNA 3′ ends and pegRNAs. To evaluate this, we swapped our error-suppressing Cas9n from xPE (R221K–K848A–H982A–N1317R) into a recent efficiency-boosting prime-editing architecture (PE7) that stabilizes pegRNAs using the La poly-U RNA-binding protein13. We named this new editor very-precise prime editor (vPE; Fig. 2b). We compared vPE with PE7 across six loci (CXCR4, EMX1, GFP, MYC, STAT1 and TGFB1) in HEK293T cells using pegRNA-only editing. Compared with PE7, vPE reduced indel rates 8.6-fold (range of 1.3–16-fold), with similar efficiency, and increased the edit:indel ratio 8.2-fold (range of 1.0–15-fold; Fig. 2c). Comparing vPE and PE7 with xPE and PEmax for the same edits (Extended Data Fig. 6a,b), we observed substantially increased editing efficiency (Fig. 2d). For PE7 versus PEmax, both composed of the same Cas9n (R221K–N394K), the increase in efficiency was 2.7-fold (range of 1.4–5.3-fold). Yet for vPE versus xPE, the gain in efficiency was 3.2-fold (range of 1.5–6.5-fold), suggesting that xPE was somewhat more restrained by pegRNA instability than PEmax. This increased efficiency coincided with similarly increased indel error rates for PE7 but not for vPE (Fig. 2d). Correspondingly, PE7 featured an edit:indel ratio of 138:1, slightly better than PEmax at 91:1, whereas vPE increased the edit:indel ratio to 465:1 (Fig. 2e).

To clarify the potential of vPE over earlier prime editors, we edited a larger set of targets for both pegRNA-only and pegRNA + ngRNA-editing modes. As previous optimization of pegRNAs and paired ngRNAs enabled low indel error rates1, we tested these optimized designs to evaluate whether further suppression was achievable2. These edits encompassed all transition and transversion point mutations, spanning +1 to +7 positions, at well-studied loci (DNMT1, EMX1, FANCF, HEK3/LINC01509, RNF2, RUNX1 and VEGFA) in HEK293T cells. For pegRNA-only editing, we measured mean efficiencies of 34% (range of 25–55%) for PE7 and 32% (range of 15–48%) for vPE (Fig. 3a,b). For these same edits, we observed mean indel errors of 0.50% (range of 0.15–1.1%) for PE7 and 0.14% (range of 0.039–0.23%) for vPE (Fig. 3a,b). Analysing indel subtypes revealed that vPE reduced all classes versus PE7 (Fig. 3c). This high efficiency for vPE coupled with significant error suppression led to large increases in edit:indel ratios versus PE7 (Fig. 3d). We next evaluated pegRNA + ngRNA editing, an important editing mode that boosts efficiencies but is error prone1, at these same loci and edits. Here we measured mean editing efficiencies of 28% (range of 15–60%) for PE7 and 31% (range of 15–44%) for vPE (Fig. 3e,f). For pegRNA + ngRNA editing, we observed higher mean indel error rates of 16% (range of 1.6–33%) for PE7 and 1.9% (range of 0.069–7.9%) for vPE (Fig. 3e,f). We again found that vPE reduced all indel classes when compared with PE7 (Fig. 3g). As vPE dramatically reduced errors and increased efficiency for pegRNA + ngRNA editing versus PE7, we observed large increases in edit:indel ratios (Fig. 3h). To evaluate whether vPE could also improve larger edits, we tested larger substitution, insertion and deletion edits at the same loci in HEK293T cells16. For these larger edits, we measured mean editing efficiencies of 35% (range of 19–53%) for PE7 and 29% (range of 7–43%) for vPE (Fig. 3i,j). 
We further observed mean indel error rates of 3.5% (range of 0.14–18%) for PE7 and 0.20% (range of 0.017–0.78%) for vPE (Fig. 3i,j). We found that vPE significantly reduced all indel classes versus PE7 (Fig. 3k). As vPE greatly reduced errors for larger edits versus PE7, we observed large increases in edit:indel ratios (Fig. 3l).

Fig. 3: Engineered prime editors suppress error formation for diverse edit types.

a, Edit and indel frequencies comparing vPE with PE7 using pegRNA-only editing, with fold reductions in indel rates marked. b, Means of edit and indel frequencies comparing vPE with PE7 using pegRNA-only editing, with each point representing an individual edit and replicate. c, Means of different indel class frequencies comparing vPE with PE7 using pegRNA-only editing, with each point representing an individual edit and replicate. d, Means of edit:indel ratios comparing vPE with PE7 using pegRNA-only editing, with each point representing an individual edit and replicate. e, Edit and indel frequencies comparing vPE with PE7 using pegRNA + ngRNA editing, with fold reductions in indel rates marked. f, Means of edit and indel frequencies comparing vPE with PE7 using pegRNA + ngRNA editing, with each point representing an individual edit and replicate. g, Means of different indel class frequencies comparing vPE with PE7 using pegRNA + ngRNA editing, with each point representing an individual edit and replicate. h, Means of edit:indel ratios comparing vPE with PE7 using pegRNA + ngRNA editing, with each point representing an individual edit and replicate. i, Edit and indel frequencies comparing vPE with PE7 for larger edits, with fold reductions in indel rates marked. j, Means of edit and indel frequencies comparing vPE with PE7 for larger edits, with each point representing an individual edit and replicate. k, Means of different indel class frequencies comparing vPE with PE7 for larger edits, with each point representing an individual edit and replicate. l, Means of edit:indel ratios comparing vPE with PE7 for larger edits, with each point representing an individual edit and replicate. *P < 0.05 by two-tailed unpaired Student’s t-test for comparisons with PE7. All data were analysed by deep sequencing and represent means of n = 3 (most samples) or n = 2 (VEGFA edits) independent replicates with standard errors.

We next examined the generalizability of enhanced editing with our engineered prime editors. Summarizing editing in HEK293T cells, vPE increased the edit:indel ratio to 543:1 for pegRNA-only editing and 102:1 for pegRNA + ngRNA editing (Fig. 4a,b). As prime editors sometimes install edits at off-target loci, we also measured editing at known off-target positions in HEK293T cells2,13 (Fig. 4c,d). We observed that vPE reduced off-target edits up to 14-fold compared with PE7, probably due to inclusion in vPE of mutations known to suppress off-target breaks24. Next, we explored editing in additional cell models, including A549 and HeLa cells, and broadly observed suppression of indel errors with vPE versus PE7 (Extended Data Fig. 7a–h). To further assess whether engineered prime editors can yield efficient and functional edits in important cell types, we evaluated a prime edit in mouse embryonic stem cells (ESCs) that converts a GFP transgene to BFP (Fig. 4e). Analysis by flow cytometry revealed that PE7 had an editing efficiency of 9.3% with an indel error rate of 2.8%, whereas vPE edited 15% of cells with effectively no errors (Fig. 4f,g). We further analysed the allelic identities of edited loci for several edits in HEK293T cells to determine whether errors were dominated by any particular sequence (greater than 0.05% of sequenced reads). Whereas PE7 produced several dominant indel-containing alleles for each edit for pegRNA-only or pegRNA + ngRNA editing, vPE resulted in no significant indel-containing alleles (Extended Data Figs. 8a–d and 9a–c). Of note, the fold reductions in errors were largest for edits where PE7 made the most indels. Thus, vPE minimized errors, correcting numerous edits that were highly error prone with earlier prime editors.
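
The dominance criterion used for allele analysis (an indel-containing allele exceeding 0.05% of sequenced reads) can be expressed as a simple filter. The sketch below is illustrative only: the function name, the allele labels and the read counts are hypothetical, and it assumes that each allele has already been classified as indel-containing or not.

```python
def dominant_indel_alleles(allele_counts, indel_alleles, threshold_pct=0.05):
    """Return indel-containing alleles above a read-percentage threshold.

    allele_counts: dict mapping an allele label to its read count.
    indel_alleles: set of labels classified as indel-containing.
    threshold_pct: percentage of total reads that must be exceeded.
    """
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    dominant = {}
    for allele, count in allele_counts.items():
        pct = 100.0 * count / total
        if allele in indel_alleles and pct > threshold_pct:
            dominant[allele] = pct
    return dominant


# Hypothetical read counts for one edit (100,000 reads in total).
example_counts = {
    "intended_edit": 40000,
    "unmodified": 59850,
    "del_5bp": 120,   # 0.12% of reads, above the threshold
    "ins_1bp": 30,    # 0.03% of reads, below the threshold
}
example_dominant = dominant_indel_alleles(example_counts, {"del_5bp", "ins_1bp"})
```

Only the hypothetical 5-bp deletion passes the filter in this example; the rarer insertion is treated as background.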

Fig. 4: Engineered prime editors suppress target and off-target errors for efficient functional edits.

a, Summary of edit:indel ratios comparing vPE with PE7 using pegRNA-only editing for all experiments in this study, with each point representing an individual edit and replicate. b, Summary of edit:indel ratios comparing vPE with PE7 using pegRNA + ngRNA editing for all experiments in this study, with each point representing an individual edit and replicate. c, Edit frequencies at target and off-target sites comparing vPE with PE7 using pegRNA-only editing, with fold reductions in off-target edit rates marked. d, Means of target:off-target edit ratios comparing vPE with PE7 for pegRNA-only editing, with each point representing an individual edit and replicate. e, Assay for functional editing by conversion of a GFP transgene to BFP in ESCs. f, Flow cytometry for populations of GFP+, GFP− and BFP+ ESCs comparing vPE with PE7 for pegRNA + ngRNA editing. g, Edit and indel frequencies comparing vPE with PE7 for pegRNA + ngRNA editing using flow cytometry for ESCs. *P < 0.05 by two-tailed unpaired Student’s t-test for comparisons with PE7. All data were analysed by deep sequencing (a–d) or flow cytometry (f,g) and represent means of n = 3 independent replicates with standard errors.

Discussion

We described the surprising discovery that prime-editing errors can be greatly suppressed through engineering the CRISPR nuclease. We propose that this finding is explained by relaxation of Cas9-induced nick positioning, which promotes degradation of competing 5′ strands and thereby reduces their competition with edited 3′ new strands (Extended Data Fig. 10a,b). We routinely observed several-fold decreases in indels with our engineered prime editors coupled with high editing efficiencies, resulting in significant increases in edit:indel ratios. This insight suggests how other engineered or natural Cas9 variants and orthologues could be tested for alternative break structures29, which may also promote non-target strand 5′ end degradation following cleavage. Our engineering strategy could potentially be applied to related genome editor classes that similarly utilize polymerases to introduce edits, which also produce significant errors30,31. Indeed, we propose that modulating DNA substrate stability to enhance editing and suppress errors is a design paradigm that can yield superior genome editors.

Uncertainty in editing outcomes is a major concern as errors might propagate with harmful consequences, for example, in multiplexed editing, gene drives, molecular recording and gene therapy. The unpredictability of errors is a significant design challenge as efficiency must be maximized while indel errors are minimized, necessitating extensive pegRNA and ngRNA optimization32,33,34. The engineered prime editors described here appear to eliminate one of these constraints by suppressing errors to minimal levels. We observed that pegRNA-only editing with vPE reduces error rates to near-uniformly low levels, in contrast to a large range of error rates for PE7. We speculate that variations in local sequences and cellular factors controlling relative annealing rates of the competing 3′ and 5′ strands influence the edit:indel ratio. This is supported by our observation that vPE consistently suppresses error rates to low levels, regardless of how frequent those errors were for PE7. Sequence differences may also explain why different edits at the same locus can yield very different editing efficiencies for vPE versus PE7. Because these adjacent edits utilize nearly identical pegRNAs, this observation could be explained by vPE being sensitive to sequence-specific misfolding that is known to occur for gRNAs35. This suggests that although vPE may simplify pegRNA design, some optimization to maximize efficiency will likely be necessary.

The design of prime editors that produce minimal errors shows that protein engineering can address a key challenge for precise genome editing. This approach is minimally invasive to cells, requiring no manipulation of cellular states, modulation of DNA repair processes or addition of exogenous factors. Use of these engineered prime editors is straightforward, enabling their facile substitution into existing and future genome-editing applications. The prime editors described here exhibit a uniquely high level of editing precision at many loci and could potentially form the basis for a range of advanced tools and applications.

Methods

Mammalian cell culture

All mammalian cell cultures were maintained in a 37 °C incubator at 5% CO2. HEK293T human embryonic kidney (Thermo Fisher), A549 human lung cancer (a gift from S. Garg) and HeLa human cervical cancer (a gift from M. Stewart) cells were maintained in Dulbecco’s modified Eagle’s medium with high glucose, sodium pyruvate and GlutaMAX (DMEM; 10569, Thermo Fisher) supplemented with 10% fetal bovine serum (FBS; 10438, Thermo Fisher) and 100 U ml−1 penicillin–streptomycin (15140, Thermo Fisher). V6.5 mouse ESCs (a gift from R. Jaenisch) were maintained in DMEM with high glucose and sodium pyruvate (11995, Thermo Fisher) supplemented with 15% FBS (SH30070, GE Healthcare), 1 mM HEPES (15630, Thermo Fisher), 2 mM l-glutamine (25030, Thermo Fisher), 1× MEM non-essential amino acids (NEAAs; 11140, Thermo Fisher), 0.0008% 2-mercaptoethanol (M6250, Millipore-Sigma), 1,000 U ml−1 leukaemia inhibitory factor (LIF; ESG1107, Millipore-Sigma) and 100 U ml−1 penicillin–streptomycin (30-002-Cl, Corning). V6.5 cells were grown on plates coated with 0.2% gelatin (G1890, Millipore-Sigma). HEK293T cells without or with a GFP transgene insertion were used as previously described21. ESCs with a GFP transgene insertion were created by infection of V6.5 cells with pLX TRC209 lentivirus and isolation of single-cell clones expressing GFP. All cell lines were tested for mycoplasma contamination and confirmed mycoplasma free. Cell lines were not authenticated. No commonly misidentified cell lines were used.

Mutagenesis and cloning

PE2 and PEmax prime editors were obtained from pCMV-PE2 and pCMV-PEmax, and a cloning backbone for pegRNA expression was obtained from pU6-pegRNA-GG-acceptor, which were gifts from D. R. Liu. PE7 prime editor was obtained from Lenti-PE7-P2A-Puro, which was a gift from F. J. Sanchez-Rivera. PE was created by restriction cloning of Cas9n(H840A) from pCMV-PE2 into pCMV-PEmax using NotI and SacI digestion. PE mutagenesis was performed using PCR-driven splicing by overlap extension using primers listed in Supplementary Table 1. In brief, one fragment was amplified by PCR from PE using the pe-FWD or pe-mid-FWD and mutant-BOT primers, and a second fragment was amplified using the mutant-TOP and pe-mid-REV or pe-rt-REV primers for each mutant. Each pair of fragments was then spliced by overlap extension PCR using the pe-FWD and pe-mid-REV or pe-mid-FWD and pe-rt-REV primers to create a PE gene fragment with a single-residue mutation. These PE gene fragments were then each cloned back into PE using unique NotI, SacI and BamHI restriction sites to replace the PE sequence with the mutant sequence. Additional mutants (double, triple and quadruple mutants) were made iteratively starting from these single-mutant plasmids. PE7 and vPE were created by restriction cloning of a fragment containing La amplified from Lenti-PE7-P2A-Puro into pCMV-PEmax, for PE7, or into xPE, for vPE, using BamHI and BshTI digestion. The pegRNA oligos, listed in Supplementary Table 2, were cloned into pU6-pegRNA-GG-acceptor by Golden Gate cloning with Eco31I digestion.

Cas9 variants were generated as previously described21. A custom gRNA cloning backbone vector was created by PCR amplification from pX330 using the gRNA-scaffold-NheI-FWD and gRNA-scaffold-EcoRI-REV primers and restriction cloning into pUC19 (Thermo Fisher) using NheI and EcoRI digestion. The nicking gRNA spacer sequence oligos, listed in Supplementary Table 2, were phosphorylated with T4 polynucleotide kinase (NEB) and cloned into the gRNA cloning backbone by Golden Gate cloning with BpiI digestion.

Primers were synthesized by IDT. Restriction enzymes were obtained from Thermo Fisher. T7 DNA ligase was obtained from NEB. Plasmids were transformed into Stbl3 chemically competent Escherichia coli (Thermo Fisher). Sequences for the PEmax, PE, pPE, xPE, PE7 and vPE vectors are presented in the Sequences section in the Supplementary Information.

Structure analysis

Crystal structures of Cas9 with substrate DNA bound (5F9R) or without substrate DNA bound (4ZT0) were analysed using PyMOL (Schrödinger).

Cell transfection

Cells were seeded in the maintenance medium (without penicillin–streptomycin for HEK293T, A549 and HeLa cells) into 48-well plates at 50,000 cells per well. Transfections with dual gRNAs were carried out 24 h after seeding using 200 ng Cas9 expression vector and 72 ng of each gRNA expression vector formulated with 0.86 µl Lipofectamine 2000 (Thermo Fisher) at a total volume of 34.4 µl in OptiMEM I (Thermo Fisher) per well. Transfections with prime-editing vectors were carried out 24 h after seeding using 238 ng PE expression vector, 57 ng pegRNA expression vector and 72 ng nicking gRNA expression vector (for pegRNA + ngRNA editing) formulated with 0.74–0.92 µl (equal volume per DNA) Lipofectamine 2000 at a total volume of 29.5–36.7 µl (equal DNA concentration) in OptiMEM I per well. For sequencing assays, genomic DNA was extracted 72 h after transfection using QuickExtract (Epicentre). For flow cytometry assays, cells were transferred to 10-cm dishes 72 h after transfection and harvested 9 days after transfection in PBS with 5% FBS (Thermo Fisher).
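The reagent amounts above scale with total DNA mass. As a minimal sketch of that arithmetic, the per-nanogram ratios below are inferred from the stated dual-gRNA condition (344 ng DNA, 0.86 µl Lipofectamine 2000, 34.4 µl total volume); the constant and function names are illustrative assumptions, not part of the protocol.

```python
# Ratios inferred from the dual-gRNA condition: 200 + 72 + 72 = 344 ng DNA.
LIPO_UL_PER_NG = 0.86 / 344    # about 0.0025 µl Lipofectamine 2000 per ng DNA
TOTAL_UL_PER_NG = 34.4 / 344   # about 0.1 µl total volume per ng DNA


def formulation(dna_ng):
    """Return (total DNA in ng, Lipofectamine in µl, total volume in µl)."""
    total = sum(dna_ng)
    return total, total * LIPO_UL_PER_NG, total * TOTAL_UL_PER_NG


# pegRNA-only editing: PE vector + pegRNA vector
peg_only = formulation([238, 57])
# pegRNA + ngRNA editing: PE vector + pegRNA vector + nicking gRNA vector
peg_ngrna = formulation([238, 57, 72])
```

Under these inferred ratios, the two prime-editing conditions reproduce the stated ranges of 0.74–0.92 µl Lipofectamine 2000 and 29.5–36.7 µl total volume.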

High-throughput sequencing

The targeted loci were amplified from extracted genomic DNA by PCR using Herculase II polymerase (Agilent). The PCR primers included Illumina sequencing handles as well as replicate-specific barcodes. These PCR products were then tagged with sample-specific barcodes and sequenced on an Illumina MiSeq. Primers, listed in Supplementary Table 3, were synthesized by IDT. A sequencing file listing and sequencing depth data are available in Supplementary Table 4.

Flow cytometry

Flow cytometry analysis was performed on an LSR Fortessa analyser and data were collected using FACSDiva (BD Biosciences). Cells were first gated comparing side scatter (SSC) and forward scatter (FSC) parameters, starting with SSC-A and FSC-A, then SSC-H and SSC-W, then FSC-H and FSC-W parameters to select for single cells (Supplementary Fig. 1). To assess editing frequencies, cells were gated for GFP (488-nm laser excitation, 530/30-nm filter detection) and BFP (405-nm laser excitation, 450/50-nm filter detection). Flow cytometry data were analysed using FlowJo (FlowJo). Intended edit rates were quantified as the fraction of cells gated as BFP+ for a prime-edited sample minus the fraction gated as BFP+ for the unedited control sample. Indel rates were quantified as the fraction of cells gated as GFP− and BFP− for a prime-edited sample minus the fraction gated as GFP− and BFP− for the unedited control sample.
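The background subtraction used for both rates can be sketched as follows. This is an illustrative calculation only; the function name and the cell counts are hypothetical and do not come from the study.

```python
def background_subtracted_rate(sample_gated, sample_total, control_gated, control_total):
    """Fraction of gated cells in the edited sample minus that in the unedited control."""
    sample_fraction = sample_gated / sample_total
    control_fraction = control_gated / control_total
    return sample_fraction - control_fraction


# Hypothetical single-cell counts for one gate of interest (for example, BFP+ cells).
example_rate = background_subtracted_rate(
    sample_gated=2500, sample_total=10000,
    control_gated=100, control_total=10000,
)
```

The same helper applies to either gate, as long as the sample and control counts come from the same gate.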

Genome-editing analysis

To measure editing outcomes, the high-throughput sequencing data were analysed using CRISPResso2 (ref. 36). Data for prime-editing experiments were processed using the ‘prime editing’ mode in CRISPResso2 by including sequence values for the parameters ‘prime_editing_pegRNA_spacer_seq’, ‘prime_editing_pegRNA_extension_seq’, ‘prime_editing_pegRNA_scaffold_seq’ and ‘prime_editing_nicking_guide_seq’ (for pegRNA + ngRNA modes). Editing window parameters ‘prime_editing_pegRNA_extension_quantification_window_size’ and ‘w’ were set to 5. The ‘ignore_substitutions’ option was used to account for small sequence variations that occur due to PCR and sequencing errors. Intended edit rates were quantified as the fraction of reads marked as prime edited out of total sequencing reads. Indel rates were quantified as the fraction of reads marked as indels out of total sequencing reads. Frequencies of specific indel sizes were quantified as the fraction of reads containing these sizes out of all indel reads or out of total sequencing reads, as noted, and were averaged over three independent replicates. Mean indel sizes were calculated as the mean of the absolute values of indel sizes weighted by their indel fractions. Depletion of specific indel sizes was quantified as the fractional reduction in the frequency of that indel size, comparing different editors. Plots of insertion and deletion positions were produced from data generated in CRISPResso2 and averaged over three independent replicates.
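The mean indel size and depletion metrics described above can be sketched as follows. This is a minimal illustration, assuming indel sizes are stored as signed integers (negative for deletions, a convention chosen here) mapped to read fractions; the function names and example values are hypothetical.

```python
def mean_indel_size(indel_fractions):
    """Mean absolute indel size, weighted by the fraction of reads for each size.

    indel_fractions maps a signed indel size (negative = deletion, an
    assumption of this sketch) to its fraction of reads.
    """
    total = sum(indel_fractions.values())
    if total == 0:
        return 0.0
    weighted = sum(abs(size) * frac for size, frac in indel_fractions.items())
    return weighted / total


def depletion(reference_frequency, variant_frequency):
    """Fractional reduction in the frequency of one indel size between two editors."""
    return 1 - variant_frequency / reference_frequency


# Hypothetical fractions: 2-bp deletion, 1-bp insertion, 10-bp deletion.
example_mean = mean_indel_size({-2: 0.02, 1: 0.01, -10: 0.01})
example_depletion = depletion(reference_frequency=0.04, variant_frequency=0.01)
```

In this example the weighted mean is 3.75 bp and the depletion is 0.75, meaning the indel size is 75% less frequent with the variant editor than with the reference editor.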

Plots of editing outcome alleles were processed using the standard mode in CRISPResso2. The editing window parameter ‘w’ was set to 30. The plot size parameter ‘plot_window_size’ was set to 30, the minimum allele frequency parameter ‘min_frequency_alleles_around_cut_to_plot’ was set to 0.05 and the allele number parameter ‘max_rows_alleles_around_cut_to_plot’ was set to 30. Accordingly, up to 30 of the most frequent alleles that met the minimum frequency threshold were displayed.
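As an illustration, the plotting parameters above can be assembled into a CRISPResso2 command line. This is a sketch rather than the exact invocation used in the study: the file name and sequences are placeholders, and the helper function is hypothetical. It only builds the argument list, which could then be passed to subprocess.run.

```python
def allele_plot_command(fastq_r1, amplicon_seq, guide_seq):
    """Build a CRISPResso2 standard-mode command with the stated plot settings."""
    return [
        "CRISPResso",
        "--fastq_r1", fastq_r1,          # placeholder input file
        "--amplicon_seq", amplicon_seq,  # placeholder amplicon sequence
        "--guide_seq", guide_seq,        # placeholder gRNA spacer sequence
        "-w", "30",                      # editing (quantification) window size
        "--plot_window_size", "30",
        "--min_frequency_alleles_around_cut_to_plot", "0.05",
        "--max_rows_alleles_around_cut_to_plot", "30",
    ]


example_command = allele_plot_command("reads.fastq.gz", "ACGTACGTACGT", "ACGTACG")
```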

DNA nick shift frequency and end degradation analysis

To quantitate nick shift frequencies and nicked end degradation for Cas9 variants in cells, editing outcomes for dual-gRNA cutting of genomic DNA were analysed21,22. In these datasets, HEK293T cells were edited with pairs of gRNAs separately targeting either the EMX1 or CXCR4 locus. The CXCR4 dataset was generated from new experiments, whereas the EMX1 dataset was reanalysed from previously published data21. The gRNA pairs were complementary to the same strand at each locus and were expected to make cuts 84 bp apart, resulting in junctions. The loci were amplified and sequenced by high-throughput sequencing. The high-throughput sequencing data were analysed using CRISPResso2 with the expected junction as a reference sequence. To assess DNA nick position, sequencing reads aligned to the junction reference were analysed for retained sequences perfectly matching the sequences between the expected gRNA cut sites. The lengths of these retained sequences in the junctions were used to infer shifts in nick positioning leading to each read. The most frequent cut position with a frequency of 5% of reads or greater was presented as the dominant shifted nick position. The nick shift frequency was quantified as the fraction of reads containing a retained sequence indicating shifted nicks out of the total sequencing reads containing either a retained sequence or a perfect junction sequence. To assess DNA nicked end degradation, sequencing reads aligned to the junction reference were analysed for deletion sequences. The deletions to the PAM side of the junction were inferred to correspond to lengths of non-target strand 5′ end degradation, whereas the deletions to the non-PAM side of the junction were inferred to correspond to lengths of non-target strand 3′ end degradation. The degradation lengths were quantified as the median length of these degradation products. The degradation frequencies were quantified as the fraction of reads containing deletions on the PAM or non-PAM sides of the junction out of all reads containing either a deletion sequence or a perfect junction sequence, and were presented as the ratio between degradation on the PAM side versus non-PAM side.
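The nick shift quantification can be sketched as follows. This is a simplified illustration, not the analysis pipeline used in the study: it assumes reads are already oriented like the junction reference, that a retained sequence sits between the two junction flanks, and that it must perfectly match bases from the edges of the region between the expected cut sites. All names and sequences are hypothetical.

```python
def retained_length(read, left_flank, right_flank, intervening):
    """Return the retained-sequence length in a junction read, or None if unclassified.

    A return value of 0 indicates a perfect junction.
    """
    start = read.find(left_flank)
    if start < 0:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end < 0:
        return None
    retained = read[start:end]
    n = len(retained)
    if n >= len(intervening):
        return None
    # Retained bases must perfectly match the edges of the region between cut sites.
    for i in range(n + 1):
        if retained == intervening[:i] + intervening[len(intervening) - (n - i):]:
            return n
    return None


def nick_shift_frequency(reads, left_flank, right_flank, intervening):
    """Retained-sequence reads divided by retained-sequence plus perfect-junction reads."""
    lengths = [retained_length(r, left_flank, right_flank, intervening) for r in reads]
    shifted = sum(1 for n in lengths if n is not None and n > 0)
    perfect = sum(1 for n in lengths if n == 0)
    if shifted + perfect == 0:
        return 0.0
    return shifted / (shifted + perfect)


# Toy sequences: two perfect junctions, one 1-bp retained read, one unclassified read.
toy_reads = ["AAACGTTT", "AAACGTTT", "AAACCGTTT", "AAACTGTTT"]
toy_frequency = nick_shift_frequency(toy_reads, "AAAC", "GTTT", "CCGGTTAACC")
```

In the toy data, one of three classifiable reads carries a retained base, so the nick shift frequency is 1/3. The fourth read is excluded because its extra base does not match the region between the cut sites.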

To estimate nicked end degradation for prime editor variants, editing outcomes for a dual-nick genomic DNA degradation sensor were analysed. In these datasets, HEK293T cells were edited with a paired pegRNA and ngRNA targeting the AAVS1 locus. The gRNA pair was complementary to opposite strands and was expected to make nicks 43 bp apart, resulting in annealing of homologous sequences flanking the nick position. The frequency of these flap homology deletions could then be used to quantify the degradation of the nicked ends for each prime editor variant, as degradation would reduce deletion frequency. The pegRNA also generated a larger edit 8 bp from the nick that could be used as an activity marker, such that the effects of a mutation on overall activity of the prime editor variant could be determined in the same assay. The loci were amplified and sequenced by high-throughput sequencing. The high-throughput sequencing data were analysed using CRISPResso2 as described above for prime-editing data. Flap degradation was quantified as the number of reads containing the activity marker edit divided by the number of reads containing the flap homology deletions, normalized to this ratio for the standard prime editor PE.

To estimate nick shift frequencies for prime editor variants, editing outcomes comparing edits at a positive and negative position relative to a nick were analysed. In these datasets, HEK293T cells were edited with pegRNAs targeting the TGFB1 locus. The pegRNAs targeted the same sequence, but installed an edit either at the +6 or −1 position. The frequency of the −1 edit could then be used to quantify the nick shift frequency, as the nick would have to shift from the canonical position between −1 and +1 to enable a −1 edit. The loci were amplified and sequenced by high-throughput sequencing. The high-throughput sequencing data were analysed using CRISPResso2 as described above for prime-editing data. Negative:positive edit ratios were quantified as the intended edit rate for the −1 edit divided by the intended edit rate for the +6 edit, normalized to this ratio for the standard prime editor PE.
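Both the flap degradation metric and the negative:positive edit ratio are ratios for a variant normalized to the same ratio for the standard prime editor PE. A minimal sketch of that normalization follows; the function name and the example rates are hypothetical.

```python
def normalized_ratio(numerator, denominator, ref_numerator, ref_denominator):
    """Ratio for a variant divided by the same ratio for the reference editor (PE)."""
    return (numerator / denominator) / (ref_numerator / ref_denominator)


# Hypothetical negative:positive example: -1 edit rate over +6 edit rate, relative to PE.
example_ratio = normalized_ratio(
    numerator=0.02, denominator=0.40,          # variant: -1 edit rate, +6 edit rate
    ref_numerator=0.08, ref_denominator=0.40,  # PE: -1 edit rate, +6 edit rate
)
```

For flap degradation, the numerator would be the number of reads containing the activity marker edit and the denominator the number of reads containing flap homology deletions. A value below 1 indicates a lower ratio than PE; in this example the variant's negative:positive ratio is 0.25 of the PE ratio.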

Off-target editing

To measure off-target editing, two to three of the most commonly cleaved Cas9 off-target sites for given gRNAs were analysed as previously described2,13. Analysis was performed for off-target editing by pegRNAs targeting the EMX1, FANCF and HEK3 loci at known off-target sites (OT2 and OT3 for EMX1; OT1, OT3 and OT4 for FANCF; and OT1, OT2 and OT4 for HEK3). High-throughput sequencing was performed on these amplified sequences and data were processed using CRISPResso2. The quality filtering parameter ‘q’ was set to 30, whereas the editing window centre parameter ‘wc’ was set to 0 and the editing window size parameter ‘w’ was set to 3. The ‘discard_indel_reads’ option was used to remove reads containing deletions or insertions from the analysis. For each off-target locus, the sequence on the PAM side of the off-target nick was compared with the sequence encoded by the pegRNA template on the PAM side of the target nick to identify the first nucleotide on the off-target locus where these sequences differ. Sequencing reads at the off-target locus that matched the pegRNA template sequence from the nick to this first differing nucleotide position were considered off-target edit reads. Off-target-editing efficiencies were quantified as the fraction of reads marked as off-target edits out of total sequencing reads.
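The off-target read classification can be sketched as follows. This is an illustration under simplifying assumptions: each sequence is given as a string starting immediately on the PAM side of the nick and read in the same orientation, and reads have already been aligned and trimmed to that start. The function names and sequences are hypothetical.

```python
def first_difference(offtarget_pam_side, template_pam_side):
    """Index of the first nucleotide where the off-target locus and the template differ."""
    for i, (locus_base, template_base) in enumerate(zip(offtarget_pam_side, template_pam_side)):
        if locus_base != template_base:
            return i
    return None


def is_offtarget_edit(read_pam_side, offtarget_pam_side, template_pam_side):
    """True if the read matches the template from the nick through the first differing base."""
    i = first_difference(offtarget_pam_side, template_pam_side)
    if i is None:
        return False  # identical sequences, so an edit cannot be distinguished
    return read_pam_side[: i + 1] == template_pam_side[: i + 1]


def offtarget_edit_fraction(reads_pam_side, offtarget_pam_side, template_pam_side):
    """Fraction of reads classified as off-target edits out of all reads."""
    if not reads_pam_side:
        return 0.0
    hits = sum(is_offtarget_edit(r, offtarget_pam_side, template_pam_side) for r in reads_pam_side)
    return hits / len(reads_pam_side)


# Toy sequences: the locus and the template first differ at index 3 (T versus A).
toy_fraction = offtarget_edit_fraction(
    ["ACGATGCA", "ACGTTGCA", "ACGTTGCA", "ACGTTGCA"],
    offtarget_pam_side="ACGTTGCA",
    template_pam_side="ACGATGCA",
)
```

In the toy data, one of four reads carries the template-encoded base at the first differing position, giving an off-target edit fraction of 0.25.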

Edit notation

Edits were denoted based on the position where the edit begins relative to expected gRNA nick position for wild-type Cas9, denoting position +1 as 3 bp upstream of the first PAM position. Substitution edits were noted using a ‘>’ mark, deletions were noted by a ‘del’ mark, and insertions were noted by an ‘ins’ mark. The base identities of the strand containing the gRNA spacer sequence were used in all cases.
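The position numbering can be illustrated with a small helper. This is a sketch that assumes positions are counted on the spacer-containing strand with no position 0, so that the nick lies between −1 and +1. The offset convention (0 for the first PAM base, negative values upstream) and the function name are choices made for this example.

```python
def edit_position(offset_from_pam):
    """Convert an offset relative to the first PAM base into an edit position.

    offset_from_pam is 0 for the first PAM base and negative for bases
    upstream of the PAM (an assumption of this sketch).
    """
    if offset_from_pam >= -3:
        return offset_from_pam + 4  # PAM side of the nick: +1, +2, +3, ...
    return offset_from_pam + 3      # other side of the nick: -1, -2, ...


first_after_nick = edit_position(-3)   # 3 bp upstream of the first PAM base
first_pam_base = edit_position(0)
last_before_nick = edit_position(-4)
```

With this convention the base 3 bp upstream of the first PAM position is +1, the first PAM base is +4 and the base immediately on the other side of the nick is −1.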

Statistical analysis

Specific statistical comparisons are indicated in the figure legends. Error bars indicate the standard error for independent replicates as noted. Significance where noted was assessed using unpaired, two-tailed Student’s t-tests. Correlations were determined by Pearson coefficients. Figures and analysis were produced using GraphPad Prism and Microsoft Excel software.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.