WO2024261323A1

WO2024261323A1 - Molecular switches

Info

Publication number: WO2024261323A1
Application number: PCT/EP2024/067561
Authority: WO
Inventors: Natalie Jo Tigue; Lisa Marie Kitching VINALL; Anna Gudny SIGURDARDOTTIR; Christina Schindler; Stacey CHIN
Original assignee: AstraZeneca AB
Current assignee: AstraZeneca AB
Priority date: 2023-06-23
Filing date: 2024-06-21
Publication date: 2024-12-26
Anticipated expiration: 2025-12-23

Abstract

Provided herein are compositions and methods that permit the controlled interaction of polypeptides to which a target protein and binding protein are fused. The compositions and methods make use of a target protein that binds to the hepatitis C virus protease inhibitor grazoprevir to form a complex, and a Tn3 binding protein that specifically binds the complex. The target protein is or is derived from a viral protease, such as the HCV NS3/4A protease.

Description

Molecular Switches

Field

Provided herein are compositions and methods that permit the controlled interaction of polypeptides to which a target protein and binding protein are fused. The compositions and methods make use of a target protein that binds to the hepatitis C virus protease inhibitor grazoprevir to form a complex, and a Tn3 binding protein that specifically binds the complex. The target protein is or is derived from a viral protease, such as the HCV NS3/4A protease. Also provided herein are dimerization-inducible proteins, such as split transcription factors and split chimeric antigen receptors, that contain the target protein and Tn3 binding protein. The methods and compositions described herein find application, for example, in cell and gene therapy methods that involve the controlled expression and/or activation of proteins.

Background

Protein-protein interactions (PPIs) represent a universal regulatory mechanism that controls multiple biological functions. For example, gene transcription, protein folding, protein localisation, protein degradation and signal transduction all rely on the interaction or proximity of one protein to another, or indeed several others. By temporally controlling PPIs, researchers can readily monitor the functional consequences of a PPI, enabling the dissection of complex biological mechanisms. Furthermore, the ability to control biological functions is being utilised in cell and gene therapy to control therapeutic activity, enabling safer and more personalised therapies.

A commonly used technique for controlling PPIs is to use so-called chemical inducers of dimerization (CIDs), small molecules that bring together two proteins that do not interact in the absence of the CID, to form a tripartite complex (Stanton, Chory & Crabtree, 2018). The most widely used CIDs are rapamycin (an immunosuppressive drug derived from Streptomyces hygroscopicus) and analogues thereof, that form a heterodimeric complex with the proteins FKBP12 (12-kDa FK506-binding protein) and FRB (a domain from mTOR (mammalian target of rapamycin)) (Sabers et al. 1995). An attractive feature of rapamycin, along with other naturally-occurring CIDs, such as the plant hormones S-(+)-abscisic acid (ABA) and gibberellin (GA3-AM), is its co-operative binding mechanism whereby protein 2 can only bind to the protein 1:CID complex (Banaszynski, Liu & Wandless, 2005).

De novo CIDs have also been generated through the chemical linkage of two small molecules that bind the same, or different, proteins, with these proteins constituting the dimerization protein pair (Belshaw, Ho et al. 1996; Belshaw, Spencer et al. 1996). In these systems however, at high concentrations of the bi-functional CID, non-productive complexes between one protein partner and the CID out-compete the production of tripartite complexes, meaning that a linear dose-response cannot be achieved. As such, there is a growing urgency for new co-operative binding CID systems that can be used to regulate cellular function and to expand the number of orthogonal systems that can be used in complex genetic circuits. Furthermore, there are very few CIDs that have been approved for chronic human use. Recently, a method to generate de novo CID systems (AbCIDs) using antibody-based phage display selection methods was described (Hill et al. 2018). The CID used in that study was ABT-737, a Bcl-2 and Bcl-xL inhibitor, and Bcl-xL itself was employed as one of the protein partners. The second protein was then selected from a phage display library of single chain Fab (scFab) molecules to be selective for the Bcl-xL:ABT-737 complex over Bcl-xL alone.
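The loss of a monotonic dose-response at high concentrations of a bi-functional CID (often called the hook effect) can be illustrated with a simple equilibrium model. The Python sketch below is an illustration only and is not taken from the cited references: it assumes a non-cooperative bi-functional CID with two independent binding sites, treats the free CID concentration as the independent variable, and uses hypothetical dissociation constants and protein concentrations.

```python
import math

def ternary_complex(free_cid, a_total, b_total, kd_a, kd_b):
    """Equilibrium concentration of the A:CID:B complex (non-cooperative model).

    All concentrations and dissociation constants share one unit (e.g. nM).
    free_cid is the free (unbound) CID concentration, an assumption of this sketch.
    """
    if free_cid <= 0:
        return 0.0
    alpha = 1.0 + free_cid / kd_a        # (free A + A:CID) per unit of free A
    beta = 1.0 + free_cid / kd_b         # (free B + B:CID) per unit of free B
    gamma = free_cid / (kd_a * kd_b)     # ternary complex = gamma * [A] * [B]
    # Mass balance on A and B gives a quadratic in free B:
    #   beta*gamma*b^2 + q*b - alpha*b_total = 0
    q = alpha * beta + gamma * (a_total - b_total)
    disc = q * q + 4.0 * beta * gamma * alpha * b_total
    b_free = (-q + math.sqrt(disc)) / (2.0 * beta * gamma)
    a_free = a_total / (alpha + gamma * b_free)
    return gamma * a_free * b_free

# Hypothetical values: 1 nM of each protein, 10 nM affinity at each site.
for cid in (0.01, 10.0, 10000.0):
    print(cid, ternary_complex(cid, 1.0, 1.0, 10.0, 10.0))
```

Under these assumed values the tripartite complex is most abundant at the intermediate CID concentration and falls again at the highest one, because each protein is then occupied by a separate CID molecule. A co-operative system, in which the second protein binds only the protein 1:CID complex, does not form the competing binary complex with protein 2.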

The approach described in Hill et al. (supra) and WO 2018/213848 of identifying complex-specific molecules by utilising existing small molecules and their targets is an attractive one. However, the overexpression of certain human proteins (e.g. the anti-apoptotic Bcl-xL protein) and use of small molecules that bind to human targets within the body is not without its risks. For example, overexpression of a functional human protein is likely to have consequences for the cells in which it is expressed, which could impact cell health and viability. Additionally, the use of small molecules whose targets are expressed in the body can result in an increased dose requirement due to competition between the endogenous target and the overexpressed target for binding of the small molecule. Moreover, the binding of the small molecule to the endogenous target may affect the function of that protein in a way that is detrimental to the cells in which the target is expressed.

An approach designed to overcome the limitations of the AbCID system as described by Hill et al. (supra) is described in WO 2021/009692. This approach utilises as CIDs small molecules which bind non-human proteins (particularly viral proteins) and have already been approved for human use, improving safety and facilitating a smoother path to regulatory approval. Another advantage of using a viral target protein is that it removes the risk of an endogenous small molecule “sink” when used in a human, where the small molecule binds to endogenous targets in the human in addition to binding to the target protein. Furthermore, the expression of a viral protein within human cells is less likely to impact the physiology of the cell than expression of a human protein that has an endogenous function.

The use of viral proteases as the target protein was identified as particularly advantageous because these proteases are cytoplasmically located, small, and consist of discrete domains. WO 2021/009692 describes a system using a modified HCV NS3/4A protease as the target protein and the small molecule inhibitor of NS3/4A simeprevir as the CID.

Summary

As detailed in the Examples below, the present inventors have found that simeprevir has a short half-life in mice, which poses a challenge to pre-clinical development of simeprevir as a CID in pharmaceuticals. The inventors have found that grazoprevir, another HCV NS3/4A protease inhibitor, has a superior pharmacokinetic profile in mice, and have developed a new molecular switch for PPI control utilising grazoprevir as a CID. Use of grazoprevir is also advantageous from a regulatory perspective, as unlike simeprevir, it has been investigated in a clinical trial as a monotherapy. The inventors have also found that in some cases, grazoprevir can be replaced with glecaprevir without changing the other components of the molecular switch.

In a first aspect, provided herein is a Tn3 protein that binds specifically to an HCV NS3/4A protease (“T”) complexed with a small molecule (“SM”) [T-SM complex], wherein the small molecule is grazoprevir, glecaprevir or an analogue or derivative thereof.

The HCV NS3/4A protease may have the sequence set forth in SEQ ID NO: 26, or may be a derivative thereof as described below. The HCV NS3/4A protease is a small, monomeric protein that can be expressed cytoplasmically and has a limited number of endogenous human targets, therefore making it an ideal target protein. The HCV NS3/4A protease is also, interchangeably, referred to herein as the “target protein” (“T”).

Potential off-target activity caused by overexpression of the HCV NS3/4A protease can be mitigated by using a modified target protein that has attenuated protease activity compared to the native NS3/4A protease from which it is derived. Thus, in some embodiments, the target protein has attenuated protease activity compared to the native HCV NS3/4A protease.

For example, the target protein may contain one or more amino acid mutations compared to the HCV NS3/4A protease, for example at one or more amino acids selected from those at the positions corresponding to positions 72, 96, 112, 114, 154, 160 and 164 of SEQ ID NO: 26. For example, the target protein may have an amino acid mutation at the position corresponding to position 154 of SEQ ID NO: 26, such as a mutation to alanine. As described below, positions 72, 96, 112, 114, 154, 160 and 164 of SEQ ID NO: 26 correspond to positions 57, 81, 97, 99, 139, 145 and 149, respectively, of the full length NS3 protein set forth in SEQ ID NO: 34.
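The stated correspondence between the two numbering schemes amounts to a constant difference of 15 residues across the seven listed positions. A minimal Python sketch of that conversion follows. The function and constant names are illustrative and not part of the disclosure, and the offset is inferred only from the seven position pairs listed above, so it should not be assumed to hold for positions outside that list.

```python
# Position pairs as stated in the text:
# SEQ ID NO: 26 position -> full length NS3 (SEQ ID NO: 34) position.
SEQ26_TO_NS3 = {72: 57, 96: 81, 112: 97, 114: 99, 154: 139, 160: 145, 164: 149}

# Offset inferred from the listed pairs (an assumption, not stated in the source).
OFFSET = 15

def seq26_to_ns3(position: int) -> int:
    """Convert one of the listed SEQ ID NO: 26 positions to full length NS3 numbering."""
    if position not in SEQ26_TO_NS3:
        raise KeyError(f"position {position} is not one of the listed positions")
    return SEQ26_TO_NS3[position]

# Every listed pair is consistent with the inferred offset.
assert all(p - OFFSET == q for p, q in SEQ26_TO_NS3.items())

print(seq26_to_ns3(154))
```

For example, position 154 of SEQ ID NO: 26 maps to position 139 of the full length protein, which matches the S139A designation used in the figure legends.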

In some cases, it may be desirable that a competing small molecule (other than grazoprevir or glecaprevir) is able to bind the target protein in the target protein-small molecule (T-SM) complex such that the competing small molecule is capable of displacing grazoprevir from the T-SM complex. In this way, the second small molecule can decrease the half-life of the tripartite complex formed between the Tn3 protein, the viral protease and grazoprevir/glecaprevir. This may be desirable, for example, in situations where it is considered useful to use the second small molecule to speed up dissociation of the tripartite complex, e.g. in order to rapidly inhibit activity of a dimerization-inducible protein activated by formation of the tripartite complex.

Certain affinity reducing mutations were identified in WO 2021/009692 (which is herein incorporated by reference) that reduce the affinity of the HCV NS3/4A protease for simeprevir. The same mutations may reduce the affinity of the HCV NS3/4A protease for grazoprevir and allow other small molecules to “compete” with grazoprevir or glecaprevir and disrupt the tripartite complex formed. Thus, in some embodiments, the target protein may comprise an affinity reducing amino acid substitution at one or both of the amino acid positions corresponding to positions 151 and 183 of SEQ ID NO: 26. In some embodiments, the affinity reducing amino acid mutation at the position corresponding to position 151 of SEQ ID NO: 26 is a mutation to aspartic acid, asparagine or histidine (e.g. aspartic acid or asparagine) and the affinity reducing mutation at the position corresponding to position 183 of SEQ ID NO: 26 is to glutamic acid, glutamine or alanine (e.g. glutamic acid). The target protein may comprise the affinity reducing amino acid mutation in addition to other mutations described herein, such as the amino acid mutation at one or more amino acids selected from those at the positions corresponding to positions 72, 96, 112, 114, 154, 160 and 164 of SEQ ID NO: 26.

Described herein is the development and use of particular Tn3 proteins that bind to a complex between the HCV NS3/4A protease and grazoprevir.

Binding of the Tn3 protein (also and interchangeably referred to herein as “the binding protein”) to the T-SM complex forms a tripartite complex made up of the binding protein, target protein and grazoprevir and the formation of this tripartite complex can be controlled by the presence of grazoprevir. The controlled formation of the tripartite complex is useful as, for example, it permits the controlled interaction of polypeptides to which the target protein and binding member are fused.

In a second aspect, provided herein is a polypeptide comprising the binding protein (or binding molecule/binding member, BM) of the first aspect fused to an HCV NS3/4A protease (T) to form a BM-T fusion protein. This aspect may alternatively be seen as providing a BM-T fusion protein comprising the binding protein (BM) of the first aspect fused to an HCV NS3/4A protease (T).

As set out in WO 2021/009692, the approach described herein can be used where the target protein and binding protein are individually fused to polypeptides of interest (termed “component polypeptides”). In particular, the approach can be used to control the activity of proteins that require dimerization or clustering to drive their activity. Such proteins are termed herein “dimerization-inducible proteins” and include “split proteins”, “dimerization-deficient proteins” and “split complexes”. Split proteins are single proteins that can be segregated or split into two or more domains, rendering the component parts non-functional or minimally active, but the function or activity of which can be initiated or restored when the separated components are brought into close proximity. Examples include split fluorescent proteins (e.g. split GFP), split luciferases (e.g. NanoBiT) and split kinases. A further example describes a split transcription factor, whereby the distinct DNA binding domain (DBD) and transcription regulatory domain (TRD) (such as an activation domain (AD)) are separated such that the individual transcription factor domains alone cannot initiate transcription. Only when the two domains are brought into close proximity are they able to reconstitute the transcriptional activation of relevant genes (i.e. they form a functional “transcription factor”).

Dimerization-deficient proteins are proteins that require dimerization for activity, but their endogenous dimerization capacity has been disabled e.g. via mutation or removal of the dimerization domain(s). One such example is the iCasp9 molecule, a caspase-9 protein that has had its dimerization (CARD) domain removed.

Split complexes denote either single proteins or 2 or more different proteins that are not optimally functional or function differently, until they are brought into close proximity, or “clustered”. One such example is the split chimeric antigen receptor (CAR). Here, specific intracellular domains of the CAR that are responsible for the activation of cell signalling are physically separated such that full cellular activation is prevented. Once the domains are brought into close proximity, cell signalling is activated (i.e. they form a fully functional CAR).

Thus in a third aspect, provided herein is a dimerization-inducible protein, comprising: a) a first fusion protein, comprising a first component polypeptide fused to the binding protein of the first aspect; and b) a second fusion protein, comprising a second component polypeptide fused to an HCV NS3/4A protease.

In some embodiments, the dimerization-inducible protein is a split transcription factor. In such embodiments, either: (1) the first component polypeptide comprises a DNA binding domain and is fused to the target protein to form a DBD-T (DBD-target protein) fusion protein; and the second component polypeptide comprises a transcriptional regulatory domain and is fused to the binding protein to form a TRD-BM (transcriptional regulatory domain-binding molecule) fusion protein; or (2) the first component polypeptide comprises a transcriptional regulatory domain and is fused to the target protein to form a TRD-T fusion protein; and the second component polypeptide comprises a DNA binding domain and is fused to the binding protein to form a DBD-BM fusion protein; wherein the first and second components form a transcription factor upon dimerization. In other embodiments, the dimerization-inducible protein is a split CAR. In such embodiments, either: (1) the first component polypeptide comprises a first co-stimulatory domain and is fused to the target protein; and the second component polypeptide comprises an intracellular signalling domain and is fused to the binding protein; or (2) the first component polypeptide comprises an intracellular signalling domain and is fused to the target protein, and the second component polypeptide comprises a first co-stimulatory domain and is fused to the binding protein.

In such embodiments, it may be that the component polypeptide comprising the first co-stimulatory domain further comprises an antigen-specific recognition domain and a transmembrane domain, and the component polypeptide comprising the intracellular signalling domain further comprises a transmembrane domain and a second co-stimulatory domain.

The first and second component polypeptides form a chimeric antigen receptor (CAR) upon dimerization.

In other embodiments, the dimerization-inducible protein is a dimerization-deficient caspase. In such embodiments, the first component polypeptide comprises a first caspase component and the second component polypeptide comprises a second caspase component, and the first and second component polypeptides form a caspase upon dimerization. In some such embodiments, the dimerization-deficient caspase is iCasp9.

In a fourth aspect, provided herein is a cell expressing the binding protein of the first aspect, the BM-T fusion protein of the second aspect or the dimerization inducible protein of the third aspect.

In a fifth aspect, provided herein is a nucleic acid encoding a polypeptide comprising the binding protein of the first aspect, the BM-T fusion protein of the second aspect or the dimerization-inducible protein of the third aspect.

In a sixth aspect, provided herein is an expression vector comprising an expression cassette encoding a polypeptide comprising the binding protein of the first aspect or the BM-T fusion protein of the second aspect. The binding protein or BM-T fusion protein may be encoded as a fusion protein further comprising a first or second component polypeptide as described in the third aspect above.

In a seventh aspect, provided herein is an expression vector comprising an expression cassette encoding: a) a first polypeptide comprising the binding protein of the first aspect; and b) a second polypeptide comprising the target protein.

In an eighth aspect, provided herein is an expression vector comprising: a) a first expression cassette encoding a first polypeptide comprising the binding protein of the first aspect; and b) a second expression cassette encoding a second polypeptide comprising the target protein.

In a ninth aspect, provided herein is a vector set comprising a first expression vector and a second expression vector, wherein the first expression vector comprises a first expression cassette and the second expression vector comprises a second expression cassette, and: a) the first expression cassette encodes a first polypeptide comprising the binding protein of the first aspect; and b) the second expression cassette encodes a second polypeptide comprising the target protein.

In a tenth aspect, provided herein is one or more expression vectors comprising: a) a first expression cassette encoding a first polypeptide comprising the binding protein of the first aspect; and b) a second expression cassette encoding a second polypeptide comprising the target protein.

In some embodiments of the sixth to tenth aspects, the expression vector is a viral vector, such as an AAV vector.

In an eleventh aspect, provided herein is a viral particle comprising a viral genome comprising an expression cassette as defined in the sixth or seventh aspect.

In a twelfth aspect, provided herein is a set of viral particles comprising: a) a first viral particle comprising a first viral genome, wherein the first viral genome comprises a first expression cassette as defined in any one of the eighth to tenth aspects; and b) a second viral particle comprising a second viral genome, wherein the second viral genome comprises a second expression cassette as defined in any one of the eighth to tenth aspects.

In a thirteenth aspect, provided herein is one or more viral particles comprising one or more viral genomes comprising: a) a first expression cassette as defined in any one of the eighth to tenth aspects; and b) a second expression cassette as defined in any one of the eighth to tenth aspects.

In a fourteenth aspect, provided herein is a cell comprising the nucleic acid of the fifth aspect, the expression vector of the sixth, seventh or eighth aspect, the vector set of the ninth aspect or the one or more expression vectors of the tenth aspect. The cell may be for administration to a subject (e.g. a human subject), and may be e.g. allogeneic or autologous. The cell may be e.g. an immune cell or a stem cell (including an induced pluripotent stem cell (iPSC)). The cell may express the binding protein, target protein, BM-T fusion protein or dimerization-inducible protein described herein. Also provided herein are methods of genetically modifying a cell to produce cells expressing the binding protein, BM-T fusion protein or dimerization-inducible protein described herein, the method comprising administering expression vectors or viral particles as described above to the cell.

The approach described herein where the target protein and binding protein are fused to component polypeptides of a split transcription factor could have use in gene therapy methods that involve regulating the expression of a desired expression product (e.g. a desired polypeptide) in a cell.

Thus, in a further aspect, provided herein is a method of regulating the expression of a desired expression product (or a target gene) in a cell, comprising: i) expressing the dimerization-inducible protein defined herein in the cell, wherein the first and second components form a transcription factor upon dimerization, and wherein the DNA binding domain binds to a target sequence in the cell such that the transcription factor is capable of regulating expression of the desired expression product in the cell; and ii) administering grazoprevir or glecaprevir, or an analogue or derivative thereof, to the cell in order to regulate expression of the desired expression product.

In some embodiments of the method, the DNA binding domain target sequence is located in a promoter that is operably linked to a coding sequence for the desired expression product.

The method may involve delivery of the expression cassette or cassettes encoding the dimerization-inducible protein to control expression of a desired expression product that is also delivered exogenously to the cell.

Thus, in some embodiments, the method comprises administering an additional expression cassette to a cell, wherein the additional expression cassette encodes the desired expression product, and wherein the additional expression cassette comprises the target sequence of the DNA binding domain.

Alternatively, the method may involve delivery of the expression cassette or cassettes encoding the dimerization-inducible protein to control expression of a desired expression product that is already present as part of the genome of the cell (i.e. an endogenous desired expression product).

Thus, in other embodiments of the method, the target sequence is located in the genome of the cell. This method thus provides a method of treating a subject in need thereof by gene therapy, the method comprising administering to the subject the expression vector, vector set or one or more expression vectors of the sixth to tenth aspects, or the viral particle, viral particle set or one or more viral particles of the eleventh to thirteenth aspects, and optionally an additional expression cassette encoding the target sequence for the DNA binding domain, and grazoprevir or glecaprevir (or an analogue or derivative thereof). The additional expression cassette may be encoded on an expression vector according to the sixth to tenth aspects, or on a separate expression vector, or in the genome of a viral particle according to the eleventh to thirteenth aspects, or in the genome of a separate viral particle.

Also provided is grazoprevir or glecaprevir (or an analogue or derivative thereof) for use in a method of regulating the expression of a target gene in a cell in a subject, wherein the cell expresses a dimerization-inducible protein as described herein, wherein the first and second components form a transcription factor upon dimerization, and the DBD binds to a target sequence in the cell such that the transcription factor is capable of regulating the expression of the target gene, and the method comprises administering grazoprevir or glecaprevir (or an analogue or derivative thereof) to the subject. The method may further comprise administering the expression vector, vector set or one or more expression vectors of the sixth to tenth aspects, and optionally an additional expression cassette encoding the target sequence for the DNA binding domain, to the subject. Alternatively the method may further comprise administering the viral particle, viral particle set or one or more viral particles of the eleventh to thirteenth aspects, and optionally an additional expression cassette encoding the target sequence for the DNA binding domain, to the subject.

Furthermore, the approach described herein could have use in methods of cellular therapy. Such methods typically involve taking cells from an individual (autologous cells), modifying the cells ex vivo to express a particular protein, e.g. a dimerization-inducible protein, and administering the cells back into the individual. Alternatively, the cells may be taken from a different individual to the subject to be treated (allogeneic cells). Allogeneic cells may be modified and administered to the subject to be treated in the same manner as autologous cells.

Thus, in another aspect the present disclosure provides a method of treatment of a subject by cellular therapy, the method comprising: i) administering a cell comprising the expression cassette or cassettes encoding the dimerization-inducible protein as defined herein to a subject in need thereof; and ii) administering grazoprevir or glecaprevir, or an analogue or derivative thereof, to the subject. In some embodiments of this aspect, the dimerization-inducible protein is a split CAR. Thus it can be seen that in a further aspect, provided herein is the expression vector, vector set or one or more expression vectors of the sixth to tenth aspects, or the viral particle, viral particle set or one or more viral particles of the eleventh to thirteenth aspects, for use in a method of treatment of a subject, i.e. a method of treatment of a human or animal body. This aspect may alternatively be seen as providing the expression vector, vector set or one or more expression vectors of the sixth to tenth aspects, or the viral particle, viral particle set or one or more viral particles of the eleventh to thirteenth aspects, for use in therapy.

In embodiments, provided herein is the expression vector, vector set or one or more expression vectors of the sixth to tenth aspects, or the viral particle, viral particle set or one or more viral particles of the eleventh to thirteenth aspects for use in gene therapy.

It can also be seen that in a further aspect, provided herein is the cell of the fourteenth aspect for use in a method of treatment of a subject, i.e. a method of treatment of a human or animal body. This aspect may alternatively be seen as providing the cell of the fourteenth aspect for use in therapy.

In embodiments, provided herein is the cell of the fourteenth aspect for use in cellular therapy.

The approach described herein could also have use in methods of inducing cell death in a target cell. Such methods typically involve expression of a dimerization-inducible protein in the target cell, which dimerization-inducible protein induces cell death upon dimerization. The dimerization-inducible protein for this purpose may be a caspase, the functionality of which is initiated only upon dimerization of the dimerization-inducible protein, e.g. the dimerization-inducible protein may be a dimerization deficient apoptotic protein, such as iCasp9.

Thus provided herein is a method of inducing death of a target cell, wherein the target cell expresses a dimerization-inducible protein of the third aspect which, upon dimerization, forms a functional caspase, the method comprising contacting the cell with grazoprevir or glecaprevir, or an analogue or derivative thereof.

Such methods find particular utility in the context of cellular therapy, where overactivity of administered immune cells can result in cytokine release syndrome (CRS) and/or immune effector cell-associated neurotoxicity syndrome (ICANS). To treat these conditions it is desired to develop a “kill switch” whereby, in the event of severe adverse effects, the administered immune cells can be induced to enter apoptosis. A dimerization-inducible protein which, upon dimerization, yields a functional caspase, can provide such a kill switch.

Thus provided herein is a method of inducing death of a target cell in a subject, wherein the target cell expresses a dimerization-inducible protein of the third aspect which, upon dimerization, forms a functional caspase, the method comprising administering grazoprevir or glecaprevir, or an analogue or derivative thereof, to the subject.

Also provided herein is grazoprevir or glecaprevir (or an analogue or derivative thereof) for use in a method of inducing death of a target cell in a subject, wherein the target cell expresses a dimerization-inducible protein of the third aspect which, upon dimerization, forms a functional caspase, and the method comprises administering grazoprevir or glecaprevir (or an analogue or derivative thereof) to the subject.

In a further aspect, provided herein is a kit comprising grazoprevir or glecaprevir, or an analogue or derivative thereof, and: a) the expression vector, vector set or one or more vectors of any of the sixth to tenth aspects; b) the viral particle, viral particle set or one or more viral particles of any of the eleventh to thirteenth aspect; c) the cell of the fourth or fourteenth aspect; or d) the nucleic acid of the fifth aspect.

As set out above, it is also possible to make use of an additional small molecule (termed herein as a “competing small molecule”) to induce disassembly of a tripartite complex formed between the binding protein, target protein and grazoprevir/glecaprevir. This may be useful, for example, where it is desirable to rapidly inactivate grazoprevir-induced dimerization, such as in order to turn off transgene expression or therapeutic activity associated with activity of a dimerization-inducible protein.

Thus, in another aspect, provided herein is a method of inducing disassembly of a tripartite complex, the method comprising administering a competing small molecule to a cell comprising the tripartite complex, wherein the tripartite complex is formed between a binding protein of the first aspect and a complex formed of a target protein and a small molecule (T-SM complex), and wherein the competing small molecule is capable of binding the target protein in the T-SM complex and displacing grazoprevir or glecaprevir from the T-SM complex.

Methods of determining whether the competing small molecule is capable of binding to the target protein in the T-SM complex and displacing grazoprevir include assays where a pre-formed tripartite complex is generated and the ability of the binding protein to bind the T-SM complex is measured (e.g. by a homogeneous time-resolved fluorescence (HTRF) binding assay) as increasing concentrations of the competing small molecule are added. A competing small molecule may be capable of displacing grazoprevir from the T-SM complex if it is capable of inhibiting binding of the binding protein to the T-SM complex by at least 50 %, by at least 75 %, by at least 80 %, by at least 85 %, by at least 90 % or by at least 95 % when measured using the HTRF binding assay. The competing small molecule may be another HCV NS3/4A protease inhibitor, such as simeprevir, asunaprevir, paritaprevir, vaniprevir or danoprevir.
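The inhibition thresholds above can be applied to assay readings as in the following Python sketch. It is illustrative only: the function names, the optional background subtraction and the example signal values are assumptions, and the text does not prescribe how the binding signal is normalised.

```python
# Inhibition thresholds (in %) listed in the text.
THRESHOLDS = (50, 75, 80, 85, 90, 95)

def percent_inhibition(signal_without, signal_with, background=0.0):
    """Percent reduction in binding signal caused by the competing small molecule.

    signal_without: binding signal of the pre-formed tripartite complex alone.
    signal_with:    binding signal after adding the competing small molecule.
    background:     signal with no binding (hypothetical control, defaults to 0).
    """
    span = signal_without - background
    if span <= 0:
        raise ValueError("signal_without must exceed background")
    residual = signal_with - background
    return 100.0 * (span - residual) / span

def thresholds_met(inhibition):
    """Return the listed thresholds that a given percent inhibition reaches."""
    return [t for t in THRESHOLDS if inhibition >= t]

# Hypothetical readings in arbitrary signal units.
inhibition = percent_inhibition(signal_without=1000.0, signal_with=120.0)
print(inhibition, thresholds_met(inhibition))
```

In practice the comparison would be made at each concentration of the competing small molecule in the titration, so that the concentration at which a given threshold is first reached can be read off.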

The aspects and features described above may be combined, except where such a combination is clearly impermissible or expressly avoided.

Figure Legends

Figure 1 shows a comparison of the pharmacokinetics in Balb/c mice of simeprevir dosed orally at 200 mg/kg and grazoprevir dosed orally at 20 mg/kg. Both the Cmax and exposure of unbound compound as determined by AUC are greater for grazoprevir compared to a 10-fold higher dose of simeprevir.

Figure 2 shows the HTRF data obtained with a panel of Tn3 molecules that demonstrate HCV NS3/4A PR (S139A):grazoprevir-selective binding (and the HCV NS3/4A PR:simeprevir complex-specific Tn3 PRSIM23 as a control). (Top) Tn3s bind to HCV NS3/4A PR (S139A) in the presence of grazoprevir but (Bottom) no binding is observed in the absence of grazoprevir.

Figure 3 shows the dose-dependent induction of complex formation between HCV NS3/4A PR (S139A) and selected Tn3 molecules with a panel of small molecule HCV protease inhibitors.

Figure 4A depicts a schematic of the split transcription factor assay. HCV NS3/4A PR (S139A) and selected Tn3 molecules are fused to the DNA-binding domain (DBD) and activation domain (AD) of a split transcription factor. In the presence of grazoprevir, the transcription factor is reconstituted and activates expression of a luciferase reporter gene.

Figure 4B shows the dose-dependent activation of luciferase gene expression obtained from the split transcription factor assay.

Figure 5A depicts three versions of the split transcription factor constructs, with 1, 2 or 3 copies of the Tn3 PRGRZ103 fused to the DBD, enabling recruitment of more AD domains, and associated regulatory molecules.

Figure 5B shows the data obtained from transfection of plasmids encoding one, two or three copies of PRGRZ103 Tn3 fused to the DBD, HCV NS3/4A PR (S139A) fused to the AD, and the inducible IL-2 transgene, indicating that an increase in copy number has an additive effect on the IL-2 expression induced by grazoprevir.

Figure 6A depicts a schematic of the split CAR and its activation.

Figure 6B shows the dose-dependent increase in IL-2 release, as a marker of T cell activation, from cells expressing a grazoprevir-based switch-containing split CAR in the presence of grazoprevir in antigen-positive HepG2 but not antigen-negative A375 cells.

Figure 7A shows the design of the grazoprevir-induced kill switch. Addition of grazoprevir results in formation of the PRGRZ103 Tn3-HCV NS3/4A PR heterodimer, resulting in dimerization of caspase-9 (Casp9) activation domains and subsequent induction of apoptosis.

Figure 7B shows phase contrast images of HCT116 cells stably transduced with the kill switch showing rapid cell death upon treatment with grazoprevir.

Figure 7C shows the normalized viability of HCT116 cells stably transduced with the kill switch 48 hours after grazoprevir dosing.

Figure 7D shows caspase-3 activity in kill switch-transduced HCT116 +/- 10 nM grazoprevir relative to treated untransduced HCT116 cells.

Figure 8 shows that the grazoprevir-based switch can be used to regulate a caspase-9-based kill switch in vivo. Complete tumour regression is induced by a single 50 mg/kg dose of grazoprevir in a kill-switch-transduced HCT116 xenograft mouse model. The dashed line represents the day of grazoprevir dosing. Each line represents an individual mouse.

Figure 9 depicts a schematic of the homology-directed repair template encoding CD47 and the grazoprevir-inducible kill switch, separated by a “self-cleaving” P2A peptide to enable CRISPR/Cas9 knock-in at the B2M locus. LHA = left homology arm; RHA = right homology arm; hGH pA = human growth hormone polyA sequence.

Figure 10 demonstrates successful knock-in (KI) of the HDRT CD47 and kill switch-encoding DNA cassette in Clone#76 by PCR (expected size = 550-600 bp).

Figure 11 depicts killing of clone#76 ES cells by both cell density (quantified as % cytolysis) (A) and light microscopy (B) following treatment with grazoprevir.

Figure 12 depicts a time course of cell killing in ES-derived differentiated endothelial cells, following grazoprevir treatment.

Detailed Description

Binding Protein

Herein, the terms “binding protein”, “binding member” and “binding molecule” are used interchangeably, along with the abbreviation “BM”. As used herein “binding member” refers to a Tn3 protein (or polypeptide) that specifically binds to the target protein-small molecule complex (T-SM complex). The term "specific" may refer to the situation in which the binding member will not show any significant binding to molecules other than the T-SM complex. Such molecules are referred to as “non-target molecules” and include the target protein alone and the small molecule alone, e.g. the target protein or grazoprevir when not part of the T-SM complex.

The binding member may be considered to not show any significant binding to a non-target molecule if the extent of binding to a non-target molecule is less than about 10 % of the binding of the binding member to the T-SM complex as measured, e.g., by isothermal calorimetry, ELISA, surface plasmon resonance (SPR), Bio-Layer Interferometry (BLI), homogeneous time-resolved fluorescence (HTRF), MicroScale Thermophoresis (MST), or by a radioimmunoassay (RIA). The extent of binding to a non-target molecule may be less than about 5 % or less than about 1 % of the binding of the binding member to the T-SM complex.

In some embodiments, the binding member exhibits no significant binding to the target protein alone and/or grazoprevir alone. That is to say, the binding member may bind the target protein alone to less than 10 %, e.g. less than 5 % or less than 1 %, of the extent to which it binds the T-SM complex; and/or the binding member may bind grazoprevir alone to less than 10 %, e.g. less than 5 % or less than 1 %, of the extent to which it binds the T-SM complex.
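The relative-binding criterion set out above can be expressed as a simple check. The following is a minimal sketch only; the function name, the dictionary layout and the example readings are hypothetical and are assumed to come from a single assay format (for example, one of the methods listed above) so that the values are directly comparable.

```python
def is_specific(binding_complex, binding_non_targets, max_fraction=0.10):
    """True if every non-target signal is below max_fraction of the T-SM complex signal.

    binding_complex: extent of binding to the T-SM complex (arbitrary assay units).
    binding_non_targets: mapping of non-target name -> extent of binding, same units.
    max_fraction: 0.10, 0.05 or 0.01 for the 10 %, 5 % and 1 % criteria.
    """
    if binding_complex <= 0:
        raise ValueError("binding to the T-SM complex must be positive")
    return all(
        value / binding_complex < max_fraction
        for value in binding_non_targets.values()
    )


# Hypothetical readings from a single assay format.
readings = {"target protein alone": 40.0, "grazoprevir alone": 5.0}
print(is_specific(1000.0, readings, max_fraction=0.10))
```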

In some embodiments, where the extent of binding is measured by HTRF, the binding member described herein binds to the T-SM complex with an affinity that is at least 2-fold greater than the affinity towards another, non-target molecule, e.g. the target protein alone or grazoprevir alone. In some embodiments, the binding member binds to the T-SM complex with an affinity that is at least 3-, 5-, 10- or 20-fold greater than its affinity towards another, non-target molecule. Alternatively, the binding specificity may be reflected in terms of binding affinity, where the binding member described herein binds to the T-SM complex with an affinity that is greater, e.g. at least 10-fold greater, than the affinity towards another, non-target molecule. Thus the binding protein may bind the T-SM complex with a higher affinity than it binds to either the target protein (i.e. an HCV NS3/4A protease) alone or grazoprevir alone. Binding affinity may be measured by surface plasmon resonance, e.g. Biacore. In some embodiments, the binding member binds to its target molecule with an affinity that is at least 10-, 50-, 100-, 1000- or 10000-fold greater than its affinity towards another, non-target molecule. In particular, the binding member may bind to the T-SM complex with an affinity that is at least 10-, 50-, 100-, 1000- or 10000-fold greater than its affinity towards either the target protein alone and/or grazoprevir alone.

Binding affinity is typically measured by KD (the equilibrium dissociation constant between the binding member and its target). As is well understood, the lower the KD value, the higher the binding affinity of the binding member. For example, a binding member that binds to the T-SM complex with a KD of 1 nM would be considered to be binding the T-SM complex with an affinity that is greater than a binding member that binds with a KD of 100 nM. Similarly, a binding member that binds to the T-SM complex with a KD of 1 nM and to a non-target molecule with a KD of 100 nM would be considered to bind the T-SM complex with a greater affinity than the non-target molecule.
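Because a lower KD corresponds to a higher affinity, a fold difference in affinity can be expressed as the ratio of the two KD values. The sketch below illustrates this using the 1 nM and 100 nM figures from the example above; the function names are illustrative assumptions, and both KD values are assumed to be in the same units.

```python
def fold_selectivity(kd_complex, kd_non_target):
    """Fold-greater affinity for the T-SM complex than for a non-target molecule.

    Both KD values must be in the same units (e.g. nM). Lower KD means higher
    affinity, so the ratio is non-target KD divided by complex KD.
    """
    if kd_complex <= 0 or kd_non_target <= 0:
        raise ValueError("KD values must be positive")
    return kd_non_target / kd_complex


def meets_fold_threshold(kd_complex, kd_non_target, min_fold=10.0):
    """True if the affinity for the complex is at least min_fold greater."""
    return fold_selectivity(kd_complex, kd_non_target) >= min_fold


# KD of 1 nM for the T-SM complex versus 100 nM for a non-target molecule.
print(fold_selectivity(1.0, 100.0))
```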

The binding member may bind to the T-SM complex with an affinity having a KD equal to or lower than 50 nM, 25 nM, 20 nM, 15 nM or 10 nM. The binding member may bind to the target protein alone or grazoprevir alone with an affinity having a KD equal to or higher than 500 nM, 1 µM, 10 µM, 100 µM, or 1 mM. Binding affinity may be measured by SPR, e.g. by Biacore. The binding member may show minimal or no binding to the target protein alone and/or to grazoprevir alone when measured by SPR. The binding member may thus exhibit no detectable binding to the target protein alone and/or to grazoprevir alone, by which is meant no binding of the binding protein to the target protein and/or grazoprevir can be detected by SPR. In some embodiments, the binding member may exhibit no binding at all to the target protein and/or grazoprevir.

In some embodiments, the binding member specifically binds the T-SM complex at a site that is only present on the T-SM complex and not on the target protein alone or grazoprevir alone. For example, the binding member may bind to a site of the T-SM complex comprising at least a portion of the small molecule and a portion of the target protein. Alternatively, the formation of a T-SM complex may induce a conformational change in the target protein that results in the formation of a new binding site that is specifically bound by the binding member. Methods of determining the binding site in the T-SM complex of a binding member include X-ray crystallography, peptide scanning, site-directed mutagenesis mapping and mass spectrometry.

The binding member may specifically bind the T-SM complex by forming interactions with at least one of the following residues of the target protein: those corresponding to Tyr71, Gly75, Thr76, Val93 and Asp94 of SEQ ID NO: 26. The binding member may form interactions with 1, 2, 3, 4 or most preferably all 5 of these residues. The binding member may additionally specifically bind the T-SM complex by forming interactions with one or more functional groups of grazoprevir. At least some of these interactions may be hydrophobic interactions and/or water-mediated interactions. Interactions can be determined using e.g. X-ray crystallography.

As noted above, the binding member is a Tn3 protein. That is to say, the binding member is a Tn3 protein that binds specifically to an HCV NS3/4A protease complexed with the small molecule, i.e. grazoprevir or glecaprevir, or an analogue or derivative thereof. Such Tn3 proteins are termed herein HCV NS3/4A PR:grazoprevir complex-specific binding (PRGRZ) molecules. A number of suitable PRGRZ Tn3 proteins are demonstrated in the Examples. Generally, the Tn3 protein binds specifically to the HCV NS3/4A protease complexed with grazoprevir. As shown in Figure 3, the Tn3 proteins exemplified herein are capable of binding the HCV NS3/4A protease complexed with either grazoprevir or glecaprevir. The Tn3 protein thus may bind specifically to an HCV NS3/4A protease complexed with grazoprevir or glecaprevir, in which case the Tn3 may bind more strongly to an NS3/4A-grazoprevir complex or an NS3/4A-glecaprevir complex.

Tn3 Proteins

Tn3 proteins are based on the structure of a type III fibronectin module (FnIII) and are derived from the third FnIII domain of human tenascin C. The generation and use of Tn3 proteins is described for example in WO 2009/058379, WO 2011/130324, WO 2011/130328 and Gilbreth et al. 2014.

The Tn3 proteins and the native FnIII domain from tenascin C are characterized by the same three-dimensional structure, namely a beta-sandwich structure with three beta strands (A, B, and E) on one side and four beta strands (C, D, F, and G) on the other side, connected by six loop regions. These loop regions are designated according to the beta strands connected to the N- and C-terminus of each loop. Accordingly, the AB loop is located between beta strands A and B, the BC loop is located between beta strands B and C, the CD loop is located between beta strands C and D, the DE loop is located between beta strands D and E, the EF loop is located between beta strands E and F, and the FG loop is located between beta strands F and G. FnIII domains possess solvent-exposed loops tolerant of randomization, which facilitates the generation of diverse pools of protein scaffolds capable of binding specific targets with high affinity.

A wild-type Tn3 protein may comprise the sequence of SEQ ID NO: 36. In the wild-type Tn3 protein, the BC, DE and FG loops are located at positions 23 to 31, 51 to 56 and 75 to 80, respectively, wherein the amino acid numbering corresponds to SEQ ID NO: 36. The Tn3 protein may contain one, preferably two, more preferably three, even more preferably four of the stabilising mutations selected from the list consisting of I32F, D49K, E86I and T89K, wherein the amino acid numbering corresponds to SEQ ID NO: 36. The amino acid sequence of a wild-type Tn3 protein comprising all four stabilising mutations is set forth in SEQ ID NO: 37. The Tn3 protein may additionally contain one or more of the stabilising mutations described in Gilbreth et al. 2014 (see, in particular, Table 1 of Gilbreth et al. 2014).

Tn3 proteins can be subjected to directed evolution designed to randomize one or more of the loops which are analogous to the complementarity-determining regions (CDRs) of an antibody variable region. Such a directed evolution approach results in the production of antibody-like binding members with high affinities for targets of interest, e.g., the T-SM complexes described herein.

Thus, the Tn3 protein that specifically binds to the T-SM complex described herein may comprise the BC, DE and FG loops of PRGRZ093, PRGRZ094, PRGRZ103, PRGRZ112, PRGRZ114 or PRGRZ115. For example, the Tn3 protein may comprise the sequence of SEQ ID NO: 36 or SEQ ID NO: 37, where the BC, DE and FG loops located at positions 23 to 31, 51 to 56, and 75 to 80, respectively, are substituted for the BC, DE and FG loops of PRGRZ093, PRGRZ094, PRGRZ103, PRGRZ112, PRGRZ114 or PRGRZ115. A person skilled in the art would be readily able to determine the amino acid sequences of the BC, DE and FG loops of the PRGRZ Tn3 clones described herein. For example, the amino acid sequences of the PRGRZ Tn3 clones could be compared to the amino acid sequences of the wild-type Tn3 protein, e.g. those amino acid sequences set forth in SEQ ID NO: 36 or 37.
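The loop substitution described above can be sketched in code as follows. This is an illustrative sketch only: the scaffold and loop strings are placeholder characters rather than any of the sequences of the sequence listing, the scaffold length of 90 is an assumption, and the function name is hypothetical. Positions are 1-based and inclusive, following the numbering given above.

```python
# Loop positions (1-based, inclusive) as stated for the wild-type Tn3 scaffold.
LOOP_POSITIONS = {"BC": (23, 31), "DE": (51, 56), "FG": (75, 80)}


def graft_loops(scaffold, loops):
    """Replace the BC, DE and FG loops of a scaffold with the supplied loop sequences.

    Loops are replaced from the C-terminal end first so that a replacement loop
    of a different length does not shift the positions of loops still to be replaced.
    """
    seq = scaffold
    ordered = sorted(LOOP_POSITIONS.items(), key=lambda item: item[1][0], reverse=True)
    for name, (start, end) in ordered:
        seq = seq[: start - 1] + loops[name] + seq[end:]
    return seq


# Placeholder strings only; these are NOT real Tn3 or PRGRZ sequences.
placeholder_scaffold = "s" * 90
placeholder_loops = {"BC": "B" * 9, "DE": "D" * 6, "FG": "F" * 7}
grafted = graft_loops(placeholder_scaffold, placeholder_loops)
print(grafted)
```

In this toy example the FG loop is one residue longer than the segment it replaces, so the grafted sequence is one residue longer than the scaffold.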

The Tn3 sequence, amino acid positions and sequences of the BC, DE and FG loops of PRGRZ093, PRGRZ094, PRGRZ103, PRGRZ112, PRGRZ114 and PRGRZ115 are as set forth in Table 1 below.

In some embodiments, the Tn3 protein comprises the BC, DE and FG loops of: a) PRGRZ093, set forth in SEQ ID NOs: 8, 9 and 10, respectively; b) PRGRZ094, set forth in SEQ ID NOs: 11, 12 and 13, respectively; c) PRGRZ103, set forth in SEQ ID NOs: 14, 15 and 16, respectively; d) PRGRZ112, set forth in SEQ ID NOs: 17, 18 and 19, respectively; e) PRGRZ114, set forth in SEQ ID NOs: 20, 21 and 22, respectively; or f) PRGRZ115, set forth in SEQ ID NOs: 23, 24 and 25, respectively.

Table 1: PRGRZ Tn3 sequences

In some embodiments, the Tn3 protein comprises a number of sequence alterations, e.g. one, two, three, four, or five sequence alterations, in any one or more of the BC, DE and EF loops defined above. In some embodiments, the Tn3 protein comprises a number of sequence alterations, e.g. one, two, three, four, or five sequence alterations, outside the BC, DE and EF loops defined above. Such sequence alterations may be amino acid insertions, deletions or substitutions, e.g. conservative substitutions. In particular embodiments, the Tn3 protein comprises the BC, DE and FG loops of PRGRZ103, set forth in SEQ ID NOs: 14, 15 and 16, respectively.

In some embodiments, the Tn3 protein has or comprises an amino acid sequence having at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identity with the amino acid sequence of: a) PRGRZ093, set forth in SEQ ID NO: 1; b) PRGRZ094, set forth in SEQ ID NO: 2; c) PRGRZ103, set forth in SEQ ID NO: 3; d) PRGRZ112, set forth in SEQ ID NO: 4; e) PRGRZ114, set forth in SEQ ID NO: 5; or f) PRGRZ115, set forth in SEQ ID NO: 6.

In particular embodiments, the Tn3 protein has or comprises an amino acid sequence of: a) PRGRZ093, set forth in SEQ ID NO: 1; b) PRGRZ094, set forth in SEQ ID NO: 2; c) PRGRZ103, set forth in SEQ ID NO: 3; d) PRGRZ112, set forth in SEQ ID NO: 4; e) PRGRZ114, set forth in SEQ ID NO: 5; or f) PRGRZ115, set forth in SEQ ID NO: 6.

In particular embodiments, the Tn3 protein has or comprises an amino acid sequence having at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identity with the amino acid sequence of PRGRZ103, set forth in SEQ ID NO: 3. In particular embodiments, the Tn3 protein has or comprises the amino acid sequence set forth in SEQ ID NO: 3.
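As an illustration of how a percentage identity of the kind recited above might be computed, the sketch below counts identical columns in two sequences that have already been aligned (gaps shown as "-"). The choice of denominator (all alignment columns in which at least one sequence has a residue) is one common convention and is an assumption here; this section does not specify an alignment algorithm or identity convention. The example strings are arbitrary and are not sequences of the sequence listing.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored. A column counts as a
    match only if both sequences carry the same residue (not a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Arbitrary example strings differing at one of ten positions.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```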

Target Proteins and Small Molecules

The target protein described herein is the hepatitis C virus (HCV) NS3/4A protease (referred to herein as simply the “NS3/4A protease”). The NS3/4A protease used herein may be a native (i.e. wild type) NS3/4A protease, or may be derived from a native NS3/4A protease. The NS3/4A protease used is capable of binding grazoprevir and/or glecaprevir.

The term “derived from” in the context of the NS3/4A protease is intended to mean that the NS3/4A protease has a similar, but not necessarily identical, amino acid sequence to the native NS3/4A protease from which it is derived. An NS3/4A protease that is derived from a native NS3/4A protease may have (i.e. consist of) or comprise an amino acid sequence that is at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identical to the protein from which it is derived. An NS3/4A protease that is derived from a native NS3/4A protease may contain less than 50, less than 40, less than 30, less than 20, less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or less than 2 sequence alterations compared to the protein from which it is derived. For example, a target protein having the amino acid sequence set forth in SEQ ID NO: 27 is derived from the native NS3/4A protease having the sequence set forth in SEQ ID NO: 26. Additionally, the NS3/4A protease may have fewer amino acids (i.e. it may be a shorter protein) than the protein from which it is derived.

Viral proteases are enzymes encoded by the genetic material of viral pathogens. The normal function of these enzymes is to catalyse the cleavage of specific peptide bonds in viral polyprotein precursors or in cellular proteins. Examples of viral proteases include those encoded by hepatitis C virus (HCV), human immunodeficiency virus (HIV), herpesvirus, retrovirus and human rhinovirus (HRV) families. Certain viral proteases, along with examples of small molecule inhibitors of these proteases, are described for example in Patick & Potts, 1998.

The HCV NS3/4A protease (HCV NS3/4A PR) is monomeric, relatively small in size (21 kDa), can be expressed cytoplasmically, and is not found associated with DNA, making it an ideal candidate for use herein. The HCV NS3/4A protease may have the amino acid sequence of amino acid positions 1030-1206 of the amino acid sequence set forth in UniProt accession number A8DG50, the HCV polyprotein (version 2 of the sequence; sequence update 29 April 2008). As used herein, the terms “an HCV NS3/4A protease” and “the HCV NS3/4A protease” refer to a protein which is, or which is derived from a wild type HCV NS3/4A protease. In some embodiments, the HCV NS3/4A protease has or comprises the amino acid sequence set forth in SEQ ID NO: 26. A target protein that is derived from the HCV NS3/4A protease may have or comprise an amino acid sequence that is at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 %, or 99 % identical to the amino acid sequence set forth in SEQ ID NO: 26.

There are several small molecule inhibitors that are known to bind the HCV NS3/4A protease and have been approved for human use. Some of these are set forth in Table 2 below:

Table 2: HCV protease-inhibitors and their structures

The structures of the target proteins in complex with the respective small molecule are provided as PDB accession numbers, which correspond to the crystal structures available from the Protein Data Bank (PDB). The small molecule structures and chemical names are also provided as PDB accession numbers.

As set out above, WO 2021/009692 describes a system in which simeprevir is used as a CID for controlling PPIs. Herein, grazoprevir or glecaprevir is used, or a pharmacologically acceptable analogue or derivative of grazoprevir or glecaprevir. Thus the term “small molecule” (or “SM”) as used herein refers to grazoprevir and glecaprevir and pharmacologically acceptable analogues or derivatives of grazoprevir or glecaprevir. In particular embodiments, the small molecule is grazoprevir.

Grazoprevir is a small molecule that is administered orally, is cell-permeable, and has a pharmacokinetics (PK) profile that supports once-daily dosing. It is approved in combination with elbasvir (combination trade name Zepatier) for treatment of chronic hepatitis C, and has the following structure:

Glecaprevir is a small molecule that is administered orally, is cell-permeable, and has a pharmacokinetics (PK) profile that supports once-daily dosing. It is approved in combination with pibrentasvir (combination trade name Mavyret) for treatment of hepatitis C infection, and has the following structure:

Pharmacologically acceptable analogues and derivatives of grazoprevir and glecaprevir include compounds that differ from the “parent” small molecule (i.e. grazoprevir or glecaprevir having the structures set out above) but retain a similar antiviral activity as the parent small molecule and include tautomers, regioisomers, geometric isomers, and where applicable, stereoisomers, including optical isomers (enantiomers) and other stereoisomers (diastereomers) thereof, as well as pharmaceutically acceptable salts and derivatives (including prodrug forms) thereof where applicable, in context.

In some embodiments, the target protein is a modified NS3/4A protease with attenuated protease activity compared to the native NS3/4A protease from which it is derived (e.g. the HCV NS3/4A protease of SEQ ID NO: 26). Attenuated protease activity in this context means that the target protein has a lower enzymatic activity compared to the native NS3/4A protease. Enzymatic activity can be tested, for example, using a fluorogenic peptide cleavage assay as described in WO 2021/009692 or in Sabariegos et al., 2009. Briefly, the fluorogenic peptide cleavage assay involves incubating the target protein with a fluorogenic protease FRET substrate containing a donor-quencher pair such that cleavage of the peptide separates the donor from the quencher, emitting energy that can be detected at a certain wavelength, e.g. 490 nm.

In some embodiments, a modified NS3/4A protease is considered to have attenuated protease activity compared to the native NS3/4A protease if the target protein has an activity that is less than 10 % of the activity of the native NS3/4A protease as measured in an enzymatic activity assay, such as a fluorogenic peptide cleavage assay. In some embodiments, the target protein does not display any detectable viral activity when measured in an enzymatic activity assay, such as a fluorogenic peptide cleavage assay, when the target protein is at a concentration less than 1 nM, less than 10 nM, less than 100 nM, or less than 1 µM.

The target protein may comprise one or more amino acid mutations (e.g. substitutions, insertions and/or deletions) compared to the native NS3/4A protease from which it is derived (e.g. compared to SEQ ID NO: 26). The target protein comprising the one or more amino acid mutations should retain its ability to form a tripartite complex with the small molecule and binding member, which can be determined e.g. using a homogeneous time-resolved fluorescence (HTRF) assay as described in WO 2021/009692.

In some embodiments, the target protein comprises one or more amino acid mutations compared to the native NS3/4A protease, wherein the one or more amino acid mutations attenuate the viral activity of the target protein. The one or more amino acid mutations may be in the active site of the protease.

For example, the HCV NS3/4A protease contains a catalytic triad involving the amino acid residues H57, D81 and S139 of the HCV NS3/4A protease, as described in e.g. Grakoui et al. 1993; Eckart et al. 1993; and Bartenschlager et al. 1993. These amino acid residues correspond to positions H72, D96 and S154 of the amino acid sequence of SEQ ID NO: 26. Thus, the target protein may contain an amino acid mutation at one or more amino acids selected from the position corresponding to positions 72, 96 and 154 of the HCV NS3/4A protease of SEQ ID NO: 26. An amino acid position in the target protein that corresponds to a particular position in SEQ ID NO: 26 is the position in the target protein that aligns to the relevant position of SEQ ID NO: 26, if the sequence of the target protein is aligned with SEQ ID NO: 26. Thus for example, if a target protein sequence is aligned with SEQ ID NO: 26, the position in the target protein corresponding to position 154 of SEQ ID NO: 26 is the position in the target protein that aligns to position 154 of SEQ ID NO: 26. The position corresponding to position 154 of SEQ ID NO: 26 may also be at position 154 in the target protein, but may be at a different position if the target protein contains an insertion or deletion mutation relative to SEQ ID NO: 26.
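The notion of a "corresponding position" described above can be illustrated with a pair of aligned sequences. The sketch below is a hypothetical helper, not a method from this disclosure: it assumes the reference and target have already been aligned by some external tool, with gaps written as "-", and it returns the 1-based position in the target that aligns to a given 1-based position in the reference. The toy strings are arbitrary and unrelated to SEQ ID NO: 26.

```python
def corresponding_position(aligned_ref, aligned_target, ref_position):
    """Return the 1-based target position aligned to a 1-based reference position.

    Returns None if the reference residue aligns to a gap in the target
    (i.e. the target carries a deletion at that position).
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    ref_count = 0
    target_count = 0
    for ref_char, target_char in zip(aligned_ref, aligned_target):
        if ref_char != "-":
            ref_count += 1
        if target_char != "-":
            target_count += 1
        if ref_char != "-" and ref_count == ref_position:
            return target_count if target_char != "-" else None
    raise ValueError("ref_position is outside the reference sequence")


# Toy alignment: the target has one inserted residue after reference position 2,
# so reference position 4 corresponds to target position 5.
print(corresponding_position("AC-DEFG", "ACWDEFG", 4))
```

With no insertions or deletions the function returns the same number it was given, which matches the statement above that the corresponding position may be at the same position in the target protein.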

Other residues of the HCV NS3/4A protease that are known to be involved in viral activity include C97, C99, C145 and H149 (corresponding to positions C112, C114, C160 and H164 of SEQ ID NO: 26), as reported in e.g. Hikikata et al. 1993; and Stempniak et al. 1997. In some embodiments, the target protein has attenuated enzymatic activity and contains an amino acid mutation (e.g. substitution) at one or more amino acids selected from the positions corresponding to positions 72, 96, 112, 114, 154, 160 and/or 164 of the HCV NS3/4A protease of SEQ ID NO: 26. In particular embodiments, the target protein has attenuated enzymatic activity and comprises an amino acid mutation at the position corresponding to position 154 of the HCV NS3/4A protease of SEQ ID NO: 26, such as a mutation to alanine. By a mutation at this position is meant that the target protein contains a mutation at this position compared to the HCV NS3/4A protease of SEQ ID NO: 26. Thus in some embodiments, the target protein has or comprises an amino acid sequence having at least 90 % identity to SEQ ID NO: 26, wherein the amino acid at the position corresponding to position 154 of SEQ ID NO: 26 is not serine. In some embodiments, the target protein has or comprises an amino acid sequence having at least 90 % identity to SEQ ID NO: 26, wherein the amino acid at the position corresponding to position 154 of SEQ ID NO: 26 is alanine. In certain embodiments, the target protein has or comprises the amino acid sequence of SEQ ID NO: 27 (which corresponds to SEQ ID NO: 26 comprising the S154A mutation).

The full-length sequence of the NS3 protein is provided in SEQ ID NO: 34. The amino acid mutation described here at position 154 of SEQ ID NO: 26 corresponds to the position 139 of SEQ ID NO: 34.
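Each pair of positions given in this section (for example 57 and 72, 81 and 96, 139 and 154) differs by 15, so conversion between the two numbering schemes can be sketched as a fixed offset. The helper names below are hypothetical, and the assumption that the offset is uniform along the whole sequence is inferred from the stated pairs rather than stated explicitly.

```python
# Offset inferred from the position pairs stated in the text (e.g. S139 <-> S154).
NUMBERING_OFFSET = 15


def ns3_to_seq26(ns3_position):
    """Convert full-length NS3 numbering (SEQ ID NO: 34) to SEQ ID NO: 26 numbering."""
    if ns3_position < 1:
        raise ValueError("positions are 1-based")
    return ns3_position + NUMBERING_OFFSET


def seq26_to_ns3(seq26_position):
    """Convert SEQ ID NO: 26 numbering to full-length NS3 numbering (SEQ ID NO: 34)."""
    if seq26_position <= NUMBERING_OFFSET:
        raise ValueError("position has no counterpart in NS3 numbering under this offset")
    return seq26_position - NUMBERING_OFFSET


# Catalytic triad residues H57, D81 and S139 in NS3 numbering.
for position in (57, 81, 139):
    print(position, "->", ns3_to_seq26(position))
```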

A table identifying the potential amino acid mutations described above numbered according to the full length NS3 protein (SEQ ID NO: 34) and their corresponding positions in the NS3/4A protease amino acid sequence set forth in SEQ ID NO: 26 is set out in Table 3 below:

Table 3: NS3/4A protease mutations

In some embodiments the target molecule is an HCV NS3/4A derivative comprising a mutation at the position corresponding to position 154 of SEQ ID NO: 26 as described above, and further comprises a mutation at one or more of the positions corresponding to positions 72, 96, 112, 114, 154, 160 and 164 of SEQ ID NO: 26.

The target protein and small molecule interact to form a complex between the target protein and the small molecule (SM) referred to herein as a T-SM complex. In some embodiments the small molecule, e.g. grazoprevir, binds to the target protein with a Kd that is lower than 1 mM, preferably lower than 500 nM, more preferably lower than 200 nM, even more preferably lower than 100 nM, or yet more preferably lower than 50 nM, when measured for example using surface plasmon resonance or bio-layer interferometry. In some embodiments, the small molecule, e.g. grazoprevir, binds to the target protein with a Kd between 25 nM and 200 nM, between 25 nM and 100 nM, or between 25 and 75 nM, when measured for example using surface plasmon resonance or bio-layer interferometry.

It may be desirable to introduce amino acid mutations (e.g. substitutions) into the target protein in order to reduce the affinity of the small molecule, e.g. grazoprevir, for the target protein and allow a second small molecule to displace the first small molecule (e.g. grazoprevir) in the T-SM complex. Reducing the binding affinity of the first small molecule (e.g. grazoprevir) to the HCV NS3/4A protease by introducing amino acid modification(s) in the target protein allows for the use of different small molecule inhibitors of the HCV NS3/4A protease to disrupt the tripartite complex formed between HCV NS3/4A protease (S139A), the small molecule (e.g. grazoprevir) and the binding protein. Thus, in some embodiments the target protein (e.g. HCV NS3/4A protease) comprises one or more affinity reducing amino acid mutations (e.g. substitutions) compared to the viral protease from which it is derived (e.g. SEQ ID NO: 26), such that the small molecule (e.g. grazoprevir) binds the target protein with a lower affinity than it binds the parent target protein from which it is derived. The ‘parent target protein’ in this context lacks the one or more affinity reducing amino acid mutations but is otherwise identical to the target protein. The parent target protein may be the native viral protease from which the target protein is derived (e.g. the HCV NS3/4A protease of SEQ ID NO: 26), or the parent target protein may itself be derived from a viral protease (e.g. the parent target protein may be a modified viral protease, e.g. a modified HCV NS3/4A protease, such as the modified protease set forth in SEQ ID NO: 27).

The one or more affinity reducing amino acid mutations may result in the small molecule (e.g. grazoprevir) binding the target protein with at least a 1.5-fold lower affinity than it binds the parent target protein. The one or more affinity reducing amino acid mutations may result in the small molecule (e.g. grazoprevir) binding the target protein with an affinity that is between 1.5-fold and 10-fold lower than the small molecule binds the parent target protein, or between 1.5-fold and 5-fold lower than it binds the parent target protein. The one or more affinity reducing amino acid mutations may result in the small molecule (e.g. grazoprevir) binding the target protein with a Kd between 25 nM and 200 nM, between 25 and 100 nM, or between 25 and 75 nM, optionally where affinity is measured using bio-layer interferometry, such as using an Octet RED384. As demonstrated in WO 2021/009692, amino acid substitutions at positions 151 and 183 of the HCV NS3/4A protease of SEQ ID NO: 26 were found to reduce the affinity of simeprevir to the HCV NS3/4A protease and allow a second small molecule to disrupt the tripartite complex formed between the HCV NS3/4A protease, simeprevir and a binding protein specific for the HCV NS3/4A-simeprevir complex. Further, target proteins comprising these affinity reducing mutations were also demonstrated to retain functionality in dimerization-inducible proteins such as in split transcription factors. Amino acid positions 151 and 183 of SEQ ID NO: 26 correspond to amino acid positions 136 and 168, respectively, of the full length NS3 protein set forth in SEQ ID NO: 34.

Thus, in some embodiments, the target protein has an affinity reducing amino acid mutation (e.g. substitution) at one or both amino acids selected from those at the positions corresponding to positions 151 and 183 of SEQ ID NO: 26. In some embodiments, the affinity reducing amino acid mutation at the position corresponding to position 151 of SEQ ID NO: 26 is a mutation to aspartic acid, asparagine or histidine, and/or the affinity reducing mutation at the position corresponding to position 183 of SEQ ID NO: 26 is to glutamic acid, glutamine or alanine. In some embodiments, the affinity reducing amino acid mutation at the position corresponding to position 151 of SEQ ID NO: 26 is a mutation to aspartic acid or asparagine and/or the affinity reducing mutation at the position corresponding to position 183 of SEQ ID NO: 26 is to glutamic acid. The target protein may comprise the affinity reducing amino acid mutation in addition to another amino acid mutation described herein (e.g. in addition to one or more of the activity-attenuating amino acid mutations described above, such as a mutation at the position corresponding to position 154 of SEQ ID NO: 26, which may be a mutation to alanine).

In certain embodiments, the target protein has or comprises an amino acid sequence having at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identity to SEQ ID NO: 26, and comprises an amino acid at the position corresponding to position 151 of SEQ ID NO: 26 which is not lysine, and/or an amino acid at the position corresponding to position 183 of SEQ ID NO: 26 which is not aspartic acid.

In certain embodiments, the target protein has or comprises an amino acid sequence having at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identity to SEQ ID NO: 26, and comprises aspartic acid, asparagine or histidine (e.g. aspartic acid or asparagine) at the position corresponding to position 151 of SEQ ID NO: 26, and/or glutamic acid, glutamine or alanine (e.g. glutamic acid) at the position corresponding to position 183 of SEQ ID NO: 26. In certain embodiments, the target protein has or comprises the amino acid sequence of SEQ ID NO: 28 (which corresponds to SEQ ID NO: 26 comprising a K151D mutation). In other embodiments, the target protein has or comprises the amino acid sequence of SEQ ID NO: 30 (which corresponds to SEQ ID NO: 26 comprising a D183E mutation). In other embodiments, the target protein has or comprises the amino acid sequence of SEQ ID NO: 32 (which corresponds to SEQ ID NO: 26 comprising K151D and D183E mutations).

In certain embodiments the target protein has or comprises an amino acid sequence having at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identity to SEQ ID NO: 26, and comprises alanine at the position corresponding to position 154 of SEQ ID NO: 26 and aspartic acid, asparagine or histidine (e.g. aspartic acid or asparagine) at the position corresponding to position 151 of SEQ ID NO: 26. In certain embodiments, the target protein is derived from a viral protease having or comprising the amino acid sequence set forth in SEQ ID NO: 26, wherein the target protein differs from the viral protease in that it comprises alanine at the position corresponding to position 154 of SEQ ID NO: 26 and aspartic acid, asparagine or histidine (e.g. aspartic acid or asparagine) at the position corresponding to position 151 of SEQ ID NO: 26, and optionally 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 additional sequence alterations (e.g. functionally conservative substitutions) relative to SEQ ID NO: 26. In certain embodiments, the target protein has or comprises the amino acid sequence of SEQ ID NO: 29 (which corresponds to SEQ ID NO: 26 comprising K151D and S154A mutations).

In certain embodiments, the target protein has or comprises an amino acid sequence having at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identity to SEQ ID NO: 26 and comprises alanine at the position corresponding to position 154 of SEQ ID NO: 26 and glutamic acid, glutamine or alanine (e.g. glutamic acid) at the position corresponding to position 183 of SEQ ID NO: 26. In certain embodiments, the target protein is derived from a viral protease having or comprising the amino acid sequence set forth in SEQ ID NO: 26, wherein the target protein differs from the viral protease in that it comprises alanine at the position corresponding to position 154 of SEQ ID NO: 26 and glutamic acid, glutamine or alanine (e.g. glutamic acid) at the position corresponding to position 183 of SEQ ID NO: 26, and optionally 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 additional sequence alterations (e.g. functionally conservative substitutions), relative to SEQ ID NO: 26. In certain embodiments, the target protein has or comprises the amino acid sequence of SEQ ID NO: 31 (which corresponds to SEQ ID NO: 26 comprising S154A and D183E mutations).

In certain embodiments the target protein has or comprises an amino acid sequence having at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identity to SEQ ID NO: 26, and comprises alanine at the position corresponding to position 154 of SEQ ID NO: 26, aspartic acid, asparagine or histidine (e.g. aspartic acid or asparagine) at the position corresponding to position 151 of SEQ ID NO: 26, and glutamic acid, glutamine or alanine (e.g. glutamic acid) at the position corresponding to position 183 of SEQ ID NO: 26. In certain embodiments, the target protein is derived from a viral protease having or comprising the amino acid sequence set forth in SEQ ID NO: 26, wherein the target protein differs from the viral protease in that it comprises alanine at the position corresponding to position 154 of SEQ ID NO: 26, aspartic acid, asparagine or histidine (e.g. aspartic acid or asparagine) at the position corresponding to position 151 of SEQ ID NO: 26, and glutamic acid, glutamine or alanine (e.g. glutamic acid) at the position corresponding to position 183 of SEQ ID NO: 26, and optionally 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 additional sequence alterations (e.g. functionally conservative substitutions), relative to SEQ ID NO: 26. In certain embodiments, the target protein has or comprises the amino acid sequence of SEQ ID NO: 33 (which corresponds to SEQ ID NO: 26 comprising K151D, S154A and D183E mutations).
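Purely as an illustrative sketch, the substitution notation used above (e.g. K151D, S154A, D183E) can be applied to a parent sequence in Python as shown below. The parent sequence here is a hypothetical placeholder built only so that it carries lysine, serine and aspartic acid at positions 151, 154 and 183; it is not SEQ ID NO: 26. The function names are assumptions, and the identity calculation is a simple ungapped comparison of equal-length sequences rather than an alignment-based method.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply substitutions written as <parent residue><1-based position><new residue>."""
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognised substitution: {sub}")
        parent = match.group(1)
        position = int(match.group(2))
        new = match.group(3)
        # Guard against applying a substitution to the wrong parent residue.
        if residues[position - 1] != parent:
            raise ValueError(f"{sub}: parent has {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


def percent_identity_ungapped(a, b):
    """Percent identity of two equal-length sequences, without alignment."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical placeholder parent: K at 151, S at 154, D at 183 (not SEQ ID NO: 26).
TOY_PARENT = "G" * 150 + "K" + "GG" + "S" + "G" * 28 + "D" + "G" * 10

variant = apply_substitutions(TOY_PARENT, ["K151D", "S154A", "D183E"])
print(variant[150], variant[153], variant[182])                # D A E
print(percent_identity_ungapped(TOY_PARENT, variant) >= 90.0)  # True
```

The check on the parent residue is a design choice of the sketch: it rejects a substitution such as K151D if the supplied sequence does not carry lysine at position 151.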

BM-T Fusion Proteins

Provided herein is a polypeptide comprising a binding protein as described above fused to a target protein as described above. In particular, the polypeptide may comprise a binding protein as described above fused to an HCV NS3/4A protease as described above. The polypeptide is thus a fusion protein comprising the binding protein (Tn3 protein) and target protein (HCV NS3/4A protease) as described above. Such a fusion protein is referred to herein as a binding molecule-target protein fusion (BM-T fusion), and the polypeptide of this aspect is a polypeptide comprising a BM-T fusion protein, as set out below. This aspect may alternatively be seen as providing a BM-T fusion protein.

The binding protein within the BM-T fusion may be any binding protein as described above. Thus in some embodiments the binding protein is a binding protein (i.e. Tn3 protein) that binds specifically to an HCV NS3/4A protease complexed with grazoprevir. The target protein within the BM-T fusion may be any target protein as described above.

The binding protein and target protein may be fused directly to each other, i.e. without any intervening sequence, such that the C-terminal amino acid of the binding protein is fused to the N-terminal amino acid of the target protein, or vice versa. Alternatively, the binding protein and target protein may be joined indirectly. For instance, the binding protein and target protein may be fused via a linker, e.g. a linker as described below.

The target protein and binding protein may be arranged in either order, i.e. the target protein may be located N-terminal to the binding protein, or the binding protein may be located N-terminal to the target protein.

In some embodiments, the BM-T fusion protein comprises one or more additional domains, i.e. one or more domains in addition to the binding protein (which, in the context of the BM-T fusion protein, may be referred to as the binding protein domain) and the target protein (which, in the context of the BM-T fusion protein, may be referred to as the target protein domain). The BM-T fusion protein may comprise any number of additional domains, e.g. 1, 2, 3, 4, 5, 6, 7 or 8 or more additional domains. An additional domain may be located at the N-terminus of the BM-T fusion protein, at the C-terminus of the BM-T fusion protein, and/or between the target protein domain and the binding protein domain. Where the BM-T fusion protein comprises one additional domain, the fusion protein may be arranged, from N-terminus to C-terminus, as follows: N - binding protein domain - target protein domain - additional domain - C; N - binding protein domain - additional domain - target protein domain - C; N - target protein domain - binding protein domain - additional domain - C; N - target protein domain - additional domain - binding protein domain - C; N - additional domain - binding protein domain - target protein domain - C; or N - additional domain - target protein domain - binding protein domain - C.
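The six arrangements listed above are the orderings of three domains. As a minimal illustrative sketch (the function name is hypothetical), they can be enumerated in Python as follows:

```python
from itertools import permutations

DOMAINS = ("binding protein domain", "target protein domain", "additional domain")


def arrangements(domains=DOMAINS):
    """List every N-to-C ordering of the given domains."""
    return ["N - " + " - ".join(order) + " - C" for order in permutations(domains)]


for line in arrangements():
    print(line)
```

For three domains this yields six arrangements, matching those set out above.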

The one or more additional domains may be fused to the target protein domain and/or the binding protein domain directly, as described above, or indirectly via a linker, such as one of the linkers described below.

In some embodiments, the one or more additional domains is or comprises a caspase component or caspase domain. For example, the BM-T fusion protein may comprise one additional domain, which additional domain is or comprises a caspase domain. In certain embodiments, the caspase domain is a domain of a dimerization-deficient caspase, as described below. That is to say, the caspase domain may be a caspase domain which is activated upon dimerization, such that when dimerized it is capable of inducing apoptosis. Thus the caspase domain may be a caspase activation domain. The domain of a dimerization-deficient caspase may be a domain of a dimerization-deficient caspase-9, in particular the activation domain of a caspase-9. An exemplary caspase-9 activation domain is provided as amino acid residues 152-414 of the human caspase-9 amino acid sequence provided as NCBI accession number AAO21133.1 (version 1; last updated 1 December 2009), set out herein as SEQ ID NO: 35. The caspase-9 activation domain may comprise or consist of the amino acid sequence set forth in SEQ ID NO: 35, or an amino acid sequence having at least 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % identity to SEQ ID NO: 35.

Where the BM-T fusion protein comprises a caspase-9 activation domain which is a variant of SEQ ID NO: 35 (i.e. a caspase activation domain with a sequence having at least 90 % identity to SEQ ID NO: 35) the variant is a functional variant of SEQ ID NO: 35. A functional variant of SEQ ID NO: 35 is a variant which retains the capability to induce apoptosis upon dimerization. A functional variant of SEQ ID NO: 35 may retain at least 80 %, 85 %, 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % of the apoptosis-inducing activity of the native sequence of SEQ ID NO: 35.

In a particular embodiment, the BM-T fusion protein comprises an N-terminal binding protein domain and a C-terminal caspase domain (such as one of those described above), and a target protein domain located in between. An exemplary BM-T fusion protein comprising the PRGRZ103 Tn3 protein, the S154A NS3/4A target protein and the caspase-9 activation domain is set out in SEQ ID NO: 76.

Dimerization-Inducible Proteins

In some embodiments the target protein is fused to a first component polypeptide and the binding member is fused to a second component polypeptide. In particular embodiments the first and second component polypeptides form part of a dimerization-inducible protein.

As used herein, “dimerization-inducible protein” refers to a protein or complex comprising a first and second component polypeptide, wherein the first and second component polypeptides form a functional protein upon dimerization. The term “dimerization-inducible proteins” includes “split proteins”, “dimerization-deficient proteins” and “split complexes”. The term “component polypeptide” is intended to encompass both single-chain and multi-chain polypeptides.

In the context herein, the dimerization-inducible protein comprises (1) a first fusion protein, comprising a first component polypeptide fused to the binding protein described above; and (2) a second fusion protein, comprising a second component polypeptide fused to the target protein (e.g. HCV NS3/4A protease) as described above.

The first and second component polypeptides in the dimerization-inducible protein typically do not have activity or have less activity when separated, but upon dimerization are brought into close proximity and as such become active or have increased activity. Dimerization is induced by binding of the small molecule (e.g. grazoprevir) to the target protein, whereupon the binding protein binds the target protein-small molecule complex, bringing the first and second component polypeptides into proximity, resulting in dimerization. Thus it will be appreciated by the skilled person that the first and second fusion proteins contain complementary binding and target proteins, that is to say the binding protein of the first fusion protein specifically binds the target protein of the second fusion protein when the target protein of the second fusion protein is complexed with the small molecule, e.g. grazoprevir, or vice versa.

Examples of dimerization-inducible proteins include a split chimeric antigen receptor (split CAR; e.g. as described in Wu et al., 2015), split kinases (e.g. as described in Camacho-Soto et al., 2014), split transcription factors (e.g. as described in Taylor et al., 2010), split apoptotic proteins (e.g. split caspases as described in Chelur et al., 2007) and split reporter systems (e.g. as described in Dixon et al., 2016).

The dimerization-inducible protein will have increased activity when the binding member is bound to the T-SM complex. Increased activity can be compared to the activity observed when the binding member is not bound to the T-SM complex (e.g. because one or more of the target protein, small molecule or binding member is not present). In some embodiments, the increased activity observed when the binding member is bound to the T-SM complex is at least a 1.5-fold, 2-fold, 3-fold, 5-fold, 10-fold, 15-fold, 20-fold, 25-fold, 30-fold, 35-fold, 40-fold, 45-fold, 50-fold, 55-fold, 60-fold, 65-fold, 70-fold, 75-fold, 80-fold, 85-fold, 90-fold, 95-fold, 100-fold, 105-fold, 110-fold, 115-fold, or 120-fold increase in activity as compared to activity observed when the binding member is not bound to the T-SM complex.

Methods of measuring the activity of the dimerization-inducible protein will depend upon the particular dimerization-inducible protein being studied. Where the first and second component polypeptide form a chimeric antigen receptor (CAR) upon dimerization, CAR activity can be determined by measuring the immune cell activation and/or proliferation in response to the antigen recognised by the CAR. CAR activity can be measured by e.g. interleukin-2 (IL-2) production, e.g. by ELISA, after stimulation of the CAR by an antigen.

Where the first and second component polypeptide form a kinase upon dimerization, activity of the kinase can be measured by incorporation of phosphate, e.g. radioactive ³²P, into a peptide substrate as described in Camacho-Soto et al., 2014. Where the first and second component polypeptides form a transcription factor upon dimerization, transcriptional activity can be determined by measuring expression of a downstream desired expression cassette modulated by the split transcription factor. Where the first and second component polypeptides form a therapeutic protein upon dimerization, activity can be measured by using suitable assays for determining functional activity of the protein. Where the first and second component polypeptides form a caspase upon dimerization, caspase activity can be measured using a caspase activity assay or by measuring apoptotic cell death. Where the first and second component polypeptides form a reporter system upon dimerization, reporter activity can be determined by measuring expression of the reporter, e.g. a luciferase.

The first component polypeptide may be fused to the C-terminus or the N-terminus of the target protein or binding member. The second component polypeptide may be fused to the C-terminus or the N-terminus of the target protein or binding member. The component polypeptides may be fused to the target protein or binding member via a peptide linker. Suitable peptide linkers include those represented by [G]n, [S]n, [A]n, [GS]n, [GGS]n, [GGGS]n (SEQ ID NO: 38), [GGGGS]n (SEQ ID NO: 39), [GGSG]n (SEQ ID NO: 40), [GSGG]n (SEQ ID NO: 41), [SGGG]n (SEQ ID NO: 60), [SSGG]n (SEQ ID NO: 42), [SSSG]n (SEQ ID NO: 43), [GG]n, [GGG]n, [SA]n, [TGGGGSGGGGS]n (SEQ ID NO: 44), [SAGS]n (SEQ ID NO: 74) and combinations thereof, wherein n is an integer between 1 and 30. For example, n may be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or any number up to 30. That is to say, the peptide linker may contain any one of SEQ ID NOs: 38 to 44, or the other amino acids and short sequences set out above, as a repeating unit. The component polypeptide may be fused to the target protein or binding member directly, e.g. in the format: first component polypeptide - peptide linker - target protein. Alternatively, the component polypeptide may be fused to the target protein or binding member indirectly with one or more additional polypeptides separating the first component polypeptide from the target protein or binding member, e.g. first component polypeptide - additional polypeptide - peptide linker - target protein.
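As an illustrative sketch only, a linker of the form [unit]n can be generated in Python by repeating the unit n times. The function name is hypothetical, and the sketch assumes that the stated range for n (between 1 and 30) is inclusive of both end points.

```python
def build_linker(unit, n):
    """Return the repeating unit concatenated n times, e.g. [GGGGS]n."""
    # Assumption: the stated range "between 1 and 30" includes 1 and 30.
    if not isinstance(n, int) or not 1 <= n <= 30:
        raise ValueError("n must be an integer from 1 to 30")
    return unit * n


print(build_linker("GGGGS", 3))        # GGGGSGGGGSGGGGS
print(build_linker("TGGGGSGGGGS", 1))  # TGGGGSGGGGS
```

With n = 1, the repeating unit TGGGGSGGGGS gives the sequence of SEQ ID NO: 44 itself.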

In some embodiments, the first component polypeptide is fused to more than one target protein or binding member. In some embodiments, the second component polypeptide is fused to more than one target protein or binding member or a combination of both. For example, the first or second component polypeptide may be fused to 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 binding members. In some embodiments, the first or second component polypeptide is fused to between 2 and 10, or between 2 and 5 binding members. In particular embodiments, the first or second component polypeptide is fused to 3 binding members. For example, the first or second component polypeptide may be fused to 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 target proteins. In some embodiments, the first or second component polypeptide is fused to between 2 and 10, or between 2 and 5 target proteins. In particular embodiments, the first or second component polypeptide is fused to 3 target proteins. Where multiple binding members or target proteins are present, they may be fused to each other by peptide linkers, e.g. those peptide linkers described above.

In some embodiments, the first fusion protein of the dimerization-inducible protein is or comprises a BM-T fusion protein, as described above. In some embodiments, the second fusion protein of the dimerization-inducible protein is or comprises a BM-T fusion protein, as described above. In some embodiments, both the first fusion protein and the second fusion protein of the dimerization-inducible protein are or comprise a BM-T fusion protein, as described above. That is to say, the first and/or second fusion proteins may comprise a binding protein (or binding protein domain), a target protein (or target protein domain) and additional domain (which constitutes the first or second component polypeptide).

In some embodiments, the first and second fusion protein of the dimerization-inducible protein are identical, each constituting a BM-T fusion protein as described above, comprising the same component polypeptide as its additional domain. Identical first and second fusion proteins may be particularly suitable for use as split caspases, as described further below.

Split Transcription Factor

The dimerization-inducible protein may be a split transcription factor. In some embodiments, the first component polypeptide comprises a DNA binding domain (DBD); and the second component polypeptide comprises a transcriptional regulatory domain (TRD), and wherein the first component polypeptide and second component polypeptide form a transcription factor upon dimerization. By “form a transcription factor” is meant that the first and second component polypeptides are brought into close enough proximity that they are able to reconstitute the transcriptional regulatory activity of desired expression products. The dimerization-inducible protein will have increased transcriptional regulatory activity when the binding member is bound to the T-SM complex, compared to the transcriptional regulatory activity observed when the binding member is not bound to the T-SM complex.

The transcriptional regulatory domain may be a transcriptional activation domain that is capable of upregulating transcription of a gene that the split transcription factor binds to. Suitable transcriptional activation domains include the p65 subunit of nuclear factor kappa B (Bitko & Barik, J. Virol. 72:5610-5618 (1998); Doyle & Hunt, Neuroreport 8:2937-2942 (1997); Liu et al., Cancer Gene Ther. 5:3-28 (1998)), the replication and transcription activator (RTA; Lukac et al., J. Virol. 73, 9348-61 (1999)), the HSV VP16 activation domain (see, e.g., Hagmann et al., J. Virol. 71, 5952-5962 (1997)), nuclear hormone receptors (see, e.g., Torchia et al., Curr. Opin. Cell. Biol. 10:373-383 (1998)), or artificial chimeric functional domains such as VP64 (Beerli et al., (1998) Proc. Natl. Acad. Sci. USA 95:14623-33) and degron (Molinari et al., (1999) EMBO J. 18, 6439-6447). Additional exemplary activation domains include Oct 1, Oct-2A, Sp1, AP-2, and CTF1 (Seipel et al., EMBO J. 11, 4961-4968 (1992)) as well as p300, CBP, PCAF, SRC1, PvALF, AtHD2A and ERF-2 (Robyr et al. (2000), Mol. Endocrinol. 14:329-347; Collingwood et al. (1999) J. Mol. Endocrinol. 23:255-275; Leo et al. (2000) Gene 245:1-11; Manteuffel-Cymborowska (1999) Acta Biochim. Pol. 46:77-89; McKenna et al. (1999) J. Steroid Biochem. Mol. Biol. 69:3-12; Malik et al. (2000) Trends Biochem. Sci. 25:277-283; and Lemon et al. (1999) Curr. Opin. Genet. Dev. 9:499-504). Additional exemplary activation domains include, but are not limited to, OsGAI, HALF-1, C1, AP1, ARF-5,-6,-7, and -8, CPRF1, CPRF4, MYC-RP/GP, and TRAB1 and a modified Cas9 transactivator protein (Ogawa et al. (2000) Gene 245:21-29; Okanami et al. (1996) Genes Cells 1:87-99; Goff et al. (1991) Genes Dev. 5:298-309; Cho et al. (1999) Plant Mol. Biol. 40:419-429; Ulmason et al. (1999) Proc. Natl. Acad. Sci. USA 96:5844-5849; Sprenger-Haussels et al. (2000) Plant J. 22:1-8; Gong et al. (1999) Plant Mol. Biol. 41:33-44; Hobo et al. (1999) Proc. Natl. Acad. Sci. USA 96:15,348-15,353; and Perez-Pinera et al. (2013) Nature Methods 10:973-976). The transcriptional activation domain may comprise any combination of the above exemplary activation domains. In some embodiments multiple transcriptional activation domains may be used, e.g. tandem repeats of the same domain or fusions of different domains. In some embodiments, the transcriptional activation domain is VPR, a tripartite activator made up of the VP64, p65 and Rta domains. An example of a TRD-T fusion protein comprising VPR is set forth in SEQ ID NO: 45 (NS4A/3 PR S139A-VPR). Generation and use of VPR as a transcriptional activator is described for example in Chavez et al. 2015. In some embodiments, the transcriptional activation domain is HSF-1, optionally in combination with p65. The human p65 activation domain may be used, which has the amino acid sequence set forth in SEQ ID NO: 72.

Alternatively, the transcriptional regulatory domain may be a transcriptional repression domain that is capable of downregulating transcription of a gene that the split transcription factor binds to. Transcriptional repression domains include, but are not limited to, KRAB A/B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, members of the DNMT family (e.g., DNMT1, DNMT3A, DNMT3B), Rb, and MeCP2 (see, for example, Bird et al. (1999) Cell 99:451-454; Tyler et al. (1999) Cell 99:443-446; Knoepfler et al. (1999) Cell 99:447-450; and Robertson et al. (2000) Nature Genet. 25:338-342). Additional exemplary repression domains include, but are not limited to, ROM2 and AtHD2A (Chem et al. (1996) Plant Cell 8:305-321; and Wu et al. (2000) Plant J. 22:19-27).

The DNA binding domain may be any protein that binds to a target sequence in a sequence specific manner. For example, the DNA binding domain may be or may contain a transcription factor that binds to a target sequence in a sequence specific manner, or a DNA-binding fragment thereof. It is expected that any transcription factor, or DNA-binding fragment thereof, that is capable of binding to a target sequence in a specific manner can be used with the split transcription factors disclosed herein. The DNA-binding domain may be or comprise a naturally occurring DNA-binding domain such as a binding domain from a human transcription factor. For example, the DNA-binding protein may be any of the human transcription factors described in Vaquerizas et al. (2009) (e.g. any of those listed in Supplementary information S3), or a DNA-binding fragment thereof. For example, the DNA-binding protein may be a member of the C2H2 zinc-finger family, the homeodomain family or the helix-loop-helix family or a DNA-binding fragment thereof. In particular embodiments the DNA binding domain may be zinc finger homeodomain transcription factor 1 (ZFHD1). ZFHD1 contains zinc fingers 1 and 2 from the Zif268 transcription factor and the Oct-1 homeodomain. The design and construction of ZFHD1 is described for example in Pomerantz et al. 1995. The amino acid sequence of ZFHD1 is set forth in SEQ ID NO: 73.

The DNA binding domain may be or comprise a DNA-binding domain such as a zinc finger DNA binding domain, a TALE DNA binding domain, a DNA binding domain from a meganuclease (e.g. based on I-SceI) or a DNA binding domain from a CRISPR/Cas system. These binding domains can be engineered to bind a target sequence of choice, e.g. a target sequence in a target gene that is naturally present (endogenous) in a cell or a target sequence that has been provided in trans (e.g. as part of a third expression cassette). The engineering of zinc finger DNA binding domains to bind particular target sequences is described for example in US 6453242. In one embodiment, the DNA-binding domain is a TALE DNA binding domain. The engineering of TALE DNA binding domains to bind particular target sequences is described for example in WO 2010/079430. In one embodiment, the DNA binding domain is an engineered DNA binding domain from a meganuclease. The engineering of meganucleases to bind particular target sequences is described for example in WO 2007/047859. A meganuclease may be engineered such that it no longer cleaves DNA. In one embodiment, the DNA binding domain is an engineered DNA binding domain from a CRISPR/Cas system. The engineering of DNA binding domains from CRISPR/Cas systems to bind particular sequences is described for example in WO 2013/176772. CRISPR/Cas systems generally involve an RNA-guided endonuclease (e.g. Cas9) that is directed to a specific DNA sequence through complementarity between the associated guide RNA (gRNA) and its target sequence. Thus, the engineered DNA binding domain from a CRISPR/Cas system typically comprises a complex of an RNA-guided endonuclease (e.g. Cas9 or a variant thereof) and a guide RNA. Variants of Cas9 have been generated that lack the endonucleolytic activity but retain the capacity to interact with DNA. See for example Chavez et al. 2015 which describes the use of nuclease-null (dCas9) variants in a method of transcriptional regulation. Thus, the DNA-binding domain may include a nuclease null Cas9 variant which, upon addition of a particular gRNA specific for a target sequence, binds to the target sequence. The use of a dCas9 variant as part of a split transcription factor is described in Hill et al. 2018 and WO 2018/213848.

The binding member may be fused to the transcriptional regulatory domain or to the DNA binding domain.

In some embodiments:

(1) the first component polypeptide comprises a DNA binding domain and is fused to a target protein to form a DBD-T fusion protein; and the second component polypeptide comprises a transcriptional regulatory domain and is fused to a binding member to form a TRD-BM fusion protein; or

(2) the first component polypeptide comprises a transcriptional regulatory domain and is fused to a target protein to form a TRD-T fusion protein; and the second component polypeptide comprises a DNA binding domain and is fused to a binding member to form a DBD-BM fusion protein, wherein the DNA binding domain, target protein, transcriptional regulatory domain and binding member are as defined above.

In certain embodiments:

(1) the first component polypeptide comprises a DNA binding domain and is fused to a target protein to form a DBD-T fusion protein, wherein the target protein comprises an amino acid sequence having at least 90% identity to the amino acid sequence set forth in SEQ ID NO: 26, and the second component polypeptide comprises a transcriptional regulatory domain and is fused to a binding member to form a TRD-BM fusion protein, or

(2) the first component polypeptide comprises a transcriptional regulatory domain and is fused to a target protein to form a TRD-T fusion protein, wherein the target protein has an amino acid sequence having at least 90 % identity to SEQ ID NO: 1, and the second component polypeptide comprises a DNA binding domain and is fused to a binding member to form a DBD-BM fusion protein, wherein in either (1) or (2): a) the binding member comprises the BC, DE and FG loops, or Tn3 sequence, of PRGRZ093; b) the binding member comprises the BC, DE and FG loops, or Tn3 sequence, of PRGRZ094; c) the binding member comprises the BC, DE and FG loops, or Tn3 sequence, of PRGRZ103; d) the binding member comprises the BC, DE and FG loops, or Tn3 sequence, of PRGRZ112; e) the binding member comprises the BC, DE and FG loops, or Tn3 sequence, of PRGRZ114; or f) the binding member comprises the BC, DE and FG loops, or Tn3 sequence, of PRGRZ115.

The DBD-T fusion protein may comprise an amino acid sequence having at least 90 % identity to the amino acid sequence set forth in SEQ ID NO: 46. In particular embodiments the TRD-BM fusion protein defined in (1) above may comprise an amino acid sequence having at least 90 % sequence identity to the amino acid sequence set forth in any one of SEQ ID NOs: 47-52.

The TRD-T fusion protein may comprise an amino acid sequence having at least 90% identity to the amino acid sequence set forth in SEQ ID NO: 53. In particular embodiments, the DBD-BM fusion protein defined in (2) above may comprise an amino acid sequence having at least 90 % sequence identity to the amino acid sequence set forth in any one of SEQ ID NOs: 54-59.

Binding members may show a preference for fusion to either the DNA binding domain or the transcriptional regulatory domain, whereby transcriptional regulatory activity may be increased or decreased depending on if the particular binding member is fused to the DNA binding domain or transcriptional regulatory domain.

In some embodiments, the binding member or target protein is fused to the C-terminus of the DNA binding domain. In other embodiments, the binding member or target protein is fused to the N-terminus of the transcriptional regulatory domain. The binding member or target protein may be fused to the DNA binding domain or transcriptional regulatory domain via a peptide linker, for example via one or more of the peptide linkers set out above. In particular embodiments the linkers have the amino acid sequence TGGGGSGGGGS (SEQ ID NO: 44) or SA.

The methods and binding proteins described herein can also be applied to an activating CRISPR (CRISPRa) system. This can be used, for example, to facilitate endogenous gene regulation. The DBD-BM fusion protein can be guided to a target sequence through the use of particular guide RNAs that are specific for said target sequence.

As demonstrated in the examples, split transcription factors comprising a DNA binding domain fused to multiple copies of the target protein or binding member exhibited increased expression relative to a split transcription factor comprising a DNA binding domain fused to a single copy of the target protein or binding member.

Thus, in some embodiments: the DBD-T fusion protein comprises the DNA binding domain fused to multiple copies of the target protein (e.g. two, three, four, five or more target proteins); or the DBD-BM fusion protein comprises the DNA binding domain fused to multiple copies of the binding protein (e.g. two, three, four, five or more binding proteins).

The multiple binding members or multiple target proteins may be separated by a linker, for example by one or more peptide linkers as set out above. In particular exemplified embodiments the DBD-T fusion protein comprises a DNA binding domain fused to three target proteins, or the DBD-BM fusion protein comprises a DNA binding domain fused to three binding members.
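The multi-copy arrangement described above can be illustrated schematically as the concatenation of a DNA binding domain with several linker-separated copies of the target protein or binding member. The following is a minimal sketch only: the function name is hypothetical, the domain sequences passed in are placeholders rather than sequences of this disclosure, and the use of the same linker at every junction is an assumption. Only the linker sequence TGGGGSGGGGS (SEQ ID NO: 44) is taken from the text.

```python
# Linker sequence quoted in the description (SEQ ID NO: 44).
LINKER_SEQ_ID_44 = "TGGGGSGGGGS"


def build_multicopy_fusion(
    dbd: str, partner: str, copies: int = 3, linker: str = LINKER_SEQ_ID_44
) -> str:
    """Return DBD-linker-partner-linker-partner... as one amino acid string.

    `partner` stands for either the target protein or the binding member.
    Assumption for this sketch: the partner copies are fused C-terminal to
    the DNA binding domain and the same linker is used at every junction.
    """
    if copies < 1:
        raise ValueError("at least one copy is required")
    return linker.join([dbd] + [partner] * copies)
```

For example, calling `build_multicopy_fusion(dbd, target, copies=3)` with placeholder strings yields a single polypeptide string in which three copies of the target follow the DNA binding domain, mirroring the exemplified three-copy embodiments.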

The first and/or second component polypeptide may additionally comprise nuclear localization signals (such as, for example, that from the SV40 large T-antigen).

A split transcription factor may also be provided with a third expression cassette, wherein the third expression cassette encodes a desired expression product, wherein the DNA binding domain of the split transcription factor binds to a target sequence in the third expression cassette such that the transcription factor is capable of regulating expression of the desired expression product. By “capable of regulating expression” is meant that the DNA binding domain is able to bind the target sequence and upon forming a transcription factor with the transcriptional regulatory domain (i.e. upon dimerization of the dimerization-inducible protein), has transcriptional regulatory activity that regulates (increases or decreases) expression of the desired expression product. The desired expression product can be RNA or peptidic (peptide, polypeptide or protein). Preferably the desired expression product is peptidic. The desired expression product may be a therapeutic protein, i.e. a protein that exerts a therapeutic effect in the subject.

The target sequence may be located in or in close proximity to a promoter that is operably linked to a coding sequence for the desired expression product. By “close proximity” is meant that the target sequence is within 500 bp, within 250 bp, within 100 bp, within 50 bp, or within 25 bp of the sequence corresponding to the promoter.
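The "close proximity" windows listed above can be checked with a simple interval calculation. The sketch below is illustrative only; it assumes that the target sequence and promoter are given as 0-based, half-open (start, end) coordinates on the same strand of the same molecule, and that "within N bp" means a gap of N bp or fewer. The function names are hypothetical.

```python
from typing import Optional, Tuple

# Proximity windows named in the description, in base pairs, tightest first.
PROXIMITY_WINDOWS_BP = (25, 50, 100, 250, 500)


def gap_bp(target: Tuple[int, int], promoter: Tuple[int, int]) -> int:
    """Number of base pairs separating two half-open intervals.

    Returns 0 when the intervals touch or overlap, i.e. when the target
    sequence is located in (or directly abuts) the promoter.
    """
    t_start, t_end = target
    p_start, p_end = promoter
    if t_end <= p_start:
        return p_start - t_end
    if p_end <= t_start:
        return t_start - p_end
    return 0


def tightest_window(
    target: Tuple[int, int], promoter: Tuple[int, int]
) -> Optional[int]:
    """Smallest listed window that contains the target, or None if beyond 500 bp."""
    gap = gap_bp(target, promoter)
    for limit in PROXIMITY_WINDOWS_BP:
        if gap <= limit:
            return limit
    return None
```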

Split Chimeric Antigen Receptor

The dimerization-inducible protein may be a split chimeric antigen receptor (split CAR). CARs combine antibody-like recognition with T cell-activating function. They are typically composed of an antigen-specific recognition domain, e.g. derived from an antibody, a transmembrane domain to anchor the CAR to an immune cell (e.g. T cell), a co-stimulatory domain and one or more intracellular signalling domains that induce persistence, trafficking and effector functions in transduced T cells. The design and use of CARs is well known in the art and is described, for example, in Sadelain et al. 2013.

Split CARs have been designed that require an exogenous, user-provided signal to activate the CAR, for example as described in Wu et al. 2015. In these split receptors, antigen binding and intracellular signalling components only assemble in the presence of a heterodimerizing small molecule, allowing the user to precisely control the timing, location and dosage of immune cell (e.g. T cell) activity. Such split CARs are expected to mitigate toxicity, for example by inducing fewer off-target effects.

In one embodiment the dimerization-inducible protein comprises: a first component polypeptide comprising a co-stimulatory domain which is fused to the target protein as defined herein; and a second component polypeptide comprising an intracellular signalling domain which is fused to the binding member as defined herein.

In some embodiments the first component polypeptide set out above further comprises an antigen-specific recognition domain and a transmembrane domain and the second component polypeptide further comprises a transmembrane domain and a second co-stimulatory domain, such that the first and second component polypeptides form a chimeric antigen receptor (CAR) upon dimerization. By “form a CAR” is meant that the first and second component polypeptides are brought into close enough proximity that they are able to reconstitute a fully functional CAR.

In another embodiment the dimerization-inducible protein comprises: a first component polypeptide comprising an intracellular signalling domain and which is fused to the target protein as defined herein; and a second component polypeptide comprising a first co-stimulatory domain and which is fused to the binding member as defined herein.

In some embodiments the first component polypeptide set out above further comprises a transmembrane domain and a second co-stimulatory domain and the second component polypeptide further comprises an antigen-specific recognition domain and a transmembrane domain, wherein the first and second component polypeptides form a chimeric antigen receptor (CAR) upon dimerization.

The split CAR will have increased activity when the binding member is bound to the T-SM complex compared to the activity observed when the binding member is not bound to the T-SM complex.

In one embodiment the first component polypeptide comprises, from N-terminal to C-terminal: i) an antigen-specific recognition domain; ii) a transmembrane domain; and iii) a first co-stimulatory domain; and the second component polypeptide comprises, from N-terminal to C-terminal: i) a transmembrane domain; ii) a second co-stimulatory domain; and iii) an intracellular signalling domain, wherein the first component polypeptide and second component polypeptide form a CAR upon dimerization.

In some embodiments the target protein and binding member are fused at a location that is C-terminal to the respective transmembrane domains in the first and second component polypeptides. For example, the target protein or binding member may be fused to the N-terminus or C-terminus of the respective co-stimulatory domains in the first and second component polypeptides. In a particular embodiment, one of the target protein and binding member is fused to the C-terminus of the first co-stimulatory domain and the other is fused to the C-terminus of the second co-stimulatory domain. For example, in one embodiment the first component polypeptide comprises from N-terminal to C-terminal: i) an antigen-specific recognition domain; ii) a transmembrane domain; and iii) a first co-stimulatory domain; and the second component polypeptide comprises from N-terminal to C-terminal: i) a transmembrane domain; ii) a second co-stimulatory domain; and iii) an intracellular signalling domain, wherein the target protein is fused to the C-terminus of the first co-stimulatory domain and the binding member is fused to the C-terminus of the second co-stimulatory domain.

For example, in another embodiment the first component polypeptide comprises from N-terminal to C-terminal: i) an antigen-specific recognition domain; ii) a transmembrane domain; and iii) a first co-stimulatory domain; and the second component polypeptide comprises from N-terminal to C-terminal: i) a transmembrane domain; ii) a second co-stimulatory domain; and iii) an intracellular signalling domain, wherein the binding member is fused to the C-terminus of the first co-stimulatory domain and the target protein is fused to the C-terminus of the second co-stimulatory domain.

The target protein and/or binding member may be fused directly to the respective co-stimulatory domains. More preferably, the target protein and binding member are separated from their respective co-stimulatory domains by peptide linkers. The peptide linkers may be as further defined herein. In some embodiments, the target protein and binding member are separated from their respective co-stimulatory domains by a linker comprising the amino acid sequence set forth in SEQ ID NO: 61. Similarly, peptide linkers may separate the various domains in the first and second component polypeptides. For example, the transmembrane domain may be separated from the second co-stimulatory domain by a peptide linker, e.g. a peptide linker comprising the amino acid sequence GS, and/or the second co-stimulatory domain may be separated from the intracellular signalling domain by a peptide linker, e.g. a peptide linker comprising the amino acid sequence set forth in SEQ ID NO: 61.

Suitable co-stimulatory domains include, but are not limited to, activation domains from 4-1BB (CD137), CD28, ICOS, OX-40, BTLA, CD27, CD30, GITR, and HVEM. In one embodiment the first and second co-stimulatory domains are each a 4-1BB activation domain.

Suitable intracellular signalling domains include, but are not limited to, cytoplasmic sequences of the T cell receptor (TCR) and co-receptors that act in concert to initiate signal transduction following antigen receptor engagement, as well as any derivative or variant of these sequences and any synthetic sequence that has the same functional capability. Particular intracellular signalling domains are those that include signalling motifs known as immunoreceptor tyrosine-based activation motifs or ITAMs. Examples of ITAM-containing signalling domains include those derived from FcR gamma, FcR beta, CD3 gamma, CD3 delta, CD3 epsilon, CD3 zeta, CD5, CD22, CD79a, CD79b, and CD66d. In particular embodiments the intracellular signalling domain is derived from CD3 zeta (CD3ζ).

The transmembrane domain may be derived either from a natural or from a synthetic source. Where the source is natural, the domain may be derived from any membrane-bound or transmembrane protein. Transmembrane regions may be derived from (i.e. comprise at least the transmembrane region(s) of) the alpha, beta or zeta chain of the T-cell receptor, CD28, CD3 epsilon, CD45, CD4, CD5, CD8, CD9, CD16, CD22, CD33, CD37, CD64, CD80, CD86, CD134, CD137 or CD154. Alternatively, the transmembrane domain may be synthetic, in which case it will comprise predominantly hydrophobic residues such as leucine and valine. A triplet of phenylalanine, tryptophan and valine may be found at each end of a synthetic transmembrane domain. Optionally, a short oligo- or polypeptide linker, preferably between 2 and 10 amino acids in length, may form the linkage between the transmembrane domain and the intracellular signalling domain of the CAR. A glycine-serine doublet provides a particularly suitable linker. In particular embodiments, the transmembrane domain is derived from CD28.

The first and second component polypeptides may additionally include a hinge domain, such as an IgG4 or CD8α hinge domain, N-terminal to the transmembrane domains in the first and/or second polypeptides. Examples of hinge domains are described in, for example, Qin et al. 2017. In particular embodiments, the hinge domain is a human IgG4 hinge domain.

An antigen-specific recognition domain suitable for use in a dimerization-inducible protein of the present disclosure can be any antigen-binding polypeptide, a wide variety of which are known in the art. In some instances, the antigen-binding domain is a single chain Fv (scFv). Other antibody-based recognition domains, e.g. cAb VHH (camelid antibody variable domains) and humanized versions, IgNAR VH (shark antibody variable domains) and humanized versions, sdAb VH (single domain antibody variable domains) and “camelized” antibody variable domains are suitable for use. In some instances, T-cell receptor (TCR) based recognition domains such as single chain TCR (scTv, single chain two-domain TCR containing VαVβ) are also suitable for use.

In particular embodiments, the antigen-specific recognition domain is a single chain Fv (scFv). An scFv typically comprises a VH chain separated from a VL chain by a peptide linker, e.g. a peptide linker comprising the amino acid sequence set forth in SEQ ID NO: 61.

An antigen-specific recognition domain suitable for use in a dimerization-inducible protein of the present disclosure can have a variety of antigen-binding specificities. In some cases, the antigen-binding domain is specific for an epitope present in an antigen that is expressed by (synthesized by) a cancer cell, i.e. a cancer cell associated antigen. The cancer cell associated antigen can be an antigen associated with, e.g., a breast cancer cell, a B cell lymphoma, a Hodgkin lymphoma cell, an ovarian cancer cell, a prostate cancer cell, a mesothelioma cell, a lung cancer cell (e.g., a small cell lung cancer cell), a non-Hodgkin B-cell lymphoma (B-NHL) cell, a melanoma cell, a chronic lymphocytic leukemia cell, an acute lymphocytic leukemia cell, a neuroblastoma cell, a glioma cell, a glioblastoma cell, a medulloblastoma cell, a colorectal cancer cell, etc. A cancer cell associated antigen may also be expressed by a non-cancerous cell.

In particular exemplary embodiments, the target protein used in the split-CAR is derived from an HCV NS3/4A protease. The binding member is as described above, and in particular is or is based on PRGRZ103 (e.g. comprises the BC, DE and FG loops or Tn3 sequence of PRGRZ103, optionally with the sequence identity and/or alterations described herein).

In some embodiments, the first component polypeptide comprises a first signal peptide located N-terminal to the antigen-specific recognition domain. The first signal peptide may comprise the amino acid sequence set forth in SEQ ID NO: 62 or SEQ ID NO: 63. In some embodiments, the second component polypeptide comprises a second signal peptide located N-terminal to the transmembrane domain. The second signal peptide may comprise the amino acid sequence set forth in SEQ ID NO: 62 or SEQ ID NO: 63.

Also provided herein is an engineered immune cell comprising the split CAR disclosed herein. In one embodiment the immune cell is a T-cell. In another embodiment the immune cell is an NK cell. Also provided is a method of genetically modifying an immune cell to express a split CAR disclosed herein. The method may be carried out ex vivo. The method may comprise administering the one or more expression vectors described herein to the immune cell such that the split CAR is expressed on the surface of the immune cell.

Split Reporter System

The dimerization-inducible protein may be a split reporter system. The split reporter system may be an enzyme or fluorescent protein that provides an observable phenotype when the first and second component polypeptides dimerise. The observable phenotype may be a colorimetric signal, a luminescent signal or a fluorescent signal. Particular examples of split reporter systems are provided in Dixon et al. 2017.

In some embodiments, the first component polypeptide comprises a first reporter component and the second component polypeptide comprises a second reporter component, wherein the first component polypeptide and second component polypeptide form a reporter system upon dimerization, optionally wherein the reporter system provides an increased colorimetric, luminescent, or fluorescent signal when the binding member is bound to the T-SM complex.

Split Apoptotic Protein

The dimerization-inducible protein may be a split apoptotic protein. A split apoptotic protein is any protein that is capable of inducing apoptosis when the first and second component polypeptides dimerise. An example of a split apoptotic protein is a split caspase or dimerization-deficient caspase (e.g. dimerization-deficient caspase-9 or dimerization-deficient caspase-3), that is capable of inducing apoptosis upon dimerization and as such can be used to kill specific cells that contain the split apoptotic protein (e.g. diseased cells, or therapeutic cells that have been administered for cell therapy purposes). Examples of split caspases are provided in Chelur et al. 2007. The use of an inducible caspase-9 suicide gene system is described, for example, in Gargett et al. 2014.

In some embodiments, the first component polypeptide comprises a first caspase component; and the second component polypeptide comprises a second caspase component, wherein the first component polypeptide and second component polypeptide form a caspase upon dimerization. The split caspase is capable of inducing cell death when the binding member is bound to the T-SM complex.

In certain embodiments, the first and second caspase components are identical, for example both caspase components comprise caspase-9 activation domains. An exemplary caspase-9 activation domain is provided as amino acid residues 152-414 of the human caspase-9 amino acid sequence provided as NCBI accession number AAO21133.1 (version 1; last updated 1 December 2009), set out herein as SEQ ID NO: 35.

In some embodiments, the first and second fusion proteins are both BM-T fusion proteins, each comprising a target protein domain, a binding protein domain and a caspase domain. In some embodiments, the first and second fusion proteins are identical. In cases where the first and second caspase components are identical, the first and second caspase components may be encoded from the same expression cassette. For example, a split apoptotic protein may be encoded from one or more expression cassettes encoding the target protein, the binding member and the caspase-9 activation domain, where both the target protein and the binding member are fused to a caspase-9 activation domain. Upon expression, a plurality of proteins comprising the target protein, binding member and caspase-9 activation domain are produced and dimerization of the caspase-9 activation domains (i.e. at least a first and a second caspase-9 activation domain) can be regulated through the addition of the small molecule, e.g. grazoprevir.

Where the first and second fusion proteins are identical, it will be apparent to the skilled person that only a single expression cassette is required, comprising a gene encoding the fusion protein of the dimerization-inducible protein.

Other Dimerization-Inducible Proteins

Other dimerization-inducible proteins contemplated for use herein include split therapeutic proteins, split TEV proteases and split Cas9. A split therapeutic protein is any protein that is capable of exerting a therapeutic effect when the first and second component polypeptides of the split therapeutic protein dimerize.

Nucleic Acids

Also provided herein is a nucleic acid molecule encoding a binding member or BM-T fusion protein as described herein, and a nucleic acid molecule or nucleic acid molecules encoding a dimerization-inducible protein as defined herein. The nucleic acid molecule provided herein may be seen as comprising a nucleotide sequence encoding a binding member or BM-T fusion, and the nucleic acid molecule or nucleic acid molecules provided herein may be seen as comprising nucleotide sequences encoding a dimerization-inducible protein. The nucleic acid molecule or molecules may be an isolated nucleic acid molecule or molecules. The skilled person would have no difficulty in preparing such nucleic acid molecules using methods well-known in the art.

The nucleic acid molecule provided herein may be a DNA or RNA molecule. The nucleic acid molecule provided herein may be single-stranded or double-stranded. The nucleic acid molecule provided herein may be a linear nucleic acid molecule or a circular nucleic acid molecule. The nucleic acid molecule may be provided in the context of an expression vector, such as an expression vector as described below. Alternatively, the nucleic acid molecule may be provided in the context of a linear construct for cloning into a vector, or in the context of a cloning vector. The nucleic acid molecule may be provided in the context of the genome of a viral particle, as further discussed below. The genome of a viral particle may be single- or double-stranded RNA or DNA. When the nucleic acid molecule is an RNA molecule, it may be an mRNA molecule.

In some embodiments, the nucleic acid molecule encodes the Tn3 protein PRGRZ093, PRGRZ094, PRGRZ103, PRGRZ112, PRGRZ114 or PRGRZ115. The amino acid sequences for those Tn3 proteins are defined herein.

In some embodiments, the nucleic acid molecule or molecules comprise a nucleotide sequence having at least 80 %, 81 %, 82 %, 83 %, 84 %, 85 %, 86 %, 87 %, 88 %, 89 %, 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 %, or 99 % sequence identity with one or more of the exemplary nucleic acid sequences set forth for PRGRZ093, PRGRZ094, PRGRZ103, PRGRZ112, PRGRZ114 or PRGRZ115 in the table below. In some embodiments, the nucleic acid molecule or molecules comprise a nucleotide sequence which is degenerate with one or more of the exemplary nucleic acid sequences set forth for PRGRZ093, PRGRZ094, PRGRZ103, PRGRZ112, PRGRZ114 or PRGRZ115 in the table below. In some embodiments, the nucleic acid molecule or molecules comprise a nucleic acid sequence of PRGRZ093, PRGRZ094, PRGRZ103, PRGRZ112, PRGRZ114 or PRGRZ115 as set out in the table below. The nucleotide sequences for those exemplary binding members are set forth in Table 4 below:

Table 4: PRGRZ Tn3 nucleotide sequences
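For illustration, a nucleotide sequence that is "degenerate with" an exemplary sequence can be understood as one that encodes the same amino acid sequence using different codons. The sketch below tests this by translating both sequences with the standard genetic code and comparing the results. The function names are hypothetical, the short sequences used to exercise it are arbitrary examples rather than sequences of this disclosure, and treating two identical nucleotide sequences as trivially degenerate is an assumption of the sketch.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate a DNA or RNA coding sequence in frame 1 ('*' marks a stop codon)."""
    dna = nucleotides.upper().replace("U", "T")
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate(seq_a: str, seq_b: str) -> bool:
    """True if both nucleotide sequences encode the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)
```

For instance, `ATGGCCAAA` and `ATGGCTAAG` both translate to Met-Ala-Lys and would be regarded as degenerate with one another under this sketch.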

In some embodiments, the nucleic acid molecule or molecules encode the first component polypeptide and/or second component polypeptide fused to the target protein or binding member as described above. The amino acid sequences for those component polypeptides are defined herein.

In some embodiments, the nucleic acid molecule or molecules encode one or more of the DBD-T fusion protein, TRD-BM fusion protein, DBD-BM fusion protein, and TRD-T fusion protein as described above. The amino acid sequences for those fusion proteins are defined herein.

In some embodiments the nucleic acid molecule or molecules encode a split CAR as defined herein.

In some embodiments, the nucleic acid molecule or molecules encode a split apoptotic protein as described herein, in particular a split caspase. In some embodiments the nucleic acid molecule encodes a BM-T fusion protein comprising a caspase domain, as described above, wherein the caspase domain is dimerizable to form an active caspase. In such embodiments, the BM-T fusion protein constitutes a dimerization-inducible protein comprising identical first and second fusion proteins, as described above, and in such embodiments the nucleic acid molecule thus encodes a dimerization-inducible protein as described herein, in particular a split caspase.

The nucleotide sequence encoding the binding protein, BM-T fusion protein or dimerization-inducible protein described herein may be provided in the context of an expression cassette, as described below. That is to say, the nucleic acid may comprise an expression cassette encoding the binding protein, BM-T fusion protein or dimerization-inducible protein described herein. An isolated nucleic acid molecule may be used to express a binding member, BM-T fusion protein or dimerization-inducible protein provided herein.

Expression Vectors and Expression Cassettes

An “expression vector” as used herein is a DNA molecule used for expression of foreign genetic material in a cell. Any suitable vectors known in the art may be used. Suitable vectors include DNA plasmids, binary vectors, viral vectors and artificial chromosomes (e.g. yeast artificial chromosomes). In certain embodiments, the expression vector is a viral vector as described in more detail below. In other embodiments, the expression vector is a DNA plasmid.

An “expression cassette” as used herein is a polynucleotide sequence that is capable of effecting transcription of an expression product, which may be a protein. A “coding sequence” is intended to mean a portion of a gene’s polynucleotide sequence that encodes the expression product. Where the expression product is a protein, this sequence may be referred to as a “protein coding sequence”. The protein coding sequence typically begins at the 5’ end with a start codon and ends at the 3’ end with a stop codon. An expression cassette may be part of an expression vector, or part of a viral genome in a viral particle, as described in more detail below.

Typically, the expression cassette comprises a promoter operably linked to a protein coding sequence. The term “operably linked” includes the situation where a selected coding sequence and promoter are covalently linked in such a way as to place the expression of the protein coding sequence under the influence or control of the promoter. Thus, a promoter is operably linked to the protein coding sequence if the promoter is capable of effecting transcription of the protein coding sequence. Where appropriate, the resulting transcript may then be translated into a desired protein.

Any suitable promoter known in the art may be used in the expression cassette providing it functions in the cell type being used. For example, where the cell is a mammalian cell, the promoter may be a cytomegalovirus (CMV) promoter. Where multiple expression cassettes are used, each coding sequence may be independently operably linked to its own promoter. Alternatively, the coding sequence for one or more of the expression cassettes may be operably linked to the same promoter. The promoter used in an expression cassette as provided herein may be a constitutive promoter or an inducible promoter.

An expression cassette may comprise additional elements useful for control of transcription, e.g. one or more enhancer sequences and one or more transcription termination (terminator) sequences.

Where multiple expression cassettes are described, e.g. a first and second expression cassette, they may be part of the same or different expression vectors. Thus, in some embodiments, first and second expression cassettes may be located on the same expression vector. In other embodiments, a first expression cassette is located on a first expression vector and a second expression cassette is located on a second expression vector.

Where multiple coding sequences are located in the same expression cassette, the individual coding sequences (e.g. first and second coding sequences) may be separated by an internal ribosome entry site (IRES) or 2A element. The use of IRES or 2A elements allows multiple expression products to be expressed from a single mRNA transcript from a single promoter. In other words, when first and second coding sequences are separated by an IRES or 2A element, both the first and second coding sequences can be operably linked to the same promoter. Thus coding sequences for both parts of a dimerization-inducible protein as described herein may be encoded in a polycistronic (or bicistronic) expression cassette (i.e. an expression cassette which drives transcription of a single mRNA transcript encoding both components of the dimerization-inducible protein).
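As a schematic illustration of the 2A arrangement described above, two coding sequences can be joined into a single open reading frame by removing the stop codon of the upstream coding sequence and placing the 2A element in frame between them. The sketch below is a simplified string-level model; the function names are hypothetical and any 2A sequence passed in is a placeholder supplied by the user, not a sequence of this disclosure. An IRES-separated cassette differs in that each coding sequence retains its own stop codon, which is not modelled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop_codon(cds: str) -> str:
    """Remove a terminal stop codon, if present, from a coding sequence."""
    cds = cds.upper()
    return cds[:-3] if cds[-3:] in STOP_CODONS else cds


def join_with_2a(first_cds: str, second_cds: str, two_a: str) -> str:
    """Join two coding sequences in frame through a 2A element.

    The upstream stop codon is removed so that translation continues through
    the 2A element into the second coding sequence; the second coding
    sequence keeps its own stop codon.
    """
    upstream = strip_stop_codon(first_cds)
    two_a = two_a.upper()
    if len(upstream) % 3 or len(two_a) % 3:
        raise ValueError("upstream CDS and 2A element must preserve the reading frame")
    return upstream + two_a + second_cds.upper()
```

In a complete cassette, the joined open reading frame would sit downstream of a single promoter, consistent with the polycistronic (or bicistronic) arrangement described above.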

The expression vector provided herein may comprise a nucleic acid molecule as described above, or may comprise an expression cassette comprising a nucleic acid molecule as described above.

Thus in one aspect, provided herein is an expression vector comprising an expression cassette encoding a polypeptide comprising the binding protein or BM-T fusion protein as described herein. In some embodiments, the binding protein or BM-T fusion protein further comprises a first or second component polypeptide, as described above, and thus can constitute a first or second fusion protein of a dimerization-inducible protein as described herein.

In some embodiments, the BM-T fusion protein additionally comprises a caspase domain as described above, wherein the caspase domain is dimerizable to form an active caspase. In such embodiments, the BM-T fusion protein constitutes part of a dimerization-inducible protein comprising identical first and second fusion proteins, as described above, and in such embodiments a single, monocistronic expression cassette thus encodes a dimerization-inducible protein as described herein, in particular a split caspase.

In some embodiments, provided herein is an expression vector comprising an expression cassette encoding the binding protein, BM-T fusion protein or dimerization-inducible protein flanked by targeting sequences, wherein the targeting sequences direct integration of the expression cassette into a target locus in a host cell genome. That is to say, the expression cassette may be flanked by sequences which, upon delivery of the vector into a cell, cause the expression cassette to be integrated into the genome of the cell at a desired site. The targeting sequences may be homology arms, which are homologous to the sequences flanking the target locus, such that the expression cassette is inserted into the locus by homologous recombination. Such insertion may be referred to as “knock-in” of the expression cassette, and may be achieved, for example, using the CRISPR/Cas9 system.

In particular embodiments, an expression vector encoding a split apoptotic protein, e.g. a split caspase, is targeted for integration into the genome of a host cell. Thus the vector may comprise an expression cassette encoding a split caspase (as described above) flanked by targeting sequences. In particular, the split caspase may be iCasp9.

In some embodiments, the targeting locus is located in or at a target gene, which it is desired to knock out. In this way, insertion of the expression vector into the target gene can knock out the target gene, either by disrupting or replacing all or part of the target gene sequence.

In some embodiments, the target gene is a gene associated with immune recognition of a cell, for instance a component of the MHC class I. In particular, the target gene may be β-2-microglobulin (B2M). By knocking out MHC class I expression, e.g. by knocking out the B2M gene, the immunogenicity of a cell may be reduced, thereby improving its suitability for use in adoptive cell therapy, particularly allogeneic adoptive cell therapy, as further discussed below.

In particular embodiments, the vector comprises an expression cassette encoding a split caspase (e.g. iCasp9) targeted for integration at the B2M locus. Thus the vector may comprise an expression cassette encoding a split caspase (e.g. iCasp9) flanked by homology arms for the B2M gene locus. By “homology arms for the B2M gene locus” is meant sequences which are homologous to the B2M gene locus sequences, such that they target the expression cassette for genomic integration at the B2M gene locus and knock out the B2M gene.

In another aspect, provided herein is an expression vector comprising a first expression cassette encoding a binding protein as defined herein and a second expression cassette encoding a target protein as defined herein. The binding protein of the first expression cassette and/or the target protein of the second expression cassette may be encoded in the context of a BM-T fusion protein, as described herein, and/or a first or second fusion protein of a dimerization-inducible protein, as described herein. In this aspect, the expression vector can thus encode a dimerization-inducible protein as described herein, comprising non-identical first and second fusion proteins, each fusion protein being encoded in a separate expression cassette. In this aspect, the first and second expression cassettes may utilise the same promoter (i.e. comprise separate copies of the same promoter) or may utilise different promoters.

In another aspect, provided herein is an expression vector comprising a polycistronic (or bicistronic) expression cassette (as described above) encoding: (1) a first polypeptide comprising a binding protein as described herein, and (2) a second polypeptide comprising a target protein as described herein. The binding protein of the first polypeptide and/or the target protein of the second polypeptide may be encoded in the context of a BM-T fusion protein, as described herein, and/or a first or second fusion protein of a dimerization-inducible protein, as described herein. Thus in this aspect, the expression vector can encode a dimerization-inducible protein as described herein, comprising non-identical first and second fusion proteins, within a single expression cassette.

In another aspect, provided herein is an expression vector encoding a binding protein, BM-T fusion protein or dimerization-inducible protein, as described herein, and a second, unrelated protein. In this aspect, the binding protein, BM-T fusion protein or dimerization-inducible protein and the second protein may be encoded in separate expression cassettes, or in a polycistronic expression cassette encoding both proteins, as set out above.

In some embodiments, the second protein is a protein associated with immune evasion, i.e. the expression of which by a cell enables the cell to avoid destruction by a host immune system. In some embodiments, the second protein is CD47, which, when expressed on a cell in vivo, prevents killing of the cell by NK cells. In some embodiments, the expression vector encodes a split caspase (e.g. iCasp9) and a protein associated with immune evasion. In some embodiments, the expression vector encodes a split caspase (e.g. iCasp9) and CD47. In some embodiments, the expression vector comprises a polycistronic expression cassette which encodes a split caspase (e.g. iCasp9) and CD47. In a polycistronic expression cassette, the split caspase (e.g. iCasp9) and immune evasion (e.g. CD47) genes may be separated by a 2A element.

In some embodiments, the vector comprises a polycistronic expression cassette encoding a split caspase and an immune evasion protein, flanked by targeting sequences as described above. In particular, the targeting sequences may be directed to a target gene associated with immune recognition, such as B2M, as described above. The split caspase and immune evasion protein may be iCasp9 and CD47, respectively. In some embodiments, the vector comprises a polycistronic expression cassette encoding iCasp9 and CD47, wherein the expression cassette is flanked by targeting sequences for the B2M gene. An example of such a vector is shown in Figure 9.

The expression vectors described above may be provided in the context of a vector set, or vector pair. In such embodiments, the vector set or pair comprises a first expression vector comprising a first expression cassette and a second expression vector comprising a second expression cassette. The first expression cassette encodes a polypeptide comprising a binding protein as described herein and the second expression cassette encodes a polypeptide comprising a target protein as described herein. The binding protein and/or target protein may be encoded in the context of a BM-T fusion protein as described herein, and/or a first or second fusion protein of a dimerization-inducible protein as described herein. Thus in some embodiments, the first expression cassette encodes a first fusion protein of a dimerization-inducible protein as described herein and the second expression cassette encodes a second fusion protein of a dimerization-inducible protein as described herein. In such embodiments, the vector set, or pair, together encode a dimerization-inducible protein as described herein, and thus transfection of a cell with both the first and second expression vector of the vector set or pair can cause expression of the dimerization-inducible protein by the cell. To enable selection of cells successfully transfected with both the first and second expression vectors, each vector may include a different selection marker.

Similarly, provided herein is one or more expression vectors comprising (1) a first expression cassette encoding a polypeptide comprising a binding protein as described herein; and (2) a second expression cassette encoding a polypeptide comprising a target protein as described herein. As set out above, the binding protein and/or target protein may be encoded in the context of a BM-T fusion protein as described herein, and/or a first or second fusion protein of a dimerization-inducible protein as described herein. In this aspect, where the first and second expression cassettes are comprised within a single expression vector, this may be an expression vector as described above, and where the first and second expression cassettes are comprised within more than one (e.g. two) expression vectors, these may be provided as an expression vector set or pair as described above. Generally, in aspects where multiple expression vectors are provided (e.g. the vector set or pair described above), the expression vectors are of the same type, i.e. both/all expression vectors are plasmids or viral vectors.

Viral Vectors

In some embodiments the expression vector is a viral vector. Suitable viral vectors for use include adeno-associated virus (AAV) vectors, adenovirus vectors, herpes simplex virus vectors, retrovirus vectors, lentivirus vectors, alphavirus vectors, flavivirus vectors, rhabdovirus vectors, measles virus vectors, Newcastle disease virus vectors, poxvirus vectors and picornavirus vectors.

As used herein a viral vector means a DNA expression vector which comprises an expression cassette, or first and second expression cassettes as described above, such that the vector, or at least the expression cassettes, are converted into part of a viral genome, or incorporated into a viral genome, that is packaged in a viral particle when introduced into a cell alongside the necessary components for the assembly of the viral particle. Additionally, in one embodiment, the viral vector comprises an additional expression cassette encoding a desired expression product which is not a binding protein, target protein or BM-T fusion protein, or a component of a dimerization-inducible protein, as described herein.

In a particular embodiment the expression vector is an adeno-associated virus (AAV) vector. AAVs are one of the most actively investigated gene therapy vehicles and are characterized by an excellent safety profile and high efficiency of transduction in a broad range of target tissues. The use of AAVs as a vector for gene therapy is described in, for example, Naso et al. 2017 and Colella et al. 2018. Various AAV serotypes, including AAV1, AAV3, AAV4, AAV5, AAV6, AAV6.2, AAV6.2FF, AAV8, AAV 8.2, AAV9, and AAV rh10, and pseudotyped AAV such as AAV2/8, AAV2/5 and AAV2/6, can be used. Further examples of serotypes and their isolation are described in Srivastava, 2006.

The AAV particle is a small (25 nm) virus from the Parvoviridae family, and it is composed of a non-enveloped icosahedral capsid (protein shell) that contains a linear single-stranded DNA genome of around 4.8 kb. The AAV genome encodes several protein products, namely, four non-structural Rep proteins, three capsid proteins (VP1-3), and the assembly-activating protein (AAP). The AAV genes are flanked by two AAV-specific palindromic inverted terminal repeats (ITRs).

Thus, where the expression vector is an AAV vector, this may mean that the expression cassette, or first and second expression cassettes, are flanked by ITRs (e.g. ITR - expression cassette - ITR, or ITR - first expression cassette - second expression cassette - ITR), such that the expression cassettes are converted into a single-stranded genome that is packaged in an AAV particle when introduced into a cell alongside the necessary components for the assembly of the AAV particle.

The AAV vector may be engineered, for example in order to improve its function. Examples of AAVs that have been engineered for clinical gene therapy are described in Kotterman and Schaffer, 2014.

AAV vectors have a packaging capacity of less than 5 kb, which can limit the size of the genetic material (e.g. expression cassettes) that can be introduced into the viral genome. The use of components that have a relatively small size, such as Tn3 proteins as the binding members, allows for the expression cassette(s) encoding the tripartite complex (e.g. as part of a dimerization-inducible protein such as a split transcription factor or split apoptotic protein as described herein) to fit within a single AAV vector. The small size of the expression cassette(s) encoding the tripartite complex allows for a transgene (e.g. as part of an additional expression cassette) to be introduced into the same AAV vector as the components of a split transcription factor, allowing the split transcription factor to be delivered “in cis” with the transgene.
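The size constraint described above can be illustrated with a minimal sketch. Only the 5 kb limit is taken from the text; the element names and lengths below are hypothetical placeholders chosen for illustration and are not sequences or sizes from this disclosure.

```python
from dataclasses import dataclass

# Upper bound on the packaged genome, from the "less than 5 kb" figure above.
AAV_CAPACITY_BP = 5000


@dataclass(frozen=True)
class Element:
    """A named stretch of the vector genome (ITR, cassette, transgene)."""
    name: str
    length_bp: int


def genome_length(elements):
    """Total length of everything packaged, from ITR to ITR inclusive."""
    return sum(e.length_bp for e in elements)


def fits_in_single_aav(elements, capacity_bp=AAV_CAPACITY_BP):
    """True if the ITR-flanked genome is strictly below the packaging limit."""
    return genome_length(elements) < capacity_bp


# Hypothetical lengths, for illustration only.
cis_design = [
    Element("5' ITR", 150),
    Element("split transcription factor cassette", 2400),
    Element("transgene cassette", 1900),
    Element("3' ITR", 150),
]

print(genome_length(cis_design))       # 4600
print(fits_in_single_aav(cis_design))  # True
```

A check of this kind could be run on a candidate "in cis" design before cloning; a design that fails it would instead need the transgene on a separate vector.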

Viral Particles

Provided herein is a viral particle comprising a viral genome comprising an expression cassette as described above. That is to say, provided herein is a viral particle which contains a viral genome comprising an expression cassette encoding a binding protein, a binding protein and a target protein, a BM-T fusion protein or a dimerization-inducible protein as described above.

Also provided herein is a viral particle comprising a viral genome comprising a first expression cassette and a second expression cassette, each as described above, the first expression cassette encoding a polypeptide comprising a binding protein as described herein and the second expression cassette encoding a polypeptide comprising a target protein as described herein. The binding protein and/or target protein may be encoded in the context of a BM-T fusion protein as described herein, and/or a first or second fusion protein of a dimerization-inducible protein as described herein. Thus in some embodiments, the first expression cassette encodes a first fusion protein of a dimerization-inducible protein as described herein and the second expression cassette encodes a second fusion protein of a dimerization-inducible protein as described herein. In such embodiments, the expression cassettes together encode a dimerization-inducible protein as described herein, and thus such a dimerization-inducible protein is encoded by the genome of the viral particle.

Further provided herein is a pair or set of viral particles. The viral particle pair or set comprises a first viral particle comprising a first viral genome comprising a first expression cassette and a second viral particle comprising a second viral genome comprising a second expression cassette. The first expression cassette encodes a polypeptide comprising a binding protein as described herein and the second expression cassette encodes a polypeptide comprising a target protein as described herein. The binding protein and/or target protein may be encoded in the context of a BM-T fusion protein as described herein, and/or a first and/or second fusion protein of a dimerization-inducible protein as described herein. Thus in some embodiments, the first expression cassette encodes a first fusion protein of a dimerization-inducible protein as described herein and the second expression cassette encodes a second fusion protein of a dimerization-inducible protein as described herein. In such embodiments, the viral particle set, or pair, together encode a dimerization-inducible protein as described herein, and thus infection of a cell with both the first and second viral particle of the set or pair can cause expression of the dimerization-inducible protein by the cell.

Further provided herein are one or more viral particles comprising one or more viral genomes encoding, between the one or more viral particles, a binding protein, a binding protein and a target protein, a BM-T fusion protein or a dimerization-inducible protein as described above.

Also provided herein are one or more viral particles comprising: i) a first expression cassette encoding a target protein as described herein; and ii) a second expression cassette encoding a binding member as described herein, and wherein the first and second expression cassettes form part of a viral genome in the one or more viral particles. The binding protein and/or target protein may be encoded in the context of a BM-T fusion protein as described herein, and/or a first or second fusion protein of a dimerization-inducible protein as described herein.

In some embodiments, the first and second expression cassettes form part of the same viral genome of a viral particle. In other embodiments, the first expression cassette is located in a first viral genome of a first viral particle and the second expression cassette is located in a second viral genome of a second viral particle.

Any suitable viral particle may be used. The viral particle corresponds to the viral vector used to provide the viral genome encoding the binding protein, BM-T fusion protein, etc. That is to say, the viral particle is of a type for which the genome can be provided by the viral vector encoding the binding protein or the like. Thus where, for example, the binding protein is encoded on an AAV vector, the viral particle produced is an AAV particle.

Thus the viral particle may, for instance, be an adeno-associated virus (AAV) particle, adenovirus particle, herpes simplex virus particle, retrovirus particle, lentivirus particle, alphavirus particle, flavivirus particle, rhabdovirus particle, measles virus particle, Newcastle disease virus particle, poxvirus particle or picornavirus particle. In particular embodiments, the viral particle is an AAV particle. AAV particles, including serotypes etc., are described above. Depending on the viral particle used, the viral genome may be a single-stranded or double-stranded nucleic acid and may be RNA or DNA. For example, when the viral particle is an AAV particle, the viral genome is a single stranded DNA viral genome.

The viral particle (optionally provided in the context of a pair or set of viral particles, or as one or more viral particles) is suitable for use in delivering its genome encoding the proteins as described above to a target cell for expression by the target cell. The viral particle may be suitable for use in gene therapy, in which case it is suitable for administration to a subject (particularly a human subject) and in vivo delivery of its genome to target cells. In any event, the viral particle is generally replication deficient.

Also provided herein are in vitro methods of making viral particles. In one embodiment, a method of making viral particles involves transfecting host cells such as mammalian cells with a viral vector as described herein and expressing viral proteins necessary for particle formation in the cells and culturing the transfected cells in a culture medium, such that the cells produce viral particles. The viral particles may be released into the culture medium, or the method may additionally involve lysing and isolating particles from the cell lysates. An example of a suitable mammalian cell is a human embryonic kidney (HEK) 293 cell.

Typically, multiple plasmid expression vectors are utilised to generate the various protein components that make up the viral particles. It is also possible to make use of cell lines that constitutively express components for viral packaging, enabling the use of fewer plasmids. For example, construction of an AAV particle requires the Rep and Cap proteins and additional genes from adenovirus to mediate AAV replication. Making AAV particles is described for example in Robert et al. 2017. Briefly, this method involves transfection of a mammalian cell line, such as HEK293 cells, with three plasmids. One vector encodes the rep and cap genes of AAV (pRepCap) using their endogenous promoters; one vector (pHelper) encodes three additional adenoviral helper genes (E4, E2A and VA RNAs) not present in HEK293 cells; and one vector (the viral vector) (pAAV-GOI) contains the one or more expression cassettes flanked by two ITRs (see Figure 2 of Robert et al.). Following release of viral particles, the culture medium comprising the viral particles may be collected and, optionally, the viral particles may be separated from the cell lysate. Optionally, the viral particles may be concentrated.
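The three-plasmid scheme summarised above can be sketched as a simple completeness check. The plasmid names and the components each supplies are taken from the text; the function name, the string labels and the `cell_line_supplies` parameter (standing in for a packaging cell line that constitutively expresses some components) are assumptions introduced here for illustration.

```python
# Components named in the triple-transfection description above.
REQUIRED = {"rep", "cap", "E4", "E2A", "VA RNA", "ITR-flanked cassette"}

# What each plasmid contributes, as stated in the text.
PLASMIDS = {
    "pRepCap": {"rep", "cap"},
    "pHelper": {"E4", "E2A", "VA RNA"},
    "pAAV-GOI": {"ITR-flanked cassette"},
}


def missing_components(plasmid_names, cell_line_supplies=frozenset()):
    """Return the required components not supplied by the chosen plasmids
    or by the (hypothetical) packaging cell line."""
    supplied = set(cell_line_supplies)
    for name in plasmid_names:
        supplied |= PLASMIDS[name]
    return REQUIRED - supplied


# Full triple transfection: nothing is missing.
print(missing_components(["pRepCap", "pHelper", "pAAV-GOI"]))  # set()

# Omitting pHelper leaves the adenoviral helper functions unsupplied.
print(sorted(missing_components(["pRepCap", "pAAV-GOI"])))
# ['E2A', 'E4', 'VA RNA']

# A cell line that already supplies the helper genes needs fewer plasmids.
print(missing_components(["pRepCap", "pAAV-GOI"],
                         cell_line_supplies={"E4", "E2A", "VA RNA"}))  # set()
```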

Following production and optional concentration, the viral particles may be stored, for example by freezing at -80°C ready for use by administering to a cell and/or use in therapy. The disclosure also provides viral particles, such as AAV particles, for example those produced by the methods described herein. As used herein, a viral particle comprises a viral genome packaged within the viral envelope that is capable of infecting a cell, e.g. a mammalian cell.

Cells

Also provided herein are cells comprising a nucleic acid, expression vector, expression vector set/pair, or one or more expression vectors as described herein.

In some embodiments, such a cell does not express a protein encoded on the expression vector; it may be, for example, a cloning host.

In other embodiments, such a cell expresses a binding protein, BM-T fusion protein and/or dimerization-inducible protein as described herein. In these embodiments, the cell comprises either the expression vector, expression vector set/pair, or one or more expression vectors as described herein, or the nucleic acid described herein comprising an expression cassette encoding a binding protein, BM-T fusion protein and/or dimerization-inducible protein as described herein. The nucleic acid may in particular be a viral genome or part of a viral genome of a viral particle as described above, in which case the cell may have been infected with a viral particle or particles as described above.

The cell comprising the nucleic acid, expression vector, expression vector set/pair, or one or more expression vectors may comprise the nucleic acid or vector(s) extrachromosomally, or alternatively the nucleic acid or vector(s) may be integrated into the genome of the cell.

The cell provided herein is generally a mammalian cell, in particular a human cell. Such a cell may be an in vitro cell (e.g. derived from a cell line) or an ex vivo cell isolated from a subject, e.g. a patient who is to receive cellular therapy or a donor of cells for cell therapy.

In particular embodiments, the cell is an immune cell (e.g. a T cell or NK cell) or a stem cell. Suitable stem cells include embryonic stem cells, induced pluripotent stem cells, hematopoietic stem cells, neuronal stem cells, cardiac stem cells and mesenchymal stem cells.

In some embodiments, the cell is an immune cell (e.g. T cell or NK cell) expressing a split CAR as described herein.

In some embodiments, the cell expresses a split transcription factor or reporter system as described herein.

In some embodiments, the cell expresses a split apoptotic protein as described herein, in particular a split or dimerization-deficient caspase. In such embodiments, the cell may be an immune cell (e.g. T cell or NK cell). The immune cell may further express an antigen receptor, e.g. a TCR or a CAR or a related receptor.

Such an immune cell may be for use in cellular therapy of a subject, as described further below. In this case, the immune cell may be autologous or allogeneic to the subject to be treated. In some embodiments, the cell provided herein does not express a functional MHC class I. In particular, such a cell may not express the B2M protein. Cells which do not express B2M may be generated using a vector as described above, whereby the expression cassette encoding the binding protein, BM-T fusion protein or dimerization-inducible protein is integrated into the cell’s genome at the B2M locus. Such modification may in particular be made in a cell intended for use in cellular therapy, for example an immune cell. Such a cell may express a split apoptotic protein, e.g. a split caspase such as iCasp9.

In some embodiments, the cell provided herein additionally expresses a protein associated with immune evasion, e.g. CD47. Expression of such a protein may be of particular value in a cell intended for use in cellular therapy, for example an immune cell. Such a cell may express a split apoptotic protein, e.g. a split caspase such as iCasp9. In some embodiments, such a cell expresses a split apoptotic protein, e.g. a split caspase such as iCasp9, and CD47. Co-expression of these proteins may be achieved using an expression vector encoding both proteins, as described above.

In some embodiments, the cell provided herein does not express a functional MHC class I, and does express a protein associated with immune evasion. Such a cell may not express B2M, but express CD47. Such cells may be generated using an expression vector encoding both proteins which also targets knock-out of B2M, as described above and shown in Figure 9. Modification of cells not to express a functional MHC class I and to express a protein associated with immune evasion may in particular be made in a cell intended for use in cellular therapy, e.g. an immune cell. Such a cell may express a split apoptotic protein, e.g. a split caspase such as iCasp9.

Cells comprising an expression vector as provided herein may be generated by any suitable method, e.g. transformation, transfection or transduction. Such methods are well known in the art and are also further described below.

Gene Therapy

The genetic agents provided herein (i.e. the nucleic acids, expression vectors or viral particles) encoding a dimerization-inducible protein as described above may be administered to a subject in combination with the small molecule, e.g. grazoprevir as part of a method of treatment or a method of prophylaxis of a disease or disorder. That is to say, provided herein are the expression vector, expression vector set and one or more expression vectors described above, for use in therapy. Also provided herein are the viral particle, viral particle set and one or more viral particles described above, for use in therapy. The term “therapy” as used herein encompasses both treatment and prophylaxis of a disease or disorder. Therapeutic use of a genetic agent as provided herein constitutes gene therapy. Administration of the genetic agent(s) to the subject results in expression of the dimerization-inducible protein by cells of the subject. Administration of the small molecule, e.g. grazoprevir, to the subject results in dimerization of the dimerization-inducible protein, which may have a beneficial effect on the disease condition in the individual, e.g. by causing the recipient individual to experience a reduction in symptoms of the disease or disorder being treated.

The term “treatment,” as used herein in the context of treating a condition, pertains generally to treatment and therapy of a human, or alternatively of a non-human animal, in which some desired therapeutic effect is achieved, for example, the inhibition of the progress of the condition, and includes a reduction in the rate of progress, a halt in the rate of progress, regression of the condition, amelioration of the condition, and cure of the condition. Treatment as a prophylactic measure (i.e., prophylaxis, prevention) is also included.

“Prophylaxis” in the context of the present specification should not be understood to circumscribe complete success i.e. complete protection or complete prevention. Rather prophylaxis in the present context refers to a measure which is administered in advance of detection of a symptomatic condition with the aim of preserving health by helping to delay, mitigate or avoid that particular condition.

The individual treated according to the therapeutic methods described herein may be referred to as a subject or (normally when referring to a human subject) a patient.

As mentioned above, treatment using the genetic agents provided herein may involve expressing one or more dimerization-inducible proteins as defined herein in a cell. The dimerization-inducible protein may, for example, comprise a first component polypeptide and a second component polypeptide that form a therapeutic polypeptide upon dimerization. In this way, addition of the small molecule can result in the therapeutic protein having increased activity and can be used, for example, in a method of treatment of a disease where the therapeutic protein is deficient.

Provided herein is a method of regulating the expression of a target gene (which may alternatively be referred to as a target or desired expression product) in a cell, comprising (i) expressing a dimerization-inducible protein as described herein in the cell, wherein the first and second component polypeptides form a transcription factor upon dimerization, and wherein the DNA binding domain binds to a target sequence in the cell such that the transcription factor is capable of regulating (i.e. increasing or decreasing) expression of the target gene in the cell, and (ii) contacting the cell with the small molecule, e.g. grazoprevir in order to regulate expression of the target gene.

This method may be performed in vitro, ex vivo or in vivo. When performed in vitro or ex vivo the cell can be modified to express the dimerization-inducible protein by transfection or transduction with the nucleic acid(s), expression vector(s) or viral particle(s) provided herein, and contacted with the small molecule, e.g. grazoprevir, simply by applying a solution of the small molecule to the cells. Such in vitro and ex vivo methods do not constitute gene therapy, but may be useful for producing cells for research purposes or cellular therapy, as described below.

The method may also be performed in the context of gene therapy, in which case the method is performed in vivo. When the method is performed in vivo, i.e. on a subject, the subject is administered the nucleic acid(s), expression vector(s) or viral particle(s) provided herein in order to cause expression of the dimerization-inducible protein in cells of the subject. The subject is also administered the small molecule, e.g. grazoprevir, to cause dimerization of the dimerization-inducible protein, thereby regulating expression of the target gene.

The in vivo method may thus be seen as a method of regulating the expression of a target gene in a subject (or in a cell in a subject), comprising (i) expressing a dimerization-inducible protein as described herein in a cell in the subject, wherein the first and second component polypeptides form a transcription factor upon dimerization, and wherein the DNA binding domain binds to a target sequence in the cell such that the transcription factor is capable of regulating (i.e. increasing or decreasing) expression of the target gene in the cell, and (ii) administering the small molecule, e.g. grazoprevir, to the subject.

The methods of gene therapy provided herein may be seen in a number of ways. For instance, in one aspect, provided herein is the small molecule, e.g. grazoprevir, for use in a method of regulating the expression of a target gene in a cell in a subject, wherein the cell expresses a dimerization-inducible protein as described herein, wherein the first and second component polypeptides form a transcription factor upon dimerization, and wherein the DNA binding domain binds to a target sequence in the cell such that the transcription factor is capable of regulating (i.e. increasing or decreasing) expression of the target gene in the cell, and the method comprises administering the small molecule to the subject.

Additionally provided herein is a dimerization-inducible protein for use in a method of regulating the expression of a target gene in a subject (or in a cell in a subject), the method comprising expressing the dimerization-inducible protein described herein in the cell, wherein the first and second component polypeptides form a transcription factor upon dimerization, and administering the small molecule, e.g. grazoprevir, to the cell in order to regulate (i.e. increase or decrease) expression of the target gene.

As noted above, the method may comprise contacting a cell with one or more expression vectors or viral particles as described herein in order to express the dimerization-inducible protein in the cell. In other embodiments the method may comprise contacting the cell with an expression product produced from the one or more expression vectors, e.g. mRNA encoding the dimerization-inducible protein. When the method is performed in vivo, contacting of the cell is achieved by administering the one or more expression vectors (or expression products thereof, e.g. mRNA) or viral particles to a subject.

Thus provided herein is a method of regulating expression of a target gene in a subject (or in a cell in a subject), comprising administering to the subject one or more nucleic acids, expression vectors or viral particles, as described above, which encode a dimerization-inducible protein as described above which forms a transcription factor upon dimerization, and wherein the DNA binding domain of the transcription factor binds to a target sequence in the cell such that the transcription factor is capable of regulating (i.e. increasing or decreasing) expression of the target gene, such that the dimerization-inducible protein is expressed by cells within the subject; and administering the small molecule, e.g. grazoprevir, to the subject.

Similarly, provided herein is the small molecule, e.g. grazoprevir, in combination with (i) one or more nucleic acids as described above, (ii) one or more expression vectors as described above, or (iii) one or more viral particles as described above, for use in regulating expression of a target gene in a subject (or in a cell in a subject), wherein the one or more nucleic acids, expression vectors or viral particles encode a dimerization-inducible protein as described above which forms a transcription factor upon dimerization, and wherein the DNA binding domain of the transcription factor binds to a target sequence in the cell such that the transcription factor is capable of regulating (i.e. increasing or decreasing) expression of the target gene.

The particular administration of the genetic agents and the small molecule would be at the discretion of the physician who would also select dosages using his/her common general knowledge and dosing regimens known to a skilled practitioner.

The desired expression product (i.e. the product encoded by the target gene) can be RNA or a peptidic compound (peptide, polypeptide or protein). Preferably the desired expression product is peptidic. The desired expression product may be a therapeutic protein, i.e. a protein that exerts a therapeutic effect in the subject.

The target gene may be an endogenous gene present in the genome of the target cell or subject. For example, where these methods are carried out in a human cell or subject, the target gene may be a human gene. Alternatively, the target gene may be a transgene (i.e. an exogenous gene) delivered to the target cell, e.g. a therapeutic transgene. Regulating expression of the gene may be performed in a method of treatment or a method of prophylaxis of a disease. Following expression of the split transcription factor and administration of the small molecule, the recipient individual may exhibit reduction in symptoms of the disease or disorder being treated. This may have a beneficial effect on the disease condition in the individual.

Where the target sequence is part of a transgene delivered to the cell, the method may further comprise contacting the cell with an additional (e.g. third) expression cassette, wherein the additional expression cassette encodes the desired expression product and wherein the additional expression cassette comprises the target sequence bound by the split transcription factor. The transgene may comprise a promoter that is operably linked to a coding sequence for the desired expression product (i.e. that is operably linked to the target gene), which may be a therapeutic protein, e.g. a therapeutic antibody. An example of a therapeutic antibody is MEDI8852, having the heavy chain amino acid sequence set forth as SEQ ID NO: 64 and the light chain amino acid sequence set forth as SEQ ID NO: 65.

The additional expression cassette may be part of the same nucleic acid(s), expression vector(s) or viral particle(s) as the expression cassette or cassettes encoding the dimerization-inducible protein. In other words, the transgene may be delivered “in cis” with the split transcription factor to the cell, such as within the same viral (e.g. AAV) particle. Alternatively, the additional expression cassette may be part of a different nucleic acid, expression vector or viral particle to the expression cassette or cassettes encoding the split transcription factor. In other words, the transgene may be delivered “in trans” with the split transcription factor to the cell, such as within separate viral (e.g. AAV) particles. The split transcription factors provided herein are suitable for both “in cis” and “in trans” delivery with the transgene.

When the target gene is delivered “in trans” to the split transcription factor, the methods further comprise contacting the cell with a nucleic acid, expression vector or viral particle comprising the target gene (in addition to the nucleic acid(s), expression vector(s) or viral particle(s) encoding the split transcription factor). When the method is performed in vivo, the methods further comprise administering to the subject a nucleic acid, expression vector or viral particle which comprises the target gene.

Thus, in an embodiment, provided herein is the small molecule, e.g. grazoprevir, in combination with:

(a) (i) one or more nucleic acids as described above, (ii) one or more expression vectors as described above, or (iii) one or more viral particles as described above; and

(b) an additional nucleic acid, expression vector or viral particle; for use in regulating expression of a target gene in a subject (or in a cell in a subject), wherein the one or more nucleic acids, expression vectors or viral particles of (a) encode a dimerization-inducible protein as described above which forms a transcription factor upon dimerization, and wherein the additional nucleic acid, expression vector or viral particle of (b) comprises the target gene and a target sequence, wherein the DNA binding domain of the transcription factor binds the target sequence such that the transcription factor is capable of regulating (i.e. increasing or decreasing) expression of the target gene.

The target sequence may be located in or in close proximity to a promoter that is operably linked to the target gene. By “close proximity” is meant that the target sequence is within 500 bp, within 250 bp, within 100 bp, within 50 bp, or within 25 bp of the sequence corresponding to the promoter.
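The proximity thresholds above can be sketched as a simple interval check. This is a minimal illustration only: the coordinate convention (0-based, half-open intervals on the same strand), the reading of “within N bp” as a gap of at most N base pairs, and the helper names are assumptions made for this example and are not taken from the disclosure.

```python
# Thresholds listed in the text, in base pairs.
PROXIMITY_THRESHOLDS_BP = (500, 250, 100, 50, 25)


def gap_bp(target, promoter):
    """Base pairs separating two (start, end) intervals; 0 if they overlap or abut.

    Assumption: 0-based, half-open coordinates on the same sequence.
    """
    t_start, t_end = target
    p_start, p_end = promoter
    if t_end <= p_start:   # target lies upstream of the promoter
        return p_start - t_end
    if p_end <= t_start:   # target lies downstream of the promoter
        return t_start - p_end
    return 0               # target is located in (overlaps) the promoter


def tightest_threshold(target, promoter):
    """Smallest listed threshold satisfied by the target sequence, or None."""
    gap = gap_bp(target, promoter)
    met = [t for t in PROXIMITY_THRESHOLDS_BP if gap <= t]
    return min(met) if met else None


# Hypothetical coordinates: target ends 40 bp upstream of the promoter.
print(tightest_threshold((100, 160), (200, 400)))
```

A target sequence that overlaps the promoter has a gap of 0 and therefore satisfies every listed threshold, which corresponds to the “located in” case.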

Delivery of genetic material to the cell may occur by any suitable means. For example, delivery may be by viral means, e.g. as part of a viral particle described herein, or by non-viral means. Non-viral means of delivery include electroporation, lipofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, naked RNA, artificial virions, and agent-enhanced uptake of DNA. In one embodiment, the expression cassettes are delivered as mRNA. In one embodiment, the expression cassettes are delivered as DNA plasmids. As set out above, when delivery to the cell occurs in vivo, the nucleic acid(s), expression vector(s) or viral particles are administered to the subject.

In any of the in vivo methods disclosed herein, the small molecule may be orally administered to a human subject, for example in an acceptable dosage form such as a capsule, tablet, aqueous suspension or solution. The amount used will depend on the subject treated and the particular mode of administration. The small molecule may be administered as a single dose, multiple doses or over an established period of time.

Where the method involves administering a viral particle to a cell or a subject, the unit dose may be calculated in terms of the dose of viral particles being administered. Viral doses include a particular number of virus particles or plaque forming units (pfu) or viral genome copies (vgc). For embodiments involving AAV, particular unit doses include 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, 10¹⁰, 10¹¹, 10¹², 10¹³, 10¹⁴, 10¹⁵, 10¹⁶ viral genome copies (vgc) per kg of body weight. Particle doses may be somewhat higher (10 to 100-fold) due to the presence of infection-defective particles.
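The dose arithmetic described above can be illustrated as follows. This is a minimal sketch: the 70 kg body weight, the chosen unit dose and the function names are illustrative assumptions, not values or methods stated in the disclosure.

```python
def total_vgc(dose_vgc_per_kg, body_weight_kg):
    """Total viral genome copies for a per-kilogram unit dose."""
    return dose_vgc_per_kg * body_weight_kg


def particle_dose_range(vgc, low_fold=10, high_fold=100):
    """Range of particle doses if particles exceed genome copies 10- to 100-fold.

    The fold range reflects the statement that particle doses may be
    somewhat higher owing to infection-defective particles.
    """
    return (vgc * low_fold, vgc * high_fold)


# Hypothetical subject: 70 kg, unit dose of 10^12 vgc/kg.
dose = total_vgc(10**12, 70)
low, high = particle_dose_range(dose)
print(f"{dose:.1e} vgc; {low:.1e} to {high:.1e} particles")
```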

Without wishing to be bound by theory, infection and transduction of cells by viral particles (e.g. AAV particles) is believed to occur by a series of sequential events as follows: interaction of the viral capsid with receptors on the surface of the target cell, internalization by endocytosis, intracellular trafficking through the endocytic/proteasomal compartment, endosomal escape, nuclear import, virion uncoating, and viral DNA double-strand conversion that leads to the transcription and expression of proteins encoded by the viral genome in the viral particle.

While it is possible for the one or more nucleic acid(s), expression vector(s), viral particles and the small molecule to be used (e.g., administered) alone, it is often preferable to present the individual components within a composition or formulation, e.g. with a pharmaceutically acceptable carrier or diluent. For example, the one or more viral particles may be administered as a pharmaceutical composition comprising the one or more viral particles and a pharmaceutically acceptable carrier or diluent. As another example, grazoprevir may be administered as a pharmaceutical composition comprising both grazoprevir and a pharmaceutically acceptable carrier or diluent.

The term “pharmaceutically acceptable,” as used herein, pertains to compounds, ingredients, materials, compositions, dosage forms, etc., which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of the subject in question (e.g., human) without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio. Each carrier, diluent, excipient, etc. must also be “acceptable” in the sense of being compatible with the other ingredients of the formulation.

The agents for gene therapy (i.e. the one or more expression vectors, DNA plasmids or viral particles, plus the small molecule) may be administered to the subject simultaneously or sequentially and may be administered in individually varying dose schedules and via different routes. For example, when administered sequentially, the agents can be administered at closely spaced intervals (e.g., over a period of 5-10 minutes) or at longer intervals (e.g., 1, 2, 3, 4 or more hours apart, or even longer periods apart where required), the precise dosage regimen being commensurate with the properties of the agent(s) being administered. In one embodiment, the small molecule is administered to the subject after administration of the one or more nucleic acids, expression vectors or viral particles.

Cellular Therapy

Also provided are methods of cellular therapy. Cellular therapy involves administering cells that have been genetically modified to express an expression product, such as a dimerization-inducible protein, to a patient.

Cells such as stem cells may be used in methods of cellular therapy. One potential advantage associated with using stem cells is that they can be differentiated into other cell types in vitro, and can be introduced into a mammal (such as the donor of the cells) where they will engraft in the bone marrow. Suitable stem cells include embryonic stem cells, induced pluripotent stem cells, hematopoietic stem cells, neuronal stem cells, cardiac stem cells and mesenchymal stem cells.

For example, the cellular therapy may involve administering the one or more nucleic acids, expression vectors or viral particles described herein to a cell (e.g. a stem cell) in an ex vivo method such that a dimerization-inducible protein is expressed by the cell and administering the cell to a patient. Following administration of the cell expressing the dimerization-inducible protein, the small molecule, e.g. grazoprevir, may be administered to the individual in order to induce dimerization of the first and second component polypeptides in order to reconstitute their function. For example, the first and second component polypeptides may form a transcription factor upon dimerization, or the first and second component polypeptides may form a CAR upon dimerization.

Thus provided herein is a cell which expresses a dimerization-inducible protein as described above for use in therapy. Such a cell may be a stem cell, as described above. In other embodiments the cell is an immune cell.

Provided herein is a method of treatment comprising administering a cell expressing a dimerization-inducible protein as defined herein to a patient, the method comprising: i) administering the cell to an individual; and ii) administering the small molecule, e.g. grazoprevir, to the individual.

The dimerization-inducible protein may be for example a split transcription factor, a split CAR, a split apoptotic protein or a split therapeutic protein. The method of treatment may be a method of treating cancer.

Cellular therapy may involve isolating cells from a patient (or donor), transfecting the cells with one or more expression vectors ex vivo and administering the cells to the patient. Various cell types suitable for ex vivo transfection are well known to those of skill in the art (see e.g., Freshney et al., Culture of Animal Cells, A Manual of Basic Technique (3rd ed. 1994), and the references cited therein, for a discussion of how to isolate and culture cells from patients). For example, the cellular therapy may involve isolating a cell from a patient, administering the one or more expression vectors described herein to the cell in an ex vivo method such that a dimerization-inducible protein is expressed by the cell, and administering the cell back to the patient. Following administration of the cell expressing the dimerization-inducible protein, the small molecule may be administered to the individual in order to induce dimerization of the first and second component polypeptides as described herein.

Methods of regulating expression of a target gene in a cell in a subject are described above. Such methods may be applied in the context of cellular therapy by modifying a cell in vitro or ex vivo to express a dimerization-inducible protein which, upon dimerization, forms a transcription factor, using a method as described above, administering the cell to the subject and administering the small molecule (e.g. grazoprevir) to the subject.

The cellular therapy methods provided herein may be used in the context of immunotherapy. In one such aspect, the cell is an immune cell (such as a T-cell or an NK cell) and the dimerization-inducible protein expressed by the cell is a split CAR. Methods of treatment involving CAR T-cell therapy are known in the art and are described for example in Miliotou and Papadopoulou, 2018.

Thus provided herein is a method of treatment comprising administering a cell expressing the dimerization-inducible protein defined herein to a patient, wherein the first and second component polypeptide form a CAR upon dimerization, the method comprising: i) administering the cell to an individual; and ii) administering the small molecule (e.g. grazoprevir) to the individual.

The method of treatment may be a method of treating cancer.

Similarly, provided herein is a cell expressing a dimerization-inducible protein for use in a method of treatment of a subject, wherein the dimerization-inducible protein is a split CAR, and the method comprises administering to the subject the cell and the small molecule, e.g. grazoprevir.

Similarly, provided herein is the small molecule, e.g. grazoprevir, for use in a method of treatment of a subject, wherein the treatment comprises administering to the subject grazoprevir and a cell expressing a dimerization-inducible protein, wherein the dimerization-inducible protein is a split CAR.

Similarly, provided herein is the small molecule, e.g. grazoprevir, and a cell expressing a dimerization-inducible protein for use in a method of treatment in a subject, wherein the dimerization-inducible protein is a split CAR.

In cellular therapy methods utilising split CARs, the split CAR is generally expressed by an immune cell, as described above. Such methods may be for treating cancer.

The agents for cellular therapy using a split CAR or split transcription factor (i.e. the small molecule and cells expressing the split CAR or split transcription factor) may be administered in the context of one or more pharmaceutical compositions, as described above. The agents for cellular therapy using a split CAR or split transcription factor may also be administered to the subject simultaneously or sequentially and may be administered in individually varying dose schedules and via different routes. For example, when administered sequentially, the agents can be administered at closely spaced intervals (e.g., over a period of 5-10 minutes) or at longer intervals (e.g., 1, 2, 3, 4 or more hours apart, or even longer periods apart where required), the precise dosage regimen being commensurate with the properties of the agent(s) being administered. In one embodiment, the small molecule is administered to the subject after administration of the cell expressing a split CAR or split transcription factor. Suitable dosing regimens may be determined by a physician.

Cells expressing a split apoptotic protein, such as a split caspase, as described above, may also be used in cellular therapy.

Thus in one aspect, provided herein is a method of inducing death of a target cell in a subject, wherein the target cell expresses a dimerization-inducible protein, the dimerization-inducible protein being a split apoptotic protein (e.g. split caspase) as described above, the method comprising contacting the cell with the small molecule, e.g. grazoprevir. Such a method may be performed in vitro or ex vivo, e.g. for research purposes, or in vivo in the context of a method of treatment.

Thus in an embodiment, provided herein is a method of inducing death of a target cell in a subject, wherein the target cell expresses a dimerization-inducible protein, the dimerization-inducible protein being a split apoptotic protein (e.g. split caspase) as described above, the method comprising administering the small molecule, e.g. grazoprevir, to the subject.

Similarly, provided herein is the small molecule, e.g. grazoprevir, for use in a method of inducing death of a target cell in a subject, wherein the target cell expresses a dimerization-inducible protein, the dimerization-inducible protein being a split apoptotic protein (e.g. split caspase) as described above, the method comprising administering the small molecule to the subject.

These methods of inducing (or causing) death of a target cell in a subject may alternatively be seen as methods of inducing or causing apoptosis in a target cell in a subject.

These methods may find particular utility in cellular immunotherapy, e.g. CAR-T therapy and suchlike. While CAR-T therapy has been found to be highly effective for treatment of certain cancers (particularly blood cancers), and holds great potential for transforming oncology, it is frequently associated with serious adverse events, which in many cases are caused by immune cell overactivity. One relatively common adverse event is cytokine release syndrome (CRS), sometimes referred to as a cytokine storm, which occurs when immune cell hyperactivation causes elevated levels of cytokines in the blood stream. Another adverse event associated with CAR-T therapy is immune effector cell-associated neurotoxicity syndrome (ICANS), the mechanism of which is poorly understood. Both CRS and ICANS can be fatal. While treatments for the conditions exist, targeted therapy to switch off the hyperactive, therapeutic immune cells is desirable.

Accordingly, in the methods of inducing death of a target cell in a subject, the target cell may be an immune cell, e.g. a T cell or NK cell. The target cell may express an antigen receptor, such as a CAR or a TCR. In particular, the target cell may be modified to express both an antigen receptor and a split apoptotic protein. The subject may be undergoing cellular immunotherapy, i.e. therapy with an immune cell expressing an antigen receptor.

The methods of inducing death of a target cell in a subject may thus be for the treatment of CRS or ICANS in a subject. In these methods the subject may have been administered an immune cell expressing an antigen receptor and a split apoptotic protein, as described above. In such methods a suitable small molecule (e.g. grazoprevir) dose and dosing regimen may be determined by a physician. The small molecule may be administered in a pharmaceutical formulation.

Also provided herein is an immune cell (e.g. a T cell or NK cell) expressing an antigen receptor (e.g. a CAR or TCR) and a split apoptotic protein as described herein (e.g. a split caspase) for use in therapy. Such therapy may in particular be cancer therapy. Similarly, provided herein is a method of treatment comprising administering an immune cell (e.g. a T cell or NK cell) expressing an antigen receptor (e.g. a CAR or TCR) and a split apoptotic protein as described herein (e.g. a split caspase) to a subject in need thereof. The subject may have cancer. In particular, the antigen receptor may recognise a cancer antigen expressed by the cells of the cancer of the subject.

Kits

Also provided herein are kits that comprise: (1) the expression vector, vector set or one or more vectors; viral particles, set of viral particles or one or more viral particles, cells; or one or more nucleic acids, as defined herein; and (2) the small molecule, e.g. grazoprevir. The small molecule included in the kit is the molecule which forms a complex with the encoded target protein which is recognised by the encoded binding protein. Where the expression vector, vector set or one or more vectors, viral particle, set of viral particles or one or more viral particles, nucleic acid or one or more nucleic acids encodes a polypeptide containing a DNA binding domain that is from a CRISPR/Cas system, the kit may additionally include a guide RNA specific for the target sequence, or a nucleic acid encoding the guide RNA specific for the target sequence.

Sequence Identity and Alterations

Sequence identity is commonly defined with reference to the algorithm GAP (Wisconsin GCG package, Accelrys Inc, San Diego USA). GAP uses the Needleman and Wunsch algorithm to align two complete sequences, maximising the number of matches and minimising the number of gaps. Generally, default parameters are used, with a gap creation penalty equalling 12 and a gap extension penalty equalling 4. Use of GAP may be preferred but other algorithms may be used, e.g. BLAST (which uses the method of Altschul et al. (1990)), FASTA (which uses the method of Pearson and Lipman (1988)), or the Smith-Waterman algorithm (Smith and Waterman (1981)), or the TBLASTN program, of Altschul et al. (1990) supra, generally employing default parameters. In particular, the psi-Blast algorithm may be used.

Where the disclosure makes reference to a particular amino acid sequence having at least 90 % sequence identity to a reference amino acid sequence, this includes the amino acid sequence having 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 %, 99 % and 100 % sequence identity to the reference amino acid sequence (including as rounded to the nearest integer percentage).
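By way of illustration, the percentage identity of two sequences that have already been aligned (for example by GAP) can be computed and compared with a threshold as rounded to the nearest integer percentage. This is a minimal sketch under stated assumptions: identity is taken here as identical positions divided by the number of alignment columns (gap columns included), which is one of several conventions in use; the sequences and function names are hypothetical; and the alignment step itself is not reproduced.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns for two pre-aligned sequences.

    Assumption: denominator is the number of columns in which at least one
    sequence has a residue; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a, aligned_b))
    columns = sum(1 for a, b in pairs if not (a == gap and b == gap))
    matches = sum(1 for a, b in pairs if a == b and a != gap)
    return 100.0 * matches / columns


def meets_identity(aligned_a, aligned_b, threshold=90):
    """True if identity, rounded to the nearest integer percentage, meets the threshold."""
    rounded = math.floor(percent_identity(aligned_a, aligned_b) + 0.5)  # round half up
    return rounded >= threshold


# Hypothetical 20-residue reference and a variant with two substitutions.
reference = "ACDEFGHIKLMNPQRSTVWY"
variant = "GCDEFGHIKLMNPQRSTVWF"
print(percent_identity(reference, variant), meets_identity(reference, variant))
```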

The term “sequence alterations” as used herein is intended to encompass the substitution, deletion and/or insertion of an amino acid residue. Thus, a protein containing one or more amino acid sequence alterations compared to a reference sequence contains one or more substitutions, one or more deletions and/or one or more insertions of an amino acid residue as compared to the reference sequence. The term “amino acid mutation” is herein used interchangeably with “sequence alteration”, unless the context clearly identifies otherwise.

In some embodiments in which one or more amino acids are substituted with another amino acid, the substitutions may be conservative substitutions, for example according to the following Table. In some embodiments, amino acids in the same block in the middle column are substituted, i.e. a non-polar amino acid is substituted for another non-polar amino acid for example. In some embodiments, amino acids in the same line in the rightmost column are substituted, i.e. G is substituted for A or P for example.

In some embodiments, substitution(s) may be functionally conservative. That is, in some embodiments the substitution may not affect (or may not substantially affect) one or more functional properties (e.g. binding affinity) of the protein comprising the substitution as compared to the equivalent unsubstituted protein.

The binding member may comprise a variant of a BC, DE or FG loop or Tn3 sequence as disclosed herein. Suitable variants can be obtained by means of methods of sequence alteration, or mutation, and screening. In a preferred embodiment, a binding member comprising one or more variant sequences relative to those specified herein is a functional variant, i.e. a variant that retains one or more of the functional characteristics of the parent binding member (defined herein), such as binding specificity and/or binding affinity for the T-SM complex. For example, a binding member comprising one or more variant sequences preferably binds to T-SM complex with the same affinity as, a higher affinity than, or a not substantially worse affinity than, the (parent) binding member. “Not substantially worse” may mean that the variant binding member retains at least 80 %, 85 %, 90 %, 91 %, 92 %, 93 %, 94 %, 95 %, 96 %, 97 %, 98 % or 99 % of the affinity of the parent binding member.
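The retained-affinity criterion can be sketched numerically. The disclosure does not state how the percentage of retained affinity is calculated; the sketch below assumes that affinity is reported as an equilibrium dissociation constant (KD), where a lower KD means a higher affinity, and that retained affinity is the ratio KD(parent)/KD(variant). The KD values and function names are hypothetical.

```python
def retained_affinity_percent(kd_parent_nm, kd_variant_nm):
    """Percent of parent affinity retained, assuming affinity scales as 1/KD."""
    if kd_parent_nm <= 0 or kd_variant_nm <= 0:
        raise ValueError("KD values must be positive")
    return 100.0 * kd_parent_nm / kd_variant_nm


def not_substantially_worse(kd_parent_nm, kd_variant_nm, min_percent=80.0):
    """True if the variant retains at least min_percent of the parent affinity."""
    return retained_affinity_percent(kd_parent_nm, kd_variant_nm) >= min_percent


# Hypothetical values: parent KD 10 nM, variant KD 12.5 nM.
print(retained_affinity_percent(10, 12.5), not_substantially_worse(10, 12.5))
```

Under this assumption a variant with a lower KD than the parent returns a value above 100 %, corresponding to the “higher affinity” case.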

The parent binding member is a binding member which does not comprise the amino acid substitution(s), deletion(s), and/or insertion(s) which has (have) been incorporated into the variant binding member.

For example, a binding member may comprise BC, DE and FG loop sequences, or a Tn3 sequence, which has at least 70 %, at least 75 %, at least 80 %, at least 85 %, at least 90 %, at least 95 %, at least 96 %, at least 97 %, at least 98 %, at least 99 %, at least 99.1 %, at least 99.2 %, at least 99.3 %, at least 99.4 %, at least 99.5 %, at least 99.6 %, at least 99.7 %, at least 99.8 %, or at least 99.9 % sequence identity to BC, DE and FG loop sequences, or a Tn3 sequence, as disclosed herein.

A binding member may comprise BC, DE or FG loop sequences, or a Tn3 sequence, which has one or more amino acid sequence alterations (addition, deletion, substitution and/or insertion of an amino acid residue), preferably 20 alterations or fewer, 15 alterations or fewer, 10 alterations or fewer, 5 alterations or fewer, 4 alterations or fewer, 3 alterations or fewer, 2 alterations or fewer, or 1 alteration compared with the BC, DE and FG loop sequences, and the Tn3 sequences, disclosed herein.

The features disclosed in the foregoing description, or in the following claims, or in the accompanying drawings, expressed in their specific forms or in terms of a means for performing the disclosed function, or a method or process for obtaining the disclosed results, as appropriate, may, separately, or in any combination of such features, be utilised for realising the present disclosure in diverse forms thereof.

While the present disclosure has been described in conjunction with the exemplary embodiments described above, many equivalent modifications and variations will be apparent to those skilled in the art. Accordingly, the exemplary embodiments of the present disclosure set forth above are considered to be illustrative and not limiting. Various changes to the described embodiments may be made without departing from the spirit and scope of the present disclosure. For the avoidance of any doubt, any theoretical explanations provided herein are provided for the purposes of improving the understanding of a reader. The inventors do not wish to be bound by any of these theoretical explanations.

Any section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.

Throughout this specification, including the claims which follow, unless the context requires otherwise, the word “comprise” and “include”, and variations such as “comprises”, “comprising”, and “including” will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.

It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Ranges may be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by the use of the antecedent “about,” it will be understood that the particular value forms another embodiment. The term “about” in relation to a numerical value is optional and means for example +/- 10%.
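As a worked illustration of the exemplary +/- 10% meaning of “about”, the corresponding interval for a stated value can be computed as below. This is a sketch only; the function names are hypothetical, and the 10% tolerance is the example given in the text rather than a fixed definition.

```python
def about_range(value, tolerance=0.10):
    """Lower and upper bounds for 'about value' at the given fractional tolerance."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(x, value, tolerance=0.10):
    """True if x falls within 'about value' (bounds inclusive)."""
    low, high = about_range(value, tolerance)
    return low <= x <= high


# "About 100" with the exemplary +/- 10% tolerance.
print(about_range(100))
```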

Examples

Example 1: Identification of grazoprevir and HCV NS3/4A PR as the basis for a chemical inducer of dimerization (CID) module

We have previously described chemical inducer of dimerization modules comprised of the catalytically inactive S139A mutant of the NS3/4A protease from hepatitis C virus (HCV NS3/4A PR (S139A), SEQ ID NO: 27), the antiviral compound simeprevir and proteins (termed PRSIM molecules) which bind selectively to the HCV NS3/4A PR (S139A):simeprevir complex (WO 2021/009692).

We have now found that using grazoprevir instead of simeprevir as the CID is advantageous for development of the molecular switches described herein. Grazoprevir is clinically approved for treatment of HCV in a fixed dose combination with the HCV NS5A inhibitor elbasvir. It is cell permeable, orally dosed and exhibits pharmacokinetics in humans which render it suitable for chronic daily dosing, all desired properties for a small molecule inducer. Furthermore, we have investigated the pharmacokinetics of grazoprevir in Balb/c mice, and have shown that grazoprevir exhibits superior pharmacokinetics in mice after oral dosing compared to simeprevir (Fig. 1), with a 10-fold lower dose of grazoprevir giving a 4.6-fold higher peak serum concentration (Cmax) and a 2-fold higher exposure of unbound compound based on AUC, thus facilitating in vivo studies in mice.

Example 2: Selection of HCV NS3/4A PR (S139A):grazoprevir complex-specific binding (PRGRZ) molecules

To identify selective HCV NS3/4A PR (S139A):grazoprevir complex-specific binding (PRGRZ) molecules, three rounds of phage display selections were performed on biotinylated HCV NS3/4A PR (S139A) in the presence of grazoprevir. Phage ELISAs were performed on biotinylated HCV NS3/4A PR (S139A) in both the presence and absence of grazoprevir to identify Tn3s from the third round of selection which were specific for the complex over the HCV NS3/4A PR (S139A) alone. A panel of 6 Tn3 clones (Table 1) with unique sequences that demonstrated selective binding to biotinylated HCV NS3/4A PR (S139A) in the presence of grazoprevir were selected to be expressed for further biochemical studies.

Example 3: A panel of PRGRZ molecules are selective for the HCV NS3/4A PR (S139A):grazoprevir complex

The PRGRZ binding proteins identified from phage display selections as complex-specific were expressed and purified at larger scale to provide sufficient material for further analysis. A homogeneous time-resolved fluorescence (HTRF) binding screen was performed on all the HCV NS3/4A PR (S139A):grazoprevir complex-specific PRGRZ molecules and the PRGRZ molecules were confirmed as complex-specific with no detectable binding to the HCV NS3/4A PR (S139A) protein alone (Fig. 2). As a control, the HCV NS3/4A PR (S139A):simeprevir complex-specific Tn3 protein PRSIM23 (SEQ ID NO: 7, WO 2021/009692) was used, which showed no binding to the HCV NS3/4A PR (S139A):grazoprevir complex.

To further characterise the PRGRZ binding molecules, the kinetics of HCV NS3/4A PR (S139A) protease binding in the presence or absence of grazoprevir were determined using Biacore 8K (Table 5). All the PRGRZ binding molecules tested showed selectivity for grazoprevir-bound HCV NS3/4A PR (S139A) with no binding observed to HCV NS3/4A PR (S139A) in the absence of grazoprevir.

Table 5: Binding and kinetic constants measured for the binding of HCV NS3/4A PR (S139A) to PRGRZ binding molecules in the presence of 10 nM grazoprevir.

Data is mean ± s.d. (n=3)

N.D. = indicates the values could not be determined due to absence of detectable binding

# = no binding

The effect of grazoprevir concentration on the formation of the complexes of PRGRZ93 and PRGRZ103 with HCV NS3/4A PR (S139A) was also assessed (Table 6); the EC50 values of 7.67 nM and 7.75 nM, respectively, are within 1.7 fold of the EC50 for the effect of simeprevir concentration on the formation of the PRSIM23:HCV NS3/4A PR (S139A) complex.
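The relationship between an EC50 and the extent of complex formation at a given concentration can be illustrated with a sigmoidal dose-response model. The disclosure does not state which model was fitted to the data; the four-parameter logistic (Hill) form below, the unit Hill slope and the 0 to 100 % response scale are assumptions made for this sketch, and the function names are hypothetical. Only the two EC50 values are taken from the text.

```python
def hill_response(conc_nm, ec50_nm, bottom=0.0, top=100.0, hill_slope=1.0):
    """Four-parameter logistic response at a given concentration (assumed model)."""
    if conc_nm <= 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50_nm / conc_nm) ** hill_slope)


def fold_difference(ec50_a, ec50_b):
    """Fold difference between two EC50 values, always >= 1."""
    return max(ec50_a, ec50_b) / min(ec50_a, ec50_b)


# EC50 values reported above for PRGRZ93 and PRGRZ103 (nM).
ec50_prgrz93, ec50_prgrz103 = 7.67, 7.75

# By definition the response at the EC50 is the midpoint between bottom and top.
print(hill_response(ec50_prgrz93, ec50_prgrz93))
print(round(fold_difference(ec50_prgrz93, ec50_prgrz103), 3))
```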

Table 6: Dose-response for grazoprevir- and simeprevir-dependent complex formation between Tn3s and HCV NS3/4A PR(S139A)

We also tested the specificity of these interactions with respect to alternative small molecule inhibitors of the HCV protease that have been approved for human use. A panel of such small molecules - simeprevir, glecaprevir, asunaprevir, paritaprevir, vaniprevir and danoprevir - were assessed for their ability to induce complex formation between HCV NS3/4A PR (S139A) and the PRGRZ molecules. A homogeneous time-resolved fluorescence (HTRF) binding assay (Fig. 3) was performed to determine the level of HCV NS3/4A PR (S139A):PRGRZ complex formed when grazoprevir was substituted with the alternative HCV PR inhibitor small molecules. All PRGRZ molecules exhibit a high level of selectivity for complex formation with HCV NS3/4A PR (S139A):grazoprevir compared to complexes of HCV NS3/4A PR (S139A) and simeprevir, paritaprevir, danoprevir, asunaprevir, or vaniprevir, with no or minimal complex formed. By contrast, all PRGRZ molecules are able to form complexes with HCV NS3/4A PR (S139A):glecaprevir to differing extents - PRGRZ093, PRGRZ103, PRGRZ114 and PRGRZ115 show moderately greater levels of complex formation with HCV NS3/4A PR (S139A):glecaprevir compared to HCV NS3/4A PR (S139A):grazoprevir, whereas PRGRZ094 and PRGRZ112 exhibit a greater level of complex formation with HCV NS3/4A PR (S139A):grazoprevir.

Examination of the crystal structures of HCV NS3/4A PR bound to grazoprevir or glecaprevir (see Table 2 for PDB numbers) reveals a strikingly similar binding mode for the two compounds, which are structurally related, so at least some of the potential epitopes available for recognition by the PRGRZ molecules are conserved between both complexes.

This data suggests that administration of other HCV NS3/4A PR inhibitor small molecules, with the exception of glecaprevir, for example to an HCV-infected individual, would not induce formation of an active HCV NS3/4A PR (S139A):PRGRZ103 complex.

Example 4: PRGRZ-based CIDs can regulate gene expression via reconstitution of a split transcription factor

We reasoned that the PRGRZ-based CIDs could be used to regulate expression of transgenes via fusion to the two domains of a split transcription factor. To demonstrate this, we generated vectors in which three copies of each PRGRZ molecule were fused in tandem to the ZFHD1 DNA binding domain and one copy of HCV NS3/4A PR (S139A) was fused to the p65 activation domain (AD); these components are separated by an IRES sequence and downstream of a constitutive CMV promoter. We transfected each vector into HEK293 cells together with a DNA vector encoding luciferase under the control of an inducible promoter that contains 12 copies of the ZFHD1 recognition sequence upstream of a minimal IL-2 promoter. Only when grazoprevir is present is the AD domain recruited to the DBD bound to the promoter upstream of the luciferase gene (Fig. 4A), thus inducing gene expression. The PRGRZ-based CID constructs demonstrated dose-dependent gene expression regulation ranging from 160- to 2260-fold maximum induction with EC50s ranging from 3.0 to 16.1 nM (Fig. 4B and Table 7). The PRGRZ103-based CID demonstrated a 2-fold greater fold induction of luciferase activity compared to the PRSIM23-based switch induced by simeprevir, albeit with a 2-fold weaker EC50.
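Fold induction in a reporter assay of this kind is conventionally the ratio of the signal in the presence of the inducer to the signal in its absence. The sketch below illustrates that calculation only; the luminescence readings, concentrations and function names are hypothetical and are not data from this Example, and the source does not specify how its fold values were normalised.

```python
def fold_induction(induced_signal, basal_signal):
    """Ratio of reporter signal with inducer to signal without inducer."""
    if basal_signal <= 0:
        raise ValueError("basal signal must be positive")
    return induced_signal / basal_signal


def max_fold_induction(signal_by_conc_nm):
    """Maximum fold induction across a dose series, relative to the 0 nM well."""
    basal = signal_by_conc_nm[0]
    return max(fold_induction(s, basal) for c, s in signal_by_conc_nm.items() if c > 0)


# Hypothetical luminescence readings (arbitrary units) keyed by inducer concentration (nM).
readings = {0: 200.0, 1: 5_000.0, 10: 60_000.0, 100: 100_000.0}
print(max_fold_induction(readings))
```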

Table 7: EC50 and fold-change values for PRGRZ-based CIDs in a split transcription factor assay.

Example 5: Transgene expression can be tuned by changing the copy number of PRGRZ molecules fused to the DBD

To assess the impact of copy number of the PRGRZ fused to the DNA binding domain, we generated vectors encoding one, two or three copies of PRGRZ103 fused to the DBD via a short peptide linker of SEQ ID NO: 74 (Fig. 5A) and HCV NS3/4A PR (S139A) fused to the p65 activation domain (AD); these components are separated by a P2A sequence and downstream of a constitutive hybrid EF1a/HTLV-1 promoter. The constructs have the sequences set forth in SEQ ID NOs: 56 (one copy), 77 (two copies) and 78 (three copies) respectively. We then assessed the ability of these to induce transcription of an IL-2 transgene located on the same vector downstream of the inducible ZFHD1 promoter. Each additional copy of PRGRZ103 leads to an increase in IL-2 expression, with maximum levels of 397 ng/ml, 985 ng/ml and 1226 ng/ml for 1, 2 and 3 copies, respectively (Fig. 5B). This data suggests that it is possible to tune the regulation of gene expression from the inducible promoter by changing the copy number of PRGRZ103, and thus recruitment of the activation domain.

Example 6: A PRGRZ-based CID can regulate activity of a split chimeric antigen receptor (CAR)

Chemical-induced heterodimerisation has previously been shown to be an effective way to modulate CAR function (Wu et al. 2015; Hill et al. 2018). To demonstrate that the PRGRZ103-based CID can be used to regulate CAR function, we engineered Jurkat T-cells to express PRGRZ103-CID-based CARs using a lentiviral expression system (Fig. 6A). Activation of the CAR, upon antigen binding, should result in the secretion of IL-2 in an antigen- and grazoprevir-dependent manner. Upon addition of grazoprevir, we observed a dose-dependent activation of the CAR-expressing Jurkat cells in the presence of antigen-positive HepG2 cells but not antigen-negative A375 cells, as measured by IL-2 production (Fig. 6B). These data demonstrate that the PRGRZ-based CID module can be used for grazoprevir-mediated regulation/modulation of CAR-initiated cellular signalling pathways.

Example 7: A PRGRZ-based CID can regulate the activity of an apoptotic protein to control cell death in tumour cell lines

The ability to “remotely control” therapeutic cells once they have been administered provides a safety net in the event of uncontrolled proliferation or an adverse event. One way to control such cells is to endow them with a so-called “kill switch” such that they can be removed at will once they have performed their function or pose a safety risk. As such, a PRGRZ-based, grazoprevir-responsive Caspase-9-based kill switch was generated. The homo-dimerisation CARD domain of Caspase-9 was replaced with both the PRGRZ103 and HCV NS3/4A PR (S139A) domains, separated by short linkers of SEQ ID NO: 75. An active Caspase-9 homodimer can thus only be reconstituted by addition of grazoprevir (Fig. 7A). Addition of 10 nM grazoprevir to HCT116 cells stably transduced with the kill switch construct induced rapid cell death as determined by microscopic inspection of cells (Fig. 7B). Cell killing, determined by cell viability, was grazoprevir-dose-dependent, with an EC50 of 68 pM after 48 hours. This was equivalent to the efficacy observed for simeprevir-induced cell killing of HCT116 cells stably transduced with the PRSIM-based kill switch described previously (EC50 = 63 pM) (Fig. 7C). Active Caspase-9 activates downstream Caspase-3 by proteolytic cleavage. Caspase-3 activity, detected by cleavage of the fluorogenic substrate Ac-DEVD-AMC, is significantly (p<0.0001) up-regulated in 10 nM grazoprevir-treated kill switch-transduced HCT116 cells (Fig. 7D).

We next tested whether the PRGRZ-based CID module incorporated into the Caspase-9-based kill switch could be used to eliminate cells in vivo. Clonal HCT116 cells stably transduced with the PRGRZ-based kill switch were implanted into SCID mice, and 50 mg/kg grazoprevir was dosed orally when the average group tumour size reached 225 mm³ (Day 13). We observed complete tumour regression in all mice, with tumours remaining undetectable up to 5 weeks after dosing (Fig. 8).

Example 8: A PRGRZ-based CID can regulate the activity of an apoptotic protein to control cell death in both undifferentiated and differentiated ES cells

To demonstrate activity of the PRGRZ-based CID in therapeutically relevant cell types, we engineered ES cells to express the iCaspase-9-based kill switch described above. To achieve this, an expression cassette encoding both CD47 and the grazoprevir-inducible Caspase-9-based kill switch, separated by a “self-cleaving” peptide (P2A), with homology arms to the β-2-microglobulin (B2M) locus at either end (Figure 9), was used as a homology-directed repair template (HDRT) for CRISPR/Cas9-mediated knock-in at the B2M locus. Resulting cells were expected to express both CD47 (to prevent NK cell killing) and the kill switch, with concomitant lack of expression of B2M (to prevent recognition by T cells). Individual clones were selected by fluorescence-activated cell sorting (FACS) for the presence of CD47 and absence of B2M. Furthermore, integration of the cassette at the B2M locus was detected by PCR amplification (Figure 10). Based on these data, clone #76 was chosen for further analysis.

Next, both clone #76 and the wild-type ES cell lines were treated with a single dose of 10 nM or 100 nM grazoprevir, or vehicle-only control (DMSO). Cell number after 48 hours was quantified using the xCELLigence instrument and then converted into percent cytolysis using untreated samples as the reference (Figure 11A).
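The conversion to percent cytolysis referred to above can be sketched as follows. The formula shown, in which the cell index of a treated well is compared with that of the untreated reference at the same time point, is a commonly used definition and is an assumption here; the exact calculation performed by the instrument software is not stated. The function name and the cell index values are hypothetical.

```python
def percent_cytolysis(cell_index_treated, cell_index_untreated):
    """Percent cytolysis of a treated well relative to an untreated reference well."""
    if cell_index_untreated <= 0:
        raise ValueError("untreated reference cell index must be positive")
    return (1.0 - cell_index_treated / cell_index_untreated) * 100.0


# Hypothetical cell index readings at a single time point, for illustration only.
untreated = 2.0
print(percent_cytolysis(0.5, untreated))  # treated well with a quarter of the reference signal
print(percent_cytolysis(2.0, untreated))  # treated well identical to the reference
```

Under this definition, a treated well whose cell index equals that of the untreated reference gives 0 % cytolysis, and a well with no remaining signal gives 100 %.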

In contrast to the wild-type ES cell line, clone #76 demonstrated significant cytolysis (>95 % cells) under both grazoprevir treatment conditions, but not with vehicle-only control treatment, demonstrating the efficacy of the grazoprevir-inducible kill switch. This killing could also be observed by light microscopy when overgrown ES cell cultures of wild-type and clone #76 were treated with 10 nM grazoprevir every 48 h for 3 rounds over a period of 7 days. In contrast to the wild-type cells, no clone #76 cells were detected after the 3rd round of grazoprevir treatment, confirming that sustained activation of the highly potent kill switch can kill 100 % of cells without the cells becoming resistant to grazoprevir (Figure 11B).

Clone #76 and the wild-type ES cell lines were then differentiated into endothelial cells and their susceptibility to grazoprevir-induced cytolysis was measured longitudinally for approximately 19 hours following addition of either 10 nM grazoprevir, 100 nM grazoprevir or DMSO. In contrast to WT and vehicle-treated cells, differentiated endothelial cells derived from clone #76 were effectively killed by both 10 nM and 100 nM of grazoprevir at 24 h post treatment (Figure 12). The grazoprevir-inducible killing of clone #76 cells was induced rapidly in the first hour after treatment, compared to WT and DMSO-treated controls, and continued to rise throughout the experiment.

These data demonstrate that a PRGRZ-based kill switch can efficiently and rapidly eliminate many cell types, including both undifferentiated and differentiated ES cells in vitro and in vivo, thereby providing a means for the rapid removal of therapeutic cells in patients.

References

A number of publications are cited above in order to more fully describe the present disclosure and the state of the art to which the disclosure pertains. Full citations for these references are provided below. The entirety of each of these references is incorporated herein.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. ‘Basic local alignment search tool’. J. Mol. Biol. 215(3), 403-10.

Banaszynski, L. A., C. W. Liu, and T. J. Wandless. 2005. 'Characterization of the FKBP.rapamycin.FRB ternary complex', J Am Chem Soc, 127: 4715-21.

Bartenschlager R, Ahlborn-Laake L, Mous J, Jacobsen H. 1993 'Nonstructural protein 3 of the hepatitis C virus encodes a serine-type proteinase required for cleavage at the NS3/4 and NS4/5 junctions'. J Virol., 67(7): 3835-3844

Belshaw, P J, S N Ho, G R Crabtree, and S L Schreiber. 1996. 'Controlling protein association and subcellular localization with a synthetic ligand that induces heterodimerization of proteins', Proceedings of the National Academy of Sciences, 93: 4604-07.

Belshaw, P. J., D. M. Spencer, G. R. Crabtree, and S. L. Schreiber. 1996. 'Controlling programmed cell death with a cyclophilin-cyclosporin-based chemical inducer of dimerization', Chem Biol, 3: 731-8.

Chavez A, Scheiman J, Vora S, Pruitt BW, Tuttle M, P R Iyer E, Lin S, Kiani S, Guzman CD, Wiegand DJ, Ter-Ovanesyan D, Braff JL, Davidsohn N, Housden BE, Perrimon N, Weiss R, Aach J, Collins JJ, Church GM. Nat. Methods, 12(4): 326-8

Chelur DS, Chalfie M. 2007. ‘Targeted cell killing by reconstituted caspases.’ Proc. Natl. Acad. Sci. U.S.A., 104(7): 2283-8

Colella P, Ronzitti G, Mingozzi F. 2017. ‘Emerging Issues in AAV-Mediated In Vivo Gene Therapy.’ Mol Ther Methods Clin Dev., 8: 87-104

De Clercq E. 2014. ‘Current race in the development of DAAs (direct-acting antivirals) against HCV.’ Biochem. Pharmacol., 89(4): 441-52

Dixon AS, Schwinn MK, Hall MP, Zimmerman K, Otto P, Lubben TH, Butler BL, Binkowski BF, Machleidt T, Kirkland TA, Wood MG, Eggers CT, Encell LP, Wood KV. 2016. ‘NanoLuc Complementation Reporter Optimized for Accurate Measurement of Protein Interactions in Cells.’ ACS Chem. Biol., 11(2): 400-8

Eckart, M.R. M. Selby, F. Masiarz, C. Lee, K. Berger, K. Crawford, C. Kuo, G. Kuo, M. Houghton, Q.L. Choo. 1993 'The Hepatitis C Virus Encodes a Serine Protease Involved in Processing of the Putative Nonstructural Proteins from the Viral Polyprotein Precursor', Biochemical and Biophysical Research Communications, Volume 192, Issue 2, 1993, Pages 399-406

Foight GW, Wang Z, Wei CT, et al. Multi-input chemical control of protein dimerization for programming graded cellular responses. Nat Biotechnol. 2019;37(10):1209-1216. doi:10.1038/s41587-019-0242-8

Gargett T, Brown MP. 2014. 'The inducible caspase-9 suicide gene system as a "safety switch" to limit on-target, off-tumor toxicities of chimeric antigen receptor T cells.' Front Pharmacol., 5:235.

Gilbreth, R. N., B. M. Chacko, L. Grinberg, J. S. Swers, and M. Baca. 2014. 'Stabilization of the third fibronectin type III domain of human tenascin-C through minimal mutation and rational design', Protein Eng Des Sel, 27: 411-8.

Grakoui A, McCourt DW, Wychowski C, Feinstone SM, Rice CM. 1993 'Characterization of the hepatitis C virus-encoded serine proteinase: determination of proteinase-dependent polyprotein cleavage sites.' J Virol., 67(5):2832-2843

Hijikata M, Mizushima H, Akagi T, et al. 1993 'Two distinct proteinase activities required for the processing of a putative nonstructural precursor protein of hepatitis C virus.' J Virol. 67(8): 4665-4675.

Hill, Z. B., A. J. Martinko, D. P. Nguyen, and J. A. Wells. 2018. 'Human antibody-based chemically induced dimerizers for cell therapeutic applications', Nat Chem Biol, 14: 112-17.

Kotterman MA & Schaffer DV. 2014. ‘Engineering adeno-associated viruses for clinical gene therapy.’ Nat. Rev. Genet. 15(7): 445-51.

Leahy, D. J., W. A. Hendrickson, I. Aukhil, and H. P. Erickson. 1992. 'Structure of a fibronectin type III domain from tenascin phased by MAD analysis of the selenomethionyl protein', Science, 258: 987-91.

Li J, Abel R, Zhu K, Cao Y, Zhao S, Friesner RA. The VSGB 2.0 model: a next generation energy model for high resolution protein structure modeling. Proteins. 2011;79(10):2794-2812.

Li, Kui, Eileen Foy, Josephine C. Ferreon, Mitsuyasu Nakamura, Allan C. M. Ferreon, Masanori Ikeda, Stuart C. Ray, Michael Gale, and Stanley M. Lemon. 2005. 'Immune evasion by hepatitis C virus NS3/4A protease-mediated cleavage of the Toll-like receptor 3 adaptor protein TRIF', Proceedings of the National Academy of Sciences of the United States of America, 102: 2992-97.

Li, Xiao-Dong, Lijun Sun, Rashu B. Seth, Gabriel Pineda, and Zhijian J. Chen. 2005. 'Hepatitis C virus protease NS3/4A cleaves mitochondrial antiviral signaling protein off the mitochondria to evade innate immunity', Proceedings of the National Academy of Sciences of the United States of America, 102: 17717-22.

Lv Z, Chu Y, Wang Y. 2015. ‘HIV protease inhibitors: a review of molecular selectivity and toxicity.’ HIV AIDS (Auckl)., 7: 95-104

Moraca, F., Negri, A., de Oliveira, C. & Abel, R. Application of Free Energy Perturbation (FEP+) to Understanding Ligand Selectivity: A Case Study to Assess Selectivity Between Pairs of Phosphodiesterases (PDE's). J Chem Inf Model 59, 2729-2740 (2019).

Naso MF, Tomkowicz B, Perry WL 3rd, Strohl WR. 2017 ‘Adeno-Associated Virus (AAV) as a Vector for Gene Therapy.’ BioDrugs, 31(4): 317-334

Oganesyan, V., A. Ferguson, L. Grinberg, L. Wang, S. Phipps, B. Chacko, S. Drabic, T. Thisted, and M. Baca. 2013. 'Fibronectin type III domains engineered to bind CD40L: cloning, expression, purification, crystallization and preliminary X-ray diffraction analysis of two complexes', Acta Crystallogr Sect F Struct Biol Cryst Commun, 69: 1045-8.

Osbourn, J. K., A. Field, J. Wilton, E. Derbyshire, J. C. Earnshaw, P. T. Jones, D. Allen, and J. McCafferty. 1996. 'Generation of a panel of related human scFv antibodies with high affinities for human CEA', Immunotechnology, 2: 181-96.

Patick AK, Potts KE. 1998. ‘Protease inhibitors as antiviral agents.’ Clin. Microbiol. Rev., 11(4): 614-27

Pomerantz JL, Sharp PA, Pabo CO. 1995. ‘Structure-based design of transcription factors.’ Science. 267(5194): 93-6

Sabariegos, Rosario, Fernando Picazo, Beatriz Domingo, Sandra Franco, Miguel-Angel Martinez, and Juan Llopis. 2009. 'Fluorescence Resonance Energy Transfer-Based Assay for Characterization of Hepatitis C Virus NS3-4A Protease Activity in Live Cells', Antimicrobial Agents and Chemotherapy, 53: 728-34.

Sabers, C. J., M. M. Martin, G. J. Brunn, J. M. Williams, F. J. Dumont, G. Wiederrecht, and R. T. Abraham. 1995. 'Isolation of a protein target of the FKBP12-rapamycin complex in mammalian cells', J Biol Chem, 270: 815-22.

Sadelain M, Brentjens R, Riviere I. 2013 ‘The basic principles of chimeric antigen receptor design.’ Cancer Discov., 3(4): 388-98

Sastry, G.M., Adzhigirey, M., Day, T., Annabhimoju, R. & Sherman, W. Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J Comput Aided Mol Des 27, 221-234 (2013).

Smith-Garvin, J. E., G. A. Koretzky, and M. S. Jordan. 2009. 'T cell activation', Annu Rev Immunol, 27: 591-619.

Srivastava A. 2016. ‘In vivo tissue-tropism of adeno-associated viral vectors.’ Curr. Opin. Virol. 21: 75-80

Stanton, B. Z., E. J. Chory, and G. R. Crabtree. 2018. 'Chemically induced proximity in biology and medicine', Science, 359.

Stempniak M, Hostomska Z, Nodes BR, Hostomsky Z. 1997 'The NS3 proteinase domain of hepatitis C virus is a zinc-containing enzyme.' J Virol., 71(4):2881-2886.

Swers, J. S., L. Grinberg, L. Wang, H. Feng, K. Lekstrom, R. Carrasco, Z. Xiao, I. Inigo, C. C. Leow, H. Wu, D. A. Tice, and M. Baca. 2013. 'Multivalent scaffold proteins as superagonists of TRAIL receptor 2-induced apoptosis', Mol Cancer Ther, 12: 1235-44.

Vaughan, T. J., A. J. Williams, K. Pritchard, J. K. Osbourn, A. R. Pope, J. C. Earnshaw, J. McCafferty, R. A. Hodits, J. Wilton, and K. S. Johnson. 1996. 'Human antibodies with sub-nanomolar affinities isolated from a large non-immunized phage display library', Nat Biotechnol, 14: 309-14.

Wu, C. Y., K. T. Roybal, E. M. Puchner, J. Onuffer, and W. A. Lim. 2015. 'Remote control of therapeutic T cells through a small molecule-gated chimeric receptor', Science, 350: aab4077.

For standard molecular biology techniques, see Sambrook, J., Russell, D.W. Molecular Cloning, A Laboratory Manual. 3rd ed. 2001, Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press.

Sequences

Upper case sequences are protein sequences. Lower case sequences are nucleic acid sequences.
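As a consistency check on the listing that follows, the relationship between the full-length, loop and nucleic acid sequences can be sketched in code. The example below confirms that the BC, DE and FG loops of PRGRZ103 (SEQ ID NOs: 14, 15 and 16) occur in that order within SEQ ID NO: 3, and translates the first 13 codons of the PRGRZ103 coding sequence (SEQ ID NO: 68) using the standard genetic code. The function names are hypothetical and the sketch is illustrative only.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order at each position.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides):
    """Translate a lower-case DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(nucleotides) - len(nucleotides) % 3
    return "".join(CODON_TABLE[nucleotides[i:i + 3]] for i in range(0, usable, 3))


def loop_positions(scaffold, loops):
    """Return the start index of each loop in the scaffold (-1 if absent)."""
    return {name: scaffold.find(seq) for name, seq in loops.items()}


# SEQ ID NO: 3 (PRGRZ103), joined across the line break in the listing.
PRGRZ103 = (
    "RLDAPSQIEVKDVTDTTALITWTDPSYDIDWFELTYGIKDVPGDRTTIKLDGWLLYYSIGNLK"
    "PDTEYEVSLISDTYVRYSNPAKITFKTGL"
)
# SEQ ID NOs: 14, 15 and 16 (BC, DE and FG loops of PRGRZ103).
PRGRZ103_LOOPS = {"BC": "TDPSYDIDW", "DE": "DGWLLY", "FG": "DTYVRYSNP"}
# First 39 nucleotides of SEQ ID NO: 68 (PRGRZ103 coding sequence).
PRGRZ103_DNA_PREFIX = "agactggacgcccctagccagatcgaagtgaaggacgtg"

positions = loop_positions(PRGRZ103, PRGRZ103_LOOPS)
translated_prefix = translate(PRGRZ103_DNA_PREFIX)
print(positions)
print(translated_prefix, PRGRZ103.startswith(translated_prefix))
```

The same check could be applied to the other PRGRZ molecules by substituting the corresponding full-length and loop sequences.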

SEQ ID NO: 1 - PRGRZ093

RLDAPSQIEVKDVTDTTALITWTDPYYVIYSFELTYGIKDVPGDRTTIKLDSDVLYYSIGNLKP

DTEYEVSLISNTDGLRVRRASNPAKITFKTGL

SEQ ID NO: 2 - PRGRZ094

RLDAPSQIEVKDVTDTTALITWLDPDTYIYSFELTYGIKDVPGDRTTIKLNWGVNYYSIGNLKP

DTEYEVSLISGYRDTYDDSWSNPAKITFKTGL

SEQ ID NO: 3 - PRGRZ103

RLDAPSQIEVKDVTDTTALITWTDPSYDIDWFELTYGIKDVPGDRTTIKLDGWLLYYSIGNLK

PDTEYEVSLISDTYVRYSNPAKITFKTGL

SEQ ID NO: 4 - PRGRZ112

RLDAPSQIEVKDVTDTTALITWTPPDLWYYDIPSFELTYGIKDVPGDRTTIKLWNDDSYYSIG

NLKPDTEYEVSLISGYGGYDRSVDSNPAKITFKTGL

SEQ ID NO: 5 - PRGRZ114

RLDAPSQIEVKDVTDTTALITWTDPTYRIDDFELTYGIKDVPGDRTTIKLSGDYLYYSIGNLKP

DTEYEVSLISRPYSRSSWLVSNPAKITFKTGL

SEQ ID NO: 6 - PRGRZ115

RLDAPSQIEVKDVTDTTALITWTDPDDYIDYFELTYGIKDVPGDRTTIKLWGDYYYYSIGNLKL

DTEYEVSLISRTSYDNRRWSSNPAKITFKTGL

SEQ ID NO: 7 - PRSIM23

RLDAPSQIEVKDVTDTTALITWVDPRYDDIWWFELTYGIKDVPGDRTTIKLYLNDPYYSIGNL

KPDTEYEVSLISYTGDSYSRSGSNPAKITFKTGL

SEQ ID NO: 8 - PRGRZ093 BC Loop

TDPYYVIYS

SEQ ID NO: 9 - PRGRZ093 DE Loop

DSDVLY

SEQ ID NO: 10 - PRGRZ093 FG Loop

NTDGLRVRRASNP

SEQ ID NO: 11 - PRGRZ094 BC Loop

LDPDTYIYS

SEQ ID NO: 12 - PRGRZ094 DE Loop

NWGVNY

SEQ ID NO: 13 - PRGRZ094 FG Loop

GYRDTYDDSWSNP

SEQ ID NO: 14 - PRGRZ103 BC Loop

TDPSYDIDW

SEQ ID NO: 15 - PRGRZ103 DE Loop

DGWLLY

SEQ ID NO: 16 - PRGRZ103 FG Loop

DTYVRYSNP

SEQ ID NO: 17 - PRGRZ112 BC Loop

TPPDLWYYDIPS

SEQ ID NO: 18 - PRGRZ112 DE Loop

WNDDSY

SEQ ID NO: 19 - PRGRZ112 FG Loop

GYGGYDRSVDSNP

SEQ ID NO: 20 - PRGRZ114 BC Loop

TDPTYRIDD

SEQ ID NO: 21 - PRGRZ114 DE Loop

SGDYLY

SEQ ID NO: 22 - PRGRZ114 FG Loop

RPYSRSSWLVSNP

SEQ ID NO: 23 - PRGRZ115 BC Loop

TDPDDYIDY

SEQ ID NO: 24 - PRGRZ115 DE Loop

WGDYYY

SEQ ID NO: 25 - PRGRZ115 FG Loop

RTSYDNRRWSSNP

SEQ ID NO: 26 - HCV NS3/4A Protease

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPAGHAVGIFRAAVSTRGVAKAV

DFIPVESLETTMRSP

SEQ ID NO: 27 - HCV NS3/4A Protease S154A

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLKGSAGGPLLCPAGHAVGIFRAAVSTRGVAKAV

DFIPVESLETTMRSP

SEQ ID NO: 28 - HCV NS3/4A Protease K151D

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLDGSSGGPLLCPAGHAVGIFRAAVSTRGVAKAV

DFIPVESLETTMRSP

SEQ ID NO: 29 - HCV NS3/4A Protease K151D/S154A

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLDGSAGGPLLCPAGHAVGIFRAAVSTRGVAKAV

DFIPVESLETTMRSP

SEQ ID NO: 30 - HCV NS3/4A Protease D183E

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPAGHAVGIFRAAVSTRGVAKAV

EFIPVESLETTMRSP

SEQ ID NO: 31 - HCV NS3/4A Protease S154A/D183E

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLKGSAGGPLLCPAGHAVGIFRAAVSTRGVAKAV

EFIPVESLETTMRSP

SEQ ID NO: 32 - HCV NS3/4A Protease K151D/D183E

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLDGSSGGPLLCPAGHAVGIFRAAVSTRGVAKAV

EFIPVESLETTMRSP

SEQ ID NO: 33 - HCV NS3/4A Protease K151D/S154A/D183E

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLDGSAGGPLLCPAGHAVGIFRAAVSTRGVAKAV

EFIPVESLETTMRSP

SEQ ID NO: 34 - Full-Length HCV NS3

APITAYAQQTRGEEGCQETSLTGRDKNQVEGEVQIVSTAAQTFLATSINGVCWTVYHGAGT

RTIASPKGPVIQMYTNVDQDLVGWPAPQGSRSLTPCTCGSSDLYLVTRHADVIPVRRRGDS

RGSLLSPRPISYLKGSSGGPLLCPAGHAVGIFRAAVCTRGVAKAVDFIPVENLETTMRSPVF

TDNSSPPVVPQSFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSK

AHGIDPNIRTGVRTITTGSPITYSTYGKFLADGGCSGGAYDIIICDECHSTDATSILGIGTVLDQ

AETAGARLVVLATATPPGSVTVPHPNIEEVALSTTGEIPFYGKAIPLEVIKGGRHLIFCHSKKK

CDELAAKLVALGINAVAYYRGLDVSVIPTSGDVWVATDALMTGYTGDFDSVIDCNTCVTQT

VDFSLDPTFTIETITLPQDAVSRTQRRGRTGRGKPGIYRFVAPGERPSGMFDSSVLCECYD

AGCAWYELTPAETTVRLRAYMNTPGLPVCQDHLEFWEGVFTGLTHIDAHFLSQTKQSGEN

LPYLVAYQATVCARAQAPPPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQNEITLTHPVTK

YIMTCMSADLEVVT

SEQ ID NO: 35 - Caspase-9 Activation Domain

AYILSMEPCGHCLIINNVNFCRESGLRTRTGSNIDCEKLRRRFSSLHFMVEVKGDLTAKKMV

LALLELARQDHGALDCCVVVILSHGCQASHLQFPGAVYGTDGCPVSVEKIVNIFNGTSCPSL

GGKPKLFFIQACGGEQKDHGFEVASTSPEDESPGSNPEPDATPFQEGLRTFDQLDAISSLP

TPSDIFVSYSTFPGFVSWRDPKSGSWYVETLDDIFEQWAHSEDLQSLLLRVANAVSVKGIY

KQMPGCFNFLRKKLFFK

SEQ ID NO: 36 - Wild Type Tn3

RLDAPSQIEVKDVTDTTALITWFKPLAEIDGIELTYGIKDVPGDRTTIDLTEDENQYSIGNLKP

DTEYEVSLISRRGDMSSNPAKETFTTGL

SEQ ID NO: 37 - Wild Type Tn3 with Stabilising Mutations

RLDAPSQIEVKDVTDTTALITWFKPLAEIDGFELTYGIKDVPGDRTTIKLTEDENQYSIGNLKP

DTEYEVSLISRRGDMSSNPAKITFKTGL

SEQ ID NO: 38 - Linker

GGGS

SEQ ID NO: 39 - Linker

GGGGS

SEQ ID NO: 40 - Linker

GGSG

SEQ ID NO: 41 - Linker

GSGG

SEQ ID NO: 42 - Linker

SSGG

SEQ ID NO: 43 - Linker

SSSG

SEQ ID NO: 44 - Linker

TGGGGSGGGGS

SEQ ID NO: 45 - NS3/4A PR S154A-VPR Fusion

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT

SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL

VTRHADVIPVRRRGDSRGSLLSPRPISYLKGSAGGPLLCPAGHAVGIFRAAVSTRGVAKAV

DFIPVESLETTMRSPTGGGGSGGGGSEASGSGRADALDDFDLDMLGSDALDDFDLDMLG

SDALDDFDLDMLGSDALDDFDLDMLINSRSSGSPKKKRKVGSQYLPDTDDRHRIEEKRKRT

YETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMV

FPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKP

TQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTT

EPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGS

GSRDSREGMFLPKPEAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTG

PVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAI

CGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGL

SIFDTSLF

SEQ ID NO: 46 - ZFHD1 DBD-NS3/4A PR S154A Fusion

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD

HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF

CNRRQKEKRINTSAGSMKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQV

EGEVQIVSTATQTFLATSINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQG

SRSLTPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSAGGPLLCPAGHAV

GIFRAAVSTRGVAKAVDFIPVESLETTMRSP

SEQ ID NO: 47 - PRGRZ093-p65 AD Fusion

RLDAPSQIEVKDVTDTTALITWTDPYYVIYSFELTYGIKDVPGDRTTIKLDSDVLYYSIGNLKP

DTEYEVSLISNTDGLRVRRASNPAKITFKTGLTGGGGSGGGGSDEFPTMVFPSGQISQASA

LAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSE

ALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAI

TRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISSTSY

SEQ ID NO: 48 - PRGRZ094-p65 AD Fusion

RLDAPSQIEVKDVTDTTALITWLDPDTYIYSFELTYGIKDVPGDRTTIKLNWGVNYYSIGNLKP

DTEYEVSLISGYRDTYDDSWSNPAKITFKTGLTGGGGSGGGGSDEFPTMVFPSGQISQAS

ALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLS

EALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPE

AITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISSTSY

SEQ ID NO: 49 - PRGRZ103-p65 AD Fusion

RLDAPSQIEVKDVTDTTALITWTDPSYDIDWFELTYGIKDVPGDRTTIKLDGWLLYYSIGNLK

PDTEYEVSLISDTYVRYSNPAKITFKTGLTGGGGSGGGGSDEFPTMVFPSGQISQASALAP

APPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALL

QLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRL

VTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISSTSY

SEQ ID NO: 50 - PRGRZ112-p65 AD Fusion

RLDAPSQIEVKDVTDTTALITWTPPDLWYYDIPSFELTYGIKDVPGDRTTIKLWNDDSYYSIG NLKPDTEYEVSLISGYGGYDRSVDSNPAKITFKTGLTGGGGSGGGGSDEFPTMVFPSGQIS QASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEG TLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLME

YPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISSTSY

SEQ ID NO: 51 - PRGRZ114-p65 AD Fusion

RLDAPSQIEVKDVTDTTALITWTDPTYRIDDFELTYGIKDVPGDRTTIKLSGDYLYYSIGNLKP DTEYEVSLISRPYSRSSWLVSNPAKITFKTGLTGGGGSGGGGSDEFPTMVFPSGQISQASA LAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSE ALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAI

TRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISSTSY

SEQ ID NO: 52 - PRGRZ115-p65 AD Fusion

RLDAPSQIEVKDVTDTTALITWTDPDDYIDYFELTYGIKDVPGDRTTIKLWGDYYYYSIGNLKL DTEYEVSLISRTSYDNRRWSSNPAKITFKTGLTGGGGSGGGGSDEFPTMVFPSGQISQASA LAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSE ALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAI

TRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISSTSY

SEQ ID NO: 53 - NS3/4A PR S154A-p65 AD Fusion

MKKKGSVVIVGRINLSGDTAYAQQTRGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLAT SINGVLWTVYHGAGTRTIASPKGPVTQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYL VTRHADVIPVRRRGDSRGSLLSPRPISYLKGSAGGPLLCPAGHAVGIFRAAVSTRGVAKAV DFIPVESLETTMRSPTGGGGSGGGGSDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPA

PAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLG NSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPA PLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISSTSY

SEQ ID NO: 54 - ZFHD1 DBD-PRGRZ093 Fusion

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF CNRRQKEKRINTSAGSRLDAPSQIEVKDVTDTTALITWTDPYYVIYSFELTYGIKDVPGDRTT IKLDSDVLYYSIGNLKPDTEYEVSLISNTDGLRVRRASNPAKITFKTGL

SEQ ID NO: 55 - ZFHD1 DBD-PRGRZ094 Fusion

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF CNRRQKEKRINTSAGSRLDAPSQIEVKDVTDTTALITWLDPDTYIYSFELTYGIKDVPGDRTTI KLNWGVNYYSIGNLKPDTEYEVSLISGYRDTYDDSWSNPAKITFKTGL

SEQ ID NO: 56 - ZFHD1 DBD-PRGRZ103 Fusion

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF CNRRQKEKRINTSAGSRLDAPSQIEVKDVTDTTALITWTDPSYDIDWFELTYGIKDVPGDRT TIKLDGWLLYYSIGNLKPDTEYEVSLISDTYVRYSNPAKITFKTGL

SEQ ID NO: 57 - ZFHD1 DBD-PRGRZ112 Fusion

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF CNRRQKEKRINTSAGSRLDAPSQIEVKDVTDTTALITWTPPDLWYYDIPSFELTYGIKDVPG DRTTIKLWNDDSYYSIGNLKPDTEYEVSLISGYGGYDRSVDSNPAKITFKTGL

SEQ ID NO: 58 - ZFHD1 DBD-PRGRZ114 Fusion

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF CNRRQKEKRINTSAGSRLDAPSQIEVKDVTDTTALITWTDPTYRIDDFELTYGIKDVPGDRTT IKLSGDYLYYSIGNLKPDTEYEVSLISRPYSRSSWLVSNPAKITFKTGL

SEQ ID NO: 59 - ZFHD1 DBD-PRGRZ115 Fusion

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF CNRRQKEKRINTSAGSRLDAPSQIEVKDVTDTTALITWTDPDDYIDYFELTYGIKDVPGDRTT IKLWGDYYYYSIGNLKLDTEYEVSLISRTSYDNRRWSSNPAKITFKTGL

SEQ ID NO: 60 - Linker

SGGG

SEQ ID NO: 61 - Linker

GGGGSGGGGS

SEQ ID NO: 62 - CAR Signal Peptide

MLLLVTSLLLCELPHPAFLLIP

SEQ ID NO: 63 - CAR Signal Peptide

MIHLGHILFLLLLPVAAAQTTPGERSSLPAFYPGTSGSCSGCGSLSLP

SEQ ID NO: 64 - MEDI8852 Heavy Chain

QVQLQQSGPGLVKPSQTLSLTCAISGDSVSSYNAVWNWIRQSPSRGLEWLGRTYYRSGW

YNDYAESVKSRITINPDTSKNQFSLQLNSVTPEDTAVYYCARSGHITVFGVNVDAFDMWGQ

GTMVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHT FPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKRVEPKSCDKTHTCPPCP

APELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKP REEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQVYTLP

PSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVD KSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK

SEQ ID NO: 65 - MEDI8852 Light Chain

DIQMTQSPSSLSASVGDRVTITCRTSQSLSSYTHWYQQKPGKAPKLLIYAASSRGSGVPSR

FSGSGSGTDFTLTISSLQPEDFATYYCQQSRTFGQGTKVEIKRTVAAPSVFIFPPSDEQLKS

GTASVVCLLNNFYPREAKVQWKVDNALQSGNSQESVTEQDSKDSTYSLSSTLTLSKADYE

KHKVYACEVTHQGLSSPVTKSFNRGEC

SEQ ID NO: 66 - PRGRZ093

agactggacgcccctagccagatcgaagtgaaggacgtgaccgacaccaccgctctgatcacctggacagacccctactacg

tgatctacagcttcgagctgacctacggcatcaaggacgtgcccggcgatagaaccaccatcaagctggatagcgacgtgctgt

actactccatcggcaacctgaagcctgacaccgagtacgaggtgtccctgatcagcaacaccgacggcctgagagtgcggag

agcctctaatcctgccaagatcaccttcaagaccggcctg

SEQ ID NO: 67 - PRGRZ094

cgtctggatgcaccgagccagattgaagttaaagatgttaccgataccaccgcactgattacctggctggacccagacacttaca

tttactcttttgaactgacctatggcatcaaagatgttccgggtgatcgtaccaccattaaactgaactggggtgttaactactatagc

attggcaatctgaaaccggataccgaatatgaagttagcctgattagcggttaccgtgacacttacgacgactcttggagcaatcc

ggcaaaaattacctttaaaaccggtctg

SEQ ID NO: 68 - PRGRZ103

agactggacgcccctagccagatcgaagtgaaggacgtgaccgacaccaccgctctgatcacctggaccgatcctagctacg

acatcgattggttcgagctgacctacggcatcaaggacgtgcccggcgatagaaccaccatcaagctggatggctggctgctgt

actactccatcggcaacctgaagcctgacaccgagtacgaggtgtccctgatctccgatacctacgtgcggtacagcaaccccg

ccaagatcacctttaagaccggcctg

SEQ ID NO: 69 - PRGRZ112

cgtctggatgcaccgagccagattgaagttaaagatgttaccgataccaccgcactgattacctggactccgccggacctgtggt

actacgacattccgtcttttgaactgacctatggcatcaaagatgttccgggtgatcgtaccaccattaaactgtggaacgacgact

cttactatagcattggtaatctgaaaccggataccgaatatgaagttagcctgattagcggttacggtggttacgaccgttctgttga

cagcaatccggcaaaaattacctttaaaaccggtctg

SEQ ID NO: 70 - PRGRZ114

cgtctggatgcaccgagccagattgaagttaaagatgttaccgataccaccgcactgattacctggactgacccgacttaccgtat

tgacgactttgaactgacctatggcatcaaagatgttccgggtgatcgtaccaccattaaactgtctggtgactacctgtactatagc

attggtaatctgaaaccggataccgaatatgaagttagcctgattagccgtccgtactctcgttcttcttggctggttagcaatccggc

aaaaattacctttaaaaccggtctg

SEQ ID NO: 71 - PRGRZ115

cgtctggatgcaccgagccagattgaagttaaagatgttaccgataccaccgcactgattacctggactgacccggacgactac

attgactactttgaactgacctatggcatcaaagatgttccgggtgatcgtaccaccattaaactgtggggtgactactactactata

gcattggtaatctgaaactggataccgaatatgaagttagcctgattagccgtacttcttacgacaaccgtcgttggtctagcaatcc

ggcaaaaattacctttaaaaccggtctg

SEQ ID NO: 72 - p65 Activation Domain

DEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAV

APPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGI PVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFS ALLSQISSTSY

SEQ ID NO: 73 - ZFHD1 DNA Binding Domain

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF CNRRQKEKRINT

SEQ ID NO: 74 - Linker

SAGS

SEQ ID NO: 75 - Linker

GGGSG

SEQ ID NO: 76 - PRGRZ103-NS3/4A Kill Switch

MGSRLDAPSQIEVKDVTDTTALITWTDPSYDIDWFELTYGIKDVPGDRTTIKLDGWLLYYSIG

NLKPDTEYEVSLISDTYVRYSNPAKITFKTGLGGGSGMKKKGSVVIVGRINLSGDTAYAQQT

RGEEGCQETSQTGRDKNQVEGEVQIVSTATQTFLATSINGVLWTVYHGAGTRTIASPKGPV

TQMYTNVDKDLVGWQAPQGSRSLTPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRP

ISYLKGSAGGPLLCPAGHAVGIFRAAVSTRGVAKAVDFIPVESLETTMRSPGGGSGVDGFG

DVGALESLRGNADLAYILSMEPCGHCLIINNVNFCRESGLRTRTGSNIDCEKLRRRFSSLHF

MVEVKGDLTAKKMVLALLELARQDHGALDCCVVVILSHGCQASHLQFPGAVYGTDGCPVS

VEKIVNIFNGTSCPSLGGKPKLFFIQACGGEQKDHGFEVASTSPEDESPGSNPEPDATPFQ

EGLRTFDQLDAISSLPTPSDIFVSYSTFPGFVSWRDPKSGSWYVETLDDIFEQWAHSEDLQ

SLLLRVANAVSVKGIYKQMPGCFNFLRKKLFFKTS

SEQ ID NO: 77 - ZFHD1 DBD-PRGRZ103 x 2

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD

HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF

CNRRQKEKRINTSAGSRLDAPSQIEVKDVTDTTALITWTDPSYDIDWFELTYGIKDVPGDRT

TIKLDGWLLYYSIGNLKPDTEYEVSLISDTYVRYSNPAKITFKTGLRLDAPSQIEVKDVTDTTA

LITWTDPSYDIDWFELTYGIKDVPGDRTTIKLDGWLLYYSIGNLKPDTEYEVSLISDTYVRYS

NPAKITFKTGL

SEQ ID NO: 78 - ZFHD1 DBD-PRGRZ103 x 3

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSD

HLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEITMIADQLNMEKEVIRVWF

CNRRQKEKRINTSAGSRLDAPSQIEVKDVTDTTALITWTDPSYDIDWFELTYGIKDVPGDRT

TIKLDGWLLYYSIGNLKPDTEYEVSLISDTYVRYSNPAKITFKTGLRLDAPSQIEVKDVTDTTA

LITWTDPSYDIDWFELTYGIKDVPGDRTTIKLDGWLLYYSIGNLKPDTEYEVSLISDTYVRYS

NPAKITFKTGLRLDAPSQIEVKDVTDTTALITWTDPSYDIDWFELTYGIKDVPGDRTTIKLDG

WLLYYSIGNLKPDTEYEVSLISDTYVRYSNPAKITFKTGL

Claims

1 . A Tn3 protein that binds specifically to an HCV NS3/4A protease (“T”) complexed with a small molecule (“SM”) [T-SM complex], wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir, and the Tn3 protein comprises the BC, DE and FG loops of: a) PRGRZ093, set forth in SEQ ID NOs: 8, 9 and 10, respectively; b) PRGRZ094, set forth in SEQ ID NOs: 11 , 12 and 13, respectively; c) PRGRZ103, set forth in SEQ ID NOs: 14, 15 and 16, respectively; d) PRGRZ112, set forth in SEQ ID NOs: 17, 18 and 19, respectively; e) PRGRZ1 14, set forth in SEQ ID NOs: 20, 21 and 22, respectively; or f) PRGRZ115, set forth in SEQ ID NOs: 23, 24 and 25, respectively, wherein optionally the Tn3 protein comprises up to 3 sequence alterations in the BC, DE, and/or EF loop relative to the aforementioned sequences.

2. The Tn3 protein of claim 1, wherein the small molecule is grazoprevir.

3. The Tn3 protein of claim 1 or 2, which binds the T-SM complex at a higher affinity than it binds to either T or SM alone.

4. The Tn3 protein of claim 3, wherein the Tn3 protein binds to the T-SM complex with: a) at least a 10-fold higher affinity; b) at least a 50-fold higher affinity; c) at least a 100-fold higher affinity; or d) at least a 1000-fold higher affinity than the Tn3 protein binds to either the HCV NS3/4A protease alone and/or grazoprevir alone.

5. The Tn3 protein of any one of claims 1 to 4, comprising the BC, DE and FG loops of PRGRZ103, set forth in SEQ ID NOs: 14, 15 and 16, respectively, wherein optionally the Tn3 protein comprises up to 3 sequence alterations in the BC, DE, and/or EF loop relative to the aforementioned sequences.

6. The Tn3 protein of any one of claims 1 to 4, comprising an amino acid sequence having at least 90 % identity with the amino acid sequence of: a) PRGRZ093, set forth in SEQ ID NO: 1; b) PRGRZ094, set forth in SEQ ID NO: 2; c) PRGRZ103, set forth in SEQ ID NO: 3; d) PRGRZ112, set forth in SEQ ID NO: 4; e) PRGRZ114, set forth in SEQ ID NO: 5; or f) PRGRZ115, set forth in SEQ ID NO: 6.

7. The Tn3 protein of claim 6, comprising an amino acid sequence having at least 90 % identity with the amino acid sequence of PRGRZ103, set forth in SEQ ID NO: 3.

8. The Tn3 protein of claim 5, comprising the BC, DE and FG loops of PRGRZ103, set forth in SEQ ID NOs: 14, 15 and 16, respectively.

9. The Tn3 protein of claim 8, comprising the amino acid sequence of PRGRZ103, set forth in SEQ ID NO: 3.

10. The Tn3 protein of any preceding claim, which exhibits no significant binding to the HCV NS3/4A protease alone and/or grazoprevir alone.

11. The Tn3 protein of claim 10, which exhibits no binding or no detectable binding to the HCV NS3/4A protease alone and/or grazoprevir alone.

12. A BM-T fusion protein comprising the Tn3 protein (BM) of any preceding claim fused to an HCV NS3/4A protease (T).

13. The BM-T fusion protein of claim 12, wherein the HCV NS3/4A protease has an amino acid sequence having at least 90 % identity to SEQ ID NO: 26.

14. The BM-T fusion protein of claim 12 or 13, wherein the HCV NS3/4A protease has attenuated activity compared to the HCV NS3/4A protease of SEQ ID NO: 26.

15. The BM-T fusion protein of claim 14, wherein the HCV NS3/4A protease comprises one or more amino acid mutations compared to the HCV NS3/4A protease of SEQ ID NO: 26, wherein the one or more amino acid mutations attenuate the activity of the HCV NS3/4A protease.

16. The BM-T fusion protein of any one of claims 13 to 15, wherein the HCV NS3/4A protease comprises an amino acid mutation at one or more amino acid positions selected from the positions corresponding to positions 72, 96, 112, 114, 154, 160 and 164 of SEQ ID NO: 26.

17. The BM-T fusion protein of claim 16, wherein the HCV NS3/4A protease comprises an amino acid mutation at the position corresponding to position 154 of SEQ ID NO: 26, optionally wherein the amino acid mutation at said position is a mutation to an alanine.

18. The BM-T fusion protein of claim 17, wherein the HCV NS3/4A protease comprises the amino acid sequence set forth in SEQ ID NO: 27.

19. The BM-T fusion protein of any one of claims 13 to 17, wherein the HCV NS3/4A protease comprises an affinity-reducing amino acid mutation at the position corresponding to position 151 and/or 183 of SEQ ID NO: 26.

20. The BM-T fusion protein of claim 19, wherein the amino acid mutation at the position corresponding to position 151 of SEQ ID NO: 26 is a mutation to aspartic acid, asparagine, or histidine, and the amino acid mutation at the position corresponding to position 183 of SEQ ID NO: 26 is a mutation to glutamic acid, glutamine or alanine.

21. The BM-T fusion protein of claim 20, wherein the HCV NS3/4A protease comprises an aspartate at the position corresponding to position 151 of SEQ ID NO: 26 and/or a glutamate at the position corresponding to position 183 of SEQ ID NO: 26.

22. The BM-T fusion protein of claim 21, wherein the HCV NS3/4A protease comprises the amino acid sequence set forth in any one of SEQ ID NOs: 28-33.

23. The BM-T fusion protein of any of claims 12 to 22, further comprising at least one additional domain.

24. The BM-T fusion protein of claim 23, wherein the at least one additional domain comprises a caspase component.

25. The BM-T fusion protein of claim 24, wherein the caspase component comprises a caspase-9 activation domain.

26. The BM-T fusion protein of claim 25, wherein the caspase-9 activation domain comprises an amino acid sequence having at least 90 % identity to SEQ ID NO: 35.

27. A dimerization-inducible protein, comprising: a) a first fusion protein, comprising a first component polypeptide fused to the Tn3 protein of any one of claims 1 to 11; and b) a second fusion protein, comprising a second component polypeptide fused to the HCV NS3/4A protease defined in any one of claims 12-22.

28. The dimerization-inducible protein of claim 27, wherein: a) the first component polypeptide comprises a DNA binding domain (DBD) and the second component polypeptide comprises a transcriptional regulatory domain (TRD); or b) the first component polypeptide comprises a transcriptional regulatory domain (TRD) and the second component polypeptide comprises a DNA binding domain (DBD); and wherein, in the presence of grazoprevir, the dimerization-inducible protein forms a transcription factor.

29. The dimerization-inducible protein of claim 27, wherein the first component polypeptide comprises a first co-stimulatory domain and the second component polypeptide comprises an intracellular signalling domain.

30. The dimerization-inducible protein of claim 29, wherein the first component polypeptide further comprises an antigen-specific recognition domain and a transmembrane domain and the second component polypeptide comprises a transmembrane domain and a second costimulatory domain, wherein, in the presence of grazoprevir, the dimerization-inducible protein forms a chimeric-antigen receptor (CAR).

31. The dimerization-inducible protein of claim 27, wherein the first component polypeptide comprises an intracellular signalling domain and the second component polypeptide comprises a first co-stimulatory domain.

32. The dimerization-inducible protein of claim 31, wherein the first component polypeptide comprises a transmembrane domain and a second costimulatory domain, and the second component polypeptide further comprises an antigen-specific recognition domain and a transmembrane domain, wherein, in the presence of grazoprevir, the dimerization-inducible protein forms a chimeric-antigen receptor (CAR).

33. The dimerization-inducible protein of claim 27, wherein each of the first and second component polypeptides comprises a caspase component, and the first and second component polypeptides form a caspase upon dimerization.

34. The dimerization-inducible protein of claim 33, wherein the caspase components each comprise a caspase-9 activation domain.

35. The dimerization-inducible protein of claim 34, wherein the caspase-9 activation domain comprises an amino acid sequence having at least 90 % identity to SEQ ID NO: 35.

36. The dimerization-inducible protein of any one of claims 33 to 35, wherein the first and second fusion proteins are BM-T fusion proteins of any one of claims 24 to 26.

37. The dimerization-inducible protein of claim 36, wherein the first and second fusion proteins are identical.

38. A cell expressing the Tn3 protein of any one of claims 1 to 11, the BM-T fusion protein of any one of claims 12 to 26 or the dimerization-inducible protein of any one of claims 27 to 37.

39. A nucleic acid encoding a polypeptide comprising the Tn3 protein of any one of claims 1 to 11, the BM-T fusion protein of any one of claims 12 to 26 or the dimerization-inducible protein of any one of claims 27 to 37.

40. The nucleic acid of claim 39, comprising an expression cassette encoding the polypeptide.

41. An expression vector comprising an expression cassette encoding a polypeptide comprising the Tn3 protein of any one of claims 1 to 11 or the BM-T fusion protein of any of claims 12 to 26.

42. The nucleic acid of claim 39 or 40, or the expression vector of claim 41, wherein the polypeptide further comprises a first component as defined in any one of claims 27 to 35 or a second component as defined in any one of claims 27 to 35.

43. An expression vector comprising an expression cassette encoding: a) a first polypeptide comprising the Tn3 protein of any of claims 1 to 11; and b) a second polypeptide comprising the HCV NS3/4A protease as defined in any one of claims 12 to 22.

44. An expression vector comprising: a) a first expression cassette encoding a first polypeptide comprising the Tn3 protein of any of claims 1 to 11; and b) a second expression cassette encoding a second polypeptide comprising the HCV NS3/4A protease as defined in any one of claims 12 to 22.

45. A vector set comprising a first expression vector and a second expression vector, wherein the first expression vector comprises a first expression cassette and the second expression vector comprises a second expression cassette, and: a) the first expression cassette encodes a first polypeptide comprising the Tn3 protein of any of claims 1 to 11; and b) the second expression cassette encodes a second polypeptide comprising the HCV NS3/4A protease as defined in any one of claims 12 to 22.

46. One or more expression vectors comprising: a) a first expression cassette encoding a first polypeptide comprising the Tn3 protein of any of claims 1 to 11; and b) a second expression cassette encoding a second polypeptide comprising the HCV NS3/4A protease as defined in any one of claims 12 to 22.

47. The expression vector of claim 43 or 44, the vector set of claim 45, or the one or more expression vectors of claim 46 wherein: a) the first polypeptide further comprises a first component polypeptide as defined in any one of claims 27 to 35; and b) the second polypeptide further comprises a second component polypeptide as defined in any one of claims 27 to 35.

48. The expression vector, vector set or one or more expression vectors of any one of claims 43 to 47, wherein the first and/or second polypeptide comprises a BM-T fusion protein as defined in any one of claims 12 to 26.

49. The expression vector, vector set or one or more expression vectors of any one of claims 41 to 48, wherein the expression vector is a plasmid, each vector of the vector set is a plasmid or each of the one or more expression vectors is a plasmid.

50. The expression vector, vector set or one or more expression vectors of any one of claims 41 to 49, wherein the expression vector is a viral vector, each vector of the vector set is a viral vector or each of the one or more expression vectors is a viral vector.

51. The expression vector, vector set or one or more expression vectors of claim 50, wherein the viral vector is selected from the list consisting of adeno-associated virus (AAV) vectors, adenovirus vectors, herpes simplex virus vectors, retrovirus vectors, lentivirus vectors, alphavirus vectors, flavivirus vectors, rhabdovirus vectors, measles virus vectors, Newcastle disease virus vectors, poxvirus vectors and picornavirus vectors.

52. The expression vector, vector set or one or more expression vectors of claim 51, wherein the viral vector is an AAV vector.

53. A viral particle comprising a viral genome comprising: a) an expression cassette as defined in any one of claims 40 to 43, 47 or 48; or b) a first expression cassette and a second expression cassette, each as defined in any one of claims 44 to 48.

54. A set of viral particles comprising: a) a first viral particle comprising a first viral genome, wherein the first viral genome comprises a first expression cassette as defined in any one of claims 44 to 48; and b) a second viral particle comprising a second viral genome, wherein the second viral genome comprises a second expression cassette as defined in any one of claims 44 to 48.

55. One or more viral particles comprising one or more viral genomes comprising: a) a first expression cassette as defined in any one of claims 44 to 48; and b) a second expression cassette as defined in any one of claims 44 to 48.

56. The viral particle of claim 53, set of viral particles of claim 54 or one or more viral particles of claim 55, wherein the viral particle, each viral particle of the set of viral particles or each of the one or more viral particles is an adeno-associated virus (AAV) particle, adenovirus particle, herpes simplex virus particle, retrovirus particle, lentivirus particle, alphavirus particle, flavivirus particle, rhabdovirus particle, measles virus particle, Newcastle disease virus particle, poxvirus particle or picornavirus particle.

57. The viral particle, set of viral particles or one or more viral particles of claim 56, wherein the viral particle is an AAV particle.

58. A cell comprising the nucleic acid of claim 39, 40 or 42 or the expression vector, vector set or one or more expression vectors of any one of claims 41 to 52.

59. The cell of claim 38 or 58, wherein the cell is an immune cell or a stem cell.

60. The cell of claim 59, wherein the cell is an immune cell expressing a dimerization-inducible protein as defined in any one of claims 29 to 32.

61. The cell of claim 59, wherein the cell is an immune cell expressing a dimerization-inducible protein as defined in any one of claims 33 to 37.

62. The cell of claim 61, wherein the immune cell further expresses an antigen receptor, optionally a TCR or CAR.

63. The cell of any one of claims 59 to 62, wherein the immune cell is a T cell or an NK cell.

64. The expression vector, vector set or one or more expression vectors of any one of claims 41 to 52, for use in therapy.

65. The viral particle, set of viral particles or one or more viral particles of any one of claims 53 to 57, for use in therapy.

66. The cell of any one of claims 38 or 58 to 63, for use in therapy.

67. The cell for use according to claim 66, wherein the cell is an immune cell as defined in claim 60, and the therapy comprises administering the immune cell and a small molecule to the subject, wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir.

68. A method of treatment comprising administering to a subject in need thereof an immune cell expressing the dimerization-inducible protein of any one of claims 29 to 32 and a small molecule, wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir.

69. A small molecule for use in a method of inducing death of a target cell in a subject, wherein the target cell expresses the dimerization-inducible protein of any one of claims 33 to 37, and the method comprises administering the small molecule to the subject, wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir.

70. A method of inducing death of a target cell in a subject, wherein the target cell expresses the dimerization-inducible protein of any one of claims 33 to 37, the method comprising administering a small molecule to the subject, wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir.

71. A small molecule for use in a method of regulating the expression of a target gene in a cell in a subject, wherein the cell expresses the dimerization-inducible protein of claim 28, and the DBD binds to a target sequence in the cell such that the transcription factor is capable of regulating the expression of the target gene, and the method comprises administering the small molecule to the subject, wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir.

72. A method of regulating the expression of a target gene in a cell, wherein the cell expresses the dimerization-inducible protein of claim 28, and the DBD binds to a target sequence in the cell such that the transcription factor is capable of regulating the expression of the target gene, and the method comprises contacting the cell with a small molecule, wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir.

73. A method of regulating the expression of a target gene in a cell in a subject, wherein the cell expresses the dimerization-inducible protein of claim 28, and the DBD binds to a target sequence in the cell such that the transcription factor is capable of regulating the expression of the target gene, and the method comprises administering a small molecule to the subject, wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir.

74. The cell for use according to claim 67, the small molecule for use according to claim 69 or 71, or the method of any one of claims 68, 70, 72 or 73, wherein the small molecule is grazoprevir.

75. A kit comprising a small molecule and: a) the expression vector, vector set or one or more vectors of any one of claims 41 to 52; b) the viral particle, set of viral particles or one or more viral particles of any one of claims 53 to 57; c) the cell of claim 38 or 58 to 63; or d) the nucleic acid of claim 39 or 40; wherein the small molecule is grazoprevir or glecaprevir, or an analogue or derivative of grazoprevir or glecaprevir.

76. The kit of claim 75, wherein the small molecule is grazoprevir.

77. An expression vector comprising a nucleotide sequence encoding a polypeptide having the amino acid sequence set forth in SEQ ID NO: 76.