WO2025255157A1

WO2025255157A1 - Systems, methods, and compositions for protein screening

Info

Publication number: WO2025255157A1
Application number: PCT/US2025/032126
Authority: WO
Inventors: David A. Weitz; Wentao Xu; Anqi Chen
Original assignee: Harvard University
Current assignee: Harvard University
Priority date: 2024-06-04
Filing date: 2025-06-03
Publication date: 2025-12-11
Anticipated expiration: 2026-12-04

Abstract

Systems, methods, and compositions for improved identification and study of active proteins are provided. In addition, systems and methods for improved synthesis and determination of useful antibodies are also generally provided. In some aspects, the disclosure relates to methods of expressing and determining active enzymes. In addition, in some aspects, the disclosure relates to substrates simultaneously bound to antibodies and to nucleic acids expressing those antibodies. Substrates may be used to physically link expressed proteins to the nucleic acid used to express them, which may be associated with qualitative and quantitative improvements in the determination of functionally active proteins.

Description

SYSTEMS, METHODS, AND COMPOSITIONS FOR PROTEIN SCREENING

RELATED APPLICATIONS

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 63/655,779, filed June 4, 2024, and entitled “SYSTEMS, METHODS, AND COMPOSITIONS FOR PROTEIN SCREENING,” and to U.S. Provisional Application No. 63/655,849, filed June 4, 2024, and entitled “SYSTEMS AND METHODS FOR DETERMINING ANTIBODIES,” which are incorporated herein by reference in their entirety for all purposes.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The content of the electronic sequence listing (H049870809WO00-SEQ-TC.xml; Size: 43,289 bytes; and Date of Creation: June 2, 2025) is herein incorporated by reference in its entirety.

TECHNICAL FIELD

Determination of active proteins and, in particular, of active enzymes using improved compositions, systems, and methods is generally described. Compositions, systems, and methods for identifying potent antibodies are also generally described.

BACKGROUND

Determining proteins with improved functional performance is a widespread research goal with important applications for biomedicine. Improved methods of determining and developing protein activity, including the activity of enzymes, are generally desirable. Various methods of determining potent antibodies exist, but further improvements to speed, sensitivity, and accuracy of the methods would be advantageous.

SUMMARY

Systems, methods, and compositions for improved identification and study of active proteins are provided. In addition, systems and methods for improved synthesis and determination of useful antibodies are also generally provided.

In some aspects, the disclosure relates to methods of expressing and determining active enzymes. In addition, in some aspects, the disclosure relates to substrates simultaneously bound to antibodies and to nucleic acids expressing those antibodies. Substrates may be used to physically link expressed proteins to the nucleic acid used to express them, which may be associated with qualitative and quantitative improvements in the determination of functionally active proteins. The subject matter of the present disclosure involves, in some cases, interrelated products, alternative solutions to a particular problem, and/or a plurality of different uses of one or more systems and/or articles.

Systems and methods for improved synthesis and determination of useful antibodies are generally provided. In some aspects, the disclosure relates to substrates simultaneously bound to antibodies and to nucleic acids expressing those antibodies. The substrate-bound antibodies may be screened for activity, and nucleic acids producing highly active antibodies may, in some cases, be determined and/or further mutated to produce improved antibodies. The subject matter of the present disclosure involves, in some cases, interrelated products, alternative solutions to a particular problem, and/or a plurality of different uses of one or more systems and/or articles.

In one aspect, a solution is provided. According to some embodiments, the solution comprises: a fluid; a structured species suspended in the fluid; a nucleic acid chemically bound to the structured species; and an enzyme chemically bound to the structured species, wherein the enzyme is encoded by the nucleic acid.

In another aspect, a solution is provided. According to some embodiments, the solution comprises: a fluid; a substrate suspended in the fluid; a nucleic acid chemically bound to the substrate; and an enzyme chemically bound to the substrate, wherein the enzyme is encoded by the nucleic acid.

In still another aspect, a method of making an enzyme is provided. According to some embodiments, the method comprises: transcribing and translating a nucleic acid chemically bound to a structured species to produce an enzyme and chemically binding the enzyme to the structured species.

In yet another aspect, a method of making an enzyme is provided. According to some embodiments, the method comprises: transcribing and translating a nucleic acid chemically bound to a substrate to produce an enzyme and chemically binding the enzyme to the substrate.

In an aspect, a solution is provided. According to some embodiments, the solution comprises: a fluid; a structured species suspended in the fluid; a nucleic acid chemically bound to the structured species; and a protein chemically bound to the structured species, wherein the protein is encoded by the nucleic acid.

In another aspect, a solution is provided. According to some embodiments, the solution comprises: a fluid; a substrate suspended in the fluid; a nucleic acid chemically bound to the substrate; and a protein chemically bound to the substrate, wherein the protein is encoded by the nucleic acid.

In yet another aspect, a method of making a protein is provided. According to some embodiments, the method comprises: transcribing and translating a nucleic acid chemically bound to a structured species to produce a protein and chemically binding the protein to the structured species.

In still another aspect, a method of making a protein is provided. According to some embodiments, the method comprises: transcribing and translating a nucleic acid chemically bound to a substrate to produce a protein and chemically binding the protein to the substrate.

In one aspect, a solution is provided. According to some embodiments, the solution comprises: a fluid; a structured species suspended in the fluid; a nucleic acid chemically bound to the structured species; and at least a portion of an antibody chemically bound to the structured species, wherein the at least a portion of the antibody is encoded by the nucleic acid.

In another aspect, a solution is provided. According to some embodiments, the solution comprises: a fluid; a substrate suspended in the fluid; a nucleic acid chemically bound to the substrate; and at least a portion of an antibody chemically bound to the substrate, wherein the at least a portion of the antibody is encoded by the nucleic acid.

In still another aspect, a method of making an antibody is provided. According to some embodiments, the method comprises: transcribing and translating a nucleic acid chemically bound to a structured species to produce at least a portion of an antibody, and chemically binding the at least a portion of the antibody to the structured species.

In yet another aspect, a method of making an antibody is provided. According to some embodiments, the method comprises: transcribing and translating a nucleic acid chemically bound to a structured species to produce at least a portion of an antibody, and chemically binding the at least a portion of the antibody to the structured species.

In another aspect, a composition is provided. According to some embodiments, the composition comprises a non-natural nucleic acid that is at least 70% identical to any one of SEQ. ID. NOS. 21-28.

In yet another aspect, a composition is provided. According to some embodiments, the composition comprises a non-natural nucleic acid selected from the group of SEQ. ID. NOS. 21-28.

In still another aspect, a composition is provided. According to some embodiments, the composition comprises a protein comprising a non-natural protein expressible via transcription and translation of a sequence that is at least 70% identical to any one of SEQ. ID. NOS. 21-28.

In one aspect, a composition is provided. According to some embodiments, the composition comprises a protein comprising a non-natural amino acid sequence expressible via transcription and translation of any one of SEQ. ID. NOS. 21-28.

In an aspect, a method is provided. According to some embodiments, the method comprises: transcribing and translating a plurality of nucleic acids within a first plurality of droplets comprising a first plurality of structured species, wherein the nucleic acids of the plurality of nucleic acids are bound to the structured species of the plurality of structured species; determining one or more activity droplets of the first plurality of droplets having an activity of a target substrate; and separating nucleic acids from the one or more activity droplets of the first plurality of droplets into a second plurality of droplets comprising a second plurality of structured species.

In another aspect, a method is provided. According to some embodiments, the method comprises: transcribing and translating a plurality of nucleic acids within a first plurality of droplets comprising a first plurality of substrates, wherein the nucleic acids of the plurality of nucleic acids are bound to the substrates of the plurality of substrates; determining one or more activity droplets of the first plurality of droplets having an activity of a target substrate; and separating nucleic acids from the one or more activity droplets of the first plurality of droplets into a second plurality of droplets comprising a second plurality of substrates.

Other advantages and novel features of the present disclosure will become apparent from the following detailed description of various non-limiting embodiments of the disclosure when considered in conjunction with the accompanying figures. In cases where the present specification and a document incorporated by reference include conflicting and/or inconsistent disclosure, the present specification shall control.

BRIEF DESCRIPTION OF THE DRAWINGS

Non-limiting embodiments of the present disclosure will be described by way of example with reference to the accompanying figures, which are schematic and are not intended to be drawn to scale unless otherwise indicated. In the figures, each identical or nearly identical component illustrated is typically represented by a single numeral. For purposes of clarity, not every component is labeled in every figure, nor is every component of each embodiment of the disclosure shown where illustration is not necessary to allow those of ordinary skill in the art to understand the disclosure. In the figures:

FIG. 1A presents a non-limiting schematic illustration of a nucleic acid chemically bound to a substrate, according to some embodiments;

FIG. 1B presents a non-limiting schematic illustration of a nucleic acid chemically bound to a substrate, and a dissolved protein expressed by the nucleic acid, according to some embodiments;

FIG. 1C presents a non-limiting schematic illustration of a nucleic acid chemically bound to a substrate, a dissolved protein expressed by the nucleic acid, and a protein expressed by the nucleic acid and chemically bound to the substrate, according to some embodiments;

FIG. 1D presents a non-limiting schematic illustration of a nucleic acid chemically bound to a substrate and a protein expressed by the nucleic acid and chemically bound to the substrate, according to some embodiments;

FIG. 1E presents a non-limiting schematic illustration of a nucleic acid chemically bound to fluidized substrates and proteins expressed by the nucleic acids and chemically bound to the fluidized substrates, according to some embodiments;

FIG. 2 presents a non-limiting schematic illustration of formation of a structured species, according to some embodiments;

FIG. 3A presents a schematic illustration of a non-limiting method of producing and sorting activity droplets, according to some embodiments;

FIG. 3B presents a schematic illustration of a non-limiting method of sorting activity droplets and amplifying active nucleic acids, according to some embodiments;

FIG. 4A presents a schematic illustration of a non-limiting method of binding fluidized substrates to a particle, according to some embodiments;

FIG. 4B presents a schematic illustration of a non-limiting method of pulling down particles bound to fluidized substrates, according to some embodiments;

FIG. 5A presents a micrograph of a plurality of beads covalently bound to green fluorescent protein (GFP) expressing genes and expressed GFP, according to some embodiments;

FIG. 5B presents a micrograph of the plurality of beads of FIG. 5A after cleaving the GFP from the substrate and washing the substrate, according to some embodiments; and

FIG. 6A presents a micrograph of a non-limiting plurality of substrates, wherein at least some substrates comprise highly active anti-GFP antibodies;

FIG. 6B presents a micrograph of a non-limiting plurality of substrates comprising anti-GFP antibodies, wherein the anti-GFP-antibodies have been subjected to one iteration of pulldown using magnetic particles to select for anti-GFP activity; and

FIG. 6C presents a micrograph of a non-limiting plurality of substrates comprising anti-GFP antibodies, wherein the anti-GFP-antibodies have been subjected to two iterations of pulldown using magnetic particles to select for anti-GFP activity.

FIG. 7 shows an example screening workflow 701 of purified protein using bifunctional agarose gel beads, according to some embodiments.

FIG. 8A shows chemical reactions for the modification of agarose with terminal alkyne groups and further click reactions with primers and BG, according to some embodiments.

FIG. 8B shows a schematic illustration of a bifunctional agarose bead with primers and BG modifications, according to some embodiments.

FIG. 8C shows a brightfield-microscope image of an air-triggered microfluidic device generating molten agarose droplets as the precursors of agarose beads, according to some embodiments.

FIG. 8D shows a brightfield-microscope image of uniform droplets of molten agarose at the collection of a microfluidic device, according to some embodiments.

FIG. 8E shows a fluorescence-microscope image of bifunctional agarose beads stained with Qubit ssDNA specific dye, according to some embodiments.

FIG. 8F shows a fluorescence-microscope image of bifunctional agarose beads tagged with GFP-SNAP, according to some embodiments.

FIG. 8G shows concentration of GFP bound to the agarose beads as a function of the input concentration of BG, according to some embodiments.

FIG. 9A shows fluorescence-microscope images of agarose beads after digital droplet PCR with a 0.1 DNA molecule per drop loading, according to some embodiments.

FIG. 9B shows fluorescence-microscope images of agarose beads after digital droplet PCR with a 1 DNA molecule per drop loading, according to some embodiments.

FIG. 9C shows fluorescence-microscope images of agarose beads after digital droplet PCR with a 10 DNA molecule per drop loading, according to some embodiments.

FIG. 9D shows quantification of frequency of positive gels after droplet PCR, according to some embodiments.

FIGS. 9E-9G show confocal images of agarose beads after droplet IVTT and tagging of GFP-SNAP on the beads after (i) 2 hours (FIG. 9E), (ii) 5 hours (FIG. 9F), and (iii) 42 hours (FIG. 9G) of reaction, according to some embodiments.

FIG. 9H shows quantification of tagged GFP concentration on individual agarose beads after different durations of droplet IVTT and tagging, according to some embodiments.

FIG. 9I shows an overlay of brightfield- and fluorescence-microscope images of droplets containing hydrogel beads with WT BsLipA and Calcein AM after 1 hour of reaction time, according to some embodiments.

FIG. 9J shows a microfluidics cytometry analysis of droplets containing agarose beads with WT BsLipA and Calcein AM after 1 hour of reaction time, according to some embodiments.

FIG. 9K shows an overlay of brightfield- and fluorescence-microscope images of droplets containing Calcein AM and single cells expressing WT BsLipA after 1 hour reaction demonstrating the biological variability, according to some embodiments.

FIG. 9L shows a microfluidic cytometry analysis of droplets containing single cells expressing wild-type BsLipA, after a 1-hour reaction with Calcein AM, with loading of 1 DNA molecule encoding BsLipA per 10 drops, according to some embodiments.

FIG. 10A shows activity of WT BsLipA and a thermotolerant variant of BsLipA, after 20-minute incubation at 20 and 60 °C, according to some embodiments.

FIG. 10B shows a distribution of droplet fluorescence for a reference library, according to some embodiments.

FIG. 10C shows a brightfield image of a concentric microfluidics sorter, according to some embodiments.

FIG. 10D is a table showing the recovery of DNA from samples with different numbers of sorted drops and sequence identity of the recovered DNA, according to some embodiments.

FIGS. 11A-11C show Sanger sequencing traces of recovered samples in the reference library sorts, according to some embodiments.

FIG. 12A shows a distribution of the number of amino acid substitutions per gene in the combinatorial library, according to some embodiments.

FIG. 12B shows a heat map of the frequency of distinct amino acid substitutions at the 15 targeted positions, according to some embodiments.

FIG. 12C shows activity of randomly selected variants from the combinatorial BsLipA library measured using E. coli lysate (i) without heat inactivation and (ii) after 20-min incubation at 70 °C, according to some embodiments.

FIG. 12D shows (i) a schematic workflow of a first screening experiment which comprised a first sort with 0.3 genes per gel, a second sort with 0.1 genes per gel, and a plate-based lysate assay; and (ii) residual activity of individually expressed variants measured using E. coli lysate after 30-min incubation at 75 °C, according to some embodiments.

FIG. 12E shows (i) a schematic workflow of a second screening experiment which comprised a first sort with 5 genes per gel, a second sort with 0.1 genes per gel, and a plate-based lysate assay; and (ii) residual activity of individually expressed variants measured using E. coli lysate after 30-min incubation at 75 °C, according to some embodiments.

FIG. 12F shows (i) a schematic workflow of a third screening experiment which comprised a first sort with 20 genes per gel, a second sort with 0.1 genes per gel, and a plate-based lysate assay; and (ii) residual activity of individually expressed variants measured using E. coli lysate after 30-min incubation at 75 °C, according to some embodiments.

FIG. 13A shows an SDS-PAGE gel analysis of purified proteins of selected variants, according to some embodiments.

FIG. 13B shows a heatmap of the frequency of distinct amino acid substitutions at the 15 targeted positions in the selected variants, with color intensities indicating the count of the amino acids in all thermotolerant variants, according to some embodiments.

FIG. 13C shows normalized activity of the eight most thermotolerant variants after incubation at temperatures from 25°C to 95°C, according to some embodiments.

FIG. 14 shows reaction rate as a function of heat-incubation temperature from 25 °C to 95 °C for wildtype BsLipA and 8 hits, according to some embodiments.

FIG. 15 shows normalized activity of identified variants as a function of incubation time at 65 °C, according to some embodiments.

FIGS. 16A-16B show microfluidics devices, according to some embodiments. FIG. 16A shows a device for air-triggered agarose bead generation, according to some embodiments. FIG. 16B shows a concentric sorting chip for particle-templated emulsions, according to some embodiments.

FIGS. 17A-17B show calibration of the relation between confocal-measured GFP intensity and GFP concentration, according to some embodiments. FIG. 17A shows an SDS-PAGE gel for purified GFP, according to some embodiments. FIG. 17B shows a standard curve for confocal-measured GFP fluorescence vs. GFP concentration, according to some embodiments.

FIG. 18 shows the concentration of GFP tagged onto agarose hydrogels made with 50 pM input BG at 3 hours, 8 hours, and 24 hours, according to some embodiments.

FIG. 19 shows a map of the 1.8 kb linear DNA for hydrogel bead surface display. estA: Bacillus subtilis Lipase A gene (UniProt P37957), according to some embodiments.

FIG. 20 shows a plurality of activity droplets comprising both inactive and active nanobodies, according to some embodiments.

DETAILED DESCRIPTION

Determining functionally active proteins and the nucleic acids encoding them remains a challenge in modern biotechnology. The present disclosure provides, in various aspects, systems, methods, and compositions useful for determining functionally active proteins such as active enzymes, with particular advantages, in some embodiments, in the context of combinatorial display methods. In some aspects, methods of linking expressed proteins to the nucleic acids expressing them via a substrate are provided. Such approaches may provide quantitative and/or qualitative advantages, depending on the embodiment. For example, the substrate may be used, in some instances, to homogenize the amount of expressed protein attached to each of a plurality of substrates (e.g., by saturating protein binding sites on the substrate), so that concentrations of unique protein types are standardized, allowing apples-to-apples comparisons of protein activity. Standardizing the amount of protein may have particular advantages in the context of enzymes, the catalytic activity of which is, in some instances, highly concentration dependent. In other embodiments, the approach of connecting expressed proteins and the nucleic acids they were expressed from to the same substrate provides qualitative advantages. For example, in the context of combinatorial, library-based methods, substrates may be connected to relatively low concentrations of a given nucleic acid type. The use of the substrate may, in some embodiments, allow iterative performance of nucleic acid expression reactions (e.g., in vitro transcription and translation reactions), interspersed with steps of washing the substrate to remove excess reagents and reaction byproducts, improving the process of amplifying the protein to a detectable level. Other advantages are described in detail below.

For the sake of illustrating some of these advantages, this paragraph provides a specific, non-limiting example of a method that may be used to determine an active enzyme. Any such determinations may be qualitative and/or quantitative. In some embodiments, a plurality of structured, agarose substrates are covalently bound to a library of nucleic acids capable of expressing enzymes of various types (e.g., such that each substrate is bound to a nucleic acid having a unique sequence capable of expressing a unique enzyme). The substrates are, in some embodiments, suspended in a plurality of droplets in a microfluidic system such that each droplet contains about one substrate. According to some embodiments, one or more IVTT (in vitro transcription and translation) reactions may be performed within each droplet to express enzymes from the substrate-bound nucleic acids in each droplet. The expressed enzymes may be chemically bound to the same substrate as the nucleic acid from which they were expressed (e.g., using a SNAP tag). IVTT reagents may be removed by washing the substrates (e.g., by replacing the solution surrounding the substrate in each droplet). According to some embodiments, after washing the substrate, an enzyme reactant and a detection reagent may be added to each droplet in order to detect activity of the enzyme reactant catalyzed by the substrate-bound enzyme in each droplet. The activity in each droplet may be assessed in order to determine enzymes with the highest catalytic activity. The nucleic acids associated with the most active enzymes may then be amplified (e.g., by PCR) and sequenced, optionally after removal of the enzyme, the enzyme reactant, and/or the detection reagent. As stated above, it should be understood that the foregoing example is non-limiting, and is merely illustrative of the types of techniques made available by the general approaches provided herein.
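As a purely illustrative aid, and not as part of the disclosed method, the loading of library nucleic acids onto substrates in an example of this kind is commonly reasoned about using Poisson statistics. The following minimal sketch assumes that the number of distinct nucleic acid molecules initially captured per substrate is Poisson distributed about a chosen mean; that assumption, and the function names loading_summary and single_given_occupied, are hypothetical and do not appear in the disclosure.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k nucleic acid molecules on one substrate.

    Assumes Poisson-distributed loading (an illustrative assumption).
    """
    return math.exp(-mean) * mean**k / math.factorial(k)


def loading_summary(mean: float) -> dict:
    """Fractions of substrates carrying zero, one, or several molecules."""
    p_empty = poisson_pmf(0, mean)
    p_single = poisson_pmf(1, mean)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


def single_given_occupied(mean: float) -> float:
    """Among substrates carrying at least one molecule, the fraction
    that carry exactly one (i.e., a single nucleic acid type)."""
    p_empty = poisson_pmf(0, mean)
    return poisson_pmf(1, mean) / (1.0 - p_empty)


if __name__ == "__main__":
    # Hypothetical mean loadings, chosen only for illustration.
    for mean in (0.1, 1.0, 10.0):
        print(mean, loading_summary(mean), single_given_occupied(mean))
```

Under this assumption, lowering the mean loading increases the share of occupied substrates that carry exactly one nucleic acid type, at the cost of leaving more substrates empty.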

In addition, the development, refinement, and determination of antibodies is an important area of research, with broad applications in biomedicine. The present disclosure also provides, in various aspects, improved compositions, systems, and methods for systematically synthesizing, amplifying, and determining new, potent antibodies. In some embodiments, the disclosure relates to expressing one or more antibodies from one or more nucleic acids chemically bound to a substrate and chemically binding the expressed antibody or antibodies to the substrate. It has been recognized herein that substrates simultaneously bound to an active antibody and a nucleic acid capable of expressing more of the active antibody have various advantages, e.g., for homogenizing the amount of the active antibody bound to each substrate, or for amplifying and sequencing nucleic acids responsible for expressing highly active antibodies. These recognized features may in turn improve the sensitivity of combinatorial methods of developing, refining, and amplifying active antibodies, simplifying these processes, as explained in greater detail below.

To provide a specific, non-limiting example solely for the sake of illustration, according to some embodiments, some methods comprise using a plurality of droplets comprising about 1 substrate per droplet, where each substrate in the plurality of droplets is bound to a unique nucleic acid type (e.g., to a nucleic acid having a sequence differing from the sequences of the other substrates). In vitro transcription and translation (IVTT) reagents may be added to each droplet to express the unique antibody encoded by each nucleic acid type. The expressed antibodies may be covalently bound to the substrate (e.g., using a SNAP tag). Then, according to some embodiments, the substrates are washed to remove IVTT reagents, and the substrate is fluidized to form smaller substrates bound to subsets of the expressed and chemically bound antibodies and nucleic acids. The disintegrated substrates bound to active antibodies may be bound to binding targets connected to magnetic beads, and the beads may be pulled down and washed to preserve substrate-bound, active antibodies (and the nucleic acids expressing them) while allowing non-binding substrates to be washed away. Antibodies with the strongest binding to the binding target may be amplified (e.g., by reconstituting the fluidized substrates bound to those antibodies into a new plurality of structured substrates) and optionally further mutated as part of an iterative process to determine highly active antibodies. It should be understood, of course, that the example above is purely illustrative and that, as elaborated upon below, any of a variety of other approaches may also be taken, as the disclosure is not limited to this specific example.
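Solely to illustrate why iterating the pulldown step may be useful, the following minimal sketch models how the fraction of substrates bearing active antibodies could change over successive pulldown rounds. The model is an assumption introduced here, not a result from the disclosure: each round is assumed to retain active substrates with probability keep_active and to retain non-binding substrates nonspecifically with probability keep_inactive. Both parameter names and all numerical values are hypothetical.

```python
def enrich_once(active_fraction: float, keep_active: float, keep_inactive: float) -> float:
    """Active fraction after one pulldown-and-wash round.

    keep_active: assumed probability that an active substrate is retained.
    keep_inactive: assumed probability that a non-binding substrate is
        retained anyway (nonspecific carryover).
    """
    kept_active = active_fraction * keep_active
    kept_inactive = (1.0 - active_fraction) * keep_inactive
    total = kept_active + kept_inactive
    if total == 0.0:
        raise ValueError("no substrates retained under these parameters")
    return kept_active / total


def enrich(active_fraction: float, keep_active: float, keep_inactive: float, rounds: int) -> list:
    """Active fraction before selection and after each successive round."""
    history = [active_fraction]
    for _ in range(rounds):
        active_fraction = enrich_once(active_fraction, keep_active, keep_inactive)
        history.append(active_fraction)
    return history


if __name__ == "__main__":
    # Hypothetical starting library: 1% active substrates, two rounds.
    print(enrich(0.01, keep_active=0.9, keep_inactive=0.09, rounds=2))
```

In this simplified model, each round multiplies the odds of a retained substrate being active by the ratio keep_active / keep_inactive, so repeated rounds compound the enrichment as long as that ratio exceeds one.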

Various aspects of the disclosure relate to uses of a nucleic acid chemically bound to a substrate. In some aspects, the disclosure relates to transcription and/or translation of nucleic acids chemically bound (e.g., covalently bound) to substrates. For example, FIGS. 1A-1D provide non-limiting schematic illustrations of various stages of transcription, translation, and processing of a nucleic acid 103 chemically bound to a substrate 101. The nucleic acid may be configured to encode any of a variety of suitable proteins. For example, in some embodiments, the nucleic acid is configured to encode an enzyme. As another example, in some embodiments, the nucleic acid is configured to encode an antibody.

Although FIGS. 1A-1D only illustrate a single nucleic acid type, it should be understood that one advantage of some of the methods provided herein allows for combinatorial methods of amplifying, transcribing, translating, and/or determining nucleic acids. In some embodiments, the substrate is bound to a plurality of distinct types of nucleic acids (e.g., to a plurality of nucleic acids with different sequences). For example, the substrate may be bound to a library of nucleic acids, in some embodiments. In other embodiments, however, only a single type of nucleic acid is bound to a substrate. Even embodiments where a single type of nucleic acid is bound to a substrate may still be well-suited for combinatorial methods when different nucleic acid types are used in different containers (e.g., different droplets), as discussed in greater detail below.

A substrate may be chemically bound to any of a variety of suitable numbers of nucleic acid types. In some embodiments, a substrate is chemically bound to greater than or equal to 1 type, greater than or equal to 10 types, greater than or equal to 10² types, greater than or equal to 10³ types, greater than or equal to 10⁴ types, greater than or equal to 10⁵ types, or greater than or equal to 10⁶ types of nucleic acids. In some embodiments, a substrate is chemically bound to less than or equal to 10⁷ types, less than or equal to 10⁶ types, less than or equal to 10⁵ types, less than or equal to 10⁴ types, less than or equal to 10³ types, less than or equal to 10² types, or less than or equal to 10 types of nucleic acids. Combinations of these ranges are also possible (e.g., greater than or equal to 1 type and less than or equal to 10⁷ types, or greater than or equal to 10 types and less than or equal to 10⁶ types). In some embodiments, a substrate is bound to exactly one nucleic acid type. Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

A substrate may be chemically bound to any of a variety of suitable numbers of nucleic acids of a given type. In some embodiments, a substrate is chemically bound to greater than or equal to 1 nucleic acid, greater than or equal to 10 nucleic acids, greater than or equal to 10² nucleic acids, greater than or equal to 10³ nucleic acids, greater than or equal to 10⁴ nucleic acids, greater than or equal to 10⁵ nucleic acids, or greater than or equal to 10⁶ nucleic acids of a given type. In some embodiments, a substrate is chemically bound to less than or equal to 10⁷ nucleic acids, less than or equal to 10⁶ nucleic acids, less than or equal to 10⁵ nucleic acids, less than or equal to 10⁴ nucleic acids, less than or equal to 10³ nucleic acids, less than or equal to 10² nucleic acids, or less than or equal to 10 nucleic acids of a given type. Combinations of these ranges are also possible (e.g., greater than or equal to 1 nucleic acid and less than or equal to 10⁷ nucleic acids, or greater than or equal to 10 nucleic acids and less than or equal to 10⁶ nucleic acids). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

The total number of nucleic acids of each type (e.g., of each unique sequence) contained within a container may or may not be equal. For instance, in some cases, when two types of nucleic acid are contained within a container, there may be approximately an equal number of the first type of nucleic acid and the second type of nucleic acid contained within the container. In other cases, the first type of nucleic acid may be present in a greater or lesser amount than the second type of nucleic acid; for example, the ratio of one nucleic acid to another nucleic acid may be greater than or equal to about 1:2, about 1:3, about 1:4, about 1:5, about 1:6, about 1:10, about 1:20, about 1:100, and the like. The number of nucleic acids of each type of nucleic acid in each of a plurality of containers may or may not be equal.

According to some embodiments, the substrate further comprises protein binding sites. For example, referring to FIG. 1A, substrate 101 comprises protein binding sites 105 (designated by black circles) capable of chemically binding a protein encoded by nucleic acid 103. Using appropriate conditions and reagents as described in greater detail below, transcription and/or translation of the nucleic acid may be performed to produce a protein using the nucleic acid chemically bound to the substrate. For example, FIG. 1B schematically represents proteins 107, produced by transcription and translation of nucleic acid 103 bound to substrate 101. Transcription and translation may be performed in any of a variety of appropriate ways, e.g., by IVTT of the substrate-bound nucleic acid.

A translated protein may be chemically bound (e.g., covalently bound) to the substrate, in some embodiments. For example, as shown in FIG. 1C, protein binding sites 105 may bind to some or all of protein 107, chemically binding protein 107 to substrate 101. Any variety of chemical binding sites may be used to chemically bind the protein to the substrate, as discussed in greater detail below. Binding a translated protein to the nucleic acid via the substrate may result in the nucleic acid being linked to the protein it expresses by the substrate, without requiring that the expressed protein be chemically linked to the nucleic acid. For example, in some embodiments, a protein and a nucleic acid are configured such that they are only bound together via a substrate.

In some embodiments, transcription and translation of the nucleic acid is used to over-express the protein relative to the number of protein binding sites available on the substrate. This may, advantageously, facilitate saturation of the available protein binding sites on the substrate. For example, FIG. 1C shows that proteins 107 outnumber protein binding sites 105, with the result that protein binding sites 105 are saturated with protein. Saturating the protein binding sites may be particularly advantageous, as substrates may, in some embodiments, be designed to have relatively consistent amounts (e.g., numbers or concentrations) of protein binding sites, allowing preparation of substrates with relatively consistent amounts of proteins. Without wishing to be bound by any particular theory, substrates prepared with relatively consistent amounts of protein binding sites may, in some embodiments, be used to reduce variation in protein-function measurements by standardizing the amount of protein used in each measurement. This standardization is not limited to cases where only a single nucleic acid type is bound to the substrate, but illustrates one advantage of using a substrate bound to only a single nucleic acid type, since each substrate can be used to standardize the concentration of surface-bound protein expressed by the nucleic acid type bound to the substrate. Using different substrates in different containers may still allow multiplex testing of various nucleic acids.

In contrast, standardizing the amount of protein produced by the transcription and translation reactions (as opposed to standardizing a substrate-bound amount of protein) may be considerably more difficult. Thus, binding the proteins to the substrate provides an advantage for various protein assays, according to some embodiments.
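The standardization argument above can be illustrated with a small numerical sketch. It is not part of the disclosed methods; it assumes, purely for illustration, that expression per substrate is log-normally distributed, that the number of binding sites per substrate varies only slightly, and that the bound amount is simply the smaller of the two. All function names, parameter names, and numerical values are hypothetical.

```python
import math
import random
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


def simulate(n_substrates=1000, mean_sites=1000, site_sd=20,
             median_expressed=5000, expression_sigma=0.5, seed=0):
    """Return (expressed, bound) protein counts per substrate.

    Assumptions (illustrative only, not taken from the disclosure):
    - expression is log-normal around `median_expressed`;
    - binding-site counts are roughly normal around `mean_sites`;
    - bound protein = min(expressed, available sites).
    """
    rng = random.Random(seed)
    expressed, bound = [], []
    for _ in range(n_substrates):
        sites = max(0, round(rng.gauss(mean_sites, site_sd)))
        made = rng.lognormvariate(math.log(median_expressed), expression_sigma)
        expressed.append(made)
        bound.append(min(made, sites))
    return expressed, bound


if __name__ == "__main__":
    expressed, bound = simulate()
    print("CV of expressed protein:", coefficient_of_variation(expressed))
    print("CV of substrate-bound protein:", coefficient_of_variation(bound))
```

Under these assumptions, over-expression means the bound amount tracks the number of binding sites rather than the expression yield, so the spread of the bound amount across substrates is much narrower than the spread of the expressed amount.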

A substrate may be chemically bound to any of a variety of suitable numbers of protein types. In some embodiments, a substrate is chemically bound to greater than or equal to 1 type, greater than or equal to 10 types, greater than or equal to 10² types, greater than or equal to 10³ types, greater than or equal to 10⁴ types, greater than or equal to 10⁵ types, or greater than or equal to 10⁶ types of proteins. In some embodiments, a substrate is chemically bound to less than or equal to 10⁷ types, less than or equal to 10⁶ types, less than or equal to 10⁵ types, less than or equal to 10⁴ types, less than or equal to 10³ types, less than or equal to 10² types, or less than or equal to 10 types of proteins. Combinations of these ranges are also possible (e.g., greater than or equal to 1 type and less than or equal to 10⁷ types, or greater than or equal to 10 types and less than or equal to 10⁶ types). In some embodiments, a substrate is bound to exactly one protein type. Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

The substrate may be chemically bound to a plurality of protein types, e.g., as may be expressed by a plurality of nucleic acid types according to some embodiments. In some embodiments, a substrate is chemically bound to greater than or equal to 1 protein, greater than or equal to 10 proteins, greater than or equal to 10² proteins, greater than or equal to 10³ proteins, greater than or equal to 10⁴ proteins, greater than or equal to 10⁵ proteins, or greater than or equal to 10⁶ proteins of a given type. In some embodiments, a substrate is chemically bound to less than or equal to 10⁷ proteins, less than or equal to 10⁶ proteins, less than or equal to 10⁵ proteins, less than or equal to 10⁴ proteins, less than or equal to 10³ proteins, less than or equal to 10² proteins, or less than or equal to 10 proteins of a given type. Combinations of these ranges are also possible (e.g., greater than or equal to 1 protein and less than or equal to 10⁷ proteins, or greater than or equal to 10 proteins and less than or equal to 10⁶ proteins). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

In some embodiments, the substrate is washed to remove unbound reagents and/or proteins. For example, FIG. 1C shows unbound proteins 107a in addition to bound proteins 107b. FIG. 1D shows substrate 101 after washing. As shown, bound proteins 107b remained chemically bound to the substrate, while unbound proteins 107a (not shown) were removed during washing. Washing the substrate may be useful for any of a variety of reasons. For example, in some embodiments, washing the substrate helps to regularize the amount of protein in a container. As another example, in some embodiments, washing a substrate allows the substrate to be combined (e.g., in a common container) with another substrate chemically bound to another protein, such that both substrates may be exposed to the same chemical conditions without cross-contamination of each substrate with protein from the other.

Optionally, in some embodiments, the substrate may be disintegrated into smaller substrates (e.g., after washing). For example, in some embodiments, a structured polymeric substrate is fluidized to form a plurality of smaller substrates in the form of dissolved polymer chains, at least some of which may be chemically bound to a nucleic acid and a protein encoded by the nucleic acid. FIG. 1E provides one such example, where substrate 101 of FIGS. 1A-1D has been fluidized to form substrates 102 in the form of dissolved polymer chains, at least some of which are chemically bound to nucleic acids 103 and proteins 107. This approach may have certain advantages, as elaborated upon in greater detail below. For example, in some embodiments a substrate may be disintegrated, e.g., as part of a method of breaking active proteins into a new plurality of containers (e.g., by fluidizing or otherwise disintegrating a substrate in a first container and subsequently breaking the fluid from the first container into a plurality of new containers). As another example, in some embodiments a substrate may be disintegrated (e.g., fluidized) for the purpose of sorting proteins based on their activity. For example, in some embodiments a substrate like substrate 101 comprises a library of nucleic acids and/or proteins, and subdividing substrate 101 into a plurality of substrates 102 allows substrates chemically bound to active proteins (e.g., antibodies active against a binding target of interest) to be selectively preserved based on their binding to a binding target of interest (e.g., by binding the substrate-bound proteins to particles conjugated to binding targets and washing away unbound substrates). The disintegrated (e.g., fluidized) substrates may then be incorporated into new structured substrates and amplified, and nucleic acids chemically bound to the substrates may be amplified, depending on the embodiment.

Any of a variety of substrates may be used, depending on the embodiment. However, in some embodiments, the substrate has certain physical and/or chemical properties that may be particularly advantageous in the context of reactions of nucleic acids. For example, a substrate may be chosen for improved compatibility with amplification and/or transcription reactions performed on nucleic acids bound to the substrate. Likewise, certain substrates may offer processing advantages for use in the context of, e.g., droplet-based methods.

One advantage of the substrates provided herein is that, in some embodiments, they may be fluidized in order to form new substrates comprising a greater diversity of nucleic acids and proteins. For example, a first structured species and a second structured species may, in some embodiments, be fluidized, merged, partitioned, and resolidified to form new substrates comprising a mixture of proteins and nucleic acids from the first structured species and the second structured species. This approach may have advantages for combinatorial methods, depending on the embodiment.

According to some embodiments, the substrate comprises a polymeric material. The polymeric material may be dissolved, gelled, or solidified, for example, depending on the embodiment. According to some embodiments, the substrate comprises a structured (e.g., solid, gelled, or otherwise shape-retaining) polymeric material, also referred to herein as a structured species. A structured species may retain a relatively fixed structure relative to a fluid. For example, a structured species may have a relatively invariant shape retained, at least in part, by elastic forces binding the polymeric material into a relatively fixed position. A structured species may be capable of deforming elastically under appropriate conditions, according to some embodiments. For example, the structured species is a gel, in some embodiments. In some cases, the structured species is a solid.

A structured species may be prepared by rigidifying a precursor (e.g., a fluid polymeric material such as a dissolved polymer, or a monomeric precursor configured to rigidify by polymerizing). In some embodiments, compartments (e.g., droplets) provided herein comprise a precursor material, where the precursor material is capable of undergoing a phase change, e.g., to form a structured species. For instance, a container may contain a gel precursor and/or a polymer precursor (e.g., dissolved within the container) that can be gelled, precipitated, crystallized, or otherwise solidified to form a structured species.

A structured species, in some cases, contains a fluid (e.g., an aqueous solution). Containing a fluid within the structured species may be advantageous, in some embodiments, for conducting biochemical reactions that take place in aqueous environments. For example, the fluid may be an aqueous liquid containing a species such as a nucleic acid or protein. A structured species may be substantially porous or substantially non-porous. In some aspects, a structured species is substantially porous such that at least one species (e.g., a nucleic acid or protein) may be contained internally within the structured species. In other embodiments, however, a species may be contained within a non-porous structured species, or the species may be contained on an external surface of the structured species (e.g., at an interface between the structured species and a fluid surrounding the structured species), as the disclosure is not so limited.

A structured species may have any of a variety of suitable pore sizes. In some embodiments, a structured species has an average pore size of greater than or equal to 20 nm, greater than or equal to 50 nm, greater than or equal to 100 nm, greater than or equal to 200 nm, greater than or equal to 500 nm, greater than or equal to 800 nm, greater than or equal to 1000 nm, greater than or equal to 1200 nm, greater than or equal to 1500 nm, greater than or equal to 1800 nm, greater than or equal to 2000 nm, greater than or equal to 2200 nm, greater than or equal to 2500 nm, greater than or equal to 2800 nm, greater than or equal to 3000 nm, greater than or equal to 3200 nm, greater than or equal to 3500 nm, or greater than or equal to 3800 nm. In some embodiments, a structured species has an average pore size of less than or equal to 4000 nm, less than or equal to 3800 nm, less than or equal to 3500 nm, less than or equal to 3200 nm, less than or equal to 3000 nm, less than or equal to 2800 nm, less than or equal to 2500 nm, less than or equal to 2200 nm, less than or equal to 2000 nm, less than or equal to 1800 nm, less than or equal to 1500 nm, less than or equal to 1200 nm, less than or equal to 1000 nm, less than or equal to 800 nm, less than or equal to 500 nm, less than or equal to 200 nm, less than or equal to 100 nm, less than or equal to 50 nm, or less. Combinations of these ranges are also possible (e.g., greater than or equal to 20 nm and less than or equal to 4000 nm, greater than or equal to 50 nm and less than or equal to 1000 nm, or greater than or equal to 100 nm and less than or equal to 500 nm). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited. The average pore size of the structured species may be determined, in accordance with some embodiments, by scanning electron microscopy (SEM).

A structured species may be caused to undergo a phase change, e.g., to fluidize the structured species, according to some embodiments. Fluidization may convert a structured polymeric material to a fluid polymeric material (e.g., to a polymeric material dissolved in solution). Fluidization may be accomplished using any of a variety of suitable techniques. For example, a structured species may be fluidized by exposing the structured species to an environmental change. A fluid polymeric species resulting from fluidization of a structured species may, in some embodiments, be re-rigidified. For example, the fluidization of the structured species may be reversible such that reversing the environmental change that fluidized the structured species causes the fluid species to re-rigidify, according to some embodiments. Examples of environmental changes that may fluidize a structured species (and/or rigidify a fluid species) include but are not limited to: a change in ambient temperature, a change in ambient pH level, a change in ambient ionic strength, a change in ambient electromagnetic radiation (e.g., addition of ambient ultraviolet light), addition of a chemical (e.g., a chemical that cleaves a crosslinker in a structured polymeric material or forms a crosslinker in a fluid polymeric material), and the like. Specific examples of environmental changes are provided below. As a specific example, in some cases, a polymeric material may be caused to undergo a fluidizing or solidifying phase change by raising or lowering the temperature of a polymeric material from a first temperature to a second temperature.
For example, in some embodiments, a first temperature may be raised or lowered to a second temperature by greater than or equal to 5 °C, greater than or equal to 10 °C, greater than or equal to 15 °C, greater than or equal to 20 °C, greater than or equal to 25 °C, greater than or equal to 30 °C, greater than or equal to 35 °C, greater than or equal to 40 °C, greater than or equal to 45 °C, greater than or equal to 50 °C, greater than or equal to 55 °C, greater than or equal to 60 °C, greater than or equal to 65 °C, greater than or equal to 70 °C, greater than or equal to 75 °C, greater than or equal to 80 °C, greater than or equal to 85 °C, greater than or equal to 90 °C, or greater than or equal to 95 °C. In some embodiments, a first temperature may be raised or lowered to a second temperature by less than or equal to 100 °C, less than or equal to 95 °C, less than or equal to 90 °C, less than or equal to 85 °C, less than or equal to 80 °C, less than or equal to 75 °C, less than or equal to 70 °C, less than or equal to 65 °C, less than or equal to 60 °C, less than or equal to 55 °C, less than or equal to 50 °C, less than or equal to 45 °C, less than or equal to 40 °C, less than or equal to 35 °C, less than or equal to 30 °C, less than or equal to 25 °C, less than or equal to 20 °C, less than or equal to 15 °C, or less than or equal to 10 °C. Combinations of these ranges are also possible (e.g., greater than or equal to 5 °C and less than or equal to 100 °C, greater than or equal to 10 °C and less than or equal to 80 °C, or greater than or equal to 20 °C and less than or equal to 50 °C). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

As a specific, non-limiting example, a polymeric material comprising agarose may be rigidified by cooling the agarose to a temperature below the gelling temperature of agarose. As another example, structured agarose (e.g., gelled agarose) may be fluidized by warming the agarose. In some cases, the temperature change is chosen in part such that a species (e.g., a nucleic acid, a protein, a cell) contained within a same container (e.g., within a droplet) as the polymeric material remains unchanged. Non-limiting examples of suitable polymeric materials that may gel as a result of temperature change include agarose, a PEG-PLGA-PEG triblock copolymer, Matrigel® or generic equivalents thereof, or the like.
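The temperature-driven rigidification and fluidization described above can be sketched as a simple two-state model. This is only an illustration under stated assumptions: the function name `next_state` and the threshold values are hypothetical placeholders rather than values from the disclosure, and the model assumes a gelling temperature below the melting temperature, as is generally observed for agarose.

```python
def next_state(state, temperature_c, gel_point_c=30.0, melt_point_c=70.0):
    """Return the phase of a thermally gelling polymer after a temperature change.

    `state` is "fluid" or "structured". The thresholds are illustrative
    placeholders (assumptions), not values taken from the disclosure.
    """
    if state not in ("fluid", "structured"):
        raise ValueError("state must be 'fluid' or 'structured'")
    if gel_point_c >= melt_point_c:
        raise ValueError("this sketch assumes gel_point_c < melt_point_c")
    if state == "fluid" and temperature_c < gel_point_c:
        return "structured"  # cooled below the gelling temperature
    if state == "structured" and temperature_c > melt_point_c:
        return "fluid"       # warmed above the melting temperature
    return state             # between thresholds, the current phase persists


if __name__ == "__main__":
    state = "fluid"
    for temperature in (50.0, 20.0, 50.0, 80.0):
        state = next_state(state, temperature)
        print(temperature, state)
```

In this sketch, a temperature between the two thresholds leaves the material in whichever phase it was already in, which is one way to choose a working temperature at which a gelled substrate stays structured while its contents remain unchanged.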

As another example, a polymeric material may be caused to undergo a phase change by raising or lowering the ambient pH from a first pH to a second pH. For example, in some embodiments, a first pH may be raised or lowered to a second pH by greater than or equal to 0.5, greater than or equal to 1.0, greater than or equal to 1.5, greater than or equal to 2.0, greater than or equal to 2.5, greater than or equal to 3.0, greater than or equal to 3.5, greater than or equal to 4.0, greater than or equal to 4.5, greater than or equal to 5.0, greater than or equal to 5.5, greater than or equal to 6.0, greater than or equal to 6.5, greater than or equal to 7.0, greater than or equal to 7.5, greater than or equal to 8.0, greater than or equal to 8.5, greater than or equal to 9.0, greater than or equal to 9.5, greater than or equal to 10.0, greater than or equal to 10.5, greater than or equal to 11.0, greater than or equal to 11.5, greater than or equal to 12.0, greater than or equal to 12.5, greater than or equal to 13.0, or greater than or equal to 13.5. In some embodiments, a first pH may be raised or lowered to a second pH by less than or equal to 14.0, less than or equal to 13.5, less than or equal to 13.0, less than or equal to 12.5, less than or equal to 12.0, less than or equal to 11.5, less than or equal to 11.0, less than or equal to 10.5, less than or equal to 10.0, less than or equal to 9.5, less than or equal to 9.0, less than or equal to 8.5, less than or equal to 8.0, less than or equal to 7.5, less than or equal to 7.0, less than or equal to 6.5, less than or equal to 6.0, less than or equal to 5.5, less than or equal to 5.0, less than or equal to 4.5, less than or equal to 4.0, less than or equal to 3.5, less than or equal to 3.0, less than or equal to 2.5, less than or equal to 2.0, less than or equal to 1.5, or less than or equal to 1.0. 
Combinations of these ranges are also possible (e.g., greater than or equal to 0.5 and less than or equal to 14.0, greater than or equal to 0.5 and less than or equal to 10.0, or greater than or equal to 0.5 and less than or equal to 5.0). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited. In some cases, the ambient pH may be changed from acidic to basic, basic to acidic, less acidic to more acidic, more acidic to less acidic, more basic to less basic, less basic to more basic, and the like. Non-limiting examples of gels that may undergo a phase change upon a change in pH include cellulose acetate phthalate latex and cross-linked polyacrylic or other carbomer derivatives (e.g., Polycarbophil® and Carbopol®).

As yet another example, the polymeric material may be caused to undergo a phase change by reaction with a chemical reagent, for example, a crosslinking reagent or a cleaving reagent. A polymer contained within a liquid can be crosslinked, in some embodiments, thereby forming a solid or a gel state structured species by crosslinking the chains of the polymer together. In some instances, a crosslinking reaction may be initiated by heat, pressure, or electromagnetic radiation. In certain cases, a crosslinking agent will be used to rigidify a polymeric material. Crosslinking may or may not be reversible, depending on the embodiment. Addition of a cleaving reagent may fluidize a structured species to form a fluid species (e.g., by cleaving crosslinks of the structured species). Examples of structured species that may be prepared using crosslinking reagents are discussed more herein.

A structured species may partially or completely fill a container (e.g., a droplet, a well), depending on the embodiment. For example, in some embodiments, the container includes additional liquid outside the structured species but miscible with a fluid contained by the structured species. As another example, in some embodiments, the exterior boundaries of the structured species are coincident with the external boundaries of the container.

A structured species may be a gel droplet (e.g., a droplet comprising or consisting essentially of a gel), according to some embodiments. As used herein, the term “gel” is given its ordinary meaning in the art and can refer to a material comprising a polymer network that is able to trap and contain fluids. For example, a structured species formed in a fluid container (e.g., a droplet) may contain at least some fluid that was present in the container prior to rigidification of the polymeric material. The gel may comprise polymer chains that are crosslinked. The degree of crosslinking may be varied, in some cases, to tailor the extent to which the gel absorbs or retains fluids. Those of ordinary skill in the art will be able to select appropriate materials suitable for use as gels. In some cases, a gel may be formed from a gel precursor. For instance, the gel precursor may comprise a material that forms a gel upon reaction with another material (e.g., a photoinitiator or crosslinker). An example of a gel precursor includes polyacrylamide. In another embodiment, the gel precursor comprises a material that forms a gel upon application of electromagnetic radiation to the material, such as chitosan or poly(ethylene glycol).

In some cases, a gel may be fluidized. For instance, a polymer gel may be fluidized by cleaving the crosslinks formed in the gel, according to some embodiments. Different types of gels and gel precursors that can be used in accordance with various embodiments are described in more detail below.

In some embodiments, the gel is a natural gel; that is, a biologically-derived gel. A natural gel may include, for example, agarose (e.g., low melting point agarose), collagen, fibrin, laminin, Matrigel® or generic equivalents thereof, alginate, and/or combinations thereof. In one particular embodiment, agarose is used. Polymeric materials in the form of natural gels or gel precursors, in some instances, may be rigidified or fluidized by a change in the temperature or pH, etc.

Non-limiting examples of materials capable of forming gels from a liquid precursor include silicon-containing polymers, polyacrylamides (e.g., poly(N-isopropylacrylamide)), crosslinked polymers (e.g., polyethylene oxide, polyAMPS, and polyvinylpyrrolidone), polyvinyl alcohol, acrylate polymers (e.g., sodium polyacrylate), and copolymers with an abundance of hydrophilic groups. Those of ordinary skill in the art can choose appropriate polymers that can be crosslinked, as well as suitable methods of crosslinking, based upon general knowledge of the art in combination with the description herein.

In some embodiments, a gel comprises a sol-gel. A “sol-gel” may be a gel derived from a sol, either by polymerizing the sol into an interconnected solid matrix, or by destabilizing the individual particles of a colloidal sol by means of an external agent. In general, the sol-gel process involves the change of a colloidal suspension system into a gel phase exhibiting a significantly higher viscosity.

It should be understood that, when using the various embodiments discussed above, not every container will necessarily comprise a substrate and a target nucleic acid (e.g., a nucleic acid suitable for transcription and/or translation to produce an active protein). Some containers may contain neither a substrate nor a target nucleic acid, some containers may contain only one of the two, and some containers may contain both. This by no means limits the applications of the containers. Non-limiting methods for forming a plurality and/or suspension of structured species within a plurality of containers are now described.

In some embodiments, a method for forming a plurality of structured species comprises first providing a plurality of droplets, each of the plurality of droplets comprising a first fluid and being substantially surrounded by a second fluid, where the first fluid and the second fluid are substantially immiscible. The plurality of droplets may undergo a phase change to form a plurality of structured species (e.g., gel droplets). The plurality of structured species may be exposed to a third fluid, which may, in some cases, be substantially miscible with the first fluid contained in the structured species. According to some embodiments, at least one nucleic acid may be added internally to at least some of the structured species (e.g., by diffusion of a fluid containing the nucleic acid into the droplet).

A non-limiting example of the above method is depicted in FIG. 2. A plurality of droplets 250 comprising a first fluid are substantially surrounded by a second fluid 252, where the first fluid and the second fluid are substantially immiscible. The plurality of droplets undergo a phase change, as indicated by arrow 251, to form a plurality of structured species 254, which are substantially surrounded by second fluid 252. The plurality of structured species 254 are exposed to a third fluid 256, where the first fluid comprised in the plurality of structured species 254 is substantially miscible with third fluid 256, as indicated by arrow 255. At least one first nucleic acid 260, depicted schematically as a black dot, may be chemically bound to each structured species 258, as indicated by arrow 257.

Organic frameworks may also be used as substrates, depending on the embodiment. Organic frameworks, like the structured species discussed above, may be porous structures capable of containing fluid. Organic frameworks may be formed by self-assembly of organic molecules. The organic molecules may self-assemble such that they coordinate around metal ions or covalent bonds, depending on the embodiment. Organic frameworks suitable for use as substrates provided herein may have any of a variety of coordinations, depending on the embodiment. For example, organic frameworks suitable for use as substrates provided herein may be coordinated 1-dimensionally, 2-dimensionally, or 3-dimensionally. Any of a variety of suitable types of organic frameworks may be used. For example, in some embodiments the substrate comprises a metal-organic framework (MOF) or a covalent organic framework (COF). Like the structured species described above, organic frameworks may be formed in droplets or other containers as a result of environmental change (e.g., mixing of organic framework forming reagents such as organic molecules and metal ions). Depending on the embodiment, it may be possible to fluidize and/or re-rigidify the organic framework, as the disclosure is not so limited.

Although the porous structures of various materials described above (e.g., structured species or organic frameworks) may be advantageous for facilitating reactions of substrate-bound nucleic acids, porosity is not a prerequisite of good substrates and many suitable substrates are not porous. For example, 2D materials may provide a suitable substrate, depending on the embodiment, because they provide a high surface area and permit substrate-bound nucleic acids to contact reagents present in nearby fluid. As used herein, the term 2D material refers to a layer of material with a thickness limited to a relatively small number of atomic layers. In some embodiments, a 2D material has an average thickness of less than or equal to 20 atomic layers, less than or equal to 19 atomic layers, less than or equal to 18 atomic layers, less than or equal to 17 atomic layers, less than or equal to 16 atomic layers, less than or equal to 15 atomic layers, less than or equal to 14 atomic layers, less than or equal to 13 atomic layers, less than or equal to 12 atomic layers, less than or equal to 11 atomic layers, less than or equal to 10 atomic layers, less than or equal to 9 atomic layers, less than or equal to 8 atomic layers, less than or equal to 7 atomic layers, less than or equal to 6 atomic layers, less than or equal to 5 atomic layers, less than or equal to 4 atomic layers, less than or equal to 3 atomic layers, or less than or equal to 2 atomic layers.
In some embodiments, a 2D material has an average thickness of greater than or equal to 1 atomic layer, greater than or equal to 2 atomic layers, greater than or equal to 3 atomic layers, greater than or equal to 4 atomic layers, greater than or equal to 5 atomic layers, greater than or equal to 6 atomic layers, greater than or equal to 7 atomic layers, greater than or equal to 8 atomic layers, greater than or equal to 9 atomic layers, greater than or equal to 10 atomic layers, greater than or equal to 11 atomic layers, greater than or equal to 12 atomic layers, greater than or equal to 13 atomic layers, greater than or equal to 14 atomic layers, greater than or equal to 15 atomic layers, greater than or equal to 16 atomic layers, greater than or equal to 17 atomic layers, greater than or equal to 18 atomic layers, or greater than or equal to 19 atomic layers. Combinations of these ranges are also possible (e.g., greater than or equal to 1 atomic layer and less than or equal to 20 atomic layers, or greater than or equal to 1 atomic layer and less than or equal to 10 atomic layers). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

Any of a variety of 2D materials may be used for substrates, depending on the embodiment. Examples of 2D materials that may be suitable for use in substrates include, but are not limited to: graphene, graphene oxide, reduced graphene oxide, and combinations or derivatives thereof.

As discussed above, in some embodiments the substrate is chemically bound to a nucleic acid. Any of a variety of nucleic acids known to those of ordinary skill in the art may be used. For example, in some embodiments the nucleic acid comprises DNA. In some embodiments the nucleic acid comprises RNA. Nucleic acids could, in some embodiments, be naturally occurring or chemically modified (e.g., such that they contain non-natural nucleotides). In some embodiments, the nucleic acid may encode a protein. For example, the nucleic acid may comprise DNA capable of forming a protein via transcription and translation (e.g., using IVTT, as discussed in greater detail below). In some embodiments, the nucleic acid comprises RNA capable of forming a protein via translation alone. Other embodiments are also possible, as the disclosure is not so limited.

A substrate chemically bound to a nucleic acid may be formed in any of a variety of suitable ways. For example, in some embodiments, the substrate is formed first, then the nucleic acid is chemically reacted with the substrate to form the substrate covalently bound to the nucleic acid. But in some embodiments, the nucleic acid is chemically bound to a substrate precursor (e.g., a precursor to a structured species, or an organic molecule suitable for use in an organic framework) and the substrate bound to the nucleic acid is formed by binding the nucleic acid into the substrate via reaction of the precursor to form the substrate. Other embodiments are also possible as the disclosure is not so limited.

Those of ordinary skill in the art will be aware of methods to chemically bind a nucleic acid to a substrate (e.g., a polymeric material such as a structured species). The nucleic acid may be chemically bound to the substrate either directly (e.g., by formation of a bond, such as a covalent bond, directly with the substrate) or indirectly (e.g., using a crosslinking molecule). In some instances, application of light or heat to a container comprising a substrate and a nucleic acid may cause the nucleic acid to chemically bind to the substrate. The chemically bound nucleic acid may become immobilized relative to the substrate. In some instances, more than one type of nucleic acid is chemically bound to the substrate. That is, at least one of a first type of nucleic acid and at least one of a second type of nucleic acid may be bound to the same substrate. Binding of multiple types of nucleic acid to a same substrate may be accomplished, for example, using the above techniques, where the fluid substantially surrounding the substrate (which may be substantially miscible with the fluid contained within pores of the substrate, if applicable) comprises a plurality of the first type of nucleic acid and a plurality of the second type of nucleic acid. It should, of course, be understood that different types of containers may contain different types and numbers of nucleic acids, as the disclosure is not so limited.

According to some embodiments, the nucleic acid is chemically bound to the substrate by chemically binding a primer to the substrate and extending the primer using an amplification technique such as PCR, as discussed in greater detail below. One advantage of amplifying substrate-bound primers is that the technique may be suitable for chemically binding a library of nucleic acids to a substrate, as discussed in greater detail below. Mutations may be introduced, in some embodiments, by amplifying mutated substrate-bound primers. In other embodiments, substrate-bound primers are of the same type, but are amplified using a plurality of nucleic acid types from a pre-formed library of nucleic acids. For example, a library of nucleic acid types (e.g., as may be produced using error-prone PCR or random, site-specific mutation of nucleic acids) may be produced on and chemically bound to a substrate by amplifying substrate-bound primers against a nucleic acid library. A library of nucleic acids bound to a substrate may be used for determining nucleic acids expressing active proteins (e.g., active enzymes, active antibodies, etc.). Active proteins may then be selectively determined and/or amplified, as discussed in greater detail below.

A nucleic acid described herein may be configured to express a protein (e.g., when the nucleic acid is transcribed and/or translated). A protein may include a polypeptide sequence that may have any of a variety of suitable molecular weights. In some embodiments, a protein expressed by a nucleic acid provided herein has a molecular weight of greater than or equal to 132 Da, greater than or equal to 500 Da, greater than or equal to 1 kDa, greater than or equal to 5 kDa, greater than or equal to 10 kDa, greater than or equal to 20 kDa, greater than or equal to 30 kDa, greater than or equal to 40 kDa, greater than or equal to 50 kDa, greater than or equal to 60 kDa, greater than or equal to 70 kDa, greater than or equal to 80 kDa, greater than or equal to 90 kDa, greater than or equal to 100 kDa, greater than or equal to 110 kDa, greater than or equal to 120 kDa, greater than or equal to 130 kDa, greater than or equal to 140 kDa, greater than or equal to 150 kDa, greater than or equal to 160 kDa, greater than or equal to 170 kDa, greater than or equal to 180 kDa, or greater than or equal to 190 kDa. In some embodiments, a protein expressed by a nucleic acid provided herein has a molecular weight of less than or equal to 200 kDa, less than or equal to 190 kDa, less than or equal to 180 kDa, less than or equal to 170 kDa, less than or equal to 160 kDa, less than or equal to 150 kDa, less than or equal to 140 kDa, less than or equal to 130 kDa, less than or equal to 120 kDa, less than or equal to 110 kDa, less than or equal to 100 kDa, less than or equal to 90 kDa, less than or equal to 80 kDa, less than or equal to 70 kDa, less than or equal to 60 kDa, less than or equal to 50 kDa, less than or equal to 40 kDa, less than or equal to 30 kDa, less than or equal to 20 kDa, less than or equal to 10 kDa, or less than or equal to 5 kDa. 
Combinations of these ranges are also possible (e.g., greater than or equal to 132 Da and less than or equal to 200 kDa, or greater than or equal to 1 kDa and less than or equal to 150 kDa). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

A nucleic acid may encode any of a variety of proteins, depending on the embodiment. In some embodiments, the nucleic acid(s) encode naturally occurring proteins. However, an advantage of the approaches described herein is that they may be used for screening various protein sequences for activity. To provide a specific, nonlimiting example, the methods provided herein may be well adapted for use in combinatorial display methods for determining functional protein sequences. Accordingly, it may be advantageous, in some embodiments, to use a plurality of nonnatural nucleic acids (e.g., in the form of mutants of a naturally occurring nucleic acid) chemically bound to one or more substrates. In some such embodiments, mutants with improved functional performance may be determined and/or studied.

A protein encoded by the nucleic acid may be a naturally occurring protein. In some embodiments, the protein is a mutant, fragment, or derivative of a naturally occurring protein. For example, the protein may be encoded by a mutant DNA sequence, in some embodiments. As another example, in some embodiments the protein is translated using synthetically modified amino acids, or is subjected to one or more post-translational modifications to form a derivative of a naturally occurring protein. In still other embodiments, the protein may be totally synthetic (e.g., the protein may have been designed de novo, rather than with reference to a naturally occurring sequence, and may differ in sequence and/or structure from naturally occurring proteins and their derivatives, fragments, or mutants). In some embodiments, a protein may combine one or more of the types of proteins described above, e.g., in the form of a fusion protein.

An expressed protein may be chemically bound to a substrate, according to some embodiments. In particular, it may be advantageous to chemically bind an expressed protein to the same substrate as the nucleic acid encoding it, according to some embodiments. Substrates bound both to a nucleic acid and to the protein the nucleic acid encodes may be particularly useful because functional activity of the substrate is directly correlated with the presence of target nucleic acid sequences on the substrate, according to some embodiments. This means that nucleic acids with high functional activity may be selected for amplification and subsequent rounds of mutation by determining and/or isolating substrates with high functional activities, in some embodiments.

A substrate may be chemically bound to an expressed protein in any of a variety of suitable ways. For example, in some embodiments, the protein is chemically reacted with the substrate. Those of ordinary skill in the art will be aware of methods to chemically bind a protein to a substrate (e.g., a polymeric material such as a structured species). The protein may be chemically bound to the substrate either directly (e.g., by formation of a bond, such as a covalent bond, directly with the substrate) or indirectly (e.g., using a crosslinking molecule). In some instances, application of light or heat to a container comprising a substrate and a protein may cause the protein to chemically bind to the substrate. The chemically bound protein may become immobilized relative to the substrate.

The substrate may be configured to bind to the expressed protein in a predictable way. For example, in some embodiments, the substrate comprises a plurality of protein binding sites comprising functional groups configured to bind to expressed proteins. The binding sites may be reactive with proteins generally, or may be configured to react specifically with the protein encoded by the nucleic acid (e.g., by synthetically modifying the proteins for specific reactivity with the binding sites). As a specific, nonlimiting illustration, according to some embodiments, the protein and the binding sites could be configured to undergo a “click” reaction as the result of synthetic modification of the expressed protein. To provide a specific, non-limiting example, an expressed protein could be modified to include an alkyne to make it click-reactive with azide binding sites on the substrate, or the protein could be modified to include an azide to make it click-reactive with alkyne binding sites on the substrate. As another example, in some embodiments the expressed protein is configured to covalently bind to the substrate using a SNAP-tag.

In some embodiments, the protein is chemically bound to the substrate via a degradable linker. The linker may be configured to degrade in response to an environmental change such as a change in ambient temperature, a change in ambient pH level, a change in ambient ionic strength, a change in ambient electromagnetic radiation (e.g., addition of ambient ultraviolet light), addition of a chemical (e.g., a chemical that cleaves a crosslinker in a structured polymeric material or forms a crosslinker in a fluid polymeric material), or the like. This may be advantageous, e.g., if removal of the protein is desired after the protein has been determined to be effective, e.g., to act as an experimental control for activity of non-substrate-bound reagents. To provide a specific, non-limiting example, in some embodiments the protein is linked to the substrate via a recognition site that may be broken down by addition of a Tobacco Etch Virus (TEV) protease; such linkers may be referred to as TEV linkers.

In some instances, more than one type of protein is chemically bound to the substrate. That is, at least one of a first type of protein and at least one of a second type of protein may be bound to the same substrate. Binding of multiple types of protein to a same substrate may be accomplished, for example, using the above techniques, where the fluid substantially surrounding the substrate (which may be substantially miscible with the fluid contained within pores of the substrate, if applicable) comprises a plurality of the first type of protein and a plurality of the second type of protein. It should, of course, be understood that different types of containers may contain different types and numbers of proteins, as the disclosure is not so limited.

One advantage of the use of binding sites, as discussed above, is that the protein may be overexpressed, in some embodiments, in order to saturate the protein binding sites and produce a predictable concentration of an expressed protein on the substrate. The substrate may be saturated such that any of a variety of suitable percentages of the protein binding sites are bound to proteins expressed by the substrate-bound nucleic acids. In some embodiments, a substrate is saturated such that greater than or equal to 50%, greater than or equal to 60%, greater than or equal to 70%, greater than or equal to 80%, or greater than or equal to 90% of the protein binding sites are chemically bound to proteins expressed by substrate-bound nucleic acids. In some embodiments, a substrate is saturated such that less than or equal to 100%, less than or equal to 90%, less than or equal to 80%, less than or equal to 70%, or less than or equal to 60% of the protein binding sites are chemically bound to proteins expressed by substrate-bound nucleic acids. Combinations of these ranges are also possible (e.g., greater than or equal to 50% and less than or equal to 100%, or greater than or equal to 80% and less than or equal to 100%). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

Systems and methods provided herein may be particularly useful to the study of enzymes. The term “enzyme” can refer to a protein that has the ability to specifically catalyze a biochemical reaction, e.g., by reducing an activation energy required for the reaction to occur. In some embodiments, an enzyme expressed by a substrate-bound nucleic acid can be chemically bound to the same substrate, advantageously producing a substrate capable of actively catalyzing a biochemical process and producing more of the enzyme responsible for the catalytic activity. Such a substrate may be useful for preferentially determining and/or amplifying nucleic acids that express highly active enzymes.

An enzyme, as described herein, may be naturally occurring, or may be a mutant or fragment of a naturally occurring enzyme. In some embodiments, enzymes are synthetic. Activity of an enzyme may be determined, in some embodiments, by combining the enzyme with an enzyme reactant (e.g., a reactant of a biochemical reaction catalyzed by the enzyme) in solution. The mixture of the enzyme and the enzyme reactant may produce activity of the enzyme reactant, which may be detected and/or used as discussed in greater detail below. Thus, in some embodiments, the method comprises catalyzing reaction of the enzyme reactant using the enzyme (e.g., for the purpose of determining activity of the enzyme and/or amplifying nucleic acids expressing active enzymes). In some embodiments, the substrate, advantageously, physically links enzymatic activity to the nucleic acid expressing an active enzyme, so that determining enzymatic activity determines a nucleic acid that expresses the active enzyme.

Systems and methods provided herein may be particularly useful to the study of antibodies. An antibody may have the ability to specifically interact with (e.g., by binding, steric hindrance, stabilizing/destabilizing, spatial distribution) a binding target or other species. Binding of the antibody to a binding target may produce activity of the binding target, which may be detected and/or used as discussed in greater detail below. Thus, in some embodiments, the method comprises binding a binding target using the antibody (e.g., for the purpose of determining activity of the antibody and/or amplifying nucleic acids expressing active antibodies). In some embodiments, an antibody expressed by a substrate-bound nucleic acid can be chemically bound to the same substrate, advantageously producing a substrate capable of actively binding to a binding target of the antibody and producing more of the antibody responsible for binding. Such a substrate may be useful for preferentially determining and/or amplifying the nucleic acids that express highly active antibodies.

An antibody generally comprises a protein. For example, the antibody may be a protein or a glycoprotein. An antibody may be substantially encoded by nucleic acids such as immunoglobulin genes or fragments of immunoglobulin genes. Any of a variety of immunoglobulin genes or fragments thereof are known to those of ordinary skill in the art. For example, the antibody may be substantially encoded by immunoglobulin genes including the kappa, lambda, alpha, gamma, delta, epsilon, and mu constant region genes, as well as myriad immunoglobulin variable region genes and fragments thereof. The antibody may be encoded by a light chain immunoglobulin gene or a fragment thereof. Light chain immunoglobulins may be classified as either kappa or lambda. The antibody may be encoded by a heavy chain immunoglobulin gene or a fragment thereof. Heavy chain immunoglobulins may be classified as gamma, mu, alpha, delta, or epsilon (which in turn define the immunoglobulin classes IgG, IgM, IgA, IgD, and IgE, respectively).

An immunoglobulin structural unit may comprise a tetramer. Each tetramer may be composed of two identical pairs of polypeptide chains, each pair having one light chain and one heavy chain.

In some embodiments, the light chain has a molecular weight of greater than or equal to 15 kDa, greater than or equal to 20 kDa, greater than or equal to 25 kDa, or greater. In some embodiments, the light chain has a molecular weight of less than or equal to 35 kDa, less than or equal to 30 kDa, less than or equal to 25 kDa, or less. Combinations of these ranges are possible. For example, in some embodiments, the light chain has a molecular weight of greater than or equal to 15 kDa and less than or equal to 35 kDa.

In some embodiments, the heavy chain has a molecular weight of greater than or equal to 50 kDa, greater than or equal to 55 kDa, greater than or equal to 60 kDa, or greater. In some embodiments, the heavy chain has a molecular weight of less than or equal to 70 kDa, less than or equal to 65 kDa, less than or equal to 60 kDa, or less. Combinations of these ranges are possible. For example, in some embodiments, the heavy chain has a molecular weight of greater than or equal to 50 kDa and less than or equal to 70 kDa.

The N-terminus of each polypeptide chain may define a variable region of the immunoglobulin structural unit. The variable region may be primarily responsible for antigen recognition. In some embodiments, the variable region comprises greater than or equal to 95, greater than or equal to 98, greater than or equal to 100, greater than or equal to 103, greater than or equal to 105, or more amino acids. In some embodiments, the variable region comprises less than or equal to 115, less than or equal to 113, less than or equal to 110, less than or equal to 108, less than or equal to 105, or fewer amino acids. Combinations of these ranges are possible. For example, in some embodiments, the variable region comprises greater than or equal to 95 and less than or equal to 115 amino acids. Antibodies may exist as intact immunoglobulins. However, in some embodiments, antibodies exist as any of a number of antibody fragments (e.g., immunoglobulin fragments). An “antibody fragment” may include at least one portion of an antibody that retains the ability to specifically interact with (e.g., by binding, steric hindrance, stabilizing/destabilizing, spatial distribution) an epitope of a binding target or other species. In some embodiments, the expressed protein is an antibody fragment rather than an intact antibody, but it should be understood that such fragments are generally referred to herein as antibodies because functionally they behave similarly to antibodies, even if they are not intact immunoglobulins.

Immunoglobulin fragments may be produced by digestion with any of a variety of peptidases (e.g., pepsin). For example, in some embodiments immunoglobulin fragments may be formed by digesting an antibody using pepsin. In some exemplary embodiments, pepsin is used to digest the Fc domain of an antibody, e.g., by degrading disulfide linkages in a hinge region of the Fc domain to produce F(ab’)2. The F(ab’)2 is, according to certain embodiments, a dimer of Fab, which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab’)2 may be reduced to break the disulfide linkage in the hinge region, thereby converting the F(ab’)2 dimer into a Fab’ monomer. The Fab’ monomer, according to some embodiments, comprises Fab and a part of the hinge region of the Fc domain.

In some embodiments, an immunoglobulin fragment is synthesized de novo. An immunoglobulin fragment may be produced by any of a variety of methods known to those of ordinary skill in the art, such as by chemical synthesis, by utilizing recombinant DNA methodology, or by “phage display” methods. Antibodies may be naturally occurring or may be mutants of naturally occurring antibodies. In some embodiments, antibodies are synthetic.

Examples of antibodies include single chain antibodies, e.g., single chain Fv (scFv) antibodies in which a variable heavy and a variable light chain are joined together (directly or through a peptide linker) to form a continuous polypeptide. In one embodiment, the antibody is a monoclonal antibody.

In some embodiments, an antibody is configured such that the antibody or antibody fragment retains a relatively high affinity for a binding target or other species. Without wishing to be bound by theory, the affinity of the antibody or antibody fragment for the binding target or other species may be inversely related to the dissociation constant (KD) between the antibody or antibody fragment and the binding target or other species, so that a lower KD value corresponds to a higher affinity. In some embodiments, the KD value of the antibody or antibody fragment is less than or equal to 10⁻⁶ M, less than or equal to 10⁻⁷ M, less than or equal to 10⁻⁸ M, less than or equal to 10⁻⁹ M, less than or equal to 10⁻¹⁰ M, or less under physiological conditions. In some embodiments, the KD value of the antibody or antibody fragment is greater than or equal to 10⁻¹³ M, greater than or equal to 10⁻¹² M, greater than or equal to 10⁻¹¹ M, or more. Combinations of these ranges are also possible. For example, in some embodiments, the KD value of the antibody or the antibody fragment is greater than or equal to 10⁻¹² M and less than or equal to 10⁻⁶ M. The KD value of the composition may be determined by a test that would be well-known to one of ordinary skill in the art.
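
As a non-limiting illustration of the inverse relationship between KD and affinity, the following minimal Python sketch computes the equilibrium fraction of antibody bound to a binding target. It assumes simple 1:1 binding with the binding target in large excess over the antibody, which is an assumption made for illustration and is not stated in the disclosure. The function name fraction_bound and the example target concentration of 1 nM are hypothetical.

```python
def fraction_bound(target_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of antibody bound to its target.

    Assumes simple 1:1 binding with the target in large excess, so that
    fraction bound = [target] / (KD + [target]). Illustrative only.
    """
    if target_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return target_conc_molar / (kd_molar + target_conc_molar)


if __name__ == "__main__":
    target = 1e-9  # hypothetical target concentration, 1 nM
    for kd in (1e-6, 1e-9, 1e-12):
        # A lower KD gives a larger bound fraction at the same target concentration.
        print(f"KD = {kd:.0e} M -> fraction bound = {fraction_bound(target, kd):.4f}")
```

Under this simplified model, half of the antibody is bound when the target concentration equals KD, so an antibody with a lower KD reaches a given occupancy at a lower target concentration.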

In some embodiments, the antibody is a nanobody or a minibody.

The present disclosure relates, in certain aspects, to using droplet-based microfluidic devices and methods, and/or to the use of other containers such as wells. It may be useful, in some embodiments, to determine activity of an activity target within a container. In certain aspects, a method comprises determining one or more activity containers (e.g., activity droplets) of a plurality of containers that have activity of an activity target. The activity target may be any of a variety of appropriate activity targets. For example, the activity target may be a binding target (e.g., a viral or cancer cell antigen), a reaction target (e.g., an enzyme reactant), or may be a target configured to activate in response to specific action of a protein, etc.

Any of a variety of approaches may be used to determine the activity containers (e.g., activity droplets) having the activity of the activity target. For example, some embodiments may comprise determining activity containers that include any activity of an activity target. However, some embodiments are directed to determining a subset of containers that include activity exceeding a threshold value. For example, an activity target may, upon activation, produce a signal (e.g., a colorimetric, fluorescent, or luminescent signal) exceeding a predefined minimum signal. Such an approach may be useful in certain cases for determining highly active proteins, e.g., while excluding less active proteins. It should, of course, be understood that the containers with the most activity do not necessarily correspond to the containers comprising the most active molecules. For example, some containers may stochastically include larger numbers of active sequences, thereby demonstrating higher apparent activity without necessarily including highly active variants. More generally, it should be understood that separation of active containers may be performed by any of a variety of suitable systems and methods, e.g., as described herein, as the disclosure is not so limited.
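
The threshold-based determination described above, and the caveat regarding apparent activity, may be sketched as follows. This is a minimal, non-limiting Python illustration that assumes the signal of a container is the sum of the contributions of the variants it contains; that additive model, the container names, the activity values, and the threshold are all hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List


def select_activity_containers(signals: Dict[str, float], threshold: float) -> List[str]:
    """Return identifiers of containers whose signal exceeds the threshold."""
    return [cid for cid, signal in signals.items() if signal > threshold]


# Hypothetical per-variant activities (arbitrary units) in three droplets.
droplets = {
    "droplet_A": [9.0],                  # one highly active variant
    "droplet_B": [3.0, 3.0, 3.0, 3.0],   # four moderately active variants
    "droplet_C": [1.0],                  # one weakly active variant
}

# Assumed additive signal model: container signal = sum of variant activities.
signals = {cid: sum(activities) for cid, activities in droplets.items()}

selected = select_activity_containers(signals, threshold=5.0)
print(selected)
```

In this sketch droplet_B produces the largest total signal even though no single variant in it is as active as the variant in droplet_A, which illustrates why the containers with the most activity do not necessarily contain the most active molecules.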

Some aspects are generally directed to systems and methods of determining one or more nucleic acids in a sample (e.g., determining one or more nucleic acids that can cause activation of an activity target). For example, in some embodiments, systems and methods provided herein relate to determining a substrate-bound nucleic acid used to express a high concentration of a protein. This may be useful to ensure that most or all of the nucleic acids are transcribed, translated, or amplified, e.g., substantially evenly. In contrast, if the nucleic acids were to be transcribed, translated, or amplified in bulk solution, some nucleic acids could be transcribed, translated, or amplified without others being transcribed, translated, or amplified (or merely being transcribed, translated, or amplified to a much lesser degree). Thus, in certain embodiments as described herein the nucleic acids are encapsulated into containers (e.g., droplets), and manipulated therein.

In some embodiments, a plurality of containers (e.g., a plurality of droplets) may contain greater than or equal to 10⁶, greater than or equal to 10⁷, greater than or equal to 10⁸, greater than or equal to 10⁹, greater than or equal to 10¹⁰, greater than or equal to 10¹¹, greater than or equal to 10¹², or more distinct nucleic acid sequences (e.g., some or all of which may be bound to substrates) within the containers. In some embodiments, a plurality of containers contains less than or equal to 10¹⁴, less than or equal to 10¹³, less than or equal to 10¹², less than or equal to 10¹¹, less than or equal to 10¹⁰, less than or equal to 10⁹, or fewer distinct nucleic acid sequences within the containers. Combinations of these ranges are possible. For example, in some embodiments, a plurality of containers contains greater than or equal to 10⁶ and less than or equal to 10¹⁴ distinct nucleic acid sequences within the containers. Other ranges are also possible.

Droplets may be sorted by any of a variety of suitable methods. FIG. 3A presents a schematic illustration of a non-limiting method for producing and sorting activity droplets described herein. Initially, a droplet 300 comprises nucleic acids 303 bound to substrate 301, as well as optional other reagents 309. Nucleic acids of the droplet are transcribed and translated in step 310, as described below, to produce proteins 307 bound to substrate 301. Activity target 313 is included in droplet 300, and is inactive, as indicated in FIG. 3A by the fact that activity target 313 is a white square. In some droplets 300, activity target 313 is activated (step 320) by at least some proteins bound to substrate 301, as indicated by the change of activity target 313 to a black square. Droplets 300 may then be sorted (step 330) into activity droplets 302 and non-activity droplets 304. For example, the droplets may be sorted by fluorescence-activated cell sorting (FACS), where the activity is fluorescent in nature. Nucleic acids of the activity droplets can then be amplified and/or otherwise processed as described below.

FIG. 3B presents a schematic illustration of a non-limiting method of producing and sorting activity droplets, according to some embodiments. Initially, a plurality of droplets (some of which are shown in dashed circle 341) are sorted into a plurality of activity droplets 302 and a plurality of non-activity droplets 304, shown within dashed circles 341 in FIG. 3B. Nucleic acids from activity droplets 302 can then be incorporated into a plurality of new droplets 306 (step 340). Incorporation of nucleic acids into the plurality of new droplets can be achieved in any of a variety of suitable ways. For example, in some embodiments, a plurality of substrates are included in each activity droplet 302, such that the plurality of substrates is divided when droplets 302 are broken into plurality of droplets 306. As another example, in some embodiments the substrate is a structured species that can be fluidized, and the substrate is fluidized prior to breaking droplets 302 into plurality of droplets 306, such that the fluidized substrate is divided between droplets 306. As yet another example, nucleic acids can be amplified in activity droplets 302 to produce non-substrate-bound nucleic acids that can be broken into droplets 306, and substrates can be subsequently introduced into droplets 306 and chemically bound to the amplified nucleic acids. Other embodiments are also possible, as the disclosure is not so limited. The new droplets 306 may contain fewer types of nucleic acids than the activity droplets 302 as a result of this separation. Optionally, step 350 may be performed, wherein new droplets 306 are treated (e.g., subjected to in vitro translation) to produce new activity droplets and new non-activity droplets. In some embodiments, a plurality of droplets containing amplified nucleic acids may be further refined by iterating one or more of the steps described above.
For example, a starting plurality of droplets (e.g., comprising amplified nucleic acids of activity droplets belonging to the first plurality of droplets) may be sorted based on activity of an activity target in the starting plurality of droplets. In some embodiments, activity droplets comprising activity of the activity target may be separated from the remaining droplets of the starting plurality of droplets. Nucleic acids of the activity droplets may be separated into a new plurality of droplets. The nucleic acids of the activity droplets may be amplified before or after separation into the new plurality of droplets. Referring again to FIG. 3B, in some embodiments, sorting and separating steps as shown are iterated to determine nucleic acids associated with activity droplets of successive pluralities of droplets.
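
The iterated sorting and separating steps may be illustrated with the following toy Python simulation. It is a simplified, non-limiting sketch under several assumptions that are not part of the disclosure: each nucleic acid sequence is either active or inactive, any droplet containing at least one active sequence is detected and retained, no amplification is modeled, and the contents of retained droplets are pooled and redistributed at a lower loading in each round. The function name sort_and_split, the library composition, and the loadings of 10, 3, and 1 are hypothetical.

```python
import random
from typing import List


def sort_and_split(pool: List[bool], loading: int, rng: random.Random) -> List[bool]:
    """One round: partition the pool into droplets, keep droplets with any
    active sequence (True), and pool the contents of the kept droplets."""
    shuffled = pool[:]
    rng.shuffle(shuffled)
    droplets = [shuffled[i:i + loading] for i in range(0, len(shuffled), loading)]
    kept = [d for d in droplets if any(d)]  # idealized, error-free sorting
    return [seq for d in kept for seq in d]


rng = random.Random(0)                 # fixed seed for reproducibility
pool = [True] * 5 + [False] * 995      # hypothetical library: 5 active of 1000

# Track the fraction of active sequences before and after each round.
purities = [sum(pool) / len(pool)]
for loading in (10, 3, 1):             # hypothetical loadings per round
    pool = sort_and_split(pool, loading, rng)
    purities.append(sum(pool) / len(pool))

print(purities)
```

Because only droplets lacking any active sequence are discarded, the fraction of active sequences cannot decrease from round to round in this idealized model, and a final round at a loading of one sequence per droplet retains only active sequences.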

It should be understood that activity droplets can contain one or more nucleic acid sequences, depending on the embodiment, and that activity droplets can be active even where a plurality of nucleic acid sequences (some of which may be inactive) are included in the droplet. In some embodiments, it may be advantageous to combine the presence of multiple nucleic acids in a droplet with the use of multiple sorting steps. This approach of using multiple nucleic acids in a droplet and multiple sorting steps, a type of “multiplex sorting,” can significantly increase throughput, in some embodiments, since throughput of unique nucleic acids can be increased by including multiple nucleic acid sequences per droplet. As discussed above, in some embodiments a substrate is bound to a plurality of types of nucleic acids. In some embodiments, a plurality of substrate-bound nucleic acids is assayed for activity as presented in FIG. 3A. Then, activity droplets (droplets containing at least one active nucleic acid type) may be sorted from other droplets. And as shown in FIG. 3B, the activity droplets can then be broken into a plurality of new droplets. The substrate, and the nucleic acids therein, can be fluidized and broken into a plurality of substrates in the plurality of new droplets, tending to separate substrate-bound nucleic acid types onto separate substrates in the plurality of new droplets. This method may, in some embodiments, be iterated to isolate active nucleic acid types from inactive nucleic acid types. Sorting may be performed on droplets with any of a variety of suitable multiplex loadings. In some embodiments, a plurality of droplets is sorted wherein greater than or equal to 50%, greater than or equal to 60%, greater than or equal to 70%, greater than or equal to 80%, or greater than or equal to 90% of droplets contain a plurality of distinct nucleic acid species (e.g., at least 2 distinct nucleic acid species). 
In some embodiments, a plurality of droplets is sorted wherein less than or equal to 100%, less than or equal to 90%, less than or equal to 80%, less than or equal to 70%, or less than or equal to 60% of droplets contain a plurality of distinct nucleic acid species. Combinations of these ranges are also possible (e.g., greater than or equal to 50% and less than or equal to 100%, or greater than or equal to 80% and less than or equal to 100%). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited. It should of course be understood that some or all of the distinct nucleic acid sequences of the plurality may be bound to a substrate, depending on the embodiment.
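As a non-limiting illustration, the multiplex loadings described above can be related to an average loading with a short calculation. The sketch below assumes that nucleic acids are distributed among droplets at random (Poisson statistics) and that the library is diverse enough that any two molecules in a droplet are distinct species. Neither assumption is stated in the disclosure, and the function names are hypothetical.

```python
import math


def fraction_multiplexed(mean_loading: float) -> float:
    """Fraction of droplets holding at least 2 nucleic acid molecules.

    Assumes Poisson loading with the given mean number of molecules per droplet.
    """
    if mean_loading < 0:
        raise ValueError("mean_loading must be non-negative")
    p_zero = math.exp(-mean_loading)      # empty droplets
    p_one = mean_loading * p_zero         # droplets with exactly one molecule
    return 1.0 - p_zero - p_one


def loading_for_fraction(target: float, upper: float = 100.0) -> float:
    """Smallest mean loading giving at least `target` multiplexed droplets (bisection)."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    low, high = 0.0, upper
    for _ in range(80):                   # fraction_multiplexed is monotonic in the loading
        mid = (low + high) / 2.0
        if fraction_multiplexed(mid) < target:
            low = mid
        else:
            high = mid
    return high


if __name__ == "__main__":
    for goal in (0.5, 0.8, 0.9):
        print(goal, round(loading_for_fraction(goal), 3))
```

Under these assumptions, the calculation indicates the average loading that would correspond to each of the multiplexed fractions recited above; other loading statistics would give different values.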

The plurality of sorted droplets may have any of a variety of suitable multiplex average loadings. For example, in some embodiments, the first plurality of droplets has an average nucleic acid loading of greater than or equal to 1 nucleic acid molecule/droplet, greater than or equal to 2 nucleic acid molecules/droplet, greater than or equal to 5 nucleic acid molecules/droplet, greater than or equal to 10 nucleic acid molecules/droplet, greater than or equal to 20 nucleic acid molecules/droplet, greater than or equal to 30 nucleic acid molecules/droplet, greater than or equal to 40 nucleic acid molecules/droplet, greater than or equal to 50 nucleic acid molecules/droplet, greater than or equal to 60 nucleic acid molecules/droplet, greater than or equal to 70 nucleic acid molecules/droplet, greater than or equal to 80 nucleic acid molecules/droplet, or greater than or equal to 90 nucleic acid molecules/droplet. In some embodiments, the first plurality of droplets has an average nucleic acid loading of less than or equal to 100 nucleic acid molecules/droplet, less than or equal to 90 nucleic acid molecules/droplet, less than or equal to 80 nucleic acid molecules/droplet, less than or equal to 70 nucleic acid molecules/droplet, less than or equal to 60 nucleic acid molecules/droplet, less than or equal to 50 nucleic acid molecules/droplet, less than or equal to 40 nucleic acid molecules/droplet, less than or equal to 30 nucleic acid molecules/droplet, less than or equal to 20 nucleic acid molecules/droplet, or less than or equal to 10 nucleic acid molecules/droplet. 
Combinations of these ranges are also possible (e.g., greater than or equal to 1 nucleic acid molecule/droplet and less than or equal to 100 nucleic acid molecules/droplet, greater than or equal to 2 nucleic acid molecules/droplet and less than or equal to 10 nucleic acid molecules/droplet, or greater than or equal to 10 nucleic acid molecules/droplet and less than or equal to 30 nucleic acid molecules/droplet). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.

In some embodiments, after sorting, substrates (e.g., structured species) of a plurality of activity droplets are broken into a second plurality of droplets, e.g., to dilute the nucleic acids thereon and to ensure the second plurality of droplets has a lower average loading than the first plurality of droplets. In some embodiments, the substrates (e.g., the structured species) are fluidized prior to forming the second plurality of droplets, e.g., so that the fluidized substrate can be broken into the second plurality of droplets. Additional substrate material (e.g., additional structured species) may be incorporated into the droplets. The fluidized substrate can then be re-rigidified within the second plurality of droplets to form a second plurality of substrates (e.g., a second plurality of structured species).

The nucleic acids in the second plurality of droplets may then be more dilute than the nucleic acids in the first plurality of droplets. For example, in some embodiments, the second plurality of droplets has a nucleic acid loading of less than or equal to 5 nucleic acid molecules/droplet, less than or equal to 4.5 nucleic acid molecules/droplet, less than or equal to 4 nucleic acid molecules/droplet, less than or equal to 3.5 nucleic acid molecules/droplet, less than or equal to 3 nucleic acid molecules/droplet, less than or equal to 2.5 nucleic acid molecules/droplet, less than or equal to 2 nucleic acid molecules/droplet, less than or equal to 1.5 nucleic acid molecules/droplet, less than or equal to 1 nucleic acid molecules/droplet, or less than or equal to 0.5 nucleic acid molecules/droplet. In some embodiments, the second plurality of droplets has a nucleic acid loading of greater than or equal to 0.1 nucleic acid molecules/droplet, greater than or equal to 0.5 nucleic acid molecules/droplet, greater than or equal to 1 nucleic acid molecules/droplet, greater than or equal to 1.5 nucleic acid molecules/droplet, greater than or equal to 2 nucleic acid molecules/droplet, greater than or equal to 2.5 nucleic acid molecules/droplet, greater than or equal to 3 nucleic acid molecules/droplet, greater than or equal to 3.5 nucleic acid molecules/droplet, greater than or equal to 4 nucleic acid molecules/droplet, or greater than or equal to 4.5 nucleic acid molecules/droplet. Combinations of these ranges are also possible (e.g., greater than or equal to 0 nucleic acid molecules/droplet and less than or equal to 5 nucleic acid molecules/droplet, or greater than or equal to 0.1 nucleic acid molecules/droplet and less than or equal to 1 nucleic acid molecules/droplet). Other ranges, both higher and lower than those described above, are also possible, as the disclosure is not so limited.
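As a non-limiting illustration of how sorting at a high loading followed by re-partitioning at a lower loading may concentrate active nucleic acids, the following sketch estimates the expected active fraction after each round. It is a simplified model, not a description of any particular embodiment. It assumes Poisson loading, ideal sorting (every droplet with at least one active nucleic acid is kept and no others are), and random redistribution of the recovered nucleic acids between rounds. All names and numerical values are hypothetical.

```python
import math


def enriched_active_fraction(active_fraction: float, mean_loading: float) -> float:
    """Expected active fraction among nucleic acids recovered from activity droplets."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be in [0, 1]")
    if mean_loading <= 0.0:
        raise ValueError("mean_loading must be positive")
    if active_fraction == 0.0:
        return 0.0
    mean_active = mean_loading * active_fraction
    # Mean number of active molecules in a droplet, given that it holds at least one
    # (zero-truncated Poisson mean). expm1 keeps precision for small mean_active.
    active_per_sorted = mean_active / -math.expm1(-mean_active)
    # Inactive molecules are loaded independently, so sorting does not change their mean.
    inactive_per_sorted = mean_loading * (1.0 - active_fraction)
    return active_per_sorted / (active_per_sorted + inactive_per_sorted)


def iterate_sorting(active_fraction: float, loadings: list[float]) -> list[float]:
    """Active fraction before sorting and after each successive sorting round."""
    history = [active_fraction]
    for loading in loadings:
        active_fraction = enriched_active_fraction(active_fraction, loading)
        history.append(active_fraction)
    return history


if __name__ == "__main__":
    # Hypothetical example: 0.1% active library, first round at 20 molecules/droplet,
    # second round at 1 molecule/droplet.
    print(iterate_sorting(0.001, [20.0, 1.0]))
```

In this model, a lower loading in the second plurality of droplets yields a larger enrichment per round, at the cost of requiring more droplets per nucleic acid screened.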

Of course, the use and sorting of activity droplets is not the only way to identify active proteins. For example, in some embodiments, a method comprises disintegrating a substrate (e.g., as shown in FIG. 1E, e.g., by fluidizing the substrate) and pulling down an active protein (e.g., an active antibody) based on its interaction with another species. FIG. 4A provides a non-limiting schematic illustration of pulling down a fluidized substrate (e.g., that, prior to fluidization, contained substrates 402a and 402b) by binding active substrates (402a) to particles 450 via the interaction of proteins 407 with binding targets 413 bound to particles 450. Inactive substrates like substrate 402b may not bind to particles 450, or may bind to particles 450 to a lesser extent than active substrates 402a, with the result that substrates comprising active proteins are preferentially bound to particles 450 based on their activity. It should, of course, be understood that although suspended particles 450 are shown in FIG. 4A, any of a variety of pulldown methods may be used. For example, in some embodiments, pulldown may be performed using binding targets connected to walls of a container (e.g., walls of a well of a well-plate).

When particles are used, any of a variety of appropriate types of particles may be used. For example, particles may be functionalized solid particles (e.g., metal particles, magnetic particles, etc.), cells (e.g., eucaryotic or procaryotic cells), viral particles, or any of a variety of other particles, depending on the embodiment. When a plurality of containers is used, particles may be included in some or all of the containers, e.g., such that any appropriate portion of the active, particle-bound proteins (e.g., antibodies) in the containers may be isolated. As another option, in some embodiments, when a plurality of containers is used, their contents may be pooled before particles are added. In some embodiments, some particles are active (e.g., in the sense that they are specifically bound to active proteins such as antibodies via particle-bound binding targets) while others are inactive (e.g., in the sense that they are not specifically bound to active proteins). For example, FIG. 4B presents a non-limiting, schematic illustration of inactive particles 450a and active particles 450b being pulled down.

As shown in FIG. 4B, the particles may then be pulled down by any of a variety of appropriate methods. For example, in FIG. 4B, particles 450a and 450b are pulled down using magnet 475, such that particles 450a and 450b may be washed to remove unbound material. As another example, in some embodiments the particles may be sorted by adding them to a plurality of droplets and the droplets may be sorted based on activity, as described above with reference to FIGS. 3A-3B. It should, of course, be understood that any of a variety of appropriate methods of particle concentration (e.g., filtration, chromatography, magnetic or electrostatic separation, centrifugation, and/or droplet sorting) may be used for pull-down. As a specific example, in some embodiments, cells or viral particles are fluorescently dyed, and antibodies active against the cells are determined by including the cells in a plurality of droplets and sorting the cell-containing droplets based on activity. And in other embodiments, e.g., where binding targets are bound to well walls, pulldown may be performed by simply washing the well walls and removing excess fluid.

Depending on the embodiment, pulled-down substrates may be treated to amplify nucleic acids 403 chemically bound to the substrates of the pulled-down protein (e.g., antibody). The nucleic acids may be amplified by performing an amplification reaction in the presence of the beads and/or by de-binding the protein from the beads and reconstituting new structured substrates (e.g., like substrate 101) and amplifying the substrate-bound nucleic acids as discussed elsewhere herein. The foregoing steps of amplifying, expressing, and pulling-down nucleic acids chemically connected to active proteins they express may be used, in some embodiments, to identify proteins that are highly active against a target species.

Activity of an activity target may be directly detectable (for example, when the activity causes a change in an optical property of an activity target, such as fluorescence, luminescence, or a colorimetric change). However, in some embodiments, direct detection of activity target activity is difficult or impossible. It may be advantageous, particularly when direct activity of an activity target is difficult or impossible to detect, to include a detection agent for the purpose of detecting activity of the activity target.

Any of a variety of types of detection agents may be used. In some embodiments, a detection agent is an indirect proxy for activity of an activity target. For example, the detection agent may be an indicator that is configured to experience a signal change in the presence of activity target activity. The detection agent may be configured to produce a signal (e.g., an optical signal such as fluorescence, luminescence, or a colorimetric signal) when it encounters an activated activity target. According to some embodiments, an activated activity target inhibits a signal produced by a detection agent in the absence of the activity target. For example, a detection agent may be configured to react with an activated activity target via a reaction that consumes the detection agent, or that chemically modifies a detection agent to render it undetectable.

A substrate-bound nucleic acid may be used to express a protein using any of a variety of suitable procedures. In general, it is possible to express proteins using any technique suitable for expressing proteins encoded by dissolved, non-substrate-bound nucleic acids. For example, without wishing to be bound by any particular theory, nucleic acids bound to a substrate may be amplified as discussed in greater detail below (e.g., by PCR) in order to produce non-substrate-bound nucleic acids that may subsequently be expressed in solution. However, in some embodiments, it is advantageous to express proteins using nucleic acids that are substrate-bound rather than dissolved. For example, the substrate may serve to retain the nucleic acid such that its point of origin may be more easily ascertained, according to some embodiments. Accordingly, protein expression techniques that may be used on substrate-bound nucleic acids (e.g., in vitro transcription and translation, IVTT) may have particular advantages, in some embodiments.

Any of a variety of suitable techniques may be used to express a protein from a nucleic acid encoding it. In some embodiments, expressing a protein comprises first transcribing an mRNA. Some aspects relate to mRNAs produced by “in vitro transcription” or IVT. IVT methods produce (e.g., synthesize) an RNA transcript (e.g., mRNA transcript) by contacting a DNA template (e.g., an input DNA) with an RNA polymerase (e.g., a T7 RNA polymerase, a T7 RNA polymerase variant, etc.) under conditions that result in the production of the RNA transcript. IVT conditions typically employ a DNA template containing a promoter, nucleoside triphosphates, a buffer system that includes dithiothreitol (DTT) and magnesium ions, and an RNA polymerase. The exact conditions used in the transcription reaction depend on the amount of RNA needed for a specific application.

Some aspects relate to proteins produced by “in vitro transcription and translation” or IVTT. IVTT may be performed such that transcription and translation occur simultaneously for a given complex, or at least so that transcription and translation occur simultaneously within the same solution. IVTT methods may produce (e.g., synthesize) a polypeptide (e.g., a protein) by contacting an mRNA produced in vitro with a ribosome under conditions that result in the production of the protein. IVTT conditions typically employ a DNA template containing a promoter, a ribosome binding site, nucleoside triphosphates, a buffer system that includes dithiothreitol (DTT) and magnesium ions, an RNA polymerase, and a ribosome. The exact conditions used in the IVTT reaction may depend on the amount of protein needed for a specific application.

IVT and IVTT may be performed in containers (e.g., droplets), depending on the embodiment, such that IVT or IVTT may be performed as part of a method described herein. Transcription and translation within a plurality of containers may have certain advantages. For example, transcription and translation within a plurality of containers may ensure that activity of an activity target within a container is associated exclusively with the nucleic acids present within that container, and does not result from nucleic acids that are not present within that container. Accordingly, in some embodiments solutions provided herein comprise, in addition to a nucleic acid, one or more IVTT reagents (e.g., ribosomes, RNA polymerases, etc.). It should, of course, be understood that other solutions provided herein do not contain IVTT reagents. For example, in some embodiments a substrate is washed to remove IVTT reagents from the surrounding solution.

In some cases, nucleic acids discussed above are amplified. This may be useful, for example, to produce a larger number or concentration of nucleic acids, e.g., for subsequent analysis, sequencing, or the like, or for producing long nucleic acids chemically bound to substrates. For example, in some embodiments, at least a portion of a nucleic acid bound to a substrate (e.g., a structured species) is synthesized (e.g., by nucleic acid amplification) prior to the transcription and translation of the nucleic acid to produce a protein. Those of ordinary skill in the art will be familiar with various amplification methods that can be used, including, but not limited to, polymerase chain reaction (PCR), reverse transcription (RT)-PCR amplification, in vitro transcription amplification (IVT), multiple displacement amplification (MDA), or quantitative real-time PCR (qPCR).

Nucleic acids may be amplified while chemically bound to a substrate, depending on the embodiments. However, in some embodiments, nucleic acids are removed from a substrate prior to amplification. For example, in some embodiments, a restriction enzyme may be used to sever the chemical bond between the nucleic acid and the substrate, after which the nucleic acid amplification may be performed. Amplification on the substrate may have the advantage of preserving the location of at least some of the amplified nucleic acid. However, in some embodiments, removing the nucleic acid from the substrate prior to amplification may be advantageous, e.g., for improving the reaction rate of the nucleic acid, or for facilitating formation and addition of a new nucleic acid library to a new plurality of substrates, e.g., when discarding old substrates is desired.

In some cases, the nucleic acids may be amplified within containers (e.g., droplets). Nucleic acid amplification within the containers may allow amplification to occur “evenly” in some embodiments, e.g., such that the distribution of nucleic acids is not substantially changed after amplification, relative to before amplification. For example, according to certain embodiments, the nucleic acids within a plurality of containers may be amplified such that the number of nucleic acid molecules for each type of nucleic acid may have a distribution such that, after amplification, no more than about 5%, no more than about 2%, or no more than about 1% of the nucleic acids have a number less than about 90% (or less than about 95%, or less than about 99%) and/or greater than about 110% (or greater than about 105%, or greater than about 101%) of the overall average number of amplified nucleic acid molecules per container. In some embodiments, the nucleic acids within the containers may be amplified such that each of the nucleic acids that are amplified can be detected in the amplified nucleic acids, and in some cases, such that the mass ratio of the nucleic acid to the overall nucleic acid population changes by less than about 50%, less than about 25%, less than about 20%, less than about 15%, less than about 10%, or less than about 5% after amplification, relative to the mass ratio before amplification.
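As a non-limiting illustration, the “even” amplification criterion described above can be checked numerically. The sketch below reads the criterion as follows: given the number of amplified molecules measured for each nucleic acid type, no more than a chosen fraction of types (e.g., 5%) may fall below 90% or above 110% of the overall average. This reading, the function names, and the example counts are assumptions made for illustration only.

```python
def fraction_outside_band(counts: list[float], low: float = 0.90, high: float = 1.10) -> float:
    """Fraction of nucleic acid types whose count is below low*mean or above high*mean."""
    if not counts:
        raise ValueError("counts must not be empty")
    mean = sum(counts) / len(counts)
    outside = sum(1 for c in counts if c < low * mean or c > high * mean)
    return outside / len(counts)


def is_evenly_amplified(counts: list[float], max_fraction: float = 0.05,
                        low: float = 0.90, high: float = 1.10) -> bool:
    """True when at most max_fraction of types fall outside the [low, high] band."""
    return fraction_outside_band(counts, low, high) <= max_fraction


if __name__ == "__main__":
    # Hypothetical counts: 99 types amplified to 100 copies and one type to 50 copies.
    example = [100.0] * 99 + [50.0]
    print(fraction_outside_band(example), is_evenly_amplified(example))
```

The tighter bands recited above (95% to 105%, or 99% to 101%) can be evaluated by passing different `low` and `high` values.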

In some cases, certain primers are contained within the containers (e.g., within the droplets) to promote amplification. Such primers may be present during formation of the containers, and/or added to existing containers such as wells or droplets after formation of the droplets. It should be noted that the manner in which the primers are added to the containers may be the same or different from the manner in which the nucleic acids are added to the containers.

In certain embodiments, a plurality of different types of primers may be added to the containers (e.g., the droplets). Different primers may be distinguishable due to their having different sequences, and/or may be able to amplify different potential targets. In some cases, at least 2, at least 3, at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 60, at least 75, at least 100, at least 150, at least 200, at least 300, at least 400, at least 500, at least 1,000, at least 2,000, at least 3,000, at least 5,000, or at least 10,000, etc., different primers may be used. This may allow, for example, a variety of different target nucleic acids to be amplified within different containers.

Examples of techniques for forming droplets include those described in greater detail below. Examples of techniques for introducing primers after droplet formation include picoinjection or other methods such as those discussed in Int. Pat. Apl. Pub. No. WO 2010/151776, incorporated herein by reference, through fusion of the droplets with droplets containing primers, or the like. Other such techniques for either of these include, but are not limited to, any of those techniques described herein.

The primers may be present within the containers at any of a variety of suitable densities. For example, the primers may have a density of greater than or equal to 0.1 micromolar, greater than or equal to 0.3 micromolar, greater than or equal to 0.5 micromolar, greater than or equal to 0.8 micromolar, greater than or equal to 1 micromolar, greater than or equal to 5 micromolar, or more. In some embodiments, the primers have a density of less than or equal to 100 micromolar, less than or equal to 50 micromolar, less than or equal to 20 micromolar, less than or equal to 10 micromolar, less than or equal to 5 micromolar, less than or equal to 1 micromolar, or less. Combinations of these ranges are also possible (e.g., greater than or equal to 0.1 micromolar and less than or equal to 100 micromolar). Other ranges are also possible. The density may be independent of the density of target nucleic acids. In some cases, an excess of primers is used, e.g., such that the target nucleic acids control the reaction. For instance, if a large excess of primers is used, then substantially all of the containers will contain primer (regardless of whether or not the containers also contain target nucleic acids or substrates). For example, in certain embodiments, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, or at least about 99% of the containers may contain at least one amplification primer. As discussed above, in some embodiments, primers used for nucleic acid amplification are connected to the substrate, e.g., for the purpose of producing substrate-bound nucleic acids using the primers and a common template.
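As a non-limiting illustration, the primer concentrations recited above can be converted into an approximate number of primer molecules per droplet. The sketch below assumes a spherical droplet of a given diameter. The 50 micrometer diameter in the example is a hypothetical value chosen for illustration and is not taken from the disclosure, and the function names are likewise hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def droplet_volume_liters(diameter_um: float) -> float:
    """Volume of a spherical droplet, in liters, from its diameter in micrometers."""
    if diameter_um <= 0:
        raise ValueError("diameter_um must be positive")
    radius_m = diameter_um * 1e-6 / 2.0
    volume_m3 = (4.0 / 3.0) * math.pi * radius_m ** 3
    return volume_m3 * 1e3  # 1 cubic meter = 1000 liters


def primer_molecules_per_droplet(concentration_um: float, diameter_um: float) -> float:
    """Approximate primer molecules in one droplet at a given micromolar concentration."""
    moles = concentration_um * 1e-6 * droplet_volume_liters(diameter_um)
    return moles * AVOGADRO


if __name__ == "__main__":
    for conc in (0.1, 1.0, 100.0):
        print(conc, f"{primer_molecules_per_droplet(conc, 50.0):.3g}")
```

A calculation of this kind may help in judging whether primers are present in excess relative to the target nucleic acids in a droplet.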

Containers containing both primer and a target nucleic acid may be treated to cause amplification of the nucleic acid to occur. This may allow a large amount or concentration of the target nucleic acids to be produced, e.g., without substantially altering the distribution of nucleic acids. In some cases, the primers are selected to allow substantially all, or only some, of the target nucleic acids suspected of being present to be amplified.

As examples, PCR (polymerase chain reaction) or other amplification techniques may be used to amplify nucleic acids, e.g., contained within containers (e.g., droplets). Typically, in PCR reactions, the nucleic acids are heated (e.g., to a temperature of at least about 50 °C, at least about 70 °C, or at least about 90 °C in some cases) to cause dissociation of the nucleic acids into single strands, and a heat-stable DNA polymerase (such as Taq polymerase) is used to amplify the nucleic acid. This process is often repeated multiple times to amplify the nucleic acids.

Thus, in one set of embodiments, PCR amplification may be performed within the containers (e.g., within the droplets). For example, the containers may contain a polymerase (such as Taq polymerase), and DNA nucleotides (deoxyribonucleotides), and the containers may be processed (e.g., via repeated heating and cooling) to amplify the nucleic acid within the containers. Suitable reagents for PCR or other amplification techniques, such as polymerases and/or deoxyribonucleotides, may be added to droplets during their formation, and/or to containers or other droplets afterwards (e.g., via merger with droplets containing such reagents, and/or via direct injection of such reagents, e.g., contained within a fluid). Various techniques for droplet injection or merger of droplets will be known to those of ordinary skill in the art. See, e.g., U.S. Pat. Apl. Pub. No. 2012/0132288, incorporated herein by reference. Those of ordinary skill in the art will be aware of suitable primers, many of which can be readily obtained commercially.

In one set of embodiments, at least some of the primers may be distinguished, for example, using distinguishable fluorescent tags, barcodes, or other suitable identification tags. Examples of barcodes that can be contained within droplets include, but are not limited to, those described in U.S. Pat. Apl. Pub. No. 2018-0304222 or Int. Pat. Apl. Pub. No. WO 2015/164212, each incorporated herein by reference.

The nucleic acids may be amplified to any suitable extent. The degree of amplification may be controlled, for example, by controlling factors such as the temperature, cycle time, or amount of enzyme and/or deoxyribonucleotides contained within the containers (e.g., the droplets). For instance, in some embodiments, a population of containers may have at least about 50,000, at least about 100,000, at least about 150,000, at least about 200,000, at least about 250,000, at least about 300,000, at least about 400,000, at least about 500,000, at least about 750,000, at least about 1,000,000 or more molecules of the amplified nucleic acid per container (e.g., per droplet).
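As a non-limiting illustration of how the degree of amplification relates to the number of thermal cycles, the sketch below uses the common idealization that each PCR cycle multiplies the copy number by (1 + efficiency), where an efficiency of 1 corresponds to perfect doubling. Reagent depletion and plateau effects are ignored, and the function names and example values are hypothetical.

```python
def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(target_copies: float, initial_copies: float = 1.0,
                    efficiency: float = 1.0) -> int:
    """Smallest number of cycles giving at least target_copies under the idealized model."""
    if initial_copies <= 0 or efficiency <= 0:
        raise ValueError("initial_copies and efficiency must be positive")
    copies, cycles = initial_copies, 0
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    for target in (50_000, 1_000_000):
        print(target, cycles_to_reach(target), cycles_to_reach(target, efficiency=0.9))
```

Under this idealization, the cycle count is one of several parameters (alongside temperature, cycle time, and reagent amounts, as noted above) that may be adjusted to reach a desired number of amplified molecules per container.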

In some embodiments, a method comprises purifying nucleic acids. Purification may be used, for example, to extract the nucleic acids from unwanted reagents used in earlier steps, or from proteins expressed therefrom. Any of a variety of appropriate techniques may be used to purify the nucleic acids, such as column- or gel-based methods (including electrophoretic and centrifuge-based methods). For example, nucleic acids may be purified using a PCR clean-up kit.

One advantage of using substrates as described above is that, in some embodiments, substrates may be washed (e.g., to remove unbound and unwanted reagents or species). For example, in some embodiments, the substrate may be washed by one or more steps of diluting liquid surrounding and/or contained within the substrate. Dilution may be achieved, to provide a non-limiting example, by one or more steps of adding a diluent liquid to a container and subsequently removing liquid from the container (e.g., by breaking the container, if the container is a droplet, and adding the substrate to a new container). This approach may, advantageously, remove dissolved species while preserving the concentration of nucleic acids and/or proteins bound to the substrate. After washing, the substrate-bound nucleic acids and/or proteins may be amplified and/or re-expressed using the nucleic acids that remain chemically bound to the substrate, free from unwanted reagents or amplification products from earlier steps. This approach has significant advantages, especially for library-based methods. In particular, washing the substrate may permit removal of spent IVTT reagents and reintroduction of new IVTT reagents, which advantageously allows IVTT reactions to be performed iteratively, with the translated proteins of each reaction bound to the substrate. Iteratively performing IVTT can thus allow the detection of activity, even where the activity-producing nucleic acids are very dilute.
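As a non-limiting illustration of washing by repeated dilution, the sketch below estimates the fraction of a dissolved (unbound) species that remains after a number of wash steps. It assumes that each step adds a diluent volume to a fixed retained volume, mixes completely, and then removes liquid back down to the retained volume, while substrate-bound species are fully retained. These assumptions, the function names, and the example volumes are hypothetical.

```python
def residual_fraction(retained_volume: float, diluent_volume: float, washes: int) -> float:
    """Fraction of a dissolved species remaining after a number of dilution washes."""
    if retained_volume <= 0 or diluent_volume < 0 or washes < 0:
        raise ValueError("volumes must be positive and washes non-negative")
    per_wash = retained_volume / (retained_volume + diluent_volume)
    return per_wash ** washes


def washes_needed(retained_volume: float, diluent_volume: float, target_fraction: float) -> int:
    """Smallest number of washes leaving at most target_fraction of the dissolved species."""
    if diluent_volume <= 0 or not 0.0 < target_fraction <= 1.0:
        raise ValueError("diluent_volume must be positive and target_fraction in (0, 1]")
    per_wash = retained_volume / (retained_volume + diluent_volume)
    fraction, washes = 1.0, 0
    while fraction > target_fraction:
        fraction *= per_wash
        washes += 1
    return washes


if __name__ == "__main__":
    # Hypothetical example: 10 units retained, 90 units of diluent added per wash.
    print(residual_fraction(10.0, 90.0, 3), washes_needed(10.0, 90.0, 5e-7))
```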

After refinement of activity-producing nucleic acids, the nucleic acids may optionally be determined and/or sequenced, e.g., using techniques such as those described herein. In some embodiments, the droplets may be burst and the nucleic acids may be combined to facilitate determination and/or sequencing, although in some cases, the determination and/or sequencing may occur within the droplets.

In addition, in certain embodiments, the pool of amplified nucleic acids may be sequenced using droplet-based techniques, e.g., droplet-based PCR. For example, in some cases, the amplified nucleic acids may be collected into droplets and the droplets exposed to certain primers. In some cases, the amplified nucleic acids may be collected into droplets at relatively low concentrations, e.g., such that the droplets may, on the average, contain less than 1 nucleic acid per droplet, as described herein. In addition, in certain embodiments, the droplets may be divided into different groups of droplets, which are exposed to different primers. For instance, the droplets may be divided into at least 5, 10, 30, 100, etc. groups, which are exposed to various primers, e.g., in different spatial locations, to determine whether a target nucleic acid was present in the sample. However, it should be understood that in other embodiments, the amplified nucleic acids may be present at relatively higher concentrations, e.g., at least 1 nucleic acid per droplet or at least 1 target per droplet. In some cases, more than one primer or one amplicon may be present within a droplet.
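As a non-limiting illustration of working at loadings of less than 1 nucleic acid per droplet, the sketch below uses the Poisson correction commonly applied in digital droplet assays to estimate the average number of target molecules per droplet from the fraction of droplets that test positive. This estimator is not described in the disclosure and is offered only as an assumption-laden example. It assumes random loading and that any droplet containing at least one target is scored positive. The function name and example counts are hypothetical.

```python
import math


def mean_targets_per_droplet(positive_droplets: int, total_droplets: int) -> float:
    """Estimate the average targets per droplet from the positive-droplet fraction.

    Under Poisson loading, P(droplet is empty) = exp(-mean), so
    mean = -ln(1 - positive_fraction).
    """
    if total_droplets <= 0 or not 0 <= positive_droplets < total_droplets:
        raise ValueError("need 0 <= positive_droplets < total_droplets")
    positive_fraction = positive_droplets / total_droplets
    return -math.log1p(-positive_fraction)


if __name__ == "__main__":
    # Hypothetical example: 1,000 positive droplets out of 10,000 screened.
    print(round(mean_targets_per_droplet(1_000, 10_000), 4))
```

The correction matters most at higher loadings, where a meaningful share of positive droplets contain more than one target.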

Examples of methods for determining and/or sequencing nucleic acids include, but are not limited to, chain-termination sequencing, sequencing-by-hybridization, Maxam-Gilbert sequencing, dye-terminator sequencing, chain-termination methods, Massively Parallel Signature Sequencing (Lynx Therapeutics), polony sequencing, pyrosequencing, sequencing by ligation, ion semiconductor sequencing, DNA nanoball sequencing, single-molecule real-time sequencing (e.g., PacBio sequencing), nanopore sequencing, Sanger sequencing, digital RNA sequencing (“digital RNA-seq”), Illumina sequencing, etc. In some cases, a microarray, such as a DNA microarray, may be used, for example, to determine, or to sequence, a nucleic acid. In some cases, the pool of amplified nucleic acids may be determined or identified, e.g., without any sequencing.

Although the methods described above may be implemented in any of a variety of contexts, in some embodiments it is advantageous to perform one or more of the methods in a plurality of compartments, e.g., to take advantage of the combinatorial methods described elsewhere herein.

In some embodiments, the method comprises determining and/or amplifying active nucleic acids. An article comprising a plurality of compartments may be used to perform one or more of the methods provided herein, depending on the embodiment. For example, according to some embodiments, the article is a well-plate comprising a plurality of wells configured to define compartments comprising the substrates, proteins, and/or nucleic acids prepared as described above. As another example, in some embodiments the article is a microfluidic chip configured to define compartments comprising substrates, proteins, and/or nucleic acids prepared as described above, as discussed in greater detail below. Any of a variety of other articles comprising compartmentalized fluids may also be used, depending on the embodiment.

A plurality of compartments (e.g., a plurality of droplets) generally comprises a first compartment (e.g., a first droplet) and a second compartment (e.g., a second droplet), but can comprise any of an appropriate number of compartments. In some embodiments, a plurality of compartments contains greater than or equal to 5, greater than or equal to 10, greater than or equal to 10², greater than or equal to 10³, greater than or equal to 10⁴, greater than or equal to 10⁵, greater than or equal to 10⁶, greater than or equal to 10⁷, or more compartments. In some embodiments, a plurality of compartments contains less than or equal to 10⁸, less than or equal to 10⁷, less than or equal to 10⁶, less than or equal to 10⁵, or fewer compartments. Combinations of these ranges are possible. For example, in some embodiments, a plurality of compartments contains greater than or equal to 5 and less than or equal to 10⁸ compartments. Other ranges are also possible. In some embodiments, a method comprises forming more than one plurality of compartments.

The containers may be droplets, according to some particularly advantageous embodiments. Additional details regarding systems and methods for manipulating droplets in a microfluidic system follow, in accordance with certain aspects. For example, various systems and methods for screening and/or sorting droplets are described in U.S. Patent Application Serial No. 11/360,845, filed February 23, 2006, entitled “Electronic Control of Fluidic Species,” by Link, et al., published as U.S. Patent Application Publication No. 2007/0003442 on January 4, 2007, incorporated herein by reference. As a non-limiting example, in some aspects, by applying (or removing) a first electric field to the device (or a portion thereof), a droplet may be directed to a first region or channel; by applying (or removing) a second electric field to the device (or a portion thereof), the droplet may be directed to a second region or channel; by applying a third electric field to the device (or a portion thereof), the droplet may be directed to a third region or channel; etc., where the electric fields may differ in some way, for example, in intensity, direction, frequency, duration, etc.

In one set of embodiments, the droplets are broken down, e.g., after amplification, to allow substrates bound to amplified nucleic acids to be pooled together. A wide variety of methods for “breaking” or “bursting” droplets are available to those of ordinary skill in the art. For example, droplets contained in a carrying fluid may be disrupted using techniques such as mechanical disruption, chemical disruption, or ultrasound. Droplets may also be disrupted using chemical agents or surfactants, for example, 1H,1H,2H,2H-perfluorooctanol.

Articles and/or fluidic systems may be used to perform one or more of the methods described herein within a plurality of droplets (or other compartments). Droplet-based fluidic systems and articles may offer several advantages for determining nucleic acids. For example, microfluidic droplet-based fluidic systems and articles may facilitate high-throughput processing of samples of nucleic acids.

Some aspects of the present disclosure are generally directed to systems and methods for containing or encapsulating nucleic acids, proteins, and/or substrates such as those discussed herein within microfluidic droplets or other suitable compartments, for example, microwells of a microwell plate, individual spots on a slide or other surface, or the like. Thus, it should be understood that as discussed herein, when nucleic acids, proteins, and/or substrates are contained within droplets, this is by way of example only, and in other embodiments, the nucleic acids may be contained within microwells, spots on a slide, or other suitable compartments.

Nucleic acids, primers, substrates, reagents and/or activity targets may be allocated to the compartments (e.g., droplets) of the plurality in any of a variety of suitable ways. In some embodiments, substrates are added to containers (e.g., droplets) in a manner designed to ensure that most containers include exactly one substrate. For example, in some embodiments substrates are added to and shaken within an emulsion, such that a relatively homogeneous distribution of the substrates within the emulsion is achieved. The concentration of the emulsion and the size of droplets produced by a fluidic device may then be chosen such that when the emulsion is partitioned into the droplets, a suitable proportion of the droplets contain exactly one substrate. For example, according to some embodiments, greater than or equal to 50%, greater than or equal to 75%, greater than or equal to 90%, greater than or equal to 95%, or greater than or equal to 99% of the droplets contain exactly one substrate. Other ranges are also possible as the disclosure is not so limited. A fluidic system may be used to perform some or all of the method steps described above. In one aspect of the present disclosure, emulsions are formed by flowing two, three, or more fluids through a system of channels of a fluidic system. The fluidic system may be or comprise an article. The system or article may be a microfluidic system or article. “Microfluidic,” as used herein, refers to a device, apparatus or system including at least one fluid channel having a cross-sectional dimension (measured perpendicular to the direction of fluid flow) of less than about 1 millimeter (mm), and in some cases, a ratio of length to largest cross-sectional dimension of at least 3:1.
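As a non-limiting illustration of how a substrate concentration and a droplet size together set the occupancy of the droplets, the following sketch assumes that the substrates are distributed randomly, so that the number of substrates per droplet follows a Poisson distribution. The function names and the numerical inputs are hypothetical and are not taken from the disclosure. Under this purely random model, the proportion of droplets containing exactly one substrate cannot exceed about 36.8% (reached at a mean occupancy of one), so higher proportions would rely on loading that departs from this assumption.

```python
import math


def poisson_fraction(k: int, mean_occupancy: float) -> float:
    """Probability that a droplet holds exactly k substrates (Poisson model)."""
    return math.exp(-mean_occupancy) * mean_occupancy ** k / math.factorial(k)


def mean_occupancy(substrates_per_ml: float, droplet_diameter_um: float) -> float:
    """Average number of substrates per droplet: concentration times droplet volume."""
    radius_cm = droplet_diameter_um * 1e-4 / 2.0          # 1 micrometer = 1e-4 cm
    volume_ml = (4.0 / 3.0) * math.pi * radius_cm ** 3    # 1 cm^3 = 1 mL
    return substrates_per_ml * volume_ml


# Hypothetical inputs, chosen only for illustration.
lam = mean_occupancy(substrates_per_ml=2.0e6, droplet_diameter_um=50.0)
empty = poisson_fraction(0, lam)
single = poisson_fraction(1, lam)
multiple = 1.0 - empty - single

print(f"mean occupancy: {lam:.3f}")
print(f"empty: {empty:.3f}, exactly one: {single:.3f}, more than one: {multiple:.4f}")
```

In this sketch, lowering the concentration reduces the proportion of droplets holding more than one substrate, at the cost of more empty droplets.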

A “channel,” as used herein, means a feature on or in a system or article that at least partially directs flow of a fluid. The channel can have any cross-sectional shape (circular, oval, triangular, irregular, square or rectangular, or the like) and can be covered or uncovered. One or more of the channels may (but not necessarily), in cross section, have a height that is substantially the same as a width at the same point.

In embodiments where it is completely covered, at least one portion of the channel can have a cross-section that is completely enclosed, or the entire channel may be completely enclosed along its entire length with the exception of its inlet(s) and/or outlet(s). A channel may also have an aspect ratio (length to average cross sectional dimension) of at least 2:1, more typically at least 3:1, 5:1, 10:1, 15:1, 20:1, or more. An open channel generally will include characteristics that facilitate control over fluid transport, e.g., structural characteristics (an elongated indentation) and/or physical or chemical characteristics (hydrophobicity vs. hydrophilicity) or other characteristics that can exert a force (e.g., a containing force) on a fluid. The fluid within the channel may partially or completely fill the channel. In some cases where an open channel is used, the fluid may be held within the channel, for example, using surface tension (i.e., a concave or convex meniscus).

The channel may be of any size, for example, having a largest dimension perpendicular to fluid flow of less than about 5 mm or 2 mm, or less than about 1 mm, or less than about 500 microns, less than about 200 microns, less than about 100 microns, less than about 60 microns, less than about 50 microns, less than about 40 microns, less than about 30 microns, less than about 25 microns, less than about 10 microns, less than about 3 microns, less than about 1 micron, less than about 300 nm, less than about 100 nm, less than about 30 nm, or less than about 10 nm. In some cases the dimensions of the channel may be chosen such that fluid is able to freely flow through the article or substrate. The dimensions of the channel may also be chosen, for example, to allow a certain volumetric or linear flowrate of fluid in the channel. Of course, the number of channels and the shape of the channels can be varied by any method known to those of ordinary skill in the art. In some cases, more than one channel or capillary may be used. For example, two or more channels may be used, where they are positioned inside each other, positioned adjacent to each other, positioned to intersect with each other, etc.

The fluidic droplets within the channels may have a cross-sectional dimension smaller than about 100% of an average cross-sectional dimension of the channel, and in certain embodiments, smaller than about 90%, smaller than about 80%, about 70%, about 60%, about 50%, about 40%, about 30%, about 20%, about 10%, about 5%, about 3%, about 1%, about 0.5%, about 0.3%, about 0.1%, about 0.05%, about 0.03%, or about 0.01% of the average cross-sectional dimension of the channel.

During use, at least some processing of the droplets may be performed on an article. Thus, in some embodiments, an article comprises at least some of a plurality of droplets described above. For example, the article may comprise all droplets of a plurality of droplets. The droplets may be fluidically connected to one or more reservoirs of the fluidic system (e.g., to a pool used to form droplets, to a hydrophobic fluid used to form droplets, to a supply of an activity target, to a supply of a detection agent, to a supply of in vitro transcription and translation reagents, or any of a variety of other fluids described herein) via the article. For example, the droplets may be connected to one or more reservoirs of a fluidic system via the microchannel.

In some embodiments, the fluidic system comprises one or more additional components, such as a pressure source (for example, a pump), a detection tool (e.g., a sensor that may be used to detect fluorescence, luminescence, and/or colorimetric changes resulting from activity of an activity target), and/or a waste stream.

In one set of embodiments, a sample containing nucleic acids may be contained within a plurality of droplets, e.g., contained within a suitable carrying fluid. The nucleic acids may be present during formation of the droplets, and/or added to the droplets after formation. Any suitable method may be chosen to create droplets, and a wide variety of different droplet makers and techniques for forming droplets will be known to those of ordinary skill in the art. For example, a junction of channels may be used to create the droplets. The junction may be, for instance, a T-junction, a Y-junction, a channel-within-a-channel junction (e.g., in a coaxial arrangement, or comprising an inner channel and an outer channel surrounding at least a portion of the inner channel), a cross (or “X”) junction, a flow-focusing junction, or any other suitable junction for creating droplets. See, for example, International Patent Application No. PCT/US2004/010903, filed April 9, 2004, entitled “Formation and Control of Fluidic Species,” by Link, et al., published as WO 2004/091763 on October 28, 2004, or International Patent Application No. PCT/US2003/020542, filed June 30, 2003, entitled “Method and Apparatus for Fluid Dispersion,” by Stone, et al., published as WO 2004/002627 on January 8, 2004, each of which is incorporated herein by reference in its entirety.

In certain embodiments, nucleic acids may be added to a droplet after the droplet has been formed, e.g., through picoinjection or other methods such as those discussed in Int. Pat. Apl. Pub. No. WO 2010/151776, entitled “Fluid Injection” (incorporated herein by reference), through fusion of the droplets with droplets containing the nucleic acids, or through other techniques known to those of ordinary skill in the art.

It should be understood that the nucleic acid may activate the activity target indirectly (e.g., by encoding a protein that, under the proper conditions, can be produced using the nucleic acid in order to activate the activity target). The activity target may also be included in the droplet. The activity target may be present in the droplets initially, or may be introduced after their formation (e.g., using an aforementioned technique, such as picoinjection).

As mentioned, certain embodiments comprise a droplet contained within a carrying fluid. For example, there may be a first phase forming droplets contained within a second phase, where the surface between the phases comprises one or more proteins. For example, the second phase may comprise oil or a hydrophobic fluid, while the first phase may comprise water or another hydrophilic fluid (or vice versa). It should be understood that a hydrophilic fluid is a fluid that is substantially miscible in water and does not show phase separation with water at equilibrium under ambient conditions (typically 25 °C and 1 atm). Examples of hydrophilic fluids include, but are not limited to, water and other aqueous solutions comprising water, such as cell or biological media, ethanol, salt solutions, saline, blood, etc. In some cases, the fluid is biocompatible.

Similarly, a hydrophobic fluid is one that is substantially immiscible in water and will show phase separation with water at equilibrium under ambient conditions. As previously discussed, the hydrophobic fluid is sometimes referred to by those of ordinary skill in the art as the “oil phase” or simply as an oil. Non-limiting examples of hydrophobic fluids include oils such as hydrocarbon oils, silicone oils, fluorocarbon oils, organic solvents, perfluorinated oils, perfluorocarbons such as perfluoropolyether, etc. Additional examples of potentially suitable hydrocarbons include, but are not limited to, light mineral oil (Sigma), kerosene (Fluka), hexadecane (Sigma), decane (Sigma), undecane (Sigma), dodecane (Sigma), octane (Sigma), cyclohexane (Sigma), hexane (Sigma), or the like. Non-limiting examples of potentially suitable silicone oils include 2 cSt polydimethylsiloxane oil (Sigma). Non-limiting examples of fluorocarbon oils include FC3283 (3M), FC40 (3M), Krytox GPL (Dupont), etc. In addition, other hydrophobic entities may be contained within the hydrophobic fluid in some embodiments. Non-limiting examples of other hydrophobic entities include drugs, immunologic adjuvants, or the like.

Thus, the hydrophobic fluid may be present as a separate phase from the hydrophilic fluid. In some embodiments, the hydrophobic fluid may be present as a separate layer, although in other embodiments, the hydrophobic fluid may be present as individual fluidic droplets contained within a continuous hydrophilic fluid, e.g. suspended or dispersed within the hydrophilic fluid. This is often referred to as an oil/water emulsion. The droplets may be relatively monodisperse, or be present in a variety of different sizes, volumes, or average diameters. In some cases, the droplets may have an overall average diameter of less than about 1 mm, or other dimensions as discussed herein. In some cases, a surfactant may be used to stabilize the hydrophobic droplets within the hydrophilic liquid, for example, to prevent spontaneous coalescence of the droplets. Non-limiting examples of surfactants include those discussed in U.S. Pat. Apl. Pub. No. 2010/0105112, incorporated herein by reference. Other non-limiting examples of surfactants include Span80 (Sigma), Span80/Tween-20 (Sigma), Span80/Triton X-100 (Sigma), Abil EM90 (Degussa), Abil WE 09 (Degussa), polyglycerol polyricinoleate “PGPR90” (Danisco), Tween-85, 749 Fluid (Dow Corning), the ammonium carboxylate salt of Krytox 157 FSL (Dupont), the ammonium carboxylate salt of Krytox 157 FSM (Dupont), or the ammonium carboxylate salt of Krytox 157 FSH (Dupont). In addition, the surfactant may be, for example, a peptide surfactant, bovine serum albumin (BSA), or human serum albumin.

The droplets may have any suitable shape and/or size. In some cases, the droplets may be microfluidic, and/or have an average diameter of less than about 1 mm. For instance, the droplet may have an average diameter of less than about 1 mm, less than about 700 micrometers, less than about 500 micrometers, less than about 300 micrometers, less than about 100 micrometers, less than about 70 micrometers, less than about 50 micrometers, less than about 30 micrometers, less than about 10 micrometers, less than about 5 micrometers, less than about 3 micrometers, less than about 1 micrometer, etc. The average diameter may also be greater than about 1 micrometer, greater than about 3 micrometers, greater than about 5 micrometers, greater than about 7 micrometers, greater than about 10 micrometers, greater than about 30 micrometers, greater than about 50 micrometers, greater than about 70 micrometers, greater than about 100 micrometers, greater than about 300 micrometers, greater than about 500 micrometers, greater than about 700 micrometers, or greater than about 1 mm in some cases. Combinations of any of these are also possible; for example, the diameter of the droplet may be between about 1 mm and about 100 micrometers. The diameter of a droplet, in a non-spherical droplet, may be taken as the diameter of a perfect mathematical sphere having the same volume as the non-spherical droplet.

In some embodiments, the droplets may be of substantially the same shape and/or size (i.e., “monodisperse”), or of different shapes and/or sizes, depending on the particular application. In some cases, the droplets may have a homogenous distribution of cross-sectional diameters, i.e., in some embodiments, the droplets may have a distribution of average diameters such that no more than about 20%, no more than about 10%, or no more than about 5% of the droplets may have an average diameter greater than about 120% or less than about 80%, greater than about 115% or less than about 85%, greater than about 110% or less than about 90%, greater than about 105% or less than about 95%, greater than about 103% or less than about 97%, or greater than about 101% or less than about 99% of the average diameter of the microfluidic droplets. Some techniques for producing homogenous distributions of cross-sectional diameters of droplets are disclosed in International Patent Application No. PCT/US2004/010903, filed April 9, 2004, entitled “Formation and Control of Fluidic Species,” by Link, et al., published as WO 2004/091763 on October 28, 2004, incorporated herein by reference. In addition, in some instances, the coefficient of variation of the average diameter of the droplets may be less than or equal to about 20%, less than or equal to about 15%, less than or equal to about 10%, less than or equal to about 5%, less than or equal to about 3%, or less than or equal to about 1%. However, in other embodiments, the droplets may not necessarily be substantially monodisperse, and may instead exhibit a range of different diameters.
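As a non-limiting illustration, the two monodispersity measures discussed above (the proportion of droplets falling outside a band around the average diameter, and the coefficient of variation) may be computed as in the following sketch. The function names and the sample diameters are hypothetical, and the population standard deviation is used here as one possible convention; the disclosure does not specify a particular estimator.

```python
import statistics


def coefficient_of_variation(diameters):
    """Standard deviation of the diameters divided by their mean."""
    mean = statistics.fmean(diameters)
    return statistics.pstdev(diameters) / mean


def fraction_outside(diameters, tolerance):
    """Fraction of droplets whose diameter lies outside mean * (1 +/- tolerance)."""
    mean = statistics.fmean(diameters)
    low, high = mean * (1.0 - tolerance), mean * (1.0 + tolerance)
    outside = sum(1 for d in diameters if d < low or d > high)
    return outside / len(diameters)


# Hypothetical measured diameters, in micrometers.
sample_um = [48.0, 49.0, 50.0, 50.0, 50.0, 50.0, 51.0, 52.0, 50.0, 50.0]

cv = coefficient_of_variation(sample_um)
print(f"coefficient of variation: {cv:.1%}")
print(f"outside 97%-103% of mean: {fraction_outside(sample_um, 0.03):.0%}")
print(f"outside 95%-105% of mean: {fraction_outside(sample_um, 0.05):.0%}")
```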

Those of ordinary skill in the art will be able to determine the average diameter of a population of droplets, for example, using laser light scattering or other known techniques. The droplets so formed can be spherical, or non-spherical in certain cases. The diameter of a droplet, in a non-spherical droplet, may be taken as the diameter of a perfect mathematical sphere having the same volume as the non-spherical droplet.
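As a non-limiting illustration of the equal-volume convention described above, the diameter d of a sphere of volume V satisfies V = πd³/6, so that d = (6V/π)^(1/3). The following sketch applies this relationship to a hypothetical plug-shaped droplet approximated as a cylinder; the function names and the dimensions are assumptions made only for illustration.

```python
import math


def sphere_volume(diameter):
    """Volume of a sphere with the given diameter."""
    return math.pi * diameter ** 3 / 6.0


def equivalent_sphere_diameter(volume):
    """Diameter of the sphere having the same volume as a non-spherical droplet."""
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)


# Hypothetical plug-shaped droplet approximated as a cylinder
# (radius 20 micrometers, length 100 micrometers).
plug_volume_um3 = math.pi * 20.0 ** 2 * 100.0
d_eq_um = equivalent_sphere_diameter(plug_volume_um3)
print(f"equivalent diameter: {d_eq_um:.1f} micrometers")
```

For these assumed dimensions, the equivalent diameter is about 62 micrometers.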

In some embodiments, one or more droplets may be created within a channel by creating an electric charge on a fluid surrounded by a liquid, which may cause the fluid to separate into individual droplets within the liquid. In some embodiments, an electric field may be applied to the fluid to cause droplet formation to occur. The fluid can be present as a series of individual charged and/or electrically inducible droplets within the liquid. Electric charge may be created in the fluid within the liquid using any suitable technique, for example, by placing the fluid within an electric field (which may be AC, DC, etc.), and/or causing a reaction to occur that causes the fluid to have an electric charge.

The electric field, in some embodiments, is generated from an electric field generator, i.e., a device or system able to create an electric field that can be applied to the fluid. The electric field generator may produce an AC field (i.e., one that varies periodically with respect to time, for example, sinusoidally, sawtooth, square, etc.), a DC field (i.e., one that is constant with respect to time), a pulsed field, etc. Techniques for producing a suitable electric field (which may be AC, DC, etc.) are known to those of ordinary skill in the art. For example, in one embodiment, an electric field is produced by applying voltage across a pair of electrodes, which may be positioned proximate a channel such that at least a portion of the electric field interacts with the channel. The electrodes can be fashioned from any suitable electrode material or materials known to those of ordinary skill in the art, including, but not limited to, silver, gold, copper, carbon, platinum, tungsten, tin, cadmium, nickel, indium tin oxide (“ITO”), etc., as well as combinations thereof.

In another set of embodiments, droplets of fluid can be created from a fluid surrounded by a liquid within a channel by altering the channel dimensions in a manner that is able to induce the fluid to form individual droplets. The channel may, for example, be a channel that expands relative to the direction of flow, e.g., such that the fluid does not adhere to the channel walls and forms individual droplets instead, or a channel that narrows relative to the direction of flow, e.g., such that the fluid is forced to coalesce into individual droplets. In some cases, the channel dimensions may be altered with respect to time (for example, mechanically or electromechanically, pneumatically, etc.) in such a manner as to cause the formation of individual droplets to occur. For example, the channel may be mechanically contracted (“squeezed”) to cause droplet formation, or a fluid stream may be mechanically disrupted to cause droplet formation, for example, through the use of moving baffles, rotating blades, or the like.

Some embodiments generally relate to systems and methods for fusing or coalescing two or more droplets into one droplet, e.g., where the two or more droplets ordinarily are unable to fuse or coalesce, for example, due to composition, surface tension, droplet size, the presence or absence of surfactants, etc. In certain cases, the surface tension of the droplets, relative to the size of the droplets, may also prevent fusion or coalescence of the droplets from occurring.

As a non-limiting example, two droplets can be given opposite electric charges (i.e., positive and negative charges, not necessarily of the same magnitude), which can increase the electrical interaction of the two droplets such that fusion or coalescence of the droplets can occur due to their opposite electric charges. For instance, an electric field may be applied to the droplets, the droplets may be passed through a capacitor, a chemical reaction may cause the droplets to become charged, etc. The droplets, in some cases, may not be able to fuse even if a surfactant is applied to lower the surface tension of the droplets. However, if the droplets are electrically charged with opposite charges (which can be, but are not necessarily of, the same magnitude), the droplets may be able to fuse or coalesce. As another example, the droplets may not necessarily be given opposite electric charges (and, in some cases, may not be given any electric charge), and are fused through the use of dipoles induced in the droplets that cause the droplets to coalesce. Also, the two or more droplets allowed to coalesce are not necessarily required to meet “head-on.” Any angle of contact, so long as at least some fusion of the droplets initially occurs, is sufficient. See also, e.g., U.S. Patent Application Serial No. 11/698,298, filed January 24, 2007, entitled “Fluidic Droplet Coalescence,” by Ahn, et al., published as U.S. Patent Application Publication No. 2007/0195127 on August 23, 2007, incorporated herein by reference in its entirety.

In one set of embodiments, a fluid may be injected into a droplet. The fluid may be microinjected into the droplet in some cases, e.g., using a microneedle or other such device. In other cases, the fluid may be injected directly into a droplet using a fluidic channel as the droplet comes into contact with the fluidic channel. Other techniques of fluid injection are disclosed in, e.g., International Patent Application No. PCT/US2010/040006, filed June 25, 2010, entitled “Fluid Injection,” by Weitz, et al., published as WO 2010/151776 on December 29, 2010; or International Patent Application No. PCT/US2009/006649, filed December 18, 2009, entitled “Particle-Assisted Nucleic Acid Sequencing,” by Weitz, et al., published as WO 2010/080134 on July 15, 2010, each incorporated herein by reference in its entirety.

The following documents are each incorporated herein by reference in its entirety for all purposes: Int. Pat. Apl. Pub. No. WO 2016/168584, entitled “Barcoding System for Gene Sequencing and Other Applications,” by Weitz et al.; Int. Pat. Apl. Pub. No. WO 2015/161223, entitled “Methods and Systems for Droplet Tagging and Amplification,” by Weitz, et al.; U.S. Pat. Apl. Ser. No. 61/980,541, entitled “Methods and Systems for Droplet Tagging and Amplification,” by Weitz, et al.; U.S. Pat. Apl. Ser. No. 61/981,123, entitled “Systems and Methods for Droplet Tagging,” by Bernstein, et al.; Int. Pat. Apl. Pub. No. WO 2004/091763, entitled “Formation and Control of Fluidic Species,” by Link et al.; Int. Pat. Apl. Pub. No. WO 2004/002627, entitled “Method and Apparatus for Fluid Dispersion,” by Stone et al.; Int. Pat. Apl. Pub. No. WO 2006/096571, entitled “Method and Apparatus for Forming Multiple Emulsions,” by Weitz et al.; Int. Pat. Apl. Pub. No. WO 2005/021151, entitled “Electronic Control of Fluidic Species,” by Link et al.; Int. Pat. Apl. Pub. No. WO 2011/056546, entitled “Droplet Creation Techniques,” by Weitz, et al.; Int. Pat. Apl. Pub. No. WO 2010/033200, entitled “Creation of Libraries of Droplets and Related Species,” by Weitz, et al.; U.S. Pat. Apl. Pub. No. 2012-0132288, entitled “Fluid Injection,” by Weitz, et al.; Int. Pat. Apl. Pub. No. WO 2008/109176, entitled “Assay And Other Reactions Involving Droplets,” by Agresti, et al.; and Int. Pat. Apl. Pub. No. WO 2010/151776, entitled “Fluid Injection,” by Weitz, et al.; and U.S. Pat. Apl. Ser. No. 62/072,944, entitled “Systems and Methods for Barcoding Nucleic Acids,” by Weitz, et al.

In addition, the following are incorporated herein by reference in their entireties: U.S. Pat. Apl. Ser. No. 61/981,123 filed April 17, 2014; PCT Pat. Apl. Ser. No. PCT/US2015/026338, filed April 17, 2015, entitled “Systems and Methods for Droplet Tagging”; U.S. Pat. Apl. Ser. No. 61/981,108 filed April 17, 2014; U.S. Pat. Apl. Ser. No. 62/072,944, filed October 30, 2014; PCT Pat. Apl. Ser. No. PCT/US2015/026443, filed on April 17, 2015, entitled “Systems and Methods for Barcoding Nucleic Acids”; U.S. Pat. Apl. Ser. No. 62/106,981, entitled “Systems, Methods, and Kits for Amplifying or Cloning Within Droplets,” by Weitz, et al.; U.S. Pat. Apl. Pub. No. 2010-0136544, entitled “Assay and Other Reactions Involving Droplets,” by Agresti, et al.; U.S. Pat. Apl. Ser. No. 61/981,108, entitled “Methods and Systems for Droplet Tagging and Amplification,” by Weitz, et al.; Int. Pat. Apl. Pub. No. PCT/US2014/037962, filed May 14, 2014, entitled “Rapid Production of Droplets,” by Weitz, et al.; and U.S. Provisional Patent Application Serial No. 62/133,140, filed 03/13/15, entitled “Determination of Cells Using Amplification,” by Weitz, et al.

A library of nucleic acids as described herein may be prepared by any of a variety of appropriate methods. For example, in some embodiments, a library is prepared by error-prone PCR. However, error-prone PCR is nonrandom; certain codon mutations are more likely than others, and some codon mutations are totally forbidden. By using the systems and methods described below, libraries may be generated by a more uniformly random process, without any forbidden mutations. The libraries may then be screened using one or more of the systems or methods described above. The systems and methods described herein may thereby surpass the performance of conventional methods of library refinement, such as refinement of libraries prepared by error-prone PCR, by screening more functionally robust and diverse nucleic acid libraries, with higher randomness, using fewer steps.

Accordingly, in some embodiments, a library of nucleic acids is synthesized using a common template nucleic acid and one or more pluralities of primers configured to introduce site-specific mutations when amplifying the common template nucleic acid. Suitable techniques for preparing nucleic acid libraries are detailed, for example, in Int. Pat. Apl. Pub. No. WO 2024/073375, incorporated herein by reference in its entirety.

In some aspects, nucleic acids and proteins identified by one or more of the methods provided herein are provided. According to some embodiments, a nucleic acid provided herein is a mutant of a wild-type nucleic acid expressing the Bacillus subtilis Lipase A (“BsLipA”) enzyme. For example, in some embodiments, a nucleic acid provided herein comprises a sequence that is at least 70%, at least 80%, at least 90%, or at least 95% identical to one of SEQ. ID. NOS. 21-28. In some embodiments, a nucleic acid provided herein comprises one of SEQ. ID. NOS. 21-28. For example, a nucleic acid provided herein may be one of SEQ. ID. NOS. 21-28. In some embodiments, the nucleic acid sequences provided herein do not have a sequence identical to a naturally occurring nucleic acid sequence (e.g., are not identical to a portion of SEQ. ID. NO. 19).
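As a non-limiting illustration of a percent identity calculation, the following sketch compares two sequences of equal length position by position. It assumes the sequences have already been aligned without gaps; practical comparisons typically involve an alignment step and a stated convention for gaps, neither of which is specified here. The function name and both short sequences are hypothetical and do not correspond to any sequence of the sequence listing.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical 20-nucleotide sequences differing at two positions.
reference = "ATGGCTGAACACAATCCAGT"
variant = "ATGGCTGAGCACAATCCTGT"

identity = percent_identity(reference, variant)
print(f"identity: {identity:.1f}%")
print("meets a 90% threshold:", identity >= 90.0)
```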

In some embodiments, a protein provided herein comprises an amino acid sequence that is at least 70%, at least 80%, at least 90%, or at least 95% identical to an amino acid sequence that is expressible via transcription and translation of one of SEQ. ID. NOS. 21-28. In some embodiments, a protein provided herein comprises an amino acid sequence that is expressible via transcription and translation of one of SEQ. ID. NOS. 21-28. In some embodiments, the proteins provided herein do not have a sequence identical to a naturally occurring protein.

The following examples are intended to illustrate certain embodiments of the present disclosure, but do not exemplify the full scope of the disclosure.

EXAMPLE 1

This example illustrates the production and use of exemplary substrates covalently bound to nucleic acids and used to transcribe and translate proteins that were subsequently bound to the substrates. A structured species in the form of an agarose gel bead was prepared by amplifying a plurality of substrate-bound primers with a gene configured to express green fluorescent protein (GFP). IVTT was used to transcribe and translate GFP from the GFP-expressing genes bound to the substrate. Subsequently, the expressed GFP was chemically bound to the substrate with a SNAP-tag connected to a degradable TEV-linker.

The substrates were prepared by encapsulating agarose into monodispersed surfactant-stabilized water-in-oil droplets through single emulsion droplet microfluidics as described above. The droplets were cooled to produce the substrate in the form of a structured agarose species, in the form of a uniform hydrogel. Each substrate was functionalized to carry the same number of primers and protein binding sites. The substrates were re-encapsulated into droplets, and primers were amplified such that each droplet carried one nucleic acid type, which was a unique mutation of the naturally occurring GFP-expressing gene. Amplification was performed using PCR, during which the substrate fluidized due to heating of the droplet. After PCR, each substrate was chemically bound to multiple copies of the expressed GFP-expressing gene mutant.

After PCR, the droplets were cooled to re-rigidify the agarose beads. The beads were released from the droplets and re-encapsulated together with IVTT reagents, expressing a GFP mutant in each droplet. The expressed GFP mutants were covalently bound to the substrates using the SNAP-tags, and the substrates were washed to ensure the amount of GFP in each droplet was the same (because each droplet contained about one substrate that initially contained about the same number of protein binding sites, and because those protein binding sites were saturated due to overexpression of GFP).

FIG. 5A shows a micrograph of the resulting agarose beads after IVTT and washing. The green color (indicated in the figure by black arrows pointing to the center of the beads) indicates GFP expressed from IVTT binding to the beads via a probe-tag interaction. To demonstrate that the GFP was actually bound to the substrate, the TEV-linker was then degraded using TEV protease, and the substrate was subsequently washed. FIG. 5B shows a micrograph of the same substrates after treatment with the protease to release GFP from the beads. As shown, the beads no longer demonstrated activity of the GFP, making them impossible to resolve.

GFP was used illustratively, since its activity is relatively easy to detect, but it should of course be understood that any enzyme or antibody could be expressed or studied in a similar fashion, depending on the embodiment. Thus, this example demonstrates the general efficacy of the methods provided herein.

EXAMPLE 2

This example illustrates the production and use of exemplary substrates covalently bound to nucleic acids and used to transcribe and translate proteins that were subsequently bound to the substrates. A structured species in the form of an agarose gel bead was prepared by amplifying a plurality of substrate-bound primers to express a library of genes expressing anti-green fluorescent protein (GFP) antibodies. IVTT was used to transcribe and translate the anti-GFP antibodies from the library of genes bound to the substrate. Subsequently, the expressed anti-GFP antibodies were chemically bound to the substrate with a SNAP-tag connected to a degradable TEV-linker.

The substrates were prepared by encapsulating agarose into monodispersed surfactant-stabilized water-in-oil droplets through single emulsion droplet microfluidics as described above. The droplets were cooled to produce the substrate in the form of a structured agarose species, in the form of a uniform hydrogel. Each substrate was functionalized to carry primers that could be amplified to produce the anti-GFP gene library and protein binding sites. The substrates were re-encapsulated into droplets, and primers were amplified to chemically bind the anti-GFP gene library to the substrate. Amplification was performed using PCR, during which the substrate fluidized due to heating of the droplet. After PCR, each substrate was chemically bound to a library of anti-GFP genes.

After PCR, the droplets were cooled to re-rigidify the agarose beads. The beads were released from the droplets and re-encapsulated together with IVTT reagents, expressing a library of anti-GFP antibodies in each droplet. The expressed anti-GFP antibodies were covalently bound to the substrates using the SNAP-tags, and the substrates were washed to remove unbound material. Next, the washed agarose substrates were fluidized to disintegrate them into pluralities of agarose chains, at least some of which covalently bound expressed anti-GFP antibodies to the gene expressing those anti-GFP antibodies. The fluidized substrates were pooled, and a plurality of magnetic beads chemically bound to GFP were introduced to the fluidized substrates, creating conditions where higher activity of the anti-GFP antibodies was associated with stronger binding to the magnetic beads. Thus, the beads preferentially bound to agarose substrates chemically bound to active antibodies. The magnetic beads were pulled down using a magnet, and the beads were washed to remove unbound material, preferentially eliminating inactive antibodies.

Next, amplification was performed using PCR. First, a TEV protease was used to degrade the TEV-linkers linking the antibodies to the agarose substrate. Then, a restriction enzyme was used to degrade the linkage between the nucleic acids and the agarose substrate to release the nucleic acids. Finally, PCR reagents were added to perform the nucleic acid amplification, producing a new nucleic acid library. The new nucleic acid library was then chemically bound to a new plurality of agarose substrates, allowing another round of nucleic acid amplification, transcription and translation, and pull-down of active anti-GFP antibodies to be performed. The amplification, expression, and pull-down steps were iterated twice. FIGS. 6A-6C show the activity of representative agarose substrates, demonstrated by the addition of GFP to the solution, for the initial library (FIG. 6A), after the first iteration (FIG. 6B), and after the second iteration (FIG. 6C). Under these conditions, substrates with highly active anti-GFP antibodies, indicated by black arrows in the drawings, fluoresced more intensely. As shown, the proportion of substrates with highly active anti-GFP antibodies increased with each iteration, demonstrating that the process successfully selected for anti-GFP antibodies. Moreover, although the trend is difficult to perceive in the black-and-white renderings of FIGS. 6A-6C, the less-active substrates (not indicated by arrows) in FIG. 6C were visibly greener than the less-active substrates (not indicated by arrows) in FIG. 6A, suggesting that even the less-active substrates of FIG. 6C demonstrated higher anti-GFP activity after performing the above-described method. These trends further demonstrate the performance improvements offered by this approach.

GFP was used as a binding target illustratively, since its activity is relatively easy to detect, but it should of course be understood that any antibody could be expressed or studied in a similar fashion, depending on the embodiment. Thus, this example demonstrates the general efficacy of the methods provided herein.

EXAMPLE 3

This example illustrates the development and use of an ultra-high throughput microfluidic droplet-based platform that allows for the generation of large enzyme variant libraries and supports IVTT-based production and activity screening of purified enzymes. The platform separates each of the reaction steps through use of a bifunctional agarose hydrogel bead that allows for purification of both amplified DNA and in vitro-expressed protein to facilitate optimum conditions for each reaction. Each reaction was still performed in drops, but between reactions the beads were removed from the drops, the reagents were exchanged, and the beads were re-compartmentalized using particle-templated emulsification. To increase the throughput even further, a multiplexed sort was introduced, where up to 20 different DNA variants were pooled in each drop for the initial screen, improving the effective throughput by ~100 times. The potential of the platform was demonstrated by improving the thermotolerance of Bacillus subtilis Lipase A (BsLipA), a versatile enzyme used in biodiesel production and in detergent additives to catalyze the hydrolysis and synthesis of esters. Multiple new variants with > 40 °C improvements in thermotolerance were identified after only one mutagenesis round and less than 1 hour of total screening time.

Results and Discussion

Agarose beads, which melt during PCR and therefore do not impede DNA extension, yet solidify when the PCR reaction is complete, were used as a structured species. These properties allowed the drops to be merged and the beads to be washed. To bind amplified DNA to the beads, the agarose was functionalized with forward primers. When the template DNA was amplified by these primers, it covalently attached to the agarose chain. Upon cooling, gel beads were re-formed, and DNA remained bound to them. To enable functional screening of purified protein variants, the synthesized protein was also bound to the same bead. This was achieved by expressing the protein in chimeric form with a SNAP-tag and by covalently linking the SNAP-tag substrate, O⁶-benzylguanine (BG), to the agarose. This system was chosen for its compatibility with both PCR and IVTT conditions. Additionally, when the SNAP-tag reacted with BG, it formed a covalent bond, preventing protein loss under various reaction conditions and during washing steps.

FIG. 7 shows an example screening workflow 701 of purified protein using bifunctional agarose gel beads. At 703, FIG. 7 shows an emulsification of agarose beads with a DNA template and PCR reagents for in-droplet PCR amplification. At 705, FIG. 7 shows a removal of PCR reagents and re-emulsification of agarose beads with IVTT reagents for protein synthesis. At 707, FIG. 7 shows a removal of IVTT reagent and re-emulsification of agarose beads with fluorogenic substrate for enzymatic reaction. And at 709, FIG. 7 shows a sorting of active variants.

To form the agarose beads, both the forward primer and BG were covalently linked to the agarose using click reactions. The polymer was first modified to introduce terminal alkynes by mixing an agarose solution with glycidyl propargyl ether under basic conditions. The forward primer was modified with a 5' azide group, and BG was azide-modified as well. To prevent steric hindrance during the annealing of DNA to the oligonucleotide, a hexa-ethyleneglycol spacer was incorporated between the azide group and the oligonucleotide. Similarly, to minimize steric hindrance during the binding of the SNAP-tag to BG, a diethylene glycol spacer molecule was added. The oligonucleotide was linked to the modified agarose using Cu(I) as a catalyst. Separately, the BG was linked to modified agarose in the same manner.

To generate bifunctional beads, droplets of a melted mixture of unmodified, primer-modified and BG-modified agarose were produced at a final total concentration of 1.5%. To produce the droplets, an air-triggered microfluidics device was used. The air-triggered microfluidics device operated at a high flow rate of over 2 mL/h and generated uniform 45 µm diameter droplets. To ensure that agarose stayed in liquid form during droplet generation, the whole system was heated and the drops were collected on ice. After the droplets cooled, they solidified to form agarose beads, which could be subsequently released from the emulsion. By adjusting the concentrations of primer-modified and BG-modified agarose, the concentrations of primer and BG were set in the agarose beads. In all cases, the primer-modified agarose was adjusted to achieve a primer concentration of 500 nM, which is the standard for PCR.

FIGS. 8A-8G show bifunctional agarose beads used for DNA amplification and protein tagging. FIG. 8A shows chemical reactions for the modification of agarose with terminal alkyne groups and further click reactions with primers and BG. FIG. 8B shows a schematic illustration of a bifunctional agarose bead with primers and BG modifications. FIG. 8C shows a brightfield-microscope image of the air-triggered microfluidic device generating molten agarose droplets as the precursors of agarose beads. Scale bar: 100 µm. FIG. 8D shows a brightfield-microscope image of uniform droplets of molten agarose at the collection of the microfluidic device. Scale bar: 100 µm. FIG. 8E shows a fluorescence-microscope image of bifunctional agarose beads stained with Qubit ssDNA specific dye. Scale bar: 50 µm. FIG. 8F shows a fluorescence-microscope image of bifunctional agarose beads tagged with GFP-SNAP. Scale bar: 50 µm. FIG. 8G shows concentration of GFP bound to the agarose beads as a function of the input concentration of BG. (Sample size, n = 20).

To validate the characteristics of the beads, the behavior of the primer and BG bound to the agarose was measured. It was confirmed that the primers were stable during bead generation by melting the beads and quantifying the amount of ssDNA using Qubit. The value was compared to the value measured for the primers on the agarose prior to bead generation. It was found that the bead generation process had no impact on primer concentration. The beads stained with Qubit ssDNA dye were also imaged using fluorescence microscopy, and exhibited uniform levels of fluorescence, confirming the homogeneity of the beads, as shown in FIG. 8E. To confirm that the BG modification of the agarose was also successful, GFP-SNAP chimeric protein was added to the beads made with 50 µM BG. After 8 hours of tagging, the beads were washed to remove excess GFP-SNAP and imaged using confocal fluorescence microscopy. The beads exhibited uniform green fluorescence, confirming that BG was successfully bound to the beads and homogeneously distributed among and within the beads, as shown in FIG. 8F. These results also confirmed that the protein GFP-SNAP accessed and bound to the BG uniformly throughout the volume of the bead. To determine the dependence of the binding of GFP on BG concentration, beads with varying concentrations of BG were prepared and mixed with GFP-SNAP chimera. After 12 hours of incubation, the beads were washed and imaged using confocal microscopy. The dependence of bound GFP on BG concentration was measured, and the bound GFP reached a concentration as high as 1.5 µM, as shown in FIG. 8G. These results also confirmed that GFP was covalently linked through BG rather than non-specifically bound to agarose.

To validate the use of these beads for digital PCR, PCR reagents were infused into the beads and a DNA template coding for GFP-SNAP was added at a concentration of 1 molecule per 10 beads. The beads were emulsified using particle-templated emulsion (PTE), ensuring only a single bead per drop, as shown schematically in FIG. 7 (at 703). This method was much simpler than microfluidic encapsulation. PCR amplification was performed in the drops, then the beads were removed from the emulsion, washed to remove PCR reagents and stained with high sensitivity Qubit dsDNA dye.

FIGS. 9A-9L show that bifunctional agarose beads enable DNA amplification, enzyme expression, purification and enzymatic assay. FIG. 9A shows fluorescence-microscope images of agarose beads after digital droplet PCR with 0.1 DNA molecule per drop; FIG. 9B is similar but after digital droplet PCR with 1 DNA molecule per drop; and FIG. 9C is similar but after digital droplet PCR with 10 DNA molecules per drop. To detect the amplified DNA, fluorescence microscopy was used and the expected fraction of bright beads was observed, as shown by the image in FIG. 9A. To confirm that the platform was sensitive to a single DNA template, these measurements were repeated using DNA template concentrations of 1 and 10 molecules per bead. The expected fractions of bright beads were observed again, as shown in FIGS. 9B-9C. Moreover, the data followed the behavior expected for Poisson statistics, as shown in FIG. 9D. This confirmed the absence of false positives and the sensitivity to a single template per bead. The advantage of using these beads was that the agarose melts during the PCR reaction. An overlay of the Qubit dsDNA staining channel (blue) and brightfield is also shown. Beads are highlighted by dashed lines for clarity. Scale bar: 50 µm.
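
The Poisson behavior referred to above can be illustrated with a short calculation. The following is a minimal sketch, assuming ideal Poisson loading of templates into beads and perfect amplification of any bead that receives at least one template; the function name is hypothetical and is not part of the disclosure.

```python
import math


def fraction_positive(mean_templates_per_bead: float) -> float:
    """Poisson probability that a bead receives at least one DNA template.

    Assumes templates are distributed independently and at random, so the
    fraction of empty beads is exp(-mean).
    """
    if mean_templates_per_bead < 0:
        raise ValueError("mean loading must be non-negative")
    return 1.0 - math.exp(-mean_templates_per_bead)


# Mean loadings used for FIGS. 9A-9C: 0.1, 1 and 10 templates per bead.
for mean_loading in (0.1, 1.0, 10.0):
    share = fraction_positive(mean_loading)
    print(f"{mean_loading:>4} templates per bead: {share:.1%} of beads expected bright")
```

Under these assumptions, roughly one bead in ten is expected to be bright at a loading of 0.1, about two thirds at a loading of 1, and essentially all beads at a loading of 10.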

FIG. 9D shows quantification of frequency of positive gels after droplet PCR. (Sample size, n = 5). FIGS. 9E-9G show confocal images of agarose beads after droplet IVTT and tagging of GFP-SNAP on the beads after (i) 2 hours (FIG. 9E), (ii) 5 hours (FIG. 9F), and (iii) 42 hours (FIG. 9G) of reaction. Scale bar: 50 µm. FIG. 9H shows quantification of tagged GFP concentration on individual agarose beads after different durations of droplet IVTT and tagging. (Sample size, n = 20).

To validate the use of these beads to synthesize and capture proteins, the beads grafted with DNA coding for GFP-SNAP were infused with IVTT reagents and re-emulsified using PTE, ensuring a single bead per drop, as shown schematically in FIG. 7 (at 705). The time dependence of GFP synthesis and SNAP-tagging was measured: an aliquot of beads was obtained, the emulsion was broken, and the beads were washed to remove all IVTT reagents and untagged GFP and then imaged with confocal microscopy. Strong and uniform green fluorescence on the beads was observed after 2 hours of reaction, as shown in FIG. 9E. The fluorescence intensity further increased after 5 and 42 hours, as shown in FIGS. 9F-9G. This confirmed that GFP was successfully produced and bound to the beads. The concentration of GFP increased in time, reaching 1 µM in 2 hours and exceeding 2 µM at the end of the 42-hour reaction, as shown in FIG. 9H. These results illustrated the utility of the agarose beads and confirmed that they can be used to synthesize and capture functional proteins using IVTT, even from a single DNA template.

To illustrate the utility of the bifunctional agarose beads, the beads were used to perform a cell-free directed evolution of an enzyme to improve its thermotolerance. Lipase A from Bacillus subtilis (BsLipA) was used, an enzyme that catalyzes both the hydrolysis and synthesis of esters, making it valuable for industrial applications, including in biodiesel production and as an additive in detergents. However, the enzyme's practical utility is often constrained by its susceptibility to high temperatures commonly used in industrial processes. The wild-type BsLipA irreversibly loses its catalytic activity when exposed to elevated temperatures. Therefore, improving the thermotolerance of lipases like BsLipA would significantly expand their applicability across a broader range of high-temperature industrial processes. To measure the functionality of the BsLipA enzyme, a fluorogenic esterase substrate, Calcein AM, which fluoresces upon hydrolysis, was used. To test for thermotolerance, the enzyme was first exposed to a high temperature for a fixed period and then tested for functionality. The bifunctional agarose beads simplified this workflow.

Addition of a substrate to test the functionality of the enzyme in the presence of IVTT proved challenging due to the complex composition of the IVTT mixture. For example, when the substrate was added directly to IVTT expressing BsLipA, no fluorescence signal was detected. However, the agarose beads could be easily washed after the enzyme was produced by IVTT. The same procedure used to produce GFP was followed, but with DNA that encoded BsLipA. One molecule of DNA was loaded per 10 drops to ensure that the activity of active BsLipA could be detected. The DNA was amplified, the enzyme was expressed with IVTT, and the agarose beads were washed, infused with the solution of Calcein AM substrate, and re-emulsified using PTE, as shown schematically in FIG. 7 (at 707).

After 1 hour of enzymatic reaction, the drops were imaged using confocal microscopy. A small number of bright drops were clearly observed, indicating active enzymes, as shown in FIG. 9I. FIG. 9I shows an overlay of brightfield- and fluorescence-microscope images of droplets containing hydrogel beads with WT BsLipA and Calcein AM after 1 hour of reaction time. Scale bar: 50 µm. The intensities were analyzed using a microfluidic droplet cytometer and their distribution was plotted normalized to the mean of the top 10%. The distribution exhibited two populations of drops: a high intensity population which accounted for 10% of all drops and a low intensity population which accounted for the remaining 90%, consistent with DNA loading of 1 molecule per 10 beads, as shown in FIG. 9J. FIG. 9J shows (i) a microfluidics cytometry analysis of droplets containing agarose beads with WT BsLipA and Calcein AM after 1 hour of reaction time, and (ii) an inset showing expansion of the tail region of the distribution. Thus, the agarose beads were used to synthesize active enzymes and test their functionality.

To demonstrate the advantages of cell-free enzyme expression, results from agarose bead-based expression were compared with those obtained using live-cell expression systems. E. coli cells expressing BsLipA were prepared and loaded with 1 cell per 10 drops, co-encapsulated with both lysis reagent and substrate. After incubation for 1 hour, the drops were imaged using confocal microscopy and a small number of bright drops with varying intensities were observed, as shown in FIG. 9K. FIG. 9K shows an overlay of brightfield- and fluorescence-microscope images of droplets containing Calcein AM and single cells expressing WT BsLipA after 1 hour reaction demonstrating the biological variability. Scale bar: 50 µm. FIG. 9L (i) shows a microfluidic cytometry analysis of droplets containing single cells expressing wild-type BsLipA, after a 1-hour reaction with Calcein AM, with loading of 1 DNA molecule encoding BsLipA per 10 drops. FIG. 9L (ii) shows an enlarged inset of droplets with normalized fluorescence greater than 2.

The intensities were quantified using microfluidic droplet cytometry, plotted, and the distribution was normalized to the mean of the top 10%. A broad range of fluorescence intensities was observed, as shown in FIG. 9L (i). These data were in sharp contrast to those of the cell-free expression where there was a well-defined peak. To further emphasize the contrast, the tail of each distribution was enlarged. The distribution of the cell-free expression exhibited only a very small number of drops with higher intensities, whereas, by contrast, there was a very large tail of drops with higher intensities in the distribution of the live-cell expressions, as seen by comparing the insets of FIG. 9J and FIG. 9L. These results clearly show the benefits of cell-free expression using agarose beads in reducing the variability of expression level as compared to live cell systems. This is advantageous when selecting variants with improved enzymatic activity; the cell-free system makes it possible to more clearly distinguish smaller increases in activity than the cell-based system.

To demonstrate the application of bifunctional agarose beads for selecting thermotolerant BsLipA variants, a reference DNA library was generated by mixing DNA that codes for WT enzyme with DNA that codes for a much more thermotolerant variant of BsLipA in a 99:1 ratio. The thermotolerant variant preserved close to 100% activity after 20 minutes of incubation at 60 °C, whereas, by contrast, the wildtype BsLipA lost its activity after 20 minutes of incubation at 60 °C, as shown in FIG. 10A. Agarose beads were produced using 1 DNA template from the library for each 10 beads. The DNA was amplified and the enzyme was expressed on the beads. After washing the beads, the beads were re-encapsulated using PTE and subjected to a temperature of 70 °C for 1 hour. After cooling the beads, the emulsion was broken, and the beads were re-emulsified with the substrate. After 1 hour of reaction, two distinct populations of drops were observed using a microfluidic droplet cytometer: the first population had a very low level of fluorescence and accounted for approximately 99.9% of the drops, whereas the second population had a significantly higher level of fluorescence and comprised ~0.1% of the drops, as shown in FIG. 10B. This was consistent with the frequency of expected positive drops when 1% of DNA encoded the thermotolerant variant, and when 0.1 DNA molecules were loaded per drop.
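
The expected frequency of positive drops quoted above can be checked with a short calculation. The following is a minimal sketch, assuming Poisson loading of templates and assuming that a drop is positive whenever it contains at least one template encoding the thermotolerant variant; the function name is hypothetical.

```python
import math


def positive_drop_fraction(templates_per_drop: float, hit_frequency: float) -> float:
    """Expected fraction of drops holding at least one 'hit' template.

    With Poisson loading at mean templates_per_drop, the number of hit
    templates per drop is itself Poisson-distributed with mean
    templates_per_drop * hit_frequency.
    """
    if templates_per_drop < 0 or not 0.0 <= hit_frequency <= 1.0:
        raise ValueError("loading must be >= 0 and hit_frequency within [0, 1]")
    return 1.0 - math.exp(-templates_per_drop * hit_frequency)


# Reference library from the text: 1% thermotolerant DNA (99:1 mixture),
# loaded at 0.1 DNA molecules per drop.
fraction = positive_drop_fraction(templates_per_drop=0.1, hit_frequency=0.01)
print(f"expected positive drops: {fraction:.3%}")
```

Under these assumptions the result is close to 0.1%, in line with the bright population observed in FIG. 10B.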

To validate the performance of this selection, drops were injected into a concentric microfluidic sorter which minimized the number of false positive drops, as shown in FIG. 10C and schematically in FIG. 7 (at 709). The gate was set to sort the 0.1% of drops that emitted the highest fluorescence. To mimic the conditions with a very low number of hits, 10 drops or 1 drop per sample were sorted. This was repeated 5 times. The emulsion was broken by freeze-thawing the sample, and the DNA was amplified. From each sample containing 10 sorted drops, the DNA was successfully recovered, while the recovery rate from a single sorted drop was 80%. Moreover, sequencing results showed that DNA recovered from every experiment coded for the thermotolerant BsLipA variant. To further confirm the selectivity, 1000 random drops were sorted without gating three times, then the DNA was recovered and sequenced. For all three samples, the DNA coded for the wildtype, as shown in FIG. 10D and FIGS. 11A-11C. These results demonstrated the advantages of the bifunctional agarose beads: even when sorting very few drops per sample (often the case when screening gene libraries for significantly improved variants), the bead display system ensured high recovery rate and reproducibility. Moreover, this fully synthetic system eliminated the noise in ultra-high throughput screens derived from expression differences between individual cells and resulted in the full separation of the improved variant from the wildtype enzyme.

FIGS. 10A-10D show droplet fluorescence intensity and microfluidic sorting analysis for enzymatic assays. FIG. 10A shows activity of WT BsLipA and a thermotolerant variant of BsLipA, after 20-minute incubation at 20 and 60 °C, mean ± 0.95 confidence interval. (Sample size, n = 2). FIG. 10B shows a distribution of droplet fluorescence for the reference library. The sorted population is labeled in red. FIG. 10C shows a brightfield image of a concentric microfluidics sorter. Black arrow shows the droplet inlet; red arrow shows the collection channel of sorted droplets and white arrow shows the collection channel for the waste droplets. Scale bar: 100 µm. FIG. 10D is a table showing the recovery of DNA from samples with different numbers of sorted drops and sequence identity of the recovered DNA.

FIGS. 11A-11C show Sanger sequencing traces of recovered samples in the reference library sorts. FIG. 11A shows four recovered DNA samples from the 1-drop gated sorts. FIG. 11B shows five recovered DNA samples from the 10-drop gated sorts. FIG. 11C shows three recovered DNA samples from 1000 random drops. Residues 134-142 for wildtype BsLipA correspond to MIVMNYLSR (SEQ ID NO: 29), while for the thermotolerant variant MT20, the sequence corresponds to PIVANSLSM (SEQ ID NO: 30). The thermotolerant positive control MT20 carries four amino acid substitutions, M134P, M137A, Y139S, and R142M, between residue numbers 134 and 142; therefore, this part of the sequencing traces is shown to demonstrate the successful identification of the positive control in all recovered samples.
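
The four substitutions listed above follow directly from a position-by-position comparison of the two peptide segments. The following is a minimal sketch of that comparison; the helper name is hypothetical, and the only inputs are the two sequences and the starting residue number given in the text.

```python
def list_substitutions(wild_type: str, variant: str, first_residue: int) -> list[str]:
    """Return substitutions in the usual <wild type><position><variant> notation.

    Both segments must be aligned and of equal length; first_residue is the
    residue number of the first character in each segment.
    """
    if len(wild_type) != len(variant):
        raise ValueError("segments must have the same length")
    return [
        f"{wt}{first_residue + offset}{mt}"
        for offset, (wt, mt) in enumerate(zip(wild_type, variant))
        if wt != mt
    ]


# Residues 134-142 of wildtype BsLipA and of the MT20 variant, as given in the text.
print(list_substitutions("MIVMNYLSR", "PIVANSLSM", first_residue=134))
# prints ['M134P', 'M137A', 'Y139S', 'R142M']
```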

To illustrate the utility and advantages of these bifunctional agarose beads in identifying enzymes with improved functionality, bifunctional agarose beads were used to improve the thermotolerance of BsLipA. To make significant improvements in the thermotolerance of an enzyme, multiple amino acid substitutions are often required. However, for a typical-sized enzyme, such as BsLipA, the sequence space exceeds 1 million for more than two random mutations. To practically screen a representative sample of the theoretical sequence space, potentially beneficial residue positions were chosen so the search space was limited. These chosen positions were then simultaneously saturated, resulting in a combinatorial saturation library. To improve the thermotolerance of BsLipA, positions were chosen based on a deep mutational scanning data set. The positions were ranked by the maximum improvement of thermotolerance and the 15 residue positions that had the highest positive impact on thermotolerance of BsLipA were chosen.
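
The growth of the sequence space described above can be made concrete with a simple count. The following is a minimal sketch, assuming 19 alternative amino acids per position and an illustrative protein length of 181 residues; that length is an assumption made for this example and is not stated in the text, and the function names are hypothetical.

```python
from math import comb


def variants_with_k_substitutions(protein_length: int, k: int) -> int:
    """Count sequences differing from the parent at exactly k positions.

    Choose k of the positions, then one of 19 other amino acids at each.
    """
    return comb(protein_length, k) * 19 ** k


def saturation_library_size(n_positions: int) -> int:
    """Amino-acid combinations when n chosen positions are fully saturated."""
    return 20 ** n_positions


ASSUMED_LENGTH = 181  # illustrative length only; not taken from the text

for k in (1, 2, 3):
    count = variants_with_k_substitutions(ASSUMED_LENGTH, k)
    print(f"{k} substitution(s): {count:.3g} possible variants")

print(f"15 saturated positions: {saturation_library_size(15):.3g} combinations")
```

Under these assumptions the count passes one million once two positions are varied and grows by orders of magnitude with each further substitution, and even a library restricted to 15 positions remains far larger than any practical screen, which is consistent with screening a representative sample rather than the full space.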

To prepare the library, oligos with degenerate codons at all 15 target residue positions were designed and mutations were introduced using QuickChange mutagenesis and overlap extension PCR. The library was transformed into E. coli, colonies were grown, and 144 were chosen for Sanger sequencing. The library comprised variants with 0-15 single amino acid substitutions, with an average of 6 substitutions per variant, as shown in FIG. 12A. FIG. 12A shows a distribution of the number of amino acid substitutions per gene in the combinatorial library. FIG. 12B shows a heat map of the frequency of distinct amino acid substitutions at the 15 targeted positions. For each targeted residue position, substitutions into 10-17 other amino acids were observed, as shown in FIG. 12B. Owing to the limited sequencing depth, not every possible amino acid substitution was observed at each position; nevertheless, the locations and substitutions were consistent with the combinatorial saturation library design. To characterize the activity and thermotolerance of the library, enzymes from the 144 colonies were expressed and a plate reader was used to measure the activity of these variants using an E. coli cell lysate assay. The variants’ thermotolerance was also characterized by incubating the lysates from the variants for 20 minutes at 70 °C and measuring their residual activity.

FIG. 12C shows activity of randomly selected variants from the combinatorial BsLipA library measured using E. coli lysate (i) without heat inactivation and (ii) after 20-min incubation at 70 °C. The activity values are normalized to WT BsLipA activity without heat inactivation. Before heat inactivation, half of the 144 variants exhibited detectable esterase activity, defined as > 10% of the reaction rate of the wildtype; however, after heat inactivation no activity from any variant was observed, as shown in FIG. 12C. Therefore, the hit rate in this library was lower than what could be screened by a well plate assay and thus required an ultra-high throughput screening system.

The hydrogel bead display system was used to screen for improved thermotolerance. A loading of 0.3 DNA molecules per bead was used, the DNA was amplified, the enzyme was produced using IVTT, and the enzyme was heated to 75 °C for 1 hour. This temperature was higher than that used for the reference library. Substrate was added, ~500 k beads were screened at ~250 Hz sorting speed, and the ~0.1% of the beads with the highest fluorescence were sorted. The DNA was recovered from the sorted beads, then amplified and screened a second time with a loading number of 0.1 DNA molecules per bead. The second round of screening ensured that each sorted bead carried DNA only for one variant from the first round, further enriching the most thermotolerant variants. The DNA was recovered after the second sort and the variant genes were subcloned into the pET28a vector to remove the SNAP-tag. To isolate and test individual variants, the plasmids were transformed into BL21(DE3) cells and single colonies were picked to express individual variants in individual wells of a 96-well plate. Then, their residual activity after incubation at 75 °C for 30 minutes was measured.

FIG. 12D shows (i) a schematic workflow of a first screening experiment which comprised a first sort with 0.3 genes per gel, a second sort with 0.1 genes per gel, and a plate-based lysate assay; and (ii) residual activity of individually expressed variants measured using E. coli lysate after 30-min incubation at 75 °C. Variants with >10% residual activity are plotted. Two thermotolerant clones, which are defined by >10% activity retention after heat inactivation, were identified, as shown in FIG. 12D (ii). Sanger sequencing of thermotolerant clones revealed only a single unique DNA sequence, which represented a low number of significantly improved variants recovered from the screen.

When performing droplet-based library screening, the average loading number of variants in drops is often set to be less than one, ensuring a single variant in the vast majority of filled drops. This ensures that the catalytic activity measured from each drop can be attributed to an individual variant, enabling accurate comparison between them. However, this low loading significantly limits the throughput of the screen. To overcome this limitation, a new method was developed, termed a “multiplexed sort.” A much larger number of variants were loaded in each drop and all positive drops were selected in the sorter. As a result, each sorted drop contained not only the active variant but also many inactive variants. This mixed population was separated by a second round of sorting where the average loading number was less than 1 (a “demultiplexing sort”). This multiplexed sort method allowed higher loadings, which further increased the potential number of variants screened. However, the effectiveness of the method was influenced by the frequency of finding active variants; the frequency needed to be sufficiently low for the majority of droplets to contain no active variants whatsoever, even when loaded with large numbers of variants. As a result, higher selection pressure permitted a greater number of variants per drop.
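
The trade-off between hit frequency and permissible loading can be sketched numerically. The following is a minimal illustration, assuming Poisson loading and assuming that a drop scores positive whenever it holds at least one active variant; the function name, the hit frequencies, and the 0.1% ceiling on positive drops are hypothetical values chosen only for illustration and are not taken from the disclosure.

```python
import math


def max_loading(hit_frequency: float, max_positive_fraction: float) -> float:
    """Largest mean loading that keeps positive drops at or below a ceiling.

    Inverts 1 - exp(-loading * hit_frequency) = max_positive_fraction,
    the Poisson probability of at least one active variant in a drop.
    """
    if not 0.0 < hit_frequency <= 1.0:
        raise ValueError("hit_frequency must be within (0, 1]")
    if not 0.0 < max_positive_fraction < 1.0:
        raise ValueError("max_positive_fraction must be within (0, 1)")
    return -math.log(1.0 - max_positive_fraction) / hit_frequency


# Hypothetical hit frequencies; rarer hits correspond to higher selection pressure.
for frequency in (1e-3, 1e-5, 1e-7):
    loading = max_loading(frequency, max_positive_fraction=0.001)
    print(f"hit frequency {frequency:.0e}: up to ~{loading:.3g} variants per drop")
```

In this simplified picture the permissible loading scales inversely with the hit frequency, which matches the observation that higher selection pressure permits more variants per drop.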

To demonstrate the concept of multiplexed sorting, 5 DNA templates, on average, were loaded per hydrogel bead and the beads were analyzed as described previously (i.e., droplet PCR, IVTT, heat inactivation and hydrolysis reaction). The top 0.1% drops were sorted, the DNA was recovered, a second round of sorting was performed with an average DNA loading number of 0.1, and a final lysate assay was conducted, as shown schematically in FIG. 12E (i). Thirteen thermotolerant clones in this screen exhibited residual activity from 0.10 to 1.86, as shown in FIG. 12E (ii); from these clones, four unique DNA sequences were identified. FIG. 12E shows (i) a schematic workflow of a second screening experiment which comprised a first sort with 5 genes per gel, a second sort with 0.1 genes per gel, and a plate-based lysate assay; and (ii) residual activity of individually expressed variants measured using E. coli lysate after 30-min incubation at 75 °C. Variants with >10% residual activity are plotted. To further exploit the capacity of multiplexed sorting, the DNA loading number of the initial sort was increased to 20 per drop, on average, and the full screening process was repeated, as shown schematically in FIG. 12F (i). From 28 clones, 9 unique thermotolerant variants that exhibited normalized reaction rates ranging from 0.13 to 1.24 were identified, as shown in FIG. 12F (ii). FIG. 12F shows (i) a schematic workflow of a third screening experiment which comprised a first sort with 20 genes per gel, a second sort with 0.1 genes per gel, and a plate-based lysate assay; and (ii) residual activity of individually expressed variants measured using E. coli lysate after 30-min incubation at 75 °C. Variants with >10% residual activity are plotted.

These results demonstrated the benefits of multiplexed sorting. As the DNA loading number of the initial sort was increased from 0.3 to 5 to 20, the screening throughput increased from 150k to 2.5 million to 10 million. Correspondingly, the number of identified thermotolerant variants increased from 1 to 4 to 9. Moreover, because the selection criterion for the sort was determined from the total number of drops screened, higher activities were observed for the variants selected from the largest loading level. Furthermore, even for the screen with 10 million variants, the total sorting time was less than 1 hour. By contrast, screening an equivalent number of variants using a single variant per drop would require multiple days of sorting. Thus, these results demonstrate the effectiveness of multiplexed sorting.
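
The throughput and sorting-time figures above follow from simple arithmetic. The following is a minimal sketch, assuming that each screen processed about 500 k beads at about 250 Hz, as described for the first screen, and assuming a loading of 0.1 variants per drop for the single-variant comparison; the function names are hypothetical.

```python
def variants_screened(beads: int, mean_loading: float) -> int:
    """Approximate number of variants interrogated in one sort."""
    return round(beads * mean_loading)


def sort_time_minutes(beads: int, rate_hz: float) -> float:
    """Time needed to pass the given number of beads through the sorter."""
    return beads / rate_hz / 60.0


BEADS_PER_SCREEN = 500_000  # ~500 k beads, assumed identical for all three screens
SORT_RATE_HZ = 250.0        # ~250 Hz sorting speed

for loading in (0.3, 5, 20):
    count = variants_screened(BEADS_PER_SCREEN, loading)
    print(f"loading {loading}: ~{count:,} variants screened")

print(f"sort time per screen: ~{sort_time_minutes(BEADS_PER_SCREEN, SORT_RATE_HZ):.0f} min")

# Single-variant comparison: 10 million variants at an assumed loading of 0.1 per drop.
drops_needed = round(10_000_000 / 0.1)
days = sort_time_minutes(drops_needed, SORT_RATE_HZ) / 60.0 / 24.0
print(f"single-variant equivalent: ~{days:.1f} days of sorting")
```

Under these assumptions the three loadings reproduce the stated throughputs of 150k, 2.5 million, and 10 million variants, each sort takes roughly half an hour, and the single-variant equivalent takes several days.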

To characterize the sorted variants, all 14 hits were expressed in E. coli and purified using Ni-NTA spin columns. After elution and desalting, the molecular weight of each sample was analyzed using SDS-PAGE and 10 variants of the correct size were observed. FIGS. 13A-13C show characterization of thermotolerant variants. FIG. 13A shows an SDS-PAGE gel analysis of purified proteins of selected variants. Lane M is the marker. Lane 1 is the variant selected from the first screen with an initial DNA loading number of 0.3 per gel. Lanes 2-5 are four variants selected from the second screen with an initial DNA loading number of 5 per gel. Lanes 6-14 are nine variants selected from the third screen with an initial DNA loading number of 20 per gel. None of the variants from the second screen was successfully expressed, whereas the remaining variants all produced purified proteins at the expected size. FIG. 13B shows a heatmap of the frequency of distinct amino acid substitutions at the 15 targeted positions in the selected variants, with color intensities indicating the count of the amino acids in all thermotolerant variants. FIG. 13C shows normalized activity of the eight most thermotolerant variants after incubation at temperatures from 25 °C to 95 °C. All activity values were normalized by the average of the wildtype after incubation at 25 °C, mean ± 0.95 confidence interval. The variant from the smallest scale screening is shown by the dashed line. (Sample size n = 3)

The remaining 4 variants (not included in the 10 variants of the correct size) were not detected on the gel, as shown in FIG. 13A. Interestingly, these 4 variants were all selected from the mid-sized screen, which started with 5 variants per drop. Thus, the mid-sized screen did not provide any useful results for enzymes with increased thermotolerance. For the 10 purified variants, Sanger sequencing was used to confirm that each one carried the previously identified 10-12 amino acid substitutions at the targeted positions, as shown in Table 3.

Table 3. Mutations in targeted positions introduced in discovered purified variants.

Amino acid substitutions from the wildtype are in bold.

The substitutions at each of the targeted positions are summarized in FIG. 13B. Substitutions of at least three other amino acids were observed in 12 out of the 15 targeted residue positions. In 3 positions, R33, D34 and G104, almost no substitutions were observed; even though individual mutations at these positions lead to improvements of thermotolerance over the wildtype, they did not contribute when mutations at multiple positions were screened. Interestingly, 8 out of 10 variants had the M134P substitution, as shown in FIG. 13B. This may have an impact on preservation of the function of the enzyme after exposure to very high temperatures. These results illustrate how the improved capacity of the multiplexed method disclosed herein can help overcome instances where a large fraction or even all the selected variants cannot be purified. Moreover, the method led to significant improvements in the activity of thermotolerant variants because much larger combinatorial libraries were screened.

To characterize the selected variants, their activity was first measured with Calcein AM and compared to that of the wildtype. Two of the variants exhibited very low activity. Of the remaining 8, all but one exhibited a lower activity than that of the wildtype; the 20G6 variant, selected from the largest screen, exhibited slightly increased activity.

To determine the level of improvement in the thermotolerance of these 8 variants, their T50, the temperature at which half of their activity is lost, was measured. Each variant and the wildtype were incubated at temperatures ranging from 25 to 95 °C for 20 minutes and cooled to room temperature, and their activity was measured to determine the T50 values.

FIG. 14 shows reaction rate as a function of heat-incubation temperature from 25 °C to 95 °C for wildtype BsLipA and 8 hits. The T50 values were determined by linear interpolation of the reaction rate-incubation temperature curves. The wildtype enzyme exhibited a rapid decrease in activity when incubated at temperatures above 45 °C, and a T50 of 50.4 °C was observed, shown by the black line in FIG. 13C. In contrast, all purified variants exhibited a much slower decay of activity with incubation temperature, demonstrating a retention of more than 10% activity, even at 85 °C, as shown in FIG. 13C. All but 2 of the selected variants exhibited T50 values exceeding the wildtype by more than 35 °C, as shown in FIG. 14. The best performing variant, 20G6, exhibited significantly improved activity over that of the wildtype at all temperatures. All the variants with significantly improved thermotolerance were selected from the largest screen, demonstrating the importance of screening a much larger library size. These results demonstrate the advantages of the multiplex sorting technique.
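The linear-interpolation step used to obtain a T50 value can be sketched as follows. This is a minimal illustration rather than the analysis code actually used: the function name is hypothetical, the reference activity is assumed to be the rate measured at the lowest incubation temperature unless another reference is supplied, and the data in the usage example are invented.

```python
# Minimal sketch of T50 estimation by linear interpolation. Names, the choice
# of reference activity, and the example data are assumptions for illustration.

def t50_by_linear_interpolation(temps_c, rates, reference_rate=None):
    """Return the temperature at which the rate first drops to half of the
    reference rate, interpolating linearly between neighboring data points.

    temps_c must be sorted in increasing order. Returns None if the rate never
    falls to half of the reference within the measured range.
    """
    if reference_rate is None:
        reference_rate = rates[0]  # assumption: lowest temperature = full activity
    half = 0.5 * reference_rate
    points = list(zip(temps_c, rates))
    for (t_lo, r_lo), (t_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= half > r_hi:  # the half-activity level is crossed in this interval
            fraction = (r_lo - half) / (r_lo - r_hi)
            return t_lo + fraction * (t_hi - t_lo)
    return None


if __name__ == "__main__":
    temps = [25, 35, 45, 55, 65]        # hypothetical incubation temperatures, deg C
    rates = [1.0, 1.0, 0.9, 0.1, 0.0]   # hypothetical normalized reaction rates
    print(t50_by_linear_interpolation(temps, rates))
```

A variant whose activity never falls below half of the reference in the tested range returns None here, which would correspond to a T50 above the highest incubation temperature.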

FIG. 15 shows normalized activity of identified variants as a function of incubation time at 65 °C. Blue dots indicate measured activity values normalized to the average activity of each variant without heat inactivation. Red curves indicate exponential fit of the function between normalized activity and incubation time. The functional forms of the best fit as well as R²s of the fits are shown in the box at the bottom left corner of each plot.
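One simple way to obtain an exponential fit and an R² value of the kind described for FIG. 15 is a least-squares line through the logarithm of the normalized activity. The sketch below assumes a single-exponential form, activity = A·exp(−k·t); the fitting procedure and functional form actually used for the figure are not stated, so this is illustrative only, and the R² it reports is computed in log space.

```python
import math

# Illustrative log-linear fit of a single-exponential decay. The model form and
# the fitting method are assumptions; they are not taken from the disclosure.

def fit_exponential_decay(times, activities):
    """Fit ln(activity) = ln(A) - k * time by ordinary least squares.

    Returns (A, k, r_squared), where r_squared refers to the fit in log space.
    Non-positive activity values are skipped because their logarithm is undefined.
    """
    pairs = [(t, math.log(a)) for t, a in zip(times, activities) if a > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two positive activity values")
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pairs)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in pairs)
    ss_tot = sum((y - mean_y) ** 2 for _, y in pairs)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return math.exp(intercept), -slope, r_squared


if __name__ == "__main__":
    minutes = [0, 10, 20, 30]                           # hypothetical time points
    activity = [math.exp(-0.05 * t) for t in minutes]   # synthetic, noise-free data
    amplitude, rate_constant, r2 = fit_exponential_decay(minutes, activity)
    print(amplitude, rate_constant, r2, math.log(2) / rate_constant)
```

The last printed value is the half-life at the incubation temperature, ln(2)/k, which is a convenient single number for comparing variants.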

A maximum of 20 DNA molecules per drop were used in the largest screen. However, this number could be increased if an even larger library of variants was necessary to identify improved variants.

Using this platform, a combinatorial saturation library of 15 targeted positions was screened and multiple BsLipA variants with significantly improved thermotolerance were discovered. Each of these variants had 10 or more amino acid substitutions which were simultaneously introduced with only one round of mutagenesis. The screening workflow took less than 1 hour of total sorting time, drastically accelerating the enzyme evolution process. By comparison, similar improvements of thermotolerance have previously only been achieved through multiple rounds of mutagenesis and screening.

Materials and Methods

Agarose modification

Primer- and BG-modified agarose were synthesized using click chemistry. 1 g of agarose (Ultra-low Gelling Temperature, Sigma A5030) and 100 mg of glycidyl propargyl ether (TCI, G0445) were dissolved in 100 mL of 0.1 M NaOH. After stirring at 35 °C for 12 hours, 300 mL of isopropanol was added to the solution, and the mixture was stirred at room temperature for 30 minutes. The white precipitate was filtered out, washed repeatedly with isopropanol, and dried under vacuum.

Azide- and PEG-modified DNA oligo (5’AzideN-iSp18-GGTTGCGTTTGAGACAGGCGAC (SEQ ID NO: 3)) was purchased from IDT. BG-N3 was prepared as previously reported.

To click-modify the agarose with DNA oligo, N3-primer was added to a solution of alkyne-functionalized agarose (1 wt%), CuSO4 (Mallinckrodt AR4844, 1 mM), TCEP (Ambeed A144762, 1 mM) and THPTA (TCI, T3171, 0.2 mM) in PBS. The solution was stirred at 45 °C for 12 hours. After cooling down to 4 °C, the gelled agarose was collected, vortexed to break it into pieces, and washed 3 times with 7 volumes of 1X IDTE (to remove unbound primers and other unwanted byproducts and reagents). Agarose with BG functionalization was obtained according to a similar protocol, except that the N3-primer was replaced by a 0.1 M stock solution of BG-N3 in DMSO.

Quantification of oligo modification

To quantify the oligo modification on agarose after the click reaction, cold, shredded agarose gel was centrifuged at 5000 g for 10 min, the supernatant was removed, and the gel was melted at 70 °C for 5 min. Oligo concentration was quantified using a Qubit ssDNA quantification kit (Thermo Fisher Q10212) and a Qubit Fluorometer (Thermo Fisher Q33226).

Microfluidic generation of agarose beads

Molten agarose solution was prepared by mixing the molten solutions of oligo-modified, BG-modified and unmodified agarose at desired concentrations. The molten agarose was heated to 90 °C for 5 minutes, filtered through 5 µm PES syringe filters (Pall, 4650), and immediately transferred to a syringe. Microfluidic droplets were generated using an air-triggered device (FIGS. 16A-16B) running at 2200 µL/h for the agarose phase and 5000 µL/h for the oil phase of HFE-7500 engineering fluid (3M Novec) with 2% fluorosurfactant (RAN Biotechnologies), with an air pressure of 1 bar. Drops were collected into a 15 mL Falcon tube on ice with mineral oil layered on top. The setup was warmed by a space heater to prevent agarose gelation during droplet generation. After droplet generation, the hydrogel beads were washed three times by adding 10 mL of 1x IDTE buffer (pH 8, IDT) and 3,3,4,4,5,5,6,6,7,7,8,8,8-tridecafluorooctan-1-ol (PFO, Ambeed 77239-430). After washing, the beads were centrifuged down, and a small sample was melted for the measurement of oligo concentration using the Qubit ssDNA quantification assay.

FIGS. 16A-16B show microfluidic devices. FIG. 16A shows a device for air-triggered agarose bead generation. FIG. 16B shows a concentric sorting chip for particle-templated emulsions. Labels 1-4 are the agarose inlet, oil inlet, droplet outlet and pressurized air inlet of the air-triggered agarose droplet generation device. Labels 5-8 are the droplet inlet, spacer oil inlet, additional oil inlet, and bias oil inlet of the sorting device. Labels 9 and 10 are the sort channel and waste channel outlets. Label 11 is the pressurized air inlet for expelling sorted drops. Label 12 is the inlet of the saltwater electrode. Labels 13 and 14 are the inlet and outlet of the saltwater moat.

Quantification of BG tagging capacity for the agarose beads

When different concentrations of input BG are used for the modification of agarose, the final available concentration of BG on the agarose hydrogel for protein tagging will vary. To quantify the available amount of BG for tagging once agarose beads were formed, GFP-SNAP was used as a control protein. To enable IVTT expression, a GFP-SNAP construct, flanked on the 5’ side by T7 promoter and ribosomal binding site, and on the 3’ side by T7 terminator, was ordered to be subcloned into the pUCIDT plasmid (Integrated DNA Technologies).

GFP (GenBank: QSX72528.1) corresponds to the sequence below:

TALTEGAKLFEKEIPYITELEGDVEGMKFIIKGEGTGDATTGTIKAKYICTT GDLPVPWATLVSTLSYGVQCFAKYPSHIKDFFKSAMPEGYTQERTISFEG DGVYKTRAMVTYERGSIYNRVTLTGENFKKDGHILRKNVAFQCPPSILYI LPDTVNNGIRVEFNQAYDIEGVTEKLVTKCSQMNRPLAGSAAVHIPRYH HITYHTKLSKDRDERRDHMCLVEVVKAVDLDTYQ

(SEQ ID NO: 31).

SNAP (PDB: 3KZY_A) corresponds to the sequence below:

MGLGSMDKDCEMKRTTLDSPLGKLELSGCEQGLHEIIFLGKGTSAADAV EVPAPAAVLGGPEPLMQATAWLNAYFHQPEAIEEFPVPALHHPVFQQESF TRQVLWKLLKVVKFGEVISYSHLAALAGNPAATAAVKTALSGNPVPILIP CHRVVQGDLDVGGYEGGLAVKEWLLAHEGHRLGKPGLG (SEQ ID NO: 32). The full plasmid sequences are shown in Table 2.

To measure the amount of GFP-SNAP that can be tagged to the agarose beads, a standard curve was prepared for the relation between GFP concentration and fluorescence intensity measured at a fixed optical setting of a Leica SP5 confocal microscope using a 10x air objective with NA = 0.3 and optical slice thickness = 992 nm, as shown in FIGS. 17A-17B. FIGS. 17A-17B show the calibration of the relation between confocal-measured GFP intensity and GFP concentration. FIG. 17A shows an SDS-PAGE gel for purified GFP. FIG. 17B shows the standard curve for confocal-measured GFP fluorescence vs. GFP concentration. This standard curve and optical setting were used for the measurement of all GFP concentrations. To generate the standard curve, GFP with a 6xHis purification tag was expressed using BL21(DE3) E. coli cells (NEB C2527H) and purified with Ni-NTA purification columns (Thermo Fisher 88224), and its concentration was measured using a BCA assay (Thermo Fisher 23235). The purity and molecular weight of the expressed GFP were confirmed using SDS-PAGE gel electrophoresis (FIGS. 17A-17B). To ensure consistency during imaging, imaging chambers were prepared using the same glass coverslips (Corning, 2975245) and silicone isolators (Thermo Fisher, S24737).

To measure the amount of GFP-SNAP that can be tagged to hydrogel beads modified with different amounts of BG, BG-modified agarose beads were suspended in 2 µM GFP-SNAP solution and incubated at room temperature with shaking for 8 hours. The beads were washed 3 times and imaged with the preset confocal setting to read out the concentration of tagged GFP. The mean fluorescence intensity of individual beads was measured using ImageJ. For each group, 20 beads were measured. Reaction saturation at 8 hours was confirmed by imaging the beads after 24 hours of tagging under the same conditions; no significant change in fluorescence intensity was observed.

FIG. 18 shows the concentration of GFP tagged onto agarose hydrogels made with 50 µM input BG after 3 hours, 8 hours and 24 hours of tagging. Each point is the average concentration of a hydrogel bead measured from confocal imaging. The tagging reaction was complete within the first 3 hours. Further incubation did not increase the tagged GFP concentration.

Droplet PCR using agarose hydrogel beads

To perform droplet PCR using the bi-functional agarose hydrogel beads, 50 µL bead suspension, 50 µL Q5 2x HotStart master mix, 0.5 µL of 100 µM reverse primer (5’-CCACGAGTCGCAGCACAGC-3’ (SEQ ID NO: 4)) and the desired amount of template DNA were mixed and vortexed to ensure mixing. The bead suspension was centrifuged down briefly at 1000 g for 10 s. Then ~40 µL supernatant was removed and the bead bed was vortexed. To emulsify the gel beads, 200 µL droplet generation oil (Bio-Rad, 1864005) was added to the bead bed and vortexed to generate bead-templated emulsions. The emulsified drops were transferred into PCR tubes and layered with mineral oil. The following PCR program was used: 98 °C 2 min; 98 °C 30 s, 60 °C 2 min, 72 °C 2 min, 40x; 72 °C 3 min; 4 °C inf. After PCR, drops in all PCR tubes were collected in a 1.5 mL Eppendorf tube loaded with 1 mL 1x IDTE buffer and 50 µL PFO. The mixture was vortexed and centrifuged for 30 s at 5000 g. After centrifugation, a clear gel bed was visible at the bottom of the top aqueous layer. The aqueous phase was transferred to a new Eppendorf tube, washed 3 more times with IDTE buffer and stored at 4 °C until quantification or until use for IVTT.

To quantify the frequency of positive gels after droplet PCR with different loading numbers, droplet PCR was run in quintuplicate for each loading number (0.1, 1 and 10). For each replicate, a 10 µL sample of bead suspension was taken after PCR. The beads were stained with Qubit dsDNA dye, and confocal microscopy images in both the brightfield and the fluorescence channel were taken to determine the frequency of bright drops. In each replicate, the total gel number ranged from 87 to 428, depending on the local density of the bead suspension sample.
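For orientation, the fraction of positive gels expected at each loading number can be estimated if template encapsulation is assumed to follow Poisson statistics. That assumption is common for random loading of drops but is not stated in the text, and the function names below are hypothetical.

```python
import math

# Sketch under an assumed Poisson loading model; not the analysis used in the study.

def expected_positive_fraction(mean_templates_per_bead: float) -> float:
    """Probability that a bead-templated drop contains at least one template."""
    return 1.0 - math.exp(-mean_templates_per_bead)


def single_template_fraction_among_positives(mean_templates_per_bead: float) -> float:
    """Probability that a positive drop contains exactly one template."""
    lam = mean_templates_per_bead
    return lam * math.exp(-lam) / (1.0 - math.exp(-lam))


if __name__ == "__main__":
    for loading in (0.1, 1, 10):
        positive = expected_positive_fraction(loading)
        single = single_template_fraction_among_positives(loading)
        print(f"loading {loading}: positive {positive:.4f}, single-template share {single:.4f}")
```

Under this model a loading number of 0.1 gives roughly 9.5% positive gels, almost all of which carry a single template, which is consistent with the use of a loading number of 0.1 in the second sorting round to ensure purity.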

To quantify the amount of amplified DNA on beads, a sample was taken from the washed and centrifuged bead gel pellet and melted at 70 °C for 1 min. The concentration of dsDNA was measured with a Qubit Fluorometer using 1x HS dsDNA dye (Thermo Fisher Q33231). When the loading number of DNA in droplets was > 1, the measured dsDNA concentration was routinely above 20 ng/µL.

Droplet IVTT using agarose hydrogel beads

For each 50 µL of beads with amplified template DNA, an IVTT mix was prepared using the NEBExpress cell-free protein expression kit (NEB, E5360S) by mixing 50 µL protein synthesis buffer, 24 µL of cell extract, 2 µL of RNase inhibitor and 2 µL of T7 RNA polymerase. The bead suspension was vortexed briefly to mix. The gel beads were briefly spun down at 300 g for 30 s. 50 µL supernatant was removed and 200 µL HFE-7500 engineering fluid (3M Novec) with 2% 008-fluorosurfactant (RAN Biotechnologies) was added and vortexed to emulsify. The drops were left at room temperature for protein expression for various durations based on the purpose of the experiment. During incubation for protein expression and tagging, mineral oil was layered on top of the drops to prevent evaporation. After protein synthesis and tagging, the drops were demulsified with PFO and washed at least three times with 1x DPBS buffer (Corning).

Comparing droplet assay using cell-host and using hydrogel bead expression system

The variation of the droplet assay was compared between an E. coli expression system and a hydrogel bead-templated expression system. For the E. coli expression system, BL21(DE3) was used to express wildtype BsLipA cloned into a pET-28a vector. After overnight cytosolic expression, single E. coli cells were co-encapsulated in drops with Bugbuster cell lysis reagent (Sigma 70584) and the substrate Calcein AM (20 µM). A final OD of 0.08 was used in drops, which corresponded to a loading number of ~0.1 per ~25 µm droplet used in the experiment. Drops were incubated off chip for 1 hour after encapsulation and then imaged using confocal microscopy or analyzed using microfluidic cytometry.

For the hydrogel bead expression system, DNA amplification was performed using droplet digital PCR with a loading number of 0.1 for the template BsLipA DNA. Droplet PCR and droplet IVTT conditions were used as described in the previous sections. After washing the beads from IVTT, 10 µM sulforhodamine B was added to the bead suspension to enable microfluidic drop detection, and the beads were emulsified by vortexing in HFE-7500 engineering fluid (3M Novec) with 2% 008-fluorosurfactant (RAN Biotechnologies). Calcein AM was prepared in ethyl acetate as a 2 mM stock solution, diluted 100-fold in HFE-7500 engineering fluid (3M Novec) with 2% 008-fluorosurfactant (RAN Biotechnologies), and this oil was mixed with the bead emulsion at a 1:1 volume ratio. The bead-templated emulsions were incubated at room temperature for 1 hour before confocal imaging and microfluidic cytometry analysis. To quantify the confocal fluorescence micrographs, the mean fluorescence intensity of individual beads was measured using ImageJ. For each group, 20 beads were measured.

For microfluidic droplet analysis, drops were reinjected into a sorting chip using the following flow rates and compositions: droplet phase - 80-120 µL/h; spacer oil phase - 2% 008-fluorosurfactant (RAN Biotechnologies) in HFE-7500, 300 µL/h; additional oil phase - HFE-7500, 2800 µL/h; bias oil phase - HFE-7500, 1300 µL/h. A custom-built two-laser optical setup was used to simultaneously excite the drops at 488 nm and 532 nm for Calcein and sulforhodamine B detection, which were detected on PMT1 and PMT2, respectively. A customized LabVIEW code was run on an FPGA board (PCIe-7842R, National Instruments) for data acquisition and in situ analysis. Bead-templated emulsion drops were detected as fluorescence peaks on the PMT time trace with durations above 0.5 ms and PMT2 signal above background level. Satellite emulsions formed during particle-templated emulsification have durations and fluorescence levels below these thresholds and were thus not detected as drops. The maximum PMT1 signal during each peak was recorded as the fluorescence intensity of each droplet.

Screening of reference library

To demonstrate the selectivity and hit recovery capability of the agarose hydrogel bead screening workflow, a reference library was prepared by mixing 1% DNA of a thermotolerant variant into wildtype BsLipA DNA. The reference library was encapsulated at an average DNA loading number of 0.1 into agarose bead-templated emulsions with PCR reagents and DNA was amplified as described in the droplet PCR section. After droplet PCR, the agarose beads were washed and used for enzyme expression and tagging as described in the droplet IVTT section. After droplet IVTT, the agarose beads were washed in 10 mM phosphate buffer (pH 7, diluted 50x from Thermo Fisher, J63791.AP) and emulsified with QX200 Droplet Generation Oil for EvaGreen (Bio-Rad) using bead-templated emulsification. The drops were incubated at 70 °C for 1 hour to provide a selection pressure for thermotolerance. After heating, the drops were cooled to 4 °C, washed and emulsified again for droplet sorting. Sulforhodamine B droplet label as well as Calcein AM substrate were added, and droplet fluorescence intensity values were detected as described in the previous section. To enable droplet sorting, a sorting threshold was set in the PMT1-detected intensity value between the two droplet populations. For each droplet with PMT1 intensity above the sorting threshold, 25 cycles of 30 kHz AC at 6300 V were applied with a delay time of ~2000 µs. The electric field was delivered to the sorting junction using 1 M NaCl saltwater electrodes. Each droplet event with PMT1 intensity above the sorting threshold was programmed to trigger fast camera recording for the validation of droplet sorting.
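The drop detection rule described in the droplet analysis section, together with the threshold-based sort trigger above, can be sketched in a few lines. The actual implementation was a customized LabVIEW program running on an FPGA; the Python below is only an illustration of the logic, and the function names, sampling period, and example trace are assumptions.

```python
# Illustrative logic only: the real system ran LabVIEW on an FPGA board.
# Function names, the sampling period, and the example trace are hypothetical.

def detect_drops(pmt1, pmt2, sample_period_ms, pmt2_background, min_duration_ms=0.5):
    """Return the peak PMT1 value for every run of consecutive samples whose
    PMT2 signal is above background and whose duration exceeds min_duration_ms.
    Shorter runs (for example, satellite emulsions) are ignored."""
    peaks = []
    run_start = None
    # A trailing sentinel at background level closes a run that reaches the end.
    for i, level in enumerate(list(pmt2) + [pmt2_background]):
        above = level > pmt2_background
        if above and run_start is None:
            run_start = i
        elif not above and run_start is not None:
            duration_ms = (i - run_start) * sample_period_ms
            if duration_ms > min_duration_ms:
                peaks.append(max(pmt1[run_start:i]))
            run_start = None
    return peaks


def sort_decisions(peak_intensities, sorting_threshold):
    """True for each detected drop whose PMT1 peak exceeds the sorting threshold."""
    return [peak > sorting_threshold for peak in peak_intensities]


if __name__ == "__main__":
    pmt2_trace = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0]  # drop label channel
    pmt1_trace = [0, 0, 2, 5, 9, 4, 3, 1, 0, 0, 8, 8, 0, 0]  # Calcein channel
    peaks = detect_drops(pmt1_trace, pmt2_trace, sample_period_ms=0.1, pmt2_background=0.5)
    print(peaks, sort_decisions(peaks, sorting_threshold=5))
```

In this toy trace the first event lasts 0.6 ms and is kept, whereas the second lasts only 0.2 ms and is rejected in the same way a satellite emulsion would be, even though its PMT1 signal is high.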

To collect 1 to 10 drops above the sorting threshold, sorting was first turned off while droplets ran through the chip. An air-filled syringe was used to expel any drops in the sort channel. Sorting was then turned on, and turned off again once the predetermined number of droplets had been sorted. All sorted drops were expelled into an Eppendorf collection tube using the air-filled syringe. To collect 1000 random drops, sorting was turned off, the droplet detection rate was kept at 250 Hz, and a collection from the waste channel was performed for 4 seconds. The DNA was recovered as described in the next section and the recovered DNA was sent for Sanger sequencing.

Recovery of DNA from sorted bead-templated drops

To recover the DNA from sorted drops, a previously reported protocol was used. Briefly, 25 µL water was added into each collection tube, the tube was centrifuged at 14,100 g for 10 minutes, and the tubes were frozen at -80 °C for at least 2 hours. The drops were then thawed at 60 °C for 10 minutes. After thawing, the aqueous phase was carefully transferred to a PCR tube containing 25 µL of a 2x concentrated PCR master mix made with 2x Q5 master mix (NEB) and the FPrec and RPrec primers. To reduce the accumulation of nonspecific PCR products, the FPrec and RPrec primers used in the recovery step were nested with respect to the linear DNA fragment on the bead. Amplification of the recovered DNA was performed using the following PCR program: 98 °C 3 min; 98 °C 10 s, 71 °C 20 s, 72 °C 30 s, 25x; 72 °C 5 min; 4 °C inf. The PCR product was purified using spin columns (NEB T1030) and sent for Sanger sequencing, re-cloned into the pUCIDT plasmid for the next round of droplet screening, or subcloned into the pet28a expression vector for the cell lysate assay.

FIG. 19 shows a map of the 1.8 kb linear DNA for hydrogel bead surface display. estA: Bacillus subtilis Lipase A gene (UniProt P37957). T7p: T7 promoter. RBS: ribosomal binding site. T7t: T7 terminator. Primers used in library construction, hydrogel bead screening, DNA recovery and subcloning are shown as arrows. Mutagenesis primers containing NNK or MNN degenerate codons are shown as arrows with crosses. The name, sequence, position and usage of primers are summarized in Table 1.

Preparation of combinatorial mutagenesis library

The screening library used herein was prepared by combinatorial mutagenesis of a wildtype BsLipA-SNAP chimeric construct. The gene map of the library, as well as the positions of the primers, are shown schematically in FIG. 19. Primer sequences are shown in Table 1.

Table 1. Primer sequences and location. The position of the 5’ end of each primer is referenced by 1-indexing from the A in the ATG start codon of the esterase gene.

The gene of BsLipA, estA (UniProt P37957) and SNAP tag, flanked on the 5’ side by T7 promoter and ribosomal binding site, and on the 3’ side by T7 terminator, were ordered to be subcloned into the pUCIDT plasmid (Integrated DNA Technologies). The full plasmid sequence is in Table 2.

Table 2. DNA sequence of control plasmids, control genes and identified thermotolerant variants. ORF: open reading frame.

To perform site-saturation mutagenesis on all 15 residue positions of BsLipA, a two-step protocol was used. In the first step, multi-site directed mutagenesis was used to saturate the F17, N18, A20, R33 and D34 residues. The codons for the F17, N18 and A20 residues were mutated to NNK (N: A, T, G, C; K: G, T) degenerate codons on forward primer FPqcl1, and the codons for R33 and D34 were mutated to NNK degenerate codons on forward primer FPqcl2 (see Table 1 for primer sequences). To introduce these mutations, these two forward primers were used to amplify the wildtype estA-SNAP plasmid construct in a multi-site directed mutagenesis reaction (Agilent QCL multi site-directed mutagenesis kit, 210514). At the end of each amplification cycle, ligase in the mutagenesis reaction mix sealed the nicks to generate a fully sealed whole plasmid. 30 amplification cycles were performed according to the manufacturer's instructions, DpnI was used to digest wildtype plasmid strands, and the reaction mix was transformed into Turbo high efficiency competent cells (NEB C2984). 16 individual colonies were picked to confirm that mutations were introduced at the correct positions, and pooled plasmids were miniprepped for the introduction of the rest of the mutations.

To perform site-saturation mutagenesis on residues N89, G104, M134, M137, Y139, R142, G155, I157, G158 and N174, two long DNA oligos (IDT) were ordered. The first long oligo, FPlong, is in the forward direction and covers residue positions N89 and G104. The codons of these two residues in the oligo were replaced with NNK degenerate codons. The second long oligo, RPlong, is in the reverse direction and covers residue positions M134, M137, Y139, R142, G155, I157, G158 and N174. The codon of each of these residues in the oligo was replaced with an MNN (M = A, C) degenerate codon. The FPlong and RPlong primers overlap for 24 bases at their 3’ ends (sequences in Table 1). These two oligos were assembled into double-stranded DNA fragments which covered all 10 mutagenesis sites using a previously reported protocol.
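The properties of the NNK and MNN degenerate codons used above can be checked by enumeration. The sketch below builds the standard genetic code table, expands the degenerate codons, and confirms that MNN on the reverse strand is the reverse complement of NNK. The helper names are hypothetical, and the code illustrates codon degeneracy only; it is not part of the library construction protocol.

```python
from itertools import product

# Standard genetic code in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC ambiguity codes used in the text.
IUPAC = {"N": "ACGT", "K": "GT", "M": "AC"}


def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon such as 'NNK'."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in degenerate_codon))]


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


nnk_codons = expand("NNK")
encoded = {CODON_TABLE[c] for c in nnk_codons}
amino_acids_encoded = sorted(encoded - {"*"})
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
mnn_as_sense = {reverse_complement(c) for c in expand("MNN")}

if __name__ == "__main__":
    print(len(nnk_codons), "codons;", len(amino_acids_encoded), "amino acids;",
          "stop codons:", stop_codons)
    print("MNN on the reverse strand equals NNK:", mnn_as_sense == set(nnk_codons))
    print("amino acid combinations over 15 positions:", 20 ** 15)
```

NNK therefore covers all 20 amino acids with 32 codons and a single stop codon (TAG), and saturating 15 positions gives a theoretical diversity of 20^15, or about 3.3 × 10^19 amino acid combinations, far more than any screen can sample exhaustively.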

All PCR reactions were performed using Q5 2x master mix (New England Biolabs M0492) diluted to a final 1x reaction mix by adding primers, template DNA and water. Briefly, the second strand of each long oligo was synthesized by Q5 polymerase using the FPsss and RPsss primers, respectively, as shown in Table 1, with the following PCR program: 98 °C 3 min; 72 °C 10 min, 0.1 °C/min ramp down. The two double-stranded DNA fragments were purified from agarose gel to remove short fragments due to missing bases in oligo synthesis. To assemble the two double-stranded fragments, overlap extension was performed by mixing 50 ng of each fragment in a 50 µL PCR reaction. The following PCR program was used for the assembly: 98 °C 3 min; 98 °C 20 s, 72 °C 30 s, 15x; 72 °C 5 min; 12 °C inf. To further amplify the assembly product, for each 10 µL of the reaction mix, the outer amplification primers FPoe and RPoe (Table 1) were added at a final concentration of 500 nM. The following PCR program was used for the amplification of the assembled product: 98 °C 3 min; 98 °C 20 s, 64 °C 20 s, 72 °C 20 s, 10x; 72 °C 5 min; 12 °C inf.

To construct the final combinatorial library saturating all 15 positions, overlap extension PCR cloning was used, as reported previously. Briefly, the assembled fragments carrying 10 substitutions were used as megaprimers to amplify the pUCIDT plasmid templates harvested from the first QCL multi-site mutagenesis step. In a 50 µL PCR reaction, 250 ng of megaprimer and 300 ng of template DNA were mixed with 25 µL of Q5 2X master mix. The following PCR program was used: 98 °C 3 min; 98 °C 30 s, 61 °C 30 s, 72 °C 5 min, 15x; 72 °C 5 min; 4 °C inf. The product was purified using PCR cleanup columns (NEB T1030) and the nicks were sealed using T4 DNA ligase. Following ligation, the DNA was cleaned up again and DpnI restriction enzyme was used to digest the template plasmids. To acquire high quality linear DNA fragments for library screening, the FPop and RPop primers (Table 1), which flank the open reading frame of the estA-SNAP construct by ~300 bp, were used to amplify the DpnI reaction mix. The final product was analyzed on a 1x TAE 1.5% agarose gel and confirmed to be a single band at 1.8 kb.

Characterization of combinatorial mutagenesis library

To confirm that amino acid substitutions were introduced at all 15 preselected residue positions, the DpnI-digested plasmids of the combinatorial library were transformed into Turbo high efficiency competent cells to miniprep high quality plasmids. The plasmids were transformed into BL21(DE3) cells. 144 colonies from this library were picked for Sanger sequencing, as well as for the expression of the BsLipA variants in 96-well plates. The wildtype BsLipA, the thermotolerant positive control, and the empty vector were also expressed, each in triplicate, as controls for the activity measurements. After overnight expression in autoinduction medium (Thermo Fisher K6803), 100 µL of each culture was pelleted by centrifuging at 2200 g for 10 minutes. After removing the culture medium, pellets were resuspended in 45 µL 10 mM phosphate buffer and lysed by adding 5 µL Bugbuster cell lysis buffer. For each variant, 20 µL of the cell lysate was heated at 70 °C for 20 minutes while the rest of the cell lysate was kept at room temperature. A 20 µM Calcein AM substrate solution in 10 mM phosphate buffer was prepared and 10 µL was distributed to each well of a 96-well PCR plate. After heat inactivation, 10 µL of heated or unheated enzyme was added to the substrate for a final substrate concentration of 10 µM. The fluorescence intensity was monitored over time using a qPCR machine.

Screening of the combinatorial mutagenesis library

The full screening process of the combinatorial mutagenesis library comprises three screening rounds: two rounds of droplet screening using hydrogel bead display and one round of well plate-based lysate screening using E. coli expression. Between two consecutive rounds, the DNA recovered from previous rounds was used as the input of the next round.

The initial round was a droplet-based screening using the hydrogel bead expression system as described in the screening of the reference library. Briefly, the 1.8 kb linear library DNA fragments were loaded at a high per-droplet DNA loading number into drops for amplification and grafting on beads. Beads were washed and used for droplet IVTT overnight at room temperature. After expression and tagging of the enzymes, the beads were washed and emulsified again for heat inactivation of thermally intolerant variants at 75 °C for 1 hour. Following heat inactivation, beads were washed and re-emulsified with sulforhodamine for droplet labeling. Calcein AM substrate was added through the oil phase and drops were incubated overnight to accumulate sufficient hydrolysis product for sorting. In this round, ~500 k drops were screened and the 0.1% of drops displaying the highest Calcein AM hydrolysis signal were sorted. The DNA from the sorted drops was recovered as described in the “Recovery of DNA from sorted bead-templated drops” section, and re-cloned into the pUCIDT plasmid to add back flanking regions for the next round of hydrogel bead expression.
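Selecting the 0.1% of drops with the highest signal amounts to choosing an intensity cutoff from the distribution of drop intensities. A minimal sketch of that calculation is shown below; the function name and example values are hypothetical, and the text does not specify how the cutoff was determined in practice (for example, it may have been set from a preliminary run before sorting was switched on).

```python
# Hypothetical helper illustrating how a top-fraction cutoff could be derived
# from a list of recorded drop intensities. Not the authors' procedure.

def top_fraction_threshold(intensities, fraction=0.001):
    """Return the lowest intensity that still lies within the top `fraction`
    of the recorded values (at least one value is always kept)."""
    n_keep = max(1, round(len(intensities) * fraction))
    return sorted(intensities, reverse=True)[n_keep - 1]


if __name__ == "__main__":
    recorded = list(range(1, 10001))  # placeholder intensities for 10,000 drops
    cutoff = top_fraction_threshold(recorded, fraction=0.001)
    selected = [x for x in recorded if x >= cutoff]
    print(cutoff, len(selected))
```

With ~500 k drops screened, a 0.1% cutoff corresponds to roughly 500 sorted drops in the first round.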

The second round was also a droplet-based screening using the hydrogel bead expression system. After re-cloning the recovered DNA from the first round of droplet screening, FPop and RPop primers were used to produce 1.8 kb linear DNA fragments as the input of droplet PCR for the second droplet screening round. Different from the first round, a DNA loading number of 0.1 per drop was used to ensure purity of the sort. As positive drops contained only one active variant at micromolar concentration, the hydrolysis reaction was incubated only for 2 hours before sorting, instead of overnight as in the previous round. In the second sorting round, ~ 100 k drops were screened and the 0.1% drops displaying the highest Calcein AM hydrolysis signal were sorted. The DNA from the sorted drops was recovered and amplified using FPpet and RPpet primers, and subcloned into pet28a vectors for expression and screening in E. coli.

The third round was a well-plate-based screening using E. coli expression. After recovering the DNA from the second round and subcloning the variant genes into pet28a vectors, individual colonies were picked into a 96-well plate and expressed in autoinduction medium at 30 °C for 22 hours. After expression, 100 µL of each culture was pelleted by centrifuging at 2200 g for 10 minutes. After removing culture medium, pellets were resuspended in 50 µL 10 mM phosphate buffer and lysed by adding 5 µL Bugbuster cell lysis buffer. For each variant, 30 µL of the cell lysate was heated at 75 °C for 30 minutes while the rest of the cell lysate was kept at room temperature. Calcein AM substrate was prepared as a 20 µM solution in 10 mM phosphate buffer and 20 µL was distributed to each well of a 96-well plate. After heat inactivating the enzymes, 20 µL of heated or unheated enzymes were added to the substrate for a final substrate concentration of 10 µM. The fluorescence intensity was monitored over time using a Tecan Infinite 200 Pro plate reader. Colonies that exhibited reaction rates higher than 10% of the wildtype control before heat inactivation and post-heating residual reaction rates higher than 10% of the reaction rates of the same colony before heating were sent for Sanger sequencing to identify unique variants.
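The two hit-selection thresholds described above can be expressed as a simple filter. The following is a minimal sketch only; the data structure, function name, and all rate values are hypothetical and are not taken from the screen.

```python
from dataclasses import dataclass


@dataclass
class ColonyRates:
    name: str
    rate_unheated: float  # reaction rate before heating (arbitrary units)
    rate_heated: float    # reaction rate after heating at 75 °C for 30 minutes


def is_hit(colony: ColonyRates, wildtype_rate: float,
           activity_cutoff: float = 0.10, residual_cutoff: float = 0.10) -> bool:
    """Apply both criteria: activity relative to wildtype, and residual activity."""
    active_enough = colony.rate_unheated > activity_cutoff * wildtype_rate
    survives_heat = colony.rate_heated > residual_cutoff * colony.rate_unheated
    return active_enough and survives_heat


# Hypothetical plate-reader rates, for illustration only.
wildtype_rate = 100.0
colonies = [
    ColonyRates("A1", rate_unheated=80.0, rate_heated=40.0),  # passes both criteria
    ColonyRates("A2", rate_unheated=5.0, rate_heated=4.0),    # too little initial activity
    ColonyRates("A3", rate_unheated=60.0, rate_heated=3.0),   # too little residual activity
]
hits = [c.name for c in colonies if is_hit(c, wildtype_rate)]
print(hits)
```

Note that the residual-activity criterion is relative to each colony's own unheated rate, so a weakly active but heat-stable variant can still fail the first criterion.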

The entire 3-step screening workflow was performed three times, each time using a different DNA loading number in the initial droplet-based screen. Except for the initial DNA loading number, the rest of the screening process was the same in the three full screens.

Subcloning

Unless otherwise specified, all subcloning procedures were performed using NEBuilder HiFi DNA Assembly (NEB E2621S) according to manufacturer instructions. Briefly, the insert was PCR-amplified using 30 bp flanks that overlap with the vector. The vector was PCR-linearized from the cloning site and treated with DpnI to remove the templates. The insert and linearized vectors were both cleaned up using spin columns and mixed in a molar ratio of 2:1 in the NEBuilder HiFi DNA Assembly reaction mix. The reaction was incubated at 50 °C for 1 hour and the assembly product was transformed into Turbo high efficiency competent cells and plated on corresponding antibiotic selection plates. For preparation of pooled plasmids, the transformed cells were cultured in 3 mL LB medium with corresponding antibiotics. For pUCIDT plasmids, 100 µg/mL ampicillin was used for selection. For pet28a plasmids, 50 µg/mL kanamycin was used for selection.

Hit purification and T50 characterization

Individual clones of thermotolerant variants were inoculated into autoinduction medium (Thermo Fisher K6803), cultured at 30 °C for 22-24 hours and the liquid culture was harvested. The cells were pelleted by centrifugation at 5000 g for 10 mins and lysed with Bugbuster (Sigma 70584-M). The lysate was purified with NiNTA purification columns (Thermo Fisher 88224) and the protein concentration was measured using BCA assay (Thermo Fisher 23235). Purified enzymes were stored in 50% glycerol at -20 °C.

To characterize the thermotolerance of the purified hits, their activity was measured after incubation at various temperatures for 20 minutes and the curves were interpolated to determine the temperature that causes 50% activity loss. All glycerol stocks of purified hits and the wildtype BsLipA were normalized to 20 µM in 50% glycerol and then diluted to 400 nM in phosphate buffer (10 mM sodium phosphate pH 7, 50× diluted from Thermo Fisher, J63791.AP). A 1% glycerol in phosphate buffer was used as the blank control. Heat inactivation was performed at 9 temperatures: 95 °C, 85 °C, 75.7 °C, 65.6 °C, 54.3 °C, 44 °C, 35 °C and 25 °C, for 20 min. After heat inactivation, enzymes were mixed with Calcein AM solution in a 1:1 volume ratio for a final enzyme concentration of 200 nM and substrate concentration of 10 µM. Fluorescence intensities were measured using a qPCR machine (Biorad CFX96) at 3 min intervals for 300 cycles. For each incubation temperature, the enzymatic activity was measured as the slope of the linear region of the fluorescence-time curve after subtracting the fluorescence-time curve of the glycerol sample. The T50 values were taken by interpolating the activity-temperature curve at half the activity after incubation at 25 °C.
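The T50 determination can be sketched as follows. This example assumes piecewise-linear interpolation between adjacent temperature points, since the interpolation method is not specified here, and it assumes the activities are already blank-subtracted slopes. The activity values are hypothetical and chosen only to illustrate the calculation; the function name is likewise an assumption.

```python
def t50_by_linear_interpolation(temps_c, activities):
    """Temperature at which activity drops to half of its value at the lowest temperature.

    Uses linear interpolation between the two adjacent points that bracket
    the half-activity level. Returns None if the curve never crosses it.
    """
    pairs = sorted(zip(temps_c, activities))
    half = 0.5 * pairs[0][1]  # half of the activity at the lowest temperature
    for (t_lo, a_lo), (t_hi, a_hi) in zip(pairs, pairs[1:]):
        if a_lo >= half > a_hi:
            frac = (a_lo - half) / (a_lo - a_hi)
            return t_lo + frac * (t_hi - t_lo)
    return None


# Incubation temperatures listed in the text (°C).
temps_c = [25, 35, 44, 54.3, 65.6, 75.7, 85, 95]
# Hypothetical blank-subtracted activities, normalized to the 25 °C value.
activities = [1.00, 0.98, 0.95, 0.80, 0.30, 0.05, 0.00, 0.00]

t50 = t50_by_linear_interpolation(temps_c, activities)
print(f"T50 = {t50:.1f} °C")
```

With these illustrative values the half-activity level falls between 54.3 °C and 65.6 °C. A fitted sigmoid could be used instead of linear interpolation if a smoother estimate is preferred.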

Fabrication of microfluidics chip

Microfluidics chips, including an air-triggered droplet generator and a droplet sorter, were designed using AutoCAD and printed as transparency photolithography masks (CAD/Art Services, Inc.). The masters for microfluidics devices were fabricated on a silicon wafer using SU-8 photolithography as previously described. After master fabrication, polydimethylsiloxane (PDMS) (Sylgard 184) was poured onto the silicon wafers and they were baked at 65 °C overnight for PDMS curing. Subsequently, the PDMS slab was cautiously removed from the master, biopsy punches were used to create inlets and outlets, and the PDMS slab was bonded to a glass slide (Corning, 2947). To improve hydrophobicity of the microfluidics channels, Aquapel (fluoroalkylsilanes) was injected into each device. Excess Aquapel was expelled using compressed air and the device was baked at 65 °C overnight.

EXAMPLE 4

This example describes multiplex analysis of nanobodies on agarose beads. Agarose beads were used to increase library size and reduce the time required per selection round. An agarose bead can display multiple types of nanobodies (i.e., multiplexing). If each bead carries a multiplex of 10,000 nanobody variants, the effective library size increases 10,000-fold for the same FACS throughput.

To validate this concept, an active nanobody (anti-GFP) was mixed with 1,000×, 5,000×, or 25,000× excess of an inactive nanobody (anti-mCherry) and was conjugated to agarose beads. The beads were then stained with a biotin-labeled GFP antigen, followed by fluorescently labeled streptavidin. FIG. 20 shows fluorescence images of the resulting droplets. Even at a 25,000-fold multiplex level, fluorescent signal was still distinguishable under a fluorescence microscope. This demonstrates the feasibility of increasing library complexity up to 25,000-fold using a method disclosed herein.

To further demonstrate that multiplexing enhances screening efficiency, a demonstration library with a 1:3.7×10^6 ratio of active to inactive genes was constructed. Agarose beads were mixed with the DNA library and encapsulated in water-in-oil emulsion, with each droplet containing one bead and approximately 1,000 DNA molecules (1,000× multiplex). DNA was amplified via PCR and conjugated onto the beads, followed by in vitro transcription and translation (IVTT) to express nanobodies. After IVTT, beads were stained with biotin-labeled GFP and fluorescent streptavidin. Fluorescence microscopy revealed that approximately 1 in 3,000 beads were bright. A commercial FACS machine was then used to sort the fluorescent beads. Approximately 100,000 beads were sorted within 5 minutes. The sorted beads were melted, and the DNA was recovered and amplified via PCR. qPCR analysis indicated that the active-to-inactive gene ratio improved from 1:3.7×10^6 to 1:7900 after a single round of sorting, which corresponded to a 468-fold enrichment.
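The reported enrichment and the expected frequency of bright beads can be checked with a short calculation. The sketch below takes the starting ratio as 1:3.7×10^6, the value consistent with the reported 468-fold enrichment, and assumes that each of the approximately 1,000 DNA molecules on a bead is drawn independently from the library. That independence assumption and the function names are illustrative and not part of the described method.

```python
def fold_enrichment(inactive_per_active_before: float,
                    inactive_per_active_after: float) -> float:
    """Ratio of the inactive:active proportions before and after sorting."""
    return inactive_per_active_before / inactive_per_active_after


def beads_per_positive(inactive_per_active: float, copies_per_bead: int) -> float:
    """Expected number of beads screened per bead carrying at least one active gene.

    Assumes each DNA molecule on a bead is an independent draw from the library.
    """
    active_fraction = 1.0 / (inactive_per_active + 1.0)
    p_positive = 1.0 - (1.0 - active_fraction) ** copies_per_bead
    return 1.0 / p_positive


enrichment = fold_enrichment(3.7e6, 7900)
expected = beads_per_positive(3.7e6, 1000)
print(f"fold enrichment: {enrichment:.0f}")
print(f"expected beads per bright bead: {expected:.0f}")
```

Under these assumptions the model predicts about one positive bead in 3,700, which is of the same order as the approximately 1 in 3,000 bright beads observed by microscopy.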

Given the 1,000× multiplexing per bead, this approach effectively screened a library of 100 million variants in just 5 minutes, reducing the overall workflow to one day, as compared to the three days typically required for each round in yeast surface display.
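The effective library size quoted above follows directly from the number of beads sorted and the multiplex level per bead. The short sketch below reproduces that arithmetic and the implied sorting rate; the function and variable names are hypothetical.

```python
def effective_throughput(beads_sorted: int, multiplex_per_bead: int,
                         sort_minutes: float) -> dict:
    """Variants screened and per-second rates for a multiplexed bead sort."""
    variants = beads_sorted * multiplex_per_bead
    seconds = sort_minutes * 60.0
    return {
        "variants_screened": variants,
        "beads_per_second": beads_sorted / seconds,
        "variants_per_second": variants / seconds,
    }


# Values stated in this example: 100,000 beads, 1,000 variants per bead, 5 minutes.
result = effective_throughput(beads_sorted=100_000, multiplex_per_bead=1_000, sort_minutes=5)
print(f"variants screened: {result['variants_screened']:,}")
print(f"beads per second: {result['beads_per_second']:.0f}")
```

The sorter handles the same number of physical events regardless of the multiplex level, so the gain in library coverage comes entirely from the number of variants carried per bead.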

While several embodiments of the present disclosure have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the functions and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the present disclosure. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings of the present disclosure is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the disclosure described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, the disclosure may be practiced otherwise than as specifically described and claimed. The present disclosure is directed to each individual feature, system, article, material, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, and/or methods, if such features, systems, articles, materials, and/or methods are not mutually inconsistent, is included within the scope of the present disclosure. The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”

The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Other elements may optionally be present other than the elements specifically identified by the “and/or” clause, whether related or unrelated to those elements specifically identified unless clearly indicated to the contrary. Thus, as a non-limiting example, a reference to “A and/or B,” when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A without B (optionally including elements other than B); in another embodiment, to B without A (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.

As used herein in the specification and in the claims, “or” should be understood to have the same meaning as “and/or” as defined above. For example, when separating items in a list, “or” or “and/or” shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as “only one of” or “exactly one of,” or, when used in the claims, “consisting of,” will refer to the inclusion of exactly one element of a number or list of elements. In general, the term “or” as used herein shall only be interpreted as indicating exclusive alternatives (i.e. “one or the other but not both”) when preceded by terms of exclusivity, such as “either,” “one of,” “only one of,” or “exactly one of.” “Consisting essentially of,” when used in the claims, shall have its ordinary meaning as used in the field of patent law.

As used herein in the specification and in the claims, the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, “at least one of A and B” (or, equivalently, “at least one of A or B,” or, equivalently “at least one of A and/or B”) can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.

As used herein, “wt%” is an abbreviation of weight percentage. As used herein, “at%” is an abbreviation of atomic percentage.

Some embodiments may be embodied as a method, of which various examples have been described. The acts performed as part of the methods may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include different (e.g., more or less) acts than those that are described, and/or that may involve performing some acts simultaneously, even though the acts are shown as being performed sequentially in the embodiments specifically described above.

Use of ordinal terms such as “first,” “second,” “third,” etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements.

In the claims, as well as in the specification above, all transitional phrases such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “holding,” and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedures, Section 2111.03.

Claims

CLAIMS What is claimed is:

1. A solution, comprising: a fluid; a structured species suspended in the fluid; a nucleic acid chemically bound to the structured species; and an enzyme chemically bound to the structured species, wherein the enzyme is encoded by the nucleic acid.

2. A solution, comprising: a fluid; a substrate suspended in the fluid; a nucleic acid chemically bound to the substrate; and an enzyme chemically bound to the substrate, wherein the enzyme is encoded by the nucleic acid.

3. A method of making an enzyme, the method comprising: transcribing and translating a nucleic acid chemically bound to a structured species to produce an enzyme and chemically binding the enzyme to the structured species.

4. A method of making an enzyme, the method comprising: transcribing and translating a nucleic acid chemically bound to a substrate to produce an enzyme and chemically binding the enzyme to the substrate.

5. The solution of claim 2, wherein the substrate comprises a polymeric material, an organic framework, and/or a 2D material.

6. The solution of claim 5, wherein the polymeric material comprises a structured species.

7. The solution of claim 5, wherein the organic framework comprises a covalent organic framework or a metal organic framework.

8. The solution of claim 5, wherein the 2D material comprises graphene and/or graphene oxide.

9. The solution of any one of claims 1-2 and 5-8, wherein the enzyme is a naturally occurring enzyme or a mutant or derivative thereof.

10. The solution of any one of claims 2 and 5-9, wherein the nucleic acid is only bound to the enzyme via the substrate.

11. The solution of any one of claims 1, and 9, wherein a plurality of copies of the nucleic acid are chemically bound to the structured species.

12. The solution of any one of claims 2 and 5-10, wherein a plurality of copies of enzyme are covalently bound to the substrate.

13. The solution of any one of claims 1, 9 and 11, wherein a plurality of copies of enzyme are covalently bound to the structured species.

14. The method of claim 4, wherein the substrate comprises a polymeric material, an organic framework, and/or a 2D material.

15. The method of claim 14, wherein the polymeric material comprises a structured species.

16. The method of claim 14, wherein the organic framework comprises a covalent organic framework or a metal organic framework.

17. The method of claim 14, wherein the 2D material comprises graphene and/or graphene oxide.

18. The method of any one of the preceding claims, wherein the enzyme is a naturally occurring enzyme or a mutant or derivative thereof.

19. The method of any one of claims 4, and 14-18, wherein the nucleic acid is only bound to the enzyme via the substrate.

20. The method of any one of claims 3, and 18, wherein a plurality of copies of the nucleic acid are chemically bound to the structured species.

21. The method of any one of claims 4, and 14-19, wherein a plurality of copies of enzyme are covalently bound to the substrate.

22. The method of any one of claims 3, 18 and 20, wherein a plurality of copies of enzyme are covalently bound to the structured species.

23. The solution of any one of the preceding claims, wherein the solution further comprises an IVTT reagent.

24. The solution of any one of the preceding claims, wherein the solution further comprises a ribosome.

25. The solution of any one of claims 1, 2, 5-13, and 15, wherein the solution does not comprise an IVTT reagent.

26. The solution of any one of the preceding claims, wherein a first portion of the solution is located in a first container that is separated from a second portion of the solution located in a second container.

27. The solution of claim 26, wherein the first container and the second container are droplets.

28. The solution of any one of claims 26-27, wherein the first container and the second container are wells.

29. The solution of any one of claims 26-28, wherein the nucleic acid is a first nucleic acid, where the enzyme is a first enzyme, wherein the first container contains the first nucleic acid and the first enzyme, wherein the second container contains a second nucleic acid and a second enzyme, wherein the second enzyme is encoded by the second nucleic acid.

30. The solution of claim 29, wherein the first nucleic acid is different from the second nucleic acid.

31. The solution of any one of the preceding claims, wherein the solution further comprises an enzyme substrate.

32. The solution of any one of the preceding claims, wherein the solution further comprises a nucleic acid amplification reagent.

33. The method of any one of the preceding claims, wherein the transcription and translation is performed via IVTT.

34. The method of any one of the preceding claims, wherein the method further comprises synthesizing at least a portion of the nucleic acid bound to the structured species prior to the transcription and translation of the nucleic acid.

35. The method of any one of the preceding claims, wherein the method further comprises catalyzing a reaction of an enzyme substrate using the enzyme.

36. The method of any one of the preceding claims, wherein the method comprises washing the structured species to remove at least a portion of a reagent from a solution contacting the structured species.

37. The method of any one of the preceding claims, wherein the method further comprises amplifying the nucleic acid.

38. A solution, comprising: a fluid; a structured species suspended in the fluid; a nucleic acid chemically bound to the structured species; and a protein chemically bound to the structured species, wherein the protein is encoded by the nucleic acid.

39. A solution, comprising: a fluid; a substrate suspended in the fluid; a nucleic acid chemically bound to the substrate; and a protein chemically bound to the substrate, wherein the protein is encoded by the nucleic acid.

40. A method of making a protein, the method comprising: transcribing and translating a nucleic acid chemically bound to a structured species to produce a protein and chemically binding the protein to the structured species.

41. A method of making a protein, the method comprising: transcribing and translating a nucleic acid chemically bound to a substrate to produce a protein and chemically binding the protein to the substrate.

42. The solution of claim 39, wherein the substrate comprises a polymeric material, an organic framework, and/or a 2D material.

43. The solution of claim 42, wherein the polymeric material comprises a structured species.

44. The solution of claim 42, wherein the organic framework comprises a covalent organic framework or a metal organic framework.

45. The solution of claim 42, wherein the 2D material comprises graphene and/or graphene oxide.

46. The solution of any one of claims 39 and 42-45, wherein the protein is a naturally occurring protein or a mutant or derivative thereof.

47. The solution of any one of claims 39 and 42-46, wherein the nucleic acid is only bound to the protein via the substrate.

48. The solution of any one of claims 38 and 46, wherein the nucleic acid is only bound to the protein via the structured species.

49. The solution of any one of claims 43-46 and 47, wherein a plurality of copies of the nucleic acid are chemically bound to the structured species.

50. The solution of any one of claims 38, 46, and 47, wherein a plurality of copies of protein are covalently bound to the structured species.

51. The method of claim 41, wherein the substrate comprises a polymeric material, an organic framework, and/or a 2D material.

52. The method of claim 51, wherein the polymeric material comprises a structured species.

53. The method of claim 51, wherein the organic framework comprises a covalent organic framework or a metal organic framework.

54. The method of claim 51, wherein the 2D material comprises graphene and/or graphene oxide.

55. The method of any one of claims 40-41 and 51-54, wherein the protein is a naturally occurring protein or a mutant or derivative thereof.

56. The method of any one of claims 41 and 51-55, wherein the nucleic acid is only bound to the protein via the substrate.

57. The method of any one of claims 40 and 55, wherein the nucleic acid is only bound to the protein via the structured species.

58. The method of any one of claims 52-55 and 56, wherein a plurality of copies of the nucleic acid are chemically bound to the structured species.

59. The method of any one of claims 40, 55, and 56, wherein a plurality of copies of protein are covalently bound to the structured species.

60. The solution of any one of claims 38, 39 or 42-50, wherein the solution further comprises an IVTT reagent.

61. The solution of any one of claims 38, 39 or 42-60, wherein the solution further comprises a ribosome.

62. The solution of any one of claims 38, 39, 42-50, and 61, wherein the solution does not comprise an IVTT reagent.

63. The solution of any one of claims 38, 39 or 42-62, wherein a first portion of the solution is located in a first container that is separated from a second portion of the solution located in a second container.

64. The solution of claim 63, wherein the first container and the second container are droplets.

65. The solution of claim 63, wherein the first container and the second container are wells.

66. The solution of any one of claims 63-65, wherein the nucleic acid is a first nucleic acid, where the protein is a first protein, wherein the first container contains the first nucleic acid and the first protein, wherein the second container contains a second nucleic acid and a second protein, wherein the second protein is encoded by the second nucleic acid.

67. The solution of claim 66, wherein the first nucleic acid is different from the second nucleic acid.

68. The solution of any one of claims 38, 39 or 42-67, wherein the solution further comprises a nucleic acid amplification reagent.

69. The method of any one of claims 40, 41, and 51-59, wherein the transcription and translation is performed via IVTT.

70. The method of any one of claims 40, 41, 51-59, and 69, wherein the method further comprises synthesizing at least a portion of the nucleic acid bound to the structured species prior to the transcription and translation of the nucleic acid.

71. The method of any one of claims 40, 41, 51-59, and 69-70, wherein the method further comprises catalyzing a reaction of a protein substrate using the protein.

72. The method of any one of claims 40, 41, 51-59, and 69-71, wherein the method comprises washing the structured species to remove at least a portion of a reagent from a solution contacting the structured species.

73. The method of any one of claims 40, 41, 51-59, and 69-72, wherein the method further comprises amplifying the nucleic acid.

74. A solution, comprising: a fluid; a structured species suspended in the fluid; a nucleic acid chemically bound to the structured species; and at least a portion of an antibody chemically bound to the structured species, wherein the at least a portion of the antibody is encoded by the nucleic acid.

75. A solution, comprising: a fluid; a substrate suspended in the fluid; a nucleic acid chemically bound to the substrate; and at least a portion of an antibody chemically bound to the substrate, wherein the at least a portion of the antibody is encoded by the nucleic acid.

76. A method of making an antibody, the method comprising: transcribing and translating a nucleic acid chemically bound to a structured species to produce at least a portion of an antibody, and chemically binding the at least a portion of the antibody to the structured species.

77. A method of making an antibody, the method comprising: transcribing and translating a nucleic acid chemically bound to a substrate to produce at least a portion of an antibody, and chemically binding the at least a portion of the antibody to the substrate.

78. The solution of claim 75, wherein the substrate comprises a polymeric material, an organic framework, and/or a 2D material.

79. The solution of claim 78, wherein the polymeric material comprises a structured species.

80. The solution of claim 78, wherein the organic framework comprises a covalent organic framework or a metal organic framework.

81. The solution of claim 78, wherein the 2D material comprises graphene and/or graphene oxide.

82. The solution of any one of claims 74-75 and 78-81, wherein the at least a portion of an antibody is a complete antibody.

83. The solution of any one of claims 74-75 and 78-82, wherein the at least a portion of an antibody is a nanobody.

84. The solution of any one of claims 75 and 78-83, wherein the nucleic acid is only bound to the antibody via the substrate.

85. The solution of any one of claims 74 and 82-83, wherein the nucleic acid is only bound to the antibody via the structured species.

86. The solution of any one of claims 75 and 78-84, wherein a plurality of copies of the nucleic acid are chemically bound to the substrate.

87. The solution of any one of claims 74, 82-83, and 85, wherein a plurality of copies of the nucleic acid are chemically bound to the structured species.

88. The solution of any one of claims 75, 78-84, and 86, wherein a plurality of copies of the at least a portion of the antibody are covalently bound to the substrate.

89. The solution of any one of claims 74, 82-83, 85, and 87, wherein a plurality of copies of the at least a portion of the antibody are covalently bound to the structured species.

90. The method of claim 77, wherein the substrate comprises a polymeric material, an organic framework, and/or a 2D material.

91. The method of claim 90, wherein the polymeric material comprises a structured species.

92. The method of claim 90, wherein the organic framework comprises a covalent organic framework or a metal organic framework.

93. The method of claim 90, wherein the 2D material comprises graphene and/or graphene oxide.

94. The method of any one of claims 76-77 and 90-93, wherein the at least a portion of an antibody is a complete antibody.

95. The method of any one of claims 76-77 and 90-94, wherein the at least a portion of an antibody is a nanobody.

96. The method of any one of claims 77 and 90-95, wherein the nucleic acid is only bound to the antibody via the substrate.

97. The method of any one of claims 76, and 94-95, wherein the nucleic acid is only bound to the antibody via the structured species.

98. The method of any one of claims 77 and 90-96, wherein a plurality of copies of the nucleic acid are chemically bound to the substrate.

99. The method of any one of claims 76, 94-95, and 97, wherein a plurality of copies of the nucleic acid are chemically bound to the structured species.

100. The method of any one of claims 77, 90-96, and 98, wherein a plurality of copies of the at least a portion of the antibody are covalently bound to the substrate.

101. The method of any one of claims 76, 94-95, 97, and 99, wherein a plurality of copies of the at least a portion of the antibody are covalently bound to the structured species.

102. The solution of any one of claims 74-75 and 60-71, wherein the solution further comprises an IVTT reagent.

103. The solution of any one of claims 74-75 and 60-72, wherein the solution further comprises a ribosome.

104. The solution of any one of claims 74-75 and 60-73, wherein the solution does not comprise an IVTT reagent.

105. The solution of any one of claims 74-75 and 60-74, wherein a first portion of the solution is located in a first container that is separated from a second portion of the solution located in a second container.

106. The solution of claim 105, wherein the first container and the second container are droplets.

107. The solution of any one of claims 105-106, wherein the first container and the second container are wells.

108. The solution of any one of claims 105-107, wherein the nucleic acid is a first nucleic acid, where the antibody is a first antibody, wherein the first container contains the first nucleic acid and the first antibody, wherein the second container contains a second nucleic acid and at least a portion of a second antibody, wherein the at least a portion of the second antibody is encoded by the second nucleic acid.

109. The solution of claim 108, wherein the first nucleic acid is different from the second nucleic acid.

110. The solution of any one of claims 74-75 and 60-79, wherein the solution further comprises a binding target recognized by the at least a portion of the antibody.

111. The solution of any one of claims 74-75 and 60-80, wherein the solution further comprises a nucleic acid amplification reagent.

112. The method of any one of claims 76-77 and 90-101, wherein the transcription and translation is performed via IVTT.

113. The method of any one of claims 76-77, 90-101 and claim 112, wherein the method further comprises synthesizing at least a portion of the nucleic acid bound to the structured species prior to the transcription and translation of the nucleic acid.

114. The method of any one of claims 76-77, 90-101 and claims 112-113, wherein the method further comprises binding the antibody to a binding target.

115. The method of any one of claims 76-77, 90-101 and claims 112-114, wherein the method comprises washing the structured species to remove at least a portion of a reagent from a solution contacting the structured species.

116. The method of any one of claims 76-77, 90-101 and claims 112-115, wherein the method further comprises amplifying the nucleic acid.

117. A composition comprising a non-natural nucleic acid that is at least 70% identical to any one of SEQ. ID. NOS. 21-28.

118. A composition comprising a non-natural nucleic acid selected from the group of SEQ. ID. NOS. 21-28.

119. A composition comprising a protein comprising a non-natural amino acid sequence expressible via transcription and translation of a sequence that is at least 70% identical to any one of SEQ. ID. NOS. 21-28.

120. A composition comprising a protein comprising a non-natural amino acid sequence expressible via transcription and translation of any one of SEQ. ID. NOS. 21-28.

121. A method, comprising: transcribing and translating a plurality of nucleic acids within a first plurality of droplets comprising a first plurality of structured species, wherein the nucleic acids of the plurality of nucleic acids are bound to the structured species of the plurality of structured species; determining one or more activity droplets of the first plurality of droplets having an activity of a target substrate; and separating nucleic acids from the one or more activity droplets of the first plurality of droplets into a second plurality of droplets comprising a second plurality of structured species.

122. The method of claim 121, wherein at least 50% of the droplets of the first plurality of droplets each comprise at least 2 distinct nucleic acid sequences bound to a structured species.

123. The method of any one of claims 121-122, wherein the method comprises fluidizing the first plurality of structured species prior to separating nucleic acids from the one or more activity droplets of the first plurality of droplets into the second plurality of droplets, in order to break the fluidized structured species into the second plurality of droplets.

124. The method of claim 123, further comprising forming the second plurality of structured species by re-rigidifying the fluidized substrate within the second plurality of droplets.

125. The method of any one of claims 121-124, wherein the nucleic acids are more dilute in the second plurality of droplets than in the first plurality of droplets.

126. A method, comprising: transcribing and translating a plurality of nucleic acids within a first plurality of droplets comprising a first plurality of substrates, wherein the nucleic acids of the plurality of nucleic acids are bound to the substrates of the plurality of substrates; determining one or more activity droplets of the first plurality of droplets having an activity of a target substrate; and separating nucleic acids from the one or more activity droplets of the first plurality of droplets into a second plurality of droplets comprising a second plurality of substrates.

127. The method of claim 126, wherein at least 50% of the droplets the first plurality of droplets each comprise at least 2 distinct nucleic acid sequences bound to a substrate.

128. The method of any one of claims 126-127, wherein the method comprises fluidizing the first plurality of substrates prior to separating nucleic acids from the one or more activity droplets of the first plurality of droplets into the second plurality of droplets, in order to break the fluidized substrates into the second plurality of droplets.

129. The method of claim 128, further comprising forming the second plurality of substrates by re-rigidifying the fluidized substrate within the second plurality of droplets.

130. The method of any one of claims 126-129, wherein the nucleic acids are more dilute in the second plurality of droplets than in the first plurality of droplets.
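
By way of non-limiting illustration only, the loading conditions recited in claims 122, 125, 127, and 130 can be related to a mean number of nucleic acids per droplet. The sketch below assumes, as is common for droplet encapsulation but is not stated in the claims, that the number of nucleic acids per droplet is Poisson distributed and that every loaded nucleic acid has a distinct sequence (a large library). The function names and the second-round mean loading of 0.1 are hypothetical values chosen for illustration.

```python
import math


def fraction_with_at_least(k: int, mean_loading: float) -> float:
    """Fraction of droplets holding at least k nucleic acids,
    assuming Poisson-distributed loading (an assumption, not a claim term)."""
    below_k = sum(
        math.exp(-mean_loading) * mean_loading**i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - below_k


def min_mean_loading(k: int, target: float, hi: float = 50.0, tol: float = 1e-9) -> float:
    """Smallest mean loading at which at least `target` of the droplets hold
    k or more nucleic acids. Found by bisection, since the fraction increases
    monotonically with the mean loading."""
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if fraction_with_at_least(k, mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi


# First round: at least 50% of droplets hold at least 2 sequences.
lam_first = min_mean_loading(k=2, target=0.5)

# Second round: a more dilute, hypothetical mean loading.
lam_second = 0.1
occupied = fraction_with_at_least(1, lam_second)
single = lam_second * math.exp(-lam_second)  # fraction with exactly one
single_among_occupied = single / occupied

print(f"first-round mean loading needed: {lam_first:.3f}")
print(f"second-round occupied droplets holding one sequence: {single_among_occupied:.3f}")
```

Under these assumptions, a mean loading of about 1.68 nucleic acids per droplet is the threshold for the first-round condition, while at the more dilute second-round loading most occupied droplets hold a single sequence, which is one way the dilution between rounds could help resolve individual active sequences from a pooled activity droplet.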