JP2025528144A - DE NOVO Pore

Info

Publication number: JP2025528144A
Application number: JP2025507385A
Authority: JP
Inventors: エリザベス・ジェイン・ウォーレス; ラクマル・ニシャンタ・ジャヤシンゲ; リチャード・ジョージ・ハンブリー; アリステア・ジェームズ・スコット; ランガ・プラバート・マラヴィアラチゲ・ラベル; リース・コナー・グリフィス; アンバー・エリザベス・レッケンビー; プラティック・ラジ・シン; アルベルト・リエラ; ウィリアム・エフ・デグラド; リー・シュナイダー; ニコラス・ポリッツィ
Original assignee: University of California San Diego UCSD
Current assignee: University of California San Diego UCSD
Priority date: 2022-08-09
Filing date: 2023-08-09
Publication date: 2025-08-26
Also published as: KR20250048551A; WO2024033447A1; EP4569331A1; CA3262945A1; AU2023322679A1; CN119678047A

Abstract

Aspects of the present disclosure relate to protein pore complexes and their use in the detection and characterization of analytes. The present disclosure is based, in part, on nanopore complexes formed by a CsgG-like pore and one or more auxiliary proteins that form one or more channel constrictions within the nanopore complex. In some embodiments, the one or more auxiliary proteins are fusion proteins. The present disclosure further relates to methods for designing auxiliary proteins and generating nanopore complexes, and to their use in molecular sensing and analyte sequencing applications.

Description

Two key elements of polymer characterization using nanopore sensing are (1) control of polymer movement through the pore and (2) discrimination of the constituent building blocks as the polymer passes through the pore. During nanopore sensing, the narrowest part of the pore forms the constriction, which is the most discriminating part of the nanopore with respect to the current signature as a function of the passing analyte. CsgG was identified as an ungated, non-selective protein secretion channel from Escherichia coli (Goyal et al., 2014) and has been used as a nanopore for detecting and characterizing analytes. Mutations to the wild-type CsgG pore that improve the properties of the pore in this context have also been disclosed (WO2016/034591, WO2017/149316, WO2017/149317, WO2017/149318, and PCT/GB2018/051191, all of which are incorporated herein by reference in their entirety).

When the analyte is a polynucleotide, nucleotide discrimination is achieved by passing the polynucleotide through such a mutant pore. However, the current signature has been shown to be sequence-dependent, with multiple nucleotides contributing to the observed current, so the height of the channel constriction and the extent of the interaction surface with the analyte affect the relationship between the observed current and the polynucleotide sequence. Although the current range for nucleotide discrimination has been improved through mutation of the CsgG pore, a sequencing system would perform better if the current differences between nucleotides could be improved further.

The present disclosure, in some aspects, relates to protein pore complexes and their use in the detection and characterization of analytes. The disclosure is based, in part, on nanopore complexes formed by a CsgG pore and one or more auxiliary proteins that form one or more channel constrictions within the nanopore complex. In some embodiments, the one or more auxiliary proteins are fusion proteins. As further described in the Examples, it has surprisingly been discovered that auxiliary proteins that impart specific desired characteristics to a CsgG protein nanopore (e.g., tuning the pore width, extending the pore lumen, forming one or more additional constrictions, etc.) can be designed de novo using computer-based structural analysis tools. In some embodiments, the de novo designed auxiliary proteins (e.g., fusion proteins) form one or more constrictions in the lumen of the CsgG nanopore and improve discrimination of polymer units as an analyte translocates through the nanopore.

Some aspects of the present disclosure further relate to methods for designing auxiliary proteins and generating nanopore complexes, and to their use in molecular sensing and nucleic acid sequencing applications.

In some aspects, the present disclosure provides a protein nanopore complex comprising: a CsgG nanopore comprising a lumen; and a fusion polypeptide comprising a first portion comprising a CsgF protein and a second portion comprising a helix-forming auxiliary protein, wherein the fusion protein is bound to the nanopore.

In some aspects, the first portion of the fusion protein is bound to the CsgG nanopore. In some aspects, the first portion of the fusion protein is located within the lumen of the CsgG nanopore. In some embodiments, the first portion of the fusion protein extends outside the lumen of the CsgG nanopore. In some embodiments, the first portion forms a first constriction region in the lumen of the CsgG nanopore.

In some embodiments, the second portion forms a second constriction region.

In some embodiments, the CsgG nanopore further comprises a constriction region.

In some embodiments, the second portion is not bound to the CsgG nanopore. In some embodiments, the second portion comprises one or more helices (e.g., alpha helices).

In some embodiments, each of the helices (e.g., alpha helices) of the second portion comprises between 0 and 15 alpha helical turns. In some embodiments, the second portion comprises a first alpha helix comprising between 1 and 4 alpha helical turns and a second alpha helix comprising between 3 and 6 alpha helical turns. In some embodiments, the second alpha helix packs against the first alpha helix. In some embodiments, the second portion comprises between 1 and 55 amino acid residues. In some embodiments, each of the helices comprises 1 to 20 amino acid residues having a phi angle in the range of about -45° to -90° and a psi angle in the range of about 0° to -70°. In some embodiments, each of the helices comprises 1 to 30 amino acid residues having a phi angle in the range of about -45° to -90° and a psi angle in the range of about 0° to -70°.
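The turn counts and dihedral windows above can be related with a short calculation. The sketch below assumes the textbook figure of about 3.6 residues per turn for an ideal alpha helix, which is not stated in this document; the function names are hypothetical and chosen only for illustration.

```python
# Textbook value for an ideal alpha helix (an assumption, not taken from this document).
RESIDUES_PER_TURN = 3.6

# Backbone dihedral windows quoted in the text, in degrees.
PHI_RANGE = (-90.0, -45.0)
PSI_RANGE = (-70.0, 0.0)


def residues_for_turns(turns: float) -> int:
    """Approximate number of residues in a helix with the given number of turns."""
    return round(turns * RESIDUES_PER_TURN)


def is_helical(phi: float, psi: float) -> bool:
    """True if (phi, psi) falls inside both dihedral windows quoted in the text."""
    return (PHI_RANGE[0] <= phi <= PHI_RANGE[1]
            and PSI_RANGE[0] <= psi <= PSI_RANGE[1])


def count_helical(dihedrals) -> int:
    """Count residues whose (phi, psi) pair lies inside the helical windows."""
    return sum(1 for phi, psi in dihedrals if is_helical(phi, psi))


if __name__ == "__main__":
    # Illustrative dihedral pairs only; not measured values.
    example = [(-60.0, -45.0), (-120.0, 130.0), (-57.0, -47.0)]
    print(residues_for_turns(15), count_helical(example))
```

Under that assumption, 15 turns corresponds to roughly 54 residues, which is close to the upper bound on the length of the second portion given above.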

In some embodiments, the distance (e.g., vertical distance) between the first constriction region and the second constriction region ranges from about 5 Å to about 80 Å (e.g., when measured as the distance between the alpha carbons (Cα) of the amino acid residue that extends furthest into the lumen of the nanopore at the first constriction and the amino acid residue that extends furthest into the lumen of the nanopore at the second constriction). In some embodiments, the protein nanopore complex has an axial length of greater than 90 Å; optionally, the axial length ranges from about 95 Å to about 160 Å.
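As a rough illustration of the measurement described above, the following sketch computes the separation of two alpha-carbon positions, both as a straight-line distance and as a projection onto the pore axis. It assumes that the pore axis is aligned with the z axis and that "vertical distance" means the axial projection; both are assumptions made for this example. The coordinates are made up.

```python
import math


def axial_distance(ca_first, ca_second, axis=(0.0, 0.0, 1.0)):
    """Separation of two C-alpha positions projected onto the pore axis (in Å).

    The default axis (z) is an assumption made for this sketch.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    unit = tuple(a / norm for a in axis)
    delta = tuple(b - a for a, b in zip(ca_first, ca_second))
    return abs(sum(d * u for d, u in zip(delta, unit)))


def within_range(distance, low=5.0, high=80.0):
    """Check a distance against the about 5 Å to about 80 Å range quoted in the text."""
    return low <= distance <= high


if __name__ == "__main__":
    # Hypothetical C-alpha coordinates (Å) of the innermost residue at each constriction.
    first = (9.0, 0.0, 10.0)
    second = (6.0, 2.0, 40.0)
    print(axial_distance(first, second))  # projection onto the pore axis
    print(math.dist(first, second))       # straight-line C-alpha to C-alpha distance
    print(within_range(axial_distance(first, second)))
```

The two measures differ whenever the residues are offset radially, so it is worth stating which one is being reported.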

In some embodiments, the fusion protein is attached to the nanopore by a linker. In some embodiments, the linker comprises a bond, a peptide linker, or a chemical linker. In some embodiments, the linker comprises a bond formed by a sulfur(VI) fluoride exchange (SuFEx) reaction. In some embodiments, the linker comprises one or more maleimide molecules.

In some embodiments, the fusion protein is cyclized. In some embodiments, the cyclization comprises one or more side chain-to-side chain cyclization bonds. In some embodiments, at least one of the side chain-to-side chain cyclization bonds is a disulfide bond.

In some aspects, the present disclosure provides a protein nanopore complex comprising: a CsgG nanopore comprising a lumen and a first constriction region formed within the lumen of the nanopore; and a fusion protein comprising a first portion comprising a CsgF protein and a second portion comprising a helix-forming auxiliary protein, wherein the fusion protein is bound to the nanopore.

In some embodiments, the first portion of the fusion protein is bound to the CsgG nanopore. In some embodiments, the first portion of the fusion protein is located within the lumen of the CsgG nanopore.

In some embodiments, the second portion of the fusion protein is located outside the lumen of the CsgG nanopore.

In some embodiments, the first portion forms a second constriction region in the lumen of the CsgG nanopore. In some embodiments, the second portion forms a third constriction region in the lumen of the CsgG nanopore.

In some embodiments, the second portion is not bound to the CsgG nanopore.

In some embodiments, the second portion comprises one or more helices (e.g., alpha helices). In some embodiments, each of the helices (e.g., alpha helices) comprises between 0 and 15 alpha helical turns. In some embodiments, the second portion comprises between 1 and 54 amino acid residues. In some embodiments, each of the helices comprises 1 to 36 amino acid residues having a phi angle in the range of about -45° to -90° and a psi angle in the range of about 0° to -70°.

In some embodiments, the fusion protein is cyclized. In some embodiments, the cyclization comprises one or more side chain-to-side chain cyclization bonds. In some embodiments, the cyclization comprises one or more side chain-to-tail (e.g., C-terminus) cyclization bonds. In some embodiments, at least one of the cyclization bonds is a disulfide bond.

In some aspects, the present disclosure provides a protein nanopore complex comprising: a CsgG nanopore comprising a lumen and a first constriction region formed within the lumen of the nanopore; a first auxiliary protein that is bound to the CsgG nanopore and forms a second constriction region in the lumen of the nanopore; and a second auxiliary protein that is bound to the CsgG nanopore or to the first auxiliary protein and forms a third constriction region.

In some embodiments, the first auxiliary protein is located within the lumen of the CsgG nanopore. In some embodiments, the first auxiliary protein comprises a CsgF protein or a CsgF peptide.

In some embodiments, the second auxiliary protein comprises one or more helices (e.g., alpha helices). In some embodiments, each of the one or more helices (e.g., alpha helices) comprises between 0 and 15 alpha helical turns. In some embodiments, the second auxiliary protein comprises two alpha helices.

In some embodiments, one of the alpha helices comprises between 1 and 6 alpha helical turns. In some embodiments, one of the alpha helices comprises between 1 and 10 alpha helical turns. In some embodiments, one of the alpha helices comprises three alpha helical turns, and the other alpha helix comprises three or four alpha helical turns. In some embodiments, each of the helices comprises 1 to 36 amino acid residues having a phi angle in the range of about -45° to -90° and a psi angle in the range of about 0° to -70°.

In some embodiments, the second auxiliary protein comprises at least one alpha helix that packs against an alpha helix of the first auxiliary protein. In some embodiments, the second auxiliary protein comprises between 1 and 55 amino acid residues.

In some embodiments, the distance (e.g., vertical distance) between the first constriction and the second constriction ranges from about 20 Å to about 80 Å (e.g., when measured as the distance between the alpha carbons (Cα) of the amino acid residue that extends furthest into the lumen of the nanopore at the first constriction and the amino acid residue that extends furthest into the lumen of the nanopore at the second constriction). In some embodiments, the distance between the second constriction and the third constriction ranges from about 5 Å to about 80 Å. In some embodiments, the protein nanopore complex has an axial length of greater than 90 Å; optionally, the axial length ranges from about 95 Å to about 160 Å.

In some embodiments, the first auxiliary protein and the second auxiliary protein are joined by a linker. In some embodiments, the linker comprises a bond, a peptide linker, or a chemical linker. In some embodiments, the linker comprises a bond formed by a sulfur(VI) fluoride exchange (SuFEx) reaction. In some embodiments, the linker comprises one or more maleimide molecules. In some embodiments, the linker comprises one or more cyclization bonds (e.g., a first amino acid of the linker may be covalently or non-covalently bonded to a second amino acid of the linker, for example by a crosslinker).

In some embodiments, the first auxiliary protein and the second auxiliary protein comprise one or more side chain-to-side chain cyclization bonds. In some embodiments, the first auxiliary protein and the second auxiliary protein comprise one or more side chain-to-tail (e.g., C-terminus) cyclization bonds. In some embodiments, at least one of the cyclization bonds is a disulfide bond.

In some aspects, the present disclosure provides a system for characterizing a target analyte, the system comprising a protein nanopore complex described herein inserted into a membrane.

In some embodiments, the system further comprises a conductive solution in contact with the protein nanopore complex, electrodes that provide a voltage potential across the membrane, and a measurement system that measures the current through the protein nanopore complex.

In some aspects, the present disclosure provides a method for characterizing a target analyte, the method comprising: contacting a system described herein with the target analyte; applying a potential across the membrane such that the target analyte enters the lumen formed by the protein nanopore complex; and taking one or more measurements as the target analyte moves with respect to the lumen, thereby characterizing the target analyte.

In some embodiments, the target analyte comprises a target polynucleotide.

In some embodiments, taking one or more measurements comprises measuring a current through the continuous channel, the current being indicative of the presence and/or one or more characteristics of the target analyte, thereby detecting and/or characterizing the target analyte.

In some embodiments, the target analyte is a polynucleotide, nucleotides in the polynucleotide interact with the first and second (and optionally third) constriction regions within the lumen, and each of the first, second (and optionally third) constriction regions can discriminate between different nucleotides, such that the overall current through the lumen is affected by the interactions between each of the first, second, and third constriction regions and the nucleotides located in each of those regions.
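The idea that several constrictions contribute to one overall current can be pictured with a deliberately simple toy model. The sketch below is not the disclosure's method and is not based on measured data: the additive form, the per-base blockade values, the weights, and the open-pore current are all arbitrary placeholders. The offsets 0, -6 and -9 merely echo the discrimination peak positions mentioned in the figure descriptions of this document.

```python
# Toy additive model; every number here is an arbitrary placeholder, not measured data.
BLOCKADE = {"A": 1.0, "C": 0.6, "G": 1.3, "T": 0.8}  # hypothetical per-base blockade


def predicted_current(sequence, index, readers, open_pore=100.0):
    """Current when base `index` sits at the primary constriction (offset 0).

    `readers` maps an offset in nucleotide steps to a weight describing how
    strongly the constriction at that offset affects the current.
    """
    current = open_pore
    for offset, weight in readers.items():
        pos = index + offset
        if 0 <= pos < len(sequence):
            current -= weight * BLOCKADE[sequence[pos]]
    return current


if __name__ == "__main__":
    one_reader = {0: 10.0}
    three_readers = {0: 10.0, -6: 3.0, -9: 4.0}
    seq_a = "ACGTACGTACGT"
    seq_b = "AGGTACGTACGT"  # differs from seq_a only at index 1
    for readers in (one_reader, three_readers):
        print(predicted_current(seq_a, 10, readers),
              predicted_current(seq_b, 10, readers))
```

In this toy picture, two sequences that differ only at a base far from the primary constriction give the same current with one reader but different currents with three, which is one way to see how additional constrictions could add discriminating information.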

In some aspects, the present disclosure provides a method of producing a protein nanopore complex, the protein nanopore complex comprising:
(a) a CsgG nanopore comprising a lumen; and
(b) a fusion polypeptide comprising a first portion comprising a CsgF protein and a second portion comprising a helix-forming auxiliary protein, wherein the fusion protein is bound to the nanopore, and at least one domain of the fusion polypeptide is designed using a computer-generated algorithm.
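The figure descriptions in this document state that sequences for experimental validation were selected based on the lowest energy score and the highest PackStat score. A minimal sketch of such a prioritization step is given below. How the two criteria are combined is not specified in the document; the lexicographic ordering used here (energy first, PackStat as tie-breaker) is an assumption, and the design names and scores are hypothetical placeholders.

```python
from typing import List, NamedTuple


class Design(NamedTuple):
    name: str
    energy: float    # design energy score; lower is better
    packstat: float  # packing quality score; higher is better


def prioritize(designs: List[Design], top_n: int = 2) -> List[Design]:
    """Rank designs by ascending energy, breaking ties by descending PackStat.

    The combination rule is an assumption made for this sketch.
    """
    ranked = sorted(designs, key=lambda d: (d.energy, -d.packstat))
    return ranked[:top_n]


if __name__ == "__main__":
    # Hypothetical candidates with placeholder scores.
    candidates = [
        Design("design_a", -120.0, 0.61),
        Design("design_b", -135.0, 0.70),
        Design("design_c", -135.0, 0.66),
    ]
    for design in prioritize(candidates):
        print(design.name, design.energy, design.packstat)
```

Other combination rules, such as filtering on a PackStat threshold before sorting by energy, would be equally consistent with the description.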

融合タンパク質のｄｅｎｏｖｏ設計のワークフローを示す。ＣｓｇＧナノ細孔を使用する設計ワークフローを示す。野生型ＣｓｇＦ（残基１～３５、左パネル）は、オレンジ色で示される。野生型ＣｓｇＦの残基１７～３０（赤）を、発明者らが探索する標的として選択し、標的を詰まって細孔に投影して直径１０Å～３０Åの間の新しい狭窄（シアン）を作成する幾何学的に一致した設計可能なヘリックスを構築した。２つのヘリックスは、ループ状（黄色）であり、得られたバックボーンの配列設計は、Ｒｏｓｅｔｔａを介して実行された。Figure 1 shows the workflow for de novo design of fusion proteins. Figure 2 shows the design workflow using a CsgG nanopore. Wild-type CsgF (residues 1-35, left panel) is shown in orange. Residues 17-30 (red) of wild-type CsgF were selected as the target we explored, and we constructed geometrically consistent designable helices that packed the target and projected it onto the pore to create a new constriction (cyan) between 10 Å and 30 Å in diameter. Two helices were looped (yellow), and sequence design of the resulting backbone was performed via Rosetta. 融合タンパク質のｄｅｎｏｖｏ設計のワークフローを示す。対称性関連パートナーとのヘリックス－ヘリックス相互作用を示す。1 shows the workflow for de novo design of fusion proteins. 2 shows helix-helix interactions with symmetry-related partners. 融合タンパク質のｄｅｎｏｖｏ設計のワークフローを示す。ｄｅｎｏｖｏ設計された融合タンパク質によって達成される追加の狭窄を示す九量体ＣｓｇＧ－融合タンパク質複合体の上面図を示す。Figure 1 shows the workflow for de novo design of fusion proteins. Figure 2 shows a top view of the nonameric CsgG-fusion protein complex demonstrating the additional constriction achieved by the de novo designed fusion protein. Ｒｏｓｅｔｔａを使用して設計されたｄｅｎｏｖｏ融合タンパク質の配列の優先順位付けのための代表的なデータを示す。実験検証用の配列は、最低のエネルギースコア及び最高のＰａｃｋＳｔａｔスコアに基づいて選択された。Representative data for sequence prioritization of de novo fusion proteins designed using Rosetta are shown. Sequences for experimental validation were selected based on lowest energy score and highest PackStat score. ｄｅｎｏｖｏ設計された融合タンパク質の、アミノ酸配列に基づくＰＳＩＰＲＥＤタンパク質の二次構造分析を示す。融合タンパク質の二次構造予測及び野生型ＣｓｇＦの成熟配列を示す。残基は、それぞれ鎖、ヘリックス及びコイルであると予測されるか否かに応じて網掛けされる。Figure 1 shows a secondary structure analysis of the PSIPRED protein based on the amino acid sequence of the de novo designed fusion protein. 
The secondary structure predictions of the fusion protein and the mature sequence of wild-type CsgF are shown. Residues are shaded according to whether they are predicted to be strands, helices, and coils, respectively. ｄｅｎｏｖｏ設計された融合タンパク質の、アミノ酸配列に基づくＰＳＩＰＲＥＤタンパク質の二次構造分析を示す。ｄｅｎｏｖｏ設計された融合タンパク質ＯＮＴ１～ＯＮＴ１０の二次構造解析を示す。残基は、それぞれ鎖、ヘリックス及びコイルであると予測されるか否かに応じて網掛けされる。Figure 1 shows secondary structure analysis of the PSIPRED protein based on the amino acid sequence of the de novo designed fusion proteins. Figure 2 shows secondary structure analysis of the de novo designed fusion proteins ONT1 to ONT10. Residues are shaded according to whether they are predicted to be strands, helices, and coils, respectively. ｄｅｎｏｖｏ設計された融合タンパク質の、アミノ酸配列に基づくＰＳＩＰＲＥＤタンパク質の二次構造分析を示す。ｄｅｎｏｖｏ設計された融合タンパク質ＯＮＴ１１～ＯＮＴ２０の二次構造解析を示す。残基は、それぞれ鎖、ヘリックス及びコイルであると予測されるか否かに応じて網掛けされる。Figure 1 shows secondary structure analysis of the PSIPRED protein based on the amino acid sequence of the de novo designed fusion proteins. Figure 2 shows secondary structure analysis of the de novo designed fusion proteins ONT11 to ONT20. Residues are shaded according to whether they are predicted to be strands, helices, and coils, respectively. ｄｅｎｏｖｏ設計された融合タンパク質の、アミノ酸配列に基づくＰＳＩＰＲＥＤタンパク質の二次構造分析を示す。残基は、それぞれ鎖、ヘリックス及びコイルであると予測されるか否かに応じて網掛けされる。ｄｅｎｏｖｏ設計された融合タンパク質ＯＮＴ２１～ＯＮＴ２５の二次構造解析を示す。Secondary structure analysis of the PSIPRED protein based on the amino acid sequence of the de novo designed fusion proteins is shown. Residues are shaded according to whether they are predicted to be chains, helices, and coils, respectively. Secondary structure analysis of the de novo designed fusion proteins ONT21 to ONT25 is shown. ｄｅｎｏｖｏ設計された融合タンパク質の代替配列の予測された３次元構造を示す。ｄｅｎｏｖｏ設計された融合タンパク質ＯＮＴ１～ＯＮＴ１０の予測構造を示す。1 shows the predicted three-dimensional structures of alternative sequences of de novo designed fusion proteins. 2 shows the predicted structures of de novo designed fusion proteins ONT1 to ONT10. 
ｄｅｎｏｖｏ設計された融合タンパク質の代替配列の予測された３次元構造を示す。ｄｅｎｏｖｏ設計された融合タンパク質ＯＮＴ１１～ＯＮＴ２０の予測構造を示す。1 shows the predicted three-dimensional structures of alternative sequences of de novo designed fusion proteins. 2 shows the predicted structures of de novo designed fusion proteins ONT11 to ONT20. ｄｅｎｏｖｏ設計された融合タンパク質の代替配列の予測された３次元構造を示す。ｄｅｎｏｖｏ設計された融合タンパク質ＯＮＴ２１～ＯＮＴ２５の予測構造を示す。1 shows the predicted three-dimensional structures of alternative sequences of de novo designed fusion proteins. 2 shows the predicted structures of de novo designed fusion proteins ONT21 to ONT25. ＣｓｇＧのみの細孔及びＣｓｇＧ／融合タンパク質複合体の代表的なＳＤＳ－ＰＡＧＥゲル分析を示し、複合体は、マレイミド架橋剤がある場合又はそれがない場合の、ＣｓｇＦ－ｄｅｌ（Ｓ３１～Ｆ１１９）対照又はｄｅｎｏｖｏ設計された融合タンパク質のいずれかを含む。融合タンパク質を含む複合体は、これらの試料が細孔複合体であることを示すバンドシフトを示す。試料は、ゲルにロードする前に加熱されなかったことに注意されたい。Representative SDS-PAGE gel analysis of CsgG-only pores and CsgG/fusion protein complexes containing either the CsgF-del(S31-F119) control or de novo designed fusion proteins, with or without the maleimide crosslinker, is shown. Complexes containing fusion proteins show a band shift indicating that these samples are pore complexes. Note that samples were not heated before loading onto the gel. ＣｓｇＧのみの細孔及びＣｓｇＧ／融合タンパク質複合体の代表的なＳＤＳ－ＰＡＧＥゲル分析を示し、複合体は、マレイミド架橋剤がある場合又はそれがない場合の、ＣｓｇＦ－ｄｅｌ（Ｓ３１～Ｆ１１９）対照又はｄｅｎｏｖｏ設計された融合タンパク質のいずれかを含む。細孔は、ゲルにロードする前にＤＴＴの存在で沸騰させると、それらの構成モノマー構成要素に分解された。Representative SDS-PAGE gel analysis of CsgG-only pores and CsgG/fusion protein complexes containing either the CsgF-del(S31-F119) control or the de novo designed fusion protein, with or without the maleimide crosslinker, is shown. The pores were disassembled into their constituent monomeric components by boiling in the presence of DTT before loading onto the gel. 一本鎖ＤＮＡがＣｓｇＧのみの細孔を通って移動するときの代表的なイオン電流（ｐＡ）対時間（ｓ）トレースを示す。生の電流トレースは、黒い線で示され、イベント検出信号は、赤い線で示される。各細孔について、上の行は、完全なＤＮＡ電流トレースを示し、下の行は、電流トレースの第１の部分の拡大図を示す。Representative ionic current (pA) versus time (s) traces are shown as single-stranded DNA translocates through a CsgG-only pore. 
The raw current trace is shown as a black line, and the event detection signal is shown as a red line. For each pore, the top row shows the full DNA current trace, and the bottom row shows a zoomed-in view of the first portion of the current trace. マレイミド架橋剤がある場合又はそれがない場合の、一本鎖ＤＮＡがｄｅｌ（Ｓ３１～Ｆ１１９）ＣｓｇＦペプチドを含むＣｓｇＧを通って移動するときの代表的なイオン電流（ｐＡ）対時間（ｓ）トレースを示す。Representative ionic current (pA) versus time (s) traces are shown as single-stranded DNA translocates through CsgG containing the del(S31-F119) CsgF peptide, with or without a maleimide crosslinker. マレイミド架橋剤なしで、一本鎖ＤＮＡがｄｅｎｏｖｏ設計された融合タンパク質を含むＣｓｇＧを通って移動するときの代表的なイオン電流（ｐＡ）対時間（ｓ）トレースを示す。Representative ionic current (pA) versus time (s) traces are shown as single-stranded DNA translocates through a CsgG containing de novo designed fusion protein without a maleimide crosslinker. マレイミド架橋の有無にかかわらず、一本鎖ＤＮＡがｄｅｎｏｖｏ設計された融合タンパク質を含むＣｓｇＧを通って移動するときの代表的なイオン電流（ｐＡ）対時間（ｓ）トレースを示す。Representative ionic current (pA) versus time (s) traces are shown as single-stranded DNA translocates through a CsgG containing de novo designed fusion protein with and without maleimide cross-linking. マレイミド架橋剤がある場合又はそれがない場合の、一本鎖ＤＮＡがｄｅｎｏｖｏ設計された融合タンパク質を含むＣｓｇＧを通って移動するときの代表的なイオン電流（ｐＡ）対時間（ｓ）トレースを示す。融合タンパク質は、ペプチド内に内部ジスルフィド結合を形成し、すなわち融合タンパク質を環化するために、システイン残基とともにＫ３７Ｒ変異を含む。Representative ionic current (pA) versus time (s) traces are shown as single-stranded DNA translocates through a CsgG containing de novo designed fusion protein with and without a maleimide crosslinker. The fusion protein contains a K37R mutation along with a cysteine residue to form an internal disulfide bond within the peptide, i.e., cyclize the fusion protein. ＤＮＡ分子が細孔を通って移動するときの、細孔内の位置及びイオン電流レベルの全体的な変化（「差別」）に対するそれらの寄与を示す代表的なプロファイルを示す。ＣｓｇＧのみの細孔（Ｑ１５３Ｃの有無にかかわらず）は、位置０で１つの主要な差別ピークを示す。Representative profiles are shown showing the position within the pore and their contribution to the overall change in ionic current level ("discrimination") as a DNA molecule translocates through the pore. 
CsgG-only pores (with or without Q153C) show one major discrimination peak at position 0. ＤＮＡ分子が細孔を通って移動するときの、細孔内の位置及びイオン電流レベルの全体的な変化（「差別」）に対するそれらの寄与を示す代表的なプロファイルを示す。破線のボックスは、ｄｅｎｏｖｏ設計された融合タンパク質の導入によって影響を受ける領域を示す。マレイミド架橋剤がある場合又はそれがない場合のＣｓｇＧ－ＣｓｇＦ－ｄｅｌ（Ｓ３１～Ｆ１１９）細孔は、２つの差別ピークを示す。主要な差別ピークは、ＣｓｇＧのみの細孔で見られるように、位置０にあり、追加の差別ピークは、主要な狭窄の下方の４～６つのヌクレオチド（位置－４～－６）にある。この差別の追加領域は、位置０での主要な差別ピークと比較して、イオン電流に対する影響が小さい。Representative profiles are shown showing the position within the pore and their contribution to the overall change in ion current level ("discrimination") as a DNA molecule translocates through the pore. The dashed box indicates the region affected by the introduction of the de novo designed fusion protein. The CsgG-CsgF-del(S31-F119) pore, with or without the maleimide crosslinker, shows two discrimination peaks. The major discrimination peak is at position 0, as seen in the CsgG-only pore, and an additional discrimination peak is 4 to 6 nucleotides below the major constriction (positions -4 to -6). This additional region of discrimination has a smaller effect on the ion current compared to the major discrimination peak at position 0. ＤＮＡ分子が細孔を通って移動するときの、細孔内の位置及びイオン電流レベルの全体的な変化（「差別」）に対するそれらの寄与を示す代表的なプロファイルを示す。細孔内の距離は、主要な狭窄に対してヌクレオチドステップにおいて測定される。負の値は、主要な狭窄下の位置に対応し、正の値は、主要な狭窄上の位置に対応する（ＣｓｇＧ）。破線のボックスは、ｄｅｎｏｖｏ設計された融合タンパク質の導入によって影響を受ける領域を示す。ＣｓｇＧとＫ３７Ｒを含有するｄｅｎｏｖｏ設計された融合タンパク質（マレイミド架橋剤がある場合又はそれがない場合、環化あり）で構成された複合体は、３つの差別ピークを示す。主要な差別ピークは、ＣｓｇＧのみの細孔に見られるように、位置０にあり、追加のピークは、位置－６及び－９にある。位置－９でのピークは、正しい配向に折り畳まれた場合に、ｄｅｎｏｖｏ設計された融合タンパク質によって生成される予想の狭窄に対応する。Representative profiles are shown showing the position within the pore and their contribution to the overall change in ionic current level ("discrimination") as a DNA molecule translocates through the pore. Distance within the pore is measured in nucleotide steps relative to the major constriction. 
Negative values correspond to positions below the major constriction, and positive values correspond to positions above the major constriction (CsgG). The dashed box indicates the region affected by the introduction of the de novo-designed fusion protein. A complex composed of CsgG and a de novo-designed fusion protein containing K37R (with or without a maleimide crosslinker, and with cyclization) shows three discrimination peaks. The major discrimination peak is at position 0, as seen in the CsgG-only pore, with additional peaks at positions -6 and -9. The peak at position -9 corresponds to the expected constriction produced by the de novo-designed fusion protein when folded in the correct orientation. マレイミドプロピオン酸リンカーによって接続される２つのタンパク質の例を示す。An example of two proteins connected by a maleimidopropionic acid linker is shown. チオール修飾剤などの反応性修飾剤で官能化された細孔タンパク質及び補助タンパク質（例えば、融合タンパク質）の例を示す。1 provides examples of pore proteins and auxiliary proteins (eg, fusion proteins) functionalized with reactive modifiers, such as thiol modifiers. マレイミド架橋剤がある（下の２つのトレース）場合又はそれがない（上の２つのトレース）場合の、一本鎖ＤＮＡがｄｅｎｏｖｏ設計された融合タンパク質（配列番号６１）を含むＣｓｇＧを通って移動するときの代表的なイオン電流（ｐＡ）対時間（ｓ）トレースを示す。生の電流トレースは、黒い線で示され、イベント検出信号は、赤い線で示される。各細孔について、上の行は、完全なＤＮＡ電流トレースを示し、下の行は、電流トレースの第１の部分の拡大図を示す。Representative ionic current (pA) versus time (s) traces are shown as single-stranded DNA translocates through CsgG containing a de novo designed fusion protein (SEQ ID NO: 61) in the presence (bottom two traces) or absence (top two traces) of a maleimide crosslinker. The raw current traces are shown as black lines, and the event detection signal is shown as a red line. For each pore, the top row shows the full DNA current trace, and the bottom row shows a zoomed-in view of the first portion of the current trace. 
ＤＮＡ分子が細孔を通って移動するときの、細孔内の位置及びイオン電流レベルの全体的な変化（「差別」）に対するそれらの寄与を示す代表的なプロファイルを示す。細孔内の距離は、主要な狭窄に対してヌクレオチドステップにおいて測定される。負の値は、主要な狭窄下の位置に対応し、正の値は、主要な狭窄上の位置に対応する（ＣｓｇＧ）。破線のボックスは、ｄｅｎｏｖｏ設計された融合タンパク質の導入によって影響を受ける領域を示す。ＣｓｇＧと、ｄｅｎｏｖｏ設計された融合タンパク質（配列番号６１）（マレイミド架橋剤がある（下のプロファイル）場合又はそれがない（上のプロファイル）場合、両方とも環化なし）とで構成された複合体は、３つの差別ピークを示す。主要な差別ピークは、ＣｓｇＧのみの細孔に見られるように、位置０にあり、追加のピークは、位置－５及び－１１にある。位置－１１でのピークは、正しい配向に折り畳まれた場合に、ｄｅｎｏｖｏ設計された融合タンパク質によって生成される予想の狭窄に対応する。Representative profiles show positions within the pore and their contribution to the overall change in ionic current level ("discrimination") as a DNA molecule translocates through the pore. Distance within the pore is measured in nucleotide steps relative to the major constriction. Negative values correspond to positions below the major constriction, and positive values correspond to positions above the major constriction (CsgG). The dashed box indicates the region affected by the introduction of the de novo-designed fusion protein. A complex composed of CsgG and the de novo-designed fusion protein (SEQ ID NO: 61) (with (lower profile) or without (upper profile) the maleimide crosslinker, both without cyclization) shows three discrimination peaks. The major discrimination peak is at position 0, as seen in the CsgG-only pore, with additional peaks at positions -5 and -11. The peak at position -11 corresponds to the expected constriction produced by the de novo-designed fusion protein when folded in the correct orientation. 大腸菌Ｋ１２株由来の野生型ＣｓｇＧ細孔の構造及びサイズを示す（この構造のデータバンクアクセスコードは、４ＵＶ３である）。示された距離は、細孔構造を形成するアミノ酸のバックボーンからバックボーンまで測定される。ＣｓｇＧ細孔は、王冠に似た緊密に相互接続された対称的な九量体細孔である。全高は、９８Åであり、最大外径は、１２０Åである。これは、中央チャネルを定義し、（Ａ）キャップ領域、（Ｂ）狭窄領域及び（Ｃ）膜貫通ベータバレル領域である３つの部分からなる。キャップの軸長、又は高さは、３９Åである。内径は、４３Åであり、開口は、６６Åである。ベータバレルは、３６本のストランドを有し、軸方向の長さは、３９Åであり、内径は、５５Åである。細孔キャップとベータバレルとの間の遷移は、予測された脂質－水性界面のレベルで、それらの間に位置する狭窄がある急激である。狭窄は、直径が約１８．５Åであり、チャネルの軸に沿った長さが２０Åである。The structure and size of the wild-type CsgG pore from E. coli K12 are shown (the databank access code for this structure is 4UV3). Distances shown are measured from backbone to backbone of the amino acids that form the pore structure. The CsgG pore is a tightly interconnected, symmetric nonameric pore resembling a crown. The overall height is 98 Å, and the maximum outer diameter is 120 Å. It defines a central channel and consists of three parts: (A) the cap region, (B) the constriction region, and (C) the transmembrane beta-barrel region. The axial length, or height, of the cap is 39 Å. The internal diameter is 43 Å, and the opening is 66 Å. The beta-barrel has 36 strands, an axial length of 39 Å, and an internal diameter of 55 Å. The transition between the pore cap and the beta-barrel is abrupt, with a constriction located between them at the level of the predicted lipid-aqueous interface. The constriction is approximately 18.5 Å in diameter and 20 Å in length along the axis of the channel.
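The axial dimensions quoted above for the wild-type CsgG pore can be cross-checked with simple arithmetic: the cap height, the constriction length, and the beta-barrel length add up to the stated overall height. The following is a minimal Python sketch of that check. It assumes the three regions stack end to end along the channel axis without overlap, which the text does not state explicitly, and the constant and function names are illustrative only.

```python
# Axial dimensions of the wild-type CsgG pore as quoted in the text (angstroms).
# Names are illustrative and are not taken from the disclosure.
CAP_HEIGHT = 39.0
CONSTRICTION_LENGTH = 20.0
BARREL_LENGTH = 39.0
OVERALL_HEIGHT = 98.0


def summed_height(*segments: float) -> float:
    """Sum axial segment lengths, assuming they stack without overlap."""
    return sum(segments)


total = summed_height(CAP_HEIGHT, CONSTRICTION_LENGTH, BARREL_LENGTH)
# Compare the summed segments with the quoted overall height.
print(total, total == OVERALL_HEIGHT)
```

Under that stacking assumption, the segment lengths reproduce the quoted overall height of 98 Å.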

本開示の態様は、ナノ細孔ベースのシステムを使用して分析物を特性決定するための組成物及び方法に関する。本開示は、部分的に、ＣｓｇＧ細孔及びナノ細孔複合体内の１つ以上のチャネル狭窄を形成する１つ以上の補助タンパク質によって形成されるタンパク質ナノ細孔複合体に基づく。いくつかの実施形態では、１つ以上の補助タンパク質は、融合タンパク質である。実施例で更に説明するように、驚くべきことに、ＣｓｇＧナノ細孔に特定の所望の特徴（例えば、細孔幅の調節、細孔内腔の延長、１つ以上の追加の狭窄の形成など）を付与する補助タンパク質を、コンピュータベースの構造解析ツールを使用してｄｅｎｏｖｏ設計できることが発見された。いくつかの実施形態では、ｄｅｎｏｖｏ設計された補助タンパク質（例えば、融合タンパク質）は、ＣｓｇＧ細孔の内腔において１つ以上の追加の狭窄を形成し、分析物がナノ細孔を通って移動する際にポリマーユニットの差別を改善する。 Aspects of the present disclosure relate to compositions and methods for characterizing analytes using nanopore-based systems. The disclosure is based, in part, on a protein nanopore complex formed by a CsgG pore and one or more auxiliary proteins that form one or more channel constrictions within the nanopore complex. In some embodiments, the one or more auxiliary proteins are fusion proteins. As further described in the Examples, it has surprisingly been discovered that auxiliary proteins that impart specific desired characteristics to a CsgG nanopore (e.g., tuning the pore width, extending the pore lumen, forming one or more additional constrictions, etc.) can be designed de novo using computer-based structural analysis tools. In some embodiments, the de novo-designed auxiliary proteins (e.g., fusion proteins) form one or more additional constrictions in the lumen of the CsgG pore, improving discrimination of polymer units as analytes translocate through the nanopore.

補助タンパク質 Auxiliary Proteins
本開示によって記載されるタンパク質ナノ細孔複合体（タンパク質細孔複合体とも互換的に呼ばれる）は、１つ以上の補助タンパク質を含んでもよい。本明細書で使用されるように、「ペプチド」、「ポリペプチド」又は「タンパク質」という用語は、本明細書では互換的に使用され、ペプチド結合によって一緒に連結された２つ以上のアミノ酸を指す。いくつかの実施形態では、タンパク質（ポリペプチド又はペプチドとも呼ばれる）は、２～２０００個のアミノ酸を含む。いくつかの実施形態では、タンパク質は、２～１０個のアミノ酸、２～２５個のアミノ酸、２～５０個のアミノ酸、２～１００個のアミノ酸、２～５００個のアミノ酸、又は２～１０００個のアミノ酸（又はそれらの間の任意の数のアミノ酸、例えば、２、３、４、５、６、７、８、９、１０、１５、２０、２５、３０、３５、４０、４５、５０、７５、１００、２５０、５００、７５０、１０００個のアミノ酸など）を含む。いくつかの実施形態では、タンパク質は、２０００個を超えるアミノ酸を含む。いくつかの実施形態では、ペプチド、ポリペプチド、又はタンパク質は、合成由来である（例えば、天然には存在せず、例えば、任意の生物においても天然に発現されない）。いくつかの実施形態では、ペプチド、ポリペプチド、又はタンパク質は、天然に存在する（例えば、遺伝子修飾されなくてペプチド、ポリペプチド、又はタンパク質を発現する生物において天然に発現される）。いくつかの実施形態では、ペプチド、ポリペプチド、又はタンパク質は、生物によって天然に発現されてもよい。いくつかの実施形態では、ペプチド、ポリペプチド、又はタンパク質は、生物（例えば、遺伝子修飾されてペプチド、ポリペプチド、又はタンパク質を発現する生物）によって異種発現される。いくつかの実施形態では、ペプチド、ポリペプチド、又はタンパク質は、（例えば、インビトロ転写、ペプチド合成などによって）化学的に合成される。ペプチド、ポリペプチド、又はタンパク質は、１つ以上の天然アミノ酸（Ｌ－アミノ酸、Ｄ－アミノ酸など）、１つ以上の非天然アミノ酸（例えば、放射標識されたアミノ酸、非標準アミノ酸、非天然アミノ酸など）、又は１つ以上の天然アミノ酸と１つ以上の非天然アミノ酸との組み合わせを含んでもよい。 Protein nanopore complexes (also referred to interchangeably as protein pore complexes) described by the present disclosure may comprise one or more auxiliary proteins. As used herein, the terms "peptide," "polypeptide," and "protein" are used interchangeably and refer to two or more amino acids linked together by peptide bonds. In some embodiments, a protein (also referred to as a polypeptide or peptide) comprises 2 to 2000 amino acids. In some embodiments, a protein comprises 2 to 10 amino acids, 2 to 25 amino acids, 2 to 50 amino acids, 2 to 100 amino acids, 2 to 500 amino acids, or 2 to 1000 amino acids (or any number of amino acids therebetween, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 250, 500, 750, 1000 amino acids, etc.). In some embodiments, a protein comprises more than 2000 amino acids. In some embodiments, the peptide, polypeptide, or protein is synthetic (e.g., not occurring in nature, e.g., not naturally expressed in any organism). 
In some embodiments, the peptide, polypeptide, or protein is naturally occurring (e.g., naturally expressed in an organism that has not been genetically modified to express the peptide, polypeptide, or protein). In some embodiments, the peptide, polypeptide, or protein may be naturally expressed by an organism. In some embodiments, the peptide, polypeptide, or protein is heterologously expressed by an organism (e.g., an organism that has been genetically modified to express the peptide, polypeptide, or protein). In some embodiments, the peptide, polypeptide, or protein is chemically synthesized (e.g., by in vitro transcription, peptide synthesis, etc.). The peptide, polypeptide, or protein may include one or more naturally occurring amino acids (e.g., L-amino acids, D-amino acids, etc.), one or more non-naturally occurring amino acids (e.g., radiolabeled amino acids, non-standard amino acids, unnatural amino acids, etc.), or a combination of one or more naturally occurring amino acids and one or more non-naturally occurring amino acids.

いくつかの実施形態では、補助タンパク質は、融合タンパク質である。「融合タンパク質」という用語は、ペプチド結合によって結合された２つ以上の異種ポリペプチド（例えば、互いに異種であるポリペプチド）の全部又は一部を含む、天然存在、合成、半合成又は組換えの単一タンパク質分子を指す。いくつかの実施形態では、融合タンパク質は、ペプチド結合によって結合された少なくとも２、３、４、５、６、７、８、９、又は１０個の異種ポリペプチドの全部又は一部を含む。本明細書で使用されるように、「ペプチドの一部」とは、ペプチドの２つ以上のアミノ酸を指す。いくつかの実施形態では、ペプチドの一部は、ペプチドの完全なアミノ酸配列の、連続した又はギャップのある少なくとも５、１０、２０、３０、５０、若しくは１００個のアミノ酸（例えば、５、６、７、８、９、１０、１１、１２、１３、１４、１５、１６、１７、１８、１９、２０、２１、２２、２３、２４、２５、２６、２７、２８、２９、３０、３１、３２、３３、３４、３５、３６、３７、３８、３９、４０、４１、４２、４３、４４、４５、４６、４７、４８、４９、５０、５１、５２、５３、５４、５５、５６、５７、５８、５９、６０、６１、６２、６３、６４、６５、６６、６７、６８、６９、７０、７１、７２、７３、７４、７５、７６、７７、７８、７９、８０、８１、８２、８３、８４、８５、８６、８７、８８、８９、９０、９１、９２、９３、９４、９５、９６、９７、９８、９９、若しくは１００個のアミノ酸）、又はペプチドのフルアミノ酸配列を含む。融合タンパク質の一部分は、任意の適切な方法で配置されてもよい（例えば、Ｃ末端からＮ末端、Ｎ末端からＣ末端、Ｃ末端からＣ末端、Ｎ末端からＮ末端など）。いくつかの実施形態では、第１の部分のＣ末端は、第２の部分のＮ末端に結合（例えば、接続）されてもよい。融合タンパク質の一部は、直接結合されてもよく（例えば、１つの部分のアミノ酸は、その部分の末端アミノ酸間のペプチド結合を介して第２の部分のアミノ酸に直接結合されてもよい）、又は間接的に結合されてもよい（例えば、融合タンパク質の１つの部分のアミノ酸は、例えば、第１のペプチド結合によって、第２のペプチド結合によって融合タンパク質の第２の部分に結合されるリンカーに結合されてもよい）。いくつかの実施形態では、第１の補助タンパク質は、融合タンパク質の第１の部分であり、第２の補助タンパク質は、融合タンパク質の第２の部分である。リンカーを使用した融合タンパク質の一部の接続については、本明細書では、例えば「リンカー」というタイトルのセクションで更に説明される。 In some embodiments, the auxiliary protein is a fusion protein. The term "fusion protein" refers to a naturally occurring, synthetic, semi-synthetic, or recombinant single protein molecule that contains all or portions of two or more heterologous polypeptides (e.g., polypeptides that are heterologous to each other) linked by peptide bonds. In some embodiments, a fusion protein contains all or portions of at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 heterologous polypeptides linked by peptide bonds. As used herein, a "portion of a peptide" refers to two or more amino acids of a peptide. 
In some embodiments, a portion of a peptide comprises at least 5, 10, 20, 30, 50, or 100 contiguous or gapped amino acids of the complete amino acid sequence of the peptide (e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 amino acids), or the full amino acid sequence of the peptide. The portions of the fusion protein may be arranged in any suitable manner (e.g., C-terminus to N-terminus, N-terminus to C-terminus, C-terminus to C-terminus, N-terminus to N-terminus, etc.). In some embodiments, the C-terminus of a first portion may be linked (e.g., connected) to the N-terminus of a second portion. The portions of a fusion protein may be directly linked (e.g., an amino acid of one portion may be directly linked to an amino acid of a second portion via a peptide bond between the terminal amino acids of the portions) or indirectly linked (e.g., an amino acid of one portion of a fusion protein may be linked, e.g., by a first peptide bond, to a linker that is linked to the second portion of the fusion protein by a second peptide bond). In some embodiments, the first auxiliary protein is the first portion of the fusion protein and the second auxiliary protein is the second portion of the fusion protein. Linking portions of fusion proteins using linkers is further described herein, e.g., in the section entitled "Linkers."

いくつかの実施形態では、タンパク質ナノ細孔複合体は、中央の空洞又は開口（ナノ細孔の「内腔」とも呼ばれる）の周囲に配置された複数のサブユニット又はモノマー（例えば、複数のＣｓｇＧモノマー）を含む。タンパク質ナノ細孔の形成は、本明細書では、例えば「ＣｓｇＧ細孔」というタイトルのセクションで更に説明される。いくつかの実施形態では、１つ以上（例えば、１、２、３、４、５、６、７、８、９、１０、１１、１２、１３、１４、１５、又はそれ以上）の補助タンパク質は、ナノ細孔の内腔内に又はそれとともに配置されて連続チャネル（例えば、連続内腔）を形成する。いくつかの実施形態では、タンパク質ナノ細孔複合体は、９：１、９：２、９：３、９：４、９：５、９：６、９：７、９：８、９：９（例えば、１：１）、９：１０、９：１１、９：１２、９：１３、９：１４、９：１５、９：１６、９：１７、又は、９：１８（例えば、１：２）の細孔モノマー（例えば、ＣｓｇＧ細孔モノマー）対補助タンパク質の比を含む。いくつかの実施形態では、１つ以上の補助タンパク質又は１つ以上の融合タンパク質は、ナノ細孔と同じ対称性を有してもよい。例えば、ナノ細孔が中心軸の周囲に８つのモノマーを含む場合、８つの補助タンパク質（又は８つの融合タンパク質）が存在し、又はナノ細孔が中心軸の周囲に９つのモノマーを含む場合、９つの補助タンパク質（又は９つの融合タンパク質）が存在する。いくつかの実施形態では、１つ以上の補助タンパク質（又は１つ以上の融合タンパク質）は、ナノ細孔より多いか少ない、例えば、１つ多いか又は１つ少ないモノマーを含んでもよい。 In some embodiments, the protein nanopore complex comprises multiple subunits or monomers (e.g., multiple CsgG monomers) arranged around a central cavity or opening (also referred to as the "lumen" of the nanopore). Formation of the protein nanopore is further described herein, e.g., in the section entitled "CsgG Pore." In some embodiments, one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more) auxiliary proteins are disposed within, or together with, the lumen of the nanopore to form a continuous channel (e.g., a continuous lumen). In some embodiments, the protein nanopore complex comprises a ratio of pore monomer (e.g., CsgG pore monomer) to auxiliary protein of 9:1, 9:2, 9:3, 9:4, 9:5, 9:6, 9:7, 9:8, 9:9 (i.e., 1:1), 9:10, 9:11, 9:12, 9:13, 9:14, 9:15, 9:16, 9:17, or 9:18 (i.e., 1:2). In some embodiments, the one or more auxiliary proteins or one or more fusion proteins may have the same symmetry as the nanopore. For example, if the nanopore comprises eight monomers around a central axis, there will be eight auxiliary proteins (or eight fusion proteins), or if the nanopore comprises nine monomers around a central axis, there will be nine auxiliary proteins (or nine fusion proteins). 
In some embodiments, the one or more auxiliary proteins (or one or more fusion proteins) may comprise more or fewer monomers than the nanopore, e.g., one more or one fewer.
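The stoichiometries listed above are all written relative to a nine-monomer pore, and some of them reduce to simpler ratios (for example, 9:9 and 9:18). A minimal Python sketch of that reduction follows. The function name `reduce_ratio` is hypothetical and chosen here for illustration; it is not part of the disclosure.

```python
from math import gcd
from typing import Tuple


def reduce_ratio(pore_monomers: int, auxiliary_proteins: int) -> Tuple[int, int]:
    """Reduce a pore-monomer : auxiliary-protein ratio to lowest terms."""
    if pore_monomers <= 0 or auxiliary_proteins <= 0:
        raise ValueError("both counts must be positive integers")
    divisor = gcd(pore_monomers, auxiliary_proteins)
    return pore_monomers // divisor, auxiliary_proteins // divisor


# Print the reduced form of a few of the ratios listed in the text.
for n_aux in (3, 9, 10, 18):
    print(f"9:{n_aux} ->", reduce_ratio(9, n_aux))
```

Ratios such as 9:10, where the two counts share no common factor, are returned unchanged.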

ナノ細孔又はタンパク質ナノ細孔複合体の内腔は、１つ以上の狭窄を有してもよい。本明細書で交換可能に使用される「狭窄」、「開口部」、「狭窄領域」、「チャネル狭窄」、又は「狭窄部位」は、細孔又はタンパク質細孔複合体の内腔表面によって画定される開口を指し、これは、イオン及び標的分子（例えば、ポリヌクレオチド若しくは個々のヌクレオチドに限定されない）の通過を可能にするように作用するが、細孔又はタンパク質細孔複合体チャネルを通る他の非標的分子は通過させない。狭窄（複数可）は、典型的には、細孔若しくはタンパク質細孔複合体内、又は細孔若しくは細孔複合体によって画定されるチャネル内の最も狭い開口（複数可）である。狭窄（複数可）は、細孔を通る分子の通過を限定するのに役立つ可能性がある。狭窄のサイズは、典型的には、分析物の特性決定用の細孔又は細孔複合体の適合性を決定する重要な因子である。狭窄が小さすぎる場合、特性決定される分子が通過することができなくなる。しかしながら、チャネルを通るイオン流に対して最大の効果を達成するには、各狭窄が大きすぎないようにする必要がある。例えば、各狭窄は、標的分析物の溶媒にアクセス可能な横方向の直径よりも広くないようにする必要がある。理想的には、各狭窄は、通過する分析物の横方向の直径にできるだけ近い必要がある。 The lumen of a nanopore or protein nanopore complex may have one or more constrictions. As used interchangeably herein, the terms "constriction," "opening," "constricted region," "channel constriction," or "constriction site" refer to an opening defined by the luminal surface of a pore or protein pore complex that acts to allow the passage of ions and target molecules (e.g., but not limited to, polynucleotides or individual nucleotides) but not other non-target molecules through the pore or protein pore complex channel. The constriction(s) are typically the narrowest opening(s) within the pore or protein pore complex or within the channel defined by the pore or pore complex. The constriction(s) may serve to limit the passage of molecules through the pore. The size of the constriction is typically a key factor determining the suitability of the pore or pore complex for analyte characterization. If the constriction is too small, the molecule to be characterized will not be able to pass. However, to achieve the greatest effect on ion flow through the channel, each constriction should not be too large. For example, each constriction should be no wider than the solvent-accessible lateral diameter of the target analyte. Ideally, each constriction should be as close as possible to the lateral diameter of the analyte passing through it.

本開示によって記載されるタンパク質細孔複合体内の狭窄の数は、変動してもよい。いくつかの実施形態では、タンパク質細孔複合体は、少なくとも１、２、３、４、５、又はそれ以上の狭窄を含む。いくつかの実施形態では、タンパク質細孔複合体は、２つ又は３つの狭窄を含む。いくつかの実施形態では、タンパク質細孔複合体は、２つの狭窄を含む。いくつかの実施形態では、第１の狭窄は、第１の補助タンパク質によって形成され、第２の狭窄は、第２の補助タンパク質によって形成される。いくつかの実施形態では、第１の狭窄は、ＣｓｇＧナノ細孔の一部によって形成され、第２の狭窄は、補助タンパク質又は融合タンパク質によって形成される。いくつかの実施形態では、タンパク質細孔複合体は、３つの狭窄を含む。いくつかの実施形態では、第１の狭窄は、ＣｓｇＧナノ細孔の一部によって形成され、第２の狭窄は、第１の補助タンパク質によって形成され、第３の狭窄は、第２の補助タンパク質によって形成される。いくつかの実施形態では、第１の狭窄は、ＣｓｇＧナノ細孔の一部によって形成され、第２及び第３の狭窄は、融合タンパク質によって形成される。 The number of constrictions within a protein pore complex described by the present disclosure may vary. In some embodiments, the protein pore complex comprises at least one, two, three, four, five, or more constrictions. In some embodiments, the protein pore complex comprises two or three constrictions. In some embodiments, the protein pore complex comprises two constrictions. In some embodiments, the first constriction is formed by a first auxiliary protein and the second constriction is formed by a second auxiliary protein. In some embodiments, the first constriction is formed by a portion of the CsgG nanopore and the second constriction is formed by an auxiliary protein or fusion protein. In some embodiments, the protein pore complex comprises three constrictions. In some embodiments, the first constriction is formed by a portion of the CsgG nanopore, the second constriction is formed by a first auxiliary protein, and the third constriction is formed by a second auxiliary protein. In some embodiments, the first constriction is formed by a portion of the CsgG nanopore, and the second and third constrictions are formed by the fusion protein.

中央の空洞又は開口の最も狭い点は、典型的には、連続チャネル内の狭窄を形成する。いくつかの実施形態では、狭窄の直径は、ナノ細孔の内腔内に最も奥まで延びて狭窄を形成するアミノ酸残基のアルファ炭素（Ｃ_ａ）間の距離を測定することで計算される。いくつかの実施形態では、狭窄の直径は、ナノ細孔の内腔内に最も奥まで延びて狭窄を形成する原子のファンデルワールス半径間の距離を測定することで計算される。いくつかの実施形態では、狭窄（例えば、ＣｓｇＧタンパク質の一部によって形成される狭窄、補助タンパク質によって形成される狭窄、融合タンパク質によって形成される狭窄など）の最小直径は、約０．５ｎｍ～約４．０ナノメートルの範囲（例えば、ファンデルワールス半径間の距離によって測定されるように）である。いくつかの実施形態では、狭窄の最小直径は、約０．５～約３．０ナノメートル、又は約０．５～約２．０ナノメートル、好ましくは、約０．７～約１．８ナノメートル、約０．８～約１．７ナノメートル、約０．９～約１．６ナノメートル、又は約１．０～約１．５ナノメートルの範囲であり、例えば、約１．１、１．２、１．３若しくは１．４ナノメートルである。いくつかの実施形態では、狭窄の最小直径は、約１０Å～約３０Åの範囲であり、例えば、１０Å、１１Å、１２Å、１３Å、１４Å、１５Å、１６Å、１７Å、１８Å、１９Å、２０Å、２１Å、２２Å、２３Å、２４Å、２５Å、２６Å、２７Å、２８Å、２９Å、又は３０Å（例えば、Ｃ_ａ－Ｃ_ａによって測定されるように）である。いくつかの実施形態では、狭窄の最小直径は、約１０Å～約３０Åの範囲（例えば、Ｃ_ａ－Ｃ_ａによって測定されるように）である。いくつかの実施形態では、狭窄の最小直径は、約１５Å～約２５Åの範囲（例えば、Ｃ_ａ－Ｃ_ａによって測定されるように）である。 The narrowest point of the central cavity or opening typically forms a constriction in the continuous channel. In some embodiments, the diameter of the constriction is calculated by measuring the distance between the alpha carbons (Cα) of the amino acid residues that extend furthest into the lumen of the nanopore to form the constriction. In some embodiments, the diameter of the constriction is calculated by measuring the distance between the van der Waals radii of the atoms that extend furthest into the lumen of the nanopore to form the constriction. In some embodiments, the minimum diameter of a constriction (e.g., a constriction formed by a portion of a CsgG protein, a constriction formed by an auxiliary protein, a constriction formed by a fusion protein, etc.) is in the range of about 0.5 nm to about 4.0 nm (e.g., as measured by the distance between van der Waals radii). In some embodiments, the minimum diameter of the constriction is in the range of about 0.5 to about 3.0 nanometers, or about 0.5 to about 2.0 nanometers, preferably about 0.7 to about 1.8 nanometers, about 0.8 to about 1.7 nanometers, about 0.9 to about 1.6 nanometers, or about 1.0 to about 1.5 nanometers, e.g., about 1.1, 1.2, 1.3, or 1.4 nanometers. In some embodiments, the minimum diameter of the constriction is in the range of about 10 Å to about 30 Å, e.g., 10 Å, 11 Å, 12 Å, 13 Å, 14 Å, 15 Å, 16 Å, 17 Å, 18 Å, 19 Å, 20 Å, 21 Å, 22 Å, 23 Å, 24 Å, 25 Å, 26 Å, 27 Å, 28 Å, 29 Å, or 30 Å (e.g., as measured by Cα-Cα). In some embodiments, the minimum diameter of the constriction ranges from about 10 Å to about 30 Å (e.g., as measured by Cα-Cα). In some embodiments, the minimum diameter of the constriction ranges from about 15 Å to about 25 Å (e.g., as measured by Cα-Cα).
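A Cα-based constriction diameter of the kind described above can be estimated from atomic coordinates. The following is a minimal Python sketch under stated assumptions: the coordinates are synthetic (nine hypothetical Cα atoms evenly spaced on a ring, one per monomer), and the diameter is taken as twice the mean distance of those atoms from their centroid. That convention is one possible choice made here for illustration, since the text does not prescribe a formula, and all function names are hypothetical. The unit conversion only changes units; it does not equate Cα-based values with the van der Waals-based nanometer ranges.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def synthetic_ring(n_monomers: int, radius: float, z: float = 0.0) -> List[Point]:
    """Place one hypothetical C-alpha per monomer, evenly spaced on a ring about the z axis."""
    step = 2.0 * math.pi / n_monomers
    return [(radius * math.cos(k * step), radius * math.sin(k * step), z)
            for k in range(n_monomers)]


def constriction_diameter(calphas: List[Point]) -> float:
    """Estimate the diameter as twice the mean atom-to-centroid distance (angstroms)."""
    n = len(calphas)
    centroid = tuple(sum(p[i] for p in calphas) / n for i in range(3))
    return 2.0 * sum(math.dist(p, centroid) for p in calphas) / n


def angstrom_to_nm(value: float) -> float:
    """Convert angstroms to nanometers (1 nm = 10 angstroms)."""
    return value / 10.0


# Synthetic nonameric ring with a 9.25 angstrom radius (illustrative values only).
ring = synthetic_ring(n_monomers=9, radius=9.25)
diameter = constriction_diameter(ring)
print(round(diameter, 2), round(angstrom_to_nm(diameter), 3))
```

With real structures, the input would be the Cα coordinates of the residue that protrudes furthest into the lumen in each monomer. Selecting that residue is outside the scope of this sketch.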

タンパク質細孔複合体の内腔内の１つ以上の狭窄の間の距離は、変動してもよい。いくつかの実施形態では、第１の狭窄領域と第２の狭窄領域との間の距離は、約５Å～約８０Åの範囲である。いくつかの実施形態では、第１の狭窄領域と第２の狭窄領域との間の距離は、長さが約５Å、６Å、７Å、８Å、９Å、１０Å、１１Å、１２Å、１３Å、１４Å、１５Å、１６Å、１７Å、１８Å、１９Å、２０Å、２１Å、２２Å、２３Å、２４Å、２５Å、２６Å、２７Å、２８Å、２９Å、３０Å、３１Å、３２Å、３３Å、３４Å、３５Å、３６Å、３７Å、３８Å、３９Å、４０Å、４１Å、４２Å、４３Å、４４Å、４５Å、４６Å、４７Å、４８Å、４９Å、５０Å、５１Å、５２Å、５３Å、５４Å、５５Å、５６Å、５７Å、５８Å、５９Å、６０Å、６１Å、６２Å、６３Å、６４Å、６５Å、６６Å、６７Å、６８Å、６９Å、７０Å、７１Å、７２Å、７３Å、７４Å、７５Å、７６Å、７７Å、７８Å、７９Å、又は８０Åである。いくつかの実施形態では、第１の狭窄領域と第２の狭窄領域との間の距離は、長さが８０Åを超える（例えば、９０Å、１００Åなどである）。 The distance between constrictions within the lumen of the protein pore complex may vary. In some embodiments, the distance between the first constriction region and the second constriction region ranges from about 5 Å to about 80 Å. In some embodiments, the distance between the first constriction region and the second constriction region is about 5 Å, 6 Å, 7 Å, 8 Å, 9 Å, 10 Å, 11 Å, 12 Å, 13 Å, 14 Å, 15 Å, 16 Å, 17 Å, 18 Å, 19 Å, 20 Å, 21 Å, 22 Å, 23 Å, 24 Å, 25 Å, 26 Å, 27 Å, 28 Å, 29 Å, 30 Å, 31 Å, 32 Å, 33 Å, 34 Å, 35 Å, 36 Å, 37 Å, 38 Å, 39 Å, 40 Å, 41 Å, 42 Å, 43 Å, 44 Å, 45 Å, 46 Å, 47 Å, 48 Å, 49 Å, 50 Å, 51 Å, 52 Å, 53 Å, 54 Å, 55 Å, 56 Å, 57 Å, 58 Å, 59 Å, 60 Å, 61 Å, 62 Å, 63 Å, 64 Å, 65 Å, 66 Å, 67 Å, 68 Å, 69 Å, 70 Å, 71 Å, 72 Å, 73 Å, 74 Å, 75 Å, 76 Å, 77 Å, 78 Å, 79 Å, or 80 Å in length. In some embodiments, the distance between the first and second constriction regions is greater than 80 Å in length (e.g., 90 Å, 100 Å, etc.).

いくつかの実施形態では、第２の狭窄領域と第３の狭窄領域との間の距離は、約５Å～約８０Åの範囲である。いくつかの実施形態では、第１の狭窄領域と第２の狭窄領域との間の距離は、長さが約５Å、６Å、７Å、８Å、９Å、１０Å、１１Å、１２Å、１３Å、１４Å、１５Å、１６Å、１７Å、１８Å、１９Å、２０Å、２１Å、２２Å、２３Å、２４Å、２５Å、２６Å、２７Å、２８Å、２９Å、３０Å、３１Å、３２Å、３３Å、３４Å、３５Å、３６Å、３７Å、３８Å、３９Å、４０Å、４１Å、４２Å、４３Å、４４Å、４５Å、４６Å、４７Å、４８Å、４９Å、５０Å、５１Å、５２Å、５３Å、５４Å、５５Å、５６Å、５７Å、５８Å、５９Å、６０Å、６１Å、６２Å、６３Å、６４Å、６５Å、６６Å、６７Å、６８Å、６９Å、７０Å、７１Å、７２Å、７３Å、７４Å、７５Å、７６Å、７７Å、７８Å、７９Å、又は８０Åである。いくつかの実施形態では、第２の狭窄領域と第３の狭窄領域との間の距離は、長さが８０Åを超える（例えば、９０Å、１００Åなど）。 In some embodiments, the distance between the second and third constriction regions ranges from about 5 Å to about 80 Å. In some embodiments, the distance between the first and second constriction regions is about 5 Å, 6 Å, 7 Å, 8 Å, 9 Å, 10 Å, 11 Å, 12 Å, 13 Å, 14 Å, 15 Å, 16 Å, 17 Å, 18 Å, 19 Å, 20 Å, 21 Å, 22 Å, 23 Å, 24 Å, 25 Å, 26 Å, 27 Å, 28 Å, 29 Å, 30 Å, 31 Å, 32 Å, 33 Å, 34 Å, 35 Å, 36 Å, 37 Å, 38 Å, 39 Å, 40 Å, 41 Å, 42 Å, 43 Å, 44 Å, 45 Å, 46 Å, 47 Å, 48 Å, 49 Å, 50 Å, 51 Å, 52 Å, 53 Å, 54 Å, 55 Å, 56 Å, 57 Å, 58 Å, 59 Å, 60 Å, 61 Å, 62 Å, 63 Å, 64 Å, 65 Å, 66 Å, 67 Å, 68 Å, 69 Å, 70 Å, 71 Å, 72 Å, 73 Å, 74 Å, 75 Å, 76 Å, 77 Å, 78 Å, 79 Å, or 80 Å in length. In some embodiments, the distance between the second and third constriction regions is greater than 80 Å in length (e.g., 90 Å, 100 Å, etc.).

いくつかの実施形態では、第１の狭窄領域と第３の狭窄領域との間の距離は、約１０Å～約１６０Åの範囲である。いくつかの実施形態では、第１の狭窄領域と第２の狭窄領域との間の距離は、長さが約１０Å、１１Å、１２Å、１３Å、１４Å、１５Å、１６Å、１７Å、１８Å、１９Å、２０Å、２１Å、２２Å、２３Å、２４Å、２５Å、２６Å、２７Å、２８Å、２９Å、３０Å、３１Å、３２Å、３３Å、３４Å、３５Å、３６Å、３７Å、３８Å、３９Å、４０Å、４１Å、４２Å、４３Å、４４Å、４５Å、４６Å、４７Å、４８Å、４９Å、５０Å、５１Å、５２Å、５３Å、５４Å、５５Å、５６Å、５７Å、５８Å、５９Å、６０Å、６１Å、６２Å、６３Å、６４Å、６５Å、６６Å、６７Å、６８Å、６９Å、７０Å、７１Å、７２Å、７３Å、７４Å、７５Å、７６Å、７７Å、７８Å、７９Å、８０Å、８１Å、８２Å、８３Å、８４Å、８５Å、８６Å、８７Å、８８Å、８９Å、９０Å、９１Å、９２Å、９３Å、９４Å、９５Å、９６Å、９７Å、９８Å、９９Å、１００Å、１０１Å、１０２Å、１０３Å、１０４Å、１０５Å、１０６Å、１０７Å、１０８Å、１０９Å、１１０Å、１１１Å、１１２Å、１１３Å、１１４Å、１１５Å、１１６Å、１１７Å、１１８Å、１１９Å、１２０Å、１２１Å、１２２Å、１２３Å、１２４Å、１２５Å、１２６Å、１２７Å、１２８Å、１２９Å、１３０Å、１３１Å、１３２Å、１３３Å、１３４Å、１３５Å、１３６Å、１３７Å、１３８Å、１３９Å、１４０Å、１４１Å、１４２Å、１４３Å、１４４Å、１４５Å、１４６Å、１４７Å、１４８Å、１４９Å、１５０Å、１５１Å、１５２Å、１５３Å、１５４Å、１５５Å、１５６Å、１５７Å、１５８Å、１５９Å、又は１６０Åである。いくつかの実施形態では、第１の狭窄領域と第３の狭窄領域との間の距離は、長さが１６０Åを超える（例えば、１９０Å、２００Åなどである）。 In some embodiments, the distance between the first and third constriction regions ranges from about 10 Å to about 160 Å. In some embodiments, the distance between the first and second constriction regions is about 10 Å, 11 Å, 12 Å, 13 Å, 14 Å, 15 Å, 16 Å, 17 Å, 18 Å, 19 Å, 20 Å, 21 Å, 22 Å, 23 Å, 24 Å, 25 Å, 26 Å, 27 Å, 28 Å, 29 Å, 30 Å, 31 Å, 32 Å, 33 Å, 34 Å, 35 Å, 36 Å, 37 Å, 38 Å, 39 Å, 40 Å, 41 Å, 42 Å, 43 Å, 44 Å, 45 Å, 46 Å, 47 Å, 48 Å, 49 Å, 50 Å, 51 Å, 52 Å, 53 Å, 54 Å, 55 Å, 56 Å, 57 Å, 58 Å, 59 Å, 60 Å, 61 Å, 62 Å, 63 Å, 64 Å, 65 Å, 66 Å, 67 Å, 68 Å, 69 Å, 70 Å, 71 Å, 72 Å, 73 Å, 74 Å, 75 Å, 76 Å, 77 Å, 78 Å, 79 Å, 80 Å, 81 Å, 82 Å, 83 Å, 84 Å, 85 Å, 86 Å, 87 Å, 88 Å, 89 Å, 90 Å, 91 Å, 92 Å, 93 Å, 94 Å, 95 Å, 96 Å, 97 Å, 98 Å, 99 Å, 100 Å, 101 Å, 102 Å, 103 Å, 104 Å, 105 Å, 106 Å, 107 Å, 108 Å, 109 Å, 110 Å, 111 Å, 112 Å, 113 Å, 114 Å, 115 Å, 116 Å, 117 Å, 118 Å, 119 Å, 120 Å, 121 Å, 122 Å, 123 Å, 124 Å, 125 Å, 126 Å, 127 Å, 128 Å, 129 Å, 130 Å, 131 Å, 132 Å, 133 Å, 134 Å, 135 Å, 136 Å, 137 Å, 138 Å, 139 Å, 140 Å, 141 Å, 142 Å, 143 Å, 144 Å, 145 Å, 146 Å, 147 Å, 148 Å, 149 Å, 150 Å, 151 Å, 152 Å, 153 Å, 154 Å, 155 Å, 156 Å, 157 Å, 158 Å, 159 Å, or 160 Å in length. In some embodiments, the distance between the first and third constriction regions is greater than 160 Å in length (e.g., 190 Å, 200 Å, etc.).
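The first-to-third range quoted above (about 10 Å to about 160 Å) equals the sum of the end points of the two adjacent ranges (about 5 Å to about 80 Å each). A minimal Python sketch of that arithmetic follows. It assumes the three constrictions lie in order along a common channel axis, so that adjacent spacings simply add; the text does not state this geometry explicitly, and the names used are illustrative only.

```python
from typing import Tuple

# Range quoted in the text for adjacent constrictions, in angstroms.
ADJACENT_RANGE: Tuple[float, float] = (5.0, 80.0)


def first_to_third(d12: float, d23: float) -> float:
    """Spacing between the first and third constrictions.

    Assumes the constrictions are collinear and ordered 1, 2, 3 along the
    channel axis, so the two adjacent spacings add.
    """
    if d12 < 0 or d23 < 0:
        raise ValueError("spacings must be non-negative")
    return d12 + d23


def in_range(value: float, bounds: Tuple[float, float]) -> bool:
    """True when value lies within the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high


# Summing the end points of the adjacent range gives the first-to-third range.
lowest = first_to_third(ADJACENT_RANGE[0], ADJACENT_RANGE[0])
highest = first_to_third(ADJACENT_RANGE[1], ADJACENT_RANGE[1])
print(lowest, highest)
```

Because the disclosure also contemplates spacings beyond these ranges, `in_range` is only a convenience check and not a limit on the design.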

いくつかの実施形態では、補助タンパク質（又は融合タンパク質）は、それらの自然状態から修飾されて所望の最小直径を有する狭窄を提供してもよい。例えば、補助タンパク質は、例えば標的変異により１つ以上の嵩高残基を導入することによって、修飾されて上記範囲内の最小直径を有する狭窄を形成してもよい。補助タンパク質の最大高さは、一実施形態では、約３ｎｍ～約２０ｎｍ、例えば約４ｎｍ～約１０ｎｍである。一実施形態では、補助タンパク質内のチャネルの長さは、約３ｎｍ～約２０ｎｍ、例えば約４ｎｍ～約１０ｎｍである。高さは、膜に垂直な方向の補助タンパク質の寸法である。 In some embodiments, auxiliary proteins (or fusion proteins) may be modified from their native state to provide a constriction with a desired minimum diameter. For example, auxiliary proteins may be modified to form a constriction with a minimum diameter within the above ranges, e.g., by introducing one or more bulky residues by targeted mutation. The maximum height of the auxiliary protein, in one embodiment, is from about 3 nm to about 20 nm, e.g., from about 4 nm to about 10 nm. In one embodiment, the length of the channel within the auxiliary protein is from about 3 nm to about 20 nm, e.g., from about 4 nm to about 10 nm. The height is the dimension of the auxiliary protein perpendicular to the membrane.

いくつかの実施形態では、補助タンパク質（例えば、第１の補助タンパク質又は第２の補助タンパク質）又は融合タンパク質（例えば、融合タンパク質の第１の部分又は融合タンパク質の第２の部分）は、タンパク質細孔複合体の内腔の外部に延びる。補助タンパク質又は融合タンパク質は、（例えば、タンパク質細孔複合体が膜に挿入される場合に）タンパク質細孔複合体の内腔のシス側又はトランス側の外部に延びてもよい。いくつかの実施形態では、補助タンパク質又は融合タンパク質がタンパク質細孔複合体の内腔の外部に延びる距離は、内腔の外部に最も遠く延びる補助タンパク質又は融合タンパク質のアミノ酸残基と、タンパク質細孔（例えば、ＣｓｇＧ細孔）の基準アミノ酸、例えば、野生型ＣｓｇＧモノマーのアミノ酸残基Ｐｈｅ１４４又はＴｙｒ１９６とのＣ_ａの距離を測定することで計算される。いくつかの実施形態では、補助タンパク質又は融合タンパク質は、内腔の外部に約０Å～約５０Å延びる。いくつかの実施形態では、補助タンパク質又は融合タンパク質は、内腔の外部に約５Å～約３０Å延びる。いくつかの実施形態では、補助タンパク質又は融合タンパク質は、内腔の外部に約１０Å～約２５Å延びる。いくつかの実施形態では、補助タンパク質又は融合タンパク質は、内腔の外部に約１Å、２Å、３Å、４Å、５Å、６Å、７Å、８Å、９Å、１０Å、１１Å、１２Å、１３Å、１４Å、１５Å、１６Å、１７Å、１８Å、１９Å、２０Å、２１Å、２２Å、２３Å、２４Å、２５Å、２６Å、２７Å、２８Å、２９Å、３０Å、３１Å、３２Å、３３Å、３４Å、３５Å、３６Å、３７Å、３８Å、３９Å、４０Å、４１Å、４２Å、４３Å、４４Å、４５Å、４６Å、４７Å、４８Å、４９Å、又は約５０Å延びる。 In some embodiments, an auxiliary protein (e.g., a first auxiliary protein or a second auxiliary protein) or a fusion protein (e.g., a first portion of a fusion protein or a second portion of a fusion protein) extends outside the lumen of a protein pore complex. The auxiliary protein or fusion protein may extend outside the lumen on the cis or trans side of the protein pore complex (e.g., when the protein pore complex is inserted into a membrane). In some embodiments, the distance that an auxiliary protein or fusion protein extends outside the lumen of a protein pore complex is calculated by measuring the Cα distance between the amino acid residue of the auxiliary protein or fusion protein that extends furthest outside the lumen and a reference amino acid of the protein pore (e.g., the CsgG pore), e.g., amino acid residue Phe144 or Tyr196 of the wild-type CsgG monomer. In some embodiments, the auxiliary protein or fusion protein extends outside the lumen by about 0 Å to about 50 Å. In some embodiments, the auxiliary protein or fusion protein extends outside the lumen by about 5 Å to about 30 Å. In some embodiments, the auxiliary protein or fusion protein extends outside the lumen by about 10 Å to about 25 Å. 
In some embodiments, the auxiliary protein or fusion protein extends outside the lumen by about 1 Å, 2 Å, 3 Å, 4 Å, 5 Å, 6 Å, 7 Å, 8 Å, 9 Å, 10 Å, 11 Å, 12 Å, 13 Å, 14 Å, 15 Å, 16 Å, 17 Å, 18 Å, 19 Å, 20 Å, 21 Å, 22 Å, 23 Å, 24 Å, 25 Å, 26 Å, 27 Å, 28 Å, 29 Å, 30 Å, 31 Å, 32 Å, 33 Å, 34 Å, 35 Å, 36 Å, 37 Å, 38 Å, 39 Å, 40 Å, 41 Å, 42 Å, 43 Å, 44 Å, 45 Å, 46 Å, 47 Å, 48 Å, 49 Å, or about 50 Å.
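The extension measurement described above, a Cα distance between the furthest-extending auxiliary residue and a reference residue of the pore, can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the coordinates and residue labels are hypothetical, the channel axis is taken as z, and "furthest outside the lumen" is interpreted as the lowest z coordinate (that is, extension on the side placed at negative z).

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]


def extension_beyond_lumen(aux_calphas: Dict[str, Point],
                           reference: Point) -> Tuple[str, float]:
    """Return the auxiliary residue lowest along z and its C-alpha distance to the reference."""
    furthest = min(aux_calphas, key=lambda name: aux_calphas[name][2])
    return furthest, math.dist(aux_calphas[furthest], reference)


# Hypothetical C-alpha coordinates in angstroms; not taken from any structure.
reference_calpha: Point = (0.0, 0.0, 0.0)  # stands in for a reference such as Phe144
aux_residues: Dict[str, Point] = {
    "res_A": (3.0, 4.0, -12.0),
    "res_B": (0.0, 0.0, -5.0),
    "res_C": (6.0, 8.0, 2.0),
}

residue, distance = extension_beyond_lumen(aux_residues, reference_calpha)
print(residue, round(distance, 2))
```

For extension on the opposite side, the same sketch would select the residue with the highest z coordinate instead.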

タンパク質細孔複合体の第１の狭窄と第２の狭窄の間の長さは、典型的には、タンパク質細孔複合体の軸長に影響を与える。いくつかの実施形態では、タンパク質細孔複合体の軸長は、タンパク質細孔複合体の内腔の上部とタンパク質細孔複合体の内腔の底部との間の距離を指す。いくつかの実施形態では、タンパク質細孔複合体の軸長は、９０Åを超える。いくつかの実施形態では、タンパク質細孔複合体（例えば、１つ以上の補助タンパク質又は１つ以上の融合タンパク質を含むタンパク質細孔複合体）の軸長は、約９５Å～約１６０Åの範囲であり、例えば、９５Å、９６Å、９７Å、９８Å、９９Å、１００Å、１０１Å、１０２Å、１０３Å、１０４Å、１０５Å、１０６Å、１０７Å、１０８Å、１０９Å、１１０Å、１１１Å、１１２Å、１１３Å、１１４Å、１１５Å、１１６Å、１１７Å、１１８Å、１１９Å、１２０Å、１２１Å、１２２Å、１２３Å、１２４Å、１２５Å、１２６Å、１２７Å、１２８Å、１２９Å、１３０Å、１３１Å、１３２Å、１３３Å、１３４Å、１３５Å、１３６Å、１３７Å、１３８Å、１３９Å、１４０Å、１４１Å、１４２Å、１４３Å、１４４Å、１４５Å、１４６Å、１４７Å、１４８Å、１４９Å、１５０Å、１５１Å、１５２Å、１５３Å、１５４Å、１５５Å、１５６Å、１５７Å、１５８Å、１５９Å、又は１６０Åである。 The distance between the first and second constrictions of a protein pore complex typically affects the axial length of the protein pore complex. In some embodiments, the axial length of the protein pore complex refers to the distance between the top of the lumen of the protein pore complex and the bottom of the lumen of the protein pore complex. In some embodiments, the axial length of the protein pore complex is greater than 90 Å. In some embodiments, the axial length of the protein pore complex (e.g., a protein pore complex comprising one or more auxiliary proteins or one or more fusion proteins) is in the range of about 95 Å to about 160 Å, e.g., 95 Å, 96 Å, 97 Å, 98 Å, 99 Å, 100 Å, 101 Å, 102 Å, 103 Å, 104 Å, 105 Å, 106 Å, 107 Å, 108 Å, 109 Å, 110 Å, 111 Å, 112 Å, 113 Å, 114 Å, 115 Å, 116 Å, 117 Å, 118 Å, 119 Å, 120 Å, 121 Å, 122 Å, 123 Å, 124 Å, 125 Å, 126 Å, 127 Å, 128 Å, 129 Å, 130 Å, 131 Å, 132 Å, 133 Å, 134 Å, 135 Å, 136 Å, 137 Å, 138 Å, 139 Å, 140 Å, 141 Å, 142 Å, 143 Å, 144 Å, 145 Å, 146 Å, 147 Å, 148 Å, 149 Å, 150 Å, 151 Å, 152 Å, 153 Å, 154 Å, 155 Å, 156 Å, 157 Å, 158 Å, 159 Å, or 160 Å.

いくつかの実施形態では、補助タンパク質又は融合タンパク質は、アルギニン、リジン若しくはヒスチジンなどの１つ以上の正に荷電したアミノ酸、又は補助タンパク質若しくは融合タンパク質によって形成された狭窄に位置するか又はその近く（例えば、狭窄から約１、２、３、４又は５ｎｍ以内）位置するチロシン若しくはトリプトファンなどの芳香族アミノ酸を含む。いくつかの実施形態では、補助タンパク質又は融合タンパク質は、補助タンパク質又は融合タンパク質によって形成される狭窄に位置するか又はその近く（例えば、狭窄の約１、２、３、４又は５ｎｍ以内）位置する１つ以上の極性アミノ酸、陰性アミノ酸、又は疎水性アミノ酸を含む。いくつかの実施形態では、補助タンパク質又は融合タンパク質によって形成される狭窄に位置するか又はその近く（例えば、狭窄の約１、２、３、４又は５ｎｍ以内）位置する１つ以上のアミノ酸は、アスパラギン、トレオニン、セリン又はグルタミン酸である。これらのアミノ酸は、典型的には、細孔とポリヌクレオチドとの間の相互作用を促進する。 In some embodiments, the auxiliary protein or fusion protein comprises one or more positively charged amino acids, such as arginine, lysine, or histidine, or aromatic amino acids, such as tyrosine or tryptophan, located at or near the constriction formed by the auxiliary protein or fusion protein (e.g., within about 1, 2, 3, 4, or 5 nm of the constriction). In some embodiments, the auxiliary protein or fusion protein comprises one or more polar, negative, or hydrophobic amino acids located at or near the constriction formed by the auxiliary protein or fusion protein (e.g., within about 1, 2, 3, 4, or 5 nm of the constriction). In some embodiments, the one or more amino acids located at or near the constriction formed by the auxiliary protein or fusion protein (e.g., within about 1, 2, 3, 4, or 5 nm of the constriction) are asparagine, threonine, serine, or glutamic acid. These amino acids typically facilitate the interaction between the pore and the polynucleotide.

タンパク質細孔複合体の１つ以上の補助タンパク質（又は１つ以上の融合タンパク質）の位置は、変動してもよい。いくつかの実施形態では、補助タンパク質（又は融合タンパク質）は、タンパク質細孔複合体の内腔内に完全に位置する。いくつかの実施形態では、補助タンパク質又は融合タンパク質は、タンパク質細孔複合体の内腔を越えて延び、例えば、タンパク質細孔複合体の内腔の上方で延び（例えば、タンパク質細孔複合体のシス側のキャップ領域の上方で延び）、及び／又はタンパク質細孔複合体の下方で延びる（例えば、タンパク質細孔複合体のトランス側上の膜貫通ドメイン（例えば、バレル）の下方で延びる）部分を含む。いくつかの実施形態では、補助タンパク質又は融合タンパク質（又は補助タンパク質若しくは融合タンパク質の一部、例えば、第１の部分又は第２の部分）は、ナノ細孔（例えば、ＣｓｇＧナノ細孔）に結合される。いくつかの実施形態では、補助タンパク質又は融合タンパク質（若しくはその一部）は、ナノ細孔に共有結合される。いくつかの実施形態では、補助タンパク質又は融合タンパク質（若しくはその一部）は、ナノ細孔に非共有結合的に結合される。いくつかの実施形態では、第１の補助タンパク質と第２の補助タンパク質は、互いに結合される（例えば、共有結合、非共有結合など）。いくつかの実施形態では、融合タンパク質の第１の部分と融合タンパク質の第２の部分は、互いに結合される（例えば、共有結合、非共有結合など）。いくつかの実施形態では、補助タンパク質又は融合タンパク質（又は補助タンパク質若しくは融合タンパク質の一部、例えば、第１の部分又は第２の部分）は、ナノ細孔（例えば、ＣｓｇＧナノ細孔）に結合されない。いくつかの実施形態では、第１の補助タンパク質と第２の補助タンパク質は、互いに結合されない。 The location of one or more auxiliary proteins (or one or more fusion proteins) in a protein pore complex may vary. In some embodiments, the auxiliary protein (or fusion protein) is located entirely within the lumen of the protein pore complex. In some embodiments, the auxiliary protein or fusion protein includes a portion that extends beyond the lumen of the protein pore complex, e.g., extends above the lumen of the protein pore complex (e.g., extends above the cap region on the cis side of the protein pore complex) and/or extends below the protein pore complex (e.g., extends below the transmembrane domain (e.g., barrel) on the trans side of the protein pore complex). In some embodiments, the auxiliary protein or fusion protein (or a portion of the auxiliary protein or fusion protein, e.g., the first portion or the second portion) is bound to a nanopore (e.g., a CsgG nanopore). In some embodiments, the auxiliary protein or fusion protein (or a portion thereof) is covalently bound to the nanopore. In some embodiments, the auxiliary protein or fusion protein (or a portion thereof) is non-covalently bound to the nanopore. 
In some embodiments, the first auxiliary protein and the second auxiliary protein are bound to each other (e.g., covalently, non-covalently, etc.). In some embodiments, the first portion of the fusion protein and the second portion of the fusion protein are bound to each other (e.g., covalently, non-covalently, etc.). In some embodiments, the auxiliary protein or fusion protein (or a portion of the auxiliary protein or fusion protein, e.g., the first portion or the second portion) is not bound to the nanopore (e.g., the CsgG nanopore). In some embodiments, the first auxiliary protein and the second auxiliary protein are not bound to each other.

In some embodiments, the auxiliary protein (e.g., the first auxiliary protein) is not CsgF, a CsgF peptide, or a functional homolog, fragment, or modified form thereof. In some embodiments, a portion of the fusion protein (e.g., the first portion and/or the second portion) is not CsgF, a CsgF peptide, or a functional homolog, fragment, or modified form thereof. In some embodiments, the auxiliary protein is not a CsgG nanopore, or a homolog, fragment, or modified form thereof. In some embodiments, a portion of the fusion protein (e.g., the first portion and/or the second portion) is not a CsgG nanopore, or a homolog, fragment, or modified form thereof.

いくつかの実施形態では、補助タンパク質は、ポリヌクレオチド結合タンパク質ではない。いくつかの実施形態では、補助タンパク質は、官能的ポリヌクレオチド結合タンパク質ではなく、例えば、補助タンパク質は、酵素活性を有するポリヌクレオチド結合タンパク質ではない。いくつかの実施形態では、補助タンパク質は、核酸処理酵素以外のタンパク質、例えば、ヘリカーゼ若しくはポリメラーゼではない補助タンパク質、又はそのような酵素に由来するタンパク質であってもよい。いくつかの実施形態では、補助タンパク質は、酵素活性を有しない。いくつかの実施形態では、補助タンパク質は、標的分析物がタンパク質細孔複合体内に形成された連続チャネルを通過すると、構造が変化しない。 In some embodiments, the auxiliary protein is not a polynucleotide binding protein. In some embodiments, the auxiliary protein is not a functional polynucleotide binding protein, e.g., the auxiliary protein is not a polynucleotide binding protein with enzymatic activity. In some embodiments, the auxiliary protein may be a protein other than a nucleic acid processing enzyme, e.g., an auxiliary protein that is not a helicase or polymerase, or a protein derived from such an enzyme. In some embodiments, the auxiliary protein does not have enzymatic activity. In some embodiments, the auxiliary protein does not change conformation when the target analyte passes through the continuous channel formed within the protein pore complex.

In some embodiments, the auxiliary protein or fusion protein (e.g., a portion of a fusion protein) is a component of a nanopore system other than the component that forms the transmembrane pore, or a modified form of such a component. Examples of such components include CsgF and truncated forms of CsgF. In some embodiments, the auxiliary protein or fusion protein comprises a CsgF protein or a modified form thereof, such as a homolog or fragment. In some embodiments, the pore complex comprises a CsgF protein or peptide and a non-CsgG pore, or a modified form thereof such as a homolog or fragment.

「ＣｓｇＦタンパク質」又は「ＣｓｇＦペプチド」という用語は、好ましくは、そのＣ末端から短縮されたＣｓｇＦペプチド（すなわち、Ｎ末端断片である）を定義する。ＣｓｇＦペプチドは、野生型大腸菌ＣｓｇＦ（例えば、図３Ａに示すように）の断片、又は大腸菌ＣｓｇＦの野生型相同体の断片であってもよく、例えば、ＷＯ２０１９／００２８９３（その全体が参照により本明細書に組み込まれる）に示されるアミノ酸配列のいずれか１つを含むペプチドである。ＣｓｇＦ相同体は、野生型大腸菌ＣｓｇＦに対して少なくとも３０％、４０％、５０％、６０％、７０％、８０％、９０％、９５％又は９９％の完全な配列同一性を有するポリペプチドと呼ばれる。ＣｓｇＦ相同体はまた、ＣｓｇＦ様タンパク質に特徴的なＰＦＡＭドメインＰＦ１０６１４を含有するポリペプチドとも呼ばれる。現在知られているＣｓｇＦ相同体及びＣｓｇＦアーキテクチャのリストは、ｈｔｔｐ：／／ｐｆａｍ．ｘｆａｍ．ｏｒｇ／／ｆａｍｉｌｙ／ＰＦ１０６１４において見出され得る。成熟ＣｓｇＦ（例えば、図３Ａに示すように）は、「ＣｓｇＦ狭窄ペプチド」（ＦＣＰ）、「ネック」領域及び「ヘッド」領域である３つの主要な領域に分割することができる。ＣｓｇＦペプチドの「ヘッド」領域は、本明細書に記載の細孔の狭窄とは異なる。ＣｓｇＦペプチドの「ヘッド」領域はまた、「Ｃ末端ヘッドドメイン」とも呼ばれてもよい。ＣｓｇＦの構造は、ＷＯ２０１９／００２８９３（その全体が参照により本明細書に組み込まれる）で詳細に論じられる。 The terms "CsgF protein" or "CsgF peptide" preferably define a CsgF peptide truncated from its C-terminus (i.e., an N-terminal fragment). The CsgF peptide may be a fragment of wild-type E. coli CsgF (e.g., as shown in FIG. 3A) or a fragment of a wild-type homolog of E. coli CsgF, such as a peptide comprising any one of the amino acid sequences set forth in WO 2019/002893 (incorporated herein by reference in its entirety). A CsgF homolog is a polypeptide that shares at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% complete sequence identity with wild-type E. coli CsgF. A CsgF homolog is also referred to as a polypeptide that contains the PFAM domain PF10614, which is characteristic of CsgF-like proteins. A list of currently known CsgF homologs and CsgF architectures can be found at http://pfam.xfam.org/family/PF10614. Mature CsgF (e.g., as shown in Figure 3A) can be divided into three main regions: the "CsgF constriction peptide" (FCP), the "neck" region, and the "head" region. The "head" region of the CsgF peptide is distinct from the pore constriction described herein. The "head" region of the CsgF peptide may also be referred to as the "C-terminal head domain." The structure of CsgF is discussed in detail in WO2019/002893, which is incorporated herein by reference in its entirety.

いくつかの実施形態では、ＣｓｇＦペプチドは、短縮型ＣｓｇＦペプチドであり、短縮型ＣｓｇＦペプチドは、Ｃ末端ヘッドを欠き、Ｃ末端ヘッド及びＣｓｇＦのネックドメインの一部を欠き（例えば、短縮型ＣｓｇＦペプチドは、ＣｓｇＦのネックドメインの一部のみを含んでもよい）、又はＣ末端ヘッド及びＣｓｇＦのネックドメインを欠く。ＣｓｇＦペプチドは、ＣｓｇＦネックドメインの一部を欠いている可能性があり、例えばＣｓｇＦペプチドは、例えば、ネックドメインのＮ末端におけるアミノ酸残基３６からのネックドメインの一部を含んでもよい（例えば、野生型大腸菌ＣｓｇＦの残基３６～４０、３６～４１、３６～４２、３６～４３、３６～４５、３６～４６、最多で残基３６～５０又は３６～６０）。いくつかの実施形態では、ＣｓｇＦペプチドは、ＣｓｇＧ結合領域及び細孔の内腔内の狭窄を形成する領域を含む。ＣｓｇＧ結合領域は、典型的には、ＣｓｇＦタンパク質（例えば、野生型大腸菌ＣｓｇＦ又は別の種からの相同体）の残基１～１１及び／又は２９～３２を含み、１つ以上の修飾を含んでもよい。細孔内に狭窄を形成する領域は、典型的には、ＣｓｇＦタンパク質の残基９～２８（野生型大腸菌ＣｓｇＦ又は別の種からの相同体）を含み、１つ以上の修飾を含んでもよい。いくつかの実施形態では、残基９～１７は、保存されたモチーフＮ_９ＰＸＦＧＧＸＸＸ_１７を含み、ターン領域を形成する。いくつかの実施形態では、残基９～２８は、アルファ－ヘリックスを形成する。いくつかの実施形態では、ＣｓｇＦペプチドの１７位でのアミノ酸残基は、細孔内のＣｓｇＦ狭窄の最も狭い部分に対応する狭窄領域の頂点を形成する。いくつかの実施形態では、ＣｓｇＦ狭窄領域はまた、主にＣｓｇＦペプチドの残基８、９、１１、１２、１８、２１及び２２で、ＣｓｇＧベータ－バレルとの安定化接触を行う。いくつかの実施形態では、ＣｓｇＦペプチドは、野生型大腸菌ＣｓｇＦのアミノ酸残基１～３０に対応するアミノ酸配列ＧＴＭＴＦＱＦＲＮＰＮＦＧＧＮＰＮＮＧＡＦＬＬＮＳＡＱＡＱＮ（配列番号６０）を含むか、又はそれらからなる。いくつかの実施形態では、ＣｓｇＦペプチドは、第１の補助タンパク質である。いくつかの実施形態では、ＣｓｇＦペプチドは、融合タンパク質の一部（例えば、第１の部分又は第２の部分）である。いくつかの実施形態では、ＣｓｇＦペプチドは、野生型大腸菌ＣｓｇＦのアミノ酸残基１～２３を含むか、又はそれらからなる。いくつかの実施形態では、ＣｓｇＦペプチドは、野生型大腸菌ＣｓｇＦのアミノ酸残基１～２３を含むか、又はそれらからなる。いくつかの実施形態では、ＣｓｇＦペプチドは、野生型大腸菌ＣｓｇＦのアミノ酸残基１～２４を含むか、又はそれらからなる。いくつかの実施形態では、ＣｓｇＦペプチドは、野生型大腸菌ＣｓｇＦのアミノ酸残基１～２４を含むか、又はそれらからなる。 In some embodiments, the CsgF peptide is a truncated CsgF peptide that lacks the C-terminal head, lacks the C-terminal head and a portion of the neck domain of CsgF (e.g., the truncated CsgF peptide may include only a portion of the neck domain of CsgF), or lacks the C-terminal head and the neck domain of CsgF. The CsgF peptide can lack a portion of the CsgF neck domain, for example, the CsgF peptide can include a portion of the neck domain from amino acid residue 36 at the N-terminus of the neck domain (e.g., residues 36-40, 36-41, 36-42, 36-43, 36-45, 36-46, or at most residues 36-50 or 36-60 of wild-type E. coli CsgF). In some embodiments, the CsgF peptide includes the CsgG binding region and the region that forms the constriction within the lumen of the pore. 
The CsgG-binding region typically comprises residues 1-11 and/or 29-32 of a CsgF protein (e.g., wild-type E. coli CsgF or a homologue from another species) and may include one or more modifications. The region that forms the constriction within the pore typically comprises residues 9-28 of a CsgF protein (e.g., wild-type E. coli CsgF or a homologue from another species) and may include one or more modifications. In some embodiments, residues 9-17 comprise the conserved motif N₉PXFGGXXX₁₇ and form a turn region. In some embodiments, residues 9-28 form an alpha-helix. In some embodiments, the amino acid residue at position 17 of the CsgF peptide forms the apex of the constriction region, which corresponds to the narrowest part of the CsgF constriction within the pore. In some embodiments, the CsgF constriction region also makes stabilizing contacts with the CsgG beta-barrel, primarily at residues 8, 9, 11, 12, 18, 21, and 22 of the CsgF peptide. In some embodiments, the CsgF peptide comprises or consists of the amino acid sequence GTMTFQFRNPNFGGNPNNGAFLLNSAQAQN (SEQ ID NO: 60), which corresponds to amino acid residues 1-30 of wild-type E. coli CsgF. In some embodiments, the CsgF peptide is a first auxiliary protein. In some embodiments, the CsgF peptide is part of a fusion protein (e.g., the first portion or the second portion). In some embodiments, the CsgF peptide comprises or consists of amino acid residues 1-23 of wild-type E. coli CsgF. In some embodiments, the CsgF peptide comprises or consists of amino acid residues 1-24 of wild-type E. coli CsgF.
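The residue numbering used for the CsgF peptide can be checked directly against the sequence quoted as SEQ ID NO: 60. The following is a minimal sketch, not part of the disclosure: the helper name `region` and the regular expression are illustrative assumptions, and `X` in the motif is read as "any residue".

```python
import re

# SEQ ID NO: 60 as quoted in the text (wild-type E. coli CsgF residues 1-30).
CSGF_1_30 = "GTMTFQFRNPNFGGNPNNGAFLLNSAQAQN"


def region(seq: str, start: int, end: int) -> str:
    """Return residues start..end using the 1-based, inclusive numbering of the text."""
    return seq[start - 1:end]


# Conserved motif N9-P-X-F-G-G-X-X-X17, with X standing for any residue (assumption).
TURN_MOTIF = re.compile(r"NP.FGG...")

turn = region(CSGF_1_30, 9, 17)
print(turn)                               # NPNFGGNPN
print(bool(TURN_MOTIF.fullmatch(turn)))   # True
print(region(CSGF_1_30, 17, 17))          # N (the residue named as the constriction apex)
```

The same helper can be used to pull out the other ranges named above, for example `region(CSGF_1_30, 1, 11)` for the first part of the CsgG-binding region.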

いくつかの実施形態では、ＣｓｇＦペプチドは、２８～６０個のアミノ酸、例えば、２９～４９、３０～４５又は３２～４０個のアミノ酸の長さを有する。いくつかの実施形態では、ＣｓｇＦペプチドは、２９～３５個のアミノ酸、又は２９～４５個のアミノ酸を含む。いくつかの実施形態では、ＣｓｇＦペプチドは、２４、２５、２６、２７、２８、２９、３０、３１、３２、３３、３４、３５、３６、３７、３８、３９、４０、４１、４２、４３、４４、４５、４６、４７、４８、４９、５０、５１、５２、５３、５４、５５、５６、５７、５８、５９、又は６０個のアミノ酸の長さを含む。いくつかの実施形態では、ＣｓｇＦペプチドは、野生型大腸菌ＣｓｇＦの残基１～３５（又はＣｓｇＦ相同体内の対応する残基）に対応するＦＣＰの全部又は一部を含む。いくつかの実施形態では、ＣｓｇＦペプチドがＦＣＰよりも短い場合、短縮は、好ましくは、Ｃ末端で行われる。 In some embodiments, the CsgF peptide has a length of 28 to 60 amino acids, e.g., 29 to 49, 30 to 45, or 32 to 40 amino acids. In some embodiments, the CsgF peptide comprises 29 to 35 amino acids, or 29 to 45 amino acids. In some embodiments, the CsgF peptide comprises 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60 amino acids in length. In some embodiments, the CsgF peptide comprises all or a portion of an FCP corresponding to residues 1 to 35 of wild-type E. coli CsgF (or the corresponding residues in a CsgF homologue). In some embodiments, when the CsgF peptide is shorter than FCP, the truncation is preferably at the C-terminus.

One or more residues in the CsgF peptide may be modified. For example, the CsgF peptide may include modifications at positions corresponding to one or more of positions G1, M3, T4, F5, R8, N9, N11, F12, N17, A20, N24, A26, and Q29 of SEQ ID NO: 60. In some embodiments, the CsgF peptide is modified to introduce one or more cysteines, one or more hydrophobic amino acids, one or more charged amino acids, one or more unnatural amino acids, one or more polar amino acids, or one or more photoreactive amino acids, for example, at positions corresponding to one or more of G1, T4, F5, R8, N9, N11, F12, N17, A20, N24, A26, Q27, and Q29 of SEQ ID NO: 60. Such introductions may be made in any number and combination. Introduction is preferably made by substitution.

いくつかの実施形態では、ＣｓｇＦペプチドは、配列番号６０における位置Ｎ１５、Ｎ１７、Ａ２０、Ｎ２４及びＡ２８のうちの１つ以上に対応する位置に修飾を含む。いくつかの実施形態では、ＣｓｇＦペプチドは、Ｎ１５Ｓ／Ａ／Ｔ／Ｑ／Ｇ／Ｌ／Ｖ／Ｉ／Ｆ／Ｙ／Ｗ／Ｒ／Ｋ／Ｄ／Ｃ／Ｅ、Ｎ１７Ｓ／Ａ／Ｔ／Ｑ／Ｇ／Ｌ／Ｖ／Ｉ／Ｆ／Ｙ／Ｗ／Ｒ／Ｋ／Ｄ／Ｃ／Ｅ、Ａ２０Ｓ／Ｔ／Ｑ／Ｎ／Ｇ／Ｌ／Ｖ／Ｉ／Ｆ／Ｙ／Ｗ／Ｒ／Ｋ／Ｄ／Ｃ／Ｅ、Ｎ２４Ｓ／Ｔ／Ｑ／Ａ／Ｇ／Ｌ／Ｖ／Ｉ／Ｆ／Ｙ／Ｗ／Ｒ／Ｋ／Ｄ／Ｃ／Ｅ、又はＡ２８Ｓ／Ｔ／Ｑ／Ｎ／Ｇ／Ｌ／Ｖ／Ｉ／Ｆ／Ｙ／Ｗ／Ｒ／Ｋ／Ｄ／Ｃ／Ｅである置換のうちの１つ以上を含む。 In some embodiments, the CsgF peptide includes modifications at positions corresponding to one or more of positions N15, N17, A20, N24, and A28 in SEQ ID NO: 60. In some embodiments, the CsgF peptide includes one or more of the following substitutions: N15S/A/T/Q/G/L/V/I/F/Y/W/R/K/D/C/E, N17S/A/T/Q/G/L/V/I/F/Y/W/R/K/D/C/E, A20S/T/Q/N/G/L/V/I/F/Y/W/R/K/D/C/E, N24S/T/Q/A/G/L/V/I/F/Y/W/R/K/D/C/E, or A28S/T/Q/N/G/L/V/I/F/Y/W/R/K/D/C/E.
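The substitution shorthand used above (wild-type residue, position, replacement residue, with alternatives separated by "/") can be made concrete with a short sketch. The function names `expand` and `apply_substitution` are hypothetical and are introduced only for illustration; the sequence is SEQ ID NO: 60 as quoted in the text.

```python
import re

CSGF_1_30 = "GTMTFQFRNPNFGGNPNNGAFLLNSAQAQN"  # SEQ ID NO: 60, as quoted in the text

SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def expand(shorthand: str) -> list[str]:
    """Expand 'N15S/A/T' into ['N15S', 'N15A', 'N15T']."""
    head, *alternatives = shorthand.split("/")
    prefix = head[:-1]  # wild-type residue plus position, e.g. 'N15'
    return [head] + [prefix + alt for alt in alternatives]


def apply_substitution(seq: str, code: str) -> str:
    """Apply one substitution such as 'N17S' using 1-based positions."""
    match = SUBSTITUTION.fullmatch(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[position - 1] != wild_type:
        raise ValueError(f"{code}: residue {position} is {seq[position - 1]}, not {wild_type}")
    return seq[:position - 1] + new + seq[position:]


print(expand("N15S/A/T"))                     # ['N15S', 'N15A', 'N15T']
print(apply_substitution(CSGF_1_30, "N17S"))  # residue 17 changed from N to S
```

Checking the wild-type letter before substituting catches numbering slips early, for example a code written against a different reference sequence.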

いくつかの実施形態では、ＣｓｇＦペプチドは、好ましくは、比較配列と比較して１つ以上の修飾を含む、配列番号６０を含む上述したＣｓｇＦ配列のうちのいずれかのバリアントである。配列番号６０のアミノ酸配列の全長にわたって、バリアントは、好ましくは、アミノ酸同一性に基づき、その配列と少なくとも４０％相同である。より好ましくは、バリアントは、配列全体にわたって配列番号６０のアミノ酸配列に対するアミノ酸同一性に基づき、少なくとも４５％、少なくとも５０％、少なくとも５５％、少なくとも６０％、少なくとも６５％、少なくとも７０％、少なくとも７５％、少なくとも８０％、少なくとも８５％、少なくとも９０％、より好ましくは少なくとも９５％、９７％又は９９％相同であってもよい。配列番号６０のアミノ酸配列の全長にわたって、バリアントは、好ましくは、その配列と少なくとも４０％同一である。より好ましくは、バリアントは、配列全体にわたって配列番号６０と少なくとも４５％、少なくとも５０％、少なくとも５５％、少なくとも６０％、少なくとも６５％、少なくとも７０％、少なくとも７５％、少なくとも８０％、少なくとも８５％、少なくとも９０％、より好ましくは少なくとも９５％、９７％、又は９９％同一であってもよい。１５以上、例えば、１５、１６、１７、１８、１９、２０、２１、２２、２３、２４、２５、２６、２７、２８、２９、３０、又はそれ以上の連続アミノ酸のストレッチにわたって、少なくとも８０％、例えば、少なくとも８５％、９０％又は９５％のアミノ酸同一性（「ハードホモロジー」）が存在してもよい。これらの相同性／同一性のレベルは、上記のその他のＣｓｇＦペプチドのうちのいずれについても同様である。 In some embodiments, the CsgF peptide is a variant of any of the above-described CsgF sequences, including SEQ ID NO: 60, preferably containing one or more modifications compared to the comparison sequence. Over the entire length of the amino acid sequence of SEQ ID NO: 60, the variant is preferably at least 40% homologous to that sequence based on amino acid identity. More preferably, the variant may be at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% homologous to the amino acid sequence of SEQ ID NO: 60 over the entire sequence based on amino acid identity. Over the entire length of the amino acid sequence of SEQ ID NO: 60, the variant is preferably at least 40% identical to that sequence. More preferably, the variant may be at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% identical to SEQ ID NO: 60 over the entire sequence. 
There may also be at least 80%, e.g., at least 85%, 90%, or 95%, amino acid identity ("hard homology") over a stretch of 15 or more consecutive amino acids, e.g., 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more. These levels of homology/identity apply equally to any of the other CsgF peptides described above.
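The two identity criteria above (identity over the full length, and "hard homology" over a stretch of at least 15 consecutive residues) can be illustrated as follows. This is a simplified sketch under a stated assumption: the two sequences are already aligned, ungapped, and of equal length. Real comparisons would normally use an alignment tool, which is not shown here, and the function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def has_hard_homology(a: str, b: str, window: int = 15, threshold: float = 80.0) -> bool:
    """True if any stretch of `window` consecutive residues reaches `threshold` percent identity."""
    if len(a) != len(b):
        raise ValueError("sequences must be of equal length")
    return any(
        percent_identity(a[i:i + window], b[i:i + window]) >= threshold
        for i in range(len(a) - window + 1)
    )


reference = "GTMTFQFRNPNFGGNPNNGAFLLNSAQAQN"  # SEQ ID NO: 60, as quoted in the text
variant = "GTMTFQFRNPNFGGNPSNGAFLLNSAQAQN"    # hypothetical N17S variant

print(round(percent_identity(reference, variant), 1))  # 96.7
print(has_hard_homology(reference, variant))           # True
```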

１、２、３、４、５、６、７、８、９又は１０など、細孔又は細孔複合体内の任意の数のＣｓｇＦペプチドは、配列番号６０と比較して１つ以上の置換を含んでもよい。いくつかの実施形態では、細孔又は細孔複合体内の６～１０個のモノマーは全て、好ましくは、配列番号６０と比較して１つ以上の置換を含有する。細孔複合体内のＣｓｇＦペプチドは、同じであっても異なっていてもよい。ＣｓｇＦペプチドは、好ましくは、本開示の細孔複合体内の各細孔モノマー結合体において同一である。 Any number of CsgF peptides within a pore or pore complex, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, may contain one or more substitutions compared to SEQ ID NO: 60. In some embodiments, all 6 to 10 monomers within a pore or pore complex preferably contain one or more substitutions compared to SEQ ID NO: 60. The CsgF peptides within a pore complex may be the same or different. The CsgF peptides are preferably identical in each pore monomer conjugate within a pore complex of the present disclosure.

Aspects of the present disclosure relate to auxiliary proteins or fusion proteins comprising one or more alpha helices. In some embodiments, such proteins may be referred to as "helix-forming proteins." The present disclosure is based, in part, on the recognition that helix-forming proteins may be located within the lumen of certain nanopores (e.g., CsgG nanopores) to form one or more constrictions within the lumen of the nanopore, and that the presence of such one or more constrictions improves the signal-to-noise ratio (e.g., polynucleotide base discrimination) of the resulting protein pore complex. The term "helix" or "helical" generally refers to a coiled structural arrangement of a protein that forms a spiral and results from the formation of hydrogen bonds between the backbones of non-contiguous amino acid residues in a repeating pattern. In some embodiments, the helix is an alpha helix (also referred to as a 3.6₁₃-helix) containing approximately 3.6 amino acid residues per helical turn, with 13 atoms participating in the ring formed by each hydrogen bond. In some embodiments, the helix is a 3₁₀ helix containing approximately three residues per turn, with 10 atoms in the ring formed by each hydrogen bond.

The number of helices (e.g., alpha helices, 3₁₀ helices, pi helices, etc.) in an auxiliary protein or fusion protein can vary. In some embodiments, the number of helices in an auxiliary protein (e.g., a first auxiliary protein, a second auxiliary protein, etc.) ranges from about 0 to about 15, e.g., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15. In some embodiments, the number of helices in an auxiliary protein (e.g., a first auxiliary protein, a second auxiliary protein, etc.) is greater than 15 (e.g., 20, 25, etc.). In some embodiments, the fusion protein (e.g., a first portion of a fusion protein, a second portion of a fusion protein, etc.) comprises from 0 to about 15 helices (e.g., alpha helices, 3₁₀ helices, pi helices, etc.), e.g., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 helices.

The number of turns within a helix (e.g., an alpha helix, a 3₁₀ helix, a pi helix, etc.) can vary. In some embodiments, each helix (e.g., an alpha helix, a 3₁₀ helix, a pi helix, etc.) of an auxiliary protein or fusion protein contains from about 0 to about 15 helical turns, e.g., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 helical turns. A helix (e.g., an alpha helix, a 3₁₀ helix, a pi helix, etc.) can contain one or more half helices (e.g., half turns), e.g., 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, etc. helical turns.

The number of amino acid residues within a helix (e.g., an alpha helix, a 3₁₀ helix, a pi helix, etc.) can vary. In some embodiments, each helix (e.g., an alpha helix, a 3₁₀ helix, a pi helix, etc.) of the auxiliary protein or fusion protein comprises 2 to 55 amino acid residues, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, or 55 amino acid residues.
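The ranges given for helical turns and for residues per helix are related through the residues-per-turn figures stated earlier (about 3.6 for an alpha helix and about 3 for a 3₁₀ helix). The sketch below is a simple illustration of that arithmetic; the function names and dictionary keys are assumptions, and the pi helix is omitted because the text gives no residues-per-turn figure for it.

```python
# Approximate residues per helical turn, as stated in the description.
RESIDUES_PER_TURN = {"alpha": 3.6, "3_10": 3.0}


def residues_for_turns(turns: float, helix: str = "alpha") -> float:
    """Approximate number of residues spanned by a given number of helical turns."""
    return turns * RESIDUES_PER_TURN[helix]


def turns_for_residues(residues: int, helix: str = "alpha") -> float:
    """Approximate number of helical turns formed by a given number of residues."""
    return residues / RESIDUES_PER_TURN[helix]


# 15 alpha-helical turns correspond to about 54 residues, which sits just
# inside the 2 to 55 residue range given for a single helix.
print(round(residues_for_turns(15, "alpha")))     # 54
print(round(turns_for_residues(55, "alpha"), 1))  # 15.3
print(round(residues_for_turns(2.5, "3_10"), 1))  # 7.5
```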

The angles of the helices of the auxiliary protein or fusion protein may vary. In some embodiments, the helix has a phi angle ranging from about -45° to -90° (e.g., -45°, -46°, -47°, -48°, -49°, -50°, -51°, -52°, -53°, -54°, -55°, -56°, -57°, -58°, -59°, -60°, -61°, -62°, -63°, -64°, -65°, -66°, -67°, -68°, -69°, -70°, -71°, -72°, -73°, -74°, -75°, -76°, -77°, -78°, -79°, -80°, -81°, -82°, -83°, -84°, -85°, -86°, -87°, -88°, -89°, or -90°). In some embodiments, the helix has a psi angle ranging from about 0° to -70° (e.g., 0°, -1°, -2°, -3°, -4°, -5°, -6°, -7°, -8°, -9°, -10°, -11°, -12°, -13°, -14°, -15°, -16°, -17°, -18°, -19°, -20°, -21°, -22°, -23°, -24°, -25°, -26°, -27°, -28°, -29°, -30°, -31°, -32°, -33°, -34°, -35°, -36°, -37°, -38°, -39°, -40°, -41°, -42°, -43°, -44°, -45°, -46°, -47°, -48°, -49°, -50°, -51°, -52°, -53°, -54°, -55°, -56°, -57°, -58°, -59°, -60°, -61°, -62°, -63°, -64°, -65°, -66°, -67°, -68°, -69°, or -70°). In some embodiments, each of the helices comprises 1 to 20 amino acid residues having a phi angle in the range of about -45° to -90° and a psi angle in the range of about 0° to -70°. In some embodiments, each of the helices comprises 1 to 30 amino acid residues having a phi angle in the range of about -45° to -90° and a psi angle in the range of about 0° to -70°.
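The phi/psi criterion above amounts to a rectangular window in backbone dihedral space. The following sketch counts how many residues of a helix fall inside that window. The function names are hypothetical, and the example angles are illustrative values chosen for the demonstration, not measurements from the disclosure.

```python
# Dihedral windows in degrees, taken from the ranges stated in the description.
PHI_RANGE = (-90.0, -45.0)
PSI_RANGE = (-70.0, 0.0)


def in_helical_window(phi: float, psi: float) -> bool:
    """True if a residue's (phi, psi) pair lies inside both stated ranges."""
    return PHI_RANGE[0] <= phi <= PHI_RANGE[1] and PSI_RANGE[0] <= psi <= PSI_RANGE[1]


def count_helical_residues(dihedrals: list[tuple[float, float]]) -> int:
    """Number of residues whose (phi, psi) pair lies inside the helical window."""
    return sum(in_helical_window(phi, psi) for phi, psi in dihedrals)


# Illustrative (phi, psi) pairs: two helix-like residues and one extended residue.
example = [(-57.0, -47.0), (-49.0, -26.0), (-120.0, 130.0)]
print(count_helical_residues(example))  # 2
```

A count obtained this way can be compared with the "1 to 20" or "1 to 30" residue figures given for each helix.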

In some embodiments, one or more helices of the auxiliary protein or fusion protein contain structural features that promote the packing of the helices together. "Packing" of helices typically refers to the close association of two or more helices through covalent or non-covalent interactions between the helices, such as salt bridges, hydrogen bonds, disulfide bonds, and close hydrophobic side chain-side chain contacts, side chain-main chain contacts, or main chain-main chain contacts, as described in Walther and Argos, J Mol Biol. 1996 Jan 26;255(3):536-53. doi:10.1006/jmbi.1996.0044. Methods for predicting helical packing are known, for example, as described in Eilers et al., Proc Natl Acad Sci U S A. 2000 May 23;97(11):5796-5801.

本開示の態様は、環化された融合タンパク質がタンパク質細孔複合体内の標的分析物の差別を改善するという認識に関する。「環状の」タンパク質は、典型的には、結合の１つ以上の環状配置の形成をもたらす１つ以上の分子内相互作用を含むタンパク質（例えば、融合タンパク質）を指す。環化の例として、例えば、Ｈａｙｅｓｅｔａｌ．ＯｒｇＢｉｏｍｏｌＣｈｅｍ．２０２１Ｍａｙ１２；１９（１８）：３９８３－４００１に記載されるように、側鎖－側鎖環化（例えば、分子内ジスルフィド結合の形成）、ヘッド－テール環化（例えば、タンパク質のＮ末端アミノ酸とＣ末端アミノ酸との間のアミド結合の形成）、及びヘッド－側鎖環化が挙げられる。いくつかの実施形態では、融合タンパク質は、１つ以上の側鎖－側鎖環化結合を含む。いくつかの実施形態では、前記側鎖－側鎖環化結合の少なくとも１つは、ジスルフィド結合である。いくつかの実施形態では、１つ以上の環化結合は、融合タンパク質の第１の部分と融合タンパク質の第２の部分との間の環化（例えば、ＣｓｇＦペプチドとヘリックス形成タンパク質との間の環化）をもたらす。いくつかの実施形態では、補助タンパク質又は融合タンパク質は、１つ以上の環化結合を含むループ領域（例えば、ループ領域を形成するリンカー）を含む。いくつかの実施形態では、環化結合は、化学架橋剤によって形成され、及び／又はジスルフィド結合を含む。 Aspects of the present disclosure relate to the recognition that cyclized fusion proteins improve target analyte discrimination within protein pore complexes. A "cyclized" protein typically refers to a protein (e.g., a fusion protein) that includes one or more intramolecular interactions that result in the formation of one or more cyclic arrangements of bonds. Examples of cyclization include side chain-to-side chain cyclization (e.g., formation of an intramolecular disulfide bond), head-to-tail cyclization (e.g., formation of an amide bond between the N-terminal and C-terminal amino acids of a protein), and head-to-side chain cyclization, as described, for example, in Hayes et al. Org Biomol Chem. 2021 May 12;19(18):3983-4001. In some embodiments, the fusion protein includes one or more side chain-to-side chain cyclization bonds. In some embodiments, at least one of the side chain-to-side chain cyclization bonds is a disulfide bond. In some embodiments, the one or more cyclization bonds result in cyclization between a first portion of a fusion protein and a second portion of the fusion protein (e.g., cyclization between a CsgF peptide and a helix-forming protein). In some embodiments, the auxiliary protein or fusion protein includes a loop region (e.g., a linker that forms a loop region) that includes one or more cyclization bonds. 
In some embodiments, the cyclization bonds are formed by a chemical crosslinker and/or include a disulfide bond.

CsgG Nanopore
Aspects of the present disclosure relate to protein pore complexes. In some embodiments, the protein pore complexes described in this disclosure comprise a nanopore (e.g., a CsgG nanopore). A nanopore is a hole or channel through a membrane that allows hydrated ions to flow across or through the membrane, driven by an applied potential.

いくつかの実施形態では、ナノ細孔は、膜貫通タンパク質細孔である。膜貫通タンパク質細孔は、典型的には、膜全体に広がり、片側又は両側上の膜を越えて延びる構造を有してもよい。膜貫通タンパク質細孔は、水和イオンが膜の片側から膜の反対側へ流れることを可能にする単一又は多量体タンパク質である。膜貫通タンパク質細孔は、分析物、例えばＤＮＡ又はＲＮＡなどのポリヌクレオチドが細孔内に及び／又は細孔を通って移動する、又は移動されることを可能にするチャネルを含む。 In some embodiments, the nanopore is a transmembrane protein pore. Transmembrane protein pores typically span the entire membrane and may have a structure that extends beyond the membrane on one or both sides. Transmembrane protein pores are single or multimeric proteins that allow hydrated ions to flow from one side of the membrane to the other side. Transmembrane protein pores contain a channel that allows an analyte, e.g., a polynucleotide such as DNA or RNA, to move or be moved into and/or through the pore.

膜貫通タンパク質細孔は、典型的には、イオンが流れてもよいバレル又はチャネルを含む。細孔のサブユニットは、典型的には、中心軸を取り囲み、鎖を膜貫通βバレル若しくはチャネル又は膜貫通α－ヘリックスバンドル若しくはチャネルに寄与する。 Transmembrane protein pores typically contain a barrel or channel through which ions may flow. The subunits of the pore typically surround a central axis and contribute strands to a transmembrane beta barrel or channel or a transmembrane alpha-helical bundle or channel.

膜貫通タンパク質細孔のバレル又はチャネルは、典型的には、ポリヌクレオチドとの相互作用を促進するアミノ酸を含む。これらのアミノ酸は、好ましくは、バレル又はチャネルの狭窄に近く（例えば、１、２、３、４又は５ｎｍ以内）位置する。膜貫通タンパク質細孔は、典型的には、１つ以上の極性残基又は疎水性残基を含む。これらのアミノ酸は、典型的には、細孔とヌクレオチド、ポリヌクレオチド又は核酸との相互作用を促進する。 The barrel or channel of a transmembrane protein pore typically contains amino acids that facilitate interaction with a polynucleotide. These amino acids are preferably located near (e.g., within 1, 2, 3, 4, or 5 nm) the constriction of the barrel or channel. A transmembrane protein pore typically contains one or more polar or hydrophobic residues. These amino acids typically facilitate interaction of the pore with a nucleotide, polynucleotide, or nucleic acid.

In some embodiments, the nanopore is a CsgG pore, such as, for example, CsgG from E. coli Str. K-12 substr. MC4100, or a homolog or mutant thereof. The mutant CsgG pore may comprise one or more mutant monomers. The CsgG pore may be a homopolymer comprising identical monomers, or a heteropolymer comprising two or more different monomers. Suitable pores derived from CsgG are disclosed in WO2016/034591, WO2017/149316, WO2017/149317, WO2017/149318, International Patent Application Nos. PCT/GB2018/051191 and PCT/GB2018/051858, and Chinese Patent Publication Nos. CN113773373, CN113896776, CN113912683, and CN113754743, which are incorporated herein by reference in their entireties. Additional examples of CsgG pores include, but are not limited to, UniProt reference numbers K4KIX7, A0A086D1N6, A0A1I1MNE8, A0A143HJG2, A0A090RS48, and A0A090SZM0.

ＣｓｇＧ細孔は、典型的には、１つ以上のＣｓｇＧモノマーを含む。ＣｓｇＧ細孔モノマーは、ＣｓｇＧ細孔を形成することができるモノマーである。このようなモノマーは、特にＷＯ２０１９／００２８９３（その全体が参照により本明細書に組み込まれる）から当該技術分野で知られる。ＣｓｇＧ細孔は、好ましくは、（ａ）キャップ領域、（ｂ）狭窄領域、及び（ｃ）膜貫通ベータバレル領域のうちの１つ以上、例えば、（ａ）、（ｂ）、（ｃ）、（ａ）及び（ｂ）、（ａ）及び（ｃ）、（ｂ）及び（ｃ）、又は、（ａ）、（ｂ）及び（ｃ）を含む。ＣｓｇＧ細孔モノマーは、好ましくは、（ａ）キャップ形成領域、（ｂ）狭窄形成領域、及び（ｃ）膜貫通ベータバレル形成領域のうちの１つ以上、例えば、（ａ）、（ｂ）、（ｃ）、（ａ）及び（ｂ）、（ａ）及び（ｃ）、（ｂ）及び（ｃ）、又は、（ａ）、（ｂ）及び（ｃ）を含む。モノマーによって形成されるＣｓｇＧ細孔は、任意の構造を有してもよいが、好ましくは、（例えば、ＰＤＢアクセッション番号４ＵＶ３によって記載されるような）野生型大腸菌ＣｓｇＧ細孔の構造を有するか、又はそれを含む。ＣｓｇＧのタンパク質構造は、膜の一方の側から他方の側への分子及びイオンの転位を可能にするチャネル又は穴を定義する。 The CsgG pore typically comprises one or more CsgG monomers. A CsgG pore monomer is a monomer capable of forming a CsgG pore. Such monomers are known in the art, in particular from WO 2019/002893 (herein incorporated by reference in its entirety). The CsgG pore preferably comprises one or more of: (a) a cap region; (b) a constriction region; and (c) a transmembrane beta-barrel region, e.g., (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c). The CsgG pore monomer preferably comprises one or more of (a) a cap-forming region, (b) a constriction-forming region, and (c) a transmembrane beta-barrel-forming region, e.g., (a), (b), (c), (a) and (b), (a) and (c), (b) and (c), or (a), (b) and (c). The CsgG pore formed by the monomer may have any structure, but preferably has or comprises the structure of the wild-type E. coli CsgG pore (e.g., as described by PDB accession number 4UV3). The protein structure of CsgG defines a channel or hole that allows the translocation of molecules and ions from one side of the membrane to the other.

The CsgG pore may be of any size, but preferably has the dimensions of the wild-type E. coli CsgG pore (e.g., as described by PDB accession number 4UV3). These dimensions are shown in Figure 19. In some embodiments, the CsgG pore has an outer diameter of about 100 to about 150 Å at its widest point, e.g., about 110 to about 140 Å or about 115 to about 125 Å at its widest point. In some embodiments, the CsgG pore has an outer diameter of about 120 Å at its widest point. In some embodiments, the CsgG pore has a total length of about 80 to about 120 Å, e.g., about 90 to about 110 Å or about 95 to about 105 Å. In some embodiments, the CsgG pore has a total length of about 98 Å. References to "total length" and "length" relate to the length of the pore or pore region when viewed from the side (e.g., in a cis-to-trans cross-section of the pore inserted into a membrane). This may be the side view shown in Figure 19. In some embodiments, the outer diameter is measured by calculating the Cα-Cα distance between the most distant amino acid residues on the exterior of the CsgG pore. In some embodiments, the outer diameter is measured by calculating the distance between the van der Waals radii of the most distant amino acid residues on the exterior of the CsgG pore.
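The Cα-Cα measurement described above reduces to finding the largest pairwise distance among a set of atomic coordinates. The sketch below shows that calculation on synthetic data. The function name `max_ca_distance` and the eight-point ring of radius 60 Å are hypothetical and are chosen only so that the expected answer (about 120 Å) is easy to verify; in practice the coordinates would be read from a structure file such as the PDB entry cited in the text, which is not shown here.

```python
import math
from itertools import combinations

Point = tuple[float, float, float]


def max_ca_distance(coords: list[Point]) -> float:
    """Largest pairwise distance (in the units of the input) among C-alpha coordinates."""
    if len(coords) < 2:
        raise ValueError("at least two coordinates are required")
    return max(math.dist(a, b) for a, b in combinations(coords, 2))


# Synthetic example: eight exterior C-alpha positions evenly spaced on a ring
# of radius 60 Å in the membrane plane (hypothetical values, not real data).
ring = [
    (60.0 * math.cos(2 * math.pi * k / 8), 60.0 * math.sin(2 * math.pi * k / 8), 0.0)
    for k in range(8)
]

print(round(max_ca_distance(ring), 1))  # 120.0
```

A van der Waals based measurement would additionally account for the atomic radii at each end of the chosen pair; the radii to use are not specified in the text, so that step is left out of the sketch.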

In some embodiments, the cap region has a length of about 20 to about 60 Å, e.g., about 30 to about 50 Å or about 35 to about 45 Å. In some embodiments, the cap region has a length of about 39 Å. In some embodiments, the channel defined by the cap region has an opening with a diameter of about 30 to about 70 Å, e.g., about 40 to about 60 Å or about 45 to about 55 Å. In some embodiments, the channel defined by the cap region has an opening with a diameter of about 66 Å. In some embodiments, the channel defined by the cap region has a diameter of about 20 to about 66 Å at its narrowest point, e.g., about 30 to about 50 Å or about 32 to about 43 Å at its narrowest point. In some embodiments, the channel defined by the cap region preferably has a diameter of about 43 Å at its narrowest point. In some embodiments, the outer diameter is measured by calculating the Cα-Cα distance of the closest amino acid residues on the channel of the cap region of the CsgG pore. In some embodiments, the outer diameter is measured by calculating the distance between the van der Waals radii of the closest amino acid residues on the channel of the cap region.

In some embodiments, the constriction region formed by the CsgG pore (if present) has a length of about 5 to about 40 Å, e.g., about 10 to about 30 Å or about 15 to about 25 Å. In some embodiments, the constriction region has a length of about 20 Å. In some embodiments, the channel defined by the constriction region has a diameter of about 2 to about 30 Å at its narrowest point, e.g., about 5 to about 25 Å, about 8 to about 20 Å, or about 10 to about 15 Å at its narrowest point. In some embodiments, the channel defined by the constriction region has a diameter of about 9 Å. In some embodiments, the channel defined by the constriction region has a diameter of about 18.5 Å. In some embodiments, the constriction has a diameter of about 2 to about 30 Å, e.g., about 5 to about 25 Å, about 8 to about 20 Å, or about 10 to about 15 Å. In some embodiments, the constriction has a diameter of about 12 Å. In some embodiments, the constriction region of the CsgG pore is measured by calculating the Cα-Cα distance of the amino acid residues that extend furthest into the lumen of the pore to form the constriction. In some embodiments, the outer diameter is measured by calculating the distance between the van der Waals radii of the amino acid residues that extend furthest into the lumen of the pore to form the constriction.

In some embodiments, the transmembrane beta-barrel region has a length of about 20 to about 60 Å, e.g., about 30 to about 50 Å or about 35 to about 45 Å. In some embodiments, the transmembrane beta-barrel region has a length of about 39 Å. In some embodiments, the channel defined by the transmembrane beta-barrel region has a diameter at its narrowest point of about 20 to about 60 Å, e.g., about 30 to about 50 Å or about 35 to about 45 Å. In some embodiments, the channel defined by the transmembrane beta-barrel region has a diameter at its narrowest point of about 55 Å.

All of the above measurements are based on backbone-to-backbone measurements of the amino acids that form the different regions (as shown in Figure 19).
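As a rough illustration of the Cα-Cα approach to the outer diameter described above, the following Python sketch takes the largest pairwise distance among a set of Cα coordinates. The function name `max_ca_distance` and the ring of coordinates are hypothetical and are not taken from PDB entry 4UV3; a real measurement would first need to select the exterior residues from a structure file, which is not shown here.

```python
import math
from itertools import combinations


def max_ca_distance(ca_coords):
    """Return the largest pairwise distance (in Å) among C-alpha coordinates."""
    return max(math.dist(a, b) for a, b in combinations(ca_coords, 2))


# Hypothetical C-alpha positions (Å): nine points on a ring of radius 60 Å.
# These are invented for illustration and do not come from any real structure.
ring = [
    (60 * math.cos(2 * math.pi * k / 9), 60 * math.sin(2 * math.pi * k / 9), 0.0)
    for k in range(9)
]

outer_diameter = max_ca_distance(ring)
print(f"approximate outer diameter: {outer_diameter:.1f} Å")
```

Replacing `max` with `min` over a suitable selection of lumen-facing residues would give the corresponding narrowest-point estimate, again only as an approximation of the measurements shown in Figure 19.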

SEQ ID NO:59 shows the sequence of wild-type E. coli CsgG as a mature protein. Residues 1-41 of SEQ ID NO:59 form the cap region. Residues 64-131 of SEQ ID NO:59 form the constriction region. Residues 156-180 and 212-262 of SEQ ID NO:59 form the transmembrane beta-barrel region.

In some embodiments, the CsgG pore monomer is a variant of SEQ ID NO:59 because it has a cysteine at a position corresponding to position 153 or 133 of SEQ ID NO:59. In some embodiments, the variant CsgG monomer may also be referred to as a modified CsgG pore monomer or a mutant CsgG pore monomer. Modifications or mutations in the variant include, but are not limited to, any one or more of the modifications disclosed herein, or combinations of those modifications. The CsgG pore monomer may be a CsgG homolog monomer. A CsgG homolog monomer is a polypeptide that has at least 40%, 50%, 60%, 70%, 80%, 85%, 90%, 95%, or 99% complete sequence identity to the wild-type E. coli CsgG set forth in SEQ ID NO:59. A CsgG homolog is also referred to as a polypeptide that comprises the PFAM domain PF03783, which is characteristic of CsgG-like proteins. A list of currently known CsgG homologs and CsgG architectures can be found at http://pfam.xfam.org//family/PF03783.

In some embodiments, the CsgG pore monomer is a variant of SEQ ID NO:59 that includes one or more modifications in addition to a cysteine at a position corresponding to position 153 or 133 of SEQ ID NO:59. Over the entire length of the amino acid sequence of SEQ ID NO:59, the variant is preferably at least 40% homologous to that sequence based on amino acid identity. More preferably, the variant may be at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% homologous to the amino acid sequence of SEQ ID NO:59 over the entire sequence based on amino acid identity. Over the entire length of the amino acid sequence of SEQ ID NO:59, the variant is preferably at least 40% identical to that sequence. More preferably, the variant may be at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% identical to SEQ ID NO:59 over the entire sequence.

Sequence identity can also relate to fragments or portions of the CsgG pore monomer. Thus, a sequence may have less than 40% overall sequence homology/identity with SEQ ID NO:59, but the sequence of a particular region, domain, or subunit can share at least 80%, 90%, or up to 99% sequence homology/identity with the corresponding region of SEQ ID NO:59. There may be at least 80%, e.g., at least 85%, 90%, or 95% amino acid identity ("hard homology") over a stretch of 100 or more, e.g., 125, 150, 175, or 200 or more contiguous amino acids. In some embodiments, the CsgG pore monomer is preferably a variant of SEQ ID NO:3 comprising a sequence that is at least 40% homologous to the cap region (residues 1-41) of SEQ ID NO:3. More preferably, the variant may comprise a sequence that is at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% homologous, based on amino acid identity, to amino acid residues 1-41 of SEQ ID NO:59. In some embodiments, the variant comprises a sequence that is at least 40% identical to residues 1-41 of SEQ ID NO:59. More preferably, the variant comprises a sequence that is at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% identical to residues 1-41 of SEQ ID NO:59.

In some embodiments, the CsgG pore monomer is a variant of SEQ ID NO:59 comprising a sequence that is at least 40% homologous to the constriction region (residues 64-131) of SEQ ID NO:59. In some embodiments, the variant comprises a sequence that is at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% homologous, based on amino acid identity, to residues 64-131 of SEQ ID NO:59. In some embodiments, the variant comprises a sequence that is at least 40% identical to residues 64-131 of SEQ ID NO:59. More preferably, the variant comprises a sequence that is at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% identical to residues 64-131 of SEQ ID NO:59.

In some embodiments, the CsgG pore monomer is a variant of SEQ ID NO:59 comprising a sequence that is at least 40% homologous to the transmembrane beta-barrel region (residues 156-180 and 212-262) of SEQ ID NO:3. In some embodiments, the variant comprises a sequence that is at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% homologous, based on amino acid identity, to residues 156-180 and 212-262 of SEQ ID NO:59. In some embodiments, the variant comprises a sequence that is at least 40% identical to residues 156-180 and 212-262 of SEQ ID NO:59. More preferably, the variant comprises a sequence that is at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, more preferably at least 95%, 97%, or 99% identical to residues 156-180 and 212-262 of SEQ ID NO:59.

CsgG pore monomers are highly conserved (as can be readily seen from Figures 45-47 of WO 2017/149317). Furthermore, from knowledge of the mutations in relation to SEQ ID NO:59, it is possible to determine the equivalent positions for mutations in CsgG pore monomers other than that of SEQ ID NO:59.

Thus, reference to a mutant CsgG pore monomer comprising a variant of the sequence set forth in SEQ ID NO:59, and to the specific amino acid mutations thereof described in the claims and elsewhere in this specification, also encompasses a mutant CsgG pore monomer comprising a variant of any of the sequences set forth in SEQ ID NOs:68-88 of WO 2019/002893 (incorporated herein by reference in its entirety) and the corresponding amino acid mutations thereof. The CsgG pore monomer may also be any of the sequences set forth in CN113773373A, CN113896776A, CN113912683A, and CN113754743A, or a variant thereof.

Standard methods in the art can be used to determine homology. For example, the UWGCG package provides the BESTFIT program, which can be used to calculate homology, for example with its default settings (Devereux et al. (1984) Nucleic Acids Research 12, pp. 387-395). The PILEUP and BLAST algorithms can also be used to calculate homology or to align sequences (e.g., to identify equivalent residues or corresponding sequences, typically with their default settings), as described, for example, in Altschul S. F. (1993) J Mol Evol 36:290-300 and Altschul, S. F. et al. (1990) J Mol Biol 215:403-10. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
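To make the percentage thresholds used in this section concrete, the following Python sketch computes percent amino acid identity from two sequences that have already been aligned (for example by one of the tools mentioned above). It is not an implementation of BESTFIT, PILEUP, or BLAST. The function names, the toy sequences, and the choice to count only columns in which neither sequence has a gap are assumptions made for illustration; other denominator conventions exist and would give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the inputs are equal-length rows of a pairwise alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy aligned sequences, invented for illustration (not SEQ ID NO:59).
reference = "MKT-AYLG"
candidate = "MKTQAYIG"
pid = percent_identity(reference, candidate)
print(f"{pid:.1f}% identity")
```

The same function can be applied to a sub-range of the alignment to assess identity over a particular region rather than over the full length.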

SEQ ID NO:59 is the wild-type CsgG pore monomer from E. coli Str. K-12 substr. MC4100. A variant of SEQ ID NO:59 may include any of the substitutions present in another CsgG homolog. Preferred CsgG homologs are set forth in SEQ ID NOs:68-88 of WO 2019/002893 (incorporated herein by reference in its entirety). Compared to SEQ ID NO:59, a variant may include a combination of one or more of the substitutions present in SEQ ID NOs:68-88 of WO 2019/002893 (incorporated herein by reference in its entirety).

The CsgG pore monomer in the pore monomer conjugates of the present disclosure typically retains the ability to form the same 3D structure as a wild-type CsgG pore monomer, for example, a CsgG pore monomer having the sequence of SEQ ID NO:59. The 3D structure of CsgG is known in the art and is disclosed, for example, in Goyal et al. (2014) Nature 516(7530):250-3. In addition to the mutations described herein, any number of mutations may be made in the wild-type CsgG sequence, provided that the CsgG pore monomer retains the improved properties conferred by the mutations.

Amino acid substitutions may be made to the amino acid sequence of SEQ ID NO:59 in addition to those discussed above, for example up to 1, 2, 3, 4, 5, 10, 20, or 30 substitutions. Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties, or similar side-chain volume. The introduced amino acids may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality, or charge to the amino acids they replace. Alternatively, a conservative substitution may introduce another amino acid that is aromatic or aliphatic in place of an existing aromatic or aliphatic amino acid.

In some embodiments, the CsgG pore monomer is modified to introduce one or more cysteines, one or more hydrophobic amino acids, one or more charged amino acids, one or more unnatural amino acids, one or more polar amino acids, or one or more photoreactive amino acids. Such introductions may be made in any number and combination. The introduction is preferably made by substitution.

One or more amino acid residues of the amino acid sequence of SEQ ID NO:59 may additionally be deleted from the polypeptides described above. Up to 1, 2, 3, 4, 5, 10, 20, or 30 residues may be deleted, or more.

Variants may include fragments of SEQ ID NO:59. Such fragments retain pore-forming activity. Fragments may be at least 50, at least 100, at least 150, at least 200, or at least 250 amino acids in length. Such fragments may be used to produce pores. A fragment preferably includes the transmembrane domain of SEQ ID NO:59, namely K135-Q153 and S183-S208.

One or more amino acids may alternatively or additionally be added to the polypeptides described above. An extension may be provided at the amino terminus or carboxy terminus of the amino acid sequence of SEQ ID NO:59, or of a polypeptide variant or fragment thereof. The extension may be very short, for example, 1 to 10 amino acids in length. Alternatively, the extension may be longer, for example, up to 50 or 100 amino acids. A carrier protein may be fused to the amino acid sequence. Other fusion proteins are discussed in more detail elsewhere in this disclosure, for example, in the section entitled "Auxiliary Proteins."

A variant of SEQ ID NO:59 is a polypeptide that has an amino acid sequence different from that of SEQ ID NO:59 and that retains the ability to form a pore. A variant typically contains the regions of SEQ ID NO:59 that are responsible for pore formation. The pore-forming ability of CsgG, which contains a β-barrel, is provided by β-sheets in the transmembrane beta-barrel of each subunit monomer. A variant of SEQ ID NO:59 typically includes the regions of SEQ ID NO:59 that form β-sheets, namely K134-Q154 and S183-S208. One or more modifications can be made to the regions of SEQ ID NO:3 that form β-sheets, as long as the resulting variant retains the ability to form a pore.

One or more modifications in the CsgG pore monomer preferably improve the ability of a pore complex containing the pore monomer to characterize an analyte. For example, modifications, mutations, or substitutions are contemplated that alter the number, size, shape, location, or orientation of constrictions within the channel formed from the pore monomer conjugates of the present disclosure. A CsgG pore monomer or variant of SEQ ID NO:59 may have any of the specific modifications or substitutions disclosed in WO 2016/034591, WO 2017/149316, WO 2017/149317, WO 2017/149318, WO 2018/211241, and WO 2019/002893 (all of which are incorporated herein by reference in their entirety).

Preferred modifications or substitutions in SEQ ID NO:59 include, but are not limited to, one or more, e.g., two or more, three or more, four or more, five or more, six or more, seven or more, or all, of:
(a) a substitution at position Y51, e.g., Y51I, Y51L, Y51A, Y51V, Y51T, Y51S, Y51Q, or Y51N;
(b) a substitution at position N55, e.g., N55I, N55L, N55A, N55V, N55T, N55S, or N55Q;
(c) a substitution at position F56, e.g., F56I, F56L, F56A, F56V, F56T, F56S, F56Q, or F56N;
(d) a substitution at position L90, e.g., L90N, L90D, L90E, L90R, or L90K;
(e) a substitution at position N91, e.g., N91D, N91E, N91R, or N91K;
(f) a substitution at position K94, e.g., K94R, K94F, K94Y, K94Q, K94W, K94L, K94S, or K94N;
(g) a substitution at position R192, e.g., R192Q, R192F, R192S, R192D, or R192T; and
(i) a substitution at position C215, e.g., C215T, C215S, C215I, C215L, C215A, C215V, or C215G.

A variant of SEQ ID NO:3 may further include a deletion of one or more positions, for example, a deletion of T104 to N109, a deletion of F193 to L199, or a deletion of F195 to L199.
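The substitution notation used above (wild-type residue, 1-based position, replacement residue, as in Y51A) can be handled programmatically. The Python sketch below parses such labels and applies them to a sequence, checking that the stated wild-type residue is actually present at that position. The function names and the ten-residue toy sequence are hypothetical; the sequence is not SEQ ID NO:59, and deletions such as T104 to N109 are not handled.

```python
import re

# Pattern for labels such as "Y51A": wild-type residue, position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label like 'Y51A' into ('Y', 51, 'A')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitutions(sequence, labels):
    """Return a copy of `sequence` with each substitution applied (1-based)."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: found {residues[position - 1]} instead")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy sequence invented for illustration only.
toy = "ACDEFGHIKL"
print(apply_substitutions(toy, ["F5A", "K9R"]))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which matter when positions are quoted relative to a mature protein sequence.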

Any number of the CsgG pore monomers within a pore or pore complex, such as 6, 7, 8, 9, or 10, may be a variant of SEQ ID NO:59. Preferably, all 6 to 10 monomers within a pore or pore complex are variants of SEQ ID NO:59. The variants within a pore complex may be the same or different. The variant is preferably the same in each pore monomer conjugate within the pore complex.

Linkers
In some embodiments, the protein pore complex is stabilized by the attachment (e.g., covalent attachment) of an auxiliary protein or fusion protein to the nanopore. The covalent attachment can be, for example, a disulfide bond or click chemistry. As a further example, cysteine residues may be connected by a linker such as BMOE. The auxiliary protein or fusion protein and/or the transmembrane protein nanopore may be modified to facilitate such covalent interactions. In some embodiments, the auxiliary protein or fusion protein is non-covalently attached to the nanopore. In some embodiments, the auxiliary protein or fusion protein is attached to the nanopore by one or more (e.g., 1, 2, 3, 4, 5, or more) linkers.

In some embodiments, the auxiliary protein or fusion protein is bound to the nanopore by hydrophobic interactions and/or one or more disulfide bonds. One or more, e.g., 2, 3, 4, 5, 6, 8, 9, or all, of the monomers within the pore may be modified to enhance such interactions. This may be achieved in any suitable manner. Further suitable interactions include salt bridges, electrostatic interactions, hydrogen bond formation, peptide bond formation, and pi-pi interactions.

At least one cysteine residue in the amino acid sequence of the transmembrane protein nanopore at the interface between the nanopore and the auxiliary protein (or fusion protein) may be disulfide bonded to at least one cysteine residue in the amino acid sequence of the auxiliary protein at the interface between the nanopore and the auxiliary protein. In some embodiments, at least one cysteine residue in the amino acid sequence of a first auxiliary protein is disulfide bonded to at least one cysteine residue in the amino acid sequence of a second auxiliary protein. In some embodiments, at least one cysteine residue in the amino acid sequence of a first portion of a fusion protein is disulfide bonded to at least one cysteine residue in the amino acid sequence of a second portion of the fusion protein. The cysteine residue in the nanopore and/or the cysteine residue in the auxiliary protein or fusion protein may be a cysteine residue that is not present in the wild-type transmembrane protein pore monomer or wild-type auxiliary protein. A plurality of disulfide bonds, such as 2, 3, 4, 5, 6, 7, 8, or 9 to 16, 18, 24, 27, 32, 36, 40, 45, 48, 54, 56, or 63, may be formed between the nanopore and the auxiliary protein (or fusion protein) in the pore complex. One or both of the nanopore and the auxiliary protein (or fusion protein) may include at least one monomer or subunit, e.g., up to 8, 9, or 10 monomers or subunits, that includes a cysteine residue at the interface between the nanopore and the auxiliary protein (or fusion protein).

The nanopore and/or auxiliary protein (or fusion protein) may comprise one or more hydrophobic amino acid residues at the interface between the nanopore and the auxiliary protein (or fusion protein) that are more hydrophobic than the residues present at the corresponding positions in the wild-type nanopore or auxiliary protein (or fusion protein). At least one monomer or subunit in the nanopore and/or at least one monomer or subunit in the auxiliary protein (or fusion protein) may comprise at least one residue at the interface between the nanopore and the auxiliary protein (or fusion protein) that is more hydrophobic than the residue present at the corresponding position in the wild-type pore or auxiliary protein (or fusion protein). For example, 2 to 10, e.g., 3, 4, 5, 6, 7, 8, or 9, residues in the nanopore and/or auxiliary protein (or fusion protein) may be more hydrophobic than the residues present at the same positions in the corresponding wild-type nanopore and/or auxiliary protein (or fusion protein). Such hydrophobic residues enhance the interaction between the nanopore and the auxiliary protein (or fusion protein) in the pore complex. When a residue at the interface of the wild-type nanopore or auxiliary protein (or fusion protein) is R, Q, N, or E, the hydrophobic residue is typically I, L, V, M, F, W, A, or Y. When a residue at the interface of the wild-type nanopore or auxiliary protein (or fusion protein) is I, the hydrophobic residue is typically L, V, M, F, W, A, or Y. When a residue at the interface of the wild-type nanopore or auxiliary protein (or fusion protein) is L, the hydrophobic residue is typically I, V, M, F, W, A, or Y.

Molecular dynamics simulations can be performed to determine which residues in the auxiliary protein and the nanopore are in close proximity. This information can be used to design auxiliary protein and/or transmembrane protein nanopore mutants that can increase the stability of the complex. For example, simulations can be performed using the GROMACS package version 4.6.5 with the GROMOS 53a6 force field and the SPC water model, using the cryo-EM structure of the proteins. The complex can be solvated and then energy minimized using the steepest descent algorithm. Throughout the simulation, restraints can be applied to the protein backbone, while the residue side chains are free to move. The system can be simulated in the NPT ensemble for 20 ns, using a Berendsen thermostat and Berendsen barostat, at a target temperature of 300 K. Contacts between the auxiliary protein and the nanopore can be analyzed using the GROMACS analysis software and/or locally written code. Two residues can be defined as being in contact if they are within 3 Å of each other.
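The 3 Å contact definition above can be illustrated with a short Python sketch of the kind of locally written analysis code the passage mentions. The passage does not say which atoms are compared, so this sketch assumes that two residues are in contact when any pair of their atoms lies within the cutoff. The function names, the dictionary layout, and all coordinates are hypothetical, and the sketch works on a single frame rather than on a GROMACS trajectory.

```python
import math

CONTACT_CUTOFF = 3.0  # Å, as stated in the text


def residues_in_contact(atoms_a, atoms_b, cutoff=CONTACT_CUTOFF):
    """True if any atom of one residue is within `cutoff` Å of any atom of the other."""
    return any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b)


def contact_pairs(protein_a, protein_b, cutoff=CONTACT_CUTOFF):
    """List (residue_a, residue_b) pairs in contact.

    Each protein is a dict mapping residue number -> list of (x, y, z) atom
    coordinates in Å (an assumed layout, for illustration).
    """
    return sorted(
        (i, j)
        for i, atoms_i in protein_a.items()
        for j, atoms_j in protein_b.items()
        if residues_in_contact(atoms_i, atoms_j, cutoff)
    )


# Invented single-frame coordinates; residue numbers are arbitrary.
auxiliary = {10: [(0.0, 0.0, 0.0)], 20: [(10.0, 0.0, 0.0)]}
nanopore = {100: [(2.5, 0.0, 0.0)], 200: [(10.0, 0.0, 2.0), (20.0, 0.0, 0.0)]}

contacts = contact_pairs(auxiliary, nanopore)
print(contacts)
```

In practice the same test would be repeated over the frames of the simulation, and residue pairs that stay in contact for a large fraction of frames would be the natural candidates for stabilizing modifications.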

例えば、細孔複合体において、ＣｓｇＦペプチドとＣｓｇＧ細孔との間の相互作用は、例えば、配列番号６０及び配列番号５９のそれぞれの、１と１５３、４と１３３、５と１３６、８と１８７、８と２０３、９と２０３、１１と１４２、１１と２０１、１２と１４９、１２と２０３、２６と１９１、２９と１４４、又は３０と１９６である位置対の１つ以上に対応する位置での疎水性相互作用、静電相互作用、又は共有結合によって安定化されてもよい。これらの位置のうちの１つ以上でのＣｓｇＦ及び／又はＣｓｇＧの残基は、細孔におけるＣｓｇＧとＣｓｇＦとの間の相互作用を増強するために修飾されてもよい。 For example, in the pore complex, the interaction between the CsgF peptide and the CsgG pore may be stabilized by hydrophobic interactions, electrostatic interactions, or covalent bonds at positions corresponding to one or more of the following position pairs: 1 and 153, 4 and 133, 5 and 136, 8 and 187, 8 and 203, 9 and 203, 11 and 142, 11 and 201, 12 and 149, 12 and 203, 26 and 191, 29 and 144, or 30 and 196 in SEQ ID NO: 60 and SEQ ID NO: 59, respectively. Residues of CsgF and/or CsgG at one or more of these positions may be modified to enhance the interaction between CsgG and CsgF in the pore.
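For bookkeeping, the position pairs listed above can be held in a simple lookup structure. The sketch below only restates the pairs from the text; the names `POSITION_PAIRS` and `partners_by_csgf_position` are hypothetical.

```python
# (CsgF position in SEQ ID NO: 60, CsgG position in SEQ ID NO: 59), as listed in the text.
POSITION_PAIRS = [
    (1, 153), (4, 133), (5, 136), (8, 187), (8, 203), (9, 203),
    (11, 142), (11, 201), (12, 149), (12, 203), (26, 191), (29, 144), (30, 196),
]


def partners_by_csgf_position(pairs):
    """Group the CsgG partner positions under each CsgF position."""
    grouped = {}
    for csgf_pos, csgg_pos in pairs:
        grouped.setdefault(csgf_pos, []).append(csgg_pos)
    return grouped


partners = partners_by_csgf_position(POSITION_PAIRS)
csgg_positions = sorted({csgg_pos for _, csgg_pos in POSITION_PAIRS})

print(partners[8])          # CsgG positions paired with CsgF position 8
print(len(csgg_positions))  # number of distinct CsgG positions in the list
```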

共有連結又は結合は、例えば、システイン結合を介して行われ、システインのスルフィドリル側基は、別のアミノ酸残基又は部分と共有連結し、及び／又は非天然（光）反応性アミノ酸間の相互作用を介して行われる。（光）反応性アミノ酸は、タンパク質複合体の架橋に使用できる天然アミノ酸の人工類似体を指し、インビボ又はインビトロでタンパク質及びペプチドに組み込まれてもよい。一般的に使用されている光反応性アミノ酸類似体は、ロイシン及びメチオニンに対する光反応性ジアジリン類似体、並びにパラベンゾイル－フェニル－アラニン、並びにアジドホモアラニン、ホモプロパルギルグリシイン、ホモアレルグリシン、ｐ－アセチル－Ｐｈｅ、ｐ－アジド－Ｐｈｅ、ｐ－プロパルギルオキシ－Ｐｈｅ、及びｐ－ベンゾイル－Ｐｈｅである（Ｗａｎｇｅｔａｌ．２０１２；Ｃｈｉｎｅｔａｌ．２００２）。紫外線に曝されると、それらは活性化され、光反応性アミノ酸類似体の数オングストローム以内にある相互作用するタンパク質に共有結合する。 Covalent linkages or bonds can be formed, for example, via cysteine linkages, in which the sulfhydryl side group of a cysteine is covalently linked to another amino acid residue or moiety, and/or via interactions between non-natural (photo)reactive amino acids. (Photo)reactive amino acids are artificial analogs of natural amino acids that can be used to crosslink protein complexes and may be incorporated into proteins and peptides in vivo or in vitro. Commonly used photoreactive amino acid analogs are photoreactive diazirine analogs of leucine and methionine, as well as para-benzoyl-phenylalanine, azidohomoalanine, homopropargylglycine, homoallylglycine, p-acetyl-Phe, p-azido-Phe, p-propargyloxy-Phe, and p-benzoyl-Phe (Wang et al. 2012; Chin et al. 2002). Upon exposure to UV light, they are activated and bind covalently to interacting proteins that lie within a few angstroms of the photoreactive amino acid analog.

細孔複合体は、作製することができ、ジスルフィド結合の形成は、酸化剤（例えば、銅－オルトフェナントロリン）を使用することで誘導することができる。システイン相互作用の代わりに、他の相互作用（例えば、疎水性相互作用、電荷－電荷相互作用／静電相互作用）もまた、それらの位置で使用することができる。別の実施形態では、非天然アミノ酸もこれらの位置に組み込むことができる。この実施形態では、共有結合は、クリックケミストリーによって作製される。例えば、アジド若しくはアルキンを有するか、又はジベンゾシクロオクチン（ＤＢＣＯ）基及び／又はビシクロ［６．１．０］ノニン（ＢＣＮ）基を有する非天然アミノ酸を、これらの位置のうちの１つ以上に導入することができる。 Pore complexes can be prepared, and disulfide bond formation can be induced using an oxidizing agent (e.g., copper-orthophenanthroline). Instead of cysteine interactions, other interactions (e.g., hydrophobic interactions, charge-charge interactions/electrostatic interactions) can also be used at these positions. In another embodiment, unnatural amino acids can also be incorporated at these positions. In this embodiment, covalent bonds are created by click chemistry. For example, unnatural amino acids bearing azides or alkynes, or bearing dibenzocyclooctyne (DBCO) and/or bicyclo[6.1.0]nonyne (BCN) groups, can be introduced at one or more of these positions.

例えば、ＣｓｇＧ細孔は、補助タンパク質又は融合タンパク質への結合を容易にするように修飾される少なくとも１つ、例えば２、３、４、５、６、７、８、９又は１０個のＣｓｇＧモノマーを含んでもよい。例えば、システイン残基は、配列番号５９の位置１３２、１３３、１３６、１３８、１４０、１４２、１４４、１４５、１４７、１４９、１５１、１５３、１５５、１８３、１８５、１８７、１８９、１９１、２０１、２０３、２０５、２０７及び２０９に対応する位置のうちの１つ以上、及び／又は補助タンパク質若しくは融合タンパク質と接触すると予測される任意の位置に導入されて、補助タンパク質又は融合タンパク質への共有結合を容易にしてもよい。システイン残基を介した共有結合の代替又は追加として、細孔は、疎水性相互作用又は静電相互作用によって安定化されてもよい。かかる相互作用を容易にするために、配列番号５９の位置１３２、１３３、１３６、１３８、１４０、１４２、１４４、１４５、１４７、１４９、１５１、１５３、１５５、１８３、１８５、１８７、１８９、１９１、２０１、２０３、２０５、２０７及び２０９のうちの１つ以上に対応する位置での非天然反応性又は光反応性アミノ酸。 For example, the CsgG pore may comprise at least one, e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10, CsgG monomers modified to facilitate binding to an auxiliary protein or fusion protein. For example, cysteine residues may be introduced at one or more of the positions corresponding to positions 132, 133, 136, 138, 140, 142, 144, 145, 147, 149, 151, 153, 155, 183, 185, 187, 189, 191, 201, 203, 205, 207, and 209 of SEQ ID NO: 59, and/or at any position predicted to contact an auxiliary protein or fusion protein, to facilitate covalent binding to the auxiliary protein or fusion protein. As an alternative or in addition to covalent binding via cysteine residues, the pore may be stabilized by hydrophobic or electrostatic interactions. To facilitate such interactions, a non-natural reactive or photoreactive amino acid may be introduced at a position corresponding to one or more of positions 132, 133, 136, 138, 140, 142, 144, 145, 147, 149, 151, 153, 155, 183, 185, 187, 189, 191, 201, 203, 205, 207, and 209 of SEQ ID NO: 59.

例えば、ＣｓｇＦペプチドは、ＣｓｇＧ細孔への結合を容易にするように修飾されてもよい。例えば、システイン残基は、配列番号６０の位置１、４、５、８、９、１１、１２、２６又は２９に対応する位置のうちの１つ以上、及び／又はＣｓｇＧと接触すると予測される任意の位置に導入されて、ＣｓｇＧへの共有結合を容易にしてもよい。システイン残基を介した共有結合の代替又は追加として、細孔は、疎水性相互作用又は静電相互作用によって安定化されてもよい。かかる相互作用を容易にするために、配列番号６０の位置１、２、３、４、５、８、９、１１、１２、２６又は２９のうちの１つ以上に対応する位置での非天然反応性又は光反応性アミノ酸。 For example, the CsgF peptide may be modified to facilitate binding to the CsgG pore. For example, cysteine residues may be introduced at one or more of the positions corresponding to positions 1, 4, 5, 8, 9, 11, 12, 26, or 29 of SEQ ID NO: 60, and/or at any position predicted to contact CsgG, to facilitate covalent binding to CsgG. As an alternative or in addition to covalent binding via cysteine residues, the pore may be stabilized by hydrophobic or electrostatic interactions. To facilitate such interactions, a non-natural reactive or photoreactive amino acid may be introduced at a position corresponding to one or more of positions 1, 2, 3, 4, 5, 8, 9, 11, 12, 26, or 29 of SEQ ID NO: 60.

このような安定化変異は、補助タンパク質又は融合タンパク質に対する任意の他の修飾、例えば、細孔複合体とポリヌクレオチドの相互作用を改善するための修飾、又は複合体の特定の特性（例えば、ポリヌクレオチドのヌクレオチドなどのポリマーユニットの差別）を改善するための修飾と組み合わせることができる。 Such stabilizing mutations can be combined with any other modifications to the auxiliary protein or fusion protein, such as modifications to improve the interaction of the pore complex with the polynucleotide or to improve a particular property of the complex (e.g., discrimination of polymer units such as nucleotides of the polynucleotide).

いくつかの実施形態では、ナノ細孔は、単離、実質的に単離、精製、又は実質的に精製されてもよい。細孔は、それが脂質又は他の細孔などの任意の他の構成要素を一切含まない場合、単離又は精製される。細孔は、それが、その意図される使用に干渉しない担体又は希釈剤と混合されている場合、実質的に単離されている。例えば、細孔は、それが、１０％未満、５％未満、２％未満又は１％未満の他の構成要素、例えばブロックコポリマー、脂質又は他の細孔を含む形態で存在する場合、実質的に単離されるか、又は実質的に精製される。代替的に、細孔は、膜に存在してもよい。好適な膜は、以下に考察される。 In some embodiments, the nanopore may be isolated, substantially isolated, purified, or substantially purified. A pore is isolated or purified if it is free of any other components, such as lipids or other nanopores. A pore is substantially isolated if it is mixed with a carrier or diluent that does not interfere with its intended use. For example, a pore is substantially isolated or substantially purified if it is present in a form that contains less than 10%, less than 5%, less than 2%, or less than 1% of other components, such as block copolymers, lipids, or other nanopores. Alternatively, the pore may be present in a membrane. Suitable membranes are discussed below.

細孔複合体は、膜において個別の細孔又は単一の細孔として存在してもよい。代替的に、細孔複合体は、２つ以上の細孔の相同又は異種の集団に存在してもよい。 A pore complex may exist as an individual pore or a single pore in the membrane. Alternatively, a pore complex may exist as a homogeneous or heterogeneous population of two or more pores.

補助タンパク質又は融合タンパク質は、膜貫通タンパク質ナノ細孔に直接結合されてもよいか、又は２つのタンパク質（例えば、第１の補助タンパク質及び第２の補助タンパク質、融合タンパク質の第１の部分及び融合タンパク質の第２の部分など）は、化学架橋剤又はペプチドリンカーなどのリンカーを使用して結合されてもよい。 The auxiliary protein or fusion protein may be directly attached to the transmembrane protein nanopore, or two proteins (e.g., a first auxiliary protein and a second auxiliary protein, a first portion of a fusion protein and a second portion of a fusion protein, etc.) may be attached using a linker, such as a chemical crosslinker or a peptide linker.

好適な化学架橋剤は、当該技術分野で周知である。架橋剤の例として、２，５－ジオキソピロリジン－１－イル３－（ピリジン－２－イルジスルファニル）プロパノエート、２，５－ジオキソピロリジン－１－イル４－（ピリジン－２－イルジスルファニル）ブタノエート、及び２，５－ジオキソピロリジン－１－イル８－（ピリジン－２－イルジスルファニル）オクタナノエートが挙げられるが、これらに限定されない。いくつかの実施形態では、架橋剤は、スクシンイミジル３－（２－ピリジルジチオ）プロピオネート（ＳＰＤＰ）である。典型的には、分子は、分子／架橋剤複合体が変異体モノマーと共有結合する前に二官能性架橋剤と共有結合しているが、二官能性架橋剤／モノマー複合体が分子に結合する前に二官能性架橋剤をモノマーと共有結合させることも可能である。いくつかの実施形態では、リンカーは、ジチオスレイトール（ＤＴＴ）に対して耐性がある。追加の好適なリンカーは、ヨードアセトアミド系及びマレイミド系リンカーを含むが、これらに限定されない。 Suitable chemical crosslinkers are well known in the art. Examples of crosslinkers include, but are not limited to, 2,5-dioxopyrrolidin-1-yl 3-(pyridin-2-yldisulfanyl)propanoate, 2,5-dioxopyrrolidin-1-yl 4-(pyridin-2-yldisulfanyl)butanoate, and 2,5-dioxopyrrolidin-1-yl 8-(pyridin-2-yldisulfanyl)octanoate. In some embodiments, the crosslinker is succinimidyl 3-(2-pyridyldithio)propionate (SPDP). Typically, the molecule is covalently attached to the bifunctional crosslinker before the molecule/crosslinker complex is covalently attached to the mutant monomer; however, it is also possible to covalently attach the bifunctional crosslinker to the monomer before the bifunctional crosslinker/monomer complex is attached to the molecule. In some embodiments, the linker is resistant to dithiothreitol (DTT). Additional suitable linkers include, but are not limited to, iodoacetamide-based and maleimide-based linkers.

ペプチドリンカーなどの好適なアミノ酸リンカーは、当該技術分野で既知である。アミノ酸又はペプチドリンカーの長さ、可撓性及び親水性は、典型的には、補助タンパク質又は融合タンパク質が細孔複合体内に狭窄を形成するように設計される。好ましい可撓性ペプチドリンカーは２～２０個、例えば、４、６、８、１０、又は１６個のセリン、及び／又はグリシンアミノ酸のストレッチである。より好ましい可撓性リンカーは、（ＳＧ）_１、（ＳＧ）_２、（ＳＧ）_３、（ＳＧ）_４、（ＳＧ）_５、（ＳＧ）_８、（ＳＧ）_１０、（ＳＧ）_１５又は（ＳＧ）_２０を含み、Ｓは、セリンであり、Ｇは、グリシンである。好ましい剛性リンカーは、２～３０個、例えば、４、６、８、１６又は２４個のプロリンアミノ酸のストレッチである。より好ましい剛性リンカーは、（Ｐ）_１２を含み、Ｐは、プロリンである。 Suitable amino acid linkers, such as peptide linkers, are known in the art. The length, flexibility, and hydrophilicity of the amino acid or peptide linker are typically designed to allow the auxiliary protein or fusion protein to form a constriction within the pore complex. Preferred flexible peptide linkers are stretches of 2 to 20, e.g., 4, 6, 8, 10, or 16, serine and/or glycine amino acids. More preferred flexible linkers include (SG)₁, (SG)₂, (SG)₃, (SG)₄, (SG)₅, (SG)₈, (SG)₁₀, (SG)₁₅, or (SG)₂₀, where S is serine and G is glycine. Preferred rigid linkers are stretches of 2 to 30, e.g., 4, 6, 8, 16, or 24, proline amino acids. More preferred rigid linkers include (P)₁₂, where P is proline.
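The repeat notation used above expands mechanically into one-letter amino acid sequences. The following is a minimal sketch of that expansion; the helper name `repeat_linker` and the dictionary layout are illustrative assumptions, and only the repeat counts come from the text.

```python
def repeat_linker(unit: str, n: int) -> str:
    """Expand a repeat such as (SG)n or (P)n into a one-letter amino acid sequence."""
    if n < 1:
        raise ValueError("repeat count must be at least 1")
    return unit * n


# Repeat counts listed in the text for the more preferred flexible linkers.
FLEXIBLE_REPEATS = (1, 2, 3, 4, 5, 8, 10, 15, 20)

flexible_linkers = {f"(SG){n}": repeat_linker("SG", n) for n in FLEXIBLE_REPEATS}
rigid_linker = repeat_linker("P", 12)  # (P)12

for name, sequence in flexible_linkers.items():
    print(name, len(sequence), sequence)
print("(P)12", len(rigid_linker), rigid_linker)
```

Printing the lengths makes it easy to compare each expanded linker with the residue counts given in the paragraph.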

好適な化学架橋剤には、以下の官能基：マレイミド、活性エステル、スクシンイミド、アジド、アルキン（ジベンゾシクロオクチノール（ＤＩＢＯ又はＤＢＣＯ）、ジフルオロシクロアルキン、及び直鎖アルキンなど）、ホスフィン（トレースレス及び非トレースレスＳｔａｕｄｉｎｇｅｒライゲーションに使用されるものなど）、ハロアセチル（ヨードアセトアミドなど）、ホスゲン型試薬、スルホニルクロリド試薬、イソチオシアネート、ハロゲン化アシル、ヒドラジン、ジスルフィド、ビニルスルホン、アジリジン、並びに光反応性試薬（アリールアジド、ジアジリジンなど）を含むものが含まれるが、これらに限定されない。 Suitable chemical crosslinkers include, but are not limited to, those containing the following functional groups: maleimide, active ester, succinimide, azide, alkyne (such as dibenzocyclooctynol (DIBO or DBCO), difluorocycloalkyne, and linear alkyne), phosphines (such as those used in traceless and non-traceless Staudinger ligation), haloacetyl (such as iodoacetamide), phosgene-type reagents, sulfonyl chloride reagents, isothiocyanates, acyl halides, hydrazine, disulfide, vinyl sulfone, aziridine, and photoreactive reagents (such as aryl azides and diaziridine).

アミノ酸と官能基との間の反応は、システイン／マレイミドなどの自発的であり得るか、又はアジド及び直鎖アルキンを連結するためのＣｕ（Ｉ）などの外部試薬を必要とし得る。 The reaction between the amino acid and the functional group can be spontaneous, such as cysteine/maleimide, or can require an external reagent, such as Cu(I) to link an azide and a linear alkyne.

リンカーは、必要な距離にわたって伸びる任意の分子を含み得る。リンカーは、１つの炭素（ホスゲン型リンカー）から多くのオングストロームまでの長さで変動し得る。リンカー分子の例として、ポリエチレングリコール（ＰＥＧ）、ポリペプチド、多糖類、デオキシリボ核酸（ＤＮＡ）、ペプチド核酸（ＰＮＡ）、トレオース核酸（ＴＮＡ）、グリセロール核酸（ＧＮＡ）、飽和及び不飽和炭化水素、ポリアミドが挙げられるが、これらに限定されない。これらのリンカーは、不活性であっても反応性であってもよく、特に、それらは、定義された位置で化学的に切断可能であってもよく、又はそれ自体がフルオロフォア若しくはリガンドで修飾されていてもよい。リンカーは、好ましくは、ＣｓｇＧ細孔モノマーへの補助タンパク質又は融合タンパク質の共有結合後のジチオスレイトール（ＤＴＴ）に対して耐性があり。 Linkers can include any molecule that spans the required distance. Linkers can vary in length from one carbon (phosgene-type linkers) to many angstroms. Examples of linker molecules include, but are not limited to, polyethylene glycol (PEG), polypeptides, polysaccharides, deoxyribonucleic acid (DNA), peptide nucleic acid (PNA), threose nucleic acid (TNA), glycerol nucleic acid (GNA), saturated and unsaturated hydrocarbons, and polyamides. These linkers can be inert or reactive; in particular, they may be chemically cleavable at defined positions or may themselves be modified with fluorophores or ligands. Linkers are preferably resistant to dithiothreitol (DTT) after covalent attachment of an auxiliary protein or fusion protein to the CsgG pore monomer.

いくつかの実施形態では、好ましい架橋剤は、２，５－ジオキソピロリジン－１－イル３－（ピリジン－２－イルジスルファニル）プロパノエート、２，５－ジオキソピロリジン－１－イル４－（ピリジン－２－イルジスルファニル）ブタノエート、及び２，５－ジオキソピロリジン－１－イル８－（ピリジン－２－イルジスルファニル）オクタナノエート、ジ－マレイミドＰＥＧ１ｋ、ジ－マレイミドＰＥＧ３．４ｋ、ジ－マレイミドＰＥＧ５ｋ、ジ－マレイミドＰＥＧ１０ｋ、ビス（マレイミド）エタン（ＢＭＯＥ）、ビス－マレイミドヘキサン（ＢＭＨ）、１，４－ビス－マレイミドブタン（ＢＭＢ）、１，４ビス－マレイミジル－２，３－ジヒドロキシブタン（ＢＭＤＢ）、ＢＭ［ＰＥＯ］２（１，８－ビス－マレイミドジエチレングリコール）、ＢＭ［ＰＥＯ］３（１，１１－ビス－マレイミドトリエチレングリコール）、トリス［２－マレイミドエチル］アミン（ＴＭＥＡ）、ＤＴＭＥジチオビスマレイミドエタン、ビス－マレイミドＰＥＧ３、ビス－マレイミドＰＥＧ１１、ＤＢＣＯ－マレイミド、ＤＢＣＯ－ＰＥＧ４－マレイミド、ＤＢＣＯ－ＰＥＧ４－ＮＨ２、ＤＢＣＯ－ＰＥＧ４－ＮＨＳ、ＤＢＣＯ－ＮＨＳ、ＤＢＣＯ－ＰＥＧ－ＤＢＣＯ２．８ｋＤａ、ＤＢＣＯ－ＰＥＧ－ＤＢＣＯ４．０ｋＤａ、ＤＢＣＯ－１５原子－ＤＢＣＯ、ＤＢＣＯ－２６原子－ＤＢＣＯ、ＤＢＣＯ－３５原子－ＤＢＣＯ、ＤＢＣＯ－ＰＥＧ４－Ｓ－Ｓ－ＰＥＧ３－ビオチン、ＤＢＣＯ－Ｓ－Ｓ－ＰＥＧ３－ビオチン、ＤＢＣＯ－Ｓ－Ｓ－ＰＥＧ１１－ビオチン、（スクシンイミジル３－（２－ピリジルジチオ）プロピオン酸（ＳＰＤＰ）、及びマレイミド－ＰＥＧ（２ｋＤａ）－マレイミド（アルファ、オメガ－ビス－マレイミドポリ（エチレングリコール））から選択される。いくつかの実施形態では、架橋剤は、マレイミド－プロピル－ＳＲＤＦＷＲＳ－（１，２－ジアミノエタン）－プロピル－マレイミドである。 In some embodiments, preferred crosslinkers are selected from 2,5-dioxopyrrolidin-1-yl 3-(pyridin-2-yldisulfanyl)propanoate, 2,5-dioxopyrrolidin-1-yl 4-(pyridin-2-yldisulfanyl)butanoate, 2,5-dioxopyrrolidin-1-yl 8-(pyridin-2-yldisulfanyl)octanoate, di-maleimide PEG 1k, di-maleimide PEG 3.4k, di-maleimide PEG 5k, di-maleimide PEG 10k, bis(maleimido)ethane (BMOE), bis-maleimidohexane (BMH), 1,4-bis-maleimidobutane (BMB), 1,4-bis-maleimidyl-2,3-dihydroxybutane (BMDB), BM[PEO]2 (1,8-bis-maleimidodiethylene glycol), BM[PEO]3 (1,11-bis-maleimidotriethylene glycol), tris[2-maleimidoethyl]amine (TMEA), DTME (dithiobismaleimidoethane), bis-maleimide PEG3, bis-maleimide PEG11, DBCO-maleimide, DBCO-PEG4-maleimide, DBCO-PEG4-NH2, DBCO-PEG4-NHS, DBCO-NHS, DBCO-PEG-DBCO 2.8 kDa, DBCO-PEG-DBCO 4.0 kDa, DBCO-15 atoms-DBCO, DBCO-26 atoms-DBCO, DBCO-35 atoms-DBCO, DBCO-PEG4-S-S-PEG3-biotin, DBCO-S-S-PEG3-biotin, DBCO-S-S-PEG11-biotin, succinimidyl 3-(2-pyridyldithio)propionate (SPDP), and maleimide-PEG(2 kDa)-maleimide (alpha,omega-bis-maleimido poly(ethylene glycol)). In some embodiments, the crosslinker is maleimide-propyl-SRDFWRS-(1,2-diaminoethane)-propyl-maleimide.

連結されたＣｓｇＧ細孔モノマー及び補助タンパク質又は融合タンパク質は、基間の共有結合の形成を介して結合されてもよい。ＷＯ２０１０／０８６６０２（その全体が参照により本明細書に組み込まれる）に開示される特定のリンカーのいずれかは、使用されてもよい。 The linked CsgG pore monomer and auxiliary protein or fusion protein may be attached via the formation of a covalent bond between the groups. Any of the specific linkers disclosed in WO 2010/086602 (hereby incorporated by reference in its entirety) may be used.

リンカーは、標識されてもよい。好適な標識は、蛍光分子（Ｃｙ３又はＡｌｅｘａＦｌｕｏｒ（登録商標）５５５など）、放射性同位体、例えば、^１２５Ｉ、^３５Ｓ、^３２Ｐ、酵素、抗体、抗原、ポリヌクレオチド、及びビオチンなどのリガンドを含むが、これらに限定されない。かかる標識は、リンカーの量を定量することを可能にする。標識はまた、ビオチンなどの切断可能な精製タグ、又はタンパク質自体に存在しないが、トリプシン消化によって放出されるペプチドなどの同定方法において現れる特定の配列であり得る。 The linker may be labeled. Suitable labels include, but are not limited to, fluorescent molecules (such as Cy3 or AlexaFluor® 555), radioisotopes, e.g., ¹²⁵I, ³⁵S, ³²P, enzymes, antibodies, antigens, polynucleotides, and ligands such as biotin. Such labels allow the amount of linker to be quantified. The label may also be a cleavable purification tag, such as biotin, or a specific sequence that is not present in the protein itself but appears in an identification method, such as a peptide released by trypsin digestion.

細孔モノマー結合体を接続する好ましい方法は、システイン結合を介するものである。これは、二官能性化学架橋剤によって、又は末端に提示されたシステイン残基を有するアミノ酸リンカーによって媒介され得る。 A preferred method of connecting pore monomer conjugates is via a cysteine bond. This can be mediated by a bifunctional chemical crosslinker or by an amino acid linker with a terminally presented cysteine residue.

付着の別の好ましい方法は、４－アジドフェニルアラニン（Ｆａｚ）連結を介するものである。これは、二官能性化学リンカーによって、又は末端に提示されたＦａｚ残基を有するポリペプチドリンカーによって媒介され得る。 Another preferred method of attachment is via a 4-azidophenylalanine (Faz) linkage. This can be mediated by a bifunctional chemical linker or by a polypeptide linker bearing a terminally-disposed Faz residue.

いくつかの実施形態では、リンカーは、硫黄（ＶＩ）フッ化物交換（ＳｕＦＥｘ）反応によって形成される結合である。いくつかの実施形態では、補助タンパク質（例えば、ＣｓｇＦ又はＣｓｇＦの一部）は、適切に近接した場合に求核性アミノ酸（例えば、ＣｓｇＧ細孔モノマーの求核性アミノ酸、別の補助タンパク質の求核性酸など）と反応できるフッ化スルホニル基で官能化してスルホニル結合（ＳｕＦＥＸ）を形成することができる。 In some embodiments, the linker is a bond formed by a sulfur(VI) fluoride exchange (SuFEx) reaction. In some embodiments, an auxiliary protein (e.g., CsgF or a portion of CsgF) can be functionalized with a sulfonyl fluoride group that, when in appropriate proximity, can react with a nucleophilic amino acid (e.g., a nucleophilic amino acid of a CsgG pore monomer, a nucleophilic amino acid of another auxiliary protein, etc.) to form a sulfonyl bond (SuFEx).

補助タンパク質又は融合タンパク質は、膜貫通タンパク質ナノ細孔に遺伝的に融合されてもよい。構築物全体が単一のポリヌクレオチドコード配列から発現される場合、細孔モノマー及び補助タンパク質（又は融合タンパク質）は、遺伝的に融合される。モノマー又はサブユニット、補助タンパク質（又は融合タンパク質）は、膜貫通タンパク質ナノ細孔のモノマー又はサブユニットに直接融合されてもよい。代替的に、モノマー又はサブユニット、補助タンパク質（又は融合タンパク質）は、１つ以上のリンカーを介して膜貫通タンパク質ナノ細孔のモノマー又はサブユニットに融合されてもよい。 The auxiliary protein or fusion protein may be genetically fused to the transmembrane protein nanopore. When the entire construct is expressed from a single polynucleotide coding sequence, the pore monomer and auxiliary protein (or fusion protein) are genetically fused. The monomer or subunit, auxiliary protein (or fusion protein) may be fused directly to the transmembrane protein nanopore monomer or subunit. Alternatively, the monomer or subunit, auxiliary protein (or fusion protein) may be fused to the transmembrane protein nanopore monomer or subunit via one or more linkers.

ＣｓｇＧ細孔モノマー結合体内のＣｓｇＧ細孔モノマーと補助タンパク質若しくは融合タンパク質との間の距離、及び／又はリンカーの長さは、好ましくは、約２．００ｎｍ未満、例えば、約１．９０ｎｍ未満、約１．８０ｎｍ未満、約１．７０ｎｍ未満、約１．６０ｎｍ未満、約１．５０ｎｍ未満、約１．４０ｎｍ未満、約１．３０ｎｍ未満、約１．２０ｎｍ未満、約１．１０ｎｍ未満、約１．００ｎｍ未満、約０．９０ｎｍ未満、約０．８０ｎｍ未満、約０．７０ｎｍ未満、約０．６０ｎｍ未満、約０．５０ｎｍ未満、又は約０．４０ｎｍ未満である。細孔モノマー結合体内のＣｓｇＧ細孔モノマーと補助タンパク質若しくは融合タンパク質との間の距離及び／又はリンカーの長さは、好ましくは、約１．２０ｎｍ未満である。この距離／長さは、以下により詳細に説明されるように、マレイミドヘキサン酸を使用して達成することができる。細孔モノマー結合体内のＣｓｇＧ細孔モノマーと補助タンパク質若しくは融合タンパク質との間の距離及び／又はリンカーの長さは、好ましくは、約０．８ｎｍ未満である。この距離／長さは、以下に説明されるように、マレイミドプロピオン酸を使用して達成することができる。 The distance between the CsgG pore monomer and the auxiliary protein or fusion protein in a CsgG pore monomer conjugate and/or the length of the linker is preferably less than about 2.00 nm, e.g., less than about 1.90 nm, less than about 1.80 nm, less than about 1.70 nm, less than about 1.60 nm, less than about 1.50 nm, less than about 1.40 nm, less than about 1.30 nm, less than about 1.20 nm, less than about 1.10 nm, less than about 1.00 nm, less than about 0.90 nm, less than about 0.80 nm, less than about 0.70 nm, less than about 0.60 nm, less than about 0.50 nm, or less than about 0.40 nm. The distance between the CsgG pore monomer and the auxiliary protein or fusion protein in a pore monomer conjugate and/or the length of the linker is preferably less than about 1.20 nm. This distance/length can be achieved using maleimidohexanoic acid, as described in more detail below. The distance and/or linker length between the CsgG pore monomer and the auxiliary protein or fusion protein in the pore monomer conjugate is preferably less than about 0.8 nm. This distance/length can be achieved using maleimidopropionic acid, as described below.

細孔モノマー結合体内のＣｓｇＧ細孔モノマーと補助タンパク質若しくは融合タンパク質との間の距離及び／又はリンカーの長さは、好ましくは、約０．４０ｎｍ～約２．０ｎｍ、例えば、約０．４５ｎｍ～約１．９０ｎｍ、約０．５０ｎｍ～約１．８０ｎｍ、約０．５５ｎｍ～約１．７ｎｍ、約０．６０ｎｍ～約１．６ｎｍ、約０．６５ｎｍ～約１．５ｎｍ、約０．７ｎｍ～約１．４ｎｍ、約０．７５ｎｍ～約１．３ｎｍ、約０．８０ｎｍ～約１．２ｎｍ、約０．８５ｎｍ～約１．１ｎｍ及び約０．９０ｎｍ～約１．００ｎｍである。細孔モノマー結合体内のＣｓｇＧ細孔モノマーと補助タンパク質若しくは融合タンパク質との間の距離及び／又はリンカーの長さは、好ましくは、約０．５０ｎｍ～約１．５０ｎｍである。細孔モノマー結合体内のＣｓｇＧ細孔モノマーと補助タンパク質若しくは融合タンパク質との間の距離及び／又はリンカーの長さは、好ましくは、約０．６０ｎｍ～約１．２ｎｍである。この距離／長さは、以下に説明される特定のマレイミド含有リンカーのいずれかを使用して達成することができる。 The distance and/or linker length between the CsgG pore monomer and the auxiliary protein or fusion protein in the pore monomer conjugate is preferably about 0.40 nm to about 2.0 nm, e.g., about 0.45 nm to about 1.90 nm, about 0.50 nm to about 1.80 nm, about 0.55 nm to about 1.7 nm, about 0.60 nm to about 1.6 nm, about 0.65 nm to about 1.5 nm, about 0.7 nm to about 1.4 nm, about 0.75 nm to about 1.3 nm, about 0.80 nm to about 1.2 nm, about 0.85 nm to about 1.1 nm, and about 0.90 nm to about 1.00 nm. The distance and/or linker length between the CsgG pore monomer and the auxiliary protein or fusion protein in the pore monomer conjugate is preferably about 0.50 nm to about 1.50 nm. The distance and/or linker length between the CsgG pore monomer and the auxiliary protein or fusion protein in the pore monomer conjugate is preferably about 0.60 nm to about 1.2 nm. This distance/length can be achieved using any of the specific maleimide-containing linkers described below.
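The nested ranges above can be checked programmatically when comparing candidate linkers. The sketch below is illustrative only: the function name `narrowest_range` is hypothetical, the bounds are treated as exact and inclusive even though the text qualifies each with "about", and the example lengths are made up.

```python
# (lower, upper) bounds in nm, copied from the ranges listed above, widest first.
NESTED_RANGES_NM = [
    (0.40, 2.00), (0.45, 1.90), (0.50, 1.80), (0.55, 1.70), (0.60, 1.60),
    (0.65, 1.50), (0.70, 1.40), (0.75, 1.30), (0.80, 1.20), (0.85, 1.10),
    (0.90, 1.00),
]


def narrowest_range(length_nm):
    """Return the tightest listed range containing length_nm, or None if outside all."""
    match = None
    for low, high in NESTED_RANGES_NM:
        # Ranges are nested, so the last one that still contains the value is the tightest.
        if low <= length_nm <= high:
            match = (low, high)
    return match


for example_length in (0.95, 1.25, 2.5):  # made-up example lengths in nm
    print(example_length, narrowest_range(example_length))
```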

マレイミド含有リンカーは、本明細書に記載の構築物を参照して以下に説明されるリンカーのいずれであってもよい。マレイミド含有リンカーは、好ましくは、マレイミド基及び２、３、４、５、６又はそれ以上の炭素原子の直鎖炭素鎖を含むか、又はそれらからなる。直鎖炭素鎖は、典型的には、マレイミド基内の窒素原子に結合される。直鎖炭素鎖はまた、好ましくは、末端カルボキシル基を含む。このカルボキシル基は、補助タンパク質又は融合タンパク質内のアミノ酸とアミド結合を形成することができる。リンカーは、好ましくは、マレイミド酢酸、マレイミドプロピオン酸、マレイミド酪酸、マレイミドペンタン酸又はマレイミドヘキサン酸である。リンカーは、最も好ましくは、マレイミドプロピオン酸である。このリンカーは、図１５に示される。 The maleimide-containing linker may be any of the linkers described below with reference to the constructs described herein. The maleimide-containing linker preferably comprises or consists of a maleimide group and a linear carbon chain of 2, 3, 4, 5, 6, or more carbon atoms. The linear carbon chain is typically attached to a nitrogen atom in the maleimide group. The linear carbon chain also preferably comprises a terminal carboxyl group. This carboxyl group is capable of forming an amide bond with an amino acid in the auxiliary protein or fusion protein. The linker is preferably maleimidoacetic acid, maleimidopropionic acid, maleimidobutyric acid, maleimidopentanoic acid, or maleimidohexanoic acid. The linker is most preferably maleimidopropionic acid. This linker is shown in Figure 15.

本開示はまた、補助タンパク質又は融合タンパク質に共有結合されたＣｓｇＧ細孔モノマーを含む細孔モノマー結合体を提供し、補助タンパク質又は融合タンパク質は、チオール反応性基を含むリンカーによってＣｓｇＧ細孔モノマー内のシステイン残基に共有結合される。チオール反応性基は、マレイミド基、ピリジルジチオ基、ハロゲノ基、パラフルオロ基、エン基、イン基、ビニルスルホン基又はチオスルホン基であってもよい。これらの基は、図１６に示される。チオール反応性基を含むリンカーは、本開示の構築物を参照して以下に説明されるリンカーのいずれであってもよい。リンカーは、好ましくは、チオール反応性基及び２、３、４、５、６又はそれ以上の炭素原子の直鎖炭素鎖を含むか、又はそれらからなる。直鎖炭素鎖はまた、好ましくは、末端カルボキシル基を含む。このカルボキシル基は、補助タンパク質又は融合タンパク質内のアミノ酸とアミド結合を形成することができる。リンカーは、マレイミドが異なるチオール反応性基で置換される状態で、以上に説明される特定のマレイミド含有リンカーのいずれであってもよい。チオール反応性基を含有するリンカーは、以上に説明される長さのいずれであってもよい。 The present disclosure also provides a pore monomer conjugate comprising a CsgG pore monomer covalently linked to an auxiliary protein or fusion protein, where the auxiliary protein or fusion protein is covalently linked to a cysteine residue in the CsgG pore monomer by a linker comprising a thiol-reactive group. The thiol-reactive group may be a maleimide group, a pyridyldithio group, a halogeno group, a parafluoro group, an ene group, an yne group, a vinylsulfone group, or a thiosulfone group. These groups are shown in Figure 16. The linker comprising a thiol-reactive group may be any of the linkers described below with reference to the constructs of the present disclosure. The linker preferably comprises or consists of a thiol-reactive group and a linear carbon chain of 2, 3, 4, 5, 6, or more carbon atoms. The linear carbon chain also preferably comprises a terminal carboxyl group that can form an amide bond with an amino acid in the auxiliary protein or fusion protein. The linker may be any of the specific maleimide-containing linkers described above, with the maleimide being replaced with a different thiol-reactive group. The linker containing the thiol-reactive group may be any of the lengths described above.

適切な連結基は、従来のモデリング技術を使用して設計されてもよい。リンカーは、典型的には、モノマー又はサブユニットをそれらのそれぞれのタンパク質オリゴマーに組み立てることを可能にし、かつそれらの共通の対称軸に沿って整列して細孔複合体内の連続チャネルを生成するのに十分な可撓性を有する。 Suitable linking groups may be designed using conventional modeling techniques. The linker typically has sufficient flexibility to allow the monomers or subunits to assemble into their respective protein oligomers and align along their common axis of symmetry to generate continuous channels within the pore complex.

補助タンパク質の同定と選択 Identification and Selection of Auxiliary Proteins
本開示の態様は、タンパク質細孔複合体（例えば、ＣｓｇＧナノ細孔を含むタンパク質細孔複合体）に含めるための補助タンパク質及び／又は融合タンパク質を設計及び／又は選択するコンピュータベースの方法に関する。いくつかの実施形態では、方法は、アミノ酸配列（例えば、ＣｓｇＦアミノ酸配列）を、タンパク質バックボーン配列選択技術を実施し、アミノ酸配列を処理してバックボーンアミノ酸配列を出力として生成するコードを含むソフトウェアへの入力として提供することを含む。いくつかの実施形態では、タンパク質バックボーン選択技術は、ＭＡＳＴＥＲ（例えば、Ｚｈｏｕ及びＧｒｉｇｏｒｙａｎ，ＰｒｏｔｅｉｎＳｃｉ．２０１５Ａｐｒ；２４（４）：５０８－５２４に記載されるように、その全ての内容が参照により本明細書に組み込まれる）であってもよい。いくつかの実施形態では、タンパク質バックボーン選択技術は、既知のタンパク質バックボーン構造（例えば、タンパク質データバンク、ＰＤＢに記載されるように）から、１つ以上の標的特性（例えば、１つ以上のヘリックス領域を形成する能力、タンパク質細孔の１つ以上のヘリックス領域を詰め込む能力など）を有するタンパク質バックボーン構造を選択することを含む。いくつかの実施形態では、バックボーン構造は、タンパク質配列設計及び構造予測技術を実施し、かつバックボーン構造を処理して１つ以上のｄｅｎｏｖｏ設計されたペプチド配列を生成するコードを含むソフトウェアへの入力として提供される。いくつかの実施形態では、タンパク質配列設計及び構造予測技術は、Ｒｏｓｅｔｔａ（例えば、Ｌｅａｖｅｒ－Ｆａｙｅｔａｌ．Ｃｈａｐｔｅｒｎｉｎｅｔｅｅｎ－Ｒｏｓｅｔｔａ３：ＡｎＯｂｊｅｃｔ－ＯｒｉｅｎｔｅｄＳｏｆｔｗａｒｅＳｕｉｔｅｆｏｒｔｈｅＳｉｍｕｌａｔｉｏｎａｎｄＤｅｓｉｇｎｏｆＭａｃｒｏｍｏｌｅｃｕｌｅｓ，ＭｅｔｈｏｄｓｉｎＥｎｚｙｍｏｌｏｇｙ，ＡｃａｄｅｍｉｃＰｒｅｓｓ，Ｖｏｌｕｍｅ４８７，２０１１，ｐａｇｅｓ５４５－５７４，ｄｏｉ．ｏｒｇ／１０．１０１６／Ｂ９７８－０－１２－３８１２７０－４．０００１９－６に記載されるように、その全ての内容が参照により本明細書に組み込まれる）であってもよい。いくつかの実施形態では、ｄｅｎｏｖｏ設計されたペプチド配列は、バックボーンアミノ酸配列の１つ以上の所望の特性と同じである１つ以上の標的特性を含む。 Aspects of the present disclosure relate to computer-based methods for designing and/or selecting auxiliary proteins and/or fusion proteins for inclusion in a protein pore complex (e.g., a protein pore complex comprising a CsgG nanopore). In some embodiments, the method comprises providing an amino acid sequence (e.g., a CsgF amino acid sequence) as input to software comprising code that performs a protein backbone sequence selection technique and processes the amino acid sequence to generate a backbone amino acid sequence as output. In some embodiments, the protein backbone selection technique may be MASTER (e.g., as described in Zhou and Grigoryan, Protein Sci. 2015 Apr;24(4):508-524, the entire contents of which are incorporated herein by reference). In some embodiments, the protein backbone selection technique comprises selecting, from known protein backbone structures (e.g., as deposited in the Protein Data Bank, PDB), a protein backbone structure that has one or more target properties (e.g., the ability to form one or more helical regions, the ability to pack against one or more helical regions of a protein pore, etc.). In some embodiments, the backbone structure is provided as input to software comprising code that performs a protein sequence design and structure prediction technique and processes the backbone structure to generate one or more de novo designed peptide sequences. In some embodiments, the protein sequence design and structure prediction technique may be Rosetta (e.g., as described in Leaver-Fay et al., Chapter nineteen - Rosetta3: An Object-Oriented Software Suite for the Simulation and Design of Macromolecules, Methods in Enzymology, Academic Press, Volume 487, 2011, pages 545-574, doi.org/10.1016/B978-0-12-381270-4.00019-6, the entire contents of which are incorporated herein by reference). In some embodiments, the de novo designed peptide sequence has one or more target properties that are the same as one or more desired properties of the backbone amino acid sequence.
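The backbone selection step, choosing known backbones that have a target property such as the ability to form helical regions, can be pictured with a toy filter. The sketch below does not use MASTER or Rosetta and does not reproduce their interfaces; the `Backbone` class, the DSSP-style secondary-structure strings, the candidate names, and the 0.8 helical-fraction threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backbone:
    name: str                 # hypothetical identifier for a known backbone structure
    secondary_structure: str  # one letter per residue; 'H' marks helix (assumed DSSP-style)


def helical_fraction(backbone: Backbone) -> float:
    """Fraction of residues annotated as helical."""
    ss = backbone.secondary_structure
    return ss.count("H") / len(ss) if ss else 0.0


def select_backbones(candidates, min_helical_fraction=0.8):
    """Keep backbones whose helical fraction meets the (assumed) target threshold."""
    return [b for b in candidates if helical_fraction(b) >= min_helical_fraction]


# Made-up candidates standing in for backbone structures drawn from a structure database.
candidates = [
    Backbone("candidate_a", "HHHHHHHHHH"),
    Backbone("candidate_b", "HHHHCCEEEE"),
    Backbone("candidate_c", "CHHHHHHHHH"),
]

selected = select_backbones(candidates)
print([b.name for b in selected])
```

In a real workflow the selected backbones would then be passed to a sequence design and structure prediction tool; that step is deliberately left out here because its interface is not described in the text.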

ナノ細孔複合体の生成方法 Methods for Producing Nanopore Complexes
補助タンパク質又は融合タンパク質及び膜貫通タンパク質ナノ細孔を含む細孔複合体は、一実施形態では、共発現によって生成することができる。いくつかの実施形態では、方法は、細孔モノマー及び補助タンパク質若しくは融合タンパク質、又は補助タンパク質若しくはモノマーの両方を適切な宿主細胞において発現させるステップと、複合体細孔のインビボ形成を可能にするステップと、を含む。この実施形態では、１つのベクター内の細孔モノマーをコードする少なくとも１つの遺伝子と、第２のベクター内の補助タンパク質若しくは融合タンパク質又は少なくとも１つの補助タンパク質サブユニット若しくはモノマーをコードする遺伝子とを共に形質転換してタンパク質を発現させて形質転換細胞内の複合体を生成してもよい。これは、好ましくは、エクスビボ又はインビトロで行われる。代替的に、細孔モノマー及び補助タンパク質（又は融合タンパク質）又はそのサブユニットをコードする２つの遺伝子を、単一のプロモーターの制御下又は同じであっても異なってもよい２つの別個のプロモーターの制御下で１つのベクター内に配置することができる。 Pore complexes comprising an auxiliary protein or fusion protein and a transmembrane protein nanopore can be produced, in one embodiment, by co-expression. In some embodiments, the method comprises expressing both the pore monomer and the auxiliary protein or fusion protein (or an auxiliary protein subunit or monomer) in a suitable host cell and allowing the complex pore to form in vivo. In this embodiment, at least one gene encoding the pore monomer in one vector and a gene encoding the auxiliary protein or fusion protein, or at least one auxiliary protein subunit or monomer, in a second vector may be co-transformed to express the proteins and produce the complex in the transformed cell. This is preferably performed ex vivo or in vitro. Alternatively, the two genes encoding the pore monomer and the auxiliary protein (or fusion protein) or subunit thereof can be placed in one vector under the control of a single promoter or under the control of two separate promoters, which may be the same or different.

補助タンパク質又は融合タンパク質及び膜貫通タンパク質ナノ細孔によって形成される細孔複合体を生成する別の方法は、官能的な細孔を得るためのタンパク質のインビトロ再構成である。いくつかの実施形態では、方法は、適切なシステムにおいて、膜貫通タンパク質ナノ細孔のモノマーを補助タンパク質（又は融合タンパク質）、又は補助タンパク質サブユニット若しくはモノマーと接触させて複合体の形成を可能にするステップを含む。前記システムは、「インビトロシステム」であってもよく、これは、本方法を実行するために少なくとも必要な構成要素及び環境を含み、それらの通常の天然に存在する環境の外部の生体分子、生物、細胞（又は細胞の一部）を利用し、生物全体で行うことができるよりも詳細で、より便宜的、又はより効率的な分析を可能にするシステムを指す。インビトロシステムはまた、試験管内に提供された好適な緩衝液組成物を含んでもよく、複合体を形成するための前記タンパク質構成要素が添加される。当業者は、前記システムを提供するためのオプションを知っている。 Another method for generating a pore complex formed by an auxiliary protein or fusion protein and a transmembrane protein nanopore is in vitro reconstitution of the proteins to obtain a functional pore. In some embodiments, the method comprises contacting a monomer of the transmembrane protein nanopore with an auxiliary protein (or fusion protein), or an auxiliary protein subunit or monomer, in a suitable system to allow complex formation. The system may be an "in vitro system," which refers to a system that includes at least the components and environment necessary to carry out the method, utilizes biomolecules, organisms, or cells (or portions of cells) outside their normal, naturally occurring environment, and allows for more detailed, convenient, or efficient analysis than can be performed with whole organisms. An in vitro system may also include a suitable buffer composition provided in a test tube, to which the protein components for complex formation are added. Those skilled in the art will be aware of options for providing such a system.

この実施形態では、ナノ細孔は、補助タンパク質又は融合タンパク質とは別にモノマー（複数可）を発現させることで生成されてもよい。細孔モノマー又はナノ細孔は、少なくとも１つの細孔モノマーをコードするベクター又は細孔モノマーをそれぞれ発現する２つ以上のベクターで形質転換される細胞から精製されてもよい。補助タンパク質又は融合タンパク質は、少なくとも１つの補助タンパク質又は融合タンパク質をコードするベクターで形質転換される細胞から精製されてもよい。次に、精製された細孔モノマー（複数可）／ナノ細孔を補助タンパク質又は融合タンパク質とともにインキュベートして細孔複合体を生成してもよい。 In this embodiment, the nanopore may be generated by expressing the monomer(s) separately from the auxiliary protein or fusion protein. The pore monomer or nanopore may be purified from cells transformed with a vector encoding at least one pore monomer or two or more vectors expressing pore monomers, respectively. The auxiliary protein or fusion protein may be purified from cells transformed with a vector encoding at least one auxiliary protein or fusion protein. The purified pore monomer(s)/nanopore may then be incubated with the auxiliary protein or fusion protein to generate a pore complex.

別の実施形態では、ナノ細孔モノマー（複数可）及び／又は補助タンパク質若しくは融合タンパク質は、インビトロ翻訳及び転写（ＩＶＴＴ）によって別々に生成される。次に、ナノ細孔モノマー（複数可）を補助タンパク質又は融合タンパク質とともにインキュベートして細孔複合体を生成してもよい。 In another embodiment, the nanopore monomer(s) and/or the auxiliary protein or fusion protein are produced separately by in vitro translation and transcription (IVTT). The nanopore monomer(s) may then be incubated with the auxiliary protein or fusion protein to produce the pore complex.

上記の実施形態は、例えば、（ｉ）ナノ細孔がインビボで生成され、補助タンパク質又は融合タンパク質がインビボで生成され、（ｉｉ）ナノ細孔がインビトロで生成され、補助タンパク質又は融合タンパク質がインビボで生成され、（ｉｉｉ）ナノ細孔がインビボで生成され、補助タンパク質又は融合タンパク質がインビトロで生成され、又は、（ｉｖ）ナノ細孔がインビトロで生成され、補助タンパク質又は融合タンパク質がインビトロで生成される、ように組み合わせられてもよい。 The above embodiments may be combined, for example, (i) the nanopore is generated in vivo and the auxiliary protein or fusion protein is generated in vivo; (ii) the nanopore is generated in vitro and the auxiliary protein or fusion protein is generated in vivo; (iii) the nanopore is generated in vivo and the auxiliary protein or fusion protein is generated in vitro; or (iv) the nanopore is generated in vitro and the auxiliary protein or fusion protein is generated in vitro.
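The four combinations (i) to (iv) above are simply every pairing of production route for the two components, which can be enumerated as follows. The variable names and dictionary keys are illustrative assumptions.

```python
from itertools import product

SOURCES = ("in vivo", "in vitro")

# Unpacking order (auxiliary first, nanopore second) is chosen so that the
# enumeration follows the order (i) to (iv) used in the text.
combinations = [
    {"nanopore": nanopore, "auxiliary_or_fusion_protein": auxiliary}
    for auxiliary, nanopore in product(SOURCES, repeat=2)
]

for label, combo in zip(("i", "ii", "iii", "iv"), combinations):
    print(f"({label}) nanopore {combo['nanopore']}, "
          f"auxiliary/fusion protein {combo['auxiliary_or_fusion_protein']}")
```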

One or both of the nanopore monomer and the auxiliary protein or fusion protein may be tagged to facilitate purification. Purification can also be performed when the nanopore monomer and/or the auxiliary protein or fusion protein are untagged. Methods known in the art (e.g., ion exchange, gel filtration, or hydrophobic interaction column chromatography) can be used alone or in different combinations to purify the components of the pore complex.

Any known tag can be used on either of the two proteins. In one embodiment, two-tag purification can be used to separate the pore complex from its constituent parts. For example, a Strep tag can be used on the nanopore and a His tag on the auxiliary protein (or fusion protein), or vice versa. Alternatively, the two proteins can be purified individually, mixed together, and then subjected to Strep and His purification again, which gives a similar end result.
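The logic of two-tag purification can be illustrated with a small sketch. It is an idealized model, not a protocol: each affinity step is assumed to retain only species carrying the matching tag, with no carry-over or complex dissociation, and the names `Species` and `affinity_step` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    """A component of the mixture and the affinity tags it carries."""
    name: str
    tags: frozenset


def affinity_step(mixture, tag):
    """Idealized affinity step: keep only species that carry `tag`."""
    return [s for s in mixture if tag in s.tags]


# Hypothetical mixture after incubating a Strep-tagged nanopore
# with a His-tagged auxiliary protein.
mixture = [
    Species("free nanopore", frozenset({"Strep"})),
    Species("free auxiliary protein", frozenset({"His"})),
    Species("pore complex", frozenset({"Strep", "His"})),
]

# Sequential Strep then His steps leave only the species carrying both tags.
after_strep = affinity_step(mixture, "Strep")
after_both = affinity_step(after_strep, "His")
purified_names = [s.name for s in after_both]

# Reversing the order gives the same result in this idealized model.
reversed_names = [
    s.name for s in affinity_step(affinity_step(mixture, "His"), "Strep")
]
print(purified_names)
```

In this model only the assembled complex survives both steps, which is why the order of the two tags (or of the two steps) does not matter.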

The pore complex can be generated before insertion into the membrane or after insertion of the nanopore into the membrane. Alternatively, the nanopore can be inserted into the membrane and the auxiliary protein (or fusion protein) added afterwards, so that the pore complex forms in situ. For example, in one embodiment, in systems where the trans or cis side of the membrane is accessible (e.g., in a chip or chamber for electrophysiological measurements), the nanopore can be inserted into the membrane and the auxiliary protein (or fusion protein) then added from the trans or cis side of the membrane so that the complex forms in situ.

In one embodiment, the auxiliary protein includes a protease cleavage site (e.g., a TEV site, an HRV 3C site, or any other protease cleavage site) and may be cleaved before or after association with the nanopore. For example, a full-length auxiliary protein (or fusion protein) may be used to form the pore. Amino acid residues that do not form part of the channel structure and are not required for interaction with the transmembrane pore may be cleaved from the auxiliary protein or fusion protein. In this embodiment, a protease is used to cleave the auxiliary protein or fusion protein once the pore complex is formed. Alternatively, a protease may be used to generate the auxiliary protein or fusion protein prior to assembly of the pore complex.

Some protease sites leave an additional tag (or a portion thereof, e.g., one or more amino acids of the tag) after cleavage. For example, the TEV protease cleavage sequence is ENLYFQS. The TEV protease cleaves the protein between Q and S, leaving ENLYFQ intact at the C-terminus of the CsgF peptide. As another example, the HRV 3C cleavage site is LEVLFQGP; the enzyme cleaves between Q and G, leaving LEVLFQ intact at the C-terminus of the CsgF peptide.
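The residual residues described above can be worked through with a short sketch. The recognition sequences and cut positions are those stated in the text; the function name `cleave` and the example peptides (an arbitrary N-terminal stretch followed by a site and a hexahistidine tail) are hypothetical and serve only as illustration.

```python
# Recognition sequence and the number of site residues that stay on the
# N-terminal product (the cut falls after Q in both cases).
CLEAVAGE_SITES = {
    "TEV": ("ENLYFQS", 6),      # ENLYFQ | S
    "HRV 3C": ("LEVLFQGP", 6),  # LEVLFQ | GP
}


def cleave(sequence, protease):
    """Return (N-terminal product, C-terminal product) after one cut.

    If the site is absent, the sequence is returned uncut with an empty
    C-terminal product.
    """
    site, cut_offset = CLEAVAGE_SITES[protease]
    start = sequence.find(site)
    if start == -1:
        return sequence, ""
    cut = start + cut_offset
    return sequence[:cut], sequence[cut:]


# Hypothetical peptides, not sequences from the disclosure.
tev_products = cleave("MKTAYENLYFQSHHHHHH", "TEV")
hrv_products = cleave("MKTAYLEVLFQGPHHHHHH", "HRV 3C")
print(tev_products)
print(hrv_products)
```

In both cases the N-terminal product ends with six residues of the recognition site (ENLYFQ or LEVLFQ), which is the leftover "tag" the paragraph refers to.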

The protein may be chemically modified with a molecular adaptor that facilitates interaction between the pore containing the monomer and the target nucleotide or target polynucleotide sequence. Suitable adaptors, including cyclic molecules, cyclodextrins, species capable of hybridization, DNA binders or intercalators, peptides or peptide analogs, synthetic polymers, aromatic planar molecules, small positively charged molecules, and small molecules capable of hydrogen bonding, are described in WO 2019/002893, which is incorporated herein by reference in its entirety. The molecular adaptor may be attached using any of the methods and linkers described above.

The protein may be attached to a polynucleotide binding protein, forming a modular sequencing system. Polynucleotide binding proteins are discussed below. The protein may be covalently attached to the monomer using any method known in the art. The monomer and protein may be chemically fused or genetically fused. Genetic fusion of a monomer to a polynucleotide binding protein is discussed in WO 2010/004265, which is incorporated herein by reference in its entirety. The polynucleotide binding protein may be attached via a cysteine linkage using any of the methods described above.

The polynucleotide binding protein may be attached directly to the protein via one or more linkers. The molecule may be attached to the CsgG pore monomer using a hybridization linker as described in WO 2010/086602 (incorporated herein by reference in its entirety). Alternatively, a peptide linker may be used. Suitable peptide linkers are described above.

Any of the proteins can be produced using standard methods known in the art. Polynucleotide sequences encoding the proteins can be obtained and replicated using standard methods in the art. Polynucleotide sequences encoding the proteins can be expressed in bacterial host cells using standard techniques in the art. Proteins can be produced in a cell by in situ expression of the polypeptide from a recombinant expression vector. The expression vector optionally carries an inducible promoter to control expression of the polypeptide. These methods are described in Sambrook, J. and Russell, D. (2001) Molecular Cloning: A Laboratory Manual, 3rd Edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

Proteins can be produced on a large scale following purification from protein-producing organisms by any protein liquid chromatography system, or after recombinant expression. Typical protein liquid chromatography systems include FPLC, AKTA systems, the Bio-Cad system, the Bio-Rad BioLogic system, and the Gilson HPLC system.

Systems
In another aspect, the present disclosure relates to a system for characterizing a target polynucleotide, the system comprising a membrane and a pore complex, the pore complex comprising (i) a nanopore located in the membrane, and (ii) an auxiliary protein or fusion protein bound to the nanopore, wherein the nanopore and the auxiliary protein or fusion protein together form a continuous channel across the membrane, the channel comprising a first constriction region and a second constriction region.

The pore complex, nanopore, and auxiliary protein or fusion protein may be any of those described herein above.

In one embodiment, the system further comprises a first chamber and a second chamber, the first and second chambers being separated by the membrane. When used to characterize a target polynucleotide, the system may further comprise the target polynucleotide, which is transiently located within the continuous channel, with one end of the target polynucleotide located in the first chamber and the other end located in the second chamber.

In one embodiment, the system further includes a conductive solution in contact with the nanopore, electrodes that provide a voltage potential across the membrane, and a measurement system that measures the current through the nanopore. In one embodiment, the voltage applied across the membrane and pore complex is from +5 V to -5 V, e.g., from -600 mV to +600 mV or from -400 mV to +400 mV. The voltage used is preferably in the range of 100 mV to 240 mV, more preferably in the range of 120 mV to 220 mV. Using an increased applied potential can increase the discrimination between different nucleotides per pore. Any suitable conductive solution may be used. For example, the solution may include a charge carrier such as a metal salt, e.g., an alkali metal salt, or a halide salt, e.g., a chloride salt such as an alkali metal chloride salt. The charge carrier may include an ionic liquid or an organic salt, e.g., tetramethylammonium chloride, trimethylphenylammonium chloride, phenyltrimethylammonium chloride, or 1-ethyl-3-methylimidazolium chloride. In an exemplary system, the salt is present in the aqueous solution within the chamber. Potassium chloride (KCl), sodium chloride (NaCl), cesium chloride (CsCl), or a mixture of potassium ferrocyanide and potassium ferricyanide is typically used. KCl, NaCl, and a mixture of potassium ferrocyanide and potassium ferricyanide are preferred. The charge carriers can be asymmetric across the membrane. For example, the type and/or concentration of the charge carriers can differ on each side of the membrane, e.g., in each chamber.

The salt concentration may be at saturation. The salt concentration may be 3 M or lower and is typically 0.1 to 2.5 M, 0.3 to 1.9 M, 0.5 to 1.8 M, 0.7 to 1.7 M, 0.9 to 1.6 M, or 1 M to 1.4 M. The salt concentration is preferably 150 mM to 1 M. The method is preferably carried out using a salt concentration of at least 0.3 M, e.g., at least 0.4 M, at least 0.5 M, at least 0.6 M, at least 0.8 M, at least 1.0 M, at least 1.5 M, at least 2.0 M, at least 2.5 M, or at least 3.0 M. High salt concentrations provide a high signal-to-noise ratio and allow currents indicative of the presence of a nucleotide to be identified against the background of normal current fluctuations.

A buffer may be present in the conductive solution. Typically, the buffer is a phosphate buffer. Other suitable buffers are HEPES and Tris-HCl buffers. The pH of the conductive solution may be 4.0 to 12.0, 4.5 to 10.0, 5.0 to 9.0, 5.5 to 8.8, 6.0 to 8.7, 7.0 to 8.8, or 7.5 to 8.5. The pH used is preferably about 6.9.
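The voltage, salt, and pH ranges given above can be collected into a simple check of proposed run conditions. The following is only a sketch: the numeric bounds are copied from the broadest and the preferred ranges stated in the text, while the function and dictionary names (`check_conditions`, `BROAD`, `PREFERRED`) are hypothetical, and the example values are arbitrary.

```python
# Bounds taken from the ranges stated in the description.
BROAD = {
    "voltage_mV": (-5000.0, 5000.0),  # +5 V to -5 V
    "salt_M": (0.0, 3.0),             # "3 M or lower" (saturation not modeled)
    "pH": (4.0, 12.0),
}
PREFERRED = {
    "voltage_mV": (100.0, 240.0),
    "salt_M": (0.15, 1.0),            # 150 mM to 1 M
}


def in_range(value, bounds):
    """True if value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def check_conditions(voltage_mV, salt_M, pH):
    """Report, per parameter, whether it falls in the broad/preferred range.

    'preferred' is None when no preferred range is encoded here.
    """
    values = {"voltage_mV": voltage_mV, "salt_M": salt_M, "pH": pH}
    report = {}
    for name, value in values.items():
        preferred = PREFERRED.get(name)
        report[name] = {
            "broad": in_range(value, BROAD[name]),
            "preferred": in_range(value, preferred) if preferred else None,
        }
    return report


# Arbitrary example conditions.
report = check_conditions(voltage_mV=180.0, salt_M=0.5, pH=6.9)
print(report)
```

A check like this is deliberately coarse; it does not capture interactions between parameters, such as the effect of salt concentration on signal-to-noise.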

The system may include an array of pore complexes present in membranes. In a preferred embodiment, each membrane in the array includes one pore complex. Because of the manner in which the array is formed, the array may, for example, include one or more membranes that do not include a pore complex and/or one or more membranes that include two or more pore complexes. The array may include from about 2 to about 12,000 membranes, e.g., from about 10 to about 800, from about 20 to about 600, from about 30 to about 500, from about 250 to about 2000, from about 500 to about 4000, from about 1000 to about 5000, from about 2500 to about 10,000, or from about 5000 to about 12,000 membranes. In some embodiments, the array includes more than 12,000 membranes.
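One way to see why an array may contain empty and multiply occupied membranes is to model pore insertion as random. The sketch below assumes, purely for illustration, that the number of pore complexes per membrane follows a Poisson distribution; that statistical model is an assumption and is not stated in the disclosure, and `occupancy_fractions` is a hypothetical helper.

```python
import math


def occupancy_fractions(mean_pores_per_membrane):
    """Fractions of membranes with 0, 1, or >=2 pore complexes.

    Assumes independent, random insertion (Poisson statistics), which is
    an illustrative assumption rather than a property of any real array.
    """
    lam = mean_pores_per_membrane
    empty = math.exp(-lam)
    single = lam * math.exp(-lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


# Under this model the single-occupancy fraction peaks at a mean of
# one pore complex per membrane.
fractions = occupancy_fractions(1.0)
expected_single = 12_000 * fractions["single"]  # for a 12,000-membrane array
print(fractions, round(expected_single))
```

Under this assumption, even at the optimal loading only a little over a third of the membranes hold exactly one pore complex, which is consistent with the statement that real arrays contain a mix of occupancies.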

The system may be included in a device. The device may be any conventional device for analyte analysis, such as an array or a chip. The device is preferably set up to carry out the disclosed method. For example, the device may include a chamber containing an aqueous solution and a barrier separating the chamber into two sections. The barrier typically has an opening in which the membrane containing the pore is formed. Alternatively, the barrier itself forms the membrane in which the pore is present.

In one embodiment, the apparatus includes a sensor device that supports a plurality of pores and membranes and is operable to perform characterization of an analyte using the pores and membranes, and at least one port for delivering material for performing the characterization.

In one embodiment, the apparatus includes a sensor device that supports a plurality of pores and membranes and is operable to perform characterization of an analyte using the pores and membranes, and at least one reservoir for holding material for performing the characterization.

In one embodiment, the apparatus includes: a sensor device that supports the membranes and the plurality of pores and is operable to characterize an analyte using the pores and membranes; at least one reservoir for holding material for performing the characterization; a fluidics system configured to controllably supply material from the at least one reservoir to the sensor device; and one or more containers for receiving respective samples, the fluidics system being configured to supply the samples selectively from the one or more containers to the sensor device.

The device may also include an electrical circuit capable of applying a potential and measuring an electrical signal across the membrane and pore complex. The device may be any of those described in WO 2008/102120, WO 2009/077734, WO 2010/122293, WO 2011/067559, or WO 00/28312.

Membrane
Any suitable membrane can be used in the system. The membrane is preferably an amphiphilic layer. An amphiphilic layer is a layer formed from amphiphilic molecules, such as phospholipids, which have both hydrophilic and lipophilic properties. The amphiphilic molecules can be synthetic or naturally occurring. Non-naturally occurring amphiphiles and amphiphiles that form a monolayer are known in the art and include, for example, block copolymers (Gonzalez-Perez et al., Langmuir, 2009, 25, 10447-10450). Block copolymers are polymeric materials in which two or more monomer subunits are polymerized together to create a single polymer chain. Block copolymers typically have properties contributed by each monomer subunit. However, a block copolymer may have unique properties that polymers formed from the individual subunits do not possess. Block copolymers may be designed so that one of the monomer subunits is hydrophobic (i.e., lipophilic) while the other subunit(s) are hydrophilic in aqueous media. In this case, the block copolymer may have amphiphilic properties and may form a structure that mimics a biological membrane. The block copolymer may be a diblock (consisting of two monomer subunits), but may also be constructed from more than two monomer subunits to form more complex arrangements that behave as amphiphiles. The copolymer may be a triblock, tetrablock, or pentablock copolymer. The membrane is preferably a triblock copolymer membrane.

Archaeal bipolar tetraether lipids are naturally occurring lipids that are constructed such that the lipid forms a monolayer membrane. These lipids are generally found in extremophiles that survive in harsh biological environments, thermophiles, halophiles, and acidophiles. Their stability is thought to derive from the fused nature of the final bilayer. It is straightforward to construct block copolymer materials that mimic these biological entities by creating a triblock polymer with the general motif hydrophilic-hydrophobic-hydrophilic. This material may form monomeric membranes that behave similarly to lipid bilayers and encompass a range of phase behaviors from vesicles through to lamellar membranes. Membranes formed from these triblock copolymers have several advantages over biological lipid membranes. Because the triblock copolymer is synthesized, its exact structure can be carefully controlled to provide the correct chain lengths and properties required to form membranes and to interact with pores and other proteins.

Block copolymers may also be constructed from subunits that are not classed as lipid sub-materials. For example, a hydrophobic polymer may be made from siloxane or other non-hydrocarbon-based monomers. The hydrophilic subsection of the block copolymer may also have low protein binding properties, which allows the creation of a membrane that is highly resistant when exposed to raw biological samples. This head group unit may also be derived from non-classical lipid head groups.

Triblock copolymer membranes also have increased mechanical and environmental stability compared with biological lipid membranes, for example a much higher operating temperature or pH range. The synthetic nature of block copolymers provides a platform for customizing polymer-based membranes for a wide range of applications.

The membrane is most preferably one of the membranes disclosed in WO 2014/064443 or WO 2014/064444.

The amphiphilic molecules may be chemically modified or functionalized to facilitate coupling of polynucleotides. The amphiphilic layer may be a monolayer or a bilayer. The amphiphilic layer is typically planar. The amphiphilic layer may be curved. The amphiphilic layer may be supported.

Amphiphilic membranes are typically naturally mobile, essentially acting as two-dimensional fluids with lipid diffusion rates of approximately 10⁻⁸ cm s⁻¹. This means that the pore and coupled polynucleotide can typically move within an amphiphilic membrane.
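To give a feel for what this mobility implies, the sketch below estimates how far a membrane component might wander by two-dimensional diffusion. It rests on an explicit assumption: the quoted rate is interpreted as a two-dimensional diffusion coefficient D of about 10⁻⁸ cm² s⁻¹, so that the mean squared displacement after time t is 4Dt. Both that interpretation and the helper name `rms_displacement_um` are illustrative and are not taken from the disclosure.

```python
import math

CM_TO_UM = 1.0e4  # 1 cm = 10,000 micrometres


def rms_displacement_um(diffusion_cm2_per_s, time_s):
    """Root-mean-square displacement (in micrometres) for 2D diffusion.

    Uses <r^2> = 4 * D * t, the standard result for free diffusion in
    two dimensions.
    """
    return math.sqrt(4.0 * diffusion_cm2_per_s * time_s) * CM_TO_UM


# Assumed diffusion coefficient: 1e-8 cm^2/s (interpretation of the
# rate quoted in the text).
D = 1.0e-8
after_one_second = rms_displacement_um(D, 1.0)
after_one_minute = rms_displacement_um(D, 60.0)
print(after_one_second, after_one_minute)
```

Under this assumption a component moves on the order of micrometres per second, and the displacement grows with the square root of time rather than linearly.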

The membrane can be a lipid bilayer. Lipid bilayers are models of cell membranes and serve as excellent platforms for a range of experimental studies. For example, lipid bilayers can be used for in vitro investigation of membrane proteins by single-channel recording. Alternatively, lipid bilayers can be used as biosensors to detect the presence of a range of substances. The lipid bilayer can be any lipid bilayer. Suitable lipid bilayers include, but are not limited to, planar lipid bilayers, supported bilayers, and liposomes. The lipid bilayer is preferably a planar lipid bilayer. Suitable lipid bilayers are disclosed in WO 2008/102121, WO 2009/077734, and WO 2006/100484.

Methods for forming lipid bilayers are known in the art. Lipid bilayers are commonly formed by the method of Montal and Mueller (Proc. Natl. Acad. Sci. USA, 1972; 69:3561-3566), in which a lipid monolayer is carried on the aqueous solution/air interface past either side of an aperture that is perpendicular to that interface. The lipid is normally added to the surface of an aqueous electrolyte solution by first dissolving it in an organic solvent and then allowing a small amount of the solvent to evaporate on the surface of the aqueous solution on either side of the aperture. Once the organic solvent has evaporated, the solution/air interfaces on either side of the aperture are physically moved up and down past the aperture until a bilayer is formed. Planar lipid bilayers may be formed across an aperture in a membrane or across an opening into a recess.

The method of Montal & Mueller is popular because it is a cost-effective and relatively straightforward way of forming good-quality lipid bilayers that are suitable for protein pore insertion. Other common methods of bilayer formation include tip-dipping, painting bilayers, and patch-clamping of liposome bilayers.

Tip-dipping bilayer formation involves touching the aperture surface (e.g., a pipette tip) onto the surface of a test solution that carries a monolayer of lipid. Again, the lipid monolayer is first generated at the solution/air interface by allowing a small amount of lipid dissolved in organic solvent to evaporate at the solution surface. The bilayer is then formed by the Langmuir-Schaefer process, which requires mechanical automation to move the aperture relative to the solution surface.

For painted bilayers, a small amount of lipid dissolved in organic solvent is applied directly to the aperture, which is submerged in an aqueous test solution. The lipid solution is spread thinly over the aperture using a paintbrush or an equivalent. Thinning of the solvent results in the formation of a lipid bilayer. However, complete removal of the solvent from the bilayer is difficult, and consequently bilayers formed by this method are less stable and more prone to noise during electrochemical measurement.

Patch-clamping is commonly used in the study of biological cell membranes. The cell membrane is clamped to the end of a pipette by suction, and a patch of the membrane becomes attached over the aperture. The method has been adapted for producing lipid bilayers by clamping liposomes, which then burst to leave a lipid bilayer sealing over the aperture of the pipette. The method requires stable, giant, unilamellar liposomes and the fabrication of small apertures in materials having a glass surface.

Liposomes can be formed by sonication, extrusion, or the Mozafari method (Colas et al. (2007) Micron 38:841-847). In a preferred embodiment, the lipid bilayer is formed as described in International Application No. WO 2009/077734. Advantageously, in this method the lipid bilayer is formed from dried lipids. In a most preferred embodiment, the lipid bilayer is formed across an opening as described in WO 2009/077734.

A lipid bilayer is formed from two opposing layers of lipids. The two layers of lipids are arranged such that their hydrophobic tail groups face towards each other to form a hydrophobic interior. The hydrophilic head groups of the lipids face outwards towards the aqueous environment on each side of the bilayer. The bilayer may be present in a number of lipid phases including, but not limited to, the liquid disordered phase (fluid lamellar), the liquid ordered phase, the solid ordered phase (lamellar gel phase, interdigitated gel phase), and planar bilayer crystals (lamellar sub-gel phase, lamellar crystalline phase).

Any lipid composition that forms a lipid bilayer may be used. The lipid composition is chosen such that a lipid bilayer having the required properties, such as surface charge, ability to support membrane proteins, packing density, or mechanical properties, is formed. The lipid composition can include one or more different lipids. For example, the lipid composition can include up to 100 lipids. The lipid composition preferably includes 1 to 10 lipids. The lipid composition may include naturally occurring lipids and/or artificial lipids.

Lipids typically comprise a head group, an interfacial moiety, and two hydrophobic tail groups, which may be the same or different. Suitable head groups include, but are not limited to, neutral head groups such as diacylglycerides (DG) and ceramides (CM); zwitterionic head groups such as phosphatidylcholine (PC), phosphatidylethanolamine (PE), and sphingomyelin (SM); negatively charged head groups such as phosphatidylglycerol (PG), phosphatidylserine (PS), phosphatidylinositol (PI), phosphatidic acid (PA), and cardiolipin (CA); and positively charged head groups such as trimethylammonium propane (TAP). Suitable interfacial moieties include, but are not limited to, naturally occurring interfacial moieties such as glycerol-based or ceramide-based moieties. Suitable hydrophobic tail groups include, but are not limited to, saturated hydrocarbon chains such as lauric acid (n-dodecanoic acid), myristic acid (n-tetradecanoic acid), palmitic acid (n-hexadecanoic acid), stearic acid (n-octadecanoic acid), and arachidic acid (n-eicosanoic acid); unsaturated hydrocarbon chains such as oleic acid (cis-9-octadecenoic acid); and branched hydrocarbon chains such as phytanoyl. The length of the chain and the position and number of the double bonds in the unsaturated hydrocarbon chains can vary. The length of the chains and the position and number of the branches, such as methyl groups, in branched hydrocarbon chains can vary. The hydrophobic tail groups may be linked to the interfacial moiety as an ether or an ester. The lipid may be a mycolic acid.

Lipids can also be chemically modified. The head group or the tail group of the lipid may be chemically modified. Suitable lipids whose head groups have been chemically modified include, but are not limited to, PEG-modified lipids such as 1,2-diacyl-sn-glycero-3-phosphoethanolamine-N-[methoxy(polyethylene glycol)-2000]; functionalized PEG lipids such as 1,2-distearoyl-sn-glycero-3-phosphoethanolamine-N-[biotinyl(polyethylene glycol)2000]; and lipids modified for conjugation such as 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine-N-(succinyl) and 1,2-dipalmitoyl-sn-glycero-3-phosphoethanolamine-N-(biotinyl). Suitable lipids whose tail groups have been chemically modified include, but are not limited to, polymerizable lipids such as 1,2-bis(10,12-tricosadiynoyl)-sn-glycero-3-phosphocholine; fluorinated lipids such as 1-palmitoyl-2-(16-fluoropalmitoyl)-sn-glycero-3-phosphocholine; deuterated lipids such as 1,2-dipalmitoyl-D62-sn-glycero-3-phosphocholine; and ether-linked lipids such as 1,2-di-O-phytanyl-sn-glycero-3-phosphocholine. The lipid may be chemically modified or functionalized to facilitate coupling of polynucleotides.

The amphiphilic layer, e.g., the lipid composition, typically contains one or more additives that affect the properties of the layer. Suitable additives include, but are not limited to, fatty acids such as palmitic acid, myristic acid, and oleic acid; fatty alcohols such as palmitic alcohol, myristic alcohol, and oleic alcohol; sterols such as cholesterol, ergosterol, lanosterol, sitosterol, and stigmasterol; lysophospholipids such as 1-acyl-2-hydroxy-sn-glycero-3-phosphocholine; and ceramides.

In another preferred embodiment, the membrane comprises a solid-state layer. The solid-state layer can be formed from both organic and inorganic materials, including, but not limited to, microelectronic materials; insulating materials such as Si3N4, Al2O3, and SiO; organic and inorganic polymers such as polyamides; plastics such as Teflon®; elastomers such as two-component addition-cure silicone rubber; and glass. The solid-state layer may be formed from graphene. Suitable graphene layers are disclosed in WO 2009/035647. When the membrane comprises a solid-state layer, the pore is typically present in an amphiphilic membrane or layer contained within the solid-state layer, for example, within a hole, well, gap, channel, trench, or slit in the solid-state layer. Those skilled in the art can prepare suitable solid-state/amphiphilic hybrid systems. Suitable systems are disclosed in WO 2009/020682 and WO 2012/005857. Any of the amphiphilic membranes or layers discussed above may be used.

The method is typically carried out using (i) an artificial amphiphilic layer containing the pore, (ii) an isolated, naturally occurring lipid bilayer containing the pore, or (iii) a cell into which the pore has been inserted. The method is typically carried out using an artificial amphiphilic layer, such as an artificial triblock copolymer layer. In addition to the pore, the layer may contain other transmembrane and/or intramembrane proteins, as well as other molecules. Suitable apparatus and conditions are discussed below. The method of the present disclosure is typically carried out in vitro.

Methods for Characterizing Analytes
In a further aspect, methods for determining the presence, absence, or one or more characteristics of a target analyte are disclosed. The method includes contacting the target analyte with a membrane containing a pore complex such that the target analyte moves relative to, e.g., into or through, a continuous channel comprising at least two constrictions provided by the nanopore and by the auxiliary protein or peptide, respectively, within the pore complex; and taking one or more measurements as the analyte moves relative to the channel, thereby determining the presence, absence, or one or more characteristics of the analyte. The analyte may pass through the constriction of the nanopore and then through the constriction of the auxiliary protein. In an alternative embodiment, depending on the orientation of the pore complex within the membrane, the analyte may pass through the constriction of the auxiliary protein and then through the constriction of the nanopore.

In one embodiment, the method is for determining the presence, absence, or one or more characteristics of a target analyte. The method may be for determining the presence, absence, or one or more characteristics of at least one analyte. The method may involve determining the presence, absence, or one or more characteristics of two or more analytes. The method may include determining the presence, absence, or one or more characteristics of any number of analytes, for example, 2, 5, 10, 15, 20, 30, 40, 50, 100, or more analytes. Any number of characteristics of the one or more analytes may be determined, for example, 1, 2, 3, 4, 5, 10, or more characteristics.

Binding of a molecule within the channel of the pore complex, or near either opening of the channel, affects the open-channel ion flow through the pore; this is the essence of "molecular sensing" with pore channels. As in nucleic acid sequencing applications, variation in the open-channel ion flow can be measured as a change in electrical current using suitable measurement techniques (e.g., WO 2000/28312 and D. Stoddart et al., Proc. Natl. Acad. Sci., 2010, 106, 7702-7, or WO 2009/077734). The degree of reduction in ion flow, measured as a reduction in electrical current, is related to the size of the obstruction within or near the pore. Thus, binding of a molecule of interest, also referred to as an "analyte," within or near the pore provides a detectable and measurable event, thereby forming the basis of a "biological sensor." Molecules suitable for nanopore sensing include nucleic acids; proteins; peptides; polysaccharides; and small molecules (herein referring to organic or inorganic compounds of low molecular weight, e.g., <900 Da or <500 Da) such as pharmaceuticals, toxins, cytokines, and pollutants. Detecting the presence of biomolecules finds applications in personalized drug development, medicine, diagnostics, life science research, environmental monitoring, and the security and/or defense industries.

The target analyte may be a metal ion, inorganic salt, polymer, amino acid, peptide, polypeptide, protein, nucleotide, oligonucleotide, polynucleotide, monosaccharide, polysaccharide, dye, bleach, pharmaceutical, diagnostic agent, recreational drug, explosive, toxic compound, or environmental pollutant. The method may involve determining the presence, absence, or one or more characteristics of two or more analytes of the same type, e.g., two or more proteins, two or more nucleotides, or two or more pharmaceuticals. Alternatively, the method may involve determining the presence, absence, or one or more characteristics of two or more analytes of different types, e.g., one or more proteins, one or more nucleotides, and one or more pharmaceuticals.

The target analyte may be secreted from a cell. Alternatively, the target analyte may be an analyte that is present inside a cell, in which case the analyte must be extracted from the cell before the method can be carried out.

In one embodiment, the analyte is an amino acid, peptide, polypeptide, or protein. The amino acid, peptide, polypeptide, or protein may be naturally occurring or non-naturally occurring. The polypeptide or protein may include synthetic or modified amino acids. Several different types of amino acid modification are known in the art. Suitable amino acids and modifications thereof are described above. It should be understood that the target analyte may be modified by any method available in the art.

In a preferred embodiment, the analyte is a polynucleotide, such as a nucleic acid. A polynucleotide is defined as a macromolecule comprising two or more nucleotides. The naturally occurring nucleobases in DNA and RNA can be distinguished by their physical size. As a nucleic acid molecule, or an individual base, passes through the channel of a nanopore, the size difference between the bases causes a directly correlated reduction in the ion flow through the channel. The change in ion flow can be recorded. Suitable electrical measurement techniques for recording changes in ion flow are described, for example, in WO 2000/28312 and D. Stoddart et al., Proc. Natl. Acad. Sci., 2010, 106, pp. 7702-7 (single-channel recording equipment), and, for example, in WO 2009/077734 (multi-channel recording techniques). With suitable calibration, the characteristic reduction in ion flow can be used to identify, in real time, the specific nucleotide and associated base passing through the channel.
In typical nanopore nucleic acid sequencing, the open-channel ion flow is reduced as the individual nucleotides of the nucleotide sequence of interest pass sequentially through the channel of the nanopore, because the nucleotides partially block the channel. It is this reduction in ion flow that is measured using the suitable recording techniques described above. The reduction in ion flow can be calibrated against the reduction measured for known nucleotides passing through the channel, which provides a means of determining which nucleotide is passing through the channel and therefore, when done sequentially, a way of determining the nucleotide sequence of the nucleic acid passing through the nanopore. To determine individual nucleotides accurately, it is typically necessary that the reduction in ion flow through the channel correlate directly with the size of the individual nucleotide passing through the constriction (or "reading head"). It will be understood that sequencing may be performed on an intact nucleic acid polymer that is "threaded" through the pore, for example, by the action of an associated polymerase or helicase. Alternatively, the sequence may be determined by the passage of nucleotide triphosphate bases that have been sequentially removed from a target nucleic acid in proximity to the pore (see, e.g., WO 2014/187924).
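The calibration step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each nucleotide produces one characteristic current level and that a measurement is assigned to the nearest calibrated level; the current values and the names `CALIBRATION_PA`, `call_nucleotide`, and `call_sequence` are hypothetical and are not taken from the disclosure. In practice several bases contribute to the signal at once, so this is a simplification rather than a base-calling method.

```python
# Hypothetical calibration table: mean current (pA) recorded for known nucleotides.
# The numbers are placeholders chosen for illustration only.
CALIBRATION_PA = {"A": 52.0, "C": 47.5, "G": 44.0, "T": 39.5}


def call_nucleotide(measured_pa, calibration=CALIBRATION_PA):
    """Return the nucleotide whose calibrated current is closest to the measurement."""
    return min(calibration, key=lambda base: abs(calibration[base] - measured_pa))


def call_sequence(measurements_pa, calibration=CALIBRATION_PA):
    """Assign each sequential current measurement to its nearest calibrated nucleotide."""
    return "".join(call_nucleotide(m, calibration) for m in measurements_pa)


if __name__ == "__main__":
    print(call_sequence([51.2, 40.1, 44.3, 47.0]))
```

Applied sequentially, the nearest-level lookup turns a series of current measurements into a nucleotide string, which is the idea behind calibrating against known nucleotides.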

A polynucleotide or nucleic acid may comprise any combination of any nucleotides. The nucleotides may be naturally occurring or artificial. One or more nucleotides in the polynucleotide may be oxidized or methylated. One or more nucleotides in the polynucleotide may be damaged. For example, the polynucleotide may comprise a pyrimidine dimer. Such dimers are typically associated with damage by ultraviolet light and are the primary cause of skin melanomas. One or more nucleotides in the polynucleotide may be modified, for example, with a label or tag, suitable examples of which are known to those skilled in the art. The polynucleotide may comprise one or more spacers. A nucleotide typically contains a nucleobase, a sugar, and at least one phosphate group. The nucleobase and sugar form a nucleoside. The nucleobase is typically heterocyclic. Nucleobases include, but are not limited to, purines and pyrimidines, more specifically adenine (A), guanine (G), thymine (T), uracil (U), and cytosine (C). The sugar is typically a pentose sugar. Nucleotide sugars include, but are not limited to, ribose and deoxyribose. The sugar is preferably deoxyribose. The polynucleotide preferably comprises the following nucleosides: deoxyadenosine (dA), deoxyuridine (dU) and/or thymidine (dT), deoxyguanosine (dG), and deoxycytidine (dC). The nucleotide is typically a ribonucleotide or a deoxyribonucleotide. The nucleotide typically contains a monophosphate, diphosphate, or triphosphate. The nucleotide may contain more than three phosphates, for example, four or five phosphates. Phosphates may be attached on the 5' or 3' side of the nucleotide. The nucleotides in the polynucleotide may be attached to each other in any manner. The nucleotides are typically attached by their sugar and phosphate groups, as in nucleic acids. The nucleotides may be connected via their nucleobases, as in pyrimidine dimers. The polynucleotide may be single-stranded or double-stranded. At least a portion of the polynucleotide is preferably double-stranded. The polynucleotide is most preferably ribonucleic acid (RNA) or deoxyribonucleic acid (DNA). In particular, where a polynucleotide is used as the analyte, the method includes determining one or more characteristics selected from (i) the length of the polynucleotide, (ii) the identity of the polynucleotide, (iii) the sequence of the polynucleotide, (iv) the secondary structure of the polynucleotide, and (v) whether or not the polynucleotide is modified.

The polynucleotide can be of any length (i). For example, the polynucleotide can be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400, or at least 500 nucleotides or nucleotide pairs in length. The polynucleotide can be 1,000 or more, 5,000 or more, or 100,000 or more nucleotides or nucleotide pairs in length. Any number of polynucleotides can be investigated. For example, the method may concern characterizing 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 50, 100, or more polynucleotides. If two or more polynucleotides are characterized, they may be different polynucleotides or two instances of the same polynucleotide. The polynucleotide may be naturally occurring or artificial. For example, the method may be used to verify the sequence of a manufactured oligonucleotide. The method is typically carried out in vitro.

A nucleotide has an identity (ii). Nucleotides include, but are not limited to, adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), 5-methylcytidine monophosphate, 5-hydroxymethylcytidine monophosphate, cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP), deoxycytidine monophosphate (dCMP), and deoxymethylcytidine monophosphate. The nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP, dCMP, and dUMP. A nucleotide may be abasic (i.e., lack a nucleobase). A nucleotide may also lack both a nucleobase and a sugar (i.e., be a C3 spacer). The sequence (iii) of the polynucleotide is determined by the consecutive identities of the nucleotides attached to one another along the whole polynucleotide strand, in the 5' to 3' direction of the strand.

Pore complexes comprising at least two constrictions are particularly useful for analyzing homopolymers. For example, the pore may be used to determine the sequence of a polynucleotide comprising two or more, e.g., at least 3, 4, 5, 6, 7, 8, 9, or 10, consecutive nucleotides that are identical. For example, the pore may be used to sequence a polynucleotide comprising a poly-A, poly-T, poly-G, and/or poly-C region.

In some embodiments, the CsgG pore constriction is formed by the residues at positions 51, 55, and 56 of SEQ ID NO: 59. As DNA passes through the constriction, the current signal is dominated by the interaction of approximately five bases of the DNA with the pore constriction at any given time. Certain CsgG pores (e.g., CsgG pores lacking the one or more auxiliary proteins or fusion proteins described herein) read mixed-sequence regions of DNA (where A, T, G, and C are mixed) very well, but where the DNA contains a homopolymeric region (e.g., poly-T, poly-G, poly-A, or poly-C), the signal becomes flat and lacks information. Because five bases dominate the signal of CsgG and its constriction variants, it is difficult to discriminate homopolymers longer than five bases without using additional dwell-time information. However, if the DNA passes through a second constriction, more DNA bases interact with the combined constrictions, which increases the length of homopolymer that can be discriminated.
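The effect of the reading window on homopolymer discrimination can be sketched with a simple k-mer model. The sketch assumes that the current level depends only on the k bases in the window and that consecutive identical k-mers merge into one flat level; modeling two constrictions as a single longer window (k = 8 here) is an illustrative assumption, not a value from the disclosure. The function names are hypothetical.

```python
from itertools import groupby


def kmers(seq, k):
    """Return the k-mers seen by a reading window of k bases sliding along seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def collapsed_levels(seq, k):
    """Merge runs of identical consecutive k-mers, since they give one flat level."""
    return [kmer for kmer, _ in groupby(kmers(seq, k))]


if __name__ == "__main__":
    six_a = "GC" + "A" * 6 + "GC"
    seven_a = "GC" + "A" * 7 + "GC"
    # With a five-base window the two homopolymer lengths give the same level series.
    print(collapsed_levels(six_a, 5) == collapsed_levels(seven_a, 5))
    # With a longer (hypothetical) combined window the level series differ.
    print(collapsed_levels(six_a, 8) == collapsed_levels(seven_a, 8))
```

In this model a run of six A and a run of seven A are indistinguishable with a five-base window unless dwell time is used, whereas a longer effective window yields different level series for the two sequences.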

Kits
In a further aspect, the present disclosure also provides kits for characterizing a target polynucleotide. The kit comprises the disclosed pore complex and the components of a membrane. The membrane is preferably formed from the components. The pore complex is preferably present in the membrane, and together they form a transmembrane pore complex channel. The kit may comprise components of any type of membrane, such as an amphiphilic layer or a triblock copolymer membrane. The kit may further comprise a polynucleotide-binding protein, such as a nucleic acid handling enzyme, e.g., a polymerase or a helicase. The kit may further comprise one or more anchors, such as cholesterol, for coupling the polynucleotide to the membrane. The kit may further comprise one or more polynucleotide adaptors that can be attached to the target polynucleotide to facilitate its characterization. In one embodiment, an anchor, such as cholesterol, is attached to the polynucleotide adaptor. The kit may further comprise one or more other reagents or instruments that enable any of the embodiments above to be carried out. Such reagents or instruments include one or more of the following: suitable buffer(s) (aqueous solutions), means for obtaining a sample from a subject (such as a vessel or an instrument comprising a needle), means for amplifying and/or expressing polynucleotides, or voltage- or patch-clamp apparatus. Reagents may be present in the kit in a dry state such that a fluid sample resuspends them. The kit may also, optionally, comprise instructions enabling the kit to be used in the methods of the present disclosure, or details regarding the organisms for which the method may be used. Finally, the kit may also comprise additional components useful for polynucleotide characterization.

Although specific embodiments, specific configurations, and materials and/or molecules have been discussed herein for engineered cells and methods according to the present disclosure, it should be understood that various changes or modifications in form and detail may be made without departing from the scope and spirit of the present disclosure. The following examples are provided to better illustrate particular embodiments and should not be regarded as limiting the present application. The application is limited only by the claims.

Example 1
To create a helical constriction, de novo design was used to select small protein domains that are well folded and protrude into the lumen of the nanopore to the desired extent. Several programs can be used for this purpose. This example describes a workflow that uses the program MASTER, which facilitates backbone design, and Rosetta, with variable backbone geometry, for sequence selection.

To create a new domain that projects into the pore lumen, programs such as RFdiffusion, Chroma, or MASTER can be used. Here, the inventors used MASTER. The Protein Data Bank (PDB) was searched for structures meeting the following criteria: 1) the structure stabilizes the target region of CsgF (residues 16-30); 2) the structure projects into the nanopore so as to create a new constriction with a diameter between 10 Å and 30 Å (the Cα-Cα distance of the amino acid residues that extend furthest into the pore lumen) when all units are generated using the 9-fold symmetry operators; and 3) the new domain does not clash with any atom in CsgG or with the symmetry mates from CsgF.
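The geometric filters in criteria 2 and 3 can be sketched as follows. This is not the MASTER search itself. It assumes that the pore axis lies along z, that the constriction diameter can be approximated as twice the radial distance of the innermost Cα atom from that axis, and that a clash is any pair of atoms closer than a hypothetical 3.0 Å cutoff; all function names are illustrative.

```python
import math


def symmetry_mates(point, n_fold=9):
    """Rotate a point (x, y, z) about the z axis to generate all n_fold copies."""
    x, y, z = point
    mates = []
    for i in range(n_fold):
        angle = 2.0 * math.pi * i / n_fold
        mates.append((x * math.cos(angle) - y * math.sin(angle),
                      x * math.sin(angle) + y * math.cos(angle),
                      z))
    return mates


def constriction_diameter(innermost_ca):
    """Approximate the diameter as twice the radial distance from the pore axis."""
    x, y, _ = innermost_ca
    return 2.0 * math.hypot(x, y)


def passes_filters(innermost_ca, domain_atoms, pore_atoms, n_fold=9,
                   min_d=10.0, max_d=30.0, clash_cutoff=3.0):
    """Apply the diameter window and the two clash checks to one candidate domain."""
    if not (min_d <= constriction_diameter(innermost_ca) <= max_d):
        return False
    # Clash with the pore: pore_atoms is assumed to hold the complete pore.
    if any(math.dist(a, p) < clash_cutoff for a in domain_atoms for p in pore_atoms):
        return False
    # Clash between the reference copy and its symmetry mates (index 0 is the identity).
    for atom in domain_atoms:
        for other in domain_atoms:
            for mate in symmetry_mates(other, n_fold)[1:]:
                if math.dist(atom, mate) < clash_cutoff:
                    return False
    return True


if __name__ == "__main__":
    domain = [(10.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
    print(passes_filters(domain[0], domain, pore_atoms=[(40.0, 0.0, 0.0)]))
```

A candidate whose innermost Cα lies 10 Å from the axis gives a 20 Å opening under this approximation and is retained, while one lying 3 Å from the axis is rejected as too narrow.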

First, helices that dock onto the target region in CsgF and its symmetry-related neighbors were identified in geometries that are frequently observed in natural proteins in the PDB and are therefore "designable." After clustering the output by the RMSD of the target region and the identified helix, the top candidates were selected based on the number of closely related helix-helix pairs found in the database (Figure 1). In this way, helix geometries that pack well against the target amino acids and the four amino acids N-terminal to them were selected. In addition, the database was searched for helices that make favorable helix-helix interactions with their symmetry-related partners. Linkers (e.g., loop structures) connecting the helices were selected using a database of helix backbones (Figure 1). Sequences for the resulting backbones were then designed using Rosetta. Representative sequences were generated (e.g., SEQ ID NOs: 1-58).

Sequences for experimental validation were selected based on the lowest energy score and the highest PackStat score, as shown in Figure 2. To prioritize the sequences further, aggregation propensity may be tested using one of several aggregation and amyloid prediction programs.
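One way to combine the two selection criteria is a rank sum, sketched below. The disclosure does not state how the two scores were combined, so the rank-sum rule, the tie-break on energy, and the example designs and scores are all assumptions made for illustration.

```python
def rank_designs(designs):
    """Order designs by combined rank: low energy is good, high PackStat is good."""
    by_energy = sorted(designs, key=lambda d: d["energy"])
    by_packstat = sorted(designs, key=lambda d: d["packstat"], reverse=True)
    energy_rank = {d["name"]: i for i, d in enumerate(by_energy)}
    packstat_rank = {d["name"]: i for i, d in enumerate(by_packstat)}
    # Ties in the rank sum are broken in favor of the lower energy score.
    return sorted(
        designs,
        key=lambda d: (energy_rank[d["name"]] + packstat_rank[d["name"]], d["energy"]),
    )


# Hypothetical designs with placeholder scores.
EXAMPLE_DESIGNS = [
    {"name": "design_1", "energy": -250.0, "packstat": 0.62},
    {"name": "design_2", "energy": -270.0, "packstat": 0.70},
    {"name": "design_3", "energy": -265.0, "packstat": 0.55},
    {"name": "design_4", "energy": -240.0, "packstat": 0.72},
]

if __name__ == "__main__":
    for design in rank_designs(EXAMPLE_DESIGNS):
        print(design["name"], design["energy"], design["packstat"])
```

A rank sum avoids having to put energy units and the dimensionless PackStat score on a common scale, at the cost of ignoring how large the score differences are.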

Example 2
Materials and Methods
E. coli CsgG Pore Production
A recombinant expression vector encoding a CsgG variant nanopore with a C-terminal Strep affinity tag, and carrying an ampicillin resistance gene, was transformed into chemically competent E. coli cells. The cells were plated onto LB agar plates containing the appropriate antibiotic for selection and incubated overnight at 37°C. LB medium containing the appropriate antibiotic was inoculated with a single colony from the agar plate and grown overnight at 37°C with shaking. The culture was diluted into autoinduction medium containing the required antibiotic and incubated at 18°C for 68 hours with shaking. The cells were harvested by centrifugation and then lysed and extracted in a buffer containing 1x BugBuster extraction reagent (Merck 70921) and 0.1% DDM. The lysate was centrifuged, and the pore was purified from the soluble extract by affinity chromatography, heat treatment, and then size-exclusion chromatography, selecting for oligomeric nanopore as judged by SDS-PAGE.

CsgG/CsgF or Fusion Protein Complex Formation Protocol
CsgG-CsgF complexes were prepared from nanopore purified as described above and from chemically synthesized de novo fusion proteins, with or without a maleimide modification. For cysteine-containing fusion proteins, cyclization of the fusion protein was achieved by cross-linking the thiols of the appropriate cysteines. The nanopore was buffer-exchanged into a pH 7.0 buffer containing no reducing agent and incubated with an 8-fold molar excess of peptide over CsgG monomer at 25°C for 1 hour. The sample was then heated at 60°C for 15 minutes and centrifuged to remove any precipitate, and DTT was added to prevent any further reaction.
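The amount of peptide needed for an 8-fold molar excess over CsgG monomer follows from the masses and molecular weights, as in the sketch below. The molecular weights and the pore mass used in the example are hypothetical placeholders, not values from the disclosure, and the function name is illustrative.

```python
def peptide_mass_ug(pore_mass_ug, monomer_mw_da, peptide_mw_da, fold_excess=8.0):
    """Mass of peptide (µg) giving fold_excess moles of peptide per mole of monomer.

    The mass of the oligomeric pore equals the total mass of its monomers, so
    moles of monomer = pore mass / monomer molecular weight.
    """
    monomer_nmol = pore_mass_ug / monomer_mw_da * 1000.0  # µg / (g/mol) = µmol
    peptide_nmol = fold_excess * monomer_nmol
    return peptide_nmol * peptide_mw_da / 1000.0  # nmol * (g/mol) = ng, then to µg


if __name__ == "__main__":
    # Hypothetical inputs: 100 µg of pore, 30 kDa monomer, 4 kDa peptide.
    print(round(peptide_mass_ug(100.0, 30000.0, 4000.0), 2))
```

Because the excess is defined per monomer rather than per assembled pore, the oligomeric state of the pore does not enter the calculation.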

SDS-PAGE Analysis
1 μg of complex and of a CsgG-only pore control were added to individual 0.5 mL Protein LoBind Eppendorf tubes (Fisher, 10316752) and made up to a volume of 10 μL with reaction buffer. Each was brought to a final volume of 20 μL by adding 10 μL of 2x Laemmli buffer. The whole of each sample was loaded onto a 4-20% TGX gel (Bio-Rad, 5671093) run in 1x TGS buffer (Sigma, T7777). The gel was run at 300 V for 21 minutes. The gel was stained with SYPRO Ruby (Merck, S4942) according to the manufacturer's instructions and then imaged on a GE Typhoon gel imager using the 450 nm laser.

For some analyses, 1 μg of complex and of a CsgG-only pore control were added to individual PCR tubes and made up to a volume of 10 μL with reaction buffer. A fresh 1 M DTT stock was prepared and spiked into each PCR tube to a final concentration of 10 mM. Each sample was brought to a final volume of 20 μL by adding 10 μL of 2x Laemmli buffer. Each sample was heated at 95°C for 2 minutes in a PCR thermocycler and allowed to cool for 5 minutes before all of the material from each sample was loaded onto a 4-20% TGX gel (Bio-Rad, 5671093) run in 1x TGS buffer (Sigma, T7777). The gel was run at 300 V for 21 minutes. The gel was stained with SYPRO Ruby (Merck, S4942) according to the manufacturer's instructions and then imaged on a GE Typhoon gel imager using the 450 nm laser.

Electrical measurements
Electrical measurements were acquired from CsgG-only pores, CsgG/CsgF complexes, or CsgG/fusion protein complexes inserted into MinION flow cells. After a single pore had inserted into the block copolymer membrane, 1 mL of buffer containing 25 mM potassium phosphate, 150 mM potassium ferrocyanide(II), and 150 mM potassium ferricyanide(III), pH 8.0, was flowed through the system to remove any excess nanopores.

As shown in Figure 23, the analyte used to assess DNA squiggles was a 3.6 kilobase section of DNA from the 3' end of the lambda genome. Analyte preparation, ligation of the analyte to the Y-adapter, SPRI-bead clean-up of the ligated analyte, and addition to the MinION flow cell were performed using the Oxford Nanopore Technologies Q-SQK-LSK110 protocol.

Electrical measurements were acquired using a MinION Mk1B from Oxford Nanopore Technologies. A standard sequencing script was run at -180 mV for 6 hours, with a static flick every 5 minutes to clear extended nanopore blocks. Raw data were collected in bulk FAST5 files using MinKNOW software (Oxford Nanopore Technologies).

Discrimination profiling
FAST5 files containing DNA squiggles (e.g., electrical measurements) of the 3.6 kilobase section of DNA from the 3' end of the lambda genome (3.6 Kb lambda) were acquired. The DNA squiggles were trimmed using a custom Python script to remove any electrical signal measurements captured before DNA sequencing began.

The trimmed 3.6 Kb lambda squiggles and the corresponding genomic reference for this region were used to train the parameters of a neural network. The neural network, which contained four layers, modeled sequences of a user-specified window length and the current levels associated with those sequences. The window length specified for these models allowed a region of +/- 12 nucleotides to contribute to the current level at any one position.

The trained neural network was used to predict the current levels corresponding to the 3.6 Kb lambda DNA reference sequence. It was also used to predict the current levels for all possible single-base edits of that sequence.
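The set of single-base edits referred to above can be enumerated as in the following minimal sketch. It assumes that "edits" means substitutions only (no insertions or deletions); the function name `single_base_edits` and the toy reference string are hypothetical and are not taken from the disclosure.

```python
from typing import Iterator, Tuple

BASES = "ACGT"


def single_base_edits(seq: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, new_base, edited_sequence) for every single-base substitution."""
    for pos, ref_base in enumerate(seq):
        for alt_base in BASES:
            if alt_base != ref_base:
                # Replace exactly one base and leave the rest of the sequence unchanged.
                yield pos, alt_base, seq[:pos] + alt_base + seq[pos + 1:]


# Toy stand-in for the 3.6 Kb lambda reference (illustrative only).
reference = "GATTACA"
edits = list(single_base_edits(reference))
print(len(edits))  # three substitutions per position
```

Under this assumption a reference of length N yields 3N edited sequences, each of which would be passed to the trained model.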

Changing the base at a single position (L) within a sequence changes not only the predicted current as this base passes through the main constriction of the pore, but also the current before and after the base passes through this main constriction. The predicted current levels for the set of edited 3.6 Kb lambda sequences were analyzed to calculate the range of predicted current at position L + X (the offset) when the base at position L is changed. Offsets of -16 to +16 were analyzed at each position. The median of the range of predicted current at each offset was calculated to give the data in the figures. The model was centered so that the largest peak, which represents the CsgG constriction, corresponds to position 0.
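The offset analysis described above can be illustrated with a short sketch. The trained neural network is not available here, so `toy_predict` is a hypothetical stand-in in which each base contributes to the current at nearby positions with fixed weights. The weights, the toy sequence, and the reduced offset range of -3 to +3 are assumptions chosen for illustration, as is the choice to take the range over all four bases at position L.

```python
import statistics
from typing import Callable, Dict, List

BASES = "ACGT"
BASE_VALUE = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}
# Hypothetical contribution of the base at L to the current at L + offset.
WEIGHTS = {-1: 0.25, 0: 1.0, 1: 0.5}


def toy_predict(seq: str) -> List[float]:
    """Stand-in for the trained model: one predicted current level per position."""
    levels = [0.0] * len(seq)
    for pos, base in enumerate(seq):
        for offset, weight in WEIGHTS.items():
            target = pos + offset
            if 0 <= target < len(seq):
                levels[target] += weight * BASE_VALUE[base]
    return levels


def discrimination_profile(
    seq: str, predict: Callable[[str], List[float]], max_offset: int
) -> Dict[int, float]:
    """Median, over positions L, of the current range at L + X when base L is varied."""
    ranges: Dict[int, List[float]] = {x: [] for x in range(-max_offset, max_offset + 1)}
    for pos in range(len(seq)):
        # Predicted traces for every choice of base at this position.
        traces = [predict(seq[:pos] + b + seq[pos + 1:]) for b in BASES]
        for offset in ranges:
            target = pos + offset
            if 0 <= target < len(seq):
                levels = [trace[target] for trace in traces]
                ranges[offset].append(max(levels) - min(levels))
    return {x: statistics.median(v) for x, v in ranges.items() if v}


profile = discrimination_profile("ACGTTGCAACGT", toy_predict, max_offset=3)
peak_offset = max(profile, key=profile.get)
print(profile, peak_offset)
```

In this toy model the largest value falls at offset 0, mirroring the centering convention in which the main CsgG constriction defines position 0; an additional constriction would appear as a second peak at a non-zero offset.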

Example 3
The de novo fusion protein sequences designed using Rosetta were analyzed, and sequences for experimental validation were selected based on the lowest energy score and the highest PackStat score (Figure 2). PSIPRED analysis (e.g., as described in McGuffin LJ, Bryson K, Jones D, Bioinformatics, 16, 404-405, 2000) was performed to predict the secondary structure of the fusion proteins. Residues are shaded according to whether they are predicted to be strand, helix, or coil. Secondary structure analysis of a de novo designed fusion protein (e.g., an extended CsgF protein) and the mature sequence of wild-type CsgF is shown in Figure 3A. Structural analysis of the de novo designed fusion proteins ONT1-ONT10, ONT11-ONT20, and ONT21-ONT25 is shown in Figures 3B-3C.

The three-dimensional structures of alternative sequences of the de novo designed fusion proteins were also investigated using a protein folding algorithm. The predicted 3D structures of the de novo designed fusion proteins ONT1-ONT10, ONT11-ONT20, and ONT21-ONT25 are shown in Figures 4A-4C. The structures are shaded according to a confidence measure, the predicted local distance difference test (pLDDT).

SDS-PAGE gel analysis was performed on CsgG-only pores and CsgG/fusion protein complexes. The complexes contained either the CsgF-del(S31-F119) control or a de novo designed fusion protein, with or without a maleimide crosslinker (Figure 5). Complexes containing the fusion proteins showed a band shift, indicating that these samples were nanopore complexes. Note that these samples were not heated before loading onto the gel. SDS-PAGE gel analysis was also performed on CsgG-only pores and CsgG/fusion protein complexes containing either the CsgF-del(S31-F119) control or a de novo designed fusion protein, with or without a maleimide crosslinker (Figure 6). Boiling the pores in the presence of DTT before loading onto the gel broke them down into their constituent monomers. Note that no band shift was observed in the absence of maleimide crosslinking, indicating that these bands consist of CsgG monomer only. Lane 7 shows a band shift relative to the CsgG-only control, indicating that the fusion protein is covalently bound to the CsgG pore because maleimide is present. Lanes 8 and 9 show a further band shift because of the increased mass of the fusion protein, which likewise indicates that the fusion protein is covalently bound to the CsgG pore.

Ionic current (pA) versus time (s) was measured as single-stranded DNA translocated through CsgG-only pores. Each individual graph corresponds to a single pore inserted into a MinION flow cell. The open pore current observed for the CsgG-only pore was approximately 180 pA at an applied voltage of -180 mV. Table 1 below shows representative data for the median range, median noise, and median signal-to-noise ratio (SNR) of the protein pore complexes described in this disclosure.
Table 1: Metrics table

Figures 7-11 show representative ionic current (pA) versus time (s) traces as single-stranded DNA translocates through a CsgG-only pore, CsgG containing the del(S31-F119) CsgF peptide, or CsgG containing a de novo designed fusion protein. The raw current trace is shown as a black line, and the event-detected signal is shown as a red line. For each pore, the top row shows the complete DNA current trace, and the bottom row shows a magnified view of the first part of the current trace. The open pore current of the CsgG-only pore is approximately 175-200 pA, and the median current of the DNA squiggle is approximately 75 pA. For pores containing a CsgF peptide, the open pore current is approximately 90-120 pA, and the median current is approximately 35-50 pA. Figure 7 shows traces as DNA translocates through a CsgG-only pore. Figure 8 shows representative traces as single-stranded DNA translocates through CsgG containing the del(S31-F119) CsgF peptide, with (right) or without (left) the maleimide crosslinker. Figure 9 shows representative traces as single-stranded DNA translocates through CsgG containing the de novo designed fusion protein ONLP20623, without the maleimide crosslinker. Figure 10 shows representative traces as single-stranded DNA translocates through CsgG containing the de novo designed fusion protein ONLP20624 (without maleimide crosslinking) or ONLP20627 (with maleimide crosslinking). Figure 11 shows representative traces as single-stranded DNA translocates through CsgG containing the de novo designed fusion protein ONLP20628 (with maleimide crosslinking) or ONLP20625 (without maleimide crosslinking). In some embodiments, the fusion protein comprises a 37R residue together with cysteine residues that form an internal disulfide bond within the peptide, i.e., that cyclize the fusion protein.

Profiles were generated showing the positions within the pore and their contribution to the overall change in ionic current level ("discrimination") as a DNA molecule translocates through the pore. Distance within the pore is measured in nucleotide steps relative to the main constriction (CsgG). Negative values correspond to positions below the main constriction, and positive values correspond to positions above it. The dashed box indicates the region affected by the introduction of the de novo designed fusion protein. Figure 12 shows representative profiles as a DNA molecule translocates through a CsgG-only pore. CsgG-only pores (with or without Q153C) show one main discrimination peak at position 0. Figure 13 shows representative profiles as a DNA molecule translocates through a CsgG/CsgF pore. The CsgG-CsgF-del(S31-F119) pore, with (right) or without (left) the maleimide crosslinker, shows two discrimination peaks. The main discrimination peak is at position 0, as seen for the CsgG-only pore, and an additional discrimination peak lies 4 to 6 nucleotides below the main constriction (positions -4 to -6). This additional region of discrimination has a smaller effect on the ionic current than the main discrimination peak at position 0. Figure 14 shows representative profiles as a DNA molecule translocates through a CsgG/fusion protein (ONLP20641 or ONLP20644) pore. Complexes comprising CsgG and a cyclized de novo designed fusion protein containing K37R, with (right) or without (left) the maleimide crosslinker, show three discrimination peaks. The main discrimination peak is at position 0, as seen for the CsgG-only pore, and the additional peaks are at positions -6 and -9. The peak at position -9 corresponds to the constriction expected to be produced by the de novo designed fusion protein when it folds in the correct orientation.

Example 3
Pores formed from nine of the subunits set forth in SEQ ID NO: 61 (with or without the maleimide crosslinker, both without cyclization) were also tested as described in Example 2. The results are shown in Figures 17-18.

Representative sequences
>(SEQ ID NO: 1) CsgF-WT-del(S31-F119)-Ext(31-GGELAAKLWANGDETNALSLFQTIIQS) (ONLP20623)
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNGGELAAKLWANGDETNALSLFQTIIQS
>(SEQ ID NO: 2) CsgF-WT-K37R-del(S31-F119)-Ext(31-GGELAAKLWANGDETNALSLFQTIIQS) (ONLP20624)
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNGGELAARLWANGDETNALSLFQTIIQS
>(SEQ ID NO: 3) CsgF-WT-N24C/K37R-del(S31-F119)-Ext(31-GGELAAKLWANGDETNALSLFQTIIQSC) (ONLP20625)
GTMTFQFRNPNFGGNPNNGAFLLCSAQAQNGGELAARLWANGDETNALSLFQTIIQSC
>(SEQ ID NO: 4) Mat-CsgF-Eco-(WT-Del(S31-F119)-Ext(31-AGELAKKLWENGNVNQALSLFQTVIQS) (ONLZ19432, DGLONT76)
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWENGNVNQALSLFQTVIQS
>(SEQ ID NO: 5) Mat-CsgF-Eco-(WT-K36R/K37R-Del(S31-F119)-Ext(31-AGELAKKLWENGNVNQALSLFQTVIQS) (ONLZ19431)
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELARRLWENGNVNQALSLFQTVIQS
>(SEQ ID NO: 6) Mat-CsgF-Eco-(WT-N24C/K36R/K37R-Del(S31-F119)-Ext(31-AGELARRLWENGNVNQALSLFQTVIQSC) (ONLZ19781)
GTMTFQFRNPNFGGNPNNGAFLLCSAQAQNAGELARRLWENGNVNQALSLFQTVIQSC
>(SEQ ID NO: 7) ONT113_2
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAAELAAKLWANADETNALSLFQTIIQS
>(SEQ ID NO: 8) ONT113_3
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAAELAAKLWANADETNALSLFQTLIQS
>(SEQ ID NO: 9) ONT1
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFKKGDLTNALSLFQTVIQS
>(SEQ ID NO: 10) ONT2
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELVEKLFKNGDWTNAISIFQTVIQS
>(SEQ ID NO: 11) ONT3
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAEKLWRNGDETNALSLFQTVIQS
>(SEQ ID NO: 12) ONT4
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAEKLWKNGDETNALSLFQTVIQS
>(SEQ ID NO: 13) ONT5
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWENGDETNALSLFQTVVQS
>(SEQ ID NO: 14) ONT6
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAEKLWRNGNESDALSLFQTVIQS
>(SEQ ID NO: 15) ONT7
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLFENGDKTNALSLFQTVIQS
>(SEQ ID NO: 16) ONT8
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWENGDETNALSLFQTVIQS
>(SEQ ID NO: 17) ONT9
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWEKGNSEDALALFRTVVQS
>(SEQ ID NO: 18) ONT10
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLFDNGDMENAMKLFQTVIAS
>(SEQ ID NO: 19) ONT11
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAEKLWRNGDKDRALALFRTVIQS
>(SEQ ID NO: 20) ONT12
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELADKLWKNGDKDRALSLFQTVIQS
>(SEQ ID NO: 21) ONT13
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLFDNGDMDRALALFRTVIAS
>(SEQ ID NO: 22) ONT14
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLFDNGNEEDALALFRTVVAS
>(SEQ ID NO: 23) ONT15
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLWKKGDEENALKLFRTVVTS
>(SEQ ID NO: 24) ONT16
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFKNGNMEDALKLFRTVIAS
>(SEQ ID NO: 25) ONT17
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGKVAAILWKNGNKSDALSLFQTVVTS
>(SEQ ID NO: 26) ONT18
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFKNGDLTNALSLFQTVVQS
>(SEQ ID NO: 27) ONT19
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELGLKLLRKGDVETALTLFAQVISG
>(SEQ ID NO: 28) ONT20
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELGLKLILKGDLETALKLFAIVIAG
>(SEQ ID NO: 29) ONT21
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELGLKLLRKGDVETALKLFAIVIAG
>(SEQ ID NO: 30) ONT22
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLYENGLIELALMLFALVIAS
>(SEQ ID NO: 31) ONT23
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELYKKLWDNGEVDKALDLFAKIIAG
>(SEQ ID NO: 32) ONT24
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELGKKLIEKGDLETALKLFAIVIAG
>(SEQ ID NO: 33) ONT25
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGEIALRLLKNGKEEEALKTLLVTIAG
>(SEQ ID NO: 34) ONT26
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLWKKGDETNALSLFQTVVTS
>(SEQ ID NO: 35) ONT27
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGKVAAILWKNGNKSDALSLFQTVVTS
>(SEQ ID NO: 36) ONT28
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWEKGDETNALSLFQTVVTS
>(SEQ ID NO: 37) ONT29
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGDLAAKLWKKGDETNALSLFQTVVTS
>(SEQ ID NO: 38) ONT30
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLWKNGNSSDALSLFQTVVTS
>(SEQ ID NO: 39) ONT31
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWEKGDETNALSLFQTVVTS
>(SEQ ID NO: 40) ONT32
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWEKGDSSNALSLFQTVVTS
>(SEQ ID NO: 41) ONT33
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGDLAAKLWKNGDETNALSLFQTVVTS
>(SEQ ID NO: 42) ONT34
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFKNGDLTNALSLFQTVVQS
>(SEQ ID NO: 43) ONT35
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLWKKGDETNALSLFQTVVTS
>(SEQ ID NO: 44) ONT36
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFNSGDLDRALALFRTVVTS
>(SEQ ID NO: 45) ONT37
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGKVAKELYDNGDEKWALLLFRTVVTS
>(SEQ ID NO: 46) ONT38
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGKVAAELYKNGDEKNALLLFRTVVAS
>(SEQ ID NO: 47) ONT39
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFKNGDMENALALFRTVVTS
>(SEQ ID NO: 48) ONT40
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWEKGNSEDALALFRTVVQS
>(SEQ ID NO: 49) ONT41
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFNKGDEDRALALFRTVVQS
>(SEQ ID NO: 50) ONT42
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLWKNGDEENALALFRTVVTS
>(SEQ ID NO: 51) ONT43
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAEKLWRSGDADRALALFRTVVTS
>(SEQ ID NO: 52) ONT44
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLWKNGNEEDALALFRTVVTS
>(SEQ ID NO: 53) ONT45
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFNNGDEDRALALFRTVVQS
>(SEQ ID NO: 54) ONT46
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLWKKGDEDRALALFRTVVTS
>(SEQ ID NO: 55) ONT47
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLFNSGDEDRALALFRTVVQS
>(SEQ ID NO: 56) ONT48
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAAKLYNNGDLDRADATFRTVVQS
>(SEQ ID NO: 57) ONT49
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGELAKKLWENGNEEDALALFRTVVTS
>(SEQ ID NO: 58) ONT50
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGEIAKQLWEKGDESSAITVATIVLSS
>(SEQ ID NO: 59) Wild-type E. coli CsgG protein monomer (without signal sequence)
CLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIQDETGQFKPYPASNFSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTVAINNRIPLQSLTAANIMVEGSIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVNVSTGEILSSVNTSKTILSYEVQAGVFRFIDYQRLLEGEVGYTSNEPVMLCLMSAIETGVIFLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES
>(SEQ ID NO: 60) Residues 1-30 of the CsgF peptide
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQN
>(SEQ ID NO: 61) CsgF-WT-del(S31-F119)-Ext(31-AGILAAQLWNNGDYDRALSLFIAVVQS-57)
GTMTFQFRNPNFGGNPNNGAFLLNSAQAQNAGILAAQLWNNGDYDRALSLFIAVVQS
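The residue numbering used in the sequence names (e.g., K37R, N24C, Ext(31-...)) can be checked with a short sketch. It assumes 1-based numbering over the fusion protein, with residues 1-30 taken from SEQ ID NO: 60 and the extension starting at residue 31, and it assumes the trailing cysteine of SEQ ID NO: 3 is a C-terminal addition. The helper `apply_substitutions` is a hypothetical name introduced only for this illustration.

```python
import re
from typing import List

CSGF_1_30 = "GTMTFQFRNPNFGGNPNNGAFLLNSAQAQN"   # SEQ ID NO: 60
EXT_ONLP20623 = "GGELAAKLWANGDETNALSLFQTIIQS"   # extension named in SEQ ID NO: 1


def apply_substitutions(seq: str, subs: List[str]) -> str:
    """Apply point substitutions written as <old><1-based position><new>, e.g. 'K37R'."""
    residues = list(seq)
    for sub in subs:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if residues[pos - 1] != old:
            # Guard against numbering mistakes: the named residue must be present.
            raise ValueError(f"expected {old} at position {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)


seq_id_1 = CSGF_1_30 + EXT_ONLP20623                       # ONLP20623
seq_id_2 = apply_substitutions(seq_id_1, ["K37R"])          # ONLP20624
# ONLP20625: N24C/K37R plus a C-terminal cysteine.
seq_id_3 = apply_substitutions(seq_id_1, ["N24C", "K37R"]) + "C"
print(seq_id_2)
print(seq_id_3)
```

Under these assumptions the constructed strings match SEQ ID NOs: 2 and 3, which places K37 within the extension and N24 within the CsgF-derived portion.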

Claims

1. A protein nanopore complex comprising:
(a) a CsgG nanopore comprising a lumen;
(b) a fusion protein comprising a first portion comprising a CsgF protein and a second portion comprising a helix-forming assisting protein, wherein the fusion protein is bound to the nanopore.

The protein nanopore complex of claim 1, wherein the first portion of the fusion protein is bound to the CsgG nanopore.

The protein nanopore complex of claim 1 or 2, wherein the first portion of the fusion protein is located within the lumen of the CsgG nanopore.

The protein nanopore complex of any one of claims 1 to 3, wherein the first portion of the fusion protein extends outside the lumen of the CsgG nanopore.

The protein nanopore complex of any one of claims 1 to 4, wherein the first portion forms a first constriction region in the lumen of the CsgG nanopore.

The protein nanopore complex of claim 5, wherein the second portion forms a second constriction region.

The protein nanopore complex of any one of claims 1 to 6, wherein the CsgG nanopore further comprises a constriction region.

The protein nanopore complex of any one of claims 1 to 7, wherein the second portion is not bound to the CsgG nanopore.

The protein nanopore complex of any one of claims 1 to 8, wherein the second portion comprises one or more alpha helices.

The protein nanopore complex of any one of claims 1 to 9, wherein each of the alpha helices contains between 0 and 15 alpha helix turns.

The protein nanopore complex of any one of claims 1 to 10, wherein the second portion comprises a first alpha helix comprising one to four alpha helix turns and a second alpha helix comprising three to six alpha helix turns.

The protein nanopore complex of claim 11, wherein the second alpha helix is packed against the first alpha helix.

The protein nanopore complex described in any one of claims 9 to 12, wherein the second portion comprises between 1 and 55 amino acid residues.

The protein nanopore complex described in any one of claims 6 to 13, wherein the distance between the first constriction region and the second constriction region is in the range of about 5 Å to about 80 Å.

The protein nanopore complex of any one of claims 1 to 14, wherein the protein nanopore complex has an axial length of greater than 90 Å, and optionally, the axial length is in the range of about 95 Å to about 160 Å.

The protein nanopore complex described in any one of claims 1 to 15, wherein the fusion protein is bound to the nanopore by a linker.

The protein nanopore complex of claim 16, wherein the linker comprises a bond, a peptide linker, or a chemical linker.

The protein nanopore complex of claim 16 or 17, wherein the linker comprises a bond formed by a sulfur(VI) fluoride exchange (SuFEx) reaction.

The protein nanopore complex of claim 16 or 17, wherein the linker comprises one or more maleimide molecules.

The protein nanopore complex described in any one of claims 1 to 19, wherein the fusion protein is cyclized.

The protein nanopore complex of claim 20, wherein the cyclization comprises one or more side chain-to-side chain cyclization bonds.

The protein nanopore complex of claim 21, wherein at least one of the side chain-to-side chain cyclization bonds is a disulfide bond.

23. A protein nanopore complex comprising:
(a) a CsgG nanopore comprising a lumen and a first constriction region formed within the lumen of the nanopore;
(b) a fusion protein comprising a first portion comprising a CsgF protein and a second portion comprising a helix-forming assisting protein, wherein the fusion protein is bound to the nanopore.

The protein nanopore complex of claim 23, wherein the first portion of the fusion protein is bound to the CsgG nanopore.

The protein nanopore complex of claim 23 or 24, wherein the first portion of the fusion protein is located within the lumen of the CsgG nanopore.

The protein nanopore complex of any one of claims 23 to 25, wherein the second portion of the fusion protein is located outside the lumen of the CsgG nanopore.

The protein nanopore complex of any one of claims 23 to 26, wherein the first portion forms a second constriction region in the lumen of the CsgG nanopore.

The protein nanopore complex of claim 27, wherein the second portion forms a third constriction region in the lumen of the CsgG nanopore.

The protein nanopore complex of any one of claims 23 to 28, wherein the second portion is not bound to the CsgG nanopore.

The protein nanopore complex of any one of claims 23 to 29, wherein the second portion comprises one or more alpha helices.

The protein nanopore complex of claim 30, wherein each of the alpha helices contains between 0 and 15 alpha helix turns.

The protein nanopore complex described in any one of claims 23 to 31, wherein the second portion comprises between 1 and 55 amino acid residues.

The protein nanopore complex described in any one of claims 23 to 32, wherein the fusion protein is cyclized.

The protein nanopore complex of claim 33, wherein the cyclization comprises one or more side chain-to-side chain cyclization bonds.

The protein nanopore complex of claim 34, wherein at least one of the side chain-to-side chain cyclization bonds is a disulfide bond.

36. A protein nanopore complex comprising:
(a) a CsgG nanopore comprising a lumen and a first constriction region formed within the lumen of the nanopore;
(b) a first auxiliary protein bound to the CsgG nanopore and forming a second constriction region in the lumen of the nanopore;
(c) a second auxiliary protein bound to the CsgG nanopore or the first auxiliary protein and forming a third constriction region.

The protein nanopore complex of claim 36, wherein the first auxiliary protein is located within the lumen of the CsgG nanopore.

The protein nanopore complex of claim 36 or 37, wherein the first auxiliary protein comprises a CsgF protein or a CsgF peptide.

The protein nanopore complex described in any one of claims 36 to 38, wherein the second auxiliary protein comprises one or more alpha helices.

The protein nanopore complex of claim 39, wherein each of the one or more alpha helices contains between 0 and 15 alpha helix turns.

The protein nanopore complex of claim 39 or 40, wherein the second auxiliary protein comprises two alpha helices.

The protein nanopore complex of claim 41, wherein one of the alpha helices contains between 1 and 6 alpha helix turns.

The protein nanopore complex of claim 41 or 42, wherein one of the alpha helices contains between 1 and 10 alpha helix turns.

The protein nanopore complex of any one of claims 41 to 43, wherein one of the alpha helices contains three alpha helix turns and the other alpha helix contains three or four alpha helix turns.

The protein nanopore complex of any one of claims 36 to 44, wherein the second auxiliary protein comprises at least one alpha helix that packs against an alpha helix of the first auxiliary protein.

The protein nanopore complex of any one of claims 36 to 45, wherein the second auxiliary protein contains between 1 and 55 amino acid residues.
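
The helix-turn and residue-count ranges recited above can be related with a short calculation. The following is only a minimal sketch: it assumes an ideal alpha helix with the textbook values of about 3.6 residues per turn and about 1.5 Å of axial rise per residue, neither of which is recited in the claims, and the function names are hypothetical.

```python
# Ideal alpha-helix geometry (textbook approximations, NOT values from the claims).
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE_ANGSTROM = 1.5


def helix_residues(turns: float) -> float:
    """Approximate residue count of an ideal alpha helix with `turns` turns."""
    return turns * RESIDUES_PER_TURN


def helix_length_angstrom(turns: float) -> float:
    """Approximate axial length, in angstroms, of an ideal alpha helix."""
    return helix_residues(turns) * RISE_PER_RESIDUE_ANGSTROM


if __name__ == "__main__":
    # 3 and 4 turns appear in the two-helix claim; 15 is the upper bound on turns.
    for turns in (3, 4, 15):
        residues = round(helix_residues(turns), 1)
        length = round(helix_length_angstrom(turns), 1)
        print(f"{turns} turns: ~{residues} residues, ~{length} Å")
```

Under these assumptions, a single helix at the 15-turn upper bound corresponds to roughly 54 residues, which sits just inside the 55-residue upper bound recited for the second auxiliary protein.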

The protein nanopore complex of any one of claims 36 to 46, wherein the distance between the first constriction region and the second constriction region is in the range of about 10 Å to about 80 Å.

The protein nanopore complex of any one of claims 36 to 47, wherein the distance between the second constriction region and the third constriction region is in the range of about 5 Å to about 80 Å.

The protein nanopore complex of any one of claims 36 to 48, wherein the protein nanopore complex has an axial length of greater than 90 Å, and optionally, the axial length is in the range of about 95 Å to about 160 Å.
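
The three geometric limitations above (constriction spacings and axial length) can be expressed as simple range checks. This is an illustrative sketch only: the function name, the dictionary keys, and the example measurements are hypothetical, and the word "about" in the claims is not modelled, so the bounds are treated as exact.

```python
def check_geometry(d12: float, d23: float, axial_length: float) -> dict:
    """Compare measured distances (in Å) against the recited ranges.

    d12: distance between the first and second constriction regions.
    d23: distance between the second and third constriction regions.
    axial_length: axial length of the whole complex.
    Bounds are treated as exact; "about" is deliberately not modelled.
    """
    return {
        "first_to_second_10_to_80": 10.0 <= d12 <= 80.0,
        "second_to_third_5_to_80": 5.0 <= d23 <= 80.0,
        "axial_length_over_90": axial_length > 90.0,
        "axial_length_95_to_160": 95.0 <= axial_length <= 160.0,
    }


if __name__ == "__main__":
    # Hypothetical measurements, e.g. taken from a structural model.
    for name, ok in check_geometry(d12=40.0, d23=25.0, axial_length=120.0).items():
        print(f"{name}: {ok}")
```

Note that an axial length between 90 Å and 95 Å would satisfy the main limitation but not the optional narrower range.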

The protein nanopore complex of any one of claims 36 to 49, wherein the first auxiliary protein and the second auxiliary protein are linked by a linker.

The protein nanopore complex of claim 50, wherein the linker comprises a bond, a peptide linker, or a chemical linker.

The protein nanopore complex of claim 50 or 51, wherein the linker comprises a bond formed by a sulfur(VI) fluoride exchange (SuFEx) reaction.

The protein nanopore complex of claim 50 or 51, wherein the linker comprises one or more maleimide molecules.

The protein nanopore complex of any one of claims 36 to 53, wherein the first auxiliary protein and the second auxiliary protein contain one or more side chain-to-side chain cyclization bonds.

The protein nanopore complex of claim 54, wherein at least one of the side chain-to-side chain cyclization bonds is a disulfide bond.

A system for characterizing a target analyte, the system comprising the protein nanopore complex of any one of claims 1 to 55 inserted into a membrane.

The system of claim 56, further comprising a conductive solution in contact with the protein nanopore complex, electrodes that provide a voltage potential across the membrane, and a measurement system that measures the current through the protein nanopore complex.

A method for characterizing a target analyte, the method comprising: (a) contacting the target analyte with the system of claim 56; (b) applying a potential across the membrane such that the target analyte moves relative to the lumen formed by the protein nanopore complex; and (c) performing one or more measurements as the target analyte moves relative to the lumen, thereby characterizing the target analyte.

The method of claim 58, wherein the target analyte comprises a target polynucleotide.

The method of claim 58 or 59, wherein step (c) comprises measuring a current through the continuous channel, the current indicating the presence and/or one or more properties of the target analyte, thereby detecting and/or characterizing the target analyte.

The method of any one of claims 58 to 60, wherein the target analyte is a polynucleotide, nucleotides in the polynucleotide interact with the first, second, and optionally, third constriction regions in the lumen, and each of the first, second, and optionally, third constriction regions can distinguish between different nucleotides such that the overall current through the lumen is affected by interactions between each of the first, second, and third constriction regions and the nucleotides located in each of the regions.
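
The idea that the overall current depends on the nucleotides sitting in each constriction at the same time can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: the additive blockade model, the baseline current, the per-base blockade values, and the spacing between constrictions (in nucleotides) are hypothetical and are not taken from the disclosure.

```python
# Toy multi-constriction current model. All numbers are HYPOTHETICAL.
BASELINE_PA = 100  # assumed open-pore current, in pA

# Assumed current blockade (pA) caused by each base at each constriction.
BLOCKADE_PA = (
    {"A": 10, "C": 12, "G": 14, "T": 16},  # first constriction
    {"A": 5, "C": 7, "G": 9, "T": 11},     # second constriction
    {"A": 2, "C": 3, "G": 4, "T": 5},      # third constriction
)

# Assumed position of each constriction along the strand, in nucleotides.
READER_OFFSETS = (0, 2, 4)


def simulate_current(sequence: str) -> list:
    """Return one current level per translocation step.

    Each level is the baseline minus the summed blockades of the bases
    occupying the three constrictions at that step. Only steps in which
    all three constrictions are occupied are reported.
    """
    span = max(READER_OFFSETS)
    levels = []
    for start in range(len(sequence) - span):
        blockade = sum(
            table[sequence[start + offset]]
            for table, offset in zip(BLOCKADE_PA, READER_OFFSETS)
        )
        levels.append(BASELINE_PA - blockade)
    return levels


if __name__ == "__main__":
    print(simulate_current("ACGTAC"))  # [79, 74]
```

In this sketch, changing only the base located at the third constriction (for example, `"ACGTA"` versus `"ACGTT"`) shifts the level from 79 pA to 76 pA, which mirrors the claim language that each constriction region contributes to the overall signal. Real nanopore signals need not be additive.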