IL322301A - Methods for the rapid identification of cefepime-resistance in - Google Patents
- Publication number
- IL322301A
- Authority
- IL
- Israel
- Prior art keywords
- mammal
- sequence
- target sequence
- isecp1
- cefepime
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/689—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Description
METHODS FOR THE RAPID IDENTIFICATION OF CEFEPIME-RESISTANCE IN ESCHERICHIA COLI
INCORPORATION BY REFERENCE
[0001] A PCT Request Form is filed concurrently with this specification as part of the present application. Each application that the present application claims benefit of or priority to as identified in the concurrently filed PCT Request Form is incorporated by reference herein in its entirety and for all purposes.
BACKGROUND
[0002] Resistance to antimicrobial agents (antibiotics) is responsible for morbidity, mortality, and health care costs on a growing, global scale (Murray et al 2022). With multidrug-resistant organisms, it is increasingly vital to select an effective antibiotic with precision, both to ensure efficacy and to curtail the profligate use of the strongest antibiotics, a practice that promotes the evolution of resistance to the drugs humanity needs most. The Gram-negative bacillus Escherichia coli is the single most common cause of infection, and the primary class of drugs used against it is the beta lactams. Cephalosporins are a multi-generation category of beta lactams, of which cefepime, a so-called fourth-generation cephalosporin, is one of the most powerful. The final category of beta lactams beyond the cephalosporins is the carbapenems, the so-called drugs of last resort. To slow the evolution of resistance against these precious agents, hospitals seek to use a next-highest-level beta lactam such as cefepime whenever it is certain to be effective against an infecting strain, sparing the use of carbapenems. Unfortunately, in vitro testing of cefepime efficacy against E. coli, particularly against the drug-resistant strains for which it is most needed, is problematic (Smith, KP, Brennan-Krohn, T, Weir, S, Kirby, JE (2017) Improved accuracy of cefepime susceptibility testing for extended-spectrum-beta-lactamase-producing Enterobacteriaceae with an on-demand digital dispensing method. J Clin Micro, 470-478 (incorporated by reference herein in its entirety and for all purposes)). This undermines the ability of hospitals to know whether cefepime can be used and likely leads to overuse of the carbapenems. A more accurate and precise means of diagnosing the efficacy of cefepime against strains of E. coli is therefore urgently needed.
[0003] While in vitro testing methods for E. coli resistance to cefepime, such as those tested in Smith et al (2017), have inherent variability and inaccuracies due to imprecise sample preparation and degradation of antibiotics in testing disposables, the full DNA sequence (whole genome sequence, or WGS) of a bacterial strain can be obtained with digital precision. Bacterial WGS contains all the information necessary to predict resistance and potentially other clinically important phenotypes, such as virulence or in vivo efficacy, that are not testable in vitro. It thus offers a more information-rich alternative, but it is not yet used to guide clinical drug selection. This is in part because validated, robust models that predict resistance from WGS at levels of accuracy at or above in vitro methods are only now emerging, and in part because the high cost of DNA sequencing previously prohibited its routine clinical use. However, the cost of DNA sequencing has fallen and its speed has risen so rapidly that it now rivals conventional tests, opening the prospect of its use to choose drugs more precisely. What is needed are methods for developing validatable, robust, data-driven machine learning models that accurately predict resistance from WGS for the drugs and species of greatest clinical importance. And, because clinical use and regulatory consideration will be encouraged by explainable models driven by biologically plausible causal elements, even better would be models that can also be dissected to identify the biologically plausible causal mechanisms underlying their accurate predictions.
[0004] Presented here are new methods for generating machine learning models predictive of resistance from WGS, along with methods to dissect their biologically causal drivers and the interplay between them. We illustrate these methods with a now-validated (Humphries et al 2023) model that predicts cefepime resistance with an accuracy not only at but beyond that achieved with in vitro tests. Further, new methods are introduced to analyze the structure of the model, in particular the features that drive its performance, which reveal elements of the mechanisms of resistance itself. These in turn yield novel hybridization targets for molecular diagnostics and may further guide the development of targeted inhibitors of these resistance mechanisms, thus paving the way for a combination drug of inhibitor plus cefepime that can be effective against cefepime-resistant E. coli while sparing the use of last-generation carbapenems, which it is in humanity's interest to reserve.
SUMMARY
[0005] In various embodiments, methods of utilizing WGS to rapidly determine whether a bacterium such as Escherichia coli is resistant to a drug such as cefepime are provided, along with methods for the discovery of the genomic elements, and the interplay between them, which underlie the model's determination of resistance.
[0006] Various embodiments provided herein may include, but need not be limited to, one or more of the following:
[0007] Embodiment 1: A method of determining whether a Gram-negative bacterium is cefepime resistant, said method comprising:
[0008] determining the presence or absence of a target sequence or sequences in the DNA of said bacterium, wherein one such target sequence corresponds to a portion of DNA sequence comprising one or more sets of nucleotides each comprising at least 8 specified contiguous nucleotides within a defined 209 base-pair (bp) region of the ISEcp1 transposon or variants thereof with at least 90% identity to the ISEcp1 sequence disclosed in Table 1, the furthest downstream of which is within 100 base pairs proximate to a start codon for a resistance gene; and where the presence of said target sequence contributes to the determination that said bacterium is cefepime-resistant.
[0009] Embodiment 2: The method of embodiment 1, wherein the resistance gene is a beta lactamase gene.
[0010] Embodiment 3: The method of embodiment 1, where said bacterium is of the species Escherichia coli, or another species of the Escherichia genus.
[0011] Embodiment 4: The method of embodiment 1, where said bacterium is of the species Klebsiella pneumoniae, or another species of the Klebsiella genus.
[0012] Embodiment 5: The method of embodiment 1, where said bacterium is of the species Pseudomonas aeruginosa, or another species of the Pseudomonas genus.
[0013] Embodiment 6: The method of embodiment 1, where said bacterium is of the species Acinetobacter baumannii, or another species of the Acinetobacter genus.
[0014] Embodiment 7: The method of embodiment 1, where said bacterium is of a species from a genus other than Escherichia, Klebsiella, Pseudomonas, or Acinetobacter.
[0015] Embodiment 8: The method of embodiment 1, where said region comprises at least 8 nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof with at least 90% identity, as shown in Table 1 (SEQ ID NO: 1).
[0016] Embodiment 9: The method according to any one of embodiments 1-8, wherein said target sequence corresponds to a DNA sequence that includes at least 8, or at least 15, or at least 20, or at least 30, or at least 40, or at least 50, or at least 60, or at least 70, or at least 80, or at least 90, or at least 100, or at least 110, or at least 120, or at least 130, or at least 140, or at least 150 contiguous nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof with at least 90% identity, as shown in Table 1 (SEQ ID).
[0017] Embodiment 10: The method of embodiment 9, wherein a portion of said target sequence corresponds to a DNA sequence that comprises a promoter sequence in the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof with at least 90% identity, as shown in Table 1 (SEQ ID).
[0018] Embodiment 11: The method of embodiment 10, wherein said target sequence corresponds to a DNA sequence that comprises a nucleotide sequence ranging from bp 1543 to 1595 of the ISEcp1 transposon shown in Table 1 (SEQ ID).
[0019] Embodiment 12: The method according to any one of embodiments 1-11, wherein said target corresponds to a DNA sequence that comprises the full 209 contiguous nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence shown in Table 1.
[0020] Embodiment 13: The method according to any one of embodiments 1-12, wherein said DNA sequence is a sequence in the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence shown in Table 1.
[0021] Embodiment 14: The method according to any one of embodiments 1-13, wherein said target sequence has at least 90% sequence identity, or at least 95% sequence identity, or at least 98% sequence identity, or 100% sequence identity to a DNA sequence comprising a portion of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence, as shown in Table 1.
[0022] Embodiment 15: The method according to any one of embodiments 1-14, wherein said target sequence has 100% sequence identity with a portion of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants where said variants have at least 90% sequence identity to the sequence shown in Table 1.
[0023] Embodiment 16: The method according to any one of embodiments 1-15, wherein said determination of the presence or absence of said target sequence comprises sequencing at least a portion of the genome expected to contain said target sequence.
[0024] Embodiment 17: The method of embodiment 16, wherein said sequencing comprises sequencing the full genome of said bacterium.
[0025] Embodiment 18: The method of embodiment 17, wherein said sequencing comprises analyzing nucleic acid sequence from said bacterium to output a prediction of cefepime resistance.
[0026] Embodiment 19: The method of embodiment 18, wherein said analyzing comprises using a model or machine learning model to receive whole genome sequence data and output a prediction of cefepime resistance.
[0027] Embodiment 20: The method of embodiment 19, wherein said prediction of cefepime resistance is a surrogate marker for the presence of said transposon.
[0028] Embodiment 21: The method according to any one of embodiments 16-20, wherein said sequencing is performed by a method selected from the group consisting of sequencing by synthesis, sequencing by binding, sequencing by highly multiplexed hybridization, and nanopore sequencing.
[0029] Embodiment 22: The method according to any one of embodiments 1-15, wherein said determining the presence or absence of said target sequence comprises performing a nucleic acid amplification reaction to amplify said target sequence.
[0030] Embodiment 23: The method of embodiment 22, wherein said nucleic acid amplification reaction comprises an amplification system selected from the group consisting of a polymerase chain reaction (PCR), a real time polymerase chain reaction (rtPCR), Self-Sustained Sequence Reaction (3SR), a Nucleic acid Based Transcription Assay (NASBA), a Transcription Mediated Amplification (TMA), a Strand Displacement Amplification (SDA), a Helicase-Dependent Amplification (HDA), a Loop-Mediated isothermal amplification (LAMP), a stem-loop amplification, an isothermal multiple displacement amplification (IMDA), a single primer isothermal amplification (SPIA), a circular helicase-dependent amplification (cHDA), and a Recombinase Polymerase Amplification (RPA).
[0031] Embodiment 24: The method of embodiment 23, wherein said nucleic acid amplification reaction comprises PCR or rtPCR.
[0032] Embodiment 25: The method according to any one of embodiments 1-15, wherein said determining the presence or absence of said target sequence comprises in situ hybridization with a probe that hybridizes to said target sequence.
[0033] Embodiment 26: The method according to any one of embodiments 1-25, wherein said E. coli is obtained from a biological sample from a mammal having an E. coli infection.
[0034] Embodiment 27: The method of embodiment 26, wherein said biological sample comprises a sample selected from the group consisting of a cell or tissue culture, blood, saliva, cerebrospinal fluid, urine, stool, bronchial aspirates, tracheal lavage, pleural fluid, lymph, sputum, semen, needle aspirates, punch biopsies, surgical biopsies, and wound swab.
[0035] Embodiment 28: The method according to any one of embodiments 26-27, wherein said mammal is a mammal identified as having a pathology selected from the group consisting of a urinary tract infection (UTI), pneumonia, cellulitis, a liver abscess, a surgical wound infection, gastroenteritis, endocarditis, diabetic foot ulcers, and osteomyelitis.
[0036] Embodiment 29: The method according to any one of embodiments 1-28, wherein said mammal is a human.
[0037] Embodiment 30: The method according to any one of embodiments 1-28, wherein said mammal is a non-human mammal.
[0038] Embodiment 31: The method according to any one of embodiments 1-30, wherein the presence of a portion of said ISEcp1 transposon is identified and said method further comprises guiding antibiotic treatment of said mammal for a cefepime-resistant Gram-negative bacterial infection.
[0039] Embodiment 32: The method of embodiment 31, wherein said treatment comprises treatment of said mammal with one or more drugs used to treat cefepime resistant Gram-negative bacterial infections.
[0040] Embodiment 33: The method of embodiment 32, wherein said treatment of said mammal comprises treatment of said mammal with a combination of mecillinam and cefotaxime.
[0041] Embodiment 34: The method of embodiment 32, wherein said treatment of said mammal comprises treatment of said mammal with a combination of cefepime and sulbactam.
[0042] Embodiment 35: The method of embodiment 32, wherein treating said mammal comprises treatment of said mammal with a carbapenem.
[0043] Embodiment 36: The method of embodiment 35, wherein treating said mammal comprises treatment of said mammal with ertapenem.
[0044] Embodiment 37: The method of embodiment 32, wherein treating said mammal comprises treatment of said mammal with one or more drugs selected from the group consisting of imipenem, meropenem, doripenem, piperacillin-tazobactam, nitrofurantoin, fosfomycin, pivmecillinam, mecillinam-clavulanate, temocillin, pivmecillinam, and colistin.
[0045] Embodiment 38: The method of embodiment 37, wherein treating said mammal comprises treatment of said mammal with one or more drugs selected from the group consisting of imipenem, meropenem, doripenem, pivmecillinam, and mecillinam-clavulanate.
[0046] Embodiment 39: The method of embodiment 37, wherein treating said mammal comprises treatment of said mammal with piperacillin-tazobactam.
[0047] Embodiment 40: The method of embodiment 37, wherein treating said mammal comprises treatment of said mammal with nitrofurantoin and/or fosfomycin.
[0048] Embodiment 41: The method of embodiment 37, wherein treating said mammal comprises treatment of said mammal with temocillin, pivmecillinam, and/or colistin.
[0049] Embodiment 42: The method of embodiment 37, wherein treating said mammal comprises treatment of said mammal with pivmecillinam.
[0050] Embodiment 43: The method of embodiment 37, wherein treating said mammal comprises treatment of said mammal with mecillinam-clavulanate.
[0051] Embodiment 44: The method according to any one of embodiments 32-43, wherein said treatment comprises prescribing said one or more drugs.
[0052] Embodiment 45: The method according to any one of embodiments 32-43, wherein said treatment comprises administering said one or more drugs.
[0053] Embodiment 46: The method according to any one of embodiments 32-43, wherein said treatment comprises providing said one or more drugs to said mammal.
[0054] Embodiment 47: A method of treating a mammal having a Gram-negative bacterial infection, said method comprising:
[0055] identifying said Gram-negative bacterium in a biological sample from said mammal as cefepime resistant using the method according to any one of embodiments 1-25; and
[0056] treating said mammal for a cefepime resistant Gram-negative bacterial infection.
[0057] Embodiment 48: The method of embodiment 47, wherein said biological sample comprises a sample selected from the group consisting of a cell or tissue culture, blood, saliva, cerebrospinal fluid, urine, stool, bronchial aspirates, tracheal lavage, pleural fluid, lymph, sputum, semen, needle aspirates, punch biopsies, surgical biopsies, and wound swab.
[0058] Embodiment 49: The method according to any one of embodiments 47-48, wherein said mammal is a mammal identified as having a pathology selected from the group consisting of a urinary tract infection (UTI), pneumonia, cellulitis, a liver abscess, a surgical wound infection, gastroenteritis, endocarditis, diabetic foot ulcers, and osteomyelitis.
[0059] Embodiment 50: The method according to any one of embodiments 47-49, wherein said mammal is a human.
[0060] Embodiment 51: The method according to any one of embodiments 47-49, wherein said mammal is a non-human mammal.
[0061] Embodiment 52: The method according to any one of embodiments 47-51, wherein said treating comprises treatment of said mammal with one or more drugs used to treat cefepime resistant Gram-negative infections.
[0062] Embodiment 53: The method of embodiment 52, wherein treating said mammal comprises treatment of said mammal with a combination of mecillinam and cefotaxime.
[0063] Embodiment 54: The method of embodiment 52, wherein treating said mammal comprises treatment of said mammal with a combination of cefepime and sulbactam.
[0064] Embodiment 55: The method of embodiment 52, wherein treating said mammal comprises treatment of said mammal with a carbapenem.
[0065] Embodiment 56: The method of embodiment 55, wherein treating said mammal comprises treatment of said mammal with ertapenem.
[0066] Embodiment 57: The method of embodiment 52, wherein treating said mammal comprises treatment of said mammal with one or more drugs selected from the group consisting of imipenem, meropenem, doripenem, piperacillin-tazobactam, nitrofurantoin, fosfomycin, pivmecillinam, mecillinam-clavulanate, temocillin, pivmecillinam, and colistin.
[0067] Embodiment 58: The method of embodiment 57, wherein treating said mammal comprises treatment of said mammal with one or more drugs selected from the group consisting of imipenem, meropenem, doripenem, pivmecillinam, and mecillinam-clavulanate.
[0068] Embodiment 59: The method of embodiment 57, wherein treating said mammal comprises treatment of said mammal with piperacillin-tazobactam.
[0069] Embodiment 60: The method of embodiment 57, wherein treating said mammal comprises treatment of said mammal with nitrofurantoin and/or fosfomycin.
[0070] Embodiment 61: The method of embodiment 57, wherein treating said mammal comprises treatment of said mammal with temocillin, pivmecillinam, and/or colistin.
[0071] Embodiment 62: The method of embodiment 57, wherein treating said mammal comprises treatment of said mammal with pivmecillinam.
[0072] Embodiment 63: The method of embodiment 57, wherein treating said mammal comprises treatment of said mammal with mecillinam-clavulanate.
[0073] Embodiment 64: The method according to any one of embodiments 52-63, wherein said treatment comprises prescribing said one or more drugs.
[0074] Embodiment 65: The method according to any one of embodiments 52-63, wherein said treatment comprises administering said one or more drugs.
[0075] Embodiment 66: The method according to any one of embodiments 52-63, wherein said treatment comprises providing said one or more drugs to said mammal.
[0076] Embodiment 67: A kit for determining whether a Gram-negative bacterium is cefepime resistant, said kit comprising:
[0077] a container containing one or more primers and/or probes for amplifying and/or detecting the presence or absence of a target sequence in the DNA of said bacterium, wherein said target sequence comprises at least 8 contiguous nucleotides of a region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof with at least 90% sequence identity to the sequence shown in Table 1, where said region comprises or consists of a portion of 209 contiguous nucleotides of said transposon proximate to a start codon for a resistance gene.
[0078] Embodiment 68: The kit of embodiment 67, where said target sequence comprises a sequence corresponding to the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence shown in Table 1.
[0079] Embodiment 69: The kit according to any one of embodiments 67-68, wherein said target sequence comprises at least 10, or at least 15, or at least 20, or at least 30, or at least 40, or at least 50, or at least 60, or at least 70, or at least 80, or at least 90, or at least 100, or at least 110, or at least 120, or at least 130, or at least 140, or at least 150 contiguous nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% identity to the sequence shown in Table 1.
[0080] Embodiment 70: The kit of embodiment 69, wherein said target sequence comprises a promoter sequence in the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% identity to the sequence shown in Table 1.
[0081] Embodiment 71: The kit of embodiment 70, wherein said target sequence comprises a nucleotide sequence ranging from bp 1543 to 1595 of the ISEcp1 transposon shown in Table 1.
[0082] Embodiment 72: The kit according to any one of embodiments 67-71, wherein said target sequence comprises the full 209 contiguous nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence shown in Table 1.
[0083] Embodiment 73: The kit according to any one of embodiments 67-72, wherein said kit comprises primers for the amplification of said target sequence.
[0084] Embodiment 74: The kit according to any one of embodiments 67-73, wherein said kit comprises a probe that hybridizes to said target sequence.
[0085] Embodiment 75: The kit according to any one of embodiments 67-74, wherein said kit comprises instructional materials teaching the use of components of the kit to identify cefepime resistant Gram-negative bacteria.
[0086] Embodiment 76: A method of treating a mammal having a Gram-negative bacterial infection caused by a bacterium, said method comprising:
[0087] administering an antibacterial agent to the mammal; and
[0088] inhibiting activity of a target sequence in the DNA of said bacterium in the mammal, wherein said target sequence corresponds to a portion of DNA sequence comprising one or more sets of nucleotides each comprising at least 8 specified contiguous nucleotides within a defined 209 base pair region of an ISEcp1 transposon or a variant thereof with 90% identity to the ISEcp1 sequence disclosed in Table 1, the closest of which is within 100 base pairs proximate to a start codon for a resistance gene.
[0089] Embodiment 77: The method of embodiment 76, further comprising, prior to administering the antibacterial agent to the mammal, identifying said Gram-negative bacterium in a biological sample from said mammal as cefepime resistant using the method according to any one of embodiments 1-25.
[0090] Embodiment 78: The method according to any one of embodiments 76-77, wherein said mammal is a mammal identified as having a pathology selected from the group consisting of a urinary tract infection (UTI), pneumonia, cellulitis, a liver abscess, a surgical wound infection, gastroenteritis, endocarditis, diabetic foot ulcers, and osteomyelitis.
[0091] Embodiment 79: The method according to any one of embodiments 76-78, wherein said mammal is a human.
[0092] Embodiment 80: The method according to any one of embodiments 76-79, wherein said mammal is a non-human mammal.
[0093] Embodiment 81: The method according to any one of embodiments 76-80, wherein the resistance gene is a beta lactamase gene.
[0094] Embodiment 82: The method according to any one of embodiments 76-81, wherein the antibacterial agent is a β-lactam antibiotic.
[0095] Embodiment 83: The method of embodiment 82, wherein the antibacterial agent is cefepime.
[0096] Embodiment 84: The method according to any one of embodiments 76-83, wherein inhibiting activity of the target sequence comprises at least partially blocking access to the target sequence.
[0097] Embodiment 85: The method according to any one of embodiments 76-84, wherein inhibiting activity of the target sequence comprises at least partially blocking access of a polymerase to the target sequence.
[0098] Embodiment 86: The method according to any one of embodiments 76-83, wherein inhibiting activity of the target sequence comprises disrupting integrity of the target sequence.
[0099] Embodiment 87: The method of embodiment 86, wherein inhibiting activity of the target sequence comprises cleaving or nicking the target sequence.
[0100] Embodiment 88: The method according to any one of embodiments 76-87, wherein inhibiting activity of the target sequence comprises administering a gene editing agent to the mammal, wherein the gene editing agent targets the target sequence.
[0101] Embodiment 89: The method of embodiment 88, wherein the gene editing agent comprises a nuclease selected from the group consisting of a zinc-finger nuclease, a transcription activator-like effector nuclease, and a CRISPR-Cas genome-editing nuclease.
[0102] Embodiment 90: The method according to any one of embodiments 76-89, wherein said treatment comprises prescribing said antibacterial agent and an agent for inhibiting activity of the target sequence to said mammal.
[0103] Embodiment 91: The method according to any one of embodiments 76-90, wherein said treatment comprises administering said antibacterial agent and an agent for inhibiting activity of the target sequence to said mammal.
[0104] Embodiment 92: The method according to any one of embodiments 76-91, wherein said treatment comprises providing said antibacterial agent and an agent for inhibiting activity of the target sequence to said mammal.
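By way of illustration only, the sequence-based determination recited in Embodiment 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed model: the function names (`region_kmers`, `find_target_hits`, `has_nearby_start_codon`) and the toy sequences are hypothetical, only exact k-mer matches on the forward strand are considered, and only ATG is treated as a start codon. A real analysis would apply the recited coordinates to the sequence of Table 1.

```python
def region_kmers(reference, start, end, k=8):
    """Return the set of k-mers in reference[start..end] (1-based, inclusive)."""
    region = reference[start - 1:end].upper()
    return {region[i:i + k] for i in range(len(region) - k + 1)}


def find_target_hits(genome, kmers, k=8):
    """Return 0-based genome positions at which a region k-mer begins."""
    genome = genome.upper()
    return [i for i in range(len(genome) - k + 1) if genome[i:i + k] in kmers]


def has_nearby_start_codon(genome, hits, k=8, window=100):
    """True if an ATG lies wholly within `window` bases downstream of the
    furthest-downstream hit (simplification: ATG only, forward strand only)."""
    if not hits:
        return False
    hit_end = max(hits) + k
    return genome.upper().find("ATG", hit_end, hit_end + window) != -1


# Hypothetical toy data standing in for the Table 1 sequence and a genome.
toy_reference = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGC"
toy_kmers = region_kmers(toy_reference, start=5, end=24)

# A "carrier" genome embeds the toy region upstream of an ATG; the other does not.
carrier = "CCCC" + toy_reference[4:24] + "GGAGGTTTT" + "ATGAAACGC"
non_carrier = "CCCCGGGGCCCCGGGGCCCCGGGG"

carrier_hits = find_target_hits(carrier, toy_kmers)
carrier_call = has_nearby_start_codon(carrier, carrier_hits)
non_carrier_call = has_nearby_start_codon(
    non_carrier, find_target_hits(non_carrier, toy_kmers)
)
print(carrier_call, non_carrier_call)
```

Under these assumptions, a positive call is one input that contributes to a resistance determination; variants with at least 90% identity would require approximate rather than exact matching.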
DEFINITIONS
[0105]Cefepime ((6R,7R,Z)-7-(2-(2-aminothiazol-4-yl)-2-(methoxyimino)acetamido)-3-((1-methylpyrrolidinium-1-yl)methyl)-8-oxo-5-thia-1-aza-bicyclo[4.2.0]oct-2-ene-2-carboxylate) is a fourth-generation cephalosporin antibiotic. Cefepime has an extended spectrum of activity against Gram-positive and Gram-negative bacteria.
[0106]The term "drug resistant bacterium" refers to a bacterium that shows reduced susceptibility to a drug, e.g., to a drug that is typically used to treat an infection by the bacterium. Reduced susceptibility refers to a reduced ability of the drug to inhibit growth, and/or infectivity, and/or proliferation, and/or lifetime, and/or virulence of the bacterium. In certain embodiments a drug will have essentially no effect on the bacterium with respect to inhibition of growth, and/or infectivity, and/or proliferation, and/or lifetime, and/or virulence of the bacterium. A "cefepime resistant Escherichia coli" bacterium refers to an E. coli that is resistant to the drug cefepime, without being bound to a particular theory of drug resistance in E. coli.
[0107]A "target sequence that corresponds to a DNA sequence" refers to a nucleic acid sequence that has at least 90% sequence identity with the referenced DNA sequence. In certain embodiments the target sequence has at least 95%, or at least 98%, or at least 99%, or 100% sequence identity with the referenced DNA sequence along the full length of the target sequence.
[0108]The term "whole genome sequencing", or "WGS", also known as full genome sequencing, complete genome sequencing, or entire genome sequencing, refers to the process of determining nearly the entirety of the DNA sequence of an organism's genome at a single time. As used herein, whole genome sequencing can include sequencing plasmid DNA, where present, as well as chromosomal DNA. Illustrative, but non-limiting examples of whole genome sequencing techniques include sequencing by synthesis (e.g., dye sequencing and pyrosequencing), single-molecule real-time sequencing, and nanopore sequencing. In various embodiments, whole genome sequencing employs a shotgun strategy, i.e., parallelization and template generation via genome fragmentation.
[0109]The terms "ISEcp1 transposon" or "ISEcp1B transposon" refer to insertion sequences (transposons) weakly related to other IS elements and belonging to the IS1380 family (Chandler & Mahillon (2002) Insertion sequences revisited, p. 305-366. In N. L. Craig, R. Craigie, M. Gellert, and A. M. Lambowitz (ed.), Mobile DNA II, ASM Press, Washington, D.C.). ISEcp1B differs from ISEcp1 (GenBank accession no. AJ242809) by three nucleotide substitutions. Their inverted repeat (IR) sequences are identical, and their transposases differ by a single amino acid change (Poirel et al. (2003) Antimicrob. Agents Chemother. 47: 2938-2945). While the ISEcp1 transposon is highly conserved in E. coli, it will be recognized that some variance may occur in different E. coli strains. Accordingly, in certain embodiments, an ISEcp1 transposon refers to a nucleic acid sequence in an E. coli strain or other Gram-negative strain that has at least 90%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99% sequence identity with ISEcp1 or ISEcp1B.
[0110]The term "target sequence" as used herein refers to a nucleic acid sequence whose presence is indicative of the domain of an ISEcp1 transposon that is indicative of cefepime resistance in E. coli, as described herein.
[0111]The terms "amplification" or "nucleic acid amplification" are used interchangeably to refer to any means by which at least a part of at least one target nucleic acid is reproduced, typically in a template-dependent manner, including without limitation, a broad range of techniques for amplifying nucleic acid sequences, either linearly or exponentially. Illustrative means for performing an amplifying step include PCR, nucleic acid sequence-based amplification (NASBA), two-step multiplexed amplifications, rolling circle amplification (RCA), and the like, including multiplex versions and combinations thereof, for example but not limited to, OLA/PCR, PCR/OLA, LDR/PCR, PCR/PCR/LDR, PCR/LDR, LCR/PCR, PCR/LCR (also known as combined chain reaction, CCR), helicase-dependent amplification (HDA), and the like. Descriptions of such techniques can be found in, among other sources, Ausubel et al.; PCR Primer: A Laboratory Manual, Dieffenbach, Ed., Cold Spring Harbor Press (1995); The Electronic Protocol Book, Chang Bioscience (2002); Msuih et al., J. Clin. Micro. 34:501- (1996); The Nucleic Acid Protocols Handbook, R. Rapley, ed., Humana Press, Totowa, N.J. (2002); Abramson et al., Curr Opin Biotechnol. 1993 Feb.;4(1):41-7; U.S. Pat. No. 6,027,998; U.S. Pat. No. 6,605,451; Barany et al., PCT Publication No. WO 97/31256; Wenz et al., PCT Publication No. WO 01/112579; Day et al., Genomics, 29(1): 152-162 (1995); Ehrlich et al., Science 252:1643-50 (1991); Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press (1990); Favis et al., Nature Biotechnology 18:561-64 (2000); Rabenau et al., Infection 28:97-102 (2000); Belgrader, Barany, and Lubin, Development of a Multiplex Ligation Detection Reaction DNA Typing Assay, Sixth International Symposium on Human Identification, 1995 (available on the world wide web at: promega.com/geneticidproc/ussymp6proc/blegrad.html); LCR Kit Instruction Manual, Cat. #200520, Rev. #050002, Stratagene, 2002; Barany, Proc. Natl. Acad. Sci. USA 88:188- (1991); Bi and Sambrook, Nucl. Acids Res. 25:2924-2951 (1997); Zirvi et al., Nucl. Acid Res. 27:e40i-viii (1999); Dean et al., Proc Natl Acad Sci USA 99:5261-66 (2002); Barany and Gelfand, Gene 109:1-11 (1991); Walker et al., Nucl. Acid Res. 20:1691-96 (1992); Polstra et al., BMC Inf. Dis. 2:18- (2002); Lage et al., Genome Res. 2003 Feb.;13(2):294-307; Landegren et al., Science 241:1077-80 (1988); Demidov, V., Expert Rev Mol Diagn. 2002 Nov.;2(6):542-8; Cook et al., J Microbiol Methods. 2003 May;53(2):165-74; Schweitzer et al., Curr Opin Biotechnol. 2001 Feb.;12(1):21-7; U.S. Pat. No. 5,830,711; U.S. Pat. No. 6,027,889; U.S. Pat. No. 5,686,243; PCT Publication No. WO0056927A3; and PCT Publication No. WO9803673A1.
[0112]As used herein, the terms "identity" or "sequence identity" (e.g., "percent identity") with respect to an amino acid sequence or a nucleotide sequence disclosed herein refer to a relationship between two or more amino acid sequences or between two or more nucleotide sequences. When a position in one sequence is occupied by the same nucleic acid base or amino acid in the corresponding position of the comparator sequence, the sequences are said to be "identical" at that position. The percentage of "sequence identity" is calculated by determining the number of positions at which the identical nucleic acid base or amino acid occurs in both sequences to yield the number of "identical" positions. The number of "identical" positions is then divided by the total number of positions in the comparison window and multiplied by 100 to yield the percentage of "sequence identity." Percentage of "sequence identity" is determined by comparing two optimally aligned sequences over a comparison window. In order to optimally align sequences for comparison, the portion of a nucleotide or amino acid sequence in the comparison window can comprise additions or deletions, termed gaps, while the reference sequence is kept constant. An optimal alignment is that alignment which, even with gaps, produces the greatest possible number of "identical" positions between the reference and comparator sequences. Percentage "sequence identity" between two sequences can be determined using, e.g., the program "BLAST", which is available from the National Center for Biotechnology Information, and which incorporates the programs BLASTN (for nucleotide sequence comparison) and BLASTP (for amino acid sequence comparison), which programs are based on the algorithm of Karlin and Altschul ((1993). Proc. Natl. Acad. Sci. USA. 90(12): 5873-5877). In typical embodiments the BLAST program is used with default parameters.
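By way of illustration only, the arithmetic of the percent identity definition above can be sketched in Python. The sketch assumes the two sequences have already been optimally aligned (for example by BLAST) and are supplied as equal-length strings in which "-" marks a gap; the function name percent_identity and the toy sequences are hypothetical and are not taken from this disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Assumption: the inputs are already aligned, so they have equal length
    and gaps are written with the `gap` character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    # Count positions occupied by the same base in both sequences;
    # a gap never counts as an identical position.
    identical = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    # Divide by the total number of positions in the window, times 100.
    return 100.0 * identical / len(a)


# Toy example (hypothetical sequences): one mismatch and one gap in 10 positions.
reference = "ACGTACGTAC"
comparator = "ACGTTCG-AC"
identity = percent_identity(reference, comparator)
meets_threshold = identity >= 90.0  # e.g., an "at least 90%" criterion
```

In this toy case 8 of the 10 window positions are identical, so the pair would not satisfy a 90% identity criterion. The sketch does not perform the alignment itself.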
[0113]The phrase "treating a mammal", e.g., for a cefepime resistant E. coli, refers to the provision of care (healthcare) to a mammal having a cefepime resistant E. coli infection. In certain illustrative, but non-limiting embodiments, such treating can refer to the actual administration of one or more drugs, to the prescription of one or more drugs, or simply to the provision of one or more drugs to a subject for self-administration or to a caregiver for administration to the subject.
[0114]The terms "subject," "individual," and "patient" may be used interchangeably and refer to humans, as well as non-human mammals (e.g., non-human primates, canines, equines, felines, porcines, bovines, ungulates, lagomorphs, and the like).
BRIEF DESCRIPTION OF THE DRAWINGS
[0115]Figure 1 presents a schematic representation of a decision tree as may be employed in many classes of machine learning model.
[0116]Figure 2 presents an example general structure of computational elements of a computational system that may be used to train and use a computational model for predicting the resistance or susceptibility of a pathogen to a drug or drug class.
[0117]Figure 3 shows receiver operating characteristic (ROC) curves for validated model results. Thin lines correspond to validation results on each of 5 folds of 20% of the database completely held out from training; thick lines correspond to the mean of all iterations, and shaded areas correspond to the mean plus standard deviation. The optimal model showed a categorical agreement of 97.5% and a sensitivity of 98.5%.
[0118]Figure 4 illustrates how ISEcp1 can act as a promoter of CTX-M in a strain of E. coli. Adapted from Poirel et al. (2003) Antimicrob. Agents Chemother. 47: 2938-2945.
Figure 5 identifies the locations of the top 17 features on E. coli genomes that are predictive of cefepime resistance using an example ML model.
[0119]Figure 6 provides a more detailed representation of the plot in Figure 5 and highlights how the ranking of the features drops off from the highest ranked feature to the twentieth ranked feature.
[0120]Figure 7 presents a chart illustrating locations of some top-ranked features on a 209 base pair long segment near the 3' end of the 1656 base pair ISEcp1 gene of E. coli.
[0121]Figure 8 illustrates the 209 base pair segment of the ISEcp1 gene (also shown in Figure 7) sitting upstream of the CTX-M beta lactamase gene.
[0122]Figure 9 shows a pie chart of high ranked ISEcp1 feature presence in all pathogens in the training set used to train the model of Figures 5-8.
[0123]Figure 10 shows a pie chart of high ranked ISEcp1 feature presence in those pathogens of the training set that exhibited cefepime resistance.
[0124]Figures 11-13 show how the presence and number of ISEcp1 features in E. coli strains correlate with minimum inhibitory concentration (MIC) values of cefepime in those strains.
DETAILED DESCRIPTION
[0125]In various embodiments, methods are provided for the use of whole genome sequencing to rapidly and accurately determine resistance to a drug, such as cefepime, in strains of bacteria such as Escherichia coli (E. coli) or other Gram-negative or Gram-positive bacteria. Also provided are methods of treatment of subjects (mammals) infected with cefepime resistant E. coli. The rapid identification of cefepime resistance of the E. coli implicated in an infection is believed to provide appropriate treatment options substantially earlier in the course of the infection than previous methods, and such early identification and appropriate treatment are believed to result in substantially better outcomes for the subject.
[0126]In various embodiments, using the methods described herein, exemplified by their application to cefepime-resistant E. coli, infections can be identified in as little as 1/2 hour via hybridization to a target defined by the methods disclosed herein, which is far more rapid than the hours or more for "gold standard" culture methods previously utilized. Additionally, it is believed the methods provided herein show a higher positive detection rate and a lower false positive rate than previous methods of identifying cefepime resistant E. coli infections. As noted above, it is believed such rapid diagnosis and treatment can lead to a significantly better prognosis for patients.
Identification of cefepime resistant E. coli
[0127]Using a machine learning method for the analysis of E. coli (and other) genomes, together with training and testing data from a curated database comprising 1,782 sequence/susceptibility pairs for E. coli cefepime resistance, a machine learning model for predicting cefepime resistance was developed. Notably, this model showed 97.5% accuracy (categorical agreement) and 98.5% sensitivity, well beyond the accuracies reported for in vitro methods of identifying cefepime resistant bacteria (see, e.g., Smith KP, Brennan-Krohn T, Weir S, Kirby JE. Improved Accuracy of Cefepime Susceptibility Testing for Extended-Spectrum-Beta-Lactamase-Producing Enterobacteriaceae with an On-Demand Digital Dispensing Method. J Clin Microbiol. 2017 55:470-478. doi: 10.1128/JCM.02128-16.).
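For orientation, the two performance figures quoted above can be computed from a confusion table as sketched below in Python. This is a minimal sketch that assumes a two-category call (resistant versus susceptible) with "resistant" treated as the positive class; the function names and the counts are invented for illustration and are not data from this disclosure.

```python
def categorical_agreement(tp: int, tn: int, fp: int, fn: int) -> float:
    """Percent of isolates whose predicted category matches the reference category."""
    total = tp + tn + fp + fn
    return 100.0 * (tp + tn) / total


def sensitivity(tp: int, fn: int) -> float:
    """Percent of reference-resistant isolates that are predicted resistant."""
    return 100.0 * tp / (tp + fn)


# Hypothetical counts, chosen only so the arithmetic is easy to follow:
# tp = resistant called resistant, fn = resistant called susceptible,
# tn = susceptible called susceptible, fp = susceptible called resistant.
agreement = categorical_agreement(tp=45, tn=40, fp=10, fn=5)
sens = sensitivity(tp=45, fn=5)
```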
[0128]The resistance prediction model was then used to discover genomic elements underpinning resistance. The features with the greatest contribution to the predictions of the ML model were selected, each feature being a set of not necessarily contiguous k-mers with similar predictive information, and all the k-mers in each such feature were mapped against the genomes of a set of resistant strains. The most important machine learning features were, surprisingly, not the beta lactamase (bla) genes themselves, which were often also present in susceptible strains, but rather regions of a transposon, ISEcp1, situated just upstream of the bla genes, suggesting that ISEcp1 may control bla expression and thus cefepime resistance in E. coli, as indeed has been seen experimentally (Poirel et al. 2003).
[0129]More specifically, it was determined that a 209 bp region of an ISEcp1 transposon upstream of (and proximal to) a beta lactamase gene, often a CTX-M, is highly conserved, and that the presence of this region together with certain CTX-M alleles, such as CTX-M- but not CTX-M-11, is predictive of cefepime resistance. In cases where this region of ISEcp1 was absent, the model checked for the presence of one of the carbapenemases KPC or NDM, and in their absence declared the strain likely to be susceptible to cefepime. This machine learning model was over 96% accurate in internal validation and was externally validated against a blind testing set (Humphries et al. 2023). Moreover, the model could be dissected to define a set of biologically plausible causal elements: to wit, the key subsection of the transposon together with certain effective alleles of CTX-M or, in the alternative, the presence of a carbapenemase, predicts resistance of an E. coli strain to cefepime. Finally, variations at the bp level within the 209 bp key section of ISEcp1 that the model defines are shown to be predictive of lower MIC, presumably because they are associated with incremental weakening of the strong promoter function inferred to be provided by the version of the 209 bp region specified by the variables (each a set of k-mers) that map to it in the model.
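The decision logic summarized in the preceding paragraph can be pictured with the following minimal Python sketch. It is a simplification for illustration and not the trained model: the function name, the argument names, and the placeholder allele label "CTX-M-X" are hypothetical, the set of effective CTX-M alleles is left as a caller-supplied parameter because it is not enumerated here, and the handling of a strain that carries the region without an effective allele (falling through to the carbapenemase check) is an assumption.

```python
from typing import Iterable, Set

CARBAPENEMASES = {"KPC", "NDM"}  # carbapenemases named in the text


def call_cefepime_phenotype(
    has_region: bool,
    ctxm_alleles: Iterable[str],
    effective_alleles: Set[str],
    other_genes: Iterable[str],
) -> str:
    """Return "resistant" or "susceptible" from presence/absence calls."""
    # Rule 1: the 209 bp region jointly present with an effective CTX-M allele.
    if has_region and any(a in effective_alleles for a in ctxm_alleles):
        return "resistant"
    # Rule 2: otherwise, look for a carbapenemase (KPC or NDM).
    if any(g in CARBAPENEMASES for g in other_genes):
        return "resistant"
    # Rule 3: neither signal found, so the strain is called likely susceptible.
    return "susceptible"


# Hypothetical usage; "CTX-M-X" stands in for whichever alleles are effective.
effective = {"CTX-M-X"}
call_1 = call_cefepime_phenotype(True, ["CTX-M-X"], effective, [])
call_2 = call_cefepime_phenotype(False, [], effective, ["NDM"])
call_3 = call_cefepime_phenotype(False, ["CTX-M-X"], effective, [])
```

Keeping the allele list outside the function means the rule can be re-run as the set of alleles considered effective is revised.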
[0130]Additionally, it is believed that detection of the 209 bp region (or a sequence within the 209 bp region) offers greater accuracy than detection of the ISEcp1 transposon in its entirety, as it was found that regions of the transposon outside the 209 bp region can be absent in cefepime resistant E. coli. In such circumstances those bacteria might not be identified as cefepime resistant by methods that involve detection of regions of the ISEcp1 transposon outside the 209 bp region identified herein. This demonstrates that analysis of the ML model led to a level of precision in specifying the key subregions within the 209 bp region not achieved in the genetic engineering experiments of Poirel et al. (2003), in which the entire transposon was either inserted or not inserted in order to quantify its promoter effect. The design of either hybridization-based diagnostics or inhibitors of resistance would require the precise localization of predictive and functional regions of ISEcp1 enabled by the methods disclosed.
[0131]Accordingly, in certain embodiments, methods of determining whether an Escherichia coli bacterium is cefepime resistant are provided, where the method comprises determining the presence or absence of this region or a fragment thereof, and where the presence of the region or fragment thereof is indicative of cefepime resistance in the analyzed E. coli strain. Typically, such methods will involve providing a sample (e.g., a biological sample) comprising the bacteria in question, and analyzing/assaying DNA from the bacteria for a target sequence that has a specified sequence identity (e.g., at least 90% sequence identity) with a DNA sequence that comprises all or certain subsections of the nucleotide sequence of the 209 bp region or fragment thereof. Where the target sequence is found/identified, the E. coli bacterium is thereby identified as cefepime resistant.
[0132]It is noted that the nucleotide sequences of the ISEcp1 transposon and the ISEcp1B transposon as a whole are well known to those of skill in the art and are shown below in Table 1. However, it was not known that only all or subsections of a 209 bp region are relevant to cefepime resistance, and that the remainder of the ISEcp1 transposon may be absent with resistance nevertheless intact.
[0133]Typically, the target sequence will have a length sufficient to be dispositive of the presence of at least one of the promoter elements contained within the 209 bp region. Accordingly, in certain embodiments, the target sequence will correspond to a DNA sequence that comprises at least 8 contiguous nucleotides, or at least 10 contiguous nucleotides, or at least 15 contiguous nucleotides, or at least 20 contiguous nucleotides, or at least 25 contiguous nucleotides, or at least 30 contiguous nucleotides, or at least 40 contiguous nucleotides, or at least 50 contiguous nucleotides, or at least 60 contiguous nucleotides, or at least 70 contiguous nucleotides, or at least 80 contiguous nucleotides, or at least 90 contiguous nucleotides, or at least 100 contiguous nucleotides, or at least 110 contiguous nucleotides, or at least 120 contiguous nucleotides, or at least 130 contiguous nucleotides, or at least 140 contiguous nucleotides, or at least 150 contiguous nucleotides found in a 152 bp terminal region of an ISEcp1 (or ISEcp1B) transposon immediately upstream of a beta lactamase gene. In certain embodiments the target sequence corresponds to a DNA sequence that comprises the full-length 209 bp terminal region. In certain embodiments the target sequence corresponds to a DNA sequence that comprises a 209 bp terminal region of an ISEcp1 transposon immediately upstream from a beta lactamase region. In certain embodiments, the target sequence comprises at least one DNA sequence of at least 8 contiguous nucleotides within 100 base pairs proximate to a start codon for a resistance gene.
Table 1. ISEcp1 transposon sequence (SEQ ID NO: 1). A 152 bp region immediately upstream of a beta lactamase gene is shown shaded, while promoter elements in this region are underlined.
1-60        cctagattctacgtcagtacttcaaaaagcataatcaaagccttgataaatatgcattcc
61-120      ttcgaaattcagctttcacccattgggtgaaagaaaagtgctcaaaaatatgttaaatta
121-180     tcagcttttatgactcgatatatggtaaaataatagtaagaaaagtagtaaaaaggggtt
181-240     ctaattatgattaataaaattgatttcaaagctaagaatctaacatcaaatgcaggtctt
241-300     tttctgctccttgagaatgcaaaaagcaatgggatttttgattttattgaaaatgacctc
301-360     gtatttgataatgactcaacaaataaaatcaagatgaatcatataaagaccatgctctgc
361-420     ggtcacttcattggcattgataagttagaacgtctaaagctacttcaaaatgatcccctc
421-480     gtcaacgagtttgatatttccgtaaaagaacctgaaacagtgtcacggtttctaggaaac
481-540     ttcaacttcaagacaacccaaatgtttagagacattaattttaaagtctttaaaaaactg
541-600     ctcactaaaagtaaattgacatccattacgattgatattgatagtagtgtaattaacgta
601-660     gaaggtcatcaagaaggtgcgtcaaaaggatataatcctaagaaactgggaaaccgatgc
661-720     tacaatatccaatttgcattttgcgacgaattaaaagcatatgttaccggatttgtaaga
721-780     agtggcaatacttacactgcaaacggtgctgcggaaatgatcaaagaaattgttgctaac
781-840     atcaaatcagacgatttagaaattttatttcgaatggatagtggctactttgatgaaaaa
841-900     attatcgaaacgatagaatctcttggatgcaaatatttaattaaagccaaaagttattct
901-960     acactcacctcacaagcaacgaattcatcaattgtattcgttaaaggagaagaaggtaga
961-1020    gaaactacagaactgtatacaaaattagttaaatgggaaaaagacagaagatttgtcgta
1021-1080   tctcgcgtactgaaaccagaaaaagaaagagcacaattatcacttttagaaggttccgaa
1081-1140   tacgactactttttctttgtaacaaatactaccttgctttctgaaaaagtagttatatac
1141-1200   tatgaaaagcgtggtaatgctgaaaactatatcaaagaagccaaatacgacatggcggtg
1201-1260   ggtcatctcttgctaaagtcattttgggcgaatgaagccgtgtttcaaatgatgatgctt
1261-1320   tcatataacctatttttgttgttcaagtttgattccttggactcttcagaatacagacag
1321-1380   caaataaagacctttcgtttgaagtatgtatttcttgcagcaaaaataatcaaaaccgca
1381-1440   agatatgtaatcatgaagttgtcggaaaactatccgtacaagggagtgtatgaaaaatgt
1441-1500   ctggtataataagaatatcatcaataaaattgagtgttgctctgtggataacttgcagag
1501-1656   [sequence not recoverable from the source text]
The 209 bp diagnostic region is shaded. Inverted repeats are in bold. Promoter elements are underlined.
[0134]In certain embodiments the target sequence corresponds to a DNA sequence that comprises a region of the 209 bp diagnostic region that comprises (e.g., predominantly comprises) a promoter element. Accordingly, in certain embodiments, the target sequence corresponds to a DNA sequence that comprises or consists of a sequence corresponding to nucleotides 1543-1597 of SEQ ID NO: 1 or to nucleotides 1553 to 1587 of SEQ ID NO: 1. In certain embodiments the target sequence corresponds to a DNA sequence that comprises or consists of a nucleic acid sequence corresponding to a sequence within nucleotides 1543-1597 of SEQ ID NO: 1 or within nucleotides 1553 to 1587 of SEQ ID NO: 1. In certain embodiments the target sequence corresponds to a DNA sequence that comprises or consists of a nucleotide sequence corresponding to at least 8, or at least 10, or at least 15, or at least 18, or at least 20, or at least 25, or at least 30, or at least 40, or at least 50 contiguous nucleotides within the sequence ranging from nt 1438-1656, or a subsection thereof, of SEQ ID NO: 1.
In certain embodiments, a target sequence corresponds to one of the sequences in the following table.
Table 2
ISEcp1-mapping feature   Sequence                                        Start   End
1                        AATAATGTTACAATGTGTGAGAAGCAGTCTAAATTCTT          1569    1606
2                        AAATAGTGATTTTTGAAGCTAATAAAAAACA                 1611    1641
3                        AATCATTTTTGATAAATCATTGATTTCATCTT                1523    1554
4                        TGAAATAGTGATTTTTGAAGCTAATAAAAAAC                1609    1640
5                        ATCATTGATTTCATCTTTGCTGCAATGATACT                1508    1539
6                        CCTAAATTCCACGTGTGTTTTTTATTAGCTTCAAAAATCACTATT   1612    1656
7                        AACACTCAATTTTATTGATGATATTCTTATT                 1448    1478
8                        AGCTAATAAAAAACACACGTGGAATTTAGGG                 1627    1656
9                        AATTGAGTGTTGCTCTGTGGATAACTTGCAGAG               1468    1500
These sequences are those of nine of the top ranked features described in Example 2. The sequences in Table 2 are ordered as they appear in Figure 7, top to bottom.
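As a purely illustrative sketch of how a Table 2 sequence might be looked for in assembled sequence data, the Python below searches a contig for exact occurrences of a feature on either strand. Exact string matching is a simplification chosen for brevity (the embodiments above also contemplate sequence identity below 100%); the function names and the toy contig are hypothetical, and only the feature sequence itself is taken from Table 2.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_feature(contig: str, feature: str) -> List[Tuple[int, str]]:
    """Return (0-based offset, strand) for every exact hit of `feature`.

    "+" means the feature occurs as written; "-" means its reverse
    complement occurs, i.e., the feature lies on the opposite strand.
    """
    contig = contig.upper()
    hits = []
    for strand, query in (("+", feature.upper()), ("-", reverse_complement(feature))):
        start = contig.find(query)
        while start != -1:
            hits.append((start, strand))
            start = contig.find(query, start + 1)
    return sorted(hits)


# Sequence listed at positions 1468-1500 in Table 2.
FEATURE = "AATTGAGTGTTGCTCTGTGGATAACTTGCAGAG"

# Toy contigs (hypothetical): the feature flanked by three filler bases,
# and the same contig read from the opposite strand.
forward_contig = "GGG" + FEATURE + "CCC"
reverse_contig = reverse_complement(forward_contig)

forward_hits = find_feature(forward_contig, FEATURE)
reverse_hits = find_feature(reverse_contig, FEATURE)
```

Searching both strands matters because an assembler may report a contig in either orientation.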
[0135]It will be noted that some variation in the nucleotide sequence of an ISEcp1 transposon can occur between various E. coli strains. Accordingly, in certain embodiments, identification of a target sequence that has at least 90%, or at least 95%, or at least 98%, or at least 99% sequence identity with the corresponding region of the ISEcp1 sequence shown in Table 1, or a subsequence thereof, and described above is taken as indicative that the E. coli at issue is cefepime resistant. In certain embodiments the target sequence has 100% sequence identity with the referenced DNA sequence.
Machine Learning Models
[0136]In certain embodiments, a machine learning (ML) model is designed or configured to receive, as inputs, features extracted from whole or partial genome sequences of pathogen samples under consideration (e.g., E. coli samples suspected of harboring cefepime resistance). The features may take any of various forms that represent genomic (including plasmid) sequence information. The ML model may be implemented in various systems comprising hardware and software. Examples are presented in connection with Figure 2, described below.
[0137]The first step in development of the machine learning model is the choice of representation of the WGS. One choice is the string of approximately 5M raw bases (A, C, T, G), while another is the set of so-called k-mers (strings of k bases) that are present. Many representations, including k-mer representations, suffer from redundancy, with multiple instances of features that highly correlate, which reduces the computational efficiency of the model and makes it difficult to analyze which genomic elements drive the model's results. Thus, a part of the method is a recoding of the WGS into variables that are less redundant than individual k-mers. In certain embodiments multiple k-mers with identical or sufficiently similar predictive information can be combined into an equivalence class, and such equivalence classes may serve as the variables (i.e., the features) in a machine learning model to predict drug resistance, in vivo efficacy, virulence, protein production, or other phenotype. In certain embodiments, the k-mers of an equivalence class feature need not be contiguous to be part of the feature.
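One simple way to picture the recoding described above is sketched below in Python. The sketch groups k-mers that share an identical presence/absence pattern across a set of genomes; this grouping criterion is an assumption made for illustration, since the text also allows "sufficiently similar" predictive information, and the function names and toy genomes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings (k-mers) present in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def equivalence_classes(
    genomes: Dict[str, str], k: int
) -> Dict[Tuple[bool, ...], FrozenSet[str]]:
    """Group k-mers by their presence/absence pattern across the genomes.

    Keys are tuples of booleans, one per genome in sorted-name order;
    values are the k-mers sharing that pattern. K-mers in one class need
    not be adjacent in any genome.
    """
    names = sorted(genomes)
    per_genome = {name: kmers(genomes[name], k) for name in names}
    grouped = defaultdict(set)
    for kmer in set().union(*per_genome.values()):
        pattern = tuple(kmer in per_genome[name] for name in names)
        grouped[pattern].add(kmer)
    return {pattern: frozenset(members) for pattern, members in grouped.items()}


# Toy genomes (hypothetical, far shorter than a real ~5M base genome).
toy_genomes = {"s1": "ACGTAC", "s2": "ACGTTT", "s3": "GGGTTT"}
classes = equivalence_classes(toy_genomes, k=3)
```

In the toy example eight distinct 3-mers collapse into four classes, so a model would see four variables instead of eight perfectly correlated pairs.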
[0138]Using the equivalence class feature representation described above, many classes of ML or other predictive model may be employed for assessing cefepime resistance in an E. coli in a sample. Examples include regression models, regularized linear models, support vector machines, decision trees, random forest models, gradient boosted tree models (e.g., XGBoost models), deep nets, neural networks, and autoencoders.
[0139]One class of machine learning model employs decision trees. A schematic representation of part of one decision tree is illustrated in Figure 1. Each node of a decision tree represents a decision based on the presence or absence of a genomic element in the pathogen under consideration (e.g., E. coli that are possibly cefepime-resistant). If the genomic element is present in the pathogen, the analysis follows a branch (e.g., the right branches in Figure 1) to the connected element in the next layer below. The next element considers the presence or absence of a different genomic element in the pathogen. In this way, for any given pathogen under consideration, the analysis charts a path from the top node to one of the outputs of the bottom layer nodes. Each of these outputs has a value that contributes either positively or negatively to the likelihood that the pathogen is resistant or susceptible, respectively. Each of the variables at the nodes in the decision tree is a variable (feature) that was used to train the model. For example, each one may be an equivalence class such as described above. The example in Figure 1 is merely illustrative. In various embodiments, the number of trees and their depths could be greater or lesser. In many cases, the model comprises more decision trees and/or trees with greater depths.
[0140]One may identify the features that are important for determinations of drug resistance using models such as gradient boosted models and elastic net models. In other words, certain features may be ranked for contribution to a model's prediction of the resistance or susceptibility of a pathogen strain to a drug. Examples of considerations that may be employed in a ranking of features include (1) a numerical magnitude of the contribution a feature makes to the model's final prediction of resistance or susceptibility to a drug, (2) the frequency with which paths including the feature are used in determining the result (prediction of resistance/susceptibility), and (3) the frequency with which the variable is associated with the phenotype being predicted, for example the frequency with which it is present in resistant or susceptible strains.
[0141]As an example, the XGBoost package provides five metrics to identify the features which are important to the model: gain, total gain, weight, coverage, and total coverage. "Gain" computes a numerical contribution, "coverage" counts the number of times a variable (feature) is involved in a result (i.e., the number of times the path through the decision tree passes through a variable regardless of whether that variable is present in the sample being analyzed), and "weight" simply counts how many times the variable appears in the set of trees, regardless of whether it is used. In certain embodiments, ranking employs Total Gain and Total Coverage. In one implementation, these metrics are averaged across the five folds, in each of which the model is built with 80% of the training set. If a variable is far more likely to be present in non-susceptible (NS) than in susceptible (S) strains, it may be of more relevance to predicting resistance. To this end, the ranking may compile the probabilities PNS and PS with which a variable is present in the NS and S strains, respectively, and multiply the product of Average Total Gain and Average Total Coverage by their ratio. In some cases, the ranking accounts for variables for which PS is 0 by, e.g., adding 0.001 to the denominator.
[0142]In some embodiments, the ranking employs a metric that focuses on variables which are very often present in NS strains. Such a metric may employ a heuristic based on the features' contribution (gain), their ubiquity in the decision rendered (coverage), the degree to which they are more predictive of NS versus S, and their prevalence in the NS population. As an example, the following Rank Value employs a product of these factors:
Rank Value = Average Total Gain x Average Total Coverage x (PNS/PS) x PNS
Rank Value was computed for each of 1,010 variables which an XGBoost model utilized in an Example below, and the variables were thereby ranked.
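The Rank Value heuristic can be sketched in Python as the product of the four factors named in the text: gain, coverage, the NS-to-S presence ratio, and prevalence in NS strains. The function and variable names, the example feature labels, and all numeric inputs below are hypothetical. Substituting 0.001 for the denominator only when PS is exactly 0 is one reading of the zero-handling described above, adopted here as an assumption.

```python
from typing import Dict, List, Tuple


def rank_value(
    avg_total_gain: float,
    avg_total_coverage: float,
    p_ns: float,
    p_s: float,
    eps: float = 0.001,
) -> float:
    """Product of gain, coverage, the NS/S presence ratio, and NS prevalence.

    p_ns and p_s are the fractions of non-susceptible and susceptible
    strains in which the variable is present. When p_s is 0 the
    denominator is replaced by `eps` to avoid division by zero (assumption).
    """
    denominator = p_s if p_s > 0 else eps
    return avg_total_gain * avg_total_coverage * (p_ns / denominator) * p_ns


def rank_features(stats: Dict[str, Tuple[float, float, float, float]]) -> List[str]:
    """Return feature names ordered from highest to lowest Rank Value."""
    return sorted(stats, key=lambda name: rank_value(*stats[name]), reverse=True)


# Hypothetical inputs: (avg total gain, avg total coverage, P_NS, P_S).
example_stats = {
    "feature_a": (2.0, 3.0, 0.50, 0.25),
    "feature_b": (4.0, 3.0, 0.10, 0.40),
    "feature_c": (1.0, 1.0, 0.50, 0.00),
}
ordering = rank_features(example_stats)
```

With these invented numbers, a feature absent from every susceptible strain ranks first even though its gain and coverage are the smallest, which shows how strongly the ratio term drives the heuristic.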
Detection of the cefepime resistance from sequence information id="p-143"
id="p-143"
[0143]Methods of identifying a target sequence within a bacterial genome are well known to those of skill in the art. Illustrative methods include, but are not limited to, sequencing of all or a portion of the genome of the E. coli in question, in situ hybridization with one or more probes that hybridize (e.g., specifically hybridize) to the region(s) of ISEcpl, and amplification of all or a portion of the region of an ISEcpl described herein. In some embodiments, sequencing includes sequencing nucleic acids of the bacterial chromosome and of any plasmids that are present in the strain.
[0144]Typically the methods involve provision of a sample (e.g., a biological sample) comprising E. coli for which cefepime resistance is to be determined. In certain embodiments the sample can simply be an E. coli culture sample for which cefepime resistance is to be determined. In certain embodiments the sample comprises a biological sample derived from a mammal (e.g., a mammal with an infection characterized by E. coli). Illustrative biological samples include but are not limited to a cell or tissue culture, blood, saliva, cerebrospinal fluid, urine, stool, bronchial aspirates, tracheal lavage, pleural fluid, lymph, sputum, semen, needle aspirates, punch biopsies, surgical biopsies, a wound swab, and the like.
[0145]In certain embodiments the biological sample is obtained from a subject that has a pathology characterized by an E. coli infection. Such pathologies include, but are not limited to, a urinary tract infection (UTI), pneumonia, cellulitis, a liver abscess, a surgical wound infection, gastroenteritis, endocarditis, diabetic foot ulcers, osteomyelitis, pressure sores, certain blood stream infections, meningitis, and the like.
[0146]It will also be noted that the sample need not be limited to a biological sample from a mammal. In certain embodiments the sample can be a sample from a food (e.g., a prepared food product), an agricultural product, beef, milk, water or other beverage, environmental samples (e.g., soil), and the like.
[0147]E. coli are isolated from the sample using methods well known to those of skill in the art. Such methods can involve concentrating and isolating the bacteria by filtration and/or centrifugation (see, e.g., Ream et al. (2013) Molecular Microbiology Laboratory, 2nd ed. Elsevier Inc.; Tille (2015) Diagnostic Microbiology, 15th ed., Elsevier Inc.; and the like).
[0148]In certain embodiments, in situ hybridization methods can be performed directly on the isolated E. coli.
[0149]In certain embodiments DNA can be isolated from the E. coli for subsequent sequencing and/or amplification. Methods of isolating DNA from bacteria, especially from gram-negative bacteria such as E. coli, are well known to those of skill in the art. Illustrative, but non-limiting, DNA extraction methods include, for example, guanidine thiocyanate treatment and silica column purification (see, e.g., Boom et al. (1990) J. Clin. Microbiol. 28: 495-503), heating in an ethanol alkaline solution followed by centrifugation (see, e.g., Vingataramin & Frost (2015) BioTechniques, 58(3): 120-125), and the like. It is also noted that many commercial kits are available for isolation of bacterial DNA (see, e.g., DNEASY® blood and tissue kit and DNEASY® Powersoil kits from Qiagen).
[0150]It is also noted that in certain embodiments, isolation of the DNA prior to amplification is not required. Thus, for example, Song et al. (2021) doi.org/10.1101/2021.03.01.433496 describe an automation-friendly direct PCR approach for bacterial analysis.
[0151]In certain embodiments the target sequence(s) are identified by direct sequencing of all or a portion of the genome of the E. coli for which cefepime resistance is to be determined. Methods of whole genome sequencing for bacteria are well known to those of skill in the art (see, e.g., Quainoo et al. (2017) Clin. Microbiol. Rev. 30(4): 1015).
[0152]Illustrative, but non-limiting, examples of bacterial genome sequencing methods include sequencing-by-synthesis (SBS) (Illumina), single-molecule real-time (SMRT) sequencing by Pacific Biosciences (see, e.g., Rhoads & Au (2015) Genomics Proteomics Bioinformatics 13: 278-289), single molecule nanopore sequencing by Oxford Nanopore Technologies, and the like.
[0153]Given the known nucleotide sequence(s) of the target sequences described herein, identification of those sequences in the sequenced bacterial genome is straightforward, e.g., by manual or computer inspection of the bacterial DNA sequence data, applying an appropriate percentage sequence identity threshold (see above).
[0154]In various embodiments the sequencing comprises analyzing a nucleic acid sequence from said bacterium to output a prediction of cefepime resistance. In certain embodiments the analysis is performed using a computer and can involve simple searching of received sequence information for target sequences having, e.g., at least 95%, or at least 98%, or at least 99%, or 100% sequence identity with all or a portion of the 152 bp diagnostic region of an ISEcpl intron described herein.
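As a minimal sketch of such a computer search, the following Python code slides a target sequence along a sequenced read or contig and reports the best ungapped percentage identity. It is illustrative only: the function names are hypothetical, the sequences shown are short placeholders rather than the diagnostic region itself, gaps and the reverse-complement strand are not considered, and a practical implementation would more likely rely on an established alignment tool.

```python
def best_window_identity(read, target):
    """Return (best_identity, offset) over all ungapped windows of `read`.

    Identity is the fraction of positions at which the window and the
    target carry the same base. Returns (0.0, -1) if nothing matches.
    """
    n = len(target)
    best = (0.0, -1)
    for i in range(len(read) - n + 1):
        window = read[i:i + n]
        matches = sum(a == b for a, b in zip(window, target))
        identity = matches / n
        if identity > best[0]:
            best = (identity, i)
    return best


def contains_target(read, target, min_identity=0.95):
    """True if some window of `read` meets the identity threshold."""
    identity, _ = best_window_identity(read, target)
    return identity >= min_identity


# Placeholder sequences (hypothetical, for demonstration only).
target = "ACGTACGTACGTACGTACGT"
read = "TTTT" + "ACGTACGTACTTACGTACGT" + "GGGG"  # one mismatch in 20 bases

if __name__ == "__main__":
    print(best_window_identity(read, target))
    print(contains_target(read, target, min_identity=0.95))
    print(contains_target(read, target, min_identity=0.98))
```

Raising `min_identity` from 0.95 to 0.98 makes the single-mismatch placeholder read fail the test, which mirrors the alternative identity thresholds recited above.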
[0155]However, in certain embodiments, the method can involve a model, such as a machine learning model, that receives whole genome sequence data and outputs a prediction of cefepime resistance. In certain embodiments such a prediction of cefepime resistance is achieved by identification of the presence of and variations in the sequence of the 209 bp region of the ISEcpl transposon or a subsequence as described herein.
[0156]Various ML approaches, and in particular feature selection methods, have been applied to discover molecular biomarkers and classify clinical cases. In particular, as noted herein, a machine learning model utilizing, for example, training and testing data comprising sequence/susceptibility pairs for E. coli cefepime resistance can readily be used to optimize detection/prediction of cefepime resistance in E. coli. Methods using k-mer features directly (as contrasted to the typically 20-30-fold compressed representation achieved by the k-mer equivalence class features described above) have been widely developed and used in concert with various ML models (logistic regression, XGBoost, etc.) (e.g., Nguyen et al. 2018, 2019; Ferreira et al. 2020).
[0157]In certain embodiments, as mentioned herein, a ML model is designed or configured to receive, as inputs, features extracted from whole or partial genome sequences of E. coli samples under consideration. The features provide a compact representation of the genome sequence. For example, they may comprise sets of k-mer strings, which may or may not be found in groups of contiguous k-mers, that have common occurrence patterns in a population of E. coli samples. In certain embodiments multiple k-mer strings that often occur together within strains of a given phenotype serve as an equivalence class, and such equivalence classes may serve as features in a machine learning model for drug resistance. In certain embodiments, the equivalence class features that include multiple k-mers need not be contiguous k-mers to be part of the feature.
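The grouping of k-mers into equivalence classes can be illustrated with a short Python sketch. The sketch makes a simplifying assumption that is not taken from the disclosure: two k-mers are placed in the same class only when they have exactly the same presence/absence pattern across all strains. The function names, strain names, and toy sequences are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def equivalence_classes(genomes, k):
    """Group k-mers that share an identical occurrence pattern.

    genomes maps a strain name to its sequence. The result maps each
    presence/absence pattern (one boolean per strain, in sorted strain
    order) to the sorted list of k-mers showing that pattern.
    """
    names = sorted(genomes)
    kmer_sets = {name: kmers(genomes[name], k) for name in names}
    all_kmers = set().union(*kmer_sets.values())
    classes = defaultdict(list)
    for km in sorted(all_kmers):
        pattern = tuple(km in kmer_sets[name] for name in names)
        classes[pattern].append(km)
    return dict(classes)


# Toy strains (hypothetical sequences, far shorter than a real genome).
toy = {"s1": "AAACCC", "s2": "AAAGGG"}

if __name__ == "__main__":
    for pattern, members in equivalence_classes(toy, k=3).items():
        print(pattern, members)
```

Each class then contributes a single feature to the model instead of one feature per k-mer, which is how the representation becomes more compact; the members of a class need not be adjacent in the genome.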
[0158]Examples of ML classes that may use equivalence class representations as described herein include regression models, regularized linear models, support vector machines, decision trees, random forest models, gradient boosted trees, neural networks, autoencoders, and XGBoost models.
[0159]In certain embodiments the target sequence whose identification is indicative of cefepime resistance in the subject E. coli is readily identified using in situ hybridization methods.
[0160]In situ hybridization (ISH), and in particular, fluorescence in situ hybridization (FISH), is a molecular assay that is readily applied for detection of one or more target nucleic acids within a bacterium (bacterial strain and/or isolate). This method is based on the specific binding of small oligonucleotides (probes) to particular target nucleic acids (e.g., DNA) within the subject bacteria. Typically, probes are selected that are complementary to the target nucleic acid that is to be detected and are labeled with a detectable label (e.g., a fluorescent label, a radioactive label, a colorimetric label, an enzymatic label, etc.). In certain embodiments the probes range in length from about 10 or about 20 or about 30 or about 40 or about 50 nucleotides up to about 150, or up to about 140, or up to about 130, or up to about 120, or up to about 110, or up to about 100, or up to about 90, or up to about 80, or up to about 70, or up to about nucleotides in length. Suitable detectable labels are well known by those of skill in the art and can readily be obtained from Molecular Probes, Inc. (Fisher Scientific).
[0161]More recently, peptide nucleic acid (PNA) probes have been developed for in situ hybridization. These molecules mimic DNA and establish a stronger bond, since they have a neutrally charged repeated N-(2-aminoethyl)glycine unit instead of the negatively charged sugar-phosphate backbone. The use of these molecules in FISH technology has made the procedure more robust, quicker, and more efficient and allowed the development of several PNA-FISH methods for the detection of pathogens.
[0162]Using the teachings provided herein with respect to target sequences indicative of cefepime resistance, numerous probes and in situ hybridization protocols will be readily available to one of skill in the art to rapidly identify cefepime resistant E. coli.
[0163]In certain embodiments the target sequence whose identification is indicative of cefepime resistance in the subject E. coli is readily identified using nucleic acid amplification methods. Such amplification methods are well known to those of skill in the art and include, but are not limited to, polymerase chain reaction (PCR), real time polymerase chain reaction (rtPCR), Self-Sustained Sequence Replication (3SR), Nucleic acid Based Transcription Assay (NASBA), Transcription Mediated Amplification (TMA), Strand Displacement Amplification (SDA), Helicase-Dependent Amplification (HDA), Loop-Mediated isothermal amplification (LAMP), stem-loop amplification, isothermal multiple displacement amplification (IMDA), single primer isothermal amplification (SPIA), circular helicase-dependent amplification (cHDA), Recombinase Polymerase Amplification (RPA), and the like. In certain embodiments the target sequence is amplified using polymerase chain reaction (PCR) or real time PCR (rtPCR).
[0164]Methods of amplifying target nucleic acids (e.g., bacterial DNA) are well known to those of skill in the art and suitable protocols can readily be found, for example, in Rolfs et al. (1994) Methods in DNA Amplification, Springer, Boston, MA; Ausubel et al. PCR Primer: A Laboratory Manual, Diffenbach, Ed., Cold Spring Harbor Press (1995); The Electronic Protocol Book, Chang Bioscience (2002); Msuih et al. (1996) J. Clin. Micro. 34: 501-07; The Nucleic Acid Protocols Handbook, R. Rapley, ed., Humana Press, Totowa, N.J. (2002); Abramson et al., Curr Opin Biotechnol. 1993 Feb.;4(1):41-7, U.S. Pat. No. 6,027,998; U.S. Pat. No. 6,605,451, Barany et al., PCT Publication No. WO 97/31256; Wenz et al., PCT Publication No. WO 01/112579; Day et al., Genomics, 29(1): 152-162 (1995), Ehrlich et al., Science 252:1643-(1991); Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press (1990); Favis et al., Nature Biotechnology 18:561-64 (2000); and Rabenau et al., Infection 28:97-102 (2000); Belgrader, Barany, and Lubin, Development of a Multiplex Ligation Detection Reaction DNA Typing Assay, Sixth International Symposium on Human Identification, 1995 (available on the world wide web at: promega.com/geneticidproc/ussymp6proc/blegrad.html-); LCR Kit Instruction Manual, Cat. #200520, Rev. #050002, Stratagene, 2002; Barany, Proc. Natl. Acad. Sci. USA 88:188-(1991); Bi and Sambrook, Nucl. Acids Res. 25:2924-2951 (1997); Zirvi et al., Nucl. Acids Res. 27:e40i-viii (1999); Dean et al., Proc Natl Acad Sci USA 99:5261-66 (2002); Barany and Gelfand, Gene 109:1-11 (1991); Walker et al., Nucl. Acids Res. 20:1691-96 (1992); Polstra et al., BMC Inf. Dis. 2:18- (2002); Lage et al., Genome Res. 2003 Feb.;13(2):294-307, and Landegren et al., Science 241:1077-80 (1988), Demidov, V., Expert Rev Mol Diagn. 2002 Nov.;2(6):542-8., Cook et al., J Microbiol Methods. 2003 May;53(2):165-74, Schweitzer et al., Curr Opin Biotechnol. 2001 Feb.;12(1):21-7, U.S. Pat. No. 5,830,711, U.S. Pat. No. 6,027,889, U.S. Pat. No. 5,686,243, PCT Publication No. WO0056927A3, and PCT Publication No. WO9803673A1, and the like.
[0165]It is also noted that many software programs exist to determine and optimize nucleic acid amplification reactions. Such programs include, but are not limited to, Primer-Blast by NCBI (//www.ncbi.nlm.nih.gov/tools/primer-blast/), Primer3 (//primer3.ut.ee/), Primer3Plus (www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi/), PrimerQuest (Integrated DNA Technologies (IDT), www.idtdna.com/pages/tools/primerquest?returnurl=%2FPrimerquest%2FHome%2FIndex), OligoPerfect, PerlPrimer (open source GUI, //perlprimer.sourceforge.net/), OLIGO (oligo.net), AutoPrime (www.autoprime.de/AutoPrimeWeb), and the like.
[0166]In this context, it is noted that primers for PCR amplification typically range from about 10 or 15 nt up to about 30 nt in length, and it is generally accepted that the optimal length of PCR primers is 18-22 bp. This length is long enough for adequate specificity and short enough for primers to bind easily to the template at the annealing temperature.
[0167]By way of illustration, PCR primers that amplify a 121 nt product within the promoter region of the ISEcpl region that is determinative of cefepime resistance are shown in Table 3.
Table 3. Illustrative PCR primers.
Primer          Sequence (5'->3')                       Template strand  Length  Start  Stop  Tm     GC%
Forward primer  TGCTCTGTGGATAACTTGCAGA (SEQ ID NO:2)    Plus             22      72     93    59.70  45.45
Reverse primer  AGACTGCTTCTCACACATTGTA (SEQ ID NO:3)    Minus            22      192    171   57.39  40.91
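The length, GC%, and product-size entries of Table 3 can be cross-checked with a few lines of Python. This is an illustrative sketch only: the helper names are hypothetical, the space inside the reverse primer sequence is assumed to be an extraction artifact (the stated length of 22 supports this), and Tm is not recomputed because it depends on the salt and concentration model used by the design software.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    primer = primer.upper()
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)


def amplicon_length(fwd, rev):
    """Product size from the 5' end of each primer (inclusive coordinates)."""
    return rev["start"] - fwd["start"] + 1


# Values transcribed from Table 3.
forward = {"seq": "TGCTCTGTGGATAACTTGCAGA", "start": 72, "stop": 93}
reverse = {"seq": "AGACTGCTTCTCACACATTGTA", "start": 192, "stop": 171}

if __name__ == "__main__":
    for label, primer in (("forward", forward), ("reverse", reverse)):
        print(label, len(primer["seq"]), round(gc_percent(primer["seq"]), 2))
    print("product (nt):", amplicon_length(forward, reverse))
```

The computed values (22 nt and 45.45% GC for the forward primer, 22 nt and 40.91% GC for the reverse primer, and a 121 nt product) agree with the table and with the product size stated in the text.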
[0168]It is noted that the above-identified PCR primers are illustrative and non-limiting. Using the teachings provided herein with respect to target sequences indicative of cefepime resistance, numerous PCR and other amplification protocols will be readily available to one of skill in the art to rapidly identify cefepime resistant E. coli.
Treatment of subjects with cefepime resistant E. coli infections.
[0169]A wide range of antimicrobial agents effectively inhibit the growth of E. coli and are used to treat various E. coli infections. In particular, the β-lactams, fluoroquinolones, aminoglycosides and trimethoprim-sulfamethoxazole have often been used to treat community and hospital infections due to E. coli (see, e.g., Pitout (2012) Expert Rev. Anti. Infect. Ther. 10: 1165-1176). The β-lactam antibiotics, especially the cephalosporins and β-lactam-β-lactamase inhibitor combinations, are major drug classes used to treat community-onset or hospital-acquired infections caused by E. coli, especially the multi-drug resistant pathogens known as extended-spectrum beta lactamase (ESBL) pathogens, for which cefepime is a primary drug in many US hospitals. Among E. coli, β-lactamase production remains the most important contributing factor to β-lactam resistance. β-lactamases are bacterial enzymes that inactivate β-lactam antibiotics by hydrolysis, which results in ineffective compounds (see, e.g., Jacoby (2009) Clin. Microbiol. Rev. 22: 161-182).
[0170]E. coli is an important cause of community and nosocomial-acquired infections, especially of urinary tract infections, bloodstream infections, surgical site infections, pneumonia and sepsis. The penicillins, fluoroquinolones, and trimethoprim-sulfamethoxazole are considered first-line agents and are often used to treat community and hospital infections caused by E. coli; however, the management of infections has been complicated by the emergence of antimicrobial resistance to first-line antibiotics.
[0171]Extended-spectrum β-lactamases or ESBLs are enzymes that have the ability to hydrolyse the penicillins, cephalosporins and monobactams, but not the cephamycins and carbapenems. Although ESBLs have been identified in a range of Enterobacteriaceae, they are most often present in E. coli and K. pneumoniae. The majority of ESBLs identified in clinical isolates during the 1980s to 1990s were of the SHV or TEM types, which evolved from parent enzymes such as TEM-1, -2 and SHV-1. A different type of ESBL, named CTX-M β-lactamases, originated from environmental Kluyvera spp., and gained prominence in the early 2000s with reports of clinical isolates of E. coli producing these enzymes from Europe, Africa, Asia, South and North America (see, e.g., Pitout et al. (2005) J. Antimicrob. Chemother. 56: 52-59). Since the mid-2000s, the prevalence of CTX-M β-lactamases has increased significantly in E. coli from various parts of the world, and today they have become the most widespread and common type of ESBL (Id.).
[0172]CTX-M-producing E. coli are important causes of community-onset urinary tract infections, bacteraemia and intra-abdominal infections (Id.). Risk factors associated with infections caused by CTX-M-producing E. coli include the following: repeat UTIs, underlying renal pathology, previous antibiotics including cephalosporins and fluoroquinolones, previous hospitalization, nursing home residency, older age in males and females, diabetes mellitus, underlying liver pathology and international travel to high-risk areas such as the Indian subcontinent (see, e.g., Rodriguez-Bano & Pascual (2008) Exp. Rev. Anti. Infect. Ther. 6: 671-683).
[0173]Surveys from several countries worldwide have illustrated an alarming trend of associated resistance to other classes of antimicrobial agents among CTX-M-producing E. coli that included trimethoprim-sulfamethoxazole, tetracycline, gentamicin, tobramycin and ciprofloxacin (Pitout et al. (2005) J. Antimicrob. Chemother. 56: 52-59).
[0174]Escherichia coli possess a chromosomal gene that encodes an AmpC β-lactamase, and resistance to the fourth generation cephalosporins (e.g., cefepime) is often caused by point mutations in AmpC β-lactamases, which are then called extended-spectrum cephalosporinases (Jacoby (2009) Clin. Microbiol. Rev. 22: 161-182). The genes are typically encoded on large plasmids containing additional antibiotic resistance genes that are responsible for a multi-resistant phenotype, leaving few therapeutic options (Harris & Ferguson (2012) Int. J. Antimicrob. Agents, 40: 297-305).
[0175]The presence of ESBLs and AmpC beta-lactamases complicates antibiotic selection, especially in patients with serious infections such as bacteraemia. The reason for this is that these bacteria are often multiresistant to various antibiotics, and an interesting feature of CTX-M-producing isolates is the co-resistance to the fluoroquinolones (Pitout & Laupland (2008) Lancet Infect. Dis. 8: 159-166). Antibiotics that are regularly used for therapy of serious community-onset infections, such as the third generation cephalosporins or fluoroquinolones, are often not effective against ESBL- and/or AmpC-producing bacteria (Pitout (2013) Curr. Pharm. Des. 19: 257-263), and it is in these clinically important cases that cefepime can be used, after accurate assessment of the strain's sensitivity to cefepime.
[0176]Studies consistently show that infections due to ESBL-producing Enterobacteriaceae are associated with a delay in initiation of appropriate antibiotic therapy. Moreover, multiple studies in a wide range of settings, clinical syndromes, and organisms have shown that failure or delay in adequate therapy results in prolonged hospital stays and increased hospital costs (Schwaber et al. (2007) J. Antimicrob. Chemother. 60: 913-920). Failure to initiate appropriate antibiotic therapy from the start appears to be responsible for higher patient mortality (see, e.g., Tumbarello et al. (2007) Antimicrob. Agents Chemother. 51: 1987-1994).
[0177]In view of these considerations, rapid and accurate identification of the presence of cefepime resistant E. coli can greatly facilitate the selection of a treatment regimen and dramatically improve prognosis. Accordingly, in certain embodiments, methods of treatment are provided that involve identifying E. coli in a biological sample from a mammal (e.g., a mammal to be treated) as cefepime resistant using the methods described herein (e.g., detection of a target nucleic acid corresponding to a sequence within the 209 bp region of the ISEcpl intron as described herein) and, wherein the E. coli is cefepime resistant, treating the mammal for a cefepime resistant E. coli infection.
[0178]Methods of treating cefepime resistant E. coli infections are known to those of skill in the art. Thus, for example, the carbapenems, the antibiotics of last resort, are widely regarded as the drugs of choice for the treatment of severe infections due to AmpC- and ESBL-producing E. coli (Pitout (2012) Expert Rev. Anti. Infect. Ther. 10: 1165-1176). Accordingly, it is reasonable to suggest that ertapenem should be used for treatment of mammals identified as infected with a cefepime resistant E. coli, e.g., using the methods described herein, particularly in the case of community-onset infections. In certain instances, imipenem or meropenem or doripenem can be more appropriate for the treatment of serious hospital-onset infections in cases where cefepime resistant E. coli are identified, e.g., using the methods described herein. Existing data also suggest that piperacillin-tazobactam may be a useful agent for the treatment of some infections with ESBL-producing pathogens (see, e.g., Retamar et al. (2013) Antimicrob. Agents Chemother. 57: 3402-3404).
[0179]Oral agents such as nitrofurantoin and fosfomycin show good in-vitro activity against ESBL- and AmpC-producing E. coli, and it is believed these drugs can readily be used in the treatment of cefepime resistant E. coli infections, e.g., identified using the methods described herein, particularly for the treatment of uncomplicated lower UTIs.
[0180]Other agents such as temocillin, pivmecillinam and colistin show good in-vitro activity against ESBL-producing bacteria, especially if present in E. coli (see, e.g., Titelman et al. (2011) APMIS, 119: 853-863; Zahar et al. (2009) Curr. Opin. Investig. Drugs, 10: 172-280). Pivmecillinam showed good clinical and bacteriological efficacy against lower UTIs caused by ESBL-producing E. coli and K. pneumoniae (Titelman et al. (2012) Microb. Drug Resist. 18: 189-192). A study investigating the in-vitro activity of a mecillinam-clavulanate combination against ESBL-producing bacteria showed that the addition of clavulanate did improve the activity of mecillinam, even when high bacterial inoculums were present (Lampri et al. (2012) J. Antimicrob. Chemother. 67: 2424-2428).
[0181]It has also been demonstrated that a cocktail of two common antibiotics, mecillinam and cefotaxime, can make CTX-M15 expressing multi-resistant E. coli (extended spectrum beta-lactamase, ESBL) sensitive to treatment again. Without being bound to a particular theory, it is believed the development of resistance towards either mecillinam or cefotaxime leads to concurrent sensitivity to the other drug, a phenomenon called collateral sensitivity. By giving both mecillinam and cefotaxime at the same time, the CTX-M mutation works like a switch, and the bacteria become sensitive to treatment again.
[0182]The foregoing methods of treatment of cefepime resistant E. coli infections are illustrative and non-limiting. Using the teaching provided herein, subjects having a cefepime resistant E. coli infection can be rapidly identified and, once identified, an appropriate treatment regimen can readily be provided.
[0183]In certain embodiments, a treatment includes a combination therapy that includes (a) administering any of the above-mentioned known compounds for treating E. coli infections (e.g., a β-lactam antibiotic, a cephalosporin antibiotic, or cefepime), and (b) targeting the ISEcpl or a variant transposon sequence or at least a portion of the 209 bp subsequence or an expression product thereof such as its mRNA. Agents that target the ISEcpl or a variant transposon sequence may, for example, inhibit promoter and/or enhancer activity of this sequence.
[0184]As examples, targeting ISEcpl or a variant transposon sequence or at least a portion of the 209 bp subsequence may include binding to at least a portion of the region, chemically modifying one or more nucleotides in the region, cleaving or nicking a sequence in the region, or any combination thereof.
[0185]In one example, targeting ISEcpl or a variant transposon (or the 209 bp subsequence) may include targeting the ISEcpl DNA with an agent that cleaves or nicks the DNA. Example agents include CRISPR/Cas nucleases, zinc finger nucleases, or transcription activator-like effector nucleases.
[0186]In another example, targeting ISEcpl or a variant transposon (or the 209 bp subsequence) may include targeting promoter expression from ISEcpl using, e.g., a repressor protein, a nuclease such as CRISPR/Cas nuclease, certain small molecules, or PNAs.
[0187]In another example, targeting ISEcpl or a variant transposon (or the 209 bp subsequence) may include targeting the mRNA itself using, e.g., cleavage, anti-sense blocking (RNA, DNA, a PNA), application of certain small molecules, etc.
Computational Systems
[0188]Figure 2 is a block diagram of an example of the computing device or system 200 suitable for use in implementing computational aspects of some embodiments of the present disclosure. For example, device 200 may be suitable for implementing some or all operations associated with predicting drug resistance of a pathogen given genomic information of the pathogen. For example, a computational system such as system 200 may be employed to receive sequence data (e.g., WGS information) and/or other input about pathogens present in a sample, extract features from such input, and predict drug resistance as disclosed herein. In other examples, a computational system such as system 200 may be employed to receive training data about pathogens (e.g., genomic sequence data and drug resistance information about the pathogens in a training set) and train a machine learning model that predicts drug resistance from pathogen sequence information.
[0189]Computing device 200 may include a bus 202 that directly or indirectly couples the following devices: memory 204, one or more central processing units (CPUs) 206, one or more graphics processing units (GPUs) 208, a communication interface 210, input/output (I/O) ports 212, input/output components 214, a power supply 216, and one or more presentation components 218 (e.g., display(s)). In addition to CPU 206 and GPU 208, computing device 200 may include additional logic devices that are not shown in Figure 2, such as but not limited to an image signal processor (ISP), a digital signal processor (DSP), a deep learning processor (DLP), an ASIC, an FPGA, or the like.
[0190]Although the various blocks of Figure 2 are shown as connected via the bus 202 with lines, this is not intended to be limiting and is for clarity only. For example, in some embodiments, a presentation component 218, such as a display device, may be considered an I/O component 214 (e.g., if the display is a touch screen). As another example, CPUs 206 and/or GPUs 208 may include memory (e.g., the memory 204 may be representative of a storage device in addition to the memory of the GPUs 208, the CPUs 206, and/or other components). In other words, the computing device of Figure 2 is merely illustrative. Distinction is not made between such categories as "workstation," "server," "laptop," "desktop," "tablet," "client device," "mobile device," "hand-held device," "electronic control unit (ECU)," "virtual reality system," and/or other device or system types, as all are contemplated within the scope of the computing device of Figure 2.
[0191]Bus 202 may represent one or more busses, such as an address bus, a data bus, a control bus, or a combination thereof. The bus 202 may include one or more bus types, such as an industry standard architecture (ISA) bus, an extended industry standard architecture (EISA) bus, a video electronics standards association (VESA) bus, a peripheral component interconnect (PCI) bus, a peripheral component interconnect express (PCIe) bus, and/or another type of bus.
[0192]Memory 204 may include any of a variety of computer-readable media. The computer-readable media may be any available media that can be accessed by the computing device 200. The computer-readable media may include both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, the computer-readable media may comprise computer-storage media and/or communication media.
[0193]The computer-storage media may include both volatile and nonvolatile media and/or removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, and/or other data types. For example, memory 204 may store computer-readable instructions (e.g., that represent a program(s) and/or a program element(s), such as an operating system). Computer-storage media may include, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information, and which can be accessed by computing device 200. As used herein, computer storage media does not comprise signals per se.
[0194]The communication media may embody computer-readable instructions, data structures, program modules, and/or other data types in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" may refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, the communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
[0195]CPU(s) 206 may be configured to execute the computer-readable instructions to control one or more components of the computing device 200 to perform one or more of the methods and/or processes described herein. CPU(s) 206 may each include one or more cores (e.g., one, two, four, eight, twenty-eight, seventy-two, etc.) that are capable of handling a multitude of software threads simultaneously. CPU(s) 206 may include any type of processor and may include different types of processors depending on the type of computing device 200 implemented (e.g., processors with fewer cores for mobile devices and processors with more cores for servers). For example, depending on the type of computing device 200, the processor may be an ARM processor implemented using Reduced Instruction Set Computing (RISC) or an x86 processor implemented using Complex Instruction Set Computing (CISC). Computing device 200 may include one or more CPUs 206 in addition to one or more microprocessors or supplementary co-processors, such as graphics, math, or machine learning co-processors.
[0196]GPU(s) 208 may be used by computing device 200 to render graphics (e.g., 3D graphics). GPU(s) 208 may include many (e.g., tens, hundreds, or thousands) of cores that are capable of handling many software threads simultaneously. GPU(s) 208 may generate pixel data for output images in response to rendering commands (e.g., rendering commands from CPU(s) 206 received via a host interface). GPU(s) 208 may include graphics memory, such as display memory, for storing pixel data. The display memory may be included as part of memory 204. GPU(s) 208 may include two or more GPUs operating in parallel (e.g., via a link). When combined, each GPU 208 can generate pixel data for different portions of an output image or for different output images (e.g., a first GPU for a first image and a second GPU for a second image). Each GPU can include its own memory or can share memory with other GPUs.
[0197]In examples where the computing device 200 does not include the GPU(s) 208, the CPU(s) 206 may be used to render graphics.
[0198]Communication interface 210 may include one or more receivers, transmitters, and/or transceivers that enable computing device 200 to communicate with other computing devices via an electronic communication network, including wired and/or wireless communications. Communication interface 210 may include components and functionality to enable communication over any of a number of different networks, such as wireless networks (e.g., Wi-Fi, Z-Wave, Bluetooth, Bluetooth LE, ZigBee, etc.), wired networks (e.g., communicating over Ethernet), low-power wide-area networks (e.g., LoRaWAN, SigFox, etc.), and/or the internet.
[0199]I/O ports 212 may enable the computing device 200 to be logically coupled to other devices including I/O components 214, presentation component(s) 218, and/or other components, some of which may be built in to (e.g., integrated in) computing device 200. Illustrative I/O components 214 include a microphone, mouse, keyboard, joystick, track pad, satellite dish, scanner, printer, wireless device, etc. I/O components 214 may provide a natural user interface (NUI) that processes air gestures, voice, or other physiological inputs generated by a user. In some instances, inputs may be transmitted to an appropriate network element for further processing. An NUI may implement any combination of speech recognition, stylus recognition, facial recognition, biometric recognition, gesture recognition both on screen and adjacent to the screen, air gestures, head and eye tracking, and touch recognition (as described in more detail below) associated with a display of computing device 200. Computing device 200 may include depth cameras, such as stereoscopic camera systems, infrared camera systems, RGB camera systems, touchscreen technology, and combinations of these, for gesture detection and recognition. Additionally, computing device 200 may include accelerometers or gyroscopes (e.g., as part of an inertial measurement unit (IMU)) that enable detection of motion. In some examples, the output of the accelerometers or gyroscopes may be used by computing device 200 to render immersive augmented reality or virtual reality.
[0200]Power supply 216 may include a hard-wired power supply, a battery power supply, or a combination thereof. Power supply 216 may provide power to computing device 200 to enable the components of computing device 200 to operate.
[0201]Presentation component(s) 218 may include a display (e.g., a monitor, a touch screen, a television screen, a heads-up-display (HUD), other display types, or a combination thereof), speakers, and/or other presentation components. Presentation component(s) 218 may receive data from other components (e.g., GPU(s) 208, CPU(s) 206, etc.), and output the data (e.g., as an image, video, sound, etc.).
[0202]The disclosure may be described in the general context of computer code or machine-useable instructions, including computer-executable instructions such as program modules, being executed by a computer or other machine, such as a cell phone, tablet, or other handheld device. Generally, program modules including routines, programs, objects, components, data structures, etc., refer to code configured to perform particular tasks or implement particular data types. The disclosure may be practiced in a variety of system configurations, including hand-held devices, consumer electronics, general-purpose computers, more specialty computing devices, etc. The disclosure may also be practiced in distributed computing environments where tasks are performed by remote-processing devices that are linked through a communications network. Some disclosed embodiments may be implemented, at least in part, using cloud-based resources.
Kits for the detection of cefepime resistant E. coli.
[0203]In certain embodiments kits are provided for the rapid detection/identification of cefepime resistant E. coli. In certain embodiments the kits comprise one or more containers containing primers and/or probes for detection/identification of cefepime resistant E. coli. In certain embodiments the kits comprise probes that hybridize (e.g., specifically hybridize) with a target nucleic acid corresponding to a region of an ISEcpl intron (e.g., a 209 bp region or subregion thereof) identified herein.
[0204]By way of non-limiting illustration, in certain embodiments, the kit contains primers and/or probes for the amplification and/or detection of a target sequence that comprises or consists of a sequence corresponding to a region or regions within nucleotides 1438 to 16inclusive_of the ISEcpl transposon shown in Table 1 (SEQ ID NO: 1). In certain embodiments the target sequence comprises at least 10, or at least 15, or at least 20, or at least 30, or at least 40, or at least 50, or at least 60, or at least 70, or at least 80, or at least 90, or at least 100, or at least 110, or at least 120, or at least 130, or at least 140, or at least 150 contiguous nucleotides of said region. In certain embodiments the target sequence comprises a nucleic acid sequence that corresponds to a sequence comprising a promoter sequence in the region. In certain embodiments the target sequence comprises a nucleotide sequence ranging from bp 1543 to 15of the ISEcpl transposon shown in Table 1 (SEQ ID NO: 1). In certain embodiments the target sequence comprises a nucleic acid sequence that corresponds to 209 contiguous nucleotides of said region.
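As a purely illustrative aid, the notion of a target sequence that contains at least a given number of contiguous nucleotides of a reference region can be sketched as follows. The function names, the toy sequences, and the use of 1-based inclusive coordinates are assumptions made for this sketch; they do not reproduce SEQ ID NO: 1 or any primer or probe of the kits described herein.

```python
def extract_region(sequence, start, end):
    """Return the subsequence between 1-based inclusive positions start..end."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def shares_contiguous(target, region, min_len):
    """True if target contains at least min_len contiguous nucleotides of region."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    for i in range(len(region) - min_len + 1):
        if region[i:i + min_len] in target:
            return True
    return False


# Toy example (hypothetical sequences, not taken from the disclosure).
reference = "ACGTACGT"
region = extract_region(reference, 2, 8)      # "CGTACGT"
print(shares_contiguous("TTCGTAAA", region, 4))
```

A real implementation would also need to consider the reverse complement strand and mismatch tolerance, which this sketch omits.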
[0205]In certain embodiments the kit comprises primers for the amplification of the target sequence and/or one or more probe(s) that hybridize to the target sequence.
[0206]In addition, in certain embodiments, the kits include labeling and/or instructional materials providing directions (e.g., protocols) for the use of the materials described herein, e.g., for the identification of cefepime resistant E. coli.
[0207]While the instructional materials in the various kits typically comprise written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by this disclosure. Such media include but are not limited to electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such media may include addresses to internet sites that provide such instructional materials.
EXAMPLES
[0208]The following examples are offered to illustrate, but not to limit the claimed invention.
Example 1
[0209]The whole genome sequence of a bacterium should enable highly accurate prediction of phenotype, but for Gram-negative species accuracies exceeding those of in vitro tests have not been achieved. Using the ECF features as input to a machine learning (ML) model, and a curated database of 1,782 sequence/susceptibility pairs for E. coli/cefepime for training and testing, an ML model was developed that can predict cefepime resistance in Escherichia coli more accurately than reported in vitro tests.
[0210]As illustrated in Figure 3, the accuracy of the machine learning (ML) prediction of E. coli resistance to cefepime was 97.5% in one example, whereas simply using the presence of the bla genes (primarily alleles of CTX-M and CMY) as the resistance predictor only yielded 76.9% accuracy because many S strains contained a bla gene, suggesting an additional regulatory mechanism may control bla expression.
[0211]The ML model was used to discover this mechanism. Ranking the features utilized by the ML model according to their contribution showed that in R strains many of the top features mapped to a region of the genome upstream of the bla gene start codon. BLAST against the ISFinder database revealed that these bases were all within a transposon, ISEcpl, in particular between bases 1438 and 1656 of insertion segment ISEcpl, situated just upstream of the bla promoter. This section of ISEcpl was highly predictive of resistance, suggesting that when located in this position it controlled bla transcription. It also indicated that the presence of the bla gene by itself is not adequate to predict resistance. Other top features of the model utilized by the decision trees included features selective for particular alleles of CTX-M, particularly including the CTX-M-15 allele, or features that identified strains which instead incorporate the ineffective allele CTX-M-1 and accordingly predicted susceptibility regardless of the presence of the 209 bp ISEcpl region, together contributing to the high level of overall accuracy achieved by the model.
[0212]Regarding ISEcpl, Poirel et al. (2003) Antimicrob. Agents Chemother. 47: 2938-2945, reported that ISEcpl (in its entirety) acts as a promoter of CTX-M in a strain of E. coli as schematically illustrated in Figure 4. They performed experiments with the promoter region of the IS spliced into plasmids at a location abutting CTX-M-19 and observed a 17-fold increase in beta lactamase activity, validating the biological role of the expression mechanism discovered by analysis of the ML model. Poirel et al. did not identify which part of the 1656 bp transposon was critical for its promoter function, and indeed the 1,782 strain training set in the example described herein included resistant strains in which most of the transposon was truncated by a later evolutionary event, emphasizing that a reliable diagnostic probe cannot test for the presence of the transposon in its entirety but rather must be directed to the 209 bp section of the promoter revealed by analysis of the equivalence class features of the ML model described above.
[0213]Thus, it has been demonstrated that mapping the most significant elements of a self-organized ML model enabled discovery of the genomic elements by which cefepime resistance in E. coli is gated, enabling the specified subregion(s) of an ISEcpl intron described herein to be used as a part of an ML system that rapidly and accurately identifies E. coli that have cefepime resistance.
[0214]Additionally, the data demonstrated herein encourage the application of this approach to develop sequence-based predictions of resistance and the use of these models to discover the genomic underpinning of resistance mechanisms in other drug/pathogen pairs.
Example 2
[0215] A machine learning model was prepared using a 5,901-sample training set of WGS data for E. coli strains with and without cefepime resistance. The resulting model predicted susceptibility versus non-susceptibility. The training started with approximately 140,000,000 distinct 31-mers in the data set. These k-mers were obtained from the sequenced whole genomes of the E. coli strains in the training set. Before training the model with susceptibility/non-susceptibility tags, the set of 31-mers was reduced to approximately 7,500,000 equivalence classes, which are sets of 31-mers with similar predictive information, thus achieving an initial nearly 20-fold compression of representation, with this initial reduction of redundancy enabling a more efficient as well as more interpretable ML model. XGBoost trained a decision tree model based on the equivalence classes and the labels (here resistant, which can in this implementation include the intermediate resistance category, or susceptible strains). The trained model was five-fold cross validated. The XGBoost model comprised 100 decision trees, each of depth six. The trees used 4,362 variables (equivalence classes), thus achieving a substantial culling of the 7.5M equivalence classes into the 4,362 utilized by one of the 5 models each built from 80% of the training set. Thus, less than 1/1000 of the available variables were used in the model. The XGBoost machine learning model had a sensitivity of 95.5%, a specificity of 96.2%, and a balanced accuracy of 95.8% against this larger and more diverse dataset, a level of performance very similar to that achieved in an external blind validation (Humphries et al 2023).
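For reference, balanced accuracy is conventionally the mean of sensitivity and specificity. The short sketch below shows that relationship; the function names are assumptions made for illustration, and the confusion-matrix counts underlying the reported percentages are not given in this disclosure.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, balanced accuracy) as fractions."""
    sensitivity = tp / (tp + fn)   # resistant strains correctly called resistant
    specificity = tn / (tn + fp)   # susceptible strains correctly called susceptible
    return sensitivity, specificity, (sensitivity + specificity) / 2


def balanced_accuracy_from_rates(sensitivity, specificity):
    """Balanced accuracy computed directly from the two rates."""
    return (sensitivity + specificity) / 2


# Using the rounded rates reported above (95.5% and 96.2%).
print(f"{balanced_accuracy_from_rates(0.955, 0.962):.4f}")
```

With the rounded rates the mean is 0.9585, in line with the reported balanced accuracy of about 95.8%; the small difference in the last digit is expected because the inputs are themselves rounded.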
[0216] A trained machine learning model of drug resistance may allow ranking of features based on their ability to accurately predict drug resistance versus susceptibility. The features illustrated in the graph (x-axis) of Figure 5 are pathogen genomic elements that the model chose to use in its decision trees. As mentioned, the model exemplified here chose 4,362 such genomic elements as features to use in predicting drug resistance. The ranking of these features may employ many techniques such as those described elsewhere herein. As suggested by the ranking metric, the predictive strength of the model resides almost exclusively in the top 20 (of 4,362) ranked features. Figure 5 identifies the locations of the top 17 features on E. coli genomes. Of interest, most of the features reside on the ISEcpl transposon of the E. coli genome, which is upstream of, and contains the promoter for expression of, the downstream beta lactamase gene, usually one of the alleles of CTX-M. Also of interest, the non-ISEcpl features are either predictive of particular alleles of CTX-M, or alternatively of the carbapenemases KPC and NDM. Note that the structure of the model indicates that the KPC and NDM gene expression is not controlled by ISEcpl, as these variables are on branches of the decision trees that are taken when ISEcpl-mapping variables are not present.
[0217]Figure 6 provides a more detailed representation of the plot in Figure 5 but highlights how the ranking of the features drops off from highest ranked feature to the twentieth ranked feature.
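The drop-off in ranking can be summarized as the share of the aggregate ranking score carried by the top k features. The sketch below assumes only that each feature has a non-negative score; it does not reproduce the Rank Value metric defined elsewhere herein, and the function name and example scores are hypothetical.

```python
def top_k_share(scores, k):
    """Fraction of the total score carried by the k highest-scoring features."""
    if k < 0:
        raise ValueError("k must be non-negative")
    ordered = sorted(scores, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("scores sum to zero")
    return sum(ordered[:k]) / total


# Hypothetical scores for illustration only.
example_scores = [8, 1, 1]
print(top_k_share(example_scores, 1))
```

Applied to the per-feature scores of a trained model, a value close to 1 for small k would indicate that a handful of features dominates the ranking.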
[0218]Figure 7 presents a chart illustrating locations of some top-ranked features on a 209 base pair long segment near the 3’ end of the 1656 base pair ISEcpl gene. Of the top fourteen ranked features predictive of cefepime resistance, ten map to ISEcpl and are all located on this segment. Horizontally along the chart, the base pair position near the 3’ end of the ISEcpl gene is shown. Vertically along the chart, predictive strength of features is shown, with feature 1 being the most predictive of cefepime resistance and feature 12 being the least predictive, among the top 12 features, of cefepime resistance. Interestingly, nine of these top twelve features cluster within this narrow band of 209 base pairs on the ISEcpl gene. The remaining four features either predict a particular allele of CTX-M, on the branch of the tree followed by strains with ISEcpl present, or alternatively map to carbapenemase genes.
[0219]The vertical arrows pointing downward indicate the core promoter region containing the -35 and -10 elements (located before the beginning of sequence to be transcribed) where RNA polymerase binds. Note also that while the model was configured to use non-redundant sets of 31-mers as training features, some of the 12 features shown in the figure are longer than others, ranging from a single 31-mer to thousands of 31-mers, with the latter including up to 7 distinct non-contiguous strings of bases.
[0220]Figure 8 illustrates the 209 base pair segment of the ISEcpl gene (also shown in Figure 7) sitting upstream of the CTX-M beta lactamase gene.
[0221]Figure 9 shows a pie chart of high ranked ISEcpl feature presence in all pathogens in the training set. As shown, approximately 85% of the pathogens lacked the 209 base pair long segment of the ISEcpl gene illustrated in Figures 7 and 8 [Slides 10 and 11]. In addition, 90% of the pathogens that are resistant harbor this segment.
[0222]Figure 10 shows a pie chart of high ranked ISEcpl feature presence in those pathogens of the training set that exhibited cefepime resistance. As indicated, approximately 95% of all pathogens exhibiting resistance contained either all ten high ranked ISEcpl features or nine of ten of these features. Interestingly, of the pathogens that contained only nine of ten such features, nearly all of them (277 of 279) lack the eighth ranked feature (ISC#8), which is shown as a light green feature in the lower right side of the illustration in Figures 7 and 8. This feature has a 1 base pair overhang of the 3’ end of the ISEcpl gene, and its absence is correlated with a lower probability of resistance and a lower MIC.
[0223]Figures 11-13 [Slides 14 to 16] show how the presence and number of ISEcpl features in E. coli strains correlates with minimum inhibitory concentration (MIC) values of cefepime in those strains.
[0224]Figure 11 shows a histogram of E. coli strains in the dataset that harbor all ten of the ISEcpl genomic features, binned into different MIC levels. As shown, of all 565 strains harboring all 10 ISEcpl-mapping features, the average MIC was 18.1 ug/mL, indicating strong cefepime resistance. However, a significant fraction of these strains had a low MIC in the range of 2 ug/mL; i.e., these strains had a high susceptibility to cefepime, indicating that to achieve its high level of accuracy the model successfully integrated the interplay of its different elements (i.e., it uses ISEcpl features in tandem with others predictive of CTX-M allele or the presence/absence of a carbapenemase to determine resistance). In hindsight all these elements and their interplay are highly consistent with biologically causal determination of cefepime resistance, and it should be noted that no pre-existing information whatsoever about resistance elements was provided to the system in generating the model, nothing other than the WGS of the set of strains along with the resistance category measured for each.
[0225]Figure 12 shows an MIC histogram of the 279 strains that harbor nine of the ten high ranked ISEcpl features. Interestingly, these strains had a significantly lower average MIC of 9.0 ug/mL. Note that 277 of the 279 strains in this histogram lack only the eighth ranked feature (ISC#8), which has a 1 base pair overhang of the 3’ end of the ISEcpl gene.
[0226]Figure 13 shows a plot of cefepime MIC (in the strains of the data set) as a function of the raw number of high-rank ISEcpl features (0-10) present in the strains. As illustrated, there is a good correlation between cefepime MIC and number of ISEcpl features present.
[0227] A further discussion of the methods and results of Example 2 will now be provided. This further discussion contains various non-limiting descriptions of techniques, interpretations of results, and performance criteria.
[0228]While in many classes of machine learning models, such as deep nets, the basis upon which the model makes its determination is opaque, here we disclose methods with which to rank the importance of the features used by, for example, an XGBoost model of the sequence-based prediction of resistance. We found that in the case of cefepime/E. coli the model’s resistance determination is largely driven by a small number of genomic elements amongst the millions initially available, with just the top 20 features comprising over 99.9% of the aggregate Rank Value (defined above) of all the thousands of features used in the model. We then explored whether these high-ranking variables, each a set of strings of bases, corresponded to known determinants of resistance by mapping them against genomes of resistant and susceptible strains. Unexpectedly, we found that the high Rank Value variables did not map to the resistance gene, CTX-M, which is widely considered predictive of cefepime resistance and is therefore currently a widely used PCR target for diagnosing cefepime resistance, but rather to a 209 base section at the 5’ end of the ISEcpl transposon. This led us to conclude that CTX-M is not a well-chosen and selective marker of resistance; rather, the 209 bp section of ISEcpl that we have identified by analysis of the top features of the ML model, in particular the very top feature, a bp string which sits over the -10 box promoter region within the 209 bp, is predicted to be a far more effective single PCR target for the molecular testing of E. coli strains for cefepime resistance. In this way, the high Rank Value ML model variables greatly sharpened the identification of the strong promoter, from being associated with the entirety of ISEcpl to a small defined section at the 5’ end nearest to the start codon of the beta-lactamase, which may furnish the strong RNA polymerase binding site. Consistent with this conclusion, we found strains in the training set where most of the transposon is truncated by a later insertion but which were still resistant so long as this section at the 5’ end remained intact.
[0229]This finding has the benefits of (a) identifying regions or sequences for PCR-based diagnostics more predictive of cefepime resistance than the CTX-M target, and (b) demonstrating that the accurate performance (Humphries et al 2023) of the ML model is driven by a biologically validated mechanism. This model was automatically generated from training data, consisting of WGS and resistance phenotype for a few thousand strains, in an entirely hypothesis-free manner; no pre-existing information about resistance genes or promoters was furnished in the model building process. The ability to automatically generate a hypothesis-free model from training data provides a general method of discovery (confirmable using the methods of molecular biology) of biologically causal mechanisms from training sets of sequence and phenotype. In some cases, the methodology employs a parsimonious technique such as one that utilizes a regularized gradient boosted class of decision tree models (e.g., XGBoost). We note that this method is not limited to identification of the genomic mechanisms underlying resistance, but is applicable to any measurable phenotype, including virulence, in vivo efficacy, inhibitory concentrations, and/or simply protein expression per se.
[0230]The variables driving a machine learning model for the prediction of non-susceptibility (NS) of E. coli to cefepime from whole genome sequence (validated in Humphries et al 2023) were ranked using a heuristically defined metric (Rank Value described above) combining their importance in model result generation with their predictive correlation with resistance. Starting with the 140M distinct 31-mers present in the 5,901 strain training set, a new set of variables was constructed, each a set of 31-mers with redundant predictive value, achieving a 19-fold compression. When the resulting 7.5M variables were input to the SciKit XGBoost algorithm with 100 decision trees of depth 6, each model thus accommodating up to 6,300 variables, 4,356 distinct variables were utilized by the model, aggregating over all 5 validation folds.
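A minimal sketch of the variable-construction step follows. It makes the simplifying assumption that two k-mers are redundant when they occur in exactly the same set of genomes; the precise grouping criterion used to build the equivalence classes is not restated here, and the function names and toy genomes are hypothetical. The last helper gives the upper bound on split nodes for an ensemble of full binary trees, which for 100 trees of depth 6 is 100 × (2^6 − 1) = 6,300.

```python
from collections import defaultdict


def kmers(sequence, k=31):
    """Return the set of distinct k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def equivalence_classes(genomes, k=31):
    """Group k-mers by presence pattern.

    genomes maps a strain name to its sequence. K-mers found in exactly the
    same set of strains carry identical information for a presence/absence
    model, so they are merged into one variable (an assumption of this sketch).
    """
    presence = defaultdict(set)
    for name, sequence in genomes.items():
        for kmer in kmers(sequence, k):
            presence[kmer].add(name)
    classes = defaultdict(list)
    for kmer, names in presence.items():
        classes[frozenset(names)].append(kmer)
    return {pattern: sorted(members) for pattern, members in classes.items()}


def max_split_nodes(n_trees, depth):
    """Upper bound on split nodes: a full binary tree of this depth has 2**depth - 1."""
    return n_trees * (2 ** depth - 1)


# Toy genomes with k=4 so that the example stays readable.
toy = {"strain_a": "ACGTA", "strain_b": "ACGTT"}
for pattern, members in equivalence_classes(toy, k=4).items():
    print(sorted(pattern), members)
print(max_split_nodes(100, 6))
```

At the scale described above (on the order of 10^8 distinct 31-mers), an in-memory dictionary of this kind would likely be impractical, and a dedicated k-mer counting or graph-based tool would be needed.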
[0231] The Rank Value of these 4,356 variables fell off precipitously, with just the first 20 comprising 99.9% of the aggregate Rank Value of all variables utilized by the model. This surprising finding suggested that this handful of variables contains most of the information utilized by the XGBoost model in making its WGS-based predictions of susceptibility to cefepime, which in turn suggested the possibility that these variables, each a set of 31-mers, might correspond to causally relevant genomic elements. We BLASTed each variable against the UniProt and ISFinder databases, and compared the results with those of the automated annotator PROKKA.
9 of the top 12 variables map to a 209 bp region at the 3’ end of the transposon ISEcpl
[0232]Surprisingly, 9 of the top 12 variables mapped to a single 209 bp section at the 3’ end of a 1656 bp transposon, ISEcpl (Figure 7). No other variables in the model mapped elsewhere in ISEcpl, suggesting that only this part of the transposon is needed for resistance. Examining this 209 bp region for promoter signatures found TATAAT and TTGACA boxes in the region, and the first and second highest Rank Value features, which together dominated Rank Value, mapped to the -10 and -35 regions respectively. Both variables were present in 880 of the 890 strains (98.9%) in which any of the 10 ISEcpl-mapping variables were present and they were found in approximately 90% of R (resistant) but just 5% of S (susceptible) strains. The first variables in the decision tree divided the dataset into strains with and without an exact match of the sequences represented by these variables in the transposon promoter.
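The promoter-signature check can be illustrated with a simple scan for the consensus -35 (TTGACA) and -10 (TATAAT) hexamers separated by a plausible spacer. The sketch below assumes exact hexamer matches and a spacer window of 15 to 19 bp around the commonly cited optimum of about 17 bp; the function name, the spacer window, and the toy sequence are assumptions for illustration and are not taken from SEQ ID NO: 1.

```python
def find_promoter_boxes(sequence, minus35="TTGACA", minus10="TATAAT", spacer=(15, 19)):
    """Return (pos35, pos10) pairs of 0-based start positions.

    A pair is reported when an exact -10 box begins between spacer[0] and
    spacer[1] bases after the end of an exact -35 box.
    """
    sequence = sequence.upper()
    hits = []
    start = sequence.find(minus35)
    while start != -1:
        box_end = start + len(minus35)
        for gap in range(spacer[0], spacer[1] + 1):
            pos10 = box_end + gap
            if sequence[pos10:pos10 + len(minus10)] == minus10:
                hits.append((start, pos10))
        start = sequence.find(minus35, start + 1)
    return hits


# Toy sequence: -35 box, 17-base spacer, -10 box.
toy = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
print(find_promoter_boxes(toy))
```

Natural promoters often deviate from the consensus hexamers, so a practical scan would typically allow mismatches or use a position weight matrix rather than exact string matching.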
[0233] Of the 890 strains (15.0% of the 5890 strains in the dataset) that had at least one of these 10 variables present, 70.8% (630/890) were R, versus just 68/5010 (1.4%) of the strains without ISEcpl-mapping variables, indicating that the promoter was usually necessary but not by itself sufficient to ensure resistance. Of R strains, 90% had one of the ISEcpl-mapping variables, so a PCR probe for either one of them may exhibit far greater sensitivity than that associated with the presence of the variable mapping to CTX-M, which was in 62% of R strains, or with any allele (defined as >95% match) of CTX-M.
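The proportions in this paragraph can be cross-checked with a small calculation. The sketch takes 630 resistant strains among the 890 strains carrying an ISEcpl-mapping variable (the count consistent with the stated 70.8%) and 68 resistant strains among the 5010 strains without one; the function name is an assumption made for illustration.

```python
def marker_statistics(r_with, n_with, r_without, n_without):
    """Return three fractions for a presence/absence marker.

    ppv:         resistant fraction among strains carrying the marker
    leak:        resistant fraction among strains lacking the marker
    sensitivity: fraction of all resistant strains that carry the marker
    """
    ppv = r_with / n_with
    leak = r_without / n_without
    sensitivity = r_with / (r_with + r_without)
    return ppv, leak, sensitivity


ppv, leak, sensitivity = marker_statistics(630, 890, 68, 5010)
print(f"{ppv:.1%} {leak:.1%} {sensitivity:.1%}")  # 70.8% 1.4% 90.3%
```

The third value reproduces the statement that about 90% of R strains carried at least one ISEcpl-mapping variable.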
[0234]Interestingly, when one of the 10 variables was not present it was ISEcpl#8, a single 31-mer which covered the last 30 bp at the 3’ end of the transposon and extended one base further. A mutation in the 8th ISEcpl-mapping variable was correlated with the CTX-M-1 versus the CTX-M-15 allele of the CTX-M gene.
[0235]While we considered the possibility that each variable might correspond to a different allele of this segment of ISEcpl, instead we found that in 95.0% of the strains in which any ISEcpl-mapping variable was present, either 9 or 10 were present, and never fewer than 5. Of the 880 containing at least 5 of these variables, 53.9% were categorized as NS (MIC > 2) versus only 1.1% of the strains without variables mapping to the 209bp of the transposon.
There is a correlation between MIC and the number of variables mapping to ISEcpl.
Amongst those 880 strains containing variables mapping to this segment of ISEcpl, there was an 87% correlation between the number mapping and MIC, with the distributions of MICs for the strains with either all 10 or 9/10 ISEcpl-mapping variables containing all MICs from <=1 to >64, indicating that other factors contributed to MIC.
[0236]The distribution of MIC when just one of the 10 ISEcpl-mapping variables was absent shifted to the left, with average MIC declining from 18.0 to 9.1 ug/mL.
Of the four most utilized variables other than ISEcpl, three mapped to plasmids containing either KPC or NDM. Of the 10 variables mapping to the 209 bp region of ISEcpl, the largest contained 31-mers. In contrast, the other 4 top Rank Value variables ranged from 1011 to 5355 31-mers, with up to 8 distinct unitigs (continuous sequences of 31-mers), and were by far the largest variables used in the model. Of these, two mapped to a TN3 class transposase in insertion sequence ISPsyin the proximity of the start codon of a KPC carbapenemase. Another comprised 1723 31-mers and mapped to a bleo-MBL resistance cassette that accompanied an NDM carbapenemase. The KPC- and NDM-adjacent variables were present in 3.3 and 6.9% of the resistant strains respectively, suggesting the hypothesis that they were used to predict resistance in the ~10% of the R strains in which no ISEcpl element was present.
[0237] A spuriously predictive variable associated with the sequencing contaminant PhiX174 used in Illumina sequencing was found almost exclusively in strains associated with the CDC’s AR bank. The largest highly predictive variable comprised 5355 31-mers but did not map to any protein or mobile element in the bacterial genome. However, it mapped to the phage PhiX174. We realized that the phage is utilized as a positive control in Illumina sequencing. As it was present in 14.6% of the resistant strains in our dataset but none of the susceptible ones and could not have any causal relevance, we reasoned that it must be simply correlated with R strains. Given that it is a contaminant, we hypothesized that it might be from one study, rather than scattered across the samples in many studies, and indeed we found that 143/149 of the strains containing this variable were from the CDC’s AR bank, all of which are resistant to cefepime. Thus, analysis of a machine learning model led to discovery of an unsuspected contaminant in the CDC FASTQ files, a finding we have relayed to the CDC. We removed the contaminating PhiX174-mapping reads from our database and rebuilt the model for further analysis. The rebuilt model did not alter utilization of the ISEcpl-mapping variables, but just shifted the model’s use of PhiX174 reads to another variable, which was in many of the genomes which had contained PhiX174, with no reduction in model accuracy.
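The check described here, whether the strains carrying a suspicious variable come predominantly from a single source, can be sketched as follows. The function name, the strain identifiers, and the source labels are hypothetical; only the 143 of 149 proportion (about 96%) comes from the text above.

```python
from collections import Counter


def source_concentration(strain_sources, carriers):
    """Return (most common source, fraction of carriers from that source).

    strain_sources maps a strain identifier to the study or collection it
    came from; carriers lists the strains in which the variable is present.
    """
    if not carriers:
        raise ValueError("no carrier strains supplied")
    counts = Counter(strain_sources[strain] for strain in carriers)
    source, n = counts.most_common(1)[0]
    return source, n / len(carriers)


# Hypothetical metadata for illustration only.
sources = {"s1": "collection_A", "s2": "collection_A", "s3": "collection_B"}
print(source_concentration(sources, ["s1", "s2", "s3"]))
print(f"{143 / 149:.0%}")  # share of carrier strains from one source in the text
```

A high concentration in one source, combined with a variable that maps outside the bacterial genome, is a reasonable trigger for inspecting the raw reads for contamination.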
[0238]Most PCR panels seeking to predict susceptibility to cefepime and thus guide therapeutic decisions match to CTX-M, the ESBL most often associated with cefepime resistance in E. coli. However, the variables mapping directly to CTX-M (but without determination of which CTX-M allele) were the 16th and 18th ranked, comprising less than 0.01% of the aggregate Rank Value, and reduced models built from the top 20 features did not use them at all, leading to the conclusion that the CTX-M target present on most commercial PCR panels is not a well-chosen predictor of cefepime resistance in E. coli.
[0239]Rather, analysis of the ML model led to the conclusion that the variable with the highest Rank Value, which overlies the -10 region of the promoter within the ISEcpl transposon, gates the branch of the tree associated with ESBL-mediated resistance, raising the hypothesis that this region is vital for the strong promoter function, and in any event suggesting that this 39 bp region (instead of CTX-M per se) is the best single target for PCR probes designed to assess cefepime resistance in E. coli.
MIC dependence on ISEcpl
[0240]The absence of ISEcpl-mapping variables was very highly predictive of susceptibility. Of the 5,004 strains where no ISEcpl-mapping variable was present, 98.0% had MIC below the limit of measurement (so <=2), and were therefore categorized as susceptible. Of the remaining 895 strains 79.4% had MIC >=2. As almost all (99.9%) of these strains had the top-ranked variable present, we conclude that cefepime resistance requires the presence of this 39 bp section from bp 1569 to 1606 of the 1656 bp long ISEcpl transposon complex, and that a PCR probe designed to detect the resistance of E. coli to cefepime may target this 39 bp region, the absence of which is 98% predictive of cefepime susceptibility.
Correlation between cefepime MIC and the number of ISEcp1-mapping variables present in a strain
[0241] Going beyond S/R categorization to quantitative MIC determination, there was a 0.87 correlation coefficient (cc) between average MIC and the number of ISEcp1-mapping variables present (Figure 13). Since these variables roughly tile the 209 bp section at the 5’ end of ISEcp1, this result suggests that promoter efficacy is correlated with how closely this strong promoter region matches the ideal version defined by the presence of all 10 mapping variables, with fewer exactly mapping variables corresponding to incrementally inferior promoter function, a testable hypothesis.
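The kind of calculation behind such a correlation can be sketched as follows: strains are grouped by the number of mapping variables present, the MIC is averaged within each group, and a Pearson coefficient is computed between the count and the group average. The records below are invented toy values, not data from the study, and the grouping and the choice of Pearson correlation are assumptions, since the text does not state how the coefficient was computed.

```python
from collections import defaultdict
from math import sqrt


def mean_mic_by_count(records):
    """Average MIC for each number of mapping variables present.

    records: iterable of (n_variables_present, mic) pairs, one per strain.
    """
    grouped = defaultdict(list)
    for n_present, mic in records:
        grouped[n_present].append(mic)
    return {n: sum(values) / len(values) for n, values in sorted(grouped.items())}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented toy records: (number of variables present, MIC in ug/mL).
toy_records = [(0, 1), (0, 1), (5, 4), (5, 8), (10, 16), (10, 32)]
averages = mean_mic_by_count(toy_records)
r = pearson(list(averages.keys()), list(averages.values()))
print(averages, round(r, 3))
```

Averaging within each group before correlating hides the spread of MIC values among strains with the same number of variables, so a high coefficient on the averages does not imply that the count predicts the MIC of an individual strain.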
[0242] However, while there was a 0.87 cc between average MIC and the degree of promoter match quantified by the number of exactly mapping variables, for a given number of mapping variables there was a diversity of MICs, suggesting that while a high MIC required the promoter, its value was controlled by other mechanisms. For the 62.2% of the ISEcp1-containing strains which contained all 10 variables, the average MIC was 18 ug/mL with a wide range (Figure H).
[0243] In summary, we have developed a methodology to dissect the determinative features of an accurate (Humphries et al 2023) machine learning model predicting cefepime resistance in E. coli, and identified the interplay of mechanisms which control its resistance, susceptibility, and non-susceptibility predictions. The elements controlling the model started with the transposon-based promoter ISEcp1, which has been found (Poirel et al 2003, 2002; Woodford 2009, 2011) to control expression of CTX-M and other ESBLs and thus resistance to 4th generation cephalosporins like cefepime. Further, the ML features mapped to very precise subregions of the transposon which were more predictive than CTX-M-15 itself. Thus, PCR tests for cefepime resistance in E. coli may be better designed as probes of the promoter region of ISEcp1, coupled, if possible, with probes which determine whether CTX-M-1 or CTX-M-15 is present, rather than simply as probes for the presence of CTX-M, which is currently used clinically to predict the cefepime-resistant phenotype in E. coli.
Example 3
[0244] An ML-based model for prediction of cefepime phenotypic susceptibility results in Escherichia coli was successfully tested in a blind validation (Humphries et al 2023). The model was developed as described herein. With the model frozen, a cohort of 100 isolates of E. coli recovered from urine (n=77) and blood (n=23) cultures was used. Cefepime minimal inhibitory concentration (MIC) was determined in triplicate by reference broth microdilution (BMD) and classified as susceptible (MIC ≤2 ug/mL) or not susceptible (MIC ≥4 ug/mL) using Clinical and Laboratory Standards Institute (CLSI) breakpoints. Five isolates generated both susceptible and not susceptible MIC results, yielding categorical agreement of 95% for the reference method to itself. Categorical agreement of ML to MIC interpretations was 97%, with 2 very major (false-susceptible) and 1 major (false-not susceptible) errors. One very major error occurred for an isolate with blacix-u-ii (BMD mode, >32 ug/ml) and one for an isolate with A/،7tei34־ for which BMD cefepime mode was 4 ug/mL. The major error was for an isolate with blacix-u-but an MIC mode of 2 ug/mL. These data demonstrate performance of ML for a clinically important antimicrobial-species pair at a caliber exceeding that of phenotypic methods used in hospital clinics today (Smith et al 2017) and rivalling the accuracy of the triplicate broth microdilution gold standard (Humphries et al 2023). The level of accuracy achieved for WGS-based determination of cefepime resistance in E. coli, an important drug in an important species for which in vitro testing is problematic, suggests that sequence-based susceptibility prediction not only can but, after suitable validation and regulatory consideration, should be utilized in the selection of therapy in clinical practice.
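The agreement figures above follow from simple counting, which the minimal sketch below makes explicit. A very major error is a not-susceptible isolate predicted susceptible, and a major error is a susceptible isolate predicted not susceptible. The 30/70 split between not-susceptible and susceptible isolates is a hypothetical assumption, since the text does not give it. Only the total of 100 isolates and the 97% agreement are taken from the paragraph above, and the discordant calls are placed so that the summary reproduces that figure.

```python
def categorical_agreement(reference, predicted):
    """Summarize agreement between reference and predicted calls.

    reference, predicted: equal-length lists of "S" (susceptible) or
    "NS" (not susceptible), one entry per isolate.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have equal length")
    pairs = list(zip(reference, predicted))
    # Very major error: reference is not susceptible, prediction is susceptible.
    very_major = sum(1 for ref, pred in pairs if ref == "NS" and pred == "S")
    # Major error: reference is susceptible, prediction is not susceptible.
    major = sum(1 for ref, pred in pairs if ref == "S" and pred == "NS")
    total = len(pairs)
    agreement = (total - very_major - major) / total
    return {"agreement": agreement, "very_major": very_major, "major": major}


# Hypothetical cohort: 30 NS and 70 S reference calls (split is assumed).
reference = ["NS"] * 30 + ["S"] * 70
predicted = ["S"] * 2 + ["NS"] * 28 + ["NS"] * 1 + ["S"] * 69
summary = categorical_agreement(reference, predicted)
print(summary)
```

Because the two error types carry different clinical risk, they are reported separately rather than folded into a single accuracy figure.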
Conclusion
[0245] It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. For example, there are many alternative ways of implementing the present embodiments. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the embodiments are not to be limited to the details given herein. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
Claims (92)
1. A method of determining whether a Gram-negative bacterium is cefepime resistant, said method comprising: determining the presence or absence of a target sequence in the DNA of said bacterium, wherein said target sequence corresponds to a portion of DNA sequence comprising one or more sets of nucleotides each comprising at least 8 specified contiguous nucleotides within a defined 209 base pair region of an ISEcp1 transposon or variants thereof with at least 90% identity to the ISEcp1 sequence disclosed in Table 1, the closest of which is within 100 base pairs proximate to a start codon for a resistance gene; and where the presence of said target sequence indicates that said bacterium is cefepime-resistant.
2. The method of claim 1, wherein the resistance gene is a beta lactamase gene.
3. The method of claim 1, where said bacterium is of the species Escherichia coli, or another species of the Escherichia genus.
4. The method of claim 1, where said bacterium is of the species Klebsiella pneumoniae, or another species of the Klebsiella genus.
5. The method of claim 1, where said bacterium is of the species Pseudomonas aeruginosa, or another species of the Pseudomonas genus.
6. The method of claim 1, where said bacterium is of the species Acinetobacter baumannii, or another species of the Acinetobacter genus.
7. The method of claim 1, where said bacterium is of a species from a genus other than Escherichia, Klebsiella, Pseudomonas, or Acinetobacter.
8. The method of claim 1, where said region comprises at least 8 nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof with at least 90% identity, as shown in Table 1 (SEQ ID NO: 1).
9. The method according to any one of claims 1-8, wherein said target sequence corresponds to a DNA sequence that includes at least 8, or at least 15, or at least 20, or at least 30, or at least 40, or at least 50, or at least 60, or at least 70, or at least 80, or at least 90, or at least 100, or at least 110, or at least 120, or at least 130, or at least 140, or at least 1contiguous nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof with at least 90% identity, as shown in Table 1 (SEQ ID).
10. The method of claim 9, wherein a portion of said target sequence corresponds to a DNA sequence that comprises a promoter sequence in the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof with at least 90% identity, as shown in Table 1 (SEQ ID).
11. The method of claim 10, wherein said target sequence corresponds to a DNA sequence that comprises a nucleotide sequence ranging from bp 1543 to 1595 of the ISEcp1 transposon shown in Table 1 (SEQ ID).
12. The method according to any one of claims 1-11, wherein said target corresponds to a DNA sequence that comprises the full 209 contiguous nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence shown in Table 1.
13. The method according to any one of claims 1-12, wherein said DNA sequence is a sequence in the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence, shown in Table 1.
14. The method according to any one of claims 1-13, wherein said target sequence has at least 90% sequence identity, or at least 95% sequence identity, or at least 98% sequence identity, or 100% sequence identity to a DNA sequence comprising a portion of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence, as shown in Table 1.
15. The method according to any one of claims 1-14, wherein said target sequence has 100% sequence identity with a portion of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants where said variants have at least 90% sequence identity to the sequence shown in Table 1.
16. The method according to any one of claims 1-15, wherein said determination of the presence or absence of said target sequence comprises sequencing at least a portion of the genome expected to contain said target sequence.
17. The method of claim 16, wherein said sequencing comprises sequencing the full genome of said bacterium.
18. The method of claim 17, wherein said sequencing comprises analyzing nucleic acid sequence from said bacterium to output a prediction of cefepime resistance.
19. The method of claim 18, wherein said analyzing comprises using a model or machine learning model to receive whole genome sequence data and output a prediction of cefepime resistance.
20. The method of claim 19, wherein said prediction of cefepime resistance is a surrogate marker for the presence of said transposon.
21. The method according to any one of claims 16-20, wherein said sequencing is performed by a method selected from the group consisting of sequencing by synthesis, sequencing by binding, sequencing by highly multiplexed hybridization, and nanopore sequencing.
22. The method according to any one of claims 1-15, wherein said determining the presence or absence of said target sequence comprises performing a nucleic acid amplification reaction to amplify said target sequence.
23. The method of claim 22, wherein said nucleic acid amplification reaction comprises an amplification system selected from the group consisting of a polymerase chain reaction (PCR), a real time polymerase chain reaction (rtPCR), Self-Sustained Sequence Reaction (3SR), a Nucleic acid Based Transcription Assay (NASBA), a Transcription Mediated Amplification (TMA), a Strand Displacement Amplification (SDA), a Helicase-Dependent Amplification (HDA), a Loop-Mediated isothermal amplification (LAMP), a stem-loop amplification, an isothermal multiple displacement amplification (IMDA), a single primer isothermal amplification (SPIA), a circular helicase-dependent amplification (cHDA), and a Recombinase Polymerase Amplification (RPA).
24. The method of claim 23, wherein said nucleic acid amplification reaction comprises PCR or rtPCR.
25. The method according to any one of claims 1-15, wherein said determining the presence or absence of said target sequence comprises in situ hybridization with a probe that hybridizes to said target sequence.
26. The method according to any one of claims 1-25, wherein said E. coli is obtained from a biological sample from a mammal having an E. coli infection.
27. The method of claim 26, wherein said biological sample comprises a sample selected from the group consisting of a cell or tissue culture, blood, saliva, cerebrospinal fluid, urine, stool, bronchial aspirates, tracheal lavage, pleural fluid, lymph, sputum, semen, needle aspirates, punch biopsies, surgical biopsies, and wound swab.
28. The method according to any one of claims 26-27, wherein said mammal is a mammal identified as having a pathology selected from the group consisting of a urinary tract infection (UTI), pneumonia, cellulitis, a liver abscess, a surgical wound infection, gastroenteritis, endocarditis, diabetic foot ulcers, and osteomyelitis.
29. The method according to any one of claims 1-28, wherein said mammal is a human.
30. The method according to any one of claims 1-28, wherein said mammal is a non-human mammal.
31. The method according to any one of claims 1-30, wherein the presence of a portion of said ISEcp1 transposon is identified and said method further comprises guiding antibiotic treatment of said mammal for a cefepime-resistant Gram-negative bacterial infection.
32. The method of claim 31, wherein said treatment comprises treatment of said mammal with one or more drugs used to treat cefepime resistant Gram-negative bacterial infections.
33. The method of claim 32, wherein said treatment of said mammal comprises treatment of said mammal with a combination of mecillinam and cefotaxime.
34. The method of claim 32, wherein said treatment of said mammal comprises treatment of said mammal with a combination of cefepime and sulbactam.
35. The method of claim 32, wherein treating said mammal comprises treatment of said mammal with a carbapenem.
36. The method of claim 35, wherein treating said mammal comprises treatment of said mammal with ertapenem.
37. The method of claim 32, wherein treating said mammal comprises treatment of said mammal with one or more drugs selected from the group consisting of imipenem, meropenem, doripenem, piperacillin-tazobactam, nitrofurantoin, fosfomycin, pivmecillinam, mecillinam-clavulanate, temocillin, pivmecillinam, and colistin.
38. The method of claim 37, wherein treating said mammal comprises treatment of said mammal with one or more drugs selected from the group consisting of imipenem, meropenem, and doripenem, pivmecillinam, and mecillinam-clavulanate.
39. The method of claim 37, wherein treating said mammal comprises treatment of said mammal with piperacillin-tazobactam.
40. The method of claim 37, wherein treating said mammal comprises treatment of said mammal with nitrofurantoin and/or fosfomycin.
41. The method of claim 37, wherein treating said mammal comprises treatment of said mammal with temocillin, pivmecillinam, and/or colistin.
42. The method of claim 37, wherein treating said mammal comprises treatment of said mammal with pivmecillinam.
43. The method of claim 37, wherein treating said mammal comprises treatment of said mammal with mecillinam-clavulanate.
44. The method according to any one of claims 32-43, wherein said treatment comprises prescribing said one or more drugs.
45. The method according to any one of claims 32-43, wherein said treatment comprises administering said one or more drugs.
46. The method according to any one of claims 32-43, wherein said treatment comprises providing said one or more drugs to said mammal.
47. A method of treating a mammal having a Gram-negative bacterial infection, said method comprising: identifying said Gram-negative bacterium in a biological sample from said mammal as cefepime resistant using the method according to any one of claims 1-25; and treating said mammal for a cefepime resistant Gram-negative bacterial infection.
48. The method of claim 47, wherein said biological sample comprises a sample selected from the group consisting of a cell or tissue culture, blood, saliva, cerebrospinal fluid, urine, stool, bronchial aspirates, tracheal lavage, pleural fluid, lymph, sputum, semen, needle aspirates, punch biopsies, surgical biopsies, and wound swab.
49. The method according to any one of claims 47-48, wherein said mammal is a mammal identified as having a pathology selected from the group consisting of a urinary tract infection (UTI), pneumonia, cellulitis, a liver abscess, a surgical wound infection, gastroenteritis, endocarditis, diabetic foot ulcers, and osteomyelitis.
50. The method according to any one of claims 47-49, wherein said mammal is a human.
51. The method according to any one of claims 47-49, wherein said mammal is a non-human mammal.
52. The method according to any one of claims 47-51, wherein said treating comprises treatment of said mammal with one or more drugs used to treat cefepime resistant Gram-negative infections.
53. The method of claim 52, wherein said treating said mammal comprises treatment of said mammal with a combination of mecillinam and cefotaxime.
54. The method of claim 52, wherein said treating said mammal comprises treatment of said mammal with a combination of cefepime and sulbactam.
55. The method of claim 52, wherein treating said mammal comprises treatment of said mammal with a carbapenem.
56. The method of claim 55, wherein treating said mammal comprises treatment of said mammal with ertapenem.
57. The method of claim 52, wherein treating said mammal comprises treatment of said mammal with one or more drugs selected from the group consisting of imipenem, meropenem, doripenem, piperacillin-tazobactam, nitrofurantoin, fosfomycin, pivmecillinam, mecillinam-clavulanate, temocillin, pivmecillinam, and colistin.
58. The method of claim 57, wherein treating said mammal comprises treatment of said mammal with one or more drugs selected from the group consisting of imipenem, meropenem, and doripenem, pivmecillinam, and mecillinam-clavulanate.
59. The method of claim 57, wherein treating said mammal comprises treatment of said mammal with piperacillin-tazobactam.
60. The method of claim 57, wherein treating said mammal comprises treatment of said mammal with nitrofurantoin and/or fosfomycin.
61. The method of claim 57, wherein treating said mammal comprises treatment of said mammal with temocillin, pivmecillinam, and/or colistin.
62. The method of claim 57, wherein treating said mammal comprises treatment of said mammal with pivmecillinam.
63. The method of claim 57, wherein treating said mammal comprises treatment of said mammal with mecillinam-clavulanate.
64. The method according to any one of claims 52-63, wherein said treatment comprises prescribing said one or more drugs.
65. The method according to any one of claims 52-63, wherein said treatment comprises administering said one or more drugs.
66. The method according to any one of claims 52-63, wherein said treatment comprises providing said one or more drugs to said mammal.
67. A kit for determining whether a Gram-negative bacterium is cefepime resistant, said kit comprising: a container containing one or more primers and/or probes for amplifying and/or detecting the presence or absence of a target sequence in the DNA of said bacterium, wherein said target sequence comprises at least 8 contiguous nucleotides of a region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof with at least 90% sequence identity to the sequence shown in Table 1, where said region comprises or consists of a portion of 152 contiguous nucleotides of said transposon proximate to a start codon for a resistance gene.
68. The kit of claim 67, where said target sequence comprises a sequence of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence shown in Table 1.
69. The kit according to any one of claims 67-68, wherein said target sequence comprises at least 10, or at least 15, or at least 20, or at least 30, or at least 40, or at least 50, or at least 60, or at least 70, or at least 80, or at least 90, or at least 100, or at least 110, or at least 120, or at least 130, or at least 140, or at least 150 contiguous nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% identity to the sequence shown in Table 1.
70. The kit of claim 69, wherein said target sequence comprises a promoter sequence in the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% identity to the sequence shown in Table 1.
71. The kit of claim 70, wherein said target sequence comprises a nucleotide sequence ranging from bp 1543 to 1595 of the ISEcp1 transposon shown in Table 1.
72. The kit according to any one of claims 67-71, wherein said target sequence comprises the full 209 contiguous nucleotides of the region starting at base pair 1438 to base pair 1656 of the ISEcp1 transposon, or variants thereof where said variants have at least 90% sequence identity to the sequence shown in Table 1.
73. The kit according to any one of claims 67-72, wherein said kit comprises primers for the amplification of said target sequence.
74. The kit according to any one of claims 67-73, wherein said kit comprises a probe that hybridizes to said target sequence.
75. The kit according to any one of claims 67-74, wherein said kit comprises instructional materials teaching the use of components of the kit to identify cefepime resistant Gram-negative bacteria.
76. A method of treating a mammal having a Gram-negative bacterial infection caused by a bacterium, said method comprising: administering an antibacterial agent to the mammal; and inhibiting activity of a target sequence in the DNA of said bacterium in the mammal, wherein said target sequence corresponds to a portion of DNA sequence comprising one or more sets of nucleotides each comprising at least 8 specified contiguous nucleotides within a defined 152 base pair region of an ISEcp1 transposon or a variant thereof with 90% identity to the ISEcp1 sequence disclosed in Table 1, the closest of which is within 100 base pairs proximate to a start codon for a resistance gene.
77. The method of claim 76, further comprising, prior to administering the antibacterial agent to the mammal, identifying said Gram-negative bacterium in a biological sample from said mammal as cefepime resistant using the method according to any one of claims 1-25.
78. The method according to any one of claims 76-77, wherein said mammal is a mammal identified as having a pathology selected from the group consisting of a urinary tract infection (UTI), pneumonia, cellulitis, a liver abscess, a surgical wound infection, gastroenteritis, endocarditis, diabetic foot ulcers, and osteomyelitis.
79. The method according to any one of claims 76-79, wherein said mammal is a human.
80. The method according to any one of claims 76-79, wherein said mammal is a non-human mammal.
81. The method according to any one of claims 76-80, wherein the resistance gene is a beta lactamase gene.
82. The method according to any one of claims 76-81, wherein the antibacterial agent is a β-lactam antibiotic.
83. The method of claim 82, wherein the antibacterial agent is cefepime.
84. The method according to any one of claims 76-83, wherein inhibiting activity of the target sequence comprises at least partially blocking access to the target sequence.
85. The method according to any one of claims 76-84, wherein inhibiting activity of the target sequence comprises at least partially blocking access of a polymerase to the target sequence.
86. The method according to any one of claims 76-83, wherein inhibiting activity of the target sequence comprises disrupting integrity of the target sequence.
87. The method of claim 86, wherein inhibiting activity of the target sequence comprises cleaving or nicking the target sequence.
88. The method according to any one of claims 76-87, wherein inhibiting activity of the target sequence comprises administering a gene editing agent to the mammal, wherein the gene editing agent targets the target sequence.
89. The method of claim 88, wherein the gene editing agent comprises a nuclease selected from the group consisting of a zinc-finger nuclease, a transcription activator-like effector nuclease, or a CRISPR-Cas genome-editing nuclease.
90. The method according to any one of claims 76-89, wherein said treatment comprises prescribing said antibacterial agent and an agent for inhibiting activity of the target sequence in said mammal.
91. The method according to any one of claims 76-90, wherein said treatment comprises administering said antibacterial agent and an agent for inhibiting activity of the target sequence in said mammal.
92. The method according to any one of claims 76-91, wherein said treatment comprises providing said antibacterial agent and an agent for inhibiting activity of the target sequence in said mammal.
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363481181P | 2023-01-23 | 2023-01-23 | |
| US202463618867P | 2024-01-08 | 2024-01-08 | |
| PCT/US2024/012591 WO2024158797A1 (en) | 2023-01-23 | 2024-01-23 | Methods for the rapid identification of cefepime-resistance in escherichia coli |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| IL322301A (en) | 2025-09-01 |
Family
ID=90361618
Family Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IL322301A IL322301A (en) | 2023-01-23 | 2024-01-23 | Methods for the rapid identification of cefepime-resistance in |
| IL322298A IL322298A (en) | 2023-01-23 | 2024-01-23 | Methods for the rapid identification of cefepime-resistance in |
Family Applications After (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IL322298A IL322298A (en) | 2023-01-23 | 2024-01-23 | Methods for the rapid identification of cefepime-resistance in |
Country Status (3)
| Country | Link |
|---|---|
| EP (1) | EP4655424A1 (en) |
| IL (2) | IL322301A (en) |
| WO (1) | WO2024158797A1 (en) |
-
2024
- 2024-01-23 IL IL322301A patent/IL322301A/en unknown
- 2024-01-23 EP EP24709544.1A patent/EP4655424A1/en active Pending
- 2024-01-23 IL IL322298A patent/IL322298A/en unknown
- 2024-01-23 WO PCT/US2024/012591 patent/WO2024158797A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024158797A9 (en) | 2024-08-29 |
| EP4655424A1 (en) | 2025-12-03 |
| WO2024158797A1 (en) | 2024-08-02 |
| IL322298A (en) | 2025-09-01 |