WO2024097831A1 - Bioreactive proteins containing unnatural amino acids
- Publication number
- WO2024097831A1 (application PCT/US2023/078455)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unsubstituted
- substituted
- receptor
- protein
- membered
- Prior art date
Links
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 410
- 102000004169 proteins and genes Human genes 0.000 title claims abstract description 404
- 150000001413 amino acids Chemical class 0.000 title claims abstract description 191
- 150000001875 compounds Chemical class 0.000 claims abstract description 256
- 102000005962 receptors Human genes 0.000 claims description 149
- 108020003175 receptors Proteins 0.000 claims description 149
- 125000004404 heteroalkyl group Chemical group 0.000 claims description 139
- 210000004027 cell Anatomy 0.000 claims description 121
- 125000002947 alkylene group Chemical group 0.000 claims description 102
- 229910052739 hydrogen Inorganic materials 0.000 claims description 101
- 239000001257 hydrogen Substances 0.000 claims description 101
- 125000000217 alkyl group Chemical group 0.000 claims description 99
- 125000004474 heteroalkylene group Chemical group 0.000 claims description 99
- 150000007523 nucleic acids Chemical class 0.000 claims description 88
- 125000000753 cycloalkyl group Chemical group 0.000 claims description 87
- 150000002431 hydrogen Chemical group 0.000 claims description 87
- 125000006163 5-membered heteroaryl group Chemical group 0.000 claims description 86
- 125000000592 heterocycloalkyl group Chemical group 0.000 claims description 78
- 125000005842 heteroatom Chemical group 0.000 claims description 74
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical group N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 claims description 72
- 125000001151 peptidyl group Chemical group 0.000 claims description 71
- 101710123256 Pyrrolysine-tRNA ligase Proteins 0.000 claims description 63
- 125000003118 aryl group Chemical group 0.000 claims description 55
- 125000001072 heteroaryl group Chemical group 0.000 claims description 54
- 102000039446 nucleic acids Human genes 0.000 claims description 54
- 108020004707 nucleic acids Proteins 0.000 claims description 54
- 229910052736 halogen Inorganic materials 0.000 claims description 51
- 229910052757 nitrogen Inorganic materials 0.000 claims description 48
- 238000009739 binding Methods 0.000 claims description 47
- 230000027455 binding Effects 0.000 claims description 46
- 150000002367 halogens Chemical group 0.000 claims description 46
- 239000012634 fragment Substances 0.000 claims description 44
- 229910052760 oxygen Inorganic materials 0.000 claims description 41
- 229910052717 sulfur Chemical group 0.000 claims description 40
- 108010003723 Single-Domain Antibodies Proteins 0.000 claims description 38
- 239000000427 antigen Substances 0.000 claims description 37
- 102000036639 antigens Human genes 0.000 claims description 37
- 108091007433 antigens Proteins 0.000 claims description 37
- 125000000732 arylene group Chemical group 0.000 claims description 35
- 239000003795 chemical substances by application Substances 0.000 claims description 35
- 125000005549 heteroarylene group Chemical group 0.000 claims description 35
- 239000001301 oxygen Substances 0.000 claims description 34
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical group [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 claims description 33
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical group [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 claims description 33
- 239000011593 sulfur Chemical group 0.000 claims description 33
- 102000001301 EGF receptor Human genes 0.000 claims description 32
- 125000006588 heterocycloalkylene group Chemical group 0.000 claims description 32
- 239000013598 vector Substances 0.000 claims description 31
- 239000003814 drug Substances 0.000 claims description 30
- 125000004178 (C1-C4) alkyl group Chemical group 0.000 claims description 29
- 108060006698 EGF receptor Proteins 0.000 claims description 29
- 229940124597 therapeutic agent Drugs 0.000 claims description 26
- 125000002993 cycloalkylene group Chemical group 0.000 claims description 24
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 claims description 20
- 150000002632 lipids Chemical group 0.000 claims description 12
- 102000003688 G-Protein-Coupled Receptors Human genes 0.000 claims description 11
- 108090000045 G-Protein-Coupled Receptors Proteins 0.000 claims description 11
- 230000001580 bacterial effect Effects 0.000 claims description 11
- 210000004962 mammalian cell Anatomy 0.000 claims description 10
- 239000000546 pharmaceutical excipient Substances 0.000 claims description 10
- 102000004190 Enzymes Human genes 0.000 claims description 9
- 108090000790 Enzymes Proteins 0.000 claims description 9
- 108010073466 Bombesin Receptors Proteins 0.000 claims description 8
- 108010001857 Cell Surface Receptors Proteins 0.000 claims description 8
- 108010016122 Ghrelin Receptors Proteins 0.000 claims description 8
- 102100039256 Growth hormone secretagogue receptor type 1 Human genes 0.000 claims description 8
- 102000004378 Melanocortin Receptors Human genes 0.000 claims description 8
- 108090000950 Melanocortin Receptors Proteins 0.000 claims description 8
- 102000017922 Neurotensin receptor Human genes 0.000 claims description 8
- 108060003370 Neurotensin receptor Proteins 0.000 claims description 8
- 239000008194 pharmaceutical composition Substances 0.000 claims description 7
- 230000001086 cytosolic effect Effects 0.000 claims description 6
- 230000002103 transcriptional effect Effects 0.000 claims description 6
- HWYCFZUSOBOBIN-AQJXLSMYSA-N (2s)-2-[[(2s)-1-[(2s)-5-amino-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]-3-phenylpropanoyl]amino]-5-oxopentanoyl]pyrrolidine-2-carbonyl]amino]-n-[(2s)-1-[[(2s)-1-amino-1-oxo-3-phenylpropan-2-yl]amino]-5-(diaminome Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(N)=O)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(N)=O)C1=CC=CC=C1 HWYCFZUSOBOBIN-AQJXLSMYSA-N 0.000 claims description 5
- 102000040125 5-hydroxytryptamine receptor family Human genes 0.000 claims description 5
- 108091032151 5-hydroxytryptamine receptor family Proteins 0.000 claims description 5
- 108010004276 A18Famide Proteins 0.000 claims description 5
- 108091008803 APLNR Proteins 0.000 claims description 5
- 102000007471 Adenosine A2A receptor Human genes 0.000 claims description 5
- 108010085277 Adenosine A2A receptor Proteins 0.000 claims description 5
- 102000007470 Adenosine A2B Receptor Human genes 0.000 claims description 5
- 108010085273 Adenosine A2B receptor Proteins 0.000 claims description 5
- 102000009346 Adenosine receptors Human genes 0.000 claims description 5
- 108050000203 Adenosine receptors Proteins 0.000 claims description 5
- 102000008873 Angiotensin II receptor Human genes 0.000 claims description 5
- 108050000824 Angiotensin II receptor Proteins 0.000 claims description 5
- 102000016555 Apelin receptors Human genes 0.000 claims description 5
- 102000017002 Bile acid receptors Human genes 0.000 claims description 5
- 108070000005 Bile acid receptors Proteins 0.000 claims description 5
- 102000010183 Bradykinin receptor Human genes 0.000 claims description 5
- 108050001736 Bradykinin receptor Proteins 0.000 claims description 5
- 102000018208 Cannabinoid Receptor Human genes 0.000 claims description 5
- 108050007331 Cannabinoid receptor Proteins 0.000 claims description 5
- 102100031011 Chemerin-like receptor 1 Human genes 0.000 claims description 5
- 102000009410 Chemokine receptor Human genes 0.000 claims description 5
- 108050000299 Chemokine receptor Proteins 0.000 claims description 5
- 102000004859 Cholecystokinin Receptors Human genes 0.000 claims description 5
- 108090001085 Cholecystokinin Receptors Proteins 0.000 claims description 5
- 108010009685 Cholinergic Receptors Proteins 0.000 claims description 5
- 102000015554 Dopamine receptor Human genes 0.000 claims description 5
- 108050004812 Dopamine receptor Proteins 0.000 claims description 5
- 102000010180 Endothelin receptor Human genes 0.000 claims description 5
- 108050001739 Endothelin receptor Proteins 0.000 claims description 5
- 102000011652 Formyl peptide receptors Human genes 0.000 claims description 5
- 108010076288 Formyl peptide receptors Proteins 0.000 claims description 5
- 108070000009 Free fatty acid receptors Proteins 0.000 claims description 5
- 108091006027 G proteins Proteins 0.000 claims description 5
- 108700012941 GNRH1 Proteins 0.000 claims description 5
- 102000030782 GTP binding Human genes 0.000 claims description 5
- 108091000058 GTP-Binding Proteins 0.000 claims description 5
- 102000011392 Galanin receptor Human genes 0.000 claims description 5
- 108050001605 Galanin receptor Proteins 0.000 claims description 5
- 102000017357 Glycoprotein hormone receptor Human genes 0.000 claims description 5
- 108050005395 Glycoprotein hormone receptor Proteins 0.000 claims description 5
- 102000000543 Histamine Receptors Human genes 0.000 claims description 5
- 108010002059 Histamine Receptors Proteins 0.000 claims description 5
- 101000919756 Homo sapiens Chemerin-like receptor 1 Proteins 0.000 claims description 5
- 101000986779 Homo sapiens Orexigenic neuropeptide QRFP Proteins 0.000 claims description 5
- 101001062098 Homo sapiens RNA-binding protein 14 Proteins 0.000 claims description 5
- 101000836174 Homo sapiens Tumor protein p53-inducible nuclear protein 1 Proteins 0.000 claims description 5
- 108091006343 Hydroxycarboxylic acid receptors Proteins 0.000 claims description 5
- 102100022888 KN motif and ankyrin repeat domain-containing protein 2 Human genes 0.000 claims description 5
- 108010012048 Kisspeptins Proteins 0.000 claims description 5
- 102000013599 Kisspeptins Human genes 0.000 claims description 5
- 102000016994 Lysolipids receptors Human genes 0.000 claims description 5
- 108070000013 Lysolipids receptors Proteins 0.000 claims description 5
- 102000029828 Melanin-concentrating hormone receptor Human genes 0.000 claims description 5
- 108010047068 Melanin-concentrating hormone receptor Proteins 0.000 claims description 5
- 108050009605 Melatonin receptor Proteins 0.000 claims description 5
- 102000001419 Melatonin receptor Human genes 0.000 claims description 5
- 108700040483 Motilin receptors Proteins 0.000 claims description 5
- 102000030937 Neuromedin U receptor Human genes 0.000 claims description 5
- 108010002741 Neuromedin U receptor Proteins 0.000 claims description 5
- 102400001090 Neuropeptide AF Human genes 0.000 claims description 5
- 102100038842 Neuropeptide B Human genes 0.000 claims description 5
- 102400001095 Neuropeptide FF Human genes 0.000 claims description 5
- 102000016990 Neuropeptide S receptor Human genes 0.000 claims description 5
- 108070000017 Neuropeptide S receptor Proteins 0.000 claims description 5
- 102100021875 Neuropeptide W Human genes 0.000 claims description 5
- 101710100561 Neuropeptide W Proteins 0.000 claims description 5
- 108050002826 Neuropeptide Y Receptor Proteins 0.000 claims description 5
- 102000012301 Neuropeptide Y receptor Human genes 0.000 claims description 5
- 102000003840 Opioid Receptors Human genes 0.000 claims description 5
- 108090000137 Opioid Receptors Proteins 0.000 claims description 5
- 102000010175 Opsin Human genes 0.000 claims description 5
- 108050001704 Opsin Proteins 0.000 claims description 5
- 102100028142 Orexigenic neuropeptide QRFP Human genes 0.000 claims description 5
- 108050000742 Orexin Receptor Proteins 0.000 claims description 5
- 102000008834 Orexin receptor Human genes 0.000 claims description 5
- 102000016978 Orphan receptors Human genes 0.000 claims description 5
- 108070000031 Orphan receptors Proteins 0.000 claims description 5
- 108700023400 Platelet-activating factor receptors Proteins 0.000 claims description 5
- 108070000023 Prokineticin receptors Proteins 0.000 claims description 5
- 102000056271 Prolactin-releasing peptide receptors Human genes 0.000 claims description 5
- 108700024163 Prolactin-releasing peptide receptors Proteins 0.000 claims description 5
- 102000002020 Protease-activated receptors Human genes 0.000 claims description 5
- 108050009310 Protease-activated receptors Proteins 0.000 claims description 5
- 102000002298 Purinergic P2Y Receptors Human genes 0.000 claims description 5
- 108010000818 Purinergic P2Y Receptors Proteins 0.000 claims description 5
- 102000003743 Relaxin Human genes 0.000 claims description 5
- 108090000103 Relaxin Proteins 0.000 claims description 5
- 102000016983 Releasing hormones receptors Human genes 0.000 claims description 5
- 108050001286 Somatostatin Receptor Proteins 0.000 claims description 5
- 102000011096 Somatostatin receptor Human genes 0.000 claims description 5
- 102000007124 Tachykinin Receptors Human genes 0.000 claims description 5
- 108010072901 Tachykinin Receptors Proteins 0.000 claims description 5
- 102000004852 Thyrotropin-releasing hormone receptors Human genes 0.000 claims description 5
- 108090001094 Thyrotropin-releasing hormone receptors Proteins 0.000 claims description 5
- 102000016981 Trace amine receptors Human genes 0.000 claims description 5
- 108070000027 Trace amine receptors Proteins 0.000 claims description 5
- 101150056450 UTS2R gene Proteins 0.000 claims description 5
- 102000004136 Vasopressin Receptors Human genes 0.000 claims description 5
- 108090000643 Vasopressin Receptors Proteins 0.000 claims description 5
- 102000034337 acetylcholine receptors Human genes 0.000 claims description 5
- 102000015694 estrogen receptors Human genes 0.000 claims description 5
- 108010038795 estrogen receptors Proteins 0.000 claims description 5
- 230000003834 intracellular effect Effects 0.000 claims description 5
- 102000003835 leukotriene receptors Human genes 0.000 claims description 5
- 108090000146 leukotriene receptors Proteins 0.000 claims description 5
- 108010085094 neuropeptide B Proteins 0.000 claims description 5
- 102000014187 peptide receptors Human genes 0.000 claims description 5
- 108010011903 peptide receptors Proteins 0.000 claims description 5
- 108010055752 phenylalanyl-leucyl-phenylalanyl-glutaminyl-prolyl-glutaminyl-arginyl-phenylalaninamide Proteins 0.000 claims description 5
- 102000030769 platelet activating factor receptor Human genes 0.000 claims description 5
- 102000017953 prostanoid receptors Human genes 0.000 claims description 5
- 108050007059 prostanoid receptors Proteins 0.000 claims description 5
- 102000000844 Cell Surface Receptors Human genes 0.000 claims 1
- 102100033818 Motilin receptor Human genes 0.000 claims 1
- 125000000837 carbohydrate group Chemical group 0.000 claims 1
- 238000000034 method Methods 0.000 abstract description 41
- 235000018102 proteins Nutrition 0.000 description 354
- 235000001014 amino acid Nutrition 0.000 description 184
- 229940024606 amino acid Drugs 0.000 description 183
- 125000001424 substituent group Chemical group 0.000 description 112
- 125000003275 alpha amino acid group Chemical group 0.000 description 44
- 239000000203 mixture Substances 0.000 description 42
- 229910052799 carbon Inorganic materials 0.000 description 41
- 125000003729 nucleotide group Chemical group 0.000 description 41
- 239000002773 nucleotide Substances 0.000 description 39
- 239000002671 adjuvant Substances 0.000 description 36
- 150000001721 carbon Chemical group 0.000 description 36
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 36
- 125000002950 monocyclic group Chemical group 0.000 description 30
- 108090000765 processed proteins & peptides Proteins 0.000 description 29
- 125000005647 linker group Chemical group 0.000 description 28
- 238000006243 chemical reaction Methods 0.000 description 26
- 239000000126 substance Substances 0.000 description 26
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 23
- 229960004441 tyrosine Drugs 0.000 description 22
- 125000000539 amino acid group Chemical group 0.000 description 21
- 230000000295 complement effect Effects 0.000 description 21
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 21
- 108091028043 Nucleic acid sequence Proteins 0.000 description 20
- KAESVJOAVNADME-UHFFFAOYSA-N Pyrrole Chemical compound C=1C=CNC=1 KAESVJOAVNADME-UHFFFAOYSA-N 0.000 description 20
- 102000004196 processed proteins & peptides Human genes 0.000 description 20
- 239000000243 solution Substances 0.000 description 19
- YLQBMQCUIZJEEH-UHFFFAOYSA-N Furan Chemical compound C=1C=COC=1 YLQBMQCUIZJEEH-UHFFFAOYSA-N 0.000 description 18
- YTPLMLYBLZKORZ-UHFFFAOYSA-N Thiophene Chemical compound C=1C=CSC=1 YTPLMLYBLZKORZ-UHFFFAOYSA-N 0.000 description 18
- 125000004429 atom Chemical group 0.000 description 18
- 108020004414 DNA Proteins 0.000 description 17
- 239000004472 Lysine Substances 0.000 description 17
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 17
- 150000003254 radicals Chemical class 0.000 description 17
- 238000004132 cross linking Methods 0.000 description 16
- 125000004122 cyclic group Chemical group 0.000 description 16
- 125000001997 phenyl group Chemical group [H]C1=C([H])C([H])=C(*)C([H])=C1[H] 0.000 description 16
- 229920001184 polypeptide Polymers 0.000 description 16
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 15
- 125000001309 chloro group Chemical group Cl* 0.000 description 15
- 102000037865 fusion proteins Human genes 0.000 description 15
- 108020001507 fusion proteins Proteins 0.000 description 15
- 102000040430 polynucleotide Human genes 0.000 description 15
- 108091033319 polynucleotide Proteins 0.000 description 15
- 239000002157 polynucleotide Substances 0.000 description 15
- 238000006467 substitution reaction Methods 0.000 description 14
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 13
- 125000001246 bromo group Chemical group Br* 0.000 description 13
- 125000002618 bicyclic heterocycle group Chemical group 0.000 description 12
- 125000004432 carbon atom Chemical group C* 0.000 description 12
- 108020004566 Transfer RNA Proteins 0.000 description 11
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 11
- 238000011534 incubation Methods 0.000 description 11
- 230000003993 interaction Effects 0.000 description 11
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 description 11
- 229940035032 monophosphoryl lipid a Drugs 0.000 description 11
- 125000002023 trifluoromethyl group Chemical group FC(F)(F)* 0.000 description 11
- 150000001720 carbohydrates Chemical group 0.000 description 10
- 229920000642 polymer Polymers 0.000 description 10
- 239000000758 substrate Substances 0.000 description 10
- 125000001493 tyrosinyl group Chemical group [H]OC1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 10
- DJMUYABFXCIYSC-UHFFFAOYSA-N 1H-phosphole Chemical compound C=1C=CPC=1 DJMUYABFXCIYSC-UHFFFAOYSA-N 0.000 description 9
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 9
- ZCQWOFVYLHDMMC-UHFFFAOYSA-N Oxazole Chemical compound C1=COC=N1 ZCQWOFVYLHDMMC-UHFFFAOYSA-N 0.000 description 9
- WTKZEGDFNFYCGP-UHFFFAOYSA-N Pyrazole Chemical compound C=1C=NNC=1 WTKZEGDFNFYCGP-UHFFFAOYSA-N 0.000 description 9
- FZWLAAWBMGSTSO-UHFFFAOYSA-N Thiazole Chemical compound C1=CSC=N1 FZWLAAWBMGSTSO-UHFFFAOYSA-N 0.000 description 9
- 125000002619 bicyclic group Chemical group 0.000 description 9
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 9
- 201000010099 disease Diseases 0.000 description 9
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 9
- 238000003384 imaging method Methods 0.000 description 9
- 230000002163 immunogen Effects 0.000 description 9
- ZLTPDFXIESTBQG-UHFFFAOYSA-N isothiazole Chemical compound C=1C=NSC=1 ZLTPDFXIESTBQG-UHFFFAOYSA-N 0.000 description 9
- CTAPFRYPJLPFDF-UHFFFAOYSA-N isoxazole Chemical compound C=1C=NOC=1 CTAPFRYPJLPFDF-UHFFFAOYSA-N 0.000 description 9
- 229930192474 thiophene Natural products 0.000 description 9
- 238000001890 transfection Methods 0.000 description 9
- 150000003852 triazoles Chemical class 0.000 description 9
- 108020005098 Anticodon Proteins 0.000 description 8
- 108020004705 Codon Proteins 0.000 description 8
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 8
- 102100040678 Programmed cell death protein 1 Human genes 0.000 description 8
- 101710089372 Programmed cell death protein 1 Proteins 0.000 description 8
- 229910006074 SO2NH2 Inorganic materials 0.000 description 8
- 229940037003 alum Drugs 0.000 description 8
- 125000000392 cycloalkenyl group Chemical group 0.000 description 8
- JROGBPMEKVAPEH-GXGBFOEMSA-N emetine dihydrochloride Chemical compound Cl.Cl.N1CCC2=CC(OC)=C(OC)C=C2[C@H]1C[C@H]1C[C@H]2C3=CC(OC)=C(OC)C=C3CCN2C[C@@H]1CC JROGBPMEKVAPEH-GXGBFOEMSA-N 0.000 description 8
- 238000009472 formulation Methods 0.000 description 8
- 125000000623 heterocyclic group Chemical group 0.000 description 8
- 102000006240 membrane receptors Human genes 0.000 description 8
- 239000002105 nanoparticle Substances 0.000 description 8
- 239000013612 plasmid Substances 0.000 description 8
- 102000021127 protein binding proteins Human genes 0.000 description 8
- 108091011138 protein binding proteins Proteins 0.000 description 8
- 241000894007 species Species 0.000 description 8
- 102000008096 B7-H1 Antigen Human genes 0.000 description 7
- 108010074708 B7-H1 Antigen Proteins 0.000 description 7
- UFHFLCQGNIYNRP-UHFFFAOYSA-N Hydrogen Chemical compound [H][H] UFHFLCQGNIYNRP-UHFFFAOYSA-N 0.000 description 7
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 7
- 206010028980 Neoplasm Diseases 0.000 description 7
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 7
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 7
- 229910006069 SO3H Inorganic materials 0.000 description 7
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 7
- 239000004473 Threonine Substances 0.000 description 7
- 238000007792 addition Methods 0.000 description 7
- 235000004279 alanine Nutrition 0.000 description 7
- GDTBXPJZTBHREO-UHFFFAOYSA-N bromine Chemical compound BrBr GDTBXPJZTBHREO-UHFFFAOYSA-N 0.000 description 7
- 201000011510 cancer Diseases 0.000 description 7
- 238000012217 deletion Methods 0.000 description 7
- 230000037430 deletion Effects 0.000 description 7
- 230000000694 effects Effects 0.000 description 7
- 229940088598 enzyme Drugs 0.000 description 7
- 125000001153 fluoro group Chemical group F* 0.000 description 7
- 230000006870 function Effects 0.000 description 7
- 125000000717 hydrazino group Chemical group [H]N([*])N([H])[H] 0.000 description 7
- 230000028993 immune response Effects 0.000 description 7
- 125000004433 nitrogen atom Chemical group N* 0.000 description 7
- SFZCNBIFKDRMGX-UHFFFAOYSA-N sulfur hexafluoride Chemical group FS(F)(F)(F)(F)F SFZCNBIFKDRMGX-UHFFFAOYSA-N 0.000 description 7
- 229960000909 sulfur hexafluoride Drugs 0.000 description 7
- 230000001225 therapeutic effect Effects 0.000 description 7
- 229960005486 vaccine Drugs 0.000 description 7
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 7
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 6
- 125000001313 C5-C10 heteroaryl group Chemical group 0.000 description 6
- RTZKZFJDLAIYFH-UHFFFAOYSA-N Diethyl ether Chemical compound CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 description 6
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 6
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical group CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 6
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 6
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 6
- IGLNJRXAVVLDKE-OIOBTWANSA-N Rubidium-82 Chemical compound [82Rb] IGLNJRXAVVLDKE-OIOBTWANSA-N 0.000 description 6
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 6
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 6
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 6
- 239000002253 acid Substances 0.000 description 6
- 229910052794 bromium Inorganic materials 0.000 description 6
- 239000003153 chemical reaction reagent Substances 0.000 description 6
- 239000000460 chlorine Substances 0.000 description 6
- 229910052801 chlorine Inorganic materials 0.000 description 6
- 239000011737 fluorine Substances 0.000 description 6
- 229910052731 fluorine Inorganic materials 0.000 description 6
- 125000004435 hydrogen atom Chemical group [H]* 0.000 description 6
- 229930182817 methionine Chemical group 0.000 description 6
- 238000002360 preparation method Methods 0.000 description 6
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 6
- 125000003107 substituted aryl group Chemical group 0.000 description 6
- 125000000876 trifluoromethoxy group Chemical group FC(F)(F)O* 0.000 description 6
- 239000013603 viral vector Substances 0.000 description 6
- 230000003612 virological effect Effects 0.000 description 6
- 125000004209 (C1-C8) alkyl group Chemical group 0.000 description 5
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 5
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 5
- 102000053602 DNA Human genes 0.000 description 5
- 241000282412 Homo Species 0.000 description 5
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 5
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicon dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 5
- 241000534944 Thia Species 0.000 description 5
- 229960001230 asparagine Drugs 0.000 description 5
- 235000009582 asparagine Nutrition 0.000 description 5
- 235000018417 cysteine Nutrition 0.000 description 5
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 5
- 239000006185 dispersion Substances 0.000 description 5
- 239000000839 emulsion Substances 0.000 description 5
- 125000000524 functional group Chemical group 0.000 description 5
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 5
- 229960000310 isoleucine Drugs 0.000 description 5
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 5
- 229920001223 polyethylene glycol Polymers 0.000 description 5
- 239000011541 reaction mixture Substances 0.000 description 5
- 230000009257 reactivity Effects 0.000 description 5
- 230000002829 reductive effect Effects 0.000 description 5
- 229920006395 saturated elastomer Polymers 0.000 description 5
- 239000011734 sodium Substances 0.000 description 5
- 125000000547 substituted alkyl group Chemical group 0.000 description 5
- YYGNTYWPHWGJRM-UHFFFAOYSA-N (6E,10E,14E,18E)-2,6,10,15,19,23-hexamethyltetracosa-2,6,10,14,18,22-hexaene Chemical compound CC(C)=CCCC(C)=CCCC(C)=CCCC=C(C)CCC=C(C)CCC=C(C)C YYGNTYWPHWGJRM-UHFFFAOYSA-N 0.000 description 4
- 125000003837 (C1-C20) alkyl group Chemical group 0.000 description 4
- 125000006570 (C5-C6) heteroaryl group Chemical group 0.000 description 4
- 125000006582 (C5-C6) heterocycloalkyl group Chemical group 0.000 description 4
- WKBOTKDWSSQWDR-UHFFFAOYSA-N Bromine atom Chemical group [Br] WKBOTKDWSSQWDR-UHFFFAOYSA-N 0.000 description 4
- 125000000041 C6-C10 aryl group Chemical group 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- XEKOWRVHYACXOJ-UHFFFAOYSA-N Ethyl acetate Chemical compound CCOC(C)=O XEKOWRVHYACXOJ-UHFFFAOYSA-N 0.000 description 4
- 229910052688 Gadolinium Inorganic materials 0.000 description 4
- 102000004457 Granulocyte-Macrophage Colony-Stimulating Factor Human genes 0.000 description 4
- 108010017213 Granulocyte-Macrophage Colony-Stimulating Factor Proteins 0.000 description 4
- 241000238631 Hexapoda Species 0.000 description 4
- 108060003951 Immunoglobulin Proteins 0.000 description 4
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 4
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 4
- LRQKBLKVPFOOQJ-YFKPBYRVSA-N L-norleucine Chemical group CCCC[C@H]([NH3+])C([O-])=O LRQKBLKVPFOOQJ-YFKPBYRVSA-N 0.000 description 4
- 102000057413 Motilin receptors Human genes 0.000 description 4
- JGFZNNIVVJXRND-UHFFFAOYSA-N N,N-Diisopropylethylamine (DIPEA) Chemical compound CCN(C(C)C)C(C)C JGFZNNIVVJXRND-UHFFFAOYSA-N 0.000 description 4
- IMNFDUFMRHMDMM-UHFFFAOYSA-N N-Heptane Chemical compound CCCCCCC IMNFDUFMRHMDMM-UHFFFAOYSA-N 0.000 description 4
- 108020005038 Terminator Codon Proteins 0.000 description 4
- BHEOSNUKNHRBNM-UHFFFAOYSA-N Tetramethylsqualene Natural products CC(=C)C(C)CCC(=C)C(C)CCC(C)=CCCC=C(C)CCC(C)C(=C)CCC(C)C(C)=C BHEOSNUKNHRBNM-UHFFFAOYSA-N 0.000 description 4
- 230000000692 anti-sense effect Effects 0.000 description 4
- 239000007864 aqueous solution Substances 0.000 description 4
- 230000001413 cellular effect Effects 0.000 description 4
- 229940126214 compound 3 Drugs 0.000 description 4
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 4
- 239000000539 dimer Substances 0.000 description 4
- PRAKJMSDJKAYCZ-UHFFFAOYSA-N dodecahydrosqualene Natural products CC(C)CCCC(C)CCCC(C)CCCCC(C)CCCC(C)CCCC(C)C PRAKJMSDJKAYCZ-UHFFFAOYSA-N 0.000 description 4
- 125000006575 electron-withdrawing group Chemical group 0.000 description 4
- 230000002068 genetic effect Effects 0.000 description 4
- 230000036541 health Effects 0.000 description 4
- 238000003018 immunoassay Methods 0.000 description 4
- 102000018358 immunoglobulin Human genes 0.000 description 4
- 238000010348 incorporation Methods 0.000 description 4
- 239000007788 liquid Substances 0.000 description 4
- VNWKTOKETHGBQD-UHFFFAOYSA-N methane Chemical compound C VNWKTOKETHGBQD-UHFFFAOYSA-N 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 125000006578 monocyclic heterocycloalkyl group Chemical group 0.000 description 4
- 239000000178 monomer Substances 0.000 description 4
- 239000003921 oil Substances 0.000 description 4
- 239000012074 organic phase Substances 0.000 description 4
- 239000002245 particle Substances 0.000 description 4
- ZJAOAACCNHFJAH-UHFFFAOYSA-N phosphonoformic acid Chemical class OC(=O)P(O)(O)=O ZJAOAACCNHFJAH-UHFFFAOYSA-N 0.000 description 4
- 239000000843 powder Substances 0.000 description 4
- 239000007787 solid Substances 0.000 description 4
- 229940031439 squalene Drugs 0.000 description 4
- TUHBEKDERLKLEC-UHFFFAOYSA-N squalene Natural products CC(=CCCC(=CCCC(=CCCC=C(/C)CCC=C(/C)CC=C(C)C)C)C)C TUHBEKDERLKLEC-UHFFFAOYSA-N 0.000 description 4
- 125000005346 substituted cycloalkyl group Chemical group 0.000 description 4
- 239000003826 tablet Substances 0.000 description 4
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 4
- 238000001262 western blot Methods 0.000 description 4
- 125000005913 (C3-C6) cycloalkyl group Chemical group 0.000 description 3
- 125000006552 (C3-C8) cycloalkyl group Chemical group 0.000 description 3
- 238000001644 13C nuclear magnetic resonance spectroscopy Methods 0.000 description 3
- ZCYVEMRRCGMTRW-UHFFFAOYSA-N 7553-56-2 Chemical group [I] ZCYVEMRRCGMTRW-UHFFFAOYSA-N 0.000 description 3
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 3
- 101710159080 Aconitate hydratase A Proteins 0.000 description 3
- 101710159078 Aconitate hydratase B Proteins 0.000 description 3
- 102000052866 Amino Acyl-tRNA Synthetases Human genes 0.000 description 3
- 108700028939 Amino Acyl-tRNA Synthetases Proteins 0.000 description 3
- 101100067974 Arabidopsis thaliana POP2 gene Proteins 0.000 description 3
- 108010039939 Cell Wall Skeleton Proteins 0.000 description 3
- ZAMOUSCENKQFHK-UHFFFAOYSA-N Chlorine atom Chemical group [Cl] ZAMOUSCENKQFHK-UHFFFAOYSA-N 0.000 description 3
- 108010089448 Cholecystokinin B Receptor Proteins 0.000 description 3
- 108091026890 Coding region Proteins 0.000 description 3
- 102100036016 Gastrin/cholecystokinin type B receptor Human genes 0.000 description 3
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 3
- 239000004471 Glycine Substances 0.000 description 3
- 101100118549 Homo sapiens EGFR gene Proteins 0.000 description 3
- 101001012157 Homo sapiens Receptor tyrosine-protein kinase erbB-2 Proteins 0.000 description 3
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 3
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 3
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 3
- ZFOMKMMPBOQKMC-KXUCPTDWSA-N L-pyrrolysine Chemical compound C[C@@H]1CC=N[C@H]1C(=O)NCCCC[C@H]([NH3+])C([O-])=O ZFOMKMMPBOQKMC-KXUCPTDWSA-N 0.000 description 3
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 3
- 102000008238 LHRH Receptors Human genes 0.000 description 3
- 108010021290 LHRH Receptors Proteins 0.000 description 3
- 238000005481 NMR spectroscopy Methods 0.000 description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 3
- 102000001708 Protein Isoforms Human genes 0.000 description 3
- 108010029485 Protein Isoforms Proteins 0.000 description 3
- 102000044126 RNA-Binding Proteins Human genes 0.000 description 3
- 101710105008 RNA-binding protein Proteins 0.000 description 3
- 102100030086 Receptor tyrosine-protein kinase erbB-2 Human genes 0.000 description 3
- 101710100969 Receptor tyrosine-protein kinase erbB-3 Proteins 0.000 description 3
- 102100029986 Receptor tyrosine-protein kinase erbB-3 Human genes 0.000 description 3
- 102100029981 Receptor tyrosine-protein kinase erbB-4 Human genes 0.000 description 3
- 101710100963 Receptor tyrosine-protein kinase erbB-4 Proteins 0.000 description 3
- 108020004511 Recombinant DNA Proteins 0.000 description 3
- 101100123851 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) HER1 gene Proteins 0.000 description 3
- 229930006000 Sucrose Natural products 0.000 description 3
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 3
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 3
- 108010075974 Vasoactive Intestinal Peptide Receptors Proteins 0.000 description 3
- 102000012088 Vasoactive Intestinal Peptide Receptors Human genes 0.000 description 3
- 239000004480 active ingredient Substances 0.000 description 3
- 239000013543 active substance Substances 0.000 description 3
- 230000002411 adverse Effects 0.000 description 3
- 125000003295 alanine group Chemical group N[C@@H](C)C(=O)* 0.000 description 3
- 150000001298 alcohols Chemical class 0.000 description 3
- 150000001336 alkenes Chemical class 0.000 description 3
- 125000003342 alkenyl group Chemical group 0.000 description 3
- 125000003545 alkoxy group Chemical group 0.000 description 3
- 229920000180 alkyd Polymers 0.000 description 3
- 125000000304 alkynyl group Chemical group 0.000 description 3
- 150000001412 amines Chemical class 0.000 description 3
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 3
- 229960000723 ampicillin Drugs 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 3
- 210000004102 animal cell Anatomy 0.000 description 3
- 238000013459 approach Methods 0.000 description 3
- 125000000613 asparagine group Chemical group N[C@@H](CC(N)=O)C(=O)* 0.000 description 3
- 230000015572 biosynthetic process Effects 0.000 description 3
- 230000037396 body weight Effects 0.000 description 3
- 239000002775 capsule Substances 0.000 description 3
- 210000004520 cell wall skeleton Anatomy 0.000 description 3
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 3
- 229960005091 chloramphenicol Drugs 0.000 description 3
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 3
- 238000002059 diagnostic imaging Methods 0.000 description 3
- 229940079593 drug Drugs 0.000 description 3
- 230000009881 electrostatic interaction Effects 0.000 description 3
- 108010048367 enhanced green fluorescent protein Proteins 0.000 description 3
- 239000013604 expression vector Substances 0.000 description 3
- 239000008103 glucose Substances 0.000 description 3
- 235000011187 glycerol Nutrition 0.000 description 3
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 3
- 229910052737 gold Inorganic materials 0.000 description 3
- 239000010931 gold Substances 0.000 description 3
- 239000005090 green fluorescent protein Substances 0.000 description 3
- 230000002209 hydrophobic effect Effects 0.000 description 3
- 238000000338 in vitro Methods 0.000 description 3
- 238000001727 in vivo Methods 0.000 description 3
- 238000003780 insertion Methods 0.000 description 3
- 230000037431 insertion Effects 0.000 description 3
- 102000006495 integrins Human genes 0.000 description 3
- 108010044426 integrins Proteins 0.000 description 3
- 238000007918 intramuscular administration Methods 0.000 description 3
- 238000001990 intravenous administration Methods 0.000 description 3
- 229910052740 iodine Inorganic materials 0.000 description 3
- 150000002500 ions Chemical class 0.000 description 3
- 238000002595 magnetic resonance imaging Methods 0.000 description 3
- 229910052751 metal Inorganic materials 0.000 description 3
- 239000002184 metal Substances 0.000 description 3
- 125000000956 methoxy group Chemical group [H]C([H])([H])O* 0.000 description 3
- 125000002911 monocyclic heterocycle group Chemical group 0.000 description 3
- 235000019198 oils Nutrition 0.000 description 3
- 230000005298 paramagnetic effect Effects 0.000 description 3
- 150000004713 phosphodiesters Chemical class 0.000 description 3
- 235000010482 polyoxyethylene sorbitan monooleate Nutrition 0.000 description 3
- 229920000053 polysorbate 80 Polymers 0.000 description 3
- 238000002600 positron emission tomography Methods 0.000 description 3
- 239000003755 preservative agent Substances 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 125000000168 pyrrolyl group Chemical group 0.000 description 3
- 102000027426 receptor tyrosine kinases Human genes 0.000 description 3
- 108091008598 receptor tyrosine kinases Proteins 0.000 description 3
- 230000010076 replication Effects 0.000 description 3
- 150000003839 salts Chemical class 0.000 description 3
- 150000007949 saponins Chemical class 0.000 description 3
- 239000000741 silica gel Substances 0.000 description 3
- 229910002027 silica gel Inorganic materials 0.000 description 3
- 239000011780 sodium chloride Substances 0.000 description 3
- 239000002904 solvent Substances 0.000 description 3
- 239000007921 spray Substances 0.000 description 3
- 238000007920 subcutaneous administration Methods 0.000 description 3
- 125000005717 substituted cycloalkylene group Chemical group 0.000 description 3
- 239000005720 sucrose Substances 0.000 description 3
- 235000000346 sugar Nutrition 0.000 description 3
- 239000000725 suspension Substances 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 238000003325 tomography Methods 0.000 description 3
- 238000010361 transduction Methods 0.000 description 3
- 230000026683 transduction Effects 0.000 description 3
- 125000000430 tryptophan group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C2=C([H])C([H])=C([H])C([H])=C12 0.000 description 3
- 239000004474 valine Substances 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- UKAUYVFTDYCKQA-UHFFFAOYSA-N -2-Amino-4-hydroxybutanoic acid Natural products OC(=O)C(N)CCO UKAUYVFTDYCKQA-UHFFFAOYSA-N 0.000 description 2
- RYHBNJHYFVUHQT-UHFFFAOYSA-N 1,4-Dioxane Chemical compound C1COCCO1 RYHBNJHYFVUHQT-UHFFFAOYSA-N 0.000 description 2
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 2
- FPQQSJJWHUJYPU-UHFFFAOYSA-N 3-(dimethylamino)propyliminomethylidene-ethylazanium;chloride Chemical compound Cl.CCN=C=NCCCN(C)C FPQQSJJWHUJYPU-UHFFFAOYSA-N 0.000 description 2
- QGZKDVFQNNGYKY-UHFFFAOYSA-N Ammonia Chemical compound N QGZKDVFQNNGYKY-UHFFFAOYSA-N 0.000 description 2
- 241000272478 Aquila Species 0.000 description 2
- 239000004475 Arginine Substances 0.000 description 2
- 229910052684 Cerium Inorganic materials 0.000 description 2
- 108010047041 Complementarity Determining Regions Proteins 0.000 description 2
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 2
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 2
- 229910052692 Dysprosium Inorganic materials 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- 241000196324 Embryophyta Species 0.000 description 2
- 229910052691 Erbium Inorganic materials 0.000 description 2
- 229910052693 Europium Inorganic materials 0.000 description 2
- 108010010803 Gelatin Proteins 0.000 description 2
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 2
- ZRALSGWEFCBTJO-UHFFFAOYSA-N Guanidine Chemical compound NC(N)=N ZRALSGWEFCBTJO-UHFFFAOYSA-N 0.000 description 2
- 239000007995 HEPES buffer Substances 0.000 description 2
- 229910052689 Holmium Inorganic materials 0.000 description 2
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 2
- UQSXHKLRYXJYBZ-UHFFFAOYSA-N Iron oxide Chemical compound [Fe]=O UQSXHKLRYXJYBZ-UHFFFAOYSA-N 0.000 description 2
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 2
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 2
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 2
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 2
- UKAUYVFTDYCKQA-VKHMYHEASA-N L-homoserine Chemical group OC(=O)[C@@H](N)CCO UKAUYVFTDYCKQA-VKHMYHEASA-N 0.000 description 2
- QEFRNWWLZKMPFJ-ZXPFJRLXSA-N L-methionine (R)-S-oxide Chemical group C[S@@](=O)CC[C@H]([NH3+])C([O-])=O QEFRNWWLZKMPFJ-ZXPFJRLXSA-N 0.000 description 2
- QEFRNWWLZKMPFJ-UHFFFAOYSA-N L-methionine sulphoxide Chemical group CS(=O)CCC(N)C(O)=O QEFRNWWLZKMPFJ-UHFFFAOYSA-N 0.000 description 2
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 2
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 2
- GUBGYTABKSRVRQ-QKKXKWKRSA-N Lactose Natural products OC[C@H]1O[C@@H](O[C@H]2[C@H](O)[C@@H](O)C(O)O[C@@H]2CO)[C@H](O)[C@@H](O)[C@H]1O GUBGYTABKSRVRQ-QKKXKWKRSA-N 0.000 description 2
- 235000010643 Leucaena leucocephala Nutrition 0.000 description 2
- 240000007472 Leucaena leucocephala Species 0.000 description 2
- 229910052765 Lutetium Inorganic materials 0.000 description 2
- 102000007651 Macrophage Colony-Stimulating Factor Human genes 0.000 description 2
- 108010046938 Macrophage Colony-Stimulating Factor Proteins 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 108700020354 N-acetylmuramyl-threonyl-isoglutamine Proteins 0.000 description 2
- 238000004497 NIR spectroscopy Methods 0.000 description 2
- 229910052779 Neodymium Inorganic materials 0.000 description 2
- 108091005461 Nucleic proteins Proteins 0.000 description 2
- 229910019142 PO4 Inorganic materials 0.000 description 2
- 229910052777 Praseodymium Inorganic materials 0.000 description 2
- 108091028664 Ribonucleotide Proteins 0.000 description 2
- 239000012722 SDS sample buffer Substances 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- 229910052772 Samarium Inorganic materials 0.000 description 2
- 101000629318 Severe acute respiratory syndrome coronavirus 2 Spike glycoprotein Proteins 0.000 description 2
- 108020004682 Single-Stranded DNA Proteins 0.000 description 2
- 241000399119 Spio Species 0.000 description 2
- 229920002472 Starch Polymers 0.000 description 2
- 229910052771 Terbium Inorganic materials 0.000 description 2
- 229910052775 Thulium Inorganic materials 0.000 description 2
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical group O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 2
- 108060008682 Tumor Necrosis Factor Proteins 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- 229910052769 Ytterbium Inorganic materials 0.000 description 2
- 150000007513 acids Chemical class 0.000 description 2
- 125000002252 acyl group Chemical group 0.000 description 2
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 2
- 239000000443 aerosol Substances 0.000 description 2
- 150000001345 alkine derivatives Chemical class 0.000 description 2
- NWMHDZMRVUOQGL-CZEIJOLGSA-N almurtide Chemical compound OC(=O)CC[C@H](C(N)=O)NC(=O)[C@H](C)NC(=O)CO[C@@H]([C@H](O)[C@H](O)CO)[C@@H](NC(C)=O)C=O NWMHDZMRVUOQGL-CZEIJOLGSA-N 0.000 description 2
- 125000003277 amino group Chemical group 0.000 description 2
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 2
- 235000009697 arginine Nutrition 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- TZCXTZWJZNENPQ-UHFFFAOYSA-L barium sulfate Chemical compound [Ba+2].[O-]S([O-])(=O)=O TZCXTZWJZNENPQ-UHFFFAOYSA-L 0.000 description 2
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 2
- 239000012267 brine Substances 0.000 description 2
- 235000014633 carbohydrates Nutrition 0.000 description 2
- 239000011203 carbon fibre reinforced carbon Substances 0.000 description 2
- 238000004113 cell culture Methods 0.000 description 2
- 210000000170 cell membrane Anatomy 0.000 description 2
- 210000002421 cell wall Anatomy 0.000 description 2
- 239000001913 cellulose Substances 0.000 description 2
- 229920002678 cellulose Polymers 0.000 description 2
- 239000013522 chelant Substances 0.000 description 2
- 229910052804 chromium Inorganic materials 0.000 description 2
- 238000004440 column chromatography Methods 0.000 description 2
- 229940125904 compound 1 Drugs 0.000 description 2
- 229940125782 compound 2 Drugs 0.000 description 2
- 239000002872 contrast media Substances 0.000 description 2
- 229910052802 copper Inorganic materials 0.000 description 2
- 239000012043 crude product Substances 0.000 description 2
- 125000001995 cyclobutyl group Chemical group [H]C1([H])C([H])([H])C([H])(*)C1([H])[H] 0.000 description 2
- 125000000582 cycloheptyl group Chemical group [H]C1([H])C([H])([H])C([H])([H])C([H])([H])C([H])(*)C([H])([H])C1([H])[H] 0.000 description 2
- 125000000596 cyclohexenyl group Chemical group C1(=CCCCC1)* 0.000 description 2
- 125000000113 cyclohexyl group Chemical group [H]C1([H])C([H])([H])C([H])([H])C([H])(*)C([H])([H])C1([H])[H] 0.000 description 2
- 125000002433 cyclopentenyl group Chemical group C1(=CCCC1)* 0.000 description 2
- 125000001511 cyclopentyl group Chemical group [H]C1([H])C([H])([H])C([H])([H])C([H])(*)C1([H])[H] 0.000 description 2
- 125000001559 cyclopropyl group Chemical group [H]C1([H])C([H])([H])C1([H])* 0.000 description 2
- 229940104302 cytosine Drugs 0.000 description 2
- 239000005547 deoxyribonucleotide Substances 0.000 description 2
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- PMMYEEVYMWASQN-UHFFFAOYSA-N dl-hydroxyproline Natural products OC1C[NH2+]C(C([O-])=O)C1 PMMYEEVYMWASQN-UHFFFAOYSA-N 0.000 description 2
- 238000002592 echocardiography Methods 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 238000001425 electrospray ionisation time-of-flight mass spectrometry Methods 0.000 description 2
- 150000002081 enamines Chemical class 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 150000002148 esters Chemical class 0.000 description 2
- 235000019439 ethyl acetate Nutrition 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 239000013613 expression plasmid Substances 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- YCKRFDGAMUMZLT-BJUDXGSMSA-N fluorine-18 atom Chemical compound [18F] YCKRFDGAMUMZLT-BJUDXGSMSA-N 0.000 description 2
- 230000002538 fungal effect Effects 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- UIWYJDYFSGRHKR-UHFFFAOYSA-N gadolinium atom Chemical compound [Gd] UIWYJDYFSGRHKR-UHFFFAOYSA-N 0.000 description 2
- 239000007789 gas Substances 0.000 description 2
- 229920000159 gelatin Polymers 0.000 description 2
- 235000019322 gelatine Nutrition 0.000 description 2
- 235000011852 gelatine desserts Nutrition 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Substances 0.000 description 2
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 2
- 235000004554 glutamine Nutrition 0.000 description 2
- 150000004676 glycans Chemical class 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 125000001188 haloalkyl group Chemical group 0.000 description 2
- 125000005843 halogen group Chemical group 0.000 description 2
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 2
- 210000005260 human cell Anatomy 0.000 description 2
- 210000004408 hybridoma Anatomy 0.000 description 2
- 229960002591 hydroxyproline Drugs 0.000 description 2
- 230000003053 immunization Effects 0.000 description 2
- 238000002649 immunization Methods 0.000 description 2
- 229940072221 immunoglobulins Drugs 0.000 description 2
- 239000003022 immunostimulating agent Substances 0.000 description 2
- 230000008676 import Effects 0.000 description 2
- 238000001802 infusion Methods 0.000 description 2
- 238000002347 injection Methods 0.000 description 2
- 239000007924 injection Substances 0.000 description 2
- 238000007912 intraperitoneal administration Methods 0.000 description 2
- 239000011630 iodine Chemical group 0.000 description 2
- 229910052742 iron Inorganic materials 0.000 description 2
- XEEYBQQBJWHFJM-UHFFFAOYSA-N iron Substances [Fe] XEEYBQQBJWHFJM-UHFFFAOYSA-N 0.000 description 2
- WTFXARWRTYJXII-UHFFFAOYSA-N iron(2+);iron(3+);oxygen(2-) Chemical compound [O-2].[O-2].[O-2].[O-2].[Fe+2].[Fe+3].[Fe+3] WTFXARWRTYJXII-UHFFFAOYSA-N 0.000 description 2
- 239000008101 lactose Substances 0.000 description 2
- 229910052746 lanthanum Inorganic materials 0.000 description 2
- 125000001909 leucine group Chemical group [H]N(*)C(C(*)=O)C([H])([H])C(C([H])([H])[H])C([H])([H])[H] 0.000 description 2
- 239000000314 lubricant Substances 0.000 description 2
- 239000006166 lysate Substances 0.000 description 2
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 2
- 229920002521 macromolecule Polymers 0.000 description 2
- HQKMJHAJHXVSDF-UHFFFAOYSA-L magnesium stearate Chemical compound [Mg+2].CCCCCCCCCCCCCCCCCC([O-])=O.CCCCCCCCCCCCCCCCCC([O-])=O HQKMJHAJHXVSDF-UHFFFAOYSA-L 0.000 description 2
- 239000006249 magnetic particle Substances 0.000 description 2
- 229910052748 manganese Inorganic materials 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 239000002609 medium Substances 0.000 description 2
- 150000002739 metals Chemical class 0.000 description 2
- OKKJLVBELUTLKV-VMNATFBRSA-N methanol-d1 Chemical compound [2H]OC OKKJLVBELUTLKV-VMNATFBRSA-N 0.000 description 2
- LSDPWZHWYPCBBB-UHFFFAOYSA-O methylsulfide anion Chemical compound [SH2+]C LSDPWZHWYPCBBB-UHFFFAOYSA-O 0.000 description 2
- JMUHBNWAORSSBD-WKYWBUFDSA-N mifamurtide Chemical compound CCCCCCCCCCCCCCCC(=O)OC[C@@H](OC(=O)CCCCCCCCCCCCCCC)COP(O)(=O)OCCNC(=O)[C@H](C)NC(=O)CC[C@H](C(N)=O)NC(=O)[C@H](C)NC(=O)[C@@H](C)O[C@H]1[C@H](O)[C@@H](CO)OC(O)[C@@H]1NC(C)=O JMUHBNWAORSSBD-WKYWBUFDSA-N 0.000 description 2
- 229960005225 mifamurtide Drugs 0.000 description 2
- 108091005601 modified peptides Proteins 0.000 description 2
- 125000004573 morpholin-4-yl group Chemical group N1(CCOCC1)* 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 229910052759 nickel Inorganic materials 0.000 description 2
- 238000009206 nuclear medicine Methods 0.000 description 2
- QYSGYZVSCZSLHT-UHFFFAOYSA-N octafluoropropane Chemical compound FC(F)(F)C(F)(F)C(F)(F)F QYSGYZVSCZSLHT-UHFFFAOYSA-N 0.000 description 2
- 229960004065 perflutren Drugs 0.000 description 2
- 238000002823 phage display Methods 0.000 description 2
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 2
- 229960005190 phenylalanine Drugs 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 2
- 239000010452 phosphate Substances 0.000 description 2
- XUYJLQHKOGNDPB-UHFFFAOYSA-N phosphonoacetic acid Chemical compound OC(=O)CP(O)(O)=O XUYJLQHKOGNDPB-UHFFFAOYSA-N 0.000 description 2
- 229910052698 phosphorus Inorganic materials 0.000 description 2
- BZQFBWGGLXLEPQ-REOHCLBHSA-N phosphoserine Chemical compound OC(=O)[C@@H](N)COP(O)(O)=O BZQFBWGGLXLEPQ-REOHCLBHSA-N 0.000 description 2
- 239000006187 pill Substances 0.000 description 2
- 210000001778 pluripotent stem cell Anatomy 0.000 description 2
- 239000002244 precipitate Substances 0.000 description 2
- 230000002028 premature Effects 0.000 description 2
- 229930010796 primary metabolite Natural products 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 125000000561 purinyl group Chemical group N1=C(N=C2N=CNC2=C1)* 0.000 description 2
- 125000003373 pyrazinyl group Chemical group 0.000 description 2
- 125000000714 pyrimidinyl group Chemical group 0.000 description 2
- 239000001397 quillaja saponaria molina bark Substances 0.000 description 2
- 230000002285 radioactive effect Effects 0.000 description 2
- 238000002601 radiography Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 230000001177 retroviral effect Effects 0.000 description 2
- 239000002336 ribonucleotide Substances 0.000 description 2
- 125000002652 ribonucleotide group Chemical group 0.000 description 2
- 125000006413 ring segment Chemical group 0.000 description 2
- 229930182490 saponin Natural products 0.000 description 2
- 229930000044 secondary metabolite Natural products 0.000 description 2
- 229910052710 silicon Inorganic materials 0.000 description 2
- 229940126586 small molecule drug Drugs 0.000 description 2
- 150000003384 small molecules Chemical class 0.000 description 2
- HPALAKNZSZLMCH-UHFFFAOYSA-M sodium;chloride;hydrate Chemical compound O.[Na+].[Cl-] HPALAKNZSZLMCH-UHFFFAOYSA-M 0.000 description 2
- 238000001228 spectrum Methods 0.000 description 2
- 239000003381 stabilizer Substances 0.000 description 2
- 238000010561 standard procedure Methods 0.000 description 2
- 239000008107 starch Substances 0.000 description 2
- 235000019698 starch Nutrition 0.000 description 2
- 230000000638 stimulation Effects 0.000 description 2
- 125000004434 sulfur atom Chemical group 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- 208000024891 symptom Diseases 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 239000006188 syrup Substances 0.000 description 2
- 235000020357 syrup Nutrition 0.000 description 2
- 238000011285 therapeutic regimen Methods 0.000 description 2
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 2
- 229940113082 thymine Drugs 0.000 description 2
- FGMPLJWBKKVCDB-UHFFFAOYSA-N trans-L-hydroxy-proline Natural products ON1CCCC1C(O)=O FGMPLJWBKKVCDB-UHFFFAOYSA-N 0.000 description 2
- 238000003151 transfection method Methods 0.000 description 2
- 230000014616 translation Effects 0.000 description 2
- XETCRXVKJHBPMK-MJSODCSWSA-N trehalose 6,6'-dimycolate Chemical compound C([C@@H]1[C@H]([C@H](O)[C@@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](O)[C@H](O)[C@@H](COC(=O)C(CCCCCCCCCCC3C(C3)CCCCCCCCCCCCCCCCCC)C(O)CCCCCCCCCCCCCCCCCCCCCCCCC)O2)O)O1)O)OC(=O)C(C(O)CCCCCCCCCCCCCCCCCCCCCCCCC)CCCCCCCCCCC1CC1CCCCCCCCCCCCCCCCCC XETCRXVKJHBPMK-MJSODCSWSA-N 0.000 description 2
- 102000003390 tumor necrosis factor Human genes 0.000 description 2
- 238000002113 ultrasound elastography Methods 0.000 description 2
- 125000004417 unsaturated alkyl group Chemical group 0.000 description 2
- 229910052720 vanadium Inorganic materials 0.000 description 2
- 239000003981 vehicle Substances 0.000 description 2
- 239000003643 water by type Substances 0.000 description 2
- 210000005253 yeast cell Anatomy 0.000 description 2
- KCRZBDJVYOBHIP-HHQFNNIRSA-N (1r,2s)-2-aminocycloheptane-1-carboxylic acid;hydrochloride Chemical compound Cl.N[C@H]1CCCCC[C@H]1C(O)=O KCRZBDJVYOBHIP-HHQFNNIRSA-N 0.000 description 1
- RIKSICCAWWEQSL-CIRBGYJCSA-N (1s,2r)-2-amino-2-methylcyclohexane-1-carboxylic acid;hydrochloride Chemical compound Cl.C[C@@]1(N)CCCC[C@@H]1C(O)=O RIKSICCAWWEQSL-CIRBGYJCSA-N 0.000 description 1
- XSGMGAINOILNJR-PGUFJCEWSA-N (2r)-2-(9h-fluoren-9-ylmethoxycarbonylamino)-3-methyl-3-tritylsulfanylbutanoic acid Chemical compound CC(C)([C@H](NC(=O)OCC1C2=CC=CC=C2C2=CC=CC=C21)C(O)=O)SC(C=1C=CC=CC=1)(C=1C=CC=CC=1)C1=CC=CC=C1 XSGMGAINOILNJR-PGUFJCEWSA-N 0.000 description 1
- UZDKQMIDSLETST-ZCFIWIBFSA-N (2r)-2-[(2-methylpropan-2-yl)oxycarbonylamino]-3-(2,3,4,5,6-pentafluorophenyl)propanoic acid Chemical compound CC(C)(C)OC(=O)N[C@@H](C(O)=O)CC1=C(F)C(F)=C(F)C(F)=C1F UZDKQMIDSLETST-ZCFIWIBFSA-N 0.000 description 1
- OXNUZCWFCJRJSU-SECBINFHSA-N (2r)-2-amino-3-[4-(hydroxymethyl)phenyl]propanoic acid Chemical compound OC(=O)[C@H](N)CC1=CC=C(CO)C=C1 OXNUZCWFCJRJSU-SECBINFHSA-N 0.000 description 1
- RCZHBTHQISEPPP-LLVKDONJSA-N (2r)-3-(3-chlorophenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(C)(C)OC(=O)N[C@@H](C(O)=O)CC1=CC=CC(Cl)=C1 RCZHBTHQISEPPP-LLVKDONJSA-N 0.000 description 1
- ULNOXUAEIPUJMK-LLVKDONJSA-N (2r)-3-(4-bromophenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(C)(C)OC(=O)N[C@@H](C(O)=O)CC1=CC=C(Br)C=C1 ULNOXUAEIPUJMK-LLVKDONJSA-N 0.000 description 1
- PLYYQWWELYJSEB-DEOSSOPVSA-N (2s)-2-(2,3-dihydro-1h-inden-2-yl)-2-(9h-fluoren-9-ylmethoxycarbonylamino)acetic acid Chemical compound C1C2=CC=CC=C2CC1[C@@H](C(=O)O)NC(=O)OCC1C2=CC=CC=C2C2=CC=CC=C21 PLYYQWWELYJSEB-DEOSSOPVSA-N 0.000 description 1
- VCHHRDDQOOBPTC-ZDUSSCGKSA-N (2s)-2-(2,3-dihydro-1h-inden-2-yl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]acetic acid Chemical compound C1=CC=C2CC([C@H](NC(=O)OC(C)(C)C)C(O)=O)CC2=C1 VCHHRDDQOOBPTC-ZDUSSCGKSA-N 0.000 description 1
- DLOGILOIJKBYKA-KRWDZBQOSA-N (2s)-2-(9h-fluoren-9-ylmethoxycarbonylamino)-3-(2,3,4,5,6-pentafluorophenyl)propanoic acid Chemical compound C([C@@H](C(=O)O)NC(=O)OCC1C2=CC=CC=C2C2=CC=CC=C21)C1=C(F)C(F)=C(F)C(F)=C1F DLOGILOIJKBYKA-KRWDZBQOSA-N 0.000 description 1
- ASVUOKGTAIPUBY-YFKPBYRVSA-N (2s)-2-(prop-2-enylamino)propanoic acid Chemical compound OC(=O)[C@H](C)NCC=C ASVUOKGTAIPUBY-YFKPBYRVSA-N 0.000 description 1
- GRJPAUULVKPBHU-QFIPXVFZSA-N (2s)-3-(2-bromophenyl)-2-(9h-fluoren-9-ylmethoxycarbonylamino)propanoic acid Chemical compound C([C@@H](C(=O)O)NC(=O)OCC1C2=CC=CC=C2C2=CC=CC=C21)C1=CC=CC=C1Br GRJPAUULVKPBHU-QFIPXVFZSA-N 0.000 description 1
- XDJSTMCSOXSTGZ-NSHDSACASA-N (2s)-3-(2-bromophenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(C)(C)OC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1Br XDJSTMCSOXSTGZ-NSHDSACASA-N 0.000 description 1
- UYEQBZISDRNPFC-QFIPXVFZSA-N (2s)-3-(3,5-difluorophenyl)-2-(9h-fluoren-9-ylmethoxycarbonylamino)propanoic acid Chemical compound C([C@@H](C(=O)O)NC(=O)OCC1C2=CC=CC=C2C2=CC=CC=C21)C1=CC(F)=CC(F)=C1 UYEQBZISDRNPFC-QFIPXVFZSA-N 0.000 description 1
- CZBNUDVCRKSYDG-NSHDSACASA-N (2s)-3-(3,5-difluorophenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(C)(C)OC(=O)N[C@H](C(O)=O)CC1=CC(F)=CC(F)=C1 CZBNUDVCRKSYDG-NSHDSACASA-N 0.000 description 1
- NDMVQEZKACRLDP-NSHDSACASA-N (2s)-3-(4-aminophenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(C)(C)OC(=O)N[C@H](C(O)=O)CC1=CC=C(N)C=C1 NDMVQEZKACRLDP-NSHDSACASA-N 0.000 description 1
- TVBAVBWXRDHONF-QFIPXVFZSA-N (2s)-3-(4-bromophenyl)-2-(9h-fluoren-9-ylmethoxycarbonylamino)propanoic acid Chemical compound C([C@@H](C(=O)O)NC(=O)OCC1C2=CC=CC=C2C2=CC=CC=C21)C1=CC=C(Br)C=C1 TVBAVBWXRDHONF-QFIPXVFZSA-N 0.000 description 1
- ULNOXUAEIPUJMK-NSHDSACASA-N (2s)-3-(4-bromophenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(C)(C)OC(=O)N[C@H](C(O)=O)CC1=CC=C(Br)C=C1 ULNOXUAEIPUJMK-NSHDSACASA-N 0.000 description 1
- ZKSJJSOHPQQZHC-VWLOTQADSA-N (2s)-3-[4-(9h-fluoren-9-ylmethoxycarbonylamino)phenyl]-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound C1=CC(C[C@H](NC(=O)OC(C)(C)C)C(O)=O)=CC=C1NC(=O)OCC1C2=CC=CC=C2C2=CC=CC=C21 ZKSJJSOHPQQZHC-VWLOTQADSA-N 0.000 description 1
- YHQZWWDVLJPRIF-JLHRHDQISA-N (4R)-4-[[(2S,3R)-2-[acetyl-[(3R,4R,5S,6R)-3-amino-4-[(1R)-1-carboxyethoxy]-5-hydroxy-6-(hydroxymethyl)oxan-2-yl]amino]-3-hydroxybutanoyl]amino]-5-amino-5-oxopentanoic acid Chemical compound C(C)(=O)N([C@@H]([C@H](O)C)C(=O)N[C@H](CCC(=O)O)C(N)=O)C1[C@H](N)[C@@H](O[C@@H](C(=O)O)C)[C@H](O)[C@H](O1)CO YHQZWWDVLJPRIF-JLHRHDQISA-N 0.000 description 1
- ROICYBLUWUMJFF-RDTXWAMCSA-N (6aR,9R)-N,7-dimethyl-N-propan-2-yl-6,6a,8,9-tetrahydro-4H-indolo[4,3-fg]quinoline-9-carboxamide Chemical compound CN(C(=O)[C@H]1CN(C)[C@@H]2CC3=CNC4=CC=CC(C2=C1)=C34)C(C)C ROICYBLUWUMJFF-RDTXWAMCSA-N 0.000 description 1
- 125000004769 (C1-C4) alkylsulfonyl group Chemical group 0.000 description 1
- 125000006527 (C1-C5) alkyl group Chemical group 0.000 description 1
- 125000004169 (C1-C6) alkyl group Chemical group 0.000 description 1
- 125000003161 (C1-C6) alkylene group Chemical group 0.000 description 1
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 1
- IGERFAHWSHDDHX-UHFFFAOYSA-N 1,3-dioxanyl Chemical group [CH]1OCCCO1 IGERFAHWSHDDHX-UHFFFAOYSA-N 0.000 description 1
- JPRPJUMQRZTTED-UHFFFAOYSA-N 1,3-dioxolanyl Chemical group [CH]1OCCO1 JPRPJUMQRZTTED-UHFFFAOYSA-N 0.000 description 1
- ILWJAOPQHOZXAN-UHFFFAOYSA-N 1,3-dithianyl Chemical group [CH]1SCCCS1 ILWJAOPQHOZXAN-UHFFFAOYSA-N 0.000 description 1
- FLOJNXXFMHCMMR-UHFFFAOYSA-N 1,3-dithiolanyl Chemical group [CH]1SCCS1 FLOJNXXFMHCMMR-UHFFFAOYSA-N 0.000 description 1
- SSYLTDCVONDKNS-UHFFFAOYSA-N 1-[(2-methylpropan-2-yl)oxycarbonyl]-3,6-dihydro-2h-pyridine-2-carboxylic acid Chemical compound CC(C)(C)OC(=O)N1CC=CCC1C(O)=O SSYLTDCVONDKNS-UHFFFAOYSA-N 0.000 description 1
- BUXKULRFRATXSI-UHFFFAOYSA-N 1-hydroxypyrrole-2,5-dione Chemical compound ON1C(=O)C=CC1=O BUXKULRFRATXSI-UHFFFAOYSA-N 0.000 description 1
- NFGXHKASABOEEW-UHFFFAOYSA-N 1-methylethyl 11-methoxy-3,7,11-trimethyl-2,4-dodecadienoate Chemical compound COC(C)(C)CCCC(C)CC=CC(C)=CC(=O)OC(C)C NFGXHKASABOEEW-UHFFFAOYSA-N 0.000 description 1
- 125000001637 1-naphthyl group Chemical group [H]C1=C([H])C([H])=C2C(*)=C([H])C([H])=C([H])C2=C1[H] 0.000 description 1
- 125000004214 1-pyrrolidinyl group Chemical group [H]C1([H])N(*)C([H])([H])C([H])([H])C1([H])[H] 0.000 description 1
- 125000001462 1-pyrrolyl group Chemical group [*]N1C([H])=C([H])C([H])=C1[H] 0.000 description 1
- 238000005160 1H NMR spectroscopy Methods 0.000 description 1
- 125000004206 2,2,2-trifluoroethyl group Chemical group [H]C([H])(*)C(F)(F)F 0.000 description 1
- ZSGKIKRNLJANGA-UHFFFAOYSA-N 2-(2-fluorophenyl)-2-[4-[(2-methylpropan-2-yl)oxycarbonyl]piperazin-1-ium-1-yl]acetate Chemical compound C1CN(C(=O)OC(C)(C)C)CCN1C(C(O)=O)C1=CC=CC=C1F ZSGKIKRNLJANGA-UHFFFAOYSA-N 0.000 description 1
- KYPLTDWTMVRRAD-UHFFFAOYSA-N 2-(3,4-dimethoxyphenyl)-2-[4-[(2-methylpropan-2-yl)oxycarbonyl]piperazin-1-ium-1-yl]acetate Chemical compound C1=C(OC)C(OC)=CC=C1C(C(O)=O)N1CCN(C(=O)OC(C)(C)C)CC1 KYPLTDWTMVRRAD-UHFFFAOYSA-N 0.000 description 1
- PPGHGFHJSQSOJP-UHFFFAOYSA-N 2-(3-fluorophenyl)-2-[4-[(2-methylpropan-2-yl)oxycarbonyl]piperazin-1-ium-1-yl]acetate Chemical compound C1CN(C(=O)OC(C)(C)C)CCN1C(C(O)=O)C1=CC=CC(F)=C1 PPGHGFHJSQSOJP-UHFFFAOYSA-N 0.000 description 1
- QPEHPIVVAWESTM-UHFFFAOYSA-N 2-(4-Boc-piperazino)-2-phenylacetic acid Chemical compound C1CN(C(=O)OC(C)(C)C)CCN1C(C(O)=O)C1=CC=CC=C1 QPEHPIVVAWESTM-UHFFFAOYSA-N 0.000 description 1
- RBVUICOGSFFJQN-UHFFFAOYSA-N 2-(4-fluorophenyl)-2-[4-[(2-methylpropan-2-yl)oxycarbonyl]piperazin-1-ium-1-yl]acetate Chemical compound C1CN(C(=O)OC(C)(C)C)CCN1C(C(O)=O)C1=CC=C(F)C=C1 RBVUICOGSFFJQN-UHFFFAOYSA-N 0.000 description 1
- DCFDOKBNIXUWKP-UHFFFAOYSA-N 2-(4-methoxyphenyl)-2-[4-[(2-methylpropan-2-yl)oxycarbonyl]piperazin-1-ium-1-yl]acetate Chemical compound C1=CC(OC)=CC=C1C(C(O)=O)N1CCN(C(=O)OC(C)(C)C)CC1 DCFDOKBNIXUWKP-UHFFFAOYSA-N 0.000 description 1
- UIDQSTVPYKMCEY-UHFFFAOYSA-N 2-[(2,4-dimethoxyphenyl)methyl-(9h-fluoren-9-ylmethoxycarbonyl)amino]acetic acid Chemical compound COC1=CC(OC)=CC=C1CN(CC(O)=O)C(=O)OCC1C2=CC=CC=C2C2=CC=CC=C21 UIDQSTVPYKMCEY-UHFFFAOYSA-N 0.000 description 1
- WZVLJRPOVUCTFZ-UHFFFAOYSA-N 2-[(2-methylpropan-2-yl)oxycarbonylamino]octanedioic acid Chemical compound CC(C)(C)OC(=O)NC(C(O)=O)CCCCCC(O)=O WZVLJRPOVUCTFZ-UHFFFAOYSA-N 0.000 description 1
- QVOPNRRQHPWQMF-UHFFFAOYSA-N 2-[4-[(2-methylpropan-2-yl)oxycarbonyl]morpholin-3-yl]acetic acid Chemical compound CC(C)(C)OC(=O)N1CCOCC1CC(O)=O QVOPNRRQHPWQMF-UHFFFAOYSA-N 0.000 description 1
- IYIQZDBAVIZZOC-UHFFFAOYSA-N 2-[4-[(2-methylpropan-2-yl)oxycarbonyl]piperazin-1-ium-1-yl]-2-[2-(trifluoromethyl)phenyl]acetate Chemical compound C1CN(C(=O)OC(C)(C)C)CCN1C(C(O)=O)C1=CC=CC=C1C(F)(F)F IYIQZDBAVIZZOC-UHFFFAOYSA-N 0.000 description 1
- UOZAIRMXJCRTJN-UHFFFAOYSA-N 2-[4-[(2-methylpropan-2-yl)oxycarbonyl]piperazin-1-ium-1-yl]-2-pyridin-3-ylacetate Chemical compound C1CN(C(=O)OC(C)(C)C)CCN1C(C(O)=O)C1=CC=CN=C1 UOZAIRMXJCRTJN-UHFFFAOYSA-N 0.000 description 1
- SMLJSDLXJRGOKW-UHFFFAOYSA-N 2-[9h-fluoren-9-ylmethoxycarbonyl-[2-[(2-methylpropan-2-yl)oxycarbonylamino]ethyl]amino]acetic acid Chemical compound C1=CC=C2C(COC(=O)N(CC(O)=O)CCNC(=O)OC(C)(C)C)C3=CC=CC=C3C2=C1 SMLJSDLXJRGOKW-UHFFFAOYSA-N 0.000 description 1
- MNAXPVXIHALBEF-UHFFFAOYSA-N 2-[9h-fluoren-9-ylmethoxycarbonyl-[4-[(2-methylpropan-2-yl)oxycarbonylamino]butyl]amino]acetic acid Chemical compound C1=CC=C2C(COC(=O)N(CC(O)=O)CCCCNC(=O)OC(C)(C)C)C3=CC=CC=C3C2=C1 MNAXPVXIHALBEF-UHFFFAOYSA-N 0.000 description 1
- FAZMFLNCRFKVDW-UHFFFAOYSA-N 2-[[(2-methylpropan-2-yl)oxycarbonylamino]methyl]benzoic acid Chemical compound CC(C)(C)OC(=O)NCC1=CC=CC=C1C(O)=O FAZMFLNCRFKVDW-UHFFFAOYSA-N 0.000 description 1
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 1
- VUBCCMLFYBOWSD-UHFFFAOYSA-N 2-amino-2-methylcyclopentane-1-carboxylic acid;hydrochloride Chemical compound Cl.CC1(N)CCCC1C(O)=O VUBCCMLFYBOWSD-UHFFFAOYSA-N 0.000 description 1
- 125000004174 2-benzimidazolyl group Chemical group [H]N1C(*)=NC2=C([H])C([H])=C([H])C([H])=C12 0.000 description 1
- AOYNUTHNTBLRMT-SLPGGIOYSA-N 2-deoxy-2-fluoro-aldehydo-D-glucose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@@H](F)C=O AOYNUTHNTBLRMT-SLPGGIOYSA-N 0.000 description 1
- 125000002941 2-furyl group Chemical group O1C([*])=C([H])C([H])=C1[H] 0.000 description 1
- 125000001622 2-naphthyl group Chemical group [H]C1=C([H])C([H])=C2C([H])=C(*)C([H])=C([H])C2=C1[H] 0.000 description 1
- 125000004105 2-pyridyl group Chemical group N1=C([*])C([H])=C([H])C([H])=C1[H] 0.000 description 1
- 125000000389 2-pyrrolyl group Chemical group [H]N1C([*])=C([H])C([H])=C1[H] 0.000 description 1
- 125000000175 2-thienyl group Chemical group S1C([*])=C([H])C([H])=C1[H] 0.000 description 1
- HVCOBJNICQPDBP-UHFFFAOYSA-N 3-[3-[3,5-dihydroxy-6-methyl-4-(3,4,5-trihydroxy-6-methyloxan-2-yl)oxyoxan-2-yl]oxydecanoyloxy]decanoic acid;hydrate Chemical class O.OC1C(OC(CC(=O)OC(CCCCCCC)CC(O)=O)CCCCCCC)OC(C)C(O)C1OC1C(O)C(O)C(O)C(C)O1 HVCOBJNICQPDBP-UHFFFAOYSA-N 0.000 description 1
- 125000000474 3-butynyl group Chemical group [H]C#CC([H])([H])C([H])([H])* 0.000 description 1
- 125000003682 3-furyl group Chemical group O1C([H])=C([*])C([H])=C1[H] 0.000 description 1
- 125000003349 3-pyridyl group Chemical group N1=C([H])C([*])=C([H])C([H])=C1[H] 0.000 description 1
- 125000001397 3-pyrrolyl group Chemical group [H]N1C([H])=C([*])C([H])=C1[H] 0.000 description 1
- 125000001541 3-thienyl group Chemical group S1C([H])=C([*])C([H])=C1[H] 0.000 description 1
- BWGRDBSNKQABCB-UHFFFAOYSA-N 4,4-difluoro-N-[3-[3-(3-methyl-5-propan-2-yl-1,2,4-triazol-4-yl)-8-azabicyclo[3.2.1]octan-8-yl]-1-thiophen-2-ylpropyl]cyclohexane-1-carboxamide Chemical compound CC(C)C1=NN=C(C)N1C1CC2CCC(C1)N2CCC(NC(=O)C1CCC(F)(F)CC1)C1=CC=CS1 BWGRDBSNKQABCB-UHFFFAOYSA-N 0.000 description 1
- QFVHZQCOUORWEI-UHFFFAOYSA-N 4-[(4-anilino-5-sulfonaphthalen-1-yl)diazenyl]-5-hydroxynaphthalene-2,7-disulfonic acid Chemical compound C=12C(O)=CC(S(O)(=O)=O)=CC2=CC(S(O)(=O)=O)=CC=1N=NC(C1=CC=CC(=C11)S(O)(=O)=O)=CC=C1NC1=CC=CC=C1 QFVHZQCOUORWEI-UHFFFAOYSA-N 0.000 description 1
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 1
- 125000000339 4-pyridyl group Chemical group N1=C([H])C([H])=C([*])C([H])=C1[H] 0.000 description 1
- KDDQRKBRJSGMQE-UHFFFAOYSA-N 4-thiazolyl Chemical group [C]1=CSC=N1 KDDQRKBRJSGMQE-UHFFFAOYSA-N 0.000 description 1
- ZAYHVCMSTBRABG-UHFFFAOYSA-N 5-Methylcytidine Natural products O=C1N=C(N)C(C)=CN1C1C(O)C(O)C(CO)O1 ZAYHVCMSTBRABG-UHFFFAOYSA-N 0.000 description 1
- ZAYHVCMSTBRABG-JXOAFFINSA-N 5-methylcytidine Chemical group O=C1N=C(N)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 ZAYHVCMSTBRABG-JXOAFFINSA-N 0.000 description 1
- CWDWFSXUQODZGW-UHFFFAOYSA-N 5-thiazolyl Chemical group [C]1=CN=CS1 CWDWFSXUQODZGW-UHFFFAOYSA-N 0.000 description 1
- HBAQYPYDRFILMT-UHFFFAOYSA-N 8-[3-(1-cyclopropylpyrazol-4-yl)-1H-pyrazolo[4,3-d]pyrimidin-5-yl]-3-methyl-3,8-diazabicyclo[3.2.1]octan-2-one Chemical class C1(CC1)N1N=CC(=C1)C1=NNC2=C1N=C(N=C2)N1C2C(N(CC1CC2)C)=O HBAQYPYDRFILMT-UHFFFAOYSA-N 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 229920000936 Agarose Polymers 0.000 description 1
- 102000009027 Albumins Human genes 0.000 description 1
- 108010088751 Albumins Proteins 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- GUBGYTABKSRVRQ-XLOQQCSPSA-N Alpha-Lactose Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1O[C@@H]1[C@@H](CO)O[C@H](O)[C@H](O)[C@H]1O GUBGYTABKSRVRQ-XLOQQCSPSA-N 0.000 description 1
- 229920000856 Amylose Polymers 0.000 description 1
- 108010032595 Antibody Binding Sites Proteins 0.000 description 1
- 241001132374 Asta Species 0.000 description 1
- 241000416162 Astragalus gummifer Species 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 1
- 125000004406 C3-C8 cycloalkylene group Chemical group 0.000 description 1
- 108091033409 CRISPR Proteins 0.000 description 1
- OKTJSMMVPCPJKN-NJFSPNSNSA-N Carbon-14 Chemical compound [14C] OKTJSMMVPCPJKN-NJFSPNSNSA-N 0.000 description 1
- 101710163595 Chaperone protein DnaK Proteins 0.000 description 1
- 102000001327 Chemokine CCL5 Human genes 0.000 description 1
- 108010055166 Chemokine CCL5 Proteins 0.000 description 1
- 102000019034 Chemokines Human genes 0.000 description 1
- 108010012236 Chemokines Proteins 0.000 description 1
- KZBUYRJDOAKODT-UHFFFAOYSA-N Chlorine Chemical compound ClCl KZBUYRJDOAKODT-UHFFFAOYSA-N 0.000 description 1
- 102000000989 Complement System Proteins Human genes 0.000 description 1
- 108010069112 Complement System Proteins Proteins 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 241000759568 Corixa Species 0.000 description 1
- 206010011224 Cough Diseases 0.000 description 1
- 241000699802 Cricetulus griseus Species 0.000 description 1
- 102000004127 Cytokines Human genes 0.000 description 1
- 108090000695 Cytokines Proteins 0.000 description 1
- FBPFZTCFMRRESA-KVTDHHQDSA-N D-Mannitol Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-KVTDHHQDSA-N 0.000 description 1
- 238000000018 DNA microarray Methods 0.000 description 1
- 108010053770 Deoxyribonucleases Proteins 0.000 description 1
- 102000016911 Deoxyribonucleases Human genes 0.000 description 1
- 241000702421 Dependoparvovirus Species 0.000 description 1
- 238000005698 Diels-Alder reaction Methods 0.000 description 1
- SHIBSTMRCDJXLN-UHFFFAOYSA-N Digoxigenin Natural products C1CC(C2C(C3(C)CCC(O)CC3CC2)CC2O)(O)C2(C)C1C1=CC(=O)OC1 SHIBSTMRCDJXLN-UHFFFAOYSA-N 0.000 description 1
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 1
- VGGSQFUCUMXWEO-UHFFFAOYSA-N Ethene Chemical compound C=C VGGSQFUCUMXWEO-UHFFFAOYSA-N 0.000 description 1
- 239000005977 Ethylene Substances 0.000 description 1
- 102000009109 Fc receptors Human genes 0.000 description 1
- 108010087819 Fc receptors Proteins 0.000 description 1
- PXGOKWXKJXAPGV-UHFFFAOYSA-N Fluorine Chemical compound FF PXGOKWXKJXAPGV-UHFFFAOYSA-N 0.000 description 1
- 230000005526 G1 to G0 transition Effects 0.000 description 1
- 108020005004 Guide RNA Proteins 0.000 description 1
- 101710178376 Heat shock 70 kDa protein Proteins 0.000 description 1
- 101710152018 Heat shock cognate 70 kDa protein Proteins 0.000 description 1
- 101710113864 Heat shock protein 90 Proteins 0.000 description 1
- 102100034051 Heat shock protein HSP 90-alpha Human genes 0.000 description 1
- 108010004889 Heat-Shock Proteins Proteins 0.000 description 1
- 102000002812 Heat-Shock Proteins Human genes 0.000 description 1
- 108091006054 His-tagged proteins Proteins 0.000 description 1
- 229920002153 Hydroxypropyl cellulose Polymers 0.000 description 1
- 108010054477 Immunoglobulin Fab Fragments Proteins 0.000 description 1
- 102000001706 Immunoglobulin Fab Fragments Human genes 0.000 description 1
- 108010067060 Immunoglobulin Variable Region Proteins 0.000 description 1
- 102000017727 Immunoglobulin Variable Region Human genes 0.000 description 1
- 108010002352 Interleukin-1 Proteins 0.000 description 1
- 108010065805 Interleukin-12 Proteins 0.000 description 1
- 108090000176 Interleukin-13 Proteins 0.000 description 1
- 108010002350 Interleukin-2 Proteins 0.000 description 1
- 108090000978 Interleukin-4 Proteins 0.000 description 1
- 108090001005 Interleukin-6 Proteins 0.000 description 1
- 108010063738 Interleukins Proteins 0.000 description 1
- 102000015696 Interleukins Human genes 0.000 description 1
- AMDBBAQNWSUWGN-UHFFFAOYSA-N Ioversol Chemical compound OCCN(C(=O)CO)C1=C(I)C(C(=O)NCC(O)CO)=C(I)C(C(=O)NCC(O)CO)=C1I AMDBBAQNWSUWGN-UHFFFAOYSA-N 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 239000012097 Lipofectamine 2000 Substances 0.000 description 1
- 239000006142 Luria-Bertani Agar Substances 0.000 description 1
- 229930195725 Mannitol Natural products 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 238000006957 Michael reaction Methods 0.000 description 1
- 108010006519 Molecular Chaperones Proteins 0.000 description 1
- 241001092142 Molina Species 0.000 description 1
- 241000699660 Mus musculus Species 0.000 description 1
- 102000008300 Mutant Proteins Human genes 0.000 description 1
- 108010021466 Mutant Proteins Proteins 0.000 description 1
- 241000238367 Mya arenaria Species 0.000 description 1
- 108700015872 N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine Proteins 0.000 description 1
- CHJJGSNFBQVOTG-UHFFFAOYSA-N N-methyl-guanidine Natural products CNC(N)=N CHJJGSNFBQVOTG-UHFFFAOYSA-N 0.000 description 1
- QJGQUHMNIGDVPM-BJUDXGSMSA-N Nitrogen-13 Chemical compound [13N] QJGQUHMNIGDVPM-BJUDXGSMSA-N 0.000 description 1
- 229910003849 O-Si Inorganic materials 0.000 description 1
- 229910004727 OSO3H Inorganic materials 0.000 description 1
- 229910003872 O—Si Inorganic materials 0.000 description 1
- LYNKVJADAPZJIK-UHFFFAOYSA-H P([O-])([O-])=O.[B+3].P([O-])([O-])=O.P([O-])([O-])=O.[B+3] Chemical compound P([O-])([O-])=O.[B+3].P([O-])([O-])=O.P([O-])([O-])=O.[B+3] LYNKVJADAPZJIK-UHFFFAOYSA-H 0.000 description 1
- 102100035591 POU domain, class 2, transcription factor 2 Human genes 0.000 description 1
- 101710084411 POU domain, class 2, transcription factor 2 Proteins 0.000 description 1
- 241000282376 Panthera tigris Species 0.000 description 1
- 235000019483 Peanut oil Nutrition 0.000 description 1
- 102000057297 Pepsin A Human genes 0.000 description 1
- 108090000284 Pepsin A Proteins 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- 108091093037 Peptide nucleic acid Chemical group 0.000 description 1
- OAICVXFJPJFONN-UHFFFAOYSA-N Phosphorus Chemical compound [P] OAICVXFJPJFONN-UHFFFAOYSA-N 0.000 description 1
- 239000002202 Polyethylene glycol Substances 0.000 description 1
- 108010020346 Polyglutamic Acid Proteins 0.000 description 1
- 108010039918 Polylysine Proteins 0.000 description 1
- 229920001213 Polysorbate 20 Polymers 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- 229930185560 Pseudouridine Chemical group 0.000 description 1
- PTJWIQPHWPFNBW-UHFFFAOYSA-N Pseudouridine C Chemical group OC1C(O)C(CO)OC1C1=CNC(=O)NC1=O PTJWIQPHWPFNBW-UHFFFAOYSA-N 0.000 description 1
- 241001454523 Quillaja saponaria Species 0.000 description 1
- 235000009001 Quillaja saponaria Nutrition 0.000 description 1
- 230000004570 RNA-binding Effects 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 229910007161 Si(CH3)3 Inorganic materials 0.000 description 1
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 1
- 108020004459 Small interfering RNA Proteins 0.000 description 1
- PRXRUNOAOLTIEF-ADSICKODSA-N Sorbitan trioleate Chemical compound CCCCCCCC\C=C/CCCCCCCC(=O)OC[C@@H](OC(=O)CCCCCCC\C=C/CCCCCCCC)[C@H]1OC[C@H](O)[C@H]1OC(=O)CCCCCCC\C=C/CCCCCCCC PRXRUNOAOLTIEF-ADSICKODSA-N 0.000 description 1
- QAOWNCQODCNURD-UHFFFAOYSA-L Sulfate Chemical compound [O-]S([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-L 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 229920001615 Tragacanth Polymers 0.000 description 1
- YZCKVEUIGOORGS-NJFSPNSNSA-N Tritium Chemical compound [3H] YZCKVEUIGOORGS-NJFSPNSNSA-N 0.000 description 1
- 108020005202 Viral DNA Proteins 0.000 description 1
- PNDPGZBMCMUPRI-XXSWNUTMSA-N [125I][125I] Chemical compound [125I][125I] PNDPGZBMCMUPRI-XXSWNUTMSA-N 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- DHKHKXVYLBGOIT-UHFFFAOYSA-N acetaldehyde Diethyl Acetal Natural products CCOC(C)OCC DHKHKXVYLBGOIT-UHFFFAOYSA-N 0.000 description 1
- 150000001241 acetals Chemical class 0.000 description 1
- 150000001266 acyl halides Chemical class 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 229960005305 adenosine Drugs 0.000 description 1
- 238000012382 advanced drug delivery Methods 0.000 description 1
- 150000001299 aldehydes Chemical class 0.000 description 1
- PPQRONHOSHZGFQ-LMVFSUKVSA-N aldehydo-D-ribose 5-phosphate Chemical group OP(=O)(O)OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PPQRONHOSHZGFQ-LMVFSUKVSA-N 0.000 description 1
- 125000004450 alkenylene group Chemical group 0.000 description 1
- 125000004390 alkyl sulfonyl group Chemical group 0.000 description 1
- 125000005237 alkyleneamino group Chemical group 0.000 description 1
- 125000005238 alkylenediamino group Chemical group 0.000 description 1
- 125000005530 alkylenedioxy group Chemical group 0.000 description 1
- 125000005529 alkyleneoxy group Chemical group 0.000 description 1
- RMRFFCXPLWYOOY-UHFFFAOYSA-N allyl radical Chemical class [CH2]C=C RMRFFCXPLWYOOY-UHFFFAOYSA-N 0.000 description 1
- WQZGKKKJIJFFOK-PHYPRBDBSA-N alpha-D-galactose Chemical compound OC[C@H]1O[C@H](O)[C@H](O)[C@@H](O)[C@H]1O WQZGKKKJIJFFOK-PHYPRBDBSA-N 0.000 description 1
- HSFWRNGVRCDJHI-UHFFFAOYSA-N alpha-acetylene Natural products C#C HSFWRNGVRCDJHI-UHFFFAOYSA-N 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- AZDRQVAHHNSJOQ-UHFFFAOYSA-N alumane Chemical class [AlH3] AZDRQVAHHNSJOQ-UHFFFAOYSA-N 0.000 description 1
- WNROFYMDJYEPJX-UHFFFAOYSA-K aluminium hydroxide Chemical compound [OH-].[OH-].[OH-].[Al+3] WNROFYMDJYEPJX-UHFFFAOYSA-K 0.000 description 1
- ILRRQNADMUWWFW-UHFFFAOYSA-K aluminium phosphate Chemical compound O1[Al]2OP1(=O)O2 ILRRQNADMUWWFW-UHFFFAOYSA-K 0.000 description 1
- YVPYQUNUQOZFHG-UHFFFAOYSA-N amidotrizoic acid Chemical compound CC(=O)NC1=C(I)C(NC(C)=O)=C(I)C(C(O)=O)=C1I YVPYQUNUQOZFHG-UHFFFAOYSA-N 0.000 description 1
- 229910021529 ammonia Inorganic materials 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 230000000845 anti-microbial effect Effects 0.000 description 1
- 230000000890 antigenic effect Effects 0.000 description 1
- 239000012736 aqueous medium Substances 0.000 description 1
- PYMYPHUHKUWMLA-WDCZJNDASA-N arabinose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 1
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 1
- 125000003710 aryl alkyl group Chemical group 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 239000012752 auxiliary agent Substances 0.000 description 1
- 125000003725 azepanyl group Chemical group 0.000 description 1
- 125000002393 azetidinyl group Chemical group 0.000 description 1
- 125000004069 aziridinyl group Chemical group 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 125000003785 benzimidazolyl group Chemical group N1=C(NC2=C1C=CC=C2)* 0.000 description 1
- 125000001164 benzothiazolyl group Chemical group S1C(=NC2=C1C=CC=C2)* 0.000 description 1
- 125000004196 benzothienyl group Chemical group S1C(=CC2=C1C=CC=C2)* 0.000 description 1
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 1
- WQZGKKKJIJFFOK-VFUOTHLCSA-N beta-D-glucose Chemical compound OC[C@H]1O[C@@H](O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-VFUOTHLCSA-N 0.000 description 1
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 1
- WGDUUQDYDIIBKT-UHFFFAOYSA-N beta-Pseudouridine Chemical group OC1OC(CN2C=CC(=O)NC2=O)C(O)C1O WGDUUQDYDIIBKT-UHFFFAOYSA-N 0.000 description 1
- GPRLTFBKWDERLU-UHFFFAOYSA-N bicyclo[2.2.2]octane Chemical compound C1CC2CCC1CC2 GPRLTFBKWDERLU-UHFFFAOYSA-N 0.000 description 1
- GNTFBMAGLFYMMZ-UHFFFAOYSA-N bicyclo[3.2.2]nonane Chemical compound C1CC2CCC1CCC2 GNTFBMAGLFYMMZ-UHFFFAOYSA-N 0.000 description 1
- WNTGVOIBBXFMLR-UHFFFAOYSA-N bicyclo[3.3.1]nonane Chemical compound C1CCC2CCCC1C2 WNTGVOIBBXFMLR-UHFFFAOYSA-N 0.000 description 1
- KVLCIHRZDOKRLK-UHFFFAOYSA-N bicyclo[4.2.1]nonane Chemical compound C1C2CCC1CCCC2 KVLCIHRZDOKRLK-UHFFFAOYSA-N 0.000 description 1
- 239000011230 binding agent Substances 0.000 description 1
- 229960000074 biopharmaceutical Drugs 0.000 description 1
- 229940126587 biotherapeutics Drugs 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 125000000319 biphenyl-4-yl group Chemical group [H]C1=C([H])C([H])=C([H])C([H])=C1C1=C([H])C([H])=C([*])C([H])=C1[H] 0.000 description 1
- OWTGPXDXLMNQKK-NSHDSACASA-N boc-3-nitro-l-phenylalanine Chemical compound CC(C)(C)OC(=O)N[C@H](C(O)=O)CC1=CC=CC([N+]([O-])=O)=C1 OWTGPXDXLMNQKK-NSHDSACASA-N 0.000 description 1
- 239000006189 buccal tablet Substances 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- BPKIGYQJPYCAOW-FFJTTWKXSA-I calcium;potassium;disodium;(2s)-2-hydroxypropanoate;dichloride;dihydroxide;hydrate Chemical compound O.[OH-].[OH-].[Na+].[Na+].[Cl-].[Cl-].[K+].[Ca+2].C[C@H](O)C([O-])=O BPKIGYQJPYCAOW-FFJTTWKXSA-I 0.000 description 1
- 229940041514 candida albicans extract Drugs 0.000 description 1
- CREMABGTGYGIQB-UHFFFAOYSA-N carbon carbon Chemical compound C.C CREMABGTGYGIQB-UHFFFAOYSA-N 0.000 description 1
- OKTJSMMVPCPJKN-BJUDXGSMSA-N carbon-11 Chemical compound [11C] OKTJSMMVPCPJKN-BJUDXGSMSA-N 0.000 description 1
- 150000007942 carboxylates Chemical class 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 239000008004 cell lysis buffer Substances 0.000 description 1
- 238000009614 chemical analysis method Methods 0.000 description 1
- 239000007795 chemical reaction product Substances 0.000 description 1
- 210000004978 chinese hamster ovary cell Anatomy 0.000 description 1
- 239000012069 chiral reagent Substances 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 238000011097 chromatography purification Methods 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 239000003086 colorant Substances 0.000 description 1
- 238000004040 coloring Methods 0.000 description 1
- 238000012875 competitive assay Methods 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 125000000640 cyclooctyl group Chemical group [H]C1([H])C([H])([H])C([H])([H])C([H])([H])C([H])(*)C([H])([H])C([H])([H])C1([H])[H] 0.000 description 1
- 125000004652 decahydroisoquinolinyl group Chemical group C1(NCCC2CCCCC12)* 0.000 description 1
- 125000004856 decahydroquinolinyl group Chemical group N1(CCCC2CCCCC12)* 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 239000000412 dendrimer Substances 0.000 description 1
- 229920000736 dendritic polymer Polymers 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- ANCLJVISBRWUTR-UHFFFAOYSA-N diaminophosphinic acid Chemical compound NP(N)(O)=O ANCLJVISBRWUTR-UHFFFAOYSA-N 0.000 description 1
- 229960005423 diatrizoate Drugs 0.000 description 1
- 125000005959 diazepanyl group Chemical group 0.000 description 1
- XBPCUCUWBYBCDP-UHFFFAOYSA-O dicyclohexylazanium Chemical compound C1CCCCC1[NH2+]C1CCCCC1 XBPCUCUWBYBCDP-UHFFFAOYSA-O 0.000 description 1
- 235000005911 diet Nutrition 0.000 description 1
- 230000037213 diet Effects 0.000 description 1
- 125000001028 difluoromethyl group Chemical group [H]C(F)(F)* 0.000 description 1
- QONQRTHLHBTMGP-UHFFFAOYSA-N digitoxigenin Natural products CC12CCC(C3(CCC(O)CC3CC3)C)C3C11OC1CC2C1=CC(=O)OC1 QONQRTHLHBTMGP-UHFFFAOYSA-N 0.000 description 1
- SHIBSTMRCDJXLN-KCZCNTNESA-N digoxigenin Chemical compound C1([C@@H]2[C@@]3([C@@](CC2)(O)[C@H]2[C@@H]([C@@]4(C)CC[C@H](O)C[C@H]4CC2)C[C@H]3O)C)=CC(=O)OC1 SHIBSTMRCDJXLN-KCZCNTNESA-N 0.000 description 1
- 239000003085 diluting agent Substances 0.000 description 1
- SWSQBOPZIKWTGO-UHFFFAOYSA-N dimethylaminoamidine Natural products CN(C)C(N)=N SWSQBOPZIKWTGO-UHFFFAOYSA-N 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-K dioxido-sulfanylidene-sulfido-$l^{5}-phosphane Chemical compound [O-]P([O-])([S-])=S NAGJZTKCGNOGPW-UHFFFAOYSA-K 0.000 description 1
- 229940042399 direct acting antivirals protease inhibitors Drugs 0.000 description 1
- 239000007884 disintegrant Substances 0.000 description 1
- 239000002612 dispersion medium Substances 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 238000009509 drug development Methods 0.000 description 1
- 239000003596 drug target Substances 0.000 description 1
- 239000012636 effector Substances 0.000 description 1
- 238000007336 electrophilic substitution reaction Methods 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 239000003995 emulsifying agent Substances 0.000 description 1
- 102000027412 enzyme-linked receptors Human genes 0.000 description 1
- 108091008592 enzyme-linked receptors Proteins 0.000 description 1
- 235000019441 ethanol Nutrition 0.000 description 1
- 125000001495 ethyl group Chemical group [H]C([H])([H])C([H])([H])* 0.000 description 1
- 125000002534 ethynyl group Chemical group [H]C#C* 0.000 description 1
- 239000000945 filler Substances 0.000 description 1
- 239000000796 flavoring agent Substances 0.000 description 1
- 235000019634 flavors Nutrition 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 238000000799 fluorescence microscopy Methods 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 125000004216 fluoromethyl group Chemical group [H]C([H])(F)* 0.000 description 1
- UQSQSQZYBQSBJZ-UHFFFAOYSA-M fluorosulfate group Chemical group S(=O)(=O)([O-])F UQSQSQZYBQSBJZ-UHFFFAOYSA-M 0.000 description 1
- 235000013305 food Nutrition 0.000 description 1
- 235000003599 food sweetener Nutrition 0.000 description 1
- 229960005102 foscarnet Drugs 0.000 description 1
- 238000004108 freeze drying Methods 0.000 description 1
- 125000002541 furyl group Chemical group 0.000 description 1
- 229930182830 galactose Natural products 0.000 description 1
- 230000005251 gamma ray Effects 0.000 description 1
- 239000000499 gel Substances 0.000 description 1
- 239000008273 gelatin Substances 0.000 description 1
- 239000007903 gelatin capsule Substances 0.000 description 1
- 229930195712 glutamate Natural products 0.000 description 1
- 239000008187 granular material Substances 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 125000004366 heterocycloalkenyl group Chemical group 0.000 description 1
- 238000004128 high performance liquid chromatography Methods 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 108091008039 hormone receptors Proteins 0.000 description 1
- 150000002430 hydrocarbons Chemical group 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-M hydroxide Chemical compound [OH-] XLYOFNOQVPJJNP-UHFFFAOYSA-M 0.000 description 1
- 239000001863 hydroxypropyl cellulose Substances 0.000 description 1
- 235000010977 hydroxypropyl cellulose Nutrition 0.000 description 1
- 239000005457 ice water Substances 0.000 description 1
- 239000012216 imaging agent Substances 0.000 description 1
- 125000002632 imidazolidinyl group Chemical group 0.000 description 1
- 125000002636 imidazolinyl group Chemical group 0.000 description 1
- 125000002883 imidazolyl group Chemical group 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000001571 immunoadjuvant effect Effects 0.000 description 1
- 230000000984 immunochemical effect Effects 0.000 description 1
- 239000000568 immunological adjuvant Substances 0.000 description 1
- 239000002955 immunomodulating agent Substances 0.000 description 1
- 229940121354 immunomodulator Drugs 0.000 description 1
- 230000002584 immunomodulator Effects 0.000 description 1
- 229960001438 immunostimulant agent Drugs 0.000 description 1
- 230000003308 immunostimulating effect Effects 0.000 description 1
- 230000002637 immunotoxin Effects 0.000 description 1
- 239000002596 immunotoxin Substances 0.000 description 1
- 229940051026 immunotoxin Drugs 0.000 description 1
- 231100000608 immunotoxin Toxicity 0.000 description 1
- 125000004246 indolin-2-yl group Chemical group [H]N1C(*)=C([H])C2=C([H])C([H])=C([H])C([H])=C12 0.000 description 1
- 125000001041 indolyl group Chemical group 0.000 description 1
- 239000003701 inert diluent Substances 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 229940047122 interleukins Drugs 0.000 description 1
- 229940044173 iodine-125 Drugs 0.000 description 1
- 229960004359 iodixanol Drugs 0.000 description 1
- NBQNWMBBSKPBAY-UHFFFAOYSA-N iodixanol Chemical compound IC=1C(C(=O)NCC(O)CO)=C(I)C(C(=O)NCC(O)CO)=C(I)C=1N(C(=O)C)CC(O)CN(C(C)=O)C1=C(I)C(C(=O)NCC(O)CO)=C(I)C(C(=O)NCC(O)CO)=C1I NBQNWMBBSKPBAY-UHFFFAOYSA-N 0.000 description 1
- 229960001025 iohexol Drugs 0.000 description 1
- NTHXOOBQLCIOLC-UHFFFAOYSA-N iohexol Chemical compound OCC(O)CN(C(=O)C)C1=C(I)C(C(=O)NCC(O)CO)=C(I)C(C(=O)NCC(O)CO)=C1I NTHXOOBQLCIOLC-UHFFFAOYSA-N 0.000 description 1
- 102000027415 ion channel-linked receptors Human genes 0.000 description 1
- 108091008593 ion channel-linked receptors Proteins 0.000 description 1
- 229960004647 iopamidol Drugs 0.000 description 1
- XQZXYNRDCRIARQ-LURJTMIESA-N iopamidol Chemical compound C[C@H](O)C(=O)NC1=C(I)C(C(=O)NC(CO)CO)=C(I)C(C(=O)NC(CO)CO)=C1I XQZXYNRDCRIARQ-LURJTMIESA-N 0.000 description 1
- 229960002603 iopromide Drugs 0.000 description 1
- DGAIEPBNLOQYER-UHFFFAOYSA-N iopromide Chemical compound COCC(=O)NC1=C(I)C(C(=O)NCC(O)CO)=C(I)C(C(=O)N(C)CC(O)CO)=C1I DGAIEPBNLOQYER-UHFFFAOYSA-N 0.000 description 1
- 229960004537 ioversol Drugs 0.000 description 1
- 229940029407 ioxaglate Drugs 0.000 description 1
- TYYBFXNZMFNZJT-UHFFFAOYSA-N ioxaglic acid Chemical compound CNC(=O)C1=C(I)C(N(C)C(C)=O)=C(I)C(C(=O)NCC(=O)NC=2C(=C(C(=O)NCCO)C(I)=C(C(O)=O)C=2I)I)=C1I TYYBFXNZMFNZJT-UHFFFAOYSA-N 0.000 description 1
- 229960002611 ioxilan Drugs 0.000 description 1
- UUMLTINZBQPNGF-UHFFFAOYSA-N ioxilan Chemical compound OCC(O)CN(C(=O)C)C1=C(I)C(C(=O)NCCO)=C(I)C(C(=O)NCC(O)CO)=C1I UUMLTINZBQPNGF-UHFFFAOYSA-N 0.000 description 1
- 125000000959 isobutyl group Chemical group [H]C([H])([H])C([H])(C([H])([H])[H])C([H])([H])* 0.000 description 1
- 125000000904 isoindolyl group Chemical group C=1(NC=C2C=CC=CC12)* 0.000 description 1
- 125000000741 isoleucyl group Chemical group [H]N([H])C(C(C([H])([H])[H])C([H])([H])C([H])([H])[H])C(=O)O* 0.000 description 1
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 125000005956 isoquinolyl group Chemical group 0.000 description 1
- 125000004628 isothiazolidinyl group Chemical group S1N(CCC1)* 0.000 description 1
- 125000005969 isothiazolinyl group Chemical group 0.000 description 1
- 230000000155 isotopic effect Effects 0.000 description 1
- 125000003965 isoxazolidinyl group Chemical group 0.000 description 1
- 125000003971 isoxazolinyl group Chemical group 0.000 description 1
- 150000002576 ketones Chemical class 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 229910052747 lanthanoid Inorganic materials 0.000 description 1
- ZADHKSJXSZBQFB-HHHXNRCGSA-N lipid fragment Chemical compound CC(C)CCCCCCCCCCCC[C@@H](C)CCCCCCCC(C)C ZADHKSJXSZBQFB-HHHXNRCGSA-N 0.000 description 1
- 239000002502 liposome Substances 0.000 description 1
- 239000012160 loading buffer Substances 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 239000007937 lozenge Substances 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 210000002540 macrophage Anatomy 0.000 description 1
- ZLNQQNXFFQJAID-UHFFFAOYSA-L magnesium carbonate Chemical compound [Mg+2].[O-]C([O-])=O ZLNQQNXFFQJAID-UHFFFAOYSA-L 0.000 description 1
- 239000001095 magnesium carbonate Substances 0.000 description 1
- 229910000021 magnesium carbonate Inorganic materials 0.000 description 1
- 235000019359 magnesium stearate Nutrition 0.000 description 1
- 239000000594 mannitol Substances 0.000 description 1
- 235000010355 mannitol Nutrition 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 108020004084 membrane receptors Proteins 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 125000001570 methylene group Chemical group [H]C([H])([*:1])[*:2] 0.000 description 1
- YACKEPLHDIMKIO-UHFFFAOYSA-N methylphosphonic acid Chemical compound CP(O)(O)=O YACKEPLHDIMKIO-UHFFFAOYSA-N 0.000 description 1
- 229960004712 metrizoic acid Drugs 0.000 description 1
- GGGDNPWHMNJRFN-UHFFFAOYSA-N metrizoic acid Chemical compound CC(=O)N(C)C1=C(I)C(NC(C)=O)=C(I)C(C(O)=O)=C1I GGGDNPWHMNJRFN-UHFFFAOYSA-N 0.000 description 1
- 108091070501 miRNA Proteins 0.000 description 1
- 239000002679 microRNA Substances 0.000 description 1
- 244000005700 microbiome Species 0.000 description 1
- 239000004005 microsphere Substances 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 125000006682 monohaloalkyl group Chemical group 0.000 description 1
- 125000004572 morpholin-3-yl group Chemical group N1C(COCC1)* 0.000 description 1
- 125000002757 morpholinyl group Chemical group 0.000 description 1
- 125000001446 muramyl group Chemical group N[C@@H](C=O)[C@@H](O[C@@H](C(=O)*)C)[C@H](O)[C@H](O)CO 0.000 description 1
- 238000002703 mutagenesis Methods 0.000 description 1
- 231100000350 mutagenesis Toxicity 0.000 description 1
- 125000004108 n-butyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- 125000003136 n-heptyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- 125000001280 n-hexyl group Chemical group C(CCCCC)* 0.000 description 1
- 125000000740 n-pentyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- 125000004123 n-propyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- 229940031182 nanoparticles iron oxide Drugs 0.000 description 1
- 125000001624 naphthyl group Chemical group 0.000 description 1
- 229940037525 nasal preparations Drugs 0.000 description 1
- 229960005419 nitrogen Drugs 0.000 description 1
- QJGQUHMNIGDVPM-UHFFFAOYSA-N nitrogen group Chemical group [N] QJGQUHMNIGDVPM-UHFFFAOYSA-N 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 230000000269 nucleophilic effect Effects 0.000 description 1
- 238000010534 nucleophilic substitution reaction Methods 0.000 description 1
- 239000007764 o/w emulsion Substances 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 239000008203 oral pharmaceutical composition Substances 0.000 description 1
- 230000003204 osmotic effect Effects 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 125000005963 oxadiazolidinyl group Chemical group 0.000 description 1
- 125000005882 oxadiazolinyl group Chemical group 0.000 description 1
- 125000000160 oxazolidinyl group Chemical group 0.000 description 1
- 125000005968 oxazolinyl group Chemical group 0.000 description 1
- QVGXLLKOCUKJST-BJUDXGSMSA-N oxygen-15 atom Chemical compound [15O] QVGXLLKOCUKJST-BJUDXGSMSA-N 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 238000007911 parenteral administration Methods 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 235000010603 pastilles Nutrition 0.000 description 1
- 239000000312 peanut oil Substances 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 230000035515 penetration Effects 0.000 description 1
- 229940111202 pepsin Drugs 0.000 description 1
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 1
- 229960004624 perflexane Drugs 0.000 description 1
- 229940124531 pharmaceutical excipient Drugs 0.000 description 1
- 239000012071 phase Substances 0.000 description 1
- PTMHPRAIXMAOOB-UHFFFAOYSA-L phosphoramidate Chemical compound NP([O-])([O-])=O PTMHPRAIXMAOOB-UHFFFAOYSA-L 0.000 description 1
- 239000011574 phosphorus Substances 0.000 description 1
- 125000005642 phosphothioate group Chemical group 0.000 description 1
- 230000004962 physiological condition Effects 0.000 description 1
- 125000004193 piperazinyl group Chemical group 0.000 description 1
- 125000000587 piperidin-1-yl group Chemical group [H]C1([H])N(*)C([H])([H])C([H])([H])C([H])([H])C1([H])[H] 0.000 description 1
- 125000004483 piperidin-3-yl group Chemical group N1CC(CCC1)* 0.000 description 1
- 125000003386 piperidinyl group Chemical group 0.000 description 1
- 210000004180 plasmocyte Anatomy 0.000 description 1
- 229920001983 poloxamer Polymers 0.000 description 1
- 238000002264 polyacrylamide gel electrophoresis Methods 0.000 description 1
- 229920002643 polyglutamic acid Polymers 0.000 description 1
- 125000006684 polyhaloalkyl group Polymers 0.000 description 1
- 229920000656 polylysine Polymers 0.000 description 1
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 1
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 1
- 229920006316 polyvinylpyrrolidine Polymers 0.000 description 1
- 239000013641 positive control Substances 0.000 description 1
- 230000002335 preservative effect Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 239000000047 product Substances 0.000 description 1
- 230000000069 prophylactic effect Effects 0.000 description 1
- QLNJFJADRCOGBJ-UHFFFAOYSA-N propionamide Chemical compound CCC(N)=O QLNJFJADRCOGBJ-UHFFFAOYSA-N 0.000 description 1
- QQONPFPTGQHPMA-UHFFFAOYSA-N propylene Natural products CC=C QQONPFPTGQHPMA-UHFFFAOYSA-N 0.000 description 1
- 125000004805 propylene group Chemical group [H]C([H])([H])C([H])([*:1])C([H])([H])[*:2] 0.000 description 1
- 235000019833 protease Nutrition 0.000 description 1
- 230000006916 protein interaction Effects 0.000 description 1
- 230000009145 protein modification Effects 0.000 description 1
- 238000001243 protein synthesis Methods 0.000 description 1
- PTJWIQPHWPFNBW-GBNDHIKLSA-N pseudouridine Chemical group O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1C1=CNC(=O)NC1=O PTJWIQPHWPFNBW-GBNDHIKLSA-N 0.000 description 1
- 125000004309 pyranyl group Chemical group O1C(C=CC=C1)* 0.000 description 1
- 125000003072 pyrazolidinyl group Chemical group 0.000 description 1
- 125000002755 pyrazolinyl group Chemical group 0.000 description 1
- 125000003226 pyrazolyl group Chemical group 0.000 description 1
- 125000004076 pyridyl group Chemical group 0.000 description 1
- 125000000719 pyrrolidinyl group Chemical group 0.000 description 1
- 125000001422 pyrrolinyl group Chemical group 0.000 description 1
- 125000005493 quinolyl group Chemical group 0.000 description 1
- 125000001567 quinoxalinyl group Chemical group N1=C(C=NC2=CC=CC=C12)* 0.000 description 1
- 230000007115 recruitment Effects 0.000 description 1
- 238000012827 research and development Methods 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 150000003290 ribose derivatives Chemical group 0.000 description 1
- CVHZOJJKTDOEJC-UHFFFAOYSA-N saccharin Chemical compound C1=CC=C2C(=O)NS(=O)(=O)C2=C1 CVHZOJJKTDOEJC-UHFFFAOYSA-N 0.000 description 1
- 239000012266 salt solution Substances 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 229930195734 saturated hydrocarbon Natural products 0.000 description 1
- 238000006748 scratching Methods 0.000 description 1
- 230000002393 scratching effect Effects 0.000 description 1
- 125000002914 sec-butyl group Chemical group [H]C([H])([H])C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 230000035939 shock Effects 0.000 description 1
- 230000011664 signaling Effects 0.000 description 1
- 239000010703 silicon Substances 0.000 description 1
- 239000000377 silicon dioxide Substances 0.000 description 1
- 239000002002 slurry Substances 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 230000001954 sterilising effect Effects 0.000 description 1
- 238000004659 sterilization and disinfection Methods 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- OBTWBSRJZRCYQV-UHFFFAOYSA-N sulfuryl difluoride Chemical group FS(F)(=O)=O OBTWBSRJZRCYQV-UHFFFAOYSA-N 0.000 description 1
- 239000004094 surface-active agent Substances 0.000 description 1
- 238000013268 sustained release Methods 0.000 description 1
- 239000012730 sustained-release form Substances 0.000 description 1
- 239000003765 sweetening agent Substances 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 125000000999 tert-butyl group Chemical group [H]C([H])([H])C(*)(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- 125000004192 tetrahydrofuran-2-yl group Chemical group [H]C1([H])OC([H])(*)C([H])([H])C1([H])[H] 0.000 description 1
- 125000003718 tetrahydrofuranyl group Chemical group 0.000 description 1
- 125000005958 tetrahydrothienyl group Chemical group 0.000 description 1
- 230000004797 therapeutic response Effects 0.000 description 1
- 125000005304 thiadiazolidinyl group Chemical group 0.000 description 1
- 125000005305 thiadiazolinyl group Chemical group 0.000 description 1
- 125000001984 thiazolidinyl group Chemical group 0.000 description 1
- 125000002769 thiazolinyl group Chemical group 0.000 description 1
- 125000001544 thienyl group Chemical group 0.000 description 1
- 125000005309 thioalkoxy group Chemical group 0.000 description 1
- 125000003396 thiol group Chemical group [H]S* 0.000 description 1
- 125000004568 thiomorpholinyl group Chemical group 0.000 description 1
- ZCUFMDLYAMJYST-UHFFFAOYSA-N thorium dioxide Chemical compound O=[Th]=O ZCUFMDLYAMJYST-UHFFFAOYSA-N 0.000 description 1
- 229940104230 thymidine Drugs 0.000 description 1
- 230000036962 time dependent Effects 0.000 description 1
- 231100000419 toxicity Toxicity 0.000 description 1
- 230000001988 toxicity Effects 0.000 description 1
- 230000002110 toxicologic effect Effects 0.000 description 1
- 231100000759 toxicological effect Toxicity 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 231100000765 toxin Toxicity 0.000 description 1
- 239000000196 tragacanth Substances 0.000 description 1
- 235000010487 tragacanth Nutrition 0.000 description 1
- 229940116362 tragacanth Drugs 0.000 description 1
- 230000002463 transducing effect Effects 0.000 description 1
- 238000011830 transgenic mouse model Methods 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 238000011282 treatment Methods 0.000 description 1
- 238000011269 treatment regimen Methods 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- 125000005455 trithianyl group Chemical group 0.000 description 1
- 229910052722 tritium Inorganic materials 0.000 description 1
- 239000012137 tryptone Substances 0.000 description 1
- 238000004704 ultra performance liquid chromatography Methods 0.000 description 1
- 241000701161 unidentified adenovirus Species 0.000 description 1
- 241001430294 unidentified retrovirus Species 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 238000002255 vaccination Methods 0.000 description 1
- 238000001291 vacuum drying Methods 0.000 description 1
- 125000002987 valine group Chemical group [H]N([H])C([H])(C(*)=O)C([H])(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- 125000000391 vinyl group Chemical group [H]C([*])=C([H])[H] 0.000 description 1
- 229920002554 vinyl polymer Polymers 0.000 description 1
- 238000011179 visual inspection Methods 0.000 description 1
- 235000012431 wafers Nutrition 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 239000000080 wetting agent Substances 0.000 description 1
- 239000012138 yeast extract Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07D—HETEROCYCLIC COMPOUNDS
- C07D207/00—Heterocyclic compounds containing five-membered rings not condensed with other rings, with one nitrogen atom as the only ring hetero atom
- C07D207/02—Heterocyclic compounds containing five-membered rings not condensed with other rings, with one nitrogen atom as the only ring hetero atom with only hydrogen or carbon atoms directly attached to the ring nitrogen atom
- C07D207/30—Heterocyclic compounds containing five-membered rings not condensed with other rings, with one nitrogen atom as the only ring hetero atom with only hydrogen or carbon atoms directly attached to the ring nitrogen atom having two double bonds between ring members or between ring members and non-ring members
- C07D207/34—Heterocyclic compounds containing five-membered rings not condensed with other rings, with one nitrogen atom as the only ring hetero atom with only hydrogen or carbon atoms directly attached to the ring nitrogen atom having two double bonds between ring members or between ring members and non-ring members with hetero atoms or with carbon atoms having three bonds to hetero atoms with at the most one bond to halogen, e.g. ester or nitrile radicals, directly attached to ring carbon atoms
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07D—HETEROCYCLIC COMPOUNDS
- C07D307/00—Heterocyclic compounds containing five-membered rings having one oxygen atom as the only ring hetero atom
- C07D307/02—Heterocyclic compounds containing five-membered rings having one oxygen atom as the only ring hetero atom not condensed with other rings
- C07D307/34—Heterocyclic compounds containing five-membered rings having one oxygen atom as the only ring hetero atom not condensed with other rings having two or three double bonds between ring members or between ring members and non-ring members
- C07D307/56—Heterocyclic compounds containing five-membered rings having one oxygen atom as the only ring hetero atom not condensed with other rings having two or three double bonds between ring members or between ring members and non-ring members with hetero atoms or with carbon atoms having three bonds to hetero atoms with at the most one bond to halogen, e.g. ester or nitrile radicals, directly attached to ring carbon atoms
- C07D307/68—Carbon atoms having three bonds to hetero atoms with at the most one bond to halogen
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07D—HETEROCYCLIC COMPOUNDS
- C07D333/00—Heterocyclic compounds containing five-membered rings having one sulfur atom as the only ring hetero atom
- C07D333/02—Heterocyclic compounds containing five-membered rings having one sulfur atom as the only ring hetero atom not condensed with other rings
- C07D333/04—Heterocyclic compounds containing five-membered rings having one sulfur atom as the only ring hetero atom not condensed with other rings not substituted on the ring sulphur atom
- C07D333/26—Heterocyclic compounds containing five-membered rings having one sulfur atom as the only ring hetero atom not condensed with other rings not substituted on the ring sulphur atom with hetero atoms or with carbon atoms having three bonds to hetero atoms with at the most one bond to halogen, e.g. ester or nitrile radicals, directly attached to ring carbon atoms
- C07D333/38—Carbon atoms having three bonds to hetero atoms with at the most one bond to halogen, e.g. ester or nitrile radicals
Definitions
- a protein comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II): wherein the substituents are as defined herein.
- FIGS. 1A-1C show the synthetic scheme and data demonstrating that SFK is a latent bioreactive unnatural amino acid (Uaa) for protein-protein cross-linking.
- FIG. 1A synthetic scheme for SFK.
- FIG. 1C SDS-PAGE analysis of cross-linking between Afb7X and MBP-Z(24SFK).
- FIGS. 2A-2G show embodiments of the compounds described herein.
- R can be R 1 as defined herein.
- X, Y1, Y2, Z1, and Z2 can be O, N, S, and C.
- FIG. 3 provides the synthetic scheme to prepare the compound shown in FIG. 2E.
- FIGS. 4A-4B are fluorescence microscopy images of HeLa-GFP-182TAG reporter cells grown in the absence of SFK (FIG. 4A) or in the presence of SFK (FIG. 4B).
- the bar at the bottom right-hand corner represents the scale of 51 microns.
- FIGS. 5A-5F are SDS-PAGE analyses of MBP-Z(24FSK) incubation with an affibody.
- MBP-Z(24FSK) incubation with Affibody(7H) (FIG. 5A), Affibody(7K) (FIG. 5B), and Affibody(7Y) (FIG. 5C) show no cross-linking.
- MBP-Z(24SFK) incubation with Affibody(7H) (FIG. 5D), Affibody(7K) (FIG. 5E), and Affibody(7Y) (FIG. 5F) show time-dependent crosslinking.
- FIGS. 6A-6B are Western blot analyses of Spike RBD incubation with mNb6.
- FIG. 6A is a Western blot analysis of Spike RBD(E484K) incubation with mNb6 with FSK incorporated at indicated sites 50-59.
- FIG. 6B is a Western blot analysis of Spike RBD(E484K) incubation with mNb6 with SFK incorporated at indicated sites 50-59.
- TSK and “fluorosulfonyloxybenzoyl-L-lysine” or “FSK” refer to the compound having the structure:
- antibody is used according to its commonly known meaning in the art. Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond.
- F(ab)'2 is used interchangeably with “Fab dimer.”
- the F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)'2 dimer into an Fab' monomer.
- the Fab' monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)).
- the term “Fab' monomer” is used interchangeably with “Fab” and “or an antigen-binding fragment.” While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by using recombinant DNA methodology.
- the term antibody as used herein, also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries.
- Antibodies are large, complex proteins with an intricate internal structure.
- a natural antibody molecule contains two identical pairs of polypeptide chains, each pair having one light chain and one heavy chain.
- Each light chain and heavy chain in turn consists of two regions: a variable (“V”) region involved in binding the target antigen, and a constant (“C”) region that interacts with other components of the immune system.
- the light and heavy chain variable regions come together in 3-dimensional space to form a variable region that binds the antigen (for example, a receptor on the surface of a cell).
- Within each light or heavy chain variable region there are three short segments (averaging 10 amino acids in length) called the complementarity determining regions (“CDRs”).
- the six CDRs in an antibody variable domain fold up together in 3-dimensional space to form the actual antibody binding site which docks onto the target antigen.
- the position and length of the CDRs have been precisely defined by Kabat et al., Sequences of Proteins of Immunological Interest, U.S. Department of Health and Human Services, 1987.
- the part of a variable region not contained in the CDRs is called the framework (“FR”), which forms the environment for the CDRs.
- An exemplary immunoglobulin (antibody) structural unit comprises a tetramer.
- Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” and one “heavy” chain.
- the N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition.
- the terms variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains respectively.
- the Fc i.e., fragment crystallizable region
- the Fc region is the “base” or “tail” of an immunoglobulin and is typically composed of two heavy chains that contribute two or three constant domains depending on the class of the antibody. By binding to specific proteins the Fc region ensures that each antibody generates an appropriate immune response for a given antigen.
- the Fc region also binds to various cell receptors, such as Fc receptors, and other immune molecules, such as complement proteins.
- an “antibody variant” as provided herein refers to a polypeptide capable of binding to a receptor protein or an antigen and including one or more structural domains of an antibody or fragment thereof.
- antibody variants include single-domain antibodies (nanobodies), affibodies (polypeptides smaller than monoclonal antibodies and capable of binding receptor proteins or antigens with high affinity and imitating monoclonal antibodies), antigen-binding fragments (Fab), Fab dimers (monospecific Fab2, bispecific Fab2), trispecific Fabs, monovalent IgGs, single-chain variable fragments (scFv), bispecific diabodies, trispecific triabodies, scFv-Fc, and minibodies.
- a “peptibody” as provided herein refers to a peptide moiety attached (through a covalent or non-covalent linker) to the Fc domain of an antibody.
- a “single-domain antibody” or “nanobody” refers to an antibody fragment having a single monomeric variable antibody domain. Like a whole antibody, it is able to bind selectively to a specific antigen.
- the single domain antibody is a human or humanized single-domain antibody.
- a single-chain variable fragment is typically a fusion protein of the variable regions of the heavy (VH) and light chains (VL) of immunoglobulins, connected with a short linker peptide of 10 to about 25 amino acids.
- the linker is usually rich in glycine for flexibility, as well as serine or threonine for solubility.
- the linker can either connect the N-terminus of the VH with the C-terminus of the VL, or vice versa.
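The scFv layout described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names `build_scfv` and `linker_composition` are hypothetical, the default linker is the commonly used (GGGGS)x3 repeat rather than anything specified in this document, the variable-region strings are placeholders rather than real sequences, and the strict 10 to 25 residue check is a simplification of "10 to about 25 amino acids".

```python
# Hypothetical helper names; not taken from the patent text.
DEFAULT_LINKER = "GGGGS" * 3  # assumed example: a 15-residue glycine/serine linker


def build_scfv(vh: str, vl: str, linker: str = DEFAULT_LINKER, vh_first: bool = True) -> str:
    """Concatenate VH and VL around a flexible linker.

    vh_first=True  -> VH-linker-VL (linker joins the C-terminus of VH to the N-terminus of VL)
    vh_first=False -> VL-linker-VH (linker joins the C-terminus of VL to the N-terminus of VH)
    """
    # Assumption: treat the stated "10 to about 25" range as a hard limit.
    if not 10 <= len(linker) <= 25:
        raise ValueError("linker length outside the 10 to 25 residue range")
    first, second = (vh, vl) if vh_first else (vl, vh)
    return first + linker + second


def linker_composition(linker: str) -> dict:
    """Return the fraction of glycine and of serine/threonine residues in the linker."""
    n = len(linker)
    return {
        "G": linker.count("G") / n,
        "S/T": (linker.count("S") + linker.count("T")) / n,
    }


if __name__ == "__main__":
    # Placeholder strings standing in for real variable-region sequences.
    print(build_scfv("VHSEQ", "VLSEQ"))
    print(build_scfv("VHSEQ", "VLSEQ", vh_first=False))
    print(linker_composition(DEFAULT_LINKER))
```

The `vh_first` flag captures the two orientations mentioned above; real constructs would of course use full variable-domain sequences.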
- Antibodies e.g., recombinant, monoclonal, or polyclonal antibodies
- the genes encoding the heavy and light chains of an antibody of interest can be cloned from a cell, e.g., the genes encoding a monoclonal antibody can be cloned from a hybridoma and used to produce a recombinant monoclonal antibody.
- Gene libraries encoding heavy and light chains of monoclonal antibodies can also be made from hybridoma or plasma cells. Random combinations of the heavy and light chain gene products generate a large pool of antibodies with different antigenic specificity.
- Techniques for the production of single chain antibodies or recombinant antibodies can be adapted to produce antibodies to polypeptides.
- transgenic mice, or other organisms such as other mammals, may be used to express humanized or human antibodies.
- phage display technology can be used to identify antibodies and heteromeric Fab fragments that specifically bind to selected antigens.
- Antibodies can also be made bispecific, i.e., able to recognize two different antigens.
- Antibodies can also be heteroconjugates, e.g., two covalently joined antibodies, or immunotoxins.
- the epitope of an antibody is the region of its antigen to which the antibody binds.
- Two antibodies bind to the same or overlapping epitope if each competitively inhibits (blocks) binding of the other to the antigen. That is, a 1x, 5x, 10x, 20x or 100x excess of one antibody inhibits binding of the other by at least 30% but preferably 50%, 75%, 90% or even 99% as measured in a competitive binding assay (see, e.g., Junghans et al., Cancer Res. 50: 1495, 1990).
- two antibodies have the same epitope if essentially all amino acid mutations in the antigen that reduce or eliminate binding of one antibody reduce or eliminate binding of the other.
- Two antibodies have overlapping epitopes if some amino acid mutations that reduce or eliminate binding of one antibody reduce or eliminate binding of the other.
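The two epitope criteria above (competitive inhibition and shared binding-disrupting mutations) can be sketched in Python as follows. This is a hedged illustration, not a method taken from this document: the function names are hypothetical, the signal values are arbitrary assay readouts, the 30% default reflects the minimum inhibition stated above, and "essentially all" mutations is simplified here to exact set equality.

```python
def percent_inhibition(signal_without_competitor: float, signal_with_competitor: float) -> float:
    """Percent reduction in binding signal caused by an excess of the competing antibody."""
    if signal_without_competitor <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (signal_without_competitor - signal_with_competitor) / signal_without_competitor


def competes(signal_without_competitor: float, signal_with_competitor: float,
             threshold_pct: float = 30.0) -> bool:
    """True if inhibition meets the threshold (30% is the minimum named in the text)."""
    return percent_inhibition(signal_without_competitor, signal_with_competitor) >= threshold_pct


def classify_epitopes(mutations_a: set, mutations_b: set) -> str:
    """Compare the antigen mutations that reduce or eliminate binding of each antibody.

    Simplifying assumption: "essentially all" is modeled as identical sets.
    """
    shared = mutations_a & mutations_b
    if not shared:
        return "distinct"
    if mutations_a == mutations_b:
        return "same"
    return "overlapping"


if __name__ == "__main__":
    # Hypothetical readouts and mutation labels, for illustration only.
    print(percent_inhibition(100.0, 40.0))
    print(competes(100.0, 80.0))
    print(classify_epitopes({"K417N", "E484K"}, {"E484K", "N501Y"}))
```

In practice the competition test would be run in both directions, since the text requires that each antibody inhibits binding of the other.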
- a humanized antibody has one or more amino acid residues introduced into it from a source which is non-human. These non-human amino acid residues are often referred to as import residues, which are typically taken from an import variable domain. Humanization can be performed by methods known in the art. Accordingly, such humanized antibodies are chimeric antibodies, wherein substantially less than an intact human variable domain has been substituted by the corresponding sequence from a non-human species. In practice, humanized antibodies are typically human antibodies in which some CDR residues and possibly some FR residues are substituted by residues from analogous sites in rodent antibodies.
- polynucleotides comprising a first sequence coding for humanized immunoglobulin framework regions and a second sequence set coding for the desired immunoglobulin complementarity determining regions can be produced synthetically or by combining appropriate cDNA and genomic DNA segments.
- Human constant region DNA sequences can be isolated in accordance with well known procedures from a variety of human cells.
- a “chimeric antibody” is an antibody molecule in which (i) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species, or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (ii) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity.
- the antibodies described herein include humanized and/or chimeric monoclonal antibodies.
- the phrase “specifically (or selectively) binds” to an antibody or a receptor protein or “specifically (or selectively) immunoreactive with” when referring to a protein refers to a binding reaction that is determinative of the presence of the protein, often in a heterogeneous population of proteins and other biologics.
- the specified antibodies bind to a particular protein at least two times the background and more typically more than 10 to 100 times background.
- Specific binding to an antibody under such conditions requires an antibody that is selected for its specificity for a particular protein.
- polyclonal antibodies can be selected to obtain only a subset of antibodies that are specifically immunoreactive with the selected antigen and not with other proteins.
- This selection may be achieved by subtracting out antibodies that cross-react with other molecules.
- a variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein.
- solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Using Antibodies, A Laboratory Manual (1998) for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity).
- Receptor protein or “membrane receptor” refers to a receptor (protein) that is embedded in the plasma membrane of a cell.
- the receptor protein is located in the extracellular domain of a cell, the transmembrane domain of a cell, or the intracellular domain of a cell.
- the receptor protein is a cell-surface receptor.
- the receptor protein is in the extracellular domain.
- the receptor protein is in the transmembrane domain.
- the receptor protein is an ion channel-linked receptor, an enzyme-linked receptor, or a G protein-coupled receptor.
- the receptor protein is a hormone receptor.
- peptidyl moiety refers to a protein, protein fragment, or peptide that may form part of a biomolecule or a biomolecule conjugate.
- the peptidyl moiety forms part of a biomolecule (e.g., protein).
- the peptidyl moiety forms part of a biomolecule (e.g., protein) conjugate.
- the peptidyl moiety may also be substituted with additional chemical moieties (e.g., additional R substituents).
- the peptidyl moiety forms part of an antibody or an antibody variant.
- the peptidyl moiety forms part of a receptor protein.
- a peptidyl moiety is a protein, protein fragment, or peptide that contains a monovalent radical of an amino acid.
- amino acid moiety refers to a monovalent amino acid.
- carbohydrate moiety refers to carbohydrates, for example, polyhydroxy aldehydes, ketones, alcohols, acids, their simple derivatives and their polymers having linkages of the acetal type, that may form part of a biomolecule or a biomolecule conjugate.
- carbohydrate moiety forms part of a biomolecule.
- carbohydrate moiety forms part of a biomolecule conjugate.
- the carbohydrate moiety may also be substituted with additional chemical moieties (e.g., additional R substituents).
- nucleic acid moiety refers to nucleic acids, for example, DNA and RNA, that may form part of a biomolecule or biomolecule conjugate. In aspects, the nucleic acid moiety forms part of a biomolecule. In aspects, the nucleic acid moiety forms part of a biomolecule conjugate. The nucleic acid moiety may also be substituted with additional chemical moieties (e.g., additional R substituents).
- lipid moiety refers to a lipid or lipid fragment.
- the lipid may be substituted with additional chemical moieties.
- a lipid moiety is a monovalent radical of a lipid.
- RNA moiety refers to a RNA, as described herein.
- an RNA moiety is a monovalent radical of RNA.
- an RNA moiety is an RNA containing a monovalent radical of a nucleotide.
- RNA-binding protein moiety refers to a protein, as described herein.
- an RNA-binding moiety is a monovalent radical of an RNA-binding protein, such as a monovalent radical of a CRISPR protein or a monovalent radical of a RNA chaperone.
- Nucleic acid refers to nucleotides (e.g., deoxyribonucleotides or ribonucleotides) and polymers thereof in either single-, double- or multiple-stranded form, or complements thereof.
- polynucleotide, “oligonucleotide,” “oligo” or the like refer, in the usual and customary sense, to a linear sequence of nucleotides.
- nucleotide refers, in the usual and customary sense, to a single unit of a polynucleotide, i.e., a monomer.
- Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof.
- Examples of polynucleotides contemplated herein include single and double stranded DNA, single and double stranded RNA, and hybrid molecules having mixtures of single and double stranded DNA and RNA.
- Examples of nucleic acids, e.g., polynucleotides, contemplated herein include any types of RNA, e.g., mRNA, siRNA, miRNA, and guide RNA, and any types of DNA, e.g., genomic DNA, plasmid DNA, and minicircle DNA, and any fragments thereof.
- nucleic acids can be linear or branched.
- nucleic acids can be a linear chain of nucleotides or the nucleic acids can be branched, e.g., such that the nucleic acids comprise one or more arms or branches of nucleotides.
- the branched nucleic acids are repetitively branched to form higher ordered structures such as dendrimers and the like.
- Nucleic acids including e.g., nucleic acids with a phosphothioate backbone, can include one or more reactive moieties.
- the term reactive moiety includes any group capable of reacting with another molecule, e.g., a nucleic acid or polypeptide through covalent, non-covalent or other interactions.
- the nucleic acid can include an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide through a covalent, non-covalent or other interaction.
- the terms also encompass nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides.
- analogs include, without limitation, phosphodiester derivatives including, e.g., phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphothioate, having double bonded sulfur replacing oxygen in the phosphate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages.
- nucleic acids include those with positive backbones, nonionic backbones, modified sugars, and non-ribose backbones (e.g., phosphorodiamidate morpholino oligos or locked nucleic acids (LNA) as known in the art), including those described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, Glycan Modifications in Antisense Research, Sanghui & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids.
- Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip.
- Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
- the internucleotide linkages in DNA are phosphodiester, phosphodiester derivatives, or a combination of both.
- Nucleic acids can include nonspecific sequences.
- nonspecific sequence refers to a nucleic acid sequence that contains a series of residues that are not designed to be complementary to or are only partially complementary to any other nucleic acid sequence.
- a nonspecific nucleic acid sequence is a sequence of nucleic acid residues that does not function as an inhibitory nucleic acid when contacted with a cell or organism.
- a polynucleotide is typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T) when the polynucleotide is RNA).
- polynucleotide sequence is the alphabetical representation of a polynucleotide molecule; alternatively, the term may be applied to the polynucleotide molecule itself. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
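- To illustrate the alphabetical representation described above, the following sketch validates a DNA string over the four-letter alphabet and rewrites it as RNA by substituting uracil (U) for thymine (T). The function name and the choice to reject any other symbol are assumptions made for the example; sequences with non-standard or modified nucleotides would need a wider alphabet.

```python
DNA_BASES = set("ACGT")


def dna_to_rna(sequence):
    """Return the RNA spelling of a DNA sequence string (T replaced by U)."""
    sequence = sequence.upper()
    unknown = set(sequence) - DNA_BASES
    if unknown:
        # Only the four standard bases are handled in this sketch.
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    return sequence.replace("T", "U")
```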
- Polynucleotides may optionally include one or more non-standard nucleotide(s), nucleotide analog(s) and/or modified nucleo
- complement refers to a nucleotide (e.g., RNA or DNA) or a sequence of nucleotides capable of base pairing with a complementary nucleotide or sequence of nucleotides.
- a complement may include a sequence of nucleotides that base pair with corresponding complementary nucleotides of a second nucleic acid sequence.
- the nucleotides of a complement may partially or completely match the nucleotides of the second nucleic acid sequence. Where the nucleotides of the complement completely match each nucleotide of the second nucleic acid sequence, the complement forms base pairs with each nucleotide of the second nucleic acid sequence. Where the nucleotides of the complement partially match the nucleotides of the second nucleic acid sequence only some of the nucleotides of the complement form base pairs with nucleotides of the second nucleic acid sequence.
- Examples of complementary sequences include coding and non-coding sequences, wherein the non-coding sequence contains complementary nucleotides to the coding sequence and thus forms the complement of the coding sequence.
- a further example of complementary sequences are sense and antisense sequences, wherein the sense sequence contains complementary nucleotides to the antisense sequence and thus forms the complement of the antisense sequence.
- sequences may be partial, in which only some of the nucleic acids match according to base pairing, or complete, where all the nucleic acids match according to base pairing.
- two sequences that are complementary to each other may have a specified percentage of nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region).
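- The distinction between complete and partial complementarity can be sketched for DNA strings as follows. The sketch assumes Watson-Crick pairing only, two strands of equal length each written 5' to 3', and no gaps; the function names are illustrative.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """Fully complementary strand, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(sequence.upper()))


def paired_fraction(first, second):
    """Fraction of positions in `first` that base-pair with `second`
    when the two strands are aligned antiparallel."""
    first, second = first.upper(), second.upper()
    if not first or len(first) != len(second):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(PAIRS.get(x) == y for x, y in zip(first, reversed(second)))
    return paired / len(first)
```

- A value of 1.0 corresponds to a complete complement; any value between 0 and 1 corresponds to a partial complement in the sense used above.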
- amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
- Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
- Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
- the terms “non-naturally occurring amino acid” and “unnatural amino acid” refer to amino acid analogs, synthetic amino acids, and amino acid mimetics which are not found in nature.
- Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
- amino acid side chain refers to the functional substituent contained on amino acids.
- an amino acid side chain may be the side chain of a naturally occurring amino acid.
- Naturally occurring amino acids are those encoded by the genetic code (e.g., alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, or valine), as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- the amino acid side chain may be a non-natural amino acid side chain.
- the amino acid side chain is H,
- non-natural amino acid side chain or “unnatural amino acid side chain” refers to the functional substituent of compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium, allylalanine, 2-aminoisobutyric acid.
- Non-natural amino acids are non-proteinogenic amino acids that either occur naturally or are chemically synthesized.
- Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
- Non-limiting examples include exo-cis-3-aminobicyclo[2.2.1]hept-5-ene-2-carboxylic acid hydrochloride, cis-2-aminocycloheptane-carboxylic acid hydrochloride, cis-6-amino-3-cyclohexene-1-carboxylic acid hydrochloride, cis-2-amino-2-methylcyclohexanecarboxylic acid hydrochloride, cis-2-amino-2-methylcyclopentane-carboxylic acid hydrochloride, 2-(Boc-aminomethyl)benzoic acid, 2-(Boc-amino)octanedioic acid, Boc-4,5-dehydro-Leu-OH (dicyclohexylammonium), Boc-4- (F
- Fmoc-Phe(4-Br)-OH, Fmoc-Phe(3,5-F2)-OH,
- Fmoc-β-(4-thiazolyl)-Ala-OH, Fmoc-β-(2-thienyl)-Ala-OH, 4-(Hydroxymethyl)-D-phenylalanine.
- “Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, “conservatively modified variants” refers to those nucleic acids that encode identical or essentially identical amino acid sequences. Because of the degeneracy of the genetic code, a number of nucleic acid sequences will encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations.
- Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid.
- each codon in a nucleic acid except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan
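- Silent variation can be illustrated by enumerating every coding sequence that yields the same peptide. The sketch below uses a deliberately partial codon table containing only the three amino acids named in the text (alanine, methionine, tryptophan), written in RNA letters, so the tryptophan codon appears as UGG; the table, the function name, and the example sequence are assumptions for illustration.

```python
from itertools import product

# Partial table, for illustration only: codons for Ala, Met and Trp.
SYNONYMOUS = {
    "Ala": {"GCA", "GCC", "GCG", "GCU"},
    "Met": {"AUG"},
    "Trp": {"UGG"},
}
CODON_TO_AA = {c: aa for aa, codons in SYNONYMOUS.items() for c in codons}


def silent_variants(rna):
    """All RNA sequences encoding the same peptide as `rna`.

    Raises KeyError for codons outside the partial table above.
    """
    rna = rna.upper()
    if len(rna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    choices = [sorted(SYNONYMOUS[CODON_TO_AA[c]]) for c in codons]
    return ["".join(combo) for combo in product(*choices)]
```

- For a Met-Ala-Trp coding sequence, only the alanine codon can vary, which reflects the point that AUG and the tryptophan codon have no synonymous alternatives.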
- As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alter, add or delete a single amino acid or a small percentage of amino acids in the encoded sequence are “conservatively modified variants” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the disclosure.
- the following eight groups each contain amino acids that are conservative substitutions for one another: (i) Alanine (A), Glycine (G); (ii) Aspartic acid (D), Glutamic acid (E); (iii) Asparagine (N), Glutamine (Q); (iv) Arginine (R), Lysine (K); (v) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); (vi) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); (vii) Serine (S), Threonine (T); and (viii) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
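- The eight groups listed above can be encoded directly as a lookup. The sketch below is one possible representation, using one-letter codes; the function name and the decision to treat an unchanged residue as "not a substitution" are assumptions of the example.

```python
# The eight conservative-substitution groups from the text, in one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AG"), set("DE"), set("NQ"), set("RK"),
    set("ILMV"), set("FYW"), set("ST"), set("CM"),
]


def is_conservative(original, replacement):
    """True if replacing `original` with `replacement` stays within one group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

- Methionine appears in two groups, so under this encoding both M to V and M to C count as conservative.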
- protein, polypeptide and “peptide” are used interchangeably herein to refer to a polymer of amino acid residues.
- the polymer of amino acids may, in embodiments, be conjugated to a moiety that does not consist of amino acids.
- the terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.
- a “fusion protein” refers to a chimeric protein encoding two or more separate protein sequences that are recombinantly expressed as a single moiety.
- amino acid or nucleotide base “position” is denoted by a number that sequentially identifies each amino acid (or nucleotide base) in the reference sequence based on its position relative to the N-terminus (or 5'-end). Due to deletions, insertions, truncations, fusions, and the like that must be taken into account when determining an optimal alignment, in general the amino acid residue number in a test sequence determined by simply counting from the N-terminus will not necessarily be the same as the number of its corresponding position in the reference sequence.
- a variant has a deletion relative to an aligned reference sequence
- that insertion will not correspond to a numbered amino acid position in the reference sequence.
- in the case of truncations or fusions, there can be stretches of amino acids in either the reference or aligned sequence that do not correspond to any amino acid in the corresponding sequence.
- an amino acid residue in a protein “corresponds” to a given residue when it occupies the same essential structural position within the protein as the given residue.
- a selected residue in a selected protein corresponds to a specific position (e.g., A100) of a protein when the selected residue occupies the same essential spatial or other structural relationship as that specific position (e.g., A100) of the protein.
- the position in the aligned selected protein aligning with that specific position is said to correspond to that specific residue (e.g., A100).
- a three dimensional structural alignment can also be used, e.g., where the structure of the selected protein is aligned for maximum correspondence with the protein and the overall structures compared.
- an amino acid that occupies the same essential position as that specific position (e.g., A100) in the structural model is said to correspond to that specific position residue (e.g., A100).
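- For the sequence-alignment case, position correspondence can be sketched as a walk along two already-aligned strings. The sketch assumes a pairwise alignment is supplied with "-" as the gap character and does not perform the alignment itself; the function name and the example sequences are hypothetical.

```python
def corresponding_position(aligned_ref, aligned_test, ref_position):
    """1-based position in the test sequence that aligns with the 1-based
    `ref_position` of the reference, or None if the test has a gap there."""
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    ref_count = test_count = 0
    for ref_char, test_char in zip(aligned_ref, aligned_test):
        if ref_char != "-":
            ref_count += 1
        if test_char != "-":
            test_count += 1
        if ref_char != "-" and ref_count == ref_position:
            return test_count if test_char != "-" else None
    raise ValueError("ref_position lies outside the reference sequence")
```

- With an insertion in the test sequence, simple counting from the N-terminus and the aligned position diverge, which is the situation the definition above describes.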
- “Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
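- The calculation described above can be sketched as follows, assuming the two sequences have already been optimally aligned over the comparison window and that "-" marks a gap. The function name is illustrative, and gap columns are counted toward the total number of positions in the window, following the wording above.

```python
def percent_identity(aligned_a, aligned_b):
    """Matched positions divided by total positions in the window, times 100."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and equal length")
    # A position matches only when both sequences carry the same residue;
    # a gap never counts as a match.
    matched = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matched / len(aligned_a)
```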
- identical or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, or at least 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (e.g., NCBI web site ncbi.nlm.nih.gov/BLAST/ or the like).
- sequences are then said to be “substantially identical.”
- This definition also refers to, or may be applied to, the complement of a test sequence.
- the definition also includes sequences that have deletions and/or additions, as well as those that have substitutions.
- the preferred algorithms can account for gaps and the like.
- identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
- biomolecule refers to large macromolecules such as, for example, proteins, lipids, and nucleic acids, as well as small molecules such as, for example, primary and secondary metabolites.
- biomolecule refers to a protein.
- biomolecule refers to a RNA-binding protein.
- biomolecule refers to RNA.
- biomolecule refers to a receptor protein.
- biomolecule moiety refers to biomolecules, including large macromolecules such as, for example, proteins, lipids, and nucleic acids, as well as small molecules such as, for example, primary and secondary metabolites.
- the biomolecule moiety is a peptidyl moiety, a lipid moiety or a nucleic acid moiety.
- Biomolecule moieties may form part of a molecule (e.g., biomolecule).
- biomolecule moieties may form part of a biomolecule conjugate, where the biomolecule conjugate includes two or more biomolecule moieties.
- the biomolecule conjugate includes two or more biomolecule moieties conjugated via a bioconjugate linker.
- pyrrolysyl-tRNA synthetase refers to an enzyme (including homologs, isoforms, and functional fragments thereof) with pyrrolysyl-tRNA synthetase activity.
- Pyrrolysyl-tRNA synthetase is an aminoacyl-tRNA synthetase that catalyzes the reaction necessary to attach the α-amino acid pyrrolysine to the cognate tRNA (tRNA Pyl), thereby allowing incorporation of pyrrolysine during proteinogenesis at amber stop codons (i.e., UAG).
- the term includes any recombinant or naturally-occurring form of pyrrolysyl-tRNA synthetase or variants, homologs, or isoforms thereof that maintain pyrrolysyl-tRNA synthetase activity (e.g. within at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% activity compared to wildtype pyrrolysyl-tRNA synthetase).
- the variants, homologs, or isoforms have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100.
- the mutant pyrrolysyl-tRNA synthetase catalyzes the attachment of the compound of Formula (I) and embodiments thereof to a tRNA Pyl.
- the mutant pyrrolysyl-tRNA synthetase catalyzes the attachment of the compounds described herein and embodiments thereof to a tRNA Pyl.
- the pyrrolysyl-tRNA synthetase comprises the amino acid sequence set forth as SEQ ID NO: 1.
- mutant pyrrolysyl-tRNA synthetase or “mutant PylRS” refers to any pyrrolysyl-tRNA synthetase that has a different amino acid sequence from wild-type amino acid sequence.
- tRNA Pyl and “tRNA Pyl CUA” (i.e., tRNA(superscript Pyl)(subscript CUA)) are used interchangeably and both refer to a single-stranded RNA molecule containing about 70 to 90 nucleotides which folds via intrastrand base pairing to form a characteristic cloverleaf structure that carries a specific amino acid (e.g., compound of Formula (I) or embodiments thereof; compound of Formula (IV) or embodiments thereof; compound of Formula (VII) or embodiments thereof) and matches it to its corresponding codon (i.e., a codon complementary to the anticodon of the tRNA) on an mRNA during protein synthesis.
- in tRNA Pyl, the anticodon is CUA. Anticodon CUA is complementary to amber stop codon UAG.
- the tRNA Pyl comprises an anticodon.
- the anticodon is CUA, TTA, or TCA.
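- The anticodon and codon relationship stated above can be checked with a short reverse-complement sketch. It assumes both anticodon and codon are written 5' to 3', treats T in an anticodon as U, and considers Watson-Crick pairing only (no wobble or non-canonical bases); the function name is illustrative. The stop-codon names in the table are the conventional ones.

```python
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}
STOP_CODON_NAMES = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}


def codon_for_anticodon(anticodon):
    """mRNA codon (5' to 3') that base-pairs with an anticodon (5' to 3')."""
    rna = anticodon.upper().replace("T", "U")  # accept DNA-style spelling
    return "".join(RNA_PAIRS[base] for base in reversed(rna))
```

- Applied to the three anticodons listed above, the sketch returns one of the three stop codons in each case.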
- the tRNA Pyl comprises an anticodon, wherein the anticodon comprises at least one non-canonical base.
- the abbreviation “Pyl” of tRNA Pyl stands for pyrrolysine and the “CUA” of tRNA Pyl refers to its anticodon CUA.
- tRNA Pyl is attached to a compound described herein, including embodiments thereof.
- substrate-binding site refers to residues located in the enzyme active site that form temporary bonds or interactions with the substrate.
- the substrate-binding site of pyrrolysyl-tRNA synthetase refers to residues located in the active site of pyrrolysyl-tRNA synthetase that form temporary bonds or interactions with the amino acid substrate.
- vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
- plasmid refers to a linear or circular double stranded DNA loop into which additional DNA segments can be ligated.
- Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome.
- Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors).
- Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome.
- certain vectors are capable of directing the expression of genes to which they are operatively linked.
- Such vectors are referred to herein as “expression vectors.”
- expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
- plasmid and “vector” can be used interchangeably as the plasmid is the most commonly used form of vector.
- viral vectors e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses
- Some viral vectors are capable of targeting a particular cell type either specifically or non-specifically.
- Exemplary vectors that can be used include, but are not limited to, pEvol vector, pMP vector, pET vector, pTak vector, pBad vector.
- a complex refers to a composition that includes two or more components, where the components bind together to make a functional unit.
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein and an amino acid substrate (e.g., the compounds described herein, including embodiments thereof).
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein and a tRNA (e.g., tRNA Pyl).
- a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate and a tRNA (e.g., tRNA Pyl).
- a complex described herein includes at least two components selected from the group consisting of a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., the compound of Formula (I) or embodiments thereof), a polypeptide containing the compound of Formula (I) or embodiments thereof, and a tRNA (e.g., tRNA Pyl).
- a complex described herein includes at least two components selected from the group consisting of a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., a compound described herein, including embodiments thereof), a polypeptide containing a compound described herein, including embodiments thereof, and a tRNA (e.g., tRNA Pyl).
- protein/protein complex refers to a composition that includes one protein-binding protein (e.g., comprising an unnatural amino acid as described herein) and one protein, where the protein-binding protein and protein are proximal to each other but not bound together; the protein-binding protein and protein are covalently bound together; or the protein-binding protein and protein are ionically bound together.
- the protein-binding protein and protein are proximal to each other but not bound together.
- the protein-binding protein and protein are covalently bonded together.
- the protein-binding protein and protein are ionically bonded together.
- the protein-binding protein and protein are covalently and ionically bonded together.
- the chemical reaction forming the protein/protein complex is a SuFEx reaction.
- transfection and “transduction” can be used interchangeably and are defined as a process of introducing a nucleic acid molecule or a protein to a cell.
- Nucleic acids are introduced to a cell using non-viral or viral-based methods.
- the nucleic acid molecules may be gene sequences encoding complete proteins or functional portions thereof.
- Non-viral methods of transfection include any appropriate transfection method that does not use viral DNA or viral particles as a delivery system to introduce the nucleic acid molecule into the cell.
- Exemplary non-viral transfection methods include calcium phosphate transfection, liposomal transfection, nucleofection, sonoporation.
- the nucleic acid molecules are introduced into a cell using electroporation following standard procedures well known in the art.
- any useful viral vector may be used in the methods described herein. Examples for viral vectors include, but are not limited to retroviral, adenoviral, lentiviral and adeno-associated viral vectors.
- the nucleic acid molecules are introduced into a cell using a retroviral vector following standard procedures well known in the art.
- the terms “transfection” or “transduction” also refer to introducing proteins into a cell from the external environment.
- transduction or transfection of a protein relies on attachment of a peptide or protein capable of crossing the cell membrane to the protein of interest.
- isolated when applied to a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state. It can be, for example, in a homogeneous state and may be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant species present in a preparation is substantially purified.
- Contacting is used in accordance with its plain ordinary meaning and refers to the process of allowing at least two distinct species (e.g., chemical compounds including amino acids, proteins, peptides, biomolecules, or cells) to become sufficiently proximal to react, interact or physically touch. It should be appreciated, however, that the resulting reaction product can be produced directly from a reaction between the added reagents or from an intermediate from one or more of the added reagents that can be produced in the reaction mixture.
- the term “contacting” may include allowing two species to react, interact, or physically touch, wherein the two species may be biomolecule moieties as described herein. In some embodiments, contacting includes allowing two proteins or a protein and a glycan as described herein to interact.
- a “detectable agent” or “detectable moiety” is a composition detectable by appropriate means such as spectroscopic, photochemical, biochemical, immunochemical, chemical, magnetic resonance imaging, or other physical means.
- the proteins described herein are bonded to a detectable agent.
- the fusion proteins described herein are bonded to a detectable agent.
- an antibody or antibody variant is bonded to a detectable agent.
- a nanobody is bonded to a detectable agent.
- the bond is noncovalent or covalent.
- the bond is covalent.
- the protein is covalently bonded to a detectable agent.
- the fusion protein is covalently bonded to a detectable agent.
- the antibody or antibody variant is covalently bonded to a detectable agent.
- a nanobody is covalently bonded to a detectable agent.
- the covalent bond is between the detectable agent and a naturally-occurring amino acid in the protein or fusion protein.
- where the nanobody is covalently bonded to a detectable agent, the covalent bond is between the detectable agent and a naturally-occurring amino acid in the nanobody.
- Detectable agents include 18 F, 32 P, 33 P, 45 Ti, 47 Sc, 52 Fe, 59 Fe, 62 Cu, 64 Cu, 67 Cu, 67 Ga, 68 Ga, 77 As, 86 Y, 90 Y, 89 Sr, 89 Zr, 94 Tc, 94 Tc, 99m Tc, 99 Mo, 105 Pd, 105 Rh, Cr, V, Mn, Fe, Co, Ni, Cu, La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, 32 P, fluorophore (e.g., fluorescent dyes), electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, paramagnetic molecules, paramagnetic nanoparticles, ultrasmall superparamagnetic iron oxide (“USPIO”) nanoparticles, USPIO nanoparticle aggregates, superparamagnetic iron oxide C SP
- fluorodeoxyglucose e.g., fluorine-18 labeled
- any gamma ray emitting radionuclides, positron-emitting radionuclide, radiolabeled glucose, radiolabeled water, radiolabeled ammonia, biocolloids, microbubbles (e.g. including microbubble shells including albumin, galactose, lipid, and/or polymers; microbubble gas core including air, heavy gases, perfluorocarbon, nitrogen, octafluoropropane, perflexane lipid microsphere, perflutren, etc.), iodinated contrast agents (e.g...
- a detectable moiety is a monovalent detectable agent or a detectable agent capable of forming a bond with another composition.
- paramagnetic ions that may be used as imaging agents in accordance with the embodiments of the disclosure include, e.g., ions of transition and lanthanide metals (e.g., metals having atomic numbers of 21-29, 42, 43, 44, or 57-71). These metals include ions of Cr, V, Mn, Fe, Co, Ni, Cu, La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb and Lu.
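As an illustration only, the recited atomic-number ranges can be checked against the named metals with a short Python sketch. The atomic numbers are standard periodic-table values; the names `ATOMIC_NUMBER`, `in_recited_range`, and `all_named_in_range` are hypothetical and do not come from the disclosure.

```python
# Standard atomic numbers for the metals named in the passage.
ATOMIC_NUMBER = {
    "V": 23, "Cr": 24, "Mn": 25, "Fe": 26, "Co": 27, "Ni": 28, "Cu": 29,
    "La": 57, "Ce": 58, "Pr": 59, "Nd": 60, "Pm": 61, "Sm": 62, "Eu": 63,
    "Gd": 64, "Tb": 65, "Dy": 66, "Ho": 67, "Er": 68, "Tm": 69, "Yb": 70,
    "Lu": 71,
}

# Metals in the order they are listed in the passage.
NAMED_METALS = [
    "Cr", "V", "Mn", "Fe", "Co", "Ni", "Cu", "La", "Ce", "Pr", "Nd",
    "Pm", "Sm", "Eu", "Gd", "Tb", "Dy", "Ho", "Er", "Tm", "Yb", "Lu",
]


def in_recited_range(z: int) -> bool:
    """Return True if atomic number z is 21-29, 42, 43, 44, or 57-71."""
    return 21 <= z <= 29 or z in (42, 43, 44) or 57 <= z <= 71


# Every named metal should fall inside one of the recited ranges.
all_named_in_range = all(in_recited_range(ATOMIC_NUMBER[m]) for m in NAMED_METALS)
```

Under these assumptions every listed metal falls in either the 21-29 block or the 57-71 (lanthanide) block; none of the named metals has atomic number 42, 43, or 44.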
- a “radioisotope” that may be used as an imaging and/or labeling agent in accordance with the embodiments of the disclosure includes, but is not limited to, 18 F, 32 P, 33 P, 45 Ti, 47 Sc, 52 Fe, 59 Fe, 62 Cu, 64 Cu, 67 Cu, 67 Ga, 68 Ga, 77 As, 86 Y, 90 Y, 89 Sr, 89 Zr, 94 Tc, 94 Tc, 99m Tc, 99 Mo, 105 Pd, and 225 Ac.
- the proteins described herein are bonded to a radioisotope.
- the fusion proteins described herein are bonded to a radioisotope.
- an antibody or antibody variant is bonded to a radioisotope.
- a nanobody is bonded to a radioisotope.
- the bond is noncovalent or covalent.
- the bond is covalent.
- the protein is covalently bonded to a radioisotope.
- the fusion protein is covalently bonded to a radioisotope.
- the antibody or antibody variant is covalently bonded to a radioisotope.
- a nanobody is covalently bonded to a radioisotope.
- the covalent bond is between the radioisotope and a naturally-occurring amino acid in the protein or fusion protein.
- the covalent bond is between the radioisotope and a naturally-occurring amino acid in the nanobody.
- Methods for covalently bonding radioisotopes to proteins are well-known in the art.
- the radioisotope is 123 I, 124 I, 125 I, or 131 I.
- the radioisotope is 123 I.
- the radioisotope is 124 I.
- the radioisotope is 125 I. In embodiments, the radioisotope is 131 I. In embodiments, the radioisotope is a positron-emitting radioisotope. In embodiments, the positron-emitting radioisotope is 11 C, 13 N, 15 O, 18 F, 64 Cu, 68 Ga, 78 Br, 82 Rb, 86 Y, 89 Zr, 90 Y, 22 Na, 26 Al, 40 K, 83 Sr, or 124 I. In embodiments, the positron-emitting radioisotope is 11 C. In embodiments, the positron-emitting radioisotope is 13 N.
- the positron-emitting radioisotope is 15 O. In embodiments, the positron-emitting radioisotope is 18 F. In embodiments, the positron-emitting radioisotope is 64 Cu. In embodiments, the positron-emitting radioisotope is 68 Ga. In embodiments, the positron-emitting radioisotope is 78 Br. In embodiments, the positron-emitting radioisotope is 82 Rb. In embodiments, the positron-emitting radioisotope is 86 Y. In embodiments, the positron-emitting radioisotope is 89 Zr.
- the positron-emitting radioisotope is 90 Y. In embodiments, the positron-emitting radioisotope is 22 Na. In embodiments, the positron-emitting radioisotope is 26 Al. In embodiments, the positron-emitting radioisotope is 40 K. In embodiments, the positron-emitting radioisotope is 83 Sr. In embodiments, the positron-emitting radioisotope is 124 I. In embodiments, the radioisotope is an alpha-emitting radioisotope.
- the alpha-emitting radioisotope is 211 At, 227 Th, 225 Ac, 223 Ra, 213 Bi, or 212 Bi. In embodiments, the alpha-emitting radioisotope is 211 At. In embodiments, the alpha-emitting radioisotope is 227 Th. In embodiments, the alpha-emitting radioisotope is 225 Ac. In embodiments, the alpha-emitting radioisotope is 223 Ra. In embodiments, the alpha-emitting radioisotope is 213 Bi. In embodiments, the alpha-emitting radioisotope is 212 Bi.
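For readers who handle these isotope labels programmatically, the following is a minimal sketch of parsing a label such as "225 Ac" or "99m Tc" and checking it against the six alpha emitters recited above. The function names and the label grammar (mass number, optional "m" for a metastable state, element symbol) are assumptions made for illustration, not part of the disclosure.

```python
import re

# Alpha emitters as recited in the passage, stored as (element, mass number).
ALPHA_EMITTERS = {
    ("At", 211), ("Th", 227), ("Ac", 225),
    ("Ra", 223), ("Bi", 213), ("Bi", 212),
}

# Assumed label grammar: digits, optional "m", optional spaces, element symbol.
_LABEL = re.compile(r"^\s*(\d+)(m?)\s*([A-Z][a-z]?)\s*$")


def parse_isotope(label: str):
    """Split a label like '99m Tc' into (element, mass_number, is_metastable)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized isotope label: {label!r}")
    return match.group(3), int(match.group(1)), match.group(2) == "m"


def is_recited_alpha_emitter(label: str) -> bool:
    """True only for the alpha emitters listed in the passage."""
    element, mass, metastable = parse_isotope(label)
    return not metastable and (element, mass) in ALPHA_EMITTERS
```

The lookup is deliberately limited to the recited list; it makes no claim about the decay mode of any isotope that the passage does not name.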
- therapeutic agent refers to any agent useful in treating and/or preventing a disease.
- “Therapeutic agent” includes, without limitation, small molecule drugs, proteins, nucleic acids (e.g., DNA, RNA), and the like.
- Small-molecule drugs refers to chemical compounds with low molecular weight that are capable of treating and/or preventing diseases.
- the proteins described herein are bonded to a therapeutic agent.
- the fusion proteins described herein are bonded to a therapeutic agent.
- an antibody or antibody variant is bonded to a therapeutic agent.
- a nanobody is bonded to a therapeutic agent.
- the bond is noncovalent or covalent.
- the bond is covalent.
- the protein is covalently bonded to a therapeutic agent.
- the fusion protein is covalently bonded to a therapeutic agent.
- the antibody or antibody variant is covalently bonded to a therapeutic agent.
- a nanobody is covalently bonded to a therapeutic agent.
- the covalent bond is between the therapeutic agent and a naturally-occurring amino acid in the protein or fusion protein.
- the covalent bond is between the therapeutic agent and a naturally-occurring amino acid in the nanobody.
- SuFEx sulfur-fluoride exchange reaction
- proximally-enabled SuFEx refers to the sulfur-fluoride exchange reaction occurring when the reactive species are proximal to each other, i.e., spatially close enough for the SuFEx reaction to occur. The proximity may occur within a single biomolecule (e.g..
- RNA e.g., a hydroxyl group on RNA
- the reactive species are sufficiently proximal for the reaction to occur, e.g., sulfur-fluoride exchange reaction between the compound of Formula (IV) and a peptidyl moiety (e.g., having a tyrosine, lysine, or histidine), a nucleic acid moiety, or a carbohydrate moiety; or for example a sulfur-fluoride exchange reaction between the compound of Formula (I) and a nucleic acid moiety; or for example a sulfur-fluoride exchange reaction between the compound of Formula (VII) and a peptidyl moiety (e.g., having a tyrosine, lysine, or histidine), a nucleic acid moiety, or a carbohydrate moiety.
- proximal means that two compounds (e.g., biomolecules, proteins, peptides, amino acids, glycans) are adjacent (e.g., but not covalently bonded together).
- proximal means up to about 25 angstroms.
- proximal means up to about 20 angstroms.
- proximal means up to about 15 angstroms.
- proximal means up to about 10 angstroms.
- proximal means from about 1 angstrom to about 25 angstroms.
- proximal means from about 1 angstrom to about 20 angstroms.
- proximal means from about 1 angstrom to about 15 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 12 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 10 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 8 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 6 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 5 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 4 angstroms.
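The distance-based embodiments of “proximal” can be illustrated with a small Python sketch that measures the Euclidean distance between two atoms and compares it with a chosen bound. This is only a sketch: the coordinates are assumed to be Cartesian positions in angstroms, the function names are hypothetical, and the word “about” is treated as a hard cutoff, which the text itself does not require.

```python
import math

# Upper bounds (angstroms) from the "up to about" embodiments in the passage.
UP_TO_BOUNDS = (25.0, 20.0, 15.0, 10.0)


def distance(a, b) -> float:
    """Euclidean distance between two (x, y, z) positions given in angstroms."""
    return math.dist(a, b)


def is_proximal(a, b, max_angstrom: float = 25.0, min_angstrom: float = 0.0) -> bool:
    """True if the separation lies in [min_angstrom, max_angstrom].

    Assumption: "about" is read as an exact numeric cutoff.
    """
    return min_angstrom <= distance(a, b) <= max_angstrom


# Hypothetical positions of two reactive atoms, 5 angstroms apart.
sulfur = (0.0, 0.0, 0.0)
nucleophile = (3.0, 4.0, 0.0)
within_10 = is_proximal(sulfur, nucleophile, max_angstrom=10.0)
within_1_to_4 = is_proximal(sulfur, nucleophile, max_angstrom=4.0, min_angstrom=1.0)
```

Here the two hypothetical atoms satisfy the "up to about 10 angstroms" embodiment but not the "about 1 angstrom to about 4 angstroms" embodiment.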
- intermolecular linker refers to a linking group between two biomolecules.
- the peptidyl moiety of R 4 is a first protein and the peptidyl moiety of R 3 is a second protein, such that the first protein and the second protein are covalently bonded.
- the first protein and the second protein can have the same sequence, e.g., providing an intermolecular linker between two different proteins having the same amino acid sequence.
- the first protein and the second protein are different proteins, e.g., providing an intermolecular linker between two different proteins, such as a nanobody and a receptor protein.
- intramolecular linker refers to a linking group within a biomolecule.
- the compounds of Formula (III) or embodiments thereof
- the peptidyl moiety of R 4 and the peptidyl moiety of R 5 are in the same protein.
- a compound having an intramolecular linker may also be referred to as an intramolecularly conjugated biomolecule conjugate or an intramolecularly conjugated biomolecule protein.
- alkyl by itself or as part of another substituent, means, unless otherwise stated, a straight (i.e., unbranched) or branched carbon chain (or carbon), or combination thereof, which may be fully saturated, mono- or polyunsaturated and can include mono-, di- and multivalent radicals.
- the alkyl may include a designated number of carbons (e.g., C1-C10 means one to ten carbons).
- Alkyl is an uncyclized chain.
- saturated hydrocarbon radicals include, but are not limited to, groups such as methyl, ethyl, n-propyl, isopropyl, n-butyl, t-butyl, isobutyl, sec-butyl, methyl, homologs and isomers of, for example, n-pentyl, n-hexyl, n-heptyl, n-octyl, and the like.
- An unsaturated alkyl group is one having one or more double bonds or triple bonds.
- Examples of unsaturated alkyl groups include, but are not limited to, vinyl, 2-propenyl, crotyl, 2-isopentenyl, 2-(butadienyl), 2,4-pentadienyl, 3-(1,4-pentadienyl), ethynyl, 1- and 3-propynyl, 3-butynyl, and the higher homologs and isomers.
- An alkoxy is an alkyl attached to the remainder of the molecule via an oxygen linker (-O-).
- An alkyl moiety may be an alkenyl moiety.
- An alkyl moiety may be an alkynyl moiety.
- An alkyl moiety may be fully saturated.
- An alkenyl may include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds.
- An alkynyl may include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
- alkylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkyl, as exemplified by, e.g., -CH2CH2CH2CH2-.
- an alkyl (or alkylene) group will have from 1 to 24 carbon atoms, with those groups having 10 or fewer carbon atoms being preferred herein.
- a “lower alkyl” or “lower alkylene” is a shorter chain alkyl or alkylene group, generally having eight or fewer carbon atoms.
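The carbon-count conventions above (a designation such as C1-C10, the typical range of 1 to 24 carbons, the preference for 10 or fewer, and "lower" groups of eight or fewer) can be summarized in a brief Python sketch. The function names are hypothetical, and the soft wording of the text ("generally", "preferred") is treated here as fixed numeric limits purely for illustration.

```python
import re


def parse_carbon_range(label: str):
    """Parse a designation like 'C1-C10' into (1, 10)."""
    match = re.fullmatch(r"\s*C(\d+)-C(\d+)\s*", label)
    if match is None:
        raise ValueError(f"not a Cn-Cm designation: {label!r}")
    low, high = int(match.group(1)), int(match.group(2))
    if low > high:
        raise ValueError("lower bound exceeds upper bound")
    return low, high


def describe_chain(n_carbons: int) -> dict:
    """Classify a carbon count against the limits stated in the passage."""
    return {
        "within_typical_range": 1 <= n_carbons <= 24,  # "from 1 to 24 carbon atoms"
        "preferred": 1 <= n_carbons <= 10,             # "10 or fewer ... preferred"
        "lower": 1 <= n_carbons <= 8,                  # "eight or fewer"
    }
```

For example, a nine-carbon chain is within the typical and preferred ranges but would not count as a "lower alkyl" under this reading.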
- alkenylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkene.
- heteroalkyl by itself or in combination with another term, means, unless otherwise stated, a stable straight or branched chain, or combinations thereof, including at least one carbon atom and at least one heteroatom (e.g., O, N, P, Si, and S), and wherein the nitrogen and sulfur atoms may optionally be oxidized, and the nitrogen heteroatom may optionally be quaternized.
- the heteroatom(s) may be placed at any interior position of the heteroalkyl group or at the position at which the alkyl group is attached to the remainder of the molecule.
- Heteroalkyl is an uncyclized chain.
- a heteroalkyl moiety may include one heteroatom.
- a heteroalkyl moiety may include two optionally different heteroatoms.
- a heteroalkyl moiety may include three optionally different heteroatoms.
- a heteroalkyl moiety may include four optionally different heteroatoms.
- a heteroalkyl moiety may include five optionally different heteroatoms.
- a heteroalkyl moiety may include up to 8 optionally different heteroatoms.
- the term “heteroalkenyl,” by itself or in combination with another term, means, unless otherwise stated, a heteroalkyl including at least one double bond.
- a heteroalkenyl may optionally include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds.
- a heteroalkynyl may optionally include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
- heteroalkylene by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from heteroalkyl, as exemplified, but not limited by, -CH2-CH2-S-CH2-CH2- and -CH2-S-CH2-CH2-NH-CH2-.
- heteroatoms can also occupy either or both of the chain termini (e.g., alkyleneoxy, alkylenedioxy, alkyleneamino, alkylenediamino, and the like).
- no orientation of the linking group is implied by the direction in which the formula of the linking group is written.
- heteroalkyl groups include those groups that are attached to the remainder of the molecule through a heteroatom, such as -C(O)R', -C(O)NR', -NR'R", -OR', -SR', and/or -SO2R'.
- where heteroalkyl is recited, followed by recitations of specific heteroalkyl groups, such as -NR'R" or the like, it will be understood that the terms heteroalkyl and -NR'R" are not redundant or mutually exclusive. Rather, the specific heteroalkyl groups are recited to add clarity. Thus, the term “heteroalkyl” should not be interpreted herein as excluding specific heteroalkyl groups, such as -NR'R" or the like.
- “cycloalkyl” and “heterocycloalkyl,” by themselves or in combination with other terms, mean, unless otherwise stated, cyclic versions of “alkyl” and “heteroalkyl,” respectively. Cycloalkyl and heterocycloalkyl are not aromatic. Additionally, for heterocycloalkyl, a heteroatom can occupy the position at which the heterocycle is attached to the remainder of the molecule. Examples of cycloalkyl include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, 1-cyclohexenyl, 3-cyclohexenyl, cycloheptyl, and the like.
- heterocycloalkyl examples include, but are not limited to, 1-(1,2,5,6-tetrahydropyridyl), 1-piperidinyl, 2-piperidinyl, 3-piperidinyl, 4-morpholinyl, 3-morpholinyl, tetrahydrofuran-2-yl, tetrahydrofuran-3-yl, tetrahydrothien-2-yl, tetrahydrothien-3-yl, 1-piperazinyl, 2-piperazinyl, and the like.
- a “cycloalkylene” and a “heterocycloalkylene,” alone or as part of another substituent, means a divalent radical derived from a cycloalkyl and heterocycloalkyl, respectively.
- cycloalkyl means a monocyclic, bicyclic, or a multicyclic cycloalkyl ring system.
- monocyclic ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups can be saturated or unsaturated, but not aromatic.
- cycloalkyl groups are fully saturated. Examples of monocyclic cycloalkyls include cyclopropyl, cyclobutyl, cyclopentyl, cyclopentenyl, cyclohexyl, cyclohexenyl, cycloheptyl, and cyclooctyl.
- Bicyclic cycloalkyl ring systems are bridged monocyclic rings or fused bicyclic rings.
- bridged monocyclic rings contain a monocyclic cycloalkyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3).
- Representative examples of bicyclic ring systems include, but are not limited to, bicyclo[3.1.1]heptane, bicyclo[2.2.
- fused bicyclic cycloalkyl ring systems contain a monocyclic cycloalkyl ring fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocyclyl, or a monocyclic heteroaryl.
- the bridged or fused bicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkyl ring.
- cycloalkyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- the fused bicyclic cycloalkyl is a 5 or 6 membered monocyclic cycloalkyl ring fused to either a phenyl ring, a 5 or 6 membered monocyclic cycloalkyl, a 5 or 6 membered monocyclic cycloalkenyl, a 5 or 6 membered monocyclic heterocyclyl, or a 5 or 6 membered monocyclic heteroaryl, wherein the fused bicyclic cycloalkyl is optionally substituted by one or two groups which are independently oxo or thia.
- multicyclic cycloalkyl ring systems are a monocyclic cycloalkyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the base ring.
- multicyclic cycloalkyl ring systems are a monocyclic cycloalkyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- ring A is a 5-membered monocyclic cycloalkyl, a 5-membered monocyclic heterocycloalkyl, or a 5-membered monocyclic heteroaryl.
- a cycloalkyl is a cycloalkenyl.
- the term “cycloalkenyl” is used in accordance with its plain ordinary meaning.
- a cycloalkenyl is a monocyclic, bicyclic, or a multicyclic cycloalkenyl ring system.
- monocyclic cycloalkenyl ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups are unsaturated (i.e., containing at least one annular carbon-carbon double bond), but not aromatic. Examples of monocyclic cycloalkenyl ring systems include cyclopentenyl and cyclohexenyl.
- bicyclic cycloalkenyl rings are bridged monocyclic rings or fused bicyclic rings.
- bridged monocyclic rings contain a monocyclic cycloalkenyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3).
- Representative examples of bicyclic cycloalkenyls include, but are not limited to, norbornenyl and bicyclo[2.2.2]oct-2-enyl.
- fused bicyclic cycloalkenyl ring systems contain a monocyclic cycloalkenyl ring fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocyclyl, or a monocyclic heteroaryl.
- the bridged or fused bicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkenyl ring.
- cycloalkenyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- multicyclic cycloalkenyl rings contain a monocyclic cycloalkenyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the base ring.
- multicyclic cycloalkenyl rings contain a monocyclic cycloalkenyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- ring A is a 5-membered monocyclic cycloalkyl, a 5-membered monocyclic heterocycloalkyl, or a 5-membered monocyclic heteroaryl.
- a heterocycloalkyl is a heterocyclyl.
- heterocyclyl as used herein, means a monocyclic, bicyclic, or multicyclic heterocycle.
- the heterocyclyl monocyclic heterocycle is a 3, 4, 5, 6 or 7 membered ring containing at least one heteroatom independently selected from the group consisting of O, N, and S where the ring is saturated or unsaturated, but not aromatic.
- the 3 or 4 membered ring contains 1 heteroatom selected from the group consisting of O, N and S.
- the 5 membered ring can contain zero or one double bond and one, two or three heteroatoms selected from the group consisting of O, N and S.
- the 6 or 7 membered ring contains zero, one or two double bonds and one, two or three heteroatoms selected from the group consisting of O, N and S.
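The ring-size rules for a monocyclic heterocycle stated above can be expressed as a small validity check. This Python sketch is illustrative only: the function name is hypothetical, the 5-membered rule ("can contain") is read as a constraint, and no double-bond limit is applied to 3 or 4 membered rings because the passage does not state one.

```python
ALLOWED_HETEROATOMS = {"O", "N", "S"}


def fits_monocyclic_heterocyclyl(ring_size: int, heteroatoms, double_bonds: int) -> bool:
    """Check ring size, heteroatom count, and double-bond count against the passage.

    heteroatoms: list of element symbols for the ring heteroatoms, e.g. ["N", "O"].
    Aromaticity is not modeled; the passage separately requires a non-aromatic ring.
    """
    if ring_size not in (3, 4, 5, 6, 7):
        return False
    if not heteroatoms or any(h not in ALLOWED_HETEROATOMS for h in heteroatoms):
        return False
    count = len(heteroatoms)
    if ring_size in (3, 4):
        # Exactly one heteroatom; no double-bond limit is stated for these sizes.
        return count == 1
    if ring_size == 5:
        return 1 <= count <= 3 and double_bonds in (0, 1)
    # 6 or 7 membered rings: zero, one, or two double bonds.
    return 1 <= count <= 3 and double_bonds in (0, 1, 2)
```

For instance, a piperidine-like ring (six members, one N, no double bonds) and a morpholine-like ring (six members, N and O) pass, while a five-membered ring with two double bonds does not.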
- the heterocyclyl monocyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the heterocyclyl monocyclic heterocycle.
- Representative examples of heterocyclyl monocyclic heterocycles include, but are not limited to, azetidinyl, azepanyl, aziridinyl, diazepanyl, 1,3-dioxanyl, 1,3-dioxolanyl, 1,3-dithiolanyl, 1,3-dithianyl, imidazolinyl, imidazolidinyl.
- piperidinyl, pyranyl, pyrazolinyl, pyrazolidinyl, pyrrolinyl, pyrrolidinyl, tetrahydrofuranyl, tetrahydrothienyl, thiadiazolinyl, thiadiazolidinyl, thiazolinyl, thiazolidinyl, thiomorpholinyl, 1,1-dioxidothiomorpholinyl (thiomorpholine sulfone), thiopyranyl, and trithianyl.
- the heterocyclyl bicyclic heterocycle is a monocyclic heterocycle fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocycle, or a monocyclic heteroaryl.
- the heterocyclyl bicyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the monocyclic heterocycle portion of the bicyclic ring system.
- bicyclic heterocyclyls include, but are not limited to, 2,3-dihydrobenzofuran-2-yl, 2,3-dihydrobenzofuran-3-yl, indolin-1-yl, indolin-2-yl, indolin-3-yl, 2,3-dihydrobenzothien-2-yl, decahydroquinolinyl, decahydroisoquinolinyl, octahydro-1H-indolyl, and octahydrobenzofuranyl.
- heterocyclyl groups are optionally substituted with one or two groups which are independently oxo or thia.
- the bicyclic heterocyclyl is a 5 or 6 membered monocyclic heterocyclyl ring fused to a phenyl ring, a 5 or 6 membered monocyclic cycloalkyl, a 5 or 6 membered monocyclic cycloalkenyl, a 5 or 6 membered monocyclic heterocyclyl, or a 5 or 6 membered monocyclic heteroaryl, wherein the bicyclic heterocyclyl is optionally substituted by one or two groups which are independently oxo or thia.
- Multicyclic heterocyclyl ring systems are a monocyclic heterocyclyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl.
- multicyclic heterocyclyl is attached to the parent molecular moiety through any carbon atom or nitrogen atom contained within the base ring.
- multicyclic heterocyclyl ring systems are a monocyclic heterocyclyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl.
- multicyclic heterocyclyl groups include, but are not limited to 10H-phenothiazin-10-yl, 9,10-dihydroacridin-9-yl, 9,10-dihydroacridin-10-yl, 10H-phenoxazin-10-yl, 10,11-dihydro-5H-dibenzo[b,f
- ring A is a 5-membered monocyclic cycloalkyl, a 5-membered monocyclic heterocycloalkyl, or a 5-membered monocyclic heteroaryl.
- halo by themselves or as part of another substituent, mean, unless otherwise stated, a fluorine, chlorine, bromine, or iodine atom. Additionally, terms such as “haloalkyl” are meant to include monohaloalkyl and polyhaloalkyl.
- halo(C1-C4)alkyl includes, but is not limited to, fluoromethyl, difluoromethyl, trifluoromethyl, 2,2,2-trifluoroethyl, 4-chlorobutyl, 3-bromopropyl, and the like.
- acyl means, unless otherwise stated, -C(O)R where R is a substituted or unsubstituted alkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- aryl means, unless otherwise stated, a polyunsaturated, aromatic, hydrocarbon substituent, which can be a single ring or multiple rings (preferably from 1 to 3 rings) that are fused together (i.e., a fused ring aryl) or linked covalently.
- a fused ring aryl refers to multiple rings fused together wherein at least one of the fused rings is an aryl ring.
- heteroaryl refers to aryl groups (or rings) that contain at least one heteroatom such as N, O, or S, wherein the nitrogen and sulfur atoms are optionally oxidized, and the nitrogen atom(s) are optionally quaternized.
- heteroaryl includes fused ring heteroaryl groups (i.e., multiple rings fused together wherein at least one of the fused rings is a heteroaromatic ring).
- a 5,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 5 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring.
- likewise, a 6,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring.
- a 6,5-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 5 members, and wherein at least one ring is a heteroaryl ring.
- a heteroaryl group can be attached to the remainder of the molecule through a carbon or heteroatom.
- Nonlimiting examples of aryl and heteroaryl groups include phenyl, naphthyl, pyrrolyl.
- pyrazolyl, pyridazinyl, triazinyl, pyrimidinyl, imidazolyl, pyrazinyl, purinyl, oxazolyl, isoxazolyl, thiazolyl, furyl, thienyl, pyridyl, pyrimidyl, benzothiazolyl, benzoxazoyl, benzimidazolyl, benzofuran, isobenzofuranyl, indolyl, isoindolyl, benzothiophenyl, isoquinolyl, quinoxalinyl, quinolyl, 1-naphthyl, 2-naphthyl, 4-biphenyl, 1-pyrrolyl, 2-pyrrolyl, 3-pyrrolyl, 3-pyrazolyl, 2-imidazolyl, 4-imidazolyl, pyrazinyl, 2-oxazolyl, 4-oxazolyl, 2-phenyl-4-ox
- arylene and heteroarylene independently or as part of another substituent, mean a divalent radical derived from an aryl and heteroaryl, respectively.
- a heteroaryl group substituent may be -O- bonded to a ring heteroatom nitrogen.
- ring A is a 5-membered monocyclic cycloalkyl, a 5-membered monocyclic heterocycloalkyl, or a 5-membered monocyclic heteroaryl.
- oxo means an oxygen that is double bonded to a carbon atom.
- alkylsulfonyl means a moiety having the formula -S(O2)-R', where R' is a substituted or unsubstituted alkyl group as defined above. R' may have a specified number of carbons (e.g., “C1-C4 alkylsulfonyl”).
- alkylarylene refers to an arylene moiety covalently bonded to an alkylene moiety (also referred to herein as an alkylene linker).
- An alkylarylene moiety may be substituted (e.g. with a substituent group) on the alkylene moiety or the arylene linker (e.g. at carbons 2, 3, 4, or 6) with halogen, oxo, -N3, -CF3, -CCl3, -CBr3, -CI3, -CN, -CHO, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO2CH3, -SO3H, -OSO3H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, substituted or unsubstituted C1-C5 alkyl or substituted or unsubstituted 2 to 5 membered heteroalkyl.
- the alkylarylene is unsubstituted.
- R'", and R" each preferably independently refer to hydrogen, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl (e.g., aryl substituted with 1-3 halogens), substituted or unsubstituted heteroaryl, substituted or unsubstituted alkyl, alkoxy, or thioalkoxy groups, or arylalkyl groups.
- each of the R groups is independently selected as are each R', R", R'", and R"" group when more than one of these groups is present.
- R' and R" are attached to the same nitrogen atom, they can be combined with the nitrogen atom to form a 4-, 5-, 6-, or 7-membered ring.
- -NR'R" includes, but is not limited to, 1-pyrrolidinyl and 4-morpholinyl.
- alkyl is meant to include groups including carbon atoms bound to groups other than hydrogen groups, such as haloalkyl (e.g., -CF3 and -CH2CF3) and acyl (e.g., -C(O)CH3, -C(O)CF3, -C(O)CH2OCH3, and the like).
- -NR'OR in a number ranging from zero to the total number of open valences on the aromatic ring system; and where R', R", R'", and R"" are preferably independently selected from hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, and substituted or unsubstituted heteroaryl.
- each of the R groups is independently selected as are each R', R", R'", and R"" groups when more than one of these groups
- Substituents for rings may be depicted as substituents on the ring rather than on a specific atom of a ring (commonly referred to as a floating substituent).
- the substituent may be attached to any of the ring atoms (obeying the rules of chemical valency) and in the case of fused rings or spirocyclic rings, a substituent depicted as associated with one member of the fused rings or spirocyclic rings (a floating substituent on a single ring), may be a substituent on any of the fused rings or spirocyclic rings (a floating substituent on multiple rings).
- the multiple substituents may be on the same atom, same ring, different atoms, different fused rings, different spirocyclic rings, and each substituent may optionally be different.
- a point of attachment of a ring to the remainder of a molecule is not limited to a single atom (a floating substituent)
- the attachment point may be any atom of the ring and in the case of a fused ring or spirocyclic ring, any atom of any of the fused rings or spirocyclic rings while obeying the rules of chemical valency.
- Where a ring, fused rings, or spirocyclic rings contain one or more ring heteroatoms and the ring, fused rings, or spirocyclic rings are shown with one or more floating substituents (including, but not limited to, points of attachment to the remainder of the molecule), the floating substituents may be bonded to the heteroatoms.
- Where the ring heteroatoms are shown bound to one or more hydrogens (e.g., a ring nitrogen with two bonds to ring atoms and a third bond to a hydrogen) in the structure or formula with the floating substituent, when the heteroatom is bonded to the floating substituent, the substituent will be understood to replace the hydrogen, while obeying the rules of chemical valency.
- Two or more substituents may optionally be joined to form aryl, heteroaryl, cycloalkyl, or heterocycloalkyl groups.
- Such so-called ring-forming substituents are typically, though not necessarily, found attached to a cyclic base structure.
- the ring-forming substituents are attached to adjacent members of the base structure.
- two ring-forming substituents attached to adjacent members of a cyclic base structure create a fused ring structure.
- the ring-forming substituents are attached to a single member of the base structure.
- two ring-forming substituents attached to a single member of a cyclic base structure create a spirocyclic structure.
- the ring-forming substituents are attached to non-adjacent members of the base structure.
- Two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally form a ring of the formula -T-C(O)-(CRR')q-U-, wherein T and U are independently -NR-, -O-, -CRR'-, or a single bond, and q is an integer of from 0 to 3.
- two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula -A-(CH2)r-B-, wherein A and B are independently -CRR'-, -O-, -NR-, -S-, -S(O)-, -S(O)2-, -S(O)2NR'-, or a single bond, and r is an integer of from 1 to 4.
- One of the single bonds of the new ring so formed may optionally be replaced with a double bond.
- two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula -(CRR')s-X'-(C"R"R"')d-, where s and d are independently integers of from 0 to 3, and X' is -O-, -NR'-, -S-, -S(O)-, -S(O)2-, or -S(O)2NR'-.
- R, R', R", and R'" are preferably independently selected from hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, and substituted or unsubstituted heteroaryl.
- The terms “heteroatom” or “ring heteroatom” are meant to include oxygen (O), nitrogen (N), sulfur (S), phosphorus (P), and silicon (Si).
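The ring-forming formulas above fix how many atoms the new fused ring can contain. The following is a minimal sketch of that counting, not part of the disclosure: it assumes the new ring consists of the two adjacent aryl or heteroaryl atoms plus the atoms of the bridging substituent, and the names `RING_ATOMS`, `ring_size_tcu`, and `ring_size_ab` are hypothetical.

```python
# Ring atoms contributed by each allowed value of T, U, A, or B.
# (Illustrative bookkeeping only; "bond" stands for a single bond.)
RING_ATOMS = {
    "bond": 0,
    "-NR-": 1,
    "-O-": 1,
    "-CRR'-": 1,
    "-S-": 1,
    "-S(O)-": 1,
    "-S(O)2-": 1,
    "-S(O)2NR'-": 2,  # both the sulfur and the nitrogen sit in the ring
}

TU_CHOICES = ("-NR-", "-O-", "-CRR'-", "bond")


def ring_size_tcu(t, q, u):
    """Ring size when -T-C(O)-(CRR')q-U- bridges two adjacent ring atoms."""
    if t not in TU_CHOICES or u not in TU_CHOICES:
        raise ValueError("T and U must be -NR-, -O-, -CRR'-, or a single bond")
    if not 0 <= q <= 3:
        raise ValueError("q must be an integer from 0 to 3")
    # 2 adjacent ring atoms + T + the carbonyl carbon + q carbons + U
    return 2 + RING_ATOMS[t] + 1 + q + RING_ATOMS[u]


def ring_size_ab(a, r, b):
    """Ring size when -A-(CH2)r-B- bridges two adjacent ring atoms."""
    if a not in RING_ATOMS or b not in RING_ATOMS:
        raise ValueError("unknown A or B group")
    if not 1 <= r <= 4:
        raise ValueError("r must be an integer from 1 to 4")
    # 2 adjacent ring atoms + A + r methylene carbons + B
    return 2 + RING_ATOMS[a] + r + RING_ATOMS[b]
```

For instance, under these assumptions `ring_size_ab("-O-", 1, "-O-")` counts the five atoms of a methylenedioxy-type fused ring.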
- a “substituent group,” as used herein, means a group selected from the following moieties:
- C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl, 5 to 9 membered heteroaryl, or 5 to 6 membered heteroaryl), and
- -OCHI2, -OCHF2, unsubstituted alkyl (e.g., C1-C8 alkyl, C1-C6 alkyl, or C1-C4 alkyl), unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl
- C1-C8 alkyl, C1-C6 alkyl, or C1-C4 alkyl), unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl,
- -OCI3, -OCHCl2,
- -OCHBr2, -OCHI2, -OCHF2, unsubstituted alkyl (e.g., C1-C8 alkyl, C1-C6 alkyl, or C1-C4 alkyl), unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-
- a “size-limited substituent” or “size-limited substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C20 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl.
- a “lower substituent” or “lower substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl,
- each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl
- each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl
- each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl.
- each substituted group described in the compounds herein is substituted with at least one substituent group. More specifically, in embodiments, each substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene described in the compounds herein are substituted with at least one substituent group. In embodiments, at least one or all of these groups are substituted with at least one size-limited substituent group. In embodiments, at least one or all of these groups are substituted with at least one lower substituent group.
- each substituted or unsubstituted alkyl may be a substituted or unsubstituted C1-C20 alkyl
- each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl
- each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl
- each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl
- each substituted or unsubstituted aryl is a substituted or unsubstituted C 6 -C 10 aryl
- each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl.
- each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C20 alkylene
- each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 20 membered heteroalkylene
- each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C 3 -C 8 cycloalkylene
- each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 8 membered heterocycloalkylene
- each substituted or unsubstituted arylene is a substituted or unsubstituted C6-C10 arylene
- each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 10 membered heteroarylene.
- each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl
- each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl
- each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl
- each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl
- each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl
- each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl.
- each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C8 alkylene
- each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 8 membered heteroalkylene
- each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C3-C7 cycloalkylene
- each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 7 membered heterocycloalkylene
- each substituted or unsubstituted arylene is a substituted or unsubstituted C6-C10 arylene
- each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 9 membered heteroarylene.
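The two tiers of size limits listed above differ only in their numeric ranges, so they can be compared side by side as data. The sketch below is illustrative only: the dictionary name `SIZE_RANGES`, the tier labels, and the helper `fits` are hypothetical, and the numbers are simply the carbon counts or ring/chain member counts stated in the text.

```python
# (min, max) carbon count or member count for each group type, per tier.
# "size_limited" and "lower" are labels chosen for this sketch.
SIZE_RANGES = {
    "size_limited": {
        "alkyl": (1, 20),
        "heteroalkyl": (2, 20),
        "cycloalkyl": (3, 8),
        "heterocycloalkyl": (3, 8),
        "aryl": (6, 10),
        "heteroaryl": (5, 10),
    },
    "lower": {
        "alkyl": (1, 8),
        "heteroalkyl": (2, 8),
        "cycloalkyl": (3, 7),
        "heterocycloalkyl": (3, 7),
        "aryl": (6, 10),
        "heteroaryl": (5, 9),
    },
}


def fits(tier, group, size):
    """Return True if a group of the given size falls inside the tier's range."""
    low, high = SIZE_RANGES[tier][group]
    return low <= size <= high
```

Under this reading, a C12 alkyl satisfies the size-limited range but not the lower range, while every group that satisfies the lower range also satisfies the size-limited one.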
- a substituted or unsubstituted moiety e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is unsubstituted (e.g., is an unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted
- a substituted or unsubstituted moiety e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is substituted (e.g., is a substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted al
- a substituted moiety e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene
- is substituted with at least one substituent group wherein if the substituted moiety is substituted with a plurality of substituent groups, each substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of substituent groups, each substituent group is different.
- a substituted moiety e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene
- is substituted with at least one size-limited substituent group wherein if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group is different.
- a substituted moiety e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene
- is substituted with at least one lower substituent group wherein if the substituted moiety is substituted with a plurality of lower substituent groups, each lower substituent group may optionally be different.
- each lower substituent group is different.
- a substituted moiety e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene
- the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or
- Certain compounds described herein possess asymmetric carbon atoms (optical or chiral centers) or double bonds; the enantiomers, racemates, diastereomers, tautomers, geometric isomers, stereoisomeric forms that may be defined, in terms of absolute stereochemistry, as (R)- or (S)- or, as (D)- or (L)- for amino acids, and individual isomers are encompassed within the scope of the present disclosure.
- the compounds of the present disclosure do not include those that are known in the art to be too unstable to synthesize and/or isolate.
- the present disclosure is meant to include compounds in racemic and optically pure forms.
- Optically active (R)- and (S)-, or (D)- and (L)-isomers may be prepared using chiral synthons or chiral reagents, or resolved using conventional techniques.
- the compounds described herein contain olefinic bonds or other centers of geometric asymmetry, and unless specified otherwise, it is intended that the compounds include both E and Z geometric isomers.
- the term “isomers” refers to compounds having the same number and kind of atoms, and hence the same molecular weight, but differing in respect to the structural arrangement or configuration of the atoms.
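The definition of isomers above rests on two compounds having the same number and kind of atoms. As a rough illustration of that necessary condition only (it says nothing about how the atoms are arranged), the sketch below compares atom counts parsed from simple molecular formulas. The function names and the restriction to formulas without parentheses or charges are assumptions made for this example.

```python
import re
from collections import Counter


def atom_counts(formula):
    """Count atoms in a simple formula such as 'C2H6O' or 'CH3OCH3'.

    Assumes element symbols followed by optional integer counts,
    with no parentheses, charges, or isotope labels.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def same_composition(formula_a, formula_b):
    """True if both formulas contain the same number and kind of atoms."""
    return atom_counts(formula_a) == atom_counts(formula_b)
```

Ethanol written as `CH3CH2OH` and dimethyl ether written as `CH3OCH3` pass this check, which is consistent with their being structural isomers; a passing check does not by itself establish that two compounds are distinct isomers rather than the same compound.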
- tautomer refers to one of two or more structural isomers which exist in equilibrium and which are readily converted from one isomeric form to another. It will be apparent to one skilled in the art that certain compounds of this disclosure may exist in tautomeric forms, all such tautomeric forms of the compounds being within the scope of the disclosure. Unless otherwise stated, structures depicted herein are also meant to include all stereochemical forms of the structure; i.e., the R and S configurations for each asymmetric center. Therefore, single stereochemical isomers (stereoisomers) as well as enantiomeric and diastereomeric mixtures of the present compounds are within the scope of the disclosure.
- the compounds described herein may also contain unnatural proportions of atomic isotopes at one or more of the atoms that constitute such compounds.
- the compounds may be radiolabeled with radioactive isotopes, such as for example tritium (³H), iodine-125 (¹²⁵I), or carbon-14 (¹⁴C). All isotopic variations of the compounds described herein, whether radioactive or not, are encompassed within the scope of the present disclosure.
- an analog is used in accordance with its plain ordinary meaning within Chemistry and Biology and refers to a chemical compound that is structurally similar to another compound (i.e., a so-called “reference” compound) but differs in composition, e.g., in the replacement of one atom by an atom of a different element, or in the presence of a particular functional group, or the replacement of one functional group by another functional group, or the absolute stereochemistry of one or more chiral centers of the reference compound. Accordingly, an analog is a compound that is similar or comparable in function and appearance but not in structure or origin to a reference compound.
- The terms “a” or “an,” as used herein, mean one or more.
- The phrase “substituted with a[n]” means the specified group may be substituted with one or more of any or all of the named substituents.
- a group such as an alkyl or heteroaryl group
- the group may contain one or more unsubstituted C1-C20 alkyls, and/or one or more unsubstituted 2 to 20 membered heteroalkyls.
- Where a moiety is substituted with an R substituent, the group may be referred to as “R-substituted.” Where a moiety is R-substituted, the moiety is substituted with at least one R substituent and each R substituent is optionally different. Where a particular R group is present in the description of a chemical genus (such as Formula (I)), a Roman alphabetic symbol may be used to distinguish each appearance of that particular R group. For example, where multiple R³ substituents are present, each R³ substituent may be distinguished as R³ᴬ, R³ᴮ, wherein each of R³ᴬ, R³ᴮ is defined within the scope of the definition of R³ and optionally differently.
- variable e.g., moiety or linker
- a compound or of a compound genus e.g., a genus described herein
- the unfilled valence(s) of the variable will be dictated by the context in which the variable is used.
- When a variable of a compound as described herein is connected (e.g., bonded) to the remainder of the compound through a single bond, that variable is understood to represent a monovalent form (i.e., capable of forming a single bond due to an unfilled valence) of a standalone compound (e.g., if the variable is named “methane” in an embodiment but the variable is known to be attached by a single bond to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is actually a monovalent form of methane, i.e., methyl or -CH3).
- linker variable e.g., L 1
- variable is the divalent form of a standalone compound (e.g., if the variable is assigned to “PEG” or “polyethylene glycol” in an embodiment but the variable is connected by two separate bonds to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is a divalent (i.e., capable of forming two bonds through two unfilled valences) form of PEG instead of the standalone compound PEG).
- bond refers to direct bonds, such as covalent bonds (e.g., direct or a linking group), or indirect bonds, such as non-covalent bond (e.g., electrostatic interactions (e.g., ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g., dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions, and the like).
- The terms “bioconjugate” and “bioconjugate linker” refer to the resulting association between atoms or molecules of “bioconjugate reactive groups” or “bioconjugate reactive moieties”.
- the association can be direct or indirect.
- a conjugate between a first bioconjugate reactive group e.g., -NH2, -C(O)OH, -N-hydroxy succinimide, or -maleimide
- a second bioconjugate reactive group e.g., sulfhydryl, sulfur-containing amino acid, amine, amine sidechain containing amino acid, or carboxylate
- covalent bond or linker e.g.
- bioconjugates or bioconjugate linkers are formed using bioconjugate chemistry (i.e.
- bioconjugate reactive groups including, but not limited to, nucleophilic substitutions (e.g., reactions of amines and alcohols with acyl halides, active esters), electrophilic substitutions (e.g., enamine reactions) and additions to carbon-carbon and carbon-heteroatom multiple bonds (e.g., Michael reaction, Diels-Alder addition).
- the first bioconjugate reactive group e.g., unnatural amino acid side chain
- the second bioconjugate reactive group e.g., a hydroxyl group
- The term “electron-withdrawing group” refers to a chemical moiety or substituent that removes electron density from a conjugated pi-electron system, thereby making the pi-electron system more electrophilic.
- The term “electron-donating group” refers to a chemical moiety or substituent that can donate electron density into a conjugated pi-electron system, thereby making the pi-electron system more nucleophilic.
- The terms “bind” and “bound” as used herein are used in accordance with their plain and ordinary meaning and refer to the association between atoms or molecules.
- the association can be direct or indirect.
- bound atoms or molecules may be bound, e.g., by covalent bond, linker (e.g. a first linker or second linker), or non-covalent bond (e.g. electrostatic interactions (e.g. ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g. dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions and the like).
- the term “capable of binding” as used herein refers to a moiety (e.g., a single-domain antibody or a recombinant protein as described herein, i.e., comprising an unnatural amino acid side chain that is capable of binding to an amino acid residue on a different protein) that is able to measurably bind to a target.
- the moiety is capable of binding with a Kd of less than about 10 µM, 5 µM, 1 µM, 500 nM, 250 nM, 100 nM, 75 nM, 50 nM, 25 nM, 15 nM, 10 nM, 5 nM, 1 nM, or about 0.1 nM.
- the compounds of Formula (I), i.e., bioreactive unnatural amino acids, facilitate formation of chemically reactive amino acids with proximal target amino acid residues by undergoing a click chemistry reaction (e.g., sulfur-fluoride exchange reaction (SuFEx)).
- the compounds of Formula (I) may be inserted into or replace an amino acid in a naturally occurring protein, thereby endowing the protein with the ability to form a chemically reactive amino acid with proximally positioned target functional groups (e.g., a hydroxyl group in RNA) or amino acid residues (e.g., serine, threonine, tyrosine) with other proteins.
- the compound of Formula (I) may be used to facilitate the formation of chemically reactive amino acids in proteins in both in vitro and in vivo conditions.
- the bioreactive unnatural amino acids of Formula (I) are useful for forming chemically reactive amino acid residues that can be further chemically modified.
- the compounds of Formula (I) have shown excellent chemical functionality (i.e., superior properties) compared to previously described bioreactive unnatural amino acids.
- the compounds of Formula (I) are stable, nontoxic and nonreactive inside cells, yet when placed in proximity to target amino acid residues (e.g., serine, threonine, tyrosine) or reactive moieties (e.g., a hydroxyl group in RNA) they become reactive under cellular conditions.
- the compounds of Formula (I) are able to react with target amino acid residues (e.g., serine, threonine, tyrosine) or other reactive moieties (e.g., a hydroxyl group in RNA) with great selectivity via proximity-enabled SuFEx reaction within and between proteins and RNA under physiological conditions.
- ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl
- L 4 is a bond or -O-
- x is an integer from 0 to 8
- L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene
- R¹ is hydrogen, halogen, -CX¹₃, -CHX¹₂, -CH₂X¹, -OCX¹₃, -OCH₂X¹, -OCHX¹₂, -CN, -SOₙ₁R¹ᴬ, -SOᵥ₁NR¹ᴬR¹ᴮ, -NHC(O)NR¹ᴬR¹ᴮ, -N(O)ₘ₁, -NR¹ᴬR¹ᴮ, -C(O)R¹ᴬ,
- R¹ is ortho or meta to -S(O₂)F. In embodiments, R¹ is meta to -S(O₂)F. In embodiments, R¹ is ortho to -S(O₂)F. In embodiments, R¹ is hydrogen, halogen, -CX¹₃, -CHX¹₂, -CH₂X¹, -OCX¹₃, -OCH₂X¹, -OCHX¹₂, -CN, -SOₙ₁R¹ᴬ, -SOᵥ₁NR¹ᴬR¹ᴮ, -NHC(O)NR¹ᴬR¹ᴮ, -N(O)ₘ₁, -NR¹ᴬR¹ᴮ, -C(O)R¹ᴬ, -C(O)-OR¹ᴬ, -C(O)NR¹ᴬR¹ᴮ, -OR¹ᴬ, -NR¹ᴬSO₂R¹ᴮ, -NR¹ᴬC(O)R¹ᴮ, -NR¹ᴬC(O
- ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl;
- x is an integer from 0 to 8;
- L¹ is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene;
- R¹ is hydrogen, halogen, -CX¹₃, -CHX¹₂, -CH₂X¹, -OCX¹₃, -OCH₂X¹, -OCHX¹₂, -CN, -SOₙ₁R¹ᴬ, -SOᵥ₁NR¹ᴬR¹ᴮ, -NHC(O)NR¹ᴬR¹ᴮ, -N(O)ₘ₁, -NR¹ᴬR¹ᴮ, -C(O)R¹ᴬ, -C(O)-OR¹ᴬ, -C(O)
- R¹ is ortho or meta to -S(O₂)F. In embodiments, R¹ is meta to -S(O₂)F. In embodiments, R¹ is ortho to -S(O₂)F. In embodiments, R¹ is hydrogen, halogen, -CX¹₃, -CHX¹₂, -CH₂X¹, -OCX¹₃, -OCH₂X¹, -OCHX¹₂, -CN, -SOₙ₁R¹ᴬ, -SOᵥ₁NR¹ᴬR¹ᴮ, -NHC(O)NR¹ᴬR¹ᴮ, -N(O)ₘ₁, -NR¹ᴬR¹ᴮ, -C(O)R¹ᴬ, -C(O)-OR¹ᴬ, -C(O)NR¹ᴬR¹ᴮ, -OR¹ᴬ, -NR¹ᴬSO₂R¹ᴮ, -NR¹ᴬC(O)R¹ᴮ, -NR¹ᴬC(O)OR¹ᴮ, -NR¹ᴬOR¹ᴮ, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl;
- R¹ᴬ is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl;
- R¹ᴮ is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
- ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl
- x is an integer from 0 to 8
- L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene.
- ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl.
- ring A is a 5-membered cycloalkyl.
- ring A is a 5-membered heterocycloalkyl.
- ring A is a 5-membered heterocycloalkyl having no double bonds.
- ring A is a 5-membered heterocycloalkyl having one double bond.
- ring A is a 5-membered heteroaryl. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 4 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
- ring A is a 5-membered heteroaryl containing 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole. In embodiments, ring A is pyrrole. In embodiments, ring A is pyrazole. In embodiments, ring A is imidazole. In embodiments, ring A is triazole.
- ring A is furan. In embodiments, ring A is thiophene. In embodiments, ring A is phosphole. In embodiments, ring A is oxazole. In embodiments, ring A is isoxazole. In embodiments, ring A is thiazole. In embodiments, ring A is isothiazole. In embodiments, L 1 is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L 1 is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl.
- the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a heteroatom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L 1 is attached to a heteroatom in the 5-membered heteroaryl, and the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl.
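The named options for ring A can be grouped by how many ring heteroatoms each contains, which mirrors the "1 heteroatom", "2 heteroatoms", and "3 heteroatoms" embodiments above. The sketch below is illustrative only: the table `RING_A_HETEROATOMS` and the helper `rings_with` are hypothetical names, and the heteroatom assignments are the standard compositions of the parent rings.

```python
# Ring heteroatoms of each named 5-membered heteroaryl (one letter per atom).
RING_A_HETEROATOMS = {
    "pyrrole": "N",
    "pyrazole": "NN",
    "imidazole": "NN",
    "triazole": "NNN",
    "furan": "O",
    "thiophene": "S",
    "phosphole": "P",  # phosphorus is outside the O/N/S group
    "oxazole": "NO",
    "isoxazole": "NO",
    "thiazole": "NS",
    "isothiazole": "NS",
}


def rings_with(count, allowed="ONS"):
    """Named rings having exactly `count` heteroatoms, all drawn from `allowed`."""
    return sorted(
        name
        for name, atoms in RING_A_HETEROATOMS.items()
        if len(atoms) == count and set(atoms) <= set(allowed)
    )
```

With the default `allowed="ONS"`, phosphole is filtered out because its heteroatom is phosphorus; passing `allowed="ONSP"` includes it among the one-heteroatom rings.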
- L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- R 1 is halogen.
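The L 1 embodiments above combine four carbonyl-containing head groups with a methylene chain of length y = 0, 1, or 2, giving twelve linker patterns in total. The sketch below enumerates them as plain strings; the names `HEADS`, `linker`, and `all_linkers`, and the choice to omit the chain entirely when y is 0, are assumptions made for illustration.

```python
# Head groups taken from the four L1 formulas in the text.
HEADS = ("-NH-C(O)-", "-NH-C(O)-O-", "-NH-C(O)-NH-", "-NH-C(O)-S-")


def linker(head, y):
    """Write out head-(CH2)y- for a given y in 0..2 as a string."""
    if head not in HEADS:
        raise ValueError("unknown head group")
    if y not in (0, 1, 2):
        raise ValueError("y must be an integer from 0 to 2")
    if y == 0:
        chain = ""  # no methylene units
    elif y == 1:
        chain = "CH2-"
    else:
        chain = f"(CH2){y}-"
    return head + chain


def all_linkers():
    """Every head group paired with every allowed y."""
    return [linker(head, y) for head in HEADS for y in range(3)]
```

For example, `linker("-NH-C(O)-O-", 2)` returns `-NH-C(O)-O-(CH2)2-`, the carbamate-type linker with two methylene units.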
- L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- R 1 is halogen.
- L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- R 1 is halogen.
- L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- R 1 is halogen.
- L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
- L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
- L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
- L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II): wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L 4 is a bond or -O-; x is an integer from 0 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 .
- R 1 is ortho or meta to -S(O 2 )F. In embodiments, R 1 is meta to -S(O 2 )F. In embodiments, R 1 is ortho to -S(O 2 )F. In embodiments, R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , ... -N(O) m1 , ... -NR 1A C(O)R 1B .
- R 1A is hydrogen, unsubstituted C 1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
- R 1B is hydrogen, unsubstituted C 1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-1): wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; x is an integer from 0 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , -NR 1A OR 1B , substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X 1 is independently -F, -Cl, -Br, or -I; R 1A is hydrogen, substituted or unsubstituted
- R 1 is ortho or meta to -S(O 2 )F. In embodiments, R 1 is meta to -S(O 2 )F. In embodiments, R 1 is ortho to -S(O 2 )F. In embodiments, R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A .
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-2): wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; x is an integer from 0 to 8; and L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene.
- Formula (II-2) wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; x is an integer from 0 to 8; and L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene.
- ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl.
- ring A is a 5-membered cycloalkyl.
- ring A is a 5-membered heterocycloalkyl.
- ring A is a 5-membered heterocycloalkyl having no double bonds.
- ring A is a 5-membered heterocycloalkyl having one double bond.
- ring A is a 5-membered heteroaryl. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 4 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
- ring A is a 5-membered heteroaryl containing 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole. In embodiments, ring A is pyrrole. In embodiments, ring A is pyrazole. In embodiments, ring A is imidazole. In embodiments, ring A is triazole.
- ring A is furan. In embodiments, ring A is thiophene. In embodiments, ring A is phosphole. In embodiments, ring A is oxazole. In embodiments, ring A is isoxazole. In embodiments, ring A is thiazole. In embodiments, ring A is isothiazole. In embodiments, L 1 is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L 1 is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl.
- the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a heteroatom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl.
- L 1 is attached to a heteroatom in the 5-membered heteroaryl, and the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-3): wherein x, L 1 , and R 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- R 1 is halogen.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-4): wherein x, L 1 , and R 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R 1 is halogen.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-5): wherein x, L 1 , and R 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- R 1 is halogen.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-6): wherein x, L 1 , and R 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- R 1 is halogen.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-7): wherein x and L 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-8): wherein x and L 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-9): wherein x and L 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-10): wherein x and L 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1.
- y is 2.
- x is an integer from 0 to 6.
- x is an integer from 2 to 6.
- x is 4.
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-11):
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-12):
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-13):
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-14):
- proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-15):
- the protein is an antibody or an antibody variant. In embodiments, the protein is an antibody. In embodiments, the protein is an antibody variant. In embodiments, the antibody variant is a variant as defined herein. In embodiments, the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment. In embodiments, the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the unnatural amino acid is within a CDR region or a framework region of the antibody.
- the unnatural amino acid is within a CDR region of the antibody. In embodiments, the unnatural amino acid is within a framework region of the antibody. In embodiments, the unnatural amino acid is within a CDR region or a framework region of the antibody variant. In embodiments, the unnatural amino acid is within a CDR region of the antibody variant. In embodiments, the unnatural amino acid is within a framework region of the antibody variant.
- the protein is a receptor protein.
- the receptor protein is a programmed death-ligand 1 (PD-L1) receptor, a programmed cell death protein 1 (PD-1) receptor, a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin
- the receptor protein is an integrin. In embodiments, the receptor protein is a somatostatin receptor. In embodiments, the receptor protein is a gonadotropin-releasing hormone receptor. In embodiments, the receptor protein is a bombesin receptor. In embodiments, the receptor protein is a vasoactive intestinal peptide receptor. In embodiments, the receptor protein is a neurotensin receptor. In embodiments, the receptor protein is a cholecystokinin 2 receptor. In embodiments, the receptor protein is a melanocortin receptor. In embodiments, the receptor protein is a ghrelin receptor.
- the receptor protein is a PD-L1 receptor or a PD-1 receptor. In embodiments, the receptor protein is a PD-L1 receptor. In embodiments, the receptor protein is a PD-1 receptor.
- the receptor protein is a receptor expressed on a cancer cell. In embodiments, the receptor protein is a receptor overexpressed on a cancer cell relative to a control.
- the receptor protein is a G protein-coupled receptor.
- the receptor protein is a receptor tyrosine kinase.
- the receptor protein is an ErbB receptor.
- the receptor protein is an epidermal growth factor receptor (EGFR).
- the receptor protein is epidermal growth factor receptor 1 (HER1).
- the receptor protein is epidermal growth factor receptor 2 (HER2).
- the receptor protein is epidermal growth factor receptor 3 (HER3).
- the receptor protein is epidermal growth factor receptor 4 (HER4).
- the protein is a cell surface receptor.
- the cell surface receptor is in the extracellular domain, the transmembrane domain, or the intracellular domain.
- the protein is a cytosolic protein.
- the protein is a transcriptional factor.
- the protein is an enzyme.
- the protein further comprises a detectable agent or a therapeutic agent. In embodiments, the protein further comprises a detectable agent and a therapeutic agent. In embodiments, the protein further comprises a detectable agent. In embodiments, the detectable agent is a radioisotope. In embodiments, the protein further comprises a therapeutic agent.
- R 4 and R 5 are each independently a peptidyl moiety, a carbohydrate moiety, or a nucleic acid moiety;
- ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl;
- L 4 is a bond or -O-;
- x is an integer from 0 to 8;
- L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene;
- L 2 is a bond, -NR 2A -, -S-.
- L 3 is a bond, -N(R 3A )-, -S-, -S(O) 2 -, -O-, -C(S)-, -C(O)-, -C(O)O-.
- R 3A , and R 3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl;
- R 1 is hydrogen, halogen, -CX 1 3 .
- X 1 is independently -F, -Cl, -Br, or -I;
- R 1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
- R 1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
- n1 is an integer from 0 to 4;
- m1 is 1 or 2;
- v1 is 1 or 2.
- R 1 is meta or ortho to the carbon atom linked to -L 4 S(O2)L 3 R 5 .
- R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A , -SO v1 NR 1A R 1B , -NHC(O)NR 1A R 1B , -N(O) m1 , -NR 1A R 1B , -C(O)R 1A , -C(O)-OR 1A , -C(O)NR 1A R 1B , -OR 1A , -NR 1A SO 2 R 1B , -NR 1A C(O)R 1B .
- R 1A is hydrogen, unsubstituted C 1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
- R 1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
- ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl.
- ring A is a 5-membered cycloalkyl.
- ring A is a 5-membered heterocycloalkyl.
- ring A is a 5-membered heterocycloalkyl having no double bonds.
- ring A is a 5-membered heterocycloalkyl having one double bond.
- ring A is a 5-membered heteroaryl.
- ring A is a 5-membered heteroaryl containing 1 to 4 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
- ring A is a 5-membered heteroaryl containing 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole. In embodiments, ring A is pyrrole. In embodiments, ring A is pyrazole. In embodiments, ring A is imidazole. In embodiments, ring A is triazole.
- ring A is furan. In embodiments, ring A is thiophene. In embodiments, ring A is phosphole. In embodiments, ring A is oxazole. In embodiments, ring A is isoxazole. In embodiments, ring A is thiazole. In embodiments, ring A is isothiazole. In embodiments, L 1 is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L 1 is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl.
- the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a heteroatom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a carbon atom in the 5-membered heteroaryl.
- L 1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl.
- L 1 is attached to a heteroatom in the 5-membered heteroaryl, and the -S(O 2 )F moiety is attached to a heteroatom in the 5-membered heteroaryl.
- R 4 , R 5 , x, L 1 , L 2 , L 3 , and R 1 are as defined herein.
- L 1 is substituted or unsubstituted alkylene.
- L 1 is substituted or unsubstituted C 1-4 alkylene.
- L 1 is substituted or unsubstituted heteroalkylene.
- L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- y is 0.
- y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-. In embodiments, -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-O-. In embodiments, -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-NH-. In embodiments, -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-S-.
- R 4 and R 5 are as defined herein.
- R 4 and R 5 are each independently a peptidyl moiety.
- the peptidyl moiety of R 4 comprises an antibody; and the peptidyl moiety of R 5 comprises a protein.
- the peptidyl moiety of R 4 comprises an antibody; and the peptidyl moiety of R 5 comprises a protein, wherein the protein is the target of the antibody.
- the peptidyl moiety of R 4 comprises an antibody variant; and the peptidyl moiety of R 5 comprises a protein.
- the peptidyl moiety of R 4 comprises an antibody variant; and the peptidyl moiety of R 5 comprises a protein, wherein the protein is the target of the antibody variant.
- the peptidyl moiety of R 4 comprises a protein; and the peptidyl moiety of R 5 comprises an antibody or an antibody variant.
- the peptidyl moiety of R 4 comprises a protein; and the peptidyl moiety of R 5 comprises an antibody or an antibody variant, wherein the protein is the target of the antibody or antibody variant.
- the peptidyl moiety of R 4 or R 5 is an antibody or an antibody variant.
- the peptidyl moiety of R 4 or R 5 is an antibody.
- the peptidyl moiety of R 4 or R 5 is an antibody variant.
- the antibody variant is a variant as defined herein.
- the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
- the antibody variant is a single-chain variable fragment.
- the antibody variant is a single-domain antibody.
- the antibody variant is an affibody.
- the antibody variant is an antigen-binding fragment.
- the unnatural amino acid is within a CDR region or a framework region of the antibody. In embodiments, the unnatural amino acid is within a CDR region of the antibody. In embodiments, the unnatural amino acid is within a framework region of the antibody. In embodiments, the unnatural amino acid is within a CDR region or a framework region of the antibody variant. In embodiments, the unnatural amino acid is within a CDR region of the antibody variant. In embodiments, the unnatural amino acid is within a framework region of the antibody variant.
- the peptidyl moiety of R 4 or R 5 is a receptor protein.
- the receptor protein is a programmed death-ligand 1 (PD-L1) receptor, a programmed cell death protein 1 (PD-1) receptor, a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a gal
- the receptor protein is an integrin. In embodiments, the receptor protein is a somatostatin receptor. In embodiments, the receptor protein is a gonadotropin-releasing hormone receptor. In embodiments, the receptor protein is a bombesin receptor. In embodiments, the receptor protein is a vasoactive intestinal peptide receptor. In embodiments, the receptor protein is a neurotensin receptor. In embodiments, the receptor protein is a cholecystokinin 2 receptor. In embodiments, the receptor protein is a melanocortin receptor. In embodiments, the receptor protein is a ghrelin receptor.
- the receptor protein is a PD-L1 receptor or a PD-1 receptor. In embodiments, the receptor protein is a PD-L1 receptor. In embodiments, the receptor protein is a PD-1 receptor.
- the receptor protein is a receptor expressed on a cancer cell. In embodiments, the receptor protein is a receptor overexpressed on a cancer cell relative to a control.
- the receptor protein is a G protein-coupled receptor.
- the receptor protein is a receptor tyrosine kinase.
- the receptor protein is an ErbB receptor.
- the receptor protein is an epidermal growth factor receptor (EGFR).
- the receptor protein is epidermal growth factor receptor 1 (HER1).
- the receptor protein is epidermal growth factor receptor 2 (HER2).
- the receptor protein is epidermal growth factor receptor 3 (HER3).
- the receptor protein is epidermal growth factor receptor 4 (HER4).
- the peptidyl moiety of R 4 or R 5 is a cell surface receptor.
- the cell surface receptor is in the extracellular domain, the transmembrane domain, or the intracellular domain.
- the peptidyl moiety of R 4 or R 5 is a cytosolic protein.
- the peptidyl moiety of R 4 or R 5 is a transcriptional factor.
- the peptidyl moiety of R 4 or R 5 is an enzyme.
- the biomolecule conjugate further comprises a detectable agent or a therapeutic agent. In embodiments, the biomolecule conjugate further comprises a detectable agent and a therapeutic agent. In embodiments, the biomolecule conjugate further comprises a detectable agent. In embodiments, the detectable agent is a radioisotope. In embodiments, the biomolecule conjugate further comprises a therapeutic agent.
- L 5 is -O-, -NH-, or -S-. In embodiments, L 5 is -NH- or -S-. In embodiments, L 5 is -NH-. In embodiments, L 5 is -S-. In embodiments, L 5 is -O-. In embodiments, L 5 is a bond. In embodiments, -S(O 2 )F is meta to the carbon atom bonded to L 5 . In embodiments, -S(O 2 )F is ortho to the carbon atom bonded to L 5 . In embodiments, -S(O 2 )F is para to the carbon atom bonded to L 5 . In embodiments, the compound is Formula (IV-1) or a stereoisomer thereof.
- the compound is Formula (IV-2) or a stereoisomer thereof. In embodiments, the compound is Formula (IV-3) or a stereoisomer thereof. In embodiments, the compound is Formula (IV-4) or a stereoisomer thereof. In embodiments, the compound is Formula (IV-5) or a stereoisomer thereof.
- a protein comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain having the formula:
- R 1 and L 4 are as defined herein; and L 5 is a bond, -O-, -NH-, or -S-.
- L 5 is -O-, -NH-, or -S-.
- L 5 is -NH- or -S-.
- L 5 is -NH-.
- L 5 is -S-.
- L 5 is -O-.
- L 5 is a bond.
- -S(O 2 )F is meta to the carbon atom bonded to L 5 .
- -S(O 2 )F is ortho to the carbon atom bonded to L 5 .
- -S(O 2 )F is para to the carbon atom bonded to L 5 .
- a biomolecule conjugate comprising the proteins of Formula (V) described herein, including embodiments thereof.
- R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A , -SO v1 NR 1A R 1B , -NHC(O)NR 1A R 1B , -N(O) m1 , -NR 1A R 1B , -C(O)R 1A , -C(O)-OR 1A .
- R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A .
- R 1 is halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A .
- R 1 is an electron-donating group or an electron-withdrawing group.
- R 1 is an electron-withdrawing group.
- the electron-withdrawing group is halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -CN, -SO n1 R 1A , -SO v1 NR 1A R 1B , -N(O) m1 , -C(O)R 1A , -C(O)OR 1A .
- R 1A and R 1B are hydrogen.
- R 1 is an electron-donating group.
- the electron-donating group is -Cl, -Br, -I, -CX 2 3 , -CHX 2 2 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -OCOR 1A , -OC(O)R 1A , -OC(O)NR 1A R 1B , -SR 1A , -PR 1A R 1B , -NHC(O)NR 1A R 1B , -NR 1A R 1B , -OR 1A , -NR 1A SO 2 R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted hetero
- the substituted or unsubstituted alkyl is substituted or unsubstituted alkene.
- the electron-donating group is unsubstituted alkene.
- the substituted or unsubstituted alkyl is substituted or unsubstituted alkyne.
- R 1A and R 1B are hydrogen.
- the electron-donating group is unsubstituted alkyne.
- R 1 is unsubstituted 2 to 6 membered heteroalkyl. In embodiments, R 1 is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1 is -O(CH 2 ) m CH 3 , and m is an integer from 0 to 6. In embodiments, R 1 is -O(CH 2 ) m CH 3 , and m is an integer from 0 to 4. In embodiments, R 1 is -O(CH 2 ) m CH 3 , and m is an integer from 0 to 3. In embodiments, R 1 is -O(CH 2 ) m CH 3 , and m is an integer from 0 to 2.
- R 1 is -O(CH 2 ) m CH 3 , and m is 0 or 1. In embodiments, R 1 is -OCH 3 . In embodiments, R 1 is -OCH 2 CH 3 . In embodiments, R 1 is -O(CH 2 ) 2 CH 3 . In embodiments, R 1 is -O(CH 2 ) 3 CH 3 . In embodiments, R 1 is hydrogen.
- R 1 is halogen. In embodiments, R 1 is fluorine, chlorine, bromine, or iodine. In embodiments, R 1 is fluorine, chlorine, or bromine. In embodiments, R 1 is fluorine or chlorine. In embodiments, R 1 is fluorine or bromine. In embodiments, R 1 is chlorine or bromine. In embodiments, R 1 is fluorine. In embodiments, R 1 is chlorine. In embodiments, R 1 is bromine. In embodiments, R 1 is iodine.
- R 1 is -CX 1 3 , -CHX 1 2 , or -CH 2 X 1 , wherein X 1 is halogen.
- R 1 is -CH 2 X 1 .
- R 1 is -CHX 1 2 .
- R 1 is -CX 1 3 .
- R 1 is -CF 3 .
- R 1 is -CHF 2 .
- R 1 is -CH 2 F.
- R 1 is -CCl 3 .
- R 1 is -CHCl 2 .
- R 1 is -CH 2 Cl.
- R 1 is -CBr 3 .
- R 1 is -CHBr 2 . In embodiments, R 1 is -CH 2 Br. In embodiments, R 1 is -CN. In embodiments, R 1 is -N(O) m1 . In embodiments, R 1 is -NO 2 . In embodiments, R 1 is -SO n1 R 1A . In embodiments, R 1 is -SO 2 H. In embodiments, R 1 is -SO v1 NR 1A R 1B . In embodiments, R 1 is -SO 2 NH 2 . In embodiments, R 1 is -NR 3 + .
- R 1 is an alkyl group substituted with an electron-withdrawing group.
- R 1 is a halogen-substituted alkyl group.
- -(CH 2 ) w CX 1 3 , -(CH 2 ) w CHX 1 2 , or -(CH 2 ) w CH 2 X 1 , wherein w is an integer from 1 to 5, and X 1 is halogen.
- w is 1.
- w is 2.
- w is 3.
- w is 4.
- w is 5.
- R 1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl. In embodiments, R 1A is hydrogen, unsubstituted alkyl, or unsubstituted heteroalkyl. In embodiments, R 1A is hydrogen, substituted or unsubstituted C 1-4 alkyl, or substituted or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1A is hydrogen, unsubstituted C 1-4
- R 1A is hydrogen. In embodiments, R 1A is unsubstituted C 1-4 alkyl. In embodiments, R 1A is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1A is hydrogen and R 1B is hydrogen.
- R 1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl. In embodiments, R 1B is hydrogen, unsubstituted alkyl, or unsubstituted heteroalkyl. In embodiments, R 1B is hydrogen, substituted or unsubstituted C 1-4 alkyl, or substituted or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1B is hydrogen, unsubstituted C 1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1B is hydrogen. In embodiments, R 1B is unsubstituted C 1-4 alkyl. In embodiments, R 1B is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R 1A is hydrogen and R 1B is hydrogen.
- X 1 is independently -F, -Cl, -Br, or -I. In embodiments, X 1 is independently -F, -Cl, or -Br. In embodiments, X 1 is independently -F or -Cl. In embodiments, X 1 is -F. In embodiments, X 1 is -Cl. In embodiments, X 1 is -Br. In embodiments, X 1 is -I.
- n1 is an integer from 0 to 4. In embodiments, n1 is an integer from 0 to 3. In embodiments, n1 is an integer from 0 to 2. In embodiments, n1 is 0. In embodiments, n1 is 1. In embodiments, n1 is 2. In embodiments, n1 is 3. In embodiments, n1 is 4.
- m1 is 1 or 2. In embodiments, m1 is 1. In embodiments, m1 is 2.
- v1 is 1 or 2. In embodiments, v1 is 1. In embodiments, v1 is 2.
- x is an integer from 0 to 8. In embodiments, x is an integer from 1 to 8. In embodiments, x is an integer from 1 to 7. In embodiments, x is an integer from 1 to 6. In embodiments, x is an integer from 1 to 5. In embodiments, x is an integer from 1 to 4. In embodiments, x is an integer from 1 to 3. In embodiments, x is an integer of 1 or 2. In embodiments, x is 1. In embodiments, x is 2. In embodiments, x is 3. In embodiments, x is 4. In embodiments, x is 5. In embodiments, x is 6. In embodiments, x is 7. In embodiments, x is 8. In embodiments, x is 0.
- L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene. In embodiments, L 1 is a bond. In embodiments, L 1 is substituted or unsubstituted alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-6 alkylene. In embodiments, L 1 is substituted or unsubstituted C 1-4 alkylene. In embodiments, L 1 is unsubstituted alkylene. In embodiments, L 1 is unsubstituted C 1-6 alkylene. In embodiments, L 1 is unsubstituted C 1-4 alkylene.
- L 1 is methylene. In embodiments, L 1 is ethylene. In embodiments, L 1 is propylene. In embodiments, L 1 is substituted or unsubstituted heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 8 membered heteroalkylene. In embodiments, L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y - or -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 6.
- L 1 is -NH-C(O)-(CH 2 ) y - or -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 5.
- L 1 is -NH-C(O)-(CH 2 ) y - or -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 4.
- L 1 is -NH-C(O)-(CH 2 ) y - or -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 3.
- L 1 is -NH-C(O)-(CH 2 ) y - or -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2. In embodiments, L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 3. In embodiments, L 1 is -NH-C(O)-. In embodiments, L 1 is -NH-C(O)-(CH 2 )-. In embodiments, L 1 is -NH-C(O)-(CH 2 ) 2 -. In embodiments, L 1 is -NH-C(O)-(CH 2 ) 3 -.
- L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 3. In embodiments, L 1 is -NH-C(O)-O-. In embodiments, L 1 is -NH-C(O)-O-(CH 2 )-. In embodiments, L 1 is -NH-C(O)-O-(CH 2 ) 2 -. In embodiments, L 1 is -NH-C(O)-O-(CH 2 ) 3 -.
- L 2 is a bond.
- L 2 is a bond, -NH-, -S-, -S(O) 2 -, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -SO 2 NH-, -NHSO 2 -, -C(S)-, L 12 -substituted or unsubstituted alkylene, L 12 -substituted or unsubstituted heteroalkylene, L 12 -substituted or unsubstituted cycloalkylene, L 12 -substituted or unsubstituted heterocycloalkylene, L 12 -substituted or unsubstituted arylene, or L 12 -substituted or unsubstituted heteroarylene.
- L 2 is a bond, -NH-, -S-, -S(O) 2 -, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -SO 2 NH-, -NHSO 2 -, -C(S)-, unsubstituted alkylene, unsubstituted heteroalkylene, unsubstituted cycloalkylene, unsubstituted heterocycloalkylene, unsubstituted arylene, or unsubstituted heteroarylene.
- L 2 is a bond.
- the alkylene is a C1-6 alkylene.
- the alkylene is a C 1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C 5 -C 6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- -(CH 2 ) x -L 1 - is -(CH 2 ) x NHC(O)- or -(CH 2 ) x NHC(O)O-, where x is as defined herein.
- -(CH 2 ) x -L 1 - is -(CH 2 ) x NHC(O)-, where x is as defined herein.
- -(CH 2 ) x -L 1 - is -(CH 2 )NHC(O)-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 2 NHC(O)-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 3 NHC(O)-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NHC(O)-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 5 NHC(O)-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 6 NHC(O)-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) x NHC(O)O-, where x is as defined herein.
- -(CH 2 ) x -L 1 - is -(CH 2 )NHC(O)O-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 2 NHC(O)O-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 3 NHC(O)O-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NHC(O)O-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 5 NHC(O)O-.
- -(CH 2 ) x -L 1 - is -(CH 2 ) 6 NHC(O)O-.
- L 1 is a bond and L 2 is a bond.
- R 2 is a peptidyl moiety
- R 3 is a peptidyl moiety
- L 1 is a bond
- L 2 is a bond.
- R 2A and R 2B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- the alkylene is a C 1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C 5 -C 6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- R 2A and R 2B are hydrogen.
- L 12 is halogen, -CF 3 , -CBr 3 , -CCl 3 , -CI 3 , -CHF 2 , -CHBr 2 , -CHCl 2 , -CHI 2 , -CH 2 F, -CH 2 Br, -CH 2 Cl, -CH 2 I, -OCF 3 , -OCBr 3 , -OCCl 3 , -OCI 3 , -OCHF 2 , -OCHBr 2 , -OCHCl 2 , -OCHI 2 , -OCH 2 F, -OCH 2 Br, -OCH 2 Cl, -OCH 2 I, -CN, -OH, -NH 2 , -COOH, -CONH 2 , -NO 2 , -SH, -SO 3 H, -SO 4 H, -SO 2 NH 2 , -NHNH 2 , -ONH 2 ,
- the alkylene is a C 1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C 5 -C 6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- L 3 is a bond, -N(R 3A )-, -S-, -S(O) 2 -, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R 3A )C(O)-, -C(O)N(R 3A )-, -NR 3A C(O)NR 3B -, -NR 3A C(NH)NR 3B -, -SO 2 N(R 3A )-, -N(R 3A )SO 2 -, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted
- L 3 is a bond, -NH-, -S-, -S(O) 2 -, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -SO 2 NH-, -NHSO 2 -, -C(S)-, L 13 -substituted or unsubstituted alkylene, L 13 -substituted or unsubstituted heteroalkylene, L 13 -substituted or unsubstituted cycloalkylene, L 13 -substituted or unsubstituted heterocycloalkylene, L 13 -substituted or unsubstituted arylene, or L 13 -substituted or unsubstituted heteroarylene.
- the alkylene is a C1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C 5 -C 6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- R 3A and R 3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
- the alkylene is a C 1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C 5 -C 6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- the alkylene is a C 1-4 alkylene.
- the heteroalkylene is a 2 to 6 membered heteroalkylene.
- the heteroalkylene is a 2 to 4 membered heteroalkylene.
- the cycloalkylene is a C 5 -C 6 cycloalkylene.
- the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene.
- the arylene is a C5-6 arylene.
- the heteroarylene is a 5 or 6 membered heteroarylene.
- the peptidyl moiety of R 4 comprises an antibody or an antibody variant; and the peptidyl moiety of R 5 comprises a protein.
- the peptidyl moiety of R 4 comprises an antibody or an antibody variant; and the peptidyl moiety of R 5 comprises a protein, wherein the protein comprises a lysine, histidine, or tyrosine bonded to L 3 , where L 3 is a bond.
- R 4 comprises an antibody.
- R 4 comprises an antibody variant.
- the antibody variant is a variant as defined herein.
- the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment. In embodiments, the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment.
- the protein is the target protein of the antibody or antibody variant. In embodiments, the target protein is a receptor protein.
- the peptidyl moiety of R 4 comprises a protein; and the peptidyl moiety of R 5 comprises an antibody or an antibody variant.
- the peptidyl moiety of R 4 comprises a protein; and the peptidyl moiety of R 5 comprises an antibody or an antibody variant; wherein the antibody or antibody variant comprises a lysine, histidine, or tyrosine bonded to L 3 , where L 3 is a bond.
- R 5 comprises an antibody.
- R 5 comprises an antibody variant.
- the antibody variant is a variant as defined herein.
- the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment. In embodiments, the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the protein is the target protein of the antibody or antibody variant. In embodiments, the target protein is a receptor protein.
- R 5 is a peptidyl moiety comprising a lysine, histidine, or tyrosine bonded to L 3 .
- R 5 is a peptidyl moiety comprising a lysine bonded to L 3 .
- R 5 is a peptidyl moiety comprising a histidine bonded to L 3 .
- R 5 is a peptidyl moiety comprising a tyrosine bonded to L 3 .
- R 5 is a peptidyl moiety comprising a lysine, histidine, or tyrosine bonded to L 3 , where L 3 is a bond.
- R 5 is a peptidyl moiety comprising a lysine bonded to L 3 , where L 3 is a bond. In embodiments, R 5 is a peptidyl moiety comprising a histidine bonded to L 3 , where L 3 is a bond. In embodiments, R 5 is a peptidyl moiety comprising a tyrosine bonded to L 3 , where L 3 is a bond. In embodiments, L 2 is a bond.
- the biomolecules, proteins, and peptidyl moieties described herein comprise a receptor protein.
- the receptor protein is a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor,
- the receptor protein is an integrin. In embodiments, the receptor protein is a somatostatin receptor. In embodiments, the receptor protein is a gonadotropin-releasing hormone receptor. In embodiments, the receptor protein is a bombesin receptor. In embodiments, the receptor protein is a vasoactive intestinal peptide receptor. In embodiments, the receptor protein is a neurotensin receptor. In embodiments, the receptor protein is a cholecystokinin 2 receptor. In embodiments, the receptor protein is a melanocortin receptor. In embodiments, the receptor protein is a ghrelin receptor.
- the receptor protein is a receptor expressed on a cancer cell. In embodiments, the receptor protein is a receptor overexpressed on a cancer cell relative to a control.
- the receptor protein is a G protein-coupled receptor.
- the receptor protein is a receptor tyrosine kinase.
- the receptor protein is an ErbB receptor.
- the receptor protein is an epidermal growth factor receptor (EGFR).
- the receptor protein is epidermal growth factor receptor 1 (HER1).
- the receptor protein is epidermal growth factor receptor 2 (HER2).
- the receptor protein is epidermal growth factor receptor 3 (HER3).
- the receptor protein is epidermal growth factor receptor 4 (HER4).
- proteins comprising an unnatural amino acid as described herein, including embodiments thereof, within CDR-L1, CDR-L2, CDR-L3, CDR-H1, CDR-H2, or CDR-H3, wherein the protein is an antigen-binding fragment, a single-chain variable fragment, or an antibody.
- the protein is an antigen-binding fragment.
- the protein is a single-chain variable fragment.
- the protein is an antibody.
- the protein has one unnatural amino acid within CDR-L1.
- the protein has one unnatural amino acid within CDR-L2.
- the protein has one unnatural amino acid within CDR-L3.
- the protein has one unnatural amino acid within CDR-H1. In embodiments, the protein has one unnatural amino acid within CDR-H2. In embodiments, the protein has one unnatural amino acid within CDR-H3. In embodiments, the protein has two or more unnatural amino acids within CDR-L1, CDR-L2, CDR-L3, CDR-H1, CDR-H2, or CDR-H3. The two or more unnatural amino acids can be in the same or different CDRs, and can be in the same or different chains (i.e., light or heavy).
- the proteins described herein comprise an unnatural amino acid as described herein, including embodiments thereof, within a framework region, wherein the protein is an antigen-binding fragment, a single-chain variable fragment, or an antibody.
- Fabs comprising an unnatural amino acid as described herein, including embodiments thereof.
- Fabs comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II), including embodiments thereof.
- nanobodies comprising an unnatural amino acid having the side chain of Formula (II) as described herein, including embodiments thereof.
- single-domain antibodies having an unnatural amino acid side chain wherein the unnatural amino acid side chain is capable of covalently binding to lysine, tyrosine, or histidine.
- the unnatural amino acid side chain is capable of covalently binding to lysine or tyrosine.
- the unnatural amino acid side chain is capable of covalently binding to lysine.
- the unnatural amino acid side chain is capable of covalently binding to tyrosine.
- nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR1, CDR2, or CDR3 of the nanobody.
- nanobodies comprising one unnatural amino acid, wherein the one unnatural amino acid is within CDR1, CDR2, or CDR3 of the nanobody.
- nanobodies comprising two unnatural amino acids, wherein the two unnatural amino acids are within CDR1, CDR2, or CDR3 of the nanobody.
- nanobodies comprising three unnatural amino acids wherein the three unnatural amino acids are within CDR1, CDR2, or CDR3 of the nanobody.
- nanobodies comprising four unnatural amino acids, wherein the four unnatural amino acids are within CDR1, CDR2, or CDR3 of the nanobody.
- nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR1 of the nanobody.
- nanobodies comprising an unnatural amino acid wherein the unnatural amino acid is within CDR1, but not within CDR2 or CDR3 of the nanobody.
- nanobodies comprising one unnatural amino acid wherein the one unnatural amino acid is within CDR1 of the nanobody.
- nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR2 of the nanobody.
- nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR2, and there are not any unnatural amino acids within CDR1 or CDR3 of the nanobody.
- nanobodies comprising one unnatural amino acid, wherein the one unnatural amino acid is within CDR2 of the nanobody.
- nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR3 of the nanobody.
- nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR3, and there are not any unnatural amino acids within CDR1 or CDR2 of the nanobody.
- nanobodies comprising one unnatural amino acid, wherein the one unnatural amino acid is within CDR3 of the nanobody.
- the unnatural amino acid comprises a side chain of Formula (II), including embodiments thereof.
- the proteins or biomolecule conjugates described herein, including embodiments thereof, comprise a detectable agent.
- the detectable agent is a radioisotope.
- the radioisotope is a positron-emitting radioisotope.
- the positron-emitting radioisotope is 11 C, 13 N, 15 O, 18 F, 64 Cu, 68 Ga, 78 Br, 82 Rb, 86 Y, 89 Zr, 90 Y, 22 Na, 26 Al, 40 K, 83 Sr, or 124 I.
- the positron-emitting radioisotope is 124 I.
- the radioisotope is an alpha-emitting radioisotope.
- the alpha-emitting radioisotope is 211 At, 227 Th, 225 Ac, 223 Ra, 213 Bi, or 212 Bi.
- the alpha-emitting radioisotope is 211 At.
- the proteins or biomolecule conjugates described herein further comprise a therapeutic agent.
- the proteins or biomolecule conjugates described herein further comprise a detectable agent and a therapeutic agent.
- the disclosure provides a cell comprising the compounds, proteins, and conjugates described herein, including embodiments thereof.
- the cell further includes a vector as described herein.
- the protein described herein, including embodiments thereof, is biosynthesized inside the cell, thereby generating a cell containing the protein.
- the protein described herein, including embodiments thereof, is contained in the medium outside the cell and penetrates into the cell, thereby generating a cell containing the protein.
- the cell comprises a protein complex described herein.
- a cell can be any prokaryotic or eukaryotic cell.
- any of the compounds (e.g., single-domain antibody) or compositions described herein can be expressed in bacterial cells such as E. coli, insect cells, yeast, or mammalian cells (such as HeLa cells, Chinese hamster ovary (CHO) cells, or COS cells).
- a cell can be a premature mammalian cell, i.e., a pluripotent stem cell.
- a cell can be derived from other human tissue. Other suitable cells are known to those skilled in the art.
- the proteins provided herein may be delivered to cells using methods well known in the art.
- a nucleic acid sequence encoding the proteins described herein including embodiments and aspects thereof.
- a cell comprises the compound of Formula (I), including any embodiment thereof.
- a cell comprises the compound of Formula (II), including any embodiment thereof.
- a cell comprises the compound of Formula (III), including any embodiment thereof.
- the cell further includes a mutant pyrrolysyl-tRNA synthetase as described herein, including embodiments thereof.
- the cell further includes a vector as described herein, including embodiments thereof.
- the cell further includes a tRNA Pyl .
- the compound of Formula (I) (including embodiments thereof) is biosynthesized inside the cell, thereby generating a cell containing the compound of Formula (I).
- the compound of Formula (I) is contained in the medium outside the cell and penetrates into the cell, thereby generating a cell containing the compound of Formula (I).
- the cell comprises the compound of Formula (II) (including embodiments thereof).
- the cell comprises the compound of Formula (II) that is synthesized inside the cell.
- the cell comprises the compound of Formula (II) that is synthesized outside a cell, and that penetrates into the cell.
- the cell comprises the compound of Formula (III) (including embodiments thereof).
- the cell comprises the compound of Formula (III) that is synthesized inside the cell.
- the cell comprises the compound of Formula (III) that is synthesized outside a cell, and that penetrates into the cell.
- a cell can be any prokaryotic or eukaryotic cell.
- the cell is prokaryotic.
- the cell is eukaryotic.
- the cell is a bacterial cell, a fungal cell, a plant cell, an archaeal cell, or an animal cell.
- the animal cell is an insect cell or a mammalian cell.
- the cell is a bacterial cell.
- the cell is a fungal cell.
- the cell is a plant cell.
- the cell is an archaeal cell.
- the cell is an animal cell.
- the cell is an insect cell.
- the cell is a mammalian cell.
- the cell is a human cell.
- any of the compositions described herein can be expressed in bacterial cells such as E. coli, insect cells, yeast, or mammalian cells (such as HeLa cells, Chinese hamster ovary (CHO) cells, or COS cells).
- the cell is a premature mammalian cell, i.e., a pluripotent stem cell.
- the cell is derived from other human tissue.
- Other suitable cells are known to those skilled in the art.
- an unnatural amino acid may be inserted into or replace a naturally occurring amino acid in a protein.
- In order for the unnatural amino acid to be inserted into or replace an amino acid in a protein, it must be capable of being incorporated during proteinogenesis. Thus, the unnatural amino acid must be loaded onto a transfer RNA molecule (tRNA) so that it can be used in translation.
- Loading of amino acids occurs via an aminoacyl-tRNA synthetase, which is an enzyme that facilitates the attachment of appropriate amino acids to tRNA molecules.
- the attachment of unnatural amino acids to tRNA may not necessarily be accomplished by the naturally occurring aminoacyl-tRNA synthetase.
- Engineered aminoacyl-tRNA synthetases may be useful for attaching unnatural amino acids to tRNA.
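The role of an engineered synthetase and its tRNA can be illustrated with a toy translation routine. This is a minimal sketch and not the method of the disclosure. It assumes the unnatural amino acid is encoded by the amber stop codon UAG (the codon that tRNA Pyl naturally decodes; the passage above does not name a codon), it uses a deliberately tiny codon table, and it marks the unnatural residue with the hypothetical one-letter symbol `X`.

```python
# Toy model of codon reassignment: a stop codon is read as an unnatural
# amino acid only when a charged suppressor tRNA is available.

# Deliberately small codon table (assumption: only codons used in the demo).
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "AAA": "K", "UAU": "Y", "CAU": "H",
    "UAA": "*", "UGA": "*", "UAG": "*",
}

UAA_SYMBOL = "X"          # hypothetical symbol for the unnatural amino acid
REASSIGNED_CODON = "UAG"  # assumption: amber codon read by the suppressor tRNA


def translate(mrna: str, suppressor_charged: bool) -> str:
    """Translate an mRNA, reading UAG as UAA_SYMBOL if the tRNA is charged."""
    protein = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = mrna[i:i + 3]
        if codon == REASSIGNED_CODON and suppressor_charged:
            protein.append(UAA_SYMBOL)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # ordinary stop codon: translation terminates
        protein.append(residue)
    return "".join(protein)


demo_mrna = "AUGGCUUAGAAAUAA"  # hypothetical message: Met-Ala-(UAG)-Lys-stop
with_synthetase = translate(demo_mrna, suppressor_charged=True)
without_synthetase = translate(demo_mrna, suppressor_charged=False)
```

With a charged suppressor tRNA the message is read through the UAG codon and the unnatural residue is incorporated; without it, translation stops at that codon and a truncated product results.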
- a PylRS mutant library was generated. Compared to previously described PylRS mutant libraries, the PylRS mutant library generated herein was constructed using the new small-intelligent mutagenesis approach, which allows a greater number of amino acid residues to be mutated simultaneously (e.g., 10 amino acid residues). Mutant pyrrolysyl-tRNA synthetases and methods for making them are described, for example, in US 2021/0002325, WO 2020/072674, and WO 2020/206341, the disclosures of which are incorporated by reference herein in their entirety.
- the disclosure provides a pyrrolysyl-tRNA synthetase having at least 85% sequence identity to the amino acid sequence of SEQ ID NO:1. In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase having at least 90% sequence identity to the amino acid sequence of SEQ ID NO:1. In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase having at least 95% sequence identity to the amino acid sequence of SEQ ID NO:1. In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase comprising the amino acid sequence of SEQ ID NO:1. In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase as set forth in SEQ ID NO:1.
- the disclosure provides a mutant pyrrolysyl-tRNA synthetase including at least 5 amino acid residue substitutions within the substrate-binding site of the mutant pyrrolysyl-tRNA synthetase.
- the mutant pyrrolysyl-tRNA synthetase comprises at least 5 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:2.
- the substrate-binding site includes residues alanine at position 302, leucine at position 305, tyrosine at position 306, leucine at position 309, isoleucine at position 322, asparagine at position 346, cysteine at position 348, tyrosine at position 384, valine at position 401 and tryptophan at position 417 as set forth in the amino acid sequence of SEQ ID NO:2.
- the at least 5 amino acid residue substitutions are a substitution for alanine at position 302, a substitution for asparagine at position 346, a substitution for cysteine at position 348.
- the at least 5 amino acid residue substitutions are isoleucine for alanine at position 302, threonine for asparagine at position 346, isoleucine for cysteine at position 348, leucine for tyrosine at position 384, and lysine for tryptophan at position 417 as set forth in the amino acid sequence of SEQ ID NO:2.
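The five substitutions named above (A302I, N346T, C348I, Y384L, W417K, numbered against SEQ ID NO:2) can be applied to a sequence programmatically. The sketch below is illustrative only: SEQ ID NO:2 is not reproduced in this passage, so `make_placeholder` builds a hypothetical stand-in sequence that merely carries the expected wild-type residues at the five positions, and the function names are assumptions introduced for the example.

```python
# Apply point substitutions given as (wild-type residue, 1-based position, new residue).

# The five substitutions named in the text, relative to SEQ ID NO:2.
SUBSTITUTIONS = [
    ("A", 302, "I"),
    ("N", 346, "T"),
    ("C", 348, "I"),
    ("Y", 384, "L"),
    ("W", 417, "K"),
]


def apply_substitutions(sequence, substitutions):
    """Return a mutated copy of `sequence`, checking each wild-type residue first."""
    residues = list(sequence)
    for wild_type, position, replacement in substitutions:
        index = position - 1  # convert 1-based position to 0-based index
        if index < 0 or index >= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


def make_placeholder(length=420):
    """Hypothetical stand-in for SEQ ID NO:2 (NOT the real sequence)."""
    residues = ["G"] * length
    for wild_type, position, _ in SUBSTITUTIONS:
        residues[position - 1] = wild_type
    return "".join(residues)


placeholder = make_placeholder()
mutant = apply_substitutions(placeholder, SUBSTITUTIONS)
```

Checking the wild-type residue before each replacement guards against numbering mismatches, which are a common source of error when positions are quoted against a reference sequence.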
- the mutant pyrrolysyl-tRNA synthetase is encoded by the nucleic acid sequence of SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence including the sequence of SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence that has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO:3.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80% identity to SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 85% identity to SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 90% identity to SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 95% identity to SEQ ID NO:3.
- the mutant pyrrolysyl-tRNA synthetase has the amino acid sequence of SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase includes an amino acid sequence of SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80% identity to SEQ ID NO:4.
- the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 85% identity to SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 90% identity to SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 95% identity to SEQ ID NO:4.
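The identity thresholds recited above (80%, 85%, 90%, 95%) can be checked with simple arithmetic once two sequences have been aligned. The sketch below is a simplified illustration under stated assumptions: it takes two pre-aligned strings of equal length with `-` marking gaps, and it defines percent identity as identical non-gap columns divided by the total number of aligned columns. The disclosure may define identity with a specific alignment algorithm and denominator that are not reproduced in this passage, and the function names here are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same non-gap residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue fragments that differ at one position.
reference_fragment = "ACDEFGHIKL"
variant_fragment = "ACDEFGHIKV"
fragment_identity = percent_identity(reference_fragment, variant_fragment)
```

In practice the alignment itself would come from an established tool; this sketch only covers the counting step applied after alignment.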
- compositions, e.g., mutant pyrrolysyl-tRNA synthetase, tRNA Pyl
- a vector including a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase as described herein, including embodiments thereof.
- the vector further includes a nucleic acid sequence encoding tRNA Pyl .
- the vector comprises a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase as described herein.
- the vector further includes a nucleic acid sequence encoding tRNA Pyl .
- compositions provided herein are useful for forming a biomolecule or biomolecule conjugate.
- the method of forming a biomolecule comprises contacting a biomolecule (e.g., protein), a mutant pyrrolysyl-tRNA synthetase, a tRNA Pyl , and a compound of Formula (I) (including embodiments thereof), thereby producing the biomolecule, i.e., a biomolecule comprising the unnatural amino acid of Formula (I) (including embodiments thereof).
- the biomolecule produced by the method will comprise the unnatural amino acid side chain of Formula (II) (including embodiments thereof).
- the mutant pyrrolysyl-tRNA synthetase used in the method of producing the biomolecule is any described herein or known in the art.
- the tRNA Pyl used in the method of producing the biomolecule is any described herein.
- the reaction is performed in vitro. In embodiments, the reaction is performed in vivo. In embodiments, the reaction is performed in one or more living cells. In embodiments, the reaction is performed in one or more living bacterial cells. In embodiments, the reaction is performed in one or more living mammalian cells.
- the detectable label is a detectable label that can be used in medical imaging.
- the detectable label is a label that can be used for radiography, magnetic resonance imaging, nuclear medicine, ultrasound elastography, photoacoustic imaging, tomography, echocardiography, functional near-infrared spectroscopy, or magnetic particle imaging.
- the detectable label is a label that can be used for tomography.
- the detectable label is a label that can be used for positron emission tomography.
- the detectable label is a radioisotope.
- the detectable label is an iodine radioisotope.
- the radioisotope is 123 I, 124 I, 125 I, or 131 I.
- the radioisotope is 123 I.
- the radioisotope is 124 I.
- the radioisotope is 125 I.
- the radioisotope is 131 I.
- the radioisotope is a positron-emitting radioisotope.
- the positron-emitting radioisotope is 11 C, 13 N, 15 O, 18 F, 64 Cu, 68 Ga, 78 Br, 82 Rb, 86 Y, 89 Zr, 90 Y, 22 Na, 26 Al, 40 K, 83 Sr, or 124 I.
- the positron-emitting radioisotope is 11 C.
- the positron-emitting radioisotope is 13 N.
- the positron-emitting radioisotope is 15 O.
- the positron-emitting radioisotope is 18 F.
- the positron-emitting radioisotope is 64 Cu.
- the positron-emitting radioisotope is 68 Ga. In embodiments, the positron-emitting radioisotope is 78 Br. In embodiments, the positron-emitting radioisotope is 82 Rb. In embodiments, the positron-emitting radioisotope is 86 Y. In embodiments, the positron-emitting radioisotope is 89 Zr. In embodiments, the positron-emitting radioisotope is 90 Y. In embodiments, the positron-emitting radioisotope is 22 Na. In embodiments, the positron-emitting radioisotope is 26 Al.
- the positron-emitting radioisotope is 40 K. In embodiments, the positron-emitting radioisotope is 83 Sr. In embodiments, the positron-emitting radioisotope is 124 I. In embodiments, the radioisotope is an alpha-emitting radioisotope. In embodiments, the alpha-emitting radioisotope is 211 At, 227 Th, 225 Ac, 223 Ra, 213 Bi, or 212 Bi. In embodiments, the alpha-emitting radioisotope is 211 At. In embodiments, the alpha-emitting radioisotope is 227 Th.
- the alpha-emitting radioisotope is 225 Ac. In embodiments, the alpha-emitting radioisotope is 223 Ra. In embodiments, the alpha-emitting radioisotope is 213 Bi. In embodiments, the alpha-emitting radioisotope is 212 Bi.
- any of the proteins described herein may be administered to a subject in a pharmaceutical composition further comprising a pharmaceutically acceptable excipient.
- the compositions are suitable for formulation and administration in vitro or in vivo. Suitable carriers and excipients and their formulations are known in the art and described in, e.g., Remington: The Science and Practice of Pharmacy, 21st Ed., Lippincott Williams & Wilkins (2005).
- compositions administered to a patient for therapeutic purposes (e.g., treating a disease) and/or diagnostic purposes (e.g., medical imaging).
- Medical imaging includes, without limitation, radiography, magnetic resonance imaging, nuclear medicine, ultrasound elastography, photoacoustic imaging, tomography (e.g., positron emission tomography), echocardiography, functional near-infrared spectroscopy, magnetic particle imaging, and the like.
- “Pharmaceutically acceptable excipient” and “pharmaceutically acceptable carrier” refer to a substance that aids the administration of an active agent to and absorption by a subject and can be included in the compositions of the disclosure without causing a significant adverse toxicological effect on the patient.
- Non-limiting examples of pharmaceutically acceptable excipients include water, NaCl, normal saline solutions, lactated Ringer's, normal sucrose, normal glucose, binders, fillers, disintegrants, lubricants, coatings, sweeteners, flavors, salt solutions (such as Ringer's solution), alcohols, oils, gelatins, carbohydrates such as lactose, amylose or starch, fatty acid esters, hydroxymethylcellulose, polyvinyl pyrrolidine, and colors, and the like.
- Such preparations can be sterilized and, if desired, mixed with auxiliary agents such as lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, and/or aromatic substances and the like that do not deleteriously react with the compounds of the disclosure.
- Pharmaceutically acceptable excipients can be used in pharmaceutical compositions for therapeutic purposes (e.g.
- Solutions of the pharmaceutical compositions can be prepared in water suitably mixed with a lipid or surfactant, such as hydroxypropylcellulose. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations can contain a preservative to prevent the growth of microorganisms. Solutions can be administered, e.g., parenterally, such as subcutaneously or intravenously (e.g., infusion or bolus).
- compositions can be delivered via intranasal or inhalable solutions.
- the intranasal composition can be a spray, aerosol, or inhalant.
- the inhalable composition can be a spray, aerosol, or inhalant.
- Nasal solutions can be aqueous solutions designed to be administered to the nasal passages in drops or sprays. Nasal solutions can be prepared so that they are similar in many respects to nasal secretions. Thus, the aqueous nasal solutions usually are isotonic and slightly buffered to maintain a pH of 5.5 to 6.5.
- antimicrobial preservatives similar to those used in ophthalmic preparations and appropriate drug stabilizers, if required, may be included in the formulation.
- Various commercial nasal preparations are known in the art.
- Oral formulations can include excipients such as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders.
- oral pharmaceutical compositions will comprise an inert diluent or edible carrier, or they may be enclosed in hard or soft shell gelatin capsule, or they may be compressed into tablets, or they may be incorporated directly with the food.
- the active compounds may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like.
- the percentage of the compositions and preparations may, of course, be varied and may be between about 1% and about 75% of the weight of the unit.
- the amount of nucleic acids in such compositions is such that a suitable dosage can be obtained.
- for parenteral administration in an aqueous solution, for example, the solution should be suitably buffered and the liquid diluent first rendered isotonic with sufficient saline or glucose.
- Aqueous solutions, in particular sterile aqueous media, are especially suitable for intravenous, intramuscular, subcutaneous and intraperitoneal administration.
- one dosage could be dissolved in 1 ml of isotonic NaCl solution and either added to 1000 ml of hypodermoclysis fluid or injected at the proposed site of infusion.
- Sterile injectable solutions can be prepared by incorporating the recombinant proteins in the required amount in the appropriate solvent followed by filtered sterilization.
- dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium. Vacuum-drying and freeze-drying techniques, which yield a powder of the active ingredient plus any additional desired ingredients, can be used to prepare sterile powders for reconstitution of sterile injectable solutions.
- the preparation of more, or highly, concentrated solutions for direct injection is also contemplated. Dimethyl sulfoxide can be used as solvent for rapid penetration, delivering high concentrations of the active agents to a small area.
- proteins described herein may be formulated and introduced as a vaccine through oral, intradermal, intramuscular, intraperitoneal, intravenous, subcutaneous, or intranasal routes, via scarification (scratching through the top layers of skin, e.g., using a bifurcated needle), or by any other standard route of immunization.
- Vaccine formulations suitable for oral administration may be in the form of capsules, cachets, pills, tablets, lozenges (using a flavored basis, usually sucrose and acacia or tragacanth), powders, granules, or as a solution or a suspension in an aqueous or non-aqueous liquid, or as an oil-in-water or water-in-oil liquid emulsion, or as an elixir or syrup, or as pastilles (using an inert base, such as gelatin and glycerin, or sucrose and acacia), each containing a predetermined amount of a subject composition thereof as an active ingredient or any other oral composition as listed above.
- the vaccines may be administered parenterally as injections (intravenous, intramuscular or subcutaneous).
- the amount of recombinant proteins used in a vaccine can depend upon a variety of factors including the route of administration, species, and use of booster administration. However, a person of ordinary skill in the art would immediately recognize appropriate and/or equivalent doses looking at dosages of approved whooping cough vaccines for guidance.
- adjuvant refers to a compound that when administered in conjunction with the recombinant proteins provided herein including embodiments thereof augments the immune response to the antigen, but when administered alone does not generate an immune response to the antigen.
- the recombinant proteins provided herein including embodiments thereof may be used as an adjuvant. Therefore, the term “adjuvant” refers to a compound that when administered in conjunction with a vaccine augments the immune response to the antigen, but when administered alone does not generate an immune response to the antigen.
- Adjuvants can augment an immune response by several mechanisms including lymphocyte recruitment, stimulation of B and/or T cells, and stimulation of macrophages.
- the adjuvant increases the titer of induced antibodies and/or the binding affinity of induced antibodies relative to the situation if the immunogen were used alone.
- a variety of adjuvants can be used in combination with the recombinant proteins provided herein to elicit an immune response.
- Adjuvants augment the intrinsic response to an immunogen without causing conformational changes in the immunogen that affect the qualitative form of the response.
- Exemplary adjuvants include aluminum hydroxide and aluminum phosphate, 3 De-O-acylated monophosphoryl lipid A (MPL™) (see GB 2220211 (RIBI ImmunoChem Research Inc., Hamilton, Montana, now part of Corixa)).
- Stimulon™ QS-21 is a triterpene glycoside or saponin isolated from the bark of the Quillaja Saponaria Molina tree found in South America (see Kensil et al., in Vaccine Design: The Subunit and Adjuvant Approach (eds. Powell & Newman, Plenum Press, NY, 1995); US Patent No. 5,057,540), (Aquila BioPharmaceuticals, Framingham, MA).
- Other adjuvants are oil in water emulsions (such as squalene or peanut oil), optionally in combination with immune stimulants, such as monophosphoryl lipid A (see Stoute et al., N. Engl. J. Med.
- Adjuvants can be administered as a component of a therapeutic composition with an active agent or can be administered separately, before, concurrently with, or after administration of the therapeutic agent.
- adjuvants are aluminum salts (alum), such as alum hydroxide, alum phosphate, alum sulfate. Such adjuvants can be used with or without other specific immunostimulating agents such as MPL or 3-DMP, QS-21, polymeric or monomeric amino acids such as poly glutamic acid or poly lysine.
- Another class of adjuvants is oil-in-water emulsion formulations.
- Such adjuvants can be used with or without other specific immunostimulating agents such as muramyl peptides (e.g., N-acetylmuramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE)).
- Oil-in-water emulsions include (a) MF59 (WO 90/14837), containing 5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of MTP-PE) formulated into submicron particles using a microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton MA), (b) SAF, containing 10% Squalene, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP, either microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi ImmunoChem, Hamilton, MT) containing 2% squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM
- adjuvants are saponin adjuvants, such as Stimulon™ (QS-21, Aquila, Framingham, MA) or particles generated therefrom such as ISCOMs (immunostimulating complexes) and ISCOMATRIX.
- Other adjuvants include RC-529, GM-CSF and Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA).
- cytokines such as interleukins (e.g., IL-1 α and β peptides, IL-2, IL-4, IL-6, IL-12, IL-13, and IL-15), macrophage colony stimulating factor (M-CSF), granulocyte-macrophage colony stimulating factor (GM-CSF), tumor necrosis factor (TNF), chemokines, such as MIP1α and β and RANTES.
- glycolipid analogues including N-glycosylamides, N-glycosylureas and N-glycosylcarbamates, each of which is substituted in the sugar residue by an amino acid, as immuno-modulators or adjuvants (see US Pat. No. 4,855,283).
- Heat shock proteins, e.g., HSP70 and HSP90, may also be used as adjuvants.
- An adjuvant can be administered with an immunogen as a single composition, or can be administered before, concurrent with or after administration of the immunogen.
- Immunogen and adjuvant can be packaged and supplied in the same vial or can be packaged in separate vials and mixed before use. Immunogen and adjuvant are typically packaged with a label indicating the intended therapeutic application. If immunogen and adjuvant are packaged separately, the packaging typically includes instructions for mixing before use.
- the choice of an adjuvant and/or carrier depends on the stability of the immunogenic formulation containing the adjuvant, the route of administration, the dosing schedule, the efficacy of the adjuvant for the species being vaccinated, and, in humans, a pharmaceutically acceptable adjuvant is one that has been approved or is approvable for human administration by pertinent regulatory bodies.
- Complete Freund's adjuvant is not suitable for human administration.
- Alum, MPL and QS-21 are preferred.
- two or more different adjuvants can be used simultaneously. Preferred combinations include alum with MPL, alum with QS-21, MPL with QS-21, MPL or RC-529 with GM-CSF, and alum, QS-21 and MPL together.
- Incomplete Freund's adjuvant can be used (Chang et al., Advanced Drug Delivery Reviews 32, 173-186 (1998)), optionally in combination with any of alum, QS-21, and MPL and all combinations thereof.
- the dosage and frequency (single or multiple doses) of the proteins described herein (e.g., proteins of Formula (II) including embodiments thereof) administered to a subject can vary depending upon a variety of factors, for example, whether the mammal suffers from another disease, and its route of administration; size, age, sex, health, body weight, body mass index, and diet of the recipient; nature and extent of symptoms of the disease being treated, kind of concurrent treatment, complications from the disease being treated or other health-related problems.
- Other therapeutic regimens or agents can be used in conjunction with the methods and proteins described herein (e.g., proteins of Formula (II) including embodiments thereof). Adjustment and manipulation of established dosages (e.g., frequency and duration) are within the ability of the skilled artisan.
- the effective amount of a protein described herein can be initially determined from cell culture assays.
- Target concentrations will be those concentrations of protein that are capable of achieving the methods described herein, as measured using the methods described herein or known in the art.
- effective amounts of proteins for use in humans can also be determined from animal models.
- a dose for humans can be formulated to achieve a concentration that has been found to be effective in animals.
- the dosage in humans can be adjusted by monitoring effectiveness and adjusting the dosage upwards or downwards, as described above. Adjusting the dose to achieve maximal efficacy in humans based on the methods described above and other methods is well within the capabilities of the ordinarily skilled artisan.
- Dosages of the proteins described herein may be varied depending upon the requirements of the patient, and whether the purpose is therapeutic or medical imaging.
- the dose administered to a patient should be sufficient to effect a beneficial therapeutic response in the patient over time.
- the size of the dose also will be determined by the existence, nature, and extent of any adverse side effects. Determination of the proper dosage for a particular situation is within the skill of the art. Dosage amounts and intervals can be adjusted individually to provide levels of the protein effective for the particular clinical indication being treated. This will provide a therapeutic regimen that is commensurate with the severity of the individual's disease state.
- an effective prophylactic, diagnostic, or therapeutic treatment regimen can be planned that does not cause substantial toxicity and yet is effective to treat the clinical disease or symptoms demonstrated by the particular patient.
- This planning should involve the careful choice of proteins by considering factors such as compound potency, relative bioavailability, patient body weight, presence and severity of adverse side effects.
- the proteins are administered to a patient at an amount of about 0.001 mg/kg to about 500 mg/kg.
- the proteins e.g., recombinant proteins, antibodies, antibody variants, single-domain antibodies
- the proteins are administered to a patient in an amount of about 0.01 mg/kg, 0.1 mg/kg, 0.5 mg/kg, 1 mg/kg, 2 mg/kg, 3 mg/kg, 4 mg/kg, 5 mg/kg, 10 mg/kg, 20 mg/kg, 30 mg/kg, 40 mg/kg, 50 mg/kg, 60 mg/kg, 70 mg/kg, 80 mg/kg, 90 mg/kg, 100 mg/kg, 200 mg/kg, or 300 mg/kg. It is understood that where the amount is referred to as “mg/kg,” the amount is milligram per kilogram body weight of the subject being administered with the proteins.
- the proteins are administered to a patient in an amount from about 0.01 mg to about 500 mg per day.
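The "mg/kg" convention defined above (milligrams per kilogram of body weight) reduces to a single multiplication, sketched below. The function names, the 70 kg example weight, and the treatment of "about" as exact bounds are assumptions made only for illustration; the sketch shows the unit arithmetic and is not a dosing method from the disclosure.

```python
def total_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Convert a weight-normalized dose (mg/kg) into a total amount (mg)."""
    if dose_mg_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


def within_recited_range(dose_mg_per_kg: float,
                         low: float = 0.001,
                         high: float = 500.0) -> bool:
    """Check a dose against the recited 0.001 to 500 mg/kg range.

    The word 'about' in the text is ignored here; the bounds are treated as exact.
    """
    return low <= dose_mg_per_kg <= high


# Hypothetical example: 0.5 mg/kg for a 70 kg subject.
print(total_dose_mg(0.5, 70.0))
print(within_recited_range(0.5))
```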
- Embodiment 1 A compound of Formula (I) or a stereoisomer thereof: wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L 4 is a bond or -O-; x is an integer from 0 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 .
- Embodiment 3 The compound of Embodiment 1, wherein L 4 is -O-.
- Embodiment 4 The compound of Embodiment 1, wherein the compound of Formula
- Embodiment 5 The compound of any one of Embodiments 1 to 4, wherein R 1 is hydrogen, halogen, -CX 1 3 , -CHX 1 2 , -CH 2 X 1 , -OCX 1 3 , -OCH 2 X 1 , -OCHX 1 2 , -CN, -SO n1 R 1A , -SO v1 NR 1A R 1B , -NHC(O)NR 1A R 1B , -N(O) m1 , -NR 1A R 1B , -C(O)R 1A , -C(O)-OR 1A , -C(O)NR 1A R 1B , -OR 1A , -NR 1A SO 2 R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , -NR 1A OR 1B , unsubstituted C 1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl
- Embodiment 6 The compound of any one of Embodiments 1 to 5, wherein R 1 is ortho to -S(O 2 )F.
- Embodiment 7 The compound of any one of Embodiments 1 to 5, wherein R 1 is meta to -S(O 2 )F.
- Embodiment 8 The compound of Embodiment 1, wherein the compound of Formula (I) has the formula:
- Embodiment 9 The compound of any one of Embodiments 1 to 8, wherein ring A is a 5-membered cycloalkyl having one or two double bonds or a 5-membered heterocycloalkyl having one double bond.
- Embodiment 10 The compound of any one of Embodiments 1 to 9, wherein ring A is a 5-membered cycloalkyl.
- Embodiment 11 The compound of any one of Embodiments 1 to 9, wherein ring A is a 5-membered heterocycloalkylene.
- Embodiment 12 The compound of any one of Embodiments 1 to 8, wherein ring A is a 5-membered heteroaryl.
- Embodiment 13 The compound of Embodiment 12, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 14 The compound of Embodiment 13, wherein ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 15 The compound of Embodiment 14, wherein ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 16 The compound of any one of Embodiments 1 to 8, wherein ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole.
- Embodiment 17 The compound of any one of Embodiments 1 to 5, wherein the compound of Formula (I) is a compound of formula:
- Embodiment 18 The compound of any one of Embodiments 1 to 5, wherein the compound of Formula (I) is a compound of formula:
- Embodiment 19 The compound of any one of Embodiments 1 to 5, wherein the compound of Formula (I) is a compound of formula:
- Embodiment 20 The compound of any one of Embodiments 1 to 5, wherein the compound of Formula (I) is a compound of formula:
- Embodiment 21 The compound of Embodiment 8, wherein the compound of Formula (I) is a compound of formula:
- Embodiment 22 The compound of Embodiment 8, wherein the compound of Formula
- Embodiment 23 The compound of Embodiment 8, wherein the compound of Formula
- Embodiment 24 The compound of Embodiment 8, wherein the compound of Formula
- Embodiment 25 is a compound of formula: [0298] Embodiment 25.
- Embodiment 26 The compound of any one of Embodiments 1 to 24, wherein L 1 is substituted or unsubstituted alkylene.
- Embodiment 27 The compound of Embodiment 26, wherein L 1 is substituted or unsubstituted C 1-4 alkylene.
- Embodiment 28 The compound of any one of Embodiments 1 to 24, wherein L 1 is substituted or unsubstituted heteroalkylene.
- Embodiment 29 The compound of Embodiment 28, wherein L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- Embodiment 30 The compound of Embodiment 29, wherein L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 31 The compound of Embodiment 29, wherein L 1 is -NH-C(O)-O- (CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 32 The compound of Embodiment 29, wherein L 1 is -NH-C(O)-NH- (CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 33 The compound of Embodiment 29, wherein L 1 is -NH-C(O)-S- (CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 34 The compound of any one of Embodiments 30 to 33, wherein y is 0.
- Embodiment 35 The compound of any one of Embodiments 1 to 34, wherein x is an integer from 0 to 6.
- Embodiment 36 The compound of Embodiment 35, wherein x is an integer from 2 to 6.
- Embodiment 37 The compound of Embodiment 36, wherein x is 4.
- Embodiment 38 The compound of any one of Embodiments 1 to 24, wherein -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-.
- Embodiment 39 The compound of any one of Embodiments 1 to 24, wherein -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-O-.
- Embodiment 40 The compound of any one of Embodiments 1 to 24, wherein -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-NH-.
- Embodiment 41 The compound of any one of Embodiments 1 to 24, wherein -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-S-.
- Embodiment 42 The compound of Embodiment 1, wherein the compound of Formula
- Embodiment 43 The compound of Embodiment 1, wherein the compound of Formula
- Embodiment 44 The compound of Embodiment 1, wherein the compound of Formula
- Embodiment 45 The compound of Embodiment 1, wherein the compound of Formula
- Embodiment 46 A protein comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II): wherein ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L 4 is a bond or -O-; x is an integer from 0 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R 1 is hydrogen, halogen, -N(O) m1 , -SO v1 NR 1A R 1B , -C(O)NR 1A R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , -NR 1A OR 1B , substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X 1 is independently -
- R 1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl
- R 1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl
- n1 is an integer from 0 to 4
- m1 is 1 or 2
- v1 is 1 or 2.
- Embodiment 47 The protein of Embodiment 46, wherein L 4 is a bond.
- Embodiment 48 The protein of Embodiment 46, wherein L 4 is -O-.
- Embodiment 49 The protein of Embodiment 46, wherein the compound of Formula
- Embodiment 50 The protein of any one of Embodiments 46 to 49, wherein R 1 is hydrogen, halogen, -SO n1 R 1A , -SO v1 NR 1A R 1B , -NHC(O)NR 1A R 1B , -NR 1A SO 2 R 1B , -NR 1A C(O)R 1B , -NR 1A C(O)OR 1B , -NR 1A OR 1B , unsubstituted C 1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R 1A is hydrogen, unsubstituted C 1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R 1B is hydrogen, unsubstituted C 1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
- Embodiment 51 The protein of any one of Embodiments 46 to 50, wherein R 1 is ortho to -S(O 2 )F.
- Embodiment 52 The protein of any one of Embodiments 46 to 50, wherein R 1 is meta to -S(O 2 )F.
- Embodiment 53 The protein of Embodiment 46, wherein the compound of Formula (II) has the formula: [0327] Embodiment 54.
- Embodiment 55 The protein of any one of Embodiments 46 to 54, wherein ring A is a 5-membered cycloalkyl.
- Embodiment 56 The protein of any one of Embodiments 46 to 54, wherein ring A is a 5-membered heterocycloalkylene.
- Embodiment 57 The protein of any one of Embodiments 46 to 53, wherein ring A is a 5-membered heteroaryl.
- Embodiment 58 The protein of Embodiment 57, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 59 The protein of Embodiment 58, wherein ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 60 The protein of Embodiment 59, wherein ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 61 The protein of any one of Embodiments 46 to 53, wherein ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole.
- Embodiment 62 The protein of any one of Embodiments 46 to 50, wherein the protein of Formula (II) is a protein of formula:
- Embodiment 63 The protein of any one of Embodiments 46 to 50, wherein the protein of Formula (II) is a protein of formula:
- Embodiment 64 The protein of any one of Embodiments 46 to 50, wherein the protein of Formula (II) is a protein of formula:
- Embodiment 65 The protein of any one of Embodiments 46 to 50, wherein the protein of Formula (II) is a protein of formula:
- Embodiment 66 The protein of Embodiment 53, wherein the protein of Formula (II) is a protein of formula:
- Embodiment 67 The protein of Embodiment 53, wherein the protein of Formula (II) is a protein of formula:
- Embodiment 68 The protein of Embodiment 53, wherein the protein of Formula (II) is a protein of formula:
- Embodiment 69 The protein of Embodiment 53, wherein the protein of Formula (II) is a protein of formula:
- Embodiment 70 The protein of any one of Embodiments 46 to 69, wherein L 1 is a bond.
- Embodiment 71 The protein of any one of Embodiments 46 to 69, wherein L 1 is substituted or unsubstituted alkylene.
- Embodiment 72 The protein of Embodiment 71, wherein L 1 is substituted or unsubstituted C 1-4 alkylene.
- Embodiment 73 The protein of any one of Embodiments 46 to 69, wherein L 1 is substituted or unsubstituted heteroalkylene.
- Embodiment 74 The protein of Embodiment 73, wherein L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- Embodiment 75 The protein of Embodiment 74, wherein L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 76 The protein of Embodiment 74, wherein L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 77 The protein of Embodiment 74, wherein L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 78 The protein of Embodiment 74, wherein L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 79 The protein of any one of Embodiments 75 to 78, wherein y is 0.
- Embodiment 80 The protein of any one of Embodiments 46 to 79, wherein x is an integer from 0 to 6.
- Embodiment 81 The protein of Embodiment 80, wherein x is an integer from 2 to 6.
- Embodiment 82 The protein of Embodiment 81, wherein x is 4.
- Embodiment 83 The protein of any one of Embodiments 46 to 69, wherein -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-.
- Embodiment 84 The protein of any one of Embodiments 46 to 69, wherein -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-O-.
- Embodiment 85 The protein of any one of Embodiments 46 to 69, wherein -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-NH-.
- Embodiment 86 The protein of any one of Embodiments 46 to 69, wherein -(CH 2 ) x -L 1 - is -(CH 2 ) 4 NH-C(O)-S-.
- Embodiment 88 The protein of Embodiment 46, wherein the protein of Formula (II) is a protein of the formula:
- Embodiment 89 The protein of Embodiment 46, wherein the protein of Formula (II) is a protein of the formula:
- Embodiment 90 The protein of Embodiment 46, wherein the protein of Formula (II) is a protein of the formula:
- Embodiment 91 The protein of any one of Embodiments 46 to 90, wherein the protein is an antibody.
- Embodiment 92 The protein of any one of Embodiments 46 to 90, wherein the protein is an antibody variant.
- Embodiment 93 The protein of Embodiment 92, wherein the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
- Embodiment 94 The protein of Embodiment 93, wherein the antibody variant is a single-chain variable fragment.
- Embodiment 95 The protein of Embodiment 93, wherein the antibody variant is a single-domain antibody.
- Embodiment 96 The protein of Embodiment 93, wherein the antibody variant is an affibody.
- Embodiment 97 The protein of Embodiment 93, wherein the antibody variant is an antigen-binding fragment.
- Embodiment 98 The protein of any one of Embodiments 91 to 97, wherein the unnatural amino acid is within a CDR region or a framework region of the antibody or antibody variant.
- Embodiment 99 The protein of any one of Embodiments 46 to 98, wherein the protein is a receptor.
- Embodiment 100 The protein of any one of Embodiments 46 to 98, wherein the protein is a cell surface receptor.
- Embodiment 101 The protein of Embodiment 100, wherein the cell surface receptor is in the extracellular domain, the transmembrane domain, or the intracellular domain.
- Embodiment 102 The protein of any one of Embodiments 46 to 98, wherein the protein is a cytosolic protein.
- Embodiment 103 The protein of any one of Embodiments 46 to 98, wherein the protein is a transcriptional factor or an enzyme.
- Embodiment 104 The protein of any one of Embodiments 46 to 103, further comprising a detectable agent.
- Embodiment 105 The protein of Embodiment 104, wherein the detectable agent is a radioisotope.
- Embodiment 106 The protein of any one of Embodiments 46 to 105, further comprising a therapeutic agent.
- Embodiment 107 A nucleic acid encoding the protein of any one of Embodiments 46 to 106.
- Embodiment 108 A vector comprising a nucleic acid of Embodiment 107.
- Embodiment 109 A biomolecule conjugate of Formula (III): wherein: R 4 and R 5 are each independently a peptidyl moiety, a carbohydrate moiety, a lipid moiety, or a nucleic acid moiety; ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L 4 is a bond or -O-; x is an integer from 0 to 8; L 1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; L 2 is a bond, -NR 2A -, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R 2A )C(O)-,-C(O)N(R 2A )-, -, -,
- R 2A , R 2B , R 3A , and R 3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene; and R 2A , R 2B , R 3A , and R 3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; R 1 is hydrogen, halogen, -CX 1 3, -CHX 1 2, -CH2X 1 , -OCX 1 3,
- X 1 is independently -F, -Cl, -Br, or -I;
- R 1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
- R 1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
- n1 is an integer from 0 to 4;
- m1 is 1 or 2; and
- v1 is 1 or 2.
- Embodiment 110 The biomolecule conjugate of Embodiment 109, wherein R 1 is meta to the carbon atom linked to -L 4 S(O2)L 3 R 5 .
- Embodiment 111 The biomolecule conjugate of Embodiment 109, wherein R 1 is ortho to the carbon atom linked to -L 4 S(O2)L 3 R 5 .
- Embodiment 112 The biomolecule conjugate of any one of Embodiments 109 to 111, wherein L 4 is a bond.
- Embodiment 113 The biomolecule conjugate of any one of Embodiments 109 to 111, wherein L 4 is -O-.
- Embodiment 114 The biomolecule conjugate of any one of Embodiments 109 to 111, wherein the compound of Formula (III) has the formula:
- Embodiment 115 The biomolecule conjugate of any one of Embodiments 109 to 114, wherein R 1 is hydrogen, halogen, -CX 1 3, -CHX 1 2, -CH2X 1 , -OCX 1 3, -OCH2X 1 , -OCHX 1 2, -CN, -SO n1 R 1A , -SO v1 NR 1A R 1B , -NHC(O)NR 1A R 1B .
- Embodiment 116 The biomolecule conjugate of Embodiment 109, wherein the compound of Formula (III) has the formula:
- Embodiment 117 The biomolecule conjugate of any one of Embodiments 109 to 116, wherein ring A is a 5-membered cycloalkyl having one or two double bonds or a 5-membered heterocycloalkyl having one double bond.
- Embodiment 118 The biomolecule conjugate of any one of Embodiments 109 to 117, wherein ring A is a 5-membered cycloalkyl.
- Embodiment 119 The biomolecule conjugate of any one of Embodiments 109 to 117, wherein ring A is a 5-membered heterocycloalkylene.
- Embodiment 120 The biomolecule conjugate of any one of Embodiments 109 to 116, wherein ring A is a 5-membered heteroaryl.
- Embodiment 121 The biomolecule conjugate of Embodiment 120, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 122 The biomolecule conjugate of Embodiment 121, wherein ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 123 The biomolecule conjugate of Embodiment 122, wherein ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
- Embodiment 124 The biomolecule conjugate of any one of Embodiments 109 to 116, wherein ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole.
- Embodiment 125 The biomolecule conjugate of any one of Embodiments 109 to 124, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
- Embodiment 126 The biomolecule conjugate of any one of Embodiments 109 to 124, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
- Embodiment 127 The biomolecule conjugate of any one of Embodiments 109 to 124, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
- Embodiment 128 The biomolecule conjugate of any one of Embodiments 109 to 124, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
- Embodiment 129 The biomolecule conjugate of Embodiment 116, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
- Embodiment 130 The biomolecule conjugate of Embodiment 116, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
- Embodiment 131 The biomolecule conjugate of Embodiment 116, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
- Embodiment 132 The biomolecule conjugate of Embodiment 116, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
- Embodiment 133 The biomolecule conjugate of any one of Embodiments 109 to 132, wherein L 1 is a bond.
- Embodiment 134 The biomolecule conjugate of any one of Embodiments 109 to 132, wherein L 1 is substituted or unsubstituted alkylene.
- Embodiment 135 The biomolecule conjugate of Embodiment 134, wherein L 1 is substituted or unsubstituted C 1-4 alkylene.
- Embodiment 136 The biomolecule conjugate of any one of Embodiments 109 to 132, wherein L 1 is substituted or unsubstituted heteroalkylene.
- Embodiment 137 The biomolecule conjugate of Embodiment 136, wherein L 1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
- Embodiment 138 The biomolecule conjugate of Embodiment 137, wherein L 1 is -NH-C(O)-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 139 The biomolecule conjugate of Embodiment 137, wherein L 1 is -NH-C(O)-O-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 140 The biomolecule conjugate of Embodiment 137, wherein L 1 is -NH-C(O)-NH-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 141 The biomolecule conjugate of Embodiment 137, wherein L 1 is -NH-C(O)-S-(CH 2 ) y -, and y is an integer from 0 to 2.
- Embodiment 142 The biomolecule conjugate of any one of Embodiments 138 to 141, wherein y is 0.
- Embodiment 143 The biomolecule conjugate of any one of Embodiments 109 to 142, wherein x is an integer from 0 to 6.
- Embodiment 144 The biomolecule conjugate of Embodiment 143, wherein x is an integer from 2 to 6.
- Embodiment 145 The biomolecule conjugate of Embodiment 144, wherein x is 4.
- Embodiment 146 The biomolecule conjugate of any one of Embodiments 109 to 132, wherein -(CH 2 ) X -L 1 - is -(CH 2 ) 4 NH-C(O)-.
- Embodiment 147 The biomolecule conjugate of any one of Embodiments 109 to 132, wherein -(CH 2 ) X -L 1 - is -(CH 2 ) 4 NH-C(O)-O-.
- Embodiment 148 The biomolecule conjugate of any one of Embodiments 109 to 132, wherein -(CH 2 ) X -L 1 - is -(CH 2 ) 4 NH-C(O)-NH-.
- Embodiment 149 The biomolecule conjugate of any one of Embodiments 109 to 132, wherein -(CH 2 ) X -L 1 - is -(CH 2 ) 4 NH-C(O)-S-.
- Embodiment 150 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 151 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 152 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 153 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 154 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 155 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 156 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 157 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 158 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 159 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 160 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 161 The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
- Embodiment 162 The biomolecule conjugate of any one of Embodiments 109 to 161, wherein R 4 and R 5 are each independently a peptidyl moiety.
- Embodiment 163 The biomolecule conjugate of Embodiment 162, wherein the peptidyl moiety of R 4 comprises an antibody; and the peptidyl moiety of R 5 comprises a protein.
- Embodiment 164 The biomolecule conjugate of Embodiment 162, wherein the peptidyl moiety of R 4 comprises an antibody variant; and the peptidyl moiety of R 5 comprises a protein.
- Embodiment 165 The biomolecule conjugate of Embodiment 162, wherein the peptidyl moiety of R 4 comprises a protein; and the peptidyl moiety of R 5 comprises an antibody or an antibody variant.
- Embodiment 166 The biomolecule conjugate of Embodiment 164 or 165, wherein the antibody variant is an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody.
- Embodiment 167 The biomolecule conjugate of any one of Embodiments 163 to 166, wherein the protein is the target protein of the antibody or antibody variant.
- Embodiment 168 The biomolecule conjugate of any one of Embodiments 163 to 167, wherein the protein is a cytosolic protein.
- Embodiment 169 The biomolecule conjugate of any one of Embodiments 163 to 167, wherein the protein is an enzyme.
- Embodiment 170 The biomolecule conjugate of any one of Embodiments 163 to 167, wherein the protein is a transcriptional factor.
- Embodiment 171 The biomolecule conjugate of any one of Embodiments 163 to 167, wherein the protein is a receptor protein.
- Embodiment 173 The biomolecule conjugate of Embodiment 172, wherein the protein is a G protein-coupled receptor.
- Embodiment 174 A complex comprising a pyrrolysyl-tRNA synthetase and the compound of any one of Embodiments 1 to 106.
- Embodiment 175 The complex of Embodiment 174, wherein the pyrrolysyl-tRNA synthetase has an amino acid sequence with at least 90% sequence identity to SEQ ID NO: 1, 2, 3, or 4.
- Embodiment 176 The complex of Embodiment 175, wherein the pyrrolysyl-tRNA synthetase has an amino acid sequence as set forth in SEQ ID NO: 1, 2, 3, or 4.
- Embodiment 177 The complex of any one of Embodiments 174 to 176, further comprising a tRNA Pyl .
- Embodiment 178 A cell comprising: (i) the compound of any one of Embodiments 1 to 45; (ii) the protein of any one of Embodiments 46 to 106; (iii) the nucleic acid of Embodiment 107; (iv) the vector of Embodiment 108; (v) the biomolecule conjugate of any one of Embodiments 109 to 173; or (vi) the complex of any one of Embodiments 174 to 177.
- Embodiment 179 The cell of Embodiment 178, wherein the cell is a bacterial cell or a mammalian cell.
- Embodiment 180 A pharmaceutical composition comprising: (i) a pharmaceutically acceptable excipient, and (ii) the compound of any one of Embodiments 1 to 45, the protein of any one of Embodiments 46 to 106, the nucleic acid of Embodiment 107, or the vector of Embodiment 108.
- SFK was synthesized following the procedure described in FIG. 1A.
- the relatively electron-rich pyrrole ring was used to stabilize the sulfonyl fluoride functional group. It was tested to determine if the pyrrolysyl-tRNA synthetase (PylRS) described by Liu et al, J. Am. Chem. Soc., 143(27): 10341-10351 (2021) could incorporate SFK into proteins.
- the enhanced green fluorescent protein (EGFP) containing a TAG codon at position 182 (EGFP-182TAG) was co-expressed with FSKRS in E. coli.
- Upon Afb-Z binding, SFK would be placed in close proximity to the test residue at position 7 of Afb (Afb7X); reaction between the two would lead to protein-protein cross-linking. As shown in FIG. 1C, SFK was able to cross-link with Lys, His, and Tyr.
- substituents R can be introduced into the pyrrole ring to further fine-tune the reactivity. These can be electron-withdrawing or electron-donating groups.
- substituents R can be introduced to fine-tune the reactivity.
- FIG. 2C: two to four heteroatoms can be simultaneously introduced into the ring as shown, with an additional substituent R for further fine-tuning the reactivity.
- a single colony was picked and inoculated into 1 mL 2XYT (5 g/L NaCl, 16 g/L Tryptone, 10 g/L Yeast extract) with 50 µg/mL ampicillin and 34 µg/mL chloramphenicol.
- the cells were grown at 37 °C, 220 rpm overnight.
- the next morning cells were diluted 100 times in fresh 2XYT supplemented with 50 µg/mL ampicillin and 34 µg/mL chloramphenicol. When the cells reached an OD600 of 1.0, they were supplied with 2 mM SFK. The cells were then induced by 0.2% arabinose at 25 °C for 20 h. Proteins were then purified using the following procedure.
- Lysate was sonicated with a Sonic Dismembrator (Fisher Scientific, 30% output, 5 min, 1 s off, 1 s on) in an ice-water bath, after which the lysate was centrifuged (4,000 rpm for 10 min) and the supernatant was collected. Ni-NTA Agarose slurry (Thermo Scientific, #88222, 200 µL) was added to the supernatant. The mixture was incubated at 4 °C for 15 min and subsequently loaded onto a Poly-Prep® Chromatography Column.
- Afb7X and MBP-Z(24SFK) cross-linking: 1 mg/mL Afb7X and 0.5 mg/mL MBP-Z(24SFK) were incubated in PBS (pH 7.4) at 37 °C for 12 h, after which 2 µL reaction solution was extracted and mixed with 10 µL Laemmli loading buffer. The mixture was heated to 95 °C for 10 min and then loaded for SDS-PAGE, after which the gel was stained with Coomassie blue and imaged with a ChemiDoc™ MP imaging system (Bio-Rad). The maltose binding protein (MBP), Z protein, and Z spa affibody are well known in the art.
- the compound shown in FIG. 2E was synthesized by the process shown in FIG. 3.
- compound 2 (1.0 g, 4.4 mmol) and 1-ethyl-3-(3'-dimethylaminopropyl)carbodiimide hydrochloride (1.3 g, 6.6 mmol) in 20 mL anhydrous DCM was added compound 1 (HCl form, 1.7 g, 5.3 mmol) and diisopropylethylamine (683 mg, 5.3 mmol) in 10 mL anhydrous DCM. The mixture was stirred at room temperature for 2 hours.
- SFK was incorporated into mNb6, a nanobody specific for the SARS-CoV-2 Spike protein, in E. coli.
- the mNb6 gene containing a TAG codon at position 54 (mNb6-54TAG) was co-expressed with the tRNA Pyl /FSKRS pair in E. coli with SFK added in the media.
- the purified mNb6(54SFK) protein was analyzed by electrospray ionization time-of-flight mass spectrometry (ESI-TOF MS). A peak observed at 13768 Da corresponds to the intact protein mass of mNb6-54SFK (expected 13770.68 Da).
- HeLa-GFP-182TAG reporter cells, which contain a genome-integrated GFP(182TAG) gene, were transfected with plasmid pMP-FSKRS-3xtRNA. In the presence of 1 mM SFK in media, strong GFP fluorescence was observed. No fluorescence signal was detected without SFK addition (FIGS. 4A-4B).
- MBP maltose binding protein
- a protein band corresponding to the cross-linked MBP-Z with Affibody was clearly observed for Affibody(7H), Affibody(7K), and Affibody(7Y), with cross-linking efficiency (determined by band intensities) of 30.2%, 51.1%, and 62.6% after 22 h incubation, respectively.
- the cross-linking efficiency also increased with incubation time.
- MBP-Z(24FSK) was purified and incubated with Affibody(7H), Affibody(7K), or Affibody(7Y)
- no cross-linking band was detected at the position of corresponding molecular weight (FIGS. 5A-5C).
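As an illustrative aside on the intact-mass result reported above for mNb6(54SFK), the following minimal Python sketch computes the deviation between the observed peak (13768 Da) and the expected mass (13770.68 Da). The function name `mass_deviation` and the choice to report the error in both Da and ppm are assumptions made for illustration; they are not part of the disclosed method.

```python
def mass_deviation(observed_da: float, expected_da: float) -> tuple[float, float]:
    """Return (absolute deviation in Da, relative deviation in ppm)."""
    delta_da = observed_da - expected_da
    delta_ppm = delta_da / expected_da * 1e6
    return delta_da, delta_ppm


# Values taken from the ESI-TOF MS description above.
delta_da, delta_ppm = mass_deviation(13768.0, 13770.68)
print(f"deviation: {delta_da:.2f} Da ({delta_ppm:.0f} ppm)")
```

Whether a given deviation is acceptable depends on the instrument and deconvolution settings, which are not specified here.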
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Peptides Or Proteins (AREA)
Abstract
Provided herein are, inter alia, unnatural amino acids, proteins comprising unnatural amino acids, biomolecule conjugates, and methods of making the unnatural amino acids, proteins, and biomolecule conjugates. In embodiments, the unnatural amino acid is a compound of Formula (I) or a stereoisomer thereof: wherein the substituents are defined herein.
Description
BIOREACTIVE PROTEINS CONTAINING UNNATURAL AMINO ACIDS
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of priority to US Application No. 63/421,974 filed November 2, 2022, the disclosure of which is incorporated by reference herein.
STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT
[0002] This invention was made with government support under R01 GM118384 awarded by The National Institutes of Health. The government has certain rights in the invention.
REFERENCE TO A "SEQUENCE LISTING," A TABLE, OR A COMPUTER PROGRAM LISTING APPENDIX SUBMITTED AS AN ASCII FILE
[0003] The Sequence Listing written in XML format, entitled “048536-758001WO Sequence Listing,” created October 23, 2023, having 7,282 bytes, is incorporated by reference herein.
BACKGROUND
[0004] Introducing new chemical bonds into proteins provides innovative avenues for manipulating protein structure and function. Unnatural amino acids (Uaas) containing diverse latent bioreactive functional groups have recently been introduced into proteins via genetic code expansion. This offers an exquisite tool not only to study cellular protein interactions but also create novel protein-based therapeutics. SuFEx click chemistry via the latent aryl fluorosulfate group has demonstrated value in aiding modular organic synthesis, chemical biology, and drug development. As set forth in US Publication No. 2021/0002325, the inventors incorporated fluorosulfate-L-tyrosine (FSY) into proteins for protein crosslinking and generating covalent protein drugs. There is a need in the art, inter alia, for new and other unnatural amino acids that can be used for protein identification, drug target discovery, or biotherapeutics. Provided herein are solutions to these and other needs in the art.
SUMMARY
[0005] Provided herein is a compound of Formula (I) or a stereoisomer thereof:
wherein the substituents are as defined herein.
[0006] Provided herein is a protein comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II):
wherein the substituents are as defined herein.
[0007] Provided herein is a biomolecule conjugate of Formula (III):
wherein the substituents are as defined herein.
[0008] These and other embodiments of the disclosure are provided in detail herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] FIGS. 1A-1C show the synthetic scheme and data demonstrating that SFK is a latent bioreactive unnatural amino acid (Uaa) for protein-protein cross-linking. FIG. 1A: synthetic scheme for SFK. FIG. 1B: incorporation of SFK into EGFP using tRNA Pyl /FSKRS, n = 3, values are mean ± SD. FIG. 1C: SDS-PAGE analysis of cross-linking between Afb7X and MBP-Z(24SFK).
[0010] FIGS. 2A-2G show embodiments of the compounds described herein. In FIGS. 2A-2C and 2G, R can be R1 as defined herein. In FIG. 2G, X, Y1, Y2, Z1, and Z2 can be O, N, S, and C.
[0011] FIG. 3 provides the synthetic scheme to prepare the compound shown in FIG. 2E.
[0012] FIGS. 4A-4B are fluorescence microscopy images of HeLa-GFP-182TAG reporter cells grown in the absence of SFK (FIG. 4A) or in the presence of SFK (FIG. 4B). The bar at the bottom right-hand corner represents a scale of 51 microns.
[0013] FIGS. 5A-5F are SDS-PAGE analysis of MBP-Z(24FSK) incubation with an affibody. MBP-Z(24FSK) incubation with Affibody(7H) (FIG. 5A), Affibody(7K) (FIG. 5B), and Affibody(7Y) (FIG. 5C) show no cross-linking. MBP-Z(24SFK) incubation with Affibody(7H) (FIG. 5D), Affibody(7K) (FIG. 5E), and Affibody(7Y) (FIG. 5F) show time-dependent crosslinking.
[0014] FIGS. 6A-6B are Western blot analyses of Spike RBD incubation with mNb6. FIG. 6A is a Western blot analysis of Spike RBD(E484K) incubation with mNb6 with FSK incorporated at indicated sites 50-59. FIG. 6B is a Western blot analysis of Spike RBD(E484K) incubation with mNb6 with SFK incorporated at indicated sites 50-59.
DETAILED DESCRIPTION
[0015] Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art. See, e.g., Singleton et al., Dictionary of Microbiology and Molecular Biology, 2nd ed., J. Wiley & Sons (New York, NY 1994); Sambrook et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press (Cold Spring Harbor, NY 1989). Any methods, devices and materials similar or equivalent to those described herein can be used in the practice of this disclosure. The following definitions are provided to facilitate understanding of certain terms used frequently herein and are not meant to limit the scope of the present disclosure.
[0017] The terms “fluorosulfonyloxybenzoyl-L-lysine” and “FSK” refer to the compound having the structure:
[0018] The term “antibody” is used according to its commonly known meaning in the art. Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The term “F(ab)'2” is used interchangeably with “Fab dimer.” The F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 1993)). The term “Fab' monomer” is used interchangeably with “Fab” and “antigen-binding fragment.” While various antibody fragments are defined in terms of the digestion of an intact antibody, one of skill will appreciate that such fragments may be synthesized de novo either chemically or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes antibody fragments either produced by the modification of whole antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those identified using phage display libraries.
[0019] Antibodies are large, complex proteins with an intricate internal structure. A natural antibody molecule contains two identical pairs of polypeptide chains, each pair having one light chain and one heavy chain. Each light chain and heavy chain in turn consists of two regions: a variable (“V”) region involved in binding the target antigen, and a constant (“C”) region that interacts with other components of the immune system. The light and heavy chain variable regions come together in 3-dimensional space to form a variable region that binds the antigen (for example, a receptor on the surface of a cell). Within each light or heavy chain variable region, there are three short segments (averaging 10 amino acids in length) called the complementarity determining regions (“CDRs”). The six CDRs in an antibody variable domain (three from the light chain and three from the heavy chain) fold up together in 3-dimensional space to form the actual antibody binding site which docks onto the target antigen. The position and length of the CDRs have been precisely defined by Kabat et al., Sequences of Proteins of Immunological Interest, U.S. Department of Health and Human Services, 1987. The part of a variable region not contained in the CDRs is called the framework (“FR”), which forms the environment for the CDRs.
[0020] An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one “light” and one “heavy” chain. The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains respectively. The Fc (i.e., fragment crystallizable region) is the “base” or “tail” of an immunoglobulin and is typically composed of two heavy chains that contribute two or three constant domains depending on the class of the antibody. By binding to specific proteins the Fc region ensures that each antibody generates an appropriate immune response for a given antigen. The Fc region also binds to various cell receptors, such as Fc receptors, and other immune molecules, such as complement proteins.
[0021] An “antibody variant” as provided herein refers to a polypeptide capable of binding to a receptor protein or an antigen and including one or more structural domains of an antibody or fragment thereof. Non-limiting examples of antibody variants include single-domain antibodies (nanobodies), affibodies (polypeptides smaller than monoclonal antibodies and capable of binding receptor proteins or antigens with high affinity and imitating monoclonal antibodies), antigen-binding fragments (Fab), Fab dimers (monospecific Fab2, bispecific Fab2), trispecific Fabs, monovalent IgGs, single-chain variable fragments (scFv), bispecific diabodies, trispecific triabodies, scFv-Fc, minibodies, IgNAR, V-NAR, hcIgG, VhH, and peptibodies. A “peptibody” as provided herein refers to a peptide moiety attached (through a covalent or non-covalent linker) to the Fc domain of an antibody.
[0022] A “single-domain antibody” or “nanobody” refers to an antibody fragment having a single monomeric variable antibody domain. Like a whole antibody, it is able to bind selectively to a specific antigen. In embodiments, the single domain antibody is a human or humanized single-domain antibody.
[0023] A single-chain variable fragment (scFv) is typically a fusion protein of the variable regions of the heavy (VH) and light chains (VL) of immunoglobulins, connected with a short linker peptide of 10 to about 25 amino acids. The linker is usually rich in glycine for flexibility, as well as serine or threonine for solubility. The linker can either connect the N-terminus of the VH with the C-terminus of the VL, or vice versa.
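The scFv layout described above can be sketched in a few lines of Python. This is a hedged illustration only: the repeating Gly-Gly-Gly-Gly-Ser unit is a commonly used glycine/serine-rich linker chosen here as an assumption, the strict 10 to 25 residue check is a simplification of "10 to about 25", and the function name `build_scfv` and the placeholder domain sequences are hypothetical and not taken from this disclosure.

```python
def build_scfv(vh: str, vl: str, repeats: int = 3, order: str = "VH-VL") -> str:
    """Join VH and VL one-letter sequences with a (GGGGS)n linker.

    The linker unit and the strict 10-25 residue bound are illustrative
    assumptions, not requirements stated in the text.
    """
    linker = "GGGGS" * repeats
    if not 10 <= len(linker) <= 25:
        raise ValueError("linker length outside the 10 to 25 residue range")
    if order == "VH-VL":
        # C-terminus of VH is linked to the N-terminus of VL.
        return vh + linker + vl
    if order == "VL-VH":
        # C-terminus of VL is linked to the N-terminus of VH.
        return vl + linker + vh
    raise ValueError("order must be 'VH-VL' or 'VL-VH'")


# Placeholder sequences, not real variable domains.
print(build_scfv("EVQLVESGG", "DIQMTQSPS"))
```

Both orientations are supported because the text allows the linker to join the domains in either direction.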
[0024] Antibodies, e.g., recombinant, monoclonal, or polyclonal antibodies, can be prepared by techniques well known in the art. The genes encoding the heavy and light chains of an antibody of interest can be cloned from a cell, e.g., the genes encoding a monoclonal antibody can be cloned from a hybridoma and used to produce a recombinant monoclonal antibody. Gene libraries encoding heavy and light chains of monoclonal antibodies can also be made from hybridoma or plasma cells. Random combinations of the heavy and light chain gene products generate a large pool of antibodies with different antigenic specificity. Techniques for the production of single chain antibodies or recombinant antibodies can be adapted to produce antibodies to polypeptides. Also, transgenic mice, or other organisms such as other mammals, may be used to express humanized or human antibodies. Alternatively, phage display technology can be used to identify antibodies and heteromeric Fab fragments that specifically bind to selected antigens. Antibodies can also be made bispecific, i.e., able to recognize two different antigens. Antibodies can also be heteroconjugates, e.g., two covalently joined antibodies, or immunotoxins.
[0025] The epitope of an antibody is the region of its antigen to which the antibody binds. Two antibodies bind to the same or overlapping epitope if each competitively inhibits (blocks) binding of the other to the antigen. That is, a 1x, 5x, 10x, 20x or 100x excess of one antibody inhibits binding of the other by at least 30% but preferably 50%, 75%, 90% or even 99% as measured in a competitive binding assay (see, e.g., Junghans et al., Cancer Res. 50: 1495, 1990). Alternatively, two antibodies have the same epitope if essentially all amino acid mutations in the antigen that reduce or eliminate binding of one antibody reduce or eliminate binding of the other. Two antibodies have overlapping epitopes if some amino acid mutations that reduce or eliminate binding of one antibody reduce or eliminate binding of the other.
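The competitive-inhibition criterion above can be illustrated with a short Python sketch. It assumes that binding is read out as a single background-corrected signal with and without the competing antibody; the function names `percent_inhibition` and `competes` and the example readings are hypothetical, and only the 30% default threshold comes from the text.

```python
def percent_inhibition(signal_alone: float, signal_with_competitor: float) -> float:
    """Percent reduction in binding caused by an excess of a competing antibody."""
    if signal_alone <= 0:
        raise ValueError("signal_alone must be positive")
    return 100.0 * (signal_alone - signal_with_competitor) / signal_alone


def competes(signal_alone: float, signal_with_competitor: float,
             threshold_pct: float = 30.0) -> bool:
    """True if inhibition meets the threshold (30% is the minimum named in the text)."""
    return percent_inhibition(signal_alone, signal_with_competitor) >= threshold_pct


# Hypothetical assay readings in arbitrary units.
print(percent_inhibition(100.0, 40.0), competes(100.0, 40.0))
```

Because the definition requires that each antibody inhibit the other, a full check would apply `competes` in both directions.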
[0026] Methods for humanizing or primatizing non-human antibodies are well known in the art. Generally, a humanized antibody has one or more amino acid residues introduced into it from a source which is non-human. These non-human amino acid residues are often referred to as import residues, which are typically taken from an import variable domain. Humanization can be performed by methods known in the art. Accordingly, such humanized antibodies are chimeric antibodies, wherein substantially less than an intact human variable domain has been substituted by the corresponding sequence from a non-human species. In practice, humanized antibodies are typically human antibodies in which some CDR residues and possibly some FR residues are substituted by residues from analogous sites in rodent antibodies. For example, polynucleotides comprising a first sequence coding for humanized immunoglobulin framework regions and a second sequence set coding for the desired immunoglobulin complementarity determining regions can be produced synthetically or by combining appropriate cDNA and genomic DNA segments. Human constant region DNA sequences can be isolated in accordance with well known procedures from a variety of human cells.
[0027] A “chimeric antibody” is an antibody molecule in which (i) the constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site (variable region) is linked to a constant region of a different or altered class, effector function and/or species, or an entirely different molecule which confers new properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (ii) the variable region, or a portion thereof, is altered, replaced or exchanged with a variable region having a different or altered antigen specificity. In embodiments, the antibodies described herein include humanized and/or chimeric monoclonal antibodies.
[0028] The phrase “specifically (or selectively) binds” to an antibody or a receptor protein or “specifically (or selectively) immunoreactive with” when referring to a protein refers to a binding reaction that is determinative of the presence of the protein, often in a heterogeneous
population of proteins and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bind to a particular protein at least two times the background and more typically more than 10 to 100 times background. Specific binding to an antibody under such conditions requires an antibody that is selected for its specificity for a particular protein. For example, polyclonal antibodies can be selected to obtain only a subset of antibodies that are specifically immunoreactive with the selected antigen and not with other proteins. This selection may be achieved by subtracting out antibodies that cross-react with other molecules. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Using Antibodies, A Laboratory Manual (1998) for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity).
[0029] “Receptor protein” or “membrane receptor” refers to a receptor (protein) that is embedded in the plasma membrane of a cell. In embodiments, the receptor protein is located in the extracellular domain of a cell, the transmembrane domain of a cell, or the intracellular domain of a cell. In embodiments, the receptor protein is a cell-surface receptor. In embodiments, the receptor protein is in the extracellular domain. In embodiments, the receptor protein is in the transmembrane domain. In embodiments, the receptor protein is an ion channel-linked receptor, an enzyme-linked receptor, or a G protein-coupled receptor. In embodiments, the receptor protein is a hormone receptor.
[0030] The term “peptidyl moiety” as used herein refers to a protein, protein fragment, or peptide that may form part of a biomolecule or a biomolecule conjugate. In aspects, the peptidyl moiety forms part of a biomolecule (e.g., protein). In aspects, the peptidyl moiety forms part of a biomolecule (e.g., protein) conjugate. The peptidyl moiety may also be substituted with additional chemical moieties (e.g., additional R substituents). In aspects, the peptidyl moiety forms part of an antibody or an antibody variant. In aspects, the peptidyl moiety forms part of a receptor protein. In aspects, a peptidyl moiety is a protein, protein fragment, or peptide that contains a monovalent radical of an amino acid.
[0031] The term “amino acid moiety” refers to a monovalent amino acid.
[0032] The term “carbohydrate moiety” as used herein refers to carbohydrates, for example, polyhydroxy aldehydes, ketones, alcohols, acids, their simple derivatives and their polymers having linkages of the acetal type, that may form part of a biomolecule or a biomolecule conjugate. In aspects, the carbohydrate moiety forms part of a biomolecule. In aspects, the
carbohydrate moiety forms part of a biomolecule conjugate. The carbohydrate moiety may also be substituted with additional chemical moieties (e.g., additional R substituents).
[0033] The term “nucleic acid moiety” as used herein refers to nucleic acids, for example, DNA and RNA, that may form part of a biomolecule or biomolecule conjugate. In aspects, the nucleic acid moiety forms part of a biomolecule. In aspects, the nucleic acid moiety forms part of a biomolecule conjugate. The nucleic acid moiety may also be substituted with additional chemical moieties (e.g., additional R substituents).
[0034] The term “lipid moiety” refers to a lipid or lipid fragment. The lipid may be substituted with additional chemical moieties. In embodiments, a lipid moiety is a monovalent radical of a lipid.
[0035] The term “RNA moiety” refers to an RNA, as described herein. In embodiments, an RNA moiety is a monovalent radical of RNA. In aspects, an RNA moiety is an RNA containing a monovalent radical of a nucleotide.
[0036] The term “RNA-binding protein moiety” refers to a protein, as described herein. In embodiments, an RNA-binding moiety is a monovalent radical of an RNA-binding protein, such as a monovalent radical of a CRISPR protein or a monovalent radical of an RNA chaperone.
[0037] “Nucleic acid” refers to nucleotides (e.g., deoxyribonucleotides or ribonucleotides) and polymers thereof in either single-, double- or multiple-stranded form, or complements thereof. The terms “polynucleotide,” “oligonucleotide,” “oligo” or the like refer, in the usual and customary sense, to a linear sequence of nucleotides. The term “nucleotide” refers, in the usual and customary sense, to a single unit of a polynucleotide, i.e., a monomer. Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof. Examples of polynucleotides contemplated herein include single and double stranded DNA, single and double stranded RNA, and hybrid molecules having mixtures of single and double stranded DNA and RNA. Examples of nucleic acids, e.g., polynucleotides, contemplated herein include any types of RNA, e.g., mRNA, siRNA, miRNA, and guide RNA, and any types of DNA, e.g., genomic DNA, plasmid DNA, and minicircle DNA, and any fragments thereof. The term “duplex” in the context of polynucleotides refers, in the usual and customary sense, to double strandedness. Nucleic acids can be linear or branched. For example, nucleic acids can be a linear chain of nucleotides or the nucleic acids can be branched, e.g., such that the nucleic acids comprise one or more arms or branches of nucleotides. Optionally, the branched nucleic acids are repetitively branched to form higher ordered structures such as dendrimers and the like.
[0038] Nucleic acids, including e.g., nucleic acids with a phosphothioate backbone, can include one or more reactive moieties. As used herein, the term reactive moiety includes any group capable of reacting with another molecule, e.g., a nucleic acid or polypeptide through covalent, non-covalent or other interactions. By way of example, the nucleic acid can include an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide through a covalent, non-covalent or other interaction.
[0039] The terms also encompass nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphodiester derivatives including, e.g., phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphorothioate having double bonded sulfur replacing oxygen in the phosphate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press) as well as modifications to the nucleotide bases such as in 5-methyl cytidine or pseudouridine and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; nonionic backbones, modified sugars, and non-ribose backbones (e.g., phosphorodiamidate morpholino oligos or locked nucleic acids (LNA) as known in the art), including those described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, Glycan Modifications in Antisense Research, Sanghui & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
In embodiments, the internucleotide linkages in DNA are phosphodiester, phosphodiester derivatives, or a combination of both.
[0040] Nucleic acids can include nonspecific sequences. As used herein, the term “nonspecific sequence” refers to a nucleic acid sequence that contains a series of residues that are not designed to be complementary to or are only partially complementary to any other nucleic acid sequence. By way of example, a nonspecific nucleic acid sequence is a sequence of nucleic acid residues that does not function as an inhibitory nucleic acid when contacted with a cell or
organism.
[0041] A polynucleotide is typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T) when the polynucleotide is RNA). Thus, the term “polynucleotide sequence” is the alphabetical representation of a polynucleotide molecule; alternatively, the term may be applied to the polynucleotide molecule itself. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching. Polynucleotides may optionally include one or more non-standard nucleotide(s), nucleotide analog(s) and/or modified nucleotides.
[0042] The term “complement,” as used herein, refers to a nucleotide (e.g., RNA or DNA) or a sequence of nucleotides capable of base pairing with a complementary nucleotide or sequence of nucleotides. As described herein and commonly known in the art, the complementary (matching) nucleotide of adenosine is thymidine and the complementary (matching) nucleotide of guanosine is cytosine. Thus, a complement may include a sequence of nucleotides that base pair with corresponding complementary nucleotides of a second nucleic acid sequence. The nucleotides of a complement may partially or completely match the nucleotides of the second nucleic acid sequence. Where the nucleotides of the complement completely match each nucleotide of the second nucleic acid sequence, the complement forms base pairs with each nucleotide of the second nucleic acid sequence. Where the nucleotides of the complement partially match the nucleotides of the second nucleic acid sequence, only some of the nucleotides of the complement form base pairs with nucleotides of the second nucleic acid sequence. Examples of complementary sequences include coding and non-coding sequences, wherein the non-coding sequence contains complementary nucleotides to the coding sequence and thus forms the complement of the coding sequence. A further example of complementary sequences are sense and antisense sequences, wherein the sense sequence contains complementary nucleotides to the antisense sequence and thus forms the complement of the antisense sequence.
[0043] As described herein the complementarity of sequences may be partial, in which only some of the nucleic acids match according to base pairing, or complete, where all the nucleic acids match according to base pairing. Thus, two sequences that are complementary to each other, may have a specified percentage of nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region).
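The distinction between partial and complete complementarity can be illustrated with a minimal Python sketch. It is not part of the disclosure: the names `complement` and `percent_complementarity` are hypothetical, and the sketch assumes two ungapped strands of equal length, both written 5' to 3' and paired in antiparallel orientation, with only Watson-Crick pairs counted.

```python
# Illustrative sketch only; function names are hypothetical.
# Watson-Crick pairs for DNA (A-T, G-C) and RNA (A-U, G-C).
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def complement(seq, rna=False):
    """Return the base-by-base complement of seq (same order, not reversed)."""
    table = {"A": "U" if rna else "T", "T": "A", "U": "A", "G": "C", "C": "G"}
    return "".join(table[base] for base in seq.upper())


def percent_complementarity(strand, other):
    """Percentage of positions in `strand` that base pair with `other`.

    Assumes both strands are written 5'->3', have equal length, and pair
    in antiparallel orientation, so `other` is read in reverse.
    """
    if len(strand) != len(other) or not strand:
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        (a, b) in PAIRS for a, b in zip(strand.upper(), reversed(other.upper()))
    )
    return 100.0 * paired / len(strand)
```

Under these assumptions, `percent_complementarity("ATGC", "GCAT")` corresponds to complete complementarity, while a strand pair with one mismatched position out of four corresponds to partial complementarity.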
[0044] The term “amino acid” refers to naturally occurring and synthetic amino acids, as well
as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid. The terms “non-naturally occurring amino acid” and “unnatural amino acid” refer to amino acid analogs, synthetic amino acids, and amino acid mimetics which are not found in nature.
[0045] Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
[0046] The term “amino acid side chain” refers to the functional substituent contained on amino acids. For example, an amino acid side chain may be the side chain of a naturally occurring amino acid. Naturally occurring amino acids are those encoded by the genetic code (e.g., alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, or valine), as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. In aspects, the amino acid side chain may be a non-natural amino acid side chain. In aspects, the amino acid side chain is H,
[0047] The term “non-natural amino acid side chain” or “unnatural amino acid side chain” refers to the functional substituent of compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium, allylalanine, 2-aminoisobutyric acid. Non-natural amino acids are non-proteinogenic amino acids that either occur naturally or are chemically synthesized. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Non-limiting examples include exo-cis-3-aminobicyclo[2.2.1]hept-5-ene-2-carboxylic acid hydrochloride, cis-2-aminocycloheptane-carboxylic acid hydrochloride, cis-6-amino-3-cyclohexene-1-carboxylic acid hydrochloride, cis-2-amino-2-methylcyclohexanecarboxylic acid hydrochloride, cis-2-amino-2-methylcyclopentane-carboxylic acid hydrochloride, 2-(Boc-aminomethyl)benzoic acid, 2-(Boc-amino)octanedioic acid, Boc-4,5-dehydro-Leu-OH (dicyclohexylammonium), Boc-4-(Fmoc-amino)-L-phenylalanine, Boc-β-Homopyr-OH, Boc-(2-indanyl)-Gly-OH,
4-Boc-3-morpholineacetic acid, 4-Boc-3-morpholineacetic acid, Boc-pentafluoro-D-phenylalanine, Boc-pentafluoro-L-phenylalanine, Boc-Phe(2-Br)-OH, Boc-Phe(4-Br)-OH, Boc-D-Phe(4-Br)-OH, Boc-D-Phe(3-Cl)-OH, Boc-Phe(4-NH2)-OH, Boc-Phe(3-NO2)-OH, Boc-Phe(3,5-F2)-OH, 2-(4-Boc-piperazino)-2-(3,4-dimethoxyphenyl)acetic acid purum, 2-(4-Boc-piperazino)-2-(2-fluorophenyl)acetic acid purum, 2-(4-Boc-piperazino)-2-(3-fluorophenyl)acetic acid purum, 2-(4-Boc-piperazino)-2-(4-fluorophenyl)acetic acid purum, 2-(4-Boc-piperazino)-2-(4-methoxyphenyl)acetic acid purum, 2-(4-Boc-piperazino)-2-phenylacetic acid purum, 2-(4-Boc-piperazino)-2-(3-pyridyl)acetic acid purum, 2-(4-Boc-piperazino)-2-[4-(trifluoromethyl)phenyl]acetic acid purum, Boc-β-(2-quinolyl)-Ala-OH, N-Boc-1,2,3,6-tetrahydro-2-pyridinecarboxylic acid, Boc-β-(4-thiazolyl)-Ala-OH, Boc-β-(2-thienyl)-D-Ala-OH, Fmoc-N-(4-Boc-aminobutyl)-Gly-OH, Fmoc-N-(2-Boc-aminoethyl)-Gly-OH, Fmoc-N-(2,4-dimethoxybenzyl)-Gly-OH, Fmoc-(2-indanyl)-Gly-OH, Fmoc-pentafluoro-L-phenylalanine, Fmoc-Pen(Trt)-OH, Fmoc-Phe(2-Br)-OH, Fmoc-Phe(4-Br)-OH, Fmoc-Phe(3,5-F2)-OH, Fmoc-β-(4-thiazolyl)-Ala-OH, Fmoc-β-(2-thienyl)-Ala-OH, 4-(Hydroxymethyl)-D-phenylalanine.
[0048] “Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, “conservatively modified variants”
refers to those nucleic acids that encode identical or essentially identical amino acid sequences. Because of the degeneracy of the genetic code, a number of nucleic acid sequences will encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
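As an illustration of silent variations, the following Python sketch enumerates every sequence that encodes the same peptide as a given coding sequence. It is a hedged example rather than anything taken from the disclosure: the names `SYNONYMOUS` and `silent_variants` are hypothetical, the codon table is deliberately partial (only the alanine, methionine, and tryptophan codons named above), and codons are written in the RNA alphabet, so the tryptophan codon appears as UGG.

```python
from itertools import product

# Partial, illustrative codon table in the RNA alphabet (hypothetical name).
# Only the three amino acids mentioned in the text are covered.
SYNONYMOUS = {
    "Ala": ["GCA", "GCC", "GCG", "GCU"],
    "Met": ["AUG"],  # ordinarily the only codon for methionine
    "Trp": ["UGG"],  # ordinarily the only codon for tryptophan
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMOUS.items() for codon in codons}


def silent_variants(rna):
    """List all sequences encoding the same peptide as `rna`.

    Raises KeyError for codons outside the partial table above.
    """
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    choices = [SYNONYMOUS[CODON_TO_AA[codon]] for codon in codons]
    return ["".join(combo) for combo in product(*choices)]
```

For example, a Met-Ala sequence has four silent variants (one per alanine codon), whereas a Met-Trp sequence has only itself, reflecting the exception noted for AUG and the tryptophan codon.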
[0049] As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the disclosure.
[0050] The following eight groups each contain amino acids that are conservative substitutions for one another: (i) Alanine (A), Glycine (G); (ii) Aspartic acid (D), Glutamic acid (E); (iii) Asparagine (N), Glutamine (Q); (iv) Arginine (R), Lysine (K); (v) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); (vi) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); (vii) Serine (S), Threonine (T); and (viii) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
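The eight groups can be encoded directly as sets, as in the following minimal Python sketch. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and the sketch assumes that a substitution counts as conservative only when the two one-letter residues differ and share at least one group.

```python
# Illustrative encoding of the eight groups listed above (hypothetical names).
CONSERVATIVE_GROUPS = [
    set("AG"),    # (i)    Alanine, Glycine
    set("DE"),    # (ii)   Aspartic acid, Glutamic acid
    set("NQ"),    # (iii)  Asparagine, Glutamine
    set("RK"),    # (iv)   Arginine, Lysine
    set("ILMV"),  # (v)    Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # (vi)   Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # (vii)  Serine, Threonine
    set("CM"),    # (viii) Cysteine, Methionine
]


def is_conservative(original, replacement):
    """True if two different residues share at least one group."""
    a, b = original.upper(), replacement.upper()
    return a != b and any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Because methionine appears in both group (v) and group (viii), it is a conservative replacement for leucine as well as for cysteine, although leucine and cysteine are not conservative replacements for each other under this grouping.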
[0051] The terms “protein,” “polypeptide,” and “peptide” are used interchangeably herein to refer to a polymer of amino acid residues. The polymer of amino acids may, in embodiments, be conjugated to a moiety that does not consist of amino acids. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. A “fusion protein” refers to a chimeric protein encoding two or more separate protein sequences that are recombinantly
expressed as a single moiety.
[0052] An amino acid or nucleotide base “position” is denoted by a number that sequentially identifies each amino acid (or nucleotide base) in the reference sequence based on its position relative to the N-terminus (or 5'-end). Due to deletions, insertions, truncations, fusions, and the like that must be taken into account when determining an optimal alignment, in general the amino acid residue number in a test sequence determined by simply counting from the N-terminus will not necessarily be the same as the number of its corresponding position in the reference sequence. For example, in a case where a variant has a deletion relative to an aligned reference sequence, there will be no amino acid in the variant that corresponds to a position in the reference sequence at the site of deletion. Where there is an insertion in an aligned reference sequence, that insertion will not correspond to a numbered amino acid position in the reference sequence. In the case of truncations or fusions there can be stretches of amino acids in either the reference or aligned sequence that do not correspond to any amino acid in the corresponding sequence.
[0053] The terms “numbered with reference to” or “corresponding to,” when used in the context of the numbering of a given amino acid or polynucleotide sequence, refer to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence.
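Numbering with reference to an aligned reference sequence can be sketched in Python as follows. This is an illustration under stated assumptions, not a method from the disclosure: the name `map_positions` is hypothetical, the two inputs are assumed to be rows of an existing pairwise alignment of equal length, and `-` is assumed to mark a gap.

```python
def map_positions(ref_aligned, test_aligned, gap="-"):
    """Map 1-based reference positions to 1-based test-sequence positions.

    A reference position maps to None when the test sequence has a
    deletion (gap) at that alignment column. Insertions in the test
    sequence receive no reference number and so do not appear as keys.
    """
    if len(ref_aligned) != len(test_aligned):
        raise ValueError("aligned rows must have equal length")
    mapping = {}
    ref_pos = test_pos = 0
    for ref_char, test_char in zip(ref_aligned, test_aligned):
        if ref_char != gap:
            ref_pos += 1
        if test_char != gap:
            test_pos += 1
        if ref_char != gap:
            mapping[ref_pos] = test_pos if test_char != gap else None
    return mapping
```

With a reference row `MKTAYIAK` and a test row `MK-AYIAK`, reference position 3 has no counterpart in the test sequence, and reference position 4 corresponds to the third residue counted from the N-terminus of the test sequence.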
[0054] An amino acid residue in a protein “corresponds” to a given residue when it occupies the same essential structural position within the protein as the given residue. For example, a selected residue in a selected protein corresponds to a specific position (e.g., A100) of a protein when the selected residue occupies the same essential spatial or other structural relationship as that specific position (e.g., A100) of the protein. In embodiments, where a selected protein is aligned for maximum homology with the protein, the position in the aligned selected protein aligning with that specific position (e.g., A100) is said to correspond to that specific residue (e.g., A100). Instead of a primary sequence alignment, a three dimensional structural alignment can also be used, e.g., where the structure of the selected protein is aligned for maximum correspondence with the protein and the overall structures compared. In this case, an amino acid that occupies the same essential position as that specific position (e.g., A100) in the structural model is said to correspond to that specific position (e.g., A100).
[0055] “Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as
compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
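The calculation described above can be written as a short Python sketch. The name `percent_identity` is hypothetical; the sketch assumes the two sequences have already been optimally aligned, that the comparison window is the full length of the alignment, and that `-` marks a gap that never counts as a match.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of sequence identity over an already-aligned window.

    Matched positions are columns where both rows carry the same
    non-gap residue; the denominator is the total number of columns.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned rows must be non-empty and of equal length")
    matched = sum(a == b and a != gap for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matched / len(aligned_a)
```

For the aligned rows `ACGT-A` and `ACCTGA`, four of the six columns match, giving about 66.7% identity.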
[0056] The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, or at least 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (e.g., NCBI web site ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
[0057] The term “biomolecule” as used herein refers to large macromolecules such as, for example, proteins, lipids, and nucleic acids, as well as small molecules such as, for example, primary and secondary metabolites. In embodiments, the term biomolecule refers to a protein. In embodiments, the term biomolecule refers to an RNA-binding protein. In embodiments, the term biomolecule refers to RNA. In embodiments, the term biomolecule refers to a receptor protein.
[0058] The term “biomolecule moiety” as used herein refers to biomolecules, including large macromolecules such as, for example, proteins, lipids, and nucleic acids, as well as small molecules such as, for example, primary and secondary metabolites. Thus, in embodiments, the biomolecule moiety is a peptidyl moiety, a lipid moiety or a nucleic acid moiety. Biomolecule moieties may form part of a molecule (e.g., biomolecule). For example, biomolecule moieties may form part of a biomolecule conjugate, where the biomolecule conjugate includes two or more biomolecule moieties. In embodiments, the biomolecule conjugate includes two or more biomolecule moieties conjugated via a bioconjugate linker.
[0059] The term “pyrrolysyl-tRNA synthetase” refers to an enzyme (including homologs, isoforms, and functional fragments thereof) with pyrrolysyl-tRNA synthetase activity. Pyrrolysyl-tRNA synthetase is an aminoacyl-tRNA synthetase that catalyzes the reaction necessary to attach the α-amino acid pyrrolysine to the cognate tRNA (tRNAPyl), thereby allowing incorporation of pyrrolysine during proteinogenesis at amber stop codons (i.e., UAG). The term includes any recombinant or naturally-occurring form of pyrrolysyl-tRNA synthetase or variants, homologs, or isoforms thereof that maintain pyrrolysyl-tRNA synthetase activity (e.g., within at least 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% activity compared to wildtype pyrrolysyl-tRNA synthetase). In embodiments, the variants, homologs, or isoforms have at least 90%, 95%, 96%, 97%, 98%, 99% or 100% amino acid sequence identity across the whole sequence or a portion of the sequence (e.g., a 50, 100, 150 or 200 continuous amino acid portion) compared to a naturally occurring pyrrolysyl-tRNA synthetase. In embodiments, the mutant pyrrolysyl-tRNA synthetase catalyzes the attachment of the compound of Formula (I) and embodiments thereof to a tRNAPyl. In embodiments, the mutant pyrrolysyl-tRNA synthetase catalyzes the attachment of the compounds described herein and embodiments thereof to a tRNAPyl. In embodiments, the pyrrolysyl-tRNA synthetase comprises the amino acid sequence set forth as SEQ ID NO: 1.
[0060] The term “mutant pyrrolysyl-tRNA synthetase” or “mutant PylRS” refers to any pyrrolysyl-tRNA synthetase that has a different amino acid sequence from the wild-type amino acid sequence.
[0061] The terms “tRNAPyl,” “tRNAPylCUA,” and “tRNA(Pyl)CUA” (i.e., tRNA(superscript Pyl)(subscript CUA)) are used interchangeably and all refer to a single-stranded RNA molecule containing about 70 to 90 nucleotides which fold via intrastrand base pairing to form a characteristic cloverleaf structure that carries a specific amino acid (e.g., compound of Formula (I) or embodiments thereof; compound of Formula (IV) or embodiments thereof; compound of Formula (VII) or embodiments thereof) and matches it to its corresponding codon (i.e., a codon complementary to the anticodon of the tRNA) on an mRNA during protein synthesis. In tRNAPyl, the anticodon is CUA. Anticodon CUA is complementary to amber stop codon UAG. In embodiments, the tRNAPyl comprises an anticodon. In embodiments, the anticodon is CUA, TTA, or TCA. In embodiments, the tRNAPyl comprises an anticodon, wherein the anticodon comprises at least one non-canonical base. The abbreviation “Pyl” of tRNAPyl stands for pyrrolysine and the “CUA” of tRNAPyl refers to its anticodon CUA. In embodiments, tRNAPyl is attached to a compound described herein, including embodiments thereof.
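The relationship between the CUA anticodon and the amber stop codon UAG can be checked with a minimal Python sketch. The names `codon_for_anticodon` and `suppressed_positions` are hypothetical; the sketch assumes that anticodon and codon are both written 5' to 3' in the RNA alphabet, that they pair in antiparallel orientation, and that only Watson-Crick pairs (no wobble pairing) are considered.

```python
# Illustrative sketch only; names are hypothetical.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon):
    """Return the mRNA codon (5'->3') complementary to an anticodon (5'->3')."""
    return "".join(PAIR[base] for base in reversed(anticodon.upper()))


def suppressed_positions(mrna, anticodon="CUA"):
    """1-based codon positions in `mrna` that the given anticodon would read.

    The reading frame is assumed to start at the first base; any trailing
    bases that do not complete a codon are ignored.
    """
    target = codon_for_anticodon(anticodon)
    usable = len(mrna) - len(mrna) % 3
    codons = [mrna[i:i + 3].upper() for i in range(0, usable, 3)]
    return [index + 1 for index, codon in enumerate(codons) if codon == target]
```

Here `codon_for_anticodon("CUA")` yields `UAG`, and for the message `AUGGCAUAGGCUUAA` only the third codon is read by the CUA anticodon; the terminal UAA codon is not.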
[0062] The term “substrate-binding site” as used herein refers to residues located in the enzyme active site that form temporary bonds or interactions with the substrate. In embodiments, the substrate-binding site of pyrrolysyl-tRNA synthetase refers to residues located in the active site of pyrrolysyl-tRNA synthetase that form temporary bonds or interactions with the amino acid substrate.
[0063] The term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a “plasmid”, which refers to a linear or circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “expression vectors.” In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. The terms “plasmid” and “vector” can be used interchangeably as the plasmid is the most commonly used form of vector. However, the disclosure is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions. Some viral vectors are capable of targeting a particular cell type either specifically or non-specifically. Exemplary vectors that can be used include, but are not limited to, pEvol vector, pMP vector, pET vector, pTak vector, pBad vector.
[0064] The term “complex” refers to a composition that includes two or more components, where the components bind together to make a functional unit. In embodiments, a complex described herein include a mutant pyrrolysyl-tRNA synthetase described herein and an amino acid substrate (e.g., the compounds described herein, including embodiments thereof). In embodiments, a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein and a tRNA (e.g., tRNAPy). In embodiments, a complex described herein includes a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate and a tRNA (e.g., tRNAPy). In embodiments, a complex described herein includes at least two components selected from the group consisting of a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., the compound of Formula (I) or embodiments thereof), a polypeptide containing the compound of Formula (I) or embodiments thereof, and a
tRNA (e.g., tRNAPy). In embodiments, a complex described herein includes at least two components selected from the group consisting of a mutant pyrrolysyl-tRNA synthetase described herein, an amino acid substrate (e.g., a compound described herein, including embodiments thereof), a polypeptide containing a compound described herein, including embodiments thereof, and a tRNA (e.g., tRNAPy).
[0065] The term “protein/protein complex” refers to a composition that includes one protein-binding protein (e.g., comprising an unnatural amino acid as described herein) and one protein, where the protein-binding protein and protein are proximal to each other but not bound together; the protein-binding protein and protein are covalently bound together; or the protein-binding protein and protein are ionically bound together. In embodiments, the protein-binding protein and protein are proximal to each other but not bound together. In embodiments, the protein-binding protein and protein are covalently bonded together. In embodiments, the protein-binding protein and protein are ionically bonded together. In embodiments, the protein-binding protein and protein are covalently and ionically bonded together. In embodiments, the chemical reaction forming the protein/protein complex is a SuFEx reaction.
[0066] The terms “transfection”, “transduction”, “transfecting” or “transducing” can be used interchangeably and are defined as a process of introducing a nucleic acid molecule or a protein to a cell. Nucleic acids are introduced to a cell using non-viral or viral-based methods. The nucleic acid molecules may be gene sequences encoding complete proteins or functional portions thereof. Non-viral methods of transfection include any appropriate transfection method that does not use viral DNA or viral particles as a delivery system to introduce the nucleic acid molecule into the cell. Exemplary non-viral transfection methods include calcium phosphate transfection, liposomal transfection, nucleofection, sonoporation, transfection through heat shock, magnetofection and electroporation. In embodiments, the nucleic acid molecules are introduced into a cell using electroporation following standard procedures well known in the art. For viral-based methods of transfection, any useful viral vector may be used in the methods described herein. Examples of viral vectors include, but are not limited to, retroviral, adenoviral, lentiviral and adeno-associated viral vectors. In embodiments, the nucleic acid molecules are introduced into a cell using a retroviral vector following standard procedures well known in the art. The terms “transfection” or “transduction” also refer to introducing proteins into a cell from the external environment. Typically, transduction or transfection of a protein relies on attachment of a peptide or protein capable of crossing the cell membrane to the protein of interest.
[0067] The term “isolated,” when applied to a nucleic acid or protein, denotes that the nucleic acid or protein is essentially free of other cellular components with which it is associated in the natural state. It can be, for example, in a homogeneous state and may be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid chromatography. A protein that is the predominant species present in a preparation is substantially purified.
[0068] “Contacting” is used in accordance with its plain ordinary meaning and refers to the process of allowing at least two distinct species (e.g., chemical compounds including amino acids, proteins, peptides, biomolecules, or cells) to become sufficiently proximal to react, interact or physically touch. It should be appreciated, however, that the resulting reaction product can be produced directly from a reaction between the added reagents or from an intermediate from one or more of the added reagents that can be produced in the reaction mixture. The term “contacting” may include allowing two species to react, interact, or physically touch, wherein the two species may be biomolecule moieties as described herein. In some embodiments, contacting includes allowing two proteins or a protein and a glycan as described herein to interact.
[0069] A “detectable agent” or “detectable moiety” is a composition detectable by appropriate means such as spectroscopic, photochemical, biochemical, immunochemical, chemical, magnetic resonance imaging, or other physical means. In embodiments, the proteins described herein are bonded to a detectable agent. In embodiments, the fusion proteins described herein are bonded to a detectable agent. In embodiments, an antibody or antibody variant is bonded to a detectable agent. In embodiments, a nanobody is bonded to a detectable agent. In embodiments, the bond is noncovalent or covalent. In embodiments, the bond is covalent. In embodiments, the protein is covalently bonded to a detectable agent. In embodiments, the fusion protein is covalently bonded to a detectable agent. In embodiments, the antibody or antibody variant is covalently bonded to a detectable agent. In embodiments, a nanobody is covalently bonded to a detectable agent. In embodiments when the protein or fusion protein is covalently bonded to a detectable agent, the covalent bond is between the detectable agent and a naturally-occurring amino acid in the protein or fusion protein. In embodiments when the nanobody is covalently bonded to a detectable agent, the covalent bond is between the detectable agent and a naturally-occurring amino acid in the nanobody. Methods for covalently bonding detectable agents to proteins are well-known in the art. Detectable agents include 18F, 32P, 33P, 45Ti, 47Sc, 52Fe, 59Fe, 62Cu, 64Cu, 67Cu, 67Ga, 68Ga, 77As, 86Y, 90Y, 89Sr, 89Zr, 94Tc, 94Tc, 99mTc, 99Mo, 105Pd, 105Rh,
Cr, V, Mn, Fe, Co, Ni, Cu, La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, 32P, fluorophore (e.g., fluorescent dyes), electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, paramagnetic molecules, paramagnetic nanoparticles, ultrasmall superparamagnetic iron oxide (“USPIO”) nanoparticles, USPIO nanoparticle aggregates, superparamagnetic iron oxide (“SPIO”) nanoparticles, SPIO nanoparticle aggregates, monocrystalline iron oxide nanoparticles, monocrystalline iron oxide, nanoparticle contrast agents, liposomes or other delivery vehicles containing Gadolinium chelate (“Gd-chelate”) molecules, Gadolinium, radioisotopes, radionuclides (e.g., carbon-11, nitrogen-13, oxygen-15, fluorine-18, rubidium-82), fluorodeoxyglucose (e.g., fluorine-18 labeled), any gamma ray emitting radionuclides, positron-emitting radionuclide, radiolabeled glucose, radiolabeled water, radiolabeled ammonia, biocolloids, microbubbles (e.g., including microbubble shells including albumin, galactose, lipid, and/or polymers; microbubble gas core including air, heavy gases, perfluorocarbon, nitrogen, octafluoropropane, perflexane lipid microsphere, perflutren, etc.), iodinated contrast agents (e.g., iohexol, iodixanol, ioversol, iopamidol, ioxilan, iopromide, diatrizoate, metrizoate, ioxaglate), barium sulfate, thorium dioxide, gold, gold nanoparticles, gold nanoparticle aggregates, fluorophores, two-photon fluorophores, or haptens and proteins or other entities which can be made detectable, e.g., by incorporating a radiolabel into a peptide or antibody specifically reactive with a target peptide. A detectable moiety is a monovalent detectable agent or a detectable agent capable of forming a bond with another composition.
In embodiments, paramagnetic ions that may be used as imaging agents in accordance with the embodiments of the disclosure include, e.g., ions of transition and lanthanide metals (e.g., metals having atomic numbers of 21-29, 42, 43, 44, or 57-71). These metals include ions of Cr, V, Mn, Fe, Co, Ni, Cu, La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb and Lu.
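For illustration only, the atomic-number ranges recited above can be expressed as a simple membership check. This is a minimal sketch and not part of the disclosure: the names `PARAMAGNETIC_Z`, `NAMED_METALS`, and `is_listed_paramagnetic_z` are hypothetical, and the atomic numbers in the table are standard periodic-table values for the metals named in the text.

```python
# Atomic-number ranges recited in the text: 21-29, 42, 43, 44, and 57-71.
PARAMAGNETIC_Z = set(range(21, 30)) | {42, 43, 44} | set(range(57, 72))

# Standard atomic numbers of the metals named explicitly in the text.
NAMED_METALS = {
    "V": 23, "Cr": 24, "Mn": 25, "Fe": 26, "Co": 27, "Ni": 28, "Cu": 29,
    "La": 57, "Ce": 58, "Pr": 59, "Nd": 60, "Pm": 61, "Sm": 62, "Eu": 63,
    "Gd": 64, "Tb": 65, "Dy": 66, "Ho": 67, "Er": 68, "Tm": 69, "Yb": 70,
    "Lu": 71,
}


def is_listed_paramagnetic_z(z: int) -> bool:
    """Return True if atomic number z falls within the recited ranges."""
    return z in PARAMAGNETIC_Z


# Every metal named in the text falls inside the recited ranges.
assert all(is_listed_paramagnetic_z(z) for z in NAMED_METALS.values())
```

The check only reflects the recited atomic numbers; whether a given ion is paramagnetic in practice also depends on its oxidation state, which this sketch does not model.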
[0070] “Radioisotopes” that may be used as imaging and/or labeling agents in accordance with the embodiments of the disclosure include, but are not limited to, 18F, 32P, 33P, 45Ti, 47Sc, 52Fe, 59Fe, 62Cu, 64Cu, 67Cu, 67Ga, 68Ga, 77As, 86Y, 90Y, 89Sr, 89Zr, 94Tc, 94Tc, 99mTc, 99Mo, 105Pd,
and 225Ac. In embodiments, the proteins described herein are bonded to a radioisotope. In embodiments, the fusion proteins described herein are bonded to a radioisotope. In embodiments, an antibody or antibody variant is bonded to a radioisotope. In embodiments, a
nanobody is bonded to a radioisotope. In embodiments, the bond is noncovalent or covalent. In embodiments, the bond is covalent. In embodiments, the protein is covalently bonded to a radioisotope. In embodiments, the fusion protein is covalently bonded to a radioisotope. In embodiments, the antibody or antibody variant is covalently bonded to a radioisotope. In embodiments, a nanobody is covalently bonded to a radioisotope. In embodiments when the protein or fusion protein is covalently bonded to a radioisotope, the covalent bond is between the radioisotope and a naturally-occurring amino acid in the protein or fusion protein. In embodiments when the nanobody is covalently bonded to a radioisotope, the covalent bond is between the radioisotope and a naturally-occurring amino acid in the nanobody. Methods for covalently bonding radioisotopes to proteins are well-known in the art. In embodiments, the radioisotope is 123I, 124I, 125I, or 131I. In embodiments, the radioisotope is 123I. In embodiments, the radioisotope is 124I. In embodiments, the radioisotope is 125I. In embodiments, the radioisotope is 131I. In embodiments, the radioisotope is a positron-emitting radioisotope. In embodiments, the positron-emitting radioisotope is 11C, 13N, 15O, 18F, 64Cu, 68Ga, 78Br, 82Rb, 86Y, 89Zr, 90Y, 22Na, 26Al, 40K, 83Sr, or 124I. In embodiments, the positron-emitting radioisotope is 11C. In embodiments, the positron-emitting radioisotope is 13N. In embodiments, the positron-emitting radioisotope is 15O. In embodiments, the positron-emitting radioisotope is 18F. In embodiments, the positron-emitting radioisotope is 64Cu. In embodiments, the positron-emitting radioisotope is 68Ga. In embodiments, the positron-emitting radioisotope is 78Br. In embodiments, the positron-emitting radioisotope is 82Rb. In embodiments, the positron-emitting radioisotope is 86Y. In embodiments, the positron-emitting radioisotope is 89Zr. In embodiments, the positron-emitting radioisotope is 90Y.
In embodiments, the positron-emitting radioisotope is 22Na. In embodiments, the positron-emitting radioisotope is 26Al. In embodiments, the positron-emitting radioisotope is 40K. In embodiments, the positron-emitting radioisotope is 83Sr. In embodiments, the positron-emitting radioisotope is 124I. In embodiments, the radioisotope is an alpha-emitting radioisotope. In embodiments, the alpha-emitting radioisotope is 211At, 227Th, 225Ac, 223Ra, 213Bi, or 212Bi. In embodiments, the alpha-emitting radioisotope is 211At. In embodiments, the alpha-emitting radioisotope is 227Th. In embodiments, the alpha-emitting radioisotope is 225Ac. In embodiments, the alpha-emitting radioisotope is 223Ra. In embodiments, the alpha-emitting radioisotope is 213Bi. In embodiments, the alpha-emitting radioisotope is 212Bi.
[0071] The term “therapeutic agent” refers to any agent useful in treating and/or preventing a disease. “Therapeutic agent” includes, without limitation, small molecule drugs, proteins, nucleic acids (e.g., DNA, RNA), and the like. “Small-molecule drugs” refers to
chemical compounds with low molecular weight that are capable of treating and/or preventing diseases. In embodiments, the proteins described herein are bonded to a therapeutic agent. In embodiments, the fusion proteins described herein are bonded to a therapeutic agent. In embodiments, an antibody or antibody variant is bonded to a therapeutic agent. In embodiments, a nanobody is bonded to a therapeutic agent. In embodiments, the bond is noncovalent or covalent. In embodiments, the bond is covalent. In embodiments, the protein is covalently bonded to a therapeutic agent. In embodiments, the fusion protein is covalently bonded to a therapeutic agent. In embodiments, the antibody or antibody variant is covalently bonded to a therapeutic agent. In embodiments, a nanobody is covalently bonded to a therapeutic agent. In embodiments when the protein or fusion protein is covalently bonded to a therapeutic agent, the covalent bond is between the therapeutic agent and a naturally-occurring amino acid in the protein or fusion protein. In embodiments when the nanobody is covalently bonded to a therapeutic agent, the covalent bond is between the therapeutic agent and a naturally-occurring amino acid in the nanobody. Methods for covalently bonding therapeutic agents to proteins are well-known in the art.
[0072] The term “sulfur-fluoride exchange reaction” or “SuFEx” refers to a type of click chemistry as described in detail by, e.g., Dong et al., Angewandte Chemie, 53(36): 9430-9448 (2014); and Wang et al., J. Am. Chem. Soc., 140(15):4995-4999 (2018). The term “proximally-enabled” SuFEx refers to the sulfur-fluoride exchange reaction occurring when the reactive species are proximal to each other, i.e., spatially close enough for the SuFEx reaction to occur. The proximity may occur within a single biomolecule (e.g., protein) or between two different biomolecules (e.g., protein and RNA). The skilled artisan could readily determine whether the reactive species are sufficiently proximal for the reaction to occur, e.g., sulfur-fluoride exchange reaction between the compound of Formula (I) and RNA (e.g., a hydroxyl group on RNA). The skilled artisan could readily determine whether the reactive species are sufficiently proximal for the reaction to occur, e.g., sulfur-fluoride exchange reaction between the compound of Formula (IV) and a peptidyl moiety (e.g., having a tyrosine, lysine, or histidine), a nucleic acid moiety, or a carbohydrate moiety; or for example a sulfur-fluoride exchange reaction between the compound of Formula (I) and a nucleic acid moiety; or for example a sulfur-fluoride exchange reaction between the compound of Formula (VII) and a peptidyl moiety (e.g., having a tyrosine, lysine, or histidine), a nucleic acid moiety, or a carbohydrate moiety.
[0073] In embodiments, “proximal” means that two compounds (e.g., biomolecules, proteins, peptides, amino acids, glycans) are adjacent (e.g., but not covalently bonded together). In embodiments, “proximal” means up to about 25 angstroms. In embodiments, “proximal” means
up to about 20 angstroms. In embodiments, “proximal” means up to about 15 angstroms. In embodiments, “proximal” means up to about 10 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 25 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 20 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 15 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 12 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 10 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 8 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 6 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 5 angstroms. In embodiments, “proximal” means from about 1 angstrom to about 4 angstroms.
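As a hedged illustration of how the distance-based embodiments above might be evaluated computationally, the sketch below tests whether two atoms, given as Cartesian coordinates in angstroms, fall within a chosen range. It is not part of the disclosure: the function names, the default bounds, and the treatment of “about” as a hard cutoff are all simplifying assumptions made here.

```python
import math

# Upper bounds (in angstroms) recited in the embodiments above.
RECITED_UPPER_BOUNDS = (25.0, 20.0, 15.0, 12.0, 10.0, 8.0, 6.0, 5.0, 4.0)


def distance(a, b):
    """Euclidean distance between two 3D points given in angstroms."""
    return math.dist(a, b)


def is_proximal(a, b, upper=10.0, lower=0.0):
    """Return True if the two points are between `lower` and `upper` angstroms apart.

    Hard cutoffs are an assumption; the text says "about", which is looser.
    """
    return lower <= distance(a, b) <= upper


# Hypothetical coordinates: two atoms 5 angstroms apart.
atom_1 = (0.0, 0.0, 0.0)
atom_2 = (3.0, 4.0, 0.0)
assert is_proximal(atom_1, atom_2, upper=10.0, lower=1.0)
assert not is_proximal(atom_1, atom_2, upper=4.0, lower=1.0)
```

In practice the coordinates would come from a structural model of the two biomolecules; which atoms to measure between (e.g., the sulfur of the fluorosulfate and the nucleophilic atom of the target residue) is a modeling choice the text leaves to the skilled artisan.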
[0074] The term “intermolecular linker” refers to a linking group between two biomolecules. For example, when the compounds of Formula (III) (or embodiments thereof) are an intermolecular linker, then the peptidyl moiety of R4 is a first protein and the peptidyl moiety of R3 is a second protein, such that the first protein and the second protein are covalently bonded. In aspects, the first protein and the second protein can have the same sequence, e.g., providing an intermolecular linker between two different proteins having the same amino acid sequence. In aspects, the first protein and the second protein are different proteins, e.g., providing an intermolecular linker between two different proteins, such as a nanobody and a receptor protein.
[0075] The term “intramolecular linker” refers to a linking group within a biomolecule. For example, when the compounds of Formula (III) (or embodiments thereof) are an intramolecular linker, then the peptidyl moiety of R4 and the peptidyl moiety of R5 are in the same protein. A compound having an intramolecular linker may also be referred to as an intramolecularly conjugated biomolecule conjugate or an intramolecularly conjugated biomolecule protein.
[0076] Where substituent groups are specified by their conventional chemical formulae, written from left to right, they equally encompass the chemically identical substituents that would result from writing the structure from right to left, e.g., -CH2O- is equivalent to -OCH2-.
[0077] The term “alkyl,” by itself or as part of another substituent, means, unless otherwise stated, a straight (i.e., unbranched) or branched carbon chain (or carbon), or combination thereof, which may be fully saturated, mono- or polyunsaturated and can include mono-, di- and multivalent radicals. The alkyl may include a designated number of carbons (e.g., C1-C10 means one to ten carbons). Alkyl is an uncyclized chain. Examples of saturated hydrocarbon radicals include, but are not limited to, groups such as methyl, ethyl, n-propyl, isopropyl, n-butyl, t-butyl, isobutyl, sec-butyl, methyl, homologs and isomers of, for example, n-pentyl, n-hexyl, n-heptyl,
n-octyl, and the like. An unsaturated alkyl group is one having one or more double bonds or triple bonds. Examples of unsaturated alkyl groups include, but are not limited to, vinyl, 2-propenyl, crotyl, 2-isopentenyl, 2-(butadienyl), 2,4-pentadienyl, 3-(1,4-pentadienyl), ethynyl, 1- and 3-propynyl, 3-butynyl, and the higher homologs and isomers. An alkoxy is an alkyl attached to the remainder of the molecule via an oxygen linker (-O-). An alkyl moiety may be an alkenyl moiety. An alkyl moiety may be an alkynyl moiety. An alkyl moiety may be fully saturated. An alkenyl may include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds. An alkynyl may include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
[0078] The term “alkylene,” by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkyl, as exemplified by, e.g., -CH2CH2CH2CH2-. Typically, an alkyl (or alkylene) group will have from 1 to 24 carbon atoms, with those groups having 10 or fewer carbon atoms being preferred herein. A “lower alkyl” or “lower alkylene” is a shorter chain alkyl or alkylene group, generally having eight or fewer carbon atoms. The term “alkenylene,” by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkene.
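The carbon-count ranges in the paragraph above can be summarized in a short sketch. This is illustrative only and not part of the disclosure: the function name `classify_alkyl_chain` is hypothetical, and the thresholds are applied as hard limits even though the text qualifies them with “typically” and “generally”.

```python
def classify_alkyl_chain(n_carbons: int) -> dict:
    """Classify an alkyl or alkylene chain by its number of carbon atoms.

    Thresholds follow the text: 1 to 24 carbons is typical, 10 or fewer
    is preferred, and 8 or fewer is a "lower" alkyl or alkylene.
    """
    if n_carbons < 1:
        raise ValueError("a carbon chain needs at least one carbon atom")
    return {
        "typical": n_carbons <= 24,
        "preferred": n_carbons <= 10,
        "lower": n_carbons <= 8,
    }


# Butylene, -CH2CH2CH2CH2-, has four carbons and meets all three criteria.
assert classify_alkyl_chain(4) == {"typical": True, "preferred": True, "lower": True}
```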
[0079] The term “heteroalkyl,” by itself or in combination with another term, means, unless otherwise stated, a stable straight or branched chain, or combinations thereof, including at least one carbon atom and at least one heteroatom (e.g., O, N, P, Si, and S), and wherein the nitrogen and sulfur atoms may optionally be oxidized, and the nitrogen heteroatom may optionally be quaternized. The heteroatom(s) may be placed at any interior position of the heteroalkyl group or at the position at which the alkyl group is attached to the remainder of the molecule. Heteroalkyl is an uncyclized chain. Examples include, but are not limited to: -CH2NH2, -CH2-CH2-O-CH3, -CH2-CH2-NH-CH3, -CH2-CH2-NH2, -CH2-CH2-N(CH3)-CH3, -CN, -CH2-S-CH2-CH3, -CH2-CH2, -S(O)-CH3, -CH2-CH2-S(O)2-CH3, -CH=CH-O-CH3, -Si(CH3)3, -O-CH3, -CH2-CH=N-OCH3, and -CH=CH-N(CH3)-CH3. Up to two or three heteroatoms may be consecutive, such as, for example, -CH2-NH-OCH3 and -CH2-O-Si(CH3)3. A heteroalkyl moiety may include one heteroatom. A heteroalkyl moiety may include two optionally different heteroatoms. A heteroalkyl moiety may include three optionally different heteroatoms. A heteroalkyl moiety may include four optionally different heteroatoms. A heteroalkyl moiety may include five optionally different heteroatoms. A heteroalkyl moiety may include up to 8 optionally different heteroatoms. The term “heteroalkenyl,” by itself or in combination with another term, means, unless otherwise stated, a heteroalkyl including at least one double bond. A heteroalkenyl may optionally include more than one double bond and/or one or more triple
bonds in addition to the one or more double bonds. The term “heteroalkynyl,” by itself or in combination with another term, means, unless otherwise stated, a heteroalkyl including at least one triple bond. A heteroalkynyl may optionally include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
[0080] Similarly, the term “heteroalkylene,” by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from heteroalkyl, as exemplified, but not limited by, -CH2-CH2-S-CH2-CH2- and -CH2-S-CH2-CH2-NH-CH2-. For heteroalkylene groups, heteroatoms can also occupy either or both of the chain termini (e.g., alkyleneoxy, alkylenedioxy, alkyleneamino, alkylenediamino, and the like). Still further, for alkylene and heteroalkylene linking groups, no orientation of the linking group is implied by the direction in which the formula of the linking group is written. For example, the formula -C(O)2R'- represents both -C(O)2R'- and -R'C(O)2-. As described above, heteroalkyl groups, as used herein, include those groups that are attached to the remainder of the molecule through a heteroatom, such as -C(O)R', -C(O)NR', -NR'R", -OR', -SR', and/or -SO2R'. Where “heteroalkyl” is recited, followed by recitations of specific heteroalkyl groups, such as -NR'R" or the like, it will be understood that the terms heteroalkyl and -NR'R" are not redundant or mutually exclusive. Rather, the specific heteroalkyl groups are recited to add clarity. Thus, the term “heteroalkyl” should not be interpreted herein as excluding specific heteroalkyl groups, such as -NR'R" or the like.
[0081] The terms “cycloalkyl” and “heterocycloalkyl,” by themselves or in combination with other terms, mean, unless otherwise stated, cyclic versions of “alkyl” and “heteroalkyl,” respectively. Cycloalkyl and heterocycloalkyl are not aromatic. Additionally, for heterocycloalkyl, a heteroatom can occupy the position at which the heterocycle is attached to the remainder of the molecule. Examples of cycloalkyl include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, 1-cyclohexenyl, 3-cyclohexenyl, cycloheptyl, and the like. Examples of heterocycloalkyl include, but are not limited to, 1-(1,2,5,6-tetrahydropyridyl), 1-piperidinyl, 2-piperidinyl, 3-piperidinyl, 4-morpholinyl, 3-morpholinyl, tetrahydrofuran-2-yl, tetrahydrofuran-3-yl, tetrahydrothien-2-yl, tetrahydrothien-3-yl, 1-piperazinyl, 2-piperazinyl, and the like. A “cycloalkylene” and a “heterocycloalkylene,” alone or as part of another substituent, means a divalent radical derived from a cycloalkyl and heterocycloalkyl, respectively.
[0082] In embodiments, the term “cycloalkyl” means a monocyclic, bicyclic, or a multicyclic cycloalkyl ring system. In embodiments, monocyclic ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups can be saturated or unsaturated,
but not aromatic. In embodiments, cycloalkyl groups are fully saturated. Examples of monocyclic cycloalkyls include cyclopropyl, cyclobutyl, cyclopentyl, cyclopentenyl, cyclohexyl, cyclohexenyl, cycloheptyl, and cyclooctyl. Bicyclic cycloalkyl ring systems are bridged monocyclic rings or fused bicyclic rings. In embodiments, bridged monocyclic rings contain a monocyclic cycloalkyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3). Representative examples of bicyclic ring systems include, but are not limited to, bicyclo[3.1.1]heptane, bicyclo[2.2.1]heptane, bicyclo[2.2.2]octane, bicyclo[3.2.2]nonane, bicyclo[3.3.1]nonane, and bicyclo[4.2.1]nonane. In embodiments, fused bicyclic cycloalkyl ring systems contain a monocyclic cycloalkyl ring fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocyclyl, or a monocyclic heteroaryl. In embodiments, the bridged or fused bicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkyl ring. In embodiments, cycloalkyl groups are optionally substituted with one or two groups which are independently oxo or thia. In embodiments, the fused bicyclic cycloalkyl is a 5 or 6 membered monocyclic cycloalkyl ring fused to either a phenyl ring, a 5 or 6 membered monocyclic cycloalkyl, a 5 or 6 membered monocyclic cycloalkenyl, a 5 or 6 membered monocyclic heterocyclyl, or a 5 or 6 membered monocyclic heteroaryl, wherein the fused bicyclic cycloalkyl is optionally substituted by one or two groups which are independently oxo or thia.
In embodiments, multicyclic cycloalkyl ring systems are a monocyclic cycloalkyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl. In embodiments, the multicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the base ring. In embodiments, multicyclic cycloalkyl ring systems are a monocyclic cycloalkyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl. Examples of multicyclic cycloalkyl groups include, but are not limited to, tetradecahydrophenanthrenyl, perhydrophenothiazin-1-yl, and perhydrophenoxazin-1-yl. In embodiments of the compounds of
Formula (I), Formula (II), and Formula (III) described herein (including embodiments thereof), ring A is a 5-membered monocyclic cycloalkyl, a 5-membered monocyclic heterocycloalkyl, or a 5-membered monocyclic heteroaryl.
[0083] In embodiments, a cycloalkyl is a cycloalkenyl. The term “cycloalkenyl” is used in accordance with its plain ordinary meaning. In embodiments, a cycloalkenyl is a monocyclic, bicyclic, or a multicyclic cycloalkenyl ring system. In embodiments, monocyclic cycloalkenyl ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups are unsaturated (i.e., containing at least one annular carbon-carbon double bond), but not aromatic. Examples of monocyclic cycloalkenyl ring systems include cyclopentenyl and cyclohexenyl. In embodiments, bicyclic cycloalkenyl rings are bridged monocyclic rings or fused bicyclic rings. In embodiments, bridged monocyclic rings contain a monocyclic cycloalkenyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3). Representative examples of bicyclic cycloalkenyls include, but are not limited to, norbornenyl and bicyclo[2.2.2]oct-2-enyl. In embodiments, fused bicyclic cycloalkenyl ring systems contain a monocyclic cycloalkenyl ring fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocyclyl, or a monocyclic heteroaryl. In embodiments, the bridged or fused bicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkenyl ring. In embodiments, cycloalkenyl groups are optionally substituted with one or two groups which are independently oxo or thia.
In embodiments, multicyclic cycloalkenyl rings contain a monocyclic cycloalkenyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl. In embodiments, the multicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the base ring. In embodiments, multicyclic cycloalkenyl rings contain a monocyclic cycloalkenyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl. In embodiments of the compounds of Formula (I), Formula (II), and Formula (III) described herein
(including embodiments thereof), ring A is a 5-membered monocyclic cycloalkyl, a 5-membered monocyclic heterocycloalkyl, or a 5-membered monocyclic heteroaryl.
[0084] In embodiments, a heterocycloalkyl is a heterocyclyl. The term “heterocyclyl” as used herein, means a monocyclic, bicyclic, or multicyclic heterocycle. The heterocyclyl monocyclic heterocycle is a 3, 4, 5, 6 or 7 membered ring containing at least one heteroatom independently selected from the group consisting of O, N, and S where the ring is saturated or unsaturated, but not aromatic. The 3 or 4 membered ring contains 1 heteroatom selected from the group consisting of O, N and S. The 5 membered ring can contain zero or one double bond and one, two or three heteroatoms selected from the group consisting of O, N and S. The 6 or 7 membered ring contains zero, one or two double bonds and one, two or three heteroatoms selected from the group consisting of O, N and S. The heterocyclyl monocyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the heterocyclyl monocyclic heterocycle. Representative examples of heterocyclyl monocyclic heterocycles include, but are not limited to, azetidinyl, azepanyl, aziridinyl, diazepanyl, 1,3-dioxanyl, 1,3-dioxolanyl, 1,3-dithiolanyl, 1,3-dithianyl, imidazolinyl, imidazolidinyl, isothiazolinyl, isothiazolidinyl, isoxazolinyl, isoxazolidinyl, morpholinyl, oxadiazolinyl, oxadiazolidinyl, oxazolinyl, oxazolidinyl, piperazinyl, piperidinyl, pyranyl, pyrazolinyl, pyrazolidinyl, pyrrolinyl, pyrrolidinyl, tetrahydrofuranyl, tetrahydrothienyl, thiadiazolinyl, thiadiazolidinyl, thiazolinyl, thiazolidinyl, thiomorpholinyl, 1,1-dioxidothiomorpholinyl (thiomorpholine sulfone), thiopyranyl, and trithianyl. The heterocyclyl bicyclic heterocycle is a monocyclic heterocycle fused to either a phenyl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, a monocyclic heterocycle, or a monocyclic heteroaryl.
The heterocyclyl bicyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the monocyclic heterocycle portion of the bicyclic ring system. Representative examples of bicyclic heterocyclyls include, but are not limited to, 2,3-dihydrobenzofuran-2-yl, 2,3-dihydrobenzofuran-3-yl, indolin-1-yl, indolin-2-yl, indolin-3-yl, 2,3-dihydrobenzothien-2-yl, decahydroquinolinyl, decahydroisoquinolinyl, octahydro-1H-indolyl, and octahydrobenzofuranyl. In embodiments, heterocyclyl groups are optionally substituted with one or two groups which are independently oxo or thia. In certain embodiments, the bicyclic heterocyclyl is a 5 or 6 membered monocyclic heterocyclyl ring fused to a phenyl ring, a 5 or 6 membered monocyclic cycloalkyl, a 5 or 6 membered monocyclic cycloalkenyl, a 5 or 6 membered monocyclic heterocyclyl, or a 5 or 6 membered monocyclic heteroaryl, wherein the bicyclic heterocyclyl is optionally substituted by one or two groups which are independently oxo or thia. Multicyclic heterocyclyl ring systems are a monocyclic
heterocyclyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic aryl, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a bicyclic aryl, a monocyclic or bicyclic heteroaryl, a monocyclic or bicyclic cycloalkyl, a monocyclic or bicyclic cycloalkenyl, and a monocyclic or bicyclic heterocyclyl. The multicyclic heterocyclyl is attached to the parent molecular moiety through any carbon atom or nitrogen atom contained within the base ring. In embodiments, multicyclic heterocyclyl ring systems are a monocyclic heterocyclyl ring (base ring) fused to either (i) one ring system selected from the group consisting of a bicyclic ary l, a bicyclic heteroaryl, a bicyclic cycloalkyl, a bicyclic cycloalkenyl, and a bicyclic heterocyclyl; or (ii) two other ring systems independently selected from the group consisting of a phenyl, a monocyclic heteroaryl, a monocyclic cycloalkyl, a monocyclic cycloalkenyl, and a monocyclic heterocyclyl. Examples of multicyclic heterocyclyl groups include, but are not limited to lOH-phenothiazin- 10-yl, 9,10- dihydroacridin-9-yl, 9,10-dihydroacridin-10-yl, lOH-phenoxazin- 10-yl, 10,1 l-dihydro-5H- dibenzo[b,f|azepin-5-yl, 1.2.3.4-tetrahydropyrido[4,3-g]isoquinolin-2-yl, 12H- benzo|bj phenoxazin- 12-yl, and dodecahydro- lH-carbazol-9-yl. In embodiments of the compounds of Formula (I), Formula (II), and Formula (III) described herein (including embodiments thereol), ring A is a 5-membered monocyclic cycloalkyl, a 5-membered monocyclic heterocycloalkyl, or a 5-membered monocyclic heteroaryl.
[0085] The terms “halo” or “halogen,” by themselves or as part of another substituent, mean, unless otherwise stated, a fluorine, chlorine, bromine, or iodine atom. Additionally, terms such as “haloalkyl” are meant to include monohaloalkyl and polyhaloalkyl. For example, the term “halo(C1-C4)alkyl” includes, but is not limited to, fluoromethyl, difluoromethyl, trifluoromethyl, 2,2,2-trifluoroethyl, 4-chlorobutyl, 3-bromopropyl, and the like.
[0086] The term “acyl” means, unless otherwise stated, -C(O)R where R is a substituted or unsubstituted alkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
[0087] The term “aryl” means, unless otherwise stated, a polyunsaturated, aromatic, hydrocarbon substituent, which can be a single ring or multiple rings (preferably from 1 to 3 rings) that are fused together (i.e., a fused ring aryl) or linked covalently. A fused ring aryl refers to multiple rings fused together wherein at least one of the fused rings is an aryl ring. The term “heteroaryl” refers to aryl groups (or rings) that contain at least one heteroatom such as N, O, or S, wherein the nitrogen and sulfur atoms are optionally oxidized, and the nitrogen atom(s) are optionally quaternized. Thus, the term “heteroaryl” includes fused ring heteroaryl groups (i.e., multiple rings fused together wherein at least one of the fused rings is a heteroaromatic ring). A 5,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 5 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring. Likewise, a 6,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring. And a 6,5-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 5 members, and wherein at least one ring is a heteroaryl ring. A heteroaryl group can be attached to the remainder of the molecule through a carbon or heteroatom. Non-limiting examples of aryl and heteroaryl groups include phenyl, naphthyl, pyrrolyl, pyrazolyl, pyridazinyl, triazinyl, pyrimidinyl, imidazolyl, pyrazinyl, purinyl, oxazolyl, isoxazolyl, thiazolyl, furyl, thienyl, pyridyl, pyrimidyl, benzothiazolyl, benzoxazolyl, benzimidazolyl, benzofuran, isobenzofuranyl, indolyl, isoindolyl, benzothiophenyl, isoquinolyl, quinoxalinyl, quinolyl, 1-naphthyl, 2-naphthyl, 4-biphenyl, 1-pyrrolyl, 2-pyrrolyl, 3-pyrrolyl, 3-pyrazolyl, 2-imidazolyl, 4-imidazolyl, pyrazinyl, 2-oxazolyl, 4-oxazolyl, 2-phenyl-4-oxazolyl, 5-oxazolyl, 3-isoxazolyl, 4-isoxazolyl, 5-isoxazolyl, 2-thiazolyl, 4-thiazolyl, 5-thiazolyl, 2-furyl, 3-furyl, 2-thienyl, 3-thienyl, 2-pyridyl, 3-pyridyl, 4-pyridyl, 2-pyrimidyl, 4-pyrimidyl, 5-benzothiazolyl, purinyl, 2-benzimidazolyl, 5-indolyl, 1-isoquinolyl, 5-isoquinolyl, 2-quinoxalinyl, 5-quinoxalinyl, 3-quinolyl, and 6-quinolyl. Substituents for each of the above noted aryl and heteroaryl ring systems are selected from the group of acceptable substituents described below. An “arylene” and a “heteroarylene,” alone or as part of another substituent, mean a divalent radical derived from an aryl and heteroaryl, respectively. A heteroaryl group substituent may be -O- bonded to a ring heteroatom nitrogen. In embodiments of the compounds of Formula (I), Formula (II), and Formula (III) described herein (including embodiments thereof), ring A is a 5-membered monocyclic cycloalkyl, a 5-membered monocyclic heterocycloalkyl, or a 5-membered monocyclic heteroaryl.
[0088] The symbol ’ or denotes the point of attachment of a chemical moiety to the remainder of a molecule or chemical formula.
[0089] The term “oxo,” as used herein, means an oxygen that is double bonded to a carbon atom.
[0090] The term “alkylsulfonyl,” as used herein, means a moiety having the formula -S(O2)-R', where R' is a substituted or unsubstituted alkyl group as defined above. R' may have a specified number of carbons (e.g., “C1-C4 alkylsulfonyl”).
[0091] The term “alkylarylene” means an arylene moiety covalently bonded to an alkylene moiety (also referred to herein as an alkylene linker).
[0092] An alkylarylene moiety may be substituted (e.g., with a substituent group) on the alkylene moiety or the arylene linker (e.g., at carbons 2, 3, 4, or 6) with halogen, oxo, -N3, -CF3, -CCl3, -CBr3, -CI3, -CN, -CHO, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO2CH3, -SO3H, -OSO3H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, substituted or unsubstituted C1-C5 alkyl or substituted or unsubstituted 2 to 5 membered heteroalkyl. In embodiments, the alkylarylene is unsubstituted.
[0093] Each of the above terms (e.g., “alkyl,” “heteroalkyl,” “cycloalkyl,” “heterocycloalkyl,” “aryl,” and “heteroaryl”) includes both substituted and unsubstituted forms of the indicated radical. Preferred substituents for each type of radical are provided below.
[0094] Substituents for the alkyl and heteroalkyl radicals (including those groups often referred to as alkylene, alkenyl, heteroalkylene, heteroalkenyl, alkynyl, cycloalkyl, heterocycloalkyl, cycloalkenyl, and heterocycloalkenyl) can be one or more of a variety of groups selected from, but not limited to, -OR', =O, =NR', =N-OR', -NR'R", -SR', -halogen, -SiR'R"R'", -OC(O)R', -C(O)R', -CO2R', -CONR'R", -OC(O)NR'R", -NR"C(O)R', -NR'-C(O)NR"R", -NR"C(O)2R', -NR-C(NR'R"R'")=NR"", -NR-C(NR'R")=NR'", -S(O)R', -S(O)2R', -S(O)2NR'R", -NRSO2R', -NR'NR"R'", -ONR'R", -NR'C(O)NR"NR'"R"", -CN, -NO2, -NR'SO2R", -NR'C(O)R", -NR'C(O)-OR", -NR'OR", in a number ranging from zero to (2m'+1), where m' is the total number of carbon atoms in such radical. R, R', R", R'", and R"" each preferably independently refer to hydrogen, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl (e.g., aryl substituted with 1-3 halogens), substituted or unsubstituted heteroaryl, substituted or unsubstituted alkyl, alkoxy, or thioalkoxy groups, or arylalkyl groups. When a compound described herein includes more than one R group, for example, each of the R groups is independently selected as are each R', R", R'", and R"" group when more than one of these groups is present. When R' and R" are attached to the same nitrogen atom, they can be combined with the nitrogen atom to form a 4-, 5-, 6-, or 7-membered ring. For example, -NR'R" includes, but is not limited to, 1-pyrrolidinyl and 4-morpholinyl. From the above discussion of substituents, one of skill in the art will understand that the term “alkyl” is meant to include groups including carbon atoms bound to groups other than hydrogen groups, such as haloalkyl (e.g., -CF3 and -CH2CF3) and acyl (e.g., -C(O)CH3, -C(O)CF3, -C(O)CH2OCH3, and the like).
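As an illustration only, the upper bound of (2m'+1) substituents stated above can be evaluated for a few simple radicals. The following Python sketch is not part of the disclosure; the function name and the example radicals are hypothetical and were chosen for illustration.

```python
def max_substituents(carbon_count: int) -> int:
    """Return the upper bound (2m' + 1), where m' is the number of carbon atoms.

    Hypothetical helper: it only evaluates the arithmetic bound quoted in the
    text and does not check chemical feasibility.
    """
    if carbon_count < 1:
        raise ValueError("carbon_count must be at least 1")
    return 2 * carbon_count + 1


# Example radicals (illustrative): methyl (m' = 1), ethyl (m' = 2), butyl (m' = 4)
for name, m in [("methyl", 1), ("ethyl", 2), ("butyl", 4)]:
    print(f"{name}: zero to {max_substituents(m)} substituents")
```

For a saturated acyclic alkyl radical, this bound equals the number of hydrogen atoms available for replacement, e.g., three for methyl (-CH3) and five for ethyl (-CH2CH3).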
[0095] Similar to the substituents described for the alkyl radical, substituents for the aryl and heteroaryl groups are varied and are selected from, for example: -OR', -NR'R", -SR', -halogen, -SiR'R"R"', -OC(O)R', -C(O)R', -CO2R', -CONR'R", -OC(O)NR'R", -NR"C(O)R', -NR'-C(O)NR"R", -NR"C(O)2R', -NR-C(NR'R"R'")=NR"", -NR-C(NR'R")=NR"', -S(O)R', -S(O)2R', -S(O)2NR'R", -NRSO2R', -NR'NR"R'", -ONR'R", -NR'C(O)NR"NR"'R"", -CN, -NO2, -R', -N3, -CH(Ph)2, fluoro(C1-C4)alkoxy, and fluoro(C1-C4)alkyl, -NR'SO2R", -NR'C(O)R", -NR'C(O)-OR", -NR'OR", in a number ranging from zero to the total number of open valences on the aromatic ring system; and where R', R", R'", and R"" are preferably independently selected from hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, and substituted or unsubstituted heteroaryl. When a compound described herein includes more than one R group, for example, each of the R groups is independently selected as are each R', R", R'", and R"" groups when more than one of these groups is present.
[0096] Substituents for rings (e.g., cycloalkyl, heterocycloalkyl, aryl, heteroaryl, cycloalkylene, heterocycloalkylene, arylene, or heteroarylene) may be depicted as substituents on the ring rather than on a specific atom of a ring (commonly referred to as a floating substituent). In such a case, the substituent may be attached to any of the ring atoms (obeying the rules of chemical valency) and in the case of fused rings or spirocyclic rings, a substituent depicted as associated with one member of the fused rings or spirocyclic rings (a floating substituent on a single ring), may be a substituent on any of the fused rings or spirocyclic rings (a floating substituent on multiple rings). When a substituent is attached to a ring, but not a specific atom (a floating substituent), and a subscript for the substituent is an integer greater than one, the multiple substituents may be on the same atom, same ring, different atoms, different fused rings, different spirocyclic rings, and each substituent may optionally be different. Where a point of attachment of a ring to the remainder of a molecule is not limited to a single atom (a floating substituent), the attachment point may be any atom of the ring and in the case of a fused ring or spirocyclic ring, any atom of any of the fused rings or spirocyclic rings while obeying the rules of chemical valency. Where a ring, fused rings, or spirocyclic rings contain one or more ring heteroatoms and the ring, fused rings, or spirocyclic rings are shown with one or more floating substituents (including, but not limited to, points of attachment to the remainder of the molecule), the floating substituents may be bonded to the heteroatoms. Where the ring heteroatoms are shown bound to one or more hydrogens (e.g., a ring nitrogen with two bonds to ring atoms and a third bond to a hydrogen) in the structure or formula with the floating substituent, when the heteroatom is bonded to the floating substituent, the substituent will be understood to replace the hydrogen, while obeying the rules of chemical valency.
[0097] Two or more substituents may optionally be joined to form aryl, heteroaryl, cycloalkyl, or heterocycloalkyl groups. Such so-called ring-forming substituents are typically, though not necessarily, found attached to a cyclic base structure. In embodiments, the ring-forming substituents are attached to adjacent members of the base structure. For example, two ring-forming substituents attached to adjacent members of a cyclic base structure create a fused ring structure. In embodiments, the ring-forming substituents are attached to a single member of the base structure. For example, two ring-forming substituents attached to a single member of a cyclic base structure create a spirocyclic structure. In embodiments, the ring-forming substituents are attached to non-adjacent members of the base structure.
[0098] Two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally form a ring of the formula -T-C(O)-(CRR')q-U-, wherein T and U are independently -NR-, -O-, -CRR'-, or a single bond, and q is an integer of from 0 to 3. Alternatively, two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula -A-(CH2)r-B-, wherein A and B are independently -CRR'-, -O-, -NR-, -S-, -S(O)-, -S(O)2-, -S(O)2NR'-, or a single bond, and r is an integer of from 1 to 4. One of the single bonds of the new ring so formed may optionally be replaced with a double bond. Alternatively, two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula -(CRR')s-X'-(C"R"R"')d-, where s and d are independently integers of from 0 to 3, and X' is -O-, -NR'-, -S-, -S(O)-, -S(O)2-, or -S(O)2NR'-. The substituents R, R', R", and R'" are preferably independently selected from hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, and substituted or unsubstituted heteroaryl.
[0099] As used herein, the terms “heteroatom” or “ring heteroatom” are meant to include oxygen (O), nitrogen (N), sulfur (S), phosphorus (P), and silicon (Si).
[0100] A “substituent group,” as used herein, means a group selected from the following moieties:
[0101] (A) oxo, halogen, -CCl3, -CBr3, -CF3, -CI3, -CN, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, -NHC(O)NH2, -NHSO2H, -NHC(O)H, -NHC(O)OH, -NHOH, -OCCl3, -OCF3, -OCBr3, -OCI3, -OCHCl2, -OCHBr2, -OCHI2, -OCHF2, unsubstituted alkyl (e.g., C1-C8 alkyl, C1-C6 alkyl, or C1-C4 alkyl), unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl, 5 to 9 membered heteroaryl, or 5 to 6 membered heteroaryl), and
[0102] (B) alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, heteroaryl, substituted with at least one substituent selected from:
[0103] (i) oxo, halogen, -CCl3, -CBr3, -CF3, -CI3, -CN, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, -NHC(O)NH2, -NHSO2H, -NHC(O)H, -NHC(O)OH, -NHOH, -OCCl3, -OCF3, -OCBr3, -OCI3, -OCHCl2, -OCHBr2, -OCHI2, -OCHF2, unsubstituted alkyl (e.g., C1-C8 alkyl, C1-C6 alkyl, or C1-C4 alkyl), unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl, 5 to 9 membered heteroaryl, or 5 to 6 membered heteroaryl), and
[0104] (ii) alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, heteroaryl, substituted with at least one substituent selected from:
[0105] (a) oxo, halogen, -CCl3, -CBr3, -CF3, -CI3, -CN, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, -NHC(O)NH2, -NHSO2H, -NHC(O)H, -NHC(O)OH, -NHOH, -OCCl3, -OCF3, -OCBr3, -OCI3, -OCHCl2, -OCHBr2, -OCHI2, -OCHF2, unsubstituted alkyl (e.g., C1-C8 alkyl, C1-C6 alkyl, or C1-C4 alkyl), unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl, 5 to 9 membered heteroaryl, or 5 to 6 membered heteroaryl), and
[0106] (b) alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, heteroaryl, substituted with at least one substituent selected from: oxo, halogen, -CCl3, -CBr3, -CF3, -CI3, -CN, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, -NHC(O)NH2, -NHSO2H, -NHC(O)H, -NHC(O)OH, -NHOH, -OCCl3, -OCF3, -OCBr3, -OCI3, -OCHCl2, -OCHBr2, -OCHI2, -OCHF2, unsubstituted alkyl (e.g., C1-C8 alkyl, C1-C6 alkyl, or C1-C4 alkyl), unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl, 5 to 9 membered heteroaryl, or 5 to 6 membered heteroaryl).
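As an illustration only, the nested structure of the “substituent group” definition in (A), (B), (i), (ii), (a), and (b) above can be read as a bounded recursion: a scaffold (alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, or heteroaryl) may itself be substituted, but substitution terminates in unsubstituted moieties after at most three nested levels. The Python sketch below is not part of the disclosure; the class name, function name, and the abbreviated list of terminal moieties are hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import List

# Abbreviated, illustrative subset of the terminal moieties listed in (A).
BASE_MOIETIES = {"oxo", "halogen", "-CF3", "-CN", "-OH", "-NH2", "-COOH"}
# Scaffolds that may be unsubstituted, or substituted under (B), (ii), or (b).
SCAFFOLDS = {"alkyl", "heteroalkyl", "cycloalkyl",
             "heterocycloalkyl", "aryl", "heteroaryl"}
# Levels 0, 1, and 2 correspond to (B), (ii), and (b); deeper groups must be unsubstituted.
MAX_SUBSTITUTED_LEVELS = 3


@dataclass
class Group:
    name: str
    substituents: List["Group"] = field(default_factory=list)


def is_substituent_group(group: Group, level: int = 0) -> bool:
    """Check a group against the nested definition (simplified sketch)."""
    if not group.substituents:
        # Unsubstituted: either a terminal moiety or an unsubstituted scaffold.
        return group.name in BASE_MOIETIES or group.name in SCAFFOLDS
    if group.name not in SCAFFOLDS or level >= MAX_SUBSTITUTED_LEVELS:
        return False
    return all(is_substituent_group(s, level + 1) for s in group.substituents)


halogen = Group("halogen")
three_levels = Group("alkyl", [Group("aryl", [Group("alkyl", [halogen])])])
four_levels = Group("alkyl", [Group("aryl", [Group("alkyl", [Group("aryl", [halogen])])])])
print(is_substituent_group(three_levels))  # within the nesting described in (B)/(ii)/(b)
print(is_substituent_group(four_levels))   # one substituted level too deep
```

In this reading, a haloalkyl-substituted aryl on an alkyl falls within the definition, whereas a fourth substituted level does not.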
[0107] A “size-limited substituent” or “size-limited substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C20 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl.
[0108] A “lower substituent” or “lower substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl.
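As an illustration only, the size ranges recited above for “size-limited” and “lower” substituent groups can be tabulated and compared side by side. The Python sketch below is not part of the disclosure; the dictionary layout and the function name are hypothetical. Sizes are carbon counts for alkyl, cycloalkyl, and aryl, and member counts for heteroalkyl, heterocycloalkyl, and heteroaryl.

```python
# Inclusive (minimum, maximum) sizes taken from the two definitions above.
SIZE_RANGES = {
    "size-limited": {
        "alkyl": (1, 20), "heteroalkyl": (2, 20),
        "cycloalkyl": (3, 8), "heterocycloalkyl": (3, 8),
        "aryl": (6, 10), "heteroaryl": (5, 10),
    },
    "lower": {
        "alkyl": (1, 8), "heteroalkyl": (2, 8),
        "cycloalkyl": (3, 7), "heterocycloalkyl": (3, 7),
        "aryl": (6, 10), "heteroaryl": (5, 9),
    },
}


def fits(tier: str, kind: str, size: int) -> bool:
    """Return True if a group of the given kind and size lies in the tier's range."""
    low, high = SIZE_RANGES[tier][kind]
    return low <= size <= high


# A C12 alkyl falls within the size-limited range but outside the lower range.
print(fits("size-limited", "alkyl", 12), fits("lower", "alkyl", 12))
```

Every range in the “lower” tier lies within the corresponding “size-limited” range, so any group that satisfies the lower definition also satisfies the size-limited one.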
[0109] In embodiments, each substituted group described in the compounds herein is substituted with at least one substituent group. More specifically, in embodiments, each substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene described in the compounds herein is substituted with at least one substituent group. In embodiments, at least one or all of these groups are substituted with at least one size-limited substituent group. In embodiments, at least one or all of these groups are substituted with at least one lower substituent group.
[0110] In embodiments of the compounds herein, each substituted or unsubstituted alkyl may be a substituted or unsubstituted C1-C20 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and/or each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl. In embodiments of the compounds herein, each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C20 alkylene, each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 20 membered heteroalkylene, each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C3-C8 cycloalkylene, each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 8 membered heterocycloalkylene, each substituted or unsubstituted arylene is a substituted or unsubstituted C6-C10 arylene, and/or each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 10 membered heteroarylene.
[0111] In embodiments, each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and/or each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl. In embodiments, each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C8 alkylene, each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 8 membered heteroalkylene, each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C3-C7 cycloalkylene, each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 7 membered heterocycloalkylene, each substituted or unsubstituted arylene is a substituted or unsubstituted C6-C10 arylene, and/or each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 9 membered heteroarylene.
[0112] In embodiments, a substituted or unsubstituted moiety (e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is unsubstituted (e.g., is an unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted heterocycloalkyl, unsubstituted aryl, unsubstituted heteroaryl, unsubstituted alkylene, unsubstituted heteroalkylene, unsubstituted cycloalkylene, unsubstituted heterocycloalkylene, unsubstituted arylene, and/or unsubstituted heteroarylene, respectively). In embodiments, a substituted or unsubstituted moiety (e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is substituted (e.g., is a substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene, respectively).
[0113] In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, wherein if the substituted moiety is substituted with a plurality of substituent groups, each substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of substituent groups, each substituent group is different.
[0114] In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one size-limited substituent group, wherein if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group is different.
[0115] In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one lower substituent group, wherein if the substituted moiety is substituted with a plurality of lower substituent groups, each lower substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of lower substituent groups, each lower substituent group is different.
[0116] In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group is different.
[0117] Certain compounds described herein possess asymmetric carbon atoms (optical or chiral centers) or double bonds; the enantiomers, racemates, diastereomers, tautomers, geometric isomers, stereoisomeric forms that may be defined, in terms of absolute stereochemistry, as (R)- or (S)- or, as (D)- or (L)- for amino acids, and individual isomers are encompassed within the scope of the present disclosure. The compounds of the present disclosure do not include those that are known in the art to be too unstable to synthesize and/or isolate. The present disclosure is meant to include compounds in racemic and optically pure forms. Optically active (R)- and (S)-, or (D)- and (L)-isomers may be prepared using chiral synthons or chiral reagents, or resolved using conventional techniques. When the compounds described herein contain olefinic bonds or other centers of geometric asymmetry, and unless specified otherwise, it is intended that the compounds include both E and Z geometric isomers. As used herein, the term “isomers” refers to compounds having the same number and kind of atoms, and hence the same molecular weight, but differing in respect to the structural arrangement or configuration of the atoms. The term “tautomer,” as used herein, refers to one of two or more structural isomers which exist in equilibrium and which are readily converted from one isomeric form to another. It will be apparent to one skilled in the art that certain compounds of this disclosure may exist in tautomeric forms, all such tautomeric forms of the compounds being within the scope of the disclosure. Unless otherwise stated, structures depicted herein are also meant to include all stereochemical forms of the structure; i.e., the R and S configurations for each asymmetric center. Therefore, single stereochemical isomers (stereoisomers) as well as enantiomeric and diastereomeric mixtures of the present compounds are within the scope of the disclosure.
[0118] The compounds described herein may also contain unnatural proportions of atomic isotopes at one or more of the atoms that constitute such compounds. For example, the compounds may be radiolabeled with radioactive isotopes, such as for example tritium (3H), iodine-125 (125I), or carbon-14 (14C). All isotopic variations of the compounds described herein, whether radioactive or not, are encompassed within the scope of the present disclosure.
[0119] It should be noted that, throughout the application, alternatives are written in Markush groups, for example, each amino acid position that contains more than one possible amino acid. It is specifically contemplated that each member of the Markush group should be considered separately, thereby comprising another embodiment, and the Markush group is not to be read as a single unit.
[0120] “Analog” or “analogue” is used in accordance with its plain ordinary meaning within Chemistry and Biology and refers to a chemical compound that is structurally similar to another compound (i.e., a so-called “reference” compound) but differs in composition, e.g., in the replacement of one atom by an atom of a different element, or in the presence of a particular functional group, or the replacement of one functional group by another functional group, or the absolute stereochemistry of one or more chiral centers of the reference compound. Accordingly, an analog is a compound that is similar or comparable in function and appearance but not in structure or origin to a reference compound.
[0121] The terms “a” or “an,” as used herein, mean one or more. In addition, the phrase “substituted with a[n],” as used herein, means the specified group may be substituted with one or more of any or all of the named substituents. For example, where a group, such as an alkyl or heteroaryl group, is “substituted with an unsubstituted C1-C20 alkyl, or unsubstituted 2 to 20 membered heteroalkyl,” the group may contain one or more unsubstituted C1-C20 alkyls, and/or one or more unsubstituted 2 to 20 membered heteroalkyls.
[0122] Where a moiety is substituted with an R substituent, the group may be referred to as “R-substituted.” Where a moiety is R-substituted, the moiety is substituted with at least one R substituent and each R substituent is optionally different. Where a particular R group is present in the description of a chemical genus (such as Formula (I)), a Roman alphabetic symbol may be used to distinguish each appearance of that particular R group. For example, where multiple R3 substituents are present, each R3 substituent may be distinguished as R3A, R3B, wherein each of R3A, R3B, is defined within the scope of the definition of R3 and optionally differently.
[0123] A person of ordinary skill in the art will understand when a variable (e.g., moiety or linker) of a compound or of a compound genus (e.g., a genus described herein) is described by a name or formula of a standalone compound with all valencies filled, the unfilled valence(s) of the variable will be dictated by the context in which the variable is used. For example, when a variable of a compound as described herein is connected (e.g., bonded) to the remainder of the compound through a single bond, that variable is understood to represent a monovalent form (i.e., capable of forming a single bond due to an unfilled valence) of a standalone compound (e.g., if the variable is named “methane” in an embodiment but the variable is known to be attached by a single bond to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is actually a monovalent form of methane, i.e., methyl or -CH3). Likewise, for a linker variable (e.g., L1, L2, or L3 as described herein), a person of ordinary skill in the art will understand that the variable is the divalent form of a standalone compound (e.g., if the variable is assigned to “PEG” or “polyethylene glycol” in an embodiment but the variable is connected by two separate bonds to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is a divalent (i.e., capable of forming two bonds through two unfilled valences) form of PEG instead of the standalone compound PEG).
[0124] The term “bond” or “bonded” refers to direct bonds, such as covalent bonds (e.g., direct or a linking group), or indirect bonds, such as non-covalent bonds (e.g., electrostatic interactions (e.g., ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g., dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions, and the like).
[0125] The terms “bioconjugate” and “bioconjugate linker” refer to the resulting association between atoms or molecules of “bioconjugate reactive groups” or “bioconjugate reactive moieties.” The association can be direct or indirect. For example, a conjugate between a first bioconjugate reactive group (e.g., -NH2, -C(O)OH, -N-hydroxysuccinimide, or -maleimide) and a second bioconjugate reactive group (e.g., sulfhydryl, sulfur-containing amino acid, amine, amine sidechain containing amino acid, or carboxylate) provided herein can be direct, e.g., by covalent bond or linker (e.g., a first linker or second linker), or indirect, e.g., by non-covalent bond (e.g., electrostatic interactions (e.g., ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g., dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions, and the like). In embodiments, bioconjugates or bioconjugate linkers are formed using bioconjugate chemistry (i.e., the association of two bioconjugate reactive groups) including, but not limited to, nucleophilic substitutions (e.g., reactions of amines and alcohols with acyl halides, active esters), electrophilic substitutions (e.g., enamine reactions), and additions to carbon-carbon and carbon-heteroatom multiple bonds (e.g., Michael reaction, Diels-Alder addition). These and other useful reactions are discussed in, for example, March, Advanced Organic Chemistry, 3rd Ed., John Wiley & Sons, New York, 1985; Hermanson, Bioconjugate Techniques, Academic Press, San Diego, 1996; and Feeney et al., Modification of Proteins, Advances in Chemistry Series, Vol. 198, American Chemical Society, Washington, D.C., 1982. In embodiments, the first bioconjugate reactive group (e.g., unnatural amino acid side chain) is covalently attached to the second bioconjugate reactive group (e.g., a hydroxyl group).
[0126] The term “electron-withdrawing group” refers to a chemical moiety or substituent that removes electron density from a conjugated pi-electron system, thereby making the pi electron system more electrophilic.
[0127] The term “electron-donating group” refers to a chemical moiety or substituent that can donate electron density into a conjugated pi-electron system, thereby making the pi electron system more nucleophilic.
[0128] The terms “bind” and “bound” as used herein are used in accordance with their plain and ordinary meaning and refer to the association between atoms or molecules. The association can be direct or indirect. For example, bound atoms or molecules may be bound, e.g., by covalent bond, linker (e.g., a first linker or second linker), or non-covalent bond (e.g., electrostatic interactions (e.g., ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g., dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions, and the like).
[0129] The term “capable of binding” as used herein refers to a moiety (e.g., a single-domain antibody or a recombinant protein as described herein, i.e., comprising an unnatural amino acid side chain that is capable of binding to an amino acid residue on a different protein) that is able to measurably bind to a target. In aspects, where a moiety is capable of binding a target, the moiety is capable of binding with a Kd of less than about 10 µM, 5 µM, 1 µM, 500 nM, 250 nM, 100 nM, 75 nM, 50 nM, 25 nM, 15 nM, 10 nM, 5 nM, 1 nM, or about 0.1 nM.
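As an illustration only, the comparison of a measured Kd against one of the listed thresholds can be sketched as follows. The function names and the unit table are hypothetical and are not part of the disclosure; the sketch relies only on the fact that a lower Kd corresponds to tighter binding.

```python
# Hypothetical helper for comparing dissociation constants (Kd).
# Unit labels and function names are illustrative assumptions.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def kd_to_molar(value, unit):
    """Convert a Kd given as (value, unit) into molar concentration."""
    return value * UNIT_TO_MOLAR[unit]


def binds_below(measured_kd, threshold_kd):
    """Return True if the measured Kd is strictly less than the threshold.

    Both arguments are (value, unit) tuples, e.g. (20, "nM").
    """
    return kd_to_molar(*measured_kd) < kd_to_molar(*threshold_kd)


if __name__ == "__main__":
    # A moiety measured at 20 nM satisfies a "less than about 100 nM" threshold.
    print(binds_below((20, "nM"), (100, "nM")))
```

Converting both values to molar before comparing avoids unit mix-ups when thresholds span several orders of magnitude.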
[0130] Compounds
[0131] Provided herein are compounds, proteins comprising unnatural amino acid side chains, and biomolecule conjugates formed through the interaction of the unnatural amino acids with naturally occurring amino acids or nucleotides. The compounds of Formula (I), i.e., bioreactive unnatural amino acids, facilitate formation of chemically reactive amino acids with proximal target amino acid residues by undergoing a click chemistry reaction (e.g., sulfur-fluoride exchange reaction (SuFEx)). For example, the compounds of Formula (I) may be inserted into or replace an amino acid in a naturally occurring protein, thereby endowing the protein with the ability to form a chemically reactive amino acid with proximally positioned target functional groups (e.g., a hydroxyl group in RNA) or amino acid residues (e.g., serine, threonine, tyrosine) with other proteins. The compound of Formula (I) may be used to facilitate the formation of chemically reactive amino acids in proteins in both in vitro and in vivo conditions. As such, the bioreactive unnatural amino acids of Formula (I) are useful for forming chemically reactive amino acid residues that can be further chemically modified.
[0132] The compounds of Formula (I) have shown excellent chemical functionality (i.e., superior properties) compared to previously described bioreactive unnatural amino acids. For example, the compounds of Formula (I) are stable, nontoxic, and nonreactive inside cells, yet when placed in proximity to target amino acid residues (e.g., serine, threonine, tyrosine) or reactive moieties (e.g., a hydroxyl group in RNA) they become reactive under cellular conditions. The compounds of Formula (I) are able to react with target amino acid residues (e.g., serine, threonine, tyrosine) or other reactive moieties (e.g., a hydroxyl group in RNA) with great selectivity via proximity-enabled SuFEx reaction within and between proteins and RNA under physiological conditions.
[0133] Provided herein are compounds of Formula (I) or a stereoisomer thereof:
wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L4 is a bond or -O-; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R1 is hydrogen, halogen, -CX1 3, -CHX1 2, -CH2X1, -OCX1 3, -OCH2X1, -OCHX1 2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X1 is independently -F, -Cl, -Br, or -I; R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2. R1 is ortho or meta to -S(O2)F. In embodiments, R1 is meta to -S(O2)F. In embodiments, R1 is ortho to -S(O2)F. In embodiments, R1 is hydrogen, halogen, -CX1 3, -CHX1 2, -CH2X1, -OCX1 3, -OCH2X1, -OCHX1 2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
[0134] Provided herein are compounds of Formula (I-1) or a stereoisomer thereof:
wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R1 is hydrogen, halogen, -CX1 3, -CHX1 2, -CH2X1, -OCX1 3, -OCH2X1, -OCHX1 2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X1 is independently -F, -Cl, -Br, or -I; R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2. R1 is ortho or meta to -S(O2)F. In embodiments, R1 is meta to -S(O2)F. In embodiments, R1 is ortho to -S(O2)F. In embodiments, R1 is hydrogen, halogen, -CX1 3, -CHX1 2, -CH2X1, -OCX1 3, -OCH2X1, -OCHX1 2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
[0135] Provided herein are compounds of Formula (I-2) or a stereoisomer thereof:
wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene.
[0136] In embodiments of the compounds described herein, ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl. In embodiments, ring A is a 5-membered cycloalkyl. In embodiments, ring A is a 5-membered cycloalkyl having no C=C double bonds. In embodiments, ring A is a 5-membered cycloalkyl having one C=C double bond. In embodiments, ring A is a 5-membered cycloalkyl having two C=C double bonds. In embodiments, ring A is a 5-membered heterocycloalkyl. In embodiments, ring A is a 5-membered heterocycloalkyl having no double bonds. In embodiments, ring A is a 5-membered heterocycloalkyl having one double bond.
[0137] In embodiments, ring A is a 5-membered heteroaryl. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 4 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole. In embodiments, ring A is pyrrole. In embodiments, ring A is pyrazole. In embodiments, ring A is imidazole. In embodiments, ring A is triazole. In embodiments, ring A is furan. In embodiments, ring A is thiophene. In embodiments, ring A is phosphole. In embodiments, ring A is oxazole. In embodiments, ring A is isoxazole. In embodiments, ring A is thiazole. In embodiments, ring A is isothiazole. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl, and the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl.
[0138] Provided herein are compounds of Formula (I-3) or a stereoisomer thereof:
wherein x, L1, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R1 is halogen.
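Because each member of a Markush group is to be considered separately, the four named L1 linker families combined with y = 0, 1, or 2 give twelve individual linker embodiments. The following is a minimal sketch of that enumeration; the template strings, names, and text rendering of the -(CH2)y- repeat are illustrative assumptions and do not encode connectivity or orientation.

```python
from itertools import product

# The four L1 linker families named in the text. "{ch2}" marks where the
# -(CH2)y- repeat is written out. These templates are plain-text labels
# chosen for illustration, not a chemical structure format.
LINKER_FAMILIES = [
    "-NH-C(O)-{ch2}",
    "-NH-C(O)-O-{ch2}",
    "-NH-C(O)-NH-{ch2}",
    "-NH-C(O)-S-{ch2}",
]

# y is an integer from 0 to 2, inclusive.
Y_VALUES = range(0, 3)


def expand_linker(template, y):
    """Write out one linker by repeating the CH2 unit y times."""
    return template.format(ch2="CH2-" * y)


def enumerate_linkers():
    """List every (family, y) combination as a separate embodiment."""
    return [expand_linker(t, y) for t, y in product(LINKER_FAMILIES, Y_VALUES)]


if __name__ == "__main__":
    for linker in enumerate_linkers():
        print(linker)
```

The same expansion applies to each formula that recites these L1 alternatives; combining it with the recited values of x would multiply the count accordingly.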
[0139] Provided herein are compounds of Formula (I-4) or a stereoisomer thereof:
wherein x, L1, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R1 is halogen.
[0140] Provided herein are compounds of Formula (I-5) or a stereoisomer thereof:
wherein x, L1, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R1 is halogen.
[0141] Provided herein are compounds of Formula (I-6) or a stereoisomer thereof:
wherein x, L1, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R1 is halogen.
[0142] Provided herein are compounds of Formula (I-7) or a stereoisomer thereof:
wherein x and L1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
[0143] Provided herein are compounds of Formula (I-8) or a stereoisomer thereof:
wherein x and L1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
[0144] Provided herein are compounds of Formula (I-9) or a stereoisomer thereof:
wherein x and L1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
[0145] Provided herein are compounds of Formula (I-10) or a stereoisomer thereof:
wherein x and L1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
[0152] Proteins
[0153] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II):
wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L4 is a bond or -O-; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R1 is hydrogen, halogen, -CX1 3, -CHX1 2, -CH2X1, -OCX1 3, -OCH2X1, -OCHX1 2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X1 is independently -F, -Cl, -Br, or -I; R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2. R1 is ortho or meta to -S(O2)F. In embodiments, R1 is meta to -S(O2)F. In embodiments, R1 is ortho to -S(O2)F. In embodiments, R1 is hydrogen, halogen, -CX1 3, -CHX1 2, -CH2X1, -OCX1 3, -OCH2X1, -OCHX1 2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
[0154] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-1):
wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R1 is hydrogen, halogen, -CX1 3, -CHX1 2, -CH2X1, -OCX1 3, -OCH2X1, -OCHX1 2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X1 is independently -F, -Cl, -Br, or -I; R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2. R1 is ortho or meta to -S(O2)F. In embodiments, R1 is meta to -S(O2)F. In embodiments, R1 is ortho to -S(O2)F. In embodiments, R1 is hydrogen, halogen, -CX1 3, -CHX1 2, -CH2X1, -OCX1 3, -OCH2X1, -OCHX1 2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
[0155] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-2):
wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; x is an integer from 0 to 8; and L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene.
[0156] In embodiments of the proteins described herein, ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl. In embodiments, ring A is a 5-membered cycloalkyl. In embodiments, ring A is a 5-membered cycloalkyl having no C=C double bonds. In embodiments, ring A is a 5-membered cycloalkyl having one C=C double bond. In embodiments, ring A is a 5-membered cycloalkyl having two C=C double bonds. In embodiments, ring A is a 5-membered heterocycloalkyl. In embodiments, ring A is a 5-membered heterocycloalkyl having no double bonds. In embodiments, ring A is a 5-membered heterocycloalkyl having one double bond.
[0157] In embodiments, ring A is a 5-membered heteroaryl. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 4 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole. In embodiments, ring A is pyrrole. In embodiments, ring A is pyrazole. In embodiments, ring A is imidazole. In embodiments, ring A is triazole. In embodiments, ring A is furan. In embodiments, ring A is thiophene. In embodiments, ring A is phosphole. In embodiments, ring A is oxazole. In embodiments, ring A is isoxazole. In embodiments, ring A is thiazole. In embodiments, ring A is isothiazole. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl, and the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl.
[0158] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-3):
wherein x, L1, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R1 is halogen.
[0159] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-4):
wherein x, L1, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R1 is halogen.
[0160] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-5):
wherein x, L1, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R1 is halogen.
[0161] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-6):
wherein x, L1, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, R1 is halogen.
[0162] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-7):
(II-7); wherein x and L1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
[0163] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-8):
(II-8); wherein x and L1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
[0164] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-9):
(II-9); wherein x and L1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
[0165] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-10):
(II-10); wherein x and L1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4.
[0166] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-11):
[0167] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-12):
[0168] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-13):
[0169] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-14):
[0170] Provided herein are proteins comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II-15):
[0171] In embodiments of the compounds described herein, the protein is an antibody or an antibody variant. In embodiments, the protein is an antibody. In embodiments, the protein is an antibody variant. In embodiments, the antibody variant is a variant as defined herein. In embodiments, the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment. In embodiments, the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the unnatural amino acid is within a CDR region or a framework region of the antibody. In embodiments, the unnatural amino acid is within a CDR region of the antibody. In embodiments, the unnatural amino acid is within a framework region of the antibody. In embodiments, the unnatural amino acid is within a CDR region or a framework region of the antibody variant. In embodiments, the unnatural amino acid is within a CDR region of the antibody variant. In embodiments, the unnatural amino acid is within a framework region of the antibody variant.
[0172] In embodiments of the compounds described herein, the protein is a receptor protein. In embodiments, the receptor protein is a programmed death-ligand 1 (PD-L1) receptor, a programmed cell death protein 1 (PD-1) receptor, a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G protein-coupled receptor, a G protein-coupled estrogen receptor, a histamine receptor, a hydroxycarboxylic acid receptor, a kisspeptin receptor, a leukotriene receptor, a lysophospholipid receptor, a lysophospholipid S1P receptor, a melanin-concentrating hormone receptor, a melanocortin receptor, a melatonin receptor, a motilin receptor, a neuromedin U receptor, a neuropeptide FF/neuropeptide AF receptor, a neuropeptide S receptor, a neuropeptide W/neuropeptide B receptor, a neuropeptide Y receptor, a neurotensin receptor, an opioid receptor, an opsin receptor, an orexin receptor, an oxoglutarate receptor, a P2Y receptor, a platelet-activating factor receptor, a prokineticin receptor, a prolactin-releasing peptide receptor, a prostanoid receptor, a proteinase-activated receptor, a QRFP receptor, a relaxin family peptide receptor, a somatostatin receptor, a succinate receptor, a tachykinin receptor, a thyrotropin-releasing hormone receptor, a trace amine receptor, a urotensin receptor, a vasopressin receptor, or a combination of two or more thereof. In embodiments, the receptor protein is an integrin. In embodiments, the receptor protein is a somatostatin receptor. In embodiments, the receptor protein is a gonadotropin-releasing hormone receptor. In embodiments, the receptor protein is a bombesin receptor. In embodiments, the receptor protein is a vasoactive intestinal peptide receptor. In embodiments, the receptor protein is a neurotensin receptor. In embodiments, the receptor protein is a cholecystokinin 2 receptor. In embodiments, the receptor protein is a melanocortin receptor. In embodiments, the receptor protein is a ghrelin receptor.
[0173] In embodiments, the receptor protein is a PD-L1 receptor or a PD-1 receptor. In embodiments, the receptor protein is a PD-L1 receptor. In embodiments, the receptor protein is a PD-1 receptor.
[0174] In embodiments, the receptor protein is a receptor expressed on a cancer cell. In embodiments, the receptor protein is a receptor overexpressed on a cancer cell relative to a control.
[0175] In embodiments, the receptor protein is a G protein-coupled receptor. In embodiments, the receptor protein is a receptor tyrosine kinase. In embodiments, the receptor protein is an ErbB receptor. In embodiments, the receptor protein is an epidermal growth factor receptor (EGFR). In embodiments, the receptor protein is epidermal growth factor receptor 1 (HER1). In embodiments, the receptor protein is epidermal growth factor receptor 2 (HER2). In embodiments, the receptor protein is epidermal growth factor receptor 3 (HER3). In embodiments, the receptor protein is epidermal growth factor receptor 4 (HER4).
[0176] In embodiments, the protein is a cell surface receptor. In embodiments, the cell surface receptor is in the extracellular domain, the transmembrane domain, or the intracellular domain. In embodiments, the protein is a cytosolic protein. In embodiments, the protein is a transcriptional factor. In embodiments, the protein is an enzyme.
[0177] In embodiments, the protein further comprises a detectable agent or a therapeutic agent. In embodiments, the protein further comprises a detectable agent and a therapeutic agent. In embodiments, the protein further comprises a detectable agent. In embodiments, the detectable agent is a radioisotope. In embodiments, the protein further comprises a therapeutic agent.
[0178] Conjugates
[0179] Provided herein is a biomolecule conjugate of Formula (III), Formula (III-1), and Formula (III-2):
wherein: R4 and R5 are each independently a peptidyl moiety, a carbohydrate moiety, or a nucleic acid moiety; ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L4 is a bond or -O-; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; L2 is a bond, -NR2A-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R2A)C(O)-, -C(O)N(R2A)-, -NR2AC(O)NR2B-, -NR2AC(NH)NR2B-, -SO2N(R2A)-, -N(R2A)SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene; L3 is a bond, -N(R3A)-, -S-, -S(O)2-, -O-, -C(S)-, -C(O)-, -C(O)O-, -OC(O)-, -N(R3A)C(O)-, -C(O)N(R3A)-, -NR3AC(O)NR3B-, -N(R3A)SO2-, -NR3AC(NH)NR3B-, -SO2N(R3A)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene; and R2A, R2B, R3A, and R3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; R1 is hydrogen, halogen, -C(X1)3, -CH(X1)2, -CH2X1, -OC(X1)3, -OCH2X1, -OCH(X1)2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X1 is independently -F, -Cl, -Br, or -I; R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; v1 is 1 or 2. R1 is meta or ortho to the carbon atom linked to -L4S(O2)L3R5. In embodiments, R1 is hydrogen, halogen, -C(X1)3, -CH(X1)2, -CH2X1, -OC(X1)3, -OCH2X1, -OCH(X1)2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
[0180] In embodiments of the compounds of Formula (I) described herein, ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl. In embodiments, ring A is a 5-membered cycloalkyl. In embodiments, ring A is a 5-membered cycloalkyl having no C=C double bonds. In embodiments, ring A is a 5-membered cycloalkyl having one C=C double bond. In embodiments, ring A is a 5-membered cycloalkyl having two C=C double bonds. In embodiments, ring A is a 5-membered heterocycloalkyl. In embodiments, ring A is a 5-membered heterocycloalkyl having no double bonds. In embodiments, ring A is a 5-membered heterocycloalkyl having one double bond.
[0181] In embodiments, ring A is a 5-membered heteroaryl. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 4 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is a 5-membered heteroaryl containing 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur. In embodiments, ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole. In embodiments, ring A is pyrrole. In embodiments, ring A is pyrazole. In embodiments, ring A is imidazole. In embodiments, ring A is triazole. In embodiments, ring A is furan. In embodiments, ring A is thiophene. In embodiments, ring A is phosphole. In embodiments, ring A is oxazole. In embodiments, ring A is isoxazole. In embodiments, ring A is thiazole. In embodiments, ring A is isothiazole. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a carbon atom in the 5-membered heteroaryl. In embodiments, L1 is attached to a carbon atom in the 5-membered heteroaryl and the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl. In embodiments, L1 is attached to a heteroatom in the 5-membered heteroaryl, and the -S(O2)F moiety is attached to a heteroatom in the 5-membered heteroaryl.
[0182] Provided herein are the following biomolecule conjugates:
wherein R4, R5, x, L1, L2, L3, and R1 are as defined herein. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2. In embodiments, y is 0. In embodiments, y is 1. In embodiments, y is 2. In embodiments, x is an integer from 0 to 6. In embodiments, x is an integer from 2 to 6. In embodiments, x is 4. In embodiments, -(CH2)x-L1- is -(CH2)4NH-C(O)-. In embodiments, -(CH2)x-L1- is -(CH2)4NH-C(O)-O-. In embodiments, -(CH2)x-L1- is -(CH2)4NH-C(O)-NH-. In embodiments, -(CH2)x-L1- is -(CH2)4NH-C(O)-S-.
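As an illustration only, the combinations of x, L1 core, and y recited above can be enumerated mechanically. The following Python sketch is not part of the disclosure; the function names and the labels "amide", "carbamate", "urea", and "thiocarbamate" are hypothetical conveniences chosen here for the four recited L1 cores, and the output is merely the plain-text notation used in this paragraph.

```python
from itertools import product

# The four L1 cores recited in the text; the dictionary keys are
# illustrative labels, not terms taken from the specification.
LINKER_CORES = {
    "amide": "NH-C(O)-",
    "carbamate": "NH-C(O)-O-",
    "urea": "NH-C(O)-NH-",
    "thiocarbamate": "NH-C(O)-S-",
}


def methylene(count):
    """Render (CH2)count in the flattened notation of the text."""
    if count == 0:
        return ""
    if count == 1:
        return "(CH2)"
    return f"(CH2){count}"


def side_chain_fragment(x, core, y):
    """Build the string for -(CH2)x-L1- with L1 = core followed by (CH2)y."""
    tail = methylene(y)
    if tail:
        tail += "-"
    return f"-{methylene(x)}{LINKER_CORES[core]}{tail}"


def enumerate_fragments(x_values, y_values):
    """List every x / core / y combination in the given ranges."""
    return [
        side_chain_fragment(x, core, y)
        for x, core, y in product(x_values, LINKER_CORES, y_values)
    ]


if __name__ == "__main__":
    # x from 2 to 6 and y from 0 to 2, as in the narrower embodiments.
    for fragment in enumerate_fragments(range(2, 7), range(0, 3)):
        print(fragment)
```

With x fixed at 4 and y at 0, the four strings produced correspond to the four -(CH2)x-L1- embodiments listed at the end of the paragraph.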
[0184] In embodiments of the compounds described herein, R4 and R5 are each independently a peptidyl moiety. In embodiments, the peptidyl moiety of R4 comprises an antibody; and the peptidyl moiety of R5 comprises a protein. In embodiments, the peptidyl moiety of R4 comprises an antibody; and the peptidyl moiety of R5 comprises a protein, wherein the protein is the target of the antibody. In embodiments, the peptidyl moiety of R4 comprises an antibody variant; and the peptidyl moiety of R5 comprises a protein. In embodiments, the peptidyl moiety of R4 comprises an antibody variant; and the peptidyl moiety of R5 comprises a protein, wherein the protein is the target of the antibody variant. In embodiments, the peptidyl moiety of R4 comprises a protein; and the peptidyl moiety of R5 comprises an antibody or an antibody variant. In embodiments, the peptidyl moiety of R4 comprises a protein; and the peptidyl moiety of R5 comprises an antibody or an antibody variant, wherein the protein is the target of the antibody or antibody variant.
[0185] In embodiments of the compounds described herein, the peptidyl moiety of R4 or R5 is an antibody or an antibody variant. In embodiments, the peptidyl moiety of R4 or R5 is an antibody. In embodiments, the peptidyl moiety of R4 or R5 is an antibody variant. In embodiments, the antibody variant is a variant as defined herein. In embodiments, the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment. In embodiments, the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the unnatural amino acid is within a CDR region or a framework region of the antibody. In embodiments, the unnatural amino acid is within a CDR region of the antibody. In embodiments, the unnatural amino acid is within a framework region of the antibody. In embodiments, the unnatural amino acid is within a CDR region or a framework region of the antibody variant. In embodiments, the unnatural amino acid is within a CDR region of the antibody variant. In embodiments, the unnatural amino acid is within a framework region of the antibody variant.
[0186] In embodiments of the compounds described herein, the peptidyl moiety of R4 or R5 is a receptor protein. In embodiments, the receptor protein is a programmed death-ligand 1 (PD-L1) receptor, a programmed cell death protein 1 (PD-1) receptor, a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G protein-coupled receptor, a G protein-coupled estrogen receptor, a histamine receptor, a hydroxycarboxylic acid receptor, a kisspeptin receptor, a leukotriene receptor, a lysophospholipid receptor, a lysophospholipid S1P receptor, a melanin-concentrating hormone receptor, a melanocortin receptor, a melatonin receptor, a motilin receptor, a neuromedin U receptor, a neuropeptide FF/neuropeptide AF receptor, a neuropeptide S receptor, a neuropeptide W/neuropeptide B receptor, a neuropeptide Y receptor, a neurotensin receptor, an opioid receptor, an opsin receptor, an orexin receptor, an oxoglutarate receptor, a P2Y receptor, a platelet-activating factor receptor, a prokineticin receptor, a prolactin-releasing peptide receptor, a prostanoid receptor, a proteinase-activated receptor, a QRFP receptor, a relaxin family peptide receptor, a somatostatin receptor, a succinate receptor, a tachykinin receptor, a thyrotropin-releasing hormone receptor, a trace amine receptor, a urotensin receptor, a vasopressin receptor, or a combination of two or more thereof. In embodiments, the receptor protein is an integrin. In embodiments, the receptor protein is a somatostatin receptor. In embodiments, the receptor protein is a gonadotropin-releasing hormone receptor. In embodiments, the receptor protein is a bombesin receptor. In embodiments, the receptor protein is a vasoactive intestinal peptide receptor. In embodiments, the receptor protein is a neurotensin receptor. In embodiments, the receptor protein is a cholecystokinin 2 receptor. In embodiments, the receptor protein is a melanocortin receptor. In embodiments, the receptor protein is a ghrelin receptor.
[0187] In embodiments, the receptor protein is a PD-L1 receptor or a PD-1 receptor. In embodiments, the receptor protein is a PD-L1 receptor. In embodiments, the receptor protein is a PD-1 receptor.
[0188] In embodiments, the receptor protein is a receptor expressed on a cancer cell. In embodiments, the receptor protein is a receptor overexpressed on a cancer cell relative to a control.
[0189] In embodiments, the receptor protein is a G protein-coupled receptor. In embodiments, the receptor protein is a receptor tyrosine kinase. In embodiments, the receptor protein is an ErbB receptor. In embodiments, the receptor protein is an epidermal growth factor receptor (EGFR). In embodiments, the receptor protein is epidermal growth factor receptor 1 (HER1). In embodiments, the receptor protein is epidermal growth factor receptor 2 (HER2). In embodiments, the receptor protein is epidermal growth factor receptor 3 (HER3). In embodiments, the receptor protein is epidermal growth factor receptor 4 (HER4).
[0190] In embodiments, the peptidyl moiety of R4 or R5 is a cell surface receptor. In embodiments, the cell surface receptor is in the extracellular domain, the transmembrane domain, or the intracellular domain. In embodiments, the peptidyl moiety of R4 or R5 is a cytosolic protein. In embodiments, the peptidyl moiety of R4 or R5 is a transcriptional factor. In embodiments, the peptidyl moiety of R4 or R5 is an enzyme.
[0191] In embodiments, the biomolecule conjugate further comprises a detectable agent or a therapeutic agent. In embodiments, the biomolecule conjugate further comprises a detectable agent and a therapeutic agent. In embodiments, the biomolecule conjugate further comprises a detectable agent. In embodiments, the detectable agent is a radioisotope. In embodiments, the biomolecule conjugate further comprises a therapeutic agent.
In embodiments, L5 is -O-, -NH-, or -S-. In embodiments, L5 is -NH- or -S-. In embodiments, L5 is -NH-. In embodiments, L5 is -S-. In embodiments, L5 is -O-. In embodiments, L5 is a bond. In embodiments, -S(O2)F is meta to the carbon atom bonded to L5. In embodiments, -S(O2)F is ortho to the carbon atom bonded to L5. In embodiments, -S(O2)F is para to the carbon atom bonded to L5. In embodiments, the compound is Formula (IV-1) or a stereoisomer thereof. In embodiments, the compound is Formula (IV-2) or a stereoisomer thereof. In embodiments, the compound is Formula (IV-3) or a stereoisomer thereof. In embodiments, the compound is Formula (IV-4) or a stereoisomer thereof. In embodiments, the compound is Formula (IV-5) or a stereoisomer thereof.
[0193] Provided herein is a protein comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain having the formula:
wherein R1 and L4 are as defined herein; and L5 is a bond, -O-, -NH-, or -S-. In embodiments, L5 is -O-, -NH-, or -S-. In embodiments, L5 is -NH- or -S-. In embodiments, L5 is -NH-. In embodiments, L5 is -S-. In embodiments, L5 is -O-. In embodiments, L5 is a bond. In embodiments, -S(O2)F is meta to the carbon atom bonded to L5. In embodiments, -S(O2)F is ortho to the carbon atom bonded to L5. In embodiments, -S(O2)F is para to the carbon atom bonded to L5.
[0194] Provided herein is a biomolecule conjugate comprising the proteins of Formula (V) described herein, including embodiments thereof.
[0195] Substituents
[0196] In embodiments of the compounds, proteins, and conjugates described herein, R1 is hydrogen, halogen, -C(X1)3, -CH(X1)2, -CH2X1, -OC(X1)3, -OCH2X1, -OCH(X1)2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, -NR3+, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. In embodiments, R1 is hydrogen, halogen, -C(X1)3, -CH(X1)2, -CH2X1, -OC(X1)3, -OCH2X1, -OCH(X1)2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, -NR3+, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl. In embodiments, R1 is halogen, -C(X1)3, -CH(X1)2, -CH2X1, -OC(X1)3, -OCH2X1, -OCH(X1)2, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, -NR3+, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl.
[0197] In embodiments, R1 is an electron-donating group or an electron-withdrawing group.
[0198] In embodiments, R1 is an electron-withdrawing group. In embodiments, the electron-withdrawing group is halogen, -C(X1)3, -CH(X1)2, -CH2X1, -CN, -SOn1R1A, -SOv1NR1AR1B, -N(O)m1, -C(O)R1A, -C(O)OR1A, -C(O)NR1AR1B, -NR1AOR1B, -NR3+, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl, wherein X1, R1A, R1B, n1, v1, and m1 are as defined herein. In embodiments, R1A and R1B are hydrogen.
[0199] In embodiments, R1 is an electron-donating group. In embodiments, the electron-donating group is -Cl, -Br, -I, -C(X2)3, -CH(X2)2, -OC(X1)3, -OCH2X1, -OCH(X1)2, -OCOR1A, -OC(O)R1A, -OC(O)NR1AR1B, -SR1A, -PR1AR1B, -NHC(O)NR1AR1B, -NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. In embodiments, the substituted or unsubstituted alkyl is substituted or unsubstituted alkene. In embodiments, the electron-donating group is unsubstituted alkene. In embodiments, the substituted or unsubstituted alkyl is substituted or unsubstituted alkyne. In embodiments, R1A and R1B are hydrogen. In embodiments, the electron-donating group is unsubstituted alkyne.
[0200] In embodiments of the compounds described herein, R1 is substituted or unsubstituted heteroalkyl. In embodiments, R1 is unsubstituted heteroalkyl. In embodiments, R1 is unsubstituted 2 to 8 membered heteroalkyl. In embodiments, R1 is unsubstituted 2 to 6 membered heteroalkyl. In embodiments, R1 is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R1 is -O(CH2)mCH3, and m is an integer from 0 to 6. In embodiments, R1 is -O(CH2)mCH3, and m is an integer from 0 to 4. In embodiments, R1 is -O(CH2)mCH3, and m is an integer from 0 to 3. In embodiments, R1 is -O(CH2)mCH3, and m is an integer from 0 to 2. In embodiments, R1 is -O(CH2)mCH3, and m is 0 or 1. In embodiments, R1 is -OCH3. In embodiments, R1 is -OCH2CH3. In embodiments, R1 is -O(CH2)2CH3. In embodiments, R1 is -O(CH2)3CH3. In embodiments, R1 is hydrogen.
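By way of a hedged illustration, the relationship between the index m in -O(CH2)mCH3 and the "membered" size of the resulting heteroalkyl group can be checked with a few lines of Python. The sketch assumes the usual convention that the member count of a heteroalkyl chain is the number of carbon atoms plus heteroatoms in the chain; the function names are hypothetical and are not taken from the specification.

```python
def alkoxy(m):
    """Return the flattened formula for -O(CH2)mCH3 as written in the text."""
    if m == 0:
        return "-OCH3"
    if m == 1:
        return "-OCH2CH3"
    return f"-O(CH2){m}CH3"


def chain_members(m):
    """Count chain atoms: one oxygen, m methylene carbons, one methyl carbon."""
    return 1 + m + 1


if __name__ == "__main__":
    # m from 0 to 6 is the broadest range recited for this group.
    for m in range(0, 7):
        print(f"m={m}: {alkoxy(m)} ({chain_members(m)} membered)")
```

Under that assumption, m from 0 to 6 spans the 2 to 8 membered range, m from 0 to 4 spans 2 to 6 membered, and m from 0 to 2 spans 2 to 4 membered, which is consistent with the heteroalkyl sizes recited in the same paragraph.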
[0201] In embodiments of the compounds described herein, R1 is halogen. In embodiments, R1 is fluorine, chlorine, bromine, or iodine. In embodiments, R1 is fluorine, chlorine, or bromine. In embodiments, R1 is fluorine or chlorine. In embodiments, R1 is fluorine or bromine. In embodiments, R1 is chlorine or bromine. In embodiments, R1 is fluorine. In embodiments, R1 is chlorine. In embodiments, R1 is bromine. In embodiments, R1 is iodine.
[0202] In embodiments, R1 is -C(X1)3, -CH(X1)2, or -CH2X1, wherein X1 is halogen. In embodiments, R1 is -CH2X1. In embodiments, R1 is -CH(X1)2. In embodiments, R1 is -C(X1)3. In embodiments, R1 is -CF3. In embodiments, R1 is -CHF2. In embodiments, R1 is -CH2F. In embodiments, R1 is -CCl3. In embodiments, R1 is -CHCl2. In embodiments, R1 is -CH2Cl. In embodiments, R1 is -CBr3. In embodiments, R1 is -CHBr2. In embodiments, R1 is -CH2Br. In embodiments, R1 is -CN. In embodiments, R1 is -N(O)m1. In embodiments, R1 is -NO2. In embodiments, R1 is -SOn1R1A. In embodiments, R1 is -SO2H. In embodiments, R1 is -SOv1NR1AR1B. In embodiments, R1 is -SO2NH2. In embodiments, R1 is -NR3+.
[0203] In embodiments of the compounds described herein, R1 is an alkyl group substituted with an electron-withdrawing group. In embodiments, R1 is a halogen-substituted alkyl group. In embodiments, R1 is -(CH2)wC(X1)3, -(CH2)wCH(X1)2, or -(CH2)wCH2X1, wherein w is an integer from 1 to 5, and X1 is halogen. In embodiments, w is 1. In embodiments, w is 2. In embodiments, w is 3. In embodiments, w is 4. In embodiments, w is 5.
[0204] In embodiments of the compounds, proteins, and conjugates described herein, R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl. In embodiments, R1A is hydrogen, unsubstituted alkyl, or unsubstituted heteroalkyl. In embodiments, R1A is hydrogen, substituted or unsubstituted C1-4 alkyl, or substituted or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R1A is hydrogen. In embodiments, R1A is unsubstituted C1-4 alkyl. In embodiments, R1A is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R1A is hydrogen and R1B is hydrogen.
[0205] In embodiments of the compounds, proteins, and conjugates described herein, R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl. In embodiments, R1B is hydrogen, unsubstituted alkyl, or unsubstituted heteroalkyl. In embodiments, R1B is hydrogen, substituted or unsubstituted C1-4 alkyl, or substituted or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R1B is hydrogen. In embodiments, R1B is unsubstituted C1-4 alkyl. In embodiments, R1B is unsubstituted 2 to 4 membered heteroalkyl. In embodiments, R1A is hydrogen and R1B is hydrogen.
[0206] In embodiments of the compounds, proteins, and conjugates described herein, X1 is independently -F, -Cl, -Br, or -I. In embodiments, X1 is independently -F, -Cl, or -Br. In embodiments, X1 is independently -F or -Cl. In embodiments, X1 is -F. In embodiments, X1 is -Cl. In embodiments, X1 is -Br. In embodiments, X1 is -I.
[0207] In embodiments of the compounds, proteins, and conjugates described herein, n1 is an integer from 0 to 4. In embodiments, n1 is an integer from 0 to 3. In embodiments, n1 is an integer from 0 to 2. In embodiments, n1 is 0. In embodiments, n1 is 1. In embodiments, n1 is 2. In embodiments, n1 is 3. In embodiments, n1 is 4.
[0208] In embodiments of the compounds, proteins, and conjugates described herein, m1 is 1 or 2. In embodiments, m1 is 1. In embodiments, m1 is 2.
[0209] In embodiments of the compounds, proteins, and conjugates described herein, v1 is 1 or 2. In embodiments, v1 is 1. In embodiments, v1 is 2.
[0210] In embodiments of the compounds, proteins, and conjugates described herein, x is an integer from 0 to 8. In embodiments, x is an integer from 1 to 8. In embodiments, x is an integer from 1 to 7. In embodiments, x is an integer from 1 to 6. In embodiments, x is an integer from 1 to 5. In embodiments, x is an integer from 1 to 4. In embodiments, x is an integer from 1 to 3. In embodiments, x is an integer of 1 or 2. In embodiments, x is 1. In embodiments, x is 2. In embodiments, x is 3. In embodiments, x is 4. In embodiments, x is 5. In embodiments, x is 6. In embodiments, x is 7. In embodiments, x is 8. In embodiments, x is 0.
[0211] In embodiments of the compounds, proteins, and conjugates described herein, L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene. In embodiments, L1 is a bond. In embodiments, L1 is substituted or unsubstituted alkylene. In embodiments, L1 is substituted or unsubstituted C1-6 alkylene. In embodiments, L1 is substituted or unsubstituted C1-4 alkylene. In embodiments, L1 is unsubstituted alkylene. In embodiments, L1 is unsubstituted C1-6 alkylene. In embodiments, L1 is unsubstituted C1-4 alkylene. In embodiments, L1 is methylene. In embodiments, L1 is ethylene. In embodiments, L1 is propylene. In embodiments, L1 is substituted or unsubstituted heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 8 membered heteroalkylene. In embodiments, L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene. In embodiments, L1 is -NH-C(O)-(CH2)y- or -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 6. In embodiments, L1 is -NH-C(O)-(CH2)y- or -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 5. In embodiments, L1 is -NH-C(O)-(CH2)y- or -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 4. In embodiments, L1 is -NH-C(O)-(CH2)y- or -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 3. In embodiments, L1 is -NH-C(O)-(CH2)y- or -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2. In embodiments, L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 3. In embodiments, L1 is -NH-C(O)-. In embodiments, L1 is -NH-C(O)-(CH2)-. In embodiments, L1 is -NH-C(O)-(CH2)2-. In embodiments, L1 is -NH-C(O)-(CH2)3-. In embodiments, L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 3. In embodiments, L1 is -NH-C(O)-O-. In embodiments, L1 is -NH-C(O)-O-(CH2)-. In embodiments, L1 is -NH-C(O)-O-(CH2)2-. In embodiments, L1 is -NH-C(O)-O-(CH2)3-.
[0212] In embodiments of the compounds, proteins, and conjugates described herein, L2 is a bond, -NR2A-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R2A)C(O)-, -C(O)N(R2A)-, -NR2AC(O)NR2B-, -NR2AC(NH)NR2B-, -SO2N(R2A)-, -N(R2A)SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene. In embodiments, L2 is a bond, -NH-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -NHC(NH)NH-, -SO2NH-, -NHSO2-, -C(S)-, L12-substituted or unsubstituted alkylene, L12-substituted or unsubstituted heteroalkylene, L12-substituted or unsubstituted cycloalkylene, L12-substituted or unsubstituted heterocycloalkylene, L12-substituted or unsubstituted arylene, or L12-substituted or unsubstituted heteroarylene. In embodiments, L2 is a bond, -NH-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -NHC(NH)NH-, -SO2NH-, -NHSO2-, -C(S)-, unsubstituted alkylene, unsubstituted heteroalkylene, unsubstituted cycloalkylene, unsubstituted heterocycloalkylene, unsubstituted arylene, or unsubstituted heteroarylene. In embodiments, L2 is a bond. In embodiments, the alkylene is a C1-6 alkylene. In embodiments, the alkylene is a C1-4 alkylene. In embodiments, the heteroalkylene is a 2 to 6 membered heteroalkylene. In embodiments, the heteroalkylene is a 2 to 4 membered heteroalkylene. In embodiments, the cycloalkylene is a C5-C6 cycloalkylene. In embodiments, the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene. In embodiments, the arylene is a C5-6 arylene. In embodiments, the heteroarylene is a 5 or 6 membered heteroarylene.
[0213] In embodiments of the compounds described herein, -(CH2)x-L1- is -(CH2)xNHC(O)- or -(CH2)xNHC(O)O-, where x is as defined herein. In embodiments, -(CH2)x-L1- is -(CH2)xNHC(O)-, where x is as defined herein. In embodiments, -(CH2)x-L1- is -(CH2)NHC(O)-. In embodiments, -(CH2)x-L1- is -(CH2)2NHC(O)-. In embodiments, -(CH2)x-L1- is -(CH2)3NHC(O)-. In embodiments, -(CH2)x-L1- is -(CH2)4NHC(O)-. In embodiments, -(CH2)x-L1- is -(CH2)5NHC(O)-. In embodiments, -(CH2)x-L1- is -(CH2)6NHC(O)-. In embodiments, -(CH2)x-L1- is -(CH2)xNHC(O)O-, where x is as defined herein. In embodiments, -(CH2)x-L1- is -(CH2)NHC(O)O-. In embodiments, -(CH2)x-L1- is -(CH2)2NHC(O)O-. In embodiments, -(CH2)x-L1- is -(CH2)3NHC(O)O-. In embodiments, -(CH2)x-L1- is -(CH2)4NHC(O)O-. In embodiments, -(CH2)x-L1- is -(CH2)5NHC(O)O-. In embodiments, -(CH2)x-L1- is -(CH2)6NHC(O)O-.
[0214] In embodiments of the compounds described herein, L1 is a bond and L2 is a bond. In embodiments of the compounds described herein, R2 is a peptidyl moiety, R3 is a peptidyl moiety, L1 is a bond, and L2 is a bond.
[0215] In embodiments of the compounds, proteins, and conjugates described herein, R2A and R2B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. In embodiments, the alkylene is a C1-4 alkylene. In embodiments, the heteroalkylene is a 2 to 6 membered heteroalkylene. In embodiments, the heteroalkylene is a 2 to 4 membered heteroalkylene. In embodiments, the cycloalkylene is a C5-C6 cycloalkylene. In embodiments, the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene. In embodiments, the arylene is a C5-6 arylene. In embodiments, the heteroarylene is a 5 or 6 membered heteroarylene. In embodiments, R2A and R2B are hydrogen.
[0216] In embodiments of the compounds, proteins, and conjugates described herein, L12 is halogen, -CF3, -CBr3, -CCl3, -CI3, -CHF2, -CHBr2, -CHCl2, -CHI2, -CH2F, -CH2Br, -CH2Cl, -CH2I, -OCF3, -OCBr3, -OCCl3, -OCI3, -OCHF2, -OCHBr2, -OCHCl2, -OCHI2, -OCH2F, -OCH2Br, -OCH2Cl, -OCH2I, -CN, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, -N(O)2, -NHSO2H, -NHC(O)H, -NHC(O)OH, -NHOH, -N3, unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted heterocycloalkyl, unsubstituted aryl, or unsubstituted heteroaryl. In embodiments, the alkylene is a C1-4 alkylene. In embodiments, the heteroalkylene is a 2 to 6 membered heteroalkylene. In embodiments, the heteroalkylene is a 2 to 4 membered heteroalkylene. In embodiments, the cycloalkylene is a C5-C6 cycloalkylene. In embodiments, the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene. In embodiments, the arylene is a C5-6 arylene. In embodiments, the heteroarylene is a 5 or 6 membered heteroarylene.
[0217] In embodiments of the compounds, proteins, and conjugates described herein, L3 is a bond, -N(R3A)-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R3A)C(O)-, -C(O)N(R3A)-, -NR3AC(O)NR3B-, -NR3AC(NH)NR3B-, -SO2N(R3A)-, -N(R3A)SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene. In embodiments, L3 is a bond, -NH-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -NHC(O)-, -C(O)NH-, -NHC(O)NH-, -NHC(NH)NH-, -SO2NH-, -NHSO2-, -C(S)-, L13-substituted or unsubstituted alkylene, L13-substituted or unsubstituted heteroalkylene, L13-substituted or unsubstituted cycloalkylene, L13-substituted or unsubstituted heterocycloalkylene, L13-substituted or unsubstituted arylene, or L13-substituted or unsubstituted heteroarylene. In embodiments, the alkylene is a C1-4 alkylene. In embodiments, the heteroalkylene is a 2 to 6 membered heteroalkylene. In embodiments, the heteroalkylene is a 2 to 4 membered heteroalkylene. In embodiments, the cycloalkylene is a C5-C6 cycloalkylene. In embodiments, the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene. In embodiments, the arylene is a C5-6 arylene. In embodiments, the heteroarylene is a 5 or 6 membered heteroarylene.
[0218] In embodiments of the compounds, proteins, and conjugates described herein, R3A and R3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl. In embodiments, the alkylene is a C1-4 alkylene. In embodiments, the heteroalkylene is a 2 to 6 membered heteroalkylene. In embodiments, the heteroalkylene is a 2 to 4 membered heteroalkylene. In embodiments, the cycloalkylene is a C5-C6 cycloalkylene. In embodiments, the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene. In embodiments, the arylene is a C5-6 arylene. In embodiments, the heteroarylene is a 5 or 6 membered heteroarylene.
[0219] In embodiments of the compounds, proteins, and conjugates described herein, L13 is halogen, -CF3, -CBr3, -CCl3, -CI3, -CHF2, -CHBr2, -CHCl2, -CHI2, -CH2F, -CH2Br, -CH2Cl, -CH2I, -OCF3, -OCBr3, -OCCl3, -OCI3, -OCHF2, -OCHBr2, -OCHCl2, -OCHI2, -OCH2F, -OCH2Br, -OCH2Cl, -OCH2I, -CN, -OH, -NH2, -COOH, -CONH2, -NO2, -SH, -SO3H, -SO4H, -SO2NH2, -NHNH2, -ONH2, -NHC(O)NHNH2, -N(O)2, -NHSO2H, -NHC(O)H, -NHC(O)OH, -NHOH, -N3, unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted heterocycloalkyl, unsubstituted aryl, or unsubstituted heteroaryl. In embodiments, the alkylene is a C1-4 alkylene. In embodiments, the heteroalkylene is a 2 to 6 membered heteroalkylene. In embodiments, the heteroalkylene is a 2 to 4 membered heteroalkylene. In embodiments, the cycloalkylene is a C5-C6 cycloalkylene. In embodiments, the heterocycloalkylene is a 5 or 6 membered heterocycloalkylene. In embodiments, the arylene is a C5-6 arylene. In embodiments, the heteroarylene is a 5 or 6 membered heteroarylene.
[0220] In embodiments of the compounds described herein, the peptidyl moiety of R4 comprises an antibody or an antibody variant; and the peptidyl moiety of R5 comprises a protein. In embodiments, the peptidyl moiety of R4 comprises an antibody or an antibody variant; and the peptidyl moiety of R5 comprises a protein, wherein the protein comprises a lysine, histidine, or tyrosine bonded to L3, where L3 is a bond. In embodiments, R4 comprises an antibody. In embodiments, R4 comprises an antibody variant. In embodiments, the antibody variant is a variant as defined herein. In embodiments, the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment. In embodiments, the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the protein is the target protein of the antibody or antibody variant. In embodiments, the target protein is a receptor protein.
[0221] In embodiments of the compounds described herein, the peptidyl moiety of R4 comprises a protein; and the peptidyl moiety of R5 comprises an antibody or an antibody variant. In embodiments, the peptidyl moiety of R4 comprises a protein; and the peptidyl moiety of R5 comprises an antibody or an antibody variant; wherein the antibody or antibody variant comprises a lysine, histidine, or tyrosine bonded to L3, where L3 is a bond. In embodiments, R5 comprises an antibody. In embodiments, R5 comprises an antibody variant. In embodiments, the antibody variant is a variant as defined herein. In embodiments, the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment. In embodiments, the antibody variant is a single-chain variable fragment. In embodiments, the antibody variant is a single-domain antibody. In embodiments, the antibody variant is an affibody. In embodiments, the antibody variant is an antigen-binding fragment. In embodiments, the protein is the target protein of the antibody or antibody variant. In embodiments, the target protein is a receptor protein.
[0222] In embodiments of the compounds described herein, R5 is a peptidyl moiety comprising a lysine, histidine, or tyrosine bonded to L3. In embodiments, R5 is a peptidyl moiety comprising a lysine bonded to L3. In embodiments, R5 is a peptidyl moiety comprising a histidine bonded to L3. In embodiments, R5 is a peptidyl moiety comprising a tyrosine bonded to L3. In embodiments, R5 is a peptidyl moiety comprising a lysine, histidine, or tyrosine bonded to L3, where L3 is a bond. In embodiments, R5 is a peptidyl moiety comprising a lysine bonded to L3, where L3 is a bond. In embodiments, R5 is a peptidyl moiety comprising a histidine bonded to L3, where L3 is a bond. In embodiments, R5 is a peptidyl moiety comprising a tyrosine bonded to L3, where L3 is a bond. In embodiments, L2 is a bond.
[0223] In embodiments, the biomolecules, proteins, and peptidyl moieties described herein comprise a receptor protein. In embodiments, the receptor protein is a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G protein-coupled receptor, a G protein-coupled estrogen receptor, a histamine receptor, a hydroxycarboxylic acid receptor, a kisspeptin receptor, a leukotriene receptor, a lysophospholipid receptor, a lysophospholipid S1P receptor, a melanin-concentrating hormone receptor, a melanocortin receptor, a melatonin receptor, a motilin receptor, a neuromedin U receptor, a neuropeptide FF/neuropeptide AF receptor, a neuropeptide S receptor, a neuropeptide W/neuropeptide B receptor, a neuropeptide Y receptor, a neurotensin receptor, an opioid receptor, an opsin receptor, an orexin receptor, an oxoglutarate receptor, a P2Y receptor, a platelet-activating factor receptor, a prokineticin receptor, a prolactin-releasing peptide receptor, a prostanoid receptor, a proteinase-activated receptor, a QRFP receptor, a relaxin family peptide receptor, a somatostatin receptor, a succinate receptor, a tachykinin receptor, a thyrotropin-releasing hormone receptor, a trace amine receptor, a urotensin receptor, a vasopressin receptor, or a combination of two or more thereof. In embodiments, the receptor protein is an integrin. In embodiments, the receptor protein is a somatostatin receptor. In embodiments, the receptor protein is a gonadotropin-releasing hormone receptor. In embodiments, the receptor protein is a bombesin receptor. In embodiments, the receptor protein is a vasoactive intestinal peptide receptor. In embodiments, the receptor protein is a neurotensin receptor. In embodiments, the receptor protein is a cholecystokinin 2 receptor. In embodiments, the receptor protein is a melanocortin receptor. In embodiments, the receptor protein is a ghrelin receptor.
[0224] In embodiments, the receptor protein is a receptor expressed on a cancer cell. In embodiments, the receptor protein is a receptor overexpressed on a cancer cell relative to a control.
[0225] In embodiments, the receptor protein is a G protein-coupled receptor. In embodiments, the receptor protein is a receptor tyrosine kinase. In embodiments, the receptor protein is an ErbB receptor. In embodiments, the receptor protein is an epidermal growth factor receptor (EGFR). In embodiments, the receptor protein is epidermal growth factor receptor 1 (HER1). In embodiments, the receptor protein is epidermal growth factor receptor 2 (HER2). In embodiments, the receptor protein is epidermal growth factor receptor 3 (HER3). In embodiments, the receptor protein is epidermal growth factor receptor 4 (HER4).
[0226] Proteins
[0227] Provided herein are proteins comprising an unnatural amino acid as described herein, including embodiments thereof, within CDR-L1, CDR-L2, CDR-L3, CDR-H1, CDR-H2, or CDR-H3, wherein the protein is an antigen-binding fragment, a single-chain variable fragment, or an antibody. In embodiments, the protein is an antigen-binding fragment. In embodiments, the protein is a single-chain variable fragment. In embodiments, the protein is an antibody. In embodiments, the protein has one unnatural amino acid within CDR-L1. In embodiments, the protein has one unnatural amino acid within CDR-L2. In embodiments, the protein has one unnatural amino acid within CDR-L3. In embodiments, the protein has one unnatural amino acid within CDR-H1. In embodiments, the protein has one unnatural amino acid within CDR-H2. In embodiments, the protein has one unnatural amino acid within CDR-H3. In embodiments, the protein has two or more unnatural amino acids within CDR-L1, CDR-L2, CDR-L3, CDR-H1, CDR-H2, or CDR-H3. The two or more unnatural amino acids can be in the same or different CDR, and can be in the same or different chain (i.e., light or heavy). In embodiments, the proteins described herein comprise an unnatural amino acid as described herein, including embodiments thereof, within a framework region, wherein the protein is an antigen-binding fragment, a single-chain variable fragment, or an antibody.
[0228] Provided herein are Fabs comprising an unnatural amino acid as described herein, including embodiments thereof. Provided herein are Fabs comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II), including embodiments thereof.
[0229] Nanobodies
[0230] Provided herein are nanobodies comprising an unnatural amino acid having the side chain of Formula (II) as described herein, including embodiments thereof. Provided herein are single-domain antibodies having an unnatural amino acid side chain; wherein the unnatural amino acid side chain is capable of covalently binding to lysine, tyrosine, or histidine. In aspects, the unnatural amino acid side chain is capable of covalently binding to lysine or tyrosine. In aspects, the unnatural amino acid side chain is capable of covalently binding to lysine. In aspects, the unnatural amino acid side chain is capable of covalently binding to tyrosine. Provided herein are nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR1, CDR2, or CDR3 of the nanobody. Provided herein are nanobodies comprising one unnatural amino acid, wherein the one unnatural amino acid is within CDR1, CDR2, or CDR3 of the nanobody. Provided herein are nanobodies comprising two unnatural amino acids, wherein the two unnatural amino acids are within CDR1, CDR2, or CDR3 of the nanobody. Provided herein are nanobodies comprising three unnatural amino acids, wherein the three unnatural amino acids are within CDR1, CDR2, or CDR3 of the nanobody. Provided herein are nanobodies comprising four unnatural amino acids, wherein the four unnatural amino acids are within CDR1, CDR2, or CDR3 of the nanobody. Provided herein are nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR1 of the nanobody. Provided herein are nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR1, but not within CDR2 or CDR3 of the nanobody. Provided herein are nanobodies comprising one unnatural amino acid, wherein the one unnatural amino acid is within CDR1 of the nanobody. Provided herein are nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR2 of the nanobody.
Provided herein are nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR2, and there are not any unnatural amino acids within CDR1 or CDR3 of the nanobody. Provided herein are nanobodies comprising one unnatural amino acid, wherein the one unnatural amino acid is within CDR2 of the nanobody. Provided herein are nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR3 of the nanobody. Provided herein are nanobodies comprising an unnatural amino acid, wherein the unnatural amino acid is within CDR3, and there are not any unnatural amino acids
within CDR1 or CDR2 of the nanobody. Provided herein are nanobodies comprising one unnatural amino acid, wherein the one unnatural amino acid is within CDR3 of the nanobody. In embodiments, the unnatural amino acid comprises a side chain of Formula (II), including embodiments thereof.
[0231] In embodiments, the proteins or biomolecule conjugates described herein, including embodiments thereof, comprise a detectable agent. In embodiments, the detectable agent is a radioisotope. In embodiments, the radioisotope is a positron-emitting radioisotope. In embodiments, the positron-emitting radioisotope is 11C, 13N, 15O, 18F, 64Cu, 68Ga, 78Br, 82Rb, 86Y, 89Zr, 90Y, 22Na, 26Al, 40K, 83Sr, or 124I. In embodiments, the positron-emitting radioisotope is 124I. In embodiments, the radioisotope is an alpha-emitting radioisotope. In embodiments, the alpha-emitting radioisotope is 211At, 227Th, 225Ac, 223Ra, 213Bi, or 212Bi. In embodiments, the alpha-emitting radioisotope is 211At. In embodiments, the proteins or biomolecule conjugates described herein further comprise a therapeutic agent. In embodiments, the proteins or biomolecule conjugates described herein further comprise a detectable agent and a therapeutic agent.
[0232] Cells
[0233] In embodiments, the disclosure provides a cell comprising the compounds, proteins, and conjugates described herein, including embodiments thereof. In aspects, the cell further includes a vector as described herein. In embodiments, the protein described herein, including embodiments thereof, is biosynthesized inside the cell, thereby generating a cell containing the protein. In aspects, the protein described herein, including embodiments thereof, is contained in the medium outside the cell and penetrates into the cell, thereby generating a cell containing the protein. In aspects, the cell comprises a protein complex described herein. A cell can be any prokaryotic or eukaryotic cell. For example, any of the compounds (e.g., single-domain antibody) compositions described herein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as HeLa cells, Chinese hamster ovary cells (CHO) or COS cells). In aspects, a cell can be a premature mammalian cell, i.e., pluripotent stem cell. In aspects, a cell can be derived from other human tissue. Other suitable cells are known to those skilled in the art.
[0234] The proteins provided herein may be delivered to cells using methods well known in the art. Thus, in an aspect is provided a nucleic acid sequence encoding the proteins described herein, including embodiments and aspects thereof. Thus, in an aspect is provided a vector including a nucleic acid sequence encoding the protein described herein, including embodiments and aspects thereof.
[0235] Cellular Compositions
[0236] The disclosure provides cells comprising the compounds, compositions and complexes provided herein, including embodiments thereof. In embodiments, a cell comprises the compound of Formula (I), including any embodiment thereof. In embodiments, a cell comprises the compound of Formula (II), including any embodiment thereof. In embodiments, a cell comprises the compound of Formula (III), including any embodiment thereof.
[0237] In embodiments, the cell further includes a mutant pyrrolysyl-tRNA synthetase as described herein, including embodiments thereof. In embodiments, the cell further includes a vector as described herein, including embodiments thereof. In embodiments, the cell further includes a tRNAPyl.
[0238] In embodiments, the compound of Formula (I) (including embodiments thereof) is biosynthesized inside the cell, thereby generating a cell containing the compound of Formula (I). In embodiments, the compound of Formula (I) is contained in the medium outside the cell and penetrates into the cell, thereby generating a cell containing the compound of Formula (I). In embodiments, the cell comprises the compound of Formula (II) (including embodiments thereof). In embodiments, the cell comprises the compound of Formula (II) that is synthesized inside the cell. In embodiments, the cell comprises the compound of Formula (II) that is synthesized outside a cell, and that penetrates into the cell. In embodiments, the cell comprises the compound of Formula (III) (including embodiments thereof). In embodiments, the cell comprises the compound of Formula (III) that is synthesized inside the cell. In embodiments, the cell comprises the compound of Formula (III) that is synthesized outside a cell, and that penetrates into the cell.
[0239] A cell can be any prokaryotic or eukaryotic cell. In aspects, the cell is prokaryotic. In aspects, the cell is eukaryotic. In aspects, the cell is a bacterial cell, a fungal cell, a plant cell, an archaeal cell, or an animal cell. In aspects, the animal cell is an insect cell or a mammalian cell. In aspects, the cell is a bacterial cell. In aspects, the cell is a fungal cell. In aspects, the cell is a plant cell. In aspects, the cell is an archaeal cell. In aspects, the cell is an animal cell. In aspects, the cell is an insect cell. In aspects, the cell is a mammalian cell. In aspects, the cell is a human cell. For example, any of the compositions described herein can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as HeLa cells, Chinese hamster ovary cells (CHO) or COS cells). In aspects, the cell is a premature mammalian cell, i.e., a pluripotent stem cell. In aspects, the cell is derived from other human tissue. Other suitable cells are known to those skilled in the art.
[0240] Pyrrolysyl-tRNA Synthetase
[0241] As described herein, an unnatural amino acid (e.g., of Formula (I)) may be inserted into or replace a naturally occurring amino acid in a protein. In order for the unnatural amino acid to be inserted or replace an amino acid in a protein, it must be capable of being incorporated during proteinogenesis. Thus, the unnatural amino acid must be present on a transfer RNA molecule (tRNA) such that it may be used in translation. Loading of amino acids occurs via an aminoacyl-tRNA synthetase, which is an enzyme that facilitates the attachment of appropriate amino acids to tRNA molecules. However, the attachment of unnatural amino acids to tRNA may not necessarily be accomplished by the naturally occurring aminoacyl-tRNA synthetase. Engineered aminoacyl-tRNA synthetases (e.g., mutant pyrrolysyl-tRNA synthetase (PylRS)) may be useful for attaching unnatural amino acids to tRNA. A PylRS mutant library was generated. Compared to previously described PylRS mutant libraries, the PylRS mutant library generated herein was constructed using the new small-intelligent mutagenesis approach that allows a greater number of amino acid residues to be mutated simultaneously (e.g., 10 amino acid residues). Mutant pyrrolysyl-tRNA synthetases and methods for making them are described, for example, in US 2021/0002325, WO 2020/072674, and WO 2020/206341, the disclosures of which are incorporated by reference herein in their entirety.
[0242] In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase having at least 85% sequence identity to the amino acid sequence of SEQ ID NO:1. In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase having at least 90% sequence identity to the amino acid sequence of SEQ ID NO:1. In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase having at least 95% sequence identity to the amino acid sequence of SEQ ID NO:1. In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase comprising the amino acid sequence of SEQ ID NO:1. In embodiments, the disclosure provides a pyrrolysyl-tRNA synthetase as set forth in SEQ ID NO:1.
[0243] The disclosure provides a mutant pyrrolysyl-tRNA synthetase, including at least 5 amino acid residue substitutions within the substrate-binding site of the mutant pyrrolysyl-tRNA synthetase. In aspects, the mutant pyrrolysyl-tRNA synthetase comprises at least 5 amino acid residue substitutions in the amino acid sequence of SEQ ID NO:2. In aspects, the substrate-binding site includes residues alanine at position 302, leucine at position 305, tyrosine at position 306, leucine at position 309, isoleucine at position 322, asparagine at position 346, cysteine at position 348, tyrosine at position 384, valine at position 401 and tryptophan at position 417 as set forth in the amino acid sequence of SEQ ID NO:2. In aspects, the at least 5 amino acid residue substitutions are a substitution for alanine at position 302, a substitution for asparagine at position 346, a substitution for cysteine at position 348, a substitution for tyrosine at position 384, and a substitution for tryptophan at position 417 as set forth in the amino acid sequence of SEQ ID NO:2. In aspects, the at least 5 amino acid residue substitutions are isoleucine for alanine at position 302, threonine for asparagine at position 346, isoleucine for cysteine at position 348, leucine for tyrosine at position 384, and lysine for tryptophan at position 417 as set forth in the amino acid sequence of SEQ ID NO:2.
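By way of non-limiting illustration only, the five substitutions recited above (A302I, N346T, C348I, Y384L, and W417K, using 1-based positions) can be applied to a parent sequence as sketched below. The sketch is an assumption-laden example and not part of the disclosed method: SEQ ID NO:2 is not reproduced here, so the function `make_placeholder_parent` builds a hypothetical stand-in sequence that merely carries the recited wild-type residues at the recited positions, and the function names are illustrative.

```python
# Substitutions recited in the text: (wild-type residue, 1-based position, replacement).
SUBSTITUTIONS = [
    ("A", 302, "I"),
    ("N", 346, "T"),
    ("C", 348, "I"),
    ("Y", 384, "L"),
    ("W", 417, "K"),
]


def apply_substitutions(sequence, substitutions):
    """Return a copy of `sequence` with each point substitution applied.

    Raises ValueError if a position is out of range or if the residue found
    at a position is not the expected wild-type residue.
    """
    residues = list(sequence)
    for wild_type, position, replacement in substitutions:
        index = position - 1  # convert 1-based position to 0-based index
        if index < 0 or index >= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


def make_placeholder_parent(length=420):
    """Build a HYPOTHETICAL parent sequence (not SEQ ID NO:2).

    Every position is glycine except the recited positions, which carry the
    recited wild-type residues so that the substitutions can be demonstrated.
    """
    residues = ["G"] * length
    for wild_type, position, _ in SUBSTITUTIONS:
        residues[position - 1] = wild_type
    return "".join(residues)


if __name__ == "__main__":
    parent = make_placeholder_parent()
    mutant = apply_substitutions(parent, SUBSTITUTIONS)
    changed = [i + 1 for i, (p, m) in enumerate(zip(parent, mutant)) if p != m]
    print(changed)
```

The wild-type check is included so that a numbering mismatch between the parent sequence and the recited positions is reported rather than silently producing a different mutant.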
[0244] In embodiments, the mutant pyrrolysyl-tRNA synthetase is encoded by the nucleic acid sequence of SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence including the sequence of SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase is encoded by a nucleic acid sequence that has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80% identity to SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 85% identity to SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 90% identity to SEQ ID NO:3. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 95% identity to SEQ ID NO:3.
[0245] In embodiments, the mutant pyrrolysyl-tRNA synthetase has the amino acid sequence of SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase includes an amino acid sequence of SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 80% identity to SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 85% identity to SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 90% identity to SEQ ID NO:4. In aspects, the mutant pyrrolysyl-tRNA synthetase has an amino acid sequence that has at least 95% identity to SEQ ID NO:4.
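For illustration only, a percent identity threshold of the kind recited above can be evaluated on two sequences that have already been aligned. The following sketch assumes equal-length aligned strings in which "-" marks a gap, and it counts identical residues over all columns that are not gaps in both sequences. That denominator is an assumption made for this example; alignment tools differ in how they report identity, and this passage does not specify which convention governs. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap ("-") are ignored. A column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the aligned pair has at least `threshold_percent` identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical ten-residue example with a single mismatch at the last column.
    reference = "ACDEFGHIKL"
    candidate = "ACDEFGHIKV"
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate, 85))
```

In practice the alignment itself would come from an established alignment program; the sketch only covers the counting step once an alignment is available.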
[0246] Vectors
[0247] The compositions (e.g., mutant pyrrolysyl-tRNA synthetase, tRNAPyl) provided herein may be delivered to cells using methods well known in the art. Thus, in an embodiment is provided a vector including a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase as described herein, including embodiments thereof. In embodiments, the vector further includes a nucleic acid sequence encoding tRNAPyl. In embodiments, the vector comprises a nucleic acid sequence encoding a mutant pyrrolysyl-tRNA synthetase as described herein. In embodiments, the vector further includes a nucleic acid sequence encoding tRNAPyl.
[0248] Methods of Forming a Biomolecule or Biomolecule Conjugate
[0249] The compositions provided herein are useful for forming a biomolecule or biomolecule conjugate. In embodiments, the method of forming a biomolecule (e.g., protein) comprises contacting a biomolecule (e.g., protein), a mutant pyrrolysyl-tRNA synthetase, a tRNAPyl, and a compound of Formula (I) (including embodiments thereof), thereby producing the biomolecule, i.e., a biomolecule comprising the unnatural amino acid of Formula (I) (including embodiments thereof). The biomolecule produced by the method will comprise the unnatural amino acid side chain of Formula (II) (including embodiments thereof). The mutant pyrrolysyl-tRNA synthetase used in the method of producing the biomolecule is any described herein or known in the art. The tRNAPyl used in the method of producing the biomolecule is any described herein. In embodiments, the reaction is performed in vitro. In embodiments, the reaction is performed in vivo. In embodiments, the reaction is performed in one or more living cells. In embodiments, the reaction is performed in one or more living bacterial cells. In embodiments, the reaction is performed in one or more living mammalian cells.
[0250] Imaging and Diagnostic Methods
[0251] In embodiments, the detectable label is a detectable label that can be used in medical imaging. In embodiments, the detectable label is a label that can be used for radiography, magnetic resonance imaging, nuclear medicine, ultrasound elastography, photoacoustic imaging, tomography, echocardiography, functional near-infrared spectroscopy, or magnetic particle imaging. In embodiments, the detectable label is a label that can be used for tomography. In embodiments, the detectable label is a label that can be used for positron emission tomography.
[0252] In embodiments, the detectable label is a radioisotope. In embodiments, the detectable label is an iodine radioisotope. In embodiments, the radioisotope is 123I, 124I, 125I, or 131I. In embodiments, the radioisotope is 123I. In embodiments, the radioisotope is 124I. In embodiments, the radioisotope is 125I. In embodiments, the radioisotope is 131I. In embodiments, the radioisotope is a positron-emitting radioisotope. In embodiments, the positron-emitting radioisotope is 11C, 13N, 15O, 18F, 64Cu, 68Ga, 78Br, 82Rb, 86Y, 89Zr, 90Y, 22Na, 26Al, 40K, 83Sr, or 124I. In embodiments, the positron-emitting radioisotope is 11C. In embodiments, the positron-emitting radioisotope is 13N. In embodiments, the positron-emitting radioisotope is 15O. In embodiments, the positron-emitting radioisotope is 18F. In embodiments, the positron-emitting radioisotope is 64Cu. In embodiments, the positron-emitting radioisotope is 68Ga. In embodiments, the positron-emitting radioisotope is 78Br. In embodiments, the positron-emitting radioisotope is 82Rb. In embodiments, the positron-emitting radioisotope is 86Y. In embodiments, the positron-emitting radioisotope is 89Zr. In embodiments, the positron-emitting radioisotope is 90Y. In embodiments, the positron-emitting radioisotope is 22Na. In embodiments, the positron-emitting radioisotope is 26Al. In embodiments, the positron-emitting radioisotope is 40K. In embodiments, the positron-emitting radioisotope is 83Sr. In embodiments, the positron-emitting radioisotope is 124I. In embodiments, the radioisotope is an alpha-emitting radioisotope. In embodiments, the alpha-emitting radioisotope is 211At, 227Th, 225Ac, 223Ra, 213Bi, or 212Bi. In embodiments, the alpha-emitting radioisotope is 211At. In embodiments, the alpha-emitting radioisotope is 227Th. In embodiments, the alpha-emitting radioisotope is 225Ac. In embodiments, the alpha-emitting radioisotope is 223Ra. In embodiments, the alpha-emitting radioisotope is 213Bi. In embodiments, the alpha-emitting radioisotope is 212Bi.
[0253] Pharmaceutical Compositions
[0254] Any of the proteins described herein may be administered to a subject in a pharmaceutical composition further comprising a pharmaceutically acceptable excipient. The compositions are suitable for formulation and administration in vitro or in vivo. Suitable carriers and excipients and their formulations are known in the art and described in, e.g., Remington: The Science and Practice of Pharmacy, 21st Ed., Lippincott Williams & Wilkins (2005).
[0255] The term “pharmaceutical composition” encompasses compositions administered to a patient for therapeutic purposes (e.g., treating a disease) and/or diagnostic purposes (e.g., medical imaging). Medical imaging includes, without limitation, radiography, magnetic resonance imaging, nuclear medicine, ultrasound elastography, photoacoustic imaging, tomography (e.g., positron emission tomography), echocardiography, functional near-infrared spectroscopy, magnetic particle imaging, and the like.
[0256] “Pharmaceutically acceptable excipient” and “pharmaceutically acceptable carrier” refer to a substance that aids the administration of an active agent to and absorption by a subject and can be included in the compositions of the disclosure without causing a significant adverse toxicological effect on the patient. Non-limiting examples of pharmaceutically acceptable excipients include water, NaCl, normal saline solutions, lactated Ringer's, normal sucrose, normal glucose, binders, fillers, disintegrants, lubricants, coatings, sweeteners, flavors, salt solutions (such as Ringer's solution), alcohols, oils, gelatins, carbohydrates such as lactose, amylose or starch, fatty acid esters, hydroxymethylcellulose, polyvinyl pyrrolidine, and colors, and the like. Such preparations can be sterilized and, if desired, mixed with auxiliary agents such as lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, and/or aromatic substances and the like that do not deleteriously react with the compounds of the disclosure. One of skill in the art will recognize that other pharmaceutical excipients are useful. Pharmaceutically acceptable excipients can be used in pharmaceutical compositions for therapeutic purposes (e.g., treating a disease) and/or diagnostic purposes (e.g., imaging, such as positron emission tomography).
[0257] Solutions of the pharmaceutical compositions can be prepared in water suitably mixed with a lipid or surfactant, such as hydroxypropylcellulose. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations can contain a preservative to prevent the growth of microorganisms. Solutions can be administered, e.g., parenterally, such as subcutaneously or intravenously (e.g., infusion or bolus).
[0258] Pharmaceutical compositions can be delivered via intranasal or inhalable solutions. The intranasal composition can be a spray, aerosol, or inhalant. The inhalable composition can be a spray, aerosol, or inhalant. Nasal solutions can be aqueous solutions designed to be administered to the nasal passages in drops or sprays. Nasal solutions can be prepared so that they are similar in many respects to nasal secretions. Thus, the aqueous nasal solutions usually are isotonic and slightly buffered to maintain a pH of 5.5 to 6.5. In addition, antimicrobial preservatives, similar to those used in ophthalmic preparations and appropriate drug stabilizers, if required, may be included in the formulation. Various commercial nasal preparations are known in the art.
[0259] Oral formulations can include excipients such as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders. In aspects, oral pharmaceutical compositions will comprise an inert diluent or edible carrier, or they may be enclosed in hard or soft shell gelatin capsules, or they may be compressed into tablets, or they may be incorporated directly with the food. For oral therapeutic administration, the active compounds may be incorporated with excipients and used in the form of ingestible tablets, buccal tablets, troches, capsules, elixirs, suspensions, syrups, wafers, and the like. The percentage of the compositions and preparations may, of course, be varied and may be between about 1% and about 75% of the weight of the unit. The amount of nucleic acids in such compositions is such that a suitable
dosage can be obtained.
[0260] For parenteral administration in an aqueous solution, for example, the solution should be suitably buffered and the liquid diluent first rendered isotonic with sufficient saline or glucose. Aqueous solutions, in particular, sterile aqueous media, are especially suitable for intravenous, intramuscular, subcutaneous and intraperitoneal administration. For example, one dosage could be dissolved in 1 ml of isotonic NaCl solution and either added to 1000 ml of hypodermoclysis fluid or injected at the proposed site of infusion.
[0261] Sterile injectable solutions can be prepared by incorporating the recombinant proteins in the required amount in the appropriate solvent followed by filtered sterilization. Generally, dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium. Vacuum-drying and freeze-drying techniques, which yield a powder of the active ingredient plus any additional desired ingredients, can be used to prepare sterile powders for reconstitution of sterile injectable solutions. The preparation of more, or highly, concentrated solutions for direct injection is also contemplated. Dimethyl sulfoxide can be used as solvent for rapid penetration, delivering high concentrations of the active agents to a small area.
[0262] For vaccination or immunization purposes the proteins described herein (e.g., proteins of Formula (II) including embodiments thereof) may be formulated and introduced as a vaccine through oral, intradermal, intramuscular, intraperitoneal, intravenous, subcutaneous, intranasal, and via scarification (scratching through the top layers of skin, e.g., using a bifurcated needle) or any other standard route of immunization. Vaccine formulations suitable for oral administration may be in the form of capsules, cachets, pills, tablets, lozenges (using a flavored basis, usually sucrose and acacia or tragacanth), powders, granules, or as a solution or a suspension in an aqueous or non-aqueous liquid, or as an oil-in-water or water-in-oil liquid emulsion, or as an elixir or syrup, or as pastilles (using an inert base, such as gelatin and glycerin, or sucrose and acacia), each containing a predetermined amount of a subject composition thereof as an active ingredient or any other oral composition as listed above. Alternatively, the vaccines may be administered parenterally as injections (intravenous, intramuscular or subcutaneous). The amount of recombinant proteins used in a vaccine can depend upon a variety of factors including the route of administration, species, and use of booster administration. However, a person of ordinary skill in the art would immediately recognize appropriate and/or equivalent doses looking at dosages of approved whooping cough vaccines for guidance.
[0263] The term “adjuvant” refers to a compound that when administered in conjunction with
the recombinant proteins provided herein including embodiments thereof augments the immune response to the antigen, but when administered alone does not generate an immune response to the antigen. As described above the recombinant proteins provided herein including embodiments thereof may be used as an adjuvant. Therefore, the term “adjuvant” refers to a compound that when administered in conjunction with a vaccine augments the immune response to the antigen, but when administered alone does not generate an immune response to the antigen. Adjuvants can augment an immune response by several mechanisms including lymphocyte recruitment, stimulation of B and/or T cells, and stimulation of macrophages. The adjuvant increases the titer of induced antibodies and/or the binding affinity of induced antibodies relative to the situation if the immunogen were used alone. A variety of adjuvants can be used in combination with the recombinant proteins provided herein to elicit an immune response. Adjuvants augment the intrinsic response to an immunogen without causing conformational changes in the immunogen that affect the qualitative form of the response. Exemplary adjuvants include aluminum hydroxide and aluminum phosphate, 3 De-O-acylated monophosphoryl lipid A (MPL™) (see GB 2220211 (RIBI ImmunoChem Research Inc., Hamilton, Montana, now part of Corixa)). Stimulon™ QS-21 is a triterpene glycoside or saponin isolated from the bark of the Quillaja Saponaria Molina tree found in South America (see Kensil et al., in Vaccine Design: The Subunit and Adjuvant Approach (eds. Powell & Newman, Plenum Press, NY, 1995); US Patent No. 5,057,540), (Aquila BioPharmaceuticals, Framingham, MA). Other adjuvants are oil in water emulsions (such as squalene or peanut oil), optionally in combination with immune stimulants, such as monophosphoryl lipid A (see Stoute et al., N. Engl. J. Med. 336, 86-91 (1997)), pluronic polymers, and killed mycobacteria. Another adjuvant is CpG (WO 98/40100).
Adjuvants can be administered as a component of a therapeutic composition with an active agent or can be administered separately, before, concurrently with, or after administration of the therapeutic agent.
[0264] Other examples of adjuvants are aluminum salts (alum), such as alum hydroxide, alum phosphate, alum sulfate. Such adjuvants can be used with or without other specific immunostimulating agents such as MPL or 3-DMP, QS-21, polymeric or monomeric amino acids such as polyglutamic acid or polylysine. Another class of adjuvants is oil-in-water emulsion formulations. Such adjuvants can be used with or without other specific immunostimulating agents such as muramyl peptides (e.g., N-acetylmuramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE), N-acetylglucsaminyl-N-acetylmuramyl-L-Al-D-isoglu-L-Ala-dipalmitoxy propylamide (DTP-DPP) theramide™), or other bacterial cell wall components. Oil-in-water emulsions include (a) MF59 (WO 90/14837), containing 5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of MTP-PE) formulated into submicron particles using a microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton MA), (b) SAF, containing 10% Squalene, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP, either microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi ImmunoChem, Hamilton, MT) containing 2% squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL + CWS (Detox™).
[0265] Other adjuvants are saponin adjuvants, such as Stimulon™ (QS-21, Aquila, Framingham, MA) or particles generated therefrom such as ISCOMs (immunostimulating complexes) and ISCOMATRIX. Other adjuvants include RC-529, GM-CSF and Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA). Other adjuvants include cytokines, such as interleukins (e.g., IL-1 α and β peptides, IL-2, IL-4, IL-6, IL-12, IL-13, and IL-15), macrophage colony stimulating factor (M-CSF), granulocyte-macrophage colony stimulating factor (GM-CSF), tumor necrosis factor (TNF), chemokines, such as MIP1α and β and RANTES. Another class of adjuvants is glycolipid analogues including N-glycosylamides, N-glycosylureas and N-glycosylcarbamates, each of which is substituted in the sugar residue by an amino acid, as immuno-modulators or adjuvants (see US Pat. No. 4,855,283). Heat shock proteins, e.g., HSP70 and HSP90, may also be used as adjuvants.
[0266] An adjuvant can be administered with an immunogen as a single composition, or can be administered before, concurrent with or after administration of the immunogen. Immunogen and adjuvant can be packaged and supplied in the same vial or can be packaged in separate vials and mixed before use. Immunogen and adjuvant are typically packaged with a label indicating the intended therapeutic application. If immunogen and adjuvant are packaged separately, the packaging typically includes instructions for mixing before use. The choice of an adjuvant and/or carrier depends on the stability of the immunogenic formulation containing the adjuvant, the route of administration, the dosing schedule, the efficacy of the adjuvant for the species being vaccinated, and, in humans, a pharmaceutically acceptable adjuvant is one that has been approved or is approvable for human administration by pertinent regulatory bodies. For example, Complete Freund's adjuvant is not suitable for human administration. Alum, MPL and
QS-21 are preferred. Optionally, two or more different adjuvants can be used simultaneously. Preferred combinations include alum with MPL, alum with QS-21, MPL with QS-21, MPL or RC-529 with GM-CSF, and alum, QS-21 and MPL together. Also, Incomplete Freund's adjuvant can be used (Chang et al., Advanced Drug Delivery Reviews 32, 173-186 (1998)), optionally in combination with any of alum, QS-21, and MPL and all combinations thereof.
[0267] Dose and Dosing Regimens
[0268] The dosage and frequency (single or multiple doses) of the proteins described herein (e.g., proteins of Formula (II) including embodiments thereof) administered to a subject can vary depending upon a variety of factors, for example, whether the mammal suffers from another disease, and its route of administration; size, age, sex, health, body weight, body mass index, and diet of the recipient; nature and extent of symptoms of the disease being treated, kind of concurrent treatment, complications from the disease being treated or other health-related problems. Other therapeutic regimens or agents can be used in conjunction with the methods and proteins described herein (e.g., proteins of Formula (II) including embodiments thereof). Adjustment and manipulation of established dosages (e.g., frequency and duration) are within the ability of the skilled artisan.
[0269] For any composition, the effective amount of a protein described herein (e.g., proteins of Formula (II) including embodiments thereof) can be initially determined from cell culture assays. Target concentrations will be those concentrations of protein that are capable of achieving the methods described herein, as measured using the methods described herein or known in the art. As is known in the art, effective amounts of proteins for use in humans can also be determined from animal models. For example, a dose for humans can be formulated to achieve a concentration that has been found to be effective in animals. The dosage in humans can be adjusted by monitoring effectiveness and adjusting the dosage upwards or downwards, as described above. Adjusting the dose to achieve maximal efficacy in humans based on the methods described above and other methods is well within the capabilities of the ordinarily skilled artisan.
[0270] Dosages of the proteins described herein (e.g., proteins of Formula (II) including embodiments thereof) may be varied depending upon the requirements of the patient, and whether the purpose is therapeutic or medical imaging. The dose administered to a patient should be sufficient to effect a beneficial therapeutic response in the patient over time. The size of the dose also will be determined by the existence, nature, and extent of any adverse side effects. Determination of the proper dosage for a particular situation is within the skill of the art.
Dosage amounts and intervals can be adjusted individually to provide levels of the protein effective for the particular clinical indication being treated. This will provide a therapeutic regimen that is commensurate with the severity of the individual's disease state.
[0271] Utilizing the teachings provided herein, an effective prophylactic, diagnostic, or therapeutic treatment regimen can be planned that does not cause substantial toxicity and yet is effective to treat the clinical disease or symptoms demonstrated by the particular patient. This planning should involve the careful choice of proteins by considering factors such as compound potency, relative bioavailability, patient body weight, and presence and severity of adverse side effects.
[0272] In embodiments, the proteins are administered to a patient at an amount of about 0.001 mg/kg to about 500 mg/kg. In aspects, the proteins (e.g., recombinant proteins, antibodies, antibody variants, single-domain antibodies) are administered to a patient in an amount of about 0.01 mg/kg, 0.1 mg/kg, 0.5 mg/kg, 1 mg/kg, 2 mg/kg, 3 mg/kg, 4 mg/kg, 5 mg/kg, 10 mg/kg, 20 mg/kg, 30 mg/kg, 40 mg/kg, 50 mg/kg, 60 mg/kg, 70 mg/kg, 80 mg/kg, 90 mg/kg, 100 mg/kg, 200 mg/kg, or 300 mg/kg. It is understood that where the amount is referred to as “mg/kg,” the amount is milligram per kilogram body weight of the subject being administered with the proteins. In aspects, the proteins are administered to a patient in an amount from about 0.01 mg to about 500 mg per day.
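The “mg/kg” convention above amounts to multiplying the per-kilogram amount by the subject's body weight. The following is a minimal arithmetic sketch, assuming the 0.001 mg/kg to 500 mg/kg range is treated as hard bounds (the text says “about”); `total_dose_mg` and the example values are hypothetical and serve only to illustrate the unit conversion.

```python
# Bounds taken from the range recited in the text; treating them as hard
# limits is an assumption made only for this illustration.
MIN_MG_PER_KG = 0.001
MAX_MG_PER_KG = 500.0


def total_dose_mg(dose_mg_per_kg, body_weight_kg):
    """Convert a weight-based amount (mg/kg) to a total amount (mg)."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not MIN_MG_PER_KG <= dose_mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("amount is outside the 0.001 to 500 mg/kg range")
    return dose_mg_per_kg * body_weight_kg


# Hypothetical example: 2 mg/kg for a 70 kg subject.
example_total = total_dose_mg(2.0, 70.0)
```

For the hypothetical values shown, 2 mg/kg at 70 kg corresponds to a total of 140 mg.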
[0273] Embodiments 1-180
[0274] Embodiment 1. A compound of Formula (I) or a stereoisomer thereof:
wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L4 is a bond or -O-; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X1 is independently -F, -Cl, -Br, or -I; R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2.
[0275] Embodiment 2. The compound of Embodiment 1, wherein L4 is a bond.
[0276] Embodiment 3. The compound of Embodiment 1, wherein L4 is -O-.
[0277] Embodiment 4. The compound of Embodiment 1, wherein the compound of Formula
[0278] Embodiment 5. The compound of any one of Embodiments 1 to 4, wherein R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
[0279] Embodiment 6. The compound of any one of Embodiments 1 to 5, wherein R1 is ortho to -S(O2)F.
[0280] Embodiment 7. The compound of any one of Embodiments 1 to 5, wherein R1 is meta to -S(O2)F.
[0281] Embodiment 8. The compound of Embodiment 1, wherein the compound of Formula (I) has the formula:
[0282] Embodiment 9. The compound of any one of Embodiments 1 to 8, wherein ring A is a 5-membered cycloalkyl having one or two double bonds or a 5-membered heterocycloalkyl having one double bond.
[0283] Embodiment 10. The compound of any one of Embodiments 1 to 9, wherein ring A is a 5-membered cycloalkyl.
[0284] Embodiment 11. The compound of any one of Embodiments 1 to 9, wherein ring A is a 5-membered heterocycloalkylene.
[0285] Embodiment 12. The compound of any one of Embodiments 1 to 8, wherein ring A is a
5-membered heteroaryl.
[0286] Embodiment 13. The compound of Embodiment 12, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
[0287] Embodiment 14. The compound of Embodiment 13, wherein ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
[0288] Embodiment 15. The compound of Embodiment 14, wherein ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
[0289] Embodiment 16. The compound of any one of Embodiments 1 to 8, wherein ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole.
[0290] Embodiment 17. The compound of any one of Embodiments 1 to 5, wherein the compound of Formula (I) is a compound of formula:
[0291] Embodiment 18. The compound of any one of Embodiments 1 to 5, wherein the compound of Formula (I) is a compound of formula:
[0292] Embodiment 19. The compound of any one of Embodiments 1 to 5, wherein the compound of Formula (I) is a compound of formula:
[0293] Embodiment 20. The compound of any one of Embodiments 1 to 5, wherein the compound of Formula (I) is a compound of formula:
[0294] Embodiment 21. The compound of Embodiment 8, wherein the compound of Formula (I) is a compound of formula:
[0295] Embodiment 22. The compound of Embodiment 8, wherein the compound of Formula
[0296] Embodiment 23. The compound of Embodiment 8, wherein the compound of Formula
[0297] Embodiment 24. The compound of Embodiment 8, wherein the compound of Formula
(I) is a compound of formula:
[0298] Embodiment 25. The compound of any one of Embodiments 1 to 24, wherein L1 is a bond.
[0299] Embodiment 26. The compound of any one of Embodiments 1 to 24, wherein L1 is substituted or unsubstituted alkylene.
[0300] Embodiment 27. The compound of Embodiment 26, wherein L1 is substituted or unsubstituted C1-4 alkylene.
[0301] Embodiment 28. The compound of any one of Embodiments 1 to 24, wherein L1 is substituted or unsubstituted heteroalkylene.
[0302] Embodiment 29. The compound of Embodiment 28, wherein L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
[0303] Embodiment 30. The compound of Embodiment 29, wherein L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2.
[0304] Embodiment 31. The compound of Embodiment 29, wherein L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2.
[0305] Embodiment 32. The compound of Embodiment 29, wherein L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2.
[0306] Embodiment 33. The compound of Embodiment 29, wherein L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2.
[0307] Embodiment 34. The compound of any one of Embodiments 30 to 33, wherein y is 0.
[0308] Embodiment 35. The compound of any one of Embodiments 1 to 34, wherein x is an integer from 0 to 6.
[0309] Embodiment 36. The compound of Embodiment 35, wherein x is an integer from 2 to 6.
[0310] Embodiment 37. The compound of Embodiment 36, wherein x is 4.
[0311] Embodiment 38. The compound of any one of Embodiments 1 to 24, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-.
[0312] Embodiment 39. The compound of any one of Embodiments 1 to 24, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-O-.
[0313] Embodiment 40. The compound of any one of Embodiments 1 to 24, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-NH-.
[0314] Embodiment 41. The compound of any one of Embodiments 1 to 24, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-S-.
[0315] Embodiment 42. The compound of Embodiment 1, wherein the compound of Formula
[0316] Embodiment 43. The compound of Embodiment 1, wherein the compound of Formula
[0317] Embodiment 44. The compound of Embodiment 1, wherein the compound of Formula
[0318] Embodiment 45. The compound of Embodiment 1, wherein the compound of Formula
[0319] Embodiment 46. A protein comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II):
wherein ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered
heteroaryl; L4 is a bond or -O-; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X1 is independently -F, -Cl, -Br, or -I; R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2.
[0320] Embodiment 47. The protein of Embodiment 46, wherein L4 is a bond.
[0321] Embodiment 48. The protein of Embodiment 46, wherein L4 is -O-.
[0322] Embodiment 49. The protein of Embodiment 46, wherein the compound of Formula
[0323] Embodiment 50. The protein of any one of Embodiments 46 to 49, wherein R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
[0324] Embodiment 51. The protein of any one of Embodiments 46 to 50, wherein R1 is ortho to -S(O2)F.
[0325] Embodiment 52. The protein of any one of Embodiments 46 to 50, wherein R1 is meta to -S(O2)F.
[0326] Embodiment 53. The protein of Embodiment 46, wherein the compound of Formula (II) has the formula:
[0327] Embodiment 54. The protein of any one of Embodiments 46 to 53, wherein ring A is a 5-membered cycloalkyl having one or two double bonds or a 5-membered heterocycloalkyl having one double bond.
[0328] Embodiment 55. The protein of any one of Embodiments 46 to 54, wherein ring A is a 5-membered cycloalkyl.
[0329] Embodiment 56. The protein of any one of Embodiments 46 to 54, wherein ring A is a 5-membered heterocycloalkylene.
[0330] Embodiment 57. The protein of any one of Embodiments 46 to 53, wherein ring A is a 5-membered heteroaryl.
[0331] Embodiment 58. The protein of Embodiment 57, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
[0332] Embodiment 59. The protein of Embodiment 58, wherein ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
[0333] Embodiment 60. The protein of Embodiment 59, wherein ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
[0334] Embodiment 61. The protein of any one of Embodiments 46 to 53, wherein ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole.
[0335] Embodiment 62. The protein of any one of Embodiments 46 to 50, wherein the protein of Formula (II) is a protein of formula:
[0336] Embodiment 63. The protein of any one of Embodiments 46 to 50, wherein the protein of Formula (II) is a protein of formula:
[0337] Embodiment 64. The protein of any one of Embodiments 46 to 50, wherein the protein of Formula (II) is a protein of formula:
[0338] Embodiment 65. The protein of any one of Embodiments 46 to 50, wherein the protein of Formula (II) is a protein of formula:
[0339] Embodiment 66. The protein of Embodiment 53, wherein the protein of Formula (II) is a protein of formula:
[0340] Embodiment 67. The protein of Embodiment 53, wherein the protein of Formula (II) is a protein of formula:
[0341] Embodiment 68. The protein of Embodiment 53, wherein the protein of Formula (II) is a protein of formula:
[0342] Embodiment 69. The protein of Embodiment 53, wherein the protein of Formula (II) is a protein of formula:
[0343] Embodiment 70. The protein of any one of Embodiments 46 to 69, wherein L1 is a bond.
[0344] Embodiment 71. The protein of any one of Embodiments 46 to 69, wherein L1 is substituted or unsubstituted alkylene.
[0345] Embodiment 72. The protein of Embodiment 71, wherein L1 is substituted or unsubstituted C1-4 alkylene.
[0346] Embodiment 73. The protein of any one of Embodiments 46 to 69, wherein L1 is substituted or unsubstituted heteroalkylene.
[0347] Embodiment 74. The protein of Embodiment 73, wherein L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
[0348] Embodiment 75. The protein of Embodiment 74, wherein L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2.
[0349] Embodiment 76. The protein of Embodiment 74, wherein L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2.
[0350] Embodiment 77. The protein of Embodiment 74, wherein L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2.
[0351] Embodiment 78. The protein of Embodiment 74, wherein L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2.
[0352] Embodiment 79. The protein of any one of Embodiments 75 to 78, wherein y is 0.
[0353] Embodiment 80. The protein of any one of Embodiments 46 to 79, wherein x is an integer from 0 to 6.
[0354] Embodiment 81. The protein of Embodiment 80, wherein x is an integer from 2 to 6.
[0355] Embodiment 82. The protein of Embodiment 81, wherein x is 4.
[0356] Embodiment 83. The protein of any one of Embodiments 46 to 69, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-.
[0357] Embodiment 84. The protein of any one of Embodiments 46 to 69, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-O-.
[0358] Embodiment 85. The protein of any one of Embodiments 46 to 69, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-NH-.
[0359] Embodiment 86. The protein of any one of Embodiments 46 to 69, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-S-.
[0360] Embodiment 87. The protein of Embodiment 46, wherein the protein of Formula (II) is a protein of the formula:
[0361] Embodiment 88. The protein of Embodiment 46, wherein the protein of Formula (II) is a protein of the formula:
[0362] Embodiment 89. The protein of Embodiment 46, wherein the protein of Formula (II) is a protein of the formula:
[0363] Embodiment 90. The protein of Embodiment 46, wherein the protein of Formula (II) is a protein of the formula:
[0364] Embodiment 91. The protein of any one of Embodiments 46 to 90, wherein the protein is an antibody.
[0365] Embodiment 92. The protein of any one of Embodiments 46 to 90, wherein the protein is an antibody variant.
[0366] Embodiment 93. The protein of Embodiment 92, wherein the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
[0367] Embodiment 94. The protein of Embodiment 93, wherein the antibody variant is a single-chain variable fragment.
[0368] Embodiment 95. The protein of Embodiment 93, wherein the antibody variant is a single-domain antibody.
[0369] Embodiment 96. The protein of Embodiment 93, wherein the antibody variant is an affibody.
[0370] Embodiment 97. The protein of Embodiment 93, wherein the antibody variant is an antigen-binding fragment.
[0371] Embodiment 98. The protein of any one of Embodiments 91 to 97, wherein the unnatural amino acid is within a CDR region or a framework region of the antibody or antibody variant.
[0372] Embodiment 99. The protein of any one of Embodiments 46 to 98, wherein the protein is a receptor.
[0373] Embodiment 100. The protein of any one of Embodiments 46 to 98, wherein the protein is a cell surface receptor.
[0374] Embodiment 101. The protein of Embodiment 100, wherein the cell surface receptor is in the extracellular domain, the transmembrane domain, or the intracellular domain.
[0375] Embodiment 102. The protein of any one of Embodiments 46 to 98, wherein the protein is a cytosolic protein.
[0376] Embodiment 103. The protein of any one of Embodiments 46 to 98, wherein the protein is a transcriptional factor or an enzyme.
[0377] Embodiment 104. The protein of any one of Embodiments 46 to 103, further comprising a detectable agent.
[0378] Embodiment 105. The protein of Embodiment 104, wherein the detectable agent is a radioisotope.
[0379] Embodiment 106. The protein of any one of Embodiments 46 to 105, further comprising a therapeutic agent.
[0380] Embodiment 107. A nucleic acid encoding the protein of any one of Embodiments 46 to 106.
[0381] Embodiment 108. A vector comprising a nucleic acid of Embodiment 107.
[0382] Embodiment 109. A biomolecule conjugate of Formula (III):
wherein: R4 and R5 are each independently a peptidyl moiety, a carbohydrate moiety, a lipid moiety, or a nucleic acid moiety; ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl; L4 is a bond or -O-; x is an integer from 0 to 8; L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene; L2 is a bond, -NR2A-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R2A)C(O)-, -C(O)N(R2A)-, -NR2AC(O)NR2B-, -NR2AC(NH)NR2B-, -SO2N(R2A)-, -N(R2A)SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene; L3 is a bond, -N(R3A)-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R3A)C(O)-, -C(O)N(R3A)-, -NR3AC(O)NR3B-, -NR3AC(NH)NR3B-, -SO2N(R3A)-, -N(R3A)SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene; and R2A, R2B, R3A, and R3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; X1 is independently -F, -Cl, -Br, or -I; R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2.
[0383] Embodiment 110. The biomolecule conjugate of Embodiment 109, wherein R1 is meta to the carbon atom linked to -L4S(O2)L3R5.
[0384] Embodiment 111. The biomolecule conjugate of Embodiment 109, wherein R1 is ortho to the carbon atom linked to -L4S(O2)L3R5.
[0385] Embodiment 112. The biomolecule conjugate of any one of Embodiments 109 to 111, wherein L4 is a bond.
[0386] Embodiment 113. The biomolecule conjugate of any one of Embodiments 109 to 111, wherein L4 is -O-.
[0387] Embodiment 114. The biomolecule conjugate of any one of Embodiments 109 to 111, wherein the compound of Formula (III) has the formula:
[0388] Embodiment 115. The biomolecule conjugate of any one of Embodiments 109 to 114, wherein R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
[0389] Embodiment 116. The biomolecule conjugate of Embodiment 109, wherein the compound of Formula (III) has the formula:
[0390] Embodiment 117. The biomolecule conjugate of any one of Embodiments 109 to 116, wherein ring A is a 5-membered cycloalkyl having one or two double bonds or a 5-membered heterocycloalkyl having one double bond.
[0391] Embodiment 118. The biomolecule conjugate of any one of Embodiments 109 to 117, wherein ring A is a 5-membered cycloalkyl.
[0392] Embodiment 119. The biomolecule conjugate of any one of Embodiments 109 to 117, wherein ring A is a 5-membered heterocycloalkylene.
[0393] Embodiment 120. The biomolecule conjugate of any one of Embodiments 109 to 116, wherein ring A is a 5-membered heteroaryl.
[0394] Embodiment 121. The biomolecule conjugate of Embodiment 120, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
[0395] Embodiment 122. The biomolecule conjugate of Embodiment 121, wherein ring A is a 5-membered heteroaryl containing 1 or 2 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
[0396] Embodiment 123. The biomolecule conjugate of Embodiment 122, wherein ring A is a 5-membered heteroaryl containing 1 heteroatom selected from the group consisting of oxygen, nitrogen, and sulfur.
[0397] Embodiment 124. The biomolecule conjugate of any one of Embodiments 109 to 116, wherein ring A is pyrrole, pyrazole, imidazole, triazole, furan, thiophene, phosphole, oxazole, isoxazole, thiazole, or isothiazole.
[0398] Embodiment 125. The biomolecule conjugate of any one of Embodiments 109 to 124, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
[0399] Embodiment 126. The biomolecule conjugate of any one of Embodiments 109 to 124, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
[0400] Embodiment 127. The biomolecule conjugate of any one of Embodiments 109 to 124, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
[0401] Embodiment 128. The biomolecule conjugate of any one of Embodiments 109 to 124, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
[0402] Embodiment 129. The biomolecule conjugate of Embodiment 116, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
[0403] Embodiment 130. The biomolecule conjugate of Embodiment 116, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
[0404] Embodiment 131. The biomolecule conjugate of Embodiment 116, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
[0405] Embodiment 132. The biomolecule conjugate of Embodiment 116, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of formula:
[0406] Embodiment 133. The biomolecule conjugate of any one of Embodiments 109 to 132, wherein L1 is a bond.
[0407] Embodiment 134. The biomolecule conjugate of any one of Embodiments 109 to 132, wherein L1 is substituted or unsubstituted alkylene.
[0408] Embodiment 135. The biomolecule conjugate of Embodiment 134, wherein L1 is substituted or unsubstituted C1-4 alkylene.
[0409] Embodiment 136. The biomolecule conjugate of any one of Embodiments 109 to 132, wherein L1 is substituted or unsubstituted heteroalkylene.
[0410] Embodiment 137. The biomolecule conjugate of Embodiment 136, wherein L1 is substituted or unsubstituted 2 to 6 membered heteroalkylene.
[0411] Embodiment 138. The biomolecule conjugate of Embodiment 137, wherein L1 is -NH-C(O)-(CH2)y-, and y is an integer from 0 to 2.
[0412] Embodiment 139. The biomolecule conjugate of Embodiment 137, wherein L1 is -NH-C(O)-O-(CH2)y-, and y is an integer from 0 to 2.
[0413] Embodiment 140. The biomolecule conjugate of Embodiment 137, wherein L1 is -NH-C(O)-NH-(CH2)y-, and y is an integer from 0 to 2.
[0414] Embodiment 141. The biomolecule conjugate of Embodiment 137, wherein L1 is -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2.
[0415] Embodiment 142. The biomolecule conjugate of any one of Embodiments 138 to 141, wherein y is 0.
[0416] Embodiment 143. The biomolecule conjugate of any one of Embodiments 109 to 142, wherein x is an integer from 0 to 6.
[0417] Embodiment 144. The biomolecule conjugate of Embodiment 143, wherein x is an integer from 2 to 6.
[0418] Embodiment 145. The biomolecule conjugate of Embodiment 144, wherein x is 4.
[0419] Embodiment 146. The biomolecule conjugate of any one of Embodiments 109 to 132, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-.
[0420] Embodiment 147. The biomolecule conjugate of any one of Embodiments 109 to 132, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-O-.
[0421] Embodiment 148. The biomolecule conjugate of any one of Embodiments 109 to 132, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-NH-.
[0422] Embodiment 149. The biomolecule conjugate of any one of Embodiments 109 to 132, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-S-.
[0423] Embodiment 150. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0424] Embodiment 151. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0425] Embodiment 152. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0426] Embodiment 153. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0427] Embodiment 154. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0428] Embodiment 155. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0429] Embodiment 156. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0430] Embodiment 157. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0431] Embodiment 158. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0432] Embodiment 159. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0433] Embodiment 160. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0434] Embodiment 161. The biomolecule conjugate of Embodiment 109, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate of the formula:
[0435] Embodiment 162. The biomolecule conjugate of any one of Embodiments 109 to 161, wherein R4 and R5 are each independently a peptidyl moiety.
[0436] Embodiment 163. The biomolecule conjugate of Embodiment 162, wherein the peptidyl moiety of R4 comprises an antibody; and the peptidyl moiety of R5 comprises a protein.
[0437] Embodiment 164. The biomolecule conjugate of Embodiment 162, wherein the peptidyl moiety of R4 comprises an antibody variant; and the peptidyl moiety of R5 comprises a protein.
[0438] Embodiment 165. The biomolecule conjugate of Embodiment 162, wherein the peptidyl moiety of R4 comprises a protein; and the peptidyl moiety of R5 comprises an antibody or an antibody variant.
[0439] Embodiment 166. The biomolecule conjugate of Embodiment 164 or 165, wherein the antibody variant is an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody.
[0440] Embodiment 167. The biomolecule conjugate of any one of Embodiments 163 to 166, wherein the protein is the target protein of the antibody or antibody variant.
[0441] Embodiment 168. The biomolecule conjugate of any one of Embodiments 163 to 167, wherein the protein is a cytosolic protein.
[0442] Embodiment 169. The biomolecule conjugate of any one of Embodiments 163 to 167, wherein the protein is an enzyme.
[0443] Embodiment 170. The biomolecule conjugate of any one of Embodiments 163 to 167, wherein the protein is a transcriptional factor.
[0444] Embodiment 171. The biomolecule conjugate of any one of Embodiments 163 to 167, wherein the protein is a receptor protein.
[0445] Embodiment 172. The biomolecule conjugate of Embodiment 171, wherein the receptor protein is a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G protein-coupled receptor, a G protein-coupled estrogen receptor, a histamine receptor, a hydroxycarboxylic acid receptor, a kisspeptin receptor, a leukotriene receptor, a lysophospholipid receptor, a lysophospholipid S1P receptor, a melanin-concentrating hormone receptor, a melanocortin receptor, a melatonin receptor, a motilin receptor, a neuromedin U receptor, a neuropeptide FF/neuropeptide AF receptor, a neuropeptide S receptor, a neuropeptide W/neuropeptide B receptor, a neuropeptide Y receptor, a neurotensin receptor, an opioid receptor, an opsin receptor, an orexin receptor, an oxoglutarate receptor, a P2Y receptor, a platelet-activating factor receptor, a prokineticin receptor, a prolactin-releasing peptide receptor, a prostanoid receptor, a proteinase-activated receptor, a QRFP receptor, a relaxin family peptide receptor, a somatostatin receptor, a succinate receptor, a tachykinin receptor, a thyrotropin-releasing hormone receptor, a trace amine receptor, a urotensin receptor, or a vasopressin receptor.
[0446] Embodiment 173. The biomolecule conjugate of Embodiment 172, wherein the protein is a G protein-coupled receptor.
[0447] Embodiment 174. A complex comprising a pyrrolysyl-tRNA synthetase and the compound of any one of Embodiments 1 to 106.
[0448] Embodiment 175. The complex of Embodiment 174, wherein the pyrrolysyl-tRNA synthetase has an amino acid sequence with at least 90% sequence identity to SEQ ID NO: 1, 2, 3, or 4.
[0449] Embodiment 176. The complex of Embodiment 175, wherein the pyrrolysyl-tRNA synthetase has an amino acid sequence as set forth in SEQ ID NO: 1, 2, 3, or 4.
[0450] Embodiment 177. The complex of any one of Embodiments 174 to 176, further comprising a tRNAPyl.
[0451] Embodiment 178. A cell comprising: (i) the compound of any one of Embodiments 1 to 45; (ii) the protein of any one of Embodiments 46 to 106; (iii) the nucleic acid of Embodiment 107; (iv) the vector of Embodiment 108; (v) the biomolecule conjugate of any one of Embodiments 109 to 173; or (vi) the complex of any one of Embodiments 174 to 177.
[0452] Embodiment 179. The cell of Embodiment 178, wherein the cell is a bacterial cell or a mammalian cell.
[0453] Embodiment 180. A pharmaceutical composition comprising: (i) a pharmaceutically acceptable excipient, and (ii) the compound of any one of Embodiments 1 to 45, the protein of any one of Embodiments 46 to 106, the nucleic acid of Embodiment 107, or the vector of Embodiment 108.
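Embodiment 175 recites a synthetase having at least 90% sequence identity to a reference sequence. The following is a minimal sketch of one way to compute percent identity over a pre-aligned pair of sequences. The function name, the toy sequences, and the choice of denominator (all aligned columns that are not gap-against-gap) are illustrative assumptions; alignment tools differ in how they define the denominator, and the disclosure does not specify one.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of aligned columns, excluding
    columns where both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy, hypothetical 10-residue sequences differing at one position.
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"
identity = percent_identity(reference, variant)
meets_threshold = identity >= 90.0
print(f"{identity:.1f}% identity; meets 90% threshold: {meets_threshold}")
```

In practice the two sequences would first be aligned with a dedicated alignment program; this sketch only covers the counting step.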
EXAMPLES
[0454] The following examples are intended to further illustrate certain embodiments of the disclosure. The examples are put forth to guide one of ordinary skill in the art and are not intended to limit the scope of the disclosure.
[0455] Example 1
[0456] SFK was synthesized following the procedure described in FIG. 1A. The relatively electron-rich pyrrole ring was used to stabilize the sulfonyl fluoride functional group. SFK was tested to determine whether the pyrrolysyl-tRNA synthetase (PylRS) described by Liu et al, J. Am. Chem. Soc., 143(27):10341-10351 (2021) could incorporate it into proteins. The enhanced green fluorescent protein (EGFP) gene containing a TAG codon at position 182 (EGFP-182TAG) was co-expressed with FSKRS in E. coli. In the absence of SFK, no obvious fluorescence was detected; in the presence of 2 mM SFK, a concentration-dependent fluorescence signal was observed, and the fluorescence intensity was 26-fold higher than the background (FIG. 1B). 1 mM FSK was used as a positive control (FIG. 1B). SFK was also introduced into the maltose binding protein (MBP)-fused Z protein at position 24 (MBP-Z(24SFK)), and this protein was purified via the His×6 tag appended at the C terminus. The reactivity of SFK towards various amino acid residues was tested using the binding complex of MBP-Z and the Zspa affibody (Afb). Upon Afb-Z binding, SFK would be placed in close proximity to the test residue at position 7 of Afb (Afb7X); reaction between the two would lead to protein-protein cross-linking. As shown in FIG. 1C, SFK was able to crosslink with Lys, His, and Tyr.
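To make the "182TAG" notation concrete, the following minimal sketch replaces the codon at a chosen amino-acid position of an in-frame coding sequence with the amber codon TAG. The function name and the short toy sequence are hypothetical, and the sketch assumes 1-based numbering that counts the first codon of the supplied sequence as position 1; the numbering convention actually used for EGFP-182TAG is not stated here.

```python
AMBER_CODON = "TAG"


def introduce_amber(cds: str, position: int) -> str:
    """Return the coding sequence with the codon at `position` replaced by TAG.

    Assumption: `position` is 1-based and counts codons from the start of `cds`.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= position <= n_codons:
        raise ValueError(f"position must be between 1 and {n_codons}")
    start = 3 * (position - 1)
    return cds[:start] + AMBER_CODON + cds[start + 3:]


# Toy six-codon sequence (hypothetical, not the EGFP gene): ATG GTG AGC AAG GGC GAG
toy_cds = "ATGGTGAGCAAGGGCGAG"
mutant = introduce_amber(toy_cds, 4)
print(mutant)
```

Without the unnatural amino acid and a matching synthetase/tRNA pair, translation stops at the TAG codon, which is why fluorescence of the reporter serves as a readout of incorporation.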
[0457] Discussion
[0458] Based on the results presented herein, analogs of SFK are proposed in FIGS. 2A-2C which should possess reactivities similar to that of SFK to covalently target proteins and other biomolecules. In FIG. 2A, substituents R can be introduced into the pyrrole ring to further fine-tune the reactivity. These can be electron-withdrawing or electron-donating groups. In FIG. 2B, we propose to replace the N atom in the pyrrole with O or S. Similarly, substituents R can be introduced to fine-tune the reactivity. In FIG. 2C, two to four heteroatoms can be simultaneously introduced into the ring as shown, with an additional substituent R for further fine-tuning the reactivity.
[0459] Materials and Methods
[0460] Primers were synthesized and purified by Integrated DNA Technologies (IDT), and plasmids were sequenced by GENEWIZ. All molecular biology reagents were obtained from either New England Biolabs or Vazyme. All solvents were of reagent grade and were purchased from Fisher Scientific and Aldrich. Reagents were purchased from Aldrich, Enamine, and Asta Tech. The stationary phase for chromatographic purification was silica (230 x 400 mesh, Sorbtech). Silica gel TLC plates were purchased from Sorbtech. 1H-NMR (400 MHz) and 13C-NMR (100 MHz) spectra were recorded on a Bruker Avance 400 MHz NMR spectrometer. OD600 and fluorescence intensity were recorded on a BioTek UV/Vis/Fluorescence plate reader.
[0461] Incorporation of SFK into protein. EGFP (182TAG) and MBP-Z(24TAG) were cloned into the expression plasmid pBAD as reported. pBAD-EGFP (182TAG) or pBAD-MBP-Z(24TAG) was co-transformed with pEVOL-FSKRS into DH10b, and plated on an LB agar plate supplemented with 50 μg/mL ampicillin and 34 μg/mL chloramphenicol. A single colony was picked and inoculated into 1 mL 2XYT (5 g/L NaCl, 16 g/L Tryptone, 10 g/L Yeast extract) with 50 μg/mL ampicillin and 34 μg/mL chloramphenicol. The cells were grown at 37 °C, 220 rpm overnight. The next morning, cells were diluted 100 times in fresh 2XYT supplemented with 50 μg/mL ampicillin and 34 μg/mL chloramphenicol. When the cells reached an OD600 of 1.0, they were supplied with 2 mM SFK. The cells were then induced by 0.2% arabinose at 25 °C for 20 h. Proteins were then purified using the following procedure.
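The additions in this protocol are stated as final concentrations (2 mM SFK, 0.2% arabinose, a 100-fold dilution of the overnight culture). A minimal sketch of the corresponding C1·V1 = C2·V2 arithmetic follows. The culture volume and both stock concentrations are hypothetical values chosen only for illustration; the disclosure does not state them.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock (mL) giving `final_conc` in `final_volume_ml` (C1*V1 = C2*V2).

    `final_conc` and `stock_conc` must be in the same units.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * final_volume_ml / stock_conc


culture_ml = 100.0  # hypothetical culture volume

# 2 mM SFK from a hypothetical 100 mM stock
sfk_ml = stock_volume_ml(2.0, 100.0, culture_ml)
# 0.2 % arabinose from a hypothetical 20 % stock
arabinose_ml = stock_volume_ml(0.2, 20.0, culture_ml)
# 100-fold dilution of the overnight culture into fresh medium
inoculum_ml = culture_ml / 100.0

print(f"SFK stock: {sfk_ml:.2f} mL, arabinose stock: {arabinose_ml:.2f} mL, "
      f"overnight culture: {inoculum_ml:.2f} mL")
```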
[0462] His-tagged protein expression and purification. Afb7X was cloned into the expression plasmid pBAD and expressed in E. coli following the literature procedure reported by Wang et al, J. Am. Chem. Soc. 140(15):4995-4999 (2018). After protein expression, 100 mL cells were centrifuged at 4,000 rpm for 10 min and the cell pellet was suspended in cell lysis buffer (50 mM Tris-HCl pH 8.0, 500 mM NaCl, 20 mM imidazole, 1% v/v Tween20, 10% v/v glycerol, DNase 0.1 mg/mL) with protease inhibitors. The lysate was sonicated with a Sonic Dismembrator (Fisher Scientific, 30 % output, 5 min, 1 s off, 1 s on) in an ice-water bath, after which the lysate was centrifuged (4,000 rpm for 10 min) and the supernatant was collected. Ni-NTA Agarose slurry (Thermo Scientific, #88222, 200 μL) was added to the supernatant. The mixture was incubated at 4 °C for 15 min and subsequently loaded onto a Poly-Prep® Chromatography Column. After washing the column 3 times with 20 mL PBS (pH 7.4) containing 20 mM imidazole, 0.5 mL elution buffer (PBS with 300 mM imidazole) was used to elute the protein. Purified protein was exchanged into PBS (pH 7.4) using an Amicon Ultra column and stored at -20 °C.
[0463] Afb7X and MBP-Z(24SFK) cross-linking. 1 mg/mL Afb7X and 0.5 mg/mL MBP-Z(24SFK) were incubated in PBS (pH 7.4) at 37 °C for 12 h, after which 2 μL of the reaction solution was removed and mixed with 10 μL Laemmli loading buffer. The mixture was heated to 95 °C for 10 min and then loaded for SDS-PAGE, after which the gel was stained with Coomassie blue and imaged with a ChemiDoc™ MP imaging system (Bio-Rad). The maltose binding protein (MBP), Z protein, and Zspa affibody are well known in the art.
[0464] Synthesis of compound 3. To a stirred solution of compound 2 (1.0 g, 5.2 mmol) and 1-ethyl-3-(3'-dimethylaminopropyl)carbodiimide hydrochloride (1.5 g, 7.8 mmol) in 20 mL anhydrous DCM was added compound 1 (HCl form, 2.1 g, 6.2 mmol) and diisopropylethylamine (0.8 g, 6.2 mmol) in 10 mL anhydrous DCM. The mixture was stirred at r.t. for 2 h. Then 100 mL EtOAc was added to dilute the reaction mixture, and the organic phase was washed sequentially with H2O (50 mL) and brine (50 mL). The organic phase was dried over anhydrous Na2SO4 and evaporated under reduced pressure to give the crude product, which was then purified by column chromatography (silica gel, DCM:MeOH = 20:1) to give compound 3 as a white solid (1.0 g, 40 %).
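The reagent amounts in this coupling can be read as molar equivalents relative to compound 2. The short sketch below computes them from the millimole values quoted in the paragraph; treating compound 2 as the limiting reagent is an assumption based on it having the smallest molar amount, and the dictionary labels are shorthand introduced here.

```python
# Millimole amounts as quoted in the procedure.
limiting_mmol = 5.2  # compound 2, assumed to be the limiting reagent
reagents_mmol = {
    "EDC hydrochloride": 7.8,
    "compound 1 (HCl form)": 6.2,
    "diisopropylethylamine": 6.2,
}

# Equivalents = moles of reagent per mole of limiting reagent.
equivalents = {name: round(mmol / limiting_mmol, 2)
               for name, mmol in reagents_mmol.items()}

for name, eq in equivalents.items():
    print(f"{name}: {eq} equiv")
```

On these figures the carbodiimide is used at about 1.5 equivalents and the amine and base at about 1.2 equivalents each.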
[0465] Synthesis of SFK. Compound 3 (1.0 g, 2.1 mmol) was stirred in 4 M HCl in dioxane (10 mL) at r.t. for 20 h. Then 20 mL diethyl ether was added to the reaction mixture, and a white precipitate was formed and collected by filtration. The white solid was further dried under reduced pressure to give SFK in HCl salt form (0.6 g, 89 %). 1H NMR (MeOD): δ 7.74 (d, J = 1.6 Hz, 1H), 7.29 (d, J = 1.6 Hz, 1H), 4.00 (t, J = 12.4 Hz, 1H), 3.40 (t, J = 13.8 Hz, 2H), 2.05-1.93 (m, 2H), 1.73-1.48 (m, 4H). 13C NMR (MeOD): δ 170.4, 160.3, 128.9, 126.8, 115.3 (d, J = 30 Hz, C-F), 109.4, 52.4, 38.5, 29.8, 28.6, 21.9; MS calcd for C11H17FN3O5S [M+H]+ 322.09, found: 322.14.
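The calculated [M+H]+ value can be checked from the stated molecular formula. The sketch below sums monoisotopic masses and subtracts one electron mass, treating C11H17FN3O5S as the composition of the singly charged ion; that reading is an assumption, adopted because it reproduces the quoted 322.09. The isotope masses are standard values for the most abundant isotopes, and the helper names are illustrative.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462,
    "F": 18.99840316, "S": 31.97207117, "Cl": 34.96885268,
}
ELECTRON = 0.00054858  # Da


def parse_formula(formula: str) -> dict:
    """Parse a simple formula such as 'C11H17FN3O5S' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def cation_mz(ion_formula: str) -> float:
    """m/z of a singly charged cation whose full composition is `ion_formula`."""
    mass = sum(MONOISOTOPIC[el] * n for el, n in parse_formula(ion_formula).items())
    return mass - ELECTRON


sfk_mz = cation_mz("C11H17FN3O5S")
print(f"calculated m/z: {sfk_mz:.4f}")
```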
[0466] Example 2
[0467] The compound shown in FIG. 2E was synthesized by the process shown in FIG. 3. To a stirred solution of compound 2 (1.0 g, 4.4 mmol) and 1-ethyl-3-(3'-dimethylaminopropyl)carbodiimide hydrochloride (1.3 g, 6.6 mmol) in 20 mL anhydrous DCM was added compound 1 (HCl form, 1.7 g, 5.3 mmol) and diisopropylethylamine (683 mg, 5.3 mmol) in 10 mL anhydrous DCM. The mixture was stirred at room temperature for 2 hours. Then 100 mL EtOAc was added to dilute the reaction mixture, and the organic phase was washed sequentially with H2O (100 mL) and brine (100 mL). The organic phase was dried over anhydrous Na2SO4 and evaporated under reduced pressure to give the crude product, which was then purified by column chromatography (silica gel, DCM:MeOH = 20:1) to give compound 3 as a white solid (730 mg, 32.3 %).
[0468] Compound 3 (400 mg, 0.78 mmol) was stirred in 4 M HCl in dioxane (4 mL) at room temperature for 20 hours. Then 10 mL diethyl ether was added to the reaction mixture, and a white precipitate was formed and collected by filtration. The white solid was further dried under reduced pressure to give SFOK in HCl salt form (quantitative yield). 1H NMR (D2O): δ 7.58 (s, 1H), 4.01 (t, J = 6.0 Hz, 1H), 3.40 (t, J = 6.8 Hz, 2H), 2.01 - 1.92 (m, 2H), 1.70 - 1.63 (m, 2H), 1.53 - 1.44 (m, 2H). 13C NMR (D2O): δ 173.1, 158.4, 147.2, 145.9, 116.6 (d, J = 34 Hz, C-F), 115.2, 53.6, 39.5, 30.4, 28.4, 22.2. MS calcd for C11H15ClFN2O6S [M+H]+ 357.0318, found: 357.0312.
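The quantities reported for Example 2 can be cross-checked against each other. The sketch below derives an approximate molar mass for compound 3 from the deprotection step (400 mg corresponding to 0.78 mmol) and uses it to estimate the mass expected from a 32.3 % yield on 4.4 mmol of compound 2. It assumes compound 2 is the limiting reagent and that the isolated mass of compound 3 is on the milligram scale; the variable names are illustrative.

```python
# Deprotection step: 400 mg of compound 3 is stated to be 0.78 mmol.
molar_mass_compound3 = 400.0 / 0.78  # mg/mmol, numerically equal to g/mol

# Coupling step: 4.4 mmol of compound 2 (assumed limiting) at 32.3 % yield.
limiting_mmol = 4.4
yield_fraction = 0.323
product_mmol = limiting_mmol * yield_fraction

# Mass of compound 3 expected from those two figures.
expected_mass_mg = product_mmol * molar_mass_compound3

print(f"approx. molar mass of compound 3: {molar_mass_compound3:.0f} g/mol")
print(f"expected isolated mass: {expected_mass_mg:.0f} mg")
```

The estimate comes out close to 730 mg, so the stated yield is consistent with an isolated mass of roughly three quarters of a gram.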
[0469] Example 3
[0470] SFK was incorporated into mNb6, a nanobody specific for the SARS-CoV-2 Spike protein, in E. coli. The mNb6 gene containing a TAG codon at position 54 (mNb6-54TAG) was co-expressed with the tRNAPyl/FSKRS pair in E. coli with SFK added to the media. The purified mNb6(54SFK) protein was analyzed by electrospray ionization time-of-flight mass spectrometry (ESI-TOF MS). A peak observed at 13768 Da corresponds to the intact protein mass of mNb6-54SFK (expected 13770.68 Da).
[0471] Methods. Intact mNb6(54SFK) was analyzed on a Waters Xevo G2S Q-TOF. 10 μg protein was injected and separated on a Waters Acquity UPLC protein BEH C4 column (1.7 μm, 2.1 mm x 50 mm) by a reverse-phase gradient of 0-80% acetonitrile over 5 min. Protein spectra were averaged and the charge states were deconvoluted. The mNb6 nanobody is known in the art and described, for example, in WO 2022/232377, the disclosure of which is incorporated by reference herein.
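Charge-state deconvolution relates each ESI peak to the neutral mass through m/z = (M + z·1.007276)/z. The sketch below generates the expected m/z series for the stated mass of 13770.68 Da and then recovers the charge and mass from two adjacent peaks. The range of charge states and the function names are assumptions made for illustration; the instrument software's actual deconvolution algorithm is not described in the disclosure.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz_for_charge(mass: float, z: int) -> float:
    """m/z of the [M + zH]z+ ion for a neutral mass `mass`."""
    return (mass + z * PROTON) / z


def deconvolute_pair(mz_z: float, mz_z_plus_1: float):
    """Recover (z, neutral mass) from adjacent peaks with charges z and z+1.

    `mz_z` is the higher-m/z peak (charge z); `mz_z_plus_1` is its neighbor.
    """
    z = round((mz_z_plus_1 - PROTON) / (mz_z - mz_z_plus_1))
    return z, z * (mz_z - PROTON)


expected_mass = 13770.68   # Da, as stated for mNb6-54SFK
observed_mass = 13768.0    # Da, as stated for the observed peak

# Hypothetical window of charge states, for illustration only.
series = {z: mz_for_charge(expected_mass, z) for z in range(8, 13)}
charge, recovered_mass = deconvolute_pair(series[10], series[11])
delta_da = observed_mass - expected_mass

for z, mz in series.items():
    print(f"z = {z:2d}: m/z = {mz:.3f}")
print(f"recovered: z = {charge}, M = {recovered_mass:.2f} Da; "
      f"observed - expected = {delta_da:.2f} Da")
```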
[0472] Example 4
[0473] SFK incorporation in mammalian cells was tested. HeLa-GFP-182TAG reporter cells, which contain a genome-integrated GFP(182TAG) gene, were transfected with plasmid pMP-FSKRS-3xtRNA. In the presence of 1 mM SFK in the media, strong GFP fluorescence was observed. No fluorescence signal was detected without SFK addition (FIGS. 4A-4B).
[0474] Methods. 8×10^4 HeLa-GFP(182TAG) reporter cells were seeded in a 6-well cell culture plate and incubated at 37 °C in a CO2 incubator overnight. Plasmid pMP-SFKRS was transfected into cells using lipofectamine 2000 following the manufacturer’s instructions. Five hours post transfection, the media was replaced with fresh DMEM media with 10% FBS in the presence or absence of 1 mM SFK. After incubation at 37 °C for 24 h, transfected cells were subjected to imaging.
[0475] Example 5
[0476] Both FSK and SFK crosslink with His, Lys, and Tyr residues placed in proximity. Since FSK has its warhead installed on a six-membered ring while SFK has its warhead installed on a five-membered ring, their cross-linking abilities were compared by incorporating them at the same site in proteins. In light of the crystal structure of the Affibody-Z complex, SFK was introduced at position 24 of the Z protein in order to target position 7 of the Affibody. Upon Affibody-Z binding, the incorporated SFK would be brought into close proximity to the residue at position 7 of the Affibody for cross-linking. The maltose binding protein (MBP) was fused to the N-terminus of the Z protein to generate MBP-Z, so that it could be distinguished from the affibody protein, which has a molecular weight similar to that of the Z protein. After expression and purification, MBP-Z(24SFK) was incubated with Affibody(7H), Affibody(7K), or Affibody(7Y), respectively, followed by SDS-PAGE analysis (FIGS. 5D-5F). A protein band corresponding to MBP-Z cross-linked with Affibody was clearly observed for Affibody(7H), Affibody(7K), and Affibody(7Y), with cross-linking efficiencies (determined by band intensities) of 30.2%, 51.1%, and 62.6%, respectively, after 22 h incubation. The cross-linking efficiency also increased with incubation time. In contrast, when MBP-Z(24FSK) was purified and incubated with Affibody(7H), Affibody(7K), or Affibody(7Y), no cross-linking band was detected at the position of the corresponding molecular weight (FIGS. 5A-5C). These results indicate that SFK was able to crosslink proteins at sites where FSK did not.
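Cross-linking efficiency is stated to be determined from band intensities, but the exact formula is not given. One common convention, assumed here, divides the intensity of the cross-linked band by the sum of the cross-linked and remaining uncross-linked MBP-Z bands. The intensities in the sketch are hypothetical arbitrary units and are not data from the disclosure.

```python
def crosslink_efficiency(crosslinked: float, uncrosslinked: float) -> float:
    """Percent of MBP-Z found in the cross-linked band.

    Assumed convention: crosslinked / (crosslinked + uncrosslinked) * 100,
    using background-subtracted densitometry values from the same lane.
    """
    total = crosslinked + uncrosslinked
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * crosslinked / total


# Hypothetical densitometry values (arbitrary units) for a single lane.
example_efficiency = crosslink_efficiency(crosslinked=1250.0, uncrosslinked=3750.0)
print(f"cross-linking efficiency: {example_efficiency:.1f}%")
```

Applying the same function to each time point of a lane series would give the efficiency-versus-time trend described above.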
[0477] To further validate the cross-linking ability of SFK, we next compared the cross-linking of FSK and SFK at multiple sites in another protein system. FSK or SFK was incorporated individually at sites 50-59 of the CDR2 region of the nanobody mNb6. Each mutant protein was purified and incubated with the SARS-CoV-2 Spike receptor binding domain (RBD) variant E484K, followed by Western blot analysis. As shown in FIG. 6A, FSK enabled mNb6 cross-linking with the Spike RBD when FSK was incorporated at sites 53 and 54. In contrast, SFK enabled mNb6 cross-linking with the Spike RBD when SFK was incorporated not only at sites 53 and 54 but also at sites 55, 56, and 57 (FIG. 6B). These results further demonstrate that SFK crosslinks proteins at more sites than FSK.
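The site comparison above can be summarized with a few set operations. This is only an illustrative sketch: the site numbers are those stated in the text for FIGS. 6A and 6B, while the variable names are hypothetical.

```python
# Sites scanned in mNb6 and sites where a cross-link with Spike RBD was
# reported in the text (FIG. 6A for FSK, FIG. 6B for SFK).
scanned_sites = set(range(50, 60))      # sites 50-59
fsk_sites = {53, 54}
sfk_sites = {53, 54, 55, 56, 57}

shared = sorted(fsk_sites & sfk_sites)                  # cross-linked by both
sfk_only = sorted(sfk_sites - fsk_sites)                # cross-linked only by SFK
fsk_only = sorted(fsk_sites - sfk_sites)                # cross-linked only by FSK
neither = sorted(scanned_sites - (fsk_sites | sfk_sites))

print("both:", shared)
print("SFK only:", sfk_only)
print("FSK only:", fsk_only)
print("neither:", neither)
```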
[0478] Methods. 5 mM MBP-Z(24FSK) or MBP-Z(24SFK) was incubated with 100 mM Affibody(7X) at 37 °C. Samples were collected at different incubation times (5 min, 15 min, 30 min, 1 h, 2 h, 4 h, 7 h, 11 h, 22 h). 10 µl samples were mixed with 10 µl SDS sample buffer (containing 0.5 mM EDTA, 20 mM HEPES, and 2% SDS) and boiled at 95 °C for 10 min. The samples were then separated by SDS-PAGE.
[0479] 500 nM Spike RBD(E484K) was incubated with 5 mM mNb6-FSK or mNb6-SFK mutants at 37 °C. Samples were collected at different incubation times (5 min, 15 min, 30 min, 1 h, 2 h, 4 h, 7 h, 11 h, 22 h). 10 µl samples were mixed with 10 µl SDS sample buffer (containing 0.5 mM EDTA, 20 mM HEPES, and 2% SDS) and boiled at 95 °C for 10 min. The samples were separated by SDS-PAGE and immunoblotted with a 1:10000 dilution of anti-His monoclonal antibody (Proteintech #HRP-66005).
[0480] Informal Sequence Listing
[0481] SEQ ID NO: 1
DKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARALR HHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPL ENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSA PVQASAPALTKQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENY LGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPN MYNYLRKLDRALPDPIKTFEIGPCYRKESDGKEHLEEFTMLGFCQMGSGCTRENLESIIT DFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFG LERLLKVKHDFKNIKRAARSESYYNGISTNL
[0482] SEQ ID NO: 2
MDKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARA LRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAP KPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITS MSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEE RENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPM LAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLE SIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGA GFGLERLLKVKHDFKNIKRAARSESYYNGISTNL*
[0483] SEQ ID NO:3
ATGGATAAAAAGCCTTTGAACACTCTGATTTCTGCGACCGGTCTGTGGATGTCCCGC
ACCGGCACCATCCACAAAATCAAACACCATGAAGTTAGCCGTTCCAAAATCTACAT
TGAAATGGCTTGCGGCGATCACCTGGTTGTCAACAACTCCCGTTCTTCTCGTACCGC
TCGCGCACTGCGCCACCACAAATATCGCAAAACCTGCAAACGTTGCCGTGTTAGCG
ATGAGGACCTGAACAAATTCCTGACCAAAGCTAACGAGGATCAGACCTCCGTAAAA
GTGAAGGTAGTAAGCGCTCCGACCCGTACTAAAAAGGCTATGCCAAAAAGCGTGGC
CCGTGCCCCGAAACCTCTGGAAAACACCGAGGCGGCTCAGGCTCAACCATCCGGTT
CTAAATTTTCTCCGGCGATCCCAGTGTCCACCCAAGAATCTGTTTCCGTACCAGCAA
GCGTGTCTACCAGCATTAGCAGCATTTCTACCGGTGCTACCGCTTCTGCGCTGGTAA
AAGGTAACACTAACCCGATTACTAGCATGTCTGCACCGGTACAGGCAAGCGCCCCA
GCTCTGACTAAATCCCAGACGGACCGTCTGGAGGTGCTGCTGAACCCAAAGGATGA
AATCTCTCTGAACAGCGGCAAGCCTTTCCGTGAGCTGGAAAGCGAGCTGCTGTCTC
GTCGTAAAAAGGATCTGCAACAGATCTACGCTGAGGAACGCGAGAACTATCTGGGT
AAGCTGGAGCGCGAAATTACTCGCTTCTTCGTGGATCGCGGTTTCCTGGAGATCAAA
TCTCCGATTCTGATTCCGCTGGAATACATTGAACGTATGGGCATCGATAATGATACC
GAACTGTCTAAACAGATCTTCCGTGTGGATAAAAACTTCTGTCTGCGTCCGATGCTG
ATTCCGAACTTGTACAACTATTTACGTAAACTGGACCGTGCCCTGCCGGACCCGATC
AAAATATTCGAGATCGGTCCTTGCTACCGTAAAGAGTCCGACGGTAAAGAGCACCT
GGAAGAATTCACCATGCTGACATTCATTCAGATGGGTAGCGGTTGCACGCGTGAAA
ACCTGGAATCCATTATCACCGACTTCCTGAATCACCTGGGTATCGATTTCAAAATTG
TTGGTGACAGCTGTATGGTGTTAGGCGATACGCTGGATGTTATGCACGGCGATCTGG
AGCTGTCTTCCGCAGTTGTGGGCCCAATCCCGCTGGATCGTGAGTGGGGTATCGACA
AACCTAAAATCGGTGCGGGTTTTGGTCTGGAGCGTCTGCTGAAAGTAAAACACGAC
TTCAAGAACATCAAACGTGCTGCACGTTCCGAGTCCTATTACAATGGTATTTCTACT
AACCTGTAA
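A nucleotide listing such as SEQ ID NO:3 can be checked against a protein listing by translating it in frame. The sketch below assumes the standard genetic code and reading frame 1; the function and constant names are hypothetical, and only the first and last lines of SEQ ID NO:3 are translated here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    cleaned = "".join(dna.split()).upper()  # drop spaces and line breaks
    if len(cleaned) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(
        CODON_TABLE[cleaned[i:i + 3]] for i in range(0, len(cleaned), 3)
    )


# First and last lines of SEQ ID NO:3 as listed above.
first_line = "ATGGATAAAAAGCCTTTGAACACTCTGATTTCTGCGACCGGTCTGTGGATGTCCCGC"
last_line = "AACCTGTAA"
print(translate(first_line))  # MDKKPLNTLISATGLWMSR
print(translate(last_line))   # NL*
```

Joining all lines of the listing before calling `translate` keeps codons that span a line break intact.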
[0484] SEQ ID NO:4
MDKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARA
LRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAP
KPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITS
MSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEE
RENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPM
LIPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLTFIQMGSGCTRENLESI
ITDFLNHLGIDFKIVGDSCMVLGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPKIGA
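Differences between two listed synthetase sequences can be located by a position-by-position comparison. The sketch below is a simplification that assumes two equal-length, already aligned fragments with no gaps; a full sequence-identity calculation would normally use a proper alignment. The two fragments are short excerpts copied from SEQ ID NO:2 and SEQ ID NO:4 above, the function names are hypothetical, and the reported positions are local to the fragment, not residue numbers of the full protein.

```python
def substitutions(ref: str, var: str) -> list[tuple[int, str, str]]:
    """List (1-based local position, ref residue, var residue) mismatches."""
    if len(ref) != len(var):
        raise ValueError("fragments must have equal length (ungapped comparison)")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(ref, var)) if a != b]


def percent_identity(ref: str, var: str) -> float:
    """Percent of identical positions between two equal-length fragments."""
    if len(ref) != len(var) or not ref:
        raise ValueError("fragments must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ref, var))
    return 100.0 * matches / len(ref)


# Excerpts from the listings above (same region of both sequences).
seq2_fragment = "EEFTMLNFCQMGSGC"  # from SEQ ID NO:2
seq4_fragment = "EEFTMLTFIQMGSGC"  # from SEQ ID NO:4

print(substitutions(seq2_fragment, seq4_fragment))
print(f"{percent_identity(seq2_fragment, seq4_fragment):.1f}% identical")
```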
[0485] References: Wang et al. Genetically Encoding Fluorosulfate-L-tyrosine To React with Lysine, Histidine, and Tyrosine via SuFEx in Proteins in Vivo. J. Am. Chem. Soc. 140 (15), 4995-4999 (2018); Liu et al. A Genetically Encoded Fluorosulfonyloxybenzoyl-L-lysine for Expansive Covalent Bonding of Proteins via SuFEx Chemistry. J. Am. Chem. Soc. 143 (27), 10341-10351 (2021); Li et al. Developing Covalent Protein Drugs via Proximity-Enabled Reactive Therapeutics. Cell 182 (1), 85-97.e16 (2020).
[0486] It is understood that the examples described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference herein in their entirety for all purposes.
Claims
1. A compound of Formula (I) or a stereoisomer thereof:
wherein: ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl;
L4 is a bond or -O-; x is an integer from 0 to 8;
L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene;
R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
X1 is independently -F, -Cl, -Br, or -I;
R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2.
3. The compound of claim 1, wherein R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl;
R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and
R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
5. The compound of claim 1, wherein ring A is a 5-membered cycloalkyl having one or two double bonds or a 5-membered heterocycloalkyl having one double bond.
6. The compound of claim 1, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
9. The compound of claim 1, wherein L1 is a bond, substituted or unsubstituted C1-4 alkylene, or substituted or unsubstituted 2 to 6 membered heteroalkylene.
10. The compound of claim 1, wherein L1 is -NH-C(O)-(CH2)y-, -NH-C(O)-O-(CH2)y-, -NH-C(O)-NH-(CH2)y-, or -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2.
11. The compound of claim 1, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-, -(CH2)4NH-C(O)-O-, -(CH2)4NH-C(O)-NH-, or -(CH2)4NH-C(O)-S-.
13. A protein comprising an unnatural amino acid, wherein the unnatural amino acid comprises a side chain of Formula (II):
ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl;
L4 is a bond or -O-; x is an integer from 0 to 8;
L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene;
R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
X1 is independently -F, -Cl, -Br, or -I;
R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2.
15. The protein of claim 13, wherein R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl;
R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and
R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
17. The protein of claim 13, wherein ring A is a 5-membered cycloalkyl having one or two double bonds or a 5-membered heterocycloalkyl having one double bond.
18. The protein of claim 13, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
20. The protein of claim 13, wherein the side chain of Formula (II) is selected from the group consisting of:
21. The protein of claim 13, wherein L1 is a bond, substituted or unsubstituted C1-4 alkylene, or substituted or unsubstituted 2 to 6 membered heteroalkylene.
22. The protein of claim 13, wherein L1 is -NH-C(O)-(CH2)y-, -NH-C(O)-O-(CH2)y-, -NH-C(O)-NH-(CH2)y-, or -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2.
23. The protein of claim 13, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-, -(CH2)4NH-C(O)-O-, -(CH2)4NH-C(O)-NH-, or -(CH2)4NH-C(O)-S-.
25. The protein of claim 13, wherein the protein is an antibody or an antibody variant.
26. The protein of claim 25, wherein the antibody variant is a single-chain variable fragment, a single-domain antibody, an affibody, or an antigen-binding fragment.
27. The protein of claim 13, wherein the unnatural amino acid is within a CDR region or a framework region of the antibody or antibody variant.
28. The protein of claim 13, wherein the protein is a receptor.
29. The protein of claim 13, wherein the cell surface receptor is in the extracellular domain, the transmembrane domain, or the intracellular domain.
30. The protein of claim 13, wherein the protein is a cytosolic protein, a transcriptional factor, or an enzyme.
31. The protein of claim 13, further comprising a detectable agent.
32. The protein of claim 31, wherein the detectable agent is a radioisotope.
33. The protein of claim 13, further comprising a therapeutic agent.
34. A nucleic acid encoding the protein of claim 13.
35. A vector comprising a nucleic acid that encodes the protein of claim 13.
R4 and R5 are each independently a peptidyl moiety, a carbohydrate moiety, a lipid moiety, or a nucleic acid moiety; ring A is a 5-membered cycloalkyl, a 5-membered heterocycloalkyl, or a 5-membered heteroaryl;
L4 is a bond or -O-; x is an integer from 0 to 8;
L1 is a bond, substituted or unsubstituted alkylene, or substituted or unsubstituted heteroalkylene;
L2 is a bond, -NR2A-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R2A)C(O)-, -C(O)N(R2A)-, -NR2AC(O)NR2B-, -NR2AC(NH)NR2B-, -SO2N(R2A)-, -N(R2A)SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene;
L3 is a bond, -N(R3A)-, -S-, -S(O)2-, -O-, -C(O)-, -C(O)O-, -OC(O)-, -N(R3A)C(O)-, -C(O)N(R3A)-, -NR3AC(O)NR3B-, -NR3AC(NH)NR3B-, -SO2N(R3A)-, -N(R3A)SO2-, -C(S)-, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene; and
R2A, R2B, R3A, and R3B are independently hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl;
R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -OR1A, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
X1 is independently -F, -Cl, -Br, or -I;
R1A is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl;
R1B is hydrogen, substituted or unsubstituted alkyl, or substituted or unsubstituted heteroalkyl; n1 is an integer from 0 to 4; m1 is 1 or 2; and v1 is 1 or 2.
38. The biomolecule conjugate of claim 36, wherein R1 is hydrogen, halogen, -CX13, -CHX12, -CH2X1, -OCX13, -OCH2X1, -OCHX12, -CN, -SOn1R1A, -SOv1NR1AR1B, -OR1A, -NHC(O)NR1AR1B, -N(O)m1, -NR1AR1B, -C(O)R1A, -C(O)-OR1A, -C(O)NR1AR1B, -NR1ASO2R1B, -NR1AC(O)R1B, -NR1AC(O)OR1B, -NR1AOR1B, unsubstituted C1-8 alkyl, or unsubstituted 2 to 8 membered heteroalkyl; R1A is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl; and R1B is hydrogen, unsubstituted C1-4 alkyl, or unsubstituted 2 to 4 membered heteroalkyl.
40. The biomolecule conjugate of claim 36, wherein ring A is a 5-membered cycloalkyl having one or two double bonds or a 5-membered heterocycloalkyl having one double bond.
41. The biomolecule conjugate of claim 36, wherein ring A is a 5-membered heteroaryl containing 1 to 3 heteroatoms selected from the group consisting of oxygen, nitrogen, and sulfur.
42. The biomolecule conjugate of claim 36, wherein the biomolecule conjugate of
43. The biomolecule conjugate of claim 36, wherein the biomolecule conjugate of
44. The biomolecule conjugate of claim 36, wherein L1 is a bond, substituted or unsubstituted C1-4 alkylene, or substituted or unsubstituted 2 to 6 membered heteroalkylene.
45. The biomolecule conjugate of claim 36, wherein L1 is -NH-C(O)-(CH2)y-, -NH-C(O)-O-(CH2)y-, -NH-C(O)-NH-(CH2)y-, or -NH-C(O)-S-(CH2)y-, and y is an integer from 0 to 2.
46. The biomolecule conjugate of claim 36, wherein -(CH2)x-L1- is -(CH2)4NH-C(O)-, -(CH2)4NH-C(O)-O-, -(CH2)4NH-C(O)-NH-, or -(CH2)4NH-C(O)-S-.
47. The biomolecule conjugate of claim 36, wherein the biomolecule conjugate of Formula (III) is a biomolecule conjugate selected from the group consisting of:
48. The biomolecule conjugate of claim 36, wherein R4 and R5 are each independently a peptidyl moiety.
49. The biomolecule conjugate of claim 48, wherein the peptidyl moiety of R4 comprises an antibody; and the peptidyl moiety of R5 comprises a protein.
50. The biomolecule conjugate of claim 48, wherein the peptidyl moiety of R4 comprises an antibody variant; and the peptidyl moiety of R5 comprises a protein.
51. The biomolecule conjugate of claim 48, wherein the peptidyl moiety of R4 comprises a protein; and the peptidyl moiety of R5 comprises an antibody or an antibody variant.
52. The biomolecule conjugate of claim 50, wherein the antibody variant is an antigen-binding fragment, a single-chain variable fragment, a single-domain antibody, or an affibody.
53. The biomolecule conjugate of claim 49, wherein the protein is the target protein of the antibody or antibody variant.
54. The biomolecule conjugate of claim 49, wherein the protein is a cytosolic protein, an enzyme, or a transcriptional factor.
55. The biomolecule conjugate of claim 49, wherein the protein is a receptor protein.
56. The biomolecule conjugate of claim 55, wherein the receptor protein is a 5- hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR). a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G protein- coupled receptor, a G protein-coupled estrogen receptor, a histamine receptor, a hydroxycarboxylic acid receptor, a kisspeptin receptor, a leukotriene receptor, a lysophospholipid receptor, a lysophospholipid SIP receptor, a melanin-concentrating hormone receptor, a melanocortin receptor, a melatonin receptor, a motilin receptor, a neuromedin U receptor, a neuropeptide FF/neuropeptide AF receptor, a neuropeptide S receptor, a neuropeptide W/neuropeptide B receptor, a neuropeptide Y receptor, a neurotensin receptor, an opioid receptor, an opsin receptor, an orexin receptor, an oxoglutarate receptor, a P2Y receptor, a platelet-activating factor receptor, a prokineticin receptor, a prolactin-releasing peptide receptor, a prostanoid receptor, a proteinase-activated receptor, a QRFP receptor, a relaxin family peptide receptor, a somatostatin receptor, a succinate receptor, a tachykinin receptor, a thyrotropinreleasing hormone receptor, a trace amine receptor, a urotensin receptor, a vasopressin receptor.
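56. The biomolecule conjugate of claim 55, wherein the receptor protein is a 5-hydroxytryptamine receptor, an acetylcholine receptor, an adenosine receptor, an adenosine A2A receptor, an adenosine A2B receptor, an angiotensin receptor, an apelin receptor, a bile acid receptor, a bombesin receptor, a bradykinin receptor, a cannabinoid receptor, a chemerin receptor, a chemokine receptor, a cholecystokinin receptor, a Class A Orphan receptor, a dopamine receptor, an endothelin receptor, an epidermal growth factor receptor (EGFR), a formyl peptide receptor, a free fatty acid receptor, a galanin receptor, a ghrelin receptor, a glycoprotein hormone receptor, a gonadotrophin-releasing hormone receptor, a G protein-coupled receptor, a G protein-coupled estrogen receptor, a histamine receptor, a hydroxycarboxylic acid receptor, a kisspeptin receptor, a leukotriene receptor, a lysophospholipid receptor, a lysophospholipid S1P receptor, a melanin-concentrating hormone receptor, a melanocortin receptor, a melatonin receptor, a motilin receptor, a neuromedin U receptor, a neuropeptide FF/neuropeptide AF receptor, a neuropeptide S receptor, a neuropeptide W/neuropeptide B receptor, a neuropeptide Y receptor, a neurotensin receptor, an opioid receptor, an opsin receptor, an orexin receptor, an oxoglutarate receptor, a P2Y receptor, a platelet-activating factor receptor, a prokineticin receptor, a prolactin-releasing peptide receptor, a prostanoid receptor, a proteinase-activated receptor, a QRFP receptor, a relaxin family peptide receptor, a somatostatin receptor, a succinate receptor, a tachykinin receptor, a thyrotropin-releasing hormone receptor, a trace amine receptor, a urotensin receptor, or a vasopressin receptor.
57. The biomolecule conjugate of claim 55, wherein the protein is a G protein-coupled receptor.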
59. The complex of claim 58, wherein the pyrrolysyl-tRNA synthetase has an amino acid sequence with at least 90% sequence identity to SEQ ID NO: 1, 2, 3, or 4.
60. The complex of claim 59, wherein the pyrrolysyl-tRNA synthetase has an amino acid sequence as set forth in SEQ ID NO: 1, 2. 3, or 4.
61. The complex of claim 58, further comprising a tRNAPvl.
62. A cell comprising the protein of claim 13.
63. The cell of claim 62, wherein the cell is a bacterial cell or a mammalian cell.
64. A pharmaceutical composition comprising the protein of claim 13 and a pharmaceutically acceptable excipient.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263421974P | 2022-11-02 | 2022-11-02 | |
US63/421,974 | 2022-11-02 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024097831A1 (en) | 2024-05-10 |
Family
ID=90931542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2023/078455 WO2024097831A1 (en) | 2022-11-02 | 2023-11-02 | Bioreactive proteins containing unnatural amino acids |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024097831A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170196985A1 (en) * | 2014-06-06 | 2017-07-13 | The Scripps Research Institute | Sulfur(vi) fluoride compounds and methods for the preparation thereof |
US20210002325A1 (en) * | 2018-03-08 | 2021-01-07 | The Regents Of The University Of California | Bioreactive compositions and methods of use thereof |
WO2021102624A1 (en) * | 2019-11-25 | 2021-06-03 | Hangzhou Branch Of Technical Institute Of Physics And Chemistry, Chinese Academy Of Sciences | Covalent protein drugs developed via proximity-enabled reactive therapeutics (perx) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23886973 Country of ref document: EP Kind code of ref document: A1 |