US10332616B2 - Movable type method applied to protein-ligand binding - Google Patents
- Publication number
- US10332616B2 (application US 15/143,519)
- Authority
- US
- United States
- Prior art keywords
- molecule
- atom
- energy
- ligand
- protein
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Concepts
- 239000003446 ligand Substances 0.000 title claims abstract description 202
- 238000000034 method Methods 0.000 title claims abstract description 142
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 59
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 57
- 230000001419 dependent effect Effects 0.000 claims abstract description 10
- 230000006870 function Effects 0.000 claims description 101
- 239000011159 matrix material Substances 0.000 claims description 89
- 238000005192 partition Methods 0.000 claims description 60
- 238000005070 sampling Methods 0.000 claims description 51
- 238000004364 calculation method Methods 0.000 claims description 41
- 238000003860 storage Methods 0.000 claims description 19
- 239000002245 particle Substances 0.000 claims description 13
- 230000008859 change Effects 0.000 claims description 8
- 230000010354 integration Effects 0.000 claims description 6
- 125000004429 atom Chemical group 0.000 description 211
- 238000007614 solvation Methods 0.000 description 90
- 230000003993 interaction Effects 0.000 description 88
- 238000012360 testing method Methods 0.000 description 57
- 229910052799 carbon Inorganic materials 0.000 description 54
- 229920009537 polybutylene succinate adipate Polymers 0.000 description 45
- 239000002904 solvent Substances 0.000 description 45
- 239000001273 butane Substances 0.000 description 43
- OFBQJSOFQDEBGM-UHFFFAOYSA-N n-pentane Natural products CCCCC OFBQJSOFQDEBGM-UHFFFAOYSA-N 0.000 description 43
- 238000009826 distribution Methods 0.000 description 39
- IJDNQMDRQITEOD-UHFFFAOYSA-N n-butane Chemical compound CCCC IJDNQMDRQITEOD-UHFFFAOYSA-N 0.000 description 39
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 38
- 239000013598 vector Substances 0.000 description 36
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 28
- 238000013459 approach Methods 0.000 description 26
- 229910052760 oxygen Inorganic materials 0.000 description 25
- VNWKTOKETHGBQD-UHFFFAOYSA-N methane Chemical compound C VNWKTOKETHGBQD-UHFFFAOYSA-N 0.000 description 24
- 239000001301 oxygen Substances 0.000 description 24
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 23
- 238000004422 calculation algorithm Methods 0.000 description 20
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 18
- 230000000875 corresponding effect Effects 0.000 description 17
- 230000015654 memory Effects 0.000 description 17
- 238000003032 molecular docking Methods 0.000 description 17
- 239000000126 substance Substances 0.000 description 17
- 150000002500 ions Chemical class 0.000 description 15
- 230000007935 neutral effect Effects 0.000 description 15
- 230000008569 process Effects 0.000 description 14
- 150000003384 small molecules Chemical class 0.000 description 14
- 125000000524 functional group Chemical group 0.000 description 13
- 239000007789 gas Substances 0.000 description 13
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N phenol group Chemical group C1(=CC=CC=C1)O ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 13
- 229910052757 nitrogen Inorganic materials 0.000 description 12
- 150000001408 amides Chemical class 0.000 description 11
- 239000001257 hydrogen Substances 0.000 description 11
- 229910052739 hydrogen Inorganic materials 0.000 description 11
- 238000000329 molecular dynamics simulation Methods 0.000 description 11
- 238000007639 printing Methods 0.000 description 11
- 239000013078 crystal Substances 0.000 description 10
- 239000000243 solution Substances 0.000 description 10
- 150000007942 carboxylates Chemical class 0.000 description 9
- 150000008282 halocarbons Chemical class 0.000 description 9
- 229930195733 hydrocarbon Natural products 0.000 description 9
- 150000002430 hydrocarbons Chemical class 0.000 description 9
- ATUOYWHBWRKTHZ-UHFFFAOYSA-N Propane Chemical compound CCC ATUOYWHBWRKTHZ-UHFFFAOYSA-N 0.000 description 8
- 150000001412 amines Chemical class 0.000 description 7
- 125000003118 aryl group Chemical group 0.000 description 7
- 230000008901 benefit Effects 0.000 description 7
- 150000001875 compounds Chemical class 0.000 description 7
- 230000000694 effects Effects 0.000 description 7
- UVCJGUGAGLDPAA-UHFFFAOYSA-N ensulizole Chemical compound N1C2=CC(S(=O)(=O)O)=CC=C2N=C1C1=CC=CC=C1 UVCJGUGAGLDPAA-UHFFFAOYSA-N 0.000 description 7
- 238000005457 optimization Methods 0.000 description 7
- 230000010076 replication Effects 0.000 description 7
- DLFVBJFMPXGRIB-UHFFFAOYSA-N Acetamide Chemical compound CC(N)=O DLFVBJFMPXGRIB-UHFFFAOYSA-N 0.000 description 6
- QTBSBXVTEAMEQO-UHFFFAOYSA-N Acetic acid Chemical compound CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 6
- UHOVQNZJYSORNB-UHFFFAOYSA-N Benzene Chemical compound C1=CC=CC=C1 UHOVQNZJYSORNB-UHFFFAOYSA-N 0.000 description 6
- 239000004215 Carbon black (E152) Substances 0.000 description 6
- 238000004891 communication Methods 0.000 description 6
- 230000006854 communication Effects 0.000 description 6
- 238000010276 construction Methods 0.000 description 6
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 6
- 229910052698 phosphorus Inorganic materials 0.000 description 6
- 229910052717 sulfur Inorganic materials 0.000 description 6
- RTZKZFJDLAIYFH-UHFFFAOYSA-N Diethyl ether Chemical compound CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 description 5
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical compound [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 5
- 125000004432 carbon atom Chemical group C* 0.000 description 5
- 238000012545 processing Methods 0.000 description 5
- 239000011593 sulfur Substances 0.000 description 5
- JJYPMNFTHPTTDI-UHFFFAOYSA-N 3-methylaniline Chemical compound CC1=CC=CC(N)=C1 JJYPMNFTHPTTDI-UHFFFAOYSA-N 0.000 description 4
- 102100034523 Histone H4 Human genes 0.000 description 4
- 101001067880 Homo sapiens Histone H4 Proteins 0.000 description 4
- BAVYZALUXZFZLV-UHFFFAOYSA-N Methylamine Chemical compound NC BAVYZALUXZFZLV-UHFFFAOYSA-N 0.000 description 4
- WDQCHCWLIPRZMX-UHFFFAOYSA-N butane;methane Chemical compound C.CCCC WDQCHCWLIPRZMX-UHFFFAOYSA-N 0.000 description 4
- 238000013461 design Methods 0.000 description 4
- 230000006872 improvement Effects 0.000 description 4
- RZXMPPFPUUCRFN-UHFFFAOYSA-N p-toluidine Chemical compound CC1=CC=C(N)C=C1 RZXMPPFPUUCRFN-UHFFFAOYSA-N 0.000 description 4
- WGYKZJWCGVVSQN-UHFFFAOYSA-N propylamine Chemical compound CCCN WGYKZJWCGVVSQN-UHFFFAOYSA-N 0.000 description 4
- 230000012846 protein folding Effects 0.000 description 4
- 238000012549 training Methods 0.000 description 4
- 238000010200 validation analysis Methods 0.000 description 4
- RFFLAFLAYFXFSW-UHFFFAOYSA-N 1,2-dichlorobenzene Chemical compound ClC1=CC=CC=C1Cl RFFLAFLAYFXFSW-UHFFFAOYSA-N 0.000 description 3
- OISVCGZHLKNMSJ-UHFFFAOYSA-N 2,6-dimethylpyridine Chemical compound CC1=CC=CC(C)=N1 OISVCGZHLKNMSJ-UHFFFAOYSA-N 0.000 description 3
- CSCPPACGZOOCGX-UHFFFAOYSA-N Acetone Chemical compound CC(C)=O CSCPPACGZOOCGX-UHFFFAOYSA-N 0.000 description 3
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 3
- CURLTUGMZLYLDI-UHFFFAOYSA-N Carbon dioxide Chemical group O=C=O CURLTUGMZLYLDI-UHFFFAOYSA-N 0.000 description 3
- RGSFGYAAUTVSQA-UHFFFAOYSA-N Cyclopentane Chemical compound C1CCCC1 RGSFGYAAUTVSQA-UHFFFAOYSA-N 0.000 description 3
- YMWUJEATGCHHMB-UHFFFAOYSA-N Dichloromethane Chemical compound ClCCl YMWUJEATGCHHMB-UHFFFAOYSA-N 0.000 description 3
- QMMFVYPAHWMCMS-UHFFFAOYSA-N Dimethyl sulfide Chemical compound CSC QMMFVYPAHWMCMS-UHFFFAOYSA-N 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- XEKOWRVHYACXOJ-UHFFFAOYSA-N Ethyl acetate Chemical compound CCOC(C)=O XEKOWRVHYACXOJ-UHFFFAOYSA-N 0.000 description 3
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 3
- LRHPLDYGYMQRHN-UHFFFAOYSA-N N-Butanol Chemical compound CCCCO LRHPLDYGYMQRHN-UHFFFAOYSA-N 0.000 description 3
- IMNFDUFMRHMDMM-UHFFFAOYSA-N N-Heptane Chemical compound CCCCCCC IMNFDUFMRHMDMM-UHFFFAOYSA-N 0.000 description 3
- SJRJJKPEHAURKC-UHFFFAOYSA-N N-Methylmorpholine Chemical compound CN1CCOCC1 SJRJJKPEHAURKC-UHFFFAOYSA-N 0.000 description 3
- OAICVXFJPJFONN-UHFFFAOYSA-N Phosphorus Chemical compound [P] OAICVXFJPJFONN-UHFFFAOYSA-N 0.000 description 3
- RWRDLPDLKQPQOW-UHFFFAOYSA-N Pyrrolidine Chemical compound C1CCNC1 RWRDLPDLKQPQOW-UHFFFAOYSA-N 0.000 description 3
- YXFVVABEGXRONW-UHFFFAOYSA-N Toluene Chemical compound CC1=CC=CC=C1 YXFVVABEGXRONW-UHFFFAOYSA-N 0.000 description 3
- INKDAKMSOSCDGL-UHFFFAOYSA-N [O].OC1=CC=CC=C1 Chemical group [O].OC1=CC=CC=C1 INKDAKMSOSCDGL-UHFFFAOYSA-N 0.000 description 3
- WPYMKLBDIGXBTP-UHFFFAOYSA-N benzoic acid Chemical compound OC(=O)C1=CC=CC=C1 WPYMKLBDIGXBTP-UHFFFAOYSA-N 0.000 description 3
- JFDZBHWFFUWGJE-UHFFFAOYSA-N benzonitrile Chemical compound N#CC1=CC=CC=C1 JFDZBHWFFUWGJE-UHFFFAOYSA-N 0.000 description 3
- 230000005540 biological transmission Effects 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- 238000010668 complexation reaction Methods 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- RHMZKSWPMYAOAZ-UHFFFAOYSA-N diethyl peroxide Chemical compound CCOOCC RHMZKSWPMYAOAZ-UHFFFAOYSA-N 0.000 description 3
- SRXOCFMDUSFFAK-UHFFFAOYSA-N dimethyl peroxide Chemical compound COOC SRXOCFMDUSFFAK-UHFFFAOYSA-N 0.000 description 3
- 229940079593 drug Drugs 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 230000009881 electrostatic interaction Effects 0.000 description 3
- 230000002349 favourable effect Effects 0.000 description 3
- 230000002209 hydrophobic effect Effects 0.000 description 3
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 3
- 230000033001 locomotion Effects 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- FUZZWVXGSFPDMH-UHFFFAOYSA-N n-hexanoic acid Natural products CCCCCC(O)=O FUZZWVXGSFPDMH-UHFFFAOYSA-N 0.000 description 3
- 125000004433 nitrogen atom Chemical group N* 0.000 description 3
- QJGQUHMNIGDVPM-UHFFFAOYSA-N nitrogen group Chemical group [N] QJGQUHMNIGDVPM-UHFFFAOYSA-N 0.000 description 3
- 230000003287 optical effect Effects 0.000 description 3
- 125000004430 oxygen atom Chemical group O* 0.000 description 3
- 150000002978 peroxides Chemical class 0.000 description 3
- 239000011574 phosphorus Substances 0.000 description 3
- BDERNNFJNOPAEC-UHFFFAOYSA-N propan-1-ol Chemical compound CCCO BDERNNFJNOPAEC-UHFFFAOYSA-N 0.000 description 3
- 239000001294 propane Substances 0.000 description 3
- 238000004088 simulation Methods 0.000 description 3
- RMVRSNDYEFQCLF-UHFFFAOYSA-N thiophenol Chemical compound SC1=CC=CC=C1 RMVRSNDYEFQCLF-UHFFFAOYSA-N 0.000 description 3
- NQPDZGIKBAWPEJ-UHFFFAOYSA-N valeric acid Chemical compound CCCCC(O)=O NQPDZGIKBAWPEJ-UHFFFAOYSA-N 0.000 description 3
- UBOXGVDOUJQMTN-UHFFFAOYSA-N 1,1,2-trichloroethane Chemical compound ClCC(Cl)Cl UBOXGVDOUJQMTN-UHFFFAOYSA-N 0.000 description 2
- DHKHKXVYLBGOIT-UHFFFAOYSA-N 1,1-Diethoxyethane Chemical compound CCOC(C)OCC DHKHKXVYLBGOIT-UHFFFAOYSA-N 0.000 description 2
- RHUYHJGZWVXEHW-UHFFFAOYSA-N 1,1-Dimethyhydrazine Chemical compound CN(C)N RHUYHJGZWVXEHW-UHFFFAOYSA-N 0.000 description 2
- VXNZUUAINFGPBY-UHFFFAOYSA-N 1-Butene Chemical compound CCC=C VXNZUUAINFGPBY-UHFFFAOYSA-N 0.000 description 2
- KBPLFHHGFOOTCA-UHFFFAOYSA-N 1-Octanol Chemical compound CCCCCCCCO KBPLFHHGFOOTCA-UHFFFAOYSA-N 0.000 description 2
- IBYHHJPAARCAIE-UHFFFAOYSA-N 1-bromo-2-chloroethane Chemical compound ClCCBr IBYHHJPAARCAIE-UHFFFAOYSA-N 0.000 description 2
- ZBTMRBYMKUEVEU-UHFFFAOYSA-N 1-bromo-4-methylbenzene Chemical compound CC1=CC=C(Br)C=C1 ZBTMRBYMKUEVEU-UHFFFAOYSA-N 0.000 description 2
- BBMCTIGTTCKYKF-UHFFFAOYSA-N 1-heptanol Chemical compound CCCCCCCO BBMCTIGTTCKYKF-UHFFFAOYSA-N 0.000 description 2
- LIKMAJRDDDTEIG-UHFFFAOYSA-N 1-hexene Chemical compound CCCCC=C LIKMAJRDDDTEIG-UHFFFAOYSA-N 0.000 description 2
- JYYNAJVZFGKDEQ-UHFFFAOYSA-N 2,4-Dimethylpyridine Chemical compound CC1=CC=NC(C)=C1 JYYNAJVZFGKDEQ-UHFFFAOYSA-N 0.000 description 2
- BZHMBWZPUJHVEE-UHFFFAOYSA-N 2,4-dimethylpentane Chemical compound CC(C)CC(C)C BZHMBWZPUJHVEE-UHFFFAOYSA-N 0.000 description 2
- XWKFPIODWVPXLX-UHFFFAOYSA-N 2,5-dimethylpyridine Chemical compound CC1=CC=C(C)N=C1 XWKFPIODWVPXLX-UHFFFAOYSA-N 0.000 description 2
- AFABGHUZZDYHJO-UHFFFAOYSA-N 2-Methylpentane Chemical compound CCCC(C)C AFABGHUZZDYHJO-UHFFFAOYSA-N 0.000 description 2
- QQZOPKMRPOGIEB-UHFFFAOYSA-N 2-Oxohexane Chemical compound CCCCC(C)=O QQZOPKMRPOGIEB-UHFFFAOYSA-N 0.000 description 2
- BSKHPKMHTQYZBB-UHFFFAOYSA-N 2-methylpyridine Chemical compound CC1=CC=CC=N1 BSKHPKMHTQYZBB-UHFFFAOYSA-N 0.000 description 2
- ZPVFWPFBNIEHGJ-UHFFFAOYSA-N 2-octanone Chemical compound CCCCCCC(C)=O ZPVFWPFBNIEHGJ-UHFFFAOYSA-N 0.000 description 2
- NURQLCJSMXZBPC-UHFFFAOYSA-N 3,4-dimethylpyridine Chemical compound CC1=CC=NC=C1C NURQLCJSMXZBPC-UHFFFAOYSA-N 0.000 description 2
- HWWYDZCSSYKIAD-UHFFFAOYSA-N 3,5-dimethylpyridine Chemical compound CC1=CN=CC(C)=C1 HWWYDZCSSYKIAD-UHFFFAOYSA-N 0.000 description 2
- IAVREABSGIHHMO-UHFFFAOYSA-N 3-hydroxybenzaldehyde Chemical compound OC1=CC=CC(C=O)=C1 IAVREABSGIHHMO-UHFFFAOYSA-N 0.000 description 2
- XHQZJYCNDZAGLW-UHFFFAOYSA-N 3-methoxybenzoic acid Chemical compound COC1=CC=CC(C(O)=O)=C1 XHQZJYCNDZAGLW-UHFFFAOYSA-N 0.000 description 2
- ITQTTZVARXURQS-UHFFFAOYSA-N 3-methylpyridine Chemical compound CC1=CC=CN=C1 ITQTTZVARXURQS-UHFFFAOYSA-N 0.000 description 2
- HCFAJYNVAYBARA-UHFFFAOYSA-N 4-heptanone Chemical compound CCCC(=O)CCC HCFAJYNVAYBARA-UHFFFAOYSA-N 0.000 description 2
- RGHHSNMVTDWUBI-UHFFFAOYSA-N 4-hydroxybenzaldehyde Chemical compound OC1=CC=C(C=O)C=C1 RGHHSNMVTDWUBI-UHFFFAOYSA-N 0.000 description 2
- ZEYHEAKUIGZSGI-UHFFFAOYSA-N 4-methoxybenzoic acid Chemical compound COC1=CC=C(C(O)=O)C=C1 ZEYHEAKUIGZSGI-UHFFFAOYSA-N 0.000 description 2
- FKNQCJSGGFJEIZ-UHFFFAOYSA-N 4-methylpyridine Chemical compound CC1=CC=NC=C1 FKNQCJSGGFJEIZ-UHFFFAOYSA-N 0.000 description 2
- IKHGUXGNUITLKF-UHFFFAOYSA-N Acetaldehyde Chemical compound CC=O IKHGUXGNUITLKF-UHFFFAOYSA-N 0.000 description 2
- KWOLFJPFCHCOCG-UHFFFAOYSA-N Acetophenone Chemical compound CC(=O)C1=CC=CC=C1 KWOLFJPFCHCOCG-UHFFFAOYSA-N 0.000 description 2
- HSFWRNGVRCDJHI-UHFFFAOYSA-N Acetylene Chemical compound C#C HSFWRNGVRCDJHI-UHFFFAOYSA-N 0.000 description 2
- -1 Alkoxide ions Chemical class 0.000 description 2
- VVJKKWFAADXIJK-UHFFFAOYSA-N Allylamine Chemical compound NCC=C VVJKKWFAADXIJK-UHFFFAOYSA-N 0.000 description 2
- QGZKDVFQNNGYKY-UHFFFAOYSA-N Ammonia Chemical compound N QGZKDVFQNNGYKY-UHFFFAOYSA-N 0.000 description 2
- PAYRUJLWNCNPSJ-UHFFFAOYSA-N Aniline Chemical compound NC1=CC=CC=C1 PAYRUJLWNCNPSJ-UHFFFAOYSA-N 0.000 description 2
- KXDAEFPNCMNJSK-UHFFFAOYSA-N Benzamide Chemical compound NC(=O)C1=CC=CC=C1 KXDAEFPNCMNJSK-UHFFFAOYSA-N 0.000 description 2
- QFOHBWFCKVYLES-UHFFFAOYSA-N Butylparaben Chemical compound CCCCOC(=O)C1=CC=C(O)C=C1 QFOHBWFCKVYLES-UHFFFAOYSA-N 0.000 description 2
- FERIUCNNQQJTOY-UHFFFAOYSA-N Butyric acid Chemical compound CCCC(O)=O FERIUCNNQQJTOY-UHFFFAOYSA-N 0.000 description 2
- WWZKQHOCKIZLMA-UHFFFAOYSA-N Caprylic acid Natural products CCCCCCCC(O)=O WWZKQHOCKIZLMA-UHFFFAOYSA-N 0.000 description 2
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 2
- CETBSQOFQKLHHZ-UHFFFAOYSA-N Diethyl disulfide Chemical compound CCSSCC CETBSQOFQKLHHZ-UHFFFAOYSA-N 0.000 description 2
- XTHFKEDIFFGKHM-UHFFFAOYSA-N Dimethoxyethane Chemical compound COCCOC XTHFKEDIFFGKHM-UHFFFAOYSA-N 0.000 description 2
- LCGLNKUTAGEVQW-UHFFFAOYSA-N Dimethyl ether Chemical compound COC LCGLNKUTAGEVQW-UHFFFAOYSA-N 0.000 description 2
- ROSDSFDQCJNGOL-UHFFFAOYSA-N Dimethylamine Chemical compound CNC ROSDSFDQCJNGOL-UHFFFAOYSA-N 0.000 description 2
- VGGSQFUCUMXWEO-UHFFFAOYSA-N Ethene Chemical compound C=C VGGSQFUCUMXWEO-UHFFFAOYSA-N 0.000 description 2
- QUSNBJAOOMFDIB-UHFFFAOYSA-N Ethylamine Chemical compound CCN QUSNBJAOOMFDIB-UHFFFAOYSA-N 0.000 description 2
- YNQLUTRBYVCPMQ-UHFFFAOYSA-N Ethylbenzene Chemical compound CCC1=CC=CC=C1 YNQLUTRBYVCPMQ-UHFFFAOYSA-N 0.000 description 2
- OAKJQQAXSVQMHS-UHFFFAOYSA-N Hydrazine Chemical compound NN OAKJQQAXSVQMHS-UHFFFAOYSA-N 0.000 description 2
- UFHFLCQGNIYNRP-UHFFFAOYSA-N Hydrogen Chemical compound [H][H] UFHFLCQGNIYNRP-UHFFFAOYSA-N 0.000 description 2
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 2
- LSDPWZHWYPCBBB-UHFFFAOYSA-N Methanethiol Chemical compound SC LSDPWZHWYPCBBB-UHFFFAOYSA-N 0.000 description 2
- UUIQMZJEGPQKFD-UHFFFAOYSA-N Methyl butyrate Chemical compound CCCC(=O)OC UUIQMZJEGPQKFD-UHFFFAOYSA-N 0.000 description 2
- YNAVUWVOSKDBBP-UHFFFAOYSA-N Morpholine Chemical compound C1COCCN1 YNAVUWVOSKDBBP-UHFFFAOYSA-N 0.000 description 2
- JLTDJTHDQAWBAV-UHFFFAOYSA-N N,N-dimethylaniline Chemical compound CN(C)C1=CC=CC=C1 JLTDJTHDQAWBAV-UHFFFAOYSA-N 0.000 description 2
- OJGMBLNIHDZDGS-UHFFFAOYSA-N N-Ethylaniline Chemical compound CCNC1=CC=CC=C1 OJGMBLNIHDZDGS-UHFFFAOYSA-N 0.000 description 2
- AMQJEAYHLZJPGS-UHFFFAOYSA-N N-Pentanol Chemical compound CCCCCO AMQJEAYHLZJPGS-UHFFFAOYSA-N 0.000 description 2
- AFBPFSWMIHJQDM-UHFFFAOYSA-N N-methylaniline Chemical compound CNC1=CC=CC=C1 AFBPFSWMIHJQDM-UHFFFAOYSA-N 0.000 description 2
- UFWIBTONFRDIAS-UHFFFAOYSA-N Naphthalene Chemical compound C1=CC=CC2=CC=CC=C21 UFWIBTONFRDIAS-UHFFFAOYSA-N 0.000 description 2
- CTQNGGLPUBDAKN-UHFFFAOYSA-N O-Xylene Chemical group CC1=CC=CC=C1C CTQNGGLPUBDAKN-UHFFFAOYSA-N 0.000 description 2
- ILUJQPXNXACGAN-UHFFFAOYSA-N O-methylsalicylic acid Chemical compound COC1=CC=CC=C1C(O)=O ILUJQPXNXACGAN-UHFFFAOYSA-N 0.000 description 2
- URLKBWYHVLBVBO-UHFFFAOYSA-N Para-Xylene Chemical group CC1=CC=C(C)C=C1 URLKBWYHVLBVBO-UHFFFAOYSA-N 0.000 description 2
- NQRYJNQNLNOLGT-UHFFFAOYSA-N Piperidine Chemical compound C1CCNCC1 NQRYJNQNLNOLGT-UHFFFAOYSA-N 0.000 description 2
- QQONPFPTGQHPMA-UHFFFAOYSA-N Propene Chemical compound CC=C QQONPFPTGQHPMA-UHFFFAOYSA-N 0.000 description 2
- XBDQKXXYIPTUBI-UHFFFAOYSA-M Propionate Chemical compound CCC([O-])=O XBDQKXXYIPTUBI-UHFFFAOYSA-M 0.000 description 2
- NBBJYMSMWIIQGU-UHFFFAOYSA-N Propionic aldehyde Chemical compound CCC=O NBBJYMSMWIIQGU-UHFFFAOYSA-N 0.000 description 2
- JUJWROOIHBZHMG-UHFFFAOYSA-N Pyridine Chemical compound C1=CC=NC=C1 JUJWROOIHBZHMG-UHFFFAOYSA-N 0.000 description 2
- LCTONWCANYUPML-UHFFFAOYSA-N Pyruvic acid Chemical compound CC(=O)C(O)=O LCTONWCANYUPML-UHFFFAOYSA-N 0.000 description 2
- DKGAVHZHDRPRBM-UHFFFAOYSA-N Tert-Butanol Chemical compound CC(C)(C)O DKGAVHZHDRPRBM-UHFFFAOYSA-N 0.000 description 2
- WYURNTSHIVDZCO-UHFFFAOYSA-N Tetrahydrofuran Chemical compound C1CCOC1 WYURNTSHIVDZCO-UHFFFAOYSA-N 0.000 description 2
- YTPLMLYBLZKORZ-UHFFFAOYSA-N Thiophene Chemical compound C=1C=CSC=1 YTPLMLYBLZKORZ-UHFFFAOYSA-N 0.000 description 2
- DTQVDTLACAAQTR-UHFFFAOYSA-N Trifluoroacetic acid Chemical compound OC(=O)C(F)(F)F DTQVDTLACAAQTR-UHFFFAOYSA-N 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 2
- 239000000370 acceptor Substances 0.000 description 2
- 235000011054 acetic acid Nutrition 0.000 description 2
- 229960000583 acetic acid Drugs 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 238000007792 addition Methods 0.000 description 2
- XXROGKLTLUQVRX-UHFFFAOYSA-N allyl alcohol Chemical compound OCC=C XXROGKLTLUQVRX-UHFFFAOYSA-N 0.000 description 2
- 150000001450 anions Chemical class 0.000 description 2
- RDOXTESZEPMUJZ-UHFFFAOYSA-N anisole Chemical compound COC1=CC=CC=C1 RDOXTESZEPMUJZ-UHFFFAOYSA-N 0.000 description 2
- MWPLVEDNUUSJAV-UHFFFAOYSA-N anthracene Chemical compound C1=CC=CC2=CC3=CC=CC=C3C=C21 MWPLVEDNUUSJAV-UHFFFAOYSA-N 0.000 description 2
- 239000007864 aqueous solution Substances 0.000 description 2
- HUMNYLRZRPPJDN-UHFFFAOYSA-N benzaldehyde Chemical compound O=CC1=CC=CC=C1 HUMNYLRZRPPJDN-UHFFFAOYSA-N 0.000 description 2
- GONOPSZTUGRENK-UHFFFAOYSA-N benzyl(trichloro)silane Chemical compound Cl[Si](Cl)(Cl)CC1=CC=CC=C1 GONOPSZTUGRENK-UHFFFAOYSA-N 0.000 description 2
- QARVLSVVCXYDNA-UHFFFAOYSA-N bromobenzene Chemical compound BrC1=CC=CC=C1 QARVLSVVCXYDNA-UHFFFAOYSA-N 0.000 description 2
- DIKBFYAXUHHXCS-UHFFFAOYSA-N bromoform Chemical compound BrC(Br)Br DIKBFYAXUHHXCS-UHFFFAOYSA-N 0.000 description 2
- KDKYADYSIPSCCQ-UHFFFAOYSA-N but-1-yne Chemical compound CCC#C KDKYADYSIPSCCQ-UHFFFAOYSA-N 0.000 description 2
- HQABUPZFAYXKJW-UHFFFAOYSA-N butan-1-amine Chemical compound CCCCN HQABUPZFAYXKJW-UHFFFAOYSA-N 0.000 description 2
- RYYVLZVUVIJVGH-UHFFFAOYSA-N caffeine Chemical compound CN1C(=O)N(C)C(=O)C2=C1N=CN2C RYYVLZVUVIJVGH-UHFFFAOYSA-N 0.000 description 2
- 150000001768 cations Chemical class 0.000 description 2
- 239000000460 chlorine Substances 0.000 description 2
- MVPPADPHJFYWMZ-UHFFFAOYSA-N chlorobenzene Chemical compound ClC1=CC=CC=C1 MVPPADPHJFYWMZ-UHFFFAOYSA-N 0.000 description 2
- NEHMKBQYUWJMIP-UHFFFAOYSA-N chloromethane Chemical compound ClC NEHMKBQYUWJMIP-UHFFFAOYSA-N 0.000 description 2
- 230000002596 correlated effect Effects 0.000 description 2
- BGTOWKSIORTVQH-UHFFFAOYSA-N cyclopentanone Chemical compound O=C1CCCC1 BGTOWKSIORTVQH-UHFFFAOYSA-N 0.000 description 2
- LPIQUOYDBNQMRZ-UHFFFAOYSA-N cyclopentene Chemical compound C1CC=CC1 LPIQUOYDBNQMRZ-UHFFFAOYSA-N 0.000 description 2
- 238000013480 data collection Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- JXTHNDFMNIQAHM-UHFFFAOYSA-N dichloroacetic acid Chemical compound OC(=O)C(Cl)Cl JXTHNDFMNIQAHM-UHFFFAOYSA-N 0.000 description 2
- LJSQFQKUNVCTIA-UHFFFAOYSA-N diethyl sulfide Chemical compound CCSCC LJSQFQKUNVCTIA-UHFFFAOYSA-N 0.000 description 2
- WQOXQRCZOLPYPM-UHFFFAOYSA-N dimethyl disulfide Chemical compound CSSC WQOXQRCZOLPYPM-UHFFFAOYSA-N 0.000 description 2
- XBDQKXXYIPTUBI-UHFFFAOYSA-N dimethylselenoniopropionate Natural products CCC(O)=O XBDQKXXYIPTUBI-UHFFFAOYSA-N 0.000 description 2
- 238000005315 distribution function Methods 0.000 description 2
- 238000009510 drug design Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000007613 environmental effect Effects 0.000 description 2
- DNJIEGIFACGWOD-UHFFFAOYSA-N ethanethiol Chemical compound CCS DNJIEGIFACGWOD-UHFFFAOYSA-N 0.000 description 2
- KVFIJIWMDBAGDP-UHFFFAOYSA-N ethylpyrazine Chemical compound CCC1=CN=CC=N1 KVFIJIWMDBAGDP-UHFFFAOYSA-N 0.000 description 2
- NBVXSUQYWXRMNV-UHFFFAOYSA-N fluoromethane Chemical compound FC NBVXSUQYWXRMNV-UHFFFAOYSA-N 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 230000012010 growth Effects 0.000 description 2
- 125000005843 halogen group Chemical group 0.000 description 2
- CATSNJVOTSVZJV-UHFFFAOYSA-N heptan-2-one Chemical compound CCCCCC(C)=O CATSNJVOTSVZJV-UHFFFAOYSA-N 0.000 description 2
- 125000000623 heterocyclic group Chemical group 0.000 description 2
- WMIYKQLTONQJES-UHFFFAOYSA-N hexafluoroethane Chemical compound FC(F)(F)C(F)(F)F WMIYKQLTONQJES-UHFFFAOYSA-N 0.000 description 2
- ZSIAUFGUXNUGDI-UHFFFAOYSA-N hexan-1-ol Chemical compound CCCCCCO ZSIAUFGUXNUGDI-UHFFFAOYSA-N 0.000 description 2
- 238000011835 investigation Methods 0.000 description 2
- NNPPMTNAJDCUHE-UHFFFAOYSA-N isobutane Chemical compound CC(C)C NNPPMTNAJDCUHE-UHFFFAOYSA-N 0.000 description 2
- HJOVHMDZYOCNQW-UHFFFAOYSA-N isophorone Chemical compound CC1=CC(=O)CC(C)(C)C1 HJOVHMDZYOCNQW-UHFFFAOYSA-N 0.000 description 2
- 229940052961 longrange Drugs 0.000 description 2
- RLSSMJSEOOYNOY-UHFFFAOYSA-N m-cresol Chemical compound CC1=CC=CC(O)=C1 RLSSMJSEOOYNOY-UHFFFAOYSA-N 0.000 description 2
- IVSZLXZYQVIEFR-UHFFFAOYSA-N m-xylene Chemical group CC1=CC=CC(C)=C1 IVSZLXZYQVIEFR-UHFFFAOYSA-N 0.000 description 2
- 238000013507 mapping Methods 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- BDAGIHXWWSANSR-UHFFFAOYSA-N methanoic acid Natural products OC=O BDAGIHXWWSANSR-UHFFFAOYSA-N 0.000 description 2
- QPJVMBTYPHYUOC-UHFFFAOYSA-N methyl benzoate Chemical compound COC(=O)C1=CC=CC=C1 QPJVMBTYPHYUOC-UHFFFAOYSA-N 0.000 description 2
- TZIHFWKZFHZASV-UHFFFAOYSA-N methyl formate Chemical compound COC=O TZIHFWKZFHZASV-UHFFFAOYSA-N 0.000 description 2
- NUKZAGXMHTUAFE-UHFFFAOYSA-N methyl hexanoate Chemical compound CCCCCC(=O)OC NUKZAGXMHTUAFE-UHFFFAOYSA-N 0.000 description 2
- JGHZJRVDZXSNKQ-UHFFFAOYSA-N methyl octanoate Chemical compound CCCCCCCC(=O)OC JGHZJRVDZXSNKQ-UHFFFAOYSA-N 0.000 description 2
- HNBDRPTVWVGKBR-UHFFFAOYSA-N methyl pentanoate Chemical compound CCCCC(=O)OC HNBDRPTVWVGKBR-UHFFFAOYSA-N 0.000 description 2
- UAEPNZWRGJTJPN-UHFFFAOYSA-N methylcyclohexane Chemical compound CC1CCCCC1 UAEPNZWRGJTJPN-UHFFFAOYSA-N 0.000 description 2
- LXCFILQKKLGQFO-UHFFFAOYSA-N methylparaben Chemical compound COC(=O)C1=CC=C(O)C=C1 LXCFILQKKLGQFO-UHFFFAOYSA-N 0.000 description 2
- CAWHJQAVHZEVTJ-UHFFFAOYSA-N methylpyrazine Chemical compound CC1=CN=CC=N1 CAWHJQAVHZEVTJ-UHFFFAOYSA-N 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000007643 movable-type printing Methods 0.000 description 2
- DUWWHGPELOTTOE-UHFFFAOYSA-N n-(5-chloro-2,4-dimethoxyphenyl)-3-oxobutanamide Chemical compound COC1=CC(OC)=C(NC(=O)CC(C)=O)C=C1Cl DUWWHGPELOTTOE-UHFFFAOYSA-N 0.000 description 2
- VLKZOEOYAKHREP-UHFFFAOYSA-N n-Hexane Chemical compound CCCCCC VLKZOEOYAKHREP-UHFFFAOYSA-N 0.000 description 2
- CRSOQBOWXPBRES-UHFFFAOYSA-N neopentane Chemical compound CC(C)(C)C CRSOQBOWXPBRES-UHFFFAOYSA-N 0.000 description 2
- LQNUZADURLCDLV-UHFFFAOYSA-N nitrobenzene Chemical compound [O-][N+](=O)C1=CC=CC=C1 LQNUZADURLCDLV-UHFFFAOYSA-N 0.000 description 2
- QWVGKYWNOKOFNN-UHFFFAOYSA-N o-cresol Chemical compound CC1=CC=CC=C1O QWVGKYWNOKOFNN-UHFFFAOYSA-N 0.000 description 2
- RNVCVTLRINQCPJ-UHFFFAOYSA-N o-toluidine Chemical compound CC1=CC=CC=C1N RNVCVTLRINQCPJ-UHFFFAOYSA-N 0.000 description 2
- 235000019407 octafluorocyclobutane Nutrition 0.000 description 2
- QYSGYZVSCZSLHT-UHFFFAOYSA-N octafluoropropane Chemical compound FC(F)(F)C(F)(F)C(F)(F)F QYSGYZVSCZSLHT-UHFFFAOYSA-N 0.000 description 2
- NUJGJRNETVAIRJ-UHFFFAOYSA-N octanal Chemical compound CCCCCCCC=O NUJGJRNETVAIRJ-UHFFFAOYSA-N 0.000 description 2
- TVMXDCGIABBOFY-UHFFFAOYSA-N octane Chemical compound CCCCCCCC TVMXDCGIABBOFY-UHFFFAOYSA-N 0.000 description 2
- IWDCLRJOBJJRNH-UHFFFAOYSA-N p-cresol Chemical compound CC1=CC=C(O)C=C1 IWDCLRJOBJJRNH-UHFFFAOYSA-N 0.000 description 2
- DPBLXKKOBLCELK-UHFFFAOYSA-N pentan-1-amine Chemical compound CCCCCN DPBLXKKOBLCELK-UHFFFAOYSA-N 0.000 description 2
- XNLICIUVMPYHGG-UHFFFAOYSA-N pentan-2-one Chemical compound CCCC(C)=O XNLICIUVMPYHGG-UHFFFAOYSA-N 0.000 description 2
- FDPIMTJIUBPUKL-UHFFFAOYSA-N pentan-3-one Chemical compound CCC(=O)CC FDPIMTJIUBPUKL-UHFFFAOYSA-N 0.000 description 2
- HGBOYTHUEUWSSQ-UHFFFAOYSA-N pentanal Chemical compound CCCCC=O HGBOYTHUEUWSSQ-UHFFFAOYSA-N 0.000 description 2
- YWAKXRMUMFPDSH-UHFFFAOYSA-N pentene Chemical compound CCCC=C YWAKXRMUMFPDSH-UHFFFAOYSA-N 0.000 description 2
- PGMYKACGEOXYJE-UHFFFAOYSA-N pentyl acetate Chemical compound CCCCCOC(C)=O PGMYKACGEOXYJE-UHFFFAOYSA-N 0.000 description 2
- CPJSUEIXXCENMM-UHFFFAOYSA-N phenacetin Chemical compound CCOC1=CC=C(NC(C)=O)C=C1 CPJSUEIXXCENMM-UHFFFAOYSA-N 0.000 description 2
- 125000001997 phenyl group Chemical group [H]C1=C([H])C([H])=C(*)C([H])=C1[H] 0.000 description 2
- 125000004437 phosphorous atom Chemical group 0.000 description 2
- SUVIGLJNEAMWEG-UHFFFAOYSA-N propane-1-thiol Chemical compound CCCS SUVIGLJNEAMWEG-UHFFFAOYSA-N 0.000 description 2
- 235000019260 propionic acid Nutrition 0.000 description 2
- QELSKZZBTMNZEB-UHFFFAOYSA-N propylparaben Chemical compound CCCOC(=O)C1=CC=C(O)C=C1 QELSKZZBTMNZEB-UHFFFAOYSA-N 0.000 description 2
- 230000002829 reductive effect Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 238000000926 separation method Methods 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- URAYPUMNDPQOKB-UHFFFAOYSA-N triacetin Chemical compound CC(=O)OCC(OC(C)=O)COC(C)=O URAYPUMNDPQOKB-UHFFFAOYSA-N 0.000 description 2
- GETQZCLCWQTVFV-UHFFFAOYSA-N trimethylamine Chemical compound CN(C)C GETQZCLCWQTVFV-UHFFFAOYSA-N 0.000 description 2
- 229930195735 unsaturated hydrocarbon Natural products 0.000 description 2
- 238000007642 woodblock printing Methods 0.000 description 2
- WGECXQBGLLYSFP-UHFFFAOYSA-N (+-)-2,3-dimethyl-pentane Natural products CCC(C)C(C)C WGECXQBGLLYSFP-UHFFFAOYSA-N 0.000 description 1
- LDVVMCZRFWMZSG-OLQVQODUSA-N (3ar,7as)-2-(trichloromethylsulfanyl)-3a,4,7,7a-tetrahydroisoindole-1,3-dione Chemical compound C1C=CC[C@H]2C(=O)N(SC(Cl)(Cl)Cl)C(=O)[C@H]21 LDVVMCZRFWMZSG-OLQVQODUSA-N 0.000 description 1
- QMMOXUPEWRXHJS-HWKANZROSA-N (e)-pent-2-ene Chemical compound CC\C=C\C QMMOXUPEWRXHJS-HWKANZROSA-N 0.000 description 1
- NMFQPFSIPWZZMR-UHFFFAOYSA-N 1,1,1,2,3,3-hexafluoropropan-2-ol Chemical compound FC(F)C(F)(O)C(F)(F)F NMFQPFSIPWZZMR-UHFFFAOYSA-N 0.000 description 1
- QVLAWKAXOMEXPM-UHFFFAOYSA-N 1,1,1,2-tetrachloroethane Chemical compound ClCC(Cl)(Cl)Cl QVLAWKAXOMEXPM-UHFFFAOYSA-N 0.000 description 1
- LVGUZGTVOIAKKC-UHFFFAOYSA-N 1,1,1,2-tetrafluoroethane Chemical compound FCC(F)(F)F LVGUZGTVOIAKKC-UHFFFAOYSA-N 0.000 description 1
- LEEHHJCQFUKQTC-UHFFFAOYSA-N 1,1,1,3,3-pentafluoro-3-(1,1,3,3,3-pentafluoropropoxy)propane Chemical compound FC(F)(F)CC(F)(F)OC(F)(F)CC(F)(F)F LEEHHJCQFUKQTC-UHFFFAOYSA-N 0.000 description 1
- UOCLXMDMGBRAIB-UHFFFAOYSA-N 1,1,1-trichloroethane Chemical compound CC(Cl)(Cl)Cl UOCLXMDMGBRAIB-UHFFFAOYSA-N 0.000 description 1
- UJPMYEOUBPIPHQ-UHFFFAOYSA-N 1,1,1-trifluoroethane Chemical compound CC(F)(F)F UJPMYEOUBPIPHQ-UHFFFAOYSA-N 0.000 description 1
- GILIYJDBJZWGBG-UHFFFAOYSA-N 1,1,1-trifluoropropan-2-ol Chemical compound CC(O)C(F)(F)F GILIYJDBJZWGBG-UHFFFAOYSA-N 0.000 description 1
- NPNPZTNLOVBDOC-UHFFFAOYSA-N 1,1-difluoroethane Chemical compound CC(F)F NPNPZTNLOVBDOC-UHFFFAOYSA-N 0.000 description 1
- XVIZMMSINIOIQP-UHFFFAOYSA-N 1,2-dichloro-3-(2-chlorophenyl)benzene Chemical group ClC1=CC=CC(C=2C(=CC=CC=2)Cl)=C1Cl XVIZMMSINIOIQP-UHFFFAOYSA-N 0.000 description 1
- XOMKZKJEJBZBJJ-UHFFFAOYSA-N 1,2-dichloro-3-phenylbenzene Chemical group ClC1=CC=CC(C=2C=CC=CC=2)=C1Cl XOMKZKJEJBZBJJ-UHFFFAOYSA-N 0.000 description 1
- LZDKZFUFMNSQCJ-UHFFFAOYSA-N 1,2-diethoxyethane Chemical compound CCOCCOCC LZDKZFUFMNSQCJ-UHFFFAOYSA-N 0.000 description 1
- WZCQRUWWHSTZEM-UHFFFAOYSA-N 1,3-phenylenediamine Chemical compound NC1=CC=CC(N)=C1 WZCQRUWWHSTZEM-UHFFFAOYSA-N 0.000 description 1
- RYHBNJHYFVUHQT-UHFFFAOYSA-N 1,4-Dioxane Chemical compound C1COCCO1 RYHBNJHYFVUHQT-UHFFFAOYSA-N 0.000 description 1
- SWJPEBQEEAHIGZ-UHFFFAOYSA-N 1,4-dibromobenzene Chemical compound BrC1=CC=C(Br)C=C1 SWJPEBQEEAHIGZ-UHFFFAOYSA-N 0.000 description 1
- OCJBOOLMMGQPQU-UHFFFAOYSA-N 1,4-dichlorobenzene Chemical compound ClC1=CC=C(Cl)C=C1 OCJBOOLMMGQPQU-UHFFFAOYSA-N 0.000 description 1
- GKMIDMKPBOUSBQ-UHFFFAOYSA-N 1,5-dimethyluracil Chemical compound CC1=CN(C)C(=O)NC1=O GKMIDMKPBOUSBQ-UHFFFAOYSA-N 0.000 description 1
- PVOAHINGSUIXLS-UHFFFAOYSA-N 1-Methylpiperazine Chemical compound CN1CCNCC1 PVOAHINGSUIXLS-UHFFFAOYSA-N 0.000 description 1
- XRIGHGYEGNDPEU-UHFFFAOYSA-N 1-anilinoanthracene-9,10-dione Chemical compound C1=CC=C2C(=O)C3=CC=CC=C3C(=O)C2=C1NC1=CC=CC=C1 XRIGHGYEGNDPEU-UHFFFAOYSA-N 0.000 description 1
- HLVFKOKELQSXIQ-UHFFFAOYSA-N 1-bromo-2-methylpropane Chemical compound CC(C)CBr HLVFKOKELQSXIQ-UHFFFAOYSA-N 0.000 description 1
- MPPPKRYCTPRNTB-UHFFFAOYSA-N 1-bromobutane Chemical compound CCCCBr MPPPKRYCTPRNTB-UHFFFAOYSA-N 0.000 description 1
- YZWKKMVJZFACSU-UHFFFAOYSA-N 1-bromopentane Chemical compound CCCCCBr YZWKKMVJZFACSU-UHFFFAOYSA-N 0.000 description 1
- CYNYIHKIEHGYOZ-UHFFFAOYSA-N 1-bromopropane Chemical compound CCCBr CYNYIHKIEHGYOZ-UHFFFAOYSA-N 0.000 description 1
- JAYCNKDKIKZTAF-UHFFFAOYSA-N 1-chloro-2-(2-chlorophenyl)benzene Chemical group ClC1=CC=CC=C1C1=CC=CC=C1Cl JAYCNKDKIKZTAF-UHFFFAOYSA-N 0.000 description 1
- IBSQPLPBRSHTTG-UHFFFAOYSA-N 1-chloro-2-methylbenzene Chemical compound CC1=CC=CC=C1Cl IBSQPLPBRSHTTG-UHFFFAOYSA-N 0.000 description 1
- SQCZQTSHSZLZIQ-UHFFFAOYSA-N 1-chloropentane Chemical compound CCCCCCl SQCZQTSHSZLZIQ-UHFFFAOYSA-N 0.000 description 1
- CGHIBGNXEGJPQZ-UHFFFAOYSA-N 1-hexyne Chemical compound CCCCC#C CGHIBGNXEGJPQZ-UHFFFAOYSA-N 0.000 description 1
- NALZTFARIYUCBY-UHFFFAOYSA-N 1-nitrobutane Chemical compound CCCC[N+]([O-])=O NALZTFARIYUCBY-UHFFFAOYSA-N 0.000 description 1
- JSZOAYXJRCEYSX-UHFFFAOYSA-N 1-nitropropane Chemical compound CCC[N+]([O-])=O JSZOAYXJRCEYSX-UHFFFAOYSA-N 0.000 description 1
- IBXNCJKFFQIKKY-UHFFFAOYSA-N 1-pentyne Chemical compound CCCC#C IBXNCJKFFQIKKY-UHFFFAOYSA-N 0.000 description 1
- YOYAIZYFCNQIRF-UHFFFAOYSA-N 2,6-dichlorobenzonitrile Chemical compound ClC1=CC=CC(Cl)=C1C#N YOYAIZYFCNQIRF-UHFFFAOYSA-N 0.000 description 1
- KGKGSIUWJCAFPX-UHFFFAOYSA-N 2,6-dichlorothiobenzamide Chemical compound NC(=S)C1=C(Cl)C=CC=C1Cl KGKGSIUWJCAFPX-UHFFFAOYSA-N 0.000 description 1
- SMZOUWXMTYCWNB-UHFFFAOYSA-N 2-(2-methoxy-5-methylphenyl)ethanamine Chemical compound COC1=CC=C(C)C=C1CCN SMZOUWXMTYCWNB-UHFFFAOYSA-N 0.000 description 1
- OWZPCEFYPSAJFR-UHFFFAOYSA-N 2-(butan-2-yl)-4,6-dinitrophenol Chemical compound CCC(C)C1=CC([N+]([O-])=O)=CC([N+]([O-])=O)=C1O OWZPCEFYPSAJFR-UHFFFAOYSA-N 0.000 description 1
- ZWEHNKRNPOVVGH-UHFFFAOYSA-N 2-Butanone Chemical compound CCC(C)=O ZWEHNKRNPOVVGH-UHFFFAOYSA-N 0.000 description 1
- XNWFRZJHXBZDAG-UHFFFAOYSA-N 2-METHOXYETHANOL Chemical compound COCCO XNWFRZJHXBZDAG-UHFFFAOYSA-N 0.000 description 1
- NIXOWILDQLNWCW-UHFFFAOYSA-N 2-Propenoic acid Natural products OC(=O)C=C NIXOWILDQLNWCW-UHFFFAOYSA-N 0.000 description 1
- JTXMVXSTHSMVQF-UHFFFAOYSA-N 2-acetyloxyethyl acetate Chemical compound CC(=O)OCCOC(C)=O JTXMVXSTHSMVQF-UHFFFAOYSA-N 0.000 description 1
- NAMYKGVDVNBCFQ-UHFFFAOYSA-N 2-bromopropane Chemical compound CC(C)Br NAMYKGVDVNBCFQ-UHFFFAOYSA-N 0.000 description 1
- KZMAWJRXKGLWGS-UHFFFAOYSA-N 2-chloro-n-[4-(4-methoxyphenyl)-1,3-thiazol-2-yl]-n-(3-methoxypropyl)acetamide Chemical compound S1C(N(C(=O)CCl)CCCOC)=NC(C=2C=CC(OC)=CC=2)=C1 KZMAWJRXKGLWGS-UHFFFAOYSA-N 0.000 description 1
- BSPCSKHALVHRSR-UHFFFAOYSA-N 2-chlorobutane Chemical compound CCC(C)Cl BSPCSKHALVHRSR-UHFFFAOYSA-N 0.000 description 1
- NFRKUDYZEVQXTE-UHFFFAOYSA-N 2-chloropentane Chemical compound CCCC(C)Cl NFRKUDYZEVQXTE-UHFFFAOYSA-N 0.000 description 1
- CHZCERSEMVWNHL-UHFFFAOYSA-N 2-hydroxybenzonitrile Chemical compound OC1=CC=CC=C1C#N CHZCERSEMVWNHL-UHFFFAOYSA-N 0.000 description 1
- HTKIMWYSDZQQBP-UHFFFAOYSA-N 2-hydroxyethyl nitrate Chemical compound OCCO[N+]([O-])=O HTKIMWYSDZQQBP-UHFFFAOYSA-N 0.000 description 1
- MNWSGMTUGXNYHJ-UHFFFAOYSA-N 2-methoxybenzamide Chemical compound COC1=CC=CC=C1C(N)=O MNWSGMTUGXNYHJ-UHFFFAOYSA-N 0.000 description 1
- ASUDFOJKTJLAIK-UHFFFAOYSA-N 2-methoxyethanamine Chemical compound COCCN ASUDFOJKTJLAIK-UHFFFAOYSA-N 0.000 description 1
- RMGHERXMTMUMMV-UHFFFAOYSA-N 2-methoxypropane Chemical compound COC(C)C RMGHERXMTMUMMV-UHFFFAOYSA-N 0.000 description 1
- LNNXFUZKZLXPOF-UHFFFAOYSA-N 2-methylpropyl nitrate Chemical compound CC(C)CO[N+]([O-])=O LNNXFUZKZLXPOF-UHFFFAOYSA-N 0.000 description 1
- FGLBSLMDCBOPQK-UHFFFAOYSA-N 2-nitropropane Chemical compound CC(C)[N+]([O-])=O FGLBSLMDCBOPQK-UHFFFAOYSA-N 0.000 description 1
- PLAZTCDQAHEYBI-UHFFFAOYSA-N 2-nitrotoluene Chemical compound CC1=CC=CC=C1[N+]([O-])=O PLAZTCDQAHEYBI-UHFFFAOYSA-N 0.000 description 1
- ISWXYJQANHQYSR-UHFFFAOYSA-N 2-oxopropyl nitrate Chemical compound CC(=O)CO[N+]([O-])=O ISWXYJQANHQYSR-UHFFFAOYSA-N 0.000 description 1
- PNPCRKVUWYDDST-UHFFFAOYSA-N 3-chloroaniline Chemical compound NC1=CC=CC(Cl)=C1 PNPCRKVUWYDDST-UHFFFAOYSA-N 0.000 description 1
- OSWFIVFLDKOXQC-UHFFFAOYSA-N 4-(3-methoxyphenyl)aniline Chemical compound COC1=CC=CC(C=2C=CC(N)=CC=2)=C1 OSWFIVFLDKOXQC-UHFFFAOYSA-N 0.000 description 1
- UNBOSJFEZZJZLR-UHFFFAOYSA-N 4-(4-nitrophenylazo)aniline Chemical compound C1=CC(N)=CC=C1N=NC1=CC=C([N+]([O-])=O)C=C1 UNBOSJFEZZJZLR-UHFFFAOYSA-N 0.000 description 1
- GZFGOTFRPZRKDS-UHFFFAOYSA-N 4-bromophenol Chemical compound OC1=CC=C(Br)C=C1 GZFGOTFRPZRKDS-UHFFFAOYSA-N 0.000 description 1
- QSNSCYSYFYORTR-UHFFFAOYSA-N 4-chloroaniline Chemical compound NC1=CC=C(Cl)C=C1 QSNSCYSYFYORTR-UHFFFAOYSA-N 0.000 description 1
- VJXRKZJMGVSXPX-UHFFFAOYSA-N 4-ethylpyridine Chemical compound CCC1=CC=NC=C1 VJXRKZJMGVSXPX-UHFFFAOYSA-N 0.000 description 1
- TYMLOMAKGOJONV-UHFFFAOYSA-N 4-nitroaniline Chemical compound NC1=CC=C([N+]([O-])=O)C=C1 TYMLOMAKGOJONV-UHFFFAOYSA-N 0.000 description 1
- LMNPKIOZMGYQIU-UHFFFAOYSA-N 5-(trifluoromethyl)-1h-pyrimidine-2,4-dione Chemical compound FC(F)(F)C1=CNC(=O)NC1=O LMNPKIOZMGYQIU-UHFFFAOYSA-N 0.000 description 1
- CTSLUCNDVMMDHG-UHFFFAOYSA-N 5-bromo-3-(butan-2-yl)-6-methylpyrimidine-2,4(1H,3H)-dione Chemical compound CCC(C)N1C(=O)NC(C)=C(Br)C1=O CTSLUCNDVMMDHG-UHFFFAOYSA-N 0.000 description 1
- LQLQRFGHAALLLE-UHFFFAOYSA-N 5-bromouracil Chemical compound BrC1=CNC(=O)NC1=O LQLQRFGHAALLLE-UHFFFAOYSA-N 0.000 description 1
- ZFTBZKVVGZNMJR-UHFFFAOYSA-N 5-chlorouracil Chemical compound ClC1=CNC(=O)NC1=O ZFTBZKVVGZNMJR-UHFFFAOYSA-N 0.000 description 1
- PKUFNWPSFCOSLU-UHFFFAOYSA-N 6-chloro-1h-pyrimidine-2,4-dione Chemical compound ClC1=CC(=O)NC(=O)N1 PKUFNWPSFCOSLU-UHFFFAOYSA-N 0.000 description 1
- WRXCXOUDSPTXNX-UHFFFAOYSA-N 9-methyladenine Chemical compound N1=CN=C2N(C)C=NC2=C1N WRXCXOUDSPTXNX-UHFFFAOYSA-N 0.000 description 1
- RZVAJINKPMORJF-UHFFFAOYSA-N Acetaminophen Chemical compound CC(=O)NC1=CC=C(O)C=C1 RZVAJINKPMORJF-UHFFFAOYSA-N 0.000 description 1
- OSDWBNJEKMUWAV-UHFFFAOYSA-N Allyl chloride Chemical compound ClCC=C OSDWBNJEKMUWAV-UHFFFAOYSA-N 0.000 description 1
- 240000000662 Anethum graveolens Species 0.000 description 1
- BSYNRYMUTXBXSQ-UHFFFAOYSA-N Aspirin Chemical compound CC(=O)OC1=CC=CC=C1C(O)=O BSYNRYMUTXBXSQ-UHFFFAOYSA-N 0.000 description 1
- 239000005711 Benzoic acid Substances 0.000 description 1
- ZNSMNVMLTJELDZ-UHFFFAOYSA-N Bis(2-chloroethyl)ether Chemical compound ClCCOCCCl ZNSMNVMLTJELDZ-UHFFFAOYSA-N 0.000 description 1
- WKBOTKDWSSQWDR-UHFFFAOYSA-N Bromine atom Chemical compound [Br] WKBOTKDWSSQWDR-UHFFFAOYSA-N 0.000 description 1
- DKPFZGUDAPQIHT-UHFFFAOYSA-N Butyl acetate Natural products CCCCOC(C)=O DKPFZGUDAPQIHT-UHFFFAOYSA-N 0.000 description 1
- ZTQSAGDEMFDKMZ-UHFFFAOYSA-N Butyraldehyde Chemical compound CCCC=O ZTQSAGDEMFDKMZ-UHFFFAOYSA-N 0.000 description 1
- 239000005745 Captan Substances 0.000 description 1
- VEDTXTNSFWUXGQ-UHFFFAOYSA-N Carbophenothion Chemical compound CCOP(=S)(OCC)SCSC1=CC=C(Cl)C=C1 VEDTXTNSFWUXGQ-UHFFFAOYSA-N 0.000 description 1
- ZAMOUSCENKQFHK-UHFFFAOYSA-N Chlorine atom Chemical compound [Cl] ZAMOUSCENKQFHK-UHFFFAOYSA-N 0.000 description 1
- VOPWNXZWBYDODV-UHFFFAOYSA-N Chlorodifluoromethane Chemical compound FC(F)Cl VOPWNXZWBYDODV-UHFFFAOYSA-N 0.000 description 1
- XWCDCDSDNJVCLO-UHFFFAOYSA-N Chlorofluoromethane Chemical compound FCCl XWCDCDSDNJVCLO-UHFFFAOYSA-N 0.000 description 1
- 239000005944 Chlorpyrifos Substances 0.000 description 1
- XDTMQSROBMDMFD-UHFFFAOYSA-N Cyclohexane Chemical compound C1CCCCC1 XDTMQSROBMDMFD-UHFFFAOYSA-N 0.000 description 1
- LVZWSLJZHVFIQJ-UHFFFAOYSA-N Cyclopropane Chemical compound C1CC1 LVZWSLJZHVFIQJ-UHFFFAOYSA-N 0.000 description 1
- MUMQYXACQUZOFP-UHFFFAOYSA-N Dialifor Chemical compound C1=CC=C2C(=O)N(C(CCl)SP(=S)(OCC)OCC)C(=O)C2=C1 MUMQYXACQUZOFP-UHFFFAOYSA-N 0.000 description 1
- 239000005504 Dicamba Substances 0.000 description 1
- ZAFNJMIOTHYJRJ-UHFFFAOYSA-N Diisopropyl ether Chemical compound CC(C)OC(C)C ZAFNJMIOTHYJRJ-UHFFFAOYSA-N 0.000 description 1
- KKUKTXOBAWVSHC-UHFFFAOYSA-N Dimethylphosphate Chemical compound COP(O)(=O)OC KKUKTXOBAWVSHC-UHFFFAOYSA-N 0.000 description 1
- OFDYMSKSGFSLLM-UHFFFAOYSA-N Dinitramine Chemical compound CCN(CC)C1=C([N+]([O-])=O)C=C(C(F)(F)F)C(N)=C1[N+]([O-])=O OFDYMSKSGFSLLM-UHFFFAOYSA-N 0.000 description 1
- ZERULLAPCVRMCO-UHFFFAOYSA-N Dipropyl sulfide Chemical compound CCCSCCC ZERULLAPCVRMCO-UHFFFAOYSA-N 0.000 description 1
- 101000925646 Enterobacteria phage T4 Endolysin Proteins 0.000 description 1
- OTMSDBZUPAUEDD-UHFFFAOYSA-N Ethane Chemical compound CC OTMSDBZUPAUEDD-UHFFFAOYSA-N 0.000 description 1
- IYXGSMUGOJNHAZ-UHFFFAOYSA-N Ethyl malonate Chemical compound CCOC(=O)CC(=O)OCC IYXGSMUGOJNHAZ-UHFFFAOYSA-N 0.000 description 1
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical compound OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 1
- PIICEJLVQHRZGT-UHFFFAOYSA-N Ethylenediamine Chemical compound NCCN PIICEJLVQHRZGT-UHFFFAOYSA-N 0.000 description 1
- YCKRFDGAMUMZLT-UHFFFAOYSA-N Fluorine atom Chemical compound [F] YCKRFDGAMUMZLT-UHFFFAOYSA-N 0.000 description 1
- GHASVSINZRGABV-UHFFFAOYSA-N Fluorouracil Chemical compound FC1=CNC(=O)NC1=O GHASVSINZRGABV-UHFFFAOYSA-N 0.000 description 1
- 101001099181 Homo sapiens TATA-binding protein-associated factor 2N Proteins 0.000 description 1
- HEFNNWSXXWATRW-UHFFFAOYSA-N Ibuprofen Chemical compound CC(C)CC1=CC=C(C(C)C(O)=O)C=C1 HEFNNWSXXWATRW-UHFFFAOYSA-N 0.000 description 1
- VQTUBCCKSQIDNK-UHFFFAOYSA-N Isobutene Chemical compound CC(C)=C VQTUBCCKSQIDNK-UHFFFAOYSA-N 0.000 description 1
- LPHGQDQBBGAPDZ-UHFFFAOYSA-N Isocaffeine Natural products CN1C(=O)N(C)C(=O)C2=C1N(C)C=N2 LPHGQDQBBGAPDZ-UHFFFAOYSA-N 0.000 description 1
- NHTMVDHEPJAVLT-UHFFFAOYSA-N Isooctane Chemical compound CC(C)CC(C)(C)C NHTMVDHEPJAVLT-UHFFFAOYSA-N 0.000 description 1
- 238000004510 Lennard-Jones potential Methods 0.000 description 1
- 239000005949 Malathion Substances 0.000 description 1
- 102000010750 Metalloproteins Human genes 0.000 description 1
- 108010063312 Metalloproteins Proteins 0.000 description 1
- 239000005916 Methomyl Substances 0.000 description 1
- 239000005641 Methyl octanoate Substances 0.000 description 1
- BZLVMXJERCGZMT-UHFFFAOYSA-N Methyl tert-butyl ether Chemical compound COC(C)(C)C BZLVMXJERCGZMT-UHFFFAOYSA-N 0.000 description 1
- 239000005584 Metsulfuron-methyl Substances 0.000 description 1
- 238000012614 Monte-Carlo sampling Methods 0.000 description 1
- 208000037004 Myoclonic-astatic epilepsy Diseases 0.000 description 1
- CMWTZPSULFXXJA-UHFFFAOYSA-N Naproxen Natural products C1=C(C(C)C(O)=O)C=CC2=CC(OC)=CC=C21 CMWTZPSULFXXJA-UHFFFAOYSA-N 0.000 description 1
- UMKANAFDOQQUKE-UHFFFAOYSA-N Nitralin Chemical compound CCCN(CCC)C1=C([N+]([O-])=O)C=C(S(C)(=O)=O)C=C1[N+]([O-])=O UMKANAFDOQQUKE-UHFFFAOYSA-N 0.000 description 1
- WSDRAZIPGVLSNP-UHFFFAOYSA-N O.P(=O)(O)(O)O.O.O.P(=O)(O)(O)O Chemical compound O.P(=O)(O)(O)O.O.O.P(=O)(O)(O)O WSDRAZIPGVLSNP-UHFFFAOYSA-N 0.000 description 1
- 239000004341 Octafluorocyclobutane Substances 0.000 description 1
- SGEJQUSYQTVSIU-UHFFFAOYSA-N Pebulate Chemical compound CCCCN(CC)C(=O)SCCC SGEJQUSYQTVSIU-UHFFFAOYSA-N 0.000 description 1
- CYTYCFOTNPOANT-UHFFFAOYSA-N Perchloroethylene Chemical compound ClC(Cl)=C(Cl)Cl CYTYCFOTNPOANT-UHFFFAOYSA-N 0.000 description 1
- GLUUGHFHXGJENI-UHFFFAOYSA-N Piperazine Chemical compound C1CNCCN1 GLUUGHFHXGJENI-UHFFFAOYSA-N 0.000 description 1
- 239000005923 Pirimicarb Substances 0.000 description 1
- 238000012356 Product development Methods 0.000 description 1
- ITVQAKZNYJEWKS-UHFFFAOYSA-N Profluralin Chemical compound [O-][N+](=O)C=1C=C(C(F)(F)F)C=C([N+]([O-])=O)C=1N(CCC)CC1CC1 ITVQAKZNYJEWKS-UHFFFAOYSA-N 0.000 description 1
- QAOWNCQODCNURD-UHFFFAOYSA-L Sulfate Chemical compound [O-]S([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-L 0.000 description 1
- 102100038917 TATA-binding protein-associated factor 2N Human genes 0.000 description 1
- NBQCNZYJJMBDKY-UHFFFAOYSA-N Terbacil Chemical compound CC=1NC(=O)N(C(C)(C)C)C(=O)C=1Cl NBQCNZYJJMBDKY-UHFFFAOYSA-N 0.000 description 1
- DHXVGJBLRPWPCS-UHFFFAOYSA-N Tetrahydropyran Chemical compound C1CCOCC1 DHXVGJBLRPWPCS-UHFFFAOYSA-N 0.000 description 1
- XSTXAVWGXDQKEL-UHFFFAOYSA-N Trichloroethylene Chemical compound ClC=C(Cl)Cl XSTXAVWGXDQKEL-UHFFFAOYSA-N 0.000 description 1
- RHQDFWAXVIIEBN-UHFFFAOYSA-N Trifluoroethanol Chemical compound OCC(F)(F)F RHQDFWAXVIIEBN-UHFFFAOYSA-N 0.000 description 1
- BZHJMEDXRYGGRV-UHFFFAOYSA-N Vinyl chloride Chemical compound ClC=C BZHJMEDXRYGGRV-UHFFFAOYSA-N 0.000 description 1
- FSAVDKDHPDSCTO-WQLSENKSSA-N [(z)-2-chloro-1-(2,4-dichlorophenyl)ethenyl] diethyl phosphate Chemical compound CCOP(=O)(OCC)O\C(=C/Cl)C1=CC=C(Cl)C=C1Cl FSAVDKDHPDSCTO-WQLSENKSSA-N 0.000 description 1
- VUMNNZPFDAQQOJ-UHFFFAOYSA-N [O-]P(O)(O)=[S+]C(C=CC=C1Cl)=C1Cl Chemical compound [O-]P(O)(O)=[S+]C(C=CC=C1Cl)=C1Cl VUMNNZPFDAQQOJ-UHFFFAOYSA-N 0.000 description 1
- KXKVLQRXCPHEJC-UHFFFAOYSA-N acetic acid trimethyl ester Natural products COC(C)=O KXKVLQRXCPHEJC-UHFFFAOYSA-N 0.000 description 1
- 229960001138 acetylsalicylic acid Drugs 0.000 description 1
- 229940114077 acrylic acid Drugs 0.000 description 1
- XCSGPAVHZFQHGE-UHFFFAOYSA-N alachlor Chemical compound CCC1=CC=CC(CC)=C1N(COC)C(=O)CCl XCSGPAVHZFQHGE-UHFFFAOYSA-N 0.000 description 1
- QGLZXHRNAYXIBU-WEVVVXLNSA-N aldicarb Chemical compound CNC(=O)O\N=C\C(C)(C)SC QGLZXHRNAYXIBU-WEVVVXLNSA-N 0.000 description 1
- BHELZAPQIKSEDF-UHFFFAOYSA-N allyl bromide Chemical compound BrCC=C BHELZAPQIKSEDF-UHFFFAOYSA-N 0.000 description 1
- RQVYBGPQFYCBGX-UHFFFAOYSA-N ametryn Chemical compound CCNC1=NC(NC(C)C)=NC(SC)=N1 RQVYBGPQFYCBGX-UHFFFAOYSA-N 0.000 description 1
- 229910021529 ammonia Inorganic materials 0.000 description 1
- PYKYMHQGRFAEBM-UHFFFAOYSA-N anthraquinone Natural products CCC(=O)c1c(O)c2C(=O)C3C(C=CC=C3O)C(=O)c2cc1CC(=O)OC PYKYMHQGRFAEBM-UHFFFAOYSA-N 0.000 description 1
- 150000004056 anthraquinones Chemical class 0.000 description 1
- 239000003125 aqueous solvent Substances 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- HONIICLYMWZJFZ-UHFFFAOYSA-N azetidine Chemical compound C1CNC1 HONIICLYMWZJFZ-UHFFFAOYSA-N 0.000 description 1
- CJJOSEISRRTUQB-UHFFFAOYSA-N azinphos-methyl Chemical group C1=CC=C2C(=O)N(CSP(=S)(OC)OC)N=NC2=C1 CJJOSEISRRTUQB-UHFFFAOYSA-N 0.000 description 1
- 230000004888 barrier function Effects 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 238000005452 bending Methods 0.000 description 1
- SMDHCQAYESWHAE-UHFFFAOYSA-N benfluralin Chemical compound CCCCN(CC)C1=C([N+]([O-])=O)C=C(C(F)(F)F)C=C1[N+]([O-])=O SMDHCQAYESWHAE-UHFFFAOYSA-N 0.000 description 1
- PPWBRCCBKOWDNB-UHFFFAOYSA-N bensulfuron Chemical compound COC1=CC(OC)=NC(NC(=O)NS(=O)(=O)CC=2C(=CC=CC=2)C(O)=O)=N1 PPWBRCCBKOWDNB-UHFFFAOYSA-N 0.000 description 1
- RRTCFFFUTAGOSG-UHFFFAOYSA-N benzene;phenol Chemical group C1=CC=CC=C1.OC1=CC=CC=C1 RRTCFFFUTAGOSG-UHFFFAOYSA-N 0.000 description 1
- 235000010233 benzoic acid Nutrition 0.000 description 1
- 229960004365 benzoic acid Drugs 0.000 description 1
- KCXMKQUNVWSEMD-UHFFFAOYSA-N benzyl chloride Chemical compound ClCC1=CC=CC=C1 KCXMKQUNVWSEMD-UHFFFAOYSA-N 0.000 description 1
- 230000031018 biological processes and functions Effects 0.000 description 1
- QKSKPIVNLNLAAV-UHFFFAOYSA-N bis(2-chloroethyl) sulfide Chemical compound ClCCSCCCl QKSKPIVNLNLAAV-UHFFFAOYSA-N 0.000 description 1
- GDTBXPJZTBHREO-UHFFFAOYSA-N bromine Substances BrBr GDTBXPJZTBHREO-UHFFFAOYSA-N 0.000 description 1
- 229910052794 bromium Inorganic materials 0.000 description 1
- XNNQFQFUQLJSQT-UHFFFAOYSA-N bromo(trichloro)methane Chemical compound ClC(Cl)(Cl)Br XNNQFQFUQLJSQT-UHFFFAOYSA-N 0.000 description 1
- RDHPKYGYEGBMSE-UHFFFAOYSA-N bromoethane Chemical compound CCBr RDHPKYGYEGBMSE-UHFFFAOYSA-N 0.000 description 1
- 229950005228 bromoform Drugs 0.000 description 1
- GZUXJHMPEANEGY-UHFFFAOYSA-N bromomethane Chemical compound BrC GZUXJHMPEANEGY-UHFFFAOYSA-N 0.000 description 1
- RJCQBQGAPKAMLL-UHFFFAOYSA-N bromotrifluoromethane Chemical compound FC(F)(F)Br RJCQBQGAPKAMLL-UHFFFAOYSA-N 0.000 description 1
- DYONNFFVDNILGI-UHFFFAOYSA-N butan-2-yl nitrate Chemical compound CCC(C)O[N+]([O-])=O DYONNFFVDNILGI-UHFFFAOYSA-N 0.000 description 1
- WFYPICNXBKQZGB-UHFFFAOYSA-N butenyne Chemical compound C=CC#C WFYPICNXBKQZGB-UHFFFAOYSA-N 0.000 description 1
- 229940043232 butyl acetate Drugs 0.000 description 1
- QQHZPQUHCAKSOL-UHFFFAOYSA-N butyl nitrate Chemical compound CCCCO[N+]([O-])=O QQHZPQUHCAKSOL-UHFFFAOYSA-N 0.000 description 1
- 229940067596 butylparaben Drugs 0.000 description 1
- KVNRLNFWIYMESJ-UHFFFAOYSA-N butyronitrile Chemical compound CCCC#N KVNRLNFWIYMESJ-UHFFFAOYSA-N 0.000 description 1
- 229960001948 caffeine Drugs 0.000 description 1
- VJEONQKOZGKCAK-UHFFFAOYSA-N caffeine Natural products CN1C(=O)N(C)C(=O)C2=C1C=CN2C VJEONQKOZGKCAK-UHFFFAOYSA-N 0.000 description 1
- 229940117949 captan Drugs 0.000 description 1
- 239000004202 carbamide Substances 0.000 description 1
- 229960005286 carbaryl Drugs 0.000 description 1
- CVXBEEMKQHEXEN-UHFFFAOYSA-N carbaryl Chemical compound C1=CC=C2C(OC(=O)NC)=CC=CC2=C1 CVXBEEMKQHEXEN-UHFFFAOYSA-N 0.000 description 1
- DUEPRVBVGDRKAG-UHFFFAOYSA-N carbofuran Chemical compound CNC(=O)OC1=CC=CC2=C1OC(C)(C)C2 DUEPRVBVGDRKAG-UHFFFAOYSA-N 0.000 description 1
- 150000001721 carbon Chemical group 0.000 description 1
- 125000002915 carbonyl group Chemical group [*:2]C([*:1])=O 0.000 description 1
- 150000001793 charged compounds Chemical class 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- BIWJNBZANLAXMG-YQELWRJZSA-N chlordane Chemical compound ClC1=C(Cl)[C@@]2(Cl)C3CC(Cl)C(Cl)C3[C@]1(Cl)C2(Cl)Cl BIWJNBZANLAXMG-YQELWRJZSA-N 0.000 description 1
- WYKYKTKDBLFHCY-UHFFFAOYSA-N chloridazon Chemical compound O=C1C(Cl)=C(N)C=NN1C1=CC=CC=C1 WYKYKTKDBLFHCY-UHFFFAOYSA-N 0.000 description 1
- NSWAMPCUPHPTTC-UHFFFAOYSA-N chlorimuron-ethyl Chemical group CCOC(=O)C1=CC=CC=C1S(=O)(=O)NC(=O)NC1=NC(Cl)=CC(OC)=N1 NSWAMPCUPHPTTC-UHFFFAOYSA-N 0.000 description 1
- 229910052801 chlorine Inorganic materials 0.000 description 1
- FOCAUTSVDIKZOP-UHFFFAOYSA-N chloroacetic acid Chemical compound OC(=O)CCl FOCAUTSVDIKZOP-UHFFFAOYSA-N 0.000 description 1
- 229940106681 chloroacetic acid Drugs 0.000 description 1
- HRYZWHHZPQKTII-UHFFFAOYSA-N chloroethane Chemical compound CCCl HRYZWHHZPQKTII-UHFFFAOYSA-N 0.000 description 1
- LFHISGNCFUNFFM-UHFFFAOYSA-N chloropicrin Chemical compound [O-][N+](=O)C(Cl)(Cl)Cl LFHISGNCFUNFFM-UHFFFAOYSA-N 0.000 description 1
- SBPBAQFWLVIOKP-UHFFFAOYSA-N chlorpyrifos Chemical compound CCOP(=S)(OCC)OC1=NC(Cl)=C(Cl)C=C1Cl SBPBAQFWLVIOKP-UHFFFAOYSA-N 0.000 description 1
- KFUSEUYYWQURPO-UPHRSURJSA-N cis-1,2-dichloroethene Chemical compound Cl\C=C/Cl KFUSEUYYWQURPO-UPHRSURJSA-N 0.000 description 1
- KVZJLSYJROEPSQ-OCAPTIKFSA-N cis-1,2-dimethylcyclohexane Chemical compound C[C@H]1CCCC[C@H]1C KVZJLSYJROEPSQ-OCAPTIKFSA-N 0.000 description 1
- KVZJLSYJROEPSQ-UHFFFAOYSA-N cis-DMCH Natural products CC1CCCCC1C KVZJLSYJROEPSQ-UHFFFAOYSA-N 0.000 description 1
- 238000005094 computer simulation Methods 0.000 description 1
- 238000005314 correlation function Methods 0.000 description 1
- 230000001186 cumulative effect Effects 0.000 description 1
- XCIXKGXIYUWCLL-UHFFFAOYSA-N cyclopentanol Chemical compound OC1CCCC1 XCIXKGXIYUWCLL-UHFFFAOYSA-N 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 238000000354 decomposition reaction Methods 0.000 description 1
- 230000007812 deficiency Effects 0.000 description 1
- FHIVAFMUCKRCQO-UHFFFAOYSA-N diazinon Chemical compound CCOP(=S)(OCC)OC1=CC(C)=NC(C(C)C)=N1 FHIVAFMUCKRCQO-UHFFFAOYSA-N 0.000 description 1
- FJBFPHVGVWTDIP-UHFFFAOYSA-N dibromomethane Chemical compound BrCBr FJBFPHVGVWTDIP-UHFFFAOYSA-N 0.000 description 1
- IWEDIXLBFLAXBO-UHFFFAOYSA-N dicamba Chemical compound COC1=C(Cl)C=CC(Cl)=C1C(O)=O IWEDIXLBFLAXBO-UHFFFAOYSA-N 0.000 description 1
- 229960005215 dichloroacetic acid Drugs 0.000 description 1
- PXBRQCKWGAHEHS-UHFFFAOYSA-N dichlorodifluoromethane Chemical compound FC(F)(Cl)Cl PXBRQCKWGAHEHS-UHFFFAOYSA-N 0.000 description 1
- 235000019404 dichlorodifluoromethane Nutrition 0.000 description 1
- DCOPUUMXTXDBNB-UHFFFAOYSA-N diclofenac Chemical compound OC(=O)CC1=CC=CC=C1NC1=C(Cl)C=CC=C1Cl DCOPUUMXTXDBNB-UHFFFAOYSA-N 0.000 description 1
- JXSJBGJIGXNWCI-UHFFFAOYSA-N diethyl 2-[(dimethoxyphosphorothioyl)thio]succinate Chemical compound CCOC(=O)CC(SP(=S)(OC)OC)C(=O)OCC JXSJBGJIGXNWCI-UHFFFAOYSA-N 0.000 description 1
- 229960004132 diethyl ether Drugs 0.000 description 1
- HPNMFZURTQLUMO-UHFFFAOYSA-N diethylamine Chemical compound CCNCC HPNMFZURTQLUMO-UHFFFAOYSA-N 0.000 description 1
- HUPFGZXOMWLGNK-UHFFFAOYSA-N diflunisal Chemical compound C1=C(O)C(C(=O)O)=CC(C=2C(=CC(F)=CC=2)F)=C1 HUPFGZXOMWLGNK-UHFFFAOYSA-N 0.000 description 1
- 229960000616 diflunisal Drugs 0.000 description 1
- IZTIQXSCWGHBTL-UHFFFAOYSA-N dihydroxy-(2-nitrophenyl)-sulfanylidene-lambda5-phosphane Chemical compound OP(O)(=S)c1ccccc1[N+]([O-])=O IZTIQXSCWGHBTL-UHFFFAOYSA-N 0.000 description 1
- NKDDWNXOKDWJAK-UHFFFAOYSA-N dimethoxymethane Chemical compound COCOC NKDDWNXOKDWJAK-UHFFFAOYSA-N 0.000 description 1
- WEHWNAOGRSTTBQ-UHFFFAOYSA-N dipropylamine Chemical compound CCCNCCC WEHWNAOGRSTTBQ-UHFFFAOYSA-N 0.000 description 1
- DFBKLUNHFCTMDC-GKRDHZSOSA-N endrin Chemical compound C([C@@H]1[C@H]2[C@@]3(Cl)C(Cl)=C([C@]([C@H]22)(Cl)C3(Cl)Cl)Cl)[C@@H]2[C@H]2[C@@H]1O2 DFBKLUNHFCTMDC-GKRDHZSOSA-N 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- 150000002170 ethers Chemical class 0.000 description 1
- 229940052303 ethers for general anesthesia Drugs 0.000 description 1
- RIZMRRKBZQXFOY-UHFFFAOYSA-N ethion Chemical compound CCOP(=S)(OCC)SCSP(=S)(OCC)OCC RIZMRRKBZQXFOY-UHFFFAOYSA-N 0.000 description 1
- 235000019439 ethyl acetate Nutrition 0.000 description 1
- 229940093499 ethyl acetate Drugs 0.000 description 1
- 229960003750 ethyl chloride Drugs 0.000 description 1
- 229960001617 ethyl hydroxybenzoate Drugs 0.000 description 1
- 235000010228 ethyl p-hydroxybenzoate Nutrition 0.000 description 1
- 239000004403 ethyl p-hydroxybenzoate Substances 0.000 description 1
- UQXKXGWGFRWILX-UHFFFAOYSA-N ethylene glycol dinitrate Chemical compound O=N(=O)OCCON(=O)=O UQXKXGWGFRWILX-UHFFFAOYSA-N 0.000 description 1
- ACKALUBLCWJVNB-UHFFFAOYSA-N ethylidene diacetate Chemical compound CC(=O)OC(C)OC(C)=O ACKALUBLCWJVNB-UHFFFAOYSA-N 0.000 description 1
- NUVBSKCKDOMJSU-UHFFFAOYSA-N ethylparaben Chemical compound CCOC(=O)C1=CC=C(O)C=C1 NUVBSKCKDOMJSU-UHFFFAOYSA-N 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000007717 exclusion Effects 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 229960001395 fenbufen Drugs 0.000 description 1
- ZPAKPRAICRBAOD-UHFFFAOYSA-N fenbufen Chemical compound C1=CC(C(=O)CCC(=O)O)=CC=C1C1=CC=CC=C1 ZPAKPRAICRBAOD-UHFFFAOYSA-N 0.000 description 1
- XXOYNJXVWVNOOJ-UHFFFAOYSA-N fenuron Chemical compound CN(C)C(=O)NC1=CC=CC=C1 XXOYNJXVWVNOOJ-UHFFFAOYSA-N 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 229960004369 flufenamic acid Drugs 0.000 description 1
- LPEPZBJOKDYZAD-UHFFFAOYSA-N flufenamic acid Chemical compound OC(=O)C1=CC=CC=C1NC1=CC=CC(C(F)(F)F)=C1 LPEPZBJOKDYZAD-UHFFFAOYSA-N 0.000 description 1
- 229910052731 fluorine Inorganic materials 0.000 description 1
- 239000011737 fluorine Substances 0.000 description 1
- 229960002949 fluorouracil Drugs 0.000 description 1
- 229960002390 flurbiprofen Drugs 0.000 description 1
- SYTBZMRGLBWNTM-UHFFFAOYSA-N flurbiprofen Chemical compound FC1=CC(C(C(O)=O)C)=CC=C1C1=CC=CC=C1 SYTBZMRGLBWNTM-UHFFFAOYSA-N 0.000 description 1
- DLEGDLSLRSOURQ-UHFFFAOYSA-N fluroxene Chemical compound FC(F)(F)COC=C DLEGDLSLRSOURQ-UHFFFAOYSA-N 0.000 description 1
- 235000019253 formic acid Nutrition 0.000 description 1
- 229940013688 formic acid Drugs 0.000 description 1
- WBJINCZRORDGAQ-UHFFFAOYSA-N formic acid ethyl ester Natural products CCOC=O WBJINCZRORDGAQ-UHFFFAOYSA-N 0.000 description 1
- JLYXXMFPNIAWKQ-GNIYUCBRSA-N gamma-hexachlorocyclohexane Chemical compound Cl[C@H]1[C@H](Cl)[C@@H](Cl)[C@@H](Cl)[C@H](Cl)[C@H]1Cl JLYXXMFPNIAWKQ-GNIYUCBRSA-N 0.000 description 1
- JLYXXMFPNIAWKQ-UHFFFAOYSA-N gamma-hexachlorocyclohexane Natural products ClC1C(Cl)C(Cl)C(Cl)C(Cl)C1Cl JLYXXMFPNIAWKQ-UHFFFAOYSA-N 0.000 description 1
- 235000013773 glyceryl triacetate Nutrition 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- FRCCEHPWNOQAEU-UHFFFAOYSA-N heptachlor Chemical compound ClC1=C(Cl)C2(Cl)C3C=CC(Cl)C3C1(Cl)C2(Cl)Cl FRCCEHPWNOQAEU-UHFFFAOYSA-N 0.000 description 1
- DMEGYFMYUHOHGS-UHFFFAOYSA-N heptamethylene Natural products C1CCCCCC1 DMEGYFMYUHOHGS-UHFFFAOYSA-N 0.000 description 1
- GGKJPMAIXBETTD-UHFFFAOYSA-L heptyl phosphate Chemical compound CCCCCCCOP([O-])([O-])=O GGKJPMAIXBETTD-UHFFFAOYSA-L 0.000 description 1
- 150000002391 heterocyclic compounds Chemical class 0.000 description 1
- CKAPSXZOOQJIBF-UHFFFAOYSA-N hexachlorobenzene Chemical compound ClC1=C(Cl)C(Cl)=C(Cl)C(Cl)=C1Cl CKAPSXZOOQJIBF-UHFFFAOYSA-N 0.000 description 1
- VHHHONWQHHHLTI-UHFFFAOYSA-N hexachloroethane Chemical compound ClC(Cl)(Cl)C(Cl)(Cl)Cl VHHHONWQHHHLTI-UHFFFAOYSA-N 0.000 description 1
- BHEPBYXIRTUNPN-UHFFFAOYSA-N hydridophosphorus(.) (triplet) Chemical compound [PH] BHEPBYXIRTUNPN-UHFFFAOYSA-N 0.000 description 1
- 239000000852 hydrogen donor Substances 0.000 description 1
- OUUQCZGPVNCOIJ-UHFFFAOYSA-N hydroperoxyl Chemical compound O[O] OUUQCZGPVNCOIJ-UHFFFAOYSA-N 0.000 description 1
- 229960001680 ibuprofen Drugs 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 230000009878 intermolecular interaction Effects 0.000 description 1
- 235000013847 iso-butane Nutrition 0.000 description 1
- KQNPFQTWMSNSAP-UHFFFAOYSA-M isobutyrate Chemical compound CC(C)C([O-])=O KQNPFQTWMSNSAP-UHFFFAOYSA-M 0.000 description 1
- ZFSLODLOARCGLH-UHFFFAOYSA-N isocyanuric acid Chemical compound OC1=NC(O)=NC(O)=N1 ZFSLODLOARCGLH-UHFFFAOYSA-N 0.000 description 1
- ULYZAYCEDJDHCC-UHFFFAOYSA-N isopropyl chloride Chemical compound CC(C)Cl ULYZAYCEDJDHCC-UHFFFAOYSA-N 0.000 description 1
- DKYWVDODHFEZIM-UHFFFAOYSA-N ketoprofen Chemical compound OC(=O)C(C)C1=CC=CC(C(=O)C=2C=CC=CC=2)=C1 DKYWVDODHFEZIM-UHFFFAOYSA-N 0.000 description 1
- 229960000991 ketoprofen Drugs 0.000 description 1
- 229960002809 lindane Drugs 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 229960000453 malathion Drugs 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 230000000873 masking effect Effects 0.000 description 1
- UHXUZOCRWCRNSJ-QPJJXVBHSA-N methomyl Chemical compound CNC(=O)O\N=C(/C)SC UHXUZOCRWCRNSJ-QPJJXVBHSA-N 0.000 description 1
- UZKWTJUDCOPSNM-UHFFFAOYSA-N methoxybenzene Substances CCCCOC=C UZKWTJUDCOPSNM-UHFFFAOYSA-N 0.000 description 1
- VNKYTQGIUYNRMY-UHFFFAOYSA-N methoxypropane Chemical compound CCCOC VNKYTQGIUYNRMY-UHFFFAOYSA-N 0.000 description 1
- 229940095102 methyl benzoate Drugs 0.000 description 1
- 235000010270 methyl p-hydroxybenzoate Nutrition 0.000 description 1
- 239000004292 methyl p-hydroxybenzoate Substances 0.000 description 1
- GYNNXHKOJHMOHS-UHFFFAOYSA-N methyl-cycloheptane Natural products CC1CCCCCC1 GYNNXHKOJHMOHS-UHFFFAOYSA-N 0.000 description 1
- 229960002216 methylparaben Drugs 0.000 description 1
- 229960001952 metrifonate Drugs 0.000 description 1
- RSMUVYRMZCOLBH-UHFFFAOYSA-N metsulfuron methyl Chemical group COC(=O)C1=CC=CC=C1S(=O)(=O)NC(=O)NC1=NC(C)=NC(OC)=N1 RSMUVYRMZCOLBH-UHFFFAOYSA-N 0.000 description 1
- 238000000874 microwave-assisted extraction Methods 0.000 description 1
- 230000000116 mitigating effect Effects 0.000 description 1
- 230000004001 molecular interaction Effects 0.000 description 1
- PYLWMHQQBFSUBP-UHFFFAOYSA-N monofluorobenzene Chemical compound FC1=CC=CC=C1 PYLWMHQQBFSUBP-UHFFFAOYSA-N 0.000 description 1
- HDZGCSFEDULWCS-UHFFFAOYSA-N monomethylhydrazine Chemical compound CNN HDZGCSFEDULWCS-UHFFFAOYSA-N 0.000 description 1
- ULKWBVNMJZUEBD-UHFFFAOYSA-N n,n,4-trimethylbenzamide Chemical compound CN(C)C(=O)C1=CC=C(C)C=C1 ULKWBVNMJZUEBD-UHFFFAOYSA-N 0.000 description 1
- YKYONYBAUNKHLG-UHFFFAOYSA-N n-Propyl acetate Natural products CCCOC(C)=O YKYONYBAUNKHLG-UHFFFAOYSA-N 0.000 description 1
- SNMVRZFUUCLYTO-UHFFFAOYSA-N n-propyl chloride Chemical compound CCCCl SNMVRZFUUCLYTO-UHFFFAOYSA-N 0.000 description 1
- 229960002009 naproxen Drugs 0.000 description 1
- CMWTZPSULFXXJA-VIFPVBQESA-N naproxen Chemical compound C1=C([C@H](C)C(O)=O)C=CC2=CC(OC)=CC=C21 CMWTZPSULFXXJA-VIFPVBQESA-N 0.000 description 1
- MCSAJNNLRCFZED-UHFFFAOYSA-N nitroethane Chemical compound CC[N+]([O-])=O MCSAJNNLRCFZED-UHFFFAOYSA-N 0.000 description 1
- LYGJENNIWJXYER-UHFFFAOYSA-N nitromethane Chemical compound C[N+]([O-])=O LYGJENNIWJXYER-UHFFFAOYSA-N 0.000 description 1
- WSGCRAOTEDLMFQ-UHFFFAOYSA-N nonan-5-one Chemical compound CCCCC(=O)CCCC WSGCRAOTEDLMFQ-UHFFFAOYSA-N 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 229940078552 o-xylene Drugs 0.000 description 1
- BCCOBQSFUDVTJQ-UHFFFAOYSA-N octafluorocyclobutane Chemical compound FC1(F)C(F)(F)C(F)(F)C1(F)F BCCOBQSFUDVTJQ-UHFFFAOYSA-N 0.000 description 1
- BHAAPTBBJKJZER-UHFFFAOYSA-N p-anisidine Chemical compound COC1=CC=C(N)C=C1 BHAAPTBBJKJZER-UHFFFAOYSA-N 0.000 description 1
- QNGNSVIICDLXHT-UHFFFAOYSA-N para-ethylbenzaldehyde Natural products CCC1=CC=C(C=O)C=C1 QNGNSVIICDLXHT-UHFFFAOYSA-N 0.000 description 1
- 229960005489 paracetamol Drugs 0.000 description 1
- LCCNCVORNKJIRZ-UHFFFAOYSA-N parathion Chemical compound CCOP(=S)(OCC)OC1=CC=C([N+]([O-])=O)C=C1 LCCNCVORNKJIRZ-UHFFFAOYSA-N 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- LKPLKUMXSAEKID-UHFFFAOYSA-N pentachloronitrobenzene Chemical compound [O-][N+](=O)C1=C(Cl)C(Cl)=C(Cl)C(Cl)=C1Cl LKPLKUMXSAEKID-UHFFFAOYSA-N 0.000 description 1
- 229940100684 pentylamine Drugs 0.000 description 1
- 229960004065 perflutren Drugs 0.000 description 1
- 229960003893 phenacetin Drugs 0.000 description 1
- DLRJIFUOBPOJNS-UHFFFAOYSA-N phenetole Chemical compound CCOC1=CC=CC=C1 DLRJIFUOBPOJNS-UHFFFAOYSA-N 0.000 description 1
- HKOOXMFOFWEVGF-UHFFFAOYSA-N phenylhydrazine Chemical compound NNC1=CC=CC=C1 HKOOXMFOFWEVGF-UHFFFAOYSA-N 0.000 description 1
- BULVZWIRKLYCBC-UHFFFAOYSA-N phorate Chemical compound CCOP(=S)(OCC)SCSCC BULVZWIRKLYCBC-UHFFFAOYSA-N 0.000 description 1
- XKJCHHZQLQNZHY-UHFFFAOYSA-N phthalimide Chemical compound C1=CC=C2C(=O)NC(=O)C2=C1 XKJCHHZQLQNZHY-UHFFFAOYSA-N 0.000 description 1
- PJGSXYOJTGTZAV-UHFFFAOYSA-N pinacolone Chemical compound CC(=O)C(C)(C)C PJGSXYOJTGTZAV-UHFFFAOYSA-N 0.000 description 1
- YFGYUFNIOHWBOB-UHFFFAOYSA-N pirimicarb Chemical compound CN(C)C(=O)OC1=NC(N(C)C)=NC(C)=C1C YFGYUFNIOHWBOB-UHFFFAOYSA-N 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 238000005381 potential energy Methods 0.000 description 1
- AAEVYOVXGOFMJO-UHFFFAOYSA-N prometryn Chemical compound CSC1=NC(NC(C)C)=NC(NC(C)C)=N1 AAEVYOVXGOFMJO-UHFFFAOYSA-N 0.000 description 1
- LFULEKSKNZEWOE-UHFFFAOYSA-N propanil Chemical compound CCC(=O)NC1=CC=C(Cl)C(Cl)=C1 LFULEKSKNZEWOE-UHFFFAOYSA-N 0.000 description 1
- FVSKHRXBFJPNKK-UHFFFAOYSA-N propionitrile Chemical compound CCC#N FVSKHRXBFJPNKK-UHFFFAOYSA-N 0.000 description 1
- 229940090181 propyl acetate Drugs 0.000 description 1
- 235000010232 propyl p-hydroxybenzoate Nutrition 0.000 description 1
- 239000004405 propyl p-hydroxybenzoate Substances 0.000 description 1
- PSXCGTLGGVDWFU-UHFFFAOYSA-N propylene glycol dinitrate Chemical compound [O-][N+](=O)OC(C)CO[N+]([O-])=O PSXCGTLGGVDWFU-UHFFFAOYSA-N 0.000 description 1
- 229960003415 propylparaben Drugs 0.000 description 1
- MWWATHDPGQKSAR-UHFFFAOYSA-N propyne Chemical compound CC#C MWWATHDPGQKSAR-UHFFFAOYSA-N 0.000 description 1
- 238000010403 protein-protein docking Methods 0.000 description 1
- 230000004850 protein–protein interaction Effects 0.000 description 1
- UMJSCPRVCHMLSP-UHFFFAOYSA-N pyridine Natural products COC1=CC=CN=C1 UMJSCPRVCHMLSP-UHFFFAOYSA-N 0.000 description 1
- 229940107700 pyruvic acid Drugs 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000009877 rendering Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 230000008054 signal transmission Effects 0.000 description 1
- ODCWYMIRDDJXKW-UHFFFAOYSA-N simazine Chemical compound CCNC1=NC(Cl)=NC(NCC)=N1 ODCWYMIRDDJXKW-UHFFFAOYSA-N 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- HXJUTPCZVOIRIF-UHFFFAOYSA-N sulfolane Chemical compound O=S1(=O)CCCC1 HXJUTPCZVOIRIF-UHFFFAOYSA-N 0.000 description 1
- ZDXMLEQEMNLCQG-UHFFFAOYSA-N sulfometuron methyl Chemical group COC(=O)C1=CC=CC=C1S(=O)(=O)NC(=O)NC1=NC(C)=CC(C)=N1 ZDXMLEQEMNLCQG-UHFFFAOYSA-N 0.000 description 1
- 125000004434 sulfur atom Chemical group 0.000 description 1
- IROINLKCQGIITA-UHFFFAOYSA-N terbutryn Chemical compound CCNC1=NC(NC(C)(C)C)=NC(SC)=N1 IROINLKCQGIITA-UHFFFAOYSA-N 0.000 description 1
- 229950011008 tetrachloroethylene Drugs 0.000 description 1
- TXEYQDLBPFQVAA-UHFFFAOYSA-N tetrafluoromethane Chemical compound FC(F)(F)F TXEYQDLBPFQVAA-UHFFFAOYSA-N 0.000 description 1
- YLQBMQCUIZJEEH-UHFFFAOYSA-N tetrahydrofuran Natural products C=1C=COC=1 YLQBMQCUIZJEEH-UHFFFAOYSA-N 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
- LOQQVLXUKHKNIA-UHFFFAOYSA-N thifensulfuron Chemical compound COC1=NC(C)=NC(NC(=O)NS(=O)(=O)C2=C(SC=C2)C(O)=O)=N1 LOQQVLXUKHKNIA-UHFFFAOYSA-N 0.000 description 1
- HNKJADCVZUBCPG-UHFFFAOYSA-N thioanisole Chemical compound CSC1=CC=CC=C1 HNKJADCVZUBCPG-UHFFFAOYSA-N 0.000 description 1
- 229930192474 thiophene Natural products 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 229960002905 tolfenamic acid Drugs 0.000 description 1
- YEZNLOUZAIOMLT-UHFFFAOYSA-N tolfenamic acid Chemical compound CC1=C(Cl)C=CC=C1NC1=CC=CC=C1C(O)=O YEZNLOUZAIOMLT-UHFFFAOYSA-N 0.000 description 1
- KFUSEUYYWQURPO-OWOJBTEDSA-N trans-1,2-dichloroethene Chemical compound Cl\C=C\Cl KFUSEUYYWQURPO-OWOJBTEDSA-N 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 229960002622 triacetin Drugs 0.000 description 1
- NFACJZMKEDPNKN-UHFFFAOYSA-N trichlorfon Chemical compound COP(=O)(OC)C(O)C(Cl)(Cl)Cl NFACJZMKEDPNKN-UHFFFAOYSA-N 0.000 description 1
- 229960002415 trichloroethylene Drugs 0.000 description 1
- CYRMSUTZVYGINF-UHFFFAOYSA-N trichlorofluoromethane Chemical compound FC(Cl)(Cl)Cl CYRMSUTZVYGINF-UHFFFAOYSA-N 0.000 description 1
- DQWPFSLDHJDLRL-UHFFFAOYSA-N triethyl phosphate Chemical compound CCOP(=O)(OCC)OCC DQWPFSLDHJDLRL-UHFFFAOYSA-N 0.000 description 1
- 125000002023 trifluoromethyl group Chemical group FC(F)(F)* 0.000 description 1
- ZSDSQXJSNMTJDA-UHFFFAOYSA-N trifluralin Chemical compound CCCN(CCC)C1=C([N+]([O-])=O)C=C(C(F)(F)F)C=C1[N+]([O-])=O ZSDSQXJSNMTJDA-UHFFFAOYSA-N 0.000 description 1
- USDVAISJBZZCEZ-UHFFFAOYSA-N trihydroxy-[nitro(phenyl)-lambda4-sulfanylidene]-lambda5-phosphane Chemical compound OP(O)(O)=S(c1ccccc1)[N+]([O-])=O USDVAISJBZZCEZ-UHFFFAOYSA-N 0.000 description 1
- RXPQRKFMDQNODS-UHFFFAOYSA-N tripropyl phosphate Chemical compound CCCOP(=O)(OCCC)OCCC RXPQRKFMDQNODS-UHFFFAOYSA-N 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 239000003643 water by type Substances 0.000 description 1
- 230000003313 weakening effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B30/00—Methods of screening libraries
- C40B30/04—Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/68—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G06F17/5018—
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/20—Protein or domain folding
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
- G16B5/20—Probabilistic models
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/50—Molecular design, e.g. of drugs
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/90—Programming languages; Computing architectures; Database systems; Data warehousing
Definitions
- This disclosure relates to a movable type method applied to protein-ligand binding.
- Disclosed herein is a method of estimating the pose of a ligand in a receptor comprising identifying all possible atom pairs of protein-ligand complexes in a given configuration space for a system that comprises proteins; creating a first database and a second database; where the first database comprises associated pairwise distance-dependent energies and where the second database comprises all probabilities that include how the atom pairs can combine; combining the first database with the second database via statistical mechanics to accurately estimate binding free energies as well as a pose of a ligand in a receptor; and selecting a protein-ligand complex for further study.
- Also disclosed herein is a system comprising a device; where the device is effective to estimate a pose of a ligand in a receptor comprising identifying all possible atom pairs of protein-ligand complexes in a given configuration space for a system that comprises proteins; creating a first database and a second database; where the first database comprises associated pairwise distance-dependent energies and where the second database comprises all probabilities that include how the atom pairs can combine; combining the first database with the second database via statistical mechanics to accurately estimate binding free energies as well as the pose of the ligand in the receptor; and selecting a protein-ligand complex for further study.
- Also disclosed herein is a method for verifying the binding characteristics of a protein and a ligand using a computer, comprising the steps of: collecting and maintaining a database containing atom pairwise energy data, assembling said atom pairwise data in a printing forme, introducing a fixed-size Z-matrix which represents a Boltzmann-weighted energy ensemble in association with said printing forme, and further assembling atom pairwise energies at different distances using said printing forme so as simultaneously to represent both ensemble and free energies of said protein and said ligand, and rendering said ensemble and free energies as an output to a user to verify binding characteristics of said protein and said ligand.
- Also disclosed herein is (i) a computer-implemented method of simulating free energy change with respect to one or a series of molecules that can include the following; (ii) one or more computer-readable hardware storage device having embedded therein a set of instructions which, when executed by one or more processors of a computer, causes the computer to execute operations that can include the following; and (iii) a system comprising one or more hardware computer processor configured to do the following:
- FIG. 1 depicts the thermodynamic cycle used to formulate the free energy of protein-ligand binding in solution, according to some example embodiments;
- FIG. 2 shows modeling of ligand-solvent polar interaction using a Boltzmann factor multiplier, according to some example embodiments;
- FIG. 3 depicts a plot of the movable type (MT) Knowledge-based and Empirical Combined Scoring Algorithm (KECSA) model (upper panel) and the original KECSA model (lower panel) calculated pKd or pKi values vs. experimental pKd or pKi values, according to some example embodiments;
- FIG. 4 depicts the MT energy map optimization mechanism used to derive the final docking pose in one protein-ligand complex, according to some example embodiments;
- FIG. 5 depicts a contact map of the 1LI2 protein-ligand complex binding region, according to some example embodiments. Hydrophobic contacts are shown as dashed lines and the one hydrogen bond is shown extending between the phenol oxygen atom (central striped atom) and the GLN102A residue as another dashed line. The binding pocket cavity is encircled with a solid line;
- FIG. 6 depicts heat maps for sp3 oxygen (left) and aromatic carbon (right), according to some example embodiments. Grid points with lighter color indicate energetically favorable locations for certain atom types within the binding pocket;
- FIG. 7 shows the binding pocket of protein-ligand complex 1LI2, according to some example embodiments. The ligand crystal structure (marked as CS) is shown as a ball-and-stick structure, and the global minimum pose (marked as GM) is shown as a stick structure along with the three other identified local minima (marked as a, b, and c).
- Bubbles (red in the original) on the protein atoms indicate potential contacts with the ligand sp3 oxygen.
- Other bubbles (grey in the original) on the protein atoms indicate potential contacts with aromatic carbons;
- FIG. 8 shows sp3 carbon-sp3 carbon bond probability distribution (top) and exponential energy vs. atom pairwise distance (bottom), according to some example embodiments;
- FIG. 9 shows a representation of the torsion angle a using the pairwise distance x, according to some example embodiments.
- FIG. 10 shows sp3 carbon-(sp3 carbon)-sp3 carbon angle probability distribution (top) and the exponential energy vs. atom pairwise distance (bottom), according to some example embodiments;
- FIG. 11 depicts the torsion angle a with the atom pairwise distance x, according to some example embodiments.
- FIG. 12 shows sp3 carbon-(sp3 carbon-sp3 carbon)-sp3 carbon torsion probability distribution (top) and exponential energy vs. atom pairwise distance (bottom), according to some example embodiments;
- FIG. 13 depicts an example of the SM k bond and its corresponding SM k bond , where SM refers to Standard Matrix, according to some example embodiments.
- the scramble operator index numbers with circles (red in the original) connected by arrows (blue in the original) indicate that scrambled vectors with the same scramble manner (the same index number i) in both k bond and k bond are tiled in the same position in both SMs;
- FIG. 14 depicts factors involved in the implicit-solvent model in the (KECSA-Movable Type Implicit Solvation Model) KMTISM method, according to some example embodiments;
- FIG. 15 is a graph showing KMTISM, MM-GBSA and MM-PBSA calculated vs. experimental solvation free energies (kcal/mol) for 372 neutral molecules (kcal/mol), according to some example embodiments;
- FIGS. 16A-16I graphically illustrate KMTISM's top three performing test sets according to RMSE (Root-Mean-Square Error), according to some example embodiments.
- KMTISM, MM-GBSA and MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data are shown.
- FIG. 16A shows KMTISM calculated solvation free energies (kcal/mol) vs. experimental data for hydrocarbon test sets.
- FIG. 16B shows MM-GBSA calculated solvation free energies (kcal/mol) vs. experimental data for hydrocarbon test sets.
- FIG. 16C shows MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for hydrocarbon test sets.
- FIG. 16D shows KMTISM calculated solvation free energies (kcal/mol) vs. experimental data for oxygen-bearing test sets.
- FIG. 16E shows MM-GBSA calculated solvation free energies (kcal/mol) vs. experimental data for oxygen-bearing test sets.
- FIG. 16F shows MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for oxygen-bearing test sets.
- FIG. 16G shows KMTISM calculated solvation free energies (kcal/mol) vs. experimental data for halocarbon test sets.
- FIG. 16H shows MM-GBSA calculated solvation free energies (kcal/mol) vs. experimental data for halocarbon test sets.
- FIG. 16I shows MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for halocarbon test sets;
- FIGS. 17A-17I graphically illustrate KMTISM's worst three performing test sets according to RMSE.
- FIG. 17A shows KMTISM calculated solvation free energies (kcal/mol) vs. experimental data for amide test sets.
- FIG. 17B shows MM-GBSA calculated solvation free energies (kcal/mol) vs. experimental data for amide test sets.
- FIG. 17C shows MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for amide test sets.
- FIG. 17D shows KMTISM calculated solvation free energies (kcal/mol) vs. experimental data for organosulfur and organophosphorus test sets.
- FIG. 17E shows MM-GBSA calculated solvation free energies (kcal/mol) vs. experimental data for organosulfur and organophosphorus test sets.
- FIG. 17F shows MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for organosulfur and organophosphorus test sets.
- FIG. 17G shows KMTISM calculated solvation free energies (kcal/mol) vs. experimental data for polyfunctional test sets.
- FIG. 17H shows MM-GBSA calculated solvation free energies (kcal/mol) vs. experimental data for polyfunctional test sets.
- FIG. 17I shows MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for polyfunctional test sets;
- FIGS. 18A-18C graphically illustrate KMTISM, MM-GBSA and MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for carboxylate and charged amine test sets, according to some example embodiments.
- FIG. 18A shows KMTISM calculated solvation free energies (kcal/mol) vs. experimental data for carboxylate and charged amine test sets.
- FIG. 18B shows MM-GBSA calculated solvation free energies (kcal/mol) vs. experimental data for carboxylate and charged amine test sets.
- FIG. 18C shows MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for carboxylate and charged amine test sets; and
- FIG. 19 is a block diagram illustrating components of a machine, according to some example embodiments, able to read instructions from a machine-readable medium and perform any one or more of the methodologies discussed herein.
- Disclosed herein is a method for addressing the sampling problem using a novel methodology called “movable type”.
- Conceptually, it can be understood by analogy with the evolution of printing; hence the name movable type.
- a common approach to the study of protein-ligand complexation involves taking a database of intact drug-like molecules and exhaustively docking them into a binding pocket. This is reminiscent of early woodblock printing where each page had to be laboriously created prior to printing a book.
- printing evolved to an approach where a database of symbols (letters, numerals, and the like) was created and then assembled using a movable type system, which allowed for the creation of all possible combinations of symbols on a given page, thereby, revolutionizing the dissemination of knowledge.
- the movable type (MT) method involves identifying all atom pairs seen in protein-ligand complexes and then creating two databases: one with their associated pairwise distance-dependent energies and another with the probabilities of how these pairs can combine in terms of bonds, angles, dihedrals and non-bonded interactions. Combining these two databases with the principles of statistical mechanics allows us to accurately estimate binding free energies as well as the pose of a ligand in a receptor.
- By its mathematical construction, this method samples all of the configuration space of a selected region in a single effort, without resorting to brute force sampling schemes involving Monte Carlo, genetic algorithms or molecular dynamics simulations, which makes the methodology extremely efficient.
- this method explores the free energy surface, eliminating the need to estimate the enthalpy and entropy components individually.
- low free energy structures can be obtained via a free energy minimization procedure, yielding all low free energy poses on a given free energy surface.
- this approach can be utilized in a wide range of applications in computational biology that involve the computation of free energies for systems with extensive phase spaces, including protein folding, protein-protein docking and protein design.
- the method permits the selection of particular proteins and ligands for use in experiments, in manufacturing, in therapy, in drugs, in drug trials, in artificial parts, or a combination thereof.
- the determination of free energies between proteins and ligands permits the selection of protein-ligand pairs that can effectively function for various desired purposes.
- the computations disclosed below can be conducted on a computer or a microprocessor, with the most probable protein-ligand pairs being detailed on a computer screen, a paper print-out, a hard drive, a flash drive, and the like.
- movable type printing (which was invented in China in the 11th century and introduced by Gutenberg into the alphabetic language system) uses a “database” of letters that is pre-constructed and then the printing of any word involves a database search followed by the appropriate combination from the movable type system.
- the molecular energy of a system can be decomposed into atom pairwise interaction energies including bond, angle, torsion, and long-range non-covalent forces (van der Waals and electrostatic forces), which by analogy to the MT systems is our database of “letters”. Each interaction has a different intensity and probability of occurrence along an atom pairwise coordinate axis.
- a pose is a unique 3-D positioning of a ligand in a receptor pocket (one of many in practice).
- the docking and scoring problem is an example of a broad class of problems in computational biology that involve both the computation of the free energy and structure of a biological system, which includes challenges like the prediction of protein folds, protein-protein interactions and protein design all of which the MT method can address.
- the binding free energy in solution is now separated into two terms: The binding free energy in the gas-phase and the change in the solvation free energy during the complexation process.
- the movable type algorithm is introduced to model both terms, each with its own design.
- the binding free energy ($\Delta G_b^g$) in the gas-phase is one of the terms to evaluate in order to predict the protein-ligand binding affinity because it contains all interactions between the protein and ligand.
- the canonical (NVT) ensemble, and hence the Helmholtz free energy, is used throughout, but the ΔG notation of the Gibbs free energy is predominantly used.
- binding free energy in the gas-phase can be generated using the ratio of the partition functions describing the protein-ligand complex, the protein, and the ligand.
- $F_P$ and $F_L$ are the number of external degrees of freedom for the unbound protein and the unbound ligand respectively
- $F_{PL}$ is approximated as the product of the external degrees of freedom (DoFs) of the bound protein and ligand (including the rotational and translational DoFs), and the internal DoFs of the bound protein and ligand (including the relative-positional and vibrational DoFs), given as:
- $$F_{PL} = F_{boundP}^{external}\,F_{boundL}^{external}\,F_{boundP}^{internal}\,F_{boundL}^{internal} \qquad (5)$$
- the DoFs of the free protein and ligand molecules are also separated into the external and internal components. Internal DoFs are identical for bound and free protein/ligand structures and the bound and free proteins are also assumed to share the same internal and external DoFs. Only the external DoFs of the ligand are differentiated between the bound and free systems.
- the rotational DoF of a free ligand is 8π² on a normalized unit sphere. However, because of the inaccessible volume present in protein-ligand systems, the rotational DoFs of bound ligands are designated as aπ², with a to-be-determined average volume factor a less than 8.
- the translational DoFs are treated as a constant C, which is assumed to be identical for all free ligands, while the translational DoF for bound ligands is the volume of the binding pocket V pocket in which the ligands' center of mass can translate.
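As a rough illustration of the degree-of-freedom bookkeeping above, the following Python sketch computes the ratio of bound to free external ligand DoFs from the stated quantities (rotational 8π² vs. aπ², translational C vs. V_pocket). The function name and the example values are hypothetical, and the sketch makes no claim about how this ratio enters the final free energy expression.

```python
import math


def ligand_external_dof_ratio(a, v_pocket, c):
    """Ratio of bound to free external ligand DoFs (hypothetical helper).

    a        -- average volume factor for the bound ligand (stated to be < 8)
    v_pocket -- binding pocket volume available to the ligand center of mass
    c        -- constant translational DoF assumed for all free ligands
    """
    if not 0 < a < 8:
        raise ValueError("the volume factor a is expected to lie in (0, 8)")
    bound = a * math.pi ** 2 * v_pocket   # rotational (a*pi^2) x translational (V_pocket)
    free = 8 * math.pi ** 2 * c           # rotational (8*pi^2) x translational (C)
    return bound / free


# Illustrative values only; a, V_pocket and C are not taken from the source.
ratio = ligand_external_dof_ratio(a=4.0, v_pocket=2.0, c=4.0)
```

Because the π² factors cancel, the ratio reduces to a·V_pocket / (8·C), so only the volume factor and the two translational terms matter.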
- gas-phase protein-ligand binding free energy can then be further manipulated into the following form:
- Equation 3 the solvation free energy can be correlated to the partition function of the solute (protein, ligand or protein-ligand complex) and solute-solvent bulk interactions.
- the solvation free energy, using $\Delta G_{solv}^{L}$ as an example, is modeled as in Equation 8, and the DoFs are approximated as being the same for the solute and the solute-solvent bulk terms.
- the other solvation terms in Equation 1 ($\Delta G_{solv}^{P}$ and $\Delta G_{solv}^{PL}$) can be modeled in an analogous manner, yielding the change in the solvation free energy as ligand binding occurs, which can then be used to evaluate the overall free energy of ligand binding in aqueous solution.
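The way the two terms combine can be sketched in a few lines of Python. This is a minimal sketch that assumes the usual thermodynamic-cycle bookkeeping, in which the solvation change is the complex term minus the protein and ligand terms; Equation 1 itself is not reproduced here, and the function name and numbers are hypothetical.

```python
def binding_free_energy_in_solution(dg_gas, dg_solv_complex,
                                    dg_solv_protein, dg_solv_ligand):
    """Combine the gas-phase binding free energy with the solvation change.

    All inputs are free energies in kcal/mol. The sign convention (complex
    minus protein minus ligand) is an assumption based on a standard
    thermodynamic cycle.
    """
    ddg_solv = dg_solv_complex - dg_solv_protein - dg_solv_ligand
    return dg_gas + ddg_solv


# Illustrative numbers only (not from the source).
dg_solution = binding_free_energy_in_solution(
    dg_gas=-20.0, dg_solv_complex=-110.0,
    dg_solv_protein=-100.0, dg_solv_ligand=-15.0,
)
```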
- the gas-phase protein-ligand binding free energy can be generated using molecular dynamics, Monte Carlo, genetic algorithms, and the like, by sampling over a large number of poses of the protein, ligand and protein-ligand complex.
- the Helmholtz free energy (A) can be obtained as the arithmetic mean (sum of the energies of all ligand poses divided by the total number of all poses along with an estimate of integration volume) of Boltzmann factors:
- the challenge in deriving the canonical partition function (as the denominator in Equation 10) for a protein-ligand system is that it is difficult to include all relevant ligand pose energies within the binding pocket using brute force sampling schemes.
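For comparison, the brute-force estimate described above can be sketched as follows. This is a minimal sketch that assumes the form A = -RT ln(V · mean Boltzmann factor); Equation 10 is not reproduced here, so the exact expression, the function name and the default volume are assumptions.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def helmholtz_from_pose_energies(energies, volume=1.0, temperature=298.15):
    """Estimate A from sampled pose energies (kcal/mol).

    Uses the arithmetic mean of the Boltzmann factors times an integration
    volume. Energies are shifted by their minimum before exponentiation to
    avoid overflow; the shift is added back afterwards.
    """
    rt = R_KCAL * temperature
    e = np.asarray(energies, dtype=float)
    e_min = e.min()
    mean_factor = np.mean(np.exp(-(e - e_min) / rt))
    return e_min - rt * np.log(volume * mean_factor)


# Hypothetical pose energies; a real run would need very many poses,
# which is exactly the sampling burden the text describes.
a_est = helmholtz_from_pose_energies([-5.0, -4.2, -1.0, 3.5])
```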
- the task becomes much easier when a protein-ligand system is reduced to the atom-pair level.
- the “pose” sampling problem can then be cast as a 1-D rather than a 3-D problem by deriving the canonical partition function as a sum of the Boltzmann factor products of all atom pairwise energies included in the system over all atom pairwise separation distance ranges.
- the canonical partition function can be derived following Equation 12, where the index “i” refers to each ligand pose (microstate) in a “traditional” brute force sampling scheme.
- q indicates all atom pairs in the molecular system
- p indicates each possible combination of all atom pairs each of which is at a pre-chosen distance.
- a, b, c and d refer to each atom pair as a bond, angle, torsion or long-range (van der Waals or electrostatic) interaction in the canonical system, respectively, and α, β, γ and δ refer to each sampled separation distance between the corresponding atom pair. Probabilities of all the atom pairwise distributions on the right hand side of Equation 12 are normalized as
- the MT method is designed to decompose the molecular energy into atom pairwise energies, which then simplifies the energy sampling problem to the atom-pair level.
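A small numerical check makes the benefit of working at the atom-pair level concrete. The sketch below assumes, purely for illustration, that each atom pair's distance is sampled independently and ignores the probability weights; under that assumption the sum over every combination of pairwise Boltzmann factors equals a product of 1-D sums. The function names and numbers are hypothetical.

```python
import itertools
import math


def brute_force_sum(factors_per_pair):
    """Sum the product of Boltzmann factors over every distance combination.

    factors_per_pair[k][j] is the Boltzmann factor of atom pair k at its
    j-th sampled distance. Cost grows as n**m for m pairs and n distances.
    """
    total = 0.0
    for combination in itertools.product(*factors_per_pair):
        total += math.prod(combination)
    return total


def factorized_sum(factors_per_pair):
    """Same quantity as a product of per-pair 1-D sums (cost ~ n*m)."""
    return math.prod(sum(row) for row in factors_per_pair)


# Three atom pairs, two sampled distances each (illustrative values).
toy_factors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
same = math.isclose(brute_force_sum(toy_factors), factorized_sum(toy_factors))
```

The brute-force loop visits all n^m combinations, whereas the factorized form touches each pairwise value once.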
- the advantage of this idea is that (1) atom pairs can be categorized based on atom types and interaction types, e.g., bond, angle, torsion, and long-range non-covalent interactions; and (2) calculation of atom pairwise energies is extremely cheap. It is therefore easy to build an atomic pairwise interaction matrix of energy vs. distance for each interaction type and atom pair type i, j. Hence, the energy calculation for each molecule is no more than a combination of elements from different energy matrices.
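The lookup-table idea can be sketched as follows. This is a minimal sketch with hypothetical atom type labels, a coarse hypothetical distance grid, and placeholder harmonic and Lennard-Jones functions standing in for the real pairwise potential; none of these values come from the source.

```python
import numpy as np

# Hypothetical distance grid in Angstrom (1.0, 1.5, ..., 6.0).
GRID = np.arange(1.0, 6.0 + 1e-9, 0.5)


def harmonic(r, r0=1.5, k=300.0):
    """Placeholder bond-stretch energy (not the KECSA potential)."""
    return k * (r - r0) ** 2


def lennard_jones(r, eps=0.1, sigma=3.4):
    """Placeholder long-range energy (not the KECSA potential)."""
    return 4.0 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)


# One precomputed energy-vs-distance row per (interaction, type_i, type_j).
ENERGY_TABLES = {
    ("bond", "C.3", "C.3"): harmonic(GRID),
    ("long-range", "C.3", "C.3"): lennard_jones(GRID),
}


def lookup_energy(interaction, type_i, type_j, distance):
    """Return the tabulated energy at the grid point nearest to distance."""
    key = (interaction, *sorted((type_i, type_j)))
    index = int(np.abs(GRID - distance).argmin())
    return float(ENERGY_TABLES[key][index])


def molecule_energy(pairs):
    """Total energy as a sum of table lookups, one per atom pair."""
    return sum(lookup_energy(*pair) for pair in pairs)


toy_energy = molecule_energy([
    ("bond", "C.3", "C.3", 1.5),
    ("bond", "C.3", "C.3", 2.0),
])
```

Once the tables exist, scoring a molecule involves no new energy evaluations, only indexing.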
- the MT method is a template by which any pairwise decomposable energy function can be used.
- the energy for each interaction type between a certain atom type pair i, j is calculated using the Knowledge-based and Empirical Combined Scoring Algorithm (KECSA) potential function.
- the protein-ligand statistical potential is modified and equated to an atom pairwise energy in order to generate force field parameters for bond stretching, angle bending, dihedral torsion angles and long-range non-covalent interactions.
- Along with the distance-based energy, each atom pair type also has a distance preference encoded in its distribution, resulting in different probabilities associated with Boltzmann factors for each sampled atom pairwise distance.
- Atom-pair radial distributions were collected from a protein-ligand structure training set (i.e., the PDBbind v2011 data set with 6019 protein-ligand structures) and utilized in the current model.
- the atom pairwise radial distribution function is modeled as:
- $$\frac{n_{ij}(r)}{n_{ij}^{*}(r)}, \qquad n_{ij}^{*}(r) = \frac{N_{ij}}{V}\,4\pi r^{a}\,\Delta r \qquad (14)$$
- $n_{ij}(r)$ is the number of protein-ligand pairwise interactions between a certain atom pair type i and j in the bin (r, r+Δr), with the volume $4\pi r^{a}\Delta r$, collected from the training set.
- $n_{ij}^{*}(r)$ in the denominator mimics the number of protein-ligand atom type pairs i and j in the same distance bin in an ideal gas state.
- Δr is defined as 0.005 Å.
- $N_{ij}$ is the total number of atom pairs of type i and j.
- the average volume V of the protein-ligand binding sites is given as
- a cutoff distance R is assigned to each atom type pair defining the distance at which the atom pairwise interaction energy can be regarded as zero. Both a and R can be derived using a previously introduced method.
- the radial distribution frequency is then normalized by dividing the sum of radial distributions of all the atom pairs in the system (Equation 15).
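The two steps above can be sketched in Python. The first function follows the ratio in Equation 14 (observed counts divided by the ideal-gas reference). The second normalizes a distribution so that it sums to one over the sampled distances of a single pair type; that is an assumption made for illustration, since Equation 15 is not reproduced here and the text describes dividing by the sum over all atom pairs in the system. Function names and inputs are hypothetical.

```python
import numpy as np


def radial_distribution(counts, r, n_total_pairs, volume, a, dr=0.005):
    """Observed counts per bin divided by the ideal-gas reference count.

    counts        -- n_ij(r): observed pair counts in each bin (r, r + dr)
    r             -- bin distances in Angstrom (must be > 0)
    n_total_pairs -- N_ij: total number of pairs of this type
    volume        -- V: average binding-site volume
    a             -- exponent of r in the reference shell volume
    """
    counts = np.asarray(counts, dtype=float)
    r = np.asarray(r, dtype=float)
    ideal_counts = (n_total_pairs / volume) * 4.0 * np.pi * r ** a * dr
    return counts / ideal_counts


def normalize_distribution(g):
    """Scale a distribution so its values sum to 1 (assumed normalization)."""
    g = np.asarray(g, dtype=float)
    return g / g.sum()


# Illustrative call with made-up counts.
g_toy = radial_distribution([2, 9, 4], r=[3.0, 3.5, 4.0],
                            n_total_pairs=1000, volume=500.0, a=2.0)
q_toy = normalize_distribution(g_toy)
```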
- the energy and distribution frequency vs. distance is calculated for any interaction type, and atom pair type, thereby, forming our MT database for later use.
- the binding free energy is defined as a ratio of partition functions of the different molecules involved in the binding process, i.e., the protein, ligand and the protein-ligand complex.
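As a minimal sketch of this definition, the following Python function turns the three partition functions into a free energy via -RT ln of their ratio. It deliberately omits the degree-of-freedom prefactors discussed earlier and works with logarithms of the partition functions for numerical safety; both choices, and the function name, are assumptions rather than the source's exact expression.

```python
R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def gas_phase_binding_free_energy(ln_z_complex, ln_z_protein, ln_z_ligand,
                                  temperature=298.15):
    """-RT ln(Z_PL / (Z_P * Z_L)), computed from log partition functions.

    DoF prefactors are intentionally left out of this sketch.
    """
    ln_ratio = ln_z_complex - ln_z_protein - ln_z_ligand
    return -R_KCAL * temperature * ln_ratio


# Illustrative log partition functions (not from the source).
dg_gas = gas_phase_binding_free_energy(25.0, 12.0, 3.0)
```

A ratio greater than one (complex favored) gives a negative, i.e. favorable, binding free energy.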
- the partition function matrices for the MT algorithm are constructed.
- the atom pairwise energy multiplier sampled as a function of distance is the basic element needed to assemble the total energy, as shown in Equation 18, using the protein bond energy as an example.
- $$Z_P^{bond} = \begin{bmatrix} z_1^{bond}(r_1)\,z_2^{bond}(r_1)\cdots z_m^{bond}(r_1) & z_1^{bond}(r_1)\,z_2^{bond}(r_1)\cdots z_m^{bond}(r_2) & \cdots & z_1^{bond}(r_1)\,z_2^{bond}(r_n)\cdots z_m^{bond}(r_n) \\ z_1^{bond}(r_2)\cdots & \cdots & & \\ \vdots & & & \end{bmatrix}$$
- In Equations 19 and 20, m indicates the total number of atom pairs that need to have their bond stretch term computed (i.e., the number of covalent bonds), and n is the distance sampling size. T indicates the transpose.
- the matrix ℤ_P^bond has a total of n^m elements, and includes all combinations of the sampled atom pairwise distances and atom pairs (see Equation 20).
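The "all combinations" construction can be illustrated with repeated outer products. The sketch below is an assumption-laden toy: the harmonic energy, the force constants, the five-point distance grid and the helper names `boltzmann_factors` and `all_combinations` are placeholders and not the KECSA potentials or the authors' implementation. It only shows how m vectors of n sampled Boltzmann factors expand to n^m products.

```python
from functools import reduce

import numpy as np

def boltzmann_factors(energies, beta=1.0):
    """Boltzmann factor exp(-beta * E) for each sampled energy."""
    return np.exp(-beta * np.asarray(energies, dtype=float))

def all_combinations(per_pair_factors):
    """Outer product of m vectors of length n: an m-dimensional array
    holding every product z_1(r_a) * z_2(r_b) * ... * z_m(r_c)."""
    return reduce(np.multiply.outer, per_pair_factors)

# Toy input: m = 3 bonds, n = 5 sampled distances around 1.53 Angstrom.
r = np.linspace(1.43, 1.63, 5)
z = [boltzmann_factors(k * (r - 1.53) ** 2) for k in (100.0, 120.0, 140.0)]
Z_bond = all_combinations(z)   # shape (5, 5, 5), i.e. n**m = 125 elements
```

Storing the result as an m-dimensional array rather than a flattened matrix is a design choice of this sketch; the element count is the same.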
- Energy matrices for other kinds of atom pairwise interactions are assembled in the same way (i.e., bond, angle, torsion, and long-range interactions).
- Example No. 1 butane-methane interaction
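- $$\mathbb{Z}_{L} = \mathbb{Z}_{L}^{bond} \otimes \mathbb{Z}_{L}^{angle} \otimes \mathbb{Z}_{L}^{torsion} \otimes \mathbb{Z}_{L}^{long\text{-}range} \qquad (23)$$
- $$\mathbb{Z}_{PL} = \mathbb{Z}_{P}^{bond} \otimes \mathbb{Z}_{P}^{angle} \otimes \mathbb{Z}_{P}^{torsion} \otimes \mathbb{Z}_{P}^{long\text{-}range} \otimes \mathbb{Z}_{L}^{bond} \otimes \mathbb{Z}_{L}^{angle} \otimes \mathbb{Z}_{L}^{torsion} \otimes \mathbb{Z}_{L}^{long\text{-}range} \otimes \mathbb{Z}_{PL}^{long\text{-}range} \qquad (24)$$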
- the distribution frequency matrix is built in the same way, with the q ij (r) derived from Equation 15 as elements in each multiplier (also using the protein bond term as an example):
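- The distribution frequency matrix for the protein is derived using Equations 26 through 28, and the distribution frequency matrices of the ligand and of the protein-ligand intermolecular interactions are derived analogously, as in Equations 29 and 30.
- $$\mathbb{Q}_{L} = \mathbb{Q}_{L}^{bond} \otimes \mathbb{Q}_{L}^{angle} \otimes \mathbb{Q}_{L}^{torsion} \otimes \mathbb{Q}_{L}^{long\text{-}range} \qquad (29)$$
- $$\mathbb{Q}_{PL} = \mathbb{Q}_{P}^{bond} \otimes \mathbb{Q}_{P}^{angle} \otimes \mathbb{Q}_{P}^{torsion} \otimes \mathbb{Q}_{P}^{long\text{-}range} \otimes \mathbb{Q}_{L}^{bond} \otimes \mathbb{Q}_{L}^{angle} \otimes \mathbb{Q}_{L}^{torsion} \otimes \mathbb{Q}_{L}^{long\text{-}range} \otimes \mathbb{Q}_{PL}^{long\text{-}range} \qquad (30)$$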
- the corresponding elements in all energy and distribution frequency matrices correlate with each other.
- the pointwise product over all matrices ensures that the energies and distribution frequencies with the same range and distance increment are combined into one element in the final matrix of the probability-weighted partition function of the protein-ligand complex ( PL in Equation 31).
- each element of PL is a value of the partition function of the protein-ligand complex multiplied by its probability based on its radial distribution forming the ensemble average.
- the protein-ligand binding free energy in the gas-phase is defined as in Equation 35, using the averaged partition functions of all three systems (protein, ligand, protein-ligand complex) derived above.
- In Equation 35, Q is the radial distribution frequency and E is the energy.
- i, j, k are the indices of the protein, ligand and protein-ligand complex, while I, J, K are the total number of protein, ligand and protein-ligand complex samples, respectively.
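The probability weighting and the ratio of averaged partition functions can be sketched as follows. This is a hedged illustration only: the helper names, the normalization of Q to unit sum, and the plain form ΔG = -RT ln(⟨Z_PL⟩ / (⟨Z_P⟩⟨Z_L⟩)) are assumptions of this example, and any volume or degree-of-freedom prefactors that Equation 35 may carry are omitted.

```python
import math

import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def ensemble_average(z, q):
    """Pointwise product of Boltzmann factors z and distribution
    frequencies q (same shape), summed with q normalized to 1."""
    z = np.asarray(z, dtype=float)
    q = np.asarray(q, dtype=float)
    if z.shape != q.shape:
        raise ValueError("z and q must have matching shapes")
    q = q / q.sum()
    return float(np.sum(z * q))

def binding_free_energy(avg_pl, avg_p, avg_l, temperature=298.15):
    """Free energy from a ratio of averaged partition functions
    (assumed form, no volume prefactors)."""
    return -R_KCAL * temperature * math.log(avg_pl / (avg_p * avg_l))
```

For example, `binding_free_energy(ensemble_average(z_pl, q_pl), ensemble_average(z_p, q_p), ensemble_average(z_l, q_l))` would combine three hypothetical matrix pairs into one gas-phase estimate.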
- The change in the solvation energy upon binding is computed in a similar manner. To illustrate this, we describe how we obtain the solvation free energy of the ligand, which is one component of ΔG_solv; the other terms can be derived by extension.
- the ligand solvation free energy is obtained by decomposing the ligand-solvent bulk energy into the free ligand energy E L (r), the ligand-solvent polar interaction energy E psol (r), and the ligand-solvent non-polar interaction energy E npsol (r):
- $$E_{LS}(r) = E_{L}(r) + E_{psol}(r) + E_{npsol}(r) \qquad (36)$$
- Solvent was approximated as a shell of even thickness around the ligand, in which the water molecules are evenly distributed.
- the solvent shell thickness was 6 Å.
- the inner surface of the shell was 1.6 Å away from the ligand surface, which approximates the radius of a water molecule.
- the ligand-solvent polar interaction was considered as a surface (solvent accessible polar surface of the ligand)—surface (solvent bulk layer surface at a certain distance away from ligand) interaction, instead of a point-point interaction, i.e. atom pairwise interaction.
- the ligand polar atom-solvent interaction energy was modeled as the solvent accessible buried area (SABA) of the ligand polar atoms multiplied by the polar atom-oxygen interaction energy terms taken from KECSA, to simulate the ligand-solvent surface interaction energy. All SABA-weighted interaction energies along the solvent shell thickness, sampled with a 0.005 Å increment, were collected and stored.
- The ligand-solvent polar interaction Boltzmann factor matrix is then derived using Equation 38, covering all ligand polar atoms up to m.
- the distribution frequency matrices were not included in ligand-solvent energy calculation because the radial distribution function is approximated as being identical along all ligand-solvent distances (i.e. a featureless continuum).
- FIG. 2 further illustrates the modeling of the ligand-solvent polar interaction.
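- $$\mathbb{Z}_{psol} = \mathbb{Z}_{1}^{psol} \otimes \left(\mathbb{Z}_{2}^{psol}\right)^{T} \otimes \left(\mathbb{Z}_{3}^{psol}\right)^{T} \cdots \left(\mathbb{Z}_{k}^{psol}\right)^{T} \cdots \left(\mathbb{Z}_{m}^{psol}\right)^{T} \qquad (38)$$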
- FIG. 2 depicts the modelling of ligand-solvent polar interaction using a Boltzmann factor multiplier.
- a carbonyl oxygen atom is used as an example here.
- the green surface shows the solvent accessible surface of the ligand (inner layer of the solvent shell).
- the surface consisting of blue dots represents the outer boundary surface of the solvent shell.
- the non-polar atom buried area (NABA) is used to simulate the interactions between the non-polar atoms and aqueous solvent, because the interaction energy between non-polar atoms and water molecules has a weaker response to changes in distance.
- $$\mathbb{Z}_{L}^{solv} = \mathbb{Z}_{L} \otimes \mathbb{Z}_{psol} \otimes \mathbb{Z}_{npsol} \qquad (40)$$
- the solvation free energy was not fit to experimental solvation free energies and was found to have a small influence on the final binding free energies for the protein-ligand complexes. Future work will fit these models to small molecule solvation free energies, but for the present application the solvation model was used as formulated above.
- the binding free energy in solution can be generated using:
- A test set containing 795 protein-ligand complexes was chosen from the PDBbind v2011 refined dataset based on the following criteria:
- MT KECSA calculations show improvements in Pearson's r, Kendall τ and RMSE (Root-Mean-Square Error) when compared to the original KECSA model (Table 1). Importantly, judging from the slope and intercept of both calculations versus experimental data, MT KECSA (with slope of 0.85 and intercept of 0.14) better reproduces the binding affinities in the low and high affinity regions than the original KECSA model (with slope of 0.27 and intercept of 3.57). In the original KECSA approach, the entropy terms were empirically trained; thus, its test results demonstrate training-set dependence to some degree.
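The comparison statistics named above (Pearson's r, Kendall τ, RMSE, and the slope and intercept of calculated versus experimental values) can be computed as in the following sketch. The function names are hypothetical, the Kendall τ shown is the simple tau-a variant without tie correction, and the toy arrays are made up for illustration; they are not values from Table 1.

```python
from itertools import combinations

import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    x, y = list(x), list(y)
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs

def rmse(calculated, experimental):
    """Root-mean-square error between two equally long sequences."""
    diff = np.asarray(calculated, float) - np.asarray(experimental, float)
    return float(np.sqrt(np.mean(diff ** 2)))

def slope_intercept(experimental, calculated):
    """Least-squares line: calculated = slope * experimental + intercept."""
    slope, intercept = np.polyfit(experimental, calculated, 1)
    return float(slope), float(intercept)

# Made-up pKd values, for illustration only.
experimental = [1.0, 2.0, 3.0, 4.0]
calculated = [1.5, 2.5, 3.5, 4.5]
```

A slope near 1 with a small intercept indicates that both the low and the high affinity ends are reproduced, which is the point made about MT KECSA above.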
- FIG. 3 depicts plots of MT KECSA (left) and the original KECSA model (right) calculated pK d or pK i values vs. experimental pK d or pK i values.
- Grid based methods and their graphical representation have had a long tradition in computer-aided drug design.
- COMFA creates a field describing the chemical nature of the active site pocket
- the GRID algorithm uses a grid based approach to aid in molecular docking and has been adopted by other docking programs (e.g., GLIDE).
- the very nature of the MT method permits the ready generation of “heat maps” describing the chemical nature of the grid points created in the MT method. These can be used to describe pairwise interactions between the grid-point and the protein environment (e.g., amide hydrogen with a carbonyl oxygen) or interactions can be lumped into nonpolar or polar interactions describing the aggregate collection of polar and non-polar pairwise interactions. Not only does this describe the nature of the grid points it also indicates regions where specific atoms should be placed to optimize binding affinity.
- the MT heat maps represent the probability-weighted interaction energy on each grid point.
- Knowledge-based data i.e., the probability distribution along the interacting distance
- energy gradient maps can be generated based on heat map energy calculations, which facilitates ligand docking as described below.
- the advantage of the MT method is that the energy and the free energy (when introducing the partition function) can be derived using only atomic linkage information coupled with the databases of atom pairwise distance distributions along with their corresponding energies. This offers us a new approach for protein-ligand docking without resorting to exhaustive pose sampling. Our initial efforts utilized the frozen receptor model, but the incorporation of receptor flexibility is, in principle, straightforward and will be explored in the future.
- the best-docked pose for the ligand is usually obtained based on the highest binding affinity, which can be regarded as an optimization problem.
- generation of the “best” docking pose is a gradient optimization of the ligand atoms within the binding pocket, subject to the constraints of the ligand topology.
- Optimum ligand atom locations are obtained when the calculation satisfies the minimum values for all the objective functions (ligand torsions and protein-ligand long range interactions) and all ligand bonds and angle constraints.
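The idea of moving an atom downhill on its energy map can be illustrated with a discrete steepest-descent walk on a 3-D grid. This is a simplified sketch under stated assumptions: it moves a single atom, ignores the bond, angle and torsion constraints of the ligand, and uses a made-up quadratic energy grid. The function name `descend` is hypothetical and the source's actual optimizer is not specified here.

```python
from itertools import product

import numpy as np

def descend(energy, start):
    """Walk from `start` to a local minimum of a 3-D energy grid by
    repeatedly stepping to the lowest-energy neighboring grid point."""
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    pos = tuple(start)
    while True:
        best = pos
        for o in offsets:
            nb = tuple(p + d for p, d in zip(pos, o))
            inside = all(0 <= c < s for c, s in zip(nb, energy.shape))
            if inside and energy[nb] < energy[best]:
                best = nb
        if best == pos:          # no lower neighbor: local minimum reached
            return pos
        pos = best

# Toy heat map: a single quadratic well centered at grid point (3, 2, 4).
idx = np.indices((7, 7, 7))
toy_energy = (idx[0] - 3) ** 2 + (idx[1] - 2) ** 2 + (idx[2] - 4) ** 2
```

On a map with several wells, different starting points end in different local minima, which is how alternative low energy poses can arise.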
- FIG. 4 introduces the process of the heatmap docking. To illustrate the method in detail one example whose structure is 1LI2 will be discussed. Heatmap docking against the previously introduced test set of 795 protein ligand complexes has also been conducted and this will be summarized below.
- FIG. 4 depicts MT energy maps optimization mechanism to derive the final docking pose in one protein ligand complex.
- the protein-ligand complex with PDB ID 1LI2 is used as an example to illustrate in detail the process of heatmap docking.
- 1LI2 is a T4 Lysozyme mutant bound to phenol with a modest binding affinity of 4.04 (pK d ).
- the binding pocket region is larger than the small phenol ligand structure (see FIG. 5 ), potentially allowing several ligand poses that represent local minima.
- phenol as the ligand, has a simple enough structure to clearly show the differences in protein-ligand contacts between low energy poses.
- phenol forms a hydrogen bond with GLN102A, and several hydrophobic contacts with VAL87A, TYR88A, ALA99A, VAL111A and LEU118A in the binding pocket (shown in FIG. 5 ).
- FIG. 5 depicts a contact map of the 1LI2 protein-ligand complex binding region. Hydrophobic contacts are shown as dashed lines and the one hydrogen bond is shown between the phenol oxygen atom (central striped atom) and the GLN102A residue as another dashed line. The binding pocket cavity is encircled by a solid line.
- FIG. 6 depicts heat maps for sp³ oxygen (left) and aromatic carbon (right). Grid points with lighter color indicate energetically favorable locations for certain atom types within the binding pocket.
- the heatmap docking program then placed one sp³ oxygen and six aromatic carbons at their optimized positions, following the gradients on their corresponding energy heat maps while satisfying the linkage constraints of phenol.
- GM: energetic global minimum ligand pose.
- three more local minimum poses (poses a, b and c) were generated using the heatmap docking method.
- RMSD values (Å) and binding scores (pK d ) are shown in Table 2.
- the GM pose slightly deviates from the crystal structure (CS) because of the adjustment of the hydrogen bond distance between the phenol oxygen and the sp² oxygen on GLN102A in the MT KECSA calculation.
- the phenol benzene ring balances the contacts with ALA99A and TYR88A on one side and the contacts with LEU118A, VAL87A and LEU84A on the other.
- the local minimum poses c and b have binding scores close to that of the GM pose. They form hydrogen bonds with different hydrogen acceptors (ALA99A backbone oxygen for pose c and LEU84A backbone oxygen for pose b) while maintaining very similar benzene ring locations.
- the local minimum pose a is trying to form a hydrogen bond with ALA99A backbone oxygen.
- the benzene ring of local minimum pose a is tilted towards the LEU118A, VAL87A and LEU84A side chain carbons, weakening the hydrogen bond with the ALA99A backbone oxygen with the net result being a reduction in binding affinity.
- FIG. 7 shows the binding pocket of protein-ligand complex 1LI2. The ligand crystal structure (marked as CS) is shown as a stick & ball, and the global minimum pose (marked as GM) is shown as a stick along with the three other identified local minima (marked as a, b, and c).
- Red bubbles on the protein atoms indicate potential contacts with the ligand sp³ oxygen.
- Grey bubbles on the protein atoms indicate potential contacts with aromatic carbons.
- the new approach detailed in this disclosure is one that in one-shot samples all the relevant degrees of freedom in a defined region directly affording a free energy without resorting to ad hoc modeling of the entropy associated with a given process. This is accomplished by converting ensemble assembly from a 3-D to a 1-D problem by using pairwise energies of all relevant interactions in a system coupled with their probabilities.
- KECSA Knowledge-based and Empirical Combined Scoring Algorithm
- the Movable Type (MT) method numerically simulates the local partition functions utilizing the Monte Carlo integration (MCI) given the initial structures from a canonical ensemble.
- the MCI method is a widely used numerical approach for free energy calculation. By simulating the integral of the canonical partition function instead of generating enthalpy and entropy values separately, the MCI method avoids expensive and poorly converging entropy calculations.
- the simulation comprises calculation of the following equation:
- the development of the MT algorithm is inspired by the idea of the MCI approach expressed in equation 45, where the Helmholtz free energy is simulated using the average of the sampled energy states multiplied by the actual sampling volume.
- the distinctive feature of the MT method is that it numerically simulates the average of the local partition function given a defined sampling volume centered around an initial structure, instead of searching among actual physical structures within that defined volume.
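The Monte Carlo integration idea (average of sampled Boltzmann factors multiplied by the sampling volume) can be demonstrated on a one-dimensional toy system. This sketch is not the MT algorithm itself: the harmonic energy, reduced units with β = 1, the sampling interval and the function names are all assumptions made so that the estimate can be compared with a closed-form integral.

```python
import math

import numpy as np

def mci_partition_function(energy, low, high, n_samples, beta=1.0, seed=0):
    """Estimate Z = integral of exp(-beta * E(x)) over [low, high] as
    (sampling volume) * (mean Boltzmann factor of uniform samples)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(low, high, size=n_samples)
    volume = high - low
    return volume * float(np.mean(np.exp(-beta * energy(x))))

def helmholtz_free_energy(z, beta=1.0):
    """A = -(1/beta) * ln Z."""
    return -math.log(z) / beta

def harmonic(x, k=1.0):
    """Toy energy function E(x) = k x^2 / 2 (reduced units)."""
    return 0.5 * k * x ** 2

z_est = mci_partition_function(harmonic, -5.0, 5.0, 200_000)
# Closed form for the same interval: sqrt(2*pi) * erf(5 / sqrt(2))
z_exact = math.sqrt(2.0 * math.pi) * math.erf(5.0 / math.sqrt(2.0))
```

The free energy follows directly from the estimated integral, so no separate entropy term has to be computed.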
- a matrix-based random sampling strategy combining all atom pairwise potentials was introduced for each target molecular system, in which all pairwise potentials are regarded as orthogonal and totally random combinations among atom pairwise distances are performed within a small sampling range (in one embodiment, ±0.5 Å for every atom pairwise distance).
- the so-generated hyper-dimensional energy states were associated with pre-modeled structural weighting factors and averaged over their sampling magnitude C^N, where C is the defined sampling range (for example, ±0.5 Å) and N is the pairwise contact number.
- FIG. 19 is a block diagram illustrating components of a machine, according to some example embodiments, able to read instructions from a machine-readable medium and perform any one or more of the methodologies discussed herein.
- any of the machines, databases, or devices shown or discussed herein may be implemented in a general-purpose computer modified (e.g., configured or programmed) by software (e.g., one or more software modules) to be a special-purpose computer to perform one or more of the functions described herein for that machine, database, or device.
- a computer system able to implement any one or more of the methodologies described herein is discussed below with respect to FIG. 19 .
- a “database” is a data storage resource and may store data structured as a text file, a table, a spreadsheet, a relational database (e.g., an object-relational database), a triple store, a hierarchical data store, or any suitable combination thereof.
- any two or more of the machines, databases, or devices illustrated in FIG. 19 may be combined into a single machine, and the functions described herein for any single machine, database, or device may be subdivided among multiple machines, databases, or devices.
- FIG. 19 is a block diagram illustrating components of a machine 1100 , according to some example embodiments, able to read instructions 1124 from a machine-readable medium 1122 (e.g., a non-transitory machine-readable medium, a machine-readable storage medium, a computer-readable storage medium, or any suitable combination thereof) and perform any one or more of the methodologies discussed herein, in whole or in part.
- FIG. 19 shows the machine 1100 in the example form of a computer system (e.g., a computer) within which the instructions 1124 (e.g., software, a program, an application, an applet, an app, or other executable code) for causing the machine 1100 to perform any one or more of the methodologies discussed herein may be executed, in whole or in part.
- the machine 1100 operates as a standalone device or may be connected (e.g., networked) to other machines.
- the machine 1100 may operate in the capacity of a server machine or a client machine in a server-client network environment, or as a peer machine in a distributed (e.g., peer-to-peer) network environment.
- the machine 1100 may be a server computer, a client computer, a personal computer (PC), a tablet computer, a laptop computer, a netbook, a cellular telephone, a smartphone, a set-top box (STB), a personal digital assistant (PDA), a web appliance, a network router, a network switch, a network bridge, or any machine capable of executing the instructions 1124 , sequentially or otherwise, that specify actions to be taken by that machine.
- the machine 1100 includes a processor 1102 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), an application specific integrated circuit (ASIC), a radio-frequency integrated circuit (RFIC), or any suitable combination thereof), a main memory 1104 , and a static memory 1106 , which are configured to communicate with each other via a bus 1108 .
- the processor 1102 may contain microcircuits that are configurable, temporarily or permanently, by some or all of the instructions 1124 such that the processor 1102 is configurable to perform any one or more of the methodologies described herein, in whole or in part.
- a set of one or more microcircuits of the processor 1102 may be configurable to execute one or more modules (e.g., software modules) described herein.
- the machine 1100 may further include a graphics display 1110 (e.g., a plasma display panel (PDP), a light emitting diode (LED) display, a liquid crystal display (LCD), a projector, a cathode ray tube (CRT), or any other display capable of displaying graphics or video).
- the machine 1100 may also include an alphanumeric input device 1112 (e.g., a keyboard or keypad), a cursor control device 1114 (e.g., a mouse, a touchpad, a trackball, a joystick, a motion sensor, an eye tracking device, or other pointing instrument), a storage unit 1116 , an audio generation device 1118 (e.g., a sound card, an amplifier, a speaker, a headphone jack, or any suitable combination thereof), and a network interface device 1120 .
- the storage unit 1116 includes the machine-readable medium 1122 (e.g., a tangible and non-transitory machine-readable storage medium) on which are stored the instructions 1124 embodying any one or more of the methodologies or functions described herein.
- the instructions 1124 may also reside, completely or at least partially, within the main memory 1104 , within the processor 1102 (e.g., within the processor's cache memory), or both, before or during execution thereof by the machine 1100 . Accordingly, the main memory 1104 and the processor 1102 may be considered machine-readable media (e.g., tangible and non-transitory machine-readable media).
- the instructions 1124 may be transmitted or received over the network 190 via the network interface device 1120 .
- the network interface device 1120 may communicate the instructions 1124 using any one or more transfer protocols (e.g., hypertext transfer protocol (HTTP)).
- the network 190 may be any network that enables communication between or among machines, databases, and devices (e.g., the server machine 110 and the device 130 ). Accordingly, the network 190 may be a wired network, a wireless network (e.g., a mobile or cellular network), or any suitable combination thereof. The network 190 may include one or more portions that constitute a private network, a public network (e.g., the Internet), or any suitable combination thereof.
- the network 190 may include one or more portions that incorporate a local area network (LAN), a wide area network (WAN), the Internet, a mobile telephone network (e.g., a cellular network), a wired telephone network (e.g., a plain old telephone system (POTS) network), a wireless data network (e.g., WiFi network or WiMax network), or any suitable combination thereof. Any one or more portions of the network 190 may communicate information via a transmission medium.
- transmission medium refers to any intangible (e.g., transitory) medium that is capable of communicating (e.g., transmitting) instructions for execution by a machine (e.g., by one or more processors of such a machine), and includes digital or analog communication signals or other intangible media to facilitate communication of such software.
- the machine 1100 may be a portable computing device, such as a smart phone or tablet computer, and have one or more additional input components 1130 (e.g., sensors or gauges).
- additional input components 1130 include an image input component (e.g., one or more cameras), an audio input component (e.g., a microphone), a direction input component (e.g., a compass), a location input component (e.g., a global positioning system (GPS) receiver), an orientation component (e.g., a gyroscope), a motion detection component (e.g., one or more accelerometers), an altitude detection component (e.g., an altimeter), and a gas detection component (e.g., a gas sensor).
- Inputs harvested by any one or more of these input components may be accessible and available for use by any of the modules described herein.
- the term “memory” refers to a machine-readable medium able to store data temporarily or permanently and may be taken to include, but not be limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, and cache memory. While the machine-readable medium 1122 is shown in an example embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store instructions.
- machine-readable medium shall also be taken to include any medium, or combination of multiple media, that is capable of storing the instructions 1124 for execution by the machine 1100 , such that the instructions 1124 , when executed by one or more processors of the machine 1100 (e.g., processor 1102 ), cause the machine 1100 to perform any one or more of the methodologies described herein, in whole or in part.
- a “machine-readable medium” refers to a single storage apparatus or device, as well as cloud-based storage systems or storage networks that include multiple storage apparatus or devices.
- machine-readable medium shall accordingly be taken to include, but not be limited to, one or more tangible (e.g., non-transitory) data repositories in the form of a solid-state memory, an optical medium, a magnetic medium, or any suitable combination thereof.
- Modules may constitute software modules (e.g., code stored or otherwise embodied on a machine-readable medium or in a transmission medium), hardware modules, or any suitable combination thereof.
- a “hardware module” is a tangible (e.g., non-transitory) unit capable of performing certain operations and may be configured or arranged in a certain physical manner.
- one or more computer systems e.g., a standalone computer system, a client computer system, or a server computer system
- one or more hardware modules of a computer system e.g., a processor or a group of processors
- software e.g., an application or application portion
- a hardware module may be implemented mechanically, electronically, or any suitable combination thereof.
- a hardware module may include dedicated circuitry or logic that is permanently configured to perform certain operations.
- a hardware module may be a special-purpose processor, such as a field programmable gate array (FPGA) or an ASIC.
- a hardware module may also include programmable logic or circuitry that is temporarily configured by software to perform certain operations.
- a hardware module may include software encompassed within a general-purpose processor or other programmable processor. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
- hardware module should be understood to encompass a tangible entity, and such a tangible entity may be physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner or to perform certain operations described herein.
- “hardware-implemented module” refers to a hardware module. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where a hardware module comprises a general-purpose processor configured by software to become a special-purpose processor, the general-purpose processor may be configured as respectively different special-purpose processors (e.g., comprising different hardware modules) at different times. Software (e.g., a software module) may accordingly configure one or more processors, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
- Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) between or among two or more of the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
- processors may be temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions described herein.
- processor-implemented module refers to a hardware module implemented using one or more processors.
- processor-implemented module refers to a hardware module in which the hardware includes one or more processors.
- processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS).
- At least some of the operations may be performed by a group of computers (as examples of machines including processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., an application program interface (API)).
- the performance of certain operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines.
- the one or more processors or processor-implemented modules may be located in a single geographic location (e.g., within a home environment, an office environment, or a server farm). In other example embodiments, the one or more processors or processor-implemented modules may be distributed across a number of geographic locations.
- the canonical ensemble is used in computing Helmholtz free energies in Equation 2i. Because both methane and butane have no polar atoms, and their accessible surface areas have a negligible change after association, we can ignore computing ΔG_solv, further simplifying this example.
- the binding free energy is given as a ratio of partition functions:
- V_pocket: translational DoF for the bound ligands.
- C: translational DoF of the free ligand.
- a small “receptor” like butane, when compared to any protein molecule, places a greatly reduced constraint on the translational movement of a “ligand” like methane, making V_pocket ≈ C.
- the volume factor “a” is also approximately equal to 8, for small molecules because the small receptor affords more accessible volume than found in protein-ligand systems.
- In Equation 7i, E is the microstate energy within an ensemble average, and on the right hand side each “j” indicates an atom pair in the molecule.
- the atom pairwise energy is zero in methane, making the energy of methane equal to the energy of a single sp 3 carbon atom.
- Butane has four sp 3 carbon atoms in a chain, making three sp 3 carbon-sp 3 carbon bonds, two sp 3 carbon-sp 3 carbon angles, and one sp 3 carbon-sp 3 carbon torsion. Sampling along the distance for all interactions and gathering all possible combinations generates all the desirable energy terms in Equation 10i.
- each of the three sp³ carbon-sp³ carbon bond partition functions can be modeled as
- 1.53 Å is the distance at which the sp³ carbon-sp³ carbon bond energy reaches its minimum and 4.195 is a constant that adjusts the energy baseline to zero.
- FIG. 8 shows the sp³ carbon-sp³ carbon bond probability distribution and exponential energy vs. atom pairwise distance.
- Sampling of bond energies includes all distances that derive non-zero partition function values, which range from 1.72 Å to 1.99 Å in the sp³ carbon-sp³ carbon bond interaction case.
- the product over all three bond-linked sp³ carbon-sp³ carbon pairs gives the total bond partition function in butane (Equation 13i).
- bonds, angles, torsions and long-range interactions are all formulated using the atom pairwise distance, because (1) the units of the variables then all agree with each other in the matrix calculation and (2) a database with the lowest dimension, i.e. a one-dimensional database based on atom pairwise distance, is much easier to manipulate than databases with unmatched dimensions, e.g. an angle-based database has two dimensions with three atoms, and a torsion-based database has three dimensions with four atoms. To avoid such mismatched dimensions, all angles and torsions are represented by atom pairwise distances, as described below.
- the torsion angle is modeled using an atom pairwise distance x with the help of the bond lengths d_1, d_2, d_3 and the two bond angles.
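The mapping from a torsion angle to a single atom pairwise distance can be illustrated with elementary geometry. The following is a minimal sketch, not the patent's own formula: the function name `one_four_distance`, the coordinate construction, and the tetrahedral bond angle of 109.5° used in the example are assumptions made for illustration; only the 1.53 Å bond length comes from the text.

```python
import math

def one_four_distance(d1, d2, d3, theta1_deg, theta2_deg, phi_deg):
    """Distance between atoms 1 and 4 of a bonded chain 1-2-3-4.

    d1, d2, d3  : bond lengths 1-2, 2-3 and 3-4 (angstrom)
    theta1_deg  : bond angle 1-2-3 (degrees)
    theta2_deg  : bond angle 2-3-4 (degrees)
    phi_deg     : torsion about the 2-3 bond (0 = cis, 180 = trans)
    """
    t1 = math.radians(theta1_deg)
    t2 = math.radians(theta2_deg)
    phi = math.radians(phi_deg)
    # Atom 2 sits at the origin, atom 3 on the +x axis, atom 1 in the xy-plane.
    atom1 = (d1 * math.cos(t1), d1 * math.sin(t1), 0.0)
    # Atom 4 is rotated out of the xy-plane by the torsion angle.
    atom4 = (d2 - d3 * math.cos(t2),
             d3 * math.sin(t2) * math.cos(phi),
             d3 * math.sin(t2) * math.sin(phi))
    return math.dist(atom1, atom4)

# Hypothetical butane-like example: 1.53 A bonds, assumed 109.5 degree angles.
cis = one_four_distance(1.53, 1.53, 1.53, 109.5, 109.5, 0.0)
trans = one_four_distance(1.53, 1.53, 1.53, 109.5, 109.5, 180.0)
```

Because the 1-4 distance grows monotonically from the cis to the trans arrangement, each torsion value in that half-range corresponds to exactly one distance, which is what allows a distance-indexed database to stand in for a torsion-indexed one.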
- when used in the sp³ carbon-(sp³ carbon-sp³ carbon)-sp³ carbon torsion case, the parameters are given in Equation 21i:
- the probability distribution and exponential energy vs. atom pairwise distance for the sp³ carbon-(sp³ carbon-sp³ carbon)-sp³ carbon torsion are shown in FIG. 12.
- the ensemble averaged partition function of butane can then be derived with Equation 24i, including all the butane interaction matrices (ℚ_butane and ℤ_butane) together with four sp³ carbon single atom energies.
- e E butane Q butane ⁇ Z butane ⁇ e 4E sp 3 Carbon (24i)
- Methane has only one sp³ carbon single atom energy partition function and no interaction matrix, while butane has four sp³ carbon single atom energy partition functions, a probability matrix ℚ_butane and an atom pairwise interaction partition function matrix ℤ_butane.
- Q 1 ⁇ 2 butane bond ⁇ butane angle ⁇ butane torsion ⁇ butane long-range ⁇ 1 ⁇ 2 long-range (27i)
- the gas-phase binding free energy can be derived using Equation 30i.
- These fixed-size matrices for all atom pairs are termed “Standard Matrices”. The original atom pair vectors, with tens to hundreds of elements (shown in Equation 1m below), are far too small for the final matrix size. Construction of the “Standard Matrices” therefore relies on replication and tiling of the original atom pair vectors: the vectors for each individual atom pairwise Boltzmann factor and probability are replicated in the “Standard Matrices” through all atom pairs. The vector-to-matrix conversion requires randomly scrambled permutations of the original vectors. Introducing permutations of the original vector increases the diversity of atom pair combinations at different discrete distance values (r_a) in the MT computation, thereby increasing the sample size. A detailed explanation follows in the paragraphs below.
$$\mathrm{SM}_k^{bond} = \begin{bmatrix} scram(X)_1 & scram(X)_{\gamma+1} & \cdots & scram(X)_{\delta-\gamma+1} \\ scram(X)_2 & scram(X)_{\gamma+2} & \cdots & scram(X)_{\delta-\gamma+2} \\ \vdots & \vdots & \ddots & \vdots \\ scram(X)_{\gamma} & scram(X)_{2\gamma} & \cdots & scram(X)_{\delta} \end{bmatrix} \quad (2m)$$
- SM_k^bond in Equation 2m is the “Standard Matrix” we built for the kth atom pair with a bond constraint. Although the sizes (the number t in Equation 1m) of the different vectors vary under different constraints (i.e. bond, angle, torsion and long-range), the SM for all atom pairs was fixed to the same size with predetermined permutation numbers γ and δ.
- the size of the Standard Matrix (SM), e.g. g rows × h columns, must be such that the row number g is divisible by the sizes t of all the atom pair vectors, so that each discrete probability q_k(r_i) appears an equal number of times in each SM_k.
- This definition is important to make sure the replication numbers for all Boltzmann factors and probabilities are identical in each SM, so that their relative probabilities are the same as in the original probability vector.
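The divisibility requirement on the row number can be checked mechanically. The sketch below is illustrative only: the helper names `common_row_count` and `tile_column`, and the example vector sizes, are assumptions and do not come from the patent. It picks the smallest row count that every vector size divides and confirms that each element of a tiled vector then appears equally often.

```python
from collections import Counter
from functools import reduce
from math import gcd

def common_row_count(vector_sizes):
    """Smallest row number g that is divisible by every vector size t."""
    return reduce(lambda a, b: a * b // gcd(a, b), vector_sizes)

def tile_column(vector, g):
    """Repeat a length-t vector down one column of g rows."""
    t = len(vector)
    if g % t != 0:
        raise ValueError("row number must be a multiple of the vector size")
    return list(vector) * (g // t)

# Hypothetical vector sizes for three atom pairs.
sizes = [5, 4, 10]
g = common_row_count(sizes)

# A length-4 probability vector tiled into a column of g rows.
column = tile_column([0.1, 0.2, 0.3, 0.4], g)
counts = Counter(column)
```

If g were not a multiple of t, the last copy of the vector would be truncated and some probabilities would be over-represented, which is the distortion the definition above rules out.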
- from the bond probability matrix of one specific atom pair k, through the probability matrix of all protein atom pairs with bond constraints (ℚ_P^bond), to the final probability matrix of all atom pairs in the protein-ligand system (ℚ_PL^final), all the matrices have the same size, such that the size of the SMs is the sample size of the atom pair combinations of the protein-ligand system.
- the advantage of using a pointwise product instead of a tensor product is that the size of the final matrix can be controlled at the beginning of the computation.
- Disordered vectors are generated using random scrambling of the original vectors.
- An example of the randomly scrambled vector of the Boltzmann factor with the index number i is shown in Equation 13m.
- the maximum number of permutations is t! (the maximum value of i).
- Each index number i in the scramble operation scram(X) i represents one certain arrangement order of elements in the vector.
- Each bond vector is assumed to contain 5 elements, so 4 scrambled vectors tiled in a column give a row number of 20, and 120 scrambled vectors (30 columns) assemble the SM_k^bond with 600 elements.
- the same scramble and replication processes are performed. Furthermore, a probability value and a Boltzmann factor value corresponding to the same discrete distance (r_a) are mapped to each other in the probability and Boltzmann factor SMs.
- scrambled vectors of probabilities and Boltzmann factors scrambled in the same way are in the same position in both the probability and Boltzmann factor SMs (ℚ_k^bond and ℤ_k^bond).
- the mapping of probability and Boltzmann factor vectors in the SM ℚ_k^bond and SM ℤ_k^bond is illustrated in FIG. 13.
- FIG. 13 depicts an example of the SM ℤ_k^bond and its corresponding SM ℚ_k^bond.
- the scramble operator index numbers with red circles connected by blue arrows indicate that scrambled vectors with the same scramble manner (the same index number i) in both ℤ_k^bond and ℚ_k^bond are tiled in the same position in both SMs.
$$\mathrm{SM}_k^{bond} = \begin{bmatrix} scram(X)_1 & scram(X)_5 & \cdots & scram(X)_{117} \\ scram(X)_2 & scram(X)_6 & \cdots & scram(X)_{118} \\ scram(X)_3 & scram(X)_7 & \cdots & scram(X)_{119} \\ scram(X)_4 & scram(X)_8 & \cdots & scram(X)_{120} \end{bmatrix} \quad (16m)$$
- SMs for the kth (one of the three) sp³ carbon-sp³ carbon bond in propane are modeled.
- 120 scrambled vectors represent 120 different scrambled permutations of the Boltzmann factor and probability vectors, tiled in a pattern from 1 through 120.
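The 5-element, 4-per-column, 120-permutation example can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the function name `standard_matrix`, the use of `itertools.permutations` in place of random scrambling, and the numerical values of the probability and Boltzmann factor vectors are all hypothetical. It tiles scramble number 1 to 4 down the first column, 5 to 8 down the second, and so on, and applies the same scrambles to both vectors so that paired values land in the same position.

```python
from itertools import permutations

def standard_matrix(vector, orders, per_column):
    """Tile scrambled copies of `vector` into a matrix (list of rows).

    orders     : sequence of index permutations, one per scrambled vector
    per_column : number of scrambled vectors stacked in each column
    """
    t = len(vector)
    n_cols = len(orders) // per_column
    sm = [[None] * n_cols for _ in range(per_column * t)]
    for idx, order in enumerate(orders):
        col, slot = divmod(idx, per_column)   # column-major tiling
        for offset, source in enumerate(order):
            sm[slot * t + offset][col] = vector[source]
    return sm

# Hypothetical 5-element probability (q) and Boltzmann factor (z) vectors.
q = [0.05, 0.15, 0.40, 0.30, 0.10]
z = [1.2, 3.4, 9.8, 5.6, 2.1]

orders = list(permutations(range(5)))   # 5! = 120 scrambles
sm_q = standard_matrix(q, orders, per_column=4)
sm_z = standard_matrix(z, orders, per_column=4)
```

Using the same `orders` for both vectors is what keeps each probability aligned with the Boltzmann factor of the same discrete distance.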
- SMs for the lth atom pair (1 ≤ l ≤ 2, l ∈ ℕ, l ≠ k) are modeled in a similar way, but with different tiling sequences for the scram(X)_i vectors in both SMs.
- tiling of the scram(X) i should use a different pattern.
- a possible l bond with 120 scrambled vectors could be:
- the maximum permutation number for a vector with t elements is t!. So there are around 10^30 scrambled permutations for a bond vector containing about 30 elements, with which we could easily design thousands of SMs for the lth atom pair of a certain atom type pair using different tiling patterns. Using different tiling patterns for different atom type pairs increases the mix and match diversity of atom pairs at different discrete distance values (r_a) in the MT computation, and maximizes the degrees of freedom of the elements (shown in Equation 18m) in the pointwise product of the SMs of these two atom pairs (k and l).
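The effect of the tiling pattern on sampling diversity can be seen by tracking which distance indices meet at each position of the pointwise product. The sketch below is an illustration under assumptions: `index_matrix`, `paired_indices` and the cyclic shift used as the "different tiling pattern" are hypothetical choices, and the matrices hold distance indices rather than energies.

```python
from itertools import permutations

def index_matrix(orders, per_column):
    """Standard-Matrix layout holding distance indices instead of values."""
    t = len(orders[0])
    n_cols = len(orders) // per_column
    sm = [[None] * n_cols for _ in range(per_column * t)]
    for idx, order in enumerate(orders):
        col, slot = divmod(idx, per_column)
        for offset, source in enumerate(order):
            sm[slot * t + offset][col] = source
    return sm

def paired_indices(sm_k, sm_l):
    """Set of (r_a of pair k, r_b of pair l) combinations met position by position."""
    return {(a, b)
            for row_k, row_l in zip(sm_k, sm_l)
            for a, b in zip(row_k, row_l)}

orders = list(permutations(range(5)))
shifted = orders[1:] + orders[:1]          # assumed alternative tiling sequence

same_tiling = paired_indices(index_matrix(orders, 4), index_matrix(orders, 4))
mixed_tiling = paired_indices(index_matrix(orders, 4), index_matrix(shifted, 4))
```

With identical tiling, pair k at distance index a only ever meets pair l at the same index a, so the product samples just 5 combinations; a different tiling sequence lets the two pairs meet at unequal distances as well.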
- a protein-ligand complex would create several thousand SMs on average.
- on a laptop with an Intel(R) Core(TM) i7 CPU with 8 cores at 1.73 GHz and 8 GB of RAM, it takes 6 seconds to calculate the pose and binding affinity for the protein-ligand complex 1LI2 and, on average, less than a minute to calculate the pose and binding free energy of one of the 795 protein-ligand complexes studied herein.
- the SM size is increased to 7 × 10^10
- the computation time required for 1LI2 increases to 8 minutes and, on average, it increases to around 20 minutes using the same laptop.
- this approach is faster than using MD or MC simulations to collect the energies of 7 × 10^5 to 7 × 10^10 protein-ligand poses. Future speed-ups are clearly possible using state of the art CPUs and GPUs and this is work that is underway.
- ρ_ij(r) is the number density for the atom pairs of types i and j observed in the known protein or ligand structures and ρ*_ij(r) is the number density of the corresponding pair in the background or reference state.
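The way such densities are conventionally turned into energies can be sketched with the inverse Boltzmann relation, E_ij(r) = −RT ln(ρ_ij(r)/ρ*_ij(r)). The governing equation is not reproduced in this excerpt, so the relation used here is the standard textbook form and should be read as an assumption; the function name and the handling of empty bins are likewise hypothetical.

```python
import math

# Gas constant in kcal/(mol K) times room temperature in K.
RT = 0.0019872 * 298.15

def statistical_potential(rho_observed, rho_reference, rt=RT):
    """Per-bin energy from observed and reference number densities.

    Returns None for bins where either density is zero, because the
    logarithm is undefined there.
    """
    energies = []
    for obs, ref in zip(rho_observed, rho_reference):
        if obs > 0 and ref > 0:
            energies.append(-rt * math.log(obs / ref))
        else:
            energies.append(None)
    return energies

# Hypothetical densities for four distance bins.
energies = statistical_potential([0.0, 0.5, 2.0, 1.0], [1.0, 1.0, 1.0, 1.0])
```

Bins that are over-populated relative to the reference state come out with negative (favorable) energies, and under-populated bins with positive ones.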
- a central problem for statistical potentials is to model specific atom pairwise interactions removed from the background energy.
- geometric information i.e. atom pairwise radial distributions, represents an averaged effect of all interactions in chemical space, including bond, angle, torsion, and long-range non-covalent forces. Converting these radial distributions into energy functions is a challenge.
- Equation 2 represents the free energy change of transferring a molecule (L) from vacuum to aqueous solution.
- Many sampling methods have proven effective, e.g. molecular dynamics (MD) and Monte Carlo (MC) methods; however, thoroughly sampling phase space is challenging for brute-force methods.
- a new sampling method which we call the Movable Type (MT) method, was developed by our group in an attempt to avoid some of the pitfalls encountered by the more computationally intensive sampling methods. Via sampling of all atom pairwise energies, at all possible distances, using pre-built databases and then combining these energies for all atom pairs found in the molecular system of interest, the MT sampling method was able to accurately estimate binding free energies as well as protein-ligand poses.
- Movable Type (MT) method is a free energy method that generates the ensemble of the molecular system of interest using pairwise energies and probabilities.
- the term “Movable Type” originates from the printing technique where a database of symbols (letters, numerals, etc.) is created and then assembled using a movable type system.
- MT free energy calculations start from the construction of a large database containing interaction energies between all classes of atom pairs found in the chemical space under investigation. An atom pairwise energy function is required to create the database and the modified KECSA model is employed herein.
- molecular “printing” is then performed by assembling the pairwise energies using a “printing forme”.
- a fixed-size matrix (Z-matrix) is introduced to represent the Boltzmann-weighted energy ensemble, in which atom pairwise energies at different distances are assembled to simultaneously represent the ensemble and free energies of the chemical space under investigation.
- Z k L (Z-matrix) in equation 3 represents a Boltzmann-weighted energy (Boltzmann factor) matrix for the kth atom pair in the observed molecule L containing energies ranging from distance r 1 to r n .
- An inner product of Z-matrices results in the Boltzmann-weighted energy combinations between different atom pairs at different distances with a sampling size of n (matrix size), as is shown in equation 4n.
- a Q-matrix of atom pairwise radial distribution probabilities is introduced in order to avoid physically unreasonable combinations between different atom pairs at certain bond lengths, angles and torsions.
- the elements in the Q-matrix were collected from a large structural database containing 8256 protein crystal structures from PDBBind v2013 database and 44766 small molecules from both PDBBind v2013 and the CSD small molecule database.
- Q-matrices matching the composition of the corresponding Z-matrix are also assembled using inner products.
- the final Q-matrix for the molecule of interest is normalized before being multiplied by the final Z-matrix, assuring that the overall probability is 1.
- the sum of the final matrix ( total L ) gives the ensemble average of the Boltzmann factors with a sampling size of n (matrix size).
- the energies of different molecular conformations can be generated simultaneously via matrix products over all atom pairs.
- the solvation free energy is then calculated by incorporating the ensemble average of the Boltzmann factors into:
- the free energy is computed directly from the NVT ensemble avoiding issues related to the additivity of the free energy. Theoretically and experimentally it can be shown that the energy can be decomposed, while the entropy and free energies cannot.
- the MT energy sampling method can incorporate both an explicit and implicit water model into a solvation free energy calculation.
- Our previous attempt utilized a simple continuum ligand-solvent interaction model.
- a new semi-continuum water model is developed herein, in which the solute-solvent interaction is calculated by placing water molecules around the solute.
- Water molecules were modeled as isotropic rigid balls with van der Waals radii of 1.6 Å. Water molecules were placed into isometric solute-surrounding solvent layers, starting from the solute's water accessible surface and extending to 8 Å away from the solute's van der Waals surface, with an increment of 0.005 Å per layer. The number of water molecules was limited by comparing their maximum cross-sectional areas with the solvent accessible surface area at each solvent layer for each atom in the solute molecules. The number of water molecules (N_w) accessible to each atom at distance r away from the atomic center of mass is obtained by dividing the atomic solvent accessible surface area (S_a) in the solvent layer at distance r by the maximum cross-sectional area (S_w) of water and rounding down.
$$N_w(r) = \mathrm{floor}\left(\frac{S_a(r)}{S_w}\right) \quad (9n)$$
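Equation 9n is a simple floor division, illustrated below. This is a sketch only: the function name is hypothetical, and the cross-sectional area is taken as the great-circle area of a sphere of radius 1.6 Å, which is an assumption made for the example rather than a formula quoted from the source.

```python
import math

WATER_RADIUS = 1.6                       # van der Waals radius from the text (angstrom)
S_W = math.pi * WATER_RADIUS ** 2        # assumed circular cross-section (angstrom^2)

def accessible_waters(s_a, s_w=S_W):
    """Number of water molecules that fit on an atom's accessible area S_a(r)."""
    return math.floor(s_a / s_w)

# Hypothetical accessible surface areas (angstrom^2) in successive solvent layers.
layer_areas = [8.0, 25.0, 40.3]
layer_counts = [accessible_waters(area) for area in layer_areas]
```

Rounding down means a layer whose accessible area is smaller than one water cross-section contributes no water molecules at all.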
- the maximum cross-sectional area (S_w) of a water molecule is calculated as:
- the Boltzmann factor matrix for the kth solute atom-water (ℤ_k^{A-S}) interaction is defined as a Boltzmann weighted solute atom-water energy multiplied by the number of accessible water molecules at the different distances.
- Multiplication of the Z-matrices for all solute atom-water interactions composes the final solute molecule-water Z-matrix (ℤ_total^{L-S}), which, when multiplied by the Z-matrix for the intra-solute molecular interactions (ℤ_total^L), yields the final Z-matrix for the solute-solvent complex system (ℤ_total^{LS}).
- Multiplication of the final Z-matrix with its corresponding normalized Q-matrix generates the Boltzmann-weighted energy ensemble (ℂ_total^{LS}). With the energy ensembles for the solute molecule (ℂ_total^L) and solute-solvent complex (ℂ_total^{LS}), the solvation free energy is calculated using equation 14n.
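The chain from Z- and Q-matrices to a free energy can be sketched as follows. Equation 14n is not reproduced in this excerpt, so the final step assumes the ratio-of-partition-functions form ΔG_solv = −RT ln(⟨LS⟩/⟨L⟩); the function names, the flat-list matrix representation and the use of a pointwise product are illustrative assumptions as well.

```python
import math

RT = 0.0019872 * 298.15   # kcal/mol at 298.15 K

def pointwise_product(matrices):
    """Element-by-element product of equally sized matrices (flat lists)."""
    result = [1.0] * len(matrices[0])
    for matrix in matrices:
        result = [a * b for a, b in zip(result, matrix)]
    return result

def ensemble_average(z_matrices, q_matrices):
    """Sum of (normalized Q) * Z over all sampled combinations."""
    z_total = pointwise_product(z_matrices)
    q_total = pointwise_product(q_matrices)
    norm = sum(q_total)                       # makes the overall probability 1
    return sum(q / norm * z for q, z in zip(q_total, z_total))

def solvation_free_energy(avg_solvated, avg_solute, rt=RT):
    """Assumed form: -RT ln of the ratio of the two ensemble averages."""
    return -rt * math.log(avg_solvated / avg_solute)

# Hypothetical 4-element matrices for a solute (L) and the solvated system (LS).
avg_l = ensemble_average([[2.0, 2.0, 2.0, 2.0]], [[1.0, 1.0, 1.0, 1.0]])
avg_ls = ensemble_average([[2.0, 2.0, 2.0, 2.0], [2.0, 2.0, 2.0, 2.0]],
                          [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]])
dg_solv = solvation_free_energy(avg_ls, avg_l)
```

Because the Q-matrix is normalized before it multiplies the Z-matrix, the sum is a weighted average of Boltzmann factors rather than a raw total, so it does not grow with the matrix size.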
- FIG. 14 depicts the factors involved in the implicit-solvent model in the KMTISM method.
- a collection of structural information is the first requirement to assemble a statistical potential.
- Many crystal structures in the Cambridge Structural Database (CSD) represent small molecules co-crystallized with water molecules.
- the Protein Data Bank (PDB) also contains a large number of protein-ligand complexes with water molecules at the interface between binding pockets and ligand molecules, albeit the resolution of these structures is poorer than typically encountered in the CSD. Since our goal was to construct a solvation energy model focusing on small molecules, the CSD small molecule database became our primary resource for structural data. In order to data mine the CSD, we (1) only examined structures with an R factor less than 0.1 and (2) excluded all polymer structures and molecules with ions. The resulting data set contained 7281 small molecules surrounded by crystal water molecules.
- Statistical potentials are derived by converting the number density distributions between two atoms or residues to energies; hence, they are “fixed-charge” models for the selected atom or residue pairs.
- a detailed atom type categorization has been employed where 21 atom types (shown in Table 1) were chosen from the database as having statistically relevant water molecule contact information contained within the CSD.
- the pairwise radial distribution is a mixed consequence of direct pairwise contacts and indirect environmental effects.
- statistical potentials have difficulty in separating out various chemical environment effects on the observed atoms, thereby generating a major source of error in this class of models.
- over-represented contacts in a structural database could mask the presence of other contacts.
- a new statistical potential energy function called KECSA developed in our group defines a new reference state attempting to eliminate the contact masking due to quantitative preferences.
- KECSA defines a reference state energy or background energy as the energy contributed by all atoms surrounding the observed atom pairs. It introduces a reference state number distribution modeled by a linear combination of the number distribution under mean force (n ij (r)) and the number distribution of an ideal gas state
$$n_{ij}^{**}(r) = \left(\frac{N_{ij}}{V}\, 4\pi r^{2}\, \Delta r\right) x + \left(n_{ij}(r)\right)(1 - x) \quad (15n)$$
- x indicates the intensity of the observed atom pairwise interaction in the chemical space V.
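The linear combination in Equation 15n can be evaluated bin by bin. The sketch below is illustrative: the function name, argument names and the numbers in the example are hypothetical, and x is passed in as a plain number even though the text goes on to define it per atom pair.

```python
import math

def reference_counts(n_observed, radii, n_total, volume, dr, x):
    """Reference-state counts n**_ij(r) for each distance bin (Equation 15n).

    n_observed : observed counts n_ij(r) per bin
    radii      : bin distances r (angstrom)
    n_total    : total number of i-j atom pairs N_ij
    volume     : volume V of the chemical space (angstrom^3)
    dr         : bin width (angstrom)
    x          : weight of the ideal-gas term, between 0 and 1
    """
    counts = []
    for n_ij, r in zip(n_observed, radii):
        ideal_gas = n_total / volume * 4.0 * math.pi * r ** 2 * dr
        counts.append(ideal_gas * x + n_ij * (1.0 - x))
    return counts

# Hypothetical three-bin example.
observed = [12.0, 30.0, 18.0]
radii = [3.0, 3.5, 4.0]
pure_observed = reference_counts(observed, radii, 1000, 5000.0, 0.005, x=0.0)
pure_ideal = reference_counts(observed, radii, 1000, 5000.0, 0.005, x=1.0)
```

At x = 0 the reference state reduces to the observed distribution, and at x = 1 it reduces to the ideal gas distribution, which grows as r².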
- modeling solute and solvent-solute interactions requires us to define an “x term” for each atom pairwise interaction.
- Several approaches have been used to model x in our knowledge-based energy function.
- The original model of n_ij**(r) is based on the notion that every atom pair has an equal contact opportunity in a background energy contributed by the other atom pairs, while neglecting the fact that the background energies have different effects on atom pairwise distributions with different interaction strengths (say atom i and j under a covalent bond constraint compared to atom k and l under a non-bond interaction constraint).
- the n_ij**(r) model takes every atom pairwise contact as an energy state distributed between an ideal gas state energy and mean force state energy following a Boltzmann distribution in the reference state.
- x factor is defined as:
- e^(−βE_ij(r)) is the Boltzmann factor and N_ij(r) is the degeneracy factor (contact number) for atom type pair i and j.
- each E ij (r) can be derived iteratively at discrete distance points.
- every E ij (r) derived using the KECSA energy function is never a mean force potential between atom pair i and j as found in traditional statistical potentials.
- E ij (r) represents a pure atom pairwise interaction energy between i and j because the reference state energy defined in KECSA is a background energy contributed by all other atom pairs, and not just the ideal gas state energy.
- Two major differences between KMTISM (KECSA-Movable Type Implicit Solvation Model) and other continuum solvation models are: (1) the MT method calculates the free energy change using a ratio of partition functions in the NVT ensemble, while traditional continuum solvation models separate the Gibbs free energies into linear components with enthalpy and entropy components; and (2) electrostatic interactions are implicit in KMTISM via the categorization of pairwise atom types in the KECSA model, while other models calculate them explicitly using classical or QM based energy calculation approaches. In this manner, KMTISM can be viewed as the null hypothesis for the addition of explicit electrostatic interactions.
- the Minnesota Solvation Database is a well-constructed data set, including aqueous solvation free energies for 391 neutral molecules and 144 ions. This data set was filtered down to 372 neutral molecules and 21 ions in our test set via the exclusion of (1) inorganic molecules, and (2) molecules with atom types not represented in the KECSA potential. This test set, including various hydrocarbons, mono- and polyfunctional molecules with solvation free energies ranging from −85 to 4 kcal/mol, was further classified into different subsets based on the functional groups within the molecules. Some molecules were included in several subsets due to their polyfunctional nature.
- Carbon, nitrogen and oxygen are essential elements in organic molecules. More than one half of the compounds in the neutral test set (219 out of 372 compounds) were composed exclusively of C, N and O atoms. From the Minnesota Solvation Database we created 4 subsets from these 219 molecules: 41 hydrocarbons, 91 molecules with oxygen based functional groups, 44 molecules with nitrogen based functional groups and 43 molecules with mixed N and O functional groups. Validation also focused on molecules with sulfur, phosphorus, and halogen atoms, which play important roles in organic molecules. A test set with only halocarbons was created for the purpose of avoiding interference from other polar atoms. Sulfur and phosphorus, on the other hand, are often contained in oxyacid groups in organic molecules.
- a test set with sulfur or phosphorus-containing molecules was composed. Heterocycles, amides and their analogs are pervasive in drug-like molecules and are well represented in the Minnesota Solvation Database. 37 heterocyclic compounds and 33 amides and their analogs were categorized into two subsets. In addition, 28 molecules containing three or more different functional groups were selected to provide a challenging test with complex and highly polar molecules. The ion test set was limited to biologically relevant ions herein resulting in positively charged nitrogen and negatively charged carboxylate oxygen subsets. In this way 21 ions were chosen from Minnesota Solvation Database (11 cations and 10 anions).
- Against the neutral molecule test set, KMTISM and MM-PBSA gave comparable correlation coefficients (R²) and both were better than MM-GBSA. According to Kendall's tau values, MM-PBSA outperformed the other two methods in ranking ability, with KMTISM as the second best. In terms of accuracy of the models, KMTISM has the lowest root-mean-square error (RMSE), while the RMSE values for MM-GBSA and MM-PBSA were almost twice as large.
- a and b are the slope and the intercept of the regression line between experimental data and computed data, respectively.
- the slopes of their regression lines indicated an overestimation of the solvation free energies using these two methods.
- the significant improvement in RMSE values for MM-GBSA and MM-PBSA after the linear scaling, as well as their correlation coefficient (R² and Kendall's tau) values, indicate that they have a better ranking ability than free energy prediction.
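The statistics used in this comparison (RMSE, R², Kendall's tau and linear scaling) can be computed as below. This is a sketch under assumptions: the direction of the regression (experimental values fitted against computed values, with the fitted line then applied to the computed values) is not stated in this excerpt, Kendall's tau is the simple tau-a variant without tie correction, and the data are made up for illustration.

```python
import math

def rmse(calc, expt):
    """Root-mean-square error between computed and experimental values."""
    return math.sqrt(sum((c - e) ** 2 for c, e in zip(calc, expt)) / len(calc))

def linear_fit(x, y):
    """Least-squares slope a and intercept b of y = a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def r_squared(x, y):
    """Squared Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

# Hypothetical data: computed values overestimate experiment by a factor of 2.
calc = [-2.0, -4.0, -6.0, -8.0]
expt = [-1.0, -2.0, -3.0, -4.0]
a, b = linear_fit(calc, expt)
scaled = [a * c + b for c in calc]
```

In this toy case the raw RMSE is large while R² and tau are perfect, and the RMSE vanishes after scaling, which mirrors the pattern described above: a method can rank well while systematically overestimating magnitudes.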
- polar atom types in the KECSA energy function were classified according to their corresponding hydrophilic functional groups and were less affected by adjacent functional groups.
- Polar atom type-water radial probabilities were driven by a more fine grained atom pairwise set of interactions, thereby, improving the performance of the KECSA energy function for these groups.
- the oxygenated molecule set and halocarbon set were among the top 3 test sets based on KMTISM's performance according to RMSE. Against the oxygen-containing molecule set, KMTISM gave a correlation coefficient comparable to MM-PBSA, while its RMSE was better than that of MM-GBSA.
- KMTISM outperformed the MM-PB/GBSA methods according to the RMSE and correlation coefficients, especially for fluorocarbons, whose solvation free energies were much better reproduced by KMTISM than by the MM-PB/GBSA methods.
- the RMSE for KMTISM was 1.1 kcal/mol, compared to RMSE values of 5.8 kcal/mol for MM-GBSA and 2.2 kcal/mol for MM-PBSA.
- FIGS. 16A-16I show KMTISM's top three performing test sets according to RMSE.
- FIG. 15 is a graph showing KMTISM, MM-GBSA and MM-PBSA calculated vs. experimental solvation free energies (kcal/mol) for 372 neutral molecules.
- the ΔG_sol for methylperoxide was −9.90 kcal/mol or −8.86 kcal/mol (scaled) vs. the experimental value of −5.28 kcal/mol and the ΔG_sol for ethylperoxide was −10.27 kcal/mol or −9.20 kcal/mol (scaled) vs. the experimental value of −5.32 kcal/mol.
- the ΔG_sol for methylperoxide was −9.89 kcal/mol or −6.51 kcal/mol (scaled) using MM-GBSA and −9.07 kcal/mol or −5.90 kcal/mol (scaled) using MM-PBSA;
- the ΔG_sol for ethylperoxide was −9.21 kcal/mol or −6.00 kcal/mol (scaled) using MM-GBSA and −8.59 kcal/mol or −5.59 kcal/mol (scaled) using MM-PBSA.
- none of the methods examined did particularly well at modeling the solvation free energy of peroxides.
- FIGS. 17A-17I show KMTISM's worst three performing test sets according to RMSE, including amide, organosulfur and organophosphorus, and polyfunctional test sets.
- FIGS. 18A-18C graphically illustrate KMTISM, MM-GBSA and MM-PBSA calculated solvation free energies (kcal/mol) vs. experimental data for carboxylate and charged amine test sets.
- KMTISM has an advantage over the MM-GB/PBSA methods for the prediction of the solvation free energy of polyfunctional molecules. This advantage will have a significant effect on the ability of this model to predict, for example, protein-ligand binding affinities, where the solvation free energy of the ligand can have a significant impact on binding affinity prediction.
- MM-GBSA and MM-PBSA are two broadly used implicit solvation models.
- KMTISM, using a new sampling method (MT method) combined with a statistical energy function (KECSA), is found to have a comparable or better ability to predict the solvation free energy for several test sets selected from the Minnesota Solvation Database, though all of these methods perform worse than the most recent SMX model reported by Cramer and Truhlar. It is important to appreciate that, without using the approximation that the free energy of solvation is a collection of linearly combined free energies, as is employed in many traditional continuum solvent models, KMTISM uses computed energies to directly determine free energies. Hence, the Helmholtz free energy is calculated by the construction of the relevant partition functions.
- Future work includes (1) a detailed study of enthalpy changes and entropy changes using the MT method; (2) improving the statistical energy terms by data collection from MD simulations of atom types with high polarizability and uncommon atom types in structural databases; and (3) replacing the statistical energy function with different force field based energy functions and combining them with the MT sampling method in order to effect the rapid evaluation of thermodynamic quantities.
- Tables 7 and 8 show the experimental and computed solvation free energies of the neutral molecules (Table 7) and charged ions (Table 8) that were studied using the methodology disclosed above.
- the method detailed above may also be used in a system that comprises a computational device.
- the system uses a computational device such as a computer to estimate a pose of a ligand in a receptor, which comprises identifying all possible atom pairs of protein-ligand complexes in a given configuration space for a system that comprises proteins. It then creates a first database and a second database, where the first database comprises associated pairwise distance-dependent energies and where the second database comprises all probabilities that include how the atom pairs can combine.
- the first database is then combined with the second database using statistical mechanics to accurately estimate binding free energies as well as the pose of the ligand in the receptor.
- a protein-ligand complex is then used for further study depending upon the data obtained for the aforementioned estimations.
- the further study can include ranking the interactions so as to enable one to choose a group of protein-ligand complexes for further experimentation, analysis or product development, for use in choosing a particular protein-ligand for developing a medical device, an analytical device such as a fluidics device, and the like.
- the further study can also include choosing a protein-ligand for the further complexation studies—where the protein-ligand is further complexed with another molecule.
- determining, using at least one hardware computer processor, atom pairwise contacts between a first molecule structure and a second molecule structure;
- sampling atom energy for the two-molecule system and constructing an atom energy matrix for the first molecule and an atom energy matrix for the second molecule;
- converting the atom energy matrix for the first molecule into a molecular energy matrix for the first molecule;
- converting the atom energy matrix for the second molecule into a molecular energy matrix for the second molecule;
- converting energy values of the molecular energy matrix for the first molecule to Boltzmann factors under room temperature;
- converting energy values of the molecular energy matrix for the second molecule to Boltzmann factors under room temperature;
- using the Boltzmann factors, calculating Boltzmann free energy
-
- where:
- R is the gas constant;
- T is temperature in degrees Kelvin (optionally 298.15 K);
- Z is the partition function, which is the sum of the Boltzmann factors;
- D is a defined volume quantity representing the three dimensional space in which the particles under study should be contained;
- M is a subscript meaning “molecular”;
- β is 1/(RT);
- E is the energy;
- V is the ensemble volume; and
- N is the number of states that have been sampled;
- using Monte Carlo Integration by a method that comprises calculating an estimate of the ensemble volume V according to the equation:
$$V = 2^{Y-4} \times (2\pi)^{Y-3} \times (4\pi)^{Y-2}\, C^{Y-1}$$
- where:
- C is a constant representing a predetermined boundary of the particle-particle distance in the ensemble, and
- Y is the number of atoms in the ensemble.
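The volume expression and its role in a Monte Carlo integration estimate can be sketched as follows. The free energy equation itself is not reproduced in this excerpt, so the estimate below assumes the generic Monte Carlo form Z ≈ V × (1/N) Σ e^(−βE) followed by G = −RT ln Z; the function names are hypothetical, and logarithms are used only to keep the volume from overflowing for large atom counts.

```python
import math

def log_ensemble_volume(n_atoms, c):
    """Natural log of V = 2^(Y-4) * (2*pi)^(Y-3) * (4*pi)^(Y-2) * C^(Y-1)."""
    y = n_atoms
    return ((y - 4) * math.log(2.0)
            + (y - 3) * math.log(2.0 * math.pi)
            + (y - 2) * math.log(4.0 * math.pi)
            + (y - 1) * math.log(c))

def free_energy_estimate(energies, n_atoms, c, rt):
    """Assumed estimate: G = -RT ln( V * mean Boltzmann factor ).

    energies : sampled state energies E (same units as rt)
    rt       : R*T, so that beta = 1/(RT)
    """
    beta = 1.0 / rt
    mean_boltzmann = sum(math.exp(-beta * e) for e in energies) / len(energies)
    return -rt * (log_ensemble_volume(n_atoms, c) + math.log(mean_boltzmann))

RT = 0.0019872 * 298.15                       # kcal/mol at 298.15 K
log_v = log_ensemble_volume(4, 1.0)           # hypothetical 4-atom ensemble, C = 1
g_flat = free_energy_estimate([0.0] * 10, 4, 1.0, RT)
```

The number of sampled states N enters only through the average, so increasing N refines the estimate of the mean Boltzmann factor without changing the volume prefactor.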
- where:
ΔG b s =ΔG b g +ΔG solv PL −ΔG solv L −ΔG solv P (1)
where P and L indicate the protein and ligand, s and g represent the behavior in solution and the gas-phase, respectively, ΔGsolv is the solvation free energy, and ΔGb is the binding free energy in gas (g) and solution (s), respectively.
ΔG b s =ΔG b g −ΔΔG solv (2)
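Equations (1) and (2) agree term by term only if ΔΔG_solv collects the three solvation terms with the signs shown below. This definition is inferred here by comparing the two equations and is offered as a consistency check rather than as a quotation from the source.

```latex
\begin{align}
\Delta G_b^{s} &= \Delta G_b^{g} + \Delta G_{solv}^{PL} - \Delta G_{solv}^{L} - \Delta G_{solv}^{P} \\
               &= \Delta G_b^{g} - \left( \Delta G_{solv}^{L} + \Delta G_{solv}^{P} - \Delta G_{solv}^{PL} \right) \\
\Rightarrow \quad \Delta\Delta G_{solv} &= \Delta G_{solv}^{L} + \Delta G_{solv}^{P} - \Delta G_{solv}^{PL}
\end{align}
```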
where Z represents the canonical ensemble partition function, EPL is the energy of the protein-ligand interactions as a function of distance r, EP is the protein energy, EL is the ligand energy (both as a function of distance r) and β is the reciprocal of the thermodynamic temperature kBT. Partition functions are integrals over all possible coordinates of the protein-ligand complex, the protein, and the ligand.
where the partition functions are expressed as the Boltzmann-weighted average of the pose energies (shown in brackets) multiplied by the volume of configuration space available to each state, shown as F in
$$F^{PL} = F_{boundP}^{external}\, F_{boundL}^{external}\, F_{boundP}^{internal}\, F_{boundL}^{internal}$$ (5)
where n_ij(r) is the number of protein-ligand pairwise interactions between a certain atom pair type i and j in the bin (r, r+Δr), with the volume 4πr^a Δr collected from the training set. n*_ij(r) in the denominator mimics the number of protein-ligand atom type pairs i and j in the same distance bin in an ideal gas state. This removes the “non-interacting” background distribution from the protein-ligand system. Δr is defined as 0.005 Å. N_ij is the total number of atom pairs of type i and j. The average volume V of the protein-ligand binding sites is given as
with the same to-be-determined parameter a as described above (
c=q·z (16)
where the averaged molecular partition function is given as a sum of atom pairwise partition functions c sampled over distance intervals (M) of all combinations of N atom pairs at all possible distances.
where subscript k indicates a bonded atom pair i and j, and each distance increment between any ra and ra+1 is 0.005 Å. Using this scheme the distance sampling size is given by:
where r1 and rn are the lower and upper bounds for distance sampling, which vary depending on each atom pair and interaction type. The product over all bond-linked atom pairs derives the total bond partition function in the protein:
P P= P bond⊗ P angle⊗ P torsion⊗ P long-range (21)
where P long-range= P vdw-elec⊗ P H-bond (22)
L L= L bond⊗ L angle⊗ L torsion⊗ L long-range (23)
PL PL= P bond⊗ P angle⊗ P torsion⊗ P long-range⊗ L bond⊗ L angle⊗ L torsion⊗ L long-range⊗ PL long-range (24)
L= L bond⊗ L angle⊗ L torsion⊗ L long-range
PL= PL long-range= PL vdw-elec⊗ PL H-bond (29)
PL= P bond⊗ P angle⊗ P torsion⊗ P long-range⊗ L bond⊗ L angle⊗ L torsion⊗ L long-range⊗ PL long-range (30)
PL=·= PL· PL (31)
∵Sum()=1;
∴Sum( PL)=Sum(·)= e −βE
where the first equation is the normalization statement for the probabilities. In this manner, the normalized averaged partition function of the protein-ligand complex is derived in Equation 32.
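The normalization argument above can be illustrated numerically. The sketch below assumes, following c = q·z in Equation 16, that each sampled state carries a Boltzmann factor z and a frequency q, and that the averaged partition function is the sum of their pointwise products once the frequencies are normalized to sum to one. The function name, the array names, and the choice of kcal/mol units are assumptions made for illustration.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def averaged_boltzmann_factor(energies, frequencies, temperature=298.15):
    """Return Sum(q * z), the frequency-weighted average Boltzmann factor.

    energies    : sampled energies in kcal/mol (hypothetical inputs)
    frequencies : unnormalized occurrence frequencies of the same states
    """
    beta = 1.0 / (R_KCAL * temperature)
    z = np.exp(-beta * np.asarray(energies, dtype=float))  # Boltzmann factors
    q = np.asarray(frequencies, dtype=float)
    q = q / q.sum()                                         # enforce Sum(q) = 1
    c = q * z                                               # pointwise product
    return float(c.sum())


# With every state at zero energy the average Boltzmann factor is 1.
print(averaged_boltzmann_factor([0.0, 0.0, 0.0], [1.0, 2.0, 3.0]))
```

Because the frequencies sum to one, the result can be read directly as an ensemble average of e^(−βE) rather than as an unnormalized sum.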
e −βE
e −βE
and $\tilde{Q}_k^{PL}$ are standard distribution frequency matrices normalized over all three systems, in order to satisfy
In this way the protein-ligand binding free energy in the gas-phase is derived using our MT algorithm.
$$E_{LS}(r) = E_{L}(r) + E_{psol}(r) + E_{npsol}(r)$$ (36)
psol= 1 psol· 2 psol T· 3 psol T . . . k psol T . . . m psol T (38)
npsol= e −βNABA (39)
solv L= L· psol· npsol (40)
SUM( solv L)=SUM L· psol· npsol = e −βE
Performance of MT KECSA as a Scoring Function for Protein-Ligand Binding Affinity Prediction
-
- (1) Crystal structures of all selected complexes had X-ray resolutions of <2.5 Å.
- (2) Complexes with molecular weights (MWs) distributed from 100 to 900 were selected, to avoid ligand size-dependent prediction results.
- (3) Complexes with ligands that have more than 20 hydrogen donors and acceptors, more than one phosphorus atom, and complexes with metalloproteins were excluded.
TABLE 1 |
Statistical results for MT KECSA and original KECSA correlated with experimental binding affinities. |
Method | Pearson's r | RMSE (pKd) | Kendall τ |
MT KECSA | 0.72 | 1.88 | 0.53 |
original KECSA | 0.62 | 2.03 | 0.46 |
TABLE 2 |
RMSD values (Å) and binding scores (pKd) of the global and local minima |
Minimum | RMSD (Å) | Binding Affinity (pKd) |
Global Minimum | 0.937 | 3.329 |
Local Minimum a | 2.667 | 2.255 |
Local Minimum b | 2.839 | 2.975 |
Local Minimum c | 2.342 | 3.299 |
-
- where:
- R is the gas constant;
- T is temperature, normally 298.15 K;
- Z is the partition function, which is the sum of the Boltzmann factors;
- D is a defined volume quantity representing the three-dimensional space in which the particles under study are contained;
- M is a subscript meaning “molecular”;
- β is a commonly used term representing 1/(RT);
- E is the energy;
- V is the ensemble volume; and
- N is the number of states that have been sampled.
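The symbols listed above (R, T, Z, β, E, V and N) are those of a Monte Carlo estimate of a free energy from sampled states. The sketch below assumes the estimator G ≈ −RT ln[(V/N) Σ e^(−βE_i)], which is a standard Monte Carlo integration form consistent with these definitions; it is an assumption made for illustration, not a transcription of the equation in this document. The function name and the kcal/mol units are likewise hypothetical.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def monte_carlo_free_energy(energies, volume, temperature=298.15):
    """Estimate G = -RT ln[(V/N) * sum_i exp(-E_i / RT)].

    energies : energies of the N sampled states, in kcal/mol
    volume   : ensemble volume V spanned by the sampling domain
    """
    energies = np.asarray(energies, dtype=float)
    rt = R_KCAL * temperature
    x = -energies / rt
    # Log-sum-exp keeps the sum stable when Boltzmann factors are very large.
    x_max = x.max()
    log_sum = x_max + np.log(np.exp(x - x_max).sum())
    log_z = np.log(volume) - np.log(energies.size) + log_sum
    return float(-rt * log_z)


# Two states of equal energy in a unit volume return that energy.
print(monte_carlo_free_energy([-2.0, -2.0], volume=1.0))
```

Working with the logarithm of the partition function avoids overflow, which matters because the ensemble volume and the Boltzmann factors can both span many orders of magnitude.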
- where:
$$V = \int \cdots \int_{D} d\tau_1 \cdots d\tau_N$$ (46)
-
- where
- τ1 to τN are the coordinates of all the particles, and
- D is the domain of definition for all of the particle coordinates.
The ensemble volume grows exponentially as the number of dimensions increases. The MT algorithm uses a distance-based coordinate system in order to better estimate the ensemble volume. With all the geometric parameters of the molecules converted into a distance-based representation, the ensemble volume can be expressed using equation 3, which shows the exponential growth of the state volume with the number of energy components.
$$V = 2^{N-4} \times (2\pi)^{N-3} \times (4\pi)^{N-2}\, C^{N-1}$$ (47)
- where:
- C is a constant representing a predetermined boundary of the particle-particle distance in the ensemble, and
- N is the number of atoms in the ensemble.
In one embodiment C was selected as 6 Å.
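To make the growth of the ensemble volume concrete, Equation 47 can be evaluated directly. The sketch below reads the exponents of Equation 47 as N−4, N−3, N−2 and N−1 and uses C = 6 Å as the default, as in the embodiment mentioned above. The function names are hypothetical, and the logarithmic variant is added here only because the direct product overflows floating point for large N.

```python
import math


def ensemble_volume(n_atoms, c=6.0):
    """V = 2^(N-4) * (2*pi)^(N-3) * (4*pi)^(N-2) * C^(N-1), per Equation 47."""
    return (2.0 ** (n_atoms - 4)
            * (2.0 * math.pi) ** (n_atoms - 3)
            * (4.0 * math.pi) ** (n_atoms - 2)
            * c ** (n_atoms - 1))


def log_ensemble_volume(n_atoms, c=6.0):
    """Natural logarithm of the same volume, safe for large atom counts."""
    return ((n_atoms - 4) * math.log(2.0)
            + (n_atoms - 3) * math.log(2.0 * math.pi)
            + (n_atoms - 2) * math.log(4.0 * math.pi)
            + (n_atoms - 1) * math.log(c))


# Each added atom multiplies V by 2 * (2*pi) * (4*pi) * C.
ratio = ensemble_volume(11) / ensemble_volume(10)
print(ratio)
```

The constant per-atom factor is what makes the growth exponential in N, and it is why the logarithm of V is the more practical quantity to carry through a free energy calculation.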
- where
$$\Delta G_b^s = \Delta G_b^g + \Delta G_{solv}^{1-2} - \Delta G_{solv}^{1} - \Delta G_{solv}^{2}$$ (1i)
where methane and butane are indicated as
$$\Delta G_b^s = \Delta G_b^g - \Delta\Delta G_{solv}$$ (2i)
E methane =E sp
e E
k indicates each sp3 carbon-sp3 carbon bond and 1 through n are the distance increments. Each increment is set to 0.005 Å. The bond energy is modeled as a harmonic oscillator:
z k bond (r a)=e −79.98(r
−79.98 is the energy unit used in KECSA and will be ultimately converted into kcal/mol. 1.53 Å is the distance at which the sp3 carbon-sp3 carbon bond energy reaches its minimum and 4.195 is a constant that adjusts the energy baseline to zero. The
r1 through rn in Equation 13i indicate the sp3 carbon-sp3 carbon bond distance range from 1.72 Å to 1.99 Å
and the angle energy terms are formulated as:
and the torsion partition function is modeled as:
e E
e E
Z butane= butane bond⊗ butane angle⊗ butane torsion (22i)
Q butane= butane bond⊗ butane angle⊗ butane torsion⊗ butane long-range (23i)
with probability values gathered from available structural data. The ensemble averaged partition function of butane can then be derived with Equation 24i, including all interaction matrices defined by butane· butane together with four sp3 carbon single atom energies.
e E
C 1−2 =Q 1−2 ·Z 1−2 ·e E
Z 1−2= butane bond⊗ butane angle⊗ butane torsion⊗ 1−2 long-range (26i)
Q 1−2= butane bond⊗ butane angle⊗ butane torsion⊗ butane long-range⊗ 1−2 long-range (27i)
(28i)
(29i)
k bond is the unscrambled probability vector of the kth atom pair with a bond constraint. t is the number of discrete probabilities with significant values. scram(X)i represents a randomly scrambled permutation of matrix X with i as the index number. The enlarged matrix of k bond is represented as follows:
k bond in
P bond= 1 bond∘ 2 bond∘ 3 bond∘ . . . ∘ k bond∘ . . . ∘ n bond (3m)
P final= P bond∘ P angle∘ P torsion∘ P long-range (4m)
Similarly, SMs for the ligand and the complex are given as:
L final= L bond∘ L angle∘ L torsion∘ L long-range (5m)
PL final= P bond∘ P angle∘ P torsion∘ P long-range∘ L bond∘ L angle∘ L torsion∘ L long-range∘ PL long-range (6m)
P bond= 1 bond∘ 2 bond∘ 3 bond∘ . . . ∘ k bond∘ . . . ∘ n bond (7m)
P final= P bond∘ P angle∘ P torsion∘ P long-range (8m)
L final= L bond∘ L angle∘ L torsion∘ L long-range (9m)
PL final= P bond∘ P angle∘ P torsion∘ P long-range∘ L bond∘ L angle∘ L torsion∘ L long-range∘ PL long-range (10m)
k indicates one sp3 carbon-sp3 carbon bond in propane, and the discrete distance index a goes from 1 through t and represents the distance increments. Disordered vectors are generated using random scrambling of the original vectors. An example of the randomly scrambled vector of the Boltzmann factor with the index number i is shown in Equation 13m. For a vector with t elements in it, the maximum number of permutations is t! (the maximum value of i). Each index number i in the scramble operation scram(X)i represents one certain arrangement order of elements in the vector.
as an example of one randomly scrambled vector.
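The scramble operation can be sketched in a few lines. In the sketch below the index i is used as a random seed, so that each index reproducibly selects one arrangement; this seeding is an assumption made for illustration and is not a one-to-one enumeration of all t! permutations. The function name and the example vector are hypothetical.

```python
import math

import numpy as np


def scramble(vector, index):
    """Return one randomly scrambled copy of `vector`.

    `index` seeds the generator, so the same index always yields the same
    arrangement (an illustrative stand-in for scram(X)_i).
    """
    rng = np.random.default_rng(index)
    return rng.permutation(np.asarray(vector))


# Hypothetical Boltzmann factors for t = 5 sampled bond distances.
boltzmann_factors = np.array([0.10, 0.45, 1.00, 0.45, 0.10])
t = boltzmann_factors.size

print(math.factorial(t))             # number of possible arrangements, t!
print(scramble(boltzmann_factors, 1))
print(scramble(boltzmann_factors, 2))
```

Scrambling changes only the order of the elements, so the sum of the vector, and therefore its contribution to a partition function, is unchanged.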
TABLE 3 |
pKd RMSDs for 100 rounds of binding affinity calculations against the protein-ligand complex 1LI2 using the SM pointwise product with four different SM sizes. |
SM sizes | 700 | 7 × 10⁵ | 7 × 10¹⁰ | 7 × 10¹⁵ |
pKd RMSD | 0.059 | 0.012 | 0.011 | 0.011 |
where g(2) is called a correlation function. β=1/kBT and kB is the Boltzmann constant and T is the temperature. ρij(r) is the number density for the atom pairs of types i and j observed in the known protein or ligand structures and ρ*ij(r) is the number density of the corresponding pair in the background or reference state. A central problem for statistical potentials is to model specific atom pairwise interactions removed from the background energy. In protein-ligand complexes, geometric information, i.e. atom pairwise radial distributions, represents an averaged effect of all interactions in chemical space, including bond, angle, torsion, and long-range non-covalent forces. Converting these radial distributions into energy functions is a challenge.
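The conversion from radial distributions to energies can be sketched with the standard inverse Boltzmann relation, E_ij(r) = −(1/β) ln[ρ_ij(r)/ρ*_ij(r)]. This form is assumed here because it is the usual definition of a statistical potential in terms of the quantities named above; it is not copied from this document, and the function name, the example densities, and the kcal/mol units are hypothetical.

```python
import numpy as np

K_B = 1.987204e-3  # Boltzmann constant on a molar basis, kcal/(mol K)


def pair_potential(rho_observed, rho_reference, temperature=298.15):
    """Inverse Boltzmann energy per distance bin, in kcal/mol.

    rho_observed  : number density of the i-j pair in known structures
    rho_reference : number density of the same pair in the reference state
    Both arrays must be positive; empty bins need separate handling.
    """
    rho_observed = np.asarray(rho_observed, dtype=float)
    rho_reference = np.asarray(rho_reference, dtype=float)
    g = rho_observed / rho_reference          # correlation function per bin
    return -K_B * temperature * np.log(g)


# Hypothetical densities: enriched, neutral and depleted distance bins.
energies = pair_potential([2.0, 1.0, 0.5], [1.0, 1.0, 1.0])
print(energies)
```

Bins where the pair is observed more often than in the reference state come out with negative (favorable) energies, and depleted bins come out positive, which is why the choice of reference state controls the whole energy scale.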
e −βE
where the energy of the molecule in solution (ELS) is modeled as:
E LS(r)=E L(r)+E L−S interaction(r) (8n)
DOFLS and DOFL indicate the degrees of freedom of the molecule in solution and in the gas phase, which were assigned the same value in the current implicit water model for simplicity.
where Rw and Ra are the van der Waals radii for water and the atom in the solute molecule respectively.
TABLE 4 |
List of 21 atom types in the current solvation model |
Atom Type | Description | ||
C1 | sp1 carbon | ||
C2 | sp2 carbon | ||
C3 | sp3 carbon | ||
Car | aromatic carbon | ||
N2 | sp2 nitrogen | ||
N3 | sp3 nitrogen | ||
N4 | positively charged nitrogen | ||
Nam | amide nitrogen | ||
Nar | aromatic nitrogen | ||
Npl3 | trigonal planar nitrogen | ||
Ow | water oxygen | ||
O2 | sp2 oxygen | ||
O3 | hydroxyl oxygen | ||
OE | ether and ester sp3 oxygen | ||
Oco2 | carboxylate, sulfate and phosphate oxygen | ||
S2 | sp2 sulfur | ||
S3 | sp3 sulfur | ||
P3 | sp3 phosphorus | ||
F | Fluorine | ||
Cl | Chlorine | ||
Br | Bromine | ||
Energy Function Modeling
where x indicates the intensity of the observed atom pairwise interaction in the chemical space V. This definition puts the number distribution of one certain observed atom pair in the reference state somewhere between the ideal gas state and the “mean force” state, depending on its relative strength. Stronger interactions have background energies closer to an ideal gas state while weaker interactions have background energies approaching the mean force state energy contributed by all atoms in the chemical space.
where Nt is the total atom type number in the chemical space.
Hence we can build the energy function for each atom type pair as:
TABLE 5 |
Performance of KMTISM, MM-GBSA and MM-PBSA for the prediction of the solvation free energies of neutral molecules. |
Metric | KMTISM | MM-GBSA | MM-PBSA | KMTISM | MM-GBSA | MM-PBSA |
Total Neutral Molecule Set | Amide Set |
R² | 0.792 | 0.734 | 0.804 | 0.660 | 0.493 | 0.509 |
Kendall's tau | 0.755 | 0.708 | 0.793 | 0.568 | 0.484 | 0.465 |
Raw RMSE (kcal/mol) | 2.597 | 4.629 | 4.647 | 4.368 | 8.666 | 9.717 |
Scaled RMSE (kcal/mol) | 2.248 | 2.634 | 2.160 | 3.852 | 4.885 | 4.663 |
Hydrocarbon Set | Halocarbon Set |
R² | 0.699 | 0.906 | 0.954 | 0.648 | 0.004 | 0.594 |
Kendall's tau | 0.663 | 0.748 | 0.887 | 0.656 | 0.091 | 0.625 |
Raw RMSE (kcal/mol) | 0.858 | 1.179 | 0.925 | 1.052 | 2.768 | 1.148 |
Scaled RMSE (kcal/mol) | 0.845 | 0.498 | 0.332 | 1.030 | 2.063 | 1.109 |
Oxygenated Molecule Set | Organosulfur & Organophosphorus Set |
R² | 0.829 | 0.881 | 0.916 | 0.762 | 0.751 | 0.777 |
Kendall's tau | 0.657 | 0.723 | 0.754 | 0.680 | 0.626 | 0.618 |
Raw RMSE (kcal/mol) | 2.104 | 4.232 | 3.868 | 4.337 | 8.297 | 9.179 |
Scaled RMSE (kcal/mol) | 1.578 | 1.613 | 1.186 | 3.500 | 4.316 | 3.992 |
Nitrogenous Molecule Set | Heterocycle Set |
R² | 0.615 | 0.485 | 0.795 | 0.604 | 0.528 | 0.552 |
Kendall's tau | 0.420 | 0.412 | 0.592 | 0.652 | 0.622 | 0.646 |
Raw RMSE (kcal/mol) | 2.384 | 2.416 | 1.690 | 4.314 | 7.584 | 8.722 |
Scaled RMSE (kcal/mol) | 2.276 | 2.555 | 1.797 | 3.721 | 4.413 | 4.217 |
Oxygenated & Nitrogenous Molecule Set | Polyfunctional Molecule Set |
R² | 0.545 | 0.747 | 0.694 | 0.736 | 0.615 | 0.650 |
Kendall's tau | 0.565 | 0.663 | 0.621 | 0.726 | 0.577 | 0.609 |
Raw RMSE (kcal/mol) | 3.259 | 4.282 | 5.043 | 4.688 | 10.138 | 11.132 |
Scaled RMSE (kcal/mol) | 2.991 | 2.794 | 2.484 | 3.597 | 5.335 | 4.804 |
TABLE 6 |
Performance of KMTISM, MM-GBSA and MM-PBSA for the prediction of the solvation free energies of ions. |
Metric | KMTISM | MM-GBSA | MM-PBSA |
Ion Set |
R² | 0.351 | 0.000 | 0.003 |
Kendall's tau | 0.258 | −0.057 | −0.067 |
RMSE (kcal/mol) | 5.777 | 11.736 | 10.481 |
Carboxylate Set |
R² | 0.239 | 0.161 | 0.166 |
Kendall's tau | −0.090 | −0.180 | −0.180 |
RMSE (kcal/mol) | 5.337 | 11.918 | 11.252 |
Charged Amine Set |
R² | 0.557 | 0.008 | 0.009 |
Kendall's tau | 0.491 | −0.127 | −0.127 |
RMSE (kcal/mol) | 6.149 | 11.569 | 9.727 |
Comparison with SMX Results
TABLE 7 |
KMTISM, MM-GBSA and MM-PBSA calculated solvation free energy (in kcal/mol) results against the test set with 372 neutral compounds. |
Compound ID | Exp. ΔGsolv | KMTISM Raw Result | KMTISM Scaled Result | MM-GBSA Raw Result | MM-GBSA Scaled Result | MM-PBSA Raw Result | MM-PBSA Scaled Result | Solute Name | Formula |
0044met | −5.11 | −5.8 | −5.17 | −8.2 | −5.24 | −7.35 | −4.76 | methanol | H4C1O1 |
0045eth | −5.01 | −4.96 | −4.41 | −7.45 | −4.67 | −6.83 | −4.42 | ethanol | H6C2O1 |
0046eth | −9.3 | −11 | −9.85 | −15.6 | −10.85 | −16.12 | −10.57 | 1,2-ethanediol | H6C2O2 |
0047pro | −4.83 | −4.6 | −4.08 | −6.97 | −4.3 | −6.17 | −4.18 | 1-propanol | H8C3O1 |
0048pro | −1.76 | −4.2 | −3.72 | −7.01 | −4.34 | −6.23 | −4.02 | isopropanol | H8C3O1 |
0049but | −4.72 | −3.98 | −3.52 | −6.71 | −4.11 | −6.23 | −4.03 | 1-butanol | H10C4O1 |
0050met | −4.51 | −3.29 | −2.9 | −6.34 | −3.83 | −5.62 | −3.62 | t-butanol | H10C4O1 |
0051cyc | −5.49 | −3.62 | −3.19 | −6.5 | −3.95 | −6.87 | −4.45 | cyclopentanol | H10C5O1 |
0052pen | −4.47 | −4.25 | −3.77 | −6.43 | −3.9 | −6.04 | −3.9 | 1-pentanol | H12C5O1 |
0053phe | −6.62 | −7.41 | −6.62 | −11.11 | −7.45 | −10.47 | −6.83 | phenol | H6C6O1 |
0054hex | −4.36 | −3.93 | −3.48 | −6.18 | −3.71 | −5.79 | −3.73 | 1-hexanol | H14C6O1 |
0055ocr | −5.87 | −5.74 | −5.11 | −10.38 | −6.9 | −9.57 | −6.24 | o-cresol | H8C7O1 |
0056mcr | −5.49 | −6.85 | −6.11 | −10.67 | −7.11 | −9.98 | −6.51 | m-cresol | H8C7O1 |
0057pcr | −6.14 | −6.43 | −5.73 | −10.78 | −7.19 | −10.17 | −6.64 | p-cresol | H8C7O1 |
0058hep | −4.24 | −3.63 | −3.21 | −5.96 | −3.54 | −5.5 | −3.54 | 1-heptanol | H16C7O1 |
0145pro | −5.08 | −6.25 | −5.57 | −8 | −5.09 | −7.23 | −4.69 | allylalcohol | H6C3O1 |
0146met | −6.77 | −7.35 | −6.56 | −9.62 | −6.31 | −10.43 | −6.81 | 2-methoxyethanol | H8C3O2 |
0236oct | −4.09 | −3.49 | −3.07 | −5.71 | −3.35 | −5.25 | −3.38 | 1-octanol | H18C8O1 |
0070eth | −3.5 | −5.19 | −4.61 | −6.76 | −4.15 | −4.99 | −3.2 | acetaldehyde | H4C2O1 |
0071proa | −3.44 | −4.09 | −3.62 | −6.05 | −3.61 | −4.44 | −2.84 | propanal | H6C3O1 |
0072but | −3.18 | −4.56 | −4.05 | −6 | −3.57 | −4.42 | −2.83 | butanal | H8C4O1 |
0073pen | −3.03 | −3.63 | −3.2 | −5.51 | −3.2 | −3.98 | −2.53 | pentanal | H10C5O1 |
0074ben | −4.02 | −6.61 | −5.9 | −8.56 | −5.52 | −7.55 | −4.9 | benzaldehyde | H6C7O1 |
0237oct | −2.29 | −2.72 | −2.39 | −4.85 | −2.7 | −3.28 | −2.07 | octanal | H16C8O1 |
0150mhy | −9.51 | −12.67 | −11.37 | −16.65 | −11.65 | −15.84 | −10.39 | m-hydroxybenzaldehyde | H6C7O2 |
0151phy | −10.48 | −12.77 | −11.45 | −16.85 | −11.8 | −16 | −10.49 | p-hydroxybenzaldehyde | H6C7O2 |
0233ethb | −9.71 | −10.71 | −9.59 | −10.89 | −7.28 | −9.93 | −6.47 | acetamide | H5C2N1O1 |
0234ENmb | −10 | −6.4 | −5.7 | −9.08 | −5.9 | −8.37 | −5.44 | E-N-methylacetamide | H7C3N1O1 |
0235ZNmb | −10 | −6.37 | −5.68 | −10.49 | −6.98 | −9.48 | −6.17 | Z-N-methylacetamide | H7C3N1O1 |
n008 | −10.9 | −11.69 | −10.48 | −12.24 | −8.3 | −12.16 | −7.95 | benzamide | H7C7N1O1 |
test0006 | −9.76 | −9.15 | −8.19 | −9.42 | −6.17 | −9.29 | −6.05 | N,N,4-trimethylbenzamide | H13C10N1O1 |
test3001 | −14.83 | −11.62 | −10.42 | −21.72 | −15.49 | −21.24 | −13.96 | paracetamol | H9C8N1O2 |
test3004 | −11.61 | −11 | −9.86 | −21.46 | −15.29 | −20.84 | −13.7 | N-(2-hydroxyphenyl)acetamide | H9C8N1O2 |
test3002 | −13.93 | −12.01 | −10.77 | −21.56 | −15.37 | −21.32 | −14.02 | N-(3-hydroxyphenyl)acetamide | H9C8N1O2 |
test0005 | −11.01 | −7.27 | −6.49 | −11.19 | −7.51 | −11.71 | −7.66 | N,N-dimethyl-p-methoxybenzamide | H13C10N1O2 |
test3005 | −10.91 | −6.35 | −5.66 | −14.26 | −9.83 | −14.15 | −9.27 | phenacetin | H13C10N1O2 |
0103eth | −4.5 | −5.37 | −4.78 | −5.65 | −3.31 | −3.94 | −2.51 | ethylamine | H7C2N1 |
0104dim | −4.29 | −1.01 | −0.84 | −3.86 | −1.95 | −3.75 | −2.38 | dimethylamine | H7C2N1 |
0105aze | −5.56 | −1.9 | −1.64 | −4.14 | −2.16 | −4.48 | −2.87 | azetidine | H7C3N1 |
0106pro | −4.39 | −5.25 | −4.67 | −5.25 | −3 | −3.67 | −2.33 | propylamine | H9C3N1 |
0107tri | −3.23 | 1.14 | 1.1 | −1.42 | −0.1 | −3.43 | −2.17 | trimethylamine | H9C3N1 |
0108pyr | −5.48 | −1.16 | −0.97 | −2.65 | −1.03 | −4.26 | −2.72 | pyrrolidine | H9C4N1 |
0109pip | −7.4 | −6.87 | −6.13 | −5.62 | −3.28 | −8.76 | −5.7 | piperazine | H10C4N2 |
0110but | −4.29 | −4.7 | −4.17 | −4.88 | −2.73 | −3.43 | −2.17 | butylamine | H11C4N1 |
0111die | −4.07 | −0.14 | −0.06 | −1.89 | −0.45 | −2.58 | −1.61 | diethylamine | H11C4N1 |
0112Nme | −7.77 | −5.28 | −4.69 | −3.01 | −1.3 | −8.25 | −5.36 | N-methylpiperazine | H12C5N2 |
0113pen | −4.1 | −4.51 | −4 | −4.72 | −2.6 | −3.22 | −2.03 | pentylamine | H13C5N1 |
0114NNd | −7.58 | −3.27 | −2.88 | −0.39 | 0.68 | −7.73 | −5.02 | N,N′-dimethylpiperazine | H14C6N2 |
0115dip | −3.66 | 0.34 | 0.38 | −0.96 | 0.25 | −1.71 | −1.03 | dipropylamine | H15C6N1 |
0118ani | −5.49 | −7.7 | −6.87 | −7.88 | −5 | −6.97 | −4.52 | aniline | H7C6N1 |
0225pipa | −5.11 | −0.54 | −0.41 | −2.1 | −0.61 | −3.75 | −2.38 | piperidine | H11C5N1 |
0228met | −4.56 | −6.22 | −5.54 | −6.7 | −4.1 | −4.43 | −2.83 | methylamine | H5C1N1 |
n009 | −5.56 | −5.88 | −5.23 | −7.31 | −4.56 | −6.56 | −4.24 | 2-methylaniline | H9C7N1 |
n010 | −5.67 | −6.78 | −6.05 | −7.58 | −4.77 | −6.64 | −4.3 | 3-methylaniline | H9C7N1 |
n011 | −5.55 | −6.78 | −6.05 | −7.46 | −4.68 | −6.65 | −4.3 | 4-methylaniline | H9C7N1 |
n013 | −4.62 | −1.91 | −1.65 | −5.36 | −3.09 | −5.6 | −3.61 | N-ethylaniline | H11C8N1 |
n014 | −3.58 | −0.14 | −0.06 | −4.41 | −2.36 | −5.86 | −3.78 | N,N-dimethylaniline | H11C8N1 |
n015 | −9.92 | −13.66 | −12.25 | −13.25 | −9.07 | −12.47 | −8.16 | 3-aminoaniline | H8C6N2 |
n016 | −9.72 | −12.04 | −10.8 | −11.1 | −7.44 | −9.82 | −6.4 | 1,2-ethanediamine | H8C2N2 |
0147met | −6.55 | −7.64 | −6.82 | −7.33 | −4.58 | −7.06 | −4.57 | 2-methoxyethanamine | H9C3N1O1 |
0149mor | −7.17 | −5.3 | −4.71 | −4.5 | −2.43 | −7.22 | −4.68 | morpholine | H9C4N1O1 |
0227Nme | −6.34 | −2.72 | −2.38 | −2.14 | −0.65 | −6.72 | −4.35 | N-methylmorpholine | H11C5N1O1 |
test1059 | −7.4 | −16.38 | −14.72 | −15.54 | −10.81 | −18.03 | −11.84 | 1-amino-4-anilinoanthraquinone | H14C20N2O2 |
test1060 | −8.9 | −19.45 | −17.49 | −22.8 | −16.31 | −25.4 | −16.72 | 1,4,5,8-tetraminoanthraquinone | H12C14N4O2 |
test1061 | −8 | −15.57 | −13.98 | −15.17 | −10.53 | −15.98 | −10.48 | 1-amino-anthraquinone | H9C14N1O2 |
test1015 | −9.5 | −9.06 | −8.1 | −10.37 | −6.89 | −11.08 | −7.24 | carbaryl | H11C12N1O2 |
test1016 | −9.6 | −6.89 | −6.15 | −11.51 | −7.75 | −11.81 | −7.72 | carbofuran | H15C12N1O3 |
test1037 | −10.7 | −9.45 | −8.46 | −11.74 | −7.93 | −11.57 | −7.56 | methomyl | H10C5N2O2S1 |
test1008 | −9.8 | −10.28 | −9.21 | −9.53 | −6.25 | −9.08 | −5.91 | aldicarb | H14C7N2O2S1 |
0086eth | −6.7 | −11.44 | −10.25 | −15.99 | −11.14 | −15 | −9.83 | aceticacid | H4C2O2 |
0087pro | −6.47 | −10.12 | −9.06 | −11.57 | −7.79 | −10.83 | −7.07 | propanoicacid | H6C3O2 |
0088but | −6.36 | −9.72 | −8.7 | −11.26 | −7.56 | −10.75 | −7.02 | butanoicacid | H8C4O2 |
0089pen | −6.16 | −9.46 | −8.47 | −11.02 | −7.38 | −10.42 | −6.8 | pentanoicacid | H10C5O2 |
0090hex | −6.21 | −9.37 | −8.39 | −10.65 | −7.09 | −10.19 | −6.65 | hexanoicacid | H12C6O2 |
test2017 | −7 | −9.02 | −8.07 | −12.32 | −8.36 | −11.75 | −7.68 | ibuprofen | H18C13O2 |
test2001 | −9.94 | −13.96 | −12.53 | −18.92 | −13.37 | −18.67 | −12.26 | acetylsalicylicacid | H8C9O4 |
test3007 | −10.32 | −11.71 | −10.5 | −14.69 | −10.17 | −15.34 | −10.06 | 2-methoxybenzoicacid | H8C8O3 |
test3014 | −9.15 | −13.25 | −11.89 | −14.72 | −10.18 | −15.34 | −10.06 | 4-methoxybenzoicacid | H8C8O3 |
test3015 | −8.93 | −13.33 | −11.96 | −14.43 | −9.97 | −14.85 | −9.74 | 3-methoxybenzoicacid | H8C8O3 |
test2021 | −10.21 | −12.96 | −11.62 | −16.33 | −11.4 | −17.54 | −11.52 | naproxen | H14C14O3 |
test3003 | −12.75 | −16.37 | −14.7 | −25.75 | −18.55 | −26.05 | −17.15 | fenbufen | H14C16O3 |
test2019 | −10.78 | −13.16 | −11.8 | −19 | −13.43 | −18.71 | −12.29 | ketoprofen | H14C16O3 |
0093met | −3.32 | −6.61 | −5.89 | −6.66 | −4.07 | −5.79 | −3.74 | methylacetate | H6C3O2 |
0094met | −2.93 | −5.07 | −4.51 | −6.06 | −3.62 | −5.13 | −3.29 | methylpropanoate | H8C4O2 |
0095eth | −3.1 | −4.88 | −4.33 | −6.1 | −3.65 | −5.13 | −3.29 | ethylacetate | H8C4O2 |
0096met | −2.83 | −4.98 | −4.42 | −5.68 | −3.33 | −4.79 | −3.07 | methylbutanoate | H10C5O2 |
0097pro | −2.86 | −4.55 | −4.03 | −5.77 | −3.4 | −4.68 | −3 | propylacetate | H10C5O2 |
0098met | −2.57 | −4.94 | −4.39 | −5.41 | −3.13 | −4.59 | −2.94 | methylpentanoate | H12C6O2 |
0099but | −2.55 | −4.53 | −4.02 | −5.43 | −3.14 | −4.38 | −2.8 | butylacetate | H12C6O2 |
0100met | −2.49 | −4.54 | −4.02 | −5.2 | −2.96 | −4.2 | −2.68 | methylhexanoate | H14C7O2 |
0101pen | −2.45 | −4 | −3.54 | −5.2 | −2.96 | −4.13 | −2.63 | pentylacetate | H14C7O2 |
0238met | −2.04 | −3.72 | −3.28 | −4.74 | −2.62 | −3.69 | −2.34 | methyloctanoate | H18C9O2 |
0240met | −3.91 | −6.34 | −5.65 | −7.83 | −4.96 | −7.71 | −5.01 | methylbenzoate | H8C8O2 |
test0001 | −8.84 | −12.38 | −11.1 | −18.14 | −12.78 | −17.12 | −11.24 | glyceroltriacetate | H14C9O6 |
test0008 | −4.97 | −9.45 | −8.45 | −11.01 | −7.37 | −10.87 | −7.1 | 1,1-diacetoxyethane | H10C6O4 |
test0011 | −6 | −9.11 | −8.15 | −10.91 | −7.3 | −10.45 | −6.82 | diethylpropanedioate | H12C7O4 |
test0013 | −6.34 | −8.81 | −7.88 | −12.65 | −8.61 | −12.26 | −8.02 | ethyleneglycoldiacetate | H10C6O4 |
test2003 | −8.72 | −10.4 | −9.32 | −15.29 | −10.62 | −14.73 | −9.66 | butylparaben | H14C11O3 |
test2011 | −9.2 | −10.91 | −9.77 | −15.96 | −11.12 | −15.49 | −10.16 | ethylparaben | H10C9O3 |
test2020 | −9.51 | −12.1 | −10.85 | −16.56 | −11.58 | −16.51 | −10.83 | methylparaben | H8C8O3 |
test2026 | −9.37 | −10.11 | −9.05 | −15.53 | −10.8 | −14.97 | −9.81 | propylparaben | H12C10O3 |
0060dim | −1.92 | −2.92 | −2.56 | −2.24 | −0.72 | −1.85 | −1.12 | dimethylether | H6C2O1 |
0061tet | −3.47 | −2.92 | −2.56 | −2.25 | −0.73 | −1.85 | −1.12 | tetrahydrofuran | H8C4O1 |
0062dio | −5.05 | −7.22 | −6.44 | −3.64 | −1.78 | −6.01 | −3.88 | 1,4-dioxane | H8C4O2 |
0063die | −1.76 | −1.52 | −1.3 | −1.27 | 0.01 | −0.91 | −0.5 | diethylether | H10C4O1 |
0064met | −1.66 | −1.1 | −0.92 | −1.33 | −0.03 | −0.93 | −0.51 | methylpropylether | H10C4O1 |
0065met | −2.01 | −0.83 | −0.68 | −1.53 | −0.18 | −0.98 | −0.54 | methylisopropylether | H10C4O1 |
0066dim | −4.84 | −4.24 | −3.76 | −4.66 | −2.56 | −5.9 | −3.81 | 1,2-dimethoxyethane | H10C4O2 |
0067but | −2.21 | −0.32 | −0.22 | −1.37 | −0.06 | −0.79 | −0.42 | t-butylmethylether | H12C5O1 |
0068ani | −2.45 | −3.65 | −3.22 | −4.3 | −2.28 | −4.02 | −2.56 | anisole | H8C7O1 |
0242dii | −0.53 | 0.34 | 0.38 | −0.81 | 0.37 | 0.21 | 0.24 | isopropylether | H14C6O1 |
0244tet | −3.12 | −2.11 | −1.83 | −1.21 | 0.06 | −2.25 | −1.39 | tetrahydropyran | H10C5O1 |
0246eth | −2.22 | −1.88 | −1.63 | −3.7 | −1.83 | −3.38 | −2.13 | ethylphenylether | H10C8O1 |
test0009 | −3.28 | −1.05 | −0.88 | −2.97 | −1.27 | −3.64 | −2.31 | 1,1-diethoxyethane | H14C6O2 |
test0012 | −2.93 | −5.15 | −4.58 | −5.06 | −2.86 | −6.24 | −4.03 | dimethoxymethane | H8C3O2 |
test0014 | −3.54 | −0.96 | −0.79 | −3.36 | −1.57 | −4.4 | −2.81 | 1,2-diethoxyethane | H14C6O2 |
0091met | −2.78 | −4.28 | −3.79 | −7.85 | −4.98 | −6.82 | −4.12 | methylformate | H4C2O2 |
0092ethb | −2.65 | −3.5 | −3.09 | −7.23 | −4.51 | −6.27 | −4.05 | ethylformate | H6C3O2 |
test0016 | −3.82 | −5.66 | −5.04 | −9.96 | −6.57 | −9.14 | −5.95 | phenylformate | H6C7O2 |
test2015 | −2.33 | −1.93 | −1.67 | 1.26 | 1.93 | −0.38 | −0.15 | hexachlorobenzene | C6CL6 |
test2023 | 3.43 | 3.66 | 3.38 | −2.06 | −0.58 | 1.49 | 1.09 | octafluorocyclobutane | C4F8 |
0207tri | −4.31 | −3.7 | −3.27 | −11.93 | −8.07 | −9.44 | −6.15 | 2,2,2-trifluoroethanol | H3C2O1F3 |
0211tri | −4.16 | −3.21 | −2.82 | −11.26 | −7.56 | −9.2 | −5.99 | 1,1,1-trifluoropropan-2-ol | H5C3O1F3 |
0212hex | −3.77 | −2.24 | −1.95 | −15.29 | −10.62 | −11.15 | −7.28 | 1,1,1,3,3,3-hexafluoropropan-2-ol | H2C3O1F6 |
0215pbr | −7.13 | −8.04 | −7.18 | −11.31 | −7.6 | −11.25 | −7.35 | p-bromophenol | H5C6O1BR1 |
test1025 | −9.9 | −12.78 | −11.46 | −14.15 | −9.75 | −15.35 | −10.07 | dicamba | H6C8O3CL2 |
0425dbr | −9 | −12.44 | −11.16 | −14.7 | −10.17 | −15.08 | −9.89 | 3,5-dibromo-4-hydroxybenzonitrile | H3C7N1O1BR2 |
test1048 | −7.8 | −4.74 | −4.21 | −11.49 | −7.74 | −11.46 | −7.49 | propanil | H9C9N1O1CL2 |
test1007 | −8.2 | −3.88 | −3.43 | −10.73 | −7.16 | −10.05 | −6.55 | alachlor | H20C14N1O2CL1 |
test2013 | −8.42 | −11.34 | −10.16 | −17.39 | −12.21 | −17.03 | −11.18 | flurbiprofen | H13C15O2F1 |
test2010 | −9.4 | −15.84 | −14.22 | −17.33 | −12.16 | −17.24 | −11.32 | diflunisal | H8C13O3F2 |
test3019 | −6.71 | −10.06 | −9.01 | −12.94 | −8.83 | −14.18 | −9.29 | tolfenamicacid | H12C14N1O2CL1 |
test3020 | −6.3 | −10.92 | −9.79 | −22.37 | −15.99 | −21.19 | −13.94 | diclofenacacid | H11C14N1O2CL2 |
test3021 | −5.68 | −10.96 | −9.82 | −16.06 | −11.2 | −14.64 | −9.6 | flufenamicacid | H10C14N1O2F3 |
0223die | −1.63 | −0.82 | −0.67 | −1.64 | −0.27 | −0.81 | −0.43 | diethyldisulfide | H10C4S2 |
0209chl | 0.11 | −0.24 | −0.15 | −9.17 | −5.97 | −5.68 | −3.66 | 1-chloro-2,2,2-trifluoroethyl-difluoromethylether | H2C3O1F5CL1 |
0214tri | −0.12 | −1.42 | −1.21 | −6.06 | −3.62 | −3.67 | −2.33 | 2,2,2-trifluoroethylvinylether | H5C4O1F3 |
test0007 | −4.23 | −1.5 | −1.29 | −4.86 | −2.7 | −5.24 | −3.37 | bis(2-chloroethyl)ether | H8C4O1CL2 |
test1030 | −5.5 | −7.69 | −6.87 | −4.32 | −2.3 | −6.33 | −4.09 | endrin | H8C12O1CL6 |
test2029 | −0.8 | −1.69 | −1.46 | −5.48 | −3.18 | −4.96 | −3.18 | trimethylorthotrifluoroacetate | H9C5O3F3 |
test1049 | −16.4 | −13.33 | −11.96 | −18.79 | −13.27 | −18.84 | −12.38 | pyrazon | H8C10N3O1CL1 |
test1050 | −10.2 | −11.38 | −10.2 | −11.65 | −7.86 | −14.63 | −9.59 | simazine | H12C7N5CL1 |
0428ami | −11.96 | −20.64 | −18.55 | −21.09 | −15.02 | −22.17 | −14.58 | 4-amino-3,5,6-trichloropyridine-2-carboxylicacid | H3C6N2O2CL3 |
0426dcl | −5.22 | −7.32 | −6.54 | −4.68 | −2.57 | −5.34 | −3.43 | 2,6-dichlorobenzonitrile | H3C7N1CL2 |
test1021 | −1.5 | −1.85 | −1.6 | −0.2 | 0.83 | −1.52 | −0.9 | chloropicrin | C1N1O2CL3 |
test2024 | −5.22 | −4.3 | −3.81 | 0.52 | 1.37 | −2.29 | −1.42 | pentachloronitrobenzene | C6N1O2CL5 |
test1011 | −3.5 | −4.41 | −3.91 | −4.12 | −2.15 | −4.48 | −2.86 | benefin | H16C13N3O4F3 |
test1027 | −5.7 | −6.21 | −5.54 | −9.11 | −5.93 | −11.99 | −7.84 | dinitramine | H13C11N4O4F3 |
test1052 | −11.1 | −7.5 | −6.7 | −13.41 | −9.19 | −14.14 | −9.26 | terbacil | H13C9N2O2CL1 |
0440pho | −7.28 | −7.4 | −6.61 | −14.55 | −10.05 | −13 | −8.51 | dimethyl5-(4-chloro)bicyclo[3.2.0]heptylphosphate | H12C9O4P1CL1 |
test1019 | −7.1 | −5.29 | −4.7 | −11.88 | −8.03 | −10.82 | −7.06 | chlorfenvinphos | H14C12O4P1CL3 |
test1055 | −12.7 | −11.46 | −10.27 | −21.94 | −15.66 | −20.41 | −13.42 | trichlorfon | H8C4O4P1CL3 |
test1029 | −4.2 | −10.4 | −9.32 | −10.19 | −6.75 | −13.89 | −9.1 | endosulfanalpha | H6C9O3S1CL6 |
0213bis | −3.92 | −1.11 | −0.93 | −4.5 | −2.43 | −4.61 | −2.95 | bis(2-chloroethyl)sulfide | H8C4S1CL2 |
0438pho | −3.86 | −4.72 | −4.19 | −9.78 | −6.44 | −9.82 | −6.4 | diethyl2,4-dichlorophenylthiophosphate | H13C10O3P1S1CL2 |
0441pho | −7.62 | −20.61 | −18.53 | −12.75 | −8.69 | −15.07 | −9.88 | dimethyl4-nitrophenylthiophosphate | H10C8N1O5P1S1 |
0442pho | −4.09 | −5.49 | −4.88 | −12 | −8.12 | −11.74 | −7.67 | O-ethylO′-4-bromo-2-chlorophenylS-propylphosphorothioate | H15C11O3P1S1CL1BR1 |
0444pho | −5.06 | −6.02 | −5.36 | −10.23 | −6.78 | −10.87 | −7.1 | dimethyl2,4,5-trichlorophenylthiophosphate | H8C8O3P1S1CL3 |
0445pho | −5.7 | −6.46 | −5.76 | −10.24 | −6.79 | −11.32 | −7.39 | dimethyl4-bromo-2,5-dichlorophenylthiophosphate | H8C8O3P1S1CL2BR1 |
test1017 | −6.5 | −5.51 | −4.9 | −12.62 | −8.59 | −14.79 | −9.7 | carbophenothion | H16C11O2P1S3CL1 |
test1022 | −5 | −8.21 | −7.34 | −13.53 | −9.28 | −13.31 | −8.71 | chlorpyrifos | H11C9N1O3P1S1CL3 |
0427dcl | −10.81 | −8.63 | −7.71 | −8.79 | −5.69 | −9.44 | −6.15 | 2,6-dichlorothiobenzamide | H5C7N1S1CL2 |
0433pho | −6.61 | −9.13 | −8.17 | −15.3 | −10.62 | −13.89 | −9.1 | 2,2-dichloroethenyl-dimethylphosphate | H7C4O4P1CL2 |
0153flu | −0.22 | 1.41 | 1.35 | −1.77 | −0.37 | 0.04 | 0.13 | fluoromethane | H3C1F1 |
0154dif | −0.11 | 2.01 | 1.89 | −2.88 | −1.21 | −0.91 | −0.5 | 1,1-difluoroethane | H4C2F2 |
0157flu | −0.78 | −1.14 | −0.95 | −2.33 | −0.79 | −1.29 | −0.75 | fluorobenzene | H5C6F1 |
0160chl | −0.56 | 0.21 | 0.26 | −1.3 | −0.01 | −0.1 | 0.03 | chloromethane | H3C1CL1 |
0161dic | −1.36 | −0.59 | −0.46 | −2.04 | −0.57 | −1.32 | −0.77 | dichloromethane | H2C1CL2 |
0162tri | −1.07 | −1.33 | −1.13 | −1.8 | −0.38 | −1.08 | −0.61 | chloroform | H1C1CL3 |
0163chl | −0.63 | 0.59 | 0.61 | −1.04 | 0.19 | 0.1 | 0.17 | chloroethane | H5C2CL1 |
0165tri | −0.25 | −0.89 | −0.73 | −1.01 | 0.22 | −0.42 | −0.18 | 1,1,1-trichloroethane | H3C2CL3 |
0166tri | −1.95 | −0.96 | −0.79 | −3.22 | −1.47 | −2.72 | −1.7 | 1,1,2-trichloroethane | H3C2CL3 |
0167chla | −0.27 | 0.84 | 0.83 | −0.71 | 0.44 | 0.35 | 0.33 | 1-chloropropane | H7C3CL1 |
0168chl | −0.25 | 0.86 | 0.85 | −1.1 | 0.15 | 0.27 | 0.28 | 2-chloropropane | H7C3CL1 |
0169chl | −0.59 | −1.38 | −1.18 | −1.11 | 0.13 | 0.26 | 0.28 | chloroethene | H3C2CL1 |
0170chl | −0.57 | −0.74 | −0.6 | −1.52 | −0.17 | −0.17 | −0.01 | 3-chloropropene | H5C3CL1 |
0171Zdi | −1.17 | −1.78 | −1.53 | −1.86 | −0.43 | −1.19 | −0.68 | Z-1,2-dichloroethene | H2C2CL2 |
0172Edi | −0.76 | −1.83 | −1.58 | −0.81 | 0.36 | −0.13 | 0.02 | E-1,2-dichloroethene | H2C2CL2 |
0173tri | −0.39 | −1.72 | −1.48 | −0.4 | 0.67 | −0.1 | 0.04 | trichloroethene | H1C2CL3 |
0174chl | −1.12 | −1.66 | −1.42 | −2.27 | −0.74 | −1.83 | −1.11 | chlorobenzene | H5C6CL1 |
0175odi | −1.36 | −1.73 | −1.49 | −1.98 | −0.52 | −1.75 | −1.05 | 1,2-dichlorobenzene | H4C6CL2 |
0176pdi | −1.01 | −1.78 | −1.53 | −1.81 | −0.4 | −1.49 | −0.88 | 1,4-dichlorobenzene | H4C6CL2 |
0177bro | −0.82 | −0.38 | −0.27 | −1.42 | −0.1 | −0.65 | −0.33 | bromomethane | H3C1BR1 |
0178dib | −2.11 | −1.66 | −1.42 | −1.54 | −0.19 | −1.78 | −1.08 | dibromomethane | H2C1BR2 |
0179tri | −1.98 | −2.89 | −2.54 | −0.7 | 0.45 | −1.5 | −0.89 | bromoform | H1C1BR3 |
0180bro | −0.7 | 0.06 | 0.13 | −1.33 | −0.03 | −0.55 | −0.26 | bromoethane | H5C2BR1 |
0182bro | −0.56 | 0.31 | 0.35 | −1.01 | 0.21 | −0.33 | −0.11 | 1-bromopropane | H7C3BR1 |
0183bro | −0.48 | 0.38 | 0.42 | −1.51 | −0.17 | −0.43 | −0.18 | 2-bromopropane | H7C3BR1 |
0184bro | −0.41 | 0.57 | 0.59 | −0.76 | 0.4 | −0.02 | 0.09 | 1-bromobutane | H9C4BR1 |
0185bro | −0.08 | 0.8 | 0.8 | −0.54 | 0.57 | 0.22 | 0.25 | 1-bromopentane | H11C5BR1 |
0186bro | −1.46 | −2.14 | −1.86 | −2.63 | −1.02 | −2.56 | −1.59 | bromobenzene | H5C6BR1 |
0187dib | −2.3 | −2.68 | −2.34 | −2.28 | −0.75 | −2.76 | −1.72 | p-dibromobenzene | H4C6BR2 |
0197bro | 1.79 | 0.71 | 0.71 | −2.81 | −1.15 | 0.23 | 0.25 | bromotrifluoromethane | C1F3BR1 |
0198chl | −0.77 | 0.56 | 0.58 | −3.36 | −1.57 | −1.91 | −1.16 | chlorofluoromethane | H2C1F1CL1 |
0199chl | −0.5 | 0.92 | 0.9 | −4.64 | −2.54 | −2.07 | −1.27 | chlorodifluoromethane | H1C1F2CL1 |
0200tet | 3.16 | 2.44 | 2.27 | −3.45 | −1.64 | 1.44 | 1.06 | tetrafluoromethane | C1F4 |
0201bro | −0.13 | 0.13 | 0.19 | −3.6 | −1.76 | −1.27 | −0.74 | 1-bromo-1-chloro-2,2,2-trifluoroethane | H1C2F3CL1BR1 |
0202bro | −1.95 | −0.78 | −0.63 | −2.99 | −1.29 | −2.81 | −1.76 | 1-bromo-2-chloroethane | H4C2CL1BR1 |
0203bro | 0.52 | 1.22 | 1.17 | −4.89 | −2.73 | −1.78 | −1.08 | 1-bromo-1,2,2,2-tetrafluoroethane | H1C2F4BR1 |
0204tet | 0.05 | −1.64 | −1.4 | 1 | 1.74 | 0.72 | 0.58 | tetrachloroethene | C2CL4 |
0205chl | 0.06 | 1.45 | 1.38 | −3.94 | −2.01 | −1.55 | −0.92 | 1-chloro-2,2,2-trifluoroethane | H2C2F3CL1 |
0206tri | 1.77 | −0.1 | −0.02 | −2.49 | −0.91 | 0.61 | 0.51 | 1,1,2-trichloro-1,2,2-trifluoroethane | C2F3CL3 |
0405hex | 3.94 | 3.13 | 2.9 | −4.18 | −2.2 | 1.08 | 0.82 | hexafluoroethane | C2F6 |
0406oct | 4.28 | 3.54 | 3.27 | −3.62 | −1.76 | 1.33 | 0.98 | octafluoropropane | C3F8 |
0407tet | −1.15 | −1.64 | −1.4 | −1.86 | −0.43 | −1.51 | −0.89 | 1,1,1,2-tetrachloroethane | H2C2CL4 |
0408hex | −1.4 | −2.76 | −2.42 | −0.27 | 0.78 | 0.12 | 0.18 | hexachloroethane | C2CL6 |
0409clb | 0.07 | 1.13 | 1.09 | −0.81 | 0.36 | 0.47 | 0.42 | 2-chlorobutane | H9C4CL1 |
0410clp | 0.07 | 1.34 | 1.28 | −0.25 | 0.79 | 0.81 | 0.64 | 1-chloropentane | H11C5CL1 |
0411chp | 0.07 | 1.45 | 1.38 | −0.58 | 0.54 | 0.76 | 0.61 | 2-chloropentane | H11C5CL1 |
0412clt | −1.92 | −1.43 | −1.22 | −3.54 | −1.71 | −3.1 | −1.95 | chlorotoluene | H7C7CL1 |
0413clt | −1.15 | −0.88 | −0.73 | −2.11 | −0.62 | −1.53 | −0.91 | o-chlorotoluene | H7C7CL1 |
0414dcl | −2.73 | −2.57 | −2.24 | −4.09 | −2.13 | −3.48 | −2.21 | 2,2′-dichlorobiphenyl | H8C12CL2 |
0415dcl | −2.45 | −2.49 | −2.18 | −3.78 | −1.89 | −3.85 | −2.45 | 2,3-dichlorobiphenyl | H8C12CL2 |
0416dcl | −1.99 | −2.65 | −2.32 | −3.63 | −1.77 | −3.36 | −2.12 | 2,2′,3′-trichlorobiphenyl | H7C12CL3 |
0417brp | −0.86 | −1.33 | −1.13 | −1.73 | −0.34 | −0.75 | −0.39 | 3-bromopropene | H5C3BR1 |
0418bri | −0.03 | 0.53 | 0.55 | −1 | 0.22 | −0.09 | 0.05 | 1-bromo-isobutane | H9C4BR1 |
0419brt | −2.37 | −1.9 | −1.64 | −3.89 | −1.97 | −3.77 | −2.4 | bromotoluene | H7C7BR1 |
0420pbr | −1.39 | −1.31 | −1.1 | −2.35 | −0.8 | −2.18 | −1.34 | p-bromotoluene | H7C7BR1 |
0421dfl | 1.69 | 0.08 | 0.15 | −2.01 | −0.55 | 0.71 | 0.57 | difluorodichloromethane | C1F2CL2 |
0422ftc | 0.82 | −1.02 | −0.85 | −1.13 | 0.12 | 0.54 | 0.46 | fluorotrichloromethane | C1F1CL3 |
0423brt | −0.93 | −2.54 | −2.22 | 0.1 | 1.05 | 0.09 | 0.16 | bromotrichloromethane | C1CL3BR1 |
0424clp | 2.86 | 2 | 1.88 | −3.64 | −1.78 | 1.02 | 0.78 | chloropentafluoroethane | C2F5CL1 |
test0004 | 1.07 | 1.81 | 1.71 | −6.15 | −3.69 | −2.66 | −1.66 | m-bis(trifluoromethyl)benzene | H4C8F6 |
test1018 | −3.4 | −3.03 | −2.66 | −2.12 | −0.63 | −2.39 | −1.48 | chlordane | H6C10CL8 |
test1033 | −2.6 | −3.26 | −2.87 | −1.67 | −0.29 | −1.97 | −1.2 | heptachlor | H5C10CL7 |
test1035 | −5.4 | −1.99 | −1.73 | −5.45 | −3.16 | −5.47 | −3.52 | lindane | H6C6CL6 |
0116pyr | −4.7 | −5.22 | −4.64 | −6.09 | −3.64 | −5.43 | −3.49 | pyridine | H5C5N1 |
0117met | −5.57 | −7.53 | −6.72 | −9.78 | −6.44 | −9.42 | −6.14 | 2-methylpyrazine | H6C5N2 |
0119met | −4.63 | −4.4 | −3.89 | −5.7 | −3.35 | −4.87 | −3.12 | 2-methylpyridine | H7C6N1 |
0120met | −4.77 | −4.89 | −4.34 | −5.63 | −3.29 | −4.98 | −3.2 | 3-methylpyridine | H7C6N1 |
0121met | −4.94 | −4.89 | −4.34 | −5.89 | −3.49 | −5.16 | −3.32 | 4-methylpyridine | H7C6N1 |
0122Nme | −4.68 | −2.59 | −2.26 | −6.25 | −3.76 | −6.37 | −4.12 | N-methylaniline | H9C7N1 |
0123dim | −4.86 | −3.36 | −2.96 | −5.47 | −3.17 | −4.51 | −2.89 | 2,4-dimethylpyridine | H9C7N1 |
0124dim | −4.72 | −3.5 | −3.09 | −5.22 | −2.98 | −4.26 | −2.72 | 2,5-dimethylpyridine | H9C7N1 |
0125dim | −4.6 | −2.67 | −2.34 | −5.3 | −3.04 | −4.23 | −2.7 | 2,6-dimethylpyridine | H9C7N1 |
0230eth | −5.51 | −7.16 | −6.39 | −9.23 | −6.02 | −8.71 | −5.67 | 2-ethylpyrazine | H8C6N2 |
0471dim | −5.22 | −4.3 | −3.81 | −5.48 | −3.18 | −4.83 | −3.1 | 3,4-dimethylpyridine | H9C7N1 |
0571dim | −4.84 | −3.99 | −3.53 | −5.15 | −2.93 | −4.32 | −2.76 | 3,5-dimethylpyridine | H9C7N1 |
0574eth | −4.74 | −6.25 | −5.57 | −5.81 | −3.43 | −5.93 | −3.82 | 4-ethylpyridine | H9C7N1 |
test0017 | −9.81 | −8.7 | −7.78 | −10.3 | −6.83 | −10.46 | −6.83 | imidazole | H4C3N2 |
test1009 | −7.7 | −10.88 | −9.75 | −9.02 | −5.86 | −12.23 | −8 | ametryn | H17C9N5S1 |
test1047 | −8.4 | −8.37 | −7.48 | −8.31 | −5.33 | −11.25 | −7.35 | prometryn | H19C10N5S1 |
test1053 | −6.7 | −9.3 | −8.32 | −9.11 | −5.93 | −12.2 | −7.98 | terbutryn | H19C10N5S1 |
test1063 | −9.4 | −4.62 | −4.09 | −14.17 | −9.77 | −16.19 | −10.62 | pirimicarb | H18C11N4O2 |
n005 | −5.31 | −6.72 | −5.99 | −10.54 | −7.01 | −8.76 | −5.7 | methylhydrazine | H6C1N2 |
n006 | −4.48 | −6.33 | −5.64 | −7.69 | −4.85 | −7.75 | −5.03 | 1,1-dimethylhydrazine | H8C2N2 |
0001met | 2 | 1.02 | 1 | −0.05 | 0.94 | 2.57 | 1.8 | methane | H4C1 |
0002eth | 1.83 | 1.39 | 1.33 | 0.69 | 1.5 | 2.65 | 1.86 | ethane | H6C2 |
0003pro | 1.96 | 1.67 | 1.58 | 0.93 | 1.68 | 2.72 | 1.91 | n-propane | H8C3 |
0004nbu | 2.08 | 1.95 | 1.83 | 1.15 | 1.85 | 2.9 | 2.02 | n-butane | H10C4 |
0005npe | 2.33 | 2.18 | 2.04 | 1.36 | 2.01 | 3.09 | 2.15 | n-pentane | H12C5 |
0006nhe | 2.49 | 2.45 | 2.28 | 1.57 | 2.17 | 3.33 | 2.31 | n-hexane | H14C6 |
0007nhe | 2.62 | 2.71 | 2.52 | 1.8 | 2.35 | 3.57 | 2.47 | n-heptane | H16C7 |
0008noc | 2.89 | 3.01 | 2.79 | 2.02 | 2.51 | 3.79 | 2.61 | n-octane | H18C8 |
0010met | 2.32 | 1.83 | 1.72 | 0.88 | 1.65 | 2.8 | 1.96 | 2-methylpropane | H10C4 |
0011dim | 2.5 | 1.88 | 1.77 | 1.05 | 1.77 | 2.87 | 2.01 | 2,2-dimethylpropane | H12C5 |
0012met | 2.52 | 2.28 | 2.13 | 1.32 | 1.98 | 3.22 | 2.24 | 2-methylpentane | H14C6 |
0013dim | 2.88 | 2.35 | 2.19 | 1.31 | 1.97 | 3.4 | 2.35 | 2,4-dimethylpentane | H16C7 |
0014tri | 2.85 | 2.39 | 2.23 | 1.45 | 2.08 | 3.39 | 2.35 | 2,2,4-trimethylpentane | H18C8 |
0016cyc | 0.75 | 1.63 | 1.55 | 0.51 | 1.37 | 2.48 | 1.75 | cyclopropane | H6C3 |
0017cyc | 1.2 | 2.03 | 1.91 | 1.21 | 1.9 | 1.92 | 1.37 | cyclopentane | H10C5 |
0018cyc | 1.23 | 2.19 | 2.05 | 1.38 | 2.03 | 2.03 | 1.45 | cyclohexane | H12C6 |
0019met | 1.71 | 2.35 | 2.2 | 1.39 | 2.04 | 2.26 | 1.6 | methylcyclohexane | H14C7 |
0020cis | 1.58 | 2.46 | 2.3 | 1.38 | 1.03 | 2.37 | 1.67 | cis-1,2-dimethylcyclohexane | H16C8 |
0021eth | 1.27 | −0.96 | −0.79 | −0.34 | 0.72 | 1.98 | 1.41 | ethene | H4C2 |
0022pro | 1.27 | −0.06 | 0.02 | −0.03 | 0.95 | 2.03 | 1.45 | propene | H6C3 |
0023str | 0.61 | −1.35 | −1.14 | −0.78 | 0.39 | 1.37 | 1.01 | s-trans-1,3-butadiene | H6C4 |
0024met | 1.16 | 0.67 | 0.68 | −0.01 | 0.97 | 2.03 | 1.45 | 2-methylpropene | H8C4 |
0025buta | 1.38 | 0.29 | 0.34 | 0.24 | 1.16 | 2.19 | 1.56 | 1-butene | H8C4 |
0026cyc | 0.56 | 0.7 | 0.7 | 0.08 | 1.04 | 0.89 | 0.69 | cyclopentene | H8C5 |
0027pen | 1.66 | 0.57 | 0.59 | 0.49 | 1.35 | 2.44 | 1.72 | 1-pentene | H10C5 |
0028Epe | 1.34 | 1.23 | 1.19 | 0.6 | 1.43 | 2.44 | 1.72 | E-2-pentene | H10C5 |
0029hex | 1.68 | 0.84 | 0.83 | 0.71 | 1.52 | 2.65 | 1.86 | 1-hexene | H12C6 |
0030eth | −0.01 | −1 | −0.83 | −1.07 | 0.17 | −0.68 | −0.35 | ethyne | H2C2 |
0031pro | −0.31 | −0.09 | −0.01 | −1.06 | 0.17 | −0.74 | −0.38 | propyne | H4C3 |
0032but | −0.16 | 0.28 | 0.33 | −0.71 | 0.44 | −0.38 | −0.15 | 1-butyne | H6C4 |
0033pen | 0.01 | 0.63 | 0.64 | −0.44 | 0.64 | −0.08 | 0.05 | 1-pentyne | H8C5 |
0034hex | 0.29 | 0.8 | 0.8 | −0.2 | 0.82 | 0.13 | 0.19 | 1-hexyne | H10C6 |
0035ben | −0.87 | −1.59 | −1.36 | −1.59 | −0.99 | −1.77 | −1.07 | benzene | H6C6 |
0036tol | −0.89 | −0.74 | −0.6 | −2.34 | −0.8 | −1.45 | −0.86 | toluene | H8C7 |
0037eth | −0.8 | −0.34 | −0.24 | −1.96 | −0.51 | −1.09 | −0.62 | ethylbenzene | H10C8 |
0038oxy | −0.9 | −0.09 | −0.01 | −2.12 | −0.63 | −1.28 | −0.75 | o-xylene | H10C8 |
0039mxy | −0.84 | 0.11 | 0.17 | −2.02 | −0.56 | −1.08 | −0.61 | m-xylene | H10C8 |
0040pxy | −0.81 | 0.16 | 0.22 | −2.01 | −0.55 | −1.06 | −0.6 | p-xylene | H10C8 |
0041nap | −2.39 | −2.08 | −1.81 | −4.31 | −2.29 | −4.59 | −2.94 | naphthalene | H8C10 |
0042ant | −4.23 | −2.54 | −2.22 | −5.68 | −3.33 | −6.98 | −4.52 | anthracene | H10C14 |
0148but | 0.04 | −1.42 | −1.21 | −2.21 | −0.7 | −1.61 | −0.96 | butenyne | H4C4 |
test2025 | −9.61 | −14.4 | −12.93 | −15.14 | −10.5 | −15.2 | −9.97 | phthalimide | H5C8N1O2 |
0075pro | −3.85 | −4.13 | −3.65 | −6.06 | −3.62 | −4.51 | −2.88 | acetone | H6C3O1 |
0076but | −3.64 | −3.19 | −2.8 | −5.51 | −3.2 | −3.91 | −2.48 | 2-butanone | H8C4O1 |
0077cyc | −4.68 | −4.08 | −3.61 | −5.23 | −2.99 | −4.62 | −2.96 | cyclopentanone | H8C5O1 |
0078pen | −3.53 | −2.93 | −2.58 | −5.22 | −2.98 | −3.66 | −2.32 | 2-pentanone | H10C5O1 |
0079pen | −3.41 | −1.92 | −1.66 | −4.95 | −2.78 | −3.31 | −2.09 | 3-pentanone | H10C5O1 |
0080hex | −3.29 | −2.62 | −2.29 | −4.96 | −2.78 | −3.43 | −2.17 | 2-hexanone | H12C6O1 |
0081dim | −2.89 | −2.77 | −2.43 | −4.90 | −2.81 | −3.36 | −2.12 | 3,3-dimethylbutanone | H12C6O1 |
0082hep | −3.04 | −2.45 | −2.14 | −4.72 | −2.6 | −3.18 | −2.01 | 2-heptanone | H14C7O1 |
0083hep | −2.93 | −1.59 | −1.36 | −4.43 | −2.38 | −2.65 | −1.65 | 4-heptanone | H14C7O1 |
0084met | −4.58 | −5.46 | −4.86 | −7.94 | −5.04 | −7.16 | −4.64 | acetophenone | H8C8O1 |
0085non | −2.67 | −0.91 | −0.75 | −3.89 | −1.97 | −2.01 | −1.23 | 5-nonanone | H18C9O1 |
0239oct | −2.88 | −2.04 | −1.77 | −4.51 | −2.44 | −2.92 | −1.83 | 2-octanone | H16C8O1 |
test1034 | −5.2 | −3.95 | −3.49 | −5.7 | −3.34 | −4.18 | −2.66 | isophorone | H14C9O1 |
test1001 | −5.7 | −6.53 | −5.82 | −5.56 | −3.24 | −9.7 | −6.32 | nitroglycol | H4C2N2O6 |
test1002 | −5 | −5.57 | −4.96 | −5.22 | −2.98 | −9.01 | −5.87 | 1,2-dinitroxypropane | H6C3N2O6 |
test1003 | −2.1 | −2.92 | −2.57 | −1.64 | −0.27 | −2.94 | −1.84 | butylnitrate | H9C4N1O3 |
test1004 | −1.8 | −2.15 | −1.87 | −1.87 | −0.44 | −2.97 | −1.86 | 2-butylnitrate | H9C4N1O3 |
test1005 | −1.9 | −2.59 | −2.27 | −1.72 | −0.33 | −2.79 | −1.74 | isobutylnitrate | H9C4N1O3 |
test1006 | −8.2 | −9.17 | −8.2 | −11 | −7.37 | −13.25 | −8.67 | ethyleneglycolmononitrate | H5C2N1O4 |
0126eth | −3.89 | −4.93 | −4.38 | −5.3 | −3.04 | −4.54 | −2.9 | acetonitrile | H3C2N1 |
0127pro | −3.85 | −4.62 | −4.1 | −4.64 | −2.54 | −3.93 | −2.5 | propionitrile | H5C3N1 |
0128butb | −3.64 | −4.32 | −3.82 | −4.35 | −2.32 | −3.63 | −2.3 | butanonitrile | H7C4N1 |
0129ben | −4.1 | −6.77 | −6.04 | −5.56 | −3.24 | −5.56 | −3.58 | benzonitrile | H5C7N1 |
0506nit | −3.95 | −5.88 | −5.24 | −2.73 | −1.09 | −3.66 | −2.32 | nitromethane | H3C1N1O2 |
0130nit | −3.71 | −4.36 | −3.86 | −2.35 | −0.8 | −3.14 | −1.98 | nitroethane | H5C2N1O2 |
0131nit | −3.34 | −4.02 | −3.56 | −1.97 | −0.52 | −2.71 | −1.69 | 1-nitropropane | H7C3N1O2 |
0132nit | −3.14 | −2.88 | −2.52 | −2.26 | −0.74 | −2.63 | −1.64 | 2-nitropropane | H7C3N1O2 |
0133nit | −3.08 | −3.92 | −3.47 | −1.72 | −0.33 | −2.4 | −1.48 | 1-nitrobutane | H9C4N1O2 |
0134nit | −4.12 | −6.04 | −5.38 | −4.25 | −2.25 | −5.69 | −3.67 | nitrobenzene | H5C6N1O2 |
0135met | −3.59 | −4.18 | −3.7 | −4.4 | −2.36 | −5.59 | −3.6 | 2-methyl-1-nitrobenzene | H7C7N1O2 |
test1028 | −6.2 | −8.68 | −7.76 | −11.3 | −7.59 | −13.15 | −8.61 | dinoseb | H12C10N2O5 |
test1058 | −11.2 | −12.76 | −11.44 | −13.68 | −9.39 | −15.24 | −9.99 | 4-amino-4′-nitroazobenzene | H10C12N4O2 |
test2022 | −9.45 | −12.16 | −10.9 | −10.07 | −6.66 | −11.81 | −7.72 | 4-nitroaniline | H6C6N2O2 |
test1046 | −2.5 | −3.37 | −2.97 | −4.67 | −2.56 | −4.76 | −3.05 | profluralin | H16C14N3O4F3 |
test1056 | −3.3 | −3.04 | −2.67 | −4.67 | −2.56 | −4.99 | −3.2 | trifluralin | H16C13N3O4F3 |
test1041 | −6 | −9.54 | −8.54 | −9.33 | −6.1 | −10.51 | −6.86 | nitroxyacetone | H5C3N1O4 |
0402adn | −13.6 | −18.22 | −16.37 | −18.38 | −12.96 | −19.59 | −12.87 | 9-methyladenine | H7C6N5 |
0403thi | −10.4 | −11.48 | −10.29 | −16.32 | −11.4 | −17.04 | −11.18 | 1-methylthymine | H8C6N2O2 |
n191 | −16.59 | −15.77 | −14.16 | −18.49 | −13.05 | −19.38 | −12.74 | uracil | H4C4N2O2 |
n200 | −16.92 | −11.46 | −10.27 | −18.65 | −13.16 | −19.54 | −12.84 | 5-fluorouracil | H3C4N2O2F1 |
n201 | −15.46 | −12.9 | −11.57 | −21.18 | −15.08 | −21.13 | −13.89 | 5-trifluoromethyluracil | H3C5N2O2F3 |
n202 | −17.74 | −15.8 | −14.18 | −18.25 | −12.86 | −19.5 | −12.81 | 5-chlorouracil | H3C4N2O2CL1 |
n203 | −18.17 | −15 | −13.47 | −18.54 | −13.08 | −20.24 | −13.31 | 5-bromouracil | H3C4N2O2BR1 |
test2004 | −12.64 | −12.01 | −10.76 | −18.95 | −13.39 | −21.27 | −13.98 | caffeine | H10C8N4O2 |
test2006 | −15.83 | −15.51 | −13.92 | −16.25 | −11.35 | −17.51 | −11.5 | 6-chlorouracil | H3C4N2O2CL1 |
test1013 | −9.7 | −8.09 | −7.23 | −14.11 | −9.72 | −14.48 | −9.49 | bromacil | H13C9N2O2BR1 |
n018 | −5.28 | −9.9 | −8.87 | −9.89 | −6.52 | −9.07 | −5.9 | methylperoxide | H4C1O2 |
n019 | −5.32 | −10.27 | −9.2 | −9.21 | −6 | −8.59 | −5.59 | ethylperoxide | H6C2O2 |
0220tri | −8.7 | −9.38 | −8.39 | −14.22 | −9.8 | −14.22 | −9.31 | trimethylphosphate | H9C3O4P1 |
0221tri | −7.8 | −4.67 | −4.14 | −11.55 | −7.78 | −11.33 | −7.4 | triethylphosphate | H15C6O4P1 |
0222tri | −6.1 | −3.89 | −3.44 | −10.3 | −6.83 | −9.4 | −6.12 | tripropylphosphate | H21C9O4P1 |
test2027 | −8.61 | −10.18 | −9.12 | −13.12 | −8.97 | −13.73 | −8.99 | sulfolane | H8C4O2S1 |
test1040 | −8 | −12.36 | −11.09 | −15.07 | −10.45 | −15.7 | −10.3 | nitralin | H19C13N3O6S1 |
test1039 | −15.5 | −25.61 | −23.04 | −32.58 | −23.73 | −35.93 | −23.7 | metsulfuronmethyl | H15C14N5O6S1 |
test1057 | −4.1 | −1.81 | −1.56 | −4.63 | −2.53 | −3.88 | −2.47 | vernolate | H21C10N1O1S1 |
0136met | −1.24 | −2.05 | −1.78 | −2.78 | −1.13 | −2.09 | −1.28 | methanethiol | H4C1S1 |
0137ethb | −1.3 | −1.52 | −1.3 | −2.61 | −1 | −1.95 | −1.19 | ethanethiol | H6C2S1 |
0138pro | −1.05 | −1.25 | −1.05 | −2.21 | −0.69 | −1.62 | −0.97 | 1-propanethiol | H8C3S1 |
0139thi | −2.55 | −3.43 | −3.03 | −3.81 | −1.92 | −3.88 | −2.46 | thiophenol | H6C6S1 |
0140dim | −1.54 | −0.64 | −0.5 | −1.47 | −0.14 | −0.51 | −0.24 | dimethylsulfide | H6C2S1 |
0141dim | −1.83 | −2.04 | −1.77 | −1.93 | −0.49 | −1.4 | −0.82 | dimethyldisulfide | H6C2S2 |
0142die | −1.43 | 0.52 | 0.55 | −1.24 | 0.04 | −0.21 | −0.04 | diethylsulfide | H10C4S1 |
0143dip | −1.27 | 1.05 | 1.02 | −0.57 | 0.54 | 0.52 | 0.45 | dipropylsulfide | H14C6S1 |
0144thi | −2.73 | −2.13 | −1.85 | −2.78 | −1.13 | −2.47 | −1.53 | thioanisole | H8C7S1 |
0245thi | −1.42 | −2.95 | −2.59 | −1.95 | −0.5 | −1.69 | −1.02 | thiophene | H4C4S1 |
test1044 | −3.6 | −1.72 | −1.48 | −4.74 | −2.62 | −4.23 | −2.7 | pebulate | H21C10N1O1S1 |
test1031 | −6.1 | −2.46 | −2.15 | −17.95 | −12.64 | −17.98 | −11.81 | ethion | H22C9O4P2S4 |
test1036 | −8.2 | −12.93 | −11.6 | −19.82 | −14.05 | −18.93 | −12.44 | malathion | H19C10O6P1S2 |
test1024 | −6.5 | −6.6 | −5.89 | −12.5 | −8.5 | −12.4 | −8.11 | diazinon | H21C12N2O3P1S1 |
0449pho | −5.1 | −8.61 | −7.7 | −18.15 | −12.78 | −17.51 | −11.5 | ethyl4-cyanophenylphenylthiophosphonate | H14C15N1O2P1S1 |
0447pho | −6.27 | −8.29 | −7.41 | −10.9 | −7.29 | −12.95 | −8.48 | diethyl4-nitrophenylthiophosphonate | H14C10N1O2P1S1 |
test1043 | −6.7 | −17.84 | −16.03 | −11.03 | −7.38 | −13.03 | −8.53 | parathion | H14C10N1O5P1S1 |
test1045 | −4.4 | −2.93 | −2.57 | −8.28 | −5.3 | −9.2 | −5.99 | phorate | H17C7O2P1S3 |
test1010 | −10 | −15.51 | −13.93 | −18.66 | −13.17 | −21.58 | −14.19 | azinphosmethyl | H12C10N3O3P1S2 |
0401amia | −9.63 | −10.58 | −9.48 | −13.36 | −9.15 | −13.74 | −9 | 1,1-dimethyl-3-phenylurea | H12C9N2O1 |
n007 | −13.8 | −15.47 | −13.89 | −13.57 | −9.31 | −13.91 | −9.11 | urea | H4C1N2O1 |
test2007 | −18.06 | −21.51 | −19.34 | −22.88 | −16.37 | −25.87 | −17.04 | cyanuricacid | H3C3N3O3 |
0437pho | −6.92 | −12.07 | −10.82 | −18.48 | −13.04 | −20.99 | −13.8 | methyl3-methyl-4-thiomethoxyphenylthiophosphate | H13C9O3P1S2 |
test1012 | −17.2 | −27.37 | −24.63 | −33.94 | −24.76 | −38.29 | −25.26 | bensulfuron | H18C16N4O7S1 |
test1014 | −9 | −9.95 | −8.91 | −11.05 | −7.4 | −11.57 | −7.56 | captan | H8C9N1O2S1CL3 |
test1020 | −14 | −23.13 | −20.8 | −34.54 | −25.21 | −37.03 | −24.43 | chlorimuronethyl | H15C15N4O6S1CL1 |
test1023 | −5.7 | −10.04 | −8.99 | −29.24 | −21.19 | −28.85 | −19.01 | dialifor | H17C14N1O4P1S2CL1 |
test1051 | −20.3 | −21.56 | −19.39 | −30.85 | −22.42 | −32.58 | −21.48 | sulfometuron-methyl | H16C15N4O5S1 |
test1054 | −16.2 | −25.7 | −23.12 | −36.05 | −26.36 | −38.78 | −25.59 | thifensulfuron | H13C12N5O6S2 |
TABLE 8 |
KMTISM, MM-GBSA and MM-PBSA calculated solvation free energies (in kcal/mol) for the test set of 21 charged compounds. |
Compound ID | Exp. ΔGsolv | KMTISM Result | MM-GBSA Result | MM-PBSA Result | Solute Name | Formula |
Anions |
i058 | −76.20 | −70.64 | −70.46 | −71.79 | formicacid | H1C1O2 |
i059 | −77.60 | −70.58 | −63.33 | −65.10 | aceticacid | H3C2O2 |
i060 | −76.20 | −70.91 | −63.15 | −64.37 | propanoicacid | H5C3O2 |
i061 | −74.60 | −71.64 | −60.41 | −61.92 | hexanoicacid | H11C6O2 |
i062 | −74.00 | −71.58 | −60.06 | −61.48 | acrylicacid | H3C3O2 |
i063 | −68.50 | −71.63 | −63.81 | −65.94 | pyruvicacid | H3C3O3 |
i064 | −71.20 | −73.01 | −80.95 | −82.23 | benzoicacid | H5C7O2 |
i118 | −59.30 | −67.87 | −79.96 | −80.54 | trifluoroaceticacid | C2O2F3 |
i119 | −69.70 | −70.83 | −61.19 | −63.11 | chloroaceticacid | H2C2O2CL1 |
i120 | −62.30 | −70.91 | −63.45 | −65.44 | dichloroaceticacid | H1C2O2CL2 |
Cations |
i003 | −76.40 | −78.40 | −62.95 | −64.91 | methylamine | H6C1N1 |
i004 | −71.50 | −78.92 | −91.16 | −87.52 | n-propylamine | H10C3N1 |
i008 | −72.00 | −79.70 | −86.92 | −83.96 | allylamine | H8C3N1 |
i020 | −69.60 | −71.13 | −84.68 | −81.80 | 3-methylaniline | H10C7N1 |
i021 | −69.80 | −71.18 | −82.01 | −79.25 | 4-methylaniline | H10C7N1 |
i023 | −65.80 | −72.90 | −81.93 | −79.18 | 3-aminoaniline | H9C6N2 |
i047 | −85.20 | −90.16 | −84.38 | −79.84 | ammonia | H4N1 |
i048 | −84.60 | −80.59 | −80.98 | −78.91 | hydrazine | H5N2 |
i093 | −71.20 | −82.26 | −70.80 | −67.26 | 4-methoxyaniline | H10C7N1O1 |
i125 | −74.70 | −80.74 | −80.29 | −77.73 | 3-chloroaniline | H7C6N1CL1 |
i126 | −74.10 | −80.70 | −74.39 | −71.60 | 4-chloroaniline | H7C6N1CL1 |
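The comparison in Table 8 can be condensed into per-method error statistics. The following is a minimal sketch, not part of the patent: it copies the 21 rows above and computes the mean unsigned error and root-mean-square error of each method against the experimental values. The names `TABLE8`, `METHODS`, `mean_unsigned_error`, `rmse` and `summarize` are illustrative assumptions.

```python
import math

# (compound ID, experimental, KMTISM, MM-GBSA, MM-PBSA), all in kcal/mol,
# copied row by row from Table 8 (anions first, then cations).
TABLE8 = [
    ("i058", -76.20, -70.64, -70.46, -71.79),
    ("i059", -77.60, -70.58, -63.33, -65.10),
    ("i060", -76.20, -70.91, -63.15, -64.37),
    ("i061", -74.60, -71.64, -60.41, -61.92),
    ("i062", -74.00, -71.58, -60.06, -61.48),
    ("i063", -68.50, -71.63, -63.81, -65.94),
    ("i064", -71.20, -73.01, -80.95, -82.23),
    ("i118", -59.30, -67.87, -79.96, -80.54),
    ("i119", -69.70, -70.83, -61.19, -63.11),
    ("i120", -62.30, -70.91, -63.45, -65.44),
    ("i003", -76.40, -78.40, -62.95, -64.91),
    ("i004", -71.50, -78.92, -91.16, -87.52),
    ("i008", -72.00, -79.70, -86.92, -83.96),
    ("i020", -69.60, -71.13, -84.68, -81.80),
    ("i021", -69.80, -71.18, -82.01, -79.25),
    ("i023", -65.80, -72.90, -81.93, -79.18),
    ("i047", -85.20, -90.16, -84.38, -79.84),
    ("i048", -84.60, -80.59, -80.98, -78.91),
    ("i093", -71.20, -82.26, -70.80, -67.26),
    ("i125", -74.70, -80.74, -80.29, -77.73),
    ("i126", -74.10, -80.70, -74.39, -71.60),
]

# Column index of each method inside a row tuple.
METHODS = {"KMTISM": 2, "MM-GBSA": 3, "MM-PBSA": 4}


def mean_unsigned_error(rows, col):
    """Average absolute deviation of column `col` from the experimental value."""
    return sum(abs(r[col] - r[1]) for r in rows) / len(rows)


def rmse(rows, col):
    """Root-mean-square deviation of column `col` from the experimental value."""
    return math.sqrt(sum((r[col] - r[1]) ** 2 for r in rows) / len(rows))


def summarize(rows=TABLE8):
    """Return {method name: (MUE, RMSE)} for the given rows."""
    return {name: (mean_unsigned_error(rows, c), rmse(rows, c))
            for name, c in METHODS.items()}


if __name__ == "__main__":
    for name, (mue, err) in summarize().items():
        print(f"{name:8s} MUE = {mue:5.2f}  RMSE = {err:5.2f} kcal/mol")
```

Passing `TABLE8[:10]` or `TABLE8[10:]` to `summarize` gives the same statistics for the anion and cation subsets separately.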
Claims (26)
$$V = 2^{Y-4} \times (2\pi)^{Y-3} \times (4\pi)^{Y-2}\, C^{Y-1}$$
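The claim expression above can be evaluated numerically once it is read as V = 2^(Y−4) × (2π)^(Y−3) × (4π)^(Y−2) × C^(Y−1), that is, with each trailing Y−n group treated as an exponent. That reading is an assumption about the flattened notation. The sketch below takes Y and C as plain inputs, since their definitions sit in the claim text that is not reproduced here, and works in log space because the product grows very quickly with Y. The function name `log_sampling_volume` is hypothetical.

```python
import math


def log_sampling_volume(y, c):
    """Natural log of V = 2^(Y-4) * (2*pi)^(Y-3) * (4*pi)^(Y-2) * C^(Y-1).

    `y` and `c` stand for the symbols Y and C in the claim expression;
    the exponent placement is an assumed reading of the flattened text.
    """
    if c <= 0:
        raise ValueError("C must be positive to take its logarithm")
    return ((y - 4) * math.log(2.0)
            + (y - 3) * math.log(2.0 * math.pi)
            + (y - 2) * math.log(4.0 * math.pi)
            + (y - 1) * math.log(c))


if __name__ == "__main__":
    # For Y = 4 and C = 1 the expression reduces to (2*pi) * (4*pi)^2 = 32*pi^3.
    print(math.exp(log_sampling_volume(4, 1.0)), 32 * math.pi ** 3)
```

Returning the logarithm rather than V itself avoids floating-point overflow when Y is in the hundreds or thousands.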
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/143,519 US10332616B2 (en) | 2013-11-01 | 2016-04-30 | Movable type method applied to protein-ligand binding |
US16/385,735 US12224040B2 (en) | 2013-11-01 | 2019-04-16 | Movable type method applied to protein-ligand binding |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361898718P | 2013-11-01 | 2013-11-01 | |
PCT/US2014/063328 WO2015066415A1 (en) | 2013-11-01 | 2014-10-31 | Movable type method applied to protein-ligand binding |
US15/143,519 US10332616B2 (en) | 2013-11-01 | 2016-04-30 | Movable type method applied to protein-ligand binding |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/063328 Continuation-In-Part WO2015066415A1 (en) | 2013-11-01 | 2014-10-31 | Movable type method applied to protein-ligand binding |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/385,735 Division US12224040B2 (en) | 2013-11-01 | 2019-04-16 | Movable type method applied to protein-ligand binding |
Publications (2)
Publication Number | Publication Date |
---|---|
US20160350474A1 (en) | 2016-12-01 |
US10332616B2 (en) | 2019-06-25 |
Family
ID=53005160
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/143,519 Active 2036-02-10 US10332616B2 (en) | 2013-11-01 | 2016-04-30 | Movable type method applied to protein-ligand binding |
US16/385,735 Active 2039-03-02 US12224040B2 (en) | 2013-11-01 | 2019-04-16 | Movable type method applied to protein-ligand binding |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/385,735 Active 2039-03-02 US12224040B2 (en) | 2013-11-01 | 2019-04-16 | Movable type method applied to protein-ligand binding |
Country Status (2)
Country | Link |
---|---|
US (2) | US10332616B2 (en) |
WO (1) | WO2015066415A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12224040B2 (en) | 2013-11-01 | 2025-02-11 | Board Of Trustees Of Michigan State University | Movable type method applied to protein-ligand binding |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6840668B2 (en) * | 2014-11-14 | 2021-03-10 | ディ.イー.ショー リサーチ, エルエルシーD.E.Shaw Research, Llc | Suppression of interactions between bound particles |
EP3762730A4 (en) | 2018-03-05 | 2021-12-01 | The Board of Trustees of the Leland Stanford Junior University | METHODS BASED ON MACHINE LEARNING AND MOLECULAR SIMULATION FOR IMPROVED ASSESSMENT AND ACTIVITY PREDICTION |
US11727282B2 (en) * | 2018-03-05 | 2023-08-15 | The Board Of Trustees Of The Leland Stanford Junior University | Systems and methods for spatial graph convolutions with applications to drug discovery and molecular simulation |
EP3561702A1 (en) * | 2018-04-23 | 2019-10-30 | Covestro Deutschland AG | Method for determining a product composition for a chemical mixture product |
WO2019213581A1 (en) * | 2018-05-03 | 2019-11-07 | Base Pair Biotechnologies, Inc. | Functional ligands to dicamba |
CN111161810B (en) * | 2019-12-31 | 2022-03-22 | 中山大学 | Free energy perturbation method based on constraint probability distribution function optimization |
CN112509647B (en) * | 2020-11-27 | 2022-09-06 | 易波 | Hydrophilic interface selection system and method for holding biological tissue |
US11568961B2 (en) | 2020-12-16 | 2023-01-31 | Ro5 Inc. | System and method for accelerating FEP methods using a 3D-restricted variational autoencoder |
CN113539381B (en) * | 2021-07-16 | 2023-09-05 | 中国海洋大学 | Molecular dynamics result analysis method based on residue interaction and PEN |
US20240290420A1 (en) * | 2023-02-27 | 2024-08-29 | TandemAI Suzhou Co., Ltd. | Systems and methods for computational drug discovery |
CN118866158A (en) * | 2023-04-27 | 2024-10-29 | 华为云计算技术有限公司 | Molecular docking method and system |
CN118351977B (en) * | 2024-04-17 | 2025-03-21 | 苏州予路乾行生物科技有限公司 | A method and system for predicting atomic potential energy between proteins and drug-like compounds |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6178384B1 (en) * | 1997-09-29 | 2001-01-23 | The Trustees Of Columbia University In The City Of New York | Method and apparatus for selecting a molecule based on conformational free energy |
US20030215959A1 (en) | 2002-05-14 | 2003-11-20 | Vasanthi Jayaraman | Method of screening binding of a compound to a receptor |
US20080243452A1 (en) * | 2005-04-19 | 2008-10-02 | Bowers Kevin J | Approaches and architectures for computation of particle interactions |
US20100112724A1 (en) | 2007-04-12 | 2010-05-06 | Dmitry Gennadievich Tovbin | Method of determination of protein ligand binding and of the most probable ligand pose in protein binding site |
WO2015066415A1 (en) | 2013-11-01 | 2015-05-07 | University Of Florida Research Foundation, Inc. | Movable type method applied to protein-ligand binding |
Non-Patent Citations (6)
Title |
---|
"International Application Serial No. PCT/US2014/063328, International Search Report dated Feb. 5, 2015", 3 pgs. |
"International Application Serial No. PCT/US2014/063328, Written Opinion dated Feb. 5, 2015", 7 pgs. |
D. L. Beveridge, Free Energy Via Molecular Simulation: Applications to Chemical and Biomolecular Systems, (Year: 1989). * |
Gallicchio, et al., "Advances in all atom sampling methods for modeling protein-ligand binding affinities", Current Opinion in Structural Biology, vol. 21, No. 2, (2011), 161-166. |
Gilson, et al., "Calculation of protein-ligand binding affinities", The Annual Review of Biophysics and Biomolecular Structure, vol. 36, (2007), 21-42. |
Zheng, et al., "The movable type method applied to protein-ligand binding", Journal of Chemical Theory and Computation, vol. 9, No. 12, (Oct. 28, 2013), 5526-5538. |
Also Published As
Publication number | Publication date |
---|---|
US20190378592A1 (en) | 2019-12-12 |
US12224040B2 (en) | 2025-02-11 |
WO2015066415A1 (en) | 2015-05-07 |
US20160350474A1 (en) | 2016-12-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: UNIVERSITY OF FLORIDA RESEARCH FOUNDATION, INC., F Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHENG, ZHENG;MERZ, KENNETH M., JR.;SIGNING DATES FROM 20160329 TO 20160330;REEL/FRAME:039234/0007 |
|
AS | Assignment |
Owner name: BOARD OF TRUSTEES OF MICHIGAN STATE UNIVERSITY, MI Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:UNIVERSITY OF FLORIDA RESEARCH FOUNDATION, INC.;REEL/FRAME:039288/0603 Effective date: 20160422 |
|
AS | Assignment |
Owner name: NATIONAL INSTITUTES OF HEALTH (NIH), U.S. DEPT. OF Free format text: CONFIRMATORY LICENSE;ASSIGNOR:MICHIGAN STATE UNIVERSITY;REEL/FRAME:040069/0024 Effective date: 20160916 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
CC | Certificate of correction | ||
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY Year of fee payment: 4 |