2021; Huang et al. 2022; Zhao et al. 2022; Bai et al. 2023). However, these approaches often overlook the enormous amount of unlabeled biomedical data, which hinders the models from fully leveraging the chemical structures and interactions of drugs and proteins. Consequently, the models struggle to extract truly informative features, leading to limited generalization ability.

The second challenge is hidden bias and shortcut learning. The issue of hidden bias has been reported on the DUD-E and MUV datasets (Sieg, Flachsenberg, and Rarey 2019). It has been observed that models trained on the DUD-E dataset (Chen et al. 2019) and other datasets (Chen et al. 2020) tend to rely predominantly on drug patterns when making predictions, rather than capturing the comprehensive interaction between drugs and targets. This leads to a gap between theoretical modeling and practical application. We further identify two main reasons for this issue: 1) the presence of a greater variety and quantity of drug molecules in datasets than proteins; 2) the inherent ease of feature extraction for drug molecules compared to proteins. These factors result in shortcut learning, where the model tends to prioritize learning features from the larger and easier-to-learn drug molecule data rather than focusing on the features of proteins. Consequently, the model struggles to effectively capture the interaction features between drugs and proteins.

The third challenge lies in the model's ability to generalize and make predictions on out-of-domain data, which is closely related to the previous two challenges. Developing a first-in-class drug often involves predicting interactions with a completely new target using novel compounds, which may have a distribution that differs significantly from the data on which the model was trained. Thus, the model needs to be capable of cross-domain generalization (Abbasi et al. 2020; Bai et al. 2023; Kao et al. 2021). Currently, most models are trained on limited labeled data and fail to address the issue of shortcut learning, resulting in limited ability to predict interactions between completely new drugs and proteins.

To tackle the three challenges, we propose MlanDTI, a semi-supervised domain adaptive multilevel attention network for DTI prediction. We utilize two pre-trained BERT models to acquire bidirectional embeddings of protein and SMILES (drug) sequences from millions of unlabeled data. Inspired by the least mean squared error reconstruction (Lmser) network (Xu 1993; Huang, Tu, and Xu 2022), we then devise a variant of the transformer with a multi-level attention mechanism that takes drug and protein embeddings as input. It enables the joint extraction of both drug and target features with reduced hidden bias and facilitates the learning of multi-level interactions. Moreover, we incorporate a simple yet effective semi-supervised pseudo-labeling method to further enhance our model's predictive ability in cross-domain scenarios. Experiments on four datasets demonstrate that MlanDTI achieves state-of-the-art performance over other methods under intra-domain settings and outperforms all other approaches under cross-domain settings.

The main contributions are three-fold as follows:

• To leverage massive unlabeled biomedical data, we employ two pre-trained BERT models to acquire representations that possess better robustness and generalization capabilities. We observed that the representations obtained by the BERT models significantly enhance the accuracy of pseudo-labeling.

• We propose a novel multi-level attention mechanism which enables effective feature extraction by allowing the model to dynamically focus on different aspects of proteins and drugs during the learning process. The attention mechanism mitigates the shortcut learning problem and reduces the impact of hidden bias on predictions.

• We propose a simple yet effective pseudo-label domain adaptation method, which significantly reduces the noise of pseudo-labels.

Related Work

Leveraging Additional Data

One of the keys to DTI prediction is how to represent drug molecules and proteins in a way that allows the model to learn useful features. Learning from 3D structural information (Wallach, Dzamba, and Heifets 2015; Stepniewska-Dziubinska, Zielenkiewicz, and Siedlecki 2018) is undoubtedly the most direct approach, but it is limited by high computational costs and model complexity. Another, indirect approach is to provide additional data containing 3D structural information, such as molecular dynamics simulations (Wu et al. 2022) and protein pocket data (Yazdani-Jahromi et al. 2022). While the aforementioned methods are limited by the availability of a finite amount of 3D structural data, MolTrans (Huang et al. 2021), in contrast, leverages a vast amount of unlabeled protein and drug sequences by using the Frequent Consecutive Sub-sequence (FCS) algorithm to extract high-quality substructures and enhances the representations using transformers. However, FCS has certain limitations in its ability to comprehensively extract information from sequence data, and the quantity of unlabeled data utilized is also insufficient. In this paper, we utilize two pre-trained BERT (Devlin et al. 2018) models learned on a large amount of unlabeled data to obtain rich representations of protein and drug sequences with powerful generalization abilities.

Learning Interactions

Proteins and drugs are two fundamentally different types of data, and the task of DTI prediction requires the model to learn their interaction features. The simplest approach is to concatenate the features (Öztürk, Özgür, and Ozkirimli 2018; Lee, Keum, and Nam 2019; Zheng et al. 2020; Nguyen et al. 2021) and pass them through a Fully-Connected Network (FCN) to obtain the prediction results. Another approach (Qian, Wu, and Zhang 2022) is to overlap the feature maps and use a CNN to extract interaction features. However, these methods lack interpretability and overlook the inherent structure of interactions. Recently, attention mechanisms have been demonstrated effective in capturing intricate interactions between proteins and drugs. Multi-head attention (Bian et al. 2023; Chen et al. 2020) and other attention variations (Bai et al. 2023; Zhao et al. 2022) have been widely applied in DTI prediction.
However, Chen et al. (2020) found hidden bias in some datasets that led models to rely mainly on drug patterns rather than on the interactions for prediction. We further observed that this issue is prevalent in existing models. To address it, we propose a multi-level attention mechanism.

Domain Generalization in DTI Predictions

In previous works (Huang et al. 2021; Yazdani-Jahromi et al. 2022; Zhao et al. 2022), the evaluation of model generalization was often conducted through the partitioning of datasets into "unseen drug" or "unseen protein" scenarios, where drugs or proteins were only present in the test set. However, such evaluations still fall into the intra-domain setting, different from real-world applications. Currently, there is limited research on domain generalization in DTI prediction. DrugBAN (Bai et al. 2023) addresses this challenge by utilizing a Conditional Domain Adversarial Network (CDAN) to transfer the learned knowledge from the source domain to the target domain, thereby enhancing the model's performance in cross-domain settings. Here, we leverage pseudo-labeling techniques to mitigate the distribution discrepancy between the target and source domains. Through the integration of an auxiliary classifier and the powerful representational capacity of BERT models, our approach significantly improves the accuracy of pseudo-labeling. Under the cross-domain setting, our method demonstrates remarkable performance, surpassing that of DrugBAN.

Method

Problem Formulation

The task of DTI prediction aims to determine whether a drug compound and a target protein will interact. For drug compounds, most existing deep learning methods utilize SMILES strings to represent the drugs. Specifically, a drug is represented as D = (d_1, ..., d_m), where d_i is a SMILES symbol with chemical meaning, such as an atom, and m is the length. As for target proteins, each protein sequence is represented as T = (a_1, ..., a_n), where a_i corresponds to one of the 23 amino acids and n is the length of the protein sequence.

Given a drug SMILES sequence D and a protein sequence T, the objective is to train a model to assign an interaction probability score P ∈ [0, 1] by mapping the joint feature representation space D × T.

The Proposed Framework

An overview of MlanDTI is depicted in Figure 1. It commences by encoding the drug and target sequences into vector embeddings via pre-trained BERT models, i.e., ChemBERTa-2 (Ahmad et al. 2022) and ProtTrans (Elnaggar et al. 2021). Subsequently, these embeddings are passed through the encoder and decoder of a modified transformer architecture with a multi-level attention module to extract interaction features. The classifier comprises a bilinear attention module and a max pooling layer, followed by an FCN for prediction. For cross-domain prediction, we employ an auxiliary classifier that directly accepts BERT outputs. It aids in learning implicit distributional information from BERT representations, thereby enhancing pseudo-label accuracy. After training the two classifiers on labeled source domain data, the model predicts on unlabeled target domain data to obtain pseudo-labels. The pseudo-label learning process consists of learning high-confidence pseudo-labels and minimizing conflicting predictions.
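As a sketch of this first stage, the two sequence encoders can be loaded through the HuggingFace transformers API. The checkpoint IDs below are publicly hosted versions of ChemBERTa-2 and ProtBert (ProtTrans) that we assume for illustration; they are not necessarily the exact checkpoints used here, and the toy inputs are not drawn from the datasets.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint IDs for ChemBERTa-2 and ProtTrans (ProtBert).
drug_tok = AutoTokenizer.from_pretrained("DeepChem/ChemBERTa-77M-MLM")
drug_bert = AutoModel.from_pretrained("DeepChem/ChemBERTa-77M-MLM")
prot_tok = AutoTokenizer.from_pretrained("Rostlab/prot_bert")
prot_bert = AutoModel.from_pretrained("Rostlab/prot_bert")

smiles = "CC(=O)Oc1ccccc1C(=O)O"   # toy drug (aspirin)
protein = " ".join("MGNPVR")        # ProtBert expects space-separated residues

with torch.no_grad():
    drug_emb = drug_bert(**drug_tok(smiles, return_tensors="pt")).last_hidden_state
    prot_emb = prot_bert(**prot_tok(protein, return_tensors="pt")).last_hidden_state
# drug_emb: (1, m, d_drug); prot_emb: (1, n, d_prot) -> inputs to the
# decoder/encoder and to the auxiliary classifier.
```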
Encoder for Protein Sequence We build the encoder by adopting a modification of the transformer similar to TransformerCPI (Chen et al. 2020). Instead of using the self-attention module, we utilize a 1D-CNN and a GLU (gated linear unit) (Dauphin et al. 2017) as alternatives. The hidden layers h_0, ..., h_L in the encoder are computed as

    h_i(X_T) = (X_T W_{i1} + s) ⊗ σ(X_T W_{i2} + t),    (1)

where X_T ∈ R^{n×m_1} is the input of layer h_i; W_{i1} ∈ R^{k×m_1×m_2}, s ∈ R^{m_2}, W_{i2} ∈ R^{k×m_1×m_2}, and t ∈ R^{m_2} are parameters; n is the input sequence length; k is the patch size; m_1 and m_2 are the dimensions of the input and hidden vectors; σ is the sigmoid function; and ⊗ is the element-wise product.

Since the length of a protein sequence may range in the thousands or even tens of thousands, the self-attention module in transformers poses a significant computational and memory burden, with O(n²) time and space complexity, and is prone to overfitting when working on small datasets. The modification in Eq. (1) mitigates the computational and storage burden on long protein sequences and remedies overfitting on small datasets.
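As a concrete illustration, a single gated convolutional layer implementing our reading of Eq. (1) can be sketched in PyTorch as follows; the kernel size and dimensions are illustrative assumptions rather than the paper's settings.

```python
import torch
import torch.nn as nn

class GLUConvLayer(nn.Module):
    """One encoder layer per Eq. (1): a 1D convolution whose output is gated
    by a sigmoid-activated parallel convolution (Dauphin et al. 2017)."""

    def __init__(self, m1: int, m2: int, k: int = 7):
        super().__init__()
        # The two convolutions play the roles of (W_i1, s) and (W_i2, t).
        self.conv_value = nn.Conv1d(m1, m2, kernel_size=k, padding=k // 2)
        self.conv_gate = nn.Conv1d(m1, m2, kernel_size=k, padding=k // 2)

    def forward(self, x_t: torch.Tensor) -> torch.Tensor:
        x = x_t.transpose(1, 2)           # (batch, n, m1) -> (batch, m1, n)
        h = self.conv_value(x) * torch.sigmoid(self.conv_gate(x))  # gating
        return h.transpose(1, 2)          # back to (batch, n, m2)

# A protein of length 1000 with 128-dim embeddings -> 256-dim hidden vectors.
h = GLUConvLayer(m1=128, m2=256)(torch.randn(2, 1000, 128))
```

With an odd kernel size and symmetric padding, the sequence length is preserved, so layers can be stacked to obtain h_0, ..., h_L.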
Figure 1: (a) The overall framework of MlanDTI. It consists of two pre-trained BERT models that convert SMILES and amino acid sequences into vector embeddings. The encoder and decoder are connected by a multilevel attention module, and the final output is processed through the classifier with a bilinear attention module and a max pooling layer before being fed into an FCN to generate the prediction results. (b) The details of the multilevel attention module. (c) Training with pseudo-labeling using an auxiliary classifier.
Multilevel Cross-Attention For the task of DTI prediction, the most crucial ability for the model is to learn the interaction patterns between drugs and targets. This involves aligning the features of proteins with the features of drugs in a shared feature subspace. However, extracting multi-level features from proteins is more challenging than extracting features from drugs, because protein sequences are notably long, with intricate multi-level structures, while drugs are often small chemical molecules. This difference contributes to the hidden bias in DTI models (another source is inherent dataset bias). Aligning protein features with drug features also requires a multi-level process, but the model may not capture the multi-level features of proteins well and effectively align them with drug features. Thus, existing models tend to learn a shortcut by relying on the features of drug molecules to predict drug-target interactions.

In early literature (Xu 1993), the Lmser network was proposed to enhance representation learning by building bidirectional skip connections at every level of layers between encoder and decoder. It was first demonstrated in a deep CNN implementation to be robust and effective in image processing (Huang, Tu, and Xu 2022; Xu 2019), and then the Lmser-transformer was developed to improve molecular representation learning by adding hierarchical connections to the original transformer (Qian et al. 2022). Inspired by these works, we propose a multi-level cross-attention mechanism to address this issue, as illustrated in Figure 1(a). In the vanilla transformer, the encoder uses the protein features from its last layer as the Key and Value for the cross-attention layer of the decoder, aligning them with the drug features in the decoder. However, the protein features obtained from the encoder's output do not fully capture the expression of the multi-level structural information of proteins, and they do not align with drug features at different levels in the decoder.

Here, we develop the multi-level attention mechanism in two steps: 1) the multi-level feature fusing step and 2) the cross-attention feature aligning step. Suppose the protein feature matrices of each encoder layer are T_0, ..., T_n ∈ R^{m×d}, where n is the number of transformer layers, m is the size of the protein sequence, and d is the vector dimension. For the ℓ-th decoder layer, we concatenate the protein feature matrices from the preceding ℓ layers to form Tcat_ℓ = [T_0, ..., T_ℓ] ∈ R^{ℓ×m×d}. Then, we perform a cross-layer feature aggregation by applying a fusion matrix F_ℓ ∈ R^{ℓ×1}. This results in the multi-level fused protein feature matrix T′_ℓ = F_ℓ^T Tcat_ℓ. To summarize, we compute all T′_ℓ as

    diag(T′_0, ..., T′_n) = F · diag(Tcat_0, ..., Tcat_n),    (2)

where F is a learnable diagonal matrix with each diagonal element being F_ℓ from each layer, i.e., ℓ = 0, ..., n. Then, the query, key, and value for the multi-level cross-attention mechanism at the ℓ-th layer are respectively computed by

    Q = D_ℓ W_q,   K = T′_ℓ W_k,   V = T′_ℓ W_v,    (3)

where D_ℓ is the drug feature matrix which has passed through the self-attention module, and T′_ℓ is the multi-level protein feature matrix given by Eq. (2).

To enhance the extraction capabilities of attention heads for multi-level interactions, we incorporate the talking-heads attention mechanism (Shazeer et al. 2020) for feature alignment. This variation of multi-head attention introduces two additional linear projections, which transform the attention logits and the attention weights, respectively, allowing information to flow across different attention heads and improving the overall performance of the model, i.e.,

    Attention(Q, K, V) = softmax(P_ℓ (Q K^T / √d_k)) P_w V,    (4)

where Q, K, V are given by Eq. (3), and P_ℓ ∈ R^{h_k×h_k}, P_w ∈ R^{h_k×h_v} are the two additional linear projections. h_k represents the number of attention heads for keys and queries, and h_v denotes the number of attention heads for values; they can optionally differ in size.

The advantages of the proposed multi-level attention mechanism are briefly summarized below; a code sketch of the mechanism follows the list.

• Encourage multi-level feature learning: By fusing protein features, drug features are derived to interact with relevant characteristics, which thereby captures multi-level interaction features, leading to a more comprehensive understanding of drug-target interactions.

• Alleviate hidden bias and reduce overfitting: Multi-level attention encourages the model to focus more on hierarchical interaction features; the model becomes less prone to biased representations that might emerge from focusing solely on specific patterns, and thus it is less likely to overfit to noisy patterns in the data.

• Improve generalization abilities: Multi-level attention enables the model to learn domain-invariant interaction features. These representations exhibit robustness and enhance transferability across different data domains.
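To make the two steps concrete, the following is a minimal PyTorch sketch of Eqs. (2)-(4) for a single decoder layer. It assumes equal numbers of key and value heads (h_k = h_v), one scalar fusion weight per encoder level, and illustrative tensor shapes; it sketches the mechanism rather than reproducing the paper's implementation.

```python
import torch
import torch.nn as nn

class MultiLevelTalkingHeadsAttention(nn.Module):
    """Fuse per-level protein features with learnable weights F_l (Eq. 2),
    form Q/K/V (Eq. 3), and mix attention logits/weights across heads with
    the talking-heads projections P_l and P_w (Eq. 4)."""

    def __init__(self, d: int, heads: int, num_levels: int):
        super().__init__()
        self.h, self.d_k = heads, d // heads
        self.w_q = nn.Linear(d, d)
        self.w_k = nn.Linear(d, d)
        self.w_v = nn.Linear(d, d)
        self.fuse = nn.Parameter(torch.full((num_levels,), 1.0 / num_levels))  # F_l
        self.p_l = nn.Parameter(torch.eye(heads))  # mixes attention logits
        self.p_w = nn.Parameter(torch.eye(heads))  # mixes attention weights

    def forward(self, drug: torch.Tensor, protein_levels: list) -> torch.Tensor:
        # drug: (b, m_d, d); protein_levels: per-encoder-layer (b, m_p, d) tensors.
        t = torch.stack(protein_levels)                      # (L, b, m_p, d)
        t_fused = torch.einsum("l,lbmd->bmd", self.fuse, t)  # Eq. (2)
        b, m_d, _ = drug.shape
        split = lambda x: x.view(b, -1, self.h, self.d_k).transpose(1, 2)
        q = split(self.w_q(drug))      # (b, h, m_d, d_k)
        k = split(self.w_k(t_fused))   # (b, h, m_p, d_k)
        v = split(self.w_v(t_fused))
        logits = q @ k.transpose(-2, -1) / self.d_k ** 0.5   # (b, h, m_d, m_p)
        logits = torch.einsum("ij,bjqk->biqk", self.p_l, logits)        # P_l
        weights = torch.einsum("ij,bjqk->biqk", self.p_w, logits.softmax(-1))  # P_w
        out = weights @ v                                    # Eq. (4), per head
        return out.transpose(1, 2).reshape(b, m_d, -1)

# Example: 4 encoder levels, drug length 50, protein length 300, d = 256.
attn = MultiLevelTalkingHeadsAttention(d=256, heads=8, num_levels=4)
out = attn(torch.randn(2, 50, 256), [torch.randn(2, 300, 256) for _ in range(4)])
```

Mixing the logits (P_ℓ) before the softmax and the weights (P_w) after it is what lets information flow across heads, following Shazeer et al. (2020).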
The Classifier The classifier consists of the bilinear attention module from HyperAttentionDTI (Zhao et al. 2022) to further extract bidirectional interaction features. Subsequently, we utilize a multi-layer FCN, with each layer followed by a leaky ReLU activation function (He et al. 2015), to combine these features and generate the prediction results. Since this is a binary classification problem, we train the model with the binary cross-entropy loss

    L_CE = −[y log ŷ + (1 − y) log(1 − ŷ)],    (5)

where y is the ground-truth label and ŷ is the classifier's output.
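For illustration, the prediction head and loss of Eq. (5) can be sketched as below; the layer widths are placeholder assumptions, and a pooled feature vector stands in for the bilinear-attention output.

```python
import torch
import torch.nn as nn

# Minimal sketch of the prediction head: FCN layers with leaky ReLU,
# ending in a sigmoid to produce the interaction probability P in [0, 1].
head = nn.Sequential(
    nn.Linear(512, 256), nn.LeakyReLU(),
    nn.Linear(256, 64), nn.LeakyReLU(),
    nn.Linear(64, 1), nn.Sigmoid(),
)
criterion = nn.BCELoss()              # Eq. (5): binary cross-entropy

joint = torch.randn(8, 512)           # batch of pooled interaction features
y = torch.randint(0, 2, (8, 1)).float()
loss = criterion(head(joint), y)
```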
Pseudo Label Learning for Domain Adaptation

Pseudo-labeling (Lee et al. 2013) is a semi-supervised learning (SSL) method that utilizes a model trained on labeled source domain data to generate pseudo-labels.

The second step is to penalize conflicting predictions. Let X_d be the set of samples for which the two classifiers exhibit conflicting classifications, i.e.,

    X_d = {x^(i) | x^(i) ∈ X_t, argmax p^(i) ≠ argmax p^(i)_aux},    (9)

where p^(i) = (p^(i)_0, p^(i)_1) and p^(i)_aux = (p^(i)_{0,aux}, p^(i)_{1,aux}).

We randomly select a subset X′_d of size M′ from X_d, where the value of M′ increases with the number of model iterations. We utilize a modified binary cross-entropy loss to augment the prediction uncertainty for conflicting samples between the two classifiers.
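A minimal sketch of the conflict-set construction in Eq. (9) follows; the (N, 2) probability layout and the growth schedule for M′ are assumptions.

```python
import torch

def conflict_set(p_main: torch.Tensor, p_aux: torch.Tensor, m_prime: int):
    """Eq. (9): indices where the primary and auxiliary classifiers disagree,
    with a random subset of size M' drawn for the penalty term."""
    disagree = p_main.argmax(dim=1) != p_aux.argmax(dim=1)  # conflicting samples
    idx = disagree.nonzero(as_tuple=True)[0]                # X_d
    perm = idx[torch.randperm(idx.numel())]
    return perm[:m_prime]                                   # random subset X'_d

# Example with two 4-sample probability tables.
p = torch.tensor([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
q = torch.tensor([[0.8, 0.2], [0.7, 0.3], [0.4, 0.6], [0.1, 0.9]])
print(conflict_set(p, q, m_prime=2))  # subset of the disagreeing indices {1, 2}
```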
Table 1: The results of the proposed model and baselines on four datasets (5 random runs). Metrics: AUROC (AUC), AUPRC (AUPR), F1-score (F1). The best results are indicated in bold; "–" means no result for this metric.

Table 2: In-domain (cold-pair split: unseen drugs & proteins) and cross-domain (clustering-based split) comparison on the BindingDB and BioSNAP datasets (5 random runs). 1) Underlined values: we chose a threshold of 0.5 (the same as in MolTrans) to calculate the F1-score of DrugBAN, to ensure a fair comparison and to avoid ineffective classification caused by overly low thresholds in DrugBAN; further information is provided in the appendix. 2) NA: not applicable to this study. 3) "with PL" within parentheses refers to our method incorporating the pseudo-labeling module.

                           cold                                     cross-domain
methods            BindingDB            BioSNAP             BindingDB            BioSNAP
                 AUC   AUPR  F1      AUC   AUPR  F1      AUC   AUPR  F1      AUC   AUPR  F1
MolTrans         0.595 0.522 0.511   0.672 0.697 0.437   0.537 0.476 0.389   0.632 0.635 0.401
TransformerCPI   0.656 0.594 0.566   0.680 0.708 0.523   0.568 0.450 0.410   0.656 0.693 0.432
HyperAttDTI      0.661 0.598 0.582   0.732 0.760 0.539   0.545 0.462 0.376   0.654 0.685 0.395
DrugBAN          0.655 0.600 0.542   0.651 0.667 0.449   0.578 0.471 0.484   0.608 0.606 0.438
DrugBAN_CDAN     NA    NA    NA      NA    NA    NA      0.616 0.512 0.426   0.673 0.706 0.542
Ours             0.671 0.594 0.601   0.782 0.801 0.653   0.657 0.537 0.489   0.728 0.759 0.604
Ours (with PL)   NA    NA    NA      NA    NA    NA      0.687 0.579 0.564   0.749 0.770 0.629
(SVM) (Cortes and Vapnik 1995), Random Forest (RF) (Ho 1995), GraphDTA (Nguyen et al. 2021), DeepConv-DTI (Lee, Keum, and Nam 2019), MolTrans (Huang et al. 2021), TransformerCPI (Chen et al. 2020), HyperAttentionDTI (Zhao et al. 2022), and DrugBAN (Bai et al. 2023). These baselines encompass both classic machine learning methods and the current state-of-the-art deep learning approaches, ensuring a comprehensive comparison. All deep learning methods were employed with their default configurations as provided by their respective authors. Our proposed method is implemented in PyTorch, utilizing the Adam optimizer with an initial learning rate of 0.001. Detailed hyperparameter settings are provided in the appendix.
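A minimal sketch of this training setup follows; only the optimizer choice and the learning rate come from the text above, while the model and batch are placeholders.

```python
import torch

model = torch.nn.Linear(512, 1)   # stand-in for MlanDTI, not the real network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, lr = 0.001
criterion = torch.nn.BCEWithLogitsLoss()

# One illustrative optimization step on a random batch.
x, y = torch.randn(8, 512), torch.randint(0, 2, (8, 1)).float()
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```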
Intra-domain Experiments

Table 1 displays the comparison on the human and C.elegans datasets. These two datasets are relatively small, with balanced positive and negative samples, enabling us to evaluate the model's predictive ability within the same distribution. Our method outperforms all deep learning baselines in terms of AUROC and AUPRC, and it also exhibits competitive performance in terms of F1-score.

We also conducted comparisons on the larger datasets, BindingDB and BioSNAP. In the random split tests, our model achieved state-of-the-art performance on the BioSNAP dataset, but its performance on the BindingDB dataset was not particularly competitive. This discrepancy is due to the hidden bias issue present in the BindingDB dataset.

The BindingDB dataset contains 14643 drugs and 2623 proteins, which results in an extremely imbalanced drug-to-protein ratio compared to the other datasets (BioSNAP: 4510/2181, human: 2726/2001, C.elegans: 1767/1876). Compared to the other three datasets, deep learning models even struggle to outperform traditional machine learning methods (AUC: RF 0.942, DeepConv-DTI 0.945) on the BindingDB dataset. Previous studies (Bai et al. 2023) have also reported that performance on the BindingDB dataset under the unseen-drug setting shows minimal decline compared to random splits. This phenomenon is attributed to the presence of a large number of highly similar molecules in the dataset, which makes it challenging for the naive unseen-drug setting to distinguish between them. The excessive number of highly similar drug samples causes baseline models to lean towards learning drug patterns rather than drug-target interactions for prediction. As a result, deep learning and machine learning methods exhibit similar performance levels. However, this shortcut learning contradicts the original intent of DTI prediction and cannot be considered reliable in practical applications.
Table 3: Ablation results under the cross-domain setting on the BindingDB and BioSNAP datasets.

                  BindingDB               BioSNAP
Ablation      AUC   AUPR  F1          AUC   AUPR  F1
Full model    0.687 0.579 0.564       0.749 0.770 0.629
-BERT         0.573 0.455 0.413       0.648 0.671 0.499
-MLA          0.628 0.511 0.523       0.731 0.753 0.585
-PL           0.657 0.537 0.489       0.728 0.759 0.604
-Aux Cls      0.626 0.486 0.503       0.739 0.776 0.633
However, our model focuses more on learning the multi-level interactions between proteins and drugs. In the cold split setting in Table 2, the model can only learn drug-target interaction features, due to the lack of sufficiently similar drug and protein molecules as references. Our model outperforms the other baselines on the BindingDB dataset, while on the more balanced BioSNAP dataset, our model achieves a superior performance compared to the baselines.

Overall, the challenges posed by the hidden bias issue on the BindingDB dataset highlight the importance of our model's ability to capture multi-level drug-target interactions, which allows it to perform well in scenarios where other baselines struggle to maintain effectiveness.

Cross-Domain Experiments

Table 2 presents a comparison of model performance on the BindingDB and BioSNAP datasets under the cross-domain setting. Compared to the intra-domain setting, the majority of models experience a significant performance drop due to the differences in data distributions. Particularly, for the BindingDB dataset, the clustering-based strategy ensures that there are no similar drugs or proteins between the training and testing sets, preventing the models from relying on drug patterns. This breaks the false high-performance illusion observed in the intra-domain scenario, and some models even perform no better than random guessing (AUC: 0.5). Among all baselines, DrugBAN_CDAN, which leveraged a conditional domain adversarial network (CDAN) for domain adaptation, achieved the best performance. However, DrugBAN_CDAN did not surpass our vanilla model without pseudo-labeling, and our model with pseudo-labeling significantly outperformed all state-of-the-art models, including DrugBAN with the domain adaptation module. Specifically, our model outperformed DrugBAN_CDAN by 11.52% and 11.29% (AUROC) on the BindingDB and BioSNAP datasets, respectively.

Ablation Studies

We conducted ablation studies in Table 3 under the cross-domain setting on the BindingDB and BioSNAP datasets to analyze the effectiveness of the modules in our proposed model.

Figure 2: Ablation experiments on pseudo-labeling accuracy on (a) the BindingDB dataset and (b) the BioSNAP dataset.

Effectiveness of BERT Embeddings We replaced BERT with Word2Vec and GCN, as used in TransformerCPI (Chen et al. 2020), to obtain embeddings for drugs and proteins. As shown in Table 3, the performance of the model experienced a notable decline. This outcome can be attributed to the auxiliary classifier's inability to effectively capture the implicit relationship between the source and target domains through the representations. As a result, as shown in Figure 2, the accuracy of pseudo-labels exhibited a significant drop, introducing a substantial amount of noisy pseudo-labels that deteriorated the model's performance.

Effectiveness of Multilevel Attention We replaced the multilevel attention (MLA) mechanism with the original Transformer multi-head attention. On both datasets, the model exhibited performance drops of varying degrees. With an increase in training iterations, a significant decline in the accuracy of pseudo-labels was observed. It turns out that the multilevel attention mechanism is better equipped to capture domain-invariant drug-target interaction features, thereby enhancing the model's performance in the target domain.

Effectiveness of Pseudo Labeling and Auxiliary Classifier Pseudo-labeling (PL) proves effective in enhancing the model's performance within the target domain. Concurrently, the auxiliary classifier contributes to reducing the noise within these pseudo-labels. This effect is particularly pronounced on the BindingDB dataset, which exhibits substantial disparities in domain distributions. The absence of the auxiliary classifier exacerbates the noise present within the pseudo-labels, leaving the pseudo-labeling approach insufficient to enhance the model's performance.

Conclusion

In this paper, we proposed MlanDTI, a semi-supervised domain adaptive multilevel attention network that leverages a large amount of unlabeled data to obtain enriched bidirectional representations of drugs and proteins from pre-trained BERT models. Additionally, we introduced the multilevel attention mechanism to capture domain-invariant interaction features between proteins and drugs at different levels and depths. Finally, we incorporated a simple yet effective pseudo-labeling method to further enhance our model's generalization ability. Our model demonstrated excellent domain generalization capabilities, making it well-suited for predicting interactions between new drugs and targets in drug development. Through comprehensive comparisons with state-of-the-art models, we established a substantial performance superiority over prior methodologies.
Acknowledgements

This work was supported by the National Natural Science Foundation of China (62172273) and Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102). Shikui Tu and Lei Xu are co-corresponding authors.

References

Abbasi, K.; Razzaghi, P.; Poso, A.; Amanlou, M.; Ghasemi, J. B.; and Masoudi-Nejad, A. 2020. DeepCDA: deep cross-domain compound–protein affinity prediction through LSTM and convolutional neural networks. Bioinformatics, 36(17): 4633–4642.
Agamah, F. E.; Mazandu, G. K.; Hassan, R.; Bope, C. D.; Thomford, N. E.; Ghansah, A.; and Chimusa, E. R. 2020. Computational/in silico methods in drug target and lead prediction. Briefings in bioinformatics, 21(5): 1663–1675.
Ahmad, W.; Simon, E.; Chithrananda, S.; Grand, G.; and Ramsundar, B. 2022. Chemberta-2: Towards chemical foundation models. arXiv preprint arXiv:2209.01712.
Bai, P.; Miljković, F.; John, B.; and Lu, H. 2023. Interpretable bilinear attention network with domain adaptation improves drug–target prediction. Nature Machine Intelligence, 5(2): 126–136.
Bakheet, T. M.; and Doig, A. J. 2009. Properties and identification of human protein drug targets. Bioinformatics, 25(4): 451–457.
Bian, J.; Zhang, X.; Zhang, X.; Xu, D.; and Wang, G. 2023. MCANet: shared-weight-based MultiheadCrossAttention network for drug–target interaction prediction. Briefings in Bioinformatics, 24(2): bbad082.
Broach, J. R.; Thorner, J.; et al. 1996. High-throughput screening for drug discovery. Nature, 384(6604): 14–16.
Chen, L.; Cruz, A.; Ramsey, S.; Dickson, C. J.; Duca, J. S.; Hornak, V.; Koes, D. R.; and Kurtzman, T. 2019. Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening. PloS one, 14(8): e0220113.
Chen, L.; Tan, X.; Wang, D.; Zhong, F.; Liu, X.; Yang, T.; Luo, X.; Chen, K.; Jiang, H.; and Zheng, M. 2020. TransformerCPI: improving compound–protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics, 36(16): 4406–4414.
Cheng, F.; Zhou, Y.; Li, J.; Li, W.; Liu, G.; and Tang, Y. 2012. Prediction of chemical–protein interactions: multitarget-QSAR versus computational chemogenomic methods. Molecular BioSystems, 8(9): 2373–2384.
Cortes, C.; and Vapnik, V. 1995. Support-vector networks. Machine learning, 20: 273–297.
Dauphin, Y. N.; Fan, A.; Auli, M.; and Grangier, D. 2017. Language modeling with gated convolutional networks. In International conference on machine learning, 933–941. PMLR.
Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Elnaggar, A.; Heinzinger, M.; Dallago, C.; Rehawi, G.; Wang, Y.; Jones, L.; Gibbs, T.; Feher, T.; Angerer, C.; Steinegger, M.; et al. 2021. Prottrans: Toward understanding the language of life through self-supervised learning. IEEE transactions on pattern analysis and machine intelligence, 44(10): 7112–7127.
Ezzat, A.; Wu, M.; Li, X.-L.; and Kwoh, C.-K. 2019. Computational prediction of drug–target interactions using chemogenomic approaches: an empirical survey. Briefings in bioinformatics, 20(4): 1337–1357.
Faulon, J.-L.; Misra, M.; Martin, S.; Sale, K.; and Sapra, R. 2008. Genome scale enzyme–metabolite and drug–target interaction predictions using the signature molecular descriptor. Bioinformatics, 24(2): 225–233.
He, K.; Zhang, X.; Ren, S.; and Sun, J. 2015. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision, 1026–1034.
Ho, T. K. 1995. Random decision forests. In Proceedings of 3rd international conference on document analysis and recognition, volume 1, 278–282. IEEE.
Huang, K.; Xiao, C.; Glass, L. M.; and Sun, J. 2021. MolTrans: molecular interaction transformer for drug–target interaction prediction. Bioinformatics, 37(6): 830–836.
Huang, L.; Lin, J.; Liu, R.; Zheng, Z.; Meng, L.; Chen, X.; Li, X.; and Wong, K.-C. 2022. CoaDTI: multi-modal co-attention based framework for drug–target interaction annotation. Briefings in Bioinformatics, 23(6): bbac446.
Huang, W.; Tu, S.; and Xu, L. 2022. Deep CNN based Lmser and strengths of two built-in dualities. Neural Processing Letters, 54(5): 3565–3581.
Kao, P.-Y.; Kao, S.-M.; Huang, N.-L.; and Lin, Y.-C. 2021. Toward drug-target interaction prediction via ensemble modeling and transfer learning. In 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2384–2391. IEEE.
Kipf, T. N.; and Welling, M. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
Lee, D.-H.; et al. 2013. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In Workshop on challenges in representation learning, ICML, volume 3, 896. Atlanta.
Lee, I.; Keum, J.; and Nam, H. 2019. DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences. PLoS computational biology, 15(6): e1007129.
Liu, T.; Lin, Y.; Wen, X.; Jorissen, R. N.; and Gilson, M. K. 2007. BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic acids research, 35(suppl 1): D198–D201.
Meng, F.-R.; You, Z.-H.; Chen, X.; Zhou, Y.; and An, J.-Y. 2017. Prediction of drug–target interaction networks
from the integration of protein sequences and drug chemical structures. Molecules, 22(7): 1119.
Nguyen, T.; Le, H.; Quinn, T. P.; Nguyen, T.; Le, T. D.; and Venkatesh, S. 2021. GraphDTA: Predicting drug–target binding affinity with graph neural networks. Bioinformatics, 37(8): 1140–1147.
Öztürk, H.; Özgür, A.; and Ozkirimli, E. 2018. DeepDTA: deep drug–target binding affinity prediction. Bioinformatics, 34(17): i821–i829.
Paul, S. M.; Mytelka, D. S.; Dunwiddie, C. T.; Persinger, C. C.; Munos, B. H.; Lindborg, S. R.; and Schacht, A. L. 2010. How to improve R&D productivity: the pharmaceutical industry's grand challenge. Nature reviews Drug discovery, 9(3): 203–214.
Qian, H.; Lin, C.; Zhao, D.; Tu, S.; and Xu, L. 2022. AlphaDrug: protein target specific de novo molecular generation. PNAS nexus, 1(4): pgac227.
Qian, Y.; Wu, J.; and Zhang, Q. 2022. CAT-CPI: Combining CNN and transformer to learn compound image features for predicting compound-protein interactions. Frontiers in Molecular Biosciences, 9: 963912.
Rifaioglu, A. S.; Atas, H.; Martin, M. J.; Cetin-Atalay, R.; Atalay, V.; and Doğan, T. 2019. Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases. Briefings in bioinformatics, 20(5): 1878–1912.
Rizve, M. N.; Duarte, K.; Rawat, Y. S.; and Shah, M. 2021. In defense of pseudo-labeling: An uncertainty-aware pseudo-label selection framework for semi-supervised learning. arXiv preprint arXiv:2101.06329.
Scarselli, F.; Gori, M.; Tsoi, A. C.; Hagenbuchner, M.; and Monfardini, G. 2008. The graph neural network model. IEEE transactions on neural networks, 20(1): 61–80.
Shazeer, N.; Lan, Z.; Cheng, Y.; Ding, N.; and Hou, L. 2020. Talking-heads attention. arXiv preprint arXiv:2003.02436.
Sieg, J.; Flachsenberg, F.; and Rarey, M. 2019. In need of bias control: evaluating chemical data for machine learning in structure-based virtual screening. Journal of chemical information and modeling, 59(3): 947–961.
Stepniewska-Dziubinska, M. M.; Zielenkiewicz, P.; and Siedlecki, P. 2018. Development and evaluation of a deep learning model for protein–ligand binding affinity prediction. Bioinformatics, 34(21): 3666–3674.
Tsubaki, M.; Tomii, K.; and Sese, J. 2019. Compound–protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics, 35(2): 309–318.
Wallach, I.; Dzamba, M.; and Heifets, A. 2015. AtomNet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery. arXiv preprint arXiv:1510.02855.
Wang, X.-r.; Cao, T.-t.; Jia, C. M.; Tian, X.-m.; and Wang, Y. 2021. Quantitative prediction model for affinity of drug–target interactions based on molecular vibrations and overall system of ligand-receptor. BMC bioinformatics, 22(1): 1–18.
Wu, F.; Jin, S.; Jiang, Y.; Jin, X.; Tang, B.; Niu, Z.; Liu, X.; Zhang, Q.; Zeng, X.; and Li, S. Z. 2022. Pre-Training of Equivariant Graph Matching Networks with Conformation Flexibility for Drug Binding. Advanced Science, 9(33): 2203796.
Xu, L. 1993. Least mean square error reconstruction principle for self-organizing neural-nets. Neural networks, 6(5): 627–648.
Xu, L. 2019. An overview and perspectives on bidirectional intelligence: Lmser duality, double IA harmony, and causal computation. IEEE/CAA Journal of Automatica Sinica, 6(4): 865–893.
Yazdani-Jahromi, M.; Yousefi, N.; Tayebi, A.; Kolanthai, E.; Neal, C. J.; Seal, S.; and Garibay, O. O. 2022. AttentionSiteDTI: an interpretable graph-based model for drug-target interaction prediction using NLP sentence-level relation classification. Briefings in Bioinformatics, 23(4): bbac272.
Zhao, Q.; Zhao, H.; Zheng, K.; and Wang, J. 2022. HyperAttentionDTI: improving drug–protein interaction prediction by sequence-based deep learning with attention mechanism. Bioinformatics, 38(3): 655–662.
Zheng, S.; Li, Y.; Chen, S.; Xu, J.; and Yang, Y. 2020. Predicting drug–protein interaction using quasi-visual question answering system. Nature Machine Intelligence, 2(2): 134–140.