Chee Keong Kwoh

Nanyang Technological University, School of Computer Engineering, Faculty Member

Papers by Chee Keong Kwoh

ADATIME: A Benchmarking Suite for Domain Adaptation on Time Series Data

ACM Transactions on Knowledge Discovery from Data

Unsupervised domain adaptation methods aim at generalizing well on unlabeled test data that may have a different (shifted) distribution from the training data. Such methods are typically developed on image data, and their application to time series data is less explored. Existing works on time series domain adaptation suffer from inconsistencies in evaluation schemes, datasets, and backbone neural network architectures. Moreover, labeled target data are often used for model selection, which violates the fundamental assumption of unsupervised domain adaptation. To address these issues, we develop a benchmarking evaluation suite (AdaTime) to systematically and fairly evaluate different domain adaptation methods on time series data. Specifically, we standardize the backbone neural network architectures and benchmarking datasets, while also exploring more realistic model selection approaches that can work with no labeled data or just a few labeled samples. Our evaluation includes adap...

Self-Supervised Learning for Label-Efficient Sleep Stage Classification: A Comprehensive Evaluation

IEEE Transactions on Neural Systems and Rehabilitation Engineering

Alignment-free machine learning approaches for the lethality prediction of potential novel human-adapted coronavirus using genomic nucleotide

A newly emerging novel coronavirus appeared and rapidly spread worldwide, and the World Health Organization declared a pandemic on March 11, 2020. The roles and characteristics of coronaviruses have captured much attention due to their power to cause a wide variety of infectious diseases, from mild to severe, in humans. Detecting the lethality of human coronaviruses is key to estimating viral toxicity and providing perspective for treatment. We developed alignment-free machine learning approaches for an ultra-fast and highly accurate prediction of the lethality of potential human-adapted coronaviruses using genomic nucleotide sequences. We performed extensive experiments with six different feature transformation and machine learning algorithms in combination with digital signal processing to infer the lethality of possible future novel coronaviruses using previously existing strains. The results tested on SARS-CoV, MERS-CoV and SARS-CoV-2 datasets show an average 96.7% prediction accuracy. We als...

Attention Over Self-Attention: Intention-Aware Re-Ranking With Dynamic Transformer Encoders for Recommendation

IEEE Transactions on Knowledge and Data Engineering

ADAST: Attentive Cross-Domain EEG-Based Sleep Staging Framework With Iterative Self-Training

IEEE Transactions on Emerging Topics in Computational Intelligence

Self-Supervised Autoregressive Domain Adaptation for Time Series Data

IEEE Transactions on Neural Networks and Learning Systems

Information Theory-Based Feature Selection: Minimum Distribution Similarity with Removed Redundancy

Computational Science – ICCS 2020, 2020

Feature selection is an important preprocessing step in pattern recognition. In this paper, we present a new feature selection approach for two-class classification problems based on information theory, named minimum Distribution Similarity with Removed Redundancy (mDSRR). Unlike previous methods, which use mutual information and greedy iteration with a loss function to rank the features, we rank features according to their distribution similarities in the two classes, measured by relative entropy, and then remove the highly redundant features from the sorted feature subsets. Experimental results on datasets from a variety of fields with different classifiers highlight the value of mDSRR in selecting feature subsets, especially small feature subsets. mDSRR is also shown to outperform other state-of-the-art methods in most cases. Besides, we observed that the mutual information may not be a good practice to select the initial feature in the methods with subseq...

Attention-based sequence to sequence model for machine remaining useful life prediction

Neurocomputing, 2021

Contrastive Adversarial Domain Adaptation for Machine Remaining Useful Life Prediction

IEEE Transactions on Industrial Informatics, 2021

Recent advances in network-based methods for disease gene prediction

Briefings in Bioinformatics, 2020

Disease–gene association through genome-wide association study (GWAS) is an arduous task for researchers. Investigating single nucleotide polymorphisms that correlate with specific diseases needs statistical analysis of associations. Considering the huge number of possible mutations, in addition to its high cost, another important drawback of GWAS analysis is the large number of false positives. Thus, researchers search for more evidence to cross-check their results through different sources. To provide the researchers with alternative and complementary low-cost disease–gene association evidence, computational approaches come into play. Since molecular networks are able to capture complex interplay among molecules in diseases, they become one of the most extensively used data for disease–gene association prediction. In this survey, we aim to provide a comprehensive and up-to-date review of network-based methods for disease gene prediction. We also conduct an empirical analysis on 14...

Adversarial Multiple-Target Domain Adaptation for Fault Classification

IEEE Transactions on Instrumentation and Measurement, 2020

Rule-based meta-analysis reveals the major role of PB2 in influencing influenza A virus virulence in mice

BMC Genomics, 2019

Background: Influenza A virus (IAV) poses threats to human health and life. Many individual studies have been carried out in mice to uncover the viral factors responsible for the virulence of IAV infections. Nonetheless, a single study may not provide enough confidence about virulence factors; hence, combining several studies in a meta-analysis is desirable to provide a better view. For this, we documented more than 500 records of IAV infections in mice for which the viral proteins could be retrieved and the mouse lethal dose 50 or, alternatively, weight loss and/or survival data were available for virulence classification. Results: IAV virulence models were learned from various datasets containing aligned IAV proteins and the corresponding two virulence classes (avirulent and virulent) or three virulence classes (low, intermediate and high virulence). Three proven rule-based learning approaches, i.e., OneR, JRip and PART, and additionally random forest were used for modelling. PART models a...

Comprehensive detection of cancer gene expression profiles and gene networks are impacted by the choice of pre-processing algorithm and gene-selection method

International Journal of Data Mining and Bioinformatics, 2013

Integration of genomic and epigenomic features to predict meiotic recombination hotspots in human and mouse

Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, 2012

Meta-analysis of Genomic and Proteomic Features to Predict Synthetic Lethality of Yeast and Human Cancer

Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics, 2007

Measuring Similarity by Prediction Class between Biomedical Datasets via Fuzzy Unordered Rule Induction

International Journal of Bio-Science and Bio-Technology, 2014

The pattern classification based on the nearest feature midpoints

Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), 2004

Benchmarking Human Protein Complexes to Investigate Drug-Related Systems and Evaluate Predicted Protein Complexes

PLoS ONE, 2013

Reliable and Fast Estimation of Recombination Rates by Convergence Diagnosis and Parallel Markov Chain Monte Carlo

IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2014

Erratum to "QuickVina: Accelerating AutoDock Vina Using Gradient-Based Heuristics for Global Optimization"

IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2012
